Workshop on Frequent Itemset Mining Implementations (FIMI'04)
November 1, 2004, Brighton, UK
in conjunction with
The FIMI'04 'diapers and beer' best implementation award was granted to Takeaki Uno, Masashi Kiyomi and Hiroki Arimura
for their LCM implementation described in "LCM ver. 2: Efficient Mining Algorithms for Frequent/Closed/Maximal Itemsets".
The proceedings are published online in the CEUR Workshop Proceedings.
Frequent itemset mining (FIM) is a core problem in many data mining
tasks, and varied approaches to the problem appear in numerous papers
across all data mining conferences. While the problem was introduced
in the context of market basket analysis, the scope of the problem is
much broader. Generally speaking, the problem involves the
identification of items, products, symptoms, characteristics, and so
forth, that often occur together in a given dataset. As a fundamental
operation in data mining, algorithms for FIM can be used as a building
block for other, more sophisticated data mining processes.
The first Frequent Itemset Mining Implementations workshop (FIMI),
held at ICDM-2003, provided many new and surprising insights.
Many of these insights, coupled with the
online availability of all source code for every participating
implementation, have inspired several follow-up investigations.
Therefore, we envisage that this second edition of this successful
workshop will provide further insight into real problems related to
the FIM task.
Submissions consist of code implementing any or all of the following
three main tasks:
- all frequent itemset mining,
- closed frequent itemset mining, and
- maximal frequent itemset mining.
In addition to the implementations, each submission must include a paper that describes the implemented algorithms, and provides a performance study on publicly provided datasets.
Each paper should also provide a qualitative explanation of why the submitted algorithm performs well when compared to other known approaches.
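To make the three tasks named above concrete, here is a minimal brute-force sketch in Python. It is an illustration only and not any submitted implementation: the function names, the toy transactions, and the use of an absolute support count as the threshold are all assumptions made for this example. It uses the standard definitions: an itemset is frequent if it occurs in at least `min_support` transactions, closed if no proper superset has the same support, and maximal if no proper superset is frequent.

```python
from itertools import combinations

# Hypothetical toy dataset; each transaction is a set of items.
TRANSACTIONS = [
    frozenset({"beer", "diapers", "milk"}),
    frozenset({"beer", "diapers"}),
    frozenset({"beer", "diapers", "bread"}),
    frozenset({"milk", "bread"}),
    frozenset({"beer", "milk"}),
]


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support.

    Brute force: enumerates candidates level by level and stops at the first
    size with no frequent itemset, since no larger itemset can be frequent.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                result[itemset] = support
                found_any = True
        if not found_any:
            break
    return result


def closed_itemsets(frequent):
    """Keep frequent itemsets that have no proper superset with equal support."""
    return {
        s: sup
        for s, sup in frequent.items()
        if not any(s < t and sup == frequent[t] for t in frequent)
    }


def maximal_itemsets(frequent):
    """Keep frequent itemsets that have no frequent proper superset."""
    return {
        s: sup
        for s, sup in frequent.items()
        if not any(s < t for t in frequent)
    }


if __name__ == "__main__":
    freq = frequent_itemsets(TRANSACTIONS, min_support=3)
    for name, sets in [
        ("frequent", freq),
        ("closed", closed_itemsets(freq)),
        ("maximal", maximal_itemsets(freq)),
    ]:
        print(name, sorted((sorted(s), sup) for s, sup in sets.items()))
```

On this toy data with a minimum support of 3, the frequent itemsets are {beer}, {diapers}, {milk} and {beer, diapers}; the closed ones are {beer}, {milk} and {beer, diapers}; and the maximal ones are {milk} and {beer, diapers}. Every maximal itemset is closed, and every closed itemset is frequent, which is why the closed and maximal tasks produce progressively smaller outputs.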
The submissions will be tested independently by the
co-chairs and other members of the organizing committee. All
submissions will also be tested on test datasets which will not be
made public until after all submissions have been received.
Workshop participants will be required to attend and discuss the
submissions; there will be a heavy focus on critical evaluation, i.e.,
what the limitations are, under what conditions the algorithm
works well, why it fails in other cases, and what the open
areas are. One outcome of the workshop will be to outline the focus for
research on new problems in the field.
To be accepted, a submission must be a correct
implementation of the given task and must also meet either of two
criteria: (1) it is an efficient implementation compared with other
submissions in the same category, or (2) it provides
new insight into the FIM problem. The idea is to highlight both
successful ideas and unsuccessful but interesting ones.
Source code that is accepted will be made publicly available
(via a web link or source code) on the FIMI repository
(with flexible licensing).
Each implementation should adhere to the following rules.
The data mining community unfortunately lacks publicly available real life datasets
which can be used for benchmarking purposes.
Each accepted dataset submission is allowed a one page description in the
All submissions should be sent electronically to email@example.com.
The email should contain exactly 2 files:
- the tar or zip file of the entire source code directory (including the Makefile), and
- the accompanying paper describing the implementation.
The body of the email should contain the title of the paper, the name of the implemented algorithm,
and the list of authors with their respective affiliations and email addresses.
The accompanying paper should be in PDF or PostScript format only, and must not exceed 25 pages, double-spaced, in 12pt font, including all figures, tables and references.
If you are interested in submitting an implementation and would like
to stay up to date about the latest changes and news,
fill in the following form.
- Submission Deadline: September 3, 2004
- Notification: October 4, 2004
- Camera-ready Copies: October 11, 2004
- Workshop date: November 1, 2004
- Charu Aggarwal, IBM Watson, USA
- Johannes Gehrke, Cornell University, USA
- Jiawei Han, University of Illinois at Urbana-Champaign, USA
- Ramakrishnan Srikant, IBM Almaden, USA
- Hannu Toivonen, University of Helsinki, Finland