Rensselaer Polytechnic Institute
University of Minnesota
Queens University, Canada
As the volume of data increases, it is clear that both parallel and distributed data mining techniques are required to make the whole knowledge discovery process scalable and interactive. This workshop will target papers on high performance parallel and distributed methods, as well as mining on distributed and heterogeneous datasets. Topics of interest include:
association rules, sequences, classification, clustering, deviation detection, etc.
9:00-9:15 Opening
9:15-10:00 Keynote Talk
10:00-10:30 Coffee Break
10:30-12:00 Session I
13:30-14:15 Invited Talk
14:15-15:15 Session II
15:15-15:20 Concluding Remarks
15:20-15:30 Coffee Break
Keynote Talk: Scalable Parallel Data Mining for High-Dimensional Data, Alok Choudhary
Abstract: Large-scale data analysis and data mining on warehouses (where huge amounts of time-varying observational, transactional, or simulation data are stored) pose many challenges. The stored data is typically multidimensional, with a large number of dimensions, and in many cases it is highly sparse. Parallel processing techniques have become important for enabling the use of larger data sets and reducing the time needed for analysis and knowledge discovery. In this talk, I will briefly present PARSIMONY, a system that provides an infrastructure as well as scalable algorithms for the analysis and mining of large, multidimensional data. In particular, I will present MAFIA, a scalable parallel clustering algorithm for high-dimensional data.
Invited Talk: Ubiquitous Mining of Distributed Data, Hillol Kargupta, University of Maryland Baltimore County
Abstract: Knowledge discovery and data mining deal with the problem of extracting interesting associations, classifiers, clusters, and other patterns from data. The emergence of network-based environments has introduced an important new dimension to this problem: distributed sources of data and computing. The advent of laptops, palmtops, handhelds, and wearable computers is making ubiquitous access to large quantities of distributed data a reality. Advanced analysis of distributed data to extract useful knowledge is the natural next step in the increasingly connected world of ubiquitous computing. However, this will not come for free; it will introduce additional costs due to communication, computation, and security, among others. Distributed data mining (DDM) offers the capability to analyze distributed data while minimizing these costs, so that the ubiquitous presence can be maintained. This talk will explain the Collective Data Mining (CDM) approach to DDM, which offers a collection of scalable distributed data analysis techniques. It will present an overview of the CDM technology and its applications.
Maintained by: Mohammed J. Zaki <zaki.AT.cs.rpi.edu>