SPIDER: Scalable, Parallel and Interactive Data Mining and Exploration at Rensselaer

The goal of our research is to develop a high-performance data mining (HPDM) system that can manipulate very large scientific databases. The research pursues an application-oriented approach with a special focus on bioinformatics (e.g., protein structure prediction). The HPDM system is based on a three-tiered architecture: a front end providing the interface, visualization, and query tools; a middle-layer Data Mining Template Library of common high-level mining algorithms and a core set of data mining "primitive operations"; and a back-end Extensible Data Mining System that is tightly integrated with a database system and delivers high performance.

Research

Our ultimate goal is to develop a fully functional HPDM toolkit for massive databases. We are exploring and unifying two dominant frameworks:

Our current accomplishments include:

Current Students

  • Mohammed AlHasan
  • Vineet Chaoji
  • Saeed Salem

Past Students/Contributors

  • Adnan Saifee
  • Nagender Parimi
  • Joe Urban
  • Paolo Palmerini
  • Nilanjana De
  • Benjarath Phoophakdee
  • Feng Gao
  • Jeevan Pathuri

SPIDER Related Links

  • Papers on Data Mining
  • Introduction to Data Mining Course
  • Open Source Software