CSCI-4390/6390: Data Mining, Fall 2011
Class Time: MR 10-11:50AM
Room: Carnegie 113
Instructor Office Hours: 12-1PM, MR, Lally 307
TA: Amina Shabbeer
TA Office Hours: 4-5PM, TW, AE 304
TA Contact: shabba@rpi.edu
Announcements
Calendar & Lecture Notes
A tentative sequence of topics to be covered in the classes; changes are likely as the course progresses.
| Day: Date | Topic | Chapters | Lecture Notes |
|---|---|---|---|
| M: Aug 29 | CLASSES CANCELLED | ||
| R: Sep 1 | Data Mining Overview & Data Analysis Foundations (DA): Algebraic & Probabilistic Views | chap1.pdf | dmintro.pptx, lecture1.pdf |
| M: Sep 5 | Labor Day Holiday | ||
| R: Sep 8 | DA: Numeric Attributes | chap2.pdf | lecture2.pdf |
| M: Sep 12 | NO CLASS NSF-RPI Workshop on Complex Data | ||
| R: Sep 15 | DA: Numeric Attributes & Eigenvectors | | lecture3.pdf |
| M: Sep 19 | DA: Categorical Data | chap3.pdf | lecture4.pdf |
| R: Sep 22 | DA: Graph Data | chap4.pdf | lecture5.pdf |
| M: Sep 26 | DA: Graph Models | | lecture6.pdf |
| R: Sep 29 | DA: Kernel Methods | chap5.pdf | lecture7.pdf |
| M: Oct 3 | DA: High Dimensional Analysis | chap6.pdf | lecture8.pdf |
| R: Oct 6 | EXAM I | ||
| T: Oct 11 | NO CLASS | ||
| R: Oct 13 | DA: Dimensionality Reduction | chap8.pdf | lecture9.pdf |
| M: Oct 17 | Frequent Pattern Mining (FPM): Itemset Mining | chap10.pdf | lecture10.pdf |
| R: Oct 20 | FPM: Itemset Summaries & Sequence Mining | chap11.pdf, chap12.pdf | lecture11.pdf |
| M: Oct 24 | FPM: Sequence Mining, Graph Mining | chap13.pdf | lecture12.pdf |
| R: Oct 27 | FPM: Graph Mining, Classification (CLASS): Linear Discriminants | chap27.pdf | lecture13.pdf |
| M: Oct 31 | CLASS: Linear Discriminants, Support Vector Machines (SVM) | chap28.pdf | lecture14.pdf |
| R: Nov 3 | CLASS: SVMs | | lecture15.pdf |
| M: Nov 7 | EXAM II | ||
| R: Nov 10 | CLASS: Bayesian Classifier, Decision Trees | chap26.pdf, chap24.pdf | lecture16.pdf |
| M: Nov 14 | Clustering (CLUS): Partitional | chap16.pdf | lecture17.pdf |
| R: Nov 17 | CLUS: Hierarchical Clustering | chap17.pdf | lecture18.pdf |
| M: Nov 21 | CLUS: Density-based Clustering | chap18.pdf | lecture19.pdf |
| R: Nov 24 | Thanksgiving Break | ||
| M: Nov 28 | CLUS: Subspace Clustering | chap19.pdf | lecture20.pdf |
| R: Dec 1 | Spectral & Graph Clustering | chap20.pdf | lecture21.pdf |
| M: Dec 5 | Evaluation & Assessment | chap21.pdf | lecture22.pdf |
| R: Dec 8 | EXAM III | | |
Syllabus
Introduction
Data mining is the process of automatic discovery of patterns, models, changes, associations and anomalies in massive databases. This course will provide an introduction to the main topics in data mining and knowledge discovery, including: algebraic and statistical foundations, pattern mining, classification, and clustering. Emphasis will be laid on the algorithmic approach.
Learning Objectives
After taking this course students will be
Prerequisites
The prerequisites for this course include data structures and algorithms and discrete mathematics. Linear algebra and probability & statistics are also essentially prerequisites, though an attempt will be made to review the basic concepts. Assignments will require the use of the Python language, with the NumPy package for numeric computations. You are expected to learn Python on your own via web tutorials, etc. Assignments must be submitted via email to .
Textbook
There is no required text for the course. Notes will be posted online on the course webpage. The following textbooks are also good references:
Grading Policy
Your grade will be a combination of the following items.
Other Policies
Academic Integrity
You may consult other members of the class on the assignments, but you must submit your own work. For instance, you may discuss general approaches to solving a problem, but you must implement the solution on your own (similarity detection software may be used). Any time you borrow material from the web or elsewhere, you must acknowledge the source. The school takes cases of academic dishonesty very seriously; they result in an automatic "F" grade for the course. Students should familiarize themselves with the relevant portion of the Rensselaer Handbook of Student Rights and Responsibilities on this topic.