CompbioCourse.Main History
- Jan 31: Assignment 1 has been posted. Due date is 11th Feb, just before midnight.
- Jan 24: Check out the neat video Cellular Visions: Inner Life of a Cell and the other animations at Virtual Cell.
[l]
[l] Lecture2.pdf
[l] Significance and Database Search
[l] Database Search and Significance
[l] R3, R4
[l] R3
[l] Scoring Matrices & Database Searching [l] R5, R6
[l] Alignment & Scoring Matrices [l] R4, R5
[l] Alignment Significance [l]
[l] Significance and Database Search [l] R6, R7
[l] R3, R4
[l]
[l]
[l] R5, R6
[l] HMMs ( Guest Lecture: Prof. Chris Bystroff )
[l] HMMs ( Guest Lecture: Prof. Chris Bystroff )
[l] Profile HMMs ( Guest Lecture: Prof. Chris Bystroff )
[l] Profile HMMs ( Guest Lecture: Prof. Chris Bystroff )
[l]T: Feb 19 note Tuesday
[l] T (Tuesday): Feb 19
[l] Significance & Multiple Sequence Alignment
[l] Alignment Significance
[l] Profile HMMs
[l] Multiple Sequence Alignment
[l] HMMs
[l] HMMs ( Guest Lecture: Prof. Chris Bystroff )
[l] HMMs
[l] Profile HMMs ( Guest Lecture: Prof. Chris Bystroff )
Computational Biology and Bioinformatics are essentially interchangeable terms, referring to the science of analyzing biological data. The goal of this course is to introduce the main topics and the frontiers of computational biology. The basic topics include sequence and protein structure analysis (alignment, evolution, search, motifs, and indexing). The emerging topics include gene expression analysis, network biology, and kernel data mining methods. The emphasis will be on the application of these methods to the various "omics" within computational systems biology, i.e., genomics, proteomics, interactomics, transcriptomics, and metabolomics.
Computational Biology and Bioinformatics are essentially interchangeable terms, referring to the science of analyzing biological data. The goal of this course is to introduce the main topics and the frontiers of computational biology. The basic topics include sequence and protein structure analysis (alignment, evolution, search, motifs, and indexing). The emerging topics include next generation sequencing, gene expression analysis, network biology, and kernel data mining methods. The emphasis will be on the application of these methods to the various "omics" within computational systems biology, i.e., genomics, proteomics, interactomics, transcriptomics, and metabolomics.
- Jan 24: Check out the neat video Cellular Visions: Inner Life of a Cell and the other animations at Virtual Cell.
[l] Martin Luther King Holiday
[l] NO CLASS (Martin Luther King Holiday)
[l] Suffix Trees
[l] HMMs
[l] Suffix Arrays
[l] Suffix Trees and Arrays
[l] R12
[l] Lecture9.PDF
[l] [l]
[l] EXAM I [row bgcolor=aliceblue] [l]R: Feb 28
[l] R13, R14
[l] Lecture10.PDF
[row bgcolor=aliceblue]
[l]R: Feb 28
[l] EXAM I
[l] R15
[l] Lecture11.PDF
[l] [l]
[l] R16
[l] Lecture12.PDF
[l] [l]
[l] R17, R18
[l] Lecture13.PDF
[l] [l]
[l] R19, R20
[l] Lecture14.PDF
[l] [l]
[l] R21
[l] Lecture15.PDF
[l] [l]
[l] R22
[l] Lecture16.PDF
[l] [l]
[l] EXAM II [l] [l] [row bgcolor=aliceblue] [l]R: Apr 4
[l] R23
[l] Lecture17.PDF
[l] [l]
[row] [l]M: Apr 8 [l] Gene Expression Clustering
[l]R: Apr 4
[l] Gene Expression Clustering
[l] R24, R25
[l] Lecture18.PDF
[l]R: Apr 11 [l] Kernel Methods for Bioinfo [l] [l]
[l]M: Apr 8 [l] EXAM II
[l]M: Apr 15 [l] Kernels for Sequences [l] [l]
[l]R: Apr 11
[l] Kernel Methods for Bioinfo
[l] R26
[l] Lecture19.PDF
[l]R: Apr 18 [l] Kernel-based Classification [l] [l]
[l]M: Apr 15
[l] Kernels for Sequences
[l] R27
[l] Lecture20.PDF
[l]M: Apr 22 [l] Kernel-based Clustering [l] [l]
[l]R: Apr 18
[l] Kernel-based Classification
[l] R28, R29
[l] Lecture21.PDF
[l]R: Apr 25 [l] Network Motifs -- Transcription Networks [l] [l]
[l]M: Apr 22
[l] Kernel-based Clustering
[l] R30
[l] Lecture22.PDF
[l]M: Apr 29 [l] Network Motifs -- Transcription/Signaling Networks [l] [l]
[l]R: Apr 25
[l] Network Motifs -- Transcription Networks
[l] R31
[l] Lecture23.PDF
[row]
[l]M: Apr 29
[l] Network Motifs -- Transcription/Signaling Networks
[l] R32
[l] Lecture24.PDF
[row bgcolor=aliceblue]
[l]R: May 2
[l] Signaling Networks
[l]
[l] Lecture25.PDF
[l]R: May 2 [l] Signaling Networks [l] [l]
[l] Sequence Alignment Scoring Matrices & Database Searching
[l] R4, R5, R6
[l] Lecture3.PDF
[l] Sequence Alignment [l] [l]
[l] Scoring Matrices & Database Searching [l] [l]
[row] [l]M: Feb 4
[l] [l] [row bgcolor=aliceblue] [l]R: Feb 7
[l] R8, R9
[l] Lecture5.PDF
[row bgcolor=aliceblue]
[l]R: Feb 7
[l] [l]
[row] [l]M: Feb 11
[l] R8
[l] Lecture6.PDF
[l] [l] [row bgcolor=aliceblue] [l]R: Feb 14 [l] Suffix Trees [l] [l]
[l]M: Feb 11
[l] Suffix Trees
[l] R10
[l] Lecture7.PDF
[row bgcolor=aliceblue]
[l]R: Feb 14
[l] Suffix Arrays
[l] R11
[l] Lecture8.PDF
[row] [l]M: Feb 18 [l] NO CLASS (Presidents' Day)
[l]T: Feb 19 note Tuesday [l] Suffix Arrays
CSCI-4964/6964: Bioinformatics & Computational Biology, Spring 2012
CSCI-4964/6964: Bioinformatics & Computational Biology, Spring 2013
- Jan 22: Course website is up, with the calendar and syllabus.
- Jan 25: Check the Reading section to look up the latest reading materials.
- Jan 26: Check out the neat video Cellular Visions: Inner Life of a Cell and the other animations at Virtual Cell.
- Jan 31: Assignment 1 has been posted.
- Feb 12: Assignment 2 has been posted.
- Feb 26: Assignment 3 has been posted.
- Mar 23: Assignment 4 has been posted.
- Apr 14: Assignment 5 has been posted.
- Apr 24: Assignment 6 has been posted.
- Jan 21: Course website is up, with the calendar and syllabus.
- Jan 24: Check out the neat video Cellular Visions: Inner Life of a Cell and the other animations at Virtual Cell.
[l]M: Jan 23 [l] Overview [l] R1, R2 [l] intro.ppt
[l]M: Jan 21 [l] Martin Luther King Holiday [l] [l]
[l]R: Jan 26 [l] Sequence Alignment [l] R3 [l] Lecture2.PDF
[l]R: Jan 24 [l] Introduction [l] R1, R2 [l] intro.ppt
[l]M: Jan 30 [l] Scoring Matrices & Database Searching
[l]M: Jan 28 [l] Sequence Alignment Scoring Matrices & Database Searching
[l]R: Feb 2
[l]R: Jan 31
[l]M: Feb 6
[l]M: Feb 4
[l]R: Feb 9
[l]R: Feb 7
[l]M: Feb 13
[l]M: Feb 11
[l]R: Feb 16
[l]R: Feb 14
[l]M: Feb 20
[l]M: Feb 18
[l]R: Feb 23
[l]R: Feb 21
[l]M: Feb 27
[l]M: Feb 25
[l]R: Mar 1
[l]R: Feb 28
[l]M: Mar 5
[l]M: Mar 4
[l]R: Mar 8
[l]R: Mar 7
[l]M: Mar 12
[l]M: Mar 11
[l]R: Mar 15
[l]R: Mar 14
[l]M: Mar 19
[l]M: Mar 18
[l]R: Mar 22
[l]R: Mar 21
[l]M: Mar 26
[l]M: Mar 25
[l]R: Mar 29
[l]R: Mar 28
[l]M: Apr 2
[l]M: Apr 1
[l]R: Apr 5
[l]R: Apr 4
[l]M: Apr 9
[l]M: Apr 8
[l]R: Apr 12
[l]R: Apr 11
[l]M: Apr 16
[l]M: Apr 15
[l]R: Apr 19
[l]R: Apr 18
[l]M: Apr 23
[l]M: Apr 22
[l]R: Apr 26
[l]R: Apr 25
[l]M: Apr 30
[l]M: Apr 29
[l]R: May 3
[l]R: May 2
[l]M: May 7
[l]M: May 6
[l]R: May 10
[l]R: May 9
[l] PPI Networks
[l] Signaling Networks
[l]
[l] [l] Lecture25.PDF
- Apr 24: Assignment 6 has been posted.
[l] Network Science [l] [l]
[l] Network Motifs -- Transcription Networks
[l] R31
[l] Lecture23.PDF
[l] Network Models
[l] Network Motifs -- Transcription/Signaling Networks
[l] Network Clustering
[l] PPI Networks
[l] R28
[l] R28, R29
[l] [l]
[l] R30
[l] Lecture23.PDF
[l]
[l] Lecture20.PDF
[l] Network Biology
[l] Kernel-based Classification
[l] Network Models
[l] Kernel-based Clustering
[l] Network Motifs
[l] Network Science
[l] Network Motifs and Clustering
[l] Network Models
- Apr 14: Assignment 5 has been posted.
[l] SVD, Gene biClustering
[l] Network Biology
[l] Network Biology
[l] Network Models
[l] Network Models
[l] Network Motifs
[l] Gene expression clustering [l] [l]
[l] Kernel Methods for Bioinfo
[l] R26
[l] Lecture19.PDF
[l] PCA/SVD [l]
[l] Kernels for Sequences [l] R27
[l] [l]
[l] R23
[l] Lecture17.PDF
[l]
[l] R24
[l] [l]
[l] R21
[l] Lecture15.PDF
[l]
[l] R22
[l]
[l] R17, R18
[l]
[l] R19, R20
[l] Phylogenetic Trees
[l] Protein Structure & Alignment
[l]
[l] Lecture13.PDF
[l] Genome Rearrangements [l] [l]
[l] Genome Rearrangements & Phylogenetic Trees
[l] R15
[l] Lecture11.PDF
[l]
[l] R16
[l] Genome Scale Alignment
[l] Motif Discovery
[l]
[l] Lecture9.PDF
[l] Genome Rearrangements
[l] Genome Scale Alignment & Genome Rearrangements
[l] Motif Discovery
[l] Genome Rearrangements
[l] Motif Discovery
[l] Phylogenetic Trees
[l] [l]
[l] R10
[l] Lecture7.PDF
[l]
[l] Lecture7.PDF
[l] R11 [l]
[l] Motifs
[l] Suffix Trees
[l] Genome Scale Alignment
[l] Suffix Arrays
[l]
[l] Lecture7.PDF
[l] Suffix Trees & Arrays
[l] Genome Scale Alignment
[l]
[l] Lecture6.PDF
(:tableend:)
(:tableend:)
[l]
[l] R6, R7
[l]
[l] R8, R9
[l] Multiple Sequence Alignment & Profile HMMs
[l] Significance & Multiple Sequence Alignment
[l]
[l] Lecture4.PDF
[l] HMMs
[l] Profile HMMs
[l] PSI-BLAST, Motifs
[l] HMMs
[l] Suffix Trees & Genome Alignment
[l] Motifs
[l] Suffix Trees & Arrays
[l] Genome Scale Alignment
[l] Genome Rearrangements
[l] Suffix Trees & Arrays
[l] Genome Scale Analysis
[l] Genome Rearrangements
[l] Motifs
[l] Suffix Trees & Genome Alignment
[l] Genome Rearrangements
[l] Suffix Trees & Arrays
[l] Suffix Trees & Genome Alignment
[l] Genome Rearrangements
[l] Suffix Trees & Arrays
[l] Genome Scale Analysis
[l] Significance & Multiple Sequence Alignment
[l] Multiple Sequence Alignment & Profile HMMs
[l] Profiles & PSI-BLAST
[l] HMMs
[l] Profile HMMs
[l] PSI-BLAST, Motifs
[l] Hidden Markov Models
[l] Motifs
[l] Sequence Alignment & Scoring Matrices
[l] Sequence Alignment
[l] Database Searching [l] R4
[l] Scoring Matrices & Database Searching [l] R4, R5
- Jan 26: Check out the neat video Cellular Visions: Inner Life of a Cell and the other animations at Virtual Cell.
[l] Significance & Multiple Sequence Alignment
[l]
[l] Significance & Multiple Sequence Alignment
[l] Profiles & PSI-BLAST
[l] Profiles & PSI-BLAST
[l] Profile HMMs
[l] Profile HMMs
[l] Hidden Markov Models
[l] Hidden Markov Models
[l] Genome Rearrangements
[l] Motif Discovery
[l] Suffix Trees & Genome Alignment
[l] Motif Discovery
[l] Suffix Trees & Arrays
[l] Suffix Trees
[l] Motif Discovery
[l] Suffix Trees & Genome Alignment
[l] Motif Discovery
[l] Suffix Trees & Arrays
[l] Phylogenetic Trees
- Jan 25: Check the Readings section to look up the latest reading materials.
- Jan 25: Check the Reading section to look up the latest reading materials.
- Jan 25: Check the Readings section to look up the latest reading materials.
[l]
[l] R4
[l] Sequence Alignment
[l] Sequence Alignment & Scoring Matrices [l] R3
[l]
[l] Alignments & Scoring Matrices
[l] Database Searching
[l] Scoring Matrices
[l]
[l] Database Searching
[l] Significance & Multiple Sequence Alignment
[l] Significance & Multiple Sequence Alignment
[l] Profiles & PSI-BLAST
[l] Profiles & PSI-BLAST
[l] Profile HMMs
[l] Profile HMMs
[l] Hidden Markov Models
[l] Hidden Markov Models
[l] Motif Discovery
[l] Suffix Trees
[l]
[l] Motifs, Suffix Trees & Genome Alignment
[l] Suffix Trees & Genome Alignment
[!c] Reading
[!c] Readings
[l] [l]
[l] R1, R2 [l] intro.ppt
Python, Perl, or R. Only these three scripting languages will be permitted for the assignments.
Python or R. Only these two scripting languages will be permitted for the assignments.
The required text for the course is
- Introduction to Computational Proteomics, Golan Yona, CRC Press, 2010.
HW assignments will be assigned from the text. Additional reading materials will be posted to supplement any chapters, where needed. The following books are also good references:
- Biological Sequence Analysis, Durbin, Eddy, Krogh, Mitchison, Cambridge University Press, 1999
- Protein Bioinformatics, Eidhammer, Jonassen, Taylor, John Wiley & Sons, 2004
- Understanding Bioinformatics, Zvelebil, Baum, Garland Science, 2007
There is no required text for the course. Reading materials will be posted online.
- Assignments (40%): There will be two types of assignments: homework questions from the book, and practically oriented assignments. For the latter you'll be asked to implement algorithms and apply them to real datasets, to complement the theory. Only Python, Perl, or R is permitted as the scripting language.
- Exams (60%): There will be three exams covering the main topics of the course. The tentative exam schedule is posted on the class schedule table. There is no comprehensive final exam. All exams are open book.
- Assignments (40%): There will be two types of assignments: homework questions from the book, and practically oriented assignments. For the latter you'll be asked to implement algorithms and apply them to real datasets, to complement the theory. Only Python and R are permitted as scripting languages.
- Exams (60%): There will be three exams covering the main topics of the course. The tentative exam dates are posted on the class schedule table. There is no comprehensive final exam. All exams are open book.
Laptop Policy: No laptops or other electronic devices are permitted during lectures. You may, however, use these during exams to access course material online, or to use the calculator functions. Browsing the web for solutions, etc. is of course not permitted. Scripting (using Perl, Python, or R) to solve the exam questions is also not permitted; they are intended to be done by hand.
Laptop Policy: No laptops or other electronic devices are permitted during lectures. You may, however, use these during exams to access course material online, or to use the calculator functions. Browsing the web for solutions, etc. is of course not permitted. Scripting (using Python, R, or other languages) to solve the exam questions is also not permitted; they are intended to be done by hand.
[l] ch 5, R7
[l] Lecture 10
[l] [l]
[l] NO CLASS
[l]
[l] ch 5, R8, R9
[l] Lecture 11
[l] [l]
[l] R10, R11
[l] Lecture 12
[l] [l]
[l] ch 8, R12
[l] Lecture 13
[l] [l]
[l] ch 8
[l] Lecture 14
[l] [l]
[l] ch 8, R13
[l] Lecture 15
[l] [l]
[l] ch 12
[l] Lecture 16
[l] [l]
[l] Ch 10,12
[l] Lecture 17
[l] [l]
[l] ch 12, R14, R15
[l] Lecture 18
[l] [l]
[l] ch 11, R16
[l] Lecture 19
[l] [l]
[l] Lecture 20
[l]
[l] ch 13-14, R17
[l] Lecture 21
[l] [l]
[l] Lecture 22
[l]
[l] R18, R19
[l] Lecture 23
[l] [l]
[l] Lecture 24
[l]
[l] ch 1-2, R1 [l] Molecular Biology Overview
[l] [l]
[l] ch 3, R2, R3 [l] Lecture 2
[l] [l]
[l] ch 3, R4
[l] Lecture 3
[l] [l]
[l] ch 3, R4
[l]
[l] ch 3, R5, R6
[l] Lecture 5
[l] [l]
[l] ch 4
[l] Lecture 6
[l] [l]
[l] ch 4
[l] Lecture 7
[l] [l]
[l] ch 6
[l] Lecture 8
[l] [l]
[l] ch 6
[l] Lecture 9
[l] [l]
[l]M: Jan 24
[l]M: Jan 23
[l]R: Jan 27
[l]R: Jan 26
[l]M: Jan 31
[l]M: Jan 30
[l]R: Feb 3
[l]R: Feb 2
[l]M: Feb 7
[l]M: Feb 6
[l]R: Feb 10
[l]R: Feb 9
[l]M: Feb 14
[l]M: Feb 13
[l]R: Feb 17
[l]R: Feb 16
[l]M: Feb 21
[l]M: Feb 20
[l]R: Feb 24
[l]R: Feb 23
[l]M: Feb 28
[l]M: Feb 27
[l]R: Mar 3
[l]R: Mar 1
[l]M: Mar 7
[l]M: Mar 5
[l]R: Mar 10
[l]R: Mar 8
[l]M: Mar 14
[l]M: Mar 12
[l]R: Mar 17
[l]R: Mar 15
[l]M: Mar 21
[l]M: Mar 19
[l]R: Mar 24
[l]R: Mar 22
[l]M: Mar 28
[l]M: Mar 26
[l]R: Mar 31
[l]R: Mar 29
[l]M: Apr 4
[l]M: Apr 2
[l]R: Apr 7
[l]R: Apr 5
[l]M: Apr 11
[l]M: Apr 9
[l]R: Apr 14
[l]R: Apr 12
[l]M: Apr 18
[l]M: Apr 16
[l]R: Apr 21
[l]R: Apr 19
[l]M: Apr 25
[l]M: Apr 23
[l]R: Apr 28
[l]R: Apr 26
[l]M: May 2
[l]M: Apr 30
[l]R: May 5
[l]R: May 3
[l]M: May 9
[l]M: May 7
[l]R: May 12
[l]R: May 10
CSCI-4964/6964: Computational Biology & Bioinformatics, Spring 2011
CSCI-4964/6964: Bioinformatics & Computational Biology, Spring 2012
Class: 10-11:50AM, MR, Low 4034\\
Class: 10-11:50AM, MR, Low 3130\\
- Apr 16: Assignment 6 has been posted. Due date: 5th May (thurs), before midnight.
- Apr 16: Assignment 5 has been posted. Due date: 25th Apr (mon), before midnight.
- Mar 22: Assignment 4 has been posted. Due date: 8th Apr (fri), before midnight.
- Mar 8: Assignment 3 has been posted. Due date: 24th Mar (thurs), before midnight.
- Feb 4: Assignment 1 has been posted. Due date: 14th Feb (mon), before midnight.
- Jan 25: Readings from the book are indicated with each topic in the calendar below.
- Jan 23: Course website is up, with the calendar and syllabus.
- Jan 22: Course website is up, with the calendar and syllabus.
[l] R18, R19
[l]
[l] Network Classification
[l] Network Clustering
- Apr 16: Assignment 6 has been posted. Due date: 5th May (thurs), before midnight.
[l] Network Models and Motifs
[l] Network Models
[l]
[l] Lecture 22
[l] Network Clustering
[l] Network Motifs and Clustering
[l] Network Centralities
[l] Network Models and Motifs
[l] Network Motifs
[l] Network Clustering
[l] Network Clustering
[l] Network Classification
[l] ch 10
[l]
[l] ch 14
[l]
[l] ch 15
[l]
[row bgcolor=aliceblue] [l]R: Apr 28 [l] Network Centralities [l] ch 10 [l]
[l]M: Apr 25 [l] Network Centrality and Motifs [l] ch 13
[l]M: May 2 [l] Network Motifs [l] ch 14
[l]R: Apr 28 [l] Network Clustering [l] ch 10 [l]
[row] [l]M: May 2 [l] Cellular Pathways [l] ch 14 [l] [row bgcolor=aliceblue] [l]R: May 5 [l] Gene Networks
[l]R: May 5 [l] Network Clustering [l] ch 15
[l] Network Biology [l] ch 13-14 [l]
[l] PCA/SVD
[l] ch 11, R16
[l] Lecture 19
- Apr 16: Assignment 5 has been posted. Due date: 25th Apr (mon), before midnight.
[l] PCA/SVD [l] ch 11 [l]
[row] [l]M: Apr 18
- Mar 22: Assignment 4 has been posted. Due date: 6th Apr (wed), before midnight.
- Mar 22: Assignment 4 has been posted. Due date: 8th Apr (fri), before midnight.
[l]
[l] Lecture 16
[l] Clustering [l] Ch 10
[l] Gene Expression Clustering [l] Ch 10,12
- Mar 22: Assignment 4 has been posted. Due date: 6th Apr (mon), before midnight.
- Mar 22: Assignment 4 has been posted. Due date: 6th Apr (wed), before midnight.
[l] ch 8 [l]
[l] ch 8, R13
[l] Lecture 15
[l] Protein Structure Prediction
[l] Structure Alignment
[l] Structure Alignment
[l] Protein Structure Prediction
[l] Sequence Kernels [l] ch 7 [l]
[l] Protein Structure Alignment
[l] ch 8
[l] Lecture 13
- Mar 22: Assignment 4 has been posted. Due date: 4th Apr (mon), before midnight.
- Mar 22: Assignment 4 has been posted. Due date: 6th Apr (mon), before midnight.
[l] EXAM II
[row] [l]M: Apr 11
[l] ch 10 [l]
[l] Ch 10 [l]
[row] [l]M: Apr 11 [l] EXAM II
- Mar 22: Assignment 4 has been posted. Due date: 4th Apr (mon), before midnight.
[l] EXAM II [row bgcolor=aliceblue] [l]R: Apr 7
[l]
[l]
[row bgcolor=aliceblue] [l]R: Apr 7 [l] EXAM II
[l] Classification: SVMs [l] ch 7 [l]
[l] Suffix Trees & Arrays
[l] R10
[l] Lecture 12
- Mar 8: Assignment 3 has been posted. Due date: 24th Mar (thurs), before midnight.
[l] ch 5
[l] ch 5, R7
[l] Motif Discovery & Suffix Trees [l] ch 5
[l] NO CLASS [l]
[l] Suffix Trees & Genome Alignment
[l] Motifs, Suffix Trees & Genome Alignment
[l] Motif Discovery
[l] Motif Discovery & Suffix Trees
[l] Motif Discovery
[l] Suffix Trees & Genome Alignment
[l]
[l] Lecture 9
[l] Suffix Trees/Arrays [l]
[l] Motif Discovery
[l] ch 5
[l] Lecture 10
[l] Ch 6
[l] ch 6 [l]
[l] Motif Discovery [l] Ch 5
[l] Suffix Trees/Arrays [l]
[l] Suffix Trees/Arrays [l] ch 6
[l] Motif Discovery [l] ch 5
[l]
[l] Ch 6
[l]
[l] Ch 5
[l] ch 6
[l] ch 5
[l] EXAM I
[l] Hidden Markov Models
[l] Sequence Indexing (Suffix Trees)
[l] Motif Discovery
[l] Sequence Indexing (Suffix Arrays)
[l] EXAM I
[l] Hidden Markov Models
[l] Motif Discovery
[l] Hidden Markov Models
[l] Suffix Trees/Arrays
[l] Multiple Sequence Alignment
[l] Significance & Multiple Sequence Alignment
[l] Profiles
[l] Profiles & PSI-BLAST
[l] Multiple Sequence Alignment & Profiles
[l] Multiple Sequence Alignment
[l]
[l] Lecture 6
[l] Motif Discovery [l] ch 5
[l] Profiles
[l] ch 4
[l] Lecture 7
[l] Motif Discovery [l] ch 5 [l]
[l] Profile HMMs
[l] ch 6
[l] Lecture 8
[l] Alignment & Scoring Matrices
[l] Alignments & Scoring Matrices
[l]
[l] R4
[l] Scoring Matrices & Database Searching
[l] Scoring Matrices
[l] Multiple Sequence Alignment [l] ch 4 [l]
[l] Database Searching
[l] ch 3
[l] Lecture 5
[l] Profiles
[l] Multiple Sequence Alignment & Profiles
- Feb 4: Assignment 1 has been posted. Due date: 14th Feb (mon), before midnight.
[l] Scoring Matrices
[l] Alignment & Scoring Matrices
[l]
[l] Lecture 3
[l] Database Searching
[l] Scoring Matrices & Database Searching
[l] Protein Structure Prediction (ch 8)
[l] Protein Structure Prediction [l] ch 8
[l] Structure Alignment (ch 8)
[l] Structure Alignment [l] ch 8
[l] Gene Expression Analysis (ch 12)
[l] Gene Expression Analysis [l] ch 12
[l] Clustering (ch 10)
[l] Clustering [l] ch 10
[l] PCA/SVD (ch 11)
[l] PCA/SVD [l] ch 11
[l] Gene expression clustering (ch 12)
[l] Gene expression clustering [l] ch 12
[l] Network Biology (ch 13-14)
[l] Network Biology [l] ch 13-14
[l] Network Centrality and Motifs (ch 13)
[l] Network Centrality and Motifs [l] ch 13
[l] Network Clustering (ch 10)
[l] Network Clustering [l] ch 10
[l] Cellular Pathways (ch 14)
[l] Cellular Pathways [l] ch 14
[l] Gene Networks (ch 15)
[l] Gene Networks [l] ch 15
[l] Profiles (ch 4)
[l] Profiles [l] ch 4
[l] Motif Discovery (ch 5)
[l] Motif Discovery [l] ch 5
[l] Motif Discovery (ch 5)
[l] Motif Discovery [l] ch 5
[l] Hidden Markov Models (ch 6)
[l] Hidden Markov Models [l] ch 6
[l] Hidden Markov Models (ch 6)
[l] Hidden Markov Models [l] ch 6
[l] Classification: SVMs (ch 7)
[l] Classification: SVMs [l] ch 7
[l] Sequence Kernels (ch 7)
[l] Sequence Kernels [l] ch 7
[!c]Day: Date [!c]Topic [!c]Lectures
[!c] Day: Date [!c] Topic [!c] Reading [!c] Lecture Notes
[l] Overview (ch 1-2)
[l] Overview [l] ch 1-2, R1
[l] Sequence Alignment (ch 3)
[l] Sequence Alignment [l] ch 3 [l]
[l] Scoring Matrices (ch 3)
[l] Scoring Matrices [l] ch 3
[l] Database Searching (ch 3)
[l] Database Searching [l] ch 3
[l] Multiple Sequence Alignment (ch 4)
[l] Multiple Sequence Alignment [l] ch 4
- Jan 25: Readings from the book are indicated with each topic in the calendar below.
[l]Network Biology (Graph Models)
[l] Network Biology (ch 13-14)
[l]Network Centrality and Motifs
[l] Network Centrality and Motifs (ch 13)
[l]Network Clustering
[l] Network Clustering (ch 10)
[l]Network Clustering (spectral)
[l] Cellular Pathways (ch 14)
[l]Microarray Analysis
[l] Gene Networks (ch 15)
[l] Gene expression clustering (ch 12)
[l] PCA/SVD (ch 11)
[l]Network Biology
[l] Gene expression clustering (ch 12)
[l]Sequence Assembly
[l] Gene Expression Analysis (ch 12)
[l]Genome Rearrangement
[l] Clustering (ch 10)
[l] Gene Finding
[l] Gene expression clustering (ch 12)
[l]
[l] Protein Structure Prediction (ch 8)
[l]
[l] Structure Alignment (ch 8)
[l] Sequence Indexing (Suffix Trees)
[l] EXAM I
[l] Sequence Indexing (Suffix Arrays)
[l] Sequence Indexing (Suffix Trees)
[l] EXAM I
[l] Sequence Indexing (Suffix Arrays)
[l]S [l]
[l] Classification: SVMs (ch 7) [l]
[l] Sequence Kernels (ch 7)
[l]
[l]Sequence Motifs
[l]
[l]Sequence Motifs
[l]
[l] Sequence Indexing (Suffix Trees)
[l]
[l] Sequence Indexing (Suffix Arrays)
[l]
[l]
[l] Hidden Markov Models (ch 6)
[l] Hidden Markov Models
[l] Hidden Markov Models (ch 6)
[l]Sequence Indexing (Suffix Trees)
[l]S
[l]Sequence Indexing (Suffix Arrays)
[l]
[l]
[l] Motif Discovery (ch 5)
[l]
[l] Motif Discovery (ch 5)
[l] Overview
[l] Overview (ch 1-2)
[l] Sequence Alignment
[l] Sequence Alignment (ch 3)
[l] Scoring Matrices
[l] Scoring Matrices (ch 3)
[l] Database Searching
[l] Database Searching (ch 3)
[l]
[l] Multiple Sequence Alignment (ch 4)
[l]
[l] Profiles (ch 4)
[l]Multiple Sequence Alignment
[l]
[l]
[l] Sequence Alignment
[l] Scoring Matrices
[l]
[l]Sequence Alignment
[l] Database Searching
[l]Sequence Alignment & Scoring
[l]
[l]Sequence Scoring & Searching
[l]
[l] NO CLASS (reading days)
[l] NO CLASS (reading day)
[l]
[l]
[l] NO CLASS (Presidents' Day)
[l]
[l] NO CLASS (Presidents' Day) [row bgcolor=aliceblue] [l]R: Feb 24
[row bgcolor=aliceblue] [l]R: Feb 24 [l] [l]
[l] NO CLASS (spring break)
[l]
[l] NO CLASS (spring break)
[l] Hidden Markov Models
[l]Hidden Markov Models [l]
[l] NO CLASS (spring break)
[l]Hidden Markov Models [l]
[l] NO CLASS (spring break)
[l] Lecture 5
[l]
[l] Phylogenetics (parsimony) [l] Lecture 6
[l] [l]
[l] Phylogenetics (distance-based) [l] Lecture 7
[l] [l]
[l] Phylogenetics (probabilistic) [l] Lecture 8
[l] [l]
[l] Lecture 9
[l]
[l] Lecture 10
[l]
[l] Lecture 11
[l]
[l] Lecture 12
[l]
[l] Lecture 13
[l]
[l] Lecture 14
[l]
[l] Lecture 15
[l]
[l] Lecture 16
[l]
[l] Lecture 17
[l]
[l] Lecture 18
[l]
[l] Lecture 19
[l]
[l] Lecture 20
[l]
[l] Lecture 21
[l]
[l] Lecture 22
[l]
[l] Lecture 23
[l]
[l] NO CLASS
[l] Overview
[l] NO CLASS
[l]
[l] Overview of Biology [l] Intro.ppt
[l] [l]
[l] Lecture 2
[l]
[l] Lecture 3
[l]
[l] Lecture 4
[l]
[l]M: Jan 25
[l]M: Jan 24
[l]R: Jan 28
[l]R: Jan 27
[l]M: Feb 1
[l]M: Jan 31
[l]R: Feb 4
[l]R: Feb 3
[l]M: Feb 8
[l]M: Feb 7
[l]R: Feb 11
[l]R: Feb 10
[l]M: Feb 15
[l]M: Feb 14
[l]R: Feb 18
[l]R: Feb 17
[l]M: Feb 22
[l]M: Feb 21
[l]R: Feb 25
[l]R: Feb 24
[l]M: Mar 1
[l]M: Feb 28
[l]R: Mar 4
[l]R: Mar 3
[l]M: Mar 8
[l]M: Mar 7
[l]R: Mar 11
[l]R: Mar 10
[l]M: Mar 15
[l]M: Mar 14
[l]R: Mar 18
[l]R: Mar 17
[l]M: Mar 22
[l]M: Mar 21
[l]R: Mar 25
[l]R: Mar 24
[l]M: Mar 29
[l]M: Mar 28
[l]R: Apr 1
[l]R: Mar 31
[l]M: Apr 5
[l]M: Apr 4
[l]R: Apr 8
[l]R: Apr 7
[l]M: Apr 12
[l]M: Apr 11
[l]R: Apr 15
[l]R: Apr 14
[l]M: Apr 19
[l]M: Apr 18
[l]R: Apr 22
[l]R: Apr 21
[l]M: Apr 26
[l]M: Apr 25
[l]R: Apr 29
[l]R: Apr 28
[l]M: May 3
[l]M: May 2
[l]R: May 6
[l]R: May 5
[l]M: May 10
[l]M: May 9
[l]R: May 13
[l]R: May 12
Laptop Policy: No laptops or other electronic devices are permitted during lectures. You may, however, use these during exams to access course material online, or to use the calculator functions. Browsing the web for solutions, etc. is of course not permitted.
Laptop Policy: No laptops or other electronic devices are permitted during lectures. You may, however, use these during exams to access course material online, or to use the calculator functions. Browsing the web for solutions, etc. is of course not permitted. Scripting (using Perl, Python, or R) to solve the exam questions is also not permitted; they are intended to be done by hand.
Computational Biology deals with the science of analyzing biological data. The goal of this course is to introduce the main topics and the frontiers of computational biology. The basic topics include sequence and protein structure analysis (alignment, evolution, search, motifs, and indexing). The emerging topics include gene expression analysis, network biology, and kernel data mining methods. The emphasis will be on the application of these methods to the "omics" subfields of computational systems biology, i.e., genomics, proteomics, interactomics, transcriptomics, and metabolomics.
Computational Biology and Bioinformatics are essentially interchangeable terms, referring to the science of analyzing biological data. The goal of this course is to introduce the main topics and the frontiers of computational biology. The basic topics include sequence and protein structure analysis (alignment, evolution, search, motifs, and indexing). The emerging topics include gene expression analysis, network biology, and kernel data mining methods. The emphasis will be on the application of these methods to the various "omics" within computational systems biology, i.e., genomics, proteomics, interactomics, transcriptomics, and metabolomics.
The prerequisites for this course include data structures and algorithms, discrete mathematics, and probability & statistics. Knowledge of basic linear algebra will serve you well too. Assignments will require the use of R, Perl, or Python. Only these three scripting languages will be permitted for the assignments, which must be submitted online at the wiki site. Learning PmWiki markup will be your responsibility.
The prerequisites for this course include data structures and algorithms, discrete mathematics, and probability & statistics. Knowledge of basic linear algebra will serve you well too. Assignments will require the use of Python, Perl, or R. Only these three scripting languages will be permitted for the assignments.
There is no required text for the course. Reading materials will be handed out via the course wiki.
The following books are good references:
The required text for the course is
- Introduction to Computational Proteomics, Golan Yona, CRC Press, 2010.
HW assignments will be assigned from the text. Additional reading materials will be posted to supplement any chapters, where needed. The following books are also good references:
Your grade will be a combination of the following items. Note that the final distribution is subject to some change depending on the number of assignments, but exams will be at least 60%.
- Assignments (40%): The assignments are meant to be practically oriented. You'll be asked to implement algorithms and apply them to real datasets, to complement the theory. Only R, Perl, or Python is permitted as the scripting language. There will be roughly one assignment every two weeks, to be submitted via the course wiki site. User accounts will be created after the first day of class.
- Exams (60%): There will be three exams covering the main topics of the course. The tentative exam schedule is posted on the class schedule table. There is no comprehensive final exam.
Your grade will be a combination of the following items.
- Assignments (40%): There will be two types of assignments: homework questions from the book, and practically oriented assignments. For the latter you'll be asked to implement algorithms and apply them to real datasets, to complement the theory. Only Python, Perl, or R is permitted as the scripting language.
- Exams (60%): There will be three exams covering the main topics of the course. The tentative exam schedule is posted on the class schedule table. There is no comprehensive final exam. All exams are open book.
Laptop Policy: No laptops or other electronic devices are permitted during lectures. You may, however, use these during exams to access course material online, or to use the calculator functions. Browsing the web for solutions, etc. is of course not permitted.
CSCI-4964/6964: Computational Biology, Spring 2010
CSCI-4964/6964: Computational Biology & Bioinformatics, Spring 2011
Class: 10-11:50AM, MR, AE 216\\
Class: 10-11:50AM, MR, Low 4034\\
- Apr 25: Assignment 5 posted.
- Apr 13: Assignment 4 posted.
- Mar 17: Assignment 3 posted.
- Mar 15: Reading material on HMMs posted.
- Feb 23: Readings on multiple sequence alignment and phylogenetics posted.
- Feb 21: Assignment 2 posted.
- Feb 10: Assignment 1 posted.
- Feb 8: New reading material posted for alignment, scoring matrices and database search
- Feb 6: Accounts created for all students on the assignment submission site.
- Jan 31: Course website is up, with the calendar and syllabus.
- Jan 23: Course website is up, with the calendar and syllabus.
[l] Lecture 21
[l]Network Clustering
[l]Network Clustering (spectral) [l] Lecture 22
[l]Network Biology
[l]Network Centrality and Motifs [l] Lecture 20
[l]Network Motifs
[l]Network Clustering
[l]Network Biology
[l]Network Biology (Graph Models) [l] Lecture 19
[l] EXAM III
[l]Network Clustering
[l]Network Clustering
[l] EXAM III
[l]Network Biology Overview
[l]Sequence Assembly [l] Lecture 15
[l]Network Biology Basics
[l]Genome Rearrangement
[l]Network Motifs
[l]Microarray Analysis
[l]Network Clustering
[l]Microarray Analysis
[l]Network Clustering
[l]Network Biology
[l]Microarray Analysis
[l]Network Biology
[l]Microarray Analysis
[l]Network Motifs
[l]Kernel Methods in Computational Biology
[l]Network Clustering
[l]Kernel Methods
[l]Network Clustering
[l]Sequence Patterns & Indexing (Suffix Trees)
[l]Hidden Markov Models
[l]Hidden Markov Models
[l]Sequence Indexing (Suffix Trees)
[l]Structure Alignment
[l]Sequence Indexing (Suffix Arrays)
[l]Structure Motifs
[l]Sequence Motifs
[l]Structure Indexing
[l]Sequence Motifs
[l]Sequence Patterns & Indexing (Suffix Trees)
[l] Phylogenetics (probabilistic) [l] Lecture 8
[l]Sequence Motifs
[l]Sequence Patterns & Indexing (Suffix Trees)
[l] [[Path:/~zaki/Courses/bioinfo/lectures/Lecture6.pdf|Lecture 6]]
[l] Lecture 6
[l] Phylogenetics
[l] Phylogenetics (parsimony) [l] [[Path:/~zaki/Courses/bioinfo/lectures/Lecture6.pdf|Lecture 6]]
- Feb 23: Readings on multiple sequence alignment and phylogenetics posted.
[l]Sequence Indexing (Suffix Trees)
[l]Multiple Sequence Alignment [l] Lecture 5
[l]Sequence Indexing (Suffix Arrays)
[l] Phylogenetics
[l]Sequence Motifs
[l] Phylogenetics
[l]Hidden Markov Models
[l]Sequence Patterns & Indexing (Suffix Trees)
[l]Phylogenetics
[l]Sequence Motifs
[l]Phylogenetics
[l]Hidden Markov Models
[l]Structure Alignment
[l]Hidden Markov Models
[l]Structure Prediction
[l]Structure Alignment
[l]Sequence Searching
[l]Sequence Alignment & Scoring [l] Lecture 3
[l]Sequence Scoring
[l]Sequence Scoring & Searching
- Feb 8: New reading material posted for alignment, scoring matrices and database search
- Feb 6: Accounts created for all students on the assignment submission site
- Biological Sequence Analysis, Durbin, Eddy, Krogh, Mitchison, Cambridge University Press, 1999
- Protein Bioinformatics: An Algorithmic Approach to Sequence and Structure Analysis, Eidhammer, Jonassen, Taylor, John Wiley & Sons, 2004
- Understanding Bioinformatics, Zvelebil, Baum, Garland Science, 2007
- Biological Sequence Analysis, Durbin, Eddy, Krogh, Mitchison, Cambridge University Press, 1999
- Protein Bioinformatics, Eidhammer, Jonassen, Taylor, John Wiley & Sons, 2004
- Understanding Bioinformatics, Zvelebil, Baum, Garland Science, 2007
Computational Biology deals with the science of analyzing biological data. The goal of this course is to introduce the main topics and the frontiers of computational biology. The basic topics include sequence and protein structure analysis (alignment, evolution, search, motifs, and indexing). The emerging topics include gene expression analysis, network biology, and kernel data mining methods. The emphasis will be on the application of these methods to the "omics" subfields of systems biology, i.e., computational genomics, proteomics, interactomics, transcriptomics, and metabolomics.
Computational Biology deals with the science of analyzing biological data. The goal of this course is to introduce the main topics and the frontiers of computational biology. The basic topics include sequence and protein structure analysis (alignment, evolution, search, motifs, and indexing). The emerging topics include gene expression analysis, network biology, and kernel data mining methods. The emphasis will be on the application of these methods to the "omics" subfields of computational systems biology, i.e., genomics, proteomics, interactomics, transcriptomics, and metabolomics.
Data mining is the process of automatic discovery of patterns, models, changes, associations and anomalies in massive databases. This course will provide an introduction to the main topics in data mining and knowledge discovery, including: statistical foundations, pattern mining, classification, and clustering. Emphasis will be laid on the algorithmic foundations.
Computational Biology deals with the science of analyzing biological data. The goal of this course is to introduce the main topics and the frontiers of computational biology. The basic topics include sequence and protein structure analysis (alignment, evolution, search, motifs, and indexing). The emerging topics include gene expression analysis, network biology, and kernel data mining methods. The emphasis will be on the application of these methods to the "omics" subfields of systems biology, i.e., computational genomics, proteomics, interactomics, transcriptomics, and metabolomics.
- knowledgeable about the fundamental data mining tasks like pattern mining, classification and clustering
- knowledgeable about the fundamental computational biology tasks like sequence and structure analysis and evolution, biological networks, and data mining methods in bioinformatics
- able to implement and apply the techniques to real world datasets
- able to implement and apply the techniques to real world omics datasets
The prerequisites for this course include data structures and algorithms, and discrete mathematics. Basic linear algebra and probability & statistics will be very useful as well. Assignments will require the use of the R software. Students are expected to learn R on their own. Assignments must be submitted online at the wiki site. Knowledge of pmwiki markup usage will be your responsibility.
The prerequisites for this course include data structures and algorithms, discrete mathematics, and probability & statistics. Knowledge of basic linear algebra will serve you well too. Assignments will require the use of R, Perl, or Python. Only these three scripting languages will be permitted for the assignments, which must be submitted online at the wiki site. Knowledge of pmwiki markup usage will be your responsibility.
There is no required text for the course. Notes will be handed out in class.
The following textbooks are also good references:
- Introduction to Data Mining, by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, Addison Wesley, 2006.
- Data Mining: Concepts and Techniques (2nd edition), by Jiawei Han and Micheline Kamber, Morgan Kaufmann, 2006.
There is no required text for the course. Reading materials will be handed out via the course wiki.
The following books are good references:
- Biological Sequence Analysis, Durbin, Eddy, Krogh, Mitchison, Cambridge University Press, 1999
- Protein Bioinformatics: An Algorithmic Approach to Sequence and Structure Analysis, Eidhammer, Jonassen, Taylor, John Wiley & Sons, 2004
- Understanding Bioinformatics, Zvelebil, Baum, Garland Science, 2007
- Assignments (40%): The assignments are meant to be practically oriented. You'll be asked to run some mining methods on some real datasets, or to implement some algorithms, to complement the theory. There will be roughly one assignment per week, to be submitted via the course wiki site. User accounts will be created after the first day of class.
- Assignments (40%): The assignments are meant to be practically oriented. You'll be asked to implement algorithms and apply them to real datasets, to complement the theory. Only R, Perl, or Python is permitted as the scripting language. There will be roughly one assignment every two weeks, to be submitted via the course wiki site. User accounts will be created after the first day of class.
You may consult other members of the class on the homeworks, but you must submit your own work. Anytime you borrow material from the web or elsewhere, you must acknowledge the source.
You may consult other members of the class on the homeworks, but this must be limited to the ideas only; you must submit your own implementation and work. Anytime you borrow material from the web or elsewhere, you must acknowledge the source.
[l] EXAM I
[l]Hidden Markov Models
[l]Hidden Markov Models
[l] EXAM I
[l]
[l]Phylogenetics
[l]
[l]Phylogenetics
[l]
[l]Structure Alignment
[l]
[l]Structure Prediction
[l]
[l]Structure Motifs
[l]
[l]Structure Indexing
[l]
[l] EXAM II
[l]
[l]Network Biology Overview
[l]
[l]Network Biology Basics
[l]
[l]Network Motifs
[l]
[l]Network Clustering
[l]
[l]Network Clustering
[l]
[l]Microarray Analysis
[l]
[l]Microarray Analysis
[l]
[l]Kernel Methods in Computational Biology
[l]
[l] EXAM III
[l]
[l]Kernel Methods
[l]
[l] Overview of Biology
[l]
[l]Sequence Alignment
[l]
[l]Sequence Searching
[l]
[l]Sequence Scoring
[l]
[l]Sequence Indexing (Suffix Trees)
[l]
[l]Sequence Indexing (Suffix Arrays)
[l]
[l]Sequence Motifs
[l]
[l] EXAM I
[l]
[l]Hidden Markov Models
CSCI-4964/6964: Computational Biology, Spring 2010
Class: 10-11:50AM, MR, AE 216
Instructor Office Hours: 12-1PM, MR
Announcements
(:table border=1 bgcolor=aliceblue width=100%:) (:cell:) (:div style="height: 200px; overflow: auto; text-align: justify; padding-top: 10px; padding-left:10px; padding-right:10px;" :)
- Jan 31: Course website is up, with the calendar and syllabus.
(:divend:) (:tableend:)
Calendar
A tentative sequence of topics to be covered in the classes; changes are likely as the course progresses.
[table border=1 width=100%] [row bgcolor=lavender] [!c]Day: Date [!c]Topic
[row] [l]M: Jan 25 [l] NO CLASS [row bgcolor=aliceblue] [l]R: Jan 28 [l] NO CLASS
[row] [l]M: Feb 1 [l] [row bgcolor=aliceblue] [l]R: Feb 4 [l]
[row] [l]M: Feb 8 [l] [row bgcolor=aliceblue] [l]R: Feb 11 [l]
[row] [l]M: Feb 15 [l] NO CLASS (Presidents' Day) [row bgcolor=aliceblue] [l]R: Feb 18 [l]
[row] [l]M: Feb 22 [l] [row bgcolor=aliceblue] [l]R: Feb 25 [l]
[row] [l]M: Mar 1 [l] [row bgcolor=aliceblue] [l]R: Mar 4 [l]
[row] [l]M: Mar 8 [l] NO CLASS (spring break) [row bgcolor=aliceblue] [l]R: Mar 11 [l] NO CLASS (spring break)
[row] [l]M: Mar 15 [l] [row bgcolor=aliceblue] [l]R: Mar 18 [l]
[row] [l]M: Mar 22 [l] [row bgcolor=aliceblue] [l]R: Mar 25 [l]
[row] [l]M: Mar 29 [l] [row bgcolor=aliceblue] [l]R: Apr 1 [l]
[row] [l]M: Apr 5 [l] [row bgcolor=aliceblue] [l]R: Apr 8 [l]
[row] [l]M: Apr 12 [l] [row bgcolor=aliceblue] [l]R: Apr 15 [l]
[row] [l]M: Apr 19 [l] [row bgcolor=aliceblue] [l]R: Apr 22 [l]
[row] [l]M: Apr 26 [l] [row bgcolor=aliceblue] [l]R: Apr 29 [l]
[row] [l]M: May 3 [l] [row bgcolor=aliceblue] [l]R: May 6 [l]
[row] [l]M: May 10 [l] [row bgcolor=aliceblue] [l]R: May 13 [l] NO CLASS (reading days)
[tableend]
Syllabus
(:table border=1 bgcolor=aliceblue width=100%:) (:cell:) (:div style="height: 400px; overflow: auto; text-align: justify; padding-top: 10px; padding-left:10px; padding-right:10px;" :)
Introduction
Data mining is the process of automatic discovery of patterns, models, changes, associations and anomalies in massive databases. This course will provide an introduction to the main topics in data mining and knowledge discovery, including: statistical foundations, pattern mining, classification, and clustering. Emphasis will be laid on the algorithmic foundations.
Learning Objectives
After taking this course students will be
- knowledgeable about the fundamental data mining tasks like pattern mining, classification and clustering
- able to understand the key algorithms for the main tasks
- able to implement and apply the techniques to real world datasets
Prerequisites
The prerequisites for this course include data structures and algorithms, and discrete mathematics. Basic linear algebra and probability & statistics will be very useful as well. Assignments will require the use of the R software. Students are expected to learn R on their own. Assignments must be submitted online at the wiki site. Knowledge of pmwiki markup usage will be your responsibility.
Textbook
There is no required text for the course. Notes will be handed out in class.
The following textbooks are also good references:
- Introduction to Data Mining, by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, Addison Wesley, 2006.
- Data Mining: Concepts and Techniques (2nd edition), by Jiawei Han and Micheline Kamber, Morgan Kaufmann, 2006.
Grading Policy
Your grade will be a combination of the following items. Note that the final distribution is subject to some change depending on the number of assignments, but exams will be at least 60%.
- Assignments (40%): The assignments are meant to be practically oriented. You'll be asked to run some mining methods on some real datasets, or to implement some algorithms, to complement the theory. There will be roughly one assignment per week, to be submitted via the course wiki site. User accounts will be created after the first day of class.
- Exams (60%): There will be three exams covering the main topics of the course. The tentative exam schedule is posted on the class schedule table. There is no comprehensive final exam.
Attendance: Students are strongly encouraged to participate in the class, and should try to attend all classes.
Academic Integrity
You may consult other members of the class on the homeworks, but you must submit your own work. Anytime you borrow material from the web or elsewhere, you must acknowledge the source.
The school takes cases of academic dishonesty very seriously; any such case will result in an automatic "F" grade for the course. Students should familiarize themselves with the relevant portion of the Rensselaer Handbook of Student Rights and Responsibilities on this topic. (:divend:) (:tableend:)