Data Mining for Biomedical Informatics

A half-day workshop, to be held in conjunction with the
8th SIAM International Conference on Data Mining (SDM 2008)


Taming Patterns in Biological Data (plenary talk)

Laxmi Parida
IBM T.J. Watson Research Center (http://www.research.ibm.com/people/p/parida/)
Courant Institute of Mathematical Sciences, New York University (http://cs.nyu.edu/~parida/)

 

Abstract

Patterns abound in large data sets. The more flexible a pattern definition, the larger the number of instances and the greater the confusion, while rigid definitions are very restrictive. Must we throw away the proverbial baby with the bath water? In this talk we look at the combinatorics and statistics of patterns that computational biologists discover at different levels in biological data, be it nucleic acid sequences, microarray data, or other formal structures. We exploit the commonality that runs across these various domains and apply the lessons learned in one to another. I make the case for applying nontrivial combinatorics to an interesting class of patterns called permutation patterns. This mathematical structure is applied to some problems arising naturally in computational biology, such as finding common gene clusters across species, inferring phylogeny within populations, and modelling the complex control of transcription via motifs.

I will end the talk with ‘patterns on Linkage Disequilibrium (LD)’ and what it entails in the context of the Genographic Project.
 

Bio

Laxmi Parida is a research staff member in the Computational Biology Center at the IBM T.J. Watson Research Center, Yorktown Heights, and a visiting professor at New York University. She obtained her PhD in Computer Science, in the area of computational genomics, from the Courant Institute of Mathematical Sciences in 1998. She has authored over seventy-five research papers and holds several patents related to her algorithmic work. She has served on the program committees of several leading conferences in computational biology and in string algorithms. Her research monograph on pattern discovery in computational biology was published by Chapman & Hall and appeared last summer.