Recursive Data Mining
Introduction
Recursive Data Mining (RDM) is a new approach for discovering features from sequences of tokens. A token sequence can represent any data that decomposes into basic units, such as characters, words, or amino acids.
An overview of the RDM algorithm can be found under the RDM tab on the menu. RDM was introduced by B. Szymanski and Y. Zhang, who applied it successfully to masquerade detection and author identification.
We have also applied RDM to the role identification task. Based on the results we observed, RDM can extract useful patterns from a very noisy dataset such as the Enron email dataset. We are currently looking into the possibility of using RDM for protein family classification. For more detail on these specific projects, please look under project.