Home
Publications
Code
CV
| Open
Source Code:
December 21, 2010: This page contains the
various open source libraries that I have developed for my research.
Although this code has been primarily used for the particular
experiments, they have been designed to be used for other tasks with
little to no change needed in the library files. I am currently working
to improve these libraries to be as robust and flexible as possible.
Feel free to contact me if you have questions, comments, or problems
using the code.
Sampled Spectral Distance Embedding
(SSDE):
Originally developed by Ali Civril and Eli Bocke-Rivel, this code
implements the SSDE method. A given (undirected and connected) graph is
embedded into Cartesian space in such a way as to minimize the loss of
information in the graph edges. The algorithm executes quickly due to
clever sampling of the nodes and a matrix approximation method that
runs in linear time with respect to the number of nodes in the graph.
SSDE
Code
SSDE
Documentation
Related Works:
Civril, A., Magdon-Ismail, M.
SSDE: Fast graph drawing using sampled spectral distance embedding,
Graph Drawing, 2007.
Low-Rank Perturbation Gaussian
Mixture Model (LRPGMM):
This code is a modified version of the typical EM algorithm used for
training a Gaussian Mixture Model (GMM). The defining difference is the
approximation of the covariance matrix for each mixture. The GMM can be
trained using the full covariance matrix, a diagonal matrix
approximation (i.e. only the variances), or using a low-rank perturbed
diagonal matrix decomposition (LRPDD). With the LRPDD, the covariance
matrix is represented by a linear number of parameters (with respect to
the dimension of the data being modeled). This decomposition allows for
a linear time (vs. quadratic or cubic) training the GMM and minimizes
the overfitting that occurs with a full rank covariance matrix.
Additionally, there is a naive overlapping clustering algorithm
built-in to the code that clusters the data based on the posterior
probabilities.
Note: This code relies on the ALGLIB
library.
GMM
Code
GMM
Documentation
Related Works:
Purnell, J., Magdon-Ismail, M.,
"Approximating the Covariance Matrix with Low-Rank
Perturbations", IDEAL, 2010.
SSDE-Clustering:
Given a graph, the SSDE code can be applied to embed the graph into
Carteisan coordinates. With these coordinates, a GMM can be fitted to
the data. In addition to the GMM model, the posterior probabilites for
each node is calculated. Finally the data is cluster in one of two
ways. One method is to use the posterior probabilities to cluster the
nodes in such a way as to create higher quality clusters than would be
found using the GMM's naive clustering function. The quality of the
clusters is based on a metric described in the related works. The other
method is specify the average number of clusters a data sample is
assigned to. The SSDE-Cluster code provides the code for Cluster, which
clusters the data, and the SSDE-Cluster shell script which provides a
wrapper for SSDE, GMM, and Cluster.
SSDE-Cluster
Code
SSDE-Cluster
Documentation
Related Works:
Purnell, J., Magdon-Ismail, M. "Fast
Overlapping Clustering of Networks Using Sampled Spectral Distance
Embedding and GMMs" CS Dept., RPI, 2010 |
|