
# Density Clustering

### Density Based Clustering: DENCLUE

Write a script to implement the DENCLUE density-based clustering algorithm (Algorithm 15.2 in Chapter 15). The script should take as input a dataset $$\mathbf{D}$$, the minimum density threshold $$\xi$$, the convergence tolerance $$\epsilon$$, and the kernel width $$h$$. Do not make any assumptions about the data (e.g., column names), except that the last column gives the "true" cluster id.
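As a starting point, the two building blocks of the algorithm, the kernel density estimate and the hill-climbing search for a density attractor, might be sketched as follows. This is a minimal sketch and not the required solution: it assumes a Gaussian kernel, uses only the Python standard library, and the names `squared_distance`, `gaussian_density`, `find_attractor`, and the `max_iter` safeguard are illustrative choices that do not come from the assignment.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def gaussian_density(x, points, h):
    """Kernel density estimate at x (assumption: Gaussian kernel, width h)."""
    n, d = len(points), len(x)
    norm = (2 * math.pi) ** (d / 2) * n * h ** d
    total = sum(math.exp(-squared_distance(x, p) / (2 * h * h)) for p in points)
    return total / norm


def find_attractor(x, points, h, eps, max_iter=1000):
    """Hill-climb from x until successive iterates move by at most eps.

    Each step replaces x_t with the kernel-weighted mean of all points.
    max_iter is a hypothetical safeguard against non-termination.
    """
    x_t = list(x)
    for _ in range(max_iter):
        weights = [math.exp(-squared_distance(x_t, p) / (2 * h * h))
                   for p in points]
        total = sum(weights)
        if total == 0.0:
            # Every kernel weight underflowed; x_t is too far from the data.
            return x_t
        x_next = [sum(w * p[j] for w, p in zip(weights, points)) / total
                  for j in range(len(x_t))]
        if math.sqrt(squared_distance(x_next, x_t)) <= eps:
            return x_next
        x_t = x_next
    return x_t
```

In a full script, a point would typically be kept only if `gaussian_density` at its attractor is at least $$\xi$$, and attractors that are density-reachable from one another would then be merged into a single cluster. A vectorized version (for example with NumPy) would be considerably faster on larger datasets.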

Run your script on the iris.txt dataset, with $$\epsilon=0.0001$$. Your script should output the following:

• The number of clusters and the size of each cluster.
• Each density attractor, followed by the set of points in its cluster.
• The purity of the clustering, based on the true cluster ids.
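Purity is the fraction of points that belong to the majority true class of their assigned cluster. A minimal sketch of that computation is shown below; the function name `purity` and the parallel-list input format are assumptions made for illustration, and how to treat points that end up in no cluster is left to the caller.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, true_ids):
    """Fraction of points that match the majority true id of their cluster.

    cluster_ids[i] is the assigned cluster of point i and true_ids[i] is its
    true id (hypothetical input format).
    """
    if len(cluster_ids) != len(true_ids) or not true_ids:
        raise ValueError("inputs must be non-empty and of equal length")
    # For every cluster, count how often each true id occurs in it.
    counts = defaultdict(Counter)
    for c, t in zip(cluster_ids, true_ids):
        counts[c][t] += 1
    majority_total = sum(max(per_class.values()) for per_class in counts.values())
    return majority_total / len(true_ids)
```

For example, `purity([0, 0, 1, 1, 1], ["a", "a", "a", "b", "b"])` counts 2 majority points in each cluster and returns 4/5.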

For Iris, choose a value of $$\xi$$ that yields 3 clusters, since there are 3 true clusters in the data: try different values, and report results only for the value that gives 3 clusters. Select the value of $$h$$ empirically.

To speed up the density estimate at a point, you may want to first identify its K nearest neighbors and use only those neighbors in the kernel sum.
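One way this approximation could look is sketched below, again assuming a Gaussian kernel. The function name `knn_density` and the choice to keep normalizing by the full dataset size $$n$$ (treating the omitted far-away points as contributing roughly zero) are assumptions, not part of the assignment. Note that this sketch still scans every point to find the neighbors, so it only illustrates the truncated sum; a real speed-up would need precomputed neighbor lists or a spatial index.

```python
import heapq
import math


def knn_density(x, points, h, k):
    """Approximate the Gaussian kernel density at x from its k nearest points."""
    n, d = len(points), len(x)
    # Squared distances from x to every point (brute force, lazily generated).
    sq_dists = (sum((xi - pi) ** 2 for xi, pi in zip(x, p)) for p in points)
    nearest = heapq.nsmallest(k, sq_dists)
    # Assumption: normalize by the full n, not by k.
    norm = (2 * math.pi) ** (d / 2) * n * h ** d
    return sum(math.exp(-s / (2 * h * h)) for s in nearest) / norm
```

When $$h$$ is small relative to the spacing between groups of points, the truncated sum is nearly identical to the full one, because the Gaussian weights of distant points are vanishingly small.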