Kernel Methods

Kernel Computation

Compute the centered and normalized homogeneous quadratic kernel matrix \(\mathbf{K}\) for the iris.txt dataset, using the kernel function \(K(\mathbf{x}_i, \mathbf{x}_j) = (\mathbf{x}_i^T \mathbf{x}_j)^2\) directly in input space. Ignore the last column of the data, which is a categorical attribute giving the type of Iris flower.
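A minimal sketch of this step in Python, assuming iris.txt is a comma-separated file whose first four columns are the numeric attributes (the file name, delimiter, and column layout are assumptions about the dataset):

```python
import numpy as np

# Load the four numeric attributes; the last (categorical) column is skipped.
D = np.loadtxt("iris.txt", delimiter=",", usecols=(0, 1, 2, 3))
n = D.shape[0]

# Homogeneous quadratic kernel in input space: K_ij = (x_i^T x_j)^2
K = (D @ D.T) ** 2

# Center the kernel matrix: Kc = (I - 1/n 11^T) K (I - 1/n 11^T)
C = np.eye(n) - np.ones((n, n)) / n
Kc = C @ K @ C

# Normalize so each point has unit norm in feature space:
# Khat_ij = Kc_ij / sqrt(Kc_ii * Kc_jj)
d = np.sqrt(np.diag(Kc))
Khat = Kc / np.outer(d, d)
```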

Next, explicitly transform each point \(\mathbf{x}_i\) to the feature space point \(\phi(\mathbf{x}_i)\) using the homogeneous quadratic kernel. Center these points and normalize them. Finally, verify that the pairwise dot products of the centered and normalized points in feature space yield the same kernel matrix as the one computed directly in input space via the kernel function. To do this, compute the difference between the kernel matrices from the two approaches, and print the sum of the differences.
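Continuing the sketch above (it reuses D and Khat), one explicit feature map for the homogeneous quadratic kernel in \(d\) dimensions keeps the \(d\) squared terms and the \(d(d-1)/2\) cross terms scaled by \(\sqrt{2}\), so that \(\phi(\mathbf{x})^T \phi(\mathbf{y}) = (\mathbf{x}^T \mathbf{y})^2\):

```python
import numpy as np

def phi(x):
    """Explicit feature map: squared terms plus sqrt(2)-scaled cross terms."""
    d = len(x)
    sq = [x[i] * x[i] for i in range(d)]
    cross = [np.sqrt(2) * x[i] * x[j] for i in range(d) for j in range(i + 1, d)]
    return np.array(sq + cross)

# Map all points to feature space (10 dimensions for d = 4)
Phi = np.array([phi(x) for x in D])

# Center in feature space, then normalize each point to unit length
Phi_c = Phi - Phi.mean(axis=0)
Phi_n = Phi_c / np.linalg.norm(Phi_c, axis=1, keepdims=True)

# Pairwise dot products should reproduce Khat from the input-space approach
K2 = Phi_n @ Phi_n.T
print(np.sum(Khat - K2))  # should be zero up to floating-point error
```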

Principal Components of Kernel Matrix

Compute the principal components (PCs) of the centered and normalized kernel matrix computed above. (Note: in Python you may use numpy.linalg.eig; in R you may use eigen.)
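A sketch of the eigendecomposition, again continuing from Khat above; numpy.linalg.eigh is used here instead of eig since the kernel matrix is symmetric (this substitution is a choice, not part of the assignment):

```python
import numpy as np

# eigh exploits symmetry and returns real eigenvalues in ascending order
evals, evecs = np.linalg.eigh(Khat)

# Reorder so the largest eigenvalue (first PC) comes first
order = np.argsort(evals)[::-1]
evals, evecs = evals[order], evecs[:, order]
```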

How many components are required to capture 90% of the total variance?
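One way to answer this, assuming evals from the sketch above: each eigenvalue \(\lambda_r\) contributes \(\lambda_r / \sum_s \lambda_s\) of the total variance, so the answer is the smallest number of leading components whose cumulative ratio reaches 0.90:

```python
import numpy as np

# Clip tiny negative eigenvalues caused by floating-point round-off
lam = np.clip(evals, 0, None)

ratios = np.cumsum(lam) / np.sum(lam)
r90 = int(np.argmax(ratios >= 0.90)) + 1
print("components needed for 90% of the variance:", r90)
```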

Project each of the original points/rows of \(\mathbf{K}\) onto the first two PCs and create a scatter plot of the projected points. What is the range of values in each PC dimension?
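A sketch of the projection and plot, following the assignment's reading that the rows of \(\mathbf{K}\) are projected onto the eigenvectors (since \(\mathbf{K}\mathbf{u}_r = \lambda_r \mathbf{u}_r\), this is the same as scaling each eigenvector by its eigenvalue):

```python
import matplotlib.pyplot as plt

# Project each row of Khat onto the first two principal components
A = Khat @ evecs[:, :2]

plt.scatter(A[:, 0], A[:, 1], s=10)
plt.xlabel("PC1")
plt.ylabel("PC2")
plt.show()

# Range of the projected values along each PC dimension
print("PC1 range:", A[:, 0].min(), A[:, 0].max())
print("PC2 range:", A[:, 1].min(), A[:, 1].max())
```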
