Assignment 5: SVM Training
Due Date: Thurs, 17 Nov, 2011, before midnight
Your goal is to learn an SVM in the traditional dual formulation for the Attach:iris-slwc.txt dataset. This is a simple 2D dataset: the first two columns are the sepal length and width, and the third column is the class (+1 or -1). One of the classes corresponds to iris-setosa, and the other class to all other types of irises.
Implement the stochastic gradient ascent algorithm 28.1 in chapter 28, with two different kernels, namely, the linear kernel and the homogeneous quadratic kernel. Use \epsilon=0.0001, and C=10. Important: Make sure you map each point to one dimension higher, as explained at the beginning of sec 28.5 (the last dimension is used for computing the offset value b). This should be done before computing the kernel matrix.
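The preprocessing step described above can be sketched as follows. This is only an illustration, not the reference solution: the helper names `augment` and `kernel_matrix` are hypothetical, the toy points are made up, and the homogeneous quadratic kernel is assumed to be K(x, z) = (x^T z)^2 computed on the augmented points.

```python
import numpy as np

def augment(X):
    """Append a column of ones so the last weight acts as the offset b."""
    n = X.shape[0]
    return np.hstack([X, np.ones((n, 1))])

def kernel_matrix(X, kernel="linear"):
    """Return the n x n kernel matrix for already-augmented points X."""
    G = X @ X.T                      # pairwise dot products
    if kernel == "linear":
        return G
    if kernel == "quadratic":
        return G ** 2                # homogeneous quadratic: (x^T z)^2
    raise ValueError("unknown kernel: %s" % kernel)

# Toy points (made up for illustration, not from the assignment data)
X = np.array([[1.0, 2.0], [3.0, 0.5]])
Xa = augment(X)                      # shape (2, 3): each point gains a trailing 1
K_lin = kernel_matrix(Xa, "linear")
K_quad = kernel_matrix(Xa, "quadratic")

# Per-point step size 1 / K(x_k, x_k), the choice used by the script on this page
eta = 1.0 / np.diag(K_lin)
```

Augmenting before building the kernel matrix matters for the quadratic kernel in particular, since squaring the dot product of the augmented points is not the same as squaring first and adding 1 afterwards.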
At the end, print all values of non-zero \alpha_i, i.e., for the support vectors, in the following format:
i, \alpha_i
one per line.
You should also print the number of support vectors.
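One possible way to produce this output is sketched below. The helper name `format_support_vectors` and the tolerance `tol` are assumptions made for illustration; the assignment only asks for non-zero values, so whether to treat tiny values (on the order of 1e-15) as zero is a judgment call. Indices are 0-based here, and the alpha values are made up.

```python
import numpy as np

def format_support_vectors(alpha, tol=0.0):
    """Return lines 'i, alpha_i' for entries with alpha_i > tol,
    followed by a final line with the support-vector count."""
    alpha = np.asarray(alpha, dtype=float)
    sv = np.flatnonzero(alpha > tol)          # indices of the support vectors
    lines = ["%d, %s" % (i, repr(float(alpha[i]))) for i in sv]
    lines.append("number of support vectors: %d" % len(sv))
    return lines

# Made-up alpha values, not results from the assignment data
alpha = np.array([0.0, 2.5, 0.0, 10.0, 1e-15])
for line in format_support_vectors(alpha, tol=1e-8):
    print(line)
```

With `tol=0.0` the entry 1e-15 would also be reported, which is the strict reading of "non-zero".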
Do this for both kernels. The results for the linear kernel should approximately match the hyperplane h_{10} in example 28.7.
To test your approach you can use the small dataset Attach:ldata.txt. The results on this should be similar to the hyperplane in example 28.3.
What to turn in
- Write a python script called RCSID-Assign5.py, and submit a PDF file named RCSID-Assign5.pdf that contains the output for both the kernels in the format mentioned above.
- Submit the assignment as a zip or tar file via email to: dmcourse.cs@gmail.com. The subject of your email should be "RCSID-Assign5 Submission".
Solutions
Iris-slwc.txt data
Linear Kernel
The support vectors are:
24 9.17846165748
54 2.26408152452
65 10.0
106 10.0
108 10.0
121 10.0
137 10.0
145 2.7249961103e-15
number of support vectors: 8
Quadratic Kernel
The support vectors are:
24 5.13769977739
73 0.195297587238
108 5.7346121391
number of support vectors: 3
Here is the python script:
#!/usr/bin/env python3
# Usage: <script> fname C [kernel], where kernel is "linear" or "quadratic"
import sys
import random
import numpy as np

kernel = "linear"  # default
fname = sys.argv[1]
C = float(sys.argv[2])
if len(sys.argv) > 3:
    kernel = sys.argv[3]
print("params", fname, C, kernel)

D = np.loadtxt(fname, delimiter=",")
(n, d) = np.shape(D)  # get input dimensions
X = D[:, 0:2]
X = np.append(X, np.ones((n, 1)), 1)  # add extra column of ones (absorbs the offset b)
Y = D[:, 2]  # classes

if kernel == "linear":
    K = np.dot(X, X.T)
elif kernel == "quadratic":
    K = np.dot(X, X.T)
    K = K * K  # square each element
else:
    sys.exit("unknown kernel: " + kernel)

eta = 1 / np.diag(K)  # step size for each point
alpha = np.zeros(n)

# start iterative stochastic gradient ascent
eps = 0.0001
t = 0
err = 1
while err > eps:
    alpha_prev = alpha.copy()
    idx = list(range(n))
    # random.shuffle(idx)
    for k in idx:
        ay = alpha * Y
        ayK = np.dot(ay, K[:, k])
        alpha[k] += eta[k] * (1 - Y[k] * ayK)
        # clip alpha[k] to the interval [0, C]
        if alpha[k] < 0:
            alpha[k] = 0
        elif alpha[k] > C:
            alpha[k] = C
    t += 1
    err = np.linalg.norm(alpha - alpha_prev)  # change in alpha over one full pass
    print(t, err)

print("The support vectors are:")
for (i, ai) in enumerate(alpha):
    if ai > 0:
        print(i, ai)
print("number of support vectors:", len(alpha[alpha > 0]))

if kernel == "linear":
    # weight vector w = sum_i alpha_i y_i x_i; its last entry is the offset b
    ay = alpha * Y
    print(np.shape(ay))
    print(np.shape(X))
    w = np.sum(X.T * ay, axis=1)
    print("hyperplane:", w)
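
Once the alpha values are known, new points can be classified with the same kernel. The sketch below is an illustration under assumptions: `predict` is a hypothetical helper, points are augmented with a trailing 1 as in the script, and the decision value is sum_i alpha_i y_i K(x_i, z) with no separate bias term because the offset is absorbed into the extra dimension. The toy data and alpha values are made up.

```python
import numpy as np

def predict(alpha, Y, X_aug, Z_aug, kernel="linear"):
    """Classify augmented points Z_aug as +1/-1 using dual coefficients alpha."""
    G = Z_aug @ X_aug.T                      # dot products between test and training points
    K = G if kernel == "linear" else G ** 2  # homogeneous quadratic kernel squares them
    scores = K @ (alpha * Y)                 # sum_i alpha_i y_i K(x_i, z)
    return np.where(scores >= 0, 1, -1)

# Made-up toy problem: two augmented training points on the x-axis
X_aug = np.array([[1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]])
Y = np.array([1.0, -1.0])
alpha = np.array([0.5, 0.5])

# For the linear kernel the weight vector is explicit; its last entry is the offset b
w = (alpha * Y) @ X_aug

Z_aug = np.array([[2.0, 1.0, 1.0], [-3.0, 0.5, 1.0]])
labels = predict(alpha, Y, X_aug, Z_aug)
labels_quad = predict(alpha, Y, X_aug, Z_aug, kernel="quadratic")
```

For the quadratic kernel there is no explicit w in the input space, which is why the script prints a hyperplane only in the linear case; prediction then has to go through the kernel as above.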