CSCI 5521 Homework 5: Perceptron Algorithm

Instructor: Rui Kuang (kuang@umn.edu). TAs: Tianci Song (song0309@umn.edu) and Ruyuan (wanxx199@umn.edu).


1. (60 points) The perceptron algorithm can be kernelized to allow non-linear mappings. A simple version of the kernel perceptron algorithm replaces the weight vector $w$ with per-sample weights $\alpha$, so that

$$f(x) = \langle w, x \rangle + b = \sum_i \alpha_i y_i \langle x_i, x \rangle + b$$

(note: you can also stack 1 as the last feature in $x$ to represent the bias term $b$ within $w$). The kernel perceptron algorithm can be trained equivalently to the original perceptron algorithm as follows:
$\alpha = \mathbf{0}_N$; $b = 0$;
while not converged do
    for each sample $(x_t, y_t)$ do
        if $\big(\sum_{i=1}^{N} \alpha_i y_i \langle x_i, x_t \rangle + b\big)\, y_t \le 0$ then
            $\alpha_t = \alpha_t + 1$
            $b = b + y_t$
        end if
    end for
end while
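For reference, here is a minimal Python sketch of this training loop with a polynomial kernel $K(x, z) = (\langle x, z \rangle + 1)^d$. The function name and signature follow the Instructions section below; the max_iter epoch cap is an added assumption so the loop terminates even when the chosen kernel does not separate the data:

import numpy as np

def kernPercGD(train_data, train_label, poly_degree, max_iter=100):
    # Kernel perceptron with polynomial kernel K(x, z) = (x . z + 1)^d.
    # max_iter is an assumed safeguard; the pseudocode only says "until converged".
    n = train_data.shape[0]
    y = np.asarray(train_label).reshape(-1)
    # Precompute the Gram matrix once; K[i, t] = K(x_i, x_t).
    K = (train_data @ train_data.T + 1.0) ** poly_degree
    alpha = np.zeros(n)
    b = 0.0
    for _ in range(max_iter):
        converged = True
        for t in range(n):
            # f(x_t) = sum_i alpha_i y_i K(x_i, x_t) + b
            if (np.sum(alpha * y * K[:, t]) + b) * y[t] <= 0:
                alpha[t] += 1      # mistake on sample t: increase its weight
                b += y[t]          # shift the bias toward the true class
                converged = False
        if converged:
            break
    return alpha.reshape(-1, 1), b  # alpha: n x 1 vector, b: scalar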
(a) Implement this kernel perceptron using a polynomial kernel (try different degrees) and train it on the non-linearly separable data generated by the following Python code:
import numpy as np
np.random.seed(1) # For reproducibility
r1 = np.sqrt(np.random.rand(100, 1)) # Radius
t1 = 2*np.pi*np.random.rand(100, 1) # Angle
data1 = np.hstack((r1*np.cos(t1), r1*np.sin(t1))) # Points
np.random.seed(2) # For reproducibility
r2 = np.sqrt(3*np.random.rand(100, 1)+2) # Radius
t2 = 2*np.pi*np.random.rand(100, 1) # Angle
data2 = np.hstack((r2*np.cos(t2), r2*np.sin(t2))) # Points
Data from class 1 is contained in the matrix data1 and data from class 2 is contained in the matrix data2. Use the following code to visualize the data:
import matplotlib.pyplot as plt
# Note: data3 and labels are defined by the combining code below.
plt.scatter(data3[np.where(labels==1)[0],0],
            data3[np.where(labels==1)[0],1], c='r')
plt.scatter(data3[np.where(labels==-1)[0],0],
            data3[np.where(labels==-1)[0],1], c='b')
plt.show()
Combine the data into a single matrix and assign class labels {1, −1} to test your kernel perceptron algorithm, using the following code:
data3 = np.vstack((data1, data2))
labels = np.ones((200, 1))
labels[0:100, :] = -1
All the data is contained in the matrix data3 and the corresponding labels are in the vector labels.
Test your implementation of the kernel perceptron algorithm on the above data, plot the decision boundary (hint: follow the explanations in footnote 2 (MATLAB code), then use the analogous functions provided in Python: the meshgrid function from numpy and the contour function from matplotlib to draw the boundary; footnote 3 shows a demo of contour plotting with these two functions), and report the error rate.
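One possible way to draw the boundary (a sketch assuming the kernPercGD outputs above, the same polynomial kernel, and a grid range of ±2.5 chosen to cover the generated points):

xx, yy = np.meshgrid(np.linspace(-2.5, 2.5, 200),
                     np.linspace(-2.5, 2.5, 200))
grid = np.c_[xx.ravel(), yy.ravel()]            # all grid points as rows
# Evaluate f(x) = sum_i alpha_i y_i K(x_i, x) + b at every grid point.
Kg = (data3 @ grid.T + 1.0) ** poly_degree      # kernel between training and grid points
f = (alpha.ravel() * labels.ravel()) @ Kg + b
plt.contour(xx, yy, f.reshape(xx.shape), levels=[0], colors='k')
plt.show()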
(b) Apply the SVC function in the svm module from the sklearn package (footnote 4), with the same kernel, to the same data, and plot the decision boundary obtained by the SVM in the same figure as the decision boundary of your kernel perceptron. Compare the results with your kernel perceptron implementation. Play with the C penalty parameter and explain how it changes the margin obtained.
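A minimal sketch of the SVM baseline (gamma=1 and coef0=1 are assumptions chosen so sklearn's polynomial kernel matches $(\langle x, z \rangle + 1)^d$; C=1.0 is just a starting value to vary):

from sklearn.svm import SVC

clf = SVC(kernel='poly', degree=poly_degree, gamma=1.0, coef0=1.0, C=1.0)
clf.fit(data3, labels.ravel())
# Overlay the SVM boundary on the same meshgrid used above.
Z = clf.decision_function(grid).reshape(xx.shape)
plt.contour(xx, yy, Z, levels=[0], colors='g')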
(c) Once you have verified that your kernel perceptron works, train and evaluate your implementation on the given subsets of the optdigits dataset. The first training and test datasets, digits49_train.txt and digits49_test.txt, consist of only digits 4 and 9. The second training and test datasets, digits79_train.txt and digits79_test.txt, consist of only digits 7 and 9. Report the training and test error rates on both datasets.
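A minimal evaluation sketch for one subset (the last-column label convention is stated in the Instructions below; the comma delimiter and the degree of 2 are assumptions):

def error_rate(alpha, b, Xtr, ytr, X, y, d):
    # Predict with f(x) = sum_i alpha_i ytr_i (x_i . x + 1)^d + b and compare signs.
    K = (Xtr @ X.T + 1.0) ** d
    pred = np.sign((alpha.ravel() * ytr.ravel()) @ K + b)
    return np.mean(pred != y.ravel())

train = np.loadtxt('digits49_train.txt', delimiter=',')
test = np.loadtxt('digits49_test.txt', delimiter=',')
Xtr, ytr = train[:, :-1], train[:, -1]
Xte, yte = test[:, :-1], test[:, -1]
alpha, b = kernPercGD(Xtr, ytr, poly_degree=2)
print('train error:', error_rate(alpha, b, Xtr, ytr, Xtr, ytr, 2))
print('test error: ', error_rate(alpha, b, Xtr, ytr, Xte, yte, 2))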
2. https://www.mathworks.com/help/stats/support-vector-machines-for-binary-classification.html
3. https://matplotlib.org/3.1.1/gallery/images_contours_and_fields/contour_demo.html
4. https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
2. (40 points) Given the ν-SVM (Section 13.4 in the book), consider the hard-margin version of its objective function, defined as:

$$\begin{aligned} \text{minimize} \quad & \tfrac{1}{2}\|w\|^2 - \nu\rho \\ \text{subject to} \quad & r^t (w^T x^t + w_0) \ge \rho, \quad \forall t, \\ & \rho \ge 0. \end{aligned}$$
Start from the primal form and show each step to derive the Lagrange dual of
the hard-margin ν-SVM variation. (Hint: Read Section 13.3 carefully to see
how Equation 13.15 is derived from Equation 13.10).
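As a non-authoritative sketch of the first step only (the multiplier names $\alpha^t$ and $\beta$ are choices made here, following the conventions of Section 13.3): introduce $\alpha^t \ge 0$ for the margin constraints and $\beta \ge 0$ for $\rho \ge 0$, giving the primal Lagrangian

$$L_p = \frac{1}{2}\|w\|^2 - \nu\rho - \sum_t \alpha^t \left[ r^t (w^T x^t + w_0) - \rho \right] - \beta\rho.$$

Setting the derivatives with respect to $w$, $w_0$, and $\rho$ to zero yields $w = \sum_t \alpha^t r^t x^t$, $\sum_t \alpha^t r^t = 0$, and $\sum_t \alpha^t = \nu + \beta$, which are then substituted back to obtain the dual.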
Instructions
• Solutions to all questions must be presented in a report that includes result explanations and all error rates.
• All programming questions must be written in Python; no other programming languages will be accepted. The code must be executable from either the command line or a PyCharm window on the cselabs machines. Each function must take its inputs in the order specified and print/display the required output to either the terminal or the PyCharm console. For each part, you can submit additional files/functions (as needed) which will be used by the main functions specified below. Put comments in your code so that one can follow the key parts and steps. Please follow the rules strictly. If we cannot run your code, you will receive no credit.
• Question 1:
– Train the kernel perceptron: kernPercGD(train_data: training data matrix, train_label: training label vector, poly_degree: polynomial degree). The function must return the outputs in variables (α: an n × 1 vector; b: a scalar).
• For the optdigits datasets, the last column is the label {−1, 1}.
Submission
• Things to submit:
1. hw5_sol.pdf: A PDF document containing the report with solutions to all questions.
2. kernPercGD.py: The Python code of the kernPercGD function in Question 1.
3. hw5_Q1.py: Main script for Question 1, which learns the model and prints the error rates for each of the three datasets.
4. Any other files, except the data, that are necessary for your code.
• Submit: Upload hw5_sol.pdf as a separate file, plus a zip file containing all other files. All material must be submitted electronically via Canvas.