Description
(1) [300 points] Support Vector Machines.
For this question, you will use the data you generated in HW2 from the MNIST Digits Dataset for the training
and test datasets in ZipDigits.train and ZipDigits.test, respectively, with the two features you computed
for the problem of classifying 1s vs. 5s.
(a) Use this method for training support vector machines and this method for model selection with cross validation
from the scikit-learn python library to find the value of the regularization parameter C with the smallest
cross validation error, using 5-fold cross validation on the two-feature training dataset you formed from
the data in ZipDigits.train. Report E_CV for the best value of C. (A code sketch covering parts (a)-(c)
appears after part (c).)
(b) For the chosen value of C, learn a support vector machine using all of the training data. Compute and report its
E_in.
(c) Use the test dataset from ZipDigits.test to compute E_test for the classifier you just learned and report it.
Compare it with the results from HW2 using the linear model.
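A minimal sketch of one way to carry out parts (a)-(c) with scikit-learn's SVC and GridSearchCV. It assumes the two HW2 features and the +/-1 labels are already loaded into NumPy arrays X_train, y_train (from ZipDigits.train) and X_test, y_test (from ZipDigits.test); those array names, the hypothetical loader, and the grid of C values are illustrative placeholders. A linear kernel is assumed here because part (c) compares against the HW2 linear model; the method linked in the assignment may specify otherwise.

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import GridSearchCV

    # Assumed already available: the two HW2 features and +/-1 labels, e.g.
    # X_train, y_train = load_features("ZipDigits.train")   # hypothetical loader
    # X_test,  y_test  = load_features("ZipDigits.test")    # hypothetical loader

    # (a) 5-fold cross validation over a grid of C values for a linear-kernel SVM.
    param_grid = {"C": np.logspace(-3, 3, 13)}              # illustrative search range
    search = GridSearchCV(SVC(kernel="linear"), param_grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    best_C = search.best_params_["C"]
    E_cv = 1.0 - search.best_score_                         # E_CV = 1 - mean CV accuracy
    print(f"best C = {best_C:g}, E_CV = {E_cv:.4f}")

    # (b) Retrain on all of the training data with the chosen C and report E_in.
    clf = SVC(kernel="linear", C=best_C).fit(X_train, y_train)
    E_in = 1.0 - clf.score(X_train, y_train)

    # (c) Test error on the ZipDigits.test features, for comparison with HW2.
    E_test = 1.0 - clf.score(X_test, y_test)
    print(f"E_in = {E_in:.4f}, E_test = {E_test:.4f}")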
(2) [300 points] Support Vector Machines with the Polynomial Kernel.
For this question, you will use the data you generated in HW3 from the MNIST Digits Dataset for classifying 1s vs.
Not 1s, where you created D with 300 randomly selected data points and D_test consisting of the remaining data points.
(a) Use this method (not the same as the one for the previous question) for training support vector machines using
the kernel for the 10th-order polynomial feature transform, and this method for model selection with cross
validation from the scikit-learn python library, to find the value of the regularization parameter C with
the smallest cross validation error, using 5-fold cross validation on the training dataset D you formed in HW3.
Report E_CV for the best value of C. (A code sketch covering parts (a)-(c) appears after part (c).)
(b) For the chosen value of C, learn a support vector machine using all of the training data in D. Compute and report
its E_in.
(c) Use the test dataset D_test to compute E_test for the classifier you just learned and report it.
Compare it with the results from HW3 using the linear model with the 10th-order polynomial feature transform.
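A minimal sketch of one way parts (a)-(c) might be set up, assuming the HW3 features and +/-1 labels are already in NumPy arrays X_D, y_D (the 300-point set D) and X_Dtest, y_Dtest (the remaining points); the array names and the grid of C values are placeholders. Setting gamma=1 and coef0=1 in SVC's polynomial kernel gives (1 + x^T x')^10, one common reading of the 10th-order polynomial feature transform; the method linked in the assignment may prescribe different kernel settings.

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import GridSearchCV

    # (a) 5-fold CV over C for an SVM with a degree-10 polynomial kernel.
    param_grid = {"C": np.logspace(-3, 3, 13)}                      # illustrative range
    svm_poly = SVC(kernel="poly", degree=10, gamma=1.0, coef0=1.0)  # kernel (1 + x.x')^10
    search = GridSearchCV(svm_poly, param_grid, cv=5, scoring="accuracy")
    search.fit(X_D, y_D)
    best_C = search.best_params_["C"]
    print(f"best C = {best_C:g}, E_CV = {1.0 - search.best_score_:.4f}")

    # (b) Retrain on all of D with the chosen C and report E_in.
    clf = SVC(kernel="poly", degree=10, gamma=1.0, coef0=1.0, C=best_C).fit(X_D, y_D)
    print(f"E_in   = {1.0 - clf.score(X_D, y_D):.4f}")

    # (c) Test error on D_test, for comparison with the HW3 linear model.
    print(f"E_test = {1.0 - clf.score(X_Dtest, y_Dtest):.4f}")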
(3) [400 points] The k-NN rule.
For this question, you will use the data you generated in HW3 from the MNIST Digits Dataset for classifying 1s vs.
Not 1s, where you created D with 300 randomly selected data points and D_test consisting of the remaining data points.
You will have to implement the k Nearest Neighbors (k-NN) rule. You may use the helper code as a starting point.
(a) Use cross validation with D to select the optimal value of k for the k-NN rule. Show a plot of E_CV versus k and
indicate the value of k you choose. (A code sketch covering parts (a)-(c) appears after part (c).)
(b) For the chosen value of k, plot the decision boundary. Also compute and report its E_in and E_CV.
(c) Report E_test for the k-NN rule corresponding to the chosen value of k.
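A minimal sketch of a from-scratch k-NN rule and the cross validation in parts (a)-(c), assuming the HW3 data are in NumPy arrays X_D, y_D (300 points, two features, +/-1 labels) and X_Dtest, y_Dtest. The array names, the range of k values, and the use of 10-fold cross validation are illustrative assumptions; the assignment does not fix them, and the provided helper code may organize things differently.

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.model_selection import KFold

    def knn_predict(X_tr, y_tr, X_q, k):
        """Plain k-NN rule: majority vote of the k nearest training points (Euclidean)."""
        # Pairwise squared distances, shape (n_query, n_train).
        d2 = ((X_q[:, None, :] - X_tr[None, :, :]) ** 2).sum(axis=2)
        nearest = np.argsort(d2, axis=1)[:, :k]
        votes = y_tr[nearest].sum(axis=1)
        return np.where(votes >= 0, 1, -1)

    # (a) Cross validation over odd k (odd k avoids voting ties with +/-1 labels).
    ks = list(range(1, 31, 2))                               # illustrative range of k
    kf = KFold(n_splits=10, shuffle=True, random_state=0)    # 10 folds is an assumption
    E_cv = []
    for k in ks:
        fold_errs = []
        for tr, va in kf.split(X_D):
            pred = knn_predict(X_D[tr], y_D[tr], X_D[va], k)
            fold_errs.append(np.mean(pred != y_D[va]))
        E_cv.append(np.mean(fold_errs))
    best_k = ks[int(np.argmin(E_cv))]
    plt.plot(ks, E_cv, marker="o")
    plt.xlabel("k"); plt.ylabel("E_CV"); plt.show()

    # (b) Decision boundary over a grid of the two features, plus E_in and E_CV.
    xx, yy = np.meshgrid(np.linspace(X_D[:, 0].min(), X_D[:, 0].max(), 200),
                         np.linspace(X_D[:, 1].min(), X_D[:, 1].max(), 200))
    zz = knn_predict(X_D, y_D, np.c_[xx.ravel(), yy.ravel()], best_k).reshape(xx.shape)
    plt.contourf(xx, yy, zz, alpha=0.3)
    plt.scatter(X_D[:, 0], X_D[:, 1], c=y_D, s=10)
    plt.show()
    E_in = np.mean(knn_predict(X_D, y_D, X_D, best_k) != y_D)
    print(f"best k = {best_k}, E_in = {E_in:.4f}, E_CV = {min(E_cv):.4f}")

    # (c) Test error on D_test with the chosen k.
    E_test = np.mean(knn_predict(X_D, y_D, X_Dtest, best_k) != y_Dtest)
    print(f"E_test = {E_test:.4f}")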