# CSCI 5525: Machine Learning Homework 2 solution

\$30.00

Category:

## Description

1. (20 points) Recall that a function K : X × X 7→ R is a valid kernel function if it is
symmetric and positive semi-definite function. For the current problem, we assume that the
domain X = R.
(a) (10 points) Let K1, . . . , Km be a set of valid kernel functions. Show that for any
wj ≥ 0, j = 1, . . . , m, x, x
0 ∈ R
p
that the function K(x, x
0
) = Pm
j=1 wjKj (x, x
0
) is a
valid kernel function.
(b) (10 points) Consider the function K(x, x
0
) = K1(x, x
0
) + K2(x, x
0
) where K1 and K2
are valid kernel functions. Show that K is a valid kernel function.
2. (20 points) The SVM classifier can be implemented in either primal or dual form. In this
problem, implement a linear SVM in dual form using slack variables. Note, this will be a
quadratic program with linear constraints. For this you need an optimizer. Use the optimizer
cvxopt which can be installed in your environment either through pip or conda. Refer to
Apply your SVM to the dataset “hw2 data 2020.csv”. This dataset consists of samples from
2 classes making this a 2-class classification problem. The rows are the samples and the first
p columns are the features and the last column is the label. Split this dataset into 80% train
data and 20% test data. Apply k = 10 fold cross validation on the train data to choose the
optimal value of C (see (a) below).
Please submit (a) summary of methods and results report and (b) code:
(a) Summary of methods and results: Briefly describe the approaches used above, along
with relevant equations. Also, calculate the train and validation error rates over the 10
folds for each value of C = {10−4
, 10−3
, 10−2
, 0.1, 1, 10, 100, 1000}. Report the average
train error rate and its associated standard deviation (over the 10 train error rates)
and the average validation error rate and its associated standard deviation (over the 10
validation error rates). After running cross validation, choose the value of C which gives
the lowest average validation error. Apply the learned model with that value of C to
the held out test set and report the error rate on the test set (1 number). Make sure to
explain why you chose that value of C beyond that it has the lowest validation error rate.
(b) Code: Submit the file SVM dual.py which contains the function def SVM dual(dataset:
str) -> None: where dataset is a string consisting of the name of the dataset and the
function does not return anything but must print out to the terminal (stdout) the average
train and validation error rates and standard deviations, the optimal value of C, and the
test set error rate for the model with the lowest validation error rate.
3. (30 points) In this problem, we consider Kernel SVM. Implement a Kernel SVM for a generic
kernel. Apply your Kernel SVM to the dataset “hw2 data 2020.csv”. Split the dataset in
the same way as in Problem 2 (80% train, 20% test) and apply k = 10 fold cross validation
on the train data to choose to optimal hyperparameters (you must decide on reasonable
hyperparameter ranges) for the following kernels:
(i) Linear kernel,
(ii) RBF kernel.
Please submit (a) summary of methods and results report and (b) code:
(a) Summary of methods and results: Briefly describe the approaches used above, along
with relevant equations. Also, for both (i) and (ii), report the average train and validation error rates and standard deviations (over the 10 folds) for each combination of the
hyperparameter values (you choose the values to experiment with – they must be reasonable and you must be able to explain why they are reasonable). After running cross
validation, choose the optimal hyperparameter values and apply the learned model with
those values to the held out test set and report the error rate on the test set. Make sure
to explain why you chose those hyperparameter values.
(b) Code: Submit the file kernel SVM.py which contains the function def kernel SVM(dataset:
str) -> None: where dataset is a string consisting of the name of the dataset and the
function does not return anything but must print out to the terminal (stdout) the average train and validation error rates and standard deviations, the optimal hyperparameter
values, and the test set error rate for the best model.
4. (30 points) In this problem, we consider multi-class classification using SVM. Implement
a multi-class SVM using the one vs all strategy. Apply your SVM to the “mfeat” dataset1
which contains descriptors from MNIST for reducing the data dimensionality.
Split the dataset in the same way as in Problem 2 (80% train, 20% test) and apply k = 10
fold cross validation on the train data to choose to optimal hyperparameters (you must decide
on reasonable hyperparameter ranges) for the following kernels:
(i) Linear kernel,
(ii) RBF kernel.
Please submit (a) summary of methods and results report and (b) code:
(a) Summary of methods and results: Briefly describe the approaches used above, along
with relevant equations. Also, for both (i) and (ii), report the average train and validation error rates and standard deviations (over the 10 folds) for each combination of the
hyperparameter values (you choose the values to experiment with – they must be reasonable and you must be able to explain why they are reasonable). After running cross
validation, choose the optimal hyperparameter values and apply the learned model with
those values to the held out test set and report the error rate on the test set. Make sure
to explain why you chose those hyperparameter values.
(b) Code: Submit the file multi SVM.py which contains the function def multi SVM(dataset:
str) -> None: where dataset is a string consisting of the name of the dataset and the
function does not return anything but must print out to the terminal (stdout) the average train and validation error rates and standard deviations, the optimal hyperparameter
values, and the test set error rate for the best model.
Additional instructions: Code can only be written in Python 3.6+; no other programming
languages will be accepted. One should be able to execute all programs from the Python command
Each function must take the inputs in the order specified in the problem and display the textual
output via the terminal and plots/figures should be included in the report.
For each part, you can submit additional files/functions (as needed) which will be used by
the main file. In your code, you cannot use machine learning libraries such as those available
from scikit-learn for learning the models. However, you may now use scikit-learn for cross validation – consider the function sklearn.model selection.KFold and see details here: https://
scikit-learn.org/stable/modules/generated/sklearn.model_selection.KFold.html. You
may also use libraries for basic matrix computations and plotting such as numpy, pandas, and
matplotlib. Put comments in your code so that one can follow the key parts and steps in your
code.
Your code must be runnable on a CSE lab machine (e.g., csel-kh1260-01.cselabs.umn.edu).
One option is to SSH into a machine. Learn about SSH at these links: https://cseit.umn.edu/
and https://cseit.umn.edu/knowledge-help/remote-linux-applications-over-ssh.
Instructions
Follow the rules strictly. If we cannot run your code, you will not get any credit.
• Things to submit
1. hw2.pdf: The report that contains the solutions to Problems 1, 2, 3, and 4 including the
summary of methods and results.
2. dual SVM.py: Code for Problem 2.
3. kernel SVM.py: Code for Problem 3.
4. multi SVM.py: Code for Problem 4.