Description
1. (50 points) Check out the files logistic_reg.m and find_test_error.m from the SVN
repository set up for this assignment. The files are just function headers that need to be filled
in. find_test_error should encode a function that, given as inputs a weight vector w, a
data matrix X, and a vector of true labels y (in the formats defined in the header), returns
the classification error of w on the data, assuming that the classifier applies a threshold at
0 to the dot product of w and a feature vector x (augmented with a 1 in the first position of
the vector to allow for a constant or bias term). logistic_reg should encode a gradient
descent algorithm for learning a logistic regression model. It should return a weight vector w
and the training set error E_in (not the classification error, but the negative log likelihood
function) as defined in class. Use a learning rate η = 10^-5 and automatically terminate the
algorithm if the magnitude of every term in the gradient is below 10^-3 at any step.
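The two functions described above can be sketched as follows. This is a minimal illustration in plain Python rather than the MATLAB deliverable, and it makes several assumptions that are not in the handout: the argument order and return values (the real ones are defined in the provided headers), labels already in −1/+1 form, a score of exactly 0 classified as −1, and E_in taken to be the LFD form (1/N) Σ ln(1 + exp(−y_n w·x_n)), which may differ from the in-class definition by a constant factor.

```python
import math

def _dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def _sigmoid(z):
    """Numerically stable 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def _log1p_exp(z):
    """Numerically stable ln(1 + exp(z))."""
    if z > 0:
        return z + math.log1p(math.exp(-z))
    return math.log1p(math.exp(z))

def find_test_error(w, X, y):
    """Fraction of rows misclassified when thresholding w . [1, x] at 0.

    Assumes labels in {-1, +1}; a score of exactly 0 is mapped to -1
    (a tie-breaking choice made for this sketch).
    """
    errors = 0
    for x, label in zip(X, y):
        score = _dot(w, [1.0] + list(x))  # prepend the bias feature
        predicted = 1 if score > 0 else -1
        if predicted != label:
            errors += 1
    return errors / len(y)

def logistic_reg(X, y, w_init, max_its, eta, tol=1e-3):
    """Batch gradient descent on the logistic-regression E_in.

    Returns (w, e_in, iterations). The iteration count is an extra
    return value added for this sketch.
    """
    n = len(y)
    rows = [[1.0] + list(x) for x in X]
    w = list(w_init)
    iterations = 0
    for _ in range(max_its):
        # Gradient: -(1/N) * sum_n y_n x_n / (1 + exp(y_n w.x_n))
        grad = [0.0] * len(w)
        for x, label in zip(rows, y):
            coeff = -label * _sigmoid(-label * _dot(w, x)) / n
            for j, xj in enumerate(x):
                grad[j] += coeff * xj
        # Stop once every gradient component is below the tolerance.
        if all(abs(g) < tol for g in grad):
            break
        w = [wi - eta * g for wi, g in zip(w, grad)]
        iterations += 1
    e_in = sum(_log1p_exp(-label * _dot(w, x)) for x, label in zip(rows, y)) / n
    return w, e_in, iterations
```

On a toy data set such as X = [[-2], [-1], [1], [2]] with y = [-1, -1, 1, 1], calling logistic_reg(X, y, [0.0, 0.0], 1000, 0.5) should drive E_in below its starting value of ln 2 and give zero classification error. The large step size there is chosen only for the toy example, not for the assignment data.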
• Implement the functions in the two files. Remember to check in the final version of
your code for these two files.
• Read more about the “Cleveland” dataset we’ll be using here:
https://archive.ics.uci.edu/ml/datasets/Heart+Disease
• Learn a logistic regression model on the data in cleveland.train (be careful: the classes
are labeled 0/1, so you should convert them to −1/+1 so that everything we’ve done in class
is still valid). Apply the model to classify the data in cleveland.test, using a probability
of 0.5 as the threshold. In your writeup, report E_in as well as the classification error on
both the training and test data when using three different bounds on the maximum number of
iterations: ten thousand, one hundred thousand, and one million. What can you say about the
generalization properties of the model?
• Now train and test a logistic regression model using the built-in MATLAB function glmfit
(learn about and use the “binomial” option, and check the label format). Compare the
results with the best ones you achieved, and also compare the time taken to achieve the
results.
• Now scale the features by subtracting the mean and dividing by the standard deviation
for each of the features in advance of calling the learning algorithm (you may find the
MATLAB function zscore useful). Experiment with the learning rate η (you may want to
start by trying different orders of magnitude), this time using a tolerance (how close to
zero you need each element of the gradient to be in order to terminate) of 10^-6. Report
the results in terms of number of iterations until the algorithm terminates, and also the
final E_in.
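The preprocessing steps mentioned in the bullets above (relabeling 0/1 classes as −1/+1 and standardizing each feature) can be sketched as follows. This is a plain-Python illustration, not the MATLAB code to be submitted. All function names are hypothetical. Two choices here are assumptions of the sketch rather than requirements of the handout: dividing by the sample standard deviation (normalizing by N − 1), and reusing the training-set means and standard deviations when scaling the test set.

```python
import math

def to_plus_minus_one(labels):
    """Map class labels 0/1 to -1/+1."""
    return [1 if label == 1 else -1 for label in labels]

def zscore_columns(X):
    """Standardize each column of X to zero mean and unit standard deviation.

    Uses the sample standard deviation (divide by N - 1). A constant
    column would give a zero standard deviation, so it is divided by 1
    instead (a guard added for this sketch).
    Returns (scaled, means, stds) so the same transform can be reused.
    """
    n = len(X)
    d = len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    stds = []
    for j in range(d):
        var = sum((row[j] - means[j]) ** 2 for row in X) / (n - 1)
        s = math.sqrt(var)
        stds.append(s if s > 0 else 1.0)
    return apply_scaling(X, means, stds), means, stds

def apply_scaling(X, means, stds):
    """Apply previously computed column means and standard deviations."""
    return [[(row[j] - means[j]) / stds[j] for j in range(len(means))]
            for row in X]
```

Two facts are useful when interpreting results. Thresholding the predicted probability at 0.5 is the same as thresholding the score w·x at 0, because the logistic function equals 0.5 at 0. And with standardized features, much larger learning rates than 10^-5 typically remain stable, which is why trying several orders of magnitude is informative.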
2. (15 points) LFD Problem 3.4
3. (10 points) LFD Problem 3.19
4. (10 points) LFD Problem 4.8
5. (15 points) LFD Problem 4.25, parts (a) through (c) only