COMS 4721: Machine Learning for Data Science Homework 1 solution

Problem 1 (written) – 25 points
Imagine you have a sequence of N observations $(x_1, \dots, x_N)$, where each $x_i \in \{0, 1, 2, \dots, \infty\}$. You
model this sequence as i.i.d. from a Poisson distribution with unknown parameter $\lambda \in \mathbb{R}_+$, where
$$p(X \mid \lambda) = \frac{\lambda^X}{X!} e^{-\lambda}.$$
(a) What is the joint likelihood of the data $(x_1, \dots, x_N)$?
(b) Derive the maximum likelihood estimate $\lambda_{ML}$ for $\lambda$.
To help learn $\lambda$, you use a prior distribution. You select the distribution $p(\lambda) = \mathrm{gamma}(a, b)$.
(c) Derive the maximum a posteriori (MAP) estimate $\lambda_{MAP}$ for $\lambda$.
(d) Use Bayes' rule to derive the posterior distribution of $\lambda$ and identify the name of this distribution.
(e) What are the mean and variance of $\lambda$ under this posterior? Discuss how they relate to $\lambda_{ML}$ and $\lambda_{MAP}$.
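A numerical sanity check can help when verifying the derivations in parts (b) and (c). The sketch below is not a derivation: it evaluates the Poisson log-likelihood and the log-posterior on a grid of λ values and reports the maximizers, which can then be compared against the closed-form estimates you derive. The counts `x`, the prior values `a = 2` and `b = 1`, and all helper names are made up for illustration. The code also assumes that gamma(a, b) means shape a and rate b, which should be checked against the parameterization used in the lectures.

```python
import math
import numpy as np

def poisson_log_likelihood(lam, x):
    """Joint log-likelihood of i.i.d. Poisson counts x at rate lam > 0."""
    x = np.asarray(x, dtype=float)
    log_factorials = np.array([math.lgamma(v + 1.0) for v in x])  # log(x_i!)
    return float(np.sum(x * np.log(lam) - lam - log_factorials))

def gamma_log_prior(lam, a, b):
    """Log density of gamma(a, b), ASSUMING shape a and rate b."""
    return a * math.log(b) - math.lgamma(a) + (a - 1.0) * math.log(lam) - b * lam

def grid_argmax(fn, grid):
    """Return the grid point at which fn is largest."""
    values = np.array([fn(lam) for lam in grid])
    return float(grid[int(np.argmax(values))])

x = np.array([2, 0, 3, 1, 4, 2, 2, 1])   # made-up counts, not course data
a, b = 2.0, 1.0                          # hypothetical prior parameters
grid = np.linspace(0.01, 10.0, 1000)     # candidate values of lambda

lam_ml_grid = grid_argmax(lambda lam: poisson_log_likelihood(lam, x), grid)
lam_map_grid = grid_argmax(
    lambda lam: poisson_log_likelihood(lam, x) + gamma_log_prior(lam, a, b), grid
)
print("grid maximizer of the likelihood:", lam_ml_grid)
print("grid maximizer of the posterior: ", lam_map_grid)
```

The grid maximizers are only accurate to the grid spacing (about 0.01 here), so they serve as a check on a derived formula rather than a replacement for it.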
Problem 2 (written) – 20 points
(a) You have data $(x_i, y_i)$ for $i = 1, \dots, n$, where $x \in \mathbb{R}^d$ and $y \in \mathbb{R}$. You model this as
$y_i \overset{iid}{\sim} N(x_i^T w, \sigma^2)$. You use the data you have to approximate $w$ with $w_{RR} = (\lambda I + X^T X)^{-1} X^T y$, where $X$
and $y$ are defined as in the lectures. Derive the results for $E[w_{RR}]$ and $V[w_{RR}]$ given in the slides.
(b) If $w_{RR}$ is the ridge regression solution and $w_{LS}$ is the least squares solution for the above problem,
derive an equation for writing $w_{RR}$ as a function of $w_{LS}$ and the singular values and right singular vectors
of feature matrix $X$. Recall that the singular value decomposition of $X$ is $X = U S V^T$.
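A derived expression for the mean and covariance of the ridge estimator can be checked empirically. The following sketch is a simulation, not a derivation: it draws many response vectors from the model in part (a) for one fixed design matrix, computes (λI + XᵀX)⁻¹Xᵀy for each draw, and reports the sample mean and sample covariance of the resulting estimates. The sizes, the value of λ, the noise level, and `w_true` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma, lam, trials = 50, 3, 1.0, 2.0, 20000  # illustrative values only

X = rng.normal(size=(n, d))            # fixed design matrix
w_true = np.array([1.0, -2.0, 0.5])    # hypothetical ground-truth weights

A = lam * np.eye(d) + X.T @ X          # matrix inverted by ridge regression

# Each column of Y is one simulated response vector y ~ N(X w, sigma^2 I).
noise = rng.normal(scale=sigma, size=(n, trials))
Y = (X @ w_true)[:, None] + noise      # shape (n, trials)

# Solve for the ridge estimate of every simulated data set at once.
W_rr = np.linalg.solve(A, X.T @ Y)     # shape (d, trials)

empirical_mean = W_rr.mean(axis=1)     # estimate of E[w_RR]
empirical_cov = np.cov(W_rr)           # estimate of V[w_RR]; rows are variables

print("empirical mean:", empirical_mean)
print("empirical covariance:\n", empirical_cov)
```

Plugging the same `X`, `lam`, `sigma`, and `w_true` into a derived formula should reproduce these numbers up to Monte Carlo error.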
Problem 3 (coding) – 30 points
In this problem you will analyze data using the linear regression techniques we have discussed. The goal
of the problem is to predict the miles per gallon a car will get using six quantities (features) about that
car. The zip file containing the data can be found on Courseworks.[1] The data is broken into training
and testing sets. Each row in both “X” files contains six features for a single car (plus a 1 in the 7th
dimension) and the same row in the corresponding “y” file contains the miles per gallon for that car.
Remember to submit all original source code with your homework. Put everything you are asked to show
below in the PDF file.
Part 1. Using the training data only, write code to solve the ridge regression problem
$$L = \lambda \|w\|^2 + \sum_{i=1}^{350} \|y_i - x_i^T w\|^2.$$
(a) For $\lambda = 0, 1, 2, 3, \dots, 5000$, solve for $w_{RR}$. (Notice that when $\lambda = 0$, $w_{RR} = w_{LS}$.) In one figure,
plot the 7 values in $w_{RR}$ as a function of $\mathrm{df}(\lambda)$. You will need to call a built-in SVD function to do
this, as discussed in the slides. Be sure to label your 7 curves by their dimension in x.[2]
(b) Two dimensions clearly stand out over the others. Which ones are they and what information can
we get from this?
(c) For $\lambda = 0, \dots, 50$, predict all 42 test cases. Plot the root mean squared error (RMSE)[3] on the test
set as a function of $\lambda$, not as a function of $\mathrm{df}(\lambda)$. What does this figure tell you when choosing $\lambda$
for this problem (and when choosing between ridge regression and least squares)?
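The computations behind Part 1 can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the course's reference implementation. It assumes the standard definition df(λ) = Σᵢ sᵢ² / (λ + sᵢ²), where the sᵢ are the singular values of X (confirm this against the slides). It also runs on synthetic stand-in data with the same shapes as the homework files, because the real file names are not given here. The function names `ridge_solve`, `degrees_of_freedom`, and `rmse` are hypothetical.

```python
import numpy as np

def ridge_solve(X, y, lam):
    """Minimize lam * ||w||^2 + ||y - X w||^2 via the regularized normal equations."""
    d = X.shape[1]
    return np.linalg.solve(lam * np.eye(d) + X.T @ X, X.T @ y)

def degrees_of_freedom(X, lam):
    """ASSUMED definition: df(lam) = sum_i s_i^2 / (lam + s_i^2)."""
    s = np.linalg.svd(X, compute_uv=False)   # singular values only
    return float(np.sum(s**2 / (lam + s**2)))

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length vectors."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic stand-in data: 6 features plus a constant 1 in the 7th column.
rng = np.random.default_rng(0)
X_train = np.hstack([rng.normal(size=(350, 6)), np.ones((350, 1))])
X_test = np.hstack([rng.normal(size=(42, 6)), np.ones((42, 1))])
w_true = rng.normal(size=7)
y_train = X_train @ w_true + rng.normal(scale=0.5, size=350)
y_test = X_test @ w_true + rng.normal(scale=0.5, size=42)

lambdas = np.arange(0, 51)
W = np.array([ridge_solve(X_train, y_train, lam) for lam in lambdas])   # (51, 7)
dfs = np.array([degrees_of_freedom(X_train, lam) for lam in lambdas])   # x-axis for (a)
test_rmse = np.array([rmse(y_test, X_test @ w) for w in W])             # y-axis for (c)

print("df at lambda = 0:", dfs[0])
print("lambda with lowest test RMSE:", lambdas[int(np.argmin(test_rmse))])
```

For part (a) the range of `lambdas` would extend to 5000, and each column of `W` plotted against `dfs` gives one of the 7 curves. Using `np.linalg.solve` instead of forming an explicit inverse is a design choice for numerical stability.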
Part 2. Modify your code to learn a pth-order polynomial regression model for p = 1, 2, 3. (You’ve
already done p = 1 above.) For this implementation use the method discussed in the slides. Also, be
sure to standardize each additional dimension of your data.
(d) In one figure, plot the test RMSE as a function of λ = 0, . . . , 100 for p = 1, 2, 3. Based on this
plot, which value of p should you choose and why? How does your assessment of the ideal value
of λ change for this problem?
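One way to build the polynomial design matrix is sketched below. Since the slides are not reproduced here, the construction is an assumption: for each power 2, ..., p the six non-constant features are raised element-wise to that power (no cross terms), and each new column is standardized using the mean and standard deviation of the training column, with the same statistics reused for the test set. The function name `polynomial_features` and the synthetic data are hypothetical.

```python
import numpy as np

def polynomial_features(X_train, X_test, p):
    """Append standardized element-wise powers 2..p of the non-constant columns.

    ASSUMPTIONS: the last column of X is the constant 1 and is left alone;
    no cross terms are added; standardization uses training statistics only.
    """
    base_train, base_test = X_train[:, :-1], X_test[:, :-1]
    train_blocks, test_blocks = [X_train], [X_test]
    for power in range(2, p + 1):
        new_train, new_test = base_train**power, base_test**power
        mean, std = new_train.mean(axis=0), new_train.std(axis=0)
        train_blocks.append((new_train - mean) / std)
        test_blocks.append((new_test - mean) / std)   # reuse training mean/std
    return np.hstack(train_blocks), np.hstack(test_blocks)

# Synthetic stand-in data with the homework's shapes (not the course data).
rng = np.random.default_rng(1)
X_train = np.hstack([rng.normal(size=(350, 6)), np.ones((350, 1))])
X_test = np.hstack([rng.normal(size=(42, 6)), np.ones((42, 1))])

for p in (1, 2, 3):
    Xp_train, Xp_test = polynomial_features(X_train, X_test, p)
    print(f"p = {p}: train {Xp_train.shape}, test {Xp_test.shape}")
```

The expanded matrices can be passed to the same ridge regression routine used for p = 1. A column with zero variance would cause a division by zero here, so real data should be checked for constant features first.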
[1] See https://archive.ics.uci.edu/ml/datasets/Auto+MPG for more details on this dataset. Since I have done some preprocessing, you must use the data provided with this homework.
[2] The dimensions correspond to: 1. cylinders, 2. displacement, 3. horsepower, 4. weight, 5. acceleration, 6. year made.
[3] $\mathrm{RMSE} = \sqrt{\frac{1}{42} \sum_{i=1}^{42} \left(y_i^{\mathrm{test}} - y_i^{\mathrm{pred}}\right)^2}$.