COMS W4721: Machine Learning for Data Science Homework 4 solution

$24.99

Original Work ?
Category: You will Instantly receive a download link for .ZIP solution file upon Payment

Description

5/5 - (5 votes)

Problem 1 (K-means) – 20 points
Implement the K-means algorithm discussed in class. Generate 500 observations from a mixture of three
Gaussians on R2 with mixing weights ⇡ = [0.2, 0.5, 0.3] and means µ and covariances ⌃,
µ1 =
 0
0

, ⌃1 =
 1 0
0 1
, µ2 =
 3
0

, ⌃2 =
 1 0
0 1
, µ3 =
 0
3

, ⌃3 =
 1 0
0 1
.
a) For K = 2, 3, 4, 5, plot the value of the K-means objective function per iteration for 20 iterations
(the algorithm may converge before that).
b) For K = 3, 5, plot the 500 data points and indicate the cluster of each for the final iteration by
marking it with a color or a symbol.
Problem 2 (Matrix factorization) – 40 points
In this problem, you will implement the MAP inference algorithm for the matrix completion problem
discussed in class. As a reminder, for users u 2 Rd and movies v 2 Rd, we have
ui ⇠ N(0, 1I), i = 1,…,N1, vj ⇠ N(0, 1I), j = 1,…,N2.
We are given an N1 ⇥ N2 matrix M with missing values. Given the set ⌦ = {(i, j) : Mij is measured},
for each (i, j) 2 ⌦ we model Mij ⇠ N(uT
i vj , 2).
You should run your code on the user-movie ratings dataset provided on Courseworks and the course
website. For your algorithm, set 2 = 0.25, d = 10 and = 1. Train the model on the larger training
set for 100 iterations. For each user-movie pair in the test set, predict the rating using the relevant dot
product. Note that the mean rating has been subtracted from the data and you do not need to round your
prediction. Since the equations are in the slides, there’s no need to re-derive it.
a) Run your code 10 times. For each run, initialize your ui and vj vectors as N(0, I) random vectors.
On a single plot, show the the log joint likelihood for iterations 2 to 100 for each run. In a table,
show in each row the final value of the training objective function next to the RMSE on the testing
set. Sort these rows according to decreasing value of the objective function.
b) For the run with the highest objective value, pick the movies “Star Wars” “My Fair Lady” and
“Goodfellas” and for each movie find the 10 closest movies according to Euclidean distance using
their respective locations vj . List the query movie, the ten nearest movies and their distances. A
mapping from index to movie is provided with the data.