## Description

Problem 1 (30 points). Consider the following movie rating matrix with five users.

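| User | LOTR | HPATPOZ | Snatch | LSATSB | The Gentlemen | The Hobbit |
|------|------|---------|--------|--------|---------------|------------|
| A    | 5    | ?       | 1      | 2      | 3             | 4          |
| B    | 5    | 4       | 2      | 2      | 2             | 5          |
| C    | 1    | 2       | 4      | ?      | 4             | 3          |
| D    | ?    | 2       | 3      | 5      | ?             | ?          |
| E    | ?    | 3       | 5      | 4      | 5             | 1          |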

1. Compute the user-user similarity for all 10 pairs of users and the item-item similarity for all 15 pairs of movies using Pearson's similarity. Ignore missing values when computing similarity, i.e., consider only the entries rated in common, which may form a lower-dimensional vector.

2. Let k = 3. Fill in the missing entries of the matrix above using k-NN user-user CF and Pearson's similarity. When predicting, use the following formula:

$$\hat{r}_{u,i} = \bar{r}_u + \frac{\sum_{j \in K_u} S(u, u_j)\,\left(r_{u_j,i} - \bar{r}_{u_j}\right)}{\sum_{j \in K_u} \left|S(u, u_j)\right|},$$

where $\bar{r}_u$ is the average rating of user $u$ (over the items they have actually rated) and $K_u$ is the set of top-k neighbours of $u$ who also rated item $i$. Note that user similarities are ranked by the absolute value of $S(u, u_j)$, since $S(u, u_j)$ can be negative for Pearson similarity. (If I always like the things you hate, then your rating is also very useful to me.)
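The two steps above can be sketched in plain Python. This is only an illustrative sketch, not a reference solution: the names `RATINGS`, `pearson`, `user_mean`, and `predict` are made up for this example, and it assumes that the means inside the Pearson correlation are taken over the co-rated entries only, that an undefined correlation (fewer than two co-rated entries, or zero variance) is treated as 0, and that ties in the neighbour ranking are broken by user order.

```python
import math
from itertools import combinations

# Rating matrix from the problem statement; None marks a missing rating.
MOVIES = ["LOTR", "HPATPOZ", "Snatch", "LSATSB", "The Gentlemen", "The Hobbit"]
RATINGS = {
    "A": [5, None, 1, 2, 3, 4],
    "B": [5, 4, 2, 2, 2, 5],
    "C": [1, 2, 4, None, 4, 3],
    "D": [None, 2, 3, 5, None, None],
    "E": [None, 3, 5, 4, 5, 1],
}


def pearson(x, y):
    """Pearson correlation restricted to entries present in both vectors.

    Assumption: means are computed over the co-rated entries only.
    Returns 0.0 when the correlation is undefined.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if len(pairs) < 2:
        return 0.0
    mx = sum(a for a, _ in pairs) / len(pairs)
    my = sum(b for _, b in pairs) / len(pairs)
    num = sum((a - mx) * (b - my) for a, b in pairs)
    sx = math.sqrt(sum((a - mx) ** 2 for a, _ in pairs))
    sy = math.sqrt(sum((b - my) ** 2 for _, b in pairs))
    return num / (sx * sy) if sx > 0 and sy > 0 else 0.0


def user_mean(user):
    """Average of a user's ratings over the items they actually rated."""
    rated = [r for r in RATINGS[user] if r is not None]
    return sum(rated) / len(rated)


def predict(user, item, k=3):
    """Predict a rating with the k-NN user-user formula given above."""
    # Only users who rated the item can be neighbours.
    candidates = [
        (pearson(RATINGS[user], RATINGS[v]), v)
        for v in RATINGS
        if v != user and RATINGS[v][item] is not None
    ]
    # Rank by |S(u, u_j)| and keep the top k (stable sort breaks ties).
    neighbours = sorted(candidates, key=lambda sv: abs(sv[0]), reverse=True)[:k]
    den = sum(abs(s) for s, _ in neighbours)
    if den == 0:
        return user_mean(user)  # assumption: fall back to the user's mean
    num = sum(s * (RATINGS[v][item] - user_mean(v)) for s, v in neighbours)
    return user_mean(user) + num / den


# All 10 user pairs, and all 15 movie pairs (columns of the matrix).
user_sims = {(u, v): pearson(RATINGS[u], RATINGS[v])
             for u, v in combinations(RATINGS, 2)}
columns = [[RATINGS[u][j] for u in RATINGS] for j in range(len(MOVIES))]
item_sims = {(MOVIES[i], MOVIES[j]): pearson(columns[i], columns[j])
             for i, j in combinations(range(len(MOVIES)), 2)}
```

Under these assumptions, `predict("A", MOVIES.index("HPATPOZ"))` would estimate A's missing rating. A different convention for the means or for undefined correlations would change the numbers, so it is worth stating which one is used.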

Problem 2 (30 points). Consider a Markov chain with three states: Overcast, Rain, and Sunny. The transition probabilities are given in the following table. The (i, j)th entry of the matrix is the probability that the next day is j given that today is i. The weather on November 29, 2020 is Rain.


| Today \ Next day | O   | S   | R   |
|------------------|-----|-----|-----|
| O                | 1/3 | 1/3 | 1/3 |
| S                | 1/4 | 1/2 | 1/4 |
| R                | 1/4 | 1/4 | 1/2 |

1. Draw the state transition diagram with arrows annotating the transition probabilities.

2. What is the probability that it will be Sunny on November 30th, 2020?

3. What is the probability that it will Rain on December 2nd, 2020?

4. What is the probability that it will Rain every day through December 5, 2020 (inclusive)?

5. Compute the probability that it will Rain on December 6, 2020.
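The kind of computation these questions call for can be sketched as follows. This is a minimal sketch using exact fractions. The names `STATES`, `P`, `step`, `distribution_after`, and `prob_of_path` are illustrative and not part of the assignment, and the state order O, S, R is assumed to match the table's rows and columns.

```python
from fractions import Fraction as F

# States in the same order as the table: Overcast, Sunny, Rain.
STATES = ["O", "S", "R"]

# P[i][j] = probability that the next day is STATES[j] given today is STATES[i].
P = [
    [F(1, 3), F(1, 3), F(1, 3)],
    [F(1, 4), F(1, 2), F(1, 4)],
    [F(1, 4), F(1, 4), F(1, 2)],
]


def step(dist):
    """Advance a distribution by one day: new[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def distribution_after(start, days):
    """Distribution over STATES `days` days after a day known to be `start`."""
    dist = [F(int(s == start)) for s in STATES]
    for _ in range(days):
        dist = step(dist)
    return dist


def prob_of_path(path):
    """Probability of one specific sequence of states, given its first state."""
    prob = F(1)
    for today, tomorrow in zip(path, path[1:]):
        prob *= P[STATES.index(today)][STATES.index(tomorrow)]
    return prob
```

Here `distribution_after("R", n)` gives the weather distribution n days after November 29, which is the pattern for a question about a single future day. `prob_of_path` multiplies one-step probabilities along a fixed sequence, which is the pattern for a question about the weather on every day in a range.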

Problem 3 (30 points). See attached Jupyter Notebooks for details.
