## Description

## Problem 1 Hidden Markov Models

In this problem, you are given the parameters of a small hidden Markov model and you will implement three inference procedures discussed in class. In hmm_model.json, you will find the following model parameters (note the slight notation difference from class):

• π: the initial probabilities, πi = P(Z1 = si);

• A: the transition probabilities, with aij = P(Zt = sj | Zt−1 = si);

• B: the observation probabilities, with bik = P(Xt = ok | Zt = si).
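As a quick sanity check on the parameter shapes, here is a minimal sketch of loading such a model with numpy. The JSON key names ("pi", "A", "B") are assumptions for illustration — check the actual schema of hmm_model.json:

```python
import json
import numpy as np

# A toy model in the same shape as hmm_model.json; the key names here
# ("pi", "A", "B") are hypothetical -- the real file may use different keys.
model_json = """
{
  "pi": [0.6, 0.4],
  "A": [[0.7, 0.3], [0.4, 0.6]],
  "B": [[0.9, 0.1], [0.2, 0.8]]
}
"""
model = json.loads(model_json)
pi = np.array(model["pi"])   # shape (S,):    pi_i = P(Z1 = s_i)
A = np.array(model["A"])     # shape (S, S):  a_ij = P(Zt = s_j | Zt-1 = s_i)
B = np.array(model["B"])     # shape (S, K):  b_ik = P(Xt = o_k | Zt = s_i)

# Each probability distribution should sum to 1.
assert np.isclose(pi.sum(), 1.0)
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
```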

Now we observe a sequence O of length N. Your task is to write code to compute probabilities and infer the possible hidden states given this observation. In particular, in 1.1 you need to implement the forward and backward procedures; in 1.2, you need to compute P(X1:N = O), the probability of observing the sequence, using the forward or backward messages; finally, in 1.3, you need to infer the most likely state path using the Viterbi algorithm. Note that your code may be tested against different parameter sets/observation sequences when graded.

1.1 Please finish the implementation of the functions forward() and backward() in hmm.py:

• forward(π, A, B, O) takes the model parameters π, A, B and the observation sequence O as input and outputs a numpy array α, where α[j, t − 1] = P(Zt = sj , X1:t = x1:t). Note that this is α[j, t − 1] instead of α[j, t] only because array indices start from 0.

• backward(π, A, B, O) takes the model parameters π, A, B and the observation sequence O as input and outputs a numpy array β, where β[j, t − 1] = P(Xt+1:N = xt+1:N | Zt = sj). Note the same index difference from the lecture slides.
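The two recursions above can be sketched as follows. This is a minimal, unscaled version (no log-space or normalization tricks), and the exact signatures in the hmm.py starter code may differ:

```python
import numpy as np

def forward(pi, A, B, O):
    """alpha[j, t] = P(Z_{t+1} = s_j, X_{1:t+1} = o_{1:t+1}) (0-indexed t)."""
    S, N = len(pi), len(O)
    alpha = np.zeros((S, N))
    alpha[:, 0] = pi * B[:, O[0]]                    # base case
    for t in range(1, N):
        # alpha_t(j) = b_j(o_t) * sum_i alpha_{t-1}(i) * a_ij
        alpha[:, t] = B[:, O[t]] * (A.T @ alpha[:, t - 1])
    return alpha

def backward(pi, A, B, O):
    """beta[j, t] = P(X_{t+2:N} = o_{t+2:N} | Z_{t+1} = s_j) (0-indexed t)."""
    S, N = len(pi), len(O)
    beta = np.zeros((S, N))
    beta[:, N - 1] = 1.0                             # base case
    for t in range(N - 2, -1, -1):
        # beta_t(i) = sum_j a_ij * b_j(o_{t+1}) * beta_{t+1}(j)
        beta[:, t] = A @ (B[:, O[t + 1]] * beta[:, t + 1])
    return beta
```

For long sequences these raw probabilities underflow; the usual remedy is to normalize each column (or work in log space), which this sketch omits.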

1.2 Now we can calculate P(X1:N = O) from the output of your forward and backward algorithms. Please finish the implementation of the functions seqprob_forward() and seqprob_backward() in hmm.py. Both should return P(X1:N = O). Note that in seqprob_forward() the input is only the forward messages, while in seqprob_backward() you also have access to the model parameters. Please refer to Slide 29/60 of Lec 10 for the concrete formulas.
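Both quantities reduce to one-liners over the messages. A sketch, with argument lists assumed (the starter code's signatures may differ):

```python
def seqprob_forward(alpha):
    # P(X_{1:N} = O) = sum_j alpha[j, N-1]: marginalize the last forward message.
    return alpha[:, -1].sum()

def seqprob_backward(beta, pi, B, O):
    # P(X_{1:N} = O) = sum_j pi_j * b_j(o_1) * beta[j, 0]:
    # fold the initial distribution and first emission into the first backward message.
    return (pi * B[:, O[0]] * beta[:, 0]).sum()
```

If both implementations are correct, the two functions agree on the same model and observation sequence, which makes a handy self-test.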

1.3 Next you need to find the most likely state path based on the observation O, namely

z∗1:N = argmax z1:N P(Z1:N = z1:N | X1:N = O).

Please implement the Viterbi algorithm viterbi() in hmm.py. The function viterbi(π, A, B, O) takes the model parameters π, A, B and the observation sequence O as inputs and outputs a list path which contains the most likely state path z∗1:N (in terms of the state indices) you have found.
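The Viterbi recursion mirrors the forward pass but replaces the sum with a max and keeps backpointers. A minimal sketch (again without the log-space scaling you would want for long sequences):

```python
import numpy as np

def viterbi(pi, A, B, O):
    """Return the most likely state path (list of state indices) for observations O."""
    S, N = len(pi), len(O)
    delta = np.zeros((S, N))            # delta[j, t]: best path score ending in s_j at t
    psi = np.zeros((S, N), dtype=int)   # psi[j, t]: best predecessor of s_j at t
    delta[:, 0] = pi * B[:, O[0]]
    for t in range(1, N):
        scores = delta[:, t - 1, None] * A      # scores[i, j] = delta[i, t-1] * a_ij
        psi[:, t] = scores.argmax(axis=0)       # best previous state for each j
        delta[:, t] = scores.max(axis=0) * B[:, O[t]]
    # Backtrack from the best final state.
    path = [int(delta[:, -1].argmax())]
    for t in range(N - 1, 0, -1):
        path.append(int(psi[path[-1], t]))
    return path[::-1]
```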

What to do and submit: After you finish each task in 1.1, 1.2 and 1.3 in hmm.py, run the script hmm.sh. It will generate hmm.txt. Add, commit, and push both hmm.py and hmm.txt before the due date.

## Problem 2 Principal Component Analysis

In this question we will implement Principal Component Analysis (PCA) and use it for image compression. We will use black-and-white images in this example for illustrative purposes. (However, PCA can also be used to compress color images; for that, you perform compression on each channel separately and then combine them.)


Figure 1: Eigenvectors together with their eigenvalues

Figure 2: Mean image

Let X be an N × D data matrix, where N is the number of images and D is the number of pixels of each image. For a given parameter K < D, you need to use PCA to compress X into an N × K matrix. As discussed in class, this is done by first centering the data, then finding the top K eigenvectors of the covariance matrix XᵀX, and finally projecting the original data onto these eigenvectors.

Since each of these eigenvectors has dimensionality D, it can also be represented as an image of the same size as those in the dataset. For example, in Figure 1 we show the top eigenvectors corresponding to all images of digit 6 in MNIST, together with their eigenvalues. Figure 2 shows the mean of the MNIST images corresponding to digit 6.

Figure 3: Original image (on the right) and decompressed versions with different compression rates.


### Compression

Please implement PCA in the function pca of pca.py, which returns the compressed representation Y ∈ R^(N×K) and a matrix M ∈ R^(D×K) that consists of the top K eigenvectors, represented as column vectors. Note that we have subtracted the mean values from the data before passing it to pca, so you do not need to do this step.
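The steps above can be sketched with numpy's symmetric eigendecomposition. Note that np.linalg.eigh returns eigenvalues in ascending order, so the top K eigenvectors are the last K columns; the exact interface expected by pca.py may differ:

```python
import numpy as np

def pca(X, K):
    """X: centered N x D data matrix. Returns (Y, M) with Y = X @ M,
    where M is the D x K matrix of top-K eigenvectors of X^T X as columns."""
    # eigh is preferred over eig for symmetric matrices: real output, sorted order.
    vals, vecs = np.linalg.eigh(X.T @ X)
    M = vecs[:, ::-1][:, :K]   # reverse to descending eigenvalue order, keep top K
    Y = X @ M                  # N x K compressed representation
    return Y, M
```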

### Decompression

With the eigenvector matrix M, we can approximately reconstruct the original dataset as X̂ = YMᵀ. Implement this step in the function decompress of pca.py.
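Since M has orthonormal columns, decompression is a single matrix product. A sketch (the starter code's signature is assumed):

```python
def decompress(Y, M):
    # X_hat = Y M^T; if the mean was subtracted before pca,
    # it should be added back outside this function to get pixel values.
    return Y @ M.T
```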

In Figure 3 we show several decompressed images with different compression rates (K) alongside the original image. It is easy to see that larger K leads to better reconstruction. To quantify the reconstruction error between the original image x and the reconstructed one x̂, we calculate the pixel-wise mean squared error (1/D)‖x − x̂‖₂². Implement this step in the function reconstruction_error of pca.py.
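The error formula above is just the mean of squared pixel differences, which numpy expresses directly:

```python
import numpy as np

def reconstruction_error(x, x_hat):
    # Pixel-wise mean squared error: (1/D) * ||x - x_hat||_2^2,
    # where D is the number of pixels.
    x, x_hat = np.asarray(x), np.asarray(x_hat)
    return float(np.mean((x - x_hat) ** 2))
```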

What to do and submit: After implementation, run python3 pca.py to generate pca_output.txt. Submit both pca.py and pca_output.txt.
