## Description

Problem 1

For PCA, from the perspective of maximizing variance, please show that the solution of φ to

maximize kXφk

2

2

, s.t. kφk2 = 1 is exactly the first column of U, where [U, S] = svd(XT X).

(Note:

you need prove why it is optimal than any other reasonable combinations of Ui

, say φˆ = 0.8 ∗ U(:

, 1) + 0.6 ∗ U(:, 2) which also satisfies kφˆk2 = 1.)

Problem 2

Why might we prefer to minimize the sum of absolute residuals instead of the residual sum of

squares for some data sets? Recall clustering method K-means when calculating the centroid, it is

to take the mean value of the data-points belonging to the same cluster, so what about K-medians?

What is its advantage over of K-means? Please use a synthetic (toy) experiment to illustrate your

conclusion.

Problem 3

Let’s revisit Least Squares Problem: minimize

β

1

2

ky − Aβk

2

2

, where A ∈ R

n×p

.

1. Please show that if p > n, then vanilla solution (AT A)

−1AT y is not applicable any more.

2. Let’s assume A = [1, 2, 4; 1, 3, 5; 1, 7, 7; 1, 8, 9], y = [1; 2; 3; 4]. Please show via experiment results that Gradient Descent method will obtain the optimal solution with Linear Convergence

rate if the learning rate is fixed to be 1

σmax(AT A)

, and β0 = [0; 0; 0].

3. Now let’s consider ridge regression: minimize

β

1

2

ky − Aβk

2

2 +

λ

2

kβk

2

2

, where A, y, β0 remains the same as above while learning rate is fixed to be 1

λ+σmax(AT A)

where λ varies from

0.1, 1, 10, 100, 200, please show that Gradient Descent method with larger λ converges faster.

Problem 4

Please download the image from https://en.wikipedia.org/wiki/Lenna#/media/File:Lenna_

(test_image).png with dimension 512 × 512 × 3. Assume for each RGB channel data X, we have

[U, Σ, V ] = svd(X). Please show each compression ratio and reconstruction image if we choose first

2, 5, 20, 50, 80, 100 components respectively.

Also please determine the best component number to

obtain a good trade-off between data compression ratio and reconstruction image quality. (Open

question, that is your solution will be accepted as long as it’s reasonable.)