## Description

INSTRUCTIONS: This homework contains two parts: programming (I) and theory (II). Please submit your homework via your Bitbucket repository. Your submission should consist of

1. a file called hw1 writeup.pdf providing the solutions to the theory questions;

2. completed code in hw1a.py and hw1b.py;

3. an output folder containing png files created by executing hw1a.py and hw1b.py.

Part I: Programming

For this homework, you will be using the JAFFE dataset.

Clone the repository created for you via Bitbucket. Message the instructors via Piazza if a repository for Homework #1 was not created for you by the end of Tuesday, Feb. 2, 2016. The repository contains a script download.sh that will download and unzip the dataset.

A public AMI E6040 hw1 ami has been created with all necessary packages and patches preinstalled for solving this assignment.

If you would like to use your own Ubuntu machine/AMI, the following instructions might be helpful:

• Install libjpeg by executing sudo apt-get install libjpeg-dev

• Install libx11 by executing sudo apt-get install libx11-dev

• Install PIL in your theano environment by executing pip install Pillow

• Apply changes from this PR to your theano installation.

An alternative to the last step above is downgrading numpy by running the following commands in your conda environment:

1. conda uninstall numpy

2. conda uninstall scipy

3. conda uninstall matplotlib

4. conda install numpy==1.9.3

5. conda install scipy==0.15.1

6. conda install matplotlib==1.4.2

7. pip uninstall theano

8. pip install theano

If you will be executing the code on a different OS, please search online for how to perform the above steps for your configuration.

Use single precision for both problems in this assignment.

PROBLEM a: (35 points)

In this problem, you will divide the images into blocks of sizes (16,16), (32,32), and (64,64) and perform principal component analysis in each case. For each case, you will visualize reconstructions using different numbers of principal components and also visualize the top components.

Skeleton code for this problem has been provided in hw1a.py in the repository.
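As a rough illustration of the block-wise pipeline described above, here is a minimal NumPy sketch. It is not the provided skeleton: the function names are hypothetical, plain reshaping stands in for theano.tensor.nnet.neighbours, a random array stands in for a JAFFE image, and mean-centering the blocks before the eigendecomposition is an assumption the handout does not state.

```python
import numpy as np

def image_to_blocks(img, block):
    """Split a 2-D image into non-overlapping (block, block) patches, one flattened patch per row."""
    h, w = img.shape
    assert h % block == 0 and w % block == 0
    patches = img.reshape(h // block, block, w // block, block).swapaxes(1, 2)
    return patches.reshape(-1, block * block)

def blocks_to_image(X, shape, block):
    """Inverse of image_to_blocks."""
    h, w = shape
    patches = X.reshape(h // block, w // block, block, block).swapaxes(1, 2)
    return patches.reshape(h, w)

def block_pca(X):
    """Return the block mean and the eigenvectors (as columns) sorted by decreasing eigenvalue."""
    mean = X.mean(axis=0)           # assumption: center the data before PCA
    Xc = X - mean
    evals, evecs = np.linalg.eigh(Xc.T @ Xc)
    order = np.argsort(evals)[::-1]
    return mean, evecs[:, order]

def reconstruct(X, mean, D, k):
    """Project the centered blocks onto the top-k components and map back."""
    Dk = D[:, :k]
    return (X - mean) @ Dk @ Dk.T + mean

rng = np.random.RandomState(0)
img = rng.rand(256, 256).astype(np.float32)   # stand-in for one 256x256 image
X = image_to_blocks(img, 16)                  # 256 blocks, each with 256 pixels
mean, D = block_pca(X)
approx = blocks_to_image(reconstruct(X, mean, D, 10), img.shape, 16)
```

Keeping the input in float32 keeps the whole computation in single precision, in line with the requirement stated earlier.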

Some useful links:

1. PIL documentation

2. PIL convert documentation

3. theano.tensor.nnet.neighbours documentation

4. numpy.linalg.eigh documentation

PROBLEM b: (35 points)

In this problem, you will essentially perform the same tasks as in PROBLEM a, but on the whole image instead of blocks. Since the images are of size 256×256, the matrix $X^T X$ will be of size 65,536×65,536. You most likely won't be able to even load this matrix (unless you have an enormous amount of RAM available), let alone perform eigenanalysis on it. Hence, you will solve this problem by using gradient descent.
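To make the memory claim concrete, the following back-of-the-envelope calculation estimates the storage needed for the full 65,536×65,536 matrix in single precision. The variable names are illustrative only.

```python
# Size of the Gram matrix for whole 256x256 images, stored as float32.
n_pixels = 256 * 256                  # 65,536 features per image
bytes_per_float32 = 4
gram_bytes = n_pixels ** 2 * bytes_per_float32
gram_gib = gram_bytes / 2.0 ** 30     # convert bytes to GiB
print(n_pixels, gram_gib)
```

This comes to 16 GiB for the matrix alone, before any workspace an eigensolver would need.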

Recall from class that the top principal component can be extracted by solving the following optimization problem:

$$\arg\min_{d} \; -d^T X^T X d \quad \text{subject to } d^T d = 1.$$

The above can be solved by using gradient descent with the cost function $f(d) = -d^T X^T X d$ while normalizing $d$ after each update (descent). Other principal components can be found by “taking out the contribution of the already determined components”. This can be done as in the following pseudocode.

Algorithm 1: Multiple principal components via gradient descent

Input: Data matrix $X$, number of components to extract $N$, learning rate $\eta$, max steps $T$, stopping condition

Returns: Principal components $d_i$ for $i = 0, \dots, N$

1: for $i = 0, \dots, N$ do

2: $A_i \leftarrow X^T X - \sum_{j=0}^{i-1} \lambda_j d_j d_j^T$

3: Initialize $d_i$ randomly and let $t = 1$

4: while ($t \le T$ and stopping condition is not true) do

5: $y \leftarrow d_i - \eta \nabla_{d_i} \left(-d_i^T A_i d_i\right)$

6: $d_i \leftarrow y / \|y\|$

7: $t \leftarrow t + 1$

8: $\lambda_i \leftarrow d_i^T X^T X d_i$

Lines 2 to 8 form the body of the for loop, and lines 5 to 7 form the body of the while loop.

Gradient descent can be performed very easily in theano as it supports symbolic differentiation. Please note that there shouldn't be a need to compute large matrices of order 65,536×65,536 at any point. You should write your theano expressions and functions such that they avoid computing the large matrices. Choose an appropriate learning rate and an appropriate stopping criterion (for example, when the change in the cost is below some small $\epsilon$ or the change in $\|d_i\|$ is below some small $\epsilon$). Skeleton code for this part is available in hw1b.py.
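One way to follow Algorithm 1 without ever forming a 65,536×65,536 matrix is to evaluate the product $A_i d$ as $X^T (X d) - \sum_j \lambda_j d_j (d_j^T d)$. The sketch below does this in plain NumPy with the hand-derived gradient $-2 A_i d$ (valid because $A_i$ is symmetric) in place of theano's symbolic differentiation. The function name, default learning rate, and relative-cost stopping rule are assumptions for illustration and are not taken from hw1b.py.

```python
import numpy as np

def pca_gradient_descent(X, n_components, lr=0.1, max_steps=500, tol=1e-6, seed=0):
    """Extract principal components one at a time by projected gradient descent."""
    rng = np.random.RandomState(seed)
    n_features = X.shape[1]
    D = np.zeros((n_components, n_features), dtype=X.dtype)   # rows are the d_i
    lams = np.zeros(n_components, dtype=X.dtype)               # the lambda_i
    for i in range(n_components):
        d = rng.randn(n_features).astype(X.dtype)
        d /= np.linalg.norm(d)
        prev_cost = np.inf
        for t in range(max_steps):
            # A_i d computed from matrix-vector products only
            Ad = X.T @ (X @ d) - D[:i].T @ (lams[:i] * (D[:i] @ d))
            cost = -d @ Ad                    # f(d) = -d^T A_i d
            grad = -2 * Ad                    # gradient of f at d
            y = d - lr * grad                 # descent step
            d = y / np.linalg.norm(y)         # renormalize to unit length
            # assumed stopping rule: relative change in cost below tol
            if abs(prev_cost - cost) <= tol * abs(cost):
                break
            prev_cost = cost
        D[i] = d
        lams[i] = d @ (X.T @ (X @ d))         # lambda_i = d_i^T X^T X d_i
    return D, lams

# Small synthetic example in single precision.
rng = np.random.RandomState(1)
scales = np.array([10.0, 5.0, 2.5, 1.0, 1.0, 1.0, 1.0, 1.0])
X_demo = (rng.randn(300, 8) * scales).astype(np.float32)
D_demo, lam_demo = pca_gradient_descent(X_demo, 3)
```

On a problem this small the result can be compared against numpy.linalg.eigh applied to $X^T X$; for whole images only the matrix-vector form is practical.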

Some useful links:

1. Theano Logistic Regression Example


Part II: Theory

PROBLEM c (15 points)

(i)

$$p_x(x) = \begin{cases} 1 & \text{if } 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \qquad y = -\frac{1}{\lambda} \ln(x)$$

Find $p_y(y)$.

(ii)

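$$p(\mathrm{x} = x, \mathrm{y} = y) = \begin{cases} 3(xy^2 + yx^2) & \text{for } x, y \in [0,1] \\ 0 & \text{otherwise} \end{cases}$$

Find $p(\mathrm{x} = x)$, $p(\mathrm{y} = y)$, $E(\mathrm{x})$, $E(\mathrm{y})$, $E(\mathrm{xy})$. Are $\mathrm{x}$ and $\mathrm{y}$ independent?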
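Before working with the joint density in (ii), it can be reassuring to confirm numerically that it integrates to one over the unit square. The snippet below is a minimal sketch using a midpoint-rule grid; the function name and grid size are arbitrary choices, and the same pattern could be reused to sanity-check hand-derived marginals or expectations.

```python
import numpy as np

def joint_density(x, y):
    """Joint density from part (ii) on the unit square."""
    return 3.0 * (x * y ** 2 + y * x ** 2)

n = 1000
mid = (np.arange(n) + 0.5) / n                     # midpoints of n equal cells in [0, 1]
xx, yy = np.meshgrid(mid, mid, indexing="ij")
# The unit square has area 1, so the grid average approximates the integral.
total = joint_density(xx, yy).mean()
```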

PROBLEM d (15 points)

Let us assume that we model certain data $X = \{x^{(1)}, \dots, x^{(m)}\}$, $x^{(i)} \in \mathbb{R}^n$, as having been drawn (independently) from a multivariate Gaussian distribution $\mathcal{N}(\mu, \Sigma)$.

(i) Find maximum likelihood estimators for $\mu$ and $\Sigma$;

(ii) Are the estimators biased or unbiased?

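Note: The multivariate normal distribution is given by

$$\mathcal{N}(\mu, \Sigma) \sim \frac{1}{(2\pi)^{n/2}} \frac{1}{|\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu)\right),$$

where $\mu$ is the $n$-dimensional mean vector and $\Sigma$ is the $n \times n$ covariance matrix.

Some useful links: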

1. Matrix Cookbook
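For checking hand calculations, the multivariate normal density from the note in PROBLEM d can be evaluated numerically. The following is a minimal NumPy sketch with a hypothetical function name; it works in log space and uses numpy.linalg.slogdet and numpy.linalg.solve rather than forming $\Sigma^{-1}$ explicitly, which is a design choice for numerical stability and not something the handout prescribes.

```python
import numpy as np

def gaussian_density(x, mu, sigma):
    """Evaluate the multivariate normal density at a single point x."""
    n = mu.shape[0]
    diff = x - mu
    _, logdet = np.linalg.slogdet(sigma)           # log |Sigma|
    quad = diff @ np.linalg.solve(sigma, diff)     # (x - mu)^T Sigma^{-1} (x - mu)
    log_p = -0.5 * (n * np.log(2 * np.pi) + logdet + quad)
    return np.exp(log_p)

# Standard normal in one dimension, evaluated at the mean.
p0 = gaussian_density(np.array([0.0]), np.array([0.0]), np.array([[1.0]]))
```

At the mean of a one-dimensional standard normal this returns $1/\sqrt{2\pi}$, which gives a quick check that the constants are right.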

NEED HELP:

If you have any questions, you are advised to use the Piazza forum, which is accessible through courseworks.