## Description

Problem 1 Assume we collected a dataset $D = \{(X_i, Y_i)\}_{i \in 1..7}$ of $N = 7$ points (i.e., observations) with inputs $X = (1, 2, 3, 4, 5, 6, 7)$ and outputs $Y = (6, 4, 2, 1, 3, 6, 10)$ for a regression problem.

1. (0 points) Draw a scatter plot of the dataset using spreadsheet software (e.g., Excel).

2. (6 points) Let us use a linear regression model $g_{w,b}(x) = wx + b$ to model this data. Write down the analytical expression of the mean squared loss of this model on dataset $D$. Your loss should take the form of

   $$\frac{1}{2N} \sum_{i \in 1..N} A w^2 + B b^2 + C w b + D w + E b + F$$

   where $A$, $B$, $C$, $D$, $E$, and $F$ are expressed only as a function of $X_i$ and $Y_i$ and constants. Do not fill in any numerical values yet.

3. (3 points) Derive the analytical expressions of $w$ and $b$ by minimizing the mean squared loss from the previous question. Your expressions for parameters $w$ and $b$ should only depend on $A$, $B$, $C$, $D$, and $E$. Do not fill in any numerical values yet.

4. (1 point) Give approximate numerical values for $w$ and $b$ by plugging in numerical values from the dataset $D$.

5. (0 points) Double-check your solution with the scatter plot from question 1: e.g., you can use Excel to find numerical values of $w$ and $b$.
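The same cross-check can be done outside a spreadsheet. The following is a minimal sketch, assuming NumPy is available; the helper name `mean_squared_loss` is introduced here for illustration and is not part of the assignment. It fits a degree-1 polynomial by least squares and evaluates the loss with the $\frac{1}{2N}$ convention used above.

```python
import numpy as np

# Dataset from Problem 1.
X = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
Y = np.array([6, 4, 2, 1, 3, 6, 10], dtype=float)


def mean_squared_loss(w, b, X, Y):
    """Mean squared loss with the 1/(2N) convention (illustrative helper)."""
    residuals = w * X + b - Y
    return np.sum(residuals ** 2) / (2 * len(X))


# np.polyfit with deg=1 returns the least-squares coefficients,
# highest degree first: slope w, then intercept b.
w, b = np.polyfit(X, Y, deg=1)

print(f"w = {w:.4f}, b = {b:.4f}, loss = {mean_squared_loss(w, b, X, Y):.4f}")
```

The printed values can be compared against the hand-derived expressions from question 3 and against the trendline reported by the spreadsheet.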

Problem 2 Let us now assume that $D$ is a dataset with $d$ features per input and $N > 0$ inputs. We have $D = \{((X_{ij})_{j \in 1..d}, Y_i)\}_{i \in 1..N}$. In other words, each $X_i$ is a column vector with $d$ components indexed by $j$ such that $X_{ij}$ is the $j$th component of $X_i$. The output $Y_i$ remains a scalar (real value).

Let us assume for simplicity that $b = 0$ so we have a simplified linear regression model:

$$g_{\vec{w}}(X) = \vec{w} X$$

where $\vec{w}$ is now a vector of dimensionality $d$. Each component of $\vec{w}$ multiplies the corresponding feature of $X$, which gives the following: $g_{\vec{w}}(X_i) = \sum_{j \in 1..d} w_j X_{ij}$.

ECE1513H – Winter 2020 Assignment 1 – Due Jan 20th

We would like to train a regularized linear regression model, where the mean squared loss is augmented with an $\ell_2$ regularization penalty $\frac{1}{2} \|\vec{w}\|_2^2$ on the weight parameter $\vec{w}$:

$$L(\vec{w}, D) = \frac{1}{2N} \sum_{i \in 1..N} (Y_i - g_{\vec{w}}(X_i))^2 + \frac{\lambda}{2} \|\vec{w}\|_2^2$$

where $\lambda > 0$ is a hyperparameter that controls how much importance is given to the penalty.
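To make the two terms of this loss concrete, here is a minimal NumPy sketch. The function name `regularized_loss`, the row-wise layout of the data matrix (row $i$ holds $X_i$), and the small demo arrays are assumptions made for this illustration, not part of the assignment.

```python
import numpy as np


def regularized_loss(w, X, Y, lam):
    """L(w, D) = 1/(2N) * sum_i (Y_i - w . X_i)^2 + lam/2 * ||w||_2^2.

    X has shape (N, d) with row i holding X_i (layout chosen for this sketch),
    Y has shape (N,), and w has shape (d,).
    """
    N = X.shape[0]
    residuals = Y - X @ w
    data_term = residuals @ residuals / (2 * N)
    penalty = 0.5 * lam * (w @ w)
    return data_term + penalty


# Made-up example: N = 3 inputs with d = 2 features each.
X_demo = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Y_demo = np.array([1.0, 2.0, 3.0])
w_demo = np.array([1.0, 2.0])

# w_demo fits the demo data exactly, so only the penalty term contributes.
print(regularized_loss(w_demo, X_demo, Y_demo, lam=0.1))
```

Because the demo weights reproduce the outputs exactly, the printed value isolates the penalty term, which shows how $\lambda$ trades data fit against the size of $\vec{w}$.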

1. (2 points) Let $A = \sum_{i \in 1..N} X_i X_i^\top$. Give a simple analytical expression for the components of $A$.

2. (3 points) Let us write $B = \sum_{i \in 1..N} Y_i X_i$. Prove that the following holds:

   $$\nabla L(\vec{w}, D) = \frac{1}{N} (A \vec{w} - B) + \lambda \vec{w}$$

3. (1 point) Write down the matrix equation that $\vec{w}^*$ should satisfy, where:

   $$\vec{w}^* = \arg\min_{\vec{w}} L(\vec{w}, D)$$

   Your equation should only involve $A$, $B$, $\lambda$, $N$, and $\vec{w}^*$.

4. (2 points) Prove that all eigenvalues of $A$ are non-negative (i.e., positive or zero).

5. (2 points) Demonstrate that matrix $A + \lambda N I_d$ is invertible by proving that none of its eigenvalues are zero. Here, $I_d$ is the identity matrix of dimension $d$.

6. (2 points) Using the invertibility of matrix $A + \lambda N I_d$, solve the equation stated in question 3 and deduce an analytical solution for $\vec{w}^*$. You've obtained a linear regression model regularized with an $\ell_2$ penalty.
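The gradient identity from question 2 and the eigenvalue claims from questions 4 and 5 can be sanity-checked numerically before attempting the proofs. The sketch below is only an illustration under stated assumptions: the random data, the sizes `N` and `d`, the value of `lam`, and the row-wise data layout (row $i$ holds $X_i$) are all hypothetical choices, and a numerical check is not a substitute for the requested derivations.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, lam = 5, 3, 0.1          # illustrative sizes and penalty weight
X = rng.normal(size=(N, d))    # row i holds X_i (layout chosen for this sketch)
Y = rng.normal(size=N)
w = rng.normal(size=d)

A = X.T @ X                    # sum_i X_i X_i^T, shape (d, d)
B = X.T @ Y                    # sum_i Y_i X_i, shape (d,)


def loss(w):
    """Regularized loss L(w, D) as defined in Problem 2."""
    residuals = Y - X @ w
    return residuals @ residuals / (2 * N) + 0.5 * lam * (w @ w)


# Gradient as stated in question 2.
grad_formula = (A @ w - B) / N + lam * w

# Central finite differences, one coordinate direction at a time.
eps = 1e-6
grad_numeric = np.array([
    (loss(w + eps * e) - loss(w - eps * e)) / (2 * eps)
    for e in np.eye(d)
])

# A is symmetric, so eigvalsh applies; eigenvalues come back in ascending order.
eig_A = np.linalg.eigvalsh(A)
eig_shifted = np.linalg.eigvalsh(A + lam * N * np.eye(d))

print("max gradient mismatch:", np.max(np.abs(grad_formula - grad_numeric)))
print("smallest eigenvalue of A:", eig_A[0])
print("smallest eigenvalue of A + lam*N*I:", eig_shifted[0])
```

Rerunning with `N` smaller than `d` is a useful variation: the smallest eigenvalue of `A` then drops to roughly zero, while the shifted matrix keeps all of its eigenvalues away from zero.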
