APM 598: Homework 1 solution


Ex 1. [Overfitting]
Some data {x_i, y_i}_{i=1…N} has been generated from an unknown polynomial P*(x) (see Figure 1 and the code below to load the data 'data_HW1_ex1.csv' in Python), i.e.

    y_i = P*(x_i) + ε_i,

where ε_i is some 'white noise', or more precisely ε ~ N(0, σ²), with σ² the intensity of the noise (also unknown). The goal is to estimate the polynomial P* and in particular its degree, denoted k* (i.e. k* = deg(P*)).

Figure 1: Data points {x_i, y_i}_{i=1..N} generated from a polynomial P*.
a) For each k = 0…12, estimate the polynomial P̂_k of order k that minimizes the mean-square-error (use polyfit):

    ℓ(P) = (1/N) Σ_{i=1}^{N} |y_i − P(x_i)|²,

in other words, find the polynomial P̂_k satisfying:

    P̂_k = argmin_{P, deg(P)≤k} ℓ(P).

Plot the loss ℓ(P̂_k) as a function of k (i.e. plot the points {(k, ℓ(P̂_k))}_{k=0…12}).
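Part a) can be sketched as follows. Since 'data_HW1_ex1.csv' is not reproduced here, a synthetic polynomial data set (an assumption, for illustration only) stands in for the loaded arrays x, y:

```python
import numpy as np

# Synthetic stand-in for the homework data (replace with the x, y loaded
# from data_HW1_ex1.csv via the code at the end of the handout).
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 100)
y = 1.0 + 2.0 * x - 3.0 * x**3 + 0.1 * rng.standard_normal(x.size)

degrees = range(13)
losses = []
for k in degrees:
    coeffs = np.polyfit(x, y, k)              # least-squares fit of degree k
    y_hat = np.polyval(coeffs, x)
    losses.append(np.mean((y - y_hat) ** 2))  # mean-square error ℓ(P̂_k)

# Plot with plt.plot(list(degrees), losses, 'o-') to visualize ℓ(P̂_k) vs k.
```

Since the models are nested (degree k is a special case of degree k+1), the in-sample loss can only decrease with k, which is exactly why part b) introduces a test set.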
b) Split the data-set {x_i, y_i}_{i=1…N} into training and test sets using (resp.) 80% and 20% of the data. We define two loss functions:

    ℓ_train(P) = (1/N_train) Σ_{i in train set} |y_i − P(x_i)|²    (1)

    ℓ_test(P) = (1/N_test) Σ_{i in test set} |y_i − P(x_i)|²    (2)

where N_train and N_test denote the number of elements in (resp.) the train and test sets (N_train + N_test = N).
Similarly, for each k = 0…12, estimate the polynomial P̃_k of order k that minimizes the mean-square-error over the training set:

    P̃_k = argmin_{P, deg(P)≤k} ℓ_train(P),

and plot the values of the loss functions ℓ_train and ℓ_test at P̃_k for all k.
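A sketch of the split and the two loss curves, again on synthetic stand-in data (the random 80/20 split via a permutation of indices is one common choice, not prescribed by the handout):

```python
import numpy as np

# Synthetic stand-in for the homework data (replace with the loaded x, y).
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 200)
y = 1.0 + 2.0 * x - 3.0 * x**3 + 0.1 * rng.standard_normal(x.size)

# 80/20 split on a random permutation of the indices
perm = rng.permutation(x.size)
n_train = int(0.8 * x.size)
train, test = perm[:n_train], perm[n_train:]

l_train, l_test = [], []
for k in range(13):
    coeffs = np.polyfit(x[train], y[train], k)  # fit on the training set only
    l_train.append(np.mean((y[train] - np.polyval(coeffs, x[train])) ** 2))
    l_test.append(np.mean((y[test] - np.polyval(coeffs, x[test])) ** 2))

# Plot both curves against k: ℓ_train keeps decreasing, while ℓ_test
# typically turns back up once k exceeds the true degree (overfitting).
```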
c) Guess the order k* of the polynomial P* and give your estimate of its coefficients.
Ex 2. [Gradient descent]
The goal is to implement and test gradient descent methods. We use linear regression as a toy problem on the data set {x_i, y_i}_{i=1..N} from Ex 1, and we would like to minimize the following loss function:

    ℓ(a, b) = (1/N) Σ_{i=1}^{N} |y_i − (a + b·x_i)|².
a) Compute the gradient of the loss function ∇ℓ = (∂_a ℓ, ∂_b ℓ).
Deduce (numerically) the minimum (a*, b*) of ℓ.
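Differentiating term by term gives ∂_a ℓ = −(2/N) Σ (y_i − a − b·x_i) and ∂_b ℓ = −(2/N) Σ x_i (y_i − a − b·x_i); setting both to zero yields the usual least-squares solution. A sketch on synthetic stand-in data:

```python
import numpy as np

# Synthetic stand-in for the homework data (replace with the loaded x, y).
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 100)
y = 0.5 + 1.5 * x + 0.1 * rng.standard_normal(x.size)

def grad_loss(a, b, x, y):
    """Gradient (∂_a ℓ, ∂_b ℓ) of ℓ(a,b) = mean |y_i - (a + b x_i)|²."""
    r = y - (a + b * x)                       # residuals
    return np.array([-2 * r.mean(), -2 * (x * r).mean()])

# Closed-form minimizer from the normal equations:
b_star = np.cov(x, y, bias=True)[0, 1] / x.var()
a_star = y.mean() - b_star * x.mean()
```

At (a*, b*) the gradient vanishes, which is a useful sanity check on both formulas.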
b) Starting from (a_0, b_0) = (1.8, 1.4) and using the constant learning rate η = 0.05, implement the gradient descent algorithm:

    (a_{n+1}, b_{n+1}) = (a_n, b_n) − η ∇ℓ(a_n, b_n).

Estimate the rate of convergence (i.e. how fast does ‖(a_n, b_n) − (a*, b*)‖ converge to zero?).
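The iteration above can be sketched as follows (synthetic stand-in data again; the true data would give different numerical values but the same geometric decay):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 100)
y = 0.5 + 1.5 * x + 0.1 * rng.standard_normal(x.size)

def grad_loss(theta):
    a, b = theta
    r = y - (a + b * x)
    return np.array([-2 * r.mean(), -2 * (x * r).mean()])

# Closed-form minimum for reference
b_star = np.cov(x, y, bias=True)[0, 1] / x.var()
theta_star = np.array([y.mean() - b_star * x.mean(), b_star])

eta = 0.05
theta = np.array([1.8, 1.4])
errors = []
for n in range(500):
    theta = theta - eta * grad_loss(theta)
    errors.append(np.linalg.norm(theta - theta_star))

# Linear (geometric) convergence: errors[n] ≈ C * rho**n, so the ratio
# of consecutive errors settles to a constant rho < 1 governed by the
# eigenvalues of the Hessian of ℓ.
rho = errors[-1] / errors[-2]
```

Plotting errors on a log scale (plt.semilogy) makes the rate rho visible as the slope of a straight line.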
c) Implement the momentum and Nesterov algorithms, denoting θ_n = (a_n, b_n):

    momentum:   v_{n+1} = γ v_n + η ∇ℓ(θ_n)
                θ_{n+1} = θ_n − v_{n+1}

    Nesterov:   v_{n+1} = γ v_n + η ∇ℓ(θ_n − γ v_n)
                θ_{n+1} = θ_n − v_{n+1}

Estimate the convergence rate for γ = 0.9 and η = 0.05.
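The two recursions differ only in where the gradient is evaluated: momentum uses the current point θ_n, Nesterov the look-ahead point θ_n − γ v_n. A sketch of both, on the same synthetic stand-in data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 100)
y = 0.5 + 1.5 * x + 0.1 * rng.standard_normal(x.size)

def grad_loss(theta):
    a, b = theta
    r = y - (a + b * x)
    return np.array([-2 * r.mean(), -2 * (x * r).mean()])

b_star = np.cov(x, y, bias=True)[0, 1] / x.var()
theta_star = np.array([y.mean() - b_star * x.mean(), b_star])

def run(method, gamma=0.9, eta=0.05, n_steps=300):
    theta = np.array([1.8, 1.4])
    v = np.zeros(2)
    for _ in range(n_steps):
        if method == "momentum":
            v = gamma * v + eta * grad_loss(theta)
        else:  # Nesterov: gradient at the look-ahead point theta - gamma*v
            v = gamma * v + eta * grad_loss(theta - gamma * v)
        theta = theta - v
    return np.linalg.norm(theta - theta_star)

err_mom = run("momentum")
err_nag = run("nesterov")
```

Tracking the error per iteration (as in part b) and fitting the slope on a log plot gives the convergence rate for each method.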
extra) Test other gradient descent methods (e.g. Adam, AdaGrad, RMSProp) using the optimizer classes defined in either PyTorch or TensorFlow.
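The library route suggested above is e.g. torch.optim.Adam; as a dependency-free illustration of what such a class does, here is a from-scratch sketch of the Adam update rule on the same toy loss (synthetic stand-in data, hyperparameters chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 100)
y = 0.5 + 1.5 * x + 0.1 * rng.standard_normal(x.size)

def grad_loss(theta):
    a, b = theta
    r = y - (a + b * x)
    return np.array([-2 * r.mean(), -2 * (x * r).mean()])

def loss(theta):
    a, b = theta
    return np.mean((y - (a + b * x)) ** 2)

theta = np.array([1.8, 1.4])
loss0 = loss(theta)
m, v = np.zeros(2), np.zeros(2)
lr, beta1, beta2, eps = 0.05, 0.9, 0.999, 1e-8
for t in range(1, 1001):
    g = grad_loss(theta)
    m = beta1 * m + (1 - beta1) * g        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * g**2     # second-moment estimate
    m_hat = m / (1 - beta1**t)             # bias corrections
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)

loss_final = loss(theta)
```

Comparing the per-iteration error of Adam against plain gradient descent and momentum makes for a direct answer to this extra question.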
Ex 3. [Classification]
The goal is to implement a linear classifier for the Fashion-MNIST data-set (see also the code below).
a) Adapt the linear classifier for the MNIST data-set to the new database.
b) Train the model using 40 epochs with learning rate η = 10⁻⁴ and the gradient descent method of your choice.
Plot the evolution of the loss at each epoch.
c) Draw the 'template' of each class.
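A minimal sketch of the whole exercise, using synthetic stand-in "images" so it runs without downloading Fashion-MNIST (the real exercise uses the flattened 28×28 images and η = 10⁻⁴; the larger rate here is only to make the toy example converge quickly). The 'template' of class c is simply row c of the weight matrix reshaped back to the image grid:

```python
import numpy as np

# Synthetic stand-in data: each class gets a distinct mean pattern,
# so there is something for the linear classifier to learn.
rng = np.random.default_rng(0)
n_classes, d, n = 10, 28 * 28, 500
labels = rng.integers(0, n_classes, n)
means = rng.standard_normal((n_classes, d))
X = means[labels] + 0.5 * rng.standard_normal((n, d))

# Softmax linear classifier trained by full-batch gradient descent.
W = np.zeros((n_classes, d))
b = np.zeros(n_classes)
eta = 1e-2
losses = []
for epoch in range(40):
    logits = X @ W.T + b
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    losses.append(-np.mean(np.log(p[np.arange(n), labels])))  # cross-entropy
    onehot = np.eye(n_classes)[labels]
    W -= eta * (p - onehot).T @ X / n               # cross-entropy gradient
    b -= eta * (p - onehot).mean(axis=0)

# Part c): the 'template' of class c is its weight row on the image grid;
# show each with plt.imshow(templates[c]).
templates = W.reshape(n_classes, 28, 28)
```

On the real Fashion-MNIST data the templates resemble blurry prototypes of each garment class, which is a good visual check that training worked.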
##############################################
## Exercise 1 (Python) ##
##############################################
import matplotlib.pyplot as plt
import numpy as np

# get the data
data = np.loadtxt('data_HW1_ex1.csv', delimiter=',')
x, y = data[:, 0], data[:, 1]

# plot
plt.figure(1)
plt.plot(x, y, 'o')
plt.xlabel(r'$x$')
plt.ylabel(r'$y$')
plt.show()
##############################################
## Exercise 3 (Python) ##
##############################################
from torchvision import datasets
import matplotlib.pyplot as plt

# Download and load
data_collection = datasets.FashionMNIST('data_fashionMNIST', train=True, download=True)

# Illustration
label_fashion = dict([(0,'T-shirt'),(1,'trouser'),(2,'pullover'),(3,'dress'),(4,'coat'),
                      (5,'sandal'),(6,'shirt'),(7,'sneaker'),(8,'bag'),(9,'boot')])
X, y = data_collection[123]   # (PIL image, integer label)
plt.figure(1); plt.clf()
plt.imshow(X)
plt.title("Example of image with label " + label_fashion[y])
plt.show()
