## Description

Problem 1. The purpose of this problem is to illustrate how to train recurrent neural networks (RNNs). We will illustrate concepts from class and train an RNN that models an Ornstein-Uhlenbeck process. You do not need to know anything about Ornstein-Uhlenbeck processes for this assignment, but if you are curious, you can learn more about them online (e.g., https://en.wikipedia.org/wiki/Ornstein-Uhlenbeck_process).

Code in the starter notebook includes a function `generate_ou_process` that sequentially generates Ornstein-Uhlenbeck time series, to which Gaussian noise is added. The first graph in the notebook shows an example of the ground truth Ornstein-Uhlenbeck time series and its noisy counterpart. Our goal is to train an RNN to denoise the time series.
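To make the data concrete, here is a minimal sketch of how such a series could be generated. It is not the starter notebook's `generate_ou_process`: the helper name `simulate_ou`, the Euler-Maruyama discretization, and every parameter value (`theta`, `mu`, `sigma`, `dt`, `x0`, `noise_std`) are assumptions chosen for illustration.

```python
import jax
import jax.numpy as jnp


def simulate_ou(key, num_steps, theta=0.5, mu=0.0, sigma=0.3, dt=0.1, x0=1.0):
    """Hypothetical helper: Euler-Maruyama simulation of
    dx = theta * (mu - x) dt + sigma dW."""
    # Brownian increments, one per transition between consecutive time steps.
    increments = jax.random.normal(key, (num_steps - 1,)) * jnp.sqrt(dt)

    def step(x, dw):
        # Mean-reverting drift plus a scaled Brownian increment.
        x_next = x + theta * (mu - x) * dt + sigma * dw
        return x_next, x_next

    start = jnp.array(x0, dtype=jnp.float32)
    _, tail = jax.lax.scan(step, start, increments)
    return jnp.concatenate([start[None], tail])


key_process, key_noise = jax.random.split(jax.random.PRNGKey(0))
clean = simulate_ou(key_process, num_steps=100)

# Observation noise: independent Gaussian samples added to every time step.
noise_std = 0.1  # assumed value
noisy = clean + noise_std * jax.random.normal(key_noise, clean.shape)
```

The denoising task then amounts to recovering something close to `clean` when given only `noisy`.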

The RNN we use in this assignment uses Gated Recurrent Units (GRUs) rather than the LSTMs we studied in class. GRUs are similar to LSTMs but do not include an output gate. Here are the 3 gates implemented by the GRU:

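• Update gate: $z_t \leftarrow \sigma(W_z \cdot x_t + U_z \cdot h_{t-1} + b_z)$

• Reset gate: $r_t \leftarrow \sigma(W_r \cdot x_t + U_r \cdot h_{t-1} + b_r)$

• Output gate: $o_t \leftarrow z_t \cdot h_{t-1} + (1 - z_t) \cdot \tanh(W_o \cdot x_t + U_o \cdot (r_t \circ h_{t-1}) + b_o)$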

where $\sigma$ is the sigmoid function, $x_t$ is the input at time step $t$, $h_t$ is the hidden state at time step $t$, $\circ$ denotes elementwise multiplication, and all other symbols are weights or biases. For instance, $W_z$, $U_z$, and $b_z$ are the parameters of the update gate: respectively, the weights applied to the input, the weights applied to the previous hidden state, and the bias.
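The three equations above can be written as a single step function and unrolled over time with `jax.lax.scan`. The following is a minimal sketch, not the starter notebook's `apply_fun_scan`: the helper names `init_gru_params` and `gru_step`, the dictionary layout of the parameters, the row-vector convention (`x_t @ W` rather than `W @ x_t`), and the dimensions are all assumptions made for illustration.

```python
import jax
import jax.numpy as jnp
from jax.nn import sigmoid


def init_gru_params(key, in_dim, hidden_dim, scale=0.1):
    """Hypothetical initializer: one (W, U, b) triple per gate."""
    keys = jax.random.split(key, 6)
    params = {}
    for i, gate in enumerate(["z", "r", "o"]):
        params["W" + gate] = scale * jax.random.normal(keys[2 * i], (in_dim, hidden_dim))
        params["U" + gate] = scale * jax.random.normal(keys[2 * i + 1], (hidden_dim, hidden_dim))
        params["b" + gate] = jnp.zeros(hidden_dim)
    return params


def gru_step(params, h_prev, x_t):
    """One time step; the output is returned as both the carry and the per-step result."""
    # Update gate: how much of the previous hidden state to keep.
    z_t = sigmoid(x_t @ params["Wz"] + h_prev @ params["Uz"] + params["bz"])
    # Reset gate: how much of the previous hidden state feeds the candidate.
    r_t = sigmoid(x_t @ params["Wr"] + h_prev @ params["Ur"] + params["br"])
    candidate = jnp.tanh(x_t @ params["Wo"] + (r_t * h_prev) @ params["Uo"] + params["bo"])
    # Output: convex combination of the previous state and the candidate.
    o_t = z_t * h_prev + (1.0 - z_t) * candidate
    return o_t, o_t


in_dim, hidden_dim, num_steps = 1, 8, 20  # assumed sizes
params = init_gru_params(jax.random.PRNGKey(0), in_dim, hidden_dim)
xs = jnp.ones((num_steps, in_dim))
h0 = jnp.zeros(hidden_dim)
h_last, hs = jax.lax.scan(lambda h, x: gru_step(params, h, x), h0, xs)
```

Because the output is a convex combination of the previous state and a `tanh` value, every entry of `hs` stays within [-1, 1] when the initial state does.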

1. (1 point) Fill in the line implementing the forward pass for the update gate in `apply_fun_scan`.

2. (1 point) Fill in the line implementing the forward pass for the reset gate in `apply_fun_scan`.

3. (1 point) Fill in the line implementing the forward pass for the output gate in `apply_fun_scan`.

4. (1 point) Fill in the missing line in the function `mse_loss`. The function returns the mean squared error loss between the model's predictions (i.e., `preds`) and the target sequence (i.e., `targets`).

ECE1513H – Winter 2020 Assignment 6 – Page 2 of 2 Due April 6

5. (2 points) Fill in the missing lines at the top of the cell titled “Training the RNN”. These lines should use the optimizers pre-built into JAX to instantiate an Adam optimizer. As seen in previous homeworks, you should obtain three things from the pre-built JAX optimizer: a method `opt_init` that takes in a set of initial parameter values returned by `init_fun` and returns the initial optimizer state `opt_state`, a method `opt_update` that takes in the step index, the gradients, and the current optimizer state and returns the optimizer state after one step of optimization, and a method `get_params` that takes in an optimizer state and returns the current parameter values.

6. (1 point) Fill in the lines that define `x_in` (the input) and `y` (the output) of our recurrent neural network. The RNN should take in all but the last step of the noisy time series, and predict all but the first step of the ground truth time series (i.e., the time series before noise was added).

7. (1 point) Fill in the line that calls `update` to take one step of gradient descent on the sampled batch of training data.

8. (2 points) As done in prior assignments, perform a hyperparameter search to find a good value for your learning rate. Describe briefly how you conducted the search, the value you chose, and why you chose that value. You may find it useful to call `plot_ou_loss(train_loss_log)`.

9. (1 point) Using the last cell of the notebook, comment qualitatively on the differences between the predicted time series, the ground truth, and the noisy time series. You will have to reuse the definitions of `x_in` (the input) and `y` from question 6.
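Questions 4 through 8 fit together as a single training loop. The sketch below shows one way the pieces could connect, under several assumptions: it imports `jax.example_libraries.optimizers` (where recent JAX releases keep the pre-built optimizers; older releases exposed them as `jax.experimental.optimizers`), it uses a deliberately trivial stand-in model (`toy_apply`, one weight and one bias) in place of the notebook's GRU, it trains on synthetic sine data rather than the output of `generate_ou_process`, and the learning rates are arbitrary. None of these choices come from the starter notebook.

```python
import jax
import jax.numpy as jnp
from jax.example_libraries import optimizers


def toy_apply(params, x_in):
    """Stand-in for the RNN: an elementwise affine map (hypothetical)."""
    w, b = params
    return w * x_in + b


def mse_loss(params, x_in, targets):
    preds = toy_apply(params, x_in)
    return jnp.mean((preds - targets) ** 2)


def train(step_size, noisy, clean, num_steps=200):
    opt_init, opt_update, get_params = optimizers.adam(step_size=step_size)
    opt_state = opt_init((jnp.array(0.0), jnp.array(0.0)))

    # Input: all but the last noisy step. Target: all but the first clean step.
    x_in, y = noisy[:, :-1], clean[:, 1:]

    @jax.jit
    def update(i, opt_state):
        params = get_params(opt_state)
        loss, grads = jax.value_and_grad(mse_loss)(params, x_in, y)
        # opt_update takes the step index, the gradients, and the optimizer state.
        return opt_update(i, grads, opt_state), loss

    losses = []
    for i in range(num_steps):
        opt_state, loss = update(i, opt_state)
        losses.append(float(loss))
    return get_params(opt_state), losses


# Synthetic batch of 4 series with 50 steps each (assumed shapes).
t = jnp.linspace(0.0, 6.0, 50)
clean = jnp.stack([jnp.sin(t + phase) for phase in (0.0, 0.5, 1.0, 1.5)])
noisy = clean + 0.1 * jax.random.normal(jax.random.PRNGKey(0), clean.shape)

# Coarse, log-spaced learning-rate sweep; record the final training loss of each run.
final_losses = {}
for step_size in (1e-3, 1e-2, 1e-1):
    _, losses = train(step_size, noisy, clean)
    final_losses[step_size] = losses[-1]
```

Comparing full loss curves across step sizes, rather than only the final values, usually makes it easier to see which learning rates stall and which become unstable.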
