Description
Here, (m) denotes the covariance at lag m for a block of data. There are di§erent ways
to compute (m), speciÖcally the scaling factor, so let us use the following convention:
assume the data is 0 mean; if not subtract the sample mean Örst. Then (with data indexed
1 n N):
(m) = 1
N
X
N
n=m+1
xnxnm
The (auto-)correlation coe¢ cients are:
(m) = (m) = (0)
For example, the MATLAB function autocorr(x) will compute and plot (m) for 0 m 20,
with signiÖcance bounds (in the graph, if a displayed value is below the bounds, then it can be
considered as 0). For our purposes, it is Öne if we use j (m)j > 0:2 as a test of ìsigniÖcanceî.
Our formula for (m) is technically biased, and can be interpreted as a Bartlett (i.e., similar
to triangular) windowed form of unbiased estimates. In any case, for our purposes, the lags
of interest are M, where M N so this distinction is not of interest for us, here.
For pairs of signals, we can deÖne:
xy (m) = 1
N
X
N
n=m+1
xnynm
for m 0, and similarly for m < 0. Then the cross-correlation coe¢ cient is:
xy (m) =
xy (m)
q
xx (0) yy (0)
and in MATLAB, crosscorr(x; y) will compute and graph this, by default for 20 m 20.
1. Heavy Tail Distributions
First we are going to explore some distributions. Generate N = 1e6 iid samples of
N (0; 1), Cauchy with = 1, and Studentsí t-distribution with = 5 and = 10
degrees of freedom. The variance of Studentsít-distribution is Önite for 3 (well,
technically > 2 but here we consider only integer values for ) and is given by:
2
For comparison, you should normalize your Studentsít-distribution data sets so they
have variance 1. Obviously donít normalize Cauchy since it has no variance! However,
= 0:544 is a reasonable “match” to N (0; 1) in the sense that both yield approximatel
the same P (jXj < 1) = 68% (which is the probability of being within one standard
deviation for the case of N (0; 1)):
In MATLAB, trnd generates the Studentsí t-distribution. You can generate Cauchy
= 1 as a special case of Studentsít by setting = 1; multiply the result by to
create Cauchy with general . Another approach is that if U is uniform over (0,1),
then X = tan (U) is Cauchy. Note in general Cauchy has pdf;
f (x) = =
2 + x
2
In any case, at this point you have N samples of ìcomparably scaledîGaussian, Cauchy
and Studentsí data. For each, compute the fraction of the time the absolute value
exceeds 4. Also, graph the Cauchy data: you will see something interesting (you will
not see in the others).
2. AR and ARMA
Let us start by synthesizing data using an ARMA(2; 2) model:
rt =
(1 + 0:2z
1
) (1 + 0:5z
1
)
(1 0:8z
1
) (1 0:7z
1
)
vt
where vt
is iid N (0; 1). Interpret the above as an operator notation for the appropriate
di§erence equation. Generate N = 250 samples. Consider the AR(p) model:
rt = vt + wp1rt1 + + wpprtp
The AR coe¢ cients as given in the notes would be ap0 = 1 and apk = wpk, 1 k p.
As usual we take ap0 = 1. We can interpret this as a linear regression model with model
error vt
. In this way, we can compute the AR coe¢ cients fapkg
p
k=1 using a least-squares
Öt. Recall app is the reáection coe¢ cient p, and E
jvt
j
2
is Pp, the power in the p
th
order prediction. The set p and Pp can be found using a Levinson-Durbin recursion
instead.
Take maximum order M = 10.
(a) Estimate the covariances (m) for 0 m M. Obtain a stem plot of the values
(m) = (0) for 0 m M. Those above say about 0:2 are signiÖcant. The
number above that level gives an indication that an AR model of that length
would be useful, higher order models would give marginal improvement. Note:
In MATLAB, autocorr does all this for you.
(b) Set up the corresponding (M + 1) (M + 1) Toeplitz matrix C, and compute its
eigenvalues. They are (hopefully) strictly positive real.
(c) Recall the Cholesky factorization of a pd matrix is C = LcholL
T
chol where Lchol is
lower triangular with positive entries on the diagonal. Closely related to this is
the so-called LDL decomposition C = LDLT where L is lower triangular with 1ís
on the diagonal, and D is a diagonal matrix with positive entries. If f`ig are the
diagonal elements of Lchol, then D = diag f`
2
i g. You should be able to
decomposition in your function library whether you use MATLAB, Python or
really any good scientiÖc computing software; if not and all you have is Cholesky,
you can derive D from Lchol as suggested, and then L = LcholD1=2
. If you canít
Önd either LDL or Cholesky decomposition, look harder. If itís not there, get a
better library! In any case, Önd the L and D for your covariance matrix C.
(d) Use the Levinson-Durbin recursion (you can use a function from your library, if
not then hand code it) to compute the reáection coe¢ cients and prediction error
powers Pm up to M. Also compute the FPEF coe¢ cients of all orders up to order
M. Pack these FPEFs into a triangular matrix of the form:
F =
2
6
6
6
6
6
4
1 0 0 0
a11 1 0 0
a22 a21 1 0
.
.
.
aMM aM1 1
3
7
7
7
7
7
5
(e) Compute F CFT
. Compare the diagonal elements of D with the Pmís:Compare
F with L
1
. In summary, the Levinson-Durbin recursion is tied to the LDL
decomposition (though it doesnít give you L and D directly, it gives what turns
out to be the more useful L
1 and D).
(f) Now use a LS Öt to compute the AR coe¢ cients of order M. Compare them with
the Mth order model you found via Levinson-Durbin.
(g) Comment on the reáection coe¢ cients you found. Consider the following general
issue: if a reáection coe¢ cient is close to 1 in magnitude, what does that tell you?
if it is close to 0, then what? Hint: Recall the formula that relates the Ppís to
the pís. Also, recall what holds for the ís if the data is exactly AR (M0) for
some order M0.
3. Repeat the above experiment with the modiÖcation that you have something close to
an ARIMA(2,1,2) model: r = Hv where:
H (z) =
1 0:99z
1
1 H0 (z)
where H0 (z) is the ARMA model in Problem 2. In other words, your rt will now exhibit
behavior that would suggest a unit-root nonstationarity. Plot the original rt (one
simulated block of data), and now the rt with this (close to) unit root nonstationarity.
After you generate the AR models and do the other analysis, now compute st = rtrt1,
the Örst di§erence, and repeat all for st
. The model should work better for st
. The
question is, can you detect the unit root nonstationarity early on? Go back to where
you graphed the partial correlation coe¢ cients, i.e., (m) = (0) for rt and comment.
4. Now repeat both Problems 2 and 3 using a Studentsít-distribution with = 5 for
vt (normalized so E (v
2
t
) = 1 still). The parts to repeat, and the issue of concern, is
the numerical stability and apparent Öt of the model. In theory, this should change
NOTHING! The theoretical values of (m) are IDENTICAL. There are various ways
of examining the changes, see if you can detect anything noticeable (or, in particular,
disturbing) happening. If not, good fo
5. Cointegration
Here we look for an example of cointegration, among other e§ects in time series. Let us
start with two underlying signals at
, bt
, with the Örst having a unit root nonstationarity
and the other stationary:
at
bt
= + G
at1
bt1
+
ut
vt
where:
=
0:3
0:5
G =
0:99 0
0:3 0:3
where ut
is ARMA(2; 2) driven by an input white noise that is iid N (0; 1), with innovations Ölter:
H =
(1 + 0:2z
1
) (1 + 0:3z
1
)
(1 0:2z
1
) (1 0:5z
1
)
and vt are iid samples with E (vt) = 0, E (v
2
t
) =
2
0
(a constant to be prescribed later),
and two possible distributions you will test: N (0; 2
0
), or Studentsít-distribution with
= 5 scaled to achieve the desired E (v
2
t
).
Next, let:
A =
3 4
2 3
and then:
rt =
r1t
r2t
= A
at
bt
(a) Compute the
0
vector and G0 matrix such that we can write:
r1t
r2t
=
0 + G
0
r1t1
r2t1
+
1t
2t
where 1t
; 2t are correlated disturbances, computed via t = A
ut
vt
.
(b) We want the e§ect of bt to be noticeable but not dominant, so do the following.
After you generate at
, take:
2
0 =
1
16
E
a
2
=
1
16
1
N
X
N
n=1
a
2
n
!
Then use that to generate bt
. Finally compute r1t
; r2t
. [In reality, you should do
this twice, using the two di§erent distributions for vt
. Do the rest of this exercise,
as well, for each case].
(c) Graph ut and vt
,. separately. Superimpose graphs of at
; bt
. Superimpose graphs
of r1t
; r2t
. Repeat this, say three times, to see “typical”
(d) We now want to detect the unit root stationarity in r1t
; r2t
, and determine if
there is indeed cointegration. Indeed, we can recover at
; bt via A1
rt
, which shows
that a linear combination of r1t
; r2t
is stationary! Determine the coe¢ cients of
cointegration (i.e., the linear combination of these two that yield a stationary
result).
(e) Compute and graph the auto-correlation coe¢ cients of r1t
; r2t
, separately, i.e.,
11 (m), 22 (m) for 0 m 20, and compute the cross-correlations 12 (m) for
20 m 20. In MATLAB, autocorr and crosscorr will do this for you.
(f) With sit = (1 z
1
) rit, i = 1; 2, repeat the above (i.e., look at the two autocorrelations and cross-correlation coe¢ cients between s1t
; s2t). You should see a
steeper decay which indicates both r1t
; r2t su§er from unit root nonstationarity.
(g) You will note the cross-correlation between r1t
; r2t decays slowly. To check this is
indicative of co-integration, try creating another signal that is INDEPENDENT
of r1t but also has a unit root nonstationarity. For example, let t be iid N (0; 1),
and ct = 0:99ct1 + t
. Checking the c
(m) coe¢ cients of c should of course
conÖrm it has the unit root nonstationarity. But now also look at the crosscorrelation coe¢ cients between ct and r1t
, and between ct and r2t
. You should see
signiÖcantly less cross-correlation between them.
6. ARCH/GARCH
Now we try ARCH/GARCH modeling. Take two years each of daily data for S&P500
and say two stocks that interest you, letís say some time within the last 20 years.
(You can get data o§ Yahoo Finance). SpeciÖcally, we are looking at adjusted closing
prices, and as a Örst step compute the daily log-returns for the period. Each year
will produce about 250 data points, but realize the exact number may vary.
First graph the returns and square returns. The returns should appear to be close to
constant mean (if not zero mean). Since we do not want to deal with the means here,
compute and subtract o§ the sample means, so from this point we will assume the
returns rt have 0 mean (constant). You may also try multiyear models, say 2007-2009,
which captures the 2008 Önancial crisis, or 2018-2020 (to date) which captures…. hmm,
that which will not be mentioned!
Compute and graph the auto-correlation coe¢ cients up to lag 20 for the returns AND
the square returns. If there are not signiÖcant values for the returns, that suggests our
mean model can be simple, i.e., constant mean. If the values for the square returns
however show some signiÖcance, then it is indicative a conditional heteroskedasticity
is in play. Ideally, for this simple experiment, your data should exhibit this (conditional heteroskedasticity and constant mean), and pick di§erent data sets if not. If we
assume the conditional expectation is 0 (letís say subtract o§ the sample mean Örst, if
necessary), our model is simply:
rt = t
t = tzt
2
t = GARCH (p; q)
; 2
where zt are iid with E (zt) = 0, E (z
2
t
) =
(a) First some simulation. For a GARCH(1; 1) model we can write:
2
t = ! + 2
t1 + 2
t1
Synthesize rt according to such a model, with zt N (0; 1), with ! = 0:5, = 0:6,
= 0:4. Superimpose graphs of rt and t
. You should see variability in the
volatility of rt
, that t seems to “track”. Now, repeat this with zt have a Students
t-distribution with = 5 degrees of freedom.
For each case, Öt a GARCH(1; 1) model, and an ARCH(2) model, assuming a
Gaussian distribution for zt (this is called QML- quasi-maximum likelihood, since
you are assuming a Gaussian likelihood function even when it is not the true one).
For the GARCH(1; 1) model, do you get back something close to your coe¢ cients?
For the ARCH(2) model, superimpose plots of rt and t as generated by the model,
and see if you have a good match for the volatility. What impact, if any, does the
Students t-distribution with = 5 degrees of freedom have?
(b) Your GARCH Ötting library should provide some information regarding the signiÖcance of the model coe¢ cients. Take your actual Önancial data, and Öt rt with
a GARCH(1,1) and ARCH(2) and see if either provides good Öts, and also plot rt
and t superimposed. If the volatility model is a good Öt, the graph of t should
appear to be like an ìenvelopeî for the return data.