Description
The purpose of this homework is to see the relationship between the condition
number and the numerical error when solving linear least-squares problems.
First, implement the following two methods for solving least-squares problems, which
you'll use in the next exercise (minimal sketches of both follow the list).
1. Method of Normal Equations (uses the Cholesky factorization)
2. Method based on the Thin QR factorization
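One possible way to structure these two solvers in NumPy is sketched below. The function names lstsq_normal_equations and lstsq_thin_qr are just suggestions, and both sketches assume $A$ has full column rank, so that $A^\top A$ is positive definite and $R$ is nonsingular.

\begin{verbatim}
import numpy as np

def lstsq_normal_equations(A, b):
    """Least squares via the normal equations A^T A x = A^T b,
    solved with a Cholesky factorization A^T A = L L^T."""
    AtA = A.T @ A
    Atb = A.T @ b
    L = np.linalg.cholesky(AtA)     # lower-triangular Cholesky factor
    y = np.linalg.solve(L, Atb)     # solve L y = A^T b
    return np.linalg.solve(L.T, y)  # solve L^T x = y

def lstsq_thin_qr(A, b):
    """Least squares via the thin (reduced) QR factorization A = QR,
    where Q is m-by-n with orthonormal columns and R is n-by-n upper
    triangular; the minimizer solves R x = Q^T b."""
    Q, R = np.linalg.qr(A, mode='reduced')
    return np.linalg.solve(R, Q.T @ b)
\end{verbatim}

A dedicated triangular solver (e.g., scipy.linalg.solve_triangular) would exploit the structure of $L$ and $R$; np.linalg.solve is used here only to keep the sketch NumPy-only.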
Next, load the given matrix (download from Canvas) into memory. Call the
matrix $A$,
\[
A = \begin{bmatrix} a_1 & \cdots & a_n \end{bmatrix}, \tag{1}
\]
where $a_i \in \mathbb{R}^m$ is the $i$th column of $A$. Define the matrices $A_1, \ldots, A_n$ as:
\[
A_k = \begin{bmatrix} a_1 & \cdots & a_k \end{bmatrix}, \quad k = 1, \ldots, n. \tag{2}
\]
That is, $A_k$ contains the first $k$ columns of the matrix $A$ (that you loaded
into memory).
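For example, assuming the matrix was saved as a NumPy array (the file name A.npy below is a placeholder; use whatever Canvas provides), each $A_k$ is simply a column slice:

\begin{verbatim}
import numpy as np

A = np.load("A.npy")   # placeholder file name
m, n = A.shape

def leading_columns(A, k):
    # A_k: the submatrix formed by the first k columns of A
    return A[:, :k]
\end{verbatim}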
Now, generate the error data that you'll analyze (a sketch of this loop appears
after the list). For $k$ from $k_{\min} = 40$ to $k_{\max} = 65$:
1. Report the size, rank, and condition number of $A_k$.
2. Generate 100 random vectors $b_i \in \mathbb{R}^m$. For each $b_i$:
(a) Use the built-in least-squares solver (numpy.linalg.lstsq) to compute the least-squares minimizer given $A_k$ and $b_i$. Call this vector $x_{\mathrm{true}}$, because we're just going to trust the software on this one.
(b) Use your Normal Equations solver to compute the least-squares minimizer, $x_{\mathrm{NE}}$. Compute the relative error with respect to $x_{\mathrm{true}}$.
(c) Use your QR solver to compute the least-squares minimizer, $x_{\mathrm{QR}}$. Compute the relative error with respect to $x_{\mathrm{true}}$.
3. For each of QR and the Normal Equations, compute the average error over all the $b_i$.
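One way to organize this loop, reusing the solver sketches above, is shown below. The Gaussian distribution for the random vectors and the fixed seed are assumptions (the assignment only asks for random vectors).

\begin{verbatim}
import numpy as np

k_min, k_max, n_trials = 40, 65, 100
ks, conds, avg_err_ne, avg_err_qr = [], [], [], []
rng = np.random.default_rng(0)                 # arbitrary seed

for k in range(k_min, k_max + 1):
    Ak = A[:, :k]
    cond_k = np.linalg.cond(Ak)
    print(k, Ak.shape, np.linalg.matrix_rank(Ak), cond_k)

    errs_ne, errs_qr = [], []
    for _ in range(n_trials):
        b = rng.standard_normal(Ak.shape[0])   # random right-hand side
        x_true = np.linalg.lstsq(Ak, b, rcond=None)[0]   # trusted reference
        x_ne = lstsq_normal_equations(Ak, b)
        x_qr = lstsq_thin_qr(Ak, b)
        errs_ne.append(np.linalg.norm(x_ne - x_true) / np.linalg.norm(x_true))
        errs_qr.append(np.linalg.norm(x_qr - x_true) / np.linalg.norm(x_true))

    ks.append(k)
    conds.append(cond_k)
    avg_err_ne.append(np.mean(errs_ne))
    avg_err_qr.append(np.mean(errs_qr))
\end{verbatim}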
Make two plots on a semilogy scale (a plotting sketch follows the list):
• the average error versus $k$ (the number of columns in the matrix) for both
QR and the Normal Equations,
• the condition number of $A_k$ versus $k$.
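A minimal matplotlib sketch for the two plots, using the lists collected in the loop above:

\begin{verbatim}
import matplotlib.pyplot as plt

# Plot 1: average relative error vs. k, log scale on the y-axis
plt.figure()
plt.semilogy(ks, avg_err_ne, label="Normal Equations")
plt.semilogy(ks, avg_err_qr, label="Thin QR")
plt.xlabel("k (number of columns)")
plt.ylabel("average relative error")
plt.legend()

# Plot 2: condition number of A_k vs. k
plt.figure()
plt.semilogy(ks, conds)
plt.xlabel("k (number of columns)")
plt.ylabel("condition number of A_k")

plt.show()
\end{verbatim}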
Now tell me what’s going on. More specifically:
1. What is the relationship between the error using QR versus the Normal
Equations?
2. What is the relationship between the errors and the condition number
of $A_k$?
3. Suppose your matrix $A$ is ill-conditioned. Which method is preferable?