## Description

The purpose of this homework is to see the relationship between the condition number and the numerical error when solving linear least-squares problems.

First, implement the following methods for least-squares, which you'll use in the next exercise.

1. Method of Normal Equations (uses the Cholesky factorization)

2. Method based on the Thin QR factorization
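A minimal sketch of the two solvers in NumPy (the function names are my own, and the triangular systems are solved with `np.linalg.solve` for brevity; a dedicated triangular solver such as `scipy.linalg.solve_triangular` would be more efficient):

```python
import numpy as np

def lstsq_normal_equations(A, b):
    """Solve min ||Ax - b||_2 via the normal equations A^T A x = A^T b."""
    # Form and factor the Gram matrix: A^T A = L L^T (Cholesky, L lower triangular)
    L = np.linalg.cholesky(A.T @ A)
    # Two triangular solves: L y = A^T b, then L^T x = y
    y = np.linalg.solve(L, A.T @ b)
    return np.linalg.solve(L.T, y)

def lstsq_qr(A, b):
    """Solve min ||Ax - b||_2 via the thin QR factorization A = QR."""
    # Thin QR: Q is m x n with orthonormal columns, R is n x n upper triangular
    Q, R = np.linalg.qr(A, mode='reduced')
    # Solve R x = Q^T b
    return np.linalg.solve(R, Q.T @ b)
```

Both functions assume $A$ has full column rank (so $A^TA$ is positive definite and $R$ is nonsingular).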

Next, load the given matrix (download from Canvas) into memory. Call the matrix $A$,

$$
A = \begin{bmatrix} a_1 & \cdots & a_n \end{bmatrix}, \tag{1}
$$

where $a_i \in \mathbb{R}^m$ is the $i$th column of $A$. Define the matrices $A_1, \dots, A_n$ as:

$$
A_k = \begin{bmatrix} a_1 & \cdots & a_k \end{bmatrix}, \qquad k = 1, \dots, n. \tag{2}
$$

That is, $A_k$ contains the first $k$ columns of the matrix $A$ (that you loaded into memory).
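In NumPy, forming $A_k$ is just a column slice; a small illustration (the matrix here is a stand-in for the one downloaded from Canvas):

```python
import numpy as np

A = np.arange(12.0).reshape(3, 4)  # stand-in for the downloaded matrix
k = 2
Ak = A[:, :k]                      # A_k: the first k columns of A
```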

Now, generate the error data that you'll analyze. For $k$ from $k_{\min} = 40$ to $k_{\max} = 65$:

1. Report the size, rank, and condition number of $A_k$.

2. Generate 100 random vectors $b_i \in \mathbb{R}^m$. For each $b_i$:

   (a) Use the built-in equation solver (`numpy.linalg.lstsq`) to compute the least-squares minimizer given $A_k$ and $b_i$. Call this vector $x_{\text{true}}$, because we're just going to trust the software on this one.

   (b) Use your Normal Equations solver to compute the least-squares minimizer, $x_{\text{NE}}$. Compute the relative error with respect to $x_{\text{true}}$.

   (c) Use your QR solver to compute the least-squares minimizer, $x_{\text{QR}}$. Compute the relative error with respect to $x_{\text{true}}$.

3. For each of QR and the Normal Equations, compute the average error over all the $b_i$.
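Putting steps 1–3 together, the data-generation loop might look like the following sketch (the two solvers are condensed versions of the ones from the first exercise, and `A` is a random stand-in for the matrix downloaded from Canvas):

```python
import numpy as np

def lstsq_ne(A, b):
    # Normal Equations solver (Cholesky), as implemented in the first exercise
    L = np.linalg.cholesky(A.T @ A)
    return np.linalg.solve(L.T, np.linalg.solve(L, A.T @ b))

def lstsq_qr(A, b):
    # Thin-QR solver, as implemented in the first exercise
    Q, R = np.linalg.qr(A, mode='reduced')
    return np.linalg.solve(R, Q.T @ b)

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 65))   # stand-in for the Canvas matrix
m, _ = A.shape

ks = range(40, 66)                   # k_min = 40 through k_max = 65
avg_err_ne, avg_err_qr, conds = [], [], []
for k in ks:
    Ak = A[:, :k]
    # Step 1: report size, rank, and condition number of A_k
    print(f"k={k}  size={Ak.shape}  rank={np.linalg.matrix_rank(Ak)}  "
          f"cond={np.linalg.cond(Ak):.3e}")
    # Step 2: 100 random right-hand sides, relative errors vs. lstsq
    errs_ne, errs_qr = [], []
    for _ in range(100):
        b = rng.standard_normal(m)
        x_true = np.linalg.lstsq(Ak, b, rcond=None)[0]
        errs_ne.append(np.linalg.norm(lstsq_ne(Ak, b) - x_true)
                       / np.linalg.norm(x_true))
        errs_qr.append(np.linalg.norm(lstsq_qr(Ak, b) - x_true)
                       / np.linalg.norm(x_true))
    # Step 3: average error over all b_i, per method
    avg_err_ne.append(np.mean(errs_ne))
    avg_err_qr.append(np.mean(errs_qr))
    conds.append(np.linalg.cond(Ak))
```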

Make two plots on a semilogy scale:


• the average error versus $k$ (the number of columns in the matrix) for both QR and the Normal Equations,

• the condition number of $A_k$ versus $k$.
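A plotting sketch with Matplotlib's `semilogy`; the arrays here are placeholder data standing in for the results of the experiment loop:

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data -- in the homework these come from the experiment loop
ks = np.arange(40, 66)
avg_err_qr = 1e-15 * np.exp(0.1 * (ks - 40))
avg_err_ne = avg_err_qr ** 0.5
conds = 1e2 * np.exp(0.2 * (ks - 40))

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Plot 1: average error vs. k for both methods, log-scaled y-axis
ax1.semilogy(ks, avg_err_qr, "o-", label="QR")
ax1.semilogy(ks, avg_err_ne, "s-", label="Normal Equations")
ax1.set_xlabel("k (number of columns)")
ax1.set_ylabel("average relative error")
ax1.legend()

# Plot 2: condition number of A_k vs. k
ax2.semilogy(ks, conds, "d-")
ax2.set_xlabel("k (number of columns)")
ax2.set_ylabel("cond(A_k)")

fig.tight_layout()
plt.show()
```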

Now tell me what’s going on. More specifically:

1. What is the relationship between the error using QR versus the Normal Equations?

2. What is the relationship between the errors and the condition number of $A_k$?

3. Suppose your matrix $A$ is ill-conditioned. Which method is more favorable?
