Description
In this assignment, you will implement a Matlab function to decompose a
matrix A into the product of two matrices A = QR, where Q has orthonormal
columns and R is upper triangular. You will then use your decomposition to
solve a shape fitting problem.
You will need the files back_sub.m for Part 4 and generate_data.m and
visualize.m for Part 5. These can be found in the zip file provided on the
Moodle assignment page.
1 Submission Guidelines
You will submit a zip file that contains the following .m files on Moodle:
• ortho_decomp.m
• my_qr.m
• least_squares.m
• my_pack.m
• my_unpack.m
• design_matrix.m
• affine_fit.m
as well as any helper functions that are needed by your implementation.
As with previous assignments, please note the following rules:
1. Each file must be named as specified above.
2. Each function must return the specified value(s).
3. You may use Matlab’s array manipulation syntax, e.g. size(A), A(i,:),
and [A,B], and basic operations like addition, multiplication, and transpose
of matrices and vectors. High-level linear algebra functions such as inv,
qr, and A\b are not allowed except where specified. Please contact the
instructor with further questions.
1
4. This is an individual assignment, and no collaboration is allowed. Any
in-person or online discussion should stop before you start discussing or
designing a solution.
2 Orthogonal decomposition
Suppose you have an orthonormal basis {u1, . . . , un} for a subspace H ⊆ R
m.
Any vector v ∈ R
m can be decomposed into the sum of two orthogonal vectors,
vˆ ∈ H and vˆ
⊥ ∈ H⊥. Furthermore, since vˆ ∈ H, it can be expressed as
vˆ = c1u1 + · · · + cnun for some weights c1, . . . , cn. Implement a function to
perform this decomposition.
Specification:
function [c, v_perp] = ortho_decomp(U, v)
Input: an m × n matrix U =
u1 · · · un
with orthonormal columns, and a
vector v.
Output:
c: a vector c =
c1
.
.
.
cn
such that v = c1u1 + · · · + cnun + vˆ
⊥
v_perp: the residual vector v
⊥ such that ui
· vˆ
⊥ = 0 for all i.
Implementation notes:
Here vˆ is simply the orthogonal projection of v onto the subspace H. Since we
have an orthogonal basis of H, this should be easy to compute. In fact, since
the basis is orthonormal, it is possible to compute the projection in just one line
of code without any for loops. Hint: What are the entries of UT v?
Test cases:
1. U =
1 0
0 0
0 1
0 0
, v =
2
3
4
5
→ c =
2
4
, vˆ
⊥ =
0
3
0
5
2. U =
0.6
0.8
0
, v =
1
1
1
→ c =
1.4
, vˆ
⊥ =
0.16
−0.12
1
2
3. Generate a random orthonormal set of 5 vectors in R
10 using the command
[U,~] = qr(randn(10,5),0), and generate another random integer
vector v = randi([-2,2], 10,1). Compute your orthogonal decomposition,
[c, v_perp] = ortho_decomp(U, v). Verify that U*c + v_perp is
nearly the same as v, and U’*v_perp is nearly zero (both to within 10−12).
3 QR decomposition
Suppose you have an m × n matrix A which is “tall and thin”, i.e. with m n,
and the columns of A are linearly independent. The Gram-Schmidt process
corresponds to a factorization A = QR, where Q is an m × n matrix with
orthonormal columns, and R is an n × n upper triangular matrix.1
Specification:
function [Q, R] = my_qr(A)
Input: an m × n matrix A with m n.
Output: an m × n matrix Q with orthonormal columns, and an n × n upper
triangular matrix R, such that A = QR.
Implementation notes:
I recommend starting your implementation by first having your function compute
Q correctly. After that, you can modify your code to also compute R.
The columns of Q are essentially obtained by performing the Gram-Schmidt
process with normalization on the columns of A:
q1 = a1/ka1k,
aˆ
⊥
2 = a2 − projH1
a2 where H1 = span{q1},
q2 = aˆ
⊥
2 /kaˆ
⊥
2 k,
aˆ
⊥
3 = a3 − projH2
a3 where H2 = span{q1, q2},
q3 = aˆ
⊥
3 /kaˆ
⊥
3 k,
.
.
.
You can carry this out using your orthogonal decomposition function. For each
column i = 1, . . . , n, call ortho_decomp with an appropriate set of inputs, and
1Technically, what we are computing here is known as the “thin” or “reduced” QR decomposition.
In the full QR decomposition, Q is square and orthogonal, and R is an m × n upper
triangular matrix whose lower (m − n) rows are all zero. In practice, it is also not computed
with the Gram-Schmidt process, but with other methods that are more numerically stable.
3
use the resulting vˆ
⊥ to fill in the ith column of Q. In the end, the columns of
Q should be an orthonormal set of vectors {q1, . . . , qn} which span Col A.
Once you can compute Q correctly, modify your function to compute R as well.
In principle, since QT Q = I, you could obtain R simply via QT A = QT QR = R.
However, this is more work than necessary, because you can actually fill in the
entries of R as you compute each column of Q. When you compute the ith
column of Q, the ortho_decomp call gives you both c and vˆ
⊥; can you use these
to fill in the i nonzero entries in the ith column of R? Hint: Consider the case
when A is already an upper triangular matrix, say A =
1 2 3
0 4 5
0 0 6
, and work
out by hand what c and vˆ
⊥ will be for each column.
Test cases:
1. A =
0 3 0
0 0 −4
−2 0 0
→ Q =
0 1 0
0 0 −1
−1 0 0
, R =
2 0 0
0 3 0
0 0 4
2. A =
1 1 0
1 0 0
1 1 1
1 0 1
→ Q =
1/2 1/2 −1/2
1/2 −1/2 −1/2
1/2 1/2 1/2
1/2 −1/2 1/2
, R =
2 1 1
0 1 0
0 0 1
3. Generate a random 10 × 5 integer matrix A = randi([-2,2], 10,5)
and compute your QR decomposition, [Q, R] = my_qr(A). Verify that
istriu(R) is true, Q’*Q is nearly the identity matrix, and Q*R is nearly
the same as A (both to within 10−12).
4
4 Least-squares problems
Use the QR decomposition to solve the least-squares problem Ax ≈ b.
Specification:
function x = least_squares(A, b)
Input: an m × n matrix A and a vector b ∈ R
m.
Output: a vector x ∈ R
n which is the least-squares solution of the overdetermined
system Ax ≈ b, i.e. such that Ax − b ∈ (Col A)
⊥.
Implementation notes:
Do not solve the normal equations AT Ax = AT b directly, as this is a very
numerically unstable approach. Instead, compute the QR factorization of A and
use it to solve the least-squares problem with only a matrix-vector multiplication
and a back-substitution, as described in the textbook. Use the back_sub function
provided with this assignment to perform the back-substitution.
If you cannot get your my_qr function to work, you may use Matlab’s built-in qr
function here so you can still attempt Part 5. Call qr(A,0) to get a rectangular
Q and square R as desired.
Test cases:
1. A =
1
1
0
, b =
2
3
4
→ x =
2.5
2. A =
1 0 0
1 1 1
1 2 4
1 3 9
1 4 16
, b =
0
3
4
3
0
→ x =
0
4
−1
3. Generate a random 10 × 5 integer matrix A = randi([-2,2], 10,5)
and a random integer vector b = randi([-2,2], 10,1). Compute your
least-squares solution, x = least_squares(A, b), and obtain the residual
r = b – A*x. Verify that A’*r is nearly zero (to within 10−12).
5
5 Best-fitting transformations
An affine transformation is a combination of a linear transformation and a
translation (i.e. displacement) while retaining parallelism, i.e., parallel lines
remain parallel after the transformation. For example, a point
x
y
on A (orange
square) in Figure 1 can be transformed to
x
y
on B (red parallelogram) via a
linear transform M =
m11 m12
m21 m22
, i.e.,
x
y
=
m11 m12
m21 m22 x
y
.
Note that the parallel lines stay parallel. This transform can be combined with
a translation, moving the red parallelogram to blue parallelogram (C). This
composite transformation can be written as:
xe
ye
=
x
y
+
tx
ty
=
m11 m12
m21 m22 x
y
+
tx
ty
= M
x
y
+ t, (1)
where t =
tx
ty
is the translation vector.
11 12
2 21 22 1
2 m m 1
m
x
y m y
x
=
11 12
21 22
1
1
x
y
m m t
m y t
x
y m
x
= +
A
B
C
Figure 1: Affine transform.
In sum, Equation (1) can be written as
xe
ye
=
m11x + m12y + tx
m22x + m22y + ty
. (2)
6
5.1 Linear Equation from a Single Correspondence
Your task is given many correspondences2
(xi
, yi) ↔ (xei
, yei) where i is the
index for the correspondence, to compute the best affine transform parameters,
m11, m12, m21, m22, tx, ty (unknowns): We can rewrite Equation (2) by arranging
it with respect to the unknowns for, say, the first correspondence:
xe1
ye1
| {z }
pe1
=
x1 y1 0 0 1 0
0 0 x1 y1 0 1
| {z }
A1
m11
m12
m21
m22
tx
ty
| {z }
β
. (3)
or simply,
pe1 = A1β. (4)
Note that A1 depends on the original position of the point, p1 =
x1
y1
. The
correspondence produces 2 linear equations while the number of unknowns is
6, so at least 3 correspondences are needed to uniquely determine the affine
transform parameters.
Implement a function beta = my_pack(M, t) to obtain β from M =
m11 m12
m21 m22
and t =
tx
ty
, and its inverse function [M, t] = my_unpack(beta).
function beta = my_pack(M, t)
Input: a 2 × 2 matrix M and a vector t.
Output: a vector β ∈ R
6
containing the entries of M and t.
function [M, t] = my_unpack(beta)
Input: a vector β ∈ R
6
.
Output: a 2 × 2 matrix M and a vector t ∈ R
2
such that my_pack(M, t) = β.
Finally, construct the matrix Ai given pi
.
function Ai = design_matrix(pi)
Input: a vector pi =
xi
yi
∈ R
2
.
Output: the matrix Ai given by Equation (3).
2A point (x, y) in A is transformed to a point (x, e ye) in C. This pair of points forms a
correspondence.
7
5.2 Solving Linear System from n Correspondences
Equation (3) can be extended to include n correspondences as follow:
xe1
ye1
.
.
.
xen
yen
=
x1 y1 0 0 1 0
0 0 x1 y1 0 1
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
xn yn 0 0 1 0
0 0 xn yn 0 1
m11
m12
m21
m22
tx
ty
, (5)
or again simply,
pe1
.
.
.
pen
=
A1
.
.
.
An
β, (6)
If all correspondences are noise-free, Equation (6) will be always satisfied. Due
to correspondence noise in practice, it cannot be satisfied, and therefore,
pe1
.
.
.
pen
≈
A1
.
.
.
An
β, (7)
Now we will compute the best affine transform parameters using Equation (7).
function [M, t] = affine_fit(P, P_tilde)
Input: 2×k correspondence matrices P =
p1 · · · pk
and Pe =
pe1 · · · pek
containing the reference points and transformed points as columns.
Output: a 2 × 2 matrix M and a vector t ∈ R
2
such that the affine transformation
Mx + t best matches the given data. To compute this, you will have
to set up and solve the least-squares problem in Equation (7) to obtain β, then
unpack it to get M and t out.
Test cases:
For the first two tests, pick an arbitrary M and t, say M =
1 2
3 4
and t =
5
6
.
1. Check that [M_new, t_new] = my_unpack(my_pack(M, t)) recovers the
original values of M and t.
2. Verify that when x =
0
0
, design_matrix(x)*my_pack(M,t) gives back
t. Similarly, x =
1
0
and x =
0
1
should give you m1 + t and m2 + t
respectively, where m1 and m2 are the columns of M.
8
3. Applying affine_fit to P =
0 1 1 0
0 0 1 1
, Pe =
0.4 1 1.8 1.2
0.8 0 0.6 1.4
should give M =
0.6 0.8
−0.8 0.6
and t =
0.4
0.8
: a rotation and a translation.
p1
pe1
p2 = pe2
p3
pe3
p4
pe4
4. We have provided a function [P, P_tilde, M, t] = generate_data()
that produces some random test data. It does so by filling random values
in M and t, choosing random points pi
, then setting each pei to Mpi + t
plus a small amount of random noise. You can visualize the data by calling
the provided function visualize(P, P_tilde).
Call affine_fit(P, P_tilde) to obtain your own estimates of M and t.
The result should be close to the “ground truth” transformation returned
by generate_data, although due to the added noise it will not be exactly
identical. Call visualize(P, P_tilde, M, t) to see what your fit looks
like.
Note: If you want to test on a smaller data set, just use the first few columns
of P and Pe, for example P = P(:, 1:10) and P_tilde = P_tilde(:, 1:10).
9