Project 5 – Image Deblurring and Conjugate Gradient Solver
The purpose of this assignment is to gain insight into the theoretical and numerical properties of the Conjugate Gradient method. Here we use this method in an image processing application with the goal of deblurring an image, given
the exact (noise-free) blurred image and the original transformation matrix. Note that the "noise-free" simplification
is essential for solving this problem within the scope of this assignment. For the interested reader, searching for image
deblurring or inverse problem image processing may be enlightening.
1. Simple Image Deblurring – Problem Definition
Suppose we have a blurred image B which we want to refine (or deblur), and we know the transformation that led
to this blurred image. How can we efficiently reconstruct the original image? Let X ∈ R^{n×n} be the original
square, grayscale image matrix, where each matrix entry corresponds to one pixel value. Hence X corresponds to an
image with n pixels per side. We can convert the original image matrix X into a column vector x ∈ R^{n²}, where the
entries of X get stacked into a vector, either row- or column-wise. This process is referred to as vectorization. If we
now define b as the vectorized form of the blurred image B, we can write the following system of equations:
Ax = b. (1)
Here, A ∈ R^{n²×n²} denotes the transformation matrix arising from the repeated application of what is referred
to as the "image kernel", which in our case produces a blurring effect. An excellent visual explanation of the kernel
operation can be found at [4]. Our objective will be to solve the linear system in Eq. (1) for the original vectorized
image x, where the transformation matrix A and the blurred image B (and hence b) are known. By solving the system,
we can recover the original image, getting rid of the blurring effect applied on top of it.
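The vectorization step can be sketched as follows. This is a NumPy illustration only (the assignment itself uses Julia), with a small hypothetical 3×3 image:

```python
import numpy as np

# Hypothetical 3x3 grayscale image (pixel intensities); illustration only.
X = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])

x = X.reshape(-1)          # row-wise vectorization: x in R^(n^2)
X_back = x.reshape(3, 3)   # un-vectorize to recover the image matrix
```

Whichever stacking order is chosen, the same order must be used when constructing and interpreting A.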
1. Visualization and Vectorization
The image in consideration is black and white, and the corresponding image matrix can be visualized with the command plot (see the Plots.jl documentation). We can vectorize the blurred image matrix with the command vec(B). The
vectorization ordering can be arbitrary (by column, by row, or whatever fits best with the data access pattern); however, it
must be consistent with A. In our case we require row-by-row vectorization. A blurred image pixel is the weighted average
of the surrounding pixels. These weights are defined by the kernel matrix K ∈ R^{7×7} (see Eq. (2)). The much bigger transformation matrix A is constructed from the kernel matrix, such that the non-zero elements of each row of A correspond to
the values of K. The detailed construction of the transformation matrix is omitted for brevity; see [4]. The following
attributes should be noted:
Borders: The transformation matrix ignores the elements outside of the borders of the image matrix (max(i, j) > n).
Numerical Computing, Fall Semester 2022
Lecturer: Prof. O. Schenk
Assistants: E. Vecchi, L. Gaedke-Merzhäuser
Banded Matrix: A is a d-banded matrix, where d ≪ n². This means that A is sparse, and thus one should use
sparse data structures. Recall that A is of dimension n² × n²; defining a dense matrix of that size will not fit on your
system. For example, if n = 250 we would need about 31 GB (i.e., 8 · n² · n² bytes) of memory.
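The memory argument can be checked with a short sketch. The half-bandwidth below is a placeholder, not the bandwidth of the provided A:

```python
import numpy as np
import scipy.sparse as sp

n = 250
N = n * n                  # A is N x N with N = n^2
dense_bytes = 8 * N * N    # float64 dense storage: 8 * n^2 * n^2 bytes
print(dense_bytes / 1e9)   # prints 31.25 -> a dense A will not fit in RAM

# A banded matrix stores only O(d * N) nonzeros. Illustrative 7-banded matrix:
k = 3                      # placeholder half-bandwidth, not the real one
offsets = list(range(-k, k + 1))
bands = [np.ones(N - abs(o)) for o in offsets]
A_banded = sp.diags(bands, offsets, format="csr")
sparse_bytes = (A_banded.data.nbytes + A_banded.indices.nbytes
                + A_banded.indptr.nbytes)
```

The sparse representation needs a few megabytes instead of tens of gigabytes, which is why the assignment insists on sparse data structures.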
For this assignment you will use the provided transformation matrix A.mat. It is generated from the normalized blur
image kernel

        ⎡ 1/100  1/9  1/9  1/9  1/9  1/9  1/100 ⎤
        ⎢ 1/9     1    1    1    1    1    1/9  ⎥
        ⎢ 1/9     1    1    1    1    1    1/9  ⎥
    K = ⎢ 1/9     1    1    1    1    1    1/9  ⎥ ,   (2)
        ⎢ 1/9     1    1    1    1    1    1/9  ⎥
        ⎢ 1/9     1    1    1    1    1    1/9  ⎥
        ⎣ 1/100  1/9  1/9  1/9  1/9  1/9  1/100 ⎦

and operates on a "row-vectorized" image matrix.
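As an illustration of how A is built from a kernel, the following sketch assembles a sparse A for a tiny image, ignoring out-of-border pixels as described above. The 3×3 kernel and the image size are hypothetical stand-ins for the actual 7×7 kernel and the provided A.mat:

```python
import numpy as np
import scipy.sparse as sp

def blur_matrix(K, n):
    """Build the n^2 x n^2 matrix A from an odd-sized kernel K.

    Row p = i*n + j (row-wise vectorization) holds the kernel weights
    centred on pixel (i, j); weights falling outside the image borders
    are simply dropped."""
    k = K.shape[0] // 2
    rows, cols, vals = [], [], []
    for i in range(n):
        for j in range(n):
            for di in range(-k, k + 1):
                for dj in range(-k, k + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < n and 0 <= jj < n:   # ignore out-of-border
                        rows.append(i * n + j)
                        cols.append(ii * n + jj)
                        vals.append(K[di + k, dj + k])
    return sp.csr_matrix((vals, (rows, cols)), shape=(n * n, n * n))

# Hypothetical 3x3 kernel with the same corner/edge/centre pattern as Eq. (2).
K = np.array([[1/100, 1/9, 1/100],
              [1/9,   1.0, 1/9  ],
              [1/100, 1/9, 1/100]])
A = blur_matrix(K / K.sum(), n=8)
```

Because the kernel is symmetric, the resulting A is symmetric as well.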
2. Conjugate Gradient
The conjugate gradient method (CG) [1, 3] is widely used for solving systems of linear equations Ax = b, where
A is both symmetric and positive-definite. The conjugate gradient method can theoretically be viewed as a direct
method, as it produces the exact solution after a finite number of iterations, which is not larger than the size of
the matrix, in the absence of round-off error. However, the conjugate gradient method is unstable with respect to
even small perturbations: in practice most directions are not exactly conjugate, and the exact solution is never obtained.
Fortunately, the conjugate gradient method can be used as an iterative method, as it provides monotonically improving
approximations x_k to the exact solution, which may reach the required tolerance after a relatively small (compared
to the problem size) number of iterations. The improvement is typically linear, and its speed is determined by the
condition number κ(A) of the system matrix A: the larger κ(A) is, the slower the improvement. The CG algorithm
starts from an initial guess x_0 and applies a series of matrix-vector and vector-vector operations until the desired
tolerance is reached:
Algorithm 1 Conjugate Gradient Algorithm
1: r ← b − Ax_0
2: d ← r
3: ρ_old ← ⟨r, r⟩
4: for i = 0, 1, . . . do
5:     s ← Ad
6:     α ← ρ_old / ⟨d, s⟩
7:     x ← x + αd
8:     r ← r − αs
9:     ρ_new ← ⟨r, r⟩
10:    β ← ρ_new / ρ_old
11:    d ← r + βd
12:    ρ_old ← ρ_new
13: end for
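Algorithm 1 translates almost line by line into code. The sketch below is in Python/NumPy purely for illustration (the assignment asks for a Julia implementation of myCG), with a tiny hypothetical SPD system as a sanity check:

```python
import numpy as np

def my_cg(A, b, x0, max_itr, tol):
    """Conjugate gradient following Algorithm 1.

    Returns the final iterate and the residual norms per iteration,
    mirroring the myCG interface requested in the assignment."""
    x = np.array(x0, dtype=float)
    r = b - A @ x
    d = r.copy()
    rho_old = r @ r
    rvec = [np.sqrt(rho_old)]
    for _ in range(max_itr):
        s = A @ d                        # the only matrix-vector product
        alpha = rho_old / (d @ s)
        x = x + alpha * d
        r = r - alpha * s
        rho_new = r @ r
        rvec.append(np.sqrt(rho_new))
        if rvec[-1] <= tol:
            break
        d = r + (rho_new / rho_old) * d
        rho_old = rho_new
    return x, np.array(rvec)

# Tiny SPD system (hypothetical data, illustration only).
A = np.array([[4., 1.], [1., 3.]])
b = np.array([1., 2.])
x, rvec = my_cg(A, b, np.zeros(2), max_itr=50, tol=1e-12)
```

In exact arithmetic CG solves this 2×2 system in at most two iterations; the returned rvec makes the requested residual-versus-iteration plot straightforward.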
1. Symmetry and Positivity
In our case A is symmetric and of full rank, but not positive-definite. We can bypass this issue by solving the augmented
system in Eq. (3): pre-multiplying by Aᵀ results in the positive-definite augmented transformation matrix Ã:

AᵀAx = Aᵀb, (3)
Ãx = b̃,

where Ã = AᵀA and b̃ = Aᵀb.
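One practical note: Ã = AᵀA never needs to be formed explicitly, since CG only requires matrix-vector products v ↦ Aᵀ(Av). A SciPy sketch of this idea, with a small hypothetical symmetric indefinite A:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

# Hypothetical symmetric, full-rank, *indefinite* A (one negative eigenvalue).
A = np.array([[2., 0., 0.],
              [0., -1., 0.],
              [0., 0., 3.]])
b = np.array([2., 1., 3.])

# Matrix-free augmented operator: v -> A^T (A v); A^T A is never formed.
A_tilde = LinearOperator((3, 3), matvec=lambda v: A.T @ (A @ v))
b_tilde = A.T @ b

x, info = cg(A_tilde, b_tilde, atol=1e-12)   # info == 0 signals convergence
```

For the large deblurring matrix, avoiding the explicit product AᵀA also avoids the extra fill-in it would create.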

2. Condition Number
An important attribute of a linear system Ax = b is its condition number κ(A). It measures the sensitivity of
the solution x to changes in b. If small changes in b result in large changes in x, the system is called ill-conditioned
and the condition number of the system is large. The convergence rate of CG is hindered when the system has a high
condition number. Notice that, in exact arithmetic, the number of iterations needed to find the exact solution is not related to κ(A), but
is rather bounded by the number of distinct eigenvalues. The condition number of a square matrix A is

κ(A) = σ_max / σ_min,

with σ indicating the singular values of A. For a real symmetric matrix, the singular values are equal to the absolute
values of the eigenvalues: σ = |λ|.
In our case, matrix A is symmetric and real, which implies that the condition number is given by the ratio of the
magnitudes of the maximum and minimum eigenvalues. Furthermore, the transformation matrix is ill-conditioned and,
to make matters worse, we will be solving the augmented system in Eq. (3), whose condition number is κ(Ã) = κ(A)².
These factors will unfortunately deteriorate the convergence rate of our deblurring program, but this problem can be
partially mitigated through preconditioning, as explained in the next section.
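These relations are easy to verify numerically. A small sketch with a random symmetric test matrix (not the actual A of the assignment):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = (M + M.T) / 2                     # random real symmetric test matrix

# kappa(A) = |lambda|_max / |lambda|_min for a real symmetric matrix
lam = np.linalg.eigvalsh(A)
kappa = np.abs(lam).max() / np.abs(lam).min()

# Augmented matrix A~ = A^T A: its condition number is kappa(A)^2
lam_t = np.linalg.eigvalsh(A.T @ A)
kappa_t = lam_t.max() / lam_t.min()
```

Squaring the condition number is the price paid for making the system positive-definite via the normal equations.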
3. Preconditioning
A symmetric positive-definite preconditioner P is selected such that P⁻¹Ã ≈ I, that is, P⁻¹ approximates Ã⁻¹. We
can decompose the preconditioner as P = LLᵀ, where L is the Cholesky factor (see [2]). Our intention is to
solve the preconditioned augmented system

P⁻¹Ãx = P⁻¹b̃. (5)

With P = LLᵀ this can be written in the symmetric form

(L⁻¹ÃL⁻ᵀ)(Lᵀx) = L⁻¹b̃, i.e. Āx̄ = b̄, (6)

with Ā = L⁻¹ÃL⁻ᵀ, x̄ = Lᵀx and b̄ = L⁻¹b̃.
This operation is done both to decrease the condition number, κ(Ā) < κ(Ã), and to decrease the range of the eigenvalues. The underlying caveat here is that the preconditioner should be computationally inexpensive to find.
To meet the desired properties of the preconditioner, a common approach is to use the incomplete Cholesky (IC) factorization (see [1]). With IC, we only compute the Cholesky factorization on the non-zero elements of Ã. This cheap factorization of Ã gives us the preconditioner P = FᵀF, where F is the sparse IC factor. This routine can fail, since the
existence of F is not guaranteed (note that F is forced to be sparse). To amend this issue, a heuristic approach is used
that applies a diagonal shift to P, enforcing positive-definiteness and making F computable. The detailed explanation of
this routine is outside the scope of this assignment. We can use the Julia function CholeskyPreconditioner()
for the IC factorization (please refer to the Julia documentation for more information, and also note the
memory option).
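SciPy has no incomplete Cholesky routine, but the same idea can be illustrated with its incomplete LU factorization (spilu) used as a preconditioner for cg. The banded test system below is a hypothetical stand-in for Ã, not the assignment's matrix:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spilu, LinearOperator, cg

# Hypothetical ill-conditioned SPD tridiagonal system standing in for A~.
n = 200
main = 2.0 + np.linspace(0.0, 5.0, n)
off = -np.ones(n - 1)
A = sp.diags([off, main, off], [-1, 0, 1], format="csc")
b = np.ones(n)

# Incomplete LU in place of incomplete Cholesky; its drop tolerance plays a
# role similar to the shift/memory options mentioned in the text.
ilu = spilu(A, drop_tol=1e-4)
M = LinearOperator((n, n), matvec=ilu.solve)

iters = {"plain": 0, "prec": 0}
def make_counter(key):
    def cb(_xk):          # called once per CG iteration
        iters[key] += 1
    return cb

x_plain, info_plain = cg(A, b, atol=1e-10, callback=make_counter("plain"))
x_prec, info_prec = cg(A, b, atol=1e-10, M=M, callback=make_counter("prec"))
```

For a tridiagonal matrix the incomplete factorization is essentially exact, so the preconditioned solve converges in a handful of iterations while plain CG needs many more; that gap is what preconditioning buys on the ill-conditioned deblurring system.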
3. The Assignment
Read this document and Section 7.4 of the textbook [1].
1. General Questions [10 points]
1. What is the size of the matrix A?

2. How many diagonal bands does A have?
3. What is the length of the vectorized blur image b?
2. Properties of A [10 points]
1. If A is not symmetric, how would this affect Ã?
2. Explain why solving Ax = b for x is equivalent to minimizing φ(x) = ½ xᵀAx − bᵀx over x, assuming that A is
symmetric positive-definite.
3. Conjugate Gradient [30 points]
1. Write a function x, rvec = myCG(A, b, x0, max_itr, tol) for the conjugate gradient solver, where x
and rvec are, respectively, the solution and a vector containing the norm of the residual at every iteration.
Hint: You can use Julia's cg() function for comparison.
2. In order to validate your implementation, solve the system defined by A_test.mat and b_test.mat. Plot
the convergence (residual vs. iteration).
3. Plot the eigenvalues of A_test.mat and comment on the condition number and convergence rate.
4. Does the residual decrease monotonically? Why or why not?
4. Deblurring problem [35 points]
1. Solve the deblurring problem for the blurred image matrix B.mat and transformation matrix A.mat using your
routine myCG and Julia's preconditioned conjugate gradient cg with the Pl option (from the IterativeSolvers
package). As a preconditioner, use CholeskyPreconditioner() from the Preconditioners package to
get the incomplete Cholesky factors. In order to ensure the existence of the IC factor, shift the diagonal by 0.01
and set the memory fill equal to 1. Solve the system with both solvers using max_iter = 200 and abstol = 10⁻⁴.
Plot the convergence (residual vs. iteration) of each solver and display the original and final deblurred image.
Carefully explain and comment on the results that you observe.
Hint: You can use ?function() to retrieve some information on function().
2. When would the preconditioned CG be worth the added computational cost? What about if you are deblurring
lots of images with the same blur operator?
Quality of the code and of the report [15 Points]
The highest possible score of each project is 85 points and up to 15 additional points can be awarded based on the
quality of your report and code (maximum possible grade: 100 points). Your report should be a coherent document,
structured according to the template provided on iCorsi. If there are theoretical questions, provide a complete and
detailed answer. All figures must have a caption and must be of sufficient quality (include them either as .eps or .pdf).
If you made a particular choice in your implementation that might be out of the ordinary, clearly explain it in the
report. The code you submit must be self-contained and executable, and must include the set-up for all the results that
you obtained and listed in your report. It has to be readable and, if particularly complicated, well-commented.

Additional notes and submission details
Summarize your results and experiments for all exercises by writing an extended LaTeX report, using the template provided on iCorsi. Submit your gzipped
archive file (tar, zip, etc.), named project_number_lastname, on iCorsi (see [NC
2022] Project 5 – Conjugate Gradient) before the deadline. Submission by email or through any other channel will
not be considered. Late submissions will not be graded and will result in a score of 0 points for that project. You
are allowed to discuss all questions with anyone you like, but: (i) your submission must list anyone you discussed
problems with, and (ii) you must write up your submission independently. Please remember that plagiarism will result
in a harsh penalization for all involved parties.
In-class assistance
If you experience difficulties in solving the problems above, please join us in class either on Tuesdays 16:30-18:15 or
on Wednesdays 14:30-16:15 in room C1.03.
References
[1] C. Greif, U. Ascher, A First Course on Numerical Methods, SIAM. Chapter 7: Linear Systems: Iterative Methods.
[2] C. Greif, U. Ascher, A First Course on Numerical Methods, SIAM. Chapter 5, Section 5.5: The Cholesky Decomposition.
[3] J. R. Shewchuk, An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, http://www.cs.cmu.
[4] Image Kernels,