Description
CENG 391 – Introduction to Image Understanding Homework 1
Download and extract the contents of ceng391_02T_image_formation.tar.gz.
Exercise 1 Color Support for the ceng391::Image Class
Modify the contents of the Image::write_pnm function so that instead of
writing an error message for RGBA images, it saves the image contents in
the binary PPM format. Hint: You should only write the RGB values into
the file. The alpha values should not be saved. This means you will need
two loops instead of one.
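The two-loop idea in the hint can be sketched as follows. This is a Python illustration of the logic only, not the C++ member function the exercise asks for. The function name rgba_to_ppm_bytes and the assumption of a tightly packed RGBA buffer (no row padding) are hypothetical; the real ceng391::Image class may store its rows differently.

```python
def rgba_to_ppm_bytes(width, height, rgba):
    """Return binary PPM (P6) file contents for a tightly packed RGBA buffer."""
    if len(rgba) != width * height * 4:
        raise ValueError("buffer size does not match width * height * 4")
    # Binary PPM header: magic number, dimensions, maximum channel value.
    header = b"P6\n%d %d\n255\n" % (width, height)
    body = bytearray()
    for y in range(height):          # outer loop over rows
        row_start = y * width * 4
        for x in range(width):       # inner loop over the pixels of one row
            p = row_start + x * 4
            body += rgba[p:p + 3]    # copy R, G, B and skip the alpha byte
    return header + bytes(body)
```

Writing the returned bytes to a file opened in binary mode would produce the PPM file.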
Exercise 2 Loading PNM Images
Write a new member function Image::read_pnm that takes a std::string
argument named filename. The function should try to open the file with
the given name and read its contents if its contents are in the PGM or
PPM binary formats. When reading color images, remember to create a
four channel RGBA image. You should read the RGB values from the file
and initialize all alpha values to 255.
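A minimal Python sketch of the parsing logic is given below; the exercise itself asks for a C++ member function. The name parse_pnm and its return tuple are hypothetical, and the sketch assumes 8-bit images (maximum value 255) whose contents have already been read from the file in binary mode.

```python
def parse_pnm(data):
    """Parse binary PGM (P5) or PPM (P6) bytes.

    Returns (width, height, channels, pixels). Color images come back as
    RGBA with every alpha value set to 255.
    """
    pos = 0

    def next_token():
        nonlocal pos
        # Skip whitespace and '#' comment lines between header tokens.
        while pos < len(data):
            if data[pos:pos + 1].isspace():
                pos += 1
            elif data[pos:pos + 1] == b"#":
                while pos < len(data) and data[pos:pos + 1] != b"\n":
                    pos += 1
            else:
                break
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        return data[start:pos]

    magic = next_token()
    if magic not in (b"P5", b"P6"):
        raise ValueError("not a binary PGM or PPM file")
    width, height, maxval = (int(next_token()) for _ in range(3))
    if maxval != 255:
        raise ValueError("only 8-bit images are handled in this sketch")
    pos += 1  # exactly one whitespace byte separates header and pixel data
    src_channels = 1 if magic == b"P5" else 3
    raw = data[pos:pos + width * height * src_channels]
    if len(raw) != width * height * src_channels:
        raise ValueError("file is truncated")
    if magic == b"P5":
        return width, height, 1, bytes(raw)
    # Expand RGB to RGBA: copy each color plane and fill alpha with 255.
    rgba = bytearray(width * height * 4)
    rgba[0::4] = raw[0::3]
    rgba[1::4] = raw[1::3]
    rgba[2::4] = raw[2::3]
    rgba[3::4] = b"\xff" * (width * height)
    return width, height, 4, bytes(rgba)
```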
Exercise 3 Color Conversion
Write two new member functions, Image::to_grayscale and Image::to_rgba,
that convert the image contents from RGBA to grayscale and from grayscale
to RGBA, respectively.
If the image is already of the target format the functions should return
immediately.
To convert from grayscale to RGBA, just copy the gray value to all the
color and alpha channels. To convert from RGBA to grayscale, you can use
the following formula:
$$I_{\mathrm{Gray}}(x, y) = 0.3\, I_{\mathrm{Red}}(x, y) + 0.59\, I_{\mathrm{Green}}(x, y) + 0.11\, I_{\mathrm{Blue}}(x, y)$$
Before writing back the grayscale value, you should check for underflow and
overflow and you should discard the alpha values.
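The RGBA to grayscale direction can be sketched in Python as below (the exercise itself is in C++). The function name rgba_to_gray, the tightly packed buffer layout, and the round-half-up step are assumptions made for illustration; the text only fixes the weights and the clamping.

```python
def rgba_to_gray(rgba):
    """Convert a packed RGBA byte buffer to a packed grayscale buffer."""
    gray = bytearray(len(rgba) // 4)
    for i in range(len(gray)):
        # Read the three color channels; the alpha byte at 4*i + 3 is discarded.
        r, g, b = rgba[4 * i], rgba[4 * i + 1], rgba[4 * i + 2]
        value = int(0.3 * r + 0.59 * g + 0.11 * b + 0.5)  # assumed rounding
        gray[i] = max(0, min(255, value))  # guard against underflow/overflow
    return bytes(gray)
```

Since the three weights sum to one, the clamp should rarely trigger for 8-bit input, but it protects against floating point round-off at the upper end of the range.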
CENG 391 – Introduction to Image Understanding Homework 2
Exercise 1 2D Image Filtering
a. Write a new member function Image::filter_2d that takes an odd
integer n, and an n × n single-precision floating point matrix K. The
function should return a new image that is of size
(w() − n + 1) × (h() − n + 1)
and filtered by the kernel K.
Hint: If n is even you may take n as the largest odd number smaller than
n. Make sure to clamp the filter results to the range [0, 255]. The size of the
output is smaller so that you do not need to worry about the image borders.
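A Python sketch of the filtering loop over a grayscale image stored as a list of rows is shown below. Two points are assumptions, since the text does not settle them: the kernel is applied without flipping (correlation rather than convolution), and results are rounded half up before clamping. The fallback for even n mentioned in the hint is left out; the sketch simply rejects even sizes.

```python
def filter_2d(image, n, kernel):
    """Filter a grayscale image (list of rows) with an n x n kernel.

    Only positions where the kernel fits entirely inside the image are
    computed, so the output is (w - n + 1) x (h - n + 1).
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("this sketch expects an odd kernel size")
    h, w = len(image), len(image[0])
    out_h, out_w = h - n + 1, w - n + 1
    out = [[0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            acc = 0.0
            for j in range(n):
                for i in range(n):
                    acc += kernel[j][i] * image[y + j][x + i]
            # Round half up (an assumption) and clamp to [0, 255].
            out[y][x] = max(0, min(255, int(acc + 0.5)))
    return out
```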
Exercise 2 Image Derivatives
a. Write a new member function Image::deriv_x that computes
the image derivative in the x direction using a filter of the form
$$\begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}.$$
The results should be returned in a newly allocated array of type short
which can store negative values.
b. Write a new member function Image::deriv_y that computes
the image derivative in the y direction using a filter of the form
$$\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}.$$
The results should be returned in a newly allocated array of type short
which can store negative values.
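Both derivatives can share one loop, as in the Python sketch below (the exercise asks for C++ member functions). The helper name derivative, the same-size output, and the choice to leave the one-pixel border at zero are assumptions; the text does not say how borders should be handled. For 8-bit input the result lies in [-1020, 1020], so it fits comfortably in a short.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def derivative(image, kernel):
    """Apply a 3 x 3 derivative filter and keep the signed result."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]  # border pixels stay at zero
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = 0
            for j in range(3):
                for i in range(3):
                    acc += kernel[j][i] * image[y + j - 1][x + i - 1]
            out[y][x] = acc  # no clamping: negative values are meaningful
    return out
```

Calling derivative(image, SOBEL_X) corresponds to the x direction and derivative(image, SOBEL_Y) to the y direction.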
Exercise 3 Geometric Transforms
a. Write a new member function Image::warp_affine that takes a two-by-two transform matrix A, a two-by-one translation vector t,
and an Image pointer out. After the function call finishes, the image pointed to by out should contain the result of applying the affine
transform
$$x' = Ax + t = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} x + \begin{bmatrix} t_1 \\ t_2 \end{bmatrix}$$
with nearest neighbor sampling.
b. Add an option to perform bilinear sampling to the function Image::warp_affine.
Hint: You must not change the size of the image out. Assume that the
matrix and vector entries are stored in the double-precision floating point
format and that the matrix is stored in column-major order (its entries are
stored in memory in the order [a11, a21, a12, a22]).
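One possible reading of the exercise is sketched in Python below. It assumes that every output pixel x' is filled by sampling the input at x = A^{-1}(x' - t), which is a common way to implement a warp with sampling but is not stated in the text. The function signature, the list-of-rows image layout, and the choice to write 0 for samples that fall outside the input are also assumptions.

```python
import math


def warp_affine(src, a, t, out_w, out_h, bilinear=False):
    """Warp a grayscale image; a holds [a11, a21, a12, a22] (column major)."""
    a11, a21, a12, a22 = a
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("matrix A is not invertible")
    # Entries of the inverse of A.
    i11, i12 = a22 / det, -a12 / det
    i21, i22 = -a21 / det, a11 / det
    h, w = len(src), len(src[0])
    out = [[0] * out_w for _ in range(out_h)]
    for yo in range(out_h):
        for xo in range(out_w):
            dx, dy = xo - t[0], yo - t[1]
            xs = i11 * dx + i12 * dy  # source coordinates for this output pixel
            ys = i21 * dx + i22 * dy
            if bilinear:
                if xs < 0 or ys < 0 or xs > w - 1 or ys > h - 1:
                    continue
                x0, y0 = math.floor(xs), math.floor(ys)
                x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
                fx, fy = xs - x0, ys - y0
                top = (1 - fx) * src[y0][x0] + fx * src[y0][x1]
                bottom = (1 - fx) * src[y1][x0] + fx * src[y1][x1]
                out[yo][xo] = int((1 - fy) * top + fy * bottom + 0.5)
            else:
                xn, yn = math.floor(xs + 0.5), math.floor(ys + 0.5)
                if 0 <= xn < w and 0 <= yn < h:
                    out[yo][xo] = src[yn][xn]
    return out
```

The output size is passed in and never changed, which mirrors the hint that the size of out must stay fixed.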
CENG 391 – Introduction to Image Understanding Homework 3
Exercise 1 Gaussian Image Pyramid
Write a new class ImagePyr in a file named image_pyr.cc that represents
a Gaussian Pyramid with the following properties:
a. It should have the following private fields:
• an integer m_n_levels storing the number of pyramid levels (octaves).
• an Image pointer m_levels storing the pyramid levels in an array of
Images.
b. The constructor must take the desired number of levels, a pointer to
the base image to create the pyramid levels for and the initial sigma,
sigma0, for the base image.
c. The destructor must deallocate all memory allocated by the constructor.
d. A getter for the number of levels.
e. A getter for the i-th level image.
f. Level 0 is a copy of the base image given in the constructor (do not
just copy the pointer, create a new image that is a copy of the base
image).
g. Level i is created by first Gaussian smoothing the image at level i − 1
so that its scale (sigma value) is doubled, and then downsampling it
by two in the x and y dimensions so that its width and height are half
of the width and height of the image at level i − 1 (downsample by
throwing out every other row and column of the smoothed image).
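The level construction in items f and g can be sketched in Python as follows (the exercise asks for a C++ class). The sketch assumes that an image at scale sigma reaches scale 2 sigma after an additional Gaussian blur of sqrt((2 sigma)^2 - sigma^2) = sigma sqrt(3), because Gaussian variances add, and that after dropping every other row and column the scale measured in the new pixel grid is sigma0 again. All function names, the kernel radius of three sigma, and the replicated borders are hypothetical choices.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian weights with a radius of about three sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_1d(values, kernel):
    """Filter one row or column, replicating the border samples."""
    r = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - r, 0), n - 1)
            acc += weight * values[j]
        out.append(acc)
    return out


def smooth(image, sigma):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    rows = [smooth_1d(row, kernel) for row in image]
    cols = [smooth_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def build_pyramid(base, n_levels, sigma0):
    """Return a list of levels; level 0 is a deep copy of the base image."""
    levels = [[list(row) for row in base]]
    extra_sigma = sigma0 * math.sqrt(3.0)  # blur that doubles the scale
    for _ in range(1, n_levels):
        blurred = smooth(levels[-1], extra_sigma)
        # Keep every other row and column of the smoothed image.
        levels.append([row[::2] for row in blurred[::2]])
    return levels
```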
Exercise 2 Gaussian Image Pyramid Tester
Write the code for a new executable image-pyr-test in a file named
image_pyr_test.cc that reads an image from a file given as the first command line argument and creates an image pyramid with the number of levels
equal to the number given as the second command line argument. It should
then save each pyramid level in a file named “/tmp/pyr_level_i.png”,
where i is the level number.
The test program should display a usage message if an argument is missing or has an invalid value.
Modify CMakeLists.txt so that this test executable is compiled along
with the existing executables in the project.
CENG 391 – Introduction to Image Understanding Homework 4
Exercise 1 Image Processing and Feature Detection (40 points)
Please do the following exercises in a single Python script named
src/detect_and_match.py. You may use OpenCV for feature detection
and descriptor computation.
a. Detect SIFT interest points on the six images of the Golden Gate
Bridge that are in the folder data.
b. Draw the SIFT interest points on each image and store the resulting images in the same folder with names of the form sift_keypoints_i.png,
where i is the image number.
c. Calculate SIFT descriptor matches between consecutive pairs of images by brute force matching, for example between goldengate-00.png
and goldengate-01.png, between goldengate-01.png and
goldengate-02.png, and so on.
d. Draw these tentative correspondences on a match image and save the
resulting images in the same folder with names of the form
tentative_correspondences_i-j.png, where i and j are image numbers.
e. Save the SIFT interest points, descriptors, and tentative correspondences as text files in the same folder with names of the form sift_i.txt and
tentative_correspondences_i-j.txt.
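To make the brute force matching of item c concrete, the sketch below compares every descriptor of one image with every descriptor of the next and keeps the nearest one by Euclidean distance. It uses plain Python lists so that it runs without OpenCV; the function name brute_force_match and the (index_a, index_b, distance) tuple layout are hypothetical. In the actual script, OpenCV's BFMatcher with the L2 norm would typically play this role.

```python
def brute_force_match(desc_a, desc_b):
    """Match each descriptor in desc_a to its nearest neighbor in desc_b.

    Returns a list of (index_a, index_b, distance) tuples.
    """
    matches = []
    for i, da in enumerate(desc_a):
        best_j, best_sq = -1, float("inf")
        for j, db in enumerate(desc_b):
            # Squared Euclidean distance between the two descriptors.
            sq = sum((p - q) ** 2 for p, q in zip(da, db))
            if sq < best_sq:
                best_j, best_sq = j, sq
        if best_j >= 0:
            matches.append((i, best_j, best_sq ** 0.5))
    return matches
```

Each returned tuple is one tentative correspondence, which is the information items d and e ask to draw and save.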
Exercise 2 RANSAC (40 points)
Please do the following exercises in a single Python script named
src/ransac.py. You may use OpenCV for homography computation with
RANSAC.
a. Read the keypoints and tentative correspondences for each image pair
and match them by RANSAC.
b. You may use RANSAC from OpenCV; implementing RANSAC yourself
earns 10 bonus points.
c. Save the resulting homography matrices in files within the folder data
with names such as h_i-j.txt, where i and j are image numbers.
d. Do not forget about normalization and the final estimation over all
inliers. You may optionally perform guided matching.
e. Draw and save the resulting final inlier correspondences in files in the
data folder with names of the form inliers_i-j.png and inliers_i-j.txt.
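Assuming the normalization in item d refers to the usual point normalization applied before estimating a homography (move the centroid to the origin and scale so that the mean distance from it is sqrt(2)), a minimal Python sketch looks like this. The function name normalize_points and the nested-list matrix layout are hypothetical. If H_hat is estimated from points normalized with T_a and T_b, the homography in pixel coordinates is H = T_b^{-1} H_hat T_a.

```python
import math


def normalize_points(points):
    """Return (normalized_points, T) for a list of (x, y) pairs.

    T is the 3 x 3 similarity transform, as nested lists, that maps the
    original points to the normalized ones in homogeneous coordinates.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mean_dist = sum(math.hypot(x - cx, y - cy) for x, y in points) / n
    if mean_dist == 0:
        raise ValueError("all points coincide; cannot normalize")
    s = math.sqrt(2.0) / mean_dist
    transform = [[s, 0.0, -s * cx],
                 [0.0, s, -s * cy],
                 [0.0, 0.0, 1.0]]
    normalized = [(s * (x - cx), s * (y - cy)) for x, y in points]
    return normalized, transform
```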
Exercise 3 Basic Stitching (20 points)
Please do the following exercises in a single Python script named
src/stitch.py. You may use the OpenCV function warpPerspective for image warping.
a. Stitch all the images by calculating a homography matrix from each
image to one of the center images goldengate-02.png or
goldengate-03.png and warping the images to this coordinate system.
b. Save the resulting image in the folder data as panorama.png.
c. To blend multiple images just overwrite or average intensities of overlapping pixels.
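The averaging option in item c can be sketched as below. The sketch assumes that every image has already been warped into the panorama coordinate system and comes with a mask marking which pixels it covers; the function name blend_average, the single-channel list-of-rows layout, and the rounding are hypothetical choices.

```python
def blend_average(layers):
    """Average overlapping pixels of warped images.

    layers is a list of (image, mask) pairs of identical size; a truthy
    mask entry marks a pixel that the image actually covers.
    """
    h, w = len(layers[0][0]), len(layers[0][0][0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = count = 0
            for image, mask in layers:
                if mask[y][x]:
                    total += image[y][x]
                    count += 1
            if count:
                # Integer average rounded to the nearest value.
                out[y][x] = (total + count // 2) // count
    return out
```

Pixels covered by a single image keep that image's value, and pixels covered by none stay at zero.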