CSci 4270 and 6270 Computational Vision Homework 3

Programming Problems
1. (50 points) Ordinarily, image resize functions, like the one in OpenCV, treat each pixel
equally — everything gets reduced or increased by the same amount. In 2007, Avidan and
Shamir published a paper in SIGGRAPH called “Seam Carving for Content-Aware Image Resizing” that resizes an image along contours, called “seams”, where there is not a lot of image content. The technique they described was the starting point for what is now
a standard feature in image manipulation software such as Adobe Photoshop.
Here is an example of an image with a vertical seam drawn on it in red.
A vertical seam in an image contains one pixel per row, and the pixels on the seam are 8-
connected between rows, meaning that pixel locations in adjacent rows in a seam differ by at
most one column. Formally a vertical seam in an image with M rows and N columns can be
described as a set of pixels
s_r = {(i, c(i))}_{i=0}^{M−1}   s.t.   ∀i, |c(i) − c(i − 1)| ≤ 1.     (1)
In reading this, think of i as the row and c(i) as the chosen column in that row. Similarly, a
horizontal seam in an image contains one row per column and is defined as a set of pixels
s_c = {(r(j), j)}_{j=0}^{N−1}   s.t.   ∀j, |r(j) − r(j − 1)| ≤ 1.     (2)
Here, think of j as the column and r(j) as the row for that column.
Once a seam is selected — suppose for now that it is a vertical seam — the pixels on the
seam are removed from the image, and the pixels that are to the right of the seam are shifted
to the left by one. This will create a new image that has M rows and N − 1 columns. (There
are also ways to use this to add pixels to images, but we will not consider this here!) Here is
an example after enough vertical seams have been removed to make the image square.
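For concreteness, here is a minimal sketch of the removal step, assuming the seam is already given as a length-M integer array c in which c[i] is the seam column of row i; the function name is purely illustrative.

import numpy as np

def remove_vertical_seam(img, c):
    # img: M x N x 3 color image; c: length-M integer array, c[i] = seam column in row i.
    M, N = img.shape[:2]
    keep = np.ones((M, N), dtype=bool)
    keep[np.arange(M), c] = False          # mark the seam pixels for removal
    # Boolean indexing keeps the remaining pixels row by row, which shifts everything
    # to the right of the seam one column to the left; the result is M x (N - 1) x 3.
    return img[keep].reshape(M, N - 1, img.shape[2])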
The major question is how to select a seam to remove from an image. This should be the
seam that has the least “energy”. Energy is defined in our case as the sum of the derivative
magnitudes (not the gradient magnitude!) at each pixel:
e[i, j] = |∂I/∂x (i, j)| + |∂I/∂y (i, j)|,   for i ∈ 1 … M − 2, j ∈ 1 … N − 2.
(Use the OpenCV Sobel function and no Gaussian
smoothing to compute the partial derivatives.) The minimum vertical seam is defined as the
one that minimizes
(1/M) Σ_{i=0}^{M−1} e[i, c(i)]
over all possible seams c(·). Finding this seam appears to be a hard task because there is an
exponential number of potential seams.
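For reference, the energy above might be computed along the following lines, assuming a float32 gray scale image; the kernel size passed to Sobel is an assumption for illustration, not something the statement specifies.

import cv2
import numpy as np

def energy(gray):
    # gray: float32 gray scale image.  No Gaussian smoothing, per the statement.
    dx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)   # dI/dx
    dy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)   # dI/dy
    return np.abs(dx) + np.abs(dy)                    # sum of derivative magnitudes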
Fortunately, our old friend (from CSCI 2300) dynamic programming comes to the rescue,
allowing us to find the best seam in linear time in the number of pixels. To realize this, we
need to recursively compute a seam cost function W[i, j] at each pixel that represents the
minimum cost seam that runs through that pixel. Recursively this is defined as
W[0, j] = e[0, j]   ∀j
W[i, j] = e[i, j] + min( W[i − 1, j − 1], W[i − 1, j], W[i − 1, j + 1] )   ∀i > 0, ∀j
Even if you don’t know dynamic programming, computing W[i, j] is pretty straightforward
(except for a few NumPy tricks — see below).
Once you have matrix W, you must trace back through it to find the actual seam. This is
also defined recursively. The seam pixels, as defined by the function c(·) from above, are
c(M − 1) = argmin_{1 ≤ j ≤ N − 1} W[M − 1, j]
c(i) = argmin_{j ∈ {c(i+1)−1, c(i+1), c(i+1)+1}} W[i, j],   for i = M − 2 down to 0
In other words, in the last row, the column with the minimum weight (cost) is the end point
of the seam. From this end point we trace back up the image, one row at a time, and at
each row we choose from the three possible columns that are offset by -1, 0 or +1 from the
just-established seam column in the row below.
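Here is a minimal sketch of how the cost matrix W and the traceback might look, iterating only over rows and using shifted slices for the three-way minimum (the same trick described in the notes below); the huge border weight anticipates the first note. The function name and the BIG constant are illustrative choices, not prescribed.

import numpy as np

def find_vertical_seam(e):
    # e: M x N energy matrix.  Returns c, a length-M array of seam columns.
    M, N = e.shape
    BIG = 1e12                                   # keeps the seam off columns 0 and N-1
    W = np.empty((M, N))
    W[0] = e[0]
    W[:, 0] = W[:, -1] = BIG
    for i in range(1, M):
        prev = W[i - 1]
        # Minimum of the three predecessors (j-1, j, j+1) via shifted slices.
        best = np.minimum(np.minimum(prev[:-2], prev[1:-1]), prev[2:])
        W[i, 1:-1] = e[i, 1:-1] + best
    # Traceback: the cheapest column in the last row is the end of the seam ...
    c = np.empty(M, dtype=int)
    c[-1] = np.argmin(W[-1])
    # ... then move up one row at a time, choosing among the columns offset by
    # -1, 0, +1 from the seam column just chosen in the row below.
    for i in range(M - 2, -1, -1):
        j = c[i + 1]
        c[i] = j - 1 + np.argmin(W[i, j - 1:j + 2])
    return c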
A few quick notes on this.
• You need to be careful not to allow the seam to reach the leftmost or rightmost column.
The easiest way to do this is to introduce special case handling of columns 0 and N − 1
in each row, assigning an absurdly large weight.
• The trickiest part of this from the NumPy perspective is handling the computation of
the minimum during the calculation of W. While you clearly must explicitly iterate
over the rows (when finding the vertical seam), I don’t want you iterating over the
columns. Instead, use slicing in each row to create a view of the row that is shifted by
+1, -1 or 0 and then take the minimum. For example, here is code that determines, at
each location in an array, whether the value there is greater than the values of both its
left and right neighbors.
import numpy as np

a = np.random.randint(0, 100, 20)
print(a)

# True wherever a value is strictly greater than both of its neighbors.
is_max = np.zeros_like(a, dtype=bool)
left = a[:-2]        # view shifted one to the left
right = a[2:]        # view shifted one to the right
center = a[1:-1]
is_max[1:-1] = (center > right) & (center > left)
is_max[0] = a[0] > a[1]        # the endpoints have only one neighbor
is_max[-1] = a[-1] > a[-2]
print("Indices of local maxima in a:", np.where(is_max)[0])
'''
Example output:
[93 61 57 56 49 40 51 85 5 13 28 89 31 56 11 10 60 93 26 86]

Indices of local maxima in a: [ 0 7 11 13 17 19]
'''
• Recompute the energy matrix e after each seam is removed. Don’t worry about the fact
that the energy of most pixels will not change.
• The seam should be removed from the color image, but the energy is computed on a
gray scale image. This means you will have to convert from color to gray scale before
computing the energy matrix in each iteration of seam removal.
• Convert the image to float immediately after reading it (and before any derivative computation). This will ensure the greatest consistency with our results. In particular, if
fname stores the name of the image file, use
img = cv2.imread(fname).astype(np.float32)
Command-line and output: Your program should take an image as input and remove
enough rows or columns to make the image square. The command-line should be
python p1_seam_carve.py img
For the 0th, 1st and last seams, please print the index of the seam (0, 1, …), whether the seam is
vertical or horizontal, the starting, middle and end pixel locations on the seam (e.g. if there
are 11 pixels, output pixels 0, 5 and 10), and the average energy of the seam (accurate to
two decimal places). Finally, output the original image with the first seam drawn on top of
it in red, and output the final, resized image. If foo.png is the input image, the output
images should be foo_seam.png and foo_final.png.
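One possible way to wire up the command line and the output file names is sketched below; the seam-carving loop itself is only a placeholder comment here, and the clipping and naming choices are illustrative rather than required.

import os
import sys
import cv2
import numpy as np

fname = sys.argv[1]
img = cv2.imread(fname).astype(np.float32)
img_with_seam = img.copy()    # the first seam would be drawn on this copy;
                              # OpenCV images are BGR, so red is (0, 0, 255)
# ... seam-carving loop would go here, shrinking img until it is square ...
prefix, ext = os.path.splitext(fname)          # e.g. "foo", ".png"
cv2.imwrite(prefix + "_seam" + ext, np.clip(img_with_seam, 0, 255).astype(np.uint8))
cv2.imwrite(prefix + "_final" + ext, np.clip(img, 0, 255).astype(np.uint8))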
One last note: you may not be able to reproduce the exact output of my code. Do not
worry about this too much as long as your energies and seams are close. Especially important
is that the final square image looks close.
2. (40 points) In class we started to implement an edge detector in the Jupyter notebook
edge_demo.ipynb, including Gaussian smoothing and the derivative and gradient computations. The code is posted on Submitty.
In this problem, you will implement the non-maximum suppression step and then a thresholding step, one that is simpler than the thresholding method we discussed in class. Here are
the details:
• For non-maximum suppression, a pixel should be marked as a maximum if its gradient
magnitude is greater than or equal to those of its two neighbors along the gradient
direction, one “ahead” of it, and one “behind” it. (Note that by saying “greater than or
equal”, edges that have ties will be two (or more) pixels wide — not the right solution
in general, but good enough for now.) As examples, if the gradient direction at pixel
location (x, y) is π/5 radians (36°), then the ahead neighbor is at pixel (x + 1, y + 1)
and the behind neighbor is at pixel (x − 1, y − 1), whereas if the gradient direction is
9π/10 (162°), then the ahead neighbor is at pixel (x − 1, y) and the behind neighbor is
at pixel (x + 1, y).
• For thresholding, start from the pixel locations that remain as possible edges after non-maximum suppression and eliminate those having a gradient magnitude below 1.0. Then, for the remaining pixels, compute the mean, µ, and the standard deviation, s, of the gradient magnitudes. The threshold will be the minimum of µ + 0.5s and 30/σ, the former value because in most images, most edges are noise, and the latter value to accommodate clean images with no noise. Dividing by σ is because Gaussian smoothing reduces the gradient magnitude by a factor of σ. (A sketch of both the suppression and thresholding steps appears after these bullets.)
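Putting the two steps together, here is a minimal sketch (not the notebook's code) of how non-maximum suppression and the thresholding rule might be implemented without looping over pixels; the only Python loop is over the four direction classes. The GaussianBlur call, the default Sobel kernel size, and the function name are assumptions made for illustration.

import cv2
import numpy as np

def nms_and_threshold(gray, sigma):
    # gray: float32 gray scale image.  Smooth, then take Sobel derivatives.
    smoothed = cv2.GaussianBlur(gray, (0, 0), sigma)
    dx = cv2.Sobel(smoothed, cv2.CV_32F, 1, 0)
    dy = cv2.Sobel(smoothed, cv2.CV_32F, 0, 1)
    mag = np.sqrt(dx * dx + dy * dy)
    theta = np.arctan2(dy, dx)                       # direction in [-pi, pi]

    # Quantize direction into 4 classes: 0 = E/W, 2 = N/S, 1 and 3 = the diagonals.
    d = np.round(theta / (np.pi / 4)).astype(int) % 4

    M, N = gray.shape
    is_max = np.zeros((M, N), dtype=bool)
    m, d_in = mag[1:-1, 1:-1], d[1:-1, 1:-1]
    # One (row, col) offset per class; the "ahead"/"behind" labels may be swapped
    # for some angles, but both neighbors are compared so the result is the same.
    for b, (oy, ox) in enumerate([(0, 1), (1, 1), (1, 0), (1, -1)]):
        ahead  = mag[1 + oy:M - 1 + oy, 1 + ox:N - 1 + ox]
        behind = mag[1 - oy:M - 1 - oy, 1 - ox:N - 1 - ox]
        is_max[1:-1, 1:-1] |= (d_in == b) & (m >= ahead) & (m >= behind)

    # Drop magnitudes below 1.0, then threshold at min(mu + 0.5 s, 30 / sigma).
    keep = is_max & (mag >= 1.0)
    mu, s = mag[keep].mean(), mag[keep].std()
    thresh = min(mu + 0.5 * s, 30.0 / sigma)
    edges = keep & (mag >= thresh)
    return edges, mag, mu, s, thresh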
The command-line should be
python p2_edge.py sigma in_img
where
• sigma is the value of σ used in Gaussian smoothing, and
• in_img is the input image.
(I have posted an example online with sigma = 2 and in_img = disk.png.)
The text output from the program will be:
• The number of pixels that remain as possible edges after the non-maximum suppression
step.
• The number of pixels that remain as possible edges after the gradient threshold of 1.0
has been applied.
• µ, s and the threshold, each on a separate line and accurate to 2 decimal places.
• The number of pixels that remain after the thresholding step.
Three output images will be generated, with file names created by adding a four character
string to the file name prefix of the input image. Examples below assume that the image is
named foo.png. Here are the three images:
• The gradient directions of all pixels in the image encoded to the following five colors:
red (255, 0, 0) for pixels whose gradient direction is primarily east/west; green (0, 255, 0)
for pixels whose gradient direction is primarily northwest/southeast; blue (0, 0, 255) for
pixels whose gradient direction is primarily north-south; white (255, 255, 255) for pixels
whose gradient direction is primarily northeast/southwest; and black (0, 0, 0) for any
pixel on the image border (first or last row or column) and for any pixel, regardless of
gradient direction, whose gradient magnitude is below 1.0. The file name should be
foo_dir.png.
• The gradient magnitude before non-maximum suppression and before thresholding, with
the maximum gradient mapping to the intensity 255. The file name should be foo_grd.png.
• The gradient magnitude after non-maximum suppression and after thresholding, with
the maximum gradient mapping to the intensity 255. The file name should be foo_thr.png.
Notes:
• Be sure that your image is of type float32 before Gaussian smoothing.
• At first it will seem a bit challenging — or at least tedious — to convert the initial gradient direction, which is measured in radians in the range [−π, π], into a decision as to whether the gradient direction is primarily west/east, northwest/southeast, north/south, or northeast/southwest. For example, the ranges [−π, −7π/8], [−π/8, π/8], and [7π/8, π] are all east/west. You could write an extended conditional to assign these directions, or you could write one or two expressions, using NumPy's capability for floating-point modular arithmetic, to simultaneously assign 0 to locations that are west/east, 1 to locations that are northwest/southeast, etc. Think about it! (One possible approach is sketched after these notes.)
• This problem is a bit harder than previous problems to solve without writing Python for
loops that range over the pixels, but good solutions do exist. Full credit will be given
for a solution that does not require for loops, while up to 36 of 40 for students in 4270
will be given for a solution that requires for loops. (For students in 6270 this will be
32 out of 40.) In other words, we’ve provided mild incentive for you to figure out how
to work solely within Python (and numpy) without for loops. Examples that have been
given in class and even worked through on homework can help. You’ll have to consider
each direction (somewhat) separately.
• A final word of wisdom: build and test each component thoroughly before using it in a
larger system – I know it’s hard to force yourself to move this slowly, but I promise it
will make this (and future) problems easier!
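To make the modular-arithmetic hint concrete, here is one possible sketch of binning the direction and building the foo_dir.png image. Which of the two diagonal bins corresponds to NE/SW versus NW/SE depends on the y-down image convention, so treat the color ordering below as a placeholder to verify rather than the required mapping; also remember that OpenCV writes images in BGR order.

import numpy as np

def direction_image(theta, mag):
    # theta: gradient direction in [-pi, pi]; mag: gradient magnitude (same shape).
    # One modular expression assigns each pixel to one of four classes:
    # 0 -> east/west, 2 -> north/south, 1 and 3 -> the two diagonals.
    bins = np.round(theta / (np.pi / 4)).astype(int) % 4

    colors = np.array([[0, 0, 255],       # east/west   -> red   (BGR)
                       [255, 255, 255],   # diagonal 1  -> white
                       [255, 0, 0],       # north/south -> blue
                       [0, 255, 0]],      # diagonal 2  -> green
                      dtype=np.uint8)
    out = colors[bins]                     # fancy indexing: one color per pixel

    out[mag < 1.0] = 0                     # weak-gradient pixels are black
    out[0, :] = out[-1, :] = 0             # image border is black
    out[:, 0] = out[:, -1] = 0
    return out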
3. (20 points) Object detection and change detection are two of the most important problems in
computer vision. We will discuss object detection at several points throughout the semester.
Here in particular we are going to adapt the simple change detection method from the Lecture 7 Jupyter notebook to detect the presence or absence of a bird in a picture. Thanks to
Olivia Lundelius for the pictures and suggestion.
Your python program will be given two images taken from the same camera, in the same
pose, at nearly the same time. The first image will definitely NOT show a bird. The second
image may or may not show a bird. Your output should be in two parts. The first is a single
line of text containing the word YES (meaning that there is a bird in the second image), or
the word NO (meaning that there is not a bird in the second image). The second is an image
showing the change regions of the images that indicate whether or not there is a bird present.
Change regions indicating the presence of a bird should be shown in color (however you
choose). Change regions that do not indicate the presence of a bird should be shown as gray
level (intensity vector (100, 100, 100)). So, if there is not a bird, all change regions (there will
almost always be some) should be shown as gray.
Please use as much or as little of the change detection Jupyter notebook as you wish. Adjust
the parameters and add decision criteria. You are welcome to add whatever decision
criteria you’d like, including location, size, etc.
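As a starting point only (this is not the Lecture 7 notebook's method, and the threshold and area values below are arbitrary placeholders you would tune), a bird decision might be built from absolute differencing and connected components, for example:

import cv2
import numpy as np

def detect_bird(img_no_bird, img_maybe, diff_thresh=25.0, min_area=400):
    # Both inputs: float32 color images from the same camera pose.
    # diff_thresh and min_area are made-up starting values, not tuned parameters.
    diff = np.abs(img_maybe - img_no_bird).sum(axis=2)
    change = (diff > diff_thresh).astype(np.uint8)

    # Group the changed pixels into connected regions.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(change)

    out = np.zeros_like(img_maybe, dtype=np.uint8)
    found = False
    for k in range(1, n):                           # label 0 is the unchanged background
        region = labels == k
        if stats[k, cv2.CC_STAT_AREA] >= min_area:  # crude "big enough to be a bird" test
            found = True
            out[region] = np.clip(img_maybe[region], 0, 255).astype(np.uint8)
        else:
            out[region] = (100, 100, 100)           # non-bird change regions in gray
    print("YES" if found else "NO")
    return found, out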
Written Problems
1. (15 points) Evaluate the quality of the results of seam carving on several images. In particular, find images for which the results are good, and find images for which the results are
poor. Show these images before and after applying your code. What causes poor results?
Please explain.
2. (10 points) Regarding your bird detector, please answer the following questions, briefly and
precisely:
(a) What are your bird detection decision criteria? Include any modifications to the Jupyter
notebook methods and parameters.
(b) When will your algorithm succeed, when will it fail, and why? Note that your algorithm
will fail at some point, so don’t be shy in your answer to this. Provide examples to
justify this.