CPS843/CP8307 Assignment 0 – MATLAB Warm-up
This assignment is designed to make sure you can load an image, manipulate its values, and produce some output; see the MATLAB tutorials on the course webpage (and search the web) for additional help. Use BrightSpace to submit your assignment. This assignment is to be done individually.
Try to avoid using loops as much as possible. Loops are notoriously slow in MATLAB because it is an interpreted language. Instead, try to use vectorized operations by applying a single MATLAB command (i.e., precompiled code) to an entire array. To get documentation for a particular MATLAB function, type help followed by the command name. Finally, make sure you do the appropriate typecasting (i.e., uint8 and double) when working with images.
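For illustration, a minimal sketch of the kind of vectorized, properly typecast operation intended here (the filename is just a placeholder):

    img = imread('myimage.jpg');            % placeholder filename; any uint8 colour image
    img_d = double(img);                    % typecast before doing arithmetic
    img_bright = 1.5 * img_d;               % vectorized: scales every pixel, no loops
    img_out = uint8(min(img_bright, 255));  % clip and cast back to uint8 for display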
You should submit a zip file that contains the following:
• the images (e.g., JPEG or some other easy-to-recognize format) and
• your MATLAB files for the assignment, including 'a0_script.m'; see the comments in 'a0_script.m' for additional details.
Your assignment must run fully by invoking 'a0_script.m'.
1. Input images
(a) Find two interesting images to use. They should be colour images. You can find some classic vision examples at https://sipi.usc.edu/database/database.php?volume=misc. Make sure they are not larger than 512 × 512.
Output: Display both images; see imshow(). (Tip: Always make sure to set the display range of imshow() to a reasonable range for display, such as the minimum and maximum values of the image.)
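For example, one way to set such a display range for a single-channel image (gray_img is an illustrative name) might be:

    imshow(gray_img, [min(gray_img(:)) max(gray_img(:))]);   % explicit display range
    imshow(gray_img, []);                                     % [] uses the image's own min/max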
2. Colour planes
(a) Load image 1 and store it in the variable img; see imread().
(b) Swap the red and blue channels of image 1
Output: Display new image
(c) Create a monochrome image (call it img_g) by selecting the green channel of image 1
Output: Display new image
(d) Create a monochrome image (call it img_r) by selecting the red channel of image 1
Output: Display new image
(e) Convert the image to grayscale; see rgb2gray()
Output: Display new image
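A minimal sketch of these channel manipulations (the filename is a placeholder for image 1):

    img = imread('image1.jpg');         % image 1
    img_swapped = img(:, :, [3 2 1]);   % (b) swap the red and blue channels
    img_g = img(:, :, 2);               % (c) monochrome image from the green channel
    img_r = img(:, :, 1);               % (d) monochrome image from the red channel
    img_gray = rgb2gray(img);           % (e) built-in grayscale conversion
    imshow(img_swapped); figure; imshow(img_g, []); figure; imshow(img_gray, []);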
3. Replacement of pixels
(a) Take the 100 × 100 pixel square located at the centre of image 1 (grayscale version) and insert it into image 2 (grayscale version).
Output: Display new image
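One possible way to extract and paste the centre patch, assuming both grayscale images are at least 100 × 100 and are named img1_gray and img2_gray for illustration:

    [h1, w1] = size(img1_gray);
    r1 = floor(h1/2) - 49;  c1 = floor(w1/2) - 49;      % top-left corner of the centre patch
    patch = img1_gray(r1:r1+99, c1:c1+99);              % 100 x 100 centre square of image 1

    [h2, w2] = size(img2_gray);
    r2 = floor(h2/2) - 49;  c2 = floor(w2/2) - 49;
    img2_gray(r2:r2+99, c2:c2+99) = patch;              % insert into the centre of image 2
    imshow(img2_gray, []);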
4. Arithmetic and geometric operations
(a) What are the minimum and maximum of the pixel values of img_g? What is the mean? What is the standard deviation? How did you compute these?
(b) Subtract the mean from all the pixels, then divide by the standard deviation, then multiply by 10 (if your image ranges from 0 to 255) or by 0.05 (if your image ranges from 0.0 to 1.0). Now add the mean back in. (A combined sketch of parts (a)–(f) follows part (f).)
Output: Display new image
(c) Shift img_g to the left by 2 pixels.
Output: Display new image
(d) Subtract the shifted version of img_g from the original. (Tip: Whenever performing arithmetic operations with images, make sure you convert the image to a floating-point type first; otherwise issues will arise, since the default image type is an unsigned integer.)
Output: Display new image
(e) Flip img_g horizontally, i.e., flip the image left-to-right.
Output: Display new image
(f) Invert the intensities of img_g so that the lightest values appear dark and the darkest appear light.
Output: Display new image
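A combined sketch of parts (a)–(f), assuming img_g is on a 0–255 scale (circshift wraps pixels around; zero-padding the vacated columns would also be acceptable):

    g = double(img_g);
    fprintf('min %.2f  max %.2f  mean %.2f  std %.2f\n', min(g(:)), max(g(:)), mean(g(:)), std(g(:)));

    g_norm  = (g - mean(g(:))) / std(g(:)) * 10 + mean(g(:));   % part (b)
    g_shift = circshift(g, [0 -2]);                              % part (c): shift left by 2
    g_diff  = g - g_shift;                                       % part (d)
    g_flip  = fliplr(g);                                         % part (e)
    g_inv   = max(g(:)) + min(g(:)) - g;                         % part (f): invert intensities

    imshow(uint8(g_norm)); figure; imshow(g_diff, []); figure; imshow(g_inv, []);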
5. Image noise
(a) Take the original colour image 1 and add Gaussian noise to the pixels in each colour channel. (Hint: Create a three-channel image containing Gaussian noise.) Increase the variance of the noise until it is somewhat visible; see randn(). To increase the noise variance, multiply the output of randn() by the standard deviation you desire. What value of the standard deviation did you use?
Output: Display new image
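A sketch of adding channel-wise Gaussian noise; the standard deviation of 20 (on a 0–255 scale) is only an example value, so report whatever you actually used:

    img_d  = double(img);                          % original colour image 1
    sigma  = 20;                                   % example noise standard deviation
    noise  = sigma * randn(size(img_d));           % three-channel Gaussian noise
    img_ny = uint8(min(max(img_d + noise, 0), 255));
    imshow(img_ny);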
CPS843/CP8307 Assignment 1 – Filtering
1 Convolution (Total 25)
1. [5 points] Fill in the empty table (below-right) with the resulting image obtained after convolving the original image (bottom-left image) with the following approximation of the derivative filter, [1, 0, -1], in the horizontal direction. Assume that the image is zero padded. The origin is located at the top-left corner with coordinates [0, 0]. This question is to be done by hand; use fprintf() in MATLAB to output your response.
-5 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 -7 2 1 1 3 0 0 0
0 0 0 1 1 1 1 0 0 0
0 0 0 3 1 1 5 0 0 0
0 0 0 -1 -1 -1 -1 0 0 0
0 0 0 1 2 3 4 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
2. [5 points] Compute the gradient magnitude at pixels [2, 3], [4, 3] and [4, 6] in the left image in Q1.1 (the image
pixels marked in bold). Hint: Assume the same derivative filter approximation used for the horizontal direction
is used for the vertical direction. Use fprintf() in MATLAB to output your response.
3. [5 points] Write a convolution function, call it MyConv, that takes in as input an arbitrary image and kernel.
Your convolution function should only use loops (i.e., do not use any of the prebuilt MATLAB functions, e.g.,
imfilter). For the boundary cases, assume that the values outside the image are zero. (HINT: Pad the image
prior to convolution with zeros.) Your code should appropriately flip the input kernel.
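A minimal loop-based sketch of such a function (zero padding, kernel flipped so the operation is a true convolution); treat it as a starting point rather than a reference solution:

    function out = MyConv(img, kernel)
        img    = double(img);
        kernel = rot90(kernel, 2);                 % flip the kernel in both directions
        [kh, kw] = size(kernel);
        ph = floor(kh/2);  pw = floor(kw/2);
        [h, w]  = size(img);
        padded  = zeros(h + 2*ph, w + 2*pw);       % zero padding
        padded(ph+1:ph+h, pw+1:pw+w) = img;
        out = zeros(h, w);
        for y = 1:h
            for x = 1:w
                window = padded(y:y+kh-1, x:x+kw-1);
                out(y, x) = sum(sum(window .* kernel));
            end
        end
    end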
4. [5 points] Compare the output of your convolution function with the output of imfilter using a 2D Gaussian kernel with standard deviation 2 and dimensions 13 × 13. Specifically, subtract the convolution outputs and display the absolute value using reasonable scalings for imshow. Is there any difference between the outputs? Make sure that imfilter is assuming that the boundaries are zero outside the image. Use fprintf() in MATLAB to output your response.
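One way to set up the comparison (img_gray is an illustrative grayscale input; imfilter's default boundary option is already zero padding, and 'conv' makes it perform convolution rather than correlation):

    G = fspecial('gaussian', 13, 2);                 % 13 x 13 Gaussian, sigma = 2
    mine   = MyConv(img_gray, G);
    theirs = imfilter(double(img_gray), G, 'conv', 'same', 0);
    diff   = abs(mine - theirs);
    imshow(diff, []);
    fprintf('maximum absolute difference: %g\n', max(diff(:)));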
5. [5 points] Determine the execution times for convolving a 640 × 480 image with a 2D Gaussian with a standard deviation of 8 versus separably convolving the same image with two 1D Gaussians, each with a standard deviation of 8. Use imfilter to perform the convolution and fspecial to generate the kernels (with the "three-sigma rule"). What do you observe? In MATLAB, if you place the command tic before your script and toc after your script, MATLAB will return the total execution time. Use fprintf() in MATLAB to output your response.
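A sketch of the timing comparison; I is an illustrative 640 × 480 image already converted to double, and the kernel size follows one common reading of the three-sigma rule (hsize = 2*ceil(3*sigma) + 1):

    sigma = 8;  hsize = 2*ceil(3*sigma) + 1;
    G2 = fspecial('gaussian', hsize, sigma);            % 2D kernel
    g1 = fspecial('gaussian', [hsize 1], sigma);        % 1D (column) kernel
    tic; out2d  = imfilter(I, G2, 'conv', 'same', 0); t2d = toc;
    tic; outsep = imfilter(imfilter(I, g1, 'conv', 'same', 0), g1', 'conv', 'same', 0); tsep = toc;
    fprintf('2D: %.4f s, separable: %.4f s\n', t2d, tsep);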
2 Canny edge detection (Total 35 for CPS843 and 45 for CP8307)
1. [30 points] Implement the Canny edge detection algorithm as a MATLAB function, call it MyCanny, up to but not including the hysteresis step, as described in class and in the handout available on the course webpage. Your function should take as input a greyscale image and the edge detection parameters and return the Canny edge binary image. For this question, there are two edge detection parameters: the standard deviation, σ, of the Gaussian smoothing filter and the gradient magnitude threshold, τ. Note: if you implement the peak/ridge detection step correctly, the output of your program should NOT have "thick" edges!
In addition to turning in your source code for the Canny edge detector, submit a MATLAB script that runs your
edge detector on the test image provided at this link and on an image of your own choosing. Your Canny output
images should use the best set of parameters you find for each input image.
For obvious reasons, you may not use MATLAB's edge() function in your own code, but you are encouraged to run edge() to get a sense of what the correct output should look like.
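For orientation only, a compact sketch of one possible structure for MyCanny (smooth, differentiate, compute magnitude and orientation, suppress non-maxima by comparing each pixel with its two neighbours along the gradient direction, then threshold). The parameter names sigma and tau mirror the description above; everything else is illustrative, not a reference solution:

    function edges = MyCanny(img, sigma, tau)
        img = double(img);
        hsize = 2*ceil(3*sigma) + 1;
        G = fspecial('gaussian', hsize, sigma);
        S = imfilter(img, G, 'conv', 'same', 0);            % smooth
        Ix = imfilter(S, [1 0 -1],  'conv', 'same', 0);     % horizontal derivative
        Iy = imfilter(S, [1 0 -1]', 'conv', 'same', 0);     % vertical derivative
        mag = sqrt(Ix.^2 + Iy.^2);
        ang = atan2(Iy, Ix);                                % gradient orientation
        [h, w] = size(img);
        edges = false(h, w);
        for y = 2:h-1
            for x = 2:w-1
                % quantize the orientation and pick the two neighbours along the gradient
                a = mod(round(ang(y, x) / (pi/4)), 4);
                switch a
                    case 0, n1 = mag(y, x-1);   n2 = mag(y, x+1);     % horizontal gradient
                    case 1, n1 = mag(y-1, x-1); n2 = mag(y+1, x+1);   % 45-degree diagonal
                    case 2, n1 = mag(y-1, x);   n2 = mag(y+1, x);     % vertical gradient
                    case 3, n1 = mag(y-1, x+1); n2 = mag(y+1, x-1);   % 135-degree diagonal
                end
                edges(y, x) = mag(y, x) >= n1 && mag(y, x) >= n2 && mag(y, x) > tau;
            end
        end
    end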
2. [5 points] Implement the Gaussian convolution as a sequence of horizontal and vertical convolutions, i.e., a
separable filter.
3. [10 points] (CP8307 question, bonus for CPS843) Add the hysteresis mechanism to the function you wrote for Q2.1. For hysteresis thresholding, a high and a low threshold are specified by the user beforehand. The process begins by marking all pixels with gradient magnitudes above the high threshold as "discovered" definite edges. These pixels are placed into a queue and become the starting points for a breadth-first search (BFS). Run the BFS by iterating through the queue of pixels; the hysteresis process terminates when the queue is empty. All adjacent pixels (the eight neighbours) are treated as nodes connected to the current pixel removed from the queue. The criterion for adding a new pixel to the queue is that the adjacent pixel has not been previously discovered and has a gradient magnitude greater than the low threshold. Adjacent pixels that meet this criterion are added to the BFS queue, and every adjacent pixel is marked as discovered once it has been checked against the criterion.
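One possible BFS-style sketch of the hysteresis step; mag is assumed to be the gradient magnitude after non-maximum suppression, and lo and hi are the user-supplied thresholds (here "discovered" is only set for pixels that pass the criterion, which yields the same edge map):

    strong = mag > hi;                         % definite edges
    discovered = strong;
    queue = find(strong);                      % linear indices form the BFS queue
    [h, w] = size(mag);
    while ~isempty(queue)
        idx = queue(1);  queue(1) = [];
        [y, x] = ind2sub([h w], idx);
        for dy = -1:1
            for dx = -1:1
                ny = y + dy;  nx = x + dx;
                if ny < 1 || ny > h || nx < 1 || nx > w, continue; end
                if ~discovered(ny, nx) && mag(ny, nx) > lo
                    discovered(ny, nx) = true;
                    queue(end+1) = sub2ind([h w], ny, nx); %#ok<AGROW>
                end
            end
        end
    end
    edges = discovered;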
3 Seam carving (Total 20 points)
1. [20 points] Seam carving is a procedure to resize images in a manner that preserves "important" image content.
A video demo is available on YouTube. The general steps for seam carving are as follows:
(a) Compute the energy image, E, for the input image, e.g., the sum of the gradient magnitude images computed for each of the three colour channels of the input image.
(b) Create a scoring matrix, M, with spatial image dimensions matching those of the input image.
(c) Set the values of the first row of the scoring matrix, M, to match those of the energy image, E.
(d) Set the value of every remaining entry in the scoring matrix to the energy value at that position plus the minimum value among the neighbouring cells above it in the seam, i.e.,

M(x, y) = E(x, y) + min( M(x - 1, y - 1), M(x, y - 1), M(x + 1, y - 1) ),   (1)

where M(x, y) is the cost of the lowest-cost seam passing through that point. This minimization procedure is an instance of dynamic programming.
(e) Find the minimum value in the bottom row of the scoring matrix. The corresponding position of the
minimal value is the bottom of the optimal seam.
(f) Using M(x, y), trace back up the seam by following the smallest value in any of the neighbouring positions
above.
(g) Remove the seam from the image.
(h) To reach a desired resized image, you will have to repeat the above procedure. Note that you will have to
recompute the energy matrix (and scoring matrix) each time to take into account changes in the resized
image.
Your task is to write a MATLAB function, call it MySeamCarving, that takes in an image and the desired new resolution, removes the necessary horizontal and vertical seams, and returns the resized image. Implement the seam carving routine with a single helper function, call it CarvingHelper, that removes all the necessary vertical (horizontal) seams. First call the helper function to remove the vertical (horizontal) seams, transpose the result of this function, and then call the seam removal function again with the transposed image as the input to remove the horizontal (vertical) seams. In addition to submitting your code, submit resizing outputs for the Ryerson image at 640 × 480 and 720 × 320. Also submit an image of your own and its resized output.
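For illustration, a sketch of the dynamic-programming fill and traceback for a single vertical seam; E is assumed to be the energy image, and the boundary columns are handled by shrinking the window of parent cells:

    [h, w] = size(E);
    M = zeros(h, w);
    M(1, :) = E(1, :);                               % step (c)
    parent = zeros(h, w);                            % remember which cell above was chosen
    for y = 2:h
        for x = 1:w
            xl = max(x-1, 1);  xr = min(x+1, w);
            [mval, k] = min(M(y-1, xl:xr));          % minimum of the cells above, equation (1)
            M(y, x) = E(y, x) + mval;
            parent(y, x) = xl + k - 1;
        end
    end
    [~, x] = min(M(h, :));                           % step (e): bottom of the optimal seam
    seam = zeros(h, 1);
    for y = h:-1:1                                   % step (f): trace the seam back up
        seam(y) = x;
        if y > 1, x = parent(y, x); end
    end
    % step (g): remove pixel (y, seam(y)) from every row y of each colour channel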
2. [10 points] (Bonus): Implement image expansion by inserting seams, see the original paper for details.
CPS843/CP8307 Assignment 2 – Model Fitting
1 Least Squares Fitting of a Plane
1. [5 points] Write a MATLAB script that generates 500 data points for a plane, z = αx + βy + γ, with additive Gaussian noise. (HINT: See the MATLAB example in the lecture slides for generating points on a line with additive noise.)
2. [5 points] Write a MATLAB script to estimate the parameters for the point set in Q1.1 based on all 500 data points using least-squares fitting. You will need to rewrite the equation of a plane as a non-homogeneous matrix equation, Ax = b, where x is a vector of unknowns (α, β, γ)ᵀ.
3. [5 points] Print to the screen the absolute error between each parameter of the ground-truth plane and the estimated plane.
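A sketch of Q1.1–1.3 under example parameter values (the ground-truth parameters and noise level are placeholders; choose your own):

    N = 500;
    alpha = 2; beta = -1; gamma = 5;                  % example ground-truth parameters
    x = rand(N, 1) * 10;  y = rand(N, 1) * 10;
    z = alpha*x + beta*y + gamma + 0.5*randn(N, 1);   % plane with additive Gaussian noise

    A = [x y ones(N, 1)];                             % non-homogeneous system A p = z
    p = A \ z;                                        % least-squares estimate of (alpha, beta, gamma)

    err = abs(p - [alpha; beta; gamma]);
    fprintf('absolute errors: alpha %.4f, beta %.4f, gamma %.4f\n', err(1), err(2), err(3));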
2 RANSAC-based Image Stitching
The goal of this question is to write a simple mosaic/panorama application. A panorama is a wide-angle image constructed by compositing together a number of images with overlapping fields of view in a photographically plausible way.
Part A:
[30 points] In this part, you will write code to construct a mosaic based on an affine transformation. The images
you will work with are shown in Fig. 1 and can be downloaded here: image 1 and image 2. An example result is
shown in Fig. 2. An affine transformation is equivalent to the composed effects of translation, rotation, isotropic scaling and shear. Formally, an affine transformation of an image coordinate, x1, is given by the matrix equation x2 = Tx1 + c. The unknowns are given by the elements in the 2 × 2 matrix T and the 2 × 1 vector c. Rewrite the affine equation as a non-homogeneous matrix equation, Ax = b, where x is a vector containing the six unknown elements of T and c. Since each point correspondence yields two equations and there are six unknowns, a minimum of three point correspondences is required. (Real mosaics are constructed with a homographic image transformation. This transformation is more general than an affine transformation. Nonetheless, the same basic robust estimation architecture you are implementing in this part of the assignment applies when constructing a homography-based mosaic in Part B.)
1. Preprocessing: Load both images and convert them to single precision and to grayscale.
Figure 1: Images used to create the Parliament panorama using an affine transformation.
2. Detect keypoints and extract descriptors: Compute image features in both images. The feature detector and descriptor you will be using is SIFT. Use the publicly available VLFeat library to compute SIFT features. The instructions for setting up VLFeat in MATLAB are available here: instructions. Also, check out the VLFeat SIFT demo. Compute SIFT feature descriptors using: [f, d] = vl_sift(img);
3. Match features: Compute distances between every SIFT descriptor in one image and every descriptor in the other image. You can use this code for fast computation of (squared) Euclidean distance.
4. Prune features: Select the closest matches based on the matrix of pairwise descriptor distances obtained above. You can select all pairs whose descriptor distances are below a specified threshold, or select the top few hundred descriptor pairs with the smallest pairwise distances.
5. Robust transformation estimation: Implement RANSAC to estimate an affine transformation mapping one image to the other; a rough sketch is given after this list. Use the minimum number of pairwise matches to estimate the affine transformation. Since you are using the minimum number of pairwise points, the transformation can be estimated using an inverse transformation rather than least-squares. Inliers are defined as the transformed points from image 1 that lie within a user-defined radius (in pixels) of their matched point in image 2. You will need to experiment with this matching threshold and the required number of RANSAC iterations. For randomly sampling matches, you can use the MATLAB functions randperm or randsample.
6. Compute optimal transformation: Using all the inliers of the best transformation found using RANSAC (i.e., the one with the most inliers), compute the final transformation with least-squares.
7. Create panorama: Using the final affine transformation recovered using RANSAC, generate the final mosaic and display the color mosaic result to the screen; your result should be similar to the result in Fig. 2. Warp one image onto the other using the estimated transformation. To do this, use MATLAB's maketform and imtransform functions. Create a new image big enough to hold the mosaic and composite the two images into it. You can create the mosaic by taking the pixel with the maximum value from each image. This tends to produce fewer artifacts than taking the average of the warped images. To create a color mosaic, apply the same compositing step to each of the color channels separately.
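A sketch of steps 5–6 (building the 6 × 6 system from three correspondences and a bare-bones RANSAC loop); p1 and p2 are assumed to be 2 × N matrices of matched keypoint locations, and the iteration count and inlier radius are placeholder values to tune:

    numIter = 1000;  radius = 3;                         % placeholder parameter values
    N = size(p1, 2);
    best_inliers = [];
    for it = 1:numIter
        s = randperm(N, 3);                              % minimal sample of three matches
        A = zeros(6, 6);  b = zeros(6, 1);
        for k = 1:3
            x = p1(1, s(k));  y = p1(2, s(k));
            A(2*k-1, :) = [x y 0 0 1 0];                 % row for x2 = t11*x + t12*y + c1
            A(2*k,   :) = [0 0 x y 0 1];                 % row for y2 = t21*x + t22*y + c2
            b(2*k-1:2*k) = p2(:, s(k));
        end
        params = A \ b;                                  % exact solve for the minimal sample
        T = [params(1) params(2); params(3) params(4)];
        c = params(5:6);
        proj = T * p1 + repmat(c, 1, N);                 % map all image-1 points
        d = sqrt(sum((proj - p2).^2, 1));
        inliers = find(d < radius);
        if numel(inliers) > numel(best_inliers), best_inliers = inliers; end
    end
    % step 6: re-estimate T and c by least squares using all matches in best_inliers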
Figure 2: An example (affine) panorama result using the Parliament images.
Part B:
[10 points] In this part, you will write code for constructing a panorama based on a homography transformation.
The images you will work with are shown in Fig. 3 and can be downloaded here: image 1 and image 2. An example
result is shown in Fig. 4. You should reuse the code from Part A but swap out the parts that refer to the affine
transformation with the homography.
The minimum number of point correspondences needed to estimate a homography is four. Using a homography yields a set of homogeneous linear equations, AX = 0. The solution to both the system of homogeneous equations consisting of four point correspondences and the homogeneous least-squares problem,

X* = argmin_X ||AX||   (1)

subject to the constraint

||X|| = 1,   (2)

is obtained from the singular value decomposition (SVD) of A as the singular vector corresponding to the smallest singular value: [U,S,V] = svd(A); X = V(:,end);
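For illustration, one way to set up the direct linear transform (DLT) system; each correspondence (x, y) in image 1 matched to (xp, yp) in image 2 contributes two rows of A, and the same code covers both the minimal four-point case and least squares over all inliers (p1 and p2 are assumed 2 × N matched point matrices, N >= 4):

    N = size(p1, 2);
    A = zeros(2*N, 9);
    for k = 1:N
        x = p1(1, k);  y = p1(2, k);  xp = p2(1, k);  yp = p2(2, k);
        A(2*k-1, :) = [-x -y -1  0  0  0  x*xp  y*xp  xp];
        A(2*k,   :) = [ 0  0  0 -x -y -1  x*yp  y*yp  yp];
    end
    [U, S, V] = svd(A);
    X = V(:, end);                 % singular vector of the smallest singular value
    H = reshape(X, 3, 3)';         % 3 x 3 homography, defined up to scale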
1. Using your RANSAC-based homography code generate the mosaic using the Egerton Ryerson images and
display the color mosaic result to the screen.
2. Run your code on an image pair of your own choosing and display the color mosaic result to the screen.
Make sure the images you choose have significant overlap; otherwise, you will not be able to establish correspondences. Further, for a homography to be valid, the images must either be obtained by rotating the camera about a fixed position OR be taken from multiple vantage points of a scene that is planar or approximately planar.
Figure 3: Images used to create the Egerton Ryerson statue panorama using a homography transformation.
Figure 4: An example (homography) panorama result using the Egerton Ryerson statue images.
Bonus:
1. [10 points] Experiment with combining image pairs where establishing correspondence is rendered difficult because of widely varying image sources. These images should have a Ryerson theme. Possible ideas include: (i) combining a modern and a historical view¹ of the same location, such as these ones, and (ii) combining images taken at different times of day or in different seasons. Display the result to the screen and indicate in the MATLAB Command Window that this is for the bonus.
2. [5 points] Experiment with image blending techniques to remove salient seams between images; see Szeliski (the course textbook), Chapter 9. Display the before- and after-blending color mosaic results to the screen and indicate in the MATLAB Command Window that this is for the bonus.
Submission Details
Submit all MATLAB files and images required for the various parts of the assignment to run. Your submission should
include a MATLAB script named a2.m for the grader to run. The script should break up the assignment with pause()
commands, so that the grader can press "Enter" to step through all of your figures and written answers. If your code
does not run we cannot mark it.
¹ The Ryerson archives may be able to assist with obtaining suitable historical imagery.
CPS843/CP8307 Assignment 3 – Machine Learning
1 Face detection (30 points)
In this part of the assignment, you will be implementing various parts of a sliding window object detector [1]. In particular, you are tasked with implementing a multi-scale face detector. You need to train an SVM to categorize 36 × 36 pixel images as "face" or "not face", using HOG features. Use the VLFeat library (https://www.vlfeat.org/) for both HOG and the SVM.
You are given:
• a directory of cropped grayscale face images, called cropped_training_images_faces,
• a directory of images without faces, called images_notfaces,
• a skeleton script called generate_cropped_notfaces.m,
• a skeleton script called get_features.m,
• a skeleton script called train_svm.m, and
• a helper script called report_accuracy.m (do not edit this file).
You need to do the following:
1. [5 points] Using the images in images_notfaces, generate a set of cropped, grayscale, non-face images. Use generate_cropped_notfaces.m as your starting point. The images should be 36 × 36, like the face images.
2. [5 points] Split your training images into two sets: a training set, and a validation set. A good rule of thumb is
to use 80% of your data for training, and 20% for validation.
3. [5 points] Generate HOG features for all of your training and validation images. Use get_features.m as your starting point. You are free to experiment with the details of your HOG descriptor. A useful resource is the VLFeat tutorial on HOG: https://www.vlfeat.org/overview/hog.html
4. [5 points] Train an SVM on the features from your training set. Use train_svm.m as your starting point. The parameter lambda will help you control overfitting. A useful resource is the VLFeat tutorial on SVMs: https://www.vlfeat.org/matlab/vl_svmtrain.html. Note: If you test your SVM on the training set features, you should get near-perfect accuracy.
5. [5 points] Test your SVM on the validation set features. Based on the SVM's performance at this step, try to refine the parameters you chose in the earlier steps (e.g., the cell size for HOG, and lambda for the SVM). Save your final SVM (weights and bias) in a mat file called my_svm.mat, and include it in your submission.
6. [5 points] Write a script called recog_summary.m, which prints out a brief summary of your approach (using fprintf). Be sure to include your best accuracy on the validation set and what you did to improve performance.
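For orientation, minimal VLFeat calls for extracting a HOG descriptor and training a linear SVM; the cell size and lambda are example values to tune on the validation set, and face_img, X, Y and X_val are illustrative variable names:

    cellSize = 6;                                       % example HOG cell size
    hog = vl_hog(im2single(face_img), cellSize);        % e.g., a 36 x 36 image -> 6 x 6 x 31 features
    feat = hog(:)';                                     % flatten to one feature row

    % X: D x N matrix of features (one column per training image), Y: 1 x N labels in {-1, +1}
    lambda = 0.0001;                                    % example regularization value
    [w, b] = vl_svmtrain(X, Y, lambda);
    scores = w' * X_val + b;                            % classify validation features
    predictions = sign(scores);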
2 Multi-scale face detection (35 points)
In this part, you need to create a multi-scale sliding window face detector.
In addition to the files from Part 1, you are given:
• an image called class.jpg,
• a skeleton script called detect.m,
• a directory of grayscale test images, called test_images,
• bounding box annotations for the test images, called test_images_gt.txt (do not edit this file),
• a helper script called look_at_test_images_gt.m,
• a helper script called evaluate_detections_on_test.m (do not edit this file), and
• a helper script called VOCap.m (do not edit this file).
You need to do the following:
1. Get familiar with the test set, and how the bounding boxes work, by exploring look_at_test_images_gt.m.
2. [5 points] Write a single-scale sliding window face detector, using the SVM you trained in Part 1. Use detect.m as your starting point. Evaluate your detector by calling evaluate_detections_on_test.m with the appropriate arguments.
3. [10 points] Upgrade your face detector so that it does not make overlapping predictions. This is called non-maximum suppression. Detectors typically yield multiple high scores over a region, but you want to report the single best bounding box per object. Since only a single bounding box is reported in the ground truth, failure to do so will result in a reduction in the test accuracy score. You may want to inspect the code in evaluate_detections_on_test.m to see how to calculate the area of intersection and the area of union.
4. [10 points] Upgrade your face detector so that it makes predictions at multiple scales.
5. [5 points] Use your face detector on class.jpg, and plot the bounding boxes on the image.
6. [5 points] Write a script called detect_summary.m, which prints out a brief summary of your approach (using fprintf). Be sure to include your best accuracy on the test set, what you did to improve performance, and a brief qualitative evaluation of your performance on class.jpg.
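One simple greedy non-maximum suppression scheme, for illustration; boxes are assumed to be [x1 y1 x2 y2] rows, confidences a column vector, and the overlap threshold an example value:

    function keep = nms(boxes, confidences, overlapThresh)
        [~, order] = sort(confidences, 'descend');
        keep = [];
        while ~isempty(order)
            i = order(1);
            keep(end+1) = i;                             %#ok<AGROW>
            rest = order(2:end);
            % intersection-over-union of box i with all remaining boxes
            xx1 = max(boxes(i,1), boxes(rest,1));  yy1 = max(boxes(i,2), boxes(rest,2));
            xx2 = min(boxes(i,3), boxes(rest,3));  yy2 = min(boxes(i,4), boxes(rest,4));
            inter = max(0, xx2-xx1+1) .* max(0, yy2-yy1+1);
            areaI = (boxes(i,3)-boxes(i,1)+1) * (boxes(i,4)-boxes(i,2)+1);
            areaR = (boxes(rest,3)-boxes(rest,1)+1) .* (boxes(rest,4)-boxes(rest,2)+1);
            iou = inter ./ (areaI + areaR - inter);
            order = rest(iou <= overlapThresh);          % drop overlapping, lower-scoring boxes
        end
    end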
Bonus points will be awarded to the top-performing classifiers on class.jpg. Secret ground-truth labels (bounding boxes for faces) have already been generated. Do not use class.jpg in any way to train your detector. If you would like to compete for these points, include a single script called detect_class_faces.m which runs the full detection pipeline. This script should:
1. load your SVM from a saved file (my_svm.mat),
2. generate features from the image at multiple scales,
3. classify the features,
4. suppress overlapping detections,
5. generate an N × 4 matrix of bounding boxes called bboxes, where N is the number of faces you detect,
6. generate an N × 1 matrix of SVM scores called confidences, and
7. plot the bounding boxes on the image.
The marker will run this script, followed by an evaluation script that will use your bboxes and confidences to
generate an average precision score. The top three performing groups will get points as follows:
1st place: 30 points,
2nd place: 15 points,
3rd place: 10 points.
Feel free to research ways to improve your detector's performance, aside from inputting detections manually. For example, it may help to add new training data to the existing set (e.g., training data from another dataset, or augmenting the existing set by applying image transformations to some subset of the images, such as left-right flipping), to revise your training approach (e.g., use hard negative mining¹), or to add some colour cues to the feature vector. Good luck!
References
[1] Navneet Dalal and Bill Triggs. Histograms of oriented gradients for human detection. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 886–893, 2005.
Acknowledgement: The face detection portion of this assignment is adapted from one created by Prof. James Hayes
(Georgia Tech).
¹ Hard negative mining [1] is a training scheme to improve the performance of a detector. It begins with training a detector on a set of positive examples and an initial set of negative ones. Following this initial training stage, negative examples that are incorrectly classified by the initial model are collected to form a set of hard negatives. These hard negatives are added to the negative training set and a new model is trained. This process may be repeated several times.
CPS843/CP8307 Introduction to Computer Vision Assignment 4
1 Optical flow estimation (45 points)
In this part, you will implement the Lucas-Kanade optical flow algorithm that computes the pixelwise motion between
two images in a sequence. Compute the optical flow fields for the three image sets labeled synth, sphere and
corridor. Before running your code on the images, you should first convert your images to grayscale and map the
intensity values to the range [0, 1].
1. [20 points] Recall from lecture: to compute the optical flow at a pixel, compute the spatial derivatives (in the first frame), Ix and Iy, compute the temporal derivative, It, and then, over a window centred around each pixel, solve the following:

[ ΣΣ IxIx   ΣΣ IxIy ] [ u ]        [ ΣΣ IxIt ]
[ ΣΣ IxIy   ΣΣ IyIy ] [ v ]  =  -  [ ΣΣ IyIt ]     (1)
Write a MATLAB function, call it myFlow, that takes as input two images, img1 and img2, the window length used to compute the flow around a point, and a threshold, τ. The function should return three images: u and v, which contain the horizontal and vertical components of the estimated optical flow, respectively, and a binary map that indicates whether the flow at each pixel is valid.
To compute the spatial derivatives, use the five-point derivative of Gaussian convolution filter (1/12)*[-1 8 0 -8 1] (make sure the filter is flipped correctly); the image origin is located in the top-left corner of the image, with the positive direction of the x and y axes running to the right and down, respectively. To compute the temporal derivative, apply Gaussian filtering with a small σ value (e.g., a 3 × 3 filter with σ = 1) to both images and then subtract the first image from the second image. Since Lucas-Kanade only works for small displacements (roughly a pixel or less), you may have to resize the input images (use MATLAB's imresize) to get a reasonable flow field. Hint: The (partial) derivatives can be computed once by applying the filters across the entire image. Further, to efficiently compute the component-wise summations in (1), you can apply a smoothing filter (e.g., box filter, Gaussian, etc.) to the image containing the product of the gradients.
Recall, the optical flow estimate is only valid in regions where the 2 × 2 matrix on the left side of (1) is invertible. Matrices of this type are invertible when their smallest eigenvalue is not zero or, in practice, greater than some threshold, τ, e.g., τ = 0.01. At image points where the flow is not computable, set the flow value to zero.
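A vectorized sketch of the core of myFlow; the window sums in (1) are computed by box-filtering the gradient products, and the closed-form inverse of the 2 × 2 system is applied per pixel. The names winLen and tau mirror the inputs described above, and the 'replicate' boundary choice for the derivatives is an assumption:

    fx = (1/12) * [-1 8 0 -8 1];                         % five-point derivative filter
    Ix = imfilter(img1, fx,  'conv', 'same', 'replicate');   % check the flipping convention carefully
    Iy = imfilter(img1, fx', 'conv', 'same', 'replicate');
    G  = fspecial('gaussian', 3, 1);
    It = imfilter(img2, G, 'same', 'replicate') - imfilter(img1, G, 'same', 'replicate');

    W = ones(winLen) / winLen^2;                         % window summation as a box filter
    Sxx = imfilter(Ix.*Ix, W, 'same');  Sxy = imfilter(Ix.*Iy, W, 'same');
    Syy = imfilter(Iy.*Iy, W, 'same');
    Sxt = imfilter(Ix.*It, W, 'same');  Syt = imfilter(Iy.*It, W, 'same');

    detA = Sxx.*Syy - Sxy.^2;
    lam  = (Sxx + Syy - sqrt((Sxx - Syy).^2 + 4*Sxy.^2)) / 2;   % smaller eigenvalue of the 2 x 2 matrix
    valid = lam > tau;
    u = (-Syy.*Sxt + Sxy.*Syt) ./ detA;                  % closed-form solution of (1)
    v = ( Sxy.*Sxt - Sxx.*Syt) ./ detA;
    u(~valid) = 0;  v(~valid) = 0;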
2. [5 points] Visualize the flow fields using the function flowToColor. Play around with the window size and
explain what effect this parameter has on the result.
3. [10 points] Another way to visualize the accuracy of the computed flow field is to warp img2 with the computed optical flow field and compare the result with img1. Write a function, call it myWarp, that takes img2 and the estimated flow, u and v, as input and outputs the (back)warped image. If the images are identical except for a translation and the estimated flow is correct, then the warped img2 will be identical to img1 (ignoring discretization artifacts). Hint: Use MATLAB's functions interp2 (try bicubic and bilinear interpolation) and meshgrid. Be aware that interp2 may return NaNs; in particular, this may occur around the image boundaries, since data is missing to perform the interpolation. Make sure your code handles this situation in a reasonable way. Visualize the difference between the warped img2 and img1 by: (i) taking the difference between the two images and displaying the absolute value of the output (use an appropriate scale factor for imshow), and (ii) using imshow to display img1 and the warped img2 consecutively in a loop for a few iterations; the output should appear approximately stationary. When running imshow in a loop, you will need to invoke the function drawnow to force MATLAB to render the new image to the screen.
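A sketch of the back-warping and visualization steps; the final argument to interp2 supplies a value for samples that fall outside img2, which is one reasonable way to avoid NaNs at the boundaries:

    [h, w] = size(img2);
    [X, Y] = meshgrid(1:w, 1:h);
    warped = interp2(img2, X + u, Y + v, 'linear', 0);   % sample img2 at the flowed positions

    imshow(abs(img1 - warped), []);                      % (i) absolute difference image
    for k = 1:10                                         % (ii) flicker comparison
        imshow(img1, []);   drawnow; pause(0.2);
        imshow(warped, []); drawnow; pause(0.2);
    end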
4. [10 points] In this question you will implement the Kanade-Lucas-Tomasi (KLT) tracker. The steps to implement are as follows:
• Detect a set of keypoints in the initial frame using the Harris corner detector. Here, you can use the code outlined in the lecture as a starting point.
• Select 20 random keypoints in the initial frame and track them from one frame to the next; for each keypoint, use a window size of 15 × 15. The tracking step consists of computing the optical flow vector for each keypoint and then shifting the window, i.e., x_i^(t+1) = x_i^t + u and y_i^(t+1) = y_i^t + v, where i denotes the keypoint index and t the frame. This step is to be repeated for each frame, using the estimated window position from the previous tracking step.
• Discard any keypoints whose predicted location moves out of the frame or near the image borders. Since the displacement values (u, v) are generally not integer-valued, you will need to use interpolation for subpixel values; use interp2.
• Display the image sequence and overlay the 2D path of the keypoints using line segments.
• Display a separate image of the first frame and overlay a plot of the keypoints that have moved out of the frame at some point in the sequence.
For this question, you will use the images from the Hotel Sequence.
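A rough outline of the tracking loop, for illustration only; frames is assumed to be a cell array of grayscale images, pts a 2 × K matrix of the selected keypoint coordinates, and myFlow the function from Q1 (the threshold value 0.01 follows the example above):

    win = 15;  border = ceil(win/2);
    paths = cell(1, size(pts, 2));
    lost  = false(1, size(pts, 2));
    for t = 1:numel(frames)-1
        [u, v, valid] = myFlow(frames{t}, frames{t+1}, win, 0.01);
        [h, w] = size(frames{t});
        for i = find(~lost)
            x = pts(1, i);  y = pts(2, i);
            du = interp2(u, x, y);  dv = interp2(v, x, y);   % subpixel flow lookup
            x = x + du;  y = y + dv;
            if isnan(x) || isnan(y) || x < border || x > w-border || y < border || y > h-border
                lost(i) = true;                               % keypoint left the frame
            else
                pts(:, i) = [x; y];
                paths{i}(:, end+1) = [x; y];                  % record the path for plotting
            end
        end
    end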