Description
PART I: PROBLEM SET
Your solutions to the problems will be submitted as a single PDF document. Be certain that your problems
are well-numbered and that it is clear what your answers are. Additionally, you will be required to duplicate
your answers to particular problems in the README file that you will submit.
1 K-Means (12 pts)
Show 2 iterations of the k-means algorithm (k = 2) on the following one-dimensional data set:
Data: [ 4, 1, 9, 12, 6, 10, 2, 3, 9 ]
First iteration: cluster centers (randomly chosen): 1, 6
Data assignment:
• Cluster 1: { 1, 2, 3 }
• Cluster 2: { 4, 9, 12, 6, 10, 9 }
(a) Show the cluster centers, then the data assignments, that would be obtained for each of two more
iterations.
(b) After your iterations, has the algorithm converged to a solution at this point, or not? How can you tell?
2 K-Means and Variance (6 pts)
(a) When using the K-Means clustering algorithm, we seek to minimize the variance of the solution. In
general, what happens to the variance of a partition as you increase the value of K (the number of
clusters) and why? State your answer in one sentence.
(b) For a dataset with n instances, what value of k can you always get a variance of 0? Why? State your
answer in one sentence.
3 Reinforcement Learning I (12 pts)
(Adapted from Sutton and Barto, Ex. 3.5) Imagine that you are designing a robot to run a maze. You decide
to give it a reward of +1 for escaping the maze and a reward of zero at all other times. The task seems to
break down naturally into episodes – the successive runs through the maze – so you decide to treat it as
an episodic task, where the goal is to maximize expected total reward Rt = rt+1 + rt+2 + rt+3 + . . . + rT ,
where T is the final time step of an episode. After running the learning agent for a while, you find that it is
showing no improvement in escaping from the maze. Something is going wrong.
Does the reward function effectively communicate the goal to the agent? If not, can you suggest another
reward function that will work? If the reward function is fine, what else is going wrong?
4 Reinforcement Learning II (Extra Credit – 15 pts)
(Adapted from Sutton and Barto, Ex. 3.10.) Consider reinforcement learning in a gridworld where rewards
are positive for goals, negative for running into the edge of the world, and zero otherwise.
(a) Are the signs of these rewards important, or only the intervals between them?
(b) Provide a formal proof that adding a constant C to all the rewards simply adds a constant, K, to the
values of all states, and thus does not affect the relative value of any policies. In your proof, you will
find it useful to use the following equations we studied in class for the expected discounted return and
value of a state:
Rt =
X∞
k=0
γ
k
rt+k+1 V
π
(s) = Eπ
h
Rt | st = s
i
(c) What is K in terms of C and gamma?
3
PART II: PROGRAMMING EXERCISES
1 Image Segmentation using K-Means (Extra Credit – 30 pts)
This entire problem is extra credit. If you do decide to do this project, you will apply K-Means to image
segmentation. Write a program named imageSegmentation.py that reads in an image, segments that image
using K-Means clustering as described below, and outputs the new segmented image. Your program must
support the following command line arguments:
python imageSegmentation.py K inputImageFilename outputImageFilename
The first argument K is an integer greater than 2 that specifies the number of clusters, inputImageFilename
is the filename of the input image, and outputImageFilename is the filename to write the output image.
For example, we might call your program via:
python imageSegmentation.py 24 newyorkcity.jpg nyc-segmented.jpg
Choose several nice natural images, such as a farmhouse against a blue sky, or a city scene. First write
code to load the image using the Python Image Library (you might find it useful to consult https://en.
wikibooks.org/wiki/Python_Imaging_Library). The Python Image Library supports a wide variety of
image file formats, and will automatically determine the filetype based on the file extension. (We will test
your program with .jpg and .png files, so make certain to test your program with those types.)
We can think of an image as being represented as a 3-D matrix of size imageWidth × imageHeight × 3. For
each location in the image (i, j), the matrix contains three values for the red, green, and blue components of
the pixels. We will use these pixel values for clustering. In addition to the color values (rp, gp, bp) for pixel
p, we will also use the x,y coordinates (ip, jp) as features. In particular, we can represent each pixel p as a
five-dimensional data vector xp =
rp gp bp ip jp
.
Complete the program via the following steps:
• Convert the input image into a data set with five features, as described above. To improve results, you
should also standardize the values of each feature in the data set.
• Implement your own version of K-Means and use it to cluster the data (i.e. the features for each pixel)
into K clusters. If a cluster is ever empty during the procedure, assign a random data point to it. Use
random initializations for the cluster centers, and iterate until the centroids converge.
• Use the cluster centers to generate the segmented image by replacing each data point’s color values
with the closest center. For example, xp becomes
xˆp =
rC(p) gC(p) bC(p)
ip jp
,
where C(p) is the cluster to which xp belongs and (rC(p)
, gC(p)
, bC(p)) are the corresponding RGB values
of that cluster’s centroid. Note specifically that we’re only replacing the color values of each instance
with its centroid’s colors, we’re not changing the (i,j) coordinates of that instance.
• Create an output image the same size as the input image. Then fill in the color values of each pixel
of the image based on the xˆp’s. For example, xˆp informs us that the pixel at (ip, jp) should have color
(rC(p)
, gC(p)
, bC(p)). Note that you also have to undo the feature standardization at this point (just
invert the standardization equation by solving for the original value given the standardized value).
• Output the resulting image to the file outputImageFilename.
• In your PDF writeup, include three different examples of an original image alongside the resulting
segmented image.
The result of this process is called an over-segmented image. It is the first step to building such systems as
this: https://make3d.cs.cornell.edu/. Later steps would piece these segments together into objects.
4