16-820 Homework 5: Photometric Stereo


As we discussed in class, photometric stereo is a computational imaging method to determine the
shape of an object from its appearance under a set of lighting directions. In the course of this homework, we will solve a simplified version of the photometric stereo problem. We will make certain
assumptions: that the object is Lambertian and is imaged with an orthographic camera under a set
of directional lights. We will first solve calibrated photometric stereo, in which lighting directions
are given; then uncalibrated photometric stereo, where we have to infer lighting directions and the
shape of the object jointly from the given images. In this process, we will learn about an ambiguity
in the resultant shape introduced because of this joint estimation.
Even though the homework is fairly self-contained, we strongly recommend going over the papers
in the references before starting. As always, start early.
1 Calibrated photometric stereo (80 points)
Figure 1: 3D reconstruction result from calibrated photometric stereo
All code in this part goes in src/q1.py. The code to pipeline these functions and generate the
results is provided for you at the end of the file.
Image formation. We will now look at image formation under our specific assumptions. We will
understand and implement the process that occurs in a scene when we take a picture with an orthographic camera.
Figure 2: Geometry of photometric stereo. (a) Geometry of photometric stereo, showing the light direction l, view direction v, surface normal n and area element dA. (b) Projected area.
(a, 5 points) Understanding n-dot-l lighting. In your write-up, explain the geometry of the n-dot-l lighting model from Fig. 2a. Where does the dot product come from? Where does projected area (Fig. 2b) come into the equation? Why does the viewing direction not matter?
Figure 3: Setup for Problem 1b
(b, 10 points) Rendering n-dot-l lighting. Consider a uniform fully reflective Lambertian sphere
with its center at the origin and a radius of 0.75 cm (Fig 3). An orthographic camera located at (0,
0, 10) cm looks towards the negative z-axis with its sensor axes aligned and centered on the x- and
y-axes. The pixels on the camera are squares 7 µm in size, and the resolution of the camera is
3840 × 2160 pixels. Simulate the appearance of the sphere under the n-dot-l model with directional
light sources with incoming lighting directions (1, 1, 1)/√3, (1, −1, 1)/√3 and (−1, −1, 1)/√3 in the
function renderNDotLSphere (individual images for all three lighting directions). Note that
your rendering isn’t required to be absolutely radiometrically accurate: we need only evaluate the
n-dot-l model. Include the 3 renderings in your write-up. Include a snippet of the code you wrote
for this part in your write-up.
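As a hedged illustration of part (b), a minimal n-dot-l renderer might look like the sketch below. The function name and argument layout here are assumptions, not the required renderNDotLSphere interface; the only real requirement is that radius and pixel size use consistent units.

```python
import numpy as np

def render_ndotl_sphere(center, radius, light, pxSize, res):
    """Sketch: render a Lambertian sphere under one directional light,
    orthographic camera looking down the z-axis. Illustrative only."""
    w, h = res
    # Pixel-centre coordinates on the sensor, in the same units as radius.
    x = (np.arange(w) - w / 2) * pxSize - center[0]
    y = (np.arange(h) - h / 2) * pxSize - center[1]
    X, Y = np.meshgrid(x, y)                  # both h x w
    Z2 = radius**2 - X**2 - Y**2              # visible surface: Z2 >= 0
    mask = Z2 >= 0
    Z = np.sqrt(np.where(mask, Z2, 0.0))
    # Unit normal is (X, Y, Z) / radius; shade with n . l, clamped at 0.
    img = np.clip((X * light[0] + Y * light[1] + Z * light[2]) / radius,
                  0.0, None)
    img[~mask] = 0.0
    return img
```

For the actual question, this would be called once per lighting direction with a 0.75 cm radius, 7 µm pixels and a 3840 × 2160 sensor.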
Inverting the image formation model. Armed with this knowledge, we will now invert the image
formation process given the lighting directions. Seven images of a face lit from different directions
are given to us, along with the ground-truth directions of light sources. These images are taken at
lighting directions such that there is not a lot of occlusion or normals being directed opposite to the
lighting, so we will ignore these effects. The images are in the data/ directory, named input_n.tif for the nth image. The source directions are given in the file data/sources.npy.
(c, 5 points) Loading data. In the function loadData, read the images into Python. Convert the
RGB images into the XYZ color space and extract the luminance channel. Vectorize these luminance images and stack them in a 7 × P matrix, where P is the number of pixels in each image.
This is the matrix I, which is given to us by the camera. Next, load the sources file and convert it
to a 3 × 7 matrix L. For this question, include a screenshot of your function loadData.
These images are 16-bit .tifs, and it’s important to preserve that bit depth while reading the
images for good results. Therefore, make sure that the datatype of your images is uint16 after
loading them.
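A sketch of the vectorize-and-stack step, assuming the images are already in memory as uint16 arrays (the helper name and the explicit sRGB-to-Y coefficients are illustrative; a library conversion such as skimage.color.rgb2xyz would work equally well):

```python
import numpy as np

# Y row of the linear-sRGB -> CIE XYZ matrix; Y is the luminance channel.
RGB_TO_Y = np.array([0.2126729, 0.7151522, 0.0721750])

def images_to_I(imgs):
    """Sketch of the stacking step of loadData (reading the .tif files,
    e.g. with skimage.io.imread, is omitted): given n H x W x 3 uint16
    images, return the n x P luminance matrix I."""
    rows = []
    for im in imgs:
        assert im.dtype == np.uint16            # keep the 16-bit depth
        lum = im.astype(np.float64) @ RGB_TO_Y  # luminance (Y of XYZ)
        rows.append(lum.ravel())                # vectorize H x W -> P
    return np.stack(rows)                       # n x P
```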
(d, 10 points) Initials. Recall that in general, we also need to consider the reflectance, or albedo,
of the surface we’re reconstructing. We find it convenient to group the normals and albedos (both
of which are a property of only the surface) into a pseudonormal b = a · n, where a is the scalar
albedo. We then have the relation I = LᵀB, where the 3 × P matrix B is the set of pseudonormals in the images. With this model, explain why the rank of I should be 3. Perform a singular
value decomposition of I and report the singular values in your write-up. Do the singular values
agree with the rank-3 requirement? Explain this result and include the code snippet in the write-up.
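A quick synthetic check of the rank-3 claim (illustrative, not the required snippet): a noise-free I = LᵀB is exactly rank 3, so only its first three singular values are nonzero up to round-off.

```python
import numpy as np

rng = np.random.default_rng(0)
L = rng.standard_normal((3, 7))    # 7 lighting directions
B = rng.standard_normal((3, 100))  # pseudonormals at 100 pixels
I = L.T @ B                        # synthetic, noise-free images
s = np.linalg.svd(I, compute_uv=False)
print(s)  # three large values, then values at the level of round-off
```

On real images, noise and model violations make the trailing singular values small but nonzero rather than exactly zero.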
(e, 20 points) Estimating pseudonormals. Since we have more measurements (7 per pixel) than
variables (3 per pixel), we will estimate the pseudonormals in a least-squares sense. Note that
there is a linear relation between I and B through L: therefore, we can write a linear system of
the form Ax = y out of the relation I = LᵀB and solve it to get B. Solve this linear system
in the function estimatePseudonormalsCalibrated. In your write-up, mention how you
constructed the matrix A and the vector y along with the code for the function you wrote.
Note that the matrices you end up creating here might be huge and might not fit in your computer's
memory. In that case, to prevent your computer from freezing or crashing, make sure
to use the sparse module from scipy. You might also want to use the sparse linear system
solver and Kronecker product from the same module.
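One sparse construction that fits the hint can be sketched as follows, assuming B is stacked column-major with one pseudonormal per pixel (the function name is illustrative, not the required estimatePseudonormalsCalibrated):

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import lsqr

def estimate_pseudonormals_sketch(I, L):
    """Per pixel p we have L^T b_p = i_p; a Kronecker product stacks all
    P of these small systems into one sparse system A x = y."""
    n, P = I.shape
    A = sparse.kron(sparse.identity(P), sparse.csr_matrix(L.T))  # (nP, 3P)
    y = I.ravel(order="F")            # column-major: pixel after pixel
    x = lsqr(A, y)[0]                 # sparse least-squares solve
    return x.reshape((3, P), order="F")
```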
(f, 10 points) Albedos and normals. Estimate per-pixel albedos and normals from matrix B in
estimateAlbedosNormals. Note that the albedos are the magnitudes of the pseudonormals
by definition. Calculate the albedos, reshape them into the original size of the images and display the resulting image in the function displayAlbedosNormals. Include the image in your
write-up and comment on any unusual or unnatural features you may find in the albedo image, and
on why they might be happening. Make sure to display in the gray colormap.
The per-pixel normals can be viewed as an RGB image. Reshape the estimated normals into an
image with 3 channels and display it in the function displayAlbedosNormals. Note that the
components of these normals will have values in the range [−1, 1]. You will need to rescale them
so that they lie in [0, 1] to display them properly as RGB images. Include this image in the write-up. Do the normals match your expectation of the curvature of the face? Make sure to display in
the rainbow colormap. Your results should look like that in Fig. 4. Include the code snippets.
Figure 4: Calibrated photometric stereo results. Estimated albedo (×10) and normals.
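The core of part (f) can be sketched in a few lines, assuming B is a 3 × P array (the helper name is illustrative):

```python
import numpy as np

def albedos_normals(B):
    """Sketch of estimateAlbedosNormals: albedos are the magnitudes of
    the pseudonormals, normals the unit directions."""
    albedos = np.linalg.norm(B, axis=0)
    normals = B / np.maximum(albedos, 1e-12)  # guard zero-albedo pixels
    return albedos, normals

# For display: reshape each to the image size, and map normal components
# from [-1, 1] to [0, 1] with (normals + 1) / 2 before calling imshow.
```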
(g, 5 points) Normals and depth. We will now estimate from the normals the actual shape of the
face. Represent the shape of the face as a 3D depth map given by a function z = f(x, y). Let the
normal at the point (x, y) be n = (n1, n2, n3). Explain, in your write-up, why n is related to the
partial derivatives of f at (x, y): fx = ∂f(x, y)/∂x = −n1/n3 and fy = ∂f(x, y)/∂y = −n2/n3.
You may consider the 1D case where z = f(x).
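A compact version of the suggested 1D argument (using only the fact that the normal is orthogonal to the tangent of the curve z = f(x)):

```latex
% Tangent of the curve z = f(x):  t = (1, f'(x)).
% Any normal n = (n_1, n_3) satisfies n \cdot t = 0:
n_1 + n_3 \, f'(x) = 0 \quad \Longrightarrow \quad f'(x) = -\frac{n_1}{n_3}.
% Applying the same argument along x and along y in 2D gives
% f_x = -n_1/n_3 and f_y = -n_2/n_3.
```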
Normal integration. Given that the normals represent the derivatives of the depth map, we will
integrate the normals obtained in (g). We will use a special case of the Frankot-Chellappa algorithm for normal integration, as given in [1]. You can read the paper for the general version of the
normal integration algorithm.
(h, 5 points) Understanding integrability of gradients. Consider the 2D, discrete function g on
space given by the matrix below. Find its x and y gradient, given that gradients are calculated as
gx(xi, yj) = g(xi+1, yj) − g(xi, yj) for all i, j (and similarly for y). Let us define (0, 0) as the top
left, with x going in the horizontal direction and y in the vertical.
    g = [  1   2   3   4
           5   6   7   8
           9  10  11  12
          13  14  15  16 ]                                          (1)
Note that we can reconstruct the whole of g from the values at its boundaries using gx and gy.
Given that g(0, 0) = 1, perform these two procedures:
1. Use gx to construct the first row of g, then use gy to construct the rest of g;
2. Use gy to construct the first column of g, then use gx to construct the rest of g.
Are these the same?
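The two procedures above can be sketched directly on the matrix of Eq. 1 (function names are illustrative):

```python
import numpy as np

g = np.arange(1.0, 17.0).reshape(4, 4)
gx = g[:, 1:] - g[:, :-1]   # forward differences along x (columns)
gy = g[1:, :] - g[:-1, :]   # forward differences along y (rows)

def row_first(gx, gy, g00):
    """Procedure 1: gx builds the first row, gy fills in the rest."""
    out = np.zeros((gy.shape[0] + 1, gx.shape[1] + 1))
    out[0, 0] = g00
    out[0, 1:] = g00 + np.cumsum(gx[0])
    out[1:, :] = out[0] + np.cumsum(gy, axis=0)
    return out

def col_first(gx, gy, g00):
    """Procedure 2: gy builds the first column, gx fills in the rest."""
    out = np.zeros((gy.shape[0] + 1, gx.shape[1] + 1))
    out[0, 0] = g00
    out[1:, 0] = g00 + np.cumsum(gy[:, 0])
    out[:, 1:] = out[:, [0]] + np.cumsum(gx, axis=1)
    return out
```

For the integrable gradients of Eq. 1 both procedures recover g; perturbing a single entry of gx makes them disagree.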
Note that these were two ways to reconstruct g from its gradients. Given arbitrary gx and gy, these
two procedures will not give the same answer, and therefore this pair of gradients does not correspond to a true surface. Integrability implies that the value of g estimated in both these ways (or
any other way you can think of) is the same. How can we modify the gradients you calculated
above to make gx and gy non-integrable? Why may the gradients estimated in the way of (g) be
non-integrable? Note all this down in your write-up.
The Frankot-Chellappa algorithm for estimating shapes from their gradient first projects the (possibly non-integrable) gradients obtained in (h) above onto the set of all integrable gradients. The
resulting (integrable) projection can then be used to solve for the depth map. The function integrateFrankot in utils.py implements this algorithm given the two surface gradients.
(i, 10 points) Shape estimation. Write a function estimateShape to apply the Frankot-Chellappa algorithm to your estimated normals. Once you have the function f(x, y), plot it as a
surface in the function plotSurface and include some significant viewpoints in your write-up.
The 3D projection from mpl_toolkits.mplot3d.Axes3D along with the function plot_surface might be of help. Make sure to plot this in the coolwarm colormap. The result we
expect of you is shown in Fig. 1. Include a snippet of the code you wrote for this part in your
write-up.
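The gradient-construction half of estimateShape can be sketched as follows; integrateFrankot itself is provided in utils.py, and the helper name here is illustrative:

```python
import numpy as np

def normals_to_gradients(normals, shape):
    """Turn 3 x P unit normals into the surface gradients zx, zy that a
    Frankot-Chellappa integrator expects, using f_x = -n1/n3 from (g)."""
    n1, n2, n3 = (normals[i].reshape(shape) for i in range(3))
    zx = -n1 / n3
    zy = -n2 / n3
    return zx, zy
```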
2 Uncalibrated photometric stereo (50 points)
Figure 5: 3D reconstruction result from uncalibrated photometric stereo
We will now put aside the lighting direction and estimate shape directly from the images of the
face. As before, load the lights into L0 and images in the matrix I. We will not use the matrix L0
for anything except as a comparison against our estimate in this part. All code for this question
goes in src/q2.py.
(a, 10 points) Uncalibrated normal estimation. Recall the relation I = LᵀB. Here, we know
neither L nor B. Therefore, this is a matrix factorization problem with the constraints that, with the
estimated L̂ and B̂, the rank of Î = L̂ᵀB̂ is 3 (as you answered in the previous question), and that the
estimated L̂ and B̂ have appropriate dimensions.
It is well known that the best rank-k approximation to an m × n matrix M, where k ≤ min{m, n},
is calculated as follows: perform a singular value decomposition M = UΣVᵀ, set all
singular values except the top k in Σ to 0 to get the matrix Σ̂, and reconstitute M̂ = UΣ̂Vᵀ.
Explain in your write-up how this can be used to construct a factorization of the form detailed
above following the required constraints.
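One way to realize such a factorization, sketched under the assumption that the singular values are split evenly between the two factors (other splits are equally valid, which is part of the ambiguity explored below):

```python
import numpy as np

def factorize_rank3(I):
    """One valid rank-3 factorization I ~= Lhat^T Bhat via the SVD;
    sqrt(Sigma) is assigned to each side, an arbitrary choice."""
    U, s, Vt = np.linalg.svd(I, full_matrices=False)
    Lhat = (U[:, :3] * np.sqrt(s[:3])).T     # 3 x n_images
    Bhat = np.sqrt(s[:3])[:, None] * Vt[:3]  # 3 x P
    return Lhat, Bhat
```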
(b, 10 points) Calculation and visualization. With your method, estimate the pseudonormals
B̂ in estimatePseudonormalsUncalibrated, include a snippet of this function in your
write-up and visualize the resultant albedos and normals in the gray and rainbow colormaps
respectively. Include these too in the write-up. A sample result has been shown in Fig. 6.
Figure 6: Uncalibrated photometric stereo. Estimated albedo and normals.
(c, 5 points) Comparing to ground truth lighting. In your write-up, compare the L̂ estimated
by the factorization above to the ground truth lighting directions given in L0. Are they similar?
Unless a special choice of factorization is made, they will be different. Describe a simple change
to the procedure in (a) that changes L̂ and B̂ but keeps the images rendered from them the
same, using only the matrices you calculated during the singular value decomposition (U/S/V). (No
need to code anything for this part, just describe the change in the write-up)
(d, 5 points) Reconstructing the shape, attempt 1. Use the given implementation of the Frankot-Chellappa algorithm from the previous question to reconstruct a 3D depth map and visualize it as
a surface in the ‘coolwarm’ colormap as in the previous question. Does this look like a face?
Enforcing integrability explicitly. The ambiguity you demonstrated in (c) can be (partially) resolved by explicitly imposing integrability on the pseudonormals recovered from the factorization. We will follow the approach in [2] to transform our estimated pseudonormals into a set of
pseudonormals that are integrable. The function enforceIntegrability in utils.py implements this process.
(e, 5 points) Reconstructing the shape, attempt 2. Input your pseudonormals into the enforceIntegrability function, use the output pseudonormals to estimate the shape with the
Frankot-Chellappa algorithm and plot a surface as in the previous questions (results shown in Fig.
5). Does this surface look like the one output by calibrated photometric stereo? Include at least
three viewpoints of the surface and your answers in your write-up.
Hint After integration, your surface may look ‘inside-out’. You can deal with this by applying
the following GBR (generalized bas-relief) transform matrix to your pseudonormals:

    G = [ 1   0   0
          0   1   0
          0   0  −1 ]
Check below for the general form of G. Once this transformation is applied, make sure to recompute the
normals and reintegrate the surface afterwards.
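Assuming the hinted matrix is the z-flip diag(1, 1, −1), applying it amounts to negating the z components of the pseudonormals; a minimal sketch:

```python
import numpy as np

G = np.diag([1.0, 1.0, -1.0])  # the 'inside-out' flip (an assumption here)

def apply_flip(B):
    """B is 3 x P pseudonormals; returns the flipped set. Remember to
    recompute the normals and reintegrate the surface afterwards."""
    return G @ B
```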
The generalized bas-relief ambiguity. Unfortunately, the procedure in (e) resolves the ambiguity
only up to a matrix of the form
    G = [ 1   0   0
          0   1   0
          µ   ν   λ ]                                               (2)

where λ > 0, as shown in [3]. This means that for any µ, ν and λ > 0, if B is a set of integrable
pseudonormals producing a given appearance, G⁻ᵀB is another set of integrable pseudonormals
producing the same appearance. The ambiguity introduced in uncalibrated photometric stereo by
matrices of this kind is called the generalized bas-relief ambiguity (French, translates to 'low-relief', pronounced 'bah ree-leef').
(f, 5 points) Why low relief? Vary the parameters µ, ν and λ in the bas-relief transformation
and visualize the corresponding surfaces. Include at least six (two with each parameter varied) of
the significant ones in your write-up. Looking at these, what is your guess for why the bas-relief
ambiguity is so named? In your write-up, describe how the three parameters affect the surface.
(g, 5 points) Flattest surface possible. With the bas-relief ambiguity, in Eq. 2, how would you
design a transformation that makes the estimated surface as flat as possible?
(h, 5 points) More measurements. We solved the problem with 7 pictures of the face. Will
acquiring more pictures from more lighting directions help resolve the ambiguity?
Credits This homework was, in large part, inspired by the photometric stereo homework
in Computational Photography (15-463) offered by Prof. Ioannis Gkioulekas at CMU. The data
comes from the course on Computer Vision offered by Prof. Todd Zickler at Harvard.
References
[1] R. T. Frankot and R. Chellappa. A method for enforcing integrability in shape from shading
algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, 10(4):439–451,
July 1988.
[2] Alan Yuille and Daniel Snow. Shape and albedo from multiple images using integrability. In
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 158–164. IEEE, 1997.
[3] P. N. Belhumeur, D. J. Kriegman, and A. L. Yuille. The bas-relief ambiguity. In Proceedings
of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages
1060–1066, June 1997.