Solved CSCI 5561: Project #1 Histogram of Oriented Gradients (HOG) Fall 2025

HOG

Figure 1: Histogram of oriented gradients. The HOG feature is extracted and visualized for (a) the entire image and (b) a zoomed-in region. The orientation and magnitude of the red lines represent the gradient components in a local cell.

In this assignment, you will implement in Python a variant of HOG (Histogram of Oriented Gradients), proposed by Dalal and Triggs [1] (2015 Longuet-Higgins Prize winner). Until deep learning, HOG was the long-standing top representation for object detection, notably in the deformable part model, where it is combined with an SVM classifier [2]. Given an input image, your algorithm will compute the HOG feature and visualize it as shown in Figure 1 (the line directions are perpendicular to the gradient to show edge alignment). You should begin by implementing the functions in Sections 2.1 through 2.4.

def extract_hog(image, cell_size=8, block_size=2):
    …
    return hog

Input: A grayscale image in uint8 format.
Output: The HOG descriptor.
Description: You will compute the HOG descriptor of the input image. The pseudo-code is given below.

Algorithm 1 HOG
1: Convert the grayscale image to float format and normalize to the range [0, 1].
2: Get the differential images using get_differential_filter and filter_image.
3: Compute the gradients using get_gradient.
4: Build the histogram of oriented gradients for all cells using build_histogram.
5: Build the descriptor of all blocks with normalization using get_block_descriptor.
6: Return a long vector (hog) by concatenating all block descriptors.

2.1 Image filtering

Figure 2: (a) Input image of dimension m × n. (b-c) Differential images along the x and y directions.

def get_differential_filter():
    …
    return filter_x, filter_y

Input: None.
Output: filter_x and filter_y are 3×3 filters that differentiate along the x and y directions, respectively.
Description: You will compute the gradient by differentiating the image along the x and y directions. This function returns the differential filters.

def filter_image(image, filter):
    …
    return image_filtered

Input: image is the grayscale m × n image (Figure 2(a)) converted to float format, and filter is a k × k matrix.
Output: image_filtered is the m × n filtered image. You may need to zero-pad the boundary of the input image to get a filtered image of the same size.
Description: Given an image and a filter, you will compute the filtered image. With the two functions above, you can generate the differential images and visualize the magnitude of the filter response as shown in Figure 2(b) and 2(c).

2.2 Gradient Computation

Figure 3: Visualization of (a) the magnitude and (b) the orientation (angle, color bar from 0 to 180) of the image gradients. (c-e) Visualization of the gradients at every 3rd pixel for (c) the whole image, (d) the zoomed eye, and (e) the zoomed neck (the magnitudes are re-scaled for illustrative purposes).

def get_gradient(image_dx, image_dy):
    …
    return grad_mag, grad_angle

Input: image_dx and image_dy are the x and y differential images (size: m × n).
Output: grad_mag and grad_angle are the magnitude and orientation of the gradient images (size: m × n). Note that the range of the angle should be [0, π), i.e., the angle is unsigned (θ and θ + π are treated as the same orientation).
Description: Given the differential images, you will compute the magnitude and angle of the gradient. The gradients give a sense of the image structure: the magnitude of the gradient is proportional to the contrast (edge strength) of the local patch, and the orientation is perpendicular to the edge direction, as shown in Figure 3.
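As a rough illustration of Sections 2.1 and 2.2, the sketch below shows one possible shape for the three functions. It is not the reference solution. The choice of Sobel kernels is an assumption (the assignment only asks for 3×3 differential filters), and the filtering is written as cross-correlation with zero padding, which is one of several valid conventions.

```python
import numpy as np


def get_differential_filter():
    # Assumption: Sobel kernels. Any 3x3 differential filter would fit the spec.
    filter_x = np.array([[-1.0, 0.0, 1.0],
                         [-2.0, 0.0, 2.0],
                         [-1.0, 0.0, 1.0]])
    filter_y = filter_x.T  # differentiates along y (rows)
    return filter_x, filter_y


def filter_image(image, filter):
    # The parameter name `filter` follows the assignment signature
    # (it shadows the Python builtin inside this function only).
    k = filter.shape[0]
    pad = k // 2
    # Zero-pad the boundary so the output keeps the m x n input size.
    padded = np.pad(image, pad, mode="constant", constant_values=0.0)
    m, n = image.shape
    image_filtered = np.zeros((m, n), dtype=float)
    # Cross-correlation: accumulate one shifted, weighted copy per filter tap.
    for i in range(k):
        for j in range(k):
            image_filtered += filter[i, j] * padded[i:i + m, j:j + n]
    return image_filtered


def get_gradient(image_dx, image_dy):
    grad_mag = np.sqrt(image_dx ** 2 + image_dy ** 2)
    # arctan2 returns angles in [-pi, pi]; taking them modulo pi makes them unsigned.
    grad_angle = np.mod(np.arctan2(image_dy, image_dx), np.pi)
    # Guard against floating-point round-up to exactly pi.
    grad_angle[grad_angle >= np.pi] = 0.0
    return grad_mag, grad_angle


# Usage on a synthetic horizontal ramp (intensity grows along x only).
image = np.tile(np.arange(6, dtype=float), (5, 1))
filter_x, filter_y = get_differential_filter()
image_dx = filter_image(image, filter_x)
image_dy = filter_image(image, filter_y)
grad_mag, grad_angle = get_gradient(image_dx, image_dy)
```

For the ramp image, interior pixels have a purely horizontal gradient, so their angle is 0. Pixels on the border respond to the zero padding and should be read with that in mind.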
2.3 Orientation Binning

Figure 4: (a) The histogram of oriented gradients is built by (b) binning the gradients into the corresponding orientation bin, giving one histogram (sum of magnitudes per bin) per cell. In (a), the image is divided into M × N cells of size cell_size, the gradient magnitude at each pixel (u, v) is accumulated in its cell, and the shaded area that does not fill a whole cell is ignored.

def build_histogram(grad_mag, grad_angle, cell_size):
    …
    return ori_histo

Input: grad_mag and grad_angle are the magnitude and orientation of the gradient images (size: m × n); cell_size is the size of each cell, which is a positive integer.
Output: ori_histo is a 3D tensor of size M × N × 6, where M and N are the number of cells along the y and x axes, respectively, i.e., M = ⌊m/cell_size⌋ and N = ⌊n/cell_size⌋, where ⌊·⌋ is the floor operation, as shown in Figure 4(a).
Description: Given the magnitude and orientation of the gradients per pixel, you can build the histogram of oriented gradients for each cell:

$$ \text{ori\_histo}(i, j, k) = \sum_{(u,v) \in C_{i,j}} \text{grad\_mag}(u, v) \quad \text{if } \text{grad\_angle}(u, v) \in \theta_k \qquad (1) $$

where C_{i,j} is the set of x and y coordinates within the (i, j) cell, and θ_k is the angle range of each bin, e.g., θ_1 = [165°, 180°) ∪ [0°, 15°), θ_2 = [15°, 45°), θ_3 = [45°, 75°), θ_4 = [75°, 105°), θ_5 = [105°, 135°), and θ_6 = [135°, 165°). Therefore, ori_histo(i,j,:) returns the histogram of the oriented gradients at the (i, j) cell, as shown in Figure 4(b). Using ori_histo, you can visualize the HOG per cell, where the length of each line is proportional to the corresponding histogram value, as shown in Figure 1. A typical cell_size is 8.

2.4 Block Normalization

Figure 5 (panels: (a) Block descriptor, (b) Block overlap with stride 1): HOG is normalized to account for illumination and contrast to form a descriptor for a block. (a) The HOG within block (1,1) is concatenated and normalized to form a long vector of size 24.
(b) The same applies to the remaining blocks, with overlap and stride 1, to form the normalized HOG.

def get_block_descriptor(ori_histo, block_size):
    …
    return ori_histo_normalized

Input: ori_histo is the histogram of oriented gradients without normalization. block_size is the size of each block (i.e., the number of cells in each row/column), which is a positive integer.
Output: ori_histo_normalized is the normalized histogram (size: (M − (block_size − 1)) × (N − (block_size − 1)) × (6 × block_size²)).
Description: To account for changes in illumination and contrast, the gradient strengths must be locally normalized, which requires grouping the cells together into larger, spatially connected blocks (adjacent cells). Given the histogram of oriented gradients, you apply L2 normalization as follows:
1. Build a descriptor of the first block by concatenating the HOG within the block. You can use block_size=2, i.e., a 2 × 2 block will contain 2 × 2 × 6 entries that are concatenated to form one long vector, as shown in Figure 5(a).
2. Normalize the descriptor as follows:

$$ \hat{h}_i = \frac{h_i}{\sqrt{\sum_i h_i^2 + e^2}} \qquad (2) $$

where h_i is the i-th element of the histogram and \hat{h}_i is the i-th element of the normalized histogram. e is the normalization constant that prevents division by zero (e.g., e = 0.001).
3. Assign the normalized histogram to ori_histo_normalized(1,1) (white dot location in Figure 5(a)).
4. Move to the next block ori_histo_normalized(1,2) with stride 1 and repeat steps 1-3 above.
With block_size=2, the resulting ori_histo_normalized will have size (M − 1) × (N − 1) × 24.

2.5 Application: Face Detection

Figure 6: You will use (a) a single template image to detect faces in (b) the target image using HOG descriptors.
(c) HOG descriptors from the template and target image patches are compared using normalized cross-correlation (NCC), which gives a response map. (d) Thresholding on the NCC score produces many overlapping bounding boxes. (e) The correct bounding boxes for faces are obtained by non-maximum suppression.

Using the HOG descriptor, you will design a face detection algorithm. The template and target images can be found in assets.zip on the Canvas assignment.

def face_detection(I_target, I_template):
    …
    return bounding_boxes

Input: I_target is the image that contains multiple faces. I_template is the template face image that will be matched to the target image to detect faces.
Output: bounding_boxes is an n × 3 array that describes the n detected bounding boxes. Each row of the array is [x_i, y_i, s_i], where (x_i, y_i) is the top-left corner coordinate of the i-th bounding box and s_i is the normalized cross-correlation (NCC) score between the bounding box patch and the template:

$$ s = \frac{a \cdot b}{\|a\| \|b\|} \qquad (3) $$

where a and b are two normalized descriptors, i.e., zero mean:

$$ a_i = \bar{a}_i - \tilde{a} \qquad (4) $$

where a_i is the i-th element of a, \bar{a}_i is the i-th element of the HOG descriptor, and \tilde{a} is the mean of the HOG descriptor.
Description: You will use thresholding and non-maximum suppression with an IoU of 50% to localize the faces. You may use def visualize_face_detection(I_target, bounding_boxes, box_size) to visualize your detections.

2.6 [Bonus] Face Detection with Varying Sizes

Extend your HOG-based face detector to handle faces at multiple scales. You will search over an image pyramid and perform detection at each scale, followed by non-maximum suppression (NMS) across all scales. The test image is provided as bonus.jpg in assets.zip on the Canvas assignment.

def face_detection_bonus(I_target, I_template):
    …
    return bounding_boxes

Input: I_target is the image that contains multiple faces.
I_template is the template face image to be matched. You may also experiment with different templates if desired.
Output: bounding_boxes is an n × 3 array that describes the n detected bounding boxes. Each row of the array is [x_i, y_i, s_i], where (x_i, y_i) is the top-left corner coordinate of the i-th bounding box and s_i is the normalized cross-correlation (NCC) score.

Submission (by Email Only)
Do not submit the bonus to Gradescope. Instead, email the following to lee04484@umn.edu:
• Your script p1.py
• Your detection result image bonus_result.jpg
Email subject: [CSCI5561] P1 Bonus –
Bonus credit: 2 points.

References
[1] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR, 2005.
[2] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. TPAMI, 2010.
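
Supplementary sketch for Section 2.5: the NCC score of Equations (3) and (4) and the non-maximum suppression step can be illustrated as below. This is only a minimal sketch under stated assumptions, not the reference solution. The helper names ncc_score, box_iou, and non_maximum_suppression are hypothetical, the boxes are assumed to be squares of side box_size (matching the single box_size argument of visualize_face_detection), and the greedy suppression strategy is one common choice rather than something the assignment prescribes.

```python
import numpy as np


def ncc_score(a_raw, b_raw):
    # Equation (4): subtract the mean of each descriptor.
    a = a_raw - a_raw.mean()
    b = b_raw - b_raw.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Assumption: a constant descriptor (zero norm) is given a score of 0.
    if denom == 0:
        return 0.0
    # Equation (3): normalized dot product of the zero-mean descriptors.
    return float(np.dot(a, b) / denom)


def box_iou(box_a, box_b, box_size):
    # Boxes are [x, y, s] rows; (x, y) is the top-left corner.
    # Assumption: every box is a square of side box_size.
    x_overlap = max(0.0, box_size - abs(box_a[0] - box_b[0]))
    y_overlap = max(0.0, box_size - abs(box_a[1] - box_b[1]))
    intersection = x_overlap * y_overlap
    union = 2.0 * box_size ** 2 - intersection
    return intersection / union


def non_maximum_suppression(boxes, box_size, iou_threshold=0.5):
    # Greedy NMS: visit boxes from highest to lowest score and keep a box
    # only if its IoU with every already-kept box is below the threshold.
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 3)
    order = np.argsort(-boxes[:, 2])
    kept = []
    for idx in order:
        if all(box_iou(boxes[idx], boxes[k], box_size) < iou_threshold
               for k in kept):
            kept.append(idx)
    return boxes[kept]


# Usage: the second box overlaps the first heavily and has a lower score,
# so it is suppressed; the third box is far away and survives.
candidates = [[10, 10, 0.9], [12, 11, 0.8], [100, 50, 0.7]]
bounding_boxes = non_maximum_suppression(candidates, box_size=50)
```

Because the mean is subtracted before normalization, the score is unchanged when a descriptor is scaled by a positive factor or shifted by a constant, which is what makes NCC tolerant to such changes in the descriptors.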