In part 1, I took 6 photos that would later be used for the mosaics. Because the homography model assumes a fixed viewpoint, I rotated the camera about its optical center while taking the photos. The scenes had many corners, which would make for good correspondence points.
Part 2 required me to recover the parameters of the homography transformation between image 1 and image 2. Given a set of correspondence points between the two images, I set up the following system of equations (source: Bill Zheng's Project 4A website):
With 4 pairs of correspondence points, this system has a unique solution. However, with only 4 points the recovered homography would be prone to noise, so I used 10 correspondence points and solved the overdetermined system with least squares via np.linalg.lstsq().
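A minimal sketch of this least-squares setup: fixing H[2,2] = 1 leaves 8 unknowns, and each correspondence contributes two linear equations. The function name here is my own; the exact row layout follows the standard derivation referenced above.

```python
import numpy as np

def compute_homography(pts1, pts2):
    """Estimate H mapping pts1 -> pts2 via least squares.

    pts1, pts2: (N, 2) arrays of corresponding points, N >= 4.
    Fixes H[2, 2] = 1, leaving 8 unknowns for np.linalg.lstsq.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(pts1, pts2):
        # Two equations per correspondence, derived from
        # x' = (ax + by + c) / (gx + hy + 1), y' analogously.
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```

With 10 correspondences the system is 20 equations in 8 unknowns, and lstsq returns the minimum-residual solution.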
In part 3, I warped image 1 using the homography I calculated in part 2. First, I take the 4 corners of image 1 and compute the final bounding box by transforming them with H. Then, I take the inverse of H and perform inverse warping over the final bounding box, similar to what I did in Project 3. I visualized the result of warping image 1 below:
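The warp can be sketched as follows. This is a simplified, numpy-only version for grayscale images that uses nearest-neighbor sampling; an actual implementation would typically interpolate.

```python
import numpy as np

def warp_image(img, H):
    """Inverse-warp a grayscale img by homography H (source -> output coords).

    Returns the warped image and the (x, y) offset of its bounding box.
    """
    h, w = img.shape
    # Forward-map the four corners to find the output bounding box.
    corners = np.array([[0, 0, 1], [w - 1, 0, 1],
                        [0, h - 1, 1], [w - 1, h - 1, 1]]).T
    mapped = H @ corners
    mapped = mapped[:2] / mapped[2]
    x_min, y_min = np.floor(mapped.min(axis=1)).astype(int)
    x_max, y_max = np.ceil(mapped.max(axis=1)).astype(int)
    out_h, out_w = y_max - y_min + 1, x_max - x_min + 1
    # Inverse-map every output pixel back into the source image.
    xs, ys = np.meshgrid(np.arange(x_min, x_max + 1), np.arange(y_min, y_max + 1))
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = np.linalg.inv(H) @ pts
    sx = np.round(src[0] / src[2]).astype(int)
    sy = np.round(src[1] / src[2]).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros((out_h, out_w), dtype=img.dtype)
    out[(ys.ravel() - y_min)[valid], (xs.ravel() - x_min)[valid]] = img[sy[valid], sx[valid]]
    return out, (x_min, y_min)
```

The returned offset records where the warped image sits relative to the original coordinate frame, which becomes important when stitching in part 5.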
In part 4, I used the warping function from part 3 to "rectify" an image. This requires taking a photo of an object with known rectangular or square properties as image 1 and defining the correspondence points in image 2 as a correctly aligned rectangle or square. Below I visualized an outlet and a coaster being rectified.
In part 5, I create an image mosaic by combining image stitching and blending. To stitch the images together, I first compute the new corners of image 1 after applying the homography. Then, I find the minimum and maximum coordinates among the transformed image 1 corners and the corners of image 2 (unwarped) to compute the final bounding box. Lastly, I warp image 1 and overlay it onto image 2 in the final bounding box (using shifts to ensure the two images are positioned correctly). However, the stitched image will have strong edge artifacts from directly overlaying the images. To resolve this, I blend the images by applying a mask that averages pixels that are valid in both images and uses a pixel's value directly if it is non-zero in only one image.
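The blending step can be sketched like this, assuming both images have already been placed on same-size canvases and that zero-valued pixels mark empty canvas regions (a known limitation of this simple mask for genuinely black pixels):

```python
import numpy as np

def blend_overlay(canvas1, canvas2):
    """Average pixels valid in both canvases; copy pixels valid in one.

    Treats zero as 'no data', matching the simple mask described above.
    """
    m1 = canvas1 > 0
    m2 = canvas2 > 0
    out = np.where(m1 & ~m2, canvas1, 0.0)        # only image 1 present
    out = np.where(m2 & ~m1, canvas2, out)        # only image 2 present
    out = np.where(m1 & m2, (canvas1 + canvas2) / 2.0, out)  # overlap: average
    return out
```

A two-band or feathered blend would reduce the remaining seam further, but the averaging mask already removes the strongest edge artifacts.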
In part 6, I use Harris corners as interest points, which enable me to automatically select correspondence points between two images. I use the provided function get_harris_corners, which identifies the coordinates of corners in the scene and discards points along the borders of the image. Below I visualized the Harris corners overlaid onto my two images.
In part 7, I run adaptive non-maximal suppression (ANMS) to select only the 500 strongest, well-distributed corners. ANMS works as follows: for each point, I find the set of candidate points whose Harris strength, scaled by 0.9, exceeds the strength at the original point. I then compute the squared distance between each of these candidates and my original point; the minimum becomes the "suppression radius" for the original point. Finally, I pick the 500 points with the largest suppression radii (the global maximum has a suppression radius of infinity).
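The procedure above can be sketched as an O(N^2) loop (the function name and vectorization are mine, not from the original code):

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Adaptive non-maximal suppression.

    coords: (N, 2) corner coordinates; strengths: (N,) Harris responses.
    Keeps the n_keep points with the largest suppression radii.
    """
    n = len(coords)
    radii = np.full(n, np.inf)  # global maximum keeps radius = infinity
    for i in range(n):
        # Candidates sufficiently stronger than point i.
        stronger = c_robust * strengths > strengths[i]
        if stronger.any():
            d2 = np.sum((coords[stronger] - coords[i]) ** 2, axis=1)
            radii[i] = d2.min()  # squared suppression radius
    keep = np.argsort(-radii)[:n_keep]
    return coords[keep]
```

Ranking by suppression radius rather than raw strength is what spreads the kept corners across the whole image instead of clustering them in the most textured region.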
In part 8, once the strongest keypoints have been selected, I extract feature descriptors that enable me to find matching regions between the images. For each point, I extract a 40x40 window around the point, resize the window to 8x8 with anti-aliasing, normalize the feature descriptor, and flatten it to a 1x64 array. Below I have visualized 3 random 40x40 patches surrounding interest points.
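A sketch of the descriptor extraction, using block averaging as a simple stand-in for the anti-aliased resize (an actual implementation might use a library resize with anti-aliasing instead):

```python
import numpy as np

def extract_descriptor(img, y, x, window=40, out_size=8):
    """Axis-aligned descriptor sketch for a grayscale img.

    Takes a window x window patch centred on (y, x), downsamples it to
    out_size x out_size by block averaging (a crude anti-alias filter),
    then bias/gain normalises and flattens to a length-64 vector.
    """
    half = window // 2
    patch = img[y - half:y + half, x - half:x + half].astype(float)
    block = window // out_size  # 40 / 8 = 5-pixel blocks
    small = patch.reshape(out_size, block, out_size, block).mean(axis=(1, 3))
    # Bias/gain normalisation: zero mean, unit standard deviation.
    small = (small - small.mean()) / (small.std() + 1e-8)
    return small.ravel()
```

The bias/gain normalization makes matching robust to brightness and contrast differences between the two photos.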
In part 9, I compare every feature descriptor in img1 to every feature descriptor in img2 using the sum of squared differences (SSD) metric. For each descriptor, I track the pairs with the lowest and second-lowest SSD error and compute the ratio dist1 / dist2. Using Lowe's technique, I reject any pair where this ratio exceeds 0.65, which enforces the principle that the nearest neighbor must be significantly better than the second-nearest neighbor.
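The matching step can be sketched as (function name mine; the 0.65 threshold is from the writeup):

```python
import numpy as np

def match_descriptors(desc1, desc2, ratio=0.65):
    """Match rows of desc1 to rows of desc2 using SSD + Lowe's ratio test.

    Returns (i, j) index pairs where the best match in desc2 is
    sufficiently better than the second best.
    """
    # Pairwise SSD matrix of shape (N1, N2).
    d = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(axis=2)
    matches = []
    for i in range(len(desc1)):
        order = np.argsort(d[i])
        best, second = order[0], order[1]
        if d[i, best] / d[i, second] < ratio:
            matches.append((i, best))
    return matches
```

Ambiguous features (e.g. repeated window corners on a building) have two near-equal nearest neighbors, so their ratio is close to 1 and they are discarded.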
There are noticeably fewer than 500 points with many clear correspondences, but some outliers still remain.
In part 10, I use the RANSAC algorithm to robustly estimate the homography matrix between the two images. I repeatedly sample four point matches (10,000 iterations) to compute candidate homographies, keep the one with the most inliers (matches that fit within an error threshold), and finally recompute the homography using all inliers of that best model.
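A sketch of the loop, assuming a least-squares homography fit like the one described in part 2 (function names, iteration count, and threshold below are illustrative defaults, not the exact values used):

```python
import numpy as np

def fit_homography(pts1, pts2):
    """Least-squares homography from >= 4 correspondences (H[2,2] = 1)."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(pts1, pts2):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def ransac_homography(pts1, pts2, n_iters=1000, thresh=2.0, rng=None):
    """RANSAC: sample 4 matches, fit H, count inliers, refit on the best set."""
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(pts1), dtype=bool)
    homog = np.hstack([pts1, np.ones((len(pts1), 1))])
    for _ in range(n_iters):
        idx = rng.choice(len(pts1), 4, replace=False)
        H = fit_homography(pts1[idx], pts2[idx])
        # Project all of pts1 and measure reprojection error against pts2.
        proj = homog @ H.T
        proj = proj[:, :2] / proj[:, 2:]
        inliers = np.linalg.norm(proj - pts2, axis=1) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Final model: refit on every inlier of the best candidate.
    return fit_homography(pts1[best_inliers], pts2[best_inliers]), best_inliers
```

The final refit over all inliers is what makes the estimate stable: the 4-point candidate models are exact fits to noisy points, while the refit averages out that noise.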
I use the final remaining interest points to compute a homography H, which is used to create a mosaic just like in part (a) of this project.
For bells and whistles, I added multiscale processing for corner detection and feature description. Below I have visualized the different components with multiscale processing.