Images of the Russian Empire: Colorizing the Prokudin-Gorskii photo collection

Josh Barua

Overview

Sergei Mikhailovich Prokudin-Gorskii traveled across the vast Russian Empire taking photographs of everything he saw. He recorded three exposures of every scene onto a glass plate using a red, a green, and a blue filter. The goal of this project is to automatically colorize the photos by aligning the images from the three color channels and stacking them into a single color image.

Methods

Naive Approach: To colorize the photos, we first need to find the best alignment between the color channels. I experimented with two objectives for scoring a candidate displacement: normalized cross-correlation (NCC) and the sum of squared differences (SSD). Both gave similar results, but SSD was faster, so I used it as my objective. A naive way to find the optimal shift is an exhaustive search over [-16,16] pixels along both the x and y axes. This works well for small images but becomes computationally expensive for large ones.
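To make the two objectives and the exhaustive search concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the code used for the results on this page: the function names are hypothetical, shifts are written as (dy, dx), and np.roll is used to shift, which wraps pixels around the border instead of cropping them.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; lower means a better match."""
    d = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(d * d))

def ncc(a, b):
    """Normalized cross-correlation; higher means a better match."""
    a0 = a.astype(np.float64) - a.mean()
    b0 = b.astype(np.float64) - b.mean()
    denom = np.linalg.norm(a0) * np.linalg.norm(b0)
    return float(np.sum(a0 * b0) / denom) if denom > 0 else 0.0

def exhaustive_align(channel, base, radius=16):
    """Try every (dy, dx) in [-radius, radius]^2 and keep the shift with the lowest SSD."""
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Assumption: circular shift; pixels pushed off one edge reappear on the other.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ssd(shifted, base)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

The search evaluates (2 * radius + 1)^2 candidate shifts, each costing one pass over the image, which is why it slows down quickly as the images and the required search range grow.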

Image Pyramid: To address this problem, I built an image pyramid. The idea is that an alignment computed on a coarse, downsampled image is a good approximation that only needs to be refined as the resolution increases. I used an iterative approach that downsamples all the images by a factor of 2 at each step until the pyramid has a height of 4 or the image is smaller than 400x400 pixels. Working back up from the coarsest level, I recompute the alignment at each level with a search window half as large as the previous one (i.e. the 4th level uses [-16,16], the 3rd level uses [-8,8], etc.). I add the shift computed at each level to two global variables that track the total dx and dy displacement, multiplying the totals by 2 at each step up to account for the number of pixels along each axis doubling when upsampling.
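The coarse-to-fine procedure can be sketched as follows. This is a hedged illustration, not the original implementation: the names pyramid_align and search are hypothetical, downsampling is done by keeping every second pixel (a real implementation would probably blur first or use a library rescale function), and shifts wrap around via np.roll.

```python
import numpy as np

def ssd(a, b):
    d = a - b
    return float(np.sum(d * d))

def search(channel, base, radius):
    """Exhaustive SSD search over (dy, dx) in [-radius, radius]^2."""
    best, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            score = ssd(np.roll(channel, (dy, dx), axis=(0, 1)), base)
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best

def pyramid_align(channel, base, max_levels=4, min_size=400, coarse_radius=16):
    """Coarse-to-fine alignment; returns the total (dy, dx) at full resolution."""
    # levels[0] is full resolution, levels[-1] is the coarsest.
    levels = [(channel.astype(np.float64), base.astype(np.float64))]
    while len(levels) < max_levels and min(levels[-1][0].shape) >= min_size:
        c, b = levels[-1]
        # Assumption: plain decimation by 2, with no anti-aliasing filter.
        levels.append((c[::2, ::2], b[::2, ::2]))

    total_dy, total_dx = 0, 0
    radius = coarse_radius
    for c, b in reversed(levels):
        # One pixel at the previous (coarser) level spans two pixels here.
        total_dy, total_dx = 2 * total_dy, 2 * total_dx
        shifted = np.roll(c, (total_dy, total_dx), axis=(0, 1))
        dy, dx = search(shifted, b, radius)
        total_dy, total_dx = total_dy + dy, total_dx + dx
        radius = max(1, radius // 2)  # halve the window for the next, finer level
    return total_dy, total_dx
```

With four levels and a [-16,16] window at the coarsest level, the search can reach displacements of well over 100 pixels at full resolution while only ever scoring small windows on each level.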

Base Image: I used the blue channel as the base image and aligned the red and green channels to it.
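As a small illustration of what using blue as the base means in practice, the sketch below leaves the blue channel untouched, shifts red and green by their computed displacements, and stacks the three into an RGB image. The function name stack_channels and the (dy, dx) ordering of the shifts are assumptions made for this example.

```python
import numpy as np

def stack_channels(r, g, b, r_shift, g_shift):
    """Shift R and G onto the fixed blue base and stack into an H x W x 3 RGB array."""
    # Assumption: shifts are (dy, dx) and are applied circularly with np.roll.
    r_aligned = np.roll(r, r_shift, axis=(0, 1))
    g_aligned = np.roll(g, g_shift, axis=(0, 1))
    return np.dstack([r_aligned, g_aligned, b])
```

Because only two channels are moved, each image needs just two displacement vectors, one for red and one for green.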

Bells and Whistles

Sobel Edge Detection: While the base method worked well for most images, certain images such as "emir.tif" were difficult to align when SSD was computed on the raw channel intensities. Instead of computing SSD on those values directly, I preprocessed each channel with a Sobel edge detection filter and computed SSD on the filtered images. This led to noticeable improvements, as shown in the results below.
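For reference, a Sobel gradient magnitude can be computed with plain NumPy as sketched below. This is an assumed, minimal version (the function name and the edge-replicating padding are my choices for the example); a library filter would serve equally well.

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude from the 3x3 Sobel kernels, same shape as the input."""
    p = np.pad(img.astype(np.float64), 1, mode="edge")  # replicate border pixels
    # Horizontal gradient: right column minus left column, row weights 1, 2, 1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    # Vertical gradient: bottom row minus top row, column weights 1, 2, 1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)
```

The filter output depends only on local differences, so adding a constant brightness offset to a channel leaves it unchanged. One plausible reason edges help is that the same object can have very different brightness in the red, green, and blue exposures, which penalizes SSD on raw intensities even at the correct alignment.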

Results

Image              Base Algorithm                  Base Algorithm + Sobel Edge Detection Filter
cathedral.jpg      R: [12,3] G: [5,2]              R: [12,3] G: [5,2]
monastery.jpg      R: [3,2] G: [-3,2]              R: [3,2] G: [-3,2]
tobolsk.jpg        R: [6,3] G: [3,2]               R: [6,3] G: [3,2]
church.tif         R: [52,-6] G: [0,-5]            R: [58,-4] G: [25,3]
emir.tif           R: [107,17] G: [-3,7]           R: [110,47] G: [49,24]
harvesters.tif     R: [120,7] G: [118,-3]          R: [123,13] G: [61,14]
icon.tif           R: [89,22] G: [42,16]           R: [88,23] G: [39,16]
lady.tif           R: [123,-17] G: [57,-6]         R: [121,13] G: [57,9]
melons.tif         R: [176,7] G: [83,4]            R: [176,14] G: [77,5]
onion_church.tif   R: [108,0] G: [52,22]           R: [108,35] G: [53,23]
sculpture.tif      R: [140,-26] G: [33,-11]        R: [140,-26] G: [33,-11]