Wednesday, March 7, 2007

Motion Estimation from video input

As mentioned in my previous post on motion estimation, I implemented the Lucas-Kanade optical flow method as a first attempt at performing motion estimation. This method has a major drawback: it is only applicable for displacements on the order of one or two high resolution pixels. This limitation poses a serious problem for static images, since reconstructing increasingly high resolution images requires an increasing number of low resolution images.

Since then I have been looking into using video clips as the input to the super-resolution algorithm. This form of input offers the benefit that, if the frame rate of the camera is high enough, the relative motions between frames will be small (on the order of one or two pixels), which means that the Lucas-Kanade optical flow algorithm should be effective at calculating the motions between consecutive frames. These frame-to-frame motions can then be accumulated to give the relative motion with respect to a reference frame. As currently implemented, I assume that the reference frame is simply the first frame in the input sequence. However, I intend to modify this so that the displacements are given relative to some 'central frame'; this central frame may or may not exist in the set of LR input frames, as it is merely the point around which the reconstruction takes place.
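Assuming pure translations, the frame-to-frame estimation and accumulation idea can be sketched as follows (a Python illustration with made-up function names, standing in for the MATLAB code I am actually using):

```python
import numpy as np

def lk_translation(f0, f1):
    """Estimate a single global (dx, dy) translation between two frames
    from the Lucas-Kanade brightness-constancy equations; only valid
    for shifts on the order of one or two pixels."""
    Ix = np.gradient(f0, axis=1)          # spatial gradient in x
    Iy = np.gradient(f0, axis=0)          # spatial gradient in y
    It = f1 - f0                          # temporal difference
    A = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    return np.linalg.solve(A, b)          # least-squares flow (dx, dy)

def displacements_from_video(frames):
    """Accumulate consecutive-frame estimates into displacements
    relative to the first (reference) frame."""
    total = np.zeros(2)
    out = [total.copy()]
    for f0, f1 in zip(frames[:-1], frames[1:]):
        total = total + lk_translation(f0, f1)
        out.append(total.copy())
    return out
```

Each consecutive pair stays within the one-to-two pixel range where Lucas-Kanade is reliable, while the running sum gives the larger displacement relative to the reference frame.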


I attempted to use the results generated by this image registration approach to perform super-resolution on the video clip that I captured. However, the reconstruction did not go well, so I have now taken a step back and am trying to verify the results of the image registration code in order to track down the error.

Wednesday, February 21, 2007

Maximum Likelihood Estimation

For a change from working on the motion estimation side of the project, I decided to take a look at replacing the end of the super-resolution pipeline, the part responsible for reversing the effects of the forward model. Up until now I have implemented this inversion using the pseudo-inverse function. Whilst this is an effective approach for small images that do not contain noise, it will become infeasible once I start using larger images and move on to considering real world low resolution examples that contain noise.
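As a toy illustration of why the pseudo-inverse works in the small, noise-free regime (the random matrix here is just a stand-in for the stacked forward model, not my actual system):

```python
import numpy as np

# Hypothetical toy system: H stands in for the stacked forward model,
# x_true for the vectorised high resolution image.
rng = np.random.default_rng(5)
H = rng.random((40, 10))            # overdetermined, full column rank
x_true = rng.random(10)
Y = H @ x_true                      # simulated noise-free LR data
x_hat = np.linalg.pinv(H) @ Y       # pseudo-inverse reconstruction
# In the noise-free, full-rank case this recovers x_true exactly
# (up to floating point); with noise or huge images it breaks down.
```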

The other motivation for implementing this part of the project now, aside from the change from looking at motion estimation, is that the maximum likelihood estimation approach is similar to work that I have been doing in my Neural Networks class (ECE 173).



The maximum likelihood estimate is the vector X that minimises the squared error between the observed low resolution frames and the frames predicted by the forward model:

e²(X) = Σk ‖ Dk Ck Fk X − Yk ‖²

The X that minimises this function is the X vector that is most likely to have been the original high resolution image. By finding the derivative of this function and setting it equal to zero we can then calculate the X vector that corresponds to the minimum. The derivative is:

∇e²(X) = 2 Σk Fkᵀ Ckᵀ Dkᵀ ( Dk Ck Fk X − Yk )

Rearranging this formula and then applying the gradient descent method allows the X vector corresponding to the minimum error to be iteratively obtained:

Xn+1 = Xn − μ ∇e²(Xn)

where μ is the step size and ∇e²(Xn) is the gradient evaluated at the current estimate Xn.


The graph below shows the error between the reconstruction and the original high resolution image against the number of iterations performed to acquire the reconstruction.
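A small sketch of the iteration, assuming the per-frame matrices have already been stacked into a single system H X ≈ Y (H, the starting point, and the step size μ here are illustrative):

```python
import numpy as np

def ml_gradient_descent(H, Y, x0, mu, iters):
    """Iteratively minimise ||H X - Y||^2 by gradient descent.
    H stacks the per-frame decimation/blur/warp matrices and
    Y stacks the vectorised low resolution frames."""
    X = x0.copy()
    errs = []
    for _ in range(iters):
        grad = 2 * H.T @ (H @ X - Y)   # derivative of the squared error
        X = X - mu * grad              # step towards the minimum
        errs.append(np.sum((H @ X - Y) ** 2))
    return X, errs
```

The step size must be small enough relative to the largest eigenvalue of HᵀH for the iteration to converge; the recorded errors give the kind of error-versus-iterations curve plotted above.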



The reconstructed image can be seen below:


Wednesday, February 14, 2007

Motion Estimation

Well, with CSE121 finally submitted last night, I have been able to get back to work on the project. The next portion of the project I want to get done is the motion estimation part.

My first attempt at implementing motion estimation was to use the Lucas-Kanade optical flow algorithm. This provided promising results for displacements of only one or two pixels at high resolution, so long as the decimated image wasn't 'too' decimated. For larger high resolution displacements, however, this method of estimating the motion between frames quickly broke down; increasing the factor by which the image is decimated resulted in the algorithm being accurate over smaller displacements. The factor by which an image has been decimated directly influences the amount of information that is contained in the low resolution frames.

After trying Lucas-Kanade I searched around on the internet for more papers relating to motion estimation and came up with one that sounds like a promising method. Hopefully this paper will provide a method that works; otherwise I will just implement RANSAC, which for the moment seems like overkill, as I am assuming that the transformations between images are pure translations.

The paper is titled: "Using Gradient Correlation for sub-pixel motion estimation of video sequences"
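I haven't got the paper's algorithm working yet, but the frequency-domain idea it builds on can be sketched with plain phase correlation, which recovers integer-pixel translations (the paper's gradient correlation refines this to sub-pixel accuracy; the sketch below is not the paper's method):

```python
import numpy as np

def phase_correlate(f0, f1):
    """Estimate an integer translation between two frames by phase
    correlation: the cross-power spectrum of a pure translation is a
    complex exponential whose inverse FFT is a shifted delta."""
    F0, F1 = np.fft.fft2(f0), np.fft.fft2(f1)
    R = np.conj(F0) * F1
    R /= np.abs(R) + 1e-12               # keep only the phase
    corr = np.real(np.fft.ifft2(R))
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap the peak coordinates into signed shifts
    return [p if p <= s // 2 else p - s for p, s in zip(peak, corr.shape)]
```

The returned pair is the (row, column) shift of the second frame relative to the first, under the assumption of pure (circular) translation.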

I have started implementing this algorithm but, due to lack of time, I was unable to finish it before the lecture. After the lecture I will hopefully get the implementation finished, and I will then improve this entry to include information on how the method works.

Wednesday, February 7, 2007

No Time

Well, unfortunately, this week I have had hardly any time at all to spend on my project; I have been busy trying to get my CSE121 project done, which just seems to take forever. As such I have been unable to do any more work on the motion estimation side of the project this week.


What time I have had, I've spent playing around with the forward model that I have been using. As one of the next steps in the project I was aiming to test the super-resolution approach on images that have been decimated using a different method, i.e. images I haven't decimated myself using my own forward model.


In order to acquire the low resolution images I continued to use the same high resolution image as in the previous tests. To generate the low resolution input frames I repeatedly shifted the image and then applied the MATLAB imresize command with the resize method set to bilinear.


Once these low resolution frames had been acquired I attempted to reconstruct the original high resolution image using my own forward model. I performed the reconstruction several times, each time adjusting the forward model slightly; a selection of the results I obtained is included below. I have also improved the function I wrote to create the decimation matrix, to enable me to perform decimation using nearest neighbour as well as bilinear interpolation.
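The forward-model variants used in the attempts below can be sketched as follows (Python with SciPy standing in for my MATLAB code; the function name and defaults are just illustrative):

```python
import numpy as np
from scipy.ndimage import shift, gaussian_filter, zoom

def forward_model(hr, d, sigma, factor, method='bilinear'):
    """Apply one pass of the forward model: shift by d, blur with a
    Gaussian of width sigma (sigma = 0 skips the blur), then decimate
    by `factor` using bilinear or nearest neighbour interpolation."""
    warped = shift(hr, d, order=1, mode='nearest')
    blurred = gaussian_filter(warped, sigma) if sigma > 0 else warped
    return zoom(blurred, factor, order=1 if method == 'bilinear' else 0)
```

Attempt 1 then corresponds to sigma = 0.5 with 'bilinear', attempt 6 to sigma = 0 with 'nearest', and so on.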


Attempt 1

Forward Model Settings:

Gaussian Blur sig = 0.5, Decimation = Bilinear

Reconstructed Image:

Error = 0.1658

Conclusions: Looking at the error value and the reconstructed image, it can clearly be seen that the forward model is incorrect, as the reconstructed image doesn't resemble the original image at all.

Attempt 2

Forward Model Settings:

Gaussian Blur sig = 0.25, Decimation = Bilinear

Reconstructed Image:

Error = 0.0289

Attempt 3

Forward Model Settings:

Gaussian Blur sig = 0 (no bluring), Decimation = Bilinear

Reconstructed Image:

Error = 0.0052

Conclusions: Low error, and the reconstructed image looks similar to the original high resolution image.

Attempt 4

Forward Model Settings:

Gaussian Blur sig = 0.5, Decimation = nearest neighbour

Reconstructed Image:

Error = 0.5919

Attempt 5

Forward Model Settings:

Gaussian Blur sig = 0.25, Decimation = nearest neighbour

Reconstructed Image:

Error = 0.0221

Attempt 6

Forward Model Settings:

Gaussian Blur sig = 0 (no bluring), Decimation = nearest neighbour

Reconstructed Image:

Error = 0.0078




Wednesday, January 31, 2007

Quantitative Results Using Error Metric

Error Metric

I selected the mean squared error as the metric for evaluating the effectiveness of the system. The benefit of this metric is that it yields a single numerical value that can be used to compare errors from different reconstructions regardless of the size of the images in question.

MSE = (1 / L²) Σ ( X − X’ )²   (summed over all pixels)

Where

X – Original High resolution image

X’ – Reconstructed High resolution image

L – Number of pixels along each axis (assumes square image)
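In code the metric is essentially one line (a sketch, assuming the two images are supplied as equal-sized arrays):

```python
import numpy as np

def mse(x, x_rec):
    """Mean squared error between the original and reconstructed
    images, normalised by pixel count so that reconstructions of
    different sizes are directly comparable."""
    x, x_rec = np.asarray(x, float), np.asarray(x_rec, float)
    return np.mean((x - x_rec) ** 2)
```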

Quantitative Results

The image below shows the original high resolution image that was used in order to assess the performance of the super-resolution algorithm when applied to various low resolution sets.

20 Pixels Super-Resolved to 30


Original High Resolution Image

Low Resolution Input frames (Each one is 30 x 30)

Output Image

Rank of Model Matrix = 900 (Full Rank)

Mean Squared Error = 1.6491e-024

15 Pixels Super-Resolved to 30

I won't include images for this case as they look very similar to the images for the 20 pixel low resolution case.

Rank of Model Matrix = 900 (Full Rank)

Mean Squared Error = 9.0744e-023

10 Pixels Super-Resolved to 30

Rank of Model Matrix = 900 (Full Rank)

Mean Squared Error = 1.2279e-025

6 Pixels Super-Resolved to 30

6 pixels was the lowest that could be achieved whilst still having the model matrix at full rank.

Low Resolution Input frames (Each one is 6 x 6)

Reconstructed High Resolution Image

Rank of Model Matrix = 900 (Full Rank)

Mean Squared Error = 1.1463e-024


4 Pixels Super-Resolved to 30

Reconstructed High Resolution Image

Rank of Model Matrix = 400

Mean Squared Error = 0.0182


Monday, January 22, 2007

Down Sampling & Up Sampling Test Images

Since Friday I have been working towards applying the forward model mentioned in my previous post to a high resolution image, and then attempting to recover the original high resolution image from the produced low resolution images and knowledge of the transformations that were applied.

Forward Model

In order to simplify my first attempt at applying the forward model and then reversing its effects, I chose to restrict the warping operation applied to the high resolution images to a shifting effect only. The blurring operation was implemented using a Gaussian kernel with σ = 0.5. Finally, I chose to decimate the image using bilinear interpolation.

Applying the forward Model

The only parameter varied in order to produce the low resolution images was the amount by which the original high resolution image was displaced before the blurring and decimation operations were performed.

The image below is the original high resolution image that I applied the forward model to.

The images below show some examples of the result of applying the forward model to the high resolution image.


Calculating the High Resolution Image

First Attempt at Reconstructing the original high resolution image

Second Attempt at Reconstructing the high resolution image

As suggested by Serge, I have now inverted the original high resolution image before creating the low resolution frames and applying the super-resolution process to them. This has been done purely for cosmetic reasons, so that the black lines that creep in at the borders as the image is shifted don't look out of place.

The image below shows the new high resolution image.

The image below shows 9 out of the 25 low resolution frames that were generated.

Since inverting the images I have noticed a slight bug in the matrix that is being used to decimate the high resolution images. This bug didn't show up when performing super-resolution on an image with a white background; however, a greyish strip can be seen along the right and lower sides of the reconstructed high resolution image. To solve this problem I believe I need to adjust the weights of the decimation matrix, but, not wanting to rush this and break the whole program, I will wait until I have more time before adjusting them.
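For reference, here is one way the border weights could be renormalised, sketched in 1-D (an illustration of the idea, not the fix I have applied yet):

```python
import numpy as np

def decimation_matrix(n_hr, n_lr):
    """1-D bilinear decimation matrix mapping n_hr samples to n_lr.
    Renormalising each row to sum to 1 avoids the darkened (greyish)
    border pixels caused by interpolation weights that fall outside
    the image being silently dropped."""
    D = np.zeros((n_lr, n_hr))
    scale = n_hr / n_lr
    for i in range(n_lr):
        c = (i + 0.5) * scale - 0.5        # LR pixel centre in HR coords
        lo = int(np.floor(c))
        w = c - lo
        for j, wj in ((lo, 1 - w), (lo + 1, w)):
            if 0 <= j < n_hr:
                D[i, j] += wj
        D[i] /= D[i].sum()                 # renormalise clipped rows
    return D
```

With rows summing to one, a constant image decimates to the same constant, so the borders no longer darken.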


Next Step(s)
  • Fix bug in the decimation matrix
  • Implement a function to calculate the error between the reconstructed image and the original image - probably in the form of mean square error
  • Attempt to automatically determine the relative motions between the low resolution images, and then compare the new reconstruction error with the previous reconstruction error. Serge recommended implementing this using the optical flow algorithm.
  • Look into implementing maximum likelihood estimation in order to reconstruct the high resolution image, rather than using the pseudo-inverse.
  • Effects of noise on the system. Currently i have been using low resolution images that contain no noise, once maximum likelihood estimation is implemented a comparison as to which is more robust against the effects of noise will be performed.

Friday, January 19, 2007

Super-Resolution Papers & Models

Papers

The last week or so has been spent poring over the various papers on multi-frame super-resolution and trying to understand the approaches that have been taken to solve the super-resolution problem.


Having found an abundance of super-resolution papers online, I have selected four papers, written over the period 1997–2004, which adopt the same approach towards solving the super-resolution problem, each building on the ideas presented in the previous paper. Links to these papers are below, in chronological order.


“Restoration of a Single Superresolution Image from Several Blurred, Noisy, and Undersampled Measured Images”, Michael Elad and Arie Feuer

“A computationally efficient superresolution image reconstruction algorithm”, Nhat Nguyen, Peyman Milanfar and Gene Golub


“A Fast Super-Resolution Reconstruction Algorithm for Pure Translation Motion and Common Space-Invariant Blur” Michael Elad, Yacov Hel-Or


“Advances and Challenges in Super-Resolution”, Sina Farsiu, Dirk Robinson, Michael Elad, Peyman Milanfar


In the papers mentioned above, super-resolution is defined as an inverse problem in which the formation of the low resolution images is modelled as a series of successive transformations performed on a high resolution image. They then propose that super-resolution can be achieved by obtaining the inverse of the forward transformation and applying it to the low resolution images in order to reconstruct the original high resolution data. The forward model is discussed below.

Forward Model


The forward model describes the transformation from a high resolution image to a low resolution version of the same image. The low resolution image is treated as a high resolution image that has been subjected to motion (or warp), camera blur (the camera's point spread function), and down sampling (decimation) operations; the low resolution image is also treated as containing a noise component.

The model connecting the kth low resolution image to the high resolution image is, in mathematical form:

Yk = Dk Ck Fk X + Ek

Where

Yk is the kth low resolution image

Dk is the down sampling matrix

Ck is the Blurring matrix

Fk is the motion or warp matrix

X is the high resolution image

Ek is the kth noise vector

A more general model grouping the equations for all the low resolution frames into one equation is:

Y = H X + E

where Y stacks the vectors Yk, H stacks the matrices Dk Ck Fk, and E stacks the noise vectors Ek.
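A toy 1-D sketch of how the per-frame matrices stack into one system (blur omitted, i.e. Ck = I, circular shifts used for the warp, and simple averaging used as the decimation; all names are illustrative):

```python
import numpy as np

def shift_matrix(n, s):
    """Matrix form of a circular integer shift: the warp matrix Fk."""
    return np.roll(np.eye(n), s, axis=0)

def stacked_system(n_hr, n_lr, shifts):
    """Stack the per-frame matrices Hk = Dk Fk (with Ck = I) into one
    big H so that Y = H x + E covers all low resolution frames."""
    f = n_hr // n_lr
    # 1-D averaging decimation D (n_lr x n_hr): each LR sample is the
    # mean of f consecutive HR samples
    D = np.kron(np.eye(n_lr), np.ones(f) / f)
    return np.vstack([D @ shift_matrix(n_hr, s) for s in shifts])
```

Stacking all frames this way is what makes the pseudo-inverse and maximum-likelihood formulations possible: both operate on the single combined matrix H rather than on the frames one at a time.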


...