3D Computer Vision Assignment 4



Automatic image mosaics from a series of consecutive images.


Assignment 4

CAP 6419 - 3D Computer Vision

Fall 2013

Due 11-12-2013

Edward Aymerich

1 Implemented methods

1.1 Loading images

The function loadImg() was implemented to ease the loading of several images. This function takes a string indicating the name of the images to load (for example: “test*.jpg”) and returns a cell array with the images, ordered by name and number (if any). Wildcards must be used in the file name to load several images.

In order to correctly order numbered images, loadImg() uses the function sort_nat() from Douglas M. Schwarz [1].

The function loadImg() has the limitation that it can only load images from the current directory. If the file name contains a path (relative or absolute), loadImg() will fail to load the images.
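A minimal sketch of such a loader follows, assuming sort_nat() from the File Exchange is on the MATLAB path (variable names are illustrative, not necessarily those of the actual implementation):

    function imgs = loadImg(pattern)
    % Load all images matching a wildcard pattern from the current directory,
    % ordered naturally so that, e.g., "test2.jpg" comes before "test10.jpg".
    files = dir(pattern);               % matching files in the current directory only
    names = sort_nat({files.name});     % natural-order sort by Douglas M. Schwarz
    imgs = cell(1, numel(names));
    for k = 1:numel(names)
        imgs{k} = imread(names{k});
    end
    end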

1.2 Feature point correspondence

The function findHomography() was implemented to find feature points in a pair of images, and then match such points. The implementation used the MATLAB functions detectSURFFeatures() and extractFeatures() [2] to identify feature points in the input images, and then the MATLAB function matchFeatures() creates a correspondence between the feature points in both images.

Finally, the format of the matching points returned by matchFeatures() is changed so that a homography can be fitted to them.

Originally, the number of feature points was limited to 100, i.e., if more than 100 points were found, then only the 100 points with the strongest features were used. However, it was found that using this amount of points resulted in very different homographies each time the algorithm was run (because of the random nature of the RANSAC method used later). Increasing the number of feature points used also increased the stability of the homography, so in the current implementation there is no limit on the number of points used: all found feature points are matched and used to compute the homography.
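A sketch of this stage using the Computer Vision System Toolbox functions named above (variable names are illustrative, not necessarily those used inside findHomography()):

    g1 = rgb2gray(im1);                       % SURF detection works on grayscale images
    g2 = rgb2gray(im2);
    pts1 = detectSURFFeatures(g1);            % no limit on the number of points
    pts2 = detectSURFFeatures(g2);
    [f1, v1] = extractFeatures(g1, pts1);     % descriptors and their valid points
    [f2, v2] = extractFeatures(g2, pts2);
    pairs = matchFeatures(f1, f2);            % indices of corresponding descriptors
    x1 = v1(pairs(:, 1)).Location;            % N-by-2 matched coordinates in image 1
    x2 = v2(pairs(:, 2)).Location;            % N-by-2 matched coordinates in image 2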

[1] Available at: http://www.mathworks.com/matlabcentral/fileexchange/10959-sortnat-natural-order-sort

[2] Part of the Computer Vision System Toolbox.


1.3 Compute Infinite Homography

The function ransacfithomography() from Peter Kovesi [3] is used to estimate the best homography for the matched feature points. This function uses RANSAC to fit a 2D homography. A value of 0.01 was used as the distance threshold to decide whether a point is an inlier or not.

The call to ransacfithomography() is made inside the implemented function findHomography(): after calculating the matching points, findHomography() uses ransacfithomography() to fit the homography and then returns it.
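Continuing the sketch above, the matched points are converted to homogeneous coordinates and passed to Kovesi's routine (a sketch; the exact argument layout inside findHomography() may differ):

    p1 = [double(x1'); ones(1, size(x1, 1))];         % 3-by-N homogeneous points, image 1
    p2 = [double(x2'); ones(1, size(x2, 1))];         % 3-by-N homogeneous points, image 2
    [H, inliers] = ransacfithomography(p1, p2, 0.01); % 0.01 = inlier distance threshold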

1.4 Warping of images

With the fitted homography, one of the input images must be warped to fit into the other image. Such warping is done by the MATLAB function imtransform(). One problem with this function is that it automatically centers the warped image, so any translation in our homography is lost; and since the objective is to build a final mosaic, there is a high chance that there are translations between the images.

To recover the lost translation information, imtransform() is called in the following way:

[A′, xdata, ydata] = imtransform(A,H);

where A is the image to warp, H is the fitted homography, A′ is the warped image, and xdata and ydata store the translation information, which will be used when the mixing of images is calculated.
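Note that imtransform() actually expects a TFORM structure rather than a raw 3-by-3 matrix, so in practice the call is presumably wrapped with maketform(), roughly as follows (a sketch; the transpose assumes H maps column vectors, while maketform() uses the row-vector convention):

    T = maketform('projective', H');              % wrap the homography as a TFORM
    [Awarp, xdata, ydata] = imtransform(A, T);    % xdata/ydata keep the lost translation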

1.5 Mixing images

An easy way to mosaic the images is to take the reference image (the one that wasn’t warped) and copy the warped image over it, using the translation information to position the warped image correctly. This approach has one big problem: the superimposition of the images is very easy to see, and it shows up as artifacts in the resulting mosaic:

[3] Available at: http://www.csse.uwa.edu.au/~pk/Research/MatlabFns/


To prevent this, a mask is applied to each image before it is added into the mosaic. To create such a mask, an initial mask is created for each image in such a way that it has its greatest value at the center and decreases linearly towards the borders. The values of all initial masks are added into a mosaic mask, and then the final mask for each image is obtained by dividing its initial mask by the mosaic mask.
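For two overlapping images A and B already placed on the mosaic canvas, the weighting described above amounts to something like the following (a sketch with illustrative names; the actual code also has to account for the translated coordinates returned by imtransform()):

    mosaicMask = maskA + maskB;                    % sum of the initial masks
    wA = maskA ./ max(mosaicMask, eps);            % final mask for A (avoid division by zero)
    wB = maskB ./ max(mosaicMask, eps);            % final mask for B
    mosaic = bsxfun(@times, wA, double(A)) + ...   % weights sum to 1 in the overlap,
             bsxfun(@times, wB, double(B));        % so the seam is blended away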

Figures: simple mosaic of images A and B; mosaic mask; initial masks for A and B; final masks for A and B; final mosaic.

The initial mask for each image is calculated by the implemented function getMask(), which calculates the inverse distance of each pixel to the image center in each axis (X and Y), measuring this inverse distance as 1 at the center and 0 close to the borders. The final value for each pixel in the mask is its inverse X distance times its inverse Y distance.
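A sketch of how such a separable mask can be built for an h-by-w image (illustrative; the actual getMask() may differ in details such as the exact falloff at the borders):

    function M = getMask(h, w)
    % Weight is 1 at the image centre and decreases linearly towards ~0 at the
    % borders, as the product of the per-axis inverse distances described above.
    cx = (w + 1) / 2;                       % centre column
    cy = (h + 1) / 2;                       % centre row
    dx = 1 - abs((1:w) - cx) / (w / 2);     % inverse X distance, 1 at the centre
    dy = 1 - abs((1:h) - cy) / (h / 2);     % inverse Y distance
    M = dy' * dx;                           % outer product gives the 2D mask
    end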

The functions immix2() and immix_fast() were implemented to handle the mixing of images. immix2() is just a wrapper around immix_fast() that facilitates the passing of parameters. These functions do not apply any warping to the images; they only mix two images at given coordinates using the procedure described before.

An initial implementation of this method was done in the function immix(). Due to the author’s inexperience with MATLAB, a lot of for loops were used to calculate the masks, and some values were calculated twice, in the hope that recalculating simple values would be faster than storing and retrieving them from memory (as happens with some algorithms implemented in C). However, immix() was very inefficient, and most of the time of the whole mosaicing process was spent just mixing the images. The function immix_fast() is a reimplementation of immix(), using no loops and relying on MATLAB’s vectorization.

Using MATLAB’s Profiler to measure the efficiency of immix() and immix_fast(), it was found that the speedup of the new function is very significant. Producing a mosaic from 5 images took 13.039 seconds using immix(), of which 12.519 seconds were spent inside immix() itself. In contrast, producing the same mosaic with immix_fast() took 0.591 seconds, of which only 0.088 seconds were spent inside immix_fast(). This represents a speedup of about 22 for the whole process, with immix_fast() using about 15% of the total time, a big improvement over the 96% of the total time used by immix().

1.6 Creation of the mosaic

To put all the previous steps together, the function imfuse() was implemented. This function receives two images, say A and B, and calls findHomography() to estimate the homography H that matches A into B. Then it warps A into A′ using H and imtransform(), and finally mixes A′ with B using immix2() and returns the resulting image.
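Putting the pieces together, imfuse() presumably looks roughly like this (a sketch; the report does not show the exact signature of immix2(), so its argument list here is illustrative; note also that the name clashes with MATLAB's built-in imfuse(), which is unrelated):

    function M = imfuse(A, B)
    % Warp A into B's frame and blend the two images into a single mosaic.
    H = findHomography(A, B);                     % homography matching A into B
    T = maketform('projective', H');              % assuming column-vector convention for H
    [Aw, xdata, ydata] = imtransform(A, T);       % warped A plus its offset in B's frame
    M = immix2(Aw, B, xdata, ydata);              % blend at the recovered offset
    end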

Up to this point, the process of warping and mixing two images together has been described. One method to mosaic several images could be to estimate the homography H between each pair of images, and then use backward mapping to compute the homography that maps each image into the reference image. This implementation doesn’t follow that method directly; it uses a different approach instead.

The function mosaic() was implemented to handle the mosaicing of several images. To do this, mosaic() first selects the middle image in the cell array of images given to it. Then it creates one mosaic by mixing images from the beginning of the cell array up to the middle, and another mosaic from the end of the cell array up to the middle. Finally, the two mosaics are mixed together into the resulting mosaic. mosaic() allows the user to define how many images around the center (called neighbors) will be used in the final mosaic. If no number of neighbors is given, then mosaic() uses all the images.

To explain how mosaic() works, let’s follow a small example. Suppose that we want to mosaic 5 images, labeled [A, B, C, D, E], and we want to use C as reference. mosaic() first fuses images A and B, transforming A to fit into B and leaving B unwarped. Let’s call this new image AB. Now image AB is fused with C, warping AB to fit into C and leaving C unwarped, producing image ABC. A similar process, but in the backward direction, happens on the other side: E is warped into D to form ED, and ED is warped into C to produce EDC. Finally, ABC and EDC are fused together. Since C was not warped in either image, C remains unwarped in the final mosaic, serving as the reference for the other images.
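The chaining just described can be sketched as follows, using the imfuse() function from above (illustrative; the real mosaic() also honours the neighbor limit described earlier):

    mid = ceil(numel(imgs) / 2);          % index of the reference (middle) image
    left = imgs{1};
    for k = 2:mid                         % A -> AB -> ABC: warp the accumulator into the next image
        left = imfuse(left, imgs{k});
    end
    right = imgs{end};
    for k = numel(imgs)-1:-1:mid          % E -> ED -> EDC: same, starting from the other end
        right = imfuse(right, imgs{k});
    end
    result = imfuse(left, right);         % both halves leave the centre image unwarped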

To see why this method works, let’s consider the process from A’s viewpoint. In theory, if we estimate the homography H_AB that warps A into B and the homography H_BC that warps B into C, then a homography H_AC that warps A into C can be calculated as H_AC = H_AB * H_BC. In our method, since we already warped A into AB, we don’t have to worry about estimating H_AC, because A is already in B’s space. When AB is warped into C, it carries the transformation applied to A into C’s space. In this method, the transformations are not explicitly accumulated in a homography; they accumulate in the images. Therefore we only have to worry about finding the correct homography to fit the current image (which is the accumulation of the previous images) into the next image. An example of this process follows:

Figures: image A and accumulated image A; image B and accumulated image AB; image C and accumulated image ABC.


Notice how image A is increasingly distorted as more and more images are introduced into the accumulated image. This is due to the transformations necessary to warp A into the reference image C.

2 Results

2.1 Using set mov2

The set “mov2” was provided for this homework. It consists of 21 pictures, apparently taken from the main entrance of the Harris Engineering Corporation building at UCF.

Input images: mov2b 7 through mov2b 27.

Using a different number of neighbors around the center image (“mov2b 17”), the following mosaics were calculated:


Mosaics computed using 0 neighbors (1 picture), 1 neighbor (3 pictures), 3 neighbors (7 pictures), and the whole set (21 pictures).


2.2 Using set mov3

The set “mov3” was provided for this homework. It consists of 17 pictures of an indoor environment.

Input images: mov3 1 through mov3 17.

Using a different number of neighbors around the center image (“mov3 9”), the following mosaics were calculated:

Mosaics computed using 0 neighbors (1 picture), 3 neighbors (7 pictures), 5 neighbors (11 pictures), and the whole set (17 pictures).

2.3 Using set mymov2

The “mymov2” set of images was captured using a PTZ camera, in particular a Sony SNC-RZ30N from the Computing Imaging Lab (CIL) at UCF. The images were taken inside CIL, panning from left to right. Although the original set consists of 19 pictures covering 180 degrees of view, only a subset of these images is used in this report: images “test6.jpg” through “test12.jpg”, for a total of 7 pictures.

The assumption for this set is that the PTZ camera doesn’t move its camera center. Therefore, if the camera is not displaced, its center should remain at the same place even as we rotate the camera to do the panning.

Input images: test6.jpg through test12.jpg.


Starting with the center image (“test9.jpg”), we show the results of the mosaic with an increasing number of neighbors around the center:

Mosaics computed using 0 neighbors (1 picture), 1 neighbor (3 pictures), 2 neighbors (5 pictures), and 3 neighbors (7 pictures).


2.4 Using set mymov3

The “mymov3” set of images was also captured using the PTZ camera from CIL at UCF (a Sony SNC-RZ30N). The images were also taken inside CIL, but this time the panning was diagonal, from lower left to upper right. Although the original set consists of 19 pictures covering 180 degrees of view horizontally, only a subset of these images is used in this report: images “test6.jpg” through “test12.jpg”, for a total of 7 pictures.

This set is intended to test how well the images are matched when the panning is diagonal.

Input images: test6.jpg through test12.jpg.

Starting with the center image (“test9.jpg”), we show the results of the mosaic with an increasing number of neighbors around the center:

Mosaics computed using 0 neighbors (1 picture), 1 neighbor (3 pictures), 2 neighbors (5 pictures), and 3 neighbors (7 pictures).

2.5 Using set mymov4

This set of images was captured using a consumer-grade point-and-shoot camera (a Canon PowerShot A490), without the use of a tripod. The pictures were taken in front of Millican Hall at UCF, a couple of hours before the Spirit Splash of November 8, 2013. A total of 5 pictures were taken.

This set is intended to test how well the algorithm works when the assumption that the camera center doesn’t move does not hold. Since the camera was held by hand, human imprecision virtually guarantees that the camera center moves.

Input images: IMG 4315.jpg, IMG 4316.jpg, IMG 4317.jpg (center), IMG 4318.jpg, IMG 4319.jpg.

The unified mosaic of all five images follows:
