Distinctive Image Features from Scale-Invariant Keypoints

42
Distinctive Image Features from Scale- Invariant Keypoints Ronnie Bajwa Sameer Pawar * pted from slides found online by Michael Kowalski, Lehigh University

description

Distinctive Image Features from Scale-Invariant Keypoints. Ronnie Bajwa Sameer Pawar *. * Adapted from slides found online by Michael Kowalski, Lehigh University. Goal. - PowerPoint PPT Presentation

Transcript of Distinctive Image Features from Scale-Invariant Keypoints

Page 1: Distinctive Image Features from Scale-Invariant Keypoints

Distinctive Image Features from Scale-Invariant Keypoints

Ronnie Bajwa

Sameer Pawar

*

* Adapted from slides found online by Michael Kowalski, Lehigh University

Page 2: Distinctive Image Features from Scale-Invariant Keypoints

Goal

Extract distinctive invariant features from an image that can be used to perform reliable object recognition and many other applications.

Page 3: Distinctive Image Features from Scale-Invariant Keypoints

Key requirements for good feature

Highly distinctive, with a low probability of mismatch Should be easy to extract Invariance:

- Scale and rotation

- change in illumination

- slight change in viewing angle

- image noise

- clutter and occlusion

Page 4: Distinctive Image Features from Scale-Invariant Keypoints

Brief History

Moravec(1981): – corner detector.

Harris(1988): – Selects locations that has large gradients in all directions (corners)– Invariant to image rotation and to slight affine intensity change, but faces

problems for scale change. Zhang(1995):

– Introduced correlation window to match images in large range motion. Schmid (1997):

– Suggested use of local feature matching for image recognition.– Used rotationally invariant descriptor of the local image region.– Multiple feature matches accomplish recognition under occlusion & clutter.

Page 5: Distinctive Image Features from Scale-Invariant Keypoints

Harris Corner Detector:

direction of the slowest

change

direction of the fastest change

(max)-1

/2

(min)-1/2

“Edge” 2 >> 1

“Flat” 2 ~ 1~0

“Corner” 2 ~ 1 ~ Large

Page 6: Distinctive Image Features from Scale-Invariant Keypoints

SIFT Algorithm Overview

Filtered approach 1. Scale-space extrema detection

– Identify potential points: invariant to scale & orientation.– Difference-of-Gaussian function

2. Keypoint localization– Improve the estimate for location by fitting a quadratic– Extrema thresholded for filter out insignificant points.

3. Orientation Assignment- Orientation assigned to each keypoint and

neighboring pixels based on local gradient. 4. Keypoint Descriptor construction

– Feature vector based on gradients of local neighborhood

Page 7: Distinctive Image Features from Scale-Invariant Keypoints

Keypoint Selection: Scale space

We express the image at different scales by filtering it with a Gaussian kernel

Page 8: Distinctive Image Features from Scale-Invariant Keypoints

Keypoint Selection: why DoG’s?

Lindeberg(1994) and Mikolajczyk (2002) found that the maxima and minima of the scaled Laplacian provides the most stable scale invariant features

We can use the scaled images to approximate this:

Efficient to compute– Smoothed images L needed later so D can be computed by

simple image subtraction

Page 9: Distinctive Image Features from Scale-Invariant Keypoints

Back to picking keypoints

1. Supersample original image

2. Compute smoothed images using different scales σ for entire octave

3. Compute doG images from adjacent scales for entire octave

4. Isolate keypoints in each octave by detecting extrema in doG compared to neighboring pixels

5. Subsample image 2σ of current octave and repeat process (2-3) for next octave

Page 10: Distinctive Image Features from Scale-Invariant Keypoints

Visual representation of process

Page 11: Distinctive Image Features from Scale-Invariant Keypoints

Visual representation of process (cont.)

Page 12: Distinctive Image Features from Scale-Invariant Keypoints

A More Visual Representation

Original Image

Starting Image

Page 13: Distinctive Image Features from Scale-Invariant Keypoints

A More Visual Representation (cont.)

First Octave of L images

Page 14: Distinctive Image Features from Scale-Invariant Keypoints

A More Visual Representation (cont.)

Second Octave

Third Octave

Fourth Octave

Page 15: Distinctive Image Features from Scale-Invariant Keypoints

A More Visual Representation (cont.)

First Octave difference-of-Gaussians

Page 16: Distinctive Image Features from Scale-Invariant Keypoints

A More Visual Representation (cont.)

Page 17: Distinctive Image Features from Scale-Invariant Keypoints

Scale space sampling

How many fine scales in every octave?

Extremas can be arbitrary close but very close ones are unstable. After subsampling and before finding scaled images of the octave, prior

smoothing of 1.6 is done To compensate the loss of higher spatial frequencies, original image is

doubled in size

Page 18: Distinctive Image Features from Scale-Invariant Keypoints

Accurate Keypoint Localization

From difference-of-Gaussian local extrema detection we obtain approximate values for keypoints

Originally these approximations were used directly

For an improvement in matching and stability fitting to a 3D quadratic function is used

Page 19: Distinctive Image Features from Scale-Invariant Keypoints

The Taylor Series Expansion

Take Taylor Series Expansion of scale-space function D(x,y,σ)

– Use up to quadratic terms

– origin shifted to sample point– offset from this sample point– to find location of extremum, take derivative and set to 0

xx

Dxx

x

DDxD T

T

2

2

2

1)(

Tyxx ),,(

^

x

x

D

x

Dx

1

2

2

Page 20: Distinctive Image Features from Scale-Invariant Keypoints

Thresholding Keypoints (part 1)

The function value at the extrema is used to reject unstable extrema

– Low contrast– Evaluate

– Absolute value less than 0.03 at extrema location results in discarding of extrema

xx

DDxD

T

2

1)(

Page 21: Distinctive Image Features from Scale-Invariant Keypoints

Thresholding Keypoints (part 2)

Difference-of-Gaussian function will be strong along edges– Some locations along edges are poorly

determined and will become unstable when even small amounts of noise are added

– These locations will have a large principal curvature across the edge but a small principal of curvature perpendicular to the edge

– Therefore we need to compute the principal curvatures at the location and compare the two

Page 22: Distinctive Image Features from Scale-Invariant Keypoints

Computing the Principal Curvatures

Hessian matrix

– The eigenvalues of H are proportional to principal curvatures

– We are not concerned about actual values of eigenvalue, just the ratio of the two

yyxy

xyxx

DD

DDH

yyxx DDTr )(H r 2)()( xyyyxx DDDDet H

r

r

r

r

Det

Tr 2

2

222 )1()()(

)(

)(

H

H

Page 23: Distinctive Image Features from Scale-Invariant Keypoints

Stages of keypoint selection

Page 24: Distinctive Image Features from Scale-Invariant Keypoints

Assigning an Orientation

We finally have a keypoint that we are going to keep

The next step is assigning an orientation for the keypoint– Used in making the matching technique invariant

to rotation

Page 25: Distinctive Image Features from Scale-Invariant Keypoints

Assigning an Orientation (cont.)

Gaussian smoothed image, L, with closest scale is chosen (scale invariance)

Points in region around keypoint are selected and magnitude and orientations of gradient are calculated

Orientation histogram formed with 36 bins. Sample is added to appropriate bin and weighted by gradient magnitude and Gaussian-weighted circular window with a of σ 1.5 times scale of keypoint

22 ))1,()1,(()),1(),1((),( yxLyxLyxLyxLyxm

))),1(),1(/())1,()1,(((tan),( 1 yxLyxLyxLyxLyx

Page 26: Distinctive Image Features from Scale-Invariant Keypoints

Assigning an Orientation (cont.)

Highest peak in orientation is found along with any other peaks within 80% of highest peak

3 closest histogram values to each peak are used to interpolate (fit to a parabola) a better accurate peak

As of now each keypoint has 4 dimesions: x location, y location, scale, and orientation

Page 27: Distinctive Image Features from Scale-Invariant Keypoints

Keypoint Descriptor

Calculated using a region around the keypoint as opposed to directly from the keypoint for robustness

Like before, magnitudes and orientations are calculated for points in region around keypoint using L of nearest scale

To ensure orientation invariance the gradient orientations and coordinates of descriptor are rotated relative to orientation of keypoint

Provides invariance to changes in illumination and 3D camera viewpoint

Page 28: Distinctive Image Features from Scale-Invariant Keypoints

Visual Representation of Keypoint Descriptor

Page 29: Distinctive Image Features from Scale-Invariant Keypoints

Keypoint Descriptor (cont.)

Magnitude of each point is weighted with σ of one half the width of the descriptor window– Stops sudden changes in descriptor due to small

changes in position of the window– Gives less weight to gradients far from keypoint

Samples are divided into 4x4 subregions around keypoint. Allows for a change of up to 4 positions of a sample while still being included in the same histogram

Page 30: Distinctive Image Features from Scale-Invariant Keypoints

Keypoint Descriptor (cont.)

Avoiding boundary effects between histograms– Trilinear interpolation used to distribute value of gradient of

each sample into adjacent histogram bins Weight equal to 1 – d , where d is the distance of a specific

sample to the center of a bin Vector normalization

– Done at the end to ensure invariance to illumination change (affine)

– Entire vector normalized to 1 – To combat non-linear illumination (camera saturation)

changes values in feature vector are thresholded to no larger than 0.2 and then the vector is normalized.

Page 31: Distinctive Image Features from Scale-Invariant Keypoints

We now have features

Up to this point we have:– Found rough approximations for features by

looking at the difference-of-Gaussians– Localized the keypoint more accurately– Removed poor keypoints– Determined the orientation of a keypoint– Calculated a 128 feature vector for each keypoint

What do we do now?

Page 32: Distinctive Image Features from Scale-Invariant Keypoints

Matching Keypoints between images

Tale of two images (or more)– One image is the training sample of what we are

looking for– The other image is the world picture that might

contain instances of the training sample

Both images have features associated with them across different octaves

How do we match features between the two?

Page 33: Distinctive Image Features from Scale-Invariant Keypoints

Matching Keypoints between images (cont.)

Nearest Neighbor Algorithm(Euclidean Distance)– Independently match all keypoints in all octaves in

one image with all keypoints in all octaves in other image

– How to solve problem of features that have no correct match with opposite image (i.e. background noise, clutter)

Global threshold on distance (did not perform well) Ratio of closest neighbor with second closest neighbor.

Page 34: Distinctive Image Features from Scale-Invariant Keypoints

Matching Keypoints between images (cont.)

Threshold at 0.8– Eliminates 90% of false matches– Eliminates less than 5% of correct matches

Page 35: Distinctive Image Features from Scale-Invariant Keypoints

Efficient Nearest Neighbor indexing

In high dimensions no algorithms – k-dimensional tree: Is O(2klog n).

Page 36: Distinctive Image Features from Scale-Invariant Keypoints

Efficient Nearest Neighbor indexing

Use an approximate algorithm called Best-Bin-First (feature space)– Bins are searched in order of their closest distance .– Heap-based priority queue.– Returns closest neighbor with high probability– Search cut off after searching the 200 nearest-neighbor candidates

Works well since we only consider matches where nearest neighbor is less than 0.8 times the distance to second nearest neighbor.

Page 37: Distinctive Image Features from Scale-Invariant Keypoints

Clustering -Hough Transform

Small \ Highly-occluded objects: – 3 feature matches are sufficient. – 99% outliers.– RANSAC doesn’t perform well due to large number of

outliers. Hough Transform

– Uses each feature to vote for all object poses consistent with the feature

– Predicts approx. model for similarity transformation.– Bin Sizes must be large to account for large error bounds

Orientation-30 degrees, Scale-of 2, Location-0.25. Each keypoint votes to two closest bins in each dimension/ 16 entries for each hypothesis (feature).

Page 38: Distinctive Image Features from Scale-Invariant Keypoints

Model Verification-Affine Transformation

Geometric Verification of clusters

Page 39: Distinctive Image Features from Scale-Invariant Keypoints

Results (Object Recognition)

Page 40: Distinctive Image Features from Scale-Invariant Keypoints

Results (cont.)

Page 41: Distinctive Image Features from Scale-Invariant Keypoints

Questions?

Page 42: Distinctive Image Features from Scale-Invariant Keypoints