SIFT Guest Lecture by Jiwon Kim

• Guest Lecture by Jiwon Kim
• http://www.cs.washington.edu/homes/jwkim/

SIFT Features and Its Applications

Autostitch Demo

Autostitch

• Fully automatic panorama generation
– Input: set of images
– Output: panorama(s)

• Uses SIFT (Scale-Invariant Feature Transform) to find/align images

1. Solve for homography

2. Find connected sets of images

3. Solve for camera parameters

• New images initialised with rotation, focal length of best matching image


4. Blending the panorama

• Burt & Adelson 1983
– Blend frequency bands over range

Low frequency (λ > 2 pixels)

High frequency (λ < 2 pixels)

2-band Blending

Linear Blending

2-band Blending
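A minimal 1D sketch of the 2-band idea, assuming a box blur as the low-pass filter and a linear ramp as the blend weight (both are illustrative choices, not taken from the slides). Low frequencies are blended smoothly across the overlap, while high frequencies are taken from whichever image has the larger weight. The function names `box_blur` and `two_band_blend` are hypothetical.

```python
def box_blur(signal, radius):
    """Low-pass filter: mean over a window that is clipped at the borders."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def two_band_blend(a, b, radius=2):
    """Blend two equally long 1D signals (length >= 2) over their full overlap."""
    n = len(a)
    # Weight of signal a: 1 at the left end, falling linearly to 0 at the right.
    w = [1 - i / (n - 1) for i in range(n)]
    low_a, low_b = box_blur(a, radius), box_blur(b, radius)
    high_a = [x - l for x, l in zip(a, low_a)]
    high_b = [x - l for x, l in zip(b, low_b)]
    out = []
    for i in range(n):
        # Low band: smooth, weighted average of both signals.
        low = w[i] * low_a[i] + (1 - w[i]) * low_b[i]
        # High band: hard switch to the signal with the larger weight.
        high = high_a[i] if w[i] >= 0.5 else high_b[i]
        out.append(low + high)
    return out
```

Blending a signal with itself returns the signal unchanged, which is a quick sanity check that the two bands sum back to the input.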

So, what is SIFT?

• Scale-Invariant Feature Transform
• David Lowe at UBC
• Scale/rotation invariant
• Currently best known feature descriptor
• Many real-world applications
– Object recognition
– Panorama stitching
– Robot localization
– Video indexing
– …

Example: object recognition

SIFT properties

• Locality: features are local, so robust to occlusion and clutter

• Distinctiveness: individual features can be matched to a large database of objects

• Quantity: many features can be generated for even small objects

• Efficiency: close to real-time performance

SIFT algorithm overview

1. Feature detection
– Detect points that can be repeatably selected under location/scale change

2. Feature description
– Assign orientation to detected feature points
– Construct a descriptor for image patch around each feature point

3. Feature matching

1. Feature detection

• Detect points stable under location/scale change

– Build continuous space (x, y, scale)
– Approximated by multi-scale Difference-of-Gaussian pyramid
– Select maxima/minima in (x, y, scale)
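The extremum selection can be sketched as a comparison of each sample with its 26 neighbors: 8 in the same scale and 9 in each adjacent scale. This is a minimal illustration that assumes the Difference-of-Gaussian stack is already available as a nested list indexed `dog[scale][y][x]`; the name `is_extremum` is hypothetical.

```python
def is_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly larger or strictly smaller than
    all 26 neighbors in the 3x3x3 block around it (caller keeps indices
    at least one step away from every border)."""
    center = dog[s][y][x]
    neighbors = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return center > max(neighbors) or center < min(neighbors)
```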

1. Feature detection

• Localize extrema by fitting a quadratic

1) Sub-pixel/sub-scale interpolation using Taylor expansion

2) Take derivative and set to zero
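Written out, the two steps take the following form. The notation follows Lowe's 2004 paper listed in the references rather than the slide itself: D and its derivatives are evaluated at the sampled extremum, and x = (x, y, scale) is the offset from that sample.

```latex
\[
D(\mathbf{x}) = D
  + \left(\frac{\partial D}{\partial \mathbf{x}}\right)^{T}\mathbf{x}
  + \frac{1}{2}\,\mathbf{x}^{T}\,\frac{\partial^{2} D}{\partial \mathbf{x}^{2}}\,\mathbf{x}
\]
\[
\hat{\mathbf{x}} = -\left(\frac{\partial^{2} D}{\partial \mathbf{x}^{2}}\right)^{-1}
  \frac{\partial D}{\partial \mathbf{x}}
\]
\[
D(\hat{\mathbf{x}}) = D
  + \frac{1}{2}\left(\frac{\partial D}{\partial \mathbf{x}}\right)^{T}\hat{\mathbf{x}}
\]
```

The last line is the interpolated value at the refined location, obtained by substituting the offset back into the expansion.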

1. Feature detection

• Discard low-contrast/edge points

1) Low contrast: discard keypoints with $|D(\hat{x})|$ < threshold

2) Edge points: high contrast in one direction, low in the other. Compute principal curvatures from eigenvalues of 2x2 Hessian matrix, and limit their ratio
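The edge test does not require computing the eigenvalues explicitly: the ratio of the principal curvatures can be bounded through the trace and determinant of the Hessian. Below is a minimal sketch; the default ratio limit `r=10` is the value used in Lowe's 2004 paper, not a number stated on the slide, and the function name is hypothetical.

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep a keypoint only if its principal-curvature ratio is below r.

    dxx, dyy, dxy are second derivatives of the DoG image at the keypoint.
    With eigenvalues a >= b of the Hessian, trace^2 / det = (a + b)^2 / (a * b),
    which grows with the ratio a / b and equals (r + 1)^2 / r when a / b = r.
    """
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Curvatures have different signs (or one is zero): not a stable extremum.
        return False
    return trace * trace / det < (r + 1) ** 2 / r
```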

1. Feature detection

• Example

(a) 233x189 image
(b) 832 DOG extrema
(c) 729 left after peak value threshold
(d) 536 left after testing ratio of principal curvatures

2. Feature description

• Assign orientation to keypoints

– Create histogram of local gradient directions computed at selected scale

– Assign canonical orientation at peak of smoothed histogram
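Orientation assignment can be sketched as a magnitude-weighted histogram of gradient directions whose smoothed peak gives the canonical orientation. The sketch below makes several assumptions that are not on the slide: 36 bins of 10° (as in Lowe's 2004 paper), a simple 1-2-1 circular smoothing kernel, and no Gaussian weighting around the keypoint. The function name `dominant_orientation` is hypothetical.

```python
import math


def dominant_orientation(patch, num_bins=36):
    """Return the center (in degrees) of the peak bin of a smoothed
    gradient-orientation histogram over a 2D patch (list of rows)."""
    hist = [0.0] * num_bins
    height, width = len(patch), len(patch[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(dx, dy)
            angle = math.degrees(math.atan2(dy, dx)) % 360.0
            hist[int(angle / 360.0 * num_bins) % num_bins] += magnitude
    # Circular 1-2-1 smoothing so that a single noisy bin cannot win alone.
    smoothed = [
        (hist[i - 1] + 2.0 * hist[i] + hist[(i + 1) % num_bins]) / 4.0
        for i in range(num_bins)
    ]
    peak = max(range(num_bins), key=smoothed.__getitem__)
    return (peak + 0.5) * 360.0 / num_bins
```

Rotating the descriptor's sampling grid by this angle is what makes the resulting descriptor rotation invariant.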

2. Feature description

• Construct SIFT descriptor

– Create array of orientation histograms
– 8 orientations x 4x4 histogram array = 128 dimensions
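The 128-dimensional layout can be illustrated with a simplified descriptor: a 16x16 patch is split into a 4x4 grid of cells, and each cell gets an 8-bin orientation histogram. This is a sketch only. It assumes the patch has already been rotated to the canonical orientation, and it omits the Gaussian weighting, interpolation between bins, and value clamping described in Lowe's paper. The function name `describe_patch` is hypothetical.

```python
import math


def describe_patch(patch, grid=4, bins=8):
    """Build a grid*grid*bins vector (128 for the defaults) from a square patch."""
    size = len(patch)      # expected 16 for a 4x4 grid of 4x4-pixel cells
    cell = size // grid
    desc = [0.0] * (grid * grid * bins)
    for y in range(size):
        for x in range(size):
            # Finite differences; indices are clamped at the patch border.
            dx = patch[y][min(x + 1, size - 1)] - patch[y][max(x - 1, 0)]
            dy = patch[min(y + 1, size - 1)][x] - patch[max(y - 1, 0)][x]
            magnitude = math.hypot(dx, dy)
            angle = math.degrees(math.atan2(dy, dx)) % 360.0
            b = int(angle / (360.0 / bins)) % bins
            idx = ((y // cell) * grid + (x // cell)) * bins + b
            desc[idx] += magnitude
    # Normalize to unit length to reduce the effect of contrast changes.
    norm = math.sqrt(sum(v * v for v in desc))
    return [v / norm for v in desc] if norm > 0 else desc
```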

2. Feature description

• Advantage over simple correlation

– Gradients less sensitive to illumination change

– Gradients may shift: robust to deformation, viewpoint change

Performance: stability to noise

• Match features after random change in image scale & orientation, with differing levels of image noise

• Find nearest neighbor in database of 30,000 features

Performance: stability to affine change

• Match features after random change in image scale & orientation, with 2% image noise, and affine distortion

• Find nearest neighbor in database of 30,000 features

Performance: distinctiveness

• Vary size of database of features, with 30 degree affine change, 2% image noise

• Measure % correct for single nearest neighbor match

3. Feature matching

• For each feature in A, find nearest neighbor in B

3. Feature matching

• Nearest neighbor search too slow for large database of 128-dimensional data

• Approximate nearest neighbor search:
– Best-bin-first [Beis et al. 97]: modification to k-d tree algorithm
– Use heap data structure to identify bins in order by their distance from query point

• Result: Can give speedup by factor of 1000 while finding nearest neighbor (of interest) 95% of the time

3. Feature matching

• Reject false matches

– Compare distance of nearest neighbor to second nearest neighbor

– Common features aren’t distinctive, therefore bad
– Threshold of 0.8 on the distance ratio provides excellent separation
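The rejection rule can be sketched with a brute-force nearest-neighbor search: a match is kept only if the nearest neighbor is clearly closer than the second nearest. This minimal version uses exhaustive search rather than best-bin-first, and the function name `match_features` is hypothetical; the 0.8 ratio is the threshold from the slide.

```python
import math


def match_features(desc_a, desc_b, ratio=0.8):
    """Return (index_in_a, index_in_b) pairs that pass the distance-ratio test."""
    matches = []
    for i, a in enumerate(desc_a):
        # Distances from feature a to every feature in B, closest first.
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio test needs a second neighbor
        (d1, j1), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:
            matches.append((i, j1))
    return matches
```

A feature whose two best candidates are almost equally distant is ambiguous, so it is dropped even if its nearest neighbor happens to be correct.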

3. Feature matching

• Now, given feature matches…
– Find an object in the scene
– Solve for homography (panorama)
– …

3. Feature matching

• Example: 3D object recognition

3. Feature matching

• 3D object recognition
– Assume affine transform: clusters of size >=3
– Looking for 3 matches out of 3000 that agree on same object and pose: too many outliers for RANSAC or LMS
– Use Hough Transform
  • Each match votes for a hypothesis for object ID/pose
  • Voting for multiple bins & large bin size allow for error due to similarity approximation
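The voting step can be sketched with a dictionary of pose bins. Each match is assumed to carry an object ID plus the translation, rotation, and scale ratio implied by its keypoint pair; that tuple layout, the bin sizes, and the name `hough_clusters` are all hypothetical. For brevity each match votes for a single bin, whereas the slide describes voting for multiple neighboring bins.

```python
import math
from collections import defaultdict


def hough_clusters(matches, min_votes=3, angle_bin=30.0, loc_bin=50.0):
    """Group matches by quantized (object, x, y, rotation, scale) hypothesis.

    matches: iterable of (object_id, dx, dy, dtheta_degrees, scale_ratio).
    Returns {bin_key: [match indices]} for bins with at least min_votes votes.
    """
    votes = defaultdict(list)
    for idx, (obj_id, dx, dy, dtheta, scale_ratio) in enumerate(matches):
        key = (
            obj_id,
            math.floor(dx / loc_bin),
            math.floor(dy / loc_bin),
            math.floor((dtheta % 360.0) / angle_bin),
            math.floor(math.log2(scale_ratio)),  # one bin per factor of 2 in scale
        )
        votes[key].append(idx)
    return {k: v for k, v in votes.items() if len(v) >= min_votes}
```

Bins that collect at least three votes become candidate object/pose hypotheses for the affine fit and verification that follow.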

3. Feature matching

• 3D object recognition: solve for pose

– Affine transform of [x,y] to [u,v]:

– Rewrite to solve for transform parameters:
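The formulas for these two steps did not survive in this transcript. In Lowe's 2004 paper (listed in the references) they take the following form, where the m and t symbols are the unknown affine parameters and each match contributes two rows to the linear system:

```latex
\[
\begin{bmatrix} u \\ v \end{bmatrix}
=
\begin{bmatrix} m_1 & m_2 \\ m_3 & m_4 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
+
\begin{bmatrix} t_x \\ t_y \end{bmatrix}
\]
\[
\begin{bmatrix}
x & y & 0 & 0 & 1 & 0 \\
0 & 0 & x & y & 0 & 1 \\
  &   & \vdots &  &  &
\end{bmatrix}
\begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \\ t_x \\ t_y \end{bmatrix}
=
\begin{bmatrix} u \\ v \\ \vdots \end{bmatrix}
\]
\[
A\,\mathbf{p} = \mathbf{b}
\quad\Rightarrow\quad
\mathbf{p} = \left(A^{T}A\right)^{-1}A^{T}\mathbf{b}
\]
```

With six unknowns and two equations per match, three matches are the minimum needed for a solution; additional matches are absorbed by the least-squares fit.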

3. Feature matching

• 3D object recognition: verify model
1) Discard outliers for pose solution in previous step
2) Perform top-down check for additional features
3) Evaluate probability that match is correct
a) Use Bayesian model, with probability that features would arise by chance if object was not present
b) Takes account of object size in image, textured regions, model feature count in database, accuracy of fit [Lowe 01]

Planar recognition

• Training images

Planar recognition

• Reliably recognized at a rotation of 60° away from the camera

• Affine fit approximates perspective projection

• Only 3 points are needed for recognition

3D object recognition

• Training images

3D object recognition

• Only 3 keys are needed for recognition, so extra keys provide robustness

• Affine model is no longer as accurate

Recognition under occlusion

Illumination invariance

Applications of SIFT

• Object recognition
• Panoramic image stitching
• Robot localization
• Video indexing
• …
• The Office of the Past
– Document tracking and recognition

Location recognition

Robot Localization

Map continuously built over time

Locations of map features in 3D

Sony Aibo

SIFT usage:

Recognize charging station

Communicate with visual cards

Teach object recognition

The Office of the Past

• Paper everywhere

Unify physical and electronic desktops

• Recognize video of paper on physical desktop
– Tracking
– Recognition
– Linking

• Applications
– Find lost documents
– Browse remote desktop
– Find electronic version
– History-based queries

(Figure labels: video camera, desktop)

Example input video

Demo – Remote desktop

System overview

• Setup: video camera, desk, user, computer
• Input: video of desk, images from PDF
• Track & recognize
• Internal representation: scene graph of the desk
• Query: “Where is my W-2?”, followed by the answer

Assumptions

• Document
– Corresponding electronic copy exists
– No duplicates of same document

• Motion
– 3 event types: move/entry/exit
– One document at a time
– Only topmost document can move

Non-assumptions

• Desk need not be initially empty
• Stacks may overlap

Algorithm overview

• Input: frames
• Event detection: before and after frames
• Event interpretation: “A document moved from (x1,y1) to (x2,y2)”
• Document recognition: File1.pdf, File2.pdf, File3.pdf
• Scene graph update

Document tracking example

(Figure: before and after frames)

Motion: (x,y,θ)

(Figure: before and after frames, with File1.pdf through File6.pdf)

• Match against PDF image database

Document Recognition

• Performance analysis
– Tested 20 pages against database of 162 pages
– ~200x300 pixels per document for reliable match

(Figure: recognition rate vs. document resolution)

Results

• Input video
– ~40 minutes
– 1024x768 @ 15 fps
– 22 documents, 49 events

• Running time
– Video processed offline
– No optimization
– A few hours for entire video

Demo – Paper tracking

Photo sorting example

Demo – Photo sorting

Future work

• Enhance realism
– Handle more realistic desktops
– Real-time performance

• More applications
– Support other document tasks
  • E.g., attach reminder, cluster documents
– Beyond documents
  • Other 3D desktop objects, books/CDs

Summary

• SIFT is:
– Scale/rotation invariant local feature
– Highly distinctive
– Robust to occlusion, illumination change, 3D viewpoint change
– Efficient (close to real-time performance)
– Suitable for many useful applications

References

• Distinctive image features from scale-invariant keypoints
– David G. Lowe, International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.

• Recognising panoramas
– Matthew Brown and David G. Lowe, International Conference on Computer Vision (ICCV 2003), Nice, France (October 2003), pp. 1218-25.

• Video-Based Document Tracking: Unifying Your Physical and Electronic Desktops
– Jiwon Kim, Steven M. Seitz and Maneesh Agrawala, ACM Symposium on User Interface Software and Technology (UIST 2004), pp. 99-107.
