Dudek & Jugessur, ICRA 2000. April 2000, IEEE ICRA. Dudek & Jugessur: Robust Place and Object...


Page 1

April 2000, IEEE ICRA Dudek & Jugessur

Dudek & Jugessur, ICRA 2000.

Robust Place and Object Recognition using Local Appearance based Methods

Gregory Dudek and Deeptiman Jugessur

Center for Intelligent Machines

McGill University


Page 2

Outline

• Applications
• PCA: shortcomings
• Objectives
• Approach
• Background
• System Overview
• Results
• Conclusion

Page 3

Two Applications

• Object recognition: what is that thing?
– Recognizing a known object from its visual appearance.
– Landmarks, grasping targets, etc.

• Place recognition (coarse localization): what room am I in?
– Recognizing the current waypoint on a trajectory, validating the current locale for the application of a precise localization method, topological navigation.

Page 4

PCA-based recognition.

• Has now become a well-established method for image recognition.

• PCA-based recognition: global transform of an image with N degrees of freedom into an eigenspace with M << N degrees of freedom.
– The M degrees of freedom are the “most important” characteristics of the set of images being memorized.

• Avoids having to segment the image into object & background by using the whole image.
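
A minimal sketch of the global transform described above, assuming NumPy and random stand-in data. The function names `build_eigenspace` and `project` are hypothetical and not taken from the slides; the SVD is one common way to obtain the principal directions.

```python
import numpy as np

def build_eigenspace(images, m):
    """Return (mean, basis), where basis holds the top-m principal directions.

    images: array of shape (num_images, N), one flattened image per row.
    """
    mean = images.mean(axis=0)
    centered = images - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:m]

def project(image, mean, basis):
    """Map a flattened N-pixel image to its M eigenspace coordinates."""
    return basis @ (image - mean)

rng = np.random.default_rng(0)
train = rng.random((20, 15 * 15))   # 20 synthetic images, N = 225 (stand-in data)
mean, basis = build_eigenspace(train, m=5)
coords = project(train[0], mean, basis)
print(coords.shape)                 # (5,)
```

Each image is reduced from N = 225 numbers to M = 5 coordinates, which is what makes nearest-neighbor comparison in the eigenspace cheap.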

Page 5

Observations

• Using whole image implies recognizing combination of object AND background.

• Segmenting object from background would avoid dependence on background, but it’s too difficult.

• Using a small sub-region gives a less precise recognition (i.e. the sub-window could come from more than one image), but it is efficient.

• Many subwindows together can “vote” for an unambiguous recognition.

• If the sub-windows are suitably chosen, they may totally ignore the background.
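
The voting idea can be sketched as a simple tally, assuming each sub-window has already been matched to one candidate label. The function name `vote` and the example labels are hypothetical; the slides do not specify the exact voting rule.

```python
from collections import Counter

def vote(window_matches):
    """Tally per-sub-window matches and return (winner, vote_count)."""
    tally = Counter(window_matches)
    winner, count = tally.most_common(1)[0]
    return winner, count

# One (possibly ambiguous) match per sub-window; labels are made up.
matches = ["mug", "stapler", "mug", "mug", "phone", "mug"]
print(vote(matches))  # ('mug', 4)
```

Individual sub-windows may point at the wrong image, but the majority still identifies the object.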

Page 6

Problem Statement

• Improving the performance of classic PCA based recognition by accounting for:

– Varying backgrounds

– Planar rotations

– Occlusions

• Also (discussed in less detail):
– Changes in object pose
– Non-rigid deformation

Page 7

Our key idea(s).

• Use sub-windows: several together uniquely accomplish recognition.

• Sub-windows are selected by an attention operator (several kinds can be used).

• Each sub-window is sampled non-uniformly to weight it towards its center.

• Use only the amplitude spectrum to buy rotational invariance.
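
One way to picture center-weighted, non-uniform sampling is a polar grid whose rings are packed more tightly near the middle of the sub-window. The sketch below assumes NumPy; the function name `foveal_polar_sample`, the quadratic ring spacing, and the grid sizes are illustrative assumptions, not the sampling scheme used by the authors.

```python
import numpy as np

def foveal_polar_sample(window, n_radii=8, n_angles=16):
    """Sample a square window on a polar grid that is denser near its center.

    Returns an array of shape (n_radii, n_angles).
    """
    h, w = window.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_r = min(cy, cx)
    # Quadratic spacing places more rings close to the center (assumed choice).
    radii = max_r * np.linspace(0.0, 1.0, n_radii) ** 2
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    rr, aa = np.meshgrid(radii, angles, indexing="ij")
    # Nearest-pixel lookup keeps the sketch short; interpolation would be smoother.
    ys = np.rint(cy + rr * np.sin(aa)).astype(int)
    xs = np.rint(cx + rr * np.cos(aa)).astype(int)
    return window[ys, xs]

window = np.arange(15 * 15, dtype=float).reshape(15, 15)
samples = foveal_polar_sample(window)
print(samples.shape)  # (8, 16)
```

In this layout a planar rotation of the sub-window corresponds, approximately, to a circular shift along the angle axis, which is the property the amplitude spectrum exploits.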

Page 8

Background

• Standard Appearance Based Recognition
– M. Turk and S. Pentland 1991
– S.K. Nayar, H. Murase, S.A. Nene 1994
– H. Murase, S.K. Nayar 1995
– Shortcomings (due to global approach):
  • Background
  • Scale
  • Rotations
  • Local changes of the image or object
  • Occlusion

Page 9

Background (part 2)

• “Enhanced” Local sub-window methods
– D. Lowe 1999: scale invariance, simple features.
– C. Schmid 1999: Probabilistic approach based on sub-windows extracted using Harris operator.
– C. Schmid & R. Mohr 1997: numerous sub-windows extracted using Harris operator for database image retrieval (simpler problem).
– K. Ohba & K. Ikeuchi 1997: K.L.T. operator used for the extraction of sub-windows for the creation of an eigenspace. Only handles occlusion.

• Interest Operator of choice:
– D. Reisfeld, H. Wolfson, Y. Yeshurun 1995: Local symmetry operator

Page 10

Approach

• 2 phases:

– Training (off-line) for the entire database of recognizable images:

• Run an interest operator to obtain a saliency map for each image.

• Choose sub-windows around the salient points for each image.

• Select most informative sub-windows and use foveal sampling.

• Create the eigenspace with the processed sub-windows.

– Testing (on-line) for a candidate test image:

• Run the same interest operator to obtain the saliency map.

• Choose the sub-windows and process the information within them.

• Project the sub-windows onto the eigenspace

• Perform classification based on nearest neighbor rules.
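
The last on-line step can be sketched as follows, assuming the sub-windows have already been projected onto the eigenspace. The function name `nearest_neighbor_labels`, the Euclidean distance, and the toy coordinates are assumptions for illustration; the slides only say that nearest neighbor rules are used.

```python
import numpy as np

def nearest_neighbor_labels(test_coords, train_coords, train_labels):
    """Label each projected test sub-window with its closest training sub-window.

    test_coords: (T, M), train_coords: (R, M), train_labels: length-R sequence.
    """
    # Pairwise squared Euclidean distances, shape (T, R).
    diffs = test_coords[:, None, :] - train_coords[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    nearest = np.argmin(dists, axis=1)
    return [train_labels[i] for i in nearest]

train_coords = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
train_labels = ["room_a", "room_a", "room_b"]
test_coords = np.array([[0.9, 1.2], [4.0, 5.5]])
print(nearest_neighbor_labels(test_coords, train_coords, train_labels))
# ['room_a', 'room_b']
```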

Page 11

Recognition Model

[Diagram: recognition model]

Off-line: Database of recognizable images → run all images through the interest operator → extract sub-windows based on interest operator saliency values and information content → 2D FFT → obtain amplitude spectra for the sub-windows → create low-dim. eigenspace → eigenspace for classification.

On-line: Candidate test image → run the image through the interest operator → extract sub-windows → 2D FFT → obtain amplitude spectra for the sub-windows → project onto eigenspace → eigenspace for classification.
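
The sub-window extraction stage of this model might look like the sketch below, which takes a saliency map as given (from any interest operator) and crops windows around its strongest responses. The function name `extract_subwindows`, the greedy peak picking, and the suppression radius are assumptions; the information-content filtering mentioned in the diagram is not modeled here.

```python
import numpy as np

def extract_subwindows(image, saliency, k=10, size=15):
    """Crop size x size windows around the k most salient points.

    Points too close to the border are skipped, and the neighborhood of
    each chosen point is suppressed so windows do not pile onto one peak.
    """
    half = size // 2
    sal = saliency.astype(float).copy()
    # Border points cannot host a full window.
    sal[:half, :] = -np.inf
    sal[-half:, :] = -np.inf
    sal[:, :half] = -np.inf
    sal[:, -half:] = -np.inf
    windows, centers = [], []
    for _ in range(k):
        y, x = np.unravel_index(np.argmax(sal), sal.shape)
        if not np.isfinite(sal[y, x]):
            break  # no valid candidates left
        windows.append(image[y - half:y + half + 1, x - half:x + half + 1])
        centers.append((int(y), int(x)))
        sal[max(0, y - half):y + half + 1, max(0, x - half):x + half + 1] = -np.inf
    return windows, centers

rng = np.random.default_rng(0)
image = rng.random((64, 64))
saliency = np.zeros((64, 64))
saliency[20, 20] = 3.0   # strongest synthetic peak
saliency[40, 45] = 2.0   # second synthetic peak
windows, centers = extract_subwindows(image, saliency, k=2)
print(centers)           # [(20, 20), (40, 45)]
```

The same function would be run off-line on every database image and on-line on the candidate image, so both sides produce comparable windows.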

Page 12

Polar Samplings and 2D FFT

[Figure: two sub-windows, each passed through polar sampling and a 2D FFT, yield the same amplitude spectrum (in theory).]

Page 13

Shift Theorem

$f(x,y) \rightarrow F(u,v)$

The shift theorem states that:

$f(x-a,\, y-b) \rightarrow e^{j2\pi(au+bv)} F(u,v)$

The amplitudes are unchanged because the phase factor has unit modulus:

$|e^{j2\pi(au+bv)} F(u,v)| = |F(u,v)|$
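
The amplitude identity can be checked numerically for the discrete FFT, where it holds exactly for circular shifts. This is a small sketch assuming NumPy and random stand-in data; the array shape and shift amounts are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.random((8, 16))                            # stand-in 2D signal
shifted = np.roll(f, shift=(2, 5), axis=(0, 1))    # circular shift by (a, b) = (2, 5)

amp = np.abs(np.fft.fft2(f))
amp_shifted = np.abs(np.fft.fft2(shifted))
print(np.allclose(amp, amp_shifted))               # True
```

Only the phase of the spectrum changes under the shift, so discarding the phase discards the shift.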

Page 14

Place Recognition

[Figure: test images shown beside training images, with the best match indicated for each.]

Page 15

Place Recognition (2)

[Figure: test images shown beside training images, with the best match indicated for each.]

Page 16

Object Recognition

[Figure: test image and training image, with the recognition result indicated.]

Page 17

Object Recognition (2)

[Figure: test image and training image, with the best matches indicated.]

Note: background variation and occlusion.

Page 18

Performance metrics

• On-line performance:
– 15x15 pixel subwindows: 90% recognition with 10 subwindows (10 interest points).
– 15x15 pixel subwindows: 100% recognition using 15 more subwindows.
– Interest operator can take 1/30 s to 10 min. (depending on the operator, image size, etc.).
– Classification in eigenspace takes well under 1 sec (can be performed in real time).

Page 19

Performance vs Number of Interest Points

[Plot: recognition rate (up to 100%) vs. number of features.]

Note: 10 windows of size 15x15 means using only 0.7% of the total image content.
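
The arithmetic behind the 0.7% note can be worked through as below. The slides do not state the image size, so the 640x480 figure is purely an assumed example; the calculation also assumes the ten windows do not overlap.

```python
windows, side = 10, 15
window_pixels = windows * side * side      # pixels covered by ten 15x15 windows
implied_total = window_pixels / 0.007      # image size implied by the 0.7% figure
assumed_total = 640 * 480                  # hypothetical image size, not from the slides

print(window_pixels, round(implied_total), f"{window_pixels / assumed_total:.2%}")
# 2250 321429 0.73%
```

So 2250 pixels amount to 0.7% of an image of roughly 320,000 pixels, which is in the range of a 640x480 frame.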

Page 20

Conclusion & Extensions

• Approach to object and place recognition from single video images. Works despite planar rotation, occlusion or other deformations.

• Highly robust.

• Recognition rates of up to 100% with 20 test images.

• Improved robustness to background can be achieved using “masking” [Jugessur & Dudek CVPR 2000].

• Ongoing work seeks to exploit geometry of interest points.

• Could filter in Eigenspace during training to select only “useful” features.

Page 21

That’s all

Page 22

Questions you could ask

• Have you considered the use of alternative interest/attention operators? Does the operator matter?

• What if the background is much more interesting (to the operator) than the object?

• How much does color information matter?

• What is the consequence of not using geometric information (and what does that really mean)?


Page 24

Performance metrics

• Training time: roughly 64 windows, 15x15, 17 objects, 3 views per object: 24 hours.
– This is using MATLAB and highly non-optimized code.

• Using similar methods on global images, other groups have reported times on the order of minutes for similar tasks.

• On-line performance:
– Interest operator can take 1/30 s to 10 min. (depending on the operator, image size, etc.)
– Classification in eigenspace takes well under 1 sec (can be performed in real time).