Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with...


Page 1: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce.

Building local part models for category-level recognition

C. Schmid, INRIA Grenoble

Joint work with G. Dorko, S. Lazebnik, J. Ponce

Page 2:

Introduction

• Invariant local descriptors

=> robust recognition of specific objects or scenes

• Recognition of textures and object classes

=> description of intra-class variation, selection of discriminant features, spatial relations

[Figures: texture recognition; car detection]

Page 3:

Overview

1. Affine-invariant texture recognition (CVPR’03)

2. A two-layer architecture for texture segmentation and recognition (ICCV’03)

3. Feature selection for object class recognition (ICCV’03)

4. Building affine-invariant part models for recognition

Page 4:

Affine-invariant texture recognition

• Texture recognition under viewpoint changes and non-rigid transformations

• Use of affine-invariant regions

– invariance to viewpoint changes

– spatial selection => more compact representation, reduction of redundancy in texton dictionary

[A sparse texture representation using affine-invariant regions, S. Lazebnik, C. Schmid and J. Ponce, CVPR 2003]

Page 5:

Spatial selection

clustering each pixel

clustering selected pixels

Page 6:

Overview of the approach

Page 7:

Region extraction

• Harris detector

• Laplace detector

Page 8:

Descriptors – Spin images

Page 9:

Signature and EMD

• Hierarchical clustering

=> Signature: $S = \{ (m_1, w_1), \ldots, (m_k, w_k) \}$

• Earth Mover's Distance

– robust distance, optimizes the flow between distributions

– can match signatures of different size

– not sensitive to the number of clusters

$D(S, S') = \dfrac{\sum_{i,j} f_{ij} \, d(m_i, m'_j)}{\sum_{i,j} f_{ij}}$
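The distance above can be illustrated with a small sketch. It assumes each signature is a list of (center, weight) pairs and that the optimal flow f_ij is the minimum-cost flow that moves as much weight as the lighter signature holds. The solver (successive shortest paths with Bellman-Ford), the function name `emd`, and the Euclidean ground distance are illustrative choices, not the implementation used in the talk.

```python
import math


def emd(sig_a, sig_b, ground_distance=math.dist):
    """Earth Mover's Distance between two signatures [(center, weight), ...]."""
    k, m = len(sig_a), len(sig_b)
    source, sink, n = 0, k + m + 1, k + m + 2
    graph = [[] for _ in range(n)]  # edge = [to, capacity, cost, reverse_index]

    def add_edge(u, v, cap, cost):
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0.0, -cost, len(graph[u]) - 1])

    for i, (_, w) in enumerate(sig_a):
        add_edge(source, 1 + i, w, 0.0)
    for j, (_, w) in enumerate(sig_b):
        add_edge(1 + k + j, sink, w, 0.0)
    for i, (ca, _) in enumerate(sig_a):
        for j, (cb, _) in enumerate(sig_b):
            add_edge(1 + i, 1 + k + j, math.inf, ground_distance(ca, cb))

    # Total flow is limited by the lighter signature (partial matching).
    total = min(sum(w for _, w in sig_a), sum(w for _, w in sig_b))
    flow, cost, eps = 0.0, 0.0, 1e-12
    while total - flow > eps:
        # Bellman-Ford shortest path in the residual graph.
        dist = [math.inf] * n
        prev = [None] * n
        dist[source] = 0.0
        for _ in range(n - 1):
            changed = False
            for u in range(n):
                if dist[u] == math.inf:
                    continue
                for idx, (v, cap, c, _) in enumerate(graph[u]):
                    if cap > eps and dist[u] + c < dist[v] - eps:
                        dist[v] = dist[u] + c
                        prev[v] = (u, idx)
                        changed = True
            if not changed:
                break
        if prev[sink] is None:
            break
        # Find the bottleneck along the path, then push flow along it.
        push, v = total - flow, sink
        while v != source:
            u, idx = prev[v]
            push = min(push, graph[u][idx][1])
            v = u
        v = sink
        while v != source:
            u, idx = prev[v]
            edge = graph[u][idx]
            edge[1] -= push
            graph[v][edge[3]][1] += push
            cost += push * edge[2]
            v = u
        flow += push
    # D(S, S') = sum f_ij d(m_i, m'_j) / sum f_ij
    return cost / flow if flow > 0 else 0.0
```

Because the result is normalized by the total flow, signatures with different numbers of clusters or different total weights can still be compared, which matches the properties listed on the slide.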

Page 10:

Database with viewpoint changes

20 samples of 10 different textures

Page 11:

Results

[Plots: spin images; Gabor-like filters]

Page 12:

Overview

1. Affine-invariant texture recognition (CVPR’03)

2. A two-layer architecture for texture segmentation and recognition (ICCV’03)

3. Feature selection for object class recognition (ICCV’03)

4. Building affine-invariant part models for recognition

Page 13:

A two-layer architecture

• Texture recognition + segmentation

• Classification of individual regions + spatial layout

[A generative architecture for semi-supervised texture recognition, S. Lazebnik, C. Schmid, J. Ponce, ICCV 2003]

Page 14:

A two-layer architecture

Modeling:

1. Distribution of the local descriptors (affine invariants)

• Gaussian mixture model

• estimation with EM, allows incorporating unsegmented images

2. Co-occurrence statistics of sub-class labels over affinely adapted neighborhoods

Segmentation + Recognition:

1. Generative model for initial class probabilities

2. Co-occurrence statistics + relaxation to improve labels
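The second step can be sketched as a classical probabilistic relaxation update. This is an assumption about the general scheme, not the exact rule from the paper: each region's class probabilities are reweighted by the support from its neighbors, where `compat[c][c2]` is a hypothetical compatibility in [-1, 1] that would be derived from the co-occurrence statistics, and neighbors are weighted uniformly.

```python
def relax(probs, neighbors, compat, iterations=10):
    """Refine per-region class probabilities using neighbor support.

    probs:     list of per-region probability lists (each sums to 1)
    neighbors: list of neighbor-index lists, one per region
    compat:    compat[c][c2] in [-1, 1], assumed to come from co-occurrence
    """
    n_classes = len(compat)
    for _ in range(iterations):
        updated = []
        for i, p_i in enumerate(probs):
            nbrs = neighbors[i]
            if not nbrs:
                updated.append(list(p_i))
                continue
            # Support for class c: average compatibility with neighbor beliefs.
            support = [
                sum(compat[c][c2] * probs[j][c2]
                    for j in nbrs for c2 in range(n_classes)) / len(nbrs)
                for c in range(n_classes)
            ]
            raw = [p_i[c] * (1.0 + support[c]) for c in range(n_classes)]
            total = sum(raw)
            updated.append([r / total for r in raw] if total > 0 else list(p_i))
        probs = updated
    return probs
```

In a toy chain of three regions where the outer two strongly favor one class, the ambiguous middle region is pulled toward that class after a single pass, which is the kind of label clean-up the co-occurrence stage is meant to provide.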

Page 15:

Texture Dataset – Training Images

T1 (brick) T2 (carpet) T3 (chair) T4 (floor 1) T5 (floor 2) T6 (marble) T7 (wood)

Page 16:

Effect of relaxation + co-occurrence

Original image

Top: before relaxation (individual regions), bottom: after relaxation (co-occurrence)

Page 17:

Recognition + Segmentation Examples

Page 18:

Animal Dataset – Training Images

• no manual segmentation, weakly supervised

• 10 training images per animal (with background)

• no purely negative images

Page 19:

Recognition + Segmentation Examples

Page 20:

Overview

1. Affine-invariant texture recognition (CVPR’03)

2. A two-layer architecture for texture segmentation and recognition (ICCV’03)

3. Feature selection for object class recognition (ICCV’03)

4. Building affine-invariant part models for recognition

Page 21:

Object class detection/classification

• Description of intra-class variations of object parts

[Selection of scale inv. regions for object class recognition, G. Dorko and C. Schmid, ICCV’03]

Page 22:

Object class detection/classification

• Description of intra-class variations of object parts

• Selection of discriminant features (weakly supervised)

Page 23:

Training the model

• Training phase 1

– Input: images of the object with background (positive images), no normalization or alignment of the image

– Extraction of local descriptors: Harris-Laplace, Kadir-Brady, SIFT

– Clustering: estimation of Gaussian mixture with EM

Page 24:

Training the model

• Training phase 1

– Input: images of the object with background (positive images), no normalization or alignment of the image/object

– Extraction of local descriptors: Harris-Laplace, Kadir-Brady, SIFT

– Clustering: estimation of Gaussian mixture with EM
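The clustering step can be illustrated with a deliberately small sketch of EM for a Gaussian mixture. It works on scalar data for readability, whereas real descriptors are high-dimensional vectors. The function name, the fixed initial means, and the variance floor are assumptions made for this example.

```python
import math


def em_gmm_1d(data, initial_means, iterations=50):
    """Fit a 1-D Gaussian mixture with EM; returns (weights, means, variances)."""
    k = len(initial_means)
    weights = [1.0 / k] * k
    means = list(initial_means)
    variances = [1.0] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [
                w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens) or 1e-300  # guard against complete underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for c in range(k):
            n_c = sum(r[c] for r in resp)
            if n_c < 1e-12:
                continue  # leave an empty component unchanged
            weights[c] = n_c / len(data)
            means[c] = sum(r[c] * x for r, x in zip(resp, data)) / n_c
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, data)) / n_c
            variances[c] = max(var, 1e-6)  # floor keeps the density finite
    return weights, means, variances
```

Each fitted component then plays the role of one cluster of descriptors. Because the training images are not segmented, some clusters will describe the object and others the background, which is why a selection phase follows.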

Page 25:

Training the model

• Training phase 2 (selection)

– Input: verification set, positive and negative images

– Rank each cluster with likelihood (or mutual information)

– MAP classifier with the n top clusters

[Equation garbled in extraction. Surviving symbols: R(c), P(·), d, c, l, and the index pairs (u, i) and (n, j).]
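The ranking step can be sketched from simple counts. The sketch assumes that, for each cluster, we know how many verification descriptors from positive and from negative images were assigned to it. The two scores below are generic definitions of a smoothed likelihood ratio and of mutual information between cluster membership and image label. They are not a reconstruction of the formula on the slide, and all names are hypothetical.

```python
import math


def likelihood_ratio(pos_in, pos_total, neg_in, neg_total, smoothing=1.0):
    """Ratio of P(cluster | positive) to P(cluster | negative), smoothed."""
    p_pos = (pos_in + smoothing) / (pos_total + 2 * smoothing)
    p_neg = (neg_in + smoothing) / (neg_total + 2 * smoothing)
    return p_pos / p_neg


def mutual_information(pos_in, pos_total, neg_in, neg_total):
    """Mutual information (bits) between 'in cluster' and 'from positive image'."""
    n = pos_total + neg_total
    in_cluster = pos_in + neg_in
    cells = [  # (joint count, cluster marginal, label marginal)
        (pos_in, in_cluster, pos_total),
        (neg_in, in_cluster, neg_total),
        (pos_total - pos_in, n - in_cluster, pos_total),
        (neg_total - neg_in, n - in_cluster, neg_total),
    ]
    mi = 0.0
    for joint, row, col in cells:
        if joint > 0:
            mi += (joint / n) * math.log2(joint * n / (row * col))
    return mi


def rank_clusters(counts, pos_total, neg_total, score):
    """counts: {cluster_id: (pos_in, neg_in)}; returns ids, best first."""
    return sorted(
        counts,
        key=lambda c: score(counts[c][0], pos_total, counts[c][1], neg_total),
        reverse=True,
    )
```

Note that mutual information is symmetric in the label, so a cluster that fires almost only on negative images also scores high. A caller interested in object parts would additionally require the positive fraction to exceed the negative one.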

Page 26:

Likelihood – mutual information

[Figure: panels labeled 5 and 25; columns: Likelihood, Mutual Information]

– likelihood: more discriminant but very specific

– mutual information: discriminant but not too specific

Page 27:

Results for test images

Detector        Detection   25 Likelihood              10 Mutual Information
Harris-Laplace  354 points  49 correct + 37 incorrect  31 correct + 20 incorrect
Harris-Laplace  277 points  43 correct + 36 incorrect  26 correct + 20 incorrect

Page 28:

Relaxation – propagation of probabilities

Page 29:

Classification

• Assign each test descriptor to the most probable cluster (MAP)

• Each descriptor assigned to one of the top n clusters is positive

• If the number of positive descriptors is above a threshold p, classify the image as positive
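The three steps above can be sketched as follows. The sketch assumes the cluster posteriors of each test descriptor have already been computed from the mixture model. The function names and the input layout are hypothetical.

```python
def map_cluster(cluster_probs):
    """Index of the most probable cluster for one descriptor (MAP)."""
    return max(range(len(cluster_probs)), key=cluster_probs.__getitem__)


def classify_image(descriptor_posteriors, selected_clusters, p):
    """Return (is_positive, number_of_positive_descriptors).

    descriptor_posteriors: one list of cluster probabilities per descriptor
    selected_clusters:     set of indices of the top n ranked clusters
    p:                     threshold on the number of positive descriptors
    """
    positives = sum(
        1 for probs in descriptor_posteriors
        if map_cluster(probs) in selected_clusters
    )
    return positives > p, positives
```

For example, with four descriptors of which three fall in selected clusters, the image is positive for p = 2 and negative for p = 3, since the count has to be strictly above the threshold.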

Page 30:

Classification experiments

                                      Airplanes  Motorbikes  Wild Cats
Training Phase 1 (training set)
  #Positive images                    200        200         25
Training Phase 2 (verification set)
  #Positive images                    200        200         25
  #Negative images                    450        450         450
Testing (test set)
  #Positive images                    400        400         50
  #Negative images                    450        450         450

Image sources: http://www.robots.ox.ac.uk/~vgg/data, Corel Image Library

Page 31:

Results: Motorbikes

Equal-Error-Rates as a function of p.

Receiver-Operating-Characteristic

p=6

Page 32:

Classification results: ROC equal error rates

Category    Detector  Best p  Best %  Estimated p  Estimated %  p=6 %  Fergus %
Airplanes   Harris    8       97.5    5            97           97.25  -
Airplanes   Kadir     18      97      30           96.5         96     94
Motorbikes  Harris    9       99      5            98           98.25  -
Motorbikes  Kadir     19      98.75   32           98.25        98     96
Wild Cats   Harris    31      94      34           92           72     -
Wild Cats   Kadir     17      86      45           82           84     90
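As a reading aid for these numbers, here is a sketch of how an equal error rate can be read off a ROC curve. It assumes the convention that the equal-error point is where the true-positive rate equals one minus the false-positive rate, and it interpolates linearly between measured operating points. The function name and input format are illustrative, and the slides do not say how the operating points were generated.

```python
def equal_error_rate(roc_points):
    """True-positive rate at the point where tpr == 1 - fpr.

    roc_points: iterable of (false_positive_rate, true_positive_rate) pairs.
    """
    prev = None
    for fpr, tpr in sorted(roc_points):
        gap = tpr - (1.0 - fpr)  # negative before the crossing, positive after
        if gap == 0:
            return tpr
        if prev is not None and prev[2] < 0 < gap:
            _, t0, g0 = prev
            alpha = -g0 / (gap - g0)  # fraction of the segment up to the crossing
            return t0 + alpha * (tpr - t0)
        prev = (fpr, tpr, gap)
    raise ValueError("ROC points never cross the line tpr = 1 - fpr")
```

Under this convention, a value of 0.975 would mean that 97.5% of positive images are accepted while 2.5% of negative images are wrongly accepted.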

Page 33:

Overview

1. Affine-invariant texture recognition (CVPR’03)

2. A two-layer architecture for texture segmentation and recognition (ICCV’03)

3. Feature selection for object class recognition (ICCV’03)

4. Building affine-invariant part models for recognition

Page 34:

Affine-invariant part models

• Matching collections of local affine-invariant regions that map with an affine transformation => part

• Matching works for unsegmented images

• Model = a collection of parts

Page 35:

Matching: Faces

spurious match

Page 36:

Matching: 3D Objects

closeup

closeup closeup

Page 37:

Matching: Finding Repeated Patterns

Page 38:

Matching: Finding Symmetry

Page 39:

Modeling for Recognition

• Match multiple pairs of training images to produce several candidate parts.

• Use additional validation images to evaluate repeatability of parts and individual patches.

• Retain a fixed number of parts having the best repeatability score as class model.

• No background model
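The selection of parts by repeatability can be sketched as below. The slides do not define the repeatability score, so this example assumes the simplest reading: the fraction of validation images in which a candidate part is detected. The `detect` callable stands in for the matching procedure and is hypothetical.

```python
def repeatability(part, validation_images, detect):
    """Fraction of validation images in which the part is detected (assumed score)."""
    hits = sum(1 for image in validation_images if detect(part, image))
    return hits / len(validation_images)


def select_parts(candidates, validation_images, detect, n_parts):
    """Keep the n_parts candidates with the highest repeatability."""
    scored = [(repeatability(part, validation_images, detect), part)
              for part in candidates]
    scored.sort(key=lambda item: item[0], reverse=True)  # stable for ties
    return [part for _, part in scored[:n_parts]]


# Toy usage: images are sets of visible part names, detection is membership.
images = [{"wing", "spot"}, {"wing"}, {"wing", "edge"}, {"spot"}]
model = select_parts(["edge", "spot", "wing"], images,
                     lambda part, image: part in image, n_parts=2)
```

Since only positive validation images are scored, no background model is needed, in line with the last bullet.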

Page 40:

The Butterfly Dataset

• 16 training images (8 pairs) per class

• 10 validation images per class

• 437 test images

• 619 images total

Page 41:

Butterfly Models

Top two rows: pairs of images used for modeling. Bottom two rows: closeup views of some of the parts making up the models of the seven butterfly classes.

Page 42:

Recognition

• Top 10 models per class used for recognition

• Multi-class classification results:

total model size (smallest/largest)

Page 43:

Classification Rate vs. Number of Parts

Number of parts

Page 44:

Successful Detection Examples

Model part. Yellow: detected in test image. Blue: occluded in test image.

Test image: all ellipses

Test image: matched ellipses

Note: only one of the two training images is shown

Page 45:

Successful Detection Examples (cont.)

Page 46:

Detection of Multiple Instances

Page 47:

Detection Failures

Page 48:

Future Work

• Spatial relation

– non-rigid models

– relations between clusters and affine-invariant parts

• Feature selection: dimensionality reduction

• Shape information: appropriate descriptors

• Rapid search: structuring of the data