Lecture 13 Visual recognition - Silvio Savarese

Page 1: Lecture 13 Visual recognition - Silvio Savarese...Lecture 13 Visual recognition –Principle Component Analysis (PCA) –Linear Discriminant Analysis (LDA) Originally introduced for

Lecture 13 - Silvio Savarese 20-Feb-14

• Announcements

Lecture 13 Visual recognition

Page 2:

Lecture 13 Visual recognition

• Object classification: bag of words models

– Discriminative methods

– Generative methods

• Object classification by PCA and FLD

Page 3:

Variability due to:

• View point

• Illumination

• Occlusions

• Intra-class variability

Challenges

Page 4:

Challenges: intra-class variation

Page 5:

Basic properties

• Representation

– How to represent an object category; which classification scheme?

• Learning

– How to learn the classifier, given training data

• Recognition

– How the classifier is to be used on novel data

Page 6:

Definition of “BoW”

– Histogram of visual words (codewords)

codewords dictionary
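
A minimal sketch of how such a histogram could be built, assuming the dictionary is a list of codeword vectors and each local feature is assigned to its nearest codeword under Euclidean distance. The function names and the toy 2-D descriptors are illustrative assumptions, not taken from the lecture.

```python
def nearest_codeword(feature, dictionary):
    """Index of the codeword closest to `feature` (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(dictionary)), key=lambda k: sq_dist(feature, dictionary[k]))


def bow_histogram(features, dictionary):
    """Count how many local features fall on each codeword."""
    hist = [0] * len(dictionary)
    for f in features:
        hist[nearest_codeword(f, dictionary)] += 1
    return hist


# Toy example: a 3-word dictionary and five 2-D descriptors (made-up values).
dictionary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
features = [(0.1, 0.1), (0.9, 0.2), (1.1, -0.1), (0.2, 0.8), (0.0, 0.05)]
hist = bow_histogram(features, dictionary)
print(hist)
```

The histogram keeps only the counts per codeword, so the positions of the features in the image are discarded.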

Page 7:

[Figure: bag of words pipeline. Representation: feature detection & representation, codewords dictionary, image representation. Learning: category models (and/or) classifiers. Recognition: category decision.]

Page 8:

Classification

• Discriminative methods

– Nearest neighbors

– Linear classifier

– SVM

• Generative methods

Page 9:

SVM classification

[Figure: model space (axis w) containing the category models for Class 1 … Class N]

Page 10:

SVM classification

[Figure: query image placed in the model space (axis w)]

Winning class: pink

Page 11:

Caltech 101

Page 12:

Caltech 101

BoW: ~15%

Page 13:

Major drawback of BOW models

Don’t capture spatial information!

Page 14:

Spatial Pyramid Matching

Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. S. Lazebnik, C. Schmid, and J. Ponce. 2006

Histogram intersection of two histograms $h_1$, $h_2$ with $N$ bins:

$$I(h_1, h_2) = \sum_{i=1}^{N} \min\big(h_1(i), h_2(i)\big)$$

[Equation: the kernel $SPM(h_1, h_2)$, a combination of histogram intersection terms $I(h_1, h_2)$ with a factor $\frac{1}{2}$]
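
The histogram intersection used by spatial pyramid matching can be sketched as below. The per-level weights in `pyramid_match` follow the weighting described in the Lazebnik, Schmid and Ponce paper (coarser levels count less); this is an assumption about the intended form, since the slide's own equation is not fully legible here. The function names and toy histograms are illustrative.

```python
def histogram_intersection(h1, h2):
    """I(h1, h2) = sum_i min(h1(i), h2(i))."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


def pyramid_match(levels1, levels2):
    """Weighted sum of intersections over pyramid levels.

    levels[l] is the concatenated histogram of all cells at level l
    (level 0 is the whole image). With L the finest level, level 0 gets
    weight 1/2**L and level l >= 1 gets weight 1/2**(L - l + 1).
    """
    L = len(levels1) - 1
    total = 0.0
    for l, (a, b) in enumerate(zip(levels1, levels2)):
        weight = 1.0 / 2 ** L if l == 0 else 1.0 / 2 ** (L - l + 1)
        total += weight * histogram_intersection(a, b)
    return total


# Toy example: 2 codewords, level 0 (whole image) and level 1 (two cells).
image_a = [[2, 2], [1, 1, 1, 1]]
image_b = [[1, 3], [1, 0, 0, 3]]
score = pyramid_match(image_a, image_b)
print(score)
```

With a single level the kernel reduces to the plain histogram intersection, which is the bag of words case without spatial information.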

Page 15:

Caltech 101

Page 16:

Pyramid matching

Caltech 101

Page 17:

Discriminative models

• Nearest neighbor ($10^6$ examples): Shakhnarovich, Viola, Darrell 2003; Berg, Berg, Malik 2005...

• Neural networks: LeCun, Bottou, Bengio, Haffner 1998; Rowley, Baluja, Kanade 1998

• Support Vector Machines: Guyon, Vapnik, Heisele, Serre, Poggio…

• Latent SVM, Structural SVM: Felzenszwalb 00; Ramanan 03…

• Boosting: Viola, Jones 2001; Torralba et al. 2004; Opelt et al. 2006,…

• Random forests

Slide adapted from Antonio Torralba

Courtesy of Vittorio Ferrari

Slide credit: Kristen Grauman

Page 18:

Lecture 13 Visual recognition

• Object classification: bag of words models

– Discriminative methods

– Generative methods

• Object classification by PCA and FLD

Page 19:

Image classification

$p(\text{zebra} \mid \text{image})$ vs. $p(\text{no zebra} \mid \text{image})$

• Bayes rule:

$$\frac{p(\text{zebra} \mid \text{image})}{p(\text{no zebra} \mid \text{image})} = \frac{p(\text{image} \mid \text{zebra})}{p(\text{image} \mid \text{no zebra})} \cdot \frac{p(\text{zebra})}{p(\text{no zebra})}$$

posterior ratio = likelihood ratio × prior ratio

Page 20:

Discriminative methods

• Modeling the posterior ratio:

$$\frac{p(\text{zebra} \mid \text{image})}{p(\text{no zebra} \mid \text{image})}$$

[Figure: decision boundary separating zebra from non-zebra]

Page 21:

Generative methods

$p(\text{zebra} \mid \text{image})$ vs. $p(\text{no zebra} \mid \text{image})$

• Bayes rule:

$$\frac{p(\text{zebra} \mid \text{image})}{p(\text{no zebra} \mid \text{image})} = \frac{p(\text{image} \mid \text{zebra})}{p(\text{image} \mid \text{no zebra})} \cdot \frac{p(\text{zebra})}{p(\text{no zebra})}$$

posterior ratio = likelihood ratio × prior ratio

Page 22:

Generative models

1. Naïve Bayes classifier

– Csurka, Bray, Dance & Fan, 2004

2. Hierarchical Bayesian text models (pLSA and LDA)

– Background: Hofmann 2001, Blei, Ng & Jordan, 2004

– Object categorization: Sivic et al. 2005, Sudderth et al. 2005

– Natural scene categorization: Fei-Fei et al. 2005

Page 23:

Some notations

• w: a collection of all N codewords in the image, $w = [w_1, w_2, \ldots, w_N]$

• c: category of the image

Page 24:

The Naïve Bayes model

[Graphical model: class c generating each of the N codewords w]

$$p(c \mid w) \propto p(c)\, p(w \mid c) = p(c)\, p(w_1, \ldots, w_N \mid c)$$

• $p(c)$: prior prob. of the object classes

• $p(w \mid c)$: image likelihood given the class

Page 25:

The Naïve Bayes model

$$p(c \mid w) \propto p(c)\, p(w \mid c) = p(c)\, p(w_1, \ldots, w_N \mid c) = p(c) \prod_{n=1}^{N} p(w_n \mid c)$$

• $p(c)$: prior prob. of the object classes

• $p(w \mid c)$: image likelihood given the class

• $p(w_n \mid c)$: likelihood of the nth visual word given the class

• Assume that each feature (codeword) is conditionally independent given the class:

$$p(w_1, \ldots, w_N \mid c) = \prod_{i=1}^{N} p(w_i \mid c)$$

Page 26:

The Naïve Bayes model

$$p(c \mid w) \propto p(c)\, p(w \mid c) = p(c)\, p(w_1, \ldots, w_N \mid c) = p(c) \prod_{n=1}^{N} p(w_n \mid c)$$

• $p(c)$: prior prob. of the object classes

• $p(w \mid c)$: image likelihood given the class

• $p(w_n \mid c)$: likelihood of the nth visual word given the class

Example: 2 classes, bananas vs oranges

• Histogram of colors

• $w_i$ = number of pixels colored in yellow in the image

[Figure: likelihoods $p(w_i \mid c_1)$ and $p(w_i \mid c_2)$; x-axis: percentage of pixels that are colored in yellow in the image (25%, 50%, 75%)]

Page 27:

The Naïve Bayes model

$$p(c \mid w) \propto p(c)\, p(w \mid c) = p(c)\, p(w_1, \ldots, w_N \mid c) = p(c) \prod_{n=1}^{N} p(w_n \mid c)$$

• $p(c)$: prior prob. of the object classes

• $p(w \mid c)$: image likelihood given the class

• $p(w_n \mid c)$: likelihood of the nth visual word given the class

• How do we learn $p(w_i \mid c_j)$?

• From empirical frequencies of codewords in images from a given class

[Figure: likelihoods $p(w_i \mid c_1)$ and $p(w_i \mid c_2)$]

Page 28:

Classification/Recognition

Object class decision:

$$c^{*} = \arg\max_{c}\; p(c \mid w) \propto p(c) \prod_{n=1}^{N} p(w_n \mid c)$$

Example: 2 classes, bananas vs oranges

• Query image contains a banana

• Look at how many pixels are yellow: say 60%

• Look at corresponding likelihood values given the two class hypotheses: banana!

[Figure: likelihoods $p(w_i \mid c_1)$ and $p(w_i \mid c_2)$, evaluated at 60%]

Page 29:

Summary: Generative models

• Naïve Bayes

– Unigram models in document analysis

– Assumes conditional independence of words given class

– Parameter estimation: frequency counting
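
The Naïve Bayes recipe summarized above (frequency counting for learning, arg max of prior times product of word likelihoods for recognition) can be sketched as follows. This is a sketch under stated assumptions: images are given as lists of codeword indices, add-one smoothing (`alpha`) is added here so unseen codewords do not get zero probability, and the toy banana/orange data is made up.

```python
import math
from collections import Counter


def train_naive_bayes(images, labels, vocab_size, alpha=1.0):
    """Estimate log p(c) and log p(w | c) by frequency counting.

    images: list of codeword-index lists; labels: one class label per image.
    alpha: add-one (Laplace) smoothing, an assumption not stated on the slides.
    """
    class_counts = Counter(labels)
    word_counts = {c: [0] * vocab_size for c in class_counts}
    for words, c in zip(images, labels):
        for w in words:
            word_counts[c][w] += 1
    log_prior = {c: math.log(n / len(labels)) for c, n in class_counts.items()}
    log_lik = {}
    for c, counts in word_counts.items():
        total = sum(counts) + alpha * vocab_size
        log_lik[c] = [math.log((k + alpha) / total) for k in counts]
    return log_prior, log_lik


def classify(words, log_prior, log_lik):
    """Return argmax_c of log p(c) + sum_n log p(w_n | c)."""
    def score(c):
        return log_prior[c] + sum(log_lik[c][w] for w in words)
    return max(log_prior, key=score)


# Toy data: codeword 0 is "yellow-ish", codeword 1 is "orange-ish".
train_images = [[0, 0, 0, 1], [0, 0, 1, 0], [1, 1, 1, 0], [1, 1, 0, 1]]
train_labels = ["banana", "banana", "orange", "orange"]
log_prior, log_lik = train_naive_bayes(train_images, train_labels, vocab_size=2)
prediction = classify([0, 0, 1, 0, 0], log_prior, log_lik)
print(prediction)
```

Working in log space turns the product over codewords into a sum, which avoids numerical underflow when an image contains many codewords.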

Page 30:

Csurka et al. 2004

Csurka’s dataset – 7 classes

Page 31:

E = 28%

E = 15%

Page 32:

Generative vs discriminative

• Discriminative methods

– Computationally efficient & fast

• Generative models

– Convenient for weakly- or un-supervised, incremental training

– Prior information

– Flexibility in modeling parameters

Page 33:

Weakness of the BoW models

• All have equal probability for bag-of-words methods

• Location information is important

• No rigorous geometric information of the object components

• Segmentation and localization unclear

Page 34:

Lecture 13 Visual recognition

• Object classification: bag of words models

– Discriminative methods

– Generative methods

• Object classification by PCA and FLD

Page 35:

Object classification by…

– Principal Component Analysis (PCA)

– Linear Discriminant Analysis (LDA)

Originally introduced for faces:

• Eigenfaces and Fisherfaces

Turk & Pentland, 91

Belhumeur et al.,

Page 36:

The Space of images or histograms

• An image (or histogram) H is a point in a high dimensional space

– An N x M image is a point in $\mathbb{R}^{NM}$

[Thanks to Chuck Dyer, Steve Seitz, Nishino]

Page 37:

Key Idea

• H in the possible set $\{\hat{x}\}$ are highly correlated.

• So, compress them to a low-dimensional subspace that captures key appearance characteristics of the visual DOFs.

• Use PCA for estimating the sub-space (dimensionality reduction)

• Compare two objects by projecting the images into the subspace and measuring the Euclidean distance between them.

EIGENFACES: [Turk and Pentland 91]

Page 38:

[Figure: image space and face space]

• Computes n-dim subspace such that the projection of the data points onto the subspace has the largest variance among all n-dim subspaces.

• Maximize the scatter of the training images in face space

Page 39:

Use PCA for estimating the sub-space

• Computes n-dim subspace such that the projection of the data points onto the subspace has the largest variance among all n-dim subspaces.

[Figure: six data points (1 to 6) in the (x1, x2) plane and their PCA projection onto the axis X1']

Page 40:

Use PCA for estimating the sub-space

[Figure: data in the (x1, x2) plane with the 1st principal component and the 2nd principal component]

Page 41:

PCA Mathematical Formulation

Define a transformation, W (orthonormal), from n-dimensional $x_j$ to m-dimensional $y_j$:

$$y_j = W^T x_j, \quad j = 1, 2, \ldots, N$$

Measure data scatter:

$$S_T = \sum_{j=1}^{N} (x_j - \bar{x})(x_j - \bar{x})^T \quad \text{(data scatter matrix)}$$

$$\tilde{S}_T = \sum_{j=1}^{N} (y_j - \bar{y})(y_j - \bar{y})^T = W^T S_T W \quad \text{(transformed data scatter matrix)}$$

PCA = eigenvalue decomposition of a data covariance matrix:

$$W = [\, v_1 \;\; v_2 \;\; \cdots \;\; v_m \,] \quad \text{(eigenvectors of } S_T\text{)}$$
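
A minimal NumPy sketch of this formulation, assuming samples are stored one per row and that the eigenvectors with the largest eigenvalues are the ones kept. The function name `pca` and the toy data are illustrative; the data is centered before projecting, which leaves the scatter of the projections unchanged.

```python
import numpy as np


def pca(X, m):
    """X: (N, n) array, one sample per row. Returns mean, W (n, m), eigenvalues."""
    mean = X.mean(axis=0)
    Xc = X - mean
    S_T = Xc.T @ Xc                          # total scatter matrix (n x n)
    eigvals, eigvecs = np.linalg.eigh(S_T)   # symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:m]    # indices of the m largest eigenvalues
    return mean, eigvecs[:, order], eigvals[order]


# Toy data: points spread mostly along the direction (1, 1).
X = np.array([[0.0, 0.1], [1.0, 0.9], [2.0, 2.1], [3.0, 2.9], [4.0, 4.1]])
mean, W, eigvals = pca(X, m=1)
Y = (X - mean) @ W                           # y_j = W^T (x_j - mean), one row per sample

S_T = (X - mean).T @ (X - mean)
S_T_proj = Y.T @ Y                           # scatter of the projected data
print(W.ravel(), eigvals)
```

On this data the projected scatter equals both $W^T S_T W$ and the largest eigenvalue of $S_T$, which is the quantity PCA maximizes.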

Page 42:

[Figure: image space mapped to face space; eigenfaces v1, v2, v3, v4]

Page 43:

Projecting onto the Eigenfaces

• The eigenfaces $v_1, \ldots, v_K$ span the space of faces

– A face $x$ is converted to eigenface coordinates by projecting its difference from the average face onto each eigenface: $a_k = v_k^T (x - \bar{x})$, $k = 1, \ldots, K$

Page 44:

Algorithm

Training

1. Align training images $x_1, x_2, \ldots, x_N$

2. Compute average face $\bar{x} = \frac{1}{N} \sum_i x_i$

3. Compute the difference image $x_i - \bar{x}$

Note that each image is formulated into a long vector!

Page 45:

Algorithm

Training (continued)

4. Compute the covariance matrix (total scatter matrix)

$$S_T = \sum_{j=1}^{N} (x_j - \bar{x})(x_j - \bar{x})^T$$

5. Compute the eigenvectors of the covariance matrix $S_T$

6. Compute training projections $a_1, a_2, \ldots, a_N$

Testing

1. Take query image X

2. Project X into Eigenface space (W = {eigenfaces}) and compute projection $\omega$

3. Compare projection $\omega$ with all N training projections $a_i$
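
The training and testing steps can be sketched with NumPy as below. This assumes the images are already aligned and flattened into row vectors, and that "compare" means picking the training projection at the smallest Euclidean distance; the function names and the 4-pixel toy images are made up for illustration.

```python
import numpy as np


def train_eigenfaces(images, K):
    """images: (N, d) array, each row an image flattened into a long vector."""
    mean_face = images.mean(axis=0)                # step 2: average face
    A = images - mean_face                         # step 3: difference images
    S_T = A.T @ A                                  # step 4: total scatter matrix
    eigvals, eigvecs = np.linalg.eigh(S_T)         # step 5: eigenvectors of S_T
    W = eigvecs[:, np.argsort(eigvals)[::-1][:K]]  # top-K eigenfaces as columns
    projections = A @ W                            # step 6: a_1, ..., a_N
    return mean_face, W, projections


def recognize(query, mean_face, W, projections, labels):
    """Project the query and return the label of the closest training projection."""
    omega = (query - mean_face) @ W
    distances = np.linalg.norm(projections - omega, axis=1)
    return labels[int(np.argmin(distances))]


# Toy "images" of 4 pixels each; two identities with made-up pixel values.
train = np.array([[10.0, 10.0, 0.0, 0.0],
                  [9.0, 11.0, 1.0, 0.0],
                  [0.0, 0.0, 10.0, 10.0],
                  [1.0, 0.0, 9.0, 11.0]])
labels = ["A", "A", "B", "B"]
mean_face, W, projections = train_eigenfaces(train, K=2)
result = recognize(np.array([9.5, 10.0, 0.5, 0.5]), mean_face, W, projections, labels)
print(result)
```

For real face images the d x d scatter matrix is very large; a common workaround is to take the eigenvectors of the much smaller N x N matrix built from the same difference images, which this sketch omits for brevity.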

Page 46:

Illustration of Eigenfaces

• The visualization of eigenvectors:

These are the first 4 eigenvectors from a training set of 400 images (ORL Face Database).

Page 47:

Eigenfaces look somewhat like generic faces.

Page 48:

Reconstruction and Errors

• Only selecting the top P eigenfaces reduces the dimensionality.

• Fewer eigenfaces result in more information loss, and hence less discrimination between faces.

[Figure: reconstructions with P = 4, P = 200, P = 400]
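
The trade-off between P and information loss can be checked numerically. The sketch below uses random vectors as stand-ins for face images (an assumption made only to keep the example self-contained) and measures the total squared reconstruction error when only the top P eigenvectors are kept.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 8))           # 20 made-up "images" with 8 pixels each
mean = X.mean(axis=0)
A = X - mean
eigvals, eigvecs = np.linalg.eigh(A.T @ A)
V = eigvecs[:, ::-1]                   # eigenvectors sorted by decreasing eigenvalue


def reconstruction_error(P):
    """Total squared error when keeping only the top-P eigenvectors."""
    W = V[:, :P]
    X_hat = mean + (A @ W) @ W.T       # project, then map back to image space
    return float(np.sum((X - X_hat) ** 2))


errors = [reconstruction_error(P) for P in range(1, 9)]
print(errors)
```

The error never increases as P grows and vanishes (up to rounding) once all eigenvectors are used; for each P it equals the sum of the discarded eigenvalues of the scatter matrix.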

Page 49:

Summary for Eigenface

Pros

• Non-iterative, globally optimal solution

Limitations

• PCA projection is optimal for reconstruction from a low dimensional basis, but may NOT be optimal for discrimination…

Page 50:

Extensions

• PCA-SIFT: Y. Ke, R. Sukthankar. PCA-SIFT: A More Distinctive Representation for Local Image Descriptors. IEEE CVPR 04

• Generalized PCA: R. Vidal, Y. Ma, and S. Sastry. Generalized Principal Component Analysis (GPCA). IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 27, number 12, pages 1 - 15, 2005.

• Tensor Faces: M.A.O. Vasilescu, D. Terzopoulos. "Multilinear Analysis of Image Ensembles: TensorFaces," Proc. 7th European Conference on Computer Vision (ECCV'02), Copenhagen, Denmark, May, 2002

Page 51:

Linear Discriminant Analysis (LDA)

Fisher’s Linear Discriminant (FLD)

• Eigenfaces exploit the max scatter of the training images in face space

• Fisherfaces attempt to maximise the between class scatter, while minimising the within class scatter.

Page 52:

Illustration of the Projection

Using two classes as example:

[Figure: two panels in the (x1, x2) plane, one showing a poor projection and one showing a good projection]

Page 53:

Variables

• N Sample images: $\{x_1, \ldots, x_N\}$

• c classes: $\{\chi_1, \ldots, \chi_c\}$

• Average of each class: $\mu_i = \frac{1}{N_i} \sum_{x_k \in \chi_i} x_k$

• Total average: $\mu = \frac{1}{N} \sum_{k=1}^{N} x_k$

Page 54:

Scatters

• Scatter of class i: $S_i = \sum_{x_k \in \chi_i} (x_k - \mu_i)(x_k - \mu_i)^T$

• Within class scatter: $S_W = \sum_{i=1}^{c} S_i$

• Between class scatter: $S_B = \sum_{i=1}^{c} N_i (\mu_i - \mu)(\mu_i - \mu)^T$

• Total scatter: $S_T = S_W + S_B$
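
The decomposition of the total scatter into within class and between class scatter can be verified numerically. This is a small NumPy sketch; the function name `scatter_matrices`, the integer label encoding and the toy points are assumptions made for illustration.

```python
import numpy as np


def scatter_matrices(X, y):
    """Return (S_W, S_B, S_T) for samples X (N, d) with integer class labels y."""
    mu = X.mean(axis=0)                               # total average
    d = X.shape[1]
    S_W = np.zeros((d, d))
    S_B = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)                        # class average
        S_W += (Xc - mu_c).T @ (Xc - mu_c)            # scatter of class c
        diff = (mu_c - mu).reshape(-1, 1)
        S_B += len(Xc) * (diff @ diff.T)              # weighted by class size
    S_T = (X - mu).T @ (X - mu)                       # total scatter
    return S_W, S_B, S_T


X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [6.0, 5.0], [7.0, 8.0], [8.0, 7.0]])
y = np.array([0, 0, 0, 1, 1, 1])
S_W, S_B, S_T = scatter_matrices(X, y)
print(np.allclose(S_T, S_W + S_B))
```

The identity holds because each class contributes its own scatter around its mean plus the offset of that mean from the total average, counted once per sample in the class.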

Page 55:

Illustration

[Figure: two classes in the (x1, x2) plane with class scatters $S_1$ and $S_2$, within class scatter $S_W = S_1 + S_2$, and between class scatter $S_B$]

$$S_i = \sum_{x_k \in \chi_i} (x_k - \mu_i)(x_k - \mu_i)^T, \qquad S_W = \sum_{i=1}^{c} S_i, \qquad S_B = \sum_{i=1}^{c} N_i (\mu_i - \mu)(\mu_i - \mu)^T$$

Page 56:

Mathematical Formulation (1)

• After projection: $y_k = W^T x_k$

• Between class scatter (of y’s): $\tilde{S}_B = W^T S_B W$

• Within class scatter (of y’s): $\tilde{S}_W = W^T S_W W$

Page 57:

Illustration

[Figure: the two classes in the (x1, x2) plane after projection, with projected scatters $\tilde{S}_1$, $\tilde{S}_2$, $\tilde{S}_W = \tilde{S}_1 + \tilde{S}_2$ and $\tilde{S}_B$]

$$y_k = W^T x_k, \qquad \tilde{S}_B = W^T S_B W, \qquad \tilde{S}_W = W^T S_W W$$

$$S_W = \sum_{i=1}^{c} S_i, \qquad S_B = \sum_{i=1}^{c} N_i (\mu_i - \mu)(\mu_i - \mu)^T$$

Page 58:

Mathematical Formulation

• The desired projection:

$$W_{opt} = \arg\max_{W} \frac{|\tilde{S}_B|}{|\tilde{S}_W|} = \arg\max_{W} \frac{|W^T S_B W|}{|W^T S_W W|}$$

• How is it found? Generalized eigenvectors:

$$S_B w_i = \lambda_i S_W w_i, \quad i = 1, \ldots, m$$

• If $S_W$ has full rank, the generalized eigenvectors are eigenvectors of $S_W^{-1} S_B$ with largest eigenvalues
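
A minimal NumPy sketch of finding the Fisher projection, assuming $S_W$ has full rank so that the eigenvectors of $S_W^{-1} S_B$ can be computed directly. The function name `fisher_lda`, the integer label encoding and the toy two-class data are illustrative assumptions.

```python
import numpy as np


def fisher_lda(X, y, m):
    """Top-m generalized eigenvectors of S_B w = lambda S_W w (S_W assumed full rank)."""
    mu = X.mean(axis=0)
    d = X.shape[1]
    S_W = np.zeros((d, d))
    S_B = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)
        S_W += (Xc - mu_c).T @ (Xc - mu_c)
        diff = (mu_c - mu).reshape(-1, 1)
        S_B += len(Xc) * (diff @ diff.T)
    # Eigenvectors of S_W^{-1} S_B; solve() avoids forming the inverse explicitly.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
    order = np.argsort(eigvals.real)[::-1][:m]
    return eigvecs[:, order].real, eigvals[order].real


# Two elongated classes: large spread along x1, separated along x2 (made-up data).
X = np.array([[0.0, 0.0], [2.0, 0.2], [4.0, -0.2], [6.0, 0.1],
              [0.0, 1.0], [2.0, 1.2], [4.0, 0.8], [6.0, 1.1]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
W, eigvals = fisher_lda(X, y, m=1)
projected = X @ W[:, 0]
print(W[:, 0], projected)
```

On this data the direction of largest total variance is roughly x1, along which the two classes overlap, while the Fisher direction is close to x2 and keeps the classes apart after projection.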

Page 59:

Results: Eigenface vs. Fisherface

• Variation in Facial Expression, Eyewear, and Lighting

• Input: 160 images of 16 people

• Train: 159 images

• Test: 1 image

[Figure: sample images with glasses and without glasses, 3 lighting conditions, 5 expressions]

Page 60:

Results: Eigenface vs. Fisherface

[Figure: plot with y-axis "Error rate %"]

Page 61:

Next lecture

• Object detection