2D Image Processing Introduction to object recognition &...

88
INFORMATIK Prof. Didier Stricker Kaiserlautern University http://ags.cs.uni-kl.de/ DFKI – Deutsches Forschungszentrum für Künstliche Intelligenz http://av.dfki.de 2D Image Processing Introduction to object recognition & SVM 1

Transcript of 2D Image Processing Introduction to object recognition &...

Page 1: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Prof. Didier Stricker

Kaiserlautern University

http://ags.cs.uni-kl.de/

DFKI – Deutsches Forschungszentrum für Künstliche Intelligenz

http://av.dfki.de

2D Image Processing Introduction to object recognition & SVM

1

Page 2: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  Basic recognition tasks

•  A statistical learning approach

•  Traditional recognition pipeline •  Bags of features •  Classifiers

•  Support Vector Machine

Overview

2

Page 3: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

Image classification •  outdoor/indoor •  city/forest/factory/etc.

3

Page 4: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

Image tagging

•  street •  people •  building •  mountain •  …

4

Page 5: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

Object detection •  find pedestrians

5

Page 6: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  walking •  shopping •  rolling a cart •  sitting •  talking •  …

Activity recognition

6

Page 7: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

mountain

building tree

banner

market people

street lamp

sky

building

Image parsing

7

Page 8: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

This is a busy street in an Asian city. Mountains and a large palace or fortress loom in the background. In the foreground, we see colorful souvenir stalls and people walking around and shopping. One person in the lower left is pushing an empty cart, and a couple of people in the middle are sitting, possibly posing for a photograph.

Image description

8

Page 9: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Image classification

9

Page 10: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

§ Apply a prediction function to a feature representation of the image to get the desired output:

f( ) = “apple” f( ) = “tomato” f( ) = “cow”

The statistical learning framework

10

Page 11: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

y = f(x)

§  Training: given a training set of labeled examples

{(x1,y1), …, (xN,yN)}, estimate the prediction function f by minimizing the prediction error on the training set

§  Testing: apply f to a never before seen test example x and output the predicted value y = f(x)

The statistical learning framework

output prediction function

Image feature

11

Page 12: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Predic'on  

Steps Training  Labels  

Training Images

Training  

Training

Image  Features  

Image  Features  

Testing

Test Image

Learned  model  

Learned  model  

Slide credit: D. Hoiem 12

Page 13: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Tradi&onal  recogni&on  pipeline  

Hand-designed feature

extraction

Trainable classifier

Image Pixels

•  Features are not learned •  Trainable classifier is often generic (e.g. SVM)

Object Class

13

Page 14: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Bags of features

14

Page 15: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

1.  Extract local features 2.  Learn “visual vocabulary” 3.  Quantize local features using visual vocabulary 4.  Represent images by frequencies of “visual words”

Traditional features: Bags-of-features

15

Page 16: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  Sample patches and extract descriptors

1. Local feature extraction

16

Page 17: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

2. Learning the visual vocabulary

Slide credit: Josef Sivic

Extracted descriptors from the training set

17

Page 18: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

2. Learning the visual vocabulary

Clustering

Slide credit: Josef Sivic 18

Page 19: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

2. Learning the visual vocabulary

Clustering

Slide credit: Josef Sivic

Visual vocabulary

19

Page 20: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

∑ ∑ −=k

ki

kiMXDcluster

clusterinpoint

2)(),( mx

Review: K-means clustering

•  Want  to  minimize  sum  of  squared  Euclidean  distances  between  features  xi  and  their  nearest  cluster  centers  mk        

Algorithm:  

•  Randomly  ini&alize  K  cluster  centers  

•  Iterate  un&l  convergence:  §  Assign  each  feature  to  the  nearest  center  §  Recompute  each  cluster  center  as  the  mean  of  all  features  assigned  to  it  

20

Page 21: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Example visual vocabulary

… Source: B. Leibe

Appearance codebook 21

Page 22: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

1.  Extract local features 2.  Learn “visual vocabulary” 3.  Quantize local features using visual vocabulary 4.  Represent images by frequencies of “visual words”

Bags-of-features: main steps

22

Page 23: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  Orderless document representation: frequencies of words from a dictionary Salton & McGill (1983)

Bags of features: Motivation

23

Page 24: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  Orderless document representation: frequencies of words from a dictionary Salton & McGill (1983)

Bags of features: Motivation

24

Page 25: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  Orderless document representation: frequencies of words from a dictionary Salton & McGill (1983)

Bags of features: Motivation

25

Page 26: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

US Presidential Speeches Tag Cloud http://chir.ag/projects/preztags/

•  Orderless document representation: frequencies of words from a dictionary Salton & McGill (1983)

Bags of features: Motivation

26

Page 27: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Spatial pyramids

level 0

Lazebnik, Schmid & Ponce (CVPR 2006) 27

Page 28: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Spatial pyramids

level 0 level 1

Lazebnik, Schmid & Ponce (CVPR 2006) 28

Page 29: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Spatial pyramids

level 0 level 1 level 2

Lazebnik, Schmid & Ponce (CVPR 2006) 29

Page 30: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  Scene  classifica'on  results  

Spatial pyramids

30

Page 31: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  Caltech101 classification results

Spatial pyramids

31

Page 32: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Tradi&onal  recogni&on  pipeline  

Hand-designed feature

extraction

Trainable classifier

Image Pixels

Object Class

32

Page 33: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

f(x) = label of the training example nearest to x

§ All we need is a distance function for our inputs § No training required!

Classifiers: Nearest neighbor

Test example

Training examples

from class 1

Training examples

from class 2

33

Page 34: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  For a new point, find the k closest points from training data

•  Vote for class label with labels of the k points

K-nearest neighbor classifier

k = 5

34

Page 35: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

§ Which classifier is more robust to outliers?

K-nearest neighbor classifier

Credit: Andrej Karpathy, http://cs231n.github.io/classification/ 35

Page 36: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

K-nearest neighbor classifier

Credit: Andrej Karpathy, http://cs231n.github.io/classification/ 36

Page 37: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Linear  classifiers  

§ Find  a  linear  func+on  to  separate  the  classes:  

 f(x)  =  sgn(w  ⋅⋅  x  +  b)  37

Page 38: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Visualizing linear classifiers

Source: Andrej Karpathy, http://cs231n.github.io/linear-classify/ 38

Page 39: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  NN pros: §  Simple to implement §  Decision boundaries not necessarily linear §  Works for any number of classes §  Nonparametric method

•  NN cons: §  Need good distance function §  Slow at test time

•  Linear pros: §  Low-dimensional parametric representation §  Very fast at test time

•  Linear cons: §  Works for two classes §  How to train the linear function? §  What if data is not linearly separable?

Nearest neighbor vs. linear classifiers

39

Page 40: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

40

§ Linear Classifiers can lead to many equally valid choices for the decision boundary

Maximum Margin

Are these really “equally valid”?

Page 41: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

41

§ How can we pick which is best? § Maximize the size of the margin.

Max Margin

Are these really “equally valid”?

Small Margin

Large Margin

Page 42: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

42

§ Support Vectors are those input points (vectors) closest to the decision boundary

1.  They are vectors 2. They “support” the decision hyperplane

Support Vectors

Page 43: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

43

§ Define this as a decision problem

§ The decision hyperplane:

§ No fancy math, just the equation of a hyperplane.

Support Vectors

~w

T~x + b = 0

Page 44: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

44

§ Define this as a decision problem § The decision hyperplane:

§ Decision Function:

Support Vectors

~w

T~x + b = 0

D(~xi) = sign(~w

T~xi + b)

Page 45: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

45

§ Define this as a decision problem § The decision hyperplane:

§ Margin hyperplanes:

Support Vectors

~w

T~x + b = 0

~w

T~x + b = γ

~w

T~x + b = γ

Page 46: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

46

§ The decision hyperplane:

§ Scale invariance

Support Vectors

~w

T~x + b = 0

c~w

T~x + cb = 0

Page 47: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Support Vectors

§ The decision hyperplane:

§ Scale invariance

~w

T~x + b = 0

47

c~w

T~x + cb = 0

~w

T~x + b = γ

~w

T~x + b = γ

Page 48: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Support Vectors

§ The decision hyperplane:

§ Scale invariance

~w

T~x + b = 0

48

This scaling does not change the decision hyperplane, or the support vector hyperplanes. But we will eliminate a variable from the optimization

~

w

⇤T~x + b

⇤ = 1~

w

⇤T~x + b

⇤ = 1

c~w

T~x + cb = 0

Page 49: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

49

§ We will represent the size of the margin in terms of w.

§ This will allow us to simultaneously §  Identify a decision boundary §  Maximize the margin

What are we optimizing?

Page 50: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

50

1.  There must at least one point that lies on each support hyperplanes

How do we represent the size of the margin in terms of w?

x1

x2Proof outline: If not, we could define a larger margin support hyperplane that does touch the nearest point(s).

Page 51: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

51

1.  There must at least one point that lies on each support hyperplanes

2.  Thus:

How do we represent the size of the margin in terms of w?

x1

x2w

Tx1 + b = 1

w

Tx2 + b = 1

3.  And:

w

T (x1 x2) = 2

Page 52: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

How  do  we  represent  the  size  of  the  margin  in  terms  of  w?  

1.  There must at least one point that lies on each support hyperplanes

2.  Thus:

52

x1

x2w

Tx1 + b = 1

w

Tx2 + b = 1

3.  And:

w

T (x1 x2) = 2

w

Tx + b = 0

Page 53: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

§ The  vector  w  is  perpendicular  to  the  decision  hyperplane  §  If  the  dot  product  of  two  vectors  equals  zero,  the  two  vectors  are  perpendicular.  

How  do  we  represent  the  size  of  the  margin  in  terms  of  w?  

53

x1

x2

w

Tx + b = 0

w

w

T (x1 x2) = 2

Page 54: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

54

§ The margin is the projection of x1 – x2 onto w, the normal of the hyperplane.

How do we represent the size of the margin in terms of w?

x1

x2

w

Tx + b = 0

w

w

T (x1 x2) = 2

Page 55: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

55

Recall: Vector Projection

~u

~v

cos(✓) =adjacent

hypotenuse

~v · ~uk~uk ~u

~v · ~u = k~vkk~uk cos(✓)

hypotenuse = k~vk adjacent = kgoalk

cos(✓) =k ~

goalkk~vk

k~vkk~ukk~vkk~uk cos(✓) =

k ~

goalkk~vk

~v · ~uk~vkk~uk =

k ~

goalkk~vk

~v · ~uk~uk = k ~

goalk

Page 56: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

56

§ The margin is the projection of x1 – x2 onto w, the normal of the hyperplane.

How do we represent the size of the margin in terms of w?

x1

x2

w

Tx + b = 0

w

w

T (x1 x2) = 2

~v · ~uk~uk ~u

~w

T (x1 − x2)k~wk ~w

2k~wkSize of the Margin:

Projection:

Page 57: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

57

§ Goal: maximize the margin

Maximizing the margin

max2

k~wk

min k~wk

where ti(~w

Txi + b) 1

Linear Separability of the data by the decision boundary

~w

Txi + b ≥ 1 if ti = 1

~w

Txi + b 1 if ti = −1

Page 58: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

58

§  If constraint optimization then Lagrange Multipliers § Optimize the “Primal”

Max Margin Loss Function

L(~w, b) =12

~w · ~wN1X

i=0

↵i[ti((~w · ~xi) + b) 1]

min k~wk where ti(~w

Txi + b) 1

Lagrange Multiplier

Page 59: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

59

§ Optimize the “Primal”

Max Margin Loss Function

L(~w, b) =12

~w · ~wN1X

i=0

↵i[ti((~w · ~xi) + b) 1]

@L(~w, b)@b

= 0

N1X

i=0

↵iti = 0

Partial wrt b

Page 60: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

60

§ Optimize the “Primal”

Max Margin Loss Function

L(~w, b) =12

~w · ~wN1X

i=0

↵i[ti((~w · ~xi) + b) 1]

@L(~w, b)@ ~w

= 0

~wN1X

i=0

↵iti ~xi = 0

~w =N1X

i=0

↵iti ~xi

Partial wrt w

Page 61: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

61

§ Optimize the “Primal”

Max Margin Loss Function

L(~w, b) =12

~w · ~wN1X

i=0

↵i[ti((~w · ~xi) + b) 1]

@L(~w, b)@ ~w

= 0

~wN1X

i=0

↵iti ~xi = 0

~w =N1X

i=0

↵iti ~xi

Partial wrt w

Now have to find αi. Substitute back to the Loss function

Page 62: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

62

§ Construct the “dual”

Max Margin Loss Function

L(~w, b) =12

~w · ~wN1X

i=0

↵i[ti((~w · ~xi) + b) 1]

~w =N1X

i=0

↵iti ~xi

W (↵) =N1X

i=0

↵i12

N1X

i,j=0

↵i↵jtitj(~xi · ~xj)

where ↵i 0N1X

i=0

↵iti = 0

N1X

i=0

↵iti = 0With:

Then:

Page 63: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

63

§ Solve this quadratic program to identify the lagrange multipliers and thus the weights

Dual formulation of the error

W (↵) =N1X

i=0

↵i12

N1X

i,j=0

↵i↵jtitj(~xi · ~xj)

where ↵i 0

There exist (rather) fast approaches to quadratic program solving in both C, C++, Python, Java and R

Page 64: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

64

Quadratic Programming

subject to (one or more) A~x k

B~x = l

minimize f(~x) =12~x

TQ~x + c

T~x

• If Q is positive semi definite, then f(x) is convex.

• If f(x) is convex, then there is a single maximum.

W (↵) =N1X

i=0

↵i12

N1X

i,j=0

↵i↵jtitj(~xi · ~xj)

where ↵i 0

Page 65: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Support Vector Expansion

§ When  αi  is  non-­‐zero  then  xi  is  a  support  vector  

§ When  αi  is  zero  xi  is  not  a  support  vector  65

~w =N1X

i=0

↵iti ~xi

D(~x) = sign(~w

T~x + b)

= sign

0

@"

N1X

i=0

↵iti ~xi

#T

~x + b

1

A

= sign

"N1X

i=0

↵iti(~xiT~x)

#+ b

!

New decision Function Independent of the

Dimension of x!

Page 66: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

66

§  In constraint optimization: At the optimal solution

§  Constraint * Lagrange Multiplier = 0

Karush-Kuhn-Tucker Conditions

↵i(1 ti(~w

T~xi + b)) = 0

if ↵i 6= 0 ! ti(~w

T~xi + b) = 1

Only points on the decision boundary contribute to the solution!

Page 67: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable

Nonlinear SVMs

Φ: x → φ(x)

Image source 67

Page 68: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  Linearly separable dataset in 1D:

•  Non-separable dataset in 1D:

•  We can map the data to a higher-dimensional space:

0 x

0 x

0 x

x2

Nonlinear SVMs

Slide credit: Andrew Moore 68

Page 69: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable.

•  The kernel trick: instead of explicitly computing the lifting transformation φ(x), define a kernel function K such that

K(x , y) = φ(x) · φ(y) §  (to be valid, the kernel function must satisfy

Mercer’s condition)

The kernel trick

69

Page 70: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  Linear SVM decision function:

The kernel trick

C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

bybi iii +⋅=+⋅ ∑ xxxw α

Support vector

learned weight

70

Page 71: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

bKybyi

iiii

iii +=+⋅ ∑∑ ),()()( xxxx αϕϕα

The kernel trick •  Linear  SVM  decision  func&on:  

• Kernel  SVM  decision  func&on:  

 

• This  gives  a  nonlinear  decision  boundary  in  the  original  feature  space  

C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

bybi iii +⋅=+⋅ ∑ xxxw α

71

Page 72: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Polynomial  kernel:   dcK )(),( yxyx ⋅+=

72

Page 73: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  Also known as the radial basis function (RBF) kernel:

Gaussian kernel

||x – y||

K(x, y)

⎟⎠

⎞⎜⎝

⎛ −−=2

2

1exp),( yxyxσ

K

73

Page 74: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Gaussian  kernel  

SV’s

74

Page 75: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  Histogram intersection:

•  Square root (Bhattacharyya kernel):

Kernels for histograms

K(h1,h2 ) = min(h1(i),h2 (i))i=1

N

K(h1,h2 ) = h1(i)h2 (i)i=1

N

75

Page 76: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  Pros §  Kernel-based framework is very powerful, flexible §  Training is convex optimization, globally optimal solution

can be found §  Amenable to theoretical analysis §  SVMs work very well in practice, even with very small

training sample sizes

•  Cons §  No “direct” multi-class SVM, must combine two-class

SVMs (e.g., with one-vs-others) §  Computation, memory (esp. for nonlinear SVMs)

SVMs: Pros and cons

76

Page 77: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  Generalization refers to the ability to correctly classify never before seen examples

•  Can be controlled by turning “knobs” that affect the complexity of the model

Generalization

Training set (labels known) Test set (labels unknown) 77

Page 78: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  Training error: how does the model perform on the data on which it was trained?

•  Test error: how does it perform on never before seen data?

Diagnosing generalization ability

Source: D. Hoiem

Training error

Test error

Underfitting Overfitting

Model complexity High Low

Err

or

78

Page 79: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  Underfitting: training and test error are both high §  Model does an equally poor job on the training and the test set §  Either the training procedure is ineffective or the model is too

“simple” to represent the data •  Overfitting: Training error is low but test error is high

•  Model fits irrelevant characteristics (noise) in the training data §  Model is too complex or amount of training data is insufficient

Underfitting and overfitting

Underfitting

Overfitting

Good generalization

Figure source 79

Page 80: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Effect of training set size

Many training examples

Few training examples

Model complexity High Low

Test

Err

or

Source: D. Hoiem 80

Page 81: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Effect of training set size

Many training examples

Few training examples

Model complexity High Low

Test

Err

or

Source: D. Hoiem 81

Page 82: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

•  Split the data into training, validation, and test subsets •  Use training set to optimize model parameters •  Use validation test to choose the best model •  Use test set only to evaluate performance

Validation

Model complexity

Training set loss

Test set loss

Err

or

Validation set loss

Stopping point

82

Page 83: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Histogram of Oriented Gradients (HoG)

[Dalal  and  Triggs,  CVPR  2005]  

10x10  cells  

20x20  cells  

HoGify  

83

Page 84: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

Histogram of Oriented Gradients (HoG)

84

Page 85: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

§  Many  examples  on  Internet:  

§  hVps://www.youtube.com/watch?v=b_DSTuxdGDw  

Histogram of Oriented Gradients (HoG)

[Dalal  and  Triggs,  CVPR  2005]   85

Page 86: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

86

Page 87: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

§ Demo software: http://cs.stanford.edu/people/karpathy/svmjs/demo

§ A useful website: www.kernel-machines.org § Software:

§  LIBSVM: www.csie.ntu.edu.tw/~cjlin/libsvm/ !!§  SVMLight: svmlight.joachims.org/

Resources

87

Page 88: 2D Image Processing Introduction to object recognition & …ags.cs.uni-kl.de/fileadmin/inf_ags/2dip-ss16/2DIP_SS20… ·  · 2016-06-132D Image Processing Introduction to object

INFORMATIK

88

Thank you!