
Page 1

ICIAP 2007 - Tutorial

Advances of statistical learning and

Applications to Computer Vision

Ernesto De Vito and Francesca Odone

- PART 2 -

http://slipguru.disi.unige.it

Page 2

Plan of the second part

Brief intro through a set of applications

One problem in detail (face detection):
- Choosing the representation
- Feature selection
- Classification

On the choice of the classifier (filter methods)

Spotlights on other interesting issues:
- Image annotation
- Kernel engineering
- Global vs local

Page 3

Plan of the second part

Brief intro through a set of applications

One problem in detail (face detection):
- Choosing the representation
- Feature selection
- Classification

On the choice of the classifier (filter methods)

Spotlights on other interesting issues:
- Image annotation
- Kernel engineering
- Global vs local

Page 4

Learning in everyday life

- Security and video-surveillance
- OCR systems
- Robot control
- Biometrics
- Speech recognition
- Early diagnosis from medical data
- Knowledge discovery in big datasets of heterogeneous data (including the Internet)
- Microarray analysis and classification
- Stock market prediction
- Regression applications in computer graphics

Page 5

Statistical Learning in Computer Vision

Page 6

Statistical Learning in Computer Vision

Detection problems

Page 7

Statistical Learning in Computer Vision

More generally: image annotation

car, tree, building, sky, pavement, pedestrian, ...

Page 8

How difficult is image understanding?

Page 9

Plan of the second part

Brief intro through a set of applications

One problem in detail (face detection):
- Choosing the representation
- Feature selection
- Classification

On the choice of the classifier (filter methods)

Spotlights on other interesting issues:
- Image annotation
- Kernel engineering
- Global vs local

Page 10

Regularized face detection

Main steps towards a complete classifier:

- Choosing the representation
- Feature selection
- Classification

Joint work with A. Destrero – C. De Mol – A. Verri

Problem setting: find one or more occurrences of a (roughly frontal) human face, possibly at different resolutions, in a digital image

Page 11

Application scenario (the data)

- 2000 + 2000 training
- 1000 + 1000 validation
- 3400 test

19 × 19 images

Page 12

Initial representation (the dictionary)

- Overcomplete, general-purpose sets of features are effective for modeling visual information
- Many object classes have a peculiar intrinsic structure that can be better appreciated if one looks for symmetries or local geometry

Examples of features: wavelets, curvelets, ranklets, chirplets, rectangle features, ...
Examples of problems: face detection (Heisele et al., Viola & Jones, ...), pedestrian detection (Oren et al., ...), car detection (Papageorgiou & Poggio)

Page 13

Initial representation (the dictionary)

The approach is inspired by biological systems. See, for instance, B. A. Olshausen and D. J. Field, "Sparse coding with an overcomplete basis set: a strategy employed by V1?", 1997.

Usually this approach is coupled with learning from examples

The prior knowledge is embedded in the choice of an appropriate training set

Problem: usually these sets are very big

Page 14

Initial representation (the dictionary)

Rectangle features (Viola & Jones)

... About 64000 features per image patch!

Most of them are correlated:
- Short-range correlation of natural images
- Long-range correlation relative to the object of interest

Page 15

What’s wrong with this?

- Measurements are noisy
- Features are correlated
- The number of features is higher than the number of examples

=> The problem is ill-conditioned

Page 16

Feature selection

Extracting features relevant for a given problem

What is relevant?

- Often related to dimensionality reduction
- But the two problems are different

A possible way to address the problem is to resort to regularization methods

Elastic net penalty (PART 1)

Page 17

Let us review the basic algorithm

We assume a linear dependence between input and output

Φ = {φ_ij} is the measurement matrix, with i = 1, ..., n indexing the examples/data and j = 1, ..., p indexing the dictionary

β = (β_1, ..., β_p)^T is the vector of unknown weights to be estimated

f = (f_1, ..., f_n)^T collects the output values: {-1, 1} labels in binary classification problems

Φβ = f

Page 18

Choosing the appropriate algorithm

What sort of penalty suits our problem best? In other words:

How do we choose ε?

The choice is driven by the application domain:
- What can we say about image correlation?
- Is there any reason to prefer feature A to feature B?
- Do we want them both?

β* = argmin_{β ∈ R^p} { ||Φβ - f||_2^2 + λ ( |β|_1 + ε ||β||_2^2 ) }

Page 19

Peculiarity of images

Given a group of short-range correlated features, each element is a good representative of the group

As for long-range correlated features, it would be interesting to keep them all, but it is difficult to distinguish them at this stage

Notice that in other applications (e.g., microarray analysis) each feature is important per se

Page 20

L1 penalty

- A purely L1 penalty automatically enforces the presence of many zeros in the solution β
- The L1 norm is convex, therefore providing feasible algorithms

(PROB L1) is the Lagrangian formulation of the so-called LASSO problem:

(PROB L1)   β* = argmin_{β ∈ R^p} { ||Φβ - f||_2^2 + λ |β|_1 }

Page 21

L1 penalty

The regularization parameter λ regulates the balance between misfit of the data and penalty

It also allows us to vary the degree of sparsity

Page 22

How do we solve it?

- The solution is not unique
- A number of numerical strategies have been proposed

We adopt the soft-thresholded Landweber iteration:

(ALG L)   β^(t+1) = S_λ [ β^(t) + Φ^T ( f - Φ β^(t) ) ]

where the soft-thresholder S_λ is defined componentwise as

(S_λ h)_j = sign(h_j) ( |h_j| - λ/2 )   if |h_j| ≥ λ/2
(S_λ h)_j = 0                           otherwise

This algorithm converges to a minimizer of (PROB L1) if ||Φ|| < 1.
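As an illustration, here is a minimal MATLAB sketch of (ALG L). The function name, the fixed iteration count, and the assumption that Φ has been rescaled so that ||Φ|| < 1 are our own illustrative choices, not part of the original method description:

    % Sketch of the soft-thresholded Landweber iteration (ALG L).
    % Phi: n x p measurement matrix (rescaled so that ||Phi|| < 1),
    % f: n x 1 output vector, lambda: regularization parameter,
    % T: number of iterations.
    function beta = thresholded_landweber(Phi, f, lambda, T)
        beta = zeros(size(Phi, 2), 1);
        for t = 1:T
            h = beta + Phi' * (f - Phi * beta);            % Landweber step
            beta = sign(h) .* max(abs(h) - lambda/2, 0);   % soft-thresholder S_lambda
        end
    end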

Page 23

Thresholded Landweber and our problem

Φβ = f

Φ is the measurement matrix:
- one row per image
- one column per feature

f is the vector of labels:
- +1 for faces
- -1 for negative examples

In our experiments Φ has size 4000 × 64000 (about 1 GB!)

β^(t+1) = S_λ [ β^(t) + Φ^T ( f - Φ β^(t) ) ]

Page 24

A sampled version of Thresholded Landweber

- We build S feature subsets, each time extracting with replacement m features, m << p
- We compute S sub-problems:

f = Φ_s β_s,   s = 1, ..., S

- Then we keep the features that were selected each time they appeared in a subset
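A sketch of this sampling scheme as we read it from the slides (illustrative names; thresholded_landweber is the sketch from the previous slides, and the keep-only-if-always-selected rule is made explicit):

    % Sampled thresholded Landweber: S subproblems on random feature subsets.
    % Phi: n x p matrix, f: outputs, S: number of subsets, m: subset size (m << p).
    function keep = sampled_selection(Phi, f, lambda, T, S, m)
        p = size(Phi, 2);
        drawn = zeros(p, 1);       % how many times each feature was extracted
        selected = zeros(p, 1);    % how many times it received a nonzero weight
        for s = 1:S
            idx = randi(p, m, 1);                             % extraction with replacement
            beta_s = thresholded_landweber(Phi(:, idx), f, lambda, T);
            drawn = drawn + accumarray(idx, 1, [p 1]);
            selected = selected + accumarray(idx, double(beta_s ~= 0), [p 1]);
        end
        keep = find(drawn > 0 & selected == drawn);   % selected every time they appeared
    end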

Page 25

A sampled version of Thresholded Landweber

In our experiments:
- Each subset is 10% of the original size
- S = 200 (the probability of extracting each feature at least 10 times is high)


Page 26

Structure of the method (I)

Pipeline: S0 → sub1, sub2, ..., subS → Alg L on each subset → + (merge the selected features) → S1

Page 27

Choosing λ

A few words on parameter tuning

A classical choice is cross validation, but in this case it is computationally too heavy (because of the number of sub-problems)

Thus, at this stage, we fix the number of zeros to be reached in a given number of iterations

Page 28

Cross validation

A standard technique for parameter estimation

Try different parameters and choose the one that performs (generalizes) best.

K-fold cross validation:
- Divide the training set into K chunks
- Keep K-1 for training and 1 for validating
- Repeat for the K different validation sets
- Compute an average classification rate
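A minimal sketch of this recipe; train_fn and test_fn are placeholders for any training/prediction pair (e.g., an SVM, or the RLS code shown later in this part), not a specific API:

    % K-fold cross validation returning the average classification rate.
    % X: n x d data, y: n x 1 labels, K: number of folds.
    function acc = kfold_cv(X, y, K, train_fn, test_fn)
        n = size(X, 1);
        fold = mod(randperm(n)', K) + 1;           % random split into K chunks
        acc = 0;
        for k = 1:K
            val = (fold == k);                     % chunk k validates
            model = train_fn(X(~val, :), y(~val)); % the other K-1 chunks train
            y_hat = test_fn(model, X(val, :));
            acc = acc + mean(y_hat == y(val)) / K; % average over the folds
        end
    end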

Page 29

Classification

Two reasons:
- Obtain an effective face detector
- Investigate the quality of the selected features

Face detection is a fairly standard binary classification problem:
- Regularized Least Squares
- Support Vector Machines (Vapnik, 1995)
- ... with some nice kernel

In the following experiments we start using linear SVMs

Page 30

Setting 90% of zeros

We get 4636 features... too many!

What about increasing the number of zeros in the solution?

[ROC curves: one-stage feature selection; one-stage feature selection (cross validation); one-stage feature selection (on the entire set of features)]

Page 31

A refinement of the solution

Setting 99% of zeros:
- 345 features (good)
- Generalization performance drops by about 3% (bad)

IDEA: we apply the thresholded Landweber once again (on S1 = 4636 features)
- This time we tune λ with cross validation
- We obtain 247 features

Page 32

Structure of the method (II)

Pipeline: S0 → sub1, sub2, ..., subS → Alg L on each subset → + → S1 → Alg L → S2

Page 33

Comparative analysis

[ROC curves, comparison with AdaBoost feature selection (Viola & Jones): 2-stage feature selection; 2-stage feature selection + correlation; Viola-Jones feature selection using our same data; Viola-Jones cascade performance]

[ROC curves, comparison with PCA: 2-stage feature selection; PCA]

Page 34

How compact is the solution?

The 247 features are still redundant. For real-time processing we may want to reduce the set further.

[ROC curves, linear vs polynomial kernel: 2-stage feature selection; 2-stage feature selection (polynomial kernel)]

Page 35

A third optimization stage

Starting from S2, we choose one delegate for each group of short-range correlated features. Our correlation analysis discards features that are:
- of the same type
- correlated according to Spearman's test
- spatially close

Page 36

Structure of the method (III)

Pipeline: S0 → sub1, sub2, ..., subS → Alg L on each subset → + → S1 → Alg L → S2 → Corr → S3

Page 37

What do we get?

[ROC curves, linear vs polynomial: 2-stage feature selection + correlation; 2-stage feature selection + correlation (polynomial kernel)]

[ROC curves, with and without the 3rd stage: two-stage feature selection; two-stage + correlation analysis]

Page 38

A fully trainable system for detecting faces

Peculiarity of object detectors:
- For each image, many tests
- Very few positive examples
- Very many negative examples

Page 39

A fully trainable system for detecting faces

Coarse-to-fine methods deal with this, devising multiple classifiers of increasing difficulty

Many approaches (focus-of-attention, cascades, ...)

Page 40

Our cascade of classifiers

Starting from a set of features, say S3, we build many small linear SVM classifiers, each based on at least 3 distant features and able to reach a fixed target performance on a validation set. The target performance is chosen so that each classifier is not likely to miss faces:

- Minimum hit rate: 99.5%
- Maximum false positive rate: 50%

If h_i and f_i are the hit and false positive rates of layer i, the cascade rates are

H = Π_i h_i,   F = Π_i f_i

For 10 layers: H ≈ 0.995^10 ≈ 0.95 and F ≈ 0.5^10 ≈ 10^-3

Page 41

Our cascade of classifiers

Page 42

Finding faces in images

Page 43

Finding faces in images

Page 44

Finding faces in images

Page 45

Finding faces in video frames

Page 46

Finding eye regions...

The beauty of data-driven approaches:
- Same approach
- Different dataset: we extracted eye regions from a subset of the FERET dataset

Page 47

A few results (faces and eyes)

Page 48

Online examples

[video]

Page 49

A few words on the choice of the classifier

- SVMs are very popular for their effectiveness and their generalization ability
- Other algorithms can perform in a similar way and have other attractive properties

- Filter methods are very simple to implement and allow us to obtain very interesting performance
- In particular, iterative methods are very useful when parameter tuning is needed

Joint work with L. Lo Gerfo, L. Rosasco, E. De Vito, A. Verri

Page 50

Experiments on face detection

Size of the training set:      800                700                600

ν method     1.48 ± 0.34        1.53 ± 0.33        1.63 ± 0.32
             (σ=300, t=59)      (σ=341, t=89)      (σ=341, t=95)

RBF-SVM      1.60 ± 0.71        1.99 ± 0.82        2.41 ± 1.39
             (σ=1000, C=0.9)    (σ=1000, C=1)      (σ=800, C=1)

Experiments carried out on a portion of the previously mentioned faces dataset

Page 51

Plan of the second part

Brief intro through a set of applications

One problem in detail (face detection):
- Choosing the representation
- Feature selection
- Classification

On the choice of the classifier (filter methods)

Spotlights on other interesting issues:
- Image annotation
- Kernel engineering
- Global vs local

Page 52

On the classifier choice:Filter methods

- Starting from RLS, we have seen (PART 1) how a large class of methods known as spectral regularization gives rise to regularized learning algorithms
- These methods were originally proposed to solve inverse problems
- The crucial intuition: the same principle that allows us to numerically stabilize a matrix inversion also avoids overfitting

They are worth investigating for their simplicity and effectiveness

Page 53

Filter methods

All these algorithms are consistent and can be easily implemented

They have a common derivation (and similar implementation) but:

- Different theoretical properties (PART 1)
- Different computational burden

Page 54

Filter methods: computational issues

Non-iterative:
- Tikhonov (RLS)
- Truncated SVD

Iterative:
- Landweber
- ν method
- Iterated Tikhonov

Page 55

Filter methods: computational issues

RLS

Training (for a fixed lambda):

    function [c] = rls(K, lambda, y)
        % K: n x n kernel matrix on the training data, y: training labels
        n = length(K);
        c = (K + n*lambda*eye(n)) \ y;

Test:

    function [y_new] = rls_test(x, x_new, c)
        % kernel(...) is the same kernel function used to build K;
        % rows of K_test index test points, columns index training points
        K_test = kernel(x_new, x);
        y_new = K_test * c;
        % for classification
        y_new = sign(y_new);

Be careful in choosing the matrix inversion function (the backslash operator solves the linear system without forming an explicit inverse)

Page 56

Filter methods: computational issues

RLS

The computational cost of RLS is the cost of inverting the matrix K: O(n³)

c(λ) = (K + nλI)^{-1} y

In case parameter tuning is needed, resorting to an eigendecomposition of the matrix K saves time:

K = Q Λ Q^T
c(λ) = Q (Λ + nλI)^{-1} Q^T y
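A sketch of this time saving (illustrative code, names our own): diagonalize K once in O(n³), then every new value of λ costs only O(n²):

    % RLS solutions for a whole grid of lambdas via one eigendecomposition.
    % K: n x n kernel matrix, y: labels, lambdas: vector of candidate parameters.
    function C = rls_path(K, y, lambdas)
        n = size(K, 1);
        [Q, D] = eig((K + K') / 2);   % K = Q*D*Q', symmetrized for numerical safety
        d = diag(D);
        z = Q' * y;                   % computed once, O(n^2)
        C = zeros(n, numel(lambdas));
        for i = 1:numel(lambdas)
            % c(lambda) = Q (D + n*lambda*I)^{-1} Q' y
            C(:, i) = Q * (z ./ (d + n * lambdas(i)));
        end
    end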

Page 57

Filter methods: computational issues

ν method

- t plays the role of the regularization parameter (larger t corresponds to weaker regularization, roughly t ~ 1/λ)
- Computational cost: O(tn²)

- The iterative procedure allows us to compute all the solutions from 0 to t (the regularization path)
- This is convenient if parameter tuning is needed: with an appropriate choice of the maximum number of iterations, the computational cost does not change

Page 58

Plan of the second part

Brief intro through a set of applications

One problem in detail (face detection):
- Choosing the representation
- Feature selection
- Classification

On the choice of the classifier (filter methods)

Spotlights on other interesting issues:
- Image annotation
- Kernel engineering
- Global vs local

Page 59

How difficult is image understanding?

Problem setting (general): assign one or many labels (from a finite but possibly big set of known classes) to a digital image according to its content

This general problem is very complex. Many better-defined domains have been studied:
- Image categorization
- Object detection
- Object recognition

Usually the trick is in defining the boundaries of the problem of interest

Joint work(s) with A. Barla – E. Delponte – A. Verri

Page 60

Object identification/recognition

Nevertheless, the problem is not that simple

Page 61

Image annotation

Problem setting: assign one or more labels (from a finite set of known classes) to a digital image according to its content

Assumption: we look for global descriptions
- Indoor/outdoor
- Drawing/picture
- Day/night
- Cityscape/not

It usually leads to supervised problems (binary classifiers). Low-level descriptions are often applied

Page 62

Image annotation from low-level global descriptions

The problem: capture a global description of the image using simple features

The procedure:
- Build a suitable training set of data
- Find an appropriate representation
- Choose a classification algorithm and a kernel
- Tune the parameters

Page 63

Computer vision ingredients

- Color: color histograms
- Shape: orientation and strength edge histograms; histograms of the lengths of edge chains
- Texture: wavelets, co-occurrence matrices

We represent whole images with low-level descriptions of color, shape or texture

Page 64

A few comments

Histograms appear quite often

We need a simple example to discuss kernel engineering: designing ad hoc kernels for the problem/data at hand, with the right properties:

- Symmetry
- Positive definiteness

=> Let us go through the histogram intersection example

Page 65

Histogram Intersection (HI)

• Since (Swain and Ballard, 1991) it is known that histogram intersection is a powerful similarity measure for color indexing

• Given two images, A and B, of N pixels, if we represent them as histograms with M bins A_i and B_i, histogram intersection is defined as

K(A, B) = Σ_{i=1}^{M} min(A_i, B_i)
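In MATLAB this similarity is a one-liner (a sketch; A and B are the two M-bin histograms):

    % Histogram intersection kernel between histograms A and B.
    hi_kernel = @(A, B) sum(min(A, B));

    k = hi_kernel([3 0 2], [1 2 2]);   % returns 3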

Page 66

Histogram Intersection (HI)

[figure: two 8-bin color histograms and their bin-by-bin intersection]

Page 67

HI is a Kernel

If we build the M·N-dimensional vector

Ā = ( 1, ..., 1, 0, ..., 0,   1, ..., 1, 0, ..., 0,   ...,   1, ..., 1, 0, ..., 0 )

where, for each bin i = 1, ..., M, there are A_i ones followed by N - A_i zeros, it can immediately be seen that

K(A, B) = < Ā, B̄ >

that is, histogram intersection is a dot product (linear kernel) in this feature space.

NOTICE: The proof is based on finding an explicit mapping
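A quick numerical check of this argument (illustrative code, not from the slides): build the binary vector for two toy histograms and verify that the plain dot product reproduces the histogram intersection:

    % Binary embedding: for each bin, h_i ones followed by N - h_i zeros.
    embed = @(h, N) cell2mat(arrayfun(@(c) [ones(1, c), zeros(1, N - c)], ...
                                      h(:)', 'UniformOutput', false));

    A = [3 0 2]; B = [1 2 2]; N = 5;   % two toy 3-bin histograms, N = 5 pixels
    embed(A, N) * embed(B, N)'         % = 3, the dot product of the embeddings
    sum(min(A, B))                     % = 3, the histogram intersection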

Page 68

Histogram intersection: applications

HI has been applied with success to a variety of classification problems, both global and local:

- Indoor/outdoor, day/night, cityscape/landscape classification
- Object detection from local features (SIFT)

In all those cases it outperformed RBF classifiers. Also, HI does not depend on any parameter

Page 69

Local approaches

Global approaches have limits:
- Often objects of interest occupy only a (small) portion of the image
- In a simplified setting, all the rest of the image can be defined as background (or context)
- Depending on the application domain, context can help recognition or make it more difficult

Page 70

Local approaches

We may represent the image content as a set of local features (f1, ..., fn) --- corners, DoG features, ...

We immediately see that this is a variable length description

How to deal with variable length:
- Vocabulary approach
- Local kernels (or kernels on sets)

[figure: local features in scale-space]

Page 71

Local approaches: features vocabulary

It is reminiscent of text categorization

We define a vocabulary of local features and represent our images based on how often a given feature appears in the image

One implementation of this paradigm is the bag of keypoints approach

Page 72

Local approaches: features vocabulary

[Csurka et al., 2004]

Page 73

Local approaches: kernels on sets

Image descriptions based on local features can be seen as sets:

- Variable length
- No internal ordering

A common approach to define a global similarity between feature sets is to combine the local similarities between (possibly all) pairs of set elements:

X = {x_1, ..., x_n},   Y = {y_1, ..., y_m}

K(X, Y) = ℑ( K_L(x_i, y_j) ),   i = 1, ..., n,   j = 1, ..., m

Page 74

Summation kernel [Haussler,1999]

The simplest kernel for sets is the summation kernel

K_S(X, Y) = Σ_{i=1}^{n} Σ_{j=1}^{m} K_L(x_i, y_j)

- K_S is a kernel if K_L is a kernel
- K_S is not so useful in practice:
  - computationally heavy
  - it mixes good and bad correspondences
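In code, the summation kernel is just the sum of all the local kernel values (a sketch; KL is assumed to be a handle returning the n x m matrix of local kernel values between the two sets):

    % Summation kernel between feature sets X (n x d) and Y (m x d).
    summation_kernel = @(X, Y, KL) sum(sum(KL(X, Y)));

    KL = @(X, Y) X * Y';   % e.g., a linear local kernel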

Page 75

Matching kernel [Wallraven et al, 2003]

Among the many other kernels for sets that have been proposed, the matching kernel received a lot of attention for image data:

K_M(X, Y) = (1/2) [ K̂(X, Y) + K̂(Y, X) ]

where

K̂(X, Y) = (1/n) Σ_{i=1}^{n} max_{j=1,...,m} K_L(x_i, y_j)
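A sketch of the matching kernel with the same conventions (illustrative names; note the max operator, which, as the next slide recalls, breaks the Mercer property):

    % Matching kernel (Wallraven et al., 2003) between sets X (n x d) and Y (m x d).
    % KL(X, Y) returns the n x m matrix of local kernel values.
    function k = matching_kernel(X, Y, KL)
        G = KL(X, Y);                 % all pairwise local similarities
        k_xy = mean(max(G, [], 2));   % (1/n) sum_i max_j K_L(x_i, y_j)
        k_yx = mean(max(G, [], 1));   % (1/m) sum_j max_i K_L(x_i, y_j)
        k = (k_xy + k_yx) / 2;        % symmetrized
    end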

Page 76

Matching kernel [Wallraven et al, 2003]

The matching kernel led to promising results on object recognition problems

Nevertheless, it has been shown that it is not a Mercer kernel (because of the max operator)

Page 77

Intermediate matching kernel [Boughorbel et al., 2004]

Let us consider two feature sets

The two feature sets are compared through an auxiliary set of virtual features

The intermediate matching kernel is defined as

X = {x_1, ..., x_n},   Y = {y_1, ..., y_m}

V = {v_1, ..., v_p}

K_V(X, Y) = Σ_{v_i ∈ V} K_{v_i}(X, Y)

where K_{v_i}(X, Y) = K_L(x*, y*), and x* and y* are the elements of X and Y closest to v_i
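A sketch of the intermediate matching kernel (illustrative code; the Euclidean choice of x* and y* is our assumption, and V would typically come from clustering the training features, as discussed two slides ahead):

    % Intermediate matching kernel between sets X (n x d) and Y (m x d),
    % given virtual features V (p x d) and a local kernel kl on single features.
    function k = intermediate_matching_kernel(X, Y, V, kl)
        k = 0;
        for i = 1:size(V, 1)
            dX = sum((X - V(i, :)).^2, 2);   % squared distances to v_i (implicit expansion)
            dY = sum((Y - V(i, :)).^2, 2);
            [~, ix] = min(dX);               % x*: element of X closest to v_i
            [~, iy] = min(dY);               % y*: element of Y closest to v_i
            k = k + kl(X(ix, :), Y(iy, :));
        end
    end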

Page 78

Intermediate matching kernel [Boughorbel et al., 2004]

[figure: illustration of the intermediate matching kernel; for each virtual feature v_i, the closest elements x* and y* of X and Y are compared through K_L]

Page 79

Intermediate matching kernel: how to choose the virtual features

- The intuition behind the virtual features is to find representatives of the feature points extracted from the training set
- Simply, the training set features are grouped into N clusters

- The authors show that the choice of N is not crucial (the bigger the better, but beware of the computational complexity)
- It is better to cluster features within each class

Page 80

Conclusions

Understanding the image content is difficult

Statistical learning can help a lot

Don't forget computer vision! Appropriate descriptions and similarity measures allow us to achieve good results and obtain effective solutions

Page 81

That’s all!

How to contact us:
Ernesto: [email protected]
Francesca: [email protected]

http://slipguru.disi.unige.it (where you will find updated versions of the slides)

Page 82

Selected (and very incomplete) biblio

A. Destrero, C. De Mol, F. Odone, A. Verri. A regularized approach to feature selection for face detection. DISI-TR-2007-01.

A. Mohan, C. Papageorgiou, T. Poggio. Example-based object detection in images by components. IEEE Trans. PAMI, 23(4), 2001.

F. Odone, A. Barla, A. Verri. Building kernels from binary strings for image matching. IEEE Transactions on Image Processing, 14(2):169-180, 2005.

P. Viola, M. J. Jones. Robust real-time face detection. International Journal of Computer Vision, 57(2), 2004.

C. Wallraven, B. Caputo, A. Graf. Recognition with local features: the kernel recipe. ICCV 2003.