
Prof. Didier Stricker

Kaiserslautern University

http://ags.cs.uni-kl.de/

DFKI – Deutsches Forschungszentrum für Künstliche Intelligenz

http://av.dfki.de

2D Image Processing Introduction to object recognition & SVM


•  Basic recognition tasks

•  A statistical learning approach

•  Traditional recognition pipeline
•  Bags of features
•  Classifiers

•  Support Vector Machine

Overview


Image classification •  outdoor/indoor •  city/forest/factory/etc.


Image tagging

•  street •  people •  building •  mountain •  …


Object detection •  find pedestrians


•  walking •  shopping •  rolling a cart •  sitting •  talking •  …

Activity recognition


[Figure labels: sky, mountain, building, tree, banner, street lamp, market, people]

Image parsing


This is a busy street in an Asian city. Mountains and a large palace or fortress loom in the background. In the foreground, we see colorful souvenir stalls and people walking around and shopping. One person in the lower left is pushing an empty cart, and a couple of people in the middle are sitting, possibly posing for a photograph.

Image description


Image classification


§ Apply a prediction function to a feature representation of the image to get the desired output:

f( ) = “apple” f( ) = “tomato” f( ) = “cow”

The statistical learning framework


y = f(x)

§  Training: given a training set of labeled examples

{(x1,y1), …, (xN,yN)}, estimate the prediction function f by minimizing the prediction error on the training set

§  Testing: apply f to a never before seen test example x and output the predicted value y = f(x)

The statistical learning framework

(y: output, f: prediction function, x: image feature)
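As a concrete illustration of the framework above (not from the slides), here is a minimal scikit-learn sketch; the dataset, the raw-pixel features, and the choice of a linear SVM as f are my own assumptions for the example.

```python
# Minimal sketch of the statistical learning framework
# (assumes scikit-learn is installed; any generic classifier would do).
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)        # x: image features (raw pixels), y: labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

f = SVC(kernel="linear")                   # the prediction function f to be estimated
f.fit(X_train, y_train)                    # training: minimize error on labeled examples
y_pred = f.predict(X_test)                 # testing: apply f to never-before-seen examples
print("test accuracy:", (y_pred == y_test).mean())
```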


Steps

Training: training images → image features (+ training labels) → training → learned model

Testing: test image → image features → learned model → prediction

Slide credit: D. Hoiem

Traditional recognition pipeline

Image pixels → hand-designed feature extraction → trainable classifier → object class

•  Features are not learned
•  Trainable classifier is often generic (e.g. SVM)


Bags of features


1.  Extract local features
2.  Learn "visual vocabulary"
3.  Quantize local features using visual vocabulary
4.  Represent images by frequencies of "visual words"

Traditional features: Bags-of-features


•  Sample patches and extract descriptors

1. Local feature extraction


2. Learning the visual vocabulary

Slide credit: Josef Sivic

Extracted descriptors from the training set


2. Learning the visual vocabulary

Clustering

Slide credit: Josef Sivic

2. Learning the visual vocabulary

Clustering

Slide credit: Josef Sivic

Visual vocabulary


Review: K-means clustering

•  Want to minimize the sum of squared Euclidean distances between features xᵢ and their nearest cluster centers mₖ:

D(X, M) = Σₖ Σ_{i ∈ cluster k} ‖xᵢ − mₖ‖²

Algorithm:

•  Randomly initialize K cluster centers
•  Iterate until convergence:
   §  Assign each feature to the nearest center
   §  Recompute each cluster center as the mean of all features assigned to it
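A minimal NumPy sketch of the algorithm above (illustrative only; the toy data and the variable names are assumptions, and in practice a library implementation such as sklearn.cluster.KMeans would normally be used):

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """Plain K-means: X is (N, d); returns (centers, assignments)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=K, replace=False)]      # random initialization
    for _ in range(n_iters):
        # Assignment step: nearest center for every feature
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        # Update step: each center becomes the mean of its assigned features
        new_centers = np.array([X[assign == k].mean(axis=0) if np.any(assign == k)
                                else centers[k] for k in range(K)])
        if np.allclose(new_centers, centers):                    # converged
            break
        centers = new_centers
    return centers, assign

X = np.random.default_rng(1).normal(size=(500, 2))               # toy descriptors
centers, assign = kmeans(X, K=3)
```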


Example visual vocabulary

… Source: B. Leibe

Appearance codebook

1.  Extract local features
2.  Learn "visual vocabulary"
3.  Quantize local features using visual vocabulary
4.  Represent images by frequencies of "visual words"

Bags-of-features: main steps
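A rough sketch of steps 2-4 of the pipeline above, under the assumption that local descriptors (e.g. 128-D SIFT vectors) have already been extracted; the vocabulary size, the use of MiniBatchKMeans, and the random stand-in descriptors are my own choices for illustration:

```python
# Sketch: learn a visual vocabulary, quantize descriptors, build word-frequency histograms.
# `descriptors_per_image` is a list of (n_i, 128) arrays; extraction itself is not shown.
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def build_vocabulary(all_descriptors, vocab_size=200):
    # Step 2: cluster a sample of descriptors into "visual words"
    km = MiniBatchKMeans(n_clusters=vocab_size, random_state=0, n_init=3)
    km.fit(all_descriptors)
    return km

def bow_histogram(descriptors, vocab):
    # Steps 3-4: assign each descriptor to its nearest word, count word frequencies
    words = vocab.predict(descriptors)
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)            # normalize to frequencies

# toy stand-in for real local descriptors
rng = np.random.default_rng(0)
descriptors_per_image = [rng.normal(size=(rng.integers(50, 200), 128)) for _ in range(10)]
vocab = build_vocabulary(np.vstack(descriptors_per_image), vocab_size=50)
X = np.array([bow_histogram(d, vocab) for d in descriptors_per_image])  # one row per image
```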


•  Orderless document representation: frequencies of words from a dictionary Salton & McGill (1983)

Bags of features: Motivation


US Presidential Speeches Tag Cloud http://chir.ag/projects/preztags/


Spatial pyramids

level 0 → level 1 → level 2

Lazebnik, Schmid & Ponce (CVPR 2006)

•  Scene classification results

Spatial pyramids


•  Caltech101 classification results

Spatial pyramids


Traditional recognition pipeline

Image pixels → hand-designed feature extraction → trainable classifier → object class


f(x) = label of the training example nearest to x

§ All we need is a distance function for our inputs § No training required!

Classifiers: Nearest neighbor

[Figure: a test example among training examples from class 1 and class 2]

•  For a new point, find the k closest points from training data

•  Vote for class label with labels of the k points

K-nearest neighbor classifier

k = 5
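A minimal sketch of k-NN voting (my own illustrative code; Euclidean distance and a toy 2-D dataset are assumed):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=5):
    """Classify x by majority vote among its k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)    # distance to every training point
    nearest = np.argsort(dists)[:k]                # indices of the k closest points
    votes = Counter(y_train[nearest])              # vote with their labels
    return votes.most_common(1)[0][0]

# toy 2D example with two classes
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(3, 1, (20, 2))])
y_train = np.array([0] * 20 + [1] * 20)
print(knn_predict(X_train, y_train, np.array([2.5, 2.5]), k=5))
```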


§ Which classifier is more robust to outliers?

K-nearest neighbor classifier

Credit: Andrej Karpathy, http://cs231n.github.io/classification/


Linear classifiers

§ Find a linear function to separate the classes:

f(x) = sgn(w · x + b)
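In code, this decision rule is just the sign of a dot product. A tiny sketch with made-up (not learned) parameters; how to choose w and b is the subject of the rest of the lecture:

```python
import numpy as np

def linear_classify(x, w, b):
    # f(x) = sgn(w . x + b): +1 on one side of the hyperplane, -1 on the other
    return np.sign(np.dot(w, x) + b)

w, b = np.array([1.0, -2.0]), 0.5          # example parameters (not learned)
print(linear_classify(np.array([3.0, 1.0]), w, b))
```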


Visualizing linear classifiers

Source: Andrej Karpathy, http://cs231n.github.io/linear-classify/

•  NN pros:
   §  Simple to implement
   §  Decision boundaries not necessarily linear
   §  Works for any number of classes
   §  Nonparametric method

•  NN cons:
   §  Need good distance function
   §  Slow at test time

•  Linear pros:
   §  Low-dimensional parametric representation
   §  Very fast at test time

•  Linear cons:
   §  Works for only two classes at a time
   §  How to train the linear function?
   §  What if data is not linearly separable?

Nearest neighbor vs. linear classifiers


§ Linear Classifiers can lead to many equally valid choices for the decision boundary

Maximum Margin

Are these really “equally valid”?


§ How can we pick which is best? § Maximize the size of the margin.

Max Margin

Are these really “equally valid”?

Small Margin

Large Margin


§ Support Vectors are those input points (vectors) closest to the decision boundary

1.  They are vectors 2. They “support” the decision hyperplane

Support Vectors


§ Define this as a decision problem

§ The decision hyperplane:

§ No fancy math, just the equation of a hyperplane.

Support Vectors

wᵀx + b = 0


§ Define this as a decision problem § The decision hyperplane:

§ Decision Function:

Support Vectors

wᵀx + b = 0

D(xᵢ) = sign(wᵀxᵢ + b)


§ Define this as a decision problem § The decision hyperplane:

§ Margin hyperplanes:

Support Vectors

wᵀx + b = 0

wᵀx + b = γ
wᵀx + b = −γ


§ The decision hyperplane:

§ Scale invariance

Support Vectors

wᵀx + b = 0

cwᵀx + cb = 0


Support Vectors

§ The decision hyperplane:

wᵀx + b = 0

§ Scale invariance:

cwᵀx + cb = 0

wᵀx + b = γ
wᵀx + b = −γ


Support Vectors

§ The decision hyperplane:

wᵀx + b = 0

§ Scale invariance:

cwᵀx + cb = 0

This scaling does not change the decision hyperplane or the support vector hyperplanes, but it lets us eliminate a variable from the optimization: choose the scale so that

w*ᵀx + b* = 1
w*ᵀx + b* = −1

on the two support vector hyperplanes.


§ We will represent the size of the margin in terms of w.

§ This will allow us to simultaneously §  Identify a decision boundary §  Maximize the margin

What are we optimizing?


1.  There must be at least one point that lies on each support hyperplane

How do we represent the size of the margin in terms of w?

Proof outline: if not, we could define a larger-margin support hyperplane that does touch the nearest point(s).


1.  There must be at least one point that lies on each support hyperplane

2.  Thus:

wᵀx₁ + b = 1
wᵀx₂ + b = −1

3.  And, subtracting:

wᵀ(x₁ − x₂) = 2

How do we represent the size of the margin in terms of w?


§ The vector w is perpendicular to the decision hyperplane. § If the dot product of two vectors equals zero, the two vectors are perpendicular.

How do we represent the size of the margin in terms of w?

wᵀx + b = 0

wᵀ(x₁ − x₂) = 2


§ The margin is the projection of x1 – x2 onto w, the normal of the hyperplane.

How do we represent the size of the margin in terms of w?

wᵀx + b = 0

wᵀ(x₁ − x₂) = 2


Recall: Vector Projection

The projection of v onto u is ((v · u) / ‖u‖²) u, and its length is (v · u) / ‖u‖.

Derivation:

cos(θ) = adjacent / hypotenuse
v · u = ‖v‖ ‖u‖ cos(θ)
hypotenuse = ‖v‖,  adjacent = ‖goal‖ (the length of the projection)

cos(θ) = ‖goal‖ / ‖v‖
(v · u) / (‖v‖ ‖u‖) = ‖goal‖ / ‖v‖
(v · u) / ‖u‖ = ‖goal‖


§ The margin is the projection of x1 – x2 onto w, the normal of the hyperplane.

How do we represent the size of the margin in terms of w?

wᵀx + b = 0

wᵀ(x₁ − x₂) = 2

Projection of (x₁ − x₂) onto w:  (wᵀ(x₁ − x₂) / ‖w‖) (w / ‖w‖)

Size of the margin:  wᵀ(x₁ − x₂) / ‖w‖ = 2 / ‖w‖


§ Goal: maximize the margin

Maximizing the margin

max 2 / ‖w‖   ⇔   min ‖w‖   subject to   tᵢ(wᵀxᵢ + b) ≥ 1

Linear separability of the data by the decision boundary:

wᵀxᵢ + b ≥ 1  if tᵢ = 1
wᵀxᵢ + b ≤ −1  if tᵢ = −1


§ For constrained optimization, use Lagrange multipliers § Optimize the "Primal"

Max Margin Loss Function

min ‖w‖   subject to   tᵢ(wᵀxᵢ + b) ≥ 1

L(w, b) = ½ w·w − Σᵢ αᵢ [tᵢ(w·xᵢ + b) − 1]      (sum over i = 0, …, N−1)

(the αᵢ are the Lagrange multipliers)


§ Optimize the “Primal”

Max Margin Loss Function

L(w, b) = ½ w·w − Σᵢ αᵢ [tᵢ(w·xᵢ + b) − 1]

Partial derivative with respect to b:

∂L(w, b)/∂b = 0   ⇒   Σᵢ αᵢ tᵢ = 0


§ Optimize the “Primal”

Max Margin Loss Function

L(w, b) = ½ w·w − Σᵢ αᵢ [tᵢ(w·xᵢ + b) − 1]

Partial derivative with respect to w:

∂L(w, b)/∂w = 0   ⇒   w − Σᵢ αᵢ tᵢ xᵢ = 0   ⇒   w = Σᵢ αᵢ tᵢ xᵢ



Now we have to find the αᵢ: substitute w = Σᵢ αᵢ tᵢ xᵢ back into the loss function.


§ Construct the “dual”

Max Margin Loss Function

L(w, b) = ½ w·w − Σᵢ αᵢ [tᵢ(w·xᵢ + b) − 1]

With:  w = Σᵢ αᵢ tᵢ xᵢ   and   Σᵢ αᵢ tᵢ = 0

Then:  W(α) = Σᵢ αᵢ − ½ Σᵢ,ⱼ αᵢ αⱼ tᵢ tⱼ (xᵢ · xⱼ)

where αᵢ ≥ 0  and  Σᵢ αᵢ tᵢ = 0


§ Solve this quadratic program to identify the Lagrange multipliers and thus the weights

Dual formulation of the error

W(α) = Σᵢ αᵢ − ½ Σᵢ,ⱼ αᵢ αⱼ tᵢ tⱼ (xᵢ · xⱼ)

where αᵢ ≥ 0  and  Σᵢ αᵢ tᵢ = 0

There exist (rather) fast quadratic-program solvers for C, C++, Python, Java, and R.
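A sketch of solving this dual with a generic QP solver, assuming the cvxopt package is installed; the hard-margin formulation and the separable toy data are assumptions made for illustration (dedicated SVM trainers such as LIBSVM solve a closely related problem far more efficiently):

```python
# W(alpha) is maximized, i.e. we minimize 1/2 a^T P a - 1^T a with
# P_ij = t_i t_j (x_i . x_j), subject to a_i >= 0 and sum_i a_i t_i = 0.
import numpy as np
from cvxopt import matrix, solvers

rng = np.random.default_rng(0)                       # linearly separable toy data
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
t = np.hstack([-np.ones(20), np.ones(20)])

N = len(X)
P = matrix(np.outer(t, t) * (X @ X.T) + 1e-8 * np.eye(N))  # quadratic term (tiny ridge for stability)
q = matrix(-np.ones(N))                              # linear term
G, h = matrix(-np.eye(N)), matrix(np.zeros(N))       # alpha_i >= 0
A, b_eq = matrix(t.reshape(1, -1)), matrix(0.0)      # sum_i alpha_i t_i = 0

solvers.options["show_progress"] = False
alpha = np.array(solvers.qp(P, q, G, h, A, b_eq)["x"]).ravel()

w = (alpha * t) @ X                                  # w = sum_i alpha_i t_i x_i
sv = alpha > 1e-6                                    # support vectors: nonzero alphas
b = np.mean(t[sv] - X[sv] @ w)                       # from t_i (w . x_i + b) = 1
print("support vectors:", sv.sum(), " w =", w, " b =", b)
```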


Quadratic Programming

minimize   f(x) = ½ xᵀQx + cᵀx

subject to (one or more)   Ax ≤ k,   Bx = l

•  If Q is positive semi-definite, then f(x) is convex.
•  If f(x) is convex, then there is a single global optimum.

The SVM dual has exactly this form:

W(α) = Σᵢ αᵢ − ½ Σᵢ,ⱼ αᵢ αⱼ tᵢ tⱼ (xᵢ · xⱼ)

where αᵢ ≥ 0  and  Σᵢ αᵢ tᵢ = 0


Support Vector Expansion

§ When αᵢ is non-zero, xᵢ is a support vector
§ When αᵢ is zero, xᵢ is not a support vector

w = Σᵢ αᵢ tᵢ xᵢ

D(x) = sign(wᵀx + b)
     = sign([Σᵢ αᵢ tᵢ xᵢ]ᵀ x + b)
     = sign(Σᵢ αᵢ tᵢ (xᵢᵀx) + b)

New decision function: independent of the dimension of x!


§ In constrained optimization, at the optimal solution:

§  Constraint * Lagrange Multiplier = 0

Karush-Kuhn-Tucker Conditions

αᵢ (1 − tᵢ(wᵀxᵢ + b)) = 0

if αᵢ ≠ 0   →   tᵢ(wᵀxᵢ + b) = 1

Only points on the margin hyperplanes (the support vectors) contribute to the solution!


•  General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable

Nonlinear SVMs

Φ: x → φ(x)


•  Linearly separable dataset in 1D:

•  Non-separable dataset in 1D:

•  We can map the data to a higher-dimensional space:

[Figures: points on the x axis; after mapping x → (x, x²) the non-separable set becomes separable]

Nonlinear SVMs

Slide credit: Andrew Moore

•  General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable.

•  The kernel trick: instead of explicitly computing the lifting transformation φ(x), define a kernel function K such that

K(x , y) = φ(x) · φ(y) §  (to be valid, the kernel function must satisfy

Mercer’s condition)

The kernel trick
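A small numerical check of this identity for the degree-2 polynomial kernel (c = 1); the explicit lifting φ written out below is one standard choice and is shown only to make the point, since the kernel trick exists precisely so that φ never has to be built:

```python
# Check K(x, y) = phi(x) . phi(y) numerically for the degree-2 polynomial kernel in 2D.
import numpy as np

def K_poly(x, y, c=1.0, d=2):
    return (c + x @ y) ** d

def phi(x):                        # explicit feature map for (1 + x . y)^2, x in R^2
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, x2 ** 2, s * x1 * x2])

x, y = np.array([1.0, 2.0]), np.array([0.5, -1.0])
print(K_poly(x, y), phi(x) @ phi(y))   # the two numbers agree
```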


•  Linear SVM decision function:

The kernel trick

C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

w · x + b = Σᵢ αᵢ yᵢ (xᵢ · x) + b

(xᵢ: support vector;  αᵢ yᵢ: learned weight)

The kernel trick

•  Linear SVM decision function:

w · x + b = Σᵢ αᵢ yᵢ (xᵢ · x) + b

•  Kernel SVM decision function:

Σᵢ αᵢ yᵢ φ(xᵢ) · φ(x) + b = Σᵢ αᵢ yᵢ K(xᵢ, x) + b

•  This gives a nonlinear decision boundary in the original feature space

C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

Polynomial kernel:   K(x, y) = (c + x · y)ᵈ

•  Also known as the radial basis function (RBF) kernel:

Gaussian kernel

K(x, y) = exp(−‖x − y‖² / (2σ²))

[Figure: K(x, y) plotted against ‖x − y‖]
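A sketch of the effect of the RBF kernel using scikit-learn's SVC on a toy dataset that no linear classifier can separate; note that scikit-learn parameterizes the kernel as exp(−γ‖x − y‖²), so γ plays the role of 1/(2σ²) in the formula above:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: not linearly separable in the original 2D space.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.1, random_state=0)

linear_svm = SVC(kernel="linear").fit(X, y)
rbf_svm = SVC(kernel="rbf", gamma=2.0).fit(X, y)

print("linear kernel accuracy:", linear_svm.score(X, y))   # poor on this data
print("RBF kernel accuracy:   ", rbf_svm.score(X, y))       # much higher
print("support vectors per class:", rbf_svm.n_support_)
```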

Gaussian kernel

[Figure: support vectors (SVs)]

Kernels for histograms

•  Histogram intersection:   K(h₁, h₂) = Σᵢ min(h₁(i), h₂(i))

•  Square root (Bhattacharyya kernel):   K(h₁, h₂) = Σᵢ √(h₁(i) h₂(i))

(sums over i = 1, …, N)
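A sketch of plugging such a histogram kernel into an SVM via scikit-learn's precomputed-kernel option; the toy histograms below stand in for the bag-of-features vectors from earlier in the lecture:

```python
import numpy as np
from sklearn.svm import SVC

def intersection_kernel(H1, H2):
    # K(h1, h2) = sum_i min(h1(i), h2(i)), computed for all pairs of rows
    return np.array([[np.minimum(h1, h2).sum() for h2 in H2] for h1 in H1])

rng = np.random.default_rng(0)
H = rng.random((40, 50))
y = np.array([0] * 20 + [1] * 20)
H[y == 1, :10] += 0.05                    # make class 1 slightly distinguishable
H /= H.sum(axis=1, keepdims=True)         # normalize each histogram

clf = SVC(kernel="precomputed").fit(intersection_kernel(H, H), y)
print("train accuracy:", clf.score(intersection_kernel(H, H), y))
```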


•  Pros
   §  Kernel-based framework is very powerful, flexible
   §  Training is convex optimization, globally optimal solution can be found
   §  Amenable to theoretical analysis
   §  SVMs work very well in practice, even with very small training sample sizes

•  Cons
   §  No "direct" multi-class SVM, must combine two-class SVMs (e.g., with one-vs-others)
   §  Computation, memory (esp. for nonlinear SVMs)

SVMs: Pros and cons


•  Generalization refers to the ability to correctly classify never before seen examples

•  Can be controlled by turning “knobs” that affect the complexity of the model

Generalization

Training set (labels known)   ·   Test set (labels unknown)

•  Training error: how does the model perform on the data on which it was trained?

•  Test error: how does it perform on never before seen data?

Diagnosing generalization ability

Source: D. Hoiem

[Figure: training error and test error vs. model complexity, marking the underfitting and overfitting regimes]

•  Underfitting: training and test error are both high
   §  Model does an equally poor job on the training and the test set
   §  Either the training procedure is ineffective or the model is too "simple" to represent the data

•  Overfitting: training error is low but test error is high
   §  Model fits irrelevant characteristics (noise) in the training data
   §  Model is too complex or the amount of training data is insufficient

Underfitting and overfitting

[Figure: underfitting, good generalization, overfitting]


Effect of training set size

[Figure: test error vs. model complexity, for many vs. few training examples]

Source: D. Hoiem


•  Split the data into training, validation, and test subsets •  Use the training set to optimize model parameters •  Use the validation set to choose the best model •  Use the test set only to evaluate performance

Validation

[Figure: training, validation, and test set loss vs. model complexity, with the stopping point marked]
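A minimal sketch of this three-way protocol with scikit-learn; the dataset, the split ratios, and tuning only the SVM's C parameter are assumptions made for illustration:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

best_C, best_acc = None, 0.0
for C in [0.01, 0.1, 1, 10, 100]:                   # model "knob" tuned on the validation set
    acc = SVC(kernel="linear", C=C).fit(X_train, y_train).score(X_val, y_val)
    if acc > best_acc:
        best_C, best_acc = C, acc

final = SVC(kernel="linear", C=best_C).fit(X_train, y_train)
print("chosen C:", best_C, " test accuracy:", final.score(X_test, y_test))  # test set used only once
```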


Histogram of Oriented Gradients (HoG)

[Dalal and Triggs, CVPR 2005]

[Figure: "HoGify": HoG visualizations computed with 10x10 and 20x20 cells]
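A sketch of computing a HoG descriptor with scikit-image (assumed installed); the cell and block sizes are common defaults in the spirit of Dalal and Triggs, not necessarily the exact settings behind the figure:

```python
from skimage import data, color
from skimage.feature import hog

image = color.rgb2gray(data.astronaut())       # any grayscale image works
features, hog_image = hog(
    image,
    orientations=9,                 # number of gradient orientation bins
    pixels_per_cell=(8, 8),         # cell size
    cells_per_block=(2, 2),         # block size for local normalization
    visualize=True,                 # also return an image of the oriented histograms
)
print("HoG feature vector length:", features.shape[0])
```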


§ Many examples on the Internet:

§ https://www.youtube.com/watch?v=b_DSTuxdGDw

Histogram of Oriented Gradients (HoG)

[Dalal and Triggs, CVPR 2005]


§ Demo software: http://cs.stanford.edu/people/karpathy/svmjs/demo

§ A useful website: www.kernel-machines.org § Software:

§  LIBSVM: www.csie.ntu.edu.tw/~cjlin/libsvm/
§  SVMLight: svmlight.joachims.org/

Resources


Thank you!