The Viola/Jones Face Detector

34
The Viola/Jones Face Detector A “paradigmatic” method for real- time object detection Training is slow, but detection is very fast Key ideas Integral images for fast feature evaluation Boosting for feature selection Attentional cascade for fast rejection of non- face windows P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001. Slides by Robert Fergus

description

The Viola/Jones Face Detector. A “paradigmatic” method for real-time object detection Training is slow, but detection is very fast Key ideas Integral images for fast feature evaluation Boosting for feature selection Attentional cascade for fast rejection of non-face windows. - PowerPoint PPT Presentation

Transcript of The Viola/Jones Face Detector

Page 1: The Viola/Jones Face Detector

The Viola/Jones Face Detector

• A “paradigmatic” method for real-time object detection

• Training is slow, but detection is very fast• Key ideas

• Integral images for fast feature evaluation• Boosting for feature selection• Attentional cascade for fast rejection of non-face windows

P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001.

Slides by Robert Fergus

Page 2: The Viola/Jones Face Detector

Image Features

“Rectangle filters”

Value =

∑ (pixels in white area) – ∑ (pixels in black area)

Page 3: The Viola/Jones Face Detector

Example

Source

Result

Page 4: The Viola/Jones Face Detector

Fast computation with integral images

• The integral image computes a value at each pixel (x,y) that is the sum of the pixel values above and to the left of (x,y), inclusive

• This can quickly be computed in one pass through the image

(x,y)

Page 5: The Viola/Jones Face Detector

Computing sum within a rectangle

• Let A,B,C,D be the values of the integral image at the corners of a rectangle

• Then the sum of original image values within the rectangle can be computed as: sum = A – B – C + D

• Only 3 additions are required for any size of rectangle!• This is now used in many areas

of computer vision

D B

C A

Page 6: The Viola/Jones Face Detector

Example

-1 +1+2-1

-2+1

Integral Image

(x,y)(x,y)

Page 7: The Viola/Jones Face Detector

Feature selection

• For a 24x24 detection region, the number of possible rectangle features is ~180,000!

Page 8: The Viola/Jones Face Detector

Feature selection

• For a 24x24 detection region, the number of possible rectangle features is ~180,000!

• At test time, it is impractical to evaluate the entire feature set

• Can we create a good classifier using just a small subset of all possible features?

• How to select such a subset?

Page 9: The Viola/Jones Face Detector

Boosting

• Boosting is a classification scheme that works by combining weak learners into a more accurate ensemble classifier

• Weak learner: classifier with accuracy that need be only better than chance

• We can define weak learners based on rectangle features:

Y. Freund and R. Schapire, A short introduction to boosting, Journal of Japanese Society for Artificial Intelligence, 14(5):771-780, September, 1999.

Page 10: The Viola/Jones Face Detector

AdaBoost• Given a set of weak classifiers

– None much better than random• Iteratively combine classifiers

– Form a linear combination

– Training error converges to 0 quickly– Test error is related to training margin

}1,1{)( :originally xjh

t

t bxhxC )()(

Y. Freund and R. Schapire, A short introduction to boosting, Journal of Japanese Society for Artificial Intelligence, 14(5):771-780, September, 1999.

Page 11: The Viola/Jones Face Detector

60,000 features to choose from

Boosted Face Detection: Image Features

“Rectangle filters”

Similar to Haar wavelets Papageorgiou, et al.

otherwise

)( if )(

t

tittit

xfxh

t

t bxhxC )()(

Page 12: The Viola/Jones Face Detector

Boosting outline

• Initially, give equal weight to each training example

• Iterative training procedure• Find best weak learner for current weighted training set• Raise the weights of training examples misclassified by current

weak learner

• Compute final classifier as linear combination of all weak learners (weight of each learner is related to its accuracy)

Y. Freund and R. Schapire, A short introduction to boosting, Journal of Japanese Society for Artificial Intelligence, 14(5):771-780, September, 1999.

Page 13: The Viola/Jones Face Detector

Boosting

Weak Classifier 1

Page 14: The Viola/Jones Face Detector

Boosting

WeightsIncreased

Page 15: The Viola/Jones Face Detector

Boosting

Weak Classifier 2

Page 16: The Viola/Jones Face Detector

Boosting

WeightsIncreased

Page 17: The Viola/Jones Face Detector

Boosting

Weak Classifier 3

Page 18: The Viola/Jones Face Detector

Boosting

Final classifier is linear combination of weak classifiers

Page 19: The Viola/Jones Face Detector

• For each round of boosting:• Evaluate each rectangle filter on each example• Select best threshold for each filter • Select best filter/threshold combination• Reweight examples

• Computational complexity of learning: O(MNT)• M filters, N examples, T thresholds

Boosting for face detection

Page 20: The Viola/Jones Face Detector

First two features selected by boosting

Page 21: The Viola/Jones Face Detector

Cascading classifiers

• We start with simple classifiers which reject many of the negative sub-windows while detecting almost all positive sub-windows

• Positive results from the first classifier triggers the evaluation of a second (more complex) classifier, and so on

• A negative outcome at any point leads to the immediate rejection of the sub-window

FACEIMAGESUB-WINDOW

Classifier 1T

Classifier 3T

F

NON-FACE

TClassifier 2

T

F

NON-FACE

F

NON-FACE

Page 22: The Viola/Jones Face Detector

Cascading classifiers

• Chain classifiers that are progressively more complex and have lower false positive rates: vs false neg determined by

% False Pos

% D

etec

tion

0 50

50

100

FACEIMAGESUB-WINDOW

Classifier 1T

Classifier 3T

F

NON-FACE

TClassifier 2

T

F

NON-FACE

F

NON-FACE

Receiver operating characteristic

Page 23: The Viola/Jones Face Detector

Training the cascade

• Adjust weak learner threshold to minimize false negatives (as opposed to total classification error)

• Each classifier trained on false positives of previous stages• A single-feature classifier achieves 100% detection rate and

about 50% false positive rate• A five-feature classifier achieves 100% detection rate and

40% false positive rate (20% cumulative)• A 20-feature classifier achieve 100% detection rate with 10%

false positive rate (2% cumulative)

1 Feature 5 Features

F

50%20 Features

20% 2%FACE

NON-FACE

F

NON-FACE

F

NON-FACE

IMAGESUB-WINDOW

Page 24: The Viola/Jones Face Detector

The implemented system

• Training Data• 5000 faces

– All frontal, rescaled to 24x24 pixels

• 300 million non-faces– 9500 non-face images

• Faces are normalized– Scale, translation

• Many variations• Across individuals• Illumination• Pose

(Most slides from Paul Viola)

Page 25: The Viola/Jones Face Detector

System performance

• Training time: “weeks” on 466 MHz Sun workstation

• 38 layers, total of 6061 features• Average of 10 features evaluated per window

on test set• “On a 700 Mhz Pentium III processor, the

face detector can process a 384 by 288 pixel image in about .067 seconds” • 15 Hz• 15 times faster than previous detector of comparable

accuracy (Rowley et al., 1998)

Page 26: The Viola/Jones Face Detector

Output of Face Detector on Test Images

Page 27: The Viola/Jones Face Detector

Other detection tasks

Facial Feature Localization

Male vs. female

Profile Detection

Page 28: The Viola/Jones Face Detector

Profile Detection

Page 29: The Viola/Jones Face Detector

Profile Features

Page 30: The Viola/Jones Face Detector

Summary: Viola/Jones detector

• Rectangle features• Integral images for fast computation• Boosting for feature selection• Attentional cascade for fast rejection of

negative windows

Page 31: The Viola/Jones Face Detector

Overview

Face RecognitionBrief review of EigenfacesActive Appearance modelsFace DetectionViola & Jones real-time face detectorConvolutional Neural NetworksSpecific Object RecognitionSIFT based recognition

Page 32: The Viola/Jones Face Detector

• Application of Convolutional Neural Networks to Face Detection

Osadchy, Miller, LeCun. Face Detection and Pose Estimation, 2004

Page 33: The Viola/Jones Face Detector

• Non-linear dimensionality reduction

Osadchy, Miller, LeCun. Face Detection and Pose Estimation, 2004

Page 34: The Viola/Jones Face Detector

Osadchy, Miller, LeCun. Face Detection and Pose Estimation, 2004