Object Detection Using Semi-Naïve Bayes to Model Sparse
Structure
Henry Schneiderman
Robotics Institute, Carnegie Mellon University
Object Detection
• Find all instances of object X (e.g. X = human faces)
Examples of Detected Objects
Sparse Structure of Statistical Dependency
[Figures: each chosen variable/coefficient is strongly statistically dependent on only a small set of other variables/coefficients]
Detection using a Classifier
“Object is present” (at fixed size and alignment)
“Object is NOT present” (at fixed size and alignment)
Classifier
Proposed Model: Semi-Naïve Bayes
• Kononenko (1991), Pazzani (1996), Domingos and Pazzani (1997), Rokach and Maimon (2001)
H(x1, …, xr) = log [P1(S1|ω1) / P1(S1|ω2)] + log [P2(S2|ω1) / P2(S2|ω2)] + … + log [Pn(Sn|ω1) / Pn(Sn|ω2)]
S1, …, Sn are subsets of the input variables {x1, …, xr} (ω1 = object, ω2 = non-object)
e.g. S1 = (x21, x34, x65, x73, x123), S2 = (x3, x8, x17, x65, x73, x111)
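As a concrete (if toy) illustration, this decision rule can be sketched in Python. The subsets, the binary inputs, and the randomly generated conditional tables below are hypothetical stand-ins for the learned quantities:

```python
import numpy as np

# Hypothetical semi-naive Bayes classifier over three hand-chosen subsets
# of a 6-variable binary input. Each subset's class-conditional table
# P_i(S_i | omega) is invented here; in the talk they are learned by
# counting quantized feature values on training data.

subsets = [(0, 2), (1, 3, 4), (5,)]          # indices into x

def subset_value(x, idx):
    """Encode the binary sub-vector x[idx] as a single integer."""
    v = 0
    for i in idx:
        v = v * 2 + int(x[i])
    return v

rng = np.random.default_rng(0)
tables = []                                   # (P(f|omega_1), P(f|omega_2)) per subset
for idx in subsets:
    n = 2 ** len(idx)
    tables.append((rng.dirichlet(np.ones(n)), rng.dirichlet(np.ones(n))))

def H(x):
    """Semi-naive Bayes log-likelihood ratio: sum of per-subset log-ratios."""
    total = 0.0
    for idx, (p1, p2) in zip(subsets, tables):
        f = subset_value(x, idx)
        total += np.log(p1[f] / p2[f])
    return total

x = np.array([1, 0, 1, 1, 0, 1])
print(H(x))   # classify as object if H(x) exceeds a threshold
```

Each term depends only on its own subset, so the joint model is naive Bayes over groups of variables rather than over individual variables.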
Goal: Automatic subset grouping
S1 = (x21, x34, x65, x73, x123)
S2 = (x3, x8, x17, x65, x73, x111)
. . .
Sn = (x14, x16, x17, x23, x85, x101, x103, x107)
H(x1, …, xr) = log [P1(S1|ω1) / P1(S1|ω2)] + log [P2(S2|ω1) / P2(S2|ω2)] + … + log [Pn(Sn|ω1) / Pn(Sn|ω2)]
Approach: Selection by Competition
Input variables: x1 x2 x3 . . . xm
Generate q candidate subsets: S1 S2 . . . Sq
Train q log-likelihood functions: log [p1(S1|ω1) / p1(S1|ω2)], log [p2(S2|ω1) / p2(S2|ω2)], . . ., log [pq(Sq|ω1) / pq(Sq|ω2)]
Select a combination of n candidates (n << q): log [pj1(Sj1|ω1) / pj1(Sj1|ω2)], . . ., log [pjn(Sjn|ω1) / pjn(Sjn|ω2)]
H(x1, …, xr) = log [pj1(Sj1|ω1) / pj1(Sj1|ω2)] + log [pj2(Sj2|ω1) / pj2(Sj2|ω2)] + . . . + log [pjn(Sjn|ω1) / pjn(Sjn|ω2)]
Generation of Subsets
• “modeling error for assuming independence”
C1 = (1/q) ∫ … ∫ P(x1, …, xq|ω1) abs[ log (P(x1, …, xq|ω1) / P(x1, …, xq|ω2)) − log (P(x1|ω1)⋯P(xq|ω1) / (P(x1|ω2)⋯P(xq|ω2))) ] dx1 … dxq
q is size of the subset
Generation of Subsets
• Selection of variables - “discrimination power”
C2 = (1/q) ∫ … ∫ P(x1, …, xq|ω1) abs[ log (P(x1, …, xq|ω1) / P(x1, …, xq|ω2)) ] dx1 … dxq
q is size of the subset
Pair-Wise Measurement
• pair-wise measurements
C1(xj, xk) = (1/2) ∫∫ P(xj, xk|ω1) abs[ log (P(xj, xk|ω1) / P(xj, xk|ω2)) − log (P(xj|ω1)P(xk|ω1) / (P(xj|ω2)P(xk|ω2))) ] dxj dxk
C2(xj, xk) = (1/2) ∫∫ P(xj, xk|ω1) abs[ log (P(xj, xk|ω1) / P(xj, xk|ω2)) ] dxj dxk
Pair-affinity: C(xj, xk) = C1(xj, xk) C2(xj, xk)
Visualization of C(x, ·) (frontal faces)
[Figures: pair-affinity of each of three chosen variables x with all other variables]
Measure over a Subset
D(Si) = (1 / (N(N−1))) Σ over xj, xk ∈ Si, xj ≠ xk of C(xj, xk), where N = |Si|
Subset-affinity
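A discrete sketch of these measures, assuming binary variables whose 2×2 class-conditional joint histograms have already been estimated (the tables below are made up for illustration):

```python
import numpy as np

def pair_affinity(joint1, joint2):
    """C(xj,xk) = C1 * C2 for one pair, from 2x2 joint tables
    P(xj,xk|omega_1) and P(xj,xk|omega_2)."""
    m1_j, m1_k = joint1.sum(axis=1), joint1.sum(axis=0)   # marginals, class 1
    m2_j, m2_k = joint2.sum(axis=1), joint2.sum(axis=0)   # marginals, class 2
    log_joint = np.log(joint1 / joint2)                   # joint log-ratio
    log_indep = np.log(np.outer(m1_j, m1_k) / np.outer(m2_j, m2_k))
    c1 = 0.5 * np.sum(joint1 * np.abs(log_joint - log_indep))  # dependence
    c2 = 0.5 * np.sum(joint1 * np.abs(log_joint))              # discrimination
    return c1 * c2

def subset_affinity(C, subset):
    """D(S): mean pair-affinity over ordered pairs within the subset."""
    n = len(subset)
    total = sum(C[j][k] for j in subset for k in subset if j != k)
    return total / (n * (n - 1))

# Variables correlated under omega_1, independent and uniform under omega_2:
joint1 = np.array([[0.4, 0.1], [0.1, 0.4]])
joint2 = np.full((2, 2), 0.25)
print(pair_affinity(joint1, joint2))
```

A pair scores high only when it is both statistically dependent (C1) and discriminative (C2); if the two classes induce identical distributions, C2 vanishes and the affinity is zero.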
Generation of Candidate Subsets
Input variables: x1 x2 x3 . . . xm
Pair-affinities: C(x1, x2), C(x1, x3), . . ., C(xm−1, xm)
Heuristic search and selective evaluation of D(Si)
Candidate subsets: S1 S2 . . . Sp
Subset Size vs. Modeling Power
• Model complexity is limited by the number of training examples, etc.
• Examples of limited modeling power:
– 5 modes in a mixture model
– 7 projections onto principal components
Approach: Selection by Competition (recap)
Log-likelihood function = Table
Si = (xi1, xi2, . . ., xiq)
hi(Si) = log [Pi(fi|ω1) / Pi(fi|ω2)], fi ∈ {1, . . ., Ni}
fi is computed by vector quantization; hi is evaluated by table look-up
Sub-Classifier Training by Counting
[Figure: histograms of Pi(fi|ω1) and Pi(fi|ω2) over the quantized feature value fi]
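Counting reduces training to histogramming. A sketch, where the feature values, table size, and the Laplace-style smoothing constant are illustrative assumptions:

```python
import numpy as np

def train_table(f_obj, f_nonobj, n_bins, alpha=1.0):
    """Build a log-likelihood look-up table by counting quantized feature
    values over object and non-object training examples.
    alpha is an assumed smoothing constant to avoid empty bins."""
    c1 = np.bincount(f_obj, minlength=n_bins).astype(float) + alpha
    c2 = np.bincount(f_nonobj, minlength=n_bins).astype(float) + alpha
    p1 = c1 / c1.sum()      # P_i(f | omega_1)
    p2 = c2 / c2.sum()      # P_i(f | omega_2)
    return np.log(p1 / p2)  # h_i(f): one table look-up per evaluation

# Toy training data: feature value 3 is common among objects, 2 among non-objects.
table = train_table(np.array([3, 3, 7]), np.array([1, 2, 2]), n_bins=8)
print(table[3])
```

Entries are positive where a feature value is more common among objects than non-objects, so evaluation is a single indexed read per sub-classifier.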
Example of VQ
xi1 xi2 xi3 . . . xiq
Projection onto 3 principal components gives coefficients c1, c2, c3
Quantization of each coefficient to m levels gives z1, z2, z3
f = z1·m^0 + z2·m^1 + z3·m^2
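The VQ step can be sketched as follows; the projection matrix and the quantization boundaries below are random/hand-picked stand-ins for the principal components and bin edges that would be learned from training data:

```python
import numpy as np

m = 4                                        # quantization levels per coefficient
rng = np.random.default_rng(0)
components = rng.standard_normal((3, 8))     # stand-in for 3 PCA axes, 8-pixel subset

def vq_feature(s, edges):
    """Map a subset sample s to a code f = z1*m^0 + z2*m^1 + z3*m^2
    in {0, ..., m**3 - 1}."""
    c = components @ s                       # 3 projection coefficients
    z = np.searchsorted(edges, c)            # quantize: m-1 edges give levels 0..m-1
    return int(z[0] + z[1] * m + z[2] * m ** 2)

edges = np.array([-1.0, 0.0, 1.0])           # assumed bin boundaries (m - 1 of them)
f = vq_feature(rng.standard_normal(8), edges)
print(f)                                     # a table index in {0, ..., 63}
```

Packing the three quantized coefficients into one integer is what lets the log-likelihood function be a flat table indexed by f.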
Approach: Selection by Competition (recap)
Candidate log-likelihood functions: h1(S1) h2(S2) . . . hp(Sp)
Evaluate on training data: E1,1 E1,2 E2,1 E2,2 . . . Ep,1 Ep,2
Evaluate ROCs: ROC1 ROC2 . . . ROCp
Order the top Q log-likelihood functions: hj1(Sj1) hj2(Sj2) . . . hjQ(SjQ)
Form pQ pairs of log-likelihood functions: hj1(Sj1) + h1(S1) . . . hjQ(SjQ) + hp(Sp)
Sum evaluations: Ej1,1 + E1,1, Ej1,2 + E1,2, . . ., EjQ,1 + Ep,1, EjQ,2 + Ep,2
Evaluate ROCs: ROC1 . . . ROCpQ
Order the top Q pairs of log-likelihood functions: hk1,1(Sk1,1) + hk1,2(Sk1,2) . . . hkQ,1(SkQ,1) + hkQ,2(SkQ,2)
. . . Repeat for n iterations, giving H1(x1, x2, . . ., xr) . . . HQ(x1, x2, . . ., xr)
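This competition can be sketched as a beam search, scoring each combination by a single ROC summary (AUC) rather than a full ROC curve; the candidates, data, and beam width Q below are invented for the illustration:

```python
import numpy as np

def auc(scores, labels):
    """Rank-based ROC AUC for binary labels (1 = object); ties broken arbitrarily."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def select_by_competition(cand_scores, labels, n_terms, Q):
    """Beam search: grow combinations one sub-classifier at a time,
    keeping the Q combinations whose summed scores have the best AUC.
    cand_scores[i] holds candidate i's log-ratio on each labelled example."""
    beam = [((), np.zeros(cand_scores.shape[1]))]
    for _ in range(n_terms):
        pool = [(combo + (i,), total + cand_scores[i])
                for combo, total in beam
                for i in range(len(cand_scores)) if i not in combo]
        pool.sort(key=lambda ct: auc(ct[1], labels), reverse=True)
        beam = pool[:Q]                      # keep the top Q combinations
    return beam[0][0]                        # best combination found

scores = np.array([[2., 2., 0., 0.],         # candidate 0: perfectly informative
                   [0., 1., 1., 0.],         # candidates 1, 2: noise
                   [1., 0., 0., 1.]])
labels = np.array([1, 1, 0, 0])
print(select_by_competition(scores, labels, n_terms=2, Q=2))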
Cross-Validation Selects Classifier
Q candidates:
H1(x1, x2, . . ., xr) = hk1,1(Sk1,1) + hk1,2(Sk1,2) + . . . + hk1,n(Sk1,n)
. . .
HQ(x1, x2, . . ., xr) = hkQ,1(SkQ,1) + hkQ,2(SkQ,2) + . . . + hkQ,n(SkQ,n)
H*(x1, x2, . . ., xr)
Cross-validation
Example subsets learned for telephones
Evaluation of Classifier
“Object is present” (at fixed size and alignment)
“Object is NOT present” (at fixed size and alignment)
Classifier
1) Compute feature values: f1 = #5710, f2 = #3214, . . ., fn = #723
2) Look up log-likelihoods:
log [P1(#5710|ω1) / P1(#5710|ω2)] = 0.53
log [P2(#3214|ω1) / P2(#3214|ω2)] = 0.03
. . .
log [Pn(#723|ω1) / Pn(#723|ω2)] = 0.23
3) Make decision: 0.53 + 0.03 + . . . + 0.23 > or < λ
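Steps 2 and 3 amount to a few table look-ups and a thresholded sum. A sketch with toy tables (the values echo the 0.53, 0.03, 0.23 of the example; the tables and threshold are invented):

```python
import numpy as np

# Hypothetical two-entry look-up tables h_i(f) = log P_i(f|omega_1)/P_i(f|omega_2).
tables = [np.array([-0.20, 0.53]),
          np.array([0.03, -0.40]),
          np.array([-0.10, 0.23])]

def classify(features, threshold):
    """Sum one table look-up per sub-classifier; compare against threshold."""
    total = sum(t[f] for t, f in zip(tables, features))
    return bool(total > threshold), float(total)

decision, score = classify([1, 0, 1], threshold=0.5)
print(decision, score)   # 0.53 + 0.03 + 0.23 = 0.79 > 0.5
```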
Detection using a Classifier
“Object is present” (at fixed size and alignment)
“Object is NOT present” (at fixed size and alignment)
Classifier
View-based Classifiers
Face Classifier #1
Face Classifier #2
Face Classifier #3
Detection: Apply Classifier Exhaustively
Search in position
Search in scale
Decision can be made by partial evaluation
log [P1(#5710|ω1) / P1(#5710|ω2)] = 0.53
log [P2(#3214|ω1) / P2(#3214|ω2)] = 0.03
. . .
log [Pn(#723|ω1) / Pn(#723|ω2)] = 0.23
Running sum 0.53 + 0.03 + . . . + 0.23 compared against λ; evaluation can stop once the outcome is determined
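Partial evaluation can be sketched with precomputed upper bounds on what the remaining terms could still add; the scores, bounds, and threshold below are illustrative:

```python
def classify_partial(term_scores, upper_bounds, threshold):
    """Evaluate sub-classifiers in sequence; reject a candidate window as
    soon as the running sum plus the best the remaining terms could add
    falls below the threshold.
    upper_bounds[i] is an assumed precomputed max of the sum of terms i..end.
    Returns (decision, number of terms actually evaluated)."""
    total = 0.0
    for i, s in enumerate(term_scores):
        total += s
        remaining = upper_bounds[i + 1] if i + 1 < len(upper_bounds) else 0.0
        if total + remaining < threshold:
            return False, i + 1              # early reject: bound cannot be met
    return total > threshold, len(term_scores)

# A poor window dies after one term; a good window is fully evaluated.
print(classify_partial([-2.0, -1.5, 0.4], [2.0, 1.0, 0.4], 0.5))
print(classify_partial([0.5, 0.3, 0.2], [2.0, 1.0, 0.4], 0.5))
```

Since the vast majority of scanned windows contain no object, most of them are rejected after only a few look-ups.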
Detection Computational Strategy
Computational strategy changes with size of search space
Apply log [p1(S1|ω1) / p1(S1|ω2)] exhaustively to the scaled input image
Apply log [p2(S2|ω1) / p2(S2|ω2)] to the reduced search space
Apply log [p3(S3|ω1) / p3(S3|ω2)] to the further reduced search space
Candidate-Based Evaluation: for each of the N² candidates, compute M² feature values and look up M² log-likelihood values.
Feature-Based Evaluation: compute the N² + M² + 2MN feature values once over the image, then look up M² log-likelihood values for each of the N² candidates.
Cascade Implementation
Inputs: training images of the object, training images of non-objects, cross-validation images
1) Create candidate subsets
2) Train candidate log-likelihood functions
3) Select log-likelihood functions
4) Retrain the selected log-likelihood functions using AdaBoost
5) Determine the detection threshold
6) Automatically select non-object examples for the next stage from images that do not contain the object
7) Increment the stage and repeat
• AdaBoost with confidence-rated predictions [Schapire and Singer, 1999]
• Bootstrapping [Sung and Poggio, 1995]
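At detection time the cascade is just sequential rejection; a minimal sketch with hypothetical stage functions and thresholds:

```python
def cascade_detect(window, stages):
    """stages: list of (score_fn, threshold) pairs, ordered cheap to
    expensive; a window survives only if every stage accepts it."""
    for score_fn, threshold in stages:
        if score_fn(window) < threshold:
            return False                     # cheap early rejection
    return True

# Invented stages: a crude fast test, then a stricter one.
stages = [(lambda w: w[0], 0.0),
          (lambda w: w[0] + w[1], 1.0)]
print(cascade_detect((0.7, 0.6), stages))    # passes both stages
print(cascade_detect((-0.1, 5.0), stages))   # rejected by stage 1
```

Each stage's non-object training set is bootstrapped from the false positives that survived the previous stage, so later stages concentrate on the hard cases.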
Face, eye, ear detection
Frontal Face Detection
Recognition rate                                85.2%  89.7%  92.1%  93.7%  94.2%
False detections (this method)                     6     13     44     64     79
False detections [Viola and Jones, CVPR 2001]     --     31     50    167     --
• MIT-CMU Frontal Face Test Set [Sung and Poggio, 1995; Rowley, Baluja and Kanade, 1997]
– 180 ms for a 300×200 image
– 400 ms for a 300×500 image
• Top rank, Video TREC 2002 Face Detection
• Top rank, 2002 ARDA VACE Face Detection algorithm evaluation
AMD Athlon 1.2 GHz
Face & Eye Detection for Red-Eye Removal from Consumer Photos
CMU Face Detector
Eye Detection
• Experiments performed independently at NIST
• Sequestered data set: 29,627 mugshots
• Eyes correctly located (within a radius of 15 pixels): 98.2% (assuming one face per image)
• Thanks to Jonathon Phillips, Patrick Grother, and Sam Trahan for their assistance in running these experiments
Realistic Facial Manipulation:Earring Example
With Jason Pinto
Telephone Detection
Cart, pose 1
Cart, pose 2
Cart, pose 3
Door Handle Detection
Summary of Classifier Design
• Sparse structure of statistical dependency in many image classification problems
• Semi-naïve Bayes model
• Automatic learning of the structure of the semi-naïve Bayes classifier:
– Generation of many candidate subsets
– Competition among many log-likelihood functions to find the best combination
CMU on-line face detector:http://www.vasc.ri.cmu.edu/cgi-bin/demos/findface.cgi