Object Detection Using Semi-Naïve Bayes to Model Sparse
Structure
Henry Schneiderman
Robotics Institute, Carnegie Mellon University
Object Detection
• Find all instances of object X (e.g. X = human faces)
Examples of Detected Objects
Sparse Structure of Statistical Dependency
[Figures: each chosen variable/coefficient is strongly statistically dependent on only a small set of other variables/coefficients]
Detection using a Classifier
“Object is present” (at fixed size and alignment)
“Object is NOT present” (at fixed size and alignment)
Classifier
Proposed Model: Semi-Naïve Bayes
• Kononenko (1991), Pazzani (1996), Domingos and Pazzani (1997), Rokach and Maimon (2001)
H(x1, …, xr) = log [P1(S1|ω1) / P1(S1|ω2)] + log [P2(S2|ω1) / P2(S2|ω2)] + … + log [Pn(Sn|ω1) / Pn(Sn|ω2)]
S1, …, Sn are subsets of the input variables {x1, …, xr} (ω1 = object, ω2 = non-object)
e.g. S1 = (x21, x34, x65, x73, x123), S2 = (x3, x8, x17, x65, x73, x111)
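As a concrete (if toy) illustration, this decision rule can be sketched in Python. The subsets, the binary inputs, and the randomly generated conditional tables below are hypothetical stand-ins for the learned quantities:

```python
import numpy as np

# Hypothetical semi-naive Bayes classifier over three hand-chosen subsets
# of a 6-variable binary input. Each subset's class-conditional table
# P_i(S_i | omega) is invented here; in the talk they are learned by
# counting quantized feature values on training data.

subsets = [(0, 2), (1, 3, 4), (5,)]          # indices into x

def subset_value(x, idx):
    """Encode the binary sub-vector x[idx] as a single integer."""
    v = 0
    for i in idx:
        v = v * 2 + int(x[i])
    return v

rng = np.random.default_rng(0)
tables = []                                   # (P(f|omega_1), P(f|omega_2)) per subset
for idx in subsets:
    n = 2 ** len(idx)
    tables.append((rng.dirichlet(np.ones(n)), rng.dirichlet(np.ones(n))))

def H(x):
    """Semi-naive Bayes log-likelihood ratio: sum of per-subset log-ratios."""
    total = 0.0
    for idx, (p1, p2) in zip(subsets, tables):
        f = subset_value(x, idx)
        total += np.log(p1[f] / p2[f])
    return total

x = np.array([1, 0, 1, 1, 0, 1])
print(H(x))   # classify as object if H(x) exceeds a threshold
```

Each term depends only on its own subset, so the joint model is naive Bayes over groups of variables rather than over individual variables.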
Goal: Automatic subset grouping
S1 = (x21, x34, x65, x73, x123)
S2 = (x3, x8, x17, x65, x73, x111)
. . .
Sn = (x14, x16, x17, x23, x85, x101, x103, x107)
H(x1, …, xr) = log [P1(S1|ω1) / P1(S1|ω2)] + log [P2(S2|ω1) / P2(S2|ω2)] + … + log [Pn(Sn|ω1) / Pn(Sn|ω2)]
Approach: Selection by Competition
Input variables: x1 x2 x3 . . . xm
Generate q candidate subsets: S1 S2 . . . Sq
Train q log-likelihood functions: log [p1(S1|ω1) / p1(S1|ω2)], log [p2(S2|ω1) / p2(S2|ω2)], . . ., log [pq(Sq|ω1) / pq(Sq|ω2)]
Select a combination of n candidates (n << q): log [pj1(Sj1|ω1) / pj1(Sj1|ω2)], . . ., log [pjn(Sjn|ω1) / pjn(Sjn|ω2)]
H(x1, …, xr) = log [pj1(Sj1|ω1) / pj1(Sj1|ω2)] + log [pj2(Sj2|ω1) / pj2(Sj2|ω2)] + . . . + log [pjn(Sjn|ω1) / pjn(Sjn|ω2)]
Generation of Subsets
• “modeling error for assuming independence”
C1 = (1/q) ∫ … ∫ P(x1, …, xq|ω1) abs[ log (P(x1, …, xq|ω1) / P(x1, …, xq|ω2)) − log (P(x1|ω1)⋯P(xq|ω1) / (P(x1|ω2)⋯P(xq|ω2))) ] dx1 … dxq
q is size of the subset
Generation of Subsets
• Selection of variables - “discrimination power”
C2 = (1/q) ∫ … ∫ P(x1, …, xq|ω1) abs[ log (P(x1, …, xq|ω1) / P(x1, …, xq|ω2)) ] dx1 … dxq
q is size of the subset
Pair-Wise Measurement
• pair-wise measurements
C1(xj, xk) = (1/2) ∫∫ P(xj, xk|ω1) abs[ log (P(xj, xk|ω1) / P(xj, xk|ω2)) − log (P(xj|ω1)P(xk|ω1) / (P(xj|ω2)P(xk|ω2))) ] dxj dxk
C2(xj, xk) = (1/2) ∫∫ P(xj, xk|ω1) abs[ log (P(xj, xk|ω1) / P(xj, xk|ω2)) ] dxj dxk
Pair-affinity: C(xj, xk) = C1(xj, xk) C2(xj, xk)
Visualization of C(x, ·) (frontal faces)
[Figures: pair-affinity of each of three chosen variables x with all other variables]
Measure over a Subset
D(Si) = (1 / (N(N−1))) Σ over xj, xk ∈ Si, xj ≠ xk of C(xj, xk), where N = |Si|
Subset-affinity
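A discrete sketch of these measures, assuming binary variables whose 2×2 class-conditional joint histograms have already been estimated (the tables below are made up for illustration):

```python
import numpy as np

def pair_affinity(joint1, joint2):
    """C(xj,xk) = C1 * C2 for one pair, from 2x2 joint tables
    P(xj,xk|omega_1) and P(xj,xk|omega_2)."""
    m1_j, m1_k = joint1.sum(axis=1), joint1.sum(axis=0)   # marginals, class 1
    m2_j, m2_k = joint2.sum(axis=1), joint2.sum(axis=0)   # marginals, class 2
    log_joint = np.log(joint1 / joint2)                   # joint log-ratio
    log_indep = np.log(np.outer(m1_j, m1_k) / np.outer(m2_j, m2_k))
    c1 = 0.5 * np.sum(joint1 * np.abs(log_joint - log_indep))  # dependence
    c2 = 0.5 * np.sum(joint1 * np.abs(log_joint))              # discrimination
    return c1 * c2

def subset_affinity(C, subset):
    """D(S): mean pair-affinity over ordered pairs within the subset."""
    n = len(subset)
    total = sum(C[j][k] for j in subset for k in subset if j != k)
    return total / (n * (n - 1))

# Variables correlated under omega_1, independent and uniform under omega_2:
joint1 = np.array([[0.4, 0.1], [0.1, 0.4]])
joint2 = np.full((2, 2), 0.25)
print(pair_affinity(joint1, joint2))
```

A pair scores high only when it is both statistically dependent (C1) and discriminative (C2); if the two classes induce identical distributions, C2 vanishes and the affinity is zero.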
Generation of Candidate Subsets
Input variables: x1 x2 x3 . . . xm
Pair-affinities: C(x1, x2), C(x1, x3), . . ., C(xm−1, xm)
Heuristic search and selective evaluation of D(Si)
Candidate subsets: S1 S2 . . . Sp
Subset Size vs. Modeling Power
• Model complexity is limited by the number of training examples, etc.
• Examples of limited modeling power:
– 5 modes in a mixture model
– 7 projections onto principal components
Approach: Selection by Competition (recap)
Log-likelihood function = Table
Si = (xi1, xi2, . . ., xiq)
hi(Si) = log [Pi(fi|ω1) / Pi(fi|ω2)], fi ∈ {1, . . ., Ni}
fi is computed by vector quantization; hi is evaluated by table look-up
Sub-Classifier Training by Counting
[Figure: histograms of Pi(fi|ω1) and Pi(fi|ω2) over the quantized feature value fi]
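Counting reduces training to histogramming. A sketch, where the feature values, table size, and the Laplace-style smoothing constant are illustrative assumptions:

```python
import numpy as np

def train_table(f_obj, f_nonobj, n_bins, alpha=1.0):
    """Build a log-likelihood look-up table by counting quantized feature
    values over object and non-object training examples.
    alpha is an assumed smoothing constant to avoid empty bins."""
    c1 = np.bincount(f_obj, minlength=n_bins).astype(float) + alpha
    c2 = np.bincount(f_nonobj, minlength=n_bins).astype(float) + alpha
    p1 = c1 / c1.sum()      # P_i(f | omega_1)
    p2 = c2 / c2.sum()      # P_i(f | omega_2)
    return np.log(p1 / p2)  # h_i(f): one table look-up per evaluation

# Toy training data: feature value 3 is common among objects, 2 among non-objects.
table = train_table(np.array([3, 3, 7]), np.array([1, 2, 2]), n_bins=8)
print(table[3])
```

Entries are positive where a feature value is more common among objects than non-objects, so evaluation is a single indexed read per sub-classifier.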
Example of VQ
xi1 xi2 xi3 . . . xiq
Projection onto 3 principal components gives coefficients c1, c2, c3
Quantization of each coefficient to m levels gives z1, z2, z3
f = z1·m^0 + z2·m^1 + z3·m^2
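The VQ step can be sketched as follows; the projection matrix and the quantization boundaries below are random/hand-picked stand-ins for the principal components and bin edges that would be learned from training data:

```python
import numpy as np

m = 4                                        # quantization levels per coefficient
rng = np.random.default_rng(0)
components = rng.standard_normal((3, 8))     # stand-in for 3 PCA axes, 8-pixel subset

def vq_feature(s, edges):
    """Map a subset sample s to a code f = z1*m^0 + z2*m^1 + z3*m^2
    in {0, ..., m**3 - 1}."""
    c = components @ s                       # 3 projection coefficients
    z = np.searchsorted(edges, c)            # quantize: m-1 edges give levels 0..m-1
    return int(z[0] + z[1] * m + z[2] * m ** 2)

edges = np.array([-1.0, 0.0, 1.0])           # assumed bin boundaries (m - 1 of them)
f = vq_feature(rng.standard_normal(8), edges)
print(f)                                     # a table index in {0, ..., 63}
```

Packing the three quantized coefficients into one integer is what lets the log-likelihood function be a flat table indexed by f.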
Approach: Selection by Competition (recap)
Candidate log-likelihood functions: h1(S1) h2(S2) . . . hp(Sp)
Evaluate on training data: E1,1 E1,2 E2,1 E2,2 . . . Ep,1 Ep,2
Evaluate ROCs: ROC1 ROC2 . . . ROCp
Order the top Q log-likelihood functions: hj1(Sj1) hj2(Sj2) . . . hjQ(SjQ)
Form pQ pairs of log-likelihood functions: hj1(Sj1) + h1(S1) . . . hjQ(SjQ) + hp(Sp)
Sum evaluations: Ej1,1 + E1,1, Ej1,2 + E1,2, . . ., EjQ,1 + Ep,1, EjQ,2 + Ep,2
Evaluate ROCs: ROC1 . . . ROCpQ
Order the top Q pairs of log-likelihood functions: hk1,1(Sk1,1) + hk1,2(Sk1,2) . . . hkQ,1(SkQ,1) + hkQ,2(SkQ,2)
. . . Repeat for n iterations, giving H1(x1, x2, . . ., xr) . . . HQ(x1, x2, . . ., xr)
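This competition can be sketched as a beam search, scoring each combination by a single ROC summary (AUC) rather than a full ROC curve; the candidates, data, and beam width Q below are invented for the illustration:

```python
import numpy as np

def auc(scores, labels):
    """Rank-based ROC AUC for binary labels (1 = object); ties broken arbitrarily."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def select_by_competition(cand_scores, labels, n_terms, Q):
    """Beam search: grow combinations one sub-classifier at a time,
    keeping the Q combinations whose summed scores have the best AUC.
    cand_scores[i] holds candidate i's log-ratio on each labelled example."""
    beam = [((), np.zeros(cand_scores.shape[1]))]
    for _ in range(n_terms):
        pool = [(combo + (i,), total + cand_scores[i])
                for combo, total in beam
                for i in range(len(cand_scores)) if i not in combo]
        pool.sort(key=lambda ct: auc(ct[1], labels), reverse=True)
        beam = pool[:Q]                      # keep the top Q combinations
    return beam[0][0]                        # best combination found

scores = np.array([[2., 2., 0., 0.],         # candidate 0: perfectly informative
                   [0., 1., 1., 0.],         # candidates 1, 2: noise
                   [1., 0., 0., 1.]])
labels = np.array([1, 1, 0, 0])
print(select_by_competition(scores, labels, n_terms=2, Q=2))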
Cross-Validation Selects Classifier
Q candidates:
H1(x1, x2, . . ., xr) = hk1,1(Sk1,1) + hk1,2(Sk1,2) + . . . + hk1,n(Sk1,n)
. . .
HQ(x1, x2, . . ., xr) = hkQ,1(SkQ,1) + hkQ,2(SkQ,2) + . . . + hkQ,n(SkQ,n)
H*(x1, x2, . . ., xr)
Cross-validation
Example subsets learned for telephones
Evaluation of Classifier
“Object is present” (at fixed size and alignment)
“Object is NOT present” (at fixed size and alignment)
Classifier
1) Compute feature values: f1 = #5710, f2 = #3214, . . ., fn = #723
2) Look up log-likelihoods:
log [P1(#5710|ω1) / P1(#5710|ω2)] = 0.53
log [P2(#3214|ω1) / P2(#3214|ω2)] = 0.03
. . .
log [Pn(#723|ω1) / Pn(#723|ω2)] = 0.23
3) Make decision: 0.53 + 0.03 + . . . + 0.23 > or < λ
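Steps 2 and 3 amount to a few table look-ups and a thresholded sum. A sketch with toy tables (the values echo the 0.53, 0.03, 0.23 of the example; the tables and threshold are invented):

```python
import numpy as np

# Hypothetical two-entry look-up tables h_i(f) = log P_i(f|omega_1)/P_i(f|omega_2).
tables = [np.array([-0.20, 0.53]),
          np.array([0.03, -0.40]),
          np.array([-0.10, 0.23])]

def classify(features, threshold):
    """Sum one table look-up per sub-classifier; compare against threshold."""
    total = sum(t[f] for t, f in zip(tables, features))
    return bool(total > threshold), float(total)

decision, score = classify([1, 0, 1], threshold=0.5)
print(decision, score)   # 0.53 + 0.03 + 0.23 = 0.79 > 0.5
```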
Detection using a Classifier
“Object is present” (at fixed size and alignment)
“Object is NOT present” (at fixed size and alignment)
Classifier
View-based Classifiers
Face Classifier #1
Face Classifier #2
Face Classifier #3
Detection: Apply Classifier Exhaustively
Search in position
Search in scale
Decision can be made by partial evaluation
log [P1(#5710|ω1) / P1(#5710|ω2)] = 0.53
log [P2(#3214|ω1) / P2(#3214|ω2)] = 0.03
. . .
log [Pn(#723|ω1) / Pn(#723|ω2)] = 0.23
Running sum 0.53 + 0.03 + . . . + 0.23 compared against λ; evaluation can stop once the outcome is determined
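Partial evaluation can be sketched with precomputed upper bounds on what the remaining terms could still add; the scores, bounds, and threshold below are illustrative:

```python
def classify_partial(term_scores, upper_bounds, threshold):
    """Evaluate sub-classifiers in sequence; reject a candidate window as
    soon as the running sum plus the best the remaining terms could add
    falls below the threshold.
    upper_bounds[i] is an assumed precomputed max of the sum of terms i..end.
    Returns (decision, number of terms actually evaluated)."""
    total = 0.0
    for i, s in enumerate(term_scores):
        total += s
        remaining = upper_bounds[i + 1] if i + 1 < len(upper_bounds) else 0.0
        if total + remaining < threshold:
            return False, i + 1              # early reject: bound cannot be met
    return total > threshold, len(term_scores)

# A poor window dies after one term; a good window is fully evaluated.
print(classify_partial([-2.0, -1.5, 0.4], [2.0, 1.0, 0.4], 0.5))
print(classify_partial([0.5, 0.3, 0.2], [2.0, 1.0, 0.4], 0.5))
```

Since the vast majority of scanned windows contain no object, most of them are rejected after only a few look-ups.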
Detection Computational Strategy
Computational strategy changes with size of search space
Apply log [p1(S1|ω1) / p1(S1|ω2)] exhaustively to the scaled input image
Apply log [p2(S2|ω1) / p2(S2|ω2)] to the reduced search space
Apply log [p3(S3|ω1) / p3(S3|ω2)] to the further reduced search space
Candidate-Based Evaluation: for each of the N² candidates, compute M² feature values and look up M² log-likelihood values.
Feature-Based Evaluation: compute the N² + M² + 2MN feature values once over the image, then look up M² log-likelihood values for each of the N² candidates.
Cascade Implementation
Inputs: training images of the object, training images of non-objects, cross-validation images
1) Create candidate subsets
2) Train candidate log-likelihood functions
3) Select log-likelihood functions
4) Retrain the selected log-likelihood functions using AdaBoost
5) Determine the detection threshold
6) Automatically select non-object examples for the next stage from images that do not contain the object
7) Increment the stage and repeat
• AdaBoost with confidence-rated predictions [Schapire and Singer, 1999]
• Bootstrapping [Sung and Poggio, 1995]
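At detection time the cascade is just sequential rejection; a minimal sketch with hypothetical stage functions and thresholds:

```python
def cascade_detect(window, stages):
    """stages: list of (score_fn, threshold) pairs, ordered cheap to
    expensive; a window survives only if every stage accepts it."""
    for score_fn, threshold in stages:
        if score_fn(window) < threshold:
            return False                     # cheap early rejection
    return True

# Invented stages: a crude fast test, then a stricter one.
stages = [(lambda w: w[0], 0.0),
          (lambda w: w[0] + w[1], 1.0)]
print(cascade_detect((0.7, 0.6), stages))    # passes both stages
print(cascade_detect((-0.1, 5.0), stages))   # rejected by stage 1
```

Each stage's non-object training set is bootstrapped from the false positives that survived the previous stage, so later stages concentrate on the hard cases.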
Face, eye, ear detection
Frontal Face Detection
Recognition rate                                85.2%  89.7%  92.1%  93.7%  94.2%
False detections (this method)                     6     13     44     64     79
False detections [Viola and Jones, CVPR 2001]     --     31     50    167     --
• MIT-CMU Frontal Face Test Set [Sung and Poggio, 1995; Rowley, Baluja and Kanade, 1997]
– 180 ms for a 300×200 image
– 400 ms for a 300×500 image
• Top rank, Video TREC 2002 Face Detection
• Top rank, 2002 ARDA VACE Face Detection algorithm evaluation
AMD Athlon 1.2 GHz
Face & Eye Detection for Red-Eye Removal from Consumer Photos
CMU Face Detector
Eye Detection
• Experiments performed independently at NIST
• Sequestered data set: 29,627 mugshots
• Eyes correctly located (within a radius of 15 pixels): 98.2% (assuming one face per image)
• Thanks to Jonathon Phillips, Patrick Grother, and Sam Trahan for their assistance in running these experiments
Realistic Facial Manipulation:Earring Example
With Jason Pinto
Telephone Detection
Cart, pose 1
Cart, pose 2
Cart, pose 3
Door Handle Detection
Summary of Classifier Design
• Sparse structure of statistical dependency in many image classification problems
• Semi-naïve Bayes model
• Automatic learning of the structure of the semi-naïve Bayes classifier:
– Generation of many candidate subsets
– Competition among many log-likelihood functions to find the best combination
CMU on-line face detector:http://www.vasc.ri.cmu.edu/cgi-bin/demos/findface.cgi