HYBRID-BOOST LEARNING FOR MULTI-POSE FACE DETECTION AND FACIAL
EXPRESSION RECOGNITION
Hsiuao-Ying Chen, Chung-Lin Huang, Chih-Ming Fu
Pattern Recognition, Volume 41, Issue 3, March 2008, Pages 1173-1185
Student: Lourdes Ramírez Cerna
Slide 2
INTRODUCTION As we know, face detection has many applications, such
as surveillance and human-computer interfaces. Nevertheless, most
published methods have restrictions: they assume no pose variation,
or they cannot cope with noisy or defocused images. This paper
proposes a hybrid-boost learning algorithm that selects Gabor
features and Haar-like features to provide the most discriminating
information for the strong classifier in the final stage. Finally,
the authors compare their experimental results with other methods.
Slide 3
SEGMENTATION OF POTENTIAL FACE REGIONS Segmentation of potential
face regions consists of skin-color detection and segmentation.
For skin-color detection, they analyze the color of the pixels in
RGB color space to decrease the effect of illumination changes, and
then classify the pixels as face color or non-face color based on
their hue component only. The classification uses a Bayesian
decision rule, which can be expressed as:
if p(c(i)|face) / p(c(i)|non-face) > t, then pixel i
(with c(i) = (r(i), g(i), b(i))) belongs to a face region; otherwise
it is inside a non-face region, where t = p(non-face) / p(face).
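The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name, the use of quantized RGB histograms as estimates of p(c|face) and p(c|non-face), and the `bins` parameter are all assumptions made for the example.

```python
import numpy as np

def classify_skin(pixels, face_hist, nonface_hist, p_face, bins=32):
    """Label each RGB pixel as face (True) or non-face (False).

    pixels       : (N, 3) uint8 array of (r, g, b) values
    face_hist    : (bins, bins, bins) array approximating p(c | face)
    nonface_hist : (bins, bins, bins) array approximating p(c | non-face)
    p_face       : prior P(face), strictly between 0 and 1
    All of these inputs are hypothetical; the source does not say how
    the likelihoods are estimated.
    """
    # Quantize each 8-bit channel into `bins` histogram cells.
    idx = (pixels.astype(np.int64) * bins) // 256
    r, g, b = idx[:, 0], idx[:, 1], idx[:, 2]
    lik_face = face_hist[r, g, b]
    lik_nonface = nonface_hist[r, g, b]
    # Threshold from the priors: t = P(non-face) / P(face).
    t = (1.0 - p_face) / p_face
    # Equivalent to lik_face / lik_nonface > t, without dividing by zero.
    return lik_face > t * lik_nonface
```

Writing the test as a product rather than a ratio keeps the rule well defined for colors that never occur in the non-face histogram.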
Slide 4
SEGMENTATION OF POTENTIAL FACE REGIONS
Slide 5
HYBRID-BOOST LEARNING FOR FACE DETECTION Gabor features (global
features): obtained from the normalized 24 × 24 image blocks; they
carry more detailed information about frequency and orientation.
A Gabor filter is the product of a 2D Gaussian and a complex
exponential function.
Haar-like features (local features): acquired in blocks of various
sizes (varying width and height).
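To make the "2D Gaussian times complex exponential" description concrete, here is a minimal sketch of a Gabor kernel on a 24 × 24 block. The parameter names (`wavelength`, `theta`, `sigma`) and their default values are assumptions chosen for illustration; the slides do not give the exact parameterization used in the paper.

```python
import numpy as np

def gabor_kernel(size=24, wavelength=8.0, theta=0.0, sigma=4.0):
    """Return a complex (size, size) Gabor kernel.

    wavelength, theta and sigma are illustrative parameters, not values
    taken from the paper.
    """
    half = (size - 1) / 2.0
    # Pixel coordinates centered on the block: y indexes rows, x columns.
    y, x = np.mgrid[0:size, 0:size] - half
    # Coordinate along the direction of the sinusoidal carrier.
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    gaussian = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(1j * 2.0 * np.pi * x_rot / wavelength)
    return gaussian * carrier

def gabor_response(block, kernel):
    """Magnitude of the inner product between an image block and a kernel."""
    return np.abs(np.sum(block * kernel))
```

A block whose stripes match the kernel's wavelength and orientation gives a much larger response than the same stripes rotated by 90 degrees, which is what makes the feature sensitive to frequency and orientation.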
Slide 6
HYBRID-BOOST LEARNING FOR FACE DETECTION Hybrid features: For
Gabor features: position (x, y), frequency, and orientation. For
Haar-like features: height (H) and width (W). Finally, the feature
is defined as x = (t, x, y, p1, p2), where t = 1 indicates a Gabor
feature and t = 2 to 8 specifies a Haar-like feature.
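The feature tuple x = (t, x, y, p1, p2) maps naturally onto a small record type. The sketch below is an assumption-laden illustration: it reads the type index as t = 1 for Gabor and t = 2 through 8 for seven Haar-like types, and it assumes p1 and p2 each take 8 values. Under those assumptions the pool size matches the 8 × 24 × 24 × 8 × 8 = 294,912 figure quoted in the weak-classifier discussion. The class and function names are hypothetical.

```python
from itertools import product
from typing import Iterator, NamedTuple

class HybridFeature(NamedTuple):
    t: int   # feature type: 1 = Gabor, 2..8 = Haar-like (assumed reading)
    x: int   # column of the feature position in the 24 x 24 block
    y: int   # row of the feature position in the 24 x 24 block
    p1: int  # Gabor: frequency index; Haar-like: height index (assumed)
    p2: int  # Gabor: orientation index; Haar-like: width index (assumed)

    @property
    def is_gabor(self) -> bool:
        return self.t == 1

def iter_pool(n_types=8, block=24, n_p1=8, n_p2=8) -> Iterator[HybridFeature]:
    """Lazily enumerate every candidate feature in the hybrid pool."""
    ranges = (range(1, n_types + 1), range(block), range(block),
              range(n_p1), range(n_p2))
    for t, x, y, p1, p2 in product(*ranges):
        yield HybridFeature(t, x, y, p1, p2)
```

Using a generator avoids materializing the whole pool when only a count or a single pass over the candidates is needed.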
Slide 7
HYBRID-BOOST LEARNING FOR FACE DETECTION Soft-decision function
for weak classifiers: We create a pool of 2D soft-decision functions
for the weak classifiers of class w_l before the hybrid-boost
learning. The soft-decision function for a weak classifier is the
posterior given by Bayes' rule:
P(l | b(f(x))) = P(b(f(x)) | l) P(l) / P(b(f(x))),
where P(b(f(x))) is the histogram of the response of feature
x over all training data, P(l) is the prior of class l, and
P(b(f(x)) | l) is the conditional probability. Since there are many
possible features and posterior functions (i.e., 8 × 24 × 24 × 8 × 8
= 294,912), the hybrid-boost learning algorithm selects only the most
discriminant features from these hybrid features.
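A histogram-based posterior of this kind can be sketched as follows. This is only an illustration under stated assumptions: the binning function b(.) is taken to be uniform binning of the feature response, the number of bins is arbitrary, and the function names are hypothetical. The slides do not specify any of these details.

```python
import numpy as np

def fit_soft_decision(responses, labels, n_classes, n_bins=16):
    """Estimate P(l | b) for one feature from its training responses.

    responses : (N,) float array, the feature response f(x) per sample
    labels    : (N,) int array of class indices in [0, n_classes)
    Returns the bin edges and an (n_classes, n_bins) posterior table.
    """
    edges = np.linspace(responses.min(), responses.max(), n_bins + 1)
    # b(f(x)): index of the histogram bin each response falls into.
    bins = np.digitize(responses, edges[1:-1])
    prior = np.bincount(labels, minlength=n_classes) / len(labels)   # P(l)
    evidence = np.bincount(bins, minlength=n_bins) / len(bins)       # P(b)
    seen = evidence > 0
    posterior = np.zeros((n_classes, n_bins))
    for l in range(n_classes):
        mask = labels == l
        if mask.any():
            # P(b | l): histogram of responses restricted to class l.
            lik = np.bincount(bins[mask], minlength=n_bins) / mask.sum()
            posterior[l, seen] = lik[seen] * prior[l] / evidence[seen]
    return edges, posterior

def soft_decision(response, edges, posterior, l):
    """Look up the soft decision P(l | b(response)) for class l."""
    b = int(np.digitize(response, edges[1:-1]))
    return posterior[l, b]
```

Because the weak classifier returns a probability instead of a hard vote, a boosting stage can weigh how confident each selected feature is, not only which class it favors.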
Slide 8
Slide 9
MULTI-POSE FACE DETECTION AND EXPRESSION RECOGNITION Fig. 6
illustrates the formulation for profile face detection and
expression recognition, where the face data are categorized into
five classes of different posed faces (i.e., pose angles: -90°,
-45°, 0°, +45°, and +90°) and six classes of different expressions
(i.e., happy, anger, sad, surprise, fear, and disgust).
Slide 10
MULTI-POSE FACE DETECTION AND EXPRESSION RECOGNITION The
multi-class hybrid-boost learning algorithm
Slide 11
MULTI-POSE FACE DETECTION AND EXPRESSION RECOGNITION
Different-posed face detection: The FERET face database consists of
14,051 eight-bit grayscale images of human faces with different
poses, ranging from frontal to left and right profiles.
Slide 12
MULTI-POSE FACE DETECTION AND EXPRESSION RECOGNITION
Different-posed face detection: However, the strong classifier
with the highest response does not necessarily indicate the correct
class. So, we define the following two decision rules for selecting
the correct strong classifier.
Slide 13
Slide 14
MULTI-POSE FACE DETECTION AND EXPRESSION RECOGNITION Facial
expression recognition: As with multi-pose face detection, they
apply hybrid-boost learning to facial expression recognition
and then compare the results with other facial expression systems.
The training data for the facial expression classifiers come from
the Cohn-Kanade Facial Expression Database. The input training
images are normalized to a standard size (24 × 24 pixels) and cover
seven different expressions (happy, anger, sad, surprise, fear,
disgust, and neutral).
Slide 15
EXPERIMENTAL RESULTS AND DISCUSSIONS Detection rate (DR) = 99.44%.
To decrease the false alarm rate, they add 2000 misclassified
samples and find that the false alarm rate is reduced to 1.72%, and
the average detection rate is 99.24%. Here, they divide the whole
training data, which consists of 1000 face images for each pose and
5000 non-face images, into four groups (m = 4). The average
detection rate is reduced to about 94%.
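The two metrics quoted here, and the split into m = 4 groups, can be sketched as below. The definitions used are assumptions, since the slides do not state them: detection rate is taken as the fraction of true faces that are detected, and false alarm rate as the fraction of non-faces reported as faces. How the four groups are then used for training and testing is not described in the slides, so the helper only performs the split; both function names are hypothetical.

```python
import numpy as np

def detection_and_false_alarm(is_face, predicted_face):
    """Return (detection rate, false alarm rate) under the assumed definitions."""
    is_face = np.asarray(is_face, dtype=bool)
    predicted_face = np.asarray(predicted_face, dtype=bool)
    # Fraction of true faces that were detected.
    detection_rate = (predicted_face & is_face).sum() / is_face.sum()
    # Fraction of non-faces that were wrongly reported as faces.
    false_alarm_rate = (predicted_face & ~is_face).sum() / (~is_face).sum()
    return detection_rate, false_alarm_rate

def split_into_groups(n_samples, m=4, seed=0):
    """Shuffle sample indices and split them into m nearly equal groups."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), m)
```

With 1000 face images for each of the five poses plus 5000 non-face images, `split_into_groups(10000, m=4)` yields four groups of 2500 indices each.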
Slide 16
EXPERIMENTAL RESULTS AND DISCUSSIONS DT = 93.1% when the
testing data is the same as the training data.