
Mood Detection

 June

Abstract

Recognizing human facial expression and emotion by computer is an interesting and challenging problem. In this paper we present a system for recognizing emotions through facial expressions displayed in live video streams and video sequences. The system is based on the Piecewise Bézier Volume Deformation tracker [18] and has been extended with a Haar face detector to initially locate the human face automatically. Our experiments with Naive Bayes and the Tree-Augmented-Naive Bayes (TAN) classifiers in person-dependent and person-independent tests on the Cohn-Kanade database [1] show that good classification results can be obtained for facial expression recognition.

1 Introduction

Recently there has been a growing interest in improving the interaction between humans and computers. It is argued that to achieve effective human-computer intelligent interaction, there is a need for the computer to interact naturally with the user, similar to the way humans interact. Humans interact with each other mostly through speech, but also through body gestures to emphasize a certain part of speech and/or display of emotions. Emotions are displayed by visual, vocal and other physiological means. There is more and more evidence appearing that shows that emotional skills are part of what is called 'intelligence' [8]. One of the most important ways for humans to display emotions is through facial expressions.


If we want to achieve more effective human-computer interaction, recognizing the emotional state of the person from his or her face could prove to be a valuable tool. This work describes a real-time automatic facial expression recognition system using video or webcam input. Our work focuses on initially detecting the human face in the video stream, on classifying the human emotion from facial features and on visualizing the recognition results.

2 Related work

Since the early 1970s there have been extensive studies of human facial expressions. Ekman et al [4] found evidence to support universality in facial expressions. These 'universal facial expressions' are those representing happiness, sadness, anger, fear, surprise and disgust. They studied expressions in many cultures, including preliterate ones, and found much commonality in the expression and recognition of emotions on the face. There are differences as well: Japanese, for example, will suppress their real facial expressions in the presence of the authorities. Babies appear to exhibit a wide range of facial expressions without being taught; this suggests that these expressions are innate [10].

Ekman developed a coding system for facial expressions where movements of the face are described by a set of action units (AUs). Each AU has some related muscular basis. Many researchers were inspired to use image and video processing to automatically track facial features and then use them to categorize the different expressions.

Pantic and Rothkrantz [13] provide an overview of recent research done in automatic facial expression recognition. Overall the different approaches are similar in that they track facial features using some model of image motion (optical flow, DCT coefficients, etc.). Based on the features a classifier is trained. The main difference lies in the set of features extracted from the video images and in the classifier used (often-used classifiers are based on Bayesian approaches or on hidden Markov models). The classifiers used can either be 'static' classifiers or dynamic ones. 'Static' classifiers use feature vectors related to a single frame to perform classification, while dynamic classifiers try to capture the temporal pattern in the sequence of feature vectors belonging to each frame.

The face tracking we use in our system is based on an incomplete version of the system used in [3]. This system in turn was based on the system developed by Tao and Huang [18], called the Piecewise Bézier Volume Deformation (PBVD) tracker.

This face tracker constructs an explicit wireframe model of the face. In the first frame of the image sequence, landmark facial features such as the eye corners and mouth corners need to be selected by hand. The generic face model consists of surface patches embedded in Bézier volumes, warped to fit the selected facial features. The patches are guaranteed to be continuous and smooth. Once the model is constructed and fitted, head motion and local deformations of the facial features such as the eyebrows, eyelids, and mouth can be tracked. First the 2D image motions are measured using template matching between frames at different resolutions. Image templates from the previous frame and from the very first frame are both used for more robust tracking. The measured 2D image motions are modelled as projections of the true 3D motions onto the image plane. From the 2D motions of several points on the mesh, the 3D motion can be estimated. Figure 1 shows an example of one frame with the wireframe model overlayed on the face being tracked.

Figure 1: On the left the wireframe model and on the right the facial motion units used in our face tracker.

The recovered motions are represented in terms of magnitudes of some predefined motions of various facial features. Each feature motion corresponds to a simple deformation on the face, defined in terms of the Bézier volume control parameters. We refer to these motion vectors as Motion-Units (MU's). Note that they are similar but not equivalent to the AUs of Ekman. The MU's used in the face tracker are shown in figure 1 on the right and are described in table 1. These MU's are the features we use as input to our classifiers described in later sections.

MU   Description
1    vertical movement of the center of upper lip
2    vertical movement of the center of lower lip
3    horizontal movement of left mouth corner
4    vertical movement of left mouth corner
5    horizontal movement of right mouth corner
6    vertical movement of right mouth corner
7    vertical movement of right brow
8    vertical movement of left brow
9    lifting of right cheek
10   lifting of left cheek
11   blinking of right eye
12   blinking of left eye

Table 1: Motion units used in our face tracker.

3 Classifiers

Naive Bayes classifiers are popular due to their simplicity and their success in past applications.


The simplicity of a naive Bayes classifier stems from its independence assumption, which assumes that features are uncorrelated. Thus their joint probability can be expressed as a product of their individual probabilities. As in any classification problem we would like to assign a class label c to an observed feature vector X with n dimensions (features). The optimal classification rule under the maximum likelihood (ML) framework to classify an observed feature vector of n dimensions, X ∈ R^n, to one of |C| class labels, c ∈ {1, ..., |C|}, is given as:

ĉ = argmax_c P(X | c; Θ)        (1)

where Θ is the set of parameters that need to be learned for the classifier. Given the naive Bayes assumption, the conditional probability of X given a class label c is defined as:

P(X | c; Θ) = ∏_{i=1}^{n} P(x_i | c; Θ)        (2)

Having a continuous feature space - which is true in our case - the conditional probabilities for each feature can be modelled as probability distribution functions. The Gaussian distribution is most commonly used and ML methods are used to estimate its parameters. For a naive Bayes classifier we have to learn a distribution for each feature, but since we are dealing with only one dimension, the parameters for the Gaussian distribution (mean and variance) can easily be calculated.
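
As a concrete illustration of this training and classification step, the sketch below (our own, not the project's actual code; the class name and variables are hypothetical) fits one Gaussian per feature and per class with ML and classifies by maximizing the product of the per-feature likelihoods, as in equations (1) and (2):

    import numpy as np

    class GaussianNaiveBayes:
        # Minimal Gaussian naive Bayes over continuous features such as the 12 MU's.

        def fit(self, X, y):
            X, y = np.asarray(X, float), np.asarray(y)
            self.classes = np.unique(y)
            # ML estimates: one mean and one variance per class and per feature
            self.mean = np.array([X[y == c].mean(axis=0) for c in self.classes])
            self.var = np.array([X[y == c].var(axis=0) + 1e-6 for c in self.classes])
            return self

        def predict(self, X):
            X = np.asarray(X, float)
            # log P(X|c) = sum_i log N(x_i; mean_ci, var_ci); pick the class with the highest value
            log_lik = np.stack(
                [-0.5 * np.sum(np.log(2 * np.pi * v) + (X - m) ** 2 / v, axis=1)
                 for m, v in zip(self.mean, self.var)], axis=1)
            return self.classes[np.argmax(log_lik, axis=1)]

Training on the labelled MU vectors and calling predict on the MU vector of a new frame would then give the emotion label under the ML rule.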

However, assuming Gaussian distributions is not always accurate and thus the Cauchy distribution was proposed as an alternative by Sebe et al [17]. While it can give better classification results in some cases, its drawback is that its parameters are much more difficult to estimate.

Despite the seemingly weak independence assumption of the naive Bayes classifier, it often gives surprisingly good results. Recent studies also give some theoretical explanation for this success. Nevertheless, in cases where there are dependencies among features, the naive Bayes model can give a sub-optimal solution. In our scenario it is feasible to assume some dependence between features due to the anatomic structure of the face. Hence we should attempt to also find these dependencies and model their joint distributions.


Bayesian networks are an intuitive and efficient way to model such joint distributions, and they are also suitable for classification. In fact, the naive Bayes model is actually an extreme case of a Bayesian network where all nodes are only connected to the class node (i.e. no dependencies between features are modelled). A Bayesian network consists of a directed acyclic graph in which every node is associated with a variable Xi and with a conditional distribution P(Xi | Πi), where Πi denotes the parents of Xi in the graph. The joint probability distribution is then defined as:

P(X_1, ..., X_n) = ∏_{i=1}^{n} P(X_i | Π_i)
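
As a toy illustration of this factorization (our example; the network and the numbers are made up, and the features in our system are continuous rather than discrete), the joint probability of a small chain C -> X1 -> X2 is simply the product of each node's conditional probability given its parents:

    # Hypothetical conditional probability tables for a chain C -> X1 -> X2 (binary variables).
    p_c  = {0: 0.5, 1: 0.5}                 # P(C)
    p_x1 = {(0, 0): 0.9, (0, 1): 0.1,       # P(X1 | C), keyed by (c, x1)
            (1, 0): 0.3, (1, 1): 0.7}
    p_x2 = {(0, 0): 0.8, (0, 1): 0.2,       # P(X2 | X1), keyed by (x1, x2)
            (1, 0): 0.4, (1, 1): 0.6}

    def joint(c, x1, x2):
        # P(C, X1, X2) = P(C) * P(X1 | C) * P(X2 | X1)
        return p_c[c] * p_x1[(c, x1)] * p_x2[(x1, x2)]

    print(joint(1, 1, 0))   # 0.5 * 0.7 * 0.4 = 0.14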

One of the important aspects when designing a Bayesian network classifier is choosing the right structure for the network graph. Choosing a wrong structure can have dire effects on the classification results. When the structure of the Bayesian network is unknown or uncertain, as is the case here, it is better to learn the optimal structure using ML. However, this requires searching through all possible structures, i.e. all possible dependencies among features, which is an NP-complete problem. Thus we should restrict ourselves to a smaller class of structures to make the problem tractable. One such class of structures was proposed by Friedman et al [6] and is referred to as the Tree-Augmented-Naive Bayes (TAN) classifier. TAN classifiers have the advantage that there exists an efficient algorithm [2] to compute the optimal TAN model.

TAN classifiers are a subclass of Bayesian network classifiers where the class node has no parents and each feature has as parents the class node and at most one other feature. To learn the exact structure, a modified Chow-Liu algorithm [2] for constructing tree-augmented Bayesian networks [6] is used. Essentially the algorithm builds a maximum weighted spanning tree between the feature nodes. As weights of the arcs the pairwise class-conditional mutual information among the features is used. The resultant graph of the algorithm is a tree over all the feature nodes that maximizes the sum of the weights of the arcs.


To make the undirected tree a directed graph, a root node is chosen and all edges are made to point away from it. Then the class node is made a parent of all features to construct the final TAN. The detailed algorithm and the algorithm used to compute the maximum spanning tree can be found in [3].

The last step is to compute the joint distributions of the nodes. Again Gaussian distributions are used and estimated using ML techniques. This is essentially the same as for the naive Bayes classifier, only that now we need to compute additional covariance parameters.
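
The sketch below is our own reconstruction of that structure-learning step under the Gaussian assumption (it is not the implementation of [2], [3] or [6]; the function names are ours): estimate the pairwise class-conditional mutual information, build a maximum weighted spanning tree over the feature nodes with Prim's algorithm, and direct the edges away from an arbitrarily chosen root.

    import numpy as np

    def class_conditional_mi(X, y):
        # Pairwise I(Xi; Xj | C) for Gaussian features:
        # I = -0.5 * sum_c P(c) * log(1 - rho_c^2), with rho_c the within-class correlation.
        X, y = np.asarray(X, float), np.asarray(y)
        n, d = X.shape
        mi = np.zeros((d, d))
        for c in np.unique(y):
            Xc = X[y == c]
            rho = np.corrcoef(Xc, rowvar=False)
            np.fill_diagonal(rho, 0.0)
            mi += (len(Xc) / n) * -0.5 * np.log(1.0 - np.clip(rho, -0.999, 0.999) ** 2)
        return mi

    def directed_max_spanning_tree(weights, root=0):
        # Prim's algorithm; each new node is attached to a node already in the tree,
        # so the returned (parent, child) edges already point away from the root.
        d = weights.shape[0]
        in_tree, edges = {root}, []
        while len(in_tree) < d:
            i, j = max(((i, j) for i in in_tree for j in range(d) if j not in in_tree),
                       key=lambda e: weights[e])
            edges.append((i, j))
            in_tree.add(j)
        return edges

The class node is then added as an extra parent of every feature, and the Gaussian parameters (now including a covariance for every linked feature pair) are again fitted with ML.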

Our project aims to design a dynamic classifier for facial expressions, which means also taking temporal patterns into account. To classify an emotion, not only the current video frame is used, but also past video frames. While [3] proposes a multi-level Hidden Markov Model based classifier, the current implementation only takes temporal patterns into account by averaging classification results over a set number of past frames. We do not discuss dynamic classifiers and the proposed Hidden Markov Model further, because we did not work on extending the system in this direction.

4 Face detection

As we described in section 2, the existing system required placing all marker points on landmark facial features manually. To automate this, we want to detect the initial location of the human face automatically and use this information to place the marker points near their landmark features. We do this by placing a scaled version of a landmark model of the face on the detected face location.

As our face detector, we chose a fast and robust classifier proposed by Viola and Jones and improved by Lienhart et al [11, 12]. The algorithm makes three main contributions:

• The use of integral images.

• A selection of features through a boosting algorithm (Adaboost).


• A method to combine simple classifiers in a cascade structure.

Figure 2: Haar features.

4.1 Integral Images

Analyzing images is not an easy task. Using just pixel information can be useful in some fields (e.g. movement detection) but is in general not enough to recognize a known object. In 1998, Papageorgiou et al [14] proposed a method to analyze image features using a subgroup of Haar-like features, derived from the Haar transforms. This subgroup was extended later by Lienhart et al [11] to also detect small rotations of the sought-after object. The basic classifiers are decision-tree classifiers with at least 2 leaves. Haar-like features are the input to the basic classifiers and are calculated as described below. The algorithm we are describing uses the Haar-like features shown in figure 2.

The feature used in a particular classifier is selected by its shape (1a, 2b, etc.), its position within the region of interest and its scale (this scale is not the same as the scale used at the detection stage, though these two scales are multiplied). For example, in the case of the third line feature (2c) the response is calculated as the difference between the sum of the image pixels under the rectangle covering the whole feature (including the two white stripes and the black stripe in the middle) and the sum of the image pixels under the black stripe multiplied by 3, in order to compensate for the differences in the size of the areas. Calculating sums of pixels over rectangular regions can be very expensive in computational terms, but this problem can be solved by using an intermediate representation of the images, namely integral images.


Figure 3: Calculation of the rectangular regions.

Those intermediate images are easily generated by the cumulative sums of the original image's pixels: every pixel of the integral image ii(x,y) corresponds to the sum of all the pixels in the original image i from i(0,0) to i(x,y):

ii(x, y) = ∑_{x'≤x, y'≤y} i(x', y')

Using recursive formulas, it is possible to generate an integral image from the original image in a single pass:

s(x,y) = s(x,y − 1) + i(x,y)

ii(x,y) = ii(x − 1,y) + s(x,y)

where s(x,y) is the cumulative sum of the row.

Once an integral image is generated, it is rather easy to calculate the sum of pixels under an arbitrary rectangular region D using the values of points 1, 2, 3 and 4. This is illustrated in figure 3.

In fact, the value of point 1 is the cumulative sum of A, point 2 is the cumulative sum of A + B, point 3 is A + C and point 4 is A + B + C + D. Since we are looking for the value of D, we should subtract from the value of point 4 the value of point 3 and the value of point 2, and add the value of point 1 since it was subtracted twice during the previous operation.
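
A short sketch (ours, not the detector's actual code) of the two operations just described, building the integral image and reading back an arbitrary rectangle sum from four corner values:

    import numpy as np

    def integral_image(img):
        # Cumulative sums over rows and columns; a padded zero row/column removes edge cases.
        ii = np.cumsum(np.cumsum(np.asarray(img, dtype=np.int64), axis=0), axis=1)
        return np.pad(ii, ((1, 0), (1, 0)))

    def rect_sum(ii, top, left, height, width):
        # The "point 4 - point 3 - point 2 + point 1" rule from figure 3.
        bottom, right = top + height, left + width
        return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

    img = np.arange(16).reshape(4, 4)
    ii = integral_image(img)
    assert rect_sum(ii, 1, 1, 2, 2) == img[1:3, 1:3].sum()

A line feature response such as (2c) would then be rect_sum over the whole feature minus 3 times rect_sum over the middle black stripe, both read from the same integral image.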

4.2 Feature selection using Adaboost

Proposed by Schapire [15, 16], the Adaboost algorithm is used to 'boost' the performance of a learning algorithm. In this case, the algorithm is used both to train the classifiers and to analyze the image.


In a 24x24 pixel image, there are over 180,000 Haar-like features that can be detected, a lot more than the number of pixels in the image (576). In case we are dealing with a bigger image, the number should be multiplied for all the sub-windows of 24 pixels in the image. The computational cost of this operation is clearly prohibitive. Instead, Adaboost is used to select which of the features are actually relevant for the sought-after object, drastically reducing the number of features to be analyzed.

Figure 4: First two iterations of Adaboost.

In every iteration, Adaboost chooses the most characterizing feature of the entire training set from the 180,000 features possible in every image. The first two selected features are displayed in figure 4. It is clear that the most discriminative feature is the difference between the line of the eyes and the surroundings; for a face the surroundings are lighter than the eyes themselves. The second feature selected is the difference in tonality between the eyes and the nose; the nose is also lighter when compared to the area of the eyes. The algorithm will continue to select good features that can be combined in a classifier.
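
A compact sketch of that selection loop (our simplification using one-threshold decision stumps, not the training code of [15] or [19]; the two-round setting merely mirrors figure 4):

    import numpy as np

    def adaboost_select(F, y, n_rounds=2):
        # F: (n_samples, n_features) Haar-like feature responses; y: labels in {+1, -1}.
        F, y = np.asarray(F, float), np.asarray(y)
        n, m = F.shape
        w = np.full(n, 1.0 / n)                    # start with uniform sample weights
        chosen = []
        for _ in range(n_rounds):
            best = None
            for j in range(m):                     # the most characterizing feature ...
                for thr in np.unique(F[:, j]):     # ... over every candidate threshold
                    for sign in (1, -1):
                        pred = np.where(sign * (F[:, j] - thr) > 0, 1, -1)
                        err = w[pred != y].sum()
                        if best is None or err < best[0]:
                            best = (err, j, thr, sign, pred)
            err, j, thr, sign, pred = best
            alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
            w *= np.exp(-alpha * y * pred)         # increase the weight of misclassified samples
            w /= w.sum()
            chosen.append((j, thr, sign, alpha))   # one selected weak classifier per round
        return chosen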

4.3 Cascade of classifiers

At every step, a simple classifier (also called a weak classifier because of its low discriminative power) is built. The combination of all the weak classifiers will form a strong classifier that can recognize any kind of object it was trained with. The problem is to search the full image with this particular sized window, applying the sequence of weak classifiers to every sub-window of the picture.


Figure 5: Cascade of classifiers.

Viola and Jones [19] used a cascade of classifiers (see figure 5) to tackle this problem: the first classifier (the most discriminative) is applied to all the sub-windows of the image, and at different scales. The second classifier is applied only to the sub-windows in which the first classifier succeeded. The cascade continues, applying all the weak classifiers and discarding the negative sub-windows, concentrating the computational power only on the promising areas.
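
A sketch of how one candidate sub-window runs through such a cascade (ours, reusing the stump tuples of the previous sketch; the stage thresholds are assumed to come from training):

    def stage_passes(x, stumps, stage_threshold):
        # x: feature responses of one sub-window; stumps: (feature, thr, sign, alpha) tuples.
        score = sum(alpha * (1 if sign * (x[j] - thr) > 0 else -1)
                    for j, thr, sign, alpha in stumps)
        return score >= stage_threshold

    def cascade_accepts(x, stages):
        # stages: list of (stumps, stage_threshold) pairs, ordered from cheapest to most complex.
        for stumps, threshold in stages:
            if not stage_passes(x, stumps, threshold):
                return False     # rejected early: later stages are never evaluated
        return True              # survived every stage: the window probably contains a face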

5 Implementation

When studying the incomplete existing implementation we received, we decided to remove the outdated parts and to change the program structure to be able to create a distributable package, executable by a normal user without Visual C++ and the required libraries installed. Minor code cleaning and bugfixing was performed all over the source code.

Another big change was in the source of the input videos, which previously supported only AVI movies and Matrox cameras. We implemented a new class based on the OpenCV library [9] which uses the same code to read from any kind of movie file and from virtually all cameras supporting computer attachment. It is now possible to select the camera's options and record the video stream directly from the emotion fitting program. On the interface, new buttons were added to control the new options, while the old ones were debugged and restyled in a more modern style.

As stated in the introduction, our main contribution is the inclusion of a face detector


through the OpenCV library: we used it to snap the position and scale of the markers to the position and scale of the user's face, and most importantly to reinitialize the position of the mesh when the face was lost during the emotion fitting. This contribution made the program more usable and robust, introducing brief errors only in some cases of occlusion or fast movements of the user. Furthermore, the communication between the video program and the classifier programs was re-implemented to reduce the delays that were previously introduced by establishing a new connection for every image frame.
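
For reference, a present-day equivalent of that detection step through OpenCV's Python bindings could look like the sketch below (our illustration, not the project's C++ code; the bundled frontal-face cascade file and the detection parameters are common defaults and may differ per OpenCV build):

    import cv2

    # Stock frontal-face Haar cascade shipped with OpenCV, and the default camera.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    capture = cv2.VideoCapture(0)

    ok, frame = capture.read()
    if ok:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Each hit is an (x, y, w, h) box, usable to place and scale the face model.
        faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        print(faces)
    capture.release()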

5.1 Visualization

For the visualization of the emotions we chose two different forms. The first uses the sizes of bars to display the emotion and the second uses a circle. Every emotion has a different color. For example happy has the color green due to the fact that green is generally considered a 'positive color' and angry has the color red because red is generally considered a 'negative color'. For clarity we also write the emotion and corresponding probability percentage in the mood window. The mood with the highest probability is also written separately. In the mood window there are two combo boxes at the bottom. In these combo boxes there is the possibility of choosing the visualization type and the classifier.

Figure 6: Bars visualization of the probabilities for each emotion.

Figure 6 shows the bars visualization. If the program is 100% sure that we have a certain emotion, then the width of the bar will correspond to the full width of the window. Figure 7 shows the circle visualization.



Figure 7: Circle visualization of the probabilities for each emotion.

The edge of the circle is a classification of 100% of the emotion, so the closer a dot gets to the edge, the higher the probability of that emotion. The center of the circle corresponds to neutral. The current mood is displayed at the top of the window.
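
A minimal sketch of the two mappings as we read them from the description above (the layout details, such as giving each emotion its own angle in the circle view, are our assumption):

    import math

    def bar_width(prob, window_width):
        # 100% certainty fills the whole width of the window
        return int(prob * window_width)

    def circle_position(prob, emotion_index, n_emotions, radius, centre=(0, 0)):
        # Dots closer to the edge mean a higher probability; the centre corresponds to neutral.
        angle = 2 * math.pi * emotion_index / n_emotions
        return (centre[0] + prob * radius * math.cos(angle),
                centre[1] + prob * radius * math.sin(angle))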

6 Evaluation

We ran several tests to evaluate the performance of the emotion detector. Note that our changes, fixes and new implementation of classifiers should not alter the previously reported results [3]. The aim of our experiments is thus to obtain a second set of results for comparison purposes.

6.1 Dataset

Our dataset is the Cohn-Kanade database [1], which contains 52 different people expressing 7 emotions. These emotions are: neutral, happy, surprised, angry, disgusted, afraid and sad. For every person several videos are available. Every video starts with the neutral expression and then shows an emotion. Each frame of every video is labelled with the corresponding emotion. For some people in the database, not all emotions are available.

6.2 Experiments

For each classifier we performed person dependent and person independent tests. The training set for the person independent tests contains samples from several people displaying all seven emotions. A sample consists of a single labelled frame from a video. The test set is a disjoint set with samples from other people. In person dependent tests, on the other hand, the training set contains samples from just a single person. It is then evaluated on a disjoint test set containing only samples from the same person.


6.3 Results

First we examined the performance of our implementation of a Naive Bayes classifier. We divided the data into three equal parts, from which we used two parts for training and one part for testing. Results are averaged over the three different combinations of test/training set possible. This is also known as cross-validation. The confusion matrix of the person independent test is shown in table 2. The confusion matrix for the TAN classifier, using the same training and test sets, is shown in table 3.

In person dependent tests the classifier is trained and evaluated using data from only a single person. The samples for a person are again split in three equal parts for cross-validation. We did this for five people and averaged the results to obtain the confusion matrix. The confusion matrix of the person dependent test is shown in table 4. The confusion matrix for the TAN classifier using the same people is shown in table 5.

As can be seen in the confusion matrices, the results for classifying the emotion in the person dependent tests are better (for NB 93.2% compared to 64.3%, and for TAN 62.1% compared to 53.8%) than in the person independent tests. This result is of course intuitively correct, because the classifier was trained specifically for that person, so it should perform well when the test set is also from that same person.

Our results very clearly do not correspond to previously reported results by Cohen et al [3]. Surprisingly, our Naive Bayes classifier outperforms the TAN classifier. Our Naive Bayes classifier gives the same results as reported in the literature. The TAN classifier, however, performed significantly worse. We presume this is caused by an incorrectly learned dependency structure for the TAN model.
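
For clarity, a sketch of that evaluation protocol (ours; train_fn and predict_fn stand in for whichever classifier is being tested): split the labelled frames into three equal parts, train on two, test on one, and average the row-normalized confusion matrices of the three rotations.

    import numpy as np

    def three_fold_confusion(X, y, n_classes, train_fn, predict_fn):
        # y: integer emotion labels 0 .. n_classes-1. Returns the confusion matrix in percent,
        # averaged over the three rotations (rows: true emotion, columns: detected emotion).
        X, y = np.asarray(X), np.asarray(y)
        folds = np.array_split(np.random.permutation(len(y)), 3)
        total = np.zeros((n_classes, n_classes))
        for k in range(3):
            test = folds[k]
            train = np.concatenate([folds[i] for i in range(3) if i != k])
            model = train_fn(X[train], y[train])
            cm = np.zeros((n_classes, n_classes))
            for t, p in zip(y[test], predict_fn(model, X[test])):
                cm[t, p] += 1
            total += 100.0 * cm / np.maximum(cm.sum(axis=1, keepdims=True), 1)
        return total / 3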


            Neutral  Happy  Surprised  Angry  Disgusted  Afraid    Sad
Neutral       82.34   1.89       1.76   1.78       0.89    3.74   7.60
Happy          2.17  74.17       0.42   1.95       3.81   14.85   2.63
Surprised      2.16   0.00      90.08   1.35       0.00    1.60   4.81
Angry          8.01   5.43       0.31  55.28      20.96    3.60   6.42
Disgusted      6.12   8.66       3.76  23.76      46.54    6.93   4.24
Afraid         4.15  20.52      12.91   0.08       1.66   57.47   3.22
Sad           22.46   2.82      15.26   7.95       6.17    1.38  43.96

Table 2: Confusion matrix for the naive Bayes classifier in person independent tests. Rows represent the true (expressed) emotion, while columns represent the detected emotion. Average accuracy is 64.3%.

            Neutral  Happy  Surprised  Angry  Disgusted  Afraid    Sad
Neutral       87.35   1.49       1.66   2.51       0.37    2.58   4.04
Happy          6.63  63.98       2.04   2.42       5.31   14.05   5.57
Surprised      3.90   0.00      80.97   1.82       0.74    2.29  10.28
Angry         17.93   6.43       4.25  36.32      15.72    9.94   9.40
Disgusted      9.33   9.18       4.11  25.45      37.07    7.68   7.19
Afraid        11.76  22.47      10.92   4.89       5.75   37.08   7.13
Sad           21.14   9.10      11.24   9.09       5.71    9.82  33.90

Table 3: Confusion matrix for the TAN classifier in person independent tests. Average accuracy is 53.8%.

Investigating the learned dependencies, we found them to disagree greatly with the ones reported by Cohen et al. While they reported mostly horizontal dependencies between the features on the face, our structure contains many vertical dependencies. This could be a bug in our implementation of the TAN classifier.

Another possible explanation is that the TAN classifier lacks enough training data to be effectively trained. This often happens with more complex classifiers, because they need to estimate more classifier parameters from the same amount of data.

Looking for patterns in the confusion matrices, we see that the 'positive' emotions happy and surprised are recognized very well; these are very pronounced emotions. It holds for all emotions that when they are not pronounced enough, they can be misclassified as neutral instead of the correct emotion. Happiness is confused most often with afraid, and the converse also holds.


Analysis shows that people who are afraid tend to open their mouth a bit and the mouth corners move up a bit. When looking at just a single frame, it is very hard to distinguish these two emotions. We can make a similar point for anger and disgust: both curve the mouth downward, though people tend to open their mouth a bit with disgust and close it when they are angry.

An interesting emotion is fear (afraid), as it can be misclassified as surprise quite often, while the converse seldomly happens. We think that these emotions are very similar in their expression (i.e. 'close' to each other) but that surprise has a very specific expression (little variation in the expression), making it easy to recognize. Fear, however, probably has a range of forms it can take and we think that surprise may be positioned in-between these forms. The main confusion for fear is happiness; again in this confusion the mouth movement is similar, but for these two emotions the eyebrows also tend to be raised a bit.


            Neutral  Happy  Surprised  Angry  Disgusted  Afraid    Sad
Neutral       88.17   2.62       1.83   1.47       2.29    0.56   3.07
Happy          2.22  95.16       0.00   0.00       0.00    2.62   0.00
Surprised      0.00   0.00     100.00   0.00       0.00    0.00   0.00
Angry          1.67   0.00       0.00  98.33       0.00    0.00   0.00
Disgusted     10.00   2.22       0.00   4.44      81.11    0.00   2.22
Afraid         3.56   0.00       0.00   0.00       0.00   94.22   2.22
Sad            4.44   0.00       0.00   0.00       0.00    0.00  95.56

Table 4: Confusion matrix for the naive Bayes classifier in person dependent tests. Rows represent the true emotion, while columns represent the detected emotion. Average accuracy is 93.2%. Results averaged over 5 people.

            Neutral  Happy  Surprised  Angry  Disgusted  Afraid    Sad
Neutral       95.26   0.42       0.39   2.09       0.00    0.00   1.84
Happy         20.56  56.98       2.50  11.35       0.00    5.28   3.33
Surprised     12.62   1.11      73.60   8.78       0.00    2.22   1.67
Angry         15.78   2.78       0.00  79.22       0.00    0.00   2.22
Disgusted     27.78   7.78       2.22  18.89      33.33    2.22   7.78
Afraid        30.22  11.00       0.00   9.33       2.22   41.67   5.56
Sad           35.11   0.00       4.44   4.44       1.33    0.00  54.67

Table 5: Confusion matrix for the TAN classifier in person dependent tests. Average accuracy is 62.1%. Results averaged over 5 people.

Discriminating these two emotions manually from a single frame ourselves is hard, so this makes sense.

7 Conclusion

We significantly improved the usability and user-friendliness of the existing facial tracker, extending it with automatic face positioning, emotion classifiers and visualization. Our Naive Bayes emotion classifier performs quite well. The performance of our TAN classifier is not up to par with existing research. The classifier either lacks enough training data, or has an implementation problem.

We believe that additional improvements to the system are possible. First of all we could use specialized classifiers to detect specific emotions, and combine them to improve the classification performance.

Furthermore, the current classifier shows a strange behavior when readapting the mask after it loses it, due to a continuous classification of the deformations. Those deformations are artificial, generated during the re-adaptation step, and should not be considered for classification, so classification should be interrupted during mesh repositioning. Another important step is to make the system more robust to lighting conditions and partial occlusions. In fact, the face detector will work only if all the features of the face are visible, and won't work if the face is partially occluded or not in a good lighting condition. Finally, the system should be more person independent: with the current implementation, the system requires markers to let the user select the important features of the face.


This should be transparent to the user, using the face detector to localize the position and the scale of the face and sequentially applying another algorithm to adjust those markers to the current face.


In this way, there will be no need for markers anymore and the system could be used by any user, without any intervention. With these improvements, this application could be used in real-life applications such as games, chat programs, virtual avatars, interactive TV and other new forms of human-computer interaction.

References

[1] J. Cohn, T. Kanade. Cohn-Kanade AU-Coded Facial Expression Database. Carnegie Mellon University.

[2] C.K. Chow, C.N. Liu. Approximating discrete probability distributions with dependence trees. IEEE Trans. Information Theory, 14:462-467, 1968.

[3] I. Cohen, N. Sebe, A. Garg, L. Chen, T.S. Huang. Facial expression recognition from video sequences: temporal and static modeling. Computer Vision and Image Understanding, 91(1-2):160-187, 2003.

[4] P. Ekman. Strong evidence for universals in facial expressions. Psychol. Bull., 115(2):268-287, 1994.

[5] J.H. Friedman. On bias, variance, 0/1-loss, and the curse-of-dimensionality. Data Mining and Knowledge Discovery, 1(1):55-77, 1997.

[6] N. Friedman, D. Geiger, M. Goldszmidt. Bayesian network classifiers. Machine Learning, 29(2):131-163, 1997.

[7] A. Garg, D. Roth. Understanding probabilistic classifiers. Proc. Eur. Conf. on Machine Learning, 179-191, 2001.

[8] D. Goleman. Emotional Intelligence. Bantam Books, New York, 1995.

[9] Intel Research Laboratories. OpenCV: Open computer vision library. http://sf.net/projects/opencvlibrary/.


[10] C.E. Izard. Innate and universal facial expressions: evidence from developmental and cross-cultural research. Psychol. Bull., 115(2):288-299, 1994.

[11] R. Lienhart, J. Maydt. An extended set of Haar-like features for rapid object detection. Proceedings of the IEEE International Conference on Image Processing, Rochester, New York, vol. 1, pp. 900-903, 2002.

[12] R. Lienhart, A. Kuranov, V. Pisarevsky. Empirical analysis of detection cascades of boosted classifiers for rapid object detection. Intel Corporation, technical report, 297-304, 2002.

[13] M. Pantic, L.J.M. Rothkrantz. Automatic analysis of facial expressions: the state of the art. IEEE Trans. PAMI, 22(12):1424-1445, 2000.

[14] C. Papageorgiou, M. Oren, T. Poggio. A general framework for object detection. Proceedings of the International Conference on Computer Vision, Bombay, India, pp. 555-562, 1998.

[15] R. Schapire, Y. Freund. Experiments with a new boosting algorithm. Proceedings of the International Conference on Machine Learning, Bari, Italy, Morgan Kaufmann, pp. 148-156, 1996.

[16] R. Schapire. The strength of weak learnability. Machine Learning, 5(1):197-227, 1990.

[17] N. Sebe, I. Cohen, A. Garg, M.S. Lew, T.S. Huang. Emotion recognition using a Cauchy naive Bayes classifier. International Conference on Pattern Recognition (ICPR02), vol. I, pp. 17-20, Quebec, Canada, 2002.

[18] H. Tao, T.S. Huang. Connected vibrations: a modal analysis approach to non-rigid motion tracking. Proc. IEEE Conf. on CVPR, 735-740, 1998.

[19] P. Viola, M. Jones. Rapid object detection using a boosted cascade of simple features. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Kauai, Hawaii, vol. 1, pp. 511-518, 2001.
