Probabilistic Combination of Multiple Modalities to Detect Interest
![Page 1: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/1.jpg)
Probabilistic Combination of Multiple Modalities
to Detect Interest
Ashish Kapoor, Rosalind W. Picard & Yuri Ivanov*
MIT Media Laboratory
*Honda Research Institute US
![Page 2: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/2.jpg)
Skills of Emotional Intelligence:
• Expressing emotions
• Recognizing emotions
• Handling another’s emotions
• Regulating emotions (if “have emotion”)
• Utilizing emotions (if “have emotion”)
(Salovey and Mayer 90, Goleman 95, Picard 97)
![Page 3: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/3.jpg)
Emotions give rise to changes that can be sensed
• Distance Sensing: Face, Voice, Posture, Gestures, movement, behavior
• Up-close Sensing: Skin conductivity, Pupillary dilation, Respiration, heart rate, pulse, Temperature, Blood pressure
• Internal Sensing: Hormones, Neurotransmitters, …
![Page 4: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/4.jpg)
“Emotion recognition”
• Detecting Interest
  – Postures (Mota, 2002)
• Detecting Stress
  – Physiology, heart-rate (Qi & Picard, 2002)
• Detecting Frustration
  – Pressure Sensors on Mouse (Reynolds, Qi and Picard, PUI 2001)
![Page 5: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/5.jpg)
Example: On Task
![Page 6: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/6.jpg)
Example: Off-Task
![Page 7: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/7.jpg)
“Emotion recognition”
• Advantages:
  – Robust Affect Recognition
    • More information leads to more reliable recognition of affect.
  – Some modalities are good for certain emotions and not good for others
    • For example, skin conductivity can distinguish between excitement levels but not valence.
  – If one modality fails, the other modalities can still be used to infer the affective state
![Page 8: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/8.jpg)
Previous Work
• Ensemble Methods
  – Decision Level Fusion
    • Kittler et al., PAMI, 1998
  – Critic-based Fusion
    • Miller and Yan, Trans. on Signal Processing, 1999
  – Boosting and Bagging
![Page 9: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/9.jpg)
Previous Work
• Multimodal Recognition of Affect
  – Huang et al., 1998
• Other Applications
  – Biometrics, Hong and Jain, PAMI 1998
  – Computer Vision, Toyama & Horvitz, ACCV 2000
  – Text Classification, Bennett et al., 2002
![Page 10: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/10.jpg)
Problems in Multimodal Combination
• No “best” rule that works for all problems
• Rule Based: Product Rule
  – Independence assumptions about classifiers
    • Might not hold
    • Very sensitive to errors
• Rule Based: Sum Rule
  – Approximation to the product rule
    • Might work where the product rule fails
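A minimal Python sketch of the two rules above, assuming each classifier outputs a posterior over the same classes and that class priors are uniform. The function names and the example numbers are illustrative and not taken from the talk. It shows why the product rule is sensitive to errors: one near-zero estimate is enough to veto a class, while the sum rule only dilutes it.

```python
def normalize(scores):
    """Rescale non-negative scores so they sum to 1."""
    total = sum(scores)
    return [s / total for s in scores]


def product_rule(posteriors):
    """Multiply per-classifier posteriors class by class (uniform prior assumed)."""
    fused = [1.0] * len(posteriors[0])
    for post in posteriors:
        fused = [f * p for f, p in zip(fused, post)]
    return normalize(fused)


def sum_rule(posteriors):
    """Average per-classifier posteriors class by class."""
    n_classes = len(posteriors[0])
    fused = [sum(post[c] for post in posteriors) / len(posteriors)
             for c in range(n_classes)]
    return normalize(fused)


# Hypothetical posteriors over (interested, not interested).
# The second classifier is badly wrong about the first class.
face = [0.90, 0.10]
posture = [0.01, 0.99]

product_fused = product_rule([face, posture])
sum_fused = sum_rule([face, posture])
print("product:", product_fused)
print("sum:    ", sum_fused)
```

With these made-up inputs the product rule leaves the first class with less than a tenth of the probability mass, whereas the sum rule keeps the two classes close to each other.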
![Page 11: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/11.jpg)
Using multiple modalities
• Aim:
  – Given multimodal data $X = \{x_{face}, x_{speech}, x_{heart\ rate}, \ldots\}$
  – Find out the affective state
• Affective state denoted by: $\omega$
  – for example, $\omega$ can represent anger, stress, etc.
$P(\omega \mid X)$: What we are ultimately interested in!!
![Page 12: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/12.jpg)
Graphical Models for Fusion
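[Figure: graphical model over $\omega$ and the observation $x_f$]
• Generative Model Paradigm
$$P(\omega \mid x_f) \propto P(\omega, x_f) = P(\omega)\,P(x_f \mid \omega)$$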
![Page 13: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/13.jpg)
Graphical Models for Fusion
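[Figure: graphical model over $\omega$ and the observations $x_p$, $x_f$]
• Assuming Conditional Independence
$$P(\omega \mid x_f, x_p) \propto P(\omega, x_f, x_p) = P(\omega)\,P(x_f \mid \omega)\,P(x_p \mid \omega) \propto P(\omega \mid x_f)\,P(\omega \mid x_p)$$
(the last step holds for a uniform prior $P(\omega)$)
Product Rule!!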
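The step from conditional independence to the product rule can be written out as follows. Here $\omega$ denotes the affective state and $x_f$, $x_p$ the two observations; the last line additionally assumes a uniform prior $P(\omega)$, which the slide does not state explicitly.

```latex
\begin{align*}
P(\omega \mid x_f, x_p)
  &\propto P(\omega)\,P(x_f \mid \omega)\,P(x_p \mid \omega) \\
  &= P(\omega)\,
     \frac{P(\omega \mid x_f)\,P(x_f)}{P(\omega)}\,
     \frac{P(\omega \mid x_p)\,P(x_p)}{P(\omega)} \\
  &\propto \frac{P(\omega \mid x_f)\,P(\omega \mid x_p)}{P(\omega)} \\
  &\propto P(\omega \mid x_f)\,P(\omega \mid x_p)
     \qquad \text{(uniform } P(\omega)\text{)}
\end{align*}
```

Without the uniform-prior assumption the fused score keeps the extra factor $1 / P(\omega)$.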
![Page 14: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/14.jpg)
Graphical Models for Fusion
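[Figure: graphical model over $\omega$, the observations $x_p$, $x_f$ and the switching variable $\lambda$]
• A Switching Variable $\lambda \in \{0, 1\}$
$$P(x_f \mid \omega, \lambda) = \begin{cases} P(x_f \mid \omega) & \lambda = 0 \\ \text{Uniform} & \lambda = 1 \end{cases}$$
$$P(x_p \mid \omega, \lambda) = \begin{cases} \text{Uniform} & \lambda = 0 \\ P(x_p \mid \omega) & \lambda = 1 \end{cases}$$
$$P(\omega, x_p, x_f) \propto P(\lambda = 0 \mid x_f, x_p)\,P(\omega \mid x_f) + P(\lambda = 1 \mid x_f, x_p)\,P(\omega \mid x_p)$$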
![Page 15: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/15.jpg)
Graphical Models for Fusion
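[Figure: graphical model over $\omega$, the observations $x_p$, $x_f$ and the switching variable $\lambda \in \{0, 1\}$]
$$P(\omega, x_p, x_f) \propto P(\lambda = 0 \mid x_f, x_p)\,P(\omega \mid x_f) + P(\lambda = 1 \mid x_f, x_p)\,P(\omega \mid x_p)$$
• If $P(\lambda = 1 \mid x_f, x_p) = C$ and $P(\lambda = 0 \mid x_f, x_p) = 1 - C$
Sum Rule!!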
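The mixture above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: `switch_fusion` and `weight_p` are hypothetical names, `weight_p` stands for the probability that the switch selects the posture channel, and the posteriors are made-up numbers. A constant weight of 0.5 reproduces the plain sum (average) rule.

```python
def switch_fusion(post_f, post_p, weight_p):
    """Mix two posteriors with a switching variable.

    weight_p plays the role of P(switch = 1 | x_f, x_p), the probability
    that the posture channel is selected; 1 - weight_p selects the face channel.
    """
    if not 0.0 <= weight_p <= 1.0:
        raise ValueError("weight_p must lie in [0, 1]")
    return [(1.0 - weight_p) * f + weight_p * p
            for f, p in zip(post_f, post_p)]


# Hypothetical posteriors over three classes.
post_face = [0.7, 0.2, 0.1]
post_posture = [0.3, 0.5, 0.2]

fused_equal = switch_fusion(post_face, post_posture, 0.5)      # sum rule
fused_face_only = switch_fusion(post_face, post_posture, 0.0)  # switch always picks face
print(fused_equal)
print(fused_face_only)
```

Because the weights sum to one, the fused vector is already a valid distribution and needs no renormalization.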
![Page 16: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/16.jpg)
Graphical Models for Fusion
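[Figure: graphical model over $\omega$, the observations $x_p$, $x_f$ and the switching variable $\lambda \in \{0, 1\}$]
$$P(\omega, x_p, x_f) \propto P(\lambda = 0 \mid x_f, x_p)\,P(\omega \mid x_f) + P(\lambda = 1 \mid x_f, x_p)\,P(\omega \mid x_p)$$
$$P(\lambda = 1 \mid x_f, x_p) = C, \qquad P(\lambda = 0 \mid x_f, x_p) = 1 - C$$
• Additionally, if we replace ‘+’ with ‘max’:
$$P(\omega, x_p, x_f) \propto \max\big(P(\omega \mid x_f),\, P(\omega \mid x_p)\big)$$
Max Rule!!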
![Page 17: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/17.jpg)
Graphical Models for Fusion
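[Figure: graphical model over $\omega$, the observations $x_p$, $x_f$ and the switching variable $\lambda \in \{0, 1\}$]
$$P(\omega, x_p, x_f) \propto P(\lambda = 0 \mid x_f, x_p)\,P(\omega \mid x_f) + P(\lambda = 1 \mid x_f, x_p)\,P(\omega \mid x_p)$$
$$P(\lambda = 1 \mid x_f, x_p) = p, \qquad P(\lambda = 0 \mid x_f, x_p) = 1 - p$$
$$p = \frac{\mathrm{performance}_p}{\mathrm{performance}_f + \mathrm{performance}_p}$$
Performance Based Averaging!!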
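A short Python sketch of performance-based averaging, assuming that "performance" means a channel's stand-alone recognition accuracy and that each channel's weight is its accuracy divided by the sum of both accuracies. That reading of the weight formula is an assumption, and the function names are hypothetical. The two accuracies are the AU 1 and Postures rates reported in the talk's per-channel results; the posteriors are made-up numbers.

```python
def performance_weights(accuracy_f, accuracy_p):
    """Return (weight_f, weight_p), each proportional to its channel's accuracy."""
    total = accuracy_f + accuracy_p
    if total <= 0:
        raise ValueError("at least one accuracy must be positive")
    return accuracy_f / total, accuracy_p / total


def performance_based_average(post_f, post_p, accuracy_f, accuracy_p):
    """Weighted average of two posteriors using accuracy-proportional weights."""
    weight_f, weight_p = performance_weights(accuracy_f, accuracy_p)
    return [weight_f * f + weight_p * p for f, p in zip(post_f, post_p)]


# AU 1: 49.7%, Postures: 55.1% (per-channel recognition rates from the talk).
weight_face, weight_posture = performance_weights(0.497, 0.551)

# Hypothetical posteriors over three classes.
fused = performance_based_average([0.6, 0.3, 0.1], [0.2, 0.5, 0.3], 0.497, 0.551)
print(weight_face, weight_posture)
print(fused)
```

The weights are fixed per channel, so unlike a critic they do not adapt to the individual sample being classified.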
![Page 18: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/18.jpg)
Graphical Models for Fusion
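[Figure: graphical model over $\omega$, the observations $x_p$, $x_f$ and the switching variable $\lambda \in \{0, 1\}$]
$$P(\omega, x_p, x_f) \propto P(\lambda = 0 \mid x_f, x_p)\,P(\omega \mid x_f) + P(\lambda = 1 \mid x_f, x_p)\,P(\omega \mid x_p)$$
Critic Based Averaging!!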
![Page 19: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/19.jpg)
Graphical Models for Fusion
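[Figure: graphical model with observation nodes $x_p$ and $x_f$ and one further node per channel, indexed $p$ and $f$]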
![Page 20: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/20.jpg)
Graphical Models for Fusion
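[Figure: graphical model with observation nodes $x_p$ and $x_f$, one further node per channel indexed $p$ and $f$, and a node marked with a tilde]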
![Page 21: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/21.jpg)
Model in this work
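[Figure: graphical model over $\omega$, $\tilde{\omega}$, the switching variable $\lambda \in \{0, 1, 2\}$ and the observations $x_p$, $x_f$, $x_g$]
$$P(\omega \mid X) = \sum_{\lambda} \sum_{\tilde{\omega}} P(\lambda \mid X)\,P(\tilde{\omega} \mid X, \lambda)\,P(\omega \mid \tilde{\omega}, \lambda, X)$$
• Learning:
  – Unsupervised (EM)
  – Supervised
• $P(\tilde{\omega} \mid X, \lambda)$: Classifiers on individual channel
• $P(\lambda \mid X)$: Trained using results of classifier on training data
• $P(\omega \mid \tilde{\omega}, \lambda, X) = P(\omega \mid \tilde{\omega}, \lambda)$: Based on Confusion Matrix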
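The three ingredients listed above (per-channel classifier outputs, a per-channel weight, and a confusion-matrix correction) can be combined as in the following Python sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the column normalization of the confusion matrix, and all numbers are hypothetical.

```python
def confusion_to_conditional(confusion):
    """Turn a confusion matrix into P(true class | classifier output).

    confusion[t][o] counts training samples of true class t that the
    classifier labelled o. Returns cond with cond[o][t] = P(true=t | output=o).
    """
    n = len(confusion)
    cond = [[0.0] * n for _ in range(n)]
    for o in range(n):
        col_total = sum(confusion[t][o] for t in range(n))
        for t in range(n):
            # Fall back to uniform if the classifier never produced label o.
            cond[o][t] = confusion[t][o] / col_total if col_total else 1.0 / n
    return cond


def fuse(channel_weights, channel_outputs, channel_confusions):
    """P(true | X) = sum_ch P(ch | X) * sum_o P(o | X, ch) * P(true | o, ch)."""
    n = len(channel_outputs[0])
    fused = [0.0] * n
    for weight, output, confusion in zip(channel_weights, channel_outputs,
                                         channel_confusions):
        cond = confusion_to_conditional(confusion)
        for o in range(n):
            for t in range(n):
                fused[t] += weight * output[o] * cond[o][t]
    return fused


# Hypothetical two-class, two-channel example.
weights = [0.6, 0.4]                      # P(channel | X), sums to 1
outputs = [[0.8, 0.2], [0.3, 0.7]]        # P(output | X, channel)
confusions = [
    [[40, 10], [5, 45]],                  # channel 0 is fairly reliable
    [[30, 20], [20, 30]],                 # channel 1 is close to chance
]
fused = fuse(weights, outputs, confusions)
print(fused)
```

The confusion-matrix term lets an unreliable channel's output be discounted even when that channel is selected, which the plain weighted average cannot do.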
![Page 22: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/22.jpg)
Training and Testing Data
• Scenario:
  – A child solving a puzzle for 20 min.
  – Puzzle:
    • Fripple place: Constraint satisfaction problem.
  – Sensory data recorded:
    • Video of face
    • Posture information
    • Full recording of the moves made by the child to solve the puzzle
• Database consists of about 8 children in the same scenario.
![Page 23: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/23.jpg)
Multiple Modalities:
• Face (Manually Encoded)
  – Upper Face
    • Eyebrow Raises/Frowns (AU 1, 2 & 4)
    • Eye Widening/Narrowing (AU 5, 6 & 7)
• Postures (Automatically from the chair)
    • Leaning Forward/Slumped Back etc.
    • Activity on Chair (High, Medium & Low)
• Game Status (Manually Encoded)
    • Level of Difficulty
    • Action performed (Game start/end/asked for hint etc.)
![Page 24: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/24.jpg)
Tracking the State: Posture
• Two sensor sheets, each an array of 42-by-48 sensing units.
• Each unit outputs an 8-bit pressure reading.
• Sampling frequency of 50 Hz
[Figure: example postures: Slumped Back, Sitting Upright, Leaning Forward, Leaning Sideways]
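For a sense of scale, the raw data rate implied by these figures can be worked out directly. The sketch assumes that both sheets share the 42-by-48 layout and that readings are stored uncompressed at one byte per sensing unit; the constant names are illustrative.

```python
SHEETS = 2
ROWS, COLS = 42, 48
BYTES_PER_UNIT = 1       # one 8-bit pressure reading per sensing unit
SAMPLE_RATE_HZ = 50

units_per_frame = SHEETS * ROWS * COLS
bytes_per_second = units_per_frame * BYTES_PER_UNIT * SAMPLE_RATE_HZ

print(units_per_frame)    # 4032 readings per frame
print(bytes_per_second)   # 201600 bytes per second
```

Each frame is therefore a 4032-dimensional pressure vector, which is why the posture channel is reduced to a handful of features before classification.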
![Page 25: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/25.jpg)
Posture Classification
[Figure: pipeline] Sensory Input → Modeling using Gaussian Mixtures → Posture Features → Posture Classification using a multi-layer NN
![Page 26: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/26.jpg)
Fusing Everything
[Figure: fusion architecture]
• Face Video → Human Coder → AU 1 … AU 7 → HMM based Classifier per channel → $P(\tilde{\omega} \mid X_{AU1})$ … $P(\tilde{\omega} \mid X_{AU7})$
• Posture Sensor Output → Mixture Model & Neural Network → Posture, Activity → HMM based Classifier per channel → $P(\tilde{\omega} \mid X_{posture})$, $P(\tilde{\omega} \mid X_{activity})$
• Game Information (Fripples, Room Constraints, Hint Button) → Human Coder → Game Status, Game Level → HMM based Classifier per channel → $P(\tilde{\omega} \mid X_{status})$, $P(\tilde{\omega} \mid X_{level})$
• Combine → $P(\omega \mid X_{all})$
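Each "HMM based Classifier" box turns one channel's observation sequence into a per-class score. One common way to do this, sketched below as an assumption rather than the authors' implementation, is to keep one HMM per class and compare sequence likelihoods computed with the scaled forward algorithm. The model parameters, the two-symbol alphabet, the class names and the equal class priors are all illustrative.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Log P(obs) under a discrete-observation HMM (scaled forward algorithm)."""
    if not obs:
        return 0.0
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                emit[s][obs[t]] * sum(alpha[r] * trans[r][s] for r in range(n_states))
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")          # sequence impossible under this model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik


def class_posterior(obs, models):
    """Posterior over classes, assuming equal priors and at least one feasible model."""
    logs = {name: log_likelihood(obs, *params) for name, params in models.items()}
    top = max(logs.values())
    weights = {name: math.exp(value - top) for name, value in logs.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical 2-state HMMs over a binary feature (e.g. AU absent = 0, present = 1).
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "high": (start, trans, [[0.2, 0.8], [0.3, 0.7]]),  # feature mostly present
    "low": (start, trans, [[0.8, 0.2], [0.7, 0.3]]),   # feature mostly absent
}
posterior = class_posterior([1, 1, 0, 1, 1], models)
print(posterior)
```

The resulting per-class posterior is the kind of quantity that the combination stage takes as input from each channel.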
![Page 27: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/27.jpg)
Experimental Evaluation
• Database, 8 children
  – All channels available for 4 children
  – Only posture & game channels available for the rest
  – Three classes:
    • High Interest (98), Low Interest (94), Refreshing (70)
• 60% Training Data, 40% Testing Data
• Recognition Accuracy Averaged over 50 runs
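The evaluation protocol (random 60/40 split, accuracy averaged over 50 runs) can be sketched as follows. The class counts come from the slide; everything else is an assumption for illustration: the function names are hypothetical, the samples carry no features, and a majority-class guesser stands in for the real classifiers.

```python
import random


def evaluate_once(samples, train_fraction, rng, train_and_score):
    """Shuffle, split into train/test, and return the test accuracy."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return train_and_score(shuffled[:cut], shuffled[cut:])


def majority_baseline(train, test):
    """Stand-in classifier: always predict the most frequent training label."""
    labels = [label for _, label in train]
    guess = max(sorted(set(labels)), key=labels.count)
    return sum(1 for _, label in test if label == guess) / len(test)


# Class counts from the slide; features are omitted (None) in this sketch.
labels = ["high"] * 98 + ["low"] * 94 + ["refreshing"] * 70
samples = [(None, label) for label in labels]

rng = random.Random(0)   # fixed seed so the runs are repeatable
accuracies = [evaluate_once(samples, 0.6, rng, majority_baseline)
              for _ in range(50)]
mean_accuracy = sum(accuracies) / len(accuracies)
print(mean_accuracy)
```

Averaging over many random splits reduces the variance that a single split would introduce on a data set of only 262 samples.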
![Page 28: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/28.jpg)
Results: Individual Channels
| Modality | Channel | Recognition Rate |
|---|---|---|
| Face | AU 1 | 49.7% |
| Face | AU 2 | 48.6% |
| Face | AU 4 | 32.8% |
| Face | AU 5 | 38.1% |
| Face | AU 6 | 42.4% |
| Face | AU 7 | 36.0% |
| Posture | Postures | 55.1% |
| Posture | Activity on Chair | 60.1% |
| Game | Game Status | 33.0% |
| Game | Difficulty level | 25.4% |
![Page 29: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/29.jpg)
Experimental Evaluation
• Reduction in error for round k, combination method a:
$$R_a^k = \frac{\mathrm{Accuracy}_a^k - \mathrm{Accuracy}_{postures}^k}{100\% - \mathrm{Accuracy}_{postures}^k}$$
• Average Reduction in error:
$$R_a = \frac{\sum_{k=1}^{50} R_a^k}{50}$$
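A small Python sketch of the reduction-in-error measure, assuming it is the gain in accuracy over the posture baseline divided by the baseline's error rate (one minus its accuracy), computed per round and then averaged. That normalization is an assumption based on the name of the measure; the function names and the three-round example are hypothetical.

```python
def error_reduction(accuracy_method, accuracy_baseline):
    """Fraction of the baseline's errors removed by the method.

    Accuracies are fractions in [0, 1); the baseline must make some errors.
    """
    return (accuracy_method - accuracy_baseline) / (1.0 - accuracy_baseline)


def average_error_reduction(method_accuracies, baseline_accuracies):
    """Mean of the per-round reductions in error."""
    per_round = [error_reduction(a, b)
                 for a, b in zip(method_accuracies, baseline_accuracies)]
    return sum(per_round) / len(per_round)


# Hypothetical accuracies for three rounds.
method = [0.70, 0.66, 0.68]
baseline = [0.60, 0.62, 0.64]
first_round = error_reduction(method[0], baseline[0])
average = average_error_reduction(method, baseline)
print(first_round, average)
```

The measure is negative whenever a combination scheme does worse than the baseline channel on its own.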
![Page 30: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/30.jpg)
Results: Combining Channels
| Combining Scheme | Recognition Rate | Reduction in Error |
|---|---|---|
| Product | 62.6% | 0.5% |
| Addition | 60.7% | -4.9% |
| Vote | 55.9% | -16.8% |
| Max | 54.3% | -21.5% |
| Min | 60.1% | -6.2% |
| Performance-based Averaging | 65.1% | 7.1% |
| Critic-based Averaging | 65.9% | 9.3% |
| Full Method | 67.8% | 14.1% |
![Page 31: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/31.jpg)
Limitations
• Conditional Independence Assumption is Invalid
  – For example AU1 and AU2 are highly correlated
• Too much manual intervention
• Training Requires Large Amount of Data
![Page 32: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/32.jpg)
Summary
• Multiple modalities are useful for robust recognition of affect.
• Graphical Models for sensor fusion
• Interest detection using multiple modalities
![Page 33: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/33.jpg)
Future Work
• Look at the pixel level relationships in video images of face (rather than AUs)
• Semi-supervised learning using GP
  – Accuracy over 80%
• Extend the framework
  – Unsupervised learning (EM)
  – Bayesian Inference (Expectation Propagation)
• Learning with human in the loop
![Page 34: Probabilistic Combination of Multiple Modalities to Detect Interest](https://reader035.fdocuments.us/reader035/viewer/2022070419/56815cbc550346895dcabbdd/html5/thumbnails/34.jpg)
Acknowledgements
• John Hershey, Selene Mota & Nancy Alvarado
• Affective Computing Group, MIT Media Lab
• National Science Foundation
  – This material is based upon work supported by the National Science Foundation under Grant No. 0087768.
  – Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.