Automatic Recognition of Emotions in Speech: Models and Methods

Transcript of the presentation given by Prof. Dr. Andreas Wendemuth (Chair of Cognitive Systems, Institute for Information Technology and Communications, Otto von Guericke University Magdeburg, Germany) at YAC / Yandex, Moscow, 30 October 2014. 50 slides.

Page 1

Automatic Recognition of Emotions in Speech: models and methods

Prof. Dr. Andreas Wendemuth, Univ. Magdeburg, Germany

Chair of Cognitive Systems, Institute for Information Technology and Communications

YAC / Yandex, 30 October 2014, Moscow

Page 2

Recorded speech starts as an acoustic signal. For decades, methods in acoustic speech recognition and natural language processing have been developed which aim at detecting the verbal content of that signal and using it for dictation, command purposes, and assistive systems. These techniques have matured to date. As it turns out, they can be utilized in a modified form to detect and analyse further affective information which is transported by the acoustic signal: emotional content, intentions, and involvement in a situation. Whereas words and phonemes are the unique symbolic classes for assigning the verbal content, finding appropriate descriptors for affective information is much more difficult.

We describe the corresponding technical steps for software-supported affect annotation and for automatic emotion recognition, and we report on the data material used to evaluate these methods.

Further, we show possible applications in companion systems and in dialog control.

Abstract

Page 3

1. Affective Factors in Man-Machine Interaction
2. Speech and multimodal sensor data – what they reveal
3. Discrete or dimensional affect description
4. Software-supported affect annotation
5. Corpora
6. Automatic emotion recognition
7. Applications in companion systems and in dialog control

Contents

Page 4

Affective Factors in Man-Machine Interaction

Page 5

Affective Terms – Disambiguation

Emotion [Becker 2001]

• short-time affect

• bound to specific events

Mood [Morris 1989]

• medium-term affect

• not bound to specific events

Personality [Mehrabian 1996]

• long-term stable

• represents individual characteristics

Page 6

Emotion: the PAD-space

• Dimensions: pleasure / valence (p), arousal (a), and dominance (d)
• values each from -1.0 to 1.0
• “neutral” at the center
• defines octants, e.g. (+p+a+d)

Siegert et al. 2012 Cognitive Behavioural Systems. COST

Page 7

Correlation of emotion and mood

In order to make it measurable, there has to be an empirical correlation of moods to PAD space (emotion octants) [Mehrabian 1996].

Moods for octants in PAD space:

PAD   mood
+++   Exuberant
++-   Dependent
+-+   Relaxed
+--   Docile
---   Bored
--+   Disdainful
-+-   Anxious
-++   Hostile

Siegert et al. 2012 Cognitive Behavioural Systems. COST
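
The octant-to-mood table above lends itself to a small lookup. Below is a minimal sketch (not from the slides; the function name and the treatment of 0.0 as positive are my own illustrative choices) that assigns Mehrabian's mood label to a PAD point:

```python
# Sketch: map a PAD point (each value in [-1.0, 1.0]) to its octant and to
# the mood label from Mehrabian's table above. Values of exactly 0.0 are
# treated as positive here; the slide places "neutral" at the centre.
MOOD_BY_OCTANT = {
    ("+", "+", "+"): "Exuberant",
    ("+", "+", "-"): "Dependent",
    ("+", "-", "+"): "Relaxed",
    ("+", "-", "-"): "Docile",
    ("-", "-", "-"): "Bored",
    ("-", "-", "+"): "Disdainful",
    ("-", "+", "-"): "Anxious",
    ("-", "+", "+"): "Hostile",
}

def pad_to_mood(pleasure: float, arousal: float, dominance: float) -> str:
    octant = tuple("+" if v >= 0.0 else "-" for v in (pleasure, arousal, dominance))
    return MOOD_BY_OCTANT[octant]

print(pad_to_mood(0.4, 0.7, 0.2))    # Exuberant
print(pad_to_mood(-0.3, 0.5, -0.6))  # Anxious
```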

Page 8

Personality and PAD space

Unique personality model: Big Five [Allport and Odbert 1936]

5 strong independent factors

[Costa and McCrae 1985] presented the five-factor personality inventory, deliberately applicable to non-clinical environments

• Neuroticism
• Extraversion
• Openness
• Agreeableness
• Conscientiousness

• measurable by questionnaires (NEO-FFI test)
• Mehrabian showed a relation between the Big Five factors (from NEO-FFI, scaled to [0,1]) and PAD space, e.g.:

• P := 0.21 · extraversion + 0.59 · agreeableness + 0.19 · neuroticism

(other formulae are available for arousal and dominance; a small sketch follows below)

Siegert et al. 2012 Cognitive Behavioural Systems. COST
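
The slide quotes Mehrabian's regression only for pleasure; the formulae for arousal and dominance are not reproduced here. A minimal sketch, assuming the Big Five scores are already scaled to [0, 1]:

```python
# Sketch: pleasure (P) from Big Five scores scaled to [0, 1], using the
# coefficients quoted on this slide. Arousal and dominance have analogous
# formulae (with other coefficients) that are not shown on the slide.
def pleasure_from_big_five(extraversion: float,
                           agreeableness: float,
                           neuroticism: float) -> float:
    return 0.21 * extraversion + 0.59 * agreeableness + 0.19 * neuroticism

# Illustrative profile: moderately extraverted, agreeable, emotionally stable
print(pleasure_from_big_five(extraversion=0.6, agreeableness=0.7, neuroticism=0.2))  # 0.577
```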

Page 9

1. Affective Factors in Man-Machine Interaction
2. Speech and multimodal sensor data – what they reveal
3. Discrete or dimensional affect description
4. Software-supported affect annotation
5. Corpora
6. Automatic emotion recognition
7. Applications in companion systems and in dialog control

Contents

Page 10

• Speech (semantics)
• Non-semantic utterances („hmm“, „aehhh“)
• Nonverbals (laughing, coughing, swallowing, …)
• Emotions in speech

Interaction modalities – what a person „tells“

Page 11

Discourse Particles

Especially the intonation reveals details about the speaker's attitude, but it is influenced by semantic and grammatical information.

investigate discourse particles (DPs)

• cannot be inflected, but can be emphasized
• occur at crucial communicative points
• have specific intonation curves (pitch contours)
• thus may indicate specific functional meanings

Siegert et al. 2013 WIRN Vietri

Page 12

The Role of Discourse Particles for Human Interaction

J. E. Schmidt [2001] presented an empirical study where he could determine seven form-function relations of the DP “hm”:

Siegert et al. 2013 WIRN Vietri

Name   Description (idealised pitch-contour not reproduced here)
DP-A   attention
DP-T   thinking
DP-F   finalisation signal
DP-C   confirmation
DP-D   decline*
DP-P   positive assessment
DP-R   request to respond
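
Since the form-function relation rests on the pitch contour of the particle, a first processing step is to extract F0 over time for an isolated "hm". A minimal sketch follows; it uses librosa's pYIN tracker, which is my own assumption (the original work relied on tools such as PRAAT, mentioned later in this deck):

```python
# Sketch: extract the pitch contour (F0 over time) of an isolated "hm" so it
# can be compared against the idealised contours behind the table above.
# librosa is an assumed dependency; the file name is illustrative.
import librosa

def pitch_contour(wav_path: str):
    y, sr = librosa.load(wav_path, sr=None)
    f0, voiced_flag, _ = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
    )
    times = librosa.times_like(f0, sr=sr)
    return times[voiced_flag], f0[voiced_flag]  # keep voiced frames only

# times, f0 = pitch_contour("hm_example.wav")
# The shape of f0 (rising, falling, level) could then be matched against the
# idealised contours of the seven form-function types.
```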

Page 13

The Role of Discourse Particles for Human Interaction

• [Kehrein and Rabanus, 2001] examined different conversational styles and confirmed the form-function relation.

• [Benus et al., 2007] investigated the occurrence frequency of specific backchannel words for American English HHI.

• [Fischer et al., 1996]: the number of partner-oriented signals is decreasing while the number of signals indicating a task-oriented or expressive function is increasing

Research Questions:

• Do DPs occur within HCI?

• Which meanings can be determined?

• Which form-types occur?

Siegert et al. 2013 WIRN Vietri

Page 14

• Speech (semantics)
• Non-semantic utterances („hmm“, „aehhh“)
• Nonverbals (laughing, coughing, swallowing, …)
• Emotions in speech

• Eye contact / direction of gaze
• General mimics
• Facial expressions (laughing, anger, …)

• Hand gestures, arm gestures
• Head posture, body posture

• Bio-signals (blushing, paleness, shivering, frowning, …)
• Pupil width

• Haptics: direct operation of devices (keyboard, mouse, touch)
• Handwriting, drawing, sculpting, …

Interaction modalities – what a person „tells“ with other modalities

Page 15

• Indirect expression (pauses, idleness, fatigue)
• Indirect content (humor, irony, sarcasm)
• Indirect intention (hesitation, fillers, discourse particles)

What speech can (indirectly) reveal

Page 16

• Recognizing speech, mimics, gestures, poses, haptics, bio-signals: indirect information

• Many (most) modalities need data-driven recognition engines
• Unclear categories (across modalities?)

• Robustness of recognition in varying / mobile environments

Technical difficulties

Page 17

Now you (hopefully) have recorded (multimodal) data with (reliable) emotional content –

but what does it convey?

Actually, you have a (speech) signal,

so, really, you have raw data.

Page 18

1. Affective Factors in Man-Machine Interaction
2. Speech and multimodal sensor data – what they reveal
3. Discrete or dimensional affect description
4. Software-supported affect annotation
5. Corpora
6. Automatic emotion recognition
7. Applications in companion systems and in dialog control

Contents

Page 19

Now you need:

transcriptions (intended things which happened)
(Speech: „Nice to see you“; Mimics: „eyes open, lip corners up“; …)

and

annotations (unintended events, or the way it happened)
(Speech: heavy breathing, fast, happy; Mimics: smile, happiness; …)

Both processes require

labelling: tagging each recording chunk with marks which correspond to the relevant transcription / annotation categories
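
A labelled chunk, as described above, can be represented by a small record holding the time span plus transcription and annotation marks. A minimal sketch; the field names are my own illustration, not a prescribed format:

```python
# Sketch: one recording chunk with its transcription (intended content)
# and annotation (unintended events / manner), as described above.
from dataclasses import dataclass, field

@dataclass
class LabelledChunk:
    start_s: float                # chunk start time in seconds
    end_s: float                  # chunk end time in seconds
    modality: str                 # e.g. "speech", "mimics"
    transcription: str = ""       # e.g. "Nice to see you"
    annotation: list[str] = field(default_factory=list)  # e.g. ["fast", "happy"]

chunk = LabelledChunk(12.3, 14.1, "speech", "Nice to see you", ["fast", "happy"])
print(chunk)
```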

Page 20

• Trained transcribers / annotators with high intra- and inter-rater reliability (kappa measures; see the sketch after this list)

• Time aligned (synchronicity!), simultaneous presentation of all modalities to the transcriber / annotator

• Selection of (known) categories for the transcriber / annotator

• Labelling

How to transcribe / annotate?
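
The reliability requirement in the first bullet is usually checked with a chance-corrected agreement measure. A minimal sketch for two annotators labelling the same chunks, using scikit-learn's Cohen's kappa (an assumed dependency; the labels are made up for illustration):

```python
# Sketch: inter-annotator agreement (Cohen's kappa) between two annotators
# who labelled the same recording chunks. A kappa near 1.0 indicates high
# agreement beyond chance; low values suggest the category set or the
# annotator training needs revisiting.
from sklearn.metrics import cohen_kappa_score

annotator_a = ["joy", "neutral", "anger", "neutral", "joy", "sadness"]
annotator_b = ["joy", "neutral", "anger", "joy",     "joy", "neutral"]

print(f"Cohen's kappa: {cohen_kappa_score(annotator_a, annotator_b):.2f}")
```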

Page 21

Clear (?) modal units of investigation / categories, e.g.:
• Speech: phonemes, syllables, words
• Language: letters, syllables, words
• Request: content! (origin city, destination city, day, time)
• Dialogues: turn, speaker, topic
• Situation involvement: object/subject of attention, deictics, active/passive participant

• Mimics: FACS (Facial Action Coding System) -> 40 action units

• Big 5 Personality Traits (OCEAN)
• Sleepiness (Karolinska Scale)
• Intoxication (Blood Alcohol Percentage)

Categories:

Page 22

• Unclear (?) modal categories e.g.:

• Emotion: ???

• Cf.: Disposition: domain-specific … ?
• Cf.: Level of Interest (?)

Categories:

Page 23

Categorial Models of human emotion ...

... which can be utilized for automatic emotion recognition

• Two-Class models, e.g. (not) cooperative

• Basic Emotions [Ekman, 1992] (Anger, Disgust, Fear, Joy, Sadness, Surprise, Neutral)

• VA(D) Models (Valence (Pleasure) Arousal Dominance)

• Geneva Emotion Wheel [Scherer, 2005]


Page 24

Categorial Models of human emotion (2): enhanced listings

Siegert et al. 2011 ICME


• sadness
• contempt
• surprise
• interest
• hope
• relief
• joy
• helplessness
• confusion

Page 25

Categorial Models of human emotion (3): Self-Assessment Manikins [Bradley, Lang, 1994]

Böck et al. 2011 ACII


Page 26

1. Affective Factors in Man-Machine Interaction
2. Speech and multimodal sensor data – what they reveal
3. Discrete or dimensional affect description
4. Software-supported affect annotation
5. Corpora
6. Automatic emotion recognition
7. Applications in companion systems and in dialog control

Contents

Page 27

• (having fixed the modalities and categories)

• Examples: EXMARaLDA, FOLKER, ikannotate

EXMARaLDA: „Extensible Markup Language for Discourse Annotation“, www.exmaralda.org/, Hamburger Zentrum für Sprachkorpora (HZSK) and SFB 538 ‘Multilingualism’, since 2001/2006

FOLKER: transcription editor for the „Forschungs- und Lehrkorpus Gesprochenes Deutsch“ (research and teaching corpus of spoken German), http://agd.ids-mannheim.de/folker.shtml, Institute for German Language, Uni Mannheim, since 2010 [Schmidt, Schütte, 2010]

Transcription / annotation tools

Page 28

ikannotate – A Tool for Labelling, Transcription, and Annotation of Emotionally Coloured Speech (2011)
• Otto von Guericke University – Chair of Cognitive Systems + Dept. of Psychosomatic Medicine and Psychotherapy

• Written in Qt4, based on C++
• Versions for Linux, Windows XP and higher, and Mac OS X
• Sources and binaries are available on demand
• Handles different output formats, especially XML and TXT
• Processes MP3 and WAV files
• Follows the conversation-analytic transcription system GAT (versions 1 and 2) [Selting et al., 2011]

http://ikannotate.cognitive-systems-magdeburg.de/

ikannotate tool
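
ikannotate can export to XML and TXT; the exact schema is not shown in the slides, so the element and attribute names below are purely illustrative. A sketch of serialising one annotated chunk to XML with the Python standard library:

```python
# Sketch: write one annotated chunk to XML. The element and attribute names
# are illustrative only; they do not reproduce ikannotate's actual schema.
import xml.etree.ElementTree as ET

chunk = ET.Element("chunk", start="12.30", end="14.10", speaker="S1")
ET.SubElement(chunk, "transcription").text = "Nice to see you"
ET.SubElement(chunk, "annotation", category="emotion").text = "happy"

ET.ElementTree(chunk).write("chunk.xml", encoding="utf-8", xml_declaration=True)
print(ET.tostring(chunk, encoding="unicode"))
```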

Page 29

Screenshots of ikannotate (I)

Böck et al. 2011 ACII

Page 30

Screenshots of ikannotate (II)

Böck et al. 2011 ACII

Page 31

1. Affective Factors in Man-Machine Interaction
2. Speech and multimodal sensor data – what they reveal
3. Discrete or dimensional affect description
4. Software-supported affect annotation
5. Corpora
6. Automatic emotion recognition
7. Applications in companion systems and in dialog control

Contents

Page 32

• Overview: http://emotion-research.net/wiki/Databases (not complete)

• Contains information on: Identifier, URL, Modalities, Emotional content, Emotion elicitation methods, Size, Nature of material, Language

• Published overviews: Ververidis & Kotropoulos 2006, Schuller et al. 2010, Appendix of [Pittermann et al. 2010]*

• Popular corpora:
  listed on the website above:
    Emo-DB: Berlin Database of Emotional Speech (2005)
    SAL: Sensitive Artificial Listener (Semaine, 2010)
  not listed on the website above:
    eNTERFACE (2005)
    LMC: LAST MINUTE (2012)
    Table Talk (2013)
    Audio-Visual Interest Corpus (AVIC) (ISCA 2009)

• Ververidis, D. & Kotropoulos, C. (2006). “Emotional speech recognition: Resources, features, and methods”. Speech Commun 48 (9), pp. 1162–1181.

• Schuller, B.; Vlasenko, B.; Eyben, F.; Wollmer, M.; Stuhlsatz, A.; Wendemuth, A. & Rigoll, G. (2010). “Cross-Corpus Acoustic Emotion Recognition: Variances and Strategies” IEEE Trans. Affect. Comput. 1 (2), pp. 119–131.

• Pittermann, J.; Pittermann, A. & Minker, W. (2010). Handling Emotions in Human-Computer Dialogues. Amsterdam, The Netherlands: Springer.

Corpora of affective speech (+other modalities)

Page 33

© Siegert 2014

Page 34

• Burkhardt et al., 2005: A Database of German Emotional Speech, Proc. INTERSPEECH 2005, Lisbon, Portugal, 1517–1520.

• 7 emotions: anger, boredom, disgust, fear, joy, neutral, sadness

• 10 professional German actors (5 female), 494 phrases

• Perception test with 20 subjects: 84.3% mean accuracy

• http://pascal.kgw.tu-berlin.de/emodb/index-1280.html

Example 1: Berlin Database of Emotional Speech (EMO-DB)

Page 35

Example 2: LAST MINUTE Corpus

Setup: non-acted; emotions evoked by story, task solving with difficulties (barriers)
Groups: N = 130, balanced in age, gender, education
Duration: 56:02:14
Sensors: 13
Max. video bandwidth: 1388x1038 at 25 Hz
Biopsychological data: heart beat, respiration, skin conductance
Questionnaires: sociodemographic, psychometric
Interviews: yes (73 subjects)
Language: German
Available upon request at [email protected] and [email protected]

Frommer et al. 2012 LREC

Page 36

1. Affective Factors in Man-Machine Interaction
2. Speech and multimodal sensor data – what they reveal
3. Discrete or dimensional affect description
4. Software-supported affect annotation
5. Corpora
6. Automatic emotion recognition
7. Applications in companion systems and in dialog control

Contents

Page 37

• Remember, now you have transcribed/annotated data with fixed categories (across modalities?) and modalities.

• You want to use that data to construct unimodal or multimodal data-driven recognition engines

• Once you have these engines, you can automatically determine the categories in yet unknown data.

Data-driven recognition engines

Page 38

• It’s Pattern Recognition

• Knowledge Sources

A Unified View on Data-Driven Recognition

Schuller 2012 Cognitive Behavioural Systems COST

[Block diagram: Capture → Pre-processing → Feature extraction / generation / selection → Feature reduction → Classification / Regression → Decoding; supported by knowledge sources (dictionary, interaction grammar, production model), encoding, and learner optimisation]
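
To make the block diagram concrete, here is a minimal sketch of such a data-driven chain (pre-processing, feature reduction, classification) with scikit-learn. The feature matrix stands in for precomputed per-utterance acoustic features; all names and numbers are illustrative, and the classifier choice (an SVM rather than the GMMs cited on the following slides) is mine:

```python
# Sketch of the capture -> pre-processing -> feature reduction -> classification
# chain from the diagram. X is a placeholder for precomputed acoustic feature
# vectors per utterance, y for the annotated emotion labels.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 60))                      # placeholder feature vectors
y = rng.choice(["negative", "neutral"], size=200)   # placeholder labels

clf = make_pipeline(
    StandardScaler(),      # pre-processing
    PCA(n_components=20),  # feature reduction
    SVC(kernel="rbf"),     # classification
)
print(cross_val_score(clf, X, y, cv=5).mean())      # cross-validated accuracy
```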

Page 39

Audio Features

Böck et al. 2013 HCII

Facial Action Units

• MFCCs with Delta and Acceleration

• Prosodic features
• Formants and corresponding bandwidths
• Intensity
• Pitch
• Jitter

• For acoustic feature extraction: Hidden Markov Toolkit (HTK) and phonetic analysis software PRAAT (http://www.praat.org)
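
For the first bullet (MFCCs with delta and acceleration), an equivalent extraction can be sketched in a few lines. The original work used HTK and PRAAT; librosa and the file name below are assumptions made for illustration:

```python
# Sketch: MFCCs with delta (first derivative) and acceleration (second
# derivative) coefficients, stacked into one feature matrix per utterance.
import numpy as np
import librosa

y, sr = librosa.load("utterance.wav", sr=16000)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
delta = librosa.feature.delta(mfcc)            # delta coefficients
accel = librosa.feature.delta(mfcc, order=2)   # acceleration coefficients

features = np.vstack([mfcc, delta, accel])     # 39 coefficients per frame
print(features.shape)
```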

Page 40

What is the current state of affect recognition?

Table: Overview of reported results. #C: number of classes; eNT: eNTERFACE; VAM: Vera am Mittag; SAL: Sensitive Artificial Listener; LMC: LAST MINUTE.

Comparing the results on acted emotional data and naturalistic interactions:

• recognition performance decreases

• too much variability within the data

Database        Result  #C  Comment                                                             Reference
emoDB (acted)   91.5%   2   6552 acoustic features and GMMs                                     Schuller et al., 2009
eNT (primed)    74.9%   2   6552 acoustic features, GMMs                                        Schuller et al., 2009
VAM (natural)   76.5%   2   6552 acoustic features with GMMs                                    Schuller et al., 2009
SAL (natural)   61.2%   2   6552 acoustic features with GMMs                                    Schuller et al., 2009
LMC (natural)   80%     2   pre-classification of visual, acoustic and gestural features, MFN   Krell et al., 2013

Siegert et al. 2013 ERM4HCI

Page 41

User-group / temporal specific affect recognition

Success Rates [stress / no stress] (tested on the LAST MINUTE corpus):

• 72% utilizing (few) group-specific (young / old + male/female) audio features [Siegert et al., 2013]

• 71% utilizing audio-visual features and a linear filter as decision level fusion [Panning et al., 2012]

• 80% using facial expressions, gestural analysis and acoustic features with Markov Fusion Networks [Krell et al., 2013]

Approaches 2 & 3 integrate their classifiers over longer temporal sequences (a simple late-fusion sketch follows below).

Siegert et al. 2013 ERM4HCI, workshop ICMI 2013
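
Approaches 2 and 3 above fuse the modalities at decision level and smooth over time. The sketch below is a deliberately generic late fusion (weighted average of per-modality stress scores plus a moving-average filter); it is an illustration, not the linear filter of Panning et al. or the Markov Fusion Network of Krell et al.:

```python
# Sketch: generic decision-level fusion of per-modality scores over time.
# Each classifier outputs, per time step, a probability of "stress".
# Weights, the smoothing window, and the 0.5 threshold are illustrative.
import numpy as np

def late_fusion(audio, video, gesture, weights=(0.4, 0.4, 0.2), smooth=5):
    scores = np.vstack([audio, video, gesture])
    fused = np.average(scores, axis=0, weights=weights)  # weighted fusion
    kernel = np.ones(smooth) / smooth
    fused = np.convolve(fused, kernel, mode="same")      # temporal smoothing
    return fused > 0.5                                   # stress / no stress

audio   = np.array([0.8, 0.7, 0.9, 0.6, 0.2, 0.3])
video   = np.array([0.6, 0.8, 0.7, 0.5, 0.3, 0.2])
gesture = np.array([0.5, 0.6, 0.8, 0.4, 0.4, 0.1])
print(late_fusion(audio, video, gesture))
```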

Page 42

Classification Engines – Cross-Modalities

• Classification based on audio features
• Preselection of relevant video sequences
• Manual annotation of action units and classification of facial expressions

Further:
• Pre-classification of the sequences
• Dialog act representation models

Böck et al. 2013 HCII, Friesen et al. 2014 LREC

Page 43

1. Affective Factors in Man-Machine Interaction
2. Speech and multimodal sensor data – what they reveal
3. Discrete or dimensional affect description
4. Software-supported affect annotation
5. Corpora
6. Automatic emotion recognition
7. Applications in companion systems and in dialog control

Contents

Page 44

• Remember, now you have transcribed/annotated data with fixed categories (across modalities?) and modalities (maybe a corpus).

• You also have a category classifier trained on these data, i.e. domain-specific / person-specific.

Now we use categorized information in applications:

Usage of multimodal information

Page 45

• Disambiguation (saying and pointing)
• Person's choice (talking is easier than typing)
• „Real“ information (jokes from a blushing person?)
• Robustness (talking obscured by noise, but lip reading works)
• Higher information content (multiple congruent modalities)
• Uniqueness (reliable emotion recognition only from multiple modalities)

Why more modalities help understanding what a person wants to „tell“

Page 46

Companion Technology

[Architecture diagram: the user interacts via input signals (speech, gesture, touch, physiological sensors) and output signals through multimodal devices and components; a multimodal, adaptive, individualised interaction management connects these to the application / dialog management]

Weber et al. 2012 SFB TRR 62

Page 47

Recognition of critical dialogue courses
• on the basis of linguistic content
• in combination with multi-modal emotion recognition

Development of empathy-promoting dialogue strategies
• motivation of the user
• prevention of dialogue abandonment in problem-prone situations

Emotional and dialogic conditions in user behavior
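
A very small sketch of what "recognition of critical dialogue courses" can look like at the decision level: scan the per-turn emotion labels of the user and flag the dialogue as problem-prone once several negative turns occur in a row. The threshold and the label set are illustrative choices, not the project's actual strategy:

```python
# Sketch: flag a dialogue as problem-prone when the user produces several
# consecutive negative turns, so the dialogue strategy can counteract early.
NEGATIVE = {"anger", "frustration", "sadness"}

def critical_course(turn_emotions, max_negative_run=3):
    run = 0
    for i, label in enumerate(turn_emotions):
        run = run + 1 if label in NEGATIVE else 0
        if run >= max_negative_run:
            return i  # turn index at which to intervene
    return None

turns = ["neutral", "frustration", "anger", "frustration", "neutral"]
print(critical_course(turns))  # 3
```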

Page 48

• Blue = client
• Orange = agent

Call Center Dialogues: Typical Emotion Trains

© Siegert 2014

Page 49

Take home messages / outlook

Emotion / affect recognition:
• Data-driven, automatic pattern recognition
• Categorisation, annotation tools
• Temporal emotion train dependent on mood and personality
• Outlook: emotion-categorial appraisal model

Use in Man-Machine Interaction:
• Early detection / counteraction of adverse dialogs
• Outlook: use in call centers and companion technology

Page 50

… thank you!

www.cogsy.de