Examples of Blame


Page 1: Examples of Blame

“You made me do it”: Classification of Blame in Married Couples’ Interactions by Fusing Automatically Derived Speech and Language Information

Matthew P. Black, Panayiotis G. Georgiou, Athanasios Katsamanis, Brian R. Baucom, and Shrikanth S. Narayanan

http://sail.usc.edu

Page 2: Examples of Blame

Examples of Blame

Clips: Husband blaming Wife | Wife blaming Husband

Wife: You’re the one who suggested it.
Wife: Oh, Bob suggested that this might help.
Wife: Why? Why did you even want to get therapy?
Wife: I’m living at home with a roommate.
Wife: You know, you’re a stranger to me now. I- I don’t-
Wife: I wasn’t expecting this when we were going to get married.
Husband: Are you done dumping on me yet?
Husband: I- I just don’t think there is any point in talking about this.

Note: the clips above are acted; the real couples analyzed in this study cannot be shown for privacy reasons.

Page 3: Examples of Blame

Overview

• Blame conveyed through various communicative channels
  – Language (e.g., “you made me do it”)
  – Speech (e.g., prosody)
  – Gestures (e.g., pointing)

• Goal: classify extreme cases of blame behavior using audio
  – “Low” vs. “High”

• Methodology: combine 2 automatically derived information sources
  – Acoustic: models how the spouses spoke
  – Language: models what the spouses said

Page 4: Examples of Blame

Why Detect Blame?

• Blame is often targeted in couple therapy
  – Can lead to escalation of negative affect and resentment [Dimidjian et al. 2008]

• Blame is one important behavior researched in psychology
  – Relies on established manual coding methods
  – Challenges: coders must be trained, and coding is time-consuming

• How can technology help?
  – Automatic detection of blame is a scalable alternative to manual coding

• Behavioral Signal Processing (BSP)
  – Quantify and recognize abstract human behaviors in natural interaction settings relevant to psychology research

S. Dimidjian, C. R. Martell, and A. Christensen, Clinical Handbook of Couple Therapy, 4th ed. The Guilford Press, 2008, ch. Integrative behavioral couple therapy, pp. 73–106.

Page 5: Examples of Blame

General Technical Challenges

• Blame is a complex human behavior
  – High-level, heterogeneous
  – Need to extract generalizable features

• Real data in real-life scenarios
  – Challenging for robust feature extraction and automatic speech recognition

• Information across multiple modalities/cues
  – Distributed information across various signals: acoustics, language, gestures
  – How to merge/combine/fuse them

Page 6: Examples of Blame

Previous Work [Black et al. 2010]

• This paper is an extension of our earlier work
  – Classified extreme instances (low/high) for 6 behavioral codes
  – Only used acoustic cues

• “Blame” was one of the more challenging codes to predict
  – The earlier system ignored important lexical cues regarding blame
  – Coding manual: “explicit blaming statements (e.g., ‘you made me do it’) warrant a high blame score”

• This paper addresses this weakness

M. P. Black, A. Katsamanis, C.-C. Lee, A. C. Lammert, B. R. Baucom, A. Christensen, P. G. Georgiou, and S. S. Narayanan, “Automatic classification of married couples’ behavior using audio features,” in Proc. Interspeech, 2010.

Page 7: Examples of Blame

Corpus

• Real couples in 10-minute problem-solving dyadic interactions
  – Longitudinal study at UCLA and U. of Washington [Christensen et al. 2004]
  – 117 distressed couples received couples therapy for 1 year
  – 569 sessions (96 hours)

• Audio
  – Single channel
  – Far-field
  – Variable noise conditions

• Transcription (word-level)
  – Chronological
  – Speaker labels (wife/husband)
  – No timing information
  – (a parsing sketch for this format follows below)

Example transcript excerpt:
  Wife: then why did you ask
  Husband: to get us out of debt
  Wife: mm hmmm ...

A. Christensen, D.C. Atkins, S. Berns, J. Wheeler, D. H. Baucom, and L.E. Simpson. “Traditional versus integrative behavioral couple therapy for significantly and chronically distressed married couples.” J. of Consulting and Clinical Psychology, 72:176-191, 2004.
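The transcript excerpt above can be read with a few lines of Python. This is a minimal sketch under the assumption that each session is stored as plain text with one speaker-labeled turn per line; that file layout is hypothetical, since the slides do not describe the corpus's actual storage format.

```python
# Minimal sketch for reading speaker-labeled, untimed transcripts like the
# excerpt above. Assumes a hypothetical plain-text layout with one
# "Speaker: words" turn per line.
def parse_transcript(path: str):
    turns = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            if ":" not in line:
                continue  # skip blank or malformed lines
            speaker, text = line.split(":", 1)
            turns.append((speaker.strip().lower(), text.strip()))
    return turns  # e.g., [("wife", "then why did you ask"), ("husband", ...)]
```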

Page 8: Examples of Blame

Data Pre-processing

• Signal-to-noise ratio (SNR) estimation
  – Voice activity detection (VAD) [Ghosh et al. 2011]
  – SNR ranged from -1 dB to 26 dB

• Speaker diarization
  – Used the transcriptions and SailAlign’s speech-text alignment (available on the web)
  – Each session’s audio split into wife/husband/unknown turns

• Session selection (see the sketch below)
  – (SNR > 5 dB) && (Speaker alignment > 55%)
  – 372 of 569 sessions (65%) met both thresholds (62.8 hours)

P. K. Ghosh, A. Tsiartas, and S. S. Narayanan, “Robust voice activity detection using long-term signal variability,” IEEE Trans. Audio, Speech, and Language Processing, vol. 19, no. 3, pp. 600–613, 2011.
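A minimal sketch of the two pre-processing decisions above: estimating SNR from VAD decisions and applying the session-selection thresholds. The per-frame power array, VAD labels, alignment fraction, and the particular SNR formula are assumptions for illustration, not the authors' code.

```python
# Sketch of the pre-processing decisions above: estimate SNR from VAD
# decisions, then apply the session-selection rule (SNR > 5 dB and speaker
# alignment > 55%). Inputs are hypothetical placeholders.
import numpy as np

def estimate_snr_db(frame_power: np.ndarray, is_speech: np.ndarray) -> float:
    """Rough SNR estimate: compare mean power of speech vs. non-speech frames."""
    speech = frame_power[is_speech].mean()
    noise = frame_power[~is_speech].mean()
    return 10.0 * np.log10(max(speech - noise, 1e-12) / max(noise, 1e-12))

def keep_session(snr_db: float, aligned_fraction: float) -> bool:
    """Session-selection thresholds from this slide."""
    return snr_db > 5.0 and aligned_fraction > 0.55

# Example: keep_session(estimate_snr_db(power, vad_labels), 0.62)
```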

Page 9: Examples of Blame

Classification Set-up

• Each spouse’s overall level of blame manually coded
  – Standardized coding manual [Heavey et al. 2002]
  – 9-point scale (1 = no blame, 9 = repeated blame)
  – Multiple trained evaluators (mean pairwise Pearson’s correlation = 0.788)

• Binary classification set-up (top/bottom 20%)
  – Low blame (70 wife + 70 husband), High blame (70 wife + 70 husband)
  – Leave-one-couple-out cross-validation (see the sketch at the end of this slide)
  – Gender-independent models of blame

[Figure: 9-point blame rating scales (1–9) shown for each wife and husband; spouse ratings in the bottom 20% are labeled Low Blame and those in the top 20% are labeled High Blame]

C. Heavey, D. Gill, and A. Christensen. Couples interaction rating system 2 (CIRS2). University of California, Los Angeles, 2002.
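As a rough illustration of the set-up above (bottom/top-20% binarization, linear SVM, leave-one-couple-out cross-validation), here is a sketch using scikit-learn; the ratings, couple IDs, and feature matrix are random placeholders standing in for the real data.

```python
# A rough sketch of the set-up above: binarize the 9-point ratings into
# Low/High blame via the bottom/top 20%, then evaluate a linear SVM with
# leave-one-couple-out cross-validation. All data are random placeholders.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.svm import SVC

rng = np.random.default_rng(0)
ratings = rng.uniform(1, 9, size=700)      # coder ratings for rated spouses
couples = rng.integers(0, 117, size=700)   # couple ID for each rated spouse
X = rng.normal(size=(700, 50))             # stand-in acoustic feature vectors

lo, hi = np.quantile(ratings, [0.2, 0.8])
keep = (ratings <= lo) | (ratings >= hi)   # keep only the extreme instances
y = (ratings[keep] >= hi).astype(int)      # 1 = High blame, 0 = Low blame

correct = 0
for tr, te in LeaveOneGroupOut().split(X[keep], y, groups=couples[keep]):
    clf = SVC(kernel="linear", probability=True)  # probabilities serve as confidences
    clf.fit(X[keep][tr], y[tr])
    correct += int((clf.predict(X[keep][te]) == y[te]).sum())
print("accuracy:", correct / y.size)
```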

Page 10: Examples of Blame

Methodology Overview

• Trained 3 classifiers
  – Acoustic
  – Lexical
  – Fusion

Page 11: Examples of Blame

Acoustic Classifier

• 53,000+ features extracted
  1) Extracted frame-level low-level descriptors (LLDs)
     – Prosodic: f0, intensity [Praat – Boersma 2001]
     – Spectral: 15 MFCCs, 8 MFBs [openSMILE – Eyben et al. 2010]
     – Voice quality: jitter, shimmer [openSMILE – Eyben et al. 2010]
  2) Separate features for each spouse (wife, husband)
  3) 6 temporal granularities
     – Global: entire session [Black et al. 2010]
     – Hierarchical: 0.1 s, 0.5 s, 1 s, 5 s, 10 s disjoint windows [Schuller et al. 2008]
  4) 14 static functionals (e.g., mean, std. dev.) (see the sketch below)

• Apply binary classifier
  – Classifier: Support Vector Machine (SVM) with linear kernel
  – Confidence score: class probability estimates [LIBSVM – Chang et al. 2001]

B. Schuller, M. Wimmer, L. Mösenlechner, C. Kern, D. Arsic, and G. Rigoll, “Brute-forcing hierarchical functionals for paralinguistics: A waste of feature space?” in Proc. ICASSP, 2008.
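As a simplified illustration of steps 1, 3, and 4 above, the sketch below computes a handful of statistical functionals over frame-level LLDs, both globally and over disjoint windows of several sizes. The random LLD matrix, the choice of four functionals, and the averaging of per-window values are stand-ins, not the authors' 53,000+ dimensional feature set.

```python
# Compute a few static functionals over frame-level LLDs at the global level
# and over disjoint windows of several sizes (a simplified stand-in).
import numpy as np

def functionals(x: np.ndarray) -> np.ndarray:
    """Example static functionals over one block of LLD frames."""
    return np.concatenate([x.mean(0), x.std(0), x.min(0), x.max(0)])

def windowed_features(lld: np.ndarray, win_s: float, fps: int = 100) -> np.ndarray:
    """Apply the functionals over disjoint win_s-second windows, then summarize."""
    w = max(1, int(win_s * fps))
    per_window = [functionals(lld[i:i + w]) for i in range(0, len(lld), w)]
    return np.mean(per_window, axis=0)  # one fixed-length vector per granularity

lld = np.random.randn(60_000, 25)  # e.g., 10 min of f0/intensity/MFCC/MFB frames
session_features = np.concatenate(
    [functionals(lld)] + [windowed_features(lld, s) for s in (0.1, 0.5, 1, 5, 10)]
)
print(session_features.shape)  # (4 functionals x 25 LLDs) x 6 granularities
```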

Page 12: Examples of Blame

Lexical Classifier (1/2)

• Derived from automatic speech recognition (ASR)

• Lexical classifier based on “competitive” language models
  – Low (High) blame language models trained on the text of low (high) blame spouses
  – Classifier: choose the blame class that is most likely
  – Confidence score: absolute difference in the probabilities of the low/high blame cases
  – (a minimal scoring sketch follows below)

[Diagram: ASR-based lexical classification of each turn of the rated spouse. Labels: acoustic observations of rated spouse’s speech; “blame class”-specific acoustic model; generic acoustic model; “blame class”-specific language model; unigram language model; most likely blame class and most likely word sequence; Low/High blame]
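To illustrate the competitive language-model idea above, here is a minimal sketch with add-one-smoothed unigram models. The toy training texts are invented placeholders, and the real system trains on held-out low/high-blame spouses' transcripts and scores ASR output rather than clean text.

```python
# Minimal sketch of "competitive" language-model classification with
# add-one-smoothed unigram LMs; toy texts below are invented placeholders.
import math
from collections import Counter

def train_unigram(texts):
    counts = Counter(w for t in texts for w in t.lower().split())
    total, vocab = sum(counts.values()), len(counts) + 1
    return lambda w: math.log((counts[w] + 1) / (total + vocab))

def score(logp, text):
    return sum(logp(w) for w in text.lower().split())

low_lm = train_unigram(["i see what you mean", "that makes sense to me"])
high_lm = train_unigram(["you made me do it", "this is all your fault"])

test_text = "you are the one who suggested it"
s_low, s_high = score(low_lm, test_text), score(high_lm, test_text)
label = "high" if s_high > s_low else "low"   # pick the more likely class
confidence = abs(s_high - s_low)              # confidence score for later fusion
print(label, round(confidence, 3))
```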

Page 13: Examples of Blame

Lexical Classifier (2/2)

• The single most likely path through the word lattice may not be robust
  – Incorporated the probabilities of the 100 most likely (“N-best”) paths (see the sketch below)
  – Assumed the N-best hypotheses are independent for this paper

• Oracle lexical classifier
  – Upper bound on the performance of the proposed ASR-derived lexical classifier
  – Assumes a perfect word recognition rate (i.e., uses the manual transcription of the rated spouse)
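A sketch of combining the N-best hypotheses per blame class under the independence assumption above; `nbest` and `class_lm_logp` are illustrative names for the recognizer output and a class-specific language-model scorer (such as the unigram sketch on the previous slide), not the paper's interfaces.

```python
# Combine the N-best ASR hypotheses per blame class, assuming the hypotheses
# are independent. `nbest` pairs an acoustic log-score with a word sequence.
import math

def class_log_score(nbest, class_lm_logp):
    """log sum_n exp(acoustic_n + LM_class(words_n)) over the N-best paths."""
    terms = [ac + class_lm_logp(words) for ac, words in nbest]
    m = max(terms)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(t - m) for t in terms))

# Decision rule: pick the class (low/high) with the larger combined score;
# the confidence is the absolute difference between the two combined scores.
```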

Page 14: Examples of Blame

Fusion Classifier

• Complementary information from the acoustic and lexical classifiers
  – Score-level fusion of the classifiers using their confidence scores

• Fusion classifier: another binary SVM
  – Inputs: “confidence score”-weighted class hypotheses (a fusion sketch follows below)
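A minimal sketch of the score-level fusion described above: each unimodal classifier contributes a confidence-weighted class hypothesis, and a second linear SVM is trained on the pair. The arrays are random placeholders; in the actual experiments the fusion SVM would be trained inside the leave-one-couple-out folds.

```python
# Score-level fusion sketch: confidence-weighted class hypotheses
# (+1 = high blame, -1 = low blame) feed a second linear SVM.
import numpy as np
from sklearn.svm import SVC

n = 280
acoustic_hyp = np.random.choice([-1, 1], n)   # unimodal class hypotheses
acoustic_conf = np.random.rand(n)             # class-probability confidences
lexical_hyp = np.random.choice([-1, 1], n)
lexical_conf = np.random.rand(n)
y = np.random.choice([0, 1], n)               # true Low/High blame labels

X_fuse = np.column_stack([acoustic_hyp * acoustic_conf,
                          lexical_hyp * lexical_conf])
fusion_svm = SVC(kernel="linear").fit(X_fuse, y)
print(fusion_svm.predict(X_fuse[:5]))
```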

Page 15: Examples of Blame

Classification Results

System   | Classifier                | Accuracy
Baseline | Chance                    | 140/280 = 50.0%
Unimodal | Acoustic                  | 223/280 = 79.6%
Unimodal | Lexical/ASR               | 211/280 = 75.4%
Unimodal | Lexical/Oracle            | 255/280 = 91.1%
Fusion   | Acoustic + Lexical/ASR    | 230/280 = 82.1%
Fusion   | Acoustic + Lexical/Oracle | 257/280 = 91.8%

(ASR word error rates ranged from 40% to 90%.)

• Findings
  – The high accuracy of the oracle lexical classifier shows that lexical cues are important
  – The lower accuracy of the ASR-based lexical classifier reflects how challenging ASR is on this audio
  – The fusion classifiers were able to advantageously combine partially orthogonal language and acoustic information sources

• Significant differences (an illustrative test sketch follows below)
  – All classifiers had significantly higher accuracy than chance (all p < 0.01)
  – Oracle classifiers had significantly higher accuracy than the non-oracle classifiers (all p < 0.01)
  – (Acoustic + Lexical/ASR) was significantly more accurate than Lexical/ASR alone (p < 0.05)
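The slides do not state which significance test was used. Purely as an illustration, one standard way to compare two classifiers evaluated on the same 280 instances is an exact McNemar test on their paired correct/incorrect decisions; the correctness vectors below are made up.

```python
# Illustration only: an exact McNemar test comparing two classifiers on the
# same instances (not necessarily the test the authors used).
import numpy as np
from scipy.stats import binomtest

def mcnemar_exact(correct_a: np.ndarray, correct_b: np.ndarray) -> float:
    """Two-sided exact McNemar p-value from per-instance correctness booleans."""
    b = int(np.sum(correct_a & ~correct_b))  # A correct, B wrong
    c = int(np.sum(~correct_a & correct_b))  # A wrong, B correct
    if b + c == 0:
        return 1.0
    return binomtest(b, n=b + c, p=0.5).pvalue

rng = np.random.default_rng(0)
fusion_correct = rng.random(280) < 0.821     # made-up correctness vectors
lexical_correct = rng.random(280) < 0.754
print(mcnemar_exact(fusion_correct, lexical_correct))
```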

Page 16: Examples of Blame

Conclusions & Future Work

• Modeled high-level blame behaviors from real couples by fusing automatically derived speech and language information
  – Proposed acoustic classifier more robust than the lexical classifier
  – Even with noisy ASR, the lexical classifier attained 75% accuracy
  – Successfully separated 82% of extreme instances with the fusion classifier

• Blaming behaviors are an important cue to detect because of their significance in the context of couple therapy
  – Detection of blame could facilitate clinician-guided drill-down therapy

• Future Work
  – Improve the individual classifiers (acoustic/lexical) and the fusion classifier
  – Extend fusion experiments to other behavioral codes
  – Apply behavioral signal processing methodologies to other domains

Page 17: Examples of Blame

References & Software

REFERENCES

M. P. Black, A. Katsamanis, C.-C. Lee, A. C. Lammert, B. R. Baucom, A. Christensen, P. G. Georgiou, and S. S. Narayanan, “Automatic classification of married couples’ behavior using audio features,” in Proc. Interspeech, 2010.

A. Christensen, D. C. Atkins, S. Berns, J. Wheeler, D. H. Baucom, and L. E. Simpson, “Traditional versus integrative behavioral couple therapy for significantly and chronically distressed married couples,” J. of Consulting and Clinical Psychology, vol. 72, pp. 176–191, 2004.

S. Dimidjian, C. R. Martell, and A. Christensen, Clinical Handbook of Couple Therapy, 4th ed. The Guilford Press, 2008, ch. Integrative behavioral couple therapy, pp. 73–106.

C. Heavey, D. Gill, and A. Christensen, Couples Interaction Rating System 2 (CIRS2), University of California, Los Angeles, 2002.

B. Schuller, M. Wimmer, L. Mösenlechner, C. Kern, D. Arsic, and G. Rigoll, “Brute-forcing hierarchical functionals for paralinguistics: A waste of feature space?” in Proc. ICASSP, 2008.

SOFTWARE

P. Boersma, “Praat, a system for doing phonetics by computer,” Glot International, vol. 5, no. 9/10, pp. 341–345, 2001.

C.-C. Chang and C.-J. Lin, LIBSVM: A library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/cjlin/libsvm.

F. Eyben, M. Wöllmer, and B. Schuller, “OpenSMILE - The Munich versatile and fast open-source audio feature extractor,” in Proc. ACM Multimedia, 2010, pp. 1459–1462.

P. K. Ghosh, A. Tsiartas, and S. S. Narayanan, “Robust voice activity detection using long-term signal variability,” IEEE Trans. Audio, Speech, and Language Processing, vol. 19, no. 3, pp. 600–613, 2011.

A. Katsamanis, M. P. Black, P. G. Georgiou, L. Goldstein, and S. S. Narayanan, “SailAlign: Robust long speech-text alignment,” in Proc. of Workshop on New Tools and Methods for Very-Large Scale Phonetics Research, 2011.

Page 18: Examples of Blame

Related Work at Interspeech 2011

• Papers at Interspeech 2011 on the same Couple Therapy corpus:

• Saliency detection
  – Mon at 10:00 – Ses1-P1
  – James Gibson, Athanasios Katsamanis, Matthew Black, and Shrikanth Narayanan, “Automatic identification of salient acoustic instances in couples’ behavioral interactions using Diverse Density Support Vector Machines”

• Interaction modeling
  – Tues at 14:30 – Ses2-S1-P
  – Chi-Chun Lee, Athanasios Katsamanis, Matthew Black, Brian Baucom, Panayiotis Georgiou, and Shrikanth Narayanan, “An analysis of PCA-based vocal entrainment measures in married couples’ affective spoken interactions”

Page 19: Examples of Blame

Thank you!

Questions?