Classification of Discourse Functions of Affirmative Words
in Spoken Dialogue
Agustín Gravano, Stefan Benus, Julia Hirschberg,
Shira Mitchell, Ilia Vovsha
INTERSPEECH, Antwerp, August 2007
Spoken Language Processing Group, Columbia University
Agustín Gravano INTERSPEECH 2007
Cue Words
Ambiguous linguistic expressions used for:
  • Making a semantic contribution, or
  • Conveying a pragmatic function.
Examples: now, well, so, alright, and, okay, first, by the way, on the other hand.
Single affirmative cue words
  • Examples: alright, okay, mm-hm, right, uh-huh, yes.
  • May be used to convey acknowledgment or agreement, to change topic, to backchannel, etc.
Research Goals
Learn which features best characterize the different functions of single affirmative cue words.
Determine how these can be identified automatically.
Important in Spoken Dialogue Systems: to understand user input and to produce output appropriately.
Previous Work
Classification of cue words into discourse vs. sentential use:
Hirschberg & Litman ’87, ’93; Litman ’94; Heeman, Byron & Allen ’98; Zufferey & Popescu-Belis ’04.
In our corpus:
  • right: 15% discourse, 85% sentential.
  • All other affirmative cue words: 99% discourse, 1% sentential.
The discourse vs. sentential distinction is insufficient; new classification tasks need to be defined.
Talk Overview
  • Columbia Games Corpus
  • Classification tasks
  • Experimental features
  • Results
The Columbia Games Corpus
  • 12 spontaneous task-oriented dyadic conversations in Standard American English.
  • 2 subjects playing computer games; no eye contact.
The Columbia Games Corpus: Function of Affirmative Cue Words
Cue words: alright, gotcha, huh, mm-hm, okay, right, uh-huh, yeah, yep, yes, yup
Functions:
  • Acknowledgment / Agreement
  • Backchannel
  • Cue beginning discourse segment
  • Cue ending discourse segment
  • Check with the interlocutor
  • Stall / Filler
  • Back from a task
  • Literal modifier
  • Pivot beginning: Ack/Agree + Cue begin
  • Pivot ending: Ack/Agree + Cue end
7.9% of the words in our corpus
The Columbia Games Corpus: Function of Affirmative Cue Words
Literal Modifier
  that’s pretty much okay
Backchannel
  Speaker 1: between the yellow mermaid and the whale
  Speaker 2: okay
  Speaker 1: and it is
Cue beginning discourse segment
  okay we gonna be placing the blue moon
The Columbia Games Corpus: Function of Affirmative Cue Words
  • 3 trained labelers.
  • Inter-labeler agreement: Fleiss’ Kappa = 0.69 (Fleiss ’71).
  • In this study we use the majority label for each affirmative cue word.
  • Majority label: label chosen by at least two of the three labelers.
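The labeling scheme can be illustrated with a minimal Python sketch. It assumes each token carries exactly three labels; the toy annotations and the label names are invented for illustration and are not corpus data. Tokens on which no two labelers agree get no majority label (returned as `None`).

```python
from collections import Counter

def majority_label(labels, min_votes=2):
    """Return the label chosen by at least `min_votes` labelers, else None."""
    label, votes = Counter(labels).most_common(1)[0]
    return label if votes >= min_votes else None

def fleiss_kappa(items):
    """Fleiss' kappa for items that each carry the same number of labels.

    Undefined (division by zero) if every rating falls in a single category.
    """
    n = len(items[0])                      # raters per item
    N = len(items)                         # number of items
    totals = Counter()                     # ratings per category, all items
    p_bar = 0.0                            # mean per-item agreement
    for labels in items:
        counts = Counter(labels)
        totals.update(counts)
        p_bar += (sum(c * c for c in counts.values()) - n) / (n * (n - 1))
    p_bar /= N
    p_e = sum((t / (N * n)) ** 2 for t in totals.values())  # chance agreement
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical annotations: one inner list per token, one label per labeler.
toy = [
    ["Ack", "Ack", "Ack"],
    ["Ack", "Ack", "Backchannel"],
    ["CueBegin", "CueBegin", "CueBegin"],
    ["Ack", "Backchannel", "CueBegin"],
]
majorities = [majority_label(labels) for labels in toy]
kappa = fleiss_kappa(toy)
```

On this toy input the last token has no majority label, which is the situation the "at least two of the three labelers" rule has to handle.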
Method: Two new classification tasks
  • Identification of a discourse segment boundary function: Segment beginning vs. Segment end vs. No discourse segment boundary function.
  • Identification of an acknowledgment function: Acknowledgment vs. No acknowledgment.
Method: Machine Learning Experiments
  • ML algorithm: JRip, Weka’s implementation of the propositional rule learner Ripper (Cohen ’95).
  • We also tried J4.8, Weka’s implementation of the decision tree learner C4.5 (Quinlan ’93, ’96), with similar results.
  • 10-fold cross validation in all experiments.
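The evaluation protocol can be sketched in plain Python. This is only an outline of 10-fold cross validation: it does not reproduce JRip or J4.8, the stand-in learner simply predicts the most frequent training label, and the fold assignment (shuffled, not stratified) and toy data are assumptions made for illustration.

```python
import random
from collections import Counter

def k_fold_indices(n_items, k=10, seed=0):
    """Shuffle item indices and deal them into k nearly equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validated_error(examples, train, k=10, seed=0):
    """examples: list of (features, label); train: returns a predict function.

    Each fold is held out once; the model is trained on the other k-1 folds.
    """
    errors = 0
    for held_out in k_fold_indices(len(examples), k, seed):
        held = set(held_out)
        training = [ex for i, ex in enumerate(examples) if i not in held]
        predict = train(training)
        errors += sum(predict(examples[i][0]) != examples[i][1] for i in held_out)
    return errors / len(examples)

def train_majority(training):
    """Stand-in learner (NOT Ripper): always predict the most frequent label."""
    majority = Counter(label for _, label in training).most_common(1)[0][0]
    return lambda features: majority

# Hypothetical data set: 80 ACK tokens and 20 NO_ACK tokens.
toy_examples = [({"word": "okay"}, "ACK")] * 80 + [({"word": "right"}, "NO_ACK")] * 20
toy_error = cross_validated_error(toy_examples, train_majority)
```

Any learner exposing the same `train` interface could be plugged in; the error rate is pooled over the ten held-out folds.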
Method: Experimental features
  • IPU (inter-pausal unit): maximal sequence of words delimited by a pause > 50 ms.
  • Conversational turn: maximal sequence of IPUs by the same speaker, with no contribution from the other speaker.
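A minimal sketch of how these two units could be derived from time-aligned words. The word timings are hypothetical, the input format (token, start, end in seconds) is an assumption, and turns are formed simply by ordering IPUs by start time, which ignores overlapping speech.

```python
def split_into_ipus(words, min_pause=0.050):
    """words: (token, start_sec, end_sec) tuples for ONE speaker, sorted by start.

    A new IPU begins whenever the silence before a word exceeds min_pause.
    """
    ipus = []
    for word in words:
        if ipus and word[1] - ipus[-1][-1][2] <= min_pause:
            ipus[-1].append(word)
        else:
            ipus.append([word])
    return ipus

def group_into_turns(ipus_by_speaker):
    """Merge consecutive IPUs by the same speaker (ordered by start) into turns."""
    tagged = sorted(
        ((ipu[0][1], speaker, ipu)
         for speaker, ipus in ipus_by_speaker.items() for ipu in ipus),
        key=lambda item: item[0],
    )
    turns = []
    for _, speaker, ipu in tagged:
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(ipu)      # same speaker continues the turn
        else:
            turns.append((speaker, [ipu]))
    return turns

# Hypothetical alignment, loosely modeled on the backchannel example.
words_a = [("between", 0.00, 0.30), ("the", 0.32, 0.40), ("whale", 0.41, 0.80),
           ("and", 1.60, 1.75), ("it", 1.76, 1.85), ("is", 1.86, 2.00)]
words_b = [("okay", 1.00, 1.30)]

ipus_a = split_into_ipus(words_a)
ipus_b = split_into_ipus(words_b)
turns = group_into_turns({"A": ipus_a, "B": ipus_b})
turn_speakers = [speaker for speaker, _ in turns]
```

Here speaker A produces two IPUs separated by a long pause, and speaker B's "okay" in between splits them into two separate turns.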
Method: Experimental features
  • Text-based features: extracted from the text transcriptions. Lexical id; POS tags; position of word in IPU / turn; etc.
  • Timing features: extracted from the time alignment of the transcriptions. Word / IPU / turn duration; amount of overlap; etc.
  • Acoustic features:
      {min, mean, max, stdev} x {pitch, intensity}.
      Slope of pitch, stylized pitch, and intensity, over the whole word and over its last 100, 200, 300 ms.
      Acoustic features from the end of the other speaker’s previous turn.
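The acoustic feature family can be sketched as follows. The slides do not say how slopes were computed, so the least-squares fit, the use of the sample standard deviation, and the synthetic pitch track are all assumptions made for illustration.

```python
import statistics

def summary_stats(values):
    """{min, mean, max, stdev} of one acoustic track (sample stdev assumed)."""
    return {
        "min": min(values),
        "mean": statistics.fmean(values),
        "max": max(values),
        "stdev": statistics.stdev(values),
    }

def slope(times, values):
    """Least-squares slope of values over times (value units per second)."""
    t_mean = statistics.fmean(times)
    v_mean = statistics.fmean(values)
    num = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den

def tail(times, values, window):
    """Keep only the samples in the last `window` seconds of the word."""
    cutoff = times[-1] - window
    kept = [(t, v) for t, v in zip(times, values) if t >= cutoff]
    return [t for t, _ in kept], [v for _, v in kept]

# Synthetic pitch track: a 0.4 s word sampled every 50 ms, rising at 50 Hz/s.
times = [i * 0.05 for i in range(9)]
pitch = [100.0 + 50.0 * t for t in times]

features = summary_stats(pitch)
features["slope_word"] = slope(times, pitch)
for window_ms in (100, 200, 300):
    t_tail, p_tail = tail(times, pitch, window_ms / 1000)
    features[f"slope_last_{window_ms}ms"] = slope(t_tail, p_tail)
```

The same functions would apply to an intensity track; a real word would of course give different slopes over the whole word and over each final window.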
Results: Discourse segment boundary function

Feature Set           Error Rate   F-Measure (Begin)   F-Measure (End)
Text-based            11.6 %       .77                 .30
Timing                11.3 %       .73                 .52
Acoustic              14.2 %       .66                 .19
Text-based + Timing    9.8 %       .81                 .53
Full set               9.6 %       .81                 .57
Baseline (1)          19.0 %       .00                 .00
Human labelers (2)     5.7 %       .94                 .71

(1) Majority class baseline: NO BOUNDARY.
(2) Calculated with respect to each labeler’s agreement with the majority labels.
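For reference, the two metrics in this table can be sketched as below. The slides only say "F-Measure"; the balanced form (harmonic mean of precision and recall) is assumed here, and the gold and predicted labels are toy values, not corpus data.

```python
def error_rate(gold, predicted):
    """Fraction of tokens whose predicted class differs from the gold class."""
    return sum(g != p for g, p in zip(gold, predicted)) / len(gold)

def f_measure(gold, predicted, target):
    """Balanced F-measure for one class; 0.0 when the class is never hit."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == target and p == target for g, p in pairs)
    fp = sum(g != target and p == target for g, p in pairs)
    fn = sum(g == target and p != target for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy gold labels: 2 BEGIN, 1 END, 7 NONE.
gold = ["BEGIN", "BEGIN", "END"] + ["NONE"] * 7

# A majority-class baseline never predicts BEGIN or END.
baseline = ["NONE"] * len(gold)
baseline_error = error_rate(gold, baseline)
baseline_f_begin = f_measure(gold, baseline, "BEGIN")

# A hypothetical classifier with one missed BEGIN and one false BEGIN.
predicted = ["BEGIN", "NONE", "END"] + ["NONE"] * 6 + ["BEGIN"]
model_error = error_rate(gold, predicted)
model_f_begin = f_measure(gold, predicted, "BEGIN")
```

This makes explicit why the baseline row shows .00 for both Begin and End: a classifier that always answers NO BOUNDARY has no true positives for either class.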
Results: Acknowledgment function

Feature Set           Error Rate   F-Measure
Text-based             8.3 %       .94
Timing                11.0 %       .92
Acoustic              17.2 %       .87
Text-based + Timing    6.2 %       .95
Full set               6.5 %       .95
Baseline (1)          16.7 %       .88
Human labelers (2)     5.5 %       .98

(1) Baseline based on lexical identity: {huh, right}: no ACK; all other words: ACK.
(2) Calculated with respect to each labeler’s agreement with the majority labels.
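The lexical-identity baseline in footnote (1) amounts to a one-line rule. A minimal sketch, in which the class names `ACK` and `NO_ACK` and the lowercasing step are assumptions:

```python
# Words that the baseline maps to "no acknowledgment", as listed in footnote (1).
NO_ACK_WORDS = {"huh", "right"}

def lexical_baseline(word):
    """Predict the acknowledgment class from the word's identity alone."""
    return "NO_ACK" if word.lower() in NO_ACK_WORDS else "ACK"

baseline_predictions = {w: lexical_baseline(w) for w in ["okay", "right", "mm-hm", "huh"]}
```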
Best-performing features
Discourse Segment Boundary Function
  • Lexical identity
  • POS tag of the following word
  • Number and proportion of succeeding words in the turn
  • Context-normalized mean intensity
Acknowledgment Function
  • Lexical identity
  • POS tag of preceding word
  • Number and proportion of preceding words in the turn
  • IPU and turn length
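The positional features named above can be sketched as follows. The slides do not define "proportion"; here it is assumed to be relative to the total number of words in the turn, and the example turn is the cue-beginning utterance quoted earlier in the talk.

```python
def position_features(turn_words, index):
    """Counts and proportions of words before and after position `index`.

    Proportions are taken over the turn length (an assumed definition).
    """
    n = len(turn_words)
    preceding = index
    succeeding = n - index - 1
    return {
        "num_preceding": preceding,
        "num_succeeding": succeeding,
        "prop_preceding": preceding / n,
        "prop_succeeding": succeeding / n,
    }

turn = "okay we gonna be placing the blue moon".split()
okay_position = position_features(turn, turn.index("okay"))
```

A turn-initial "okay" has no preceding words and many succeeding ones, which is the kind of pattern these features can expose to the learner.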
Results: Classification of individual words
Classification of each individual word into its most common functions:
  • alright: Ack/Agree, Cue Begin, Other
  • mm-hm: Ack/Agree, Backchannel
  • okay: Ack/Agree, Backchannel, Cue Begin, Ack+CueBegin, Ack+CueEnd, Other
  • right: Ack/Agree, Check, Literal Modifier
  • yeah: Ack/Agree, Backchannel
Results: Classification of the word ‘okay’

                                       F-Measure
Feature Set           Error Rate (%)   Ack/Agree   Backchannel   Cue Begin   Ack/Agree + Cue Begin   Ack/Agree + Cue End
Text-based            31.7             .76         .16           .77         .09                     .33
Acoustic              40.2             .69         .24           .64         .03                     .25
Text-based + Timing   25.6             .79         .31           .82         .18                     .67
Full set              25.5             .80         .46           .83         .21                     .66
Baseline (1)          48.3             .68         .00           .00         .00                     .00
Human labelers (2)    14.0             .89         .78           .94         .56                     .73

(1) Majority class baseline: ACK/AGREE.
(2) Calculated with respect to each labeler’s agreement with the majority labels.
Summary
Discourse/sentential distinction is insufficient for affirmative cue words in spoken dialogue.
Two new classification tasks:
  • Detection of an acknowledgment function.
  • Detection of a discourse boundary function.
Best-performing ML models:
  • Based on textual and timing features.
  • Slight improvement when using acoustic features.
Further Work
Gravano et al., 2007. On the role of context and prosody in the interpretation of ‘okay’. ACL 2007, Prague, Czech Republic, June 2007.
Benus et al., 2007. The prosody of backchannels in American English. ICPhS 2007, Saarbrücken, Germany, August 2007.
Function           alright   mm-hm   okay   right   uh-huh   yeah   Other   Total
Ack / Agree             99      61   1137     114       18    808     133    2370
Backchannel              6     402    121      14      143     72       5     763
Cue Begin               89       0    548       2        0      2       0     641
Cue End                  8       0     10       0        0      0       0      18
Pivot Begin              5       0     68       0        0      0       0      73
Pivot End               13      12    232       2        0     22      17     298
Back from Task           9       1     33       0        0      0       0      43
Check                    0       0      6      53        0      1       8      68
Stall                    1       0     15       1        0      2       0      19
Literal Modifier         9       0     29    1079        0      0       1    1118
?                       56      27    235      10        3     65      11     407
Total                  295     503   2434    1275      164    972     175    5818
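As a consistency check, the majority-class baselines reported in the results can be recomputed from these counts. The sketch rests on two assumptions that the slides do not state: tokens in the "?" row are excluded, and the pivot functions count toward the Begin and End boundary classes. Under those assumptions the arithmetic gives about 19.0% for the boundary task and about 48.3% for the word "okay", matching the reported baseline error rates.

```python
WORDS = ["alright", "mm-hm", "okay", "right", "uh-huh", "yeah", "Other"]

# Counts copied from the table above (one list per function, in WORDS order).
TABLE = {
    "Ack / Agree":      [99, 61, 1137, 114, 18, 808, 133],
    "Backchannel":      [6, 402, 121, 14, 143, 72, 5],
    "Cue Begin":        [89, 0, 548, 2, 0, 2, 0],
    "Cue End":          [8, 0, 10, 0, 0, 0, 0],
    "Pivot Begin":      [5, 0, 68, 0, 0, 0, 0],
    "Pivot End":        [13, 12, 232, 2, 0, 22, 17],
    "Back from Task":   [9, 1, 33, 0, 0, 0, 0],
    "Check":            [0, 0, 6, 53, 0, 1, 8],
    "Stall":            [1, 0, 15, 1, 0, 2, 0],
    "Literal Modifier": [9, 0, 29, 1079, 0, 0, 1],
    "?":                [56, 27, 235, 10, 3, 65, 11],
}

# Assumption: tokens with no agreed label ("?") are left out of the experiments.
labeled = {function: row for function, row in TABLE.items() if function != "?"}
labeled_total = sum(sum(row) for row in labeled.values())

# Assumption: pivots count as boundary tokens (Begin or End).
BOUNDARY = ["Cue Begin", "Pivot Begin", "Cue End", "Pivot End"]
boundary_tokens = sum(sum(labeled[function]) for function in BOUNDARY)
# Always answering NO BOUNDARY is wrong exactly on the boundary tokens.
boundary_baseline_error = boundary_tokens / labeled_total

# Majority-class baseline for 'okay': always answer its most frequent function.
okay = WORDS.index("okay")
okay_total = sum(row[okay] for row in labeled.values())
okay_majority = max(row[okay] for row in labeled.values())
okay_baseline_error = 1 - okay_majority / okay_total

print(f"boundary baseline error: {100 * boundary_baseline_error:.1f} %")
print(f"'okay' baseline error:   {100 * okay_baseline_error:.1f} %")
```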