Challenges in Dialogue Discourse and Dialogue CMSC 35900-1 October 27, 2006.
Challenges in Dialogue
Discourse and Dialogue
CMSC 35900-1
October 27, 2006
Roadmap
• Issues in Dialogue
  – Dialogue vs General Discourse
  – Dialogue Acts
    • Modeling
    • Recognition and Interpretation
  – Dialogue Management for Computational Agents
Dialogue vs General Discourse
• Key contrast: Two or more speakers
  – Primary focus on speech
• Issues in multi-party spoken dialogue
  – Turn-taking: who speaks next, when?
  – Collaboration: clarification, feedback, ...
  – Disfluencies
  – Adjacency pairs, dialogue acts
Turn-Taking
• Multi-party discourse
  – Need to trade off speaker/hearer roles
    • Interpret reference from sequential utterances
• When?
  – End of sentence?
    • No: multi-utterance turns
  – Silence?
    • No: little silence in smooth dialogue: < 250 ms
  – When other starts speaking?
    • No: relatively little overlap face-to-face: ~5%
Turn-taking: When
• Rule-governed behavior
  – Possibly multiple legal turn change times
    • Aka transition-relevance places (TRP)
• Generally at utterance boundaries
  – Utterance not necessarily sentence
  – In fact, utterance/sentence boundaries not obvious in speech
    » Don’t necessarily pause between sentences
• Automatic utterance boundary detection
  – Cue words (okay, so, ...); POS sequences; prosody
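
Automatic boundary detection from the cues above can be sketched roughly as follows. This is only an illustration, not a method from the lecture: the `Token` type, the 200 ms pause threshold, and the rule "a boundary precedes a cue word or follows a long pause" are all assumptions, and POS sequences are left out for brevity.

```python
from dataclasses import dataclass

# Cue words named on the slide; a real system would use a larger list.
CUE_WORDS = {"okay", "so"}


@dataclass
class Token:
    word: str
    pause_after_ms: int  # silence after the word (hypothetical recognizer output)


def boundary_indices(tokens, pause_threshold_ms=200):
    """Return each index i where a boundary is hypothesized after tokens[i]."""
    boundaries = []
    for i, tok in enumerate(tokens[:-1]):
        next_is_cue = tokens[i + 1].word.lower() in CUE_WORDS
        long_pause = tok.pause_after_ms >= pause_threshold_ms  # crude prosodic cue
        if next_is_cue or long_pause:
            boundaries.append(i)
    return boundaries
```

Combining cues with a simple OR is the crudest possible choice; a trained classifier over the same features would be the more usual design.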
Turn-taking: Who & How
• At each TRP in each turn (Sacks 1974)
  – If speaker has selected A to speak, A must take floor
  – If speaker has selected no one to speak, anyone can
  – If no one else takes the turn, the speaker can
• Selecting speaker A:
  – By explicit/implicit mention: What about it, Bob?
    • By gaze, function
• Selecting others: questions, greetings, closing
  – (Traum et al., 2003)
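
The three rules applied at each TRP can be written as a small decision function. This is a minimal sketch: the function name, its arguments, and the convention that the first participant to self-select wins are assumptions made for illustration.

```python
def next_speaker(current, selected=None, self_selectors=()):
    """Apply the three turn-allocation rules at a single TRP.

    current        -- name of the current speaker
    selected       -- participant the current speaker selected, or None
    self_selectors -- other participants trying to take the turn, in order
    """
    # Rule 1: if the speaker selected someone, that person must take the floor.
    if selected is not None:
        return selected
    # Rule 2: otherwise anyone may self-select; assume the first to try wins.
    for candidate in self_selectors:
        if candidate != current:
            return candidate
    # Rule 3: if no one else takes the turn, the current speaker may continue.
    return current
```

For example, `next_speaker("Ann", selected="Bob")` gives the floor to Bob, while `next_speaker("Ann")` lets Ann continue.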
Turn-taking in HCI
• Human turn end:
  – Detected by 250 ms silence
• System turn end:
  – Signaled by end of speech
  – Indicated by any human sound
    • Barge-in
• Continued attention:
  – No signal
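
Silence-based detection of the human turn end might look like the sketch below. It assumes a voice-activity detector has already labeled fixed-length frames as speech or non-speech; the 10 ms frame size and the function name are illustrative assumptions, and only the 250 ms threshold comes from the slide.

```python
def detect_turn_end(frame_is_speech, frame_ms=10, silence_ms=250):
    """Return the index of the frame at which the turn end is declared, or None.

    frame_is_speech -- iterable of booleans, one per audio frame (assumed VAD output)
    """
    needed = silence_ms // frame_ms  # consecutive silent frames required
    silent_run = 0
    heard_speech = False
    for i, is_speech in enumerate(frame_is_speech):
        if is_speech:
            heard_speech = True
            silent_run = 0
        elif heard_speech:  # silence before any speech does not count
            silent_run += 1
            if silent_run >= needed:
                return i
    return None
```

A fixed threshold like this treats every long pause as a turn end, which is why it sits uneasily with the earlier observation that smooth human dialogue has gaps under 250 ms.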
Gesture, Gaze & Voice
• Range of gestural signals:
  – Head (nod, shake), shoulder, hand, leg, foot movements; facial expressions; postures; artifacts
  – Align with syllables
    • Units: phonemic clause + change
• Study with recorded exchanges
Yielding the Floor
• Turn change signal
  – Offer floor to auditor/hearer
    • Cues: pitch fall, lengthening, “but uh”, end gesture, amplitude drop + “uh”, end clause
    • Likelihood of change increases with more cues
    • Negated by any gesticulation
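
The interaction between the cues and gesticulation can be illustrated with a toy score. The slide only says that the likelihood of a turn change increases with the number of cues; the linear form below, the cue identifiers, and the function name are assumptions chosen for simplicity.

```python
# Identifiers for the six cues listed on the slide (the names are made up here).
TURN_YIELD_CUES = {
    "pitch_fall",
    "lengthening",
    "but_uh",
    "end_gesture",
    "amplitude_drop_uh",
    "end_clause",
}


def yield_score(observed_cues, gesticulating=False):
    """Return a score in [0, 1] that grows with the number of yield cues displayed."""
    if gesticulating:
        # Any gesticulation negates the turn change signal.
        return 0.0
    active = set(observed_cues) & TURN_YIELD_CUES
    # Assumed linear relation: each cue contributes equally.
    return len(active) / len(TURN_YIELD_CUES)
```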
Taking the Floor
• Speaker-state signal
  – Indicate becoming speaker
    • Occurs at beginning of turns
• Cues:
  – Shift in head direction
    • AND/OR
  – Start of gesture
Retaining the Floor
• Within-turn signal
  – Still speaker: Look at hearer at end of clause
• Continuation signal
  – Still speaker: Look away after within-turn signal or back-channel
• Back-channel:
  – “mmhm”, “okay”, etc.; nods
    • Sentence completion; clarification request; restatement
  – NOT a turn: signals attention, agreement, confusion
Segmenting Turns
• Speaker alone:
  – Within-turn signal -> end of one unit
  – Continuation signal -> beginning of next unit
• Joint signal:
  – Speaker turn signal (end); auditor -> speaker; speaker -> auditor
  – Within-turn + back-channel + continuation
    • Back-channels signal understanding
  – Early back-channel + continuation
Regaining Attention
• Gaze & Disfluency
  – Disfluency: “perturbation” in speech
    • Silent pause, filled pause, restart
  – Gaze:
    • Conversants don’t stare at each other constantly
    • However, speaker expects to meet hearer’s gaze
      – Confirms hearer’s attention
• Disfluency occurs when speaker realizes hearer is NOT attending
  – Pause until hearer begins gazing, or to request attention
Collaborative Communication
• Speaker tries to establish and add to “common ground” (“mutual belief”)
  – Presumed a joint, collaborative activity
    • Make sure both “mutually believe” the same thing
  – Hearer can acknowledge/accept/disagree
    » Clark & Schaefer: Degrees of grounding
      • Display, Demonstrate/Reformulate, Acknowledgement, Next relevant contribution, Continued attention
Computational Models
• (Traum et al.) revised for computation
  – Involves both speaker and hearer
    • Initiate, Continue, Acknowledge, Repair, Request Repair, etc.
  – Common phenomena
    • “Back-channel”: “uh-huh”, “okay”, etc.
      – Allows hearer to signal continued attention, ack
        » WITHOUT taking the turn
    • Requests for repair: common in human-human
      – Even more common in human-computer dialogue
Implicature & Grice’s Maxims
• Inferences licensed by utterances
• Grice’s Maxims
  – Quantity: Be as informative as required
    • “There are two classes per week”: not 1, or 5
  – Quality: Be truthful; don’t lie
  – Relevance: Be relevant
  – Manner: “Be perspicuous”
    • Don’t be obscure, ambiguous, prolix, or disorderly
• “Flouting” maxims: Consciously violate for effect
  – Humor, emphasis, ...
Speech & Dialogue Acts
• Speech Acts (Austin, Searle)
  – “Doing things with words”
    • E.g. performatives: “I dub thee Sir Lancelot”
  – Illocutionary acts: act of asking, answering, promising, etc. in saying an utterance
    • Include:
      – Assertives: “I propose to..”
      – Directives: “Stop that”
      – Commissives: “I promise”
      – Expressives: “Thank you”
      – Declarations: “You’re fired”
Dialogue Acts
• (aka Conversational moves)
  – Enriched set of speech acts
    • Capture full range of conversational functions
  – Adjacency pairs: Many two-part structures
    • E.g. Question-Answer, Greeting-Greeting, Request-Grant, etc.
    • Paired for speaker-hearer dyads
      – Contrast with rhetorical relations in monologue
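
Adjacency pairs can be represented as a lookup from a first part to the second part it sets up. The table below uses only the three example pairs from the slide; the lowercase act labels and the function name are assumptions for illustration.

```python
# First pair part -> expected second pair part (the slide's three examples only).
EXPECTED_SECOND_PART = {
    "question": "answer",
    "greeting": "greeting",
    "request": "grant",
}


def completes_pair(first_act, second_act):
    """Return True if second_act is the expected response to first_act."""
    expected = EXPECTED_SECOND_PART.get(first_act)
    return expected is not None and expected == second_act
```

A dialogue system can use such a table to set expectations: after a question, an answer is the unmarked next move, and anything else may signal a repair or a sub-dialogue.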
DAMSL
• Dialogue Act Tagging framework
  – Adjacency pairs + grounding + repair
• Forward looking functions
  – Statement, info-request, commit, closing, etc.
• Backward looking functions
  – Focus on link to prior speaker utterance
    • Agreement, answer, accept, etc.
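
One way to hold such annotations in code is a record with separate forward-looking and backward-looking tag sets, since one utterance can carry both. This is a sketch, not the DAMSL specification: the tag inventories contain only the labels named on the slide, and the `TaggedUtterance` class is hypothetical.

```python
from dataclasses import dataclass, field

# Only the tags named on the slide; the full DAMSL inventory is larger.
FORWARD = {"statement", "info-request", "commit", "closing"}
BACKWARD = {"agreement", "answer", "accept"}


@dataclass
class TaggedUtterance:
    speaker: str
    text: str
    forward: set = field(default_factory=set)   # forward-looking functions
    backward: set = field(default_factory=set)  # links to the prior utterance

    def __post_init__(self):
        self.forward = set(self.forward)
        self.backward = set(self.backward)
        unknown = (self.forward - FORWARD) | (self.backward - BACKWARD)
        if unknown:
            raise ValueError(f"unknown tags: {sorted(unknown)}")


# An answer that is also a statement carries one tag in each dimension.
reply = TaggedUtterance("B", "Yes, at nine.", forward={"statement"}, backward={"answer"})
```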
Tagged Dialogue
Dialogue Act Recognition
• Goal: Identify dialogue act tag(s) from surface form
• Challenge: Surface form can be ambiguous
  – “Can you X?”: yes/no question, or info-request
    • “Flying on the 11th, at what time?”: check, statement
• Requires interpretation by hearer
  – Strategies: Plan inference, cue recognition
Plan-inference-based
• Classic AI (BDI) planning framework
  – Model Belief, Knowledge, Desire
    • Formal definition with predicate calculus
  – Axiomatization of plans and actions as well
  – STRIPS-style: Preconditions, Effects, Body
  – Rules for plan inference
• Elegant, but..
  – Labor-intensive rule, KB, heuristic development
  – Effectively AI-complete
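
A STRIPS-style operator with preconditions, effects, and a body can be sketched as below. The `Operator` class, the split of effects into add and delete sets, and the `INFORM` example with its string predicates are all illustrative assumptions, not the axiomatization used in the plan-inference literature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    name: str
    preconditions: frozenset               # facts that must hold beforehand
    add_effects: frozenset                 # facts made true by the action
    delete_effects: frozenset = frozenset()  # facts made false by the action
    body: tuple = ()                       # sub-actions that realize the operator

    def applicable(self, state):
        """True if every precondition holds in the given state."""
        return self.preconditions <= state

    def apply(self, state):
        """Return the successor state; raise if preconditions are unmet."""
        if not self.applicable(state):
            raise ValueError(f"preconditions of {self.name} not satisfied")
        return (state - self.delete_effects) | self.add_effects


# Hypothetical speech-act operator: speaker S informs hearer H of proposition P.
inform = Operator(
    name="INFORM(S,H,P)",
    preconditions=frozenset({"know(S,P)"}),
    add_effects=frozenset({"know(H,P)"}),
    body=("utter(S,P)",),
)
```

Plan inference runs such operators in reverse: from an observed utterance, the hearer reasons back to the goal whose plan would contain it. Even this toy version hints at the labor involved, since every act needs hand-written preconditions and effects.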
Cue-based Interpretation
• Employs sets of features to identify
  – Words and collocations: Please -> request
  – Prosody: Rising pitch -> yes/no question
  – Conversational structure: prior act
• Example: Check
  – Syntax: tag question “, right?”
  – Syntax + prosody: Fragment with rise
  – N-gram: argmax_d P(d) P(W|d)
    • So you, sounds like, etc.
• Details later ...
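
The decision rule argmax_d P(d) P(W|d) can be illustrated with a tiny word-based classifier. This is a sketch under simplifying assumptions: P(W|d) is approximated with unigrams and add-one smoothing rather than true n-grams, and the six training sentences are invented for the example.

```python
import math
from collections import Counter, defaultdict

# Invented toy data: (dialogue act, utterance).
TRAINING = [
    ("check", "so you want the morning flight"),
    ("check", "sounds like you are leaving tuesday"),
    ("statement", "i want the morning flight"),
    ("statement", "i am leaving tuesday"),
    ("request", "please book the flight"),
    ("request", "please send the ticket"),
]


def train(examples):
    """Count acts and per-act words; return (act_counts, word_counts, vocab)."""
    act_counts, word_counts, vocab = Counter(), defaultdict(Counter), set()
    for act, text in examples:
        words = text.lower().split()
        act_counts[act] += 1
        word_counts[act].update(words)
        vocab.update(words)
    return act_counts, word_counts, vocab


def classify(text, model):
    """Return argmax_d [log P(d) + sum_w log P(w|d)] with add-one smoothing."""
    act_counts, word_counts, vocab = model
    total = sum(act_counts.values())
    best_act, best_score = None, -math.inf
    for act in sorted(act_counts):
        score = math.log(act_counts[act] / total)  # log prior P(d)
        denom = sum(word_counts[act].values()) + len(vocab)
        for w in text.lower().split():
            score += math.log((word_counts[act][w] + 1) / denom)  # log P(w|d)
        if score > best_score:
            best_act, best_score = act, score
    return best_act


model = train(TRAINING)
```

With this toy model, cue words such as "so you" pull an utterance toward the check act and "please" toward a request, which is the effect the slide describes.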
From Human to Computer
• Conversational agents
  – Systems that (try to) participate in dialogues
  – Examples: Directory assistance, travel info, weather, restaurant and navigation info
• Issues:
  – Limited understanding: ASR errors, interpretation
  – Computational costs:
    • Broader coverage -> slower, less accurate
Dialogue Manager Tradeoffs
• Flexibility vs Simplicity/Predictability
  – System vs User vs Mixed Initiative
  – Order of dialogue interaction
  – Conversational “naturalness” vs Accuracy
  – Cost of model construction, generalization, learning, etc.
• Models: FST, Frame-based, HMM, BDI
• Evaluation frameworks
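
Of the model families listed, the frame-based one is the easiest to sketch. The class below is a hypothetical minimal manager, not a system from the lecture: it asks for the first unfilled slot but accepts any slots the user volunteers, which gives a limited form of mixed initiative. The slot names and prompts are invented.

```python
class FrameDialogueManager:
    """Minimal frame-based manager: fill slots, prompt for the first one missing."""

    def __init__(self, slot_prompts):
        # slot name -> question to ask; dict order is the default asking order
        self.slot_prompts = dict(slot_prompts)
        self.frame = {slot: None for slot in self.slot_prompts}

    def update(self, parsed_slots):
        """Store any known slots found in the user's utterance (assumed NLU output)."""
        for slot, value in parsed_slots.items():
            if slot in self.frame and value is not None:
                self.frame[slot] = value

    def next_prompt(self):
        """Return the question for the first unfilled slot, or None when done."""
        for slot, prompt in self.slot_prompts.items():
            if self.frame[slot] is None:
                return prompt
        return None


dm = FrameDialogueManager(
    {"origin": "Where are you leaving from?",
     "destination": "Where are you going?",
     "date": "What day do you want to travel?"}
)
```

Compared with a finite-state design, the frame lets the user answer out of order or give several values at once, at the cost of a less predictable interaction order.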