Natural Language Generation for Spoken Dialogue Systems
Grad. Seminar on Spoken Dialogues, Antoine Raux, 11/20/2003

Page 1: Natural Language Generation for Spoken Dialogue Systems

Grad. Seminar on Spoken Dialogues
Antoine Raux, 11/20/2003

Page 2: Outlook

- Introduction
- “Planning Dialogue Contributions with New Information”, K. Jokinen et al. (1998)
- “The Practical Value of N-Grams in Generation”, I. Langkilde and K. Knight (1998)
- “Stochastic Language Generation for Spoken Dialog Systems”, A. Oh and A. Rudnicky (2000)

Page 3: Natural Language Generation

- Intro based on “Generation Models for Spoken Dialogues”, G. Wilcock and K. Jokinen (2003)
- Formal representation of information → natural language
- Applications:
  - Machine translation
  - Text generation (summarization, expert systems…)
  - Dialogue systems
  - …?

Pages 4-8: Where does NLG start and where does it end? (the slippery slope)

Dialogue Manager → Content Planner → Surface Realizer → Speech Synthesizer

- Dialogue Manager: decides the next dialogue act and the main information to convey
- Content Planner: decides what pieces of information to include in the utterance
- Surface Realizer: constructs a valid utterance expressing the selected concepts and their relationship
- Speech Synthesizer: produces the spoken utterance, with appropriate prosody
- But, in reactive systems, generation might start as early as recognition: decision to stop speaking on barge-in, change the planned utterance according to the user input, etc.
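The four stages above can be sketched as a chain of functions, each consuming the output of the previous one. This is only an illustrative sketch: the function names, the dictionary-based dialogue act, and the flight data are assumptions made up for the example, and the speech synthesizer is a stub that wraps the text in a tag instead of producing audio.

```python
# Hypothetical sketch of the pipeline:
# dialogue manager -> content planner -> surface realizer -> speech synthesizer.
# Every name and value below is an illustrative assumption.

def dialogue_manager(user_input: str) -> dict:
    """Decide the next dialogue act and the main information to convey."""
    # Toy policy: always answer with flight information.
    return {
        "act": "inform_flight",
        "info": {
            "depart_city": "Pittsburgh",
            "arrive_city": "Boston",
            "depart_time": "9:15",
            "airline": "Example Air",
        },
    }

def content_planner(dialogue_act: dict) -> dict:
    """Decide which pieces of information go into the utterance."""
    wanted = ("depart_city", "arrive_city", "depart_time")
    selected = {k: v for k, v in dialogue_act["info"].items() if k in wanted}
    return {"act": dialogue_act["act"], "concepts": selected}

def surface_realizer(plan: dict) -> str:
    """Build a valid utterance expressing the selected concepts."""
    c = plan["concepts"]
    return (f"There is a flight from {c['depart_city']} "
            f"to {c['arrive_city']} at {c['depart_time']}.")

def speech_synthesizer(text: str) -> str:
    """Stub: a real synthesizer would return audio with appropriate prosody."""
    return f"<speak>{text}</speak>"

def generate(user_input: str) -> str:
    """Run the four stages in order."""
    act = dialogue_manager(user_input)
    plan = content_planner(act)
    return speech_synthesizer(surface_realizer(plan))
```

A strictly sequential chain like this cannot react to barge-in or to new user input while it is speaking, which is the limitation the last bullet points at.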

Page 9: Machine Translation Approach

- Starts from a “bag of concepts” and builds all possible syntactically valid sentences containing the concepts
- Selects one (cf. N-Gram paper)
- Eventually includes semantic constraints
- Order of concepts not relevant (to deal with syntactic differences across languages)

Page 10: MT Model for SDS?

- An adaptation of the model exists to account for incrementality in SDS:
  - Starts building an utterance with an incomplete set of concepts
  - Completes the utterance as new concepts are added
- No structure to the information

Page 11: Text Generation

- Centered on information structure (discourse planning, reference…)
- Monologue: all the information must be provided in one single text
- No interaction: in SDS feedback from the user guides generation
- In SDS, utterances are short, with little information
- Conclusion: not appropriate for SDS…

Page 12: Template-based Generation

- Most common in SDS: hand-written templates for fixed sets of concepts
- Rigid: produces only a limited set of utterances
- Not scalable: hard to write/maintain large sets of utterances
- Fits well with limited domain speech synthesis
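A minimal sketch of template-based generation, assuming templates are keyed by a dialogue act plus the exact set of concepts they can express. The template table, the `realize` function, and the example sentences are hypothetical and only meant to make the rigidity visible.

```python
# Hypothetical hand-written templates, keyed by
# (dialogue act, exact set of concept names the template can express).
TEMPLATES = {
    ("inform_flight", frozenset({"depart_city", "arrive_city"})):
        "There is a flight from {depart_city} to {arrive_city}.",
    ("inform_price", frozenset({"price"})):
        "The fare is {price} dollars.",
    ("query_arrive_city", frozenset()):
        "Where would you like to go?",
}

def realize(act: str, concepts: dict) -> str:
    """Fill the template matching this act and this exact set of concepts."""
    key = (act, frozenset(concepts))  # frozenset of the dict's keys
    template = TEMPLATES.get(key)
    if template is None:
        # Rigid by construction: no template means no utterance.
        raise KeyError(f"no template for {act} with concepts {sorted(concepts)}")
    return template.format(**concepts)
```

Calling `realize("inform_flight", {"depart_city": "Pittsburgh"})` raises `KeyError`, because nobody wrote a template for that combination. Covering every combination by hand is what makes large template sets hard to maintain.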

Page 13: Planning Dialogue Contributions with New Information

- Very general definition of generation (even includes the DM)
- Based on observation of human-human dialogues
- Computational theory of generation in dialogue

Page 14: The Three I’s of Generation in Dialogue

- Incrementality:
  - Old information vs. new information
- Immediacy:
  - Speakers closely monitor their partner (backchannel, barge-in, etc.)
- Interactivity:
  - Some turns only play a role at the metalevel of interaction management (e.g. ask for a confirmation by repeating an already known piece of information)

Page 15: Information Structure of Utterances

- Information Unit: a phrase with one or more accented words and a phrase accent at the end
- Central Concept (CC): concept on which the participants are focusing:
  - goal or subgoal for task-based dialogues
  - current topic for descriptions
- NewInfo: piece of information newly presented by an utterance:
  - Can become CC in future turns

Page 16: Generation as “Wrapping” of the NewInfo

- The DM provides the surface realizer with a communicative intention (dialogue act) and the NewInfo wrapped in further information:
  - Wrapping 1: prevent ambiguity by providing enough details
  - Wrapping 2: make sure that uncertain concepts are mutually known
  - Wrapping 3: decompose complex goals into subgoals

Page 17: Conclusion/Discussion

- General computational theory of dialogue
- Handles both task-based dialogues and descriptions based on topic shifts
- Decomposition Task Manager/Dialogue Manager
- Problems: no implementation…

Page 18: The Practical Value of N-Grams in Generation

- Combine a rule-based surface realizer with a statistical selector
- Study the contribution of statistical models to surface realization

Page 19: Two-Tier Generation Module

Meaning → Symbolic Generator (uses lexicon, grammar) → Word lattice of possible renderings → Statistical Extractor (uses corpus) → English string

Page 20: Symbolic Generator

- Gets feature structures as input with semantic relations (agent/patient, source/destination…)
- Produces a lattice of possible English word sequences

Page 21: Symbolic Generator

- Lexicon: links a semantic concept to a root
- Morphological KB: rules and exceptions for plurals, conjugation…
- Grammar rules mapping between:
  - Semantic relations (input)
  - Deep syntactic relations (oblique, adjunct…)
  - Shallow syntactic relations (subject, object…)
- Does not try to solve ambiguity: produces all possible outputs

Page 22: Statistical Extractor

- Bigram LM trained on 2 years of The Wall Street Journal
- From all the possible outputs of the symbolic generator, selects the one with the highest likelihood according to the LM
- Backs off to unigrams when no bigram is available
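The selection step can be sketched as follows: score every candidate rendering with a bigram model and keep the most likely one. Several things here are assumptions for illustration only: the three-sentence toy corpus stands in for the Wall Street Journal data, the candidates are a flat list rather than a word lattice (a real system would search the lattice instead of enumerating it), and the backoff scheme (a fixed penalty times an add-one-smoothed unigram probability) is a simple stand-in, since the slides do not say which backoff method the paper uses.

```python
import math
from collections import Counter

# Toy corpus standing in for the newspaper text (assumption).
CORPUS = [
    "the company said it expects a profit",
    "the company expects a loss",
    "it said the profit rose",
]

unigrams = Counter()
bigrams = Counter()
for line in CORPUS:
    words = ["<s>"] + line.split() + ["</s>"]
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

TOTAL = sum(unigrams.values())   # number of tokens
VOCAB = len(unigrams)            # number of distinct tokens
BACKOFF_PENALTY = 0.4            # arbitrary constant, not from the paper

def log_prob(prev: str, word: str) -> float:
    """Bigram log-probability, backing off to a penalized unigram estimate."""
    if bigrams[(prev, word)] > 0:
        return math.log(bigrams[(prev, word)] / unigrams[prev])
    # Unseen bigram: add-one-smoothed unigram probability times a penalty.
    return math.log(BACKOFF_PENALTY * (unigrams[word] + 1) / (TOTAL + VOCAB + 1))

def score(sentence: str) -> float:
    """Log-likelihood of a whole candidate sentence under the bigram model."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    return sum(log_prob(p, w) for p, w in zip(words, words[1:]))

def select(candidates: list) -> str:
    """Pick the candidate with the highest likelihood."""
    return max(candidates, key=score)

candidates = [
    "the company expects a profit",
    "company the expects profit a",   # bad word order
    "the company expect a profit",    # bad agreement
]
best = select(candidates)
```

Here `best` is the first candidate: all of its bigrams occur in the toy corpus, while the other two contain unseen bigrams and pay the backoff penalty. The symbolic generator can therefore overgenerate freely and leave word order and agreement to the language model.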

Page 23: Conclusion/Discussion

- Simplicity: do not need very detailed grammar rules
- Robustness: LM works on any English sentence
- Generality: should work for any topic
- Weaknesses: bigrams only capture very local relations, sparseness issue with larger n…

Page 24: Stochastic Language Generation for Spoken Dialogue Systems

- Specifically for dialogue systems
- Based on statistical methods
- Handles both content planning and surface realization
- Domain specific generation

Page 25: Labeled Corpora

- In the context of Communicator (air travel reservation)
- Two corpora of human-human transcribed dialogues (travel agent/client)
- Each utterance classified by type: query_arrive_city, inform_flight, inform_price
- Each concept labeled: airline, arrive_city, hotel_price…
- Tagging done manually first and then semi-automatically

Page 26: Content Planning

- Train models from labeled data to predict:
  - Number of attributes (concepts) in the utterance
  - Sequence of attributes
- Very simple models:
  - most frequent number of attributes for each class
  - Presence of each attribute only depends on the user’s previous utterance
- Deals with data sparseness
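The first of the "very simple models" can be sketched in a few lines: for each utterance class, predict the number of attributes seen most often in the labeled data. The labeled examples and the `train` function are hypothetical, and the second model (attribute presence conditioned on the user's previous utterance) is not shown.

```python
from collections import Counter, defaultdict

# Hypothetical labeled data: (utterance class, attributes in the agent utterance).
LABELED = [
    ("inform_flight", ["airline", "depart_time"]),
    ("inform_flight", ["airline", "depart_time"]),
    ("inform_flight", ["airline", "depart_time", "arrive_city"]),
    ("inform_price", ["price"]),
    ("inform_price", ["price"]),
    ("inform_price", ["airline", "price"]),
]

def train(data):
    """Map each utterance class to its most frequent number of attributes."""
    length_counts = defaultdict(Counter)
    for utt_class, attrs in data:
        length_counts[utt_class][len(attrs)] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in length_counts.items()}

model = train(LABELED)
```

On this toy data the model predicts two attributes for `inform_flight` and one for `inform_price`. A table this small needs very little data per class, which is one way to read the "deals with data sparseness" bullet.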

Page 27: Surface Realization

- Generates random sequences of words according to the class LM (5-gram)
- Repeats until it gets an utterance with all attributes predicted by the content planner (or timeout)
- Replaces word class names by actual words (e.g. depart_city → New York)
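The generate-and-test loop above can be sketched as follows. To keep the example short it swaps in a bigram model for the 5-gram one, trains it on three made-up tagged utterances, and uses a maximum number of tries as a stand-in for the timeout. All names (`TAGGED`, `SLOTS`, `realize`, and so on) are assumptions for illustration.

```python
import random
from collections import defaultdict

# Hypothetical tagged utterances: slot values already replaced by class names.
TAGGED = [
    "there is a flight from depart_city to arrive_city",
    "i have a flight to arrive_city",
    "there is a flight leaving depart_city at depart_time",
]
SLOTS = {"depart_city", "arrive_city", "depart_time"}

def train_bigram(sentences):
    """Record observed successors of each word; repeats keep the frequencies."""
    successors = defaultdict(list)
    for s in sentences:
        words = ["<s>"] + s.split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            successors[prev].append(nxt)
    return successors

def sample(successors, rng, max_len=20):
    """Randomly generate one word sequence, or None if it never ends."""
    words, prev = [], "<s>"
    for _ in range(max_len):
        nxt = rng.choice(successors[prev])
        if nxt == "</s>":
            return words
        words.append(nxt)
        prev = nxt
    return None  # no end-of-sentence marker within max_len words

def realize(required, values, successors, rng, max_tries=1000):
    """Sample until an utterance contains exactly the required slots."""
    for _ in range(max_tries):
        words = sample(successors, rng)
        if words is None:
            continue
        if {w for w in words if w in SLOTS} == required:
            # Replace class names by the actual values.
            return " ".join(values.get(w, w) for w in words)
    return None  # stand-in for the timeout

lm = train_bigram(TAGGED)
utterance = realize(
    {"depart_city", "arrive_city"},
    {"depart_city": "Pittsburgh", "arrive_city": "New York"},
    lm,
    random.Random(0),
)
```

Because the sampler recombines fragments of different training sentences, it can produce wordings that no template author wrote, including some awkward ones. That is both the source of the output variation and the risk of this approach.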

Page 28: Evaluation

- No significant difference in user satisfaction between an old vs. new approach and this approach for content planning (trend: old vs. new)
- No significant difference in human preference judgment for surface realization between templates and this approach (trend: this approach)

Page 29: Conclusion/Discussion

- Empirical approach to content planning (original) but too many simplifications? How to deal with data sparseness?
- Domain-specific statistical LM approach to surface realization: more convincing. Allows variations in the output
- Nice attempt at evaluation but not enough data?

Page 30: Integrating Language Generation with Speech Synthesis in a Concept to Speech System

- Lower end of the process: link with synthesis
- Integrates prosody into generation
- System description

Page 31: NLG Component

- Content planner, micro planner (sentence planning) and lexical chooser not detailed
- Produces a semantic structure (~ feature structure)

Page 32: Speech Integrating Markup Language

- Keeps semantic/syntactic information with the utterance
- Converts into TTS system specific features