
Multi-Modal Dialogue in Human-Robot Conversation

Information States in a Multi-modal Dialogue System for Human-Robot Conversation (Lemon et al. 2001)

Multi-tasking and Collaborative Activities in Dialogue Systems (Lemon et al. 2002)

Edith Klee

June 3, 2003


Overview

•Mobile Robots

•Requirements for Human-Robot conversation

•The WITAS Dialogue System

•Dialogue Management in WITAS

•Message Generation in WITAS

•Demo videoclip


Mobile Robots

• move about their environment using wheels, legs, or similar mechanisms, e.g. unmanned land vehicles (ULV), unmanned air vehicles (UAV), autonomous underwater vehicles (AUV)

• device sensors (e.g. camera, mouse gesture input) may give rise to new information at any time

• no predictable course of actions or conversation

• no strict endpoint to action or conversation

• simple form-filling/data-base query style dialogue not sufficient


Requirements on dialogue management in conversational interaction with mobile robots

• Asynchronicity: events in the dialogue scenario can happen at overlapping time periods

• Mixed task-initiative: operator and system will introduce issues for discussion

• Open-ended: no clear start/end points for (sub-)dialogues, no rigid pre-determined goals for interchanges

• Resource-bounded: participants' actions must be generated and produced in time enough to be effective dialogue contributions

• Simultaneous: participants can produce and receive actions simultaneously


The WITAS Dialogue System

• robot helicopter (UAV, 'Unmanned Aerial Vehicle')

• mission goals provided by a human operator, conversation via spoken dialogue

• Open Agent Architecture (OAA2)

[Figure: system architecture. Agents connected through the OAA2 facilitator:]

- SR: Nuance Speech Recognizer

- NL: Gemini Parser and Generator

- TTS: Festival Speech Synthesizer

- GUI: Interactive Map Display

- DM: Dialogue Manager

- ROBOT: Control and Report


Dialogue Management in WITAS

• Dialogue Management: 1) dialogue modelling (representation), 2) dialogue control (algorithm)

• main tasks of the DM in WITAS:

- interpretation of spoken language and map-gesture (mouse click) inputs as commands

- interpretation of queries, responses, declarations to the robot

- generation of synthesized speech and graphical output to express the robot's responses, questions, and reports about the environment

• The current DM supports:

- ambiguity resolution

- presupposition checking

- processing of anaphoric and deictic expressions

- command revision

- report generation, confirmation backchannel


Dialogue Manager in WITAS (contd.)

• implementation of a dynamic information state model

• creates and updates Information States (IS)

• dialogue moves have the effect of updating information states

• moves can be initiated by both the operator and the robot

• parts of an information state (2001):

- Issues Raised (IR) stack – a stack of public unresolved issues raised in the dialogue

- System Agenda – issues to be raised by the system

- Salience List – the objects referenced in the dialogue so far

- Modality Buffer – keeps track of mouse gestures

- Databases – dynamic objects, planned routes, geographical information, names


Parts of an Information State (2002)

• Dialogue Move Tree (DMT): structured history of dialogue moves and ‘threads’ + list of ‘active nodes’

• Activity Tree (AT): temporal and hierarchical structure of activities specified by the user or planned/initiated by the system + their execution status

• System Agenda (SA): stores the communicative goals of the system

• Pending List (PL): stores questions that the system has asked but the user has not yet answered; they may be re-raised

• Salience List (SL): objects referenced in the dialogue so far, ordered by recency

• Modality Buffer (MB): keeps track of mouse gestures (until bound to deictic expressions or recognized as purely gestural expressions)
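The six parts listed above can be pictured as one record that dialogue moves update. The following Python sketch is only an illustration: the class, field, and method names are hypothetical and do not come from the WITAS implementation, and binding a deictic expression to the oldest buffered click is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InformationState:
    """Hypothetical container mirroring the six parts of the 2002 information state."""
    dialogue_move_tree: dict = field(default_factory=dict)  # DMT: move history + active nodes
    activity_tree: dict = field(default_factory=dict)       # AT: activities + execution status
    system_agenda: list = field(default_factory=list)       # SA: communicative goals
    pending_list: list = field(default_factory=list)        # PL: unanswered system questions
    salience_list: list = field(default_factory=list)       # SL: most recent object first
    modality_buffer: list = field(default_factory=list)     # MB: mouse gestures not yet bound

    def mention(self, obj: str) -> None:
        """Record a referenced object, keeping the list ordered by recency."""
        if obj in self.salience_list:
            self.salience_list.remove(obj)
        self.salience_list.insert(0, obj)

    def click(self, location: tuple) -> None:
        """Buffer a map click until a deictic expression consumes it."""
        self.modality_buffer.append(location)

    def resolve_deictic(self) -> Optional[tuple]:
        """Bind a deictic expression such as "here" to the oldest buffered click (assumed policy)."""
        return self.modality_buffer.pop(0) if self.modality_buffer else None
```

For example, after `mention("tower")`, `mention("red car")`, `mention("tower")`, the salience list reads `["tower", "red car"]`, so the most recently mentioned object is always at the front.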


Dialogue Manager Architecture

[Figure: components of the Dialogue Manager]

- Conversational Move Inputs (parsed human speech)

- Map Display Inputs (mouse clicks)

- INDEXICAL INFORMATION

- SALIENCE LIST (NPs, Activities)

- MODALITY BUFFER

- DIALOGUE MOVE TREE (Active Node List)

- PENDING LIST

- SYSTEM AGENDA

- ACTIVITY TREE

- ACTIVITY MODEL

- DEVICE ACTIVITY LAYER

- MESSAGE GENERATION (Selection and Aggregation)


Example

[Figure: Dialogue Manager Architecture diagram, repeated from the previous slide]

O: Try to find the red car!

U: Do you mean the big red car?

O: Oh look! The car is here!

S: So you mean the big red car?

O: Yes! That car.


A Dialogue Move tree is...

• constructed for any ongoing dialogue to represent the current state of the conversation

• input: logical forms from the parsing process, e.g. "go to the tower":

command([go], [param-list ([pp-loc(to, arg([np(det([def],the),[n(tower,sg)])]))])])

• a history ("message board") of dialogue contributions, organized by "thread", based on activities

• classifies whether incoming utterances can be interpreted in the current dialogue context, delimits a space of possible IS update functions

• has an Active Node List which controls the order in which this function space is searched

• classifies how incoming utterances are to be interpreted in the current dialogue context
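A minimal Python sketch of how an Active Node List might control attachment of incoming moves. The attachment table `ATTACHES_UNDER`, the class names, and the rule that a new command becomes the most active node are assumptions made for illustration, not the WITAS rules; the sketch only shows the idea of searching active nodes in order and rejecting moves that fit nowhere.

```python
from dataclasses import dataclass, field

# Hypothetical table: which move types may attach under which node types.
ATTACHES_UNDER = {
    "root": {"command"},
    "command": {"report", "revision"},
}


@dataclass
class Node:
    move_type: str
    utterance: str
    children: list = field(default_factory=list)


class DialogueMoveTree:
    def __init__(self):
        self.root = Node("root", "Root")
        self.active = [self.root]  # index 0 = most active

    def attach(self, move_type, utterance):
        """Attach the move under the first active node that accepts it.

        Returns the new node, or None if the move cannot be
        interpreted in the current dialogue context.
        """
        for parent in self.active:
            if move_type in ATTACHES_UNDER.get(parent.move_type, set()):
                node = Node(move_type, utterance)
                parent.children.append(node)
                if move_type == "command":
                    # Assumption: a new command opens a thread and becomes most active.
                    self.active.insert(0, node)
                return node
        return None
```

With this sketch, a report arriving before any command is rejected, while after "Fly to the tower" the command node sits at position 0 and the root at position 1.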


Example Dialogue

Operator: Fly to the tower!

UAV: I plan to fly to the tower.

UAV: Now taking off.

UAV: I have taken off.

UAV: Now flying there.

Operator: Make that the police station.

UAV: I plan to fly to springfield police station.

UAV: I have cancelled flying to the tower

UAV: Now flying to springfield police station.

...
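The revision in the dialogue above ("Make that the police station") replaces one argument of the current command instead of issuing a new one. The Python sketch below illustrates that idea only; `GoCommand`, its status values, and `revise_destination` are hypothetical names, not part of WITAS.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GoCommand:
    destination: str
    status: str = "current"  # hypothetical status values: current, planned, cancelled


def revise_destination(active, new_destination):
    """Return (cancelled old command, replacement command with the new destination)."""
    cancelled = replace(active, status="cancelled")
    revised = replace(active, destination=new_destination, status="planned")
    return cancelled, revised
```

The two returned records correspond to the two reports in the example: one for the cancelled flight to the tower and one for the newly planned flight.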


Example Dialogue Move Tree

active nodes are red; position on active node list in parens [0 = most active]

• Root (1) Root

• Command (0) "Fly to the tower" command([go],[param_list([pp_loc(to,arg([np(det([def],the...

• Report "I plan to fly to the tower" report(inform,agent([np([n(uav,sg)])]),confirm_activity...



Example Dialogue Move Tree

active nodes are red; position on active node list in parens [0 = most active]

• Root (1) Root

• Command (0) "Fly to the tower" command([go],[param_list([pp_loc(to,arg([np(det([def],the...

• Report "Now taking off" report(inform,agent([np([n(uav,sg)])]),curr_activity...

• Report "I plan to fly to the tower" report(inform,agent([np([n(uav,sg)])]),confirm_activity...



Example Dialogue Move Tree

active nodes are red; position on active node list in parens [0 = most active]

• Root (1) Root

• Command (0) "Fly to the tower" command([go],[param_list([pp_loc(to,arg([np(det([def],the...

• Report "Now taking off" report(inform,agent([np([n(uav,sg)])]),curr_activity...

• Report "I have taken off" report(inform,agent([np([n(uav,sg)])]),compl_activity...

• Report "I plan to fly to the tower" report(inform,agent([np([n(uav,sg)])]),confirm_activity...



Example Dialogue Move Tree

active nodes are red; position on active node list in parens [0 = most active]

• Root (1) Root

• Command (0) "Fly to the tower" command([go],[param_list([pp_loc(to,arg([np(det([def],the...

• Report "Now taking off" report(inform,agent([np([n(uav,sg)])]),curr_activity...

• Report "I have taken off" report(inform,agent([np([n(uav,sg)])]),compl_activity...

• Report "Now flying there" report(inform,agent([np([n(uav,sg)])]),curr_activity ...

• Report "I plan to fly to the tower" report(inform,agent([np([n(uav,sg)])]),confirm_activity...


Example Dialogue Move Tree

active nodes are red; position on active node list in parens [0 = most active]

• Root (1) Root

• Command (0) "Fly to the tower" command([go],[param_list([pp_loc(to,arg([np(det([def],the...

• Report "Now taking off" report(inform,agent([np([n(uav,sg)])]),curr_activity...

• Report "I have taken off" report(inform,agent([np([n(uav,sg)])]),compl_activity...

• Report "Now flying there" report(inform,agent([np([n(uav,sg)])]),curr_activity ...

• Report "I plan to fly to the tower" report(inform,agent([np([n(uav,sg)])]),confirm_activity...

• Revision "Make that the police station" revision([replace([arg([np(det([def],the),[n(police...



Example Dialogue Move Tree

active nodes are red; position on active node list in parens [0 = most active]

• Root (1) Root

• Command (0) "Fly to the tower" command([go],[param_list([pp_loc(to,arg([np(det([def],the...

• Report "Now taking off" report(inform,agent([np([n(uav,sg)])]),curr_activity...

• Report "I have taken off" report(inform,agent([np([n(uav,sg)])]),compl_activity...

• Report "Now flying there" report(inform,agent([np([n(uav,sg)])]),curr_activity ...

• Report "I plan to fly to the tower" report(inform,agent([np([n(uav,sg)])]),confirm_activity...

• Revision "Make that the police station" revision([replace([arg([np(det([def],the),[n(police...

• Report "I plan to fly to springfield police station" report(inform,agent([np([n(uav,sg)])]),confirm ...


Message Generation

•the system has to communicate about:

- its perceptions of a changing environment

- progress towards user-specific goals

- execution status of activities or tasks

- its own internal state changes

- the progress of the dialogue itself

• problems that may arise:

- multi-tasking robot, different activities at the same time

- dialogue contributions arise incrementally

- generation of too many utterances: overload of information

- generation of too few utterances: no establishment and maintenance of an appropriate context

- large pieces of text not appropriate


Requirements for Message Generation

• flexibility: message selection and generation needs to be more flexible than template-based approaches

• aggregation: aggregation rules must be sensitive to incremental aspects

• relevance/recency filtering: message filtering according to relevance

• echoing: usage of the same language as the user, i.e. usage of anaphoric expressions where possible

• variability: expressing the same content in a variety of ways

• real time generation to keep the conversation natural


Message Selection - Filtering

• selection and generation module

• input: logical forms, describing the communicative goals of the system

- context tags (e.g. activity identifier, dialogue move tree node)

- content logical form (e.g. report, wh-question)

- priority tag (e.g. warn, inform)

- e.g. report(inform, agent(AgentID), cancel-activity(ActivityID)): when AgentID = robot and ActivityID = tower, the result is "I cancelled flying to the tower".

• items considered for generation are placed on the system agenda or pending list

• messages more urgent than others carry a label "warning" and pass any filtering

• Echoing (using anaphoric expressions) is achieved by accessing the Salience List whenever possible
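The filtering and echoing steps above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the `Message` fields, the relevance test (membership in a set of currently relevant activity identifiers), the `low-fuel` content tag, and the anaphor "there" for the most salient destination are all hypothetical choices, not the WITAS implementation.

```python
from dataclasses import dataclass


@dataclass
class Message:
    priority: str     # priority tag, e.g. "warn" or "inform"
    activity_id: str  # context tag identifying the activity
    content: str      # content tag, e.g. "cancel-activity"


def select(agenda, relevant_activities):
    """Keep warnings unconditionally; keep other messages only if their activity is relevant."""
    return [m for m in agenda
            if m.priority == "warn" or m.activity_id in relevant_activities]


def refer(destination, salience_list):
    """Echo with an anaphor ("there") when the destination is the most salient object."""
    if salience_list and salience_list[0] == destination:
        return "there"
    return f"to {destination}"
```

For instance, `refer("the tower", ["the tower"])` yields "there", which matches the report "Now flying there" in the example dialogue, while an empty salience list yields the full phrase "to the tower".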


Thank You!