MERL 1
COLLAGEN:
Middleware for Building Mixed-Initiative
Problem Solving Assistants
(Neal Lesh, Andy Garland, Chris Lee, David McDonald, Egon Pasztor, Chris Maloof, Luke Zettlemoyer, Jim Davies, Myrosia Dzikovska, Steve Wolfman, Jacob Eisenstein, Allison Bruce)
Charles Rich, Candace L. Sidner
Mitsubishi Electric Research Laboratories, Cambridge, MA
MERL 2
Outline of the Talk
• Introduction
• Demo: DiamondHelp
• Some Theory
• Some Architecture
• Some More Technical Details
• Related Work
MERL 3
Mixed-Initiative and Collaboration
Collaboration: A process in which two or more participants coordinate their actions toward achieving shared goals. [Grosz & Sidner]
Collaboration implies Mixed-Initiative
i.e., all collaborative systems are mixed-initiative
Mixed-Initiative strongly suggests Collaboration
i.e., most interesting mixed-initiative systems are collaborative
Mixed Initiative: …efficient, natural interleaving of contributions by users and automated services… [Horvitz]
MERL 4
Collaboration
• covers a wide spectrum of interactions depending, among other factors, on:
- the relative knowledge of the participants
- which participant predominantly has the initiative
- the primary goal of the collaboration
e.g., tutoring versus assistance
• usually involves some form of communication (discourse) between the participants, e.g., in natural language.
MERL 7
SharedPlan Collaborative Discourse Theory
(Grosz, Sidner, Kraus, Lochbaum 1974-1998)
Attentional
focus spaces, focus stack
Intentional
goals, recipes, plans
Linguistic
segments, lexical items
MERL 8
(Grosz, 1974)
E: Replace the pump and belt please.
A: Ok, I found a belt in the back.
A: Is that where it should be?
A: [removes belt]
A: It’s done.
E: Now remove the pump.
…
E: First you have to remove the flywheel.
…
E: Now take the pump off the base plate.
A: Already did.
replace belt
replace pump
replace pump and belt
(fixing an air compressor, E = expert, A = apprentice)
Discourse Segments and Purposes
MERL 9
E: Replace the pump and belt please.
A: Ok, I found a belt in the back.
A: Is that where it should be?
A: [removes belt]
A: It’s done.
Focus Stack
replace belt
replace pump and belt
Plan Tree
replace pump and belt
replace pump replace belt
SharedPlan Discourse State Model
current focus space
(Grosz & Sidner, 1986)
replace belt
replace pump and belt
MERL 10
SharedPlan Discourse Interpretation Algorithm
(Lochbaum, 1998)
Updating the discourse state in response to new discourse events (communications or manipulations)
Plan Tree:
A
B C
d [ user ] e [ user ] f [ agent ] g [ user ]
Focus Stack:
A
B
Discourse events, with their numbered annotations in the plan tree:
1. User performs e. (1#e [ user ])
2. User performs d. (2#d [ user ])
3. Agent performs f. (3#f [ agent ])
4. Agent says “Please perform g.” (4#Propose.Should [ agent, g[user] ])
[Figure: nodes of the plan tree are marked "live" as the events are processed]
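The update above can be sketched in Python. This is a minimal sketch, not Collagen's implementation: the class and method names are hypothetical, the tree shape (d and e under B, f and g under C) is an assumption read off the flattened figure, and the focus stack is simplified to the path from the root to the goal currently being worked on.

```python
class Node:
    """A goal or step in the plan tree (hypothetical structure)."""

    def __init__(self, name, *children):
        self.name = name
        self.parent = None
        self.done = False
        self.children = list(children)
        for child in self.children:
            child.parent = self

    def find(self, name):
        """Depth-first search for a node by name."""
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None


class DiscourseState:
    """Plan tree plus focus stack, updated once per discourse event."""

    def __init__(self, root):
        self.root = root
        self.focus_stack = [root]
        self.history = []

    def focus_names(self):
        return [node.name for node in self.focus_stack]

    def perform(self, actor, step_name):
        """A manipulation: mark the step done and refocus on its parent goal."""
        step = self._lookup(step_name)
        step.done = True
        self.history.append((actor, "perform", step_name))
        self._shift_focus(step.parent)
        # Pop goals whose steps are all done (the root always stays).
        while len(self.focus_stack) > 1 and all(
            child.done for child in self.focus_stack[-1].children
        ):
            self.focus_stack.pop().done = True

    def propose(self, actor, step_name):
        """A communication such as Propose.Should: refocus, nothing is done yet."""
        step = self._lookup(step_name)
        self.history.append((actor, "propose", step_name))
        self._shift_focus(step.parent)

    def _lookup(self, step_name):
        step = self.root.find(step_name)
        if step is None:
            raise ValueError(f"unknown step: {step_name}")
        return step

    def _shift_focus(self, goal):
        # Simplification: the stack is the path from the root to the goal.
        path = []
        while goal is not None:
            path.append(goal)
            goal = goal.parent
        self.focus_stack = list(reversed(path)) or [self.root]


def build_example():
    """Plan tree from the slide, under the assumed parent/child grouping."""
    tree = Node("A", Node("B", Node("d"), Node("e")), Node("C", Node("f"), Node("g")))
    return DiscourseState(tree)
```

Replaying the four events from the slide with `perform("user", "e")`, `perform("user", "d")`, `perform("agent", "f")` and `propose("agent", "g")` moves the focus from B back to A and then to C under these assumptions.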
MERL 11
User says "What next?"
Agent says "What do you want to do?"
[Choosing the fabric and stain.]
User says "Choose the fabric and stain."
[Done choosing the fabric.]
[Done successfully navigating.]
[Done user successfully popping up the fabric load selection display.]
Agent says "Please press the Fabric Load picture to pop up the fabric choices."
Agent points to where you press the Fabric Load picture to pop up the fabric choices.
User pops up the fabric load selection display.
User closes the current pop-up window (by pressing OK in the window corner).
User says "What next?"
[Choosing the stain.]
[Done successfully navigating.]
[Done user successfully popping up the stain selection display.]
Agent says "Please press the Stain picture to pop up the stain choices."
Agent points to where you press the Stain picture to pop up the stain choices.
User pops up the stain selection display.
[Next expecting optionally to select a stain.]
[Next expecting to close the current pop-up window (by pressing OK in the window corner).]
[Expecting optionally to adjust detailed settings.]
[Expecting optionally to run the selected cycle.]
MERL 12
Discourse Theory vs. Problem-Solving Theory
• Even though it includes an intentional (plan tree) component, SharedPlan discourse theory is not a complete problem-solving theory:
For example, it does not tell you how to build new recipes (for that, you might use, e.g., first-principles planning or case-based reasoning).
• If a problem solver does not collaborate, then it does not need a discourse model!
• However, a mixed-initiative problem solving assistant needs both a discourse model and a problem-solving model (e.g., BDI).
problem solving theory
discourse theory
MERL 13
Discourse Theory vs. Problem-Solving Theory
• The discourse model constrains the problem solving model:
For example, the discourse model constrains which subproblem to work on next based on the focus of attention in the collaboration.
This modularity is possible because SharedPlan discourse theory captures structure that is independent of the domain and the problem solving model, i.e., structure that is fundamentally about the collaboration process itself.
• The discourse model also provides structure needed for linguistic processing, such as reference resolution (via focus spaces).
problem solving model
discourse model
beliefs
intentions desires
first-principles planning
discourse interpretation
plan recognition
MERL 15
Theoretical Orientation:
Applying SharedPlan collaborative discourse theory to improve human-computer interaction.
Practical Goal:
Building collaborative agents (mixed-initiative problem solving assistants) for a wide range of applications with a maximum degree of software reuse.
The COLLAGEN Project
MERL: Charles Rich, Candace Sidner
USC/ISI: Jeff Rickel
MITRE: Abigail Gertner
TU Delft: David Keyson, Elyon Dekoven
MIT Media Lab: Justine Cassell, Tim Bickmore
MERL 16
Task-Oriented Human Collaboration
Collaborative Agent
communicate
interact interact
observe observe
plan tree
focus stack
COLLAGEN
Software Reuse: Prototypes Built with Collagen
[Figure: prototypes labeled by organization: MERL, MERL/MELCO, MITRE, USC/ISI, LOTUS/IBM]
MERL 18
Discourse State
Respond **
Interpret *
Choose
user event
agent event
Collagen Architecture
Generate
agenda
* Lesh, Rich, Sidner (1999-2001) -- plan recognition
* Grosz, Sidner, Kraus, Lochbaum (1974-1998) -- discourse interpretation
** Rich, Lesh, Rickel, Garland (2002) -- plugins
Task Model (Recipes)
Implementation of SharedPlan Discourse Theory
Weak Problem-Solving Model
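One plausible reading of the architecture diagram, sketched in Python. All function and field names here are hypothetical and the logic is deliberately trivial: events are interpreted into the discourse state, an agenda of candidate agent events is generated from the state and the task model, and the agent chooses one and responds.

```python
from dataclasses import dataclass, field


@dataclass
class DiscourseState:
    pending: list                      # steps not yet performed, in task-model order
    history: list = field(default_factory=list)


def interpret(state, actor, event):
    """Update the discourse state for one user or agent event."""
    state.history.append((actor, event))
    if event in state.pending:
        state.pending.remove(event)


def generate(state, task_model):
    """Build the agenda: the agent's own steps, or proposals for the user's steps."""
    agenda = []
    for step in state.pending:
        if task_model[step] == "agent":
            agenda.append(step)
        else:
            agenda.append(f"Propose.Should({step})")
    return agenda


def choose(agenda):
    """Pick one agenda item (here simply the first)."""
    return agenda[0] if agenda else None


def respond(state, task_model):
    """One agent turn: generate, choose, and feed the chosen event back in."""
    event = choose(generate(state, task_model))
    if event is not None:
        interpret(state, "agent", event)
    return event
```

The task model in this sketch is just a mapping from step name to the actor responsible, e.g. `{"d": "user", "f": "agent"}`. A real task model would hold recipes and constraints.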
MERL 20
EngageEngine
StartGenerator OpenBleedValve Purge
OpenFuelValve
Engage
TurnOn
Fragment of Gas Turbine Task Model
Non-primitive act
Primitive act
Recipe step
Ordering constraint
Other constraints (not shown): *
- act pre/postconditions
- recipe applicability conditions
- equality between parameters
- other logical predicates
(Hierarchical Task Network)
* Truth maintenance system
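A recipe with ordering constraints can be represented roughly as follows. This is a sketch only: the `Recipe` type and `live_steps` function are hypothetical, and the decomposition and ordering used in the example are guesses, since the figure itself is not recoverable from the slide text. Only the act names come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipe:
    goal: str                                  # non-primitive act being achieved
    steps: tuple                               # recipe steps, by act name
    ordering: frozenset = frozenset()          # pairs (before, after)


def live_steps(recipe, done):
    """Steps not yet done whose ordering predecessors are all done."""
    live = []
    for step in recipe.steps:
        if step in done:
            continue
        predecessors = {a for (a, b) in recipe.ordering if b == step}
        if predecessors <= done:
            live.append(step)
    return live


# ASSUMPTION: this grouping and ordering is illustrative, not taken from the figure.
ENGAGE_ENGINE = Recipe(
    goal="EngageEngine",
    steps=("StartGenerator", "OpenBleedValve", "Purge"),
    ordering=frozenset({("StartGenerator", "Purge"), ("OpenBleedValve", "Purge")}),
)
```

With nothing done, `live_steps(ENGAGE_ENGINE, set())` offers both unordered steps. Purge becomes live only once both predecessors are done.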
MERL 21
Plan Recognition Reduces Explicit Communication
(Lesh, Rich & Sidner, 1999)
[Figure: plan recognition diagram with nodes labeled U, W, X, Y, Z]
MERL 22
Without plan recognition:
USER: "Let's work on email."
AGENT: "You have 5 unread messages."
USER: "Let's read a message."
USER: [select message from Chuck]
USER: "Let's read message in its own window."
USER: [open message]
USER: "Let's do something using message."
USER: [visit http://foo.bar.com]
...long lunch...
USER: "What was I doing?"
AGENT: "Working on email.
  Reading message from Chuck.
  Reading message in its own window.
  Done something using message.
  [visit http://foo.bar.com]
  Expecting to react to message (opt)."
USER: "Let's react to message."
USER: "Let's send email."
USER: [start new message]
USER: "Let's fill in message."
USER: [address message to Dan]
...additional editing actions...
USER: "Done filling in message."
AGENT: "Let's send, cancel or save message."
USER: "Please save message."
AGENT: [save message]

With plan recognition:
USER: "Let's work on email."
AGENT: "You have 5 unread messages."
USER: [select message from Chuck]
USER: [open message]
USER: [visit http://foo.bar.com]
...long lunch...
USER: "What was I doing?"
AGENT: "Working on email.
  Reading message from Chuck.
  Reading message in its own window.
  Done something using message.
  [visit http://foo.bar.com]
  Expecting to react to message (opt)."
USER: [start new message]
USER: [address message to Dan]
AGENT: "Why are you sending email to Dan? (a) reacting to message from Chuck (b) as a new goal"
USER: "I am reacting to message from Chuck."
...additional editing actions...
USER: "Done filling in message."
AGENT: "Let's send, cancel or save message."
USER: "Please save message."
AGENT: [save message]
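The clarification question in the dialogue arises when an observed action has more than one explanation. A much simplified sketch of that decision follows. It is not the algorithm of Lesh, Rich & Sidner: the recipe table, the function names and the three verdict labels are all hypothetical, and recipes are assumed to be acyclic.

```python
# ASSUMPTION: toy recipe table loosely modeled on the email dialogue.
RECIPES = {
    "read a message": ("select message", "open message"),
    "react to message": ("send email",),
    "send email": ("start new message", "fill in message"),
}


def contributes(goal, action, recipes):
    """True if action appears somewhere below goal in the recipe hierarchy."""
    steps = recipes.get(goal, ())
    return action in steps or any(contributes(s, action, recipes) for s in steps)


def recognize(action, live_goals, recipes):
    """Classify an observed action against the goals that are currently live."""
    candidates = [
        goal for goal in live_goals
        if goal == action or contributes(goal, action, recipes)
    ]
    if len(candidates) == 1:
        return ("extend", candidates)       # unambiguous: no explicit talk needed
    if candidates:
        return ("clarify", candidates)      # ambiguous: ask the user which one
    return ("unexplained", [])
```

In this toy model, starting a new message while both "react to message" and a fresh "send email" goal are live yields a "clarify" verdict, which corresponds to the agent's question about the email to Dan.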
MERL 23
Discourse State
Respond
Interpret
Choose
user event
agent event
Natural Language Processing
Generate
agenda
SLG
SLU
Task Model (Recipes)
MERL 24
Artificial Discourse Language
(Sidner, 1994)
(1) Formal semantics in terms of beliefs and intentions:
speaker: PROPOSE(b)
Believe(speaker, b)
Intend(speaker, Achieve(speaker, Believe(hearer, b)))
hearer: ACCEPT(b)
Believe(speaker, b)
Believe(hearer, b)
Believe(speaker, Believe(hearer, b))
Believe(hearer, Believe(speaker, b))
Believe(speaker, Believe(hearer, Believe(speaker, b)))
... (mutual belief)
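The belief formulas for PROPOSE and ACCEPT can be generated mechanically. The sketch below is illustrative only: formulas are plain strings, the function names are hypothetical, and mutual belief, which is an infinite set of nested beliefs, is approximated by cutting the nesting off at a chosen depth.

```python
def propose(speaker, hearer, b):
    """Formulas the slide associates with speaker: PROPOSE(b)."""
    return [
        f"Believe({speaker}, {b})",
        f"Intend({speaker}, Achieve({speaker}, Believe({hearer}, {b})))",
    ]


def accept(speaker, hearer, b, depth):
    """Nested beliefs after hearer: ACCEPT(b), truncated at the given depth."""
    formulas = []
    for first, second in ((speaker, hearer), (hearer, speaker)):
        for n in range(1, depth + 1):
            formula = b
            # Build from the innermost belief outward, alternating believers,
            # so that the outermost believer is `first`.
            for i in reversed(range(n)):
                who = first if i % 2 == 0 else second
                formula = f"Believe({who}, {formula})"
            formulas.append(formula)
    return formulas
```

Calling `accept("speaker", "hearer", "b", 3)` includes the deepest formula listed on the slide, `Believe(speaker, Believe(hearer, Believe(speaker, b)))`.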
MERL 25
Artificial Discourse Language
(2) Translation to and from natural languages:
PROPOSE(SHOULD(DoEmail(...)))
“Let’s work on email.”
utterance menu
speech recognition
natural language understanding
PROPOSE(SHOULD(DoEmail(...)))
“Let's work on email.”
text to speech
template substitution *
* also using SPUD (Stone, 2003); DeVault, Rich, Sidner (2004)
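Template substitution from the artificial discourse language to English might look like the sketch below. The template and gloss tables and the `render` function are hypothetical. Only the PROPOSE(SHOULD(DoEmail(...))) example and its English rendering come from the slide.

```python
# ASSUMPTION: illustrative tables, not Collagen's actual templates.
TEMPLATES = {
    ("PROPOSE", "SHOULD"): "Let's {gloss}.",
}

GLOSSES = {
    "DoEmail": "work on email",
}


def render(act, modality, task):
    """Fill the template for (act, modality) with the task's English gloss."""
    template = TEMPLATES[(act, modality)]
    return template.format(gloss=GLOSSES[task])
```

The resulting string can go to text to speech, or be shown as an entry in the utterance menu.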
MERL 26
Related Work (vs. Collagen)
• multiple participant collaboration (vs. two participants) e.g., Tambe et al.
• other theoretical models of collaboration (vs. SharedPlan) e.g., Levesque & Cohen, Carberry
• application-specific collaborative dialogue systems (vs. middleware) e.g., MERIT, MIRACLE, DenK, TRIPS
• other interface agents (without discourse model) e.g., Maes, and many others
• other agent-related middleware (without discourse model) e.g., PRS, and other BDI interpreters
*
* Recently evolving into CPS middleware