Bill Murdock 1/44
J. William Murdock
Intelligent Decision Aids Group
Navy Center for Applied Research in Artificial Intelligence
Naval Research Laboratory, Code 5515
Washington, DC
[email protected]
http://bill.murdocks.org

Presentation for University of Maryland CMSC 722 (AI Planning)

Planning in the Context of Model-Based Adaptation
Bill Murdock 2/44
Adaptation
• People adapt very well.
  – They figure out how to do new things.
  – If something doesn't work, they try something else.
  – They understand how and why they are doing things.
• Computer programs do not adapt very well.
  – They can only do what they are programmed for.
  – They keep making the same mistakes.
  – They have no understanding of themselves.

Can we make computer programs adapt?
Bill Murdock 3/44
REM (Reflective Evolutionary Mind)
• Operating environment for intelligent agents
• Provides support for adaptation to new functional requirements
• Uses functional models, generative planning, and reinforcement learning
• J. William Murdock and Ashok K. Goel
Bill Murdock 4/44
Planning and REM
• REM is not a planning system; it's an agent environment.
  – Agents in REM don't just decide what to do, they actually do things.
  – They may base their next action on the results of past actions.
  – Once they have taken an action, they may not be able to undo that action.
  – REM consists of two major modules: execution of agents and adaptation of agents.
• However, research on REM has involved many planning issues.
  – Agents in REM are represented as hierarchies of tasks and methods; thus execution of agents resembles HTN planning.
  – One mechanism for adaptation of agents is traditional generative planning.
  – Agents encoded in REM may perform planning.
  – An agent in REM can act in a deterministic, logically-defined simulated environment; such agents blur the distinction between deciding what to do and doing things.
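The HTN-like execution of a task/method hierarchy can be sketched roughly as follows. This is a toy illustration, not REM's actual representation: the dictionary structure, the `execute` function, and the web-browsing action names are all invented for the example.

```python
# Hypothetical sketch of HTN-style execution: a non-primitive task delegates
# to its method's subtasks; a primitive task acts in the environment.
def execute(task, primitives):
    """Execute a task, returning a trace of primitive actions performed."""
    if task["name"] in primitives:                # primitive task: do it
        return [primitives[task["name"]]()]
    trace = []
    for subtask in task["method"]:                # non-primitive: decompose
        trace.extend(execute(subtask, primitives))
    return trace

# Toy agent: "browse" decomposes into request, receive, display.
agent = {"name": "browse",
         "method": [{"name": "request"}, {"name": "receive"}, {"name": "display"}]}
actions = {"request": lambda: "requested",
           "receive": lambda: "received",
           "display": lambda: "displayed"}
print(execute(agent, actions))  # ['requested', 'received', 'displayed']
```

Unlike a pure planner, each primitive here actually runs when it is reached, which is why execution cannot simply be backtracked.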
Bill Murdock 5/44
Example: Web Browsing Agent
• A mock-up of web browsing software
• Based on Mosaic for X Windows, version 2.4
• Imitates not only behavior but also internal process and information of Mosaic 2.4

[Figure: a document of initially unknown type (ps, pdf, txt, or html) arriving at the browser]
Bill Murdock 6/44
Example: Disassembly and Assembly
• Software agent for disassembly in the domain of cameras
  – Information about cameras
  – Information about relevant actions
    • e.g., pulling, unscrewing, etc.
  – Information about disassembly processing
    • e.g., decide how to disconnect subsystems from each other and then decide how to disassemble those subsystems separately.
• Agent now needs to assemble a camera
Bill Murdock 7/44
TMK (Task-Method-Knowledge)
• TMK models provide the agent with knowledge of its own design.
• TMK encodes:
  – Tasks: functional specification / requirements and results
  – Methods: behavioral specification / composition and control
  – Knowledge: domain concepts and relations

[Figure: an Access task decomposed into Request, Receive, and Store subtasks, operating over knowledge of URLs, servers, documents, etc., spanning Remote and Local]
Bill Murdock 8/44
REM Reasoning Process

[Figure: an unimplemented task plus a set of input values passes through Adaptation, producing an ADAPTED method and an ADAPTED implemented task; an implemented task with a method and a set of input values passes through Execution, producing a trace and a set of output values]
Bill Murdock 9/44
Adaptation Process

[Figure: a task and a set of input values enter the adaptation process, which produces an ADAPTED method and an ADAPTED implemented task via one of four routes: Proactive Model Transfer (from a similar implemented task and its method), Failure-Driven Model Transfer (from an existing method and an execution trace), Generative Planning, or a Situator (for Q-Learning)]
Bill Murdock 10/44
Execution Process

[Figure: an implemented task, a method, and a set of input values drive a loop of Select Method, Select Next Task Within Method, and Execute Primitive Task, producing a trace and a set of output values]
Bill Murdock 11/44
Selection: Q-Learning
• Popular, simple form of reinforcement learning.
• In each state, each possible decision is assigned an estimate of its potential value ("Q").
• For each decision, preference is given to higher Q values.
• Each decision is reinforced, i.e., its Q value is altered based on the results of the actions.
• These results include actual success or failure and the Q values of next available decisions.
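The reinforcement rule described above can be sketched as a standard tabular Q-learning update. The state/action names and the learning-rate and discount values are illustrative, not REM's actual configuration.

```python
# Minimal tabular Q-learning update: move Q(state, action) toward
# reward + gamma * max over next decisions, as described above.
def q_update(Q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """Update and return the Q value for (state, action)."""
    best_next = max(Q.get((next_state, a), 0.0) for a in actions)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return Q[(state, action)]

Q = {}
# Reinforce a decision that led directly to success (reward = 1).
q_update(Q, "select-method", "method-A", 1.0, "done", ["method-A", "method-B"])
print(Q[("select-method", "method-A")])  # 0.5
```

Note how the update blends the immediate result (success/failure) with the Q values of the next available decisions, exactly the two result sources listed above.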
Bill Murdock 12/44
Q-Learning in REM
• Decisions are made for method selection and for selecting new transitions within a method.
• A decision state is a point in the reasoning (i.e., task, method) plus a set of all decisions which have been made in the past.
• Initial Q values are set to 0.
• Decides on the option with the highest Q value, or randomly selects an option with probabilities weighted by Q value (configurable).
• A decision receives positive reinforcement when it leads immediately (without any other decisions) to the success of the overall task.
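The two configurable selection regimes (greedy vs. Q-weighted random) can be sketched as follows; the option names and the small shift that keeps weights non-negative are illustrative choices, not details from REM.

```python
import random

# Sketch of the two selection regimes: pick the highest-Q option, or pick
# randomly with probabilities weighted by (shifted) Q values.
def choose(options, Q, greedy=True, rng=random):
    qs = [Q.get(o, 0.0) for o in options]
    if greedy:
        return options[qs.index(max(qs))]
    lo = min(qs)
    weights = [q - lo + 1e-6 for q in qs]   # shift so weights are positive
    return rng.choices(options, weights=weights)[0]

Q = {"method-A": 0.8, "method-B": 0.1}
print(choose(["method-A", "method-B"], Q))  # method-A
```

With all Q values initialized to 0, the weighted-random mode initially explores uniformly, then drifts toward reinforced decisions.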
Bill Murdock 13/44
Task-Method-Knowledge Language (TMKL)
• A new, powerful formalism of TMK developed for REM.
• Uses LOOM, a popular off-the-shelf knowledge representation framework: concepts, relations, etc.

REM models not only the tasks of the domain but also itself in TMKL.
Bill Murdock 14/44
Tasks in TMKL
• All tasks can have input & output parameter lists and given & makes conditions.
• A non-primitive task must have one or more methods which accomplish it.
• A primitive task must include one or more of the following: source code, a logical assertion, a specified output value.
• Unimplemented tasks have neither methods nor a primitive implementation.
Bill Murdock 15/44
TMKL Task
(define-task communicate-with-www-server
  :input (input-url)
  :output (server-reply)
  :makes (:and (document-at-location (value server-reply)
                                     (value input-url))
               (document-at-location (value server-reply)
                                     local-host))
  :by-mmethod (communicate-with-server-method))
Bill Murdock 16/44
Methods in TMKL
• Methods have provided and additional result conditions which specify incidental requirements and results.
• In addition, a method specifies a start transition for its processing control.
• Each transition specifies requirements for using it and a new state that it goes to.
• Each state has a task and a set of outgoing transitions.
Bill Murdock 17/44
Simple TMKL Method
(define-mmethod external-display
  :provided (:not (internal-display-tag (value server-tag)))
  :series (select-display-command
           compile-display-command
           execute-display-command))
Bill Murdock 18/44
Complex TMKL Method

(define-mmethod make-plan-node-children-mmethod
  :series (select-child-plan-node
           make-subplan-hierarchy
           add-plan-mappings
           set-plan-node-children))

(tell (transition>links make-plan-node-children-mmethod-t3
                        equivalent-plan-nodes
                        child-equivalent-plan-nodes)
      (transition>next make-plan-node-children-mmethod-t5
                       make-plan-node-children-mmethod-s1)
      (:create make-plan-node-children-terminate transition)
      (reasoning-state>transition make-plan-node-children-mmethod-s1
                                  make-plan-node-children-terminate)
      (:about make-plan-node-children-terminate
              (transition>provided
                '(terminal-addam-value (value child-plan-node)))))
Bill Murdock 19/44
Knowledge in TMKL
Foundation: LOOM
– Concepts, instances, relations
– Concepts and relations are instances and can have facts about them.

Knowledge representation in TMKL involves LOOM + some TMKL-specific reflective concepts and relations.
Bill Murdock 20/44
Some TMKL Knowledge Modeling

(defconcept location)
(defconcept computer :is-primitive location)
(defconcept url :is-primitive location :roles (text))
(defrelation text :range string :characteristics :single-valued)
(defrelation document-at-location :domain reply :range location)
(tell (external-state-relation document-at-location))
Bill Murdock 21/44
Sample Meta-Knowledge in TMKL
• relation characteristics
  – single-valued / multiple-valued
  – symmetric, commutative
• relations over relations
  – external/internal
  – state/definitional
• generic relations
  – same-as
  – instance-of
  – inverse-of
• concepts involving concepts
  – thing
  – meta-concept
  – concept
Bill Murdock 22/44
Web Browsing Agent
• Interactive domain: Web agent is affected by the user and by the network
• Dynamic domain: Both users and networks often change
• Knowledge-intensive domain: Documents, networks, servers, local software, etc.

Mock-up of a web browser: steps through the web-browsing process
Bill Murdock 23/44
Tasks and Methods of Web Agent

[Figure: task-method hierarchy. Process URL is accomplished by the Process URL Method, which decomposes into Communicate with WWW Server and Display File. The Communicate with WWW Server Method decomposes into Request from Server and Receive from Server. The Display File Method decomposes into Interpret Reply and Display Interpreted File, with External Display and Internal Display as alternatives; External Display comprises Select Display Command, Compile Display Command, and Execute Display Command, while Internal Display uses Execute Internal Display.]
Bill Murdock 24/44
Example: PDF Viewer
• The web agent is asked to browse the URL for a PDF file. It does not have any information about external viewers for PDF.
• Because the agent already has a task for browsing URLs, it is executed first.
• When the system fails, the user provides feedback indicating the correct viewer.
• Failure-Driven Model Transfer
Bill Murdock 25/44
Web Agent Adaptation

[Figure: the External Display method originally runs Select Display Command, Compile Display Command, and Execute Display Command in series. After adaptation, Select Display Command becomes a task with two alternatives — Select Display Command Base Method and Select Display Command Alternate Method — wrapping Select Display Command Base Task and Select Display Command Alternate Task.]
Bill Murdock 26/44
Physical Device Disassembly
• ADDAM: Legacy software agent for case-based, design-level disassembly planning and (simulated) execution
• Interactive: Agent connects to a user specifying goals and to a complex physical environment
• Dynamic: New designs and demands
• Knowledge-intensive: Designs, plans, etc.
Bill Murdock 27/44
Disassembly → Assembly
• A user with access to the ADDAM disassembly agent wishes to have this agent instead do assembly.
• ADDAM has no assembly method and thus must adapt first.
• Since assembly is similar to disassembly, REM selects Proactive Model Transfer.
Bill Murdock 28/44
Adaptation Using Relation Mapping
• Requires a model for an existing agent which has a task similar to the desired task.
  – e.g., disassembly is similar to assembly
• Specifically, the effects (:makes slot) of the two tasks must match except for one term, and that one term must be connected by a single relation.
  – e.g., disassembly produces a disassembled world state, assembly produces an assembled world state, and (inverse-of disassembled assembled) is known.
• Uses that relation to alter lower-level tasks and methods which relate to it.
  – as indicated by the TMK model
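The matching condition above can be sketched as a simple check: the two :makes conditions must be identical except for one term, and the differing terms must be linked by a known relation. This is a heavily simplified illustration (conditions as flat same-length lists, only inverse-of as the linking relation), not ADDAM's or REM's actual matcher.

```python
# Illustrative table of known inverse relations.
INVERSES = {"disassembled": "assembled", "assembled": "disassembled"}

def mapping_relation(makes_old, makes_new):
    """Return the differing (old, new) term pair if exactly one term differs
    and the terms are known inverses; otherwise None."""
    diffs = [(a, b) for a, b in zip(makes_old, makes_new) if a != b]
    if len(diffs) == 1 and INVERSES.get(diffs[0][0]) == diffs[0][1]:
        return diffs[0]
    return None

print(mapping_relation(["disassembled", "device"], ["assembled", "device"]))
# ('disassembled', 'assembled')
```

Once the single linking relation is found, it is this pair that drives the rewriting of the lower-level tasks and methods.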
Bill Murdock 29/44
Pieces of ADDAM which are key to Disassembly → Assembly

[Figure: Disassemble is accomplished by Plan Then Execute Disassembly, which decomposes into Adapt Disassembly Plan and Execute Plan. Adapt Disassembly Plan uses Topology Based Plan Adaptation, comprising Make Plan Hierarchy (with the Make Equivalent Plan Nodes Method: Make Equivalent Plan Node, Add Equivalent Plan Node) and Map Dependencies (Select Dependency, Assert Dependency). Execute Plan uses Hierarchical Plan Execution (Select Next Action, Execute Action).]
Bill Murdock 30/44
New Adapted Task in Disassembly → Assembly

[Figure: the adapted Assemble task mirrors the disassembly hierarchy, with COPIED versions of Plan Then Execute Disassembly, Adapt Disassembly Plan, Execute Plan, Topology Based Plan Adaptation, Make Plan Hierarchy, Make Equivalent Plan Nodes Method, Make Equivalent Plan Node, Add Equivalent Plan Node, Map Dependencies, Select Dependency, and Hierarchical Plan Execution (Select Next Action, Execute Action unchanged); Assert Dependency is INVERTED, and Inversion Task 1 and Inversion Task 2 are INSERTED.]
Bill Murdock 31/44
Task: Assert Dependency

Before:
define-task Assert-Dependency
  input: target-before-node, target-after-node
  asserts: (node-precedes (value target-before-node)
                          (value target-after-node))

After:
define-task Mapped-Assert-Dependency
  input: target-before-node, target-after-node
  asserts: (node-follows (value target-before-node)
                         (value target-after-node))
Bill Murdock 32/44
Task: Make Equivalent Plan Node

define-task make-equivalent-plan-node
  input: base-plan-node, parent-plan-node, equivalent-topology-node
  output: equivalent-plan-node
  makes: (:and
           (plan-node-parent (value equivalent-plan-node)
                             (value parent-plan-node))
           (plan-node-object (value equivalent-plan-node)
                             (value equivalent-topology-node))
           (:implies (plan-action (value base-plan-node))
                     (type-of-action (value equivalent-plan-node)
                                     (type-of-action (value base-plan-node)))))
  by procedure ...
Bill Murdock 33/44
Task: Inserted-Reversal-Task

define-task inserted-reversal-task
  input: equivalent-plan-node
  asserts: (type-of-action
             (value equivalent-plan-node)
             (inverse-of
               (type-of-action
                 (value equivalent-plan-node))))
Bill Murdock 34/44
Adaptation Using Generative Planning
• Does not require a pre-existing model. Only requires operators and a set of facts (initial state).
• Invokes Graphplan
  – Operators = those primitive tasks known to the agent which can be translated into Graphplan's operator language
  – Facts = known assertions which involve relations referred to by the operators
  – Goal = makes condition of main task
• Translates plan into more general method by turning specific objects into parameters
• Stores method for later reuse
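The generalization step — turning specific objects in the returned plan into parameters — can be sketched as follows. The plan format (operator name plus ground arguments) and the ?x0-style parameter names are illustrative assumptions, not REM's actual encoding.

```python
# Sketch of generalizing a ground plan into a reusable method: each distinct
# object is replaced, consistently, by a fresh parameter.
def generalize(plan):
    """Return (parameterized method, object-to-parameter mapping)."""
    params = {}
    method = []
    for op, *args in plan:
        new_args = [params.setdefault(a, f"?x{len(params)}") for a in args]
        method.append((op, *new_args))
    return method, params

plan = [("unscrew", "screw-3", "cover-1"), ("pull", "cover-1")]
print(generalize(plan)[0])
# [('unscrew', '?x0', '?x1'), ('pull', '?x1')]
```

Consistent substitution matters: "cover-1" maps to the same parameter in both steps, so the stored method preserves the plan's internal structure while shedding the specific objects.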
Bill Murdock 35/44
Adaptation Using Situated Learning
• Does not require a pre-existing model. Does not even require preconditions and postconditions of the operators.
• Creates a method which performs any action, checks to see if the desired state has been achieved, and if not, loops.
• All decision making is done by Q-learning during execution.
• Over time, the Q-learning mechanism learns to select actions which tend to lead to desirable results.
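The generated act-check-loop method can be sketched as below. Everything here is illustrative: the action names, the toy state (a set of performed actions), and the pluggable `choose` function standing in for the Q-learning decision maker described above.

```python
import random

# Sketch of the situated-learning method: pick any action, apply it, check
# whether the desired state holds, and loop if not.
def situated_method(actions, goal_holds, choose=random.choice, max_steps=1000):
    """Repeatedly choose and apply an action until the goal test passes."""
    state, steps = set(), 0
    while not goal_holds(state) and steps < max_steps:
        act = choose(actions)     # in REM this choice is made by Q-learning
        state.add(act)            # toy effect: just record the action
        steps += 1
    return goal_holds(state)

print(situated_method(["a", "b", "c"], lambda s: "c" in s))  # True
```

The method itself encodes no domain knowledge at all; all of the intelligence accumulates in the Q values as the loop's decisions are reinforced.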
Bill Murdock 37/44
Roof Assembly

[Figure: log-scale plot of elapsed time (seconds, 1 to 1,000,000) versus number of boards (1 to 7), comparing Situated Learning, Relation Mapping, and Generative Planning]
Bill Murdock 38/44
Modified Roof Assembly: No Conflicting Goals

[Figure: log-scale plot of elapsed time (seconds, 1 to 100,000) versus number of boards (1 to 7), comparing Situated Learning, Relation Mapping, and Generative Planning]
Bill Murdock 39/44
Computational Costs
• Reasoning about models incurs some costs.
  – For very easy problems, this overhead may not be justified.
  – For other problems, the benefits enormously outweigh these costs.

Models can localize planning and learning.
Bill Murdock 40/44
Knowledge Requirements
• Someone has to build an agent.
• Builder should know what that agent does and how it does it → can make model.
• Analyst may be able to understand builder's notes, etc. → can make model.
• Some evidence for this in the context of software engineering / architectural extraction.
Bill Murdock 41/44
REM and SHOP: Comparison
• Primitive actions in REM involve interacting with a (possibly simulated) external environment.
  – Performing the same action twice in the same situation may have different results; actions may be difficult or impossible to reverse.
  – Thus REM can't backtrack; if it fails, it has to start over with updated Q-values or symbolic adaptation.
  – SHOP actions can include calls to arbitrary LISP code. However, that code can't perform actions that aren't undone by backtracking.
  – Backtracking in SHOP is expensive, but not nearly as expensive as starting over.
• REM is able to build new methods and modify existing methods. Methods in SHOP do not change.
Bill Murdock 42/44
REM and SHOP: Implications
• On problems that REM and SHOP can both address, SHOP is probably much faster because it can backtrack.
• One can write SHOP methods which leave many more decisions unresolved than one can for REM.
• However, SHOP can't address problems where the effects of operators cannot be backtracked over.
• If SHOP does not have a method for some task, it always fails. REM can build new methods for tasks.
• If REM's methods for some task produce incorrect results, REM can modify those methods. SHOP cannot do this.
Bill Murdock 43/44
Current Work: AHEAD
• Theme: Analyzing hypotheses regarding asymmetric threats (e.g., criminals, terrorists).
  – Input: Hypotheses regarding a potential threat
  – Output: Argument for and/or against the hypotheses
• Technique: Analogy over functional models
  – An extension to TMKL will encode known behaviors for asymmetric threats and the purposes that the behaviors serve.
  – Analogical reasoning will enable retrieval and mapping of new hypotheses to existing models.
  – Models will provide arguments about how observed actions do or do not support the purposes of the hypothesized behavior.
• Naval Research Laboratory / DARPA Evidence Extraction and Link Discovery program
• David Aha, J. William Murdock, Len Breslow
Bill Murdock 44/44
Summary
• REM (Reflective Evolutionary Mind)
  – Operating environment for agents that adapt
• TMKL (Task-Method-Knowledge Language)
  – The language for agents in REM
  – Functional modeling language for encoding computational processes
• Adaptation
  – Some kinds of adaptation can be performed using specialized model-based techniques
  – Others require more generic planning & learning mechanisms (localized using models)