Extending Graphplan to handle Resources Presenter: Pham Van Cuong Department of Computer Science New...

Extending Graphplan to handle Resources

Presenter: Pham Van Cuong

Department of Computer Science

New Mexico State University

Motivation

• Planning with resources is ubiquitous in real life.

• Actions in the real world often need resources to execute.

Motivation

• Our approach to planning with resources is based on Graphplan, a well-known planning algorithm.

• Techniques that make Graphplan attractive:
  - Polynomial-time construction of the planning graph.
  - Use of mutexes to enhance the search for a plan.

Outline

• Graphplan background
  - STRIPS language
  - Planning graph and mutexes

• Planning with resources

• GPR - A Graphplan with Resources
  - Input language (PDDL 2.1 level 2)
  - Data structures
  - Mutexes
  - Algorithm
  - Experimental results

STRIPS language

• A STRIPS action a is specified by an expression of the form

    action a
      :Pre Pre(a)
      :Add Add(a)
      :Del Del(a)

  For example,

    Action Drive(Car,LC,EP)
      :Pre {At(Car,LC), Has-fuel(Car)}
      :Add {At(Car,EP)}
      :Del {At(Car,LC), Has-fuel(Car)}


• The result of executing an action a in a state s is Res(a,s) = (s ∪ Add(a)) \ Del(a) if a is executable in s, and Res(a,s) = ⊥ otherwise.

• The result of executing a sequence of actions [a_1, a_2, …, a_n] in a state s is defined by:
  - Res([ ], s) = s
  - Res([a_1, a_2, …, a_n], s) = Res(a_n, Res([a_1, a_2, …, a_{n-1}], s)), where Res(a, ⊥) = ⊥ for every a.


• A planning problem is a tuple <P,A,I,G>, where P is a finite set of fluents, A is a finite set of actions, I (the initial state) is a set of fluents, and G (the goal) is a finite set of fluent literals.

• Given a planning problem Q = <P,A,I,G>, a sequence of actions [a_1, a_2, …, a_n] is a solution (plan) to Q if Res([a_1, a_2, …, a_n], I) is defined and G holds in Res([a_1, a_2, …, a_n], I).
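The STRIPS definitions above can be sketched in a few lines of Python. This is only an illustration, not code from GPR: the names `Action`, `res`, `res_seq` and `is_plan` are made up here, `None` stands in for the undefined result ⊥, and the goal is restricted to positive fluents for brevity.

```python
from typing import FrozenSet, NamedTuple, Optional, Sequence

State = FrozenSet[str]  # a state is the set of fluents that are true


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def res(action: Action, state: Optional[State]) -> Optional[State]:
    """Res(a, s); None plays the role of the undefined state."""
    if state is None or not action.pre <= state:
        return None  # a is not executable in s
    return (state | action.add) - action.delete


def res_seq(actions: Sequence[Action], state: Optional[State]) -> Optional[State]:
    """Res([a_1, ..., a_n], s), computed by applying the actions left to right."""
    for a in actions:
        state = res(a, state)
    return state


def is_plan(actions: Sequence[Action], initial: State, goal: FrozenSet[str]) -> bool:
    """A plan must lead to a defined state in which every goal fluent holds."""
    final = res_seq(actions, initial)
    return final is not None and goal <= final


# The Drive example from the slide.
drive = Action(
    name="Drive(Car,LC,EP)",
    pre=frozenset({"At(Car,LC)", "Has-fuel(Car)"}),
    add=frozenset({"At(Car,EP)"}),
    delete=frozenset({"At(Car,LC)", "Has-fuel(Car)"}),
)
initial = frozenset({"At(Car,LC)", "Has-fuel(Car)"})
goal = frozenset({"At(Car,EP)"})
```

Driving twice in a row is undefined in this sketch, because the first Drive deletes both of its own preconditions.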

Graphplan - Planning Graph

• A planning graph is a directed, leveled graph with a set of nodes and a set of edges.

• The levels alternate between proposition levels and action levels.
  - The proposition levels contain proposition nodes, each of which is labeled with a fluent literal.
  - The action levels contain action nodes, each of which has an action as its label.

Graphplan - Planning Graph

• An edge represents the relation between an action and a proposition.

• At time t, action nodes are connected to:
  - their preconditions in proposition level t by precondition edges.
  - their add-effects and del-effects in proposition level t+1 by add-edges and del-edges, respectively.

Graphplan - mutex

• Two actions A and B are mutex with each other if:
  - action A deletes a precondition or an add-effect of B, or vice versa.
  - a precondition of action A and a precondition of action B are mutex in the previous proposition level.

• Two propositions p and q are mutex if:
  - all ways of creating p are mutex with all ways of creating q.
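The two mutex rules above can be sketched as follows. The function and type names are hypothetical, actions and propositions are identified by plain strings, and a mutex is stored as a two-element `frozenset`. The slides do not mention persistence (no-op) actions; if they are wanted, the caller is assumed to include them in `actions`.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, NamedTuple, Sequence, Set


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


Pair = FrozenSet[str]  # an unordered pair of action names or of propositions


def action_mutexes(actions: Sequence[Action], prop_mutex: Set[Pair]) -> Set[Pair]:
    """Pairs of actions in one level that are mutex under the two rules."""
    mutex: Set[Pair] = set()
    for a, b in combinations(actions, 2):
        # Rule 1: one action deletes a precondition or an add-effect of the other.
        interferes = bool(a.delete & (b.pre | b.add)) or bool(b.delete & (a.pre | a.add))
        # Rule 2: some pair of preconditions is mutex in the previous level.
        competing = any(frozenset({p, q}) in prop_mutex for p in a.pre for q in b.pre)
        if interferes or competing:
            mutex.add(frozenset({a.name, b.name}))
    return mutex


def prop_mutexes(actions: Sequence[Action], act_mutex: Set[Pair]) -> Set[Pair]:
    """Pairs of propositions whose producers are all pairwise mutex."""
    producers: Dict[str, List[str]] = {}
    for a in actions:
        for p in a.add:
            producers.setdefault(p, []).append(a.name)
    mutex: Set[Pair] = set()
    for p, q in combinations(sorted(producers), 2):
        # An action producing both p and q yields a one-element set, which is
        # never in act_mutex, so p and q are correctly reported as compatible.
        if all(frozenset({x, y}) in act_mutex for x in producers[p] for y in producers[q]):
            mutex.add(frozenset({p, q}))
    return mutex


# Toy level: "move" deletes at_A, which "load" needs.
load = Action("load", frozenset({"at_A"}), frozenset({"in"}), frozenset())
move = Action("move", frozenset({"at_A"}), frozenset({"at_B"}), frozenset({"at_A"}))
act_mutex = action_mutexes([load, move], set())
prop_mutex = prop_mutexes([load, move], act_mutex)
```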

Current approaches to Planning with Resources - some characteristics

• State-based search (Metric FF, LGP…).

• Using a heuristic function to guide the search (Sapa, TP4…).

• Forward-chaining approach (TLPlan…).

• No existing planner uses the mutexes of the planning graph to guide the search.

GPR - A Graphplan with Resources

• Based on the Graphplan algorithm.

• Uses mutexes to direct the search for a plan.

• Generates a concurrent plan.

GPR - Input language (syntax)

• F = F_B ∪ F_N, where F_B is the set of boolean fluents and F_N is the set of numeric fluents.

• An assignment is of the form f=v, where f ∈ F and v ∈ D_f. A set δ of assignments is:
  - consistent if for every fluent f ∈ F there exists at most one assignment of the form f=v in δ.
  - complete if for every fluent f ∈ F there exists at least one assignment of the form f=v in δ.
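As a small illustration of the two properties, a set of assignments can be modelled as a set of (fluent, value) pairs. The function names and the example fluents below are assumptions made for this sketch only.

```python
from collections import Counter
from typing import Hashable, Iterable, Set, Tuple

Assignment = Tuple[str, Hashable]  # (f, v) stands for the assignment f=v


def is_consistent(delta: Set[Assignment], fluents: Iterable[str]) -> bool:
    """At most one assignment f=v per fluent f."""
    counts = Counter(f for f, _ in delta)
    return all(counts[f] <= 1 for f in fluents)


def is_complete(delta: Set[Assignment], fluents: Iterable[str]) -> bool:
    """At least one assignment f=v per fluent f."""
    counts = Counter(f for f, _ in delta)
    return all(counts[f] >= 1 for f in fluents)


fluents = {"At(plane,EP)", "fuel(plane)"}
state_like = {("At(plane,EP)", True), ("fuel(plane)", 500)}
clashing = {("fuel(plane)", 500), ("fuel(plane)", 300)}
```

A set that is both consistent and complete assigns exactly one value to every fluent, which is why such a set can also be stored as a dictionary from fluents to values.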


• A numeric constraint is a triple (exp1, comp, exp2), where comp ∈ {>, =, <, ≥, ≤} is a comparator.

• A numeric effect is a tuple of the form (f, aop, exp), where f ∈ F_N and aop ∈ {assign, increase, decrease, scale-up, scale-down} is an assignment operator.

• A condition con is a pair (b(con), n(con)), where b(con) ⊆ F_B and n(con) is a set of numeric constraints.


• An action a is a pair (Pre(a), Eff(a)), where
  - Pre(a) is a condition;
  - Eff(a) is a triple (b-add(Eff(a)), b-del(Eff(a)), n(Eff(a))), where b-add(Eff(a)), b-del(Eff(a)) ⊆ F_B and n(Eff(a)) is a set of numeric effects that does not contain two numeric effects (f,aop,exp) and (f,aop',exp') on the same fluent f.

  For example,

    action FLY(plane,EP,LAX)
      Pre: ({At(plane,EP)}, {(> (fuel plane) 300)})
      Eff: ({At(plane,LAX)}, {At(plane,EP)}, {(decrease (fuel plane) 200)})

GPR - Input language (semantics)

• A state is a consistent and complete set of assignments.

• An assignment f=v holds in a state s, denoted by s ⊨ (f=v), if (f=v) ∈ s. A set of assignments δ holds in s, denoted by s ⊨ δ, if s ⊨ (f=v) for all (f=v) ∈ δ.

• A numeric constraint (exp1, comp, exp2) holds in a state s, denoted by s ⊨ (exp1, comp, exp2), if both exp1 and exp2 are defined in s and exp1 comp exp2 holds.

• A set of numeric constraints C holds in a state s if s ⊨ c for every numeric constraint c ∈ C.

• A condition con = (b(con), n(con)) holds in a state s (s ⊨ con) if s ⊨ b(con) and s ⊨ n(con).

• An action a is executable in a state s if s ⊨ Pre(a).
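The following sketch checks whether a condition holds in a state, using the FLY precondition from the syntax slide. It makes several simplifying assumptions that are not in the slides: a state is a Python dictionary, an expression is either a number or the name of a numeric fluent, the strings ">=" and "<=" stand for ≥ and ≤, and all function names are hypothetical.

```python
import operator
from typing import Dict, FrozenSet, List, Optional, Tuple, Union

Value = Union[bool, float]
State = Dict[str, Value]       # dictionary keys make the state consistent by construction
Expr = Union[str, float]       # assumption: a numeric fluent name or a constant
Constraint = Tuple[Expr, str, Expr]
Condition = Tuple[FrozenSet[str], List[Constraint]]  # (b(con), n(con))

COMPARATORS = {">": operator.gt, "=": operator.eq, "<": operator.lt,
               ">=": operator.ge, "<=": operator.le}


def evaluate(exp: Expr, s: State) -> Optional[Value]:
    """Value of exp in s, or None when exp is not defined in s."""
    if isinstance(exp, str):
        return s.get(exp)
    return exp


def constraint_holds(c: Constraint, s: State) -> bool:
    exp1, comp, exp2 = c
    v1, v2 = evaluate(exp1, s), evaluate(exp2, s)
    # Both sides must be defined before the comparison is made.
    return v1 is not None and v2 is not None and COMPARATORS[comp](v1, v2)


def condition_holds(con: Condition, s: State) -> bool:
    b_con, n_con = con
    booleans_hold = all(s.get(f) is True for f in b_con)
    return booleans_hold and all(constraint_holds(c, s) for c in n_con)


def executable(pre: Condition, s: State) -> bool:
    """An action is executable in s if its precondition holds in s."""
    return condition_holds(pre, s)


# Pre(FLY(plane,EP,LAX)) = ({At(plane,EP)}, {(> (fuel plane) 300)})
fly_pre: Condition = (frozenset({"At(plane,EP)"}), [("fuel(plane)", ">", 300)])
state: State = {"At(plane,EP)": True, "At(plane,LAX)": False, "fuel(plane)": 500}
```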

GPR - Input language (semantics)

• A state transition Res(a,s), if a is an executable action in s, contains the following assignments:
  - f=true if f ∈ b-add(Eff(a))
  - f=false if f ∈ b-del(Eff(a))
  - f=s(exp) if (f,assign,exp) ∈ n(Eff(a))
  - f=s(f)+s(exp) if (f,increase,exp) ∈ n(Eff(a))
  - f=s(f)-s(exp) if (f,decrease,exp) ∈ n(Eff(a))
  - f=s(f)*s(exp) if (f,scale-up,exp) ∈ n(Eff(a))
  - f=s(f)/s(exp) if (f,scale-down,exp) ∈ n(Eff(a))
  - f=s(f) if f does not appear on the left-hand side of any of the assignments above.

• If a is not executable in s, then Res(a,s) = ⊥ (or undefined).

• For a sequence [a_1, a_2, …, a_n] of actions, Res([a_1, a_2, …, a_n], s) = Res(a_n, Res([a_1, a_2, …, a_{n-1}], s)) and Res([ ], s) = s, where Res(a, ⊥) = ⊥ for every a.
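A minimal sketch of this transition function follows, applied to the FLY example. It is an illustration under stated assumptions rather than GPR's code: states are dictionaries, `None` stands for ⊥, expressions are numbers or fluent names, and the precondition is passed in as a plain Python predicate (the hypothetical `pre` entry) to keep the block short. When a fluent is in both b-add and b-del, this sketch lets the delete win, mirroring the STRIPS definition.

```python
import operator
from typing import Callable, Dict, List, Optional, Sequence, Tuple, Union

Value = Union[bool, float]
State = Dict[str, Value]
Expr = Union[str, float]  # assumption: a numeric fluent name or a constant

# One function per assignment operator: (old value s(f), s(exp)) -> new value.
AOPS: Dict[str, Callable[[float, float], float]] = {
    "assign": lambda old, v: v,
    "increase": operator.add,
    "decrease": operator.sub,
    "scale-up": operator.mul,
    "scale-down": operator.truediv,
}


def value(exp: Expr, s: State) -> Value:
    return s[exp] if isinstance(exp, str) else exp


def res(action: dict, s: Optional[State]) -> Optional[State]:
    """Res(a, s); returns None when s is undefined or a is not executable."""
    if s is None or not action["pre"](s):
        return None
    nxt = dict(s)  # fluents not touched by any effect keep their value s(f)
    for f in action["add"]:
        nxt[f] = True
    for f in action["del"]:
        nxt[f] = False
    for f, aop, exp in action["num"]:
        # Right-hand sides are evaluated in the old state s.
        nxt[f] = AOPS[aop](s[f], value(exp, s))
    return nxt


def res_seq(actions: Sequence[dict], s: Optional[State]) -> Optional[State]:
    for a in actions:
        s = res(a, s)
    return s


fly = {
    "pre": lambda s: s.get("At(plane,EP)") is True and s.get("fuel(plane)", 0) > 300,
    "add": ["At(plane,LAX)"],
    "del": ["At(plane,EP)"],
    "num": [("fuel(plane)", "decrease", 200)],
}
state: State = {"At(plane,EP)": True, "At(plane,LAX)": False, "fuel(plane)": 500}
after = res(fly, state)
```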


• A planning problem is a tuple (F,A,I,G), where F = F_B ∪ F_N, A is a finite set of actions, I (the initial state) is a set of assignments, and G (the goal) is a condition.

• A sequence [a_1, a_2, …, a_n] of actions is a solution (plan) to a numeric planning problem if Res([a_1, a_2, …, a_n], I) is defined and Res([a_1, a_2, …, a_n], I) ⊨ G.

• The semantics can be extended to allow parallel actions.

GPR - Planning Graph

• A directed, leveled graph with a set of nodes and a set of edges.

• The levels alternate between fluent levels and action levels.

• The action levels contain action nodes, each of which is labeled with an action that is executable in that level.

• The fluent levels contain fluent nodes, each of which is labeled with an assignment.

GPR - Planning Graph

• An edge represents the relation between an action and an assignment.

• At time t, each action node a is connected:
  - by incoming edges, to the assignments f=v in fluent level t that make Pre(a) hold, denoted by Pre(a,t) ⊨ (f=v).
  - by outgoing edges, to the assignments created by a's effects in fluent level t+1.

GPR - mutex between actions A and B at level t

• Inconsistent effects:
  - An add-effect of A is negated by B, or vice versa.

• Interference:
  - One of the del-effects of A is a precondition of B, or vice versa.
  - There exist two mutexed assignments f_1=v_1 and f_2=v_2 in level t such that Pre(A,t) ⊨ (f_1=v_1) and Pre(B,t) ⊨ (f_2=v_2).
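The three tests above can be sketched as one predicate. Several things here are assumptions for illustration: "negated by B" is read as "deleted by B", only the boolean parts of preconditions and effects are considered, and each action carries a hypothetical `support` field holding the level-t assignments f=v with Pre(A,t) ⊨ (f=v).

```python
from typing import FrozenSet, Hashable, NamedTuple, Set, Tuple

Assignment = Tuple[str, Hashable]  # (f, v) stands for the assignment f=v


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]              # boolean part of Pre(A)
    add: FrozenSet[str]              # b-add(Eff(A))
    delete: FrozenSet[str]           # b-del(Eff(A))
    support: FrozenSet[Assignment]   # level-t assignments that make Pre(A) hold


def actions_mutex(a: Action, b: Action,
                  assignment_mutex: Set[FrozenSet[Assignment]]) -> bool:
    # Inconsistent effects: an add-effect of one action is deleted by the other.
    inconsistent = bool(a.add & b.delete) or bool(b.add & a.delete)
    # Interference (1): a del-effect of one action is a precondition of the other.
    deletes_pre = bool(a.delete & b.pre) or bool(b.delete & a.pre)
    # Interference (2): the actions rely on assignments that are mutex in level t.
    competing = any(frozenset({x, y}) in assignment_mutex
                    for x in a.support for y in b.support)
    return inconsistent or deletes_pre or competing


at_ep = ("At(plane,EP)", True)
fly = Action("fly", frozenset({"At(plane,EP)"}), frozenset({"At(plane,LAX)"}),
             frozenset({"At(plane,EP)"}), frozenset({at_ep, ("fuel(plane)", 500)}))
board = Action("board", frozenset({"At(plane,EP)"}), frozenset({"In(p,plane)"}),
               frozenset(), frozenset({at_ep}))
refuel = Action("refuel", frozenset({"At(plane,EP)"}), frozenset(),
                frozenset(), frozenset({at_ep}))
```

In this toy level, fly and board are mutex because fly deletes At(plane,EP), which board requires; board and refuel are compatible as long as no mutex assignments are supplied.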

GPR - mutex between assignments

• Two assignments f_1=v_1 and f_2=v_2 are mutex at time t if:
  - f_1 = f_2 and v_1 ≠ v_2; or
  - all ways of creating f_1=v_1 are mutex with all ways of creating f_2=v_2.
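A sketch of the assignment-mutex test is given below. The `producers` map (assignment to the names of the actions that create it) and the `action_mutex` set are hypothetical inputs that a planner would compute from the previous action level. Treating an assignment with no recorded producer as non-mutex is a choice made for this sketch, not something the slides specify.

```python
from typing import Dict, FrozenSet, Hashable, List, Set, Tuple

Assignment = Tuple[str, Hashable]  # (f, v) stands for the assignment f=v


def assignments_mutex(a1: Assignment, a2: Assignment,
                      producers: Dict[Assignment, List[str]],
                      action_mutex: Set[FrozenSet[str]]) -> bool:
    (f1, v1), (f2, v2) = a1, a2
    # Rule 1: the same fluent cannot take two different values at time t.
    if f1 == f2 and v1 != v2:
        return True
    ways1, ways2 = producers.get(a1, []), producers.get(a2, [])
    if not ways1 or not ways2:
        return False  # assumption: no known producer gives no evidence of a mutex
    # Rule 2: every way of creating a1 is mutex with every way of creating a2.
    # An action that creates both yields a one-element set, never in action_mutex.
    return all(frozenset({x, y}) in action_mutex for x in ways1 for y in ways2)


producers = {
    ("at", "EP"): ["stay"],
    ("at", "LAX"): ["fly"],
    ("fuel", 300): ["fly"],
    ("loaded", True): ["load"],
}
action_mutex = {frozenset({"fly", "load"})}
```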

GPR - Algorithm description

• The GPR algorithm alternates between two phases: constructing the planning graph and extracting a solution.

• The planning graph is extended until it levels off or a valid plan is found.

• The solution-extraction phase starts whenever the goal is reached.

GPR - Constructing the planning graph

• Fluent level 1 consists of all assignments in the initial state.

• Once an executable action A is found at time t, GPR does the following:
  - creates an action node in level t and labels it with A.
  - for each assignment f=v in fluent level t such that Pre(A,t) ⊨ (f=v), adds an edge connecting it to A.
  - for each assignment f=v created by some effect of A, creates a fluent node with the label f=v, adds it to fluent level t+1, and then inserts an edge from A to this node.
  - finds all action nodes in level t that are mutex with A and updates the mutex list.
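The four steps can be sketched as a single level-expansion function. This is a deliberately simplified model and not GPR's data structure: each action lists the exact assignments it needs (`pre`) and creates (`eff`), the mutex test is supplied by the caller as a hypothetical predicate `are_mutex`, and carrying unchanged assignments forward to level t+1 is left out because the slides do not describe it.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, List, NamedTuple, Set, Tuple

Assignment = Tuple[str, Hashable]  # (f, v) stands for the assignment f=v


class Action(NamedTuple):
    name: str
    pre: FrozenSet[Assignment]   # assignments that make Pre(A) hold (simplified)
    eff: FrozenSet[Assignment]   # assignments created by A's effects (simplified)


class Level(NamedTuple):
    fluents: Set[Assignment]
    actions: List[Action]
    in_edges: Set[Tuple[Assignment, str]]    # assignment -> action
    out_edges: Set[Tuple[str, Assignment]]   # action -> assignment
    action_mutex: Set[FrozenSet[str]]


def expand(fluents_t: Iterable[Assignment], candidates: Iterable[Action],
           are_mutex: Callable[[Action, Action], bool]) -> Tuple[Level, Set[Assignment]]:
    """Build action level t and return it with the assignments of fluent level t+1."""
    level = Level(set(fluents_t), [], set(), set(), set())
    next_fluents: Set[Assignment] = set()
    for a in candidates:
        if not a.pre <= level.fluents:
            continue  # A is not executable at time t
        # Incoming edges from the supporting assignments in fluent level t.
        level.in_edges.update((fv, a.name) for fv in a.pre)
        # Fluent nodes in level t+1 and outgoing edges to them.
        for fv in a.eff:
            next_fluents.add(fv)
            level.out_edges.add((a.name, fv))
        # Update the mutex list against the action nodes already in level t.
        for other in level.actions:
            if are_mutex(a, other):
                level.action_mutex.add(frozenset({a.name, other.name}))
        level.actions.append(a)
    return level, next_fluents


def clash(a: Action, b: Action) -> bool:
    """Toy mutex test: the two actions give one fluent two different values."""
    return any(f1 == f2 and v1 != v2 for f1, v1 in a.eff for f2, v2 in b.eff)


fly = Action("fly", frozenset({("at", "EP"), ("fuel", 500)}),
             frozenset({("at", "LAX"), ("fuel", 300)}))
refuel = Action("refuel", frozenset({("at", "EP")}), frozenset({("fuel", 1000)}))
drive = Action("drive", frozenset({("at", "LC")}), frozenset({("at", "EP")}))
level, next_fluents = expand({("at", "EP"), ("fuel", 500)}, [fly, refuel, drive], clash)
```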

GPR - Extracting a solution

• Given a goal G_t at time t, GPR non-deterministically selects a set of actions A_{t-1} and computes the goal G_{t-1} as follows:
  - A_{t-1} is a set of actions in level t-1 such that for each g ∈ G_t there exists an edge from some a ∈ A_{t-1} to g.
  - G_{t-1} is the set of assignments in level t-1 such that every action a ∈ A_{t-1} is executable in G_{t-1}.

• If t=0 and G_t ⊆ I, a solution has been found.
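The backward step can be sketched with backtracking in place of the non-deterministic choice. The inputs are hypothetical tables a planner would read off the graph: `producers[t][g]` lists the actions in level t with an edge to g, `pre[a]` is the set of assignments action a needs, and `mutex[t]` holds the mutex action pairs of level t. Levels are numbered from 0 here, following the t=0 test on the slide, and rejecting mutex action sets is a standard Graphplan choice assumed for this sketch.

```python
from itertools import product
from typing import Dict, FrozenSet, Hashable, List, Optional, Set, Tuple

Assignment = Tuple[str, Hashable]  # (f, v) stands for the assignment f=v


def extract(goal: FrozenSet[Assignment], t: int,
            producers: List[Dict[Assignment, List[str]]],
            pre: Dict[str, FrozenSet[Assignment]],
            mutex: List[Set[FrozenSet[str]]],
            initial: FrozenSet[Assignment]) -> Optional[List[List[str]]]:
    """Return one action set per time step, or None if no plan is found."""
    if t == 0:
        return [] if goal <= initial else None
    goals = sorted(goal, key=repr)
    # One list of candidate producers per goal assignment.
    options = [producers[t - 1].get(g, []) for g in goals]
    for choice in product(*options):  # backtracking over the choices of A_{t-1}
        chosen = set(choice)
        if any(frozenset({x, y}) in mutex[t - 1]
               for x in chosen for y in chosen if x != y):
            continue  # the selected actions cannot run in the same step
        # G_{t-1}: everything the chosen actions need in order to be executable.
        subgoal = frozenset().union(*(pre[a] for a in chosen))
        plan = extract(subgoal, t - 1, producers, pre, mutex, initial)
        if plan is not None:
            return plan + [sorted(chosen)]
    return None


initial = frozenset({("at", "EP"), ("fuel", 500)})
producers = [{("at", "LAX"): ["fly"], ("fuel", 300): ["fly"]}]
pre = {"fly": frozenset({("at", "EP"), ("fuel", 500)})}
mutex: List[Set[FrozenSet[str]]] = [set()]
plan = extract(frozenset({("at", "LAX")}), 1, producers, pre, mutex, initial)
```

Because each returned step is a set of pairwise non-mutex actions, the result is a concurrent plan in the sense used on the slides.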

GPR - Experimental Results

• GPR generates a concurrent plan for the Rocket domain in 3 time steps with.

• For the Rocket domain with renewable resources, GPR also generates a good-quality plan.

• GPR is available at www.cs.nmsu.edu/~cvan

Future work

• GPR is the first step towards creating a planner for domains with resources. It can be improved by:
  - Finding a plan that satisfies some constraints (e.g. minimal resource consumption…).
  - Considering actions with duration.

Thank you!
