Post on 20-Dec-2015
Over-subscription Planning with Numeric Goals
J. Benton
Computer Sci. & Eng. Dept.
Arizona State University
Tempe, AZ
Minh Do
Palo Alto Research Center (PARC)
Palo Alto, CA
Subbarao Kambhampati
Computer Sci. & Eng. Dept.
Arizona State University
Tempe, AZ
Over-subscription Planning
Goals are optional & have utility
Actions have cost
Maximize utility - cost (“Benefit”)
Rovers Example
Initial: At A
Goals: Soil_Sample @ B & C
[Figure: waypoints A, B and C; labels cost = 200, cost = 500, cost = 300, Util = 500, Util = 200; values -100, 300, 200, 300]
[“The Mystery Talk”, Smith 2003]
Motivation
Numeric goals also have utility
  More soil gives better instrument reading
  More packages give more profit
Cost for achieving varying values differs
  More soil requires more weight
  More packages require more deliveries
Objective
Want more/less G = soil-sample ∈ [2,4]
U(G) = (* (soil-sample) 2)
Challenge – A measurable level of numeric goal achievement: degree of satisfaction
Collect (cost = 1): 1 gram
Collect (cost = 2): 1 gram. 2 grams of soil collected: action cost = 3, util = 2*2 = 4, benefit = 4 - 3 = 1
Collect (cost = 3): 1 gram. 3 grams of soil collected: action cost = 6, util = 3*2 = 6, benefit = 6 - 6 = 0
Satisfy numeric goals at different values to give varying utility
[Figure: benefit plotted against value, with the best-benefit point marked]
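The arithmetic on this slide can be replayed in a few lines. This is a minimal sketch, not the planner's code: the function names are made up for illustration, and treating values outside the goal range [2,4] as worth zero utility is an assumption.

```python
# Costs of the three Collect actions shown on the slide; each collects 1 gram.
COLLECT_COSTS = [1, 2, 3]


def utility(grams: int) -> int:
    """U(G) = (* (soil-sample) 2) inside the goal range [2,4].

    Returning 0 outside the range is an assumption of this sketch.
    """
    return 2 * grams if 2 <= grams <= 4 else 0


def total_cost(grams: int) -> int:
    """Summed cost of the Collect actions needed for `grams` grams."""
    return sum(COLLECT_COSTS[:grams])


def benefit(grams: int) -> int:
    """Benefit = utility - cost."""
    return utility(grams) - total_cost(grams)


# Benefit for every amount reachable with the three actions (0 to 3 grams).
benefits = {g: benefit(g) for g in range(len(COLLECT_COSTS) + 1)}
best_amount = max(benefits, key=benefits.get)

print(benefits)     # {0: 0, 1: -1, 2: 1, 3: 0}
print(best_amount)  # 2
```

Collecting 2 grams gives the best benefit here; the third gram costs as much as the utility it adds.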
Modeling Numeric Goal Over-subscription
Achieve with a given utility
Specify a goal range
U(G) = (* (soil-sample) 2)
G = soil-sample ∈ [2,4]
[Figure: Utility (0 to 8) plotted against Sample (0 to 4)]
1. Fixed utility for satisfying level
2. Linear
3. Hard bounds (infinity on range OK)
4. Model as a separate goal
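One way to picture a goal range paired with a utility function is the sketch below. The class `NumericGoal` and its fields are hypothetical names, not the planner's input language, and giving zero utility to values outside the range is an assumed reading of "hard bounds".

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class NumericGoal:
    """A goal range on one variable plus a utility function (hypothetical model)."""
    variable: str
    lower: float
    upper: float  # math.inf is allowed as a range end
    utility_fn: Callable[[float], float]

    def satisfied(self, value: float) -> bool:
        return self.lower <= value <= self.upper

    def utility(self, value: float) -> float:
        # Assumption: a value outside the hard bounds earns no utility.
        return self.utility_fn(value) if self.satisfied(value) else 0.0


# Linear utility from the slide: G = soil-sample in [2,4], U(G) = 2 * soil-sample.
linear_goal = NumericGoal("soil-sample", 2, 4, lambda v: 2 * v)

# Fixed utility for reaching a satisfying level; the value 5.0 is made up.
fixed_goal = NumericGoal("soil-sample", 2, math.inf, lambda v: 5.0)

print(linear_goal.utility(3))   # 6
print(linear_goal.utility(1))   # 0.0
print(fixed_goal.utility(100))  # 5.0
```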
SapaMps Architecture
[Figure: search loop. Input: over-subscribed planning problem, initial state. Queue of time-stamped states. Select state with best f-value. Better benefit plan? (Yes: output plan; No). Generate states by applying actions. Build RTPG, propagate cost, find utility.]
Anytime A* Search
Based on SapaPS
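The loop in the architecture figure (select the best f-value, report any plan with better benefit, expand) can be sketched on a toy problem. This is an illustration of an anytime best-first search under stated assumptions, not the SapaMps implementation: states are plain `(grams, cost_so_far)` pairs rather than time-stamped states, the k-th Collect action is assumed to cost k, and the optimistic f-value simply ignores all future cost instead of using a planning-graph heuristic.

```python
import heapq
from itertools import count

MAX_UTILITY = 8  # best utility in the toy domain: 2 * 4 grams


def utility(grams):
    # Linear utility inside the goal range [2,4]; 0 outside (assumption).
    return 2 * grams if 2 <= grams <= 4 else 0


def benefit(state):
    grams, cost_so_far = state
    return utility(grams) - cost_so_far


def successors(state):
    # One action: collect another gram; the k-th collect costs k (assumption).
    grams, cost_so_far = state
    if grams < 4:
        yield (grams + 1, cost_so_far + grams + 1)


def optimistic_f(state):
    # Never underestimates the benefit reachable from `state`:
    # assumes the maximum utility is reached at no further cost.
    _, cost_so_far = state
    return MAX_UTILITY - cost_so_far


def anytime_search(initial):
    """Yield (benefit, state) every time a better-benefit plan is found."""
    best = float("-inf")
    tie = count()  # tie-breaker so the heap never compares states directly
    queue = [(-optimistic_f(initial), next(tie), initial)]
    while queue:
        neg_f, _, state = heapq.heappop(queue)
        if -neg_f <= best:
            break  # nothing left in the queue can beat the best plan
        if benefit(state) > best:
            best = benefit(state)
            yield best, state
        for nxt in successors(state):
            if optimistic_f(nxt) > best:
                heapq.heappush(queue, (-optimistic_f(nxt), next(tie), nxt))


plans = list(anytime_search((0, 0)))
print(plans)  # [(0, (0, 0)), (1, (2, 3))]
```

The empty plan (benefit 0) is reported first, then the 2-gram plan with benefit 1; the search stops once no queued state has an optimistic value above the best plan found.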
Challenge – Heuristic Support
Heuristic needs to…
  Estimate cost of achieving variable values
  Find the utility of the values
Extend current state-of-the-art techniques
  Planning graph structure
  Reachability estimation
  Cost propagation
Challenge – Find Goal Achievement Cost
Propagate reachable values with cost
[Figure: planning graph over time points 0, 1, 2, 2.5 with actions Move(Waypoint1), Sample_Soil, Sample_Soil and Communicate]
v1: [0,0] [0,1] [0,2] (a range of possible values)
cost( ): 0 1 2 (cost of achieving each value bound)
Cost Propagation on Variable Bounds
Bound cost dependent upon
  action cost
  previous bound cost - current bound cost adds to the next
  Cost of all bounds in expressions
Sample_Soil, Effect: v1 += 1
v1: [0,0] [0,1] [0,2]
Cost(v1=2) = C(Sample_Soil) + Cost(v1=1)
Sample_Soil, Effect: v1 += v2
v1: [0,0] [0,3] [0,6]
v2: [0,3]
Cost(v1=6) = C(Sample_Soil) + Cost(v2=3) + Cost(v1=3)
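The two propagation rules above can be replayed with a small sketch. The names `Bound`, `apply_increase` and `propagate` are illustrative only; C(Sample_Soil) = 1 and Cost(v2=3) = 2 are assumed values, and constants are assumed to cost nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bound:
    value: float  # upper bound on the variable
    cost: float   # estimated cost of reaching that upper bound


def apply_increase(current: Bound, action_cost: float, operand: Bound) -> Bound:
    """Relaxed effect `v += operand`.

    New bound cost = action cost + previous bound cost
                     + cost of the bound used in the expression.
    """
    return Bound(
        value=current.value + operand.value,
        cost=action_cost + current.cost + operand.cost,
    )


def propagate(start: Bound, action_cost: float, operand: Bound, steps: int):
    """Apply the same increasing effect `steps` times, keeping every level."""
    levels = [start]
    for _ in range(steps):
        levels.append(apply_increase(levels[-1], action_cost, operand))
    return levels


# Effect v1 += 1: the constant 1 is modeled as a zero-cost bound (assumption).
constant_one = Bound(value=1, cost=0)
levels_const = propagate(Bound(0, 0), action_cost=1, operand=constant_one, steps=2)
print([(b.value, b.cost) for b in levels_const])  # [(0, 0), (1, 1), (2, 2)]

# Effect v1 += v2, with v2's upper bound 3 reached at an assumed cost of 2.
v2_bound = Bound(value=3, cost=2)
levels_var = propagate(Bound(0, 0), action_cost=1, operand=v2_bound, steps=2)
print([(b.value, b.cost) for b in levels_var])  # [(0, 0), (3, 3), (6, 6)]
```

In the second run, Cost(v1=6) = 1 + 2 + 3, matching C(Sample_Soil) + Cost(v2=3) + Cost(v1=3) under the assumed numbers.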
Extracting Relaxed Plan with Numeric Info
Start with best benefit bounds
Relaxed plan includes
  Actions
  Supporting bounds
[Figure: benefit plotted against value, with the best-benefit point marked]
Sample_Soil 1 (Sa1): Dur = 1, Cost: 1, (at end) V1 += 1
Sample_Soil 2 (Sa2): Dur = 1.25, Cost: 2, (at end) V1 += 2
Communicate (Com): Dur = 1.5, Cost: 3, (at start) V1 ≥ 1, (at start) V2 := V1
v1 – soil sample in rover’s store
v2 – soil sample communicated
Goal: v2 ∈ [5,∞], U(v2 ∈ [5,∞]) = v2 * 3
[Figure: planning graph over time points t0, 1, 1.25, 2, 2.5, 3, 3.75 with Sa1 (C:1), Sa2 (C:2) and Com (C:4) repeated across time points; value and cost of the upper bound at each time point for v1 and v2]
[Figure, continued: same actions and time points; Com (C:4) satisfies goal]
h(S) = U(G) - (cost of actions + cost of bounds)
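The heuristic formula can be tried out on made-up numbers. Only the goal utility v2 * 3 on [5,∞] and the per-action costs (Sa2 = 2, Com = 3) come from the slides; the table of bound costs, the choice of actions in the relaxed plan and the supporting-bound cost are hypothetical, and how the planner itself splits cost between actions and bounds is not specified here.

```python
def utility(v2: float) -> float:
    """Goal from the slide: v2 in [5, inf), U = v2 * 3; 0 if unsatisfied."""
    return 3 * v2 if v2 >= 5 else 0


def best_benefit_bound(bound_costs: dict) -> float:
    """Pick the reachable bound value with the highest utility - cost."""
    return max(bound_costs, key=lambda v: utility(v) - bound_costs[v])


def relaxed_plan_heuristic(goal_value, action_costs, supporting_bound_costs):
    """h(S) = U(G) - (cost of actions + cost of bounds)."""
    return utility(goal_value) - (sum(action_costs) + sum(supporting_bound_costs))


# Hypothetical propagated costs for reachable upper bounds of v2.
hypothetical_bound_costs = {5: 9, 6: 11, 7: 16}
target = best_benefit_bound(hypothetical_bound_costs)
print(target)  # 6  (benefits: 5 -> 6, 6 -> 7, 7 -> 5)

# Hypothetical relaxed plan: three Sa2 actions and one Com, one supporting bound.
h = relaxed_plan_heuristic(
    target,
    action_costs=[2, 2, 2, 3],
    supporting_bound_costs=[1],
)
print(h)  # 8  (18 - (9 + 1))
```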
Results – Modified Rovers
Added numeric variables: soil and rock sample amount in rover store
More communicated soil/rock gives greater utility
Average improvement: 3.06
Results – Modified Rovers
Anytime A* Search Behavior
Results – Modified Logistics
Added numeric variables: number of packages at location
More packages give greater utility
Results – Modified Logistics
Average improvement: 2.88
Summary
Over-subscription planning in the presence of
  Numeric goals
  Durative actions
Propagating cost over numeric values
Future Work
Delayed satisfaction of goals
Goal utility dependency
[Figure: two goals marked “late”, each labeled -10]
Questions.