Temporal Planning and Resource Allocation Stefanie Chiou, Rob Kochman, and Gary Look.


Temporal Planning and Resource Allocation

Stefanie Chiou, Rob Kochman, and Gary Look

Running Plans in the Real World

Need to account for time and resources when creating plans

Papers featured:

• "Executing Reactive, Model-Based Programs through Graph-Based Temporal Planning" by Phil Kim, Brian C. Williams, and Mark Abramson (IJCAI ’01)

• "Managing Multiple Tasks in Complex, Dynamic Environments" by Michael Freed (AAAI ’98).

Paper

Executing Reactive, Model-based Programs through Graph-based Temporal Planning by Phil Kim, Brian Williams, and Mark Abramson

Familiar Examples

Mars Climate Orbiter: 12/11/98

Mars Polar Lander: 1/3/99

Motivation

Embedded programming is hard

Easier to reason about state when programming

Overview/Contributions

RMPL provides a new programming paradigm for programming robust systems of cooperative autonomous agents

TPN -> synthesis of temporal, causal link, and HTN planning

• A “holy grail” for autonomous agents

Planner that implements these ideas

RMPL Intro

RMPL supports four types of reasoning about system interactions

• reasoning about contingencies

• scheduling

• inferring hidden state

• controlling hidden state

This paper focuses on the first two interaction types

(Model-based) Embedded Programs

Embedded programs interact with plant sensors/actuators:

• Read sensors

• Set actuators

Model-based programs interact with plant state:

• Read state

• Write state

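[Figure: an embedded program exchanges observations (Obs) and control (Cntrl) with the plant, whose state is S; a model-based embedded program instead calls getState and setState]

Embedded program: the programmer must map between state and sensors/actuators.

Model-based embedded program: the model-based executive maps between sensors/actuators and states.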

Model-based Embedded Program Breakdown

[Figure: the model-based embedded program calls getState and setState on the model-based executive, which receives sensor data from the plant (state S) and sends it actuator commands]

The model-based executive maps between sensors/actuators and states.

Example: The model-based program sets engine = thrusting, and the deductive controller . . .

• Deduces that thrust is off, and the engine is healthy

• Plans actions to open six valves

• Deduces that a valve failed - stuck closed

• Determines that valves on the backup engine will achieve thrust, and plans needed actions

[Figure: engine schematic with two fuel tanks and two oxidizer tanks]

Time and Contingency Constructs in RMPL

• if c thennext A

• do A maintaining c

• A, B (concurrency)

• A; B (serialization)

• A [l,u] (temporal bounds)

• choose {A, B} (choice)

RMPL Code Example

Group-Enroute()[l,u] = {
  choose {
    do {
      Group-Fly-Path(PATH1)[l*90%,u*90%];
    } maintaining PATH1_OK,
    do {
      Group-Fly-Path(PATH2)[l*90%,u*90%];
    } maintaining PATH2_OK
  };
  {
    Group-Transmit(FAC,ARRIVED_TAI)[0,2],
    do {
      Group-Wait(TAI_HOLD1,TAI_HOLD2)[0,u*10%]
    } watching PROCEED_OK
  }
}

[Figure: two routes, Path 1 and Path 2, from A to B. Caption: Choosing a route from A to B]

RMPL’s Representation of Time and Contingencies

Important to find a plan quickly

Idea: use a plan graph

Generalization of Simple Temporal Network (STN)

TPN defined (STN + conditionals + choices)

STN example

[Figure: an STN running from a Start node to an End node]

Temporal Planning Networks (TPN)

A temporal planning network is a generalization of an STN

Includes ability to represent conditionals and choices

TPN Example

[Figure: a TPN that includes the condition Ask(Proceed=Ok)]

RMPL -> TPN conversion

• A [l,u]: Invoke activity A for between l and u time units

• c [l,u]: Assert that condition c is true from now for a duration in [l,u]

• if c thennext A [l,u]: Execute A for [l,u] if condition c is currently satisfied

• do A [l,u] maintaining c: Execute A for [l,u], and ensure that condition c holds throughout

• A [l1,u1], B [l2,u2]: Concurrently execute A for [l1,u1] and B for [l2,u2]

• A [l1,u1]; B [l2,u2]: Execute A for [l1,u1], and then B for [l2,u2]

• choose {A [l1,u1], B [l2,u2]}: Reduces to A [l1,u1] or B [l2,u2] non-deterministically
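The conversion rules can be illustrated with a small sketch. The encoding is an assumption modeled on the rules listed above: each activity becomes one arc carrying its bounds, and sequence, concurrency, and choice are wired together with [0,0] arcs. The names `TPN`, `activity`, `seq`, `par`, and `choose` are hypothetical and are not taken from the paper or from Kirk; the bounds 9 and 18 are illustrative numbers only.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class TPN:
    """A toy temporal planning network: arcs with [lower, upper] bounds."""
    arcs: list = field(default_factory=list)      # (src, dst, lower, upper, label)
    decisions: set = field(default_factory=set)   # nodes where one branch is chosen
    _ids: count = field(default_factory=count)    # fresh node numbers 0, 1, 2, ...

    def node(self):
        return next(self._ids)

    def arc(self, src, dst, lower, upper, label=None):
        self.arcs.append((src, dst, lower, upper, label))


def activity(net, name, lower, upper):
    """A [l,u]: one arc from a start node to an end node."""
    s, e = net.node(), net.node()
    net.arc(s, e, lower, upper, name)
    return s, e


def seq(net, a, b):
    """A; B: tie the end of A to the start of B with a [0,0] arc."""
    net.arc(a[1], b[0], 0, 0)
    return a[0], b[1]


def _fork(net, branches):
    # Shared start and end nodes joined to every branch by [0,0] arcs.
    s, e = net.node(), net.node()
    for bs, be in branches:
        net.arc(s, bs, 0, 0)
        net.arc(be, e, 0, 0)
    return s, e


def par(net, *branches):
    """A, B: all branches run between a shared start and end."""
    return _fork(net, branches)


def choose(net, *branches):
    """choose {A, B}: same wiring, but the start is marked as a decision node."""
    s, e = _fork(net, branches)
    net.decisions.add(s)
    return s, e


net = TPN()
path1 = activity(net, "Group-Fly-Path(PATH1)", 9, 18)
path2 = activity(net, "Group-Fly-Path(PATH2)", 9, 18)
route = choose(net, path1, path2)
transmit = activity(net, "Group-Transmit", 0, 2)
plan = seq(net, route, transmit)   # (start node, end node) of the whole program
```

Representing every construct as a (start, end) pair keeps the combinators composable: any sub-network can be nested inside `seq`, `par`, or `choose` without special cases.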

Kirk

Compiles RMPL program into a TPN

Searches TPN for a temporally consistent plan

Temporally consistent plan is “embedded” in the TPN.

Kirk Phase 1

Select plan from TPN

• Essentially a graph traversal

Check plan for temporal consistency

Selecting the Plan

[Figure: plan selection traced through the TPN from its Start node]

Checking for Temporal Consistency

Convert TPN to a distance graph

Run Bellman-Ford to check for negative cycles (if any found, inconsistent)

Converting TPNs to Distance Graphs

The interval [a_ij, b_ij] represents the statement: a_ij ≤ T_j - T_i ≤ b_ij

This is equivalent to: T_j - T_i ≤ b_ij and T_i - T_j ≤ -a_ij
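The two inequalities map directly onto a pair of directed edges, which the following minimal sketch makes explicit. The function name `to_distance_graph` and the dictionary layout are hypothetical, and the pairing of intervals with event pairs in the sample network is an assumption made for illustration.

```python
def to_distance_graph(constraints):
    """Turn {(i, j): (a, b)} interval constraints into weighted edges.

    a <= T_j - T_i <= b becomes an edge i -> j with weight b
    and an edge j -> i with weight -a.
    """
    edges = []
    for (i, j), (a, b) in constraints.items():
        edges.append((i, j, b))    # T_j - T_i <= b
        edges.append((j, i, -a))   # T_i - T_j <= -a
    return edges


# Hypothetical STN: which events each interval joins is an assumption.
stn = {
    (0, 1): (10, 20),
    (1, 2): (30, 40),
    (3, 2): (10, 20),
    (3, 4): (40, 50),
    (0, 4): (60, 70),
}
edges = to_distance_graph(stn)
```

Each interval contributes one positive-weight edge (the upper bound) and one negative-weight edge (the negated lower bound), so a network with n intervals yields 2n edges.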

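[Figure: an STN over nodes 0-4 with interval constraints [10,20], [30,40], [10,20], [60,70], and [40,50], shown beside its distance graph, whose edges carry weights 20, 40, 20, 70, 50 (upper bounds) and -10, -30, -10, -60, -40 (negated lower bounds)]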

Checking for Temporal Consistency

Convert TPN to a distance graph

Run Bellman-Ford algorithm to check for negative cycles:

Bellman-Ford Algorithm

initializeCosts(G, s)
for i = 1 to |V(G)| - 1
    for each edge (u, v) in E(G)
        updateCost(u, v, w)
for each edge (u, v) in E(G)
    if cost(v) > cost(u) + w(u, v)
        return false
return true
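A runnable version of the pseudocode above might look like the following sketch. The function name `is_temporally_consistent` is hypothetical, the edge list is an assumed distance graph built from the interval values used in these slides, and the check only finds negative cycles reachable from the chosen source.

```python
import math


def is_temporally_consistent(num_nodes, edges, source=0):
    """Bellman-Ford over (u, v, w) edges; returns (consistent, cost)."""
    cost = [math.inf] * num_nodes
    cost[source] = 0
    # Relax every edge |V| - 1 times.
    for _ in range(num_nodes - 1):
        for u, v, w in edges:
            if cost[u] + w < cost[v]:
                cost[v] = cost[u] + w
    # Any edge that can still be relaxed lies on a negative cycle.
    for u, v, w in edges:
        if cost[u] + w < cost[v]:
            return False, cost
    return True, cost


# Assumed distance graph: the edge endpoints are an illustration, not from the slides.
edges = [
    (0, 1, 20), (1, 0, -10),
    (1, 2, 40), (2, 1, -30),
    (3, 2, 20), (2, 3, -10),
    (3, 4, 50), (4, 3, -40),
    (0, 4, 70), (4, 0, -60),
]
ok, cost = is_temporally_consistent(5, edges)

# A contradictory pair: T1 - T0 <= 5 together with T1 - T0 >= 10.
bad_ok, _ = is_temporally_consistent(2, [(0, 1, 5), (1, 0, -10)])
```

For the assumed graph, `ok` is `True` and `cost` is `[0, 20, 50, 30, 70]`; the contradictory pair forms a cycle of weight -5, so `bad_ok` is `False`.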

Bellman-Ford Example

[Figure: eight frames of Bellman-Ford run from the Source node on the distance graph with edge weights 20, 40, 20, 70, 50, -60, -40, -10, -30, -10. Node cost labels shown in successive frames: {0}; {0, 20}; {0, 20, 60}; {0, 50, 20, 60}; {0, 50, 20, 100, 60}; {0, 50, 20, 70, 60}; {0, 30, 20, 70, 60}; {0, 30, 20, 70, 50}]

Kirk Phase 2

Resolve threats and open conditions

Analogous to threats and open conditions in causal link planning

Identify intervals of inconsistent constraints using Floyd-Warshall

Order intervals to resolve threats

Close open conditions by making sure open conditions are satisfied by some action in the plan
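As a hedged illustration of the Floyd-Warshall step only, the sketch below computes all-pairs shortest distances on a distance graph and reads off the tightest window between two events. How Kirk uses such windows to order intervals is not shown here. The names `all_pairs` and `window` and the sample edges are assumptions.

```python
import math


def all_pairs(num_nodes, edges):
    """Floyd-Warshall over (u, v, w) edges; returns the distance matrix d."""
    d = [[0 if i == j else math.inf for j in range(num_nodes)]
         for i in range(num_nodes)]
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
    for k in range(num_nodes):
        for i in range(num_nodes):
            for j in range(num_nodes):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def window(d, i, j):
    """Tightest implied bounds on T_j - T_i: [-d[j][i], d[i][j]]."""
    return -d[j][i], d[i][j]


# Assumed distance graph; the edge endpoints are illustrative.
edges = [
    (0, 1, 20), (1, 0, -10),
    (1, 2, 40), (2, 1, -30),
    (3, 2, 20), (2, 3, -10),
    (3, 4, 50), (4, 3, -40),
    (0, 4, 70), (4, 0, -60),
]
d = all_pairs(5, edges)
w02 = window(d, 0, 2)   # bounds on T_2 - T_0
w03 = window(d, 0, 3)   # bounds on T_3 - T_0
```

On this graph `w02` is `(40, 50)` and `w03` is `(20, 30)`. A negative entry on the diagonal of `d` would indicate a negative cycle, that is, an inconsistent set of constraints.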

Why This Paper?

It’s useful for our term project

Vision

"Managing Multiple Tasks in Complex, Dynamic Environments" by Michael Freed (AAAI ’98).

Achieve goals in “task environments”

• Complex

• Time-pressured

• Uncertain

• Co-existing/Interacting

APEX Goal: ATC

Goal: simulate human air traffic controllers

• Largely routine activity

• Complexity due to many simple tasks

• Interruptions necessary


Resource Conflicts

Separate tasks make incompatible demands

What to do?

• Determine relative priority of tasks

• Assign control to winner

• Deal with the loser

Conflict Resolution Strategies

Shed

• Eliminate low importance tasks

• When (Demand > Availability)

Delay/Interrupt

• Introduces complications

Circumvent

• Select methods that use different resources

APEX Architecture: Two Parts

Resource Architecture

• Set of resources

• Cognitive

• Perceptual

• Motor

Action Selection Component

[Figure: the Action Selection Component, the Resource Architecture, and the World, connected by links labeled commands, events, actuators, and perception]

Procedure Definition Language (PDL)

(clear-hand left-hand)

(determine-loc headlight-ctl => ?loc)

(grasp knob left-hand ?loc)

(pull knob left-hand 0.4)

(ungrasp left-hand)

Example: Turning on headlights

Procedure Definition Language (PDL)

(procedure
  (index (turn-on-headlights))
  (step s1 (clear-hand left-hand))
  (step s2 (determine-loc headlight-ctl => ?loc))
  (step s3 (grasp knob left-hand ?loc) (waitfor ?s1 ?s2))
  (step s4 (pull knob left-hand 0.4) (waitfor ?s3))
  (step s5 (ungrasp left-hand) (waitfor ?s4))
  (step s6 (terminate) (waitfor ?s5)))

Example: Turning on headlights
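The waitfor clauses in the procedure above define a partial order over the steps. As a simplified sketch, assuming each waitfor only names earlier steps (in APEX a waitfor can also match events, which is ignored here), the dependencies can be sorted with Python's standard `graphlib` module (Python 3.9 or later). The dictionary `waits_for` is a hypothetical hand transcription of the procedure.

```python
from graphlib import TopologicalSorter

# step -> set of steps it waits for, read off the waitfor clauses
waits_for = {
    "s1": set(),
    "s2": set(),
    "s3": {"s1", "s2"},
    "s4": {"s3"},
    "s5": {"s4"},
    "s6": {"s5"},
}

# One execution order that respects every waitfor clause.
order = list(TopologicalSorter(waits_for).static_order())
```

Steps s1 and s2 have no prerequisites, so either may come first; s3 through s6 must follow in sequence.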

Detecting Conflicts

Must determine:

• Which tasks should be checked and when

• Preconditions satisfied

• Resources become available

• Whether conflict exists between specified tasks

• Direct and indirect control

PROFILE Clause

Denotes resource requirements for a procedure

(profile (<resource> <duration> <continuity>))

(profile (left-hand 8 10))

Prioritization of Tasks

Used when:

• New resource conflict detected

• New information potentially changes a previous prioritization decision

Prioritization Example: Reprioritize

(procedure
  (index (drive-car))
  . . .
  (step s8 (monitor-behind))
  (step s9 (reprioritize ?s8)
    (waitfor (sound-type ?sound car-horn)
             (loudness ?sound ?db (?if (> ?db 30))))))

Assigning Priority

(step s5 (monitor-fuel-gauge) (priority 3))

(step s6 (monitor-left-traffic) (priority ?x))

(step s7 (monitor-ahead) (priority (+ ?x ?y)))

General Priority Form

(priority <basis> (importance <expression>) (urgency <expression>))

(step s5 (monitor-fuel-gauge)
  (priority (run-empty) (importance 6) (urgency 2))
  (priority (delay-to-other-task) (importance ?x) (urgency 3))
  (priority (excess-time-cost refuel) (importance ?x) (urgency ?y)))

Importance vs. Urgency

Depends on workload

priority_b = S*I_b + (S_max - S)*U_b

S is subjective workload (a heuristic approximation of actual workload);

I_b and U_b represent importance and urgency for a specified basis b
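A short worked sketch shows how workload shifts the weighting between importance and urgency. The function name `priority` and the value S_max = 10 are assumptions; the slides do not state the workload scale.

```python
def priority(importance, urgency, workload, max_workload=10):
    """priority_b = S * I_b + (S_max - S) * U_b for one basis b.

    max_workload (S_max) defaults to 10, an assumed scale.
    """
    if not 0 <= workload <= max_workload:
        raise ValueError("workload must lie in [0, max_workload]")
    return workload * importance + (max_workload - workload) * urgency


# The run-empty basis: importance 6, urgency 2.
low_load = priority(6, 2, workload=2)    # 2*6 + 8*2
high_load = priority(6, 2, workload=9)   # 9*6 + 1*2
```

Under these assumptions `low_load` is 28 and `high_load` is 56: as workload rises, importance dominates the result, and as it falls, urgency does.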

Interruption: RESET

. . .

(step s4 (turn-on-headlights))

(step s5 (reset) (waitfor (suspended ?s4)))

Coping with Interruption

• Wind-down activities

• Suspension-time activities

• Wind-up activities

Wind-down Activities: an Example

(step s15 (pull-over)

(waitfor (suspended ?self))

(priority (avoid-accident) (importance 10)

(urgency 10)))

Interrupt Costs

Wind-down, suspension, and wind-up activities incur cost

Ongoing task has its priority increased in proportion to interrupt cost

(interrupt-cost 5)

Slack Time

(step s17 (suspend ?self)

(waitfor (shape ?object traffic-signal)

(color ?object red)))

(step s18 (monitor-object ?object) (waitfor ?s17))

(step s19 (reprioritize ?self)

(waitfor (color ?object green)))

Computing Priority

[Equation: priority_b as a function of IC, S, S_max, I_b, and U_b]

IC = interruption cost

U = urgency

I = importance

S = workload


Evaluation and Future Work

Strengths and Weaknesses

ATC application has identified issues

• Computing overall priority from base priorities

• Suppression of base priorities

• Other priority issues:

• A, B need X

• A, C need Y

• Priority of A must exceed that of B+C