Intro to Planning


Transcript of Intro to Planning

Page 1: Intro to Planning

Intro to Planning

Or, how to represent the planning problem in logic

Page 2: Intro to Planning

The Planning Problem

Input:
1. An “initial state”
2. A “goal state”
3. A set of actions, each of which can take you from one state to another one

Output: A sequence of actions that, when executed in order starting in the initial state, guarantees reaching the goal state.
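To make the input and output concrete, here is a minimal Python sketch of the planning problem as data. The names (State, Action, PlanningProblem) are my own illustration, not from the slides; it just mirrors the three inputs and one output above.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List

State = FrozenSet[str]  # a state: the set of facts currently true

@dataclass(frozen=True)
class Action:
    name: str
    applicable: Callable[[State], bool]  # preconditions met in this state?
    result: Callable[[State], State]     # state after taking the action

@dataclass
class PlanningProblem:
    initial: State                      # input 1: the initial state
    goal_test: Callable[[State], bool]  # input 2: recognizes goal states
    actions: List[Action]               # input 3: the available actions

# The output is a plan: a sequence of actions whose execution, starting
# in `initial`, is guaranteed to reach a state passing goal_test.
```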

Page 3: Intro to Planning

Sound Familiar?

Page 4: Intro to Planning

Graph Traversal as a Planning Problem

1. The “initial state” is the start node in the graph.
2. The “goal state” is the goal node in the graph.
3. Each “action” is a traversal of one of the edges in the graph, which takes you from an existing state (a node in the graph) to another state (another node in the graph).

The output is a sequence of actions (edges) that take an agent from the start state to the goal state.
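As a sketch of this correspondence, here is breadth-first search over an adjacency-list graph, returning the plan as a sequence of edge traversals. The toy graph is a made-up example, not one from the slides.

```python
from collections import deque

def bfs_plan(graph, start, goal):
    frontier = deque([(start, [])])  # (node, actions taken so far)
    visited = {start}
    while frontier:
        node, plan = frontier.popleft()
        if node == goal:
            return plan
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, plan + [(node, neighbor)]))
    return None  # no plan exists

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(bfs_plan(graph, "A", "D"))  # [('A', 'B'), ('B', 'D')]
```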

Page 5: Intro to Planning

Problems with Graphs as Representations

The algorithms we used for search in graphs work great: they are efficient, and they find optimal paths.

However, some planning problems are difficult to represent as graphs. For example:

1. Uncertainty: the agent may not be omniscient (all-knowing), so it doesn’t know the whole graph at each time step. We’ve talked about some sources of uncertainty before:
   1. Partial observability (the agent doesn’t perceive the world fully/accurately)
   2. Stochasticity (actions can have multiple outcomes)
   3. Multi-agent environments (other intelligent agents operate in the environment)
   4. Dynamism (the world changes over time, without the agent doing anything)
   5. Computational limits, ignorance, laziness, storage limits, etc.

Consider the problem of planning a traversal of Tuttleman Hall to get from the entrance to room 305. If you didn’t know the building, you’d need to include actions for looking around the hall for the right room number, or for determining whether there are stairs or elevators, and where they are. Your subsequent actions would depend on the outcomes of these actions, so you can’t represent them in the graph at the beginning. (partial observability)

Even if you know the building well, you still can’t plan your route out from the beginning, since you don’t know if people will be in the way (multi-agent/stochasticity).

Page 6: Intro to Planning

Problems with Graphs as Representations

The algorithms we used for search in graphs work great: they are efficient, and they find optimal paths.

However, some planning problems are difficult to represent as graphs. For example,

2. Complexity: the complete graph might be enormous (or infinite), so it’s unrealistic to assume that the whole thing is given as an input.

For example, consider the problem of corralling 100 sheep (s1 through s100) into 10 pens (p1-p10). All sheep start in an open field (F). The objective is to get s1-s10 into p1, s11-s20 into p2, etc. The allowed actions are moving one sheep from one location (F or p1-p10) to another location.

Quiz: If we represent this as a graph, how many total nodes would there be? How many total edges?

Page 7: Intro to Planning

Answer: Problems with Graphs as Representations

The algorithms we used for search in graphs work great: they are efficient, and they find optimal paths.

However, some planning problems are difficult to represent as graphs. For example,

2. Complexity: the complete graph might be enormous (or infinite), so it’s unrealistic to assume that the whole thing is given as an input.

For example, consider the problem of corralling 100 sheep (s1 through s100) into 10 pens (p1-p10). All sheep start in an open field (F). The objective is to get s1-s10 into p1, s11-s20 into p2, etc. The allowed actions are moving one sheep from one location (F or p1-p10) to another location.

Quiz: If we represent this as a graph, how many total nodes would there be? How many total edges?

The number of nodes: A node represents a position for all 100 sheep. There are 11 possible places for s1, 11 for s2, 11 for s3, …, and 11 for s100. So there are 11 × 11 × … × 11 (100 times) = 11^100 ≈ 1.4 × 10^104 nodes, or more than a googol (10^100).

The number of edges: For every node, there are 100 possible sheep to move, and 11 − 1 = 10 possible places to move each one to, so 1000 edges per node. So there are a total of 11^100 × 1000 ≈ 1.4 × 10^107 edges.
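Since Python has arbitrary-precision integers, the counting argument above is easy to check exactly:

```python
nodes = 11 ** 100          # 11 possible places for each of 100 sheep
edges_per_node = 100 * 10  # 100 sheep to move x 10 other locations each
edges = nodes * edges_per_node

print(f"{nodes:.2e}")  # 1.38e+104 -- more than a googol (1e100)
print(f"{edges:.2e}")  # 1.38e+107
```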

Page 8: Intro to Planning

Planning generalizes Graph Search

Planning lets us consider problems with more complexity and uncertainty than graph search.

The main difference is that the input includes “states” and “actions” rather than nodes and edges.

In very simple cases, these are the same thing, but not always.

Note: The main difference is in representation, rather than inference or learning.

Page 9: Intro to Planning

Handling Complexity with Better Representations

We’ll start by talking about representations that don’t suffer (as much) from combinatorial explosions.

Later, we’ll talk about handling partial observability, stochasticity, and other causes of uncertainty.

Page 10: Intro to Planning

Example Planning Problem

Initial state: the sheep are in the field, as is the robot.

Goal: get the sheep into the corral.

Actions:
L: fly left, from corral to field.
R: fly right, from field to corral.
G: grab a sheep.
U: ungrab (let go of) a sheep.

Page 11: Intro to Planning

Quiz: Planning Problem

Which of the following is a plan? And which of the plans actually achieves the goal, starting from the initial state?

1. [L, L, L]
2. [U, G, U, G, M, K, Z]
3. [G, R, U]
4. [L, G, R, U, L, G, R, U, L, R]
5. [G, R, U, L, G, R, U, L]


Page 12: Intro to Planning

Answers: Planning Problem

Which of the following is a plan? And which of the plans actually achieves the goal, starting from the initial state?

1. [L, L, L]: plan, unsuccessful
2. [U, G, U, G, M, K, Z]: not a plan (M, K, Z are not actions in this planning problem)
3. [G, R, U]: plan, unsuccessful
4. [L, G, R, U, L, G, R, U, L, R]: plan, successful
5. [G, R, U, L, G, R, U, L]: plan, unsuccessful (the robot ends in the wrong spot)


Page 13: Intro to Planning

Quiz: Describe States in Logic

Using the following boolean variables, come up with PL formulas to describe the initial state and goal state:

Robot_has_sheep_1
Robot_has_sheep_2
Robot_in_field
Sheep_1_in_field
Sheep_2_in_field


Page 14: Intro to Planning

Answer: Describe States in Logic

Using the following boolean variables, come up with PL formulas to describe the initial state and goal state:

Robot_has_sheep_1
Robot_has_sheep_2
Robot_in_field
Sheep_1_in_field
Sheep_2_in_field

Initial: Robot_in_field ∧ Sheep_1_in_field ∧ Sheep_2_in_field

Goal: Robot_in_field ∧ ¬Sheep_1_in_field ∧ ¬Sheep_2_in_field


Page 15: Intro to Planning

Generalizing with PL

Suppose we don’t actually care where the robot ends up, just that the sheep are in the corral.

We can describe this goal just by removing the variable Robot_in_field from the goal description.

New Goal: ¬Sheep_1_in_field ∧ ¬Sheep_2_in_field

So long as Sheep_1_in_field and Sheep_2_in_field are both false, any assignment of T or F to Robot_in_field will make the goal formula true.
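A quick sanity check of that claim, using a made-up Python encoding of the new goal formula:

```python
def new_goal(sheep_1_in_field, sheep_2_in_field, robot_in_field):
    # robot_in_field is accepted but ignored: it no longer appears in the goal
    return (not sheep_1_in_field) and (not sheep_2_in_field)

for robot_in_field in (True, False):
    print(new_goal(False, False, robot_in_field))  # True both times
```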


Page 16: Intro to Planning

Quiz: Describe States in FOL

Using the following constants and relations, write FOL sentences to describe the initial and goal states.

Constants:
B (robot)
S1, S2 (sheep)
F (field)
C (corral)

Relations:
Sheep(x) – true if x is a sheep
Holding(x, y) – true if x is holding y
At(x, y) – true if x is at location y


Page 17: Intro to Planning

Answer: Describe States in FOL

Using the following constants and relations, write FOL sentences to describe the initial and goal states.

Constants:
B (robot)
S1, S2 (sheep)
F (field)
C (corral)

Relations:
Sheep(x) – true if x is a sheep
Holding(x, y) – true if x is holding y
At(x, y) – true if x is at location y

Initial state: At(S1, F) ∧ At(S2, F) ∧ At(B, F)

Goal state: At(S1, C) ∧ At(S2, C)


Page 18: Intro to Planning

Quiz: Generalizing with FOL

Like with PL, FOL lets us describe goal states that include multiple possible worlds.

Unlike PL, it also has convenient ways of generalizing further.

Suppose there were 100 sheep instead of 2. Write an FOL statement that describes the goal that all of the sheep are in the corral.


Page 19: Intro to Planning

Answer: Generalizing with FOL

Like with PL, FOL lets us describe goal states that include multiple possible worlds.

Unlike PL, it also has convenient ways of generalizing further.

Suppose there were 100 sheep instead of 2. Write an FOL statement that describes the goal that all of the sheep are in the corral.

Answer: ∀s. Sheep(s) ⇒ At(s, C)

This formula succinctly captures the goal state, regardless of how many sheep are involved.
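For illustration, the quantified goal translates naturally into a bounded check over the objects. This encoding (the objects list, sheep set, and at mapping) is my own, not from the slides:

```python
objects = [f"S{i}" for i in range(1, 101)] + ["B"]  # 100 sheep plus the robot
sheep = {f"S{i}" for i in range(1, 101)}
at = {s: "C" for s in sheep}  # every sheep is in the corral...
at["B"] = "F"                 # ...and the robot happens to be in the field

# "forall s. Sheep(s) => At(s, C)" as a quantifier over the objects:
goal_holds = all(at[x] == "C" for x in objects if x in sheep)
print(goal_holds)  # True -- the robot's location doesn't matter
```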


Page 20: Intro to Planning

Describing Actions

We’ve talked a bunch about how to represent the start and goal states.

What about actions?

Let’s go over two commonly-used approaches.

Page 21: Intro to Planning

STRIPS Actions

STRIPS is a language for representing the meaning of actions.

Here are some examples:

Move Left:
Pre: At(B, C)
Eff: At(B, F) ∧ ¬At(B, C)

Ungrab(x, y):
Pre: Holding(B, x) ∧ At(B, y)
Eff: At(x, y) ∧ ¬Holding(B, x)

Each action has a list of arguments, a description of preconditions (what must be true before the action can take place), and a list of effects (what is true after the action takes place). Notice that the effects include things that become true, and things that become false. Preconditions and effects CANNOT use quantifiers (in STRIPS).
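One common way to encode this in code is with explicit precondition, add, and delete sets over states that are sets of ground atoms. The sketch below is my own encoding, not part of STRIPS itself; for simplicity it handles only positive preconditions (negative preconditions like the ¬Holding conjuncts two slides later would need extra machinery, such as the handsFull trick noted there).

```python
from dataclasses import dataclass
from typing import FrozenSet

State = FrozenSet[str]  # a state is a set of ground atoms taken to be true

@dataclass(frozen=True)
class StripsAction:
    name: str
    pre: FrozenSet[str]     # positive preconditions
    add: FrozenSet[str]     # effects that become true
    delete: FrozenSet[str]  # effects that become false

    def applicable(self, state: State) -> bool:
        return self.pre <= state  # all preconditions hold in this state

    def apply(self, state: State) -> State:
        return (state - self.delete) | self.add

# The Move Left action from this slide, in this encoding:
move_left = StripsAction(
    name="L",
    pre=frozenset({"At(B, C)"}),
    add=frozenset({"At(B, F)"}),
    delete=frozenset({"At(B, C)"}),
)
```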

Page 22: Intro to Planning

Quiz: STRIPS Actions

Write STRIPS action descriptions for the Move Right and Grab actions.

Page 23: Intro to Planning

Answer: STRIPS Actions

Write STRIPS action descriptions for the Move Right and Grab actions. Make sure that the robot can’t grab something if it’s already holding something.

Move Right:
Pre: At(B, F)
Eff: At(B, C) ∧ ¬At(B, F)

Grab(x, y):
Pre: ¬Holding(B, S1) ∧ ¬Holding(B, S2) ∧ At(B, y) ∧ At(x, y)
Eff: ¬At(x, y) ∧ Holding(B, x)

Note: You need to modify At(sheep, location) either in the Grab/Ungrab actions’ effects or in the Move Right/Move Left actions’ effects. My version here modifies them in the Grab/Ungrab actions.

Note 2: If you want to avoid adding a conjunct to the precondition of Grab for each sheep in the world, you can create a new boolean variable called handsFull. The precondition for Grab would require this to be false, and the effects would make it true. The preconditions for Ungrab would require handsFull to be true, and the effects would make it false. The only other change is that the initial condition would need to specify ¬handsFull.

Page 24: Intro to Planning

Quiz: State Changes with STRIPS

Initial state: At(S1, F) ∧ At(S2, F) ∧ At(B, F)

Given the initial state above, describe the state of the world after each of the following actions takes place, in order:

G(S1, F)
R
U(S1, C)
L
U(S1, F)


Page 25: Intro to Planning

Answer: State Changes with STRIPS

Initial state: At(S1, F) ∧ At(S2, F) ∧ At(B, F)

Given the initial state above, describe the state of the world after each of the following actions takes place, in order:

After G(S1, F): At(S2, F) ∧ At(B, F) ∧ Holding(B, S1)
After R: At(S2, F) ∧ Holding(B, S1) ∧ At(B, C)
After U(S1, C): At(S2, F) ∧ At(B, C) ∧ At(S1, C)
After L: At(S2, F) ∧ At(S1, C) ∧ At(B, F)
After U(S1, F): Preconditions aren’t met (Holding(B, S1) is false), so this action can’t be taken in the current state.
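Here is the same trace replayed mechanically, using a set-of-atoms state and (pre, add, delete) triples. The encoding is my own, and for brevity it checks only positive preconditions:

```python
def apply(state, action):
    pre, add, delete = action
    assert pre <= state, "preconditions not met"
    return (state - delete) | add

# (pre, add, delete) triples for the ground actions used in the quiz:
grab_s1   = ({"At(B, F)", "At(S1, F)"}, {"Holding(B, S1)"}, {"At(S1, F)"})
right     = ({"At(B, F)"}, {"At(B, C)"}, {"At(B, F)"})
ungrab_s1 = ({"Holding(B, S1)", "At(B, C)"}, {"At(S1, C)"}, {"Holding(B, S1)"})
left      = ({"At(B, C)"}, {"At(B, F)"}, {"At(B, C)"})

state = {"At(S1, F)", "At(S2, F)", "At(B, F)"}
for a in (grab_s1, right, ungrab_s1, left):
    state = apply(state, a)
print(sorted(state))  # ['At(B, F)', 'At(S1, C)', 'At(S2, F)']
# U(S1, F) would now fail: Holding(B, S1) is no longer in the state.
```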


Page 26: Intro to Planning

Search Strategies for Finding a Plan

Strategy 1: Forward (or progression) search
1. Keep a priority queue of states (each described by FOL or PL).
2. When it’s time to explore a node, apply all actions whose preconditions are met, and add the resulting states to the priority queue.
3. Stop when a state is taken from the queue that matches the goal state.

(Diagram: a search tree growing from the initial state toward the goal, with branches labeled G(s1), G(s2), and R.)
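A minimal sketch of forward search, assuming states are frozensets of ground atoms, actions are (name, pre, add, delete) tuples, and the priority is simply plan length (any of the priority schemes mentioned on the next slide would slot in here):

```python
import heapq

def forward_search(initial, goal, actions):
    # actions: iterable of (name, pre, add, delete), each set a frozenset
    counter = 0  # tie-breaker so the heap never compares states directly
    frontier = [(0, counter, frozenset(initial), [])]
    visited = set()
    while frontier:
        cost, _, state, plan = heapq.heappop(frontier)
        if goal <= state:    # this state satisfies the goal description
            return plan
        if state in visited:
            continue
        visited.add(state)
        for name, pre, add, delete in actions:
            if pre <= state:  # preconditions met: apply the action
                counter += 1
                successor = (state - delete) | add
                heapq.heappush(frontier,
                               (cost + 1, counter, successor, plan + [name]))
    return None               # no plan reaches the goal
```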

Page 27: Intro to Planning

Search Strategies for Finding a Plan

Strategy 1: Forward (or progression) search

Notice: this algorithm is very similar to our graph search algorithms, but it doesn’t require the complete graph as input.

Also notice: I haven’t (yet) specified how to compute the priorities for the priority queue. But you can use cost (e.g., number of actions), or heuristics, or a combination of the two.

(Diagram: the same search tree as on the previous slide.)

Page 28: Intro to Planning

Search Strategies for Finding a Plan

Strategy 2: Backward (or regression) search
1. Start by adding the goal state to the priority queue, instead of the initial state.
2. At each iteration, find all actions whose effects match the current node, and add the previous states (before the action) to the queue.
3. Stop when you get a node that matches the initial state.

(Diagram: a search tree growing backward from the goal, with branches labeled U(s2) and R.)
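A minimal sketch of regression search under the same set-of-atoms encoding. Here a node is a subgoal (the atoms still to be achieved); an action is considered only if it adds something in the subgoal and deletes nothing in it, and regressing through the action swaps the atoms it supplies for its preconditions. For brevity this uses FIFO order rather than a priority queue:

```python
from collections import deque

def backward_search(initial, goal, actions):
    # initial, goal: sets of atoms; actions: (name, pre, add, delete) tuples
    frontier = deque([(frozenset(goal), [])])
    visited = {frozenset(goal)}
    while frontier:
        subgoal, plan = frontier.popleft()
        if subgoal <= initial:  # the initial state already satisfies it
            return plan
        for name, pre, add, delete in actions:
            if add & subgoal and not (delete & subgoal):
                # regress: drop what the action adds, require its preconditions
                regressed = frozenset((subgoal - add) | pre)
                if regressed not in visited:
                    visited.add(regressed)
                    frontier.append((regressed, [name] + plan))
    return None
```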

Page 29: Intro to Planning

Search Strategies for Finding a Plan

Strategy 2: Backward (or regression) search

Note: this is basically the same, but there are cases when it’s a lot more efficient than forward search. Consider the case of 1000 sheep, where the goal is to get s457 into the corral. Forward search has 1001 possible actions to consider in the initial state, while backward search only has to consider a small number.

(Diagram: the same backward search tree as on the previous slide.)

Page 30: Intro to Planning

Heuristics for Planning

A popular strategy is to automatically generate heuristics for a planning problem from the descriptions of the actions.

Here’s the general idea:
1. Create a relaxed planning problem by simplifying all of the actions.
2. For each node, use a depth-first or breadth-first search to solve the relaxed planning problem.
3. Use the path cost for the plan from the relaxed problem as the heuristic value for the node in the full planning problem.

To make this work out, we need to make sure that the relaxed planning problem is much, much easier to solve than the original planning problem, since we need to solve the relaxed planning problem many times (each time we explore a node).

Page 31: Intro to Planning

Heuristics for Planning

Here’s an example of a strategy for generating a relaxed planning problem from STRIPS action descriptions.

Start with your existing actions, e.g.:

Grab(x, y):
Pre: ¬Holding(B, S1) ∧ ¬Holding(B, S2) ∧ At(B, y) ∧ At(x, y)
Eff: ¬At(x, y) ∧ Holding(B, x)

Start removing preconditions, to get relaxed action descriptions for a strictly easier planning problem:

Grab(x, y):
Pre: At(B, y) ∧ At(x, y)
Eff: ¬At(x, y) ∧ Holding(B, x)

In this version, the robot can hold as many sheep as it likes.

Page 32: Intro to Planning

Heuristics for Planning

Here’s an example of a strategy for generating a relaxed planning problem from STRIPS action descriptions.

Start with your existing actions, e.g.:

Grab(x, y):
Pre: ¬Holding(B, S1) ∧ ¬Holding(B, S2) ∧ At(B, y) ∧ At(x, y)
Eff: ¬At(x, y) ∧ Holding(B, x)

Alternatively, or in addition, you can remove negative effects, e.g.:

Grab(x):
Pre:
Eff: Holding(B, x)

In this version, the robot can hold as many sheep as it wants, and it doesn’t have to be in the same location as the sheep.
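Putting the pieces together, here is a sketch of a heuristic built from the remove-negative-effects relaxation: ignore the delete lists, solve the relaxed problem with breadth-first search, and use the relaxed plan length as the heuristic value. The (name, pre, add, delete) encoding is the same made-up scheme as in the earlier sketches:

```python
from collections import deque

def relaxed_plan_length(state, goal, actions):
    # actions: iterable of (name, pre, add, delete); delete is ignored,
    # which is exactly the "remove negative effects" relaxation
    frontier = deque([(frozenset(state), 0)])
    visited = {frozenset(state)}
    while frontier:
        facts, depth = frontier.popleft()
        if goal <= facts:
            return depth           # heuristic value for this state
        for name, pre, add, _delete in actions:
            if pre <= facts:
                relaxed = facts | add  # no deletions: the facts only grow
                if relaxed not in visited:
                    visited.add(relaxed)
                    frontier.append((relaxed, depth + 1))
    return float("inf")            # goal unreachable even in the relaxation
```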