
Page 1

Sub-optimal Heuristic Search and its Application to Planning for Manipulation

Mike Phillips
Slides adapted from Maxim Likhachev

Page 2

Planning for Mobile Manipulation
•  What planning tasks are there?

robotic bartender demo at the Search-based Planning Lab, April '12

Page 3

Planning for Mobile Manipulation
•  What planning tasks are there?
–  task planning (order of getting items)
–  planning for navigation
–  planning motions for an arm
–  planning coordinated base+arms (full-body) motions
–  planning how to grasp

robotic bartender demo at the Search-based Planning Lab, April '12

Page 4

Planning for Mobile Manipulation
•  What planning tasks are there?
–  task planning (order of getting items)
–  planning for navigation
–  planning motions for an arm
–  planning coordinated base+arms (full-body) motions
–  planning how to grasp

this class

robotic bartender demo at the Search-based Planning Lab, April '12

Page 5

Planning for Mobile Manipulation
•  Example of planning for a 20D arm in 2D:
–  each state is defined by 20 discretized joint angles {q1, q2, …, q20}
–  each action is changing one angle (or set of angles) at a time

Page 6

Planning for Mobile Manipulation
•  Example of planning for a 20D arm in 2D:
–  each state is defined by 20 discretized joint angles {q1, q2, …, q20}
–  each action is changing one angle (or set of angles) at a time

How to search such a high-dimensional state space?

Page 7

Typical Approaches to Motion Planning

planning with optimal heuristic search-based approaches (e.g., A*, D*)
+  completeness/optimality guarantees
+  excellent cost minimization
+  consistent solutions to similar motion queries
-  slow and memory intensive
-  used mostly for 2D (x,y) path planning

planning with sampling-based approaches (e.g., RRT, PRM, …)
+  typically very fast and low memory
+  used for high-D path/motion planning
-  completeness only in the limit
-  often poor cost minimization
-  hard to provide solution consistency

sub-optimal heuristic searches can be fast enough for high-D motion planning and return consistent, good-quality solutions

Page 8

Planning for Mobile Manipulation
•  Example of planning for a 20D arm in 2D:
–  each state is defined by 20 discretized joint angles {q1, q2, …, q20}
–  each action is changing one angle (or set of angles) at a time

Heuristic search? Wouldn't you need to store m^20 states in memory (m = number of discrete values per joint)?

No: states are generated on-the-fly as the search expands states.
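
In an implementation, "on-the-fly" generation usually means the graph is never built explicitly: states are interned in a hash table the first time the search touches them. A minimal Python sketch (the 20-joint tuple encoding, the number of discrete values per joint, and the one-joint-at-a-time successor rule are illustrative assumptions, not the exact representation used by the planners in these slides):

class ImplicitArmGraph:
    """Implicit state space over discretized joint angles; states are created lazily."""
    def __init__(self, num_joints=20, values_per_joint=32):
        self.num_joints = num_joints
        self.values_per_joint = values_per_joint     # m discrete values per joint
        self.id_of = {}                              # joint-angle tuple -> state id
        self.angles_of = []                          # state id -> joint-angle tuple

    def state_id(self, angles):
        # Intern a state the first time the search touches it.
        if angles not in self.id_of:
            self.id_of[angles] = len(self.angles_of)
            self.angles_of.append(angles)
        return self.id_of[angles]

    def successors(self, sid):
        # One action changes a single joint by +/- one discretization step.
        angles = self.angles_of[sid]
        for j in range(self.num_joints):
            for step in (-1, 1):
                v = (angles[j] + step) % self.values_per_joint
                succ = angles[:j] + (v,) + angles[j + 1:]
                yield self.state_id(succ), 1.0       # unit edge cost

Memory then grows with the number of states the search actually generates, not with the full m^20 state space.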

Page 9

Sub-optimal Heuristic Search for High-D Motion Planning
•  ARA* (anytime version of A*) graph search
–  effective use of solutions to relaxed (easier) motion planning problems
•  Experience Graphs
–  heuristic search that learns from its planning experiences

Page 10

Sub-optimal Heuristic Search for High-D Motion Planning
•  ARA* (anytime version of A*) graph search
–  effective use of solutions to relaxed (easier) motion planning problems
•  Experience Graphs
–  heuristic search that learns from its planning experiences

Page 11: Sub-optimal Heuristic Search and its Application to ... search-based approaches (e.g., A*, D*) planning with sampling-based approaches (e.g., RRT, PRM,…) + completeness/optimality

Maxim Likhachev University of Pennsylvania

•  A* Search: expands states in the order of f(s)=g(s)+h(s) values

•  In A*, when s is expanded g(s) is optimal •  h(s) is a relaxed (simplified) version of the problem

Heuristics in Heuristic Search

11

start

goal

h(s)

s g(s)

Page 12: Sub-optimal Heuristic Search and its Application to ... search-based approaches (e.g., A*, D*) planning with sampling-based approaches (e.g., RRT, PRM,…) + completeness/optimality

Maxim Likhachev University of Pennsylvania

•  A* Search: expands states in the order of f(s)=g(s)+h(s) values

•  In A*, when s is expanded g(s) is optimal •  h(s) is a relaxed (simplified) version of the problem

Heuristics in Heuristic Search

12

start

goal

h(s)

s g(s)

If h(s) was a perfect estimate (let’s call it h*), what does f(s) represent upon expansion of s?

Page 13: Sub-optimal Heuristic Search and its Application to ... search-based approaches (e.g., A*, D*) planning with sampling-based approaches (e.g., RRT, PRM,…) + completeness/optimality

Maxim Likhachev University of Pennsylvania

•  A* Search: expands states in the order of f(s)=g(s)+h(s) values

•  In A*, when s is expanded g(s) is optimal •  h(s) is a relaxed (simplified) version of the problem

Heuristics in Heuristic Search

13

start

goal

h(s)

s g(s)

If h(s) was a perfect estimate (let’s call it h*), what does f(s) represent upon expansion of s?

Now what if h(s) wasn’t perfect?

Page 14

Heuristics in Heuristic Search
•  A* search: expands states in the order of f = g + h values

ComputePath function
  while (sgoal is not expanded)
    remove s with the smallest [f(s) = g(s) + h(s)] from OPEN;   // expansion of s
    insert s into CLOSED;
    for every successor s' of s such that s' is not in CLOSED
      if g(s') > g(s) + c(s,s')
        g(s') = g(s) + c(s,s');
        insert s' into OPEN;

[figure: graph with sstart, s, s1, s2, sgoal]
g(s): the cost of a shortest path from sstart to s found so far
h(s): an (under)estimate of the cost of a shortest path from s to sgoal
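
Below is a minimal Python rendering of the ComputePath loop above, written against a generic successors(s) function like the one in the earlier sketch and a heuristic callable h(s). An optional weight eps is included so the same code also covers the weighted A* variant introduced on the next slides; eps = 1.0 gives plain A*. A sketch for illustration, not the planners' actual implementation:

import heapq

def compute_path(start, goal, successors, h, eps=1.0):
    """(Weighted) A*: returns (cost, path) or None if the goal is unreachable."""
    g = {start: 0.0}
    parent = {start: None}
    OPEN = [(eps * h(start), start)]                 # priority f = g + eps*h
    CLOSED = set()
    while OPEN:
        _, s = heapq.heappop(OPEN)
        if s in CLOSED:
            continue                                 # stale heap entry
        if s == goal:                                # goal is about to be expanded
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return g[goal], path[::-1]
        CLOSED.add(s)
        for s2, c in successors(s):
            if s2 not in CLOSED and (s2 not in g or g[s2] > g[s] + c):
                g[s2] = g[s] + c
                parent[s2] = s
                heapq.heappush(OPEN, (g[s2] + eps * h(s2), s2))
    return None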

Page 15

Heuristics in Heuristic Search
•  Solutions to a relaxed version of the planning problem
–  2D (x,y) planning: Euclidean distance heuristic (2D planning without obstacles and no grid)
–  20D arm planning: 2D end-effector distance heuristic (2D planning for the end-effector with obstacles)

Page 16

Heuristics in Heuristic Search
•  Dijkstra's: expands states in the order of f = g values
•  A* search: expands states in the order of f = g + h values
•  Weighted A*: expands states in the order of f = g + εh values, ε > 1
–  bias towards states that are closer to the goal

[figure: graph with sstart, s, s1, s2, sgoal]
g(s): the cost of a shortest path from sstart to s found so far
h(s): an (under)estimate of the cost of a shortest path from s to sgoal

Page 17

Heuristics in Heuristic Search
•  Dijkstra's: expands states in the order of f = g values

[figure: sstart and sgoal in a grid]
What are the states expanded?

Page 18

Heuristics in Heuristic Search
•  A* search: expands states in the order of f = g + h values

[figure: sstart and sgoal in a grid]
What are the states expanded?

Page 19

Heuristics in Heuristic Search
•  A* search: expands states in the order of f = g + h values

[figure: sstart and sgoal in a grid]
For high-D problems, this results in A* being too slow and running out of memory.

Page 20

Heuristics in Heuristic Search
•  Weighted A* search: expands states in the order of f = g + εh values, ε > 1
–  bias towards states that are closer to the goal

[figure: sstart and sgoal in a grid]

Page 21

Heuristics in Heuristic Search
•  Weighted A* search:
–  trades off optimality for speed
–  ε-suboptimal: cost(solution) ≤ ε · cost(optimal solution)
–  in many domains, it has been shown to be orders of magnitude faster than A*

A*: ε = 1.0, 20 expansions, solution = 10 moves
Weighted A*: ε = 2.5, 13 expansions, solution = 11 moves
(Here the bound only guarantees at most 2.5 × 10 = 25 moves; the 11-move solution actually found is far better than the worst case.)

Page 22

Anytime Search Based on Weighted A*
•  Constructing an anytime search based on weighted A*:
–  find the best path possible given some amount of time for planning
–  do it by running a series of weighted A* searches with decreasing ε:

ε = 2.5: 13 expansions, solution = 11 moves
ε = 1.5: 15 expansions, solution = 11 moves
ε = 1.0: 20 expansions, solution = 10 moves

Page 23

Anytime Search Based on Weighted A*
•  Constructing an anytime search based on weighted A*:
–  find the best path possible given some amount of time for planning
–  do it by running a series of weighted A* searches with decreasing ε (this naive loop is sketched in code below):

ε = 2.5: 13 expansions, solution = 11 moves
ε = 1.5: 15 expansions, solution = 11 moves
ε = 1.0: 20 expansions, solution = 10 moves

•  Inefficient because
–  many state values remain the same between search iterations
–  we should be able to reuse the results of previous searches
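
The naive anytime scheme on the last two slides can be written directly on top of the compute_path sketch above: run independent searches with a decreasing ε schedule and keep the latest solution. This is the inefficient version (every search starts from scratch) that ARA* improves on:

def naive_anytime_search(start, goal, successors, h, eps_schedule=(2.5, 1.5, 1.0)):
    """Repeated weighted A* with decreasing eps; each solution is eps-suboptimal."""
    best = None
    for eps in eps_schedule:
        result = compute_path(start, goal, successors, h, eps=eps)
        if result is not None:
            best = result                            # cost <= eps * optimal cost
            print("eps=%.1f: cost=%.1f" % (eps, result[0]))
    return best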

Page 24

Anytime Search Based on Weighted A*
•  Constructing an anytime search based on weighted A*:
–  find the best path possible given some amount of time for planning
–  do it by running a series of weighted A* searches with decreasing ε:

ε = 2.5: 13 expansions, solution = 11 moves
ε = 1.5: 15 expansions, solution = 11 moves
ε = 1.0: 20 expansions, solution = 10 moves

•  ARA*: an efficient version of the above that reuses state values between search iterations

Page 25

ARA*
•  Efficient series of weighted A* searches with decreasing ε:

ComputePathwithReuse function
  while (f(sgoal) > minimum f-value in OPEN)
    remove s with the smallest [g(s) + εh(s)] from OPEN;
    insert s into CLOSED;
    for every successor s' of s
      if g(s') > g(s) + c(s,s')
        g(s') = g(s) + c(s,s');
        if s' not in CLOSED then insert s' into OPEN;

Page 26

ARA*
•  Efficient series of weighted A* searches with decreasing ε:

ComputePathwithReuse function
  while (f(sgoal) > minimum f-value in OPEN)
    remove s with the smallest [g(s) + εh(s)] from OPEN;
    insert s into CLOSED;
    for every successor s' of s
      if g(s') > g(s) + c(s,s')
        g(s') = g(s) + c(s,s');
        if s' not in CLOSED then insert s' into OPEN;
        otherwise insert s' into INCONS;

Page 27

ARA*
•  Efficient series of weighted A* searches with decreasing ε:

ComputePathwithReuse function
  while (f(sgoal) > minimum f-value in OPEN)
    remove s with the smallest [g(s) + εh(s)] from OPEN;
    insert s into CLOSED;
    for every successor s' of s
      if g(s') > g(s) + c(s,s')
        g(s') = g(s) + c(s,s');
        if s' not in CLOSED then insert s' into OPEN;
        otherwise insert s' into INCONS;

Main function
  set ε to a large value; g(sstart) = 0; OPEN = {sstart};
  while ε ≥ 1
    CLOSED = {}; INCONS = {};
    ComputePathwithReuse();
    publish current ε-suboptimal solution;
    decrease ε;
    initialize OPEN = OPEN ∪ INCONS;
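
A compact Python sketch of ARA* following the two boxes above. ComputePathwithReuse keeps g-values across iterations; states whose g-value improves after they have already been CLOSED go into INCONS instead of being re-expanded, and the main loop re-seeds OPEN with OPEN ∪ INCONS before lowering ε. Bookkeeping is simplified (e.g., OPEN is re-keyed wholesale when ε changes), so treat this as a sketch rather than the reference implementation:

import heapq, math

def ara_star(start, goal, successors, h, eps_schedule=(2.5, 1.5, 1.0)):
    """Yields (eps, cost, path) once per eps value; each solution is eps-suboptimal."""
    g = {start: 0.0}
    parent = {start: None}
    OPEN = {start}                                   # membership set; priorities depend on eps

    def trace(s):
        path = []
        while s is not None:
            path.append(s)
            s = parent[s]
        return path[::-1]

    def compute_path_with_reuse(eps):
        f = lambda s: g[s] + eps * h(s)
        CLOSED, INCONS = set(), set()
        heap = [(f(s), s) for s in OPEN]             # re-key states carried over in OPEN
        heapq.heapify(heap)
        while heap and g.get(goal, math.inf) + eps * h(goal) > heap[0][0]:
            key, s = heapq.heappop(heap)
            if s not in OPEN or key > f(s):
                continue                             # stale heap entry
            OPEN.discard(s)
            CLOSED.add(s)
            for s2, c in successors(s):
                if s2 not in g or g[s2] > g[s] + c:
                    g[s2] = g[s] + c
                    parent[s2] = s
                    if s2 not in CLOSED:
                        OPEN.add(s2)
                        heapq.heappush(heap, (f(s2), s2))
                    else:
                        INCONS.add(s2)               # improved, but already expanded this round
        return INCONS

    for eps in eps_schedule:                         # decreasing eps
        INCONS = compute_path_with_reuse(eps)
        if goal in g:
            yield eps, g[goal], trace(goal)          # publish current eps-suboptimal solution
        OPEN |= INCONS                               # OPEN = OPEN U INCONS for the next iteration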

Page 28

ARA*
•  A series of weighted A* searches:
ε = 2.5: 13 expansions, solution = 11 moves
ε = 1.5: 15 expansions, solution = 11 moves
ε = 1.0: 20 expansions, solution = 10 moves

•  ARA*:
ε = 2.5: 13 expansions, solution = 11 moves
ε = 1.5: 1 expansion, solution = 11 moves
ε = 1.0: 9 expansions, solution = 10 moves

Page 29: Sub-optimal Heuristic Search and its Application to ... search-based approaches (e.g., A*, D*) planning with sampling-based approaches (e.g., RRT, PRM,…) + completeness/optimality

Maxim Likhachev

When using ARA*, the research is in finding a graph representation G and a corresponding heuristic function h that lead to shallow local minima for the search

sstart sgoal …

key to finding solution fast: shallow minima for h(s)-h*(s) function

29 Carnegie Mellon University

ARA*

Page 30

ARA*-based Planning for Manipulation
•  ARA*-based motion planning for a single 7-DOF arm [Cohen et al., ICRA'10, ICRA'11]
–  goal is given as a 6D (x, y, z, roll, pitch, yaw) pose of the end-effector
–  each state is: the 4 non-wrist joint angles {q1, …, q4} when far away from the goal; all 7 joint angles {q1, …, q7} when close to the goal
–  actions are: {moving one joint at a time by a fixed ∆; an IK-based motion to snap to the goal when close to it}

Page 31

ARA*-based Planning for Manipulation
•  ARA*-based motion planning for a single 7-DOF arm [Cohen et al., ICRA'10, ICRA'11]
–  goal is given as a 6D (x, y, z, roll, pitch, yaw) pose of the end-effector
–  each state is: the 4 non-wrist joint angles {q1, …, q4} when far away from the goal; all 7 joint angles {q1, …, q7} when close to the goal
–  actions are: {moving one joint at a time by a fixed ∆; an IK-based motion to snap to the goal when close to it}
–  heuristics are: 3D BFS distances for the end-effector (sketched below)

[figures: expansions (shown as dots corresponding to end-effector poses) based on the 3D BFS heuristic vs. expansions based on the Euclidean distance heuristic (heuristic shown as a curve)]
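
The 3D BFS heuristic above is typically a breadth-first flood fill over a coarse 3D occupancy grid of the workspace, computed once backwards from the goal cell; an arm state is then scored by the grid distance of the cell its end-effector falls into. A rough sketch under those assumptions (the grid, cell size, and forward_kinematics callable are placeholders, not the planners' actual implementation):

from collections import deque
import numpy as np

def bfs_3d_from_goal(occupancy, goal_cell):
    """Breadth-first distances (in cells) from goal_cell over the free cells of a 3D grid."""
    dist = np.full(occupancy.shape, np.inf)
    dist[goal_cell] = 0.0
    queue = deque([goal_cell])
    neighbors = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        cell = queue.popleft()
        for d in neighbors:
            n = tuple(cell[i] + d[i] for i in range(3))
            inside = all(0 <= n[i] < occupancy.shape[i] for i in range(3))
            if inside and not occupancy[n] and dist[n] == np.inf:
                dist[n] = dist[cell] + 1.0
                queue.append(n)
    return dist

def end_effector_heuristic(joint_angles, dist, cell_size, forward_kinematics):
    # forward_kinematics(joint_angles) is assumed to return the end-effector (x, y, z).
    x, y, z = forward_kinematics(joint_angles)
    cell = (int(x / cell_size), int(y / cell_size), int(z / cell_size))
    return dist[cell]                                # grid distance used as h(s) for the arm state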

Page 32

ARA*-based Planning for Manipulation
•  ARA*-based motion planning for a single 7-DOF arm [Cohen et al., ICRA'10, ICRA'11] (continued)
–  heuristics are: 3D BFS distances for the end-effector

[figures: expansions based on the 3D BFS heuristic vs. expansions based on the Euclidean distance heuristic]

Page 33

ARA*-based Planning for Manipulation
•  ARA*-based motion planning for a single 7-DOF arm [Cohen et al., ICRA'10, ICRA'11] (continued)
–  heuristics are: 3D BFS distances for the end-effector

[figures: expansions based on the 3D BFS heuristic vs. expansions based on the Euclidean distance heuristic]

Page 34

ARA*-based Planning for Manipulation
•  ARA*-based motion planning for a single 7-DOF arm [Cohen et al., ICRA'10, ICRA'11]

Shown on the PR2 robot, but the planner has also been run on other arms (e.g., KUKA).

Page 35

ARA*-based Planning for Manipulation
•  ARA*-based dual-arm motion planning [Cohen et al., ICRA'12]
–  goal is given as a 4D (x, y, z, yaw) pose for the object
–  each state is 6D: the 4D (x, y, z, yaw) pose of the object plus the two elbow angles {q1, q2}
–  actions are: {changing the 4D pose of the object, or changing q1, q2}
–  heuristics are: 3D BFS distances for the object

Page 36

ARA*-based Planning for Manipulation
•  ARA*-based dual-arm motion planning [Cohen et al., ICRA'12]

motions as generated by the planner, with no smoothing

Page 37

ARA*-based Planning for Manipulation
•  ARA*-based dual-arm motion planning [Cohen et al., ICRA'12]

motions as generated by the planner, with no smoothing; consistency of motions across similar start-goal trials

Page 38: Sub-optimal Heuristic Search and its Application to ... search-based approaches (e.g., A*, D*) planning with sampling-based approaches (e.g., RRT, PRM,…) + completeness/optimality

Maxim Likhachev

Pros Cons

Carnegie Mellon University 38

•  Explicit cost minimization •  Consistent plans (similar

input generates similar output)

•  Can handle arbitrary goal sets (such as set of grasps from a grasp planner)

•  Many path constraints are easy to implement (upright orientation of gripper)

•  Requires good graph and heuristic design to plan fast

•  Number of motions from a state affects speed and path quality

Discussion

ARA*-based Planning for Manipulation

Page 39

Sub-optimal Heuristic Search for High-D Motion Planning
•  ARA* (anytime version of A*) graph search
–  effective use of solutions to relaxed (easier) motion planning problems
•  Experience Graphs
–  heuristic search that learns from its planning experiences

Page 40

Planning with Experience Graphs
•  Many planning tasks are repetitive
–  loading a dishwasher
–  opening doors
–  moving objects around a warehouse
–  …
•  Can we re-use prior experience to accelerate planning, in the context of search-based planning?
•  Would be especially useful for high-dimensional problems such as mobile manipulation!

Page 41

Planning with Experience Graphs
Given a set of previous paths (experiences)…

Page 42

Planning with Experience Graphs
Put them together into an E-graph (Experience Graph)

Page 43

Planning with Experience Graphs
•  E-Graph [Phillips et al., RSS'12]:
–  a collection of previously computed paths or demonstrations (a toy sketch follows)
–  a sub-graph of the original graph
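
In code, building the E-graph amounts to accumulating the vertices and edges of each solved path (or demonstration) into a small graph that is kept across planning queries. The representation below is an illustrative assumption, not the planner's actual data structure:

class EGraph:
    """Union of previously found paths, stored as an undirected adjacency map."""
    def __init__(self):
        self.adj = {}                                # state -> {neighbor: edge cost}

    def add_path(self, path, edge_cost):
        # path is the sequence of states returned by the planner (or a demonstration).
        for a, b in zip(path, path[1:]):
            c = edge_cost(a, b)
            self.adj.setdefault(a, {})[b] = c
            self.adj.setdefault(b, {})[a] = c

    def edge_cost(self, a, b):
        return self.adj.get(a, {}).get(b)            # None if (a, b) is not an E-graph edge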

Page 44

Planning with Experience Graphs
Given a new planning query…

Page 45

Planning with Experience Graphs
…re-use the E-graph. For repetitive tasks, planning becomes much faster.

[figure: start and goal with a path re-using E-graph segments]

Page 46

Planning with Experience Graphs
…re-use the E-graph. For repetitive tasks, planning becomes much faster.

[figure: start and goal with a path re-using E-graph segments]

Theorem 1: The algorithm is complete with respect to the original graph.
Theorem 2: The cost of the solution is within a given bound on sub-optimality.

Page 47

Planning with Experience Graphs
•  Reuse the E-Graph by:
–  introducing a new heuristic function
–  the heuristic guides the search toward expanding states on the E-Graph, within a sub-optimality bound ε

[figure: start and goal, with the search drawn toward the E-graph]

Page 48

Planning with Experience Graphs
•  Focusing the search towards the E-graph within sub-optimality bound ε

[excerpt from Phillips et al., RSS'12, shown on the slide]
The planner maintains two graphs: the original graph G and the experience graph GE. On each planning request, findPath first calls updateEGraph (update changed edge costs in GE, disable edges that have become invalid, re-enable disabled edges that have become valid again, precompute shortcut edges, and compute the heuristic hE), then calls computePath to produce a path π, and finally adds π to GE. The heuristic hE is defined in terms of the original consistent heuristic hG, which satisfies the triangle inequality hG(s, sgoal) ≤ c(s, s') + hG(s', sgoal), and the edges of GE:

    hE(s0) = min over paths π = ⟨s0, s1, …, sN−1⟩ with sN−1 = sgoal of  Σ_{i=0}^{N−2} min{ εE · hG(si, si+1), cE(si, si+1) }     (Eq. 1)

where εE ≥ 1 is a scalar. Each segment (si, si+1) is charged either the inflated original heuristic εE · hG(si, si+1) or, if it is an edge of GE, its actual E-graph cost cE(si, si+1). The larger εE is, the more expensive it becomes to travel off GE, so the heuristic (and hence the search) prefers to follow old paths and can avoid obstacles with far fewer expansions; with εE = 1 the E-graph is ignored. computePath runs weighted A* without re-expansions, with two extra successor types: shortcuts along GE and snap motions.

Traveling on the E-Graph uses actual costs.
Traveling off the E-Graph uses an inflated original heuristic.
The heuristic computation finds a min-cost path using these two kinds of "edges" (one way to compute it is sketched below).

[figure: start and goal, with the heuristic path alternating between hG segments and E-graph segments]
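
One common way to evaluate Equation 1: run a Dijkstra search backwards from the goal over the E-graph vertices, where any pair of vertices can be connected by a "heuristic edge" of cost εE·hG and E-graph edges additionally offer their real cost cE; the heuristic of an arbitrary state s is then the cheapest way of reaching that precomputed structure. A sketch under those assumptions, reusing the toy EGraph class from the earlier sketch (dense pairwise heuristic edges keep the code short; a real implementation would be more careful about efficiency):

import heapq

def build_hE(egraph, h_G, goal, eps_E):
    """Returns a callable h_E(s) implementing Equation 1 for a given goal.
    h_G(a, b) is the original heuristic evaluated between two states."""
    nodes = list(egraph.adj.keys()) + [goal]
    dist = {goal: 0.0}                               # hE-distance of E-graph vertices to the goal
    heap = [(0.0, goal)]
    while heap:
        du, u = heapq.heappop(heap)
        if du > dist.get(u, float("inf")):
            continue                                 # stale entry
        for v in nodes:
            if v == u:
                continue
            w = eps_E * h_G(u, v)                    # off-E-graph segment: inflated heuristic
            c = egraph.edge_cost(u, v)
            if c is not None:
                w = min(w, c)                        # on-E-graph segment: actual edge cost
            if du + w < dist.get(v, float("inf")):
                dist[v] = du + w
                heapq.heappush(heap, (du + w, v))

    def h_E(s):
        # Either go straight to the goal via the inflated original heuristic,
        # or jump to some E-graph vertex v and follow its precomputed distance.
        best = eps_E * h_G(s, goal)
        for v, dv in dist.items():
            best = min(best, eps_E * h_G(s, v) + dv)
        return best

    return h_E

With εE = 1 and an admissible hG this collapses to plain hG (a real E-graph edge can never beat the heuristic between its endpoints), which matches the εE = 1 case described in the excerpt; as εE grows, traveling off the E-graph is penalized and the search is pulled along prior paths.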

Page 49

Planning with Experience Graphs
•  Focusing the search towards the E-graph within sub-optimality bound ε

[same RSS'12 excerpt and figure as on page 48]

Page 50

Planning with Experience Graphs
•  Focusing the search towards the E-graph within sub-optimality bound ε

[same RSS'12 excerpt and figure as on page 48]

Page 51

Planning with Experience Graphs
•  Focusing the search towards the E-graph within sub-optimality bound ε

[same RSS'12 excerpt as on page 48; figures show the expansions for εE = 1.5 and εE = ∞, each with start and goal marked]

Page 52

Planning with Experience Graphs
Theorem 5 (Completeness w.r.t. the original graph G): Planning with E-graphs is guaranteed to find a solution if one exists in G.
Theorem 6 (Bounds on sub-optimality): The cost of the solution found by planning with E-graphs is guaranteed to be at most εE-suboptimal:
    cost(solution) ≤ εE · cost(optimal solution in G)

Page 53

Planning with Experience Graphs
Theorem 5 (Completeness w.r.t. the original graph G): Planning with E-graphs is guaranteed to find a solution if one exists in G.
Theorem 6 (Bounds on sub-optimality): The cost of the solution found by planning with E-graphs is guaranteed to be at most εE-suboptimal:
    cost(solution) ≤ εE · cost(optimal solution in G)

These properties hold even if the quality or placement of the experiences is arbitrarily bad.

Page 54

Planning with E-Graphs for Mobile Manipulation
•  Dual-arm mobile manipulation (10 DoF)

Page 55

Planning with E-Graphs for Mobile Manipulation
•  Dual-arm mobile manipulation (10 DoF)
•  3 for base navigation
•  1 for the spine
•  6 for the arms (upright object constraint: roll and pitch are 0)
•  Experiments in multiple environments

Page 56

Planning with E-Graphs for Mobile Manipulation
Kitchen environment:
–  moving objects around a kitchen
–  bootstrap the E-Graph with 10 representative goals
–  tested on 40 goals in natural kitchen locations

Page 57

Planning with E-Graphs for Mobile Manipulation
Kitchen environment: planning times
•  Max planning time of 2 minutes
•  Sub-optimality bound of 20 (for E-Graphs and Weighted A*)

  Method        Success (of 40)   Mean Speed-up   Std. Dev.   Max
  Weighted A*   37                34.62           87.74       506.78

  Method        Success (of 40)   Mean Time (s)   Std. Dev. (s)   Max (s)
  E-Graphs      40                0.33            0.25            1.00

Page 58

Planning with E-Graphs for Mobile Manipulation
Kitchen environment: path quality

  Method        Success (of 40)   Object XYZ Path Length Ratio   Std. Dev.
  Weighted A*   37                0.91                           0.68

Page 59: Sub-optimal Heuristic Search and its Application to ... search-based approaches (e.g., A*, D*) planning with sampling-based approaches (e.g., RRT, PRM,…) + completeness/optimality

Maxim Likhachev

Planning with E-Graphs for Mobile Manipulation

Pros Cons

Carnegie Mellon University 59

•  Can lead the search around local minima (such as obstacles)

•  Provided paths can have arbitrary quality

•  Can reuse many parts of multiple experiences

•  User can control bound on solution quality

•  Local minima can be created getting on to the E-Graph

•  The E-Graph can grow arbitrarily large

Discussion

Page 60

Conclusions
•  Heuristic search within sub-optimality bounds
–  can be much more suitable for high-dimensional planning for manipulation than optimal heuristic search
–  still provides:
   -  good cost minimization
   -  consistent behavior
   -  rigorous theoretical guarantees
•  Planning for manipulation solves similar tasks over and over again (unlike planning for navigation, flight, …)
–  a great opportunity for combining planning with learning