
Real-Time Stochastic Kinodynamic Motion Planning via Multiobjective Search on GPUs

Brian Ichter, Edward Schmerling, Ali-akbar Agha-mohammadi, and Marco Pavone

Abstract— In this paper we present the PUMP (Parallel Uncertainty-aware Multiobjective Planning) algorithm for addressing the stochastic kinodynamic motion planning problem, whereby one seeks a low-cost, dynamically-feasible motion plan subject to a constraint on collision probability (CP). To ensure exhaustive evaluation of candidate motion plans (as needed to trade off the competing objectives of performance and safety), PUMP incrementally builds the Pareto front of the problem, accounting for the optimization objective and an approximation of CP. This is performed by a massively parallel multiobjective search, here implemented with a focus on GPUs. Upon termination of the exploration phase, PUMP searches the Pareto set of motion plans to identify the lowest cost solution that is certified to satisfy the CP constraint (according to an asymptotically exact estimator). We introduce a novel particle-based CP approximation scheme, designed for efficient GPU implementation, which accounts for dependencies over the history of a trajectory execution. We present numerical experiments for quadrotor planning wherein PUMP identifies solutions in ~100 ms, evaluating over one hundred thousand partial plans through the course of its exploration phase. The results show that this multiobjective search achieves a lower motion plan cost, for the same CP constraint, compared to a safety buffer-based search heuristic and repeated RRT trials.

I. INTRODUCTION

Motion planning is a fundamental problem in robotics involving the computation of a trajectory from an initial state to a goal state that avoids obstacles and optimizes an objective function [1]. This problem has traditionally been considered in the context of deterministic systems without kinematic or dynamic constraints. On the other hand, autonomous robotic systems are poised to operate in increasingly dynamic, partially-known environments where these simplifying assumptions may no longer hold. Successful navigation in these settings requires quick actions that fully exploit the system's dynamical capabilities and are robust to uncertainties in sensing and actuation. Proper consideration of trajectory cost should bring trajectory optimization against the limits of safety design constraints; a well-designed trajectory should therefore be situated on a Pareto optimal front representing the tradeoff between the competing objectives of performance and safety.

Prompted by these considerations, in this work we study the problem of motion planning under uncertainty. We focus, in particular, on the stochastic kinodynamic motion planning formulation investigated in [2] (henceforth referred

Brian Ichter and Marco Pavone are with the Department of Aeronautics and Astronautics, Stanford University. Edward Schmerling is with the Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA 94305. Ali-akbar Agha-mohammadi is with the Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA 91109. {ichter, schmrlng, pavone}@stanford.edu, [email protected]

This work was supported by a Qualcomm Innovation Fellowship and by NASA under the Space Technology Research Grants Program, Grant NNX12AQ43G. Brian Ichter was supported by the DoD NDSEG Program.

Fig. 1: (1a-1b) Illustration of candidate motion plans identified by PUMP's exploration phase, representing multiple homotopy classes. (1c) Selected trajectories for a 2% (blue) and 5% (red) target CP; the 2% CP motion plan takes a less aggressive route.

to as the SKP problem), whereby one seeks a low-cost reference trajectory to be tracked subject to a constraint on obstacle collision probability (CP). Our key contribution is to introduce the Parallel Uncertainty-aware Multiobjective Planning algorithm, a novel approach to planning under uncertainty that initially employs a multiobjective search to build a Pareto front of plans within the state space, actively considering both cost and an approximation of CP. As shown in Fig. 1, this search identifies several high-quality trajectories to be considered as solutions. A bisection search is then performed over the Pareto set of solutions to identify the lowest cost reference trajectory that can be tracked with satisfaction of the CP constraint (verified via Monte Carlo (MC) methods [2]). Such a procedure enables the explicit tradeoff of performance and safety, as opposed to several existing approaches that forgo plan optimization and recast the problem as an unconstrained planning problem where trajectory CP is minimized (with, possibly, a posteriori heuristics to improve upon solution cost). The key to enabling real-time solutions in this work lies in algorithm parallelization to leverage GPU hardware and the development of a novel, fast, and accurate CP approximation strategy over trajectories.

Related Work: A variety of strategies have been studied for planning under uncertainty. One choice is to formalize the planning problem as a partially observable Markov Decision Process (POMDP) [3, 4, 5]. This allows a planner to optimize over closed-loop control policies that map belief distributions over the robot state to appropriate actions. Considering POMDP problems in their full generality, however, has been recognized as extremely computationally intensive; [6] provides a method for remedying the complexity of tracking the evolution of the state belief. In that work the authors use local feedback controllers to regularize the belief state at certain waypoints in space in order to break the dependence on history. While effective for constructing control policies that can handle large uncertainties, this approach is limited in the context of trajectory optimization due to the required stabilization phases, which may adversely affect trajectory cost. For systems where robustness against large deviations is not a routine need, a common alternative approach is to optimize over open-loop trajectories, producing control sequences which serve as control policies when augmented with a tracking controller. These reference trajectories are recomputed in a receding horizon fashion to address tracking errors and environmental uncertainties. This is the control strategy advanced in [2, 7, 8, 9]; we note that high-frequency replanning [9] has been shown to yield greatly increased robustness for systems with significant uncertainty. These works view the problem through the lens of chance-constrained optimization [10], of which the SKP is one formulation, where trajectories are gauged by a metric of their safety in addition to cost. Computationally efficient methods for the asymptotically exact estimation of CP were introduced in [2]; however, the Monte Carlo motion planning procedure in that work addresses the relationship between cost and CP only indirectly, employing a deterministic asymptotically optimal planner with a safety-buffer heuristic as a proxy for CP. The methods in [8, 9] follow an alternative approach towards chance-constrained optimization, whereby one employs parallel rapidly-exploring random trees (RRTs) in a safety-agnostic setting to generate a large set of candidate trajectories for CP evaluation; high trajectory quality is ensured only through evaluation of many full-length trajectories. In this work we draw inspiration from the multiobjective literature to select reference trajectories in a principled, incremental fashion during exploration, achieving exhaustivity similar to evaluating many full motion plans. This enables rapid planning without compromising on cost or the ability to guarantee a hard safety constraint.

Unfortunately, even with a known state space (e.g., given a graph of all possible states and motions), multiobjective search with just two objectives is already NP-hard [11]. Furthermore, the allocation of risk along a trajectory is a combinatorial problem. While basic formulations have been solved with computationally intensive optimizations, such as mixed-integer linear programs or constrained nonlinear programs, only brute force techniques are amenable to problems involving dynamics, high dimensions, or cluttered spaces [7]. Thus a major tenet of this work is to leverage algorithm parallelization to exploit the computational breadth of GPUs. An early result in sampling-based motion planning showed that probabilistic roadmap methods are embarrassingly parallel [12], which later inspired implementation on GPUs [13]. Most recent works in parallelization of sampling-based motion planning have focused on specific subroutines (such as collision checking and nearest neighbor search) [14, 15] or used AND-/OR-parallelism to adapt serial algorithms [9, 16, 17]. In our work, the design of the entire algorithm

(as opposed to isolated subroutines) is motivated by massive parallelization and efficient GPU implementation, in particular the CP approximation and multiobjective exploration.

Statement of Contributions: In this work we present the Parallel Uncertainty-aware Multiobjective Planning algorithm (PUMP), capable of returning high quality solutions to the stochastic kinodynamic motion planning problem. At PUMP's core is a multiobjective search that considers both trajectory cost and an approximation of collision probability while exploring the state space; in this way, a Pareto front of high-quality motion plans is identified for later certification of CP constraint satisfaction via MC methods. The additional computational burden of maintaining a Pareto front of plans is offset through algorithm design for massively parallel computation on GPU hardware.

The design of PUMP can be separated into three phases, each of which is well posed for application to GPUs. The first is a graph building phase which constructs a representation of available motions within the free state space, as inspired by embarrassingly parallel probabilistic roadmap methods [12]. The second explores this graph through a multiobjective search, building a Pareto front of motion plans outward from the initial state by expanding in parallel the set of all frontier plans below a constantly increasing cost threshold. This expansion is followed by a dominance check that considers the cost and CP (approximated using a heuristic) of all plans arriving at the same state to aggressively remove those that are less promising. We note that propagating CP along a motion plan requires consideration of the uncertainty distribution conditioned on collision avoidance; for each partial motion plan we collapse the full shape of the uncertainty distribution to a single approximate CP value for dominance comparison. We show that this heuristic works well in practice and is necessary for limiting the search branching factor. Upon completion of this graph exploration phase, a bisection search is performed over the Pareto optimal set of plans reaching the goal, according to their approximate CP. The intent of this third phase is to correct for any CP inaccuracy by computing an asymptotically exact estimate of the collision probability using MC-based methods [2] (an embarrassingly parallel process), to certify that the final returned solution satisfies the problem constraint.

Through numerical experiments with a simplified quadrotor model, we find that PUMP improves over Monte Carlo Motion Planning [2] and repeated RRT trajectory generation in terms of motion plan cost, given the same CP constraint, while returning results at a tempo compatible with real-time operation, on the order of 100 ms. To achieve this run time, a new trajectory CP approximation method is presented, termed Half-Space Monte Carlo (HSMC). The method proceeds by sampling many realizations of a reference-tracking controller and checking each against a series of relevant half-spaces representing a simplified collision model, allowing a fast, accurate, and trivially parallelized CP approximation.

Organization: This paper is structured as follows. Section II formally defines the stochastic kinodynamic motion planning problem addressed in this work. Section III outlines previous collision probability approximation methods and presents the novel HSMC method. Section IV discusses the PUMP algorithm, and Section V describes its implementation and presents simulation results supporting our statements. Lastly, Section VI discusses conclusions and future work.

II. PROBLEM STATEMENT

This paper addresses the stochastic kinodynamic motion planning (SKP) problem formulation posed in [2], which we briefly review here. We consider a robot described by linear dynamics with Gaussian process and measurement noise which tracks a nominal trajectory using a Linear-Quadratic Gaussian (LQG)-derived controller. Although PUMP with Monte Carlo simulation is equipped, as an algorithmic framework, to handle problem formulations generalized to a nonlinear setup (as in, e.g., [8, 9], which base their computations on a local linearization around the reference trajectory) and to any tracking controller (e.g., LQG with nonlinear extended Kalman filter estimation, or geometric tracking control [18]), we focus our attention in this paper on demonstrating improvement in performance and speed over existing LQG methods. We note, however, that the Monte Carlo methods discussed in this work — in particular, all of their intermediate inputs computed from dynamics — are in principle directly applicable to a local linearization, or even non-Gaussian noise sampled, e.g., from zero-mean stable distributions.

The underlying nominal trajectory (and corresponding feedforward control term) is planned assuming continuous dynamics, but we assume discretized (zero-order hold) approximate dynamics for the tracking controller. That is, the dynamics evolve according to the stochastic linear model:

\[
\dot{x}(t) = A_c x(t) + B_c u(t) + v(t), \qquad y(t) = C_c x(t) + w(t), \tag{1}
\]

where x(t) ∈ R^d is the state, u(t) ∈ R^ℓ is the control input, y(t) ∈ R^{d_w} is the workspace output, and v ∼ N(0, V_c) and w ∼ N(0, W_c) represent Gaussian process and measurement noise, respectively. Fix a tracking controller timestep ∆t so that, e.g., x_t denotes x(t·∆t), and fix a nominal trajectory (x^nom(t), u^nom(t), y^nom(t)), t ∈ [0, τ], satisfying the dynamics (1) without the noise terms v and w. With deviation variables defined as δx_t := x_t − x^nom_t, δu_t := u_t − u^nom_t, and δy_t := y_t − y^nom_t, for t = 0, . . . , T, the discretized approximate dynamics for the tracking controller evolve as
\[
\begin{aligned}
\delta x_{t+1} &= A\,\delta x_t + B\,\delta u_t + v_t, \qquad v_t \sim \mathcal{N}(0, V),\\
\delta y_t &= C\,\delta x_t + w_t, \qquad w_t \sim \mathcal{N}(0, W),\\
A &:= e^{A_c \Delta t}, \qquad B := \Big(\int_0^{\Delta t} e^{A_c s}\,ds\Big) B_c, \qquad C := C_c,\\
V &:= \int_0^{\Delta t} e^{A_c s}\, V_c\, e^{A_c^\top s}\,ds, \qquad W := W_c/\Delta t.
\end{aligned} \tag{2}
\]
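For concreteness, the discrete-time matrices in (2) can be computed numerically from (A_c, B_c, V_c); the sketch below is our own illustration (not code from the paper), approximating the matrix-exponential integrals with a midpoint rule in Python/NumPy.

import numpy as np
from scipy.linalg import expm

def discretize_zoh(Ac, Bc, Vc, dt, n_quad=100):
    """Zero-order-hold discretization of eq. (2):
    A = expm(Ac*dt), B = (int_0^dt expm(Ac*s) ds) Bc,
    V = int_0^dt expm(Ac*s) Vc expm(Ac*s)^T ds (midpoint quadrature)."""
    A = expm(Ac * dt)
    ds = dt / n_quad
    int_expm = np.zeros_like(Ac)   # accumulates int_0^dt expm(Ac*s) ds
    V = np.zeros_like(Vc)
    for k in range(n_quad):
        E = expm(Ac * (k + 0.5) * ds)
        int_expm += E * ds
        V += E @ Vc @ E.T * ds
    B = int_expm @ Bc
    return A, B, V

The discrete measurement covariance then follows directly as W = W_c / ∆t.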

The discrete LQG controller δu^LQG_t := L_t δx̂_t, with L_t and δx̂_t denoting the feedback gain matrix and Kalman state estimate, respectively (see [19] for computation details), minimizes the tracking cost
\[
J := \mathbb{E}\Big[\delta x_T^\top F\, \delta x_T + \sum_{t=0}^{T-1} \big(\delta x_t^\top Q\, \delta x_t + \delta u_t^\top R\, \delta u_t\big)\Big].
\]
The control in continuous time is then u(t) = u^nom(t) + δu^LQG_{⌊t/∆t⌋}.
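The Monte Carlo and HSMC procedures discussed later both consume sampled realizations of this tracking loop. The sketch below is a simplified illustration of ours: it assumes full-state feedback δu_t = L_t δx_t with a given gain sequence (i.e., it omits the Kalman filter of the full LQG loop) and rolls out tracking-error particles under the discretized dynamics (2).

import numpy as np

def sample_tracking_deviations(A, B, Ls, V, n_particles, rng=None):
    """Sample closed-loop tracking-error trajectories
        delta_x_{t+1} = A delta_x_t + B (L_t delta_x_t) + v_t,  v_t ~ N(0, V).
    Simplifying assumption (ours): full-state feedback, no Kalman filter;
    Ls is a list of T feedback gain matrices."""
    rng = np.random.default_rng() if rng is None else rng
    d = A.shape[0]
    dx = np.zeros((n_particles, d))          # all particles start on the nominal
    history = [dx.copy()]
    for L_t in Ls:
        Acl = A + B @ L_t                    # closed-loop transition for this step
        noise = rng.multivariate_normal(np.zeros(d), V, size=n_particles)
        dx = dx @ Acl.T + noise
        history.append(dx.copy())
    return np.stack(history)                 # shape (T+1, n_particles, d)

Projecting each deviation into the workspace via C δx_t gives the particle positions compared against obstacles (or, in Section III-B, against local half-spaces).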

The SKP problem is posed as a cost optimization over dynamically-feasible trajectories (also referred to as motion plans) subject to a safety tolerance constraint separate from the optimization objective. Let Xobs be the obstacle space, so that Xfree = R^d \ Xobs is the free state space. Let Xgoal ⊂ Xfree and x0 = xinit ∈ Xfree be the goal region and initial state, respectively. Given a trajectory cost measure c and CP tolerance α, we wish to solve the following SKP.

Stochastic Kinodynamic Motion Planning (SKP):
\[
\begin{aligned}
\min_{u^{nom}(\cdot)} \quad & c(x^{nom}(\cdot))\\
\text{s.t.} \quad & \mathbb{P}\big(\{x(t) \mid t \in [0, \tau]\} \cap \mathcal{X}_{obs} \neq \emptyset\big) \leq \alpha,\\
& u(t) = u^{nom}(t) + \delta u^{LQG}_{\lfloor t/\Delta t \rfloor},\\
& \text{Equation (1)}.
\end{aligned} \tag{3}
\]

The numerical experiments in this work take c as the mixed time/quadratic control effort penalty studied in [20, 21]. We make an implicit assumption in optimizing over the cost of the reference trajectory c(x^nom(·)) that the tracking control costs are minor in comparison to the nominal control cost; as noted in Section I, this assumption also underlies the use of a tracking control strategy rather than computing a full policy over state beliefs.

III. COLLISION PROBABILITY APPROXIMATION

This section discusses a key challenge in computing solutions for the SKP problem, that of efficiently and accurately assessing collision probability. As discussed in [2], Monte Carlo methods exist as an asymptotically exact means for calculating the CP when tracking a trajectory segment. These methods are suitable for validating the most promising candidate trajectories post-exploration, but the high branching factor of multiobjective search necessitates less computationally intensive approximation schemes when checking a great number of candidate motions (most of which are discarded as Pareto dominated) at a tempo compatible with real-time operations. In Section III-A we review existing strategies for approximating CP. In Section III-B we present the novel Half-Space Monte Carlo (HSMC) method, aimed at efficiently and accurately approximating CP, particularly on GPUs, and analyze this method with numerical experiments.

A. Additive, Multiplicative, and MC CP Approximations

While the constraint in (3) is defined over continuous trajectory realizations, there are a number of waypoint-based schemes which derive an approximation of trajectory CP as a function of the instantaneous pointwise collision probabilities at the intermediate waypoints {x_t}. Additive approximations [7] employ a conservative union bound
\[
\mathrm{CP} \approx \mathbb{P}\Big(\bigvee_{t=0}^{T} \{x_t \in \mathcal{X}_{obs}\}\Big) \leq \sum_{t=0}^{T} \mathbb{P}\big(\{x_t \in \mathcal{X}_{obs}\}\big), \tag{4}
\]

while multiplicative approximations [8] assume independence between pointwise collision events:
\[
1 - \mathrm{CP} \approx \mathbb{P}\Big(\bigwedge_{t=0}^{T} \{x_t \notin \mathcal{X}_{obs}\}\Big) \approx \prod_{t=0}^{T} \mathbb{P}\big(\{x_t \notin \mathcal{X}_{obs}\}\big). \tag{5}
\]

This independence assumption may be relaxed by fitting Gaussians successively to the waypoint distributions conditioned on no prior collisions (see, e.g., [9, 22] for details), in order to approximate the terms of the product expansion
\[
1 - \mathrm{CP} \approx \mathbb{P}\Big(\bigwedge_{t=0}^{T} \{x_t \notin \mathcal{X}_{obs}\}\Big) = \prod_{t=0}^{T} \mathbb{P}\Big(\{x_t \notin \mathcal{X}_{obs}\} \;\Big|\; \bigwedge_{s=0}^{t-1} \{x_s \notin \mathcal{X}_{obs}\}\Big), \tag{6}
\]

which we refer to in this text as the conditional multiplicative approximation. A common additional approximation made with these waypoint-based approaches is to assume a simplified collision model, for example by approximating the free space around a waypoint as an ellipsoid [7] or by a locally convex region defined by half-space constraints [22]. We note that even though these collision models, as well as the additive approximation (4), may be deemed conservative, none of these methods are guaranteed to over- or under-approximate the actual CP, as they do not allow for collisions in the intervals between waypoints [2].
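Given pointwise waypoint collision probabilities (computed by whichever Gaussian or sampling scheme one prefers), the additive bound (4) and the independence approximation (5) are one-liners; a minimal sketch follows (the conditional variant (6) additionally needs the distributions conditioned on prior collision avoidance, so it is not shown).

import numpy as np

def additive_cp(pointwise_cp):
    """Union-bound (additive) approximation, eq. (4): sum of waypoint CPs, capped at 1."""
    return min(1.0, float(np.sum(pointwise_cp)))

def multiplicative_cp(pointwise_cp):
    """Independence (multiplicative) approximation, eq. (5):
    1 - prod_t P(no collision at waypoint t)."""
    return 1.0 - float(np.prod(1.0 - np.asarray(pointwise_cp)))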

Monte Carlo methods can exactly address the dependencies between waypoint CPs, as well as accommodate continuous collision checking, at the expense of computational effort and statistical variance in the estimate. In [2], variance-reduced MC methods are developed for estimating CP along the span of a full trajectory. The method in [6] breaks dependence between collision probabilities at certain points in the state space using feedback controllers. This construct divides the CP computation into smaller pieces corresponding to local connections, which allows for expensive MC-based CP computation to be done offline, given the environment map.

B. Half-Space Monte Carlo

In this subsection we detail the massively parallel HSMC method and show that it attains good empirical accuracy. This CP approximation is inspired by particle-based MC methods in their ability to represent arbitrary state distributions and how they evolve in Xfree, as well as by the efficient collision models of waypoint-based methods. Briefly, HSMC is a Monte Carlo method where, within local convex regions around a reference waypoint x^nom_t, sampled trajectories (particles) are checked at each time step by removing those that might be "expected" to be in collision through subsequent motion towards x^nom_{t+1}. Instead of computing a full collision check, we only compare the particle's position in the workspace against the half-space boundaries of the local convex region (with consideration for the reference waypoint state's velocity, see Fig. 2). The local convexification process employed by HSMC differs from that presented in [22] and used in [2], in that obstacle half-spaces are computed on the basis of Euclidean distance in the workspace, as opposed to Mahalanobis distance. In those works Mahalanobis distance is used as a proxy for the shape of the uncertainty distribution centered at x^nom_t; HSMC instead maintains a discretized Monte Carlo representation of this uncertainty distribution, so that position and velocity are sufficient to judge approximate collision in the Euclidean workspace. We remark that the computation of each local convex region is independent of trajectory history and uncertainty distribution, thus allowing for a massively parallel computation at the start of planning. Furthermore, if the obstacles are available a priori, this computation can be performed entirely offline.

As we are restricting our attention to the workspace, recall that y_t = C x_t is the projection of the state into the workspace. The construction of the local convex region around y^nom_t is performed using a sequential process, following the work of [22]. At step i, we define d_{t,i} as the vector from y^nom_t to the closest obstacle point and prune away all obstacle geometric structures that lie in the half-space d_{t,i}^⊤ (z − y^nom_t) > d_{t,i} · d_{t,i}, where z ∈ R^{d_w}. This procedure continues until all obstacle geometry has been pruned away. The process defines the local convex region; however, the final approximate collision check should represent the notion that a particle will collide with an obstacle only if it is both close and its motion is towards the obstacle. We thus alter these half-spaces, as illustrated in Fig. 2, by projecting them to be perpendicular to the direction of travel via
\[
a_{t,i} = d_{t,i} - \left(\frac{d_{t,i} \cdot \dot{y}_t}{\dot{y}_t \cdot \dot{y}_t}\right) \dot{y}_t, \tag{7}
\]
where ẏ_t denotes the reference waypoint's velocity projected into the workspace. The resulting collision constraint for an MC particle is then defined by a_{t,i}^⊤ δy_t > b_{t,i}, where b_{t,i} = a_{t,i} · a_{t,i}. That is, to approximate the CP of a given motion plan, we generate a set of N particle trajectories and compare their deviations δy_t from the nominal trajectory at each time step to the local collision half-spaces. Every half-space check (a_{t,i}, b_{t,i}) along a particle trajectory is independent of every other, requiring only a final OR reduction to determine the validity of a particle; thus this process is amenable to parallelization. The final CP approximation is obtained by counting the number of invalid particle trajectories and dividing by N.
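A serial sketch of this counting step, assuming the half-spaces (a_{t,i}, b_{t,i}) and the particle workspace deviations δy_t have already been computed (in the paper's implementation this is a GPU kernel; the array layout below is our own):

import numpy as np

def hsmc_cp(dy, halfspaces):
    """Half-Space Monte Carlo CP approximation (sketch).
    dy: array (T+1, N, d_w) of workspace deviations delta_y_t for N particles.
    halfspaces: list indexed by t of (A_t, b_t), A_t of shape (m_t, d_w),
                b_t of shape (m_t,), built from eq. (7).
    A particle is invalid if a_{t,i} . delta_y_t > b_{t,i} for any t, i."""
    T1, N, _ = dy.shape
    invalid = np.zeros(N, dtype=bool)
    for t in range(T1):
        A_t, b_t = halfspaces[t]
        if A_t.shape[0] == 0:
            continue
        # (m_t, d_w) @ (d_w, N) -> (m_t, N); any violated constraint kills the particle
        invalid |= np.any(A_t @ dy[t].T > b_t[:, None], axis=0)
    return invalid.mean()

Each (t, i, particle) check is independent of every other, so the loop maps directly onto a parallel OR reduction.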

The results in Fig. 3 show that, on two representative planning problems targeting CPs of 1% and 5%, HSMC outperforms the conditional multiplicative approximation in terms of approximation accuracy. While the HSMC CP approximation improves as the number of waypoints increases, the conditional multiplicative approximation's conservative nature causes its CP approximation to degrade, in agreement with the results in [2].

Fig. 2: The vector d (blue) is the vector of minimum distance connecting the waypoint to the obstacle set, and ẏ (green) is the velocity projected onto the workspace. (2a) Illustration of half-space formation by distance only. (2b) Illustration of the alteration of the half-space that accounts for the notion that a particle is in collision with an obstacle only if it is both close and its motion is towards the obstacle; the vector a (red) is computed as a = d − ((d·ẏ)/(ẏ·ẏ)) ẏ.

Fig. 3: Comparison of the true CP (Monte Carlo, red), conditional multiplicative CP approximation (green), and HSMC CP approximation (blue) versus number of waypoints (i.e., as the timestep ∆t decreases) over several problem instances. Each HSMC approximation and MC CP was computed with 10,000 samples. Note that the conditional multiplicative approximation tends to become increasingly conservative as the number of waypoints increases, whereas HSMC generally remains near the true CP.

IV. PARALLEL UNCERTAINTY-AWARE MULTIOBJECTIVE PLANNING

The Parallel Uncertainty-aware Multiobjective Planning (PUMP) algorithm explores the state space via a multiobjective search, building a Pareto set of motion plans from the initial state to the goal region, where Pareto dominance is determined by cost and approximated CP. This Pareto set comprises not only several motion plans, but different solution homotopies, which are searched to find the minimum cost plan certified to satisfy a CP constraint. The algorithm is briefly outlined in Alg. 1 and formally stated in Algs. 2-5.

Algorithm 1 PUMP: Outline
1  Massively parallel sampling-based graph building
2  Multiobjective search to find P_Xgoal := Pareto optimal plans in terms of (cost, approximate CP)
3  Bisection search over P_Xgoal to find the best motion plan, p*, with a verified CP below the constraint
   If a feasible plan is found, locally optimize (smooth) subject to CP constraint satisfaction and return p*
   Otherwise, report failure

To aid in formal description, we now present some notation and algorithmic primitives. In line with the sampling-based motion planning literature, we construct motion plans as a sequence of sampled nodes (samples) in R^d. Since we are interested in tracking collision probability, which is history dependent, we store plans as structs with attributes (head, path, cost, cp). The fields represent the terminal trajectory node from which we may extend other plans, the list of previous nodes, the cost, and the approximate CP, respectively. Let SampleFree(n) be a function that returns a set of n ∈ N points sampled from Xfree. In the following definitions let u, v ∈ Xfree be samples, and let V ⊂ Xfree be a set of samples. Define Cost(u, v) as a function that returns the cost of the optimal trajectory from u to v. Let Near(V, u, r) be a function that returns the set of samples {v ∈ V : Cost(u, v) < r}. Let Collision(u, v) denote whether the optimal trajectory from u to v intersects Xobs. Given a set of motion plans P and Popen ⊂ P, RemoveDominated(P, Popen) denotes a function that removes from Popen and P all p ∈ Popen that are dominated by a motion plan in {pdom ∈ P : pdom.head = p.head}; explicitly, we say that pdom dominates p if (p.cost > pdom.cost) ∧ (p.cp ≥ pdom.cp). Let CP(v, p) denote a function that returns an approximate CP for the motion plan p concatenated with the optimal connection from p.head to v. We define MC(p) as a function that returns an asymptotically exact CP estimate of motion plan p through Monte Carlo sampling [2]. Since PUMP conducts its exploration within the context of a discrete graph representation of Xfree, the best costs and cps are limited by the resolution of the graph. To address variance in sample placement, we include a post-processing "smoothing" step which locally optimizes the final plan subject to CP constraint recertification. We denote this process by Smooth(p, α); in this work we use the strategy in [23], which performs a bisection search to create a new smoothed plan as a combination of the original plan and the optimal unconstrained trajectory from xinit to p.head.
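The dominance rule above is straightforward to implement; the sketch below assumes plans are records with head, cost, and cp fields and that plans_by_head maps each sample to the list of plans in P whose head is that sample (this grouping is our own bookkeeping, not notation from the paper).

def remove_dominated(plans_by_head, open_plans):
    """Prune open plans dominated at their head node: p_dom dominates p
    iff p.cost > p_dom.cost and p.cp >= p_dom.cp (as defined in the text)."""
    still_open = []
    for p in open_plans:
        peers = plans_by_head[p.head]
        if any(p.cost > q.cost and p.cp >= q.cp for q in peers if q is not p):
            peers.remove(p)       # drop the dominated plan from P
        else:
            still_open.append(p)  # keep it for future expansion
    return plans_by_head, still_open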

We are now ready to fully detail the PUMP algorithm (Algs. 2-5). The main body is provided in Alg. 2; for brevity we assume the algorithm inputs and tuning parameters are available globally to the subroutines Algs. 3-5. PUMP can be

Algorithm 2 PUMP
Inputs: planning problem (Xfree, xinit, Xgoal), number of nodes n ∈ N, connection radius rn ∈ R≥0, CP threshold α ∈ (0, 1)
Params: CP approx. factor η > 1, group cost factor λ ∈ (0, 1]
1  (V, N) = BuildGraph()                        // Alg. 3
2  αmin = (1/η)·α;  αmax = η·α
3  P_Xgoal ← Explore(αmin, αmax, V, N)          // Alg. 4
4  p* ← PlanSelection(α, P_Xgoal)               // Alg. 5
5  return p*

Algorithm 3 BuildGraph
1  V ← {xinit} ∪ SampleFree(n)
2  for all v ∈ V do                             // massively parallel graph building
3      N(v) ← Near(V \ {v}, v, rn)
4      for all u ∈ N(v) do
5          if Collision(v, u) then N(v) ← N(v) \ {u} end if
6      end for
7  end for
8  return (V, N)

divided into three phases. The first is a graph building phase (Alg. 3), similar to probabilistic roadmap methods [12], that constructs a representation of available motions within the free state space. This phase begins by adding xinit and a set of n samples from Xfree (including at least one from Xgoal) to the set of samples V. Each sample's nearest neighbors are then computed and each neighbor edge is collision checked, storing the collision-free neighbors of v in N(v). Here one should consider the "kinodynamic counterparts" of traditional Euclidean nearest neighbors and straight line edges [21, 24]. Specifically, for the SKP setting of this paper, we refer to nearest neighbors as samples within a connection radius rn, where rn is a tuning parameter, which we set as described in [21]. Depending on the CP approximation strategy used during exploration, this graph building stage can also calculate obstacle half-spaces at each waypoint. Following [12], this phase is embarrassingly parallel.

The second phase, Alg. 4, explores the state space to find the Pareto optimal set of motion plans, where Pareto dominance is determined in terms of cost and cp. This phase takes an upper and lower bound, αmax and αmin, expressed in terms of cp. These bounds are chosen as a multiplicative factor η around the target CP to accommodate any under- or over-approximation of CP; η should be tuned according to the accuracy of the CP approximation, as discussed in detail in Section V-A. Note too that PUMP is left agnostic of CP approximation methodology, and while we use HSMC for the numerical experiments in this work, its application is limited to workspace obstacles; if this is not the case for a given planning problem, other methods can be used [8, 22]. The exploration of the state space proceeds in a parallel fashion by expanding the group, G, of all open (unexpanded) motion plans, Popen, below an increasing cost threshold, to their nearest neighbors. During this process, any motion plans that are dominated or have cp > αmax are discarded. The cost threshold is increased by λrn at each exploration loop, thus determining the size of G through the group cost factor λ ∈ (0, 1]; the choice of this value is discussed in Section V-A. The exploration terminates once a plan reaches Xgoal with cp < αmin or once Popen = ∅, at which point all plans connecting xinit to Xgoal are stored in P_Xgoal. The αmin early termination criterion introduces a tradeoff for λ: choosing a high λ increases algorithm speed through greater parallelism,

Algorithm 4 Explore(αmin, αmax, V, N)
1   Popen ← {(xinit, ∅, 0, 0)}                  // plans ready to be expanded
2   P(xinit) ← Popen                            // plans with head at xinit
3   G ← Popen                                   // plans considered for expansion
4   i = 0
5   while Popen ≠ ∅ ∧ {g ∈ G : (g.head ∈ Xgoal) ∧ (g.cp < αmin)} = ∅ do
6       for all p ∈ G do
7           for all x ∈ N(p.head) do
8               q ← (x, p.path + {p.head}, p.cost + Cost(p.head, x), CP(x, p))
9               if q.cp < αmax then             // CP cutoff
10                  P(x) ← P(x) ∪ {q}
11                  Popen ← Popen ∪ {q}
12              end if
13          end for
14      end for
15      (P, Popen) ← RemoveDominated(P, Popen)
16      Popen ← Popen \ G
17      i ← i + 1
18      G ← {p ∈ Popen : p.cost ≤ i λ rn}
19  end while
20  return P_Xgoal ← {p ∈ P(v) : v ∈ Xgoal}

Algorithm 5 PlanSelection(α, P_Xgoal)
1   P ← Sort(P_Xgoal)                           // sort P_Xgoal in ascending CP order
2   l = 1, u = |P_Xgoal|, m = ⌈(l + u)/2⌉
3   while l ≠ u do
4       if MC(P_m) > α then u = m − 1 else l = m end if
5       m = ⌈(l + u)/2⌉
6   end while
7   if MC(P_m) > α then
8       return Failure                          // no feasible plans in P_Xgoal
9   end if
10  return Smooth(P_m, α)

but has the side effect of allowing Popen to spread out in cost, potentially terminating before a high quality partial plan has reached Xgoal. Finally, we note that the discretized expansion groups have an added benefit of not requiring a heap to store a cost ordering on Popen; rather, a bucket array can be implemented, allowing O(1) insertion and removal.
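A sketch of that bucket-array bookkeeping (interface and names are ours): plans are binned by the cost slab of width λ r_n they fall into, so inserting a newly expanded plan and extracting the next expansion group are both constant-time dictionary operations.

import math
from collections import defaultdict

class CostBuckets:
    """Bucket array replacing a heap for P_open (sketch; names are ours).
    Plans are binned by which cost slab [k*width, (k+1)*width) they fall in,
    giving O(1) insertion and O(1) extraction of each expansion group."""

    def __init__(self, lambda_, r_n):
        self.width = lambda_ * r_n
        self.buckets = defaultdict(list)

    def insert(self, plan):
        self.buckets[math.floor(plan.cost / self.width)].append(plan)

    def pop_slab(self, k):
        # plans whose cost lies in [k*width, (k+1)*width)
        return self.buckets.pop(k, [])

Successive iterations of the Alg. 4 loop then pop the next slab rather than re-scanning Popen for plans under the new threshold.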

Computational tractability of PUMP relies on the use of the RemoveDominated function, and thus it is important to discuss its associated assumptions. Pruning the search requires judging the potential of multiple candidate plans that arrive at the same node. We use each plan's cp for this purpose, but this disregards any impact of the shape of the plan's full probability distribution, conditioned on prior obstacle avoidance, at that node. Even an exact CP estimate would be insufficient for this purpose. The only remedy is an utterly exhaustive search; instead, our goal during exploration is to produce a set of high-quality motion plans. We find that in practice, using cps, this set encompasses all important trajectory homotopy classes.¹

The final phase, Alg. 5, performs a bisection search over P_Xgoal to identify the lowest cost plan p* with a CP certifiably below α, as verified through MC with full collision checking [2]. If no plan is found, the algorithm reports failure. Otherwise, Smooth(p*, α) is returned, which reduces cost until any margin in the CP constraint is exhausted.
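Algorithm 5 in executable form, with mc_cp and smooth standing in for the MC(·) estimator and Smooth(·, α) (both assumed callables here): sorting by approximate cp and bisecting for the highest-cp plan that still passes MC certification yields, along the Pareto front, the lowest-cost certified candidate.

def plan_selection(plans, alpha, mc_cp, smooth):
    """Bisection search over the Pareto set (transliteration of Alg. 5;
    mc_cp and smooth are assumed callables, not part of this sketch)."""
    if not plans:
        return None
    plans = sorted(plans, key=lambda p: p.cp)  # ascending approximate CP
    lo, hi = 0, len(plans) - 1
    while lo != hi:
        mid = (lo + hi + 1) // 2               # round up so the interval shrinks
        if mc_cp(plans[mid]) > alpha:
            hi = mid - 1
        else:
            lo = mid
    if mc_cp(plans[lo]) > alpha:
        return None                            # no feasible plan in the Pareto set
    return smooth(plans[lo], alpha)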

¹ We note here an additional potential source of algorithm suboptimality — the fact that we search over a graph of only cost-optimal u–v connections. In some cases, e.g., high initial uncertainty compared to the steady state, it may be best to consider other strategies for local connection, e.g., waiting.

V. NUMERICAL EXPERIMENTS

A. Experimental Setup

The simulations in this section use 6D double integrator dynamics (ẍ = u) to model a quadrotor system in a 3D workspace, and were implemented in CUDA C and run on an NVIDIA GeForce GTX 980 GPU on a Unix system with a 3.0 GHz CPU. Example code may be found at github.com/StanfordASL/PUMP. We implemented state space sampling with the deterministic, low-dispersion Halton sequence, as motivated by the analysis in [25]. Note that as this sampling is deterministic rather than random, variance is not reported. By preallocating GPU memory, and by precomputing and caching offline both nearest neighbors and edge controls (which are independent of the particular obstacle set Xobs), the total run time was significantly reduced. The simulations below use the HSMC CP approximation method detailed in Section III-B with N = 128, noting that the number of particles may be quite low as it is used solely to guide the exploration phase towards promising motion plans rather than to certify solutions.
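For reference, the low-dispersion Halton samples can be generated by the standard radical-inverse construction; the sketch below (prime bases and unit-cube scaling are our own illustrative choices) produces points in [0,1]^d to be mapped to the state-space bounds.

import numpy as np

def radical_inverse(i, base):
    """Van der Corput radical inverse of integer i in the given base."""
    f, r = 1.0, 0.0
    while i > 0:
        f /= base
        r += f * (i % base)
        i //= base
    return r

def halton(n, dims, primes=(2, 3, 5, 7, 11, 13)):
    """First n Halton points in [0,1]^dims (deterministic, low-dispersion)."""
    return np.array([[radical_inverse(i + 1, primes[d]) for d in range(dims)]
                     for i in range(n)])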

Our algorithm introduces two tuning parameters, η and λ. The value of η, the CP approximation factor, represents a tradeoff between considering fewer plans (and thus run time) and the potential for either removal of promising plans or early exploration termination. In practice, however, we have observed only a small influence on run time, and it should thus be set to the limits of accuracy for the CP approximation strategy. Specific to HSMC, we set η = 2 for target CPs above 1% and higher values below 1% to account for the high deviation factor from using a particle representation for rare events (for example, at 0.1% we use η = 10). Lastly, for the group cost factor λ, we have found that λ = 0.5 represents a good balance of parallelism, ease of implementation, and the potential early termination described in Section IV.

B. Comparison to Monte Carlo Motion Planning

We begin with a discussion of Monte Carlo Motion Planning (MCMP) [2], a recent approach to solving the SKP problem, and show that PUMP identifies significantly lower cost solutions in a pair of illustrative examples. MCMP addresses the SKP problem by performing a bisection search over a safety-buffer heuristic correlated with CP, at each step solving a deterministic planning problem with inflated obstacles (higher inflation correlates with lower CP). While [2] argues that the algorithm works well in many cases, the assumption that safety-buffer distance is a good analogue for CP breaks down in some planning problems. We show two such examples in Figs. 4a and 4b and demonstrate that PUMP's CP-guided multiobjective search outperforms MCMP.

Figure 4a displays a planning problem in which the CP of the solution trajectory for the deterministic problem with inflated obstacles increases as the inflation factor increases (via the discontinuous jump shown in Fig. 4c). This is a consequence of a narrow but short gap allowing safer trajectories than a wide but long passage. When iterating on the inflation factor, the MCMP algorithm is unable to locate the solution homotopy class through the narrow gap, as the inflated obstacles block it. Figure 4e shows the cost of solutions for various target CPs as compared to PUMP, which identifies the narrow gap solution for all CPs. For target CPs below 3%, MCMP never identifies the narrow gap and instead obtains a more circuitous motion plan. MCMP's performance is particularly troublesome with a 0.1% target CP, in which case MCMP is unable to find any solution, as the long passage homotopy class is infeasible and it never realizes the narrow gap homotopy class.

The second planning problem, shown in Fig. 4b, admits two solution homotopies: a narrow passage and a large open region. Figure 4d shows that as the inflation factor is varied, there is a large discontinuity in CP when the narrow gap becomes blocked. Thus, if the targeted CP is within this gap, the algorithm will converge to exactly the inflation factor that minimally blocks the gap, alternating between motion plans through the narrow passage with high CPs and overly conservative motion plans through the open region, never identifying the desired aggressive motion plan through the open region. Figure 4f shows that PUMP is able to find this solution through the open region (unless the gap homotopy is valid for a target CP), while MCMP is overly conservative, especially at low target CPs. A homotopy blocking heuristic is suggested in [2] for remedying this issue, but this is difficult to execute for planning in general 3D workspaces.

Fig. 4: MCMP and PUMP comparisons. (4a, 4b) plot MCMP candidate plans at a range of safety buffers. (4c, 4d) plot the CP of the returned solution to the deterministic problem with a given inflation factor. (4a, 4c, 4e) In this case CP increases as the inflation factor increases (via a discontinuous jump when the narrow gap closes at an inflation factor of 0.025), causing MCMP to not find the optimal solution homotopy. PUMP is able to find the best homotopy at all CPs. Note that MCMP fails to find a solution at a CP of 0.1% due to not identifying the narrow, short gap before the wide, long passageway is closed. (4b, 4d, 4f) A gap in the CP causes MCMP to be too conservative through the open region at low CPs, while PUMP successfully identifies this solution.

C. Real-Time Performance and Homotopy Identification

Through the two planning problems in Fig. 1, we demonstrate PUMP's ability to identify multiple solution homotopy classes and select low-cost trajectories subject to a certified CP constraint, all at a tempo compatible with real-time application (~100 ms). The collision probability constraints considered herein are single-digit percentages and below, as has been studied for realistic stochastic planning problems in the literature [9]. The obstacles in our simulations are

represented by unions of axis-aligned bounding boxes, as commonly used for broad-phase collision detection [1]. This can provide increasingly accurate representations of the true obstacle set as more boxes are used (e.g., as in Fig. 1c, or with octree-based environment representations), and accelerates the computation of the local convex regions. If needed, more detailed collision checking can be performed during the final smoothing and MC certification phase.
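With AABB obstacles, the closest obstacle point and the pruning test from Section III-B are inexpensive; a sketch of the sequential convexification under this representation (the waypoint is assumed collision-free, and we explicitly drop the defining box each round for numerical robustness; helper names are ours):

import numpy as np

def closest_point_on_aabb(y, lo, hi):
    """Closest point to y on the axis-aligned box [lo, hi]."""
    return np.clip(y, lo, hi)

def box_beyond_halfspace(y, d, lo, hi):
    """True if every point z of the box satisfies d.(z - y) >= d.d."""
    z_min = np.where(d > 0, lo, hi)   # corner minimizing d.z over the box
    return d @ (z_min - y) >= d @ d

def local_convex_region(y_nom, boxes):
    """Sequential convexification around a nominal workspace waypoint (sketch).
    boxes: list of (lo, hi) AABB corner pairs; y_nom assumed outside all boxes.
    Returns the separating vectors d_{t,i}, to be projected as in eq. (7)."""
    remaining = list(boxes)
    ds = []
    while remaining:
        pts = [closest_point_on_aabb(y_nom, lo, hi) for lo, hi in remaining]
        i = int(np.argmin([np.linalg.norm(p - y_nom) for p in pts]))
        d = pts[i] - y_nom
        ds.append(d)
        # prune the defining box and every box entirely beyond the half-space
        remaining = [b for j, b in enumerate(remaining)
                     if j != i and not box_beyond_halfspace(y_nom, d, *b)]
    return ds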

Table I details the resulting computation times, solution CPs, and number of partial plans considered, over a range of target CPs and graph sample counts. Table I also presents results for a simple repeated RRT method for generating candidate plans. For this comparison, we compute 1000 kinodynamic RRT solution trajectories (as performed in [8, 9]) and select the trajectory with lowest cost subject to MC certification of chance constraint satisfaction. In all cases PUMP run times are on the order of 100 ms, while considering on the order of one hundred thousand partial trajectories in the multiobjective search. Note that these solution times are two orders of magnitude faster than those reported for MCMP [2], owing to the parallel algorithm design and implementation on GPU hardware. On average, the graph building accounted for 17% of the total run time (including computing half-spaces), the exploration for 79% (of which HSMC accounted for 55% of the overall run time), and only 4% for the final trajectory selection, smoothing, and certification phase. The time for the computation and caching of neighbors and edges is not included, as these computations are performed offline, reducing to constant-time memory access operations at runtime. The initial graph building is very dependent on the number and location of obstacles (as this removes invalid samples and edges, and computes the half-spaces for each waypoint); however, even in the indoor environment it is dominated in run time by the exploration phase. We further note that for particularly cluttered environments, workspace partitioning schemes (e.g., kd-trees) may be employed to limit scaling. The exploration phase also has some dependence on obstacles through the complexity of the local convex regions; however, limiting the fidelity of the half-space approximation (the number of half-spaces) may keep this scaling in check. Comparing results in Table I shows that, more than the number of obstacles, the key driver of computation time is the number of partial plans, which is dictated more by obstacle locations (i.e., the difficulty in finding a solution plan) than by obstacle count. In fact, comparing only by number of partial plans shows that despite Fig. 1b having more than five times the number of obstacles of Fig. 1a, the computation time increases by less than a factor of two.

Figure 1a shows our first planning problem, in which PUMP's multiobjective search phase has identified all important solution homotopies. The small variation in cost with graph resolution (i.e., sample count) demonstrates that PUMP generally identifies not only the correct homotopy, but as aggressive a solution as possible within it. Repeated RRT did not encounter a candidate plan with CP below 0.1%, and achieved significantly worse cost than PUMP's principled search for target CPs of 1% and 5%.

The second planning problem is structured as an indoor flight scenario, once again with multiple solution homotopy classes. Figure 1b shows the workspace setup and the Pareto optimal set of trajectories, representing several homotopies, identified by PUMP's multiobjective search. Targeting CPs

TABLE I: PUMP vs. repeated RRT (1000 trials) double integrator experimental results. Times in ms.

Figure 1a:
Target CP (%)        0.1              1                5
Sample Count      4k   6k   8k  |  4k   6k   8k  |  4k   6k   8k
Solution CP (%)   0.10 0.10 0.10|  1.00 1.00 1.00|  5.00 5.00 5.00
Cost              2.11 2.11 2.11|  2.02 2.02 2.08|  1.97 1.97 1.88
Time (ms)         48   122  216 |  82   183  323 |  79   223  261
Build Graph Time  13   28   50  |  13   28   46  |  13   28   49
Explore Time      33   92   163 |  60   148  270 |  64   193  206
Selection Time    2    2    3   |  9    7    7   |  2    2    6
Partial Plans     81k  288k 554k|  130k 367k 700k|  157k 570k 600k
rRRT CP (%)       no satisfactory|  0.98          |  3.52
rRRT Cost         plan found     |  2.89          |  2.22

Figure 1b-1c:
Target CP (%)        2                5
Sample Count      2k   3k   4k  |  2k   3k   4k
Solution CP (%)   1.95 1.95 1.90|  4.98 4.93 4.93
Cost              3.41 2.41 1.85|  1.48 1.31 1.31
Time (ms)         97   253  282 |  99   224  384
Build Graph Time  10   25   44  |  10   25   44
Explore Time      68   205  227 |  86   189  330
Selection Time    19   23   11  |  3    10   10
Partial Plans     120k 284k 456k|  152k 331k 641k
rRRT CP (%)       no satisfactory|  no satisfactory
rRRT Cost         plan found     |  plan found

of 2% and 5%, the results shown in Table I demonstrate that even in this more complex workspace, PUMP finds solutions tightly bound on the CP constraint with run times on the order of 100 ms. Unlike the previous example, however, a trend of increasing solution quality as graph resolution increases is observed, possibly reflecting the more complex obstacle environment. Figure 1c illustrates the cost tradeoff for a 2% (blue) and 5% (red) target CP, in which the 2% motion plan must take a more conservative and costly route (routes visualized through time at https://www.youtube.com/watch?v=ac4A4-ctqrM). Repeated RRT did not identify any candidate plans below 5% CP in this tight hallway environment; the lowest trajectory CP found was 9.57%, with a cost of 1.86.

VI. CONCLUSIONS

In this paper we have introduced the Parallel Uncertainty-aware Multiobjective Planning algorithm for the stochastic kinodynamic motion planning problem. PUMP represents, to the best of our knowledge, an algorithmic first in that it directly considers both cost and collision probability at equal priority when planning through the free state space. Although multiobjective search is computationally intensive, we have demonstrated that algorithm design and GPU implementation allow this principled approach to achieve run times compatible with a ~10 Hz planning loop. Included in the PUMP run time is a Monte Carlo-based certification step to guarantee the collision probability constraint is met, without sacrificing cost metric performance due to conservatism beyond the safety design constraints. The Half-Space Monte Carlo method used in our PUMP experiments may also be of general interest as an empirically accurate and fast way to approximate motion plan collision probability.

This work leaves many avenues for further investigation. First, we plan to implement this algorithm for more varied dynamical systems and uncertainty models to verify its generality. Second, we plan to extend this algorithm to environmental uncertainty, particularly in dynamically evolving environments. Third, we plan to extend the presented multiobjective methodology to other constraints, such as arrival time windows and resource constraints. Finally, we are in the process of implementing and verifying this algorithm and problem setup onboard a GPU-equipped quadrotor.

REFERENCES
[1] S. M. LaValle, Planning Algorithms. Cambridge University Press, 2006.
[2] L. Janson, E. Schmerling, and M. Pavone, "Monte Carlo motion planning for robot trajectory optimization under uncertainty," in Int. Symp. on Robotics Research, 2015.
[3] L. P. Kaelbling, M. L. Littman, and A. R. Cassandra, "Planning and acting in partially observable stochastic domains," Artificial Intelligence, 1998.
[4] H. Kurniawati, D. Hsu, and W. S. Lee, "SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces," in Robotics: Science and Systems, 2008.
[5] T. Lee and Y. J. Kim, "Massively parallel motion planning algorithms under uncertainty using POMDP," Int. Journal of Robotics Research, 2016.
[6] A. Agha-mohammadi, S. Chakravorty, and N. M. Amato, "FIRM: Sampling-based feedback motion planning under motion uncertainty and imperfect measurements," Int. Journal of Robotics Research, 2014.
[7] B. D. Luders, M. Kothari, and J. P. How, "Chance constrained RRT for probabilistic robustness to environmental uncertainty," in AIAA Conf. on Guidance, Navigation and Control, 2010.
[8] J. van den Berg, P. Abbeel, and K. Goldberg, "LQG-MP: Optimized path planning for robots with motion uncertainty and imperfect state information," Int. Journal of Robotics Research, 2011.
[9] W. Sun, S. Patil, and R. Alterovitz, "High-frequency replanning under uncertainty using parallel sampling-based motion planning," IEEE Transactions on Robotics, 2015.
[10] L. Blackmore, M. Ono, A. Bektassov, and B. C. Williams, "A probabilistic particle-control approximation of chance-constrained stochastic predictive control," IEEE Transactions on Robotics, 2010.
[11] P. Sanders and L. Mandow, "Parallel label-setting multi-objective shortest path search," in IEEE Parallel and Distributed Processing Symposium, 2013.
[12] N. M. Amato and L. K. Dale, "Probabilistic roadmap methods are embarrassingly parallel," in Proc. IEEE Conf. on Robotics and Automation, 1999.
[13] J. Pan, C. Lauterbach, and D. Manocha, "g-Planner: Real-time motion planning and global navigation using GPUs," in Proc. AAAI Conf. on Artificial Intelligence, 2010.
[14] J. Pan, C. Lauterbach, and D. Manocha, "Efficient nearest-neighbor computation for GPU-based motion planning," in IEEE/RSJ Int. Conf. on Intelligent Robots & Systems, 2010.
[15] J. Bialkowski, S. Karaman, and E. Frazzoli, "Massively parallelizing the RRT and the RRT*," in IEEE/RSJ Int. Conf. on Intelligent Robots & Systems, 2011.
[16] D. Devaurs, T. Simeon, and J. Cortes, "Parallelizing RRT on large-scale distributed-memory architectures," IEEE Transactions on Robotics, 2013.
[17] C. Park, J. Pan, and D. Manocha, "Poisson-RRT," in Proc. IEEE Conf. on Robotics and Automation, 2014.
[18] T. Lee, M. Leoky, and N. H. McClamroch, "Geometric tracking control of a quadrotor UAV on SE(3)," in Proc. IEEE Conf. on Decision and Control, 2010.
[19] A. J. Shaiju and I. R. Petersen, "Formulas for discrete time LQR, LQG, LEQG and minimax LQG optimal control problems," in IFAC World Congress, 2008.
[20] D. J. Webb and J. van den Berg, "Kinodynamic RRT*: Optimal motion planning for systems with linear differential constraints," in Proc. IEEE Conf. on Robotics and Automation, 2013.
[21] E. Schmerling, L. Janson, and M. Pavone, "Optimal sampling-based motion planning under differential constraints: the drift case with linear affine dynamics," in Proc. IEEE Conf. on Decision and Control, 2015.
[22] S. Patil, J. van den Berg, and R. Alterovitz, "Estimating probability of collision for safe motion planning under Gaussian motion and sensing uncertainty," in Proc. IEEE Conf. on Robotics and Automation, 2012.
[23] J. A. Starek, E. Schmerling, G. D. Maher, B. W. Barbee, and M. Pavone, "Real-time, propellant-optimized spacecraft motion planning under Clohessy-Wiltshire-Hill dynamics," in IEEE Aerospace Conference, 2016.
[24] E. Schmerling, L. Janson, and M. Pavone, "Optimal sampling-based motion planning under differential constraints: the driftless case," in Proc. IEEE Conf. on Robotics and Automation, 2015; extended version available at http://arxiv.org/abs/1403.2483/.
[25] L. Janson, B. Ichter, and M. Pavone, "Deterministic sampling-based motion planning: Optimality, complexity, and performance," in Int. Symp. on Robotics Research, 2015.