Pursuit-Evasion Problems with Robots
Rafael Murrieta Cid
Centro de Investigacion en Matematicas (CIMAT)
April 2016
Rafael Murrieta Cid (CIMAT) Pursuit-Evasion Problems April 2016 1 / 47
Outline
1 Introduction
2 Time-Optimal Motion Strategies for Capturing an Omnidirectional Evader using a Differential Drive Robot
3 Maintaining Strong Mutual Visibility of an Evader in an Environment with Obstacles
4 Conclusions and Future Work
Previous work
Target capturing in an environment without obstacles
R. Isaacs, Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization, John Wiley and Sons, Inc., 1965.
Y. C. Ho et al., Differential Games and Optimal Pursuit-Evasion Strategies, IEEE Transactions on Automatic Control, 1965.
Figure: Target capturing. Panels (a), (b), and (c) each show a pursuer and an evader.
Previous work
Target tracking in an environment with obstacles
S. M. LaValle et al., "Motion strategies for maintaining visibility of a moving target", in Proc. IEEE Int. Conf. on Robotics and Automation, 1997.
H. H. Gonzalez-Banos et al., "Motion Strategies for Maintaining Visibility of a Moving Target", in Proc. IEEE Int. Conf. on Robotics and Automation, 2002.
R. Murrieta-Cid et al., "Surveillance Strategies for a Pursuer with Finite Sensor Range", International Journal of Robotics Research, vol. 26, no. 3, pp. 233-253, March 2007.
S. Bhattacharya and S. Hutchinson, "On the Existence of Nash Equilibrium for a Two Player Pursuit-Evasion Game with Visibility Constraints", The International Journal of Robotics Research, December 2009.
Trajectory: known or unknown
Figure: Target tracking. Labels: visibility region, evader, pursuer.
Previous work
Target finding in an environment with obstacles
V. Isler et al., "Randomized pursuit-evasion in a polygonal environment", IEEE Transactions on Robotics, vol. 21, no. 5, pp. 864-875, 2005.
R. Vidal et al., "Probabilistic pursuit-evasion games: Theory, implementation, and experimental evaluation", IEEE Transactions on Robotics and Automation, vol. 18, no. 5, pp. 662-669, 2002.
Figure: Target finding. Panels (a) to (f).
Target finding in an environment with obstacles
L. Guibas, J.-C. Latombe, S. LaValle, D. Lin and R. Motwani, "Visibility-based pursuit-evasion in a polygonal environment", International Journal of Computational Geometry and Applications, vol. 9, no. 4/5, pp. 471-494, 1999.
Time-Optimal Motion Strategies for Capturing an Omnidirectional Evader using a Differential Drive Robot
Problem formulation
A Differential Drive Robot (DDR) and an omnidirectional evader move on a plane without obstacles.
The game is over when the distance between the DDR and the evader is smaller than a critical value $l$.
Both players have maximum bounded speeds $V_p^{\max}$ and $V_e^{\max}$, respectively. The DDR is faster than the evader, $V_p^{\max} > V_e^{\max}$.
The DDR wants to minimize the capture time $t_f$ while the evader wants to maximize it.
We want to know the time-optimal motion strategies of the players that are in Nash Equilibrium.
The Homicidal Chauffeur Problem
A driver wants to run over a pedestrian in a parking lot without obstacles.
The pursuer is a vehicle with a minimal turning radius (car-like).
The question to be solved is: under what circumstances, and with what strategy, can the driver of the car guarantee that he can always catch the pedestrian, or the pedestrian guarantee that he can indefinitely elude the car?
Figure: Control domains in the ($\nu$, $\omega$) plane. (a) DDR; (b) Car-like.
Model
Reduced space
The problem can be stated in a coordinate system that is fixed to the body of the DDR. The state of the system is expressed as $\mathbf{x}(t) = (x(t), y(t)) \in \mathbb{R}^2$.
Figure: (a) Realistic space, showing the pursuer P at $(x_P, y_P)$ with heading $\theta$ and the evader E at $(x_E, y_E)$ with heading $\psi$; (b) Reduced space, showing the pursuer P at the origin and the evader E at $(x, y)$, with angle $\upsilon$.
Model
The evolution of the system in the DDR-fixed coordinate system is described by the following equations of motion:
$$\begin{aligned}
\dot{x}(t) &= \left(\frac{u_2(t) - u_1(t)}{2b}\right) y(t) + v_1(t) \sin v_2(t) \\
\dot{y}(t) &= -\left(\frac{u_2(t) - u_1(t)}{2b}\right) x(t) - \left(\frac{u_1(t) + u_2(t)}{2}\right) + v_1(t) \cos v_2(t)
\end{aligned} \tag{1}$$
This set of equations can be expressed in the form $\dot{\mathbf{x}} = f(t, \mathbf{x}(t), u(t), v(t))$, where $u(t) = (u_1(t), u_2(t)) \in U = [-V_p^{\max}, V_p^{\max}] \times [-V_p^{\max}, V_p^{\max}]$ and $v(t) = (v_1(t), v_2(t)) \in V = [0, V_e^{\max}] \times [0, 2\pi)$.
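The equations of motion (1) can be evaluated numerically. The following is a minimal sketch, not the authors' implementation: the function names `reduced_dynamics` and `euler_step`, the forward-Euler integrator, and the variable names `turn_rate` and `forward_speed` are illustrative assumptions.

```python
import math

def reduced_dynamics(state, u, v, b):
    """Right-hand side of Eq. (1). state = (x, y), u = (u1, u2), v = (v1, v2)."""
    x, y = state
    u1, u2 = u
    v1, v2 = v
    turn_rate = (u2 - u1) / (2.0 * b)    # the (u2 - u1) / (2b) factor in Eq. (1)
    forward_speed = (u1 + u2) / 2.0      # the (u1 + u2) / 2 term in Eq. (1)
    dx = turn_rate * y + v1 * math.sin(v2)
    dy = -turn_rate * x - forward_speed + v1 * math.cos(v2)
    return dx, dy

def euler_step(state, u, v, b, dt):
    """One forward-Euler step (hypothetical helper; any ODE integrator would do)."""
    dx, dy = reduced_dynamics(state, u, v, b)
    return state[0] + dt * dx, state[1] + dt * dy

# Example: both DDR controls at 1, evader moving along +y at speed 0.5.
print(reduced_dynamics((0.0, 2.0), (1.0, 1.0), (0.5, 0.0), b=1.0))
```

With equal controls the rotational term vanishes, so in the reduced space the evader approaches along the y axis at the difference of the two speeds.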
Preliminaries
Payoff
A standard representation [Isaacs65, Basar95] of the payoff is
$$J(\mathbf{x}(t_s), u, v) = \underbrace{\int_{t_s}^{t_f} L(\mathbf{x}(t), u(t), v(t))\, dt}_{\text{running cost}} + \underbrace{G(t_f, \mathbf{x}(t_f))}_{\text{terminal cost}}$$
For minimum-time problems [Isaacs65, Basar95], as in this game, $L(\mathbf{x}(t), u(t), v(t)) = 1$ and $G(t_f, \mathbf{x}(t_f)) = 0$. Therefore, in our game, the payoff is
$$J(\mathbf{x}(t_s), u, v) = \int_{t_s}^{t_f(\mathbf{x}(t_s), u, v)} dt = t_f(\mathbf{x}(t_s), u, v) - t_s \tag{2}$$
Note that $t_f(\mathbf{x}(t_s), u, v)$ depends on the sequence of controls $u$ and $v$ applied to reach the point $\mathbf{x}(t_f)$ from the point $\mathbf{x}(t_s)$.
Preliminaries
Value of the game
For a given state of the system $\mathbf{x}(t_s)$, $V(\mathbf{x}(t_s))$ represents the outcome if the players implement their optimal strategies starting at the point $\mathbf{x}(t_s)$, and it is called the value of the game or the value function at $\mathbf{x}(t_s)$ [Isaacs65, Basar95]:
$$V(\mathbf{x}(t_s)) = \min_{u(t) \in U} \max_{v(t) \in V} J(\mathbf{x}(t_s), u, v) \tag{3}$$
where $U$ and $V$ are the sets of valid values for the controls at all times $t$. $V(\mathbf{x}(t))$ is defined over the entire state space.
Preliminaries
Open and closed-loop strategies
Let $\gamma_p(\mathbf{x}(t))$ and $\gamma_e(\mathbf{x}(t))$ denote the closed-loop strategies of the DDR and the evader, respectively, so that $u(t) = \gamma_p(\mathbf{x}(t))$ and $v(t) = \gamma_e(\mathbf{x}(t))$.
A strategy pair $(\gamma_p^*(\mathbf{x}(t)), \gamma_e^*(\mathbf{x}(t)))$ is in closed-loop (saddle-point) equilibrium [Basar95] if
$$J(\gamma_p^*(\mathbf{x}(t)), \gamma_e(\mathbf{x}(t))) \le J(\gamma_p^*(\mathbf{x}(t)), \gamma_e^*(\mathbf{x}(t))) \le J(\gamma_p(\mathbf{x}(t)), \gamma_e^*(\mathbf{x}(t))) \quad \forall\, \gamma_p(\mathbf{x}(t)), \gamma_e(\mathbf{x}(t)) \tag{4}$$
where $J$ is the payoff of the game in terms of the strategies. An analogous relation exists for open-loop strategies.
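The pair of inequalities in (4) can be illustrated on a finite game, where each player picks one of a few strategies and the payoff is a matrix. The sketch below is only an analogy under that assumption (the pursuit-evasion game itself has continuous strategies); the function name `find_saddle_points` and the example matrix are made up for illustration.

```python
def find_saddle_points(J):
    """Return all (i, j) such that J[i][j] is a saddle point.

    J[i][j] is the payoff when the minimizer (pursuer) plays row i
    and the maximizer (evader) plays column j.
    """
    points = []
    for i, row in enumerate(J):
        for j, value in enumerate(row):
            # Left inequality of (4): the evader cannot gain by deviating.
            evader_ok = all(J[i][k] <= value for k in range(len(row)))
            # Right inequality of (4): the pursuer cannot gain by deviating.
            pursuer_ok = all(J[k][j] >= value for k in range(len(J)))
            if evader_ok and pursuer_ok:
                points.append((i, j))
    return points

# Hypothetical capture times for three pursuer and three evader strategies.
J = [[3, 5, 4],
     [2, 7, 1],
     [6, 8, 9]]
print(find_saddle_points(J))
```

In this example the entry 5 is the largest in its row and the smallest in its column, so neither player benefits from a unilateral change.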
Necessary Conditions for Saddle-Point Equilibrium Strategies
Theorem (Pontryagin’s Maximum Principle - PMP)
Suppose that the pair $\{\gamma_p^*, \gamma_e^*\}$ provides a saddle-point solution in closed-loop strategies, with $\mathbf{x}^*(t)$ denoting the corresponding state trajectory. Furthermore, let its open-loop representation $\{u^*(t) = \gamma_p^*(\mathbf{x}^*(t)),\ v^*(t) = \gamma_e^*(\mathbf{x}^*(t))\}$ also provide a saddle-point solution (in open-loop policies). Then there exists a costate function $p(\cdot) : [0, t_f] \to \mathbb{R}^n$ such that the following relations are satisfied:
$$\dot{\mathbf{x}}^*(t) = f(\mathbf{x}^*(t), u^*(t), v^*(t)), \quad \mathbf{x}^*(0) = \mathbf{x}(t_s) \tag{5}$$
$$H(p(t), \mathbf{x}^*(t), u^*(t), v(t)) \le H(p(t), \mathbf{x}^*(t), u^*(t), v^*(t)) \le H(p(t), \mathbf{x}^*(t), u(t), v^*(t)) \tag{6}$$
$$p(t) = \nabla V(\mathbf{x}(t)) \tag{7}$$
$$\dot{p}^T(t) = -\frac{\partial}{\partial \mathbf{x}} H(p(t), \mathbf{x}^*(t), u^*(t), v^*(t)) \quad \text{(Adjoint Equation)} \tag{8}$$
$$p^T(t_f) = \frac{\partial}{\partial \mathbf{x}} G(t_f, \mathbf{x}^*(t_f)) \quad \text{along } \zeta(\mathbf{x}^*(t)) = 0 \tag{9}$$
where
$$H(p(t), \mathbf{x}(t), u(t), v(t)) = p^T(t) \cdot f(\mathbf{x}(t), u(t), v(t)) + L(\mathbf{x}(t), u(t), v(t)) \quad \text{(Hamiltonian)} \tag{10}$$
and $T$ denotes the transpose operator.
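As a worked step, the Hamiltonian (10) can be expanded for this game by substituting the equations of motion (1), the minimum-time running cost $L = 1$, and $p = \nabla V = (V_x, V_y)$ from (7). The expansion below is a sketch derived from those formulas, not a reproduction of a slide; grouping the terms by $u_1$, $u_2$, and $v_1$ is a presentation choice.

```latex
\begin{aligned}
H &= V_x\left[\frac{u_2 - u_1}{2b}\,y + v_1 \sin v_2\right]
   + V_y\left[-\frac{u_2 - u_1}{2b}\,x - \frac{u_1 + u_2}{2} + v_1 \cos v_2\right] + 1 \\
  &= \frac{u_1}{2}\left(-\frac{y V_x}{b} + \frac{x V_y}{b} - V_y\right)
   + \frac{u_2}{2}\left(\frac{y V_x}{b} - \frac{x V_y}{b} - V_y\right)
   + v_1\left(V_x \sin v_2 + V_y \cos v_2\right) + 1
\end{aligned}
```

Because $H$ is linear in $u_1$ and $u_2$, the minimizing player takes each control at an extreme of its interval, with sign opposite to the corresponding bracketed coefficient. The maximizing player takes $v_1$ at its upper bound and aligns $(\sin v_2, \cos v_2)$ with $(V_x, V_y)$.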
Time-Optimal Motion Primitives
Optimal controls
Lemma
The time-optimal controls for the DDR that satisfy Isaacs' equation in the reduced space are given by
$$\begin{aligned}
u_1^* &= -\operatorname{sgn}\left(-\frac{y V_x}{b} + \frac{x V_y}{b} - V_y\right) V_p^{\max} \\
u_2^* &= -\operatorname{sgn}\left(\frac{y V_x}{b} - \frac{x V_y}{b} - V_y\right) V_p^{\max}
\end{aligned} \tag{11}$$
where $V_x$ and $V_y$ are the components of $\nabla V$. Both controls are always saturated. The controls of the evader in the reduced space are given by
$$v_1^* = V_e^{\max}, \quad \sin v_2^* = \frac{V_x}{\rho}, \quad \cos v_2^* = \frac{V_y}{\rho} \tag{12}$$
where $\rho = \sqrt{V_x^2 + V_y^2}$. The evader will move at maximal speed.
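Equations (11) and (12) map a state and a value-function gradient to controls. The following is a minimal sketch of that mapping, assuming the gradient $(V_x, V_y)$ is supplied by the caller (computing it is not shown); the function names are hypothetical, and the singular case where a sgn argument is exactly zero is not treated.

```python
import math

def sgn(z):
    """Sign function returning -1, 0, or 1."""
    return (z > 0) - (z < 0)

def ddr_controls(x, y, Vx, Vy, b, vp_max):
    """Eq. (11): saturated DDR controls (u1*, u2*)."""
    s1 = -y * Vx / b + x * Vy / b - Vy
    s2 = y * Vx / b - x * Vy / b - Vy
    return -sgn(s1) * vp_max, -sgn(s2) * vp_max

def evader_controls(Vx, Vy, ve_max):
    """Eq. (12): evader speed v1* and heading v2* in [0, 2*pi)."""
    if math.hypot(Vx, Vy) == 0.0:
        raise ValueError("gradient is zero; Eq. (12) is undefined")
    # atan2(Vx, Vy) is the angle whose sine is Vx/rho and cosine is Vy/rho.
    v2 = math.atan2(Vx, Vy) % (2.0 * math.pi)
    return ve_max, v2

# Example: evader straight ahead on the y axis, value increasing with y.
print(ddr_controls(0.0, 2.0, 0.0, 1.0, b=1.0, vp_max=1.0))
print(evader_controls(0.0, 1.0, ve_max=0.5))
```

In this example both DDR controls take the value $+V_p^{\max}$ while the evader heads along $+y$, away from the DDR.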
Decision problem
Usable part and its boundary
The portion of the terminal surface where the DDR can guarantee termination regardless of the choice of controls of the evader is called the usable part (UP) [Isaacs65]. From [Isaacs65], we have that the UP is given by
$$UP = \left\{ \mathbf{x}(t) \in \zeta : \min_{u(t) \in U} \max_{v(t) \in V} n \cdot f(\mathbf{x}(t), u(t), v(t)) < 0 \right\} \tag{13}$$
where $U$ and $V$ are the sets of valid values for the controls, and $n$ is the normal vector to $\zeta$ from point $\mathbf{x}(t)$ on $\zeta$ and extending into the playing space.
Figure: Terminal surface in the reduced space (axes x and y, angle s, quadrants I to IV), showing the usable part (UP) and its boundary (BUP).
Decision problem
Theorem
If $V_e^{\max} / V_p^{\max} < l\,|\tan S| / b$, the DDR can capture the evader from any initial configuration in the reduced space. Otherwise the barrier separates the reduced space into two regions:
1 One between the UP and the barrier.
2 Another above the barrier.
The DDR can only force the capture in the configurations between the UP and the barrier, in which case the DDR follows a straight line in the realistic space when it captures the evader.
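The inequality in the theorem is easy to check numerically. The sketch below takes the angle $S$ as an input, since its definition is not given on these slides; the function name is hypothetical. In the usage lines, $S = \arccos(V_e^{\max} / V_p^{\max})$ is an assumption made only for illustration.

```python
import math

def capture_from_anywhere(ve_max, vp_max, l, b, S):
    """True if V_e^max / V_p^max < l * |tan S| / b (condition of the theorem).

    S is the angle appearing in the theorem, in radians; it must be
    supplied by the caller because the slides do not define it.
    """
    return ve_max / vp_max < l * abs(math.tan(S)) / b

# ASSUMPTION (not stated on the slides): S = arccos(ve_max / vp_max).
for ve_max in (0.5, 0.787):
    S = math.acos(ve_max / 1.0)
    print(ve_max, capture_from_anywhere(ve_max, 1.0, l=1.0, b=1.0, S=S))
```

Under that assumed definition of $S$, the two parameter sets used in the simulation slides ($V_e^{\max} = 0.5$ and $V_e^{\max} = 0.787$, with $V_p^{\max} = b = l = 1$) fall on opposite sides of the threshold.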
Partition of the reduced space
Partition of the first quadrant
Figure: Partition of the first quadrant of the reduced space (x from 0 to 2, y from 0 to 3). Labels: UP, BUP, BS, TS, US, DS, Tributary, regions I, II, and III, and $y_c$.
Global optimality
Figure: Graphs. Panels (c) and (d).
Simulations - Optimal Strategies
The parameters were $V_p^{\max} = 1$, $V_e^{\max} = 0.5$, $b = 1$ and $l = 1$. Capture time $t_c = 1.2$ s.
Simulations - Evader avoids capture
The parameters were $V_p^{\max} = 1$, $V_e^{\max} = 0.787$, $b = 1$ and $l = 1$.
State Estimation using the 1D Trifocal Tensor
Figure: Geometry for the 1D trifocal tensor, showing the evader and the camera frames $C_1$, $C_2$, and $C_3$ with their x and y axes. Labels include $\phi_1 = 0$, $t_{x_1} = 0$, $\phi_3$, $t_{x_3}$, and $t_{y_3}$.
A bound for the angle delimiting the field of view of the pursuer
Theorem
If the evader is in position $(r_0, \phi_0)$ in the reduced space at the beginning of the game, with $\phi_0 < \phi_v$ and $S < \phi_v$, then, if the pursuer applies its time-optimal feedback policy, the evader's position $(r, \phi)$ will satisfy $\phi < \phi_v$ at all times until capture is achieved, regardless of the evader's motion strategy.
Feedback-based motion strategies for the DDR
Figure: Feedback-based partition of the first quadrant of the reduced space (x from 0 to 2, y from 0 to 3). Labels: UP, BUP, BS, TS, RS (US), RS, RR, $S$, $\phi_v$, and $y_c$.
State Estimation
Simulations
Maintaining Visibility of an Evader in an Environment with Obstacles
The Objective
We treat the target-tracking problem as a game of kind, which reduces to the following decision problem: is the pursuer able to maintain surveillance of the evader at all times?
Classical Visibility
Figure: Classical visibility. Labels: pursuer, evader, shadow region, second occlusion, evader velocity vector, pursuer velocity vector reducing the shadow region, pursuer velocity vector preventing a second occlusion.
Strong Mutual Visibility
Definition
Two regions $R$ and $R'$ are said to be strongly mutually visible (SMV) if visibility holds for all points $x$ and $x'$ such that $x \in R$ and $x' \in R'$.
A straightforward test for the SMV relation, using a convex hull computation, is the following:
Regions $R$ and $R'$ are strongly mutually visible if and only if
$$\operatorname{int}[\text{convex-hull}(R \cup R')] \subset W$$
where $W$ is the polygon representing the workspace.
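The convex-hull test can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' code: regions are convex polygons given as lists of vertex tuples, $W$ is a simple polygon without holes, and the inputs are in general position (no hull vertex lies on the boundary of $W$ and no hull edge passes exactly through a vertex of $W$). All function names are hypothetical.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def point_in_polygon(p, poly):
    """Ray-casting test; points exactly on the boundary are not handled."""
    x, y = p
    inside = False
    for i in range(len(poly)):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % len(poly)]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside

def segments_cross(p1, p2, q1, q2):
    """True if the two segments cross at a single interior point."""
    return (cross(q1, q2, p1) * cross(q1, q2, p2) < 0 and
            cross(p1, p2, q1) * cross(p1, p2, q2) < 0)

def strongly_mutually_visible(R1, R2, W):
    """Check int[convex-hull(R1 U R2)] inside W for a hole-free simple polygon W."""
    hull = convex_hull(list(R1) + list(R2))
    if not all(point_in_polygon(p, W) for p in hull):
        return False
    for i in range(len(hull)):
        a, b = hull[i], hull[(i + 1) % len(hull)]
        for j in range(len(W)):
            if segments_cross(a, b, W[j], W[(j + 1) % len(W)]):
                return False
    return True

# L-shaped workspace with one reflex vertex at (2, 2).
W = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
corner = [(0.5, 0.5), (1.5, 0.5), (1.5, 1.5), (0.5, 1.5)]
right_arm = [(3, 0.5), (3.5, 0.5), (3.5, 1.5), (3, 1.5)]
top_arm = [(0.5, 3), (1.5, 3), (1.5, 3.5), (0.5, 3.5)]
print(strongly_mutually_visible(corner, right_arm, W))
print(strongly_mutually_visible(right_arm, top_arm, W))
```

Since $W$ has no holes, a hull whose vertices lie inside $W$ and whose edges do not cross the boundary of $W$ is contained in $W$. A workspace with holes would need an additional check.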
Environment Partition and Graphs
Figure: Strong Mutual Visibility. (a) Environment partition; (b) Mutual visibility graph; (c) Accessibility graph.
Workspace Partition
$SMV(R_i)$ is the set of regions that are SMV with region $R_i$. The total area of these regions is then given by
$$V_{sm}(R_i) = \sum_{R_k \in SMV(R_i)} \mu(R_k)$$
where $\mu(R_k)$ denotes the area of region $R_k$.
$\sum V_{sm}(R_i)$ is the sum of $V_{sm}(R_i)$ over all regions $R_i$; it is a global measure intrinsic to a given partition.
Figure: (a) Reflex rays and (b) Extended bitangent segment.
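The measure $\sum V_{sm}(R_i)$ is a double sum over the SMV relation. The sketch below computes it for a made-up partition; the region names, areas, and SMV sets are hypothetical, and each region is assumed to be SMV with itself (true for convex regions).

```python
def vsm(region, smv, area):
    """V_sm(R_i): total area of the regions that are SMV with `region`."""
    return sum(area[k] for k in smv[region])

def total_vsm(smv, area):
    """Sum of V_sm(R_i) over all regions: the global measure of a partition."""
    return sum(vsm(r, smv, area) for r in smv)

# Hypothetical partition with three regions.
area = {"R1": 2.0, "R2": 1.0, "R3": 3.0}
smv = {
    "R1": {"R1", "R2"},
    "R2": {"R1", "R2", "R3"},
    "R3": {"R2", "R3"},
}
print(vsm("R2", smv, area), total_vsm(smv, area))
```

Two candidate partitions of the same workspace can then be compared by their `total_vsm` values.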
Workspace Partition
Theorem
For a given reflex vertex $v$ and any procedure $\varsigma$, other than drawing pivot segments, that partitions $R_1$ and $R_2$ into two new regions each, there exists a pivot segment $S$ that partitions both $R_1$ and $R_2$ and yields a larger value of $\sum V_{sm}(R_i)$ than the partition obtained by applying $\varsigma$.
Figure: Different partitioning procedures. Panels (a) to (d) show regions $R_1$ and $R_2$ around the reflex vertex $v$, with labels $cgR(v)$, $R_{1g}$, $R_{1ng}$, $R_{2g}$, $R_{2ng}$, segments $s_1$, $s_2$, $s'_1$, $s'_2$, and the pivot segment $S$.
Definition
A guard polygon for a given point $q$ is the set of all regions that are each mutually visible with every region containing $q$. Let $Q(q) = \{R : q \in R\}$. A guard polygon $gP(q)$ for a given point $q$ is defined by
$$gP(q) = \{R : (R, R_k) \in MVG,\ \forall R_k \in Q(q)\} \tag{14}$$
Figure: Guard polygon. (a) If the evader stands on $n_i$, it is simultaneously over regions $\{R_1, R_2, R_3, R_4, R_7, R_8, R_9\}$. (b) The guard polygon for point $n_i$ is $gP(n_i) = \{R_4, R_5\}$.
$$R_{i,i+1} = \{(v, w) : t_p(v, w) \le t_e(q_i, q_{i+1}) \text{ where } v \in gP(q_i) \text{ and } w \in gP(q_{i+1})\} \tag{15}$$
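Definition (14) amounts to a filter over the regions of the partition. The sketch below assumes the mutual visibility graph (MVG) is stored as a set of unordered pairs that includes each region paired with itself; the function name, the storage format, and the small example graph are all hypothetical and do not reproduce the environment in the figure.

```python
def guard_polygon(q_regions, mvg, all_regions):
    """Eq. (14): regions R with (R, Rk) in the MVG for every Rk containing q.

    q_regions   : the set Q(q) of regions that contain the point q
    mvg         : set of frozensets, one per mutually visible pair
    all_regions : every region of the partition
    """
    return {R for R in all_regions
            if all(frozenset((R, Rk)) in mvg for Rk in q_regions)}

# Hypothetical partition with four regions; q lies on the border of R1 and R2.
regions = {"R1", "R2", "R3", "R4"}
pairs = [("R1", "R2"), ("R1", "R3"), ("R2", "R3"), ("R1", "R4")]
mvg = {frozenset(p) for p in pairs} | {frozenset((r,)) for r in regions}
print(sorted(guard_polygon({"R1", "R2"}, mvg, regions)))
```

Here R4 is excluded because it is mutually visible with R1 but not with R2.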
Reduced Visibility Graph (RVG)
The vertices in the RVG are the reflex vertices of the environment.
An edge between two vertices in the RVG is generated if the two vertices are endpoints of the same edge of an obstacle, or if a bitangent line can be drawn between them.
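The two edge rules above can be sketched as a small graph builder, together with a check for whether the resulting RVG has a tree topology or contains cycles. This is a sketch under assumptions: the bitangent pairs are taken as precomputed input (the geometric bitangent test is not shown), and the function names and vertex labels are hypothetical.

```python
def build_rvg(reflex_vertices, obstacle_edges, bitangent_pairs):
    """Adjacency sets of the RVG.

    reflex_vertices : iterable of vertex labels (the RVG vertices)
    obstacle_edges  : pairs of vertices joined by an obstacle edge
    bitangent_pairs : pairs of vertices joined by a bitangent line,
                      assumed to be computed elsewhere
    """
    adj = {v: set() for v in reflex_vertices}
    for a, b in list(obstacle_edges) + list(bitangent_pairs):
        # Only reflex vertices are RVG vertices; skip anything else.
        if a in adj and b in adj and a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj

def is_tree(adj):
    """True if the graph is connected and has exactly n - 1 edges."""
    if not adj:
        return False
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    if n_edges != len(adj) - 1:
        return False
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adj)

vertices = ["n1", "n2", "n3", "n4"]
rvg = build_rvg(vertices, [("n1", "n2"), ("n3", "n4")], [("n2", "n3")])
print(is_tree(rvg))
```

Adding one more bitangent pair, for example `("n1", "n4")`, closes a cycle and `is_tree` then returns `False`.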
Safe Areas and RVG with Tree Topology
Figure: Example 1 with a tree topology RVG and its calculated safe areas, $V_p / V_e = 0.9$. (a) Environment with vertices $n_1$ to $n_4$; (b) RVG.
Safe Areas and RVG with Tree Topology
Figure: Example 2 with a tree topology RVG and its calculated safe areas, $V_p / V_e = 1.1$. Panels (a) to (c) show vertices $n_1$ to $n_{10}$; panel (b) shows the RVG.
RVG with Cycles
Figure: Cycles algorithm. (a) Original RVG; (b) Unfolded RVG; (c) Feedback procedure; (d) Tree that considers 2 laps.
Safe Areas and RVG with Cycles
Figure: Example 3 with cycles in the RVG and its calculated safe areas. (a) $V_p / V_e = 1.2$; (b) RVG; (c) $V_p / V_e = 1.05$; (d) $V_p / V_e = 1.015$; (e) $V_p / V_e = 0.999$.
The S Set
Figure: Example 4 with its calculated safe areas and sample S sets. (a) $V_p / V_e = 0.9$; (b) $V_p / V_e = 0.78$; (c) $V_p / V_e = 0.71$; panels (d) to (f).
Decidability and Complexity
Theorem
The proposed algorithm always converges in a finite number of iterations; hence, the problem of deciding whether or not a pursuer is able to maintain SMV of an evader that travels over the RVG, both players moving at bounded speed, is decidable.
Theorem
The problem of deciding whether or not the pursuer is able to maintain SMV of an evader that travels over the RVG, both players moving at bounded speed, is NP-complete.
Conclusions
In this work, we made the following contributions:
1 Pursuit/Evasion: DDR vs Omnidirectional Agent
We presented time-optimal motion strategies and the conditions defining the winner for the game of capturing an omnidirectional evader with a differential drive robot.
2 Surveillance with Obstacles
We proved decidability of this problem for any arbitrary polygonal environment.
We provided a complexity measure for our evader surveillance game.
Future Work
1 Pursuit/Evasion: DDR vs Omnidirectional Agent
Capturing an omnidirectional agent using two or more differential drive robots when one is not able to do it.
To include acceleration bounds in the solution of the problem.
Feedback motion policy based on the image space.
2 Surveillance with Obstacles
A moving evader that is free to travel any path within the workspace.
Collaborators and Funding
Israel Becerra (UIUC)
David Jacobo (Google)
Ubaldo Ruiz (CICESE)
Hector Becerra (CIMAT)
Seth Hutchinson (UIUC)
Jean-Paul Laumond (LAAS/CNRS)
Jose Luis Marroquin (CIMAT)
Raul Monroy (ITESM CEM)
Funding: National Council of Science and Technology (CONACYT), Mexico
Publications
Israel Becerra, Rafael Murrieta-Cid, Raul Monroy, Seth Hutchinson and Jean-Paul Laumond, Maintaining Strong Mutual Visibility of an Evader Moving over the Reduced Visibility Graph, Autonomous Robots 40(2):395-423, 2016.
David Jacobo, Ubaldo Ruiz, Rafael Murrieta-Cid, Hector Becerra and Jose Luis Marroquin, A Visual Feedback-based Time-Optimal Motion Policy for Capturing an Unpredictable Evader, International Journal of Control 88(4):663-681, 2015.
Ubaldo Ruiz, Rafael Murrieta-Cid and Jose Luis Marroquin, Time-Optimal Motion Strategies for Capturing an Omnidirectional Evader using a Differential Drive Robot, IEEE Transactions on Robotics 29(5):1180-1196, 2013.
Thanks... Questions?
T. Basar and G. Olsder, Dynamic Noncooperative Game Theory, 2nd Ed., SIAM Series in Classics in Applied Mathematics, Philadelphia, 1995.
R. Isaacs, Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization, Wiley, New York, 1965.
L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko, The Mathematical Theory of Optimal Processes, John Wiley, 1962.