ENERGY-EFFICIENT COOPERATIVEADAPTIVE CRUISE CONTROL OF
PLATOONING VEHICLES
Weinan Gao1, Zhong-Ping Jiang1, IEEE Fellow, Kaan Ozbay2
1. Control and Networks Lab, Department of Electrical and Computer Engineering,New York University
2. Department of Civil and Urban Engineering, and the Center for Urban Science andProgress (CUSP), New York University
Nov 15, 2016
1 / 14
Background and Motivation
Background:
There are over 1 million road trafficdeaths worldwide every year.
90% of accidents are attributed tohuman errors
Americans were stuck in traffic for 8billion hours in 2015.
”We are the first generation that canend poverty, the last that can endclimate change.” – Ban Ki-Moon.
It’s imperative to develop next-generationcruise controllers for connected vehicles toincrease the safety, reliability, connectivityand autonomy.
[Challenge]:
1 Parametric variations → Unknownsystem parameters
2 Uncertain models of to-be-followedvehicles
3 Energy efficient → Eco-friendly
4 Input(acceleration) saturation2 / 14
Background and Motivation
1 Adaptive control approaches forplatooning vehicles are not optimal[Swaroop, Hedrick & Choi 2001], [Kwon& Chua 2014].
2 Optimal control methods are usuallymodel-based and not data-driven.[Jovanovic & Bamieh 2005], [Waschl,Kolmanovsky, Steinbuch, & del Re 2014].
3 Reinforcement-learning-based controllerscannot guarantee the stability of theclosed-loop systems [Ng et al. 2008],[Desjardins & Chaib-draa 2011]
We develop a data-driven, non-model-based adaptive optimal controller forplatooning vehicles by Adaptive Dynamics Programming (ADP). The issue ofinput saturation is also addressed.
1Gao, W.; Jiang, Z. P. & Ozbay, K. Data-driven Adaptive optimal control of connected vehicles, IEEE
Transactions on Intelligent Transportation Systems, 2016.
3 / 14
Background and Motivation
Dynamic Programming [Bellman 1957]
1 Curse of dimensionality
2 Curse of modeling
[Werbos 1968] pointed out that adaptive approximation to the HJBequation can be achieved by designing appropriate learning systems:approximate/adaptive dynamic programming (ADP)
1 Heuristic dynamic programming: approximate the optimal costfunction.
2 Dual dynamic programming: approximate the gradient of theoptimal cost function.
4 / 14
Review on ADP and Adaptive Optimal Control
The platoon can be modeled by the following systems
x =Ax+Bu
J (x0) =
∫ ∞0
(xT (τ)Qx(τ) + uT (τ)Ru(τ)
)dτ
The optimal control policy
u = −R−1BTP ∗x := −K∗x
where P ∗ = P ∗T > 0 is the unique solution to Riccati equation
ATP ∗ + P ∗A+Q− P ∗BR−1BTP ∗ = 0
Adaptive optimal control: find P ∗ and K∗ when A and B are unknown
5 / 14
Review on ADP and Adaptive Optimal Control
[Jiang & Jiang 2012] Adaptive optimal control with unknown systemmatrices A and B
1 Start from an admissible K0. k ← 0.2 xT (t1)Pkx(t1)− x(t0)TPkx(t0) = −
∫ t1t0xT (Q+KT
k RKk)xdτ+
2∫ t1t0
(u+Kkx)TRKk+1xdτ3 k ← k + 1. Repeat Step 2.
Both can ensure limk→∞
Pk = P ∗ and limk→∞
Kk = K∗.
2Jiang, Y. & Jiang, Z. P. Computational adaptive optimal control for continuous-time linear systems
with completely unknown dynamics, Automatica , 2012, 48, 2699-2704.
6 / 14
Paramics Micro-Traffic Simulation Results
Double-loop ADP algorithm Traffic simulation architecture
1Gao, W.; Jiang, Z. P. & Ozbay, K. Data-driven Adaptive optimal control of connected vehicles, IEEE
Transactions on Intelligent Transportation Systems, 2016.
7 / 14
Paramics Micro-Traffic Simulation Results
−10 −5 0 5 100
500
1000
1500
2000
2500
3000
3500
4000
(a)
Acceleration [m/s2]
−10 −5 0 5 100
1000
2000
3000
4000
5000
6000
(b)
Acceleration [m/s2]
t[s]0 5 10 15 20 25 30
hi[m]
10
15
20
25
(a)
h2
h3
h4
h5
t[s]0 5 10 15 20 25 30
v i[m
/s]
5
10
15
20
(b)
v1v2v3v4v5
Histograms of accelerations Plots of headways and velocities
8 / 14
Nonlinear and Adaptive Optimal Control of PlatooningVehicles
Employ global ADP (GADP)[Jiang and Jiang 2015] to solve a longstanding issue inITS: how to take into account strong nonlinearity and unknown dynamics in thedesign of global adaptive optimal controllers.
Contributions:
Because of the strongly nonlinear dynamics of the platooning vehicles, we arenot aware of any global solutions to adaptive optimal control of platooningvehicles with unknown dynamics.
Different from existing adaptive control approaches of platooning vehicles[Swaroop, Hedrick & Choi 2001], [Kwon & Chua 2014] the online GADPapproach learns a near-optimal controller iteratively via real-time state/inputdata.
The neural network approximation is avoided for this kind of high-orderplatooning vehicle systems which dramatically decreases the computationalburden.
3Gao, W. & Jiang, Z. P., Nonlinear and Adaptive Suboptimal Control of Connected Vehicles: A
Global Adaptive Dynamic Programming Approach. Journal of Intelligent & Robotic Systems, 2016.
10 / 14
Nonlinear and Adaptive Optimal Control of PlatooningVehicles
time (sec)
0 50 100 150
headw
ay(m
)
-2
0
2
4
h1-h
*
h2-h
*
h3-h
*
time (sec)
0 50 100 150
velo
city(m
/sec)
-0.5
0
0.5
1
v1-v
*
v2-v
*
v3-v
*
time (sec)
0 50 100 150
headw
ay(m
)
-2
0
2
4
h1-h
*
h2-h
*
h3-h
*
time (sec)
0 50 100 150
velo
city(m
/sec)
-0.5
0
0.5
1
v1-v
*
v2-v
*
v3-v
*
Initial control policy Learned control policy
11 / 14
Nonlinear and Adaptive Optimal Control of PlatooningVehicles
105
0
x1
-5-10-10
-5
x2
05
×1012
2.5
1
0
0.5
2
1.5
10
V0(x
1,x
2,0,0,0,0)
V10
(x1,x
2,0,0,0,0)
105
0
x3
-5-10-10
-5
x4
05
×1011
8
0
2
4
6
10
12
10
V0(0,0,x
3,x
4,0,0)
V10
(0,0,x3,x
4,0,0)
105
0
x5
-5-10-10
-5
x6
05
×1011
5
2
1
0
6
7
8
4
3
10
V0(0,0,0,0,x
5,x
6)
V10
(0,0,0,0,x5,x
6)
Figure: Comparison of value functions V0 w.r.t. the initial control policy andV10 w.r.t. the learner control policy
12 / 14
Thanks!
13 / 14
Supplimental slides: Model of Vehicles
The optimal velocity model [Orosz, et al., 2010] of the ithhuman-driven vehicle is
hi = vi−1 − vi,vi = αi(f(hi)− vi) + βihi, (1)
where i = 2, 3, · · · , n. αi and βi are human parameters with αi theheadway gain and βi the relative velocity gain satisfyingαi > 0, αi + βi > 0. f(·) indicates a range policy
f(h) =
0 if h ≤ hs,vm(1− cos(π h−hs
hg−hs))/2 if hs < h < hg,
vm if hg ≤ h.(2)
which implies that the vehicle i remains standstill if hi ≤ hs. viincreases as hi increases in the range (hs, hg). Additionally, if hi ≥ hg,vehicle i aims to travel at the maximum velocity vm. In this paper, thegoal for each driver is to actuate the vehicle at desired headway h∗ andvelocity v∗ = f(h∗).
14 / 14
Top Related