Hamilton-Jacobi-Bellman Equation
Feb 25, 2008
mitchell/Class/CS532M.2007W2/Talks/hjbSham.pdf
What is it?
The Hamilton-Jacobi-Bellman (HJB) equation is the continuous-time analog of the discrete-time deterministic dynamic programming algorithm.
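Discrete VS Continuous
● Dynamics
– Discrete: $x_{k+1} = f(x_k, u_k)$, $k \in \{0, \dots, N\}$
– Continuous: $\dot{x}(t) = f(x(t), u(t))$, $0 \le t \le T$
● Terminal condition
– Discrete: $J_N(x_N) = g_N(x_N)$
– Continuous: $V(T, x) = h(x)$
● Optimality condition
– Discrete: $J_k(x_k) = \min_{u_k \in U_k} \{ g_k(x_k, u_k) + J_{k+1}(f(x_k, u_k)) \}$
– Continuous: $0 = \min_{u \in U} \{ g(x, u) + \nabla_t V(t, x) + \nabla_x V(t, x)' f(x, u) \}$
● Total cost
– Discrete: $g_N(x_N) + \sum_{k=0}^{N-1} g_k(x_k, u_k)$
– Continuous: $h(x(T)) + \int_0^T g(x(t), u(t))\, dt$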
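The discrete-time recursion can be sketched in a few lines of Python. This is only an illustration on a made-up toy problem: the state set, the control set, the clamped dynamics and the quadratic costs are all assumptions chosen for the example, and `backward_dp` is a hypothetical helper name.

```python
def backward_dp(states, controls, f, g, g_terminal, N):
    """Backward recursion J_k(x) = min_u { g_k(x, u) + J_{k+1}(f(x, u)) }.

    Returns (J, policy), where J[k][x] is the optimal cost-to-go from
    state x at stage k and policy[k][x] is a minimizing control.
    """
    J = [dict() for _ in range(N + 1)]
    policy = [dict() for _ in range(N)]
    for x in states:
        J[N][x] = g_terminal(x)              # J_N(x_N) = g_N(x_N)
    for k in range(N - 1, -1, -1):           # stages N-1, ..., 0
        for x in states:
            best_u = min(controls,
                         key=lambda u: g(k, x, u) + J[k + 1][f(k, x, u)])
            policy[k][x] = best_u
            J[k][x] = g(k, x, best_u) + J[k + 1][f(k, x, best_u)]
    return J, policy


# Toy problem (assumed for illustration): integer states in [-3, 3],
# controls in {-1, 0, 1}, stage cost u^2, terminal cost x^2.
states = range(-3, 4)
controls = (-1, 0, 1)
f = lambda k, x, u: max(-3, min(3, x + u))   # clamp to stay on the state set
g = lambda k, x, u: u * u
g_terminal = lambda x: x * x

J, policy = backward_dp(states, controls, f, g, g_terminal, N=2)
print({x: J[0][x] for x in states})
```

The table `J[0]` plays the role that $V(0, x)$ plays in the continuous-time column.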
HJB Equation
● Extension of Hamilton-Jacobi equation (classical mechanics)
● Solution is the optimal cost-to-go function
● Applications
– path planning
– medical
– financial
Derivation
● Start with continuous time interval $t \in [0, T]$
● Discretize into $N$ pieces so that $\delta = T/N$
● Denote
$$x_k = x(k\delta), \quad k = 0, \dots, N$$
$$u_k = u(k\delta), \quad k = 0, \dots, N$$
Derivation
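● Remember $\dot{x}(t) = f(x(t), u(t))$
● Approximate the continuous time by
$$x_{k+1} = x_k + f(x_k, u_k)\,\delta$$
● Approximate the continuous-time cost
$$h(x(T)) + \int_0^T g(x(t), u(t))\, dt$$
by
$$h(x_N) + \sum_{k=0}^{N-1} g(x_k, u_k)\,\delta$$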
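A small numerical sketch of this discretization is given below. The test system $\dot{x} = -x + u$, the costs $g = x^2 + u^2$ and $h = x^2$, and the zero control are assumptions chosen only because the continuous-time cost then has a closed form, $\tfrac{1}{2} + \tfrac{1}{2}e^{-2}$ for $x(0) = 1$ and $T = 1$; `euler_cost` is a hypothetical helper name.

```python
import math


def euler_cost(f, g, h, x0, control, T, N):
    """Approximate h(x(T)) + int_0^T g dt using x_{k+1} = x_k + f(x_k, u_k) * delta."""
    delta = T / N
    x, cost = x0, 0.0
    for k in range(N):
        u = control(k * delta, x)
        cost += g(x, u) * delta      # running cost g(x_k, u_k) * delta
        x = x + f(x, u) * delta      # Euler step of the dynamics
    return cost + h(x)               # add the terminal cost h(x_N)


# Assumed test problem: x' = -x + u, g = x^2 + u^2, h = x^2, u = 0.
f = lambda x, u: -x + u
g = lambda x, u: x * x + u * u
h = lambda x: x * x
zero_control = lambda t, x: 0.0

# With u = 0 and x(0) = 1, x(t) = exp(-t), so the exact cost is
# exp(-2) + (1 - exp(-2)) / 2 = 0.5 + 0.5 * exp(-2).
exact = 0.5 + 0.5 * math.exp(-2.0)
for N in (10, 100, 1000):
    approx = euler_cost(f, g, h, 1.0, zero_control, T=1.0, N=N)
    print(N, approx, abs(approx - exact))
```

The error shrinks roughly in proportion to $\delta$, which is what the limit $\delta \to 0$ in the derivation relies on.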
Derivation
● $J^*(t, x)$: Optimal cost-to-go function for continuous time problem
● $\tilde{J}^*(t, x)$: Optimal cost-to-go function for discrete time approximation
Derivation
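● From discrete time DP
$$J_N(x_N) = g_N(x_N)$$
$$J_k(x_k) = \min_{u_k \in U_k} \{ g_k(x_k, u_k) + J_{k+1}(f(x_k, u_k)) \}$$
● For the discrete time approximation
$$\tilde{J}^*(N\delta, x) = h(x)$$
$$\tilde{J}^*(k\delta, x) = \min_{u \in U} \{ g(x, u)\,\delta + \tilde{J}^*((k+1)\delta,\; x + f(x, u)\,\delta) \}$$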
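As a rough sketch, the discrete time approximation can be run on a grid. The problem used here is assumed for illustration: $\dot{x} = u$ with $|u| \le 1$, no running cost, and $h(x) = \tfrac{1}{2}x^2$ (the scalar system of the HJB Example). Two further simplifications are assumptions of this sketch: the control set is restricted to $\{-1, 0, 1\}$, and the grid spacing equals $\delta$, so $x + u\delta$ always lands on a grid point and no interpolation is needed.

```python
def discrete_time_approximation(T, N, M):
    """Backward recursion for J~*(k*delta, x) on the grid x_i = i*delta, |i| <= M.

    Assumed problem: x' = u, u in {-1, 0, 1}, g = 0, h(x) = x^2 / 2.
    Returns the grid and the values J~*(0, x_i).
    """
    delta = T / N
    xs = [i * delta for i in range(-M, M + 1)]
    J = [0.5 * x * x for x in xs]                 # J~*(N*delta, x) = h(x)
    last = 2 * M
    for _ in range(N):                            # stages N-1, ..., 0
        # g = 0, so J~*(k*delta, x) = min_u J~*((k+1)*delta, x + u*delta);
        # indices are clamped at the edges of the grid.
        J = [min(J[max(i - 1, 0)], J[i], J[min(i + 1, last)])
             for i in range(last + 1)]
    return xs, J


T = 1.0
xs, J0 = discrete_time_approximation(T=T, N=10, M=30)
# Closed form from the HJB Example, evaluated at t = 0.
closed_form = [0.5 * max(0.0, abs(x) - T) ** 2 for x in xs]
max_error = max(abs(a - b) for a, b in zip(J0, closed_form))
print(max_error)
```

On this aligned grid the recursion reproduces the closed form up to rounding; for a general problem the grid values would only approach $J^*$ as $\delta \to 0$.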
Derivation
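● Assume that the Taylor series expansion exists
● Reminder: Taylor series expansion for $f(x, y)$
$$f(x + \Delta x, y + \Delta y) = \sum_{i=0}^{\infty} \left\{ \frac{1}{i!} \left[ \Delta x \nabla_x + \Delta y \nabla_y \right]^i f(x, y) \right\}$$
● Applied to the discrete time approximation
$$\tilde{J}^*(k\delta + \delta,\; x + f(x, u)\,\delta) = \sum_{i=0}^{\infty} \left\{ \frac{1}{i!} \left[ \delta \nabla_t + \delta f(x, u)' \nabla_x \right]^i \tilde{J}^*(k\delta, x) \right\}$$
● Ignoring all higher order terms
$$\tilde{J}^*(k\delta + \delta,\; x + f(x, u)\,\delta) = \tilde{J}^*(k\delta, x) + \nabla_t \tilde{J}^*(k\delta, x)\,\delta + \nabla_x \tilde{J}^*(k\delta, x)' f(x, u)\,\delta + o(\delta)$$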
Derivation
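● Combine
$$\tilde{J}^*(k\delta, x) = \min_{u \in U} \{ g(x, u)\,\delta + \tilde{J}^*((k+1)\delta,\; x + f(x, u)\,\delta) \}$$
$$\tilde{J}^*(k\delta + \delta,\; x + f(x, u)\,\delta) = \tilde{J}^*(k\delta, x) + \nabla_t \tilde{J}^*(k\delta, x)\,\delta + \nabla_x \tilde{J}^*(k\delta, x)' f(x, u)\,\delta + o(\delta)$$
● We get
$$\tilde{J}^*(k\delta, x) = \min_{u \in U} \{ g(x, u)\,\delta + \tilde{J}^*(k\delta, x) + \nabla_t \tilde{J}^*(k\delta, x)\,\delta + \nabla_x \tilde{J}^*(k\delta, x)' f(x, u)\,\delta + o(\delta) \}$$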
Derivation
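● Start from
$$\tilde{J}^*(k\delta, x) = \min_{u \in U} \{ g(x, u)\,\delta + \tilde{J}^*(k\delta, x) + \nabla_t \tilde{J}^*(k\delta, x)\,\delta + \nabla_x \tilde{J}^*(k\delta, x)' f(x, u)\,\delta + o(\delta) \}$$
● Cancel $\tilde{J}^*(k\delta, x)$ from both sides and divide by $\delta$
$$0 = \min_{u \in U} \{ g(x, u) + \nabla_t \tilde{J}^*(k\delta, x) + \nabla_x \tilde{J}^*(k\delta, x)' f(x, u) + o(\delta)/\delta \}$$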
Derivation
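● Take the limit $\delta \to 0$, $k \to \infty$, $k\delta = t$ of
$$0 = \min_{u \in U} \{ g(x, u) + \nabla_t \tilde{J}^*(k\delta, x) + \nabla_x \tilde{J}^*(k\delta, x)' f(x, u) + o(\delta)/\delta \}$$
● Assume that
$$\lim_{\delta \to 0,\; k \to \infty,\; k\delta = t} \tilde{J}^*(k\delta, x) = J^*(t, x)$$
$$0 = \min_{u \in U} \{ g(x, u) + \nabla_t J^*(t, x) + \nabla_x J^*(t, x)' f(x, u) \}$$
$$J^*(T, x) = h(x)$$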
Claim
If V(t,x) is a continuously differentiable solution to the HJB equation with V(T,x) = h(x), then V(t,x) equals the optimal cost-to-go function for all t and x.
Proof
● From
$$0 = \min_{u \in U} \{ g(x, u) + \nabla_t V(t, x) + \nabla_x V(t, x)' f(x, u) \}$$
● With any control and state trajectory $\{ u(t) \mid t \in [0, T] \}$, $\{ x(t) \mid t \in [0, T] \}$
$$0 \le g(x(t), u(t)) + \nabla_t V(t, x(t)) + \nabla_x V(t, x(t))' f(x(t), u(t))$$
Proof
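● Substitute in $\dot{x}(t) = f(x(t), u(t))$
$$0 \le g(x(t), u(t)) + \nabla_t V(t, x(t)) + \nabla_x V(t, x(t))' \dot{x}(t)$$
$$0 \le g(x(t), u(t)) + \frac{dV(t, x(t))}{dt}$$
$$0 \le \int_0^T g(x(t), u(t))\, dt + \int_0^T \frac{dV(t, x(t))}{dt}\, dt$$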
Proof
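● Evaluate
$$0 \le \int_0^T g(x(t), u(t))\, dt + \int_0^T \frac{dV(t, x(t))}{dt}\, dt$$
$$0 \le \int_0^T g(x(t), u(t))\, dt + V(T, x(T)) - V(0, x(0))$$
● For any state and control trajectory, using $V(T, x) = h(x)$
$$V(0, x(0)) \le h(x(T)) + \int_0^T g(x(t), u(t))\, dt$$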
Proof
● For optimal state and control trajectory $\{ u^*(t) \mid t \in [0, T] \}$, $\{ x^*(t) \mid t \in [0, T] \}$
$$V(0, x(0)) = h(x^*(T)) + \int_0^T g(x^*(t), u^*(t))\, dt = J^*(0, x(0))$$
$$V(0, x(0)) = J^*(0, x(0))$$
HJB Example
● Consider the simple scalar system
$$\dot{x}(t) = u(t), \quad |u(t)| \le 1 \;\; \forall t \in [0, T]$$
● The terminal cost
$$V(T, x) = \frac{1}{2} x^2$$
● HJB Equation
$$0 = \min_{|u| \le 1} \{ \nabla_t V(t, x) + u \nabla_x V(t, x) \}$$
HJB Example
● Candidate control policy
$$u^*(t, x) = -\operatorname{sgn}(x)$$
● Optimal cost-to-go function
$$J^*(t, x) = \frac{1}{2} \left( \max\{ 0, |x| - (T - t) \} \right)^2$$
● Check to see if it solves the HJB equation
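HJB Example
$$J^*(t, x) = \frac{1}{2} \left( \max\{ 0, |x| - (T - t) \} \right)^2$$
$$\nabla_t J^*(t, x) = \max\{ 0, |x| - (T - t) \}$$
$$\nabla_x J^*(t, x) = \operatorname{sgn}(x) \max\{ 0, |x| - (T - t) \}$$
$$0 = \min_{|u| \le 1} \{ \nabla_t V(t, x) + u \nabla_x V(t, x) \} = \min_{|u| \le 1} \{ 1 + u \operatorname{sgn}(x) \} \max\{ 0, |x| - (T - t) \}$$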
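The same check can be repeated numerically as a sanity test. The sketch below approximates the two partial derivatives with central differences and minimizes over a grid of controls in $[-1, 1]$; the horizon $T = 1$, the step `eps`, the control grid and the sample points are all assumptions of this sketch, and `hjb_residual` is a hypothetical helper name.

```python
def J_star(t, x, T=1.0):
    """Candidate cost-to-go J*(t, x) = 0.5 * max(0, |x| - (T - t))^2."""
    return 0.5 * max(0.0, abs(x) - (T - t)) ** 2


def hjb_residual(t, x, T=1.0, eps=1e-6, n_u=201):
    """Return (min_u { dJ/dt + u * dJ/dx }, minimizing u) using central differences."""
    J_t = (J_star(t + eps, x, T) - J_star(t - eps, x, T)) / (2 * eps)
    J_x = (J_star(t, x + eps, T) - J_star(t, x - eps, T)) / (2 * eps)
    us = [-1.0 + 2.0 * i / (n_u - 1) for i in range(n_u)]   # grid on [-1, 1]
    return min((J_t + u * J_x, u) for u in us)


# Sample points chosen away from the curve |x| = T - t, where the second
# derivatives of J* jump and finite differences are less accurate.
for t, x in [(0.2, 2.0), (0.5, -1.5), (0.0, 0.3)]:
    residual, u_min = hjb_residual(t, x)
    print(t, x, residual, u_min)
```

Outside the region $|x| \le T - t$ the minimizing control on the grid should agree with $-\operatorname{sgn}(x)$; inside it both derivatives vanish, so every admissible control attains the minimum.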
LQR and HJB
● Remember the continuous time LQR
$$\dot{x}(t) = A x(t) + B u(t)$$
$$J = x(T)' Q_T x(T) + \int_0^T \left( x(t)' Q x(t) + u(t)' R u(t) \right) dt$$
$$g(x, u) = x' Q x + u' R u$$
$$h(x) = x' Q_T x$$
LQR and HJB
● The HJB Equation
$$0 = \min_u \{ x' Q x + u' R u + \nabla_t V(t, x) + \nabla_x V(t, x)' (A x + B u) \}$$
● Try a solution of the form $V(t, x) = x' K(t) x$ with $K(t)$ symmetric
$$V(T, x) = x' Q_T x$$
$$\nabla_x V(t, x) = 2 K(t) x$$
$$\nabla_t V(t, x) = x' \dot{K}(t) x$$
LQR and HJB
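● Substitute in the HJB equation
$$0 = \min_u \{ x' Q x + u' R u + x' \dot{K}(t) x + 2 x' K(t) A x + 2 x' K(t) B u \}$$
● Differentiate to find minimum
$$0 = \frac{\partial}{\partial u} \{ x' Q x + u' R u + x' \dot{K}(t) x + 2 x' K(t) A x + 2 x' K(t) B u \}$$
$$2 B' K(t) x + 2 R u = 0 \;\Rightarrow\; u = -R^{-1} B' K(t) x$$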
LQR and HJB
● Therefore, at minimum $u$
$$0 = x' \left( \dot{K}(t) + K(t) A + A' K(t) - K(t) B R^{-1} B' K(t) + Q \right) x$$
$$0 = \dot{K}(t) + K(t) A + A' K(t) - K(t) B R^{-1} B' K(t) + Q$$
● Continuous-time Riccati Equation
$$\dot{K}(t) = -K(t) A - A' K(t) + K(t) B R^{-1} B' K(t) - Q$$
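A minimal sketch of solving the Riccati equation numerically is shown below for the scalar case, where it reduces to $\dot{K} = -2aK + (b^2/r)K^2 - q$ with $K(T) = Q_T$. The fixed-step RK4 scheme, the function names and the test values ($a = 0$, $b = q = r = 1$, $Q_T = 0$, $T = 1$) are assumptions of this sketch; for those values the equation becomes $\dot{K} = K^2 - 1$, whose solution with $K(T) = 0$ is $K(t) = \tanh(T - t)$.

```python
import math


def riccati_rhs(K, a, b, q, r):
    """Scalar form of Kdot = -K A - A' K + K B R^{-1} B' K - Q."""
    return -2.0 * a * K + (b * b / r) * K * K - q


def solve_riccati_backward(a, b, q, r, q_T, T, N):
    """Integrate the scalar Riccati equation from t = T back to t = 0 with RK4.

    Returns a list Ks with Ks[k] approximating K(k * T / N).
    """
    dt = T / N
    K = q_T                          # terminal condition K(T) = Q_T
    Ks = [K]
    for _ in range(N):
        # One RK4 step with step size -dt (time runs backward).
        k1 = riccati_rhs(K, a, b, q, r)
        k2 = riccati_rhs(K - 0.5 * dt * k1, a, b, q, r)
        k3 = riccati_rhs(K - 0.5 * dt * k2, a, b, q, r)
        k4 = riccati_rhs(K - dt * k3, a, b, q, r)
        K = K - dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        Ks.append(K)
    Ks.reverse()                     # reorder so that index 0 is t = 0
    return Ks


Ks = solve_riccati_backward(a=0.0, b=1.0, q=1.0, r=1.0, q_T=0.0, T=1.0, N=100)
print(Ks[0], math.tanh(1.0))         # numerical K(0) next to the closed form
```

For matrix-valued $K(t)$ the same backward integration applies with the matrix right-hand side; a library ODE solver would usually be preferred over a hand-written RK4 loop.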
LQR and HJB
● Given that K(t) satisfies the Riccati equation
$$J^*(t, x) = V(t, x) = x' K(t) x$$
● And the optimal control policy is
$$u^*(t, x) = -R^{-1} B' K(t) x$$
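As an illustration, the relation $J^*(0, x) = x' K(0) x$ can be checked by simulating the closed loop for a scalar problem. Everything specific here is an assumption of the sketch: the system $\dot{x} = u$ with $Q = R = 1$ and $Q_T = 0$, for which the Riccati equation reads $\dot{K} = K^2 - 1$ with solution $K(t) = \tanh(T - t)$, the Euler integration, and the helper name `simulate_cost`.

```python
import math


def simulate_cost(x0, T, N, gain):
    """Euler simulation of x' = u under the feedback u = -gain(t) * x.

    Returns the accumulated cost int_0^T (x^2 + u^2) dt (Q = R = 1, Q_T = 0).
    """
    dt = T / N
    x, cost = x0, 0.0
    for k in range(N):
        t = k * dt
        u = -gain(t) * x             # u = -R^{-1} B' K(t) x with R = B = 1
        cost += (x * x + u * u) * dt
        x += u * dt
    return cost                      # terminal cost is zero because Q_T = 0


T, x0 = 1.0, 2.0
riccati_gain = lambda t: math.tanh(T - t)   # K(t) for the assumed problem
optimal = simulate_cost(x0, T, 10000, riccati_gain)
uncontrolled = simulate_cost(x0, T, 10000, lambda t: 0.0)
predicted = math.tanh(T) * x0 * x0          # x' K(0) x
print(optimal, predicted, uncontrolled)
```

The simulated cost under the Riccati feedback should be close to $x' K(0) x$ and below the cost of applying no control, in line with the claim that $V$ is the optimal cost-to-go.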