Stochastic optimization methods for the simultaneous control of parameter-dependent systems


Transcript of: Stochastic optimization methods for the simultaneous control of parameter-dependent systems

Page 1:

STOCHASTIC OPTIMIZATION METHODS FOR THE SIMULTANEOUS CONTROL OF PARAMETER-DEPENDENT SYSTEMS

Umberto Biccari, Fundación Deusto and Universidad de Deusto, Bilbao, Spain
[email protected] | cmc.deusto.es/umberto-biccari

joint work with: Ana Navarro (Universitat de València) and Enrique Zuazua (FAU, Fundación Deusto and Universidad Autónoma de Madrid)

June 12, 2020

Page 2:

INTRODUCTION

Page 3:

Keywords

Key concepts of the presentation:

parameter-dependent models

simultaneous controllability

stochastic optimization

2/23

Page 4:

Parameter-dependent models

Parameter-dependent models appear in many real-life applications, to describe physical phenomena which may have different realizations:

x′ν(t) = Aνxν(t) + Bu(t),   0 < t < T
xν(0) = x0,                 ν ∈ K

Example 1: linearized cart-inverted pendulum system

( x′ν )   ( 0      0           1   0 )  ( xν )     (  0 )
( v′ν ) = ( 0   −ν/M           0   0 )  ( vν )  +  (  1 )  u.
( θ′ν )   ( 0      0           0   1 )  ( θν )     (  0 )
( ω′ν )   ( 0   (ν+M)/(Mℓ)     0   0 )  ( ων )     ( −1 )
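As an illustration, a minimal Python sketch (assuming numpy) of how the realizations Aν and B can be assembled; the entries simply mirror the matrix displayed above, M = 10 and ℓ = 1 follow the "Input data" slide later on, and the intermediate parameter values are an assumption of this sketch:

import numpy as np

def A_nu(nu, M=10.0, ell=1.0):
    """State matrix of the linearized cart-inverted pendulum for one value of nu
    (entries copied from the slide above)."""
    return np.array([
        [0.0, 0.0,                  1.0, 0.0],
        [0.0, -nu / M,              0.0, 0.0],
        [0.0, 0.0,                  0.0, 1.0],
        [0.0, (nu + M) / (M * ell), 0.0, 0.0],
    ])

B = np.array([[0.0], [1.0], [0.0], [-1.0]])

# one realization per parameter value in a finite set K
K_values = np.linspace(0.1, 1.0, 10)
A_list = [A_nu(nu) for nu in K_values]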

3/23

Page 5:

Parameter-dependent models

Parameter-dependent models appear in many real-life applications, to describe physical phenomena which may have different realizations:

x′ν(t) = Aνxν(t) + Bu(t),   0 < t < T
xν(0) = x0,                 ν ∈ K

Example 2: system of thermoelasticity

wtt − µ∆w − (λ + µ)∇div(w) + α∇θ = u 1ω
θt − ∆θ + β div(wt) = 0

where λ and µ are the Lamé coefficients.

Lebeau and Zuazua, Null controllability of a system of linear thermoelasticity, 2002

3/23

Page 6:

Simultaneous controllability

We look for a unique parameter-independent control u such that, at time T > 0, the corresponding solution xν satisfies

xν(T) = xT, for all ν ∈ K

In the ODE setting, simultaneous controllability is equivalent to the classical controllability of the augmented system

x′ = Ax + Bu

with x = (xν1 , . . . , xν|K|)ᵀ ∈ R^{N|K|}, u = (u, . . . , u)ᵀ ∈ L²(0,T; R^{N|K|}), and where the matrices A and B are given by

A = diag(Aν1 , . . . , Aν|K|) ∈ R^{N|K| × N|K|}   and   B = (B, . . . , B)ᵀ ∈ R^{N|K| × 1}

(A is block-diagonal and B is the matrix B stacked |K| times).
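As an illustration, a minimal Python sketch (assuming numpy/scipy; the matrices used at the bottom are generic placeholders, not the pendulum data) of how the augmented matrices can be assembled:

import numpy as np
from scipy.linalg import block_diag

def augmented_system(A_list, B):
    """Assemble the block-diagonal A and the stacked B of the augmented system.
    A_list : list of N x N matrices A_nu, one per parameter nu in K.
    B      : N x 1 control matrix, shared by all realizations."""
    A_aug = block_diag(*A_list)           # diag(A_nu1, ..., A_nu|K|)
    B_aug = np.vstack([B] * len(A_list))  # (B; ...; B)
    return A_aug, B_aug

# toy usage with two 2x2 realizations
A_list = [np.array([[0.0, 1.0], [-nu, 0.0]]) for nu in (0.1, 1.0)]
B = np.array([[0.0], [1.0]])
A_aug, B_aug = augmented_system(A_list, B)  # shapes (4, 4) and (4, 1)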

Lohéac and Zuazua, From averaged to simultaneous controllability, 2016

4/23


Page 8:

Computation of simultaneous controls

û = argmin_{u ∈ L²(0,T;R^M)} Fν(u)

Fν(u) := (1/2) E[ ‖xν(T) − xT‖²_{R^N} ] + (β/2) ‖u‖²_{L²(0,T;R^M)}

Fν(u) := (1/|K|) ∑_{νk∈K} fνk(u) + (β/2) ‖u‖²_{L²(0,T;R^M)},   where fνk(u) := (1/2) ‖xνk(T) − xT‖²_{R^N}

Typical approaches:

• Gradient Descent (GD): uk+1 = uk − ηk∇Fν(uk)

• Conjugate Gradient (CG)

Nocedal and Wright, Numerical optimization, 1999

Ciarlet, Introduction à l’analyse numérique matricielle et à l’optimisation, 1988

Both approaches have a high computational cost when dealing with large parameter sets.

5/23


Page 11:

Stochastic optimization

STOCHASTIC GRADIENT DESCENT (SGD)
This is a simplification of the classical GD in which, instead of computing ∇Fν for all parameters ν ∈ K, in each iteration this gradient is estimated on the basis of a single randomly picked configuration

uk+1 = uk − ηk∇fνk(uk)

Robbins and Monro, A stochastic approximation method, 1951

CONTINUOUS STOCHASTIC GRADIENT (CSG)
This is a variant of SGD, based on the idea of reusing previously obtained information to improve the efficiency of the algorithm

uk+1 = uk − ηk Gk,   Gk = ∑_{ℓ=1}^{k} αℓ ∇fνℓ(uℓ)

Pflug, Bernhardt, Grieshammer and Stingl, A new stochastic gradient method for the efficient solution of structural optimization problems with infinitely many state problems, 2020

6/23

Page 12:

OPTIMIZATION ALGORITHMS

Page 13:

Gradient Descent

uk+1 = uk − ηk∇Fν(uk)

Convergence

Since Fν is convex, if we take ηk ≡ η constant and small enough, we have

‖uk − û‖²_{R^N} ≤ ‖u0 − û‖²_{R^N} e^{−2 CGD k},   CGD = ln( (ρ + 1)/(ρ − 1) )

where ρ is the condition number of the problem. Hence

‖uk − û‖²_{R^N} < ε  →  k = O( ln(ε⁻¹)/CGD )  →  costGD = O( |K| ln(ε⁻¹)/CGD )

8/23

Page 14:

Gradient Descent

uk+1 = uk − ηk ( βuk − (1/|K|) ∑_{ν∈K} Bᵀpkν )

where, for each ν ∈ K,

x′ν(t) = Aνxν(t) + Buk(t),   0 < t < T,   xν(0) = x0
p′ν(t) = −Aνᵀpν(t),          0 < t < T,   pν(T) = −(xν(T) − xT)
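A minimal Python sketch of this adjoint-based GD loop (an illustration only: it assumes an explicit-Euler time discretization, generic matrices A_list and B, and hypothetical helper names and tuning parameters):

import numpy as np

def solve_forward(A, B, u, x0, dt):
    """Explicit Euler for x' = A x + B u(t), x(0) = x0; u has one column per time step."""
    x = np.array(x0, dtype=float)
    for k in range(u.shape[1]):
        x = x + dt * (A @ x + B @ u[:, k])
    return x  # state at final time T

def solve_adjoint(A, xT_gap, dt, n_steps):
    """Explicit Euler, backwards in time, for p' = -A^T p with p(T) = -xT_gap.
    Returns p at every time step, needed to evaluate B^T p(t)."""
    p = -np.array(xT_gap, dtype=float)
    P = np.zeros((len(p), n_steps))
    for k in reversed(range(n_steps)):
        P[:, k] = p
        p = p + dt * (A.T @ p)
    return P

def gd_simultaneous(A_list, B, x0, xT, T=1.0, n_steps=100, beta=1e-3, eta=0.5, iters=200):
    dt = T / n_steps
    u = np.zeros((B.shape[1], n_steps))           # discretized control
    for _ in range(iters):
        grad = beta * u
        for A in A_list:                          # full gradient: loop over ALL parameters
            xT_nu = solve_forward(A, B, u, x0, dt)
            P = solve_adjoint(A, xT_nu - xT, dt, n_steps)
            grad -= (B.T @ P) / len(A_list)       # -(1/|K|) sum_nu B^T p_nu(t)
        u = u - eta * grad                        # GD step
    return u

# toy usage with two 2x2 realizations (placeholders, not the pendulum data)
A_list = [np.array([[0.0, 1.0], [-nu, 0.0]]) for nu in (0.1, 1.0)]
B = np.array([[0.0], [1.0]])
u_opt = gd_simultaneous(A_list, B, x0=np.array([1.0, 0.0]), xT=np.zeros(2))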

Convergence

Since Fν is convex, if we take ηk ≡ η constant and small enough, we have

‖uk − û‖²_{R^N} ≤ ‖u0 − û‖²_{R^N} e^{−2 CGD k},   CGD = ln( (ρ + 1)/(ρ − 1) )

‖uk − û‖²_{R^N} < ε  →  k = O( ln(ε⁻¹)/CGD )  →  costGD = O( |K| ln(ε⁻¹)/CGD )

8/23

Page 15:

GD - practical considerations

The expected exponential convergence of GD may be violated in practice.

The convergence rate is given in terms of the constant CGD(ρ), which is positive, decreasing, and converges to zero as ρ → +∞.

Bad conditioning of the minimization problem therefore degrades the actual convergence of GD.

Example

min_{x∈R³} ( (1/2) xᵀQτ x − bᵀx ),   Qτ = diag(1, τ, τ²),   b = −(1, 1, 1)ᵀ,   ρ = λmax/λmin = τ²

  τ  | iterations |   ρ
   2 |        27  |    4
   5 |       161  |   25
  10 |       633  |  100
  20 |      2511  |  400
  50 |     15619  | 2500

Meza, Steepest descent, 2010
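A quick numerical check of this behaviour: a minimal Python sketch of steepest descent with exact line search (the tolerance is an assumption, and the iteration counts will not match the table above exactly, since those depend on the stopping rule used; the growth with τ is the same):

import numpy as np

def steepest_descent(Q, b, tol=1e-6, max_iter=100_000):
    """Steepest descent with exact line search for min 1/2 x^T Q x - b^T x."""
    x = np.zeros(len(b))
    for k in range(1, max_iter + 1):
        g = Q @ x - b                       # gradient
        if np.linalg.norm(g) < tol:
            return x, k
        alpha = (g @ g) / (g @ (Q @ g))     # exact line-search step
        x = x - alpha * g
    return x, max_iter

for tau in (2, 5, 10, 20, 50):
    Q = np.diag([1.0, tau, tau**2])
    b = -np.ones(3)
    _, iters = steepest_descent(Q, b)
    print(f"tau = {tau:3d}   condition number = {tau**2:5d}   iterations = {iters}")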

9/23


Page 17:

Conjugate Gradient

∇Fν(u) = βu − (1/|K|) ∑_{ν∈K} Bᵀpν

Convergence

‖uk − û‖²_{R^N} ≤ 4 ‖u0 − û‖²_{R^N} e^{−2 CCG k},   CCG = ln( (√ρ + 1)/(√ρ − 1) )

‖uk − û‖²_{R^N} < ε  →  k = O( ln(ε⁻¹)/CCG )  →  costCG = O( |K| ln(ε⁻¹)/CCG )

10/23

Page 18:

Conjugate Gradient

∇Fν(u) = (βI + E[L∗T,ν LT,ν]) u + E[L∗T,ν (yν(T) − xT)]

so the optimality condition ∇Fν(û) = 0 takes the form of a linear system Au = b, with

A := βI + E[L∗T,ν LT,ν]   and   b := −E[L∗T,ν (yν(T) − xT)]

Here the operators are

LT,ν : L²(0,T;R^M) → R^N,    u ↦ zν(T)
L∗T,ν : R^N → L²(0,T;R^M),   pT,ν ↦ Bᵀpν

where

y′ν(t) = Aνyν(t),            0 < t < T,   yν(0) = x0
z′ν(t) = Aνzν(t) + Bu(t),    0 < t < T,   zν(0) = 0

and pν solves the adjoint equation p′ν(t) = −Aνᵀpν(t), 0 < t < T, with pν(T) = pT,ν.

Convergence

‖uk − û‖²_{R^N} ≤ 4 ‖u0 − û‖²_{R^N} e^{−2 CCG k},   CCG = ln( (√ρ + 1)/(√ρ − 1) )

‖uk − û‖²_{R^N} < ε  →  k = O( ln(ε⁻¹)/CCG )  →  costCG = O( |K| ln(ε⁻¹)/CCG )

10/23

Page 19:

CG - practical considerations

The expected exponential convergence of CG may be violated in practical experiments, although the situation is less critical than for GD.

• The constant CCG(ρ) depends on the square root of ρ, hence CG is less sensitive to the conditioning of the problem.

• CG enjoys the finite termination property: if we apply CG to solve an N-dimensional problem, the algorithm converges in at most N iterations (see the sketch below).
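In the discretized setting, Au = b is a finite-dimensional symmetric positive-definite system, so the standard CG iteration applies directly. A self-contained Python sketch of that iteration (illustrative only; the random SPD matrix below is just a stand-in for βI + E[L∗L]):

import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Standard CG for A x = b with A symmetric positive definite."""
    n = len(b)
    max_iter = max_iter or n
    x = np.zeros(n)
    r = b - A @ x          # residual
    d = r.copy()           # search direction
    rs = r @ r
    for k in range(1, max_iter + 1):
        Ad = A @ d
        alpha = rs / (d @ Ad)
        x += alpha * d
        r -= alpha * Ad
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            return x, k
        d = r + (rs_new / rs) * d
        rs = rs_new
    return x, max_iter

# toy SPD system: finite termination in at most n iterations
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)
b = rng.standard_normal(5)
x, iters = conjugate_gradient(A, b)
print(iters, np.linalg.norm(A @ x - b))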

11 / 23

Page 20:

Stochastic Gradient Descent

uk+1 = uk − ηk∇fνk(uk), νk i.i.d. from K

Applying SGD to minimize Fν(u) requires, at each iteration k, only one resolution of the dynamics.

12 /23

Page 21:

Stochastic Gradient Descent

uk+1 = uk − ηk ( βuk − Bᵀpkνk )

where νk is drawn at random from K and

x′νk(t) = Aνk xνk(t) + Buk(t),   0 < t < T,   xνk(0) = x0
p′νk(t) = −Aνkᵀ pνk(t),          0 < t < T,   pνk(T) = −(xνk(T) − xT)

Applying SGD to minimize Fν(u) requires, at each iteration k, only one resolution of the dynamics.
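A minimal Python sketch of the SGD loop (illustrative only: the quadratic maps below are stand-ins for the discretized control-to-misfit maps, so that evaluating one sampled gradient plays the role of one forward/adjoint solve; sizes, seeds and the step-size schedule are assumptions):

import numpy as np

rng = np.random.default_rng(1)

# stand-ins for f_nu(u) = 1/2 ||L_nu u - c_nu||^2 + beta/2 ||u||^2, one per parameter, |K| = 100
L = [rng.standard_normal((4, 50)) for _ in range(100)]
c = [rng.standard_normal(4) for _ in range(100)]
beta = 1e-3

def grad_f(i, u):
    """Gradient of f_nu for the single parameter with index i."""
    return L[i].T @ (L[i] @ u - c[i]) + beta * u

u = np.zeros(50)
for k in range(1, 5001):
    i = rng.integers(len(L))       # one randomly picked configuration per iteration
    eta_k = 1.0 / (100.0 + k)      # decreasing steps: sum eta_k = inf, sum eta_k^2 < inf
    u -= eta_k * grad_f(i, u)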

12 /23

Page 22:

Stochastic Gradient Descent - convergence

In SGD the iterate sequence (uk)_{k≥1} is a stochastic process determined by the random sequence (νk)_{k≥1} ⊂ K. Hence, the convergence properties are defined in expectation, E[ ‖uk+1 − û‖²_{R^N} ], or in the context of almost sure convergence.

Bach and Moulines, Non-asymptotic analysis of stochastic approximation algorithms for machine learning, 2011

Bottou, Online learning and stochastic approximations, 1998

In SGD, convergence is guaranteed if the step-sizes are chosen so that E[ ‖∇Fν(uk)‖² ] remains bounded by a deterministic quantity. In particular, a fixed step-size ηk = η, even if small, does not ensure convergence. A standard approach is to use a decreasing sequence (ηk)_{k≥1} such that

∑_{k=1}^{∞} ηk = +∞   and   ∑_{k=1}^{∞} ηk² < +∞
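For instance, a schedule of the form ηk = η0/(1 + k/k0), with η0, k0 > 0 chosen by the user, satisfies both requirements: the first series diverges like the harmonic series, while the series of squares converges. A faster decay such as ηk = η0/k² would instead violate the first condition.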

Robbins and Monro, A stochastic approximation method, 1951

Bottou, Curtis and Nocedal, Optimization methods for large-scale machine learning, 2018

13/23

Page 23:

Stochastic Gradient Descent - convergence

If ηk is properly chosen, by means of standard martingale techniques we can show that SGD converges almost surely:

uk → û almost surely, as k → +∞

Convergence rate

Because of the noise introduced by the random selection of the descent direction, the convergence of SGD is only sublinear:

E[ ‖uk − û‖²_{R^N} ] = O(k⁻¹)

E[ ‖uk − û‖²_{R^N} ] < ε  →  k = O(ε⁻¹)  →  costSGD = O(ε⁻¹)

14 /23


Page 25:

Continuous Stochastic Gradient

uk+1 = uk − ηk Gk,   Gk = ∑_{ℓ=1}^{k} αℓ ∇fνℓ(uℓ)

CONVERGENCE PROPERTIES
As the optimization process evolves, the approximate gradient Gk converges almost surely to the full gradient of the objective functional:

Gk → ∇Fν almost surely, as k → +∞

Hence CSG is a less noisy algorithm with a better convergence behavior; in particular, convergence can be guaranteed even with a fixed learning rate ηk = η.

Pflug, Bernhardt, Grieshammer and Stingl, A new stochastic gradient method for the efficient solution of structural optimization problems with infinitely many state problems, 2020

15/23

Page 26:

Continuous Stochastic Gradient

uk+1 = uk − ηk ∑_{ℓ=1}^{k} αℓ ( βuℓ − Bᵀpℓνℓ )
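A Python sketch of this update on the same kind of quadratic stand-in used for the SGD sketch above (illustrative only; in particular, the weights αℓ are taken uniform, αℓ = 1/k, which is just one of the weighting choices proposed by Pflug et al., and the learning rate is an assumption):

import numpy as np

rng = np.random.default_rng(2)

# stand-ins for the per-parameter misfits, as in the SGD sketch
L = [rng.standard_normal((4, 50)) / 7.0 for _ in range(100)]
c = [rng.standard_normal(4) for _ in range(100)]
beta = 1e-3

def grad_f(i, u):
    return L[i].T @ (L[i] @ u - c[i]) + beta * u

u = np.zeros(50)
running_sum = np.zeros(50)        # accumulates grad f_{nu_l}(u^l) over past iterations
eta = 0.2                         # CSG tolerates a fixed learning rate
for k in range(1, 2001):
    i = rng.integers(len(L))
    running_sum += grad_f(i, u)   # gradient sampled at the current iterate u^k
    G = running_sum / k           # G^k with uniform weights alpha_l = 1/k
    u -= eta * G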

CONVERGENCE PROPERTIES
As the optimization process evolves, the approximate gradient Gk converges almost surely to the full gradient of the objective functional:

Gk → ∇Fν almost surely, as k → +∞

Hence CSG is a less noisy algorithm with a better convergence behavior; in particular, convergence can be guaranteed even with a fixed learning rate ηk = η.

Pflug, Bernhardt, Grieshammer and Stingl, A new stochastic gradient method for the efficient solution of structural optimization problems with infinitely many state problems, 2020

15/23

Page 27:

NUMERICAL SIMULATIONS

Page 28:

Numerical simulations

Linearized cart-inverted pendulum system

( x′ν )   ( 0      0           1   0 )  ( xν )     (  0 )
( v′ν ) = ( 0   −ν/M           0   0 )  ( vν )  +  (  1 )  u
( θ′ν )   ( 0      0           0   1 )  ( θν )     (  0 )
( ω′ν )   ( 0   (ν+M)/(Mℓ)     0   0 )  ( ων )     ( −1 )

• The system includes a cart of mass M and a rigid pendulum of length ℓ.

• The pendulum is anchored to the cart and a variable mass, described by the parameter ν, is placed at its free end.

• The cart moves on a horizontal plane. The states xν(t) and vν(t) describe its position and velocity, respectively.

• During the motion of the cart the pendulum deviates from its initial vertical position by an angle θν(t), with angular velocity ων(t).

• Starting from an initial state (x^i, v^i, 0, 0), we want to compute a parameter-independent control function u steering all the realizations of the system, in time T, to the final state (x^f, 0, 0, 0).

17 /23


Page 30:

Numerical simulations

Input data

• x0 = (−1, 1,0,0)>

• xT = (0,0,0,0)>

• T = 1s

• ε = 10−4

• M = 10

• ℓ = 1

• ν ∈ K = {ν1, . . . , ν|K|}, with ν1 = 0.1 and ν|K| = 1

18 /23


Page 32:

Numerical simulations

 |K|  | GD iter. | GD time  | CG iter. | CG time  | SGD iter. | SGD time | CSG iter. | CSG time
   2  |   1868   |   45.1s  |    12    |    1.1s  |   2195    |  33.1s   |    930    |  18.6s
  10  |   1869   |  150.1s  |    13    |    2.6s  |   2106    |  31.4s   |    923    |  17.4s
 100  |   1870   | 1799.5s  |    12    |   17.7s  |   2102    |  28.9s   |    929    |  17.4s
 250  |    -     |     -    |    13    |   50.3s  |   2080    |  28.2s   |    928    |  17.9s
 500  |    -     |     -    |    13    |  101.3s  |   2099    |  32.9s   |    927    |  21.5s

19/23

Page 33:

Numerical simulations

CSG outperforms SGD in terms of the number of iterations it requires to converge and, consequently, of the total computational time. This is because its optimization process is less noisy than that of SGD, yielding a better convergence behavior.

20/23

Page 34:

Conclusions

We compared the GD, CG, SGD and CSG algorithms for the minimization of a quadratic functional associated with the simultaneous controllability of linear parameter-dependent models.

We observed the following:

1. The GD approach is the worst one in terms of computational complexity, as a consequence of the bad conditioning of the simultaneous controllability problem.

2. The choice of SGD or CSG instead of CG is preferable only when dealing with parameter sets of large cardinality |K|.

21 /23

Page 35:

Open problems

SIMULTANEOUS CONTROLLABILITY OF PDE MODELS

• In the PDE setting, simultaneous controllability is a quite delicate issue because of the appearance of peculiar phenomena which are not detected at the ODE level.

• For some PDE systems, simultaneous controllability may be understood by looking at the spectral properties of the model. Roughly speaking, one needs all the eigenvalues to have multiplicity one in order to be able to observe every eigenmode independently. This fact generally yields restrictions to the validity of simultaneous controllability, which may be difficult to tackle at the numerical level.

Dáger and Zuazua, Controllability of star-shaped networks of strings, 2001

22/23

Page 36:

Open problems

COMPARISON WITH THE GREEDY METHODOLOGY

• The greedy approach aims to approximate the dynamics and controls of linear parameter-dependent systems by identifying the most meaningful realizations of the parameters.

Lazar and Zuazua, Greedy controllability of finite dimensional linear systems, 2016
Hernández-Santamaría, Lazar and Zuazua, Greedy controllability of finite dimensional linear systems, 2019

• A comparison of the greedy and stochastic approaches would be an interesting issue.

22/23

Page 37:

THANK YOU FOR YOUR ATTENTION!

This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (grant agreement No 694126-DYCON).

23/23