
Convex Optimization and Linear Matrix Inequalities

Carsten Scherer

Delft Center for Systems and Control (DCSC)

Delft University of Technology

The Netherlands

Outline


• From Optimization to Convex Semi-Definite Programming

• Linear Matrix Inequalities (LMIs)

• Truss-Topology Design

• LMIs and Stability

• A First Glimpse at Robustness

• LMI Regions

Optimization

The ingredients of any optimization problem are:

• A universe of decisions x ∈ X

• A subset S ⊂ X of feasible decisions

• A cost function or objective function f : S → R

Optimization Problem/Programming Problem

Find a feasible decision x ∈ S with the least possible cost f(x).

In short such a problem is denoted as

minimize f(x)

subject to x ∈ S

Questions in Optimization

• What is the least possible cost? Compute the optimal value

fopt := inf_{x ∈ S} f(x) ≥ −∞.

Convention: If S = ∅ then fopt = +∞.

If fopt = −∞ the problem is said to be unbounded from below.

• How can we compute almost optimal solutions? For any chosen

positive absolute error ε, determine

xε ∈ S with fopt ≤ f(xε) ≤ fopt + ε.

By definition of the infimum such an xε does exist for all ε > 0.

• Is there an optimal solution? Does there exist

xopt ∈ S with f(xopt) = fopt?

If yes, the minimum is attained and we write

fopt = min_{x ∈ S} f(x).

Set of all optimal solutions is denoted as

arg min_{x ∈ S} f(x) = {x ∈ S : f(x) = fopt}.

• Is optimal solution unique?

Nonlinear Programming (NLP)

With decision universe X = Rn, the feasible set S is often defined by

constraint functions g1 : X → R, . . . , gm : X → R as

S = {x ∈ X : g1(x) ≤ 0, . . . , gm(x) ≤ 0} .

The general nonlinear optimization problem is formulated as

minimize f(x)

subject to x ∈ X , g1(x) ≤ 0, . . . , gm(x) ≤ 0.

By exploiting particular properties of f, g1, . . . , gm (e.g. smoothness),

optimization algorithms typically allow one to iteratively compute locally

optimal solutions x0: There exists an ε > 0 such that x0 is optimal on

S ∩ {x ∈ X : ‖x− x0‖ ≤ ε}.
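
To make this concrete, here is a minimal MATLAB sketch of such a local NLP solve; it assumes the Optimization Toolbox function fmincon, and the objective, constraint, and starting point are made-up illustration data.

% Minimal sketch (assumes MATLAB's Optimization Toolbox): a smooth NLP
% minimize f(x) subject to g(x) <= 0, solved by a local method.
f = @(x) (x(1) - 2)^2 + (x(2) - 1)^2;     % smooth objective (made-up data)
g = @(x) x(1)^2 + x(2)^2 - 1;             % single constraint g(x) <= 0
nonlcon = @(x) deal(g(x), []);            % fmincon expects [c, ceq]
x0 = [0; 0];                              % starting point
xloc = fmincon(f, x0, [], [], [], [], [], [], nonlcon);   % a locally optimal solution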

Linear Programming (LP)

With the decision vector x = (x1 · · · xn)^T ∈ R^n consider the problem

minimize c1x1 + · · ·+ cnxn

subject to a11x1 + · · ·+ a1nxn ≤ b1

...

am1x1 + · · ·+ amnxn ≤ bm

The cost is linear and the set of feasible decisions is defined by finitely

many affine inequality constraints.

Simplex algorithms or interior point methods allow one to efficiently

1. decide whether the constraint set is feasible (value < ∞).

2. decide whether problem is bounded from below (value > −∞).

3. compute an (always existing) optimal solution if 1. & 2. are true.
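
As an illustration, a minimal MATLAB sketch (assuming the Optimization Toolbox function linprog; the data are made up). The returned exit flag distinguishes exactly the three cases above.

% Minimal sketch (assumes MATLAB's Optimization Toolbox): min c'*x s.t. A*x <= b.
c = [-1; -2];                   % cost vector (made-up data)
A = [ 1  1; -1  0;  0 -1];      % affine inequality constraints A*x <= b
b = [ 4;  0;  0];
[xopt, fval, exitflag] = linprog(c, A, b);
% exitflag == 1: optimal solution found, -2: infeasible, -3: unbounded from below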


Major early contributions in the 1940s by

• George Dantzig (simplex method)

• John von Neumann (duality)

• Leonid Kantorovich (economics applications)

Leonid Khachiyan proved polynomial-time solvability in 1979.

Narendra Karmarkar introduced an interior-point method in 1984.

LP has numerous applications, for example in economic planning problems
(business management, flight-scheduling, resource allocation, finance).

LPs have spurred the development of optimization theory and appear
as subproblems in many optimization algorithms.

Recap

For a real or complex matrix A the inequality A ⪯ 0 means that A
is Hermitian and negative semi-definite.

• A is defined to be Hermitian if A = A∗ = Ā^T. If A is real then this
amounts to A = A^T and A is then called symmetric.

All eigenvalues of Hermitian matrices are real.

• By definition A is negative semi-definite if A = A∗ and

x∗Ax ≤ 0 for all complex vectors x ≠ 0.

A is negative semi-definite iff all its eigenvalues are non-positive.

• A ⪯ B means by definition: A, B are Hermitian and A − B ⪯ 0.

• A ≺ B, A ⪰ B, A ≻ B are defined/characterized analogously.
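
A quick numerical spot check of these notions (a sketch with made-up data; for a Hermitian matrix, eig returns real eigenvalues):

% Minimal sketch: test A <= 0 in the semi-definite sense, i.e. A Hermitian
% with all eigenvalues non-positive (up to a small numerical tolerance).
A = [-2 1; 1 -3];                              % made-up symmetric data
isHermitian  = norm(A - A', 'fro') < 1e-12;
isNegSemidef = isHermitian && all(eig(A) <= 1e-12);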

Recap

Let A be partitioned with square diagonal blocks as

A = ( A11 · · · A1m ; … ; Am1 · · · Amm ).

Then

A ≺ 0 implies A11 ≺ 0, . . . , Amm ≺ 0.

Prototypical Proof. Choose any vector zi ≠ 0 of length compatible
with the size of Aii. Define z = (0, . . . , 0, zi^T, 0, . . . , 0)^T ≠ 0 with the
zero blocks compatible in size with A11, . . . , Amm. This construction
implies z^T A z = zi^T Aii zi. Since A ≺ 0, we infer z^T A z < 0. Therefore,
zi^T Aii zi < 0. Since zi ≠ 0 was arbitrary we infer Aii ≺ 0.

Semi-Definite Programming (SDP)

Let us now assume that the constraint functions G1, . . . , Gm map X
into the set of symmetric matrices, and define the feasible set S as

S = {x ∈ X : G1(x) ⪯ 0, . . . , Gm(x) ⪯ 0} .

The general semi-definite program (SDP) is formulated as

minimize f(x)

subject to x ∈ X , G1(x) ⪯ 0, . . . , Gm(x) ⪯ 0.

• This includes NLPs as a special case.

• Is called convex if f and G1, . . . , Gm are convex.

• Is called linear matrix inequality (LMI) optimization problem

or linear SDP if f and G1, . . . , Gm are affine.

Recap: Affine Sets and Functions

The set S in the vector space X is affine if

λs1 + (1− λ)s2 ∈ S for all s1, s2 ∈ S, λ ∈ R.

The matrix-valued function F defined on S is affine if the domain

of definition S is affine and if

F (λs1 + (1− λ)s2) = λF (s1) + (1− λ)F (s2)

for all s1, s2 ∈ S, λ ∈ R.

For points s1, s2 ∈ S recall that

• {λs1 + (1− λ)s2 : λ ∈ R} is the line through s1, s2

• {λs1 + (1− λ)s2 : λ ∈ [0, 1]} is the line segment between s1, s2

Recap: Convex Sets and Functions

The set S in the vector space X is convex if

λs1 + (1− λ)s2 ∈ S for all s1, s2 ∈ S, λ ∈ (0, 1).

The symmetric-valued function F defined on S is convex if the

domain of definition S is convex and if

F(λs1 + (1− λ)s2) ⪯ λF(s1) + (1− λ)F(s2)

for all s1, s2 ∈ S, λ ∈ (0, 1).

F is strictly convex if ⪯ can be replaced by ≺.

If F is real-valued, the inequalities ⪯ and ≺ are the same as the usual
inequalities ≤ and < between real numbers. Therefore our definition
captures the usual one for real-valued functions!

Relevance of Convexity?

• Solvers for general nonlinear programs typically determine locally optimal
solutions. There are no guarantees for global optimality.

• Main feature of convex programs:

Locally optimal solutions are globally optimal.

Convexity alone guarantees neither that the optimal value is finite,
nor that an optimal solution or an efficient solution algorithm exists.

• In general convex programs can be solved with guarantees on accuracy

if one can compute (sub)gradients of objective/constraint functions.

• Strictly feasible linear semi-definite programs are convex and can be

solved very efficiently, with guarantees on accuracy at termination.


Linear Matrix Inequalities (LMIs)

With the decision vector x = (x1 · · · xn)^T ∈ R^n a system of LMIs is

A^1_0 + x1 A^1_1 + · · · + xn A^1_n ⪯ 0
⋮
A^m_0 + x1 A^m_1 + · · · + xn A^m_n ⪯ 0

where A^i_0, A^i_1, . . . , A^i_n, i = 1, . . . , m, are real symmetric data matrices.

LMI feasibility problem: Test whether there exist x1, . . . , xn that

render the LMIs satisfied.

LMI optimization problem: Minimize c1x1 + · · · + cnxn over all

x1, . . . , xn that satisfy the LMIs.

Only simple cases can be treated analytically → Numerical techniques.
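
As a numerical illustration, a minimal sketch in current Yalmip syntax (the truss-topology code later in these slides uses the older set/solvesdp calls); the 2×2 data matrices and the strictness margin are made up:

% Minimal sketch (assumes Yalmip + an SDP solver such as SeDuMi):
% feasibility of A0 + x1*A1 + x2*A2 <= 0 in the semi-definite sense.
A0 = [1 0; 0 1];  A1 = [-1 0; 0 0];  A2 = [0 0; 0 -1];   % made-up symmetric data
x  = sdpvar(2, 1);                                       % decision variables x1, x2
F  = [A0 + x(1)*A1 + x(2)*A2 <= -1e-6*eye(2)];           % small margin enforces strictness
diagnostics = optimize(F);                               % pure feasibility test
xfeas = value(x);                                        % a feasible point if one exists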

LMI Optimization Problems

minimize c1x1 + · · · + cnxn

subject to A^i_0 + x1 A^i_1 + · · · + xn A^i_n ⪯ 0, i = 1, . . . , m

• Natural generalization of LPs with inequalities defined by the cone of
positive semi-definite matrices. Considerably richer class than LPs.

• The i-th constraint can be equivalently expressed as

λmax(A^i_0 + x1 A^i_1 + · · · + xn A^i_n) ≤ 0.

This scalar constraint is defined with a non-differentiable function.

• Interior-point or bundle methods allow one to effectively decide about
feasibility/boundedness and to determine almost optimal solutions.


Developments

• Bellman/Fan initialized derivation of optimality conditions (1963)

• Jan Willems coined the terminology LMI and revealed the relation to dissipative dynamical systems (1971/72)

• Nesterov/Nemirovski exhibited essential feature (self-concordance)

for existence of polynomial-time solution algorithm (1988)

• Interior-point methods: Alizadeh (1992), Kamath/Karmarkar (1992)

Suggested books: Boyd/El Ghaoui (1994), El Ghaoui/Niculescu (2000),

Ben-Tal/Nemirovski (2001), Boyd/Vandenberghe (2004)

What are LMIs good for?

• Many engineering optimization problems can be (often but not always
easily) translated into LMI problems.

• Various computationally difficult optimization problems can be effectively
approximated by LMI problems.

• In practice the description of the data is affected by uncertainty.

Robust optimization problems can be handled/approximated by

standard LMI problems.

In this course

How can we translate and solve (robust) control problems with LMIs?


Truss Topology Design

[Figure: truss topology design example]

Example: Truss Topology Design

• Connect nodes with N bars of length l = col(l1, . . . , lN) (fixed) and

cross-sections x = col(x1, . . . , xN) (to-be-designed).

• Impose bounds ak ≤ xk ≤ bk on cross-section and lT x ≤ v on total

volume (weight). Abbreviate a = col(a1, . . . , aN), b = col(b1, . . . , bN).

• If applying external forces f = col(f1, . . . , fM) (fixed) on nodes the

construction reacts with the node displacement d = col(d1, . . . , dM).

Mechanical model: A(x)d = f where A(x) is the stiffness matrix

which depends linearly on x and has to be positive semi-definite.

• Goal is to maximize stiffness, for example by minimizing the elastic

stored energy fT d.


Find x ∈ R^N which minimizes f^T d subject to the constraints

A(x) ⪰ 0, A(x)d = f, l^T x ≤ v, a ≤ x ≤ b.

Features

• Data: Scalar v, vectors f , a, b, l, and symmetric matrices A1, . . . , AN

which define the linear mapping A(x) = A1x1 + · · ·+ ANxN .

• Decision variables: Vectors x and d.

• Objective function: d → fT d which happens to be linear.

• Constraints: Semi-definite constraint A(x) ⪰ 0, nonlinear equality
constraint A(x)d = f, and linear inequality constraints l^T x ≤ v,
a ≤ x ≤ b. The latter are interpreted elementwise!

From Truss Topology Design to LMIs

Render the LMI inequality strict. The equality constraint A(x)d = f allows us to
eliminate d, which results in

minimize f^T A(x)^{-1} f

subject to A(x) ≻ 0, l^T x ≤ v, a ≤ x ≤ b.

Push the objective to the constraints with an auxiliary variable:

minimize γ

subject to γ > f^T A(x)^{-1} f, A(x) ≻ 0, l^T x ≤ v, a ≤ x ≤ b.

Trouble: Nonlinear inequality constraint γ > f^T A(x)^{-1} f.

Recap: Congruence Transformations

Given a Hermitian matrix A and a square non-singular matrix T ,

A → T ∗AT

is called a congruence transformation of A.

If T is square and non-singular then

A ≺ 0 if and only if T ∗AT ≺ 0.

The following more general statement is also easy to remember.

If A is Hermitian and T is nonsingular, the matrices A and T ∗AT

have the same number of negative, zero, positive eigenvalues.

What is true if T is not square? ... if T has full column rank?

Recap: Schur-Complement Lemma

The Hermitian block matrix ( Q  S ; S^T  R ) is negative definite

if and only if

Q ≺ 0 and R − S^T Q^{-1} S ≺ 0

if and only if

R ≺ 0 and Q − S R^{-1} S^T ≺ 0.

Proof. The first equivalence follows from

( I  0 ; −S^T Q^{-1}  I ) ( Q  S ; S^T  R ) ( I  −Q^{-1}S ; 0  I ) = ( Q  0 ; 0  R − S^T Q^{-1} S ).

The proof reveals a more general relation between the number of

negative, zero, positive eigenvalues of the three matrices.

From Truss Topology Design to LMIs

Linearize with the Schur lemma to obtain the equivalent LMI problem

minimize γ

subject to ( γ  f^T ; f  A(x) ) ≻ 0, l^T x ≤ v, a ≤ x ≤ b.

Yalmip Coding: Truss Topology Design


Suppose A(x) = Σ_{k=1}^{N} xk mk mk^T with the vectors mk collected in the matrix M.

The following code with Yalmip commands solves the LMI problem:

gamma=sdpvar(1,1); x=sdpvar(N,1,'full');       % decision variables
lmi=set([gamma f'; f M*diag(x)*M'] > 0);       % Schur-complement LMI
lmi=lmi+set(l'*x <= v);                        % total-volume bound
lmi=lmi+set(a <= x <= b);                      % elementwise cross-section bounds
options=sdpsettings('solver','sedumi');
solvesdp(lmi,gamma,options); s=double(x);      % newer Yalmip: [ ]/optimize/value replace set/solvesdp/double

Result: Truss Topology Design

[Figure: resulting truss design after optimization of the cross-sections]

Quickly Accessible Software

General purpose Matlab interface Yalmip:

• Free code developed by J. Lofberg and accessible at

http://control.ee.ethz.ch/~joloef/yalmip.msql

• Can use usual Matlab-syntax to define optimization problem.

Is extremely easy to use and very versatile. Highly recommended!

• Provides access to a whole suite of public and commercial optimization
solvers, including the fastest available dedicated LMI solvers.

Matlab LMI-Toolbox for dedicated control applications. Has recently

been integrated into new Robust Control Toolbox.


General Formulation of LMI Problems

Let X be a finite-dimensional real vector space. Suppose the mappings

f : X → R, G : X → {symmetric matrices of fixed size} are affine.

LMI feasibility problem: Test existence of X ∈ X with G(X) ⪯ 0.

LMI optimization problem: Minimize f(X) over all X ∈ X that
satisfy the LMI G(X) ⪯ 0.

Practical Hints. For all algorithms to work properly, make sure that

the constraint is strictly feasible: There exists some X ∈ X such that

G(X) ≺ 0.

If G is genuinely linear, fopt is either +∞ or −∞. Impose additional
bounds/constraints on the decision variables to render the problem sensible.

Example: Spectral Norm Approximation

For real data matrices A, B, C and some unknown X consider

minimize ‖AXB − C‖

subject to X ∈ X

where X is a matrix subspace reflecting structural constraints.

Key equivalence with Schur:

‖M‖ < γ ⇐⇒ M^T M ≺ γ²I ⇐⇒ ( γI  M ; M^T  γI ) ≻ 0.

Norm minimization is hence equivalent to the following LMI problem:

minimize γ

subject to X ∈ X , ( γI  AXB − C ; (AXB − C)^T  γI ) ≻ 0.
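
A minimal Yalmip sketch of this norm minimization (made-up data; the structural subspace X is taken, purely for illustration, to be the diagonal 2×2 matrices):

% Minimal sketch (assumes Yalmip + an SDP solver): minimize ||A*X*B - C|| over diagonal X.
A = randn(3,2);  B = randn(2,3);  C = randn(3,3);     % made-up data matrices
X = diag(sdpvar(2,1));                                % structural constraint: X diagonal
gamma = sdpvar(1,1);
E = A*X*B - C;
F = [[gamma*eye(3), E; E', gamma*eye(3)] >= 0];       % the Schur-complement LMI
optimize(F, gamma);                                   % minimize the norm bound gamma
Xopt = value(X);  normbound = value(gamma);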

Stability of Dynamical Systems

For dynamical systems one can distinguish many notions of stability.

We will mainly rely on definitions related to the state-space descriptions

ẋ(t) = Ax(t), ẋ(t) = A(t)x(t), ẋ(t) = f(t, x(t)), x(t0) = x0

which capture the behavior of x(t) for t → ∞ depending on x0.

Exponential stability means that there exist real constants a > 0
(decay rate) and K (peaking constant) such that

‖x(t)‖ ≤ ‖x(t0)‖ K e^{−a(t−t0)} for all trajectories and t ≥ t0.

K and a are assumed not to depend on t0 or x(t0) (uniformity).

Lyapunov theory provides the background for testing stability.

Stability of LTI Systems

The linear time-invariant dynamical system

ẋ(t) = Ax(t)

is exponentially stable if and only if there exists K with

K ≻ 0 and A^T K + KA ≺ 0.

Two inequalities can be combined as ( −K  0 ; 0  A^T K + KA ) ≺ 0.

Since the left-hand side depends affinely on the matrix variable K, this

is indeed a standard strict feasibility test!

Matrix variables are fully supported by Yalmip and LMI-toolbox!
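
For instance, a minimal Yalmip sketch of this feasibility test (made-up A; the small margins replace the strict inequalities):

% Minimal sketch (assumes Yalmip + an SDP solver): Lyapunov LMI test for xdot = A*x.
A = [-1 2; 0 -3];                                      % made-up (Hurwitz) matrix
K = sdpvar(2, 2);                                      % symmetric matrix variable
F = [K >= 1e-6*eye(2), A'*K + K*A <= -1e-6*eye(2)];    % K > 0 and A'K + KA < 0
diagnostics = optimize(F);                             % feasible <=> exponential stability
Kval = value(K);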

Trajectory-Based Proof of Sufficiency

Choose ε > 0 such that A^T K + KA + εK ≺ 0. Let x(·) be any state
trajectory of the system. Then

x(t)^T (A^T K + KA) x(t) + ε x(t)^T K x(t) ≤ 0 for all t ∈ R

and hence (using ẋ(t) = Ax(t))

(d/dt) x(t)^T K x(t) + ε x(t)^T K x(t) ≤ 0 for all t ∈ R

and hence (integrating factor e^{εt})

x(t)^T K x(t) ≤ x(t0)^T K x(t0) e^{−ε(t−t0)} for all t ≥ t0.

Since λmin(K)‖x‖² ≤ x^T K x ≤ λmax(K)‖x‖² we can conclude that

‖x(t)‖ ≤ ‖x(t0)‖ √(λmax(K)/λmin(K)) e^{−ε(t−t0)/2} for t ≥ t0.

Algebraic Proof

Sufficiency. Let λ ∈ λ(A). Choose a complex eigenvector x ≠ 0 with
Ax = λx. Then the LMIs imply x∗Kx > 0 and

0 > x∗(A^T K + KA)x = λ̄ x∗Kx + x∗Kx λ = 2 Re(λ) x∗Kx.

This guarantees Re(λ) < 0. Therefore all eigenvalues of A are in C−.

Necessity if A is diagonalizable. Suppose all eigenvalues of A are in C−.
Since A is diagonalizable there exists a complex nonsingular T with
T A T^{-1} = Λ = diag(λ1, . . . , λn). Since Re(λk) < 0 for k = 1, . . . , n we infer

Λ∗ + Λ ≺ 0 and hence (T∗)^{-1} A^T T∗ + T A T^{-1} ≺ 0.

If we left-multiply with T∗ and right-multiply with T (congruence) we infer

A^T (T∗T) + (T∗T) A ≺ 0.

Hence K = T∗T ≻ 0 satisfies the LMIs.


Necessity if A is not diagonalizable. If A is not diagonalizable it can be
transformed by similarity into its Jordan form: There exists a nonsingular
T with T A T^{-1} = Λ + J where Λ is diagonal and J has either ones or zeros
on the first upper diagonal.

For any ε > 0 one can even choose Tε with Tε A Tε^{-1} = Λ + εJ. Since Λ has
the eigenvalues of A on its diagonal we still infer Λ∗ + Λ ≺ 0. Therefore it
is possible to fix a sufficiently small ε > 0 with

0 ≻ Λ∗ + Λ + ε(J^T + J) = (Λ + εJ)∗ + (Λ + εJ).

As before we can conclude that K = Tε∗ Tε satisfies the LMIs.


A First Glimpse at Robustness

With some compact set A ⊂ R^{n×n} consider the family of LTI systems

ẋ(t) = Ax(t) with A ∈ A.

A is said to be quadratically stable if there exists K such that

K ≻ 0 and A^T K + KA ≺ 0 for all A ∈ A.

Why the name? V(x) = x^T K x is a quadratic Lyapunov function.

Why relevant? Implies that all A ∈ A are Hurwitz.

Even stronger: Implies, for any piece-wise continuous A : R → A,
exponential stability of the time-varying system

ẋ(t) = A(t)x(t).

Computational Verification

If A has infinitely many elements, testing quadratic stability amounts

to verifying the feasibility of an infinite number of LMIs.

Key question: How to reduce to a standard LMI problem?

Let A be the convex hull of {A1, . . . , AN}: For each A ∈ A there

exist coefficients λ1 ≥ 0, . . . , λN ≥ 0 with λ1 + · · ·+ λN = 1 such that

A = λ1A1 + · · ·+ λNAN .

If A is the convex hull of {A1, . . . , AN} then A is quadratically

stable iff there exists some K with

K ≻ 0 and Ai^T K + K Ai ≺ 0 for all i = 1, . . . , N.

Proofs. Exercises.
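
A minimal Yalmip sketch of this vertex test (two made-up vertex matrices; one LMI per vertex, with small margins for strictness):

% Minimal sketch (assumes Yalmip + an SDP solver): quadratic stability of conv{A1, A2}.
A1 = [-1 1; -1 -1];   A2 = [-1 2; -2 -1];              % made-up vertices
K  = sdpvar(2, 2);                                     % common Lyapunov matrix variable
F  = [K >= 1e-6*eye(2)];
for Ai = {A1, A2}
    F = [F, Ai{1}'*K + K*Ai{1} <= -1e-6*eye(2)];       % one LMI per vertex
end
optimize(F);                                           % feasible => V(x) = x'*K*x works for the whole family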


LMI Regions

For a real symmetric 2m×2m matrix P the set of complex numbers

L_P := { z ∈ C : ( I ; zI )∗ P ( I ; zI ) ≺ 0 }

is called an LMI region.

Half-plane: Re(z) < α ⇔ z̄ + z − 2α ≺ 0 ⇔ z ∈ L_{P1}.

Circle: |z| < r ⇔ z̄ z − r² ≺ 0 ⇔ z ∈ L_{P2}.

Sector: Re(z) tan(φ) < −|Im(z)| ⇔ z ∈ L_{P3}

where

P1 = ( −2α  1 ; 1  0 ),   P2 = ( −r²  0 ; 0  1 ),

P3 = ( 0  0  sin(φ)  cos(φ) ; 0  0  −cos(φ)  sin(φ) ; ∗  ∗  0  0 ; ∗  ∗  0  0 ).

Properties of LMI Regions

The intersection of finitely many LMI regions is an LMI region.

We can hence describe rather general subsets of C.

Proof. Just follows from the usual stacking property of LMIs:

( I ; zI )∗ ( Q1  S1 ; S1^T  R1 ) ( I ; zI ) ≺ 0,   ( I ; zI )∗ ( Q2  S2 ; S2^T  R2 ) ( I ; zI ) ≺ 0

if and only if

( I 0 ; 0 I ; zI 0 ; 0 zI )∗ ( Q1 0 S1 0 ; 0 Q2 0 S2 ; S1^T 0 R1 0 ; 0 S2^T 0 R2 ) ( I 0 ; 0 I ; zI 0 ; 0 zI ) ≺ 0.

Recap

Recall the definition of the Kronecker product

A ⊗ B = ( A11 B · · · A1n B ; … ; Am1 B · · · Amn B )  for A ∈ C^{m×n}, B ∈ C^{k×l}.

Easy to prove following properties (for compatible dimensions):

• 1⊗ A = A = A⊗ 1.

• (A + B)⊗ C = (A⊗ C) + (B ⊗ C)

• (A⊗B)(C ⊗D) = (AC)⊗ (BD).

• (A⊗B)∗ = A∗ ⊗ B∗ and (A⊗B)^{-1} = A^{-1} ⊗ B^{-1}.
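
These rules are easy to spot-check numerically, e.g. the mixed-product rule (made-up random data):

% Minimal sketch: verify (A⊗B)(C⊗D) = (AC)⊗(BD) on random matrices.
A = randn(2,3);  B = randn(2,2);  C = randn(3,2);  D = randn(2,4);
err = norm(kron(A,B)*kron(C,D) - kron(A*C, B*D), 'fro');   % ~ machine precision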

Generalization of the Standard Stability Characterization

All eigenvalues of A ∈ R^{n×n} are contained in the LMI region

{ z ∈ C : ( I ; zI )∗ ( Q  S ; S^T  R ) ( I ; zI ) ≺ 0 }

if and only if there exists a K ≻ 0 such that

( I ; A⊗I )^T ( K⊗Q  K⊗S ; K⊗S^T  K⊗R ) ( I ; A⊗I ) ≺ 0.

Beautiful generalization of standard Lyapunov stability characterization!

Gahinet, Chilali (1996)

Proof. Along lines of algebraic proof of standard result and by exploiting

the properties of the Kronecker product.
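
A minimal Yalmip sketch for the special case of the disk |z| < r (so m = 1, Q = −r², S = 0, R = 1, and the general inequality collapses to A^T K A − r²K ≺ 0); the matrix A and the radius are made up:

% Minimal sketch (assumes Yalmip + an SDP solver): are all eigenvalues of A in |z| < r ?
A = [0.3 0.6; -0.2 0.1];   r = 0.9;   n = size(A, 1);   % made-up data
K = sdpvar(n, n);
F = [K >= 1e-6*eye(n), A'*K*A - r^2*K <= -1e-6*eye(n)]; % the LMI-region test for the disk
diagnostics = optimize(F);                              % feasible <=> spec(A) inside the disk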

Discrete-Time versus Continuous-Time


• x_{k+1} = A x_k is exponentially stable iff A is a Schur matrix.

• ẋ = Ax is exponentially stable iff A is Hurwitz.

The unit disk / the open left half-plane are LMI regions:

( 1 ; z )∗ ( −1  0 ; 0  1 ) ( 1 ; z ) ≺ 0,   ( 1 ; z )∗ ( 0  1 ; 1  0 ) ( 1 ; z ) ≺ 0.

A is a Schur matrix / A is Hurwitz iff there exists K ≻ 0 with

( I ; A )^T ( −K  0 ; 0  K ) ( I ; A ) ≺ 0,   ( I ; A )^T ( 0  K ; K  0 ) ( I ; A ) ≺ 0.

Lessons to be Learnt


• Many interesting engineering problems are LMI problems.

• Variables can live in arbitrary vector space.

In control: Variables are typically matrices.

Can involve equality and inequality constraints. Just check whether

cost function and constraints are affine & verify strict feasibility.

• Translation to input for solution algorithm by parser (e.g. Yalmip).

Can choose among many efficient LMI solvers (e.g. Sedumi).

• Main trick in removing nonlinearities so far: Schur Lemma.