Nonlinear Semidefinite Optimization: why and how
Michal Kocvara
School of Mathematics, The University of Birmingham
Isaac Newton Institute, 2013
Michal Kocvara (University of Birmingham) Isaac Newton Institute, 2013 1 / 45
Semidefinite programming (SDP)
“generalized” mathematical optimization problem
min f(x)
subject to
  g(x) ≥ 0
  A(x) ⪰ 0

A(x) ... (non)linear matrix operator R^n → S^m
(in the linear case, A(x) = A_0 + Σ_{i=1}^n x_i A_i)
SDP notations
S^n ... symmetric matrices of order n × n
A ⪰ 0 ... A positive semidefinite
⟨A, B⟩ := Tr(AB) ... inner product on S^n
A [R^n → S^m] ... linear matrix operator defined by
  A(y) := Σ_{i=1}^n y_i A_i with A_i ∈ S^m
A* [S^m → R^n] ... adjoint operator defined by
  A*(X) := [⟨A_1, X⟩, ..., ⟨A_n, X⟩]^T
and satisfying
  ⟨A*(X), y⟩ = ⟨A(y), X⟩ for all y ∈ R^n
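The adjoint identity ⟨A*(X), y⟩ = ⟨A(y), X⟩ is easy to check numerically. A minimal sketch in Python (plain nested lists; helper names are mine, not PENNON's):

```python
import random

def trace_dot(A, B):
    """<A, B> = Tr(AB)."""
    n = len(A)
    return sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))

def rand_sym(m):
    M = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(m)]
    return [[(M[i][j] + M[j][i]) / 2 for j in range(m)] for i in range(m)]

def op_A(y, mats):
    """A(y) = sum_i y_i A_i."""
    m = len(mats[0])
    return [[sum(yi * Ai[r][c] for yi, Ai in zip(y, mats))
             for c in range(m)] for r in range(m)]

def op_A_adj(X, mats):
    """A*(X) = (<A_1, X>, ..., <A_n, X>)."""
    return [trace_dot(Ai, X) for Ai in mats]

random.seed(0)
n, m = 4, 3
mats = [rand_sym(m) for _ in range(n)]
X = rand_sym(m)
y = [random.uniform(-1, 1) for _ in range(n)]

lhs = sum(a * b for a, b in zip(op_A_adj(X, mats), y))  # <A*(X), y>
rhs = trace_dot(op_A(y, mats), X)                       # <A(y), X>
assert abs(lhs - rhs) < 1e-12
```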
Linear SDP: primal-dual pair
inf_X ⟨C, X⟩ := Tr(CX)   (P)
s.t. A*(X) = b   [⟨A_i, X⟩ = b_i, i = 1, ..., n]
     X ⪰ 0

sup_{y,S} ⟨b, y⟩ := Σ b_i y_i   (D)
s.t. A(y) + S = C   [Σ y_i A_i + S = C]
     S ⪰ 0

Weak duality: feasible X, y, S satisfy
  ⟨C, X⟩ − ⟨b, y⟩ = ⟨A(y) + S, X⟩ − Σ y_i ⟨A_i, X⟩ = ⟨S, X⟩ ≥ 0
so the duality gap is nonnegative at feasible points.
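The weak-duality chain can be verified numerically; a sketch in Python where feasible primal and dual points are constructed by hand (X as a Gram matrix, S = I):

```python
import random
random.seed(1)

def rand_sym(m):
    M = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(m)]
    return [[(M[i][j] + M[j][i]) / 2 for j in range(m)] for i in range(m)]

def trace_dot(A, B):
    n = len(A)
    return sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))

m, n = 3, 2
A_mats = [rand_sym(m) for _ in range(n)]

# Primal point: X = G G^T >= 0; define b so X is feasible by construction.
G = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(m)]
X = [[sum(G[i][k] * G[j][k] for k in range(m)) for j in range(m)] for i in range(m)]
b = [trace_dot(Ai, X) for Ai in A_mats]

# Dual point: pick y, set S = I >= 0 and C = A(y) + S, feasible by construction.
y = [0.3, -0.7]
S = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
C = [[sum(y[k] * A_mats[k][i][j] for k in range(n)) + S[i][j]
      for j in range(m)] for i in range(m)]

gap = trace_dot(C, X) - sum(bi * yi for bi, yi in zip(b, y))
assert abs(gap - trace_dot(S, X)) < 1e-10  # the gap equals <S, X>
assert gap >= 0                            # weak duality
```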
Linear Semidefinite Programming
Vast area of applications. . .
LP and CQP are special cases of SDP
eigenvalue optimization
robust programming
control theory
relaxations of integer optimization problems
approximations to combinatorial optimization problems
structural optimization
chemical engineering
machine learning
many many others. . .
Why nonlinear SDP?
Problems from
Structural optimization
Control theory
Examples below
There are more, but the researchers just don't know about them yet...
Free Material Optimization
Aim:
Given an amount of material, boundary conditions and an external load f, find the material (distribution) so that the body is as stiff as possible under f.
The design variables are the material properties at each point of the structure.
M. P. Bendsøe, J. M. Guedes, R. B. Haber, P. Pedersen and J. E. Taylor: An analytical model to predict optimal material properties in the context of optimal structural design. J. Applied Mechanics, 61 (1994) 930–937
Free Material Optimization
[figure: FMO example geometry]
FMO primal formulation
FMO problem with vibration/buckling constraint
min_{u∈R^n, E_1,...,E_m} Σ_{i=1}^m Tr E_i
subject to
  E_i ⪰ 0,  ρ̲ ≤ Tr E_i ≤ ρ̄,  i = 1, ..., m
  f^T u ≤ C
  A(E) u = f
  A(E) + G(E, u) ⪰ 0
. . . nonlinear, non-convex semidefinite problem
Conic Geometric Programming
Venkat Chandrasekaran’s slides
Nonlinear SDP: algorithms, software?
Interior point methods (log-det barrier; Jarre '98, Leibfritz '02, ...)
Sequential SDP methods (Correa-Ramirez '04, Jarre-Diehl)
None of these work (well) for practical problems
(Generalized) Augmented Lagrangian methods (Apkarian-Noll '02-'05; D. Sun, J. Sun '08; MK-Stingl '02-)
PENNON collection
PENNON (PENalty methods for NONlinear optimization): a collection of codes for NLP, (linear) SDP and BMI
– one algorithm to rule them all –

READY
  PENNLP: AMPL, MATLAB, C/Fortran
  PENSDP: MATLAB/YALMIP, SDPA, C/Fortran
  PENBMI: MATLAB/YALMIP, C/Fortran
  PENNON (NLP + SDP): extended AMPL, MATLAB, C/Fortran

NEW
  PENLAB (NLP + SDP): open source MATLAB implementation

IN PROGRESS
  PENSOC (nonlinear SOCP + NLP + SDP)
PENSDP with an iterative solver
min_{X∈S^m} ⟨C, X⟩   (P)
s.t. ⟨A_i, X⟩ = b_i, i = 1, ..., n
     X ⪰ 0

Large n, small m (Kim Toh):

  problem    n       m    PENSDP CPU   PEN-PCG CPU   CG/it
  theta12    17979   600  27565        254           8
  theta162   127600  800  mem          13197         13

Large m, small/medium n (robust QCQP, Martin Andersen):

  problem   n    m     PENSDP CPU   PEN-PCG CPU   CG/it
  rqcqp1    101  806   324          20            4
  rqcqp2    101  3206  6313         364           4
PENNON: the problem
Optimization problems with a nonlinear objective subject to nonlinear inequality and equality constraints and semidefinite bound constraints:

min_{x∈R^n, Y_1∈S^{p_1}, ..., Y_k∈S^{p_k}} f(x, Y)
subject to g_i(x, Y) ≤ 0, i = 1, ..., m_g
           h_i(x, Y) = 0, i = 1, ..., m_h
           λ̲_i I ⪯ Y_i ⪯ λ̄_i I, i = 1, ..., k
(NLP-SDP)
The problem
Any nonlinear SDP problem can be formulated as NLP-SDP, using slack variables and (NLP) equality constraints: a general matrix constraint

  G(X) ⪰ 0

is written as

  G(X) = S   (element-wise equality constraints)
  S ⪰ 0      (a semidefinite bound constraint)
The algorithm
Based on penalty/barrier functions φ_g : R → R and Φ_P : S^p → S^p:

  g_i(x) ≤ 0 ⟺ p_i φ_g(g_i(x)/p_i) ≤ 0, i = 1, ..., m_g
  Z ⪯ 0 ⟺ Φ_P(Z) ⪯ 0, Z ∈ S^p.

Augmented Lagrangian of (NLP-SDP):

  F(x, Y, u, U̲, U̅, p) = f(x, Y) + Σ_{i=1}^{m_g} u_i p_i φ_g(g_i(x, Y)/p_i)
      + Σ_{i=1}^k ⟨U̲_i, Φ_P(λ̲_i I − Y_i)⟩ + Σ_{i=1}^k ⟨U̅_i, Φ_P(Y_i − λ̄_i I)⟩ ;

here u ∈ R^{m_g} and U̲_i, U̅_i are Lagrange multipliers.
The algorithm
A generalized Augmented Lagrangian algorithm (based on R. Polyak '92, Ben-Tal–Zibulevsky '94, Stingl '05):

Given x^1, Y^1, u^1, U̲^1, U̅^1; p^1_i > 0, i = 1, ..., m_g, and P^1 > 0.
For k = 1, 2, ... repeat until a stopping criterion is reached:

(i)   Find x^{k+1} and Y^{k+1} s.t. ‖∇_x F(x^{k+1}, Y^{k+1}, u^k, U̲^k, U̅^k, p^k)‖ ≤ K
(ii)  u_i^{k+1} = u_i^k φ'_g(g_i(x^{k+1})/p_i^k), i = 1, ..., m_g
      U̲_i^{k+1} = D_A Φ_P(λ̲_i I − Y_i; U̲_i^k), i = 1, ..., k
      U̅_i^{k+1} = D_A Φ_P(Y_i − λ̄_i I; U̅_i^k), i = 1, ..., k
(iii) p_i^{k+1} < p_i^k, i = 1, ..., m_g
      P^{k+1} < P^k.
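For scalar constraints, steps (i)-(iii) are easy to prototype. A minimal Python sketch using the exponential penalty φ_g(t) = e^t − 1 (one admissible choice, not necessarily the φ_g used in PENNON) on the toy problem min x² s.t. 1 − x ≤ 0, whose solution is x* = 1 with multiplier u* = 2:

```python
import math

def solve_inner(u, p, x, tol=1e-10):
    """Step (i): minimize F(x) = x^2 + u*p*(exp((1-x)/p) - 1) by Newton's method."""
    for _ in range(100):
        e = math.exp((1.0 - x) / p)
        grad = 2.0 * x - u * e      # F'(x)
        hess = 2.0 + (u / p) * e    # F''(x) > 0, so Newton is safe here
        x -= grad / hess
        if abs(grad) < tol:
            break
    return x

x, u, p = 0.0, 1.0, 1.0
for k in range(30):
    x = solve_inner(u, p, x)            # (i)   approximate minimizer of F
    u = u * math.exp((1.0 - x) / p)     # (ii)  u <- u * phi'_g(g(x)/p)
    p = max(0.5 * p, 1e-6)              # (iii) drive the penalty parameter down

assert abs(x - 1.0) < 1e-4
assert abs(u - 2.0) < 1e-3
```

The multiplier update u ← u·φ'_g(g(x)/p) is exactly step (ii) of the algorithm; for the exponential penalty it reduces to the classic exponential multiplier method.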
Interfaces
How to enter the data – the functions and their derivatives?
Matlab interface
AMPL interface
C/C++ interface
Matlab interface
User provides six MATLAB functions:
f . . . evaluates the objective function
df . . . evaluates the gradient of objective function
hf . . . evaluates the Hessian of objective function
g . . . evaluates the constraints
dg . . . evaluates the gradient of constraints
hg . . . evaluates the Hessian of constraints
Matlab interface
Matrix variables are treated as vectors, using the function
svec : S^m → R^{m(m+1)/2} defined by

       ⎛ a11  a12  ...  a1m ⎞
  svec ⎜      a22  ...  a2m ⎟  = (a11, a12, a22, ..., a1m, a2m, ..., amm)^T
       ⎜ sym       ...  ... ⎟
       ⎝                amm ⎠
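A sketch of the same packing in Python, taking the upper triangle column by column as above (PENLAB's internal convention may additionally scale off-diagonal entries; this is the unscaled packing):

```python
def svec(A):
    """Pack the upper triangle of a symmetric matrix column by column:
    (a11, a12, a22, a13, a23, a33, ...)."""
    m = len(A)
    return [A[i][j] for j in range(m) for i in range(j + 1)]

def smat(v):
    """Inverse of svec: rebuild the full symmetric matrix."""
    # m(m+1)/2 = len(v)  =>  m = (sqrt(8*len(v)+1) - 1) / 2
    m = int((((8 * len(v) + 1) ** 0.5) - 1) / 2)
    A = [[0.0] * m for _ in range(m)]
    k = 0
    for j in range(m):
        for i in range(j + 1):
            A[i][j] = A[j][i] = v[k]
            k += 1
    return A

A = [[1.0, 2.0, 4.0],
     [2.0, 3.0, 5.0],
     [4.0, 5.0, 6.0]]
v = svec(A)
assert v == [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # (a11, a12, a22, a13, a23, a33)
assert smat(v) == A
assert len(v) == 3 * 4 // 2                 # m(m+1)/2 entries
```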
Matlab interface

Keep a specific order of variables, to recognize which are matrices and which are vectors. Add lower/upper bounds on matrix eigenvalues.
Sparse matrices are available; sparsity is maintained in the user-defined functions.
AMPL interface
AMPL does not support SDP variables and constraints. Use the same trick: matrix variables are treated as vectors, using the function svec : S^m → R^{m(m+1)/2} defined above.

An additional input file specifies the matrix sizes and lower/upper eigenvalue bounds.
What’s new
PENNON is being implemented in the NAG (The Numerical Algorithms Group) library.
The first routines should appear in the NAG Fortran Library, Mark 24 (Autumn 2013).
By-product: PENLAB, a free, open, fully functional version of PENNON coded in MATLAB.
PENLAB
PENLAB: a free, open source, fully functional version of PENNON coded in Matlab
Designed and coded by Jan Fiala (NAG)
Open source, all in MATLAB (one MEX function)
The basic algorithm is identical
Some data handling routines not (yet?) implemented
PENLAB runs just like PENNON but is slower
Pre-programmed procedures for
standard NLP (with AMPL input!)
linear SDP (reading SDPA input files)
bilinear SDP (=BMI)
SDP with polynomial MI (PMI)
easy to add more (QP, robust QP, SOF, TTO. . . )
PENLAB
The problem
min_{x∈R^n, Y_1∈S^{p_1}, ..., Y_k∈S^{p_k}} f(x, Y)
subject to g_i(x, Y) ≤ 0, i = 1, ..., m_g
           h_i(x, Y) = 0, i = 1, ..., m_h
           A_i(x, Y) ⪰ 0, i = 1, ..., m_A
           λ̲_i I ⪯ Y_i ⪯ λ̄_i I, i = 1, ..., k
(NLP-SDP)

A_i(x, Y) ... nonlinear matrix operators
PENLAB
Solving a problem:
prepare a structure penm containing basic problem data
>> prob = penlab(penm);   % MATLAB class containing all data
>> solve(prob);
results are stored in the class prob

The user has to provide MATLAB functions for
  function values
  gradients
  Hessians (for nonlinear functions)
of all f, g, A.
Structure penm and f/g/h functions
Example: min x1 + x2 s.t. x1² + x2² ≤ 1, x1 ≥ −0.5

penm = [];
penm.Nx = 2;
penm.lbx = [-0.5 ; -Inf];
penm.NgNLN = 1;
penm.ubg = [1];
penm.objfun = @(x,Y) deal(x(1) + x(2));
penm.objgrad = @(x,Y) deal([1 ; 1]);
penm.confun = @(x,Y) deal([x(1)^2 + x(2)^2]);
penm.congrad = @(x,Y) deal([2*x(1) ; 2*x(2)]);
penm.conhess = @(x,Y) deal([2 0 ; 0 2]);
% set starting point
penm.xinit = [2,1];
PENLAB gives you a free MATLAB solver for large scale NLP!
Toy NLP-SDP example 1
min_{x∈R^2} (1/2)(x1² + x2²)

subject to

  ⎛ 1     x1−1   0  ⎞
  ⎜ x1−1   1    x2  ⎟ ⪰ 0
  ⎝ 0     x2     1  ⎠
D. Noll, 2007
Toy NLP-SDP example 2
min_{x∈R^6} x1 x4 (x1 + x2 + x3) + x3

subject to x1 x2 x3 x4 − x5 − 25 = 0
           x1² + x2² + x3² + x4² − x6 − 40 = 0

  ⎛ x1    x2      0     0  ⎞
  ⎜ x2    x4    x2+x3   0  ⎟ ⪰ 0
  ⎜ 0   x2+x3    x4    x3  ⎟
  ⎝ 0     0      x3    x1  ⎠

  1 ≤ xi ≤ 5, i = 1, 2, 3, 4;  xi ≥ 0, i = 5, 6

Yamashita, Yabe, Harada, 2007 ("augmented" Hock-Schittkowski)
Example: nearest correlation matrix
Find a nearest correlation matrix:

min_X Σ_{i,j=1}^n (X_ij − H_ij)²   (1)

subject to

  X_ii = 1, i = 1, ..., n
  X ⪰ 0
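Not the method of the talk, but a quick way to reach the feasible set of (1): a plain alternating-projections sketch in Python (Higham's nearest-correlation algorithm adds a Dykstra correction on top of this; the Jacobi eigenvalue routine is included only to keep the sketch self-contained):

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def jacobi_eig(A, max_rot=60, tol=1e-12):
    """Eigen-decomposition of a small symmetric matrix by Jacobi rotations."""
    n = len(A)
    A = [row[:] for row in A]
    V = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(max_rot):
        val, p, q = max((abs(A[i][j]), i, j) for i in range(n) for j in range(i + 1, n))
        if val < tol:
            break
        theta = (A[q][q] - A[p][p]) / (2.0 * A[p][q])
        t = math.copysign(1.0, theta) / (abs(theta) + math.hypot(theta, 1.0))
        c = 1.0 / math.hypot(t, 1.0)
        s = t * c
        J = [[float(i == j) for j in range(n)] for i in range(n)]
        J[p][p] = J[q][q] = c
        J[p][q], J[q][p] = s, -s
        Jt = [list(r) for r in zip(*J)]
        A = matmul(Jt, matmul(A, J))
        V = matmul(V, J)
    return [A[i][i] for i in range(n)], V

def proj_psd(A):
    """Project onto the PSD cone: drop negative eigenvalues."""
    eigs, V = jacobi_eig(A)
    n = len(A)
    return [[sum(V[i][k] * max(eigs[k], 0.0) * V[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]

def near_corr(H, iters=500):
    X = [row[:] for row in H]
    for _ in range(iters):
        for i in range(len(X)):
            X[i][i] = 1.0     # project onto {X : X_ii = 1}
        X = proj_psd(X)       # project onto {X : X >= 0}
    return X

H = [[1.0, 0.9, 0.3],
     [0.9, 1.0, 0.9],
     [0.3, 0.9, 1.0]]         # indefinite, hence not a correlation matrix
X = near_corr(H)
eigs, _ = jacobi_eig(X)
assert all(e >= -1e-8 for e in eigs)                     # PSD
assert all(abs(X[i][i] - 1.0) < 1e-2 for i in range(3))  # diagonal ~ 1
```

Alternating projections only finds a feasible point near H; the SDP formulation (1) finds the actual minimizer.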
Example: nearest correlation matrix
The condition number of the nearest correlation matrix must be bounded by κ.

Use the transformation of the variable X: X → zX with a scalar z > 0.

The new problem:

min_{z,X} Σ_{i,j=1}^n (zX_ij − H_ij)²   (2)

subject to

  zX_ii = 1, i = 1, ..., n
  I ⪯ X ⪯ κI

The bounds I ⪯ X ⪯ κI guarantee cond(zX) = cond(X) ≤ κ.
Example: nearest correlation matrix
For

X = [  1.0000  -0.3775  -0.2230   0.7098  -0.4272  -0.0704
      -0.3775   1.0000   0.6930  -0.3155   0.5998  -0.4218
      -0.2230   0.6930   1.0000  -0.1546   0.5523  -0.4914
       0.7098  -0.3155  -0.1546   1.0000  -0.3857  -0.1294
      -0.4272   0.5998   0.5523  -0.3857   1.0000  -0.0576
      -0.0704  -0.4218  -0.4914  -0.1294  -0.0576   1.0000 ]

the eigenvalues of the correlation matrix are

eigen = 0.2866  0.2866  0.2867  0.6717  1.6019  2.8664
Example: nearest correlation matrix
Cooperation with Allianz SE, Munich:
Matrices of size up to 3500 × 3500
Code PENCOR:
  C code, data in xml format
  feasibility analysis
  sensitivity analysis w.r.t. bounds on matrix elements
NLP with AMPL input
Pre-programmed. All you need to do:
>> penm = nlp_define('datafiles/chain100.nl');
>> prob = penlab(penm);
>> prob.solve();
NLP with AMPL input
problem       vars   constr.  type    PENNON sec  PENNON iter.  PENLAB sec  PENLAB iter.
chain800      3199   2400     =       1           14/23         6           24/56
pinene400     8000   7995     =       1           7/7           11          17/17
channel800    6398   6398     =       3           3/3           1           3/3
torsion100    5000   10000    ≤       1           17/17         17          26/26
lane_emd10    4811   21       ≤       217         30/86         64          25/49
dirichlet10   4491   21       ≤       151         33/71         73          32/68
henon10       2701   21       ≤       57          49/128        63          76/158
minsurf100    5000   5000     box     1           20/20         97          203/203
gasoil400     4001   3998     = & b   3           34/34         13          59/71
duct15        2895   8601     = & ≤   6           19/19         9           11/11
marine400     6415   6392     ≤ & b   2           39/39         22          35/35
steering800   3999   3200     ≤ & b   1           9/9           7           19/40
methanol400   4802   4797     ≤ & b   2           24/24         16          47/67
Linear SDP with SDPA input
Pre-programmed. All you need to do:
>> sdpdata = readsdpa('datafiles/arch0.dat-s');
>> penm = sdp_define(sdpdata);
>> prob = penlab(penm);
>> prob.solve();
Bilinear matrix inequalities (BMI)
Pre-programmed. All you need to do:
>> bmidata = define_my_problem;  % matrices A, K, ...
>> penm = bmi_define(bmidata);
>> prob = penlab(penm);
>> prob.solve();
min_{x∈R^n} c^T x

s.t.

  A^i_0 + Σ_{k=1}^n x_k A^i_k + Σ_{k=1}^n Σ_{ℓ=1}^n x_k x_ℓ K^i_{kℓ} ⪯ 0, i = 1, ..., m
Polynomial matrix inequalities (PMI)
Pre-programmed. All you need to do:

>> load datafiles/example_pmidata;
>> penm = pmi_define(pmidata);
>> problem = penlab(penm);
>> problem.solve();
min_{x∈R^n} (1/2) x^T H x + c^T x

subject to b_low ≤ Bx ≤ b_up
           A_i(x) ⪯ 0, i = 1, ..., m

with

  A(x) = Σ_i x^{κ(i)} Q_i

where κ(i) is a multi-index with possibly repeated entries and x^{κ(i)} is a product of elements with indices in κ(i).
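The multi-index notation can be made concrete with a small sketch (data layout and names are mine, not PENLAB's): κ(i) is stored as a tuple of variable indices, and x^{κ(i)} is the product of the corresponding entries:

```python
from math import prod

def eval_pmi(x, terms):
    """Evaluate A(x) = sum_i x^kappa(i) * Q_i with terms = [(kappa_i, Q_i), ...].
    kappa is a tuple of (possibly repeated) indices into x; () gives the constant term."""
    m = len(terms[0][1])
    A = [[0.0] * m for _ in range(m)]
    for kappa, Q in terms:
        coef = prod(x[j] for j in kappa)  # x^kappa, e.g. (0, 0, 1) -> x0^2 * x1
        for r in range(m):
            for c in range(m):
                A[r][c] += coef * Q[r][c]
    return A

# A(x) = Q0 + x0*Q1 + x0^2*x1*Q2 on 2x2 matrices
terms = [((),        [[1.0, 0.0], [0.0, 1.0]]),
         ((0,),      [[0.0, 1.0], [1.0, 0.0]]),
         ((0, 0, 1), [[2.0, 0.0], [0.0, -1.0]])]
x = [2.0, 3.0]
A = eval_pmi(x, terms)
# Q0 + 2*Q1 + (2*2*3)*Q2 = [[1+24, 2], [2, 1-12]]
assert A == [[25.0, 2.0], [2.0, -11.0]]
```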
Other pre-programmed modules
Nearest correlation matrix
Truss topology optimization (stability constraints)
Static output feedback (input from COMPlib, formulated as PMI)
Robust QP
Example: Robust quadratic programming on the unit simplex with constrained uncertainty set

Nominal QP:

min_{x∈R^n} [x^T A x − b^T x]   s.t. Σ x_i ≤ 1, x ≥ 0

Assume A is uncertain: A ∈ U := {A^0 + εU, σ(U) ≤ 1}

Robust QP:

min_{x∈R^n} max_{A∈U} [x^T A x − b^T x]   s.t. Σ x_i ≤ 1, x ≥ 0

equivalent to

min_{x∈R^n} [x^T (A^0 ± εI) x − b^T x]   s.t. Σ x_i ≤ 1, x ≥ 0
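The equivalence rests on max_{σ(U)≤1} x^T U x = ‖x‖², attained at U = I. A small numerical check (2×2 so the spectral norm has a closed form; a sketch, not part of the talk's code):

```python
import math
import random
random.seed(2)

def spec_norm_2x2(U):
    """Spectral norm of a symmetric 2x2 matrix: max |eigenvalue|."""
    tr = U[0][0] + U[1][1]
    det = U[0][0] * U[1][1] - U[0][1] ** 2
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return max(abs((tr + disc) / 2.0), abs((tr - disc) / 2.0))

def quad(A, x):
    """x^T A x."""
    return sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))

A0, eps = [[2.0, 0.3], [0.3, 1.5]], 0.1
for _ in range(1000):
    # random symmetric U, rescaled so that sigma(U) <= 1
    a, b, c = (random.uniform(-1, 1) for _ in range(3))
    U = [[a, b], [b, c]]
    s = spec_norm_2x2(U)
    if s > 1.0:
        U = [[u / s for u in row] for row in U]
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    # worst case over the uncertainty set is attained at U = I
    AU = [[A0[i][j] + eps * U[i][j] for j in range(2)] for i in range(2)]
    lhs = quad(AU, x)
    rhs = quad(A0, x) + eps * (x[0] ** 2 + x[1] ** 2)  # x^T (A0 + eps*I) x
    assert lhs <= rhs + 1e-12
```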
Example: Robust QP
Optimal solution: x*, A* = A^0 + εU* (non-unique)

Constraint: A* should share some properties with A^0.
For instance: A* ⪰ 0 and (A^0)_ii = A*_ii = 1, i.e., U*_ii = 0 for all i
⟹ then the above equivalence no longer holds

Remedy: for a given x*, find a feasible solution A to the maximization problem. This is a linear SDP:

max_{U∈S^m} [x*^T (A^0 + εU) x* − b^T x*]

s.t.

  −I ⪯ U ⪯ I
  A^0 + εU ⪰ 0
  U_ii = 0, i = 1, ..., m
Example: Robust QP
Alternatively: find the feasible A nearest to A* → a nonlinear SDP

Finally, we need to solve the original QP with the feasible worst-case A to get a feasible robust solution x_rob:

min_{x∈R^n} [x^T A x − b^T x]   s.t. Σ x_i ≤ 1, x ≥ 0
The whole procedure
solve QP with A = A0 (nominal)
solve QP with A = A0 + εI (robust)
solve (non-)linear SDP to get A
solve QP with A (robust feasible)
NL-SDP, Other Applications, Availability
polynomial matrix inequalities (with Didier Henrion)
nearest correlation matrix
sensor network localization
approximation by nonnegative splines
financial mathematics (with Ralf Werner)
Mathematical Programming with Equilibrium Constraints (withDanny Ralph)
structural optimization with matrix variables and nonlinear matrix constraints (PLATO-N EU FP6 project)
approximation of the arrival rate function of a non-homogeneous Poisson process (F. Alizadeh, J. Eckstein)
Many other applications ... any hint welcome
A free academic version of the code PENNON is available.
PENLAB is available for download from the NAG website.