arXiv:1701.01549v2 [math.OC] 6 Apr 2018

TIGHT RELAXATIONS FOR POLYNOMIAL OPTIMIZATION

AND LAGRANGE MULTIPLIER EXPRESSIONS

JIAWANG NIE

Abstract. This paper proposes tight semidefinite relaxations for polynomial optimization. The optimality conditions are investigated. We show that generally the Lagrange multipliers can be expressed as polynomial functions of the decision variables over the set of critical points. These polynomial expressions are determined by linear equations. Based on these expressions, new Lasserre-type semidefinite relaxations are constructed for solving the polynomial optimization. We show that the hierarchy of new relaxations has finite convergence, or equivalently, that the new relaxations are tight for a finite relaxation order.

1. Introduction

A general class of optimization problems is

(1.1) $f_{\min} := \min\, f(x)$ s.t. $c_i(x) = 0 \ (i \in \mathcal{E})$, $c_j(x) \ge 0 \ (j \in \mathcal{I})$,

where $f$ and all $c_i, c_j$ are polynomials in $x := (x_1, \ldots, x_n)$, the real decision variable. The $\mathcal{E}$ and $\mathcal{I}$ are two disjoint finite index sets of constraining polynomials. Lasserre's relaxations [17] are generally used for solving (1.1) globally, i.e., to find the global minimum value $f_{\min}$ and minimizer(s), if any. The convergence of Lasserre's relaxations is related to optimality conditions.

1.1. Optimality conditions. A general introduction to optimality conditions in nonlinear programming can be found in [1, Section 3.3]. Let $u$ be a local minimizer of (1.1). Denote the index set of active constraints

(1.2) $J(u) := \{ i \in \mathcal{E} \cup \mathcal{I} \mid c_i(u) = 0 \}$.

If the constraint qualification condition (CQC) holds at $u$, i.e., the gradients $\nabla c_i(u)$ $(i \in J(u))$ are linearly independent ($\nabla$ denotes the gradient), then there exist Lagrange multipliers $\lambda_i$ $(i \in \mathcal{E} \cup \mathcal{I})$ satisfying

(1.3) $\nabla f(u) = \sum_{i \in \mathcal{E} \cup \mathcal{I}} \lambda_i \nabla c_i(u)$,

(1.4) $c_i(u) = 0 \ (i \in \mathcal{E})$, $\lambda_j c_j(u) = 0 \ (j \in \mathcal{I})$,

(1.5) $c_j(u) \ge 0 \ (j \in \mathcal{I})$, $\lambda_j \ge 0 \ (j \in \mathcal{I})$.

2010 Mathematics Subject Classification. 65K05, 90C22, 90C26.
Key words and phrases. Lagrange multiplier, Lasserre relaxation, tight relaxation, polynomial optimization, critical point.


The second equation in (1.4) is called the complementarity condition. If $\lambda_j + c_j(u) > 0$ for all $j \in \mathcal{I}$, the strict complementarity condition (SCC) is said to hold. For the $\lambda_i$'s satisfying (1.3)-(1.5), the associated Lagrange function is

$\mathcal{L}(x) := f(x) - \sum_{i \in \mathcal{E} \cup \mathcal{I}} \lambda_i c_i(x)$.

Under the constraint qualification condition, the second order necessary condition (SONC) holds at $u$, i.e. ($\nabla^2$ denotes the Hessian):

(1.6) $v^T \big( \nabla^2 \mathcal{L}(u) \big) v \ge 0$ for all $v \in \bigcap_{i \in J(u)} \nabla c_i(u)^{\perp}$.

Here, $\nabla c_i(u)^{\perp}$ is the orthogonal complement of $\nabla c_i(u)$. If it further holds that

(1.7) $v^T \big( \nabla^2 \mathcal{L}(u) \big) v > 0$ for all $0 \ne v \in \bigcap_{i \in J(u)} \nabla c_i(u)^{\perp}$,

then the second order sufficient condition (SOSC) is said to hold. If the constraint qualification condition holds at $u$, then (1.3), (1.4) and (1.6) are necessary conditions for $u$ to be a local minimizer. If (1.3), (1.4), (1.7) and the strict complementarity condition hold, then $u$ is a strict local minimizer.
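As a quick numerical sanity check of these conditions (a toy instance of our own, not from the paper), the sketch below verifies the stationarity condition (1.3) and the SOSC (1.7) for minimizing $x_1^2 + x_2^2$ over $x_1 + x_2 - 1 = 0$, whose minimizer is $u = (1/2, 1/2)$ with $\lambda_1 = 1$:

```python
import numpy as np

# Toy instance of (1.1): min x1^2 + x2^2  s.t.  c1(x) = x1 + x2 - 1 = 0.
u = np.array([0.5, 0.5])          # candidate local minimizer
grad_f = 2 * u                    # gradient of f at u
grad_c1 = np.array([1.0, 1.0])    # gradient of c1 at u; CQC holds (nonzero)

# Solve (1.3) for the multiplier, nabla f(u) = lambda1 * nabla c1(u),
# in the least-squares sense
lam1, *_ = np.linalg.lstsq(grad_c1.reshape(-1, 1), grad_f, rcond=None)
lam1 = lam1[0]

# Stationarity residual of (1.3)
residual = np.linalg.norm(grad_f - lam1 * grad_c1)

# SOSC (1.7): v^T (Hessian of L at u) v > 0 on the tangent space;
# here the Hessian of the Lagrangian is 2I
hess_L = 2 * np.eye(2)
v = np.array([1.0, -1.0])         # spans the orthogonal complement of grad_c1
sosc = v @ hess_L @ v

print(lam1, residual, sosc)       # lambda1 ~ 1, residual ~ 0, positive curvature
```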

1.2. Some existing work. Under the archimedean condition (see §2), the hierarchy of Lasserre's relaxations converges asymptotically [17]. Moreover, in addition to the archimedeanness, if the constraint qualification, strict complementarity and second order sufficient conditions hold at every global minimizer, then the Lasserre's hierarchy converges in finitely many steps [33]. For convex polynomial optimization, the Lasserre's hierarchy has finite convergence under the strict convexity or sos-convexity condition [7, 20]. For unconstrained polynomial optimization, the standard sum of squares relaxation was proposed in [35]. When the equality constraints define a finite set, the Lasserre's hierarchy also has finite convergence, as shown in [18, 24, 31]. Recently, a bounded degree hierarchy of relaxations was proposed for solving polynomial optimization [23]. General introductions to polynomial optimization and moment problems can be found in the books and surveys [21, 22, 25, 26, 39]. Lasserre's relaxations provide lower bounds for the minimum value. There also exist methods that compute upper bounds [8, 19]. A convergence rate analysis for such upper bounds is given in [9, 10]. When a polynomial optimization problem does not have minimizers (i.e., the infimum is not achievable), there are relaxation methods for computing the infimum [38, 42].

A new type of Lasserre relaxations, based on Jacobian representations, was recently proposed in [30]. The hierarchy of such relaxations always has finite convergence when the tuple of constraining polynomials is nonsingular (i.e., at every point in $\mathbb{C}^n$, the gradients of active constraining polynomials are linearly independent; see Definition 5.1). When there are only equality constraints $c_1(x) = \cdots = c_m(x) = 0$, the method needs the maximal minors of the matrix

$\big[ \nabla f(x) \ \ \nabla c_1(x) \ \cdots \ \nabla c_m(x) \big]$.

When there are inequality constraints, it requires enumerating all possibilities of active constraints. The method in [30] is expensive when there are a lot of constraints. For unconstrained optimization, it reduces to the gradient sum of squares relaxations in [27].


1.3. New contributions. When Lasserre's relaxations are used to solve polynomial optimization, the following issues are typically of concern:

• The convergence depends on the archimedean condition (see §2), which is satisfied only if the feasible set is compact. If the set is noncompact, how can we get convergent relaxations?

• The cost of Lasserre's relaxations depends significantly on the relaxation order. For a fixed order, can we construct tighter relaxations than the standard ones?

• When the convergence of Lasserre's relaxations is slow, can we construct new relaxations whose convergence is faster?

• When the optimality conditions fail to hold, the Lasserre's hierarchy might not have finite convergence. Can we construct a new hierarchy of stronger relaxations that also has finite convergence for such cases?

This paper addresses the above issues. We construct tighter relaxations by using optimality conditions. In (1.3)-(1.4), under the constraint qualification condition, the Lagrange multipliers $\lambda_i$ are uniquely determined by $u$. Consider the polynomial system in $(x, \lambda)$:

(1.8) $\sum_{i \in \mathcal{E} \cup \mathcal{I}} \lambda_i \nabla c_i(x) = \nabla f(x)$, $c_i(x) = 0 \ (i \in \mathcal{E})$, $\lambda_j c_j(x) = 0 \ (j \in \mathcal{I})$.

A point $x$ satisfying (1.8) is called a critical point, and such $(x, \lambda)$ is called a critical pair. In (1.8), once $x$ is known, $\lambda$ can be determined by linear equations. Generally, the value of $x$ is not known. One can try to express $\lambda$ as a rational function in $x$. Suppose $\mathcal{E} \cup \mathcal{I} = \{1, \ldots, m\}$ and denote

$G(x) := \big[ \nabla c_1(x) \ \cdots \ \nabla c_m(x) \big]$.

When $m \le n$ and $\operatorname{rank} G(x) = m$, we can get the rational expression

(1.9) $\lambda = \big( G(x)^T G(x) \big)^{-1} G(x)^T \nabla f(x)$.

Typically, the matrix inverse $\big( G(x)^T G(x) \big)^{-1}$ is expensive for usage. The denominator $\det\big( G(x)^T G(x) \big)$ is typically a high degree polynomial. When $m > n$, $G(x)^T G(x)$ is always singular, and we cannot express $\lambda$ as in (1.9).

Do there exist polynomials $p_i$ $(i \in \mathcal{E} \cup \mathcal{I})$ such that each

(1.10) $\lambda_i = p_i(x)$

for all $(x, \lambda)$ satisfying (1.8)? If they exist, then we can do the following:

• The polynomial system (1.8) can be simplified to

(1.11) $\sum_{i \in \mathcal{E} \cup \mathcal{I}} p_i(x) \nabla c_i(x) = \nabla f(x)$, $c_i(x) = 0 \ (i \in \mathcal{E})$, $p_j(x) c_j(x) = 0 \ (j \in \mathcal{I})$.

• For each $j \in \mathcal{I}$, the sign condition $\lambda_j \ge 0$ is equivalent to

(1.12) $p_j(x) \ge 0$.

The new conditions (1.11) and (1.12) are only about the variable $x$, not $\lambda$. They can be used to construct tighter relaxations for solving (1.1).
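To make (1.11) concrete, here is a small symbolic sketch (a toy example of our own) for the unit sphere $c_1 = 1 - x^T x = 0$ with $f = x_1$, where the multiplier expression is $p_1(x) = -\frac{1}{2} x^T \nabla f(x)$ (this expression appears in (3.5) below). Solving the system in $x$ alone recovers exactly the two critical points $(\pm 1, 0)$:

```python
import sympy as sp

# Sphere constraint c1 = 1 - x^T x = 0 with the toy objective f = x1.
# Multiplier polynomial: p1(x) = -(1/2) x^T grad f(x), as in (3.5).
x1, x2 = sp.symbols('x1 x2', real=True)
f = x1
c1 = 1 - x1**2 - x2**2
grad_f = [sp.diff(f, v) for v in (x1, x2)]
grad_c = [sp.diff(c1, v) for v in (x1, x2)]
p1 = -sp.Rational(1, 2) * (x1 * grad_f[0] + x2 * grad_f[1])

# System (1.11): grad f = p1 * grad c1, together with c1 = 0
eqs = [grad_f[i] - p1 * grad_c[i] for i in range(2)] + [c1]
sols = sp.solve(eqs, [x1, x2], dict=True)
print(sols)   # the two critical points (1, 0) and (-1, 0)
```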

When do there exist polynomials $p_i$ satisfying (1.10)? If they exist, how can we compute them? How can we use them to construct tighter relaxations? Do the new relaxations have advantages over the old ones? These questions are the main topics of this paper. Our major results are:


• We show that the polynomials $p_i$ satisfying (1.10) always exist when the tuple of constraining polynomials is nonsingular (see Definition 5.1). Moreover, they can be determined by linear equations.

• Using the new conditions (1.11)-(1.12), we can construct tight relaxations for solving (1.1). To be more precise, we construct a hierarchy of new relaxations which has finite convergence. This is true even if the feasible set is noncompact and/or the optimality conditions fail to hold.

• For every relaxation order, the new relaxations are tighter than the standard ones in the prior work.

The paper is organized as follows. Section 2 reviews some basics in polynomial optimization. Section 3 constructs new relaxations and proves their tightness. Section 4 characterizes when the polynomials $p_i$'s satisfying (1.10) exist and shows how to determine them for polyhedral constraints. Section 5 discusses the case of general nonlinear constraints. Section 6 gives examples of using the new relaxations. Section 7 discusses some related issues.

2. Preliminaries

Notation. The symbol $\mathbb{N}$ (resp., $\mathbb{R}$, $\mathbb{C}$) denotes the set of nonnegative integral (resp., real, complex) numbers. The symbol $\mathbb{R}[x] = \mathbb{R}[x_1, \ldots, x_n]$ denotes the ring of polynomials in $x = (x_1, \ldots, x_n)$ with real coefficients. The $\mathbb{R}[x]_d$ stands for the set of real polynomials with degrees $\le d$. Denote

$\mathbb{N}^n_d := \{ \alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{N}^n \mid |\alpha| = \alpha_1 + \cdots + \alpha_n \le d \}$.

For a polynomial $p$, $\deg(p)$ denotes its total degree. For $t \in \mathbb{R}$, $\lceil t \rceil$ denotes the smallest integer $\ge t$. For an integer $k > 0$, denote $[k] := \{1, 2, \ldots, k\}$. For $x = (x_1, \ldots, x_n)$ and $\alpha = (\alpha_1, \ldots, \alpha_n)$, denote

$x^{\alpha} := x_1^{\alpha_1} \cdots x_n^{\alpha_n}$, $[x]_d := \big[ 1 \ \ x_1 \ \cdots \ x_n \ \ x_1^2 \ \ x_1 x_2 \ \cdots \ x_n^d \big]^T$.

The superscript $T$ denotes the transpose of a matrix/vector. The $e_i$ denotes the $i$th standard unit vector, while $e$ denotes the vector of all ones. The $I_m$ denotes the $m$-by-$m$ identity matrix. By writing $X \succeq 0$ (resp., $X \succ 0$), we mean that $X$ is a symmetric positive semidefinite (resp., positive definite) matrix. For matrices $X_1, \ldots, X_r$, $\operatorname{diag}(X_1, \ldots, X_r)$ denotes the block diagonal matrix whose diagonal blocks are $X_1, \ldots, X_r$. In particular, for a vector $a$, $\operatorname{diag}(a)$ denotes the diagonal matrix whose diagonal vector is $a$. For a function $f$ in $x$, $f_{x_i}$ denotes its partial derivative with respect to $x_i$.
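The index set $\mathbb{N}^n_d$ and the monomial vector $[x]_d$ are easy to enumerate programmatically; the following stdlib sketch (function names are ours, and the ordering within each degree is one fixed choice among many) may help fix ideas:

```python
from itertools import product
from math import prod, comb

def monomial_exponents(n, d):
    """Enumerate N^n_d = {alpha in N^n : |alpha| <= d} in graded order."""
    return [a for t in range(d + 1)
            for a in product(range(t + 1), repeat=n) if sum(a) == t]

def point_vector(u, d):
    """[u]_d: the monomial vector [x]_d evaluated at the point u."""
    return [prod(ui**ai for ui, ai in zip(u, a))
            for a in monomial_exponents(len(u), d)]

# |N^n_d| = binom(n + d, d); e.g. n = 2, d = 2 gives 6 monomials
print(len(monomial_exponents(2, 2)), comb(2 + 2, 2))
```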

We review some basics in computational algebra and polynomial optimization. They could be found in [4, 21, 22, 25, 26]. An ideal $I$ of $\mathbb{R}[x]$ is a subset such that $I \cdot \mathbb{R}[x] \subseteq I$ and $I + I \subseteq I$. For a tuple $h = (h_1, \ldots, h_m)$ of polynomials, $\operatorname{Ideal}(h)$ denotes the smallest ideal containing all $h_i$, which is the set

$h_1 \cdot \mathbb{R}[x] + \cdots + h_m \cdot \mathbb{R}[x]$.

The $2k$th truncation of $\operatorname{Ideal}(h)$ is the set

$\operatorname{Ideal}(h)_{2k} := h_1 \cdot \mathbb{R}[x]_{2k - \deg(h_1)} + \cdots + h_m \cdot \mathbb{R}[x]_{2k - \deg(h_m)}$.

The truncation $\operatorname{Ideal}(h)_{2k}$ depends on the generators $h_1, \ldots, h_m$. For an ideal $I$, its complex and real varieties are respectively defined as

$V_{\mathbb{C}}(I) := \{ v \in \mathbb{C}^n \mid p(v) = 0 \ \forall\, p \in I \}$, $V_{\mathbb{R}}(I) := V_{\mathbb{C}}(I) \cap \mathbb{R}^n$.


A polynomial $\sigma$ is said to be a sum of squares (SOS) if $\sigma = s_1^2 + \cdots + s_k^2$ for some polynomials $s_1, \ldots, s_k \in \mathbb{R}[x]$. The set of all SOS polynomials in $x$ is denoted as $\Sigma[x]$. For a degree $d$, denote the truncation

$\Sigma[x]_d := \Sigma[x] \cap \mathbb{R}[x]_d$.

For a tuple $g = (g_1, \ldots, g_t)$, its quadratic module is the set

$\operatorname{Qmod}(g) := \Sigma[x] + g_1 \cdot \Sigma[x] + \cdots + g_t \cdot \Sigma[x]$.

The $2k$th truncation of $\operatorname{Qmod}(g)$ is the set

$\operatorname{Qmod}(g)_{2k} := \Sigma[x]_{2k} + g_1 \cdot \Sigma[x]_{2k - \deg(g_1)} + \cdots + g_t \cdot \Sigma[x]_{2k - \deg(g_t)}$.

The truncation $\operatorname{Qmod}(g)_{2k}$ depends on the generators $g_1, \ldots, g_t$. Denote

(2.1) $\operatorname{IQ}(h, g) := \operatorname{Ideal}(h) + \operatorname{Qmod}(g)$, $\operatorname{IQ}(h, g)_{2k} := \operatorname{Ideal}(h)_{2k} + \operatorname{Qmod}(g)_{2k}$.

The set $\operatorname{IQ}(h, g)$ is said to be archimedean if there exists $p \in \operatorname{IQ}(h, g)$ such that $p(x) \ge 0$ defines a compact set in $\mathbb{R}^n$. If $\operatorname{IQ}(h, g)$ is archimedean, then

$K := \{ x \in \mathbb{R}^n \mid h(x) = 0, \ g(x) \ge 0 \}$

must be a compact set. Conversely, if $K$ is compact, say, $K \subseteq B(0, R)$ (the ball centered at $0$ with radius $R$), then $\operatorname{IQ}(h, (g, R^2 - x^T x))$ is always archimedean, and $h = 0$, $(g, R^2 - x^T x) \ge 0$ give the same set $K$.

Theorem 2.1 (Putinar [36]). Let $h, g$ be tuples of polynomials in $\mathbb{R}[x]$. Let $K$ be as above. Assume $\operatorname{IQ}(h, g)$ is archimedean. If a polynomial $f \in \mathbb{R}[x]$ is positive on $K$, then $f \in \operatorname{IQ}(h, g)$.

Interestingly, if $f$ is only nonnegative on $K$ but standard optimality conditions hold (see Subsection 1.1), then we still have $f \in \operatorname{IQ}(h, g)$ [33].

Let $\mathbb{R}^{\mathbb{N}^n_d}$ be the space of real multi-sequences indexed by $\alpha \in \mathbb{N}^n_d$. A vector in $\mathbb{R}^{\mathbb{N}^n_d}$ is called a truncated multi-sequence (tms) of degree $d$. A tms $y = (y_\alpha)_{\alpha \in \mathbb{N}^n_d}$ gives the Riesz functional $R_y$, acting on $\mathbb{R}[x]_d$ as

(2.2) $R_y \Big( \sum_{\alpha \in \mathbb{N}^n_d} f_\alpha x^\alpha \Big) := \sum_{\alpha \in \mathbb{N}^n_d} f_\alpha y_\alpha$.

For $f \in \mathbb{R}[x]_d$ and $y \in \mathbb{R}^{\mathbb{N}^n_d}$, we denote

(2.3) $\langle f, y \rangle := R_y(f)$.
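The Riesz functional is just a linear pairing of coefficient sequences; below is a minimal sketch (representing polynomials and tms as exponent-keyed dicts is our own choice). Note that for the tms $y = [u]_d$ of a point $u$, one gets $R_y(f) = f(u)$:

```python
def riesz(f_coeffs, y):
    """The Riesz functional (2.2): R_y(f) = sum of f_alpha * y_alpha.
    f_coeffs and y map exponent tuples alpha to f_alpha and y_alpha."""
    return sum(c * y[a] for a, c in f_coeffs.items())

# For the tms y = [u]_2 of the point u = 2 (n = 1), R_y(f) = f(u):
y = {(0,): 1, (1,): 2, (2,): 4}   # the monomials 1, u, u^2 at u = 2
f = {(0,): 1, (2,): 3}            # f(x) = 1 + 3 x^2
val = riesz(f, y)
print(val)                        # 13 = f(2)
```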

Let $q \in \mathbb{R}[x]_{2k}$. The $k$th localizing matrix of $q$, generated by $y \in \mathbb{R}^{\mathbb{N}^n_{2k}}$, is the symmetric matrix $L_q^{(k)}(y)$ such that

(2.4) $\operatorname{vec}(a_1)^T \big( L_q^{(k)}(y) \big) \operatorname{vec}(a_2) = R_y(q a_1 a_2)$

for all $a_1, a_2 \in \mathbb{R}[x]_{k - \lceil \deg(q)/2 \rceil}$. (The $\operatorname{vec}(a_i)$ denotes the coefficient vector of $a_i$.) When $q = 1$, $L_q^{(k)}(y)$ is called a moment matrix, and we denote

(2.5) $M_k(y) := L_1^{(k)}(y)$.

The columns and rows of $L_q^{(k)}(y)$, as well as $M_k(y)$, are indexed by $\alpha \in \mathbb{N}^n$ with $2|\alpha| + \deg(q) \le 2k$. When $q = (q_1, \ldots, q_r)$ is a tuple of polynomials, we define

(2.6) $L_q^{(k)}(y) := \operatorname{diag}\big( L_{q_1}^{(k)}(y), \ldots, L_{q_r}^{(k)}(y) \big)$,

a block diagonal matrix. For the polynomial tuples $h, g$ as above, the set

(2.7) $\mathscr{S}(h, g)_{2k} := \big\{ y \in \mathbb{R}^{\mathbb{N}^n_{2k}} \ \big| \ L_h^{(k)}(y) = 0, \ L_g^{(k)}(y) \succeq 0 \big\}$

is a spectrahedral cone in $\mathbb{R}^{\mathbb{N}^n_{2k}}$. The set $\operatorname{IQ}(h, g)_{2k}$ is also a convex cone in $\mathbb{R}[x]_{2k}$. The dual cone of $\operatorname{IQ}(h, g)_{2k}$ is precisely $\mathscr{S}(h, g)_{2k}$ [22, 25, 34]. This is because $\langle p, y \rangle \ge 0$ for all $p \in \operatorname{IQ}(h, g)_{2k}$ and for all $y \in \mathscr{S}(h, g)_{2k}$.
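A small numpy sketch (our own) of the moment matrix in (2.5): for the tms $y = [u]_{2k}$ of a single point $u$, the entries are $y_{\alpha+\beta} = u^\alpha u^\beta$, so $M_k(y)$ is the rank-one positive semidefinite matrix $[u]_k [u]_k^T$:

```python
import numpy as np
from itertools import product

def exponents(n, k):
    # alpha in N^n with |alpha| <= k, in graded order
    return [a for t in range(k + 1)
            for a in product(range(t + 1), repeat=n) if sum(a) == t]

def moment_matrix_of_point(u, k):
    """M_k(y) for y = [u]_{2k}: entries u^{alpha+beta} = u^alpha * u^beta."""
    mono = np.array([np.prod([float(ui)**ai for ui, ai in zip(u, a)])
                     for a in exponents(len(u), k)])
    return np.outer(mono, mono)

M = moment_matrix_of_point((2.0, -1.0), 1)
print(M.shape, int(np.linalg.matrix_rank(M)))   # (3, 3) 1
```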

3. The construction of tight relaxations

Consider the polynomial optimization problem (1.1). Let

$\lambda := (\lambda_i)_{i \in \mathcal{E} \cup \mathcal{I}}$

be the vector of Lagrange multipliers. Denote the set

(3.1) $\mathcal{K} := \Big\{ (x, \lambda) \in \mathbb{R}^n \times \mathbb{R}^{\mathcal{E} \cup \mathcal{I}} \ \Big| \ c_i(x) = 0 \ (i \in \mathcal{E}), \ \lambda_j c_j(x) = 0 \ (j \in \mathcal{I}), \ \nabla f(x) = \sum_{i \in \mathcal{E} \cup \mathcal{I}} \lambda_i \nabla c_i(x) \Big\}$.

Each point in $\mathcal{K}$ is called a critical pair. The projection

(3.2) $K_c := \{ u \mid (u, \lambda) \in \mathcal{K} \}$

is the set of all real critical points. To construct tight relaxations for solving (1.1), we need the following assumption for Lagrange multipliers.

Assumption 3.1. For each $i \in \mathcal{E} \cup \mathcal{I}$, there exists a polynomial $p_i \in \mathbb{R}[x]$ such that for all $(x, \lambda) \in \mathcal{K}$, it holds that

$\lambda_i = p_i(x)$.

Assumption 3.1 is generically satisfied, as shown in Proposition 5.7. For the following special cases, we can get the polynomials $p_i$ explicitly.

• (Simplex) For the simplex $\{ e^T x - 1 = 0, \ x_1 \ge 0, \ldots, x_n \ge 0 \}$, it corresponds to that $\mathcal{E} = \{0\}$, $\mathcal{I} = [n]$, $c_0(x) = e^T x - 1$, $c_j(x) = x_j$ $(j \in [n])$. The Lagrange multipliers can be expressed as

(3.3) $\lambda_0 = x^T \nabla f(x)$, $\lambda_j = f_{x_j} - x^T \nabla f(x)$ $(j \in [n])$.

• (Hypercube) For the hypercube $[-1, 1]^n$, it corresponds to that $\mathcal{E} = \emptyset$, $\mathcal{I} = [n]$, and each $c_j(x) = 1 - x_j^2$. We can show that

(3.4) $\lambda_j = -\tfrac{1}{2} x_j f_{x_j}$ $(j \in [n])$.

• (Ball or sphere) The constraint is $1 - x^T x = 0$ or $1 - x^T x \ge 0$. It corresponds to that $\mathcal{E} \cup \mathcal{I} = \{1\}$ and $c_1 = 1 - x^T x$. We have

(3.5) $\lambda_1 = -\tfrac{1}{2} x^T \nabla f(x)$.

• (Triangular constraints) Suppose $\mathcal{E} \cup \mathcal{I} = \{1, \ldots, m\}$ and each

$c_i(x) = \tau_i x_i + q_i(x_{i+1}, \ldots, x_n)$

for some polynomials $q_i \in \mathbb{R}[x_{i+1}, \ldots, x_n]$ and scalars $\tau_i \ne 0$. The matrix $T(x)$, consisting of the first $m$ rows of $[\nabla c_1(x) \ \cdots \ \nabla c_m(x)]$, is an invertible lower triangular matrix with constant diagonal entries. Then

$\lambda = T(x)^{-1} \cdot \big[ f_{x_1} \ \cdots \ f_{x_m} \big]^T$.

Note that the inverse $T(x)^{-1}$ is a matrix polynomial.
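The hypercube expression (3.4) can be checked symbolically on a toy objective (our choice $f = x_1 + x_2$): solve the critical system (3.1) for all critical pairs and compare each $\lambda_j$ with $-\frac{1}{2} x_j f_{x_j}$:

```python
import sympy as sp

# Hypercube [-1,1]^2: c_j = 1 - x_j^2.  Check (3.4) on every critical pair
# for the toy objective f = x1 + x2.
x1, x2, l1, l2 = sp.symbols('x1 x2 lam1 lam2', real=True)
f = x1 + x2
fx1, fx2 = sp.diff(f, x1), sp.diff(f, x2)
eqs = [fx1 + 2*l1*x1,        # stationarity, first component
       fx2 + 2*l2*x2,        # stationarity, second component
       l1*(1 - x1**2),       # complementarity for c1
       l2*(1 - x2**2)]       # complementarity for c2
sols = sp.solve(eqs, [x1, x2, l1, l2], dict=True)

# (3.4): lambda_j = -(1/2) x_j f_{x_j} must hold at every critical pair
checks = [sp.simplify(s[l1] + sp.Rational(1, 2)*s[x1]*fx1.subs(s)) == 0
          and sp.simplify(s[l2] + sp.Rational(1, 2)*s[x2]*fx2.subs(s)) == 0
          for s in sols]
print(len(sols), all(checks))
```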


For more general constraints, we can also express $\lambda$ as a polynomial function in $x$ on the set $K_c$. This will be discussed in §4 and §5.

For the polynomials $p_i$ as in Assumption 3.1, denote

(3.6) $\phi := \Big( \nabla f - \sum_{i \in \mathcal{E} \cup \mathcal{I}} p_i \nabla c_i, \ \big( p_j c_j \big)_{j \in \mathcal{I}} \Big)$, $\psi := (p_j)_{j \in \mathcal{I}}$.

When the minimum value $f_{\min}$ of (1.1) is achieved at a critical point, (1.1) is equivalent to the problem

(3.7) $f_c := \min f(x)$ s.t. $c_{eq}(x) = 0$, $c_{in}(x) \ge 0$, $\phi(x) = 0$, $\psi(x) \ge 0$,

where $c_{eq} := (c_i)_{i \in \mathcal{E}}$ is the tuple of equality constraints and $c_{in} := (c_j)_{j \in \mathcal{I}}$ is the tuple of inequality constraints. We apply Lasserre relaxations to solve it. For an integer $k > 0$ (called the relaxation order), the $k$th order Lasserre's relaxation for (3.7) is

(3.8) $f'_k := \min\ \langle f, y \rangle$ s.t. $\langle 1, y \rangle = 1$, $M_k(y) \succeq 0$, $L_{c_{eq}}^{(k)}(y) = 0$, $L_{c_{in}}^{(k)}(y) \succeq 0$, $L_{\phi}^{(k)}(y) = 0$, $L_{\psi}^{(k)}(y) \succeq 0$, $y \in \mathbb{R}^{\mathbb{N}^n_{2k}}$.

Since $x^0 = 1$ (the constant one polynomial), the condition $\langle 1, y \rangle = 1$ means that $(y)_0 = 1$. The dual optimization problem of (3.8) is

(3.9) $f_k := \max\ \gamma$ s.t. $f - \gamma \in \operatorname{IQ}(c_{eq}, c_{in})_{2k} + \operatorname{IQ}(\phi, \psi)_{2k}$.

We refer to §2 for the notation used in (3.8)-(3.9). They are equivalent to semidefinite programs (SDPs), so they can be solved by SDP solvers (e.g., SeDuMi [40]). For $k = 1, 2, \cdots$, we get a hierarchy of Lasserre relaxations. In (3.8)-(3.9), if we remove the usage of $\phi$ and $\psi$, they are reduced to standard Lasserre relaxations in [17]. So (3.8)-(3.9) are stronger relaxations.

By the construction of $\phi$ as in (3.6), Assumption 3.1 implies that

$K_c = \{ u \in \mathbb{R}^n \mid c_{eq}(u) = 0, \ \phi(u) = 0 \}$.

By Lemma 3.3 of [6], $f$ achieves only finitely many values on $K_c$, say,

(3.10) $v_1 < \cdots < v_N$.

A point $u \in K_c$ might not be feasible for (3.7), i.e., it is possible that $c_{in}(u) \not\ge 0$ or $\psi(u) \not\ge 0$. In applications, we are often interested in the optimal value $f_c$ of (3.7). When (3.7) is infeasible, by convention, we set

$f_c = +\infty$.

When the optimal value $f_{\min}$ of (1.1) is achieved at a critical point, $f_c = f_{\min}$. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., for each $\ell$, the sublevel set $\{f(x) \le \ell\}$ is compact), and the constraint qualification condition holds. As in [17], one can show that

(3.11) $f_k \le f'_k \le f_c$

for all $k$. Moreover, $f_k$ and $f'_k$ are both monotonically increasing. If for some order $k$ it occurs that

$f_k = f'_k = f_c$,

then the $k$th order Lasserre's relaxation is said to be tight (or exact).


3.1. Tightness of the relaxations. Let $c_{in}$, $\psi$, $K_c$, $f_c$ be as above. We refer to §2 for the notation $\operatorname{Qmod}(c_{in}, \psi)$. We begin with a general assumption.

Assumption 3.2. There exists $\rho \in \operatorname{Qmod}(c_{in}, \psi)$ such that if $u \in K_c$ and $f(u) < f_c$, then $\rho(u) < 0$.

In Assumption 3.2, the hypersurface $\rho(x) = 0$ separates feasible and infeasible critical points. Clearly, if $u \in K_c$ is a feasible point for (3.7), then $c_{in}(u) \ge 0$ and $\psi(u) \ge 0$, and hence $\rho(u) \ge 0$. Assumption 3.2 generally holds. For instance, it is satisfied for the following general cases:

a) When there are no inequality constraints, $c_{in}$ and $\psi$ are empty tuples. Then $\operatorname{Qmod}(c_{in}, \psi) = \Sigma[x]$, and Assumption 3.2 is satisfied for $\rho = 0$.

b) Suppose the set $K_c$ is finite, say, $K_c = \{u_1, \ldots, u_D\}$, and

$f(u_1), \ldots, f(u_{t-1}) < f_c \le f(u_t), \ldots, f(u_D)$.

Let $\ell_1, \ldots, \ell_D$ be real interpolating polynomials such that $\ell_i(u_j) = 1$ for $i = j$ and $\ell_i(u_j) = 0$ for $i \ne j$. For each $i = 1, \ldots, t-1$, there must exist $j_i \in \mathcal{I}$ such that $c_{j_i}(u_i) < 0$. Then the polynomial

(3.12) $\rho = \sum_{i < t} \dfrac{-1}{c_{j_i}(u_i)}\, c_{j_i}(x)\, \ell_i(x)^2 + \sum_{i \ge t} \ell_i(x)^2$

satisfies Assumption 3.2.

c) For each $x$ with $f(x) = v_i < f_c$, at least one of the constraints $c_j(x) \ge 0$, $p_j(x) \ge 0$ $(j \in \mathcal{I})$ is violated. Suppose for each critical value $v_i < f_c$, there exists $g_i \in \{c_j, p_j\}_{j \in \mathcal{I}}$ such that

$g_i < 0$ on $K_c \cap \{f(x) = v_i\}$.

Let $\varphi_1, \ldots, \varphi_N$ be real univariate polynomials such that $\varphi_i(v_j) = 0$ for $i \ne j$ and $\varphi_i(v_j) = 1$ for $i = j$. Suppose $v_t = f_c$. Then the polynomial

(3.13) $\rho = \sum_{i < t} g_i(x) \big( \varphi_i(f(x)) \big)^2 + \sum_{i \ge t} \big( \varphi_i(f(x)) \big)^2$

satisfies Assumption 3.2.

We refer to §2 for the archimedean condition and the notation $\operatorname{IQ}(h, g)$ as in (2.1). The following is about the convergence of the relaxations (3.8)-(3.9).

Theorem 3.3. Suppose $K_c \ne \emptyset$ and Assumption 3.1 holds. If

i) $\operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi)$ is archimedean, or
ii) $\operatorname{IQ}(c_{eq}, c_{in})$ is archimedean, or
iii) Assumption 3.2 holds,

then $f_k = f'_k = f_c$ for all $k$ sufficiently large. Therefore, if the minimum value $f_{\min}$ of (1.1) is achieved at a critical point, then $f_k = f'_k = f_{\min}$ for all $k$ big enough, if one of the conditions i)-iii) is satisfied.

Remark. In Theorem 3.3, the conclusion holds if any one of the conditions i)-iii) is satisfied. The condition ii) is only about the constraining polynomials of (1.1). It can be checked without $\phi$, $\psi$. Clearly, the condition ii) implies the condition i).

The proof of Theorem 3.3 is given in the following. The main idea is to consider the set of critical points. It can be expressed as a union of subvarieties, and the objective $f$ is a constant on each one of them. We can get an SOS type representation for $f$ on each subvariety, and then construct a single one for $f$ over the entire set of critical points.

Proof of Theorem 3.3. Clearly, every point in the complex variety

$K_1 := \{ x \in \mathbb{C}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0 \}$

is a critical point. By Lemma 3.3 of [6], the objective $f$ achieves finitely many real values on $K_c = K_1 \cap \mathbb{R}^n$, say, they are $v_1 < \cdots < v_N$. Up to the shifting of a constant in $f$, we can further assume that $f_c = 0$. Clearly, $f_c$ equals one of $v_1, \ldots, v_N$, say, $v_t = f_c = 0$.

Case I: Assume $\operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi)$ is archimedean. Let

$I := \operatorname{Ideal}(c_{eq}, \phi)$,

the critical ideal. Note that $K_1 = V_{\mathbb{C}}(I)$. The variety $V_{\mathbb{C}}(I)$ is a union of irreducible subvarieties, say, $V_1, \ldots, V_\ell$. If $V_i \cap \mathbb{R}^n \ne \emptyset$, then $f$ is a real constant on $V_i$, which equals one of $v_1, \ldots, v_N$. This can be implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of $V_{\mathbb{C}}(I)$:

$T_i := K_1 \cap \{ f(x) = v_i \}$ $(i = t, \ldots, N)$.

Let $T_{t-1}$ be the union of the irreducible subvarieties $V_i$ such that either $V_i \cap \mathbb{R}^n = \emptyset$, or $f \equiv v_j$ on $V_i$ with $v_j < v_t = f_c$. Then it holds that

$V_{\mathbb{C}}(I) = T_{t-1} \cup T_t \cup \cdots \cup T_N$.

By the primary decomposition of $I$ [11, 41], there exist ideals $I_{t-1}, I_t, \ldots, I_N \subseteq \mathbb{R}[x]$ such that

$I = I_{t-1} \cap I_t \cap \cdots \cap I_N$

and $T_i = V_{\mathbb{C}}(I_i)$ for all $i = t-1, t, \ldots, N$. Denote the semialgebraic set

(3.14) $S := \{ x \in \mathbb{R}^n \mid c_{in}(x) \ge 0, \ \psi(x) \ge 0 \}$.

For $i = t-1$, we have $V_{\mathbb{R}}(I_{t-1}) \cap S = \emptyset$, because $v_1, \ldots, v_{t-1} < f_c$. By the Positivstellensatz [2, Corollary 4.4.3], there exists $p_0 \in \operatorname{Preord}(c_{in}, \psi)$ (the preordering of the polynomial tuple $(c_{in}, \psi)$; see §7.1) satisfying $2 + p_0 \in I_{t-1}$. Note that $1 + p_0 > 0$ on $V_{\mathbb{R}}(I_{t-1}) \cap S$. The set $I_{t-1} + \operatorname{Qmod}(c_{in}, \psi)$ is archimedean, because $I \subseteq I_{t-1}$ and

$\operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi) \subseteq I_{t-1} + \operatorname{Qmod}(c_{in}, \psi)$.

By Theorem 2.1, we have

$p_1 := 1 + p_0 \in I_{t-1} + \operatorname{Qmod}(c_{in}, \psi)$.

Then $1 + p_1 \in I_{t-1}$. There exists $p_2 \in \operatorname{Qmod}(c_{in}, \psi)$ such that

$-1 \equiv p_1 \equiv p_2 \mod I_{t-1}$.

Since $f = (f/4 + 1)^2 - 1 \cdot (f/4 - 1)^2$, we have

$f \equiv \sigma_{t-1} := (f/4 + 1)^2 + p_2 (f/4 - 1)^2 \mod I_{t-1}$.

So, when $k$ is big enough, we have $\sigma_{t-1} \in \operatorname{Qmod}(c_{in}, \psi)_{2k}$.


For $i = t$: $v_t = 0$ and $f(x)$ vanishes on $V_{\mathbb{C}}(I_t)$. By Hilbert's Strong Nullstellensatz [4], there exists an integer $m_t > 0$ such that $f^{m_t} \in I_t$. Define the polynomial

$s_t(\epsilon) := \sqrt{\epsilon} \sum_{j=0}^{m_t - 1} \binom{1/2}{j} \epsilon^{-j} f^j$.

Then we have that

$D_1 := \epsilon (1 + \epsilon^{-1} f) - \big( s_t(\epsilon) \big)^2 \equiv 0 \mod I_t$.

This is because, in the subtraction of $D_1$, after expanding $\big( s_t(\epsilon) \big)^2$, all the terms $f^j$ with $j < m_t$ are cancelled, and $f^j \in I_t$ for $j \ge m_t$. So $D_1 \in I_t$. Let $\sigma_t(\epsilon) := s_t(\epsilon)^2$; then $f + \epsilon - \sigma_t(\epsilon) = D_1$ and

(3.15) $f + \epsilon - \sigma_t(\epsilon) = \sum_{j=0}^{m_t - 2} b_j(\epsilon) f^{m_t + j}$

for some real scalars $b_j(\epsilon)$ depending on $\epsilon$.

For each $i = t+1, \ldots, N$: $v_i > 0$ and $f(x)/v_i - 1$ vanishes on $V_{\mathbb{C}}(I_i)$. By Hilbert's Strong Nullstellensatz [4], there exists $0 < m_i \in \mathbb{N}$ such that $(f/v_i - 1)^{m_i} \in I_i$. Let

$s_i := \sqrt{v_i} \sum_{j=0}^{m_i - 1} \binom{1/2}{j} (f/v_i - 1)^j$.

As in the case $i = t$, we can similarly show that $f - s_i^2 \in I_i$. Let $\sigma_i := s_i^2$; then $f - \sigma_i \in I_i$.

Note that $V_{\mathbb{C}}(I_i) \cap V_{\mathbb{C}}(I_j) = \emptyset$ for all $i \ne j$. By Lemma 3.3 of [30], there exist polynomials $a_{t-1}, \ldots, a_N \in \mathbb{R}[x]$ such that

$a_{t-1}^2 + \cdots + a_N^2 - 1 \in I$, $a_i \in \bigcap_{i \ne j \in \{t-1, \ldots, N\}} I_j$.

For $\epsilon > 0$, denote the polynomial

$\sigma_\epsilon := \sigma_t(\epsilon) a_t^2 + \sum_{t \ne j \in \{t-1, \ldots, N\}} (\sigma_j + \epsilon) a_j^2$,

then

$f + \epsilon - \sigma_\epsilon = (f + \epsilon)(1 - a_{t-1}^2 - \cdots - a_N^2) + \sum_{t \ne i \in \{t-1, \ldots, N\}} (f - \sigma_i) a_i^2 + (f + \epsilon - \sigma_t(\epsilon)) a_t^2$.

For each $i \ne t$, $f - \sigma_i \in I_i$, so

$(f - \sigma_i) a_i^2 \in \bigcap_{j=t-1}^{N} I_j = I$.

Hence, there exists $k_1 > 0$ such that

$(f - \sigma_i) a_i^2 \in I_{2k_1}$ $(t \ne i \in \{t-1, \ldots, N\})$.

Since $f + \epsilon - \sigma_t(\epsilon) \in I_t$, we also have

$(f + \epsilon - \sigma_t(\epsilon)) a_t^2 \in \bigcap_{j=t-1}^{N} I_j = I$.

Moreover, by the equation (3.15),

$(f + \epsilon - \sigma_t(\epsilon)) a_t^2 = \sum_{j=0}^{m_t - 2} b_j(\epsilon) f^{m_t + j} a_t^2$.

Each $f^{m_t + j} a_t^2 \in I$, since $f^{m_t + j} \in I_t$. So there exists $k_2 > 0$ such that, for all $\epsilon > 0$,

$(f + \epsilon - \sigma_t(\epsilon)) a_t^2 \in I_{2k_2}$.

Since $1 - a_{t-1}^2 - \cdots - a_N^2 \in I$, there also exists $k_3 > 0$ such that, for all $\epsilon > 0$,

$(f + \epsilon)(1 - a_{t-1}^2 - \cdots - a_N^2) \in I_{2k_3}$.

Hence, if $k^* \ge \max\{k_1, k_2, k_3\}$, then we have

$f(x) + \epsilon - \sigma_\epsilon \in I_{2k^*}$

for all $\epsilon > 0$. By the construction, the degrees of all $\sigma_i$ and $a_i$ are independent of $\epsilon$. So $\sigma_\epsilon \in \operatorname{Qmod}(c_{in}, \psi)_{2k^*}$ for all $\epsilon > 0$, if $k^*$ is big enough. Note that

$I_{2k^*} + \operatorname{Qmod}(c_{in}, \psi)_{2k^*} = \operatorname{IQ}(c_{eq}, c_{in})_{2k^*} + \operatorname{IQ}(\phi, \psi)_{2k^*}$.

This implies that $f_{k^*} \ge f_c - \epsilon$ for all $\epsilon > 0$. On the other hand, we always have $f_{k^*} \le f_c$. So $f_{k^*} = f_c$. Moreover, since $f_k$ is monotonically increasing, we must have $f_k = f_c$ for all $k \ge k^*$.

Case II: Assume $\operatorname{IQ}(c_{eq}, c_{in})$ is archimedean. Because

$\operatorname{IQ}(c_{eq}, c_{in}) \subseteq \operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi)$,

the set $\operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi)$ is also archimedean. Therefore, the conclusion is also true, by applying the result for Case I.

Case III: Suppose Assumption 3.2 holds. Let $\varphi_1, \ldots, \varphi_N$ be real univariate polynomials such that $\varphi_i(v_j) = 0$ for $i \ne j$ and $\varphi_i(v_j) = 1$ for $i = j$. Let

$s := s_t + \cdots + s_N$, where each $s_i := (v_i - f_c) \big( \varphi_i(f) \big)^2$.

Then $s \in \Sigma[x]_{2k_4}$ for some integer $k_4 > 0$. Let

$\tilde{f} := f - f_c - s$.

We show that there exist an integer $\ell > 0$ and $q \in \operatorname{Qmod}(c_{in}, \psi)$ such that

$\tilde{f}^{2\ell} + q \in \operatorname{Ideal}(c_{eq}, \phi)$.

This is because, by Assumption 3.2, $\tilde{f}(x) \equiv 0$ on the set

$K_2 := \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0 \}$.

It has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist $0 < \ell \in \mathbb{N}$ and $q = b_0 + \rho b_1$ ($b_0, b_1 \in \Sigma[x]$) such that $\tilde{f}^{2\ell} + q \in \operatorname{Ideal}(c_{eq}, \phi)$. By Assumption 3.2, $\rho \in \operatorname{Qmod}(c_{in}, \psi)$, so we have $q \in \operatorname{Qmod}(c_{in}, \psi)$.

For all $\epsilon > 0$ and $\tau > 0$, we have $\tilde{f} + \epsilon = \phi_\epsilon + \theta_\epsilon$, where

$\phi_\epsilon := -\tau \epsilon^{1-2\ell} \big( \tilde{f}^{2\ell} + q \big)$, $\theta_\epsilon := \epsilon \big( 1 + \tilde{f}/\epsilon + \tau (\tilde{f}/\epsilon)^{2\ell} \big) + \tau \epsilon^{1-2\ell} q$.

By Lemma 2.1 of [31], when $\tau \ge 1/(2\ell)$, there exists $k_5$ such that, for all $\epsilon > 0$,

$\phi_\epsilon \in \operatorname{Ideal}(c_{eq}, \phi)_{2k_5}$, $\theta_\epsilon \in \operatorname{Qmod}(c_{in}, \psi)_{2k_5}$.

Hence, we can get

$f - (f_c - \epsilon) = \phi_\epsilon + \sigma_\epsilon$,

where $\sigma_\epsilon := \theta_\epsilon + s \in \operatorname{Qmod}(c_{in}, \psi)_{2k_5}$ for all $\epsilon > 0$. Note that

$\operatorname{IQ}(c_{eq}, c_{in})_{2k_5} + \operatorname{IQ}(\phi, \psi)_{2k_5} = \operatorname{Ideal}(c_{eq}, \phi)_{2k_5} + \operatorname{Qmod}(c_{in}, \psi)_{2k_5}$.

For all $\epsilon > 0$, $\gamma = f_c - \epsilon$ is feasible in (3.9) for the order $k_5$, so $f_{k_5} \ge f_c$. Because of (3.11) and the monotonicity of $f_k$, we have $f_k = f'_k = f_c$ for all $k \ge k_5$. □

3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is $f_c$, and the optimal value of (1.1) is $f_{\min}$. If $f_{\min}$ is achievable at a critical point, then $f_c = f_{\min}$. In Theorem 3.3, we have shown that $f_k = f_c$ for all $k$ big enough, where $f_k$ is the optimal value of (3.9). The value $f_c$ or $f_{\min}$ is often not known. How do we detect the tightness $f_k = f_c$ in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose $y^*$ is a minimizer of (3.8) for the order $k$. Let

(3.16) $d := \lceil \deg(c_{eq}, c_{in}, \phi, \psi)/2 \rceil$.

If there exists an integer $t \in [d, k]$ such that

(3.17) $\operatorname{rank} M_t(y^*) = \operatorname{rank} M_{t-d}(y^*)$,

then $f_k = f_c$, and we can get $r := \operatorname{rank} M_t(y^*)$ minimizers for (3.7) [5, 14, 32]. The method in [14] can be used to extract minimizers. It was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints $c_{in} \ge 0$, $\psi \ge 0$) can also be detected by solving the relaxations (3.8)-(3.9).
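The rank test (3.17) is cheap to run once $y^*$ is available. The numpy sketch below (our own; it fabricates a flat tms supported on two hypothetical points rather than solving (3.8)) illustrates the condition with $d = 1$, $k = 2$ and $r = 2$:

```python
import numpy as np
from itertools import product

def exponents(n, k):
    return [a for t in range(k + 1)
            for a in product(range(t + 1), repeat=n) if sum(a) == t]

def point_moment_matrix(u, t):
    # M_t([u]_{2t}): entries u^{alpha+beta}, a rank-one PSD matrix
    mono = np.array([np.prod([float(ui)**ai for ui, ai in zip(u, a)])
                     for a in exponents(len(u), t)])
    return np.outer(mono, mono)

u1, u2 = (1.0, 0.0), (-1.0, 0.0)   # two hypothetical minimizers
d, k = 1, 2                        # pretend (3.16) gives d = 1
ranks = [int(np.linalg.matrix_rank(
             0.5 * (point_moment_matrix(u1, t) + point_moment_matrix(u2, t))))
         for t in (k - d, k)]
print(ranks)   # both ranks equal 2, so (3.17) holds with r = 2
```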

Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order $k$, then no critical points satisfy the constraints $c_{in} \ge 0$, $\psi \ge 0$, i.e., (3.7) is infeasible.

ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order $k$ is big enough.

In the following, assume (3.7) is feasible (i.e., $f_c < +\infty$). Then, for all $k$ big enough, (3.8) has a minimizer $y^*$. Moreover:

iii) If (3.17) is satisfied for some $t \in [d, k]$, then $f_k = f_c$.

iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer $y^*$ of (3.8) must satisfy (3.17), for some $t \in [d, k]$, when $k$ is big enough.

Proof. By Assumption 3.1, $u$ is a critical point if and only if $c_{eq}(u) = 0$, $\phi(u) = 0$.

i) For every feasible point $u$ of (3.7), the tms $[u]_{2k}$ (see §2 for the notation) is feasible for (3.8), for all $k$. Therefore, if (3.8) is infeasible for some $k$, then (3.7) must be infeasible.

ii) By Assumption 3.2, when (3.7) is infeasible, the set

$\{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0 \}$

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that $-1 \in \operatorname{Ideal}(c_{eq}, \phi) + \operatorname{Qmod}(\rho)$. By Assumption 3.2,

$\operatorname{Ideal}(c_{eq}, \phi) + \operatorname{Qmod}(\rho) \subseteq \operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi)$.

Thus, for all $k$ big enough, (3.9) is unbounded from above. Hence, (3.8) must be infeasible, by weak duality.

When (3.7) is feasible, $f$ achieves finitely many values on $K_c$, so (3.7) must achieve its optimal value $f_c$. By Theorem 3.3, we know that $f_k = f'_k = f_c$ for all $k$ big enough. For each minimizer $u^*$ of (3.7), the tms $[u^*]_{2k}$ is a minimizer of (3.8).


iii) If (317) holds we can get r = rankMt(ylowast) minimizers for (37) [5 14] say

u1 ur such that fk = f(ui) for each i Clearly fk = f(ui) ge fc On the otherhand we always have fk le fc So fk = fc

iv) By Assumption 3.2, (3.7) is equivalent to the problem
$$(3.18) \qquad \min \ f(x) \quad \text{s.t.} \quad c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \geq 0.$$
The optimal value of (3.18) is also $f_c$. Its $k$th order Lasserre's relaxation is
$$(3.19) \qquad \begin{array}{rl} \gamma_k' = \min & \langle f, y \rangle \\ \text{s.t.} & \langle 1, y \rangle = 1, \ M_k(y) \succeq 0, \\ & L^{(k)}_{c_{eq}}(y) = 0, \ L^{(k)}_{\phi}(y) = 0, \ L^{(k)}_{\rho}(y) \succeq 0. \end{array}$$
Its dual optimization problem is
$$(3.20) \qquad \begin{array}{rl} \gamma_k = \max & \gamma \\ \text{s.t.} & f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(\rho)_{2k}. \end{array}$$
By repeating the same proof as for Theorem 3.3(iii), we can show that
$$\gamma_k = \gamma_k' = f_c$$
for all $k$ big enough. Because $\rho \in \mathrm{Qmod}(c_{in}, \psi)$, each $y$ feasible for (3.8) is also feasible for (3.19). So, when $k$ is big, each $y^*$ is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some $t \in [d, k]$, when $k$ is big enough. $\square$

If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, \S 6.6] for such cases.

4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron
$$P := \big\{ x \in \mathbb{R}^n \mid Ax - b \geq 0 \big\},$$
where $A = [a_1 \ \cdots \ a_m]^T \in \mathbb{R}^{m \times n}$ and $b = [b_1 \ \cdots \ b_m]^T \in \mathbb{R}^m$. This corresponds to the case that $\mathcal{E} = \emptyset$, $\mathcal{I} = [m]$, and each $c_i(x) = a_i^T x - b_i$. Denote
$$(4.1) \qquad D(x) = \mathrm{diag}\big(c_1(x), \ldots, c_m(x)\big), \qquad C(x) = \begin{bmatrix} A^T \\ D(x) \end{bmatrix}.$$
The Lagrange multiplier vector $\lambda = [\lambda_1 \ \cdots \ \lambda_m]^T$ satisfies
$$(4.2) \qquad \begin{bmatrix} A^T \\ D(x) \end{bmatrix} \lambda = \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}.$$
If $\mathrm{rank}\, A = m$, we can express $\lambda$ as
$$(4.3) \qquad \lambda = (A A^T)^{-1} A \nabla f(x).$$
If $\mathrm{rank}\, A < m$, how can we express $\lambda$ in terms of $x$? In computation, we often prefer a polynomial expression. If there exists $L(x) \in \mathbb{R}[x]^{m \times (n+m)}$ such that
$$(4.4) \qquad L(x) C(x) = I_m,$$
then we can get
$$\lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x) \nabla f(x),$$

14 JIAWANG NIE

where $L_1(x)$ consists of the first $n$ columns of $L(x)$. In this section, we characterize when such $L(x)$ exists, and we give a degree bound for it.

The linear function $Ax - b$ is said to be nonsingular if $\mathrm{rank}\, C(u) = m$ for all $u \in \mathbb{C}^n$ (also see Definition 5.1). This is equivalent to saying that, for every $u$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then $a_{i_1}, \ldots, a_{i_k}$ are linearly independent.

Proposition 4.1. The linear function $Ax - b$ is nonsingular if and only if there exists a matrix polynomial $L(x)$ satisfying (4.4). Moreover, when $Ax - b$ is nonsingular, we can choose $L(x)$ in (4.4) with $\deg(L) \leq m - \mathrm{rank}\, A$.

Proof. Clearly, if (4.4) is satisfied by some $L(x)$, then $\mathrm{rank}\, C(u) \geq m$ for all $u$. This implies that $Ax - b$ is nonsingular.

Next, assume that $Ax - b$ is nonsingular. We show that (4.4) is satisfied by some $L(x) \in \mathbb{R}[x]^{m \times (n+m)}$ with degree $\leq m - \mathrm{rank}\, A$. Let $r = \mathrm{rank}\, A$. Up to a linear coordinate transformation, we can reduce $x$ to an $r$-dimensional variable. Without loss of generality, we can assume that $\mathrm{rank}\, A = n$ and $m \geq n$.

For a subset $I = \{i_1, \ldots, i_{m-n}\}$ of $[m]$, denote
$$c_I(x) = \prod_{i \in I} c_i(x), \qquad E_I(x) = c_I(x) \cdot \mathrm{diag}\big(c_{i_1}(x)^{-1}, \ldots, c_{i_{m-n}}(x)^{-1}\big),$$
$$D_I(x) = \mathrm{diag}\big(c_{i_1}(x), \ldots, c_{i_{m-n}}(x)\big), \qquad A_I = \big[ a_{i_1} \ \cdots \ a_{i_{m-n}} \big]^T.$$
For the case that $I = \emptyset$ (the empty set), we set $c_\emptyset(x) = 1$. Let
$$\mathcal{V} = \big\{ I \subseteq [m] \ : \ |I| = m - n, \ \mathrm{rank}\, A_{[m]\setminus I} = n \big\}.$$

Step I: For each $I \in \mathcal{V}$, we construct a matrix polynomial $L_I(x)$ such that
$$(4.5) \qquad L_I(x) C(x) = c_I(x) I_m.$$

The matrix $L_I = L_I(x)$ satisfying (4.5) can be given by the following $2 \times 3$ block matrix ($L_I(J, K)$ denotes the submatrix whose row indices are from $J$ and whose column indices are from $K$):

$$(4.6) \qquad
\begin{array}{c|ccc}
J \backslash K & [n] & n + I & n + [m]\setminus I \\ \hline
I & 0 & E_I(x) & 0 \\
{[m]\setminus I} & c_I(x)\big(A_{[m]\setminus I}\big)^{-T} & -\big(A_{[m]\setminus I}\big)^{-T} A_I^T E_I(x) & 0
\end{array}$$

Equivalently, the blocks of $L_I$ are
$$L_I\big(I, [n]\big) = 0, \qquad L_I\big(I, n + I\big) = E_I(x), \qquad L_I\big(I, n + [m]\setminus I\big) = 0,$$
$$L_I\big([m]\setminus I, [n]\big) = c_I(x)\big(A_{[m]\setminus I}\big)^{-T}, \qquad L_I\big([m]\setminus I, n + I\big) = -\big(A_{[m]\setminus I}\big)^{-T} A_I^T E_I(x),$$
$$L_I\big([m]\setminus I, n + [m]\setminus I\big) = 0.$$

For each $I \in \mathcal{V}$, $A_{[m]\setminus I}$ is invertible. The superscript $-T$ denotes the inverse of the transpose. Let $G = L_I(x) C(x)$; then one can verify that

$$G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m-n}, \qquad G(I, [m]\setminus I) = 0,$$
$$G\big([m]\setminus I, [m]\setminus I\big) = \Big[ c_I(x)\big(A_{[m]\setminus I}\big)^{-T} \quad -\big(A_{[m]\setminus I}\big)^{-T} A_I^T E_I(x) \Big] \begin{bmatrix} \big(A_{[m]\setminus I}\big)^T \\ 0 \end{bmatrix} = c_I(x) I_n,$$
$$G\big([m]\setminus I, I\big) = \Big[ c_I(x)\big(A_{[m]\setminus I}\big)^{-T} \quad -\big(A_{[m]\setminus I}\big)^{-T} A_I^T E_I(x) \Big] \begin{bmatrix} A_I^T \\ D_I(x) \end{bmatrix} = 0.$$
This shows that the above $L_I(x)$ satisfies (4.5).


Step II: We show that there exist real scalars $\nu_I$ satisfying
$$(4.7) \qquad \sum_{I \in \mathcal{V}} \nu_I c_I(x) = 1.$$
This can be shown by induction on $m$.

$\bullet$ When $m = n$: $\mathcal{V} = \{\emptyset\}$ and $c_\emptyset(x) = 1$, so (4.7) is clearly true.

$\bullet$ When $m > n$: let
$$(4.8) \qquad N = \big\{ i \in [m] \ \big| \ \mathrm{rank}\, A_{[m]\setminus\{i\}} = n \big\}.$$

For each $i \in N$, let $\mathcal{V}_i$ be the set of all $I' \subseteq [m]\setminus\{i\}$ such that $|I'| = m - n - 1$ and $\mathrm{rank}\, A_{[m]\setminus(I' \cup \{i\})} = n$. For each $i \in N$, by the assumption, the linear function $A_{[m]\setminus\{i\}} x - b_{[m]\setminus\{i\}}$ is nonsingular. By induction, there exist real scalars $\nu^{(i)}_{I'}$ satisfying
$$(4.9) \qquad \sum_{I' \in \mathcal{V}_i} \nu^{(i)}_{I'} c_{I'}(x) = 1.$$
Since $\mathrm{rank}\, A = n$, we can generally assume that $\{a_1, \ldots, a_n\}$ is linearly independent. So, there exist scalars $\alpha_1, \ldots, \alpha_n$ such that
$$a_m = \alpha_1 a_1 + \cdots + \alpha_n a_n.$$
If all $\alpha_i = 0$, then $a_m = 0$, and hence $A$ can be replaced by its first $m-1$ rows. So, (4.7) is true by induction. In the following, suppose at least one $\alpha_i \neq 0$, and write
$$\{ i : \alpha_i \neq 0 \} = \{ i_1, \ldots, i_k \}.$$
Then $a_{i_1}, \ldots, a_{i_k}, a_m$ are linearly dependent. For convenience, set $i_{k+1} = m$. Since $Ax - b$ is nonsingular, the linear system
$$c_{i_1}(x) = \cdots = c_{i_k}(x) = c_{i_{k+1}}(x) = 0$$
has no solutions. Hence, there exist real scalars $\mu_1, \ldots, \mu_{k+1}$ such that
$$\mu_1 c_{i_1}(x) + \cdots + \mu_k c_{i_k}(x) + \mu_{k+1} c_{i_{k+1}}(x) = 1.$$
This can be implied by the echelon form for inconsistent linear systems. Note that $i_1, \ldots, i_{k+1} \in N$. For each $j = 1, \ldots, k+1$, by (4.9),
$$\sum_{I' \in \mathcal{V}_{i_j}} \nu^{(i_j)}_{I'} c_{I'}(x) = 1.$$

Then we can get
$$1 = \sum_{j=1}^{k+1} \mu_j c_{i_j}(x) = \sum_{j=1}^{k+1} \mu_j \sum_{I' \in \mathcal{V}_{i_j}} \nu^{(i_j)}_{I'} c_{i_j}(x) c_{I'}(x) = \sum_{\substack{I = I' \cup \{i_j\} \\ I' \in \mathcal{V}_{i_j}, \ 1 \leq j \leq k+1}} \nu^{(i_j)}_{I'} \mu_j \, c_I(x).$$
Since each $I' \cup \{i_j\} \in \mathcal{V}$, (4.7) must be satisfied by some scalars $\nu_I$.


Step III: For $L_I(x)$ as in (4.5), we construct $L(x)$ as
$$(4.10) \qquad L(x) = \sum_{I \in \mathcal{V}} \nu_I L_I(x).$$
Clearly, $L(x)$ satisfies (4.4), because
$$L(x) C(x) = \sum_{I \in \mathcal{V}} \nu_I L_I(x) C(x) = \sum_{I \in \mathcal{V}} \nu_I c_I(x) I_m = I_m.$$
Each $L_I(x)$ has degree $\leq m - n$, so $L(x)$ has degree $\leq m - n$. $\square$

Proposition 4.1 characterizes when there exists $L(x)$ satisfying (4.4). When it does, a degree bound for $L(x)$ is $m - \mathrm{rank}\, A$. Sometimes, its degree can be smaller than that, as shown in Example 4.3. For given $A, b$, the matrix polynomial $L(x)$ satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of $L(x)C(x) = I_m$ for polyhedral sets.
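As a concrete illustration of this coefficient-matching step, the following sketch (Python with sympy; the instance is a hypothetical one-dimensional box, not one of the paper's examples) sets up an ansatz for $L(x)$ of degree $\leq m - \mathrm{rank}\, A$ and solves the resulting linear system in its coefficients.

```python
import sympy as sp

x = sp.symbols('x')
# Toy polyhedron (an assumption for illustration): x >= 0 and 1 - x >= 0,
# so m = 2, n = 1, A = [1; -1], c1(x) = x, c2(x) = 1 - x.
C = sp.Matrix([[1, -1],      # A^T
               [x, 0],       # D(x) = diag(c1, c2)
               [0, 1 - x]])

# Ansatz for L(x): degree <= m - rank(A) = 1, entries a_i + b_i * x.
a = sp.symbols('a0:6')
b = sp.symbols('b0:6')
L = sp.Matrix(2, 3, [a[i] + b[i] * x for i in range(6)])

# Match coefficients of x in L(x) C(x) - I_2 = 0: a linear system in a_i, b_i.
residual = sp.expand(L * C - sp.eye(2))
eqs = []
for entry in residual:
    eqs.extend(sp.Poly(entry, x).all_coeffs())
sol = sp.solve(eqs, list(a) + list(b), dict=True)[0]

L_sol = sp.expand(L.subs(sol))
assert sp.expand(L_sol * C) == sp.eye(2)   # L(x) C(x) = I_m holds identically
```

The system is underdetermined, so sympy returns a parametric family of valid $L(x)$; any member satisfies the identity.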

Example 4.2. Consider the simplicial set
$$\big\{ x_1 \geq 0, \ \ldots, \ x_n \geq 0, \ 1 - e^T x \geq 0 \big\}.$$
The equation $L(x) C(x) = I_{n+1}$ is satisfied by
$$L(x) = \begin{bmatrix}
1 - x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
-x_1 & 1 - x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
-x_1 & -x_2 & \cdots & 1 - x_n & 1 & \cdots & 1 \\
-x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1
\end{bmatrix}.$$
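The identity of Example 4.2 can be checked symbolically; the sketch below (assuming sympy is available) verifies $L(x)C(x) = I_{n+1}$ for $n = 3$.

```python
import sympy as sp

# Verify Example 4.2 for n = 3: constraints x_i >= 0 and 1 - e^T x >= 0.
n = 3
xs = sp.Matrix(sp.symbols('x1:4'))             # (x1, x2, x3)
e = sp.ones(n, 1)
A = sp.Matrix.vstack(sp.eye(n), -e.T)          # rows a_i of the m = n+1 constraints
c = list(xs) + [1 - (e.T * xs)[0]]             # c_i(x)
C = sp.Matrix.vstack(A.T, sp.diag(*c))         # C(x) = [A^T; D(x)], size (2n+1) x (n+1)

# L(x) from Example 4.2: entry (i,j) = delta_ij - x_j on the left, ones on the right.
left = sp.eye(n + 1, n) - sp.Matrix([[xs[j] for j in range(n)] for _ in range(n + 1)])
L = sp.Matrix.hstack(left, sp.ones(n + 1, n + 1))

assert sp.expand(L * C) == sp.eye(n + 1)       # L(x) C(x) = I_{n+1}
```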

Example 4.3. Consider the box constraint
$$\big\{ x_1 \geq 0, \ \ldots, \ x_n \geq 0, \ 1 - x_1 \geq 0, \ \ldots, \ 1 - x_n \geq 0 \big\}.$$
The equation $L(x) C(x) = I_{2n}$ is satisfied by
$$L(x) = \begin{bmatrix} I_n - \mathrm{diag}(x) & I_n & I_n \\ -\mathrm{diag}(x) & I_n & I_n \end{bmatrix}.$$

Example 4.4. Consider the polyhedral set
$$\big\{ 1 - x_4 \geq 0, \ x_4 - x_3 \geq 0, \ x_3 - x_2 \geq 0, \ x_2 - x_1 \geq 0, \ x_1 + 1 \geq 0 \big\}.$$
The equation $L(x) C(x) = I_5$ is satisfied by
$$L(x) = \frac{1}{2}\begin{bmatrix}
-x_1 - 1 & -x_2 - 1 & -x_3 - 1 & -x_4 - 1 & 1 & 1 & 1 & 1 & 1 \\
-x_1 - 1 & -x_2 - 1 & -x_3 - 1 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\
-x_1 - 1 & -x_2 - 1 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\
-x_1 - 1 & 1 - x_2 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\
1 - x_1 & 1 - x_2 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}.$$

Example 4.5. Consider the polyhedral set
$$\big\{ 1 + x_1 \geq 0, \ 1 - x_1 \geq 0, \ 2 - x_1 - x_2 \geq 0, \ 2 - x_1 + x_2 \geq 0 \big\}.$$
The matrix $L(x)$ satisfying $L(x) C(x) = I_4$ is
$$\frac{1}{6}\begin{bmatrix}
x_1^2 - 3x_1 + 2 & x_1 x_2 - x_2 & 4 - x_1 & 2 - x_1 & 1 - x_1 & 1 - x_1 \\
3x_1^2 - 3x_1 - 6 & 3x_2 + 3x_1 x_2 & 6 - 3x_1 & -3x_1 & -3x_1 - 3 & -3x_1 - 3 \\
1 - x_1^2 & -2x_2 - x_1 x_2 - 3 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2 \\
1 - x_1^2 & 3 - x_1 x_2 - 2x_2 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2
\end{bmatrix}.$$


5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers $\lambda_i$ as polynomial functions in $x$ on the set of critical points.

Suppose there are totally $m$ equality and inequality constraints, i.e.,
$$\mathcal{E} \cup \mathcal{I} = \{1, \ldots, m\}.$$
If $(x, \lambda)$ is a critical pair, then $\lambda_i c_i(x) = 0$ for all $i \in \mathcal{E} \cup \mathcal{I}$. So, the Lagrange multiplier vector $\lambda = [\lambda_1 \ \cdots \ \lambda_m]^T$ satisfies the equation
$$(5.1) \qquad \underbrace{\begin{bmatrix}
\nabla c_1(x) & \nabla c_2(x) & \cdots & \nabla c_m(x) \\
c_1(x) & 0 & \cdots & 0 \\
0 & c_2(x) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & c_m(x)
\end{bmatrix}}_{C(x)} \, \lambda \ = \ \begin{bmatrix} \nabla f(x) \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}.$$
Let $C(x)$ be as in the above. If there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ such that
$$(5.2) \qquad L(x) C(x) = I_m,$$
then we can get
$$(5.3) \qquad \lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x) \nabla f(x),$$
where $L_1(x)$ consists of the first $n$ columns of $L(x)$. Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such $L(x)$ exists.
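To make the expression (5.3) concrete, here is a small sketch assuming the single ball constraint $1 - x^Tx \geq 0$ and the hypothetical objective $f = -x_1 - x_2$ (both illustrative assumptions, not from the paper); for this constraint, $L(x) = [-\tfrac{1}{2}x^T \ \ 1]$ satisfies (5.2), and the resulting multiplier expression is checked against the KKT equation at the minimizer.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
xs = sp.Matrix([x1, x2])
f = -x1 - x2                       # hypothetical objective (illustrative assumption)
c = 1 - (xs.T * xs)[0]             # single ball constraint 1 - x^T x >= 0

# C(x) = [grad c; c] and an L(x) with L(x) C(x) = I_1, as in (5.2).
C = sp.Matrix.vstack(-2 * xs, sp.Matrix([[c]]))
L = sp.Matrix.hstack(-xs.T / 2, sp.Matrix([[1]]))
assert sp.expand(L * C) == sp.eye(1)

# Lagrange multiplier expression (5.3): lambda = L1(x) grad f(x) = (x1 + x2)/2.
grad_f = sp.Matrix([f.diff(v) for v in xs])
lam = sp.expand((L[:, :2] * grad_f)[0])

# Check the KKT equation grad f = lambda grad c at the minimizer x = (1, 1)/sqrt(2).
u = {x1: 1 / sp.sqrt(2), x2: 1 / sp.sqrt(2)}
grad_c = sp.Matrix([c.diff(v) for v in xs])
assert sp.simplify(grad_f.subs(u) - lam.subs(u) * grad_c.subs(u)) == sp.zeros(2, 1)
```

The point is that $\lambda$ becomes a polynomial in $x$ alone, valid at every critical pair.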

Definition 5.1. The tuple $c = (c_1, \ldots, c_m)$ of constraining polynomials is said to be nonsingular if $\mathrm{rank}\, C(u) = m$ for every $u \in \mathbb{C}^n$.

Clearly, $c$ being nonsingular is equivalent to saying that, for each $u \in \mathbb{C}^n$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then the gradients $\nabla c_{i_1}(u), \ldots, \nabla c_{i_k}(u)$ are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple $c$ is nonsingular.

Proposition 5.2. (i) For $W(x) \in \mathbb{C}[x]^{s \times t}$ with $s \geq t$, we have $\mathrm{rank}\, W(u) = t$ for all $u \in \mathbb{C}^n$ if and only if there exists $P(x) \in \mathbb{C}[x]^{t \times s}$ such that
$$P(x) W(x) = I_t.$$
Moreover, for $W(x) \in \mathbb{R}[x]^{s \times t}$, we can choose $P(x) \in \mathbb{R}[x]^{t \times s}$ in the above.

(ii) The constraining polynomial tuple $c$ is nonsingular if and only if there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ satisfying (5.2).

Proof. (i) "$\Leftarrow$": If $P(x) W(x) = I_t$, then for all $u \in \mathbb{C}^n$,
$$t = \mathrm{rank}\, I_t \leq \mathrm{rank}\, W(u) \leq t.$$
So, $W(x)$ must have full column rank everywhere.

"$\Rightarrow$": Suppose $\mathrm{rank}\, W(u) = t$ for all $u \in \mathbb{C}^n$. Write $W(x)$ in columns:
$$W(x) = \big[ w_1(x) \ \ w_2(x) \ \ \cdots \ \ w_t(x) \big].$$


Then, the equation $w_1(x) = 0$ does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists $\xi_1(x) \in \mathbb{C}[x]^s$ such that $\xi_1(x)^T w_1(x) = 1$. For each $i = 2, \ldots, t$, denote
$$r_{1i}(x) = \xi_1(x)^T w_i(x);$$
then (we use $\sim$ to denote row equivalence between matrices)
$$W(x) \sim \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix} \sim W_1(x) := \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ 0 & w_2^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix},$$
where each ($i = 2, \ldots, t$)
$$w_i^{(1)}(x) = w_i(x) - r_{1i}(x) w_1(x).$$
So, there exists $P_1(x) \in \mathbb{C}[x]^{(s+1) \times s}$ such that
$$P_1(x) W(x) = W_1(x).$$

Since $W(x)$ and $W_1(x)$ are row equivalent, $W_1(x)$ must also have full column rank everywhere. Similarly, the polynomial equation
$$w_2^{(1)}(x) = 0$$
does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists $\xi_2(x) \in \mathbb{C}[x]^s$ such that
$$\xi_2(x)^T w_2^{(1)}(x) = 1.$$
For each $i = 3, \ldots, t$, let $r_{2i}(x) = \xi_2(x)^T w_i^{(1)}(x)$; then
$$W_1(x) \sim \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & w_2^{(1)}(x) & w_3^{(1)}(x) & \cdots & w_t^{(1)}(x)
\end{bmatrix} \sim W_2(x) := \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & w_3^{(2)}(x) & \cdots & w_t^{(2)}(x)
\end{bmatrix},$$
where each ($i = 3, \ldots, t$)
$$w_i^{(2)}(x) = w_i^{(1)}(x) - r_{2i}(x) w_2^{(1)}(x).$$
Similarly, $W_1(x)$ and $W_2(x)$ are row equivalent, so $W_2(x)$ has full column rank everywhere. There exists $P_2(x) \in \mathbb{C}[x]^{(s+2) \times (s+1)}$ such that
$$P_2(x) W_1(x) = W_2(x).$$
Continuing this process, we can finally get
$$W_2(x) \sim \cdots \sim W_t(x) := \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & 1 & \cdots & r_{3t}(x) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}.$$
Consequently, for $i = 1, 2, \ldots, t$, there exists $P_i(x) \in \mathbb{C}[x]^{(s+i) \times (s+i-1)}$ such that
$$P_t(x) P_{t-1}(x) \cdots P_1(x) W(x) = W_t(x).$$


Since $W_t(x)$ is a unit upper triangular matrix polynomial, there exists $P_{t+1}(x) \in \mathbb{C}[x]^{t \times (s+t)}$ such that $P_{t+1}(x) W_t(x) = I_t$. Let
$$P(x) = P_{t+1}(x) P_t(x) P_{t-1}(x) \cdots P_1(x);$$
then $P(x) W(x) = I_t$. Note that $P(x) \in \mathbb{C}[x]^{t \times s}$.

For $W(x) \in \mathbb{R}[x]^{s \times t}$, we can replace $P(x)$ by $\big( P(x) + \overline{P(x)} \big)/2$ (here $\overline{P(x)}$ denotes the complex conjugate of $P(x)$), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). $\square$

In Proposition 5.2, there is no explicit degree bound for $L(x)$ satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for $L(x)$, it can be determined by comparing coefficients of both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of $L(x)$ satisfying (5.2).
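The coefficient-comparison procedure just described can be sketched as follows, for a hypothetical single quadratic constraint $c = 1 - x_1^2 - x_2^2$ (an illustrative choice, not one of the examples below), with a degree-1 ansatz for $L(x)$:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
c = 1 - x1**2 - x2**2                        # one nonsingular constraint (m = 1, n = 2)
C = sp.Matrix([c.diff(x1), c.diff(x2), c])   # C(x) = [grad c; c], size 3 x 1

# Degree-1 ansatz for L(x) in R[x]^{1 x 3}; matching coefficients gives a linear system.
coeffs = sp.symbols('u0:9')
monos = [sp.Integer(1), x1, x2]
L = sp.Matrix(1, 3, [sum(coeffs[3*j + k] * monos[k] for k in range(3))
                     for j in range(3)])

residual = sp.expand((L * C)[0] - 1)
eqs = sp.Poly(residual, x1, x2).coeffs()     # every coefficient must vanish
sol = sp.solve(eqs, list(coeffs), dict=True)[0]
L_sol = sp.expand(L.subs(sol))

assert sp.expand((L_sol * C)[0]) == 1
# One particular solution is L(x) = [-x1/2, -x2/2, 1], since
# (-x1/2)(-2x1) + (-x2/2)(-2x2) + (1 - x1^2 - x2^2) = 1.
```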

Example 5.3. Consider the hypercube with quadratic constraints
$$\big\{ 1 - x_1^2 \geq 0, \ 1 - x_2^2 \geq 0, \ \ldots, \ 1 - x_n^2 \geq 0 \big\}.$$
The equation $L(x) C(x) = I_n$ is satisfied by
$$L(x) = \big[ -\tfrac{1}{2}\mathrm{diag}(x) \quad I_n \big].$$
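A symbolic check of Example 5.3 (a sketch assuming sympy), for $n = 3$:

```python
import sympy as sp

# Verify Example 5.3 for n = 3: constraints 1 - x_i^2 >= 0.
n = 3
xs = sp.symbols('x1:4')
grads = -2 * sp.diag(*xs)                    # column i of the top block is grad c_i = -2 x_i e_i
C = sp.Matrix.vstack(grads, sp.diag(*[1 - v**2 for v in xs]))
L = sp.Matrix.hstack(-sp.diag(*xs) / 2, sp.eye(n))

assert sp.expand(L * C) == sp.eye(n)         # L(x) C(x) = I_n
```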

Example 5.4. Consider the nonnegative portion of the unit sphere
$$\big\{ x_1 \geq 0, \ x_2 \geq 0, \ \ldots, \ x_n \geq 0, \ x_1^2 + \cdots + x_n^2 - 1 = 0 \big\}.$$
The equation $L(x) C(x) = I_{n+1}$ is satisfied by
$$L(x) = \begin{bmatrix} I_n - x x^T & x \mathbf{1}_n^T & 2x \\ \tfrac{1}{2} x^T & -\tfrac{1}{2}\mathbf{1}_n^T & -1 \end{bmatrix}.$$

Example 5.5. Consider the set
$$\big\{ 1 - x_1^3 - x_2^4 \geq 0, \ 1 - x_3^4 - x_4^3 \geq 0 \big\}.$$
The equation $L(x) C(x) = I_2$ is satisfied by
$$L(x) = \begin{bmatrix} -\tfrac{x_1}{3} & -\tfrac{x_2}{4} & 0 & 0 & 1 & 0 \\ 0 & 0 & -\tfrac{x_3}{4} & -\tfrac{x_4}{3} & 0 & 1 \end{bmatrix}.$$

Example 5.6. Consider the quadratic set
$$\big\{ 1 - x_1 x_2 - x_2 x_3 - x_1 x_3 \geq 0, \ 1 - x_1^2 - x_2^2 - x_3^2 \geq 0 \big\}.$$
The matrix $L(x)^T$ satisfying $L(x) C(x) = I_2$ is
$$\begin{bmatrix}
25x_1^3 + 10x_1^2 x_2 + 40x_1 x_2^2 - 25x_1 - 2x_3 & -25x_1^3 - 10x_1^2 x_2 - 40x_1 x_2^2 + \tfrac{49 x_1}{2} + 2x_3 \\
-15x_1^2 x_2 + 10x_1 x_2^2 + 20x_3 x_1 x_2 - 10x_1 & 15x_1^2 x_2 - 10x_1 x_2^2 - 20x_3 x_1 x_2 + 10x_1 - \tfrac{x_2}{2} \\
25x_3 x_1^2 - 20x_1 x_2^2 + 10x_3 x_1 x_2 + 2x_1 & -25x_3 x_1^2 + 20x_1 x_2^2 - 10x_3 x_1 x_2 - 2x_1 - \tfrac{x_3}{2} \\
1 - 20x_1 x_3 - 10x_1^2 - 20x_1 x_2 & 20x_1 x_2 + 20x_1 x_3 + 10x_1^2 \\
-50x_1^2 - 20x_2 x_1 & 50x_1^2 + 20x_2 x_1 + 1
\end{bmatrix}.$$

We would like to remark that a polynomial tuple $c = (c_1, \ldots, c_m)$ is generically nonsingular, so Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees $d_1, \ldots, d_m$, there exists an open dense subset $\mathcal{U}$ of $\mathcal{D} := \mathbb{R}[x]_{d_1} \times \cdots \times \mathbb{R}[x]_{d_m}$, such that every tuple $c = (c_1, \ldots, c_m) \in \mathcal{U}$ is nonsingular. Indeed, such $\mathcal{U}$ can be chosen as a Zariski open subset of $\mathcal{D}$, i.e., it is the complement of a proper real variety of $\mathcal{D}$. Moreover, Assumption 3.1 holds for all $c \in \mathcal{U}$, i.e., it holds generically.


Proof. The proof needs to use resultants and discriminants, for which we refer to [29].

First, let $J_1$ be the set of all $(i_1, \ldots, i_{n+1})$ with $1 \leq i_1 < \cdots < i_{n+1} \leq m$. The resultant $\mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}})$ [29, Section 2] is a polynomial in the coefficients of $c_{i_1}, \ldots, c_{i_{n+1}}$, such that if $\mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}}) \neq 0$ then the equations
$$c_{i_1}(x) = \cdots = c_{i_{n+1}}(x) = 0$$
have no complex solutions. Define
$$F_1(c) = \prod_{(i_1, \ldots, i_{n+1}) \in J_1} \mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}}).$$
For the case that $m \leq n$, $J_1 = \emptyset$, and we simply let $F_1(c) = 1$. Clearly, if $F_1(c) \neq 0$, then no more than $n$ of the polynomials $c_1, \ldots, c_m$ can have a common complex zero.

Second, let $J_2$ be the set of all $(j_1, \ldots, j_k)$ with $k \leq n$ and $1 \leq j_1 < \cdots < j_k \leq m$. When one of $c_{j_1}, \ldots, c_{j_k}$ has degree bigger than one, the discriminant $\Delta(c_{j_1}, \ldots, c_{j_k})$ is a polynomial in the coefficients of $c_{j_1}, \ldots, c_{j_k}$, such that if $\Delta(c_{j_1}, \ldots, c_{j_k}) \neq 0$ then the equations
$$c_{j_1}(x) = \cdots = c_{j_k}(x) = 0$$
have no singular complex solution [29, Section 3], i.e., at every common complex solution $u$, the gradients of $c_{j_1}, \ldots, c_{j_k}$ at $u$ are linearly independent. When all of $c_{j_1}, \ldots, c_{j_k}$ have degree one, the discriminant of the tuple $(c_{j_1}, \ldots, c_{j_k})$ is not a single polynomial, but we can define $\Delta(c_{j_1}, \ldots, c_{j_k})$ to be the product of all maximal minors of its Jacobian (a constant matrix). Define
$$F_2(c) = \prod_{(j_1, \ldots, j_k) \in J_2} \Delta(c_{j_1}, \ldots, c_{j_k}).$$
Clearly, if $F_2(c) \neq 0$, then no $n$ or fewer of the polynomials $c_1, \ldots, c_m$ can have a singular common complex zero.

Last, let $F(c) = F_1(c) F_2(c)$ and
$$\mathcal{U} = \big\{ c = (c_1, \ldots, c_m) \in \mathcal{D} : F(c) \neq 0 \big\}.$$
Note that $\mathcal{U}$ is a Zariski open subset of $\mathcal{D}$, and it is open dense in $\mathcal{D}$. For all $c \in \mathcal{U}$, no more than $n$ of $c_1, \ldots, c_m$ can have a common complex zero, and for any $k$ polynomials ($k \leq n$) of $c_1, \ldots, c_m$, if they have a common complex zero, say $u$, then their gradients at $u$ must be linearly independent. This means that $c$ is a nonsingular tuple.

Since every $c \in \mathcal{U}$ is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all $c \in \mathcal{U}$, so it holds generically. $\square$

6. Numerical examples

This section gives examples of applying the new relaxations (3.8)-(3.9) to solve the optimization problem (1.1), with the usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.


The polynomials $p_i$ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as $c_1, \ldots, c_m$. First, find a matrix polynomial $L(x)$ satisfying (4.4) or (5.2). Let $L_1(x)$ be the submatrix of $L(x)$ consisting of its first $n$ columns. Then choose $(p_1, \ldots, p_m)$ to be the product $L_1(x) \nabla f(x)$, i.e.,
$$p_i = \big( L_1(x) \nabla f(x) \big)_i.$$
In all our examples, the global minimum value $f_{min}$ of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., the sublevel set $\{ f(x) \leq \ell \}$ is compact for all $\ell$) and the constraint qualification condition holds.

By Theorem 3.3, we have $f_k = f_{min}$ for all $k$ big enough, if $f_c = f_{min}$ and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied in all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, together with the computational time (in seconds). The results for the standard Lasserre relaxations are titled "w.o. LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".

Example 6.1. Consider the optimization problem
$$\begin{array}{rl} \min & x_1 x_2 (10 - x_3) \\ \text{s.t.} & x_1 \geq 0, \ x_2 \geq 0, \ x_3 \geq 0, \ 1 - x_1 - x_2 - x_3 \geq 0. \end{array}$$
The matrix polynomial $L(x)$ is given in Example 4.2. Since the feasible set is compact, the minimum $f_{min} = 0$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point $(x_1, x_2, x_3)$ with $x_1 x_2 = 0$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that $f_k = f_{min}$ for all $k \geq 3$, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

    order k |  w.o. LME: lower bound (time) |  with LME: lower bound (time)
       2    |  -0.0521   (0.6841)           |  -0.0521    (0.1922)
       3    |  -0.0026   (0.2657)           |  -3·10^-8   (0.2285)
       4    |  -0.0007   (0.6785)           |  -6·10^-9   (0.4431)
       5    |  -0.0004   (1.6105)           |  -2·10^-9   (0.9567)

² Note that $1 - x^T x = (1 - e^T x)(1 + x^T x) + \sum_{i=1}^n x_i (1 - x_i)^2 + \sum_{i \neq j} x_i^2 x_j \in IQ(c_{eq}, c_{in})$.
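The algebraic identity $1 - x^T x = (1 - e^T x)(1 + x^T x) + \sum_i x_i(1 - x_i)^2 + \sum_{i \neq j} x_i^2 x_j$ used in the footnote can be verified symbolically; the sketch below checks it for $n = 3$:

```python
import sympy as sp

# Check the footnote identity for n = 3.
xs = sp.symbols('x1:4')
s1 = sum(xs)                        # e^T x
s2 = sum(v**2 for v in xs)          # x^T x
rhs = ((1 - s1) * (1 + s2)
       + sum(v * (1 - v)**2 for v in xs)
       + sum(xi**2 * xj for xi in xs for xj in xs if xi != xj))

assert sp.expand(rhs) == sp.expand(1 - s2)
```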


Example 6.2. Consider the optimization problem
$$\begin{array}{rl} \min & x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3 x_1^2 x_2^2 x_3^2 + (x_1^4 + x_2^4 + x_3^4) \\ \text{s.t.} & x_1^2 + x_2^2 + x_3^2 \geq 1. \end{array}$$
The matrix polynomial $L(x) = \big[ \tfrac{1}{2} x_1 \ \ \tfrac{1}{2} x_2 \ \ \tfrac{1}{2} x_3 \ \ -1 \big]$. The objective $f$ is the sum of the positive definite form $x_1^4 + x_2^4 + x_3^4$ and the Motzkin polynomial
$$M(x) = x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3 x_1^2 x_2^2 x_3^2.$$
Note that $M(x)$ is nonnegative everywhere, but it is not SOS [37]. Clearly, $f$ is coercive and $f_{min}$ is achieved at a critical point. The set $IQ(\phi, \psi)$ is archimedean, because
$$c_1(x) p_1(x) = (x_1^2 + x_2^2 + x_3^2 - 1)\big( 3M(x) + 2(x_1^4 + x_2^4 + x_3^4) \big) = 0$$
defines a compact set. So, the condition i) of Theorem 3.3 is satisfied.³ The minimum value $f_{min} = \tfrac{1}{3}$, and there are 8 minimizers $\big( \pm\tfrac{1}{\sqrt{3}}, \ \pm\tfrac{1}{\sqrt{3}}, \ \pm\tfrac{1}{\sqrt{3}} \big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that $f_k = f_{min}$ for all $k \geq 4$, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

    order k |  w.o. LME: lower bound (time) |  with LME: lower bound (time)
       3    |  -infinity     (0.4466)       |  0.1111  (0.1169)
       4    |  -infinity     (0.4948)       |  0.3333  (0.3499)
       5    |  -2.1821·10^5  (1.1836)       |  0.3333  (0.6530)

Example 6.3. Consider the optimization problem
$$\begin{array}{rl} \min & x_1 x_2 + x_2 x_3 + x_3 x_4 - 3 x_1 x_2 x_3 x_4 + (x_1^3 + \cdots + x_4^3) \\ \text{s.t.} & x_1, x_2, x_3, x_4 \geq 0, \ \ 1 - x_1 - x_2 \geq 0, \ \ 1 - x_3 - x_4 \geq 0. \end{array}$$
The matrix polynomial $L(x)$ is
$$\begin{bmatrix}
1 - x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
-x_1 & 1 - x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 - x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & -x_3 & 1 - x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}.$$
The feasible set is compact, so $f_{min}$ is achieved at a critical point. One can show that $f_{min} = 0$ and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because $IQ(c_{eq}, c_{in})$ is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because $-c_1^2 p_1^2 \in \mathrm{Ideal}(\phi) \subseteq IQ(\phi, \psi)$ and the set $\{ -c_1(x)^2 p_1(x)^2 \geq 0 \}$ is compact.

⁴ This is because $1 - x_1^2 - x_2^2$ belongs to the quadratic module of $(x_1, x_2, 1 - x_1 - x_2)$, and $1 - x_3^2 - x_4^2$ belongs to the quadratic module of $(x_3, x_4, 1 - x_3 - x_4)$. See the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

    order k |  w.o. LME: lower bound (time) |  with LME: lower bound (time)
       3    |  -2.9·10^-5   (0.7335)        |  -6·10^-7  (0.6091)
       4    |  -1.4·10^-5   (2.5055)        |  -8·10^-8  (2.7423)
       5    |  -1.4·10^-5  (12.7092)        |  -5·10^-8  (13.7449)

Example 6.4. Consider the polynomial optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^2} & x_1^2 + 50 x_2^2 \\ \text{s.t.} & x_1^2 - \tfrac{1}{2} \geq 0, \ \ x_2^2 - 2 x_1 x_2 - \tfrac{1}{8} \geq 0, \ \ x_2^2 + 2 x_1 x_2 - \tfrac{1}{8} \geq 0. \end{array}$$
It is motivated by an example in [12, \S 3]. The first column of $L(x)$ is

$$\begin{bmatrix}
\dfrac{8 x_1^3}{5} + \dfrac{x_1}{5} \\[6pt]
\dfrac{288 x_2 x_1^4}{5} - \dfrac{16 x_1^3}{5} - \dfrac{124 x_2 x_1^2}{5} + \dfrac{8 x_1}{5} - 2 x_2 \\[6pt]
-\dfrac{288 x_2 x_1^4}{5} - \dfrac{16 x_1^3}{5} + \dfrac{124 x_2 x_1^2}{5} + \dfrac{8 x_1}{5} + 2 x_2
\end{bmatrix}$$

and the second column of $L(x)$ is
$$\begin{bmatrix}
-\dfrac{8 x_1^2 x_2}{5} + \dfrac{4 x_2^3}{5} - \dfrac{x_2}{10} \\[6pt]
\dfrac{288 x_1^3 x_2^2}{5} + \dfrac{16 x_1^2 x_2}{5} - \dfrac{142 x_1 x_2^2}{5} - \dfrac{9 x_1}{20} - \dfrac{8 x_2^3}{5} + \dfrac{11 x_2}{5} \\[6pt]
-\dfrac{288 x_1^3 x_2^2}{5} + \dfrac{16 x_1^2 x_2}{5} + \dfrac{142 x_1 x_2^2}{5} + \dfrac{9 x_1}{20} - \dfrac{8 x_2^3}{5} + \dfrac{11 x_2}{5}
\end{bmatrix}.$$

The objective is coercive, so $f_{min}$ is achieved at a critical point. The minimum value $f_{min} = 56 + \tfrac{3}{4} + 25\sqrt{5} \approx 112.6517$, and the minimizers are $\big( \pm\sqrt{1/2}, \ \pm(\sqrt{5/8} + \sqrt{1/2}) \big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that $f_k = f_{min}$ for all $k \geq 4$, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

    order k |  w.o. LME: lower bound (time) |  with LME: lower bound (time)
       3    |   6.7535  (0.4611)            |   56.7500  (0.1309)
       4    |   6.9294  (0.2428)            |  112.6517  (0.2405)
       5    |   8.8519  (0.3376)            |  112.6517  (0.2167)
       6    |  16.5971  (0.4703)            |  112.6517  (0.3788)
       7    |  35.4756  (0.6536)            |  112.6517  (0.4537)

Example 6.5. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^3} & x_1^3 + x_2^3 + x_3^3 + 4 x_1 x_2 x_3 - \big( x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2) \big) \\ \text{s.t.} & x_1 \geq 0, \ \ x_1 x_2 - 1 \geq 0, \ \ x_2 x_3 - 1 \geq 0. \end{array}$$
The matrix polynomial $L(x)$ is
$$\begin{bmatrix} 1 - x_1 x_2 & 0 & 0 & x_2 & x_2 & 0 \\ x_1 & 0 & 0 & -1 & -1 & 0 \\ -x_1 & x_2 & 0 & 1 & 0 & -1 \end{bmatrix}.$$


The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant $\mathbb{R}^3_+$, so the minimum value is achieved at a critical point. In computation, we got $f_{min} \approx 0.9492$ and a global minimizer $(0.9071, 1.1024, 0.9071)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that $f_k = f_{min}$ for all $k \geq 3$, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

    order k |  w.o. LME: lower bound (time) |  with LME: lower bound (time)
       2    |  -infinity      (0.4129)      |  -infinity  (0.1900)
       3    |  -7.8184·10^6   (0.4641)      |  0.9492     (0.3139)
       4    |  -2.0575·10^4   (0.6499)      |  0.9492     (0.5057)

Example 6.6. Consider the optimization problem ($x_0 = 1$)
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & x^T x + \sum\limits_{i=0}^{4} \prod\limits_{j \neq i} (x_i - x_j) \\ \text{s.t.} & x_1^2 - 1 \geq 0, \ \ x_2^2 - 1 \geq 0, \ \ x_3^2 - 1 \geq 0, \ \ x_4^2 - 1 \geq 0. \end{array}$$
The matrix polynomial $L(x) = \big[ \tfrac{1}{2}\mathrm{diag}(x) \ \ -I_4 \big]$. The first part of the objective is $x^T x$, while the second part is a nonnegative polynomial [37]. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} = 4.0000$ and 11 global minimizers:
$$(1, 1, 1, 1), \ (1, -1, -1, 1), \ (1, -1, 1, -1), \ (1, 1, -1, -1),$$
$$(1, -1, -1, -1), \ (-1, -1, 1, 1), \ (-1, 1, -1, 1), \ (-1, 1, 1, -1),$$
$$(-1, -1, -1, 1), \ (-1, -1, 1, -1), \ (-1, 1, -1, -1).$$

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that $f_k = f_{min}$ for all $k \geq 4$, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

    order k |  w.o. LME: lower bound (time) |  with LME: lower bound (time)
       3    |  -infinity      (1.1377)      |  3.5480   (1.1765)
       4    |  -6.6913·10^4   (4.7677)      |  4.0000   (3.0761)
       5    |  -21.3778      (22.9970)      |  4.0000  (10.3354)

Example 6.7. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^3} & x_1^4 x_2^2 + x_2^4 x_3^2 + x_3^4 x_1^2 - 3 x_1^2 x_2^2 x_3^2 + x_2^2 \\ \text{s.t.} & x_1 - x_2 x_3 \geq 0, \ \ -x_2 + x_3^2 \geq 0. \end{array}$$
The matrix polynomial
$$L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}.$$
By the arithmetic-geometric mean inequality, one can show that $f_{min} = 0$. The global minimizers are the points $(x_1, 0, x_3)$ with $x_1 \geq 0$ and $x_1 x_3 = 0$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that $f_k = f_{min}$ for all $k \geq 5$, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

    order k |  w.o. LME: lower bound (time) |  with LME: lower bound (time)
       3    |  -infinity      (0.6144)      |  -infinity   (0.3418)
       4    |  -1.0909·10^7   (1.0542)      |  -3.9476     (0.7180)
       5    |  -942.6772      (1.6771)      |  -3·10^-9    (1.4607)
       6    |  -0.0110        (3.3532)      |  -8·10^-10   (3.1618)

Example 6.8. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & x_1^2(x_1 - x_4)^2 + x_2^2(x_2 - x_4)^2 + x_3^2(x_3 - x_4)^2 \\ & + \, 2 x_1 x_2 x_3 (x_1 + x_2 + x_3 - 2 x_4) + (x_1 - 1)^2 + (x_2 - 1)^2 + (x_3 - 1)^2 \\ \text{s.t.} & x_1 - x_2 \geq 0, \ \ x_2 - x_3 \geq 0. \end{array}$$
The matrix polynomial
$$L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} \approx 0.9413$ and the minimizer
$$(0.5632, \ 0.5632, \ 0.5632, \ 0.7510).$$
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that $f_k = f_{min}$ for all $k \geq 3$, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

    order k |  w.o. LME: lower bound (time) |  with LME: lower bound (time)
       2    |  -infinity      (0.3984)      |  -0.3360  (0.9321)
       3    |  -infinity      (0.7634)      |  0.9413   (0.5240)
       4    |  -6.4896·10^5   (4.5496)      |  0.9413   (1.7192)
       5    |  -3.1645·10^3  (24.3665)      |  0.9413   (8.1228)

Example 6.9. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & (x_1 + x_2 + x_3 + x_4 + 1)^2 - 4 (x_1 x_2 + x_2 x_3 + x_3 x_4 + x_4 + x_1) \\ \text{s.t.} & 0 \leq x_1, \ldots, x_4 \leq 1. \end{array}$$
The matrix $L(x)$ is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so $f_{min}$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value $f_{min} = 0$. For each $t \in [0, 1]$, the point $(t, 0, 0, 1-t)$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that $4 - \sum_{i=1}^4 x_i^2 = \sum_{i=1}^4 \big( x_i (1 - x_i)^2 + (1 - x_i)(1 + x_i^2) \big) \in IQ(c_{eq}, c_{in})$.


Table 9. Computational results for Example 6.9

    order k |  w.o. LME: lower bound (time) |  with LME: lower bound (time)
       2    |  -0.0279   (0.2262)           |  -5·10^-6   (1.1835)
       3    |  -0.0005   (0.4691)           |  -6·10^-7   (1.6566)
       4    |  -0.0001   (3.1098)           |  -2·10^-7   (5.5234)
       5    |  -4·10^-5 (16.5092)           |  -6·10^-7  (19.7320)

For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest-order relaxation is often tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but they might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem ($x_0 = 1$)
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^n} & \sum\limits_{0 \leq i \leq j \leq k \leq n} c_{ijk} \, x_i x_j x_k \\ \text{s.t.} & 0 \leq x \leq 1, \end{array}$$
where each coefficient $c_{ijk}$ is randomly generated (by randn in MATLAB). The matrix $L(x)$ is the same as in Example 4.3. Since the feasible set is compact, we always have $f_c = f_{min}$. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order $k = 2$, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each $n$ in the table, we generate 10 random instances, and we show the average of the consumed time. For all the instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order $k = 2$, while their consumed time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

        n    |    9       10      11       12       13       14
    w.o. LME |  1.2569  2.5619  6.3085  15.8722  35.1675  78.4111
    with LME |  1.9714  3.8288  8.2519  20.0310  37.6373  82.4778

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value $f_{min}$ is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that
$$IQ(c_{eq}, c_{in})_{2k} + IQ(\phi, \psi)_{2k} = \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(c_{in}, \psi)_{2k}.$$
If we replace the quadratic module $\mathrm{Qmod}(c_{in}, \psi)$ by the preordering of $(c_{in}, \psi)$ [21, 25], we can get even tighter relaxations. For convenience, write $(c_{in}, \psi)$ as a single tuple $(g_1, \ldots, g_\ell)$. Its preordering is the set
$$\mathrm{Preord}(c_{in}, \psi) := \sum_{r_1, \ldots, r_\ell \in \{0, 1\}} g_1^{r_1} \cdots g_\ell^{r_\ell} \, \Sigma[x].$$


The truncation $\mathrm{Preord}(c_{in}, \psi)_{2k}$ is defined similarly to $\mathrm{Qmod}(c_{in}, \psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same order $k$, is
$$(7.1) \qquad \begin{array}{rl} f'_{pre,k} := \min & \langle f, y \rangle \\ \text{s.t.} & \langle 1, y \rangle = 1, \ \ L^{(k)}_{c_{eq}}(y) = 0, \ \ L^{(k)}_{\phi}(y) = 0, \\ & L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \ \ \forall \, r_1, \ldots, r_\ell \in \{0, 1\}, \\ & y \in \mathbb{R}^{\mathbb{N}^n_{2k}}. \end{array}$$

Similar to (3.9), the dual optimization problem of the above is
$$(7.2) \qquad \begin{array}{rl} f_{pre,k} := \max & \gamma \\ \text{s.t.} & f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Preord}(c_{in}, \psi)_{2k}. \end{array}$$

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose $K_c \neq \emptyset$ and Assumption 3.1 holds. Then,
$$f_{pre,k} = f'_{pre,k} = f_c$$
for all $k$ sufficiently large. Therefore, if the minimum value $f_{min}$ of (1.1) is achieved at a critical point, then $f_{pre,k} = f'_{pre,k} = f_{min}$ for all $k$ big enough.

Proof. The proof is very similar to that of Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have $f(x) \equiv 0$ on the set
$$K_3 = \big\{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ c_{in}(x) \geq 0, \ \psi(x) \geq 0 \big\}.$$
By the Positivstellensatz, there exist an integer $\ell > 0$ and $q \in \mathrm{Preord}(c_{in}, \psi)$ such that $f^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi)$. The rest of the proof is the same. $\square$

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple $c$ of constraining polynomials is nonsingular, then there exists a matrix polynomial $L(x)$ such that $L(x) C(x) = I_m$. Hence, the Lagrange multiplier vector $\lambda$ can be expressed as in (5.3). However, if $c$ is singular, then such $L(x)$ does not exist. For such cases, how can we express $\lambda$ in terms of $x$, for critical pairs $(x, \lambda)$? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for $L(x)$. For a nonsingular tuple $c$ of constraining polynomials, what is a good degree bound for $L(x)$ in Proposition 5.2? When $c$ is linear, a degree bound is given in Proposition 4.1. However, for nonlinear $c$, an explicit degree bound is not known. Theoretically, we can get a degree bound for $L(x)$: in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used $t$ times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is used for each of these applications, then a degree bound for $L(x)$ can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for $L(x)$.

28 JIAWANG NIE

7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector $\lambda$ is determined by a linear equation. Naturally, one can get
$$ \lambda = \big( C(x)^T C(x) \big)^{-1} C(x)^T \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} $$
when $C(x)$ has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, $\lambda$ might have rational representations other than the above one. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
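The least-squares formula above is easy to evaluate numerically at a given point. The sketch below is my own illustration (not part of the paper): it takes the sphere constraint $c_1(x) = 1 - x^T x$ with the sample objective $f(x) = x_1$ (both assumptions of this example), stacks $C(x) = [\nabla c_1(x);\, c_1(x)]$, and recovers the multiplier by least squares; at a critical point the result agrees with the closed-form expression (3.5), $\lambda_1 = -\tfrac12 x^T \nabla f(x)$.

```python
import numpy as np

# A numerical sketch (not from the paper): evaluate the Lagrange multiplier
# via the least-squares formula
#     lambda = (C(x)^T C(x))^{-1} C(x)^T [grad f(x); 0]
# for the sphere constraint c1(x) = 1 - x^T x and the sample f(x) = x1.
def multiplier(x):
    n = len(x)
    grad_f = np.zeros(n); grad_f[0] = 1.0   # gradient of f(x) = x1
    grad_c = -2.0 * x                       # gradient of c1(x) = 1 - x^T x
    c_val = 1.0 - x @ x
    C = np.concatenate([grad_c, [c_val]]).reshape(-1, 1)
    rhs = np.concatenate([grad_f, [0.0]])
    lam, *_ = np.linalg.lstsq(C, rhs, rcond=None)
    return lam[0]

x = np.array([1.0, 0.0])      # a critical point of f on the unit sphere
print(multiplier(x))          # -0.5, matching lambda_1 = -x^T grad f(x) / 2
```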

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra, third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on noncompact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J. B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci. 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J. B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (2001), no. 3, pp. 796–817.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 29

[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J. B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J. B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J. B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J. B. Lasserre, K. C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (M. Putinar and S. Sullivant, eds.), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: Selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P. A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: A guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl. 149, Springer, 2009, pp. 271–324.
[40] J. F. Sturm. SeDuMi 1.02: A MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics 97, American Mathematical Society, Providence, RI, 2002.
[42] H. H. Vui and P. T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA, 92093

E-mail address: njw@math.ucsd.edu


The second equation in (1.4) is called the complementarity condition. If $\lambda_j + c_j(u) > 0$ for all $j \in \mathcal{I}$, the strict complementarity condition (SCC) is said to hold. For the $\lambda_i$'s satisfying (1.3)-(1.5), the associated Lagrange function is
$$ \mathscr{L}(x) = f(x) - \sum_{i \in \mathcal{E} \cup \mathcal{I}} \lambda_i c_i(x). $$
Under the constraint qualification condition, the second order necessary condition (SONC) holds at $u$, i.e. ($\nabla^2$ denotes the Hessian):
(1.6) $\quad v^T \big( \nabla^2 \mathscr{L}(u) \big) v \ge 0 \quad \text{for all } v \in \bigcap_{i \in J(u)} \nabla c_i(u)^\perp.$
Here, $\nabla c_i(u)^\perp$ is the orthogonal complement of $\nabla c_i(u)$. If it further holds that
(1.7) $\quad v^T \big( \nabla^2 \mathscr{L}(u) \big) v > 0 \quad \text{for all } 0 \ne v \in \bigcap_{i \in J(u)} \nabla c_i(u)^\perp,$
then the second order sufficient condition (SOSC) is said to hold. If the constraint qualification condition holds at $u$, then (1.3), (1.4) and (1.6) are necessary conditions for $u$ to be a local minimizer. If (1.3), (1.4), (1.7) and the strict complementarity condition hold, then $u$ is a strict local minimizer.

1.2. Some existing work. Under the archimedean condition (see §2), the hierarchy of Lasserre's relaxations converges asymptotically [17]. Moreover, in addition to the archimedeanness, if the constraint qualification, strict complementarity and second order sufficient conditions hold at every global minimizer, then Lasserre's hierarchy converges in finitely many steps [33]. For convex polynomial optimization, Lasserre's hierarchy has finite convergence under the strict convexity or sos-convexity condition [7, 20]. For unconstrained polynomial optimization, the standard sum of squares relaxation was proposed in [35]. When the equality constraints define a finite set, Lasserre's hierarchy also has finite convergence, as shown in [18, 24, 31]. Recently, a bounded degree hierarchy of relaxations was proposed for solving polynomial optimization [23]. General introductions to polynomial optimization and moment problems can be found in the books and surveys [21, 22, 25, 26, 39]. Lasserre's relaxations provide lower bounds for the minimum value. There also exist methods that compute upper bounds [8, 19]; a convergence rate analysis for such upper bounds is given in [9, 10]. When a polynomial optimization problem does not have minimizers (i.e., the infimum is not achievable), there are relaxation methods for computing the infimum [38, 42].

A new type of Lasserre relaxations, based on Jacobian representations, was recently proposed in [30]. The hierarchy of such relaxations always has finite convergence when the tuple of constraining polynomials is nonsingular (i.e., at every point in $\mathbb{C}^n$, the gradients of active constraining polynomials are linearly independent; see Definition 5.1). When there are only equality constraints $c_1(x) = \cdots = c_m(x) = 0$, the method needs the maximal minors of the matrix
$$ \begin{bmatrix} \nabla f(x) & \nabla c_1(x) & \cdots & \nabla c_m(x) \end{bmatrix}. $$
When there are inequality constraints, it requires enumerating all possibilities of active constraints. The method in [30] is expensive when there are a lot of constraints. For unconstrained optimization, it reduces to the gradient sum of squares relaxations in [27].


1.3. New contributions. When Lasserre's relaxations are used to solve polynomial optimization, the following issues are typically of concern:

• The convergence depends on the archimedean condition (see §2), which is satisfied only if the feasible set is compact. If the set is noncompact, how can we get convergent relaxations?

• The cost of Lasserre's relaxations depends significantly on the relaxation order. For a fixed order, can we construct tighter relaxations than the standard ones?

• When the convergence of Lasserre's relaxations is slow, can we construct new relaxations whose convergence is faster?

• When the optimality conditions fail to hold, Lasserre's hierarchy might not have finite convergence. Can we construct a new hierarchy of stronger relaxations that also has finite convergence for such cases?

This paper addresses the above issues. We construct tighter relaxations by using optimality conditions. In (1.3)-(1.4), under the constraint qualification condition, the Lagrange multipliers $\lambda_i$ are uniquely determined by $u$. Consider the polynomial system in $(x, \lambda)$:
(1.8) $\quad \sum_{i \in \mathcal{E} \cup \mathcal{I}} \lambda_i \nabla c_i(x) = \nabla f(x), \quad c_i(x) = 0 \ (i \in \mathcal{E}), \quad \lambda_j c_j(x) = 0 \ (j \in \mathcal{I}).$

A point $x$ satisfying (1.8) is called a critical point, and such a pair $(x, \lambda)$ is called a critical pair. In (1.8), once $x$ is known, $\lambda$ can be determined by linear equations. Generally, the value of $x$ is not known. One can try to express $\lambda$ as a rational function in $x$. Suppose $\mathcal{E} \cup \mathcal{I} = \{1, \ldots, m\}$ and denote
$$ G(x) = \begin{bmatrix} \nabla c_1(x) & \cdots & \nabla c_m(x) \end{bmatrix}. $$
When $m \le n$ and $\operatorname{rank} G(x) = m$, we can get the rational expression
(1.9) $\quad \lambda = \big( G(x)^T G(x) \big)^{-1} G(x)^T \nabla f(x).$

Typically, the matrix inverse $\big( G(x)^T G(x) \big)^{-1}$ is expensive for usage. The denominator $\det\big( G(x)^T G(x) \big)$ is typically a polynomial of high degree. When $m > n$, $G(x)^T G(x)$ is always singular, and we cannot express $\lambda$ as in (1.9).

Do there exist polynomials $p_i$ ($i \in \mathcal{E} \cup \mathcal{I}$) such that each
(1.10) $\quad \lambda_i = p_i(x)$
for all $(x, \lambda)$ satisfying (1.8)? If they exist, then we can do the following:

• The polynomial system (1.8) can be simplified to
(1.11) $\quad \sum_{i \in \mathcal{E} \cup \mathcal{I}} p_i(x) \nabla c_i(x) = \nabla f(x), \quad c_i(x) = 0 \ (i \in \mathcal{E}), \quad p_j(x) c_j(x) = 0 \ (j \in \mathcal{I}).$

• For each $j \in \mathcal{I}$, the sign condition $\lambda_j \ge 0$ is equivalent to
(1.12) $\quad p_j(x) \ge 0.$

The new conditions (1.11) and (1.12) are only about the variable $x$, not $\lambda$. They can be used to construct tighter relaxations for solving (1.1).

When do there exist polynomials $p_i$ satisfying (1.10)? If they exist, how can we compute them? How can we use them to construct tighter relaxations? Do the new relaxations have advantages over the old ones? These questions are the main topics of this paper. Our major results are:


• We show that the polynomials $p_i$ satisfying (1.10) always exist when the tuple of constraining polynomials is nonsingular (see Definition 5.1). Moreover, they can be determined by linear equations.

• Using the new conditions (1.11)-(1.12), we can construct tight relaxations for solving (1.1). To be more precise, we construct a hierarchy of new relaxations, which has finite convergence. This is true even if the feasible set is noncompact and/or the optimality conditions fail to hold.

• For every relaxation order, the new relaxations are tighter than the standard ones in the prior work.

The paper is organized as follows. Section 2 reviews some basics in polynomial optimization. Section 3 constructs new relaxations and proves their tightness. Section 4 characterizes when the polynomials $p_i$ satisfying (1.10) exist, and shows how to determine them, for polyhedral constraints. Section 5 discusses the case of general nonlinear constraints. Section 6 gives examples of using the new relaxations. Section 7 discusses some related issues.

2. Preliminaries

Notation. The symbol $\mathbb{N}$ (resp., $\mathbb{R}$, $\mathbb{C}$) denotes the set of nonnegative integral (resp., real, complex) numbers. The symbol $\mathbb{R}[x] = \mathbb{R}[x_1, \ldots, x_n]$ denotes the ring of polynomials in $x = (x_1, \ldots, x_n)$ with real coefficients. The $\mathbb{R}[x]_d$ stands for the set of real polynomials with degrees $\le d$. Denote
$$ \mathbb{N}^n_d = \{ \alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{N}^n \mid |\alpha| = \alpha_1 + \cdots + \alpha_n \le d \}. $$
For a polynomial $p$, $\deg(p)$ denotes its total degree. For $t \in \mathbb{R}$, $\lceil t \rceil$ denotes the smallest integer $\ge t$. For an integer $k > 0$, denote $[k] = \{1, 2, \ldots, k\}$. For $x = (x_1, \ldots, x_n)$ and $\alpha = (\alpha_1, \ldots, \alpha_n)$, denote
$$ x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}, \quad [x]_d = \begin{bmatrix} 1 & x_1 & \cdots & x_n & x_1^2 & x_1 x_2 & \cdots & x_n^d \end{bmatrix}^T. $$
The superscript $^T$ denotes the transpose of a matrix/vector. The $e_i$ denotes the $i$th standard unit vector, while $e$ denotes the vector of all ones. The $I_m$ denotes the $m$-by-$m$ identity matrix. By writing $X \succeq 0$ (resp., $X \succ 0$), we mean that $X$ is a symmetric positive semidefinite (resp., positive definite) matrix. For matrices $X_1, \ldots, X_r$, $\operatorname{diag}(X_1, \ldots, X_r)$ denotes the block diagonal matrix whose diagonal blocks are $X_1, \ldots, X_r$. In particular, for a vector $a$, $\operatorname{diag}(a)$ denotes the diagonal matrix whose diagonal vector is $a$. For a function $f$ in $x$, $f_{x_i}$ denotes its partial derivative with respect to $x_i$.
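As a concrete aid, the index set $\mathbb{N}^n_d$ and the monomial vector $[x]_d$ can be enumerated in the graded order used above. The short Python sketch below is my own illustration (not part of the paper); the function names are assumptions of this example.

```python
from itertools import product

# An illustrative sketch (not from the paper): enumerate the index set
# N^n_d = { alpha in N^n : |alpha| <= d } in graded order (total degree
# first, x1 before x2, etc.), and evaluate the monomial vector [x]_d.
def monomial_exponents(n, d):
    expos = [a for a in product(range(d + 1), repeat=n) if sum(a) <= d]
    # grade by total degree, then order x1^2 before x1*x2 before x2^2, etc.
    return sorted(expos, key=lambda a: (sum(a), [-ai for ai in a]))

def monomial_vector(x, d):
    vec = []
    for alpha in monomial_exponents(len(x), d):
        m = 1.0
        for xi, ai in zip(x, alpha):
            m *= xi ** ai
        vec.append(m)
    return vec

# For n = 2, d = 2 the vector [x]_2 has 6 entries: 1, x1, x2, x1^2, x1*x2, x2^2.
print(monomial_vector([2.0, 3.0], 2))  # [1.0, 2.0, 3.0, 4.0, 6.0, 9.0]
```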

We review some basics in computational algebra and polynomial optimization. They can be found in [4, 21, 22, 25, 26]. An ideal $I$ of $\mathbb{R}[x]$ is a subset such that $I \cdot \mathbb{R}[x] \subseteq I$ and $I + I \subseteq I$. For a tuple $h = (h_1, \ldots, h_m)$ of polynomials, $\operatorname{Ideal}(h)$ denotes the smallest ideal containing all $h_i$, which is the set
$$ h_1 \cdot \mathbb{R}[x] + \cdots + h_m \cdot \mathbb{R}[x]. $$
The $2k$th truncation of $\operatorname{Ideal}(h)$ is the set
$$ \operatorname{Ideal}(h)_{2k} = h_1 \cdot \mathbb{R}[x]_{2k - \deg(h_1)} + \cdots + h_m \cdot \mathbb{R}[x]_{2k - \deg(h_m)}. $$
The truncation $\operatorname{Ideal}(h)_{2k}$ depends on the generators $h_1, \ldots, h_m$. For an ideal $I$, its complex and real varieties are respectively defined as
$$ V_{\mathbb{C}}(I) = \{ v \in \mathbb{C}^n \mid p(v) = 0 \ \forall\, p \in I \}, \quad V_{\mathbb{R}}(I) = V_{\mathbb{C}}(I) \cap \mathbb{R}^n. $$


A polynomial $\sigma$ is said to be a sum of squares (SOS) if $\sigma = s_1^2 + \cdots + s_k^2$ for some polynomials $s_1, \ldots, s_k \in \mathbb{R}[x]$. The set of all SOS polynomials in $x$ is denoted as $\Sigma[x]$. For a degree $d$, denote the truncation
$$ \Sigma[x]_d = \Sigma[x] \cap \mathbb{R}[x]_d. $$
For a tuple $g = (g_1, \ldots, g_t)$, its quadratic module is the set
$$ \operatorname{Qmod}(g) = \Sigma[x] + g_1 \cdot \Sigma[x] + \cdots + g_t \cdot \Sigma[x]. $$
The $2k$th truncation of $\operatorname{Qmod}(g)$ is the set
$$ \operatorname{Qmod}(g)_{2k} = \Sigma[x]_{2k} + g_1 \cdot \Sigma[x]_{2k - \deg(g_1)} + \cdots + g_t \cdot \Sigma[x]_{2k - \deg(g_t)}. $$
The truncation $\operatorname{Qmod}(g)_{2k}$ depends on the generators $g_1, \ldots, g_t$. Denote
(2.1) $\quad \operatorname{IQ}(h, g) = \operatorname{Ideal}(h) + \operatorname{Qmod}(g), \quad \operatorname{IQ}(h, g)_{2k} = \operatorname{Ideal}(h)_{2k} + \operatorname{Qmod}(g)_{2k}.$
The set $\operatorname{IQ}(h, g)$ is said to be archimedean if there exists $p \in \operatorname{IQ}(h, g)$ such that $p(x) \ge 0$ defines a compact set in $\mathbb{R}^n$. If $\operatorname{IQ}(h, g)$ is archimedean, then
$$ K = \{ x \in \mathbb{R}^n \mid h(x) = 0, \ g(x) \ge 0 \} $$
must be a compact set. Conversely, if $K$ is compact, say $K \subseteq B(0, R)$ (the ball centered at $0$ with radius $R$), then $\operatorname{IQ}(h, (g, R^2 - x^T x))$ is always archimedean, and $h = 0$, $(g, R^2 - x^T x) \ge 0$ give the same set $K$.

Theorem 2.1 (Putinar [36]). Let $h, g$ be tuples of polynomials in $\mathbb{R}[x]$. Let $K$ be as above. Assume $\operatorname{IQ}(h, g)$ is archimedean. If a polynomial $f \in \mathbb{R}[x]$ is positive on $K$, then $f \in \operatorname{IQ}(h, g)$.

Interestingly, if $f$ is only nonnegative on $K$ but the standard optimality conditions hold (see Subsection 1.1), then we still have $f \in \operatorname{IQ}(h, g)$ [33].

Let $\mathbb{R}^{\mathbb{N}^n_d}$ be the space of real multi-sequences indexed by $\alpha \in \mathbb{N}^n_d$. A vector in $\mathbb{R}^{\mathbb{N}^n_d}$ is called a truncated multi-sequence (tms) of degree $d$. A tms $y = (y_\alpha)_{\alpha \in \mathbb{N}^n_d}$ gives the Riesz functional $R_y$, acting on $\mathbb{R}[x]_d$ as
(2.2) $\quad R_y \Big( \sum_{\alpha \in \mathbb{N}^n_d} f_\alpha x^\alpha \Big) = \sum_{\alpha \in \mathbb{N}^n_d} f_\alpha y_\alpha.$
For $f \in \mathbb{R}[x]_d$ and $y \in \mathbb{R}^{\mathbb{N}^n_d}$, we denote
(2.3) $\quad \langle f, y \rangle = R_y(f).$
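The Riesz functional simply replaces each monomial by the corresponding moment value. The sketch below is my own illustration (not part of the paper), storing polynomials and truncated multi-sequences as Python dicts keyed by exponent tuples.

```python
# An illustrative sketch (not from the paper): the Riesz functional R_y
# replaces each monomial x^alpha of f by the moment value y_alpha.
# Polynomials and truncated multi-sequences (tms) are dicts mapping
# exponent tuples alpha to real coefficients/values.
def riesz(f, y):
    # <f, y> = sum_alpha f_alpha * y_alpha
    return sum(coef * y[alpha] for alpha, coef in f.items())

# Example with n = 1: f(x) = 3 + 2*x^2, and y the moments of the point
# evaluation at x = 2, i.e. y_alpha = 2^alpha.  Then R_y(f) = f(2) = 11.
f = {(0,): 3.0, (2,): 2.0}
y = {(a,): 2.0 ** a for a in range(3)}
print(riesz(f, y))  # 11.0
```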

Let $q \in \mathbb{R}[x]_{2k}$. The $k$th localizing matrix of $q$, generated by $y \in \mathbb{R}^{\mathbb{N}^n_{2k}}$, is the symmetric matrix $L_q^{(k)}(y)$ such that
(2.4) $\quad \operatorname{vec}(a_1)^T \big( L_q^{(k)}(y) \big) \operatorname{vec}(a_2) = R_y(q\, a_1 a_2)$
for all $a_1, a_2 \in \mathbb{R}[x]_{k - \lceil \deg(q)/2 \rceil}$. (The $\operatorname{vec}(a_i)$ denotes the coefficient vector of $a_i$.) When $q = 1$, $L_q^{(k)}(y)$ is called a moment matrix, and we denote
(2.5) $\quad M_k(y) = L_1^{(k)}(y).$
The columns and rows of $L_q^{(k)}(y)$, as well as of $M_k(y)$, are indexed by $\alpha \in \mathbb{N}^n$ with $2|\alpha| + \deg(q) \le 2k$. When $q = (q_1, \ldots, q_r)$ is a tuple of polynomials, we define
(2.6) $\quad L_q^{(k)}(y) = \operatorname{diag}\big( L_{q_1}^{(k)}(y), \ldots, L_{q_r}^{(k)}(y) \big),$
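For $n = 1$ the definitions (2.4)-(2.5) become very concrete: $M_k(y)$ is the Hankel matrix $[y_{i+j}]_{0 \le i, j \le k}$. The sketch below is my own illustration (not part of the paper); the chosen moments and polynomial $q$ are assumptions of this example.

```python
import numpy as np

# An illustrative sketch (not from the paper): for n = 1, the moment matrix
# M_k(y) is the Hankel matrix [y_{i+j}], and the localizing matrix of
# q(x) = sum_l q_l x^l has entries sum_l q_l y_{i+j+l}.
def moment_matrix(y, k):
    return np.array([[y[i + j] for j in range(k + 1)] for i in range(k + 1)])

def localizing_matrix(q, y, k):
    s = k - len(q) // 2   # size parameter k - ceil(deg(q)/2), deg(q) = len(q) - 1
    return np.array([[sum(c * y[i + j + l] for l, c in enumerate(q))
                      for j in range(s + 1)] for i in range(s + 1)])

# Moments of the point evaluation at x = 2: y_i = 2^i.  Then M_1(y) is
# PSD and rank one, as moments of a nonnegative measure must be.
y = [2.0 ** i for i in range(5)]
M1 = moment_matrix(y, 1)
print(M1)                                       # [[1. 2.] [2. 4.]]
print(np.linalg.matrix_rank(M1))                # 1
print(localizing_matrix([6.0, -1.0], y, 1))     # [[4.]]  (q(x) = 6 - x, q(2) = 4 >= 0)
```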


a block diagonal matrix. For the polynomial tuples $h, g$ as above, the set
(2.7) $\quad \mathscr{S}(h, g)_{2k} = \Big\{ y \in \mathbb{R}^{\mathbb{N}^n_{2k}} \ \Big| \ L_h^{(k)}(y) = 0, \ L_g^{(k)}(y) \succeq 0 \Big\}$
is a spectrahedral cone in $\mathbb{R}^{\mathbb{N}^n_{2k}}$. The set $\operatorname{IQ}(h, g)_{2k}$ is also a convex cone in $\mathbb{R}[x]_{2k}$. The dual cone of $\operatorname{IQ}(h, g)_{2k}$ is precisely $\mathscr{S}(h, g)_{2k}$ [22, 25, 34]. This is because $\langle p, y \rangle \ge 0$ for all $p \in \operatorname{IQ}(h, g)_{2k}$ and for all $y \in \mathscr{S}(h, g)_{2k}$.

3. The construction of tight relaxations

Consider the polynomial optimization problem (1.1). Let
$$ \lambda = (\lambda_i)_{i \in \mathcal{E} \cup \mathcal{I}} $$
be the vector of Lagrange multipliers. Denote the set
(3.1) $\quad \mathcal{K} = \left\{ (x, \lambda) \in \mathbb{R}^n \times \mathbb{R}^{\mathcal{E} \cup \mathcal{I}} \ \middle| \ \begin{array}{l} c_i(x) = 0 \ (i \in \mathcal{E}), \ \lambda_j c_j(x) = 0 \ (j \in \mathcal{I}), \\ \nabla f(x) = \sum_{i \in \mathcal{E} \cup \mathcal{I}} \lambda_i \nabla c_i(x) \end{array} \right\}.$
Each point in $\mathcal{K}$ is called a critical pair. The projection
(3.2) $\quad \mathcal{K}_c = \{ u \mid (u, \lambda) \in \mathcal{K} \}$
is the set of all real critical points. To construct tight relaxations for solving (1.1), we need the following assumption for Lagrange multipliers.

Assumption 3.1. For each $i \in \mathcal{E} \cup \mathcal{I}$, there exists a polynomial $p_i \in \mathbb{R}[x]$ such that for all $(x, \lambda) \in \mathcal{K}$, it holds that
$$ \lambda_i = p_i(x). $$

Assumption 3.1 is generically satisfied, as shown in Proposition 5.7. For the following special cases, we can get the polynomials $p_i$ explicitly.

• (Simplex) For the simplex $\{ e^T x - 1 = 0, \ x_1 \ge 0, \ldots, x_n \ge 0 \}$, it corresponds to $\mathcal{E} = \{0\}$, $\mathcal{I} = [n]$, $c_0(x) = e^T x - 1$, $c_j(x) = x_j$ ($j \in [n]$). The Lagrange multipliers can be expressed as
(3.3) $\quad \lambda_0 = x^T \nabla f(x), \quad \lambda_j = f_{x_j} - x^T \nabla f(x) \ (j \in [n]).$

• (Hypercube) For the hypercube $[-1, 1]^n$, it corresponds to $\mathcal{E} = \emptyset$, $\mathcal{I} = [n]$, and each $c_j(x) = 1 - x_j^2$. We can show that
(3.4) $\quad \lambda_j = -\tfrac{1}{2} x_j f_{x_j} \ (j \in [n]).$

• (Ball or sphere) The constraint is $1 - x^T x = 0$ or $1 - x^T x \ge 0$. It corresponds to $\mathcal{E} \cup \mathcal{I} = \{1\}$ and $c_1 = 1 - x^T x$. We have
(3.5) $\quad \lambda_1 = -\tfrac{1}{2} x^T \nabla f(x).$

• (Triangular constraints) Suppose $\mathcal{E} \cup \mathcal{I} = \{1, \ldots, m\}$ and each
$$ c_i(x) = \tau_i x_i + q_i(x_{i+1}, \ldots, x_n) $$
for some polynomials $q_i \in \mathbb{R}[x_{i+1}, \ldots, x_n]$ and scalars $\tau_i \ne 0$. The matrix $T(x)$, consisting of the first $m$ rows of $[\nabla c_1(x), \ldots, \nabla c_m(x)]$, is an invertible lower triangular matrix with constant diagonal entries. Then
$$ \lambda = T(x)^{-1} \cdot \begin{bmatrix} f_{x_1} & \cdots & f_{x_m} \end{bmatrix}^T. $$
Note that the inverse $T(x)^{-1}$ is a matrix polynomial.
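Expressions like (3.5) can be verified symbolically on a sample objective. The sketch below is my own illustration (not part of the paper): the objective $f = x_1^2 + 2x_2$ is an assumption of this example. It solves the critical-pair system (1.8) for the sphere constraint and checks that every multiplier agrees with the polynomial $p_1(x) = -\tfrac12 x^T \nabla f(x)$.

```python
import sympy as sp

# A symbolic check (my own illustration, not from the paper) of the sphere
# formula (3.5): on the critical set of f over {1 - x^T x = 0}, the
# multiplier equals p1(x) = -(1/2) x^T grad f(x).
x1, x2, lam = sp.symbols('x1 x2 lam', real=True)
f = x1**2 + 2*x2                    # a sample objective (an assumption)
c1 = 1 - x1**2 - x2**2
grad_f = [sp.diff(f, v) for v in (x1, x2)]
grad_c = [sp.diff(c1, v) for v in (x1, x2)]

# the critical-pair equations (1.8): grad f = lam * grad c1, c1 = 0
eqs = [grad_f[0] - lam*grad_c[0], grad_f[1] - lam*grad_c[1], c1]
p1 = -sp.Rational(1, 2)*(x1*grad_f[0] + x2*grad_f[1])

for sol in sp.solve(eqs, [x1, x2, lam], dict=True):
    # on every critical pair, lam agrees with the polynomial p1(x)
    assert sp.simplify(sol[lam] - p1.subs(sol)) == 0
print("all critical pairs satisfy lam = p1(x)")
```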


For more general constraints, we can also express $\lambda$ as a polynomial function in $x$ on the set $\mathcal{K}_c$. This will be discussed in §4 and §5.

For the polynomials $p_i$ as in Assumption 3.1, denote
(3.6) $\quad \phi = \Big( \nabla f - \sum_{i \in \mathcal{E} \cup \mathcal{I}} p_i \nabla c_i, \ \big( p_j c_j \big)_{j \in \mathcal{I}} \Big), \quad \psi = \big( p_j \big)_{j \in \mathcal{I}}.$

When the minimum value $f_{min}$ of (1.1) is achieved at a critical point, (1.1) is equivalent to the problem
(3.7) $\quad \begin{array}{rl} f_c := \min & f(x) \\ \text{s.t.} & c_{eq}(x) = 0, \ c_{in}(x) \ge 0, \\ & \phi(x) = 0, \ \psi(x) \ge 0. \end{array}$
We apply Lasserre relaxations to solve it. For an integer $k > 0$ (called the relaxation order), the $k$th order Lasserre's relaxation for (3.7) is
(3.8) $\quad \begin{array}{rl} f'_k := \min & \langle f, y \rangle \\ \text{s.t.} & \langle 1, y \rangle = 1, \ M_k(y) \succeq 0, \\ & L_{c_{eq}}^{(k)}(y) = 0, \ L_{c_{in}}^{(k)}(y) \succeq 0, \\ & L_{\phi}^{(k)}(y) = 0, \ L_{\psi}^{(k)}(y) \succeq 0, \ y \in \mathbb{R}^{\mathbb{N}^n_{2k}}. \end{array}$

Since $x^0 = 1$ (the constant one polynomial), the condition $\langle 1, y \rangle = 1$ means that $(y)_0 = 1$. The dual optimization problem of (3.8) is
(3.9) $\quad \begin{array}{rl} f_k := \max & \gamma \\ \text{s.t.} & f - \gamma \in \operatorname{IQ}(c_{eq}, c_{in})_{2k} + \operatorname{IQ}(\phi, \psi)_{2k}. \end{array}$
We refer to §2 for the notation used in (3.8)-(3.9). They are equivalent to semidefinite programs (SDPs), so they can be solved by SDP solvers (e.g., SeDuMi [40]). For $k = 1, 2, \cdots$, we get a hierarchy of Lasserre relaxations. In (3.8)-(3.9), if we remove the usage of $\phi$ and $\psi$, they are reduced to the standard Lasserre relaxations in [17]. So (3.8)-(3.9) are stronger relaxations.

By the construction of $\phi$ as in (3.6), Assumption 3.1 implies that
$$ \mathcal{K}_c = \{ u \in \mathbb{R}^n \mid c_{eq}(u) = 0, \ \phi(u) = 0 \}. $$
By Lemma 3.3 of [6], $f$ achieves only finitely many values on $\mathcal{K}_c$, say
(3.10) $\quad v_1 < \cdots < v_N.$
A point $u \in \mathcal{K}_c$ might not be feasible for (3.7), i.e., it is possible that $c_{in}(u) \not\ge 0$ or $\psi(u) \not\ge 0$. In applications, we are often interested in the optimal value $f_c$ of (3.7). When (3.7) is infeasible, by convention, we set
$$ f_c = +\infty. $$
When the optimal value $f_{min}$ of (1.1) is achieved at a critical point, $f_c = f_{min}$. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., for each $\ell$, the sublevel set $\{f(x) \le \ell\}$ is compact), and the constraint qualification condition holds. As in [17], one can show that
(3.11) $\quad f_k \le f'_k \le f_c$
for all $k$. Moreover, $f_k$ and $f'_k$ are both monotonically increasing. If, for some order $k$, it occurs that
$$ f_k = f'_k = f_c, $$
then the $k$th order Lasserre's relaxation is said to be tight (or exact).


3.1. Tightness of the relaxations. Let $c_{in}, \psi, \mathcal{K}_c, f_c$ be as above. We refer to §2 for the notation $\operatorname{Qmod}(c_{in}, \psi)$. We begin with a general assumption.

Assumption 3.2. There exists $\rho \in \operatorname{Qmod}(c_{in}, \psi)$ such that if $u \in \mathcal{K}_c$ and $f(u) < f_c$, then $\rho(u) < 0$.

In Assumption 3.2, the hypersurface $\rho(x) = 0$ separates feasible and infeasible critical points. Clearly, if $u \in \mathcal{K}_c$ is a feasible point for (3.7), then $c_{in}(u) \ge 0$ and $\psi(u) \ge 0$, and hence $\rho(u) \ge 0$. Assumption 3.2 generally holds. For instance, it is satisfied for the following general cases:

a) When there are no inequality constraints, $c_{in}$ and $\psi$ are empty tuples. Then $\operatorname{Qmod}(c_{in}, \psi) = \Sigma[x]$ and Assumption 3.2 is satisfied for $\rho = 0$.

b) Suppose the set $\mathcal{K}_c$ is finite, say $\mathcal{K}_c = \{u_1, \ldots, u_D\}$, and
$$ f(u_1), \ldots, f(u_{t-1}) < f_c \le f(u_t), \ldots, f(u_D). $$
Let $\ell_1, \ldots, \ell_D$ be real interpolating polynomials such that $\ell_i(u_j) = 1$ for $i = j$ and $\ell_i(u_j) = 0$ for $i \ne j$. For each $i = 1, \ldots, t-1$, there must exist $j_i \in \mathcal{I}$ such that $c_{j_i}(u_i) < 0$. Then the polynomial
(3.12) $\quad \rho = \sum_{i < t} \frac{-1}{c_{j_i}(u_i)}\, c_{j_i}(x) \big( \ell_i(x) \big)^2 + \sum_{i \ge t} \big( \ell_i(x) \big)^2 $
satisfies Assumption 3.2.

c) For each $x$ with $f(x) = v_i < f_c$, at least one of the constraints $c_j(x) \ge 0$, $p_j(x) \ge 0$ ($j \in \mathcal{I}$) is violated. Suppose for each critical value $v_i < f_c$ there exists $g_i \in \{c_j, p_j\}_{j \in \mathcal{I}}$ such that
$$ g_i < 0 \ \text{on} \ \mathcal{K}_c \cap \{ f(x) = v_i \}. $$
Let $\varphi_1, \ldots, \varphi_N$ be real univariate polynomials such that $\varphi_i(v_j) = 0$ for $i \ne j$ and $\varphi_i(v_j) = 1$ for $i = j$. Suppose $v_t = f_c$. Then the polynomial
(3.13) $\quad \rho = \sum_{i < t} g_i(x) \big( \varphi_i(f(x)) \big)^2 + \sum_{i \ge t} \big( \varphi_i(f(x)) \big)^2 $
satisfies Assumption 3.2.

We refer to §2 for the archimedean condition and the notation $\operatorname{IQ}(h, g)$ as in (2.1). The following is about the convergence of the relaxations (3.8)-(3.9).

Theorem 3.3. Suppose $\mathcal{K}_c \ne \emptyset$ and Assumption 3.1 holds. If

i) $\operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi)$ is archimedean, or
ii) $\operatorname{IQ}(c_{eq}, c_{in})$ is archimedean, or
iii) Assumption 3.2 holds,

then $f_k = f'_k = f_c$ for all $k$ sufficiently large. Therefore, if the minimum value $f_{min}$ of (1.1) is achieved at a critical point, then $f_k = f'_k = f_{min}$ for all $k$ big enough, if one of the conditions i)-iii) is satisfied.

Remark. In Theorem 3.3, the conclusion holds if any one of the conditions i)-iii) is satisfied. The condition ii) is only about the constraining polynomials of (1.1); it can be checked without $\phi, \psi$. Clearly, the condition ii) implies the condition i).

The proof for Theorem 3.3 is given in the following. The main idea is to consider the set of critical points. It can be expressed as a union of subvarieties, on each of which the objective $f$ is constant. We can get an SOS type representation for $f$ on each subvariety, and then construct a single one for $f$ over the entire set of critical points.

Proof of Theorem 3.3. Clearly, every point in the complex variety
$$ K_1 = \{ x \in \mathbb{C}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0 \} $$
is a critical point. By Lemma 3.3 of [6], the objective $f$ achieves finitely many real values on $\mathcal{K}_c = K_1 \cap \mathbb{R}^n$, say they are $v_1 < \cdots < v_N$. Up to the shifting of a constant in $f$, we can further assume that $f_c = 0$. Clearly, $f_c$ equals one of $v_1, \ldots, v_N$, say $v_t = f_c = 0$.

Case I: Assume $\operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi)$ is archimedean. Let
$$ I = \operatorname{Ideal}(c_{eq}, \phi), $$
the critical ideal. Note that $K_1 = V_{\mathbb{C}}(I)$. The variety $V_{\mathbb{C}}(I)$ is a union of irreducible subvarieties, say $V_1, \ldots, V_\ell$. If $V_i \cap \mathbb{R}^n \ne \emptyset$, then $f$ is a real constant on $V_i$, which equals one of $v_1, \ldots, v_N$. This can be implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of $V_{\mathbb{C}}(I)$:
$$ T_i = K_1 \cap \{ f(x) = v_i \} \quad (i = t, \ldots, N). $$
Let $T_{t-1}$ be the union of the irreducible subvarieties $V_i$ such that either $V_i \cap \mathbb{R}^n = \emptyset$ or $f \equiv v_j$ on $V_i$ with $v_j < v_t = f_c$. Then it holds that
$$ V_{\mathbb{C}}(I) = T_{t-1} \cup T_t \cup \cdots \cup T_N. $$
By the primary decomposition of $I$ [11, 41], there exist ideals $I_{t-1}, I_t, \ldots, I_N \subseteq \mathbb{R}[x]$ such that
$$ I = I_{t-1} \cap I_t \cap \cdots \cap I_N $$
and $T_i = V_{\mathbb{C}}(I_i)$ for all $i = t-1, t, \ldots, N$. Denote the semialgebraic set
(3.14) $\quad S = \{ x \in \mathbb{R}^n \mid c_{in}(x) \ge 0, \ \psi(x) \ge 0 \}.$
For $i = t-1$, we have $V_{\mathbb{R}}(I_{t-1}) \cap S = \emptyset$, because $v_1, \ldots, v_{t-1} < f_c$. By the Positivstellensatz [2, Corollary 4.4.3], there exists $p_0 \in \operatorname{Preord}(c_{in}, \psi)$¹ satisfying $2 + p_0 \in I_{t-1}$. Note that $1 + p_0 > 0$ on $V_{\mathbb{R}}(I_{t-1}) \cap S$. The set $I_{t-1} + \operatorname{Qmod}(c_{in}, \psi)$ is archimedean, because $I \subseteq I_{t-1}$ and
$$ \operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi) \subseteq I_{t-1} + \operatorname{Qmod}(c_{in}, \psi). $$
By Theorem 2.1, we have
$$ p_1 := 1 + p_0 \in I_{t-1} + \operatorname{Qmod}(c_{in}, \psi). $$
Then $1 + p_1 \in I_{t-1}$ (as $1 + p_1 = 2 + p_0$), and there exists $p_2 \in \operatorname{Qmod}(c_{in}, \psi)$ such that
$$ -1 \equiv p_1 \equiv p_2 \mod I_{t-1}. $$
Since $f = (f/4 + 1)^2 - 1 \cdot (f/4 - 1)^2$, we have
$$ f \equiv \sigma_{t-1} := (f/4 + 1)^2 + p_2 (f/4 - 1)^2 \mod I_{t-1}. $$
So, when $k$ is big enough, we have $\sigma_{t-1} \in \operatorname{Qmod}(c_{in}, \psi)_{2k}$.

¹It is the preordering of the polynomial tuple $(c_{in}, \psi)$; see §7.1.


For $i = t$, $v_t = 0$ and $f(x)$ vanishes on $V_{\mathbb{C}}(I_t)$. By Hilbert's Strong Nullstellensatz [4], there exists an integer $m_t > 0$ such that $f^{m_t} \in I_t$. Define the polynomial
$$ s_t(\epsilon) := \sqrt{\epsilon} \sum_{j=0}^{m_t - 1} \binom{1/2}{j} \epsilon^{-j} f^j. $$
Then we have that
$$ D_1 := \epsilon (1 + \epsilon^{-1} f) - \big( s_t(\epsilon) \big)^2 \equiv 0 \mod I_t. $$
This is because, in the subtraction defining $D_1$, after expanding $\big( s_t(\epsilon) \big)^2$, all the terms $f^j$ with $j < m_t$ are cancelled, and $f^j \in I_t$ for $j \ge m_t$. So $D_1 \in I_t$. Let $\sigma_t(\epsilon) = s_t(\epsilon)^2$; then $f + \epsilon - \sigma_t(\epsilon) = D_1$ and
(3.15) $\quad f + \epsilon - \sigma_t(\epsilon) = \sum_{j=0}^{m_t - 2} b_j(\epsilon) f^{m_t + j}$
for some real scalars $b_j(\epsilon)$ depending on $\epsilon$.

For each $i = t+1, \ldots, N$, $v_i > 0$ and $f(x)/v_i - 1$ vanishes on $V_{\mathbb{C}}(I_i)$. By Hilbert's Strong Nullstellensatz [4], there exists $0 < m_i \in \mathbb{N}$ such that $(f/v_i - 1)^{m_i} \in I_i$. Let
$$ s_i := \sqrt{v_i} \sum_{j=0}^{m_i - 1} \binom{1/2}{j} (f/v_i - 1)^j. $$
As in the case $i = t$, we can similarly show that $f - s_i^2 \in I_i$. Let $\sigma_i = s_i^2$; then $f - \sigma_i \in I_i$.

Note that $V_{\mathbb{C}}(I_i) \cap V_{\mathbb{C}}(I_j) = \emptyset$ for all $i \ne j$. By Lemma 3.3 of [30], there exist polynomials $a_{t-1}, \ldots, a_N \in \mathbb{R}[x]$ such that
$$ a_{t-1}^2 + \cdots + a_N^2 - 1 \in I, \quad a_i \in \bigcap_{i \ne j \in \{t-1, \ldots, N\}} I_j. $$
For $\epsilon > 0$, denote the polynomial
$$ \sigma_\epsilon := \sigma_t(\epsilon)\, a_t^2 + \sum_{t \ne j \in \{t-1, \ldots, N\}} (\sigma_j + \epsilon)\, a_j^2; $$
then
$$ f + \epsilon - \sigma_\epsilon = (f + \epsilon)(1 - a_{t-1}^2 - \cdots - a_N^2) + \sum_{t \ne i \in \{t-1, \ldots, N\}} (f - \sigma_i)\, a_i^2 + \big( f + \epsilon - \sigma_t(\epsilon) \big) a_t^2. $$

For each $i \ne t$, $f - \sigma_i \in I_i$, so
$$ (f - \sigma_i)\, a_i^2 \in \bigcap_{j=t-1}^{N} I_j = I. $$
Hence, there exists $k_1 > 0$ such that
$$ (f - \sigma_i)\, a_i^2 \in I_{2k_1} \quad (t \ne i \in \{t-1, \ldots, N\}). $$
Since $f + \epsilon - \sigma_t(\epsilon) \in I_t$, we also have
$$ \big( f + \epsilon - \sigma_t(\epsilon) \big) a_t^2 \in \bigcap_{j=t-1}^{N} I_j = I. $$
Moreover, by the equation (3.15),
$$ \big( f + \epsilon - \sigma_t(\epsilon) \big) a_t^2 = \sum_{j=0}^{m_t - 2} b_j(\epsilon) f^{m_t + j} a_t^2. $$
Each $f^{m_t + j} a_t^2 \in I$, since $f^{m_t + j} \in I_t$. So there exists $k_2 > 0$ such that, for all $\epsilon > 0$,
$$ \big( f + \epsilon - \sigma_t(\epsilon) \big) a_t^2 \in I_{2k_2}. $$
Since $1 - a_{t-1}^2 - \cdots - a_N^2 \in I$, there also exists $k_3 > 0$ such that, for all $\epsilon > 0$,
$$ (f + \epsilon)(1 - a_{t-1}^2 - \cdots - a_N^2) \in I_{2k_3}. $$
Hence, if $k^* \ge \max\{k_1, k_2, k_3\}$, then we have
$$ f(x) + \epsilon - \sigma_\epsilon \in I_{2k^*} $$
for all $\epsilon > 0$. By the construction, the degrees of all $\sigma_i$ and $a_i$ are independent of $\epsilon$. So $\sigma_\epsilon \in \operatorname{Qmod}(c_{in}, \psi)_{2k^*}$ for all $\epsilon > 0$, if $k^*$ is big enough. Note that
$$ I_{2k^*} + \operatorname{Qmod}(c_{in}, \psi)_{2k^*} = \operatorname{IQ}(c_{eq}, c_{in})_{2k^*} + \operatorname{IQ}(\phi, \psi)_{2k^*}. $$
This implies that $f_{k^*} \ge f_c - \epsilon$ for all $\epsilon > 0$. On the other hand, we always have $f_{k^*} \le f_c$. So $f_{k^*} = f_c$. Moreover, since $f_k$ is monotonically increasing, we must have $f_k = f_c$ for all $k \ge k^*$.

Case II: Assume $\operatorname{IQ}(c_{eq}, c_{in})$ is archimedean. Because
$$ \operatorname{IQ}(c_{eq}, c_{in}) \subseteq \operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi), $$
the set $\operatorname{IQ}(c_{eq}, c_{in}) + \operatorname{IQ}(\phi, \psi)$ is also archimedean. Therefore, the conclusion is also true, by applying the result for Case I.

Case III: Suppose Assumption 3.2 holds. Let $\varphi_1, \ldots, \varphi_N$ be real univariate polynomials such that $\varphi_i(v_j) = 0$ for $i \ne j$ and $\varphi_i(v_j) = 1$ for $i = j$. Let
$$ s := s_t + \cdots + s_N, \quad \text{where each} \quad s_i = (v_i - f_c) \big( \varphi_i(f) \big)^2. $$
Then $s \in \Sigma[x]_{2k_4}$ for some integer $k_4 > 0$. Let
$$ \hat{f} := f - f_c - s. $$
We show that there exist an integer $\ell > 0$ and $q \in \operatorname{Qmod}(c_{in}, \psi)$ such that
$$ \hat{f}^{2\ell} + q \in \operatorname{Ideal}(c_{eq}, \phi). $$
This is because, by Assumption 3.2, $\hat{f}(x) \equiv 0$ on the set
$$ K_2 = \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0 \}. $$
It has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist $0 < \ell \in \mathbb{N}$ and $q = b_0 + \rho b_1$ ($b_0, b_1 \in \Sigma[x]$) such that $\hat{f}^{2\ell} + q \in \operatorname{Ideal}(c_{eq}, \phi)$. By Assumption 3.2, $\rho \in \operatorname{Qmod}(c_{in}, \psi)$, so we have $q \in \operatorname{Qmod}(c_{in}, \psi)$.

For all $\epsilon > 0$ and $\tau > 0$, we have $\hat{f} + \epsilon = \phi_\epsilon + \theta_\epsilon$, where
$$ \phi_\epsilon = -\tau \epsilon^{1-2\ell} \big( \hat{f}^{2\ell} + q \big), \quad \theta_\epsilon = \epsilon \big( 1 + \hat{f}/\epsilon + \tau (\hat{f}/\epsilon)^{2\ell} \big) + \tau \epsilon^{1-2\ell} q. $$
By Lemma 2.1 of [31], when $\tau \ge \frac{1}{2\ell}$, there exists $k_5$ such that, for all $\epsilon > 0$,
$$ \phi_\epsilon \in \operatorname{Ideal}(c_{eq}, \phi)_{2k_5}, \quad \theta_\epsilon \in \operatorname{Qmod}(c_{in}, \psi)_{2k_5}. $$
Hence, we can get
$$ f - (f_c - \epsilon) = \phi_\epsilon + \sigma_\epsilon, $$
where $\sigma_\epsilon := \theta_\epsilon + s \in \operatorname{Qmod}(c_{in}, \psi)_{2k_5}$ for all $\epsilon > 0$. Note that
$$ \operatorname{IQ}(c_{eq}, c_{in})_{2k_5} + \operatorname{IQ}(\phi, \psi)_{2k_5} = \operatorname{Ideal}(c_{eq}, \phi)_{2k_5} + \operatorname{Qmod}(c_{in}, \psi)_{2k_5}. $$
For all $\epsilon > 0$, $\gamma = f_c - \epsilon$ is feasible in (3.9) for the order $k_5$, so $f_{k_5} \ge f_c$. Because of (3.11) and the monotonicity of $f_k$, we have $f_k = f'_k = f_c$ for all $k \ge k_5$. $\square$

3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is $f_c$, and the optimal value of (1.1) is $f_{min}$. If $f_{min}$ is achievable at a critical point, then $f_c = f_{min}$. In Theorem 3.3, we have shown that $f_k = f_c$ for all $k$ big enough, where $f_k$ is the optimal value of (3.9). The value $f_c$ or $f_{min}$ is often not known. How do we detect the tightness $f_k = f_c$ in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose $y^\ast$ is a minimizer of (3.8) for the order $k$. Let
$$(3.16)\qquad d = \lceil \deg(c_{eq}, c_{in}, \phi, \psi)/2 \rceil.$$
If there exists an integer $t \in [d, k]$ such that
$$(3.17)\qquad \mathrm{rank}\, M_t(y^\ast) = \mathrm{rank}\, M_{t-d}(y^\ast),$$
then $f_k = f_c$, and we can get $r := \mathrm{rank}\, M_t(y^\ast)$ minimizers for (3.7) [5, 14, 32].

The method in [14] can be used to extract minimizers; it was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) serves as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints $c_{in} \ge 0$, $\psi \ge 0$) can also be detected by solving the relaxations (3.8)-(3.9).
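The rank comparison in (3.17) is straightforward to run once moment matrices are formed. The sketch below (illustrative code, not from the paper) builds univariate moment matrices of a hypothetical truncated moment sequence generated by a two-atom measure; the numerical rank stabilizes at 2, the number of atoms, so the flat truncation test succeeds with $d = 1$.

```python
import numpy as np

# Hypothetical tms: moments y_a = sum_i w_i * u_i^a of a 2-atom measure on R.
points, weights = np.array([0.5, -1.0]), np.array([0.3, 0.7])
y = np.array([weights @ points**a for a in range(7)])

def moment_matrix(y, t):
    # Univariate moment matrix M_t(y): Hankel matrix with entry (i, j) = y_{i+j}.
    return np.array([[y[i + j] for j in range(t + 1)] for i in range(t + 1)])

d = 1  # illustrative degree bound, playing the role of d in (3.16)
for t in range(d, 4):
    r_t = np.linalg.matrix_rank(moment_matrix(y, t), tol=1e-8)
    r_td = np.linalg.matrix_rank(moment_matrix(y, t - d), tol=1e-8)
    print(t, r_t, r_td, r_t == r_td)  # the test rank M_t = rank M_{t-d} holds from t = 2 on
```

Once the test holds, the stabilized rank (here 2) is the number of extractable minimizers.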

Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order $k$, then no critical points satisfy the constraints $c_{in} \ge 0$, $\psi \ge 0$, i.e., (3.7) is infeasible.

ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order $k$ is big enough.

In the following, assume (3.7) is feasible (i.e., $f_c < +\infty$). Then, for all $k$ big enough, (3.8) has a minimizer $y^\ast$. Moreover:

iii) If (3.17) is satisfied for some $t \in [d, k]$, then $f_k = f_c$.

iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer $y^\ast$ of (3.8) must satisfy (3.17) for some $t \in [d, k]$, when $k$ is big enough.

Proof. By Assumption 3.1, $u$ is a critical point if and only if $c_{eq}(u) = 0$, $\phi(u) = 0$.

i) For every feasible point $u$ of (3.7), the tms $[u]_{2k}$ (see §2 for the notation) is feasible for (3.8), for all $k$. Therefore, if (3.8) is infeasible for some $k$, then (3.7) must be infeasible.

ii) By Assumption 3.2, when (3.7) is infeasible, the set
$$\{x \in \mathbb{R}^n :\ c_{eq}(x) = 0,\ \phi(x) = 0,\ \rho(x) \ge 0\}$$
is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that $-1 \in \mathrm{Ideal}(c_{eq}, \phi) + \mathrm{Qmod}(\rho)$. By Assumption 3.2,
$$\mathrm{Ideal}(c_{eq}, \phi) + \mathrm{Qmod}(\rho) \subseteq \mathrm{IQ}(c_{eq}, c_{in}) + \mathrm{IQ}(\phi, \psi).$$
Thus, for all $k$ big enough, (3.9) is unbounded from above. Hence, (3.8) must be infeasible by weak duality.

When (3.7) is feasible, $f$ achieves finitely many values on $K_c$, so (3.7) must achieve its optimal value $f_c$. By Theorem 3.3, we know that $f_k = f'_k = f_c$ for all $k$ big enough. For each minimizer $u^\ast$ of (3.7), the tms $[u^\ast]_{2k}$ is a minimizer of (3.8).

iii) If (3.17) holds, we can get $r = \mathrm{rank}\, M_t(y^\ast)$ minimizers for (3.7) [5, 14], say $u_1, \ldots, u_r$, such that $f_k = f(u_i)$ for each $i$. Clearly, $f_k = f(u_i) \ge f_c$. On the other hand, we always have $f_k \le f_c$. So $f_k = f_c$.

iv) By Assumption 3.2, (3.7) is equivalent to the problem
$$(3.18)\qquad \min \; f(x) \quad \text{s.t.} \quad c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0.$$
The optimal value of (3.18) is also $f_c$. Its $k$th order Lasserre relaxation is
$$(3.19)\qquad \begin{array}{rl} \gamma'_k = \min & \langle f, y \rangle \\ \text{s.t.} & \langle 1, y \rangle = 1, \ M_k(y) \succeq 0, \\ & L^{(k)}_{c_{eq}}(y) = 0, \ L^{(k)}_{\phi}(y) = 0, \ L^{(k)}_{\rho}(y) \succeq 0. \end{array}$$
Its dual optimization problem is
$$(3.20)\qquad \gamma_k = \max \; \gamma \quad \text{s.t.} \quad f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(\rho)_{2k}.$$
By repeating the same proof as for Theorem 3.3(iii), we can show that
$$\gamma_k = \gamma'_k = f_c$$
for all $k$ big enough. Because $\rho \in \mathrm{Qmod}(c_{in}, \psi)$, each $y$ feasible for (3.8) is also feasible for (3.19). So, when $k$ is big, each $y^\ast$ is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some $t \in [d, k]$, when $k$ is big enough.

If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].

4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron
$$P = \{x \in \mathbb{R}^n \mid Ax - b \ge 0\},$$
where $A = [a_1\ \cdots\ a_m]^T \in \mathbb{R}^{m \times n}$ and $b = [b_1\ \cdots\ b_m]^T \in \mathbb{R}^m$. This corresponds to the case that $\mathcal{E} = \emptyset$, $\mathcal{I} = [m]$, and each $c_i(x) = a_i^T x - b_i$. Denote
$$(4.1)\qquad D(x) = \mathrm{diag}\big(c_1(x), \ldots, c_m(x)\big), \qquad C(x) = \begin{bmatrix} A^T \\ D(x) \end{bmatrix}.$$
The Lagrange multiplier vector $\lambda = [\lambda_1\ \cdots\ \lambda_m]^T$ satisfies
$$(4.2)\qquad \begin{bmatrix} A^T \\ D(x) \end{bmatrix} \lambda = \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}.$$
If $\mathrm{rank}\, A = m$, we can express $\lambda$ as
$$(4.3)\qquad \lambda = (AA^T)^{-1} A\, \nabla f(x).$$
If $\mathrm{rank}\, A < m$, how can we express $\lambda$ in terms of $x$? In computation, we often prefer a polynomial expression. If there exists $L(x) \in \mathbb{R}[x]^{m \times (n+m)}$ such that
$$(4.4)\qquad L(x)\, C(x) = I_m,$$
then we can get
$$\lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x)\, \nabla f(x),$$
where $L_1(x)$ consists of the first $n$ columns of $L(x)$. In this section, we characterize when such $L(x)$ exists, and give a degree bound for it.
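When $\mathrm{rank}\, A = m$, formula (4.3) can be checked numerically. The sketch below (a hypothetical two-constraint instance, not from the paper) evaluates (4.3) at a KKT point and verifies that the resulting $\lambda$ satisfies both blocks of (4.2): the gradient equation and complementarity.

```python
import numpy as np

# Hypothetical data: constraints x1 >= 0 and x1 + x2 >= 0 (rows of A), and a
# quadratic objective chosen so the minimizer has one active constraint.
A = np.array([[1.0, 0.0], [1.0, 1.0]])                         # rows a_i
b = np.zeros(2)
grad_f = lambda x: np.array([2 * (x[0] + 1), 2 * (x[1] - 2)])  # f = (x1+1)^2 + (x2-2)^2

x_star = np.array([0.0, 2.0])        # KKT point: c1 active, c2 inactive
lam = np.linalg.solve(A @ A.T, A @ grad_f(x_star))   # formula (4.3)

print(lam)                                           # [2. 0.]
assert np.allclose(A.T @ lam, grad_f(x_star))        # first block of (4.2)
assert np.allclose((A @ x_star - b) * lam, 0)        # D(x)lam = 0 (complementarity)
```

Note that (4.2) holds with this $\lambda$ only at critical points; away from them, $\nabla f$ need not lie in the range of $A^T$.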

The linear function $Ax - b$ is said to be nonsingular if $\mathrm{rank}\, C(u) = m$ for all $u \in \mathbb{C}^n$ (also see Definition 5.1). This is equivalent to the condition that, for every $u$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then $a_{i_1}, \ldots, a_{i_k}$ are linearly independent.

Proposition 4.1. The linear function $Ax - b$ is nonsingular if and only if there exists a matrix polynomial $L(x)$ satisfying (4.4). Moreover, when $Ax - b$ is nonsingular, we can choose $L(x)$ in (4.4) with $\deg(L) \le m - \mathrm{rank}\, A$.

Proof. Clearly, if (4.4) is satisfied by some $L(x)$, then $\mathrm{rank}\, C(u) \ge m$ for all $u$. This implies that $Ax - b$ is nonsingular.

Next, assume that $Ax - b$ is nonsingular. We show that (4.4) is satisfied by some $L(x) \in \mathbb{R}[x]^{m \times (n+m)}$ with degree $\le m - \mathrm{rank}\, A$. Let $r = \mathrm{rank}\, A$. Up to a linear coordinate transformation, we can reduce $x$ to an $r$-dimensional variable. Without loss of generality, we can assume that $\mathrm{rank}\, A = n$ and $m \ge n$.

For a subset $I = \{i_1, \ldots, i_{m-n}\}$ of $[m]$, denote
$$c_I(x) = \prod_{i \in I} c_i(x), \qquad E_I(x) = c_I(x) \cdot \mathrm{diag}\big(c_{i_1}(x)^{-1}, \ldots, c_{i_{m-n}}(x)^{-1}\big),$$
$$D_I(x) = \mathrm{diag}\big(c_{i_1}(x), \ldots, c_{i_{m-n}}(x)\big), \qquad A_I = [a_{i_1}\ \cdots\ a_{i_{m-n}}]^T.$$
For the case that $I = \emptyset$ (the empty set), we set $c_\emptyset(x) = 1$. Let
$$\mathcal{V} = \big\{ I \subseteq [m] :\ |I| = m - n,\ \mathrm{rank}\, A_{[m]\setminus I} = n \big\}.$$

Step I: For each $I \in \mathcal{V}$, we construct a matrix polynomial $L_I(x)$ such that
$$(4.5)\qquad L_I(x)\, C(x) = c_I(x)\, I_m.$$
The matrix $L_I = L_I(x)$ satisfying (4.5) can be given by the following $2 \times 3$ block matrix, where $L_I(J, K)$ denotes the submatrix whose row indices are from $J$ and whose column indices are from $K$:
$$(4.6)\qquad
\begin{array}{c|ccc}
J \backslash K & [n] & n+I & n+[m]\setminus I \\ \hline
I & 0 & E_I(x) & 0 \\
[m]\setminus I & c_I(x)\cdot \big(A_{[m]\setminus I}\big)^{-T} & -\big(A_{[m]\setminus I}\big)^{-T} \big(A_I\big)^T E_I(x) & 0
\end{array}$$
Equivalently, the blocks of $L_I$ are
$$L_I\big(I, [n]\big) = 0, \qquad L_I\big(I, n+[m]\setminus I\big) = 0, \qquad L_I\big([m]\setminus I, n+[m]\setminus I\big) = 0,$$
$$L_I\big(I, n+I\big) = E_I(x), \qquad L_I\big([m]\setminus I, [n]\big) = c_I(x)\big(A_{[m]\setminus I}\big)^{-T},$$
$$L_I\big([m]\setminus I, n+I\big) = -\big(A_{[m]\setminus I}\big)^{-T} \big(A_I\big)^T E_I(x).$$
For each $I \in \mathcal{V}$, $A_{[m]\setminus I}$ is invertible. The superscript $-T$ denotes the inverse of the transpose. Let $G = L_I(x)\, C(x)$; then one can verify that
$$G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m-n}, \qquad G(I, [m]\setminus I) = 0,$$
$$G([m]\setminus I, [m]\setminus I) = \Big[\, c_I(x)\big(A_{[m]\setminus I}\big)^{-T} \ \ -\big(A_{[m]\setminus I}\big)^{-T} A_I^T E_I(x) \,\Big] \begin{bmatrix} \big(A_{[m]\setminus I}\big)^T \\ 0 \end{bmatrix} = c_I(x) I_n,$$
$$G([m]\setminus I, I) = \Big[\, c_I(x)\big(A_{[m]\setminus I}\big)^{-T} \ \ -\big(A_{[m]\setminus I}\big)^{-T} A_I^T E_I(x) \,\Big] \begin{bmatrix} A_I^T \\ D_I(x) \end{bmatrix} = 0.$$
This shows that the above $L_I(x)$ satisfies (4.5).


Step II: We show that there exist real scalars $\nu_I$ satisfying
$$(4.7)\qquad \sum_{I \in \mathcal{V}} \nu_I\, c_I(x) = 1.$$
This can be shown by induction on $m$.

• When $m = n$, $\mathcal{V} = \{\emptyset\}$ and $c_\emptyset(x) = 1$, so (4.7) is clearly true.

• When $m > n$, let
$$(4.8)\qquad N = \{i \in [m] \mid \mathrm{rank}\, A_{[m]\setminus \{i\}} = n\}.$$
For each $i \in N$, let $\mathcal{V}_i$ be the set of all $I' \subseteq [m]\setminus\{i\}$ such that $|I'| = m - n - 1$ and $\mathrm{rank}\, A_{[m]\setminus(I' \cup \{i\})} = n$. For each $i \in N$, by the assumption, the linear function $A_{[m]\setminus\{i\}}\, x - b_{[m]\setminus\{i\}}$ is nonsingular. By induction, there exist real scalars $\nu^{(i)}_{I'}$ satisfying
$$(4.9)\qquad \sum_{I' \in \mathcal{V}_i} \nu^{(i)}_{I'}\, c_{I'}(x) = 1.$$
Since $\mathrm{rank}\, A = n$, we can generally assume that $\{a_1, \ldots, a_n\}$ is linearly independent. So there exist scalars $\alpha_1, \ldots, \alpha_n$ such that
$$a_m = \alpha_1 a_1 + \cdots + \alpha_n a_n.$$
If all $\alpha_i = 0$, then $a_m = 0$, and hence $A$ can be replaced by its first $m-1$ rows, so (4.7) is true by induction. In the following, suppose at least one $\alpha_i \ne 0$, and write
$$\{i : \alpha_i \ne 0\} = \{i_1, \ldots, i_k\}.$$
Then $a_{i_1}, \ldots, a_{i_k}, a_m$ are linearly dependent. For convenience, set $i_{k+1} = m$. Since $Ax - b$ is nonsingular, the linear system
$$c_{i_1}(x) = \cdots = c_{i_k}(x) = c_{i_{k+1}}(x) = 0$$
has no solutions. Hence, there exist real scalars $\mu_1, \ldots, \mu_{k+1}$ such that
$$\mu_1 c_{i_1}(x) + \cdots + \mu_k c_{i_k}(x) + \mu_{k+1} c_{i_{k+1}}(x) = 1.$$
This can be seen from the echelon form of an inconsistent linear system. Note that $i_1, \ldots, i_{k+1} \in N$. For each $j = 1, \ldots, k+1$, by (4.9),
$$\sum_{I' \in \mathcal{V}_{i_j}} \nu^{(i_j)}_{I'}\, c_{I'}(x) = 1.$$
Then we can get
$$1 = \sum_{j=1}^{k+1} \mu_j c_{i_j}(x) = \sum_{j=1}^{k+1} \mu_j \sum_{I' \in \mathcal{V}_{i_j}} \nu^{(i_j)}_{I'}\, c_{i_j}(x)\, c_{I'}(x) = \sum_{\substack{I = I' \cup \{i_j\} \\ I' \in \mathcal{V}_{i_j},\ 1 \le j \le k+1}} \nu^{(i_j)}_{I'}\, \mu_j\, c_I(x).$$
Since each $I' \cup \{i_j\} \in \mathcal{V}$, (4.7) must be satisfied by some scalars $\nu_I$.

Step III: For $L_I(x)$ as in (4.5), we construct $L(x)$ as
$$(4.10)\qquad L(x) = \sum_{I \in \mathcal{V}} \nu_I\, L_I(x).$$
Clearly, $L(x)$ satisfies (4.4), because
$$L(x)\, C(x) = \sum_{I \in \mathcal{V}} \nu_I\, L_I(x)\, C(x) = \sum_{I \in \mathcal{V}} \nu_I\, c_I(x)\, I_m = I_m.$$
Each $L_I(x)$ has degree $\le m - n$, so $L(x)$ has degree $\le m - n$.

Proposition 4.1 characterizes when there exists $L(x)$ satisfying (4.4). When it does, a degree bound for $L(x)$ is $m - \mathrm{rank}\, A$. Sometimes its degree can be smaller than that, as shown in Example 4.3. For given $A, b$, the matrix polynomial $L(x)$ satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of $L(x)\, C(x) = I_m$ for polyhedral sets.
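The coefficient-matching computation can be sketched as follows (illustrative code, not the paper's implementation): for the simplicial set with $n = 2$, parameterize a degree-one $L(x) = L_0 + L_1 x_1 + L_2 x_2$ and impose $L(x)C(x) = I_m$ at sample points, which for bounded degree is equivalent to matching coefficients; since a degree-one solution exists, the resulting linear system is consistent.

```python
import numpy as np

# Simplicial set in R^2: c1 = x1, c2 = x2, c3 = 1 - x1 - x2 (a hypothetical small instance).
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([0.0, 0.0, 1.0])
n, m = 2, 3

def C(x):
    # C(x) = [A^T; D(x)] as in (4.1), with D(x) = diag(c_1(x), ..., c_m(x)).
    return np.vstack([A.T, np.diag(A @ x - b)])

# Unknowns: entries of L0, L1, L2 (each m x (n+m)); enforce L(x)C(x) = I_m at
# sample points, which gives linear equations in those entries.
rng = np.random.default_rng(0)
rows, rhs = [], []
for x in rng.standard_normal((12, 2)):
    Cx, basis = C(x), np.array([1.0, x[0], x[1]])
    for i in range(m):
        for j in range(m):
            row = np.zeros(3 * m * (n + m))
            for t in range(3):
                for k in range(n + m):
                    row[t * m * (n + m) + i * (n + m) + k] = basis[t] * Cx[k, j]
            rows.append(row)
            rhs.append(1.0 if i == j else 0.0)
sol, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
L0, L1, L2 = sol.reshape(3, m, n + m)

# The recovered L(x) satisfies the identity at a fresh point.
x = np.array([0.2, 0.3])
Lx = L0 + L1 * x[0] + L2 * x[1]
print(np.allclose(Lx @ C(x), np.eye(m), atol=1e-6))
```

Sampling at 12 generic points suffices here because each entry of $L(x)C(x) - I_m$ is a bivariate quadratic, so vanishing at 6 generic points already forces it to vanish identically.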

Example 4.2. Consider the simplicial set
$$\{x_1 \ge 0, \ \ldots, \ x_n \ge 0, \ 1 - e^T x \ge 0\}.$$
The equation $L(x)\, C(x) = I_{n+1}$ is satisfied by
$$L(x) = \begin{bmatrix}
1-x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
-x_1 & 1-x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
-x_1 & -x_2 & \cdots & 1-x_n & 1 & \cdots & 1 \\
-x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1
\end{bmatrix}.$$

Example 4.3. Consider the box constraint
$$\{x_1 \ge 0, \ \ldots, \ x_n \ge 0, \ 1 - x_1 \ge 0, \ \ldots, \ 1 - x_n \ge 0\}.$$
The equation $L(x)\, C(x) = I_{2n}$ is satisfied by
$$L(x) = \begin{bmatrix}
I_n - \mathrm{diag}(x) & I_n & I_n \\
-\mathrm{diag}(x) & I_n & I_n
\end{bmatrix}.$$

Example 4.4. Consider the polyhedral set
$$\{1 - x_4 \ge 0, \ x_4 - x_3 \ge 0, \ x_3 - x_2 \ge 0, \ x_2 - x_1 \ge 0, \ x_1 + 1 \ge 0\}.$$
The equation $L(x)\, C(x) = I_5$ is satisfied by
$$L(x) = \frac{1}{2}\begin{bmatrix}
-x_1-1 & -x_2-1 & -x_3-1 & -x_4-1 & 1 & 1 & 1 & 1 & 1 \\
-x_1-1 & -x_2-1 & -x_3-1 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\
-x_1-1 & -x_2-1 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\
-x_1-1 & 1-x_2 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\
1-x_1 & 1-x_2 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}.$$
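The same kind of numerical check applies to Example 4.4 (illustrative code; the chain constraints are encoded as rows of $A$):

```python
import numpy as np

# Example 4.4 constraints: 1-x4, x4-x3, x3-x2, x2-x1, x1+1, all >= 0.
A = np.array([[0, 0, 0, -1], [0, 0, -1, 1], [0, -1, 1, 0],
              [-1, 1, 0, 0], [1, 0, 0, 0]], dtype=float)
b = np.array([-1.0, 0.0, 0.0, 0.0, -1.0])

def C(x):
    return np.vstack([A.T, np.diag(A @ x - b)])  # C(x) as in (4.1)

def L(x):
    left = np.array([[-x[0]-1, -x[1]-1, -x[2]-1, -x[3]-1],
                     [-x[0]-1, -x[1]-1, -x[2]-1,  1-x[3]],
                     [-x[0]-1, -x[1]-1,  1-x[2],  1-x[3]],
                     [-x[0]-1,  1-x[1],  1-x[2],  1-x[3]],
                     [ 1-x[0],  1-x[1],  1-x[2],  1-x[3]]])
    return 0.5 * np.hstack([left, np.ones((5, 5))])

x = np.random.default_rng(4).standard_normal(4)
assert np.allclose(L(x) @ C(x), np.eye(5))
```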

Example 4.5. Consider the polyhedral set
$$\{1 + x_1 \ge 0, \ 1 - x_1 \ge 0, \ 2 - x_1 - x_2 \ge 0, \ 2 - x_1 + x_2 \ge 0\}.$$
The matrix $L(x)$ satisfying $L(x)\, C(x) = I_4$ is
$$\frac{1}{6}\begin{bmatrix}
x_1^2 - 3x_1 + 2 & x_1 x_2 - x_2 & 4 - x_1 & 2 - x_1 & 1 - x_1 & 1 - x_1 \\
3x_1^2 - 3x_1 - 6 & 3x_2 + 3x_1 x_2 & 6 - 3x_1 & -3x_1 & -3x_1 - 3 & -3x_1 - 3 \\
1 - x_1^2 & -2x_2 - x_1 x_2 - 3 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2 \\
1 - x_1^2 & 3 - x_1 x_2 - 2x_2 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2
\end{bmatrix}.$$

5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers $\lambda_i$ as polynomial functions in $x$ on the set of critical points.

Suppose there are totally $m$ equality and inequality constraints, i.e.,
$$\mathcal{E} \cup \mathcal{I} = \{1, \ldots, m\}.$$
If $(x, \lambda)$ is a critical pair, then $\lambda_i c_i(x) = 0$ for all $i \in \mathcal{E} \cup \mathcal{I}$. So the Lagrange multiplier vector $\lambda = [\lambda_1\ \cdots\ \lambda_m]^T$ satisfies the equation
$$(5.1)\qquad
\underbrace{\begin{bmatrix}
\nabla c_1(x) & \nabla c_2(x) & \cdots & \nabla c_m(x) \\
c_1(x) & 0 & \cdots & 0 \\
0 & c_2(x) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & c_m(x)
\end{bmatrix}}_{C(x)}
\lambda =
\begin{bmatrix} \nabla f(x) \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}.$$
Let $C(x)$ be as above. If there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ such that
$$(5.2)\qquad L(x)\, C(x) = I_m,$$
then we can get
$$(5.3)\qquad \lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x)\, \nabla f(x),$$
where $L_1(x)$ consists of the first $n$ columns of $L(x)$. Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such $L(x)$ exists.

Definition 5.1. The tuple $c = (c_1, \ldots, c_m)$ of constraining polynomials is said to be nonsingular if $\mathrm{rank}\, C(u) = m$ for every $u \in \mathbb{C}^n$.

Clearly, $c$ being nonsingular is equivalent to the condition that, for each $u \in \mathbb{C}^n$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then the gradients $\nabla c_{i_1}(u), \ldots, \nabla c_{i_k}(u)$ are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple $c$ is nonsingular.

Proposition 5.2. (i) For each $W(x) \in \mathbb{C}[x]^{s \times t}$ with $s \ge t$, $\mathrm{rank}\, W(u) = t$ for all $u \in \mathbb{C}^n$ if and only if there exists $P(x) \in \mathbb{C}[x]^{t \times s}$ such that
$$P(x)\, W(x) = I_t.$$
Moreover, for $W(x) \in \mathbb{R}[x]^{s \times t}$, we can choose $P(x) \in \mathbb{R}[x]^{t \times s}$ for the above.
(ii) The constraining polynomial tuple $c$ is nonsingular if and only if there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ satisfying (5.2).

Proof. (i) "$\Leftarrow$": If $P(x)\, W(x) = I_t$, then for all $u \in \mathbb{C}^n$,
$$t = \mathrm{rank}\, I_t \le \mathrm{rank}\, W(u) \le t.$$
So $W(x)$ must have full column rank everywhere.

"$\Rightarrow$": Suppose $\mathrm{rank}\, W(u) = t$ for all $u \in \mathbb{C}^n$. Write $W(x)$ in columns:
$$W(x) = \big[\, w_1(x) \ \ w_2(x) \ \ \cdots \ \ w_t(x) \,\big].$$
Then the equation $w_1(x) = 0$ does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists $\xi_1(x) \in \mathbb{C}[x]^s$ such that $\xi_1(x)^T w_1(x) = 1$. For each $i = 2, \ldots, t$, denote
$$r_{1i}(x) = \xi_1(x)^T w_i(x);$$
then (we use $\sim$ to denote row equivalence between matrices)
$$W(x) \sim \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix} \sim W_1(x) = \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ 0 & w_2^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix},$$
where each ($i = 2, \ldots, t$)
$$w_i^{(1)}(x) = w_i(x) - r_{1i}(x)\, w_1(x).$$
So there exists $P_1(x) \in \mathbb{C}[x]^{(s+1) \times s}$ such that
$$P_1(x)\, W(x) = W_1(x).$$
Since $W(x)$ and $W_1(x)$ are row equivalent, $W_1(x)$ must also have full column rank everywhere. Similarly, the polynomial equation
$$w_2^{(1)}(x) = 0$$
does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists $\xi_2(x) \in \mathbb{C}[x]^s$ such that
$$\xi_2(x)^T w_2^{(1)}(x) = 1.$$
For each $i = 3, \ldots, t$, let $r_{2i}(x) = \xi_2(x)^T w_i^{(1)}(x)$; then
$$W_1(x) \sim \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & w_2^{(1)}(x) & w_3^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix} \sim W_2(x) = \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & 0 & w_3^{(2)}(x) & \cdots & w_t^{(2)}(x) \end{bmatrix},$$
where each ($i = 3, \ldots, t$)
$$w_i^{(2)}(x) = w_i^{(1)}(x) - r_{2i}(x)\, w_2^{(1)}(x).$$
Similarly, $W_1(x)$ and $W_2(x)$ are row equivalent, so $W_2(x)$ has full column rank everywhere. There exists $P_2(x) \in \mathbb{C}[x]^{(s+2) \times (s+1)}$ such that
$$P_2(x)\, W_1(x) = W_2(x).$$
Continuing this process, we can finally get
$$W_2(x) \sim \cdots \sim W_t(x) = \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & 1 & \cdots & r_{3t}(x) \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}.$$
Consequently, there exists $P_i(x) \in \mathbb{C}[x]^{(s+i) \times (s+i-1)}$, for $i = 1, 2, \ldots, t$, such that
$$P_t(x)\, P_{t-1}(x) \cdots P_1(x)\, W(x) = W_t(x).$$
Since $W_t(x)$ is a unit upper triangular matrix polynomial, there exists $P_{t+1}(x) \in \mathbb{C}[x]^{t \times (s+t)}$ such that $P_{t+1}(x)\, W_t(x) = I_t$. Let
$$P(x) = P_{t+1}(x)\, P_t(x)\, P_{t-1}(x) \cdots P_1(x);$$
then $P(x)\, W(x) = I_t$. Note that $P(x) \in \mathbb{C}[x]^{t \times s}$.

For $W(x) \in \mathbb{R}[x]^{s \times t}$, we can replace $P(x)$ by $\big(P(x) + \overline{P(x)}\big)/2$ (where $\overline{P(x)}$ denotes the complex conjugate of $P(x)$), which is a real matrix polynomial.

(ii) The conclusion is implied directly by the item (i).

In Proposition 5.2, there is no explicit degree bound for $L(x)$ satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for $L(x)$, it can be determined by comparing the coefficients of both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of $L(x)$ satisfying (5.2).

Example 5.3. Consider the hypercube with quadratic constraints
$$\{1 - x_1^2 \ge 0, \ 1 - x_2^2 \ge 0, \ \ldots, \ 1 - x_n^2 \ge 0\}.$$
The equation $L(x)\, C(x) = I_n$ is satisfied by
$$L(x) = \Big[\, -\tfrac{1}{2}\mathrm{diag}(x) \quad I_n \,\Big].$$

Example 5.4. Consider the nonnegative portion of the unit sphere
$$\{x_1 \ge 0, \ x_2 \ge 0, \ \ldots, \ x_n \ge 0, \ x_1^2 + \cdots + x_n^2 - 1 = 0\}.$$
The equation $L(x)\, C(x) = I_{n+1}$ is satisfied by
$$L(x) = \begin{bmatrix}
I_n - x x^T & x \mathbf{1}_n^T & 2x \\
\tfrac{1}{2}x^T & -\tfrac{1}{2}\mathbf{1}_n^T & -1
\end{bmatrix}.$$

Example 5.5. Consider the set
$$\{1 - x_1^3 - x_2^4 \ge 0, \ 1 - x_3^4 - x_4^3 \ge 0\}.$$
The equation $L(x)\, C(x) = I_2$ is satisfied by
$$L(x) = \begin{bmatrix}
-\tfrac{x_1}{3} & -\tfrac{x_2}{4} & 0 & 0 & 1 & 0 \\
0 & 0 & -\tfrac{x_3}{4} & -\tfrac{x_4}{3} & 0 & 1
\end{bmatrix}.$$

Example 5.6. Consider the quadratic set
$$\{1 - x_1 x_2 - x_2 x_3 - x_1 x_3 \ge 0, \ 1 - x_1^2 - x_2^2 - x_3^2 \ge 0\}.$$
The matrix $L(x)^T$ satisfying $L(x)\, C(x) = I_2$ is
$$\begin{bmatrix}
25x_1^3 + 10x_1^2 x_2 + 40x_1 x_2^2 - 25x_1 - 2x_3 & -25x_1^3 - 10x_1^2 x_2 - 40x_1 x_2^2 + \tfrac{49x_1}{2} + 2x_3 \\
-15x_1^2 x_2 + 10x_1 x_2^2 + 20x_3 x_1 x_2 - 10x_1 & 15x_1^2 x_2 - 10x_1 x_2^2 - 20x_3 x_1 x_2 + 10x_1 - \tfrac{x_2}{2} \\
25x_3 x_1^2 - 20x_1 x_2^2 + 10x_3 x_1 x_2 + 2x_1 & -25x_3 x_1^2 + 20x_1 x_2^2 - 10x_3 x_1 x_2 - 2x_1 - \tfrac{x_3}{2} \\
1 - 20x_1 x_3 - 10x_1^2 - 20x_1 x_2 & 20x_1 x_2 + 20x_1 x_3 + 10x_1^2 \\
-50x_1^2 - 20x_2 x_1 & 50x_1^2 + 20x_2 x_1 + 1
\end{bmatrix}.$$

We would like to remark that a polynomial tuple $c = (c_1, \ldots, c_m)$ is generically nonsingular, and hence Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees $d_1, \ldots, d_m$, there exists an open dense subset $\mathcal{U}$ of $\mathcal{D} = \mathbb{R}[x]_{d_1} \times \cdots \times \mathbb{R}[x]_{d_m}$, such that every tuple $c = (c_1, \ldots, c_m) \in \mathcal{U}$ is nonsingular. Indeed, such $\mathcal{U}$ can be chosen as a Zariski open subset of $\mathcal{D}$, i.e., it is the complement of a proper real variety of $\mathcal{D}$. Moreover, Assumption 3.1 holds for all $c \in \mathcal{U}$, i.e., it holds generically.

Proof. The proof needs to use resultants and discriminants, for which we refer to [29].

First, let $J_1$ be the set of all $(i_1, \ldots, i_{n+1})$ with $1 \le i_1 < \cdots < i_{n+1} \le m$. The resultant $\mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}})$ [29, Section 2] is a polynomial in the coefficients of $c_{i_1}, \ldots, c_{i_{n+1}}$ such that, if $\mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}}) \ne 0$, then the equations
$$c_{i_1}(x) = \cdots = c_{i_{n+1}}(x) = 0$$
have no complex solutions. Define
$$F_1(c) = \prod_{(i_1, \ldots, i_{n+1}) \in J_1} \mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}}).$$
For the case that $m \le n$, $J_1 = \emptyset$, and we simply let $F_1(c) = 1$. Clearly, if $F_1(c) \ne 0$, then no more than $n$ polynomials of $c_1, \ldots, c_m$ have a common complex zero.

Second, let $J_2$ be the set of all $(j_1, \ldots, j_k)$ with $k \le n$ and $1 \le j_1 < \cdots < j_k \le m$. When one of $c_{j_1}, \ldots, c_{j_k}$ has degree bigger than one, the discriminant $\Delta(c_{j_1}, \ldots, c_{j_k})$ is a polynomial in the coefficients of $c_{j_1}, \ldots, c_{j_k}$ such that, if $\Delta(c_{j_1}, \ldots, c_{j_k}) \ne 0$, then the equations
$$c_{j_1}(x) = \cdots = c_{j_k}(x) = 0$$
have no singular complex solution [29, Section 3], i.e., at every complex common solution $u$, the gradients of $c_{j_1}, \ldots, c_{j_k}$ at $u$ are linearly independent. When all of $c_{j_1}, \ldots, c_{j_k}$ have degree one, the discriminant of the tuple $(c_{j_1}, \ldots, c_{j_k})$ is not a single polynomial, but we can define $\Delta(c_{j_1}, \ldots, c_{j_k})$ to be the product of all maximum minors of its Jacobian (a constant matrix). Define
$$F_2(c) = \prod_{(j_1, \ldots, j_k) \in J_2} \Delta(c_{j_1}, \ldots, c_{j_k}).$$
Clearly, if $F_2(c) \ne 0$, then no $n$ or fewer polynomials of $c_1, \ldots, c_m$ have a singular complex common zero.

Last, let $F(c) = F_1(c) F_2(c)$ and
$$\mathcal{U} = \{c = (c_1, \ldots, c_m) \in \mathcal{D} :\ F(c) \ne 0\}.$$
Note that $\mathcal{U}$ is a Zariski open subset of $\mathcal{D}$, and it is open dense in $\mathcal{D}$. For all $c \in \mathcal{U}$, no more than $n$ of $c_1, \ldots, c_m$ can have a complex common zero. For any $k$ polynomials ($k \le n$) of $c_1, \ldots, c_m$, if they have a complex common zero, say $u$, then their gradients at $u$ must be linearly independent. This means that $c$ is a nonsingular tuple.

Since every $c \in \mathcal{U}$ is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all $c \in \mathcal{U}$, so it holds generically.

6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.

The polynomials $p_i$ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as $c_1, \ldots, c_m$. First, find a matrix polynomial $L(x)$ satisfying (4.4) or (5.2). Let $L_1(x)$ be the submatrix of $L(x)$ consisting of its first $n$ columns. Then choose $(p_1, \ldots, p_m)$ to be the product $L_1(x)\nabla f(x)$, i.e.,
$$p_i = \big(L_1(x)\, \nabla f(x)\big)_i.$$
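This construction can be illustrated on the data of Example 6.2 below, where the single constraint is $x_1^2 + x_2^2 + x_3^2 \ge 1$ and $L_1(x) = [\frac{x_1}{2}\ \frac{x_2}{2}\ \frac{x_3}{2}]$, so $p_1(x) = \frac{1}{2}x^T\nabla f(x)$. The sketch below (illustrative code) evaluates $p_1$ at a known minimizer and confirms that it is exactly the Lagrange multiplier in the stationarity condition $\nabla f = \lambda_1 \nabla c_1$.

```python
import numpy as np

# f = Motzkin(x) + (x1^4 + x2^4 + x3^4), as in Example 6.2 below.
def grad_f(x):
    x1, x2, x3 = x
    return np.array([
        4*x1**3*x2**2 + 2*x1*x2**4 - 6*x1*x2**2*x3**2 + 4*x1**3,
        2*x1**4*x2 + 4*x1**2*x2**3 - 6*x1**2*x2*x3**2 + 4*x2**3,
        6*x3**5 - 6*x1**2*x2**2*x3 + 4*x3**3,
    ])

x = np.ones(3) / np.sqrt(3)     # a global minimizer of Example 6.2
lam = 0.5 * x @ grad_f(x)       # Lagrange multiplier p_1(x) = (L_1(x) grad f(x))_1
grad_c = 2 * x                  # gradient of c_1 = x'x - 1
assert np.allclose(grad_f(x), lam * grad_c, atol=1e-12)
```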

In all our examples, the global minimum value $f_{min}$ of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., the sub-level set $\{f(x) \le \ell\}$ is compact for all $\ell$), and the constraint qualification condition holds.

By Theorem 3.3, we have $f_k = f_{min}$ for all $k$ big enough, if $f_c = f_{min}$ and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all; the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proves that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables; the computational time (in seconds) is also compared. The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".

Example 6.1. Consider the optimization problem
$$\begin{array}{rl} \min & x_1 x_2 (10 - x_3) \\ \text{s.t.} & x_1 \ge 0, \ x_2 \ge 0, \ x_3 \ge 0, \ 1 - x_1 - x_2 - x_3 \ge 0. \end{array}$$
The matrix polynomial $L(x)$ is given in Example 4.2. Since the feasible set is compact, the minimum $f_{min} = 0$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point $(x_1, x_2, x_3)$ with $x_1 x_2 = 0$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

  order k | w/o LME: lower bound (time) | with LME: lower bound (time)
  2       | $-0.0521$ (0.6841)          | $-0.0521$ (0.1922)
  3       | $-0.0026$ (0.2657)          | $-3\cdot10^{-8}$ (0.2285)
  4       | $-0.0007$ (0.6785)          | $-6\cdot10^{-9}$ (0.4431)
  5       | $-0.0004$ (1.6105)          | $-2\cdot10^{-9}$ (0.9567)

² Note that $1 - x^T x = (1 - e^T x)(1 + x^T x) + \sum_{i=1}^n x_i (1 - x_i)^2 + \sum_{i \ne j} x_i^2 x_j \in \mathrm{IQ}(c_{eq}, c_{in})$.

Example 6.2. Consider the optimization problem
$$\begin{array}{rl} \min & x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3x_1^2 x_2^2 x_3^2 + (x_1^4 + x_2^4 + x_3^4) \\ \text{s.t.} & x_1^2 + x_2^2 + x_3^2 \ge 1. \end{array}$$
The matrix polynomial $L(x) = \big[\tfrac{1}{2}x_1 \ \ \tfrac{1}{2}x_2 \ \ \tfrac{1}{2}x_3 \ \ -1\big]$. The objective $f$ is the sum of the positive definite form $x_1^4 + x_2^4 + x_3^4$ and the Motzkin polynomial
$$M(x) = x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3x_1^2 x_2^2 x_3^2.$$
Note that $M(x)$ is nonnegative everywhere but not SOS [37]. Clearly, $f$ is coercive, and $f_{min}$ is achieved at a critical point. The set $\mathrm{IQ}(\phi, \psi)$ is archimedean, because
$$c_1(x)\, p_1(x) = (x_1^2 + x_2^2 + x_3^2 - 1)\Big(3M(x) + 2(x_1^4 + x_2^4 + x_3^4)\Big) = 0$$
defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value $f_{min} = \frac{1}{3}$, and there are 8 minimizers $\big(\pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}}\big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

  order k | w/o LME: lower bound (time)  | with LME: lower bound (time)
  3       | $-\infty$ (0.4466)           | 0.1111 (0.1169)
  4       | $-\infty$ (0.4948)           | 0.3333 (0.3499)
  5       | $-2.1821\cdot10^{5}$ (1.1836)| 0.3333 (0.6530)

Example 6.3. Consider the optimization problem
$$\begin{array}{rl} \min & x_1 x_2 + x_2 x_3 + x_3 x_4 - 3x_1 x_2 x_3 x_4 + (x_1^3 + \cdots + x_4^3) \\ \text{s.t.} & x_1, x_2, x_3, x_4 \ge 0, \ 1 - x_1 - x_2 \ge 0, \ 1 - x_3 - x_4 \ge 0. \end{array}$$
The matrix polynomial $L(x)$ is
$$\begin{bmatrix}
1-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
-x_1 & 1-x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1-x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & -x_3 & 1-x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}.$$
The feasible set is compact, so $f_{min}$ is achieved at a critical point. One can show that $f_{min} = 0$ and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because $\mathrm{IQ}(c_{eq}, c_{in})$ is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.

Table 3. Computational results for Example 6.3

  order k | w/o LME: lower bound (time)     | with LME: lower bound (time)
  3       | $-2.9\cdot10^{-5}$ (0.7335)     | $-6\cdot10^{-7}$ (0.6091)
  4       | $-1.4\cdot10^{-5}$ (2.5055)     | $-8\cdot10^{-8}$ (2.7423)
  5       | $-1.4\cdot10^{-5}$ (12.7092)    | $-5\cdot10^{-8}$ (13.7449)

³ This is because $-c_1^2 p_1^2 \in \mathrm{Ideal}(\phi) \subseteq \mathrm{IQ}(\phi, \psi)$ and the set $\{-c_1(x)^2 p_1(x)^2 \ge 0\}$ is compact.

⁴ This is because $1 - x_1^2 - x_2^2$ belongs to the quadratic module of $(x_1, x_2, 1 - x_1 - x_2)$, and $1 - x_3^2 - x_4^2$ belongs to the quadratic module of $(x_3, x_4, 1 - x_3 - x_4)$. See the footnote in Example 6.1.

Example 6.4. Consider the polynomial optimization problem
$$\min_{x \in \mathbb{R}^2} \ x_1^2 + 50x_2^2 \quad \text{s.t.} \quad x_1^2 - \tfrac{1}{2} \ge 0, \ \ x_2^2 - 2x_1 x_2 - \tfrac{1}{8} \ge 0, \ \ x_2^2 + 2x_1 x_2 - \tfrac{1}{8} \ge 0.$$

It is motivated by an example in [12, §3]. The first column of $L(x)$ is
$$\begin{bmatrix}
\tfrac{8}{5}x_1^3 + \tfrac{1}{5}x_1 \\[3pt]
\tfrac{288}{5}x_2 x_1^4 - \tfrac{16}{5}x_1^3 - \tfrac{124}{5}x_2 x_1^2 + \tfrac{8}{5}x_1 - 2x_2 \\[3pt]
-\tfrac{288}{5}x_2 x_1^4 - \tfrac{16}{5}x_1^3 + \tfrac{124}{5}x_2 x_1^2 + \tfrac{8}{5}x_1 + 2x_2
\end{bmatrix},$$
and the second column of $L(x)$ is
$$\begin{bmatrix}
-\tfrac{8}{5}x_1^2 x_2 + \tfrac{4}{5}x_2^3 - \tfrac{1}{10}x_2 \\[3pt]
\tfrac{288}{5}x_1^3 x_2^2 + \tfrac{16}{5}x_1^2 x_2 - \tfrac{142}{5}x_1 x_2^2 - \tfrac{9}{20}x_1 - \tfrac{8}{5}x_2^3 + \tfrac{11}{5}x_2 \\[3pt]
-\tfrac{288}{5}x_1^3 x_2^2 + \tfrac{16}{5}x_1^2 x_2 + \tfrac{142}{5}x_1 x_2^2 + \tfrac{9}{20}x_1 - \tfrac{8}{5}x_2^3 + \tfrac{11}{5}x_2
\end{bmatrix}.$$
The objective is coercive, so $f_{min}$ is achieved at a critical point. The minimum value
$$f_{min} = 56 + \tfrac{3}{4} + 25\sqrt{5} \approx 112.6517,$$
and the minimizers are $\big(\pm\sqrt{1/2}, \ \pm(\sqrt{5/8} + \sqrt{1/2})\big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

  order k | w/o LME: lower bound (time) | with LME: lower bound (time)
  3       | 6.7535 (0.4611)             | 56.7500 (0.1309)
  4       | 6.9294 (0.2428)             | 112.6517 (0.2405)
  5       | 8.8519 (0.3376)             | 112.6517 (0.2167)
  6       | 16.5971 (0.4703)            | 112.6517 (0.3788)
  7       | 35.4756 (0.6536)            | 112.6517 (0.4537)

Example 6.5. Consider the optimization problem
$$\min_{x \in \mathbb{R}^3} \ x_1^3 + x_2^3 + x_3^3 + 4x_1 x_2 x_3 - \big(x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2)\big)$$
$$\text{s.t.} \quad x_1 \ge 0, \ \ x_1 x_2 - 1 \ge 0, \ \ x_2 x_3 - 1 \ge 0.$$
The matrix polynomial $L(x)$ is
$$\begin{bmatrix}
1 - x_1 x_2 & 0 & 0 & x_2 & x_2 & 0 \\
x_1 & 0 & 0 & -1 & -1 & 0 \\
-x_1 & x_2 & 0 & 1 & 0 & -1
\end{bmatrix}.$$
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant $\mathbb{R}^3_+$, so the minimum value is achieved at a critical point. In computation, we got $f_{min} \approx 0.9492$ and a global minimizer $(0.9071, 1.1024, 0.9071)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

  order k | w/o LME: lower bound (time)   | with LME: lower bound (time)
  2       | $-\infty$ (0.4129)            | $-\infty$ (0.1900)
  3       | $-7.8184\cdot10^{6}$ (0.4641) | 0.9492 (0.3139)
  4       | $-2.0575\cdot10^{4}$ (0.6499) | 0.9492 (0.5057)

Example 6.6. Consider the optimization problem ($x_0 = 1$)
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & x^T x + \sum_{i=0}^4 \prod_{j \ne i} (x_i - x_j) \\ \text{s.t.} & x_1^2 - 1 \ge 0, \ x_2^2 - 1 \ge 0, \ x_3^2 - 1 \ge 0, \ x_4^2 - 1 \ge 0. \end{array}$$
The matrix polynomial $L(x) = \big[\tfrac{1}{2}\mathrm{diag}(x) \ \ -I_4\big]$. The first part of the objective is $x^T x$, while the second part is a nonnegative polynomial [37]. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} = 4.0000$ and 11 global minimizers:
$$(1,1,1,1), \ (1,-1,-1,1), \ (1,-1,1,-1), \ (1,1,-1,-1),$$
$$(1,-1,-1,-1), \ (-1,-1,1,1), \ (-1,1,-1,1), \ (-1,1,1,-1),$$
$$(-1,-1,-1,1), \ (-1,-1,1,-1), \ (-1,1,-1,-1).$$
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

  order k | w/o LME: lower bound (time)   | with LME: lower bound (time)
  3       | $-\infty$ (1.1377)            | 3.5480 (1.1765)
  4       | $-6.6913\cdot10^{4}$ (4.7677) | 4.0000 (3.0761)
  5       | $-21.3778$ (22.9970)          | 4.0000 (10.3354)

Example 6.7. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^3} & x_1^4 x_2^2 + x_2^4 x_3^2 + x_3^4 x_1^2 - 3x_1^2 x_2^2 x_3^2 + x_2^2 \\ \text{s.t.} & x_1 - x_2 x_3 \ge 0, \ -x_2 + x_3^2 \ge 0. \end{array}$$
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}$. By the arithmetic-geometric mean inequality, one can show that $f_{min} = 0$. The global minimizers are $(x_1, 0, x_3)$ with $x_1 \ge 0$ and $x_1 x_3 = 0$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that $f_k = f_{min}$ for all $k \ge 5$, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

  order k | w/o LME: lower bound (time)   | with LME: lower bound (time)
  3       | $-\infty$ (0.6144)            | $-\infty$ (0.3418)
  4       | $-1.0909\cdot10^{7}$ (1.0542) | $-3.9476$ (0.7180)
  5       | $-942.6772$ (1.6771)          | $-3\cdot10^{-9}$ (1.4607)
  6       | $-0.0110$ (3.3532)            | $-8\cdot10^{-10}$ (3.1618)

Example 6.8. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & x_1^2(x_1 - x_4)^2 + x_2^2(x_2 - x_4)^2 + x_3^2(x_3 - x_4)^2 \\ & +\, 2x_1 x_2 x_3 (x_1 + x_2 + x_3 - 2x_4) + (x_1 - 1)^2 + (x_2 - 1)^2 + (x_3 - 1)^2 \\ \text{s.t.} & x_1 - x_2 \ge 0, \ x_2 - x_3 \ge 0. \end{array}$$
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}$. In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} \approx 0.9413$ and a minimizer
$$(0.5632, \ 0.5632, \ 0.5632, \ 0.7510).$$
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

  order k | w/o LME: lower bound (time)    | with LME: lower bound (time)
  2       | $-\infty$ (0.3984)             | $-0.3360$ (0.9321)
  3       | $-\infty$ (0.7634)             | 0.9413 (0.5240)
  4       | $-6.4896\cdot10^{5}$ (4.5496)  | 0.9413 (1.7192)
  5       | $-3.1645\cdot10^{3}$ (24.3665) | 0.9413 (8.1228)

Example 6.9. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & (x_1 + x_2 + x_3 + x_4 + 1)^2 - 4(x_1 x_2 + x_2 x_3 + x_3 x_4 + x_4 + x_1) \\ \text{s.t.} & 0 \le x_1, \ldots, x_4 \le 1. \end{array}$$
The matrix $L(x)$ is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so $f_{min}$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value $f_{min} = 0$. For each $t \in [0, 1]$, the point $(t, 0, 0, 1-t)$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that $4 - \sum_{i=1}^4 x_i^2 = \sum_{i=1}^4 \big(x_i(1 - x_i)^2 + (1 - x_i)(1 + x_i^2)\big) \in \mathrm{IQ}(c_{eq}, c_{in})$.

Table 9. Computational results for Example 6.9

  order k | w/o LME: lower bound (time) | with LME: lower bound (time)
  2       | $-0.0279$ (0.2262)          | $-5\cdot10^{-6}$ (1.1835)
  3       | $-0.0005$ (0.4691)          | $-6\cdot10^{-7}$ (1.6566)
  4       | $-0.0001$ (3.1098)          | $-2\cdot10^{-7}$ (5.5234)
  5       | $-4\cdot10^{-5}$ (16.5092)  | $-6\cdot10^{-7}$ (19.7320)

For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem (x_0 = 1)

  min_{x∈R^n}  Σ_{0≤i≤j≤k≤n} c_{ijk} x_i x_j x_k
  s.t.  0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have f_c = f_min. The condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by standard Lasserre's relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

     n     |    9        10       11       12        13        14
  w/o LME  |  1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
  with LME |  1.9714   3.8288   8.2519   20.0310   37.6373   82.4778
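For readers who want to reproduce this setup outside MATLAB, the random cubic objective can be generated as follows (a Python sketch of our own; the function name is ours, not from the paper):

```python
import itertools
import numpy as np

def random_cubic_objective(n, seed=0):
    """Draw c_ijk ~ N(0,1) for all 0 <= i <= j <= k <= n (convention x_0 = 1)
    and return the objective f as a callable, plus the coefficient dict."""
    rng = np.random.default_rng(seed)
    coeffs = {idx: rng.standard_normal()
              for idx in itertools.combinations_with_replacement(range(n + 1), 3)}

    def f(x):
        z = np.concatenate(([1.0], np.asarray(x, dtype=float)))  # prepend x_0 = 1
        return sum(c * z[i] * z[j] * z[k] for (i, j, k), c in coeffs.items())

    return f, coeffs

f, coeffs = random_cubic_objective(4)
# At x = 0, only the constant term c_000 * x_0^3 survives.
print(abs(f(np.zeros(4)) - coeffs[(0, 0, 0)]) < 1e-12)  # -> True
```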

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value f_min is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k} = Ideal(c_eq, φ)_{2k} + Qmod(c_in, ψ)_{2k}.

If we replace the quadratic module Qmod(c_in, ψ) by the preordering of (c_in, ψ) [21, 25], we can get even tighter relaxations. For convenience, write (c_in, ψ) as a single tuple (g_1, ..., g_ℓ). Its preordering is the set

Preord(c_in, ψ) := Σ_{r_1,...,r_ℓ ∈ {0,1}} g_1^{r_1} ··· g_ℓ^{r_ℓ} · Σ[x].

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 27

The truncation Preord(c_in, ψ)_{2k} is defined similarly to Qmod(c_in, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)
  f'^pre_k := min ⟨f, y⟩
  s.t. ⟨1, y⟩ = 1, L^{(k)}_{c_eq}(y) = 0, L^{(k)}_φ(y) = 0,
       L^{(k)}_{g_1^{r_1}···g_ℓ^{r_ℓ}}(y) ⪰ 0 for all r_1, ..., r_ℓ ∈ {0,1},
       y ∈ R^{N^n_{2k}}.

Similar to (3.9), the dual optimization problem of the above is

(7.2)
  f^pre_k := max γ  s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Preord(c_in, ψ)_{2k}.

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose K_c ≠ ∅ and Assumption 3.1 holds. Then

f^pre_k = f'^pre_k = f_c

for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f^pre_k = f'^pre_k = f_min for all k big enough.

Proof. The proof is very similar to Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have f̂(x) ≡ 0 on the set

K_3 = {x ∈ R^n | c_eq(x) = 0, φ(x) = 0, c_in(x) ≥ 0, ψ(x) ≥ 0}.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(c_in, ψ) such that f̂^{2ℓ} + q ∈ Ideal(c_eq, φ). The rest of the proof is the same. □

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is not nonsingular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is applied at each usage, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

λ = ( C(x)^T C(x) )^{−1} C(x)^T [∇f(x); 0]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a high degree polynomial. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.

Acknowledgements. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (2001), no. 3, pp. 796–817.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093

E-mail address: njw@math.ucsd.edu


1.3. New contributions. When Lasserre's relaxations are used to solve polynomial optimization, the following issues are typically of concern:

• The convergence depends on the archimedean condition (see §2), which is satisfied only if the feasible set is compact. If the set is noncompact, how can we get convergent relaxations?
• The cost of Lasserre's relaxations depends significantly on the relaxation order. For a fixed order, can we construct tighter relaxations than the standard ones?
• When the convergence of Lasserre's relaxations is slow, can we construct new relaxations whose convergence is faster?
• When the optimality conditions fail to hold, the Lasserre hierarchy might not have finite convergence. Can we construct a new hierarchy of stronger relaxations that also has finite convergence for such cases?

This paper addresses the above issues. We construct tighter relaxations by using optimality conditions. In (1.3)-(1.4), under the constraint qualification condition, the Lagrange multipliers λ_i are uniquely determined by u. Consider the polynomial system in (x, λ):

(1.8)  Σ_{i∈E∪I} λ_i ∇c_i(x) = ∇f(x),  c_i(x) = 0 (i ∈ E),  λ_j c_j(x) = 0 (j ∈ I).

A point x satisfying (1.8) is called a critical point, and such (x, λ) is called a critical pair. In (1.8), once x is known, λ can be determined by linear equations. Generally, the value of x is not known. One can try to express λ as a rational function in x. Suppose E ∪ I = {1, ..., m} and denote

G(x) = [∇c_1(x) ··· ∇c_m(x)].

When m ≤ n and rank G(x) = m, we can get the rational expression

(1.9)  λ = ( G(x)^T G(x) )^{−1} G(x)^T ∇f(x).

Typically, the matrix inverse ( G(x)^T G(x) )^{−1} is expensive for usage. The denominator det( G(x)^T G(x) ) is typically a high degree polynomial. When m > n, G(x)^T G(x) is always singular, and we cannot express λ as in (1.9).

Do there exist polynomials p_i (i ∈ E ∪ I) such that each

(1.10)  λ_i = p_i(x)

for all (x, λ) satisfying (1.8)? If they exist, then we can do the following:

• The polynomial system (1.8) can be simplified to

(1.11)  Σ_{i∈E∪I} p_i(x) ∇c_i(x) = ∇f(x),  c_i(x) = 0 (i ∈ E),  p_j(x) c_j(x) = 0 (j ∈ I).

• For each j ∈ I, the sign condition λ_j ≥ 0 is equivalent to

(1.12)  p_j(x) ≥ 0.

The new conditions (1.11) and (1.12) are only about the variable x, not λ. They can be used to construct tighter relaxations for solving (1.1).
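The least-squares formula (1.9) is easy to test numerically. The sketch below (our own illustration, not the paper's code) recovers λ for a toy sphere-constrained problem, where stationarity forces λ_1 = −1:

```python
import numpy as np

def multipliers_via_G(grad_f, grads_c, x):
    """Least-squares formula (1.9): lambda = (G^T G)^{-1} G^T grad f(x),
    valid when G(x) has full column rank."""
    G = np.column_stack([g(x) for g in grads_c])
    return np.linalg.solve(G.T @ G, G.T @ grad_f(x))

# Toy problem: f(x) = x^T x with the sphere constraint c1(x) = 1 - x^T x.
# Every point on the sphere is critical, with lambda_1 = -1, since
# grad f = 2x must equal lambda * grad c1 = -2*lambda*x.
x = np.array([0.6, 0.8])                      # a point on the unit sphere
lam = multipliers_via_G(lambda x: 2 * x, [lambda x: -2 * x], x)
print(np.isclose(lam[0], -1.0))               # -> True
```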

When do there exist polynomials p_i satisfying (1.10)? If they exist, how can we compute them? How can we use them to construct tighter relaxations? Do the new relaxations have advantages over the old ones? These questions are the main topics of this paper. Our major results are:


• We show that the polynomials p_i satisfying (1.10) always exist when the tuple of constraining polynomials is nonsingular (see Definition 5.1). Moreover, they can be determined by linear equations.
• Using the new conditions (1.11)-(1.12), we can construct tight relaxations for solving (1.1). To be more precise, we construct a hierarchy of new relaxations which has finite convergence. This is true even if the feasible set is noncompact and/or the optimality conditions fail to hold.
• For every relaxation order, the new relaxations are tighter than the standard ones in the prior work.

The paper is organized as follows. Section 2 reviews some basics in polynomial optimization. Section 3 constructs new relaxations and proves their tightness. Section 4 characterizes when the polynomials p_i satisfying (1.10) exist, and shows how to determine them, for polyhedral constraints. Section 5 discusses the case of general nonlinear constraints. Section 6 gives examples of using the new relaxations. Section 7 discusses some related issues.

2. Preliminaries

Notation. The symbol N (resp., R, C) denotes the set of nonnegative integral (resp., real, complex) numbers. The symbol R[x] = R[x_1, ..., x_n] denotes the ring of polynomials in x = (x_1, ..., x_n) with real coefficients. The R[x]_d stands for the set of real polynomials with degrees ≤ d. Denote

N^n_d := { α = (α_1, ..., α_n) ∈ N^n | |α| = α_1 + ··· + α_n ≤ d }.

For a polynomial p, deg(p) denotes its total degree. For t ∈ R, ⌈t⌉ denotes the smallest integer ≥ t. For an integer k > 0, denote [k] := {1, 2, ..., k}. For x = (x_1, ..., x_n) and α = (α_1, ..., α_n), denote

x^α = x_1^{α_1} ··· x_n^{α_n},   [x]_d = [ 1, x_1, ..., x_n, x_1^2, x_1x_2, ..., x_n^d ]^T.

The superscript T denotes the transpose of a matrix/vector. The e_i denotes the ith standard unit vector, while e denotes the vector of all ones. The I_m denotes the m-by-m identity matrix. By writing X ⪰ 0 (resp., X ≻ 0), we mean that X is a symmetric positive semidefinite (resp., positive definite) matrix. For matrices X_1, ..., X_r, diag(X_1, ..., X_r) denotes the block diagonal matrix whose diagonal blocks are X_1, ..., X_r. In particular, for a vector a, diag(a) denotes the diagonal matrix whose diagonal vector is a. For a function f in x, f_{x_i} denotes its partial derivative with respect to x_i.

We review some basics in computational algebra and polynomial optimization. They can be found in [4, 21, 22, 25, 26]. An ideal I of R[x] is a subset such that I · R[x] ⊆ I and I + I ⊆ I. For a tuple h = (h_1, ..., h_m) of polynomials, Ideal(h) denotes the smallest ideal containing all h_i, which is the set

h_1 · R[x] + ··· + h_m · R[x].

The 2kth truncation of Ideal(h) is the set

Ideal(h)_{2k} := h_1 · R[x]_{2k−deg(h_1)} + ··· + h_m · R[x]_{2k−deg(h_m)}.

The truncation Ideal(h)_{2k} depends on the generators h_1, ..., h_m. For an ideal I, its complex and real varieties are respectively defined as

V_C(I) := { v ∈ C^n | p(v) = 0 ∀ p ∈ I },   V_R(I) := V_C(I) ∩ R^n.


A polynomial σ is said to be a sum of squares (SOS) if σ = s_1^2 + ··· + s_k^2 for some polynomials s_1, ..., s_k ∈ R[x]. The set of all SOS polynomials in x is denoted as Σ[x]. For a degree d, denote the truncation

Σ[x]_d := Σ[x] ∩ R[x]_d.

For a tuple g = (g_1, ..., g_t), its quadratic module is the set

Qmod(g) := Σ[x] + g_1 · Σ[x] + ··· + g_t · Σ[x].

The 2kth truncation of Qmod(g) is the set

Qmod(g)_{2k} := Σ[x]_{2k} + g_1 · Σ[x]_{2k−deg(g_1)} + ··· + g_t · Σ[x]_{2k−deg(g_t)}.

The truncation Qmod(g)_{2k} depends on the generators g_1, ..., g_t. Denote

(2.1)  IQ(h, g) := Ideal(h) + Qmod(g),   IQ(h, g)_{2k} := Ideal(h)_{2k} + Qmod(g)_{2k}.

The set IQ(h, g) is said to be archimedean if there exists p ∈ IQ(h, g) such that p(x) ≥ 0 defines a compact set in R^n. If IQ(h, g) is archimedean, then

K := { x ∈ R^n | h(x) = 0, g(x) ≥ 0 }

must be a compact set. Conversely, if K is compact, say K ⊆ B(0, R) (the ball centered at 0 with radius R), then IQ(h, (g, R^2 − x^Tx)) is always archimedean, and h = 0, (g, R^2 − x^Tx) ≥ 0 give the same set K.

Theorem 2.1 (Putinar [36]). Let h, g be tuples of polynomials in R[x]. Let K be as above. Assume IQ(h, g) is archimedean. If a polynomial f ∈ R[x] is positive on K, then f ∈ IQ(h, g).

Interestingly, if f is only nonnegative on K but standard optimality conditions hold (see Subsection 1.1), then we still have f ∈ IQ(h, g) [33].

Let R^{N^n_d} be the space of real multi-sequences indexed by α ∈ N^n_d. A vector in R^{N^n_d} is called a truncated multi-sequence (tms) of degree d. A tms y = (y_α)_{α∈N^n_d} gives the Riesz functional R_y, acting on R[x]_d as

(2.2)  R_y( Σ_{α∈N^n_d} f_α x^α ) := Σ_{α∈N^n_d} f_α y_α.

For f ∈ R[x]_d and y ∈ R^{N^n_d}, we denote

(2.3)  ⟨f, y⟩ := R_y(f).
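As a concrete illustration (our own sketch, not from the paper), a tms and the Riesz functional (2.2) can be coded with exponent-tuple dictionaries; for the tms of a point evaluation at u, the pairing ⟨f, y⟩ recovers f(u):

```python
import numpy as np

def riesz(f_coeffs, y):
    """Riesz functional (2.2): <f, y> = sum_alpha f_alpha * y_alpha.
    Polynomials and tms are both stored as {alpha: value} dicts,
    where alpha is a tuple of exponents."""
    return sum(c * y[alpha] for alpha, c in f_coeffs.items())

# n = 2, d = 2: f = 3 + 2*x1*x2 - x2^2
f = {(0, 0): 3.0, (1, 1): 2.0, (0, 2): -1.0}
# tms of the point evaluation at u = (2, 1): y_alpha = u^alpha,
# so <f, y> should equal f(2, 1) = 3 + 4 - 1 = 6.
u = np.array([2.0, 1.0])
y = {a: float(np.prod(u ** np.array(a)))
     for a in [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]}
print(riesz(f, y))  # -> 6.0
```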

Let q ∈ R[x]_{2k}. The kth localizing matrix of q, generated by y ∈ R^{N^n_{2k}}, is the symmetric matrix L^{(k)}_q(y) such that

(2.4)  vec(a_1)^T ( L^{(k)}_q(y) ) vec(a_2) = R_y(q a_1 a_2)

for all a_1, a_2 ∈ R[x]_{k−⌈deg(q)/2⌉}. (The vec(a_i) denotes the coefficient vector of a_i.) When q = 1, L^{(k)}_q(y) is called a moment matrix, and we denote

(2.5)  M_k(y) := L^{(k)}_1(y).

The columns and rows of L^{(k)}_q(y), as well as M_k(y), are indexed by α ∈ N^n with 2|α| + deg(q) ≤ 2k. When q = (q_1, ..., q_r) is a tuple of polynomials, we define

(2.6)  L^{(k)}_q(y) := diag( L^{(k)}_{q_1}(y), ..., L^{(k)}_{q_r}(y) ),


a block diagonal matrix. For the polynomial tuples h, g as above, the set

(2.7)  S(h, g)_{2k} := { y ∈ R^{N^n_{2k}} | L^{(k)}_h(y) = 0, L^{(k)}_g(y) ⪰ 0 }

is a spectrahedral cone in R^{N^n_{2k}}. The set IQ(h, g)_{2k} is also a convex cone in R[x]_{2k}. The dual cone of IQ(h, g)_{2k} is precisely S(h, g)_{2k} [22, 25, 34]. This is because ⟨p, y⟩ ≥ 0 for all p ∈ IQ(h, g)_{2k} and for all y ∈ S(h, g)_{2k}.
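For n = 1, the moment matrix M_k(y) is just the Hankel matrix (M_k(y))_{ij} = y_{i+j}. The following sketch (ours, not the paper's code) builds it for the tms of a point evaluation, which yields a rank-one positive semidefinite matrix:

```python
import numpy as np

def moment_matrix_1d(y, k):
    """For n = 1, the moment matrix M_k(y) is Hankel: (M_k)_{ij} = y_{i+j},
    for 0 <= i, j <= k (this needs the moments y_0, ..., y_{2k})."""
    return np.array([[y[i + j] for j in range(k + 1)] for i in range(k + 1)])

# tms of the point evaluation at t = 2: y_j = t^j.  Then M_k(y) = v v^T
# with v = (1, t, ..., t^k), a rank-one positive semidefinite matrix.
t, k = 2.0, 2
y = [t ** j for j in range(2 * k + 1)]
M = moment_matrix_1d(y, k)
print(np.linalg.matrix_rank(M), np.min(np.linalg.eigvalsh(M)) >= -1e-9)  # -> 1 True
```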

3. The construction of tight relaxations

Consider the polynomial optimization problem (1.1). Let

λ := (λ_i)_{i∈E∪I}

be the vector of Lagrange multipliers. Denote the set

(3.1)  K := { (x, λ) ∈ R^n × R^{E∪I} | c_i(x) = 0 (i ∈ E), λ_j c_j(x) = 0 (j ∈ I), ∇f(x) = Σ_{i∈E∪I} λ_i ∇c_i(x) }.

Each point in K is called a critical pair. The projection

(3.2)  K_c := { u | (u, λ) ∈ K }

is the set of all real critical points. To construct tight relaxations for solving (1.1), we need the following assumption on Lagrange multipliers.

Assumption 3.1. For each i ∈ E ∪ I, there exists a polynomial p_i ∈ R[x] such that for all (x, λ) ∈ K it holds that

λ_i = p_i(x).

Assumption 3.1 is generically satisfied, as shown in Proposition 5.7. For the following special cases, we can get the polynomials p_i explicitly.

• (Simplex) For the simplex {e^Tx − 1 = 0, x_1 ≥ 0, ..., x_n ≥ 0}, it corresponds to E = {0}, I = [n], c_0(x) = e^Tx − 1, c_j(x) = x_j (j ∈ [n]). The Lagrange multipliers can be expressed as

(3.3)  λ_0 = x^T∇f(x),   λ_j = f_{x_j} − x^T∇f(x) (j ∈ [n]).

• (Hypercube) For the hypercube [−1, 1]^n, it corresponds to E = ∅, I = [n], and each c_j(x) = 1 − x_j^2. We can show that

(3.4)  λ_j = −(1/2) x_j f_{x_j} (j ∈ [n]).

• (Ball or sphere) The constraint is 1 − x^Tx = 0 or 1 − x^Tx ≥ 0. It corresponds to E ∪ I = {1} and c_1 = 1 − x^Tx. We have

(3.5)  λ_1 = −(1/2) x^T∇f(x).

• (Triangular constraints) Suppose E ∪ I = {1, ..., m} and each

c_i(x) = τ_i x_i + q_i(x_{i+1}, ..., x_n)

for some polynomials q_i ∈ R[x_{i+1}, ..., x_n] and scalars τ_i ≠ 0. The matrix T(x), consisting of the first m rows of [∇c_1(x), ..., ∇c_m(x)], is an invertible lower triangular matrix with constant diagonal entries. Then

λ = T(x)^{−1} · [ f_{x_1} ··· f_{x_m} ]^T.

Note that the inverse T(x)^{−1} is a matrix polynomial.
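A quick numerical sanity check of the hypercube expression (3.4) (our own illustration): take n = 1, f = x^4, c_1 = 1 − x^2. At the critical pair with x = 1, stationarity ∇f = λ_1∇c_1 gives 4 = −2λ_1, so λ_1 = −2, matching −x_1 f_{x_1}/2:

```python
# Hypercube expression (3.4), checked on n = 1, f(x) = x^4, c1(x) = 1 - x^2.
# At the critical pair with x = 1 (active constraint), the stationarity
# equation f'(x) = lambda * c1'(x) reads 4 = -2*lambda, so lambda = -2.
x = 1.0
f_x = 4.0 * x ** 3                      # derivative of x^4
lam_from_kkt = f_x / (-2.0 * x)         # solve f'(x) = lambda * (-2x)
lam_from_formula = -0.5 * x * f_x       # expression (3.4): -x_j * f_{x_j} / 2
print(lam_from_kkt == lam_from_formula == -2.0)  # -> True
```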


For more general constraints, we can also express λ as a polynomial function in x on the set K_c. This will be discussed in §4 and §5.

For the polynomials p_i as in Assumption 3.1, denote

(3.6)  φ := ( ∇f − Σ_{i∈E∪I} p_i ∇c_i, (p_j c_j)_{j∈I} ),   ψ := (p_j)_{j∈I}.

When the minimum value f_min of (1.1) is achieved at a critical point, (1.1) is equivalent to the problem

(3.7)
  f_c := min f(x)
  s.t.  c_eq(x) = 0, c_in(x) ≥ 0,
        φ(x) = 0, ψ(x) ≥ 0.
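To make (3.6) concrete, the following sketch (ours, not from the paper) forms φ and ψ for the single ball constraint c_1 = 1 − x^Tx, with p_1 from (3.5). At the minimizer x* = −a/‖a‖ of f = a^Tx over the unit ball, φ vanishes and ψ ≥ 0:

```python
import numpy as np

def phi_psi_ball(grad_f, x):
    """Expressions (3.6) for the single ball constraint c1 = 1 - x^T x,
    using p1(x) = -x^T grad_f(x) / 2 from (3.5)."""
    g = grad_f(x)
    p1 = -0.5 * x @ g
    c1 = 1.0 - x @ x
    phi = np.concatenate([g - p1 * (-2.0 * x),   # grad f - p1 * grad c1
                          [p1 * c1]])            # p1 * c1
    psi = np.array([p1])
    return phi, psi

# f(x) = a^T x is minimized over the unit ball at x* = -a/||a||.
a = np.array([3.0, 4.0])
x_star = -a / np.linalg.norm(a)
phi, psi = phi_psi_ball(lambda x: a, x_star)
print(np.allclose(phi, 0), psi[0] >= 0)  # -> True True
```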

We apply Lasserre relaxations to solve it. For an integer k > 0 (called the relaxation order), the kth order Lasserre's relaxation for (3.7) is

(3.8)
  f'_k := min ⟨f, y⟩
  s.t.  ⟨1, y⟩ = 1, M_k(y) ⪰ 0,
        L^{(k)}_{c_eq}(y) = 0, L^{(k)}_{c_in}(y) ⪰ 0,
        L^{(k)}_φ(y) = 0, L^{(k)}_ψ(y) ⪰ 0, y ∈ R^{N^n_{2k}}.

Since x^0 = 1 (the constant one polynomial), the condition ⟨1, y⟩ = 1 means that (y)_0 = 1. The dual optimization problem of (3.8) is

(3.9)
  f_k := max γ  s.t.  f − γ ∈ IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k}.

We refer to §2 for the notation used in (3.8)-(3.9). They are equivalent to semidefinite programs (SDPs), so they can be solved by SDP solvers (e.g., SeDuMi [40]). For k = 1, 2, ···, we get a hierarchy of Lasserre relaxations. In (3.8)-(3.9), if we remove the usage of φ and ψ, they reduce to the standard Lasserre relaxations in [17]. So (3.8)-(3.9) are stronger relaxations.

By the construction of φ as in (3.6), Assumption 3.1 implies that

K_c = { u ∈ R^n | c_eq(u) = 0, φ(u) = 0 }.

By Lemma 3.3 of [6], f achieves only finitely many values on K_c, say

(3.10)  v_1 < ··· < v_N.

A point u ∈ K_c might not be feasible for (3.7), i.e., it is possible that c_in(u) ≱ 0 or ψ(u) ≱ 0. In applications, we are often interested in the optimal value f_c of (3.7). When (3.7) is infeasible, by convention, we set

f_c = +∞.

When the optimal value f_min of (1.1) is achieved at a critical point, f_c = f_min. This is the case if the feasible set is compact, or if f is coercive (i.e., for each ℓ, the sublevel set {f(x) ≤ ℓ} is compact), and the constraint qualification condition holds. As in [17], one can show that

(3.11)  f_k ≤ f'_k ≤ f_c

for all k. Moreover, f_k and f'_k are both monotonically increasing. If for some order k it occurs that

f_k = f'_k = f_c,

then the kth order Lasserre's relaxation is said to be tight (or exact).


3.1. Tightness of the relaxations. Let c_in, ψ, K_c, f_c be as above. We refer to §2 for the notation Qmod(c_in, ψ). We begin with a general assumption.

Assumption 3.2. There exists ρ ∈ Qmod(c_in, ψ) such that if u ∈ K_c and f(u) < f_c, then ρ(u) < 0.

In Assumption 3.2, the hypersurface ρ(x) = 0 separates feasible and infeasible critical points. Clearly, if u ∈ K_c is a feasible point for (3.7), then c_in(u) ≥ 0 and ψ(u) ≥ 0, and hence ρ(u) ≥ 0. Assumption 3.2 generally holds. For instance, it is satisfied for the following general cases:

a) When there are no inequality constraints, c_in and ψ are empty tuples. Then Qmod(c_in, ψ) = Σ[x], and Assumption 3.2 is satisfied with ρ = 0.

b) Suppose the set K_c is finite, say K_c = {u_1, ..., u_D}, and

f(u_1), ..., f(u_{t−1}) < f_c ≤ f(u_t), ..., f(u_D).

Let ℓ_1, ..., ℓ_D be real interpolating polynomials such that ℓ_i(u_j) = 1 for i = j and ℓ_i(u_j) = 0 for i ≠ j. For each i = 1, ..., t−1, there must exist j_i ∈ I such that c_{j_i}(u_i) < 0. Then the polynomial

(3.12)  ρ = Σ_{i<t} ( −1 / c_{j_i}(u_i) ) c_{j_i}(x) ( ℓ_i(x) )^2 + Σ_{i≥t} ( ℓ_i(x) )^2

satisfies Assumption 3.2.

c) For each x with f(x) = v_i < f_c, at least one of the constraints c_j(x) ≥ 0, p_j(x) ≥ 0 (j ∈ I) is violated. Suppose for each critical value v_i < f_c, there exists g_i ∈ {c_j, p_j}_{j∈I} such that

g_i < 0 on K_c ∩ {f(x) = v_i}.

Let ϕ_1, ..., ϕ_N be real univariate polynomials such that ϕ_i(v_j) = 0 for i ≠ j and ϕ_i(v_j) = 1 for i = j. Suppose v_t = f_c. Then the polynomial

(3.13)  ρ = Σ_{i<t} g_i(x) ( ϕ_i(f(x)) )^2 + Σ_{i≥t} ( ϕ_i(f(x)) )^2

satisfies Assumption 3.2.

We refer to §2 for the archimedean condition and the notation IQ(h, g) as in (2.1). The following is about the convergence of the relaxations (3.8)-(3.9).

Theorem 3.3. Suppose K_c ≠ ∅ and Assumption 3.1 holds. If

i) IQ(c_eq, c_in) + IQ(φ, ψ) is archimedean, or
ii) IQ(c_eq, c_in) is archimedean, or
iii) Assumption 3.2 holds,

then f_k = f'_k = f_c for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_k = f'_k = f_min for all k big enough, if one of the conditions i)-iii) is satisfied.

Remark. In Theorem 3.3, the conclusion holds if any one of the conditions i)-iii) is satisfied. The condition ii) is only about the constraining polynomials of (1.1); it can be checked without φ, ψ. Clearly, the condition ii) implies the condition i).

The proof of Theorem 3.3 is given in the following. The main idea is to consider the set of critical points. It can be expressed as a union of subvarieties, on each of which the objective f is a constant. We can get an SOS type representation for f on each subvariety, and then construct a single one for f over the entire set of critical points.

Proof of Theorem 3.3. Clearly, every point in the complex variety

K_1 = { x ∈ C^n | c_eq(x) = 0, φ(x) = 0 }

is a critical point. By Lemma 3.3 of [6], the objective f achieves finitely many real values on K_c = K_1 ∩ R^n, say they are v_1 < ··· < v_N. Up to the shifting of a constant in f, we can further assume that f_c = 0. Clearly, f_c equals one of v_1, ..., v_N, say v_t = f_c = 0.

Case I: Assume IQ(c_eq, c_in) + IQ(φ, ψ) is archimedean. Let

I = Ideal(c_eq, φ),

the critical ideal. Note that K_1 = V_C(I). The variety V_C(I) is a union of irreducible subvarieties, say V_1, ..., V_ℓ. If V_i ∩ R^n ≠ ∅, then f is a real constant on V_i, which equals one of v_1, ..., v_N. This is implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of V_C(I):

T_i := K_1 ∩ { f(x) = v_i } (i = t, ..., N).

Let T_{t−1} be the union of the irreducible subvarieties V_i such that either V_i ∩ R^n = ∅, or f ≡ v_j on V_i with v_j < v_t = f_c. Then it holds that

V_C(I) = T_{t−1} ∪ T_t ∪ ··· ∪ T_N.

By the primary decomposition of I [11, 41], there exist ideals I_{t−1}, I_t, ..., I_N ⊆ R[x] such that

I = I_{t−1} ∩ I_t ∩ ··· ∩ I_N

and T_i = V_C(I_i) for all i = t−1, t, ..., N. Denote the semialgebraic set

(3.14)  S := { x ∈ R^n | c_in(x) ≥ 0, ψ(x) ≥ 0 }.

For i = t−1, we have V_R(I_{t−1}) ∩ S = ∅, because v_1, ..., v_{t−1} < f_c. By the Positivstellensatz [2, Corollary 4.4.3], there exists p_0 ∈ Preord(c_in, ψ)^1 satisfying 2 + p_0 ∈ I_{t−1}. Note that 1 + p_0 > 0 on V_R(I_{t−1}) ∩ S. The set I_{t−1} + Qmod(c_in, ψ) is archimedean, because I ⊆ I_{t−1} and

IQ(c_eq, c_in) + IQ(φ, ψ) ⊆ I_{t−1} + Qmod(c_in, ψ).

By Theorem 2.1, we have

p_1 := 1 + p_0 ∈ I_{t−1} + Qmod(c_in, ψ).

Then 1 + p_1 ∈ I_{t−1}. There exists p_2 ∈ Qmod(c_in, ψ) such that

−1 ≡ p_1 ≡ p_2  mod I_{t−1}.

Since f = (f/4 + 1)^2 − (f/4 − 1)^2, we have

f ≡ σ_{t−1} := (f/4 + 1)^2 + p_2 (f/4 − 1)^2  mod I_{t−1}.

So, when k is big enough, we have σ_{t−1} ∈ Qmod(c_in, ψ)_{2k}.

^1 It is the preordering of the polynomial tuple (c_in, ψ); see §7.1.


For i = t, v_t = 0 and f(x) vanishes on V_C(I_t). By Hilbert's Strong Nullstellensatz [4], there exists an integer m_t > 0 such that f^{m_t} ∈ I_t. Define the polynomial

s_t(ε) := √ε · Σ_{j=0}^{m_t−1} binom(1/2, j) ε^{−j} f^j.

Then we have that

D_1 := ε(1 + ε^{−1}f) − ( s_t(ε) )^2 ≡ 0  mod I_t.

This is because, in the subtraction defining D_1, after expanding ( s_t(ε) )^2, all the terms f^j with j < m_t are cancelled, and f^j ∈ I_t for j ≥ m_t. So D_1 ∈ I_t. Let σ_t(ε) := ( s_t(ε) )^2; then f + ε − σ_t(ε) = D_1 and

(3.15)  f + ε − σ_t(ε) = Σ_{j=0}^{m_t−2} b_j(ε) f^{m_t+j}

for some real scalars b_j(ε) depending on ε.

For each i = t+1, ..., N, v_i > 0 and f(x)/v_i − 1 vanishes on V_C(I_i). By Hilbert's Strong Nullstellensatz [4], there exists 0 < m_i ∈ N such that (f/v_i − 1)^{m_i} ∈ I_i. Let

s_i := √v_i · Σ_{j=0}^{m_i−1} binom(1/2, j) (f/v_i − 1)^j.

As in the case i = t, we can similarly show that f − s_i^2 ∈ I_i. Let σ_i := s_i^2; then f − σ_i ∈ I_i.

Note that V_C(I_i) ∩ V_C(I_j) = ∅ for all i ≠ j. By Lemma 3.3 of [30], there exist polynomials a_{t−1}, ..., a_N ∈ R[x] such that

a_{t−1}^2 + ··· + a_N^2 − 1 ∈ I,   a_i ∈ ⋂_{i≠j∈{t−1,...,N}} I_j.

For ε > 0, denote the polynomial

σ_ε := σ_t(ε) a_t^2 + Σ_{t≠j∈{t−1,...,N}} (σ_j + ε) a_j^2;

then

f + ε − σ_ε = (f + ε)(1 − a_{t−1}^2 − ··· − a_N^2) + Σ_{t≠i∈{t−1,...,N}} (f − σ_i) a_i^2 + (f + ε − σ_t(ε)) a_t^2.

For each i ≠ t, f − σ_i ∈ I_i, so

(f − σ_i) a_i^2 ∈ ⋂_{j=t−1}^{N} I_j = I.

Hence, there exists k_1 > 0 such that

(f − σ_i) a_i^2 ∈ I_{2k_1}  (t ≠ i ∈ {t−1, ..., N}).

Since f + ε − σ_t(ε) ∈ I_t, we also have

(f + ε − σ_t(ε)) a_t^2 ∈ ⋂_{j=t−1}^{N} I_j = I.

Moreover, by the equation (3.15),

(f + ε − σ_t(ε)) a_t^2 = Σ_{j=0}^{m_t−2} b_j(ε) f^{m_t+j} a_t^2.

Each f^{m_t+j} a_t^2 ∈ I, since f^{m_t+j} ∈ I_t. So there exists k_2 > 0 such that, for all ε > 0,

(f + ε − σ_t(ε)) a_t^2 ∈ I_{2k_2}.

Since 1 − a_{t−1}^2 − ··· − a_N^2 ∈ I, there also exists k_3 > 0 such that, for all ε > 0,

(f + ε)(1 − a_{t−1}^2 − ··· − a_N^2) ∈ I_{2k_3}.

Hence, if k* ≥ max{k_1, k_2, k_3}, then we have

f(x) + ε − σ_ε ∈ I_{2k*}

for all ε > 0. By the construction, the degrees of all σ_i and a_i are independent of ε. So σ_ε ∈ Qmod(c_in, ψ)_{2k*} for all ε > 0, if k* is big enough. Note that

I_{2k*} + Qmod(c_in, ψ)_{2k*} = IQ(c_eq, c_in)_{2k*} + IQ(φ, ψ)_{2k*}.

This implies that f_{k*} ≥ f_c − ε for all ε > 0. On the other hand, we always have f_{k*} ≤ f_c, so f_{k*} = f_c. Moreover, since f_k is monotonically increasing, we must have f_k = f_c for all k ≥ k*.

Case II: Assume $IQ(c_{eq}, c_{in})$ is archimedean. Because
\[ IQ(c_{eq}, c_{in}) \subseteq IQ(c_{eq}, c_{in}) + IQ(\phi, \psi), \]
the set $IQ(c_{eq}, c_{in}) + IQ(\phi, \psi)$ is also archimedean. Therefore, the conclusion is also true, by applying the result for Case I.

Case III: Suppose Assumption 3.2 holds. Let $\varphi_1, \ldots, \varphi_N$ be real univariate polynomials such that $\varphi_i(v_j) = 0$ for $i \ne j$ and $\varphi_i(v_j) = 1$ for $i = j$. Let
\[ s = s_t + \cdots + s_N, \quad \text{where each} \quad s_i = (v_i - f_c)\big(\varphi_i(f)\big)^2. \]
Then $s \in \Sigma[x]_{2k_4}$ for some integer $k_4 > 0$. Let
\[ \hat{f} = f - f_c - s. \]
We show that there exist an integer $\ell > 0$ and $q \in \mathrm{Qmod}(c_{in}, \psi)$ such that
\[ \hat{f}^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi). \]
This is because, by Assumption 3.2, $\hat{f}(x) \equiv 0$ on the set
\[ K_2 = \{ x \in \mathbb{R}^n : c_{eq}(x) = 0, \; \phi(x) = 0, \; \rho(x) \ge 0 \}. \]
It has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist $0 < \ell \in \mathbb{N}$ and $q = b_0 + \rho b_1$ ($b_0, b_1 \in \Sigma[x]$) such that $\hat{f}^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi)$. By Assumption 3.2, $\rho \in \mathrm{Qmod}(c_{in}, \psi)$, so we have $q \in \mathrm{Qmod}(c_{in}, \psi)$.
For all $\epsilon > 0$ and $\tau > 0$, we have $\hat{f} + \epsilon = \phi_\epsilon + \theta_\epsilon$, where
\[ \phi_\epsilon = -\tau \epsilon^{1-2\ell} \big( \hat{f}^{2\ell} + q \big), \qquad \theta_\epsilon = \epsilon \Big( 1 + \hat{f}/\epsilon + \tau \big(\hat{f}/\epsilon\big)^{2\ell} \Big) + \tau \epsilon^{1-2\ell} q. \]
By Lemma 2.1 of [31], when $\tau \ge \frac{1}{2\ell}$, there exists $k_5$ such that, for all $\epsilon > 0$,
\[ \phi_\epsilon \in \mathrm{Ideal}(c_{eq}, \phi)_{2k_5}, \qquad \theta_\epsilon \in \mathrm{Qmod}(c_{in}, \psi)_{2k_5}. \]
Hence, we can get
\[ f - (f_c - \epsilon) = \phi_\epsilon + \sigma_\epsilon, \]
where $\sigma_\epsilon = \theta_\epsilon + s \in \mathrm{Qmod}(c_{in}, \psi)_{2k_5}$ for all $\epsilon > 0$. Note that
\[ IQ(c_{eq}, c_{in})_{2k_5} + IQ(\phi, \psi)_{2k_5} = \mathrm{Ideal}(c_{eq}, \phi)_{2k_5} + \mathrm{Qmod}(c_{in}, \psi)_{2k_5}. \]
For all $\epsilon > 0$, $\gamma = f_c - \epsilon$ is feasible in (3.9) for the order $k_5$, so $f_{k_5} \ge f_c$. Because of (3.11) and the monotonicity of $f_k$, we have $f_k = f'_k = f_c$ for all $k \ge k_5$. $\square$

3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is $f_c$, and the optimal value of (1.1) is $f_{min}$. If $f_{min}$ is achievable at a critical point, then $f_c = f_{min}$. In Theorem 3.3, we have shown that $f_k = f_c$ for all $k$ big enough, where $f_k$ is the optimal value of (3.9). The value $f_c$ or $f_{min}$ is often not known. How do we detect the tightness $f_k = f_c$ in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose $y^*$ is a minimizer of (3.8) for the order $k$. Let
(3.16) \qquad $d = \big\lceil \deg(c_{eq}, c_{in}, \phi, \psi)/2 \big\rceil$.
If there exists an integer $t \in [d, k]$ such that
(3.17) \qquad $\mathrm{rank}\, M_t(y^*) = \mathrm{rank}\, M_{t-d}(y^*)$,
then $f_k = f_c$ and we can get $r = \mathrm{rank}\, M_t(y^*)$ minimizers for (3.7) [5, 14, 32].
The method in [14] can be used to extract minimizers. It was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints $c_{in} \ge 0$, $\psi \ge 0$) can also be detected by solving the relaxations (3.8)-(3.9).
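As a toy illustration of the flat truncation condition, consider the truncated moment sequence $y = [u]_{2k}$ of a single point $u$: all of its moment matrices are rank one, so (3.17) holds trivially with $r = 1$. A minimal numerical sketch (the point $u$ and the orders are hypothetical test data, not from the paper):

```python
import itertools
import numpy as np

def moment_matrix(u, t):
    # For the point-evaluation tms y = [u]_{2t}, the entry of M_t(y) indexed by
    # monomials x^a, x^b is u^{a+b}; hence M_t(y) = v v^T, where v collects all
    # monomials of degree <= t evaluated at u, so M_t(y) has rank one.
    n = len(u)
    exps = [a for a in itertools.product(range(t + 1), repeat=n) if sum(a) <= t]
    v = np.array([np.prod(np.array(u, dtype=float) ** np.array(a)) for a in exps])
    return np.outer(v, v)

u, k, d = [0.5, -1.0], 3, 1
ranks = [np.linalg.matrix_rank(moment_matrix(u, t), tol=1e-8) for t in range(k + 1)]
flat = any(ranks[t] == ranks[t - d] for t in range(d, k + 1))
print(ranks, flat)   # [1, 1, 1, 1] True
```

For a minimizer $y^*$ of an actual relaxation, the same rank comparison is applied to the numerically computed moment matrices.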

Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:
i) If (3.8) is infeasible for some order $k$, then no critical points satisfy the constraints $c_{in} \ge 0$, $\psi \ge 0$, i.e., (3.7) is infeasible.
ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order $k$ is big enough.
In the following, assume (3.7) is feasible (i.e., $f_c < +\infty$). Then, for all $k$ big enough, (3.8) has a minimizer $y^*$. Moreover:
iii) If (3.17) is satisfied for some $t \in [d, k]$, then $f_k = f_c$.
iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer $y^*$ of (3.8) must satisfy (3.17) for some $t \in [d, k]$, when $k$ is big enough.

Proof. By Assumption 3.1, $u$ is a critical point if and only if $c_{eq}(u) = 0$, $\phi(u) = 0$.
i) For every feasible point $u$ of (3.7), the tms $[u]_{2k}$ (see §2 for the notation) is feasible for (3.8), for all $k$. Therefore, if (3.8) is infeasible for some $k$, then (3.7) must be infeasible.
ii) By Assumption 3.2, when (3.7) is infeasible, the set
\[ \{ x \in \mathbb{R}^n : c_{eq}(x) = 0, \; \phi(x) = 0, \; \rho(x) \ge 0 \} \]
is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that $-1 \in \mathrm{Ideal}(c_{eq}, \phi) + \mathrm{Qmod}(\rho)$. By Assumption 3.2,
\[ \mathrm{Ideal}(c_{eq}, \phi) + \mathrm{Qmod}(\rho) \subseteq IQ(c_{eq}, c_{in}) + IQ(\phi, \psi). \]
Thus, for all $k$ big enough, (3.9) is unbounded from above. Hence, (3.8) must be infeasible, by weak duality.
When (3.7) is feasible, $f$ achieves finitely many values on $K_c$, so (3.7) must achieve its optimal value $f_c$. By Theorem 3.3, we know that $f_k = f'_k = f_c$ for all $k$ big enough. For each minimizer $u^*$ of (3.7), the tms $[u^*]_{2k}$ is a minimizer of (3.8).
iii) If (3.17) holds, we can get $r = \mathrm{rank}\, M_t(y^*)$ minimizers for (3.7) [5, 14], say $u_1, \ldots, u_r$, such that $f_k = f(u_i)$ for each $i$. Clearly, $f_k = f(u_i) \ge f_c$. On the other hand, we always have $f_k \le f_c$. So $f_k = f_c$.
iv) By Assumption 3.2, (3.7) is equivalent to the problem
(3.18) \qquad $\min \; f(x)$ \quad s.t. \quad $c_{eq}(x) = 0, \; \phi(x) = 0, \; \rho(x) \ge 0$.
The optimal value of (3.18) is also $f_c$. Its $k$th order Lasserre relaxation is
(3.19) \qquad $\gamma'_k = \min \; \langle f, y \rangle$ \quad s.t. \quad $\langle 1, y \rangle = 1, \; M_k(y) \succeq 0, \; L^{(k)}_{c_{eq}}(y) = 0, \; L^{(k)}_{\phi}(y) = 0, \; L^{(k)}_{\rho}(y) \succeq 0$.
Its dual optimization problem is
(3.20) \qquad $\gamma_k = \max \; \gamma$ \quad s.t. \quad $f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(\rho)_{2k}$.
By repeating the same proof as for Theorem 3.3(iii), we can show that
\[ \gamma_k = \gamma'_k = f_c \]
for all $k$ big enough. Because $\rho \in \mathrm{Qmod}(c_{in}, \psi)$, each $y$ feasible for (3.8) is also feasible for (3.19). So, when $k$ is big, each $y^*$ is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some $t \in [d, k]$, when $k$ is big enough. $\square$

If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].

4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron
\[ P = \{ x \in \mathbb{R}^n \mid Ax - b \ge 0 \}, \]
where $A = [\, a_1 \; \cdots \; a_m \,]^T \in \mathbb{R}^{m \times n}$ and $b = [\, b_1 \; \cdots \; b_m \,]^T \in \mathbb{R}^m$. This corresponds to $\mathcal{E} = \emptyset$, $\mathcal{I} = [m]$, and each $c_i(x) = a_i^T x - b_i$. Denote
(4.1) \qquad $D(x) = \mathrm{diag}\big(c_1(x), \ldots, c_m(x)\big), \qquad C(x) = \begin{bmatrix} A^T \\ D(x) \end{bmatrix}$.

The Lagrange multiplier vector $\lambda = [\, \lambda_1 \; \cdots \; \lambda_m \,]^T$ satisfies
(4.2) \qquad $\begin{bmatrix} A^T \\ D(x) \end{bmatrix} \lambda = \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}$.
If $\mathrm{rank}\, A = m$, we can express $\lambda$ as
(4.3) \qquad $\lambda = (AA^T)^{-1} A \nabla f(x)$.
If $\mathrm{rank}\, A < m$, how can we express $\lambda$ in terms of $x$? In computation, we often prefer a polynomial expression. If there exists $L(x) \in \mathbb{R}[x]^{m \times (n+m)}$ such that
(4.4) \qquad $L(x) C(x) = I_m$,
then we can get
\[ \lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x) \nabla f(x), \]
where $L_1(x)$ consists of the first $n$ columns of $L(x)$. In this section, we characterize when such $L(x)$ exists and give a degree bound for it.
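For the full-row-rank case, (4.3) follows from left-multiplying the first block of (4.2) by $A$. A small numerical sketch (the matrix $A$ and the multiplier vector are hypothetical test data):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 5
A = rng.standard_normal((m, n))        # hypothetical data; rank A = m generically

# At a critical pair, A^T lam = grad f(x).  Left-multiplying by A gives
# (A A^T) lam = A grad f(x), which is formula (4.3).  We check that (4.3)
# recovers lam whenever the gradient lies in the row space of A:
lam_true = rng.standard_normal(m)
grad = A.T @ lam_true                  # a gradient consistent with (4.2)
lam = np.linalg.solve(A @ A.T, A @ grad)
print(np.allclose(lam, lam_true))      # True
```

When $\mathrm{rank}\, A < m$, the Gram matrix $AA^T$ is singular and this formula is unavailable, which motivates the matrix polynomial $L(x)$ below.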

The linear function $Ax - b$ is said to be nonsingular if $\mathrm{rank}\, C(u) = m$ for all $u \in \mathbb{C}^n$ (also see Definition 5.1). This is equivalent to that, for every $u$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then $a_{i_1}, \ldots, a_{i_k}$ are linearly independent.

Proposition 4.1. The linear function $Ax - b$ is nonsingular if and only if there exists a matrix polynomial $L(x)$ satisfying (4.4). Moreover, when $Ax - b$ is nonsingular, we can choose $L(x)$ in (4.4) with $\deg(L) \le m - \mathrm{rank}\, A$.

Proof. Clearly, if (4.4) is satisfied by some $L(x)$, then $\mathrm{rank}\, C(u) \ge m$ for all $u$. This implies that $Ax - b$ is nonsingular.
Next, assume that $Ax - b$ is nonsingular. We show that (4.4) is satisfied by some $L(x) \in \mathbb{R}[x]^{m \times (n+m)}$ with degree $\le m - \mathrm{rank}\, A$. Let $r = \mathrm{rank}\, A$. Up to a linear coordinate transformation, we can reduce $x$ to an $r$-dimensional variable. Without loss of generality, we can assume that $\mathrm{rank}\, A = n$ and $m \ge n$.
For a subset $I = \{i_1, \ldots, i_{m-n}\}$ of $[m]$, denote
\[ c_I(x) = \prod_{i \in I} c_i(x), \qquad E_I(x) = c_I(x) \cdot \mathrm{diag}\big( c_{i_1}(x)^{-1}, \ldots, c_{i_{m-n}}(x)^{-1} \big), \]
\[ D_I(x) = \mathrm{diag}\big( c_{i_1}(x), \ldots, c_{i_{m-n}}(x) \big), \qquad A_I = [\, a_{i_1} \; \cdots \; a_{i_{m-n}} \,]^T. \]
For the case that $I = \emptyset$ (the empty set), we set $c_\emptyset(x) = 1$. Let
\[ V = \{ I \subseteq [m] : |I| = m - n, \; \mathrm{rank}\, A_{[m] \setminus I} = n \}. \]
Step I: For each $I \in V$, we construct a matrix polynomial $L_I(x)$ such that

(4.5) \qquad $L_I(x) C(x) = c_I(x) I_m$.
The matrix $L_I = L_I(x)$ satisfying (4.5) can be given by the following $2 \times 3$ block matrix, where $L_I(J, K)$ denotes the submatrix whose row indices are from $J$ and whose column indices are from $K$:

(4.6)
\[
\begin{array}{c|ccc}
 J \backslash K & [n] & n + I & n + [m] \setminus I \\ \hline
 I & 0 & E_I(x) & 0 \\
 \lbrack m \rbrack \setminus I & c_I(x) \cdot \big(A_{[m] \setminus I}\big)^{-T} & -\big(A_{[m] \setminus I}\big)^{-T} (A_I)^T E_I(x) & 0
\end{array}
\]

Equivalently, the blocks of $L_I$ are
\[ L_I\big(I, [n]\big) = 0, \qquad L_I\big(I, n + I\big) = E_I(x), \qquad L_I\big(I, n + [m] \setminus I\big) = 0, \]
\[ L_I\big([m] \setminus I, [n]\big) = c_I(x)\big(A_{[m] \setminus I}\big)^{-T}, \qquad L_I\big([m] \setminus I, n + I\big) = -\big(A_{[m] \setminus I}\big)^{-T} (A_I)^T E_I(x), \]
\[ L_I\big([m] \setminus I, n + [m] \setminus I\big) = 0. \]
For each $I \in V$, $A_{[m] \setminus I}$ is invertible. (The superscript $-T$ denotes the inverse of the transpose.) Let $G = L_I(x) C(x)$; then one can verify that
\[ G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m-n}, \qquad G\big(I, [m] \setminus I\big) = 0, \]
\[ G\big([m] \setminus I, [m] \setminus I\big) = \Big[\, c_I(x)\big(A_{[m] \setminus I}\big)^{-T} \;\; -\big(A_{[m] \setminus I}\big)^{-T} A_I^T E_I(x) \,\Big] \begin{bmatrix} \big(A_{[m] \setminus I}\big)^T \\ 0 \end{bmatrix} = c_I(x) I_n, \]
\[ G\big([m] \setminus I, I\big) = \Big[\, c_I(x)\big(A_{[m] \setminus I}\big)^{-T} \;\; -\big(A_{[m] \setminus I}\big)^{-T} (A_I)^T E_I(x) \,\Big] \begin{bmatrix} A_I^T \\ D_I(x) \end{bmatrix} = 0. \]
This shows that the above $L_I(x)$ satisfies (4.5).


Step II: We show that there exist real scalars $\nu_I$ satisfying
(4.7) \qquad $\sum_{I \in V} \nu_I c_I(x) = 1$.
This can be shown by induction on $m$.
- When $m = n$, $V = \{\emptyset\}$ and $c_\emptyset(x) = 1$, so (4.7) is clearly true.
- When $m > n$, let
(4.8) \qquad $N = \{ i \in [m] \mid \mathrm{rank}\, A_{[m] \setminus \{i\}} = n \}$.
For each $i \in N$, let $V_i$ be the set of all $I' \subseteq [m] \setminus \{i\}$ such that $|I'| = m - n - 1$ and $\mathrm{rank}\, A_{[m] \setminus (I' \cup \{i\})} = n$. For each $i \in N$, by the assumption, the linear function $A_{[m] \setminus \{i\}} x - b_{[m] \setminus \{i\}}$ is nonsingular. By induction, there exist real scalars $\nu^{(i)}_{I'}$ satisfying
(4.9) \qquad $\sum_{I' \in V_i} \nu^{(i)}_{I'} c_{I'}(x) = 1$.
Since $\mathrm{rank}\, A = n$, we can generally assume that $\{a_1, \ldots, a_n\}$ is linearly independent. So there exist scalars $\alpha_1, \ldots, \alpha_n$ such that
\[ a_m = \alpha_1 a_1 + \cdots + \alpha_n a_n. \]
If all $\alpha_i = 0$, then $a_m = 0$, and hence $A$ can be replaced by its first $m - 1$ rows, so (4.7) is true by the induction. In the following, suppose at least one $\alpha_i \ne 0$, and write
\[ \{ i : \alpha_i \ne 0 \} = \{ i_1, \ldots, i_k \}. \]
Then $a_{i_1}, \ldots, a_{i_k}, a_m$ are linearly dependent. For convenience, set $i_{k+1} = m$. Since $Ax - b$ is nonsingular, the linear system
\[ c_{i_1}(x) = \cdots = c_{i_k}(x) = c_{i_{k+1}}(x) = 0 \]
has no solutions. Hence, there exist real scalars $\mu_1, \ldots, \mu_{k+1}$ such that
\[ \mu_1 c_{i_1}(x) + \cdots + \mu_k c_{i_k}(x) + \mu_{k+1} c_{i_{k+1}}(x) = 1. \]
(This can be deduced from the echelon form of an inconsistent linear system.) Note that $i_1, \ldots, i_{k+1} \in N$. For each $j = 1, \ldots, k+1$, by (4.9),
\[ \sum_{I' \in V_{i_j}} \nu^{(i_j)}_{I'} c_{I'}(x) = 1. \]
Then we can get
\[ 1 = \sum_{j=1}^{k+1} \mu_j c_{i_j}(x) = \sum_{j=1}^{k+1} \mu_j \sum_{I' \in V_{i_j}} \nu^{(i_j)}_{I'} c_{i_j}(x) c_{I'}(x) = \sum_{\substack{I = I' \cup \{i_j\} \\ I' \in V_{i_j}, \; 1 \le j \le k+1}} \nu^{(i_j)}_{I'} \mu_j c_I(x). \]
Since each $I' \cup \{i_j\} \in V$, (4.7) must be satisfied by some scalars $\nu_I$.


Step III: For $L_I(x)$ as in (4.5), we construct $L(x)$ as
(4.10) \qquad $L(x) = \sum_{I \in V} \nu_I L_I(x)$.
Clearly, $L(x)$ satisfies (4.4), because
\[ L(x) C(x) = \sum_{I \in V} \nu_I L_I(x) C(x) = \sum_{I \in V} \nu_I c_I(x) I_m = I_m. \]
Each $L_I(x)$ has degree $\le m - n$, so $L(x)$ has degree $\le m - n$. $\square$

Proposition 4.1 characterizes when there exists $L(x)$ satisfying (4.4). When it does, a degree bound for $L(x)$ is $m - \mathrm{rank}\, A$. Sometimes its degree can be smaller than that, as shown in Example 4.3. For given $A, b$, the matrix polynomial $L(x)$ satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of $L(x)C(x) = I_m$ for polyhedral sets.

Example 4.2. Consider the simplicial set
\[ \{ x_1 \ge 0, \ldots, x_n \ge 0, \; 1 - e^T x \ge 0 \}. \]
The equation $L(x)C(x) = I_{n+1}$ is satisfied by
\[ L(x) = \begin{bmatrix}
 1 - x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
 -x_1 & 1 - x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
 \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
 -x_1 & -x_2 & \cdots & 1 - x_n & 1 & \cdots & 1 \\
 -x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1
\end{bmatrix}. \]
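The identity $L(x)C(x) = I_{n+1}$ above is a polynomial identity, so it can be sanity-checked numerically at an arbitrary point; a small sketch with $n = 3$ (the test point is arbitrary):

```python
import numpy as np

def LC_simplex(x):
    # C(x) as in (4.1) for the constraints x_i >= 0 (i = 1..n), 1 - e^T x >= 0
    n = len(x)
    e = np.ones(n)
    A = np.vstack([np.eye(n), -e])               # rows a_i of the constraints
    c = np.append(x, 1.0 - e @ x)                # values c_i(x)
    C = np.vstack([A.T, np.diag(c)])
    # L(x) of the example: rows e_i - x (and -x), then an all-ones block
    left = np.vstack([np.eye(n) - np.outer(np.ones(n), x), -x])
    L = np.hstack([left, np.ones((n + 1, n + 1))])
    return L @ C

x = np.random.default_rng(1).standard_normal(3)  # any point works
print(np.allclose(LC_simplex(x), np.eye(4)))     # True
```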

Example 4.3. Consider the box constraint
\[ \{ x_1 \ge 0, \ldots, x_n \ge 0, \; 1 - x_1 \ge 0, \ldots, 1 - x_n \ge 0 \}. \]
The equation $L(x)C(x) = I_{2n}$ is satisfied by
\[ L(x) = \begin{bmatrix} I_n - \mathrm{diag}(x) & I_n & I_n \\ -\mathrm{diag}(x) & I_n & I_n \end{bmatrix}. \]
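The block identity above can likewise be checked directly; a sketch with $n = 4$ and an arbitrary test point:

```python
import numpy as np

n = 4
x = np.random.default_rng(2).standard_normal(n)
A = np.vstack([np.eye(n), -np.eye(n)])                 # x_i >= 0 and 1 - x_i >= 0
c = np.concatenate([x, 1.0 - x])                       # constraint values c_i(x)
C = np.vstack([A.T, np.diag(c)])                       # C(x) as in (4.1), (3n) x (2n)
D = np.diag(x)
L = np.block([[np.eye(n) - D, np.eye(n), np.eye(n)],
              [-D,            np.eye(n), np.eye(n)]])  # the L(x) of this example
print(np.allclose(L @ C, np.eye(2 * n)))               # True
```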

Example 4.4. Consider the polyhedral set
\[ \{ 1 - x_4 \ge 0, \; x_4 - x_3 \ge 0, \; x_3 - x_2 \ge 0, \; x_2 - x_1 \ge 0, \; x_1 + 1 \ge 0 \}. \]
The equation $L(x)C(x) = I_5$ is satisfied by
\[ L(x) = \frac{1}{2} \begin{bmatrix}
 -x_1 - 1 & -x_2 - 1 & -x_3 - 1 & -x_4 - 1 & 1 & 1 & 1 & 1 & 1 \\
 -x_1 - 1 & -x_2 - 1 & -x_3 - 1 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\
 -x_1 - 1 & -x_2 - 1 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\
 -x_1 - 1 & 1 - x_2 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\
 1 - x_1 & 1 - x_2 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}. \]

Example 4.5. Consider the polyhedral set
\[ \{ 1 + x_1 \ge 0, \; 1 - x_1 \ge 0, \; 2 - x_1 - x_2 \ge 0, \; 2 - x_1 + x_2 \ge 0 \}. \]
The matrix $L(x)$ satisfying $L(x)C(x) = I_4$ is
\[ \frac{1}{6} \begin{bmatrix}
 x_1^2 - 3x_1 + 2 & x_1 x_2 - x_2 & 4 - x_1 & 2 - x_1 & 1 - x_1 & 1 - x_1 \\
 3x_1^2 - 3x_1 - 6 & 3x_2 + 3x_1 x_2 & 6 - 3x_1 & -3x_1 & -3x_1 - 3 & -3x_1 - 3 \\
 1 - x_1^2 & -2x_2 - x_1 x_2 - 3 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2 \\
 1 - x_1^2 & 3 - x_1 x_2 - 2x_2 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2
\end{bmatrix}. \]


5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express Lagrange multipliers $\lambda_i$ as polynomial functions in $x$ on the set of critical points.
Suppose there are totally $m$ equality and inequality constraints, i.e.,
\[ \mathcal{E} \cup \mathcal{I} = \{1, \ldots, m\}. \]
If $(x, \lambda)$ is a critical pair, then $\lambda_i c_i(x) = 0$ for all $i \in \mathcal{E} \cup \mathcal{I}$. So the Lagrange multiplier vector $\lambda = [\, \lambda_1 \; \cdots \; \lambda_m \,]^T$ satisfies the equation
(5.1) \qquad $\underbrace{\begin{bmatrix} \nabla c_1(x) & \nabla c_2(x) & \cdots & \nabla c_m(x) \\ c_1(x) & 0 & \cdots & 0 \\ 0 & c_2(x) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & c_m(x) \end{bmatrix}}_{C(x)} \lambda = \begin{bmatrix} \nabla f(x) \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$.
Let $C(x)$ be as in the above. If there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ such that
(5.2) \qquad $L(x) C(x) = I_m$,
then we can get
(5.3) \qquad $\lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x) \nabla f(x)$,
where $L_1(x)$ consists of the first $n$ columns of $L(x)$. Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such $L(x)$ exists.

Definition 5.1. The tuple $c = (c_1, \ldots, c_m)$ of constraining polynomials is said to be nonsingular if $\mathrm{rank}\, C(u) = m$ for every $u \in \mathbb{C}^n$.

Clearly, $c$ being nonsingular is equivalent to that, for each $u \in \mathbb{C}^n$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then the gradients $\nabla c_{i_1}(u), \ldots, \nabla c_{i_k}(u)$ are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple $c$ is nonsingular.

Proposition 5.2. (i) For each $W(x) \in \mathbb{C}[x]^{s \times t}$ with $s \ge t$, $\mathrm{rank}\, W(u) = t$ for all $u \in \mathbb{C}^n$ if and only if there exists $P(x) \in \mathbb{C}[x]^{t \times s}$ such that
\[ P(x) W(x) = I_t. \]
Moreover, for $W(x) \in \mathbb{R}[x]^{s \times t}$, we can choose $P(x) \in \mathbb{R}[x]^{t \times s}$ for the above.
(ii) The constraining polynomial tuple $c$ is nonsingular if and only if there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ satisfying (5.2).

Proof. (i) "$\Leftarrow$": If $P(x) W(x) = I_t$, then, for all $u \in \mathbb{C}^n$,
\[ t = \mathrm{rank}\, I_t \le \mathrm{rank}\, W(u) \le t. \]
So $W(x)$ must have full column rank everywhere.
"$\Rightarrow$": Suppose $\mathrm{rank}\, W(u) = t$ for all $u \in \mathbb{C}^n$. Write $W(x)$ in columns:
\[ W(x) = [\, w_1(x) \;\; w_2(x) \;\; \cdots \;\; w_t(x) \,]. \]


Then the equation $w_1(x) = 0$ does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists $\xi_1(x) \in \mathbb{C}[x]^s$ such that $\xi_1(x)^T w_1(x) = 1$. For each $i = 2, \ldots, t$, denote
\[ r_{1i}(x) = \xi_1(x)^T w_i(x), \]
then (use $\sim$ to denote row equivalence between matrices)
\[ W(x) \sim \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix} \sim W_1(x) = \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ 0 & w_2^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix}, \]
where each ($i = 2, \ldots, t$)
\[ w_i^{(1)}(x) = w_i(x) - r_{1i}(x) w_1(x). \]
So there exists $P_1(x) \in \mathbb{C}[x]^{(s+1) \times s}$ such that
\[ P_1(x) W(x) = W_1(x). \]
Since $W(x)$ and $W_1(x)$ are row equivalent, $W_1(x)$ must also have full column rank everywhere. Similarly, the polynomial equation
\[ w_2^{(1)}(x) = 0 \]
does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists $\xi_2(x) \in \mathbb{C}[x]^s$ such that
\[ \xi_2(x)^T w_2^{(1)}(x) = 1. \]
For each $i = 3, \ldots, t$, let $r_{2i}(x) = \xi_2(x)^T w_i^{(1)}(x)$; then
\[ W_1(x) \sim \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & w_2^{(1)}(x) & w_3^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix} \sim W_2(x) = \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & 0 & w_3^{(2)}(x) & \cdots & w_t^{(2)}(x) \end{bmatrix}, \]
where each ($i = 3, \ldots, t$)
\[ w_i^{(2)}(x) = w_i^{(1)}(x) - r_{2i}(x) w_2^{(1)}(x). \]
Similarly, $W_1(x)$ and $W_2(x)$ are row equivalent, so $W_2(x)$ has full column rank everywhere. There exists $P_2(x) \in \mathbb{C}[x]^{(s+2) \times (s+1)}$ such that
\[ P_2(x) W_1(x) = W_2(x). \]
Continuing this process, we can finally get
\[ W_2(x) \sim \cdots \sim W_t(x) = \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & 0 & 1 & \cdots & r_{3t}(x) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}. \]
Consequently, there exists $P_i(x) \in \mathbb{C}[x]^{(s+i) \times (s+i-1)}$, for $i = 1, 2, \ldots, t$, such that
\[ P_t(x) P_{t-1}(x) \cdots P_1(x) W(x) = W_t(x). \]


Since $W_t(x)$ is a unit upper triangular matrix polynomial, there exists $P_{t+1}(x) \in \mathbb{C}[x]^{t \times (s+t)}$ such that $P_{t+1}(x) W_t(x) = I_t$. Let
\[ P(x) = P_{t+1}(x) P_t(x) P_{t-1}(x) \cdots P_1(x), \]
then $P(x) W(x) = I_t$. Note that $P(x) \in \mathbb{C}[x]^{t \times s}$.
For $W(x) \in \mathbb{R}[x]^{s \times t}$, we can replace $P(x)$ by $\big(P(x) + \overline{P(x)}\big)/2$ (here $\overline{P(x)}$ denotes the complex conjugate of $P(x)$), which is a real matrix polynomial.
(ii) The conclusion is implied directly by the item (i). $\square$

In Proposition 5.2, there is no explicit degree bound for $L(x)$ satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for $L(x)$, it can be determined by comparing coefficients of both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of $L(x)$ satisfying (5.2).

Example 5.3. Consider the hypercube with quadratic constraints
\[ \{ 1 - x_1^2 \ge 0, \; 1 - x_2^2 \ge 0, \ldots, 1 - x_n^2 \ge 0 \}. \]
The equation $L(x)C(x) = I_n$ is satisfied by
\[ L(x) = \big[\, -\tfrac{1}{2}\mathrm{diag}(x) \;\; I_n \,\big]. \]
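A quick numerical check of this identity, with $C(x)$ built as in (5.1) (a sketch with $n = 3$ and an arbitrary test point):

```python
import numpy as np

n = 3
x = np.random.default_rng(3).standard_normal(n)
grads = -2.0 * np.diag(x)                    # columns nabla c_i for c_i = 1 - x_i^2
C = np.vstack([grads, np.diag(1.0 - x**2)])  # C(x) as in (5.1), size (2n) x n
L = np.hstack([-0.5 * np.diag(x), np.eye(n)])
print(np.allclose(L @ C, np.eye(n)))         # True
```

Indeed, $-\frac{1}{2}x_i \cdot (-2x_i) + (1 - x_i^2) = 1$ in each diagonal entry.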

Example 5.4. Consider the nonnegative portion of the unit sphere
\[ \{ x_1 \ge 0, \; x_2 \ge 0, \ldots, x_n \ge 0, \; x_1^2 + \cdots + x_n^2 - 1 = 0 \}. \]
The equation $L(x)C(x) = I_{n+1}$ is satisfied by
\[ L(x) = \begin{bmatrix} I_n - xx^T & x \mathbf{1}_n^T & 2x \\[2pt] \frac{1}{2}x^T & -\frac{1}{2}\mathbf{1}_n^T & -1 \end{bmatrix}. \]
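This block matrix can likewise be verified against (5.1) numerically; a sketch with $n = 4$ and an arbitrary test point:

```python
import numpy as np

n = 4
x = np.random.default_rng(4).standard_normal(n)
ones = np.ones(n)
# constraints: c_i = x_i (i = 1..n), c_{n+1} = x^T x - 1
grads = np.hstack([np.eye(n), 2.0 * x.reshape(-1, 1)])     # columns nabla c_i
C = np.vstack([grads, np.diag(np.append(x, x @ x - 1.0))])  # C(x) as in (5.1)
L = np.block([
    [np.eye(n) - np.outer(x, x), np.outer(x, ones), 2.0 * x.reshape(-1, 1)],
    [0.5 * x.reshape(1, -1), -0.5 * ones.reshape(1, -1), np.array([[-1.0]])]])
print(np.allclose(L @ C, np.eye(n + 1)))                    # True
```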

Example 5.5. Consider the set
\[ \{ 1 - x_1^3 - x_2^4 \ge 0, \; 1 - x_3^4 - x_4^3 \ge 0 \}. \]
The equation $L(x)C(x) = I_2$ is satisfied by
\[ L(x) = \begin{bmatrix} -\frac{x_1}{3} & -\frac{x_2}{4} & 0 & 0 & 1 & 0 \\[2pt] 0 & 0 & -\frac{x_3}{4} & -\frac{x_4}{3} & 0 & 1 \end{bmatrix}. \]

Example 5.6. Consider the quadratic set
\[ \{ 1 - x_1 x_2 - x_2 x_3 - x_1 x_3 \ge 0, \; 1 - x_1^2 - x_2^2 - x_3^2 \ge 0 \}. \]
The matrix $L(x)^T$ satisfying $L(x)C(x) = I_2$ is
\[ \begin{bmatrix}
 25x_1^3 + 10x_1^2 x_2 + 40x_1 x_2^2 - 25x_1 - 2x_3 & -25x_1^3 - 10x_1^2 x_2 - 40x_1 x_2^2 + \frac{49x_1}{2} + 2x_3 \\[2pt]
 -15x_1^2 x_2 + 10x_1 x_2^2 + 20x_3 x_1 x_2 - 10x_1 & 15x_1^2 x_2 - 10x_1 x_2^2 - 20x_3 x_1 x_2 + 10x_1 - \frac{x_2}{2} \\[2pt]
 25x_3 x_1^2 - 20x_1 x_2^2 + 10x_3 x_1 x_2 + 2x_1 & -25x_3 x_1^2 + 20x_1 x_2^2 - 10x_3 x_1 x_2 - 2x_1 - \frac{x_3}{2} \\[2pt]
 1 - 20x_1 x_3 - 10x_1^2 - 20x_1 x_2 & 20x_1 x_2 + 20x_1 x_3 + 10x_1^2 \\[2pt]
 -50x_1^2 - 20x_2 x_1 & 50x_1^2 + 20x_2 x_1 + 1
\end{bmatrix}. \]

We would like to remark that a polynomial tuple $c = (c_1, \ldots, c_m)$ is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees $d_1, \ldots, d_m$, there exists an open dense subset $\mathcal{U}$ of $\mathcal{D} = \mathbb{R}[x]_{d_1} \times \cdots \times \mathbb{R}[x]_{d_m}$ such that every tuple $c = (c_1, \ldots, c_m) \in \mathcal{U}$ is nonsingular. Indeed, such $\mathcal{U}$ can be chosen as a Zariski open subset of $\mathcal{D}$, i.e., it is the complement of a proper real variety of $\mathcal{D}$. Moreover, Assumption 3.1 holds for all $c \in \mathcal{U}$, i.e., it holds generically.


Proof. The proof needs to use resultants and discriminants, for which we refer to [29].
First, let $J_1$ be the set of all $(i_1, \ldots, i_{n+1})$ with $1 \le i_1 < \cdots < i_{n+1} \le m$. The resultant $\mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}})$ [29, Section 2] is a polynomial in the coefficients of $c_{i_1}, \ldots, c_{i_{n+1}}$ such that, if $\mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}}) \ne 0$, then the equations
\[ c_{i_1}(x) = \cdots = c_{i_{n+1}}(x) = 0 \]
have no complex solutions. Define
\[ F_1(c) = \prod_{(i_1, \ldots, i_{n+1}) \in J_1} \mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}}). \]
For the case that $m \le n$, $J_1 = \emptyset$, and we simply let $F_1(c) = 1$. Clearly, if $F_1(c) \ne 0$, then no more than $n$ polynomials of $c_1, \ldots, c_m$ have a common complex zero.
Second, let $J_2$ be the set of all $(j_1, \ldots, j_k)$ with $k \le n$ and $1 \le j_1 < \cdots < j_k \le m$. When one of $c_{j_1}, \ldots, c_{j_k}$ has degree bigger than one, the discriminant $\Delta(c_{j_1}, \ldots, c_{j_k})$ is a polynomial in the coefficients of $c_{j_1}, \ldots, c_{j_k}$ such that, if $\Delta(c_{j_1}, \ldots, c_{j_k}) \ne 0$, then the equations
\[ c_{j_1}(x) = \cdots = c_{j_k}(x) = 0 \]
have no singular complex solution [29, Section 3], i.e., at every complex common solution $u$, the gradients of $c_{j_1}, \ldots, c_{j_k}$ at $u$ are linearly independent. When all $c_{j_1}, \ldots, c_{j_k}$ have degree one, the discriminant of the tuple $(c_{j_1}, \ldots, c_{j_k})$ is not a single polynomial, but we can define $\Delta(c_{j_1}, \ldots, c_{j_k})$ to be the product of all maximum minors of its Jacobian (a constant matrix). Define
\[ F_2(c) = \prod_{(j_1, \ldots, j_k) \in J_2} \Delta(c_{j_1}, \ldots, c_{j_k}). \]
Clearly, if $F_2(c) \ne 0$, then no $n$ or fewer polynomials of $c_1, \ldots, c_m$ have a singular complex common zero.
Last, let $F(c) = F_1(c) F_2(c)$ and
\[ \mathcal{U} = \{ c = (c_1, \ldots, c_m) \in \mathcal{D} : F(c) \ne 0 \}. \]
Note that $\mathcal{U}$ is a Zariski open subset of $\mathcal{D}$, and it is open dense in $\mathcal{D}$. For all $c \in \mathcal{U}$, no more than $n$ of $c_1, \ldots, c_m$ can have a complex common zero. For any $k$ polynomials ($k \le n$) of $c_1, \ldots, c_m$, if they have a complex common zero, say $u$, then their gradients at $u$ must be linearly independent. This means that $c$ is a nonsingular tuple.
Since every $c \in \mathcal{U}$ is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all $c \in \mathcal{U}$. So it holds generically. $\square$

6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.


The polynomials $p_i$ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as $c_1, \ldots, c_m$. First, find a matrix polynomial $L(x)$ satisfying (4.4) or (5.2). Let $L_1(x)$ be the submatrix of $L(x)$ consisting of the first $n$ columns. Then choose $(p_1, \ldots, p_m)$ to be the product $L_1(x) \nabla f(x)$, i.e.,
\[ p_i = \big( L_1(x) \nabla f(x) \big)_i. \]
In all our examples, the global minimum value $f_{min}$ of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., the sub-level set $\{ f(x) \le \ell \}$ is compact for all $\ell$) and the constraint qualification condition holds.
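This recipe is easy to carry out symbolically. A sketch using SymPy, for the hypercube constraints of Example 5.3 with a hypothetical objective $f$ (the identity $L(x)C(x) = I_n$ is also re-checked along the way):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
xs = [x1, x2]
f = x1**4 + x2**4 - 3*x1*x2                   # hypothetical objective, n = 2
cons = [1 - v**2 for v in xs]                 # c_i = 1 - x_i^2

# C(x) as in (5.1): gradient columns on top, diag(c_1, ..., c_m) below
C = sp.Matrix.vstack(
    sp.Matrix([[sp.diff(c, v) for c in cons] for v in xs]),
    sp.diag(*cons))
L = sp.Matrix.hstack(-sp.Rational(1, 2) * sp.diag(*xs), sp.eye(2))
assert sp.simplify(L * C - sp.eye(2)) == sp.zeros(2, 2)

L1 = L[:, :2]                                 # first n columns of L(x)
grad_f = sp.Matrix([sp.diff(f, v) for v in xs])
p = sp.expand(L1 * grad_f)                    # multiplier expressions p_i
print(p.T)
```

Here $p_i = -\frac{1}{2} x_i \, \partial f / \partial x_i$, a polynomial in $x$ alone, as promised by (5.3).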

By Theorem 3.3, we have $f_k = f_{min}$ for all $k$ big enough, if $f_c = f_{min}$ and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).
We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, and the computational time (in seconds) is also compared. The results for the standard Lasserre relaxations are titled "w.o. LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".

Example 6.1. Consider the optimization problem
\[ \begin{array}{rl} \min & x_1 x_2 (10 - x_3) \\ \text{s.t.} & x_1 \ge 0, \; x_2 \ge 0, \; x_3 \ge 0, \; 1 - x_1 - x_2 - x_3 \ge 0. \end{array} \]
The matrix polynomial $L(x)$ is given in Example 4.2. Since the feasible set is compact, the minimum $f_{min} = 0$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point $(x_1, x_2, x_3)$ with $x_1 x_2 = 0$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

  order k | w.o. LME: lower bound, time | with LME: lower bound, time
     2    |  -0.0521,  0.6841           |  -0.0521,       0.1922
     3    |  -0.0026,  0.2657           |  -3·10^{-8},    0.2285
     4    |  -0.0007,  0.6785           |  -6·10^{-9},    0.4431
     5    |  -0.0004,  1.6105           |  -2·10^{-9},    0.9567

² Note that $1 - x^T x = (1 - e^T x)(1 + x^T x) + \sum_{i=1}^n x_i (1 - x_i)^2 + \sum_{i \ne j} x_i^2 x_j \in IQ(c_{eq}, c_{in})$.


Example 6.2. Consider the optimization problem
\[ \begin{array}{rl} \min & x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3x_1^2 x_2^2 x_3^2 + (x_1^4 + x_2^4 + x_3^4) \\ \text{s.t.} & x_1^2 + x_2^2 + x_3^2 \ge 1. \end{array} \]
The matrix polynomial $L(x) = \big[\, \frac{1}{2}x_1 \;\; \frac{1}{2}x_2 \;\; \frac{1}{2}x_3 \;\; -1 \,\big]$. The objective $f$ is the sum of the positive definite form $x_1^4 + x_2^4 + x_3^4$ and the Motzkin polynomial
\[ M(x) = x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3x_1^2 x_2^2 x_3^2. \]
Note that $M(x)$ is nonnegative everywhere but not SOS [37]. Clearly, $f$ is coercive, and $f_{min}$ is achieved at a critical point. The set $IQ(\phi, \psi)$ is archimedean, because
\[ c_1(x) p_1(x) = (x_1^2 + x_2^2 + x_3^2 - 1)\big( 3M(x) + 2(x_1^4 + x_2^4 + x_3^4) \big) = 0 \]
defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value $f_{min} = \frac{1}{3}$, and there are 8 minimizers $\big( \pm\frac{1}{\sqrt{3}}, \pm\frac{1}{\sqrt{3}}, \pm\frac{1}{\sqrt{3}} \big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

  order k | w.o. LME: lower bound, time | with LME: lower bound, time
     3    |  -∞,            0.4466      |  0.1111,  0.1169
     4    |  -∞,            0.4948      |  0.3333,  0.3499
     5    |  -2.1821·10^5,  1.1836      |  0.3333,  0.6530
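One can confirm directly that the Motzkin term vanishes at the reported minimizers and that the objective value there equals $1/3$:

```python
import math

def motzkin(x1, x2, x3):
    # Motzkin polynomial: nonnegative on R^3 but not a sum of squares
    return x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2

u = 1.0 / math.sqrt(3.0)                       # coordinate of a minimizer
fval = motzkin(u, u, u) + 3 * u**4             # f = M(x) + x1^4 + x2^4 + x3^4
print(abs(motzkin(u, u, u)) < 1e-12, abs(fval - 1.0/3.0) < 1e-12)  # True True
```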

Example 6.3. Consider the optimization problem
\[ \begin{array}{rl} \min & x_1 x_2 + x_2 x_3 + x_3 x_4 - 3x_1 x_2 x_3 x_4 + (x_1^3 + \cdots + x_4^3) \\ \text{s.t.} & x_1, x_2, x_3, x_4 \ge 0, \; 1 - x_1 - x_2 \ge 0, \; 1 - x_3 - x_4 \ge 0. \end{array} \]
The matrix polynomial $L(x)$ is
\[ \begin{bmatrix}
 1 - x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
 -x_1 & 1 - x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
 0 & 0 & 1 - x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
 0 & 0 & -x_3 & 1 - x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
 -x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
 0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}. \]
The feasible set is compact, so $f_{min}$ is achieved at a critical point. One can show that $f_{min} = 0$, and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because $IQ(c_{eq}, c_{in})$ is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because $-c_1^2 p_1^2 \in \mathrm{Ideal}(\phi) \subseteq IQ(\phi, \psi)$ and the set $\{ -c_1(x)^2 p_1(x)^2 \ge 0 \}$ is compact.
⁴ This is because $1 - x_1^2 - x_2^2$ belongs to the quadratic module of $(x_1, x_2, 1 - x_1 - x_2)$, and $1 - x_3^2 - x_4^2$ belongs to the quadratic module of $(x_3, x_4, 1 - x_3 - x_4)$. See the footnote in Example 6.1.

Table 3. Computational results for Example 6.3

  order k | w.o. LME: lower bound, time | with LME: lower bound, time
     3    |  -2.9·10^{-5},   0.7335     |  -6·10^{-7},   0.6091
     4    |  -1.4·10^{-5},   2.5055     |  -8·10^{-8},   2.7423
     5    |  -1.4·10^{-5},  12.7092     |  -5·10^{-8},  13.7449

Example 6.4. Consider the polynomial optimization problem
\[ \begin{array}{rl} \min_{x \in \mathbb{R}^2} & x_1^2 + 50 x_2^2 \\ \text{s.t.} & x_1^2 - \frac{1}{2} \ge 0, \; x_2^2 - 2x_1 x_2 - \frac{1}{8} \ge 0, \; x_2^2 + 2x_1 x_2 - \frac{1}{8} \ge 0. \end{array} \]

It is motivated by an example in [12, §3]. The first column of $L(x)$ is
\[ \begin{bmatrix} \frac{8x_1^3}{5} + \frac{x_1}{5} \\[4pt] \frac{288 x_2 x_1^4}{5} - \frac{16 x_1^3}{5} - \frac{124 x_2 x_1^2}{5} + \frac{8 x_1}{5} - 2x_2 \\[4pt] -\frac{288 x_2 x_1^4}{5} - \frac{16 x_1^3}{5} + \frac{124 x_2 x_1^2}{5} + \frac{8 x_1}{5} + 2x_2 \end{bmatrix}, \]
and the second column of $L(x)$ is
\[ \begin{bmatrix} -\frac{8 x_1^2 x_2}{5} + \frac{4 x_2^3}{5} - \frac{x_2}{10} \\[4pt] \frac{288 x_1^3 x_2^2}{5} + \frac{16 x_1^2 x_2}{5} - \frac{142 x_1 x_2^2}{5} - \frac{9 x_1}{20} - \frac{8 x_2^3}{5} + \frac{11 x_2}{5} \\[4pt] -\frac{288 x_1^3 x_2^2}{5} + \frac{16 x_1^2 x_2}{5} + \frac{142 x_1 x_2^2}{5} + \frac{9 x_1}{20} - \frac{8 x_2^3}{5} + \frac{11 x_2}{5} \end{bmatrix}. \]

The objective is coercive, so $f_{min}$ is achieved at a critical point. The minimum value
\[ f_{min} = 56 + \tfrac{3}{4} + 25\sqrt{5} \approx 112.6517, \]
and the minimizers are $\big( \pm\sqrt{1/2}, \; \pm\big(\sqrt{5/8} + \sqrt{1/2}\big) \big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

  order k | w.o. LME: lower bound, time | with LME: lower bound, time
     3    |   6.7535,  0.4611           |   56.7500,  0.1309
     4    |   6.9294,  0.2428           |  112.6517,  0.2405
     5    |   8.8519,  0.3376           |  112.6517,  0.2167
     6    |  16.5971,  0.4703           |  112.6517,  0.3788
     7    |  35.4756,  0.6536           |  112.6517,  0.4537
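The closed form of $f_{min}$ and the activity of two of the constraints at a minimizer can be confirmed numerically:

```python
import math

x1 = math.sqrt(0.5)
x2 = math.sqrt(5.0 / 8.0) + math.sqrt(0.5)
fval = x1**2 + 50.0 * x2**2
# f_min = 56 + 3/4 + 25*sqrt(5)
print(abs(fval - (56.0 + 0.75 + 25.0 * math.sqrt(5.0))) < 1e-9)   # True
# the constraints x1^2 - 1/2 >= 0 and x2^2 - 2*x1*x2 - 1/8 >= 0 are active:
print(abs(x1**2 - 0.5) < 1e-12, abs(x2**2 - 2*x1*x2 - 0.125) < 1e-9)
```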

Example 6.5. Consider the optimization problem
\[ \begin{array}{rl} \min_{x \in \mathbb{R}^3} & x_1^3 + x_2^3 + x_3^3 + 4x_1 x_2 x_3 - \big( x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2) \big) \\ \text{s.t.} & x_1 \ge 0, \; x_1 x_2 - 1 \ge 0, \; x_2 x_3 - 1 \ge 0. \end{array} \]
The matrix polynomial $L(x)$ is
\[ \begin{bmatrix} 1 - x_1 x_2 & 0 & 0 & x_2 & x_2 & 0 \\ x_1 & 0 & 0 & -1 & -1 & 0 \\ -x_1 & x_2 & 0 & 1 & 0 & -1 \end{bmatrix}. \]
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant $\mathbb{R}^3_+$, so the minimum value is achieved at a critical point. In computation, we got $f_{min} \approx 0.9492$ and a global minimizer $(0.9071, 1.1024, 0.9071)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

  order k | w.o. LME: lower bound, time | with LME: lower bound, time
     2    |  -∞,             0.4129     |  -∞,      0.1900
     3    |  -7.8184·10^6,   0.4641     |  0.9492,  0.3139
     4    |  -2.0575·10^4,   0.6499     |  0.9492,  0.5057
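As a sanity check, evaluating the objective at the reported minimizer (whose coordinates are rounded to four digits) reproduces $f_{min}$ up to that rounding:

```python
def f(x1, x2, x3):
    return (x1**3 + x2**3 + x3**3 + 4*x1*x2*x3
            - (x1*(x2**2 + x3**2) + x2*(x3**2 + x1**2) + x3*(x1**2 + x2**2)))

val = f(0.9071, 1.1024, 0.9071)
print(round(val, 3))   # 0.949
```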

Example 6.6. Consider the optimization problem ($x_0 = 1$)
\[ \begin{array}{rl} \min_{x \in \mathbb{R}^4} & x^T x + \sum_{i=0}^{4} \prod_{j \ne i} (x_i - x_j) \\ \text{s.t.} & x_1^2 - 1 \ge 0, \; x_2^2 - 1 \ge 0, \; x_3^2 - 1 \ge 0, \; x_4^2 - 1 \ge 0. \end{array} \]
The matrix polynomial $L(x) = \big[\, \frac{1}{2}\mathrm{diag}(x) \;\; -I_4 \,\big]$. The first part of the objective is $x^T x$, while the second part is a nonnegative polynomial [37]. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} = 4.0000$ and 11 global minimizers:
\[ (1, 1, 1, 1), \quad (1, -1, -1, 1), \quad (1, -1, 1, -1), \quad (1, 1, -1, -1), \]
\[ (1, -1, -1, -1), \quad (-1, -1, 1, 1), \quad (-1, 1, -1, 1), \quad (-1, 1, 1, -1), \]
\[ (-1, -1, -1, 1), \quad (-1, -1, 1, -1), \quad (-1, 1, -1, -1). \]
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

  order k | w.o. LME: lower bound, time | with LME: lower bound, time
     3    |  -∞,              1.1377    |  3.5480,   1.1765
     4    |  -6.6913·10^4,    4.7677    |  4.0000,   3.0761
     5    |  -21.3778,       22.9970    |  4.0000,  10.3354
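Since the reported minimizers are all $\pm 1$ vectors, the list can be cross-checked by evaluating the objective at all $2^4$ sign vectors and collecting those attaining the value $4$:

```python
import itertools
import math

def f(x):
    z = (1.0,) + x                          # prepend x0 = 1
    quad = sum(v * v for v in x)
    prod_part = sum(math.prod(z[i] - z[j] for j in range(5) if j != i)
                    for i in range(5))
    return quad + prod_part

vals = {x: f(x) for x in itertools.product((-1.0, 1.0), repeat=4)}
mins = [x for x, v in vals.items() if abs(v - 4.0) < 1e-12]
print(len(mins), min(vals.values()))        # 11 4.0
```

Exactly the 11 listed vertices attain the value 4 (the remaining sign vectors give the value 20).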

Example 6.7. Consider the optimization problem
\[ \begin{array}{rl} \min_{x \in \mathbb{R}^3} & x_1^4 x_2^2 + x_2^4 x_3^2 + x_3^4 x_1^2 - 3x_1^2 x_2^2 x_3^2 + x_2^2 \\ \text{s.t.} & x_1 - x_2 x_3 \ge 0, \; -x_2 + x_3^2 \ge 0. \end{array} \]
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}$. By the arithmetic-geometric mean inequality, one can show that $f_{min} = 0$. The global minimizers are $(x_1, 0, x_3)$ with $x_1 \ge 0$ and $x_1 x_3 = 0$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that $f_k = f_{min}$ for all $k \ge 5$, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

  order k | w.o. LME: lower bound, time | with LME: lower bound, time
     3    |  -∞,             0.6144     |  -∞,            0.3418
     4    |  -1.0909·10^7,   1.0542     |  -3.9476,       0.7180
     5    |  -942.6772,      1.6771     |  -3·10^{-9},    1.4607
     6    |  -0.0110,        3.3532     |  -8·10^{-10},   3.1618

Example 6.8. Consider the optimization problem
\[ \begin{array}{rl} \min_{x \in \mathbb{R}^4} & x_1^2(x_1 - x_4)^2 + x_2^2(x_2 - x_4)^2 + x_3^2(x_3 - x_4)^2 \\ & \quad + \, 2x_1 x_2 x_3(x_1 + x_2 + x_3 - 2x_4) + (x_1 - 1)^2 + (x_2 - 1)^2 + (x_3 - 1)^2 \\ \text{s.t.} & x_1 - x_2 \ge 0, \; x_2 - x_3 \ge 0. \end{array} \]
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}$. In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} \approx 0.9413$ and a minimizer
\[ (0.5632, \; 0.5632, \; 0.5632, \; 0.7510). \]
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

  order k | w.o. LME: lower bound, time | with LME: lower bound, time
     2    |  -∞,              0.3984    |  -0.3360,  0.9321
     3    |  -∞,              0.7634    |   0.9413,  0.5240
     4    |  -6.4896·10^5,    4.5496    |   0.9413,  1.7192
     5    |  -3.1645·10^3,   24.3665    |   0.9413,  8.1228

Example 6.9. Consider the optimization problem
$$
\begin{array}{cl}
\min\limits_{x\in\mathbb{R}^4} & (x_1+x_2+x_3+x_4+1)^2-4(x_1x_2+x_2x_3+x_3x_4+x_4+x_1)\\
\text{s.t.} & 0\le x_1,\ldots,x_4\le 1.
\end{array}
$$
The matrix $L(x)$ is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so $f_{min}$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.^5 The minimum value $f_{min}=0$. For each $t\in[0,1]$, the point $(t,0,0,1-t)$ is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.

^5 Note that $4-\sum_{i=1}^4 x_i^2=\sum_{i=1}^4\big(x_i(1-x_i)^2+(1-x_i)(1+x_i^2)\big)\in \mathrm{IQ}(c_{eq},c_{in})$.
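The footnote rests on the one-variable identity $x(1-x)^2+(1-x)(1+x^2)=1-x^2$, summed over the four coordinates. A quick numeric sanity check (illustrative, plain Python):

```python
# Footnote 5 uses, for each coordinate, the identity
#   x*(1 - x)**2 + (1 - x)*(1 + x**2) == 1 - x**2,
# which sums over i = 1,...,4 to give 4 - sum_i x_i^2.

def footnote_lhs(t):
    return t * (1 - t) ** 2 + (1 - t) * (1 + t * t)

for t in [-1.5, -0.3, 0.0, 0.7, 1.0, 2.0]:
    assert abs(footnote_lhs(t) - (1 - t * t)) < 1e-12
print("identity holds at all sampled points")
```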


Table 9. Computational results for Example 6.9

             w/o LME                    with LME
  order k    lower bound    time       lower bound       time
     2       $-0.0279$      0.2262     $-5\cdot 10^{-6}$     1.1835
     3       $-0.0005$      0.4691     $-6\cdot 10^{-7}$     1.6566
     4       $-0.0001$      3.1098     $-2\cdot 10^{-7}$     5.5234
     5       $-4\cdot 10^{-5}$  16.5092    $-6\cdot 10^{-7}$     19.7320

For some polynomial optimization problems, the standard Lasserre relaxations might converge fast, e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem ($x_0=1$)
$$
\begin{array}{cl}
\min\limits_{x\in\mathbb{R}^n} & \sum\limits_{0\le i\le j\le k\le n} c_{ijk}\,x_ix_jx_k\\
\text{s.t.} & 0\le x\le 1,
\end{array}
$$
where each coefficient $c_{ijk}$ is randomly generated (by randn in MATLAB). The matrix $L(x)$ is the same as in Example 4.3. Since the feasible set is compact, we always have $f_c=f_{min}$. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order $k=2$, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by standard Lasserre's relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each $n$ in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order $k=2$, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.
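A random instance of this form can be generated as follows. This is a sketch using Python's standard library in place of MATLAB's randn; the helper names `random_cubic` and `evaluate` are illustrative, not from the paper. With $x_0=1$, one coefficient is drawn for each monomial $x_ix_jx_k$ with $0\le i\le j\le k\le n$, so lower-degree monomials appear as well.

```python
import random
from itertools import combinations_with_replacement

def random_cubic(n, seed=0):
    # One Gaussian coefficient c_ijk per monomial x_i x_j x_k, 0 <= i <= j <= k <= n.
    rng = random.Random(seed)
    monomials = list(combinations_with_replacement(range(n + 1), 3))
    return {m: rng.gauss(0.0, 1.0) for m in monomials}

def evaluate(c, x):
    # Evaluate the cubic objective at x in R^n, with the convention x_0 = 1.
    x0 = [1.0] + list(x)
    return sum(cv * x0[i] * x0[j] * x0[k] for (i, j, k), cv in c.items())

c = random_cubic(2)
print(len(c))  # C(5,3) = 10 monomials for n = 2
```

The number of coefficients is $\binom{n+3}{3}$, which is why the SDP sizes in Table 10 grow quickly with $n$.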

Table 10. Consumed time (in seconds) for Example 6.10

  n          9        10       11       12        13        14
  w/o LME    1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
  with LME   1.9714   3.8288   8.2519   20.0310   37.6373   82.4778

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value $f_{min}$ is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that
$$
\mathrm{IQ}(c_{eq},c_{in})_{2k}+\mathrm{IQ}(\varphi,\psi)_{2k}
=\mathrm{Ideal}(c_{eq},\varphi)_{2k}+\mathrm{Qmod}(c_{in},\psi)_{2k}.
$$
If we replace the quadratic module $\mathrm{Qmod}(c_{in},\psi)$ by the preordering of $(c_{in},\psi)$ [21, 25], we can get further tighter relaxations. For convenience, write $(c_{in},\psi)$ as a single tuple $(g_1,\ldots,g_\ell)$. Its preordering is the set
$$
\mathrm{Preord}(c_{in},\psi)=\sum_{r_1,\ldots,r_\ell\in\{0,1\}} g_1^{r_1}\cdots g_\ell^{r_\ell}\cdot\Sigma[x].
$$
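The preordering attaches one SOS multiplier to every subset product of the tuple, so its truncation involves $2^\ell$ blocks rather than the $\ell+1$ of the quadratic module; this is the price of the stronger Theorem 7.1 below. A small sketch enumerating these products symbolically (the helper name is illustrative):

```python
from itertools import product

def preordering_products(g):
    # One product g_1^{r_1} ... g_l^{r_l} for each exponent vector r in {0,1}^l;
    # r = (0,...,0) gives the empty product, i.e. the plain SOS summand.
    return [tuple(gi for gi, ri in zip(g, r) if ri)
            for r in product([0, 1], repeat=len(g))]

terms = preordering_products(['g1', 'g2', 'g3'])
print(len(terms))  # 2^3 = 8 products, vs. 3 + 1 = 4 summands for Qmod
```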

The truncation $\mathrm{Preord}(c_{in},\psi)_{2k}$ is defined similarly to $\mathrm{Qmod}(c_{in},\psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same order $k$, is
$$
(7.1)\qquad
\left\{
\begin{array}{rl}
f'^{\,pre}_k:=\min & \langle f,y\rangle\\
\text{s.t.} & \langle 1,y\rangle=1,\ \ L^{(k)}_{c_{eq}}(y)=0,\ \ L^{(k)}_{\varphi}(y)=0,\\
& L^{(k)}_{g_1^{r_1}\cdots g_\ell^{r_\ell}}(y)\succeq 0\ \ \forall\, r_1,\ldots,r_\ell\in\{0,1\},\\
& y\in\mathbb{R}^{\mathbb{N}^n_{2k}}.
\end{array}
\right.
$$

Similar to (3.9), the dual optimization problem of the above is
$$
(7.2)\qquad
\begin{array}{rl}
f^{pre}_k:=\max & \gamma\\
\text{s.t.} & f-\gamma\in \mathrm{Ideal}(c_{eq},\varphi)_{2k}+\mathrm{Preord}(c_{in},\psi)_{2k}.
\end{array}
$$
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose $\mathcal{K}_c\ne\emptyset$ and Assumption 3.1 holds. Then
$$ f^{pre}_k = f'^{\,pre}_k = f_c $$
for all $k$ sufficiently large. Therefore, if the minimum value $f_{min}$ of (1.1) is achieved at a critical point, then $f^{pre}_k=f'^{\,pre}_k=f_{min}$ for all $k$ big enough.

Proof. The proof is very similar to Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have $\hat{f}(x)\equiv 0$ on the set
$$ K_3=\{x\in\mathbb{R}^n \mid c_{eq}(x)=0,\ \varphi(x)=0,\ c_{in}(x)\ge 0,\ \psi(x)\ge 0\}. $$
By the Positivstellensatz, there exist an integer $\ell>0$ and $q\in\mathrm{Preord}(c_{in},\psi)$ such that $\hat{f}^{2\ell}+q\in\mathrm{Ideal}(c_{eq},\varphi)$. The rest of the proof is the same. $\square$

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple $c$ of constraining polynomials is nonsingular, then there exists a matrix polynomial $L(x)$ such that $L(x)C(x)=I_m$. Hence, the Lagrange multiplier $\lambda$ can be expressed as in (5.3). However, if $c$ is not nonsingular, then such $L(x)$ does not exist. For such cases, how can we express $\lambda$ in terms of $x$, for critical pairs $(x,\lambda)$? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for $L(x)$. For a nonsingular tuple $c$ of constraining polynomials, what is a good degree bound for $L(x)$ in Proposition 5.2? When $c$ is linear, a degree bound is given in Proposition 4.1. However, for nonlinear $c$, an explicit degree bound is not known. Theoretically, we can get a degree bound for $L(x)$: in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used $t$ times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is applied at each usage, then a degree bound for $L(x)$ can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for $L(x)$.


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector $\lambda$ is determined by a linear equation. Naturally, one can get
$$
\lambda=\big(C(x)^TC(x)\big)^{-1}C(x)^T\begin{bmatrix}\nabla f(x)\\ 0\end{bmatrix}
$$
when $C(x)$ has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, $\lambda$ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
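For a single constraint, the normal-equation formula above reduces to a scalar least-squares solve against the gradient. A minimal numeric sketch for the sphere constraint $c_1=1-x^Tx$, where the closed form (3.5) gives $\lambda_1=-\frac{1}{2}x^T\nabla f(x)$; the choice $f(x)=x_1$ and the critical point $x=e_1$ are illustrative, not from the paper:

```python
# Least-squares recovery of the Lagrange multiplier at a critical point.
# For f(x) = x_1 on the unit sphere c_1(x) = 1 - x^T x, grad c_1 = -2x,
# and at the critical point x = e_1, KKT gives grad f = lambda_1 * grad c_1.

def multiplier_via_normal_equations(grad_f, grad_c):
    # lambda = (C^T C)^{-1} C^T grad_f, with a single column C = grad_c
    ctc = sum(g * g for g in grad_c)
    ctf = sum(g * h for g, h in zip(grad_c, grad_f))
    return ctf / ctc

x = [1.0, 0.0, 0.0]            # critical point e_1
grad_f = [1.0, 0.0, 0.0]       # gradient of f(x) = x_1
grad_c = [-2.0 * xi for xi in x]
lam = multiplier_via_normal_equations(grad_f, grad_c)
print(lam)  # -0.5, matching the closed form -x^T grad_f / 2 of (3.5)
```

The point of §7.4 is that, away from a fixed critical point, this quotient becomes a ratio of high-degree polynomials in $x$, which is why a low-degree rational (or polynomial) representation would be valuable.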

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra, third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on noncompact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2008), no. 2, pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J. B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J. B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (2001), no. 3, pp. 796–817.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J. B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J. B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J. B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J. B. Lasserre, K. C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P. A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J. F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H. H. Vui and P. T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093
E-mail address: njw@math.ucsd.edu


• We show that the polynomials $p_i$ satisfying (1.10) always exist when the tuple of constraining polynomials is nonsingular (see Definition 5.1). Moreover, they can be determined by linear equations.
• Using the new conditions (1.11)-(1.12), we can construct tight relaxations for solving (1.1). To be more precise, we construct a hierarchy of new relaxations, which has finite convergence. This is true even if the feasible set is noncompact and/or the optimality conditions fail to hold.
• For every relaxation order, the new relaxations are tighter than the standard ones in the prior work.

The paper is organized as follows. Section 2 reviews some basics in polynomial optimization. Section 3 constructs new relaxations and proves their tightness. Section 4 characterizes when the polynomials $p_i$ satisfying (1.10) exist, and shows how to determine them, for polyhedral constraints. Section 5 discusses the case of general nonlinear constraints. Section 6 gives examples of using the new relaxations. Section 7 discusses some related issues.

2. Preliminaries

Notation. The symbol $\mathbb{N}$ (resp., $\mathbb{R}$, $\mathbb{C}$) denotes the set of nonnegative integral (resp., real, complex) numbers. The symbol $\mathbb{R}[x]=\mathbb{R}[x_1,\ldots,x_n]$ denotes the ring of polynomials in $x=(x_1,\ldots,x_n)$ with real coefficients. The $\mathbb{R}[x]_d$ stands for the set of real polynomials with degrees $\le d$. Denote
$$
\mathbb{N}^n_d:=\{\alpha=(\alpha_1,\ldots,\alpha_n)\in\mathbb{N}^n \mid |\alpha|=\alpha_1+\cdots+\alpha_n\le d\}.
$$
For a polynomial $p$, $\deg(p)$ denotes its total degree. For $t\in\mathbb{R}$, $\lceil t\rceil$ denotes the smallest integer $\ge t$. For an integer $k>0$, denote $[k]:=\{1,2,\ldots,k\}$. For $x=(x_1,\ldots,x_n)$ and $\alpha=(\alpha_1,\ldots,\alpha_n)$, denote
$$
x^\alpha=x_1^{\alpha_1}\cdots x_n^{\alpha_n},\qquad
[x]_d=\begin{bmatrix}1&x_1&\cdots&x_n&x_1^2&x_1x_2&\cdots&x_n^d\end{bmatrix}^T.
$$
The superscript $T$ denotes the transpose of a matrix/vector. The $e_i$ denotes the $i$th standard unit vector, while $e$ denotes the vector of all ones. The $I_m$ denotes the $m$-by-$m$ identity matrix. By writing $X\succeq 0$ (resp., $X\succ 0$), we mean that $X$ is a symmetric positive semidefinite (resp., positive definite) matrix. For matrices $X_1,\ldots,X_r$, $\mathrm{diag}(X_1,\ldots,X_r)$ denotes the block diagonal matrix whose diagonal blocks are $X_1,\ldots,X_r$. In particular, for a vector $a$, $\mathrm{diag}(a)$ denotes the diagonal matrix whose diagonal vector is $a$. For a function $f$ in $x$, $f_{x_i}$ denotes its partial derivative with respect to $x_i$.

We review some basics in computational algebra and polynomial optimization. They could be found in [4, 21, 22, 25, 26]. An ideal $I$ of $\mathbb{R}[x]$ is a subset such that $I\cdot\mathbb{R}[x]\subseteq I$ and $I+I\subseteq I$. For a tuple $h=(h_1,\ldots,h_m)$ of polynomials, $\mathrm{Ideal}(h)$ denotes the smallest ideal containing all $h_i$, which is the set
$$ h_1\cdot\mathbb{R}[x]+\cdots+h_m\cdot\mathbb{R}[x]. $$
The $2k$th truncation of $\mathrm{Ideal}(h)$ is the set
$$ \mathrm{Ideal}(h)_{2k}=h_1\cdot\mathbb{R}[x]_{2k-\deg(h_1)}+\cdots+h_m\cdot\mathbb{R}[x]_{2k-\deg(h_m)}. $$
The truncation $\mathrm{Ideal}(h)_{2k}$ depends on the generators $h_1,\ldots,h_m$. For an ideal $I$, its complex and real varieties are respectively defined as
$$
V_{\mathbb{C}}(I)=\{v\in\mathbb{C}^n \mid p(v)=0\ \forall\, p\in I\},\qquad
V_{\mathbb{R}}(I)=V_{\mathbb{C}}(I)\cap\mathbb{R}^n.
$$


A polynomial $\sigma$ is said to be a sum of squares (SOS) if $\sigma=s_1^2+\cdots+s_k^2$ for some polynomials $s_1,\ldots,s_k\in\mathbb{R}[x]$. The set of all SOS polynomials in $x$ is denoted as $\Sigma[x]$. For a degree $d$, denote the truncation
$$ \Sigma[x]_d=\Sigma[x]\cap\mathbb{R}[x]_d. $$
For a tuple $g=(g_1,\ldots,g_t)$, its quadratic module is the set
$$ \mathrm{Qmod}(g)=\Sigma[x]+g_1\cdot\Sigma[x]+\cdots+g_t\cdot\Sigma[x]. $$
The $2k$th truncation of $\mathrm{Qmod}(g)$ is the set
$$ \mathrm{Qmod}(g)_{2k}=\Sigma[x]_{2k}+g_1\cdot\Sigma[x]_{2k-\deg(g_1)}+\cdots+g_t\cdot\Sigma[x]_{2k-\deg(g_t)}. $$
The truncation $\mathrm{Qmod}(g)_{2k}$ depends on the generators $g_1,\ldots,g_t$. Denote
$$
(2.1)\qquad
\mathrm{IQ}(h,g)=\mathrm{Ideal}(h)+\mathrm{Qmod}(g),\qquad
\mathrm{IQ}(h,g)_{2k}=\mathrm{Ideal}(h)_{2k}+\mathrm{Qmod}(g)_{2k}.
$$
The set $\mathrm{IQ}(h,g)$ is said to be archimedean if there exists $p\in\mathrm{IQ}(h,g)$ such that $p(x)\ge 0$ defines a compact set in $\mathbb{R}^n$. If $\mathrm{IQ}(h,g)$ is archimedean, then
$$ K=\{x\in\mathbb{R}^n \mid h(x)=0,\ g(x)\ge 0\} $$
must be a compact set. Conversely, if $K$ is compact, say $K\subseteq B(0,R)$ (the ball centered at $0$ with radius $R$), then $\mathrm{IQ}(h,(g,R^2-x^Tx))$ is always archimedean, and $h=0$, $(g,R^2-x^Tx)\ge 0$ give the same set $K$.

Theorem 2.1 (Putinar [36]). Let $h$, $g$ be tuples of polynomials in $\mathbb{R}[x]$. Let $K$ be as above. Assume $\mathrm{IQ}(h,g)$ is archimedean. If a polynomial $f\in\mathbb{R}[x]$ is positive on $K$, then $f\in\mathrm{IQ}(h,g)$.

Interestingly, if $f$ is only nonnegative on $K$ but standard optimality conditions hold (see Subsection 1.1), then we still have $f\in\mathrm{IQ}(h,g)$ [33].

Let $\mathbb{R}^{\mathbb{N}^n_d}$ be the space of real multi-sequences indexed by $\alpha\in\mathbb{N}^n_d$. A vector in $\mathbb{R}^{\mathbb{N}^n_d}$ is called a truncated multi-sequence (tms) of degree $d$. A tms $y=(y_\alpha)_{\alpha\in\mathbb{N}^n_d}$ gives the Riesz functional $R_y$ acting on $\mathbb{R}[x]_d$ as
$$
(2.2)\qquad
R_y\Big(\sum_{\alpha\in\mathbb{N}^n_d}f_\alpha x^\alpha\Big)=\sum_{\alpha\in\mathbb{N}^n_d}f_\alpha y_\alpha.
$$
For $f\in\mathbb{R}[x]_d$ and $y\in\mathbb{R}^{\mathbb{N}^n_d}$, we denote
$$
(2.3)\qquad \langle f,y\rangle=R_y(f).
$$
Let $q\in\mathbb{R}[x]_{2k}$. The $k$th localizing matrix of $q$, generated by $y\in\mathbb{R}^{\mathbb{N}^n_{2k}}$, is the symmetric matrix $L^{(k)}_q(y)$ such that
$$
(2.4)\qquad
\mathrm{vec}(a_1)^T\big(L^{(k)}_q(y)\big)\mathrm{vec}(a_2)=R_y(q\,a_1a_2)
$$
for all $a_1,a_2\in\mathbb{R}[x]_{k-\lceil\deg(q)/2\rceil}$. (The $\mathrm{vec}(a_i)$ denotes the coefficient vector of $a_i$.) When $q=1$, $L^{(k)}_q(y)$ is called a moment matrix, and we denote
$$
(2.5)\qquad M_k(y)=L^{(k)}_1(y).
$$
The columns and rows of $L^{(k)}_q(y)$, as well as $M_k(y)$, are indexed by $\alpha\in\mathbb{N}^n$ with $2|\alpha|+\deg(q)\le 2k$. When $q=(q_1,\ldots,q_r)$ is a tuple of polynomials, we define
$$
(2.6)\qquad
L^{(k)}_q(y)=\mathrm{diag}\big(L^{(k)}_{q_1}(y),\ldots,L^{(k)}_{q_r}(y)\big),
$$
a block diagonal matrix. For the polynomial tuples $h$, $g$ as above, the set
$$
(2.7)\qquad
\mathscr{S}(h,g)_{2k}=\Big\{y\in\mathbb{R}^{\mathbb{N}^n_{2k}}\ \Big|\ L^{(k)}_h(y)=0,\ L^{(k)}_g(y)\succeq 0\Big\}
$$
is a spectrahedral cone in $\mathbb{R}^{\mathbb{N}^n_{2k}}$. The set $\mathrm{IQ}(h,g)_{2k}$ is also a convex cone in $\mathbb{R}[x]_{2k}$. The dual cone of $\mathrm{IQ}(h,g)_{2k}$ is precisely $\mathscr{S}(h,g)_{2k}$ [22, 25, 34]. This is because $\langle p,y\rangle\ge 0$ for all $p\in\mathrm{IQ}(h,g)_{2k}$ and for all $y\in\mathscr{S}(h,g)_{2k}$.
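For intuition, the moment matrix of the tms generated by a point evaluation is the rank-one outer product $M_k([u]_{2k})=[x]_k[x]_k^T$ evaluated at $u$. A minimal sketch for $n=1$, $k=1$ (plain Python, illustrative only):

```python
# For n = 1, the tms y = (1, u, u^2) of the point evaluation at u gives the
# moment matrix M_1(y), indexed by the monomials {1, x}: entry (i, j) is y_{i+j}.

def moment_matrix_1d(y):
    return [[y[0], y[1]],
            [y[1], y[2]]]

u = 3.0
y = [1.0, u, u ** 2]
M = moment_matrix_1d(y)
# Rank one, as M_1(y) = (1, u)(1, u)^T: determinant y_0*y_2 - y_1^2 = u^2 - u^2
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
print(det)  # 0.0
```

This rank-one structure is exactly what the flat truncation condition of §3.2 exploits to extract minimizers from an optimal $y^*$.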

3. The construction of tight relaxations

Consider the polynomial optimization problem (1.1). Let
$$ \lambda=(\lambda_i)_{i\in\mathcal{E}\cup\mathcal{I}} $$
be the vector of Lagrange multipliers. Denote the set
$$
(3.1)\qquad
\mathcal{K}=\left\{(x,\lambda)\in\mathbb{R}^n\times\mathbb{R}^{\mathcal{E}\cup\mathcal{I}}\ \middle|\
\begin{array}{l}
c_i(x)=0\ (i\in\mathcal{E}),\ \ \lambda_jc_j(x)=0\ (j\in\mathcal{I}),\\
\nabla f(x)=\sum_{i\in\mathcal{E}\cup\mathcal{I}}\lambda_i\nabla c_i(x)
\end{array}\right\}.
$$
Each point in $\mathcal{K}$ is called a critical pair. The projection
$$
(3.2)\qquad \mathcal{K}_c=\{u \mid (u,\lambda)\in\mathcal{K}\}
$$
is the set of all real critical points. To construct tight relaxations for solving (1.1), we need the following assumption for Lagrange multipliers.

Assumption 3.1. For each $i\in\mathcal{E}\cup\mathcal{I}$, there exists a polynomial $p_i\in\mathbb{R}[x]$ such that for all $(x,\lambda)\in\mathcal{K}$ it holds that
$$ \lambda_i=p_i(x). $$

Assumption 3.1 is generically satisfied, as shown in Proposition 5.7. For the following special cases, we can get the polynomials $p_i$ explicitly.

• (Simplex) For the simplex $\{e^Tx-1=0,\ x_1\ge 0,\ldots,x_n\ge 0\}$, it corresponds to that $\mathcal{E}=\{0\}$, $\mathcal{I}=[n]$, $c_0(x)=e^Tx-1$, $c_j(x)=x_j$ ($j\in[n]$). The Lagrange multipliers can be expressed as
$$
(3.3)\qquad \lambda_0=x^T\nabla f(x),\qquad \lambda_j=f_{x_j}-x^T\nabla f(x)\ \ (j\in[n]).
$$
• (Hypercube) For the hypercube $[-1,1]^n$, it corresponds to that $\mathcal{E}=\emptyset$, $\mathcal{I}=[n]$, and each $c_j(x)=1-x_j^2$. We can show that
$$
(3.4)\qquad \lambda_j=-\tfrac{1}{2}x_jf_{x_j}\ \ (j\in[n]).
$$
• (Ball or sphere) The constraint is $1-x^Tx=0$ or $1-x^Tx\ge 0$. It corresponds to that $\mathcal{E}\cup\mathcal{I}=\{1\}$ and $c_1=1-x^Tx$. We have
$$
(3.5)\qquad \lambda_1=-\tfrac{1}{2}x^T\nabla f(x).
$$
• (Triangular constraints) Suppose $\mathcal{E}\cup\mathcal{I}=\{1,\ldots,m\}$ and each
$$ c_i(x)=\tau_ix_i+q_i(x_{i+1},\ldots,x_n) $$
for some polynomials $q_i\in\mathbb{R}[x_{i+1},\ldots,x_n]$ and scalars $\tau_i\ne 0$. The matrix $T(x)$, consisting of the first $m$ rows of $[\nabla c_1(x)\ \cdots\ \nabla c_m(x)]$, is an invertible lower triangular matrix with constant diagonal entries. Then
$$ \lambda=T(x)^{-1}\cdot\begin{bmatrix}f_{x_1}&\cdots&f_{x_m}\end{bmatrix}^T. $$
Note that the inverse $T(x)^{-1}$ is a matrix polynomial.
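The ball/sphere formula (3.5), for instance, follows directly from the conditions defining (3.1):
```latex
\nabla f(x)=\lambda_1\nabla c_1(x)=-2\lambda_1 x
\ \Longrightarrow\
x^T\nabla f(x)=-2\lambda_1\, x^Tx=-2\lambda_1,
```
using $x^Tx=1$ in the sphere case; in the ball case, complementarity $\lambda_1(1-x^Tx)=0$ gives $\lambda_1 x^Tx=\lambda_1$, with the same conclusion. The hypercube formula (3.4) is analogous: $f_{x_j}=-2\lambda_jx_j$ and $\lambda_jx_j^2=\lambda_j$ yield $x_jf_{x_j}=-2\lambda_j$, i.e., $\lambda_j=-\frac{1}{2}x_jf_{x_j}$.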


For more general constraints, we can also express $\lambda$ as a polynomial function in $x$, on the set $\mathcal{K}_c$. This will be discussed in §4 and §5.

For the polynomials $p_i$ as in Assumption 3.1, denote
$$
(3.6)\qquad
\varphi=\Big(\nabla f-\sum_{i\in\mathcal{E}\cup\mathcal{I}}p_i\nabla c_i,\ \big(p_jc_j\big)_{j\in\mathcal{I}}\Big),\qquad
\psi=\big(p_j\big)_{j\in\mathcal{I}}.
$$
When the minimum value $f_{min}$ of (1.1) is achieved at a critical point, (1.1) is equivalent to the problem
$$
(3.7)\qquad
\begin{array}{rl}
f_c:=\min & f(x)\\
\text{s.t.} & c_{eq}(x)=0,\ c_{in}(x)\ge 0,\\
& \varphi(x)=0,\ \psi(x)\ge 0.
\end{array}
$$
We apply Lasserre relaxations to solve it. For an integer $k>0$ (called the relaxation order), the $k$th order Lasserre's relaxation for (3.7) is
$$
(3.8)\qquad
\begin{array}{rl}
f'_k:=\min & \langle f,y\rangle\\
\text{s.t.} & \langle 1,y\rangle=1,\ M_k(y)\succeq 0,\\
& L^{(k)}_{c_{eq}}(y)=0,\ L^{(k)}_{c_{in}}(y)\succeq 0,\\
& L^{(k)}_{\varphi}(y)=0,\ L^{(k)}_{\psi}(y)\succeq 0,\ y\in\mathbb{R}^{\mathbb{N}^n_{2k}}.
\end{array}
$$
Since $x^0=1$ (the constant one polynomial), the condition $\langle 1,y\rangle=1$ means that $(y)_0=1$. The dual optimization problem of (3.8) is
$$
(3.9)\qquad
\begin{array}{rl}
f_k:=\max & \gamma\\
\text{s.t.} & f-\gamma\in \mathrm{IQ}(c_{eq},c_{in})_{2k}+\mathrm{IQ}(\varphi,\psi)_{2k}.
\end{array}
$$
We refer to §2 for the notation used in (3.8)-(3.9). They are equivalent to semidefinite programs (SDPs), so they can be solved by SDP solvers (e.g., SeDuMi [40]). For $k=1,2,\ldots$, we get a hierarchy of Lasserre relaxations. In (3.8)-(3.9), if we remove the usage of $\varphi$ and $\psi$, they are reduced to standard Lasserre relaxations in [17]. So, (3.8)-(3.9) are stronger relaxations.

By the construction of $\varphi$ as in (3.6), Assumption 3.1 implies that
$$ \mathcal{K}_c=\{u\in\mathbb{R}^n \mid c_{eq}(u)=0,\ \varphi(u)=0\}. $$
By Lemma 3.3 of [6], $f$ achieves only finitely many values on $\mathcal{K}_c$, say
$$
(3.10)\qquad v_1<\cdots<v_N.
$$
A point $u\in\mathcal{K}_c$ might not be feasible for (3.7), i.e., it is possible that $c_{in}(u)\not\ge 0$ or $\psi(u)\not\ge 0$. In applications, we are often interested in the optimal value $f_c$ of (3.7). When (3.7) is infeasible, by convention, we set
$$ f_c=+\infty. $$
When the optimal value $f_{min}$ of (1.1) is achieved at a critical point, $f_c=f_{min}$. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., for each $\ell$, the sublevel set $\{f(x)\le\ell\}$ is compact), and the constraint qualification condition holds. As in [17], one can show that
$$
(3.11)\qquad f_k\le f'_k\le f_c
$$
for all $k$. Moreover, $f_k$ and $f'_k$ are both monotonically increasing. If for some order $k$ it occurs that
$$ f_k=f'_k=f_c, $$
then the $k$th order Lasserre's relaxation is said to be tight (or exact).


3.1. Tightness of the relaxations. Let $c_{in}$, $\psi$, $\mathcal{K}_c$, $f_c$ be as above. We refer to §2 for the notation $\mathrm{Qmod}(c_{in},\psi)$. We begin with a general assumption.

Assumption 3.2. There exists $\rho\in\mathrm{Qmod}(c_{in},\psi)$ such that if $u\in\mathcal{K}_c$ and $f(u)<f_c$, then $\rho(u)<0$.

In Assumption 3.2, the hypersurface $\rho(x)=0$ separates feasible and infeasible critical points. Clearly, if $u\in\mathcal{K}_c$ is a feasible point for (3.7), then $c_{in}(u)\ge 0$ and $\psi(u)\ge 0$, and hence $\rho(u)\ge 0$. Assumption 3.2 generally holds. For instance, it is satisfied in the following general cases:

a) When there are no inequality constraints, $c_{in}$ and $\psi$ are empty tuples. Then $\mathrm{Qmod}(c_{in},\psi)=\Sigma[x]$, and Assumption 3.2 is satisfied for $\rho=0$.
b) Suppose the set $\mathcal{K}_c$ is finite, say $\mathcal{K}_c=\{u_1,\ldots,u_D\}$, and
$$ f(u_1),\ldots,f(u_{t-1})<f_c\le f(u_t),\ldots,f(u_D). $$
Let $\ell_1,\ldots,\ell_D$ be real interpolating polynomials such that $\ell_i(u_j)=1$ for $i=j$ and $\ell_i(u_j)=0$ for $i\ne j$. For each $i=1,\ldots,t-1$, there must exist $j_i\in\mathcal{I}$ such that $c_{j_i}(u_i)<0$. Then the polynomial
$$
(3.12)\qquad
\rho=\sum_{i<t}\frac{-1}{c_{j_i}(u_i)}\,c_{j_i}(x)\,\ell_i(x)^2+\sum_{i\ge t}\ell_i(x)^2
$$
satisfies Assumption 3.2.
c) For each $x$ with $f(x)=v_i<f_c$, at least one of the constraints $c_j(x)\ge 0$, $p_j(x)\ge 0$ ($j\in\mathcal{I}$) is violated. Suppose for each critical value $v_i<f_c$ there exists $g_i\in\{c_j,p_j\}_{j\in\mathcal{I}}$ such that
$$ g_i<0 \ \text{ on }\ \mathcal{K}_c\cap\{f(x)=v_i\}. $$
Let $\varphi_1,\ldots,\varphi_N$ be real univariate polynomials such that $\varphi_i(v_j)=0$ for $i\ne j$ and $\varphi_i(v_j)=1$ for $i=j$. Suppose $v_t=f_c$. Then the polynomial
$$
(3.13)\qquad
\rho=\sum_{i<t}g_i(x)\big(\varphi_i(f(x))\big)^2+\sum_{i\ge t}\big(\varphi_i(f(x))\big)^2
$$
satisfies Assumption 3.2.

We refer to §2 for the archimedean condition and the notation $\mathrm{IQ}(h,g)$ as in (2.1). The following is about the convergence of the relaxations (3.8)-(3.9).

Theorem 3.3. Suppose $\mathcal{K}_c\ne\emptyset$ and Assumption 3.1 holds. If
i) $\mathrm{IQ}(c_{eq},c_{in})+\mathrm{IQ}(\varphi,\psi)$ is archimedean, or
ii) $\mathrm{IQ}(c_{eq},c_{in})$ is archimedean, or
iii) Assumption 3.2 holds,
then $f_k=f'_k=f_c$ for all $k$ sufficiently large. Therefore, if the minimum value $f_{min}$ of (1.1) is achieved at a critical point, then $f_k=f'_k=f_{min}$ for all $k$ big enough, if one of the conditions i)-iii) is satisfied.

Remark. In Theorem 3.3, the conclusion holds if any one of the conditions i)-iii) is satisfied. The condition ii) is only about the constraining polynomials of (1.1). It can be checked without $\varphi$, $\psi$. Clearly, the condition ii) implies the condition i).

The proof for Theorem 3.3 is given in the following. The main idea is to consider the set of critical points. It can be expressed as a union of subvarieties. The objective $f$ is a constant on each one of them. We can get an SOS type representation for $f$ on each subvariety, and then construct a single one for $f$ over the entire set of critical points.

Proof of Theorem 3.3. Clearly, every point in the complex variety
$$ K_1=\{x\in\mathbb{C}^n \mid c_{eq}(x)=0,\ \varphi(x)=0\} $$
is a critical point. By Lemma 3.3 of [6], the objective $f$ achieves finitely many real values on $\mathcal{K}_c=K_1\cap\mathbb{R}^n$, say they are $v_1<\cdots<v_N$. Up to the shifting of a constant in $f$, we can further assume that $f_c=0$. Clearly, $f_c$ equals one of $v_1,\ldots,v_N$, say $v_t=f_c=0$.

Case I: Assume $\mathrm{IQ}(c_{eq},c_{in})+\mathrm{IQ}(\varphi,\psi)$ is archimedean. Let
$$ I=\mathrm{Ideal}(c_{eq},\varphi), $$
the critical ideal. Note that $K_1=V_{\mathbb{C}}(I)$. The variety $V_{\mathbb{C}}(I)$ is a union of irreducible subvarieties, say $V_1,\ldots,V_\ell$. If $V_i\cap\mathbb{R}^n\ne\emptyset$, then $f$ is a real constant on $V_i$, which equals one of $v_1,\ldots,v_N$. This can be implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of $V_{\mathbb{C}}(I)$:
$$ T_i=K_1\cap\{f(x)=v_i\}\quad (i=t,\ldots,N). $$
Let $T_{t-1}$ be the union of the irreducible subvarieties $V_i$ such that either $V_i\cap\mathbb{R}^n=\emptyset$, or $f\equiv v_j$ on $V_i$ with $v_j<v_t=f_c$. Then it holds that
$$ V_{\mathbb{C}}(I)=T_{t-1}\cup T_t\cup\cdots\cup T_N. $$
By the primary decomposition of $I$ [11, 41], there exist ideals $I_{t-1},I_t,\ldots,I_N\subseteq\mathbb{R}[x]$ such that
$$ I=I_{t-1}\cap I_t\cap\cdots\cap I_N $$
and $T_i=V_{\mathbb{C}}(I_i)$ for all $i=t-1,t,\ldots,N$. Denote the semialgebraic set
$$
(3.14)\qquad S=\{x\in\mathbb{R}^n \mid c_{in}(x)\ge 0,\ \psi(x)\ge 0\}.
$$
For $i=t-1$, we have $V_{\mathbb{R}}(I_{t-1})\cap S=\emptyset$, because $v_1,\ldots,v_{t-1}<f_c$. By the Positivstellensatz [2, Corollary 4.4.3], there exists $p_0\in\mathrm{Preord}(c_{in},\psi)$^1 satisfying $2+p_0\in I_{t-1}$. Note that $1+p_0>0$ on $V_{\mathbb{R}}(I_{t-1})\cap S$. The set $I_{t-1}+\mathrm{Qmod}(c_{in},\psi)$ is archimedean, because $I\subseteq I_{t-1}$ and
$$ \mathrm{IQ}(c_{eq},c_{in})+\mathrm{IQ}(\varphi,\psi)\subseteq I_{t-1}+\mathrm{Qmod}(c_{in},\psi). $$
By Theorem 2.1, we have
$$ p_1:=1+p_0\in I_{t-1}+\mathrm{Qmod}(c_{in},\psi). $$
Then $1+p_1\in I_{t-1}$. There exists $p_2\in\mathrm{Qmod}(c_{in},\psi)$ such that
$$ -1\equiv p_1\equiv p_2 \mod I_{t-1}. $$
Since $f=(f/4+1)^2+(-1)\cdot(f/4-1)^2$, we have
$$ f\equiv \sigma_{t-1}:=(f/4+1)^2+p_2\,(f/4-1)^2 \mod I_{t-1}. $$
So, when $k$ is big enough, we have $\sigma_{t-1}\in\mathrm{Qmod}(c_{in},\psi)_{2k}$.

^1 It is the preordering of the polynomial tuple $(c_{in},\psi)$; see §7.1.


For $i=t$: $v_t=0$, and $f(x)$ vanishes on $V_{\mathbb{C}}(I_t)$. By Hilbert's Strong Nullstellensatz [4], there exists an integer $m_t>0$ such that $f^{m_t}\in I_t$. Define the polynomial
$$ s_t(\epsilon):=\sqrt{\epsilon}\sum_{j=0}^{m_t-1}\binom{1/2}{j}\epsilon^{-j}f^j. $$
Then we have that
$$ D_1:=\epsilon\big(1+\epsilon^{-1}f\big)-\big(s_t(\epsilon)\big)^2\equiv 0 \mod I_t. $$
This is because, in the subtraction of $D_1$, after expanding $\big(s_t(\epsilon)\big)^2$, all the terms $f^j$ with $j<m_t$ are cancelled, and $f^j\in I_t$ for $j\ge m_t$. So $D_1\in I_t$. Let $\sigma_t(\epsilon)=s_t(\epsilon)^2$; then $f+\epsilon-\sigma_t(\epsilon)=D_1$ and
$$
(3.15)\qquad f+\epsilon-\sigma_t(\epsilon)=\sum_{j=0}^{m_t-2}b_j(\epsilon)f^{m_t+j}
$$
for some real scalars $b_j(\epsilon)$ depending on $\epsilon$.

For each $i=t+1,\ldots,N$: $v_i>0$, and $f(x)/v_i-1$ vanishes on $V_{\mathbb{C}}(I_i)$. By Hilbert's Strong Nullstellensatz [4], there exists $0<m_i\in\mathbb{N}$ such that $(f/v_i-1)^{m_i}\in I_i$. Let
$$ s_i:=\sqrt{v_i}\sum_{j=0}^{m_i-1}\binom{1/2}{j}(f/v_i-1)^j. $$
Like for the case $i=t$, we can similarly show that $f-s_i^2\in I_i$. Let $\sigma_i=s_i^2$; then $f-\sigma_i\in I_i$.

Note that $V_{\mathbb{C}}(I_i)\cap V_{\mathbb{C}}(I_j)=\emptyset$ for all $i\ne j$. By Lemma 3.3 of [30], there exist polynomials $a_{t-1},\ldots,a_N\in\mathbb{R}[x]$ such that
$$ a_{t-1}^2+\cdots+a_N^2-1\in I,\qquad a_i\in\bigcap_{i\ne j\in\{t-1,\ldots,N\}}I_j. $$
For $\epsilon>0$, denote the polynomial
$$ \sigma_\epsilon:=\sigma_t(\epsilon)a_t^2+\sum_{t\ne j\in\{t-1,\ldots,N\}}(\sigma_j+\epsilon)a_j^2; $$
then
$$
f+\epsilon-\sigma_\epsilon=(f+\epsilon)\big(1-a_{t-1}^2-\cdots-a_N^2\big)
+\sum_{t\ne i\in\{t-1,\ldots,N\}}(f-\sigma_i)a_i^2+\big(f+\epsilon-\sigma_t(\epsilon)\big)a_t^2.
$$
For each $i\ne t$, $f-\sigma_i\in I_i$, so
$$ (f-\sigma_i)a_i^2\in\bigcap_{j=t-1}^N I_j=I. $$
Hence, there exists $k_1>0$ such that
$$ (f-\sigma_i)a_i^2\in I_{2k_1}\quad(t\ne i\in\{t-1,\ldots,N\}). $$
Since $f+\epsilon-\sigma_t(\epsilon)\in I_t$, we also have
$$ \big(f+\epsilon-\sigma_t(\epsilon)\big)a_t^2\in\bigcap_{j=t-1}^N I_j=I. $$
Moreover, by the equation (3.15),
$$ \big(f+\epsilon-\sigma_t(\epsilon)\big)a_t^2=\sum_{j=0}^{m_t-2}b_j(\epsilon)f^{m_t+j}a_t^2. $$


Each $f^{m_t+j}a_t^2\in I$, since $f^{m_t+j}\in I_t$. So there exists $k_2>0$ such that, for all $\epsilon>0$,
$$ \big(f+\epsilon-\sigma_t(\epsilon)\big)a_t^2\in I_{2k_2}. $$
Since $1-a_{t-1}^2-\cdots-a_N^2\in I$, there also exists $k_3>0$ such that, for all $\epsilon>0$,
$$ (f+\epsilon)\big(1-a_{t-1}^2-\cdots-a_N^2\big)\in I_{2k_3}. $$
Hence, if $k^*\ge\max\{k_1,k_2,k_3\}$, then we have
$$ f(x)+\epsilon-\sigma_\epsilon\in I_{2k^*} $$
for all $\epsilon>0$. By the construction, the degrees of all $\sigma_i$ and $a_i$ are independent of $\epsilon$. So $\sigma_\epsilon\in\mathrm{Qmod}(c_{in},\psi)_{2k^*}$ for all $\epsilon>0$, if $k^*$ is big enough. Note that
$$ I_{2k^*}+\mathrm{Qmod}(c_{in},\psi)_{2k^*}=\mathrm{IQ}(c_{eq},c_{in})_{2k^*}+\mathrm{IQ}(\varphi,\psi)_{2k^*}. $$
This implies that $f_{k^*}\ge f_c-\epsilon$ for all $\epsilon>0$. On the other hand, we always have $f_{k^*}\le f_c$. So $f_{k^*}=f_c$. Moreover, since $f_k$ is monotonically increasing, we must have $f_k=f_c$ for all $k\ge k^*$.

Case II: Assume $\mathrm{IQ}(c_{eq},c_{in})$ is archimedean. Because
$$ \mathrm{IQ}(c_{eq},c_{in})\subseteq \mathrm{IQ}(c_{eq},c_{in})+\mathrm{IQ}(\varphi,\psi), $$
the set $\mathrm{IQ}(c_{eq},c_{in})+\mathrm{IQ}(\varphi,\psi)$ is also archimedean. Therefore, the conclusion is also true, by applying the result for Case I.

Case III: Suppose Assumption 3.2 holds. Let $\varphi_1,\ldots,\varphi_N$ be real univariate polynomials such that $\varphi_i(v_j)=0$ for $i\ne j$ and $\varphi_i(v_j)=1$ for $i=j$. Let
$$ s:=s_t+\cdots+s_N,\quad\text{where each}\quad s_i=(v_i-f_c)\big(\varphi_i(f)\big)^2. $$
Then $s\in\Sigma[x]_{2k_4}$ for some integer $k_4>0$. Let
$$ \hat{f}:=f-f_c-s. $$
We show that there exist an integer $\ell>0$ and $q\in\mathrm{Qmod}(c_{in},\psi)$ such that
$$ \hat{f}^{2\ell}+q\in\mathrm{Ideal}(c_{eq},\varphi). $$
This is because, by Assumption 3.2, $\hat{f}(x)\equiv 0$ on the set
$$ K_2=\{x\in\mathbb{R}^n \mid c_{eq}(x)=0,\ \varphi(x)=0,\ \rho(x)\ge 0\}. $$
It has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist $0<\ell\in\mathbb{N}$ and $q=b_0+\rho b_1$ ($b_0,b_1\in\Sigma[x]$) such that $\hat{f}^{2\ell}+q\in\mathrm{Ideal}(c_{eq},\varphi)$. By Assumption 3.2, $\rho\in\mathrm{Qmod}(c_{in},\psi)$, so we have $q\in\mathrm{Qmod}(c_{in},\psi)$.

For all $\epsilon>0$ and $\tau>0$, we have $\hat{f}+\epsilon=\phi_\epsilon+\theta_\epsilon$, where
$$
\phi_\epsilon=-\tau\epsilon^{1-2\ell}\big(\hat{f}^{2\ell}+q\big),\qquad
\theta_\epsilon=\epsilon\Big(1+\hat{f}/\epsilon+\tau\big(\hat{f}/\epsilon\big)^{2\ell}\Big)+\tau\epsilon^{1-2\ell}q.
$$
By Lemma 2.1 of [31], when $\tau\ge\frac{1}{2\ell}$, there exists $k_5$ such that, for all $\epsilon>0$,
$$ \phi_\epsilon\in\mathrm{Ideal}(c_{eq},\varphi)_{2k_5},\qquad \theta_\epsilon\in\mathrm{Qmod}(c_{in},\psi)_{2k_5}. $$
Hence, we can get
$$ f-(f_c-\epsilon)=\phi_\epsilon+\sigma_\epsilon, $$
where $\sigma_\epsilon=\theta_\epsilon+s\in\mathrm{Qmod}(c_{in},\psi)_{2k_5}$ for all $\epsilon>0$. Note that
$$ \mathrm{IQ}(c_{eq},c_{in})_{2k_5}+\mathrm{IQ}(\varphi,\psi)_{2k_5}=\mathrm{Ideal}(c_{eq},\varphi)_{2k_5}+\mathrm{Qmod}(c_{in},\psi)_{2k_5}. $$
For all $\epsilon>0$, $\gamma=f_c-\epsilon$ is feasible in (3.9) for the order $k_5$, so $f_{k_5}\ge f_c$. Because of (3.11) and the monotonicity of $f_k$, we have $f_k=f'_k=f_c$ for all $k\ge k_5$. $\square$

3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is f_c, and the optimal value of (1.1) is f_min. If f_min is achievable at a critical point, then f_c = f_min. In Theorem 3.3 we have shown that f_k = f_c for all k big enough, where f_k is the optimal value of (3.9). The value f_c or f_min is often not known. How do we detect the tightness f_k = f_c in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose y* is a minimizer of (3.8) for the order k. Let

(3.16)  d = ⌈deg(c_eq, c_in, φ, ψ)/2⌉.

If there exists an integer t ∈ [d, k] such that

(3.17)  rank M_t(y*) = rank M_{t−d}(y*),

then f_k = f_c and we can get r := rank M_t(y*) minimizers for (3.7) [5, 14, 32]. The method in [14] can be used to extract minimizers. It was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0) can also be detected by solving the relaxations (3.8)-(3.9).

Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order k, then no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0, i.e., (3.7) is infeasible.

ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order k is big enough.

In the following, assume (3.7) is feasible (i.e., f_c < +∞). Then, for all k big enough, (3.8) has a minimizer y*. Moreover:

iii) If (3.17) is satisfied for some t ∈ [d, k], then f_k = f_c.

iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer y* of (3.8) must satisfy (3.17) for some t ∈ [d, k], when k is big enough.

Proof. By Assumption 3.1, u is a critical point if and only if c_eq(u) = 0, φ(u) = 0.

i) For every feasible point u of (3.7), the tms [u]_{2k} (see §2 for the notation) is feasible for (3.8), for all k. Therefore, if (3.8) is infeasible for some k, then (3.7) must be infeasible.

ii) By Assumption 3.2, when (3.7) is infeasible, the set

{x ∈ R^n : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0}

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that −1 ∈ Ideal(c_eq, φ) + Qmod(ρ). By Assumption 3.2,

Ideal(c_eq, φ) + Qmod(ρ) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ).

Thus, for all k big enough, (3.9) is unbounded from above. Hence, (3.8) must be infeasible, by weak duality.

When (3.7) is feasible, f achieves finitely many values on K_c, so (3.7) must achieve its optimal value f_c. By Theorem 3.3, we know that f_k = f′_k = f_c for all k big enough. For each minimizer u* of (3.7), the tms [u*]_{2k} is a minimizer of (3.8).


iii) If (3.17) holds, we can get r := rank M_t(y*) minimizers for (3.7) [5, 14], say, u_1, ..., u_r, such that f_k = f(u_i) for each i. Clearly, f_k = f(u_i) ≥ f_c. On the other hand, we always have f_k ≤ f_c. So f_k = f_c.

iv) By Assumption 3.2, (3.7) is equivalent to the problem

(3.18)  min f(x)  s.t. c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0.

The optimal value of (3.18) is also f_c. Its kth order Lasserre's relaxation is

(3.19)  γ′_k = min ⟨f, y⟩
        s.t. ⟨1, y⟩ = 1, M_k(y) ⪰ 0,
             L^{(k)}_{c_eq}(y) = 0, L^{(k)}_φ(y) = 0, L^{(k)}_ρ(y) ⪰ 0.

Its dual optimization problem is

(3.20)  γ_k = max γ  s.t. f − γ ∈ Ideal(c_eq, φ)_{2k} + Qmod(ρ)_{2k}.

By repeating the same proof as for Theorem 3.3(iii), we can show that

γ_k = γ′_k = f_c

for all k big enough. Because ρ ∈ Qmod(c_in, ψ), each y feasible for (3.8) is also feasible for (3.19). So, when k is big, each y* is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some t ∈ [d, k], when k is big enough. □

If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].

4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron

P = {x ∈ R^n | Ax − b ≥ 0},

where A = [a_1 ··· a_m]^T ∈ R^{m×n}, b = [b_1 ··· b_m]^T ∈ R^m. This corresponds to that E = ∅, I = [m], and each c_i(x) = a_i^T x − b_i. Denote

(4.1)  D(x) = diag(c_1(x), ..., c_m(x)),  C(x) = [ A^T ; D(x) ]

(the blocks are stacked vertically). The Lagrange multiplier vector λ = [λ_1 ··· λ_m]^T satisfies

(4.2)  [ A^T ; D(x) ] λ = [ ∇f(x) ; 0 ].

If rank A = m, we can express λ as

(4.3)  λ = (AA^T)^{−1} A ∇f(x).

If rank A < m, how can we express λ in terms of x? In computation, we often prefer a polynomial expression. If there exists L(x) ∈ R[x]^{m×(n+m)} such that

(4.4)  L(x) C(x) = I_m,

then we can get

λ = L(x) [ ∇f(x) ; 0 ] = L_1(x) ∇f(x),


where L_1(x) consists of the first n columns of L(x). In this section, we characterize when such L(x) exists, and give a degree bound for it.

The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ C^n (also see Definition 5.1). This is equivalent to that, for every u, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then a_{i_1}, ..., a_{i_k} are linearly independent.

Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.

Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.

Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ R[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.

For a subset I = {i_1, ..., i_{m−n}} of [m], denote

c_I(x) = ∏_{i∈I} c_i(x),  E_I(x) = c_I(x) · diag( c_{i_1}(x)^{−1}, ..., c_{i_{m−n}}(x)^{−1} ),

D_I(x) = diag( c_{i_1}(x), ..., c_{i_{m−n}}(x) ),  A_I = [ a_{i_1} ··· a_{i_{m−n}} ]^T.

For the case that I = ∅ (the empty set), we set c_∅(x) = 1. Let

V = { I ⊆ [m] : |I| = m − n, rank A_{[m]\I} = n }.

Step I: For each I ∈ V, we construct a matrix polynomial L_I(x) such that

(4.5)  L_I(x) C(x) = c_I(x) I_m.

The matrix L_I = L_I(x) satisfying (4.5) can be given by the following 2×3 block matrix (L_I(J, K) denotes the submatrix whose row indices are from J and whose column indices are from K):

(4.6)
  J \ K   |  [n]                       |  n + I                             |  n + [m]\I
  I       |  0                         |  E_I(x)                            |  0
  [m]\I   |  c_I(x)·(A_{[m]\I})^{−T}   |  −(A_{[m]\I})^{−T} (A_I)^T E_I(x)  |  0

Equivalently, the blocks of L_I are

L_I( I, [n] ) = 0,  L_I( I, n + [m]\I ) = 0,  L_I( [m]\I, n + [m]\I ) = 0,

L_I( I, n + I ) = E_I(x),  L_I( [m]\I, [n] ) = c_I(x)·(A_{[m]\I})^{−T},

L_I( [m]\I, n + I ) = −(A_{[m]\I})^{−T} (A_I)^T E_I(x).

For each I ∈ V, A_{[m]\I} is invertible. The superscript −T denotes the inverse of the transpose. Let G = L_I(x)C(x); then one can verify that

G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m−n},  G(I, [m]\I) = 0,

G([m]\I, [m]\I) = [ c_I(x)(A_{[m]\I})^{−T}  −(A_{[m]\I})^{−T}(A_I)^T E_I(x) ] · [ (A_{[m]\I})^T ; 0 ] = c_I(x) I_n,

G([m]\I, I) = [ c_I(x)(A_{[m]\I})^{−T}  −(A_{[m]\I})^{−T}(A_I)^T E_I(x) ] · [ (A_I)^T ; D_I(x) ] = 0.

This shows that the above L_I(x) satisfies (4.5).


Step II: We show that there exist real scalars ν_I satisfying

(4.7)  Σ_{I∈V} ν_I c_I(x) = 1.

This can be shown by induction on m.

• When m = n, V = {∅} and c_∅(x) = 1, so (4.7) is clearly true.

• When m > n, let

(4.8)  N = { i ∈ [m] | rank A_{[m]\{i}} = n }.

For each i ∈ N, let V_i be the set of all I′ ⊆ [m]\{i} such that |I′| = m − n − 1 and rank A_{[m]\(I′∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]\{i}} x − b_{[m]\{i}} is nonsingular. By induction, there exist real scalars ν^{(i)}_{I′} satisfying

(4.9)  Σ_{I′∈V_i} ν^{(i)}_{I′} c_{I′}(x) = 1.

Since rank A = n, we can generally assume that {a_1, ..., a_n} is linearly independent. So there exist scalars α_1, ..., α_n such that

a_m = α_1 a_1 + ··· + α_n a_n.

If all α_i = 0, then a_m = 0, and hence A can be replaced by its first m − 1 rows. So (4.7) is true by the induction. In the following, suppose at least one α_i ≠ 0, and write

{ i : α_i ≠ 0 } = { i_1, ..., i_k }.

Then a_{i_1}, ..., a_{i_k}, a_m are linearly dependent. For convenience, set i_{k+1} = m. Since Ax − b is nonsingular, the linear system

c_{i_1}(x) = ··· = c_{i_k}(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence, there exist real scalars μ_1, ..., μ_{k+1} such that

μ_1 c_{i_1}(x) + ··· + μ_k c_{i_k}(x) + μ_{k+1} c_{i_{k+1}}(x) = 1.

(This can be seen from the echelon form of an inconsistent linear system.) Note that i_1, ..., i_{k+1} ∈ N. For each j = 1, ..., k + 1, by (4.9),

Σ_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{I′}(x) = 1.

Then we can get

1 = Σ_{j=1}^{k+1} μ_j c_{i_j}(x) = Σ_{j=1}^{k+1} μ_j Σ_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{i_j}(x) c_{I′}(x) = Σ_{I=I′∪{i_j}: I′∈V_{i_j}, 1≤j≤k+1} ν^{(i_j)}_{I′} μ_j c_I(x).

Since each I′ ∪ {i_j} ∈ V, (4.7) must be satisfied by some scalars ν_I.


Step III: For L_I(x) as in (4.5), we construct L(x) as

(4.10)  L(x) = Σ_{I∈V} ν_I L_I(x).

Clearly, L(x) satisfies (4.4), because

L(x)C(x) = Σ_{I∈V} ν_I L_I(x)C(x) = Σ_{I∈V} ν_I c_I(x) I_m = I_m.

Each L_I(x) has degree ≤ m − n, so L(x) has degree ≤ m − n. □

Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does, a degree bound for L(x) is m − rank A. Sometimes, its degree can be smaller than that, as shown in Example 4.3. For given A, b, the matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x)C(x) = I_m for polyhedral sets.

Example 4.2. Consider the simplicial set

{ x_1 ≥ 0, ..., x_n ≥ 0, 1 − e^T x ≥ 0 }.

The equation L(x)C(x) = I_{n+1} is satisfied by

L(x) =
  [ 1−x_1    −x_2   ···    −x_n   1 ··· 1 ]
  [  −x_1   1−x_2   ···    −x_n   1 ··· 1 ]
  [   ⋮        ⋮     ⋱       ⋮          ⋮ ]
  [  −x_1    −x_2   ···   1−x_n   1 ··· 1 ]
  [  −x_1    −x_2   ···    −x_n   1 ··· 1 ].
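As a quick sanity check (not part of the original text), the identity L(x)C(x) = I_{n+1} can be verified in exact rational arithmetic. The Python sketch below builds C(x) = [Aᵀ; D(x)] for the simplicial constraints and the displayed L(x); the helper names (matmul, simplicial_LC) are ours.

```python
from fractions import Fraction

def matmul(A, B):
    # exact matrix product for lists of lists
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def simplicial_LC(x):
    # constraints: x_1 >= 0, ..., x_n >= 0, 1 - e^T x >= 0  (m = n + 1)
    n, m = len(x), len(x) + 1
    # A^T: columns are the gradients a_i (a_i = e_i, a_{n+1} = -e)
    AT = [[1 if j == i else 0 for j in range(n)] + [-1] for i in range(n)]
    c = list(x) + [1 - sum(x)]                      # constraint values
    D = [[c[i] if i == j else 0 for j in range(m)] for i in range(m)]
    # L(x) as displayed in Example 4.2
    L = [[(1 if i == j else 0) - x[j] for j in range(n)] + [1] * m
         for i in range(n)]
    L.append([-x[j] for j in range(n)] + [1] * m)
    return matmul(L, AT + D)                        # C(x) = [A^T; D(x)]

# L(x)C(x) is the identity I_{n+1}, as a polynomial identity in x
x = [Fraction(1, 3), Fraction(1, 7), Fraction(2, 5)]
G = simplicial_LC(x)
assert G == [[1 if i == j else 0 for j in range(4)] for i in range(4)]
```

Since L(x)C(x) = I_{n+1} holds as a polynomial identity, it holds at any test point, not only at feasible ones.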

Example 4.3. Consider the box constraint

{ x_1 ≥ 0, ..., x_n ≥ 0, 1 − x_1 ≥ 0, ..., 1 − x_n ≥ 0 }.

The equation L(x)C(x) = I_{2n} is satisfied by

L(x) = [ I_n − diag(x)   I_n   I_n ]
       [    −diag(x)     I_n   I_n ].
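The same kind of exact check (our own illustration, with the hypothetical helper name box_LC) applies to the box constraints:

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def box_LC(x):
    # constraints: x_i >= 0 and 1 - x_i >= 0, so m = 2n
    n = len(x)
    # A^T has columns e_1, ..., e_n, -e_1, ..., -e_n
    AT = [[1 if j == i else 0 for j in range(n)] +
          [-1 if j == i else 0 for j in range(n)] for i in range(n)]
    c = list(x) + [1 - t for t in x]
    D = [[c[i] if i == j else 0 for j in range(2 * n)] for i in range(2 * n)]
    # L(x) = [I_n - diag(x), I_n, I_n; -diag(x), I_n, I_n]
    I = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    top = [[(1 - x[i] if i == j else 0) for j in range(n)] + I[i] + I[i]
           for i in range(n)]
    bot = [[(-x[i] if i == j else 0) for j in range(n)] + I[i] + I[i]
           for i in range(n)]
    return matmul(top + bot, AT + D)

x = [Fraction(2, 7), Fraction(3, 5), Fraction(1, 2)]
G = box_LC(x)
assert G == [[1 if i == j else 0 for j in range(6)] for i in range(6)]
```

Note that this L(x) has degree 1, smaller than the bound m − rank A = 2n − n = n from Proposition 4.1 (for n > 1).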

Example 4.4. Consider the polyhedral set

{ 1 − x_4 ≥ 0, x_4 − x_3 ≥ 0, x_3 − x_2 ≥ 0, x_2 − x_1 ≥ 0, x_1 + 1 ≥ 0 }.

The equation L(x)C(x) = I_5 is satisfied by

L(x) = (1/2) ·
  [ −x_1−1   −x_2−1   −x_3−1   −x_4−1   1 1 1 1 1 ]
  [ −x_1−1   −x_2−1   −x_3−1    1−x_4   1 1 1 1 1 ]
  [ −x_1−1   −x_2−1    1−x_3    1−x_4   1 1 1 1 1 ]
  [ −x_1−1    1−x_2    1−x_3    1−x_4   1 1 1 1 1 ]
  [  1−x_1    1−x_2    1−x_3    1−x_4   1 1 1 1 1 ].
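For this chain of constraints, the identity can again be confirmed in exact arithmetic (an illustration of ours, with the hypothetical name chain_LC):

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def chain_LC(x1, x2, x3, x4):
    # constraints: 1-x4, x4-x3, x3-x2, x2-x1, x1+1  (m = 5, n = 4)
    A = [[0, 0, 0, -1], [0, 0, -1, 1], [0, -1, 1, 0], [-1, 1, 0, 0], [1, 0, 0, 0]]
    AT = [[A[i][j] for i in range(5)] for j in range(4)]          # 4 x 5
    c = [1 - x4, x4 - x3, x3 - x2, x2 - x1, x1 + 1]
    D = [[c[i] if i == j else 0 for j in range(5)] for i in range(5)]
    half = Fraction(1, 2)
    # L(x) from Example 4.4, row by row (each entry scaled by 1/2)
    rows = [[-x1 - 1, -x2 - 1, -x3 - 1, -x4 - 1],
            [-x1 - 1, -x2 - 1, -x3 - 1, 1 - x4],
            [-x1 - 1, -x2 - 1, 1 - x3, 1 - x4],
            [-x1 - 1, 1 - x2, 1 - x3, 1 - x4],
            [1 - x1, 1 - x2, 1 - x3, 1 - x4]]
    L = [[half * v for v in r] + [half] * 5 for r in rows]
    return matmul(L, AT + D)

G = chain_LC(Fraction(1, 3), Fraction(2, 3), Fraction(3, 4), Fraction(7, 8))
assert G == [[1 if i == j else 0 for j in range(5)] for i in range(5)]
```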

Example 4.5. Consider the polyhedral set

{ 1 + x_1 ≥ 0, 1 − x_1 ≥ 0, 2 − x_1 − x_2 ≥ 0, 2 − x_1 + x_2 ≥ 0 }.

The matrix L(x) satisfying L(x)C(x) = I_4 is

L(x) = (1/6) ·
  [ x_1²−3x_1+2    x_1x_2−x_2      4−x_1    2−x_1     1−x_1     1−x_1   ]
  [ 3x_1²−3x_1−6   3x_2+3x_1x_2    6−3x_1   −3x_1    −3x_1−3   −3x_1−3  ]
  [ 1−x_1²        −2x_2−x_1x_2−3   x_1−1    x_1+1     x_1+2     x_1+2   ]
  [ 1−x_1²         3−x_1x_2−2x_2   x_1−1    x_1+1     x_1+2     x_1+2   ].
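Here deg(L) = 2 = m − rank A, matching the bound of Proposition 4.1. The identity can be checked exactly (our illustration, with the hypothetical name strip_LC):

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def strip_LC(x1, x2):
    # constraints: 1+x1, 1-x1, 2-x1-x2, 2-x1+x2  (m = 4, n = 2)
    A = [[1, 0], [-1, 0], [-1, -1], [-1, 1]]
    AT = [[A[i][j] for i in range(4)] for j in range(2)]          # 2 x 4
    c = [1 + x1, 1 - x1, 2 - x1 - x2, 2 - x1 + x2]
    D = [[c[i] if i == j else 0 for j in range(4)] for i in range(4)]
    s = Fraction(1, 6)
    M = [[x1**2 - 3*x1 + 2, x1*x2 - x2, 4 - x1, 2 - x1, 1 - x1, 1 - x1],
         [3*x1**2 - 3*x1 - 6, 3*x2 + 3*x1*x2, 6 - 3*x1, -3*x1, -3*x1 - 3, -3*x1 - 3],
         [1 - x1**2, -2*x2 - x1*x2 - 3, x1 - 1, x1 + 1, x1 + 2, x1 + 2],
         [1 - x1**2, 3 - x1*x2 - 2*x2, x1 - 1, x1 + 1, x1 + 2, x1 + 2]]
    L = [[s * v for v in row] for row in M]
    return matmul(L, AT + D)

G = strip_LC(Fraction(1, 5), Fraction(-3, 7))
assert G == [[1 if i == j else 0 for j in range(4)] for i in range(4)]
```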


5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express Lagrange multipliers λ_i as polynomial functions in x on the set of critical points.

Suppose there are totally m equality and inequality constraints, i.e.,

E ∪ I = {1, ..., m}.

If (x, λ) is a critical pair, then λ_i c_i(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ_1 ··· λ_m]^T satisfies the equation

(5.1)  C(x) λ = [ ∇f(x) ; 0 ; ⋮ ; 0 ],  where  C(x) =
  [ ∇c_1(x)  ∇c_2(x)  ···  ∇c_m(x) ]
  [  c_1(x)     0     ···     0    ]
  [    0      c_2(x)  ···     0    ]
  [    ⋮        ⋮      ⋱      ⋮    ]
  [    0        0     ···   c_m(x) ].

Let C(x) be as in the above. If there exists L(x) ∈ R[x]^{m×(m+n)} such that

(5.2)  L(x) C(x) = I_m,

then we can get

(5.3)  λ = L(x) [ ∇f(x) ; 0 ] = L_1(x) ∇f(x),

Definition 5.1. The tuple c = (c_1, ..., c_m) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ C^n.

Clearly, c being nonsingular is equivalent to that, for each u ∈ C^n, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then the gradients ∇c_{i_1}(u), ..., ∇c_{i_k}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.

Proposition 5.2. (i) For each W(x) ∈ C[x]^{s×t} with s ≥ t, rank W(u) = t for all u ∈ C^n if and only if there exists P(x) ∈ C[x]^{t×s} such that

P(x) W(x) = I_t.

Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} for the above.

(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (5.2).

Proof. (i) "⇐": If P(x)W(x) = I_t, then, for all u ∈ C^n,

t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.

"⇒": Suppose rank W(u) = t for all u ∈ C^n. Write W(x) in columns:

W(x) = [ w_1(x)  w_2(x)  ···  w_t(x) ].


Then the equation w_1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ_1(x) ∈ C[x]^s such that ξ_1(x)^T w_1(x) = 1. For each i = 2, ..., t, denote

r_{1i}(x) = ξ_1(x)^T w_i(x);

then (use ∼ to denote row equivalence between matrices)

W(x) ∼ [   1      r_{12}(x)  ···  r_{1t}(x) ]  ∼  W_1(x) = [ 1    r_{12}(x)     ···    r_{1t}(x)    ]
       [ w_1(x)    w_2(x)    ···   w_t(x)   ]              [ 0   w_2^{(1)}(x)   ···   w_t^{(1)}(x)  ],

where each (i = 2, ..., t)

w_i^{(1)}(x) = w_i(x) − r_{1i}(x) w_1(x).

So there exists P_1(x) ∈ C[x]^{(s+1)×s} such that

P_1(x) W(x) = W_1(x).

Since W(x) and W_1(x) are row equivalent, W_1(x) must also have full column rank everywhere. Similarly, the polynomial equation

w_2^{(1)}(x) = 0

does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists ξ_2(x) ∈ C[x]^s such that

ξ_2(x)^T w_2^{(1)}(x) = 1.

For each i = 3, ..., t, let r_{2i}(x) = ξ_2(x)^T w_i^{(1)}(x); then

W_1(x) ∼ [ 1     r_{12}(x)      r_{13}(x)    ···   r_{1t}(x)   ]  ∼  W_2(x) = [ 1   r_{12}(x)     r_{13}(x)    ···   r_{1t}(x)   ]
         [ 0        1           r_{23}(x)    ···   r_{2t}(x)   ]              [ 0      1          r_{23}(x)    ···   r_{2t}(x)   ]
         [ 0   w_2^{(1)}(x)   w_3^{(1)}(x)   ···  w_t^{(1)}(x) ]              [ 0      0        w_3^{(2)}(x)   ···  w_t^{(2)}(x) ],

where each (i = 3, ..., t)

w_i^{(2)}(x) = w_i^{(1)}(x) − r_{2i}(x) w_2^{(1)}(x).

Similarly, W_1(x) and W_2(x) are row equivalent, so W_2(x) has full column rank everywhere. There exists P_2(x) ∈ C[x]^{(s+2)×(s+1)} such that

P_2(x) W_1(x) = W_2(x).

Continuing this process, we can finally get

W_2(x) ∼ ··· ∼ W_t(x) = [ 1   r_{12}(x)   r_{13}(x)   ···   r_{1t}(x) ]
                        [ 0      1        r_{23}(x)   ···   r_{2t}(x) ]
                        [ 0      0           1        ···   r_{3t}(x) ]
                        [ ⋮                            ⋱        ⋮     ]
                        [ 0      0           0        ···       1    ]
                        [ 0      0           0        ···       0    ].

Consequently, there exists P_i(x) ∈ C[x]^{(s+i)×(s+i−1)}, for i = 1, 2, ..., t, such that

P_t(x) P_{t−1}(x) ··· P_1(x) W(x) = W_t(x).


Since W_t(x) is a unit upper triangular matrix polynomial (with zero rows below), there exists P_{t+1}(x) ∈ R[x]^{t×(s+t)} such that P_{t+1}(x) W_t(x) = I_t. Let

P(x) = P_{t+1}(x) P_t(x) P_{t−1}(x) ··· P_1(x);

then P(x)W(x) = I_t. Note that P(x) ∈ C[x]^{t×s}.

For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (the P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.

(ii) The conclusion is implied directly by the item (i). □

In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), it can be determined by comparing coefficients of both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).

Example 5.3. Consider the hypercube with quadratic constraints

{ 1 − x_1² ≥ 0, 1 − x_2² ≥ 0, ..., 1 − x_n² ≥ 0 }.

The equation L(x)C(x) = I_n is satisfied by

L(x) = [ −(1/2)diag(x)   I_n ].
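Here the check is immediate: row i of L(x)C(x) equals (−x_i/2)(−2x_i) + (1 − x_i²) = 1 on the diagonal and 0 off it. A minimal exact-arithmetic sketch (ours, with the hypothetical name hypercube_LC):

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def hypercube_LC(x):
    # constraints: 1 - x_i^2 >= 0, with gradients -2 x_i e_i  (m = n)
    n = len(x)
    grad = [[-2 * x[i] if i == j else 0 for j in range(n)] for i in range(n)]
    D = [[1 - x[i]**2 if i == j else 0 for j in range(n)] for i in range(n)]
    # L(x) = [-(1/2) diag(x), I_n]
    L = [[-Fraction(1, 2) * x[i] if i == j else 0 for j in range(n)] +
         [1 if i == j else 0 for j in range(n)] for i in range(n)]
    return matmul(L, grad + D)

x = [Fraction(3, 4), Fraction(-2, 9)]
assert hypercube_LC(x) == [[1, 0], [0, 1]]
```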

Example 5.4. Consider the nonnegative portion of the unit sphere

{ x_1 ≥ 0, x_2 ≥ 0, ..., x_n ≥ 0, x_1² + ··· + x_n² − 1 = 0 }.

The equation L(x)C(x) = I_{n+1} is satisfied by

L(x) = [ I_n − xx^T     x·1_n^T      2x ]
       [ (1/2)x^T    −(1/2)1_n^T    −1  ].
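This identity also holds for all x (not only on the sphere), which can be confirmed exactly (our illustration, with the hypothetical name sphere_LC):

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def sphere_LC(x):
    # constraints: x_i >= 0 (i = 1..n) and x_1^2 + ... + x_n^2 - 1 = 0
    n, m = len(x), len(x) + 1
    s = sum(t * t for t in x)
    # top block of C(x): columns are the gradients e_1, ..., e_n and 2x
    top = [[1 if j == i else 0 for j in range(n)] + [2 * x[i]] for i in range(n)]
    c = list(x) + [s - 1]
    D = [[c[i] if i == j else 0 for j in range(m)] for i in range(m)]
    half = Fraction(1, 2)
    # L(x) = [I_n - x x^T, x 1_n^T, 2x; (1/2)x^T, -(1/2)1_n^T, -1]
    L = [[(1 if i == j else 0) - x[i] * x[j] for j in range(n)] +
         [x[i]] * n + [2 * x[i]] for i in range(n)]
    L.append([half * t for t in x] + [-half] * n + [-1])
    return matmul(L, top + D)

x = [Fraction(1, 2), Fraction(1, 3), Fraction(5, 7)]
assert sphere_LC(x) == [[1 if i == j else 0 for j in range(4)] for i in range(4)]
```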

Example 5.5. Consider the set

{ 1 − x_1³ − x_2⁴ ≥ 0, 1 − x_3⁴ − x_4³ ≥ 0 }.

The equation L(x)C(x) = I_2 is satisfied by

L(x) = [ −x_1/3   −x_2/4     0        0      1   0 ]
       [    0        0     −x_3/4   −x_4/3   0   1 ].
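The verification is a one-line Euler-type cancellation, e.g. (−x_1/3)(−3x_1²) + (−x_2/4)(−4x_2³) + c_1(x) = 1. In exact arithmetic (our illustration, with the hypothetical name quartic_LC):

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def quartic_LC(x1, x2, x3, x4):
    # constraints: c1 = 1 - x1^3 - x2^4, c2 = 1 - x3^4 - x4^3
    c1, c2 = 1 - x1**3 - x2**4, 1 - x3**4 - x4**3
    grads = [[-3 * x1**2, 0], [-4 * x2**3, 0], [0, -4 * x3**3], [0, -3 * x4**2]]
    C = grads + [[c1, 0], [0, c2]]                  # C(x) is 6 x 2
    third, quarter = Fraction(1, 3), Fraction(1, 4)
    L = [[-third * x1, -quarter * x2, 0, 0, 1, 0],
         [0, 0, -quarter * x3, -third * x4, 0, 1]]
    return matmul(L, C)

G = quartic_LC(Fraction(1, 2), Fraction(2, 3), Fraction(-1, 5), Fraction(3, 7))
assert G == [[1, 0], [0, 1]]
```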

Example 5.6. Consider the quadratic set

{ 1 − x_1x_2 − x_2x_3 − x_1x_3 ≥ 0, 1 − x_1² − x_2² − x_3² ≥ 0 }.

The matrix L(x)^T satisfying L(x)C(x) = I_2 is

[ 25x_1³ + 10x_1²x_2 + 40x_1x_2² − 25x_1 − 2x_3      −25x_1³ − 10x_1²x_2 − 40x_1x_2² + 49x_1/2 + 2x_3 ]
[ −15x_1²x_2 + 10x_1x_2² + 20x_1x_2x_3 − 10x_1        15x_1²x_2 − 10x_1x_2² − 20x_1x_2x_3 + 10x_1 − x_2/2 ]
[ 25x_1²x_3 − 20x_1x_2² + 10x_1x_2x_3 + 2x_1         −25x_1²x_3 + 20x_1x_2² − 10x_1x_2x_3 − 2x_1 − x_3/2 ]
[ 1 − 20x_1x_3 − 10x_1² − 20x_1x_2                    20x_1x_2 + 20x_1x_3 + 10x_1² ]
[ −50x_1² − 20x_1x_2                                  50x_1² + 20x_1x_2 + 1 ].

We would like to remark that a polynomial tuple c = (c_1, ..., c_m) is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d_1, ..., d_m, there exists an open dense subset U of D = R[x]_{d_1} × ··· × R[x]_{d_m}, such that every tuple c = (c_1, ..., c_m) ∈ U is nonsingular. Indeed, such U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.


Proof. The proof needs to use resultants and discriminants, for which we refer to [29].

First, let J_1 be the set of all (i_1, ..., i_{n+1}) with 1 ≤ i_1 < ··· < i_{n+1} ≤ m. The resultant Res(c_{i_1}, ..., c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i_1}, ..., c_{i_{n+1}} such that, if Res(c_{i_1}, ..., c_{i_{n+1}}) ≠ 0, then the equations

c_{i_1}(x) = ··· = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

F_1(c) = ∏_{(i_1,...,i_{n+1})∈J_1} Res(c_{i_1}, ..., c_{i_{n+1}}).

For the case that m ≤ n, J_1 = ∅ and we simply let F_1(c) = 1. Clearly, if F_1(c) ≠ 0, then no more than n polynomials of c_1, ..., c_m can have a common complex zero.

Second, let J_2 be the set of all (j_1, ..., j_k) with k ≤ n and 1 ≤ j_1 < ··· < j_k ≤ m. When one of c_{j_1}, ..., c_{j_k} has degree bigger than one, the discriminant ∆(c_{j_1}, ..., c_{j_k}) is a polynomial in the coefficients of c_{j_1}, ..., c_{j_k} such that, if ∆(c_{j_1}, ..., c_{j_k}) ≠ 0, then the equations

c_{j_1}(x) = ··· = c_{j_k}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every complex common solution u, the gradients of c_{j_1}, ..., c_{j_k} at u are linearly independent. When all of c_{j_1}, ..., c_{j_k} have degree one, the discriminant of the tuple (c_{j_1}, ..., c_{j_k}) is not a single polynomial, but we can define ∆(c_{j_1}, ..., c_{j_k}) to be the product of all maximum minors of its Jacobian (a constant matrix). Define

F_2(c) = ∏_{(j_1,...,j_k)∈J_2} ∆(c_{j_1}, ..., c_{j_k}).

Clearly, if F_2(c) ≠ 0, then no n or fewer polynomials of c_1, ..., c_m can have a singular complex common zero.

Last, let F(c) = F_1(c)F_2(c) and

U = { c = (c_1, ..., c_m) ∈ D : F(c) ≠ 0 }.

Note that U is a Zariski open subset of D, and it is open dense in D. For all c ∈ U, no more than n of c_1, ..., c_m can have a complex common zero. For any k polynomials (k ≤ n) of c_1, ..., c_m, if they have a complex common zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U. So it holds generically. □

6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for computational results.


The polynomials p_i in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c_1, ..., c_m. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L_1(x) be the submatrix of L(x) consisting of the first n columns. Then choose (p_1, ..., p_m) to be the product L_1(x)∇f(x), i.e.,

p_i = ( L_1(x)∇f(x) )_i.

In all our examples, the global minimum value f_min of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sub-level set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.

By Theorem 3.3, we have f_k = f_min for all k big enough, if f_c = f_min and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with standard Lasserre relaxations in [17]. The lower bounds given by relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables. The computational time (in seconds) is also compared. The results for standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".

Example 6.1. Consider the optimization problem

min  x_1x_2(10 − x_3)
s.t. x_1 ≥ 0, x_2 ≥ 0, x_3 ≥ 0, 1 − x_1 − x_2 − x_3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum f_min = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x_1, x_2, x_3) with x_1x_2 = 0 is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

             w/o LME                  with LME
order k   lower bound    time     lower bound    time
   2       −0.0521      0.6841     −0.0521      0.1922
   3       −0.0026      0.2657     −3·10⁻⁸      0.2285
   4       −0.0007      0.6785     −6·10⁻⁹      0.4431
   5       −0.0004      1.6105     −2·10⁻⁹      0.9567

² Note that 1 − x^T x = (1 − e^T x)(1 + x^T x) + Σ_{i=1}^n x_i(1 − x_i)² + Σ_{i≠j} x_i² x_j ∈ IQ(c_eq, c_in).


Example 6.2. Consider the optimization problem

min  x_1⁴x_2² + x_1²x_2⁴ + x_3⁶ − 3x_1²x_2²x_3² + (x_1⁴ + x_2⁴ + x_3⁴)
s.t. x_1² + x_2² + x_3² ≥ 1.

The matrix polynomial L(x) = [ x_1/2  x_2/2  x_3/2  −1 ]. The objective f is the sum of the positive definite form x_1⁴ + x_2⁴ + x_3⁴ and the Motzkin polynomial

M(x) = x_1⁴x_2² + x_1²x_2⁴ + x_3⁶ − 3x_1²x_2²x_3².

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive and f_min is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

c_1(x)p_1(x) = (x_1² + x_2² + x_3² − 1)( 3M(x) + 2(x_1⁴ + x_2⁴ + x_3⁴) ) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value f_min = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

             w/o LME                    with LME
order k   lower bound      time     lower bound   time
   3       −∞             0.4466      0.1111     0.1169
   4       −∞             0.4948      0.3333     0.3499
   5       −2.1821·10⁵    1.1836      0.3333     0.6530

Example 6.3. Consider the optimization problem

min  x_1x_2 + x_2x_3 + x_3x_4 − 3x_1x_2x_3x_4 + (x_1³ + ··· + x_4³)
s.t. x_1, x_2, x_3, x_4 ≥ 0, 1 − x_1 − x_2 ≥ 0, 1 − x_3 − x_4 ≥ 0.

The matrix polynomial L(x) is

  [ 1−x_1    −x_2     0       0     1 1 0 0 1 0 ]
  [  −x_1   1−x_2     0       0     1 1 0 0 1 0 ]
  [   0       0     1−x_3    −x_4   0 0 1 1 0 1 ]
  [   0       0      −x_3   1−x_4   0 0 1 1 0 1 ]
  [  −x_1    −x_2     0       0     1 1 0 0 1 0 ]
  [   0       0      −x_3    −x_4   0 0 1 1 0 1 ].

The feasible set is compact, so f_min is achieved at a critical point. One can show that f_min = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(c_eq, c_in) is archimedean.⁴ The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because −c_1²p_1² ∈ Ideal(φ) ⊆ IQ(φ, ψ), and the set {−c_1(x)²p_1(x)² ≥ 0} is compact.

⁴ This is because 1 − x_1² − x_2² belongs to the quadratic module of (x_1, x_2, 1 − x_1 − x_2), and 1 − x_3² − x_4² belongs to the quadratic module of (x_3, x_4, 1 − x_3 − x_4). See the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

             w/o LME                    with LME
order k   lower bound     time     lower bound    time
   3       −2.9·10⁻⁵     0.7335     −6·10⁻⁷      0.6091
   4       −1.4·10⁻⁵     2.5055     −8·10⁻⁸      2.7423
   5       −1.4·10⁻⁵    12.7092     −5·10⁻⁸     13.7449

Example 6.4. Consider the polynomial optimization problem

min_{x∈R²}  x_1² + 50x_2²
s.t. x_1² − 1/2 ≥ 0, x_2² − 2x_1x_2 − 1/8 ≥ 0, x_2² + 2x_1x_2 − 1/8 ≥ 0.

It is motivated from an example in [12, §3]. The first column of L(x) is

  [ (8x_1³)/5 + x_1/5 ]
  [ (288x_2x_1⁴)/5 − (16x_1³)/5 − (124x_2x_1²)/5 + (8x_1)/5 − 2x_2 ]
  [ −(288x_2x_1⁴)/5 − (16x_1³)/5 + (124x_2x_1²)/5 + (8x_1)/5 + 2x_2 ],

and the second column of L(x) is

  [ −(8x_1²x_2)/5 + (4x_2³)/5 − x_2/10 ]
  [ (288x_1³x_2²)/5 + (16x_1²x_2)/5 − (142x_1x_2²)/5 − (9x_1)/20 − (8x_2³)/5 + (11x_2)/5 ]
  [ −(288x_1³x_2²)/5 + (16x_1²x_2)/5 + (142x_1x_2²)/5 + (9x_1)/20 − (8x_2³)/5 + (11x_2)/5 ].

The objective is coercive, so f_min is achieved at a critical point. The minimum value

f_min = 56 + 3/4 + 25√5 ≈ 112.6517,

and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

             w/o LME                with LME
order k   lower bound   time    lower bound    time
   3         6.7535    0.4611     56.7500     0.1309
   4         6.9294    0.2428    112.6517     0.2405
   5         8.8519    0.3376    112.6517     0.2167
   6        16.5971    0.4703    112.6517     0.3788
   7        35.4756    0.6536    112.6517     0.4537

Example 6.5. Consider the optimization problem

min_{x∈R³}  x_1³ + x_2³ + x_3³ + 4x_1x_2x_3 − ( x_1(x_2² + x_3²) + x_2(x_3² + x_1²) + x_3(x_1² + x_2²) )
s.t. x_1 ≥ 0, x_1x_2 − 1 ≥ 0, x_2x_3 − 1 ≥ 0.

The matrix polynomial L(x) is

  [ 1−x_1x_2    0    0    x_2   x_2    0 ]
  [   x_1       0    0    −1    −1     0 ]
  [  −x_1      x_2   0     1     0    −1 ].
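For these non-polyhedral constraints, C(x) from (5.1) stacks the gradient columns over diag(c_1, c_2, c_3), and L(x)C(x) = I_3 can again be checked exactly (our illustration, with the hypothetical name lc_65):

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def lc_65(x1, x2, x3):
    # constraints: c1 = x1, c2 = x1*x2 - 1, c3 = x2*x3 - 1
    c = [x1, x1 * x2 - 1, x2 * x3 - 1]
    grads = [[1, x2, 0], [0, x1, x3], [0, 0, x2]]   # columns are the gradients
    D = [[c[i] if i == j else 0 for j in range(3)] for i in range(3)]
    L = [[1 - x1 * x2, 0, 0, x2, x2, 0],
         [x1, 0, 0, -1, -1, 0],
         [-x1, x2, 0, 1, 0, -1]]
    return matmul(L, grads + D)

G = lc_65(Fraction(9, 10), Fraction(11, 10), Fraction(9, 10))
assert G == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```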


The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R³₊, so the minimum value is achieved at a critical point. In computation, we got f_min ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

             w/o LME                    with LME
order k   lower bound      time     lower bound   time
   2       −∞             0.4129      −∞         0.1900
   3       −7.8184·10⁶    0.4641      0.9492     0.3139
   4       −2.0575·10⁴    0.6499      0.9492     0.5057

Example 6.6. Consider the optimization problem (x_0 := 1)

min_{x∈R⁴}  x^T x + Σ_{i=0}^{4} ∏_{j≠i} (x_i − x_j)
s.t. x_1² − 1 ≥ 0, x_2² − 1 ≥ 0, x_3² − 1 ≥ 0, x_4² − 1 ≥ 0.

The matrix polynomial L(x) = [ (1/2)diag(x)  −I_4 ]. The first part of the objective is x^T x, while the second part is a nonnegative polynomial [37]. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min = 4.0000 and 11 global minimizers:

(1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
(1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
(−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

             w/o LME                    with LME
order k   lower bound      time     lower bound    time
   3       −∞              1.1377     3.5480      1.1765
   4       −6.6913·10⁴     4.7677     4.0000      3.0761
   5       −21.3778       22.9970     4.0000     10.3354

Example 6.7. Consider the optimization problem

min_{x∈R³}  x_1⁴x_2² + x_2⁴x_3² + x_3⁴x_1² − 3x_1²x_2²x_3² + x_2²
s.t. x_1 − x_2x_3 ≥ 0, −x_2 + x_3² ≥ 0.

The matrix polynomial L(x) =
  [   1     0   0  0  0 ]
  [ −x_3   −1   0  0  0 ].

By the arithmetic-geometric mean inequality, one can show that f_min = 0. The global minimizers are (x_1, 0, x_3) with x_1 ≥ 0 and x_1x_3 = 0. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that f_k = f_min for all k ≥ 5, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

             w/o LME                    with LME
order k   lower bound      time     lower bound     time
   3       −∞              0.6144     −∞           0.3418
   4       −1.0909·10⁷     1.0542     −3.9476      0.7180
   5       −942.6772       1.6771     −3·10⁻⁹      1.4607
   6       −0.0110         3.3532     −8·10⁻¹⁰     3.1618

Example 6.8. Consider the optimization problem

min_{x∈R⁴}  x_1²(x_1 − x_4)² + x_2²(x_2 − x_4)² + x_3²(x_3 − x_4)²
            + 2x_1x_2x_3(x_1 + x_2 + x_3 − 2x_4) + (x_1 − 1)² + (x_2 − 1)² + (x_3 − 1)²
s.t. x_1 − x_2 ≥ 0, x_2 − x_3 ≥ 0.

The matrix polynomial L(x) =
  [ 1  0  0  0  0  0 ]
  [ 1  1  0  0  0  0 ].

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min ≈ 0.9413 and a minimizer

(0.5632, 0.5632, 0.5632, 0.7510).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

             w/o LME                    with LME
order k   lower bound      time     lower bound    time
   2       −∞              0.3984     −0.3360     0.9321
   3       −∞              0.7634      0.9413     0.5240
   4       −6.4896·10⁵     4.5496      0.9413     1.7192
   5       −3.1645·10³    24.3665      0.9413     8.1228

Example 6.9. Consider the optimization problem

min_{x∈R⁴}  (x_1 + x_2 + x_3 + x_4 + 1)² − 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1)
s.t. 0 ≤ x_1, ..., x_4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so f_min is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value f_min = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that 4 − Σ_{i=1}^4 x_i² = Σ_{i=1}^4 ( x_i(1 − x_i)² + (1 − x_i)(1 + x_i²) ) ∈ IQ(c_eq, c_in).

Table 9. Computational results for Example 6.9

             w/o LME                  with LME
order k   lower bound    time     lower bound    time
   2       −0.0279      0.2262     −5·10⁻⁶      1.1835
   3       −0.0005      0.4691     −6·10⁻⁷      1.6566
   4       −0.0001      3.1098     −2·10⁻⁷      5.5234
   5       −4·10⁻⁵     16.5092     −6·10⁻⁷     19.7320

For some polynomial optimization problems the standard Lasserre relaxationsmight converge fast eg the lowest order relaxation may often be tight For suchcases the new relaxations (38)-(39) have the same convergence property but mighttake more computational time The following is such a comparison

Example 610 Consider the optimization problem (x0 = 1)

minxisinRn

sum

0leilejlejlen cijkxixjxk

st 0 le x le 1

where each coefficient cijk is randomly generated (by randn in MATLAB) Thematrix L(x) is the same as in Example 43 Since the feasible set is compact wealways have fc = fmin The condition ii) of Theorem 33 is satisfied because ofbox constraints For this kind of randomly generated problems standard Lasserrersquosrelaxations are often tight for the order k = 2 which is also the case for the newrelaxations (38)-(39) Here we compare the computational time that is consumedby standard Lasserrersquos relaxations and (38)-(39) The time is shown (in seconds)in Table 10 For each n in the table we generate 10 random instances and we showthe average of the consumed time For all instances standard Lasserrersquos relaxationsand the new ones (38)-(39) are tight for the order k = 2 while their time is a bitdifferent We can observe that (38)-(39) consume slightly more time

Table 10. Consumed time (in seconds) for Example 6.10.

n          9        10       11       12        13        14
w/o LME    1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
with LME   1.9714   3.8288   8.2519   20.0310   37.6373   82.4778
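The random instances of this kind can be reproduced in outline as follows. This is a sketch only: it uses NumPy's `standard_normal` in place of MATLAB's randn, picks a small $n$ for illustration, and the random search merely yields an upper bound on $f_{min}$; the SDP relaxations themselves are not implemented here.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 5  # small for illustration; the table uses n = 9, ..., 14

# monomial exponent triples (i, j, k) with 0 <= i <= j <= k <= n, where x_0 = 1
triples = list(itertools.combinations_with_replacement(range(n + 1), 3))
c = rng.standard_normal(len(triples))  # analogue of MATLAB's randn

def f(x):
    # evaluate sum of c_ijk * x_i * x_j * x_k at a point x in R^n
    xe = np.concatenate(([1.0], x))  # prepend x_0 = 1
    return sum(ci * xe[i] * xe[j] * xe[k] for ci, (i, j, k) in zip(c, triples))

# random search over the box [0, 1]^n: an upper bound on f_min
best = min(f(rng.random(n)) for _ in range(2000))
print("upper bound on f_min:", best)
```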

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value $f_{min}$ is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed relaxations (3.8)-(3.9) for solving (3.7). Note that
\[
  \mathrm{IQ}(c_{eq}, c_{in})_{2k} + \mathrm{IQ}(\phi, \psi)_{2k}
  = \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(c_{in}, \psi)_{2k}.
\]

If we replace the quadratic module $\mathrm{Qmod}(c_{in}, \psi)$ by the preordering of $(c_{in}, \psi)$ [21, 25], we can get further tighter relaxations. For convenience, write $(c_{in}, \psi)$ as a single tuple $(g_1, \ldots, g_\ell)$. Its preordering is the set
\[
  \mathrm{Preord}(c_{in}, \psi)
  = \sum_{r_1, \ldots, r_\ell \in \{0, 1\}} g_1^{r_1} \cdots g_\ell^{r_\ell}\, \Sigma[x].
\]
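Since the preordering ranges over all choices $r_1, \ldots, r_\ell \in \{0,1\}$, a relaxation built from it involves $2^\ell$ generator products instead of the $\ell + 1$ blocks of the quadratic module. A small sketch enumerating these products, with made-up numeric values standing in for the generators $g_i$ evaluated at a fixed point:

```python
from itertools import product

def preordering_products(g_vals):
    """All 2^l products g_1^{r_1} * ... * g_l^{r_l} with r_i in {0, 1},
    for given generator values g_vals at a fixed point."""
    prods = {}
    for r in product([0, 1], repeat=len(g_vals)):
        p = 1.0
        for ri, gi in zip(r, g_vals):
            if ri:
                p *= gi
        prods[r] = p
    return prods

# hypothetical values g_1 = 2.0, g_2 = 3.0 of two generators at some point
P = preordering_products([2.0, 3.0])
print(sorted(P.values()))  # [1.0, 2.0, 3.0, 6.0] : 1, g_1, g_2, g_1*g_2
```

The exponential count $2^\ell$ is the price for the stronger convergence guarantee of the preordering-based relaxation.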

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 27

The truncation $\mathrm{Preord}(c_{in}, \psi)_{2k}$ is defined similarly to $\mathrm{Qmod}(c_{in}, \psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same order $k$, is
\[
(7.1) \qquad \left\{
\begin{array}{rl}
  f'^{pre}_{k} := \min & \langle f, y \rangle \\
  \mathrm{s.t.} & \langle 1, y \rangle = 1, \;\; L^{(k)}_{c_{eq}}(y) = 0, \;\; L^{(k)}_{\phi}(y) = 0, \\
  & L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \quad \forall\, r_1, \ldots, r_\ell \in \{0, 1\}, \\
  & y \in \mathbb{R}^{\mathbb{N}^n_{2k}}.
\end{array}
\right.
\]

Similar to (3.9), the dual optimization problem of the above is
\[
(7.2) \qquad \left\{
\begin{array}{rl}
  f^{pre}_{k} := \max & \gamma \\
  \mathrm{s.t.} & f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Preord}(c_{in}, \psi)_{2k}.
\end{array}
\right.
\]

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose $K_c \ne \emptyset$ and Assumption 3.1 holds. Then
\[
  f^{pre}_{k} = f'^{pre}_{k} = f_c
\]
for all $k$ sufficiently large. Therefore, if the minimum value $f_{min}$ of (1.1) is achieved at a critical point, then $f^{pre}_{k} = f'^{pre}_{k} = f_{min}$ for all $k$ big enough.

Proof. The proof is very similar to that of Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have $\tilde{f}(x) \equiv 0$ on the set
\[
  \mathcal{K}_3 = \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \, \phi(x) = 0, \, c_{in}(x) \ge 0, \, \psi(x) \ge 0 \}.
\]
By the Positivstellensatz, there exist an integer $\ell > 0$ and $q \in \mathrm{Preord}(c_{in}, \psi)$ such that $\tilde{f}^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi)$. The rest of the proof is the same. $\square$

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple $c$ of constraining polynomials is nonsingular, then there exists a matrix polynomial $L(x)$ such that $L(x)C(x) = I_m$. Hence, the Lagrange multiplier $\lambda$ can be expressed as in (5.3). However, if $c$ is not nonsingular, then such $L(x)$ does not exist. For such cases, how can we express $\lambda$ in terms of $x$ for critical pairs $(x, \lambda)$? This question is mostly open, to the best of the author's knowledge.
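A toy illustration of the obstruction (an assumption-laden sketch, not from the paper): for the single constraint $c_1(x) = x^2$ in one variable, the column $C(x)$ vanishes at the singular point $x = 0$, so no matrix polynomial $L(x)$ can satisfy $L(x)C(x) = 1$ there, and the least-squares multiplier for a sample objective $f(x) = x$ blows up as $x \to 0$:

```python
import numpy as np

def C(x):
    # single constraint c_1(x) = x^2 in one variable:
    # stack the derivative 2x with c_1(x) itself
    return np.array([2.0 * x, x**2])

# at the singular point x = 0 the whole column vanishes,
# so L(x) C(x) = 1 cannot hold there for any matrix polynomial L
assert np.allclose(C(0.0), 0.0)

# the least-squares multiplier (C^T C)^{-1} C^T [f'(x), 0]^T for f(x) = x
# equals 2x / (4x^2 + x^4) = 2 / (4x + x^3); it blows up as x -> 0
for x in [1.0, 0.1, 0.01]:
    Cx = C(x)
    lam = (Cx @ np.array([1.0, 0.0])) / (Cx @ Cx)
    print(x, lam)
```

The denominator $4x^2 + x^4$ vanishes at $x = 0$, so for this $f$ the rational multiplier has no limit at the singular point.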

7.3. Degree bound for $L(x)$. For a nonsingular tuple $c$ of constraining polynomials, what is a good degree bound for $L(x)$ in Proposition 5.2? When $c$ is linear, a degree bound is given in Proposition 4.1. However, for nonlinear $c$, an explicit degree bound is not known. Theoretically, we can get a degree bound for $L(x)$: in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used $t$ times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is applied at each usage, then a degree bound for $L(x)$ can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for $L(x)$.


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector $\lambda$ is determined by a linear equation. Naturally, one can get
\[
  \lambda = \big( C(x)^T C(x) \big)^{-1} C(x)^T
  \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}
\]

when $C(x)$ has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, $\lambda$ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
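As a sanity check on the formula above (a sketch with an arbitrarily chosen test objective $f(x) = \sum_i x_i^3$): for the sphere constraint $1 - x^T x = 0$, the rational representation agrees with the polynomial expression $\lambda_1 = -\frac{1}{2} x^T \nabla f(x)$ of (3.5) at points of the critical set:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

def grad_f(x):
    # gradient of the test objective f(x) = sum_i x_i^3 (an arbitrary choice)
    return 3 * x**2

def C(x):
    # for the single constraint c_1(x) = 1 - x^T x, C(x) stacks
    # the gradient -2x with the diagonal entry c_1(x) itself
    return np.concatenate((-2 * x, [1 - x @ x]))

# a random point on the sphere x^T x = 1
x = rng.standard_normal(n)
x /= np.linalg.norm(x)

Cx = C(x)
rhs = np.concatenate((grad_f(x), [0.0]))
lam_rational = (Cx @ rhs) / (Cx @ Cx)  # (C^T C)^{-1} C^T [grad f; 0], here m = 1
lam_poly = -0.5 * (x @ grad_f(x))      # polynomial expression (3.5)
print(abs(lam_rational - lam_poly))
```

On the sphere $c_1(x) = 0$, the denominator $C(x)^T C(x)$ reduces to $4 x^T x = 4$, which is why the two expressions coincide there while differing off the variety.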

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83-111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties, and algorithms: An introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189-226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on noncompact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189-200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824-832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503-523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761-779.
[14] D. Henrion and J. B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, pp. 293-310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: A semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93-109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963-975.
[17] J. B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796-817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607-647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864-885.
[20] J. B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995-2014.
[21] J. B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J. B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J. B. Lasserre, K. C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1-26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157-270, 2009.
[26] M. Laurent. Optimization over polynomials: Selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587-606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697-718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167-191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225-255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634-1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485-510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97-121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247-274.
[35] P. A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83-99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969-984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251-272, American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920-942.
[39] C. Scheiderer. Positivity and sums of squares: A guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271-324.
[40] J. F. Sturm. SeDuMi 1.02: A MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625-653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H. H. Vui and P. T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941-951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093

E-mail address: njw@math.ucsd.edu

Page 5: TIGHT RELAXATIONS FOR POLYNOMIAL OPTIMIZATION …The polynomial expressions is de-termined by linear equations. Based on these expressions, new Lasserre type semidefinite relaxations

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 5

A polynomial σ is said to be a sum of squares (SOS) if σ = s21+ middot middot middot+s2k for somepolynomials s1 sk isin R[x] The set of all SOS polynomials in x is denoted asΣ[x] For a degree d denote the truncation

Σ[x]d = Σ[x] cap R[x]d

For a tuple g = (g1 gt) its quadratic module is the set

Qmod(g) = Σ[x] + g1 middot Σ[x] + middot middot middot+ gt middot Σ[x]The 2kth truncation of Qmod(g) is the set

Qmod(g)2k = Σ[x]2k + g1 middot Σ[x]2kminusdeg(g1) + middot middot middot+ gt middot Σ[x]2kminusdeg(gt)

The truncation Qmod(g)2k depends on the generators g1 gt Denote

(21)

IQ(h g) = Ideal(h) + Qmod(g)IQ(h g)2k = Ideal(h)2k +Qmod(g)2k

The set IQ(h g) is said to be archimedean if there exists p isin IQ(h g) such thatp(x) ge 0 defines a compact set in Rn If IQ(h g) is archimedean then

K = x isin Rn | h(x) = 0 g(x) ge 0

must be a compact set Conversely if K is compact say K sube B(0 R) (the ballcentered at 0 with radius R) then IQ(h (gR2 minus xTx)) is always archimedean andh = 0 (gR2 minus xTx) ge 0 give the same set K

Theorem 21 (Putinar [36]) Let h g be tuples of polynomials in R[x] Let K beas above Assume IQ(h g) is archimedean If a polynomial f isin R[x] is positive onK then f isin IQ(h g)

Interestingly if f is only nonnegative on K but standard optimality conditionshold (see Subsection 11) then we still have f isin IQ(h g) [33]

Let RNnd be the space of real multi-sequences indexed by α isin Nnd A vector in

RNnd is called a truncated multi-sequence (tms) of degree d A tms y = (yα)αisinNn

d

gives the Riesz functional Ry acting on R[x]d as

(22) Ry

(

sum

αisinNnd

fαxα)

=sum

αisinNnd

fαyα

For f isin R[x]d and y isin RNnd we denote

(23) 〈f y〉 = Ry(f)

Let q isin R[x]2k The kth localizing matrix of q generated by y isin RNn2k is the

symmetric matrix L(k)q (y) such that

(24) vec(a1)T(

L(k)q (y)

)

vec(a2) = Ry(qa1a2)

for all a1 a2 isin R[x]kminuslceildeg(q)2rceil (The vec(ai) denotes the coefficient vector of ai)

When q = 1 L(k)q (y) is called a moment matrix and we denote

(25) Mk(y) = L(k)1 (y)

The columns and rows of L(k)q (y) as well as Mk(y) are indexed by α isin Nn with

2|α|+ deg(q) le 2k When q = (q1 qr) is a tuple of polynomials we define

(26) L(k)q (y) = diag

(

L(k)q1 (y) L(k)

qr (y))

6 JIAWANG NIE

a block diagonal matrix For the polynomial tuples h g as above the set

(27) S (h g)2k =

y isin RN

2kn

∣∣∣L

(k)h (y) = 0 L(k)

g (y) 0

is a spectrahedral cone in RN2kn The set IQ(h g)2k is also a convex cone in R[x]2k

The dual cone of IQ(h g)2k is precisely S (h g)2k [22 25 34] This is because〈p y〉 ge 0 for all p isin IQ(h g)2k and for all y isin S (h g)2k

3 The construction of tight relaxations

Consider the polynomial optimization problem (11) Let

λ = (λi)iisinEcupI

be the vector of Lagrange multipliers Denote the set

(31) K =

(x λ) isin Rn times R

EcupI∣∣∣∣∣

ci(x) = 0(i isin E) λjcj(x) = 0 (j isin I)nablaf(x) = sum

iisinEcupIλinablaci(x)

Each point in K is called a critical pair The projection

(32) Kc = u | (u λ) isin Kis the set of all real critical points To construct tight relaxations for solving (11)we need the following assumption for Lagrange multipliers

Assumption 31 For each i isin E cupI there exists a polynomial pi isin R[x] such thatfor all (x λ) isin K it holds that

λi = pi(x)

Assumption 31 is generically satisfied as shown in Proposition 57 For thefollowing special cases we can get polynomials pi explicitly

bull (Simplex) For the simplex eTxminus1 = 0 x1 ge 0 xn ge 0 it correspondsto that E = 0 I = [n] c0(x) = eTx minus 1 cj(x) = xj (j isin [n]) TheLagrange multipliers can be expressed as

(33) λ0 = xTnablaf(x) λj = fxjminus xTnablaf(x) (j isin [n])

bull (Hypercube) For the hypercube [minus1 1]n it corresponds to that E = emptyI = [n] and each cj(x) = 1minus x2j We can show that

(34) λj = minus1

2xjfxj

(j isin [n])

bull (Ball or sphere) The constraint is 1minusxTx = 0 or 1minusxTx ge 0 It correspondsto that E cup I = 1 and c1 = 1minus xTx We have

(35) λ1 = minus1

2xTnablaf(x)

bull (Triangular constraints) Suppose E cup I = 1 m and each

ci(x) = τixi + qi(xi+1 xn)

for some polynomials qi isin R[xi+1 xn] and scalars τi 6= 0 The matrixT (x) consisting of the firstm rows of [nablac1(x) nablacm(x)] is an invertiblelower triangular matrix with constant diagonal entries Then

λ = T (x)minus1 middot[fx1

middot middot middot fxm

]T

Note that the inverse T (x)minus1 is a matrix polynomial

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 7

For more general constraints we can also express λ as a polynomial function in xon the set Kc This will be discussed in sect4 and sect5

For the polynomials pi as in Assumption 31 denote

(36) φ =(

nablaf minussum

iisinEcupIpinablaci

(pjcj

)

jisinI

)

ψ =(pj)

jisinI

When the minimum value fmin of (11) is achieved at a critical point (11) isequivalent to the problem

(37)

fc = min f(x)st ceq(x) = 0 cin(x) ge 0

φ(x) = 0 ψ(x) ge 0

We apply Lasserre relaxations to solve it For an integer k gt 0 (called the relaxationorder) the kth order Lasserrersquos relaxation for (37) is

(38)

f primek = min 〈f y〉

st 〈1 y〉 = 1Mk(y) 0

L(k)ceq (y) = 0 L

(k)cin(y) 0

L(k)φ (y) = 0 L

(k)ψ (y) 0 y isin RN

n2k

Since x0 = 1 (the constant one polynomial) the condition 〈1 y〉 = 1 means that(y)0 = 1 The dual optimization problem of (38) is

(39)

fk = max γ

st f minus γ isin IQ(ceq cin)2k + IQ(φ ψ)2k

We refer to sect2 for the notation used in (38)-(39) They are equivalent to semidef-inite programs (SDPs) so they can be solved by SDP solvers (eg SeDuMi [40])For k = 1 2 middot middot middot we get a hierarchy of Lasserre relaxations In (38)-(39) if weremove the usage of φ and ψ they are reduced to standard Lasserre relaxations in[17] So (38)-(39) are stronger relaxations

By the construction of φ as in (36) Assumption 31 implies that

Kc = u isin Rn ceq(u) = 0 φ(u) = 0

By Lemma 33 of [6] f achieves only finitely many values on Kc say(310) v1 lt middot middot middot lt vN

A point u isin Kc might not be feasible for (37) ie it is possible that cin(u) 6ge 0 orψ(u) 6ge 0 In applications we are often interested in the optimal value fc of (37)When (37) is infeasible by convention we set

fc = +infin

When the optimal value fmin of (11) is achieved at a critical point fc = fminThis is the case if the feasible set is compact or if f is coercive (ie for each ℓthe sublevel set f(x) le ℓ is compact) and the constraint qualification conditionholds As in [17] one can show that

(311) fk le f primek le fc

for all k Moreover fk and f primek are both monotonically increasing If for some

order k it occurs thatfk = f prime

k = fc

then the kth order Lasserrersquos relaxation is said to be tight (or exact)

8 JIAWANG NIE

31 Tightness of the relaxations Let cin ψKc fc be as above We refer to sect2for the notation Qmod(cin ψ) We begin with a general assumption

Assumption 32 There exists ρ isin Qmod(cin ψ) such that if u isin Kc and f(u) ltfc then ρ(u) lt 0

In Assumption 32 the hypersurface ρ(x) = 0 separates feasible and infeasiblecritical points Clearly if u isin Kc is a feasible point for (37) then cin(u) ge 0 andψ(u) ge 0 and hence ρ(u) ge 0 Assumption 32 generally holds For instance it issatisfied for the following general cases

a) When there are no inequality constraints cin and ψ are empty tuplesThen Qmod(cin ψ) = Σ[x] and Assumption 32 is satisfied for ρ = 0

b) Suppose the set Kc is finite say Kc = u1 uD andf(u1) f(utminus1) lt fc le f(ut) f(uD)

Let ℓ1 ℓD be real interpolating polynomials such that ℓi(uj) = 1 fori = j and ℓi(uj) = 0 for i 6= j For each i = 1 t there must exist ji isin Isuch that cji(ui) lt 0 Then the polynomial

(312) ρ =sum

iltt

minus1

cji(ui)cji(x)ℓi(x)

2 +sum

igetℓi(x)

2

satisfies Assumption 32c) For each x with f(x) = vi lt fc at least one of the constraints cj(x) ge

0 pj(x) ge 0(j isin I) is violated Suppose for each critical value vi lt fcthere exists gi isin cj pjjisinI such that

gi lt 0 on Kc cap f(x) = viLet ϕ1 ϕN be real univariate polynomials such that ϕi(vj) = 0 for i 6= jand ϕi(vj) = 1 for i = j Suppose vt = fc Then the polynomial

(313) ρ =sum

iltt

gi(x)(ϕi(f(x))

)2+sum

iget

(ϕi(f(x))

)2

satisfies Assumption 32

We refer to sect2 for the archimedean condition and the notation IQ(h g) as in(21) The following is about the convergence of relaxations (38)-(39)

Theorem 33 Suppose Kc 6= empty and Assumption 31 holds If

i) IQ(ceq cin) + IQ(φ ψ) is archimedean orii) IQ(ceq cin) is archimedean oriii) Assumption 32 holds

then fk = f primek = fc for all k sufficiently large Therefore if the minimum value fmin

of (11) is achieved at a critical point then fk = f primek = fmin for all k big enough if

one of the conditions i)-iii) is satisfied

Remark In Theorem 33 the conclusion holds if anyone of conditions i)-iii) issatisfied The condition ii) is only about constraining polynomials of (11) It canbe checked without φ ψ Clearly the condition ii) implies the condition i)

The proof for Theorem 33 is given in the following The main idea is to considerthe set of critical points It can be expressed as a union of subvarieties Theobjective f is a constant in each one of them We can get an SOS type representation

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 9

for f on each subvariety and then construct a single one for f over the entire setof critical points

Proof of Theorem 33 Clearly every point in the complex variety

K1 = x isin Cn | ceq(x) = 0 φ(x) = 0

is a critical point By Lemma 33 of [6] the objective f achieves finitely manyreal values on Kc = K1 cap Rn say they are v1 lt middot middot middot lt vN Up to the shiftingof a constant in f we can further assume that fc = 0 Clearly fc equals one ofv1 vN say vt = fc = 0

Case I Assume IQ(ceq cin) + IQ(φ ψ) is archimedean Let

I = Ideal(ceq φ)

the critical ideal Note that K1 = VC(I) The variety VC(I) is a union of irreduciblesubvarieties say V1 Vℓ If Vi cap Rn 6= empty then f is a real constant on Vi whichequals one of v1 vN This can be implied by Lemma 33 of [6] and Lemma 32of [30] Denote the subvarieties of VC(I)

Ti = K1 cap f(x) = vi (i = t N)

Let Ttminus1 be the union of irreducible subvarieties Vi such that either Vi cap Rn = emptyor f equiv vj on Vi with vj lt vt = fc Then it holds that

VC(I) = Ttminus1 cup Tt cup middot middot middot cup TN By the primary decomposition of I [11 41] there exist ideals Itminus1 It IN sube R[x]such that

I = Itminus1 cap It cap middot middot middot cap IN

and Ti = VC(Ii) for all i = tminus 1 t N Denote the semialgebraic set

(314) S = x isin Rn | cin(x) ge 0 ψ(x) ge 0

For i = t minus 1 we have VR(Itminus1) cap S = empty because v1 vtminus1 lt fc By thePositivstellensatz [2 Corollary 443] there exists p0 isin Preord(cin ψ)

1 satisfying2 + p0 isin Itminus1 Note that 1 + p0 gt 0 on VR(Itminus1) cap S The set Itminus1 +Qmod(cin ψ)is archimedean because I sube Itminus1 and

IQ(ceq cin) + IQ(φ ψ) sube Itminus1 +Qmod(cin ψ)

By Theorem 21 we have

p1 = 1 + p0 isin Itminus1 +Qmod(cin ψ)

Then 1 + p1 isin Itminus1 There exists p2 isin Qmod(cin ψ) such that

minus1 equiv p1 equiv p2 mod Itminus1

Since f = (f4 + 1)2 minus 1 middot (f4minus 1)2 we have

f equiv σtminus1 =(f4 + 1)2 + p2(f4minus 1)2

mod Itminus1

So when k is big enough we have σtminus1 isin Qmod(cin ψ)2k

1It is the preordering of the polynomial tuple (cin ψ) see sect71

10 JIAWANG NIE

For i = t vt = 0 and f(x) vanishes on VC(It) By Hilbertrsquos Strong Nullstellensatz[4] there exists an integer mt gt 0 such that fmt isin It Define the polynomial

st(ǫ) =radicǫsummtminus1

j=0

(12

j

)

ǫminusjf j

Then we have that

D1 = ǫ(1 + ǫminus1f)minus(st(ǫ)

)2 equiv 0 mod It

This is because in the subtraction of D1 after expanding(st(ǫ)

)2 all the terms f j

with j lt mt are cancelled and f j isin It for j ge mt So D1 isin It Let σt(ǫ) = st(ǫ)2

then f + ǫminus σt(ǫ) = D1 and

(315) f + ǫminus σt(ǫ) =summtminus2

j=0bj(ǫ)f

mt+j

for some real scalars bj(ǫ) depending on ǫ

For each i = t+1 N vi gt 0 and f(x)viminus1 vanishes on VC(Ii) By HilbertrsquosStrong Nullstellensatz [4] there exists 0 lt mi isin N such that (fvi minus 1)mi isin Ii Let

si =radicvisummiminus1

j=0

(12

j

)

(fvi minus 1)j

Like for the case i = t we can similarly show that f minus s2i isin Ii Let σi = s2i thenf minus σi isin Ii

Note that VC(Ii) cap VC(Ij) = empty for all i 6= j By Lemma 33 of [30] there existpolynomials atminus1 aN isin R[x] such that

a2tminus1 + middot middot middot+ a2N minus 1 isin I ai isin⋂

i6=jisintminus1NIj

For ǫ gt 0 denote the polynomial

σǫ = σt(ǫ)a2t +

sum

t6=jisintminus1N(σj + ǫ)a2j

then

f + ǫminus σǫ = (f + ǫ)(1minus a2tminus1 minus middot middot middot minus a2N )+sum

t6=iisintminus1N(f minus σi)a2i + (f + ǫminus σt(ǫ))a

2t

For each i 6= t f minus σi isin Ii so

(f minus σi)a2i isin

N⋂

j=tminus1

Ij = I

Hence there exists k1 gt 0 such that

(f minus σi)a2i isin I2k1 (t 6= i isin tminus 1 N)

Since f + ǫ minus σt(ǫ) isin It we also have

(f + ǫ minus σt(ǫ))a2t isin

N⋂

j=tminus1

Ij = I

Moreover by the equation (315)

(f + ǫ minus σt(ǫ))a2t =

mtminus2sum

j=0

bj(ǫ)fmt+ja2t

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 11

Each fmt+ja2t isin I since fmt+j isin It So there exists k2 gt 0 such that for all ǫ gt 0

(f + ǫminus σt(ǫ))a2t isin I2k2

Since 1minus a2tminus1 minus middot middot middot minus a2N isin I there also exists k3 gt 0 such that for all ǫ gt 0

(f + ǫ)(1minus a2tminus1 minus middot middot middot minus a2N ) isin I2k3

Hence if klowast ge maxk1 k2 k3 then we have

f(x) + ǫminus σǫ isin I2klowast

for all ǫ gt 0 By the construction the degrees of all σi and ai are independent ofǫ So σǫ isin Qmod(cin ψ)2klowast for all ǫ gt 0 if klowast is big enough Note that

I2klowast +Qmod(cin ψ)2klowast = IQ(ceq cin)2klowast + IQ(φ ψ)2klowast

This implies that fklowast ge fc minus ǫ for all ǫ gt 0 On the other hand we always havefklowast le fc So fklowast = fc Moreover since fk is monotonically increasing we musthave fk = fc for all k ge klowast

Case II Assume IQ(ceq cin) is archimedean Because

IQ(ceq cin) sube IQ(ceq cin) + IQ(φ ψ)

the set IQ(ceq cin)+IQ(φ ψ) is also archimedean Therefore the conclusion is alsotrue by applying the result for Case I

Case III Suppose the Assumption 32 holds Let ϕ1 ϕN be real univariatepolynomials such that ϕi(vj) = 0 for i 6= j and ϕi(vj) = 1 for i = j Let

s = st + middot middot middot+ sN where each si = (vi minus fc)(ϕi(f)

)2

Then s isin Σ[x]2k4 for some integer k4 gt 0 Let

f = f minus fc minus s

We show that there exist an integer ℓ gt 0 and q isin Qmod(cin ψ) such that

f2ℓ + q isin Ideal(ceq φ)

This is because by Assumption 32 f(x) equiv 0 on the set

K2 = x isin Rn ceq(x) = 0 φ(x) = 0 ρ(x) ge 0

It has only a single inequality By the Positivstellensatz [2 Corollary 443] there

exist 0 lt ℓ isin N and q = b0 + ρb1 (b0 b1 isin Σ[x]) such that f2ℓ + q isin Ideal(ceq φ)By Assumption 32 ρ isin Qmod(cin ψ) so we have q isin Qmod(cin ψ)

For all ǫ gt 0 and τ gt 0 we have f + ǫ = φǫ + θǫ where

φǫ = minusτǫ1minus2ℓ(f2ℓ + q

)

θǫ = ǫ(

1 + f ǫ+ τ(f ǫ)2ℓ)

+ τǫ1minus2ℓq

By Lemma 21 of [31] when τ ge 12ℓ there exists k5 such that for all ǫ gt 0

φǫ isin Ideal(ceq φ)2k5 θǫ isin Qmod(cin ψ)2k5

Hence we can getf minus (fc minus ǫ) = φǫ + σǫ

where σǫ = θǫ + s isin Qmod(cin ψ)2k5 for all ǫ gt 0 Note that

IQ(ceq cin)2k5 + IQ(φ ψ)2k5 = Ideal(ceq φ)2k5 +Qmod(cin ψ)2k5

12 JIAWANG NIE

For all ǫ gt 0 γ = fc minus ǫ is feasible in (39) for the order k5 so fk5 ge fc Becauseof (311) and the monotonicity of fk we have fk = f prime

k = fc for all k ge k5

32 Detecting tightness and extracting minimizers The optimal value of(37) is fc and the optimal value of (11) is fmin If fmin is achievable at a criticalpoint then fc = fmin In Theorem 33 we have shown that fk = fc for all k bigenough where fk is the optimal value of (39) The value fc or fmin is often notknown How do we detect the tightness fk = fc in computation The flat extensionor flat truncation condition [5 14 32] can be used for checking tightness Supposeylowast is a minimizer of (38) for the order k Let

(316) d = lceildeg(ceq cin φ ψ)2rceilIf there exists an integer t isin [d k] such that

(317) rankMt(ylowast) = rankMtminusd(y

lowast)

then fk = fc and we can get r = rankMt(ylowast) minimizers for (37) [5 14 32]

The method in [14] can be used to extract minimizers It was implemented inthe software GloptiPoly 3 [13] Generally (317) can serve as a sufficient andnecessary condition for detecting tightness The case that (37) is infeasible (ieno critical points satisfy the constraints cin ge 0 ψ ge 0) can also be detected bysolving the relaxations (38)-(39)

Theorem 34 Under Assumption 31 the relaxations (38)-(39) have the follow-ing properties

i) If (38) is infeasible for some order k then no critical points satisfy theconstraints cin ge 0 ψ ge 0 ie (37) is infeasible

ii) Suppose Assumption 32 holds If (37) is infeasible then the relaxation(38) must be infeasible when the order k is big enough

In the following assume (37) is feasible (ie fc lt +infin) Then for all k bigenough (38) has a minimizer ylowast Moreover

iii) If (317) is satisfied for some t isin [d k] then fk = fciv) If Assumption 32 holds and (37) has finitely many minimizers then every

minimizer ylowast of (38) must satisfy (317) for some t isin [d k] when k is bigenough

Proof By Assumption 31 u is a critical point if and only if ceq(u) = 0 φ(u) = 0i) For every feasible point u of (37) the tms [u]2k (see sect2 for the notation) is

feasible for (38) for all k Therefore if (38) is infeasible for some k then (37)must be infeasible

ii) By Assumption 32 when (37) is infeasible the set

x isin Rn ceq(x) = 0 φ(x) = 0 ρ(x) ge 0

is empty It has a single inequality By the Positivstellensatz [2 Corollary 443]it holds that minus1 isin Ideal(ceq φ) + Qmod(ρ) By Assumption 32

Ideal(ceq φ) + Qmod(ρ) sube IQ(ceq cin) + IQ(φ ψ)

Thus for all k big enough (39) is unbounded from above Hence (38) must beinfeasible by weak duality

When (37) is feasible f achieves finitely many values on Kc so (37) mustachieve its optimal value fc By Theorem 33 we know that fk = f prime

k = fc for all kbig enough For each minimizer ulowast of (37) the tms [ulowast]2k is a minimizer of (38)

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 13

iii) If (317) holds we can get r = rankMt(ylowast) minimizers for (37) [5 14] say

u1 ur such that fk = f(ui) for each i Clearly fk = f(ui) ge fc On the otherhand we always have fk le fc So fk = fc

iv) By Assumption 32 (37) is equivalent to the problem

(318)

min f(x)st ceq(x) = 0 φ(x) = 0 ρ(x) ge 0

The optimal value of (318) is also fc Its kth order Lasserrersquos relaxation is

(319)

γprimek = min 〈f y〉st 〈1 y〉 = 1Mk(y) 0

L(k)ceq (y) = 0 L

(k)φ (y) = 0 L

(k)ρ (y) 0

Its dual optimization problem is

(320)

γk = max γ

st f minus γ isin Ideal(ceq φ)2k + Qmod(ρ)2k

By repeating the same proof as for Theorem 33(iii) we can show that

γk = γprimek = fc

for all k big enough Because ρ isin Qmod(cin ψ) each y feasible for (38) is alsofeasible for (319) So when k is big each ylowast is also a minimizer of (319) Theproblem (318) also has finitely many minimizers By Theorem 26 of [32] thecondition (317) must be satisfied for some t isin [d k] when k is big enough

If (37) has infinitely many minimizers then the condition (317) is typically notsatisfied We refer to [25 sect66]

4 Polyhedral constraints

In this section we assume the feasible set of (11) is the polyhedron

P = x isin Rn | Axminus b ge 0

where A =[a1 middot middot middot am

]T isin Rmtimesn b =[b1 middot middot middot bm

]T isin Rm This corresponds

to that E = empty I = [m] and each ci(x) = aTi xminus bi Denote

(41) D(x) = diag(c1(x) cm(x)) C(x) =

[AT

D(x)

]

The Lagrange multiplier vector λ =[λ1 middot middot middot λm

]Tsatisfies

(42)

[AT

D(x)

]

λ =

[nablaf(x)

0

]

If rankA = m we can express λ as

(43) λ = (AAT )minus1Anablaf(x)If rankA lt m how can we express λ in terms of x In computation we oftenprefer a polynomial expression If there exists L(x) isin R[x]mtimes(n+m) such that

(44) L(x)C(x) = Im

then we can get

λ = L(x)

[nablaf(x)

0

]

= L1(x)nablaf(x)

14 JIAWANG NIE

where L1(x) consists of the first n columns of L(x) In this section we characterizewhen such L(x) exists and give a degree bound for it

The linear function Ax minus b is said to be nonsingular if rankC(u) = m for allu isin Cn (also see Definition 51) This is equivalent to that for every u if J(u) =i1 ik (see (12) for the notation) then ai1 aik are linearly independent

Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.

Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.

Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ ℝ[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.

For a subset I = {i_1, ..., i_{m−n}} of [m], denote

c_I(x) := ∏_{i∈I} c_i(x),   E_I(x) := c_I(x) · diag( c_{i_1}(x)^{−1}, ..., c_{i_{m−n}}(x)^{−1} ),

D_I(x) := diag( c_{i_1}(x), ..., c_{i_{m−n}}(x) ),   A_I := [ a_{i_1} ··· a_{i_{m−n}} ]^T.

For the case that I = ∅ (the empty set), we set c_∅(x) = 1. Let

V := { I ⊆ [m] : |I| = m − n, rank A_{[m]\I} = n }.

Step I: For each I ∈ V, we construct a matrix polynomial L_I(x) such that

(4.5)   L_I(x) C(x) = c_I(x) I_m.

The matrix L_I = L_I(x) satisfying (4.5) can be given by the following 2×3 block matrix (L_I(J, K) denotes the submatrix whose row indices are from J and whose column indices are from K):

(4.6)
             K = [n]                          K = n + I                            K = n + ([m]\I)
 J = I       0                                E_I(x)                               0
 J = [m]\I   c_I(x)·(A_{[m]\I})^{−T}          −(A_{[m]\I})^{−T}(A_I)^T E_I(x)      0

Equivalently, the blocks of L_I are

L_I( I, [n] ) = 0,   L_I( I, n + ([m]\I) ) = 0,   L_I( [m]\I, n + ([m]\I) ) = 0,

L_I( I, n + I ) = E_I(x),   L_I( [m]\I, [n] ) = c_I(x)·(A_{[m]\I})^{−T},

L_I( [m]\I, n + I ) = −(A_{[m]\I})^{−T}(A_I)^T E_I(x).

For each I ∈ V, A_{[m]\I} is invertible. The superscript −T denotes the inverse of the transpose. Let G = L_I(x)C(x); then one can verify that

G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m−n},   G(I, [m]\I) = 0,

G([m]\I, [m]\I) = [ c_I(x)(A_{[m]\I})^{−T}   −(A_{[m]\I})^{−T}(A_I)^T E_I(x) ] · [ (A_{[m]\I})^T ; 0 ] = c_I(x) I_n,

G([m]\I, I) = [ c_I(x)(A_{[m]\I})^{−T}   −(A_{[m]\I})^{−T}(A_I)^T E_I(x) ] · [ A_I^T ; D_I(x) ] = 0.

This shows that the above L_I(x) satisfies (4.5).


Step II: We show that there exist real scalars ν_I satisfying

(4.7)   Σ_{I∈V} ν_I c_I(x) = 1.

This can be shown by induction on m.

• When m = n, V = {∅} and c_∅(x) = 1, so (4.7) is clearly true.
• When m > n, let

(4.8)   N := { i ∈ [m] | rank A_{[m]\{i}} = n }.

For each i ∈ N, let V_i be the set of all I′ ⊆ [m]\{i} such that |I′| = m − n − 1 and rank A_{[m]\(I′∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]\{i}} x − b_{[m]\{i}} is nonsingular. By induction, there exist real scalars ν^{(i)}_{I′} satisfying

(4.9)   Σ_{I′∈V_i} ν^{(i)}_{I′} c_{I′}(x) = 1.

Since rank A = n, we can generally assume that {a_1, ..., a_n} is linearly independent. So there exist scalars α_1, ..., α_n such that

a_m = α_1 a_1 + ··· + α_n a_n.

If all α_i = 0, then a_m = 0, and hence A can be replaced by its first m − 1 rows; so (4.7) is true by the induction. In the following, suppose at least one α_i ≠ 0, and write

{ i : α_i ≠ 0 } = { i_1, ..., i_k }.

Then a_{i_1}, ..., a_{i_k}, a_m are linearly dependent. For convenience, set i_{k+1} = m. Since Ax − b is nonsingular, the linear system

c_{i_1}(x) = ··· = c_{i_k}(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence, there exist real scalars μ_1, ..., μ_{k+1} such that

μ_1 c_{i_1}(x) + ··· + μ_k c_{i_k}(x) + μ_{k+1} c_{i_{k+1}}(x) = 1.

(This is implied by the echelon form of inconsistent linear systems.) Note that i_1, ..., i_{k+1} ∈ N. For each j = 1, ..., k+1, by (4.9),

Σ_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{I′}(x) = 1.

Then we can get

1 = Σ_{j=1}^{k+1} μ_j c_{i_j}(x) = Σ_{j=1}^{k+1} μ_j Σ_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{i_j}(x) c_{I′}(x) = Σ_{ I = I′∪{i_j}, I′∈V_{i_j}, 1≤j≤k+1 } ν^{(i_j)}_{I′} μ_j c_I(x).

Since each I′ ∪ {i_j} ∈ V, (4.7) must be satisfied by some scalars ν_I.


Step III: For L_I(x) as in (4.5), we construct L(x) as

(4.10)   L(x) := Σ_{I∈V} ν_I L_I(x).

Clearly, L(x) satisfies (4.4), because

L(x)C(x) = Σ_{I∈V} ν_I L_I(x)C(x) = Σ_{I∈V} ν_I c_I(x) I_m = I_m.

Each L_I(x) has degree ≤ m − n, so L(x) has degree ≤ m − n. □

Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does exist, a degree bound for L(x) is m − rank A. Sometimes its degree can be smaller than that, as shown in Example 4.3. For given A, b, the matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x)C(x) = I_m for polyhedral sets.
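The coefficient matching just described can be carried out with a computer algebra system. The sketch below (a made-up toy instance: the interval [0, 1] in one variable, so A = [1; −1] and the degree bound m − rank A = 1) sets up L(x) with unknown coefficients and solves the resulting linear system:

```python
import sympy as sp

x = sp.symbols('x')
# Toy constraints: x >= 0 and 1 - x >= 0, so C(x) = [A^T; D(x)] with D = diag(x, 1-x).
C = sp.Matrix([[1, -1], [x, 0], [0, 1 - x]])

# Unknown L(x) in R[x]^{2x3}, entries of degree <= 1 (the bound of Proposition 4.1).
u = sp.symbols('u0:12')
L = sp.Matrix(2, 3, lambda i, j: u[2*(3*i + j)] + u[2*(3*i + j) + 1]*x)

# Match coefficients of L(x)C(x) - I_2 entrywise; this is a linear system in u.
eqs = []
for e in sp.expand(L*C - sp.eye(2)):
    eqs += sp.Poly(e, x).all_coeffs()
sol = sp.solve(eqs, u, dict=True)[0]

# Pick one solution by zeroing the free parameters, and verify (4.4).
L0 = L.subs(sol).subs({s: 0 for s in u})
assert sp.expand(L0*C - sp.eye(2)) == sp.zeros(2, 2)
```

One valid solution of this system is the n = 1 case of Example 4.3, L(x) = [1−x, 1, 1; −x, 1, 1].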

Example 4.2. Consider the simplicial set

{ x_1 ≥ 0, ..., x_n ≥ 0, 1 − e^T x ≥ 0 }.

The equation L(x)C(x) = I_{n+1} is satisfied by

L(x) = [ 1−x_1   −x_2   ···   −x_n   1 ··· 1 ]
       [ −x_1   1−x_2   ···   −x_n   1 ··· 1 ]
       [   ⋮       ⋮     ⋱      ⋮    ⋮     ⋮ ]
       [ −x_1    −x_2   ···  1−x_n   1 ··· 1 ]
       [ −x_1    −x_2   ···   −x_n   1 ··· 1 ].
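This identity can be checked symbolically; a sketch in sympy with n = 3, building L(x) in the block pattern above:

```python
import sympy as sp

n = 3
x = sp.Matrix(sp.symbols('x1:4'))
A = sp.Matrix.vstack(sp.eye(n), -sp.ones(1, n))     # rows a_i of the simplicial set
c = list(x) + [1 - sum(x)]
C = sp.Matrix.vstack(A.T, sp.diag(*c))              # C(x) = [A^T; D(x)], (2n+1) x (n+1)

# L(x) of Example 4.2: left block has 1 - x_i on the diagonal, -x_j elsewhere,
# and -x_j in the last row; the right block is all ones.
left = sp.Matrix.vstack(sp.eye(n), sp.zeros(1, n)) - sp.ones(n + 1, 1) * x.T
L = sp.Matrix.hstack(left, sp.ones(n + 1, n + 1))
assert sp.expand(L * C) == sp.eye(n + 1)
```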

Example 4.3. Consider the box constraint

{ x_1 ≥ 0, ..., x_n ≥ 0, 1 − x_1 ≥ 0, ..., 1 − x_n ≥ 0 }.

The equation L(x)C(x) = I_{2n} is satisfied by

L(x) = [ I_n − diag(x)   I_n   I_n ]
       [   −diag(x)      I_n   I_n ].

Example 4.4. Consider the polyhedral set

{ 1 − x_4 ≥ 0, x_4 − x_3 ≥ 0, x_3 − x_2 ≥ 0, x_2 − x_1 ≥ 0, x_1 + 1 ≥ 0 }.

The equation L(x)C(x) = I_5 is satisfied by

L(x) = (1/2) [ −x_1−1   −x_2−1   −x_3−1   −x_4−1   1 1 1 1 1 ]
             [ −x_1−1   −x_2−1   −x_3−1   1−x_4    1 1 1 1 1 ]
             [ −x_1−1   −x_2−1   1−x_3    1−x_4    1 1 1 1 1 ]
             [ −x_1−1   1−x_2    1−x_3    1−x_4    1 1 1 1 1 ]
             [ 1−x_1    1−x_2    1−x_3    1−x_4    1 1 1 1 1 ].

Example 4.5. Consider the polyhedral set

{ 1 + x_1 ≥ 0, 1 − x_1 ≥ 0, 2 − x_1 − x_2 ≥ 0, 2 − x_1 + x_2 ≥ 0 }.

The matrix L(x) satisfying L(x)C(x) = I_4 is

(1/6) [ x_1² − 3x_1 + 2     x_1x_2 − x_2           4 − x_1     2 − x_1     1 − x_1     1 − x_1   ]
      [ 3x_1² − 3x_1 − 6    3x_2 + 3x_1x_2         6 − 3x_1    −3x_1       −3x_1 − 3   −3x_1 − 3 ]
      [ 1 − x_1²            −2x_2 − x_1x_2 − 3     x_1 − 1     x_1 + 1     x_1 + 2     x_1 + 2   ]
      [ 1 − x_1²            3 − x_1x_2 − 2x_2      x_1 − 1     x_1 + 1     x_1 + 2     x_1 + 2   ].


5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λ_i as polynomial functions in x on the set of critical points.

Suppose there are m equality and inequality constraints in total, i.e.,

E ∪ I = {1, ..., m}.

If (x, λ) is a critical pair, then λ_i c_i(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ_1 ··· λ_m]^T satisfies the equation

(5.1)   C(x) λ = [ ∇f(x) ; 0 ],   where

C(x) := [ ∇c_1(x)  ∇c_2(x)  ···  ∇c_m(x) ]
        [ c_1(x)      0     ···     0    ]
        [    0     c_2(x)   ···     0    ]
        [    ⋮        ⋮      ⋱      ⋮    ]
        [    0        0     ···  c_m(x)  ].

Let C(x) be as above. If there exists L(x) ∈ ℝ[x]^{m×(m+n)} such that

(5.2)   L(x)C(x) = I_m,

then we can get

(5.3)   λ = L(x) [ ∇f(x) ; 0 ] = L_1(x) ∇f(x),

where L_1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such L(x) exists.

Definition 5.1. The tuple c = (c_1, ..., c_m) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ ℂⁿ.

Clearly, c being nonsingular is equivalent to the following: for each u ∈ ℂⁿ, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then the gradients ∇c_{i_1}(u), ..., ∇c_{i_k}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.
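Certifying nonsingularity requires reasoning over all of ℂⁿ, but a cheap necessary test is to evaluate rank C(u) at random real samples: a rank drop at any sample certifies that c is singular, while full rank at all samples is only evidence, not a proof. A sketch (the single ball constraint below is a made-up toy instance):

```python
import numpy as np
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
c = [1 - x1**2 - x2**2]                                   # one toy constraint, m = 1
grads = sp.Matrix([[sp.diff(ci, v) for v in (x1, x2)] for ci in c]).T
C = sp.Matrix.vstack(grads, sp.diag(*c))                  # C(x) = [gradients; D(x)]
C_num = sp.lambdify((x1, x2), C, 'numpy')

rng = np.random.default_rng(0)
ranks = [np.linalg.matrix_rank(np.array(C_num(*pt), dtype=float))
         for pt in rng.uniform(-2, 2, size=(100, 2))]
assert min(ranks) == len(c)    # rank C(u) = m = 1 at every sample
```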

Proposition 5.2. (i) For a matrix polynomial W(x) ∈ ℂ[x]^{s×t} with s ≥ t, rank W(u) = t for all u ∈ ℂⁿ if and only if there exists P(x) ∈ ℂ[x]^{t×s} such that

P(x)W(x) = I_t.

Moreover, for W(x) ∈ ℝ[x]^{s×t}, we can choose P(x) ∈ ℝ[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ ℝ[x]^{m×(m+n)} satisfying (5.2).

Proof. (i) "⇐": If P(x)W(x) = I_t, then for all u ∈ ℂⁿ,

t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.

"⇒": Suppose rank W(u) = t for all u ∈ ℂⁿ. Write W(x) in columns:

W(x) = [ w_1(x)  w_2(x)  ···  w_t(x) ].


Then the equation w_1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ_1(x) ∈ ℂ[x]^s such that ξ_1(x)^T w_1(x) = 1. For each i = 2, ..., t, denote

r_{1i}(x) := ξ_1(x)^T w_i(x);

then (we use ∼ to denote row equivalence between matrices)

W(x) ∼ [ 1        r_{12}(x)  ···  r_{1t}(x) ]
       [ w_1(x)   w_2(x)     ···  w_t(x)    ]

     ∼ W_1(x) := [ 1   r_{12}(x)       ···  r_{1t}(x)      ]
                 [ 0   w_2^{(1)}(x)    ···  w_t^{(1)}(x)   ],

where for each i = 2, ..., t,

w_i^{(1)}(x) := w_i(x) − r_{1i}(x) w_1(x).

So there exists P_1(x) ∈ ℂ[x]^{(s+1)×s} such that

P_1(x)W(x) = W_1(x).

Since W(x) and W_1(x) are row equivalent, W_1(x) must also have full column rank everywhere. Similarly, the polynomial equation

w_2^{(1)}(x) = 0

does not have a complex solution. Again by Hilbert's Weak Nullstellensatz [4], there exists ξ_2(x) ∈ ℂ[x]^s such that

ξ_2(x)^T w_2^{(1)}(x) = 1.

For each i = 3, ..., t, let r_{2i}(x) := ξ_2(x)^T w_i^{(1)}(x); then

W_1(x) ∼ [ 1   r_{12}(x)      r_{13}(x)      ···  r_{1t}(x)     ]
         [ 0   1              r_{23}(x)      ···  r_{2t}(x)     ]
         [ 0   w_2^{(1)}(x)   w_3^{(1)}(x)   ···  w_t^{(1)}(x)  ]

       ∼ W_2(x) := [ 1   r_{12}(x)   r_{13}(x)      ···  r_{1t}(x)     ]
                   [ 0   1           r_{23}(x)      ···  r_{2t}(x)     ]
                   [ 0   0           w_3^{(2)}(x)   ···  w_t^{(2)}(x)  ],

where for each i = 3, ..., t,

w_i^{(2)}(x) := w_i^{(1)}(x) − r_{2i}(x) w_2^{(1)}(x).

Similarly, W_1(x) and W_2(x) are row equivalent, so W_2(x) has full column rank everywhere. There exists P_2(x) ∈ ℂ[x]^{(s+2)×(s+1)} such that

P_2(x)W_1(x) = W_2(x).

Continuing this process, we finally get

W_2(x) ∼ ··· ∼ W_t(x) = [ 1   r_{12}(x)   r_{13}(x)   ···   r_{1t}(x) ]
                        [ 0   1           r_{23}(x)   ···   r_{2t}(x) ]
                        [ 0   0           1           ···   r_{3t}(x) ]
                        [ ⋮   ⋮           ⋮           ⋱     ⋮         ]
                        [ 0   0           0           ···   1         ]
                        [ 0   0           0           ···   0         ].

Consequently, there exist P_i(x) ∈ ℂ[x]^{(s+i)×(s+i−1)}, for i = 1, 2, ..., t, such that

P_t(x) P_{t−1}(x) ··· P_1(x) W(x) = W_t(x).


Since W_t(x) is a unit upper triangular matrix polynomial, there exists P_{t+1}(x) ∈ ℂ[x]^{t×(s+t)} such that P_{t+1}(x)W_t(x) = I_t. Let

P(x) := P_{t+1}(x) P_t(x) P_{t−1}(x) ··· P_1(x);

then P(x)W(x) = I_t. Note that P(x) ∈ ℂ[x]^{t×s}.

For W(x) ∈ ℝ[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (here P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). □

In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), it can be determined by comparing coefficients of both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).

Example 5.3. Consider the hypercube with quadratic constraints

1 − x_1² ≥ 0, 1 − x_2² ≥ 0, ..., 1 − x_n² ≥ 0.

The equation L(x)C(x) = I_n is satisfied by

L(x) = [ −(1/2)diag(x)   I_n ].

Example 5.4. Consider the nonnegative portion of the unit sphere

x_1 ≥ 0, x_2 ≥ 0, ..., x_n ≥ 0, x_1² + ··· + x_n² − 1 = 0.

The equation L(x)C(x) = I_{n+1} is satisfied by

L(x) = [ I_n − xx^T    x·1_n^T        2x ]
       [ (1/2)x^T      −(1/2)1_n^T    −1 ].
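This block identity can be checked symbolically; a sketch in sympy with n = 3:

```python
import sympy as sp

n = 3
x = sp.Matrix(sp.symbols('x1:4'))
ones = sp.ones(n, 1)
c = list(x) + [x.dot(x) - 1]                              # x_i >= 0 and the sphere
C = sp.Matrix.vstack(sp.Matrix.hstack(sp.eye(n), 2*x),    # gradient columns
                     sp.diag(*c))                         # D(x); C is (2n+1) x (n+1)

# L(x) of Example 5.4 in block form.
L = sp.Matrix.vstack(
    sp.Matrix.hstack(sp.eye(n) - x*x.T, x*ones.T, 2*x),
    sp.Matrix.hstack(x.T/2, -ones.T/2, sp.Matrix([[-1]])))
assert sp.expand(L * C) == sp.eye(n + 1)
```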

Example 5.5. Consider the set

1 − x_1³ − x_2⁴ ≥ 0, 1 − x_3⁴ − x_4³ ≥ 0.

The equation L(x)C(x) = I_2 is satisfied by

L(x) = [ −x_1/3   −x_2/4     0        0      1   0 ]
       [   0        0      −x_3/4   −x_4/3   0   1 ].

Example 5.6. Consider the quadratic set

1 − x_1x_2 − x_2x_3 − x_1x_3 ≥ 0, 1 − x_1² − x_2² − x_3² ≥ 0.

The matrix L(x)^T satisfying L(x)C(x) = I_2 is

[ 25x_1³ + 10x_1²x_2 + 40x_1x_2² − 25x_1 − 2x_3     −25x_1³ − 10x_1²x_2 − 40x_1x_2² + (49/2)x_1 + 2x_3 ]
[ −15x_1²x_2 + 10x_1x_2² + 20x_3x_1x_2 − 10x_1      15x_1²x_2 − 10x_1x_2² − 20x_3x_1x_2 + 10x_1 − x_2/2 ]
[ 25x_3x_1² − 20x_1x_2² + 10x_3x_1x_2 + 2x_1        −25x_3x_1² + 20x_1x_2² − 10x_3x_1x_2 − 2x_1 − x_3/2 ]
[ 1 − 20x_1x_3 − 10x_1² − 20x_1x_2                  20x_1x_2 + 20x_1x_3 + 10x_1²                        ]
[ −50x_1² − 20x_2x_1                                50x_1² + 20x_2x_1 + 1                               ].

We would like to remark that a polynomial tuple c = (c_1, ..., c_m) is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d_1, ..., d_m, there exists an open dense subset U of D := ℝ[x]_{d_1} × ··· × ℝ[x]_{d_m} such that every tuple c = (c_1, ..., c_m) ∈ U is nonsingular. Indeed, such a U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.


Proof. The proof uses resultants and discriminants, for which we refer to [29].

First, let J_1 be the set of all tuples (i_1, ..., i_{n+1}) with 1 ≤ i_1 < ··· < i_{n+1} ≤ m. The resultant Res(c_{i_1}, ..., c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i_1}, ..., c_{i_{n+1}} such that if Res(c_{i_1}, ..., c_{i_{n+1}}) ≠ 0, then the equations

c_{i_1}(x) = ··· = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

F_1(c) := ∏_{(i_1,...,i_{n+1})∈J_1} Res(c_{i_1}, ..., c_{i_{n+1}}).

For the case m ≤ n, J_1 = ∅, and we simply let F_1(c) = 1. Clearly, if F_1(c) ≠ 0, then no more than n of the polynomials c_1, ..., c_m can have a common complex zero.

Second, let J_2 be the set of all tuples (j_1, ..., j_k) with k ≤ n and 1 ≤ j_1 < ··· < j_k ≤ m. When one of c_{j_1}, ..., c_{j_k} has degree greater than one, the discriminant ∆(c_{j_1}, ..., c_{j_k}) is a polynomial in the coefficients of c_{j_1}, ..., c_{j_k} such that if ∆(c_{j_1}, ..., c_{j_k}) ≠ 0, then the equations

c_{j_1}(x) = ··· = c_{j_k}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every common complex solution u, the gradients of c_{j_1}, ..., c_{j_k} at u are linearly independent. When all of c_{j_1}, ..., c_{j_k} have degree one, the discriminant of the tuple (c_{j_1}, ..., c_{j_k}) is not a single polynomial, but we can define ∆(c_{j_1}, ..., c_{j_k}) to be the product of all maximal minors of its Jacobian (a constant matrix). Define

F_2(c) := ∏_{(j_1,...,j_k)∈J_2} ∆(c_{j_1}, ..., c_{j_k}).

Clearly, if F_2(c) ≠ 0, then no n or fewer of the polynomials c_1, ..., c_m can have a singular common complex zero.

Last, let F(c) := F_1(c)F_2(c) and

U := { c = (c_1, ..., c_m) ∈ D : F(c) ≠ 0 }.

Note that U is a Zariski open subset of D, and it is open and dense in D. For all c ∈ U, no more than n of c_1, ..., c_m can have a common complex zero; and for any k polynomials (k ≤ n) of c_1, ..., c_m, if they have a common complex zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically. □

6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.


The polynomials p_i in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c_1, ..., c_m. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L_1(x) be the submatrix of L(x) consisting of its first n columns. Then choose (p_1, ..., p_m) to be the product L_1(x)∇f(x), i.e.,

p_i = ( L_1(x)∇f(x) )_i.

In all our examples, the global minimum value f_min of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.
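As an illustration of this construction, for the single ball constraint of Example 6.2, L_1(x) = [x_1/2, x_2/2, x_3/2], so p_1 = L_1(x)∇f(x) can be computed symbolically. The sketch below also checks stationarity ∇f = λ_1∇c_1 at the minimizer (1, 1, 1)/√3, using the values reported in Example 6.2:

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1:4')
X = sp.Matrix([x1, x2, x3])
# Objective of Example 6.2: Motzkin polynomial plus x1^4 + x2^4 + x3^4
f = x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2 + x1**4 + x2**4 + x3**4

# Constraint c1 = x1^2 + x2^2 + x3^2 - 1, with L(x) = [x1/2, x2/2, x3/2, -1]:
grad_f = sp.Matrix([sp.diff(f, v) for v in X])
p1 = sp.expand((X.T / 2 * grad_f)[0, 0])          # p1 = L1(x) grad f(x)

# At the minimizer u = (1,1,1)/sqrt(3), lam1 = p1(u), and grad f = lam1 * grad c1.
u = {x1: 1/sp.sqrt(3), x2: 1/sp.sqrt(3), x3: 1/sp.sqrt(3)}
lam1 = p1.subs(u)
assert all(sp.simplify(grad_f[i].subs(u) - lam1 * 2 * X[i].subs(u)) == 0
           for i in range(3))
```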

By Theorem 3.3, we have f_k = f_min for all k big enough, if f_c = f_min and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (which have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, together with the computational time (in seconds). The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".

Example 6.1. Consider the optimization problem

min  x_1x_2(10 − x_3)
s.t. x_1 ≥ 0, x_2 ≥ 0, x_3 ≥ 0, 1 − x_1 − x_2 − x_3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum f_min = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x_1, x_2, x_3) with x_1x_2 = 0 is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

                 w/o LME                with LME
order k    lower bound    time     lower bound    time
   2       −0.0521       0.6841    −0.0521       0.1922
   3       −0.0026       0.2657    −3·10⁻⁸       0.2285
   4       −0.0007       0.6785    −6·10⁻⁹       0.4431
   5       −0.0004       1.6105    −2·10⁻⁹       0.9567

² Note that 1 − xᵀx = (1 − eᵀx)(1 + xᵀx) + Σ_{i=1}^n x_i(1 − x_i)² + Σ_{i≠j} x_i²x_j ∈ IQ(c_eq, c_in).


Example 6.2. Consider the optimization problem

min  x_1⁴x_2² + x_1²x_2⁴ + x_3⁶ − 3x_1²x_2²x_3² + (x_1⁴ + x_2⁴ + x_3⁴)
s.t. x_1² + x_2² + x_3² ≥ 1.

The matrix polynomial L(x) = [ x_1/2  x_2/2  x_3/2  −1 ]. The objective f is the sum of the positive definite form x_1⁴ + x_2⁴ + x_3⁴ and the Motzkin polynomial

M(x) := x_1⁴x_2² + x_1²x_2⁴ + x_3⁶ − 3x_1²x_2²x_3².

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive and f_min is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

c_1(x)p_1(x) = (x_1² + x_2² + x_3² − 1)( 3M(x) + 2(x_1⁴ + x_2⁴ + x_3⁴) ) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value f_min = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

                 w/o LME                 with LME
order k    lower bound     time     lower bound    time
   3       −∞              0.4466    0.1111        0.1169
   4       −∞              0.4948    0.3333        0.3499
   5       −2.1821·10⁵     1.1836    0.3333        0.6530

Example 6.3. Consider the optimization problem

min  x_1x_2 + x_2x_3 + x_3x_4 − 3x_1x_2x_3x_4 + (x_1³ + ··· + x_4³)
s.t. x_1, x_2, x_3, x_4 ≥ 0, 1 − x_1 − x_2 ≥ 0, 1 − x_3 − x_4 ≥ 0.

The matrix polynomial L(x) is

[ 1−x_1   −x_2     0       0     1 1 0 0 1 0 ]
[ −x_1    1−x_2    0       0     1 1 0 0 1 0 ]
[  0       0      1−x_3   −x_4   0 0 1 1 0 1 ]
[  0       0      −x_3    1−x_4  0 0 1 1 0 1 ]
[ −x_1    −x_2     0       0     1 1 0 0 1 0 ]
[  0       0      −x_3    −x_4   0 0 1 1 0 1 ].

The feasible set is compact, so f_min is achieved at a critical point. One can show that f_min = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(c_eq, c_in) is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because −c_1²p_1² ∈ Ideal(φ) ⊆ IQ(φ, ψ) and the set {−c_1(x)²p_1(x)² ≥ 0} is compact.

⁴ This is because 1 − x_1² − x_2² belongs to the quadratic module of (x_1, x_2, 1 − x_1 − x_2), and 1 − x_3² − x_4² belongs to the quadratic module of (x_3, x_4, 1 − x_3 − x_4). See the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

                 w/o LME                  with LME
order k    lower bound      time     lower bound    time
   3       −2.9·10⁻⁵        0.7335    −6·10⁻⁷       0.6091
   4       −1.4·10⁻⁵        2.5055    −8·10⁻⁸       2.7423
   5       −1.4·10⁻⁵       12.7092    −5·10⁻⁸      13.7449

Example 6.4. Consider the polynomial optimization problem

min_{x∈ℝ²}  x_1² + 50x_2²
s.t. x_1² − 1/2 ≥ 0, x_2² − 2x_1x_2 − 1/8 ≥ 0, x_2² + 2x_1x_2 − 1/8 ≥ 0.

It is motivated by an example in [12, §3]. The first column of L(x) is

[ (8/5)x_1³ + (1/5)x_1 ]
[ (288/5)x_2x_1⁴ − (16/5)x_1³ − (124/5)x_2x_1² + (8/5)x_1 − 2x_2 ]
[ −(288/5)x_2x_1⁴ − (16/5)x_1³ + (124/5)x_2x_1² + (8/5)x_1 + 2x_2 ],

and the second column of L(x) is

[ −(8/5)x_1²x_2 + (4/5)x_2³ − (1/10)x_2 ]
[ (288/5)x_1³x_2² + (16/5)x_1²x_2 − (142/5)x_1x_2² − (9/20)x_1 − (8/5)x_2³ + (11/5)x_2 ]
[ −(288/5)x_1³x_2² + (16/5)x_1²x_2 + (142/5)x_1x_2² + (9/20)x_1 − (8/5)x_2³ + (11/5)x_2 ].

The objective is coercive, so f_min is achieved at a critical point. The minimum value f_min = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

                 w/o LME               with LME
order k    lower bound    time     lower bound    time
   3        6.7535       0.4611     56.7500      0.1309
   4        6.9294       0.2428    112.6517      0.2405
   5        8.8519       0.3376    112.6517      0.2167
   6       16.5971       0.4703    112.6517      0.3788
   7       35.4756       0.6536    112.6517      0.4537

Example 6.5. Consider the optimization problem

min_{x∈ℝ³}  x_1³ + x_2³ + x_3³ + 4x_1x_2x_3 − ( x_1(x_2² + x_3²) + x_2(x_3² + x_1²) + x_3(x_1² + x_2²) )
s.t. x_1 ≥ 0, x_1x_2 − 1 ≥ 0, x_2x_3 − 1 ≥ 0.

The matrix polynomial L(x) is

[ 1−x_1x_2    0     0    x_2   x_2    0 ]
[   x_1       0     0    −1    −1     0 ]
[  −x_1      x_2    0     1     0    −1 ].


The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant ℝ³₊, so the minimum value is achieved at a critical point. In computation, we got f_min ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

                 w/o LME                 with LME
order k    lower bound      time     lower bound    time
   2       −∞               0.4129    −∞            0.1900
   3       −7.8184·10⁶      0.4641    0.9492        0.3139
   4       −2.0575·10⁴      0.6499    0.9492        0.5057

Example 6.6. Consider the optimization problem (x_0 := 1)

min_{x∈ℝ⁴}  xᵀx + Σ_{i=0}^4 ∏_{j≠i} (x_i − x_j)
s.t. x_1² − 1 ≥ 0, x_2² − 1 ≥ 0, x_3² − 1 ≥ 0, x_4² − 1 ≥ 0.

The matrix polynomial L(x) = [ (1/2)diag(x)  −I_4 ]. The first part of the objective is xᵀx, while the second part is a nonnegative polynomial [37]. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min = 4.0000 and 11 global minimizers:

(1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
(1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
(−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.
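A quick numerical spot-check of the identity L(x)C(x) = I_4 for this example, at an arbitrary random point:

```python
import numpy as np

# For the constraints x_i^2 - 1 >= 0 of Example 6.6:
#   C(x) = [2 diag(x); diag(x_i^2 - 1)]  and  L(x) = [diag(x)/2, -I_4].
rng = np.random.default_rng(1)
x = rng.standard_normal(4)
C = np.vstack([2 * np.diag(x), np.diag(x**2 - 1)])    # 8 x 4
L = np.hstack([np.diag(x) / 2, -np.eye(4)])           # 4 x 8
assert np.allclose(L @ C, np.eye(4))
```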

Table 6. Computational results for Example 6.6

                 w/o LME                  with LME
order k    lower bound      time     lower bound    time
   3       −∞                1.1377    3.5480        1.1765
   4       −6.6913·10⁴       4.7677    4.0000        3.0761
   5       −21.3778         22.9970    4.0000       10.3354

Example 6.7. Consider the optimization problem

min_{x∈ℝ³}  x_1⁴x_2² + x_2⁴x_3² + x_3⁴x_1² − 3x_1²x_2²x_3² + x_2²
s.t. x_1 − x_2x_3 ≥ 0, −x_2 + x_3² ≥ 0.

The matrix polynomial L(x) = [ 1  0  0  0  0 ; −x_3  −1  0  0  0 ].

By the arithmetic-geometric mean inequality, one can show that f_min = 0. The global minimizers are (x_1, 0, x_3) with x_1 ≥ 0 and x_1x_3 = 0. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that f_k = f_min for all k ≥ 5, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

                 w/o LME                 with LME
order k    lower bound      time     lower bound    time
   3       −∞               0.6144    −∞            0.3418
   4       −1.0909·10⁷      1.0542    −3.9476       0.7180
   5       −942.6772        1.6771    −3·10⁻⁹       1.4607
   6       −0.0110          3.3532    −8·10⁻¹⁰      3.1618

Example 6.8. Consider the optimization problem

min_{x∈ℝ⁴}  x_1²(x_1 − x_4)² + x_2²(x_2 − x_4)² + x_3²(x_3 − x_4)²
            + 2x_1x_2x_3(x_1 + x_2 + x_3 − 2x_4) + (x_1 − 1)² + (x_2 − 1)² + (x_3 − 1)²
s.t. x_1 − x_2 ≥ 0, x_2 − x_3 ≥ 0.

The matrix polynomial L(x) = [ 1  0  0  0  0  0 ; 1  1  0  0  0  0 ]. In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min ≈ 0.9413 and a minimizer

(0.5632, 0.5632, 0.5632, 0.7510).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

                 w/o LME                 with LME
order k    lower bound      time     lower bound    time
   2       −∞               0.3984    −0.3360       0.9321
   3       −∞               0.7634     0.9413       0.5240
   4       −6.4896·10⁵      4.5496     0.9413       1.7192
   5       −3.1645·10³     24.3665     0.9413       8.1228

Example 6.9. Consider the optimization problem

min_{x∈ℝ⁴}  (x_1 + x_2 + x_3 + x_4 + 1)² − 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1)
s.t. 0 ≤ x_1, ..., x_4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so f_min is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value f_min = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that 4 − Σ_{i=1}^4 x_i² = Σ_{i=1}^4 ( x_i(1 − x_i)² + (1 − x_i)(1 + x_i²) ) ∈ IQ(c_eq, c_in).


Table 9. Computational results for Example 6.9

                 w/o LME                with LME
order k    lower bound    time     lower bound    time
   2       −0.0279       0.2262    −5·10⁻⁶        1.1835
   3       −0.0005       0.4691    −6·10⁻⁷        1.6566
   4       −0.0001       3.1098    −2·10⁻⁷        5.5234
   5       −4·10⁻⁵      16.5092    −6·10⁻⁷       19.7320

For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem (x_0 := 1)

min_{x∈ℝⁿ}  Σ_{0≤i≤j≤k≤n} c_{ijk} x_i x_j x_k
s.t. 0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have f_c = f_min. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

n            9        10       11       12        13        14
w/o LME    1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
with LME   1.9714   3.8288   8.2519   20.0310   37.6373   82.4778

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value f_min is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k} = Ideal(c_eq, φ)_{2k} + Qmod(c_in, ψ)_{2k}.

If we replace the quadratic module Qmod(c_in, ψ) by the preordering of (c_in, ψ) [21, 25], we can get further tighter relaxations. For convenience, write (c_in, ψ) as a single tuple (g_1, ..., g_ℓ). Its preordering is the set

Preord(c_in, ψ) := Σ_{r_1,...,r_ℓ ∈ {0,1}} g_1^{r_1} ··· g_ℓ^{r_ℓ} · Σ[x].


The truncation Preord(c_in, ψ)_{2k} is defined similarly to Qmod(c_in, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)   f′_{pre,k} := min ⟨f, y⟩
        s.t.  ⟨1, y⟩ = 1,  L^(k)_{c_eq}(y) = 0,  L^(k)_φ(y) = 0,
              L^(k)_{g_1^{r_1}···g_ℓ^{r_ℓ}}(y) ⪰ 0  for all r_1, ..., r_ℓ ∈ {0, 1},
              y ∈ ℝ^{N^n_{2k}}.

Similar to (3.9), the dual optimization problem of the above is

(7.2)   f_{pre,k} := max γ   s.t.   f − γ ∈ Ideal(c_eq, φ)_{2k} + Preord(c_in, ψ)_{2k}.

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose K_c ≠ ∅ and Assumption 3.1 holds. Then

f_{pre,k} = f′_{pre,k} = f_c

for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_{pre,k} = f′_{pre,k} = f_min for all k big enough.

Proof. The proof is very similar to Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

K_3 := { x ∈ ℝⁿ | c_eq(x) = 0, φ(x) = 0, c_in(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(c_in, ψ) such that f^{2ℓ} + q ∈ Ideal(c_eq, φ). The rest of the proof is the same. □

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence the Lagrange multiplier vector λ can be expressed as in (5.3). However, if c is not nonsingular, then such an L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, a degree bound for L(x) can be obtained: in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is used at each step, a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting direction for future work is to get a useful degree bound for L(x).


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

λ = ( C(x)ᵀC(x) )⁻¹ C(x)ᵀ [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
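The rational formula can be computed symbolically; the sketch below uses the constraints of Example 5.3 with n = 2 and an objective made up for illustration, and compares it on the critical set against the polynomial expression λ = [−(1/2)diag(x), I_2]·[∇f; 0] from that example:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**4 + x2**4 + x1*x2           # an objective made up for illustration
c = [1 - x1**2, 1 - x2**2]          # constraints of Example 5.3, n = 2

C = sp.Matrix([[-2*x1, 0], [0, -2*x2], [c[0], 0], [0, c[1]]])   # C(x), 4 x 2
rhs = sp.Matrix([sp.diff(f, x1), sp.diff(f, x2), 0, 0])

# Rational representation: least-squares solution of C(x) lam = [grad f; 0].
lam_rat = sp.simplify((C.T * C).inv() * C.T * rhs)
num, den = sp.fraction(sp.together(lam_rat[0]))
assert sp.degree(den, x1) == 4      # the denominator already has degree 4 in x1

# Polynomial representation from Example 5.3: lam_i = -(x_i/2) * (grad f)_i.
lam_poly = sp.Matrix([-x1/2, -x2/2]).multiply_elementwise(rhs[:2, 0])
# The two agree on the critical set; e.g., where the first constraint is active:
assert sp.simplify((lam_rat[0] - lam_poly[0]).subs(x1, 1)) == 0
```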

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on noncompact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: Selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167–191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634–1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: A guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: A MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093

E-mail address: njw@math.ucsd.edu



a block diagonal matrix. For the polynomial tuples $h, g$ as above, the set

(2.7) $\mathscr{S}(h, g)_{2k} = \Big\{ y \in \mathbb{R}^{\mathbb{N}^n_{2k}} \;\Big|\; L^{(k)}_h(y) = 0, \ L^{(k)}_g(y) \succeq 0 \Big\}$

is a spectrahedral cone in $\mathbb{R}^{\mathbb{N}^n_{2k}}$. The set $IQ(h, g)_{2k}$ is also a convex cone in $\mathbb{R}[x]_{2k}$. The dual cone of $IQ(h, g)_{2k}$ is precisely $\mathscr{S}(h, g)_{2k}$ [22, 25, 34]. This is because $\langle p, y \rangle \ge 0$ for all $p \in IQ(h, g)_{2k}$ and for all $y \in \mathscr{S}(h, g)_{2k}$.

3. The construction of tight relaxations

Consider the polynomial optimization problem (1.1). Let

$\lambda = (\lambda_i)_{i \in \mathcal{E} \cup \mathcal{I}}$

be the vector of Lagrange multipliers. Denote the set

(3.1) $K = \left\{ (x, \lambda) \in \mathbb{R}^n \times \mathbb{R}^{\mathcal{E} \cup \mathcal{I}} \;\middle|\; \begin{array}{l} c_i(x) = 0 \ (i \in \mathcal{E}), \quad \lambda_j c_j(x) = 0 \ (j \in \mathcal{I}), \\ \nabla f(x) = \sum_{i \in \mathcal{E} \cup \mathcal{I}} \lambda_i \nabla c_i(x) \end{array} \right\}.$

Each point in $K$ is called a critical pair. The projection

(3.2) $K_c = \{ u \mid (u, \lambda) \in K \}$

is the set of all real critical points. To construct tight relaxations for solving (1.1), we need the following assumption for Lagrange multipliers.

Assumption 3.1. For each $i \in \mathcal{E} \cup \mathcal{I}$, there exists a polynomial $p_i \in \mathbb{R}[x]$ such that for all $(x, \lambda) \in K$ it holds that

$\lambda_i = p_i(x).$

Assumption 3.1 is generically satisfied, as shown in Proposition 5.7. For the following special cases, we can get the polynomials $p_i$ explicitly.

• (Simplex) For the simplex $\{ e^T x - 1 = 0, \ x_1 \ge 0, \ldots, x_n \ge 0 \}$, it corresponds to that $\mathcal{E} = \{0\}$, $\mathcal{I} = [n]$, $c_0(x) = e^T x - 1$, $c_j(x) = x_j$ $(j \in [n])$. The Lagrange multipliers can be expressed as

(3.3) $\lambda_0 = x^T \nabla f(x), \qquad \lambda_j = f_{x_j} - x^T \nabla f(x) \ (j \in [n]).$

• (Hypercube) For the hypercube $[-1, 1]^n$, it corresponds to that $\mathcal{E} = \emptyset$, $\mathcal{I} = [n]$ and each $c_j(x) = 1 - x_j^2$. We can show that

(3.4) $\lambda_j = -\frac{1}{2} x_j f_{x_j} \quad (j \in [n]).$

• (Ball or sphere) The constraint is $1 - x^T x = 0$ or $1 - x^T x \ge 0$. It corresponds to that $\mathcal{E} \cup \mathcal{I} = \{1\}$ and $c_1 = 1 - x^T x$. We have

(3.5) $\lambda_1 = -\frac{1}{2} x^T \nabla f(x).$

• (Triangular constraints) Suppose $\mathcal{E} \cup \mathcal{I} = \{1, \ldots, m\}$ and each

$c_i(x) = \tau_i x_i + q_i(x_{i+1}, \ldots, x_n)$

for some polynomials $q_i \in \mathbb{R}[x_{i+1}, \ldots, x_n]$ and scalars $\tau_i \ne 0$. The matrix $T(x)$, consisting of the first $m$ rows of $[\nabla c_1(x) \ \cdots \ \nabla c_m(x)]$, is an invertible lower triangular matrix with constant diagonal entries. Then

$\lambda = T(x)^{-1} \cdot \begin{bmatrix} f_{x_1} & \cdots & f_{x_m} \end{bmatrix}^T.$

Note that the inverse $T(x)^{-1}$ is a matrix polynomial.
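These closed-form expressions are easy to check numerically. Below is a small pure-Python sketch (the linear objective $f(x) = x_1 + 2x_2$ is an illustrative choice, not from the paper) evaluating the simplex formula (3.3) and the hypercube formula (3.4) at the obvious vertex minimizers, where they reproduce the KKT multipliers.

```python
def simplex_multipliers(x, grad_f):
    # formula (3.3): lam_0 = x^T grad f, lam_j = f_{x_j} - x^T grad f
    xg = sum(a * b for a, b in zip(x, grad_f))
    return [xg] + [g - xg for g in grad_f]

def hypercube_multipliers(x, grad_f):
    # formula (3.4): lam_j = -(1/2) x_j f_{x_j}
    return [-0.5 * xj * gj for xj, gj in zip(x, grad_f)]

g = [1.0, 2.0]   # gradient of the illustrative objective f(x) = x1 + 2*x2
# minimizer of f over the simplex is (1, 0); over [-1,1]^2 it is (-1, -1)
print(simplex_multipliers([1.0, 0.0], g))      # [1.0, 0.0, 1.0]
print(hypercube_multipliers([-1.0, -1.0], g))  # [0.5, 1.0]
```

In both cases the returned multipliers are nonnegative and satisfy complementarity, as the KKT conditions require at these vertices.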


For more general constraints, we can also express $\lambda$ as a polynomial function in $x$ on the set $K_c$. This will be discussed in §4 and §5.

For the polynomials $p_i$ as in Assumption 3.1, denote

(3.6) $\phi := \Big( \nabla f - \sum_{i \in \mathcal{E} \cup \mathcal{I}} p_i \nabla c_i, \ \big( p_j c_j \big)_{j \in \mathcal{I}} \Big), \qquad \psi := (p_j)_{j \in \mathcal{I}}.$

When the minimum value $f_{min}$ of (1.1) is achieved at a critical point, (1.1) is equivalent to the problem

(3.7) $f_c := \min \ f(x) \quad \text{s.t.} \quad c_{eq}(x) = 0, \ c_{in}(x) \ge 0, \ \phi(x) = 0, \ \psi(x) \ge 0.$

We apply Lasserre relaxations to solve it. For an integer $k > 0$ (called the relaxation order), the $k$th order Lasserre's relaxation for (3.7) is

(3.8) $f'_k := \min \ \langle f, y \rangle \quad \text{s.t.} \quad \langle 1, y \rangle = 1, \ M_k(y) \succeq 0, \ L^{(k)}_{c_{eq}}(y) = 0, \ L^{(k)}_{c_{in}}(y) \succeq 0, \ L^{(k)}_{\phi}(y) = 0, \ L^{(k)}_{\psi}(y) \succeq 0, \ y \in \mathbb{R}^{\mathbb{N}^n_{2k}}.$

Since $x^0 = 1$ (the constant one polynomial), the condition $\langle 1, y \rangle = 1$ means that $(y)_0 = 1$. The dual optimization problem of (3.8) is

(3.9) $f_k := \max \ \gamma \quad \text{s.t.} \quad f - \gamma \in IQ(c_{eq}, c_{in})_{2k} + IQ(\phi, \psi)_{2k}.$

We refer to §2 for the notation used in (3.8)-(3.9). They are equivalent to semidefinite programs (SDPs), so they can be solved by SDP solvers (e.g., SeDuMi [40]). For $k = 1, 2, \ldots$, we get a hierarchy of Lasserre relaxations. In (3.8)-(3.9), if we remove the usage of $\phi$ and $\psi$, they reduce to the standard Lasserre relaxations in [17]. So (3.8)-(3.9) are stronger relaxations.
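For readers who want to see the data structure behind (3.8): the variable $y$ is a truncated moment sequence indexed by monomial exponents, and the moment matrix $M_k(y)$ has entry $y_{\alpha+\beta}$ at position $(\alpha, \beta)$. The sketch below (assumptions: dense monomial basis, no SDP solver invoked) only illustrates this indexing, using the moments of a Dirac measure as the sample $y$.

```python
from itertools import product

def monomials(n, k):
    # exponent tuples alpha in N^n with |alpha| <= k, ordered by degree
    return sorted((a for a in product(range(k + 1), repeat=n)
                   if sum(a) <= k), key=lambda a: (sum(a), a))

def moment_matrix(y, n, k):
    # entry at row alpha, column beta is the moment y_{alpha + beta}
    basis = monomials(n, k)
    return [[y[tuple(p + q for p, q in zip(a, b))] for b in basis]
            for a in basis]

# moments of the Dirac measure at u = (2, 3): y_alpha = u^alpha
u = (2.0, 3.0)
y = {a: (u[0] ** a[0]) * (u[1] ** a[1]) for a in monomials(2, 2)}
M1 = moment_matrix(y, 2, 1)
print(M1)   # [[1.0, 3.0, 2.0], [3.0, 9.0, 6.0], [2.0, 6.0, 4.0]]
```

For a measure supported at one point, $M_k(y)$ is the rank-one matrix $[u]_k [u]_k^T$; an SDP solver searches over all $y$ with $M_k(y) \succeq 0$ subject to the localizing constraints.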

By the construction of $\phi$ as in (3.6), Assumption 3.1 implies that

$K_c = \{ u \in \mathbb{R}^n \mid c_{eq}(u) = 0, \ \phi(u) = 0 \}.$

By Lemma 3.3 of [6], $f$ achieves only finitely many values on $K_c$, say,

(3.10) $v_1 < \cdots < v_N.$

A point $u \in K_c$ might not be feasible for (3.7), i.e., it is possible that $c_{in}(u) \not\ge 0$ or $\psi(u) \not\ge 0$. In applications, we are often interested in the optimal value $f_c$ of (3.7). When (3.7) is infeasible, by convention, we set

$f_c = +\infty.$

When the optimal value $f_{min}$ of (1.1) is achieved at a critical point, $f_c = f_{min}$. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., for each $\ell$, the sublevel set $\{ f(x) \le \ell \}$ is compact), and the constraint qualification condition holds. As in [17], one can show that

(3.11) $f_k \le f'_k \le f_c$

for all $k$. Moreover, $f_k$ and $f'_k$ are both monotonically increasing. If for some order $k$ it occurs that

$f_k = f'_k = f_c,$

then the $k$th order Lasserre's relaxation is said to be tight (or exact).


3.1. Tightness of the relaxations. Let $c_{in}, \psi, K_c, f_c$ be as above. We refer to §2 for the notation $\mathrm{Qmod}(c_{in}, \psi)$. We begin with a general assumption.

Assumption 3.2. There exists $\rho \in \mathrm{Qmod}(c_{in}, \psi)$ such that if $u \in K_c$ and $f(u) < f_c$, then $\rho(u) < 0$.

In Assumption 3.2, the hypersurface $\rho(x) = 0$ separates feasible and infeasible critical points. Clearly, if $u \in K_c$ is a feasible point for (3.7), then $c_{in}(u) \ge 0$ and $\psi(u) \ge 0$, and hence $\rho(u) \ge 0$. Assumption 3.2 generally holds. For instance, it is satisfied in the following general cases.

a) When there are no inequality constraints, $c_{in}$ and $\psi$ are empty tuples. Then $\mathrm{Qmod}(c_{in}, \psi) = \Sigma[x]$ and Assumption 3.2 is satisfied for $\rho = 0$.

b) Suppose the set $K_c$ is finite, say $K_c = \{u_1, \ldots, u_D\}$, and

$f(u_1), \ldots, f(u_{t-1}) < f_c \le f(u_t), \ldots, f(u_D).$

Let $\ell_1, \ldots, \ell_D$ be real interpolating polynomials such that $\ell_i(u_j) = 1$ for $i = j$ and $\ell_i(u_j) = 0$ for $i \ne j$. For each $i = 1, \ldots, t-1$, there must exist $j_i \in \mathcal{I}$ such that $c_{j_i}(u_i) < 0$. Then the polynomial

(3.12) $\rho = \sum_{i < t} \frac{-1}{c_{j_i}(u_i)} \big( c_{j_i}(x) \ell_i(x) \big)^2 + \sum_{i \ge t} \ell_i(x)^2$

satisfies Assumption 3.2.

c) For each $x$ with $f(x) = v_i < f_c$, at least one of the constraints $c_j(x) \ge 0$, $p_j(x) \ge 0$ $(j \in \mathcal{I})$ is violated. Suppose for each critical value $v_i < f_c$ there exists $g_i \in \{c_j, p_j\}_{j \in \mathcal{I}}$ such that

$g_i < 0 \quad \text{on } K_c \cap \{ f(x) = v_i \}.$

Let $\varphi_1, \ldots, \varphi_N$ be real univariate polynomials such that $\varphi_i(v_j) = 0$ for $i \ne j$ and $\varphi_i(v_j) = 1$ for $i = j$. Suppose $v_t = f_c$. Then the polynomial

(3.13) $\rho = \sum_{i < t} g_i(x) \big( \varphi_i(f(x)) \big)^2 + \sum_{i \ge t} \big( \varphi_i(f(x)) \big)^2$

satisfies Assumption 3.2.

We refer to §2 for the archimedean condition and the notation $IQ(h, g)$ as in (2.1). The following is about the convergence of the relaxations (3.8)-(3.9).

Theorem 3.3. Suppose $K_c \ne \emptyset$ and Assumption 3.1 holds. If

i) $IQ(c_{eq}, c_{in}) + IQ(\phi, \psi)$ is archimedean, or
ii) $IQ(c_{eq}, c_{in})$ is archimedean, or
iii) Assumption 3.2 holds,

then $f_k = f'_k = f_c$ for all $k$ sufficiently large. Therefore, if the minimum value $f_{min}$ of (1.1) is achieved at a critical point, then $f_k = f'_k = f_{min}$ for all $k$ big enough, if one of the conditions i)-iii) is satisfied.

Remark. In Theorem 3.3, the conclusion holds if any one of the conditions i)-iii) is satisfied. The condition ii) is only about the constraining polynomials of (1.1); it can be checked without $\phi, \psi$. Clearly, the condition ii) implies the condition i).

The proof of Theorem 3.3 is given in the following. The main idea is to consider the set of critical points. It can be expressed as a union of subvarieties. The objective $f$ is a constant on each one of them. We can get an SOS type representation for $f$ on each subvariety, and then construct a single one for $f$ over the entire set of critical points.

Proof of Theorem 3.3. Clearly, every point in the complex variety

$K_1 = \{ x \in \mathbb{C}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0 \}$

is a critical point. By Lemma 3.3 of [6], the objective $f$ achieves finitely many real values on $K_c = K_1 \cap \mathbb{R}^n$, say, they are $v_1 < \cdots < v_N$. Up to the shifting of a constant in $f$, we can further assume that $f_c = 0$. Clearly, $f_c$ equals one of $v_1, \ldots, v_N$, say $v_t = f_c = 0$.

Case I: Assume $IQ(c_{eq}, c_{in}) + IQ(\phi, \psi)$ is archimedean. Let

$I = \mathrm{Ideal}(c_{eq}, \phi),$

the critical ideal. Note that $K_1 = V_{\mathbb{C}}(I)$. The variety $V_{\mathbb{C}}(I)$ is a union of irreducible subvarieties, say, $V_1, \ldots, V_\ell$. If $V_i \cap \mathbb{R}^n \ne \emptyset$, then $f$ is a real constant on $V_i$, which equals one of $v_1, \ldots, v_N$. This can be implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of $V_{\mathbb{C}}(I)$:

$T_i = K_1 \cap \{ f(x) = v_i \} \quad (i = t, \ldots, N).$

Let $T_{t-1}$ be the union of the irreducible subvarieties $V_i$ such that either $V_i \cap \mathbb{R}^n = \emptyset$ or $f \equiv v_j$ on $V_i$ with $v_j < v_t = f_c$. Then it holds that

$V_{\mathbb{C}}(I) = T_{t-1} \cup T_t \cup \cdots \cup T_N.$

By the primary decomposition of $I$ [11, 41], there exist ideals $I_{t-1}, I_t, \ldots, I_N \subseteq \mathbb{R}[x]$ such that

$I = I_{t-1} \cap I_t \cap \cdots \cap I_N$

and $T_i = V_{\mathbb{C}}(I_i)$ for all $i = t-1, t, \ldots, N$. Denote the semialgebraic set

(3.14) $S = \{ x \in \mathbb{R}^n \mid c_{in}(x) \ge 0, \ \psi(x) \ge 0 \}.$

For $i = t-1$, we have $V_{\mathbb{R}}(I_{t-1}) \cap S = \emptyset$, because $v_1, \ldots, v_{t-1} < f_c$. By the Positivstellensatz [2, Corollary 4.4.3], there exists $p_0 \in \mathrm{Preord}(c_{in}, \psi)$¹ satisfying $2 + p_0 \in I_{t-1}$. Note that $1 + p_0 > 0$ on $V_{\mathbb{R}}(I_{t-1}) \cap S$. The set $I_{t-1} + \mathrm{Qmod}(c_{in}, \psi)$ is archimedean, because $I \subseteq I_{t-1}$ and

$IQ(c_{eq}, c_{in}) + IQ(\phi, \psi) \subseteq I_{t-1} + \mathrm{Qmod}(c_{in}, \psi).$

By Theorem 2.1, we have

$p_1 := 1 + p_0 \in I_{t-1} + \mathrm{Qmod}(c_{in}, \psi).$

Then $1 + p_1 \in I_{t-1}$. There exists $p_2 \in \mathrm{Qmod}(c_{in}, \psi)$ such that

$-1 \equiv p_1 \equiv p_2 \mod I_{t-1}.$

Since $f = (f/4 + 1)^2 + (-1) \cdot (f/4 - 1)^2$, we have

$f \equiv \sigma_{t-1} := (f/4 + 1)^2 + p_2 (f/4 - 1)^2 \mod I_{t-1}.$

So, when $k$ is big enough, we have $\sigma_{t-1} \in \mathrm{Qmod}(c_{in}, \psi)_{2k}$.

¹ $\mathrm{Preord}(c_{in}, \psi)$ is the preordering of the polynomial tuple $(c_{in}, \psi)$; see §7.1.


For $i = t$, $v_t = 0$ and $f(x)$ vanishes on $V_{\mathbb{C}}(I_t)$. By Hilbert's Strong Nullstellensatz [4], there exists an integer $m_t > 0$ such that $f^{m_t} \in I_t$. Define the polynomial

$s_t(\epsilon) := \sqrt{\epsilon} \sum_{j=0}^{m_t - 1} \binom{1/2}{j} \epsilon^{-j} f^j.$

Then we have that

$D_1 := \epsilon (1 + \epsilon^{-1} f) - \big( s_t(\epsilon) \big)^2 \equiv 0 \mod I_t.$

This is because, in the subtraction defining $D_1$, after expanding $\big( s_t(\epsilon) \big)^2$, all the terms $f^j$ with $j < m_t$ are cancelled, and $f^j \in I_t$ for $j \ge m_t$. So $D_1 \in I_t$. Let $\sigma_t(\epsilon) = s_t(\epsilon)^2$; then $f + \epsilon - \sigma_t(\epsilon) = D_1$ and

(3.15) $f + \epsilon - \sigma_t(\epsilon) = \sum_{j=0}^{m_t - 2} b_j(\epsilon) f^{m_t + j}$

for some real scalars $b_j(\epsilon)$ depending on $\epsilon$.

For each $i = t+1, \ldots, N$, $v_i > 0$ and $f(x)/v_i - 1$ vanishes on $V_{\mathbb{C}}(I_i)$. By Hilbert's Strong Nullstellensatz [4], there exists $0 < m_i \in \mathbb{N}$ such that $(f/v_i - 1)^{m_i} \in I_i$. Let

$s_i := \sqrt{v_i} \sum_{j=0}^{m_i - 1} \binom{1/2}{j} (f/v_i - 1)^j.$

Like for the case $i = t$, we can similarly show that $f - s_i^2 \in I_i$. Let $\sigma_i = s_i^2$; then $f - \sigma_i \in I_i$.

Note that $V_{\mathbb{C}}(I_i) \cap V_{\mathbb{C}}(I_j) = \emptyset$ for all $i \ne j$. By Lemma 3.3 of [30], there exist polynomials $a_{t-1}, \ldots, a_N \in \mathbb{R}[x]$ such that

$a_{t-1}^2 + \cdots + a_N^2 - 1 \in I, \qquad a_i \in \bigcap_{i \ne j \in \{t-1, \ldots, N\}} I_j.$

For $\epsilon > 0$, denote the polynomial

$\sigma_\epsilon := \sigma_t(\epsilon) a_t^2 + \sum_{t \ne j \in \{t-1, \ldots, N\}} (\sigma_j + \epsilon) a_j^2;$

then

$f + \epsilon - \sigma_\epsilon = (f + \epsilon) \big( 1 - a_{t-1}^2 - \cdots - a_N^2 \big) + \sum_{t \ne i \in \{t-1, \ldots, N\}} (f - \sigma_i) a_i^2 + \big( f + \epsilon - \sigma_t(\epsilon) \big) a_t^2.$

For each $i \ne t$, $f - \sigma_i \in I_i$, so

$(f - \sigma_i) a_i^2 \in \bigcap_{j=t-1}^{N} I_j = I.$

Hence, there exists $k_1 > 0$ such that

$(f - \sigma_i) a_i^2 \in I_{2k_1} \quad (t \ne i \in \{t-1, \ldots, N\}).$

Since $f + \epsilon - \sigma_t(\epsilon) \in I_t$, we also have

$\big( f + \epsilon - \sigma_t(\epsilon) \big) a_t^2 \in \bigcap_{j=t-1}^{N} I_j = I.$

Moreover, by the equation (3.15),

$\big( f + \epsilon - \sigma_t(\epsilon) \big) a_t^2 = \sum_{j=0}^{m_t - 2} b_j(\epsilon) f^{m_t + j} a_t^2.$

Each $f^{m_t + j} a_t^2 \in I$, since $f^{m_t + j} \in I_t$. So there exists $k_2 > 0$ such that, for all $\epsilon > 0$,

$\big( f + \epsilon - \sigma_t(\epsilon) \big) a_t^2 \in I_{2k_2}.$

Since $1 - a_{t-1}^2 - \cdots - a_N^2 \in I$, there also exists $k_3 > 0$ such that, for all $\epsilon > 0$,

$(f + \epsilon) \big( 1 - a_{t-1}^2 - \cdots - a_N^2 \big) \in I_{2k_3}.$

Hence, if $k^* \ge \max\{k_1, k_2, k_3\}$, then we have

$f(x) + \epsilon - \sigma_\epsilon \in I_{2k^*}$

for all $\epsilon > 0$. By the construction, the degrees of all $\sigma_i$ and $a_i$ are independent of $\epsilon$. So $\sigma_\epsilon \in \mathrm{Qmod}(c_{in}, \psi)_{2k^*}$ for all $\epsilon > 0$, if $k^*$ is big enough. Note that

$I_{2k^*} + \mathrm{Qmod}(c_{in}, \psi)_{2k^*} = IQ(c_{eq}, c_{in})_{2k^*} + IQ(\phi, \psi)_{2k^*}.$

This implies that $f_{k^*} \ge f_c - \epsilon$ for all $\epsilon > 0$. On the other hand, we always have $f_{k^*} \le f_c$. So $f_{k^*} = f_c$. Moreover, since $f_k$ is monotonically increasing, we must have $f_k = f_c$ for all $k \ge k^*$.

Case II: Assume $IQ(c_{eq}, c_{in})$ is archimedean. Because

$IQ(c_{eq}, c_{in}) \subseteq IQ(c_{eq}, c_{in}) + IQ(\phi, \psi),$

the set $IQ(c_{eq}, c_{in}) + IQ(\phi, \psi)$ is also archimedean. Therefore, the conclusion is also true, by applying the result for Case I.

Case III: Suppose Assumption 3.2 holds. Let $\varphi_1, \ldots, \varphi_N$ be real univariate polynomials such that $\varphi_i(v_j) = 0$ for $i \ne j$ and $\varphi_i(v_j) = 1$ for $i = j$. Let

$s := s_t + \cdots + s_N, \quad \text{where each } s_i = (v_i - f_c) \big( \varphi_i(f) \big)^2.$

Then $s \in \Sigma[x]_{2k_4}$ for some integer $k_4 > 0$. Let

$\hat{f} := f - f_c - s.$

We show that there exist an integer $\ell > 0$ and $q \in \mathrm{Qmod}(c_{in}, \psi)$ such that

$\hat{f}^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi).$

This is because, by Assumption 3.2, $\hat{f}(x) \equiv 0$ on the set

$K_2 := \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0 \}.$

It has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist $0 < \ell \in \mathbb{N}$ and $q = b_0 + \rho b_1$ ($b_0, b_1 \in \Sigma[x]$) such that $\hat{f}^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi)$. By Assumption 3.2, $\rho \in \mathrm{Qmod}(c_{in}, \psi)$, so we have $q \in \mathrm{Qmod}(c_{in}, \psi)$.

For all $\epsilon > 0$ and $\tau > 0$, we have $\hat{f} + \epsilon = \phi_\epsilon + \theta_\epsilon$, where

$\phi_\epsilon := -\tau \epsilon^{1-2\ell} \big( \hat{f}^{2\ell} + q \big), \qquad \theta_\epsilon := \epsilon \Big( 1 + \hat{f}/\epsilon + \tau \big( \hat{f}/\epsilon \big)^{2\ell} \Big) + \tau \epsilon^{1-2\ell} q.$

By Lemma 2.1 of [31], when $\tau \ge \frac{1}{2\ell}$, there exists $k_5$ such that, for all $\epsilon > 0$,

$\phi_\epsilon \in \mathrm{Ideal}(c_{eq}, \phi)_{2k_5}, \qquad \theta_\epsilon \in \mathrm{Qmod}(c_{in}, \psi)_{2k_5}.$

Hence, we can get

$f - (f_c - \epsilon) = \phi_\epsilon + \sigma_\epsilon,$

where $\sigma_\epsilon := \theta_\epsilon + s \in \mathrm{Qmod}(c_{in}, \psi)_{2k_5}$ for all $\epsilon > 0$. Note that

$IQ(c_{eq}, c_{in})_{2k_5} + IQ(\phi, \psi)_{2k_5} = \mathrm{Ideal}(c_{eq}, \phi)_{2k_5} + \mathrm{Qmod}(c_{in}, \psi)_{2k_5}.$

For all $\epsilon > 0$, $\gamma = f_c - \epsilon$ is feasible in (3.9) for the order $k_5$, so $f_{k_5} \ge f_c$. Because of (3.11) and the monotonicity of $f_k$, we have $f_k = f'_k = f_c$ for all $k \ge k_5$. $\square$

3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is $f_c$, and the optimal value of (1.1) is $f_{min}$. If $f_{min}$ is achievable at a critical point, then $f_c = f_{min}$. In Theorem 3.3, we have shown that $f_k = f_c$ for all $k$ big enough, where $f_k$ is the optimal value of (3.9). The value $f_c$ or $f_{min}$ is often not known. How do we detect the tightness $f_k = f_c$ in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose $y^*$ is a minimizer of (3.8) for the order $k$. Let

(3.16) $d := \lceil \deg(c_{eq}, c_{in}, \phi, \psi)/2 \rceil.$

If there exists an integer $t \in [d, k]$ such that

(3.17) $\mathrm{rank}\, M_t(y^*) = \mathrm{rank}\, M_{t-d}(y^*),$

then $f_k = f_c$ and we can get $r := \mathrm{rank}\, M_t(y^*)$ minimizers for (3.7) [5, 14, 32]. The method in [14] can be used to extract minimizers. It was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints $c_{in} \ge 0$, $\psi \ge 0$) can also be detected by solving the relaxations (3.8)-(3.9).
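As a toy illustration of the rank condition (this sketch is the present writer's, not the paper's algorithm): for the truncated moment sequence of a Dirac measure, $[u]_{2k}$, every truncated moment matrix has rank one, so $\mathrm{rank}\, M_t = \mathrm{rank}\, M_{t-d}$ holds trivially and exactly one atom $u$ can be recovered.

```python
from itertools import product
from math import prod

def monomials(n, k):
    return sorted((a for a in product(range(k + 1), repeat=n)
                   if sum(a) <= k), key=lambda a: (sum(a), a))

def dirac_moment_matrix(u, t):
    # M_t(y) for the Dirac tms y = [u]_{2t}: entry (a, b) = u^(a+b)
    basis = monomials(len(u), t)
    val = lambda a: prod(ui ** ai for ui, ai in zip(u, a))
    return [[val(tuple(p + q for p, q in zip(a, b))) for b in basis]
            for a in basis]

def rank(M, tol=1e-9):
    # numerical rank via Gaussian elimination with partial pivoting
    M = [row[:] for row in M]
    r = 0
    for col in range(len(M[0])):
        piv = max(range(r, len(M)), key=lambda i: abs(M[i][col]),
                  default=None)
        if piv is None or abs(M[piv][col]) < tol:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

u = (0.5, -1.0)
r2, r1 = rank(dirac_moment_matrix(u, 2)), rank(dirac_moment_matrix(u, 1))
print(r2, r1)   # 1 1: the ranks agree, as in condition (3.17), with r = 1
```

In practice the $y^*$ returned by an SDP solver is only approximately flat, so the rank is evaluated with a numerical tolerance, as above.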

Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order $k$, then no critical points satisfy the constraints $c_{in} \ge 0$, $\psi \ge 0$, i.e., (3.7) is infeasible.

ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order $k$ is big enough.

In the following, assume (3.7) is feasible (i.e., $f_c < +\infty$). Then, for all $k$ big enough, (3.8) has a minimizer $y^*$. Moreover:

iii) If (3.17) is satisfied for some $t \in [d, k]$, then $f_k = f_c$.

iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer $y^*$ of (3.8) must satisfy (3.17), for some $t \in [d, k]$, when $k$ is big enough.

Proof. By Assumption 3.1, $u$ is a critical point if and only if $c_{eq}(u) = 0$, $\phi(u) = 0$.

i) For every feasible point $u$ of (3.7), the tms $[u]_{2k}$ (see §2 for the notation) is feasible for (3.8), for all $k$. Therefore, if (3.8) is infeasible for some $k$, then (3.7) must be infeasible.

ii) By Assumption 3.2, when (3.7) is infeasible, the set

$\{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0 \}$

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that $-1 \in \mathrm{Ideal}(c_{eq}, \phi) + \mathrm{Qmod}(\rho)$. By Assumption 3.2,

$\mathrm{Ideal}(c_{eq}, \phi) + \mathrm{Qmod}(\rho) \subseteq IQ(c_{eq}, c_{in}) + IQ(\phi, \psi).$

Thus, for all $k$ big enough, (3.9) is unbounded from above. Hence (3.8) must be infeasible, by weak duality.

When (3.7) is feasible, $f$ achieves finitely many values on $K_c$, so (3.7) must achieve its optimal value $f_c$. By Theorem 3.3, we know that $f_k = f'_k = f_c$ for all $k$ big enough. For each minimizer $u^*$ of (3.7), the tms $[u^*]_{2k}$ is a minimizer of (3.8).


iii) If (3.17) holds, then we can get $r = \mathrm{rank}\, M_t(y^*)$ minimizers for (3.7) [5, 14], say, $u_1, \ldots, u_r$, such that $f_k = f(u_i)$ for each $i$. Clearly, $f_k = f(u_i) \ge f_c$. On the other hand, we always have $f_k \le f_c$. So $f_k = f_c$.

iv) By Assumption 3.2, (3.7) is equivalent to the problem

(3.18) $\min \ f(x) \quad \text{s.t.} \quad c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0.$

The optimal value of (3.18) is also $f_c$. Its $k$th order Lasserre's relaxation is

(3.19) $\gamma'_k := \min \ \langle f, y \rangle \quad \text{s.t.} \quad \langle 1, y \rangle = 1, \ M_k(y) \succeq 0, \ L^{(k)}_{c_{eq}}(y) = 0, \ L^{(k)}_{\phi}(y) = 0, \ L^{(k)}_{\rho}(y) \succeq 0.$

Its dual optimization problem is

(3.20) $\gamma_k := \max \ \gamma \quad \text{s.t.} \quad f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(\rho)_{2k}.$

By repeating the same proof as for Theorem 3.3(iii), we can show that

$\gamma_k = \gamma'_k = f_c$

for all $k$ big enough. Because $\rho \in \mathrm{Qmod}(c_{in}, \psi)$, each $y$ feasible for (3.8) is also feasible for (3.19). So, when $k$ is big, each $y^*$ is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some $t \in [d, k]$, when $k$ is big enough. $\square$

If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].

4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron

$P := \{ x \in \mathbb{R}^n \mid Ax - b \ge 0 \},$

where $A = \begin{bmatrix} a_1 & \cdots & a_m \end{bmatrix}^T \in \mathbb{R}^{m \times n}$, $b = \begin{bmatrix} b_1 & \cdots & b_m \end{bmatrix}^T \in \mathbb{R}^m$. This corresponds to that $\mathcal{E} = \emptyset$, $\mathcal{I} = [m]$ and each $c_i(x) = a_i^T x - b_i$. Denote

(4.1) $D(x) := \mathrm{diag}\big( c_1(x), \ldots, c_m(x) \big), \qquad C(x) := \begin{bmatrix} A^T \\ D(x) \end{bmatrix}.$

The Lagrange multiplier vector $\lambda = \begin{bmatrix} \lambda_1 & \cdots & \lambda_m \end{bmatrix}^T$ satisfies

(4.2) $\begin{bmatrix} A^T \\ D(x) \end{bmatrix} \lambda = \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}.$

If $\mathrm{rank}\, A = m$, we can express $\lambda$ as

(4.3) $\lambda = (A A^T)^{-1} A \nabla f(x).$

If $\mathrm{rank}\, A < m$, how can we express $\lambda$ in terms of $x$? In computation, we often prefer a polynomial expression. If there exists $L(x) \in \mathbb{R}[x]^{m \times (n+m)}$ such that

(4.4) $L(x) C(x) = I_m,$

then we can get

$\lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x) \nabla f(x),$

where $L_1(x)$ consists of the first $n$ columns of $L(x)$. In this section, we characterize when such $L(x)$ exists, and we give a degree bound for it.

The linear function $Ax - b$ is said to be nonsingular if $\mathrm{rank}\, C(u) = m$ for all $u \in \mathbb{C}^n$ (also see Definition 5.1). This is equivalent to that, for every $u$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then $a_{i_1}, \ldots, a_{i_k}$ are linearly independent.

Proposition 4.1. The linear function $Ax - b$ is nonsingular if and only if there exists a matrix polynomial $L(x)$ satisfying (4.4). Moreover, when $Ax - b$ is nonsingular, we can choose $L(x)$ in (4.4) with $\deg(L) \le m - \mathrm{rank}\, A$.

Proof. Clearly, if (4.4) is satisfied by some $L(x)$, then $\mathrm{rank}\, C(u) \ge m$ for all $u$. This implies that $Ax - b$ is nonsingular.

Next, assume that $Ax - b$ is nonsingular. We show that (4.4) is satisfied by some $L(x) \in \mathbb{R}[x]^{m \times (n+m)}$ with degree $\le m - \mathrm{rank}\, A$. Let $r = \mathrm{rank}\, A$. Up to a linear coordinate transformation, we can reduce $x$ to an $r$-dimensional variable. Without loss of generality, we can assume that $\mathrm{rank}\, A = n$ and $m \ge n$.

For a subset $I = \{i_1, \ldots, i_{m-n}\}$ of $[m]$, denote

$c_I(x) := \prod_{i \in I} c_i(x), \qquad E_I(x) := c_I(x) \cdot \mathrm{diag}\big( c_{i_1}(x)^{-1}, \ldots, c_{i_{m-n}}(x)^{-1} \big),$

$D_I(x) := \mathrm{diag}\big( c_{i_1}(x), \ldots, c_{i_{m-n}}(x) \big), \qquad A_I := \begin{bmatrix} a_{i_1} & \cdots & a_{i_{m-n}} \end{bmatrix}^T.$

For the case that $I = \emptyset$ (the empty set), we set $c_\emptyset(x) = 1$. Let

$V := \big\{ I \subseteq [m] \mid |I| = m - n, \ \mathrm{rank}\, A_{[m] \setminus I} = n \big\}.$

Step I: For each $I \in V$, we construct a matrix polynomial $L_I(x)$ such that

(4.5) $L_I(x) C(x) = c_I(x) I_m.$

The matrix $L_I = L_I(x)$ satisfying (4.5) can be given by the following $2 \times 3$ block matrix ($L_I(J, K)$ denotes the submatrix whose row indices are from $J$ and whose column indices are from $K$):

(4.6) $\begin{array}{c|ccc} J \backslash K & [n] & n+I & n+[m]\setminus I \\ \hline I & 0 & E_I(x) & 0 \\ [m]\setminus I & c_I(x) \cdot \big( A_{[m]\setminus I} \big)^{-T} & -\big( A_{[m]\setminus I} \big)^{-T} \big( A_I \big)^T E_I(x) & 0 \end{array}$

Equivalently, the blocks of $L_I$ are

$L_I\big( I, [n] \big) = 0, \qquad L_I\big( I, n+[m]\setminus I \big) = 0, \qquad L_I\big( [m]\setminus I, n+[m]\setminus I \big) = 0,$

$L_I\big( I, n+I \big) = E_I(x), \qquad L_I\big( [m]\setminus I, [n] \big) = c_I(x) \big( A_{[m]\setminus I} \big)^{-T},$

$L_I\big( [m]\setminus I, n+I \big) = -\big( A_{[m]\setminus I} \big)^{-T} \big( A_I \big)^T E_I(x).$

For each $I \in V$, $A_{[m]\setminus I}$ is invertible. The superscript $-T$ denotes the inverse of the transpose. Let $G = L_I(x) C(x)$; then one can verify that

$G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m-n}, \qquad G(I, [m]\setminus I) = 0,$

$G([m]\setminus I, [m]\setminus I) = \Big[ c_I(x) \big( A_{[m]\setminus I} \big)^{-T} \quad -\big( A_{[m]\setminus I} \big)^{-T} A_I^T E_I(x) \Big] \begin{bmatrix} \big( A_{[m]\setminus I} \big)^T \\ 0 \end{bmatrix} = c_I(x) I_n,$

$G([m]\setminus I, I) = \Big[ c_I(x) \big( A_{[m]\setminus I} \big)^{-T} \quad -\big( A_{[m]\setminus I} \big)^{-T} \big( A_I \big)^T E_I(x) \Big] \begin{bmatrix} A_I^T \\ D_I(x) \end{bmatrix} = 0.$

This shows that the above $L_I(x)$ satisfies (4.5).


Step II: We show that there exist real scalars $\nu_I$ satisfying

(4.7) $\sum_{I \in V} \nu_I c_I(x) = 1.$

This can be shown by induction on $m$.

• When $m = n$, $V = \{\emptyset\}$ and $c_\emptyset(x) = 1$, so (4.7) is clearly true.

• When $m > n$, let

(4.8) $N := \{ i \in [m] \mid \mathrm{rank}\, A_{[m]\setminus\{i\}} = n \}.$

For each $i \in N$, let $V_i$ be the set of all $I' \subseteq [m]\setminus\{i\}$ such that $|I'| = m - n - 1$ and $\mathrm{rank}\, A_{[m]\setminus(I' \cup \{i\})} = n$. For each $i \in N$, by the assumption, the linear function $A_{[m]\setminus\{i\}} x - b_{[m]\setminus\{i\}}$ is nonsingular. By induction, there exist real scalars $\nu^{(i)}_{I'}$ satisfying

(4.9) $\sum_{I' \in V_i} \nu^{(i)}_{I'} c_{I'}(x) = 1.$

Since $\mathrm{rank}\, A = n$, we can generally assume that $\{a_1, \ldots, a_n\}$ is linearly independent. So there exist scalars $\alpha_1, \ldots, \alpha_n$ such that

$a_m = \alpha_1 a_1 + \cdots + \alpha_n a_n.$

If all $\alpha_i = 0$, then $a_m = 0$ and hence $A$ can be replaced by its first $m - 1$ rows. So (4.7) is true by the induction. In the following, suppose at least one $\alpha_i \ne 0$, and write

$\{ i \mid \alpha_i \ne 0 \} = \{ i_1, \ldots, i_k \}.$

Then $a_{i_1}, \ldots, a_{i_k}, a_m$ are linearly dependent. For convenience, set $i_{k+1} = m$. Since $Ax - b$ is nonsingular, the linear system

$c_{i_1}(x) = \cdots = c_{i_k}(x) = c_{i_{k+1}}(x) = 0$

has no solutions. Hence, there exist real scalars $\mu_1, \ldots, \mu_{k+1}$ such that

$\mu_1 c_{i_1}(x) + \cdots + \mu_k c_{i_k}(x) + \mu_{k+1} c_{i_{k+1}}(x) = 1.$

The above can be implied by the echelon form for inconsistent linear systems. Note that $i_1, \ldots, i_{k+1} \in N$. For each $j = 1, \ldots, k+1$, by (4.9),

$\sum_{I' \in V_{i_j}} \nu^{(i_j)}_{I'} c_{I'}(x) = 1.$

Then we can get

$1 = \sum_{j=1}^{k+1} \mu_j c_{i_j}(x) = \sum_{j=1}^{k+1} \mu_j \sum_{I' \in V_{i_j}} \nu^{(i_j)}_{I'} c_{i_j}(x) c_{I'}(x) = \sum_{\substack{I = I' \cup \{i_j\} \\ I' \in V_{i_j}, \ 1 \le j \le k+1}} \nu^{(i_j)}_{I'} \mu_j c_I(x).$

Since each $I' \cup \{i_j\} \in V$, (4.7) must be satisfied by some scalars $\nu_I$.


Step III: For $L_I(x)$ as in (4.5), we construct $L(x)$ as

(4.10) $L(x) := \sum_{I \in V} \nu_I L_I(x).$

Clearly, $L(x)$ satisfies (4.4), because

$L(x) C(x) = \sum_{I \in V} \nu_I L_I(x) C(x) = \sum_{I \in V} \nu_I c_I(x) I_m = I_m.$

Each $L_I(x)$ has degree $\le m - n$, so $L(x)$ has degree $\le m - n$. $\square$

Proposition 4.1 characterizes when there exists $L(x)$ satisfying (4.4). When it does, a degree bound for $L(x)$ is $m - \mathrm{rank}\, A$. Sometimes, its degree can be smaller than that, as shown in Example 4.3. For given $A, b$, the matrix polynomial $L(x)$ satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of $L(x)$ with $L(x) C(x) = I_m$, for polyhedral sets.

Example 4.2. Consider the simplicial set

$\{ x_1 \ge 0, \ \ldots, \ x_n \ge 0, \ 1 - e^T x \ge 0 \}.$

The equation $L(x) C(x) = I_{n+1}$ is satisfied by

$L(x) = \begin{bmatrix} 1 - x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \\ -x_1 & 1 - x_2 & \cdots & -x_n & 1 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\ -x_1 & -x_2 & \cdots & 1 - x_n & 1 & \cdots & 1 \\ -x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \end{bmatrix}.$

Example 4.3. Consider the box constraint

$\{ x_1 \ge 0, \ \ldots, \ x_n \ge 0, \ 1 - x_1 \ge 0, \ \ldots, \ 1 - x_n \ge 0 \}.$

The equation $L(x) C(x) = I_{2n}$ is satisfied by

$L(x) = \begin{bmatrix} I_n - \mathrm{diag}(x) & I_n & I_n \\ -\mathrm{diag}(x) & I_n & I_n \end{bmatrix}.$
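The identity in Example 4.3 can be verified mechanically. The following pure-Python sketch assembles $C(x) = [A^T; D(x)]$ for the box with $A = [I_n; -I_n]$, multiplies it by the displayed $L(x)$ at an arbitrary (even infeasible) point, and recovers $I_{2n}$; the identity holds for all $x$, not only feasible ones.

```python
def matmul(P, Q):
    # plain row-by-column product of two list-of-list matrices
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Q)]
            for row in P]

def box_identity_check(x):
    n = len(x)
    I = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    dx = [[x[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    zeros = [0.0] * n
    # C(x) = [A^T; D(x)] with A = [I_n; -I_n], so A^T = [I_n, -I_n]
    AT = [I[i] + [-v for v in I[i]] for i in range(n)]
    D_top = [dx[i] + zeros for i in range(n)]             # diag(x) | 0
    D_bot = [zeros + [(1.0 - x[i]) if i == j else 0.0     # 0 | diag(1-x)
                      for j in range(n)] for i in range(n)]
    C = AT + D_top + D_bot
    # L(x) from Example 4.3: [[I - diag(x), I, I], [-diag(x), I, I]]
    L_top = [[I[i][j] - dx[i][j] for j in range(n)] + I[i] + I[i]
             for i in range(n)]
    L_bot = [[-dx[i][j] for j in range(n)] + I[i] + I[i] for i in range(n)]
    return matmul(L_top + L_bot, C)

G = box_identity_check([0.3, -1.7])
print(G)   # the 4x4 identity matrix
```

Note this $L(x)$ has degree one, smaller than the bound $m - \mathrm{rank}\, A = n$ from Proposition 4.1.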

Example 4.4. Consider the polyhedral set

$\{ 1 - x_4 \ge 0, \ x_4 - x_3 \ge 0, \ x_3 - x_2 \ge 0, \ x_2 - x_1 \ge 0, \ x_1 + 1 \ge 0 \}.$

The equation $L(x) C(x) = I_5$ is satisfied by

$L(x) = \frac{1}{2} \begin{bmatrix} -x_1 - 1 & -x_2 - 1 & -x_3 - 1 & -x_4 - 1 & 1 & 1 & 1 & 1 & 1 \\ -x_1 - 1 & -x_2 - 1 & -x_3 - 1 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\ -x_1 - 1 & -x_2 - 1 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\ -x_1 - 1 & 1 - x_2 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\ 1 - x_1 & 1 - x_2 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \end{bmatrix}.$

Example 4.5. Consider the polyhedral set

$\{ 1 + x_1 \ge 0, \ 1 - x_1 \ge 0, \ 2 - x_1 - x_2 \ge 0, \ 2 - x_1 + x_2 \ge 0 \}.$

The matrix $L(x)$ satisfying $L(x) C(x) = I_4$ is

$\frac{1}{6} \begin{bmatrix} x_1^2 - 3x_1 + 2 & x_1 x_2 - x_2 & 4 - x_1 & 2 - x_1 & 1 - x_1 & 1 - x_1 \\ 3x_1^2 - 3x_1 - 6 & 3x_1 x_2 + 3x_2 & 6 - 3x_1 & -3x_1 & -3x_1 - 3 & -3x_1 - 3 \\ 1 - x_1^2 & -x_1 x_2 - 2x_2 - 3 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2 \\ 1 - x_1^2 & 3 - x_1 x_2 - 2x_2 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2 \end{bmatrix}.$


5 General constraints

We consider general nonlinear constraints as in (11) The critical point condi-tions are in (18) We discuss how to express Lagrange multipliers λi as polynomialfunctions in x on the set of critical points

Suppose there are totally m equality and inequality constraints ie

E cup I = 1 mIf (x λ) is a critical pair then λici(x) = 0 for all i isin E cup I So the Lagrange

multiplier vector λ =[λ1 middot middot middot λm

]Tsatisfies the equation

(51)

nablac1(x) nablac2(x) middot middot middot nablacm(x)c1(x) 0 middot middot middot 00 c2(x) 0 0

0 0 middot middot middot cm(x)

︸ ︷︷ ︸

C(x)

λ =

nablaf(x)000

Let C(x) be as above. If there exists L(x) ∈ R[x]^{m×(m+n)} such that

(5.2)    L(x) C(x) = I_m,

then we can get

(5.3)    λ = L(x) [ ∇f(x); 0 ] = L1(x) ∇f(x),

where L1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such an L(x) exists.
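For intuition, (5.1) and (5.3) can be checked numerically on a tiny instance. The sketch below (an ad hoc univariate choice, not from the paper) takes min (x−2)^2 subject to 1 − x^2 ≥ 0, whose minimizer is u = 1, solves (5.1) at u, and compares with the polynomial multiplier expression λ = −x/2 · f'(x) from Example 5.3:

```python
import numpy as np

u = 1.0                          # minimizer of (x-2)^2 over 1 - x^2 >= 0
grad_f = 2 * (u - 2)             # f'(u) = -2
C_u = np.array([[-2 * u],        # c1'(u)
                [1 - u**2]])     # c1(u) = 0 (the constraint is active)
rhs = np.array([grad_f, 0.0])
lam = np.linalg.lstsq(C_u, rhs, rcond=None)[0]   # solve (5.1) at x = u
assert np.isclose(lam[0], 1.0)
assert np.isclose(-u / 2 * grad_f, lam[0])       # matches lambda = L1(x) grad f(x)
```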

Definition 5.1. The tuple c = (c1, ..., cm) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ C^n.

Clearly, c being nonsingular is equivalent to the following: for each u ∈ C^n, if J(u) = {i1, ..., ik} (see (1.2) for the notation), then the gradients ∇c_{i1}(u), ..., ∇c_{ik}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.

Proposition 5.2. (i) For each W(x) ∈ C[x]^{s×t} with s ≥ t, rank W(u) = t for all u ∈ C^n if and only if there exists P(x) ∈ C[x]^{t×s} such that

    P(x) W(x) = I_t.

Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (5.2).

Proof. (i) "⇐": If P(x)W(x) = I_t, then for all u ∈ C^n,

    t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.

"⇒": Suppose rank W(u) = t for all u ∈ C^n. Write W(x) in columns:

    W(x) = [ w1(x)   w2(x)   ···   wt(x) ].

18 JIAWANG NIE

Then the equation w1(x) = 0 has no complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ1(x) ∈ C[x]^s such that ξ1(x)^T w1(x) = 1. For each i = 2, ..., t, denote

    r_{1i}(x) = ξ1(x)^T wi(x);

then (using ∼ to denote row equivalence between matrices)

    W(x) ∼ [ 1       r_{12}(x)   ···   r_{1t}(x) ]  ∼  W1(x) = [ 1   r_{12}(x)   ···   r_{1t}(x) ]
           [ w1(x)   w2(x)       ···   wt(x)     ]             [ 0   w2^(1)(x)   ···   wt^(1)(x) ]

where, for each i = 2, ..., t,

    wi^(1)(x) = wi(x) − r_{1i}(x) w1(x).

So there exists P1(x) ∈ C[x]^{(s+1)×s} such that

    P1(x) W(x) = W1(x).

Since W(x) and W1(x) are row equivalent, W1(x) must also have full column rank everywhere. Hence the polynomial equation

    w2^(1)(x) = 0

has no complex solution. Again by Hilbert's Weak Nullstellensatz [4], there exists ξ2(x) ∈ C[x]^s such that

    ξ2(x)^T w2^(1)(x) = 1.

For each i = 3, ..., t, let r_{2i}(x) = ξ2(x)^T wi^(1)(x); then

    W1(x) ∼ [ 1   r_{12}(x)   r_{13}(x)   ···   r_{1t}(x) ]
            [ 0   1           r_{23}(x)   ···   r_{2t}(x) ]
            [ 0   w2^(1)(x)   w3^(1)(x)   ···   wt^(1)(x) ]

    ∼ W2(x) = [ 1   r_{12}(x)   r_{13}(x)   ···   r_{1t}(x) ]
              [ 0   1           r_{23}(x)   ···   r_{2t}(x) ]
              [ 0   0           w3^(2)(x)   ···   wt^(2)(x) ]

where, for each i = 3, ..., t,

    wi^(2)(x) = wi^(1)(x) − r_{2i}(x) w2^(1)(x).

Similarly, W1(x) and W2(x) are row equivalent, so W2(x) has full column rank everywhere. There exists P2(x) ∈ C[x]^{(s+2)×(s+1)} such that

    P2(x) W1(x) = W2(x).

Continuing this process, we finally get

    W2(x) ∼ ··· ∼ Wt(x) = [ 1   r_{12}(x)   r_{13}(x)   ···   r_{1t}(x) ]
                          [ 0   1           r_{23}(x)   ···   r_{2t}(x) ]
                          [ 0   0           1           ···   r_{3t}(x) ]
                          [ ⋮   ⋮           ⋮           ⋱     ⋮         ]
                          [ 0   0           0           ···   1         ]
                          [ 0   0           0           ···   0         ]

Consequently, there exist Pi(x) ∈ C[x]^{(s+i)×(s+i−1)}, for i = 1, 2, ..., t, such that

    Pt(x) P_{t−1}(x) ··· P1(x) W(x) = Wt(x).


Since the top t×t block of Wt(x) is a unit upper triangular matrix polynomial, there exists P_{t+1}(x) ∈ R[x]^{t×(s+t)} such that P_{t+1}(x) Wt(x) = I_t. Let

    P(x) = P_{t+1}(x) Pt(x) P_{t−1}(x) ··· P1(x);

then P(x) W(x) = I_t. Note that P(x) ∈ C[x]^{t×s}.

For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (where P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial satisfying the same identity.

(ii) The conclusion is implied directly by item (i).

In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), the matrix L(x) can be determined by comparing coefficients on both sides of (5.2); this amounts to solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
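The coefficient-comparison step can be sketched as follows (Python/SymPy, not from the paper; the single constraint c1 = 1 − x^2 and the degree-1 ansatz are ad hoc choices for illustration):

```python
import sympy as sp

# Pick a degree for L(x), then match coefficients in L(x)C(x) = I_m.
# Single constraint c1 = 1 - x^2 (n = m = 1), ansatz of degree 1 per entry.
x, a0, a1, b0, b1 = sp.symbols('x a0 a1 b0 b1')
c1 = 1 - x**2
C = sp.Matrix([sp.diff(c1, x), c1])            # C(x) is 2 x 1
L = sp.Matrix([[a0 + a1 * x, b0 + b1 * x]])    # unknown 1 x 2 matrix polynomial
residual = sp.expand((L * C)[0, 0] - 1)
eqs = sp.Poly(residual, x).all_coeffs()        # linear equations in a0, a1, b0, b1
sol = sp.solve(eqs, [a0, a1, b0, b1])
assert sol[a1] == sp.Rational(-1, 2) and sol[b0] == 1 and sol[a0] == 0
```

The solution recovers L(x) = [−x/2, 1], which is Example 5.3 specialized to n = 1.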

Example 5.3. Consider the hypercube with quadratic constraints

    1 − x1^2 ≥ 0,  1 − x2^2 ≥ 0, ...,  1 − xn^2 ≥ 0.

The equation L(x)C(x) = I_n is satisfied by

    L(x) = [ −(1/2) diag(x)   I_n ].
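The identity and the resulting multiplier expressions (5.3) can be checked in a few lines (a sketch in Python/SymPy with n = 3 and the ad hoc objective f = Σ xi^4, neither of which comes from the paper):

```python
import sympy as sp

n = 3
x = sp.symbols(f'x1:{n+1}')
c = [1 - xi**2 for xi in x]
G = sp.Matrix([[sp.diff(cj, xi) for cj in c] for xi in x])   # equals -2 diag(x)
C = G.col_join(sp.diag(*c))                                  # 2n x n
L = (-sp.diag(*x) / 2).row_join(sp.eye(n))
assert (L * C).expand() == sp.eye(n)
# Multiplier expressions (5.3): lambda_i = -x_i/2 * df/dx_i
f = sum(xi**4 for xi in x)
lam = L[:, :n] * sp.Matrix([sp.diff(f, v) for v in x])
assert sp.expand(lam[0] + 2 * x[0]**4) == 0   # lambda_1 = -2 x1^4 here
```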

Example 5.4. Consider the nonnegative portion of the unit sphere:

    x1 ≥ 0,  x2 ≥ 0, ...,  xn ≥ 0,   x1^2 + ··· + xn^2 − 1 = 0.

The equation L(x)C(x) = I_{n+1} is satisfied by

    L(x) = [ I_n − x x^T   x 1_n^T        2x ]
           [ (1/2) x^T     −(1/2) 1_n^T   −1 ]
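A symbolic check of this identity (Python/SymPy sketch, not from the paper; n = 3 is arbitrary):

```python
import sympy as sp

n = 3
x = sp.Matrix(sp.symbols(f'x1:{n+1}'))
c = list(x) + [(x.T * x)[0, 0] - 1]            # m = n + 1 constraints
G = sp.Matrix([[sp.diff(cj, xi) for cj in c] for xi in x])   # n x (n+1)
C = G.col_join(sp.diag(*c))                    # (2n+1) x (n+1)
ones = sp.ones(1, n)
top = (sp.eye(n) - x * x.T).row_join(x * ones).row_join(2 * x)
bot = (x.T / 2).row_join(-ones / 2).row_join(sp.Matrix([[-1]]))
L = top.col_join(bot)
assert (L * C).expand() == sp.eye(n + 1)
```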

Example 5.5. Consider the set

    1 − x1^3 − x2^4 ≥ 0,   1 − x3^4 − x4^3 ≥ 0.

The equation L(x)C(x) = I_2 is satisfied by

    L(x) = [ −x1/3   −x2/4   0       0       1   0 ]
           [ 0       0       −x3/4   −x4/3   0   1 ]
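Again, the identity can be verified directly (Python/SymPy sketch, not from the paper):

```python
import sympy as sp

x1, x2, x3, x4 = sp.symbols('x1:5')
X = [x1, x2, x3, x4]
c = [1 - x1**3 - x2**4, 1 - x3**4 - x4**3]
G = sp.Matrix([[sp.diff(cj, xi) for cj in c] for xi in X])   # 4 x 2
C = G.col_join(sp.diag(*c))                                  # 6 x 2
L = sp.Matrix([[-x1/3, -x2/4, 0, 0, 1, 0],
               [0, 0, -x3/4, -x4/3, 0, 1]])
assert (L * C).expand() == sp.eye(2)
```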

Example 5.6. Consider the quadratic set

    1 − x1x2 − x2x3 − x1x3 ≥ 0,   1 − x1^2 − x2^2 − x3^2 ≥ 0.

The matrix L(x)^T satisfying L(x)C(x) = I_2 is

    [ 25x1^3 + 10x1^2 x2 + 40x1 x2^2 − 25x1 − 2x3 ,   −25x1^3 − 10x1^2 x2 − 40x1 x2^2 + (49/2)x1 + 2x3 ]
    [ −15x1^2 x2 + 10x1 x2^2 + 20x1 x2 x3 − 10x1 ,    15x1^2 x2 − 10x1 x2^2 − 20x1 x2 x3 + 10x1 − x2/2 ]
    [ 25x1^2 x3 − 20x1 x2^2 + 10x1 x2 x3 + 2x1 ,      −25x1^2 x3 + 20x1 x2^2 − 10x1 x2 x3 − 2x1 − x3/2 ]
    [ 1 − 20x1 x3 − 10x1^2 − 20x1 x2 ,                20x1 x2 + 20x1 x3 + 10x1^2 ]
    [ −50x1^2 − 20x1 x2 ,                             50x1^2 + 20x1 x2 + 1 ]

We would like to remark that a polynomial tuple c = (c1, ..., cm) is generically nonsingular, and that Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d1, ..., dm, there exists an open dense subset U of D := R[x]_{d1} × ··· × R[x]_{dm} such that every tuple c = (c1, ..., cm) ∈ U is nonsingular. Indeed, such a U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.


Proof. The proof uses resultants and discriminants, for which we refer to [29].

First, let J1 be the set of all tuples (i1, ..., i_{n+1}) with 1 ≤ i1 < ··· < i_{n+1} ≤ m. The resultant Res(c_{i1}, ..., c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i1}, ..., c_{i_{n+1}} such that, if Res(c_{i1}, ..., c_{i_{n+1}}) ≠ 0, then the equations

    c_{i1}(x) = ··· = c_{i_{n+1}}(x) = 0

have no complex solution. Define

    F1(c) = Π_{(i1,...,i_{n+1}) ∈ J1} Res(c_{i1}, ..., c_{i_{n+1}}).

For the case m ≤ n, J1 = ∅ and we simply let F1(c) = 1. Clearly, if F1(c) ≠ 0, then no more than n of the polynomials c1, ..., cm have a common complex zero.

Second, let J2 be the set of all tuples (j1, ..., jk) with k ≤ n and 1 ≤ j1 < ··· < jk ≤ m. When one of c_{j1}, ..., c_{jk} has degree bigger than one, the discriminant ∆(c_{j1}, ..., c_{jk}) is a polynomial in the coefficients of c_{j1}, ..., c_{jk} such that, if ∆(c_{j1}, ..., c_{jk}) ≠ 0, then the equations

    c_{j1}(x) = ··· = c_{jk}(x) = 0

have no singular complex solution [29, Section 3]; i.e., at every common complex solution u, the gradients of c_{j1}, ..., c_{jk} at u are linearly independent. When all of c_{j1}, ..., c_{jk} have degree one, the discriminant of the tuple (c_{j1}, ..., c_{jk}) is not a single polynomial, but we can define ∆(c_{j1}, ..., c_{jk}) to be the product of all maximal minors of its Jacobian (a constant matrix). Define

    F2(c) = Π_{(j1,...,jk) ∈ J2} ∆(c_{j1}, ..., c_{jk}).

Clearly, if F2(c) ≠ 0, then no n or fewer of the polynomials c1, ..., cm have a singular common complex zero.

Last, let F(c) = F1(c) F2(c) and

    U = { c = (c1, ..., cm) ∈ D : F(c) ≠ 0 }.

Note that U is a Zariski open subset of D, and it is open and dense in D. For all c ∈ U, no more than n of c1, ..., cm can have a common complex zero; and for any k polynomials (k ≤ n) of c1, ..., cm, if they have a common complex zero, say u, then their gradients at u must be linearly independent. This means that every c ∈ U is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically.

6. Numerical examples

This section gives examples of applying the new relaxations (3.8)–(3.9) to the optimization problem (1.1), with the usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)–(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed in the computational results.


The polynomials pi in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c1, ..., cm. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L1(x) be the submatrix of L(x) consisting of its first n columns. Then choose (p1, ..., pm) to be the product L1(x)∇f(x), i.e.,

    pi = ( L1(x) ∇f(x) )_i.
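This construction can be carried out symbolically. The sketch below (Python/SymPy, not the paper's MATLAB code) computes p1 for Example 6.2 further on, where f is the Motzkin polynomial plus x1^4 + x2^4 + x3^4 and L1(x) = [x1/2, x2/2, x3/2]:

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1:4')
M = x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2   # Motzkin polynomial
f = M + x1**4 + x2**4 + x3**4
grad_f = sp.Matrix([sp.diff(f, v) for v in (x1, x2, x3)])
L1 = sp.Matrix([[x1/2, x2/2, x3/2]])
p1 = sp.expand((L1 * grad_f)[0])
# By Euler's identity, p1 = 3*M + 2*(x1^4 + x2^4 + x3^4)
assert p1 == sp.expand(3*M + 2*(x1**4 + x2**4 + x3**4))
```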

In all our examples, the global minimum value fmin of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.

By Theorem 3.3, we have fk = fmin for all k big enough, provided fc = fmin and any one of the conditions i)–iii) there holds. Typically, it might be inconvenient to check these conditions; however, in computation we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)–(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)–(3.9) (using Lagrange multiplier expressions) are shown in the tables, together with the computational time (in seconds). The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)–(3.9) are titled "with LME".

Example 6.1. Consider the optimization problem

    min   x1 x2 (10 − x3)
    s.t.  x1 ≥ 0,  x2 ≥ 0,  x3 ≥ 0,  1 − x1 − x2 − x3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum fmin = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x1, x2, x3) with x1x2 = 0 is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)–(3.9) are in Table 1. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       2    |  −0.0521,  0.6841            |  −0.0521,      0.1922
       3    |  −0.0026,  0.2657            |  −3·10^{-8},   0.2285
       4    |  −0.0007,  0.6785            |  −6·10^{-9},   0.4431
       5    |  −0.0004,  1.6105            |  −2·10^{-9},   0.9567

² Note that 1 − x^T x = (1 − e^T x)(1 + x^T x) + Σ_{i=1}^n xi(1 − xi)^2 + Σ_{i≠j} xi^2 xj ∈ IQ(ceq, cin).
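The polynomial identity in this footnote is easy to confirm symbolically (a Python/SymPy sketch, not from the paper; n = 4 is an arbitrary choice):

```python
import sympy as sp

n = 4
x = sp.symbols(f'x1:{n+1}')
lhs = 1 - sum(xi**2 for xi in x)
rhs = (1 - sum(x)) * (1 + sum(xi**2 for xi in x)) \
    + sum(xi * (1 - xi)**2 for xi in x) \
    + sum(x[i]**2 * x[j] for i in range(n) for j in range(n) if i != j)
assert sp.expand(lhs - rhs) == 0
```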


Example 6.2. Consider the optimization problem

    min   x1^4 x2^2 + x1^2 x2^4 + x3^6 − 3 x1^2 x2^2 x3^2 + (x1^4 + x2^4 + x3^4)
    s.t.  x1^2 + x2^2 + x3^2 ≥ 1.

The matrix polynomial is L(x) = [ x1/2   x2/2   x3/2   −1 ]. The objective f is the sum of the positive definite form x1^4 + x2^4 + x3^4 and the Motzkin polynomial

    M(x) = x1^4 x2^2 + x1^2 x2^4 + x3^6 − 3 x1^2 x2^2 x3^2.

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive, and fmin is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

    c1(x) p1(x) = (x1^2 + x2^2 + x3^2 − 1)( 3M(x) + 2(x1^4 + x2^4 + x3^4) ) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value is fmin = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for the standard Lasserre relaxations and the new ones (3.8)–(3.9) are in Table 2. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       3    |  −∞,             0.4466      |  0.1111,  0.1169
       4    |  −∞,             0.4948      |  0.3333,  0.3499
       5    |  −2.1821·10^5,   1.1836      |  0.3333,  0.6530

Example 6.3. Consider the optimization problem

    min   x1x2 + x2x3 + x3x4 − 3x1x2x3x4 + (x1^3 + ··· + x4^3)
    s.t.  x1, x2, x3, x4 ≥ 0,  1 − x1 − x2 ≥ 0,  1 − x3 − x4 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x1   −x2    0      0      1  1  0  0  1  0 ]
    [ −x1    1−x2   0      0      1  1  0  0  1  0 ]
    [ 0      0      1−x3   −x4    0  0  1  1  0  1 ]
    [ 0      0      −x3    1−x4   0  0  1  1  0  1 ]
    [ −x1    −x2    0      0      1  1  0  0  1  0 ]
    [ 0      0      −x3    −x4    0  0  1  1  0  1 ]

The feasible set is compact, so fmin is achieved at a critical point. One can show that fmin = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied because IQ(ceq, cin) is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)–(3.9) are in Table 3.

³ This is because −c1^2 p1^2 ∈ Ideal(φ) ⊆ IQ(φ, ψ) and the set {−c1(x)^2 p1(x)^2 ≥ 0} is compact.

⁴ This is because 1 − x1^2 − x2^2 belongs to the quadratic module of (x1, x2, 1 − x1 − x2), and 1 − x3^2 − x4^2 belongs to the quadratic module of (x3, x4, 1 − x3 − x4). See the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       3    |  −2.9·10^{-5},   0.7335      |  −6·10^{-7},   0.6091
       4    |  −1.4·10^{-5},   2.5055      |  −8·10^{-8},   2.7423
       5    |  −1.4·10^{-5},  12.7092      |  −5·10^{-8},  13.7449

Example 6.4. Consider the polynomial optimization problem

    min_{x∈R^2}   x1^2 + 50 x2^2
    s.t.          x1^2 − 1/2 ≥ 0,  x2^2 − 2x1x2 − 1/8 ≥ 0,  x2^2 + 2x1x2 − 1/8 ≥ 0.

It is motivated by an example in [12, §3]. The first column of L(x) is

    (  (8x1^3)/5 + x1/5,
       (288x1^4 x2)/5 − (16x1^3)/5 − (124x1^2 x2)/5 + (8x1)/5 − 2x2,
      −(288x1^4 x2)/5 − (16x1^3)/5 + (124x1^2 x2)/5 + (8x1)/5 + 2x2  )^T

and the second column of L(x) is

    ( −(8x1^2 x2)/5 + (4x2^3)/5 − x2/10,
       (288x1^3 x2^2)/5 + (16x1^2 x2)/5 − (142x1 x2^2)/5 − (9x1)/20 − (8x2^3)/5 + (11x2)/5,
      −(288x1^3 x2^2)/5 + (16x1^2 x2)/5 + (142x1 x2^2)/5 + (9x1)/20 − (8x2^3)/5 + (11x2)/5  )^T

The objective is coercive, so fmin is achieved at a critical point. The minimum value is fmin = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for the standard Lasserre relaxations and the new ones (3.8)–(3.9) are in Table 4. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       3    |   6.7535,  0.4611            |   56.7500,  0.1309
       4    |   6.9294,  0.2428            |  112.6517,  0.2405
       5    |   8.8519,  0.3376            |  112.6517,  0.2167
       6    |  16.5971,  0.4703            |  112.6517,  0.3788
       7    |  35.4756,  0.6536            |  112.6517,  0.4537

Example 6.5. Consider the optimization problem

    min_{x∈R^3}   x1^3 + x2^3 + x3^3 + 4x1x2x3 − ( x1(x2^2 + x3^2) + x2(x3^2 + x1^2) + x3(x1^2 + x2^2) )
    s.t.          x1 ≥ 0,  x1x2 − 1 ≥ 0,  x2x3 − 1 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x1x2   0    0   x2   x2   0  ]
    [ x1       0    0   −1   −1   0  ]
    [ −x1      x2   0   1    0    −1 ]


The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R^3_+, so the minimum value is achieved at a critical point. In computation, we got fmin ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for the standard Lasserre relaxations and the new ones (3.8)–(3.9) are in Table 5. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       2    |  −∞,             0.4129      |  −∞,      0.1900
       3    |  −7.8184·10^6,   0.4641      |  0.9492,  0.3139
       4    |  −2.0575·10^4,   0.6499      |  0.9492,  0.5057

Example 6.6. Consider the optimization problem (with x0 := 1)

    min_{x∈R^4}   x^T x + Σ_{i=0}^4 Π_{j≠i} (xi − xj)
    s.t.          x1^2 − 1 ≥ 0,  x2^2 − 1 ≥ 0,  x3^2 − 1 ≥ 0,  x4^2 − 1 ≥ 0.

The matrix polynomial is L(x) = [ (1/2) diag(x)   −I_4 ]. The first part of the objective is x^T x, while the second part is a nonnegative polynomial [37]. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin = 4.0000 and 11 global minimizers:

    (1, 1, 1, 1),   (1, −1, −1, 1),   (1, −1, 1, −1),   (1, 1, −1, −1),
    (1, −1, −1, −1),   (−1, −1, 1, 1),   (−1, 1, −1, 1),   (−1, 1, 1, −1),
    (−1, −1, −1, 1),   (−1, −1, 1, −1),   (−1, 1, −1, −1).
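The objective value 4 at these points is easy to confirm numerically, since every product Π_{j≠i}(xi − xj) vanishes there (a NumPy sketch, not from the paper):

```python
import numpy as np

def f(x):
    z = np.concatenate(([1.0], x))   # prepend x0 = 1
    s = sum(np.prod([z[i] - z[j] for j in range(5) if j != i]) for i in range(5))
    return float(np.dot(x, x) + s)

for u in [(1, 1, 1, 1), (1, -1, -1, 1), (-1, 1, -1, 1), (-1, -1, -1, 1)]:
    assert np.isclose(f(np.array(u, dtype=float)), 4.0)
```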

The computational results for the standard Lasserre relaxations and the new ones (3.8)–(3.9) are in Table 6. It confirms that fk = fmin for all k ≥ 4, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       3    |  −∞,             1.1377      |  3.5480,   1.1765
       4    |  −6.6913·10^4,   4.7677      |  4.0000,   3.0761
       5    |  −21.3778,      22.9970      |  4.0000,  10.3354

Example 6.7. Consider the optimization problem

    min_{x∈R^3}   x1^4 x2^2 + x2^4 x3^2 + x3^4 x1^2 − 3 x1^2 x2^2 x3^2 + x2^2
    s.t.          x1 − x2x3 ≥ 0,  −x2 + x3^2 ≥ 0.

The matrix polynomial is

    L(x) = [ 1     0    0   0   0 ]
           [ −x3   −1   0   0   0 ]

By the arithmetic–geometric mean inequality, one can show that fmin = 0. The global minimizers are the points (x1, 0, x3) with x1 ≥ 0 and x1x3 = 0. The computational results for the standard Lasserre relaxations and the new ones (3.8)–(3.9) are in Table 7. It confirms that fk = fmin for all k ≥ 5, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       3    |  −∞,             0.6144      |  −∞,            0.3418
       4    |  −1.0909·10^7,   1.0542      |  −3.9476,       0.7180
       5    |  −942.6772,      1.6771      |  −3·10^{-9},    1.4607
       6    |  −0.0110,        3.3532      |  −8·10^{-10},   3.1618

Example 6.8. Consider the optimization problem

    min_{x∈R^4}   x1^2 (x1 − x4)^2 + x2^2 (x2 − x4)^2 + x3^2 (x3 − x4)^2
                  + 2 x1x2x3 (x1 + x2 + x3 − 2x4) + (x1 − 1)^2 + (x2 − 1)^2 + (x3 − 1)^2
    s.t.          x1 − x2 ≥ 0,  x2 − x3 ≥ 0.

The matrix polynomial is

    L(x) = [ 1   0   0   0   0   0 ]
           [ 1   1   0   0   0   0 ]

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial; hence the objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

    (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for the standard Lasserre relaxations and the new ones (3.8)–(3.9) are in Table 8. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       2    |  −∞,             0.3984      |  −0.3360,  0.9321
       3    |  −∞,             0.7634      |  0.9413,   0.5240
       4    |  −6.4896·10^5,   4.5496      |  0.9413,   1.7192
       5    |  −3.1645·10^3,  24.3665      |  0.9413,   8.1228

Example 6.9. Consider the optimization problem

    min_{x∈R^4}   (x1 + x2 + x3 + x4 + 1)^2 − 4(x1x2 + x2x3 + x3x4 + x4 + x1)
    s.t.          0 ≤ x1, ..., x4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value is fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)–(3.9) are in Table 9.

⁵ Note that 4 − Σ_{i=1}^4 xi^2 = Σ_{i=1}^4 ( xi(1 − xi)^2 + (1 − xi)(1 + xi^2) ) ∈ IQ(ceq, cin).


Table 9. Computational results for Example 6.9

    order k |  w/o LME: lower bound, time  |  with LME: lower bound, time
       2    |  −0.0279,     0.2262         |  −5·10^{-6},   1.1835
       3    |  −0.0005,     0.4691         |  −6·10^{-7},   1.6566
       4    |  −0.0001,     3.1098         |  −2·10^{-7},   5.5234
       5    |  −4·10^{-5}, 16.5092         |  −6·10^{-7},  19.7320

For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)–(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem (with x0 := 1)

    min_{x∈R^n}   Σ_{0≤i≤j≤k≤n} c_{ijk} xi xj xk
    s.t.          0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have fc = fmin. The condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problem, the standard Lasserre relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)–(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)–(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)–(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)–(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

    n           9        10       11       12        13        14
    w/o LME     1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
    with LME    1.9714   3.8288   8.2519   20.0310   37.6373   82.4778

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)–(3.9) for solving (3.7). Note that

    IQ(ceq, cin)_{2k} + IQ(φ, ψ)_{2k} = Ideal(ceq, φ)_{2k} + Qmod(cin, ψ)_{2k}.

If we replace the quadratic module Qmod(cin, ψ) by the preordering of (cin, ψ) [21, 25], we can get even tighter relaxations. For convenience, write (cin, ψ) as a single tuple (g1, ..., gℓ). Its preordering is the set

    Preord(cin, ψ) = Σ_{r1,...,rℓ ∈ {0,1}} g1^{r1} ··· gℓ^{rℓ} · Σ[x].
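The 2^ℓ cross products g1^{r1}···gℓ^{rℓ} that index the preordering can be enumerated mechanically (a Python/SymPy sketch; the pair g = (g1, g2) below is an ad hoc choice, not from the paper):

```python
from itertools import product
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
g = [1 - x1**2, 1 - x2**2]            # example tuple standing in for (cin, psi)
terms = [sp.expand(sp.prod([gi**ri for gi, ri in zip(g, r)]))
         for r in product([0, 1], repeat=len(g))]
# 2^l products: 1, g1, g2, g1*g2 -- each gets an SOS multiplier in Preord
assert len(terms) == 4 and sp.Integer(1) in terms
assert sp.expand(g[0] * g[1]) in terms
```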


The truncation Preord(cin, ψ)_{2k} is defined similarly to Qmod(cin, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)
    f'_{pre,k} = min  ⟨f, y⟩
                 s.t. ⟨1, y⟩ = 1,  L^(k)_{ceq}(y) = 0,  L^(k)_φ(y) = 0,
                      L^(k)_{g1^{r1}···gℓ^{rℓ}}(y) ⪰ 0  for all r1, ..., rℓ ∈ {0,1},
                      y ∈ R^{N^n_{2k}}.

Similarly to (3.9), the dual optimization problem of the above is

(7.2)
    f_{pre,k} = max  γ
                s.t. f − γ ∈ Ideal(ceq, φ)_{2k} + Preord(cin, ψ)_{2k}.

An attractive property of the relaxations (7.1)–(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)–iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose Kc ≠ ∅ and Assumption 3.1 holds. Then

    f_{pre,k} = f'_{pre,k} = fc

for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then f_{pre,k} = f'_{pre,k} = fmin for all k big enough.

Proof. The proof is very similar to that of Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

    K3 = { x ∈ R^n : ceq(x) = 0, φ(x) = 0, cin(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(cin, ψ) such that f^{2ℓ} + q ∈ Ideal(ceq, φ). The rest of the proof is the same.

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m; hence the Lagrange multiplier vector λ can be expressed as in (5.3). However, if c is singular, then such an L(x) does not exist. For such cases, how can we express λ in terms of x for the critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1; however, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is used for each application, then a degree bound for L(x) can eventually be obtained. However, the bound so obtained is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

    λ = ( C(x)^T C(x) )^{-1} C(x)^T [ ∇f(x); 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
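The high-degree denominator is already visible in the smallest cases. The sketch below (Python/SymPy, an ad hoc univariate instance with f = (x − 2)^2 and c1 = 1 − x^2, not from the paper) forms the rational representation and observes a degree-4 denominator:

```python
import sympy as sp

x = sp.Symbol('x')
f, c1 = (x - 2)**2, 1 - x**2
C = sp.Matrix([sp.diff(c1, x), c1])              # C(x) is 2 x 1
rhs = sp.Matrix([sp.diff(f, x), 0])
lam = sp.cancel(((C.T * C).inv() * C.T * rhs)[0])  # rational function of x
num, den = sp.fraction(lam)
assert sp.degree(den, x) == 4      # denominator (x^2 + 1)^2 has degree 4
assert lam.subs(x, 1) == 1         # agrees with lambda = -x/2 * f'(x) at the minimizer u = 1
```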

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra, third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J. B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci. 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J. B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J. B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J. B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J. B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J. B. Lasserre, K. C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P. A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl. 149, Springer, 2009, pp. 271–324.
[40] J. F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics 97, American Mathematical Society, Providence, RI, 2002.
[42] H. H. Vui and P. T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093

E-mail address: njw@math.ucsd.edu

Contents

1. Introduction
   1.1. Optimality conditions
   1.2. Some existing work
   1.3. New contributions
2. Preliminaries
3. The construction of tight relaxations
   3.1. Tightness of the relaxations
   3.2. Detecting tightness and extracting minimizers
4. Polyhedral constraints
5. General constraints
6. Numerical examples
7. Discussions
   7.1. Tight relaxations using preorderings
   7.2. Singular constraining polynomials
   7.3. Degree bound for L(x)
   7.4. Rational representation of Lagrange multipliers
References
Page 7: TIGHT RELAXATIONS FOR POLYNOMIAL OPTIMIZATION …The polynomial expressions is de-termined by linear equations. Based on these expressions, new Lasserre type semidefinite relaxations


For more general constraints, we can also express λ as a polynomial function in x on the set K_c. This will be discussed in §4 and §5.

For the polynomials p_i as in Assumption 3.1, denote

(3.6)   φ := ( ∇f − Σ_{i∈E∪I} p_i ∇c_i,  (p_j c_j)_{j∈I} ),   ψ := (p_j)_{j∈I}.

When the minimum value f_min of (1.1) is achieved at a critical point, (1.1) is equivalent to the problem

(3.7)   f_c := min f(x)   s.t. c_eq(x) = 0, c_in(x) ≥ 0, φ(x) = 0, ψ(x) ≥ 0.

We apply Lasserre relaxations to solve it. For an integer k > 0 (called the relaxation order), the kth order Lasserre relaxation for (3.7) is

(3.8)   f'_k := min ⟨f, y⟩
        s.t.  ⟨1, y⟩ = 1,  M_k(y) ⪰ 0,
              L^(k)_{c_eq}(y) = 0,  L^(k)_{c_in}(y) ⪰ 0,
              L^(k)_φ(y) = 0,  L^(k)_ψ(y) ⪰ 0,  y ∈ R^{N^n_{2k}}.

Since x^0 = 1 (the constant one polynomial), the condition ⟨1, y⟩ = 1 means that (y)_0 = 1. The dual optimization problem of (3.8) is

(3.9)   f_k := max γ   s.t.  f − γ ∈ IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k}.

We refer to §2 for the notation used in (3.8)-(3.9). These problems are equivalent to semidefinite programs (SDPs), so they can be solved by SDP solvers (e.g., SeDuMi [40]). For k = 1, 2, ..., we get a hierarchy of Lasserre relaxations. If we remove the usage of φ and ψ in (3.8)-(3.9), they reduce to the standard Lasserre relaxations in [17]. So (3.8)-(3.9) are stronger relaxations.

By the construction of φ as in (3.6), Assumption 3.1 implies that

    K_c = { u ∈ R^n : c_eq(u) = 0, φ(u) = 0 }.

By Lemma 3.3 of [6], f achieves only finitely many values on K_c, say,

(3.10)   v_1 < ··· < v_N.

A point u ∈ K_c might not be feasible for (3.7), i.e., it is possible that c_in(u) ≱ 0 or ψ(u) ≱ 0. In applications, we are often interested in the optimal value f_c of (3.7). When (3.7) is infeasible, by convention, we set

    f_c = +∞.

When the optimal value f_min of (1.1) is achieved at a critical point, f_c = f_min. This is the case if the feasible set is compact, or if f is coercive (i.e., for each ℓ, the sublevel set {f(x) ≤ ℓ} is compact) and the constraint qualification condition holds. As in [17], one can show that

(3.11)   f_k ≤ f'_k ≤ f_c

for all k. Moreover, both f_k and f'_k are monotonically increasing. If, for some order k, it occurs that

    f_k = f'_k = f_c,

then the kth order Lasserre relaxation is said to be tight (or exact).


3.1. Tightness of the relaxations. Let c_in, ψ, K_c, f_c be as above. We refer to §2 for the notation Qmod(c_in, ψ). We begin with a general assumption.

Assumption 3.2. There exists ρ ∈ Qmod(c_in, ψ) such that, if u ∈ K_c and f(u) < f_c, then ρ(u) < 0.

In Assumption 3.2, the hypersurface ρ(x) = 0 separates feasible and infeasible critical points. Clearly, if u ∈ K_c is a feasible point for (3.7), then c_in(u) ≥ 0 and ψ(u) ≥ 0, and hence ρ(u) ≥ 0. Assumption 3.2 generally holds. For instance, it is satisfied in the following general cases:

a) When there are no inequality constraints, c_in and ψ are empty tuples. Then Qmod(c_in, ψ) = Σ[x], and Assumption 3.2 is satisfied for ρ = 0.

b) Suppose the set K_c is finite, say K_c = {u_1, ..., u_D}, and

    f(u_1), ..., f(u_{t−1}) < f_c ≤ f(u_t), ..., f(u_D).

Let ℓ_1, ..., ℓ_D be real interpolating polynomials such that ℓ_i(u_j) = 1 for i = j and ℓ_i(u_j) = 0 for i ≠ j. For each i = 1, ..., t−1, there must exist j_i ∈ I such that c_{j_i}(u_i) < 0. Then the polynomial

(3.12)   ρ := Σ_{i<t} (−1/c_{j_i}(u_i)) c_{j_i}(x) (ℓ_i(x))² + Σ_{i≥t} (ℓ_i(x))²

satisfies Assumption 3.2.

c) For each x with f(x) = v_i < f_c, at least one of the constraints c_j(x) ≥ 0, p_j(x) ≥ 0 (j ∈ I) is violated. Suppose, for each critical value v_i < f_c, there exists g_i ∈ {c_j, p_j}_{j∈I} such that

    g_i < 0  on  K_c ∩ {f(x) = v_i}.

Let φ_1, ..., φ_N be real univariate polynomials such that φ_i(v_j) = 0 for i ≠ j and φ_i(v_j) = 1 for i = j. Suppose v_t = f_c. Then the polynomial

(3.13)   ρ := Σ_{i<t} g_i(x) (φ_i(f(x)))² + Σ_{i≥t} (φ_i(f(x)))²

satisfies Assumption 3.2.
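The univariate interpolating polynomials φ_i used in (3.13) are ordinary Lagrange basis polynomials at the critical values v_1, ..., v_N, so they are easy to generate numerically. The following is a minimal NumPy sketch (the values in v are hypothetical, and Python/NumPy is used only for illustration; the paper's computations are in MATLAB):

```python
import numpy as np

# Hypothetical distinct critical values of f on K_c, as in (3.10).
v = np.array([-1.0, 0.0, 2.0])

def lagrange_basis(v, i):
    # Univariate polynomial phi_i with phi_i(v_j) = 1 if i == j, else 0.
    p = np.poly1d([1.0])
    for j, vj in enumerate(v):
        if j != i:
            # multiply by (x - v_j) / (v_i - v_j)
            p *= np.poly1d([1.0, -vj]) / (v[i] - vj)
    return p

phi = [lagrange_basis(v, i) for i in range(len(v))]
print(np.allclose([phi[0](v[0]), phi[0](v[1])], [1.0, 0.0]))   # True
```

Plugging these φ_i into (3.13) with suitable g_i then yields a candidate ρ for Assumption 3.2.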

We refer to §2 for the archimedean condition and for the notation IQ(h, g) as in (2.1). The following theorem is about the convergence of the relaxations (3.8)-(3.9).

Theorem 3.3. Suppose K_c ≠ ∅ and Assumption 3.1 holds. If

i) IQ(c_eq, c_in) + IQ(φ, ψ) is archimedean, or
ii) IQ(c_eq, c_in) is archimedean, or
iii) Assumption 3.2 holds,

then f_k = f'_k = f_c for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_k = f'_k = f_min for all k big enough, provided one of the conditions i)-iii) is satisfied.

Remark. In Theorem 3.3, the conclusion holds if any one of the conditions i)-iii) is satisfied. Condition ii) is only about the constraining polynomials of (1.1); it can be checked without φ, ψ. Clearly, condition ii) implies condition i).

The proof of Theorem 3.3 is given in the following. The main idea is to consider the set of critical points. It can be expressed as a union of subvarieties, and the objective f is a constant on each of them. We can get an SOS type representation for f on each subvariety, and then construct a single one for f over the entire set of critical points.

Proof of Theorem 3.3. Clearly, every point in the complex variety

    K_1 := { x ∈ C^n | c_eq(x) = 0, φ(x) = 0 }

is a critical point. By Lemma 3.3 of [6], the objective f achieves finitely many real values on K_c = K_1 ∩ R^n, say, they are v_1 < ··· < v_N. Up to the shifting of a constant in f, we can further assume that f_c = 0. Clearly, f_c equals one of v_1, ..., v_N, say v_t = f_c = 0.

Case I: Assume IQ(c_eq, c_in) + IQ(φ, ψ) is archimedean. Let

    I := Ideal(c_eq, φ),

the critical ideal. Note that K_1 = V_C(I). The variety V_C(I) is a union of irreducible subvarieties, say V_1, ..., V_ℓ. If V_i ∩ R^n ≠ ∅, then f is a real constant on V_i, which equals one of v_1, ..., v_N. This is implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of V_C(I):

    T_i := K_1 ∩ { f(x) = v_i }   (i = t, ..., N).

Let T_{t−1} be the union of the irreducible subvarieties V_i such that either V_i ∩ R^n = ∅, or f ≡ v_j on V_i with v_j < v_t = f_c. Then it holds that

    V_C(I) = T_{t−1} ∪ T_t ∪ ··· ∪ T_N.

By the primary decomposition of I [11, 41], there exist ideals I_{t−1}, I_t, ..., I_N ⊆ R[x] such that

    I = I_{t−1} ∩ I_t ∩ ··· ∩ I_N

and T_i = V_C(I_i) for all i = t−1, t, ..., N. Denote the semialgebraic set

(3.14)   S := { x ∈ R^n | c_in(x) ≥ 0, ψ(x) ≥ 0 }.

For i = t − 1, we have V_R(I_{t−1}) ∩ S = ∅, because v_1, ..., v_{t−1} < f_c. By the Positivstellensatz [2, Corollary 4.4.3], there exists p_0 ∈ Preord(c_in, ψ)¹ satisfying 2 + p_0 ∈ I_{t−1}. Note that 1 + p_0 > 0 on V_R(I_{t−1}) ∩ S. The set I_{t−1} + Qmod(c_in, ψ) is archimedean, because I ⊆ I_{t−1} and

    IQ(c_eq, c_in) + IQ(φ, ψ) ⊆ I_{t−1} + Qmod(c_in, ψ).

By Theorem 2.1, we have

    p_1 := 1 + p_0 ∈ I_{t−1} + Qmod(c_in, ψ).

Then 1 + p_1 ∈ I_{t−1}, so there exists p_2 ∈ Qmod(c_in, ψ) such that

    −1 ≡ p_1 ≡ p_2  mod I_{t−1}.

Since f = (f/4 + 1)² − 1·(f/4 − 1)², we have

    f ≡ σ_{t−1} := (f/4 + 1)² + p_2 (f/4 − 1)²  mod I_{t−1}.

So, when k is big enough, we have σ_{t−1} ∈ Qmod(c_in, ψ)_{2k}.

¹ It is the preordering of the polynomial tuple (c_in, ψ); see §7.1.


For i = t, v_t = 0 and f(x) vanishes on V_C(I_t). By Hilbert's Strong Nullstellensatz [4], there exists an integer m_t > 0 such that f^{m_t} ∈ I_t. Define the polynomial

    s_t(ε) := √ε · Σ_{j=0}^{m_t−1} binom(1/2, j) ε^{−j} f^j.

Then we have that

    D_1 := ε (1 + ε^{−1} f) − (s_t(ε))² ≡ 0  mod I_t.

This is because, in the subtraction defining D_1, after expanding (s_t(ε))², all the terms f^j with j < m_t are cancelled, and f^j ∈ I_t for j ≥ m_t. So D_1 ∈ I_t. Let σ_t(ε) := (s_t(ε))²; then f + ε − σ_t(ε) = D_1, and

(3.15)   f + ε − σ_t(ε) = Σ_{j=0}^{m_t−2} b_j(ε) f^{m_t+j}

for some real scalars b_j(ε) depending on ε.

For each i = t+1, ..., N, v_i > 0 and f(x)/v_i − 1 vanishes on V_C(I_i). By Hilbert's Strong Nullstellensatz [4], there exists 0 < m_i ∈ N such that (f/v_i − 1)^{m_i} ∈ I_i. Let

    s_i := √v_i · Σ_{j=0}^{m_i−1} binom(1/2, j) (f/v_i − 1)^j.

As in the case i = t, we can similarly show that f − s_i² ∈ I_i. Let σ_i := s_i²; then f − σ_i ∈ I_i.

Note that V_C(I_i) ∩ V_C(I_j) = ∅ for all i ≠ j. By Lemma 3.3 of [30], there exist polynomials a_{t−1}, ..., a_N ∈ R[x] such that

    a_{t−1}² + ··· + a_N² − 1 ∈ I,   a_i ∈ ∩_{i≠j∈{t−1,...,N}} I_j.

For ε > 0, denote the polynomial

    σ_ε := σ_t(ε) a_t² + Σ_{t≠j∈{t−1,...,N}} (σ_j + ε) a_j²;

then

    f + ε − σ_ε = (f + ε)(1 − a_{t−1}² − ··· − a_N²) + Σ_{t≠i∈{t−1,...,N}} (f − σ_i) a_i² + (f + ε − σ_t(ε)) a_t².

For each i ≠ t, f − σ_i ∈ I_i, so

    (f − σ_i) a_i² ∈ ∩_{j=t−1}^{N} I_j = I.

Hence, there exists k_1 > 0 such that

    (f − σ_i) a_i² ∈ I_{2k_1}   (t ≠ i ∈ {t−1, ..., N}).

Since f + ε − σ_t(ε) ∈ I_t, we also have

    (f + ε − σ_t(ε)) a_t² ∈ ∩_{j=t−1}^{N} I_j = I.

Moreover, by the equation (3.15),

    (f + ε − σ_t(ε)) a_t² = Σ_{j=0}^{m_t−2} b_j(ε) f^{m_t+j} a_t².


Each f^{m_t+j} a_t² ∈ I, since f^{m_t+j} ∈ I_t. So there exists k_2 > 0 such that, for all ε > 0,

    (f + ε − σ_t(ε)) a_t² ∈ I_{2k_2}.

Since 1 − a_{t−1}² − ··· − a_N² ∈ I, there also exists k_3 > 0 such that, for all ε > 0,

    (f + ε)(1 − a_{t−1}² − ··· − a_N²) ∈ I_{2k_3}.

Hence, if k* ≥ max{k_1, k_2, k_3}, then we have

    f(x) + ε − σ_ε ∈ I_{2k*}

for all ε > 0. By the construction, the degrees of all σ_i and a_i are independent of ε. So σ_ε ∈ Qmod(c_in, ψ)_{2k*} for all ε > 0, if k* is big enough. Note that

    I_{2k*} + Qmod(c_in, ψ)_{2k*} = IQ(c_eq, c_in)_{2k*} + IQ(φ, ψ)_{2k*}.

This implies that f_{k*} ≥ f_c − ε for all ε > 0. On the other hand, we always have f_{k*} ≤ f_c, so f_{k*} = f_c. Moreover, since f_k is monotonically increasing, we must have f_k = f_c for all k ≥ k*.

Case II: Assume IQ(c_eq, c_in) is archimedean. Because

    IQ(c_eq, c_in) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ),

the set IQ(c_eq, c_in) + IQ(φ, ψ) is also archimedean. Therefore, the conclusion follows by applying the result of Case I.

Case III: Suppose Assumption 3.2 holds. Let φ_1, ..., φ_N be real univariate polynomials such that φ_i(v_j) = 0 for i ≠ j and φ_i(v_j) = 1 for i = j. Let

    s := s_t + ··· + s_N,   where each s_i := (v_i − f_c)(φ_i(f))².

Then s ∈ Σ[x]_{2k_4} for some integer k_4 > 0. Let

    f̂ := f − f_c − s.

We show that there exist an integer ℓ > 0 and q ∈ Qmod(c_in, ψ) such that

    f̂^{2ℓ} + q ∈ Ideal(c_eq, φ).

This is because, by Assumption 3.2, f̂(x) ≡ 0 on the set

    K_2 := { x ∈ R^n : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0 }.

It has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist 0 < ℓ ∈ N and q = b_0 + ρ b_1 (b_0, b_1 ∈ Σ[x]) such that f̂^{2ℓ} + q ∈ Ideal(c_eq, φ). By Assumption 3.2, ρ ∈ Qmod(c_in, ψ), so we have q ∈ Qmod(c_in, ψ).

For all ε > 0 and τ > 0, we have f̂ + ε = φ_ε + θ_ε, where

    φ_ε := −τ ε^{1−2ℓ} ( f̂^{2ℓ} + q ),
    θ_ε := ε ( 1 + f̂/ε + τ (f̂/ε)^{2ℓ} ) + τ ε^{1−2ℓ} q.

By Lemma 2.1 of [31], when τ ≥ 1/(2ℓ), there exists k_5 such that, for all ε > 0,

    φ_ε ∈ Ideal(c_eq, φ)_{2k_5},   θ_ε ∈ Qmod(c_in, ψ)_{2k_5}.

Hence, we can get

    f − (f_c − ε) = φ_ε + σ_ε,

where σ_ε := θ_ε + s ∈ Qmod(c_in, ψ)_{2k_5} for all ε > 0. Note that

    IQ(c_eq, c_in)_{2k_5} + IQ(φ, ψ)_{2k_5} = Ideal(c_eq, φ)_{2k_5} + Qmod(c_in, ψ)_{2k_5}.


For all ε > 0, γ = f_c − ε is feasible in (3.9) for the order k_5, so f_{k_5} ≥ f_c. Because of (3.11) and the monotonicity of f_k, we have f_k = f'_k = f_c for all k ≥ k_5. ∎

3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is f_c, and the optimal value of (1.1) is f_min. If f_min is achievable at a critical point, then f_c = f_min. In Theorem 3.3, we have shown that f_k = f_c for all k big enough, where f_k is the optimal value of (3.9). The value f_c or f_min is often not known. How do we detect the tightness f_k = f_c in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose y* is a minimizer of (3.8) for the order k. Let

(3.16)   d := ⌈deg(c_eq, c_in, φ, ψ)/2⌉.

If there exists an integer t ∈ [d, k] such that

(3.17)   rank M_t(y*) = rank M_{t−d}(y*),

then f_k = f_c, and we can get r := rank M_t(y*) minimizers for (3.7) [5, 14, 32].

The method in [14] can be used to extract minimizers; it was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0) can also be detected by solving the relaxations (3.8)-(3.9).
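The rank condition (3.17) is easy to test numerically once the moment matrices are formed. The following is a minimal NumPy sketch for a univariate moment sequence; the Dirac-measure moments below are a hypothetical illustration (not output of the relaxations (3.8)-(3.9), which the paper solves with GloptiPoly/SeDuMi):

```python
import numpy as np

def moment_matrix(y, t):
    # M_t(y) for a univariate moment sequence y = (y_0, y_1, ...);
    # its (i, j) entry is y_{i+j}, for 0 <= i, j <= t.
    return np.array([[y[i + j] for j in range(t + 1)] for i in range(t + 1)])

def flat_truncation(y, k, d):
    # Check whether rank M_t(y) = rank M_{t-d}(y) for some t in [d, k],
    # the analogue of condition (3.17).
    ranks = {t: np.linalg.matrix_rank(moment_matrix(y, t), tol=1e-8)
             for t in range(k + 1)}
    return any(ranks[t] == ranks[t - d] for t in range(d, k + 1))

# Moments of the Dirac measure at u = 2: y_j = 2**j.  Every moment
# matrix is a rank-one outer product, so flat truncation holds.
y = [2.0 ** j for j in range(9)]   # enough moments to build M_4(y)
print(flat_truncation(y, k=4, d=1))   # True
```

When the condition holds with rank r, the r atoms of the representing measure are the candidate minimizers; the extraction itself is the method of [14].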

Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order k, then no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0, i.e., (3.7) is infeasible.
ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order k is big enough.

In the following, assume (3.7) is feasible (i.e., f_c < +∞). Then, for all k big enough, (3.8) has a minimizer y*. Moreover:

iii) If (3.17) is satisfied for some t ∈ [d, k], then f_k = f_c.
iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer y* of (3.8) must satisfy (3.17), for some t ∈ [d, k], when k is big enough.

Proof. By Assumption 3.1, u is a critical point if and only if c_eq(u) = 0 and φ(u) = 0.

i) For every feasible point u of (3.7), the tms [u]_{2k} (see §2 for the notation) is feasible for (3.8), for all k. Therefore, if (3.8) is infeasible for some k, then (3.7) must be infeasible.

ii) By Assumption 3.2, when (3.7) is infeasible, the set

    { x ∈ R^n : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0 }

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that −1 ∈ Ideal(c_eq, φ) + Qmod(ρ). By Assumption 3.2,

    Ideal(c_eq, φ) + Qmod(ρ) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ).

Thus, for all k big enough, (3.9) is unbounded from above. Hence, (3.8) must be infeasible, by weak duality.

When (3.7) is feasible, f achieves finitely many values on K_c, so (3.7) must achieve its optimal value f_c. By Theorem 3.3, we know that f_k = f'_k = f_c for all k big enough. For each minimizer u* of (3.7), the tms [u*]_{2k} is a minimizer of (3.8).


iii) If (3.17) holds, we can get r := rank M_t(y*) minimizers for (3.7) [5, 14], say u_1, ..., u_r, such that f_k = f(u_i) for each i. Clearly, f_k = f(u_i) ≥ f_c. On the other hand, we always have f_k ≤ f_c. So f_k = f_c.

iv) By Assumption 3.2, (3.7) is equivalent to the problem

(3.18)   min f(x)   s.t.  c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0.

The optimal value of (3.18) is also f_c. Its kth order Lasserre relaxation is

(3.19)   γ'_k := min ⟨f, y⟩
         s.t.  ⟨1, y⟩ = 1,  M_k(y) ⪰ 0,
               L^(k)_{c_eq}(y) = 0,  L^(k)_φ(y) = 0,  L^(k)_ρ(y) ⪰ 0.

Its dual optimization problem is

(3.20)   γ_k := max γ   s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Qmod(ρ)_{2k}.

By repeating the same proof as for Theorem 3.3(iii), we can show that

    γ_k = γ'_k = f_c

for all k big enough. Because ρ ∈ Qmod(c_in, ψ), each y feasible for (3.8) is also feasible for (3.19). So, when k is big enough, each y* is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some t ∈ [d, k], when k is big enough. ∎

If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied; we refer to [25, §6.6].

4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron

    P := { x ∈ R^n | Ax − b ≥ 0 },

where A = [a_1 ··· a_m]^T ∈ R^{m×n} and b = [b_1 ··· b_m]^T ∈ R^m. This corresponds to E = ∅, I = [m], and each c_i(x) = a_i^T x − b_i. Denote

(4.1)   D(x) := diag(c_1(x), ..., c_m(x)),   C(x) := [ A^T ; D(x) ]   (a block column matrix).

The Lagrange multiplier vector λ = [λ_1 ··· λ_m]^T satisfies

(4.2)   [ A^T ; D(x) ] λ = [ ∇f(x) ; 0 ].

If rank A = m, we can express λ as

(4.3)   λ = (A A^T)^{−1} A ∇f(x).

If rank A < m, how can we express λ in terms of x? In computation, we often prefer a polynomial expression. If there exists L(x) ∈ R[x]^{m×(n+m)} such that

(4.4)   L(x) C(x) = I_m,

then we can get

    λ = L(x) [ ∇f(x) ; 0 ] = L_1(x) ∇f(x),


where L_1(x) consists of the first n columns of L(x). In this section, we characterize when such an L(x) exists and give a degree bound for it.
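In the full-row-rank case, formula (4.3) is just the least-squares solution of A^T λ = ∇f(x). A small NumPy sketch (with hypothetical random data; whenever ∇f(x) lies in the range of A^T, the formula recovers the multipliers exactly):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 5
A = rng.standard_normal((m, n))     # full row rank with probability one
mu = rng.standard_normal(m)         # hypothetical multipliers
grad_f = A.T @ mu                   # a gradient lying in the range of A^T

# Formula (4.3): lambda = (A A^T)^{-1} A grad_f; it recovers mu.
lam = np.linalg.solve(A @ A.T, A @ grad_f)
print(np.allclose(lam, mu))   # True
```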

The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ C^n (also see Definition 5.1). This is equivalent to saying that, for every u, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then a_{i_1}, ..., a_{i_k} are linearly independent.

Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.

Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.

Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ R[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.

For a subset I = {i_1, ..., i_{m−n}} of [m], denote

    c_I(x) := ∏_{i∈I} c_i(x),   E_I(x) := c_I(x) · diag( c_{i_1}(x)^{−1}, ..., c_{i_{m−n}}(x)^{−1} ),
    D_I(x) := diag( c_{i_1}(x), ..., c_{i_{m−n}}(x) ),   A_I := [a_{i_1} ··· a_{i_{m−n}}]^T.

For the case I = ∅ (the empty set), we set c_∅(x) = 1. Let

    V := { I ⊆ [m] : |I| = m − n, rank A_{[m]\I} = n }.

Step I: For each I ∈ V, we construct a matrix polynomial L_I(x) such that

(4.5)   L_I(x) C(x) = c_I(x) I_m.

The matrix L_I = L_I(x) satisfying (4.5) can be given by the following 2×3 block matrix (L_I(J, K) denotes the submatrix whose row indices are from J and whose column indices are from K):

(4.6)
      K:      [n]                          n + I                               n + [m]\I
    I         0                            E_I(x)                              0
    [m]\I     c_I(x) · (A_{[m]\I})^{−T}    −(A_{[m]\I})^{−T} (A_I)^T E_I(x)    0

Equivalently, the blocks of L_I are

    L_I(I, [n]) = 0,   L_I(I, n + [m]\I) = 0,   L_I([m]\I, n + [m]\I) = 0,
    L_I(I, n + I) = E_I(x),   L_I([m]\I, [n]) = c_I(x) (A_{[m]\I})^{−T},
    L_I([m]\I, n + I) = −(A_{[m]\I})^{−T} (A_I)^T E_I(x).

For each I ∈ V, A_{[m]\I} is invertible. The superscript −T denotes the inverse of the transpose. Let G = L_I(x) C(x); then one can verify that

    G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m−n},   G(I, [m]\I) = 0,
    G([m]\I, [m]\I) = c_I(x) (A_{[m]\I})^{−T} (A_{[m]\I})^T = c_I(x) I_n,
    G([m]\I, I) = c_I(x) (A_{[m]\I})^{−T} (A_I)^T − (A_{[m]\I})^{−T} (A_I)^T E_I(x) D_I(x) = 0.

This shows that the above L_I(x) satisfies (4.5).


Step II: We show that there exist real scalars ν_I satisfying

(4.7)   Σ_{I∈V} ν_I c_I(x) = 1.

This can be shown by induction on m.

• When m = n, V = {∅} and c_∅(x) = 1, so (4.7) is clearly true.
• When m > n, let

(4.8)   N := { i ∈ [m] | rank A_{[m]\{i}} = n }.

For each i ∈ N, let V_i be the set of all I' ⊆ [m]\{i} such that |I'| = m − n − 1 and rank A_{[m]\(I'∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]\{i}} x − b_{[m]\{i}} is nonsingular. By induction, there exist real scalars ν^{(i)}_{I'} satisfying

(4.9)   Σ_{I'∈V_i} ν^{(i)}_{I'} c_{I'}(x) = 1.

Since rank A = n, we can generally assume that {a_1, ..., a_n} is linearly independent. So there exist scalars α_1, ..., α_n such that

    a_m = α_1 a_1 + ··· + α_n a_n.

If all α_i = 0, then a_m = 0, and hence A can be replaced by its first m − 1 rows; so (4.7) is true by the induction. In the following, suppose at least one α_i ≠ 0, and write

    { i : α_i ≠ 0 } = { i_1, ..., i_k }.

Then a_{i_1}, ..., a_{i_k}, a_m are linearly dependent. For convenience, set i_{k+1} := m. Since Ax − b is nonsingular, the linear system

    c_{i_1}(x) = ··· = c_{i_k}(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence, there exist real scalars μ_1, ..., μ_{k+1} such that

    μ_1 c_{i_1}(x) + ··· + μ_k c_{i_k}(x) + μ_{k+1} c_{i_{k+1}}(x) = 1.

(This is implied by the echelon form for inconsistent linear systems.) Note that i_1, ..., i_{k+1} ∈ N. For each j = 1, ..., k+1, by (4.9),

    Σ_{I'∈V_{i_j}} ν^{(i_j)}_{I'} c_{I'}(x) = 1.

Then we can get

    1 = Σ_{j=1}^{k+1} μ_j c_{i_j}(x) = Σ_{j=1}^{k+1} μ_j Σ_{I'∈V_{i_j}} ν^{(i_j)}_{I'} c_{i_j}(x) c_{I'}(x) = Σ_{I = I'∪{i_j}, I'∈V_{i_j}, 1≤j≤k+1} ν^{(i_j)}_{I'} μ_j c_I(x).

Since each I' ∪ {i_j} ∈ V, (4.7) must be satisfied by some scalars ν_I.


Step III: For L_I(x) as in (4.5), we construct L(x) as

(4.10)   L(x) := Σ_{I∈V} ν_I L_I(x).

Clearly, L(x) satisfies (4.4), because

    L(x) C(x) = Σ_{I∈V} ν_I L_I(x) C(x) = Σ_{I∈V} ν_I c_I(x) I_m = I_m.

Each L_I(x) has degree ≤ m − n, so L(x) has degree ≤ m − n. ∎

Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does exist, a degree bound for L(x) is m − rank A. Sometimes its degree can be smaller than that, as shown in Example 4.3. For given A and b, a matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x)C(x) = I_m for polyhedral sets.

Example 4.2. Consider the simplicial set

    { x_1 ≥ 0, ..., x_n ≥ 0, 1 − e^T x ≥ 0 }.

The equation L(x) C(x) = I_{n+1} is satisfied by

    L(x) = [ 1−x_1   −x_2  ···   −x_n  1 ··· 1
             −x_1  1−x_2  ···   −x_n  1 ··· 1
               ⋮           ⋱            ⋮
             −x_1   −x_2  ···  1−x_n  1 ··· 1
             −x_1   −x_2  ···   −x_n  1 ··· 1 ].
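The identity L(x)C(x) = I_{n+1} for the simplicial set can be checked numerically at an arbitrary point. A small NumPy sketch (Python is used only for illustration; the paper's computations are in MATLAB):

```python
import numpy as np

def simplex_LC(x):
    # Constraints: x_i >= 0 (i = 1..n) and 1 - e^T x >= 0, so
    # A = [I_n; -e^T], D(x) = diag(x_1, ..., x_n, 1 - e^T x),
    # and C(x) = [A^T; D(x)] as in (4.1).
    n = len(x)
    A = np.vstack([np.eye(n), -np.ones((1, n))])
    c = np.concatenate([x, [1.0 - x.sum()]])
    C = np.vstack([A.T, np.diag(c)])
    # L(x) from Example 4.2: first n columns are [I_n; 0] - 1 x^T,
    # the remaining n+1 columns are all ones.
    M = np.vstack([np.eye(n), np.zeros((1, n))]) - np.outer(np.ones(n + 1), x)
    L = np.hstack([M, np.ones((n + 1, n + 1))])
    return L, C

x = np.random.default_rng(1).standard_normal(4)
L, C = simplex_LC(x)
print(np.allclose(L @ C, np.eye(5)))   # True
```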

Example 4.3. Consider the box constraint

    { x_1 ≥ 0, ..., x_n ≥ 0, 1 − x_1 ≥ 0, ..., 1 − x_n ≥ 0 }.

The equation L(x) C(x) = I_{2n} is satisfied by

    L(x) = [ I_n − diag(x)  I_n  I_n
             −diag(x)       I_n  I_n ].
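Here L(x) has degree one, smaller than the bound m − rank A = n from Proposition 4.1. The identity can again be checked numerically (an illustrative NumPy sketch, not the paper's MATLAB code):

```python
import numpy as np

def box_LC(x):
    # Constraints x_i >= 0 and 1 - x_i >= 0, so A = [I_n; -I_n] and
    # C(x) = [A^T; diag(x, 1 - x)] as in (4.1).
    n = len(x)
    A = np.vstack([np.eye(n), -np.eye(n)])
    C = np.vstack([A.T, np.diag(np.concatenate([x, 1.0 - x]))])
    # L(x) from Example 4.3.
    L = np.block([[np.eye(n) - np.diag(x), np.eye(n), np.eye(n)],
                  [-np.diag(x), np.eye(n), np.eye(n)]])
    return L, C

x = np.random.default_rng(2).standard_normal(3)
L, C = box_LC(x)
print(np.allclose(L @ C, np.eye(6)))   # True
```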

Example 4.4. Consider the polyhedral set

    { 1 − x_4 ≥ 0, x_4 − x_3 ≥ 0, x_3 − x_2 ≥ 0, x_2 − x_1 ≥ 0, x_1 + 1 ≥ 0 }.

The equation L(x) C(x) = I_5 is satisfied by

    L(x) = (1/2) [ −x_1−1  −x_2−1  −x_3−1  −x_4−1  1 1 1 1 1
                   −x_1−1  −x_2−1  −x_3−1   1−x_4  1 1 1 1 1
                   −x_1−1  −x_2−1   1−x_3   1−x_4  1 1 1 1 1
                   −x_1−1   1−x_2   1−x_3   1−x_4  1 1 1 1 1
                    1−x_1   1−x_2   1−x_3   1−x_4  1 1 1 1 1 ].
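A numerical check of L(x)C(x) = I_5 for this chain of constraints (an illustrative NumPy sketch; the constraint rows a_i and offsets b_i below are read off from c_i(x) = a_i^T x − b_i):

```python
import numpy as np

# Constraints of Example 4.4 in the order c_1, ..., c_5:
# 1 - x4, x4 - x3, x3 - x2, x2 - x1, x1 + 1.
A44 = np.array([[0, 0, 0, -1],
                [0, 0, -1, 1],
                [0, -1, 1, 0],
                [-1, 1, 0, 0],
                [1, 0, 0, 0]], dtype=float)
b44 = np.array([-1.0, 0.0, 0.0, 0.0, -1.0])

def L44(x):
    x1, x2, x3, x4 = x
    left = np.array([[-x1 - 1, -x2 - 1, -x3 - 1, -x4 - 1],
                     [-x1 - 1, -x2 - 1, -x3 - 1, 1 - x4],
                     [-x1 - 1, -x2 - 1, 1 - x3, 1 - x4],
                     [-x1 - 1, 1 - x2, 1 - x3, 1 - x4],
                     [1 - x1, 1 - x2, 1 - x3, 1 - x4]])
    return np.hstack([left, np.ones((5, 5))]) / 2.0

x = np.random.default_rng(3).standard_normal(4)
C44 = np.vstack([A44.T, np.diag(A44 @ x - b44)])
print(np.allclose(L44(x) @ C44, np.eye(5)))   # True
```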

Example 4.5. Consider the polyhedral set

    { 1 + x_1 ≥ 0, 1 − x_1 ≥ 0, 2 − x_1 − x_2 ≥ 0, 2 − x_1 + x_2 ≥ 0 }.

The matrix L(x) satisfying L(x) C(x) = I_4 is

    (1/6) [ x_1²−3x_1+2    x_1x_2−x_2      4−x_1   2−x_1   1−x_1    1−x_1
            3x_1²−3x_1−6   3x_2+3x_1x_2    6−3x_1  −3x_1   −3x_1−3  −3x_1−3
            1−x_1²         −x_1x_2−2x_2−3  x_1−1   x_1+1   x_1+2    x_1+2
            1−x_1²         3−x_1x_2−2x_2   x_1−1   x_1+1   x_1+2    x_1+2 ].
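The entries of this L(x) were reconstructed from a garbled source, so a numerical sanity check is worthwhile (an illustrative NumPy sketch, under the assumption that the four constraints are ordered as listed):

```python
import numpy as np

# Constraints of Example 4.5: 1+x1, 1-x1, 2-x1-x2, 2-x1+x2 (all >= 0).
A45 = np.array([[1, 0], [-1, 0], [-1, -1], [-1, 1]], dtype=float)
b45 = np.array([-1.0, -1.0, -2.0, -2.0])

def L45(x):
    x1, x2 = x
    return np.array([
        [x1**2 - 3*x1 + 2, x1*x2 - x2, 4 - x1, 2 - x1, 1 - x1, 1 - x1],
        [3*x1**2 - 3*x1 - 6, 3*x2 + 3*x1*x2, 6 - 3*x1, -3*x1,
         -3*x1 - 3, -3*x1 - 3],
        [1 - x1**2, -x1*x2 - 2*x2 - 3, x1 - 1, x1 + 1, x1 + 2, x1 + 2],
        [1 - x1**2, 3 - x1*x2 - 2*x2, x1 - 1, x1 + 1, x1 + 2, x1 + 2]]) / 6.0

x = np.random.default_rng(4).standard_normal(2)
C45 = np.vstack([A45.T, np.diag(A45 @ x - b45)])
print(np.allclose(L45(x) @ C45, np.eye(4)))   # True
```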


5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λ_i as polynomial functions in x on the set of critical points.

Suppose there are m equality and inequality constraints in total, i.e.,

    E ∪ I = {1, ..., m}.

If (x, λ) is a critical pair, then λ_i c_i(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ_1 ··· λ_m]^T satisfies the equation

(5.1)   C(x) λ = [ ∇f(x) ; 0 ; ··· ; 0 ],   where
        C(x) := [ ∇c_1(x)  ∇c_2(x)  ···  ∇c_m(x)
                  c_1(x)   0        ···  0
                  0        c_2(x)   ···  0
                   ⋮                 ⋱    ⋮
                  0        0        ···  c_m(x) ].

Let C(x) be as above. If there exists L(x) ∈ R[x]^{m×(m+n)} such that

(5.2)   L(x) C(x) = I_m,

then we can get

(5.3)   λ = L(x) [ ∇f(x) ; 0 ] = L_1(x) ∇f(x),

where L_1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such an L(x) exists.

Definition 5.1. The tuple c = (c_1, ..., c_m) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ C^n.

Clearly, c being nonsingular is equivalent to saying that, for each u ∈ C^n, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then the gradients ∇c_{i_1}(u), ..., ∇c_{i_k}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.

Proposition 5.2. (i) For W(x) ∈ C[x]^{s×t} with s ≥ t, rank W(u) = t for all u ∈ C^n if and only if there exists P(x) ∈ C[x]^{t×s} such that

    P(x) W(x) = I_t.

Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (5.2).

Proof. (i) "⇐": If P(x) W(x) = I_t, then for all u ∈ C^n,

    t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.

"⇒": Suppose rank W(u) = t for all u ∈ C^n. Write W(x) in columns:

    W(x) = [ w_1(x)  w_2(x)  ···  w_t(x) ].


Then the equation w_1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ_1(x) ∈ C[x]^s such that ξ_1(x)^T w_1(x) = 1. For each i = 2, ..., t, denote

    r_{1i}(x) := ξ_1(x)^T w_i(x);

then (we use ∼ to denote row equivalence between matrices)

    W(x) ∼ [ 1       r_{12}(x)  ···  r_{1t}(x)
             w_1(x)  w_2(x)     ···  w_t(x)   ]
         ∼ W_1(x) := [ 1  r_{12}(x)      ···  r_{1t}(x)
                       0  w_2^{(1)}(x)   ···  w_t^{(1)}(x) ],

where each (i = 2, ..., t)

    w_i^{(1)}(x) := w_i(x) − r_{1i}(x) w_1(x).

So there exists P_1(x) ∈ C[x]^{(s+1)×s} such that

    P_1(x) W(x) = W_1(x).

Since W(x) and W_1(x) are row equivalent, W_1(x) must also have full column rank everywhere. Similarly, the polynomial equation

    w_2^{(1)}(x) = 0

does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists ξ_2(x) ∈ C[x]^s such that

    ξ_2(x)^T w_2^{(1)}(x) = 1.

For each i = 3, ..., t, let r_{2i}(x) := ξ_2(x)^T w_i^{(1)}(x); then

    W_1(x) ∼ [ 1  r_{12}(x)     r_{13}(x)     ···  r_{1t}(x)
               0  1             r_{23}(x)     ···  r_{2t}(x)
               0  w_2^{(1)}(x)  w_3^{(1)}(x)  ···  w_t^{(1)}(x) ]
           ∼ W_2(x) := [ 1  r_{12}(x)  r_{13}(x)     ···  r_{1t}(x)
                         0  1          r_{23}(x)     ···  r_{2t}(x)
                         0  0          w_3^{(2)}(x)  ···  w_t^{(2)}(x) ],

where each (i = 3, ..., t)

    w_i^{(2)}(x) := w_i^{(1)}(x) − r_{2i}(x) w_2^{(1)}(x).

Similarly, W_1(x) and W_2(x) are row equivalent, so W_2(x) has full column rank everywhere. There exists P_2(x) ∈ C[x]^{(s+2)×(s+1)} such that

    P_2(x) W_1(x) = W_2(x).

Continuing this process, we finally get

    W_2(x) ∼ ··· ∼ W_t(x) := [ 1  r_{12}(x)  r_{13}(x)  ···  r_{1t}(x)
                               0  1          r_{23}(x)  ···  r_{2t}(x)
                               0  0          1          ···  r_{3t}(x)
                               ⋮                         ⋱    ⋮
                               0  0          0          ···  1
                               0  0          0          ···  0 ].

Consequently, there exist P_i(x) ∈ C[x]^{(s+i)×(s+i−1)}, for i = 1, 2, ..., t, such that

    P_t(x) P_{t−1}(x) ··· P_1(x) W(x) = W_t(x).


Since W_t(x) is a unit upper triangular matrix polynomial, there exists P_{t+1}(x) ∈ C[x]^{t×(s+t)} such that P_{t+1}(x) W_t(x) = I_t. Let

    P(x) := P_{t+1}(x) P_t(x) P_{t−1}(x) ··· P_1(x);

then P(x) W(x) = I_t. Note that P(x) ∈ C[x]^{t×s}.

For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (here P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). ∎

In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), the matrix L(x) can be determined by comparing coefficients of both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
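The coefficient-matching step above can be sketched on a toy instance. Assume n = 1 and a single constraint c(x) = 1 − x² (the n = 1 case of the hypercube of Example 5.3); the problem data and variable names below are for illustration only:

```python
import numpy as np

# C(x) = [c'(x); c(x)] = [-2x; 1 - x^2].  Seek L(x) = [l1(x), l2(x)]
# with deg <= 1 and L(x) C(x) = 1 identically.  Write l1 = a0 + a1*x,
# l2 = b0 + b1*x; matching the coefficients of 1, x, x^2, x^3 in
# l1*(-2x) + l2*(1 - x^2) = 1 gives a linear system in (a0, a1, b0, b1).
coeff_mat = np.array([
    # a0    a1    b0    b1
    [0.0,  0.0,  1.0,  0.0],    # coefficient of 1
    [-2.0, 0.0,  0.0,  1.0],    # coefficient of x
    [0.0, -2.0, -1.0,  0.0],    # coefficient of x^2
    [0.0,  0.0,  0.0, -1.0]])   # coefficient of x^3
rhs = np.array([1.0, 0.0, 0.0, 0.0])
a0, a1, b0, b1 = np.linalg.solve(coeff_mat, rhs)
print(np.allclose([a0, a1, b0, b1], [0.0, -0.5, 1.0, 0.0]))   # True
```

The solution is L(x) = [−x/2, 1], in agreement with Example 5.3 for n = 1.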

Example 5.3. Consider the hypercube with quadratic constraints

    { 1 − x_1² ≥ 0, 1 − x_2² ≥ 0, ..., 1 − x_n² ≥ 0 }.

The equation L(x) C(x) = I_n is satisfied by

    L(x) = [ −(1/2) diag(x)   I_n ].
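A quick numerical check of this identity (an illustrative NumPy sketch, not the paper's MATLAB code):

```python
import numpy as np

def hypercube_LC(x):
    # Constraints 1 - x_i^2 >= 0: gradients are -2*x_i*e_i, so by (5.1)
    # C(x) = [-2*diag(x); diag(1 - x^2)] and L(x) = [-diag(x)/2, I_n].
    n = len(x)
    C = np.vstack([-2.0 * np.diag(x), np.diag(1.0 - x**2)])
    L = np.hstack([-0.5 * np.diag(x), np.eye(n)])
    return L, C

x = np.random.default_rng(5).standard_normal(3)
L, C = hypercube_LC(x)
print(np.allclose(L @ C, np.eye(3)))   # True
```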

Example 5.4. Consider the nonnegative portion of the unit sphere

    { x_1 ≥ 0, x_2 ≥ 0, ..., x_n ≥ 0, x_1² + ··· + x_n² − 1 = 0 }.

The equation L(x) C(x) = I_{n+1} is satisfied by

    L(x) = [ I_n − x x^T    x 1_n^T      2x
             (1/2) x^T      −(1/2) 1_n^T  −1 ].
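The identity can be checked numerically (an illustrative NumPy sketch, with the block structure of C(x) as in (5.1)):

```python
import numpy as np

def sphere_LC(x):
    # Constraints x_i >= 0 (i <= n) and x^T x - 1 = 0; by (5.1),
    # C(x) = [[I_n, 2x], [diag(x), 0], [0, x^T x - 1]].
    n = len(x)
    top = np.hstack([np.eye(n), 2.0 * x.reshape(n, 1)])
    bottom = np.zeros((n + 1, n + 1))
    bottom[:n, :n] = np.diag(x)
    bottom[n, n] = x @ x - 1.0
    C = np.vstack([top, bottom])
    # L(x) from Example 5.4.
    ones = np.ones(n)
    L = np.vstack([
        np.hstack([np.eye(n) - np.outer(x, x), np.outer(x, ones),
                   2.0 * x.reshape(n, 1)]),
        np.hstack([0.5 * x, -0.5 * ones, [-1.0]])])
    return L, C

x = np.random.default_rng(6).standard_normal(4)
L, C = sphere_LC(x)
print(np.allclose(L @ C, np.eye(5)))   # True
```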

Example 5.5. Consider the set

    { 1 − x_1³ − x_2⁴ ≥ 0, 1 − x_3⁴ − x_4³ ≥ 0 }.

The equation L(x) C(x) = I_2 is satisfied by

    L(x) = [ −x_1/3  −x_2/4  0       0       1  0
             0       0       −x_3/4  −x_4/3  0  1 ].
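A numerical check of L(x)C(x) = I_2 for this example (an illustrative NumPy sketch, with C(x) assembled per (5.1)):

```python
import numpy as np

def ex55_LC(x):
    # Constraints 1 - x1^3 - x2^4 >= 0 and 1 - x3^4 - x4^3 >= 0; by (5.1),
    # C(x) stacks the two gradients (as columns) over diag(c1, c2).
    x1, x2, x3, x4 = x
    c = np.array([1 - x1**3 - x2**4, 1 - x3**4 - x4**3])
    grads = np.array([[-3*x1**2, 0], [-4*x2**3, 0],
                      [0, -4*x3**3], [0, -3*x4**2]])
    C = np.vstack([grads, np.diag(c)])
    L = np.array([[-x1/3, -x2/4, 0, 0, 1, 0],
                  [0, 0, -x3/4, -x4/3, 0, 1]])
    return L, C

x = np.random.default_rng(8).standard_normal(4)
L, C = ex55_LC(x)
print(np.allclose(L @ C, np.eye(2)))   # True
```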

Example 5.6. Consider the quadratic set

    { 1 − x_1x_2 − x_2x_3 − x_1x_3 ≥ 0, 1 − x_1² − x_2² − x_3² ≥ 0 }.

The matrix L(x)^T satisfying L(x) C(x) = I_2 is

    [ 25x_1³ + 10x_1²x_2 + 40x_1x_2² − 25x_1 − 2x_3   −25x_1³ − 10x_1²x_2 − 40x_1x_2² + 49x_1² + 2x_3
      −15x_1²x_2 + 10x_1x_2² + 20x_3x_1x_2 − 10x_1    15x_1²x_2 − 10x_1x_2² − 20x_3x_1x_2 + 10x_1 − x_2²
      25x_3x_1² − 20x_1x_2² + 10x_3x_1x_2 + 2x_1      −25x_3x_1² + 20x_1x_2² − 10x_3x_1x_2 − 2x_1 − x_3²
      1 − 20x_1x_3 − 10x_1² − 20x_1x_2                20x_1x_2 + 20x_1x_3 + 10x_1²
      −50x_1² − 20x_2x_1                              50x_1² + 20x_2x_1 + 1 ].

We would like to remark that a polynomial tuple c = (c_1, ..., c_m) is generically nonsingular, and that Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d_1, ..., d_m, there exists an open dense subset U of D := R[x]_{d_1} × ··· × R[x]_{d_m} such that every tuple c = (c_1, ..., c_m) ∈ U is nonsingular. Indeed, such a U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.


Proof. The proof needs to use resultants and discriminants, for which we refer to [29].

First, let J_1 be the set of all (i_1, ..., i_{n+1}) with 1 ≤ i_1 < ··· < i_{n+1} ≤ m. The resultant Res(c_{i_1}, ..., c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i_1}, ..., c_{i_{n+1}} such that, if Res(c_{i_1}, ..., c_{i_{n+1}}) ≠ 0, then the equations

    c_{i_1}(x) = ··· = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

    F_1(c) := ∏_{(i_1,...,i_{n+1})∈J_1} Res(c_{i_1}, ..., c_{i_{n+1}}).

For the case m ≤ n, J_1 = ∅ and we simply let F_1(c) = 1. Clearly, if F_1(c) ≠ 0, then no more than n of the polynomials c_1, ..., c_m have a common complex zero.

Second, let J_2 be the set of all (j_1, ..., j_k) with k ≤ n and 1 ≤ j_1 < ··· < j_k ≤ m. When one of c_{j_1}, ..., c_{j_k} has degree bigger than one, the discriminant Δ(c_{j_1}, ..., c_{j_k}) is a polynomial in the coefficients of c_{j_1}, ..., c_{j_k} such that, if Δ(c_{j_1}, ..., c_{j_k}) ≠ 0, then the equations

    c_{j_1}(x) = ··· = c_{j_k}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every common complex solution u, the gradients of c_{j_1}, ..., c_{j_k} at u are linearly independent. When all of c_{j_1}, ..., c_{j_k} have degree one, the discriminant of the tuple (c_{j_1}, ..., c_{j_k}) is not a single polynomial, but we can define Δ(c_{j_1}, ..., c_{j_k}) to be the product of all maximal minors of its Jacobian (a constant matrix). Define

    F_2(c) := ∏_{(j_1,...,j_k)∈J_2} Δ(c_{j_1}, ..., c_{j_k}).

Clearly, if F_2(c) ≠ 0, then no n or fewer of the polynomials c_1, ..., c_m have a singular common complex zero.

Last, let F(c) := F_1(c) F_2(c) and

    U := { c = (c_1, ..., c_m) ∈ D : F(c) ≠ 0 }.

Note that U is a Zariski open subset of D, and it is open and dense in D. For all c ∈ U, no more than n of c_1, ..., c_m can have a common complex zero; and for any k polynomials (k ≤ n) of c_1, ..., c_m, if they have a common complex zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically. ∎

6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with the usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.


The polynomials p_i in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c_1, ..., c_m. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L_1(x) be the submatrix of L(x) consisting of its first n columns. Then choose (p_1, ..., p_m) to be the product L_1(x) ∇f(x), i.e.,

    p_i = ( L_1(x) ∇f(x) )_i.

In all our examples, the global minimum value f_min of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.
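The construction λ(x) = L_1(x)∇f(x) can be made concrete on a toy instance (not one of the paper's examples): minimize f(x) = x over [0, 1], with c_1 = x and c_2 = 1 − x, using the n = 1 case of L(x) from Example 4.3. A minimal Python sketch:

```python
import numpy as np

# For n = 1, Example 4.3 gives L(x) = [[1 - x, 1, 1], [-x, 1, 1]],
# so L_1(x) = [1 - x, -x] and, with f(x) = x (so grad f = 1),
# the multiplier expression is lambda(x) = L_1(x) * 1.
def multipliers(x):
    return np.array([1.0 - x, -x])

# At the minimizer x* = 0, KKT requires f'(x*) = lam1*c1' + lam2*c2'
# = lam1 - lam2, with complementarity; lambda(0) = (1, 0) satisfies this.
print(multipliers(0.0))   # [1. 0.]
```

Note that here λ_1(x)·1 + λ_2(x)·(−1) = (1 − x) + x = 1 = f'(x) identically, i.e., the expression satisfies the stationarity identity on the whole line, as the construction intends.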

By Theorem 3.3, we have f_k = f_min for all k big enough, provided f_c = f_min and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient to use: when there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (which have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with standard Lasserre relaxations in [17]. The lower bounds given by relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables. The computational time (in seconds) is also compared. The results for standard Lasserre relaxations are titled "w.o. LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".

Example 6.1. Consider the optimization problem

    min   x₁x₂(10 − x₃)
    s.t.  x₁ ≥ 0, x₂ ≥ 0, x₃ ≥ 0, 1 − x₁ − x₂ − x₃ ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum f_min = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x₁, x₂, x₃) with x₁x₂ = 0 is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

              w.o. LME                with LME
  order k   lower bound    time    lower bound    time
     2       −0.0521      0.6841    −0.0521      0.1922
     3       −0.0026      0.2657    −3·10⁻⁸      0.2285
     4       −0.0007      0.6785    −6·10⁻⁹      0.4431
     5       −0.0004      1.6105    −2·10⁻⁹      0.9567

² Note that 1 − xᵀx = (1 − eᵀx)(1 + xᵀx) + Σ_{i=1}^n x_i(1 − x_i)² + Σ_{i≠j} x_i² x_j ∈ IQ(c_eq, c_in).
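The polynomial identity in this footnote can be checked numerically. The sketch below (plain Python; the function name `sides` is illustrative, not from the paper) evaluates both sides at random points for n = 3:

```python
import random

def sides(x):
    """Evaluate both sides of the footnote identity at a point x."""
    n = len(x)
    # Left side: 1 - x^T x
    lhs = 1 - sum(xi * xi for xi in x)
    # Right side: (1 - e^T x)(1 + x^T x) + sum_i x_i (1 - x_i)^2 + sum_{i != j} x_i^2 x_j
    rhs = (1 - sum(x)) * (1 + sum(xi * xi for xi in x))
    rhs += sum(xi * (1 - xi) ** 2 for xi in x)
    rhs += sum(x[i] ** 2 * x[j] for i in range(n) for j in range(n) if i != j)
    return lhs, rhs

random.seed(0)
for _ in range(100):
    x = [random.uniform(-2, 2) for _ in range(3)]
    lhs, rhs = sides(x)
    assert abs(lhs - rhs) < 1e-9
print("identity verified")
```

The cubic terms Σ_{i≠j} x_i² x_j cancel exactly against those produced by expanding (1 − eᵀx)(1 + xᵀx), which is why both sides agree identically.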

22 JIAWANG NIE

Example 6.2. Consider the optimization problem

    min   x₁⁴x₂² + x₁²x₂⁴ + x₃⁶ − 3x₁²x₂²x₃² + (x₁⁴ + x₂⁴ + x₃⁴)
    s.t.  x₁² + x₂² + x₃² ≥ 1.

The matrix polynomial L(x) = [ x₁/2  x₂/2  x₃/2  −1 ]. The objective f is the sum of the positive definite form x₁⁴ + x₂⁴ + x₃⁴ and the Motzkin polynomial

    M(x) = x₁⁴x₂² + x₁²x₂⁴ + x₃⁶ − 3x₁²x₂²x₃².

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive and f_min is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

    c₁(x)p₁(x) = (x₁² + x₂² + x₃² − 1)( 3M(x) + 2(x₁⁴ + x₂⁴ + x₃⁴) ) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value f_min = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

              w.o. LME                with LME
  order k   lower bound    time    lower bound    time
     3       −∞            0.4466    0.1111      0.1169
     4       −∞            0.4948    0.3333      0.3499
     5       −2.1821·10⁵   1.1836    0.3333      0.6530
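As a sanity check on the reported optimum (not part of the paper's computation), one can evaluate the objective of Example 6.2 at the 8 claimed minimizers; a small Python sketch:

```python
def f(x1, x2, x3):
    # Motzkin polynomial M(x) plus the positive definite form x1^4 + x2^4 + x3^4
    M = x1**4 * x2**2 + x1**2 * x2**4 + x3**6 - 3 * x1**2 * x2**2 * x3**2
    return M + (x1**4 + x2**4 + x3**4)

s = 3 ** -0.5  # 1/sqrt(3)
# All 8 sign patterns (±1/√3, ±1/√3, ±1/√3) give f = 1/3
vals = [f(a * s, b * s, c * s) for a in (1, -1) for b in (1, -1) for c in (1, -1)]
assert all(abs(v - 1.0 / 3.0) < 1e-12 for v in vals)
print("f = 1/3 at all 8 minimizers")
```

At these points M vanishes (the equality case of the AM-GM inequality), so the value 1/3 comes entirely from x₁⁴ + x₂⁴ + x₃⁴ = 3·(1/9).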

Example 6.3. Consider the optimization problem

    min   x₁x₂ + x₂x₃ + x₃x₄ − 3x₁x₂x₃x₄ + (x₁³ + ··· + x₄³)
    s.t.  x₁, x₂, x₃, x₄ ≥ 0, 1 − x₁ − x₂ ≥ 0, 1 − x₃ − x₄ ≥ 0.

The matrix polynomial L(x) is

    [ 1−x₁   −x₂     0      0    1 1 0 0 1 0 ]
    [  −x₁  1−x₂     0      0    1 1 0 0 1 0 ]
    [   0     0    1−x₃   −x₄    0 0 1 1 0 1 ]
    [   0     0     −x₃   1−x₄   0 0 1 1 0 1 ]
    [  −x₁   −x₂     0      0    1 1 0 0 1 0 ]
    [   0     0     −x₃   −x₄    0 0 1 1 0 1 ]

The feasible set is compact, so f_min is achieved at a critical point. One can show that f_min = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(c_eq, c_in) is archimedean.⁴ The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because −c₁²p₁² ∈ Ideal(φ) ⊆ IQ(φ, ψ), and the set {−c₁(x)²p₁(x)² ≥ 0} is compact.

⁴ This is because 1 − x₁² − x₂² belongs to the quadratic module of (x₁, x₂, 1 − x₁ − x₂), and 1 − x₃² − x₄² belongs to the quadratic module of (x₃, x₄, 1 − x₃ − x₄). See the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

              w.o. LME                with LME
  order k   lower bound    time    lower bound    time
     3       −2.9·10⁻⁵    0.7335    −6·10⁻⁷      0.6091
     4       −1.4·10⁻⁵    2.5055    −8·10⁻⁸      2.7423
     5       −1.4·10⁻⁵   12.7092    −5·10⁻⁸     13.7449

Example 6.4. Consider the polynomial optimization problem

    min_{x∈ℝ²}   x₁² + 50x₂²
    s.t.  x₁² − 1/2 ≥ 0,  x₂² − 2x₁x₂ − 1/8 ≥ 0,  x₂² + 2x₁x₂ − 1/8 ≥ 0.

It is motivated from an example in [12, §3]. The first column of L(x) is

    [ (8x₁³)/5 + x₁/5 ]
    [ (288x₂x₁⁴)/5 − (16x₁³)/5 − (124x₂x₁²)/5 + (8x₁)/5 − 2x₂ ]
    [ −(288x₂x₁⁴)/5 − (16x₁³)/5 + (124x₂x₁²)/5 + (8x₁)/5 + 2x₂ ]

and the second column of L(x) is

    [ −(8x₁²x₂)/5 + (4x₂³)/5 − x₂/10 ]
    [ (288x₁³x₂²)/5 + (16x₁²x₂)/5 − (142x₁x₂²)/5 − (9x₁)/20 − (8x₂³)/5 + (11x₂)/5 ]
    [ −(288x₁³x₂²)/5 + (16x₁²x₂)/5 + (142x₁x₂²)/5 + (9x₁)/20 − (8x₂³)/5 + (11x₂)/5 ]

The objective is coercive, so f_min is achieved at a critical point. The minimum value f_min = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

              w.o. LME                with LME
  order k   lower bound    time    lower bound    time
     3        6.7535      0.4611     56.7500     0.1309
     4        6.9294      0.2428    112.6517     0.2405
     5        8.8519      0.3376    112.6517     0.2167
     6       16.5971      0.4703    112.6517     0.3788
     7       35.4756      0.6536    112.6517     0.4537
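The closed-form optimum of Example 6.4 can be checked directly (an illustrative verification, not the paper's computation): at the claimed minimizer, the first two constraints are active and the value matches 56 + 3/4 + 25√5.

```python
from math import sqrt, isclose

def f(x1, x2):
    return x1**2 + 50 * x2**2

# Claimed minimizer (sqrt(1/2), sqrt(5/8) + sqrt(1/2)) and value 56 + 3/4 + 25*sqrt(5)
x1 = sqrt(0.5)
x2 = sqrt(5.0 / 8.0) + sqrt(0.5)
fmin = 56 + 0.75 + 25 * sqrt(5)

assert isclose(f(x1, x2), fmin, rel_tol=1e-12)
# Constraint feasibility: the first two constraints are active, the third is inactive
assert abs(x1**2 - 0.5) < 1e-12
assert abs(x2**2 - 2 * x1 * x2 - 0.125) < 1e-9
assert x2**2 + 2 * x1 * x2 - 0.125 > 0
print(round(fmin, 4))  # 112.6517
```

The exact cancellation in the second constraint follows from x₂² = 9/8 + √5/2 and 2x₁x₂ = 1 + √5/2.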

Example 6.5. Consider the optimization problem

    min_{x∈ℝ³}   x₁³ + x₂³ + x₃³ + 4x₁x₂x₃ − ( x₁(x₂² + x₃²) + x₂(x₃² + x₁²) + x₃(x₁² + x₂²) )
    s.t.  x₁ ≥ 0,  x₁x₂ − 1 ≥ 0,  x₂x₃ − 1 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x₁x₂    0    0    x₂    x₂     0 ]
    [    x₁     0    0    −1    −1     0 ]
    [   −x₁    x₂    0     1     0    −1 ]


The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant ℝ³₊, so the minimum value is achieved at a critical point. In computation, we got f_min ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

              w.o. LME                with LME
  order k   lower bound    time    lower bound    time
     2       −∞            0.4129    −∞           0.1900
     3       −7.8184·10⁶   0.4641     0.9492      0.3139
     4       −2.0575·10⁴   0.6499     0.9492      0.5057
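Assuming (as in Section 5) that C(x) stacks the three constraint gradients over diag(c(x)), the displayed L(x) of Example 6.5 can be checked numerically to satisfy L(x)C(x) = I₃. This is a hedged sketch; the function name `LC` is illustrative:

```python
import random

def LC(x1, x2, x3):
    """Compute the 3x3 product L(x)C(x) for the constraints of Example 6.5."""
    c = [x1, x1 * x2 - 1, x2 * x3 - 1]                 # constraining polynomials
    grads = [[1, 0, 0], [x2, x1, 0], [0, x3, x2]]      # their gradients
    # C(x): gradients stacked over diag(c), a 6 x 3 matrix (column j for c_j)
    C = [[grads[j][i] for j in range(3)] for i in range(3)]
    C += [[c[j] if i == j else 0 for j in range(3)] for i in range(3)]
    L = [[1 - x1 * x2, 0, 0, x2, x2, 0],
         [x1,          0, 0, -1, -1, 0],
         [-x1,        x2, 0,  1,  0, -1]]
    return [[sum(L[i][k] * C[k][j] for k in range(6)) for j in range(3)]
            for i in range(3)]

random.seed(1)
for _ in range(20):
    P = LC(*[random.uniform(-3, 3) for _ in range(3)])
    for i in range(3):
        for j in range(3):
            assert abs(P[i][j] - (1 if i == j else 0)) < 1e-9
print("L(x)C(x) = I3")
```

Since the check passes at random points, the identity holds as a polynomial identity, which is what Proposition 5.2 requires.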

Example 6.6. Consider the optimization problem (x₀ = 1)

    min_{x∈ℝ⁴}   xᵀx + Σ_{i=0}^{4} Π_{j≠i} (x_i − x_j)
    s.t.  x₁² − 1 ≥ 0,  x₂² − 1 ≥ 0,  x₃² − 1 ≥ 0,  x₄² − 1 ≥ 0.

The matrix polynomial L(x) = [ diag(x)/2  −I₄ ]. The first part of the objective is xᵀx, while the second part is a nonnegative polynomial [37]. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min = 4.0000 and 11 global minimizers

    (1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
    (1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
    (−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

              w.o. LME                with LME
  order k   lower bound    time    lower bound    time
     3       −∞            1.1377     3.5480      1.1765
     4       −6.6913·10⁴   4.7677     4.0000      3.0761
     5      −21.3778      22.9970     4.0000     10.3354
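One can verify by enumeration (an illustrative check, not from the paper) that exactly the 11 listed sign vectors attain f = 4 among the 16 points of {±1}⁴; the excluded ones are (−1,−1,−1,−1) and the four vectors with a single −1, where the lone coordinate value leaves one product term equal to 16.

```python
from itertools import product
from math import prod

def f(x):
    # x0 = 1 is fixed; the second sum runs over i = 0, ..., 4
    y = (1,) + tuple(x)
    return sum(t * t for t in x) + sum(
        prod(y[i] - y[j] for j in range(5) if j != i) for i in range(5)
    )

minimizers = [x for x in product((1, -1), repeat=4) if abs(f(x) - 4.0) < 1e-12]
assert len(minimizers) == 11
assert (1, 1, 1, 1) in minimizers and (-1, -1, -1, -1) not in minimizers
print(len(minimizers), "minimizers with f = 4")
```

Whenever every value among (x₀, ..., x₄) is repeated, each product Π_{j≠i}(x_i − x_j) contains a zero factor, so f reduces to xᵀx = 4.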

Example 6.7. Consider the optimization problem

    min_{x∈ℝ³}   x₁⁴x₂² + x₂⁴x₃² + x₃⁴x₁² − 3x₁²x₂²x₃² + x₂²
    s.t.  x₁ − x₂x₃ ≥ 0,  −x₂ + x₃² ≥ 0.

The matrix polynomial L(x) =

    [  1    0   0 0 0 ]
    [ −x₃  −1   0 0 0 ]

By the arithmetic-geometric mean inequality, one can show that f_min = 0. The global minimizers are (x₁, 0, x₃) with x₁ ≥ 0 and x₁x₃ = 0. The computational results for standard Lasserre's


relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that f_k = f_min for all k ≥ 5, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

              w.o. LME                with LME
  order k   lower bound    time    lower bound    time
     3       −∞            0.6144    −∞           0.3418
     4       −1.0909·10⁷   1.0542    −3.9476      0.7180
     5     −942.6772       1.6771    −3·10⁻⁹      1.4607
     6       −0.0110       3.3532    −8·10⁻¹⁰     3.1618
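The AM-GM argument for Example 6.7 can be illustrated numerically: the three leading terms dominate 3x₁²x₂²x₃², so f ≥ x₂² ≥ 0, with equality exactly on the described set. A hedged sketch:

```python
import random

def f(x1, x2, x3):
    return (x1**4 * x2**2 + x2**4 * x3**2 + x3**4 * x1**2
            - 3 * x1**2 * x2**2 * x3**2 + x2**2)

# AM-GM: x1^4 x2^2 + x2^4 x3^2 + x3^4 x1^2 >= 3 x1^2 x2^2 x3^2, hence f >= x2^2 >= 0
random.seed(2)
assert all(f(*[random.uniform(-2, 2) for _ in range(3)]) >= -1e-12 for _ in range(1000))
# f vanishes at (x1, 0, x3) exactly when x1 * x3 = 0
assert f(0.7, 0.0, 0.0) == 0.0 and f(0.0, 0.0, 1.3) == 0.0
assert f(0.7, 0.0, 1.3) > 0.0
print("f >= 0, minimizers as described")
```

Note that at (x₁, 0, x₃) the objective reduces to x₃⁴x₁², which explains the extra condition x₁x₃ = 0 in the minimizer description.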

Example 6.8. Consider the optimization problem

    min_{x∈ℝ⁴}   x₁²(x₁ − x₄)² + x₂²(x₂ − x₄)² + x₃²(x₃ − x₄)²
                 + 2x₁x₂x₃(x₁ + x₂ + x₃ − 2x₄) + (x₁ − 1)² + (x₂ − 1)² + (x₃ − 1)²
    s.t.  x₁ − x₂ ≥ 0,  x₂ − x₃ ≥ 0.

The matrix polynomial L(x) =

    [ 1 0 0 0 0 0 ]
    [ 1 1 0 0 0 0 ]

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min ≈ 0.9413 and a minimizer

    (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

              w.o. LME                with LME
  order k   lower bound    time    lower bound    time
     2       −∞            0.3984    −0.3360      0.9321
     3       −∞            0.7634     0.9413      0.5240
     4       −6.4896·10⁵   4.5496     0.9413      1.7192
     5       −3.1645·10³  24.3665     0.9413      8.1228

Example 6.9. Consider the optimization problem

    min_{x∈ℝ⁴}   (x₁ + x₂ + x₃ + x₄ + 1)² − 4(x₁x₂ + x₂x₃ + x₃x₄ + x₄ + x₁)
    s.t.  0 ≤ x₁, ..., x₄ ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so f_min is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value f_min = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that 4 − Σ_{i=1}^4 x_i² = Σ_{i=1}^4 ( x_i(1 − x_i)² + (1 − x_i)(1 + x_i²) ) ∈ IQ(c_eq, c_in).
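The identity in this footnote reduces, summand by summand, to t(1 − t)² + (1 − t)(1 + t²) = (1 − t)(1 + t) = 1 − t²; a small numeric check (illustrative names):

```python
import random

def lhs_rhs(x):
    lhs = 4 - sum(t * t for t in x)
    rhs = sum(t * (1 - t) ** 2 + (1 - t) * (1 + t * t) for t in x)
    return lhs, rhs

random.seed(3)
for _ in range(100):
    x = [random.uniform(-2, 2) for _ in range(4)]
    l, r = lhs_rhs(x)
    assert abs(l - r) < 1e-9
print("identity verified")
```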


Table 9. Computational results for Example 6.9

              w.o. LME                with LME
  order k   lower bound    time    lower bound    time
     2       −0.0279      0.2262    −5·10⁻⁶      1.1835
     3       −0.0005      0.4691    −6·10⁻⁷      1.6566
     4       −0.0001      3.1098    −2·10⁻⁷      5.5234
     5       −4·10⁻⁵     16.5092    −6·10⁻⁷     19.7320
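The whole segment of minimizers in Example 6.9 can be verified directly: on (t, 0, 0, 1 − t), the square equals 4 and the subtracted term equals 4(t + (1 − t)) = 4, so f = 0 for every t. A short sketch:

```python
def f(x1, x2, x3, x4):
    return ((x1 + x2 + x3 + x4 + 1) ** 2
            - 4 * (x1 * x2 + x2 * x3 + x3 * x4 + x4 + x1))

# f vanishes identically on the segment (t, 0, 0, 1 - t), t in [0, 1]
for k in range(11):
    t = k / 10.0
    assert abs(f(t, 0.0, 0.0, 1.0 - t)) < 1e-12
print("f = 0 on the whole segment")
```

This infinite family of minimizers is exactly why the rank condition (3.17) is not satisfied for this example.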

For some polynomial optimization problems, standard Lasserre relaxations might converge fast, e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem (x₀ = 1)

    min_{x∈ℝⁿ}   Σ_{0≤i≤j≤k≤n} c_{ijk} x_i x_j x_k
    s.t.  0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have f_c = f_min. The condition ii) of Theorem 3.3 is satisfied, because of box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time that is consumed by standard Lasserre's relaxations and (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

      n        9        10       11        12        13        14
  w.o. LME   1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
  with LME   1.9714   3.8288   8.2519   20.0310   37.6373   82.4778

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value f_min is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed relaxations (3.8)-(3.9) for solving (3.7). Note that

    IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k} = Ideal(c_eq, φ)_{2k} + Qmod(c_in, ψ)_{2k}.

If we replace the quadratic module Qmod(c_in, ψ) by the preordering of (c_in, ψ) [21, 25], we can get further tighter relaxations. For convenience, write (c_in, ψ) as a single tuple (g₁, ..., g_ℓ). Its preordering is the set

    Preord(c_in, ψ) = Σ_{r₁,...,r_ℓ ∈ {0,1}} g₁^{r₁} ··· g_ℓ^{r_ℓ} · Σ[x].
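The truncated preordering uses one localizing condition per product g₁^{r₁}···g_ℓ^{r_ℓ}, i.e., 2^ℓ of them, versus ℓ + 1 for the quadratic module; this is the price of the tighter relaxation (7.1). A small Python sketch (hypothetical generators, names illustrative) enumerates these products:

```python
from itertools import product

def preordering_products(gens):
    """One evaluator per exponent vector r in {0,1}^l:
       x -> g_1(x)^{r_1} * ... * g_l(x)^{r_l}."""
    evaluators = []
    for r in product((0, 1), repeat=len(gens)):
        def ev(x, r=r):
            val = 1.0
            for ri, gi in zip(r, gens):
                if ri:
                    val *= gi(x)
            return val
        evaluators.append(ev)
    return evaluators

# Hypothetical generators, e.g. the two constraints of a toy problem
gens = [lambda x: x[0], lambda x: 1 - x[0] - x[1]]
evs = preordering_products(gens)
assert len(evs) == 2 ** len(gens)   # 2^l localizing matrices in (7.1)
assert evs[0]((0.3, 0.2)) == 1.0    # empty product (r = 0) gives the plain SOS term
print(len(evs), "products")
```

The exponential count 2^ℓ is why preordering-based relaxations are rarely used when ℓ is large, despite the stronger convergence guarantee of Theorem 7.1.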


The truncation Preord(c_in, ψ)_{2k} is similarly defined like Qmod(c_in, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)    f′_{pre,k} := min  ⟨f, y⟩
                  s.t.  ⟨1, y⟩ = 1,  L^{(k)}_{c_eq}(y) = 0,  L^{(k)}_{φ}(y) = 0,
                        L^{(k)}_{g₁^{r₁}···g_ℓ^{r_ℓ}}(y) ⪰ 0  for all r₁, ..., r_ℓ ∈ {0, 1},
                        y ∈ ℝ^{N^n_{2k}}.

Similar to (3.9), the dual optimization problem of the above is

(7.2)    f_{pre,k} := max  γ
                  s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Preord(c_in, ψ)_{2k}.

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose K_c ≠ ∅ and Assumption 3.1 holds. Then

    f_{pre,k} = f′_{pre,k} = f_c

for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_{pre,k} = f′_{pre,k} = f_min for all k big enough.

Proof. The proof is very similar to the Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have f̃(x) ≡ 0 on the set

    K₃ = { x ∈ ℝⁿ | c_eq(x) = 0, φ(x) = 0, c_in(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(c_in, ψ) such that f̃^{2ℓ} + q ∈ Ideal(c_eq, φ). The rest of the proof is the same. □

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is not nonsingular, then such L(x) does not exist. For such cases, how can we express λ in terms of x, for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x). In the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times. There exist sharp degree bounds for Hilbert's Nullstellensatz [16]. For each time of its usage, if the degree bound in [16] is used, then a degree bound for L(x) can be eventually obtained. However, the bound obtained in this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

    λ = ( C(x)ᵀ C(x) )⁻¹ C(x)ᵀ [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a high degree polynomial. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
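As an illustration only (a hedged numeric sketch, not the paper's implementation), the least-squares formula can be evaluated at a point. Below, the constraints c_i = x_i² − 1 are those of Example 6.6, C(u) stacks the gradients over diag(c(u)), and ∇f(u) = 2u at u = (1, 1, 1, 1) is taken as given; the formula then returns λ = (1, 1, 1, 1):

```python
def solve(A, b):
    # Tiny Gauss-Jordan elimination for a square system A z = b
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                t = M[r][col] / M[col][col]
                M[r] = [a - t * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def multipliers(u, grad_f):
    n = len(u)
    # C(u): gradients of c_i = x_i^2 - 1 stacked over diag(c), a (2n) x n matrix
    C = [[2 * u[j] if i == j else 0 for j in range(n)] for i in range(n)]
    C += [[u[j] ** 2 - 1 if i == j else 0 for j in range(n)] for i in range(n)]
    b = list(grad_f) + [0.0] * n
    # lambda = (C^T C)^{-1} C^T [grad_f; 0], via the normal equations
    CtC = [[sum(C[k][i] * C[k][j] for k in range(2 * n)) for j in range(n)]
           for i in range(n)]
    Ctb = [sum(C[k][i] * b[k] for k in range(2 * n)) for i in range(n)]
    return solve(CtC, Ctb)

u = (1.0, 1.0, 1.0, 1.0)
lam = multipliers(u, [2.0, 2.0, 2.0, 2.0])
assert all(abs(li - 1.0) < 1e-9 for li in lam)
print(lam)
```

The denominator det(C(x)ᵀC(x)) that makes this representation expensive never appears here, because the evaluation is purely numeric at a fixed point.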

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties, and algorithms. An introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal On Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Löfberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. EURO J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pages 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167–191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634–1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. González-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pages 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272, American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093

E-mail address: njw@math.ucsd.edu



3.1. Tightness of the relaxations. Let c_in, ψ, K_c, f_c be as above. We refer to §2 for the notation Qmod(c_in, ψ). We begin with a general assumption.

Assumption 3.2. There exists ρ ∈ Qmod(c_in, ψ) such that if u ∈ K_c and f(u) < f_c, then ρ(u) < 0.

In Assumption 3.2, the hypersurface ρ(x) = 0 separates feasible and infeasible critical points. Clearly, if u ∈ K_c is a feasible point for (3.7), then c_in(u) ≥ 0 and ψ(u) ≥ 0, and hence ρ(u) ≥ 0. Assumption 3.2 generally holds. For instance, it is satisfied for the following general cases:

a) When there are no inequality constraints, c_in and ψ are empty tuples. Then Qmod(c_in, ψ) = Σ[x], and Assumption 3.2 is satisfied for ρ = 0.

b) Suppose the set K_c is finite, say K_c = {u₁, ..., u_D}, and

    f(u₁), ..., f(u_{t−1}) < f_c ≤ f(u_t), ..., f(u_D).

Let ℓ₁, ..., ℓ_D be real interpolating polynomials such that ℓ_i(u_j) = 1 for i = j and ℓ_i(u_j) = 0 for i ≠ j. For each i = 1, ..., t−1, there must exist j_i ∈ I such that c_{j_i}(u_i) < 0. Then the polynomial

(3.12)    ρ = Σ_{i<t} ( −1/c_{j_i}(u_i) ) c_{j_i}(x) ℓ_i(x)² + Σ_{i≥t} ℓ_i(x)²

satisfies Assumption 3.2.

c) For each x with f(x) = v_i < f_c, at least one of the constraints c_j(x) ≥ 0, p_j(x) ≥ 0 (j ∈ I) is violated. Suppose for each critical value v_i < f_c, there exists g_i ∈ {c_j, p_j}_{j∈I} such that

    g_i < 0  on  K_c ∩ {f(x) = v_i}.

Let ϕ₁, ..., ϕ_N be real univariate polynomials such that ϕ_i(v_j) = 0 for i ≠ j and ϕ_i(v_j) = 1 for i = j. Suppose v_t = f_c. Then the polynomial

(3.13)    ρ = Σ_{i<t} g_i(x) ( ϕ_i(f(x)) )² + Σ_{i≥t} ( ϕ_i(f(x)) )²

satisfies Assumption 3.2.

We refer to §2 for the archimedean condition and the notation IQ(h, g) as in (2.1). The following is about the convergence of relaxations (3.8)-(3.9).

Theorem 3.3. Suppose K_c ≠ ∅ and Assumption 3.1 holds. If

    i) IQ(c_eq, c_in) + IQ(φ, ψ) is archimedean, or
    ii) IQ(c_eq, c_in) is archimedean, or
    iii) Assumption 3.2 holds,

then f_k = f′_k = f_c for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_k = f′_k = f_min for all k big enough, if one of the conditions i)-iii) is satisfied.

Remark. In Theorem 3.3, the conclusion holds if any one of conditions i)-iii) is satisfied. The condition ii) is only about constraining polynomials of (1.1). It can be checked without φ, ψ. Clearly, the condition ii) implies the condition i).

The proof for Theorem 3.3 is given in the following. The main idea is to consider the set of critical points. It can be expressed as a union of subvarieties. The objective f is a constant in each one of them. We can get an SOS type representation for f on each subvariety, and then construct a single one for f over the entire set of critical points.

Proof of Theorem 3.3. Clearly, every point in the complex variety

    K₁ = { x ∈ ℂⁿ | c_eq(x) = 0, φ(x) = 0 }

is a critical point. By Lemma 3.3 of [6], the objective f achieves finitely many real values on K_c = K₁ ∩ ℝⁿ, say, they are v₁ < ··· < v_N. Up to the shifting of a constant in f, we can further assume that f_c = 0. Clearly, f_c equals one of v₁, ..., v_N, say, v_t = f_c = 0.

Case I: Assume IQ(c_eq, c_in) + IQ(φ, ψ) is archimedean. Let

    I = Ideal(c_eq, φ),

the critical ideal. Note that K₁ = V_ℂ(I). The variety V_ℂ(I) is a union of irreducible subvarieties, say, V₁, ..., V_ℓ. If V_i ∩ ℝⁿ ≠ ∅, then f is a real constant on V_i, which equals one of v₁, ..., v_N. This can be implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of V_ℂ(I):

    T_i = K₁ ∩ {f(x) = v_i}  (i = t, ..., N).

Let T_{t−1} be the union of irreducible subvarieties V_i such that either V_i ∩ ℝⁿ = ∅, or f ≡ v_j on V_i with v_j < v_t = f_c. Then it holds that

    V_ℂ(I) = T_{t−1} ∪ T_t ∪ ··· ∪ T_N.

By the primary decomposition of I [11, 41], there exist ideals I_{t−1}, I_t, ..., I_N ⊆ ℝ[x] such that

    I = I_{t−1} ∩ I_t ∩ ··· ∩ I_N

and T_i = V_ℂ(I_i) for all i = t−1, t, ..., N. Denote the semialgebraic set

(3.14)    S = { x ∈ ℝⁿ | c_in(x) ≥ 0, ψ(x) ≥ 0 }.

For i = t−1, we have V_ℝ(I_{t−1}) ∩ S = ∅, because v₁, ..., v_{t−1} < f_c. By the Positivstellensatz [2, Corollary 4.4.3], there exists p₀ ∈ Preord(c_in, ψ)¹ satisfying 2 + p₀ ∈ I_{t−1}. Note that 1 + p₀ > 0 on V_ℝ(I_{t−1}) ∩ S. The set I_{t−1} + Qmod(c_in, ψ) is archimedean, because I ⊆ I_{t−1} and

    IQ(c_eq, c_in) + IQ(φ, ψ) ⊆ I_{t−1} + Qmod(c_in, ψ).

By Theorem 2.1, we have

    p₁ := 1 + p₀ ∈ I_{t−1} + Qmod(c_in, ψ).

Then 1 + p₁ ∈ I_{t−1}. There exists p₂ ∈ Qmod(c_in, ψ) such that

    −1 ≡ p₁ ≡ p₂  mod I_{t−1}.

Since f = (f/4 + 1)² − 1 · (f/4 − 1)², we have

    f ≡ σ_{t−1} := (f/4 + 1)² + p₂ (f/4 − 1)²  mod I_{t−1}.

So, when k is big enough, we have σ_{t−1} ∈ Qmod(c_in, ψ)_{2k}.

¹It is the preordering of the polynomial tuple (c_in, ψ); see §7.1.


For i = t, v_t = 0 and f(x) vanishes on V_ℂ(I_t). By Hilbert's Strong Nullstellensatz [4], there exists an integer m_t > 0 such that f^{m_t} ∈ I_t. Define the polynomial

    s_t(ε) := √ε · Σ_{j=0}^{m_t−1} C(1/2, j) ε^{−j} f^j,

where C(1/2, j) denotes the binomial coefficient (1/2 choose j). Then we have that

    D₁ := ε(1 + ε⁻¹ f) − ( s_t(ε) )² ≡ 0  mod I_t.

This is because, in the subtraction of D₁, after expanding ( s_t(ε) )², all the terms f^j with j < m_t are cancelled, and f^j ∈ I_t for j ≥ m_t. So D₁ ∈ I_t. Let σ_t(ε) = s_t(ε)², then f + ε − σ_t(ε) = D₁ and

(3.15)    f + ε − σ_t(ε) = Σ_{j=0}^{m_t−2} b_j(ε) f^{m_t+j}

for some real scalars b_j(ε) depending on ε.
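The cancellation behind D₁ is the truncated binomial series for the square root; restated in LaTeX form (a rewriting of the step above, not an addition to the proof):

```latex
\[
\sqrt{1+u} \;=\; \sum_{j\ge 0} \binom{1/2}{j} u^{j},
\qquad
s_t(\epsilon)^2 \;=\; \epsilon\Bigl(1+\epsilon^{-1}f\Bigr)
\;-\; \sum_{j=0}^{m_t-2} b_j(\epsilon)\, f^{\,m_t+j},
\]
since the terms with $f^j$, $j < m_t$, in $s_t(\epsilon)^2$ agree with those in the
series expansion of $\epsilon\bigl(1+\epsilon^{-1}f\bigr)$, and the leftover terms
involve only $f^{m_t},\dots,f^{2m_t-2} \in I_t$.
```

This is exactly the source of the scalars b_j(ε) in (3.15).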

For each i = t+1, ..., N, v_i > 0 and f(x)/v_i − 1 vanishes on V_ℂ(I_i). By Hilbert's Strong Nullstellensatz [4], there exists 0 < m_i ∈ ℕ such that (f/v_i − 1)^{m_i} ∈ I_i. Let

    s_i := √v_i · Σ_{j=0}^{m_i−1} C(1/2, j) (f/v_i − 1)^j.

Like for the case i = t, we can similarly show that f − s_i² ∈ I_i. Let σ_i = s_i², then f − σ_i ∈ I_i.

Note that V_ℂ(I_i) ∩ V_ℂ(I_j) = ∅ for all i ≠ j. By Lemma 3.3 of [30], there exist polynomials a_{t−1}, ..., a_N ∈ ℝ[x] such that

    a_{t−1}² + ··· + a_N² − 1 ∈ I,    a_i ∈ ∩_{j∈{t−1,...,N}, j≠i} I_j.

For ε > 0, denote the polynomial

    σ_ε := σ_t(ε) a_t² + Σ_{t≠j∈{t−1,...,N}} (σ_j + ε) a_j²,

then

    f + ε − σ_ε = (f + ε)(1 − a_{t−1}² − ··· − a_N²)
                  + Σ_{t≠i∈{t−1,...,N}} (f − σ_i) a_i² + (f + ε − σ_t(ε)) a_t².

For each i ≠ t, f − σ_i ∈ I_i, so

    (f − σ_i) a_i² ∈ ∩_{j=t−1}^{N} I_j = I.

Hence, there exists k₁ > 0 such that

    (f − σ_i) a_i² ∈ I_{2k₁}   (t ≠ i ∈ {t−1, ..., N}).

Since f + ε − σ_t(ε) ∈ I_t, we also have

    (f + ε − σ_t(ε)) a_t² ∈ ∩_{j=t−1}^{N} I_j = I.

Moreover, by the equation (3.15),

    (f + ε − σ_t(ε)) a_t² = Σ_{j=0}^{m_t−2} b_j(ε) f^{m_t+j} a_t².


Each f^{m_t+j} a_t² ∈ I, since f^{m_t+j} ∈ I_t. So, there exists k₂ > 0 such that, for all ε > 0,

    (f + ε − σ_t(ε)) a_t² ∈ I_{2k₂}.

Since 1 − a_{t−1}² − ··· − a_N² ∈ I, there also exists k₃ > 0 such that, for all ε > 0,

    (f + ε)(1 − a_{t−1}² − ··· − a_N²) ∈ I_{2k₃}.

Hence, if k* ≥ max{k₁, k₂, k₃}, then we have

    f(x) + ε − σ_ε ∈ I_{2k*}

for all ε > 0. By the construction, the degrees of all σ_i and a_i are independent of ε. So, σ_ε ∈ Qmod(c_in, ψ)_{2k*} for all ε > 0, if k* is big enough. Note that

    I_{2k*} + Qmod(c_in, ψ)_{2k*} = IQ(c_eq, c_in)_{2k*} + IQ(φ, ψ)_{2k*}.

This implies that f_{k*} ≥ f_c − ε for all ε > 0. On the other hand, we always have f_{k*} ≤ f_c. So f_{k*} = f_c. Moreover, since f_k is monotonically increasing, we must have f_k = f_c for all k ≥ k*.

Case II: Assume IQ(c_eq, c_in) is archimedean. Because

    IQ(c_eq, c_in) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ),

the set IQ(c_eq, c_in) + IQ(φ, ψ) is also archimedean. Therefore, the conclusion is also true, by applying the result for Case I.

Case III: Suppose Assumption 3.2 holds. Let ϕ₁, ..., ϕ_N be real univariate polynomials such that ϕ_i(v_j) = 0 for i ≠ j and ϕ_i(v_j) = 1 for i = j. Let

    s = s_t + ··· + s_N,  where each  s_i = (v_i − f_c)( ϕ_i(f) )².

Then s ∈ Σ[x]_{2k₄} for some integer k₄ > 0. Let

    f̃ = f − f_c − s.

We show that there exist an integer ℓ > 0 and q ∈ Qmod(c_in, ψ) such that

    f̃^{2ℓ} + q ∈ Ideal(c_eq, φ).

This is because, by Assumption 3.2, f̃(x) ≡ 0 on the set

    K₂ = { x ∈ ℝⁿ | c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0 }.

It has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist 0 < ℓ ∈ ℕ and q = b₀ + ρ b₁ (b₀, b₁ ∈ Σ[x]) such that f̃^{2ℓ} + q ∈ Ideal(c_eq, φ). By Assumption 3.2, ρ ∈ Qmod(c_in, ψ), so we have q ∈ Qmod(c_in, ψ).

For all ε > 0 and τ > 0, we have f̃ + ε = φ_ε + θ_ε, where

    φ_ε = −τ ε^{1−2ℓ} ( f̃^{2ℓ} + q ),
    θ_ε = ε ( 1 + f̃/ε + τ (f̃/ε)^{2ℓ} ) + τ ε^{1−2ℓ} q.

By Lemma 2.1 of [31], when τ ≥ 1/(2ℓ), there exists k₅ such that, for all ε > 0,

    φ_ε ∈ Ideal(c_eq, φ)_{2k₅},   θ_ε ∈ Qmod(c_in, ψ)_{2k₅}.

Hence, we can get

    f − (f_c − ε) = φ_ε + σ_ε,

where σ_ε = θ_ε + s ∈ Qmod(c_in, ψ)_{2k₅} for all ε > 0. Note that

    IQ(c_eq, c_in)_{2k₅} + IQ(φ, ψ)_{2k₅} = Ideal(c_eq, φ)_{2k₅} + Qmod(c_in, ψ)_{2k₅}.


For all ε > 0, γ = f_c − ε is feasible in (3.9) for the order k₅, so f_{k₅} ≥ f_c. Because of (3.11) and the monotonicity of f_k, we have f_k = f′_k = f_c for all k ≥ k₅. □

3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is f_c, and the optimal value of (1.1) is f_min. If f_min is achievable at a critical point, then f_c = f_min. In Theorem 3.3, we have shown that f_k = f_c for all k big enough, where f_k is the optimal value of (3.9). The value f_c or f_min is often not known. How do we detect the tightness f_k = f_c in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose y* is a minimizer of (3.8) for the order k. Let

(3.16)    d = ⌈deg(c_eq, c_in, φ, ψ)/2⌉.

If there exists an integer t ∈ [d, k] such that

(3.17)    rank M_t(y*) = rank M_{t−d}(y*),

then f_k = f_c and we can get r = rank M_t(y*) minimizers for (3.7) [5, 14, 32]. The method in [14] can be used to extract minimizers. It was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0) can also be detected by solving the relaxations (3.8)-(3.9).
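As a small numeric illustration of the flat truncation test (a sketch, not the paper's algorithm): a truncated moment sequence generated by a single atom u has rank-one moment matrices, so (3.17) holds trivially with rank 1 and the atom is recoverable. The helper functions below are hypothetical:

```python
# Flat-truncation sketch: y* = [u]_{2k} for a single point u in R^2
# yields M_t(y*) = v_t(u) v_t(u)^T, a rank-one matrix for every t,
# so rank M_t(y*) = rank M_{t-d}(y*) as in (3.17).
import numpy as np

def monomials(u, t):
    # all monomials u^a with |a| <= t, for u in R^2
    return np.array([u[0]**i * u[1]**j
                     for i in range(t + 1) for j in range(t + 1 - i)])

def moment_matrix(u, t):
    v = monomials(u, t)
    return np.outer(v, v)

u = (0.5, -1.0)
d, t = 1, 3
r1 = np.linalg.matrix_rank(moment_matrix(u, t))
r2 = np.linalg.matrix_rank(moment_matrix(u, t - d))
print(r1, r2)   # both equal 1: condition (3.17) holds with r = 1
```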

Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order k, then no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0, i.e., (3.7) is infeasible.

ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order k is big enough.

In the following, assume (3.7) is feasible (i.e., f_c < +∞). Then, for all k big enough, (3.8) has a minimizer y*. Moreover:

iii) If (3.17) is satisfied for some t ∈ [d, k], then f_k = f_c.

iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer y* of (3.8) must satisfy (3.17) for some t ∈ [d, k], when k is big enough.

Proof. By Assumption 3.1, u is a critical point if and only if c_eq(u) = 0, φ(u) = 0.

i) For every feasible point u of (3.7), the tms [u]_{2k} (see §2 for the notation) is feasible for (3.8), for all k. Therefore, if (3.8) is infeasible for some k, then (3.7) must be infeasible.

ii) By Assumption 3.2, when (3.7) is infeasible, the set

    {x ∈ Rⁿ : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0}

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that −1 ∈ Ideal(c_eq, φ) + Qmod(ρ). By Assumption 3.2,

    Ideal(c_eq, φ) + Qmod(ρ) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ).

Thus, for all k big enough, (3.9) is unbounded from above. Hence, (3.8) must be infeasible by weak duality.

When (3.7) is feasible, f achieves finitely many values on K_c, so (3.7) must achieve its optimal value f_c. By Theorem 3.3, we know that f_k = f′_k = f_c for all k big enough. For each minimizer u* of (3.7), the tms [u*]_{2k} is a minimizer of (3.8).


iii) If (3.17) holds, we can get r = rank M_t(y*) minimizers for (3.7) [5, 14], say u_1, ..., u_r, such that f_k = f(u_i) for each i. Clearly, f_k = f(u_i) ≥ f_c. On the other hand, we always have f_k ≤ f_c. So f_k = f_c.

iv) By Assumption 3.2, (3.7) is equivalent to the problem

(3.18)    min  f(x)
          s.t. c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0.

The optimal value of (3.18) is also f_c. Its kth order Lasserre's relaxation is

(3.19)    γ′_k = min  ⟨f, y⟩
               s.t. ⟨1, y⟩ = 1,  M_k(y) ⪰ 0,
                    L^(k)_{c_eq}(y) = 0,  L^(k)_φ(y) = 0,  L^(k)_ρ(y) ⪰ 0.

Its dual optimization problem is

(3.20)    γ_k = max γ   s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Qmod(ρ)_{2k}.

By repeating the same proof as for Theorem 3.3(iii), we can show that

    γ_k = γ′_k = f_c

for all k big enough. Because ρ ∈ Qmod(c_in, ψ), each y feasible for (3.8) is also feasible for (3.19). So, when k is big, each y* is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some t ∈ [d, k], when k is big enough. □

If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].

4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron

    P = {x ∈ Rⁿ | Ax − b ≥ 0},

where A = [a_1 ··· a_m]^T ∈ R^{m×n}, b = [b_1 ··· b_m]^T ∈ R^m. This corresponds to the case that E = ∅, I = [m], and each c_i(x) = a_i^T x − b_i. Denote

(4.1)    D(x) = diag(c_1(x), ..., c_m(x)),    C(x) = [ A^T  ]
                                                     [ D(x) ].

The Lagrange multiplier vector λ = [λ_1 ··· λ_m]^T satisfies

(4.2)    [ A^T  ] λ  =  [ ∇f(x) ]
         [ D(x) ]       [   0   ].

If rank A = m, we can express λ as

(4.3)    λ = (AA^T)^{−1} A ∇f(x).

If rank A < m, how can we express λ in terms of x? In computation, we often prefer a polynomial expression. If there exists L(x) ∈ R[x]^{m×(n+m)} such that

(4.4)    L(x)C(x) = I_m,

then we can get

    λ = L(x) [ ∇f(x) ] = L_1(x)∇f(x),
             [   0   ]
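A minimal numeric sketch of formula (4.3): when A is square and invertible (so rank A = m = n), the recovered multipliers satisfy A^T λ = ∇f(x) at every x. The matrix A and the objective below are made-up example data:

```python
# lambda = (A A^T)^{-1} A grad f(x), as in (4.3); here m = n = 2 and
# A is invertible, so A^T lambda must reproduce grad f(x) exactly.
A = [[1.0, 1.0],
     [1.0, -1.0]]                 # example data: a1 = (1,1), a2 = (1,-1)

def grad_f(x):                    # f(x) = x1^2 + 3*x2^2
    return [2.0 * x[0], 6.0 * x[1]]

def solve2(M, r):                 # 2x2 linear solve by Cramer's rule
    det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
    return [(r[0]*M[1][1] - M[0][1]*r[1]) / det,
            (M[0][0]*r[1] - r[0]*M[1][0]) / det]

def multipliers(x):               # lambda = (A A^T)^{-1} A grad f(x)
    g = grad_f(x)
    AAT = [[sum(A[i][k]*A[j][k] for k in range(2)) for j in range(2)]
           for i in range(2)]
    Ag = [sum(A[i][k]*g[k] for k in range(2)) for i in range(2)]
    return solve2(AAT, Ag)

x = [0.3, -0.7]
lam = multipliers(x)
# A^T lambda should reproduce grad f(x)
ATlam = [sum(A[i][j]*lam[i] for i in range(2)) for j in range(2)]
print(ATlam, grad_f(x))
```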


where L_1(x) consists of the first n columns of L(x). In this section, we characterize when such L(x) exists and give a degree bound for it.

The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ Cⁿ (also see Definition 5.1). This is equivalent to the condition that, for every u, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then a_{i_1}, ..., a_{i_k} are linearly independent.

Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.

Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) = m for all u. This implies that Ax − b is nonsingular.

Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ R[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.

For a subset I = {i_1, ..., i_{m−n}} of [m], denote

    c_I(x) = ∏_{i∈I} c_i(x),    E_I(x) = c_I(x) · diag(c_{i_1}(x)^{−1}, ..., c_{i_{m−n}}(x)^{−1}),

    D_I(x) = diag(c_{i_1}(x), ..., c_{i_{m−n}}(x)),    A_I = [a_{i_1} ··· a_{i_{m−n}}]^T.

For the case that I = ∅ (the empty set), we set c_∅(x) = 1. Let

    V = {I ⊆ [m] : |I| = m − n, rank A_{[m]\I} = n}.

Step I: For each I ∈ V, we construct a matrix polynomial L_I(x) such that

(4.5)    L_I(x)C(x) = c_I(x)I_m.

The matrix L_I = L_I(x) satisfying (4.5) can be given by the following 2×3 block matrix (L_I(J, K) denotes the submatrix whose row indices are from J and whose column indices are from K):

(4.6)
      J \ K     [n]                        n + I                              n + ([m]\I)
      I         0                          E_I(x)                             0
      [m]\I     c_I(x)·(A_{[m]\I})^{−T}    −(A_{[m]\I})^{−T}(A_I)^T E_I(x)    0

Equivalently, the blocks of L_I are

    L_I(I, [n]) = 0,    L_I(I, n + ([m]\I)) = 0,    L_I([m]\I, n + ([m]\I)) = 0,

    L_I(I, n + I) = E_I(x),    L_I([m]\I, [n]) = c_I(x)(A_{[m]\I})^{−T},

    L_I([m]\I, n + I) = −(A_{[m]\I})^{−T}(A_I)^T E_I(x).

For each I ∈ V, A_{[m]\I} is invertible. The superscript −T denotes the inverse of the transpose. Let G = L_I(x)C(x); then one can verify that

    G(I, I) = E_I(x)D_I(x) = c_I(x)I_{m−n},    G(I, [m]\I) = 0,

    G([m]\I, [m]\I) = [ c_I(x)(A_{[m]\I})^{−T}   −(A_{[m]\I})^{−T}(A_I)^T E_I(x) ] [ (A_{[m]\I})^T ]  =  c_I(x)I_n,
                                                                                   [      0        ]

    G([m]\I, I) = [ c_I(x)(A_{[m]\I})^{−T}   −(A_{[m]\I})^{−T}(A_I)^T E_I(x) ] [ (A_I)^T ]  =  0.
                                                                               [ D_I(x)  ]

This shows that the above L_I(x) satisfies (4.5).


Step II: We show that there exist real scalars ν_I satisfying

(4.7)    ∑_{I∈V} ν_I c_I(x) = 1.

This can be shown by induction on m.

• When m = n, V = {∅} and c_∅(x) = 1, so (4.7) is clearly true.

• When m > n, let

(4.8)    N = {i ∈ [m] | rank A_{[m]\{i}} = n}.

For each i ∈ N, let V_i be the set of all I′ ⊆ [m]\{i} such that |I′| = m − n − 1 and rank A_{[m]\(I′∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]\{i}}x − b_{[m]\{i}} is nonsingular. By induction, there exist real scalars ν^(i)_{I′} satisfying

(4.9)    ∑_{I′∈V_i} ν^(i)_{I′} c_{I′}(x) = 1.

Since rank A = n, we can generally assume that {a_1, ..., a_n} is linearly independent. So there exist scalars α_1, ..., α_n such that

    a_m = α_1 a_1 + ··· + α_n a_n.

If all α_i = 0, then a_m = 0, and hence A can be replaced by its first m − 1 rows. So (4.7) is true by the induction. In the following, suppose at least one α_i ≠ 0, and write

    {i : α_i ≠ 0} = {i_1, ..., i_k}.

Then a_{i_1}, ..., a_{i_k}, a_m are linearly dependent. For convenience, set i_{k+1} = m. Since Ax − b is nonsingular, the linear system

    c_{i_1}(x) = ··· = c_{i_k}(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence, there exist real scalars μ_1, ..., μ_{k+1} such that

    μ_1 c_{i_1}(x) + ··· + μ_k c_{i_k}(x) + μ_{k+1} c_{i_{k+1}}(x) = 1.

(This can be implied by the echelon form for inconsistent linear systems.) Note that i_1, ..., i_{k+1} ∈ N. For each j = 1, ..., k + 1, by (4.9),

    ∑_{I′∈V_{i_j}} ν^(i_j)_{I′} c_{I′}(x) = 1.

Then we can get

    1 = ∑_{j=1}^{k+1} μ_j c_{i_j}(x) = ∑_{j=1}^{k+1} μ_j ∑_{I′∈V_{i_j}} ν^(i_j)_{I′} c_{i_j}(x)c_{I′}(x) = ∑_{I = I′∪{i_j}, I′∈V_{i_j}, 1≤j≤k+1} ν^(i_j)_{I′} μ_j c_I(x).

Since each I′ ∪ {i_j} ∈ V, (4.7) must be satisfied by some scalars ν_I.


Step III: For L_I(x) as in (4.5), we construct L(x) as

(4.10)    L(x) = ∑_{I∈V} ν_I L_I(x).

Clearly, L(x) satisfies (4.4), because

    L(x)C(x) = ∑_{I∈V} ν_I L_I(x)C(x) = ∑_{I∈V} ν_I c_I(x) I_m = I_m.

Each L_I(x) has degree ≤ m − n, so L(x) has degree ≤ m − n. □

Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does, a degree bound for L(x) is m − rank A. Sometimes, its degree can be smaller than that, as shown in Example 4.3. For given A, b, the matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x)C(x) = I_m for polyhedral sets.
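The coefficient-matching computation can be sketched numerically: parametrize the entries of L(x) up to a chosen degree, impose L(x)C(x) = I_m at sufficiently many sample points (for fixed degrees this is equivalent to matching coefficients), and solve the resulting linear system. A minimal sketch for the interval {x ≥ 0, 1 − x ≥ 0} (n = 1, m = 2), assuming degree-one entries:

```python
import numpy as np

# Constraints c1 = x >= 0, c2 = 1 - x >= 0, so A = [[1], [-1]] and
# C(x) = [A^T; diag(c1, c2)] is 3 x 2.  Seek L(x) = L0 + x*L1 (2 x 3)
# with L(x) C(x) = I_2; every unknown coefficient enters linearly.
def C_of(x):
    return np.array([[1.0, -1.0],
                     [x, 0.0],
                     [0.0, 1.0 - x]])

samples = [-1.0, 0.0, 0.5, 1.0, 2.0]   # deg(L*C) <= 2, so >= 3 points suffice
rows, rhs = [], []
for x in samples:
    C = C_of(x)
    for i in range(2):
        for j in range(2):
            # coefficients of (L0[i,k], L1[i,k]) in (L(x) C(x))[i,j]
            row = np.zeros(12)
            for k in range(3):
                row[i*3 + k] = C[k, j]           # from L0[i,k]
                row[6 + i*3 + k] = x * C[k, j]   # from L1[i,k]
            rows.append(row)
            rhs.append(1.0 if i == j else 0.0)

theta, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
L0, L1 = theta[:6].reshape(2, 3), theta[6:].reshape(2, 3)

def L_of(x):
    return L0 + x * L1

# check at a fresh point
x = 0.3183
assert np.allclose(L_of(x) @ C_of(x), np.eye(2), atol=1e-8)
print("found L(x) with L(x)C(x) = I_2")
```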

Example 4.2. Consider the simplicial set

    {x_1 ≥ 0, ..., x_n ≥ 0, 1 − e^T x ≥ 0}.

The equation L(x)C(x) = I_{n+1} is satisfied by

    L(x) = [ 1−x_1   −x_2   ···   −x_n    1 ··· 1 ]
           [ −x_1   1−x_2   ···   −x_n    1 ··· 1 ]
           [   ⋮       ⋮     ⋱      ⋮     ⋮       ]
           [ −x_1    −x_2   ···  1−x_n    1 ··· 1 ]
           [ −x_1    −x_2   ···   −x_n    1 ··· 1 ].
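The identity L(x)C(x) = I_{n+1} of Example 4.2 can be verified numerically at random points (a sketch with n = 4; the helper `check` is hypothetical):

```python
# Verify Example 4.2: simplex constraints x >= 0, 1 - e^T x >= 0,
# C(x) = [A^T; D(x)], and the stated L(x) gives L(x) C(x) = I_{n+1}.
import random
random.seed(0)

n = 4
def check(x):
    m = n + 1
    c = x + [1.0 - sum(x)]                      # c_i(x)
    A = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    A.append([-1.0] * n)                        # last row of A is -e^T
    # C(x) is (n+m) x m: A^T stacked on diag(c)
    C = [[A[j][i] for j in range(m)] for i in range(n)]
    C += [[c[i] if j == i else 0.0 for j in range(m)] for i in range(m)]
    # L(x): row i < n is (e_i - x, 1,...,1); last row is (-x, 1,...,1)
    L = [[(1.0 if j == i else 0.0) - x[j] for j in range(n)] + [1.0] * m
         for i in range(n)]
    L.append([-x[j] for j in range(n)] + [1.0] * m)
    P = [[sum(L[i][k] * C[k][j] for k in range(n + m)) for j in range(m)]
         for i in range(m)]
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) < 1e-9
               for i in range(m) for j in range(m))

assert all(check([random.uniform(-2, 2) for _ in range(n)]) for _ in range(5))
print("L(x) C(x) = I_{n+1} verified at random points")
```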

Example 4.3. Consider the box constraint

    {x_1 ≥ 0, ..., x_n ≥ 0, 1 − x_1 ≥ 0, ..., 1 − x_n ≥ 0}.

The equation L(x)C(x) = I_{2n} is satisfied by

    L(x) = [ I_n − diag(x)   I_n   I_n ]
           [    −diag(x)     I_n   I_n ].
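The box-constraint identity of Example 4.3 can likewise be checked numerically (a sketch with n = 3; the helper `check_box` is hypothetical):

```python
# Verify Example 4.3: c = (x_1,...,x_n, 1-x_1,...,1-x_n), A = [I_n; -I_n],
# and L(x) = [I - diag(x), I, I; -diag(x), I, I] gives L(x) C(x) = I_{2n}.
import random
random.seed(1)
n = 3

def check_box(x):
    m = 2 * n
    c = x + [1.0 - xi for xi in x]
    A = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)] \
      + [[-1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    C = [[A[j][i] for j in range(m)] for i in range(n)] \
      + [[c[i] if j == i else 0.0 for j in range(m)] for i in range(m)]
    def e(i, j):
        return 1.0 if i == j else 0.0
    L = [[e(i, j) - (x[j] if i == j else 0.0) for j in range(n)]
         + [e(i, j) for j in range(n)] + [e(i, j) for j in range(n)]
         for i in range(n)]
    L += [[-x[j] if i == j else 0.0 for j in range(n)]
          + [e(i, j) for j in range(n)] + [e(i, j) for j in range(n)]
          for i in range(n)]
    P = [[sum(L[i][k] * C[k][j] for k in range(n + m)) for j in range(m)]
         for i in range(m)]
    return all(abs(P[i][j] - e(i, j)) < 1e-9
               for i in range(m) for j in range(m))

assert all(check_box([random.uniform(-2, 2) for _ in range(n)]) for _ in range(5))
print("L(x) C(x) = I_{2n} verified")
```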

Example 4.4. Consider the polyhedral set

    {1 − x_4 ≥ 0, x_4 − x_3 ≥ 0, x_3 − x_2 ≥ 0, x_2 − x_1 ≥ 0, x_1 + 1 ≥ 0}.

The equation L(x)C(x) = I_5 is satisfied by

    L(x) = (1/2) [ −x_1−1   −x_2−1   −x_3−1   −x_4−1   1 1 1 1 1 ]
                 [ −x_1−1   −x_2−1   −x_3−1    1−x_4   1 1 1 1 1 ]
                 [ −x_1−1   −x_2−1    1−x_3    1−x_4   1 1 1 1 1 ]
                 [ −x_1−1    1−x_2    1−x_3    1−x_4   1 1 1 1 1 ]
                 [  1−x_1    1−x_2    1−x_3    1−x_4   1 1 1 1 1 ].
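This chain-constraint example can also be checked numerically, including the 1/2 factor (a sketch; the helpers are hypothetical):

```python
# Verify Example 4.4: constraints 1-x4, x4-x3, x3-x2, x2-x1, x1+1 >= 0.
import random
random.seed(2)

A = [[0, 0, 0, -1], [0, 0, -1, 1], [0, -1, 1, 0], [-1, 1, 0, 0], [1, 0, 0, 0]]
def c_of(x):
    return [1 - x[3], x[3] - x[2], x[2] - x[1], x[1] - x[0], x[0] + 1]

def L_of(x):
    rows = []
    for k in range(5):
        # row k: entries -x_j - 1 for the first 4-k positions, then 1 - x_j
        head = [(-x[j] - 1) if j < 4 - k else (1 - x[j]) for j in range(4)]
        rows.append([0.5 * v for v in head + [1, 1, 1, 1, 1]])
    return rows

def check44(x):
    c = c_of(x)
    C = [[A[j][i] for j in range(5)] for i in range(4)] \
      + [[c[i] if j == i else 0.0 for j in range(5)] for i in range(5)]
    L = L_of(x)
    P = [[sum(L[i][k] * C[k][j] for k in range(9)) for j in range(5)]
         for i in range(5)]
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) < 1e-9
               for i in range(5) for j in range(5))

assert all(check44([random.uniform(-2, 2) for _ in range(4)]) for _ in range(5))
print("L(x) C(x) = I_5 verified")
```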

Example 4.5. Consider the polyhedral set

    {1 + x_1 ≥ 0, 1 − x_1 ≥ 0, 2 − x_1 − x_2 ≥ 0, 2 − x_1 + x_2 ≥ 0}.

The matrix L(x) satisfying L(x)C(x) = I_4 is

    (1/6) [ x_1²−3x_1+2     x_1x_2−x_2      4−x_1    2−x_1    1−x_1     1−x_1   ]
          [ 3x_1²−3x_1−6    3x_2+3x_1x_2    6−3x_1   −3x_1    −3x_1−3   −3x_1−3 ]
          [ 1−x_1²          −2x_2−x_1x_2−3  x_1−1    x_1+1    x_1+2     x_1+2   ]
          [ 1−x_1²          3−x_1x_2−2x_2   x_1−1    x_1+1    x_1+2     x_1+2   ].
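The entries of L(x) in Example 4.5 (as reconstructed above, including the 1/6 factor) can be checked numerically (a sketch; the helpers are hypothetical):

```python
# Verify Example 4.5: n = 2, m = 4, C(x) = [A^T; diag(c)] is 6 x 4.
import random
random.seed(3)

A = [[1, 0], [-1, 0], [-1, -1], [-1, 1]]
def c_of(x):
    x1, x2 = x
    return [1 + x1, 1 - x1, 2 - x1 - x2, 2 - x1 + x2]

def L_of(x):
    x1, x2 = x
    M = [
        [x1**2 - 3*x1 + 2, x1*x2 - x2, 4 - x1, 2 - x1, 1 - x1, 1 - x1],
        [3*x1**2 - 3*x1 - 6, 3*x2 + 3*x1*x2, 6 - 3*x1, -3*x1,
         -3*x1 - 3, -3*x1 - 3],
        [1 - x1**2, -2*x2 - x1*x2 - 3, x1 - 1, x1 + 1, x1 + 2, x1 + 2],
        [1 - x1**2, 3 - x1*x2 - 2*x2, x1 - 1, x1 + 1, x1 + 2, x1 + 2],
    ]
    return [[v / 6.0 for v in row] for row in M]

def check45(x):
    c = c_of(x)
    C = [[A[j][i] for j in range(4)] for i in range(2)] \
      + [[c[i] if j == i else 0.0 for j in range(4)] for i in range(4)]
    L = L_of(x)
    P = [[sum(L[i][k] * C[k][j] for k in range(6)) for j in range(4)]
         for i in range(4)]
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) < 1e-9
               for i in range(4) for j in range(4))

assert all(check45([random.uniform(-2, 2), random.uniform(-2, 2)])
           for _ in range(5))
print("L(x) C(x) = I_4 verified")
```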


5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express Lagrange multipliers λ_i as polynomial functions in x on the set of critical points.

Suppose there are totally m equality and inequality constraints, i.e.,

    E ∪ I = {1, ..., m}.

If (x, λ) is a critical pair, then λ_i c_i(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ_1 ··· λ_m]^T satisfies the equation

(5.1)    [ ∇c_1(x)  ∇c_2(x)  ···  ∇c_m(x) ]        [ ∇f(x) ]
         [  c_1(x)     0     ···     0    ]        [   0   ]
         [    0     c_2(x)   ···     0    ]  λ  =  [   0   ]
         [    ⋮        ⋮      ⋱      ⋮    ]        [   ⋮   ]
         [    0        0     ···  c_m(x)  ]        [   0   ],

where the matrix on the left is denoted by C(x). Let C(x) be as in the above. If there exists L(x) ∈ R[x]^{m×(m+n)} such that

(5.2)    L(x)C(x) = I_m,

then we can get

(5.3)    λ = L(x) [ ∇f(x) ] = L_1(x)∇f(x),
                  [   0   ]

where L_1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such L(x) exists.

Definition 5.1. The tuple c = (c_1, ..., c_m) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ Cⁿ.

Clearly, c being nonsingular is equivalent to the condition that, for each u ∈ Cⁿ, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then the gradients ∇c_{i_1}(u), ..., ∇c_{i_k}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.

Proposition 5.2. (i) For W(x) ∈ C[x]^{s×t} with s ≥ t, rank W(u) = t for all u ∈ Cⁿ if and only if there exists P(x) ∈ C[x]^{t×s} such that

    P(x)W(x) = I_t.

Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} for the above.

(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (5.2).

Proof. (i) "⇐": If P(x)W(x) = I_t, then for all u ∈ Cⁿ,

    t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.

"⇒": Suppose rank W(u) = t for all u ∈ Cⁿ. Write W(x) in columns:

    W(x) = [ w_1(x)   w_2(x)   ···   w_t(x) ].


Then the equation w_1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ_1(x) ∈ C[x]^s such that ξ_1(x)^T w_1(x) = 1. For each i = 2, ..., t, denote

    r_{1i}(x) = ξ_1(x)^T w_i(x);

then (we use ∼ to denote row equivalence between matrices)

    W(x) ∼ [    1      r_{12}(x)   ···   r_{1t}(x) ]
           [ w_1(x)     w_2(x)     ···    w_t(x)   ]

         ∼ W_1(x) = [ 1   r_{12}(x)    ···   r_{1t}(x)  ]
                    [ 0   w_2^(1)(x)   ···   w_t^(1)(x) ],

where each (i = 2, ..., t)

    w_i^(1)(x) = w_i(x) − r_{1i}(x)w_1(x).

So there exists P_1(x) ∈ C[x]^{(s+1)×s} such that

    P_1(x)W(x) = W_1(x).

Since W(x) and W_1(x) are row equivalent, W_1(x) must also have full column rank everywhere. Similarly, the polynomial equation

    w_2^(1)(x) = 0

does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists ξ_2(x) ∈ C[x]^s such that

    ξ_2(x)^T w_2^(1)(x) = 1.

For each i = 3, ..., t, let r_{2i}(x) = ξ_2(x)^T w_i^(1)(x); then

    W_1(x) ∼ [ 1   r_{12}(x)     r_{13}(x)    ···   r_{1t}(x)  ]
             [ 0       1         r_{23}(x)    ···   r_{2t}(x)  ]
             [ 0   w_2^(1)(x)    w_3^(1)(x)   ···   w_t^(1)(x) ]

           ∼ W_2(x) = [ 1   r_{12}(x)   r_{13}(x)    ···   r_{1t}(x)  ]
                      [ 0       1       r_{23}(x)    ···   r_{2t}(x)  ]
                      [ 0       0       w_3^(2)(x)   ···   w_t^(2)(x) ],

where each (i = 3, ..., t)

    w_i^(2)(x) = w_i^(1)(x) − r_{2i}(x)w_2^(1)(x).

Similarly, W_1(x) and W_2(x) are row equivalent, so W_2(x) has full column rank everywhere. There exists P_2(x) ∈ C[x]^{(s+2)×(s+1)} such that

    P_2(x)W_1(x) = W_2(x).

Continuing this process, we can finally get

    W_2(x) ∼ ··· ∼ W_t(x) = [ 1   r_{12}(x)   r_{13}(x)   ···   r_{1t}(x) ]
                            [ 0       1       r_{23}(x)   ···   r_{2t}(x) ]
                            [ 0       0           1       ···   r_{3t}(x) ]
                            [ ⋮       ⋮           ⋮        ⋱        ⋮     ]
                            [ 0       0           0       ···       1    ]
                            [ 0       0           0       ···       0    ].

Consequently, there exists P_i(x) ∈ C[x]^{(s+i)×(s+i−1)}, for i = 1, 2, ..., t, such that

    P_t(x)P_{t−1}(x) ··· P_1(x)W(x) = W_t(x).

Since W_t(x) is a unit upper triangular matrix polynomial, there exists P_{t+1}(x) ∈ C[x]^{t×(s+t)} such that P_{t+1}(x)W_t(x) = I_t. Let

    P(x) = P_{t+1}(x)P_t(x)P_{t−1}(x) ··· P_1(x);

then P(x)W(x) = I_t. Note that P(x) ∈ C[x]^{t×s}.

For W(x) ∈ R[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (the P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.

(ii) The conclusion is implied directly by the item (i). □

In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), it can be determined by comparing coefficients of both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).

Example 5.3. Consider the hypercube with quadratic constraints

    {1 − x_1² ≥ 0, 1 − x_2² ≥ 0, ..., 1 − x_n² ≥ 0}.

The equation L(x)C(x) = I_n is satisfied by

    L(x) = [ −(1/2)diag(x)   I_n ].
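For Example 5.3, the gradient block of C(x) is −2 diag(x), so the identity L(x)C(x) = I_n can be checked numerically (a sketch; the helper `check53` is hypothetical):

```python
# Verify Example 5.3: c_i = 1 - x_i^2, C(x) = [-2*diag(x); diag(c)],
# and L(x) = [-(1/2)diag(x), I_n] gives L(x) C(x) = I_n.
import random
random.seed(4)
n = 3

def check53(x):
    C = [[-2 * x[i] if j == i else 0.0 for j in range(n)] for i in range(n)] \
      + [[1 - x[i]**2 if j == i else 0.0 for j in range(n)] for i in range(n)]
    L = [[-0.5 * x[i] if j == i else 0.0 for j in range(n)]
         + [1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    P = [[sum(L[i][k] * C[k][j] for k in range(2 * n)) for j in range(n)]
         for i in range(n)]
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) < 1e-9
               for i in range(n) for j in range(n))

assert all(check53([random.uniform(-2, 2) for _ in range(n)]) for _ in range(5))
print("L(x) C(x) = I_n verified")
```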

Example 5.4. Consider the nonnegative portion of the unit sphere

    {x_1 ≥ 0, x_2 ≥ 0, ..., x_n ≥ 0, x_1² + ··· + x_n² − 1 = 0}.

The equation L(x)C(x) = I_{n+1} is satisfied by

    L(x) = [ I_n − xx^T      x·1_n^T       2x ]
           [ (1/2)x^T     −(1/2)1_n^T     −1 ].
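The sphere-portion identity of Example 5.4 can be checked numerically (a sketch with n = 3; the helper `check54` is hypothetical):

```python
# Verify Example 5.4: constraints x_i >= 0 (i = 1..n), x^T x - 1 = 0;
# the gradient block of C(x) is [e_1 ... e_n  2x].
import random
random.seed(5)
n = 3

def check54(x):
    m = n + 1
    c = list(x) + [sum(xi * xi for xi in x) - 1.0]
    grads = [[1.0 if i == k else 0.0 for i in range(n)] for k in range(n)] \
          + [[2 * xi for xi in x]]
    C = [[grads[j][i] for j in range(m)] for i in range(n)] \
      + [[c[i] if j == i else 0.0 for j in range(m)] for i in range(m)]
    # L = [ I - x x^T | x 1^T | 2x ; (1/2) x^T | -(1/2) 1^T | -1 ]
    L = [[(1.0 if j == i else 0.0) - x[i] * x[j] for j in range(n)]
         + [x[i]] * n + [2 * x[i]] for i in range(n)]
    L.append([0.5 * xj for xj in x] + [-0.5] * n + [-1.0])
    P = [[sum(L[i][k] * C[k][j] for k in range(n + m)) for j in range(m)]
         for i in range(m)]
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) < 1e-9
               for i in range(m) for j in range(m))

assert all(check54([random.uniform(-2, 2) for _ in range(n)]) for _ in range(5))
print("L(x) C(x) = I_{n+1} verified")
```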

Example 5.5. Consider the set

    {1 − x_1³ − x_2⁴ ≥ 0, 1 − x_3⁴ − x_4³ ≥ 0}.

The equation L(x)C(x) = I_2 is satisfied by

    L(x) = [ −x_1/3   −x_2/4     0        0      1   0 ]
           [    0        0     −x_3/4   −x_4/3   0   1 ].
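The identity of Example 5.5 can also be checked numerically (a sketch; the helper `check55` is hypothetical):

```python
# Verify Example 5.5: c_1 = 1 - x1^3 - x2^4, c_2 = 1 - x3^4 - x4^3,
# with the stated 2 x 6 matrix L(x).
import random
random.seed(6)

def check55(x):
    x1, x2, x3, x4 = x
    c = [1 - x1**3 - x2**4, 1 - x3**4 - x4**3]
    g1 = [-3 * x1**2, -4 * x2**3, 0.0, 0.0]      # grad c_1
    g2 = [0.0, 0.0, -4 * x3**3, -3 * x4**2]      # grad c_2
    C = [[g1[i], g2[i]] for i in range(4)] + [[c[0], 0.0], [0.0, c[1]]]
    L = [[-x1 / 3, -x2 / 4, 0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, -x3 / 4, -x4 / 3, 0.0, 1.0]]
    P = [[sum(L[i][k] * C[k][j] for k in range(6)) for j in range(2)]
         for i in range(2)]
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) < 1e-9
               for i in range(2) for j in range(2))

assert all(check55([random.uniform(-1.5, 1.5) for _ in range(4)])
           for _ in range(5))
print("L(x) C(x) = I_2 verified")
```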

Example 5.6. Consider the quadratic set

    {1 − x_1x_2 − x_2x_3 − x_1x_3 ≥ 0, 1 − x_1² − x_2² − x_3² ≥ 0}.

The matrix L(x)^T satisfying L(x)C(x) = I_2 is

    [ 25x_1³ + 10x_1²x_2 + 40x_1x_2² − 25x_1 − 2x_3    −25x_1³ − 10x_1²x_2 − 40x_1x_2² + (49/2)x_1 + 2x_3 ]
    [ −15x_1²x_2 + 10x_1x_2² + 20x_3x_1x_2 − 10x_1     15x_1²x_2 − 10x_1x_2² − 20x_3x_1x_2 + 10x_1 − x_2/2 ]
    [ 25x_3x_1² − 20x_1x_2² + 10x_3x_1x_2 + 2x_1       −25x_3x_1² + 20x_1x_2² − 10x_3x_1x_2 − 2x_1 − x_3/2 ]
    [ 1 − 20x_1x_3 − 10x_1² − 20x_1x_2                 20x_1x_2 + 20x_1x_3 + 10x_1²                        ]
    [ −50x_1² − 20x_2x_1                               50x_1² + 20x_2x_1 + 1                               ].

We would like to remark that a polynomial tuple c = (c_1, ..., c_m) is generically nonsingular, and hence Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d_1, ..., d_m, there exists an open dense subset U of D = R[x]_{d_1} × ··· × R[x]_{d_m}, such that every tuple c = (c_1, ..., c_m) ∈ U is nonsingular. Indeed, such U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.


Proof. The proof needs to use resultants and discriminants, for which we refer to [29].

First, let J_1 be the set of all (i_1, ..., i_{n+1}) with 1 ≤ i_1 < ··· < i_{n+1} ≤ m. The resultant Res(c_{i_1}, ..., c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i_1}, ..., c_{i_{n+1}} such that, if Res(c_{i_1}, ..., c_{i_{n+1}}) ≠ 0, then the equations

    c_{i_1}(x) = ··· = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

    F_1(c) = ∏_{(i_1,...,i_{n+1})∈J_1} Res(c_{i_1}, ..., c_{i_{n+1}}).

For the case that m ≤ n, J_1 = ∅, and we just simply let F_1(c) = 1. Clearly, if F_1(c) ≠ 0, then no more than n polynomials of c_1, ..., c_m have a common complex zero.

Second, let J_2 be the set of all (j_1, ..., j_k) with k ≤ n and 1 ≤ j_1 < ··· < j_k ≤ m. When one of c_{j_1}, ..., c_{j_k} has degree bigger than one, the discriminant ∆(c_{j_1}, ..., c_{j_k}) is a polynomial in the coefficients of c_{j_1}, ..., c_{j_k} such that, if ∆(c_{j_1}, ..., c_{j_k}) ≠ 0, then the equations

    c_{j_1}(x) = ··· = c_{j_k}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every complex common solution u, the gradients of c_{j_1}, ..., c_{j_k} at u are linearly independent. When all c_{j_1}, ..., c_{j_k} have degree one, the discriminant of the tuple (c_{j_1}, ..., c_{j_k}) is not a single polynomial, but we can define ∆(c_{j_1}, ..., c_{j_k}) to be the product of all maximum minors of its Jacobian (a constant matrix). Define

    F_2(c) = ∏_{(j_1,...,j_k)∈J_2} ∆(c_{j_1}, ..., c_{j_k}).

Clearly, if F_2(c) ≠ 0, then no n or fewer polynomials of c_1, ..., c_m have a singular complex common zero.

Last, let F(c) = F_1(c)F_2(c) and

    U = {c = (c_1, ..., c_m) ∈ D : F(c) ≠ 0}.

Note that U is a Zariski open subset of D, and it is open dense in D. For all c ∈ U, no more than n of c_1, ..., c_m can have a complex common zero. For any k polynomials (k ≤ n) of c_1, ..., c_m, if they have a complex common zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U. So it holds generically. □

6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for computational results.


The polynomials p_i in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c_1, ..., c_m. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L_1(x) be the submatrix of L(x) consisting of the first n columns. Then choose (p_1, ..., p_m) to be the product L_1(x)∇f(x), i.e.,

    p_i = (L_1(x)∇f(x))_i.

In all our examples, the global minimum value f_min of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sub-level set {f(x) ≤ ℓ} is compact for all ℓ), and the constraint qualification condition holds.

By Theorem 3.3, we have f_k = f_min for all k big enough, if f_c = f_min and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with standard Lasserre relaxations in [17]. The lower bounds given by relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables. The computational time (in seconds) is also compared. The results for standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".

Example 6.1. Consider the optimization problem

    min  x_1x_2(10 − x_3)
    s.t. x_1 ≥ 0, x_2 ≥ 0, x_3 ≥ 0, 1 − x_1 − x_2 − x_3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum f_min = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x_1, x_2, x_3) with x_1x_2 = 0 is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

    order k |       w/o LME        |       with LME
            | lower bound    time  | lower bound    time
       2    |  −0.0521      0.6841 |  −0.0521      0.1922
       3    |  −0.0026      0.2657 |  −3·10⁻⁸      0.2285
       4    |  −0.0007      0.6785 |  −6·10⁻⁹      0.4431
       5    |  −0.0004      1.6105 |  −2·10⁻⁹      0.9567

² Note that 1 − x^Tx = (1 − e^Tx)(1 + x^Tx) + ∑_{i=1}^n x_i(1 − x_i)² + ∑_{i≠j} x_i²x_j ∈ IQ(c_eq, c_in).
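The polynomial identity in footnote 2 can be spot-checked numerically at random points (a sketch; the helper `identity_gap` is hypothetical):

```python
# Spot-check footnote 2:
# 1 - x^T x = (1 - e^T x)(1 + x^T x) + sum_i x_i (1 - x_i)^2
#             + sum_{i != j} x_i^2 x_j
import random
random.seed(7)
n = 3

def identity_gap(x):
    xtx = sum(xi * xi for xi in x)
    etx = sum(x)
    rhs = (1 - etx) * (1 + xtx) \
        + sum(xi * (1 - xi)**2 for xi in x) \
        + sum(x[i]**2 * x[j] for i in range(n) for j in range(n) if i != j)
    return abs((1 - xtx) - rhs)

assert all(identity_gap([random.uniform(-2, 2) for _ in range(n)]) < 1e-9
           for _ in range(5))
print("footnote identity verified")
```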


Example 6.2. Consider the optimization problem

    min  x_1⁴x_2² + x_1²x_2⁴ + x_3⁶ − 3x_1²x_2²x_3² + (x_1⁴ + x_2⁴ + x_3⁴)
    s.t. x_1² + x_2² + x_3² ≥ 1.

The matrix polynomial L(x) = [ (1/2)x_1  (1/2)x_2  (1/2)x_3  −1 ]. The objective f is the sum of the positive definite form x_1⁴ + x_2⁴ + x_3⁴ and the Motzkin polynomial

    M(x) = x_1⁴x_2² + x_1²x_2⁴ + x_3⁶ − 3x_1²x_2²x_3².

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive and f_min is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

    c_1(x)p_1(x) = (x_1² + x_2² + x_3² − 1)(3M(x) + 2(x_1⁴ + x_2⁴ + x_3⁴)) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value f_min = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

    order k |       w/o LME         |      with LME
            | lower bound     time  | lower bound   time
       3    |    −∞          0.4466 |   0.1111     0.1169
       4    |    −∞          0.4948 |   0.3333     0.3499
       5    | −2.1821·10⁵    1.1836 |   0.3333     0.6530

Example 6.3. Consider the optimization problem

    min  x_1x_2 + x_2x_3 + x_3x_4 − 3x_1x_2x_3x_4 + (x_1³ + ··· + x_4³)
    s.t. x_1, x_2, x_3, x_4 ≥ 0, 1 − x_1 − x_2 ≥ 0, 1 − x_3 − x_4 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x_1   −x_2     0       0     1 1 0 0 1 0 ]
    [ −x_1   1−x_2     0       0     1 1 0 0 1 0 ]
    [   0      0     1−x_3   −x_4    0 0 1 1 0 1 ]
    [   0      0     −x_3   1−x_4    0 0 1 1 0 1 ]
    [ −x_1    −x_2     0       0     1 1 0 0 1 0 ]
    [   0      0     −x_3    −x_4    0 0 1 1 0 1 ].

The feasible set is compact, so f_min is achieved at a critical point. One can show that f_min = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(c_eq, c_in) is archimedean.⁴ The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because −c_1²p_1² ∈ Ideal(φ) ⊆ IQ(φ, ψ) and the set {−c_1(x)²p_1(x)² ≥ 0} is compact.

⁴ This is because 1 − x_1² − x_2² belongs to the quadratic module of (x_1, x_2, 1 − x_1 − x_2), and 1 − x_3² − x_4² belongs to the quadratic module of (x_3, x_4, 1 − x_3 − x_4). See the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

    order k |       w/o LME          |      with LME
            | lower bound      time  | lower bound    time
       3    | −2.9·10⁻⁵       0.7335 |  −6·10⁻⁷      0.6091
       4    | −1.4·10⁻⁵       2.5055 |  −8·10⁻⁸      2.7423
       5    | −1.4·10⁻⁵      12.7092 |  −5·10⁻⁸     13.7449

Example 6.4. Consider the polynomial optimization problem

    min_{x∈R²}  x_1² + 50x_2²
    s.t.  x_1² − 1/2 ≥ 0,  x_2² − 2x_1x_2 − 1/8 ≥ 0,  x_2² + 2x_1x_2 − 1/8 ≥ 0.

It is motivated from an example in [12, §3]. The first column of L(x) is

    [ 8x_1³/5 + x_1/5 ]
    [ 288x_2x_1⁴/5 − 16x_1³/5 − 124x_2x_1²/5 + 8x_1/5 − 2x_2 ]
    [ −288x_2x_1⁴/5 − 16x_1³/5 + 124x_2x_1²/5 + 8x_1/5 + 2x_2 ]

and the second column of L(x) is

    [ −8x_1²x_2/5 + 4x_2³/5 − x_2/10 ]
    [ 288x_1³x_2²/5 + 16x_1²x_2/5 − 142x_1x_2²/5 − 9x_1/20 − 8x_2³/5 + 11x_2/5 ]
    [ −288x_1³x_2²/5 + 16x_1²x_2/5 + 142x_1x_2²/5 + 9x_1/20 − 8x_2³/5 + 11x_2/5 ].

The objective is coercive, so f_min is achieved at a critical point. The minimum value

    f_min = 56 + 3/4 + 25√5 ≈ 112.6517,

and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

    order k |      w/o LME        |      with LME
            | lower bound   time  | lower bound   time
       3    |   6.7535     0.4611 |   56.7500    0.1309
       4    |   6.9294     0.2428 |  112.6517    0.2405
       5    |   8.8519     0.3376 |  112.6517    0.2167
       6    |  16.5971     0.4703 |  112.6517    0.3788
       7    |  35.4756     0.6536 |  112.6517    0.4537

Example 6.5. Consider the optimization problem

    min_{x∈R³}  x_1³ + x_2³ + x_3³ + 4x_1x_2x_3 − (x_1(x_2² + x_3²) + x_2(x_3² + x_1²) + x_3(x_1² + x_2²))
    s.t.  x_1 ≥ 0,  x_1x_2 − 1 ≥ 0,  x_2x_3 − 1 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x_1x_2   0    0   x_2   x_2    0 ]
    [   x_1      0    0   −1    −1     0 ]
    [  −x_1     x_2   0    1     0    −1 ].

The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R³₊, so the minimum value is achieved at a critical point. In computation, we got f_min ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

    order k |       w/o LME         |      with LME
            | lower bound     time  | lower bound   time
       2    |    −∞          0.4129 |     −∞       0.1900
       3    | −7.8184·10⁶    0.4641 |   0.9492     0.3139
       4    | −2.0575·10⁴    0.6499 |   0.9492     0.5057

Example 6.6. Consider the optimization problem (x_0 = 1)

    min_{x∈R⁴}  x^Tx + ∑_{i=0}^{4} ∏_{j≠i} (x_i − x_j)
    s.t.  x_1² − 1 ≥ 0, x_2² − 1 ≥ 0, x_3² − 1 ≥ 0, x_4² − 1 ≥ 0.

The matrix polynomial L(x) = [ (1/2)diag(x)  −I_4 ]. The first part of the objective is x^Tx, while the second part is a nonnegative polynomial [37]. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min = 4.0000 and 11 global minimizers:

    (1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
    (1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
    (−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

    order k |       w/o LME          |      with LME
            | lower bound      time  | lower bound    time
       3    |    −∞           1.1377 |   3.5480      1.1765
       4    | −6.6913·10⁴     4.7677 |   4.0000      3.0761
       5    | −21.3778       22.9970 |   4.0000     10.3354

Example 6.7. Consider the optimization problem

    min_{x∈R³}  x_1⁴x_2² + x_2⁴x_3² + x_3⁴x_1² − 3x_1²x_2²x_3² + x_2²
    s.t.  x_1 − x_2x_3 ≥ 0,  −x_2 + x_3² ≥ 0.

The matrix polynomial L(x) = [   1     0   0  0  0 ]
                             [ −x_3   −1   0  0  0 ].

By the arithmetic-geometric mean inequality, one can show that f_min = 0. The global minimizers are (x_1, 0, x_3) with x_1 ≥ 0 and x_1x_3 = 0. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that f_k = f_min for all k ≥ 5, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

    order k |       w/o LME         |      with LME
            | lower bound     time  | lower bound    time
       3    |    −∞          0.6144 |     −∞        0.3418
       4    | −1.0909·10⁷    1.0542 |  −3.9476      0.7180
       5    | −942.6772      1.6771 |  −3·10⁻⁹      1.4607
       6    | −0.0110        3.3532 |  −8·10⁻¹⁰     3.1618

Example 6.8. Consider the optimization problem

    min_{x∈R⁴}  x_1²(x_1 − x_4)² + x_2²(x_2 − x_4)² + x_3²(x_3 − x_4)²
                + 2x_1x_2x_3(x_1 + x_2 + x_3 − 2x_4) + (x_1 − 1)² + (x_2 − 1)² + (x_3 − 1)²
    s.t.  x_1 − x_2 ≥ 0,  x_2 − x_3 ≥ 0.

The matrix polynomial L(x) = [ 1  0  0  0  0  0 ]
                             [ 1  1  0  0  0  0 ].

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min ≈ 0.9413 and a minimizer

    (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

    order k |       w/o LME          |      with LME
            | lower bound      time  | lower bound   time
       2    |    −∞           0.3984 |  −0.3360     0.9321
       3    |    −∞           0.7634 |   0.9413     0.5240
       4    | −6.4896·10⁵     4.5496 |   0.9413     1.7192
       5    | −3.1645·10³    24.3665 |   0.9413     8.1228

Example 6.9. Consider the optimization problem

    min_{x∈R⁴}  (x_1 + x_2 + x_3 + x_4 + 1)² − 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1)
    s.t.  0 ≤ x_1, ..., x_4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so f_min is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value f_min = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that 4 − ∑_{i=1}^4 x_i² = ∑_{i=1}^4 (x_i(1 − x_i)² + (1 − x_i)(1 + x_i²)) ∈ IQ(c_eq, c_in).


Table 9. Computational results for Example 6.9

    order k |      w/o LME         |      with LME
            | lower bound    time  | lower bound    time
       2    |  −0.0279      0.2262 |  −5·10⁻⁶       1.1835
       3    |  −0.0005      0.4691 |  −6·10⁻⁷       1.6566
       4    |  −0.0001      3.1098 |  −2·10⁻⁷       5.5234
       5    |  −4·10⁻⁵     16.5092 |  −6·10⁻⁷      19.7320
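The polynomial identity in footnote 5 can also be spot-checked numerically (a sketch; the helper `gap5` is hypothetical):

```python
# Spot-check footnote 5:
# 4 - sum_i x_i^2 = sum_i ( x_i (1 - x_i)^2 + (1 - x_i)(1 + x_i^2) )
import random
random.seed(8)

def gap5(x):
    lhs = 4 - sum(xi * xi for xi in x)
    rhs = sum(xi * (1 - xi)**2 + (1 - xi) * (1 + xi * xi) for xi in x)
    return abs(lhs - rhs)

assert all(gap5([random.uniform(-3, 3) for _ in range(4)]) < 1e-9
           for _ in range(5))
print("footnote identity verified")
```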

For some polynomial optimization problems the standard Lasserre relaxationsmight converge fast eg the lowest order relaxation may often be tight For suchcases the new relaxations (38)-(39) have the same convergence property but mighttake more computational time The following is such a comparison

Example 610 Consider the optimization problem (x0 = 1)

minxisinRn

sum

0leilejlejlen cijkxixjxk

st 0 le x le 1

where each coefficient cijk is randomly generated (by randn in MATLAB) Thematrix L(x) is the same as in Example 43 Since the feasible set is compact wealways have fc = fmin The condition ii) of Theorem 33 is satisfied because ofbox constraints For this kind of randomly generated problems standard Lasserrersquosrelaxations are often tight for the order k = 2 which is also the case for the newrelaxations (38)-(39) Here we compare the computational time that is consumedby standard Lasserrersquos relaxations and (38)-(39) The time is shown (in seconds)in Table 10 For each n in the table we generate 10 random instances and we showthe average of the consumed time For all instances standard Lasserrersquos relaxationsand the new ones (38)-(39) are tight for the order k = 2 while their time is a bitdifferent We can observe that (38)-(39) consume slightly more time

Table 10. Consumed time (in seconds) for Example 6.10

n            9        10        11        12        13        14
w.o. LME   1.2569   2.5619    6.3085   15.8722   35.1675   78.4111
with LME   1.9714   3.8288    8.2519   20.0310   37.6373   82.4778
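The random cubic objective of Example 6.10 is easy to reproduce. A minimal sketch, in Python rather than MATLAB (`random.gauss` stands in for `randn`; the helper name `random_cubic` is ours, not from the paper):

```python
import itertools
import random

def random_cubic(n, seed=0):
    """Build f(x) = sum_{0<=i<=j<=k<=n} c_ijk x_i x_j x_k with x_0 := 1,
    mirroring the randomly generated objective of Example 6.10."""
    rng = random.Random(seed)
    terms = [(idx, rng.gauss(0, 1))
             for idx in itertools.combinations_with_replacement(range(n + 1), 3)]

    def f(x):
        xext = [1.0] + list(x)   # x_0 = 1 supplies all lower-order monomials
        return sum(c * xext[i] * xext[j] * xext[k] for (i, j, k), c in terms)

    return f, terms

f, terms = random_cubic(2)
print(len(terms))   # C(2+3, 3) = 10 monomials for n = 2
```

Setting x_0 = 1 is exactly how a single homogeneous-looking sum produces the constant, linear, and quadratic terms as well.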

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value f_min is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k} = Ideal(c_eq, φ)_{2k} + Qmod(c_in, ψ)_{2k}.

If we replace the quadratic module Qmod(c_in, ψ) by the preordering of (c_in, ψ) [21, 25], we can get even tighter relaxations. For convenience, write (c_in, ψ) as a single tuple (g_1, ..., g_ℓ). Its preordering is the set

Preord(c_in, ψ) := Σ_{r_1,...,r_ℓ ∈ {0,1}} g_1^{r_1} ··· g_ℓ^{r_ℓ} · Σ[x].
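The preordering is generated by the 2^ℓ products g_1^{r_1}···g_ℓ^{r_ℓ}, versus only the ℓ + 1 generators of the quadratic module. A small illustrative sketch (our own helper, not from the paper) that enumerates the values of these products at a point:

```python
import itertools
import math

def preordering_products(g_vals):
    """Values of all products g_1^{r_1}...g_l^{r_l}, r in {0,1}^l, at a point.
    A truncated preordering certificate attaches an SOS multiplier to each of
    these 2^l products; the quadratic module uses only l + 1 of them."""
    return [math.prod(g ** e for g, e in zip(g_vals, r))
            for r in itertools.product((0, 1), repeat=len(g_vals))]

# two inequalities g1 = 1 - x^2 and g2 = x, evaluated at x = 0.5
vals = preordering_products([1 - 0.5 ** 2, 0.5])
print(len(vals), sorted(vals))   # 4 products: [0.375, 0.5, 0.75, 1.0]
```

The exponential count 2^ℓ is the computational price paid for the stronger convergence property of Theorem 7.1.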


The truncation Preord(c_in, ψ)_{2k} is defined similarly to Qmod(c_in, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)
f'_{pre,k} := min ⟨f, y⟩
s.t. ⟨1, y⟩ = 1, L^{(k)}_{c_eq}(y) = 0, L^{(k)}_φ(y) = 0,
     L^{(k)}_{g_1^{r_1}···g_ℓ^{r_ℓ}}(y) ⪰ 0 ∀ r_1, ..., r_ℓ ∈ {0, 1},
     y ∈ R^{N^n_{2k}}.

Similar to (3.9), the dual optimization problem of the above is

(7.2)
f_{pre,k} := max γ
s.t. f − γ ∈ Ideal(c_eq, φ)_{2k} + Preord(c_in, ψ)_{2k}.

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose K_c ≠ ∅ and Assumption 3.1 holds. Then

f_{pre,k} = f'_{pre,k} = f_c

for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_{pre,k} = f'_{pre,k} = f_min for all k big enough.

Proof. The proof is very similar to Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have f̂(x) ≡ 0 on the set

K_3 := {x ∈ R^n | c_eq(x) = 0, φ(x) = 0, c_in(x) ≥ 0, ψ(x) ≥ 0}.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(c_in, ψ) such that f̂^{2ℓ} + q ∈ Ideal(c_eq, φ). The rest of the proof is the same. □

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is singular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]; if the degree bound in [16] is used for each application, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

λ = ( C(x)^T C(x) )^{−1} C(x)^T [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive to use, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
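At a critical pair, the least-squares formula above and the polynomial expression λ = L_1(x)∇f(x) agree. An illustrative numerical check (our own setup, using the box constraints of Example 4.3 with n = 2 and f = x_1 + x_2, whose minimizer is the critical point u = (0, 0)):

```python
import numpy as np

# Box constraints x1 >= 0, x2 >= 0, 1 - x1 >= 0, 1 - x2 >= 0 as a_i^T x - b_i >= 0.
n = 2
A = np.vstack([np.eye(n), -np.eye(n)])   # rows a_i
b = np.array([0.0, 0.0, -1.0, -1.0])

def C_of(x):
    c = A @ x - b                         # constraint values c_i(x)
    return np.vstack([A.T, np.diag(c)])   # the (n+m) x m matrix C(x)

u = np.array([0.0, 0.0])
grad_f = np.array([1.0, 1.0])             # gradient of f = x1 + x2
rhs = np.concatenate([grad_f, np.zeros(2 * n)])

# rational (least-squares) representation lambda = (C^T C)^{-1} C^T [grad f; 0]
lam_ls = np.linalg.lstsq(C_of(u), rhs, rcond=None)[0]

# polynomial representation: L1(x) = [I - diag(x); -diag(x)] from Example 4.3
L1 = np.vstack([np.eye(n) - np.diag(u), -np.diag(u)])
lam_poly = L1 @ grad_f

print(lam_ls, lam_poly)   # both equal (1, 1, 0, 0)
```

Off the critical set the two formulas generally differ; they are only guaranteed to coincide where the linear system (5.1) is consistent.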

Acknowledgements. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties, and algorithms: An introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on noncompact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J. B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, pp. 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J. B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J. B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J. B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J. B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J. B. Lasserre, K. C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: Selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167–191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634–1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P. A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272, American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: A guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J. F. Sturm. SeDuMi 1.02: A MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H. H. Vui and P. T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA

E-mail address: njw@math.ucsd.edu



for f on each subvariety, and then construct a single one for f over the entire set of critical points.

Proof of Theorem 3.3. Clearly, every point in the complex variety

K_1 := {x ∈ C^n | c_eq(x) = 0, φ(x) = 0}

is a critical point. By Lemma 3.3 of [6], the objective f achieves finitely many real values on K_c = K_1 ∩ R^n; say, they are v_1 < ··· < v_N. Up to shifting f by a constant, we can further assume that f_c = 0. Clearly, f_c equals one of v_1, ..., v_N, say v_t = f_c = 0.

Case I: Assume IQ(c_eq, c_in) + IQ(φ, ψ) is archimedean. Let

I := Ideal(c_eq, φ),

the critical ideal. Note that K_1 = V_C(I). The variety V_C(I) is a union of irreducible subvarieties, say, V_1, ..., V_ℓ. If V_i ∩ R^n ≠ ∅, then f is a real constant on V_i, which equals one of v_1, ..., v_N. This is implied by Lemma 3.3 of [6] and Lemma 3.2 of [30]. Denote the subvarieties of V_C(I)

T_i := K_1 ∩ {f(x) = v_i} (i = t, ..., N).

Let T_{t−1} be the union of the irreducible subvarieties V_i such that either V_i ∩ R^n = ∅, or f ≡ v_j on V_i with v_j < v_t = f_c. Then it holds that

V_C(I) = T_{t−1} ∪ T_t ∪ ··· ∪ T_N.

By the primary decomposition of I [11, 41], there exist ideals I_{t−1}, I_t, ..., I_N ⊆ R[x] such that

I = I_{t−1} ∩ I_t ∩ ··· ∩ I_N

and T_i = V_C(I_i) for all i = t−1, t, ..., N. Denote the semialgebraic set

(3.14) S := {x ∈ R^n | c_in(x) ≥ 0, ψ(x) ≥ 0}.

For i = t − 1, we have V_R(I_{t−1}) ∩ S = ∅, because v_1, ..., v_{t−1} < f_c. By the Positivstellensatz [2, Corollary 4.4.3], there exists p_0 ∈ Preord(c_in, ψ)^1 satisfying 2 + p_0 ∈ I_{t−1}. Note that 1 + p_0 > 0 on V_R(I_{t−1}) ∩ S. The set I_{t−1} + Qmod(c_in, ψ) is archimedean, because I ⊆ I_{t−1} and

IQ(c_eq, c_in) + IQ(φ, ψ) ⊆ I_{t−1} + Qmod(c_in, ψ).

By Theorem 2.1, we have

p_1 := 1 + p_0 ∈ I_{t−1} + Qmod(c_in, ψ).

Then 1 + p_1 ∈ I_{t−1}, and there exists p_2 ∈ Qmod(c_in, ψ) such that

−1 ≡ p_1 ≡ p_2 mod I_{t−1}.

Since f = (f/4 + 1)^2 − 1 · (f/4 − 1)^2, we have

f ≡ σ_{t−1} := (f/4 + 1)^2 + p_2 · (f/4 − 1)^2 mod I_{t−1}.

So, when k is big enough, we have σ_{t−1} ∈ Qmod(c_in, ψ)_{2k}.

^1 It is the preordering of the polynomial tuple (c_in, ψ); see §7.1.


For i = t, v_t = 0 and f(x) vanishes on V_C(I_t). By Hilbert's Strong Nullstellensatz [4], there exists an integer m_t > 0 such that f^{m_t} ∈ I_t. Define the polynomial

s_t(ε) := √ε · Σ_{j=0}^{m_t−1} binom(1/2, j) ε^{−j} f^j.

Then we have that

D_1 := ε(1 + ε^{−1} f) − (s_t(ε))^2 ≡ 0 mod I_t.

This is because, in the subtraction defining D_1, after expanding (s_t(ε))^2, all the terms f^j with j < m_t are cancelled, and f^j ∈ I_t for j ≥ m_t. So D_1 ∈ I_t. Let σ_t(ε) := (s_t(ε))^2; then f + ε − σ_t(ε) = D_1 and

(3.15) f + ε − σ_t(ε) = Σ_{j=0}^{m_t−2} b_j(ε) f^{m_t+j}

for some real scalars b_j(ε) depending on ε.
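The cancellation behind (3.15) can be checked directly: with ε = 1 for simplicity, s_t is the order-(m_t − 1) truncation of the Taylor series of √(1 + f), and (1 + f) − s_t^2 contains only the powers f^{m_t}, ..., f^{2m_t−2}. A sketch with exact rational arithmetic (our own verification code):

```python
from fractions import Fraction

def binom_half(j):
    """Generalized binomial coefficient (1/2 choose j), computed exactly."""
    out = Fraction(1)
    for i in range(j):
        out *= (Fraction(1, 2) - i) / (i + 1)
    return out

def poly_mul(p, q):
    """Multiply polynomials in f given as coefficient lists."""
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

mt = 4
s = [binom_half(j) for j in range(mt)]                   # s_t with epsilon = 1
D1 = [Fraction(1), Fraction(1)] + [Fraction(0)] * (2 * mt - 3)   # 1 + f
sq = poly_mul(s, s)
D1 = [a - b for a, b in zip(D1, sq)]

print(D1[:mt])   # all zero: the low-order terms cancel, as claimed
```

Only the powers f^{m_t}, ..., f^{2m_t−2} survive, which is exactly the shape of the right-hand side of (3.15).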

For each i = t+1, ..., N, v_i > 0 and f(x)/v_i − 1 vanishes on V_C(I_i). By Hilbert's Strong Nullstellensatz [4], there exists 0 < m_i ∈ N such that (f/v_i − 1)^{m_i} ∈ I_i. Let

s_i := √v_i · Σ_{j=0}^{m_i−1} binom(1/2, j) (f/v_i − 1)^j.

As in the case i = t, we can similarly show that f − s_i^2 ∈ I_i. Let σ_i := s_i^2; then f − σ_i ∈ I_i.

Note that V_C(I_i) ∩ V_C(I_j) = ∅ for all i ≠ j. By Lemma 3.3 of [30], there exist polynomials a_{t−1}, ..., a_N ∈ R[x] such that

a_{t−1}^2 + ··· + a_N^2 − 1 ∈ I,  a_i ∈ ⋂_{i≠j∈{t−1,...,N}} I_j.

For ε > 0, denote the polynomial

σ_ε := σ_t(ε) a_t^2 + Σ_{t≠j∈{t−1,...,N}} (σ_j + ε) a_j^2;

then

f + ε − σ_ε = (f + ε)(1 − a_{t−1}^2 − ··· − a_N^2) + Σ_{t≠i∈{t−1,...,N}} (f − σ_i) a_i^2 + (f + ε − σ_t(ε)) a_t^2.

For each i ≠ t, f − σ_i ∈ I_i, so

(f − σ_i) a_i^2 ∈ ⋂_{j=t−1}^N I_j = I.

Hence, there exists k_1 > 0 such that

(f − σ_i) a_i^2 ∈ I_{2k_1} (t ≠ i ∈ {t−1, ..., N}).

Since f + ε − σ_t(ε) ∈ I_t, we also have

(f + ε − σ_t(ε)) a_t^2 ∈ ⋂_{j=t−1}^N I_j = I.

Moreover, by the equation (3.15),

(f + ε − σ_t(ε)) a_t^2 = Σ_{j=0}^{m_t−2} b_j(ε) f^{m_t+j} a_t^2.


Each f^{m_t+j} a_t^2 ∈ I, since f^{m_t+j} ∈ I_t. So there exists k_2 > 0 such that, for all ε > 0,

(f + ε − σ_t(ε)) a_t^2 ∈ I_{2k_2}.

Since 1 − a_{t−1}^2 − ··· − a_N^2 ∈ I, there also exists k_3 > 0 such that, for all ε > 0,

(f + ε)(1 − a_{t−1}^2 − ··· − a_N^2) ∈ I_{2k_3}.

Hence, if k* ≥ max{k_1, k_2, k_3}, then we have

f(x) + ε − σ_ε ∈ I_{2k*}

for all ε > 0. By the construction, the degrees of all σ_i and a_i are independent of ε. So σ_ε ∈ Qmod(c_in, ψ)_{2k*} for all ε > 0, if k* is big enough. Note that

I_{2k*} + Qmod(c_in, ψ)_{2k*} = IQ(c_eq, c_in)_{2k*} + IQ(φ, ψ)_{2k*}.

This implies that f_{k*} ≥ f_c − ε for all ε > 0. On the other hand, we always have f_{k*} ≤ f_c. So f_{k*} = f_c. Moreover, since f_k is monotonically increasing, we must have f_k = f_c for all k ≥ k*.

Case II: Assume IQ(c_eq, c_in) is archimedean. Because

IQ(c_eq, c_in) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ),

the set IQ(c_eq, c_in) + IQ(φ, ψ) is also archimedean. Therefore, the conclusion is also true, by applying the result for Case I.

Case III: Suppose Assumption 3.2 holds. Let ϕ_1, ..., ϕ_N be real univariate polynomials such that ϕ_i(v_j) = 0 for i ≠ j and ϕ_i(v_j) = 1 for i = j. Let

s := s_t + ··· + s_N, where each s_i := (v_i − f_c)(ϕ_i(f))^2.

Then s ∈ Σ[x]_{2k_4} for some integer k_4 > 0. Let

f̂ := f − f_c − s.

We show that there exist an integer ℓ > 0 and q ∈ Qmod(c_in, ψ) such that

f̂^{2ℓ} + q ∈ Ideal(c_eq, φ).

This is because, by Assumption 3.2, f̂(x) ≡ 0 on the set

K_2 := {x ∈ R^n : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0},

which has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist 0 < ℓ ∈ N and q = b_0 + ρ b_1 (b_0, b_1 ∈ Σ[x]) such that f̂^{2ℓ} + q ∈ Ideal(c_eq, φ). By Assumption 3.2, ρ ∈ Qmod(c_in, ψ), so we have q ∈ Qmod(c_in, ψ).

For all ε > 0 and τ > 0, we have f̂ + ε = φ_ε + θ_ε, where

φ_ε := −τ ε^{1−2ℓ} (f̂^{2ℓ} + q),
θ_ε := ε (1 + f̂/ε + τ (f̂/ε)^{2ℓ}) + τ ε^{1−2ℓ} q.

By Lemma 2.1 of [31], when τ ≥ 1/(2ℓ), there exists k_5 such that, for all ε > 0,

φ_ε ∈ Ideal(c_eq, φ)_{2k_5}, θ_ε ∈ Qmod(c_in, ψ)_{2k_5}.

Hence, we can get

f − (f_c − ε) = φ_ε + σ_ε,

where σ_ε := θ_ε + s ∈ Qmod(c_in, ψ)_{2k_5} for all ε > 0. Note that

IQ(c_eq, c_in)_{2k_5} + IQ(φ, ψ)_{2k_5} = Ideal(c_eq, φ)_{2k_5} + Qmod(c_in, ψ)_{2k_5}.


For all ε > 0, γ = f_c − ε is feasible in (3.9) for the order k_5, so f_{k_5} ≥ f_c. Because of (3.11) and the monotonicity of f_k, we have f_k = f'_k = f_c for all k ≥ k_5. □

3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is f_c, and the optimal value of (1.1) is f_min. If f_min is achievable at a critical point, then f_c = f_min. In Theorem 3.3, we have shown that f_k = f_c for all k big enough, where f_k is the optimal value of (3.9). The value f_c or f_min is often not known. How do we detect the tightness f_k = f_c in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose y* is a minimizer of (3.8) for the order k. Let

(3.16) d := ⌈deg(c_eq, c_in, φ, ψ)/2⌉.

If there exists an integer t ∈ [d, k] such that

(3.17) rank M_t(y*) = rank M_{t−d}(y*),

then f_k = f_c and we can get r := rank M_t(y*) minimizers for (3.7) [5, 14, 32].

The method in [14] can be used to extract minimizers. It was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0) can also be detected by solving the relaxations (3.8)-(3.9).
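For a univariate moment sequence, the rank test (3.17) is a few lines of linear algebra. A minimal sketch (the function names are ours; for the Dirac measure at a point u, every moment matrix has rank one, so flat truncation holds with exactly one extractable minimizer):

```python
import numpy as np

def moment_matrix(y, t):
    """Univariate moment matrix M_t(y): entry (i, j) is y_{i+j}, 0 <= i, j <= t."""
    return np.array([[y[i + j] for j in range(t + 1)] for i in range(t + 1)])

def flat_truncation(y, t, d, tol=1e-8):
    """Check the rank condition (3.17): rank M_t(y) == rank M_{t-d}(y)."""
    r1 = np.linalg.matrix_rank(moment_matrix(y, t), tol)
    r2 = np.linalg.matrix_rank(moment_matrix(y, t - d), tol)
    return r1 == r2, r1

# Moments of the Dirac measure at u = 0.7, i.e., y_j = u^j (n = 1).
# With constraints of degree <= 2, (3.16) gives d = 1.
u = 0.7
y = [u ** j for j in range(7)]
flat, r = flat_truncation(y, t=2, d=1)
print(flat, r)   # True, rank 1 -> one minimizer to extract
```

In practice y* comes from the semidefinite solver with numerical noise, so the rank must be computed with a tolerance, as above.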

Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order k, then no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0, i.e., (3.7) is infeasible.

ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order k is big enough.

In the following, assume (3.7) is feasible (i.e., f_c < +∞). Then, for all k big enough, (3.8) has a minimizer y*. Moreover:

iii) If (3.17) is satisfied for some t ∈ [d, k], then f_k = f_c.

iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer y* of (3.8) must satisfy (3.17), for some t ∈ [d, k], when k is big enough.

Proof. By Assumption 3.1, u is a critical point if and only if c_eq(u) = 0 and φ(u) = 0.

i) For every feasible point u of (3.7), the tms [u]_{2k} (see §2 for the notation) is feasible for (3.8), for all k. Therefore, if (3.8) is infeasible for some k, then (3.7) must be infeasible.

ii) By Assumption 3.2, when (3.7) is infeasible, the set

{x ∈ R^n : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0}

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that −1 ∈ Ideal(c_eq, φ) + Qmod(ρ). By Assumption 3.2,

Ideal(c_eq, φ) + Qmod(ρ) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ).

Thus, for all k big enough, (3.9) is unbounded from above. Hence (3.8) must be infeasible, by weak duality.

When (3.7) is feasible, f achieves finitely many values on K_c, so (3.7) must achieve its optimal value f_c. By Theorem 3.3, we know that f_k = f'_k = f_c for all k big enough. For each minimizer u* of (3.7), the tms [u*]_{2k} is a minimizer of (3.8).


iii) If (3.17) holds, we can get r := rank M_t(y*) minimizers for (3.7) [5, 14], say, u_1, ..., u_r, such that f_k = f(u_i) for each i. Clearly, f_k = f(u_i) ≥ f_c. On the other hand, we always have f_k ≤ f_c. So f_k = f_c.

iv) By Assumption 3.2, (3.7) is equivalent to the problem

(3.18) min f(x) s.t. c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0.

The optimal value of (3.18) is also f_c. Its kth order Lasserre relaxation is

(3.19) γ'_k := min ⟨f, y⟩ s.t. ⟨1, y⟩ = 1, M_k(y) ⪰ 0, L^{(k)}_{c_eq}(y) = 0, L^{(k)}_φ(y) = 0, L^{(k)}_ρ(y) ⪰ 0.

Its dual optimization problem is

(3.20) γ_k := max γ s.t. f − γ ∈ Ideal(c_eq, φ)_{2k} + Qmod(ρ)_{2k}.

By repeating the same proof as for Theorem 3.3(iii), we can show that

γ_k = γ'_k = f_c

for all k big enough. Because ρ ∈ Qmod(c_in, ψ), each y that is feasible for (3.8) is also feasible for (3.19). So, when k is big, each y* is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some t ∈ [d, k], when k is big enough. □

If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].

4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron

P := {x ∈ R^n | Ax − b ≥ 0},

where A = [a_1 ··· a_m]^T ∈ R^{m×n} and b = [b_1 ··· b_m]^T ∈ R^m. This corresponds to the case that E = ∅, I = [m], and each c_i(x) = a_i^T x − b_i. Denote

(4.1) D(x) := diag(c_1(x), ..., c_m(x)),  C(x) := [ A^T ; D(x) ].

The Lagrange multiplier vector λ = [λ_1 ··· λ_m]^T satisfies

(4.2) [ A^T ; D(x) ] λ = [ ∇f(x) ; 0 ].

If rank A = m, we can express λ as

(4.3) λ = (A A^T)^{−1} A ∇f(x).

If rank A < m, how can we express λ in terms of x? In computation, we often prefer a polynomial expression. If there exists L(x) ∈ R[x]^{m×(n+m)} such that

(4.4) L(x) C(x) = I_m,

then we can get

λ = L(x) [ ∇f(x) ; 0 ] = L_1(x) ∇f(x),


where L_1(x) consists of the first n columns of L(x). In this section, we characterize when such L(x) exists, and we give a degree bound for it.

The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ C^n (also see Definition 5.1). This is equivalent to the condition that, for every u, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then a_{i_1}, ..., a_{i_k} are linearly independent.

Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.

Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.

Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ R[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.

For a subset I = {i_1, ..., i_{m−n}} of [m], denote

c_I(x) := Π_{i∈I} c_i(x),
E_I(x) := c_I(x) · diag( c_{i_1}(x)^{−1}, ..., c_{i_{m−n}}(x)^{−1} ),
D_I(x) := diag( c_{i_1}(x), ..., c_{i_{m−n}}(x) ),
A_I := [ a_{i_1} ··· a_{i_{m−n}} ]^T.

For the case that I = ∅ (the empty set), we set c_∅(x) = 1. Let

V := { I ⊆ [m] : |I| = m − n, rank A_{[m]\I} = n }.

Step I: For each I ∈ V, we construct a matrix polynomial L_I(x) such that

(4.5) L_I(x) C(x) = c_I(x) I_m.

The matrix L_I = L_I(x) satisfying (4.5) can be given by the following 2×3 block matrix, where L_I(J, K) denotes the submatrix whose row indices are from J and whose column indices are from K:

(4.6)
  J \ K    [n]                        n + I                              n + [m]\I
  I        0                          E_I(x)                             0
  [m]\I    c_I(x)·(A_{[m]\I})^{−T}    −(A_{[m]\I})^{−T} (A_I)^T E_I(x)   0

Equivalently, the blocks of L_I are

L_I(I, [n]) = 0,  L_I(I, n + [m]\I) = 0,  L_I([m]\I, n + [m]\I) = 0,
L_I(I, n + I) = E_I(x),
L_I([m]\I, [n]) = c_I(x) (A_{[m]\I})^{−T},
L_I([m]\I, n + I) = −(A_{[m]\I})^{−T} (A_I)^T E_I(x).

For each I ∈ V, A_{[m]\I} is invertible. The superscript −T denotes the inverse of the transpose. Let G := L_I(x) C(x); then one can verify that

G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m−n},  G(I, [m]\I) = 0,

G([m]\I, [m]\I) = [ c_I(x)(A_{[m]\I})^{−T}  −(A_{[m]\I})^{−T} A_I^T E_I(x) ] [ (A_{[m]\I})^T ; 0 ] = c_I(x) I_n,

G([m]\I, I) = [ c_I(x)(A_{[m]\I})^{−T}  −(A_{[m]\I})^{−T} (A_I)^T E_I(x) ] [ A_I^T ; D_I(x) ] = 0.

This shows that the above L_I(x) satisfies (4.5).


Step II: We show that there exist real scalars ν_I satisfying

(4.7) Σ_{I∈V} ν_I c_I(x) = 1.

This can be shown by induction on m.

• When m = n, we have V = {∅} and c_∅(x) = 1, so (4.7) is clearly true.

• When m > n, let

(4.8) N := { i ∈ [m] | rank A_{[m]\{i}} = n }.

For each i ∈ N, let V_i be the set of all I′ ⊆ [m]\{i} such that |I′| = m − n − 1 and rank A_{[m]\(I′∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]\{i}} x − b_{[m]\{i}} is nonsingular. By induction, there exist real scalars ν^{(i)}_{I′} satisfying

(4.9) Σ_{I′∈V_i} ν^{(i)}_{I′} c_{I′}(x) = 1.

Since rank A = n, we can generally assume that {a_1, ..., a_n} is linearly independent. So there exist scalars α_1, ..., α_n such that

a_m = α_1 a_1 + ··· + α_n a_n.

If all α_i = 0, then a_m = 0, and hence A can be replaced by its first m − 1 rows; so (4.7) is true by the induction. In the following, suppose at least one α_i ≠ 0, and write

{ i : α_i ≠ 0 } = { i_1, ..., i_k }.

Then a_{i_1}, ..., a_{i_k}, a_m are linearly dependent. For convenience, set i_{k+1} := m. Since Ax − b is nonsingular, the linear system

c_{i_1}(x) = ··· = c_{i_k}(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence, there exist real scalars μ_1, ..., μ_{k+1} such that

μ_1 c_{i_1}(x) + ··· + μ_k c_{i_k}(x) + μ_{k+1} c_{i_{k+1}}(x) = 1.

The above can be implied by the echelon form for inconsistent linear systems. Note that i_1, ..., i_{k+1} ∈ N. For each j = 1, ..., k+1, by (4.9),

Σ_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{I′}(x) = 1.

Then we can get

1 = Σ_{j=1}^{k+1} μ_j c_{i_j}(x) = Σ_{j=1}^{k+1} μ_j Σ_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{i_j}(x) c_{I′}(x) = Σ_{I = I′∪{i_j}, I′∈V_{i_j}, 1≤j≤k+1} ν^{(i_j)}_{I′} μ_j c_I(x).

Since each I′ ∪ {i_j} ∈ V, (4.7) must be satisfied by some scalars ν_I.


Step III: For L_I(x) as in (4.5), we construct L(x) as

(4.10) L(x) := Σ_{I∈V} ν_I L_I(x).

Clearly, L(x) satisfies (4.4), because

L(x) C(x) = Σ_{I∈V} ν_I L_I(x) C(x) = Σ_{I∈V} ν_I c_I(x) I_m = I_m.

Each L_I(x) has degree ≤ m − n, so L(x) has degree ≤ m − n. □

Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does, a degree bound for L(x) is m − rank A. Sometimes, its degree can be smaller than that, as shown in Example 4.3. For given A, b, the matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x)C(x) = I_m for polyhedral sets.

Example 4.2. Consider the simplex

{ x_1 ≥ 0, ..., x_n ≥ 0, 1 − e^T x ≥ 0 }.

The equation L(x)C(x) = I_{n+1} is satisfied by

L(x) = [ 1−x_1   −x_2  ···  −x_n    1 ··· 1
         −x_1   1−x_2  ···  −x_n    1 ··· 1
          ⋮       ⋮     ⋱     ⋮       ⋮
         −x_1    −x_2  ··· 1−x_n    1 ··· 1
         −x_1    −x_2  ···  −x_n    1 ··· 1 ]

(the last n + 1 columns are all ones).
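Since L(x)C(x) = I_{n+1} is a polynomial identity, it can be sanity-checked numerically at arbitrary sample points. A short check of Example 4.2 (our own verification code):

```python
import numpy as np

n = 4
rng = np.random.default_rng(1)
x = rng.standard_normal(n)          # any point works: the identity holds for all x

A = np.vstack([np.eye(n), -np.ones((1, n))])     # constraint rows a_i
b = np.concatenate([np.zeros(n), [-1.0]])        # a_i^T x - b_i >= 0
c = A @ x - b                                    # constraint values c_i(x)
C = np.vstack([A.T, np.diag(c)])                 # (2n+1) x (n+1)

# rows of L: e_i^T - x^T (i = 1..n) and -x^T, followed by an all-ones block
L = np.hstack([np.vstack([np.eye(n), np.zeros((1, n))]) - np.outer(np.ones(n + 1), x),
               np.ones((n + 1, n + 1))])
print(np.allclose(L @ C, np.eye(n + 1)))         # True
```

This kind of spot check is a cheap way to catch transcription errors in a hand-built L(x).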

Example 4.3. Consider the box constraint

{ x_1 ≥ 0, ..., x_n ≥ 0, 1 − x_1 ≥ 0, ..., 1 − x_n ≥ 0 }.

The equation L(x)C(x) = I_{2n} is satisfied by

L(x) = [ I_n − diag(x)   I_n   I_n
         −diag(x)        I_n   I_n ].

Example 4.4. Consider the polyhedral set

{ 1 − x_4 ≥ 0, x_4 − x_3 ≥ 0, x_3 − x_2 ≥ 0, x_2 − x_1 ≥ 0, x_1 + 1 ≥ 0 }.

The equation L(x)C(x) = I_5 is satisfied by

L(x) = (1/2) [ −x_1−1  −x_2−1  −x_3−1  −x_4−1   1 1 1 1 1
               −x_1−1  −x_2−1  −x_3−1   1−x_4   1 1 1 1 1
               −x_1−1  −x_2−1   1−x_3   1−x_4   1 1 1 1 1
               −x_1−1   1−x_2   1−x_3   1−x_4   1 1 1 1 1
                1−x_1   1−x_2   1−x_3   1−x_4   1 1 1 1 1 ].

Example 4.5. Consider the polyhedral set

{ 1 + x_1 ≥ 0, 1 − x_1 ≥ 0, 2 − x_1 − x_2 ≥ 0, 2 − x_1 + x_2 ≥ 0 }.

The matrix L(x) satisfying L(x)C(x) = I_4 is

(1/6) [ x_1^2 − 3x_1 + 2    x_1x_2 − x_2         4 − x_1    2 − x_1    1 − x_1    1 − x_1
        3x_1^2 − 3x_1 − 6   3x_2 + 3x_1x_2       6 − 3x_1   −3x_1      −3x_1 − 3  −3x_1 − 3
        1 − x_1^2           −2x_2 − x_1x_2 − 3   x_1 − 1    x_1 + 1    x_1 + 2    x_1 + 2
        1 − x_1^2           3 − x_1x_2 − 2x_2    x_1 − 1    x_1 + 1    x_1 + 2    x_1 + 2 ].
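The same kind of numerical check applies here: the identity L(x)C(x) = I_4 of Example 4.5 holds at any sample point (our own verification code):

```python
import numpy as np

rng = np.random.default_rng(0)
x1, x2 = rng.standard_normal(2)

# constraints 1 + x1, 1 - x1, 2 - x1 - x2, 2 - x1 + x2 as a_i^T x - b_i >= 0
A = np.array([[1.0, 0.0], [-1.0, 0.0], [-1.0, -1.0], [-1.0, 1.0]])
b = np.array([-1.0, -1.0, -2.0, -2.0])
c = A @ np.array([x1, x2]) - b
C = np.vstack([A.T, np.diag(c)])                # 6 x 4

L = np.array([
    [x1**2 - 3*x1 + 2,   x1*x2 - x2,          4 - x1,   2 - x1,  1 - x1,    1 - x1],
    [3*x1**2 - 3*x1 - 6, 3*x2 + 3*x1*x2,      6 - 3*x1, -3*x1,   -3*x1 - 3, -3*x1 - 3],
    [1 - x1**2,          -2*x2 - x1*x2 - 3,   x1 - 1,   x1 + 1,  x1 + 2,    x1 + 2],
    [1 - x1**2,          3 - x1*x2 - 2*x2,    x1 - 1,   x1 + 1,  x1 + 2,    x1 + 2],
]) / 6.0
print(np.allclose(L @ C, np.eye(4)))            # True
```

Note that this L(x) has degree 2, smaller than the bound m − rank A = 2 allows in general for m = 4, n = 2.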


5. General constraints

We consider the general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λ_i as polynomial functions in x on the set of critical points.

Suppose there are m equality and inequality constraints in total, i.e.,

E ∪ I = {1, ..., m}.

If (x, λ) is a critical pair, then λ_i c_i(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ_1 ··· λ_m]^T satisfies the equation

(5.1)
[ ∇c_1(x)  ∇c_2(x)  ···  ∇c_m(x)
  c_1(x)   0        ···  0
  0        c_2(x)   ···  0
  ⋮         ⋮        ⋱    ⋮
  0        0        ···  c_m(x) ]  λ  =  [ ∇f(x) ; 0 ; 0 ; ⋮ ; 0 ],

where the matrix on the left hand side is denoted by C(x). If there exists L(x) ∈ R[x]^{m×(m+n)} such that

(5.2) L(x) C(x) = I_m,

then we can get

(5.3) λ = L(x) [ ∇f(x) ; 0 ] = L_1(x) ∇f(x),

where L_1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such L(x) exists.

Definition 5.1. The tuple c = (c_1, ..., c_m) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ C^n.

Clearly, c being nonsingular is equivalent to the condition that, for each u ∈ C^n, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then the gradients ∇c_{i_1}(u), ..., ∇c_{i_k}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.

Proposition 5.2. (i) For W(x) ∈ C[x]^{s×t} with s ≥ t, we have rank W(u) = t for all u ∈ C^n if and only if there exists P(x) ∈ C[x]^{t×s} such that

P(x) W(x) = I_t.

Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} in the above.

(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (5.2).

Proof. (i) "⇐": If P(x)W(x) = I_t, then for all u ∈ C^n,

t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.

"⇒": Suppose rank W(u) = t for all u ∈ C^n. Write W(x) in columns:

W(x) = [ w_1(x)  w_2(x)  ···  w_t(x) ].


Then the equation w_1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ_1(x) ∈ C[x]^s such that ξ_1(x)^T w_1(x) = 1. For each i = 2, ..., t, denote

r_{1i}(x) := ξ_1(x)^T w_i(x);

then (we use ∼ to denote row equivalence between matrices)

W(x) ∼ [ 1       r_{12}(x)  ···  r_{1t}(x)
         w_1(x)  w_2(x)     ···  w_t(x) ]
     ∼ W_1(x) := [ 1  r_{12}(x)     ···  r_{1t}(x)
                   0  w_2^{(1)}(x)  ···  w_t^{(1)}(x) ],

where, for each i = 2, ..., t,

w_i^{(1)}(x) := w_i(x) − r_{1i}(x) w_1(x).

So there exists P_1(x) ∈ C[x]^{(s+1)×s} such that

P_1(x) W(x) = W_1(x).

Since W(x) and W_1(x) are row equivalent, W_1(x) must also have full column rank everywhere. Similarly, the polynomial equation

w_2^{(1)}(x) = 0

does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists ξ_2(x) ∈ C[x]^s such that

ξ_2(x)^T w_2^{(1)}(x) = 1.

For each i = 3 t let r2i(x) = ξ2(x)Tw

(1)2 (x) then

\[ W_1(x) \sim \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & w_2^{(1)}(x) & w_3^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix} \sim W_2(x) := \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & 0 & w_3^{(2)}(x) & \cdots & w_t^{(2)}(x) \end{bmatrix}, \]
where, for each $i = 3, \ldots, t$,
\[ w_i^{(2)}(x) = w_i^{(1)}(x) - r_{2i}(x)\, w_2^{(1)}(x). \]

Similarly, $W_1(x)$ and $W_2(x)$ are row equivalent, so $W_2(x)$ has full column rank everywhere, and there exists $P_2(x) \in \mathbb{C}[x]^{(s+2) \times (s+1)}$ such that
\[ P_2(x) W_1(x) = W_2(x). \]

Continuing this process, we finally get
\[ W_2(x) \sim \cdots \sim W_t(x) := \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & 0 & 1 & \cdots & r_{3t}(x) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}. \]
Consequently, there exist $P_i(x) \in \mathbb{C}[x]^{(s+i) \times (s+i-1)}$, for $i = 1, 2, \ldots, t$, such that
\[ P_t(x) P_{t-1}(x) \cdots P_1(x)\, W(x) = W_t(x). \]


Since $W_t(x)$ is a unit upper triangular matrix polynomial, there exists $P_{t+1}(x) \in \mathbb{C}[x]^{t \times (s+t)}$ such that $P_{t+1}(x) W_t(x) = I_t$. Let
\[ P(x) = P_{t+1}(x) P_t(x) P_{t-1}(x) \cdots P_1(x); \]
then $P(x) W(x) = I_t$. Note that $P(x) \in \mathbb{C}[x]^{t \times s}$.

For $W(x) \in \mathbb{R}[x]^{s \times t}$, we can replace $P(x)$ by $\big( P(x) + \overline{P(x)} \big)/2$ (here $\overline{P(x)}$ denotes the complex conjugate of $P(x)$), which is a real matrix polynomial; since $W(x)$ is real, it still satisfies $P(x)W(x) = I_t$.

(ii) The conclusion is implied directly by item (i). $\square$

In Proposition 5.2, there is no explicit degree bound for $L(x)$ satisfying (5.2). To the best of the author's knowledge, this question is mostly open. However, once a degree is chosen for $L(x)$, such an $L(x)$ can be determined by comparing coefficients on both sides of (5.2); this amounts to solving a linear system. In the following, we give some examples of $L(x)$ satisfying (5.2).
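To illustrate the coefficient-comparison approach, here is a minimal sketch (using Python's sympy; the single ball constraint and the degree-1 ansatz are hypothetical choices, not from the paper). It finds an $L(x)$ with $L(x)C(x) = I_1$ for $c_1 = 1 - x_1^2 - x_2^2$, where $C(x)$ stacks $\nabla c_1$ over $c_1$:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
c1 = 1 - x1**2 - x2**2                                   # single ball constraint (m = 1, n = 2)
C = sp.Matrix([sp.diff(c1, x1), sp.diff(c1, x2), c1])    # C(x) is (n + m) x m

# Ansatz: each of the 3 entries of L(x) has degree <= 1, with unknown coefficients.
monos = [sp.Integer(1), x1, x2]
a = sp.symbols('a0:9')
L = sp.Matrix([[sum(a[3*j + k] * monos[k] for k in range(3)) for j in range(3)]])

# Compare coefficients of L(x)C(x) - 1 = 0 and solve the resulting linear system.
residual = sp.expand((L * C)[0, 0] - 1)
eqs = sp.Poly(residual, x1, x2).coeffs()
sol = sp.solve(eqs, a, dict=True)[0]
L_sol = L.subs(sol)
L_sol = L_sol.subs({s: 0 for s in L_sol.free_symbols - {x1, x2}})  # fix free parameters to 0
print(L_sol)
```

The computed $L(x) = [\,-x_1/2 \ \ -x_2/2 \ \ 1\,]$ indeed satisfies $L(x)C(x) = 1$; for larger tuples $c$, the same coefficient comparison applies entrywise.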

Example 5.3. Consider the hypercube with quadratic constraints
\[ 1 - x_1^2 \ge 0, \quad 1 - x_2^2 \ge 0, \quad \ldots, \quad 1 - x_n^2 \ge 0. \]
The equation $L(x)C(x) = I_n$ is satisfied by
\[ L(x) = \big[\, -\tfrac{1}{2} \operatorname{diag}(x) \quad I_n \,\big]. \]

Example 5.4. Consider the nonnegative portion of the unit sphere:
\[ x_1 \ge 0, \quad x_2 \ge 0, \quad \ldots, \quad x_n \ge 0, \quad x_1^2 + \cdots + x_n^2 - 1 = 0. \]
The equation $L(x)C(x) = I_{n+1}$ is satisfied by
\[ L(x) = \begin{bmatrix} I_n - x x^T & x \mathbf{1}_n^T & 2x \\[2pt] \tfrac{1}{2} x^T & -\tfrac{1}{2} \mathbf{1}_n^T & -1 \end{bmatrix}. \]
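This identity can be checked symbolically. A small verification sketch (using sympy; the choice $n = 4$ is arbitrary) builds $C(x)$ for these constraints and confirms $L(x)C(x) = I_{n+1}$:

```python
import sympy as sp

n = 4
x = sp.Matrix(sp.symbols(f'x1:{n+1}'))
ones = sp.ones(n, 1)

# Constraints: x_i >= 0 (i = 1,...,n) and x^T x - 1 = 0, so m = n + 1;
# C(x) stacks the constraint gradients over diag(c_1, ..., c_m).
c = list(x) + [(x.T * x)[0, 0] - 1]
C = sp.Matrix.hstack(sp.eye(n), 2 * x).col_join(sp.diag(*c))

# The claimed L(x) from Example 5.4.
top = sp.Matrix.hstack(sp.eye(n) - x * x.T, x * ones.T, 2 * x)
bot = sp.Matrix.hstack(x.T / 2, -ones.T / 2, sp.Matrix([[-1]]))
L = top.col_join(bot)

print((L * C).applyfunc(sp.expand))   # the identity matrix of size n + 1
```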

Example 5.5. Consider the set
\[ 1 - x_1^3 - x_2^4 \ge 0, \quad 1 - x_3^4 - x_4^3 \ge 0. \]
The equation $L(x)C(x) = I_2$ is satisfied by
\[ L(x) = \begin{bmatrix} -\tfrac{x_1}{3} & -\tfrac{x_2}{4} & 0 & 0 & 1 & 0 \\[2pt] 0 & 0 & -\tfrac{x_3}{4} & -\tfrac{x_4}{3} & 0 & 1 \end{bmatrix}. \]

Example 5.6. Consider the quadratic set
\[ 1 - x_1 x_2 - x_2 x_3 - x_1 x_3 \ge 0, \quad 1 - x_1^2 - x_2^2 - x_3^2 \ge 0. \]
The matrix $L(x)^T$ satisfying $L(x)C(x) = I_2$ is
\[ \begin{bmatrix}
25 x_1^3 + 10 x_1^2 x_2 + 40 x_1 x_2^2 - 25 x_1 - 2 x_3 & -25 x_1^3 - 10 x_1^2 x_2 - 40 x_1 x_2^2 + \tfrac{49}{2} x_1 + 2 x_3 \\[2pt]
-15 x_1^2 x_2 + 10 x_1 x_2^2 + 20 x_1 x_2 x_3 - 10 x_1 & 15 x_1^2 x_2 - 10 x_1 x_2^2 - 20 x_1 x_2 x_3 + 10 x_1 - \tfrac{x_2}{2} \\[2pt]
25 x_1^2 x_3 - 20 x_1 x_2^2 + 10 x_1 x_2 x_3 + 2 x_1 & -25 x_1^2 x_3 + 20 x_1 x_2^2 - 10 x_1 x_2 x_3 - 2 x_1 - \tfrac{x_3}{2} \\[2pt]
1 - 20 x_1 x_3 - 10 x_1^2 - 20 x_1 x_2 & 20 x_1 x_2 + 20 x_1 x_3 + 10 x_1^2 \\[2pt]
-50 x_1^2 - 20 x_1 x_2 & 50 x_1^2 + 20 x_1 x_2 + 1
\end{bmatrix}. \]

We would like to remark that a polynomial tuple $c = (c_1, \ldots, c_m)$ is generically nonsingular, and that Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees $d_1, \ldots, d_m$, there exists an open dense subset $\mathcal{U}$ of $\mathcal{D} := \mathbb{R}[x]_{d_1} \times \cdots \times \mathbb{R}[x]_{d_m}$ such that every tuple $c = (c_1, \ldots, c_m) \in \mathcal{U}$ is nonsingular. Indeed, such a $\mathcal{U}$ can be chosen as a Zariski open subset of $\mathcal{D}$, i.e., it is the complement of a proper real variety of $\mathcal{D}$. Moreover, Assumption 3.1 holds for all $c \in \mathcal{U}$, i.e., it holds generically.


Proof. The proof uses resultants and discriminants, for which we refer to [29].

First, let $J_1$ be the set of all $(i_1, \ldots, i_{n+1})$ with $1 \le i_1 < \cdots < i_{n+1} \le m$. The resultant $\operatorname{Res}(c_{i_1}, \ldots, c_{i_{n+1}})$ [29, Section 2] is a polynomial in the coefficients of $c_{i_1}, \ldots, c_{i_{n+1}}$ such that if $\operatorname{Res}(c_{i_1}, \ldots, c_{i_{n+1}}) \ne 0$, then the equations
\[ c_{i_1}(x) = \cdots = c_{i_{n+1}}(x) = 0 \]
have no complex solutions. Define
\[ F_1(c) = \prod_{(i_1, \ldots, i_{n+1}) \in J_1} \operatorname{Res}(c_{i_1}, \ldots, c_{i_{n+1}}). \]
In the case $m \le n$, $J_1 = \emptyset$ and we simply let $F_1(c) = 1$. Clearly, if $F_1(c) \ne 0$, then no more than $n$ of the polynomials $c_1, \ldots, c_m$ can have a common complex zero.

Second, let $J_2$ be the set of all $(j_1, \ldots, j_k)$ with $k \le n$ and $1 \le j_1 < \cdots < j_k \le m$. When one of $c_{j_1}, \ldots, c_{j_k}$ has degree greater than one, the discriminant $\Delta(c_{j_1}, \ldots, c_{j_k})$ is a polynomial in the coefficients of $c_{j_1}, \ldots, c_{j_k}$ such that if $\Delta(c_{j_1}, \ldots, c_{j_k}) \ne 0$, then the equations
\[ c_{j_1}(x) = \cdots = c_{j_k}(x) = 0 \]
have no singular complex solution [29, Section 3], i.e., at every common complex solution $u$, the gradients of $c_{j_1}, \ldots, c_{j_k}$ at $u$ are linearly independent. When all of $c_{j_1}, \ldots, c_{j_k}$ have degree one, the discriminant of the tuple $(c_{j_1}, \ldots, c_{j_k})$ is not a single polynomial, but we can define $\Delta(c_{j_1}, \ldots, c_{j_k})$ to be the product of all maximal minors of its Jacobian (a constant matrix). Define
\[ F_2(c) = \prod_{(j_1, \ldots, j_k) \in J_2} \Delta(c_{j_1}, \ldots, c_{j_k}). \]
Clearly, if $F_2(c) \ne 0$, then no $n$ or fewer of the polynomials $c_1, \ldots, c_m$ have a singular common complex zero.

Last, let $F(c) = F_1(c) F_2(c)$ and
\[ \mathcal{U} = \{ c = (c_1, \ldots, c_m) \in \mathcal{D} : F(c) \ne 0 \}. \]
Note that $\mathcal{U}$ is a Zariski open subset of $\mathcal{D}$, and it is open and dense in $\mathcal{D}$. For all $c \in \mathcal{U}$, no more than $n$ of $c_1, \ldots, c_m$ can have a common complex zero, and for any $k \le n$ polynomials of $c_1, \ldots, c_m$, if they have a common complex zero, say $u$, then their gradients at $u$ must be linearly independent. This means that $c$ is a nonsingular tuple.

Since every $c \in \mathcal{U}$ is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all $c \in \mathcal{U}$, so it holds generically. $\square$

6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9), with Lagrange multiplier expressions, for solving the optimization problem (1.1). Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed in the computational results.


The polynomials $p_i$ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as $c_1, \ldots, c_m$. First, find a matrix polynomial $L(x)$ satisfying (4.4) or (5.2). Let $L_1(x)$ be the submatrix of $L(x)$ consisting of its first $n$ columns. Then choose $(p_1, \ldots, p_m)$ to be the product $L_1(x) \nabla f(x)$, i.e.,
\[ p_i = \big( L_1(x) \nabla f(x) \big)_i. \]
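As a concrete sketch of this construction (with a hypothetical objective $f$; the constraints and $L(x)$ are those of Example 5.3, for which $L_1(x) = -\tfrac{1}{2}\operatorname{diag}(x)$, so that $p_i = -\tfrac{1}{2} x_i\, \partial f / \partial x_i$):

```python
import sympy as sp

n = 3
x = sp.symbols(f'x1:{n+1}')
f = x[0]*x[1] + x[1]*x[2] + sum(xi**4 for xi in x)   # hypothetical objective

# Hypercube constraints c_i = 1 - x_i^2 (Example 5.3): L(x) = [-1/2 diag(x), I_n],
# so L1(x), the submatrix of the first n columns, is -1/2 diag(x).
grad_f = sp.Matrix([sp.diff(f, xi) for xi in x])
L1 = -sp.Rational(1, 2) * sp.diag(*x)

p = (L1 * grad_f).applyfunc(sp.expand)               # p_i = (L1(x) grad f(x))_i
print(p)
```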

In all our examples, the global minimum value $f_{min}$ of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., the sub-level set $\{f(x) \le \ell\}$ is compact for all $\ell$) and the constraint qualification condition holds.

By Theorem 3.3, we have $f_k = f_{min}$ for all $k$ big enough, if $f_c = f_{min}$ and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all; instead, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 shows that (3.17) is an appropriate criterion for detecting convergence. It is satisfied by all our examples, except Examples 6.1, 6.7 and 6.9, which have infinitely many minimizers.

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and those given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, together with the computational time (in seconds). The results for the standard Lasserre relaxations are titled ``w/o LME'', and those for the new relaxations (3.8)-(3.9) are titled ``with LME''.

Example 6.1. Consider the optimization problem
\[ \begin{array}{rl} \min & x_1 x_2 (10 - x_3) \\ \mathrm{s.t.} & x_1 \ge 0, \ x_2 \ge 0, \ x_3 \ge 0, \ 1 - x_1 - x_2 - x_3 \ge 0. \end{array} \]
The matrix polynomial $L(x)$ is given in Example 4.2. Since the feasible set is compact, the minimum $f_{min} = 0$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.$^2$ Each feasible point $(x_1, x_2, x_3)$ with $x_1 x_2 = 0$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

| order $k$ | w/o LME: lower bound | w/o LME: time | with LME: lower bound | with LME: time |
| 2 | $-0.0521$ | 0.6841 | $-0.0521$ | 0.1922 |
| 3 | $-0.0026$ | 0.2657 | $-3 \cdot 10^{-8}$ | 0.2285 |
| 4 | $-0.0007$ | 0.6785 | $-6 \cdot 10^{-9}$ | 0.4431 |
| 5 | $-0.0004$ | 1.6105 | $-2 \cdot 10^{-9}$ | 0.9567 |

$^2$Note that $1 - x^T x = (1 - e^T x)(1 + x^T x) + \sum_{i=1}^{n} x_i (1 - x_i)^2 + \sum_{i \ne j} x_i^2 x_j \in IQ(c_{eq}, c_{in})$.


Example 6.2. Consider the optimization problem
\[ \begin{array}{rl} \min & x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3 x_1^2 x_2^2 x_3^2 + (x_1^4 + x_2^4 + x_3^4) \\ \mathrm{s.t.} & x_1^2 + x_2^2 + x_3^2 \ge 1. \end{array} \]
The matrix polynomial $L(x) = \big[\, \tfrac{1}{2} x_1 \ \ \tfrac{1}{2} x_2 \ \ \tfrac{1}{2} x_3 \ \ -1 \,\big]$. The objective $f$ is the sum of the positive definite form $x_1^4 + x_2^4 + x_3^4$ and the Motzkin polynomial
\[ M(x) := x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3 x_1^2 x_2^2 x_3^2. \]
Note that $M(x)$ is nonnegative everywhere but not SOS [37]. Clearly, $f$ is coercive and $f_{min}$ is achieved at a critical point. The set $IQ(\phi, \psi)$ is archimedean, because
\[ c_1(x) p_1(x) = (x_1^2 + x_2^2 + x_3^2 - 1) \big( 3 M(x) + 2 (x_1^4 + x_2^4 + x_3^4) \big) = 0 \]
defines a compact set. So the condition i) of Theorem 3.3 is satisfied.$^3$ The minimum value $f_{min} = \tfrac{1}{3}$, and there are $8$ minimizers $\big( \pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}} \big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.
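As a quick numerical sanity check (a sketch, not part of the paper's computation), evaluating the objective at the eight reported minimizers recovers the value $\tfrac{1}{3}$:

```python
import numpy as np
from itertools import product

def f(x1, x2, x3):
    # Motzkin polynomial plus the positive definite form x1^4 + x2^4 + x3^4
    M = x1**4 * x2**2 + x1**2 * x2**4 + x3**6 - 3 * x1**2 * x2**2 * x3**2
    return M + x1**4 + x2**4 + x3**4

u = 1 / np.sqrt(3)
vals = [f(s1 * u, s2 * u, s3 * u) for s1, s2, s3 in product([-1, 1], repeat=3)]
print(vals)   # each value equals 1/3 up to round-off; the constraint is active there
```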

Table 2. Computational results for Example 6.2

| order $k$ | w/o LME: lower bound | w/o LME: time | with LME: lower bound | with LME: time |
| 3 | $-\infty$ | 0.4466 | 0.1111 | 0.1169 |
| 4 | $-\infty$ | 0.4948 | 0.3333 | 0.3499 |
| 5 | $-2.1821 \cdot 10^{5}$ | 1.1836 | 0.3333 | 0.6530 |

Example 6.3. Consider the optimization problem
\[ \begin{array}{rl} \min & x_1 x_2 + x_2 x_3 + x_3 x_4 - 3 x_1 x_2 x_3 x_4 + (x_1^3 + \cdots + x_4^3) \\ \mathrm{s.t.} & x_1, x_2, x_3, x_4 \ge 0, \quad 1 - x_1 - x_2 \ge 0, \quad 1 - x_3 - x_4 \ge 0. \end{array} \]
The matrix polynomial $L(x)$ is
\[ \begin{bmatrix}
1 - x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
-x_1 & 1 - x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 - x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & -x_3 & 1 - x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}. \]
The feasible set is compact, so $f_{min}$ is achieved at a critical point. One can show that $f_{min} = 0$ and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because $IQ(c_{eq}, c_{in})$ is archimedean.$^4$ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.

$^3$This is because $-c_1^2 p_1^2 \in \mathrm{Ideal}(\phi) \subseteq IQ(\phi, \psi)$ and the set $\{ -c_1(x)^2 p_1(x)^2 \ge 0 \}$ is compact.

$^4$This is because $1 - x_1^2 - x_2^2$ belongs to the quadratic module of $(x_1, x_2, 1 - x_1 - x_2)$, and $1 - x_3^2 - x_4^2$ belongs to the quadratic module of $(x_3, x_4, 1 - x_3 - x_4)$; see the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

| order $k$ | w/o LME: lower bound | w/o LME: time | with LME: lower bound | with LME: time |
| 3 | $-2.9 \cdot 10^{-5}$ | 0.7335 | $-6 \cdot 10^{-7}$ | 0.6091 |
| 4 | $-1.4 \cdot 10^{-5}$ | 2.5055 | $-8 \cdot 10^{-8}$ | 2.7423 |
| 5 | $-1.4 \cdot 10^{-5}$ | 12.7092 | $-5 \cdot 10^{-8}$ | 13.7449 |

Example 6.4. Consider the polynomial optimization problem
\[ \begin{array}{rl} \min\limits_{x \in \mathbb{R}^2} & x_1^2 + 50 x_2^2 \\ \mathrm{s.t.} & x_1^2 - \tfrac{1}{2} \ge 0, \quad x_2^2 - 2 x_1 x_2 - \tfrac{1}{8} \ge 0, \quad x_2^2 + 2 x_1 x_2 - \tfrac{1}{8} \ge 0. \end{array} \]

It is motivated by an example in [12, \S 3]. The first column of $L(x)$ is
\[ \begin{bmatrix} \tfrac{8 x_1^3}{5} + \tfrac{x_1}{5} \\[4pt] \tfrac{288 x_1^4 x_2}{5} - \tfrac{16 x_1^3}{5} - \tfrac{124 x_1^2 x_2}{5} + \tfrac{8 x_1}{5} - 2 x_2 \\[4pt] -\tfrac{288 x_1^4 x_2}{5} - \tfrac{16 x_1^3}{5} + \tfrac{124 x_1^2 x_2}{5} + \tfrac{8 x_1}{5} + 2 x_2 \end{bmatrix}, \]
and the second column of $L(x)$ is
\[ \begin{bmatrix} -\tfrac{8 x_1^2 x_2}{5} + \tfrac{4 x_2^3}{5} - \tfrac{x_2}{10} \\[4pt] \tfrac{288 x_1^3 x_2^2}{5} + \tfrac{16 x_1^2 x_2}{5} - \tfrac{142 x_1 x_2^2}{5} - \tfrac{9 x_1}{20} - \tfrac{8 x_2^3}{5} + \tfrac{11 x_2}{5} \\[4pt] -\tfrac{288 x_1^3 x_2^2}{5} + \tfrac{16 x_1^2 x_2}{5} + \tfrac{142 x_1 x_2^2}{5} + \tfrac{9 x_1}{20} - \tfrac{8 x_2^3}{5} + \tfrac{11 x_2}{5} \end{bmatrix}. \]

The objective is coercive, so $f_{min}$ is achieved at a critical point. The minimum value $f_{min} = 56 + \tfrac{3}{4} + 25\sqrt{5} \approx 112.6517$, and the minimizers are $\big( \pm\sqrt{\tfrac{1}{2}},\ \pm\big( \sqrt{\tfrac{5}{8}} + \sqrt{\tfrac{1}{2}} \big) \big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

| order $k$ | w/o LME: lower bound | w/o LME: time | with LME: lower bound | with LME: time |
| 3 | 6.7535 | 0.4611 | 56.7500 | 0.1309 |
| 4 | 6.9294 | 0.2428 | 112.6517 | 0.2405 |
| 5 | 8.8519 | 0.3376 | 112.6517 | 0.2167 |
| 6 | 16.5971 | 0.4703 | 112.6517 | 0.3788 |
| 7 | 35.4756 | 0.6536 | 112.6517 | 0.4537 |

Example 6.5. Consider the optimization problem
\[ \begin{array}{rl} \min\limits_{x \in \mathbb{R}^3} & x_1^3 + x_2^3 + x_3^3 + 4 x_1 x_2 x_3 - \big( x_1 (x_2^2 + x_3^2) + x_2 (x_3^2 + x_1^2) + x_3 (x_1^2 + x_2^2) \big) \\ \mathrm{s.t.} & x_1 \ge 0, \quad x_1 x_2 - 1 \ge 0, \quad x_2 x_3 - 1 \ge 0. \end{array} \]
The matrix polynomial $L(x)$ is
\[ \begin{bmatrix} 1 - x_1 x_2 & 0 & 0 & x_2 & x_2 & 0 \\ x_1 & 0 & 0 & -1 & -1 & 0 \\ -x_1 & x_2 & 0 & 1 & 0 & -1 \end{bmatrix}. \]


The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant $\mathbb{R}^3_+$, so the minimum value is achieved at a critical point. In computation, we got $f_{min} \approx 0.9492$, with a global minimizer $(0.9071, 1.1024, 0.9071)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

| order $k$ | w/o LME: lower bound | w/o LME: time | with LME: lower bound | with LME: time |
| 2 | $-\infty$ | 0.4129 | $-\infty$ | 0.1900 |
| 3 | $-7.8184 \cdot 10^{6}$ | 0.4641 | 0.9492 | 0.3139 |
| 4 | $-2.0575 \cdot 10^{4}$ | 0.6499 | 0.9492 | 0.5057 |

Example 6.6. Consider the optimization problem (with $x_0 := 1$)
\[ \begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & x^T x + \sum\limits_{i=0}^{4} \prod\limits_{j \ne i} (x_i - x_j) \\ \mathrm{s.t.} & x_1^2 - 1 \ge 0, \quad x_2^2 - 1 \ge 0, \quad x_3^2 - 1 \ge 0, \quad x_4^2 - 1 \ge 0. \end{array} \]
The matrix polynomial $L(x) = \big[\, \tfrac{1}{2} \operatorname{diag}(x) \quad -I_4 \,\big]$. The first part of the objective is $x^T x$, while the second part is a nonnegative polynomial [37]. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} = 4.0000$ and $11$ global minimizers:
\[ (1, 1, 1, 1), \ (1, -1, -1, 1), \ (1, -1, 1, -1), \ (1, 1, -1, -1), \]
\[ (1, -1, -1, -1), \ (-1, -1, 1, 1), \ (-1, 1, -1, 1), \ (-1, 1, 1, -1), \]
\[ (-1, -1, -1, 1), \ (-1, -1, 1, -1), \ (-1, 1, -1, -1). \]
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.
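The count of minimizers can be checked by enumerating all $\pm 1$ sign vectors (a quick numerical sketch, not from the paper):

```python
import numpy as np
from itertools import product

def f(x):
    z = np.concatenate(([1.0], x))   # prepend x0 = 1
    second = sum(np.prod([z[i] - z[j] for j in range(5) if j != i]) for i in range(5))
    return float(x @ x + second)

pts = [np.array(s) for s in product([-1.0, 1.0], repeat=4)]
minimizers = [p for p in pts if abs(f(p) - 4.0) < 1e-9]
print(len(minimizers))   # 11 of the 16 sign vectors attain the value 4
```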

Table 6. Computational results for Example 6.6

| order $k$ | w/o LME: lower bound | w/o LME: time | with LME: lower bound | with LME: time |
| 3 | $-\infty$ | 1.1377 | 3.5480 | 1.1765 |
| 4 | $-6.6913 \cdot 10^{4}$ | 4.7677 | 4.0000 | 3.0761 |
| 5 | $-21.3778$ | 22.9970 | 4.0000 | 10.3354 |

Example 6.7. Consider the optimization problem
\[ \begin{array}{rl} \min\limits_{x \in \mathbb{R}^3} & x_1^4 x_2^2 + x_2^4 x_3^2 + x_3^4 x_1^2 - 3 x_1^2 x_2^2 x_3^2 + x_2^2 \\ \mathrm{s.t.} & x_1 - x_2 x_3 \ge 0, \quad -x_2 + x_3^2 \ge 0. \end{array} \]
The matrix polynomial
\[ L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}. \]
By the arithmetic-geometric mean inequality, one can show that $f_{min} = 0$. The global minimizers are the points $(x_1, 0, x_3)$ with $x_1 \ge 0$ and $x_1 x_3 = 0$. The computational results for the standard Lasserre


relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that $f_k = f_{min}$ for all $k \ge 5$, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

| order $k$ | w/o LME: lower bound | w/o LME: time | with LME: lower bound | with LME: time |
| 3 | $-\infty$ | 0.6144 | $-\infty$ | 0.3418 |
| 4 | $-1.0909 \cdot 10^{7}$ | 1.0542 | $-3.9476$ | 0.7180 |
| 5 | $-942.6772$ | 1.6771 | $-3 \cdot 10^{-9}$ | 1.4607 |
| 6 | $-0.0110$ | 3.3532 | $-8 \cdot 10^{-10}$ | 3.1618 |

Example 6.8. Consider the optimization problem
\[ \begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & x_1^2 (x_1 - x_4)^2 + x_2^2 (x_2 - x_4)^2 + x_3^2 (x_3 - x_4)^2 \\ & \quad +\, 2 x_1 x_2 x_3 (x_1 + x_2 + x_3 - 2 x_4) + (x_1 - 1)^2 + (x_2 - 1)^2 + (x_3 - 1)^2 \\ \mathrm{s.t.} & x_1 - x_2 \ge 0, \quad x_2 - x_3 \ge 0. \end{array} \]
The matrix polynomial
\[ L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}. \]
In the objective, the sum of the first four terms is a nonnegative form [37], while the sum of the last three terms is a coercive polynomial. The objective is therefore coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} \approx 0.9413$, with a minimizer
\[ (0.5632, 0.5632, 0.5632, 0.7510). \]
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

| order $k$ | w/o LME: lower bound | w/o LME: time | with LME: lower bound | with LME: time |
| 2 | $-\infty$ | 0.3984 | $-0.3360$ | 0.9321 |
| 3 | $-\infty$ | 0.7634 | 0.9413 | 0.5240 |
| 4 | $-6.4896 \cdot 10^{5}$ | 4.5496 | 0.9413 | 1.7192 |
| 5 | $-3.1645 \cdot 10^{3}$ | 24.3665 | 0.9413 | 8.1228 |

Example 6.9. Consider the optimization problem
\[ \begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & (x_1 + x_2 + x_3 + x_4 + 1)^2 - 4 (x_1 x_2 + x_2 x_3 + x_3 x_4 + x_4 + x_1) \\ \mathrm{s.t.} & 0 \le x_1, \ldots, x_4 \le 1. \end{array} \]
The matrix $L(x)$ is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so $f_{min}$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.$^5$ The minimum value $f_{min} = 0$, and for each $t \in [0, 1]$, the point $(t, 0, 0, 1 - t)$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.

$^5$Note that $4 - \sum_{i=1}^{4} x_i^2 = \sum_{i=1}^{4} \big( x_i (1 - x_i)^2 + (1 - x_i)(1 + x_i^2) \big) \in IQ(c_{eq}, c_{in})$.


Table 9. Computational results for Example 6.9

| order $k$ | w/o LME: lower bound | w/o LME: time | with LME: lower bound | with LME: time |
| 2 | $-0.0279$ | 0.2262 | $-5 \cdot 10^{-6}$ | 1.1835 |
| 3 | $-0.0005$ | 0.4691 | $-6 \cdot 10^{-7}$ | 1.6566 |
| 4 | $-0.0001$ | 3.1098 | $-2 \cdot 10^{-7}$ | 5.5234 |
| 5 | $-4 \cdot 10^{-5}$ | 16.5092 | $-6 \cdot 10^{-7}$ | 19.7320 |

For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem (with $x_0 := 1$)
\[ \begin{array}{rl} \min\limits_{x \in \mathbb{R}^n} & \sum\limits_{0 \le i \le j \le k \le n} c_{ijk}\, x_i x_j x_k \\ \mathrm{s.t.} & 0 \le x \le 1, \end{array} \]
where each coefficient $c_{ijk}$ is randomly generated (by randn in MATLAB). The matrix $L(x)$ is the same as in Example 4.3. Since the feasible set is compact, we always have $f_c = f_{min}$. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order $k = 2$, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each $n$ in the table, we generate $10$ random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order $k = 2$, while their consumed time differs a bit: we can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

| $n$ | 9 | 10 | 11 | 12 | 13 | 14 |
| w/o LME | 1.2569 | 2.5619 | 6.3085 | 15.8722 | 35.1675 | 78.4111 |
| with LME | 1.9714 | 3.8288 | 8.2519 | 20.0310 | 37.6373 | 82.4778 |

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value $f_{min}$ is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that
\[ IQ(c_{eq}, c_{in})_{2k} + IQ(\phi, \psi)_{2k} = \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(c_{in}, \psi)_{2k}. \]
If we replace the quadratic module $\mathrm{Qmod}(c_{in}, \psi)$ by the preordering of $(c_{in}, \psi)$ [21, 25], we can get even tighter relaxations. For convenience, write $(c_{in}, \psi)$ as a single tuple $(g_1, \ldots, g_\ell)$. Its preordering is the set
\[ \mathrm{Preord}(c_{in}, \psi) := \sum_{r_1, \ldots, r_\ell \in \{0, 1\}} g_1^{r_1} \cdots g_\ell^{r_\ell} \cdot \Sigma[x]. \]
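For a small illustration of the difference from a quadratic module (a hypothetical tuple, sketched with sympy): the preordering attaches an SOS multiplier to each of the $2^\ell$ products $g_1^{r_1} \cdots g_\ell^{r_\ell}$, rather than only to $1, g_1, \ldots, g_\ell$:

```python
import sympy as sp
from itertools import product

x1, x2 = sp.symbols('x1 x2')
g = [1 - x1**2, x2]   # a sample tuple (c_in, psi) with l = 2

# All 2^l products g_1^{r_1} * ... * g_l^{r_l} with r_i in {0, 1}; each of them
# is weighted by an SOS polynomial in Preord(g).
prods = [sp.prod([gi**ri for gi, ri in zip(g, r)])
         for r in product([0, 1], repeat=len(g))]
print(prods)
```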


The truncation $\mathrm{Preord}(c_{in}, \psi)_{2k}$ is defined similarly to $\mathrm{Qmod}(c_{in}, \psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same order $k$, is
\[ (7.1) \qquad \left\{ \begin{array}{rl} f^{\prime pre}_k := \min & \langle f, y \rangle \\ \mathrm{s.t.} & \langle 1, y \rangle = 1, \ \ L^{(k)}_{c_{eq}}(y) = 0, \ \ L^{(k)}_{\phi}(y) = 0, \\ & L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \ \ \forall\, r_1, \ldots, r_\ell \in \{0, 1\}, \\ & y \in \mathbb{R}^{\mathbb{N}^n_{2k}}. \end{array} \right. \]

Similar to (3.9), the dual optimization problem of the above is
\[ (7.2) \qquad \left\{ \begin{array}{rl} f^{pre}_k := \max & \gamma \\ \mathrm{s.t.} & f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Preord}(c_{in}, \psi)_{2k}. \end{array} \right. \]
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of its conditions i)-iii) is satisfied. This gives the following theorem.

Theorem 7.1. Suppose $K_c \ne \emptyset$ and Assumption 3.1 holds. Then
\[ f^{pre}_k = f^{\prime pre}_k = f_c \]
for all $k$ sufficiently large. Therefore, if the minimum value $f_{min}$ of (1.1) is achieved at a critical point, then $f^{pre}_k = f^{\prime pre}_k = f_{min}$ for all $k$ big enough.

Proof. The proof is very similar to that of Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have $\hat{f}(x) \equiv 0$ on the set
\[ K_3 := \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ c_{in}(x) \ge 0, \ \psi(x) \ge 0 \}. \]
By the Positivstellensatz, there exist an integer $\ell > 0$ and $q \in \mathrm{Preord}(c_{in}, \psi)$ such that $\hat{f}^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi)$. The rest of the proof is the same. $\square$

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple $c$ of constraining polynomials is nonsingular, then there exists a matrix polynomial $L(x)$ such that $L(x)C(x) = I_m$; hence the Lagrange multiplier vector $\lambda$ can be expressed as in (5.3). However, if $c$ is not nonsingular, then such an $L(x)$ does not exist. For such cases, how can we express $\lambda$ in terms of $x$, for the critical pairs $(x, \lambda)$? To the best of the author's knowledge, this question is mostly open.

7.3. Degree bound for $L(x)$. For a nonsingular tuple $c$ of constraining polynomials, what is a good degree bound for $L(x)$ in Proposition 5.2? When $c$ is linear, a degree bound is given in Proposition 4.1; however, for nonlinear $c$, an explicit degree bound is not known. Theoretically, we can get a degree bound for $L(x)$: in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used $t$ times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If, for each of its usages, the degree bound in [16] is applied, then a degree bound for $L(x)$ can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for $L(x)$.


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector $\lambda$ is determined by a linear equation. Naturally, one can get
\[ \lambda = \big( C(x)^T C(x) \big)^{-1} C(x)^T \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} \]

when $C(x)$ has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, $\lambda$ might have rational representations other than the above one. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
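A short sketch of this rational representation (using sympy, with the hypercube constraints of Example 5.3 and a hypothetical objective; here the denominators come out as $4x_i^2 + (1 - x_i^2)^2$, which on the active set $x_i^2 = 1$ reduce to the polynomial expression $\lambda_i = -\tfrac{1}{2} x_i\, \partial f / \partial x_i$):

```python
import sympy as sp

x = sp.Matrix(sp.symbols('x1 x2'))
f = x[0]**2 * x[1] + x[1]**3                 # hypothetical objective
c = [1 - xi**2 for xi in x]                  # hypercube constraints (Example 5.3)

grad = lambda g: sp.Matrix([sp.diff(g, xi) for xi in x])
C = sp.Matrix.hstack(*[grad(ci) for ci in c]).col_join(sp.diag(*c))

rhs = grad(f).col_join(sp.zeros(len(c), 1))
lam = (C.T * C).inv() * C.T * rhs            # rational representation of lambda
print(sp.simplify(lam[0]))                   # denominator 4*x1**2 + (1 - x1**2)**2
```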

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for their valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear programming, second edition, Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry, Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83-111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties, and algorithms. An introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189-226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189-200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal On Optimization, 21 (2011), pp. 824-832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra, Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503-523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761-779.
[14] D. Henrion and J. B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, 293-310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93-109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963-975.
[17] J. B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796-817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607-647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864-885.
[20] J. B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995-2014.
[21] J. B. Lasserre. Moments, positive polynomials and their applications, Imperial College Press, 2009.
[22] J. B. Lasserre. Introduction to polynomial and semi-algebraic optimization, Cambridge University Press, Cambridge, 2015.
[23] J. B. Lasserre, K. C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1-26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pages 157-270, 2009.
[26] M. Laurent. Optimization over polynomials: Selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587-606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697-718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167-191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225-255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634-1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485-510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97-121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247-274.
[35] P. A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pages 83-99, AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969-984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251-272, American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920-942.
[39] C. Scheiderer. Positivity and sums of squares: A guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271-324.
[40] J. F. Sturm. SeDuMi 1.02: A MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625-653.
[41] B. Sturmfels. Solving systems of polynomial equations, CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H. H. Vui and P. T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941-951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093

E-mail address: njw@math.ucsd.edu



For $i = t$: $v_t = 0$ and $f(x)$ vanishes on $V_{\mathbb{C}}(I_t)$. By Hilbert's Strong Nullstellensatz [4], there exists an integer $m_t > 0$ such that $f^{m_t} \in I_t$. Define the polynomial
\[ s_t(\epsilon) := \sqrt{\epsilon} \sum_{j=0}^{m_t - 1} \binom{1/2}{j} \epsilon^{-j} f^j. \]
Then we have that
\[ D_1 := \epsilon (1 + \epsilon^{-1} f) - \big( s_t(\epsilon) \big)^2 \equiv 0 \mod I_t. \]
This is because, in the subtraction defining $D_1$, after expanding $\big( s_t(\epsilon) \big)^2$, all the terms $f^j$ with $j < m_t$ are cancelled, and $f^j \in I_t$ for $j \ge m_t$. So $D_1 \in I_t$. Let $\sigma_t(\epsilon) = s_t(\epsilon)^2$; then $f + \epsilon - \sigma_t(\epsilon) = D_1$ and
\[ (3.15) \qquad f + \epsilon - \sigma_t(\epsilon) = \sum_{j=0}^{m_t - 2} b_j(\epsilon)\, f^{m_t + j} \]
for some real scalars $b_j(\epsilon)$ depending on $\epsilon$.

For each $i \in \{t+1, \ldots, N\}$: $v_i > 0$ and $f(x)/v_i - 1$ vanishes on $V_{\mathbb{C}}(I_i)$. By Hilbert's Strong Nullstellensatz [4], there exists $0 < m_i \in \mathbb{N}$ such that $(f/v_i - 1)^{m_i} \in I_i$. Let
\[ s_i := \sqrt{v_i} \sum_{j=0}^{m_i - 1} \binom{1/2}{j} (f/v_i - 1)^j. \]
As in the case $i = t$, we can similarly show that $f - s_i^2 \in I_i$. Let $\sigma_i = s_i^2$; then $f - \sigma_i \in I_i$.

Note that $V_{\mathbb{C}}(I_i) \cap V_{\mathbb{C}}(I_j) = \emptyset$ for all $i \ne j$. By Lemma 3.3 of [30], there exist polynomials $a_{t-1}, \ldots, a_N \in \mathbb{R}[x]$ such that
\[ a_{t-1}^2 + \cdots + a_N^2 - 1 \in I, \qquad a_i \in \bigcap_{j \in \{t-1, \ldots, N\} \setminus \{i\}} I_j. \]

For $\epsilon > 0$, denote the polynomial
\[ \sigma_\epsilon := \sigma_t(\epsilon)\, a_t^2 + \sum_{j \in \{t-1, \ldots, N\} \setminus \{t\}} (\sigma_j + \epsilon)\, a_j^2; \]
then
\[ f + \epsilon - \sigma_\epsilon = (f + \epsilon) \big( 1 - a_{t-1}^2 - \cdots - a_N^2 \big) + \sum_{i \in \{t-1, \ldots, N\} \setminus \{t\}} (f - \sigma_i)\, a_i^2 + \big( f + \epsilon - \sigma_t(\epsilon) \big) a_t^2. \]

For each $i \ne t$, $f - \sigma_i \in I_i$, so
\[ (f - \sigma_i)\, a_i^2 \in \bigcap_{j=t-1}^{N} I_j = I. \]
Hence, there exists $k_1 > 0$ such that
\[ (f - \sigma_i)\, a_i^2 \in I_{2k_1} \qquad (t \ne i \in \{t-1, \ldots, N\}). \]
Since $f + \epsilon - \sigma_t(\epsilon) \in I_t$, we also have
\[ \big( f + \epsilon - \sigma_t(\epsilon) \big) a_t^2 \in \bigcap_{j=t-1}^{N} I_j = I. \]
Moreover, by the equation (3.15),
\[ \big( f + \epsilon - \sigma_t(\epsilon) \big) a_t^2 = \sum_{j=0}^{m_t - 2} b_j(\epsilon)\, f^{m_t + j} a_t^2. \]


Each $f^{m_t + j} a_t^2 \in I$, since $f^{m_t + j} \in I_t$. So there exists $k_2 > 0$ such that, for all $\epsilon > 0$,
\[ \big( f + \epsilon - \sigma_t(\epsilon) \big) a_t^2 \in I_{2k_2}. \]
Since $1 - a_{t-1}^2 - \cdots - a_N^2 \in I$, there also exists $k_3 > 0$ such that, for all $\epsilon > 0$,
\[ (f + \epsilon) \big( 1 - a_{t-1}^2 - \cdots - a_N^2 \big) \in I_{2k_3}. \]
Hence, if $k^* \ge \max\{k_1, k_2, k_3\}$, then we have
\[ f(x) + \epsilon - \sigma_\epsilon \in I_{2k^*} \]
for all $\epsilon > 0$. By the construction, the degrees of all $\sigma_i$ and $a_i$ are independent of $\epsilon$; so $\sigma_\epsilon \in \mathrm{Qmod}(c_{in}, \psi)_{2k^*}$ for all $\epsilon > 0$, if $k^*$ is big enough. Note that
\[ I_{2k^*} + \mathrm{Qmod}(c_{in}, \psi)_{2k^*} = IQ(c_{eq}, c_{in})_{2k^*} + IQ(\phi, \psi)_{2k^*}. \]
This implies that $f_{k^*} \ge f_c - \epsilon$ for all $\epsilon > 0$. On the other hand, we always have $f_{k^*} \le f_c$; so $f_{k^*} = f_c$. Moreover, since $f_k$ is monotonically increasing, we must have $f_k = f_c$ for all $k \ge k^*$.

Case II: Assume $IQ(c_{eq}, c_{in})$ is archimedean. Because
\[ IQ(c_{eq}, c_{in}) \subseteq IQ(c_{eq}, c_{in}) + IQ(\phi, \psi), \]
the set $IQ(c_{eq}, c_{in}) + IQ(\phi, \psi)$ is also archimedean. Therefore, the conclusion follows by applying the result for Case I.

Case III: Suppose Assumption 3.2 holds. Let $\varphi_1, \ldots, \varphi_N$ be real univariate polynomials such that $\varphi_i(v_j) = 0$ for $i \ne j$ and $\varphi_i(v_j) = 1$ for $i = j$. Let
\[ s = s_t + \cdots + s_N, \quad \text{where each} \quad s_i = (v_i - f_c)\big(\varphi_i(f)\big)^2. \]
Then $s \in \Sigma[x]_{2k_4}$ for some integer $k_4 > 0$. Let
\[ \hat{f} = f - f_c - s. \]
We show that there exist an integer $\ell > 0$ and $q \in \mathrm{Qmod}(c_{in}, \psi)$ such that
\[ \hat{f}^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi). \]
This is because, by Assumption 3.2, $\hat{f}(x) \equiv 0$ on the set
\[ K_2 = \{x \in \mathbb{R}^n : \ c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0\}. \]
It has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist $0 < \ell \in \mathbb{N}$ and $q = b_0 + \rho b_1$ ($b_0, b_1 \in \Sigma[x]$) such that $\hat{f}^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi)$. By Assumption 3.2, $\rho \in \mathrm{Qmod}(c_{in}, \psi)$, so we have $q \in \mathrm{Qmod}(c_{in}, \psi)$.

For all $\epsilon > 0$ and $\tau > 0$, we have $\hat{f} + \epsilon = \phi_\epsilon + \theta_\epsilon$, where
\[ \phi_\epsilon = -\tau \epsilon^{1-2\ell}\big(\hat{f}^{2\ell} + q\big), \qquad \theta_\epsilon = \epsilon\Big(1 + \hat{f}/\epsilon + \tau (\hat{f}/\epsilon)^{2\ell}\Big) + \tau \epsilon^{1-2\ell} q. \]
By Lemma 2.1 of [31], when $\tau \ge \frac{1}{2\ell}$, there exists $k_5$ such that, for all $\epsilon > 0$,
\[ \phi_\epsilon \in \mathrm{Ideal}(c_{eq}, \phi)_{2k_5}, \qquad \theta_\epsilon \in \mathrm{Qmod}(c_{in}, \psi)_{2k_5}. \]
Hence we can get
\[ f - (f_c - \epsilon) = \phi_\epsilon + \sigma_\epsilon, \]
where $\sigma_\epsilon = \theta_\epsilon + s \in \mathrm{Qmod}(c_{in}, \psi)_{2k_5}$ for all $\epsilon > 0$. Note that
\[ IQ(c_{eq}, c_{in})_{2k_5} + IQ(\phi, \psi)_{2k_5} = \mathrm{Ideal}(c_{eq}, \phi)_{2k_5} + \mathrm{Qmod}(c_{in}, \psi)_{2k_5}. \]


For all $\epsilon > 0$, $\gamma = f_c - \epsilon$ is feasible in (3.9) for the order $k_5$, so $f_{k_5} \ge f_c$. Because of (3.11) and the monotonicity of $f_k$, we have $f_k = f_k' = f_c$ for all $k \ge k_5$. $\square$

3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is $f_c$, and the optimal value of (1.1) is $f_{min}$. If $f_{min}$ is achievable at a critical point, then $f_c = f_{min}$. In Theorem 3.3, we have shown that $f_k = f_c$ for all $k$ big enough, where $f_k$ is the optimal value of (3.9). The value $f_c$ or $f_{min}$ is often not known. How do we detect the tightness $f_k = f_c$ in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose $y^*$ is a minimizer of (3.8) for the order $k$. Let
\[ d = \lceil \deg(c_{eq}, c_{in}, \phi, \psi)/2 \rceil. \tag{3.16} \]
If there exists an integer $t \in [d, k]$ such that
\[ \operatorname{rank} M_t(y^*) = \operatorname{rank} M_{t-d}(y^*), \tag{3.17} \]
then $f_k = f_c$ and we can get $r := \operatorname{rank} M_t(y^*)$ minimizers for (3.7) [5, 14, 32].

The method in [14] can be used to extract minimizers; it is implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints $c_{in} \ge 0$, $\psi \ge 0$) can also be detected by solving the relaxations (3.8)-(3.9).
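As an illustration (not part of the paper), the rank condition (3.17) can be tested numerically once the truncated moment matrices $M_t(y^*)$ have been extracted from the SDP solution. The SVD-based numerical rank and its tolerance below are assumptions of this sketch:

```python
import numpy as np

def numrank(B, tol=1e-6):
    # numerical rank via SVD with a relative threshold (tol is an assumption)
    s = np.linalg.svd(B, compute_uv=False)
    return int(np.sum(s > tol * max(1.0, s[0])))

def flat_truncation(M, d):
    # M[t] is the order-t moment matrix M_t(y*); k = len(M) - 1
    # returns True if rank M_t(y*) = rank M_{t-d}(y*) for some t in [d, k]
    k = len(M) - 1
    return any(numrank(M[t]) == numrank(M[t - d]) for t in range(d, k + 1))

# moment matrices of a Dirac measure at u = 2 (univariate): every rank is one,
# so flat truncation holds already at t = d
v = lambda t: np.array([2.0 ** i for i in range(t + 1)])
M = [np.outer(v(t), v(t)) for t in range(4)]
print(flat_truncation(M, d=1))   # True
```

In practice the moment matrices come from the solver's output $y^*$; a sequence of identity matrices (ranks $1, 2, 3, \ldots$) would fail the test, as expected for a measure that is not finitely atomic at the current order.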

Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order $k$, then no critical points satisfy the constraints $c_{in} \ge 0$, $\psi \ge 0$, i.e., (3.7) is infeasible.

ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order $k$ is big enough.

In the following, assume (3.7) is feasible (i.e., $f_c < +\infty$). Then, for all $k$ big enough, (3.8) has a minimizer $y^*$. Moreover:

iii) If (3.17) is satisfied for some $t \in [d, k]$, then $f_k = f_c$.

iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer $y^*$ of (3.8) must satisfy (3.17) for some $t \in [d, k]$, when $k$ is big enough.

Proof. By Assumption 3.1, $u$ is a critical point if and only if $c_{eq}(u) = 0$, $\phi(u) = 0$.

i) For every feasible point $u$ of (3.7), the tms $[u]_{2k}$ (see §2 for the notation) is feasible for (3.8), for all $k$. Therefore, if (3.8) is infeasible for some $k$, then (3.7) must be infeasible.

ii) By Assumption 3.2, when (3.7) is infeasible, the set
\[ \{x \in \mathbb{R}^n : \ c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0\} \]
is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that $-1 \in \mathrm{Ideal}(c_{eq}, \phi) + \mathrm{Qmod}(\rho)$. By Assumption 3.2,
\[ \mathrm{Ideal}(c_{eq}, \phi) + \mathrm{Qmod}(\rho) \subseteq IQ(c_{eq}, c_{in}) + IQ(\phi, \psi). \]
Thus, for all $k$ big enough, (3.9) is unbounded from above. Hence (3.8) must be infeasible, by weak duality.

When (3.7) is feasible, $f$ achieves finitely many values on $K_c$, so (3.7) must achieve its optimal value $f_c$. By Theorem 3.3, we know that $f_k = f_k' = f_c$ for all $k$ big enough. For each minimizer $u^*$ of (3.7), the tms $[u^*]_{2k}$ is a minimizer of (3.8).


iii) If (3.17) holds, we can get $r := \operatorname{rank} M_t(y^*)$ minimizers for (3.7) [5, 14], say $u_1, \ldots, u_r$, such that $f_k = f(u_i)$ for each $i$. Clearly, $f_k = f(u_i) \ge f_c$. On the other hand, we always have $f_k \le f_c$. So $f_k = f_c$.

iv) By Assumption 3.2, (3.7) is equivalent to the problem
\[ \begin{array}{ll} \min & f(x) \\ \text{s.t.} & c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0. \end{array} \tag{3.18} \]
The optimal value of (3.18) is also $f_c$. Its $k$th order Lasserre relaxation is
\[ \begin{array}{ll} \gamma_k' := \min & \langle f, y \rangle \\ \text{s.t.} & \langle 1, y \rangle = 1, \ M_k(y) \succeq 0, \\ & L^{(k)}_{c_{eq}}(y) = 0, \ L^{(k)}_{\phi}(y) = 0, \ L^{(k)}_{\rho}(y) \succeq 0. \end{array} \tag{3.19} \]
Its dual optimization problem is
\[ \begin{array}{ll} \gamma_k := \max & \gamma \\ \text{s.t.} & f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(\rho)_{2k}. \end{array} \tag{3.20} \]
By repeating the same proof as for Theorem 3.3(iii), we can show that
\[ \gamma_k = \gamma_k' = f_c \]
for all $k$ big enough. Because $\rho \in \mathrm{Qmod}(c_{in}, \psi)$, each $y$ feasible for (3.8) is also feasible for (3.19). So, when $k$ is big, each $y^*$ is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some $t \in [d, k]$, when $k$ is big enough. $\square$

If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied; we refer to [25, §6.6].

4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron
\[ P = \{x \in \mathbb{R}^n \ | \ Ax - b \ge 0\}, \]
where $A = [a_1 \ \cdots \ a_m]^T \in \mathbb{R}^{m\times n}$ and $b = [b_1 \ \cdots \ b_m]^T \in \mathbb{R}^m$. This corresponds to the case that $\mathcal{E} = \emptyset$, $\mathcal{I} = [m]$, and each $c_i(x) = a_i^T x - b_i$. Denote
\[ D(x) = \mathrm{diag}\big(c_1(x), \ldots, c_m(x)\big), \qquad C(x) = \begin{bmatrix} A^T \\ D(x) \end{bmatrix}. \tag{4.1} \]

The Lagrange multiplier vector $\lambda = [\lambda_1 \ \cdots \ \lambda_m]^T$ satisfies
\[ \begin{bmatrix} A^T \\ D(x) \end{bmatrix} \lambda = \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}. \tag{4.2} \]
If $\operatorname{rank} A = m$, we can express $\lambda$ as
\[ \lambda = (AA^T)^{-1} A \nabla f(x). \tag{4.3} \]
If $\operatorname{rank} A < m$, how can we express $\lambda$ in terms of $x$? In computation, we often prefer a polynomial expression. If there exists $L(x) \in \mathbb{R}[x]^{m\times(n+m)}$ such that
\[ L(x) C(x) = I_m, \tag{4.4} \]
then we can get
\[ \lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x)\nabla f(x), \]


where $L_1(x)$ consists of the first $n$ columns of $L(x)$. In this section, we characterize when such $L(x)$ exists, and we give a degree bound for it.

The linear function $Ax - b$ is said to be nonsingular if $\operatorname{rank} C(u) = m$ for all $u \in \mathbb{C}^n$ (also see Definition 5.1). This is equivalent to the condition that, for every $u$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then $a_{i_1}, \ldots, a_{i_k}$ are linearly independent.
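For the full-row-rank case, (4.3) recovers $\lambda$ by a single linear solve. A small numerical sanity check (a sketch, not from the paper; the vector `g` stands in for $\nabla f(u)$ at a critical point, where $A^T\lambda = \nabla f(u)$ holds):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 5
A = rng.standard_normal((m, n))        # rank A = m holds generically
lam_true = rng.standard_normal(m)
g = A.T @ lam_true                     # stand-in for grad f(u) at a critical point

lam = np.linalg.solve(A @ A.T, A @ g)  # (4.3): lam = (A A^T)^{-1} A grad f(u)
print(np.allclose(lam, lam_true))      # True
```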

Proposition 4.1. The linear function $Ax - b$ is nonsingular if and only if there exists a matrix polynomial $L(x)$ satisfying (4.4). Moreover, when $Ax - b$ is nonsingular, we can choose $L(x)$ in (4.4) with $\deg(L) \le m - \operatorname{rank} A$.

Proof. Clearly, if (4.4) is satisfied by some $L(x)$, then $\operatorname{rank} C(u) \ge m$ for all $u$. This implies that $Ax - b$ is nonsingular.

Next, assume that $Ax - b$ is nonsingular. We show that (4.4) is satisfied by some $L(x) \in \mathbb{R}[x]^{m\times(n+m)}$ with degree $\le m - \operatorname{rank} A$. Let $r = \operatorname{rank} A$. Up to a linear coordinate transformation, we can reduce $x$ to an $r$-dimensional variable. Without loss of generality, we can assume that $\operatorname{rank} A = n$ and $m \ge n$.

For a subset $I = \{i_1, \ldots, i_{m-n}\}$ of $[m]$, denote
\[ c_I(x) = \prod_{i\in I} c_i(x), \qquad E_I(x) = c_I(x)\cdot \mathrm{diag}\big(c_{i_1}(x)^{-1}, \ldots, c_{i_{m-n}}(x)^{-1}\big), \]
\[ D_I(x) = \mathrm{diag}\big(c_{i_1}(x), \ldots, c_{i_{m-n}}(x)\big), \qquad A_I = \begin{bmatrix} a_{i_1} & \cdots & a_{i_{m-n}} \end{bmatrix}^T. \]
For the case that $I = \emptyset$ (the empty set), we set $c_\emptyset(x) = 1$. Let
\[ V = \big\{ I \subseteq [m] \ : \ |I| = m-n, \ \operatorname{rank} A_{[m]\setminus I} = n \big\}. \]
Step I: For each $I \in V$, we construct a matrix polynomial $L_I(x)$ such that
\[ L_I(x) C(x) = c_I(x)\, I_m. \tag{4.5} \]

The matrix $L_I = L_I(x)$ satisfying (4.5) can be given by the following $2\times 3$ block matrix ($L_I(J, K)$ denotes the submatrix whose row indices are from $J$ and whose column indices are from $K$):
\[ \begin{array}{c|ccc}
J \backslash K & [n] & n+I & n+[m]\setminus I \\ \hline
I & 0 & E_I(x) & 0 \\
{[m]\setminus I} & c_I(x)\cdot\big(A_{[m]\setminus I}\big)^{-T} & -\big(A_{[m]\setminus I}\big)^{-T}\big(A_I\big)^T E_I(x) & 0
\end{array} \tag{4.6} \]
Equivalently, the blocks of $L_I$ are
\[ L_I\big(I, [n]\big) = 0, \qquad L_I\big(I, n+[m]\setminus I\big) = 0, \qquad L_I\big([m]\setminus I, n+[m]\setminus I\big) = 0, \]
\[ L_I\big(I, n+I\big) = E_I(x), \qquad L_I\big([m]\setminus I, [n]\big) = c_I(x)\,\big(A_{[m]\setminus I}\big)^{-T}, \]
\[ L_I\big([m]\setminus I, n+I\big) = -\big(A_{[m]\setminus I}\big)^{-T}\big(A_I\big)^T E_I(x). \]

For each $I \in V$, $A_{[m]\setminus I}$ is invertible. The superscript $-T$ denotes the inverse of the transpose. Let $G = L_I(x)C(x)$; then one can verify that
\[ G(I, I) = E_I(x)\, D_I(x) = c_I(x)\, I_{m-n}, \qquad G(I, [m]\setminus I) = 0, \]
\[ G([m]\setminus I, [m]\setminus I) = \begin{bmatrix} c_I(x)\big(A_{[m]\setminus I}\big)^{-T} & -\big(A_{[m]\setminus I}\big)^{-T} A_I^T E_I(x) \end{bmatrix} \begin{bmatrix} \big(A_{[m]\setminus I}\big)^T \\ 0 \end{bmatrix} = c_I(x)\, I_n, \]
\[ G([m]\setminus I, I) = \begin{bmatrix} c_I(x)\big(A_{[m]\setminus I}\big)^{-T} & -\big(A_{[m]\setminus I}\big)^{-T} A_I^T E_I(x) \end{bmatrix} \begin{bmatrix} A_I^T \\ D_I(x) \end{bmatrix} = 0. \]
This shows that the above $L_I(x)$ satisfies (4.5).


Step II: We show that there exist real scalars $\nu_I$ satisfying
\[ \sum_{I\in V} \nu_I\, c_I(x) = 1. \tag{4.7} \]
This can be shown by induction on $m$.

• When $m = n$: $V = \{\emptyset\}$ and $c_\emptyset(x) = 1$, so (4.7) is clearly true.

• When $m > n$: let
\[ N = \{ i \in [m] \ | \ \operatorname{rank} A_{[m]\setminus\{i\}} = n \}. \tag{4.8} \]
For each $i \in N$, let $V_i$ be the set of all $I' \subseteq [m]\setminus\{i\}$ such that $|I'| = m-n-1$ and $\operatorname{rank} A_{[m]\setminus(I'\cup\{i\})} = n$. For each $i \in N$, by the assumption, the linear function $A_{[m]\setminus\{i\}}\, x - b_{[m]\setminus\{i\}}$ is nonsingular. By induction, there exist real scalars $\nu^{(i)}_{I'}$ satisfying
\[ \sum_{I'\in V_i} \nu^{(i)}_{I'}\, c_{I'}(x) = 1. \tag{4.9} \]
Since $\operatorname{rank} A = n$, we can generally assume that $\{a_1, \ldots, a_n\}$ is linearly independent. So there exist scalars $\alpha_1, \ldots, \alpha_n$ such that
\[ a_m = \alpha_1 a_1 + \cdots + \alpha_n a_n. \]
If all $\alpha_i = 0$, then $a_m = 0$, and hence $A$ can be replaced by its first $m-1$ rows; so (4.7) is true by the induction. In the following, suppose at least one $\alpha_i \ne 0$, and write
\[ \{i : \alpha_i \ne 0\} = \{i_1, \ldots, i_k\}. \]
Then $a_{i_1}, \ldots, a_{i_k}, a_m$ are linearly dependent. For convenience, set $i_{k+1} = m$. Since $Ax - b$ is nonsingular, the linear system
\[ c_{i_1}(x) = \cdots = c_{i_k}(x) = c_{i_{k+1}}(x) = 0 \]
has no solutions. Hence, there exist real scalars $\mu_1, \ldots, \mu_{k+1}$ such that
\[ \mu_1 c_{i_1}(x) + \cdots + \mu_k c_{i_k}(x) + \mu_{k+1} c_{i_{k+1}}(x) = 1. \]
(This is implied by the echelon form of inconsistent linear systems.) Note that $i_1, \ldots, i_{k+1} \in N$. For each $j = 1, \ldots, k+1$, by (4.9),
\[ \sum_{I'\in V_{i_j}} \nu^{(i_j)}_{I'}\, c_{I'}(x) = 1. \]
Then we can get
\[ 1 = \sum_{j=1}^{k+1} \mu_j\, c_{i_j}(x) = \sum_{j=1}^{k+1} \mu_j \sum_{I'\in V_{i_j}} \nu^{(i_j)}_{I'}\, c_{i_j}(x)\, c_{I'}(x) = \sum_{\substack{I = I'\cup\{i_j\} \\ I'\in V_{i_j},\ 1 \le j \le k+1}} \nu^{(i_j)}_{I'}\, \mu_j\, c_I(x). \]
Since each $I'\cup\{i_j\} \in V$, (4.7) must be satisfied by some scalars $\nu_I$.


Step III: For $L_I(x)$ as in (4.5), we construct $L(x)$ as
\[ L(x) = \sum_{I\in V} \nu_I\, L_I(x). \tag{4.10} \]
Clearly, $L(x)$ satisfies (4.4), because
\[ L(x)C(x) = \sum_{I\in V} \nu_I\, L_I(x)C(x) = \sum_{I\in V} \nu_I\, c_I(x)\, I_m = I_m. \]
Each $L_I(x)$ has degree $\le m-n$, so $L(x)$ has degree $\le m-n$. $\square$

Proposition 4.1 characterizes when there exists $L(x)$ satisfying (4.4). When it does, a degree bound for $L(x)$ is $m - \operatorname{rank} A$. Sometimes, its degree can be smaller than that, as shown in Example 4.3. For given $A, b$, the matrix polynomial $L(x)$ satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of $L(x)C(x) = I_m$ for polyhedral sets.

Example 4.2. Consider the simplicial set
\[ \{x : \ x_1 \ge 0, \ \ldots, \ x_n \ge 0, \ 1 - e^T x \ge 0\}. \]
The equation $L(x)C(x) = I_{n+1}$ is satisfied by
\[ L(x) = \begin{bmatrix}
1-x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
-x_1 & 1-x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
-x_1 & -x_2 & \cdots & 1-x_n & 1 & \cdots & 1 \\
-x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1
\end{bmatrix}. \]
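The identity $L(x)C(x) = I_{n+1}$ of Example 4.2 can be verified symbolically. A sketch with SymPy for $n = 3$ (the block construction of $L$ mirrors the display above):

```python
import sympy as sp

n = 3
x = list(sp.symbols(f'x1:{n + 1}'))
c = x + [1 - sum(x)]                               # c_i = x_i, c_{n+1} = 1 - e^T x
A = sp.Matrix.vstack(sp.eye(n), -sp.ones(1, n))    # rows are the gradients a_i
C = sp.Matrix.vstack(A.T, sp.diag(*c))             # C(x) = [A^T; diag(c_i)]

xrow = sp.Matrix(1, n, x)
top = sp.eye(n) - sp.ones(n, 1) * xrow             # rows e_i^T - x^T
L = sp.Matrix.hstack(sp.Matrix.vstack(top, -xrow), sp.ones(n + 1, n + 1))

residual = (L * C - sp.eye(n + 1)).applyfunc(sp.expand)
print(residual == sp.zeros(n + 1, n + 1))          # True
```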

Example 4.3. Consider the box constraint
\[ \{x : \ x_1 \ge 0, \ldots, x_n \ge 0, \ 1-x_1 \ge 0, \ldots, 1-x_n \ge 0\}. \]
The equation $L(x)C(x) = I_{2n}$ is satisfied by
\[ L(x) = \begin{bmatrix} I_n - \mathrm{diag}(x) & I_n & I_n \\ -\mathrm{diag}(x) & I_n & I_n \end{bmatrix}. \]

Example 4.4. Consider the polyhedral set
\[ \{x : \ 1-x_4 \ge 0, \ x_4-x_3 \ge 0, \ x_3-x_2 \ge 0, \ x_2-x_1 \ge 0, \ x_1+1 \ge 0\}. \]
The equation $L(x)C(x) = I_5$ is satisfied by
\[ L(x) = \frac{1}{2}\begin{bmatrix}
-x_1-1 & -x_2-1 & -x_3-1 & -x_4-1 & 1 & 1 & 1 & 1 & 1 \\
-x_1-1 & -x_2-1 & -x_3-1 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\
-x_1-1 & -x_2-1 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\
-x_1-1 & 1-x_2 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\
1-x_1 & 1-x_2 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}. \]

Example 4.5. Consider the polyhedral set
\[ \{x : \ 1+x_1 \ge 0, \ 1-x_1 \ge 0, \ 2-x_1-x_2 \ge 0, \ 2-x_1+x_2 \ge 0\}. \]
The matrix $L(x)$ satisfying $L(x)C(x) = I_4$ is
\[ \frac{1}{6}\begin{bmatrix}
x_1^2-3x_1+2 & x_1x_2-x_2 & 4-x_1 & 2-x_1 & 1-x_1 & 1-x_1 \\
3x_1^2-3x_1-6 & 3x_2+3x_1x_2 & 6-3x_1 & -3x_1 & -3x_1-3 & -3x_1-3 \\
1-x_1^2 & -2x_2-x_1x_2-3 & x_1-1 & x_1+1 & x_1+2 & x_1+2 \\
1-x_1^2 & 3-x_1x_2-2x_2 & x_1-1 & x_1+1 & x_1+2 & x_1+2
\end{bmatrix}. \]


5. General constraints

We consider the general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers $\lambda_i$ as polynomial functions in $x$ on the set of critical points.

Suppose there are $m$ equality and inequality constraints in total, i.e.,
\[ \mathcal{E} \cup \mathcal{I} = \{1, \ldots, m\}. \]
If $(x, \lambda)$ is a critical pair, then $\lambda_i c_i(x) = 0$ for all $i \in \mathcal{E}\cup\mathcal{I}$. So the Lagrange multiplier vector $\lambda = [\lambda_1 \ \cdots \ \lambda_m]^T$ satisfies the equation
\[ \underbrace{\begin{bmatrix}
\nabla c_1(x) & \nabla c_2(x) & \cdots & \nabla c_m(x) \\
c_1(x) & 0 & \cdots & 0 \\
0 & c_2(x) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & c_m(x)
\end{bmatrix}}_{C(x)} \lambda = \begin{bmatrix} \nabla f(x) \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}. \tag{5.1} \]
Let $C(x)$ be as above. If there exists $L(x) \in \mathbb{R}[x]^{m\times(m+n)}$ such that
\[ L(x) C(x) = I_m, \tag{5.2} \]
then we can get
\[ \lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x)\nabla f(x), \tag{5.3} \]
where $L_1(x)$ consists of the first $n$ columns of $L(x)$. Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such $L(x)$ exists.

Definition 5.1. The tuple $c = (c_1, \ldots, c_m)$ of constraining polynomials is said to be nonsingular if $\operatorname{rank} C(u) = m$ for every $u \in \mathbb{C}^n$.

Clearly, $c$ being nonsingular is equivalent to the condition that, for each $u \in \mathbb{C}^n$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then the gradients $\nabla c_{i_1}(u), \ldots, \nabla c_{i_k}(u)$ are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple $c$ is nonsingular.

Proposition 5.2. (i) For $W(x) \in \mathbb{C}[x]^{s\times t}$ with $s \ge t$, $\operatorname{rank} W(u) = t$ for all $u \in \mathbb{C}^n$ if and only if there exists $P(x) \in \mathbb{C}[x]^{t\times s}$ such that
\[ P(x)W(x) = I_t. \]
Moreover, for $W(x) \in \mathbb{R}[x]^{s\times t}$, we can choose $P(x) \in \mathbb{R}[x]^{t\times s}$ in the above.
(ii) The constraining polynomial tuple $c$ is nonsingular if and only if there exists $L(x) \in \mathbb{R}[x]^{m\times(m+n)}$ satisfying (5.2).

Proof. (i) "$\Leftarrow$": If $P(x)W(x) = I_t$, then for all $u \in \mathbb{C}^n$,
\[ t = \operatorname{rank} I_t \le \operatorname{rank} W(u) \le t. \]
So $W(x)$ must have full column rank everywhere.

"$\Rightarrow$": Suppose $\operatorname{rank} W(u) = t$ for all $u \in \mathbb{C}^n$. Write $W(x)$ in columns:
\[ W(x) = \begin{bmatrix} w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix}. \]


Then the equation $w_1(x) = 0$ does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists $\xi_1(x) \in \mathbb{C}[x]^s$ such that $\xi_1(x)^T w_1(x) = 1$. For each $i = 2, \ldots, t$, denote
\[ r_{1i}(x) = \xi_1(x)^T w_i(x); \]
then (we use $\sim$ to denote row equivalence between matrices)
\[ W(x) \sim \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix} \sim W_1(x) = \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ 0 & w_2^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix}, \]
where each ($i = 2, \ldots, t$)
\[ w_i^{(1)}(x) = w_i(x) - r_{1i}(x)\, w_1(x). \]
So there exists $P_1(x) \in \mathbb{C}[x]^{(s+1)\times s}$ such that
\[ P_1(x)W(x) = W_1(x). \]

Since $W(x)$ and $W_1(x)$ are row equivalent, $W_1(x)$ must also have full column rank everywhere. Similarly, the polynomial equation
\[ w_2^{(1)}(x) = 0 \]
does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists $\xi_2(x) \in \mathbb{C}[x]^s$ such that
\[ \xi_2(x)^T w_2^{(1)}(x) = 1. \]
For each $i = 3, \ldots, t$, let $r_{2i}(x) = \xi_2(x)^T w_i^{(1)}(x)$; then
\[ W_1(x) \sim \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & w_2^{(1)}(x) & w_3^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix} \sim W_2(x) = \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & 0 & w_3^{(2)}(x) & \cdots & w_t^{(2)}(x) \end{bmatrix}, \]
where each ($i = 3, \ldots, t$)
\[ w_i^{(2)}(x) = w_i^{(1)}(x) - r_{2i}(x)\, w_2^{(1)}(x). \]
Similarly, $W_1(x)$ and $W_2(x)$ are row equivalent, so $W_2(x)$ has full column rank everywhere. There exists $P_2(x) \in \mathbb{C}[x]^{(s+2)\times(s+1)}$ such that
\[ P_2(x)W_1(x) = W_2(x). \]
Continuing this process, we can finally get
\[ W_2(x) \sim \cdots \sim W_t(x) = \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & 1 & \cdots & r_{3t}(x) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}. \]
Consequently, there exist $P_i(x) \in \mathbb{C}[x]^{(s+i)\times(s+i-1)}$, for $i = 1, 2, \ldots, t$, such that
\[ P_t(x)P_{t-1}(x)\cdots P_1(x)\, W(x) = W_t(x). \]


Since $W_t(x)$ is a unit upper triangular matrix polynomial, there exists $P_{t+1}(x) \in \mathbb{C}[x]^{t\times(s+t)}$ such that $P_{t+1}(x)W_t(x) = I_t$. Let
\[ P(x) = P_{t+1}(x)P_t(x)P_{t-1}(x)\cdots P_1(x); \]
then $P(x)W(x) = I_t$. Note that $P(x) \in \mathbb{C}[x]^{t\times s}$.

For $W(x) \in \mathbb{R}[x]^{s\times t}$, we can replace $P(x)$ by $\big(P(x) + \overline{P(x)}\big)/2$ (where $\overline{P(x)}$ denotes the complex conjugate of $P(x)$), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). $\square$

In Proposition 5.2, there is no explicit degree bound for $L(x)$ satisfying (5.2); this question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for $L(x)$, the matrix $L(x)$ can be determined by comparing coefficients of both sides of (5.2); this amounts to solving a linear system. In the following, we give some examples of $L(x)$ satisfying (5.2).
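To illustrate the coefficient-matching approach, the following sketch (not from the paper) computes $L(x)$ for the single constraint $c = 1 - x_1^3 - x_2^4$ (the first constraint of Example 5.5), using a degree-1 ansatz for the entries of $L(x)$; the ansatz degree is an assumption of this sketch:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
c = 1 - x1**3 - x2**4                                      # single constraint
C = sp.Matrix(3, 1, [sp.diff(c, x1), sp.diff(c, x2), c])   # C(x) = [grad c; c]

# ansatz: L(x) is 1 x 3 with entries of degree <= 1 and unknown coefficients
a = sp.symbols('a0:9')
monos = [sp.Integer(1), x1, x2]
L = sp.Matrix(1, 3, [sum(a[3 * i + j] * monos[j] for j in range(3))
                     for i in range(3)])

# match coefficients of L(x) C(x) - 1 = 0 and solve the resulting linear system
residual = sp.expand((L * C)[0, 0] - 1)
eqs = sp.Poly(residual, x1, x2).coeffs()
sol = sp.solve(eqs, a, dict=True)[0]
print(L.subs(sol))   # Matrix([[-x1/3, -x2/4, 1]])
```

The recovered row matches the first row of $L(x)$ in Example 5.5 below.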

Example 5.3. Consider the hypercube with quadratic constraints
\[ 1 - x_1^2 \ge 0, \ 1 - x_2^2 \ge 0, \ \ldots, \ 1 - x_n^2 \ge 0. \]
The equation $L(x)C(x) = I_n$ is satisfied by
\[ L(x) = \begin{bmatrix} -\tfrac{1}{2}\mathrm{diag}(x) & I_n \end{bmatrix}. \]

Example 5.4. Consider the nonnegative portion of the unit sphere:
\[ x_1 \ge 0, \ x_2 \ge 0, \ \ldots, \ x_n \ge 0, \qquad x_1^2 + \cdots + x_n^2 - 1 = 0. \]
The equation $L(x)C(x) = I_{n+1}$ is satisfied by
\[ L(x) = \begin{bmatrix} I_n - xx^T & x\mathbf{1}_n^T & 2x \\ \tfrac{1}{2}x^T & -\tfrac{1}{2}\mathbf{1}_n^T & -1 \end{bmatrix}. \]

Example 5.5. Consider the set
\[ 1 - x_1^3 - x_2^4 \ge 0, \qquad 1 - x_3^4 - x_4^3 \ge 0. \]
The equation $L(x)C(x) = I_2$ is satisfied by
\[ L(x) = \begin{bmatrix} -\frac{x_1}{3} & -\frac{x_2}{4} & 0 & 0 & 1 & 0 \\ 0 & 0 & -\frac{x_3}{4} & -\frac{x_4}{3} & 0 & 1 \end{bmatrix}. \]

Example 5.6. Consider the quadratic set
\[ 1 - x_1x_2 - x_2x_3 - x_1x_3 \ge 0, \qquad 1 - x_1^2 - x_2^2 - x_3^2 \ge 0. \]
The matrix $L(x)^T$ satisfying $L(x)C(x) = I_2$ is
\[ \begin{bmatrix}
25x_1^3 + 10x_1^2x_2 + 40x_1x_2^2 - 25x_1 - 2x_3 & -25x_1^3 - 10x_1^2x_2 - 40x_1x_2^2 + \frac{49x_1}{2} + 2x_3 \\
-15x_1^2x_2 + 10x_1x_2^2 + 20x_1x_2x_3 - 10x_1 & 15x_1^2x_2 - 10x_1x_2^2 - 20x_1x_2x_3 + 10x_1 - \frac{x_2}{2} \\
25x_1^2x_3 - 20x_1x_2^2 + 10x_1x_2x_3 + 2x_1 & -25x_1^2x_3 + 20x_1x_2^2 - 10x_1x_2x_3 - 2x_1 - \frac{x_3}{2} \\
1 - 20x_1x_3 - 10x_1^2 - 20x_1x_2 & 20x_1x_2 + 20x_1x_3 + 10x_1^2 \\
-50x_1^2 - 20x_1x_2 & 50x_1^2 + 20x_1x_2 + 1
\end{bmatrix}. \]

We would like to remark that a polynomial tuple $c = (c_1, \ldots, c_m)$ is generically nonsingular, and that Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees $d_1, \ldots, d_m$, there exists an open dense subset $U$ of $D = \mathbb{R}[x]_{d_1} \times \cdots \times \mathbb{R}[x]_{d_m}$ such that every tuple $c = (c_1, \ldots, c_m) \in U$ is nonsingular. Indeed, such a $U$ can be chosen as a Zariski open subset of $D$, i.e., it is the complement of a proper real variety of $D$. Moreover, Assumption 3.1 holds for all $c \in U$, i.e., it holds generically.


Proof. The proof uses resultants and discriminants, for which we refer to [29].

First, let $J_1$ be the set of all $(i_1, \ldots, i_{n+1})$ with $1 \le i_1 < \cdots < i_{n+1} \le m$. The resultant $\mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}})$ [29, Section 2] is a polynomial in the coefficients of $c_{i_1}, \ldots, c_{i_{n+1}}$ such that, if $\mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}}) \ne 0$, then the equations
\[ c_{i_1}(x) = \cdots = c_{i_{n+1}}(x) = 0 \]
have no complex solutions. Define
\[ F_1(c) = \prod_{(i_1,\ldots,i_{n+1})\in J_1} \mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}}). \]
For the case $m \le n$, $J_1 = \emptyset$ and we simply let $F_1(c) = 1$. Clearly, if $F_1(c) \ne 0$, then no more than $n$ of the polynomials $c_1, \ldots, c_m$ have a common complex zero.

Second, let $J_2$ be the set of all $(j_1, \ldots, j_k)$ with $k \le n$ and $1 \le j_1 < \cdots < j_k \le m$. When one of $c_{j_1}, \ldots, c_{j_k}$ has degree bigger than one, the discriminant $\Delta(c_{j_1}, \ldots, c_{j_k})$ is a polynomial in the coefficients of $c_{j_1}, \ldots, c_{j_k}$ such that, if $\Delta(c_{j_1}, \ldots, c_{j_k}) \ne 0$, then the equations
\[ c_{j_1}(x) = \cdots = c_{j_k}(x) = 0 \]
have no singular complex solution [29, Section 3], i.e., at every common complex solution $u$, the gradients of $c_{j_1}, \ldots, c_{j_k}$ at $u$ are linearly independent. When all of $c_{j_1}, \ldots, c_{j_k}$ have degree one, the discriminant of the tuple $(c_{j_1}, \ldots, c_{j_k})$ is not a single polynomial, but we can define $\Delta(c_{j_1}, \ldots, c_{j_k})$ to be the product of all maximal minors of its Jacobian (a constant matrix). Define
\[ F_2(c) = \prod_{(j_1,\ldots,j_k)\in J_2} \Delta(c_{j_1}, \ldots, c_{j_k}). \]
Clearly, if $F_2(c) \ne 0$, then no $n$ or fewer of the polynomials $c_1, \ldots, c_m$ have a singular complex common zero.

Last, let $F(c) = F_1(c)F_2(c)$ and
\[ U = \{c = (c_1, \ldots, c_m) \in D : \ F(c) \ne 0\}. \]
Note that $U$ is a Zariski open subset of $D$, and it is open dense in $D$. For all $c \in U$, no more than $n$ of $c_1, \ldots, c_m$ can have a common complex zero, and for any $k$ ($k \le n$) polynomials of $c_1, \ldots, c_m$, if they have a common complex zero, say $u$, then their gradients at $u$ must be linearly independent. This means that $c$ is a nonsingular tuple.

Since every $c \in U$ is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all $c \in U$, so it holds generically. $\square$

6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9), with Lagrange multiplier expressions, for solving the optimization problem (1.1). Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0G of RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.


The polynomials $p_i$ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as $c_1, \ldots, c_m$. First, find a matrix polynomial $L(x)$ satisfying (4.4) or (5.2). Let $L_1(x)$ be the submatrix of $L(x)$ consisting of its first $n$ columns. Then choose $(p_1, \ldots, p_m)$ to be the product $L_1(x)\nabla f(x)$, i.e.,
\[ p_i = \big(L_1(x)\nabla f(x)\big)_i. \]
In all our examples, the global minimum value $f_{min}$ of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., the sub-level set $\{f(x) \le \ell\}$ is compact for all $\ell$) and the constraint qualification condition holds.

By Theorem 3.3, we have $f_k = f_{min}$ for all $k$ big enough, if $f_c = f_{min}$ and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions; however, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient to use: when there are finitely many global minimizers, Theorem 3.4 shows that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (which have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, along with the computational time (in seconds). The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
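To illustrate the construction, the following sketch (not the paper's code) computes the multiplier expressions $p_i = (L_1(x)\nabla f(x))_i$ for the simplex constraints used in Example 6.1, and evaluates them at a KKT point; the particular point $u$ is an assumption chosen here for illustration:

```python
import sympy as sp

xs = sp.symbols('x1:4')
x1, x2, x3 = xs
f = x1 * x2 * (10 - x3)
gradf = sp.Matrix([sp.diff(f, v) for v in xs])

# L_1(x): the first n columns of L(x) from Example 4.2 (simplex, n = 3)
xrow = sp.Matrix(1, 3, xs)
L1 = sp.Matrix.vstack(sp.eye(3) - sp.ones(3, 1) * xrow, -xrow)

p = (L1 * gradf).applyfunc(sp.expand)    # polynomial expressions for lam_1..lam_4

# evaluate at the KKT point u = (0, 1/2, 1/2), where c_1 = x_1 and
# c_4 = 1 - x_1 - x_2 - x_3 are active (u is an illustrative choice)
u = {x1: 0, x2: sp.Rational(1, 2), x3: sp.Rational(1, 2)}
print([pi.subs(u) for pi in p])          # [19/4, 0, 0, 0]
```

The evaluated values agree with the Lagrange multipliers at $u$: only the active constraint $x_1 \ge 0$ carries a nonzero multiplier there.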

Example 6.1. Consider the optimization problem
\[ \begin{array}{ll} \min & x_1x_2(10 - x_3) \\ \text{s.t.} & x_1 \ge 0, \ x_2 \ge 0, \ x_3 \ge 0, \ 1 - x_1 - x_2 - x_3 \ge 0. \end{array} \]
The matrix polynomial $L(x)$ is given in Example 4.2. Since the feasible set is compact, the minimum $f_{min} = 0$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied². Each feasible point $(x_1, x_2, x_3)$ with $x_1x_2 = 0$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

order k | w/o LME: lower bound, time | with LME: lower bound, time
   2    |  -0.0521,    0.6841        |  -0.0521,    0.1922
   3    |  -0.0026,    0.2657        |  -3·10^-8,   0.2285
   4    |  -0.0007,    0.6785        |  -6·10^-9,   0.4431
   5    |  -0.0004,    1.6105        |  -2·10^-9,   0.9567

² Note that $1 - x^Tx = (1 - e^Tx)(1 + x^Tx) + \sum_{i=1}^{n} x_i(1-x_i)^2 + \sum_{i\ne j} x_i^2x_j \in IQ(c_{eq}, c_{in})$.


Example 6.2. Consider the optimization problem
\[ \begin{array}{ll} \min & x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 - 3x_1^2x_2^2x_3^2 + (x_1^4 + x_2^4 + x_3^4) \\ \text{s.t.} & x_1^2 + x_2^2 + x_3^2 \ge 1. \end{array} \]
The matrix polynomial $L(x) = \begin{bmatrix} \tfrac{1}{2}x_1 & \tfrac{1}{2}x_2 & \tfrac{1}{2}x_3 & -1 \end{bmatrix}$. The objective $f$ is the sum of the positive definite form $x_1^4 + x_2^4 + x_3^4$ and the Motzkin polynomial
\[ M(x) = x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 - 3x_1^2x_2^2x_3^2. \]
Note that $M(x)$ is nonnegative everywhere but not SOS [37]. Clearly, $f$ is coercive, and $f_{min}$ is achieved at a critical point. The set $IQ(\phi, \psi)$ is archimedean, because
\[ c_1(x)p_1(x) = (x_1^2+x_2^2+x_3^2-1)\big(3M(x) + 2(x_1^4+x_2^4+x_3^4)\big) = 0 \]
defines a compact set. So the condition i) of Theorem 3.3 is satisfied³. The minimum value $f_{min} = \frac{1}{3}$, and there are 8 minimizers $\big(\pm\frac{1}{\sqrt{3}}, \pm\frac{1}{\sqrt{3}}, \pm\frac{1}{\sqrt{3}}\big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

order k | w/o LME: lower bound, time | with LME: lower bound, time
   3    |  -∞,             0.4466    |  0.1111,  0.1169
   4    |  -∞,             0.4948    |  0.3333,  0.3499
   5    |  -2.1821·10^5,   1.1836    |  0.3333,  0.6530

Example 6.3. Consider the optimization problem
\[ \begin{array}{ll} \min & x_1x_2 + x_2x_3 + x_3x_4 - 3x_1x_2x_3x_4 + (x_1^3 + \cdots + x_4^3) \\ \text{s.t.} & x_1, x_2, x_3, x_4 \ge 0, \ 1 - x_1 - x_2 \ge 0, \ 1 - x_3 - x_4 \ge 0. \end{array} \]
The matrix polynomial $L(x)$ is
\[ \begin{bmatrix}
1-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
-x_1 & 1-x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1-x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & -x_3 & 1-x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}. \]
The feasible set is compact, so $f_{min}$ is achieved at a critical point. One can show that $f_{min} = 0$ and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because $IQ(c_{eq}, c_{in})$ is archimedean⁴. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because $-c_1^2 p_1^2 \in \mathrm{Ideal}(\phi) \subseteq IQ(\phi, \psi)$ and the set $\{-c_1(x)^2 p_1(x)^2 \ge 0\}$ is compact.

⁴ This is because $1 - x_1^2 - x_2^2$ belongs to the quadratic module of $(x_1, x_2, 1-x_1-x_2)$, and $1 - x_3^2 - x_4^2$ belongs to the quadratic module of $(x_3, x_4, 1-x_3-x_4)$. See the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

order k | w/o LME: lower bound, time | with LME: lower bound, time
   3    |  -2.9·10^-5,   0.7335      |  -6·10^-7,   0.6091
   4    |  -1.4·10^-5,   2.5055      |  -8·10^-8,   2.7423
   5    |  -1.4·10^-5,  12.7092      |  -5·10^-8,  13.7449

Example 6.4. Consider the polynomial optimization problem
\[ \begin{array}{ll} \min\limits_{x\in\mathbb{R}^2} & x_1^2 + 50x_2^2 \\ \text{s.t.} & x_1^2 - \tfrac{1}{2} \ge 0, \ x_2^2 - 2x_1x_2 - \tfrac{1}{8} \ge 0, \ x_2^2 + 2x_1x_2 - \tfrac{1}{8} \ge 0. \end{array} \]
It is motivated by an example in [12, §3]. The first column of $L(x)$ is
\[ \begin{bmatrix} \frac{8x_1^3}{5} + \frac{x_1}{5} \\[4pt] \frac{288x_2x_1^4}{5} - \frac{16x_1^3}{5} - \frac{124x_2x_1^2}{5} + \frac{8x_1}{5} - 2x_2 \\[4pt] -\frac{288x_2x_1^4}{5} - \frac{16x_1^3}{5} + \frac{124x_2x_1^2}{5} + \frac{8x_1}{5} + 2x_2 \end{bmatrix}, \]
and the second column of $L(x)$ is
\[ \begin{bmatrix} -\frac{8x_1^2x_2}{5} + \frac{4x_2^3}{5} - \frac{x_2}{10} \\[4pt] \frac{288x_1^3x_2^2}{5} + \frac{16x_1^2x_2}{5} - \frac{142x_1x_2^2}{5} - \frac{9x_1}{20} - \frac{8x_2^3}{5} + \frac{11x_2}{5} \\[4pt] -\frac{288x_1^3x_2^2}{5} + \frac{16x_1^2x_2}{5} + \frac{142x_1x_2^2}{5} + \frac{9x_1}{20} - \frac{8x_2^3}{5} + \frac{11x_2}{5} \end{bmatrix}. \]
The objective is coercive, so $f_{min}$ is achieved at a critical point. The minimum value $f_{min} = 56 + \frac{3}{4} + 25\sqrt{5} \approx 112.6517$, and the minimizers are $\big(\pm\sqrt{1/2},\ \pm(\sqrt{5/8} + \sqrt{1/2})\big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

order k | w/o LME: lower bound, time | with LME: lower bound, time
   3    |   6.7535,  0.4611          |   56.7500,  0.1309
   4    |   6.9294,  0.2428          |  112.6517,  0.2405
   5    |   8.8519,  0.3376          |  112.6517,  0.2167
   6    |  16.5971,  0.4703          |  112.6517,  0.3788
   7    |  35.4756,  0.6536          |  112.6517,  0.4537

Example 6.5. Consider the optimization problem
\[ \begin{array}{ll} \min\limits_{x\in\mathbb{R}^3} & x_1^3 + x_2^3 + x_3^3 + 4x_1x_2x_3 - \big(x_1(x_2^2+x_3^2) + x_2(x_3^2+x_1^2) + x_3(x_1^2+x_2^2)\big) \\ \text{s.t.} & x_1 \ge 0, \ x_1x_2 - 1 \ge 0, \ x_2x_3 - 1 \ge 0. \end{array} \]
The matrix polynomial $L(x)$ is
\[ \begin{bmatrix} 1-x_1x_2 & 0 & 0 & x_2 & x_2 & 0 \\ x_1 & 0 & 0 & -1 & -1 & 0 \\ -x_1 & x_2 & 0 & 1 & 0 & -1 \end{bmatrix}. \]
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant $\mathbb{R}^3_+$, so the minimum value is achieved at a critical point. In computation, we got $f_{min} \approx 0.9492$ and a global minimizer $(0.9071, 1.1024, 0.9071)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

order k | w/o LME: lower bound, time | with LME: lower bound, time
   2    |  -∞,             0.4129    |  -∞,      0.1900
   3    |  -7.8184·10^6,   0.4641    |  0.9492,  0.3139
   4    |  -2.0575·10^4,   0.6499    |  0.9492,  0.5057

Example 6.6. Consider the optimization problem ($x_0 = 1$):
\[ \begin{array}{ll} \min\limits_{x\in\mathbb{R}^4} & x^Tx + \sum_{i=0}^{4} \prod_{j\ne i} (x_i - x_j) \\ \text{s.t.} & x_1^2 - 1 \ge 0, \ x_2^2 - 1 \ge 0, \ x_3^2 - 1 \ge 0, \ x_4^2 - 1 \ge 0. \end{array} \]
The matrix polynomial $L(x) = \begin{bmatrix} \tfrac{1}{2}\mathrm{diag}(x) & -I_4 \end{bmatrix}$. The first part of the objective is $x^Tx$, while the second part is a nonnegative polynomial [37]. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} = 4.0000$ and 11 global minimizers:
\[ (1,1,1,1), \ (1,-1,-1,1), \ (1,-1,1,-1), \ (1,1,-1,-1), \]
\[ (1,-1,-1,-1), \ (-1,-1,1,1), \ (-1,1,-1,1), \ (-1,1,1,-1), \]
\[ (-1,-1,-1,1), \ (-1,-1,1,-1), \ (-1,1,-1,-1). \]
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

order k | w/o LME: lower bound, time | with LME: lower bound, time
   3    |  -∞,              1.1377   |  3.5480,   1.1765
   4    |  -6.6913·10^4,    4.7677   |  4.0000,   3.0761
   5    |  -21.3778,       22.9970   |  4.0000,  10.3354

Example 6.7. Consider the optimization problem
\[ \begin{array}{ll} \min\limits_{x\in\mathbb{R}^3} & x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 - 3x_1^2x_2^2x_3^2 + x_2^2 \\ \text{s.t.} & x_1 - x_2x_3 \ge 0, \ -x_2 + x_3^2 \ge 0. \end{array} \]
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}$. By the arithmetic-geometric mean inequality, one can show that $f_{min} = 0$. The global minimizers are $(x_1, 0, x_3)$ with $x_1 \ge 0$ and $x_1x_3 = 0$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that $f_k = f_{min}$ for all $k \ge 5$, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

order k | w/o LME: lower bound, time | with LME: lower bound, time
   3    |  -∞,              0.6144   |  -∞,          0.3418
   4    |  -1.0909·10^7,    1.0542   |  -3.9476,     0.7180
   5    |  -942.6772,       1.6771   |  -3·10^-9,    1.4607
   6    |  -0.0110,         3.3532   |  -8·10^-10,   3.1618

Example 6.8. Consider the optimization problem
\[ \begin{array}{ll} \min\limits_{x\in\mathbb{R}^4} & x_1^2(x_1-x_4)^2 + x_2^2(x_2-x_4)^2 + x_3^2(x_3-x_4)^2 \\ & \quad + 2x_1x_2x_3(x_1+x_2+x_3-2x_4) + (x_1-1)^2 + (x_2-1)^2 + (x_3-1)^2 \\ \text{s.t.} & x_1 - x_2 \ge 0, \ x_2 - x_3 \ge 0. \end{array} \]
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}$. In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} \approx 0.9413$ and a minimizer
\[ (0.5632, \ 0.5632, \ 0.5632, \ 0.7510). \]
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

order k | w/o LME: lower bound, time | with LME: lower bound, time
   2    |  -∞,              0.3984   |  -0.3360,   0.9321
   3    |  -∞,              0.7634   |   0.9413,   0.5240
   4    |  -6.4896·10^5,    4.5496   |   0.9413,   1.7192
   5    |  -3.1645·10^3,   24.3665   |   0.9413,   8.1228

Example 6.9. Consider the optimization problem
$$\min_{x\in\mathbb{R}^4} \quad (x_1+x_2+x_3+x_4+1)^2 - 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1)$$
$$\text{s.t.} \quad 0 \le x_1, \dots, x_4 \le 1.$$
The matrix $L(x)$ is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so $f_{\min}$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.${}^5$ The minimum value $f_{\min} = 0$. For each $t \in [0, 1]$, the point $(t, 0, 0, 1-t)$ is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are shown in Table 9.

${}^5$ Note that $4 - \sum_{i=1}^4 x_i^2 = \sum_{i=1}^4 \big( x_i(1-x_i)^2 + (1-x_i)(1+x_i^2) \big) \in IQ(c_{eq}, c_{in})$.
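The footnote identity is easy to verify symbolically; a quick check with sympy:

```python
import sympy as sp

x = sp.symbols('x1:5')
lhs = 4 - sum(xi**2 for xi in x)
# the summands x_i*(1 - x_i)^2 + (1 - x_i)*(1 + x_i^2) from the footnote
rhs = sum(xi*(1 - xi)**2 + (1 - xi)*(1 + xi**2) for xi in x)
print(sp.expand(lhs - rhs))  # 0
```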


Table 9. Computational results for Example 6.9

order k   w/o LME: lower bound   time      with LME: lower bound   time
2         $-0.0279$              0.2262    $-5\cdot 10^{-6}$       1.1835
3         $-0.0005$              0.4691    $-6\cdot 10^{-7}$       1.6566
4         $-0.0001$              3.1098    $-2\cdot 10^{-7}$       5.5234
5         $-4\cdot 10^{-5}$      16.5092   $-6\cdot 10^{-7}$       19.7320

For some polynomial optimization problems, standard Lasserre relaxations may converge fast; e.g., the lowest order relaxation is often tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem ($x_0 = 1$)
$$\min_{x\in\mathbb{R}^n} \quad \sum_{0\le i\le j\le k\le n} c_{ijk}\, x_i x_j x_k \qquad \text{s.t.} \quad 0 \le x \le 1,$$
where each coefficient $c_{ijk}$ is randomly generated (by randn in MATLAB). The matrix $L(x)$ is the same as in Example 4.3. Since the feasible set is compact, we always have $f_c = f_{\min}$. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order $k = 2$, which is also the case for the new relaxations (3.8)-(3.9). Here we compare the computational time consumed by standard Lasserre's relaxations and (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each $n$ in the table, we generated 10 random instances and show the average of the consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order $k = 2$, while their time differs a bit. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

n          9        10       11       12        13        14
w/o LME    1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
with LME   1.9714   3.8288   8.2519   20.0310   37.6373   82.4778
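A sketch of how such random cubic objectives can be generated in Python (the enumeration of index triples mirrors the MATLAB randn description; details are an assumption):

```python
import itertools
import numpy as np

def random_cubic(n, seed=0):
    # one coefficient c_ijk per monomial x_i*x_j*x_k, 0 <= i <= j <= k <= n,
    # with the convention x_0 = 1 so lower-degree terms are included
    rng = np.random.default_rng(seed)
    triples = list(itertools.combinations_with_replacement(range(n + 1), 3))
    coeffs = rng.standard_normal(len(triples))
    def f(x):
        z = np.concatenate(([1.0], np.asarray(x, dtype=float)))
        return sum(c * z[i] * z[j] * z[k]
                   for c, (i, j, k) in zip(coeffs, triples))
    return f

f = random_cubic(9)
print(np.isfinite(f(np.full(9, 0.5))))  # True
```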

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value $f_{\min}$ is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that
$$IQ(c_{eq}, c_{in})_{2k} + IQ(\phi, \psi)_{2k} = Ideal(c_{eq}, \phi)_{2k} + Qmod(c_{in}, \psi)_{2k}.$$
If we replace the quadratic module $Qmod(c_{in}, \psi)$ by the preordering of $(c_{in}, \psi)$ [21, 25], we can get further tighter relaxations. For convenience, write $(c_{in}, \psi)$ as a single tuple $(g_1, \dots, g_\ell)$. Its preordering is the set
$$Preord(c_{in}, \psi) = \sum_{r_1, \dots, r_\ell \in \{0,1\}} g_1^{r_1} \cdots g_\ell^{r_\ell} \cdot \Sigma[x].$$
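The $2^\ell$ products indexing the preordering can be enumerated directly; a small sympy sketch with a hypothetical tuple $(g_1, g_2)$:

```python
import itertools
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
g = [1 - x1**2, 1 - x2**2]  # placeholder constraints standing in for (c_in, psi)

# all products g_1^{r_1} * ... * g_ell^{r_ell} with r_i in {0, 1};
# the preordering is the set of sums of these products times SOS polynomials
products = [sp.prod([gi**ri for gi, ri in zip(g, r)])
            for r in itertools.product([0, 1], repeat=len(g))]
print(len(products) == 2**len(g))  # True
```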


The truncation $Preord(c_{in}, \psi)_{2k}$ is defined similarly to $Qmod(c_{in}, \psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same order $k$, is
$$(7.1) \qquad \begin{array}{rl} f'_{pre,k} := \min & \langle f, y\rangle \\ \text{s.t.} & \langle 1, y\rangle = 1, \quad L^{(k)}_{c_{eq}}(y) = 0, \quad L^{(k)}_{\phi}(y) = 0, \\ & L^{(k)}_{g_1^{r_1}\cdots g_\ell^{r_\ell}}(y) \succeq 0 \ \ \forall\, r_1, \dots, r_\ell \in \{0,1\}, \quad y \in \mathbb{R}^{\mathbb{N}^n_{2k}}. \end{array}$$
Similar to (3.9), the dual optimization problem of the above is
$$(7.2) \qquad \begin{array}{rl} f_{pre,k} := \max & \gamma \\ \text{s.t.} & f - \gamma \in Ideal(c_{eq}, \phi)_{2k} + Preord(c_{in}, \psi)_{2k}. \end{array}$$
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose $K_c \ne \emptyset$ and Assumption 3.1 holds. Then
$$f_{pre,k} = f'_{pre,k} = f_c$$
for all $k$ sufficiently large. Therefore, if the minimum value $f_{\min}$ of (1.1) is achieved at a critical point, then $f_{pre,k} = f'_{pre,k} = f_{\min}$ for all $k$ big enough.

Proof. The proof is very similar to that of Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have $\hat{f}(x) \equiv 0$ on the set
$$K_3 = \{x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ c_{in}(x) \ge 0, \ \psi(x) \ge 0\}.$$
By the Positivstellensatz, there exist an integer $\ell > 0$ and $q \in Preord(c_{in}, \psi)$ such that $\hat{f}^{2\ell} + q \in Ideal(c_{eq}, \phi)$. The rest of the proof is the same. $\square$

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple $c$ of constraining polynomials is nonsingular, then there exists a matrix polynomial $L(x)$ such that $L(x)C(x) = I_m$. Hence, the Lagrange multiplier $\lambda$ can be expressed as in (5.3). However, if $c$ is singular, then such $L(x)$ does not exist. For such cases, how can we express $\lambda$ in terms of $x$ for the critical pairs $(x, \lambda)$? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for $L(x)$. For a nonsingular tuple $c$ of constraining polynomials, what is a good degree bound for $L(x)$ in Proposition 5.2? When $c$ is linear, a degree bound is given in Proposition 4.1. However, for nonlinear $c$, an explicit degree bound is not known. Theoretically, we can get a degree bound for $L(x)$: in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used $t$ times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is applied at each usage, a degree bound for $L(x)$ can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for $L(x)$.


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector $\lambda$ is determined by a linear equation. Naturally, one can get
$$\lambda = \big(C(x)^T C(x)\big)^{-1} C(x)^T \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}$$
when $C(x)$ has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, $\lambda$ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
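The rational representation above is straightforward to compute symbolically; a sketch for a hypothetical instance with one ball constraint (not an example from the paper), which also exhibits the high-degree denominator:

```python
import sympy as sp

x = sp.Matrix(sp.symbols('x1 x2'))
f = x[0] + x[1]
c = sp.Matrix([1 - x[0]**2 - x[1]**2])   # a single sample constraint
grad_f = sp.Matrix([f]).jacobian(x).T
# C(x) stacks the constraint gradients over diag(c(x)), as in (5.1)
C = sp.Matrix.vstack(c.jacobian(x).T, sp.diag(*c))
rhs = sp.Matrix.vstack(grad_f, sp.zeros(1, 1))
# lambda = (C^T C)^{-1} C^T [grad f; 0] is a rational function of x
lam = sp.simplify(((C.T * C).inv() * C.T * rhs)[0])
print(sp.denom(sp.together(lam)))  # a degree-4 denominator
```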

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83-111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties, and algorithms: an introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189-226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189-200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824-832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503-523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761-779.
[14] D. Henrion and J. B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, pp. 293-310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93-109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963-975.
[17] J. B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796-817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607-647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864-885.
[20] J. B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995-2014.
[21] J. B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J. B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J. B. Lasserre, K. C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1-26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157-270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587-606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697-718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167-191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225-255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634-1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485-510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97-121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247-274.
[35] P. A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83-99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969-984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251-272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920-942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271-324.
[40] J. F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625-653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H. H. Vui and P. T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941-951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093

E-mail address: njw@math.ucsd.edu


Each $f_{m_t+j} a_t^2 \in I$, since $f_{m_t+j} \in I_t$. So there exists $k_2 > 0$ such that, for all $\epsilon > 0$,
$$\big(f + \epsilon - \sigma_t(\epsilon)\big) a_t^2 \in I_{2k_2}.$$
Since $1 - a_{t-1}^2 - \cdots - a_N^2 \in I$, there also exists $k_3 > 0$ such that, for all $\epsilon > 0$,
$$(f + \epsilon)\big(1 - a_{t-1}^2 - \cdots - a_N^2\big) \in I_{2k_3}.$$
Hence, if $k^* \ge \max\{k_1, k_2, k_3\}$, then we have
$$f(x) + \epsilon - \sigma_\epsilon \in I_{2k^*}$$
for all $\epsilon > 0$. By the construction, the degrees of all $\sigma_i$ and $a_i$ are independent of $\epsilon$. So $\sigma_\epsilon \in Qmod(c_{in}, \psi)_{2k^*}$ for all $\epsilon > 0$, if $k^*$ is big enough. Note that
$$I_{2k^*} + Qmod(c_{in}, \psi)_{2k^*} = IQ(c_{eq}, c_{in})_{2k^*} + IQ(\phi, \psi)_{2k^*}.$$
This implies that $f_{k^*} \ge f_c - \epsilon$ for all $\epsilon > 0$. On the other hand, we always have $f_{k^*} \le f_c$, so $f_{k^*} = f_c$. Moreover, since $f_k$ is monotonically increasing, we must have $f_k = f_c$ for all $k \ge k^*$.

Case II: Assume $IQ(c_{eq}, c_{in})$ is archimedean. Because
$$IQ(c_{eq}, c_{in}) \subseteq IQ(c_{eq}, c_{in}) + IQ(\phi, \psi),$$
the set $IQ(c_{eq}, c_{in}) + IQ(\phi, \psi)$ is also archimedean. Therefore, the conclusion is also true, by applying the result for Case I.

Case III: Suppose Assumption 3.2 holds. Let $\varphi_1, \dots, \varphi_N$ be real univariate polynomials such that $\varphi_i(v_j) = 0$ for $i \ne j$ and $\varphi_i(v_j) = 1$ for $i = j$. Let
$$s = s_t + \cdots + s_N, \quad \text{where each} \quad s_i = (v_i - f_c)\big(\varphi_i(f)\big)^2.$$
Then $s \in \Sigma[x]_{2k_4}$ for some integer $k_4 > 0$. Let
$$\hat{f} = f - f_c - s.$$
We show that there exist an integer $\ell > 0$ and $q \in Qmod(c_{in}, \psi)$ such that
$$\hat{f}^{2\ell} + q \in Ideal(c_{eq}, \phi).$$
This is because, by Assumption 3.2, $\hat{f}(x) \equiv 0$ on the set
$$K_2 = \{x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0\},$$
which has only a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], there exist $0 < \ell \in \mathbb{N}$ and $q = b_0 + \rho b_1$ ($b_0, b_1 \in \Sigma[x]$) such that $\hat{f}^{2\ell} + q \in Ideal(c_{eq}, \phi)$. By Assumption 3.2, $\rho \in Qmod(c_{in}, \psi)$, so we have $q \in Qmod(c_{in}, \psi)$.

For all $\epsilon > 0$ and $\tau > 0$, we have $\hat{f} + \epsilon = \phi_\epsilon + \theta_\epsilon$, where
$$\phi_\epsilon = -\tau \epsilon^{1-2\ell}\big(\hat{f}^{2\ell} + q\big), \qquad \theta_\epsilon = \epsilon\Big(1 + \frac{\hat{f}}{\epsilon} + \tau\Big(\frac{\hat{f}}{\epsilon}\Big)^{2\ell}\Big) + \tau \epsilon^{1-2\ell} q.$$
By Lemma 2.1 of [31], when $\tau \ge \frac{1}{2\ell}$, there exists $k_5$ such that, for all $\epsilon > 0$,
$$\phi_\epsilon \in Ideal(c_{eq}, \phi)_{2k_5}, \qquad \theta_\epsilon \in Qmod(c_{in}, \psi)_{2k_5}.$$
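The decomposition $\hat{f} + \epsilon = \phi_\epsilon + \theta_\epsilon$ is a purely algebraic identity, which can be checked with sympy (here $\hat{f}$ and $q$ are treated as abstract symbols, and $\ell = 2$ is a sample value):

```python
import sympy as sp

F, q, eps, tau = sp.symbols('fhat q epsilon tau', positive=True)
l = 2  # a sample value of the exponent ell

phi_eps = -tau * eps**(1 - 2*l) * (F**(2*l) + q)
theta_eps = eps*(1 + F/eps + tau*(F/eps)**(2*l)) + tau * eps**(1 - 2*l) * q
print(sp.simplify(phi_eps + theta_eps - (F + eps)))  # 0
```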

Hence, we can get
$$f - (f_c - \epsilon) = \phi_\epsilon + \sigma_\epsilon,$$
where $\sigma_\epsilon = \theta_\epsilon + s \in Qmod(c_{in}, \psi)_{2k_5}$ for all $\epsilon > 0$. Note that
$$IQ(c_{eq}, c_{in})_{2k_5} + IQ(\phi, \psi)_{2k_5} = Ideal(c_{eq}, \phi)_{2k_5} + Qmod(c_{in}, \psi)_{2k_5}.$$
For all $\epsilon > 0$, $\gamma = f_c - \epsilon$ is feasible in (3.9) for the order $k_5$, so $f_{k_5} \ge f_c$. Because of (3.11) and the monotonicity of $f_k$, we have $f_k = f'_k = f_c$ for all $k \ge k_5$. $\square$

3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is $f_c$, and the optimal value of (1.1) is $f_{\min}$. If $f_{\min}$ is achievable at a critical point, then $f_c = f_{\min}$. In Theorem 3.3, we have shown that $f_k = f_c$ for all $k$ big enough, where $f_k$ is the optimal value of (3.9). The value $f_c$ or $f_{\min}$ is often not known. How do we detect the tightness $f_k = f_c$ in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose $y^*$ is a minimizer of (3.8) for the order $k$. Let
$$(3.16) \qquad d = \lceil \deg(c_{eq}, c_{in}, \phi, \psi)/2 \rceil.$$
If there exists an integer $t \in [d, k]$ such that
$$(3.17) \qquad \operatorname{rank} M_t(y^*) = \operatorname{rank} M_{t-d}(y^*),$$
then $f_k = f_c$, and we can get $r = \operatorname{rank} M_t(y^*)$ minimizers for (3.7) [5, 14, 32]. The method in [14] can be used to extract minimizers; it was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints $c_{in} \ge 0$, $\psi \ge 0$) can also be detected by solving the relaxations (3.8)-(3.9).
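The rank test (3.17) is easy to apply once the moment matrices of $y^*$ are formed; a minimal numerical sketch (the Dirac-measure moment matrices below are illustrative, not from the paper):

```python
import numpy as np

def flat_truncation_holds(M_t, M_t_minus_d, tol=1e-8):
    # condition (3.17): rank M_t(y*) == rank M_{t-d}(y*);
    # when it holds, r = rank M_t(y*) minimizers can be extracted
    r_t = np.linalg.matrix_rank(M_t, tol=tol)
    r_td = np.linalg.matrix_rank(M_t_minus_d, tol=tol)
    return bool(r_t == r_td), int(r_t)

# moment matrices of the Dirac measure at u = 2 in one variable:
# M_k = [u^(i+j)] for 0 <= i, j <= k, which always has rank one
u = 2.0
M2 = np.array([[u**(i + j) for j in range(3)] for i in range(3)])
M1 = M2[:2, :2]
print(flat_truncation_holds(M2, M1))  # (True, 1)
```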

Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

i) If (3.8) is infeasible for some order $k$, then no critical points satisfy the constraints $c_{in} \ge 0$, $\psi \ge 0$, i.e., (3.7) is infeasible.

ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order $k$ is big enough.

In the following, assume (3.7) is feasible (i.e., $f_c < +\infty$). Then, for all $k$ big enough, (3.8) has a minimizer $y^*$. Moreover:

iii) If (3.17) is satisfied for some $t \in [d, k]$, then $f_k = f_c$.

iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer $y^*$ of (3.8) must satisfy (3.17) for some $t \in [d, k]$, when $k$ is big enough.

Proof. By Assumption 3.1, $u$ is a critical point if and only if $c_{eq}(u) = 0$ and $\phi(u) = 0$.

i) For every feasible point $u$ of (3.7), the tms $[u]_{2k}$ (see Section 2 for the notation) is feasible for (3.8), for all $k$. Therefore, if (3.8) is infeasible for some $k$, then (3.7) must be infeasible.

ii) By Assumption 3.2, when (3.7) is infeasible, the set
$$\{x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ \rho(x) \ge 0\}$$
is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that $-1 \in Ideal(c_{eq}, \phi) + Qmod(\rho)$. By Assumption 3.2,
$$Ideal(c_{eq}, \phi) + Qmod(\rho) \subseteq IQ(c_{eq}, c_{in}) + IQ(\phi, \psi).$$
Thus, for all $k$ big enough, (3.9) is unbounded from above. Hence, (3.8) must be infeasible, by weak duality.

When (3.7) is feasible, $f$ achieves finitely many values on $K_c$, so (3.7) must achieve its optimal value $f_c$. By Theorem 3.3, we know that $f_k = f'_k = f_c$ for all $k$ big enough. For each minimizer $u^*$ of (3.7), the tms $[u^*]_{2k}$ is a minimizer of (3.8).


iii) If (3.17) holds, we can get $r = \operatorname{rank} M_t(y^*)$ minimizers for (3.7) [5, 14], say $u_1, \dots, u_r$, such that $f_k = f(u_i)$ for each $i$. Clearly, $f_k = f(u_i) \ge f_c$. On the other hand, we always have $f_k \le f_c$. So $f_k = f_c$.

iv) By Assumption 3.2, (3.7) is equivalent to the problem
$$(3.18) \qquad \begin{array}{rl} \min & f(x) \\ \text{s.t.} & c_{eq}(x) = 0, \quad \phi(x) = 0, \quad \rho(x) \ge 0. \end{array}$$
The optimal value of (3.18) is also $f_c$. Its $k$th order Lasserre relaxation is
$$(3.19) \qquad \begin{array}{rl} \gamma'_k := \min & \langle f, y\rangle \\ \text{s.t.} & \langle 1, y\rangle = 1, \quad M_k(y) \succeq 0, \\ & L^{(k)}_{c_{eq}}(y) = 0, \quad L^{(k)}_{\phi}(y) = 0, \quad L^{(k)}_{\rho}(y) \succeq 0. \end{array}$$
Its dual optimization problem is
$$(3.20) \qquad \begin{array}{rl} \gamma_k := \max & \gamma \\ \text{s.t.} & f - \gamma \in Ideal(c_{eq}, \phi)_{2k} + Qmod(\rho)_{2k}. \end{array}$$
By repeating the same proof as for Case III of Theorem 3.3, we can show that
$$\gamma_k = \gamma'_k = f_c$$
for all $k$ big enough. Because $\rho \in Qmod(c_{in}, \psi)$, each $y$ feasible for (3.8) is also feasible for (3.19). So, when $k$ is big, each $y^*$ is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some $t \in [d, k]$, when $k$ is big enough. $\square$

If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, Section 6.6].

4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron
$$P = \{x \in \mathbb{R}^n \mid Ax - b \ge 0\},$$
where $A = [a_1 \ \cdots \ a_m]^T \in \mathbb{R}^{m\times n}$ and $b = [b_1 \ \cdots \ b_m]^T \in \mathbb{R}^m$. This corresponds to the case that $\mathcal{E} = \emptyset$, $\mathcal{I} = [m]$, and each $c_i(x) = a_i^T x - b_i$. Denote

$$(4.1) \qquad D(x) = \operatorname{diag}\big(c_1(x), \dots, c_m(x)\big), \qquad C(x) = \begin{bmatrix} A^T \\ D(x) \end{bmatrix}.$$
The Lagrange multiplier vector $\lambda = [\lambda_1 \ \cdots \ \lambda_m]^T$ satisfies
$$(4.2) \qquad \begin{bmatrix} A^T \\ D(x) \end{bmatrix} \lambda = \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}.$$

If $\operatorname{rank} A = m$, we can express $\lambda$ as
$$(4.3) \qquad \lambda = (AA^T)^{-1} A \nabla f(x).$$
If $\operatorname{rank} A < m$, how can we express $\lambda$ in terms of $x$? In computation, we often prefer a polynomial expression. If there exists $L(x) \in \mathbb{R}[x]^{m\times(n+m)}$ such that
$$(4.4) \qquad L(x) C(x) = I_m,$$
then we can get
$$\lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x) \nabla f(x),$$
where $L_1(x)$ consists of the first $n$ columns of $L(x)$. In this section, we characterize when such $L(x)$ exists and give a degree bound for it.
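A quick numerical sanity check of (4.3) under the full-row-rank assumption (the random data below are placeholders):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 2, 4
A = rng.standard_normal((m, n))   # generically rank A = m
g = rng.standard_normal(n)        # stands in for grad f(x)

# formula (4.3); it is the least-squares solution of A^T lam = grad f
lam = np.linalg.inv(A @ A.T) @ A @ g
# A^T lam is the orthogonal projection of grad f onto the row space of A,
# so applying A to both recovers the same vector
print(np.allclose(A @ (A.T @ lam), A @ g))  # True
```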

The linear function $Ax - b$ is said to be nonsingular if $\operatorname{rank} C(u) = m$ for all $u \in \mathbb{C}^n$ (also see Definition 5.1). This is equivalent to the condition that, for every $u$, if $J(u) = \{i_1, \dots, i_k\}$ (see (1.2) for the notation), then $a_{i_1}, \dots, a_{i_k}$ are linearly independent.

Proposition 4.1. The linear function $Ax - b$ is nonsingular if and only if there exists a matrix polynomial $L(x)$ satisfying (4.4). Moreover, when $Ax - b$ is nonsingular, we can choose $L(x)$ in (4.4) with $\deg(L) \le m - \operatorname{rank} A$.

Proof. Clearly, if (4.4) is satisfied by some $L(x)$, then $\operatorname{rank} C(u) \ge m$ for all $u$. This implies that $Ax - b$ is nonsingular.

Next, assume that $Ax - b$ is nonsingular. We show that (4.4) is satisfied by some $L(x) \in \mathbb{R}[x]^{m\times(n+m)}$ with degree $\le m - \operatorname{rank} A$. Let $r = \operatorname{rank} A$. Up to a linear coordinate transformation, we can reduce $x$ to an $r$-dimensional variable. Without loss of generality, we can assume that $\operatorname{rank} A = n$ and $m \ge n$.

For a subset $I = \{i_1, \dots, i_{m-n}\}$ of $[m]$, denote
$$c_I(x) = \prod_{i\in I} c_i(x), \qquad E_I(x) = c_I(x)\cdot\operatorname{diag}\big(c_{i_1}(x)^{-1}, \dots, c_{i_{m-n}}(x)^{-1}\big),$$
$$D_I(x) = \operatorname{diag}\big(c_{i_1}(x), \dots, c_{i_{m-n}}(x)\big), \qquad A_I = [a_{i_1} \ \cdots \ a_{i_{m-n}}]^T.$$
For the case $I = \emptyset$ (the empty set), we set $c_\emptyset(x) = 1$. Let
$$V = \{I \subseteq [m] \ \mid \ |I| = m-n, \ \operatorname{rank} A_{[m]\setminus I} = n\}.$$
Step I: For each $I \in V$, we construct a matrix polynomial $L_I(x)$ such that
$$(4.5) \qquad L_I(x) C(x) = c_I(x) I_m.$$

The matrix $L_I = L_I(x)$ satisfying (4.5) can be given by the following $2\times 3$ block matrix ($L_I(J, K)$ denotes the submatrix whose row indices are from $J$ and whose column indices are from $K$):

(4.6)

  $J \backslash K$       $[n]$                                                   $n+I$                                                                      $n+[m]\setminus I$
  $I$                    $0$                                                     $E_I(x)$                                                                   $0$
  $[m]\setminus I$       $c_I(x)\cdot\big(A_{[m]\setminus I}\big)^{-T}$          $-\big(A_{[m]\setminus I}\big)^{-T}\big(A_I\big)^T E_I(x)$                 $0$

Equivalently, the blocks of $L_I$ are
$$L_I\big(I, [n]\big) = 0, \qquad L_I\big(I, n+[m]\setminus I\big) = 0, \qquad L_I\big([m]\setminus I, n+[m]\setminus I\big) = 0,$$
$$L_I\big(I, n+I\big) = E_I(x), \qquad L_I\big([m]\setminus I, [n]\big) = c_I(x)\big(A_{[m]\setminus I}\big)^{-T},$$
$$L_I\big([m]\setminus I, n+I\big) = -\big(A_{[m]\setminus I}\big)^{-T}\big(A_I\big)^T E_I(x).$$
For each $I \in V$, $A_{[m]\setminus I}$ is invertible. (The superscript $-T$ denotes the inverse of the transpose.) Let $G = L_I(x) C(x)$; then one can verify that
$$G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m-n}, \qquad G(I, [m]\setminus I) = 0,$$
$$G([m]\setminus I, [m]\setminus I) = \Big[\, c_I(x)\big(A_{[m]\setminus I}\big)^{-T} \quad -\big(A_{[m]\setminus I}\big)^{-T} A_I^T E_I(x) \,\Big]\begin{bmatrix} \big(A_{[m]\setminus I}\big)^T \\ 0 \end{bmatrix} = c_I(x) I_n,$$
$$G([m]\setminus I, I) = \Big[\, c_I(x)\big(A_{[m]\setminus I}\big)^{-T} \quad -\big(A_{[m]\setminus I}\big)^{-T}\big(A_I\big)^T E_I(x) \,\Big]\begin{bmatrix} A_I^T \\ D_I(x) \end{bmatrix} = 0.$$
This shows that the above $L_I(x)$ satisfies (4.5).


Step II: We show that there exist real scalars $\nu_I$ satisfying
$$(4.7) \qquad \sum_{I\in V} \nu_I c_I(x) = 1.$$
This can be shown by induction on $m$.

$\bullet$ When $m = n$, $V = \{\emptyset\}$ and $c_\emptyset(x) = 1$, so (4.7) is clearly true.

$\bullet$ When $m > n$, let
$$(4.8) \qquad N = \{i \in [m] \mid \operatorname{rank} A_{[m]\setminus\{i\}} = n\}.$$
For each $i \in N$, let $V_i$ be the set of all $I' \subseteq [m]\setminus\{i\}$ such that $|I'| = m-n-1$ and $\operatorname{rank} A_{[m]\setminus(I'\cup\{i\})} = n$. For each $i \in N$, by the assumption, the linear function $A_{[m]\setminus\{i\}} x - b_{[m]\setminus\{i\}}$ is nonsingular. By induction, there exist real scalars $\nu^{(i)}_{I'}$ satisfying
$$(4.9) \qquad \sum_{I'\in V_i} \nu^{(i)}_{I'} c_{I'}(x) = 1.$$
Since $\operatorname{rank} A = n$, we can generally assume that $\{a_1, \dots, a_n\}$ is linearly independent. So there exist scalars $\alpha_1, \dots, \alpha_n$ such that
$$a_m = \alpha_1 a_1 + \cdots + \alpha_n a_n.$$
If all $\alpha_i = 0$, then $a_m = 0$, and hence $A$ can be replaced by its first $m-1$ rows. So (4.7) is true by the induction. In the following, suppose at least one $\alpha_i \ne 0$, and write
$$\{i \mid \alpha_i \ne 0\} = \{i_1, \dots, i_k\}.$$

Then $a_{i_1}, \dots, a_{i_k}, a_m$ are linearly dependent. For convenience, set $i_{k+1} = m$. Since $Ax - b$ is nonsingular, the linear system
$$c_{i_1}(x) = \cdots = c_{i_k}(x) = c_{i_{k+1}}(x) = 0$$
has no solutions. Hence, there exist real scalars $\mu_1, \dots, \mu_{k+1}$ such that
$$\mu_1 c_{i_1}(x) + \cdots + \mu_k c_{i_k}(x) + \mu_{k+1} c_{i_{k+1}}(x) = 1.$$
(This is implied by the echelon form of inconsistent linear systems.) Note that $i_1, \dots, i_{k+1} \in N$. For each $j = 1, \dots, k+1$, by (4.9),
$$\sum_{I'\in V_{i_j}} \nu^{(i_j)}_{I'} c_{I'}(x) = 1.$$
Then we can get
$$1 = \sum_{j=1}^{k+1} \mu_j c_{i_j}(x) = \sum_{j=1}^{k+1} \mu_j \sum_{I'\in V_{i_j}} \nu^{(i_j)}_{I'} c_{i_j}(x) c_{I'}(x) = \sum_{\substack{I = I'\cup\{i_j\} \\ I'\in V_{i_j},\ 1\le j\le k+1}} \nu^{(i_j)}_{I'} \mu_j c_I(x).$$
Since each $I'\cup\{i_j\} \in V$, (4.7) must be satisfied by some scalars $\nu_I$.


Step III: For $L_I(x)$ as in (4.5), we construct $L(x)$ as
$$(4.10) \qquad L(x) = \sum_{I\in V} \nu_I L_I(x).$$
Clearly, $L(x)$ satisfies (4.4), because
$$L(x) C(x) = \sum_{I\in V} \nu_I L_I(x) C(x) = \sum_{I\in V} \nu_I c_I(x) I_m = I_m.$$
Each $L_I(x)$ has degree $\le m-n$, so $L(x)$ has degree $\le m-n$. $\square$

Proposition 4.1 characterizes when there exists $L(x)$ satisfying (4.4). When it does, a degree bound for $L(x)$ is $m - \operatorname{rank} A$. Sometimes its degree can be smaller than that, as shown in Example 4.3. For given $A, b$, the matrix polynomial $L(x)$ satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of $L(x)C(x) = I_m$ for polyhedral sets.
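Matching coefficients reduces to a linear system in the entries of $L(x)$; a sympy sketch for the box of Example 4.3 with $n = 1$ (the degree-1 ansatz is an assumption):

```python
import sympy as sp

x = sp.symbols('x')
# constraints x >= 0 and 1 - x >= 0; C(x) stacks A^T over D(x) as in (4.1)
C = sp.Matrix([[1, -1],
               [x, 0],
               [0, 1 - x]])

a = sp.symbols('a0:6')
b = sp.symbols('b0:6')
L = sp.Matrix(2, 3, lambda i, j: a[3*i + j] + b[3*i + j]*x)  # degree-1 ansatz
# collect the coefficient equations of L(x)*C(x) - I = 0 and solve them
eqs = [cf for entry in sp.expand(L*C - sp.eye(2))
       for cf in sp.Poly(entry, x).all_coeffs()]
sol = sp.solve(eqs, a + b, dict=True)[0]
L_sol = sp.expand(L.subs(sol))
print(L_sol)  # matches Example 4.3 with n = 1
```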

Example 4.2. Consider the simplicial set
$$x_1 \ge 0, \ \dots, \ x_n \ge 0, \quad 1 - e^T x \ge 0.$$
The equation $L(x) C(x) = I_{n+1}$ is satisfied by
$$L(x) = \begin{bmatrix} 1-x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \\ -x_1 & 1-x_2 & \cdots & -x_n & 1 & \cdots & 1 \\ \vdots & & \ddots & \vdots & & & \vdots \\ -x_1 & -x_2 & \cdots & 1-x_n & 1 & \cdots & 1 \\ -x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \end{bmatrix}.$$
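This $L(x)$ can be verified symbolically; a sympy check for $n = 3$:

```python
import sympy as sp

n = 3
x = sp.Matrix(sp.symbols(f'x1:{n+1}'))
c = list(x) + [1 - sum(x)]                       # x_i >= 0 and 1 - e^T x >= 0
A = sp.Matrix.vstack(sp.eye(n), -sp.ones(1, n))  # gradients of the constraints
C = sp.Matrix.vstack(A.T, sp.diag(*c))           # C(x) as in (4.1)
# rows 1..n of the first block are e_i^T - x^T, row n+1 is -x^T;
# then a block of all ones
L = sp.Matrix.hstack(
    sp.Matrix.vstack(sp.eye(n) - sp.ones(n, 1)*x.T, -x.T),
    sp.ones(n + 1, n + 1))
print((L*C).expand() == sp.eye(n + 1))  # True
```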

Example 4.3. Consider the box constraint
$$x_1 \ge 0, \ \dots, \ x_n \ge 0, \quad 1-x_1 \ge 0, \ \dots, \ 1-x_n \ge 0.$$
The equation $L(x) C(x) = I_{2n}$ is satisfied by
$$L(x) = \begin{bmatrix} I_n - \operatorname{diag}(x) & I_n & I_n \\ -\operatorname{diag}(x) & I_n & I_n \end{bmatrix}.$$

Example 4.4. Consider the polyhedral set
$$1-x_4 \ge 0, \quad x_4-x_3 \ge 0, \quad x_3-x_2 \ge 0, \quad x_2-x_1 \ge 0, \quad x_1+1 \ge 0.$$
The equation $L(x) C(x) = I_5$ is satisfied by
$$L(x) = \frac{1}{2}\begin{bmatrix} -x_1-1 & -x_2-1 & -x_3-1 & -x_4-1 & 1 & 1 & 1 & 1 & 1 \\ -x_1-1 & -x_2-1 & -x_3-1 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\ -x_1-1 & -x_2-1 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\ -x_1-1 & 1-x_2 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\ 1-x_1 & 1-x_2 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \end{bmatrix}.$$

Example 4.5. Consider the polyhedral set
$$1+x_1 \ge 0, \quad 1-x_1 \ge 0, \quad 2-x_1-x_2 \ge 0, \quad 2-x_1+x_2 \ge 0.$$
The matrix $L(x)$ satisfying $L(x) C(x) = I_4$ is
$$\frac{1}{6}\begin{bmatrix} x_1^2-3x_1+2 & x_1x_2-x_2 & 4-x_1 & 2-x_1 & 1-x_1 & 1-x_1 \\ 3x_1^2-3x_1-6 & 3x_2+3x_1x_2 & 6-3x_1 & -3x_1 & -3x_1-3 & -3x_1-3 \\ 1-x_1^2 & -2x_2-x_1x_2-3 & x_1-1 & x_1+1 & x_1+2 & x_1+2 \\ 1-x_1^2 & 3-x_1x_2-2x_2 & x_1-1 & x_1+1 & x_1+2 & x_1+2 \end{bmatrix}.$$


5. General constraints

We now consider general nonlinear constraints, as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers $\lambda_i$ as polynomial functions in $x$ on the set of critical points.

Suppose there are $m$ equality and inequality constraints in total, i.e.,
$$\mathcal{E} \cup \mathcal{I} = \{1, \dots, m\}.$$
If $(x, \lambda)$ is a critical pair, then $\lambda_i c_i(x) = 0$ for all $i \in \mathcal{E}\cup\mathcal{I}$. So the Lagrange multiplier vector $\lambda = [\lambda_1 \ \cdots \ \lambda_m]^T$ satisfies the equation
$$(5.1) \qquad \underbrace{\begin{bmatrix} \nabla c_1(x) & \nabla c_2(x) & \cdots & \nabla c_m(x) \\ c_1(x) & 0 & \cdots & 0 \\ 0 & c_2(x) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & c_m(x) \end{bmatrix}}_{C(x)} \ \lambda = \begin{bmatrix} \nabla f(x) \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}.$$
Let $C(x)$ be as above. If there exists $L(x) \in \mathbb{R}[x]^{m\times(m+n)}$ such that
$$(5.2) \qquad L(x) C(x) = I_m,$$
then we can get
$$(5.3) \qquad \lambda = L(x)\begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x)\nabla f(x),$$
where $L_1(x)$ consists of the first $n$ columns of $L(x)$. Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such $L(x)$ exists.

Definition 5.1. The tuple $c = (c_1, \dots, c_m)$ of constraining polynomials is said to be nonsingular if $\operatorname{rank} C(u) = m$ for every $u \in \mathbb{C}^n$.

Clearly, $c$ being nonsingular is equivalent to the condition that, for each $u \in \mathbb{C}^n$, if $J(u) = \{i_1, \dots, i_k\}$ (see (1.2) for the notation), then the gradients $\nabla c_{i_1}(u), \dots, \nabla c_{i_k}(u)$ are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple $c$ is nonsingular.

Proposition 5.2. (i) For $W(x) \in \mathbb{C}[x]^{s\times t}$ with $s \ge t$, we have $\operatorname{rank} W(u) = t$ for all $u \in \mathbb{C}^n$ if and only if there exists $P(x) \in \mathbb{C}[x]^{t\times s}$ such that
$$P(x) W(x) = I_t.$$
Moreover, for $W(x) \in \mathbb{R}[x]^{s\times t}$, we can choose $P(x) \in \mathbb{R}[x]^{t\times s}$ in the above.
(ii) The constraining polynomial tuple $c$ is nonsingular if and only if there exists $L(x) \in \mathbb{R}[x]^{m\times(m+n)}$ satisfying (5.2).

Proof. (i) "$\Leftarrow$": If $P(x) W(x) = I_t$, then, for all $u \in \mathbb{C}^n$,
$$t = \operatorname{rank} I_t \le \operatorname{rank} W(u) \le t.$$
So $W(x)$ must have full column rank everywhere.

"$\Rightarrow$": Suppose $\operatorname{rank} W(u) = t$ for all $u \in \mathbb{C}^n$. Write $W(x)$ in columns:
$$W(x) = [w_1(x) \ \ w_2(x) \ \ \cdots \ \ w_t(x)].$$


Then the equation $w_1(x) = 0$ does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists $\xi_1(x) \in \mathbb{C}[x]^s$ such that $\xi_1(x)^T w_1(x) = 1$. For each $i = 2, \dots, t$, denote
$$r_{1i}(x) = \xi_1(x)^T w_i(x);$$
then (we use $\sim$ to denote row equivalence between matrices)
$$W(x) \sim \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix} \sim W_1(x) = \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ 0 & w^{(1)}_2(x) & \cdots & w^{(1)}_t(x) \end{bmatrix},$$
where, for each $i = 2, \dots, t$,
$$w^{(1)}_i(x) = w_i(x) - r_{1i}(x) w_1(x).$$
So there exists $P_1(x) \in \mathbb{C}[x]^{(s+1)\times s}$ such that
$$P_1(x) W(x) = W_1(x).$$

Since $W(x)$ and $W_1(x)$ are row equivalent, $W_1(x)$ must also have full column rank everywhere. Similarly, the polynomial equation
$$w^{(1)}_2(x) = 0$$
does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists $\xi_2(x) \in \mathbb{C}[x]^s$ such that
$$\xi_2(x)^T w^{(1)}_2(x) = 1.$$
For each $i = 3, \dots, t$, let $r_{2i}(x) = \xi_2(x)^T w^{(1)}_i(x)$; then
$$W_1(x) \sim \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & w^{(1)}_2(x) & w^{(1)}_3(x) & \cdots & w^{(1)}_t(x) \end{bmatrix} \sim W_2(x) = \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & 0 & w^{(2)}_3(x) & \cdots & w^{(2)}_t(x) \end{bmatrix},$$
where, for each $i = 3, \dots, t$,
$$w^{(2)}_i(x) = w^{(1)}_i(x) - r_{2i}(x) w^{(1)}_2(x).$$
Similarly, $W_1(x)$ and $W_2(x)$ are row equivalent, so $W_2(x)$ has full column rank everywhere. There exists $P_2(x) \in \mathbb{C}[x]^{(s+2)\times(s+1)}$ such that
$$P_2(x) W_1(x) = W_2(x).$$

Continuing this process, we can finally get
$$W_2(x) \sim \cdots \sim W_t(x) = \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & 0 & 1 & \cdots & r_{3t}(x) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}.$$
Consequently, there exists $P_i(x) \in \mathbb{C}[x]^{(s+i)\times(s+i-1)}$, for $i = 1, 2, \dots, t$, such that
$$P_t(x) P_{t-1}(x) \cdots P_1(x) W(x) = W_t(x).$$
Since $W_t(x)$ is a unit upper triangular matrix polynomial, there exists $P_{t+1}(x) \in \mathbb{C}[x]^{t\times(s+t)}$ such that $P_{t+1}(x) W_t(x) = I_t$. Let
$$P(x) = P_{t+1}(x) P_t(x) P_{t-1}(x) \cdots P_1(x);$$
then $P(x) W(x) = I_t$. Note that $P(x) \in \mathbb{C}[x]^{t\times s}$.

For $W(x) \in \mathbb{R}[x]^{s\times t}$, we can replace $P(x)$ by $\big(P(x) + \overline{P(x)}\big)/2$ (here $\overline{P(x)}$ denotes the complex conjugate of $P(x)$), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). $\square$

In Proposition 5.2, there is no explicit degree bound for $L(x)$ satisfying (5.2); this question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for $L(x)$, the matrix $L(x)$ can be determined by comparing coefficients on both sides of (5.2), which amounts to solving a linear system. In the following, we give some examples of $L(x)$ satisfying (5.2).

Example 5.3. Consider the hypercube with quadratic constraints

    1 − x_1^2 ≥ 0,  1 − x_2^2 ≥ 0,  ...,  1 − x_n^2 ≥ 0.

The equation L(x)C(x) = I_n is satisfied by

    L(x) = [ −(1/2)·diag(x)    I_n ].
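This identity can be checked symbolically; a minimal sketch (it assumes, as in Section 5, that C(x) stacks the constraint gradients on top of diag(c_1(x), ..., c_m(x))):

```python
import sympy as sp

n = 3
x = sp.symbols(f'x1:{n+1}')
c = [1 - xi**2 for xi in x]                 # hypercube constraints
# C(x) stacks the constraint gradients over diag(c_1, ..., c_n)
grad = sp.Matrix([[sp.diff(ci, xj) for ci in c] for xj in x])   # equals -2*diag(x)
C = sp.Matrix.vstack(grad, sp.diag(*c))     # (2n) x n
L = sp.Matrix.hstack(-sp.diag(*x) / 2, sp.eye(n))
assert sp.expand(L * C) == sp.eye(n)
print("L(x) C(x) = I_n verified for n =", n)
```

The same pattern can be used to verify any candidate L(x) against its constraint tuple.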

Example 5.4. Consider the nonnegative portion of the unit sphere

    x_1 ≥ 0,  x_2 ≥ 0,  ...,  x_n ≥ 0,  x_1^2 + ··· + x_n^2 − 1 = 0.

The equation L(x)C(x) = I_{n+1} is satisfied by

    L(x) = [ I_n − xx^T      x·1_n^T         2x
             (1/2)x^T       −(1/2)·1_n^T    −1 ],

where 1_n denotes the vector of all ones.
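A symbolic check of this L(x) for a small n (a sketch; the stacking of C(x) follows Section 5, with the sphere equation listed as the last constraint):

```python
import sympy as sp

n = 3
x = sp.Matrix(sp.symbols(f'x1:{n+1}'))
c = list(x) + [(x.T * x)[0] - 1]                 # x_i >= 0 and the sphere equation
grad = sp.Matrix.hstack(sp.eye(n), 2 * x)        # columns are the gradients, n x (n+1)
C = sp.Matrix.vstack(grad, sp.diag(*c))          # (2n+1) x (n+1)
ones = sp.ones(n, 1)
top = sp.Matrix.hstack(sp.eye(n) - x * x.T, x * ones.T, 2 * x)
bot = sp.Matrix.hstack(x.T / 2, -ones.T / 2, sp.Matrix([[-1]]))
L = sp.Matrix.vstack(top, bot)                   # (n+1) x (2n+1)
assert sp.expand(L * C - sp.eye(n + 1)) == sp.zeros(n + 1, n + 1)
print("L(x) C(x) = I_{n+1} verified")
```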

Example 5.5. Consider the set

    1 − x_1^3 − x_2^4 ≥ 0,  1 − x_3^4 − x_4^3 ≥ 0.

The equation L(x)C(x) = I_2 is satisfied by

    L(x) = [ −x_1/3   −x_2/4    0        0        1   0
              0        0       −x_3/4   −x_4/3    0   1 ].
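Again, a short symbolic verification (assuming the Section 5 stacking of C(x)):

```python
import sympy as sp

x1, x2, x3, x4 = sp.symbols('x1:5')
c = [1 - x1**3 - x2**4, 1 - x3**4 - x4**3]
xs = [x1, x2, x3, x4]
grad = sp.Matrix([[sp.diff(ci, xj) for ci in c] for xj in xs])   # 4 x 2
C = sp.Matrix.vstack(grad, sp.diag(*c))                          # 6 x 2
L = sp.Matrix([[-x1/3, -x2/4, 0, 0, 1, 0],
               [0, 0, -x3/4, -x4/3, 0, 1]])
assert sp.expand(L * C - sp.eye(2)) == sp.zeros(2, 2)
print("L(x) C(x) = I_2 verified")
```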

Example 5.6. Consider the quadratic set

    1 − x_1x_2 − x_2x_3 − x_1x_3 ≥ 0,  1 − x_1^2 − x_2^2 − x_3^2 ≥ 0.

The matrix L(x)^T satisfying L(x)C(x) = I_2 is

    [ 25x_1^3 + 10x_1^2x_2 + 40x_1x_2^2 − 25x_1 − 2x_3     −25x_1^3 − 10x_1^2x_2 − 40x_1x_2^2 + (49/2)x_1 + 2x_3
      −15x_1^2x_2 + 10x_1x_2^2 + 20x_3x_1x_2 − 10x_1        15x_1^2x_2 − 10x_1x_2^2 − 20x_3x_1x_2 + 10x_1 − x_2/2
      25x_3x_1^2 − 20x_1x_2^2 + 10x_3x_1x_2 + 2x_1         −25x_3x_1^2 + 20x_1x_2^2 − 10x_3x_1x_2 − 2x_1 − x_3/2
      1 − 20x_1x_3 − 10x_1^2 − 20x_1x_2                     20x_1x_2 + 20x_1x_3 + 10x_1^2
      −50x_1^2 − 20x_2x_1                                   50x_1^2 + 20x_2x_1 + 1 ].

We would like to remark that a polynomial tuple c = (c_1, ..., c_m) is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d_1, ..., d_m, there exists an open dense subset U of D := R[x]_{d_1} × ··· × R[x]_{d_m} such that every tuple c = (c_1, ..., c_m) ∈ U is nonsingular. Indeed, such a U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.

20 JIAWANG NIE

Proof. The proof needs to use resultants and discriminants, for which we refer to [29].

First, let J_1 be the set of all (i_1, ..., i_{n+1}) with 1 ≤ i_1 < ··· < i_{n+1} ≤ m. The resultant Res(c_{i_1}, ..., c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i_1}, ..., c_{i_{n+1}} such that if Res(c_{i_1}, ..., c_{i_{n+1}}) ≠ 0 then the equations

    c_{i_1}(x) = ··· = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

    F_1(c) = ∏_{(i_1,...,i_{n+1}) ∈ J_1} Res(c_{i_1}, ..., c_{i_{n+1}}).

For the case m ≤ n, J_1 = ∅ and we simply let F_1(c) = 1. Clearly, if F_1(c) ≠ 0, then no more than n of the polynomials c_1, ..., c_m can have a common complex zero.

Second, let J_2 be the set of all (j_1, ..., j_k) with k ≤ n and 1 ≤ j_1 < ··· < j_k ≤ m. When one of c_{j_1}, ..., c_{j_k} has degree bigger than one, the discriminant ∆(c_{j_1}, ..., c_{j_k}) is a polynomial in the coefficients of c_{j_1}, ..., c_{j_k} such that if ∆(c_{j_1}, ..., c_{j_k}) ≠ 0 then the equations

    c_{j_1}(x) = ··· = c_{j_k}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every common complex solution u, the gradients of c_{j_1}, ..., c_{j_k} at u are linearly independent. When all of c_{j_1}, ..., c_{j_k} have degree one, the discriminant of the tuple (c_{j_1}, ..., c_{j_k}) is not a single polynomial, but we can define ∆(c_{j_1}, ..., c_{j_k}) to be the product of all maximal minors of its Jacobian (a constant matrix). Define

    F_2(c) = ∏_{(j_1,...,j_k) ∈ J_2} ∆(c_{j_1}, ..., c_{j_k}).

Clearly, if F_2(c) ≠ 0, then no n or fewer of the polynomials c_1, ..., c_m can have a singular common complex zero.

Last, let F(c) = F_1(c)F_2(c) and

    U = { c = (c_1, ..., c_m) ∈ D : F(c) ≠ 0 }.

Note that U is a Zariski open subset of D, and it is open and dense in D. For all c ∈ U, no more than n of c_1, ..., c_m can have a common complex zero. For any k polynomials (k ≤ n) of c_1, ..., c_m, if they have a common complex zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically. □

6. Numerical examples

This section gives examples of applying the new relaxations (3.8)-(3.9) to solve the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed in the computational results.


The polynomials p_i in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c_1, ..., c_m. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L_1(x) be the submatrix of L(x) consisting of its first n columns. Then choose (p_1, ..., p_m) to be the product L_1(x)∇f(x), i.e.,

    p_i = ( L_1(x)∇f(x) )_i.

In all our examples, the global minimum value f_min of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {f(x) ≤ ℓ} is compact for every ℓ) and the constraint qualification condition holds.
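As an illustration of this construction (a sketch: the objective f below and the use of the hypercube tuple from Example 5.3 are illustrative assumptions, not data from the paper), the p_i can be computed symbolically:

```python
import sympy as sp

n = 2
x = sp.symbols(f'x1:{n+1}')
f = x[0]**4 + x[1]**4 - x[0]*x[1]           # sample objective (an assumption)
# L(x) for the hypercube constraints 1 - x_i^2 >= 0 (as in Example 5.3)
L = sp.Matrix.hstack(-sp.diag(*x) / 2, sp.eye(n))
L1 = L[:, :n]                                # submatrix of the first n columns
grad_f = sp.Matrix([sp.diff(f, xi) for xi in x])
p = sp.expand(L1 * grad_f)                   # p_i = (L_1(x) * grad f(x))_i
print(list(p))
```

Here p_1 = −2x_1^4 + x_1x_2/2 and p_2 = −2x_2^4 + x_1x_2/2 for this sample objective.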

By Theorem 3.3, we have f_k = f_min for all k big enough, if f_c = f_min and any one of the conditions i)-iii) there holds. Typically, it might be inconvenient to check these conditions; however, in computation we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied in all our examples except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, and the computational time (in seconds) is also compared. The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".

Example 6.1. Consider the optimization problem

    min   x_1x_2(10 − x_3)
    s.t.  x_1 ≥ 0,  x_2 ≥ 0,  x_3 ≥ 0,  1 − x_1 − x_2 − x_3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum f_min = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x_1, x_2, x_3) with x_1x_2 = 0 is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

    order k    w/o LME: lower bound    time      with LME: lower bound    time
    2          -0.0521                 0.6841    -0.0521                  0.1922
    3          -0.0026                 0.2657    -3·10^{-8}               0.2285
    4          -0.0007                 0.6785    -6·10^{-9}               0.4431
    5          -0.0004                 1.6105    -2·10^{-9}               0.9567

²Note that 1 − x^Tx = (1 − e^Tx)(1 + x^Tx) + Σ_{i=1}^n x_i(1 − x_i)^2 + Σ_{i≠j} x_i^2 x_j ∈ IQ(c_eq, c_in), where e denotes the vector of all ones.
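The identity in this footnote can be confirmed symbolically; a quick sketch for n = 3:

```python
import sympy as sp

n = 3
x = sp.symbols(f'x1:{n+1}')
lhs = 1 - sum(xi**2 for xi in x)
rhs = ((1 - sum(x)) * (1 + sum(xi**2 for xi in x))      # (1 - e^T x)(1 + x^T x)
       + sum(xi * (1 - xi)**2 for xi in x)
       + sum(x[i]**2 * x[j] for i in range(n) for j in range(n) if i != j))
assert sp.expand(lhs - rhs) == 0
print("identity verified for n =", n)
```

The same computation goes through for any n.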


Example 6.2. Consider the optimization problem

    min   x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 − 3x_1^2x_2^2x_3^2 + (x_1^4 + x_2^4 + x_3^4)
    s.t.  x_1^2 + x_2^2 + x_3^2 ≥ 1.

The matrix polynomial is L(x) = [ (1/2)x_1  (1/2)x_2  (1/2)x_3  −1 ]. The objective f is the sum of the positive definite form x_1^4 + x_2^4 + x_3^4 and the Motzkin polynomial

    M(x) = x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 − 3x_1^2x_2^2x_3^2.

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive, and f_min is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

    c_1(x)p_1(x) = (x_1^2 + x_2^2 + x_3^2 − 1)( 3M(x) + 2(x_1^4 + x_2^4 + x_3^4) ) = 0

defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value is f_min = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

    order k    w/o LME: lower bound    time      with LME: lower bound    time
    3          -∞                      0.4466    0.1111                   0.1169
    4          -∞                      0.4948    0.3333                   0.3499
    5          -2.1821·10^5            1.1836    0.3333                   0.6530
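Both the displayed multiplier expression and the reported minimum of this example can be double-checked symbolically (a sketch; the stacking of C(x) follows Section 5):

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1:4')
f = x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2 + x1**4 + x2**4 + x3**4
c = x1**2 + x2**2 + x3**2 - 1
C = sp.Matrix([sp.diff(c, v) for v in (x1, x2, x3)] + [c])   # 4 x 1
L = sp.Matrix([[x1/2, x2/2, x3/2, -1]])
assert sp.expand((L * C)[0]) == 1            # L(x) C(x) = I_1
u = {x1: 1/sp.sqrt(3), x2: 1/sp.sqrt(3), x3: 1/sp.sqrt(3)}
print("f at a minimizer:", sp.simplify(f.subs(u)))   # 1/3
```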

Example 6.3. Consider the optimization problem

    min   x_1x_2 + x_2x_3 + x_3x_4 − 3x_1x_2x_3x_4 + (x_1^3 + ··· + x_4^3)
    s.t.  x_1, x_2, x_3, x_4 ≥ 0,  1 − x_1 − x_2 ≥ 0,  1 − x_3 − x_4 ≥ 0.

The matrix polynomial L(x) is

    [ 1−x_1   −x_2    0       0      1 1 0 0 1 0
      −x_1    1−x_2   0       0      1 1 0 0 1 0
      0       0       1−x_3   −x_4   0 0 1 1 0 1
      0       0       −x_3    1−x_4  0 0 1 1 0 1
      −x_1    −x_2    0       0      1 1 0 0 1 0
      0       0       −x_3    −x_4   0 0 1 1 0 1 ].

The feasible set is compact, so f_min is achieved at a critical point. One can show that f_min = 0 and that the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(c_eq, c_in) is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.

³This is because −c_1^2 p_1^2 ∈ Ideal(φ) ⊆ IQ(φ, ψ) and the set {−c_1(x)^2 p_1(x)^2 ≥ 0} is compact.

⁴This is because 1 − x_1^2 − x_2^2 belongs to the quadratic module of (x_1, x_2, 1 − x_1 − x_2) and 1 − x_3^2 − x_4^2 belongs to the quadratic module of (x_3, x_4, 1 − x_3 − x_4); see the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

    order k    w/o LME: lower bound    time       with LME: lower bound    time
    3          -2.9·10^{-5}            0.7335     -6·10^{-7}               0.6091
    4          -1.4·10^{-5}            2.5055     -8·10^{-8}               2.7423
    5          -1.4·10^{-5}            12.7092    -5·10^{-8}               13.7449

Example 6.4. Consider the polynomial optimization problem

    min_{x∈R^2}   x_1^2 + 50x_2^2
    s.t.  x_1^2 − 1/2 ≥ 0,  x_2^2 − 2x_1x_2 − 1/8 ≥ 0,  x_2^2 + 2x_1x_2 − 1/8 ≥ 0.

It is motivated by an example in [12, §3]. The first column of L(x) is

    [ (8x_1^3)/5 + x_1/5
      (288x_2x_1^4)/5 − (16x_1^3)/5 − (124x_2x_1^2)/5 + (8x_1)/5 − 2x_2
      −(288x_2x_1^4)/5 − (16x_1^3)/5 + (124x_2x_1^2)/5 + (8x_1)/5 + 2x_2 ]

and the second column of L(x) is

    [ −(8x_1^2x_2)/5 + (4x_2^3)/5 − x_2/10
      (288x_1^3x_2^2)/5 + (16x_1^2x_2)/5 − (142x_1x_2^2)/5 − (9x_1)/20 − (8x_2^3)/5 + (11x_2)/5
      −(288x_1^3x_2^2)/5 + (16x_1^2x_2)/5 + (142x_1x_2^2)/5 + (9x_1)/20 − (8x_2^3)/5 + (11x_2)/5 ].

The objective is coercive, so f_min is achieved at a critical point. The minimum value is f_min = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

    order k    w/o LME: lower bound    time      with LME: lower bound    time
    3          6.7535                  0.4611    56.7500                  0.1309
    4          6.9294                  0.2428    112.6517                 0.2405
    5          8.8519                  0.3376    112.6517                 0.2167
    6          16.5971                 0.4703    112.6517                 0.3788
    7          35.4756                 0.6536    112.6517                 0.4537

Example 6.5. Consider the optimization problem

    min_{x∈R^3}   x_1^3 + x_2^3 + x_3^3 + 4x_1x_2x_3 − ( x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2) )
    s.t.  x_1 ≥ 0,  x_1x_2 − 1 ≥ 0,  x_2x_3 − 1 ≥ 0.

The matrix polynomial L(x) is

    [ 1 − x_1x_2    0     0    x_2    x_2    0
      x_1           0     0    −1     −1     0
      −x_1          x_2   0    1      0      −1 ].


The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R^3_+, so the minimum value is achieved at a critical point. In computation, we got f_min ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

    order k    w/o LME: lower bound    time      with LME: lower bound    time
    2          -∞                      0.4129    -∞                       0.1900
    3          -7.8184·10^6            0.4641    0.9492                   0.3139
    4          -2.0575·10^4            0.6499    0.9492                   0.5057
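The matrix L(x) of Example 6.5 can be verified to satisfy L(x)C(x) = I_3 (a sketch, assuming C(x) stacks the constraint gradients over diag(c_1, c_2, c_3) as in Section 5):

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1:4')
c = [x1, x1*x2 - 1, x2*x3 - 1]
xs = [x1, x2, x3]
grad = sp.Matrix([[sp.diff(ci, xj) for ci in c] for xj in xs])   # 3 x 3
C = sp.Matrix.vstack(grad, sp.diag(*c))                          # 6 x 3
L = sp.Matrix([[1 - x1*x2, 0, 0, x2, x2, 0],
               [x1, 0, 0, -1, -1, 0],
               [-x1, x2, 0, 1, 0, -1]])
assert sp.expand(L * C) == sp.eye(3)
print("L(x) C(x) = I_3 verified")
```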

Example 6.6. Consider the optimization problem (x_0 := 1)

    min_{x∈R^4}   x^T x + Σ_{i=0}^4 ∏_{j≠i} (x_i − x_j)
    s.t.  x_1^2 − 1 ≥ 0,  x_2^2 − 1 ≥ 0,  x_3^2 − 1 ≥ 0,  x_4^2 − 1 ≥ 0.

The matrix polynomial is L(x) = [ (1/2)·diag(x)  −I_4 ]. The first part of the objective is x^T x, while the second part is a nonnegative polynomial [37]. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min = 4.0000 and the 11 global minimizers

    (1, 1, 1, 1),  (1, −1, −1, 1),  (1, −1, 1, −1),  (1, 1, −1, −1),
    (1, −1, −1, −1),  (−1, −1, 1, 1),  (−1, 1, −1, 1),  (−1, 1, 1, −1),
    (−1, −1, −1, 1),  (−1, −1, 1, −1),  (−1, 1, −1, −1).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

    order k    w/o LME: lower bound    time       with LME: lower bound    time
    3          -∞                      1.1377     3.5480                   1.1765
    4          -6.6913·10^4            4.7677     4.0000                   3.0761
    5          -21.3778                22.9970    4.0000                   10.3354

Example 6.7. Consider the optimization problem

    min_{x∈R^3}   x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 − 3x_1^2x_2^2x_3^2 + x_2^2
    s.t.  x_1 − x_2x_3 ≥ 0,  −x_2 + x_3^2 ≥ 0.

The matrix polynomial is

    L(x) = [ 1      0     0   0   0
             −x_3   −1    0   0   0 ].

By the arithmetic-geometric mean inequality, one can show that f_min = 0. The global minimizers are the points (x_1, 0, x_3) with x_1 ≥ 0 and x_1x_3 = 0. The computational results for the standard Lasserre


relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that f_k = f_min for all k ≥ 5, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

    order k    w/o LME: lower bound    time      with LME: lower bound    time
    3          -∞                      0.6144    -∞                       0.3418
    4          -1.0909·10^7            1.0542    -3.9476                  0.7180
    5          -942.6772               1.6771    -3·10^{-9}               1.4607
    6          -0.0110                 3.3532    -8·10^{-10}              3.1618

Example 6.8. Consider the optimization problem

    min_{x∈R^4}   x_1^2(x_1 − x_4)^2 + x_2^2(x_2 − x_4)^2 + x_3^2(x_3 − x_4)^2
                  + 2x_1x_2x_3(x_1 + x_2 + x_3 − 2x_4) + (x_1 − 1)^2 + (x_2 − 1)^2 + (x_3 − 1)^2
    s.t.  x_1 − x_2 ≥ 0,  x_2 − x_3 ≥ 0.

The matrix polynomial is

    L(x) = [ 1   0   0   0   0   0
             1   1   0   0   0   0 ].

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min ≈ 0.9413 and a minimizer

    (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

    order k    w/o LME: lower bound    time       with LME: lower bound    time
    2          -∞                      0.3984     -0.3360                  0.9321
    3          -∞                      0.7634     0.9413                   0.5240
    4          -6.4896·10^5            4.5496     0.9413                   1.7192
    5          -3.1645·10^3            24.3665    0.9413                   8.1228

Example 6.9. Consider the optimization problem

    min_{x∈R^4}   (x_1 + x_2 + x_3 + x_4 + 1)^2 − 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1)
    s.t.  0 ≤ x_1, ..., x_4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so f_min is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value is f_min = 0; for each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵Note that 4 − Σ_{i=1}^4 x_i^2 = Σ_{i=1}^4 ( x_i(1 − x_i)^2 + (1 − x_i)(1 + x_i^2) ) ∈ IQ(c_eq, c_in).


Table 9. Computational results for Example 6.9

    order k    w/o LME: lower bound    time       with LME: lower bound    time
    2          -0.0279                 0.2262     -5·10^{-6}               1.1835
    3          -0.0005                 0.4691     -6·10^{-7}               1.6566
    4          -0.0001                 3.1098     -2·10^{-7}               5.5234
    5          -4·10^{-5}              16.5092    -6·10^{-7}               19.7320

For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem (x_0 := 1)

    min_{x∈R^n}   Σ_{0 ≤ i ≤ j ≤ k ≤ n} c_{ijk} x_i x_j x_k
    s.t.  0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have f_c = f_min. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time (in seconds) is shown in Table 10. For each n in the table, we generate 10 random instances, and we report the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their running times differ a bit: we can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

    n           9        10       11       12        13        14
    w/o LME     1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
    with LME    1.9714   3.8288   8.2519   20.0310   37.6373   82.4778

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value f_min is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

    IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k} = Ideal(c_eq, φ)_{2k} + Qmod(c_in, ψ)_{2k}.

If we replace the quadratic module Qmod(c_in, ψ) by the preordering of (c_in, ψ) [21, 25], we can get even tighter relaxations. For convenience, write (c_in, ψ) as a single tuple (g_1, ..., g_ℓ). Its preordering is the set

    Preord(c_in, ψ) = Σ_{r_1, ..., r_ℓ ∈ {0,1}} g_1^{r_1} ··· g_ℓ^{r_ℓ} · Σ[x].


The truncation Preord(c_in, ψ)_{2k} is defined similarly to Qmod(c_in, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)
    f'_{pre,k} := min  ⟨f, y⟩
    s.t.  ⟨1, y⟩ = 1,  L^{(k)}_{c_eq}(y) = 0,  L^{(k)}_{φ}(y) = 0,
          L^{(k)}_{g_1^{r_1} ··· g_ℓ^{r_ℓ}}(y) ⪰ 0  for all r_1, ..., r_ℓ ∈ {0, 1},
          y ∈ R^{N^n_{2k}}.

Similar to (3.9), the dual optimization problem of the above is

(7.2)
    f_{pre,k} := max  γ
    s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Preord(c_in, ψ)_{2k}.

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose K_c ≠ ∅ and Assumption 3.1 holds. Then

    f_{pre,k} = f'_{pre,k} = f_c

for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_{pre,k} = f'_{pre,k} = f_min for all k big enough.

Proof. The proof is very similar to that of Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

    K_3 = { x ∈ R^n | c_eq(x) = 0, φ(x) = 0, c_in(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(c_in, ψ) such that f^{2ℓ} + q ∈ Ideal(c_eq, φ). The rest of the proof is the same. □

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m; hence the Lagrange multiplier λ can be expressed as in (5.3). However, if c is singular, then such an L(x) does not exist. For such cases, how can we express λ in terms of x for the critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound of [16] is used for each application, then a degree bound for L(x) can eventually be obtained. However, the bound so obtained is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

    λ = ( C(x)^T C(x) )^{-1} C(x)^T [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties, and algorithms: An introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on noncompact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, pp. 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.


[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: Selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272, American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: A guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: A MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093

E-mail address: njw@math.ucsd.edu


For all ε > 0, γ = f_c − ε is feasible in (3.9) for the order k_5, so f_{k_5} ≥ f_c. Because of (3.11) and the monotonicity of f_k, we have f_k = f'_k = f_c for all k ≥ k_5. □

3.2. Detecting tightness and extracting minimizers. The optimal value of (3.7) is f_c, and the optimal value of (1.1) is f_min. If f_min is achievable at a critical point, then f_c = f_min. In Theorem 3.3, we have shown that f_k = f_c for all k big enough, where f_k is the optimal value of (3.9). The value f_c or f_min is often not known. How do we detect the tightness f_k = f_c in computation? The flat extension or flat truncation condition [5, 14, 32] can be used for checking tightness. Suppose y* is a minimizer of (3.8) for the order k. Let

(3.16)
    d := ⌈ deg(c_eq, c_in, φ, ψ)/2 ⌉.

If there exists an integer t ∈ [d, k] such that

(3.17)
    rank M_t(y*) = rank M_{t−d}(y*),

then f_k = f_c, and we can get r := rank M_t(y*) minimizers for (3.7) [5, 14, 32]. The method in [14] can be used to extract the minimizers; it was implemented in the software GloptiPoly 3 [13]. Generally, (3.17) can serve as a sufficient and necessary condition for detecting tightness. The case that (3.7) is infeasible (i.e., no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0) can also be detected by solving the relaxations (3.8)-(3.9).
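For intuition on (3.17), a small numerical sketch (not part of the algorithm's statement): for the truncated moment sequence of a single point u, every moment matrix is an outer product and has rank one, so the rank condition holds trivially.

```python
import numpy as np
from itertools import combinations_with_replacement
from numpy.linalg import matrix_rank

def monomials(u, k):
    # values at u of all monomials of degree <= k in len(u) variables
    n = len(u)
    vals = []
    for deg in range(k + 1):
        for combo in combinations_with_replacement(range(n), deg):
            vals.append(np.prod([u[i] for i in combo]))
    return np.array(vals)

u = np.array([0.5, -1.3])                        # an arbitrary point (assumption)
M2 = np.outer(monomials(u, 2), monomials(u, 2))  # moment matrix M_2 of [u]_4
M1 = np.outer(monomials(u, 1), monomials(u, 1))  # its truncation M_1
assert matrix_rank(M2) == matrix_rank(M1) == 1   # flat truncation with d = 1
print("rank M_2 =", matrix_rank(M2), " rank M_1 =", matrix_rank(M1))
```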

Theorem 3.4. Under Assumption 3.1, the relaxations (3.8)-(3.9) have the following properties:

    i) If (3.8) is infeasible for some order k, then no critical points satisfy the constraints c_in ≥ 0, ψ ≥ 0, i.e., (3.7) is infeasible.
    ii) Suppose Assumption 3.2 holds. If (3.7) is infeasible, then the relaxation (3.8) must be infeasible when the order k is big enough.

In the following, assume (3.7) is feasible (i.e., f_c < +∞). Then, for all k big enough, (3.8) has a minimizer y*. Moreover:

    iii) If (3.17) is satisfied for some t ∈ [d, k], then f_k = f_c.
    iv) If Assumption 3.2 holds and (3.7) has finitely many minimizers, then every minimizer y* of (3.8) must satisfy (3.17) for some t ∈ [d, k], when k is big enough.

Proof. By Assumption 3.1, u is a critical point if and only if c_eq(u) = 0 and φ(u) = 0.

i) For every feasible point u of (3.7), the tms [u]_{2k} (see §2 for the notation) is feasible for (3.8), for all k. Therefore, if (3.8) is infeasible for some k, then (3.7) must be infeasible.

ii) By Assumption 3.2, when (3.7) is infeasible, the set

    { x ∈ R^n : c_eq(x) = 0, φ(x) = 0, ρ(x) ≥ 0 }

is empty. It has a single inequality. By the Positivstellensatz [2, Corollary 4.4.3], it holds that −1 ∈ Ideal(c_eq, φ) + Qmod(ρ). By Assumption 3.2,

    Ideal(c_eq, φ) + Qmod(ρ) ⊆ IQ(c_eq, c_in) + IQ(φ, ψ).

Thus, for all k big enough, (3.9) is unbounded from above. Hence, (3.8) must be infeasible, by weak duality.

When (3.7) is feasible, f achieves finitely many values on K_c, so (3.7) must achieve its optimal value f_c. By Theorem 3.3, we know that f_k = f'_k = f_c for all k big enough. For each minimizer u* of (3.7), the tms [u*]_{2k} is a minimizer of (3.8).


iii) If (3.17) holds, we can get r := rank M_t(y*) minimizers for (3.7) [5, 14], say u_1, ..., u_r, such that f_k = f(u_i) for each i. Clearly, f_k = f(u_i) ≥ f_c. On the other hand, we always have f_k ≤ f_c. So f_k = f_c.

iv) By Assumption 3.2, (3.7) is equivalent to the problem

(3.18)
    min   f(x)
    s.t.  c_eq(x) = 0,  φ(x) = 0,  ρ(x) ≥ 0.

The optimal value of (3.18) is also f_c. Its kth order Lasserre relaxation is

(3.19)
    γ'_k := min  ⟨f, y⟩
    s.t.  ⟨1, y⟩ = 1,  M_k(y) ⪰ 0,
          L^{(k)}_{c_eq}(y) = 0,  L^{(k)}_{φ}(y) = 0,  L^{(k)}_{ρ}(y) ⪰ 0.

Its dual optimization problem is

(3.20)
    γ_k := max  γ
    s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Qmod(ρ)_{2k}.

By repeating the same proof as for Theorem 3.3(iii), we can show that

    γ_k = γ'_k = f_c

for all k big enough. Because ρ ∈ Qmod(c_in, ψ), each y feasible for (3.8) is also feasible for (3.19). So, when k is big, each y* is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By [32, Theorem 2.6], the condition (3.17) must be satisfied for some t ∈ [d, k], when k is big enough. □

If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].

4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron
\[
P = \{ x \in \mathbb{R}^n \mid Ax - b \geq 0 \},
\]
where $A = [a_1 \ \cdots \ a_m]^T \in \mathbb{R}^{m \times n}$ and $b = [b_1 \ \cdots \ b_m]^T \in \mathbb{R}^m$. This corresponds to $\mathcal{E} = \emptyset$, $\mathcal{I} = [m]$ and each $c_i(x) = a_i^T x - b_i$. Denote
\[
(4.1) \qquad D(x) = \mathrm{diag}\big(c_1(x), \ldots, c_m(x)\big), \qquad C(x) = \begin{bmatrix} A^T \\ D(x) \end{bmatrix}.
\]
The Lagrange multiplier vector $\lambda = [\lambda_1 \ \cdots \ \lambda_m]^T$ satisfies
\[
(4.2) \qquad \begin{bmatrix} A^T \\ D(x) \end{bmatrix} \lambda = \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}.
\]
If $\mathrm{rank}\, A = m$, we can express $\lambda$ as
\[
(4.3) \qquad \lambda = (A A^T)^{-1} A \nabla f(x).
\]
If $\mathrm{rank}\, A < m$, how can we express $\lambda$ in terms of $x$? In computation, we often prefer a polynomial expression. If there exists $L(x) \in \mathbb{R}[x]^{m \times (n+m)}$ such that
\[
(4.4) \qquad L(x) C(x) = I_m,
\]
then we can get
\[
\lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x) \nabla f(x),
\]

14 JIAWANG NIE

where $L_1(x)$ consists of the first $n$ columns of $L(x)$. In this section, we characterize when such an $L(x)$ exists and give a degree bound for it.
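As a quick numerical illustration of formula (4.3), the sketch below uses hypothetical data (a rank-$2$ matrix `A` and a gradient built to lie in the row space of $A^T$, so that (4.2) is solvable) and recovers the multipliers from the normal equations.

```python
import numpy as np

# Sketch of formula (4.3): when rank(A) = m, the multipliers are
# lambda = (A A^T)^{-1} A grad_f(x).  A and lam_true are hypothetical.
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, -1.0]])       # m = 2 constraints in R^3, rank 2
lam_true = np.array([0.7, -1.3])       # pretend multipliers
grad_f = A.T @ lam_true                # then A^T lambda = grad_f is solvable

lam = np.linalg.solve(A @ A.T, A @ grad_f)   # formula (4.3)
print(np.allclose(lam, lam_true))            # True
```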

The linear function $Ax - b$ is said to be nonsingular if $\mathrm{rank}\, C(u) = m$ for all $u \in \mathbb{C}^n$ (also see Definition 5.1). Equivalently, for every $u$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then $a_{i_1}, \ldots, a_{i_k}$ are linearly independent.

Proposition 4.1. The linear function $Ax - b$ is nonsingular if and only if there exists a matrix polynomial $L(x)$ satisfying (4.4). Moreover, when $Ax - b$ is nonsingular, we can choose $L(x)$ in (4.4) with $\deg(L) \leq m - \mathrm{rank}\, A$.

Proof. Clearly, if (4.4) is satisfied by some $L(x)$, then $\mathrm{rank}\, C(u) \geq m$ for all $u$. This implies that $Ax - b$ is nonsingular.

Next, assume that $Ax - b$ is nonsingular. We show that (4.4) is satisfied by some $L(x) \in \mathbb{R}[x]^{m \times (n+m)}$ with degree $\leq m - \mathrm{rank}\, A$. Let $r = \mathrm{rank}\, A$. Up to a linear coordinate transformation, we can reduce $x$ to an $r$-dimensional variable. Without loss of generality, we can assume that $\mathrm{rank}\, A = n$ and $m \geq n$.

For a subset $I = \{i_1, \ldots, i_{m-n}\}$ of $[m]$, denote
\[
c_I(x) = \prod_{i \in I} c_i(x), \qquad E_I(x) = c_I(x) \cdot \mathrm{diag}\big(c_{i_1}(x)^{-1}, \ldots, c_{i_{m-n}}(x)^{-1}\big),
\]
\[
D_I(x) = \mathrm{diag}\big(c_{i_1}(x), \ldots, c_{i_{m-n}}(x)\big), \qquad A_I = [a_{i_1} \ \cdots \ a_{i_{m-n}}]^T.
\]
For the case that $I = \emptyset$ (the empty set), we set $c_\emptyset(x) = 1$. Let
\[
\mathcal{V} = \big\{ I \subseteq [m] \ : \ |I| = m - n, \ \mathrm{rank}\, A_{[m] \setminus I} = n \big\}.
\]
Step I: For each $I \in \mathcal{V}$, we construct a matrix polynomial $L_I(x)$ such that
\[
(4.5) \qquad L_I(x) C(x) = c_I(x) I_m.
\]

The matrix $L_I = L_I(x)$ satisfying (4.5) can be given by the following $2 \times 3$ block matrix ($L_I(J, K)$ denotes the submatrix whose row indices are from $J$ and whose column indices are from $K$):
\[
(4.6) \qquad
\begin{array}{c|ccc}
J \backslash K & [n] & n + I & n + [m] \setminus I \\ \hline
I & 0 & E_I(x) & 0 \\
[m] \setminus I & c_I(x) \cdot \big(A_{[m] \setminus I}\big)^{-T} & -\big(A_{[m] \setminus I}\big)^{-T} \big(A_I\big)^T E_I(x) & 0
\end{array}
\]
Equivalently, the blocks of $L_I$ are
\[
L_I\big(I, [n]\big) = 0, \qquad L_I\big(I, n + [m] \setminus I\big) = 0, \qquad L_I\big([m] \setminus I, n + [m] \setminus I\big) = 0,
\]
\[
L_I\big(I, n + I\big) = E_I(x), \qquad L_I\big([m] \setminus I, [n]\big) = c_I(x) \big(A_{[m] \setminus I}\big)^{-T},
\]
\[
L_I\big([m] \setminus I, n + I\big) = -\big(A_{[m] \setminus I}\big)^{-T} \big(A_I\big)^T E_I(x).
\]
For each $I \in \mathcal{V}$, $A_{[m] \setminus I}$ is invertible. The superscript $-T$ denotes the inverse of the transpose. Let $G = L_I(x) C(x)$; then one can verify that
\[
G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m-n}, \qquad G(I, [m] \setminus I) = 0,
\]
\[
G([m] \setminus I, [m] \setminus I) = \Big[\, c_I(x) \big(A_{[m] \setminus I}\big)^{-T} \quad -\big(A_{[m] \setminus I}\big)^{-T} A_I^T E_I(x) \,\Big] \begin{bmatrix} \big(A_{[m] \setminus I}\big)^T \\ 0 \end{bmatrix} = c_I(x) I_n,
\]
\[
G([m] \setminus I, I) = \Big[\, c_I(x) \big(A_{[m] \setminus I}\big)^{-T} \quad -\big(A_{[m] \setminus I}\big)^{-T} \big(A_I\big)^T E_I(x) \,\Big] \begin{bmatrix} A_I^T \\ D_I(x) \end{bmatrix} = 0.
\]
This shows that the above $L_I(x)$ satisfies (4.5).


Step II: We show that there exist real scalars $\nu_I$ satisfying
\[
(4.7) \qquad \sum_{I \in \mathcal{V}} \nu_I c_I(x) = 1.
\]
This can be shown by induction on $m$.

$\bullet$ When $m = n$, $\mathcal{V} = \{\emptyset\}$ and $c_\emptyset(x) = 1$, so (4.7) is clearly true.

$\bullet$ When $m > n$, let
\[
(4.8) \qquad N = \{ i \in [m] \ : \ \mathrm{rank}\, A_{[m] \setminus \{i\}} = n \}.
\]
For each $i \in N$, let $\mathcal{V}_i$ be the set of all $I' \subseteq [m] \setminus \{i\}$ such that $|I'| = m - n - 1$ and $\mathrm{rank}\, A_{[m] \setminus (I' \cup \{i\})} = n$. For each $i \in N$, by the assumption, the linear function $A_{[m] \setminus \{i\}}\, x - b_{[m] \setminus \{i\}}$ is nonsingular. By induction, there exist real scalars $\nu^{(i)}_{I'}$ satisfying
\[
(4.9) \qquad \sum_{I' \in \mathcal{V}_i} \nu^{(i)}_{I'} c_{I'}(x) = 1.
\]
Since $\mathrm{rank}\, A = n$, we can generally assume that $\{a_1, \ldots, a_n\}$ is linearly independent. So there exist scalars $\alpha_1, \ldots, \alpha_n$ such that
\[
a_m = \alpha_1 a_1 + \cdots + \alpha_n a_n.
\]
If all $\alpha_i = 0$, then $a_m = 0$, and hence $A$ can be replaced by its first $m - 1$ rows. So (4.7) is true by induction. In the following, suppose at least one $\alpha_i \neq 0$, and write
\[
\{ i \ : \ \alpha_i \neq 0 \} = \{ i_1, \ldots, i_k \}.
\]
Then $a_{i_1}, \ldots, a_{i_k}, a_m$ are linearly dependent. For convenience, set $i_{k+1} = m$. Since $Ax - b$ is nonsingular, the linear system
\[
c_{i_1}(x) = \cdots = c_{i_k}(x) = c_{i_{k+1}}(x) = 0
\]
has no solutions. Hence, there exist real scalars $\mu_1, \ldots, \mu_{k+1}$ such that
\[
\mu_1 c_{i_1}(x) + \cdots + \mu_k c_{i_k}(x) + \mu_{k+1} c_{i_{k+1}}(x) = 1.
\]
This can be seen from the echelon form of an inconsistent linear system. Note that $i_1, \ldots, i_{k+1} \in N$. For each $j = 1, \ldots, k+1$, by (4.9),
\[
\sum_{I' \in \mathcal{V}_{i_j}} \nu^{(i_j)}_{I'} c_{I'}(x) = 1.
\]
Then, we can get
\[
1 = \sum_{j=1}^{k+1} \mu_j c_{i_j}(x) = \sum_{j=1}^{k+1} \mu_j \sum_{I' \in \mathcal{V}_{i_j}} \nu^{(i_j)}_{I'} c_{i_j}(x) c_{I'}(x) = \sum_{\substack{I = I' \cup \{i_j\} \\ I' \in \mathcal{V}_{i_j}, \ 1 \leq j \leq k+1}} \nu^{(i_j)}_{I'} \mu_j c_I(x).
\]
Since each $I' \cup \{i_j\} \in \mathcal{V}$, (4.7) must be satisfied by some scalars $\nu_I$.


Step III: For $L_I(x)$ as in (4.5), we construct $L(x)$ as
\[
(4.10) \qquad L(x) = \sum_{I \in \mathcal{V}} \nu_I L_I(x).
\]
Clearly, $L(x)$ satisfies (4.4), because
\[
L(x) C(x) = \sum_{I \in \mathcal{V}} \nu_I L_I(x) C(x) = \sum_{I \in \mathcal{V}} \nu_I c_I(x) I_m = I_m.
\]
Each $L_I(x)$ has degree $\leq m - n$, so $L(x)$ has degree $\leq m - n$. $\qed$

Proposition 4.1 characterizes when there exists $L(x)$ satisfying (4.4). When it does, a degree bound for $L(x)$ is $m - \mathrm{rank}\, A$. Sometimes, its degree can be smaller than that, as shown in Example 4.3. For given $A$ and $b$, the matrix polynomial $L(x)$ satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of $L(x)$ with $L(x) C(x) = I_m$ for polyhedral sets.
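To make the coefficient-matching idea concrete, here is a small sketch (not from the paper) for the interval $\{x \geq 0,\ 1 - x \geq 0\}$ with $n = 1$, $m = 2$: an affine ansatz $L(x) = L_0 + x L_1$ is determined by imposing $L(x) C(x) = I_2$ at a few sample points. This is a linear system in the unknown coefficients, and since the entries of $L(x) C(x)$ have degree at most $2$, matching at more than $3$ distinct points forces the polynomial identity.

```python
import numpy as np

# Find L(x) = L0 + x*L1 with L(x) C(x) = I_2 for the interval
# {x >= 0, 1 - x >= 0}  (n = 1, m = 2), by matching the identity at
# sample points.
def C(x):
    return np.array([[1.0, -1.0],     # A^T, with a_1 = 1, a_2 = -1
                     [x,    0.0],     # D(x) = diag(x, 1 - x)
                     [0.0, 1.0 - x]])

rows, rhs = [], []
for x in [0.0, 0.3, 0.6, 1.0]:
    Cx = C(x)
    # (L0 + x L1) Cx = I_2 is linear in the 12 unknown entries of L0, L1
    for i in range(2):
        for j in range(2):
            row = np.zeros(12)
            row[3*i:3*i+3] = Cx[:, j]          # coefficients of L0's row i
            row[6+3*i:6+3*i+3] = x * Cx[:, j]  # coefficients of L1's row i
            rows.append(row)
            rhs.append(1.0 if i == j else 0.0)

sol, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
L0, L1 = sol[:6].reshape(2, 3), sol[6:].reshape(2, 3)

x = 0.77                      # check the identity at a fresh point
print(np.allclose((L0 + x * L1) @ C(x), np.eye(2)))   # True
```

The system is consistent here (e.g. $L(x) = [\,1-x \ \ 1 \ \ 1;\ -x \ \ 1 \ \ 1\,]$ is a solution), so least squares returns an exact $L_0, L_1$.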

Example 4.2. Consider the simplicial set
\[
\{ x_1 \geq 0, \ \ldots, \ x_n \geq 0, \ 1 - e^T x \geq 0 \}.
\]
The equation $L(x) C(x) = I_{n+1}$ is satisfied by
\[
L(x) = \begin{bmatrix}
1 - x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
-x_1 & 1 - x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
-x_1 & -x_2 & \cdots & 1 - x_n & 1 & \cdots & 1 \\
-x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1
\end{bmatrix}.
\]
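A quick numerical sanity check of this $L(x)$ (a sketch; the dimension $n = 4$ and the test point are arbitrary):

```python
import numpy as np

# Check Example 4.2 numerically: for the simplicial set
# {x_i >= 0, 1 - e^T x >= 0} in R^n, L(x) C(x) = I_{n+1},
# where C(x) = [A^T; D(x)].
n = 4
rng = np.random.default_rng(0)
x = rng.random(n)

A = np.vstack([np.eye(n), -np.ones((1, n))])      # rows a_i
c = np.concatenate([x, [1.0 - x.sum()]])          # c_i(x) = a_i^T x - b_i
Cx = np.vstack([A.T, np.diag(c)])                 # (2n+1) x (n+1)

# L(x): row i (i <= n) is [e_i^T - x^T, 1, ..., 1]; last row [-x^T, 1, ..., 1]
top = np.hstack([np.eye(n) - np.tile(x, (n, 1)), np.ones((n, n + 1))])
bot = np.hstack([-x, np.ones(n + 1)])
Lx = np.vstack([top, bot])

print(np.allclose(Lx @ Cx, np.eye(n + 1)))   # True
```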

Example 4.3. Consider the box constraint
\[
\{ x_1 \geq 0, \ \ldots, \ x_n \geq 0, \ 1 - x_1 \geq 0, \ \ldots, \ 1 - x_n \geq 0 \}.
\]
The equation $L(x) C(x) = I_{2n}$ is satisfied by
\[
L(x) = \begin{bmatrix}
I_n - \mathrm{diag}(x) & I_n & I_n \\
-\mathrm{diag}(x) & I_n & I_n
\end{bmatrix}.
\]

Example 4.4. Consider the polyhedral set
\[
\{ 1 - x_4 \geq 0, \ x_4 - x_3 \geq 0, \ x_3 - x_2 \geq 0, \ x_2 - x_1 \geq 0, \ x_1 + 1 \geq 0 \}.
\]
The equation $L(x) C(x) = I_5$ is satisfied by
\[
L(x) = \frac{1}{2} \begin{bmatrix}
-x_1 - 1 & -x_2 - 1 & -x_3 - 1 & -x_4 - 1 & 1 & 1 & 1 & 1 & 1 \\
-x_1 - 1 & -x_2 - 1 & -x_3 - 1 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\
-x_1 - 1 & -x_2 - 1 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\
-x_1 - 1 & 1 - x_2 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\
1 - x_1 & 1 - x_2 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}.
\]

Example 4.5. Consider the polyhedral set
\[
\{ 1 + x_1 \geq 0, \ 1 - x_1 \geq 0, \ 2 - x_1 - x_2 \geq 0, \ 2 - x_1 + x_2 \geq 0 \}.
\]
The matrix $L(x)$ satisfying $L(x) C(x) = I_4$ is
\[
\frac{1}{6} \begin{bmatrix}
x_1^2 - 3 x_1 + 2 & x_1 x_2 - x_2 & 4 - x_1 & 2 - x_1 & 1 - x_1 & 1 - x_1 \\
3 x_1^2 - 3 x_1 - 6 & 3 x_2 + 3 x_1 x_2 & 6 - 3 x_1 & -3 x_1 & -3 x_1 - 3 & -3 x_1 - 3 \\
1 - x_1^2 & -x_1 x_2 - 2 x_2 - 3 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2 \\
1 - x_1^2 & 3 - x_1 x_2 - 2 x_2 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2
\end{bmatrix}.
\]


5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers $\lambda_i$ as polynomial functions in $x$ on the set of critical points.

Suppose there are $m$ equality and inequality constraints in total, i.e.,
\[
\mathcal{E} \cup \mathcal{I} = \{1, \ldots, m\}.
\]
If $(x, \lambda)$ is a critical pair, then $\lambda_i c_i(x) = 0$ for all $i \in \mathcal{E} \cup \mathcal{I}$. So the Lagrange multiplier vector $\lambda = [\lambda_1 \ \cdots \ \lambda_m]^T$ satisfies the equation
\[
(5.1) \qquad
\underbrace{\begin{bmatrix}
\nabla c_1(x) & \nabla c_2(x) & \cdots & \nabla c_m(x) \\
c_1(x) & 0 & \cdots & 0 \\
0 & c_2(x) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & c_m(x)
\end{bmatrix}}_{C(x)}
\, \lambda =
\begin{bmatrix} \nabla f(x) \\ 0 \\ \vdots \\ 0 \end{bmatrix}.
\]
Let $C(x)$ be as above. If there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ such that
\[
(5.2) \qquad L(x) C(x) = I_m,
\]
then we can get
\[
(5.3) \qquad \lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x) \nabla f(x),
\]
where $L_1(x)$ consists of the first $n$ columns of $L(x)$. Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such an $L(x)$ exists.

Definition 5.1. The tuple $c = (c_1, \ldots, c_m)$ of constraining polynomials is said to be nonsingular if $\mathrm{rank}\, C(u) = m$ for every $u \in \mathbb{C}^n$.

Clearly, $c$ being nonsingular is equivalent to the following: for each $u \in \mathbb{C}^n$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then the gradients $\nabla c_{i_1}(u), \ldots, \nabla c_{i_k}(u)$ are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple $c$ is nonsingular.
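Definition 5.1 can be illustrated on two univariate tuples (a sketch, with examples not from the paper): $c_1 = x_1^2$ is singular, because $C(u)$ loses rank at $u = 0$, while $c_1 = 1 - x_1^2$ (as in Example 5.3 below) keeps rank $1$ everywhere, since its gradient and value never vanish simultaneously.

```python
import numpy as np

# Illustrating Definition 5.1: for a single constraint c_1 in one
# variable, C(u) = [c_1'(u); c_1(u)].
def C_sq(u):      # c_1 = x_1^2: C(u) = [2u; u^2]
    return np.array([[2 * u], [u**2]])

def C_cube(u):    # c_1 = 1 - x_1^2: C(u) = [-2u; 1 - u^2]
    return np.array([[-2 * u], [1 - u**2]])

print(np.linalg.matrix_rank(C_sq(0.0)))     # 0: rank drops, so singular
print(np.linalg.matrix_rank(C_cube(0.0)))   # 1
print(np.linalg.matrix_rank(C_cube(1.0)))   # 1 (gradient vanishes, value does not)
```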

Proposition 5.2. (i) For $W(x) \in \mathbb{C}[x]^{s \times t}$ with $s \geq t$, we have $\mathrm{rank}\, W(u) = t$ for all $u \in \mathbb{C}^n$ if and only if there exists $P(x) \in \mathbb{C}[x]^{t \times s}$ such that
\[
P(x) W(x) = I_t.
\]
Moreover, for $W(x) \in \mathbb{R}[x]^{s \times t}$, we can choose $P(x) \in \mathbb{R}[x]^{t \times s}$ in the above.

(ii) The constraining polynomial tuple $c$ is nonsingular if and only if there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ satisfying (5.2).

Proof. (i) ``$\Leftarrow$'': If $P(x) W(x) = I_t$, then for all $u \in \mathbb{C}^n$,
\[
t = \mathrm{rank}\, I_t \leq \mathrm{rank}\, W(u) \leq t.
\]
So $W(x)$ must have full column rank everywhere.

``$\Rightarrow$'': Suppose $\mathrm{rank}\, W(u) = t$ for all $u \in \mathbb{C}^n$. Write $W(x)$ in columns:
\[
W(x) = [\, w_1(x) \ \ w_2(x) \ \ \cdots \ \ w_t(x) \,].
\]


Then the equation $w_1(x) = 0$ does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists $\xi_1(x) \in \mathbb{C}[x]^s$ such that $\xi_1(x)^T w_1(x) = 1$. For each $i = 2, \ldots, t$, denote
\[
r_{1i}(x) = \xi_1(x)^T w_i(x);
\]
then (we use $\sim$ to denote row equivalence between matrices)
\[
W(x) \sim \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix}
\sim W_1(x) = \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ 0 & w_2^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix},
\]
where each ($i = 2, \ldots, t$)
\[
w_i^{(1)}(x) = w_i(x) - r_{1i}(x) w_1(x).
\]
So there exists $P_1(x) \in \mathbb{C}[x]^{(s+1) \times s}$ such that
\[
P_1(x) W(x) = W_1(x).
\]
Since $W(x)$ and $W_1(x)$ are row equivalent, $W_1(x)$ must also have full column rank everywhere. Similarly, the polynomial equation
\[
w_2^{(1)}(x) = 0
\]
does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists $\xi_2(x) \in \mathbb{C}[x]^s$ such that
\[
\xi_2(x)^T w_2^{(1)}(x) = 1.
\]
For each $i = 3, \ldots, t$, let $r_{2i}(x) = \xi_2(x)^T w_i^{(1)}(x)$; then
\[
W_1(x) \sim \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & w_2^{(1)}(x) & w_3^{(1)}(x) & \cdots & w_t^{(1)}(x)
\end{bmatrix}
\sim W_2(x) = \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & w_3^{(2)}(x) & \cdots & w_t^{(2)}(x)
\end{bmatrix},
\]
where each ($i = 3, \ldots, t$)
\[
w_i^{(2)}(x) = w_i^{(1)}(x) - r_{2i}(x) w_2^{(1)}(x).
\]
Similarly, $W_1(x)$ and $W_2(x)$ are row equivalent, so $W_2(x)$ has full column rank everywhere. There exists $P_2(x) \in \mathbb{C}[x]^{(s+2) \times (s+1)}$ such that
\[
P_2(x) W_1(x) = W_2(x).
\]
Continuing this process, we can finally get
\[
W_2(x) \sim \cdots \sim W_t(x) = \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & 1 & \cdots & r_{3t}(x) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}.
\]
Consequently, there exist $P_i(x) \in \mathbb{C}[x]^{(s+i) \times (s+i-1)}$, for $i = 1, 2, \ldots, t$, such that
\[
P_t(x) P_{t-1}(x) \cdots P_1(x) W(x) = W_t(x).
\]
Since $W_t(x)$ is a unit upper triangular matrix polynomial, there exists $P_{t+1}(x) \in \mathbb{C}[x]^{t \times (s+t)}$ such that $P_{t+1}(x) W_t(x) = I_t$. Let
\[
P(x) = P_{t+1}(x) P_t(x) P_{t-1}(x) \cdots P_1(x);
\]
then $P(x) W(x) = I_t$. Note that $P(x) \in \mathbb{C}[x]^{t \times s}$.

For $W(x) \in \mathbb{R}[x]^{s \times t}$, we can replace $P(x)$ by $\big( P(x) + \overline{P(x)} \big)/2$ (here $\overline{P(x)}$ denotes the complex conjugate of $P(x)$), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). $\qed$

In Proposition 5.2, there is no explicit degree bound for $L(x)$ satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for $L(x)$, such an $L(x)$ can be determined by comparing coefficients on both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of $L(x)$ satisfying (5.2).

Example 5.3. Consider the hypercube with quadratic constraints
\[
\{ 1 - x_1^2 \geq 0, \ 1 - x_2^2 \geq 0, \ \ldots, \ 1 - x_n^2 \geq 0 \}.
\]
The equation $L(x) C(x) = I_n$ is satisfied by
\[
L(x) = \big[\, -\tfrac{1}{2} \mathrm{diag}(x) \ \ \ I_n \,\big].
\]
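A symbolic check of this identity (a sketch with $n = 3$):

```python
import sympy as sp

# Check Example 5.3 symbolically: for {1 - x_i^2 >= 0},
# L(x) = [-(1/2) diag(x), I_n] satisfies L(x) C(x) = I_n.
n = 3
x = sp.symbols(f'x1:{n+1}')
grads = -2 * sp.diag(*x)                     # columns are grad c_i = -2 x_i e_i
D = sp.diag(*[1 - xi**2 for xi in x])        # diag(c_1, ..., c_n)
C = grads.col_join(D)                        # (2n) x n

L = (-sp.Rational(1, 2) * sp.diag(*x)).row_join(sp.eye(n))   # n x (2n)
print(sp.simplify(L * C) == sp.eye(n))       # True
```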

Example 5.4. Consider the nonnegative portion of the unit sphere
\[
\{ x_1 \geq 0, \ x_2 \geq 0, \ \ldots, \ x_n \geq 0, \ x_1^2 + \cdots + x_n^2 - 1 = 0 \}.
\]
The equation $L(x) C(x) = I_{n+1}$ is satisfied by
\[
L(x) = \begin{bmatrix}
I_n - x x^T & x \mathbf{1}_n^T & 2x \\
\frac{1}{2} x^T & -\frac{1}{2} \mathbf{1}_n^T & -1
\end{bmatrix}.
\]

Example 5.5. Consider the set
\[
\{ 1 - x_1^3 - x_2^4 \geq 0, \ \ 1 - x_3^4 - x_4^3 \geq 0 \}.
\]
The equation $L(x) C(x) = I_2$ is satisfied by
\[
L(x) = \begin{bmatrix}
-\frac{x_1}{3} & -\frac{x_2}{4} & 0 & 0 & 1 & 0 \\
0 & 0 & -\frac{x_3}{4} & -\frac{x_4}{3} & 0 & 1
\end{bmatrix}.
\]

Example 5.6. Consider the quadratic set
\[
\{ 1 - x_1 x_2 - x_2 x_3 - x_1 x_3 \geq 0, \ \ 1 - x_1^2 - x_2^2 - x_3^2 \geq 0 \}.
\]
The matrix $L(x)^T$ satisfying $L(x) C(x) = I_2$ is
\[
\begin{bmatrix}
25 x_1^3 + 10 x_1^2 x_2 + 40 x_1 x_2^2 - 25 x_1 - 2 x_3 & \ -25 x_1^3 - 10 x_1^2 x_2 - 40 x_1 x_2^2 + \frac{49 x_1}{2} + 2 x_3 \\[3pt]
-15 x_1^2 x_2 + 10 x_1 x_2^2 + 20 x_1 x_2 x_3 - 10 x_1 & \ 15 x_1^2 x_2 - 10 x_1 x_2^2 - 20 x_1 x_2 x_3 + 10 x_1 - \frac{x_2}{2} \\[3pt]
25 x_1^2 x_3 - 20 x_1 x_2^2 + 10 x_1 x_2 x_3 + 2 x_1 & \ -25 x_1^2 x_3 + 20 x_1 x_2^2 - 10 x_1 x_2 x_3 - 2 x_1 - \frac{x_3}{2} \\[3pt]
1 - 20 x_1 x_3 - 10 x_1^2 - 20 x_1 x_2 & \ 20 x_1 x_2 + 20 x_1 x_3 + 10 x_1^2 \\[3pt]
-50 x_1^2 - 20 x_1 x_2 & \ 50 x_1^2 + 20 x_1 x_2 + 1
\end{bmatrix}.
\]

We would like to remark that a polynomial tuple $c = (c_1, \ldots, c_m)$ is generically nonsingular, and hence Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees $d_1, \ldots, d_m$, there exists an open dense subset $\mathcal{U}$ of $\mathcal{D} = \mathbb{R}[x]_{d_1} \times \cdots \times \mathbb{R}[x]_{d_m}$ such that every tuple $c = (c_1, \ldots, c_m) \in \mathcal{U}$ is nonsingular. Indeed, such a $\mathcal{U}$ can be chosen as a Zariski open subset of $\mathcal{D}$, i.e., it is the complement of a proper real variety of $\mathcal{D}$. Moreover, Assumption 3.1 holds for all $c \in \mathcal{U}$, i.e., it holds generically.


Proof. The proof needs resultants and discriminants, for which we refer to [29].

First, let $J_1$ be the set of all $(i_1, \ldots, i_{n+1})$ with $1 \leq i_1 < \cdots < i_{n+1} \leq m$. The resultant $\mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}})$ [29, Section 2] is a polynomial in the coefficients of $c_{i_1}, \ldots, c_{i_{n+1}}$ such that, if $\mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}}) \neq 0$, then the equations
\[
c_{i_1}(x) = \cdots = c_{i_{n+1}}(x) = 0
\]
have no complex solutions. Define
\[
F_1(c) = \prod_{(i_1, \ldots, i_{n+1}) \in J_1} \mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}}).
\]
For the case that $m \leq n$, $J_1 = \emptyset$, and we simply let $F_1(c) = 1$. Clearly, if $F_1(c) \neq 0$, then no more than $n$ of the polynomials $c_1, \ldots, c_m$ have a common complex zero.

Second, let $J_2$ be the set of all $(j_1, \ldots, j_k)$ with $k \leq n$ and $1 \leq j_1 < \cdots < j_k \leq m$. When one of $c_{j_1}, \ldots, c_{j_k}$ has degree bigger than one, the discriminant $\Delta(c_{j_1}, \ldots, c_{j_k})$ is a polynomial in the coefficients of $c_{j_1}, \ldots, c_{j_k}$ such that, if $\Delta(c_{j_1}, \ldots, c_{j_k}) \neq 0$, then the equations
\[
c_{j_1}(x) = \cdots = c_{j_k}(x) = 0
\]
have no singular complex solution [29, Section 3], i.e., at every complex common solution $u$, the gradients of $c_{j_1}, \ldots, c_{j_k}$ at $u$ are linearly independent. When all of $c_{j_1}, \ldots, c_{j_k}$ have degree one, the discriminant of the tuple $(c_{j_1}, \ldots, c_{j_k})$ is not a single polynomial, but we can define $\Delta(c_{j_1}, \ldots, c_{j_k})$ to be the product of all maximal minors of its Jacobian (a constant matrix). Define
\[
F_2(c) = \prod_{(j_1, \ldots, j_k) \in J_2} \Delta(c_{j_1}, \ldots, c_{j_k}).
\]
Clearly, if $F_2(c) \neq 0$, then no $n$ or fewer of the polynomials $c_1, \ldots, c_m$ have a singular common complex zero.

Last, let $F(c) = F_1(c) F_2(c)$ and
\[
\mathcal{U} = \{ c = (c_1, \ldots, c_m) \in \mathcal{D} \ : \ F(c) \neq 0 \}.
\]
Note that $\mathcal{U}$ is a Zariski open subset of $\mathcal{D}$, and it is open and dense in $\mathcal{D}$. For all $c \in \mathcal{U}$, no more than $n$ of $c_1, \ldots, c_m$ can have a common complex zero; and for any $k$ polynomials ($k \leq n$) of $c_1, \ldots, c_m$, if they have a common complex zero, say $u$, then their gradients at $u$ must be linearly independent. This means that $c$ is a nonsingular tuple.

Since every $c \in \mathcal{U}$ is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all $c \in \mathcal{U}$, so it holds generically. $\qed$

6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9), with Lagrange multiplier expressions, for solving the optimization problem (1.1). Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0GB of RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.


The polynomials $p_i$ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as $c_1, \ldots, c_m$. First, find a matrix polynomial $L(x)$ satisfying (4.4) or (5.2). Let $L_1(x)$ be the submatrix of $L(x)$ consisting of its first $n$ columns. Then choose $(p_1, \ldots, p_m)$ to be the product $L_1(x) \nabla f(x)$, i.e.,
\[
p_i = \big( L_1(x) \nabla f(x) \big)_i.
\]
In all our examples, the global minimum value $f_{\min}$ of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., the sublevel set $\{ f(x) \leq \ell \}$ is compact for all $\ell$) and the constraint qualification condition holds.
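For instance, with the hypercube constraints of Example 5.3 and an arbitrary illustrative objective (not from the paper), the $p_i$ can be computed symbolically:

```python
import sympy as sp

# Sketch of the p_i construction: with constraints 1 - x_i^2 >= 0 and
# L(x) from Example 5.3, the first n columns give
# L_1(x) = -(1/2) diag(x), so p_i = -(1/2) x_i * df/dx_i.
# The objective below is a hypothetical illustration.
x1, x2 = sp.symbols('x1 x2')
f = x1**4 + x1 * x2 + x2**2

L1 = -sp.Rational(1, 2) * sp.diag(x1, x2)
grad_f = sp.Matrix([sp.diff(f, v) for v in (x1, x2)])
p = sp.expand(L1 * grad_f)
print(p.T)   # the two polynomials p_1, p_2
```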

By Theorem 3.3, we have $f_k = f_{\min}$ for all $k$ big enough, if $f_c = f_{\min}$ and any one of the conditions i)-iii) there holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied in all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and those given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, together with the computational time (in seconds). The results for the standard Lasserre relaxations are titled ``w.o. LME'', and those for the new relaxations (3.8)-(3.9) are titled ``with LME''.

Example 6.1. Consider the optimization problem
\[
\begin{array}{rl}
\min & x_1 x_2 (10 - x_3) \\
\text{s.t.} & x_1 \geq 0, \ x_2 \geq 0, \ x_3 \geq 0, \ 1 - x_1 - x_2 - x_3 \geq 0.
\end{array}
\]
The matrix polynomial $L(x)$ is given in Example 4.2. Since the feasible set is compact, the minimum $f_{\min} = 0$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.$^2$ Each feasible point $(x_1, x_2, x_3)$ with $x_1 x_2 = 0$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are shown in Table 1. They confirm that $f_k = f_{\min}$ for all $k \geq 3$, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

           w.o. LME                with LME
order k    lower bound    time     lower bound    time
   2       -0.0521        0.6841   -0.0521        0.1922
   3       -0.0026        0.2657   -3·10^-8       0.2285
   4       -0.0007        0.6785   -6·10^-9       0.4431
   5       -0.0004        1.6105   -2·10^-9       0.9567

$^2$ Note that $1 - x^T x = (1 - e^T x)(1 + x^T x) + \sum_{i=1}^n x_i (1 - x_i)^2 + \sum_{i \neq j} x_i^2 x_j \in \mathrm{IQ}(c_{eq}, c_{in})$.
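The identity in this footnote can be verified symbolically (a sketch with $n = 3$):

```python
import sympy as sp

# Verify: 1 - x^T x = (1 - e^T x)(1 + x^T x)
#                     + sum_i x_i (1 - x_i)^2 + sum_{i != j} x_i^2 x_j.
n = 3
x = sp.symbols(f'x1:{n+1}')
xx = sum(xi**2 for xi in x)
rhs = (1 - sum(x)) * (1 + xx) \
    + sum(xi * (1 - xi)**2 for xi in x) \
    + sum(x[i]**2 * x[j] for i in range(n) for j in range(n) if i != j)
print(sp.expand(rhs - (1 - xx)))    # 0
```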


Example 6.2. Consider the optimization problem
\[
\begin{array}{rl}
\min & x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3 x_1^2 x_2^2 x_3^2 + (x_1^4 + x_2^4 + x_3^4) \\
\text{s.t.} & x_1^2 + x_2^2 + x_3^2 \geq 1.
\end{array}
\]
The matrix polynomial is $L(x) = \big[\, \frac{1}{2} x_1 \ \ \frac{1}{2} x_2 \ \ \frac{1}{2} x_3 \ \ -1 \,\big]$. The objective $f$ is the sum of the positive definite form $x_1^4 + x_2^4 + x_3^4$ and the Motzkin polynomial
\[
M(x) = x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3 x_1^2 x_2^2 x_3^2.
\]
Note that $M(x)$ is nonnegative everywhere but not SOS [37]. Clearly, $f$ is coercive, so $f_{\min}$ is achieved at a critical point. The set $\mathrm{IQ}(\phi, \psi)$ is archimedean, because
\[
c_1(x) p_1(x) = (x_1^2 + x_2^2 + x_3^2 - 1) \big( 3 M(x) + 2 (x_1^4 + x_2^4 + x_3^4) \big) = 0
\]
defines a compact set. So the condition i) of Theorem 3.3 is satisfied.$^3$ The minimum value $f_{\min} = \frac{1}{3}$, and there are $8$ minimizers $\big( \pm\frac{1}{\sqrt{3}}, \pm\frac{1}{\sqrt{3}}, \pm\frac{1}{\sqrt{3}} \big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are shown in Table 2. They confirm that $f_k = f_{\min}$ for all $k \geq 4$, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

           w.o. LME                 with LME
order k    lower bound     time     lower bound    time
   3       -∞              0.4466   0.1111         0.1169
   4       -∞              0.4948   0.3333         0.3499
   5       -2.1821·10^5    1.1836   0.3333         0.6530

Example 6.3. Consider the optimization problem
\[
\begin{array}{rl}
\min & x_1 x_2 + x_2 x_3 + x_3 x_4 - 3 x_1 x_2 x_3 x_4 + (x_1^3 + \cdots + x_4^3) \\
\text{s.t.} & x_1, x_2, x_3, x_4 \geq 0, \ 1 - x_1 - x_2 \geq 0, \ 1 - x_3 - x_4 \geq 0.
\end{array}
\]
The matrix polynomial $L(x)$ is
\[
\begin{bmatrix}
1 - x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
-x_1 & 1 - x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 - x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & -x_3 & 1 - x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}.
\]
The feasible set is compact, so $f_{\min}$ is achieved at a critical point. One can show that $f_{\min} = 0$ and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because $\mathrm{IQ}(c_{eq}, c_{in})$ is archimedean.$^4$ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are shown in Table 3.

$^3$ This is because $-c_1^2 p_1^2 \in \mathrm{Ideal}(\phi) \subseteq \mathrm{IQ}(\phi, \psi)$ and the set $\{ -c_1(x)^2 p_1(x)^2 \geq 0 \}$ is compact.

$^4$ This is because $1 - x_1^2 - x_2^2$ belongs to the quadratic module of $(x_1, x_2, 1 - x_1 - x_2)$, and $1 - x_3^2 - x_4^2$ belongs to the quadratic module of $(x_3, x_4, 1 - x_3 - x_4)$. See the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

           w.o. LME                  with LME
order k    lower bound     time      lower bound    time
   3       -2.9·10^-5      0.7335    -6·10^-7       0.6091
   4       -1.4·10^-5      2.5055    -8·10^-8       2.7423
   5       -1.4·10^-5      12.7092   -5·10^-8       13.7449

Example 6.4. Consider the polynomial optimization problem
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^2} & x_1^2 + 50 x_2^2 \\
\text{s.t.} & x_1^2 - \frac{1}{2} \geq 0, \ \ x_2^2 - 2 x_1 x_2 - \frac{1}{8} \geq 0, \ \ x_2^2 + 2 x_1 x_2 - \frac{1}{8} \geq 0.
\end{array}
\]
It is motivated by an example in [12, \S 3]. The first column of $L(x)$ is
\[
\begin{bmatrix}
\frac{8 x_1^3}{5} + \frac{x_1}{5} \\[3pt]
\frac{288 x_2 x_1^4}{5} - \frac{16 x_1^3}{5} - \frac{124 x_2 x_1^2}{5} + \frac{8 x_1}{5} - 2 x_2 \\[3pt]
-\frac{288 x_2 x_1^4}{5} - \frac{16 x_1^3}{5} + \frac{124 x_2 x_1^2}{5} + \frac{8 x_1}{5} + 2 x_2
\end{bmatrix},
\]
and the second column of $L(x)$ is
\[
\begin{bmatrix}
-\frac{8 x_1^2 x_2}{5} + \frac{4 x_2^3}{5} - \frac{x_2}{10} \\[3pt]
\frac{288 x_1^3 x_2^2}{5} + \frac{16 x_1^2 x_2}{5} - \frac{142 x_1 x_2^2}{5} - \frac{9 x_1}{20} - \frac{8 x_2^3}{5} + \frac{11 x_2}{5} \\[3pt]
-\frac{288 x_1^3 x_2^2}{5} + \frac{16 x_1^2 x_2}{5} + \frac{142 x_1 x_2^2}{5} + \frac{9 x_1}{20} - \frac{8 x_2^3}{5} + \frac{11 x_2}{5}
\end{bmatrix}.
\]
The objective is coercive, so $f_{\min}$ is achieved at a critical point. The minimum value
\[
f_{\min} = 56 + \tfrac{3}{4} + 25 \sqrt{5} \approx 112.6517,
\]
and the minimizers are $\big( \pm\sqrt{1/2}, \ \pm\big( \sqrt{5/8} + \sqrt{1/2} \big) \big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are shown in Table 4. They confirm that $f_k = f_{\min}$ for all $k \geq 4$, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

           w.o. LME               with LME
order k    lower bound    time    lower bound    time
   3       6.7535         0.4611  56.7500        0.1309
   4       6.9294         0.2428  112.6517       0.2405
   5       8.8519         0.3376  112.6517       0.2167
   6       16.5971        0.4703  112.6517       0.3788
   7       35.4756        0.6536  112.6517       0.4537

Example 6.5. Consider the optimization problem
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^3} & x_1^3 + x_2^3 + x_3^3 + 4 x_1 x_2 x_3 - \big( x_1 (x_2^2 + x_3^2) + x_2 (x_3^2 + x_1^2) + x_3 (x_1^2 + x_2^2) \big) \\
\text{s.t.} & x_1 \geq 0, \ x_1 x_2 - 1 \geq 0, \ x_2 x_3 - 1 \geq 0.
\end{array}
\]
The matrix polynomial $L(x)$ is
\[
\begin{bmatrix}
1 - x_1 x_2 & 0 & 0 & x_2 & x_2 & 0 \\
x_1 & 0 & 0 & -1 & -1 & 0 \\
-x_1 & x_2 & 0 & 1 & 0 & -1
\end{bmatrix}.
\]
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant $\mathbb{R}^3_+$, so the minimum value is achieved at a critical point. In computation, we got $f_{\min} \approx 0.9492$ and a global minimizer $(0.9071, 1.1024, 0.9071)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are shown in Table 5. They confirm that $f_k = f_{\min}$ for all $k \geq 3$, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

           w.o. LME                 with LME
order k    lower bound     time     lower bound    time
   2       -∞              0.4129   -∞             0.1900
   3       -7.8184·10^6    0.4641   0.9492         0.3139
   4       -2.0575·10^4    0.6499   0.9492         0.5057

Example 6.6. Consider the optimization problem (with $x_0 := 1$)
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^4} & x^T x + \sum_{i=0}^4 \prod_{j \neq i} (x_i - x_j) \\
\text{s.t.} & x_1^2 - 1 \geq 0, \ x_2^2 - 1 \geq 0, \ x_3^2 - 1 \geq 0, \ x_4^2 - 1 \geq 0.
\end{array}
\]
The matrix polynomial is $L(x) = \big[\, \frac{1}{2} \mathrm{diag}(x) \ \ -I_4 \,\big]$. The first part of the objective is $x^T x$, while the second part is a nonnegative polynomial [37]. The objective is coercive, so $f_{\min}$ is achieved at a critical point. In computation, we got $f_{\min} = 4.0000$ and $11$ global minimizers:
\[
(1, 1, 1, 1), \ (1, -1, -1, 1), \ (1, -1, 1, -1), \ (1, 1, -1, -1),
\]
\[
(1, -1, -1, -1), \ (-1, -1, 1, 1), \ (-1, 1, -1, 1), \ (-1, 1, 1, -1),
\]
\[
(-1, -1, -1, 1), \ (-1, -1, 1, -1), \ (-1, 1, -1, -1).
\]
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are shown in Table 6. They confirm that $f_k = f_{\min}$ for all $k \geq 4$, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

           w.o. LME                 with LME
order k    lower bound     time     lower bound    time
   3       -∞              1.1377   3.5480         1.1765
   4       -6.6913·10^4    4.7677   4.0000         3.0761
   5       -21.3778        22.9970  4.0000         10.3354

Example 6.7. Consider the optimization problem
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^3} & x_1^4 x_2^2 + x_2^4 x_3^2 + x_3^4 x_1^2 - 3 x_1^2 x_2^2 x_3^2 + x_2^2 \\
\text{s.t.} & x_1 - x_2 x_3 \geq 0, \ \ -x_2 + x_3^2 \geq 0.
\end{array}
\]
The matrix polynomial is
\[
L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}.
\]
By the arithmetic-geometric mean inequality, one can show that $f_{\min} = 0$. The global minimizers are the points $(x_1, 0, x_3)$ with $x_1 \geq 0$ and $x_1 x_3 = 0$. The computational results for the standard Lasserre


relaxations and the new ones (3.8)-(3.9) are shown in Table 7. They confirm that $f_k = f_{\min}$ for all $k \geq 5$, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

           w.o. LME                 with LME
order k    lower bound     time     lower bound    time
   3       -∞              0.6144   -∞             0.3418
   4       -1.0909·10^7    1.0542   -3.9476        0.7180
   5       -942.6772       1.6771   -3·10^-9       1.4607
   6       -0.0110         3.3532   -8·10^-10      3.1618

Example 6.8. Consider the optimization problem
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^4} & x_1^2 (x_1 - x_4)^2 + x_2^2 (x_2 - x_4)^2 + x_3^2 (x_3 - x_4)^2 \\
& \ + \ 2 x_1 x_2 x_3 (x_1 + x_2 + x_3 - 2 x_4) + (x_1 - 1)^2 + (x_2 - 1)^2 + (x_3 - 1)^2 \\
\text{s.t.} & x_1 - x_2 \geq 0, \ \ x_2 - x_3 \geq 0.
\end{array}
\]
The matrix polynomial is
\[
L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}.
\]
In the objective, the sum of the first $4$ terms is a nonnegative form [37], while the sum of the last $3$ terms is a coercive polynomial. The objective is coercive, so $f_{\min}$ is achieved at a critical point. In computation, we got $f_{\min} \approx 0.9413$ and a minimizer
\[
(0.5632, \ 0.5632, \ 0.5632, \ 0.7510).
\]
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are shown in Table 8. They confirm that $f_k = f_{\min}$ for all $k \geq 3$, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

           w.o. LME                 with LME
order k    lower bound     time     lower bound    time
   2       -∞              0.3984   -0.3360        0.9321
   3       -∞              0.7634   0.9413         0.5240
   4       -6.4896·10^5    4.5496   0.9413         1.7192
   5       -3.1645·10^3    24.3665  0.9413         8.1228

Example 6.9. Consider the optimization problem
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^4} & (x_1 + x_2 + x_3 + x_4 + 1)^2 - 4 (x_1 x_2 + x_2 x_3 + x_3 x_4 + x_4 + x_1) \\
\text{s.t.} & 0 \leq x_1, \ldots, x_4 \leq 1.
\end{array}
\]
The matrix $L(x)$ is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so $f_{\min}$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.$^5$ The minimum value $f_{\min} = 0$. For each $t \in [0, 1]$, the point $(t, 0, 0, 1 - t)$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are shown in Table 9.

$^5$ Note that $4 - \sum_{i=1}^4 x_i^2 = \sum_{i=1}^4 \big( x_i (1 - x_i)^2 + (1 - x_i)(1 + x_i^2) \big) \in \mathrm{IQ}(c_{eq}, c_{in})$.


Table 9. Computational results for Example 6.9

           w.o. LME                with LME
order k    lower bound    time     lower bound    time
   2       -0.0279        0.2262   -5·10^-6       1.1835
   3       -0.0005        0.4691   -6·10^-7       1.6566
   4       -0.0001        3.1098   -2·10^-7       5.5234
   5       -4·10^-5       16.5092  -6·10^-7       19.7320

For some polynomial optimization problems, the standard Lasserre relaxations may converge fast; e.g., the lowest order relaxation is often already tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem (with $x_0 := 1$)
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^n} & \sum_{0 \leq i \leq j \leq k \leq n} c_{ijk} \, x_i x_j x_k \\
\text{s.t.} & 0 \leq x \leq 1,
\end{array}
\]
where each coefficient $c_{ijk}$ is randomly generated (by randn in MATLAB). The matrix $L(x)$ is the same as in Example 4.3. Since the feasible set is compact, we always have $f_c = f_{\min}$. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order $k = 2$, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each $n$ in the table, we generate $10$ random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order $k = 2$, while their consumed time differs a bit. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

    n      9        10       11       12        13        14
w.o. LME   1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
with LME   1.9714   3.8288   8.2519   20.0310   37.6373   82.4778

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value $f_{\min}$ is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that
\[
\mathrm{IQ}(c_{eq}, c_{in})_{2k} + \mathrm{IQ}(\phi, \psi)_{2k} = \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(c_{in}, \psi)_{2k}.
\]
If we replace the quadratic module $\mathrm{Qmod}(c_{in}, \psi)$ by the preordering of $(c_{in}, \psi)$ [21, 25], we can get even tighter relaxations. For convenience, write $(c_{in}, \psi)$ as a single tuple $(g_1, \ldots, g_\ell)$. Its preordering is the set
\[
\mathrm{Preord}(c_{in}, \psi) = \sum_{r_1, \ldots, r_\ell \in \{0, 1\}} g_1^{r_1} \cdots g_\ell^{r_\ell} \cdot \Sigma[x].
\]


The truncation $\mathrm{Preord}(c_{in}, \psi)_{2k}$ is defined similarly to $\mathrm{Qmod}(c_{in}, \psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same order $k$, is
\[
(7.1) \qquad
\begin{array}{rl}
f^{\prime}_{pre,k} = \min & \langle f, y \rangle \\
\text{s.t.} & \langle 1, y \rangle = 1, \ L^{(k)}_{c_{eq}}(y) = 0, \ L^{(k)}_{\phi}(y) = 0, \\
& L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \ \ \forall \, r_1, \ldots, r_\ell \in \{0, 1\}, \\
& y \in \mathbb{R}^{\mathbb{N}^n_{2k}}.
\end{array}
\]
Similar to (3.9), the dual optimization problem of the above is
\[
(7.2) \qquad
\begin{array}{rl}
f_{pre,k} = \max & \gamma \\
\text{s.t.} & f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Preord}(c_{in}, \psi)_{2k}.
\end{array}
\]
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose $\mathcal{K}_c \neq \emptyset$ and Assumption 3.1 holds. Then
\[
f_{pre,k} = f^{\prime}_{pre,k} = f_c
\]
for all $k$ sufficiently large. Therefore, if the minimum value $f_{\min}$ of (1.1) is achieved at a critical point, then $f_{pre,k} = f^{\prime}_{pre,k} = f_{\min}$ for all $k$ big enough.

Proof. The proof is very similar to that of Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have $f(x) \equiv 0$ on the set
\[
\mathcal{K}_3 = \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ c_{in}(x) \geq 0, \ \psi(x) \geq 0 \}.
\]
By the Positivstellensatz, there exist an integer $\ell > 0$ and $q \in \mathrm{Preord}(c_{in}, \psi)$ such that $f^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi)$. The rest of the proof is the same. $\qed$

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple $c$ of constraining polynomials is nonsingular, then there exists a matrix polynomial $L(x)$ such that $L(x) C(x) = I_m$. Hence, the Lagrange multiplier $\lambda$ can be expressed as in (5.3). However, if $c$ is singular, then such an $L(x)$ does not exist. For such cases, how can we express $\lambda$ in terms of $x$ for the critical pairs $(x, \lambda)$? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for L(x). For a nonsingular tuple $c$ of constraining polynomials, what is a good degree bound for $L(x)$ in Proposition 5.2? When $c$ is linear, a degree bound is given in Proposition 4.1. However, for nonlinear $c$, an explicit degree bound is not known. Theoretically, we can get a degree bound for $L(x)$: in the proof of Proposition 5.2, Hilbert's Weak Nullstellensatz is used $t$ times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is used for each application, then a degree bound for $L(x)$ can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for $L(x)$.


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector $\lambda$ is determined by a linear equation. Naturally, one can get
\[
\lambda = \big( C(x)^T C(x) \big)^{-1} C(x)^T \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}
\]
when $C(x)$ has full column rank. This rational representation is expensive in practice, because its denominator is typically a polynomial of high degree. However, $\lambda$ might admit rational representations other than the above. Can we find a rational representation whose numerator and denominator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
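As a quick numerical illustration (not from the paper), the least-squares formula above can be evaluated at a point. The toy problem below, with a single equality constraint, is a hypothetical example; with $m = 1$, the matrix $C(x)$ is a single column, so the formula reduces to a ratio of dot products.

```python
# Sketch: evaluate lambda = (C^T C)^{-1} C^T [grad f(x); 0] for the toy problem
#   min x1^2 + x2^2  s.t.  1 - x1 - x2 = 0   (hypothetical example data).
# Here C(x) stacks the constraint gradient (-1, -1) on top of the constraint
# value c(x), so at a point x it is the column vector [-1, -1, c(x)]^T.

def multiplier(x1, x2):
    c = 1.0 - x1 - x2                      # constraint value c(x)
    C = [-1.0, -1.0, c]                    # C(x) as a single column (m = 1)
    g = [2.0 * x1, 2.0 * x2, 0.0]          # right-hand side [grad f(x); 0]
    # least-squares solution of C * lam = g when C is one column:
    return sum(ci * gi for ci, gi in zip(C, g)) / sum(ci * ci for ci in C)

# At the minimizer (1/2, 1/2): grad f = (1, 1) = lam * grad c with
# grad c = (-1, -1), so the multiplier is lam = -1.
print(multiplier(0.5, 0.5))   # -1.0
```

Away from the critical points, the entry $c(x)$ of $C(x)$ is nonzero, which is what makes the closed-form expression rational in $x$ rather than polynomial.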

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra, third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2008), no. 2, pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci. 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (2001), no. 3, pp. 796–817.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.


[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (M. Putinar and S. Sullivant, eds.), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science (S. Basu and L. Gonzalez-Vega, eds.), Vol. 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99, AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. Contemp. Math., Vol. 253, pp. 251–272, American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar and S. Sullivant, eds.), IMA Volumes Math. Appl. 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA

E-mail address: njw@math.ucsd.edu



iii) If (3.17) holds, we can get $r = \mathrm{rank}\, M_t(y^*)$ minimizers for (3.7) [5, 14], say $u_1, \ldots, u_r$, such that $f_k = f(u_i)$ for each $i$. Clearly, $f_k = f(u_i) \geq f_c$. On the other hand, we always have $f_k \leq f_c$. So $f_k = f_c$.

iv) By Assumption 3.2, (3.7) is equivalent to the problem
\[
(3.18)\quad \begin{array}{cl} \min & f(x) \\ \mathrm{s.t.} & c_{eq}(x) = 0,\ \phi(x) = 0,\ \rho(x) \geq 0. \end{array}
\]
The optimal value of (3.18) is also $f_c$. Its $k$th order Lasserre relaxation is
\[
(3.19)\quad \begin{array}{cl} \gamma_k' = \min & \langle f, y \rangle \\ \mathrm{s.t.} & \langle 1, y \rangle = 1,\ M_k(y) \succeq 0, \\ & L^{(k)}_{c_{eq}}(y) = 0,\ L^{(k)}_{\phi}(y) = 0,\ L^{(k)}_{\rho}(y) \succeq 0. \end{array}
\]
Its dual optimization problem is
\[
(3.20)\quad \begin{array}{cl} \gamma_k = \max & \gamma \\ \mathrm{s.t.} & f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(\rho)_{2k}. \end{array}
\]
By repeating the same proof as for Theorem 3.3(iii), we can show that
\[
\gamma_k = \gamma_k' = f_c
\]
for all $k$ big enough. Because $\rho \in \mathrm{Qmod}(c_{in}, \psi)$, each $y$ feasible for (3.8) is also feasible for (3.19). So, when $k$ is big, each $y^*$ is also a minimizer of (3.19). The problem (3.18) also has finitely many minimizers. By Theorem 2.6 of [32], the condition (3.17) must be satisfied for some $t \in [d, k]$ when $k$ is big enough. $\square$

If (3.7) has infinitely many minimizers, then the condition (3.17) is typically not satisfied. We refer to [25, §6.6].

4. Polyhedral constraints

In this section, we assume the feasible set of (1.1) is the polyhedron
\[
P = \{ x \in \mathbb{R}^n \mid Ax - b \geq 0 \},
\]
where $A = [a_1\ \cdots\ a_m]^T \in \mathbb{R}^{m \times n}$ and $b = [b_1\ \cdots\ b_m]^T \in \mathbb{R}^m$. This corresponds to the case that $\mathcal{E} = \emptyset$, $\mathcal{I} = [m]$, and each $c_i(x) = a_i^T x - b_i$. Denote
\[
(4.1)\quad D(x) = \mathrm{diag}\big( c_1(x), \ldots, c_m(x) \big), \qquad C(x) = \begin{bmatrix} A^T \\ D(x) \end{bmatrix}.
\]

The Lagrange multiplier vector $\lambda = [\lambda_1\ \cdots\ \lambda_m]^T$ satisfies
\[
(4.2)\quad \begin{bmatrix} A^T \\ D(x) \end{bmatrix} \lambda = \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}.
\]
If $\mathrm{rank}\, A = m$, we can express $\lambda$ as
\[
(4.3)\quad \lambda = (A A^T)^{-1} A \nabla f(x).
\]
If $\mathrm{rank}\, A < m$, how can we express $\lambda$ in terms of $x$? In computation, we often prefer a polynomial expression. If there exists $L(x) \in \mathbb{R}[x]^{m \times (n+m)}$ such that
\[
(4.4)\quad L(x) C(x) = I_m,
\]
then we can get
\[
\lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x) \nabla f(x),
\]


where $L_1(x)$ consists of the first $n$ columns of $L(x)$. In this section, we characterize when such $L(x)$ exists and give a degree bound for it.

The linear function $Ax - b$ is said to be nonsingular if $\mathrm{rank}\, C(u) = m$ for all $u \in \mathbb{C}^n$ (also see Definition 5.1). This is equivalent to the condition that, for every $u$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then $a_{i_1}, \ldots, a_{i_k}$ are linearly independent.

Proposition 4.1. The linear function $Ax - b$ is nonsingular if and only if there exists a matrix polynomial $L(x)$ satisfying (4.4). Moreover, when $Ax - b$ is nonsingular, we can choose $L(x)$ in (4.4) with $\deg(L) \leq m - \mathrm{rank}\, A$.

Proof. Clearly, if (4.4) is satisfied by some $L(x)$, then $\mathrm{rank}\, C(u) \geq m$ for all $u$. This implies that $Ax - b$ is nonsingular.

Next, assume that $Ax - b$ is nonsingular. We show that (4.4) is satisfied by some $L(x) \in \mathbb{R}[x]^{m \times (n+m)}$ with degree $\leq m - \mathrm{rank}\, A$. Let $r = \mathrm{rank}\, A$. Up to a linear coordinate transformation, we can reduce $x$ to an $r$-dimensional variable. Without loss of generality, we can assume that $\mathrm{rank}\, A = n$ and $m \geq n$.

For a subset $I = \{i_1, \ldots, i_{m-n}\}$ of $[m]$, denote
\[
c_I(x) = \prod_{i \in I} c_i(x), \qquad E_I(x) = c_I(x) \cdot \mathrm{diag}\big( c_{i_1}(x)^{-1}, \ldots, c_{i_{m-n}}(x)^{-1} \big),
\]
\[
D_I(x) = \mathrm{diag}\big( c_{i_1}(x), \ldots, c_{i_{m-n}}(x) \big), \qquad A_I = [a_{i_1}\ \cdots\ a_{i_{m-n}}]^T.
\]
For the case that $I = \emptyset$ (the empty set), we set $c_\emptyset(x) = 1$. Let
\[
V = \big\{ I \subseteq [m] \,:\, |I| = m - n,\ \mathrm{rank}\, A_{[m]\setminus I} = n \big\}.
\]
Step I: For each $I \in V$, we construct a matrix polynomial $L_I(x)$ such that
\[
(4.5)\quad L_I(x) C(x) = c_I(x) I_m.
\]

The matrix $L_I = L_I(x)$ satisfying (4.5) can be given by the following $2 \times 3$ block matrix ($L_I(J, K)$ denotes the submatrix whose row indices are from $J$ and whose column indices are from $K$):
\[
(4.6)\quad
\begin{array}{c|ccc}
J \backslash K & [n] & n + I & n + [m]\setminus I \\ \hline
I & 0 & E_I(x) & 0 \\
[m]\setminus I & c_I(x) \cdot \big(A_{[m]\setminus I}\big)^{-T} & -\big(A_{[m]\setminus I}\big)^{-T} \big(A_I\big)^T E_I(x) & 0
\end{array}
\]
Equivalently, the blocks of $L_I$ are
\[
L_I\big(I, [n]\big) = 0, \qquad L_I\big(I, n + [m]\setminus I\big) = 0, \qquad L_I\big([m]\setminus I, n + [m]\setminus I\big) = 0,
\]
\[
L_I\big(I, n + I\big) = E_I(x), \qquad L_I\big([m]\setminus I, [n]\big) = c_I(x)\big(A_{[m]\setminus I}\big)^{-T},
\]
\[
L_I\big([m]\setminus I, n + I\big) = -\big(A_{[m]\setminus I}\big)^{-T} \big(A_I\big)^T E_I(x).
\]
For each $I \in V$, $A_{[m]\setminus I}$ is invertible. The superscript $-T$ denotes the inverse of the transpose. Let $G = L_I(x) C(x)$; then one can verify that
\[
G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m-n}, \qquad G(I, [m]\setminus I) = 0,
\]
\[
G([m]\setminus I, [m]\setminus I) = \Big[ c_I(x)\big(A_{[m]\setminus I}\big)^{-T} \ \ -\big(A_{[m]\setminus I}\big)^{-T} A_I^T E_I(x) \Big] \begin{bmatrix} \big(A_{[m]\setminus I}\big)^T \\ 0 \end{bmatrix} = c_I(x) I_n,
\]
\[
G([m]\setminus I, I) = \Big[ c_I(x)\big(A_{[m]\setminus I}\big)^{-T} \ \ -\big(A_{[m]\setminus I}\big)^{-T} \big(A_I\big)^T E_I(x) \Big] \begin{bmatrix} A_I^T \\ D_I(x) \end{bmatrix} = 0.
\]
This shows that the above $L_I(x)$ satisfies (4.5).


Step II: We show that there exist real scalars $\nu_I$ satisfying
\[
(4.7)\quad \sum_{I \in V} \nu_I c_I(x) = 1.
\]
This can be shown by induction on $m$.

$\bullet$ When $m = n$, $V = \{\emptyset\}$ and $c_\emptyset(x) = 1$, so (4.7) is clearly true.

$\bullet$ When $m > n$, let
\[
(4.8)\quad N = \{ i \in [m] \mid \mathrm{rank}\, A_{[m]\setminus\{i\}} = n \}.
\]
For each $i \in N$, let $V_i$ be the set of all $I' \subseteq [m]\setminus\{i\}$ such that $|I'| = m - n - 1$ and $\mathrm{rank}\, A_{[m]\setminus(I' \cup \{i\})} = n$. For each $i \in N$, by the assumption, the linear function $A_{[m]\setminus\{i\}} x - b_{[m]\setminus\{i\}}$ is nonsingular. By induction, there exist real scalars $\nu^{(i)}_{I'}$ satisfying
\[
(4.9)\quad \sum_{I' \in V_i} \nu^{(i)}_{I'} c_{I'}(x) = 1.
\]
Since $\mathrm{rank}\, A = n$, we can generally assume that $\{a_1, \ldots, a_n\}$ is linearly independent. So there exist scalars $\alpha_1, \ldots, \alpha_n$ such that
\[
a_m = \alpha_1 a_1 + \cdots + \alpha_n a_n.
\]
If all $\alpha_i = 0$, then $a_m = 0$, and hence $A$ can be replaced by its first $m-1$ rows; so (4.7) is true by the induction. In the following, suppose at least one $\alpha_i \neq 0$, and write
\[
\{ i : \alpha_i \neq 0 \} = \{ i_1, \ldots, i_k \}.
\]
Then $a_{i_1}, \ldots, a_{i_k}, a_m$ are linearly dependent. For convenience, set $i_{k+1} = m$. Since $Ax - b$ is nonsingular, the linear system
\[
c_{i_1}(x) = \cdots = c_{i_k}(x) = c_{i_{k+1}}(x) = 0
\]
has no solutions. Hence, there exist real scalars $\mu_1, \ldots, \mu_{k+1}$ such that
\[
\mu_1 c_{i_1}(x) + \cdots + \mu_k c_{i_k}(x) + \mu_{k+1} c_{i_{k+1}}(x) = 1.
\]
(This is implied by the echelon form for inconsistent linear systems.) Note that $i_1, \ldots, i_{k+1} \in N$. For each $j = 1, \ldots, k+1$, by (4.9),
\[
\sum_{I' \in V_{i_j}} \nu^{(i_j)}_{I'} c_{I'}(x) = 1.
\]
Then we can get
\[
1 = \sum_{j=1}^{k+1} \mu_j c_{i_j}(x) = \sum_{j=1}^{k+1} \mu_j \sum_{I' \in V_{i_j}} \nu^{(i_j)}_{I'} c_{i_j}(x) c_{I'}(x) = \sum_{\substack{I = I' \cup \{i_j\} \\ I' \in V_{i_j},\ 1 \leq j \leq k+1}} \nu^{(i_j)}_{I'} \mu_j c_I(x).
\]
Since each $I' \cup \{i_j\} \in V$, (4.7) must be satisfied by some scalars $\nu_I$.


Step III: For $L_I(x)$ as in (4.5), we construct $L(x)$ as
\[
(4.10)\quad L(x) = \sum_{I \in V} \nu_I L_I(x).
\]
Clearly, $L(x)$ satisfies (4.4), because
\[
L(x) C(x) = \sum_{I \in V} \nu_I L_I(x) C(x) = \sum_{I \in V} \nu_I c_I(x) I_m = I_m.
\]
Each $L_I(x)$ has degree $\leq m - n$, so $L(x)$ has degree $\leq m - n$. $\square$

Proposition 4.1 characterizes when there exists $L(x)$ satisfying (4.4). When it does, a degree bound for $L(x)$ is $m - \mathrm{rank}\, A$. Sometimes the degree can be smaller than that, as shown in Example 4.3. For given $A, b$, the matrix polynomial $L(x)$ satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of $L(x)C(x) = I_m$ for polyhedral sets.

Example 4.2. Consider the simplicial set
\[
x_1 \geq 0, \ \ldots, \ x_n \geq 0, \quad 1 - e^T x \geq 0.
\]
The equation $L(x) C(x) = I_{n+1}$ is satisfied by
\[
L(x) = \begin{bmatrix}
1 - x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
-x_1 & 1 - x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
-x_1 & -x_2 & \cdots & 1 - x_n & 1 & \cdots & 1 \\
-x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1
\end{bmatrix}.
\]

Example 4.3. Consider the box constraint
\[
x_1 \geq 0, \ \ldots, \ x_n \geq 0, \quad 1 - x_1 \geq 0, \ \ldots, \ 1 - x_n \geq 0.
\]
The equation $L(x) C(x) = I_{2n}$ is satisfied by
\[
L(x) = \begin{bmatrix}
I_n - \mathrm{diag}(x) & I_n & I_n \\
-\mathrm{diag}(x) & I_n & I_n
\end{bmatrix}.
\]

Example 4.4. Consider the polyhedral set
\[
1 - x_4 \geq 0, \ x_4 - x_3 \geq 0, \ x_3 - x_2 \geq 0, \ x_2 - x_1 \geq 0, \ x_1 + 1 \geq 0.
\]
The equation $L(x) C(x) = I_5$ is satisfied by
\[
L(x) = \frac{1}{2}\begin{bmatrix}
-x_1 - 1 & -x_2 - 1 & -x_3 - 1 & -x_4 - 1 & 1 & 1 & 1 & 1 & 1 \\
-x_1 - 1 & -x_2 - 1 & -x_3 - 1 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\
-x_1 - 1 & -x_2 - 1 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\
-x_1 - 1 & 1 - x_2 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\
1 - x_1 & 1 - x_2 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}.
\]

Example 4.5. Consider the polyhedral set
\[
1 + x_1 \geq 0, \ 1 - x_1 \geq 0, \ 2 - x_1 - x_2 \geq 0, \ 2 - x_1 + x_2 \geq 0.
\]
The matrix $L(x)$ satisfying $L(x) C(x) = I_4$ is
\[
\frac{1}{6}\begin{bmatrix}
x_1^2 - 3x_1 + 2 & x_1 x_2 - x_2 & 4 - x_1 & 2 - x_1 & 1 - x_1 & 1 - x_1 \\
3x_1^2 - 3x_1 - 6 & 3x_1 x_2 + 3x_2 & 6 - 3x_1 & -3x_1 & -3x_1 - 3 & -3x_1 - 3 \\
1 - x_1^2 & -x_1 x_2 - 2x_2 - 3 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2 \\
1 - x_1^2 & -x_1 x_2 - 2x_2 + 3 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2
\end{bmatrix}.
\]


5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers $\lambda_i$ as polynomial functions in $x$ on the set of critical points.

Suppose there are $m$ equality and inequality constraints in total, i.e.,
\[
\mathcal{E} \cup \mathcal{I} = \{1, \ldots, m\}.
\]
If $(x, \lambda)$ is a critical pair, then $\lambda_i c_i(x) = 0$ for all $i \in \mathcal{E} \cup \mathcal{I}$. So the Lagrange multiplier vector $\lambda = [\lambda_1\ \cdots\ \lambda_m]^T$ satisfies the equation
\[
(5.1)\quad
\underbrace{\begin{bmatrix}
\nabla c_1(x) & \nabla c_2(x) & \cdots & \nabla c_m(x) \\
c_1(x) & 0 & \cdots & 0 \\
0 & c_2(x) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & c_m(x)
\end{bmatrix}}_{C(x)}
\lambda =
\begin{bmatrix} \nabla f(x) \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}.
\]
Let $C(x)$ be as above. If there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ such that
\[
(5.2)\quad L(x) C(x) = I_m,
\]
then we can get
\[
(5.3)\quad \lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x) \nabla f(x),
\]
where $L_1(x)$ consists of the first $n$ columns of $L(x)$. Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such $L(x)$ exists.

Definition 5.1. The tuple $c = (c_1, \ldots, c_m)$ of constraining polynomials is said to be nonsingular if $\mathrm{rank}\, C(u) = m$ for every $u \in \mathbb{C}^n$.

Clearly, $c$ being nonsingular is equivalent to the condition that, for each $u \in \mathbb{C}^n$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then the gradients $\nabla c_{i_1}(u), \ldots, \nabla c_{i_k}(u)$ are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple $c$ is nonsingular.

Proposition 5.2. (i) For $W(x) \in \mathbb{C}[x]^{s \times t}$ with $s \geq t$, we have $\mathrm{rank}\, W(u) = t$ for all $u \in \mathbb{C}^n$ if and only if there exists $P(x) \in \mathbb{C}[x]^{t \times s}$ such that
\[
P(x) W(x) = I_t.
\]
Moreover, for $W(x) \in \mathbb{R}[x]^{s \times t}$, we can choose $P(x) \in \mathbb{R}[x]^{t \times s}$ in the above.
(ii) The constraining polynomial tuple $c$ is nonsingular if and only if there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ satisfying (5.2).

Proof. (i) "$\Leftarrow$": If $P(x) W(x) = I_t$, then for all $u \in \mathbb{C}^n$,
\[
t = \mathrm{rank}\, I_t \leq \mathrm{rank}\, W(u) \leq t.
\]
So $W(x)$ must have full column rank everywhere.

"$\Rightarrow$": Suppose $\mathrm{rank}\, W(u) = t$ for all $u \in \mathbb{C}^n$. Write $W(x)$ in columns:
\[
W(x) = [w_1(x)\ \ w_2(x)\ \ \cdots\ \ w_t(x)].
\]


Then the equation $w_1(x) = 0$ does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists $\xi_1(x) \in \mathbb{C}[x]^s$ such that $\xi_1(x)^T w_1(x) = 1$. For each $i = 2, \ldots, t$, denote
\[
r_{1i}(x) = \xi_1(x)^T w_i(x);
\]
then (use $\sim$ to denote row equivalence between matrices)
\[
W(x) \sim \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix}
\sim W_1(x) = \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ 0 & w_2^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix},
\]
where, for each $i = 2, \ldots, t$,
\[
w_i^{(1)}(x) = w_i(x) - r_{1i}(x) w_1(x).
\]
So there exists $P_1(x) \in \mathbb{C}[x]^{(s+1) \times s}$ such that
\[
P_1(x) W(x) = W_1(x).
\]
Since $W(x)$ and $W_1(x)$ are row equivalent, $W_1(x)$ must also have full column rank everywhere. Similarly, the polynomial equation
\[
w_2^{(1)}(x) = 0
\]
does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists $\xi_2(x) \in \mathbb{C}[x]^s$ such that
\[
\xi_2(x)^T w_2^{(1)}(x) = 1.
\]
For each $i = 3, \ldots, t$, let $r_{2i}(x) = \xi_2(x)^T w_i^{(1)}(x)$; then
\[
W_1(x) \sim \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & w_2^{(1)}(x) & w_3^{(1)}(x) & \cdots & w_t^{(1)}(x)
\end{bmatrix}
\sim W_2(x) = \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & w_3^{(2)}(x) & \cdots & w_t^{(2)}(x)
\end{bmatrix},
\]
where, for each $i = 3, \ldots, t$,
\[
w_i^{(2)}(x) = w_i^{(1)}(x) - r_{2i}(x) w_2^{(1)}(x).
\]
Similarly, $W_1(x)$ and $W_2(x)$ are row equivalent, so $W_2(x)$ has full column rank everywhere. There exists $P_2(x) \in \mathbb{C}[x]^{(s+2) \times (s+1)}$ such that
\[
P_2(x) W_1(x) = W_2(x).
\]
Continuing this process, we can finally get
\[
W_2(x) \sim \cdots \sim W_t(x) = \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & 1 & \cdots & r_{3t}(x) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}.
\]
Consequently, there exist $P_i(x) \in \mathbb{C}[x]^{(s+i) \times (s+i-1)}$, for $i = 1, 2, \ldots, t$, such that
\[
P_t(x) P_{t-1}(x) \cdots P_1(x) W(x) = W_t(x).
\]


Since $W_t(x)$ is a unit upper triangular matrix polynomial, there exists $P_{t+1}(x) \in \mathbb{C}[x]^{t \times (s+t)}$ such that $P_{t+1}(x) W_t(x) = I_t$. Let
\[
P(x) = P_{t+1}(x) P_t(x) P_{t-1}(x) \cdots P_1(x);
\]
then $P(x) W(x) = I_t$. Note that $P(x) \in \mathbb{C}[x]^{t \times s}$.

For $W(x) \in \mathbb{R}[x]^{s \times t}$, we can replace $P(x)$ by $\big( P(x) + \overline{P(x)} \big)/2$ (here $\overline{P(x)}$ denotes the complex conjugate of $P(x)$), which is a real matrix polynomial.

(ii) The conclusion is implied directly by the item (i). $\square$

In Proposition 5.2, there is no explicit degree bound for $L(x)$ satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for $L(x)$, its coefficients can be determined by comparing the coefficients on both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of $L(x)$ satisfying (5.2).

Example 5.3. Consider the hypercube with quadratic constraints
\[
1 - x_1^2 \geq 0, \ 1 - x_2^2 \geq 0, \ \ldots, \ 1 - x_n^2 \geq 0.
\]
The equation $L(x) C(x) = I_n$ is satisfied by
\[
L(x) = \big[ -\tfrac{1}{2}\mathrm{diag}(x) \ \ I_n \big].
\]
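Here the identity can be checked coordinatewise, since both blocks of $C(x)$ are diagonal: the $(i,i)$ entry of $L(x)C(x)$ is $(-x_i/2)(-2x_i) + (1 - x_i^2) = 1$, and the off-diagonal entries vanish. A tiny check (not from the paper):

```python
# Check of the scalar identity behind Example 5.3: for c_i = 1 - x_i^2,
# the i-th diagonal entry of L(x)C(x) is (-x_i/2) * c_i'(x_i) + 1 * c_i(x_i).
def lc_diagonal_entry(xi):
    grad = -2.0 * xi            # d/dx_i of c_i = 1 - x_i^2
    return (-xi / 2.0) * grad + (1.0 - xi * xi)

for xi in [-1.7, 0.0, 0.3, 2.5]:
    assert abs(lc_diagonal_entry(xi) - 1.0) < 1e-12
print("diagonal entries equal 1 at all tested points")
```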

Example 5.4. Consider the nonnegative portion of the unit sphere
\[
x_1 \geq 0, \ x_2 \geq 0, \ \ldots, \ x_n \geq 0, \quad x_1^2 + \cdots + x_n^2 - 1 = 0.
\]
The equation $L(x) C(x) = I_{n+1}$ is satisfied by
\[
L(x) = \begin{bmatrix} I_n - x x^T & x \mathbf{1}_n^T & 2x \\ \frac{1}{2}x^T & -\frac{1}{2}\mathbf{1}_n^T & -1 \end{bmatrix}.
\]

Example 5.5. Consider the set
\[
1 - x_1^3 - x_2^4 \geq 0, \qquad 1 - x_3^4 - x_4^3 \geq 0.
\]
The equation $L(x) C(x) = I_2$ is satisfied by
\[
L(x) = \begin{bmatrix}
-\frac{x_1}{3} & -\frac{x_2}{4} & 0 & 0 & 1 & 0 \\
0 & 0 & -\frac{x_3}{4} & -\frac{x_4}{3} & 0 & 1
\end{bmatrix}.
\]

Example 5.6. Consider the quadratic set
\[
1 - x_1 x_2 - x_2 x_3 - x_1 x_3 \geq 0, \qquad 1 - x_1^2 - x_2^2 - x_3^2 \geq 0.
\]
The matrix $L(x)^T$ satisfying $L(x) C(x) = I_2$ is
\[
\begin{bmatrix}
25x_1^3 + 10x_1^2 x_2 + 40x_1 x_2^2 - 25x_1 - 2x_3 & -25x_1^3 - 10x_1^2 x_2 - 40x_1 x_2^2 + \frac{49x_1}{2} + 2x_3 \\
-15x_1^2 x_2 + 10x_1 x_2^2 + 20x_3 x_1 x_2 - 10x_1 & 15x_1^2 x_2 - 10x_1 x_2^2 - 20x_3 x_1 x_2 + 10x_1 - \frac{x_2}{2} \\
25x_3 x_1^2 - 20x_1 x_2^2 + 10x_3 x_1 x_2 + 2x_1 & -25x_3 x_1^2 + 20x_1 x_2^2 - 10x_3 x_1 x_2 - 2x_1 - \frac{x_3}{2} \\
1 - 20x_1 x_3 - 10x_1^2 - 20x_1 x_2 & 20x_1 x_2 + 20x_1 x_3 + 10x_1^2 \\
-50x_1^2 - 20x_2 x_1 & 50x_1^2 + 20x_2 x_1 + 1
\end{bmatrix}.
\]

We would like to remark that a polynomial tuple $c = (c_1, \ldots, c_m)$ is generically nonsingular, and hence Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees $d_1, \ldots, d_m$, there exists an open dense subset $\mathcal{U}$ of $\mathcal{D} = \mathbb{R}[x]_{d_1} \times \cdots \times \mathbb{R}[x]_{d_m}$ such that every tuple $c = (c_1, \ldots, c_m) \in \mathcal{U}$ is nonsingular. Indeed, such $\mathcal{U}$ can be chosen as a Zariski open subset of $\mathcal{D}$, i.e., it is the complement of a proper real variety of $\mathcal{D}$. Moreover, Assumption 3.1 holds for all $c \in \mathcal{U}$, i.e., it holds generically.


Proof. The proof needs to use resultants and discriminants, for which we refer to [29].

First, let $J_1$ be the set of all $(i_1, \ldots, i_{n+1})$ with $1 \leq i_1 < \cdots < i_{n+1} \leq m$. The resultant $\mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}})$ [29, Section 2] is a polynomial in the coefficients of $c_{i_1}, \ldots, c_{i_{n+1}}$ such that if $\mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}}) \neq 0$, then the equations
\[
c_{i_1}(x) = \cdots = c_{i_{n+1}}(x) = 0
\]
have no complex solutions. Define
\[
F_1(c) = \prod_{(i_1, \ldots, i_{n+1}) \in J_1} \mathrm{Res}(c_{i_1}, \ldots, c_{i_{n+1}}).
\]
For the case that $m \leq n$, $J_1 = \emptyset$, and we simply let $F_1(c) = 1$. Clearly, if $F_1(c) \neq 0$, then no more than $n$ of the polynomials $c_1, \ldots, c_m$ have a common complex zero.

Second, let $J_2$ be the set of all $(j_1, \ldots, j_k)$ with $k \leq n$ and $1 \leq j_1 < \cdots < j_k \leq m$. When one of $c_{j_1}, \ldots, c_{j_k}$ has degree bigger than one, the discriminant $\Delta(c_{j_1}, \ldots, c_{j_k})$ is a polynomial in the coefficients of $c_{j_1}, \ldots, c_{j_k}$ such that if $\Delta(c_{j_1}, \ldots, c_{j_k}) \neq 0$, then the equations
\[
c_{j_1}(x) = \cdots = c_{j_k}(x) = 0
\]
have no singular complex solution [29, Section 3], i.e., at every common complex solution $u$, the gradients of $c_{j_1}, \ldots, c_{j_k}$ at $u$ are linearly independent. When all of $c_{j_1}, \ldots, c_{j_k}$ have degree one, the discriminant of the tuple $(c_{j_1}, \ldots, c_{j_k})$ is not a single polynomial, but we can define $\Delta(c_{j_1}, \ldots, c_{j_k})$ to be the product of all maximal minors of its Jacobian (a constant matrix). Define
\[
F_2(c) = \prod_{(j_1, \ldots, j_k) \in J_2} \Delta(c_{j_1}, \ldots, c_{j_k}).
\]
Clearly, if $F_2(c) \neq 0$, then no $n$ or fewer of the polynomials $c_1, \ldots, c_m$ have a singular common complex zero.

Last, let $F(c) = F_1(c) F_2(c)$ and
\[
\mathcal{U} = \{ c = (c_1, \ldots, c_m) \in \mathcal{D} : F(c) \neq 0 \}.
\]
Note that $\mathcal{U}$ is a Zariski open subset of $\mathcal{D}$, and it is open dense in $\mathcal{D}$. For all $c \in \mathcal{U}$, no more than $n$ of $c_1, \ldots, c_m$ can have a common complex zero. For any $k$ polynomials ($k \leq n$) of $c_1, \ldots, c_m$, if they have a common complex zero, say $u$, then their gradients at $u$ must be linearly independent. This means that $c$ is a nonsingular tuple.

Since every $c \in \mathcal{U}$ is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all $c \in \mathcal{U}$, so it holds generically. $\square$

6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) to solve the optimization problem (1.1), with the usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.


The polynomials $p_i$ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as $c_1, \ldots, c_m$. First, find a matrix polynomial $L(x)$ satisfying (4.4) or (5.2). Let $L_1(x)$ be the submatrix of $L(x)$ consisting of its first $n$ columns. Then choose $(p_1, \ldots, p_m)$ to be the product $L_1(x) \nabla f(x)$, i.e.,
\[
p_i = \big( L_1(x) \nabla f(x) \big)_i.
\]
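As a concrete sketch of this construction (the objective below is hypothetical, not from the paper): for the hypercube constraints $1 - x_i^2 \geq 0$ of Example 5.3, we have $L_1(x) = -\frac{1}{2}\mathrm{diag}(x)$, so $p_i = -\frac{x_i}{2}\,\partial f/\partial x_i$. With $f = \sum_i x_i^4$, this gives $p_i = -2x_i^4$.

```python
# Sketch of p = L_1(x) * grad f(x) for the constraints of Example 5.3,
# with the hypothetical objective f = x_1^4 + ... + x_n^4.
# Then L_1(x) = -diag(x)/2 and p_i = (-x_i/2) * 4 x_i^3 = -2 x_i^4.
def p(x):
    grad_f = [4.0 * xi ** 3 for xi in x]           # gradient of sum x_i^4
    return [(-xi / 2.0) * gi for xi, gi in zip(x, grad_f)]

x = [0.5, -1.0, 2.0]
print(p(x))   # [-0.125, -2.0, -32.0], i.e. -2 x_i^4 componentwise
```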

In all our examples, the global minimum value $f_{min}$ of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., the sublevel set $\{ f(x) \leq \ell \}$ is compact for all $\ell$) and the constraint qualification condition holds.

By Theorem 3.3, we have $f_k = f_{min}$ for all $k$ big enough if $f_c = f_{min}$ and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (which have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, together with the computational time (in seconds). The results for the standard Lasserre relaxations are titled "w.o. LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".

Example 6.1. Consider the optimization problem
\[
\begin{array}{cl} \min & x_1 x_2 (10 - x_3) \\ \mathrm{s.t.} & x_1 \geq 0,\ x_2 \geq 0,\ x_3 \geq 0,\ 1 - x_1 - x_2 - x_3 \geq 0. \end{array}
\]
The matrix polynomial $L(x)$ is given in Example 4.2. Since the feasible set is compact, the minimum $f_{min} = 0$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point $(x_1, x_2, x_3)$ with $x_1 x_2 = 0$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that $f_k = f_{min}$ for all $k \geq 3$, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

order k | lower bound (w.o. LME) | time   | lower bound (with LME) | time
   2    | -0.0521                | 0.6841 | -0.0521                | 0.1922
   3    | -0.0026                | 0.2657 | -3·10^-8               | 0.2285
   4    | -0.0007                | 0.6785 | -6·10^-9               | 0.4431
   5    | -0.0004                | 1.6105 | -2·10^-9               | 0.9567

²Note that $1 - x^T x = (1 - e^T x)(1 + x^T x) + \sum_{i=1}^n x_i (1 - x_i)^2 + \sum_{i \neq j} x_i^2 x_j \in \mathrm{IQ}(c_{eq}, c_{in})$.


Example 6.2. Consider the optimization problem
\[
\begin{array}{cl} \min & x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3x_1^2 x_2^2 x_3^2 + (x_1^4 + x_2^4 + x_3^4) \\ \mathrm{s.t.} & x_1^2 + x_2^2 + x_3^2 \geq 1. \end{array}
\]

The matrix polynomial $L(x) = \big[ \tfrac{1}{2}x_1 \ \ \tfrac{1}{2}x_2 \ \ \tfrac{1}{2}x_3 \ \ -1 \big]$. The objective $f$ is the sum of the positive definite form $x_1^4 + x_2^4 + x_3^4$ and the Motzkin polynomial
\[
M(x) = x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3x_1^2 x_2^2 x_3^2.
\]
Note that $M(x)$ is nonnegative everywhere but not SOS [37]. Clearly, $f$ is coercive, and $f_{min}$ is achieved at a critical point. The set $\mathrm{IQ}(\phi, \psi)$ is archimedean, because
\[
c_1(x) p_1(x) = (x_1^2 + x_2^2 + x_3^2 - 1)\big( 3M(x) + 2(x_1^4 + x_2^4 + x_3^4) \big) = 0
\]
defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value $f_{min} = \frac{1}{3}$, and there are 8 minimizers $(\pm\frac{1}{\sqrt{3}}, \pm\frac{1}{\sqrt{3}}, \pm\frac{1}{\sqrt{3}})$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that $f_k = f_{min}$ for all $k \geq 4$, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

order k | lower bound (w.o. LME) | time   | lower bound (with LME) | time
   3    | -∞                     | 0.4466 | 0.1111                 | 0.1169
   4    | -∞                     | 0.4948 | 0.3333                 | 0.3499
   5    | -2.1821·10^5           | 1.1836 | 0.3333                 | 0.6530
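The identity $c_1(x)p_1(x)$ used in Example 6.2 relies on Euler's formula for homogeneous polynomials: since $M$ is homogeneous of degree 6 and $x_1^4+x_2^4+x_3^4$ of degree 4, we get $p_1 = \frac{1}{2}x^T\nabla f = 3M(x) + 2(x_1^4+x_2^4+x_3^4)$. A quick numerical confirmation (not from the paper), using finite differences for $\nabla f$:

```python
# Check p_1 = (1/2) x^T grad f = 3 M(x) + 2 sum(x_i^4) for f = M + sum(x_i^4),
# where M is the Motzkin polynomial (homogeneous of degree 6).
def M(x):
    x1, x2, x3 = x
    return x1**4 * x2**2 + x1**2 * x2**4 + x3**6 - 3 * x1**2 * x2**2 * x3**2

def p1(x):
    f = lambda y: M(y) + sum(t**4 for t in y)
    h = 1e-6                                  # central-difference step
    grad = []
    for i in range(3):
        y1, y2 = list(x), list(x)
        y1[i] += h
        y2[i] -= h
        grad.append((f(y1) - f(y2)) / (2 * h))
    return sum(xi / 2 * gi for xi, gi in zip(x, grad))

x = [0.7, -1.1, 0.4]
lhs = p1(x)
rhs = 3 * M(x) + 2 * sum(t**4 for t in x)
assert abs(lhs - rhs) < 1e-4
print("p_1 matches 3M + 2*sum(x_i^4)")
```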

Example 6.3. Consider the optimization problem
\[
\begin{array}{cl} \min & x_1 x_2 + x_2 x_3 + x_3 x_4 - 3 x_1 x_2 x_3 x_4 + (x_1^3 + \cdots + x_4^3) \\ \mathrm{s.t.} & x_1, x_2, x_3, x_4 \geq 0,\ 1 - x_1 - x_2 \geq 0,\ 1 - x_3 - x_4 \geq 0. \end{array}
\]
The matrix polynomial $L(x)$ is
\[
\begin{bmatrix}
1 - x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
-x_1 & 1 - x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 - x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & -x_3 & 1 - x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}.
\]
The feasible set is compact, so $f_{min}$ is achieved at a critical point. One can show that $f_{min} = 0$ and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because $\mathrm{IQ}(c_{eq}, c_{in})$ is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.

³This is because $-c_1^2 p_1^2 \in \mathrm{Ideal}(\phi) \subseteq \mathrm{IQ}(\phi, \psi)$ and the set $\{-c_1(x)^2 p_1(x)^2 \geq 0\}$ is compact.
⁴This is because $1 - x_1^2 - x_2^2$ belongs to the quadratic module of $(x_1, x_2, 1 - x_1 - x_2)$, and $1 - x_3^2 - x_4^2$ belongs to the quadratic module of $(x_3, x_4, 1 - x_3 - x_4)$. See the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

order k | lower bound (w.o. LME) | time    | lower bound (with LME) | time
   3    | -2.9·10^-5             | 0.7335  | -6·10^-7               | 0.6091
   4    | -1.4·10^-5             | 2.5055  | -8·10^-8               | 2.7423
   5    | -1.4·10^-5             | 12.7092 | -5·10^-8               | 13.7449

Example 6.4. Consider the polynomial optimization problem
\[
\begin{array}{cl} \min\limits_{x \in \mathbb{R}^2} & x_1^2 + 50 x_2^2 \\ \mathrm{s.t.} & x_1^2 - \frac{1}{2} \geq 0,\ x_2^2 - 2x_1 x_2 - \frac{1}{8} \geq 0,\ x_2^2 + 2x_1 x_2 - \frac{1}{8} \geq 0. \end{array}
\]

It is motivated from an example in [12, §3]. The first column of $L(x)$ is
\[
\begin{bmatrix}
\frac{8x_1^3}{5} + \frac{x_1}{5} \\[4pt]
\frac{288 x_2 x_1^4}{5} - \frac{16 x_1^3}{5} - \frac{124 x_2 x_1^2}{5} + \frac{8x_1}{5} - 2x_2 \\[4pt]
-\frac{288 x_2 x_1^4}{5} - \frac{16 x_1^3}{5} + \frac{124 x_2 x_1^2}{5} + \frac{8x_1}{5} + 2x_2
\end{bmatrix},
\]
and the second column of $L(x)$ is
\[
\begin{bmatrix}
-\frac{8 x_1^2 x_2}{5} + \frac{4 x_2^3}{5} - \frac{x_2}{10} \\[4pt]
\frac{288 x_1^3 x_2^2}{5} + \frac{16 x_1^2 x_2}{5} - \frac{142 x_1 x_2^2}{5} - \frac{9 x_1}{20} - \frac{8 x_2^3}{5} + \frac{11 x_2}{5} \\[4pt]
-\frac{288 x_1^3 x_2^2}{5} + \frac{16 x_1^2 x_2}{5} + \frac{142 x_1 x_2^2}{5} + \frac{9 x_1}{20} - \frac{8 x_2^3}{5} + \frac{11 x_2}{5}
\end{bmatrix}.
\]
The objective is coercive, so $f_{min}$ is achieved at a critical point. The minimum value $f_{min} = 56 + \frac{3}{4} + 25\sqrt{5} \approx 112.6517$, and the minimizers are $\big( \pm\sqrt{1/2},\ \pm(\sqrt{5/8} + \sqrt{1/2}) \big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that $f_k = f_{min}$ for all $k \geq 4$, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

order k | lower bound (w.o. LME) | time   | lower bound (with LME) | time
   3    | 6.7535                 | 0.4611 | 56.7500                | 0.1309
   4    | 6.9294                 | 0.2428 | 112.6517               | 0.2405
   5    | 8.8519                 | 0.3376 | 112.6517               | 0.2167
   6    | 16.5971                | 0.4703 | 112.6517               | 0.3788
   7    | 35.4756                | 0.6536 | 112.6517               | 0.4537

Example 6.5. Consider the optimization problem
\[
\begin{array}{cl} \min\limits_{x \in \mathbb{R}^3} & x_1^3 + x_2^3 + x_3^3 + 4x_1 x_2 x_3 - \big( x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2) \big) \\ \mathrm{s.t.} & x_1 \geq 0,\ x_1 x_2 - 1 \geq 0,\ x_2 x_3 - 1 \geq 0. \end{array}
\]
The matrix polynomial $L(x)$ is
\[
\begin{bmatrix}
1 - x_1 x_2 & 0 & 0 & x_2 & x_2 & 0 \\
x_1 & 0 & 0 & -1 & -1 & 0 \\
-x_1 & x_2 & 0 & 1 & 0 & -1
\end{bmatrix}.
\]

24 JIAWANG NIE

The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant R^3_+, so the minimum value is achieved at a critical point. In computation, we got f_min ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.
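A quick numerical sanity check of the reported minimizer (plain Python; the four-digit point is taken from the text, so only loose tolerances are meaningful):

```python
def f(x1, x2, x3):   # objective of Example 6.5
    return (x1**3 + x2**3 + x3**3 + 4*x1*x2*x3
            - (x1*(x2**2 + x3**2) + x2*(x3**2 + x1**2) + x3*(x1**2 + x2**2)))

u = (0.9071, 1.1024, 0.9071)       # reported global minimizer
assert u[0] >= 0                   # x1 >= 0
assert u[0]*u[1] >= 1 - 1e-3       # x1*x2 - 1 >= 0 (up to rounding)
assert u[1]*u[2] >= 1 - 1e-3       # x2*x3 - 1 >= 0 (up to rounding)
assert abs(f(*u) - 0.9492) < 1e-3  # matches fmin ~ 0.9492
print(round(f(*u), 4))
```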

Table 5. Computational results for Example 6.5

order k | w/o LME: lower bound | time   | with LME: lower bound | time
2       | −∞                   | 0.4129 | −∞                    | 0.1900
3       | −7.8184·10^6         | 0.4641 | 0.9492                | 0.3139
4       | −2.0575·10^4         | 0.6499 | 0.9492                | 0.5057

Example 6.6. Consider the optimization problem (x_0 = 1)

min_{x∈R^4}  x^T x + Σ_{i=0}^4 Π_{j≠i} (x_i − x_j)
s.t.  x_1^2 − 1 ≥ 0,  x_2^2 − 1 ≥ 0,  x_3^2 − 1 ≥ 0,  x_4^2 − 1 ≥ 0.

The matrix polynomial L(x) = [ (1/2)diag(x)  −I_4 ]. The first part of the objective is x^T x, while the second part is a nonnegative polynomial [37]. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min = 4.0000 and 11 global minimizers:

(1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
(1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
(−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.
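Since every reported minimizer is a ±1 vector, the claim can be checked on all 16 sign patterns. A brute-force sketch in plain Python (this only certifies the vertex values, not global optimality over the full feasible set):

```python
from itertools import product
import math

def f(x):
    y = [1.0] + list(x)      # prepend x0 = 1
    quad = sum(t*t for t in x)
    poly = sum(math.prod(y[i] - y[j] for j in range(5) if j != i)
               for i in range(5))
    return quad + poly

vals = {v: f(v) for v in product([1.0, -1.0], repeat=4)}
fmin = min(vals.values())
mins = [v for v, val in vals.items() if abs(val - fmin) < 1e-9]
print(fmin, len(mins))       # 4.0 11
```

The five excluded sign patterns, e.g. (−1, −1, −1, −1), give the larger value 20.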

Table 6. Computational results for Example 6.6

order k | w/o LME: lower bound | time    | with LME: lower bound | time
3       | −∞                   | 1.1377  | 3.5480                | 1.1765
4       | −6.6913·10^4         | 4.7677  | 4.0000                | 3.0761
5       | −21.3778             | 22.9970 | 4.0000                | 10.3354

Example 6.7. Consider the optimization problem

min_{x∈R^3}  x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 − 3x_1^2x_2^2x_3^2 + x_2^2
s.t.  x_1 − x_2x_3 ≥ 0,  −x_2 + x_3^2 ≥ 0.

The matrix polynomial L(x) =

[ 1     0    0   0   0 ]
[ −x_3  −1   0   0   0 ]

By the arithmetic-geometric mean inequality, one can show that f_min = 0. The global minimizers are (x_1, 0, x_3) with x_1 ≥ 0 and x_1x_3 = 0. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that f_k = f_min for all k ≥ 5, up to numerical round-off errors.
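The AM-GM argument is easy to make concrete: with a = x_1^4x_2^2, b = x_2^4x_3^2, c = x_3^4x_1^2, the inequality (a + b + c)/3 ≥ (abc)^{1/3} = x_1^2x_2^2x_3^2 gives f ≥ x_2^2 ≥ 0 everywhere. A random-sampling sketch (plain Python):

```python
import random

def f(x1, x2, x3):
    return (x1**4*x2**2 + x2**4*x3**2 + x3**4*x1**2
            - 3*x1**2*x2**2*x3**2 + x2**2)

# By AM-GM, x1^4 x2^2 + x2^4 x3^2 + x3^4 x1^2 >= 3 x1^2 x2^2 x3^2,
# hence f >= x2^2 >= 0 (even without the constraints).
random.seed(0)
for _ in range(100000):
    x = [random.uniform(-2, 2) for _ in range(3)]
    assert f(*x) >= -1e-9

# On the minimizer set {(x1, 0, x3) : x1 >= 0, x1*x3 = 0} the objective vanishes:
assert f(0.7, 0.0, 0.0) == 0.0 and f(0.0, 0.0, 1.3) == 0.0
print("f >= 0 on samples; f = 0 at the minimizers")
```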

Table 7. Computational results for Example 6.7

order k | w/o LME: lower bound | time   | with LME: lower bound | time
3       | −∞                   | 0.6144 | −∞                    | 0.3418
4       | −1.0909·10^7         | 1.0542 | −3.9476               | 0.7180
5       | −942.6772            | 1.6771 | −3·10^−9              | 1.4607
6       | −0.0110              | 3.3532 | −8·10^−10             | 3.1618

Example 6.8. Consider the optimization problem

min_{x∈R^4}  x_1^2(x_1 − x_4)^2 + x_2^2(x_2 − x_4)^2 + x_3^2(x_3 − x_4)^2 + 2x_1x_2x_3(x_1 + x_2 + x_3 − 2x_4) + (x_1 − 1)^2 + (x_2 − 1)^2 + (x_3 − 1)^2
s.t.  x_1 − x_2 ≥ 0,  x_2 − x_3 ≥ 0.

The matrix polynomial L(x) =

[ 1  0  0  0  0  0 ]
[ 1  1  0  0  0  0 ]

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min ≈ 0.9413 and a minimizer

(0.5632, 0.5632, 0.5632, 0.7510).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

order k | w/o LME: lower bound | time    | with LME: lower bound | time
2       | −∞                   | 0.3984  | −0.3360               | 0.9321
3       | −∞                   | 0.7634  | 0.9413                | 0.5240
4       | −6.4896·10^5         | 4.5496  | 0.9413                | 1.7192
5       | −3.1645·10^3         | 24.3665 | 0.9413                | 8.1228

Example 6.9. Consider the optimization problem

min_{x∈R^4}  (x_1 + x_2 + x_3 + x_4 + 1)^2 − 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1)
s.t.  0 ≤ x_1, ..., x_4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so f_min is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value f_min = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that 4 − Σ_{i=1}^4 x_i^2 = Σ_{i=1}^4 ( x_i(1 − x_i)^2 + (1 − x_i)(1 + x_i^2) ) ∈ IQ(c_eq, c_in).
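Both the segment of minimizers and the footnote identity are routine polynomial identities; a symbolic sketch (sympy is an assumption of this sketch, the paper's computations use MATLAB):

```python
import sympy as sp

x1, x2, x3, x4, t = sp.symbols('x1 x2 x3 x4 t')
f = (x1 + x2 + x3 + x4 + 1)**2 - 4*(x1*x2 + x2*x3 + x3*x4 + x4 + x1)

# f vanishes on the whole segment (t, 0, 0, 1 - t), t in [0, 1]
assert sp.expand(f.subs({x1: t, x2: 0, x3: 0, x4: 1 - t})) == 0

# the footnote identity certifying condition ii) of Theorem 3.3:
lhs = 4 - sum(v**2 for v in (x1, x2, x3, x4))
rhs = sum(v*(1 - v)**2 + (1 - v)*(1 + v**2) for v in (x1, x2, x3, x4))
assert sp.expand(lhs - rhs) == 0
print("both identities hold")
```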


Table 9. Computational results for Example 6.9

order k | w/o LME: lower bound | time    | with LME: lower bound | time
2       | −0.0279              | 0.2262  | −5·10^−6              | 1.1835
3       | −0.0005              | 0.4691  | −6·10^−7              | 1.6566
4       | −0.0001              | 3.1098  | −2·10^−7              | 5.5234
5       | −4·10^−5             | 16.5092 | −6·10^−7              | 19.7320

For some polynomial optimization problems, standard Lasserre relaxations might converge fast, e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem (x_0 = 1)

min_{x∈R^n}  Σ_{0≤i≤j≤k≤n} c_{ijk} x_i x_j x_k
s.t.  0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have f_c = f_min. The condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by standard Lasserre's relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their consumed time differs a bit. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

n        | 9      | 10     | 11     | 12      | 13      | 14
w/o LME  | 1.2569 | 2.5619 | 6.3085 | 15.8722 | 35.1675 | 78.4111
with LME | 1.9714 | 3.8288 | 8.2519 | 20.0310 | 37.6373 | 82.4778

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value f_min is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k} = Ideal(c_eq, φ)_{2k} + Qmod(c_in, ψ)_{2k}.

If we replace the quadratic module Qmod(c_in, ψ) by the preordering of (c_in, ψ) [21, 25], we can get further tighter relaxations. For convenience, write (c_in, ψ) as a single tuple (g_1, ..., g_ℓ). Its preordering is the set

Preord(c_in, ψ) = Σ_{r_1,...,r_ℓ ∈ {0,1}} g_1^{r_1} ··· g_ℓ^{r_ℓ} · Σ[x].


The truncation Preord(c_in, ψ)_{2k} is defined similarly to Qmod(c_in, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)  f'_{pre,k} := min  ⟨f, y⟩
       s.t.  ⟨1, y⟩ = 1,  L^(k)_{c_eq}(y) = 0,  L^(k)_φ(y) = 0,
             L^(k)_{g_1^{r_1}···g_ℓ^{r_ℓ}}(y) ⪰ 0  for all r_1, ..., r_ℓ ∈ {0, 1},
             y ∈ R^{N^n_{2k}}.

Similar to (3.9), the dual optimization problem of the above is

(7.2)  f_{pre,k} := max  γ
       s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Preord(c_in, ψ)_{2k}.

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose K_c ≠ ∅ and Assumption 3.1 holds. Then

f_{pre,k} = f'_{pre,k} = f_c

for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_{pre,k} = f'_{pre,k} = f_min for all k big enough.

Proof. The proof is very similar to Case III in the proof of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

K_3 = { x ∈ R^n | c_eq(x) = 0, φ(x) = 0, c_in(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(c_in, ψ) such that f^{2ℓ} + q ∈ Ideal(c_eq, φ). The rest of the proof is the same. □

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is singular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for the critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is applied at each usage, a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

λ = ( C(x)^T C(x) )^{−1} C(x)^T [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above one. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
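Numerically, this least-squares expression is easy to evaluate at a given critical point. A small illustration with numpy, on a toy instance chosen here for the sketch (min x_1 + x_2 under the hypercube constraints of Example 5.3, minimized at u = (−1, −1)):

```python
import numpy as np

# Toy instance: f(x) = x1 + x2, constraints 1 - x_i^2 >= 0; minimizer u = (-1, -1)
u = np.array([-1.0, -1.0])
grad_f = np.array([1.0, 1.0])

# C(u) stacks the constraint gradients over diag(c_i(u)), as in (5.1)
C = np.vstack([-2.0*np.diag(u), np.diag(1.0 - u**2)])     # shape (4, 2)
rhs = np.concatenate([grad_f, np.zeros(2)])

lam = np.linalg.solve(C.T @ C, C.T @ rhs)  # (C^T C)^{-1} C^T [grad f; 0]
print(lam)                                 # [0.5 0.5]

# agrees with the polynomial expression lambda_i = -x_i/2 * (df/dx_i)
# from Example 5.3
assert np.allclose(lam, -u/2 * grad_f)
```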

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear Programming, second edition, Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry, Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions, Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: an introduction to computational algebraic geometry and commutative algebra, third edition, Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables, Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals, Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems, SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations, Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization, Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization, preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra, Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization, SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming, Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly, Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach, Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz, J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments, SIAM J. Optim., 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals, Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.


[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization, SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization, SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, Positive Polynomials and Their Applications, Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization, Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization, Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties, Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials, Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics, Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal, Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs, Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials, Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167–191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization, Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties, SIAM J. Optim., Vol. 23, no. 3, pp. 1634–1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation, Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy, Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials, Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions, in S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99, AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets, Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem, in Contemp. Math., Vol. 253, pp. 251–272, American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares, SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results, Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones, Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations, CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares, SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093.

E-mail address: njw@math.ucsd.edu


where L_1(x) consists of the first n columns of L(x). In this section, we characterize when such L(x) exists, and we give a degree bound for it.

The linear function Ax − b is said to be nonsingular if rank C(u) = m for all u ∈ C^n (also see Definition 5.1). This is equivalent to: for every u, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then the vectors a_{i_1}, ..., a_{i_k} are linearly independent.

Proposition 4.1. The linear function Ax − b is nonsingular if and only if there exists a matrix polynomial L(x) satisfying (4.4). Moreover, when Ax − b is nonsingular, we can choose L(x) in (4.4) with deg(L) ≤ m − rank A.

Proof. Clearly, if (4.4) is satisfied by some L(x), then rank C(u) ≥ m for all u. This implies that Ax − b is nonsingular.

Next, assume that Ax − b is nonsingular. We show that (4.4) is satisfied by some L(x) ∈ R[x]^{m×(n+m)} with degree ≤ m − rank A. Let r = rank A. Up to a linear coordinate transformation, we can reduce x to an r-dimensional variable. Without loss of generality, we can assume that rank A = n and m ≥ n.

For a subset I = {i_1, ..., i_{m−n}} of [m], denote

c_I(x) = Π_{i∈I} c_i(x),   E_I(x) = c_I(x) · diag( c_{i_1}(x)^{−1}, ..., c_{i_{m−n}}(x)^{−1} ),
D_I(x) = diag( c_{i_1}(x), ..., c_{i_{m−n}}(x) ),   A_I = [ a_{i_1} ··· a_{i_{m−n}} ]^T.

For the case that I = ∅ (the empty set), we set c_∅(x) = 1. Let

V = { I ⊆ [m] : |I| = m − n, rank A_{[m]\I} = n }.

Step I: For each I ∈ V, we construct a matrix polynomial L_I(x) such that

(4.5)  L_I(x) C(x) = c_I(x) I_m.

The matrix L_I = L_I(x) satisfying (4.5) can be given by the following 2×3 block matrix (L_I(J, K) denotes the submatrix whose row indices are from J and whose column indices are from K):

(4.6)
  J \ K     [n]                         n + I                                n + ([m]\I)
  I         0                           E_I(x)                               0
  [m]\I     c_I(x)·(A_{[m]\I})^{−T}     −(A_{[m]\I})^{−T} (A_I)^T E_I(x)     0

Equivalently, the blocks of L_I are

L_I( I, [n] ) = 0,   L_I( I, n + ([m]\I) ) = 0,   L_I( [m]\I, n + ([m]\I) ) = 0,
L_I( I, n + I ) = E_I(x),   L_I( [m]\I, [n] ) = c_I(x) · (A_{[m]\I})^{−T},
L_I( [m]\I, n + I ) = −(A_{[m]\I})^{−T} (A_I)^T E_I(x).

For each I ∈ V, A_{[m]\I} is invertible. The superscript −T denotes the inverse of the transpose. Let G = L_I(x)C(x); then one can verify that

G(I, I) = E_I(x) D_I(x) = c_I(x) I_{m−n},   G(I, [m]\I) = 0,

G([m]\I, [m]\I) = [ c_I(x)(A_{[m]\I})^{−T}   −(A_{[m]\I})^{−T}(A_I)^T E_I(x) ] · [ (A_{[m]\I})^T ; 0 ] = c_I(x) I_n,

G([m]\I, I) = [ c_I(x)(A_{[m]\I})^{−T}   −(A_{[m]\I})^{−T}(A_I)^T E_I(x) ] · [ (A_I)^T ; D_I(x) ] = 0.

This shows that the above L_I(x) satisfies (4.5).


Step II: We show that there exist real scalars ν_I satisfying

(4.7)  Σ_{I∈V} ν_I c_I(x) = 1.

This can be shown by induction on m.

• When m = n, V = {∅} and c_∅(x) = 1, so (4.7) is clearly true.
• When m > n, let

(4.8)  N = { i ∈ [m] : rank A_{[m]\{i}} = n }.

For each i ∈ N, let V_i be the set of all I′ ⊆ [m]\{i} such that |I′| = m − n − 1 and rank A_{[m]\(I′∪{i})} = n. For each i ∈ N, by the assumption, the linear function A_{[m]\{i}} x − b_{[m]\{i}} is nonsingular. By induction, there exist real scalars ν^{(i)}_{I′} satisfying

(4.9)  Σ_{I′∈V_i} ν^{(i)}_{I′} c_{I′}(x) = 1.

Since rank A = n, we can generally assume that {a_1, ..., a_n} is linearly independent. So there exist scalars α_1, ..., α_n such that

a_m = α_1 a_1 + ··· + α_n a_n.

If all α_i = 0, then a_m = 0, and hence A can be replaced by its first m − 1 rows; so (4.7) is true by the induction. In the following, suppose at least one α_i ≠ 0, and write

{ i : α_i ≠ 0 } = { i_1, ..., i_k }.

Then a_{i_1}, ..., a_{i_k}, a_m are linearly dependent. For convenience, set i_{k+1} = m. Since Ax − b is nonsingular, the linear system

c_{i_1}(x) = ··· = c_{i_k}(x) = c_{i_{k+1}}(x) = 0

has no solutions. Hence, there exist real scalars μ_1, ..., μ_{k+1} such that

μ_1 c_{i_1}(x) + ··· + μ_k c_{i_k}(x) + μ_{k+1} c_{i_{k+1}}(x) = 1.

(This can be deduced from the echelon form of an inconsistent linear system.) Note that i_1, ..., i_{k+1} ∈ N. For each j = 1, ..., k + 1, by (4.9),

Σ_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{I′}(x) = 1.

Then we can get

1 = Σ_{j=1}^{k+1} μ_j c_{i_j}(x) = Σ_{j=1}^{k+1} μ_j Σ_{I′∈V_{i_j}} ν^{(i_j)}_{I′} c_{i_j}(x) c_{I′}(x) = Σ_{I = I′∪{i_j}, I′∈V_{i_j}, 1≤j≤k+1} ν^{(i_j)}_{I′} μ_j c_I(x).

Since each I′ ∪ {i_j} ∈ V, (4.7) must be satisfied by some scalars ν_I.


Step III: For L_I(x) as in (4.5), we construct L(x) as

(4.10)  L(x) = Σ_{I∈V} ν_I L_I(x).

Clearly, L(x) satisfies (4.4), because

L(x)C(x) = Σ_{I∈V} ν_I L_I(x)C(x) = Σ_{I∈V} ν_I c_I(x) I_m = I_m.

Each L_I(x) has degree ≤ m − n, so L(x) has degree ≤ m − n. □

Proposition 4.1 characterizes when there exists L(x) satisfying (4.4). When it does exist, a degree bound for L(x) is m − rank A. Sometimes its degree can be smaller than that, as shown in Example 4.3. For given A, b, the matrix polynomial L(x) satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of L(x) with L(x)C(x) = I_m for polyhedral sets.

Example 4.2. Consider the simplicial set

{ x_1 ≥ 0, ..., x_n ≥ 0, 1 − e^T x ≥ 0 }.

The equation L(x)C(x) = I_{n+1} is satisfied by

        [ 1 − x_1   −x_2      ···   −x_n      1  ···  1 ]
        [ −x_1      1 − x_2   ···   −x_n      1  ···  1 ]
L(x) =  [   ⋮          ⋮       ⋱       ⋮      ⋮       ⋮ ]
        [ −x_1      −x_2      ···   1 − x_n   1  ···  1 ]
        [ −x_1      −x_2      ···   −x_n      1  ···  1 ]

Example 4.3. Consider the box constraint

{ x_1 ≥ 0, ..., x_n ≥ 0, 1 − x_1 ≥ 0, ..., 1 − x_n ≥ 0 }.

The equation L(x)C(x) = I_{2n} is satisfied by

L(x) = [ I_n − diag(x)   I_n   I_n ]
       [ −diag(x)        I_n   I_n ]
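This identity is easy to verify symbolically. A sketch with sympy (an assumption of this sketch; the paper itself works in MATLAB), building C(x) as in (5.1) for the box constraints:

```python
import sympy as sp

n = 3
x = sp.symbols(f'x1:{n+1}')
X = sp.diag(*x)
I = sp.eye(n)

# constraints ordered as c = (x_1, ..., x_n, 1 - x_1, ..., 1 - x_n);
# C(x) stacks the gradient matrix over diag(c(x))
G = sp.Matrix.hstack(I, -I)                        # gradients: n x 2n
D = sp.diag(*(list(x) + [1 - xi for xi in x]))     # diag(c):   2n x 2n
C = sp.Matrix.vstack(G, D)

L = sp.Matrix.vstack(sp.Matrix.hstack(I - X, I, I),
                     sp.Matrix.hstack(-X, I, I))   # L(x) of Example 4.3

assert (L*C).expand() == sp.eye(2*n)
print("L(x)C(x) = I_{2n} holds for n =", n)
```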

Example 4.4. Consider the polyhedral set

{ 1 − x_4 ≥ 0, x_4 − x_3 ≥ 0, x_3 − x_2 ≥ 0, x_2 − x_1 ≥ 0, x_1 + 1 ≥ 0 }.

The equation L(x)C(x) = I_5 is satisfied by

L(x) = (1/2) ·
[ −x_1 − 1   −x_2 − 1   −x_3 − 1   −x_4 − 1   1 1 1 1 1 ]
[ −x_1 − 1   −x_2 − 1   −x_3 − 1   1 − x_4    1 1 1 1 1 ]
[ −x_1 − 1   −x_2 − 1   1 − x_3    1 − x_4    1 1 1 1 1 ]
[ −x_1 − 1   1 − x_2    1 − x_3    1 − x_4    1 1 1 1 1 ]
[ 1 − x_1    1 − x_2    1 − x_3    1 − x_4    1 1 1 1 1 ]

Example 4.5. Consider the polyhedral set

{ 1 + x_1 ≥ 0, 1 − x_1 ≥ 0, 2 − x_1 − x_2 ≥ 0, 2 − x_1 + x_2 ≥ 0 }.

The matrix L(x) satisfying L(x)C(x) = I_4 is

(1/6) ·
[ x_1^2 − 3x_1 + 2    x_1x_2 − x_2          4 − x_1    2 − x_1    1 − x_1     1 − x_1   ]
[ 3x_1^2 − 3x_1 − 6   3x_2 + 3x_1x_2        6 − 3x_1   −3x_1      −3x_1 − 3   −3x_1 − 3 ]
[ 1 − x_1^2           −x_1x_2 − 2x_2 − 3    x_1 − 1    x_1 + 1    x_1 + 2     x_1 + 2   ]
[ 1 − x_1^2           3 − x_1x_2 − 2x_2     x_1 − 1    x_1 + 1    x_1 + 2     x_1 + 2   ]


5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers λ_i as polynomial functions in x on the set of critical points.

Suppose there are m equality and inequality constraints in total, i.e.,

E ∪ I = {1, ..., m}.

If (x, λ) is a critical pair, then λ_i c_i(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ_1 ··· λ_m]^T satisfies the equation

(5.1)
[ ∇c_1(x)   ∇c_2(x)   ···   ∇c_m(x) ]         [ ∇f(x) ]
[ c_1(x)    0         ···   0       ]         [ 0     ]
[ 0         c_2(x)    ···   0       ]   λ  =  [ 0     ]
[ ⋮         ⋮          ⋱    ⋮       ]         [ ⋮     ]
[ 0         0         ···   c_m(x)  ]         [ 0     ]

where the coefficient matrix on the left is denoted by C(x).

Let C(x) be as above. If there exists L(x) ∈ R[x]^{m×(m+n)} such that

(5.2)  L(x)C(x) = I_m,

then we can get

(5.3)  λ = L(x) [ ∇f(x) ; 0 ] = L_1(x) ∇f(x),

where L_1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such L(x) exists.

Definition 5.1. The tuple c = (c_1, ..., c_m) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ C^n.

Clearly, c being nonsingular is equivalent to: for each u ∈ C^n, if J(u) = {i_1, ..., i_k} (see (1.2) for the notation), then the gradients ∇c_{i_1}(u), ..., ∇c_{i_k}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.

Proposition 5.2. (i) For W(x) ∈ C[x]^{s×t} with s ≥ t, we have rank W(u) = t for all u ∈ C^n if and only if there exists P(x) ∈ C[x]^{t×s} such that

P(x)W(x) = I_t.

Moreover, for W(x) ∈ R[x]^{s×t}, we can choose P(x) ∈ R[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ R[x]^{m×(m+n)} satisfying (5.2).

Proof. (i) "⇐": If P(x)W(x) = I_t, then for all u ∈ C^n,

t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.

"⇒": Suppose rank W(u) = t for all u ∈ C^n. Write W(x) in columns:

W(x) = [ w_1(x)   w_2(x)   ···   w_t(x) ].

Then the equation w_1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ_1(x) ∈ C[x]^s such that ξ_1(x)^T w_1(x) = 1. For each i = 2, ..., t, denote

r_{1i}(x) = ξ_1(x)^T w_i(x);

then (we use ∼ to denote row equivalence between matrices)

W(x) ∼ [ 1        r_{12}(x)   ···   r_{1t}(x) ]  ∼  W_1(x) = [ 1   r_{12}(x)       ···   r_{1t}(x)      ]
       [ w_1(x)   w_2(x)      ···   w_t(x)    ]              [ 0   w_2^{(1)}(x)    ···   w_t^{(1)}(x)   ]

where each (i = 2, ..., t)

w_i^{(1)}(x) = w_i(x) − r_{1i}(x) w_1(x).

So there exists P_1(x) ∈ C[x]^{(s+1)×s} such that

P_1(x)W(x) = W_1(x).

Since W(x) and W_1(x) are row equivalent, W_1(x) must also have full column rank everywhere. Similarly, the polynomial equation

w_2^{(1)}(x) = 0

does not have a complex solution. Again by Hilbert's Weak Nullstellensatz [4], there exists ξ_2(x) ∈ C[x]^s such that

ξ_2(x)^T w_2^{(1)}(x) = 1.

For each i = 3, ..., t, let r_{2i}(x) = ξ_2(x)^T w_i^{(1)}(x); then

W_1(x) ∼ [ 1   r_{12}(x)       r_{13}(x)       ···   r_{1t}(x)      ]  ∼  W_2(x) = [ 1   r_{12}(x)   r_{13}(x)       ···   r_{1t}(x)      ]
         [ 0   1               r_{23}(x)       ···   r_{2t}(x)      ]              [ 0   1           r_{23}(x)       ···   r_{2t}(x)      ]
         [ 0   w_2^{(1)}(x)    w_3^{(1)}(x)    ···   w_t^{(1)}(x)   ]              [ 0   0           w_3^{(2)}(x)    ···   w_t^{(2)}(x)   ]

where each (i = 3, ..., t)

w_i^{(2)}(x) = w_i^{(1)}(x) − r_{2i}(x) w_2^{(1)}(x).

Similarly, W_1(x) and W_2(x) are row equivalent, so W_2(x) has full column rank everywhere. There exists P_2(x) ∈ C[x]^{(s+2)×(s+1)} such that

P_2(x)W_1(x) = W_2(x).

Continuing this process, we can finally get

W_2(x) ∼ ··· ∼ W_t(x) = [ 1   r_{12}(x)   r_{13}(x)   ···   r_{1t}(x) ]
                        [ 0   1           r_{23}(x)   ···   r_{2t}(x) ]
                        [ 0   0           1           ···   r_{3t}(x) ]
                        [ ⋮   ⋮           ⋮            ⋱    ⋮         ]
                        [ 0   0           0           ···   1         ]
                        [ 0   0           0           ···   0         ]

Consequently, there exist P_i(x) ∈ C[x]^{(s+i)×(s+i−1)}, for i = 1, 2, ..., t, such that

P_t(x) P_{t−1}(x) ··· P_1(x) W(x) = W_t(x).

Since W_t(x) is a unit upper triangular matrix polynomial (stacked over zero rows), there exists P_{t+1}(x) ∈ C[x]^{t×(s+t)} such that P_{t+1}(x)W_t(x) = I_t. Let

P(x) = P_{t+1}(x) P_t(x) P_{t−1}(x) ··· P_1(x);

then P(x)W(x) = I_t. Note that P(x) ∈ C[x]^{t×s}.

For W(x) ∈ R[x]^{s×t}, we can replace P(x) by ( P(x) + conj(P(x)) )/2 (where conj(P(x)) denotes the complex conjugate of P(x)), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i), applied to W(x) = C(x) with s = n + m and t = m. □

In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), it can be determined by comparing the coefficients of both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).

Example 5.3. Consider the hypercube with quadratic constraints

{ 1 − x_1^2 ≥ 0, 1 − x_2^2 ≥ 0, ..., 1 − x_n^2 ≥ 0 }.

The equation L(x)C(x) = I_n is satisfied by

L(x) = [ −(1/2)diag(x)   I_n ].
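A sketch verifying this identity and showing the resulting Lagrange multiplier expressions (sympy assumed; the objective f below is an arbitrary sample chosen for the sketch):

```python
import sympy as sp

n = 3
x = sp.symbols(f'x1:{n+1}')
X = sp.diag(*x)

# c_i = 1 - x_i^2; C(x) stacks the gradients -2 x_i e_i over diag(c), as in (5.1)
C = sp.Matrix.vstack(-2*X, sp.diag(*[1 - xi**2 for xi in x]))
L = sp.Matrix.hstack(-X/2, sp.eye(n))              # L(x) = [-diag(x)/2, I_n]
assert (L*C).expand() == sp.eye(n)

# multiplier expressions lambda = L1(x) grad f(x), here L1(x) = -diag(x)/2
f = x[0]*x[1]*x[2]                                 # sample objective
lam = (-X/2) * sp.Matrix([sp.diff(f, xi) for xi in x])
print(lam.T.expand())                              # lambda_i = -x_i/2 * df/dx_i
```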

Example 5.4. Consider the nonnegative portion of the unit sphere

{ x_1 ≥ 0, x_2 ≥ 0, ..., x_n ≥ 0, x_1^2 + ··· + x_n^2 − 1 = 0 }.

The equation L(x)C(x) = I_{n+1} is satisfied by

L(x) = [ I_n − x x^T    x 1_n^T        2x ]
       [ (1/2)x^T       −(1/2)1_n^T    −1 ]
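This example mixes an equality with inequalities; the identity still checks out symbolically (a sympy sketch, which is an assumption here):

```python
import sympy as sp

n = 3
x = sp.Matrix(sp.symbols(f'x1:{n+1}'))   # column vector
ones = sp.ones(n, 1)

# constraints: x_i >= 0 (i = 1..n) and x^T x - 1 = 0; C(x) as in (5.1)
G = sp.Matrix.hstack(sp.eye(n), 2*x)                    # gradients, n x (n+1)
D = sp.diag(*(list(x) + [(x.T*x)[0] - 1]))              # diag of constraints
C = sp.Matrix.vstack(G, D)

L = sp.Matrix.vstack(
    sp.Matrix.hstack(sp.eye(n) - x*x.T, x*ones.T, 2*x),
    sp.Matrix.hstack(x.T/2, -ones.T/2, sp.Matrix([[-1]])))

assert (L*C).expand() == sp.eye(n + 1)
print("L(x)C(x) = I_{n+1} holds for n =", n)
```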

Example 5.5. Consider the set

{ 1 − x_1^3 − x_2^4 ≥ 0, 1 − x_3^4 − x_4^3 ≥ 0 }.

The equation L(x)C(x) = I_2 is satisfied by

L(x) = [ −x_1/3   −x_2/4   0        0        1   0 ]
       [ 0        0        −x_3/4   −x_4/3   0   1 ]

Example 5.6. Consider the quadratic set

{ 1 − x_1x_2 − x_2x_3 − x_1x_3 ≥ 0, 1 − x_1^2 − x_2^2 − x_3^2 ≥ 0 }.

The matrix L(x)^T satisfying L(x)C(x) = I_2 is

[ 25x_1^3 + 10x_1^2x_2 + 40x_1x_2^2 − 25x_1 − 2x_3      −25x_1^3 − 10x_1^2x_2 − 40x_1x_2^2 + 49x_1/2 + 2x_3    ]
[ −15x_1^2x_2 + 10x_1x_2^2 + 20x_3x_1x_2 − 10x_1        15x_1^2x_2 − 10x_1x_2^2 − 20x_3x_1x_2 + 10x_1 − x_2/2  ]
[ 25x_3x_1^2 − 20x_1x_2^2 + 10x_3x_1x_2 + 2x_1          −25x_3x_1^2 + 20x_1x_2^2 − 10x_3x_1x_2 − 2x_1 − x_3/2  ]
[ 1 − 20x_1x_3 − 10x_1^2 − 20x_1x_2                     20x_1x_2 + 20x_1x_3 + 10x_1^2                          ]
[ −50x_1^2 − 20x_2x_1                                   50x_1^2 + 20x_2x_1 + 1                                 ]

We would like to remark that a polynomial tuple c = (c_1, ..., c_m) is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d_1, ..., d_m, there exists an open dense subset U of D = R[x]_{d_1} × ··· × R[x]_{d_m} such that every tuple c = (c_1, ..., c_m) ∈ U is nonsingular. Indeed, such U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.


Proof The proof needs to use resultants and discriminants which we refer to [29]First let J1 be the set of all (i1 in+1) with 1 le i1 lt middot middot middot lt in+1 le m The

resultant Res(ci1 cin+1) [29 Section 2] is a polynomial in the coefficients of

ci1 cin+1such that if Res(ci1 cin+1

) 6= 0 then the equations

ci1(x) = middot middot middot = cin+1(x) = 0

have no complex solutions Define

F1(c) =prod

(i1in+1)isinJ1

Res(ci1 cin+1)

For the case that m le n J1 = empty and we just simply let F1(c) = 1 Clearly ifF1(c) 6= 0 then no more than n polynomials of c1 cm have a common complexzero

Second let J2 be the set of all (j1 jk) with k le n and 1 le j1 lt middot middot middot lt jk le mWhen one of cj1 cjk has degree bigger than one the discriminant ∆(cj1 cjk)is a polynomial in the coefficients of cj1 cjk such that if ∆(cj1 cjk) 6= 0 thenthe equations

cj1(x) = middot middot middot = cjk(x) = 0

have no singular complex solution [29 Section 3] ie at every complex commonsolution u the gradients of cj1 cjk at u are linearly independent When allcj1 cjk have degree one the discriminant of the tuple (cj1 cjk) is not a singlepolynomial but we can define ∆(cj1 cjk) to be the product of all maximumminors of its Jacobian (a constant matrix) Define

F2(c) =prod

(j1jk)isinJ2

∆(cj1 cjk)

Clearly if F2(c) 6= 0 then then no n or less polynomials of c1 cm have a singularcomplex comon zero

Last, let $F(c) = F_1(c)F_2(c)$ and
$$U = \{\, c = (c_1, \ldots, c_m) \in D : F(c) \ne 0 \,\}.$$
Note that $U$ is a Zariski open subset of $D$ and it is open dense in $D$. For all $c \in U$, no more than $n$ of $c_1, \ldots, c_m$ can have a complex common zero. For any $k$ polynomials ($k \le n$) of $c_1, \ldots, c_m$, if they have a complex common zero, say $u$, then their gradients at $u$ must be linearly independent. This means that $c$ is a nonsingular tuple.

Since every $c \in U$ is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all $c \in U$, so it holds generically. $\square$
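As a small illustration of the two certificates used in this proof (a sketch, not from the paper), resultants and discriminants can be computed with a computer algebra system for $n = 1$; the polynomials below are sample choices:

```python
# Sketch: a nonzero resultant certifies that two polynomials have no common
# complex zero; a nonzero discriminant certifies that all complex zeros of a
# polynomial are simple (nonsingular). Sample polynomials, not from the paper.
from sympy import symbols, resultant, discriminant

x = symbols('x')
c1 = x**2 + 1          # roots +-i
c2 = x - 2             # root 2, not shared with c1

assert resultant(c1, c2, x) != 0     # no common complex zero
assert discriminant(c1, x) != 0      # both roots of c1 are simple

c3 = (x - 1)**2        # a double root at 1: a singular zero
assert discriminant(c3, x) == 0
```

In the proof above, the products $F_1$ and $F_2$ of such quantities over all index subsets cut out the exceptional (singular) tuples.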

6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1) with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for the computational results.

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 21

The polynomials $p_i$ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as $c_1, \ldots, c_m$. First, find a matrix polynomial $L(x)$ satisfying (4.4) or (5.2). Let $L_1(x)$ be the submatrix of $L(x)$ consisting of the first $n$ columns. Then choose $(p_1, \ldots, p_m)$ to be the product $L_1(x)\nabla f(x)$, i.e.,
$$p_i = \big( L_1(x) \nabla f(x) \big)_i.$$
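For instance (a sketch with a sample objective, not an example from the paper), for the constraints $1 - x_i^2 \ge 0$ the matrix $L(x)$ of Example 5.3 gives $L_1(x) = -\frac{1}{2}\,diag(x)$, so $p_i = -\frac{1}{2} x_i\, \partial f/\partial x_i$:

```python
# Sketch: building p = L1(x) * grad f(x) for the constraints 1 - x_i^2 >= 0,
# where Example 5.3 gives L(x) = [-1/2 diag(x), I_n], hence L1(x) = -1/2 diag(x).
# The objective f = x1^4 + x2^4 + x3^4 is a sample choice, not from the paper.
import numpy as np

def p(x):
    grad_f = 4 * x**3                 # gradient of f = sum_i x_i^4
    L1 = -0.5 * np.diag(x)            # first n columns of L(x)
    return L1 @ grad_f                # p_i = (L1(x) grad f(x))_i

x = np.array([0.3, -0.7, 1.0])
assert np.allclose(p(x), -2.0 * x**4)  # -1/2 * x_i * 4 x_i^3 = -2 x_i^4
```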

In all our examples, the global minimum value $f_{min}$ of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., the sublevel set $\{f(x) \le \ell\}$ is compact for all $\ell$) and the constraint qualification condition holds.

By Theorem 3.3, we have $f_k = f_{min}$ for all $k$ big enough if $f_c = f_{min}$ and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, and the computational time (in seconds) is also compared. The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".

Example 6.1. Consider the optimization problem
$$\min \ x_1 x_2 (10 - x_3) \quad s.t. \ x_1 \ge 0, \ x_2 \ge 0, \ x_3 \ge 0, \ 1 - x_1 - x_2 - x_3 \ge 0.$$
The matrix polynomial $L(x)$ is given in Example 4.2. Since the feasible set is compact, the minimum $f_{min} = 0$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point $(x_1, x_2, x_3)$ with $x_1 x_2 = 0$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

order k | w/o LME (lower bound, time) | with LME (lower bound, time)
   2    | -0.0521, 0.6841             | -0.0521, 0.1922
   3    | -0.0026, 0.2657             | -3·10^-8, 0.2285
   4    | -0.0007, 0.6785             | -6·10^-9, 0.4431
   5    | -0.0004, 1.6105             | -2·10^-9, 0.9567

² Note that $1 - x^T x = (1 - e^T x)(1 + x^T x) + \sum_{i=1}^n x_i (1 - x_i)^2 + \sum_{i \ne j} x_i^2 x_j \in IQ(c_{eq}, c_{in})$.
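The identity in this footnote can be spot-checked numerically (a sketch, not from the paper; the dimension and random points are arbitrary choices):

```python
# Numeric spot-check of the footnote identity
# 1 - x^T x = (1 - e^T x)(1 + x^T x) + sum_i x_i (1 - x_i)^2 + sum_{i != j} x_i^2 x_j.
import numpy as np

rng = np.random.default_rng(0)
for _ in range(100):
    x = rng.normal(size=4)
    s1, s2 = x.sum(), (x**2).sum()
    rhs = (1 - s1) * (1 + s2) + (x * (1 - x)**2).sum() \
        + sum(x[i]**2 * x[j] for i in range(4) for j in range(4) if i != j)
    assert abs((1 - s2) - rhs) < 1e-9
```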


Example 6.2. Consider the optimization problem
$$\min \ x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3 x_1^2 x_2^2 x_3^2 + (x_1^4 + x_2^4 + x_3^4) \quad s.t. \ x_1^2 + x_2^2 + x_3^2 \ge 1.$$
The matrix polynomial $L(x) = \begin{bmatrix} \frac{1}{2} x_1 & \frac{1}{2} x_2 & \frac{1}{2} x_3 & -1 \end{bmatrix}$. The objective $f$ is the sum of the positive definite form $x_1^4 + x_2^4 + x_3^4$ and the Motzkin polynomial
$$M(x) = x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3 x_1^2 x_2^2 x_3^2.$$
Note that $M(x)$ is nonnegative everywhere but not SOS [37]. Clearly, $f$ is coercive, and $f_{min}$ is achieved at a critical point. The set $IQ(\varphi, \psi)$ is archimedean, because
$$c_1(x) p_1(x) = (x_1^2 + x_2^2 + x_3^2 - 1)\big( 3M(x) + 2(x_1^4 + x_2^4 + x_3^4) \big) = 0$$
defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value $f_{min} = \frac{1}{3}$, and there are 8 minimizers $(\pm\frac{1}{\sqrt{3}}, \pm\frac{1}{\sqrt{3}}, \pm\frac{1}{\sqrt{3}})$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

order k | w/o LME (lower bound, time) | with LME (lower bound, time)
   3    | -∞, 0.4466                  | 0.1111, 0.1169
   4    | -∞, 0.4948                  | 0.3333, 0.3499
   5    | -2.1821·10^5, 1.1836        | 0.3333, 0.6530
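The claimed minimum value of Example 6.2 can be spot-checked numerically (a sketch, not part of the paper's computation):

```python
# Spot-check for Example 6.2: the objective f = M(x) + x1^4 + x2^4 + x3^4
# equals 1/3 at the reported minimizers (+-1/sqrt(3), +-1/sqrt(3), +-1/sqrt(3)),
# and the Motzkin form M is nonnegative on random samples.
import itertools
import numpy as np

def M(x1, x2, x3):
    return x1**4 * x2**2 + x1**2 * x2**4 + x3**6 - 3 * x1**2 * x2**2 * x3**2

def f(x1, x2, x3):
    return M(x1, x2, x3) + x1**4 + x2**4 + x3**4

t = 1 / np.sqrt(3)
for s in itertools.product([1, -1], repeat=3):
    assert abs(f(s[0]*t, s[1]*t, s[2]*t) - 1/3) < 1e-12

rng = np.random.default_rng(1)
pts = rng.normal(size=(1000, 3))
assert all(M(*p) >= -1e-8 for p in pts)   # nonnegativity, up to round-off
```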

Example 6.3. Consider the optimization problem
$$\min \ x_1 x_2 + x_2 x_3 + x_3 x_4 - 3 x_1 x_2 x_3 x_4 + (x_1^3 + \cdots + x_4^3) \quad s.t. \ x_1, x_2, x_3, x_4 \ge 0, \ 1 - x_1 - x_2 \ge 0, \ 1 - x_3 - x_4 \ge 0.$$
The matrix polynomial $L(x)$ is
$$\begin{bmatrix}
1-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
-x_1 & 1-x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1-x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & -x_3 & 1-x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}.$$
The feasible set is compact, so $f_{min}$ is achieved at a critical point. One can show that $f_{min} = 0$ and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied because $IQ(c_{eq}, c_{in})$ is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because $-c_1^2 p_1^2 \in Ideal(\varphi) \subseteq IQ(\varphi, \psi)$ and the set $\{-c_1(x)^2 p_1(x)^2 \ge 0\}$ is compact.

⁴ This is because $1 - x_1^2 - x_2^2$ belongs to the quadratic module of $(x_1, x_2, 1 - x_1 - x_2)$ and $1 - x_3^2 - x_4^2$ belongs to the quadratic module of $(x_3, x_4, 1 - x_3 - x_4)$. See the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

order k | w/o LME (lower bound, time) | with LME (lower bound, time)
   3    | -2.9·10^-5, 0.7335          | -6·10^-7, 0.6091
   4    | -1.4·10^-5, 2.5055          | -8·10^-8, 2.7423
   5    | -1.4·10^-5, 12.7092         | -5·10^-8, 13.7449

Example 6.4. Consider the polynomial optimization problem
$$\min_{x \in \mathbb{R}^2} \ x_1^2 + 50 x_2^2 \quad s.t. \ x_1^2 - \tfrac{1}{2} \ge 0, \ x_2^2 - 2 x_1 x_2 - \tfrac{1}{8} \ge 0, \ x_2^2 + 2 x_1 x_2 - \tfrac{1}{8} \ge 0.$$

It is motivated from an example in [12, §3]. The first column of $L(x)$ is
$$\begin{bmatrix} \frac{8 x_1^3}{5} + \frac{x_1}{5} \\[4pt] \frac{288 x_2 x_1^4}{5} - \frac{16 x_1^3}{5} - \frac{124 x_2 x_1^2}{5} + \frac{8 x_1}{5} - 2 x_2 \\[4pt] -\frac{288 x_2 x_1^4}{5} - \frac{16 x_1^3}{5} + \frac{124 x_2 x_1^2}{5} + \frac{8 x_1}{5} + 2 x_2 \end{bmatrix},$$
and the second column of $L(x)$ is
$$\begin{bmatrix} -\frac{8 x_1^2 x_2}{5} + \frac{4 x_2^3}{5} - \frac{x_2}{10} \\[4pt] \frac{288 x_1^3 x_2^2}{5} + \frac{16 x_1^2 x_2}{5} - \frac{142 x_1 x_2^2}{5} - \frac{9 x_1}{20} - \frac{8 x_2^3}{5} + \frac{11 x_2}{5} \\[4pt] -\frac{288 x_1^3 x_2^2}{5} + \frac{16 x_1^2 x_2}{5} + \frac{142 x_1 x_2^2}{5} + \frac{9 x_1}{20} - \frac{8 x_2^3}{5} + \frac{11 x_2}{5} \end{bmatrix}.$$
The objective is coercive, so $f_{min}$ is achieved at a critical point. The minimum value $f_{min} = 56 + \frac{3}{4} + 25\sqrt{5} \approx 112.6517$, and the minimizers are $\big( \pm\sqrt{\tfrac{1}{2}},\ \pm(\sqrt{\tfrac{5}{8}} + \sqrt{\tfrac{1}{2}}) \big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

order k | w/o LME (lower bound, time) | with LME (lower bound, time)
   3    | 6.7535, 0.4611              | 56.7500, 0.1309
   4    | 6.9294, 0.2428              | 112.6517, 0.2405
   5    | 8.8519, 0.3376              | 112.6517, 0.2167
   6    | 16.5971, 0.4703             | 112.6517, 0.3788
   7    | 35.4756, 0.6536             | 112.6517, 0.4537

Example 6.5. Consider the optimization problem
$$\min_{x \in \mathbb{R}^3} \ x_1^3 + x_2^3 + x_3^3 + 4 x_1 x_2 x_3 - \big( x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2) \big) \quad s.t. \ x_1 \ge 0, \ x_1 x_2 - 1 \ge 0, \ x_2 x_3 - 1 \ge 0.$$
The matrix polynomial $L(x)$ is
$$\begin{bmatrix} 1 - x_1 x_2 & 0 & 0 & x_2 & x_2 & 0 \\ x_1 & 0 & 0 & -1 & -1 & 0 \\ -x_1 & x_2 & 0 & 1 & 0 & -1 \end{bmatrix}.$$


The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant $\mathbb{R}^3_+$, so the minimum value is achieved at a critical point. In computation, we got $f_{min} \approx 0.9492$ and a global minimizer $(0.9071, 1.1024, 0.9071)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

order k | w/o LME (lower bound, time) | with LME (lower bound, time)
   2    | -∞, 0.4129                  | -∞, 0.1900
   3    | -7.8184·10^6, 0.4641        | 0.9492, 0.3139
   4    | -2.0575·10^4, 0.6499        | 0.9492, 0.5057
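The displayed $L(x)$ of Example 6.5 can be verified numerically against (5.2) (a sketch; the random test points are an arbitrary choice):

```python
# Numeric check that the 3x6 matrix L(x) displayed for Example 6.5 satisfies
# L(x) C(x) = I_3, where C(x) stacks the constraint gradients and diag(c(x))
# for c = (x1, x1*x2 - 1, x2*x3 - 1), as in (5.1)-(5.2).
import numpy as np

rng = np.random.default_rng(2)
for _ in range(20):
    x1, x2, x3 = rng.normal(size=3)
    c = [x1, x1*x2 - 1, x2*x3 - 1]
    grads = np.array([[1, 0, 0], [x2, x1, 0], [0, x3, x2]]).T  # columns = grad c_i
    C = np.vstack([grads, np.diag(c)])                         # 6 x 3
    L = np.array([[1 - x1*x2, 0, 0, x2, x2, 0],
                  [x1,        0, 0, -1, -1, 0],
                  [-x1,      x2, 0,  1,  0, -1]])              # 3 x 6
    assert np.allclose(L @ C, np.eye(3))
```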

Example 6.6. Consider the optimization problem ($x_0 = 1$)
$$\min_{x \in \mathbb{R}^4} \ x^T x + \sum_{i=0}^4 \prod_{j \ne i} (x_i - x_j) \quad s.t. \ x_1^2 - 1 \ge 0, \ x_2^2 - 1 \ge 0, \ x_3^2 - 1 \ge 0, \ x_4^2 - 1 \ge 0.$$
The matrix polynomial $L(x) = \begin{bmatrix} \frac{1}{2} diag(x) & -I_4 \end{bmatrix}$. The first part of the objective is $x^T x$, while the second part is a nonnegative polynomial [37]. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} = 4.0000$ and 11 global minimizers:
$$(1,1,1,1), \ (1,-1,-1,1), \ (1,-1,1,-1), \ (1,1,-1,-1),$$
$$(1,-1,-1,-1), \ (-1,-1,1,1), \ (-1,1,-1,1), \ (-1,1,1,-1),$$
$$(-1,-1,-1,1), \ (-1,-1,1,-1), \ (-1,1,-1,-1).$$

The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

order k | w/o LME (lower bound, time) | with LME (lower bound, time)
   3    | -∞, 1.1377                  | 3.5480, 1.1765
   4    | -6.6913·10^4, 4.7677        | 4.0000, 3.0761
   5    | -21.3778, 22.9970           | 4.0000, 10.3354

Example 6.7. Consider the optimization problem
$$\min_{x \in \mathbb{R}^3} \ x_1^4 x_2^2 + x_2^4 x_3^2 + x_3^4 x_1^2 - 3 x_1^2 x_2^2 x_3^2 + x_2^2 \quad s.t. \ x_1 - x_2 x_3 \ge 0, \ -x_2 + x_3^2 \ge 0.$$
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}$. By the arithmetic-geometric mean inequality, one can show that $f_{min} = 0$. The global minimizers are $(x_1, 0, x_3)$ with $x_1 \ge 0$ and $x_1 x_3 = 0$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that $f_k = f_{min}$ for all $k \ge 5$, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

order k | w/o LME (lower bound, time) | with LME (lower bound, time)
   3    | -∞, 0.6144                  | -∞, 0.3418
   4    | -1.0909·10^7, 1.0542        | -3.9476, 0.7180
   5    | -942.6772, 1.6771           | -3·10^-9, 1.4607
   6    | -0.0110, 3.3532             | -8·10^-10, 3.1618

Example 6.8. Consider the optimization problem
$$\min_{x \in \mathbb{R}^4} \ x_1^2 (x_1 - x_4)^2 + x_2^2 (x_2 - x_4)^2 + x_3^2 (x_3 - x_4)^2 + 2 x_1 x_2 x_3 (x_1 + x_2 + x_3 - 2 x_4) + (x_1 - 1)^2 + (x_2 - 1)^2 + (x_3 - 1)^2$$
$$s.t. \ x_1 - x_2 \ge 0, \ x_2 - x_3 \ge 0.$$
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}$. In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} \approx 0.9413$ and a minimizer
$$(0.5632, 0.5632, 0.5632, 0.7510).$$
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

order k | w/o LME (lower bound, time) | with LME (lower bound, time)
   2    | -∞, 0.3984                  | -0.3360, 0.9321
   3    | -∞, 0.7634                  | 0.9413, 0.5240
   4    | -6.4896·10^5, 4.5496        | 0.9413, 1.7192
   5    | -3.1645·10^3, 24.3665       | 0.9413, 8.1228

Example 6.9. Consider the optimization problem
$$\min_{x \in \mathbb{R}^4} \ (x_1 + x_2 + x_3 + x_4 + 1)^2 - 4(x_1 x_2 + x_2 x_3 + x_3 x_4 + x_4 + x_1) \quad s.t. \ 0 \le x_1, \ldots, x_4 \le 1.$$
The matrix $L(x)$ is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so $f_{min}$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value $f_{min} = 0$. For each $t \in [0, 1]$, the point $(t, 0, 0, 1-t)$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that $4 - \sum_{i=1}^4 x_i^2 = \sum_{i=1}^4 \big( x_i (1 - x_i)^2 + (1 - x_i)(1 + x_i^2) \big) \in IQ(c_{eq}, c_{in})$.


Table 9. Computational results for Example 6.9

order k | w/o LME (lower bound, time) | with LME (lower bound, time)
   2    | -0.0279, 0.2262             | -5·10^-6, 1.1835
   3    | -0.0005, 0.4691             | -6·10^-7, 1.6566
   4    | -0.0001, 3.1098             | -2·10^-7, 5.5234
   5    | -4·10^-5, 16.5092           | -6·10^-7, 19.7320

For some polynomial optimization problems, the standard Lasserre relaxations might converge fast, e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem ($x_0 = 1$)
$$\min_{x \in \mathbb{R}^n} \ \sum_{0 \le i \le j \le k \le n} c_{ijk} x_i x_j x_k \quad s.t. \ 0 \le x \le 1,$$
where each coefficient $c_{ijk}$ is randomly generated (by randn in MATLAB). The matrix $L(x)$ is the same as in Example 4.3. Since the feasible set is compact, we always have $f_c = f_{min}$. The condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order $k = 2$, which is also the case for the new relaxations (3.8)-(3.9). Here we compare the computational time that is consumed by the standard Lasserre relaxations and (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each $n$ in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order $k = 2$, while their consumed time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

   n     |   9    |   10   |   11   |   12    |   13    |   14
w/o LME  | 1.2569 | 2.5619 | 6.3085 | 15.8722 | 35.1675 | 78.4111
with LME | 1.9714 | 3.8288 | 8.2519 | 20.0310 | 37.6373 | 82.4778

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value $f_{min}$ is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that
$$IQ(c_{eq}, c_{in})_{2k} + IQ(\varphi, \psi)_{2k} = Ideal(c_{eq}, \varphi)_{2k} + Qmod(c_{in}, \psi)_{2k}.$$
If we replace the quadratic module $Qmod(c_{in}, \psi)$ by the preordering of $(c_{in}, \psi)$ [21, 25], we can get further tighter relaxations. For convenience, write $(c_{in}, \psi)$ as a single tuple $(g_1, \ldots, g_\ell)$. Its preordering is the set
$$Preord(c_{in}, \psi) = \sum_{r_1, \ldots, r_\ell \in \{0,1\}} g_1^{r_1} \cdots g_\ell^{r_\ell} \, \Sigma[x].$$


The truncation $Preord(c_{in}, \psi)_{2k}$ is similarly defined, like $Qmod(c_{in}, \psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same order $k$, is
$$(7.1) \quad \begin{array}{rl} f'_{pre,k} := \min & \langle f, y \rangle \\ s.t. & \langle 1, y \rangle = 1, \ L^{(k)}_{c_{eq}}(y) = 0, \ L^{(k)}_{\varphi}(y) = 0, \\ & L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \ \ \forall\, r_1, \ldots, r_\ell \in \{0, 1\}, \\ & y \in \mathbb{R}^{\mathbb{N}^n_{2k}}. \end{array}$$
Similar to (3.9), the dual optimization problem of the above is
$$(7.2) \quad f_{pre,k} := \max \ \gamma \quad s.t. \ f - \gamma \in Ideal(c_{eq}, \varphi)_{2k} + Preord(c_{in}, \psi)_{2k}.$$
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose $K_c \ne \emptyset$ and Assumption 3.1 holds. Then
$$f_{pre,k} = f'_{pre,k} = f_c$$
for all $k$ sufficiently large. Therefore, if the minimum value $f_{min}$ of (1.1) is achieved at a critical point, then $f_{pre,k} = f'_{pre,k} = f_{min}$ for all $k$ big enough.

Proof. The proof is very similar to the Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have $f(x) \equiv 0$ on the set
$$K_3 = \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \varphi(x) = 0, \ c_{in}(x) \ge 0, \ \psi(x) \ge 0 \}.$$
By the Positivstellensatz, there exist an integer $\ell > 0$ and $q \in Preord(c_{in}, \psi)$ such that $f^{2\ell} + q \in Ideal(c_{eq}, \varphi)$. The rest of the proof is the same. $\square$

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple $c$ of constraining polynomials is nonsingular, then there exists a matrix polynomial $L(x)$ such that $L(x)C(x) = I_m$. Hence, the Lagrange multiplier $\lambda$ can be expressed as in (5.3). However, if $c$ is not nonsingular, then such $L(x)$ does not exist. For such cases, how can we express $\lambda$ in terms of $x$ for critical pairs $(x, \lambda)$? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for L(x). For a nonsingular tuple $c$ of constraining polynomials, what is a good degree bound for $L(x)$ in Proposition 5.2? When $c$ is linear, a degree bound is given in Proposition 4.1. However, for nonlinear $c$, an explicit degree bound is not known. Theoretically, we can get a degree bound for $L(x)$: in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used $t$ times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. For each usage, if the degree bound in [16] is applied, then a degree bound for $L(x)$ can eventually be obtained. However, such an obtained bound is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for $L(x)$.


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector $\lambda$ is determined by a linear equation. Naturally, one can get
$$\lambda = \big( C(x)^T C(x) \big)^{-1} C(x)^T \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}$$
when $C(x)$ has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, $\lambda$ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83-111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties, and algorithms: An introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189-226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189-200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal On Optimization, 21 (2011), pp. 824-832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503-523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761-779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, pp. 293-310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93-109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963-975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796-817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607-647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864-885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995-2014.
[21] J.B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1-26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157-270, 2009.
[26] M. Laurent. Optimization over polynomials: Selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587-606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697-718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167-191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225-255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634-1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485-510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97-121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247-274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83-99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969-984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251-272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920-942.
[39] C. Scheiderer. Positivity and sums of squares: A guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271-324.
[40] J.F. Sturm. SeDuMi 1.02: A MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625-653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941-951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093

E-mail address: njw@math.ucsd.edu


Step II: We show that there exist real scalars $\nu_I$ satisfying
$$(4.7) \quad \sum_{I \in V} \nu_I c_I(x) = 1.$$
This can be shown by induction on $m$.

• When $m = n$, $V = \{\emptyset\}$ and $c_\emptyset(x) = 1$, so (4.7) is clearly true.

• When $m > n$, let
$$(4.8) \quad N = \{\, i \in [m] \mid rank\, A_{[m] \setminus i} = n \,\}.$$
For each $i \in N$, let $V_i$ be the set of all $I' \subseteq [m] \setminus \{i\}$ such that $|I'| = m - n - 1$ and $rank\, A_{[m] \setminus (I' \cup \{i\})} = n$. For each $i \in N$, by the assumption, the linear function $A_{[m] \setminus i}\, x - b_{[m] \setminus i}$ is nonsingular. By induction, there exist real scalars $\nu^{(i)}_{I'}$ satisfying
$$(4.9) \quad \sum_{I' \in V_i} \nu^{(i)}_{I'} c_{I'}(x) = 1.$$
Since $rank\, A = n$, we can generally assume that $\{a_1, \ldots, a_n\}$ is linearly independent. So there exist scalars $\alpha_1, \ldots, \alpha_n$ such that
$$a_m = \alpha_1 a_1 + \cdots + \alpha_n a_n.$$
If all $\alpha_i = 0$, then $a_m = 0$ and hence $A$ can be replaced by its first $m-1$ rows, so (4.7) is true by the induction. In the following, suppose at least one $\alpha_i \ne 0$, and write
$$\{\, i : \alpha_i \ne 0 \,\} = \{ i_1, \ldots, i_k \}.$$
Then $a_{i_1}, \ldots, a_{i_k}, a_m$ are linearly dependent. For convenience, set $i_{k+1} = m$. Since $Ax - b$ is nonsingular, the linear system
$$c_{i_1}(x) = \cdots = c_{i_k}(x) = c_{i_{k+1}}(x) = 0$$
has no solutions. Hence, there exist real scalars $\mu_1, \ldots, \mu_{k+1}$ such that
$$\mu_1 c_{i_1}(x) + \cdots + \mu_k c_{i_k}(x) + \mu_{k+1} c_{i_{k+1}}(x) = 1.$$
(The above can be implied by the echelon form for inconsistent linear systems.) Note that $i_1, \ldots, i_{k+1} \in N$. For each $j = 1, \ldots, k+1$, by (4.9),
$$\sum_{I' \in V_{i_j}} \nu^{(i_j)}_{I'} c_{I'}(x) = 1.$$
Then we can get
$$1 = \sum_{j=1}^{k+1} \mu_j c_{i_j}(x) = \sum_{j=1}^{k+1} \mu_j \sum_{I' \in V_{i_j}} \nu^{(i_j)}_{I'} c_{i_j}(x) c_{I'}(x) = \sum_{\substack{I = I' \cup \{i_j\} \\ I' \in V_{i_j},\ 1 \le j \le k+1}} \nu^{(i_j)}_{I'} \mu_j \, c_I(x).$$
Since each $I' \cup \{i_j\} \in V$, (4.7) must be satisfied by some scalars $\nu_I$.


Step III: For $L_I(x)$ as in (4.5), we construct $L(x)$ as
$$(4.10) \quad L(x) = \sum_{I \in V} \nu_I c_I(x) L_I(x).$$
Clearly, $L(x)$ satisfies (4.4), because
$$L(x) C(x) = \sum_{I \in V} \nu_I c_I(x) L_I(x) C(x) = \sum_{I \in V} \nu_I c_I(x) I_m = I_m.$$
Each $L_I(x)$ has degree $\le m - n$, so $L(x)$ has degree $\le m - n$. $\square$

Proposition 4.1 characterizes when there exists $L(x)$ satisfying (4.4). When it does, a degree bound for $L(x)$ is $m - rank\, A$. Sometimes its degree can be smaller than that, as shown in Example 4.3. For given $A, b$, the matrix polynomial $L(x)$ satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of $L(x)C(x) = I_m$ for polyhedral sets.

Example 4.2. Consider the simplicial set
$$\{\, x_1 \ge 0, \ldots, x_n \ge 0, \ 1 - e^T x \ge 0 \,\}.$$
The equation $L(x)C(x) = I_{n+1}$ is satisfied by
$$L(x) = \begin{bmatrix}
1-x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
-x_1 & 1-x_2 & \cdots & -x_n & 1 & \cdots & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
-x_1 & -x_2 & \cdots & 1-x_n & 1 & \cdots & 1 \\
-x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1
\end{bmatrix}.$$
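This identity can be checked numerically (a sketch; the dimension $n = 4$ and the random test point are arbitrary choices):

```python
# Numeric check of Example 4.2: for the simplex constraints
# x1 >= 0, ..., xn >= 0, 1 - e^T x >= 0, the displayed L(x) satisfies
# L(x) C(x) = I_{n+1}, where C(x) stacks the gradients and diag(c(x)).
import numpy as np

n = 4
rng = np.random.default_rng(3)
x = rng.normal(size=n)
c = np.concatenate([x, [1 - x.sum()]])               # the n+1 constraints
grads = np.hstack([np.eye(n), -np.ones((n, 1))])     # columns = grad c_i
C = np.vstack([grads, np.diag(c)])                   # (2n+1) x (n+1)
# rows of L: e_i^T - x^T (i = 1..n), then -x^T; followed by n+1 ones
L = np.hstack([np.vstack([np.eye(n), np.zeros((1, n))]) - np.outer(np.ones(n+1), x),
               np.ones((n+1, n+1))])                 # (n+1) x (2n+1)
assert np.allclose(L @ C, np.eye(n+1))
```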

Example 4.3. Consider the box constraint
$$\{\, x_1 \ge 0, \ldots, x_n \ge 0, \ 1 - x_1 \ge 0, \ldots, 1 - x_n \ge 0 \,\}.$$
The equation $L(x)C(x) = I_{2n}$ is satisfied by
$$L(x) = \begin{bmatrix} I_n - diag(x) & I_n & I_n \\ -diag(x) & I_n & I_n \end{bmatrix}.$$
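This one can be verified numerically as well (a sketch; the dimension and test point are arbitrary choices):

```python
# Numeric check of Example 4.3: for the box constraints
# x_i >= 0, 1 - x_i >= 0 (i = 1,...,n), the displayed L(x) satisfies
# L(x) C(x) = I_{2n}, with C(x) stacking gradients and diag of constraints.
import numpy as np

n = 3
rng = np.random.default_rng(4)
x = rng.normal(size=n)
I, D = np.eye(n), np.diag(x)
grads = np.hstack([I, -I])        # columns = gradients of the 2n constraints
C = np.vstack([grads, np.diag(np.concatenate([x, 1 - x]))])   # 3n x 2n
L = np.block([[I - D, I, I],
              [-D,    I, I]])     # 2n x 3n
assert np.allclose(L @ C, np.eye(2*n))
```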

Example 4.4. Consider the polyhedral set
$$\{\, 1 - x_4 \ge 0, \ x_4 - x_3 \ge 0, \ x_3 - x_2 \ge 0, \ x_2 - x_1 \ge 0, \ x_1 + 1 \ge 0 \,\}.$$
The equation $L(x)C(x) = I_5$ is satisfied by
$$L(x) = \frac{1}{2}\begin{bmatrix}
-x_1-1 & -x_2-1 & -x_3-1 & -x_4-1 & 1 & 1 & 1 & 1 & 1 \\
-x_1-1 & -x_2-1 & -x_3-1 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\
-x_1-1 & -x_2-1 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\
-x_1-1 & 1-x_2 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1 \\
1-x_1 & 1-x_2 & 1-x_3 & 1-x_4 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}.$$

Example 4.5. Consider the polyhedral set
$$\{\, 1 + x_1 \ge 0, \ 1 - x_1 \ge 0, \ 2 - x_1 - x_2 \ge 0, \ 2 - x_1 + x_2 \ge 0 \,\}.$$
The matrix $L(x)$ satisfying $L(x)C(x) = I_4$ is
$$\frac{1}{6}\begin{bmatrix}
x_1^2 - 3x_1 + 2 & x_1 x_2 - x_2 & 4 - x_1 & 2 - x_1 & 1 - x_1 & 1 - x_1 \\
3x_1^2 - 3x_1 - 6 & 3x_2 + 3x_1 x_2 & 6 - 3x_1 & -3x_1 & -3x_1 - 3 & -3x_1 - 3 \\
1 - x_1^2 & -2x_2 - x_1 x_2 - 3 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2 \\
1 - x_1^2 & 3 - x_1 x_2 - 2x_2 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2
\end{bmatrix}.$$


5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express the Lagrange multipliers $\lambda_i$ as polynomial functions in $x$ on the set of critical points.

Suppose there are totally $m$ equality and inequality constraints, i.e.,
$$\mathcal{E} \cup \mathcal{I} = \{1, \ldots, m\}.$$
If $(x, \lambda)$ is a critical pair, then $\lambda_i c_i(x) = 0$ for all $i \in \mathcal{E} \cup \mathcal{I}$. So the Lagrange multiplier vector $\lambda = \begin{bmatrix} \lambda_1 & \cdots & \lambda_m \end{bmatrix}^T$ satisfies the equation
$$(5.1) \quad \underbrace{\begin{bmatrix} \nabla c_1(x) & \nabla c_2(x) & \cdots & \nabla c_m(x) \\ c_1(x) & 0 & \cdots & 0 \\ 0 & c_2(x) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & c_m(x) \end{bmatrix}}_{C(x)} \lambda = \begin{bmatrix} \nabla f(x) \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}.$$
Let $C(x)$ be as in the above. If there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ such that
$$(5.2) \quad L(x) C(x) = I_m,$$
then we can get
$$(5.3) \quad \lambda = L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} = L_1(x) \nabla f(x),$$
where $L_1(x)$ consists of the first $n$ columns of $L(x)$. Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such $L(x)$ exists.

Definition 5.1. The tuple $c = (c_1, \ldots, c_m)$ of constraining polynomials is said to be nonsingular if $rank\, C(u) = m$ for every $u \in \mathbb{C}^n$.

Clearly, $c$ being nonsingular is equivalent to the condition that, for each $u \in \mathbb{C}^n$, if $J(u) = \{i_1, \ldots, i_k\}$ (see (1.2) for the notation), then the gradients $\nabla c_{i_1}(u), \ldots, \nabla c_{i_k}(u)$ are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple $c$ is nonsingular.

Proposition 5.2. (i) For each $W(x) \in \mathbb{C}[x]^{s \times t}$ with $s \ge t$: $rank\, W(u) = t$ for all $u \in \mathbb{C}^n$ if and only if there exists $P(x) \in \mathbb{C}[x]^{t \times s}$ such that
$$P(x) W(x) = I_t.$$
Moreover, for $W(x) \in \mathbb{R}[x]^{s \times t}$, we can choose $P(x) \in \mathbb{R}[x]^{t \times s}$ in the above.
(ii) The constraining polynomial tuple $c$ is nonsingular if and only if there exists $L(x) \in \mathbb{R}[x]^{m \times (m+n)}$ satisfying (5.2).

Proof. (i) "⇐": If $P(x)W(x) = I_t$, then for all $u \in \mathbb{C}^n$,
$$t = rank\, I_t \le rank\, W(u) \le t.$$
So $W(x)$ must have full column rank everywhere.

"⇒": Suppose $rank\, W(u) = t$ for all $u \in \mathbb{C}^n$. Write $W(x)$ in columns:
$$W(x) = \begin{bmatrix} w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix}.$$
Then the equation $w_1(x) = 0$ does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists $\xi_1(x) \in \mathbb{C}[x]^s$ such that $\xi_1(x)^T w_1(x) = 1$. For each $i = 2, \ldots, t$, denote
$$r_{1i}(x) = \xi_1(x)^T w_i(x);$$
then (use $\sim$ to denote row equivalence between matrices)
$$W(x) \sim \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix} \sim W_1(x) = \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ 0 & w_2^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix},$$
where each ($i = 2, \ldots, t$)
$$w_i^{(1)}(x) = w_i(x) - r_{1i}(x) w_1(x).$$
So there exists $P_1(x) \in \mathbb{C}[x]^{(s+1) \times s}$ such that
$$P_1(x) W(x) = W_1(x).$$

Since $W(x)$ and $W_1(x)$ are row equivalent, $W_1(x)$ must also have full column rank everywhere. Similarly, the polynomial equation
$$w_2^{(1)}(x) = 0$$
does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists $\xi_2(x) \in \mathbb{C}[x]^s$ such that
$$\xi_2(x)^T w_2^{(1)}(x) = 1.$$
For each $i = 3, \ldots, t$, let $r_{2i}(x) = \xi_2(x)^T w_i^{(1)}(x)$; then
$$W_1(x) \sim \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & w_2^{(1)}(x) & w_3^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix} \sim W_2(x) = \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & 0 & w_3^{(2)}(x) & \cdots & w_t^{(2)}(x) \end{bmatrix},$$
where each ($i = 3, \ldots, t$)
$$w_i^{(2)}(x) = w_i^{(1)}(x) - r_{2i}(x) w_2^{(1)}(x).$$

Similarly, $W_1(x)$ and $W_2(x)$ are row equivalent, so $W_2(x)$ has full column rank everywhere. There exists $P_2(x) \in \mathbb{C}[x]^{(s+2) \times (s+1)}$ such that
$$P_2(x) W_1(x) = W_2(x).$$
Continuing this process, we can finally get
$$W_2(x) \sim \cdots \sim W_t(x) = \begin{bmatrix} 1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\ 0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\ 0 & 0 & 1 & \cdots & r_{3t}(x) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}.$$
Consequently, there exists $P_i(x) \in \mathbb{C}[x]^{(s+i) \times (s+i-1)}$, for $i = 1, 2, \ldots, t$, such that
$$P_t(x) P_{t-1}(x) \cdots P_1(x) W(x) = W_t(x).$$

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 19

Since $W_t(x)$ is a unit upper triangular matrix polynomial, there exists $P_{t+1}(x) \in \mathbb{R}[x]^{t\times(s+t)}$ such that $P_{t+1}(x)W_t(x) = I_t$. Let
$$P(x) = P_{t+1}(x)P_t(x)P_{t-1}(x) \cdots P_1(x),$$
then $P(x)W(x) = I_t$. Note that $P(x) \in \mathbb{C}[x]^{t\times s}$.

For $W(x) \in \mathbb{R}[x]^{s\times t}$, we can replace $P(x)$ by $\big(P(x) + \overline{P(x)}\big)/2$ (the $\overline{P(x)}$ denotes the complex conjugate of $P(x)$), which is a real matrix polynomial.

(ii) The conclusion is implied directly by the item (i). $\square$

In Proposition 5.2, there is no explicit degree bound for $L(x)$ satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for $L(x)$, it can be determined by comparing coefficients of both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of $L(x)$ satisfying (5.2).
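The coefficient-matching step is a routine linear solve. The following sympy sketch (an illustration, not part of the paper) carries it out for the single constraint $c(x) = 1 - x^2$, with an assumed degree-1 ansatz for the first entry of $L(x)$:

```python
import sympy as sp

x, a0, a1, b0 = sp.symbols('x a0 a1 b0')

# Single constraint c(x) = 1 - x^2; C(x) stacks the gradient and c(x) itself.
c = 1 - x**2
C = sp.Matrix([sp.diff(c, x), c])

# Ansatz L(x) = [a0 + a1*x, b0]; matching coefficients of L(x)C(x) - 1 = 0
# in powers of x gives a linear system in (a0, a1, b0).
L = sp.Matrix([[a0 + a1*x, b0]])
residual = sp.expand((L * C)[0, 0] - 1)
sol = sp.solve(sp.Poly(residual, x).coeffs(), [a0, a1, b0])

L_solved = L.subs(sol)   # the solution is L(x) = [-x/2, 1]
```

For larger systems the same recipe applies; only the size of the linear system grows with the chosen degree.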

Example 5.3. Consider the hypercube with quadratic constraints
$$1 - x_1^2 \ge 0, \quad 1 - x_2^2 \ge 0, \quad \ldots, \quad 1 - x_n^2 \ge 0.$$
The equation $L(x)C(x) = I_n$ is satisfied by
$$L(x) = \begin{bmatrix} -\tfrac{1}{2}\operatorname{diag}(x) & I_n \end{bmatrix}.$$
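As a quick numerical sanity check (an illustration, not from the paper), one can verify $L(x)C(x) = I_n$ for this example at a random point with numpy, building $C(x)$ as gradients stacked over the rows $c_i(x)e_i^T$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
x = rng.standard_normal(n)

# C(x) stacks the gradients of c_i = 1 - x_i^2 (top n rows, as columns)
# and the rows c_i(x) e_i^T.
C = np.vstack([-2.0 * np.diag(x), np.diag(1.0 - x**2)])

# L(x) = [-(1/2) diag(x), I_n] from Example 5.3.
L = np.hstack([-0.5 * np.diag(x), np.eye(n)])

err = np.abs(L @ C - np.eye(n)).max()   # ~ 0 up to rounding
```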

Example 5.4. Consider the nonnegative portion of the unit sphere:
$$x_1 \ge 0, \quad x_2 \ge 0, \quad \ldots, \quad x_n \ge 0, \quad x_1^2 + \cdots + x_n^2 - 1 = 0.$$
The equation $L(x)C(x) = I_{n+1}$ is satisfied by
$$L(x) = \begin{bmatrix} I_n - xx^T & x\mathbf{1}_n^T & 2x \\ \tfrac{1}{2}x^T & -\tfrac{1}{2}\mathbf{1}_n^T & -1 \end{bmatrix}.$$
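The identity $L(x)C(x) = I_{n+1}$ here can also be checked numerically; the numpy sketch below (an illustration only, with the block layout of $C(x)$ spelled out) does so at a random point:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
x = rng.standard_normal(n)
ones = np.ones(n)

# C(x) for x_i >= 0 (i = 1..n) and x^T x - 1 = 0:
# top n rows [I_n, 2x]; then diag(x); last row (x^T x - 1) e_{n+1}^T.
C = np.zeros((2*n + 1, n + 1))
C[:n, :n] = np.eye(n)
C[:n, n] = 2.0 * x
C[n:2*n, :n] = np.diag(x)
C[2*n, n] = x @ x - 1.0

# L(x) from Example 5.4, assembled block by block.
L = np.zeros((n + 1, 2*n + 1))
L[:n, :n] = np.eye(n) - np.outer(x, x)
L[:n, n:2*n] = np.outer(x, ones)
L[:n, 2*n] = 2.0 * x
L[n, :n] = 0.5 * x
L[n, n:2*n] = -0.5 * ones
L[n, 2*n] = -1.0

err = np.abs(L @ C - np.eye(n + 1)).max()
```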

Example 5.5. Consider the set
$$1 - x_1^3 - x_2^4 \ge 0, \quad 1 - x_3^4 - x_4^3 \ge 0.$$
The equation $L(x)C(x) = I_2$ is satisfied by
$$L(x) = \begin{bmatrix} -\tfrac{x_1}{3} & -\tfrac{x_2}{4} & 0 & 0 & 1 & 0 \\ 0 & 0 & -\tfrac{x_3}{4} & -\tfrac{x_4}{3} & 0 & 1 \end{bmatrix}.$$
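The same kind of check works here; a short numpy sketch (illustration only):

```python
import numpy as np

rng = np.random.default_rng(2)
x1, x2, x3, x4 = rng.standard_normal(4)

c1 = 1 - x1**3 - x2**4
c2 = 1 - x3**4 - x4**3
# C(x): the two gradients as columns (top 4 rows), then c1*e1^T, c2*e2^T.
C = np.array([
    [-3*x1**2, 0.0],
    [-4*x2**3, 0.0],
    [0.0, -4*x3**3],
    [0.0, -3*x4**2],
    [c1, 0.0],
    [0.0, c2],
])
L = np.array([
    [-x1/3, -x2/4, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, -x3/4, -x4/3, 0.0, 1.0],
])
err = np.abs(L @ C - np.eye(2)).max()
```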

Example 5.6. Consider the quadratic set
$$1 - x_1x_2 - x_2x_3 - x_1x_3 \ge 0, \quad 1 - x_1^2 - x_2^2 - x_3^2 \ge 0.$$
The matrix $L(x)^T$ satisfying $L(x)C(x) = I_2$ is
$$\begin{bmatrix}
25x_1^3 + 10x_1^2x_2 + 40x_1x_2^2 - 25x_1 - 2x_3 & -25x_1^3 - 10x_1^2x_2 - 40x_1x_2^2 + \tfrac{49x_1}{2} + 2x_3 \\
-15x_1^2x_2 + 10x_1x_2^2 + 20x_1x_2x_3 - 10x_1 & 15x_1^2x_2 - 10x_1x_2^2 - 20x_1x_2x_3 + 10x_1 - \tfrac{x_2}{2} \\
25x_1^2x_3 - 20x_1x_2^2 + 10x_1x_2x_3 + 2x_1 & -25x_1^2x_3 + 20x_1x_2^2 - 10x_1x_2x_3 - 2x_1 - \tfrac{x_3}{2} \\
1 - 20x_1x_3 - 10x_1^2 - 20x_1x_2 & 20x_1x_2 + 20x_1x_3 + 10x_1^2 \\
-50x_1^2 - 20x_1x_2 & 50x_1^2 + 20x_1x_2 + 1
\end{bmatrix}.$$

We would like to remark that a polynomial tuple $c = (c_1, \ldots, c_m)$ is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees $d_1, \ldots, d_m$, there exists an open dense subset $\mathcal{U}$ of $\mathcal{D} = \mathbb{R}[x]_{d_1} \times \cdots \times \mathbb{R}[x]_{d_m}$ such that every tuple $c = (c_1, \ldots, c_m) \in \mathcal{U}$ is nonsingular. Indeed, such $\mathcal{U}$ can be chosen as a Zariski open subset of $\mathcal{D}$, i.e., it is the complement of a proper real variety of $\mathcal{D}$. Moreover, Assumption 3.1 holds for all $c \in \mathcal{U}$, i.e., it holds generically.


Proof. The proof needs to use resultants and discriminants, for which we refer to [29].

First, let $J_1$ be the set of all $(i_1, \ldots, i_{n+1})$ with $1 \le i_1 < \cdots < i_{n+1} \le m$. The resultant $\operatorname{Res}(c_{i_1}, \ldots, c_{i_{n+1}})$ [29, Section 2] is a polynomial in the coefficients of $c_{i_1}, \ldots, c_{i_{n+1}}$ such that if $\operatorname{Res}(c_{i_1}, \ldots, c_{i_{n+1}}) \ne 0$ then the equations
$$c_{i_1}(x) = \cdots = c_{i_{n+1}}(x) = 0$$
have no complex solutions. Define
$$F_1(c) = \prod_{(i_1,\ldots,i_{n+1}) \in J_1} \operatorname{Res}(c_{i_1}, \ldots, c_{i_{n+1}}).$$
For the case that $m \le n$, $J_1 = \emptyset$ and we simply let $F_1(c) = 1$. Clearly, if $F_1(c) \ne 0$, then no more than $n$ polynomials of $c_1, \ldots, c_m$ have a common complex zero.

Second, let $J_2$ be the set of all $(j_1, \ldots, j_k)$ with $k \le n$ and $1 \le j_1 < \cdots < j_k \le m$. When one of $c_{j_1}, \ldots, c_{j_k}$ has degree bigger than one, the discriminant $\Delta(c_{j_1}, \ldots, c_{j_k})$ is a polynomial in the coefficients of $c_{j_1}, \ldots, c_{j_k}$ such that if $\Delta(c_{j_1}, \ldots, c_{j_k}) \ne 0$ then the equations
$$c_{j_1}(x) = \cdots = c_{j_k}(x) = 0$$
have no singular complex solution [29, Section 3], i.e., at every common complex solution $u$, the gradients of $c_{j_1}, \ldots, c_{j_k}$ at $u$ are linearly independent. When all $c_{j_1}, \ldots, c_{j_k}$ have degree one, the discriminant of the tuple $(c_{j_1}, \ldots, c_{j_k})$ is not a single polynomial, but we can define $\Delta(c_{j_1}, \ldots, c_{j_k})$ to be the product of all maximum minors of its Jacobian (a constant matrix). Define
$$F_2(c) = \prod_{(j_1,\ldots,j_k) \in J_2} \Delta(c_{j_1}, \ldots, c_{j_k}).$$
Clearly, if $F_2(c) \ne 0$, then no $n$ or fewer polynomials of $c_1, \ldots, c_m$ have a singular complex common zero.

Last, let $F(c) = F_1(c)F_2(c)$ and
$$\mathcal{U} = \{c = (c_1, \ldots, c_m) \in \mathcal{D} : F(c) \ne 0\}.$$
Note that $\mathcal{U}$ is a Zariski open subset of $\mathcal{D}$, and it is open dense in $\mathcal{D}$. For all $c \in \mathcal{U}$, no more than $n$ of $c_1, \ldots, c_m$ can have a complex common zero. For any $k$ polynomials ($k \le n$) of $c_1, \ldots, c_m$, if they have a complex common zero, say $u$, then their gradients at $u$ must be linearly independent. This means that $c$ is a nonsingular tuple.

Since every $c \in \mathcal{U}$ is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all $c \in \mathcal{U}$, so it holds generically. $\square$

6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for computational results.


The polynomials $p_i$ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as $c_1, \ldots, c_m$. First, find a matrix polynomial $L(x)$ satisfying (4.4) or (5.2). Let $L_1(x)$ be the submatrix of $L(x)$ consisting of the first $n$ columns. Then choose $(p_1, \ldots, p_m)$ to be the product $L_1(x)\nabla f(x)$, i.e.,
$$p_i = \big(L_1(x)\nabla f(x)\big)_i.$$
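For instance, with the hypercube constraints of Example 5.3 one has $L_1(x) = -\tfrac{1}{2}\operatorname{diag}(x)$, so $p_i = -\tfrac{x_i}{2}\,\partial f/\partial x_i$. The sympy sketch below computes the $p_i$ for an assumed toy objective (the objective is not from the paper):

```python
import sympy as sp

x = sp.symbols('x1:4')                       # (x1, x2, x3)
f = x[0]**2*x[1] + x[1]*x[2] + x[2]**4       # assumed toy objective

# For 1 - x_i^2 >= 0 (Example 5.3): L(x) = [-diag(x)/2, I_n],
# so L1(x) = -diag(x)/2 and p_i = -(x_i/2) * df/dx_i.
grad_f = sp.Matrix([sp.diff(f, xi) for xi in x])
L1 = -sp.diag(*x) / 2
p = (L1 * grad_f).applyfunc(sp.expand)
```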

In all our examples, the global minimum value $f_{min}$ of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., the sub-level set $\{f(x) \le \ell\}$ is compact for all $\ell$) and the constraint qualification condition holds.

By Theorem 3.3, we have $f_k = f_{min}$ for all $k$ big enough if $f_c = f_{min}$ and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with standard Lasserre relaxations in [17]. The lower bounds given by relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, together with the computational time (in seconds). The results for standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".

Example 6.1. Consider the optimization problem
$$\begin{array}{rl} \min & x_1x_2(10 - x_3) \\ \text{s.t.} & x_1 \ge 0, \ x_2 \ge 0, \ x_3 \ge 0, \ 1 - x_1 - x_2 - x_3 \ge 0. \end{array}$$
The matrix polynomial $L(x)$ is given in Example 4.2. Since the feasible set is compact, the minimum $f_{min} = 0$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point $(x_1, x_2, x_3)$ with $x_1x_2 = 0$ is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

              w/o LME                  with LME
order k   lower bound   time      lower bound   time
   2       -0.0521     0.6841     -0.0521      0.1922
   3       -0.0026     0.2657     -3·10^-8     0.2285
   4       -0.0007     0.6785     -6·10^-9     0.4431
   5       -0.0004     1.6105     -2·10^-9     0.9567

² Note that $1 - x^Tx = (1 - e^Tx)(1 + x^Tx) + \sum_{i=1}^n x_i(1 - x_i)^2 + \sum_{i \ne j} x_i^2 x_j \in \mathrm{IQ}(c_{eq}, c_{in})$.
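The identity in the footnote can be verified symbolically; a sympy check (illustration only, here for $n = 3$):

```python
import sympy as sp

n = 3
x = sp.symbols('x1:4')
eTx = sum(x)
xTx = sum(xi**2 for xi in x)

# 1 - x^T x = (1 - e^T x)(1 + x^T x)
#             + sum_i x_i (1 - x_i)^2 + sum_{i != j} x_i^2 x_j
rhs = (1 - eTx)*(1 + xTx) \
    + sum(xi*(1 - xi)**2 for xi in x) \
    + sum(x[i]**2*x[j] for i in range(n) for j in range(n) if i != j)
ok = sp.expand((1 - xTx) - rhs) == 0
```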


Example 6.2. Consider the optimization problem
$$\begin{array}{rl} \min & x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 - 3x_1^2x_2^2x_3^2 + (x_1^4 + x_2^4 + x_3^4) \\ \text{s.t.} & x_1^2 + x_2^2 + x_3^2 \ge 1. \end{array}$$
The matrix polynomial $L(x) = \begin{bmatrix} \tfrac{1}{2}x_1 & \tfrac{1}{2}x_2 & \tfrac{1}{2}x_3 & -1 \end{bmatrix}$. The objective $f$ is the sum of the positive definite form $x_1^4 + x_2^4 + x_3^4$ and the Motzkin polynomial
$$M(x) = x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 - 3x_1^2x_2^2x_3^2.$$
Note that $M(x)$ is nonnegative everywhere but not SOS [37]. Clearly, $f$ is coercive and $f_{min}$ is achieved at a critical point. The set $\mathrm{IQ}(\varphi, \psi)$ is archimedean, because
$$c_1(x)p_1(x) = (x_1^2 + x_2^2 + x_3^2 - 1)\big(3M(x) + 2(x_1^4 + x_2^4 + x_3^4)\big) = 0$$
defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value $f_{min} = \tfrac{1}{3}$, and there are 8 minimizers $(\pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}})$. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

              w/o LME                    with LME
order k   lower bound     time      lower bound   time
   3       -∞             0.4466     0.1111      0.1169
   4       -∞             0.4948     0.3333      0.3499
   5       -2.1821·10^5   1.1836     0.3333      0.6530
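The product $c_1p_1$ used above follows from the recipe $p_1 = (L_1(x)\nabla f(x))_1$: since $M$ is a form of degree 6 and $x_1^4+x_2^4+x_3^4$ one of degree 4, Euler's identity gives $\tfrac{1}{2}x^T\nabla f = 3M + 2(x_1^4+x_2^4+x_3^4)$. A sympy check (illustration only):

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
M = x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2
f = M + x1**4 + x2**4 + x3**4

# L1(x) = [x1/2, x2/2, x3/2], so p1 = (1/2) x . grad f.
grad_f = sp.Matrix([sp.diff(f, v) for v in (x1, x2, x3)])
p1 = sp.expand((sp.Matrix([[x1/2, x2/2, x3/2]]) * grad_f)[0, 0])

ok = sp.expand(p1 - (3*M + 2*(x1**4 + x2**4 + x3**4))) == 0
```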

Example 6.3. Consider the optimization problem
$$\begin{array}{rl} \min & x_1x_2 + x_2x_3 + x_3x_4 - 3x_1x_2x_3x_4 + (x_1^3 + \cdots + x_4^3) \\ \text{s.t.} & x_1, x_2, x_3, x_4 \ge 0, \ 1 - x_1 - x_2 \ge 0, \ 1 - x_3 - x_4 \ge 0. \end{array}$$
The matrix polynomial $L(x)$ is
$$\begin{bmatrix}
1 - x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
-x_1 & 1 - x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 - x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & -x_3 & 1 - x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}.$$
The feasible set is compact, so $f_{min}$ is achieved at a critical point. One can show that $f_{min} = 0$ and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because $\mathrm{IQ}(c_{eq}, c_{in})$ is archimedean.⁴ The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because $-c_1^2p_1^2 \in \mathrm{Ideal}(\varphi) \subseteq \mathrm{IQ}(\varphi, \psi)$ and the set $\{-c_1(x)^2p_1(x)^2 \ge 0\}$ is compact.

⁴ This is because $1 - x_1^2 - x_2^2$ belongs to the quadratic module of $(x_1, x_2, 1 - x_1 - x_2)$, and $1 - x_3^2 - x_4^2$ belongs to the quadratic module of $(x_3, x_4, 1 - x_3 - x_4)$. See the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

              w/o LME                    with LME
order k   lower bound     time      lower bound   time
   3       -2.9·10^-5     0.7335    -6·10^-7      0.6091
   4       -1.4·10^-5     2.5055    -8·10^-8      2.7423
   5       -1.4·10^-5    12.7092    -5·10^-8     13.7449

Example 6.4. Consider the polynomial optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^2} & x_1^2 + 50x_2^2 \\ \text{s.t.} & x_1^2 - \tfrac{1}{2} \ge 0, \ \ x_2^2 - 2x_1x_2 - \tfrac{1}{8} \ge 0, \ \ x_2^2 + 2x_1x_2 - \tfrac{1}{8} \ge 0. \end{array}$$
It is motivated from an example in [12, §3]. The first column of $L(x)$ is
$$\begin{bmatrix} \frac{8x_1^3}{5} + \frac{x_1}{5} \\[6pt] \frac{288x_2x_1^4}{5} - \frac{16x_1^3}{5} - \frac{124x_2x_1^2}{5} + \frac{8x_1}{5} - 2x_2 \\[6pt] -\frac{288x_2x_1^4}{5} - \frac{16x_1^3}{5} + \frac{124x_2x_1^2}{5} + \frac{8x_1}{5} + 2x_2 \end{bmatrix},$$
and the second column of $L(x)$ is
$$\begin{bmatrix} -\frac{8x_1^2x_2}{5} + \frac{4x_2^3}{5} - \frac{x_2}{10} \\[6pt] \frac{288x_1^3x_2^2}{5} + \frac{16x_1^2x_2}{5} - \frac{142x_1x_2^2}{5} - \frac{9x_1}{20} - \frac{8x_2^3}{5} + \frac{11x_2}{5} \\[6pt] -\frac{288x_1^3x_2^2}{5} + \frac{16x_1^2x_2}{5} + \frac{142x_1x_2^2}{5} + \frac{9x_1}{20} - \frac{8x_2^3}{5} + \frac{11x_2}{5} \end{bmatrix}.$$
The objective is coercive, so $f_{min}$ is achieved at a critical point. The minimum value $f_{min} = 56\tfrac{3}{4} + 25\sqrt{5} \approx 112.6517$, and the minimizers are $\big(\pm\sqrt{\tfrac{1}{2}}, \ \pm(\sqrt{\tfrac{5}{8}} + \sqrt{\tfrac{1}{2}})\big)$. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

              w/o LME                 with LME
order k   lower bound   time      lower bound   time
   3        6.7535     0.4611      56.7500     0.1309
   4        6.9294     0.2428     112.6517     0.2405
   5        8.8519     0.3376     112.6517     0.2167
   6       16.5971     0.4703     112.6517     0.3788
   7       35.4756     0.6536     112.6517     0.4537
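As a sanity check on the stated optimal value (an illustration; the exact values are as read from the example), one can evaluate the objective at a candidate minimizer and confirm that the first two constraints are active:

```python
import numpy as np

x1 = np.sqrt(1/2)
x2 = np.sqrt(5/8) + np.sqrt(1/2)

f = x1**2 + 50*x2**2
fmin = 56 + 3/4 + 25*np.sqrt(5)     # 56 3/4 + 25*sqrt(5), about 112.6517

g1 = x1**2 - 1/2                    # active
g2 = x2**2 - 2*x1*x2 - 1/8          # active
g3 = x2**2 + 2*x1*x2 - 1/8          # strictly positive here
```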

Example 6.5. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^3} & x_1^3 + x_2^3 + x_3^3 + 4x_1x_2x_3 - \big(x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2)\big) \\ \text{s.t.} & x_1 \ge 0, \ x_1x_2 - 1 \ge 0, \ x_2x_3 - 1 \ge 0. \end{array}$$
The matrix polynomial $L(x)$ is
$$\begin{bmatrix} 1 - x_1x_2 & 0 & 0 & x_2 & x_2 & 0 \\ x_1 & 0 & 0 & -1 & -1 & 0 \\ -x_1 & x_2 & 0 & 1 & 0 & -1 \end{bmatrix}.$$


The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant $\mathbb{R}^3_+$, so the minimum value is achieved at a critical point. In computation, we got $f_{min} \approx 0.9492$ and a global minimizer $(0.9071, 1.1024, 0.9071)$. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

              w/o LME                    with LME
order k   lower bound     time      lower bound   time
   2       -∞             0.4129     -∞           0.1900
   3       -7.8184·10^6   0.4641     0.9492      0.3139
   4       -2.0575·10^4   0.6499     0.9492      0.5057

Example 6.6. Consider the optimization problem ($x_0 = 1$)
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & x^Tx + \sum_{i=0}^4 \prod_{j \ne i}(x_i - x_j) \\ \text{s.t.} & x_1^2 - 1 \ge 0, \ x_2^2 - 1 \ge 0, \ x_3^2 - 1 \ge 0, \ x_4^2 - 1 \ge 0. \end{array}$$
The matrix polynomial $L(x) = \begin{bmatrix} \tfrac{1}{2}\operatorname{diag}(x) & -I_4 \end{bmatrix}$. The first part of the objective is $x^Tx$, while the second part is a nonnegative polynomial [37]. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} = 4.0000$ and 11 global minimizers:
$$(1,1,1,1), \ (1,-1,-1,1), \ (1,-1,1,-1), \ (1,1,-1,-1),$$
$$(1,-1,-1,-1), \ (-1,-1,1,1), \ (-1,1,-1,1), \ (-1,1,1,-1),$$
$$(-1,-1,-1,1), \ (-1,-1,1,-1), \ (-1,1,-1,-1).$$
The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

              w/o LME                    with LME
order k   lower bound     time      lower bound   time
   3       -∞              1.1377     3.5480      1.1765
   4       -6.6913·10^4    4.7677     4.0000      3.0761
   5       -21.3778       22.9970     4.0000     10.3354
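Since all the listed minimizers have $x_i = \pm 1$, the count of 11 can be confirmed by brute force over the 16 sign vectors (a quick illustration, not the paper's method):

```python
import numpy as np
from itertools import product

def f(x):
    # Objective of Example 6.6, with x0 = 1 appended.
    y = np.concatenate(([1.0], x))
    second = sum(np.prod([y[i] - y[j] for j in range(5) if j != i])
                 for i in range(5))
    return x @ x + second

vals = {s: f(np.array(s)) for s in product([-1.0, 1.0], repeat=4)}
fmin = min(vals.values())
minimizers = [s for s, v in vals.items() if abs(v - fmin) < 1e-9]
```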

Example 6.7. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^3} & x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 - 3x_1^2x_2^2x_3^2 + x_2^2 \\ \text{s.t.} & x_1 - x_2x_3 \ge 0, \ -x_2 + x_3^2 \ge 0. \end{array}$$
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}$. By the arithmetic-geometric mean inequality, one can show that $f_{min} = 0$. The global minimizers are $(x_1, 0, x_3)$ with $x_1 \ge 0$ and $x_1x_3 = 0$. The computational results for standard Lasserre's


relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that $f_k = f_{min}$ for all $k \ge 5$, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

              w/o LME                    with LME
order k   lower bound     time      lower bound    time
   3       -∞             0.6144     -∞            0.3418
   4       -1.0909·10^7   1.0542     -3.9476      0.7180
   5       -942.6772      1.6771     -3·10^-9     1.4607
   6       -0.0110        3.3532     -8·10^-10    3.1618
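The AM-GM step is: the first three terms average at least $(x_1^6x_2^6x_3^6)^{1/3} = x_1^2x_2^2x_3^2$, so $f \ge x_2^2 \ge 0$ everywhere, with value $0$ at the stated minimizers. A random-sampling check in numpy (illustration only):

```python
import numpy as np

rng = np.random.default_rng(3)

def f(x1, x2, x3):
    # f >= x2^2 >= 0 by AM-GM applied to the first three terms.
    return (x1**4*x2**2 + x2**4*x3**2 + x3**4*x1**2
            - 3*x1**2*x2**2*x3**2 + x2**2)

X = rng.uniform(-2, 2, size=(10000, 3))
min_sampled = f(X[:, 0], X[:, 1], X[:, 2]).min()
```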

Example 6.8. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & x_1^2(x_1 - x_4)^2 + x_2^2(x_2 - x_4)^2 + x_3^2(x_3 - x_4)^2 \\ & \quad + 2x_1x_2x_3(x_1 + x_2 + x_3 - 2x_4) + (x_1 - 1)^2 + (x_2 - 1)^2 + (x_3 - 1)^2 \\ \text{s.t.} & x_1 - x_2 \ge 0, \ x_2 - x_3 \ge 0. \end{array}$$
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}$. In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} \approx 0.9413$ and a minimizer
$$(0.5632, 0.5632, 0.5632, 0.7510).$$
The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

              w/o LME                     with LME
order k   lower bound     time       lower bound   time
   2       -∞              0.3984     -0.3360      0.9321
   3       -∞              0.7634      0.9413      0.5240
   4       -6.4896·10^5    4.5496      0.9413      1.7192
   5       -3.1645·10^3   24.3665      0.9413      8.1228

Example 6.9. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & (x_1 + x_2 + x_3 + x_4 + 1)^2 - 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1) \\ \text{s.t.} & 0 \le x_1, \ldots, x_4 \le 1. \end{array}$$
The matrix $L(x)$ is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so $f_{min}$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value $f_{min} = 0$. For each $t \in [0, 1]$, the point $(t, 0, 0, 1 - t)$ is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that $4 - \sum_{i=1}^4 x_i^2 = \sum_{i=1}^4 \big(x_i(1 - x_i)^2 + (1 - x_i)(1 + x_i^2)\big) \in \mathrm{IQ}(c_{eq}, c_{in})$.


Table 9. Computational results for Example 6.9

              w/o LME                  with LME
order k   lower bound   time      lower bound   time
   2       -0.0279      0.2262    -5·10^-6      1.1835
   3       -0.0005      0.4691    -6·10^-7      1.6566
   4       -0.0001      3.1098    -2·10^-7      5.5234
   5       -4·10^-5    16.5092    -6·10^-7     19.7320

For some polynomial optimization problems, standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem ($x_0 = 1$)
$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^n} & \sum_{0 \le i \le j \le k \le n} c_{ijk}\,x_ix_jx_k \\ \text{s.t.} & 0 \le x \le 1, \end{array}$$
where each coefficient $c_{ijk}$ is randomly generated (by randn in MATLAB). The matrix $L(x)$ is the same as in Example 4.3. Since the feasible set is compact, we always have $f_c = f_{min}$. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order $k = 2$, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time that is consumed by standard Lasserre's relaxations and (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each $n$ in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order $k = 2$, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

n          9        10       11       12        13        14
w/o LME    1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
with LME   1.9714   3.8288   8.2519   20.0310   37.6373   82.4778

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value $f_{min}$ is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed relaxations (3.8)-(3.9) for solving (3.7). Note that
$$\mathrm{IQ}(c_{eq}, c_{in})_{2k} + \mathrm{IQ}(\varphi, \psi)_{2k} = \mathrm{Ideal}(c_{eq}, \varphi)_{2k} + \mathrm{Qmod}(c_{in}, \psi)_{2k}.$$
If we replace the quadratic module $\mathrm{Qmod}(c_{in}, \psi)$ by the preordering of $(c_{in}, \psi)$ [21, 25], we can get further tighter relaxations. For convenience, write $(c_{in}, \psi)$ as a single tuple $(g_1, \ldots, g_\ell)$. Its preordering is the set
$$\mathrm{Preord}(c_{in}, \psi) = \sum_{r_1, \ldots, r_\ell \in \{0,1\}} g_1^{r_1} \cdots g_\ell^{r_\ell} \cdot \Sigma[x].$$


The truncation $\mathrm{Preord}(c_{in}, \psi)_{2k}$ is similarly defined like $\mathrm{Qmod}(c_{in}, \psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same order $k$, is
$$(7.1) \qquad \begin{array}{rl} f'_{pre,k} := \min & \langle f, y \rangle \\ \text{s.t.} & \langle 1, y \rangle = 1, \ L^{(k)}_{c_{eq}}(y) = 0, \ L^{(k)}_{\varphi}(y) = 0, \\ & L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \ \ \forall\, r_1, \ldots, r_\ell \in \{0,1\}, \\ & y \in \mathbb{R}^{\mathbb{N}^n_{2k}}. \end{array}$$
Similar to (3.9), the dual optimization problem of the above is
$$(7.2) \qquad \begin{array}{rl} f_{pre,k} := \max & \gamma \\ \text{s.t.} & f - \gamma \in \mathrm{Ideal}(c_{eq}, \varphi)_{2k} + \mathrm{Preord}(c_{in}, \psi)_{2k}. \end{array}$$
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose $K_c \ne \emptyset$ and Assumption 3.1 holds. Then
$$f_{pre,k} = f'_{pre,k} = f_c$$
for all $k$ sufficiently large. Therefore, if the minimum value $f_{min}$ of (1.1) is achieved at a critical point, then $f_{pre,k} = f'_{pre,k} = f_{min}$ for all $k$ big enough.

Proof. The proof is very similar to the Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have $f(x) \equiv 0$ on the set
$$K_3 = \{x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \varphi(x) = 0, \ c_{in}(x) \ge 0, \ \psi(x) \ge 0\}.$$
By the Positivstellensatz, there exist an integer $\ell > 0$ and $q \in \mathrm{Preord}(c_{in}, \psi)$ such that $f^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \varphi)$. The rest of the proof is the same. $\square$
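The price of the preordering is the number of localizing constraints: $2^\ell$ products $g_1^{r_1}\cdots g_\ell^{r_\ell}$ instead of $\ell + 1$ terms for the quadratic module. A small sympy sketch enumerating them, for an assumed toy tuple $(g_1, g_2)$:

```python
import sympy as sp
from itertools import product

x1, x2 = sp.symbols('x1 x2')
g = [1 - x1**2, 1 - x2**2]   # assumed stand-in for the tuple (c_in, psi)

# Relaxation (7.1) has one localizing-matrix constraint per product
# g1^r1 * ... * gl^rl with r_i in {0, 1} -- 2^l of them in total.
prods = []
for r in product([0, 1], repeat=len(g)):
    p = sp.Integer(1)
    for gi, ri in zip(g, r):
        p *= gi**ri
    prods.append(sp.expand(p))
```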

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple $c$ of constraining polynomials is nonsingular, then there exists a matrix polynomial $L(x)$ such that $L(x)C(x) = I_m$. Hence, the Lagrange multiplier $\lambda$ can be expressed as in (5.3). However, if $c$ is not nonsingular, then such $L(x)$ does not exist. For such cases, how can we express $\lambda$ in terms of $x$, for critical pairs $(x, \lambda)$? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for L(x). For a nonsingular tuple $c$ of constraining polynomials, what is a good degree bound for $L(x)$ in Proposition 5.2? When $c$ is linear, a degree bound is given in Proposition 4.1. However, for nonlinear $c$, an explicit degree bound is not known. Theoretically, we can get a degree bound for $L(x)$. In the proof of Proposition 5.2, Hilbert's Nullstellensatz is used $t$ times. There exist sharp degree bounds for Hilbert's Nullstellensatz [16]. For each time of its usage, if the degree bound in [16] is used, then a degree bound for $L(x)$ can eventually be obtained. However, such an obtained bound is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for $L(x)$.


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector $\lambda$ is determined by a linear equation. Naturally, one can get
$$\lambda = \big(C(x)^TC(x)\big)^{-1}C(x)^T \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}$$
when $C(x)$ has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, $\lambda$ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
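On the set of critical points, the rational representation agrees with the polynomial one. The tiny numpy example below compares them for an assumed toy instance $f(x) = x$ with the single constraint $c(x) = 1 - x^2$, for which $L(x) = [-x/2, \ 1]$ satisfies $L(x)C(x) = 1$:

```python
import numpy as np

# Toy instance (assumed, for illustration): f(x) = x, c(x) = 1 - x^2,
# so C(x) = [c'(x); c(x)], and the critical point below is u = 1.
u = 1.0
Cu = np.array([[-2.0*u], [1.0 - u**2]])
rhs = np.array([1.0, 0.0])           # [f'(u); 0]

# Rational representation (C^T C)^{-1} C^T [grad f; 0], via least squares:
lam_rational = np.linalg.lstsq(Cu, rhs, rcond=None)[0][0]

# Polynomial representation lambda = L1(u) f'(u) = -u/2:
lam_poly = -u/2
```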

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput. 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra. Third edition, Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim. 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, pp. 293–310, Lecture Notes in Control and Inform. Sci. 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc. 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim. 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim. 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim. 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. González-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J. 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim. 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl. 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw. 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim. 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA, 92093

E-mail address: njw@math.ucsd.edu


Step III: For $L_I(x)$ as in (4.5), we construct $L(x)$ as
$$(4.10) \qquad L(x) = \sum_{I \in \mathcal{V}} \nu_I L_I(x).$$
Clearly, $L(x)$ satisfies (4.4), because
$$L(x)C(x) = \sum_{I \in \mathcal{V}} \nu_I L_I(x)C(x) = \sum_{I \in \mathcal{V}} \nu_I c_I(x) I_m = I_m.$$
Each $L_I(x)$ has degree $\le m - n$, so $L(x)$ has degree $\le m - n$.

Proposition 4.1 characterizes when there exists $L(x)$ satisfying (4.4). When it does, a degree bound for $L(x)$ is $m - \operatorname{rank} A$. Sometimes, its degree can be smaller than that, as shown in Example 4.3. For given $A, b$, the matrix polynomial $L(x)$ satisfying (4.4) can be determined by linear equations, which are obtained by matching coefficients on both sides. In the following, we give some examples of $L(x)$, with $L(x)C(x) = I_m$, for polyhedral sets.

Example 4.2. Consider the simplicial set
$$x_1 \ge 0, \ \ldots, \ x_n \ge 0, \ 1 - e^Tx \ge 0.$$
The equation $L(x)C(x) = I_{n+1}$ is satisfied by
$$L(x) = \begin{bmatrix} 1 - x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \\ -x_1 & 1 - x_2 & \cdots & -x_n & 1 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\ -x_1 & -x_2 & \cdots & 1 - x_n & 1 & \cdots & 1 \\ -x_1 & -x_2 & \cdots & -x_n & 1 & \cdots & 1 \end{bmatrix}.$$

Example 4.3. Consider the box constraint
$$x_1 \ge 0, \ \ldots, \ x_n \ge 0, \ 1 - x_1 \ge 0, \ \ldots, \ 1 - x_n \ge 0.$$
The equation $L(x)C(x) = I_{2n}$ is satisfied by
$$L(x) = \begin{bmatrix} I_n - \operatorname{diag}(x) & I_n & I_n \\ -\operatorname{diag}(x) & I_n & I_n \end{bmatrix}.$$

Example 4.4. Consider the polyhedral set
$$1 - x_4 \ge 0, \ x_4 - x_3 \ge 0, \ x_3 - x_2 \ge 0, \ x_2 - x_1 \ge 0, \ x_1 + 1 \ge 0.$$
The equation $L(x)C(x) = I_5$ is satisfied by
$$L(x) = \frac{1}{2}\begin{bmatrix} -x_1 - 1 & -x_2 - 1 & -x_3 - 1 & -x_4 - 1 & 1 & 1 & 1 & 1 & 1 \\ -x_1 - 1 & -x_2 - 1 & -x_3 - 1 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\ -x_1 - 1 & -x_2 - 1 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\ -x_1 - 1 & 1 - x_2 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \\ 1 - x_1 & 1 - x_2 & 1 - x_3 & 1 - x_4 & 1 & 1 & 1 & 1 & 1 \end{bmatrix}.$$

Example 4.5. Consider the polyhedral set
$$1 + x_1 \ge 0, \ 1 - x_1 \ge 0, \ 2 - x_1 - x_2 \ge 0, \ 2 - x_1 + x_2 \ge 0.$$
The matrix $L(x)$ satisfying $L(x)C(x) = I_4$ is
$$\frac{1}{6}\begin{bmatrix} x_1^2 - 3x_1 + 2 & x_1x_2 - x_2 & 4 - x_1 & 2 - x_1 & 1 - x_1 & 1 - x_1 \\ 3x_1^2 - 3x_1 - 6 & 3x_2 + 3x_1x_2 & 6 - 3x_1 & -3x_1 & -3x_1 - 3 & -3x_1 - 3 \\ 1 - x_1^2 & -2x_2 - x_1x_2 - 3 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2 \\ 1 - x_1^2 & 3 - x_1x_2 - 2x_2 & x_1 - 1 & x_1 + 1 & x_1 + 2 & x_1 + 2 \end{bmatrix}.$$
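These polyhedral examples are easy to check mechanically; the numpy sketch below (an illustration only) verifies Example 4.5 at a random point, building $C(x)$ as the constraint gradients stacked over $\operatorname{diag}(c(x))$:

```python
import numpy as np

rng = np.random.default_rng(4)
x1, x2 = rng.standard_normal(2)

c = np.array([1 + x1, 1 - x1, 2 - x1 - x2, 2 - x1 + x2])
grads = np.array([[1.0, 0.0], [-1.0, 0.0], [-1.0, -1.0], [-1.0, 1.0]])
C = np.vstack([grads.T, np.diag(c)])    # 6 x 4

L = np.array([
    [x1**2 - 3*x1 + 2, x1*x2 - x2, 4 - x1, 2 - x1, 1 - x1, 1 - x1],
    [3*x1**2 - 3*x1 - 6, 3*x2 + 3*x1*x2, 6 - 3*x1, -3*x1, -3*x1 - 3, -3*x1 - 3],
    [1 - x1**2, -2*x2 - x1*x2 - 3, x1 - 1, x1 + 1, x1 + 2, x1 + 2],
    [1 - x1**2, 3 - x1*x2 - 2*x2, x1 - 1, x1 + 1, x1 + 2, x1 + 2],
]) / 6.0

err = np.abs(L @ C - np.eye(4)).max()
```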


5 General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express Lagrange multipliers λ_i as polynomial functions in x on the set of critical points.

Suppose there are totally m equality and inequality constraints, i.e.,
\[
\mathcal{E} \cup \mathcal{I} = \{1, \ldots, m\}.
\]
If (x, λ) is a critical pair, then λ_i c_i(x) = 0 for all i ∈ 𝓔 ∪ 𝓘. So the Lagrange multiplier vector λ = (λ_1, \ldots, λ_m)^T satisfies the equation

(5.1)
\[
\underbrace{\begin{bmatrix}
\nabla c_1(x) & \nabla c_2(x) & \cdots & \nabla c_m(x) \\
c_1(x) & 0 & \cdots & 0 \\
0 & c_2(x) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & c_m(x)
\end{bmatrix}}_{C(x)} \; \lambda \;=\;
\begin{bmatrix} \nabla f(x) \\ 0 \\ \vdots \\ 0 \end{bmatrix}.
\]

Let C(x) be as above. If there exists L(x) ∈ ℝ[x]^{m×(m+n)} such that

(5.2) \quad L(x)C(x) = I_m,

then we can get

(5.3) \quad \lambda \;=\; L(x) \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} \;=\; L_1(x)\nabla f(x),

where L_1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such L(x) exists.

Definition 5.1. The tuple c = (c_1, \ldots, c_m) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ ℂ^n.

Clearly, c being nonsingular is equivalent to the condition that for each u ∈ ℂ^n, if J(u) = {i_1, \ldots, i_k} (see (1.2) for the notation), then the gradients ∇c_{i_1}(u), \ldots, ∇c_{i_k}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.

Proposition 5.2. (i) For each W(x) ∈ ℂ[x]^{s×t} with s ≥ t, rank W(u) = t for all u ∈ ℂ^n if and only if there exists P(x) ∈ ℂ[x]^{t×s} such that
\[
P(x)W(x) = I_t.
\]
Moreover, for W(x) ∈ ℝ[x]^{s×t}, we can choose P(x) ∈ ℝ[x]^{t×s} for the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ ℝ[x]^{m×(m+n)} satisfying (5.2).

Proof. (i) "⇐": If P(x)W(x) = I_t, then for all u ∈ ℂ^n,
\[
t = \operatorname{rank} I_t \leq \operatorname{rank} W(u) \leq t.
\]
So W(x) must have full column rank everywhere.

"⇒": Suppose rank W(u) = t for all u ∈ ℂ^n. Write W(x) in columns:
\[
W(x) = \begin{bmatrix} w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix}.
\]


Then the equation w_1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ_1(x) ∈ ℂ[x]^s such that ξ_1(x)^T w_1(x) = 1. For each i = 2, \ldots, t, denote
\[
r_{1i}(x) = \xi_1(x)^T w_i(x),
\]
then (use ∼ to denote row equivalence between matrices)
\[
W(x) \sim \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix}
\sim W_1(x) = \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ 0 & w_2^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix},
\]
where each (i = 2, \ldots, t)
\[
w_i^{(1)}(x) = w_i(x) - r_{1i}(x)\, w_1(x).
\]

So there exists P_1(x) ∈ ℂ[x]^{(s+1)×s} such that
\[
P_1(x)W(x) = W_1(x).
\]

Since W(x) and W_1(x) are row equivalent, W_1(x) must also have full column rank everywhere. Similarly, the polynomial equation
\[
w_2^{(1)}(x) = 0
\]
does not have a complex solution. Again by Hilbert's Weak Nullstellensatz [4], there exists ξ_2(x) ∈ ℂ[x]^s such that
\[
\xi_2(x)^T w_2^{(1)}(x) = 1.
\]
For each i = 3, \ldots, t, let r_{2i}(x) = ξ_2(x)^T w_i^{(1)}(x), then

\[
W_1(x) \sim \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & w_2^{(1)}(x) & w_3^{(1)}(x) & \cdots & w_t^{(1)}(x)
\end{bmatrix}
\sim W_2(x) = \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & w_3^{(2)}(x) & \cdots & w_t^{(2)}(x)
\end{bmatrix},
\]
where each (i = 3, \ldots, t)
\[
w_i^{(2)}(x) = w_i^{(1)}(x) - r_{2i}(x)\, w_2^{(1)}(x).
\]

Similarly, W_1(x) and W_2(x) are row equivalent, so W_2(x) has full column rank everywhere. There exists P_2(x) ∈ ℂ[x]^{(s+2)×(s+1)} such that
\[
P_2(x)W_1(x) = W_2(x).
\]

Continuing this process, we can finally get
\[
W_2(x) \sim \cdots \sim W_t(x) = \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & 1 & \cdots & r_{3t}(x) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}.
\]
Consequently, there exists P_i(x) ∈ ℂ[x]^{(s+i)×(s+i-1)}, for i = 1, 2, \ldots, t, such that
\[
P_t(x)P_{t-1}(x) \cdots P_1(x)\, W(x) = W_t(x).
\]


Since W_t(x) is a unit upper triangular matrix polynomial, there exists P_{t+1}(x) ∈ ℝ[x]^{t×(s+t)} such that P_{t+1}(x)W_t(x) = I_t. Let
\[
P(x) = P_{t+1}(x)P_t(x)P_{t-1}(x) \cdots P_1(x),
\]
then P(x)W(x) = I_t. Note that P(x) ∈ ℂ[x]^{t×s}.

For W(x) ∈ ℝ[x]^{s×t}, we can replace P(x) by (P(x) + \overline{P(x)})/2 (here \overline{P(x)} denotes the complex conjugate of P(x)), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). □

In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for L(x), it can be determined by comparing coefficients of both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
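To illustrate the coefficient-matching step, consider the single constraint 1 − x² ≥ 0 in one variable (a toy instance worked out by hand, not from the paper; the degree-1 ansatz is an assumption). Writing L(x) = [a₀ + a₁x, b₀ + b₁x] and matching coefficients of x⁰, …, x³ in L(x)C(x) = 1, with C(x) = [−2x; 1 − x²], gives a small linear system:

```python
from fractions import Fraction

# Unknowns (a0, a1, b0, b1) in L(x) = [a0 + a1*x, b0 + b1*x].
# With C(x) = [-2x; 1 - x^2], L(x)C(x) = 1 reads, coefficient by coefficient:
#   x^0:  b0        = 1
#   x^1: -2a0 + b1  = 0
#   x^2: -2a1 - b0  = 0
#   x^3: -b1        = 0
A = [[0, 0, 1, 0], [-2, 0, 0, 1], [0, -2, -1, 0], [0, 0, 0, -1]]
rhs = [1, 0, 0, 0]

# Solve by Gaussian elimination over the rationals.
M = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(A, rhs)]
for col in range(4):
    piv = next(r for r in range(col, 4) if M[r][col] != 0)
    M[col], M[piv] = M[piv], M[col]
    M[col] = [v / M[col][col] for v in M[col]]
    for r in range(4):
        if r != col:
            M[r] = [u - M[r][col] * v for u, v in zip(M[r], M[col])]
a0, a1, b0, b1 = (M[r][4] for r in range(4))
print(a0, a1, b0, b1)  # 0 -1/2 1 0, i.e. L(x) = [-x/2, 1]
```

The solution L(x) = [−x/2, 1] agrees with the n = 1 case of Example 5.3 below.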

Example 5.3. Consider the hypercube with quadratic constraints
\[
1-x_1^2 \geq 0, \;\; 1-x_2^2 \geq 0, \;\ldots,\; 1-x_n^2 \geq 0.
\]
The equation L(x)C(x) = I_n is satisfied by
\[
L(x) = \begin{bmatrix} -\tfrac{1}{2}\operatorname{diag}(x) & I_n \end{bmatrix}.
\]
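A quick numeric check of this matrix, together with the multiplier formula (5.3) it induces, namely λ_i = −x_i (∂f/∂x_i)/2 (the tiny KKT instance min (x₁−2)² s.t. 1−x₁² ≥ 0, with minimizer x₁ = 1 and λ₁ = 1, is our own toy example, not from the paper):

```python
import random

def check_hypercube_identity(n=3, tol=1e-12):
    x = [random.uniform(-2, 2) for _ in range(n)]
    c = [1 - xi**2 for xi in x]
    # C(x): gradient block -2*diag(x) on top, diag(c) below; shape 2n x n.
    C = [[-2 * x[i] if j == i else 0.0 for j in range(n)] for i in range(n)]
    C += [[c[j] if j == i else 0.0 for j in range(n)] for i in range(n)]
    # L(x) = [-(1/2)diag(x), I_n].
    L = [[-0.5 * x[i] if j == i else 0.0 for j in range(n)]
         + [1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(n):
            val = sum(L[i][k] * C[k][j] for k in range(2 * n))
            assert abs(val - (1.0 if i == j else 0.0)) < tol
    # Multiplier formula lambda_1 = -x_1 * f'(x_1)/2 at the KKT point x_1 = 1
    # of min (x1 - 2)^2 s.t. 1 - x1^2 >= 0:
    grad_f = 2 * (1 - 2)          # f'(1) = -2
    lam = -1 * grad_f / 2         # = 1, consistent with grad f = lam * grad c
    assert abs(lam - 1.0) < tol
    return True

print(check_hypercube_identity())  # True
```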

Example 5.4. Consider the nonnegative portion of the unit sphere
\[
x_1 \geq 0, \;\; x_2 \geq 0, \;\ldots,\; x_n \geq 0, \quad x_1^2 + \cdots + x_n^2 - 1 = 0.
\]
The equation L(x)C(x) = I_{n+1} is satisfied by
\[
L(x) = \begin{bmatrix}
I_n - xx^T & x\mathbf{1}_n^T & 2x \\
\tfrac{1}{2}x^T & -\tfrac{1}{2}\mathbf{1}_n^T & -1
\end{bmatrix}.
\]
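The sphere-portion matrix can also be spot-checked numerically (illustrative sketch, names ad hoc):

```python
import random

def check_sphere_identity(n=4, tol=1e-10):
    x = [random.uniform(-2, 2) for _ in range(n)]
    m = n + 1
    c = x + [sum(xi**2 for xi in x) - 1]
    grads = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    grads.append([2 * xi for xi in x])       # gradient of x^T x - 1
    C = [[grads[j][i] for j in range(m)] for i in range(n)]
    C += [[c[j] if j == i else 0.0 for j in range(m)] for i in range(m)]
    # L(x) = [[I - x x^T, x 1^T, 2x], [x^T/2, -1^T/2, -1]].
    L = [[(1.0 if j == i else 0.0) - x[i] * x[j] for j in range(n)]
         + [x[i]] * n + [2 * x[i]] for i in range(n)]
    L.append([xi / 2 for xi in x] + [-0.5] * n + [-1.0])
    for i in range(m):
        for j in range(m):
            val = sum(L[i][k] * C[k][j] for k in range(n + m))
            assert abs(val - (1.0 if i == j else 0.0)) < tol
    return True

print(check_sphere_identity())  # True
```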

Example 5.5. Consider the set
\[
1-x_1^3-x_2^4 \geq 0, \quad 1-x_3^4-x_4^3 \geq 0.
\]
The equation L(x)C(x) = I_2 is satisfied by
\[
L(x) = \begin{bmatrix}
-\tfrac{x_1}{3} & -\tfrac{x_2}{4} & 0 & 0 & 1 & 0 \\
0 & 0 & -\tfrac{x_3}{4} & -\tfrac{x_4}{3} & 0 & 1
\end{bmatrix}.
\]
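This example is small enough to verify directly (illustrative sketch, not part of the original text):

```python
import random

def check_example55(tol=1e-10):
    x1, x2, x3, x4 = [random.uniform(-1.5, 1.5) for _ in range(4)]
    c = [1 - x1**3 - x2**4, 1 - x3**4 - x4**3]
    grads = [[-3*x1**2, -4*x2**3, 0.0, 0.0], [0.0, 0.0, -4*x3**3, -3*x4**2]]
    # C(x) is 6 x 2: gradient rows on top, diag(c) below.
    C = [[grads[j][i] for j in range(2)] for i in range(4)]
    C += [[c[j] if j == i else 0.0 for j in range(2)] for i in range(2)]
    L = [[-x1/3, -x2/4, 0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, -x3/4, -x4/3, 0.0, 1.0]]
    for i in range(2):
        for j in range(2):
            val = sum(L[i][k] * C[k][j] for k in range(6))
            assert abs(val - (1.0 if i == j else 0.0)) < tol
    return True

print(check_example55())  # True
```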

Example 5.6. Consider the quadratic set
\[
1-x_1x_2-x_2x_3-x_1x_3 \geq 0, \quad 1-x_1^2-x_2^2-x_3^2 \geq 0.
\]
The matrix L(x)^T satisfying L(x)C(x) = I_2 is
\[
\begin{bmatrix}
25x_1^3+10x_1^2x_2+40x_1x_2^2-25x_1-2x_3 &
-25x_1^3-10x_1^2x_2-40x_1x_2^2+\tfrac{49x_1}{2}+2x_3 \\[2pt]
-15x_1^2x_2+10x_1x_2^2+20x_1x_2x_3-10x_1 &
15x_1^2x_2-10x_1x_2^2-20x_1x_2x_3+10x_1-\tfrac{x_2}{2} \\[2pt]
25x_1^2x_3-20x_1x_2^2+10x_1x_2x_3+2x_1 &
-25x_1^2x_3+20x_1x_2^2-10x_1x_2x_3-2x_1-\tfrac{x_3}{2} \\[2pt]
1-20x_1x_3-10x_1^2-20x_1x_2 &
20x_1x_2+20x_1x_3+10x_1^2 \\[2pt]
-50x_1^2-20x_1x_2 &
50x_1^2+20x_1x_2+1
\end{bmatrix}.
\]

We would like to remark that a polynomial tuple c = (c_1, \ldots, c_m) is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d_1, \ldots, d_m, there exists an open dense subset U of D = ℝ[x]_{d_1} × \cdots × ℝ[x]_{d_m}, such that every tuple c = (c_1, \ldots, c_m) ∈ U is nonsingular. Indeed, such U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.


Proof. The proof needs to use resultants and discriminants, for which we refer to [29].

First, let J_1 be the set of all (i_1, \ldots, i_{n+1}) with 1 ≤ i_1 < \cdots < i_{n+1} ≤ m. The resultant Res(c_{i_1}, \ldots, c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i_1}, \ldots, c_{i_{n+1}} such that, if Res(c_{i_1}, \ldots, c_{i_{n+1}}) ≠ 0, then the equations
\[
c_{i_1}(x) = \cdots = c_{i_{n+1}}(x) = 0
\]
have no complex solutions. Define
\[
F_1(c) = \prod_{(i_1, \ldots, i_{n+1}) \in J_1} \operatorname{Res}(c_{i_1}, \ldots, c_{i_{n+1}}).
\]
For the case that m ≤ n, J_1 = ∅ and we simply let F_1(c) = 1. Clearly, if F_1(c) ≠ 0, then no more than n polynomials of c_1, \ldots, c_m can have a common complex zero.

Second, let J_2 be the set of all (j_1, \ldots, j_k) with k ≤ n and 1 ≤ j_1 < \cdots < j_k ≤ m. When one of c_{j_1}, \ldots, c_{j_k} has degree bigger than one, the discriminant ∆(c_{j_1}, \ldots, c_{j_k}) is a polynomial in the coefficients of c_{j_1}, \ldots, c_{j_k} such that, if ∆(c_{j_1}, \ldots, c_{j_k}) ≠ 0, then the equations
\[
c_{j_1}(x) = \cdots = c_{j_k}(x) = 0
\]
have no singular complex solution [29, Section 3], i.e., at every complex common solution u, the gradients of c_{j_1}, \ldots, c_{j_k} at u are linearly independent. When all c_{j_1}, \ldots, c_{j_k} have degree one, the discriminant of the tuple (c_{j_1}, \ldots, c_{j_k}) is not a single polynomial, but we can define ∆(c_{j_1}, \ldots, c_{j_k}) to be the product of all maximum minors of its Jacobian (a constant matrix). Define
\[
F_2(c) = \prod_{(j_1, \ldots, j_k) \in J_2} ∆(c_{j_1}, \ldots, c_{j_k}).
\]
Clearly, if F_2(c) ≠ 0, then no n or fewer polynomials of c_1, \ldots, c_m have a singular common complex zero.

Last, let F(c) = F_1(c)F_2(c) and
\[
U = \{c = (c_1, \ldots, c_m) \in D : F(c) \neq 0\}.
\]
Note that U is a Zariski open subset of D, and it is open dense in D. For all c ∈ U, no more than n of c_1, \ldots, c_m can have a complex common zero. For any k polynomials (k ≤ n) of c_1, \ldots, c_m, if they have a complex common zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically. □

6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0GB RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for computational results.


The polynomials p_i in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c_1, \ldots, c_m. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L_1(x) be the submatrix of L(x) consisting of its first n columns. Then choose (p_1, \ldots, p_m) to be the product L_1(x)∇f(x), i.e.,
\[
p_i = \big( L_1(x)\nabla f(x) \big)_i.
\]

In all our examples, the global minimum value f_min of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.

By Theorem 3.3, we have f_k = f_min for all k big enough if f_c = f_min and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, where the computational time (in seconds) is also compared. The results for standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".

Example 6.1. Consider the optimization problem
\[
\begin{array}{rl}
\min & x_1 x_2 (10 - x_3) \\
\text{s.t.} & x_1 \geq 0, \; x_2 \geq 0, \; x_3 \geq 0, \; 1 - x_1 - x_2 - x_3 \geq 0.
\end{array}
\]
The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum f_min = 0 is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x_1, x_2, x_3) with x_1 x_2 = 0 is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

  order k |  w/o LME: lower bound    time |  with LME: lower bound    time
     2    |   -0.0521              0.6841 |   -0.0521               0.1922
     3    |   -0.0026              0.2657 |   -3·10^{-8}            0.2285
     4    |   -0.0007              0.6785 |   -6·10^{-9}            0.4431
     5    |   -0.0004              1.6105 |   -2·10^{-9}            0.9567

² Note that 1 - x^Tx = (1 - e^Tx)(1 + x^Tx) + \sum_{i=1}^n x_i(1 - x_i)^2 + \sum_{i \neq j} x_i^2 x_j \in IQ(c_{eq}, c_{in}).
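The algebraic identity in this footnote can be spot-checked numerically (an illustrative sketch; the point is random and n is arbitrary):

```python
import random

def check_footnote_identity(n=5, tol=1e-9):
    x = [random.uniform(-2, 2) for _ in range(n)]
    lhs = 1 - sum(xi**2 for xi in x)
    # (1 - e^T x)(1 + x^T x) + sum_i x_i (1 - x_i)^2 + sum_{i != j} x_i^2 x_j
    rhs = (1 - sum(x)) * (1 + sum(xi**2 for xi in x)) \
        + sum(xi * (1 - xi)**2 for xi in x) \
        + sum(x[i]**2 * x[j] for i in range(n) for j in range(n) if i != j)
    assert abs(lhs - rhs) < tol
    return True

print(check_footnote_identity())  # True
```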


Example 6.2. Consider the optimization problem
\[
\begin{array}{rl}
\min & x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3x_1^2 x_2^2 x_3^2 + (x_1^4 + x_2^4 + x_3^4) \\
\text{s.t.} & x_1^2 + x_2^2 + x_3^2 \geq 1.
\end{array}
\]
The matrix polynomial L(x) = \begin{bmatrix} \tfrac{1}{2}x_1 & \tfrac{1}{2}x_2 & \tfrac{1}{2}x_3 & -1 \end{bmatrix}. The objective f is the sum of the positive definite form x_1^4 + x_2^4 + x_3^4 and the Motzkin polynomial
\[
M(x) = x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3x_1^2 x_2^2 x_3^2.
\]
Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive and f_min is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because
\[
c_1(x) p_1(x) = (x_1^2 + x_2^2 + x_3^2 - 1)\big(3M(x) + 2(x_1^4 + x_2^4 + x_3^4)\big) = 0
\]
defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value f_min = 1/3, and there are 8 minimizers (\pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}}). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

  order k |  w/o LME: lower bound    time |  with LME: lower bound    time
     3    |   -inf                 0.4466 |    0.1111               0.1169
     4    |   -inf                 0.4948 |    0.3333               0.3499
     5    |   -2.1821·10^5         1.1836 |    0.3333               0.6530

Example 6.3. Consider the optimization problem
\[
\begin{array}{rl}
\min & x_1x_2 + x_2x_3 + x_3x_4 - 3x_1x_2x_3x_4 + (x_1^3 + \cdots + x_4^3) \\
\text{s.t.} & x_1, x_2, x_3, x_4 \geq 0, \;\; 1-x_1-x_2 \geq 0, \;\; 1-x_3-x_4 \geq 0.
\end{array}
\]
The matrix polynomial L(x) is
\[
\begin{bmatrix}
1-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
-x_1 & 1-x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1-x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & -x_3 & 1-x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}.
\]
The feasible set is compact, so f_min is achieved at a critical point. One can show that f_min = 0 and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because IQ(c_{eq}, c_{in}) is archimedean.⁴ The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because -c_1^2 p_1^2 \in Ideal(φ) ⊆ IQ(φ, ψ), and the set \{-c_1(x)^2 p_1(x)^2 \geq 0\} is compact.

⁴ This is because 1 - x_1^2 - x_2^2 belongs to the quadratic module of (x_1, x_2, 1-x_1-x_2), and 1 - x_3^2 - x_4^2 belongs to the quadratic module of (x_3, x_4, 1-x_3-x_4). See the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

  order k |  w/o LME: lower bound    time |  with LME: lower bound    time
     3    |   -2.9·10^{-5}         0.7335 |   -6·10^{-7}            0.6091
     4    |   -1.4·10^{-5}         2.5055 |   -8·10^{-8}            2.7423
     5    |   -1.4·10^{-5}        12.7092 |   -5·10^{-8}           13.7449

Example 6.4. Consider the polynomial optimization problem
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^2} & x_1^2 + 50x_2^2 \\
\text{s.t.} & x_1^2 - \tfrac{1}{2} \geq 0, \;\; x_2^2 - 2x_1x_2 - \tfrac{1}{8} \geq 0, \;\; x_2^2 + 2x_1x_2 - \tfrac{1}{8} \geq 0.
\end{array}
\]

It is motivated from an example in [12, §3]. The first column of L(x) is
\[
\begin{bmatrix}
\tfrac{8x_1^3}{5} + \tfrac{x_1}{5} \\[3pt]
\tfrac{288x_2x_1^4}{5} - \tfrac{16x_1^3}{5} - \tfrac{124x_2x_1^2}{5} + \tfrac{8x_1}{5} - 2x_2 \\[3pt]
-\tfrac{288x_2x_1^4}{5} - \tfrac{16x_1^3}{5} + \tfrac{124x_2x_1^2}{5} + \tfrac{8x_1}{5} + 2x_2
\end{bmatrix},
\]
and the second column of L(x) is
\[
\begin{bmatrix}
-\tfrac{8x_1^2x_2}{5} + \tfrac{4x_2^3}{5} - \tfrac{x_2}{10} \\[3pt]
\tfrac{288x_1^3x_2^2}{5} + \tfrac{16x_1^2x_2}{5} - \tfrac{142x_1x_2^2}{5} - \tfrac{9x_1}{20} - \tfrac{8x_2^3}{5} + \tfrac{11x_2}{5} \\[3pt]
-\tfrac{288x_1^3x_2^2}{5} + \tfrac{16x_1^2x_2}{5} + \tfrac{142x_1x_2^2}{5} + \tfrac{9x_1}{20} - \tfrac{8x_2^3}{5} + \tfrac{11x_2}{5}
\end{bmatrix}.
\]

The objective is coercive, so f_min is achieved at a critical point. The minimum value f_min = 56 + \tfrac{3}{4} + 25\sqrt{5} \approx 112.6517, and the minimizers are (\pm\sqrt{1/2}, \; \pm(\sqrt{5/8} + \sqrt{1/2})). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

  order k |  w/o LME: lower bound    time |  with LME: lower bound    time
     3    |    6.7535              0.4611 |   56.7500               0.1309
     4    |    6.9294              0.2428 |  112.6517               0.2405
     5    |    8.8519              0.3376 |  112.6517               0.2167
     6    |   16.5971              0.4703 |  112.6517               0.3788
     7    |   35.4756              0.6536 |  112.6517               0.4537

Example 6.5. Consider the optimization problem
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^3} & x_1^3 + x_2^3 + x_3^3 + 4x_1x_2x_3 - \big( x_1(x_2^2+x_3^2) + x_2(x_3^2+x_1^2) + x_3(x_1^2+x_2^2) \big) \\
\text{s.t.} & x_1 \geq 0, \;\; x_1x_2 - 1 \geq 0, \;\; x_2x_3 - 1 \geq 0.
\end{array}
\]
The matrix polynomial L(x) is
\[
\begin{bmatrix}
1-x_1x_2 & 0 & 0 & x_2 & x_2 & 0 \\
x_1 & 0 & 0 & -1 & -1 & 0 \\
-x_1 & x_2 & 0 & 1 & 0 & -1
\end{bmatrix}.
\]
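This 3 × 6 matrix can be spot-checked numerically (illustrative sketch; since the identity is polynomial, the random point need not be feasible):

```python
import random

def check_example65(tol=1e-10):
    x1, x2, x3 = [random.uniform(0.5, 2.0) for _ in range(3)]
    c = [x1, x1*x2 - 1, x2*x3 - 1]
    grads = [[1.0, 0.0, 0.0], [x2, x1, 0.0], [0.0, x3, x2]]
    # C(x) is 6 x 3: gradient rows on top, diag(c) below.
    C = [[grads[j][i] for j in range(3)] for i in range(3)]
    C += [[c[j] if j == i else 0.0 for j in range(3)] for i in range(3)]
    L = [[1 - x1*x2, 0.0, 0.0, x2, x2, 0.0],
         [x1, 0.0, 0.0, -1.0, -1.0, 0.0],
         [-x1, x2, 0.0, 1.0, 0.0, -1.0]]
    for i in range(3):
        for j in range(3):
            val = sum(L[i][k] * C[k][j] for k in range(6))
            assert abs(val - (1.0 if i == j else 0.0)) < tol
    return True

print(check_example65())  # True
```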


The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant ℝ³₊, so the minimum value is achieved at a critical point. In computation, we got f_min ≈ 0.9492, and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

  order k |  w/o LME: lower bound    time |  with LME: lower bound    time
     2    |   -inf                 0.4129 |   -inf                  0.1900
     3    |   -7.8184·10^6         0.4641 |    0.9492               0.3139
     4    |   -2.0575·10^4         0.6499 |    0.9492               0.5057

Example 6.6. Consider the optimization problem (x_0 = 1)
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^4} & x^Tx + \sum_{i=0}^{4} \prod_{j \neq i} (x_i - x_j) \\
\text{s.t.} & x_1^2 - 1 \geq 0, \;\; x_2^2 - 1 \geq 0, \;\; x_3^2 - 1 \geq 0, \;\; x_4^2 - 1 \geq 0.
\end{array}
\]
The matrix polynomial L(x) = \begin{bmatrix} \tfrac{1}{2}\operatorname{diag}(x) & -I_4 \end{bmatrix}. The first part of the objective is x^Tx, while the second part is a nonnegative polynomial [37]. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min = 4.0000 and 11 global minimizers:
\[
(1, 1, 1, 1), \; (1, -1, -1, 1), \; (1, -1, 1, -1), \; (1, 1, -1, -1),
\]
\[
(1, -1, -1, -1), \; (-1, -1, 1, 1), \; (-1, 1, -1, 1), \; (-1, 1, 1, -1),
\]
\[
(-1, -1, -1, 1), \; (-1, -1, 1, -1), \; (-1, 1, -1, -1).
\]
The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

  order k |  w/o LME: lower bound    time |  with LME: lower bound    time
     3    |   -inf                 1.1377 |    3.5480               1.1765
     4    |   -6.6913·10^4         4.7677 |    4.0000               3.0761
     5    |  -21.3778             22.9970 |    4.0000              10.3354

Example 6.7. Consider the optimization problem
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^3} & x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 - 3x_1^2x_2^2x_3^2 + x_2^2 \\
\text{s.t.} & x_1 - x_2x_3 \geq 0, \;\; -x_2 + x_3^2 \geq 0.
\end{array}
\]
The matrix polynomial L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}. By the arithmetic-geometric mean inequality, one can show that f_min = 0. The global minimizers are the points (x_1, 0, x_3) with x_1 ≥ 0 and x_1x_3 = 0. The computational results for standard Lasserre's


relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that f_k = f_min for all k ≥ 5, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

  order k |  w/o LME: lower bound    time |  with LME: lower bound    time
     3    |   -inf                 0.6144 |   -inf                  0.3418
     4    |   -1.0909·10^7         1.0542 |   -3.9476               0.7180
     5    |  -942.6772             1.6771 |   -3·10^{-9}            1.4607
     6    |   -0.0110              3.3532 |   -8·10^{-10}           3.1618

Example 6.8. Consider the optimization problem
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^4} & x_1^2(x_1-x_4)^2 + x_2^2(x_2-x_4)^2 + x_3^2(x_3-x_4)^2 \\
& \; + 2x_1x_2x_3(x_1+x_2+x_3-2x_4) + (x_1-1)^2 + (x_2-1)^2 + (x_3-1)^2 \\
\text{s.t.} & x_1 - x_2 \geq 0, \;\; x_2 - x_3 \geq 0.
\end{array}
\]
The matrix polynomial L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}. In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min ≈ 0.9413 and a minimizer
\[
(0.5632, \; 0.5632, \; 0.5632, \; 0.7510).
\]
The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

  order k |  w/o LME: lower bound    time |  with LME: lower bound    time
     2    |   -inf                 0.3984 |   -0.3360               0.9321
     3    |   -inf                 0.7634 |    0.9413               0.5240
     4    |   -6.4896·10^5         4.5496 |    0.9413               1.7192
     5    |   -3.1645·10^3        24.3665 |    0.9413               8.1228

Example 6.9. Consider the optimization problem
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^4} & (x_1+x_2+x_3+x_4+1)^2 - 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1) \\
\text{s.t.} & 0 \leq x_1, \ldots, x_4 \leq 1.
\end{array}
\]
The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so f_min is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value f_min = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1−t) is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that 4 - \sum_{i=1}^4 x_i^2 = \sum_{i=1}^4 \big( x_i(1 - x_i)^2 + (1 - x_i)(1 + x_i^2) \big) \in IQ(c_{eq}, c_{in}).
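The identity in this footnote can likewise be spot-checked numerically (illustrative sketch, not part of the original text):

```python
import random

def check_footnote5(tol=1e-9):
    x = [random.uniform(-2, 2) for _ in range(4)]
    lhs = 4 - sum(xi**2 for xi in x)
    # Per coordinate: x(1-x)^2 + (1-x)(1+x^2) = 1 - x^2, summed over i = 1..4.
    rhs = sum(xi * (1 - xi)**2 + (1 - xi) * (1 + xi**2) for xi in x)
    assert abs(lhs - rhs) < tol
    return True

print(check_footnote5())  # True
```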


Table 9. Computational results for Example 6.9

  order k |  w/o LME: lower bound    time |  with LME: lower bound    time
     2    |   -0.0279              0.2262 |   -5·10^{-6}            1.1835
     3    |   -0.0005              0.4691 |   -6·10^{-7}            1.6566
     4    |   -0.0001              3.1098 |   -2·10^{-7}            5.5234
     5    |   -4·10^{-5}          16.5092 |   -6·10^{-7}           19.7320

For some polynomial optimization problems, standard Lasserre relaxations might converge fast, e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem (x_0 = 1)
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^n} & \sum_{0 \leq i \leq j \leq k \leq n} c_{ijk}\, x_i x_j x_k \\
\text{s.t.} & 0 \leq x \leq 1,
\end{array}
\]
where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have f_c = f_min. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time that is consumed by standard Lasserre's relaxations and (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

     n      |    9      10      11       12       13       14
  w/o LME   | 1.2569  2.5619  6.3085  15.8722  35.1675  78.4111
  with LME  | 1.9714  3.8288  8.2519  20.0310  37.6373  82.4778

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value f_min is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed relaxations (3.8)-(3.9) for solving (3.7). Note that
\[
IQ(c_{eq}, c_{in})_{2k} + IQ(\varphi, \psi)_{2k} = \operatorname{Ideal}(c_{eq}, \varphi)_{2k} + \operatorname{Qmod}(c_{in}, \psi)_{2k}.
\]
If we replace the quadratic module Qmod(c_{in}, ψ) by the preordering of (c_{in}, ψ) [21, 25], we can get further tighter relaxations. For convenience, write (c_{in}, ψ) as a single tuple (g_1, \ldots, g_\ell). Its preordering is the set
\[
\operatorname{Preord}(c_{in}, \psi) = \sum_{r_1, \ldots, r_\ell \in \{0,1\}} g_1^{r_1} \cdots g_\ell^{r_\ell} \cdot \Sigma[x].
\]


The truncation Preord(c_{in}, ψ)_{2k} is similarly defined as Qmod(c_{in}, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)
\[
\begin{array}{rl}
f'_{pre,k} := \min & \langle f, y \rangle \\
\text{s.t.} & \langle 1, y \rangle = 1, \;\; L^{(k)}_{c_{eq}}(y) = 0, \;\; L^{(k)}_{\varphi}(y) = 0, \\
& L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \;\; \forall\, r_1, \ldots, r_\ell \in \{0, 1\}, \\
& y \in \mathbb{R}^{\mathbb{N}^n_{2k}}.
\end{array}
\]

Similar to (3.9), the dual optimization problem of the above is

(7.2)
\[
\begin{array}{rl}
f_{pre,k} := \max & \gamma \\
\text{s.t.} & f - \gamma \in \operatorname{Ideal}(c_{eq}, \varphi)_{2k} + \operatorname{Preord}(c_{in}, \psi)_{2k}.
\end{array}
\]

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose K_c ≠ ∅ and Assumption 3.1 holds. Then,
\[
f_{pre,k} = f'_{pre,k} = f_c
\]
for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_{pre,k} = f'_{pre,k} = f_min for all k big enough.

Proof. The proof is very similar to Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set
\[
K_3 = \{x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \; \varphi(x) = 0, \; c_{in}(x) \geq 0, \; \psi(x) \geq 0\}.
\]
By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(c_{in}, ψ) such that f^{2ℓ} + q ∈ Ideal(c_{eq}, φ). The rest of the proof is the same. □

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is not nonsingular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x). In the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times. There exist sharp degree bounds for Hilbert's Nullstellensatz [16]. For each time of its usage, if the degree bound in [16] is used, then a degree bound for L(x) can eventually be obtained. However, such an obtained bound is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get
\[
\lambda = \big( C(x)^T C(x) \big)^{-1} C(x)^T \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}
\]
when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties, and algorithms: an introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on noncompact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, pp. 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. González-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA

E-mail address: njw@math.ucsd.edu

  • 1 Introduction
    • 11 Optimality conditions
    • 12 Some existing work
    • 13 New contributions
      • 2 Preliminaries
      • 3 The construction of tight relaxations
        • 31 Tightness of the relaxations
        • 32 Detecting tightness and extracting minimizers
          • 4 Polyhedral constraints
          • 5 General constraints
          • 6 Numerical examples
          • 7 Discussions
            • 71 Tight relaxations using preorderings
            • 72 Singular constraining polynomials
            • 73 Degree bound for L(x)
            • 74 Rational representation of Lagrange multipliers
              • References
Page 17: TIGHT RELAXATIONS FOR POLYNOMIAL OPTIMIZATION …The polynomial expressions is de-termined by linear equations. Based on these expressions, new Lasserre type semidefinite relaxations

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 17

5. General constraints

We consider general nonlinear constraints as in (1.1). The critical point conditions are in (1.8). We discuss how to express Lagrange multipliers λi as polynomial functions in x on the set of critical points.

Suppose there are in total m equality and inequality constraints, i.e., E ∪ I = {1, . . . , m}. If (x, λ) is a critical pair, then λi ci(x) = 0 for all i ∈ E ∪ I. So the Lagrange multiplier vector λ = [λ1, . . . , λm]ᵀ satisfies the equation

(5.1)
    ⎡ ∇c1(x)  ∇c2(x)  ⋯  ∇cm(x) ⎤        ⎡ ∇f(x) ⎤
    ⎢ c1(x)   0       ⋯  0      ⎥        ⎢   0   ⎥
    ⎢ 0       c2(x)   ⋯  0      ⎥  λ  =  ⎢   ⋮   ⎥
    ⎢ ⋮       ⋮       ⋱  ⋮      ⎥        ⎣   0   ⎦
    ⎣ 0       0       ⋯  cm(x)  ⎦

We denote by C(x) the (m+n)×m matrix on the left-hand side of (5.1).

Let C(x) be as in (5.1). If there exists L(x) ∈ ℝ[x]^{m×(m+n)} such that

(5.2)    L(x) C(x) = I_m,

then we can get

(5.3)    λ = L(x) [∇f(x); 0] = L1(x) ∇f(x),

where L1(x) consists of the first n columns of L(x). Clearly, (5.2) implies that Assumption 3.1 holds. This section characterizes when such L(x) exists.

Definition 5.1. The tuple c = (c1, . . . , cm) of constraining polynomials is said to be nonsingular if rank C(u) = m for every u ∈ ℂⁿ.

Clearly, c being nonsingular is equivalent to the condition that, for each u ∈ ℂⁿ, if J(u) = {i1, . . . , ik} (see (1.2) for the notation), then the gradients ∇c_{i1}(u), . . . , ∇c_{ik}(u) are linearly independent. Our main conclusion is that (5.2) holds if and only if the tuple c is nonsingular.

Proposition 5.2. (i) For W(x) ∈ ℂ[x]^{s×t} with s ≥ t, rank W(u) = t for all u ∈ ℂⁿ if and only if there exists P(x) ∈ ℂ[x]^{t×s} such that

    P(x) W(x) = I_t.

Moreover, for W(x) ∈ ℝ[x]^{s×t}, we can choose P(x) ∈ ℝ[x]^{t×s} in the above.
(ii) The constraining polynomial tuple c is nonsingular if and only if there exists L(x) ∈ ℝ[x]^{m×(m+n)} satisfying (5.2).

Proof. (i) “⇐”: If P(x)W(x) = I_t, then for all u ∈ ℂⁿ,

    t = rank I_t ≤ rank W(u) ≤ t.

So W(x) must have full column rank everywhere.

“⇒”: Suppose rank W(u) = t for all u ∈ ℂⁿ. Write W(x) in columns:

    W(x) = [ w1(x)  w2(x)  ⋯  wt(x) ].

18 JIAWANG NIE

Then the equation w1(x) = 0 does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists ξ1(x) ∈ ℂ[x]^s such that ξ1(x)ᵀw1(x) = 1. For each i = 2, . . . , t, denote

    r_{1i}(x) = ξ1(x)ᵀ wi(x);

then (we use ∼ to denote row equivalence between matrices)

    W(x) ∼ ⎡ 1       r_{12}(x)  ⋯  r_{1t}(x) ⎤ ∼ W1(x) = ⎡ 1  r_{12}(x)    ⋯  r_{1t}(x)    ⎤
           ⎣ w1(x)   w2(x)      ⋯  wt(x)     ⎦           ⎣ 0  w2^{(1)}(x)  ⋯  wt^{(1)}(x)  ⎦,

where, for each i = 2, . . . , t,

    wi^{(1)}(x) = wi(x) − r_{1i}(x) w1(x).

So there exists P1(x) ∈ ℂ[x]^{(s+1)×s} such that

    P1(x) W(x) = W1(x).

Since W(x) and W1(x) are row equivalent, W1(x) must also have full column rank everywhere. Similarly, the polynomial equation

    w2^{(1)}(x) = 0

does not have a complex solution. Again by Hilbert's Weak Nullstellensatz [4], there exists ξ2(x) ∈ ℂ[x]^s such that

    ξ2(x)ᵀ w2^{(1)}(x) = 1.

For each i = 3, . . . , t, let r_{2i}(x) = ξ2(x)ᵀ wi^{(1)}(x); then

    W1(x) ∼ ⎡ 1  r_{12}(x)    r_{13}(x)    ⋯  r_{1t}(x)   ⎤
            ⎢ 0  1            r_{23}(x)    ⋯  r_{2t}(x)   ⎥
            ⎣ 0  w2^{(1)}(x)  w3^{(1)}(x)  ⋯  wt^{(1)}(x) ⎦

          ∼ W2(x) = ⎡ 1  r_{12}(x)  r_{13}(x)    ⋯  r_{1t}(x)   ⎤
                    ⎢ 0  1          r_{23}(x)    ⋯  r_{2t}(x)   ⎥
                    ⎣ 0  0          w3^{(2)}(x)  ⋯  wt^{(2)}(x) ⎦,

where, for each i = 3, . . . , t,

    wi^{(2)}(x) = wi^{(1)}(x) − r_{2i}(x) w2^{(1)}(x).

Similarly, W1(x) and W2(x) are row equivalent, so W2(x) has full column rank everywhere. There exists P2(x) ∈ ℂ[x]^{(s+2)×(s+1)} such that

    P2(x) W1(x) = W2(x).

Continuing this process, we finally get

    W2(x) ∼ ⋯ ∼ Wt(x) = ⎡ 1  r_{12}(x)  r_{13}(x)  ⋯  r_{1t}(x) ⎤
                        ⎢ 0  1          r_{23}(x)  ⋯  r_{2t}(x) ⎥
                        ⎢ 0  0          1          ⋯  r_{3t}(x) ⎥
                        ⎢ ⋮  ⋮          ⋮          ⋱  ⋮         ⎥
                        ⎢ 0  0          0          ⋯  1         ⎥
                        ⎣ 0  0          0          ⋯  0         ⎦.

Consequently, there exist Pi(x) ∈ ℂ[x]^{(s+i)×(s+i−1)}, for i = 1, 2, . . . , t, such that

    Pt(x) Pt−1(x) ⋯ P1(x) W(x) = Wt(x).


Since Wt(x) is a unit upper triangular matrix polynomial (with zero rows at the bottom), there exists Pt+1(x) ∈ ℂ[x]^{t×(s+t)} such that Pt+1(x)Wt(x) = I_t. Let

    P(x) = Pt+1(x) Pt(x) Pt−1(x) ⋯ P1(x);

then P(x)W(x) = I_t. Note that P(x) ∈ ℂ[x]^{t×s}.

For W(x) ∈ ℝ[x]^{s×t}, we can replace P(x) by (P(x) + P̄(x))/2 (P̄(x) denotes the complex conjugate of P(x)), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). □

In Proposition 5.2, there is no explicit degree bound for L(x) satisfying (5.2). To the best of the author's knowledge, this question is mostly open. However, once a degree is chosen for L(x), its coefficients can be determined by comparing the coefficients of both sides of (5.2); this amounts to solving a linear system. In the following, we give some examples of L(x) satisfying (5.2).
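The coefficient-comparison step can be carried out symbolically. The following sketch is our own toy illustration (not an instance from the paper): a single constraint c(x) = 1 − x² in one variable, with hypothetical coefficient names a0, a1, b0, b1 for a degree-1 ansatz of L(x); equating coefficients of powers of x in L(x)C(x) = 1 gives a linear system.

```python
import sympy as sp

x = sp.symbols('x')
# Toy constraint c(x) = 1 - x^2, so C(x) = [c'(x); c(x)] = [-2x; 1 - x^2],
# and we seek L(x) = [l1(x), l2(x)] with L(x) C(x) = 1 identically.
a0, a1, b0, b1 = sp.symbols('a0 a1 b0 b1')   # hypothetical unknown coefficients
l1 = a0 + a1*x                               # degree-1 ansatz for each entry
l2 = b0 + b1*x
residual = sp.expand(l1*(-2*x) + l2*(1 - x**2) - 1)
# Comparing coefficients of each power of x yields a linear system in a0, a1, b0, b1.
eqs = sp.Poly(residual, x).coeffs()
sol = sp.solve(eqs, [a0, a1, b0, b1])
L = [e.subs(sol) for e in (l1, l2)]
print(L)   # the unique solution here is L(x) = [-x/2, 1]
```

Indeed, (−x/2)(−2x) + 1·(1 − x²) = 1, so this L(x) satisfies (5.2) for the toy constraint.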

Example 5.3. Consider the hypercube with quadratic constraints

    1 − x1² ≥ 0,  1 − x2² ≥ 0,  . . . ,  1 − xn² ≥ 0.

The equation L(x)C(x) = I_n is satisfied by

    L(x) = [ −½ diag(x)   I_n ].

Example 5.4. Consider the nonnegative portion of the unit sphere:

    x1 ≥ 0,  x2 ≥ 0,  . . . ,  xn ≥ 0,  x1² + ⋯ + xn² − 1 = 0.

The equation L(x)C(x) = I_{n+1} is satisfied by

    L(x) = ⎡ I_n − x xᵀ   x 1_nᵀ    2x ⎤
           ⎣ ½ xᵀ         −½ 1_nᵀ   −1 ⎦.

Example 5.5. Consider the set

    1 − x1³ − x2⁴ ≥ 0,  1 − x3⁴ − x4³ ≥ 0.

The equation L(x)C(x) = I_2 is satisfied by

    L(x) = ⎡ −x1/3   −x2/4   0       0       1  0 ⎤
           ⎣ 0       0       −x3/4   −x4/3   0  1 ⎦.

Example 5.6. Consider the quadratic set

    1 − x1x2 − x2x3 − x1x3 ≥ 0,  1 − x1² − x2² − x3² ≥ 0.

The matrix L(x)ᵀ satisfying L(x)C(x) = I_2 is

    ⎡ 25x1³ + 10x1²x2 + 40x1x2² − 25x1 − 2x3    −25x1³ − 10x1²x2 − 40x1x2² + 49x1/2 + 2x3 ⎤
    ⎢ −15x1²x2 + 10x1x2² + 20x1x2x3 − 10x1      15x1²x2 − 10x1x2² − 20x1x2x3 + 10x1 − x2/2 ⎥
    ⎢ 25x1²x3 − 20x1x2² + 10x1x2x3 + 2x1        −25x1²x3 + 20x1x2² − 10x1x2x3 − 2x1 − x3/2 ⎥
    ⎢ 1 − 20x1x3 − 10x1² − 20x1x2               20x1x2 + 20x1x3 + 10x1²                    ⎥
    ⎣ −50x1² − 20x1x2                           50x1² + 20x1x2 + 1                         ⎦.

We would like to remark that a polynomial tuple c = (c1, . . . , cm) is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees d1, . . . , dm, there exists an open dense subset U of D = ℝ[x]_{d1} × ⋯ × ℝ[x]_{dm} such that every tuple c = (c1, . . . , cm) ∈ U is nonsingular. Indeed, such U can be chosen as a Zariski open subset of D, i.e., it is the complement of a proper real variety of D. Moreover, Assumption 3.1 holds for all c ∈ U, i.e., it holds generically.


Proof. The proof uses resultants and discriminants, for which we refer to [29].

First, let J1 be the set of all (i1, . . . , i_{n+1}) with 1 ≤ i1 < ⋯ < i_{n+1} ≤ m. The resultant Res(c_{i1}, . . . , c_{i_{n+1}}) [29, Section 2] is a polynomial in the coefficients of c_{i1}, . . . , c_{i_{n+1}} such that if Res(c_{i1}, . . . , c_{i_{n+1}}) ≠ 0, then the equations

    c_{i1}(x) = ⋯ = c_{i_{n+1}}(x) = 0

have no complex solutions. Define

    F1(c) = ∏_{(i1,...,i_{n+1}) ∈ J1} Res(c_{i1}, . . . , c_{i_{n+1}}).

For the case m ≤ n, J1 = ∅ and we simply let F1(c) = 1. Clearly, if F1(c) ≠ 0, then no more than n of the polynomials c1, . . . , cm have a common complex zero.

Second, let J2 be the set of all (j1, . . . , jk) with k ≤ n and 1 ≤ j1 < ⋯ < jk ≤ m. When one of c_{j1}, . . . , c_{jk} has degree bigger than one, the discriminant ∆(c_{j1}, . . . , c_{jk}) is a polynomial in the coefficients of c_{j1}, . . . , c_{jk} such that if ∆(c_{j1}, . . . , c_{jk}) ≠ 0, then the equations

    c_{j1}(x) = ⋯ = c_{jk}(x) = 0

have no singular complex solution [29, Section 3], i.e., at every common complex solution u, the gradients of c_{j1}, . . . , c_{jk} at u are linearly independent. When all of c_{j1}, . . . , c_{jk} have degree one, the discriminant of the tuple (c_{j1}, . . . , c_{jk}) is not a single polynomial, but we can define ∆(c_{j1}, . . . , c_{jk}) to be the product of all maximal minors of its Jacobian (a constant matrix). Define

    F2(c) = ∏_{(j1,...,jk) ∈ J2} ∆(c_{j1}, . . . , c_{jk}).

Clearly, if F2(c) ≠ 0, then no n or fewer of the polynomials c1, . . . , cm have a singular common complex zero.

Last, let F(c) = F1(c)F2(c) and

    U = {c = (c1, . . . , cm) ∈ D : F(c) ≠ 0}.

Note that U is a Zariski open subset of D, and it is open and dense in D. For all c ∈ U, no more than n of c1, . . . , cm can have a common complex zero, and for any k polynomials (k ≤ n) of c1, . . . , cm, if they have a common complex zero, say u, then their gradients at u must be linearly independent. This means that c is a nonsingular tuple.

Since every c ∈ U is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all c ∈ U, so it holds generically. □

6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for computational results.


The polynomials pi in Assumption 3.1 are constructed as follows. Order the constraining polynomials as c1, . . . , cm. First, find a matrix polynomial L(x) satisfying (4.4) or (5.2). Let L1(x) be the submatrix of L(x) consisting of its first n columns. Then choose (p1, . . . , pm) to be the product L1(x)∇f(x), i.e.,

    pi = ( L1(x) ∇f(x) )_i.
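For instance, for the hypercube constraints of Example 5.3 with a toy objective f = x1 + x2 + x3 (our own choice, purely for illustration), the pi come out as follows:

```python
import sympy as sp

xs = sp.symbols('x1:4')
f = sum(xs)                          # toy objective, our own choice for illustration
# For the constraints 1 - x_i^2 >= 0 of Example 5.3, L(x) = [-1/2 diag(x), I_n],
# so L1(x) (its first n columns) is -1/2 diag(x).
L1 = -sp.Rational(1, 2) * sp.diag(*xs)
grad_f = sp.Matrix([sp.diff(f, xi) for xi in xs])
p = list(L1 * grad_f)                # p_i = (L1(x) grad f(x))_i
print(p)                             # [-x1/2, -x2/2, -x3/2]
```

For this toy f, each pi = −xi/2, i.e., the multiplier of the constraint 1 − xi² ≥ 0 as a polynomial in x.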

In all our examples, the global minimum value f_min of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if f is coercive (i.e., the sublevel set {f(x) ≤ ℓ} is compact for all ℓ) and the constraint qualification condition holds.

By Theorem 3.3, we have f_k = f_min for all k big enough if f_c = f_min and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied by all our examples except Examples 6.1, 6.7 and 6.9 (which have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, as is the computational time (in seconds). The results for standard Lasserre relaxations are titled “w.o. LME”, and those for the new relaxations (3.8)-(3.9) are titled “with LME”.

Example 6.1. Consider the optimization problem

    min   x1 x2 (10 − x3)
    s.t.  x1 ≥ 0,  x2 ≥ 0,  x3 ≥ 0,  1 − x1 − x2 − x3 ≥ 0.

The matrix polynomial L(x) is given in Example 4.2. Since the feasible set is compact, the minimum f_min = 0 is achieved at a critical point. Condition ii) of Theorem 3.3 is satisfied.² Each feasible point (x1, x2, x3) with x1x2 = 0 is a global minimizer. The computational results for standard Lasserre relaxations and the new relaxations (3.8)-(3.9) are in Table 1. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

    order k |      w.o. LME        |      with LME
            | lower bound    time  | lower bound    time
       2    |   −0.0521     0.6841 |   −0.0521     0.1922
       3    |   −0.0026     0.2657 |   −3·10⁻⁸     0.2285
       4    |   −0.0007     0.6785 |   −6·10⁻⁹     0.4431
       5    |   −0.0004     1.6105 |   −2·10⁻⁹     0.9567

²Note that 1 − xᵀx = (1 − eᵀx)(1 + xᵀx) + Σᵢ xᵢ(1 − xᵢ)² + Σ_{i≠j} xᵢ²xⱼ ∈ IQ(c_eq, c_in).
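The polynomial identity in this footnote is easy to check symbolically; here is our own sympy verification for n = 3, with e the all-ones vector:

```python
import sympy as sp

n = 3
xs = sp.symbols('x1:%d' % (n + 1))
lhs = 1 - sum(xi**2 for xi in xs)
# (1 - e^T x)(1 + x^T x) + sum_i x_i (1 - x_i)^2 + sum_{i != j} x_i^2 x_j
rhs = ((1 - sum(xs)) * (1 + sum(xi**2 for xi in xs))
       + sum(xi * (1 - xi)**2 for xi in xs)
       + sum(xs[i]**2 * xs[j] for i in range(n) for j in range(n) if i != j))
assert sp.expand(lhs - rhs) == 0
```

The cross terms Σ_{i,j} xᵢxⱼ² from the first product cancel exactly against the cubic terms of the last two sums.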


Example 6.2. Consider the optimization problem

    min   x1⁴x2² + x1²x2⁴ + x3⁶ − 3x1²x2²x3² + (x1⁴ + x2⁴ + x3⁴)
    s.t.  x1² + x2² + x3² ≥ 1.

The matrix polynomial L(x) = [ ½x1  ½x2  ½x3  −1 ]. The objective f is the sum of the positive definite form x1⁴ + x2⁴ + x3⁴ and the Motzkin polynomial

    M(x) = x1⁴x2² + x1²x2⁴ + x3⁶ − 3x1²x2²x3².

Note that M(x) is nonnegative everywhere but not SOS [37]. Clearly, f is coercive and f_min is achieved at a critical point. The set IQ(φ, ψ) is archimedean, because

    c1(x) p1(x) = (x1² + x2² + x3² − 1)(3M(x) + 2(x1⁴ + x2⁴ + x3⁴)) = 0

defines a compact set. So condition i) of Theorem 3.3 is satisfied.³ The minimum value f_min = 1/3, and there are 8 minimizers (±1/√3, ±1/√3, ±1/√3). The computational results for standard Lasserre relaxations and the new relaxations (3.8)-(3.9) are in Table 2. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

    order k |        w.o. LME          |      with LME
            | lower bound        time  | lower bound    time
       3    |   −∞              0.4466 |    0.1111     0.1169
       4    |   −∞              0.4948 |    0.3333     0.3499
       5    |   −2.1821·10⁵     1.1836 |    0.3333     0.6530
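A quick numerical check (ours) of the claimed minimum: at u = 1/√3, the Motzkin part vanishes, the quartic part equals 1/3, and the constraint x1² + x2² + x3² ≥ 1 is active.

```python
import math

def f(x1, x2, x3):
    M = x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2  # Motzkin form
    return M + (x1**4 + x2**4 + x3**4)

u = 1 / math.sqrt(3)
assert abs(f(u, u, u) - 1/3) < 1e-12       # objective value 1/3 at a minimizer
assert 3*u**2 >= 1 - 1e-12                 # the constraint is active there
```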

Example 6.3. Consider the optimization problem

    min   x1x2 + x2x3 + x3x4 − 3x1x2x3x4 + (x1³ + ⋯ + x4³)
    s.t.  x1, x2, x3, x4 ≥ 0,  1 − x1 − x2 ≥ 0,  1 − x3 − x4 ≥ 0.

The matrix polynomial L(x) is

    ⎡ 1−x1   −x2    0      0      1  1  0  0  1  0 ⎤
    ⎢ −x1    1−x2   0      0      1  1  0  0  1  0 ⎥
    ⎢ 0      0      1−x3   −x4    0  0  1  1  0  1 ⎥
    ⎢ 0      0      −x3    1−x4   0  0  1  1  0  1 ⎥
    ⎢ −x1    −x2    0      0      1  1  0  0  1  0 ⎥
    ⎣ 0      0      −x3    −x4    0  0  1  1  0  1 ⎦.

The feasible set is compact, so f_min is achieved at a critical point. One can show that f_min = 0 and the minimizer is the origin. Condition ii) of Theorem 3.3 is satisfied because IQ(c_eq, c_in) is archimedean.⁴ The computational results for standard Lasserre relaxations and the new relaxations (3.8)-(3.9) are in Table 3.

³This is because −c1²p1² ∈ Ideal(φ) ⊆ IQ(φ, ψ), and the set {−c1(x)²p1(x)² ≥ 0} is compact.

⁴This is because 1 − x1² − x2² belongs to the quadratic module of (x1, x2, 1 − x1 − x2), and 1 − x3² − x4² belongs to the quadratic module of (x3, x4, 1 − x3 − x4). See the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

    order k |        w.o. LME           |       with LME
            | lower bound         time  | lower bound     time
       3    |   −2.9·10⁻⁵        0.7335 |   −6·10⁻⁷      0.6091
       4    |   −1.4·10⁻⁵        2.5055 |   −8·10⁻⁸      2.7423
       5    |   −1.4·10⁻⁵       12.7092 |   −5·10⁻⁸     13.7449

Example 6.4. Consider the polynomial optimization problem

    min_{x∈ℝ²}   x1² + 50x2²
    s.t.   x1² − 1/2 ≥ 0,  x2² − 2x1x2 − 1/8 ≥ 0,  x2² + 2x1x2 − 1/8 ≥ 0.

It is motivated from an example in [12, §3]. The first column of L(x) is

    ⎡ (8x1³ + x1)/5                                  ⎤
    ⎢ (288x1⁴x2 − 16x1³ − 124x1²x2 + 8x1)/5 − 2x2    ⎥
    ⎣ (−288x1⁴x2 − 16x1³ + 124x1²x2 + 8x1)/5 + 2x2   ⎦

and the second column of L(x) is

    ⎡ (−8x1²x2 + 4x2³)/5 − x2/10                                 ⎤
    ⎢ (288x1³x2² + 16x1²x2 − 142x1x2² − 8x2³ + 11x2)/5 − 9x1/20  ⎥
    ⎣ (−288x1³x2² + 16x1²x2 + 142x1x2² − 8x2³ + 11x2)/5 + 9x1/20 ⎦.

The objective is coercive, so f_min is achieved at a critical point. The minimum value f_min = 56 + 3/4 + 25√5 ≈ 112.6517, and the minimizers are (±√(1/2), ±(√(5/8) + √(1/2))). The computational results for standard Lasserre relaxations and the new relaxations (3.8)-(3.9) are in Table 4. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

    order k |      w.o. LME        |      with LME
            | lower bound    time  | lower bound    time
       3    |    6.7535     0.4611 |    56.7500    0.1309
       4    |    6.9294     0.2428 |   112.6517    0.2405
       5    |    8.8519     0.3376 |   112.6517    0.2167
       6    |   16.5971     0.4703 |   112.6517    0.3788
       7    |   35.4756     0.6536 |   112.6517    0.4537
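The claimed optimal value can be confirmed numerically (our check): evaluating the objective at the minimizer (√(1/2), √(5/8) + √(1/2)) gives 56 + 3/4 + 25√5, and the first two constraints are active there.

```python
import math

f = lambda x1, x2: x1**2 + 50*x2**2
x1 = math.sqrt(1/2)
x2 = math.sqrt(5/8) + math.sqrt(1/2)
fmin = 56 + 3/4 + 25*math.sqrt(5)          # = 112.6517...
assert abs(f(x1, x2) - fmin) < 1e-9
assert abs(x1**2 - 1/2) < 1e-12            # first constraint active
assert abs(x2**2 - 2*x1*x2 - 1/8) < 1e-9   # second constraint active
```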

Example 6.5. Consider the optimization problem

    min_{x∈ℝ³}   x1³ + x2³ + x3³ + 4x1x2x3 − (x1(x2² + x3²) + x2(x3² + x1²) + x3(x1² + x2²))
    s.t.   x1 ≥ 0,  x1x2 − 1 ≥ 0,  x2x3 − 1 ≥ 0.

The matrix polynomial L(x) is

    ⎡ 1−x1x2   0    0   x2   x2   0  ⎤
    ⎢ x1       0    0   −1   −1   0  ⎥
    ⎣ −x1      x2   0   1    0    −1 ⎦.


The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant ℝ³₊, so the minimum value is achieved at a critical point. In computation, we got f_min ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for standard Lasserre relaxations and the new relaxations (3.8)-(3.9) are in Table 5. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

    order k |        w.o. LME          |      with LME
            | lower bound        time  | lower bound    time
       2    |   −∞              0.4129 |   −∞          0.1900
       3    |   −7.8184·10⁶     0.4641 |    0.9492     0.3139
       4    |   −2.0575·10⁴     0.6499 |    0.9492     0.5057

Example 6.6. Consider the optimization problem (x0 = 1)

    min_{x∈ℝ⁴}   xᵀx + Σ_{i=0}^{4} ∏_{j≠i} (xi − xj)
    s.t.   x1² − 1 ≥ 0,  x2² − 1 ≥ 0,  x3² − 1 ≥ 0,  x4² − 1 ≥ 0.

The matrix polynomial L(x) = [ ½diag(x)  −I_4 ]. The first part of the objective is xᵀx, while the second part is a nonnegative polynomial [37]. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min = 4.0000 and 11 global minimizers:

    (1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
    (1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
    (−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for standard Lasserre relaxations and the new relaxations (3.8)-(3.9) are in Table 6. It confirms that f_k = f_min for all k ≥ 4, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

    order k |        w.o. LME           |      with LME
            | lower bound         time  | lower bound     time
       3    |   −∞               1.1377 |    3.5480      1.1765
       4    |   −6.6913·10⁴      4.7677 |    4.0000      3.0761
       5    |   −21.3778        22.9970 |    4.0000     10.3354
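A quick check (ours) that the objective equals 4 at each of the 11 listed minimizers: at these points every product ∏_{j≠i}(xi − xj) contains a zero factor, since each of the values ±1 occurs at least twice among x0, . . . , x4.

```python
from math import prod

def f(x1, x2, x3, x4):
    x = [1, x1, x2, x3, x4]                 # x0 = 1 by convention
    return (x1*x1 + x2*x2 + x3*x3 + x4*x4
            + sum(prod(x[i] - x[j] for j in range(5) if j != i)
                  for i in range(5)))

minimizers = [(1, 1, 1, 1), (1, -1, -1, 1), (1, -1, 1, -1), (1, 1, -1, -1),
              (1, -1, -1, -1), (-1, -1, 1, 1), (-1, 1, -1, 1), (-1, 1, 1, -1),
              (-1, -1, -1, 1), (-1, -1, 1, -1), (-1, 1, -1, -1)]
assert all(f(*m) == 4 for m in minimizers)  # f = x^T x = 4 at every minimizer
```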

Example 6.7. Consider the optimization problem

    min_{x∈ℝ³}   x1⁴x2² + x2⁴x3² + x3⁴x1² − 3x1²x2²x3² + x2²
    s.t.   x1 − x2x3 ≥ 0,  −x2 + x3² ≥ 0.

The matrix polynomial L(x) =

    ⎡ 1     0   0  0  0 ⎤
    ⎣ −x3   −1  0  0  0 ⎦.

By the arithmetic-geometric mean inequality, one can show that f_min = 0. The global minimizers are the points (x1, 0, x3) with x1 ≥ 0 and x1x3 = 0. The computational results for standard Lasserre


relaxations and the new relaxations (3.8)-(3.9) are in Table 7. It confirms that f_k = f_min for all k ≥ 5, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

    order k |        w.o. LME          |       with LME
            | lower bound        time  | lower bound      time
       3    |   −∞              0.6144 |   −∞            0.3418
       4    |   −1.0909·10⁷     1.0542 |   −3.9476       0.7180
       5    |   −942.6772       1.6771 |   −3·10⁻⁹       1.4607
       6    |   −0.0110         3.3532 |   −8·10⁻¹⁰      3.1618

Example 6.8. Consider the optimization problem

    min_{x∈ℝ⁴}   x1²(x1 − x4)² + x2²(x2 − x4)² + x3²(x3 − x4)²
                 + 2x1x2x3(x1 + x2 + x3 − 2x4) + (x1 − 1)² + (x2 − 1)² + (x3 − 1)²
    s.t.   x1 − x2 ≥ 0,  x2 − x3 ≥ 0.

The matrix polynomial L(x) =

    ⎡ 1  0  0  0  0  0 ⎤
    ⎣ 1  1  0  0  0  0 ⎦.

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min ≈ 0.9413 and a minimizer

    (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for standard Lasserre relaxations and the new relaxations (3.8)-(3.9) are in Table 8. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

    order k |        w.o. LME           |      with LME
            | lower bound         time  | lower bound    time
       2    |   −∞               0.3984 |   −0.3360     0.9321
       3    |   −∞               0.7634 |    0.9413     0.5240
       4    |   −6.4896·10⁵      4.5496 |    0.9413     1.7192
       5    |   −3.1645·10³     24.3665 |    0.9413     8.1228

Example 6.9. Consider the optimization problem

    min_{x∈ℝ⁴}   (x1 + x2 + x3 + x4 + 1)² − 4(x1x2 + x2x3 + x3x4 + x4 + x1)
    s.t.   0 ≤ x1, . . . , x4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so f_min is achieved at a critical point. Condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value f_min = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for standard Lasserre relaxations and the new relaxations (3.8)-(3.9) are in Table 9.

⁵Note that 4 − Σ_{i=1}^{4} xi² = Σ_{i=1}^{4} ( xi(1 − xi)² + (1 − xi)(1 + xi²) ) ∈ IQ(c_eq, c_in).


Table 9. Computational results for Example 6.9

    order |      w.o. LME         |       with LME
          | lower bound     time  | lower bound      time
      2   |   −0.0279      0.2262 |   −5·10⁻⁶       1.1835
      3   |   −0.0005      0.4691 |   −6·10⁻⁷       1.6566
      4   |   −0.0001      3.1098 |   −2·10⁻⁷       5.5234
      5   |   −4·10⁻⁵     16.5092 |   −6·10⁻⁷      19.7320

For some polynomial optimization problems, standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem (x0 = 1)

    min_{x∈ℝⁿ}   Σ_{0≤i≤j≤k≤n} c_{ijk} xi xj xk
    s.t.   0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have f_c = f_min. Condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problems, standard Lasserre relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here we compare the computational time consumed by standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generated 10 random instances, and we show the average consumed time. For all instances, standard Lasserre relaxations and the new relaxations (3.8)-(3.9) are tight for the order k = 2, while their consumed time differs a bit. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

        n    |    9       10      11       12       13       14
    w.o. LME | 1.2569   2.5619  6.3085  15.8722  35.1675  78.4111
    with LME | 1.9714   3.8288  8.2519  20.0310  37.6373  82.4778

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value f_min is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

    IQ(c_eq, c_in)_{2k} + IQ(φ, ψ)_{2k} = Ideal(c_eq, φ)_{2k} + Qmod(c_in, ψ)_{2k}.

If we replace the quadratic module Qmod(c_in, ψ) by the preordering of (c_in, ψ) [21, 25], we can get even tighter relaxations. For convenience, write (c_in, ψ) as a single tuple (g1, . . . , gℓ). Its preordering is the set

    Preord(c_in, ψ) = Σ_{r1,...,rℓ ∈ {0,1}} g1^{r1} ⋯ gℓ^{rℓ} · Σ[x].


The truncation Preord(c_in, ψ)_{2k} is defined similarly to Qmod(c_in, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

(7.1)    f′_{pre,k} := min  ⟨f, y⟩
         s.t.  ⟨1, y⟩ = 1,  L^{(k)}_{c_eq}(y) = 0,  L^{(k)}_{φ}(y) = 0,
               L^{(k)}_{g1^{r1}⋯gℓ^{rℓ}}(y) ⪰ 0  for all r1, . . . , rℓ ∈ {0, 1},
               y ∈ ℝ^{N^n_{2k}}.

Similar to (3.9), the dual optimization problem of the above is

(7.2)    f_{pre,k} := max γ    s.t.  f − γ ∈ Ideal(c_eq, φ)_{2k} + Preord(c_in, ψ)_{2k}.

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose K_c ≠ ∅ and Assumption 3.1 holds. Then

    f_{pre,k} = f′_{pre,k} = f_c

for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f_{pre,k} = f′_{pre,k} = f_min for all k big enough.

Proof. The proof is very similar to that of Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

    K3 = {x ∈ ℝⁿ | c_eq(x) = 0, φ(x) = 0, c_in(x) ≥ 0, ψ(x) ≥ 0}.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(c_in, ψ) such that f^{2ℓ} + q ∈ Ideal(c_eq, φ). The rest of the proof is the same. □

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is singular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is used for each of these applications, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

    λ = ( C(x)ᵀC(x) )⁻¹ C(x)ᵀ [∇f(x); 0]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above one. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra, third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2008), no. 2, pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J. B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J. B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (2001), no. 3, pp. 796–817.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.


[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J. B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J. B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J. B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J. B. Lasserre, K. C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (M. Putinar and S. Sullivant, eds.), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P. A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J. F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H. H. Vui and P. T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics University of California at San Diego 9500 Gilman

Drive La Jolla CA USA 92093

E-mail address njwmathucsdedu

  • 1 Introduction
    • 11 Optimality conditions
    • 12 Some existing work
    • 13 New contributions
      • 2 Preliminaries
      • 3 The construction of tight relaxations
        • 31 Tightness of the relaxations
        • 32 Detecting tightness and extracting minimizers
          • 4 Polyhedral constraints
          • 5 General constraints
          • 6 Numerical examples
          • 7 Discussions
            • 71 Tight relaxations using preorderings
            • 72 Singular constraining polynomials
            • 73 Degree bound for L(x)
            • 74 Rational representation of Lagrange multipliers
              • References
Page 18: TIGHT RELAXATIONS FOR POLYNOMIAL OPTIMIZATION …The polynomial expressions is de-termined by linear equations. Based on these expressions, new Lasserre type semidefinite relaxations

18 JIAWANG NIE

Then the equation $w_1(x) = 0$ does not have a complex solution. By Hilbert's Weak Nullstellensatz [4], there exists $\xi_1(x) \in \mathbb{C}[x]^s$ such that $\xi_1(x)^T w_1(x) = 1$. For each $i = 2, \ldots, t$, denote
\[ r_{1i}(x) = \xi_1(x)^T w_i(x), \]
then (use $\sim$ to denote row equivalence between matrices)
\[
W(x) \sim \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ w_1(x) & w_2(x) & \cdots & w_t(x) \end{bmatrix}
\sim W_1(x) := \begin{bmatrix} 1 & r_{12}(x) & \cdots & r_{1t}(x) \\ 0 & w_2^{(1)}(x) & \cdots & w_t^{(1)}(x) \end{bmatrix},
\]
where each ($i = 2, \ldots, t$)
\[ w_i^{(1)}(x) = w_i(x) - r_{1i}(x)\, w_1(x). \]
So there exists $P_1(x) \in \mathbb{C}[x]^{(s+1)\times s}$ such that
\[ P_1(x)\, W(x) = W_1(x). \]
Since $W(x)$ and $W_1(x)$ are row equivalent, $W_1(x)$ must also have full column rank everywhere. Similarly, the polynomial equation
\[ w_2^{(1)}(x) = 0 \]
does not have a complex solution. Again, by Hilbert's Weak Nullstellensatz [4], there exists $\xi_2(x) \in \mathbb{C}[x]^s$ such that
\[ \xi_2(x)^T w_2^{(1)}(x) = 1. \]
For each $i = 3, \ldots, t$, let $r_{2i}(x) = \xi_2(x)^T w_i^{(1)}(x)$; then
\[
W_1(x) \sim \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & w_2^{(1)}(x) & w_3^{(1)}(x) & \cdots & w_t^{(1)}(x)
\end{bmatrix}
\sim W_2(x) := \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & w_3^{(2)}(x) & \cdots & w_t^{(2)}(x)
\end{bmatrix},
\]
where each ($i = 3, \ldots, t$)
\[ w_i^{(2)}(x) = w_i^{(1)}(x) - r_{2i}(x)\, w_2^{(1)}(x). \]
Similarly, $W_1(x)$ and $W_2(x)$ are row equivalent, so $W_2(x)$ has full column rank everywhere. There exists $P_2(x) \in \mathbb{C}[x]^{(s+2)\times(s+1)}$ such that
\[ P_2(x)\, W_1(x) = W_2(x). \]
Continuing this process, we finally get
\[
W_2(x) \sim \cdots \sim W_t(x) := \begin{bmatrix}
1 & r_{12}(x) & r_{13}(x) & \cdots & r_{1t}(x) \\
0 & 1 & r_{23}(x) & \cdots & r_{2t}(x) \\
0 & 0 & 1 & \cdots & r_{3t}(x) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}.
\]
Consequently, there exist $P_i(x) \in \mathbb{C}[x]^{(s+i)\times(s+i-1)}$, for $i = 1, 2, \ldots, t$, such that
\[ P_t(x) P_{t-1}(x) \cdots P_1(x)\, W(x) = W_t(x). \]

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 19

Since $W_t(x)$ is a unit upper triangular matrix polynomial, there exists $P_{t+1}(x) \in \mathbb{C}[x]^{t\times(s+t)}$ such that $P_{t+1}(x) W_t(x) = I_t$. Let
\[ P(x) = P_{t+1}(x) P_t(x) P_{t-1}(x) \cdots P_1(x), \]
then $P(x) W(x) = I_t$. Note that $P(x) \in \mathbb{C}[x]^{t\times s}$.

For $W(x) \in \mathbb{R}[x]^{s\times t}$, we can replace $P(x)$ by $\big( P(x) + \overline{P(x)} \big)/2$ (where $\overline{P(x)}$ denotes the complex conjugate of $P(x)$), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). □

In Proposition 5.2, there is no explicit degree bound for $L(x)$ satisfying (5.2). To the best of the author's knowledge, this question is mostly open. However, once a degree is chosen for $L(x)$, such an $L(x)$ can be determined by comparing the coefficients of both sides of (5.2). This can be done by solving a linear system. In the following, we give some examples of $L(x)$ satisfying (5.2).

Example 5.3. Consider the hypercube with quadratic constraints
\[ 1 - x_1^2 \geq 0, \quad 1 - x_2^2 \geq 0, \quad \ldots, \quad 1 - x_n^2 \geq 0. \]
The equation $L(x) C(x) = I_n$ is satisfied by
\[ L(x) = \begin{bmatrix} -\tfrac{1}{2}\operatorname{diag}(x) & I_n \end{bmatrix}. \]
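As a sanity check, the identity $L(x)C(x) = I_n$ of Example 5.3 can be verified numerically. The sketch below assumes the construction of $C(x)$ from Section 5, stacking the Jacobian $[\nabla c_1 \cdots \nabla c_n]$ over $\operatorname{diag}(c_1,\ldots,c_n)$; the function names are ours.

```python
def C_hypercube(x):
    # C(x) stacks the Jacobian over diag(c), with c_i = 1 - x_i^2, grad c_i = -2 x_i e_i
    n = len(x)
    top = [[-2.0 * x[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    bot = [[1.0 - x[i] ** 2 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return top + bot  # (2n) x n

def L_hypercube(x):
    # L(x) = [ -1/2 diag(x) | I_n ]
    n = len(x)
    left = [[-0.5 * x[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    right = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return [left[i] + right[i] for i in range(n)]  # n x (2n)

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

x = [0.4, -1.2, 2.0, 0.0]
LC = matmul(L_hypercube(x), C_hypercube(x))
# L(x) C(x) = diag(x^2) + diag(1 - x^2) = I_n, for every x
assert all(abs(LC[i][j] - (1.0 if i == j else 0.0)) < 1e-12
           for i in range(4) for j in range(4))
```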

Example 5.4. Consider the nonnegative portion of the unit sphere:
\[ x_1 \geq 0, \quad x_2 \geq 0, \quad \ldots, \quad x_n \geq 0, \quad x_1^2 + \cdots + x_n^2 - 1 = 0. \]
The equation $L(x) C(x) = I_{n+1}$ is satisfied by
\[ L(x) = \begin{bmatrix} I_n - x x^T & x \mathbf{1}_n^T & 2x \\[2pt] \tfrac{1}{2} x^T & -\tfrac{1}{2}\mathbf{1}_n^T & -1 \end{bmatrix}. \]
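The identity $L(x)C(x) = I_{n+1}$ of Example 5.4 in fact holds for all $x$, not only on the sphere, which is easy to probe numerically. The sketch assumes the same stacking convention for $C(x)$ as above; names are ours.

```python
def C_sphere_portion(x):
    # C(x) stacks the Jacobian [grad c_1 ... grad c_{n+1}] (n rows) over
    # diag(c_1, ..., c_{n+1}), with c_i = x_i (i <= n), c_{n+1} = x^T x - 1
    n = len(x)
    top = [[1.0 if i == j else 0.0 for j in range(n)] + [2.0 * x[i]] for i in range(n)]
    bot = [[x[i] if i == j else 0.0 for j in range(n)] + [0.0] for i in range(n)]
    bot.append([0.0] * n + [sum(t * t for t in x) - 1.0])
    return top + bot  # (2n+1) x (n+1)

def L_sphere_portion(x):
    # L(x) = [ I_n - x x^T | x 1_n^T | 2x ; (1/2) x^T | -(1/2) 1_n^T | -1 ]
    n = len(x)
    rows = [[(1.0 if i == j else 0.0) - x[i] * x[j] for j in range(n)]
            + [x[i]] * n + [2.0 * x[i]] for i in range(n)]
    rows.append([0.5 * t for t in x] + [-0.5] * n + [-1.0])
    return rows  # (n+1) x (2n+1)

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

x = [0.6, -0.8, 1.3]
LC = matmul(L_sphere_portion(x), C_sphere_portion(x))
assert all(abs(LC[i][j] - (1.0 if i == j else 0.0)) < 1e-12
           for i in range(4) for j in range(4))
```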

Example 5.5. Consider the set
\[ 1 - x_1^3 - x_2^4 \geq 0, \quad 1 - x_3^4 - x_4^3 \geq 0. \]
The equation $L(x) C(x) = I_2$ is satisfied by
\[ L(x) = \begin{bmatrix} -\tfrac{x_1}{3} & -\tfrac{x_2}{4} & 0 & 0 & 1 & 0 \\[2pt] 0 & 0 & -\tfrac{x_3}{4} & -\tfrac{x_4}{3} & 0 & 1 \end{bmatrix}. \]

Example 5.6. Consider the quadratic set
\[ 1 - x_1 x_2 - x_2 x_3 - x_1 x_3 \geq 0, \quad 1 - x_1^2 - x_2^2 - x_3^2 \geq 0. \]
The matrix $L(x)^T$ satisfying $L(x) C(x) = I_2$ is
\[
\begin{bmatrix}
25 x_1^3 + 10 x_1^2 x_2 + 40 x_1 x_2^2 - 25 x_1 - 2 x_3 &
-25 x_1^3 - 10 x_1^2 x_2 - 40 x_1 x_2^2 + \tfrac{49 x_1}{2} + 2 x_3 \\[2pt]
-15 x_1^2 x_2 + 10 x_1 x_2^2 + 20 x_1 x_2 x_3 - 10 x_1 &
15 x_1^2 x_2 - 10 x_1 x_2^2 - 20 x_1 x_2 x_3 + 10 x_1 - \tfrac{x_2}{2} \\[2pt]
25 x_1^2 x_3 - 20 x_1 x_2^2 + 10 x_1 x_2 x_3 + 2 x_1 &
-25 x_1^2 x_3 + 20 x_1 x_2^2 - 10 x_1 x_2 x_3 - 2 x_1 - \tfrac{x_3}{2} \\[2pt]
1 - 20 x_1 x_3 - 10 x_1^2 - 20 x_1 x_2 &
20 x_1 x_2 + 20 x_1 x_3 + 10 x_1^2 \\[2pt]
-50 x_1^2 - 20 x_1 x_2 &
50 x_1^2 + 20 x_1 x_2 + 1
\end{bmatrix}.
\]

We would like to remark that a polynomial tuple $c = (c_1, \ldots, c_m)$ is generically nonsingular, and Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees $d_1, \ldots, d_m$, there exists an open dense subset $\mathcal{U}$ of $\mathcal{D} = \mathbb{R}[x]_{d_1} \times \cdots \times \mathbb{R}[x]_{d_m}$ such that every tuple $c = (c_1, \ldots, c_m) \in \mathcal{U}$ is nonsingular. Indeed, such $\mathcal{U}$ can be chosen as a Zariski open subset of $\mathcal{D}$, i.e., it is the complement of a proper real variety of $\mathcal{D}$. Moreover, Assumption 3.1 holds for all $c \in \mathcal{U}$, i.e., it holds generically.


Proof. The proof needs to use resultants and discriminants, for which we refer to [29].

First, let $J_1$ be the set of all tuples $(i_1, \ldots, i_{n+1})$ with $1 \leq i_1 < \cdots < i_{n+1} \leq m$. The resultant $\operatorname{Res}(c_{i_1}, \ldots, c_{i_{n+1}})$ [29, Section 2] is a polynomial in the coefficients of $c_{i_1}, \ldots, c_{i_{n+1}}$ such that, if $\operatorname{Res}(c_{i_1}, \ldots, c_{i_{n+1}}) \neq 0$, then the equations
\[ c_{i_1}(x) = \cdots = c_{i_{n+1}}(x) = 0 \]
have no complex solutions. Define
\[ F_1(c) = \prod_{(i_1,\ldots,i_{n+1}) \in J_1} \operatorname{Res}(c_{i_1}, \ldots, c_{i_{n+1}}). \]
For the case $m \leq n$, $J_1 = \emptyset$ and we simply let $F_1(c) = 1$. Clearly, if $F_1(c) \neq 0$, then no more than $n$ of the polynomials $c_1, \ldots, c_m$ have a common complex zero.

Second, let $J_2$ be the set of all tuples $(j_1, \ldots, j_k)$ with $k \leq n$ and $1 \leq j_1 < \cdots < j_k \leq m$. When one of $c_{j_1}, \ldots, c_{j_k}$ has degree bigger than one, the discriminant $\Delta(c_{j_1}, \ldots, c_{j_k})$ is a polynomial in the coefficients of $c_{j_1}, \ldots, c_{j_k}$ such that, if $\Delta(c_{j_1}, \ldots, c_{j_k}) \neq 0$, then the equations
\[ c_{j_1}(x) = \cdots = c_{j_k}(x) = 0 \]
have no singular complex solution [29, Section 3], i.e., at every common complex solution $u$, the gradients of $c_{j_1}, \ldots, c_{j_k}$ at $u$ are linearly independent. When all of $c_{j_1}, \ldots, c_{j_k}$ have degree one, the discriminant of the tuple $(c_{j_1}, \ldots, c_{j_k})$ is not a single polynomial, but we can define $\Delta(c_{j_1}, \ldots, c_{j_k})$ to be the product of all maximum minors of its Jacobian (a constant matrix). Define
\[ F_2(c) = \prod_{(j_1,\ldots,j_k) \in J_2} \Delta(c_{j_1}, \ldots, c_{j_k}). \]
Clearly, if $F_2(c) \neq 0$, then no $n$ or fewer of the polynomials $c_1, \ldots, c_m$ have a singular common complex zero.

Last, let $F(c) = F_1(c) F_2(c)$ and
\[ \mathcal{U} = \{ c = (c_1, \ldots, c_m) \in \mathcal{D} : F(c) \neq 0 \}. \]
Note that $\mathcal{U}$ is a Zariski open subset of $\mathcal{D}$, and it is open dense in $\mathcal{D}$. For all $c \in \mathcal{U}$, no more than $n$ of $c_1, \ldots, c_m$ can have a common complex zero, and for any $k$ polynomials ($k \leq n$) of $c_1, \ldots, c_m$, if they have a common complex zero, say $u$, then their gradients at $u$ must be linearly independent. This means that $c$ is a nonsingular tuple.

Since every $c \in \mathcal{U}$ is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all $c \in \mathcal{U}$, so it holds generically. □

6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for computational results.


The polynomials $p_i$ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as $c_1, \ldots, c_m$. First, find a matrix polynomial $L(x)$ satisfying (4.4) or (5.2). Let $L_1(x)$ be the submatrix of $L(x)$ consisting of its first $n$ columns. Then choose $(p_1, \ldots, p_m)$ to be the product $L_1(x)\nabla f(x)$, i.e.,
\[ p_i = \big( L_1(x)\, \nabla f(x) \big)_i. \]
In all our examples, the global minimum value $f_{\min}$ of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., the sub-level set $\{ f(x) \leq \ell \}$ is compact for all $\ell$) and the constraint qualification condition holds.

By Theorem 3.3, we have $f_k = f_{\min}$ for all $k$ big enough if $f_c = f_{\min}$ and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and those given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables, and the computational time (in seconds) is also compared. The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".
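To illustrate the multiplier expressions concretely, consider a hypothetical linear objective $f(x) = c^T x$ over the hypercube of Example 5.3; the data below are ours, not from the paper. At the minimizer $x_i^* = -\operatorname{sign}(c_i)$, the polynomial expression $p_i(x) = (L_1(x)\nabla f(x))_i = -\tfrac{1}{2}x_i\,\partial f/\partial x_i$ reproduces the KKT multipliers:

```python
# Hypothetical data for f(x) = c^T x over {1 - x_i^2 >= 0}
c = [1.5, -2.0, 0.25]
xstar = [-1.0 if ci > 0 else 1.0 for ci in c]        # the minimizer
lam = [-0.5 * xstar[i] * c[i] for i in range(3)]     # p_i(x) = (L_1(x) grad f)_i
assert lam == [abs(ci) / 2 for ci in c]              # the KKT multipliers |c_i|/2
# KKT stationarity: grad f = sum_i lambda_i grad c_i, with grad c_i = -2 x_i e_i
for i in range(3):
    assert abs(c[i] - lam[i] * (-2.0 * xstar[i])) < 1e-12
```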

Example 6.1. Consider the optimization problem
\[
\begin{array}{rl}
\min & x_1 x_2 (10 - x_3) \\
\text{s.t.} & x_1 \geq 0, \ x_2 \geq 0, \ x_3 \geq 0, \ 1 - x_1 - x_2 - x_3 \geq 0.
\end{array}
\]
The matrix polynomial $L(x)$ is given in Example 4.2. Since the feasible set is compact, the minimum $f_{\min} = 0$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point $(x_1, x_2, x_3)$ with $x_1 x_2 = 0$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that $f_k = f_{\min}$ for all $k \geq 3$, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

 order k |  w/o LME bound |  time   |  with LME bound |  time
    2    |  -0.0521       |  0.6841 |  -0.0521        |  0.1922
    3    |  -0.0026       |  0.2657 |  -3·10^{-8}     |  0.2285
    4    |  -0.0007       |  0.6785 |  -6·10^{-9}     |  0.4431
    5    |  -0.0004       |  1.6105 |  -2·10^{-9}     |  0.9567

² Note that $1 - x^T x = (1 - e^T x)(1 + x^T x) + \sum_{i=1}^n x_i (1 - x_i)^2 + \sum_{i \neq j} x_i^2 x_j \in \text{IQ}(c_{eq}, c_{in})$.
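The claim $f_{\min} = 0$ in Example 6.1 can be probed by random sampling over the simplex; this is only a crude check under our own sampling scheme, not a replacement for the relaxations:

```python
import random

def f(x1, x2, x3):
    return x1 * x2 * (10 - x3)

def feasible(x1, x2, x3):
    return x1 >= 0 and x2 >= 0 and x3 >= 0 and 1 - x1 - x2 - x3 >= 0

random.seed(1)
best = float("inf")
for _ in range(20000):
    a, b, c = (random.random() for _ in range(3))
    s = a + b + c
    if s > 1:                      # project onto the simplex boundary
        a, b, c = a / s, b / s, c / s
    best = min(best, f(a, b, c))

# every factor is nonnegative on the feasible set (x3 <= 1 < 10), so f >= 0,
# and the value 0 is attained, e.g. at (0.5, 0, 0.5)
assert best >= 0.0
assert feasible(0.5, 0.0, 0.5) and f(0.5, 0.0, 0.5) == 0.0
```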


Example 6.2. Consider the optimization problem
\[
\begin{array}{rl}
\min & x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3 x_1^2 x_2^2 x_3^2 + (x_1^4 + x_2^4 + x_3^4) \\
\text{s.t.} & x_1^2 + x_2^2 + x_3^2 \geq 1.
\end{array}
\]
The matrix polynomial $L(x) = \begin{bmatrix} \tfrac{1}{2}x_1 & \tfrac{1}{2}x_2 & \tfrac{1}{2}x_3 & -1 \end{bmatrix}$. The objective $f$ is the sum of the positive definite form $x_1^4 + x_2^4 + x_3^4$ and the Motzkin polynomial
\[ M(x) = x_1^4 x_2^2 + x_1^2 x_2^4 + x_3^6 - 3 x_1^2 x_2^2 x_3^2. \]
Note that $M(x)$ is nonnegative everywhere but not SOS [37]. Clearly, $f$ is coercive and $f_{\min}$ is achieved at a critical point. The set $\text{IQ}(\phi, \psi)$ is archimedean, because
\[ c_1(x)\, p_1(x) = (x_1^2 + x_2^2 + x_3^2 - 1)\big( 3M(x) + 2(x_1^4 + x_2^4 + x_3^4) \big) = 0 \]
defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value $f_{\min} = \tfrac{1}{3}$, and there are 8 minimizers $\big( \pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}} \big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that $f_k = f_{\min}$ for all $k \geq 4$, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

 order k |  w/o LME bound   |  time   |  with LME bound |  time
    3    |  -∞              |  0.4466 |  0.1111         |  0.1169
    4    |  -∞              |  0.4948 |  0.3333         |  0.3499
    5    |  -2.1821·10^{5}  |  1.1836 |  0.3333         |  0.6530
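A quick numerical probe of Example 6.2, using only the formulas above: the value at a reported minimizer is $1/3$, and sampled feasible points never go below it, consistent with $f \geq x_1^4 + x_2^4 + x_3^4 \geq (x^Tx)^2/3 \geq 1/3$ on the feasible set.

```python
import math, random

def motzkin(x1, x2, x3):
    return x1**4 * x2**2 + x1**2 * x2**4 + x3**6 - 3 * x1**2 * x2**2 * x3**2

def f(x1, x2, x3):
    return motzkin(x1, x2, x3) + x1**4 + x2**4 + x3**4

t = 1 / math.sqrt(3)
assert abs(f(t, t, t) - 1 / 3) < 1e-12        # value at a reported minimizer

random.seed(2)
for _ in range(10000):
    x = [random.uniform(-2, 2) for _ in range(3)]
    if x[0]**2 + x[1]**2 + x[2]**2 >= 1:       # feasible point
        assert f(*x) >= 1 / 3 - 1e-9
```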

Example 6.3. Consider the optimization problem
\[
\begin{array}{rl}
\min & x_1 x_2 + x_2 x_3 + x_3 x_4 - 3 x_1 x_2 x_3 x_4 + (x_1^3 + \cdots + x_4^3) \\
\text{s.t.} & x_1, x_2, x_3, x_4 \geq 0, \ 1 - x_1 - x_2 \geq 0, \ 1 - x_3 - x_4 \geq 0.
\end{array}
\]
The matrix polynomial $L(x)$ is
\[
\begin{bmatrix}
1 - x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
-x_1 & 1 - x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 - x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & -x_3 & 1 - x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}.
\]
The feasible set is compact, so $f_{\min}$ is achieved at a critical point. One can show that $f_{\min} = 0$ and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because $\text{IQ}(c_{eq}, c_{in})$ is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because $-c_1^2 p_1^2 \in \text{Ideal}(\phi) \subseteq \text{IQ}(\phi, \psi)$ and the set $\{ -c_1(x)^2 p_1(x)^2 \geq 0 \}$ is compact.

⁴ This is because $1 - x_1^2 - x_2^2$ belongs to the quadratic module of $(x_1, x_2, 1 - x_1 - x_2)$, and $1 - x_3^2 - x_4^2$ belongs to the quadratic module of $(x_3, x_4, 1 - x_3 - x_4)$. See the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

 order k |  w/o LME bound  |  time    |  with LME bound |  time
    3    |  -2.9·10^{-5}   |  0.7335  |  -6·10^{-7}     |  0.6091
    4    |  -1.4·10^{-5}   |  2.5055  |  -8·10^{-8}     |  2.7423
    5    |  -1.4·10^{-5}   |  12.7092 |  -5·10^{-8}     |  13.7449
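The displayed $L(x)$ of Example 6.3 can be checked against $L(x)C(x) = I_6$, again assuming $C(x)$ stacks the constraint Jacobian over $\operatorname{diag}(c)$; names are ours.

```python
def C63(x):
    # c = (x1, x2, x3, x4, 1 - x1 - x2, 1 - x3 - x4)
    x1, x2, x3, x4 = x
    c = [x1, x2, x3, x4, 1 - x1 - x2, 1 - x3 - x4]
    grads = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
             [-1, -1, 0, 0], [0, 0, -1, -1]]
    top = [[grads[j][i] for j in range(6)] for i in range(4)]   # Jacobian columns
    bot = [[c[i] if i == j else 0.0 for j in range(6)] for i in range(6)]
    return top + bot  # 10 x 6

def L63(x):
    x1, x2, x3, x4 = x
    return [[1 - x1, -x2, 0, 0, 1, 1, 0, 0, 1, 0],
            [-x1, 1 - x2, 0, 0, 1, 1, 0, 0, 1, 0],
            [0, 0, 1 - x3, -x4, 0, 0, 1, 1, 0, 1],
            [0, 0, -x3, 1 - x4, 0, 0, 1, 1, 0, 1],
            [-x1, -x2, 0, 0, 1, 1, 0, 0, 1, 0],
            [0, 0, -x3, -x4, 0, 0, 1, 1, 0, 1]]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

x = (0.7, 0.1, 0.4, 0.5)
LC = matmul(L63(x), C63(x))
assert all(abs(LC[i][j] - (1 if i == j else 0)) < 1e-12
           for i in range(6) for j in range(6))
```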

Example 6.4. Consider the polynomial optimization problem
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^2} & x_1^2 + 50 x_2^2 \\
\text{s.t.} & x_1^2 - \tfrac{1}{2} \geq 0, \ x_2^2 - 2 x_1 x_2 - \tfrac{1}{8} \geq 0, \ x_2^2 + 2 x_1 x_2 - \tfrac{1}{8} \geq 0.
\end{array}
\]
It is motivated by an example in [12, §3]. The first column of $L(x)$ is
\[
\begin{bmatrix}
\tfrac{8 x_1^3}{5} + \tfrac{x_1}{5} \\[4pt]
\tfrac{288 x_2 x_1^4}{5} - \tfrac{16 x_1^3}{5} - \tfrac{124 x_2 x_1^2}{5} + \tfrac{8 x_1}{5} - 2 x_2 \\[4pt]
-\tfrac{288 x_2 x_1^4}{5} - \tfrac{16 x_1^3}{5} + \tfrac{124 x_2 x_1^2}{5} + \tfrac{8 x_1}{5} + 2 x_2
\end{bmatrix}
\]
and the second column of $L(x)$ is
\[
\begin{bmatrix}
-\tfrac{8 x_1^2 x_2}{5} + \tfrac{4 x_2^3}{5} - \tfrac{x_2}{10} \\[4pt]
\tfrac{288 x_1^3 x_2^2}{5} + \tfrac{16 x_1^2 x_2}{5} - \tfrac{142 x_1 x_2^2}{5} - \tfrac{9 x_1}{20} - \tfrac{8 x_2^3}{5} + \tfrac{11 x_2}{5} \\[4pt]
-\tfrac{288 x_1^3 x_2^2}{5} + \tfrac{16 x_1^2 x_2}{5} + \tfrac{142 x_1 x_2^2}{5} + \tfrac{9 x_1}{20} - \tfrac{8 x_2^3}{5} + \tfrac{11 x_2}{5}
\end{bmatrix}.
\]
The objective is coercive, so $f_{\min}$ is achieved at a critical point. The minimum value $f_{\min} = 56 + \tfrac{3}{4} + 25\sqrt{5} \approx 112.6517$, and the minimizers are $\big( \pm\sqrt{1/2},\ \pm(\sqrt{5/8} + \sqrt{1/2}) \big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that $f_k = f_{\min}$ for all $k \geq 4$, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

 order k |  w/o LME bound |  time   |  with LME bound |  time
    3    |  6.7535        |  0.4611 |  56.7500        |  0.1309
    4    |  6.9294        |  0.2428 |  112.6517       |  0.2405
    5    |  8.8519        |  0.3376 |  112.6517       |  0.2167
    6    |  16.5971       |  0.4703 |  112.6517       |  0.3788
    7    |  35.4756       |  0.6536 |  112.6517       |  0.4537
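The reported optimum of Example 6.4 can be reproduced by direct evaluation at a minimizer, where the second constraint turns out to be active:

```python
import math

x1 = math.sqrt(1 / 2)
x2 = math.sqrt(5 / 8) + math.sqrt(1 / 2)

fval = x1**2 + 50 * x2**2
assert abs(fval - (56 + 3 / 4 + 25 * math.sqrt(5))) < 1e-9   # about 112.6517

g1 = x1**2 - 1 / 2
g2 = x2**2 - 2 * x1 * x2 - 1 / 8
g3 = x2**2 + 2 * x1 * x2 - 1 / 8
assert abs(g1) < 1e-12 and abs(g2) < 1e-12   # first two constraints are active
assert g3 > 0                                # third one is inactive here
```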

Example 6.5. Consider the optimization problem
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^3} & x_1^3 + x_2^3 + x_3^3 + 4 x_1 x_2 x_3 - \big( x_1 (x_2^2 + x_3^2) + x_2 (x_3^2 + x_1^2) + x_3 (x_1^2 + x_2^2) \big) \\
\text{s.t.} & x_1 \geq 0, \ x_1 x_2 - 1 \geq 0, \ x_2 x_3 - 1 \geq 0.
\end{array}
\]
The matrix polynomial $L(x)$ is
\[
\begin{bmatrix}
1 - x_1 x_2 & 0 & 0 & x_2 & x_2 & 0 \\
x_1 & 0 & 0 & -1 & -1 & 0 \\
-x_1 & x_2 & 0 & 1 & 0 & -1
\end{bmatrix}.
\]


The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant $\mathbb{R}^3_+$, so the minimum value is achieved at a critical point. In computation, we got $f_{\min} \approx 0.9492$ and a global minimizer $(0.9071, 1.1024, 0.9071)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that $f_k = f_{\min}$ for all $k \geq 3$, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

 order k |  w/o LME bound   |  time   |  with LME bound |  time
    2    |  -∞              |  0.4129 |  -∞             |  0.1900
    3    |  -7.8184·10^{6}  |  0.4641 |  0.9492         |  0.3139
    4    |  -2.0575·10^{4}  |  0.6499 |  0.9492         |  0.5057
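The computed minimizer of Example 6.5 can be checked by plugging it back into the objective and constraints; the two product constraints are active up to the four printed digits:

```python
xstar = (0.9071, 1.1024, 0.9071)

def f(x1, x2, x3):
    return (x1**3 + x2**3 + x3**3 + 4 * x1 * x2 * x3
            - (x1 * (x2**2 + x3**2) + x2 * (x3**2 + x1**2) + x3 * (x1**2 + x2**2)))

val = f(*xstar)
assert abs(val - 0.9492) < 1e-3          # matches the reported f_min (4 digits)
assert xstar[0] >= 0                     # x1 >= 0
assert abs(xstar[0] * xstar[1] - 1) < 1e-3   # x1 x2 - 1 ~ 0 (active)
assert abs(xstar[1] * xstar[2] - 1) < 1e-3   # x2 x3 - 1 ~ 0 (active)
```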

Example 6.6. Consider the optimization problem (with $x_0 = 1$)
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^4} & x^T x + \sum_{i=0}^{4} \prod_{j \neq i} (x_i - x_j) \\
\text{s.t.} & x_1^2 - 1 \geq 0, \ x_2^2 - 1 \geq 0, \ x_3^2 - 1 \geq 0, \ x_4^2 - 1 \geq 0.
\end{array}
\]
The matrix polynomial $L(x) = \begin{bmatrix} \tfrac{1}{2}\operatorname{diag}(x) & -I_4 \end{bmatrix}$. The first part of the objective is $x^T x$, while the second part is a nonnegative polynomial [37]. The objective is coercive, so $f_{\min}$ is achieved at a critical point. In computation, we got $f_{\min} = 4.0000$ and 11 global minimizers:
\[ (1, 1, 1, 1), \ (1, -1, -1, 1), \ (1, -1, 1, -1), \ (1, 1, -1, -1), \]
\[ (1, -1, -1, -1), \ (-1, -1, 1, 1), \ (-1, 1, -1, 1), \ (-1, 1, 1, -1), \]
\[ (-1, -1, -1, 1), \ (-1, -1, 1, -1), \ (-1, 1, -1, -1). \]
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that $f_k = f_{\min}$ for all $k \geq 4$, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

 order k |  w/o LME bound   |  time    |  with LME bound |  time
    3    |  -∞              |  1.1377  |  3.5480         |  1.1765
    4    |  -6.6913·10^{4}  |  4.7677  |  4.0000         |  3.0761
    5    |  -21.3778        |  22.9970 |  4.0000         |  10.3354
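At each of the 11 listed minimizers of Example 6.6, every coordinate value (including $x_0 = 1$) occurs at least twice among $x_0, \ldots, x_4$, so all the products vanish and the objective reduces to $x^T x = 4$; this is easy to verify:

```python
def f66(x):
    pts = [1.0] + list(x)              # x0 = 1
    val = sum(t * t for t in x)        # x^T x
    for i in range(5):
        p = 1.0
        for j in range(5):
            if j != i:
                p *= pts[i] - pts[j]
        val += p
    return val

minimizers = [
    (1, 1, 1, 1), (1, -1, -1, 1), (1, -1, 1, -1), (1, 1, -1, -1),
    (1, -1, -1, -1), (-1, -1, 1, 1), (-1, 1, -1, 1), (-1, 1, 1, -1),
    (-1, -1, -1, 1), (-1, -1, 1, -1), (-1, 1, -1, -1),
]
for m in minimizers:
    assert f66(m) == 4.0                    # objective value at each minimizer
    assert all(t * t - 1 >= 0 for t in m)   # feasibility: x_i^2 - 1 >= 0
# the excluded sign pattern is feasible but not optimal: here x0 = 1 occurs once
assert f66((-1, -1, -1, -1)) == 20.0
```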

Example 6.7. Consider the optimization problem
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^3} & x_1^4 x_2^2 + x_2^4 x_3^2 + x_3^4 x_1^2 - 3 x_1^2 x_2^2 x_3^2 + x_2^2 \\
\text{s.t.} & x_1 - x_2 x_3 \geq 0, \ -x_2 + x_3^2 \geq 0.
\end{array}
\]
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}$. By the arithmetic-geometric mean inequality, one can show that $f_{\min} = 0$. The global minimizers are $(x_1, 0, x_3)$ with $x_1 \geq 0$ and $x_1 x_3 = 0$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that $f_k = f_{\min}$ for all $k \geq 5$, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

 order k |  w/o LME bound   |  time   |  with LME bound |  time
    3    |  -∞              |  0.6144 |  -∞             |  0.3418
    4    |  -1.0909·10^{7}  |  1.0542 |  -3.9476        |  0.7180
    5    |  -942.6772       |  1.6771 |  -3·10^{-9}     |  1.4607
    6    |  -0.0110         |  3.3532 |  -8·10^{-10}    |  3.1618
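The AM-GM argument of Example 6.7 gives $x_1^4 x_2^2 + x_2^4 x_3^2 + x_3^4 x_1^2 \geq 3 x_1^2 x_2^2 x_3^2$, hence $f \geq x_2^2 \geq 0$; a numerical probe of this bound and of the stated minimizers:

```python
import random

def f67(x1, x2, x3):
    return (x1**4 * x2**2 + x2**4 * x3**2 + x3**4 * x1**2
            - 3 * x1**2 * x2**2 * x3**2 + x2**2)

# minimizers of the form (x1, 0, x3) with x1 >= 0 and x1 * x3 = 0
assert f67(1.0, 0.0, 0.0) == 0.0
assert f67(0.0, 0.0, 2.0) == 0.0

random.seed(3)
for _ in range(10000):
    x = [random.uniform(-2, 2) for _ in range(3)]
    assert f67(*x) >= -1e-9      # f is nonnegative everywhere by AM-GM
```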

Example 6.8. Consider the optimization problem
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^4} & x_1^2 (x_1 - x_4)^2 + x_2^2 (x_2 - x_4)^2 + x_3^2 (x_3 - x_4)^2 \\
& \quad + 2 x_1 x_2 x_3 (x_1 + x_2 + x_3 - 2 x_4) + (x_1 - 1)^2 + (x_2 - 1)^2 + (x_3 - 1)^2 \\
\text{s.t.} & x_1 - x_2 \geq 0, \ x_2 - x_3 \geq 0.
\end{array}
\]
The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}$. In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so $f_{\min}$ is achieved at a critical point. In computation, we got $f_{\min} \approx 0.9413$ and a minimizer
\[ (0.5632, \ 0.5632, \ 0.5632, \ 0.7510). \]
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that $f_k = f_{\min}$ for all $k \geq 3$, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

 order k |  w/o LME bound   |  time    |  with LME bound |  time
    2    |  -∞              |  0.3984  |  -0.3360        |  0.9321
    3    |  -∞              |  0.7634  |  0.9413         |  0.5240
    4    |  -6.4896·10^{5}  |  4.5496  |  0.9413         |  1.7192
    5    |  -3.1645·10^{3}  |  24.3665 |  0.9413         |  8.1228

Example 6.9. Consider the optimization problem
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^4} & (x_1 + x_2 + x_3 + x_4 + 1)^2 - 4 (x_1 x_2 + x_2 x_3 + x_3 x_4 + x_4 + x_1) \\
\text{s.t.} & 0 \leq x_1, \ldots, x_4 \leq 1.
\end{array}
\]
The matrix $L(x)$ is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so $f_{\min}$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value $f_{\min} = 0$. For each $t \in [0, 1]$, the point $(t, 0, 0, 1 - t)$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that $4 - \sum_{i=1}^4 x_i^2 = \sum_{i=1}^4 \big( x_i (1 - x_i)^2 + (1 - x_i)(1 + x_i^2) \big) \in \text{IQ}(c_{eq}, c_{in})$.


Table 9. Computational results for Example 6.9

 order k |  w/o LME bound |  time    |  with LME bound |  time
    2    |  -0.0279       |  0.2262  |  -5·10^{-6}     |  1.1835
    3    |  -0.0005       |  0.4691  |  -6·10^{-7}     |  1.6566
    4    |  -0.0001       |  3.1098  |  -2·10^{-7}     |  5.5234
    5    |  -4·10^{-5}    |  16.5092 |  -6·10^{-7}     |  19.7320
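The family of minimizers in Example 6.9 can be verified directly: at $(t, 0, 0, 1-t)$ the coordinates sum to $1$, so the objective equals $2^2 - 4 \cdot 1 = 0$ for every $t$.

```python
def f69(x1, x2, x3, x4):
    return ((x1 + x2 + x3 + x4 + 1)**2
            - 4 * (x1 * x2 + x2 * x3 + x3 * x4 + x4 + x1))

for k in range(11):
    t = k / 10
    # feasible: all coordinates lie in [0, 1]
    assert 0 <= t <= 1
    assert abs(f69(t, 0.0, 0.0, 1.0 - t)) < 1e-12
```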

For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem (with $x_0 = 1$)
\[
\begin{array}{rl}
\min\limits_{x \in \mathbb{R}^n} & \sum_{0 \leq i \leq j \leq k \leq n} c_{ijk}\, x_i x_j x_k \\
\text{s.t.} & 0 \leq x \leq 1,
\end{array}
\]
where each coefficient $c_{ijk}$ is randomly generated (by randn in MATLAB). The matrix $L(x)$ is the same as in Example 4.3. Since the feasible set is compact, we always have $f_c = f_{\min}$. The condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order $k = 2$, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each $n$ in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order $k = 2$, while their consumed time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

     n     |    9     |   10    |   11    |   12     |   13     |   14
 w/o LME   |  1.2569  |  2.5619 |  6.3085 |  15.8722 |  35.1675 |  78.4111
 with LME  |  1.9714  |  3.8288 |  8.2519 |  20.0310 |  37.6373 |  82.4778

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value $f_{\min}$ is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that
\[ \text{IQ}(c_{eq}, c_{in})_{2k} + \text{IQ}(\phi, \psi)_{2k} = \text{Ideal}(c_{eq}, \phi)_{2k} + \text{Qmod}(c_{in}, \psi)_{2k}. \]
If we replace the quadratic module $\text{Qmod}(c_{in}, \psi)$ by the preordering of $(c_{in}, \psi)$ [21, 25], we can get even tighter relaxations. For convenience, write $(c_{in}, \psi)$ as a single tuple $(g_1, \ldots, g_\ell)$. Its preordering is the set
\[ \text{Preord}(c_{in}, \psi) := \sum_{r_1, \ldots, r_\ell \in \{0, 1\}} g_1^{r_1} \cdots g_\ell^{r_\ell} \cdot \Sigma[x]. \]


The truncation $\text{Preord}(c_{in}, \psi)_{2k}$ is defined similarly to $\text{Qmod}(c_{in}, \psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same order $k$, is
\[
(7.1) \qquad
\left\{
\begin{array}{rl}
f'_{pre,k} := \min & \langle f, y \rangle \\
\text{s.t.} & \langle 1, y \rangle = 1, \ L^{(k)}_{c_{eq}}(y) = 0, \ L^{(k)}_{\phi}(y) = 0, \\
& L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \ \ \forall\, r_1, \ldots, r_\ell \in \{0, 1\}, \\
& y \in \mathbb{R}^{\mathbb{N}^n_{2k}}.
\end{array}
\right.
\]
Similar to (3.9), the dual optimization problem of the above is
\[
(7.2) \qquad
\left\{
\begin{array}{rl}
f_{pre,k} := \max & \gamma \\
\text{s.t.} & f - \gamma \in \text{Ideal}(c_{eq}, \phi)_{2k} + \text{Preord}(c_{in}, \psi)_{2k}.
\end{array}
\right.
\]
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose $K_c \neq \emptyset$ and Assumption 3.1 holds. Then
\[ f_{pre,k} = f'_{pre,k} = f_c \]
for all $k$ sufficiently large. Therefore, if the minimum value $f_{\min}$ of (1.1) is achieved at a critical point, then $f_{pre,k} = f'_{pre,k} = f_{\min}$ for all $k$ big enough.

Proof. The proof is very similar to Case III of the proof of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have $f(x) \equiv 0$ on the set
\[ K_3 = \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ c_{in}(x) \geq 0, \ \psi(x) \geq 0 \}. \]
By the Positivstellensatz, there exist an integer $\ell > 0$ and $q \in \text{Preord}(c_{in}, \psi)$ such that $f^{2\ell} + q \in \text{Ideal}(c_{eq}, \phi)$. The rest of the proof is the same. □
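The price of the tighter relaxation (7.1) is combinatorial: it imposes one localizing PSD condition for each product $g_1^{r_1} \cdots g_\ell^{r_\ell}$, i.e., $2^\ell$ conditions, instead of roughly $\ell + 1$ blocks for the quadratic module. A small sketch (names are ours) enumerating these products:

```python
from itertools import product

def preordering_products(g_names):
    # one product g_1^{r_1} ... g_l^{r_l} per exponent vector r in {0,1}^l;
    # the empty product (r = 0) is the constant 1 (the plain SOS block)
    prods = []
    for r in product([0, 1], repeat=len(g_names)):
        term = [g for g, ri in zip(g_names, r) if ri == 1]
        prods.append(term or ["1"])
    return prods

prods = preordering_products(["g1", "g2", "g3"])
assert len(prods) == 2 ** 3                       # 8 localizing conditions
assert ["1"] in prods and ["g1", "g2", "g3"] in prods
```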

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple $c$ of constraining polynomials is nonsingular, then there exists a matrix polynomial $L(x)$ such that $L(x) C(x) = I_m$. Hence, the Lagrange multiplier $\lambda$ can be expressed as in (5.3). However, if $c$ is not nonsingular, then such an $L(x)$ does not exist. For such cases, how can we express $\lambda$ in terms of $x$ for critical pairs $(x, \lambda)$? To the best of the author's knowledge, this question is mostly open.

7.3. Degree bound for L(x). For a nonsingular tuple $c$ of constraining polynomials, what is a good degree bound for $L(x)$ in Proposition 5.2? When $c$ is linear, a degree bound is given in Proposition 4.1. However, for nonlinear $c$, an explicit degree bound is not known. Theoretically, we can get a degree bound for $L(x)$: in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used $t$ times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is used for each application, then a degree bound for $L(x)$ can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for $L(x)$.


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector $\lambda$ is determined by a linear equation. Naturally, one can get
\[ \lambda = \big( C(x)^T C(x) \big)^{-1} C(x)^T \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix} \]
when $C(x)$ has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, $\lambda$ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
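For the hypercube of Example 5.3, $C(x)^T C(x)$ is diagonal, so the rational formula can be evaluated coordinatewise. A hypothetical sketch (our naming and test data): at a critical point of a linear objective, the rational representation agrees with the polynomial expression $\lambda_i = -\tfrac{1}{2} x_i\, \partial f / \partial x_i$.

```python
def lambda_rational(x, grad):
    # For c_i = 1 - x_i^2: C(x) stacks -2 diag(x) over diag(1 - x^2), so
    # (C^T C)_ii = 4 x_i^2 + (1 - x_i^2)^2 and (C^T [grad f; 0])_i = -2 x_i (grad f)_i.
    lam = []
    for xi, gi in zip(x, grad):
        denom = 4 * xi**2 + (1 - xi**2) ** 2
        lam.append(-2 * xi * gi / denom)
    return lam

# hypothetical linear objective f(x) = c^T x, critical point x_i = -sign(c_i)
c = [1.5, -2.0, 0.25]
xstar = [-1.0 if ci > 0 else 1.0 for ci in c]
lam = lambda_rational(xstar, c)
assert lam == [-0.5 * xstar[i] * c[i] for i in range(3)]   # matches -x_i/2 * df/dx_i
```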

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput. 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra. Third edition, Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim. 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci. 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc. 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11 (2001), no. 3, pp. 796–817.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics 8 (2008), no. 5, pp. 607–647.


[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim. 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim. 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim. 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J. 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim. 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl. 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw. 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim. 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093

E-mail address: njw@math.ucsd.edu

  • 1 Introduction
    • 11 Optimality conditions
    • 12 Some existing work
    • 13 New contributions
      • 2 Preliminaries
      • 3 The construction of tight relaxations
        • 31 Tightness of the relaxations
        • 32 Detecting tightness and extracting minimizers
          • 4 Polyhedral constraints
          • 5 General constraints
          • 6 Numerical examples
          • 7 Discussions
            • 71 Tight relaxations using preorderings
            • 72 Singular constraining polynomials
            • 73 Degree bound for L(x)
            • 74 Rational representation of Lagrange multipliers
              • References
Page 19: TIGHT RELAXATIONS FOR POLYNOMIAL OPTIMIZATION …The polynomial expressions is de-termined by linear equations. Based on these expressions, new Lasserre type semidefinite relaxations

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 19

Since $W_t(x)$ is a unit upper triangular matrix polynomial, there exists $P_{t+1}(x) \in \mathbb{R}[x]^{t\times(s+t)}$ such that $P_{t+1}(x)W_t(x) = I_t$. Let
$$P(x) = P_{t+1}(x)P_t(x)P_{t-1}(x)\cdots P_1(x),$$
then $P(x)W(x) = I_m$. Note that $P(x) \in \mathbb{C}[x]^{t\times s}$. For $W(x) \in \mathbb{R}[x]^{s\times t}$, we can replace $P(x)$ by $\big(P(x)+\overline{P(x)}\big)/2$ (where $\overline{P(x)}$ denotes the complex conjugate of $P(x)$), which is a real matrix polynomial.

(ii) The conclusion is implied directly by item (i). $\qed$

In Proposition 5.2, there is no explicit degree bound for $L(x)$ satisfying (5.2). This question is mostly open, to the best of the author's knowledge. However, once a degree is chosen for $L(x)$, the matrix $L(x)$ can be determined by comparing coefficients of both sides of (5.2), which amounts to solving a linear system. In the following, we give some examples of $L(x)$ satisfying (5.2).

Example 5.3. Consider the hypercube with quadratic constraints
$$1-x_1^2 \ge 0, \quad 1-x_2^2 \ge 0, \quad \ldots, \quad 1-x_n^2 \ge 0.$$
The equation $L(x)C(x) = I_n$ is satisfied by
$$L(x) = \begin{bmatrix} -\tfrac{1}{2}\mathrm{diag}(x) & I_n \end{bmatrix}.$$
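The identity in this example can be sanity-checked numerically. The sketch below is a hypothetical check: it assumes the layout of $C(x)$ used in Section 5, with the constraint gradients stacked over the complementarity block $\mathrm{diag}(c_1,\ldots,c_n)$, and evaluates $L(x)C(x)$ at random points.

```python
import random

def check_hypercube_identity(n, trials=5, tol=1e-9):
    """Check L(x)C(x) = I_n at random points for the constraints 1 - x_i^2 >= 0,
    with L(x) = [-1/2 diag(x) | I_n]."""
    for _ in range(trials):
        x = [random.uniform(-2, 2) for _ in range(n)]
        c = [1 - xi * xi for xi in x]
        # C(x): n gradient rows (d c_i / d x_j = -2 x_i when i = j),
        # then the complementarity block diag(c_1, ..., c_n)
        C = [[-2 * x[i] if i == j else 0.0 for i in range(n)] for j in range(n)]
        C += [[c[i] if i == j else 0.0 for i in range(n)] for j in range(n)]
        L = [[-0.5 * x[i] if j == i else 0.0 for j in range(n)]
             + [1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
        for i in range(n):
            for j in range(n):
                entry = sum(L[i][k] * C[k][j] for k in range(2 * n))
                assert abs(entry - (1.0 if i == j else 0.0)) < tol
    return True

print(check_hypercube_identity(4))  # → True
```

Since $(-\tfrac{1}{2}x_i)(-2x_i) + (1-x_i^2) = 1$, the product is the identity for every $x$, not just feasible points.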

Example 5.4. Consider the nonnegative portion of the unit sphere:
$$x_1 \ge 0, \quad x_2 \ge 0, \quad \ldots, \quad x_n \ge 0, \quad x_1^2+\cdots+x_n^2-1 = 0.$$
The equation $L(x)C(x) = I_{n+1}$ is satisfied by
$$L(x) = \begin{bmatrix} I_n - xx^T & x\mathbf{1}_n^T & 2x \\[2pt] \tfrac{1}{2}x^T & -\tfrac{1}{2}\mathbf{1}_n^T & -1 \end{bmatrix}.$$
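This non-diagonal $L(x)$ can also be sanity-checked numerically, under the same assumed layout of $C(x)$ (gradient rows, then one complementarity row per inequality constraint, then one for the equality constraint):

```python
import random

def check_sphere_identity(n, trials=5, tol=1e-9):
    """Check L(x)C(x) = I_{n+1} at random points for x_i >= 0 (i = 1..n)
    and x_1^2 + ... + x_n^2 - 1 = 0, with
    L(x) = [ I_n - x x^T | x 1_n^T | 2x ; x^T/2 | -1_n^T/2 | -1 ]."""
    for _ in range(trials):
        x = [random.uniform(-1, 1) for _ in range(n)]
        s = sum(t * t for t in x)
        # columns of C: for c_i = x_i the gradient e_i plus complementarity x_i;
        # for the equality constraint the gradient 2x and a trailing slot s - 1
        C = [[1.0 if i == j else 0.0 for i in range(n)] + [2 * x[j]] for j in range(n)]
        C += [[x[i] if i == j else 0.0 for i in range(n)] + [0.0] for j in range(n)]
        C += [[0.0] * n + [s - 1]]
        L = [[(1.0 if i == j else 0.0) - x[i] * x[j] for j in range(n)]
             + [x[i]] * n + [2 * x[i]] for i in range(n)]
        L += [[t / 2 for t in x] + [-0.5] * n + [-1.0]]
        for i in range(n + 1):
            for j in range(n + 1):
                entry = sum(L[i][k] * C[k][j] for k in range(2 * n + 1))
                assert abs(entry - (1.0 if i == j else 0.0)) < tol
    return True

print(check_sphere_identity(4))  # → True
```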

Example 5.5. Consider the set
$$1-x_1^3-x_2^4 \ge 0, \quad 1-x_3^4-x_4^3 \ge 0.$$
The equation $L(x)C(x) = I_2$ is satisfied by
$$L(x) = \begin{bmatrix} -\tfrac{x_1}{3} & -\tfrac{x_2}{4} & 0 & 0 & 1 & 0 \\[2pt] 0 & 0 & -\tfrac{x_3}{4} & -\tfrac{x_4}{3} & 0 & 1 \end{bmatrix}.$$

Example 5.6. Consider the quadratic set
$$1-x_1x_2-x_2x_3-x_1x_3 \ge 0, \quad 1-x_1^2-x_2^2-x_3^2 \ge 0.$$
The matrix $L(x)^T$ satisfying $L(x)C(x) = I_2$ is
$$\begin{bmatrix}
25x_1^3+10x_1^2x_2+40x_1x_2^2-25x_1-2x_3 & -25x_1^3-10x_1^2x_2-40x_1x_2^2+\tfrac{49}{2}x_1+2x_3 \\[3pt]
-15x_1^2x_2+10x_1x_2^2+20x_1x_2x_3-10x_1 & 15x_1^2x_2-10x_1x_2^2-20x_1x_2x_3+10x_1-\tfrac{x_2}{2} \\[3pt]
25x_1^2x_3-20x_1x_2^2+10x_1x_2x_3+2x_1 & -25x_1^2x_3+20x_1x_2^2-10x_1x_2x_3-2x_1-\tfrac{x_3}{2} \\[3pt]
1-20x_1x_3-10x_1^2-20x_1x_2 & 20x_1x_2+20x_1x_3+10x_1^2 \\[3pt]
-50x_1^2-20x_1x_2 & 50x_1^2+20x_1x_2+1
\end{bmatrix}.$$

We would like to remark that a polynomial tuple $c = (c_1,\ldots,c_m)$ is generically nonsingular, and hence Assumption 3.1 holds generically.

Proposition 5.7. For all positive degrees $d_1,\ldots,d_m$, there exists an open dense subset $U$ of $D = \mathbb{R}[x]_{d_1}\times\cdots\times\mathbb{R}[x]_{d_m}$ such that every tuple $c = (c_1,\ldots,c_m) \in U$ is nonsingular. Indeed, such $U$ can be chosen as a Zariski open subset of $D$, i.e., it is the complement of a proper real variety of $D$. Moreover, Assumption 3.1 holds for all $c \in U$, i.e., it holds generically.


Proof. The proof uses resultants and discriminants, for which we refer to [29]. First, let $J_1$ be the set of all $(i_1,\ldots,i_{n+1})$ with $1 \le i_1 < \cdots < i_{n+1} \le m$. The resultant $\mathrm{Res}(c_{i_1},\ldots,c_{i_{n+1}})$ [29, Section 2] is a polynomial in the coefficients of $c_{i_1},\ldots,c_{i_{n+1}}$ such that if $\mathrm{Res}(c_{i_1},\ldots,c_{i_{n+1}}) \ne 0$, then the equations
$$c_{i_1}(x) = \cdots = c_{i_{n+1}}(x) = 0$$
have no complex solutions. Define
$$F_1(c) = \prod_{(i_1,\ldots,i_{n+1}) \in J_1} \mathrm{Res}(c_{i_1},\ldots,c_{i_{n+1}}).$$
For the case $m \le n$, $J_1 = \emptyset$ and we simply let $F_1(c) = 1$. Clearly, if $F_1(c) \ne 0$, then no more than $n$ of the polynomials $c_1,\ldots,c_m$ have a common complex zero.

Second, let $J_2$ be the set of all $(j_1,\ldots,j_k)$ with $k \le n$ and $1 \le j_1 < \cdots < j_k \le m$. When one of $c_{j_1},\ldots,c_{j_k}$ has degree bigger than one, the discriminant $\Delta(c_{j_1},\ldots,c_{j_k})$ is a polynomial in the coefficients of $c_{j_1},\ldots,c_{j_k}$ such that if $\Delta(c_{j_1},\ldots,c_{j_k}) \ne 0$, then the equations
$$c_{j_1}(x) = \cdots = c_{j_k}(x) = 0$$
have no singular complex solution [29, Section 3], i.e., at every complex common solution $u$, the gradients of $c_{j_1},\ldots,c_{j_k}$ at $u$ are linearly independent. When all of $c_{j_1},\ldots,c_{j_k}$ have degree one, the discriminant of the tuple $(c_{j_1},\ldots,c_{j_k})$ is not a single polynomial, but we can define $\Delta(c_{j_1},\ldots,c_{j_k})$ to be the product of all maximum minors of its Jacobian (a constant matrix). Define
$$F_2(c) = \prod_{(j_1,\ldots,j_k) \in J_2} \Delta(c_{j_1},\ldots,c_{j_k}).$$
Clearly, if $F_2(c) \ne 0$, then no $n$ or fewer polynomials of $c_1,\ldots,c_m$ have a singular complex common zero.

Last, let $F(c) = F_1(c)F_2(c)$ and
$$U = \{c = (c_1,\ldots,c_m) \in D : F(c) \ne 0\}.$$
Note that $U$ is a Zariski open subset of $D$, and it is open dense in $D$. For all $c \in U$, no more than $n$ of $c_1,\ldots,c_m$ can have a complex common zero. For any $k$ polynomials ($k \le n$) of $c_1,\ldots,c_m$, if they have a complex common zero, say $u$, then their gradients at $u$ must be linearly independent. This means that $c$ is a nonsingular tuple.

Since every $c \in U$ is nonsingular, Proposition 5.2 implies (5.2), whence Assumption 3.1 is satisfied. Therefore, Assumption 3.1 holds for all $c \in U$, so it holds generically. $\qed$
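The role of resultants in the proof can be illustrated in the simplest case $n = 1$: the resultant of two univariate quadratics is the determinant of their $4\times 4$ Sylvester matrix, and it vanishes exactly when the two polynomials share a complex root. This is how $F_1(c)$ detects non-generic tuples. A minimal sketch:

```python
def sylvester_resultant_quadratics(p, q):
    """Resultant of two univariate quadratics p = a2 t^2 + a1 t + a0 and
    q = b2 t^2 + b1 t + b0, as the determinant of the 4x4 Sylvester matrix."""
    a2, a1, a0 = p
    b2, b1, b0 = q
    S = [[a2, a1, a0, 0],
         [0, a2, a1, a0],
         [b2, b1, b0, 0],
         [0, b2, b1, b0]]

    def det(M):  # cofactor expansion along the first row
        if len(M) == 1:
            return M[0][0]
        return sum((-1) ** j * M[0][j]
                   * det([row[:j] + row[j + 1:] for row in M[1:]])
                   for j in range(len(M)))

    return det(S)

# p = t^2 - 1 and q = (t - 1)^2 share the root t = 1, so the resultant vanishes;
# perturbing q destroys the common root and the resultant becomes nonzero.
print(sylvester_resultant_quadratics((1, 0, -1), (1, -2, 1)))         # → 0
print(sylvester_resultant_quadratics((1, 0, -1), (1, -2, 1.5)) != 0)  # → True
```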

6. Numerical examples

This section gives examples of using the new relaxations (3.8)-(3.9) for solving the optimization problem (1.1), with the usage of Lagrange multiplier expressions. Some polynomials in the examples are from [37]. The computation is implemented in MATLAB R2012a, on a Lenovo laptop with a 2.90GHz CPU and 16.0G RAM. The relaxations (3.8)-(3.9) are solved by the software GloptiPoly 3 [13], which calls the SDP package SeDuMi [40]. For neatness, only four decimal digits are displayed for computational results.


The polynomials $p_i$ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as $c_1,\ldots,c_m$. First, find a matrix polynomial $L(x)$ satisfying (4.4) or (5.2). Let $L_1(x)$ be the submatrix of $L(x)$ consisting of its first $n$ columns. Then choose $(p_1,\ldots,p_m)$ to be the product $L_1(x)\nabla f(x)$, i.e.,
$$p_i = \big(L_1(x)\nabla f(x)\big)_i.$$
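For instance, with the hypercube constraints of Example 5.3 one has $L_1(x) = -\tfrac{1}{2}\mathrm{diag}(x)$, so $p_i = -\tfrac{1}{2}x_i\,\partial f/\partial x_i$. A minimal sketch, using a hypothetical objective for illustration:

```python
def p_tuple(x, grad_f):
    """p_i = (L1(x) grad f(x))_i with L1(x) = -1/2 diag(x),
    the first n columns of L(x) in Example 5.3."""
    g = grad_f(x)
    return [-0.5 * xi * gi for xi, gi in zip(x, g)]

# hypothetical objective f = x1^2 + 2 x2, with gradient (2 x1, 2)
grad_f = lambda x: [2 * x[0], 2.0]
print(p_tuple([1.0, 0.5], grad_f))  # → [-1.0, -0.5]
```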

In all our examples, the global minimum value $f_{min}$ of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., the sub-level set $\{f(x) \le \ell\}$ is compact for every $\ell$), and the constraint qualification condition holds.

By Theorem 3.3, we have $f_k = f_{min}$ for all $k$ big enough if $f_c = f_{min}$ and any one of the conditions i)-iii) there holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables. The computational time (in seconds) is also compared. The results for the standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".

Example 6.1. Consider the optimization problem
$$\begin{array}{rl} \min & x_1x_2(10-x_3) \\ \mathrm{s.t.} & x_1 \ge 0,\ x_2 \ge 0,\ x_3 \ge 0,\ 1-x_1-x_2-x_3 \ge 0. \end{array}$$
The matrix polynomial $L(x)$ is given in Example 4.2. Since the feasible set is compact, the minimum $f_{min} = 0$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point $(x_1,x_2,x_3)$ with $x_1x_2 = 0$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

                 w/o LME                  with LME
  order k   lower bound    time      lower bound    time
     2        -0.0521     0.6841      -0.0521      0.1922
     3        -0.0026     0.2657     -3·10^{-8}    0.2285
     4        -0.0007     0.6785     -6·10^{-9}    0.4431
     5        -0.0004     1.6105     -2·10^{-9}    0.9567

² Note that $1 - x^Tx = (1 - e^Tx)(1 + x^Tx) + \sum_{i=1}^n x_i(1-x_i)^2 + \sum_{i\ne j} x_i^2x_j \in \mathrm{IQ}(c_{eq}, c_{in})$.
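The membership certificate in footnote 2 is a polynomial identity, so it can be sanity-checked by evaluating both sides at random points:

```python
import random

def footnote_identity_gap(x):
    """Difference between the two sides of
    1 - x^T x = (1 - e^T x)(1 + x^T x) + sum_i x_i (1 - x_i)^2
                + sum_{i != j} x_i^2 x_j."""
    n = len(x)
    lhs = 1 - sum(t * t for t in x)
    rhs = (1 - sum(x)) * (1 + sum(t * t for t in x)) \
        + sum(t * (1 - t) ** 2 for t in x) \
        + sum(x[i] ** 2 * x[j] for i in range(n) for j in range(n) if i != j)
    return lhs - rhs

gaps = [abs(footnote_identity_gap([random.uniform(-2, 2) for _ in range(3)]))
        for _ in range(100)]
print(max(gaps) < 1e-9)  # → True
```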


Example 6.2. Consider the optimization problem
$$\begin{array}{rl} \min & x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 - 3x_1^2x_2^2x_3^2 + (x_1^4+x_2^4+x_3^4) \\ \mathrm{s.t.} & x_1^2+x_2^2+x_3^2 \ge 1. \end{array}$$
The matrix polynomial is $L(x) = \begin{bmatrix} \tfrac{1}{2}x_1 & \tfrac{1}{2}x_2 & \tfrac{1}{2}x_3 & -1 \end{bmatrix}$. The objective $f$ is the sum of the positive definite form $x_1^4+x_2^4+x_3^4$ and the Motzkin polynomial
$$M(x) = x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 - 3x_1^2x_2^2x_3^2.$$
Note that $M(x)$ is nonnegative everywhere but not SOS [37]. Clearly, $f$ is coercive and $f_{min}$ is achieved at a critical point. The set $\mathrm{IQ}(\phi,\psi)$ is archimedean, because
$$c_1(x)p_1(x) = (x_1^2+x_2^2+x_3^2-1)\big(3M(x)+2(x_1^4+x_2^4+x_3^4)\big) = 0$$
defines a compact set. So the condition i) of Theorem 3.3 is satisfied.³ The minimum value $f_{min} = \tfrac{1}{3}$, and there are 8 minimizers $(\pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}}, \pm\tfrac{1}{\sqrt{3}})$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

                 w/o LME                  with LME
  order k   lower bound      time     lower bound    time
     3          -∞          0.4466      0.1111      0.1169
     4          -∞          0.4948      0.3333      0.3499
     5     -2.1821·10^5     1.1836      0.3333      0.6530
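The claims about the Motzkin polynomial can be checked numerically. Sampling is only a sanity check of nonnegativity, not a proof; the value $f_{min} = \tfrac{1}{3}$ at a claimed minimizer can be evaluated directly:

```python
import math, random

def motzkin(x1, x2, x3):
    return x1**4 * x2**2 + x1**2 * x2**4 + x3**6 - 3 * x1**2 * x2**2 * x3**2

# sampling sanity check of nonnegativity (not a proof)
for _ in range(1000):
    x = [random.uniform(-2, 2) for _ in range(3)]
    assert motzkin(*x) >= -1e-9

# value of f = M + (x1^4 + x2^4 + x3^4) at a claimed minimizer
t = 1 / math.sqrt(3)
f = motzkin(t, t, t) + 3 * t**4
print(round(f, 4))  # → 0.3333
```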

Example 6.3. Consider the optimization problem
$$\begin{array}{rl} \min & x_1x_2 + x_2x_3 + x_3x_4 - 3x_1x_2x_3x_4 + (x_1^3+\cdots+x_4^3) \\ \mathrm{s.t.} & x_1, x_2, x_3, x_4 \ge 0,\ 1-x_1-x_2 \ge 0,\ 1-x_3-x_4 \ge 0. \end{array}$$
The matrix polynomial $L(x)$ is
$$\begin{bmatrix}
1-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
-x_1 & 1-x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1-x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & -x_3 & 1-x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}.$$
The feasible set is compact, so $f_{min}$ is achieved at a critical point. One can show that $f_{min} = 0$ and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because $\mathrm{IQ}(c_{eq}, c_{in})$ is archimedean.⁴ The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because $-c_1^2p_1^2 \in \mathrm{Ideal}(\phi) \subseteq \mathrm{IQ}(\phi,\psi)$ and the set $\{-c_1(x)^2p_1(x)^2 \ge 0\}$ is compact.

⁴ This is because $1-x_1^2-x_2^2$ belongs to the quadratic module of $(x_1, x_2, 1-x_1-x_2)$ and $1-x_3^2-x_4^2$ belongs to the quadratic module of $(x_3, x_4, 1-x_3-x_4)$. See the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

                 w/o LME                   with LME
  order k   lower bound      time      lower bound    time
     3     -2.9·10^{-5}     0.7335     -6·10^{-7}    0.6091
     4     -1.4·10^{-5}     2.5055     -8·10^{-8}    2.7423
     5     -1.4·10^{-5}    12.7092     -5·10^{-8}   13.7449

Example 6.4. Consider the polynomial optimization problem
$$\begin{array}{rl} \min\limits_{x\in\mathbb{R}^2} & x_1^2 + 50x_2^2 \\ \mathrm{s.t.} & x_1^2-\tfrac{1}{2} \ge 0,\ \ x_2^2-2x_1x_2-\tfrac{1}{8} \ge 0,\ \ x_2^2+2x_1x_2-\tfrac{1}{8} \ge 0. \end{array}$$
It is motivated by an example in [12, §3]. The first column of $L(x)$ is
$$\begin{bmatrix} \tfrac{8x_1^3}{5}+\tfrac{x_1}{5} \\[4pt] \tfrac{288x_2x_1^4}{5}-\tfrac{16x_1^3}{5}-\tfrac{124x_2x_1^2}{5}+\tfrac{8x_1}{5}-2x_2 \\[4pt] -\tfrac{288x_2x_1^4}{5}-\tfrac{16x_1^3}{5}+\tfrac{124x_2x_1^2}{5}+\tfrac{8x_1}{5}+2x_2 \end{bmatrix},$$
and the second column of $L(x)$ is
$$\begin{bmatrix} -\tfrac{8x_1^2x_2}{5}+\tfrac{4x_2^3}{5}-\tfrac{x_2}{10} \\[4pt] \tfrac{288x_1^3x_2^2}{5}+\tfrac{16x_1^2x_2}{5}-\tfrac{142x_1x_2^2}{5}-\tfrac{9x_1}{20}-\tfrac{8x_2^3}{5}+\tfrac{11x_2}{5} \\[4pt] -\tfrac{288x_1^3x_2^2}{5}+\tfrac{16x_1^2x_2}{5}+\tfrac{142x_1x_2^2}{5}+\tfrac{9x_1}{20}-\tfrac{8x_2^3}{5}+\tfrac{11x_2}{5} \end{bmatrix}.$$
The objective is coercive, so $f_{min}$ is achieved at a critical point. The minimum value
$$f_{min} = 56+\tfrac{3}{4}+25\sqrt{5} \approx 112.6517,$$
and the minimizers are $\big(\pm\sqrt{\tfrac{1}{2}},\ \pm(\sqrt{\tfrac{5}{8}}+\sqrt{\tfrac{1}{2}})\big)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

                 w/o LME                with LME
  order k   lower bound    time    lower bound    time
     3        6.7535      0.4611     56.7500     0.1309
     4        6.9294      0.2428    112.6517     0.2405
     5        8.8519      0.3376    112.6517     0.2167
     6       16.5971      0.4703    112.6517     0.3788
     7       35.4756      0.6536    112.6517     0.4537
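The stated exact minimum value and the activity pattern of the constraints at a minimizer can be verified numerically:

```python
import math

# a claimed minimizer of Example 6.4
x1 = math.sqrt(1 / 2)
x2 = math.sqrt(5 / 8) + math.sqrt(1 / 2)

f = x1**2 + 50 * x2**2
exact = 56 + 3 / 4 + 25 * math.sqrt(5)   # f_min = 56 + 3/4 + 25*sqrt(5)
assert abs(f - exact) < 1e-9

# constraint values: the first two are active (= 0), the third is inactive
g1 = x1**2 - 1 / 2
g2 = x2**2 - 2 * x1 * x2 - 1 / 8
g3 = x2**2 + 2 * x1 * x2 - 1 / 8
assert abs(g1) < 1e-12 and abs(g2) < 1e-12 and g3 > 1

print(round(f, 4))  # → 112.6517
```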

Example 6.5. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x\in\mathbb{R}^3} & x_1^3+x_2^3+x_3^3+4x_1x_2x_3 - \big(x_1(x_2^2+x_3^2)+x_2(x_3^2+x_1^2)+x_3(x_1^2+x_2^2)\big) \\ \mathrm{s.t.} & x_1 \ge 0,\ x_1x_2-1 \ge 0,\ x_2x_3-1 \ge 0. \end{array}$$
The matrix polynomial $L(x)$ is
$$\begin{bmatrix} 1-x_1x_2 & 0 & 0 & x_2 & x_2 & 0 \\ x_1 & 0 & 0 & -1 & -1 & 0 \\ -x_1 & x_2 & 0 & 1 & 0 & -1 \end{bmatrix}.$$
The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant $\mathbb{R}^3_+$, so the minimum value is achieved at a critical point. In computation, we got $f_{min} \approx 0.9492$ and a global minimizer $(0.9071, 1.1024, 0.9071)$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

                 w/o LME                  with LME
  order k   lower bound      time     lower bound    time
     2          -∞          0.4129        -∞        0.1900
     3     -7.8184·10^6     0.4641      0.9492      0.3139
     4     -2.0575·10^4     0.6499      0.9492      0.5057

Example 6.6. Consider the optimization problem ($x_0 = 1$)
$$\begin{array}{rl} \min\limits_{x\in\mathbb{R}^4} & x^Tx + \sum_{i=0}^{4}\prod_{j\ne i}(x_i-x_j) \\ \mathrm{s.t.} & x_1^2-1 \ge 0,\ x_2^2-1 \ge 0,\ x_3^2-1 \ge 0,\ x_4^2-1 \ge 0. \end{array}$$
The matrix polynomial is $L(x) = \begin{bmatrix} \tfrac{1}{2}\mathrm{diag}(x) & -I_4 \end{bmatrix}$. The first part of the objective is $x^Tx$, while the second part is a nonnegative polynomial [37]. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} = 4.0000$ and 11 global minimizers:
$$(1,1,1,1),\ (1,-1,-1,1),\ (1,-1,1,-1),\ (1,1,-1,-1),$$
$$(1,-1,-1,-1),\ (-1,-1,1,1),\ (-1,1,-1,1),\ (-1,1,1,-1),$$
$$(-1,-1,-1,1),\ (-1,-1,1,-1),\ (-1,1,-1,-1).$$
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

                 w/o LME                  with LME
  order k   lower bound      time     lower bound    time
     3          -∞          1.1377      3.5480      1.1765
     4     -6.6913·10^4     4.7677      4.0000      3.0761
     5       -21.3778      22.9970      4.0000     10.3354
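The count of 11 minimizers can be confirmed by enumerating the 16 vertices $(\pm 1,\ldots,\pm 1)$; this is only a vertex check, not a proof that the global minimum over $\mathbb{R}^4$ is 4:

```python
import math
from itertools import product

def f(x):
    """Objective of Example 6.6: x^T x + sum_{i=0}^4 prod_{j != i}(x_i - x_j),
    with x_0 = 1."""
    y = (1.0,) + tuple(x)
    quartic = sum(math.prod(y[i] - y[j] for j in range(5) if j != i)
                  for i in range(5))
    return sum(t * t for t in x) + quartic

vals = {v: f(v) for v in product((1.0, -1.0), repeat=4)}
mins = [v for v, val in vals.items() if abs(val - 4.0) < 1e-12]
print(len(mins), min(vals.values()))  # → 11 4.0
```

The 5 remaining vertices, e.g. $(-1,-1,-1,-1)$, have objective value 20, since exactly one product $\prod_{j\ne i}(x_i-x_j) = (\pm 2)^4 = 16$ survives there.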

Example 6.7. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x\in\mathbb{R}^3} & x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 - 3x_1^2x_2^2x_3^2 + x_2^2 \\ \mathrm{s.t.} & x_1-x_2x_3 \ge 0,\ -x_2+x_3^2 \ge 0. \end{array}$$
The matrix polynomial is
$$L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}.$$
By the arithmetic-geometric mean inequality, one can show that $f_{min} = 0$. The global minimizers are $(x_1, 0, x_3)$ with $x_1 \ge 0$ and $x_1x_3 = 0$. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that $f_k = f_{min}$ for all $k \ge 5$, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

                 w/o LME                   with LME
  order k   lower bound      time      lower bound     time
     3          -∞          0.6144         -∞         0.3418
     4     -1.0909·10^7     1.0542      -3.9476       0.7180
     5      -942.6772       1.6771     -3·10^{-9}     1.4607
     6       -0.0110        3.3532     -8·10^{-10}    3.1618

Example 6.8. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x\in\mathbb{R}^4} & x_1^2(x_1-x_4)^2 + x_2^2(x_2-x_4)^2 + x_3^2(x_3-x_4)^2 \\ & \ \ +\, 2x_1x_2x_3(x_1+x_2+x_3-2x_4) + (x_1-1)^2 + (x_2-1)^2 + (x_3-1)^2 \\ \mathrm{s.t.} & x_1-x_2 \ge 0,\ x_2-x_3 \ge 0. \end{array}$$
The matrix polynomial is
$$L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} \approx 0.9413$ and a minimizer
$$(0.5632,\ 0.5632,\ 0.5632,\ 0.7510).$$
The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

                 w/o LME                  with LME
  order k   lower bound      time     lower bound    time
     2          -∞          0.3984     -0.3360      0.9321
     3          -∞          0.7634      0.9413      0.5240
     4     -6.4896·10^5     4.5496      0.9413      1.7192
     5     -3.1645·10^3    24.3665      0.9413      8.1228

Example 6.9. Consider the optimization problem
$$\begin{array}{rl} \min\limits_{x\in\mathbb{R}^4} & (x_1+x_2+x_3+x_4+1)^2 - 4(x_1x_2+x_2x_3+x_3x_4+x_4+x_1) \\ \mathrm{s.t.} & 0 \le x_1, \ldots, x_4 \le 1. \end{array}$$
The matrix $L(x)$ is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so $f_{min}$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value $f_{min} = 0$. For each $t \in [0,1]$, the point $(t, 0, 0, 1-t)$ is a global minimizer. The computational results for the standard Lasserre relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that $4 - \sum_{i=1}^4 x_i^2 = \sum_{i=1}^4 \big(x_i(1-x_i)^2 + (1-x_i)(1+x_i^2)\big) \in \mathrm{IQ}(c_{eq}, c_{in})$.


Table 9. Computational results for Example 6.9

                 w/o LME                  with LME
  order k   lower bound     time      lower bound    time
     2       -0.0279       0.2262     -5·10^{-6}    1.1835
     3       -0.0005       0.4691     -6·10^{-7}    1.6566
     4       -0.0001       3.1098     -2·10^{-7}    5.5234
     5      -4·10^{-5}    16.5092     -6·10^{-7}   19.7320

For some polynomial optimization problems, the standard Lasserre relaxations might converge fast, e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem ($x_0 = 1$)
$$\begin{array}{rl} \min\limits_{x\in\mathbb{R}^n} & \sum_{0\le i\le j\le k\le n} c_{ijk}\,x_ix_jx_k \\ \mathrm{s.t.} & 0 \le x \le 1, \end{array}$$
where each coefficient $c_{ijk}$ is randomly generated (by randn in MATLAB). The matrix $L(x)$ is the same as in Example 4.3. Since the feasible set is compact, we always have $f_c = f_{min}$. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, the standard Lasserre relaxations are often tight for the order $k = 2$, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by the standard Lasserre relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each $n$ in the table, we generate 10 random instances, and we show the average of the consumed time. For all the instances, the standard Lasserre relaxations and the new ones (3.8)-(3.9) are tight for the order $k = 2$, while their consumed time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

      n          9        10        11        12        13        14
  w/o LME     1.2569    2.5619    6.3085   15.8722   35.1675   78.4111
  with LME    1.9714    3.8288    8.2519   20.0310   37.6373   82.4778

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value $f_{min}$ is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that
$$\mathrm{IQ}(c_{eq}, c_{in})_{2k} + \mathrm{IQ}(\phi,\psi)_{2k} = \mathrm{Ideal}(c_{eq},\phi)_{2k} + \mathrm{Qmod}(c_{in},\psi)_{2k}.$$
If we replace the quadratic module $\mathrm{Qmod}(c_{in},\psi)$ by the preordering of $(c_{in},\psi)$ [21, 25], we can get further tighter relaxations. For convenience, write $(c_{in},\psi)$ as a single tuple $(g_1,\ldots,g_\ell)$. Its preordering is the set
$$\mathrm{Preord}(c_{in},\psi) := \sum_{r_1,\ldots,r_\ell \in \{0,1\}} g_1^{r_1}\cdots g_\ell^{r_\ell}\,\Sigma[x].$$
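The preordering attaches one SOS multiplier to each of the $2^\ell$ products $g_1^{r_1}\cdots g_\ell^{r_\ell}$, whereas the quadratic module uses only the $\ell+1$ generators $1, g_1, \ldots, g_\ell$; this is the source of the extra tightness and of the extra cost. A small sketch counting these products, using a hypothetical tuple of box constraints:

```python
import math
from itertools import product

def preordering_products(g):
    """All 2^l products g_1^{r_1} ... g_l^{r_l}, r in {0,1}^l, each of which
    multiplies its own SOS term in the preordering; the quadratic module
    uses only 1, g_1, ..., g_l."""
    terms = []
    for r in product((0, 1), repeat=len(g)):
        terms.append(lambda x, r=r: math.prod(gi(x) ** ri
                                              for gi, ri in zip(g, r)))
    return terms

# hypothetical tuple (g1, g2) of box constraints
g = [lambda x: 1 - x[0] ** 2, lambda x: 1 - x[1] ** 2]
terms = preordering_products(g)
print(len(terms), terms[-1]((0.5, 0.5)))  # → 4 0.5625
```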


The truncation $\mathrm{Preord}(c_{in},\psi)_{2k}$ is defined similarly to $\mathrm{Qmod}(c_{in},\psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same order $k$, is
$$(7.1)\qquad \begin{array}{rl} f'_{pre,k} := \min & \langle f, y\rangle \\ \mathrm{s.t.} & \langle 1, y\rangle = 1,\ \ L^{(k)}_{c_{eq}}(y) = 0,\ \ L^{(k)}_{\phi}(y) = 0, \\ & L^{(k)}_{g_1^{r_1}\cdots g_\ell^{r_\ell}}(y) \succeq 0 \ \ \forall\, r_1,\ldots,r_\ell \in \{0,1\}, \\ & y \in \mathbb{R}^{\mathbb{N}^n_{2k}}. \end{array}$$
Similar to (3.9), the dual optimization problem of the above is
$$(7.2)\qquad \begin{array}{rl} f_{pre,k} := \max & \gamma \\ \mathrm{s.t.} & f - \gamma \in \mathrm{Ideal}(c_{eq},\phi)_{2k} + \mathrm{Preord}(c_{in},\psi)_{2k}. \end{array}$$
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose $K_c \ne \emptyset$ and Assumption 3.1 holds. Then
$$f_{pre,k} = f'_{pre,k} = f_c$$
for all $k$ sufficiently large. Therefore, if the minimum value $f_{min}$ of (1.1) is achieved at a critical point, then $f_{pre,k} = f'_{pre,k} = f_{min}$ for all $k$ big enough.

Proof. The proof is very similar to Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have $f(x) \equiv 0$ on the set
$$K_3 = \{x \in \mathbb{R}^n \mid c_{eq}(x) = 0,\ \phi(x) = 0,\ c_{in}(x) \ge 0,\ \psi(x) \ge 0\}.$$
By the Positivstellensatz, there exist an integer $\ell > 0$ and $q \in \mathrm{Preord}(c_{in},\psi)$ such that $f^{2\ell} + q \in \mathrm{Ideal}(c_{eq},\phi)$. The rest of the proof is the same. $\qed$

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple $c$ of constraining polynomials is nonsingular, then there exists a matrix polynomial $L(x)$ such that $L(x)C(x) = I_m$. Hence, the Lagrange multiplier $\lambda$ can be expressed as in (5.3). However, if $c$ is singular, then such $L(x)$ does not exist. For such cases, how can we express $\lambda$ in terms of $x$ for the critical pairs $(x,\lambda)$? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for $L(x)$. For a nonsingular tuple $c$ of constraining polynomials, what is a good degree bound for $L(x)$ in Proposition 5.2? When $c$ is linear, a degree bound is given in Proposition 4.1. However, for nonlinear $c$, an explicit degree bound is not known. Theoretically, we can get a degree bound for $L(x)$. In the proof of Proposition 5.2, Hilbert's Nullstellensatz is used $t$ times. There exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is applied for each usage, then a degree bound for $L(x)$ can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for $L(x)$.


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector $\lambda$ is determined by a linear equation. Naturally, one can get
$$\lambda = \big(C(x)^TC(x)\big)^{-1}C(x)^T\begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}$$
when $C(x)$ has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, $\lambda$ might have rational representations other than the above one. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
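As a sketch of the least-squares formula above, consider a hypothetical small instance: $f = x_1+x_2$ over the box $1-x_1^2 \ge 0$, $1-x_2^2 \ge 0$, whose minimizer is $x = (-1,-1)$. At a critical point, the formula reproduces the values of the polynomial expression $\lambda = L(x)\begin{bmatrix}\nabla f \\ 0\end{bmatrix}$ from Example 5.3:

```python
def lagrange_ls(x, grad_f):
    """lambda = (C^T C)^{-1} C^T [grad f; 0] for the constraints
    1 - x1^2 >= 0, 1 - x2^2 >= 0 (a hypothetical small instance);
    gradient rows of C(x) on top, complementarity rows below."""
    C = [[-2 * x[0], 0.0], [0.0, -2 * x[1]],
         [1 - x[0] ** 2, 0.0], [0.0, 1 - x[1] ** 2]]
    b = [grad_f[0], grad_f[1], 0.0, 0.0]
    # normal equations, solved as a 2x2 linear system by Cramer's rule
    A = [[sum(C[k][i] * C[k][j] for k in range(4)) for j in range(2)]
         for i in range(2)]
    r = [sum(C[k][i] * b[k] for k in range(4)) for i in range(2)]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(A[1][1] * r[0] - A[0][1] * r[1]) / det,
            (A[0][0] * r[1] - A[1][0] * r[0]) / det]

x = (-1.0, -1.0)                 # a critical point of f = x1 + x2 on the box
lam_ls = lagrange_ls(x, (1.0, 1.0))
lam_poly = [-x[0] / 2, -x[1] / 2]   # polynomial expression from L(x)
print(lam_ls, lam_poly)  # → [0.5, 0.5] [0.5, 0.5]
```

The rational expression requires inverting $C(x)^TC(x)$, whose determinant is a high-degree polynomial in $x$; the polynomial expression avoids this entirely.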

Acknowledgements. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput. 8 (2016), pp. 83-111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra, third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory 54 (2005), pp. 189-226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra 209 (2007), no. 1, pp. 189-200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization 21 (2011), pp. 824-832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim. 19 (2), pp. 503-523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software 24 (2009), no. 4-5, pp. 761-779.
[14] D. Henrion and J. B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293-310, Lecture Notes in Control and Inform. Sci. 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming 106 (2006), no. 1, pp. 93-109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc. 1 (1988), no. 4, pp. 963-975.
[17] J. B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11 (3), pp. 796-817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics 8 (2008), no. 5, pp. 607-647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim. 21 (2011), pp. 864-885.
[20] J. B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim. 19 (2009), pp. 1995-2014.
[21] J. B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J. B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J. B. Lasserre, K. C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming 109 (2007), pp. 1-26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (eds. M. Putinar and S. Sullivant), Springer, pp. 157-270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587-606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization 40 (2008), no. 4, pp. 697-718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation 47 (2012), no. 2, pp. 167-191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225-255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim. 23 (2013), no. 3, pp. 1634-1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485-510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97-121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247-274.
[35] P. A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83-99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J. 42 (1993), pp. 969-984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251-272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim. 17 (2006), no. 3, pp. 920-942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl. 149, Springer, 2009, pp. 271-324.
[40] J. F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw. 11 & 12 (1999), pp. 625-653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics 97, American Mathematical Society, Providence, RI, 2002.
[42] H. H. Vui and P. T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim. 19 (2008), no. 2, pp. 941-951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA

E-mail address: njw@math.ucsd.edu

  • 1 Introduction
    • 11 Optimality conditions
    • 12 Some existing work
    • 13 New contributions
      • 2 Preliminaries
      • 3 The construction of tight relaxations
        • 31 Tightness of the relaxations
        • 32 Detecting tightness and extracting minimizers
          • 4 Polyhedral constraints
          • 5 General constraints
          • 6 Numerical examples
          • 7 Discussions
            • 71 Tight relaxations using preorderings
            • 72 Singular constraining polynomials
            • 73 Degree bound for L(x)
            • 74 Rational representation of Lagrange multipliers
              • References
Page 20: TIGHT RELAXATIONS FOR POLYNOMIAL OPTIMIZATION …The polynomial expressions is de-termined by linear equations. Based on these expressions, new Lasserre type semidefinite relaxations

20 JIAWANG NIE

Proof The proof needs to use resultants and discriminants which we refer to [29]First let J1 be the set of all (i1 in+1) with 1 le i1 lt middot middot middot lt in+1 le m The

resultant Res(ci1 cin+1) [29 Section 2] is a polynomial in the coefficients of

ci1 cin+1such that if Res(ci1 cin+1

) 6= 0 then the equations

ci1(x) = middot middot middot = cin+1(x) = 0

have no complex solutions Define

F1(c) =prod

(i1in+1)isinJ1

Res(ci1 cin+1)

For the case that m le n J1 = empty and we just simply let F1(c) = 1 Clearly ifF1(c) 6= 0 then no more than n polynomials of c1 cm have a common complexzero

Second let J2 be the set of all (j1 jk) with k le n and 1 le j1 lt middot middot middot lt jk le mWhen one of cj1 cjk has degree bigger than one the discriminant ∆(cj1 cjk)is a polynomial in the coefficients of cj1 cjk such that if ∆(cj1 cjk) 6= 0 thenthe equations

cj1(x) = middot middot middot = cjk(x) = 0

have no singular complex solution [29 Section 3] ie at every complex commonsolution u the gradients of cj1 cjk at u are linearly independent When allcj1 cjk have degree one the discriminant of the tuple (cj1 cjk) is not a singlepolynomial but we can define ∆(cj1 cjk) to be the product of all maximumminors of its Jacobian (a constant matrix) Define

F2(c) =prod

(j1jk)isinJ2

∆(cj1 cjk)

Clearly if F2(c) 6= 0 then then no n or less polynomials of c1 cm have a singularcomplex comon zero

Last let F (c) = F1(c)F2(c) and

U = c = (c1 cm) isin D F (c) 6= 0Note that U is a Zariski open subset of D and it is open dense in D For allc isin D no more than n of c1 cm can have a complex common zero For anyk polynomials (k le n) of c1 cm if they have a complex common zero say uthen their gradients at u must be linearly independent This means that c is anonsingular tuple

Since every c isin U is nonsingular Proposition 52 implies (52) whence Assump-tion 31 is satisifed Therefore Assumption 31 holds for all c isin U So it holdsgenerically

6 Numerical examples

This section gives examples of using the new relaxations (38)-(39) for solving theoptimization problem (11) with usage of Lagrange multiplier expressions Somepolynomials in the examples are from [37] The computation is implemented inMATLAB R2012a on a Lenovo Laptop with CPU290GHz and RAM 160G Therelaxations (38)-(39) are solved by the software GloptiPoly 3 [13] which callsthe SDP package SeDuMi [40] For neatness only four decimal digits are displayedfor computational results

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 21

The polynomials $p_i$ in Assumption 3.1 are constructed as follows. Order the constraining polynomials as $c_1, \ldots, c_m$. First, find a matrix polynomial $L(x)$ satisfying (4.4) or (5.2). Let $L_1(x)$ be the submatrix of $L(x)$ consisting of the first $n$ columns. Then choose $(p_1, \ldots, p_m)$ to be the product $L_1(x)\nabla f(x)$, i.e.,

$$p_i = \big(L_1(x)\nabla f(x)\big)_i.$$

In all our examples, the global minimum value $f_{min}$ of (1.1) is achieved at a critical point. This is the case if the feasible set is compact, or if $f$ is coercive (i.e., the sub-level set $\{f(x) \le \ell\}$ is compact for all $\ell$) and the constraint qualification condition holds.

By Theorem 3.3, we have $f_k = f_{min}$ for all $k$ big enough, if $f_c = f_{min}$ and any one of its conditions i)-iii) holds. Typically, it might be inconvenient to check these conditions. However, in computation, we do not need to check them at all. Indeed, the condition (3.17) is more convenient for usage. When there are finitely many global minimizers, Theorem 3.4 proved that (3.17) is an appropriate criterion for detecting convergence. It is satisfied for all our examples, except Examples 6.1, 6.7 and 6.9 (they have infinitely many minimizers).

We compare the new relaxations (3.8)-(3.9) with the standard Lasserre relaxations in [17]. The lower bounds given by the relaxations in [17] (without using Lagrange multiplier expressions) and the lower bounds given by (3.8)-(3.9) (using Lagrange multiplier expressions) are shown in the tables. The computational time (in seconds) is also compared. The results for standard Lasserre relaxations are titled "w/o LME", and those for the new relaxations (3.8)-(3.9) are titled "with LME".

Example 6.1. Consider the optimization problem

$$\begin{array}{rl} \min & x_1x_2(10 - x_3) \\ \text{s.t.} & x_1 \ge 0, \ x_2 \ge 0, \ x_3 \ge 0, \ 1 - x_1 - x_2 - x_3 \ge 0. \end{array}$$

The matrix polynomial $L(x)$ is given in Example 4.2. Since the feasible set is compact, the minimum $f_{min} = 0$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.² Each feasible point $(x_1, x_2, x_3)$ with $x_1x_2 = 0$ is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 1. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 1. Computational results for Example 6.1

| order k | w/o LME: lower bound | time | with LME: lower bound | time |
|---|---|---|---|---|
| 2 | -0.0521 | 0.6841 | -0.0521 | 0.1922 |
| 3 | -0.0026 | 0.2657 | -3·10^-8 | 0.2285 |
| 4 | -0.0007 | 0.6785 | -6·10^-9 | 0.4431 |
| 5 | -0.0004 | 1.6105 | -2·10^-9 | 0.9567 |

² Note that $1 - x^Tx = (1 - e^Tx)(1 + x^Tx) + \sum_{i=1}^n x_i(1 - x_i)^2 + \sum_{i \ne j} x_i^2 x_j \in IQ(c_{eq}, c_{in})$.
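The membership certificate in footnote 2 is a polynomial identity that can be checked symbolically; a minimal sketch for $n = 3$ (sympy is an assumption of this illustration, not part of the paper's toolchain):

```python
from sympy import symbols, expand

x1, x2, x3 = xs = symbols('x1 x2 x3')

ip = sum(v**2 for v in xs)                      # x^T x
lhs = 1 - ip
# (1 - e^T x)(1 + x^T x) + sum_i x_i (1 - x_i)^2 + sum_{i != j} x_i^2 x_j
rhs = (1 - sum(xs)) * (1 + ip) \
    + sum(v * (1 - v)**2 for v in xs) \
    + sum(xi**2 * xj for xi in xs for xj in xs if xi != xj)

assert expand(lhs - rhs) == 0                   # the identity holds for n = 3
```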

22 JIAWANG NIE

Example 6.2. Consider the optimization problem

$$\begin{array}{rl} \min & x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 - 3x_1^2x_2^2x_3^2 + (x_1^4 + x_2^4 + x_3^4) \\ \text{s.t.} & x_1^2 + x_2^2 + x_3^2 \ge 1. \end{array}$$

The matrix polynomial $L(x) = \begin{bmatrix} \frac{1}{2}x_1 & \frac{1}{2}x_2 & \frac{1}{2}x_3 & -1 \end{bmatrix}$. The objective $f$ is the sum of the positive definite form $x_1^4 + x_2^4 + x_3^4$ and the Motzkin polynomial

$$M(x) = x_1^4x_2^2 + x_1^2x_2^4 + x_3^6 - 3x_1^2x_2^2x_3^2.$$

Note that $M(x)$ is nonnegative everywhere, but not SOS [37]. Clearly, $f$ is coercive and $f_{min}$ is achieved at a critical point. The set $IQ(\phi, \psi)$ is archimedean, because

$$c_1(x)p_1(x) = (x_1^2 + x_2^2 + x_3^2 - 1)\big(3M(x) + 2(x_1^4 + x_2^4 + x_3^4)\big) = 0$$

defines a compact set. So, the condition i) of Theorem 3.3 is satisfied.³ The minimum value $f_{min} = \frac{1}{3}$, and there are 8 minimizers $(\pm\frac{1}{\sqrt{3}}, \pm\frac{1}{\sqrt{3}}, \pm\frac{1}{\sqrt{3}})$. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 2. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 2. Computational results for Example 6.2

| order k | w/o LME: lower bound | time | with LME: lower bound | time |
|---|---|---|---|---|
| 3 | -∞ | 0.4466 | 0.1111 | 0.1169 |
| 4 | -∞ | 0.4948 | 0.3333 | 0.3499 |
| 5 | -2.1821·10^5 | 1.1836 | 0.3333 | 0.6530 |
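The quantities in this example can be verified symbolically. The sketch below (sympy assumed, as an illustration) computes $p_1 = L_1(x)\nabla f(x)$ from the stated $L(x)$, checks that $p_1 = 3M + 2(x_1^4 + x_2^4 + x_3^4)$ as used in the identity above (a consequence of Euler's identity on the homogeneous parts of $f$), and evaluates $f$ at a minimizer:

```python
from sympy import symbols, Rational, expand, sqrt, simplify

x1, x2, x3 = xs = symbols('x1 x2 x3')

M = x1**4*x2**2 + x1**2*x2**4 + x3**6 - 3*x1**2*x2**2*x3**2
Q = x1**4 + x2**4 + x3**4
f = M + Q

# L1(x) = first n columns of L(x), here [x1/2, x2/2, x3/2]
p1 = sum(Rational(1, 2) * v * f.diff(v) for v in xs)
# x . grad(M) = 6M and x . grad(Q) = 4Q (homogeneous of degrees 6 and 4)
assert expand(p1 - (3*M + 2*Q)) == 0

u = {x1: 1/sqrt(3), x2: 1/sqrt(3), x3: 1/sqrt(3)}
assert simplify(f.subs(u) - Rational(1, 3)) == 0   # f_min = 1/3 at a minimizer
```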

Example 6.3. Consider the optimization problem

$$\begin{array}{rl} \min & x_1x_2 + x_2x_3 + x_3x_4 - 3x_1x_2x_3x_4 + (x_1^3 + \cdots + x_4^3) \\ \text{s.t.} & x_1, x_2, x_3, x_4 \ge 0, \ 1 - x_1 - x_2 \ge 0, \ 1 - x_3 - x_4 \ge 0. \end{array}$$

The matrix polynomial $L(x)$ is
$$\begin{bmatrix}
1-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
-x_1 & 1-x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1-x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & -x_3 & 1-x_4 & 0 & 0 & 1 & 1 & 0 & 1 \\
-x_1 & -x_2 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & -x_3 & -x_4 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}.$$

The feasible set is compact, so $f_{min}$ is achieved at a critical point. One can show that $f_{min} = 0$ and the minimizer is the origin. The condition ii) of Theorem 3.3 is satisfied, because $IQ(c_{eq}, c_{in})$ is archimedean.⁴ The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 3.

³ This is because $-c_1^2p_1^2 \in Ideal(\phi) \subseteq IQ(\phi, \psi)$ and the set $\{-c_1(x)^2p_1(x)^2 \ge 0\}$ is compact.

⁴ This is because $1 - x_1^2 - x_2^2$ belongs to the quadratic module of $(x_1, x_2, 1 - x_1 - x_2)$, and $1 - x_3^2 - x_4^2$ belongs to the quadratic module of $(x_3, x_4, 1 - x_3 - x_4)$. See the footnote in Example 6.1.


Table 3. Computational results for Example 6.3

| order k | w/o LME: lower bound | time | with LME: lower bound | time |
|---|---|---|---|---|
| 3 | -2.9·10^-5 | 0.7335 | -6·10^-7 | 0.6091 |
| 4 | -1.4·10^-5 | 2.5055 | -8·10^-8 | 2.7423 |
| 5 | -1.4·10^-5 | 12.7092 | -5·10^-8 | 13.7449 |

Example 6.4. Consider the polynomial optimization problem

$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^2} & x_1^2 + 50x_2^2 \\ \text{s.t.} & x_1^2 - \frac{1}{2} \ge 0, \ x_2^2 - 2x_1x_2 - \frac{1}{8} \ge 0, \ x_2^2 + 2x_1x_2 - \frac{1}{8} \ge 0. \end{array}$$

It is motivated from an example in [12, §3]. The first column of $L(x)$ is

$$\begin{bmatrix}
\frac{8x_1^3}{5} + \frac{x_1}{5} \\[4pt]
\frac{288x_2x_1^4}{5} - \frac{16x_1^3}{5} - \frac{124x_2x_1^2}{5} + \frac{8x_1}{5} - 2x_2 \\[4pt]
-\frac{288x_2x_1^4}{5} - \frac{16x_1^3}{5} + \frac{124x_2x_1^2}{5} + \frac{8x_1}{5} + 2x_2
\end{bmatrix}$$

and the second column of $L(x)$ is

$$\begin{bmatrix}
-\frac{8x_1^2x_2}{5} + \frac{4x_2^3}{5} - \frac{x_2}{10} \\[4pt]
\frac{288x_1^3x_2^2}{5} + \frac{16x_1^2x_2}{5} - \frac{142x_1x_2^2}{5} - \frac{9x_1}{20} - \frac{8x_2^3}{5} + \frac{11x_2}{5} \\[4pt]
-\frac{288x_1^3x_2^2}{5} + \frac{16x_1^2x_2}{5} + \frac{142x_1x_2^2}{5} + \frac{9x_1}{20} - \frac{8x_2^3}{5} + \frac{11x_2}{5}
\end{bmatrix}.$$

The objective is coercive, so $f_{min}$ is achieved at a critical point. The minimum value $f_{min} = 56 + \frac{3}{4} + 25\sqrt{5} \approx 112.6517$, and the minimizers are $(\pm\sqrt{\frac{1}{2}}, \ \pm(\sqrt{\frac{5}{8}} + \sqrt{\frac{1}{2}}))$. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

| order k | w/o LME: lower bound | time | with LME: lower bound | time |
|---|---|---|---|---|
| 3 | 6.7535 | 0.4611 | 56.7500 | 0.1309 |
| 4 | 6.9294 | 0.2428 | 112.6517 | 0.2405 |
| 5 | 8.8519 | 0.3376 | 112.6517 | 0.2167 |
| 6 | 16.5971 | 0.4703 | 112.6517 | 0.3788 |
| 7 | 35.4756 | 0.6536 | 112.6517 | 0.4537 |
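The stated minimum value and minimizers can be sanity-checked numerically; the sketch below (numpy assumed, as an illustration) evaluates the objective and constraints at one of the reported minimizers:

```python
import numpy as np

f = lambda x1, x2: x1**2 + 50 * x2**2
cons = lambda x1, x2: (x1**2 - 0.5,
                       x2**2 - 2*x1*x2 - 0.125,
                       x2**2 + 2*x1*x2 - 0.125)

# a reported minimizer: (sqrt(1/2), sqrt(5/8) + sqrt(1/2))
x1 = np.sqrt(0.5)
x2 = np.sqrt(5 / 8) + np.sqrt(0.5)
fmin = 56 + 0.75 + 25 * np.sqrt(5)

assert all(c >= -1e-9 for c in cons(x1, x2))   # feasible; two constraints active
assert abs(f(x1, x2) - fmin) < 1e-9            # objective matches 56 + 3/4 + 25*sqrt(5)
assert round(fmin, 4) == 112.6517              # the four-digit value in Table 4
```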

Example 6.5. Consider the optimization problem

$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^3} & x_1^3 + x_2^3 + x_3^3 + 4x_1x_2x_3 - \big(x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2)\big) \\ \text{s.t.} & x_1 \ge 0, \ x_1x_2 - 1 \ge 0, \ x_2x_3 - 1 \ge 0. \end{array}$$

The matrix polynomial $L(x)$ is
$$\begin{bmatrix}
1-x_1x_2 & 0 & 0 & x_2 & x_2 & 0 \\
x_1 & 0 & 0 & -1 & -1 & 0 \\
-x_1 & x_2 & 0 & 1 & 0 & -1
\end{bmatrix}.$$


The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant $\mathbb{R}^3_+$, so the minimum value is achieved at a critical point. In computation, we got $f_{min} \approx 0.9492$ and a global minimizer $(0.9071, 1.1024, 0.9071)$. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

| order k | w/o LME: lower bound | time | with LME: lower bound | time |
|---|---|---|---|---|
| 2 | -∞ | 0.4129 | -∞ | 0.1900 |
| 3 | -7.8184·10^6 | 0.4641 | 0.9492 | 0.3139 |
| 4 | -2.0575·10^4 | 0.6499 | 0.9492 | 0.5057 |

Example 6.6. Consider the optimization problem ($x_0 = 1$)

$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & x^Tx + \sum\limits_{i=0}^4 \prod\limits_{j \ne i}(x_i - x_j) \\ \text{s.t.} & x_1^2 - 1 \ge 0, \ x_2^2 - 1 \ge 0, \ x_3^2 - 1 \ge 0, \ x_4^2 - 1 \ge 0. \end{array}$$

The matrix polynomial $L(x) = \begin{bmatrix} \frac{1}{2}\mathrm{diag}(x) & -I_4 \end{bmatrix}$. The first part of the objective is $x^Tx$, while the second part is a nonnegative polynomial [37]. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} = 4.0000$ and 11 global minimizers:

$(1,1,1,1)$, $(1,-1,-1,1)$, $(1,-1,1,-1)$, $(1,1,-1,-1)$,
$(1,-1,-1,-1)$, $(-1,-1,1,1)$, $(-1,1,-1,1)$, $(-1,1,1,-1)$,
$(-1,-1,-1,1)$, $(-1,-1,1,-1)$, $(-1,1,-1,-1)$.

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that $f_k = f_{min}$ for all $k \ge 4$, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

| order k | w/o LME: lower bound | time | with LME: lower bound | time |
|---|---|---|---|---|
| 3 | -∞ | 1.1377 | 3.5480 | 1.1765 |
| 4 | -6.6913·10^4 | 4.7677 | 4.0000 | 3.0761 |
| 5 | -21.3778 | 22.9970 | 4.0000 | 10.3354 |
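The count of 11 minimizers can be confirmed by brute force over the 16 sign vectors in $\{\pm 1\}^4$ (plain Python, as an illustration; it uses the convention $x_0 = 1$ from the example):

```python
from itertools import product

def f(x):
    # objective of Example 6.6 with the convention x0 = 1
    y = (1,) + x
    quad = sum(v * v for v in x)                    # x^T x
    s = 0
    for i in range(5):                              # sum_{i=0}^{4} prod_{j != i}(x_i - x_j)
        t = 1
        for j in range(5):
            if j != i:
                t *= y[i] - y[j]
        s += t
    return quad + s

vals = {x: f(x) for x in product((1, -1), repeat=4)}
assert min(vals.values()) == 4                        # f_min = 4.0000
assert sum(1 for v in vals.values() if v == 4) == 11  # exactly 11 minimizers
```

The five excluded sign vectors (all entries $-1$, or exactly one entry $-1$) give the value 20 instead.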

Example 6.7. Consider the optimization problem

$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^3} & x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 - 3x_1^2x_2^2x_3^2 + x_2^2 \\ \text{s.t.} & x_1 - x_2x_3 \ge 0, \ -x_2 + x_3^2 \ge 0. \end{array}$$

The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}$. By the arithmetic-geometric mean inequality, one can show that $f_{min} = 0$. The global minimizers are $(x_1, 0, x_3)$ with $x_1 \ge 0$ and $x_1x_3 = 0$. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that $f_k = f_{min}$ for all $k \ge 5$, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

| order k | w/o LME: lower bound | time | with LME: lower bound | time |
|---|---|---|---|---|
| 3 | -∞ | 0.6144 | -∞ | 0.3418 |
| 4 | -1.0909·10^7 | 1.0542 | -3.9476 | 0.7180 |
| 5 | -942.6772 | 1.6771 | -3·10^-9 | 1.4607 |
| 6 | -0.0110 | 3.3532 | -8·10^-10 | 3.1618 |
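The AM-GM step here gives $x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 \ge 3x_1^2x_2^2x_3^2$ (the three terms have product $(x_1^2x_2^2x_3^2)^3$), so $f \ge x_2^2 \ge 0$ everywhere, with $f = 0$ at the stated minimizers. A quick numerical sanity check (numpy assumed, as an illustration):

```python
import numpy as np

def f(x1, x2, x3):
    return (x1**4*x2**2 + x2**4*x3**2 + x3**4*x1**2
            - 3*x1**2*x2**2*x3**2 + x2**2)

# f >= 0 at random points (by AM-GM it is nonnegative on all of R^3)
rng = np.random.default_rng(0)
pts = rng.uniform(-2, 2, size=(1000, 3))
assert all(f(*p) >= -1e-12 for p in pts)

# the stated minimizers (x1, 0, x3) with x1 >= 0, x1*x3 = 0 give f = 0
assert f(1.5, 0.0, 0.0) == 0.0 and f(0.0, 0.0, 2.0) == 0.0
```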

Example 6.8. Consider the optimization problem

$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & x_1^2(x_1 - x_4)^2 + x_2^2(x_2 - x_4)^2 + x_3^2(x_3 - x_4)^2 \\ & \quad + \, 2x_1x_2x_3(x_1 + x_2 + x_3 - 2x_4) + (x_1 - 1)^2 + (x_2 - 1)^2 + (x_3 - 1)^2 \\ \text{s.t.} & x_1 - x_2 \ge 0, \ x_2 - x_3 \ge 0. \end{array}$$

The matrix polynomial $L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}$. In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so $f_{min}$ is achieved at a critical point. In computation, we got $f_{min} \approx 0.9413$ and a minimizer
$$(0.5632, 0.5632, 0.5632, 0.7510).$$
The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that $f_k = f_{min}$ for all $k \ge 3$, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

| order k | w/o LME: lower bound | time | with LME: lower bound | time |
|---|---|---|---|---|
| 2 | -∞ | 0.3984 | -0.3360 | 0.9321 |
| 3 | -∞ | 0.7634 | 0.9413 | 0.5240 |
| 4 | -6.4896·10^5 | 4.5496 | 0.9413 | 1.7192 |
| 5 | -3.1645·10^3 | 24.3665 | 0.9413 | 8.1228 |

Example 6.9. Consider the optimization problem

$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^4} & (x_1 + x_2 + x_3 + x_4 + 1)^2 - 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1) \\ \text{s.t.} & 0 \le x_1, \ldots, x_4 \le 1. \end{array}$$

The matrix $L(x)$ is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so $f_{min}$ is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.⁵ The minimum value $f_{min} = 0$. For each $t \in [0, 1]$, the point $(t, 0, 0, 1 - t)$ is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.

⁵ Note that $4 - \sum_{i=1}^4 x_i^2 = \sum_{i=1}^4 \big(x_i(1 - x_i)^2 + (1 - x_i)(1 + x_i^2)\big) \in IQ(c_{eq}, c_{in})$.


Table 9. Computational results for Example 6.9

| order k | w/o LME: lower bound | time | with LME: lower bound | time |
|---|---|---|---|---|
| 2 | -0.0279 | 0.2262 | -5·10^-6 | 1.1835 |
| 3 | -0.0005 | 0.4691 | -6·10^-7 | 1.6566 |
| 4 | -0.0001 | 3.1098 | -2·10^-7 | 5.5234 |
| 5 | -4·10^-5 | 16.5092 | -6·10^-7 | 19.7320 |
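That every point $(t, 0, 0, 1-t)$ attains the value 0 is immediate: the squared term is $(t + (1-t) + 1)^2 = 4$, and the subtracted part is $4((1-t) + t) = 4$. A numerical sketch of this segment of minimizers (numpy assumed, as an illustration):

```python
import numpy as np

def f(x1, x2, x3, x4):
    # dehomogenized Horn form of Example 6.9
    return ((x1 + x2 + x3 + x4 + 1)**2
            - 4 * (x1*x2 + x2*x3 + x3*x4 + x4 + x1))

# the whole feasible segment (t, 0, 0, 1 - t), t in [0, 1], gives f = 0
for t in np.linspace(0.0, 1.0, 101):
    assert abs(f(t, 0.0, 0.0, 1.0 - t)) < 1e-12
```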

For some polynomial optimization problems, standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem ($x_0 = 1$)

$$\begin{array}{rl} \min\limits_{x \in \mathbb{R}^n} & \sum\limits_{0 \le i \le j \le k \le n} c_{ijk}\, x_i x_j x_k \\ \text{s.t.} & 0 \le x \le 1, \end{array}$$

where each coefficient $c_{ijk}$ is randomly generated (by randn in MATLAB). The matrix $L(x)$ is the same as in Example 4.3. Since the feasible set is compact, we always have $f_c = f_{min}$. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order $k = 2$, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by standard Lasserre's relaxations and (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each $n$ in the table, we generate 10 random instances, and we show the average of the consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order $k = 2$, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

| n | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|
| w/o LME | 1.2569 | 2.5619 | 6.3085 | 15.8722 | 35.1675 | 78.4111 |
| with LME | 1.9714 | 3.8288 | 8.2519 | 20.0310 | 37.6373 | 82.4778 |

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value $f_{min}$ is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed relaxations (3.8)-(3.9) for solving (3.7). Note that
$$IQ(c_{eq}, c_{in})_{2k} + IQ(\phi, \psi)_{2k} = Ideal(c_{eq}, \phi)_{2k} + Qmod(c_{in}, \psi)_{2k}.$$

If we replace the quadratic module $Qmod(c_{in}, \psi)$ by the preordering of $(c_{in}, \psi)$ [21, 25], we can get further tighter relaxations. For convenience, write $(c_{in}, \psi)$ as a single tuple $(g_1, \ldots, g_\ell)$. Its preordering is the set
$$Preord(c_{in}, \psi) = \sum_{r_1, \ldots, r_\ell \in \{0, 1\}} g_1^{r_1} \cdots g_\ell^{r_\ell} \cdot \Sigma[x].$$


The truncation $Preord(c_{in}, \psi)_{2k}$ is defined similarly to $Qmod(c_{in}, \psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same order $k$, is

(7.1)
$$\begin{array}{rl} f'_{pre,k} := \min & \langle f, y \rangle \\ \text{s.t.} & \langle 1, y \rangle = 1, \ L^{(k)}_{c_{eq}}(y) = 0, \ L^{(k)}_{\phi}(y) = 0, \\ & L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \ \ \forall\, r_1, \ldots, r_\ell \in \{0, 1\}, \\ & y \in \mathbb{R}^{\mathbb{N}^n_{2k}}. \end{array}$$

Similar to (3.9), the dual optimization problem of the above is

(7.2)
$$\begin{array}{rl} f_{pre,k} := \max & \gamma \\ \text{s.t.} & f - \gamma \in Ideal(c_{eq}, \phi)_{2k} + Preord(c_{in}, \psi)_{2k}. \end{array}$$

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose $K_c \ne \emptyset$ and Assumption 3.1 holds. Then
$$f_{pre,k} = f'_{pre,k} = f_c$$
for all $k$ sufficiently large. Therefore, if the minimum value $f_{min}$ of (1.1) is achieved at a critical point, then $f_{pre,k} = f'_{pre,k} = f_{min}$ for all $k$ big enough.

Proof. The proof is very similar to Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have $f(x) \equiv 0$ on the set
$$K_3 = \{x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \ \phi(x) = 0, \ c_{in}(x) \ge 0, \ \psi(x) \ge 0\}.$$
By the Positivstellensatz, there exist an integer $\ell > 0$ and $q \in Preord(c_{in}, \psi)$ such that $f^{2\ell} + q \in Ideal(c_{eq}, \phi)$. The rest of the proof is the same.

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple $c$ of constraining polynomials is nonsingular, then there exists a matrix polynomial $L(x)$ such that $L(x)C(x) = I_m$. Hence, the Lagrange multiplier $\lambda$ can be expressed as in (5.3). However, if $c$ is not nonsingular, then such $L(x)$ does not exist. For such cases, how can we express $\lambda$ in terms of $x$ for critical pairs $(x, \lambda)$? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for $L(x)$. For a nonsingular tuple $c$ of constraining polynomials, what is a good degree bound for $L(x)$ in Proposition 5.2? When $c$ is linear, a degree bound is given in Proposition 4.1. However, for nonlinear $c$, an explicit degree bound is not known. Theoretically, we can get a degree bound for $L(x)$: in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used $t$ times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is applied at each usage, then a degree bound for $L(x)$ can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for $L(x)$.


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector $\lambda$ is determined by a linear equation. Naturally, one can get

$$\lambda = \big(C(x)^TC(x)\big)^{-1}C(x)^T \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}$$

when $C(x)$ has full column rank. This rational representation is expensive for usage, because its denominator is typically a high degree polynomial. However, $\lambda$ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
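For a toy single-constraint instance one can see what this least-squares formula produces. The sketch below (sympy assumed, as an illustration) uses the ball constraint of Example 6.2, where $C(x) = [\nabla c_1;\ c_1]$ is consistent with $L(x) = [\frac{x_1}{2}, \frac{x_2}{2}, \frac{x_3}{2}, -1]$ since $L(x)C(x) = 1$; both the rational formula and the polynomial expression $p_1$ recover the same multiplier at a critical point:

```python
from sympy import symbols, Matrix, Rational

x1, x2, x3 = xs = symbols('x1 x2 x3')
f = x1                          # toy objective, minimized over the unit sphere
c1 = x1**2 + x2**2 + x3**2 - 1  # single constraint, m = 1

grad_f = Matrix([f.diff(v) for v in xs])
C = Matrix([c1.diff(v) for v in xs] + [c1])     # C(x) = [grad c1; c1], 4 x 1
rhs = Matrix.vstack(grad_f, Matrix([0]))

lam_rational = ((C.T * C).inv() * C.T * rhs)[0]             # (C^T C)^{-1} C^T [grad f; 0]
lam_poly = Rational(1, 2) * sum(v * f.diff(v) for v in xs)  # p1 from L(x) of Example 6.2

# at the critical point x = (1, 0, 0), both give the multiplier 1/2
pt = {x1: 1, x2: 0, x3: 0}
assert lam_rational.subs(pt) == Rational(1, 2)
assert lam_poly.subs(pt) == Rational(1, 2)
```

Note that the denominator of `lam_rational` already has degree 4 here, while the polynomial expression has degree 1; this is the cost the subsection refers to.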

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: an introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2008), no. 2, pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (2001), no. 3, pp. 796–817.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093

E-mail address: njw@math.ucsd.edu

  • 1 Introduction
    • 11 Optimality conditions
    • 12 Some existing work
    • 13 New contributions
      • 2 Preliminaries
      • 3 The construction of tight relaxations
        • 31 Tightness of the relaxations
        • 32 Detecting tightness and extracting minimizers
          • 4 Polyhedral constraints
          • 5 General constraints
          • 6 Numerical examples
          • 7 Discussions
            • 71 Tight relaxations using preorderings
            • 72 Singular constraining polynomials
            • 73 Degree bound for L(x)
            • 74 Rational representation of Lagrange multipliers
              • References
Page 21: TIGHT RELAXATIONS FOR POLYNOMIAL OPTIMIZATION …The polynomial expressions is de-termined by linear equations. Based on these expressions, new Lasserre type semidefinite relaxations

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 21

The polynomials pi in Assumption 31 are constructed as follows Order theconstraining polynomials as c1 cm First find a matrix polynomial L(x) satis-fying (44) or (52) Let L1(x) be the submatrix of L(x) consisting of the first ncolumns Then choose (p1 pm) to be the product L1(x)nablaf(x) ie

pi =(

L1(x)nablaf(x))

i

In all our examples the global minimum value fmin of (11) is achieved at a criticalpoint This is the case if the feasible set is compact or if f is coercive (iethe sub-level set f(x) le ℓ is compact for all ℓ) and the constraint qualification conditionholds

By Theorem 33 we have fk = fmin for all k big enough if fc = fmin and anyoneof its conditions i)-iii) holds Typically it might be inconvenient to check theseconditions However in computation we do not need to check them at all Indeedthe condition (317) is more convenient for usage When there are finitely manyglobal minimizers Theorem 34 proved that (317) is an appropriate criteria fordetecting convergence It is satisfied for all our examples except Examples 61 67and 69 (they have infinitely many minimizers)

We compare the new relaxations (38)-(39) with standard Lasserre relaxationsin [17] The lower bounds given by relaxations in [17] (without using Lagrangemultiplier expressions) and the lower bounds given by (38)-(39) (using Lagrangemultiplier expressions) are shown in the tables The computational time (in sec-onds) is also compared The results for standard Lasserre relaxations are titledldquowo LMErdquo and those for the new relaxations (38)-(39) are titled ldquowithLMErdquo

Example 61 Consider the optimization problem

min x1x2(10minus x3)st x1 ge 0 x2 ge 0 x3 ge 0 1minus x1 minus x2 minus x3 ge 0

The matrix polynomial L(x) is given in Example 42 Since the feasible set iscompact the minimum fmin = 0 is achieved at a critical point The condition ii) ofTheorem 33 is satisfied2 Each feasible point (x1 x2 x3) with x1x2 = 0 is a globalminimizer The computational results for standard Lasserrersquos relaxations and thenew ones (38)-(39) are in Table 1 It confirms that fk = fmin for all k ge 3 up tonumerical round-off errors

Table 1 Computational results for Example 61

order kwo LME with LME

lower bound time lower bound time2 minus00521 06841 minus00521 019223 minus00026 02657 minus3 middot 10minus8 022854 minus00007 06785 minus6 middot 10minus9 044315 minus00004 16105 minus2 middot 10minus9 09567

2Note that 1minus xT x = (1 minus eT x)(1 + xTx) +sumn

i=1xi(1 minus xi)

2 +sum

i6=j x2

i xj isin IQ(ceq cin)

22 JIAWANG NIE

Example 62 Consider the optimization problem

min x41x22 + x21x

42 + x63 minus 3x21x

22x

23 + (x41 + x42 + x43)

st x21 + x22 + x23 ge 1

The matrix polynomial L(x) =[12x1

12x2

12x3 minus1

] The objective f is the sum

of the positive definite form x41 + x42 + x43 and the Motzkin polynomial

M(x) = x41x22 + x21x

42 + x63 minus 3x21x

22x

23

Note that M(x) is nonnegative everywhere but not SOS [37] Clearly f is coerciveand fmin is achieved at a critical point The set IQ(φ ψ) is archimedean because

c1(x)p1(x) = (x21 + x22 + x23 minus 1)(3M(x) + 2(x41 + x42 + x43)

)= 0

defines a compact set So the condition i) of Theorem 33 is satisfied3 Theminimum value fmin = 1

3 and there are 8 minimizers (plusmn 1radic3 plusmn 1radic

3 plusmn 1radic

3) The

computational results for standard Lasserrersquos relaxations and the new ones (38)-(39) are in Table 2 It confirms that fk = fmin for all k ge 4 up to numericalround-off errors

Table 2 Computational results for Example 62

order kwo LME with LME

lower bound time lower bound time3 minusinfin 04466 01111 011694 minusinfin 04948 03333 034995 minus21821 middot 105 11836 03333 06530

Example 63 Consider the optimization problem

min x1x2 + x2x3 + x3x4 minus 3x1x2x3x4 + (x31 + middot middot middot+ x34)st x1 x2 x3 x4 ge 0 1minus x1 minus x2 ge 0 1minus x3 minus x4 ge 0

The matrix polynomial L(x) is

1minus x1 minusx2 0 0 1 1 0 0 1 0minusx1 1minus x2 0 0 1 1 0 0 1 0

0 0 1minus x3 minusx4 0 0 1 1 0 10 0 minusx3 1minus x4 0 0 1 1 0 1

minusx1 minusx2 0 0 1 1 0 0 1 00 0 minusx3 minusx4 0 0 1 1 0 1

The feasible set is compact so fmin is achieved at a critical point One can showthat fmin = 0 and the minimizer is the origin The condition ii) of Theorem 33is satisfied because IQ(ceq cin) is archimedean4 The computational results forstandard Lasserrersquos relaxations and the new ones (38)-(39) are in Table 3

3 This is because minusc21p21isin Ideal(φ) sube IQ(φψ) and the set minusc1(x)2p1(x)2 ge 0 is compact

4 This is because 1 minus x21minus x2

2belongs to the quadratic module of (x1 x2 1 minus x1 minus x2) and

1minusx23minusx2

4belongs to the quadratic module of (x3 x4 1minusx3minusx4) See the footnote in Example 61

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 23

Table 3 Computational results for Example 63

order kwo LME with LME

lower bound time lower bound time3 minus29 middot 10minus5 07335 minus6 middot 10minus7 060914 minus14 middot 10minus5 25055 minus8 middot 10minus8 274235 minus14 middot 10minus5 127092 minus5 middot 10minus8 137449

Example 64 Consider the polynomial optimization problem

minxisinR2

x21 + 50x22

st x21 minus 12 ge 0 x22 minus 2x1x2 minus 1

8 ge 0 x22 + 2x1x2 minus 18 ge 0

It is motivated from an example in [12 sect3] The first column of L(x) is

8x13

5 + x1

5288 x2 x1

4

5 minus 16 x13

5 minus x2 x12 1245 + 8 x1

5 minus 2 x2minus 288x2 x1

4

5 minus 16 x13

5 + x2 x12 1245 + 8 x1

5 + 2 x2

and the second column of L(x) is

minus 8x12 x2

5 + 4x23

5 minus x2

10288x1

3 x22

5 + 16x12 x2

5 minus 142x1 x22

5 minus 9 x1

20 minus 8 x23

5 + 11x2

5

minus 288x13 x2

2

5 + 16x12 x2

5 + 142x1 x22

5 + 9 x1

20 minus 8 x23

5 + 11x2

5

The objective is coercive so fmin is achieved at a critical point The minimum valuefmin = 56 + 34 + 25

radic5 asymp 1126517 and the minimizers are (plusmn

radic

12plusmn(radic

58 +radic

12)) The computational results for standard Lasserrersquos relaxations and thenew ones (38)-(39) are in Table 4 It confirms that fk = fmin for all k ge 4 up tonumerical round-off errors

Table 4 Computational results for Example 64

order kwo LME with LME

lower bound time lower bound time3 67535 04611 567500 013094 69294 02428 1126517 024055 88519 03376 1126517 021676 165971 04703 1126517 037887 354756 06536 1126517 04537

Example 65 Consider the optimization problem

minxisinR3

x31 + x32 + x33 + 4x1x2x3 minus(x1(x

22 + x23) + x2(x

23 + x21) + x3(x

21 + x22)

)

st x1 ge 0 x1x2 minus 1 ge 0 x2x3 minus 1 ge 0

The matrix polynomial L(x) is

1minus x1 x2 0 0 x2 x2 0x1 0 0 minus1 minus1 0

minusx1 x2 0 1 0 minus1

24 JIAWANG NIE

The objective is a variation of Robinsonrsquos form [37] It is a positive definite form overthe nonnegative orthant R3

+ so the minimum value is achieved at a critical point Incomputation we got fmin asymp 09492 and a global minimizer (09071 11024 09071)The computational results for standard Lasserrersquos relaxations and the new ones(38)-(39) are in Table 5 It confirms that fk = fmin for all k ge 3 up to numericalround-off errors

Table 5 Computational results for Example 65

order kwo LME with LME

lower bound time lower bound time2 minusinfin 04129 minusinfin 019003 minus78184 middot 106 04641 09492 031394 minus20575 middot 104 06499 09492 05057

Example 66 Consider the optimization problem (x0 = 1)

minxisinR4

xTx+sum4i=0

prod

j 6=i(xi minus xj)

st x21 minus 1 ge 0 x22 minus 1 ge 0 x23 minus 1 ge 0 x24 minus 1 ge 0

The matrix polynomial L(x) =[12diag(x) minusI4

] The first part of the objective

is xTx while the second part is a nonnegative polynomial [37] The objective iscoercive so fmin is achieved at a critical point In computation we got fmin =40000 and 11 global minimizers

(1 1 1 1) (1minus1minus1 1) (1minus1 1minus1) (1 1minus1minus1)

(1minus1minus1minus1) (minus1minus1 1 1) (minus1 1minus1 1) (minus1 1 1minus1)

(minus1minus1minus1 1) (minus1minus1 1minus1) (minus1 1minus1minus1)

The computational results for standard Lasserrersquos relaxations and the new ones(38)-(39) are in Table 6 It confirms that fk = fmin for all k ge 4 up to numericalround-off errors

Table 6. Computational results for Example 6.6.

            w/o LME                      with LME
order k     lower bound      time        lower bound     time
3           −∞               1.1377      3.5480          1.1765
4           −6.6913·10^4     4.7677      4.0000          3.0761
5           −21.3778         22.9970     4.0000          10.3354
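As a quick sanity check (an illustration, not part of the paper), the objective of Example 6.6 evaluates to 4.0000 at each of the 11 listed points; a pure-Python sketch:

```python
# Evaluate the objective of Example 6.6 at the reported minimizers.
# The sum runs over i = 0, ..., 4 with x0 fixed to 1, as in the example.

def f(x):
    y = (1.0,) + tuple(x)  # prepend x0 = 1
    val = sum(t * t for t in x)  # x^T x (over x1, ..., x4 only)
    for i in range(5):
        p = 1.0
        for j in range(5):
            if j != i:
                p *= y[i] - y[j]
        val += p
    return val

minimizers = [
    (1, 1, 1, 1), (1, -1, -1, 1), (1, -1, 1, -1), (1, 1, -1, -1),
    (1, -1, -1, -1), (-1, -1, 1, 1), (-1, 1, -1, 1), (-1, 1, 1, -1),
    (-1, -1, -1, 1), (-1, -1, 1, -1), (-1, 1, -1, -1),
]
assert all(abs(f(v) - 4.0) < 1e-12 for v in minimizers)
```

At every listed point each coordinate value occurs at least twice among (x0, ..., x4), so every product Π_{j≠i}(x_i − x_j) vanishes and only x^T x = 4 remains.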

Example 6.7. Consider the optimization problem

  min_{x ∈ R^3}  x1^4 x2^2 + x2^4 x3^2 + x3^4 x1^2 − 3 x1^2 x2^2 x3^2 + x2^2
  s.t.  x1 − x2x3 ≥ 0,  −x2 + x3^2 ≥ 0.

The matrix polynomial

  L(x) = [  1     0    0    0    0 ]
         [ −x3   −1    0    0    0 ]

By the arithmetic-geometric mean inequality, one can show that fmin = 0. The global minimizers are (x1, 0, x3) with x1 ≥ 0 and x1x3 = 0. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that fk = fmin for all k ≥ 5, up to numerical round-off errors.

Table 7. Computational results for Example 6.7.

            w/o LME                      with LME
order k     lower bound      time        lower bound     time
3           −∞               0.6144      −∞              0.3418
4           −1.0909·10^7     1.0542      −3.9476         0.7180
5           −942.6772        1.6771      −3·10^−9        1.4607
6           −0.0110          3.3532      −8·10^−10       3.1618
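The AM-GM step in Example 6.7 can be spelled out as follows.

```latex
\frac{x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2}{3}
  \;\ge\; \sqrt[3]{x_1^4x_2^2 \cdot x_2^4x_3^2 \cdot x_3^4x_1^2}
  \;=\; x_1^2 x_2^2 x_3^2 .
```

Hence the objective satisfies f(x) ≥ x2^2 ≥ 0. Equality forces x2 = 0, and then f = x1^2 x3^4, which vanishes exactly when x1x3 = 0; with x2 = 0, the constraint x1 − x2x3 ≥ 0 reduces to x1 ≥ 0, matching the described set of global minimizers.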

Example 6.8. Consider the optimization problem

  min_{x ∈ R^4}  x1^2(x1 − x4)^2 + x2^2(x2 − x4)^2 + x3^2(x3 − x4)^2
                 + 2x1x2x3(x1 + x2 + x3 − 2x4) + (x1 − 1)^2 + (x2 − 1)^2 + (x3 − 1)^2
  s.t.  x1 − x2 ≥ 0,  x2 − x3 ≥ 0.

The matrix polynomial

  L(x) = [ 1   0   0   0   0   0 ]
         [ 1   1   0   0   0   0 ]

In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

  (0.5632, 0.5632, 0.5632, 0.7510).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that fk = fmin for all k ≥ 3, up to numerical round-off errors.

Table 8. Computational results for Example 6.8.

            w/o LME                      with LME
order k     lower bound      time        lower bound     time
2           −∞               0.3984      −0.3360         0.9321
3           −∞               0.7634      0.9413          0.5240
4           −6.4896·10^5     4.5496      0.9413          1.7192
5           −3.1645·10^3     24.3665     0.9413          8.1228

Example 6.9. Consider the optimization problem

  min_{x ∈ R^4}  (x1 + x2 + x3 + x4 + 1)^2 − 4(x1x2 + x2x3 + x3x4 + x4 + x1)
  s.t.  0 ≤ x1, ..., x4 ≤ 1.

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.[5] The minimum value fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.

[5] Note that 4 − Σ_{i=1}^{4} x_i^2 = Σ_{i=1}^{4} ( x_i(1 − x_i)^2 + (1 − x_i)(1 + x_i^2) ) ∈ IQ(ceq, cin).
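The membership certificate in the footnote rests on the one-variable identity x(1 − x)^2 + (1 − x)(1 + x^2) = 1 − x^2, which summed over i = 1, ..., 4 gives 4 − Σ x_i^2. A quick numerical sanity check (illustrative only):

```python
def certificate_term(x):
    # x*(1 - x)^2 + (1 - x)*(1 + x^2), which expands to 1 - x^2
    return x * (1.0 - x) ** 2 + (1.0 - x) * (1.0 + x * x)

for x in (-2.0, -0.5, 0.0, 0.3, 1.0, 2.5):
    assert abs(certificate_term(x) - (1.0 - x * x)) < 1e-12
```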


Table 9. Computational results for Example 6.9.

            w/o LME                      with LME
order k     lower bound      time        lower bound     time
2           −0.0279          0.2262      −5·10^−6        1.1835
3           −0.0005          0.4691      −6·10^−7        1.6566
4           −0.0001          3.1098      −2·10^−7        5.5234
5           −4·10^−5         16.5092     −6·10^−7        19.7320

For some polynomial optimization problems, standard Lasserre relaxations might converge fast; e.g., the lowest-order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem (x0 = 1)

  min_{x ∈ R^n}  Σ_{0 ≤ i ≤ j ≤ k ≤ n} c_{ijk} x_i x_j x_k
  s.t.  0 ≤ x ≤ 1,

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have fc = fmin. The condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here we compare the computational time consumed by standard Lasserre's relaxations and (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances and report the average consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their consumed time differs a bit. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10.

n           9        10       11       12        13        14
w/o LME     1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
with LME    1.9714   3.8288   8.2519   20.0310   37.6373   82.4778
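The random objective in Example 6.10 can be sketched as follows; `random.gauss` stands in for MATLAB's `randn`, and the names `random_cubic` and `evaluate` are illustrative, not from the paper:

```python
import random
from itertools import combinations_with_replacement

def random_cubic(n, seed=0):
    # One standard-normal coefficient c_ijk per index triple 0 <= i <= j <= k <= n.
    rng = random.Random(seed)
    return {idx: rng.gauss(0.0, 1.0)
            for idx in combinations_with_replacement(range(n + 1), 3)}

def evaluate(coef, x):
    y = [1.0] + list(x)  # x0 = 1, as in the example
    return sum(c * y[i] * y[j] * y[k] for (i, j, k), c in coef.items())

coef = random_cubic(3)
# Number of monomials = number of size-3 multisets from {0, ..., n}: C(n+3, 3).
assert len(coef) == 20  # n = 3: C(6, 3) = 20
assert evaluate(coef, (0.0, 0.0, 0.0)) == coef[(0, 0, 0)]
```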

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

  IQ(ceq, cin)_{2k} + IQ(φ, ψ)_{2k} = Ideal(ceq, φ)_{2k} + Qmod(cin, ψ)_{2k}.

If we replace the quadratic module Qmod(cin, ψ) by the preordering of (cin, ψ) [21, 25], we can get even tighter relaxations. For convenience, write (cin, ψ) as a single tuple (g1, ..., gℓ). Its preordering is the set

  Preord(cin, ψ) = Σ_{r1,...,rℓ ∈ {0,1}} g1^{r1} ··· gℓ^{rℓ} · Σ[x].


The truncation Preord(cin, ψ)_{2k} is defined similarly to Qmod(cin, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same relaxation order k, is

(7.1)   f'_{pre,k} := min  ⟨f, y⟩
                s.t.  ⟨1, y⟩ = 1,  L^(k)_{ceq}(y) = 0,  L^(k)_{φ}(y) = 0,
                      L^(k)_{g1^{r1} ··· gℓ^{rℓ}}(y) ⪰ 0  for all r1, ..., rℓ ∈ {0, 1},
                      y ∈ R^{N^n_{2k}}.
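With ℓ polynomials in the tuple (cin, ψ), the relaxation (7.1) imposes one localizing PSD condition for each product g1^{r1} ··· gℓ^{rℓ}, i.e., 2^ℓ of them, exponentially more than the ℓ + 1 conditions in (3.8). A small sketch enumerating these products (illustrative names, not from the paper):

```python
from itertools import product

def preordering_products(g_names):
    # All products g1^r1 * ... * gl^rl with each ri in {0, 1};
    # the empty product (r = 0) is the constant 1.
    prods = []
    for r in product((0, 1), repeat=len(g_names)):
        term = [g for g, ri in zip(g_names, r) if ri == 1]
        prods.append("*".join(term) if term else "1")
    return prods

names = preordering_products(["g1", "g2", "g3"])
assert len(names) == 2 ** 3  # exponential in the number of constraints
assert "1" in names and "g1*g2*g3" in names
```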

Similar to (3.9), the dual optimization problem of the above is

(7.2)   f_{pre,k} := max  γ   s.t.  f − γ ∈ Ideal(ceq, φ)_{2k} + Preord(cin, ψ)_{2k}.

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose Kc ≠ ∅ and Assumption 3.1 holds. Then

  f_{pre,k} = f'_{pre,k} = fc

for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then f_{pre,k} = f'_{pre,k} = fmin for all k big enough.

Proof. The proof is very similar to that of Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

  K3 = { x ∈ R^n | ceq(x) = 0, φ(x) = 0, cin(x) ≥ 0, ψ(x) ≥ 0 }.

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(cin, ψ) such that f^{2ℓ} + q ∈ Ideal(ceq, φ). The rest of the proof is the same. ∎

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is not nonsingular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is applied at each usage, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

  λ = ( C(x)^T C(x) )^{-1} C(x)^T [ ∇f(x) ; 0 ]

when C(x) has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
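For intuition (an illustration, not from the paper), the least-squares formula can be compared with the polynomial expression λ = L(x)[∇f(x); 0] on the constraints c_i = x_i^2 − 1 of Example 6.6, taking f = x^T x as an assumed simple objective. There C(x)^T C(x) is diagonal, so the formula decouples coordinate-wise:

```python
def multipliers_ls(x):
    # lambda = (C^T C)^{-1} C^T [grad f; 0], where C(x) stacks the
    # gradients 2*x_i*e_i over diag(x_i^2 - 1), and f = x^T x.
    lam = []
    for xi in x:
        ci = xi * xi - 1.0
        num = 2.0 * xi * (2.0 * xi)        # (C^T [grad f; 0])_i
        den = 4.0 * xi * xi + ci * ci      # (C^T C)_ii, diagonal here
        lam.append(num / den)
    return lam

def multipliers_lme(x):
    # Polynomial expression from L(x) = [ (1/2) diag(x), -I ]:
    # lambda_i = (1/2) x_i * (df/dx_i) = x_i^2 for f = x^T x.
    return [0.5 * xi * (2.0 * xi) for xi in x]

x_crit = [1.0, -1.0, 1.0, 1.0]  # a critical point: every c_i(x) = 0
assert multipliers_ls(x_crit) == [1.0, 1.0, 1.0, 1.0]
assert multipliers_ls(x_crit) == multipliers_lme(x_crit)
```

The two expressions agree at critical points, while the rational one involves the denominator 4x_i^2 + (x_i^2 − 1)^2, illustrating why the polynomial expression is cheaper to use.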

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83-111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189-226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189-200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824-832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC].
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503-523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761-779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293-310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93-109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963-975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796-817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607-647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864-885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995-2014.
[21] J.B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1-26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pp. 157-270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587-606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697-718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167-191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225-255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634-1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485-510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97-121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247-274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83-99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969-984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251-272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920-942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271-324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625-653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941-951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA

E-mail address: njw@math.ucsd.edu


[31] J Nie Polynomial optimization with real varieties SIAM J Optim Vol 23 No 3 pp1634-1646 2013

[32] J Nie Certifying convergence of Lasserrersquos hierarchy via flat truncation Mathematical Pro-gramming Ser A 142 (2013) no 1-2 pp 485ndash510

[33] J Nie Optimality conditions and finite convergence of Lasserrersquos hierarchy MathematicalProgramming Ser A 146 (2014) no 1-2 pp 97ndash121

[34] J Nie Linear optimization with cones of moments and nonnegative polynomials Mathemat-ical Programming Ser B 153 (2015) no 1 pp 247ndash274

[35] P A Parrilo and B Sturmfels Minimizing polynomial functions In S Basu and L Gonzalez-Vega editors Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathe-matics and Computer Science volume 60 of DIMACS Series in Discrete Mathematics andComputer Science pages 83-99 AMS 2003

[36] M Putinar Positive polynomials on compact semi-algebraic sets Ind Univ Math J 42(1993) 969ndash984

[37] B Reznick Some concrete aspects of Hilbertrsquos 17th problem In Contemp Math Vol 253pp 251-272 American Mathematical Society 2000

[38] M Schweighofer Global optimization of polynomials using gradient tentacles and sums ofsquares SIAM J Optim 17 (2006) no 3 pp 920ndash942

[39] C Scheiderer Positivity and sums of squares A guide to recent results Emerging Appli-cations of Algebraic Geometry (M Putinar S Sullivant eds) IMA Volumes Math Appl149 Springer 2009 pp 271ndash324

[40] J F Sturm SeDuMi 102 A MATLAB toolbox for optimization over symmetric conesOptim Methods Softw 11 amp 12 (1999) pp 625ndash653

[41] B Sturmfels Solving systems of polynomial equations CBMS Regional Conference Series inMathematics 97 American Mathematical Society Providence RI 2002

[42] H H Vui and P T Son Global optimization of polynomials using the truncated tangencyvariety and sums of squares SIAM J Optim 19 (2008) no 2 pp 941ndash951

Department of Mathematics University of California at San Diego 9500 Gilman

Drive La Jolla CA USA 92093

E-mail address njwmathucsdedu

  • 1 Introduction
    • 11 Optimality conditions
    • 12 Some existing work
    • 13 New contributions
      • 2 Preliminaries
      • 3 The construction of tight relaxations
        • 31 Tightness of the relaxations
        • 32 Detecting tightness and extracting minimizers
          • 4 Polyhedral constraints
          • 5 General constraints
          • 6 Numerical examples
          • 7 Discussions
            • 71 Tight relaxations using preorderings
            • 72 Singular constraining polynomials
            • 73 Degree bound for L(x)
            • 74 Rational representation of Lagrange multipliers
              • References
Page 23: TIGHT RELAXATIONS FOR POLYNOMIAL OPTIMIZATION …The polynomial expressions is de-termined by linear equations. Based on these expressions, new Lasserre type semidefinite relaxations

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 23

Table 3. Computational results for Example 6.3

                 w/o LME                    with LME
  order k   lower bound     time       lower bound     time
     3      −2.9·10⁻⁵      0.7335      −6·10⁻⁷        0.6091
     4      −1.4·10⁻⁵      2.5055      −8·10⁻⁸        2.7423
     5      −1.4·10⁻⁵     12.7092      −5·10⁻⁸       13.7449

Example 6.4. Consider the polynomial optimization problem
\[
\begin{array}{cl}
\min\limits_{x\in\mathbb{R}^2} & x_1^2 + 50x_2^2 \\
\mathrm{s.t.} & x_1^2 - \tfrac{1}{2} \ge 0, \quad x_2^2 - 2x_1x_2 - \tfrac{1}{8} \ge 0, \quad x_2^2 + 2x_1x_2 - \tfrac{1}{8} \ge 0.
\end{array}
\]
It is motivated by an example in [12, §3]. The first column of L(x) is
\[
\begin{bmatrix}
\tfrac{8}{5}x_1^3 + \tfrac{1}{5}x_1 \\[3pt]
\tfrac{288}{5}x_2x_1^4 - \tfrac{16}{5}x_1^3 - \tfrac{124}{5}x_2x_1^2 + \tfrac{8}{5}x_1 - 2x_2 \\[3pt]
-\tfrac{288}{5}x_2x_1^4 - \tfrac{16}{5}x_1^3 + \tfrac{124}{5}x_2x_1^2 + \tfrac{8}{5}x_1 + 2x_2
\end{bmatrix},
\]
and the second column of L(x) is
\[
\begin{bmatrix}
-\tfrac{8}{5}x_1^2x_2 + \tfrac{4}{5}x_2^3 - \tfrac{1}{10}x_2 \\[3pt]
\tfrac{288}{5}x_1^3x_2^2 + \tfrac{16}{5}x_1^2x_2 - \tfrac{142}{5}x_1x_2^2 - \tfrac{9}{20}x_1 - \tfrac{8}{5}x_2^3 + \tfrac{11}{5}x_2 \\[3pt]
-\tfrac{288}{5}x_1^3x_2^2 + \tfrac{16}{5}x_1^2x_2 + \tfrac{142}{5}x_1x_2^2 + \tfrac{9}{20}x_1 - \tfrac{8}{5}x_2^3 + \tfrac{11}{5}x_2
\end{bmatrix}.
\]
The objective is coercive, so fmin is achieved at a critical point. The minimum value is
\[
f_{\min} = 56 + \tfrac{3}{4} + 25\sqrt{5} \approx 112.6517,
\]
and the minimizers are \((\pm\sqrt{1/2},\ \pm(\sqrt{5/8}+\sqrt{1/2}))\). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 4. It confirms that f_k = fmin for all k ≥ 4, up to numerical round-off errors.

Table 4. Computational results for Example 6.4

                 w/o LME                    with LME
  order k   lower bound     time       lower bound     time
     3         6.7535      0.4611       56.7500       0.1309
     4         6.9294      0.2428      112.6517       0.2405
     5         8.8519      0.3376      112.6517       0.2167
     6        16.5971      0.4703      112.6517       0.3788
     7        35.4756      0.6536      112.6517       0.4537
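The optimum reported in Example 6.4 can be verified in closed form: the point (√(1/2), √(5/8)+√(1/2)) is feasible and attains 56 + 3/4 + 25√5 ≈ 112.6517, the value appearing in Table 4. A small sketch (objective, constraints, and minimizer taken from the example above; nothing else is assumed):

```python
import math

# Example 6.4: minimize x1^2 + 50*x2^2 subject to
#   x1^2 - 1/2 >= 0,  x2^2 - 2*x1*x2 - 1/8 >= 0,  x2^2 + 2*x1*x2 - 1/8 >= 0.
def f(x1, x2):
    return x1**2 + 50 * x2**2

def feasible(x1, x2, tol=1e-9):
    return (x1**2 - 0.5 >= -tol
            and x2**2 - 2 * x1 * x2 - 0.125 >= -tol
            and x2**2 + 2 * x1 * x2 - 0.125 >= -tol)

# Claimed minimizer (one of the four sign choices) and minimum value:
x1 = math.sqrt(1 / 2)
x2 = math.sqrt(5 / 8) + math.sqrt(1 / 2)
fmin = 56 + 3 / 4 + 25 * math.sqrt(5)

assert feasible(x1, x2)
assert abs(f(x1, x2) - fmin) < 1e-9
print(round(fmin, 4))  # 112.6517
```

At this point the second constraint is active (it evaluates to 0), which is consistent with it carrying a nonzero Lagrange multiplier.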

Example 6.5. Consider the optimization problem
\[
\begin{array}{cl}
\min\limits_{x\in\mathbb{R}^3} & x_1^3 + x_2^3 + x_3^3 + 4x_1x_2x_3 - \big( x_1(x_2^2 + x_3^2) + x_2(x_3^2 + x_1^2) + x_3(x_1^2 + x_2^2) \big) \\
\mathrm{s.t.} & x_1 \ge 0, \quad x_1x_2 - 1 \ge 0, \quad x_2x_3 - 1 \ge 0.
\end{array}
\]
The matrix polynomial L(x) is
\[
\begin{bmatrix}
1 - x_1x_2 & 0 & 0 & x_2 & x_2 & 0 \\
x_1 & 0 & 0 & -1 & -1 & 0 \\
-x_1 & x_2 & 0 & 1 & 0 & -1
\end{bmatrix}.
\]

24 JIAWANG NIE

The objective is a variation of Robinson's form [37]. It is a positive definite form over the nonnegative orthant \(\mathbb{R}^3_{+}\), so the minimum value is achieved at a critical point. In computation, we got fmin ≈ 0.9492 and a global minimizer (0.9071, 1.1024, 0.9071). The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 5. It confirms that f_k = fmin for all k ≥ 3, up to numerical round-off errors.

Table 5. Computational results for Example 6.5

                 w/o LME                    with LME
  order k   lower bound     time       lower bound     time
     2         −∞           0.4129       −∞            0.1900
     3      −7.8184·10⁶     0.4641       0.9492        0.3139
     4      −2.0575·10⁴     0.6499       0.9492        0.5057
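Example 6.5's L(x) and reported minimum can be sanity-checked numerically. Assuming C(x) stacks the constraint Jacobian \([\nabla c_1, \nabla c_2, \nabla c_3]\) over \(\mathrm{diag}(c_1, c_2, c_3)\) (the convention suggested by Example 6.6, where \([\tfrac{1}{2}\mathrm{diag}(x), -I_4]\) times \([2\,\mathrm{diag}(x); \mathrm{diag}(x^2-1)]\) gives \(I_4\)), the matrix coded below satisfies L(x)C(x) = I₃ at random points, and the reported minimizer attains the reported value. A sketch:

```python
import random

# Constraints of Example 6.5: c1 = x1, c2 = x1*x2 - 1, c3 = x2*x3 - 1.
def C(x1, x2, x3):
    # 6x3: the 3x3 Jacobian [grad c1, grad c2, grad c3] over diag(c1, c2, c3).
    return [
        [1.0, x2, 0.0],
        [0.0, x1, x3],
        [0.0, 0.0, x2],
        [x1, 0.0, 0.0],
        [0.0, x1 * x2 - 1, 0.0],
        [0.0, 0.0, x2 * x3 - 1],
    ]

def L(x1, x2, x3):
    return [
        [1 - x1 * x2, 0.0, 0.0, x2, x2, 0.0],
        [x1, 0.0, 0.0, -1.0, -1.0, 0.0],
        [-x1, x2, 0.0, 1.0, 0.0, -1.0],
    ]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

random.seed(0)
for _ in range(100):
    x = [random.uniform(-2, 2) for _ in range(3)]
    P = matmul(L(*x), C(*x))
    assert all(abs(P[i][j] - (1.0 if i == j else 0.0)) < 1e-9
               for i in range(3) for j in range(3))

# Objective value at the reported minimizer (0.9071, 1.1024, 0.9071):
def f(x1, x2, x3):
    return (x1**3 + x2**3 + x3**3 + 4 * x1 * x2 * x3
            - (x1 * (x2**2 + x3**2) + x2 * (x3**2 + x1**2)
               + x3 * (x1**2 + x2**2)))

print(round(f(0.9071, 1.1024, 0.9071), 4))  # close to 0.9492
```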

Example 6.6. Consider the optimization problem (x₀ = 1)
\[
\begin{array}{cl}
\min\limits_{x\in\mathbb{R}^4} & x^Tx + \sum_{i=0}^{4} \prod_{j\ne i} (x_i - x_j) \\
\mathrm{s.t.} & x_1^2 - 1 \ge 0, \quad x_2^2 - 1 \ge 0, \quad x_3^2 - 1 \ge 0, \quad x_4^2 - 1 \ge 0.
\end{array}
\]
The matrix polynomial \(L(x) = [\tfrac{1}{2}\mathrm{diag}(x),\ -I_4]\). The first part of the objective is xᵀx, while the second part is a nonnegative polynomial [37]. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin = 4.0000 and 11 global minimizers:

(1, 1, 1, 1), (1, −1, −1, 1), (1, −1, 1, −1), (1, 1, −1, −1),
(1, −1, −1, −1), (−1, −1, 1, 1), (−1, 1, −1, 1), (−1, 1, 1, −1),
(−1, −1, −1, 1), (−1, −1, 1, −1), (−1, 1, −1, −1).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 6. It confirms that f_k = fmin for all k ≥ 4, up to numerical round-off errors.

Table 6. Computational results for Example 6.6

                 w/o LME                    with LME
  order k   lower bound     time       lower bound     time
     3         −∞           1.1377       3.5480        1.1765
     4      −6.6913·10⁴     4.7677       4.0000        3.0761
     5       −21.3778      22.9970       4.0000       10.3354
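The minimizer count in Example 6.6 is easy to confirm by brute force: on the feasible set every |xᵢ| ≥ 1, the second sum is nonnegative, and it vanishes exactly when no entry of (x₀, x₁, ..., x₄) is unique, so enumerating the sign vectors {−1, 1}⁴ recovers the 11 listed points. A sketch:

```python
import math
from itertools import product

# Example 6.6 objective, with x0 = 1 prepended:
#   f(x) = x^T x + sum_{i=0..4} prod_{j != i} (x_i - x_j)
def f(x):
    y = (1.0,) + tuple(x)
    quad = sum(t * t for t in x)
    extra = sum(math.prod(y[i] - y[j] for j in range(5) if j != i)
                for i in range(5))
    return quad + extra

# Among the 16 sign vectors in {-1, 1}^4, exactly the 11 listed minimizers
# attain f = 4; the remaining 5 (those with a unique entry in (1, x)) give 20.
vals = {x: f(x) for x in product([-1.0, 1.0], repeat=4)}
mins = [x for x, v in vals.items() if abs(v - 4.0) < 1e-12]
assert (1.0, 1.0, 1.0, 1.0) in mins
print(len(mins))  # 11
```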

Example 6.7. Consider the optimization problem
\[
\begin{array}{cl}
\min\limits_{x\in\mathbb{R}^3} & x_1^4x_2^2 + x_2^4x_3^2 + x_3^4x_1^2 - 3x_1^2x_2^2x_3^2 + x_2^2 \\
\mathrm{s.t.} & x_1 - x_2x_3 \ge 0, \quad -x_2 + x_3^2 \ge 0.
\end{array}
\]
The matrix polynomial
\[
L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -x_3 & -1 & 0 & 0 & 0 \end{bmatrix}.
\]
By the arithmetic-geometric mean inequality, one can show that fmin = 0. The global minimizers are (x₁, 0, x₃) with x₁ ≥ 0 and x₁x₃ = 0. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that f_k = fmin for all k ≥ 5, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

                 w/o LME                    with LME
  order k   lower bound     time       lower bound     time
     3         −∞           0.6144       −∞            0.3418
     4      −1.0909·10⁷     1.0542       −3.9476       0.7180
     5      −942.6772       1.6771       −3·10⁻⁹       1.4607
     6        −0.0110       3.3532       −8·10⁻¹⁰      3.1618
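Both claims of Example 6.7 can be checked directly. Assuming C(x) stacks the Jacobian \([\nabla c_1, \nabla c_2]\) over \(\mathrm{diag}(c_1, c_2)\) (as in the other examples), the displayed 2×5 matrix satisfies L(x)C(x) = I₂ identically; and every point (x₁, 0, x₃) with x₁x₃ = 0 attains objective value 0. A sketch:

```python
import random

# Example 6.7 constraints: c1 = x1 - x2*x3, c2 = -x2 + x3**2.
def C(x1, x2, x3):
    # 5x2: Jacobian [grad c1, grad c2] stacked over diag(c1, c2).
    c1, c2 = x1 - x2 * x3, -x2 + x3**2
    return [[1.0, 0.0], [-x3, -1.0], [-x2, 2 * x3], [c1, 0.0], [0.0, c2]]

def L(x1, x2, x3):
    return [[1.0, 0.0, 0.0, 0.0, 0.0],
            [-x3, -1.0, 0.0, 0.0, 0.0]]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

random.seed(1)
for _ in range(100):
    x = [random.uniform(-2, 2) for _ in range(3)]
    P = matmul(L(*x), C(*x))
    assert all(abs(P[i][j] - (1.0 if i == j else 0.0)) < 1e-9
               for i in range(2) for j in range(2))

# fmin = 0 is attained on the whole family (x1, 0, x3), x1 >= 0, x1*x3 = 0:
def f(x1, x2, x3):
    return (x1**4 * x2**2 + x2**4 * x3**2 + x3**4 * x1**2
            - 3 * x1**2 * x2**2 * x3**2 + x2**2)

assert f(2.0, 0.0, 0.0) == 0.0 and f(0.0, 0.0, 1.5) == 0.0
```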

Example 6.8. Consider the optimization problem
\[
\begin{array}{cl}
\min\limits_{x\in\mathbb{R}^4} & x_1^2(x_1 - x_4)^2 + x_2^2(x_2 - x_4)^2 + x_3^2(x_3 - x_4)^2 \\
 & \quad +\, 2x_1x_2x_3(x_1 + x_2 + x_3 - 2x_4) + (x_1 - 1)^2 + (x_2 - 1)^2 + (x_3 - 1)^2 \\
\mathrm{s.t.} & x_1 - x_2 \ge 0, \quad x_2 - x_3 \ge 0.
\end{array}
\]
The matrix polynomial
\[
L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}.
\]
In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so fmin is achieved at a critical point. In computation, we got fmin ≈ 0.9413 and a minimizer

(0.5632, 0.5632, 0.5632, 0.7510).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that f_k = fmin for all k ≥ 3, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

                 w/o LME                    with LME
  order k   lower bound     time       lower bound     time
     2         −∞           0.3984       −0.3360       0.9321
     3         −∞           0.7634       0.9413        0.5240
     4      −6.4896·10⁵     4.5496       0.9413        1.7192
     5      −3.1645·10³    24.3665       0.9413        8.1228

Example 6.9. Consider the optimization problem
\[
\begin{array}{cl}
\min\limits_{x\in\mathbb{R}^4} & (x_1 + x_2 + x_3 + x_4 + 1)^2 - 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1) \\
\mathrm{s.t.} & 0 \le x_1, \ldots, x_4 \le 1.
\end{array}
\]
The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so fmin is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied (see footnote 5). The minimum value fmin = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1 − t) is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.

5. Note that \(4 - \sum_{i=1}^{4} x_i^2 = \sum_{i=1}^{4} \big( x_i(1 - x_i)^2 + (1 - x_i)(1 + x_i^2) \big) \in \mathrm{IQ}(c_{eq}, c_{in})\), since each summand simplifies to \((1 - x_i)(1 + x_i) = 1 - x_i^2\).
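The footnote identity and the one-parameter family of minimizers in Example 6.9 are both mechanical to verify; a sketch:

```python
import random

random.seed(2)

# Footnote 5: 4 - sum x_i^2 == sum_i [ x_i(1-x_i)^2 + (1-x_i)(1+x_i^2) ],
# since each summand simplifies to (1 - x_i)(1 + x_i) = 1 - x_i^2.
for _ in range(100):
    x = [random.uniform(-3, 3) for _ in range(4)]
    lhs = 4 - sum(t * t for t in x)
    rhs = sum(t * (1 - t)**2 + (1 - t) * (1 + t * t) for t in x)
    assert abs(lhs - rhs) < 1e-9

# Example 6.9: every point (t, 0, 0, 1-t) with t in [0, 1] attains fmin = 0,
# since (t + (1-t) + 1)^2 - 4((1-t) + t) = 4 - 4 = 0.
def f(x1, x2, x3, x4):
    return ((x1 + x2 + x3 + x4 + 1)**2
            - 4 * (x1 * x2 + x2 * x3 + x3 * x4 + x4 + x1))

for t in [0.0, 0.25, 0.5, 1.0]:
    assert abs(f(t, 0.0, 0.0, 1.0 - t)) < 1e-9
```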


Table 9. Computational results for Example 6.9

                 w/o LME                    with LME
  order k   lower bound     time       lower bound     time
     2        −0.0279       0.2262      −5·10⁻⁶        1.1835
     3        −0.0005       0.4691      −6·10⁻⁷        1.6566
     4        −0.0001       3.1098      −2·10⁻⁷        5.5234
     5        −4·10⁻⁵      16.5092      −6·10⁻⁷       19.7320

For some polynomial optimization problems, the standard Lasserre relaxations might converge fast; e.g., the lowest order relaxation may already be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but they might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem (x₀ = 1)
\[
\begin{array}{cl}
\min\limits_{x\in\mathbb{R}^n} & \sum_{0 \le i \le j \le k \le n} c_{ijk}\, x_i x_j x_k \\
\mathrm{s.t.} & 0 \le x \le 1,
\end{array}
\]
where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have f_c = fmin. The condition ii) of Theorem 3.3 is satisfied because of the box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by standard Lasserre's relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generate 10 random instances and report the average of the consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their times differ a bit. We can observe that (3.8)-(3.9) consume slightly more time.

Table 10. Consumed time (in seconds) for Example 6.10

  n            9        10       11        12        13        14
  w/o LME    1.2569   2.5619   6.3085   15.8722   35.1675   78.4111
  with LME   1.9714   3.8288   8.2519   20.0310   37.6373   82.4778
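Comparable random instances are straightforward to reproduce outside MATLAB. A hypothetical Python sketch (the function `random_cubic` is illustrative, not from the paper; `random.gauss` stands in for MATLAB's randn):

```python
import random
from itertools import combinations_with_replacement

# One random instance of Example 6.10: a dense random cubic in
# (x0, x1, ..., xn) with x0 = 1 fixed and standard normal coefficients.
def random_cubic(n, seed=0):
    rng = random.Random(seed)
    coef = {idx: rng.gauss(0.0, 1.0)
            for idx in combinations_with_replacement(range(n + 1), 3)}
    def f(x):
        y = (1.0,) + tuple(x)  # prepend x0 = 1
        return sum(c * y[i] * y[j] * y[k] for (i, j, k), c in coef.items())
    return f

f = random_cubic(n=3)
print(f((0.0, 0.5, 1.0)))  # some real number; instance-dependent
```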

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value fmin is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed relaxations (3.8)-(3.9) for solving (3.7). Note that
\[
\mathrm{IQ}(c_{eq}, c_{in})_{2k} + \mathrm{IQ}(\phi, \psi)_{2k} = \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(c_{in}, \psi)_{2k}.
\]
If we replace the quadratic module \(\mathrm{Qmod}(c_{in}, \psi)\) by the preordering of \((c_{in}, \psi)\) [21, 25], we can get even tighter relaxations. For convenience, write \((c_{in}, \psi)\) as a single tuple \((g_1, \ldots, g_\ell)\). Its preordering is the set
\[
\mathrm{Preord}(c_{in}, \psi) = \sum_{r_1, \ldots, r_\ell \in \{0, 1\}} g_1^{r_1} \cdots g_\ell^{r_\ell} \cdot \Sigma[x].
\]


The truncation \(\mathrm{Preord}(c_{in}, \psi)_{2k}\) is defined similarly to \(\mathrm{Qmod}(c_{in}, \psi)_{2k}\) in Section 2. A tighter relaxation than (3.8), of the same order k, is
\[
(7.1)\qquad
\left\{
\begin{array}{cl}
f'_{pre,k} := \min & \langle f, y \rangle \\
\mathrm{s.t.} & \langle 1, y \rangle = 1,\ L^{(k)}_{c_{eq}}(y) = 0,\ L^{(k)}_{\phi}(y) = 0, \\
 & L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \quad \forall\, r_1, \ldots, r_\ell \in \{0, 1\}, \\
 & y \in \mathbb{R}^{\mathbb{N}^n_{2k}}.
\end{array}
\right.
\]
Similar to (3.9), the dual optimization problem of the above is
\[
(7.2)\qquad
\left\{
\begin{array}{cl}
f_{pre,k} := \max & \gamma \\
\mathrm{s.t.} & f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Preord}(c_{in}, \psi)_{2k}.
\end{array}
\right.
\]
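The combinatorial cost of (7.1) is visible directly: the preordering contributes one localizing-matrix constraint per product \(g_1^{r_1} \cdots g_\ell^{r_\ell}\), i.e. \(2^\ell\) of them, versus one per \(g_i\) (plus the moment matrix) in the quadratic-module relaxation (3.8). A small sketch enumerating these products (the constraint tuple `g` is illustrative, not from the paper):

```python
from itertools import product as cartesian

# Illustrative constraint tuple (g1, g2, g3), as callables on x in R^3:
g = [lambda x: x[0],
     lambda x: x[0] * x[1] - 1,
     lambda x: x[1] * x[2] - 1]

def preordering_products(g):
    """One callable per product g1^{r1}...gl^{rl}, r in {0,1}^l (2^l total)."""
    prods = []
    for r in cartesian([0, 1], repeat=len(g)):
        def p(x, r=r):
            v = 1.0  # empty product, r = (0,...,0), gives the constant 1
            for ri, gi in zip(r, g):
                if ri:
                    v *= gi(x)
            return v
        prods.append(p)
    return prods

prods = preordering_products(g)
print(len(prods))  # 2**3 = 8
```

In relaxation (7.1) each such product gets its own localizing matrix \(L^{(k)}_{g_1^{r_1}\cdots g_\ell^{r_\ell}}(y) \succeq 0\), which is why the preordering hierarchy is tighter but more expensive than (3.8)-(3.9).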

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose \(K_c \ne \emptyset\) and Assumption 3.1 holds. Then
\[
f_{pre,k} = f'_{pre,k} = f_c
\]
for all k sufficiently large. Therefore, if the minimum value fmin of (1.1) is achieved at a critical point, then \(f_{pre,k} = f'_{pre,k} = f_{\min}\) for all k big enough.

Proof. The proof is very similar to Case III of Theorem 3.3; follow the same argument there. Without Assumption 3.2, we still have \(f(x) \equiv 0\) on the set
\[
K_3 = \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0,\ \phi(x) = 0,\ c_{in}(x) \ge 0,\ \psi(x) \ge 0 \}.
\]
By the Positivstellensatz, there exist an integer \(\ell > 0\) and \(q \in \mathrm{Preord}(c_{in}, \psi)\) such that \(f^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi)\). The rest of the proof is the same.

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is not nonsingular, then such an L(x) does not exist. For such cases, how can we express λ in terms of x for the critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, one can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and sharp degree bounds for Hilbert's Nullstellensatz exist [16]. If the degree bound in [16] is applied at each usage, then a degree bound for L(x) can eventually be obtained. However, the resulting bound is enormous, because the bound in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get
\[
\lambda = \big( C(x)^T C(x) \big)^{-1} C(x)^T \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}
\]
when C(x) has full column rank. This rational representation is expensive to use, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
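The least-squares formula above is easy to exercise on a tiny illustrative instance (not from the paper): minimize x₁² + x₂² subject to x₁x₂ − 1 = 0, assuming C(x) stacks ∇c over c as in Section 5. A sketch:

```python
# Tiny instance:  min x1^2 + x2^2  s.t.  c(x) = x1*x2 - 1 = 0.
# Assuming C(x) = [grad c(x); c(x)], lambda solves the normal equation
#   C(x)^T C(x) lambda = C(x)^T [grad f(x); 0].
def multiplier(x1, x2):
    grad_f = [2 * x1, 2 * x2]
    c = x1 * x2 - 1
    grad_c = [x2, x1]
    C = [[grad_c[0]], [grad_c[1]], [c]]  # (n + m) x m, here 3 x 1
    rhs = grad_f + [0.0]
    # m = 1, so the normal equation is scalar:
    ctc = sum(C[i][0] ** 2 for i in range(3))
    ctr = sum(C[i][0] * rhs[i] for i in range(3))
    return ctr / ctc

# At the critical point (1, 1): grad f = (2, 2) = 2 * grad c, so lambda = 2.
print(multiplier(1.0, 1.0))  # 2.0
```

The denominator here is \(C(x)^T C(x) = x_1^2 + x_2^2 + (x_1x_2 - 1)^2\), already of degree 4 for a single quadratic constraint, which illustrates why this representation gets expensive.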

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for their valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear Programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real Algebraic Geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra, third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular Introduction to Commutative Algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive Polynomials in Control, pp. 293–310, Lecture Notes in Control and Inform. Sci. 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (2001), no. 3, pp. 796–817.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.
[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, Positive Polynomials and Their Applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to Polynomial and Semi-Algebraic Optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (eds. M. Putinar and S. Sullivant), Springer, pp. 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, 47 (2012), no. 2, pp. 167–191.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., 23 (2013), no. 3, pp. 1634–1646.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pp. 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), pp. 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272. American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (eds. M. Putinar and S. Sullivant), IMA Volumes Math. Appl. 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093

E-mail address: njw@math.ucsd.edu


References

[1] D Bertsekas Nonlinear programming second edition Athena Scientific 1995[2] J Bochnak M Coste and M-F Roy Real algebraic geometry Springer 1998[3] F Bugarin D Henrion and J Lasserre Minimizing the sum of many rational functions

Math Programming Comput 8(2016) pp 83ndash111[4] D Cox J Little and D OrsquoShea Ideals varieties and algorithms An introduction to com-

putational algebraic geometry and commutative algebra Third edition Undergraduate Textsin Mathematics Springer New York 1997

[5] R Curto and L Fialkow Truncated K-moment problems in several variables Journal ofOperator Theory 54(2005) pp 189-226

[6] J Demmel JNie and V Powers Representations of positive polynomials on non-compactsemialgebraic sets via KKT ideals Journal of Pure and Applied Algebra 209 (2007) no 1pp 189ndash200

[7] E de Klerk and M Laurent On the Lasserre hierarchy of semidefinite programming re-laxations of convex polynomial optimization problems SIAM Journal On Optimization 21(2011) pp 824-832

[8] E de Klerk J Lasserre M Laurent and Z Sun Bound-constrained polynomial optimizationusing only elementary calculations Mathematics of Operations Research to appear

[9] E de Klerk M Laurent and Z Sun Convergence analysis for Lasserrersquos measure-based hier-archy of upper bounds for polynomial optimization Mathematical Programming to appear

[10] E de Klerk R Hess and M Laurent Improved convergence rates for Lasserre-type hi-erarchies of upper bounds for box-constrained polynomial optimization Preprint 2016arXiv160303329[mathOC]

[11] G-M Greuel and G Pfister A Singular introduction to commutative algebra Springer 2002[12] S He Z Luo J Nie and S Zhang Semidefinite relaxation bounds for indefinite homogeneous

quadratic optimization SIAM J Optim 19 (2) pp 503ndash523[13] D Henrion J Lasserre and J Loefberg GloptiPoly 3 moments optimization and semidef-

inite programming Optimization Methods and Software 24 (2009) no 4-5 pp 761ndash779[14] D Henrion and JB Lasserre Detecting global optimality and extracting solutions in Glop-

tiPoly Positive polynomials in control 293ndash310 Lecture Notes in Control and Inform Sci312 Springer Berlin 2005

[15] D Jibetean and E de Klerk Global optimization of rational functions A semidefinite pro-gramming approach Mathematical Programming 106 (2006) no 1 pp 93ndash109

[16] Janos Kollar Sharp effective Nullstellensatz J Amer Math Soc 1 (1988) no 4 963975[17] JB Lasserre Global optimization with polynomials and the problem of moments SIAM J

Optim 11(3) pp 796ndash817 2001[18] J Lasserre M Laurent and P Rostalski Semidefinite characterization and computation of

zero-dimensional real radical ideals Foundations of Computational Mathematics 8(2008)no 5 pp 607ndash647

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 29

[19] J Lasserre A new look at nonnegativity on closed sets and polynomial optimization SIAMJ Optim 21 (2011) pp 864ndash885

[20] JB Lasserre Convexity in semi-algebraic geometry and polynomial optimization SIAM JOptim 19 (2009) pp 1995-2014

[21] JB Lasserre Moments positive polynomials and their applications Imperial College Press2009

[22] JB Lasserre Introduction to polynomial and semi-algebraic optimization Cambridge Uni-versity Press Cambridge 2015

[23] JB Lasserre KC Toh and S Yang A bounded degree SOS hierarchy for polynomial opti-mization Euro J Comput Optim to appear

[24] M Laurent Semidefinite representations for finite varieties Mathematical Programming 109(2007) pp 1ndash26

[25] M Laurent Sums of squares moment matrices and optimization over polynomials Emerg-ing Applications of Algebraic Geometry Vol 149 of IMA Volumes in Mathematics and itsApplications (Eds M Putinar and S Sullivant) Springer pages 157-270 2009

[26] M Laurent Optimization over polynomials Selected topics Proceedings of the InternationalCongress of Mathematicians 2014

[27] J Nie J Demmel and B Sturmfels Minimizing polynomials via sum of squares over thegradient ideal Mathematical Programming Ser A 106 (2006) no 3 pp 587ndash606

[28] J Nie J Demmel and M Gu Global minimization of rational functions and the nearestGCDs Journal of Global Optimization 40 (2008) no 4 pp 697ndash718

[29] J Nie Discriminants and Nonnegative Polynomials Journal of Symbolic ComputationVol 47 no 2 pp 167ndash191 2012

[30] J Nie An exact Jacobian SDP relaxation for polynomial optimization Mathematical Pro-gramming Ser A 137 (2013) pp 225ndash255

[31] J Nie Polynomial optimization with real varieties SIAM J Optim Vol 23 No 3 pp1634-1646 2013

[32] J Nie Certifying convergence of Lasserrersquos hierarchy via flat truncation Mathematical Pro-gramming Ser A 142 (2013) no 1-2 pp 485ndash510

[33] J Nie Optimality conditions and finite convergence of Lasserrersquos hierarchy MathematicalProgramming Ser A 146 (2014) no 1-2 pp 97ndash121

[34] J Nie Linear optimization with cones of moments and nonnegative polynomials Mathemat-ical Programming Ser B 153 (2015) no 1 pp 247ndash274

[35] P A Parrilo and B Sturmfels Minimizing polynomial functions In S Basu and L Gonzalez-Vega editors Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathe-matics and Computer Science volume 60 of DIMACS Series in Discrete Mathematics andComputer Science pages 83-99 AMS 2003

[36] M Putinar Positive polynomials on compact semi-algebraic sets Ind Univ Math J 42(1993) 969ndash984

[37] B Reznick Some concrete aspects of Hilbertrsquos 17th problem In Contemp Math Vol 253pp 251-272 American Mathematical Society 2000

[38] M Schweighofer Global optimization of polynomials using gradient tentacles and sums ofsquares SIAM J Optim 17 (2006) no 3 pp 920ndash942

[39] C Scheiderer Positivity and sums of squares A guide to recent results Emerging Appli-cations of Algebraic Geometry (M Putinar S Sullivant eds) IMA Volumes Math Appl149 Springer 2009 pp 271ndash324

[40] J F Sturm SeDuMi 102 A MATLAB toolbox for optimization over symmetric conesOptim Methods Softw 11 amp 12 (1999) pp 625ndash653

[41] B Sturmfels Solving systems of polynomial equations CBMS Regional Conference Series inMathematics 97 American Mathematical Society Providence RI 2002

[42] H H Vui and P T Son Global optimization of polynomials using the truncated tangencyvariety and sums of squares SIAM J Optim 19 (2008) no 2 pp 941ndash951

Department of Mathematics University of California at San Diego 9500 Gilman

Drive La Jolla CA USA 92093

E-mail address njwmathucsdedu

  • 1 Introduction
    • 11 Optimality conditions
    • 12 Some existing work
    • 13 New contributions
      • 2 Preliminaries
      • 3 The construction of tight relaxations
        • 31 Tightness of the relaxations
        • 32 Detecting tightness and extracting minimizers
          • 4 Polyhedral constraints
          • 5 General constraints
          • 6 Numerical examples
          • 7 Discussions
            • 71 Tight relaxations using preorderings
            • 72 Singular constraining polynomials
            • 73 Degree bound for L(x)
            • 74 Rational representation of Lagrange multipliers
              • References
Page 25: TIGHT RELAXATIONS FOR POLYNOMIAL OPTIMIZATION …The polynomial expressions is de-termined by linear equations. Based on these expressions, new Lasserre type semidefinite relaxations


relaxations and the new ones (3.8)-(3.9) are in Table 7. It confirms that f_k = f_min for all k ≥ 5, up to numerical round-off errors.

Table 7. Computational results for Example 6.7

order k | w/o LME: lower bound, time | with LME: lower bound, time
   3    | −∞, 0.6144                 | −∞, 0.3418
   4    | −1.0909·10^7, 1.0542       | −3.9476, 0.7180
   5    | −942.6772, 1.6771          | −3·10^−9, 1.4607
   6    | −0.0110, 3.3532            | −8·10^−10, 3.1618

Example 6.8. Consider the optimization problem

\[
\begin{aligned}
\min_{x \in \mathbb{R}^4}\ & x_1^2(x_1-x_4)^2 + x_2^2(x_2-x_4)^2 + x_3^2(x_3-x_4)^2 \\
& \quad + 2x_1x_2x_3(x_1+x_2+x_3-2x_4) + (x_1-1)^2 + (x_2-1)^2 + (x_3-1)^2 \\
\text{s.t.}\ & x_1 - x_2 \ge 0, \quad x_2 - x_3 \ge 0.
\end{aligned}
\]

The matrix polynomial
\[
L(x) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}.
\]
In the objective, the sum of the first 4 terms is a nonnegative form [37], while the sum of the last 3 terms is a coercive polynomial. The objective is coercive, so f_min is achieved at a critical point. In computation, we got f_min ≈ 0.9413 and a minimizer

(0.5632, 0.5632, 0.5632, 0.7510).

The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 8. It confirms that f_k = f_min for all k ≥ 3, up to numerical round-off errors.

Table 8. Computational results for Example 6.8

order k | w/o LME: lower bound, time | with LME: lower bound, time
   2    | −∞, 0.3984                 | −0.3360, 0.9321
   3    | −∞, 0.7634                 | 0.9413, 0.5240
   4    | −6.4896·10^5, 4.5496       | 0.9413, 1.7192
   5    | −3.1645·10^3, 24.3665      | 0.9413, 8.1228

Example 6.9. Consider the optimization problem

\[
\begin{aligned}
\min_{x \in \mathbb{R}^4}\ & (x_1+x_2+x_3+x_4+1)^2 - 4(x_1x_2 + x_2x_3 + x_3x_4 + x_4 + x_1) \\
\text{s.t.}\ & 0 \le x_1, \ldots, x_4 \le 1.
\end{aligned}
\]

The matrix L(x) is given in Example 4.3. The objective is the dehomogenization of Horn's form [37]. The feasible set is compact, so f_min is achieved at a critical point. The condition ii) of Theorem 3.3 is satisfied.^5 The minimum value f_min = 0. For each t ∈ [0, 1], the point (t, 0, 0, 1−t) is a global minimizer. The computational results for standard Lasserre's relaxations and the new ones (3.8)-(3.9) are in Table 9.

^5 Note that \(4 - \sum_{i=1}^{4} x_i^2 = \sum_{i=1}^{4} \big( x_i(1-x_i)^2 + (1-x_i)(1+x_i^2) \big) \in IQ(c_{eq}, c_{in})\).
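The polynomial identity in the footnote can be spot-checked numerically; a minimal sketch in plain Python (the sample points below are arbitrary, not from the paper):

```python
import random

def lhs(x):
    # 4 - sum_{i=1}^4 x_i^2
    return 4 - sum(xi**2 for xi in x)

def rhs(x):
    # sum_{i=1}^4 ( x_i (1 - x_i)^2 + (1 - x_i)(1 + x_i^2) )
    return sum(xi*(1 - xi)**2 + (1 - xi)*(1 + xi**2) for xi in x)

random.seed(0)
for _ in range(100):
    x = [random.uniform(-2, 2) for _ in range(4)]
    assert abs(lhs(x) - rhs(x)) < 1e-9  # the two sides agree at every point
print("identity verified")
```

Indeed, each summand factors as (1 − x_i)(x_i(1 − x_i) + 1 + x_i^2) = (1 − x_i)(1 + x_i) = 1 − x_i^2, so the identity holds exactly.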


Table 9. Computational results for Example 6.9

order k | w/o LME: lower bound, time | with LME: lower bound, time
   2    | −0.0279, 0.2262            | −5·10^−6, 1.1835
   3    | −0.0005, 0.4691            | −6·10^−7, 1.6566
   4    | −0.0001, 3.1098            | −2·10^−7, 5.5234
   5    | −4·10^−5, 16.5092          | −6·10^−7, 19.7320

For some polynomial optimization problems, the standard Lasserre relaxations might converge fast, e.g., the lowest order relaxation may often be tight. For such cases, the new relaxations (3.8)-(3.9) have the same convergence property, but might take more computational time. The following is such a comparison.

Example 6.10. Consider the optimization problem (x_0 = 1)

\[
\begin{aligned}
\min_{x \in \mathbb{R}^n}\ & \sum_{0 \le i \le j \le k \le n} c_{ijk}\, x_i x_j x_k \\
\text{s.t.}\ & 0 \le x \le 1,
\end{aligned}
\]

where each coefficient c_{ijk} is randomly generated (by randn in MATLAB). The matrix L(x) is the same as in Example 4.3. Since the feasible set is compact, we always have f_c = f_min. The condition ii) of Theorem 3.3 is satisfied, because of the box constraints. For this kind of randomly generated problems, standard Lasserre's relaxations are often tight for the order k = 2, which is also the case for the new relaxations (3.8)-(3.9). Here, we compare the computational time consumed by standard Lasserre's relaxations and by (3.8)-(3.9). The time is shown (in seconds) in Table 10. For each n in the table, we generated 10 random instances, and we show the average of the consumed time. For all instances, standard Lasserre's relaxations and the new ones (3.8)-(3.9) are tight for the order k = 2, while their time is a bit different. We can observe that (3.8)-(3.9) consume slightly more time.
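The random objective above is easy to reproduce; a minimal sketch in Python, with random.gauss standing in for MATLAB's randn and n = 4 as an arbitrary illustrative choice:

```python
import random
from itertools import combinations_with_replacement

random.seed(1)
n = 4

# one Gaussian coefficient c_ijk for every index triple 0 <= i <= j <= k <= n,
# with the convention x_0 = 1
coef = {idx: random.gauss(0, 1)
        for idx in combinations_with_replacement(range(n + 1), 3)}

def objective(x):
    # evaluate sum_{0 <= i <= j <= k <= n} c_ijk * x_i * x_j * x_k at x in R^n
    xx = [1.0] + list(x)  # prepend x_0 = 1
    return sum(c * xx[i] * xx[j] * xx[k] for (i, j, k), c in coef.items())

print(len(coef))  # C(n+3, 3) = 35 monomials x_i x_j x_k for n = 4
```

The monomial count C(n+3, 3) grows cubically in n, which is why the consumed time in Table 10 grows quickly with n.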

Table 10. Consumed time (in seconds) for Example 6.10

    n    |   9    |   10   |   11   |   12    |   13    |   14
w/o LME  | 1.2569 | 2.5619 | 6.3085 | 15.8722 | 35.1675 | 78.4111
with LME | 1.9714 | 3.8288 | 8.2519 | 20.0310 | 37.6373 | 82.4778

7. Discussions

7.1. Tight relaxations using preorderings. When the global minimum value f_min is achieved at a critical point, the problem (1.1) is equivalent to (3.7). We proposed the relaxations (3.8)-(3.9) for solving (3.7). Note that

\[
IQ(c_{eq}, c_{in})_{2k} + IQ(\phi, \psi)_{2k} = \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Qmod}(c_{in}, \psi)_{2k}.
\]

If we replace the quadratic module Qmod(c_in, ψ) by the preordering of (c_in, ψ) [21, 25], we can get even tighter relaxations. For convenience, write (c_in, ψ) as a single tuple (g_1, ..., g_ℓ). Its preordering is the set

\[
\mathrm{Preord}(c_{in}, \psi) := \sum_{r_1, \ldots, r_\ell \in \{0, 1\}} g_1^{r_1} \cdots g_\ell^{r_\ell} \cdot \Sigma[x].
\]
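The combinatorial difference between the two sets is worth noting: the quadratic module attaches one SOS multiplier to each of the ℓ + 1 products 1, g_1, ..., g_ℓ, while the preordering attaches one to each of the 2^ℓ products g_1^{r_1} ··· g_ℓ^{r_ℓ}. A minimal Python sketch enumerating these products, for hypothetical generators (the g_j below are arbitrary stand-ins, not from the paper):

```python
from itertools import product
from math import prod

# hypothetical generators g_1, g_2, g_3, given as callables on x in R^2
g = [lambda x: x[0], lambda x: 1 - x[0], lambda x: x[1]]
ell = len(g)

def preordering_products(x):
    """Values of all 2^ell products g_1^{r_1} * ... * g_ell^{r_ell}, r_j in {0,1}."""
    # prod of an empty selection is 1, covering the r = (0,...,0) term
    return [prod(gj(x) for gj, rj in zip(g, r) if rj)
            for r in product((0, 1), repeat=ell)]

vals = preordering_products([0.3, 0.5])
print(len(vals))  # 2^3 = 8 products; the first, r = (0,0,0), is the constant 1
```

This exponential count 2^ℓ is exactly why the preordering relaxations below, though tighter, are more expensive than the quadratic-module ones of the same order.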


The truncation Preord(c_in, ψ)_{2k} is defined similarly to Qmod(c_in, ψ)_{2k} in Section 2. A tighter relaxation than (3.8), of the same order k, is

\[
(7.1)\qquad
\begin{aligned}
f'^{\,pre}_k := \min\ & \langle f, y \rangle \\
\text{s.t.}\ & \langle 1, y \rangle = 1,\ \ L^{(k)}_{c_{eq}}(y) = 0,\ \ L^{(k)}_{\phi}(y) = 0, \\
& L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \ \ \forall\, r_1, \ldots, r_\ell \in \{0, 1\}, \\
& y \in \mathbb{R}^{\mathbb{N}^n_{2k}}.
\end{aligned}
\]

Similar to (3.9), the dual optimization problem of the above is

\[
(7.2)\qquad
\begin{aligned}
f^{pre}_k := \max\ & \gamma \\
\text{s.t.}\ & f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Preord}(c_{in}, \psi)_{2k}.
\end{aligned}
\]

An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds, even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose K_c ≠ ∅ and Assumption 3.1 holds. Then

\[
f^{pre}_k = f'^{\,pre}_k = f_c
\]

for all k sufficiently large. Therefore, if the minimum value f_min of (1.1) is achieved at a critical point, then f^{pre}_k = f'^{pre}_k = f_min for all k big enough.

Proof. The proof is very similar to Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have f(x) ≡ 0 on the set

\[
K_3 = \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0,\ \phi(x) = 0,\ c_{in}(x) \ge 0,\ \psi(x) \ge 0 \}.
\]

By the Positivstellensatz, there exist an integer ℓ > 0 and q ∈ Preord(c_in, ψ) such that \(f^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi)\). The rest of the proof is the same. □

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple c of constraining polynomials is nonsingular, then there exists a matrix polynomial L(x) such that L(x)C(x) = I_m. Hence, the Lagrange multiplier λ can be expressed as in (5.3). However, if c is not nonsingular, then such L(x) does not exist. For such cases, how can we express λ in terms of x for critical pairs (x, λ)? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for L(x). For a nonsingular tuple c of constraining polynomials, what is a good degree bound for L(x) in Proposition 5.2? When c is linear, a degree bound is given in Proposition 4.1. However, for nonlinear c, an explicit degree bound is not known. Theoretically, we can get a degree bound for L(x): in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used t times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is applied at each of these usages, then a degree bound for L(x) can eventually be obtained. However, the bound obtained this way is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for L(x).
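To get a sense of scale: the effective Nullstellensatz bound of [16] is exponential in the number of variables n; the max(3, d)^n shape used below is an assumption for illustration only (see [16] for the precise statement). Even before being compounded over the t applications, it is already huge:

```python
# Illustrative only: an assumed max(3, d)^n shape of a Kollar-type bound [16].
def nullstellensatz_bound(d, n):
    return max(3, d) ** n

for n in (5, 10, 15):
    # even for quadratic constraints (d = 2), the assumed bound is 3^n
    print(n, nullstellensatz_bound(2, n))
```

Compounding such a bound t times, as in the proof of Proposition 5.2, only makes the final degree bound for L(x) larger still.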


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector λ is determined by a linear equation. Naturally, one can get

\[
\lambda = \big( C(x)^T C(x) \big)^{-1} C(x)^T \begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix},
\]

when C(x) has full column rank. This rational representation is expensive to use, because its denominator is typically a polynomial of high degree. However, λ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
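A tiny numerical illustration of this formula, for an assumed single constraint c(x) = x_1^2 + x_2^2 − 1 and objective f(x) = x_1 (both hypothetical, chosen so that C(x), stacking ∇c(x) on top of c(x), is 3 × 1 and the pseudoinverse reduces to a scalar division):

```python
# lam(x) = (C(x)^T C(x))^{-1} C(x)^T [grad f(x); 0] for the toy data above
def lam(x1, x2):
    c = x1**2 + x2**2 - 1.0          # assumed constraint value
    C = [2.0 * x1, 2.0 * x2, c]      # C(x): grad c(x) stacked over c(x)
    b = [1.0, 0.0, 0.0]              # [grad f(x); 0], with grad f = (1, 0)
    ctc = sum(ci * ci for ci in C)   # C(x)^T C(x), a scalar since m = 1
    ctb = sum(ci * bi for ci, bi in zip(C, b))
    return ctb / ctc

# at the KKT point x = (1, 0), grad f = lam * grad c forces lam = 1/2
print(lam(1.0, 0.0))  # -> 0.5
```

Even in this toy case the denominator C(x)^T C(x) = 4x_1^2 + 4x_2^2 + (x_1^2 + x_2^2 − 1)^2 already has degree 4, which illustrates why this rational representation quickly becomes expensive.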

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas. Nonlinear programming, second edition. Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy. Real algebraic geometry. Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre. Minimizing the sum of many rational functions. Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea. Ideals, varieties, and algorithms: an introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow. Truncated K-moment problems in several variables. Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers. Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals. Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent. On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun. Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization. Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister. A Singular introduction to commutative algebra. Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang. Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization. SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg. GloptiPoly 3: moments, optimization and semidefinite programming. Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Positive polynomials in control, 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk. Global optimization of rational functions: a semidefinite programming approach. Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár. Sharp effective Nullstellensatz. J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim., 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski. Semidefinite characterization and computation of zero-dimensional real radical ideals. Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.


[19] J. Lasserre. A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J.B. Lasserre. Convexity in semi-algebraic geometry and polynomial optimization. SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J.B. Lasserre. Moments, positive polynomials and their applications. Imperial College Press, 2009.
[22] J.B. Lasserre. Introduction to polynomial and semi-algebraic optimization. Cambridge University Press, Cambridge, 2015.
[23] J.B. Lasserre, K.C. Toh and S. Yang. A bounded degree SOS hierarchy for polynomial optimization. Euro. J. Comput. Optim., to appear.
[24] M. Laurent. Semidefinite representations for finite varieties. Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent. Sums of squares, moment matrices and optimization over polynomials. Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pages 157–270, 2009.
[26] M. Laurent. Optimization over polynomials: selected topics. Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels. Minimizing polynomials via sum of squares over the gradient ideal. Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu. Global minimization of rational functions and the nearest GCDs. Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie. Discriminants and nonnegative polynomials. Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167–191, 2012.
[30] J. Nie. An exact Jacobian SDP relaxation for polynomial optimization. Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie. Polynomial optimization with real varieties. SIAM J. Optim., Vol. 23, No. 3, pp. 1634–1646, 2013.
[32] J. Nie. Certifying convergence of Lasserre's hierarchy via flat truncation. Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie. Optimality conditions and finite convergence of Lasserre's hierarchy. Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie. Linear optimization with cones of moments and nonnegative polynomials. Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P.A. Parrilo and B. Sturmfels. Minimizing polynomial functions. In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pages 83–99. AMS, 2003.
[36] M. Putinar. Positive polynomials on compact semi-algebraic sets. Ind. Univ. Math. J., 42 (1993), 969–984.
[37] B. Reznick. Some concrete aspects of Hilbert's 17th problem. In Contemp. Math., Vol. 253, pp. 251–272, American Mathematical Society, 2000.
[38] M. Schweighofer. Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer. Positivity and sums of squares: a guide to recent results. Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J.F. Sturm. SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels. Solving systems of polynomial equations. CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H.H. Vui and P.T. Son. Global optimization of polynomials using the truncated tangency variety and sums of squares. SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093

E-mail address: njw@math.ucsd.edu

• 1. Introduction
  • 1.1. Optimality conditions
  • 1.2. Some existing work
  • 1.3. New contributions
• 2. Preliminaries
• 3. The construction of tight relaxations
  • 3.1. Tightness of the relaxations
  • 3.2. Detecting tightness and extracting minimizers
• 4. Polyhedral constraints
• 5. General constraints
• 6. Numerical examples
• 7. Discussions
  • 7.1. Tight relaxations using preorderings
  • 7.2. Singular constraining polynomials
  • 7.3. Degree bound for L(x)
  • 7.4. Rational representation of Lagrange multipliers
• References

[29] J Nie Discriminants and Nonnegative Polynomials Journal of Symbolic ComputationVol 47 no 2 pp 167ndash191 2012

[30] J Nie An exact Jacobian SDP relaxation for polynomial optimization Mathematical Pro-gramming Ser A 137 (2013) pp 225ndash255

[31] J Nie Polynomial optimization with real varieties SIAM J Optim Vol 23 No 3 pp1634-1646 2013

[32] J Nie Certifying convergence of Lasserrersquos hierarchy via flat truncation Mathematical Pro-gramming Ser A 142 (2013) no 1-2 pp 485ndash510

[33] J Nie Optimality conditions and finite convergence of Lasserrersquos hierarchy MathematicalProgramming Ser A 146 (2014) no 1-2 pp 97ndash121

[34] J Nie Linear optimization with cones of moments and nonnegative polynomials Mathemat-ical Programming Ser B 153 (2015) no 1 pp 247ndash274

[35] P A Parrilo and B Sturmfels Minimizing polynomial functions In S Basu and L Gonzalez-Vega editors Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathe-matics and Computer Science volume 60 of DIMACS Series in Discrete Mathematics andComputer Science pages 83-99 AMS 2003

[36] M Putinar Positive polynomials on compact semi-algebraic sets Ind Univ Math J 42(1993) 969ndash984

[37] B Reznick Some concrete aspects of Hilbertrsquos 17th problem In Contemp Math Vol 253pp 251-272 American Mathematical Society 2000

[38] M Schweighofer Global optimization of polynomials using gradient tentacles and sums ofsquares SIAM J Optim 17 (2006) no 3 pp 920ndash942

[39] C Scheiderer Positivity and sums of squares A guide to recent results Emerging Appli-cations of Algebraic Geometry (M Putinar S Sullivant eds) IMA Volumes Math Appl149 Springer 2009 pp 271ndash324

[40] J F Sturm SeDuMi 102 A MATLAB toolbox for optimization over symmetric conesOptim Methods Softw 11 amp 12 (1999) pp 625ndash653

[41] B Sturmfels Solving systems of polynomial equations CBMS Regional Conference Series inMathematics 97 American Mathematical Society Providence RI 2002

[42] H H Vui and P T Son Global optimization of polynomials using the truncated tangencyvariety and sums of squares SIAM J Optim 19 (2008) no 2 pp 941ndash951

Department of Mathematics University of California at San Diego 9500 Gilman

Drive La Jolla CA USA 92093

E-mail address njwmathucsdedu

  • 1 Introduction
    • 11 Optimality conditions
    • 12 Some existing work
    • 13 New contributions
      • 2 Preliminaries
      • 3 The construction of tight relaxations
        • 31 Tightness of the relaxations
        • 32 Detecting tightness and extracting minimizers
          • 4 Polyhedral constraints
          • 5 General constraints
          • 6 Numerical examples
          • 7 Discussions
            • 71 Tight relaxations using preorderings
            • 72 Singular constraining polynomials
            • 73 Degree bound for L(x)
            • 74 Rational representation of Lagrange multipliers
              • References
Page 27: TIGHT RELAXATIONS FOR POLYNOMIAL OPTIMIZATION …The polynomial expressions is de-termined by linear equations. Based on these expressions, new Lasserre type semidefinite relaxations

TIGHT RELAXATIONS AND LAGRANGE MULTIPLIER EXPRESSIONS 27

The truncation $\mathrm{Preord}(c_{in}, \psi)_{2k}$ is defined similarly to $\mathrm{Qmod}(c_{in}, \psi)_{2k}$ in Section 2. A tighter relaxation than (3.8), of the same order $k$, is
\[
(7.1)\qquad
\left\{
\begin{array}{rl}
f'_{pre,k} := \min & \langle f, y \rangle \\
\mathrm{s.t.} & \langle 1, y \rangle = 1, \quad L^{(k)}_{c_{eq}}(y) = 0, \quad L^{(k)}_{\phi}(y) = 0, \\
& L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0 \quad \forall\, r_1, \ldots, r_\ell \in \{0, 1\}, \\
& y \in \mathbb{R}^{\mathbb{N}^n_{2k}},
\end{array}
\right.
\]
where $(g_1, \ldots, g_\ell)$ denotes the tuple of inequality constraining polynomials $(c_{in}, \psi)$. Similar to (3.9), the dual optimization problem of the above is
\[
(7.2)\qquad
f_{pre,k} := \max \; \gamma \quad \mathrm{s.t.} \quad f - \gamma \in \mathrm{Ideal}(c_{eq}, \phi)_{2k} + \mathrm{Preord}(c_{in}, \psi)_{2k}.
\]
An attractive property of the relaxations (7.1)-(7.2) is that the conclusion of Theorem 3.3 still holds even if none of the conditions i)-iii) there is satisfied. This gives the following theorem.

Theorem 7.1. Suppose $K_c \ne \emptyset$ and Assumption 3.1 holds. Then
\[
f_{pre,k} = f'_{pre,k} = f_c
\]
for all $k$ sufficiently large. Therefore, if the minimum value $f_{min}$ of (1.1) is achieved at a critical point, then $f_{pre,k} = f'_{pre,k} = f_{min}$ for all $k$ big enough.

Proof. The proof is very similar to Case III of Theorem 3.3. Follow the same argument there. Without Assumption 3.2, we still have $f(x) \equiv 0$ on the set
\[
K_3 = \{ x \in \mathbb{R}^n \mid c_{eq}(x) = 0, \; \phi(x) = 0, \; c_{in}(x) \ge 0, \; \psi(x) \ge 0 \}.
\]
By the Positivstellensatz, there exist an integer $\ell > 0$ and $q \in \mathrm{Preord}(c_{in}, \psi)$ such that $f^{2\ell} + q \in \mathrm{Ideal}(c_{eq}, \phi)$. The rest of the proof is the same.
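To make the preordering constraints in (7.1) concrete: with $\ell$ inequality constraining polynomials there are $2^\ell$ products $g_1^{r_1} \cdots g_\ell^{r_\ell}$, one per binary exponent vector $(r_1, \ldots, r_\ell)$. A minimal Python sketch (with hypothetical constraints, not an example from the paper) that enumerates these products as callables:

```python
from itertools import product

def preordering_products(g):
    """Given inequality constraints g = [g1, ..., gl] (callables R^n -> R),
    return a dict mapping each binary exponent vector (r1, ..., rl) to the
    callable product g1^r1 * ... * gl^rl; there are 2^l such products."""
    prods = {}
    for r in product((0, 1), repeat=len(g)):
        def p(x, r=r):
            val = 1.0
            for gi, ri in zip(g, r):
                if ri:
                    val *= gi(x)
            return val
        prods[r] = p
    return prods

# hypothetical constraints: g1 = 1 - x1^2, g2 = x2
g = [lambda x: 1 - x[0]**2, lambda x: x[1]]
prods = preordering_products(g)
assert len(prods) == 4                    # 2^l products, l = 2
assert prods[(0, 0)]([5.0, 5.0]) == 1.0   # empty product is the constant 1
assert prods[(1, 1)]([0.0, 2.0]) == 2.0   # g1*g2 at (0, 2) is 1*2
```

Each such product contributes one localizing-matrix constraint $L^{(k)}_{g_1^{r_1} \cdots g_\ell^{r_\ell}}(y) \succeq 0$ in (7.1), which is why the preordering relaxation is tighter, but also exponentially larger, than the quadratic module one.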

7.2. Singular constraining polynomials. As shown in Proposition 5.2, if the tuple $c$ of constraining polynomials is nonsingular, then there exists a matrix polynomial $L(x)$ such that $L(x)C(x) = I_m$. Hence, the Lagrange multiplier $\lambda$ can be expressed as in (5.3). However, if $c$ is not nonsingular, then such $L(x)$ does not exist. For such cases, how can we express $\lambda$ in terms of $x$ for critical pairs $(x, \lambda)$? This question is mostly open, to the best of the author's knowledge.

7.3. Degree bound for L(x). For a nonsingular tuple $c$ of constraining polynomials, what is a good degree bound for $L(x)$ in Proposition 5.2? When $c$ is linear, a degree bound is given in Proposition 4.1. However, for nonlinear $c$, an explicit degree bound is not known. Theoretically, we can get a degree bound for $L(x)$: in the proof of Proposition 5.2, Hilbert's Nullstellensatz is used $t$ times, and there exist sharp degree bounds for Hilbert's Nullstellensatz [16]. If the degree bound in [16] is applied at each usage, then a degree bound for $L(x)$ can eventually be obtained. However, the bound so obtained is enormous, because the one in [16] is already exponential in the number of variables. An interesting future work is to get a useful degree bound for $L(x)$.
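For orientation, the shape of the effective Nullstellensatz bound is roughly the following (this is a commonly quoted weak form; see [16] for the precise statement and its hypotheses on the degrees): if $f_1, \ldots, f_s \in \mathbb{C}[x_1, \ldots, x_n]$ have no common zero and $\deg f_i \le d$ with $d \ge 3$, then there exist polynomials $p_i$ with
\[
1 = \sum_{i=1}^{s} p_i f_i, \qquad \deg(p_i f_i) \le d^{\,n}.
\]
Composing $t$ such representations, as in the proof of Proposition 5.2, therefore compounds bounds that are each already exponential in $n$.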


7.4. Rational representation of Lagrange multipliers. In (5.1), the Lagrange multiplier vector $\lambda$ is determined by a linear equation. Naturally, one can get
\[
\lambda = \left( C(x)^T C(x) \right)^{-1} C(x)^T
\begin{bmatrix} \nabla f(x) \\ 0 \end{bmatrix}
\]
when $C(x)$ has full column rank. This rational representation is expensive for usage, because its denominator is typically a polynomial of high degree. However, $\lambda$ might have rational representations other than the above. Can we find a rational representation whose denominator and numerator have low degrees? If this is possible, the methods for optimizing rational functions [3, 15, 28] can be applied. This is an interesting question for future research.
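Numerically, the formula above is just the least-squares solution of $C(x)\lambda = [\nabla f(x);\, 0]$. A small sketch, assuming a hypothetical numeric $C(x)$ of full column rank (the paper's actual construction of $C(x)$ from (5.1) is not reproduced here):

```python
import numpy as np

def multipliers(C, grad_f):
    """Solve C(x) * lam = [grad f(x); 0] in the least-squares sense,
    i.e. lam = (C^T C)^{-1} C^T [grad f; 0] when C has full column rank."""
    rhs = np.concatenate([grad_f, np.zeros(C.shape[0] - len(grad_f))])
    lam, *_ = np.linalg.lstsq(C, rhs, rcond=None)
    return lam

# hypothetical data: C(x) evaluated at some point x, stacked as in (5.1)
C = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])          # full column rank
lam = multipliers(C, np.array([3.0, -2.0]))
assert np.allclose(lam, [3.0, -2.0])
```

This is fine at a single point; the cost the text refers to arises when the same inverse is carried out symbolically in $x$, where $\det(C(x)^T C(x))$ is typically a polynomial of high degree.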

Acknowledgement. The author was partially supported by the NSF grants DMS-1417985 and DMS-1619973. He would like to thank the anonymous referees for valuable comments on improving the paper.

References

[1] D. Bertsekas, Nonlinear programming, second edition, Athena Scientific, 1995.
[2] J. Bochnak, M. Coste and M.-F. Roy, Real algebraic geometry, Springer, 1998.
[3] F. Bugarin, D. Henrion and J. Lasserre, Minimizing the sum of many rational functions, Math. Programming Comput., 8 (2016), pp. 83–111.
[4] D. Cox, J. Little and D. O'Shea, Ideals, varieties, and algorithms. An introduction to computational algebraic geometry and commutative algebra, third edition, Undergraduate Texts in Mathematics, Springer, New York, 1997.
[5] R. Curto and L. Fialkow, Truncated K-moment problems in several variables, Journal of Operator Theory, 54 (2005), pp. 189–226.
[6] J. Demmel, J. Nie and V. Powers, Representations of positive polynomials on non-compact semialgebraic sets via KKT ideals, Journal of Pure and Applied Algebra, 209 (2007), no. 1, pp. 189–200.
[7] E. de Klerk and M. Laurent, On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems, SIAM Journal on Optimization, 21 (2011), pp. 824–832.
[8] E. de Klerk, J. Lasserre, M. Laurent and Z. Sun, Bound-constrained polynomial optimization using only elementary calculations, Mathematics of Operations Research, to appear.
[9] E. de Klerk, M. Laurent and Z. Sun, Convergence analysis for Lasserre's measure-based hierarchy of upper bounds for polynomial optimization, Mathematical Programming, to appear.
[10] E. de Klerk, R. Hess and M. Laurent, Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization, Preprint, 2016. arXiv:1603.03329 [math.OC]
[11] G.-M. Greuel and G. Pfister, A Singular introduction to commutative algebra, Springer, 2002.
[12] S. He, Z. Luo, J. Nie and S. Zhang, Semidefinite relaxation bounds for indefinite homogeneous quadratic optimization, SIAM J. Optim., 19 (2), pp. 503–523.
[13] D. Henrion, J. Lasserre and J. Loefberg, GloptiPoly 3: moments, optimization and semidefinite programming, Optimization Methods and Software, 24 (2009), no. 4-5, pp. 761–779.
[14] D. Henrion and J. B. Lasserre, Detecting global optimality and extracting solutions in GloptiPoly, Positive polynomials in control, 293–310, Lecture Notes in Control and Inform. Sci., 312, Springer, Berlin, 2005.
[15] D. Jibetean and E. de Klerk, Global optimization of rational functions: a semidefinite programming approach, Mathematical Programming, 106 (2006), no. 1, pp. 93–109.
[16] J. Kollár, Sharp effective Nullstellensatz, J. Amer. Math. Soc., 1 (1988), no. 4, pp. 963–975.
[17] J. B. Lasserre, Global optimization with polynomials and the problem of moments, SIAM J. Optim., 11 (3), pp. 796–817, 2001.
[18] J. Lasserre, M. Laurent and P. Rostalski, Semidefinite characterization and computation of zero-dimensional real radical ideals, Foundations of Computational Mathematics, 8 (2008), no. 5, pp. 607–647.


[19] J. Lasserre, A new look at nonnegativity on closed sets and polynomial optimization, SIAM J. Optim., 21 (2011), pp. 864–885.
[20] J. B. Lasserre, Convexity in semi-algebraic geometry and polynomial optimization, SIAM J. Optim., 19 (2009), pp. 1995–2014.
[21] J. B. Lasserre, Moments, positive polynomials and their applications, Imperial College Press, 2009.
[22] J. B. Lasserre, Introduction to polynomial and semi-algebraic optimization, Cambridge University Press, Cambridge, 2015.
[23] J. B. Lasserre, K. C. Toh and S. Yang, A bounded degree SOS hierarchy for polynomial optimization, Euro. J. Comput. Optim., to appear.
[24] M. Laurent, Semidefinite representations for finite varieties, Mathematical Programming, 109 (2007), pp. 1–26.
[25] M. Laurent, Sums of squares, moment matrices and optimization over polynomials, Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications (Eds. M. Putinar and S. Sullivant), Springer, pages 157–270, 2009.
[26] M. Laurent, Optimization over polynomials: selected topics, Proceedings of the International Congress of Mathematicians, 2014.
[27] J. Nie, J. Demmel and B. Sturmfels, Minimizing polynomials via sum of squares over the gradient ideal, Mathematical Programming, Ser. A, 106 (2006), no. 3, pp. 587–606.
[28] J. Nie, J. Demmel and M. Gu, Global minimization of rational functions and the nearest GCDs, Journal of Global Optimization, 40 (2008), no. 4, pp. 697–718.
[29] J. Nie, Discriminants and nonnegative polynomials, Journal of Symbolic Computation, Vol. 47, no. 2, pp. 167–191, 2012.
[30] J. Nie, An exact Jacobian SDP relaxation for polynomial optimization, Mathematical Programming, Ser. A, 137 (2013), pp. 225–255.
[31] J. Nie, Polynomial optimization with real varieties, SIAM J. Optim., Vol. 23, No. 3, pp. 1634–1646, 2013.
[32] J. Nie, Certifying convergence of Lasserre's hierarchy via flat truncation, Mathematical Programming, Ser. A, 142 (2013), no. 1-2, pp. 485–510.
[33] J. Nie, Optimality conditions and finite convergence of Lasserre's hierarchy, Mathematical Programming, Ser. A, 146 (2014), no. 1-2, pp. 97–121.
[34] J. Nie, Linear optimization with cones of moments and nonnegative polynomials, Mathematical Programming, Ser. B, 153 (2015), no. 1, pp. 247–274.
[35] P. A. Parrilo and B. Sturmfels, Minimizing polynomial functions, In S. Basu and L. Gonzalez-Vega, editors, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, volume 60 of DIMACS Series in Discrete Mathematics and Computer Science, pages 83–99, AMS, 2003.
[36] M. Putinar, Positive polynomials on compact semi-algebraic sets, Ind. Univ. Math. J., 42 (1993), 969–984.
[37] B. Reznick, Some concrete aspects of Hilbert's 17th problem, In Contemp. Math., Vol. 253, pp. 251–272, American Mathematical Society, 2000.
[38] M. Schweighofer, Global optimization of polynomials using gradient tentacles and sums of squares, SIAM J. Optim., 17 (2006), no. 3, pp. 920–942.
[39] C. Scheiderer, Positivity and sums of squares: a guide to recent results, Emerging Applications of Algebraic Geometry (M. Putinar, S. Sullivant, eds.), IMA Volumes Math. Appl., 149, Springer, 2009, pp. 271–324.
[40] J. F. Sturm, SeDuMi 1.02: a MATLAB toolbox for optimization over symmetric cones, Optim. Methods Softw., 11 & 12 (1999), pp. 625–653.
[41] B. Sturmfels, Solving systems of polynomial equations, CBMS Regional Conference Series in Mathematics, 97, American Mathematical Society, Providence, RI, 2002.
[42] H. H. Vui and P. T. Son, Global optimization of polynomials using the truncated tangency variety and sums of squares, SIAM J. Optim., 19 (2008), no. 2, pp. 941–951.

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA, USA 92093

E-mail address: njw@math.ucsd.edu
