1
CRM-DIMACS Workshop on Mixed-Integer Nonlinear Programming
October 2019, Montreal
On Local Minima ofCubic Polynomials
Amir Ali AhmadiPrinceton, ORFE
Affiliated member of PACM, COS, MAE, CSML
Joint work with
Jeffrey ZhangPrinceton, ORFE
Deciding local minimality
2
Consider the optimization problem
Given a point π₯, decide if it is a local minimum.
minπ₯ββπ
π(π₯)
π₯ β Ξ©
Why local minima?
- Global minima are often intractable- Recent interest in local minima, particularly in machine
learning applications- Existing notions that local minima are βeasier to findβ or are
sufficient for applications- Formal understanding of local minima is desirable
Local minima
3
A point π₯ is a local minimum of
if there exists a ball of radius π > 0 such that π π₯ β€ π(π₯) for all π₯ β π΅π π₯ β© Ξ©.
π₯ is a strict local minimum if π π₯ < π(π₯) for all π₯ β π΅π π₯ β© Ξ©\ π₯.
minπ₯ββπ
π(π₯)
ππ π₯ β₯ 0, π = 1, β¦ , π
Our focus: polynomial optimization problems
π is a polynomial, Ξ© is defined by polynomial inequalities.
minπ₯ββπ
π(π₯)
π₯ β Ξ©
Unconstrained quadratic optimization Linear Programming
4
Known tractable cases
Check coefficients of characteristic polynomialCheck if π₯ is optimal. If it is, add πππ₯ = ππ π₯ as a constraint, and solve sequence of LPs
minπ₯ββπ
1
2π₯πππ₯ + πππ₯
π₯ is a local minimum if and only ifπ π₯ + π = 0
π β½ 0
π₯ is a strict local minimum if and only if
π π₯ + π = 0π β» 0
minπ₯ββπ
πππ₯
π΄π₯ = ππ₯ β₯ 0
π₯ is a local minimum if and only if it is optimal.
π₯ is a strict local minimum if and only it is the unique optimal solution.
Compute π leading principal minorsπ΄ π₯ = π, π₯ β₯ 0, and ππ π₯ is attainable in the dual
Unconstrained quartic optimization Quadratic programming
π is a quartic polynomial
Known intractable cases
5
A matrix π is copositive if and only if 0 is a local minimum of
π₯12
β¦π₯π
2
π
ππ₯1
2
β¦π₯π
2
π
minπ₯ββπ
1
2π₯πππ₯ + πππ₯
π΄π₯ β₯ π
minπ₯ββπ
π(π₯)
minπ₯ββπ
π₯πππ₯
π₯ β₯ 0
A matrix π is copositive if π₯πππ₯ β₯ 0, βπ₯ β₯ 0
or of
Strict local minimality also NP-hard.
Unconstrained quartic optimization Quadratic programming
π is a quartic polynomial
Summary of prior literature
6
minπ₯ββπ
1
2π₯πππ₯ + πππ₯
π΄π₯ β₯ π
minπ₯ββπ
π(π₯)
Open cases?
Unconstrained cubic minimization
Unconstrained quadratic optimization Linear Programming
minπ₯ββπ
1
2π₯πππ₯ + πππ₯
minπ₯ββπ
πππ₯
π΄π₯ = ππ₯ β₯ 0
Poly-time (both for local min and strict local min)
NP-hard (both for local min and strict local min)
Om
ar Kh
ayyam (1
04
8-1
13
1)
Outline
β’ Part I: Testing local minimality of a given pointfor a cubic polynomial
7
β’ Part 2: Finding a local minimum of a cubic polynomial
Classical optimality conditions
8
First Order Necessary Condition (FONC) Second Order Necessary Condition (SONC)
Second Order Sufficient Condition (SOSC):
is a local minimum β no descent directions at
Unlike quadratics, not sufficient for cubic polynomials
π π₯1, π₯2 = π₯22 β π₯1
2π₯2 --+
+
π π
π₯ is a local minimum β βπ π₯ = 0 π₯ is a local minimum β β2π π₯ β½ 0
FONC + β2π π₯ β» 0 β π₯ is a (strict) local minimum
A direction π is a descent direction for π at π₯ if for some πΌβ > 0,π π₯ + πΌπ < π( π₯) for all πΌ β (0, πΌβ)
Necessary and sufficient condition for local minima
9
Theorem (Third Order Condition, TOC)
Let p be a cubic polynomial and suppose satisfies FONC and SONC. Then is a local minimum of π if and only if
π β π(π»2π( )) β π»π3 π = 0
Moreover, this condition can be checked in polynomial time.
π(π»2π( π₯)) is the null space of Hessian at π₯
π3 is the cubic component of π
π₯
π₯
π₯
Example: origin a local minimum
10
π π₯1, π₯2 = π₯22 + π₯1π₯2
2
π»π 0,0 =00
π»2π 0,0 =0 00 2
π»π3(π₯1, π₯2) =2π₯2
2
2π₯1π₯2
π β π(π»2π π₯ ) β π»π3 π = 0?
π»π3(πΌ, 0) =00-
-
+
+
Example: origin not a local minimum
11
π π₯1, π₯2 = π₯22 β π₯1
2π₯2
π»π 0,0 =00
π»2π 0,0 =0 00 2
π»π3(π₯1, π₯2) =β2π₯1π₯2
βπ₯12
π β π(π»2π π₯ ) β π»π3 π = 0?
π»π3(πΌ, 0) =0
βπΌ2
+
+
--
A more natural condition?
12
π β π(π»2π π₯ ) = 0 β π3 π = 0 Necessary for πΆ3 functions (where π3
would be the cubic component of the Taylor expansion). βThird Order Necessary Conditionβ (TONC)
π π₯1, π₯2 = π₯22 β π₯1
2π₯2
Not sufficient for local optimality, even for cubics
Guarantees no descent directions for cubic polynomials
Does not guarantee no parabolas of descent
Is TOC necessary for general functions?
13
π π₯1, π₯2 = π₯14 + π₯1
2 + π₯22
π β π π»2π π₯ β π»π3 π = 0?
π»π3 =2π₯2
2
4π₯1π₯2
π»π3(0,1) =20
π»π 0,0 =00
π»2π 0,0 =0 00 2
Easy to see not sufficient for higher degree polynomials (e.g., π π₯ = π₯5),but is it necessary? No!
A characterization of local minima for cubics
14
Theorem (Third Order Condition, TOC)
Let p be a cubic polynomial and suppose satisfies FONC and SONC. Then is a local minimum of π if and only if
π β π(π»2π( )) β π»π3 π = 0
Moreover, this condition can be checked in polynomial time.
π₯
π₯
π₯
Proof of characterization of local minima (1/3)
15
π π₯ + ππ£ = π π₯ + ππ»π π₯ ππ£ +1
2π2π£ππ»2π π₯ π£ +
Taylor expansion of cubic polynomials
π3π3(π£)π(π3)
π π₯ + πΌπ + π½π§ = π π₯ + π»π π₯ π πΌπ + π½π§
+1
2πΌπ + π½π§ ππ»2π π₯ πΌπ + π½π§
+π3 πΌπ + π½π§
Suppose π₯ satisfies FONC, SONC. We show π₯ is a local min iff TOC.For any unit vectors π in the null space of π»2π( π₯) and π§ in the range of π»2π π₯ ,
0
π½2π§ππ»2π π₯ π§
π3 πΌπ + π½π§ = π3 πΌπ + π½π»π3 πΌπ ππ§ +1
2π½2π§ππ»2π3 πΌπ π§ + π½3π3(π§)
0
π π₯ + πΌπ + π½π§ β π( π₯) =
1
2π½2π§ππ»2π π₯ π§ + πΌ2π½π»π3 π ππ§ +
1
2πΌπ½2π§ππ»2π3 π π§ + π½3π3(π§)
Proof of characterization (2/3) [Sufficiency]
16
π π₯ + πΌπ + π½π§ β π( π₯) =
1
2π½2π§ππ»2π π₯ π§ + πΌ2π½π»π3 π ππ§ +
1
2πΌπ½2π§ππ»2π3 π π§ + π½3π3(π§)
π»π3 π = 0 β π β π(π»2π π₯ ) β local minimum
1
2π½2π§ππ»2π π₯ π§ +
1
2πΌπ½2π§ππ»2π3 π π§ + π½3π3(π§)
1
2π½2π§ππ»2π π₯ π§ +
1
2πΌπ½2π§ππ»2π3 π π§ + π½3π3(π§)
π½2(1
2π§ππ»2π π₯ π§ +
1
2πΌπ§ππ»2π3 π π§ + π½π3 π§ )
β₯ smallest nonzero eigenvalue
Upper bounded in abs. value
Therefore, βπΌβ, π½β such that if πΌ < πΌβ, π½ < π½β,this expression is nonnegative
Proof of characterization (3/3) [Necessity]
17
Pick a sequence π½π β 0, πΌπ β π½π
1
3
1
2π½π
2 π§ππ»2π π₯ π§ + π½π5/3
π»π3 π π π§ +1
2π½π
7/3 π§ππ»2π3 π π§ + π½π
3π3( π§)
π π₯ + πΌπ + π½π§ β π( π₯) =
1
2π½2π§ππ»2π π₯ π§ + πΌ2π½π»π3 π ππ§ +
1
2πΌπ½2π§ππ»2π3 π π§ + π½3π3(π§)
local minimum β π»π3 π = 0 βπ β π π»2π π₯
Otherwise, for the sake of contradiction pick π β π π»2π π₯
such that π»π3( π) β 0 and pick π§ = βπ»π3( π)
1
2π½2 π§ππ»2π π₯ π§ + πΌ2π½π»π3
ππ
π§ +1
2πΌπ½2 π§ππ»2π3
π π§ + π½3π3( π§)
< 0
Characterization of strict local minima
18
Proposition
is a strict local minimum of a cubic polynomial π if and only if
π π₯ + πΌπ = π π₯ + πΌπ»π π₯ ππ +1
2πΌ2πππ»2π π₯ π + πΌ3π3 π .
Proof. Only need to show π₯ strict local min β β2π π₯ β» 0.
Otherwise for the sake of contradiction pick π β π π»2π π₯ .
π₯
π»π π₯ = 0 (FONC)
π»2π π₯ β» 0 (SOSC)
(Note: SOSC is not necessary in general: π π₯ = π₯4.)
Observation: If a cubic has a strict local min, then that is the unique local min.
Proof.
Checking local minimality in polynomial time
19
β’ Input: π, π₯
β’ Compute gradient and Hessian of π at π₯
β’ Check FONC and SONC
β’ Compute gradient of π3, evaluated on the null space of π»2π π₯
π»π3 π₯ =
ππ3
ππ₯1( π₯)
β¦
ππ3
ππ₯π( π₯)
β
ππ3
ππ₯1(πΌ1π£1 + πΌ2π£2 + β― + πΌππ£π)
β¦
ππ3
ππ₯π(πΌ1π£1 + πΌ2π£2 + β― + πΌππ£π)
π1(πΌ1, β¦ , ππ)
β¦
ππ πΌ1, β¦ , ππ
β’ All coefficients of all ππ must be zero
β’ Compute a basis {π£1, π£2, β¦ , π£π} for null space of π»2π π₯ (solving linear systems)
For strict local minima, check FONC and SOSC (leading π principal minors must be positive)
Outline
β’ Part I: Testing local minimality of a given pointfor a cubic polynomial
20
β’ Part 2: Finding a local minimum of a cubic polynomial
Finding local minima
21
Given a cubic polynomial π, can we efficiently find a local minimum of π?
Unfortunatelyβ¦
Theorem
Deciding if a cubic polynomial has a critical point is strongly NP-hard.
Reduction from MAXCUT
Given a graph πΊ = (π, πΈ), partition the vertices into two sets such that as many edges as possible are between vertices in opposite sets
Letβs start with a βsimplerβ question. Can we efficiently find a critical point of π?
MAXCUT (decision version)
22
Is there a cut of size π?
Quadratic satisfiability
1 β π₯π2 = 0, π = 1, β¦ , π
1
4
π,π βπΈ
(1 β π₯ππ₯π) = π
Critical points of a cubic polynomial
π π₯1, β¦ , π₯π, π¦0, π¦1, β¦ , π¦π = π¦π
1
4
π,π βπΈ
1 β π₯ππ₯π β π +
π=1
π
π¦π(1 β π₯π2)
Critical points of a cubic polynomial
23
π»π π₯, π¦ =
ππ
ππ₯π
ππ
ππ¦0
ππ
ππ¦π
=
βπ¦0
4
π,π βπΈ
π₯π β 2π₯ππ¦π
1
4
π,π βπΈ
1 β π₯ππ₯π β π
1 β π₯π2
π π₯, π¦ = π¦π
1
4
π,π βπΈ
1 β π₯ππ₯π β π +
π=1
π
π¦π(1 β π₯π2)
Any cut of size π β critical point (π₯ = cut, π¦ = 0)
Any critical point β cut of size π (π₯ β cut)
But this doesnβt necessarily mean finding local minima is NP-hard.First some geometryβ¦
24
Some geometric properties of local minima
Theorem
The local minima of any cubic polynomial π form a convex set.
Lemma
If is a local minimum of a cubic polynomial π and π β π(β2π ), then for any πΌ,
π»π +πΌπ = 0
Proof (of theorem).Let π₯ and π¦ be local minima. Note π is constant on the line between π₯ and π¦. Consider π§ = π₯ + πΌ(π¦ β π₯)
FONC: π¦ β π₯ β π π»2π π₯ + Lemma
SONC: Convex combination of PSD matrices is PSD
TOC: π(π»2π((1 β πΌ)π₯ + πΌπ¦))= π( 1 β πΌ π»2π π₯ + πΌπ»2π(π¦))
= π π»2π π₯ β© π π»2π π¦ β π β2π π₯
π₯
π₯
π₯
Convexity of the set of local minima
25
Critical points
Local minima
π π₯1, π₯2 = π₯13 + 3π₯1
2π₯2 + 3π₯1π₯22 + π₯2
3 β π₯1 β π₯2
π convex
Set of local minima not necessarily polyhedral
26
π π₯1, π₯2, π₯3, π₯4 =1
2π₯1
2π₯32 + 2π₯1π₯3π₯4 +
1
2π₯1π₯4
2 β1
2π₯2π₯3
2
+π₯2π₯3π₯4 + 2π₯2π₯42 + π₯3
2 + π₯42
π₯3 = 0, π₯4 = 0 β©2 + π₯1 β π₯2 2π₯1 + π₯2
2π₯1 + π₯2 2 + 2π₯1 + 4π₯2β» 0
Convexity region
27
Definition (Convexity region)
The convexity region of a polynomial π is the setπ₯ β βπ β2π π₯ β½ 0}
β’ The convexity region of a cubic polynomial is a spectrahedron (we call it a βCH-spectrahedronβ)
β’ Any spectrahedron π₯ β βπ π=1π π΄ππ₯π + π β½ 0} with π΄π in βπΓπ
is the shadow of a CH-spectrahedron in dimension π + π.
β’ Not every spectrahedron is a CH-spectrahedron:
28
A βconvexβ optimization problem
Theorem
If a cubic polynomial π has a local minimum, the solution set of the following optimization problem is the closure of its local minima.
In particular, the optimal value of this βconvexβ problem gives the value of π at any local minimum.
minπ₯
π(π₯)
β2π π₯ β½ 0
Note: the value of π at local minima must be the same.
Very rough intuition:
29
A βconvexβ optimization problem proof (1/2)
local minimum
βbetter pointβ
Local minima are optimal to
minπ₯
π(π₯)
β2π π₯ β½ 0
30
local minimum
optimal point
FONC: βSONC: β
TOC: π β2π π§ β π(β2π π₯ )
A βconvexβ optimization problem proof (2/2)
π₯π¦ π§
Solutions tomin
π₯π(π₯)
β2π π₯ β½ 0
are in closure of local minima
π π₯ = π(π¦)
31
Sum of squares polynomials
Sum of squares polynomials
32
β’ A polynomial π is a sum of squares (sos) if it can be written as
π(π₯) = ππ2(π₯)
β’ Any sos polynomial is nonnegative
β’ Imposing that a polynomial is sos is a semidefinite constraint
β’ A matrix of polynomials π(π₯) is an sos-matrix if the polynomial π¦ππ(π₯)π¦ is sos, or equivalently if π π₯ = π π₯ π π₯ π
Sum of squares relaxations
minπ₯
π(π₯) = maxπΎ
πΎ
π π₯ β πΎ is a nonnegative polynomial
sos
Find lower bounds on the optimal value of a polynomial optimization problem
β₯
33
Sum of squares relaxations for constrained problems
minπ₯ββπ
π(π₯)
ππ π₯ β₯ 0, π = 1, β¦ , π
maxπΎ,ππ π ππ
πΎ
π π₯ β πΎ = π0(π₯) +
π=1
π
ππ(π₯)ππ(π₯)
β€
Lasserre hierarchy:
For ππ of fixed degree, this is an SDP of size polynomial in data
As deg ππ β β, the optimal value of the sos program will converge to the true optimal value (under a mild assumption)
Putinarβs Psatz
34
Theorem
If π has a local minimum, the first level of this sos relaxation (i.e., when deg π = deg π = 2) is tight.
Sos relaxation
minπ₯
π(π₯)
β2π π₯ β½ 0
maxπ(π₯),π(π₯)
πΎ
π(π₯) β πΎ = π(π₯) + ππ(π»2π π₯ π π₯ )
π is sosπ is an sos-matrix
β€
Proof.Produce an algebraic identity that attains the best possible value.For any local minimum π₯,
π π₯ β πβ =1
3π₯ β π₯ ππ»2π π₯ π₯ β π₯ + ππ(π»2π π₯
1
6π₯ β π₯ π₯ β π₯ π )
Value at local min π π₯
sos
π(π₯)sos-matrix
How to extract a local min itself?
35
π(π₯) + ππ(π»2π π₯ π π₯ )
Idea: Find the zeros of
Solve:min
π₯0
ππ π»2π π₯ π π₯ = 0
π π₯ = 0
π»2π π₯ β½ 0
Nonlinear constraintsβ¦
or are they?
Recovering a local minimum
36
ππ π»2π π₯ π π₯ = 0? (cubic equation)
π»2π π₯ β½ 0 is a linear matrix inequality β
π is an sos quadratic, so the solutions to π π₯ = 0 can be found by solving a system of linear equations
β
Observation:
Since π is a quadratic sos matrix, π(π₯) = π π₯ π π₯ π, where π (π₯) is affine
More geometryβ¦
minπ₯
0
ππ π»2π π₯ π π₯ = 0
π π₯ = 0
π»2π π₯ β½ 0
ππ β2π π₯ π π₯ = 0 β β2π π₯ π π₯ = 0 (quadratic equation)
π π π₯ β π β2π π₯ , βπ
37
Relative Interior
Definition (Relative Interior)
The relative interior of a nonempty convex set π is the setπ₯ β π βπ¦ β π, βπΌ > 1, π¦ + πΌ π¦ β π₯ β π}
More geometry
38
Lemma
Lemma
π₯
π₯
Convex combination of PSD matrices: π β2π π₯ β π β2π π₯
Let π₯ be a local minimum of a cubic polynomial π. Then for any π₯ β βπ
and π β π(β2π π₯ ), ππβ2π π₯ π = 0.
Let π₯ be a local minimum of a cubic polynomial π. Then for any π₯ in the
relative interior of the convexity region of π, π π»2π( π₯) = π π»2π π₯ .
Proof (of second lemma).
First Lemma + β2π π₯ β½ 0: π β2π π₯ β π β2π π₯
Rewriting the cubic equation
39
β’ Find any point π₯ in the relative interior of the convexity region
β’ Find a basis {π£1, π£2, β¦ , π£π} for π(π»2π π₯ )
β’ Decompose π π₯ = π π₯ π π₯ π
β’ Impose π π π₯ β π π»2π π₯ βπ as π π π₯ = π=1π πΌππ£π βπ
(linear constraint!)
β
What does this buy us?
Goal: Impose ππ π»2π π₯ π π₯ = 0
For any π₯ such that β2π π₯ β½ 0, this is equivalent to
imposing π π π₯ β π β2π π₯ βπ β ππ π»2π π₯ π π₯ = 0
An SDP!
40
minπ₯
0
ππ π»2π π₯ π π₯ = 0
π π₯ = 0
π»2π π₯ β½ 0
Theorem
The relative interior of the feasible set of this SDP is the set of local minima of π.
Rewritable as an SDP!
Two steps require a point in the relative interior of a set
How can we get a point in the relative interior of a set?
Finding a point in the relative interior
41
Definition (Relative Interior)
The relative interior of a nonempty convex set π is the setπ₯ β π βπ¦ β π, βπΌ > 0, π₯ + πΌ π¦ β π₯ β π}
Algorithm for finding a local minimum
β’ Find an sos-certified lower bound for value at any local minimum
β’ Find any point in the relative interior of the convexity region
β’ Find a basis for the null space of the Hessian of any local minimum
β’ Find relative interior solution of equivalent SDP
42
Overall result
43
Theorem
Deciding if a cubic polynomial π has a local minimum, and finding one if it does, can be done in polynomially many calls to an SDP blackbox, Choleskly decompositions, and linear system solves of polynomial size.
SDP Blackbox Optimal value
Can be used to recover solutions
Why the blackbox assumption?
44
Local minima can be irrational:
π π₯ = π₯3 β 6π₯
π₯ = 2 is the unique local minimum
Even if there are rational local minima, they can all have size exponential in the input:
π΄ π₯ =
π₯1 22 1
π₯2 π₯1
π₯1 1
β¦
β¦ β± β¦
β¦π₯π π₯πβ1
π₯πβ1 1
π π₯ = π¦ππ΄ π₯ π¦, where Local minima:π¦ = 0 β© {π΄ π₯ β» 0}
π₯1 > 4, π₯2 > 16, β¦ , π₯π > 22π
Summary
β’ Given a cubic polynomial π and a point π₯, checking whether π₯ is a local minimum of πcan be done in polynomial time in the Turing model
β’ It is strongly NP-hard to test if a cubic polynomial has a critical point
β’ Given a cubic polynomial π, we can test if there is a local minimum by solving polynomially many SDPs of polynomial size
45
Thank you!
46
Want to know more? aaa.princeton.eduprinceton.edu/~jeffz
Top Related