Separating Doubly Nonnegative and Completely Positive Matrices

Kurt M. Anstreicher, Dept. of Management Sciences, University of Iowa
(joint work with Hongbo Dong)

INFORMS National Meeting, Austin, November 2010

CP and DNN matrices

Let Sn denote the set of n × n real symmetric matrices, S+n denote the cone of n × n real symmetric positive semidefinite matrices, and Nn denote the cone of symmetric nonnegative n × n matrices.

• The cone of n × n doubly nonnegative (DNN) matrices is then Dn = S+n ∩ Nn.

• The cone of n × n completely positive (CP) matrices is Cn = {X | X = AA^T for some n × k nonnegative matrix A}.

• The dual of Cn is the cone of n × n copositive matrices, C∗n = {X ∈ Sn | y^T X y ≥ 0 ∀ y ∈ ℝn+}.

Clear that

   Cn ⊆ Dn,   D∗n = S+n + Nn ⊆ C∗n,

and these inclusions are in fact strict for n > 4.
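
As a quick illustration of the definitions above, here is a small numpy sketch (the helper name is ours, not from the talk) that tests membership in Dn numerically; note that testing membership in Cn is much harder in general.

import numpy as np

def is_dnn(X, tol=1e-8):
    """Check X in D_n: symmetric, entrywise nonnegative, and PSD (up to tol)."""
    X = np.asarray(X, dtype=float)
    symmetric = np.allclose(X, X.T, atol=tol)
    nonnegative = np.all(X >= -tol)
    psd = np.min(np.linalg.eigvalsh((X + X.T) / 2)) >= -tol
    return symmetric and nonnegative and psd

# A matrix AA^T with A >= 0 is completely positive, hence also DNN.
A = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, 1.0]])
print(is_dnn(A @ A.T))  # True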

Goal: Given a matrix X ∈ Dn \ Cn, separate X from Cn using a matrix V ∈ C∗n having V • X < 0.

Why Bother?

• [Bur09] shows that a broad class of NP-hard problems can be posed as linear optimization problems over Cn.

• Dn is a tractable relaxation of Cn. Expect that the solution of the relaxed problem will be X ∈ Dn \ Cn.

• Note that the least n for which the problem occurs is n = 5.

For X ∈ Sn, let G(X) denote the undirected graph on vertices {1, . . . , n} with edges {{i, j} | i ≠ j, Xij ≠ 0}.

Definition 1. Let G be an undirected graph on n vertices. Then G is called a CP graph if any matrix X ∈ Dn with G(X) = G also has X ∈ Cn.

The main result on CP graphs is the following:

Proposition 1. [KB93] An undirected graph on n vertices is a CP graph if and only if it contains no odd cycle of length 5 or greater.

In [BAD09] it is shown that:

• Extreme rays of D5 are either rank-one matrices in C5, or rank-three "extremely bad" matrices where G(X) is a 5-cycle (every vertex in G(X) has degree two).

• Any such extremely bad matrix can be separated from C5 by a transformation of the Horn matrix

   H :=  [  1  -1   1   1  -1
           -1   1  -1   1   1
            1  -1   1  -1   1
            1   1  -1   1  -1
           -1   1   1  -1   1 ]  ∈ C∗5 \ D∗5.
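
As a numerical sanity check (not a proof), one can sample nonnegative vectors y and confirm that y^T H y ≥ 0; the numpy sketch below does this. Showing that H ∉ D∗5 requires a separate argument and is not attempted here.

import numpy as np

H = np.array([
    [ 1, -1,  1,  1, -1],
    [-1,  1, -1,  1,  1],
    [ 1, -1,  1, -1,  1],
    [ 1,  1, -1,  1, -1],
    [-1,  1,  1, -1,  1],
], dtype=float)

# Sample nonnegative vectors and evaluate the quadratic form y^T H y.
rng = np.random.default_rng(0)
Y = rng.random((100000, 5))            # rows are nonnegative vectors
vals = np.einsum('ij,jk,ik->i', Y, H, Y)
print(vals.min())                      # >= 0: evidence of copositivity

print(np.linalg.eigvalsh(H).min())     # negative: H itself is not PSD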

In [DA10] it is shown that:

• A separation procedure based on a transformed Horn matrix applies to X ∈ D5 \ C5 where X has rank three and G(X) has at least one vertex of degree 2.

• A more general separation procedure applies to any X ∈ D5 \ C5 that is not componentwise strictly positive.

An even more general separation procedure that applies to any X ∈ D5 \ C5 is described in [BD10]. In this talk we will describe the procedure from [DA10] for X ∈ D5 \ C5, X ≯ 0, and its generalization to larger matrices having block structure.

A separation procedure for the 5 × 5 case

Assume that X ∈ D5, X ≯ 0. After a symmetric permutation and diagonal scaling, X may be assumed to have the form

        [ X11   α1   α2 ]
   X =  [ α1^T   1    0 ] ,                            (1)
        [ α2^T   0    1 ]

where X11 ∈ D3.

Theorem 1. [BX04, Theorem 2.1] Let X ∈ D5 have the form (1). Then X ∈ C5 if and only if there are matrices A11 and A22 such that X11 = A11 + A22, and

   [ Aii   αi ]
   [ αi^T   1 ]  ∈ D4,   i = 1, 2.

In [BX04], Theorem 1 is utilized only as a proof mechanism, but we now show that it has algorithmic consequences as well.

Theorem 2. Assume that X ∈ D5 has the form (1). Then X ∈ D5 \ C5 if and only if there is a matrix

        [ V11   β1   β2 ]
   V =  [ β1^T  γ1    0 ]
        [ β2^T   0   γ2 ]

such that

   [ V11   βi ]
   [ βi^T  γi ]  ∈ D∗4,   i = 1, 2,

and V • X < 0.

• Suppose that X ∈ D5 \ C5 and V are as in Theorem 2. If X̃ ∈ C5 is another matrix of the form (1), then Theorem 1 implies that V • X̃ ≥ 0.

• However, we cannot conclude that V ∈ C∗5, because V • X̃ ≥ 0 only holds for X̃ of the form (1), in particular with x̃45 = 0.

• Fortunately, by [HJR05, Theorem 1], V can easily be "completed" to obtain a copositive matrix that still separates X from C5.

Theorem 3. Suppose that X ∈ D5 \ C5 has the form (1), and V satisfies the conditions of Theorem 2. Define

           [ V11   β1   β2 ]
   V(s) =  [ β1^T  γ1    s ] .
           [ β2^T   s   γ2 ]

Then V(s) • X < 0 for any s, and V(s) ∈ C∗5 for s ≥ √(γ1γ2).
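
Because D∗4 = S+4 + N4, the cut-finding problem in Theorem 2 is a small semidefinite program, and Theorem 3 then completes the cut. The sketch below is our own illustration (not the implementation from [DA10]), written with cvxpy; it assumes X is already in the form (1), that a conic solver such as SCS is installed, and that the normalization used to bound the problem is just one reasonable choice. The function name separate_5x5 is hypothetical.

import numpy as np
import cvxpy as cp

def separate_5x5(X, tol=1e-7):
    """Try to find a copositive cut V(s) separating X in D5 \\ C5 from C5.

    Assumes X has the form (1): x44 = x55 = 1 and x45 = 0.
    Returns the completed 5x5 cut, or None if no cut is found.
    """
    X = np.asarray(X, dtype=float)
    X11 = X[:3, :3]
    X1 = X[np.ix_([0, 1, 2, 3], [0, 1, 2, 3])]   # block with alpha_1
    X2 = X[np.ix_([0, 1, 2, 4], [0, 1, 2, 4])]   # block with alpha_2

    # W_i = [V11 beta_i; beta_i^T gamma_i] must lie in D*_4 = S+_4 + N_4.
    W, P, N = [], [], []
    for _ in range(2):
        Pi = cp.Variable((4, 4), PSD=True)
        Ni = cp.Variable((4, 4), symmetric=True)
        W.append(Pi + Ni)
        P.append(Pi)
        N.append(Ni)

    cons = [N[0] >= 0, N[1] >= 0,
            W[0][:3, :3] == W[1][:3, :3],                  # shared V11 block
            cp.trace(P[0]) + cp.trace(P[1]) +
            cp.sum(N[0]) + cp.sum(N[1]) <= 1]              # normalization

    # V . X = W1 . X1 + W2 . X2 - V11 . X11 (since x45 = 0).
    obj = (cp.sum(cp.multiply(W[0], X1)) + cp.sum(cp.multiply(W[1], X2))
           - cp.sum(cp.multiply(W[0][:3, :3], X11)))
    prob = cp.Problem(cp.Minimize(obj), cons)
    prob.solve(solver=cp.SCS)

    if prob.value is None or prob.value > -tol:
        return None                                        # no separating V found

    W1, W2 = W[0].value, W[1].value
    V = np.zeros((5, 5))
    V[:3, :3] = W1[:3, :3]
    V[:3, 3] = V[3, :3] = W1[:3, 3]
    V[:3, 4] = V[4, :3] = W2[:3, 3]
    V[3, 3], V[4, 4] = W1[3, 3], W2[3, 3]
    # Complete to a copositive cut (Theorem 3): set s = sqrt(gamma1 * gamma2).
    V[3, 4] = V[4, 3] = np.sqrt(max(V[3, 3] * V[4, 4], 0.0))
    return V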

Separation for larger matrices with block structure

The procedure for the 5 × 5 case where X ≯ 0 can be generalized to larger matrices with block structure. Assume X has the form

        [ X11    X12   X13   · · ·  X1k ]
        [ X12^T  X22    0    · · ·   0  ]
   X =  [ X13^T   0    X33           0  ] ,            (2)
        [   ⋮     ⋮           ⋱      ⋮  ]
        [ X1k^T   0     0    · · ·  Xkk ]

where k ≥ 3, each Xii is an ni × ni matrix, and n1 + · · · + nk = n.
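
To apply this in practice one must recognize the zero pattern (2) in G(X). Below is a small sketch of one way to do so (our own helper, not from the talk): given a candidate index set for the first block, the remaining blocks are the connected components of G(X) with those vertices removed, and the structure (2) applies when at least two components remain (so that k ≥ 3).

import numpy as np

def blocks_for_structure_2(X, hub, tol=1e-8):
    """Given candidate indices `hub` for block 1, return the remaining index
    groups (connected components of G(X) with the hub vertices removed).  X
    has the form (2) for this partition iff at least 2 groups are found."""
    n = X.shape[0]
    hub_set = set(hub)
    rest = [i for i in range(n) if i not in hub_set]
    adj = {i: [j for j in rest if j != i and abs(X[i, j]) > tol] for i in rest}
    groups, seen = [], set()
    for start in rest:
        if start in seen:
            continue
        comp, stack = [], [start]       # depth-first search from `start`
        while stack:
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            comp.append(v)
            stack.extend(adj[v])
        groups.append(sorted(comp))
    return groups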

Lemma 1. Suppose that X ∈ Dn has the form (2), k ≥ 3, and let

         [ X11    X1i ]
   Xi =  [ X1i^T  Xii ] ,   i = 2, . . . , k.

Then X ∈ Cn if and only if there are matrices Aii, i = 2, . . . , k, such that A22 + · · · + Akk = X11, and

   [ Aii    X1i ]
   [ X1i^T  Xii ]  ∈ C_{n1+ni},   i = 2, . . . , k.

Moreover, if G(Xi) is a CP graph for each i = 2, . . . , k, then the above statement remains true with C_{n1+ni} replaced by D_{n1+ni}.

Theorem 4. Suppose that X ∈ Dn \ Cn has the form (2), where G(Xi) is a CP graph, i = 2, . . . , k. Then there is a matrix V, also of the form (2), such that

   [ V11    V1i ]
   [ V1i^T  Vii ]  ∈ D∗_{n1+ni},   i = 2, . . . , k,

and V • X < 0. Moreover, if γi = diag(Vii)^(1/2) (the componentwise square root of the diagonal of Vii), then the matrix

        [ V11    · · ·  V1k ]
   V̄ =  [   ⋮    · · ·   ⋮  ] ,
        [ V1k^T  · · ·  Vkk ]

where V̄ij = γi γj^T, 2 ≤ i ≠ j ≤ k, has V̄ ∈ C∗n and V̄ • X = V • X < 0.
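
A small numpy sketch of the completion step in Theorem 4 (the helper name is ours): V is passed as a full n × n array that is zero in the blocks (i, j) with 2 ≤ i ≠ j ≤ k, and block_sizes lists n1, . . . , nk.

import numpy as np

def complete_block_cut(V, block_sizes):
    """Fill in Vbar_ij = gamma_i gamma_j^T for 2 <= i != j <= k, where gamma_i
    is the componentwise square root of the diagonal of V_ii (Theorem 4)."""
    V = np.asarray(V, dtype=float)
    offsets = np.concatenate(([0], np.cumsum(block_sizes)))
    k = len(block_sizes)
    gammas = [np.sqrt(np.diag(V[offsets[i]:offsets[i + 1],
                                offsets[i]:offsets[i + 1]]))
              for i in range(k)]
    Vbar = V.copy()
    for i in range(1, k):               # blocks 2..k (0-based indices 1..k-1)
        for j in range(i + 1, k):
            block = np.outer(gammas[i], gammas[j])
            Vbar[offsets[i]:offsets[i + 1], offsets[j]:offsets[j + 1]] = block
            Vbar[offsets[j]:offsets[j + 1], offsets[i]:offsets[i + 1]] = block.T
    return Vbar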

Note that:

• The matrix X may have numerical entries that are small but not exactly zero. We can then apply Lemma 1 to a perturbed matrix X̃ in which entries of X below a specified tolerance are set to zero. If a cut V separating X̃ from Cn is found, then V • X ≈ V • X̃ < 0, and V is very likely to also separate X from Cn.

• Theorem 4 may provide a cut separating a given X ∈ Dn \ Cn even when the sufficient conditions for generating such a cut are not satisfied. In particular, a cut may be found even when the condition that G(Xi) is a CP graph for each i is not satisfied.

A second case where block structure can be used to generate cuts for a matrix X ∈ Dn \ Cn is when X has the form

        [   I      X12     X13    · · ·    X1k     ]
        [ X12^T     I      X23    · · ·    X2k     ]
   X =  [ X13^T   X23^T     I               ⋮      ] ,        (3)
        [   ⋮       ⋮              ⋱     X(k-1)k   ]
        [ X1k^T   X2k^T   · · ·  X(k-1)k^T    I    ]

where k ≥ 2, each Xij is an ni × nj matrix, and n1 + · · · + nk = n.

The structure in (3) corresponds to a partitioning of the vertices {1, 2, . . . , n} into k stable sets in G(X), of size n1, . . . , nk (note that ni = 1 is allowed).
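
Such a partition into stable sets of G(X) can be produced heuristically, e.g., by greedy coloring of G(X), since each color class of a proper coloring is a stable set. A small sketch (our own helper, not from the talk):

import numpy as np

def stable_set_partition(X, tol=1e-8):
    """Greedy coloring of G(X); each returned group is a stable set of G(X)."""
    n = X.shape[0]
    adj = [{j for j in range(n) if j != i and abs(X[i, j]) > tol}
           for i in range(n)]
    order = sorted(range(n), key=lambda i: -len(adj[i]))   # highest degree first
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    k = max(color.values()) + 1
    return [sorted(v for v in range(n) if color[v] == c) for c in range(k)]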

Applications

Example 1: Consider the box-constrained QP problem

   (QPB)   max  x^T Qx + c^T x
           s.t. 0 ≤ x ≤ e.

To describe a CP formulation of QPB, define matrices

   Y =  [ 1  x^T ]
        [ x   X  ] ,

        [ 1  x^T  s^T ]
   Y+ = [ x   X    Z  ] .
        [ s  Z^T   S  ]

By the result of [Bur09], QPB can then be written in the form

   (QPBCP)  max  Q • X + c^T x
            s.t. x + s = e,  Diag(X + 2Z + S) = e,
                 Y+ ∈ C2n+1.
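
Below is a small cvxpy sketch of the DNN relaxation of QPBCP discussed next, with C2n+1 replaced by D2n+1; it assumes a conic solver such as SCS is installed, and the function name is ours.

import numpy as np
import cvxpy as cp

def qpb_dnn_bound(Q, c):
    """DNN relaxation of QPBCP: C_{2n+1} replaced by D_{2n+1}."""
    n = len(c)
    Yp = cp.Variable((2 * n + 1, 2 * n + 1), symmetric=True)
    x = Yp[0, 1:n + 1]
    s = Yp[0, n + 1:]
    X = Yp[1:n + 1, 1:n + 1]
    Z = Yp[1:n + 1, n + 1:]
    S = Yp[n + 1:, n + 1:]
    cons = [Yp >> 0, Yp >= 0,                 # Y+ in D_{2n+1}
            Yp[0, 0] == 1,
            x + s == np.ones(n),
            cp.diag(X) + 2 * cp.diag(Z) + cp.diag(S) == np.ones(n)]
    obj = cp.Maximize(cp.sum(cp.multiply(Q, X)) + c @ x)
    prob = cp.Problem(obj, cons)
    prob.solve(solver=cp.SCS)
    return prob.value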

Replacing C2n+1 with D2n+1 gives a tractable DNN relaxation.

• It can be shown that the DNN relaxation is equivalent to the "SDP+RLT" relaxation, and for n = 2 the relaxed problem is equivalent to QPB [AB10].

• Constraints from the Boolean Quadric Polytope (BQP) are valid for the off-diagonal components of X [BL09]. For n = 3, the BQP is completely determined by triangle inequalities and RLT constraints. However, there can still be a gap when using the SDP+RLT+TRI relaxation.

• An example from [BL09] with n = 3 has optimal value 1.0 for QPB and value 1.093 for the SDP+RLT+TRI relaxation. The solution matrix Y+ has a 5 × 5 principal submatrix that is not strictly positive and is not CP. We can obtain a cut from Theorem 3, re-solve the problem, and repeat.

Figure 1: Gap to optimal value for Burer-Letchford QPB problem (n = 3). (Plot of the gap to the optimal value, on a log scale from 1.00E-09 to 1.00E+00, versus the number of CP cuts added, from 0 to 25.)

Example 2: Let A be the adjacency matrix of a graph G on n vertices, and let α be the maximum size of a stable set in G. It is known [dKP02] that

   α^(-1) = min{ (I + A) • X : ee^T • X = 1, X ∈ Cn }.        (4)

Relaxing Cn to Dn results in the Lovász-Schrijver bound

   (ϑ′)^(-1) = min{ (I + A) • X : ee^T • X = 1, X ∈ Dn }.      (5)

Let G12 be the complement of the graph corresponding to the vertices of a regular icosahedron [BdK02]. Then α = 3 and ϑ′ ≈ 3.24.

• Using the cone K^1_12 to approximate the dual of (4) provides no improvement [BdK02].

• For the solution matrix X ∈ D12 from (5), we cannot find a cut based on the first block structure (2). However, we can find a cut based on (3). Adding this cut and re-solving, the gap to 1/α = 1/3 is approximately 2 × 10^(-8).

References

[AB10] Kurt M. Anstreicher and Samuel Burer, Computable representations for convex hulls of low-dimensional quadratic forms, Math. Prog. B 124 (2010), 33–43.

[BAD09] Samuel Burer, Kurt M. Anstreicher, and Mirjam Dür, The difference between 5 × 5 doubly nonnegative and completely positive matrices, Linear Algebra Appl. 431 (2009), 1539–1552.

[BD10] Samuel Burer and Hongbo Dong, Separation and relaxation for cones of quadratic forms, Dept. of Management Sciences, University of Iowa (2010).

[BdK02] Immanuel M. Bomze and Etienne de Klerk, Solving standard quadratic optimization problems via linear, semidefinite and copositive programming, J. Global Optim. 24 (2002), 163–185.

[BL09] Samuel Burer and Adam N. Letchford, On nonconvex quadratic programming with box constraints, SIAM J. Optim. 20 (2009), 1073–1089.

[Bur09] Samuel Burer, On the copositive representation of binary and continuous nonconvex quadratic programs, Math. Prog. 120 (2009), 479–495.

[BX04] Abraham Berman and Changqing Xu, 5 × 5 completely positive matrices, Linear Algebra Appl. 393 (2004), 55–71.

[DA10] Hongbo Dong and Kurt Anstreicher, Separating doubly nonnegative and completely positive matrices, Dept. of Management Sciences, University of Iowa (2010).

[dKP02] Etienne de Klerk and Dmitrii V. Pasechnik, Approximation of the stability number of a graph via copositive programming, SIAM J. Optim. 12 (2002), 875–892.

[HJR05] Leslie Hogben, Charles R. Johnson, and Robert Reams, The copositive completion problem, Linear Algebra Appl. 408 (2005), 207–211.

[KB93] Natalia Kogan and Abraham Berman, Characterization of completely positive graphs, Discrete Math. 114 (1993), 297–304.