
Bad semidefinite programs: they all look the same

Gábor Pataki Department of Statistics and Operations Research, University of North Carolina at Chapel Hill
Abstract

Conic linear programs, among them semidefinite programs, often behave pathologically: the optimal values of the primal and dual programs may differ, and may not be attained. We present a novel analysis of these pathological behaviors. We call a conic linear system $\mathcal{A}x\leq_{K}b$ badly behaved if the value of $\sup\{\langle c,x\rangle\mid\mathcal{A}x\leq_{K}b\}$ is finite but the dual program has no solution with the same value for some $c$. We describe simple and intuitive geometric characterizations of badly behaved conic linear systems. Our main motivation is the striking similarity of badly behaved semidefinite systems in the literature; we characterize such systems by certain excluded matrices, which are easy to spot in all published examples.

We show how to transform semidefinite systems into a canonical form, which allows us to easily verify whether they are badly behaved. We prove several other structural results about badly behaved semidefinite systems; for example, we show that they are in ${\cal NP}\cap\text{co-}{\cal NP}$ in the real number model of computing. As a byproduct, we prove that all linear maps that act on symmetric matrices can be brought into a canonical form; this canonical form allows us to easily check whether the image of the semidefinite cone under the given linear map is closed.

Key words: conic linear programming; semidefinite programming; duality; closedness of the linear image of a closed convex cone; pathological semidefinite programs

MSC 2010 subject classification: Primary: 90C46, 49N15; secondary: 52A40

OR/MS subject classification: Primary: convexity; secondary: programming-nonlinear-theory

1 Introduction

Many problems in engineering, combinatorial optimization, machine learning, and related fields can be formulated as the primal-dual pair of conic linear programs

\begin{array}{lrlcrlr} & \sup & \langle c,x\rangle & & \inf & \langle b,y\rangle & \\ (P_{c}) & \text{s.t.} & \mathcal{A}x\leq_{K}b & & \text{s.t.} & y\geq_{K^{*}}0 & (D_{c})\\ & & & & & \mathcal{A}^{*}y=c, & \end{array}

where $\mathcal{A}:X\rightarrow Y$ is a linear map between finite dimensional Euclidean spaces $X$ and $Y$, $\mathcal{A}^{*}$ is its adjoint, $K\subseteq Y$ is a closed, convex cone, $K^{*}$ is its dual cone, and $s\leq_{K}t$ means $t-s\in K$. Note that the subscript $c$ refers to the objective of the primal problem.

Problems $(P_{c})$ and $(D_{c})$ generalize linear programs and share some of the duality theory of linear programming. For instance, a pair of feasible solutions always satisfies the weak duality inequality $\langle c,x\rangle\leq\langle b,y\rangle$. However, in conic linear programming pathological phenomena occur: the optimal values of $(P_{c})$ and of $(D_{c})$ may differ, and they may not be attained.

In particular, semidefinite programs (SDPs) and second order conic programs (SOCPs) — probably the most useful and pervasive conic linear programs — often behave pathologically: for a variety of examples we refer to the textbooks [6, 37, 11, 3, 42], surveys [44, 41, 27], and research papers [34, 1, 43]. Pathological conic LPs are both theoretically interesting and often difficult, or even impossible, to solve numerically.

These pathologies arise because the linear image of a closed convex cone is not always closed. For recent studies about when such sets are closed (or not), see e.g., [4, 2, 29]. Three approaches (which we review in detail below) can help to avoid or remedy the pathologies: one can impose a constraint qualification (CQ), such as Slater’s condition; one can regularize $(P_{c})$-$(D_{c})$ using a facial reduction algorithm [16, 46, 31]; or one can write an extended dual [34, 24], which uses extra variables and constraints. However, such CQs often do not hold, and neither facial reduction algorithms nor extended duals can help solve all pathological instances.

We started this research by observing that pathological SDPs in the literature look curiously similar, and one of our main goals is to find the root cause of the similarity. We focus on the system underlying $(P_{c})$ and call

\mathcal{A}x\leq_{K}b \qquad (P)

badly behaved if there exists $c$ such that $(D_{c})$ either does not attain its value or its value differs from the value of $(P_{c})$. We call $(P)$ well behaved if it is not badly behaved.

Main contributions of the paper:

  1. (1)

    In Theorem 1 of Section 2 we characterize when the system $(P)$ is badly or well behaved. At the heart of Theorem 1 is a simple geometric condition that involves the set of feasible directions at $z\in K$, i.e.,

    \{\,y\mid z+\epsilon y\in K\ \text{for some}\ \epsilon>0\,\},

    and $z$ is chosen as a certain slack in $(P)$.

    In Theorem 1 we unify two well-known (and seemingly unrelated) conditions for $(P)$ to be well behaved: the first is Slater’s condition, and the second requires $K$ to be polyhedral.

    Theorem 1 relies on a result on the closedness of the linear image of a closed convex cone from [29] (which we recap in Lemma 1).

  2. (2)

    In Section 3 we characterize when a semidefinite system

    \sum_{i=1}^{m}x_{i}A_{i}\preceq B \qquad (P_{SD})

    is badly behaved via certain excluded matrices. We assume (with no loss of generality) that a maximum rank positive semidefinite matrix of the form $B-\sum_{i}x_{i}A_{i}$ is

    Z=\begin{pmatrix}I_{r}&0\\ 0&0\end{pmatrix}\ \text{for some}\ 0\leq r\leq n. \qquad (1.1)

    We prove (in Theorem 2) that $(P_{SD})$ is badly behaved iff there is a matrix $V$ which is a linear combination of the $A_{i}$ and $B$ of the form

    V=\begin{pmatrix}V_{11}&V_{12}\\ V_{12}^{T}&V_{22}\end{pmatrix}, \qquad (1.2)

    where $V_{11}$ is $r\times r$, $V_{22}$ is positive semidefinite, and $\mathcal{R}(V_{12}^{T})\not\subseteq\mathcal{R}(V_{22})$. Here $\mathcal{R}(\cdot)$ stands for rangespace.

    The excluded matrices $Z$ and $V$ are easy to spot in all published badly behaved semidefinite systems (we counted about 20 in the above references). The simplest such system is

    x_{1}\begin{pmatrix}\alpha&1\\ 1&0\end{pmatrix}\preceq\begin{pmatrix}1&0\\ 0&0\end{pmatrix}, \qquad (1.3)

    where $\alpha$ is any real number: in (1.3) the right hand side serves as $Z$ and the matrix on the left hand side serves as $V$.

    Theorem 3 similarly characterizes well behaved semidefinite systems.

    Theorems 2 and 3 follow from Theorem 1 and from Lemma 3, which characterizes the set of feasible directions and related sets in the semidefinite cone.

  3. (3)

    How do we verify that $(P_{SD})$ is badly or well behaved? In other words, how do we convince a nonexpert reader that an instance of $(P_{SD})$ is badly or well behaved? Theorems 4 and 5 in Section 4 show how to transform $(P_{SD})$ into an equivalent standard system, whose bad or good behavior is self-evident. The transformation is surprisingly simple, as it relies mostly on elementary row operations — the same operations that are used in Gaussian elimination. A natural analogy (and our inspiration) is how one transforms an infeasible linear system of equations $Ax=b$ to derive the obviously infeasible equation $\langle 0,x\rangle=1$.

    Here we also prove that i) badly/well behaved semidefinite systems are in ${\cal NP}\cap\text{co-}{\cal NP}$ in the real number model of computing; ii) for a well behaved semidefinite system we can restrict optimal dual matrices to be block-diagonal; and iii) roughly speaking, we can partition a well behaved system into a strictly feasible part and a linear part.

    As a byproduct, we prove that all linear maps that act on symmetric matrices can be brought into a canonical form; this canonical form allows us to easily check whether the image of the semidefinite cone under the given linear map is closed.

  4. (4)

    In Section 5 we sketch analogous results for conic linear programs and SDPs in the dual form, and prove that all badly behaved semidefinite systems can be reduced, by a sequence of natural operations, to the system (1.3).

  5. (5)

    Since most examples in the main body of the paper have at most three variables and $3\times 3$ matrices, in Appendix A we give a larger illustrative example with four variables and $4\times 4$ matrices. We prove other technical results in Appendix B.

We illustrate our results by many examples. The only technical proofs in the main body of the paper are those of Theorem 1 and of Lemma 5, and these can be safely skipped at first reading.

Related work A fundamental question in convex analysis is whether the linear image of a closed convex cone is closed. In this paper we rely on Theorem 1.1 from [29], which we summarize in Lemma 1. This result gives several necessary conditions, and exact characterizations for the class of nice cones. We refer to Bauschke and Borwein [4] for the closedness of the continuous image of a closed convex cone; to Auslender [2] for the closedness of the linear image of an arbitrary closed convex set; and to Waksman and Epelman [47] for another related result. For perturbation results we refer to Borwein and Moors [13, 14]; the latter paper shows that the set of linear maps under which the image of a closed convex cone is not closed is small both in terms of measure and category. For a more general problem, whether the intersection of an infinite sequence of nested sets is nonempty, Bertsekas and Tseng [7] gave a sufficient condition. Their characterization is in terms of a certain retractiveness property of the set sequence.

We say that $(D_{c})$ is a strong dual of $(P_{c})$ if they have the same value, and $(D_{c})$ attains this value when it is finite. Thus in general $(D_{c})$ is not a strong dual of $(P_{c})$. Using this terminology, $(P)$ is well behaved exactly if $(D_{c})$ is a strong dual of $(P_{c})$ for all $c$. We say that $(P)$ satisfies Slater’s condition if there is $x$ such that $b-\mathcal{A}x$ is in the relative interior of $K$; if this condition holds, then $(P)$ is well behaved.

Ramana in [34] proposed a strong dual for SDPs, which uses polynomially many extra variables and constraints. His result implies that semidefinite feasibility is in ${\cal NP}\cap\text{co-}{\cal NP}$ in the real number model of computing. Klep and Schweighofer in [24] constructed a Ramana-type strong dual for SDPs, which, interestingly, is based on ideas from algebraic geometry, rather than from convex analysis.

The facial reduction algorithm of Borwein and Wolkowicz in [16, 15] converts $(P)$ into a system that satisfies Slater’s condition, and is hence well behaved. The algorithm relies on a sequence of reduction steps. For more recent, simplified facial reduction algorithms, see Waki and Muramatsu [46] and Pataki [31]. Ramana, Tunçel, and Wolkowicz in [36] proved the correctness of Ramana’s dual from the facial reduction algorithm of [16, 15], showing the connection of these two seemingly unrelated concepts. We refer to Ramana and Freund [35] for a proof that the Lagrange dual of Ramana’s dual has the same value as the original problem. Generalizations of Ramana’s dual are known for conic LPs over nice cones [31]; and for conic LPs over homogeneous cones (Pólik and Terlaky [33]).

For a generalization of the concept of strict complementarity (a concept that plays an important role in our work), we refer to Peña and Roshchina [32]. Schurr et al. in [40] characterize universal duality — when strong duality holds for all right hand sides and objective functions. Tunçel and Wolkowicz in [43] related the lack of strict complementarity in a homogeneous conic linear system to the existence of an objective function with a positive gap.

We finally remark that the technique of reformulating equality constrained SDPs (relying mostly on elementary row operations) to easily verify their infeasibility was used recently by Liu and Pataki [25].

1.1 Preliminaries. When is the linear image of a closed convex cone closed?

We now review some basics in convex analysis, relying mainly on references [38, 23, 12, 5]. In Lemma 1 we also give a short and transparent summary of a result on the closedness of the linear image of a closed convex cone from [29].

If $x$ and $y$ are elements of the same Euclidean space, we sometimes write $x^{*}y$ for $\langle x,y\rangle$. For a set $C$ we denote its linear span, the orthogonal complement of its linear span, its closure, and its interior by $\operatorname{lin}C$, $C^{\perp}$, $\operatorname{cl}C$, and $\operatorname{int}C$, respectively. For a convex set $C$ we denote its relative interior by $\operatorname{ri}C$. For a convex set $C$ and $x\in C$ we define

\begin{array}{rcll} \operatorname{dir}(x,C) &=& \{\,y\mid x+\epsilon y\in C\ \text{for some}\ \epsilon>0\,\}, & (1.4)\\ \operatorname{ldir}(x,C) &=& \operatorname{dir}(x,C)\cap-\operatorname{dir}(x,C), & (1.5)\\ \tan(x,C) &=& \operatorname{cl}\operatorname{dir}(x,C)\cap-\operatorname{cl}\operatorname{dir}(x,C). & (1.6) \end{array}

Here $\operatorname{dir}(x,C)$ is the set of feasible directions at $x$ in $C$, and $\tan(x,C)$ is the tangent space at $x$ in $C$.

A set $C$ is a cone if $\lambda x\in C$ holds for all $x\in C$ and $\lambda\geq 0$. Let $C$ be a closed convex cone. Its dual cone is

C^{*}=\{\,y\mid\langle y,x\rangle\geq 0\ \forall x\in C\,\}.

For $E$, a convex subset of $C$, we say that $E$ is a face of $C$ if $x_{1},x_{2}\in C$ and $\frac{1}{2}(x_{1}+x_{2})\in E$ imply that $x_{1}$ and $x_{2}$ are in $E$.

For $x\in C$ and $u\in C^{*}$, we say that $u$ is strictly complementary to $x$ if $u\in\operatorname{ri}(C^{*}\cap x^{\perp})$. If $C$ is the semidefinite cone, or the second order cone, then $u$ is strictly complementary to $x$ iff $x$ is strictly complementary to $u$; in other cones, however, this may not be the case (see a discussion in [28]).

We say that a closed convex cone $C$ is nice if

C^{*}+E^{\perp}\ \text{is closed for all faces}\ E\ \text{of}\ C.

We know that polyhedral, semidefinite, and $p$-order cones are nice [16, 15, 29]; the intersection of a nice cone with a linear subspace and the linear preimage of a nice cone are nice [18]; hence homogeneous cones are nice, as they are the intersection of a semidefinite cone with a linear subspace (see [17, 22]). In [30] we characterized nice cones, proved that they must be facially exposed, and conjectured that all facially exposed cones are nice. However, Roshchina [39] disproved this conjecture.

We denote the rangespace, nullspace, and adjoint operator of a linear operator $\mathcal{M}$ by $\mathcal{R}(\mathcal{M})$, $\mathcal{N}(\mathcal{M})$, and $\mathcal{M}^{*}$, respectively. We denote by $\mathcal{S}^{n}$ the set of $n$ by $n$ symmetric matrices, and by $\mathcal{S}_{+}^{n}$ the set of $n\times n$ symmetric positive semidefinite (psd) matrices. For symmetric matrices $A$ and $B$ we write $A\preceq B$ [$A\prec B$] to denote that $B-A$ is positive semidefinite [positive definite], and we write $A\bullet B$ to denote the trace of $AB$. We have $(\mathcal{S}_{+}^{n})^{*}=\mathcal{S}_{+}^{n}$ with respect to the $\bullet$ inner product.

We will use the fact that for an affine subspace $H\subseteq\mathcal{S}^{n}$

\operatorname{ri}(H\cap\mathcal{S}_{+}^{n})=\{\,X\in\mathcal{S}_{+}^{n}\mid X\ \text{is a maximum rank psd matrix in}\ H\,\}.

For $A,B\in\mathcal{S}^{n}$ and an invertible matrix $T$ we will use the identity

T^{T}AT\bullet T^{-1}BT^{-T}=A\bullet B. \qquad (1.7)

For matrices $A_{1}$ and $A_{2}$, we let

A_{1}\oplus A_{2}=\begin{pmatrix}A_{1}&0\\ 0&A_{2}\end{pmatrix},

and for sets of matrices $X_{1}$ and $X_{2}$ we define

X_{1}\oplus X_{2}=\{\,A_{1}\oplus A_{2}\mid A_{1}\in X_{1},\ A_{2}\in X_{2}\,\}.

For instance, $\mathcal{S}_{+}^{r}\oplus\{0\}$ (where the order of the $0$ matrix will be clear from context) is the set of matrices with the upper left $r\times r$ block positive semidefinite and the rest of the components zero.

We write $I_{r}$ for the identity matrix of order $r$.

The following question is fundamental in convex analysis: when is the linear image of a closed convex cone closed? We state and illustrate a short version of Theorem 1.1 from [29], which gives easily checkable conditions that are “almost” necessary and sufficient. We will use Lemma 1 later on to prove Theorem 1.

Lemma 1.

Let $\mathcal{M}$ be a linear map, $C$ a closed convex cone, and $w\in\operatorname{ri}(C\cap\mathcal{R}(\mathcal{M}))$. Conditions (1) and (2) below are equivalent to each other, and necessary for $\mathcal{M}^{*}C^{*}$ to be closed. If $C$ is nice, then they are necessary and sufficient.

  1. (1)

    \mathcal{R}(\mathcal{M})\cap\bigl(\operatorname{cl}\operatorname{dir}(w,C)\setminus\operatorname{dir}(w,C)\bigr)=\emptyset.

  2. (2)

    There is $w^{\prime}\in\mathcal{N}(\mathcal{M}^{*})\cap C^{*}$ strictly complementary to $w$, and

    \mathcal{R}(\mathcal{M})\cap\bigl(\tan(w,C)\setminus\operatorname{ldir}(w,C)\bigr)=\emptyset.

Our first example illustrating Lemma 1 is very simple:

Example 1.

Let $C=C^{*}=\mathcal{S}_{+}^{2}$ and define the map $\mathcal{M}:\mathbb{R}^{2}\rightarrow\mathcal{S}^{2}$ as

\mathcal{M}(x_{1},x_{2})=\begin{pmatrix}x_{1}&x_{2}\\ x_{2}&0\end{pmatrix}.

Then $\mathcal{M}^{*}Y=(y_{11},2y_{12})^{T}$, where $Y\in\mathcal{S}^{2}$, and $\mathcal{M}^{*}C^{*}$ is not closed: a direct computation shows $\mathcal{M}^{*}C^{*}=(\mathbb{R}_{++}\times\mathbb{R})\cup\{(0,0)\}$, where $\mathbb{R}_{++}$ stands for the set of strictly positive reals.

Lemma 1 also proves that $\mathcal{M}^{*}C^{*}$ is not closed: to see how, let

w=\begin{pmatrix}1&0\\ 0&0\end{pmatrix},\quad v=\begin{pmatrix}0&1\\ 1&0\end{pmatrix}.

Then $w\in\operatorname{ri}(\mathcal{R}(\mathcal{M})\cap C)$, since it is a maximum rank psd matrix in $\mathcal{R}(\mathcal{M})$. Also, $v\in\mathcal{R}(\mathcal{M})\cap(\operatorname{cl}\operatorname{dir}(w,C)\setminus\operatorname{dir}(w,C))$, since $v\not\in\operatorname{dir}(w,C)$ follows from the definition, and $v\in\operatorname{cl}\operatorname{dir}(w,C)$ follows, since putting any $\epsilon>0$ into the $(2,2)$ position of $v$ makes it a feasible direction. So condition (1) of Lemma 1 is violated, hence $\mathcal{M}^{*}C^{*}$ is not closed.
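To make the non-closedness tangible, here is a purely illustrative numerical sketch (in Python with NumPy; it is not part of the formal development): the psd matrices $Y_{k}$ below satisfy $\mathcal{M}^{*}Y_{k}\rightarrow(0,1)$, yet $(0,1)$ itself is not in $\mathcal{M}^{*}C^{*}$.

    import numpy as np

    def M_star(Y):
        # adjoint of the map M of Example 1: M*(Y) = (y11, 2*y12)
        return np.array([Y[0, 0], 2.0 * Y[0, 1]])

    for k in [1, 10, 100, 1000]:
        # Y is psd: det Y = (1/k)*(k/4) - (1/2)^2 = 0 and trace Y > 0
        Y = np.array([[1.0 / k, 0.5],
                      [0.5, k / 4.0]])
        assert np.all(np.linalg.eigvalsh(Y) >= -1e-9)
        print(k, M_star(Y))      # approaches (0, 1)
    # (0, 1) is never attained: y11 = 0 in a psd Y forces y12 = 0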

The next, more involved example illustrates the key point of Lemma 1: the image set $\mathcal{M}^{*}C^{*}$ usually has much more complicated geometry than $C$ and $C^{*}$. Lemma 1 sheds light on the geometry of $\mathcal{M}^{*}C^{*}$ via the geometry of the simpler set $C$.

Example 2.

Let $C=C^{*}=\mathcal{S}_{+}^{3}$ and define the map $\mathcal{M}:\mathbb{R}^{3}\rightarrow\mathcal{S}^{3}$ as

\mathcal{M}(x_{1},x_{2},x_{3})=\begin{pmatrix}x_{1}&2x_{2}&x_{3}\\ 2x_{2}&x_{2}+x_{3}&0\\ x_{3}&0&0\end{pmatrix}.

Thus $\mathcal{M}^{*}Y=(y_{11},y_{22}+4y_{12},y_{22}+2y_{13})$, where $Y\in\mathcal{S}^{3}$.

It is a straightforward computation (which we omit) to show

\begin{array}{rcl}\operatorname{cl}(\mathcal{M}^{*}C^{*})&=&\{(\alpha,\beta,\gamma):\alpha\geq 0,\ 4\alpha+\beta\geq 0\},\\ \operatorname{cl}(\mathcal{M}^{*}C^{*})\setminus\mathcal{M}^{*}C^{*}&=&\{(0,\beta,\gamma):\gamma\neq\beta\geq 0\}.\end{array} \qquad (1.8)

The set $\mathcal{M}^{*}C^{*}$ is shown on Figure 1 in blue, and $\operatorname{cl}(\mathcal{M}^{*}C^{*})\setminus\mathcal{M}^{*}C^{*}$ in green. (Note that the blue diagonal segment on the green facet actually belongs to $\mathcal{M}^{*}C^{*}$.)

Lemma 1 easily proves that $\mathcal{M}^{*}C^{*}$ is not closed, even without computing the sets in (1.8); indeed, let

w:=\mathcal{M}(6,1,0)=\begin{pmatrix}6&2&0\\ 2&1&0\\ 0&0&0\end{pmatrix},\quad v:=\mathcal{M}(0,0,1)=\begin{pmatrix}0&0&1\\ 0&1&0\\ 1&0&0\end{pmatrix},

and observe i) $w\in\operatorname{ri}(\mathcal{R}(\mathcal{M})\cap C)$, since it is a maximum rank psd matrix in $\mathcal{R}(\mathcal{M})$; ii) $v\not\in\operatorname{dir}(w,C)$ follows from the definition; and iii) $v\in\operatorname{cl}\operatorname{dir}(w,C)$, since putting any $\epsilon>0$ into the $(3,3)$ position of $v$ makes it a feasible direction.

Thus condition (1) in Lemma 1 is violated, so $\mathcal{M}^{*}C^{*}$ is not closed.

Figure 1: The set $\mathcal{M}^{*}C^{*}$ is in blue, and $\operatorname{cl}(\mathcal{M}^{*}C^{*})\setminus\mathcal{M}^{*}C^{*}$ is in green.

We mention in passing that the second part of condition (2) in Lemma 1 is stated in Theorem 1.1 in [29] as $\mathcal{R}(\mathcal{M})\cap((E^{\triangle})^{\perp}\setminus\operatorname{lin}E)=\emptyset$, where $E$ is the smallest face of $C$ that contains $w$ and $E^{\triangle}=C^{*}\cap w^{\perp}$. However, this is an equivalent formulation, as implied by the characterization of $\operatorname{lin}E$ and $(E^{\triangle})^{\perp}$; see e.g., Lemma 7 in Appendix B.

Throughout the paper we assume that $(P)$ is feasible. Recall that we say that $(P)$ satisfies Slater’s condition if there exists $x$ such that $b-\mathcal{A}x\in\operatorname{ri}K$.

2 When is a conic linear system badly or well behaved?

In this section we present our main characterization of when $(P)$ is badly or well behaved (these concepts are defined in the Introduction). We first need a definition.

Definition 1.

A slack in $(P)$ is a vector in

(\mathcal{R}(\mathcal{A})+b)\cap K,

and a maximum slack is a vector in the relative interior of the set of all slacks.

We start with a basic lemma:

Lemma 2.

The system $(P)$ is well behaved if and only if the set

\begin{pmatrix}\mathcal{A}^{*}&0\\ b^{*}&1\end{pmatrix}\begin{pmatrix}K^{*}\\ \mathbb{R}_{+}\end{pmatrix}

is closed.

To put Lemma 2 into perspective, note that the image set in Lemma 2 is closed if $(\mathcal{A},b)^{*}K^{*}$ is closed (one can argue this directly or by modifying the proof of Lemma 2). In turn, if $(\mathcal{A},b)^{*}K^{*}$ is closed, then the duality gap between $(P_{c})$ and $(D_{c})$ is zero, even if $K$ lives in an infinite dimensional space — see, e.g., Theorem 7.2 in [3] (where our primal is called the dual). The proof of Lemma 2 is standard, and we give it in Appendix B.

The main result of this section follows (recall the definition of $\operatorname{dir}(z,K)$ and related sets from (1.4)–(1.6)). We write $\mathcal{R}(\mathcal{A},b)$ for the rangespace of the operator $(x,t)\rightarrow\mathcal{A}x+bt$.

Theorem 1.

Let $z$ be a maximum slack in $(P)$. Conditions (1) and (2) below are equivalent to each other, and necessary for $(P)$ to be well behaved. If $K$ is nice, then they are necessary and sufficient.

  1. (1)

    \mathcal{R}(\mathcal{A},b)\cap\bigl(\operatorname{cl}\operatorname{dir}(z,K)\setminus\operatorname{dir}(z,K)\bigr)=\emptyset.

  2. (2)

    There is $u\in\mathcal{N}\bigl((\mathcal{A},b)^{*}\bigr)\cap K^{*}$ strictly complementary to $z$, and

    \mathcal{R}(\mathcal{A},b)\cap\bigl(\tan(z,K)\setminus\operatorname{ldir}(z,K)\bigr)=\emptyset.

To build intuition we show how Theorem 1 unifies two classical, seemingly unrelated, sufficient conditions for $(P)$ to be well behaved.

Corollary 1.

Suppose that $K$ is a nice cone. If $K$ is polyhedral or $(P)$ satisfies Slater’s condition, then $(P)$ is well behaved.

Proof Let $z$ be a maximum slack in $(P)$. If $K$ is polyhedral, then so is $\operatorname{dir}(z,K)$. If $(P)$ satisfies Slater’s condition, then clearly $z\in\operatorname{ri}K$, so $\operatorname{dir}(z,K)=\operatorname{lin}K$. In both cases $\operatorname{dir}(z,K)$ is closed, hence condition (1) holds, so $(P)$ is well behaved. ∎

Though Lemma 2 is a bit simpler to state than Theorem 1, the latter will be more useful. On the one hand, Lemma 2 relies on the closedness of the linear image of $K^{*}\times\mathbb{R}_{+}$, which may not be easy to check. On the other hand, Theorem 1 relies on the geometry of the cone $K$ itself, and not on the geometry of its linear image. The geometry of typical cones that occur in optimization — e.g. the geometry of the semidefinite cone — is well understood. Thus Theorem 1, among other things, will lead to a proof that badly behaved semidefinite systems are in ${\cal NP}\cap\text{co-}{\cal NP}$ in the real number model of computing. Lemma 2, by itself, affords no such corollary.

Note that if $b=0$, then by Lemma 2 the system $(P)$ is well behaved iff $\mathcal{A}^{*}K^{*}$ is closed. Thus in this case Lemma 1 and Theorem 1 are equivalent. To prove the general case of Theorem 1 we use a homogenization argument.

Proof of Theorem 1: We consider the homogenized system

\begin{array}{rcl}\mathcal{A}x-bx_{0}&\leq_{K}&0\\ -x_{0}&\leq&0,\end{array} \qquad (P_{h})

and first prove the following claim:

Claim There is a maximum slack $z$ in $(P)$ such that $(z,1)$ is also a maximum slack in $(P_{h})$.

To prove the Claim we first note that if $z$ is a maximum slack in $(P)$ and $z^{\prime}$ is some other slack, then $\lambda z+(1-\lambda)z^{\prime}$ is also a maximum slack for all $0<\lambda\leq 1$ (by Theorem 6.1 in [38]). A similar result holds for $(P_{h})$.

Now let $z_{1}$ be a maximum slack in $(P)$; then $(z_{1},1)$ is a slack in $(P_{h})$. Next, let $(z_{2},x_{0})$ be a maximum slack in $(P_{h})$. By the properties of the relative interior, and since $(z_{1},1)$ is a slack in $(P_{h})$, we have that $(z_{2},x_{0})-\epsilon(z_{1},1)$ is a slack in $(P_{h})$ for some $\epsilon>0$. So $x_{0}>0$ must hold, and (after normalizing) we can assume $x_{0}=1$. Hence $z_{2}$ is a slack in $(P)$ and

z:=\dfrac{1}{2}(z_{1}+z_{2})

will do. This completes the proof of the claim.

To proceed with the proof of the theorem, we note that the set of maximum slacks in $(P)$ is a relatively open set, so by Theorem 18.2 in [38] it is contained in $\operatorname{ri}F$, where $F$ is some face of $K$. Therefore $\operatorname{dir}(z,K)=K+\operatorname{lin}F$ for any maximum slack $z$ (see e.g. Lemma 2.7 in [28]), so the sets $\operatorname{dir}(z,K)$ and $\tan(z,K)$ depend only on $F$. Hence we are free to use any maximum slack of $(P)$ in our proof, and we will use the particular maximum slack provided in the preceding Claim.

For convenience we define the linear map

\mathcal{A}_{h}=\begin{pmatrix}\mathcal{A}&b\\ 0&1\end{pmatrix},

which corresponds to the homogenized conic linear system $(P_{h})$.

We first note that (trivially)

\operatorname{dir}\bigl((z,1),K\times\mathbb{R}_{+}\bigr)=\operatorname{dir}(z,K)\times\mathbb{R}\ \text{holds}.

Equations (1.4)–(1.6) imply that the same statement holds if we replace the operator “$\operatorname{dir}$” by “$\operatorname{cl}\operatorname{dir}$”, “$\tan$”, or “$\operatorname{ldir}$”.

Hence the following equations hold:

\begin{array}{rcl} \operatorname{cl}\operatorname{dir}\bigl((z,1),K\times\mathbb{R}_{+}\bigr)\setminus\operatorname{dir}\bigl((z,1),K\times\mathbb{R}_{+}\bigr) &=& \bigl(\operatorname{cl}\operatorname{dir}(z,K)\setminus\operatorname{dir}(z,K)\bigr)\times\mathbb{R}, \qquad (2.9)\\ \tan\bigl((z,1),K\times\mathbb{R}_{+}\bigr)\setminus\operatorname{ldir}\bigl((z,1),K\times\mathbb{R}_{+}\bigr) &=& \bigl(\tan(z,K)\setminus\operatorname{ldir}(z,K)\bigr)\times\mathbb{R}. \qquad (2.10) \end{array}

Consider now the following variants of conditions (1) and (2):

  1. (1′)

    \mathcal{R}(\mathcal{A}_{h})\cap\bigl[\operatorname{cl}\operatorname{dir}\bigl((z,1),K\times\mathbb{R}_{+}\bigr)\setminus\operatorname{dir}\bigl((z,1),K\times\mathbb{R}_{+}\bigr)\bigr]=\emptyset.

  2. (2′)

    There is $(u,u_{0})\in\mathcal{N}(\mathcal{A}_{h}^{*})\cap(K\times\mathbb{R}_{+})^{*}$ strictly complementary to $(z,1)$, and

    \mathcal{R}(\mathcal{A}_{h})\cap\bigl[\tan\bigl((z,1),K\times\mathbb{R}_{+}\bigr)\setminus\operatorname{ldir}\bigl((z,1),K\times\mathbb{R}_{+}\bigr)\bigr]=\emptyset.

Since $(z,1)$ is a maximum slack in $(P_{h})$, we have $(z,1)\in\operatorname{ri}\bigl(\mathcal{R}(\mathcal{A}_{h})\cap(K\times\mathbb{R}_{+})\bigr)$. Hence by Lemma 1 with $C=K\times\mathbb{R}_{+}$, $\mathcal{M}=\mathcal{A}_{h}$, $w=(z,1)$ we find

\mathcal{A}_{h}^{*}(K\times\mathbb{R}_{+})^{*}\ \text{is closed}\ \Rightarrow(1^{\prime})\Leftrightarrow(2^{\prime}),

and that the equivalence holds when $K\times\mathbb{R}_{+}$ is nice.

We next note that by (2.9) condition $(1^{\prime})$ is equivalent to (1). Also, if $(u,u_{0})$ is as specified in $(2^{\prime})$, then

\langle(u,u_{0}),(z,1)\rangle=\langle u,z\rangle+u_{0}=0,

and since both terms above are nonnegative, we must have $u_{0}=0$. Thus using (2.10) we find that statement $(2^{\prime})$ is equivalent to condition (2) in Theorem 1. Thus we have

\mathcal{A}_{h}^{*}(K\times\mathbb{R}_{+})^{*}\ \text{is closed}\ \Rightarrow(1)\Leftrightarrow(2),

with equivalence holding when $K\times\mathbb{R}_{+}$ is nice. Finally, $K$ is nice if and only if $K\times\mathbb{R}_{+}$ is, thus invoking Lemma 2 completes the proof. ∎

We can easily modify the proof of Theorem 1 to show that conditions (1) and (2) suffice for $(P)$ to be well behaved, even under a weaker condition than $K$ being nice: it is enough for $K^{*}+F^{\perp}$ to be closed, where $F$ is the smallest face of $K$ that contains $z$. This more general version of Theorem 1 implies that Corollary 1 holds even if we do not assume that $K$ is nice; we refer the interested reader to version 3 of the paper on arxiv.org.

3 When is a semidefinite system badly or well behaved?

We now specialize the results of Section 2, and characterize when the semidefinite system $(P_{SD})$ is badly or well behaved. To this end, we consider the primal-dual pair of SDPs

\begin{array}{lrlcrlr} & \sup & \sum_{i=1}^{m}c_{i}x_{i} & & \inf & B\bullet Y & \\ (SDP_{c}) & \text{s.t.} & \sum_{i=1}^{m}x_{i}A_{i}\preceq B & & \text{s.t.} & Y\succeq 0 & (SDD_{c})\\ & & & & & A_{i}\bullet Y=c_{i}\ (i=1,\dots,m), & \end{array}

where $A_{1},\dots,A_{m},B\in\mathcal{S}^{n}$, and $c_{1},\dots,c_{m}$ are scalars.

Specializing Definition 1 to the semidefinite system $(P_{SD})$, we find that i) a slack in $(P_{SD})$ is a matrix of the form $S=B-\sum_{i}x_{i}A_{i}\succeq 0$, and ii) a maximum slack in $(P_{SD})$ is a maximum rank slack. We also note that the cone of positive semidefinite matrices is nice [16, 15, 29].

We make the following

Assumption 1.

The maximum rank slack in $(P_{SD})$ is

Z=\begin{pmatrix}I_{r}&0\\ 0&0\end{pmatrix}\ \text{for some}\ 0\leq r\leq n. \qquad (3.11)

We can easily satisfy Assumption 1, at least from a theoretical point of view, as follows. If $Z$ is any maximum rank slack in $(P_{SD})$, $Q$ is a matrix of suitably scaled eigenvectors of $Z$, and we apply the rotation $Q^{T}(\cdot)Q$ to all $A_{i}$ and $B$, then the maximum rank slack in the rotated system is in the required form. (We do not make a claim about actually computing $Z$ or $Q$; we discuss this point more at the end of Section 4.)
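As a purely illustrative sketch of this normalization (Python with NumPy; the function name is ours), one can take $Q$ from an eigendecomposition of a given maximum rank slack $Z$, scaling the eigenvectors that belong to positive eigenvalues:

    import numpy as np

    def normalizing_rotation(Z, tol=1e-9):
        # build Q with Q^T Z Q = I_r (+) 0 from the eigenvectors of Z
        lam, U = np.linalg.eigh(Z)        # eigenvalues in ascending order
        order = np.argsort(lam)[::-1]     # descending: the I_r block comes first
        lam, U = lam[order], U[:, order]
        r = int(np.sum(lam > tol))
        scale = np.ones_like(lam)
        scale[:r] = 1.0 / np.sqrt(lam[:r])
        return U * scale                  # scales column j by scale[j]

    Z = np.array([[2.0, 1.0],
                  [1.0, 2.0]])            # eigenvalues 1 and 3, so r = 2
    Q = normalizing_rotation(Z)
    print(np.round(Q.T @ Z @ Q, 10))      # the identity I_2
    # the system is then rotated by A_i -> Q^T A_i Q, B -> Q^T B Q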

In the interest of the reader we first state and illustrate the main results, then prove them.

Theorem 2.

The system $(P_{SD})$ is badly behaved if and only if there is a matrix $V$ which is a linear combination of the $A_{i}$ and $B$ of the form

V=\begin{pmatrix}V_{11}&V_{12}\\ V_{12}^{T}&V_{22}\end{pmatrix}, \qquad (3.12)

where $V_{11}$ is $r\times r$, $V_{22}\succeq 0$, and $\mathcal{R}(V_{12}^{T})\not\subseteq\mathcal{R}(V_{22})$.

The $Z$ and $V$ matrices provide a certificate of the bad behavior of $(P_{SD})$.

Example 3.

In the problem

\begin{array}{rl}\sup&x_{1}\\ \text{s.t.}&x_{1}\begin{pmatrix}0&1\\ 1&0\end{pmatrix}\preceq\begin{pmatrix}1&0\\ 0&0\end{pmatrix}\end{array} \qquad (3.13)

the only feasible solution is $x_{1}=0$. The dual program, in which we denote the components of $Y$ by $y_{ij}$, is equivalent to

\begin{array}{rl}\inf&y_{11}\\ \text{s.t.}&\begin{pmatrix}y_{11}&1/2\\ 1/2&y_{22}\end{pmatrix}\succeq 0,\end{array}

which has a $0$ infimum but does not attain it.

The certificates of the bad behavior of the system in (3.13) are

Z=\begin{pmatrix}1&0\\ 0&0\end{pmatrix},\quad V=\begin{pmatrix}0&1\\ 1&0\end{pmatrix}.
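The non-attainment can also be traced numerically; in the following purely illustrative sketch (Python with NumPy), the dual feasible points $y_{11}=\epsilon$, $y_{22}=1/(4\epsilon)$ drive the objective to $0$, which no dual feasible point attains:

    import numpy as np

    for eps in [1.0, 1e-2, 1e-4, 1e-8]:
        # dual feasible: det Y = eps/(4*eps) - 1/4 = 0, so Y is psd
        Y = np.array([[eps, 0.5],
                      [0.5, 1.0 / (4.0 * eps)]])
        assert np.all(np.linalg.eigvalsh(Y) >= -1e-6)
        print(eps, "dual objective y11 =", Y[0, 0])
    # the infimum 0 is unattained: y11 = 0 and psd-ness would force y12 = 0
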
Example 4.

The problem

\begin{array}{rl}\sup&x_{2}\\ \text{s.t.}&x_{1}\begin{pmatrix}1&0&0\\ 0&0&0\\ 0&0&0\end{pmatrix}+x_{2}\begin{pmatrix}0&0&1\\ 0&1&0\\ 1&0&0\end{pmatrix}\preceq\begin{pmatrix}1&0&0\\ 0&1&0\\ 0&0&0\end{pmatrix}\end{array} \qquad (3.14)

again has an attained $0$ supremum. The reader can easily check that the value of the dual program is $1$ (and it is attained), so there is a finite, positive duality gap.

In (3.14) the right hand side is the maximum slack, and we can choose the coefficient matrix of $x_{2}$ as the $V$ matrix of Theorem 2.
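The gap is easy to certify; the following purely illustrative check (Python with NumPy) verifies that $Y=\operatorname{diag}(0,1,0)$ is dual feasible with value $1$, while the primal value is $0$:

    import numpy as np

    A1 = np.diag([1.0, 0.0, 0.0])
    A2 = np.array([[0.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0]])
    B  = np.diag([1.0, 1.0, 0.0])

    Y = np.diag([0.0, 1.0, 0.0])
    print(np.trace(A1 @ Y), np.trace(A2 @ Y))  # 0.0 1.0, so A_i . Y = c_i
    print(np.trace(B @ Y))                     # 1.0 = dual value; primal value is 0
    # primal: the (3,3) entry of the slack is 0, so psd-ness forces x2 = 0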

We next characterize well behaved semidefinite systems:

Theorem 3.

The system $(P_{SD})$ is well behaved if and only if conditions (1) and (2) below hold:

  1. (1)

    There is a matrix $U$ of the form

    U=\begin{pmatrix}0&0\\ 0&U_{22}\end{pmatrix}, \qquad (3.15)

    with $U_{22}\in\mathcal{S}^{n-r}$, $U_{22}\succ 0$, and

    A_{1}\bullet U=\dots=A_{m}\bullet U=B\bullet U=0. \qquad (3.16)

  2. (2)

    For all $V$ matrices which are a linear combination of the $A_{i}$ and $B$ and are of the form

    V=\begin{pmatrix}V_{11}&V_{12}\\ V_{12}^{T}&0\end{pmatrix},

    with $V_{11}\in\mathcal{S}^{r}$, we must have $V_{12}=0$.

Example 5.

The system

x_{1}\begin{pmatrix}0&0&0\\ 0&0&1\\ 0&1&0\end{pmatrix}\preceq\begin{pmatrix}1&0&0\\ 0&0&0\\ 0&0&0\end{pmatrix} \qquad (3.17)

is well behaved; we can easily prove this either directly or via Theorem 3. To do the latter, note that the right hand side of (3.17) is the maximum rank slack, condition (1) of Theorem 3 holds with $U=0\oplus I_{2}$, and condition (2) holds vacuously (the $(1,2)$ and $(1,3)$ blocks of both constraint matrices are zero).

Example 6.

This example illustrates both badly and well behaved semidefinite systems, depending on the value of the parameter $\alpha$:

x_{1}\begin{pmatrix}0&0&1\\ 0&1&-3\\ 1&-3&8\end{pmatrix}+x_{2}\begin{pmatrix}0&1&-3\\ 1&0&1\\ -3&1&-6\end{pmatrix}+x_{3}\begin{pmatrix}1&1&\alpha-3\\ 1&1&-2\\ \alpha-3&-2&2\end{pmatrix}\preceq\begin{pmatrix}2&2&\alpha-5\\ 2&2&-4\\ \alpha-5&-4&4\end{pmatrix} \qquad (3.18)

Let us write $A_{i}$ for the constraint matrices on the left, and $B$ for the right hand side matrix in (3.18). We first observe that $Z=I_{1}\oplus 0$ is the maximum rank slack; indeed i) $Z=B-A_{1}-A_{2}-A_{3}$, so it is a slack, and ii) the matrix

U=\begin{pmatrix}0&0&0\\ 0&10&3\\ 0&3&1\end{pmatrix} \qquad (3.19)

satisfies $B\bullet U=A_{i}\bullet U=0$ for all $i$. Hence $U$ is orthogonal to any slack matrix, so the rank of any slack matrix is at most $1$.

If $\alpha\neq 1$, then (3.18) is badly behaved; as proof, observe that

V:=A_{3}-A_{2}-A_{1}=\begin{pmatrix}1&0&\alpha-1\\ 0&0&0\\ \alpha-1&0&0\end{pmatrix}

is a certificate matrix as required by Theorem 2.

If $\alpha=1$, then (3.18) is well behaved, and we can verify this using Theorem 3 as follows. The $U$ matrix in (3.19) satisfies condition (1) of Theorem 3. As to condition (2), if the lower right $2\times 2$ block of $V:=\sum_{i=1}^{3}\lambda_{i}A_{i}+\mu B$ is zero, then $(\lambda_{1},\lambda_{2},\lambda_{3},\mu)$ must be a linear combination of

(0,0,2,-1)\ \text{and}\ (5,5,-1,-2),

so for all such $(\lambda_{1},\lambda_{2},\lambda_{3},\mu)$ the upper left $1\times 2$ block of $V$ is also zero.
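Returning to the badly behaved case $\alpha\neq 1$: the certificate test of Theorem 2 is plain linear algebra, as the following purely illustrative sketch shows (Python with NumPy; the helper names in_range and is_bad_certificate are ours):

    import numpy as np

    def in_range(vecs, M, tol=1e-9):
        # is R(vecs) contained in R(M)?  compare rank(M) with rank([M | vecs])
        return np.linalg.matrix_rank(np.hstack([M, vecs]), tol) == \
               np.linalg.matrix_rank(M, tol)

    def is_bad_certificate(V, r, tol=1e-9):
        # Theorem 2 with Z = I_r (+) 0: V22 psd and R(V12^T) not within R(V22)
        V12, V22 = V[:r, r:], V[r:, r:]
        psd = np.all(np.linalg.eigvalsh(V22) >= -tol)
        return psd and not in_range(V12.T, V22)

    alpha = 2.0                              # any alpha != 1
    V = np.array([[1.0, 0.0, alpha - 1.0],
                  [0.0, 0.0, 0.0],
                  [alpha - 1.0, 0.0, 0.0]])  # V = A3 - A2 - A1
    print(is_bad_certificate(V, r=1))        # True: (3.18) is badly behaved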

We return to Examples 3–6 in Section 4. As we will see there, the bad or good behavior of semidefinite systems can be verified using only an elementary linear algebraic argument, without ever referring to Theorems 2 or 3. We will use Examples 3–6 as illustrations.

The reader may find it interesting to spot the $Z$ and $V$ excluded matrices in other pathological SDPs in the literature, e.g., in the instances in [6, 11, 3, 44, 41, 34, 43, 27].

Theorems 2 and 3 follow simply from Theorem 1 and from Lemma 3 below, which describes the set of feasible directions and related sets in the semidefinite cone:

Lemma 3.

Let $Z$ be as in Assumption 1, and recall the definition of the set of feasible directions and related sets from (1.4)–(1.6). Then

\begin{array}{rcll} \operatorname{ldir}(Z,\mathcal{S}_{+}^{n}) &=& \mathcal{S}^{r}\oplus\{0\}, & (3.20)\\ \operatorname{cl}\operatorname{dir}(Z,\mathcal{S}_{+}^{n}) &=& \left\{\begin{pmatrix}Y_{11}&Y_{12}\\ Y_{12}^{T}&Y_{22}\end{pmatrix}\,\middle|\ Y_{22}\in\mathcal{S}_{+}^{n-r}\right\}, & (3.21)\\ \tan(Z,\mathcal{S}_{+}^{n}) &=& \left\{\begin{pmatrix}Y_{11}&Y_{12}\\ Y_{12}^{T}&0\end{pmatrix}\,\middle|\ Y_{11}\in\mathcal{S}^{r}\right\}, & (3.22)\\ \operatorname{dir}(Z,\mathcal{S}_{+}^{n}) &=& \left\{\begin{pmatrix}Y_{11}&Y_{12}\\ Y_{12}^{T}&Y_{22}\end{pmatrix}\,\middle|\ Y_{22}\in\mathcal{S}_{+}^{n-r},\ \mathcal{R}(Y_{12}^{T})\subseteq\mathcal{R}(Y_{22})\right\}. & (3.23) \end{array}

The proof of Lemma 3 is given in Appendix B.
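Lemma 3 turns membership in these sets into finite eigenvalue and rank computations. A purely illustrative sketch of the resulting test (Python with NumPy; the function name is ours):

    import numpy as np

    def classify_direction(Y, r, tol=1e-9):
        # block test from Lemma 3, with Z = I_r (+) 0
        Y12, Y22 = Y[:r, r:], Y[r:, r:]
        if np.any(np.linalg.eigvalsh(Y22) < -tol):
            return "not in cl dir(Z, S^n_+)"
        rk = np.linalg.matrix_rank(np.hstack([Y22, Y12.T]), tol)
        if rk == np.linalg.matrix_rank(Y22, tol):  # R(Y12^T) within R(Y22)
            return "in dir(Z, S^n_+)"
        return "in cl dir, but not in dir: a bad direction"

    # the V matrix of Example 3 (with r = 1) is exactly such a bad direction:
    V = np.array([[0.0, 1.0],
                  [1.0, 0.0]])
    print(classify_direction(V, r=1))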

Proof of Theorem 2 By condition (1) of Theorem 1 we see that $(P_{SD})$ is badly behaved iff there is a matrix $V\in\operatorname{lin}\{A_{1},\dots,A_{m},B\}$ such that

V\in\operatorname{cl}\operatorname{dir}(Z,\mathcal{S}_{+}^{n})\setminus\operatorname{dir}(Z,\mathcal{S}_{+}^{n}).

Thus our result follows from parts (3.21) and (3.23) in Lemma 3. ∎

Proof of Theorem 3 We apply Theorem 1 to the system $(P_{SD})$. We first observe that a matrix $U\succeq 0$ is strictly complementary to $Z$ if and only if

U=\begin{pmatrix}0&0\\ 0&U_{22}\end{pmatrix},\ \text{with}\ U_{22}\in\mathcal{S}^{n-r},\ U_{22}\succ 0.

Next we note that the first part of condition (2) in Theorem 1 holds iff there is such a $U$ that satisfies (3.16). By (3.20) and (3.22) in Lemma 3 the second part of condition (2) in Theorem 1 holds iff all $V\in\operatorname{lin}\{A_{1},\dots,A_{m},B\}$ which are of the form

V=\begin{pmatrix}V_{11}&V_{12}\\ V_{12}^{T}&0\end{pmatrix}

satisfy $V_{12}=0$. This completes the proof. ∎

To summarize, Theorems 2 and 3 are a “combinatorial version” of Theorem 1.

We note that for semidefinite systems that are strictly feasible, a matrix similar to the $V$ matrix in Theorem 2 can make sure that the optimal primal-dual solution pair fails strict complementarity; see [48].

Although we focus on feasible systems, we obtain natural corollaries about weakly infeasible SDPs, a class of pathological infeasible SDPs. To describe the connection, note that the alternative system

Y\succeq 0,\ A_{i}\bullet Y=0\ (i=1,\dots,m),\ B\bullet Y=-1 \qquad (3.24)

gives a natural proof of infeasibility of $(P_{SD})$: if (3.24) is feasible, then $(P_{SD})$ is trivially infeasible. However, $(P_{SD})$ and (3.24) may both be infeasible, in which case we call the semidefinite system $(P_{SD})$ weakly infeasible.

As background on weakly infeasible SDPs, we mention that Waki [45] recently described a method for generating weakly infeasible SDPs based on Lasserre’s relaxation for polynomial optimization problems; Klep and Schweighofer [24] analyzed weakly infeasible SDPs using real algebraic geometry techniques; and Lourenço et al. [26] proved that any weakly infeasible SDP with order $n$ matrices has a weakly infeasible subsystem with dimension at most $n-1$.

To apply our machinery to weakly infeasible SDPs, we homogenize $(P_{SD})$ to obtain the system

\sum_{i=1}^{m}x_{i}A_{i}-x_{0}B\preceq 0. \qquad (3.25)

Assume that the system (3.25) satisfies Assumption 1. First, suppose that $(P_{SD})$ is weakly infeasible. Then (3.25) is badly behaved, since

\sup\{\,x_{0}\mid(x,x_{0})\ \text{is feasible in}\ (3.25)\,\}=0, \qquad (3.26)

but there is no solution feasible in the dual of (3.26) (such a dual solution would be feasible in (3.24)). Hence by Theorem 2 the excluded matrices $Z$ and $V$ appear in (3.25). In turn, if (3.25) satisfies the conditions of Theorem 3, and hence is well behaved, then $(P_{SD})$ cannot be weakly infeasible.

4 Reformulations. Badly behaved semidefinite systems are in ${\cal NP}\cap\text{co-}{\cal NP}$

4.1 Reformulations

To motivate the discussion of this section, we recall a basic result from the theory of linear equations:

  • “The system $Ax=b$ is infeasible if and only if its row echelon form contains the equation $\langle 0,x\rangle=\alpha$, where $\alpha\neq 0$.”

Since the “if” direction is trivial, we will — informally — say that the row echelon form is an easy-to-verify certificate, or witness, of infeasibility.

In this section we describe analogous results for a very different problem: we show how to transform $(P_{SD})$ into an equivalent system whose bad or good behavior is trivial to verify. As a corollary we prove that badly (and well) behaved semidefinite systems are in ${\cal NP}\cap\text{co-}{\cal NP}$ in the real number model of computing. (In this model we can store arbitrary real numbers in unit space and perform arithmetic operations in unit time; see e.g. [9]. We do not claim that badly behaved semidefinite systems are in ${\cal P}$, i.e., we do not provide a polynomial time algorithm to decide whether $(P_{SD})$ is badly behaved. We discuss this point in more detail at the end of this section.)

We first define the type of transformation that we use on $(P_{SD})$.

Definition 2.

We obtain an elementary reformulation, or simply a reformulation, of $(SDP_{c})$ by a sequence of the following operations:

  1. (1)

    Apply a rotation $T^{T}(\cdot)T$ to all $A_{i}$ and $B$, where $T=I_{r}\oplus M$ and $M$ is invertible.

  2. (2)

    Replace $B$ by $B+\sum_{j=1}^{m}\mu_{j}A_{j}$, where $\mu\in\mathbb{R}^{m}$.

  3. (3)

    Exchange $(A_{i},c_{i})$ and $(A_{j},c_{j})$, where $i\neq j$.

  4. (4)

    Replace $(A_{i},c_{i})$ by $(\sum_{j=1}^{m}\lambda_{j}A_{j},\sum_{j=1}^{m}\lambda_{j}c_{j})$, where $\lambda\in\mathbb{R}^{m}$, $\lambda_{i}\neq 0$.

We obtain an elementary reformulation of the system $(P_{SD})$ by applying the preceding operations with some $c$.

Clearly, in all reformulations of $(P_{SD})$ the maximum rank slack is the same.

Where do these operations come from? Operations (3) and (4) are equivalent to elementary row operations (inherited from Gaussian elimination) done on $(SDD_{c})$:

  • Operation (3) exchanges the dual equations $A_{i}\bullet Y=c_{i}$ and $A_{j}\bullet Y=c_{j}$; and

  • Operation (4) replaces the dual equation $A_{i}\bullet Y=c_{i}$ by $\sum_{j=1}^{m}(\lambda_{j}A_{j})\bullet Y=\sum_{j=1}^{m}\lambda_{j}c_{j}$.

Lemma 4.

The system $(P_{SD})$ is well behaved if and only if its elementary reformulations are.

Proof Operations (1)–(4) of Definition 2 keep the value of $(SDP_{c})$ finite if it is finite, and infinite if it is infinite. Suppose now that $Y$ is feasible in $(SDD_{c})$ with value, say, $\alpha$, and we apply operations (1) and (2) with rotation matrix $T$ and vector $\mu$. Then identity (1.7) implies that $T^{-1}YT^{-T}$ is feasible in the dual of the reformulated problem with value $\alpha+\sum_{j=1}^{m}\mu_{j}c_{j}$. Operations (3) and (4) preserve the feasibility and objective value of a solution of $(SDD_{c})$. Thus if $(P_{SD})$ is well behaved, so are its reformulations, and this completes the proof of the “only if” direction. Since $(P_{SD})$ is a reformulation of its reformulations, the “if” direction follows as well. ∎
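For concreteness, the four operations of Definition 2 can be coded directly. The following purely illustrative sketch (Python with NumPy; the function names are ours) acts on the data $A_{1},\dots,A_{m},B,c$; by Lemma 4, composing these operations preserves good and bad behavior:

    import numpy as np

    def op1_rotate(As, B, M, r):
        # (1): apply T^T ( . ) T with T = I_r (+) M, M invertible
        n = B.shape[0]
        T = np.zeros((n, n))
        T[:r, :r] = np.eye(r)
        T[r:, r:] = M
        return [T.T @ A @ T for A in As], T.T @ B @ T

    def op2_shift(As, B, mu):
        # (2): replace B by B + sum_j mu_j A_j
        return B + sum(m * A for m, A in zip(mu, As))

    def op3_swap(As, c, i, j):
        # (3): exchange (A_i, c_i) and (A_j, c_j), in place
        As[i], As[j] = As[j], As[i]
        c[i], c[j] = c[j], c[i]

    def op4_combine(As, c, i, lam):
        # (4): replace (A_i, c_i) by (sum_j lam_j A_j, sum_j lam_j c_j)
        assert lam[i] != 0
        As[i] = sum(l * A for l, A in zip(lam, As))
        c[i] = sum(l * cj for l, cj in zip(lam, c))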

4.2 Reformulating $(P_{SD})$ to verify maximality of the maximum rank slack

Recall that $Z$ is the maximum rank slack in $(P_{SD})$ described in Assumption 1. We reformulate $(P_{SD})$ in two steps. In the first step, given in Lemma 5, we reformulate $(P_{SD})$ so that the resulting system has easy-to-verify witnesses that $Z$ is a maximum rank slack. (The $Y_{j}$ matrices in Lemma 5 will be the witnesses.)

In Lemma 5 we rely on a facial reduction algorithm (see [16, 15, 46, 31]). It is important that in Lemma 5 we only use rotations, i.e., type (1) operations of Definition 2.

Lemma 5.

The system $(P_{SD})$ has a reformulation

\sum_{i=1}^{m}x_{i}A_{i}^{\prime}\preceq B^{\prime} \qquad (P_{SD}^{\prime})

and there exist symmetric matrices of the form

Y_{j}=\begin{pmatrix}0&0&\times\\ 0&I_{r_{j}}&\times\\ \times&\times&\times\end{pmatrix}\ (j=1,\dots,\ell), \qquad (4.27)

where in $Y_{j}$ the middle block column has width $r_{j}$ and the last block column has width $r_{j-1}+\dots+r_{1}$; here $\ell\geq 0$, $r_{1}>0,\dots,r_{\ell}>0$, $r_{1}+\dots+r_{\ell}=n-r$, and

Y_{j}\bullet B^{\prime}=Y_{j}\bullet A_{i}^{\prime}=0 \qquad (4.28)

holds for all $i$ and $j$. Here the $\times$ symbols denote blocks with arbitrary elements in the $Y_{j}$ matrices.

If $Z=I$, i.e., $(P_{SD})$ satisfies Slater’s condition, then we just take $B^{\prime}=B$, $A_{i}^{\prime}=A_{i}$ for all $i$, and $\ell=0$ in Lemma 5.

To build intuition, we first establish why the $Y_{j}$ matrices indeed prove that the rank of any slack matrix is at most $r$. Let $S$ be a slack in $(P_{SD})$, and $Y_{1},\dots,Y_{\ell}$ as in the statement of Lemma 5. Then $S=B^{\prime}-\sum_{i}x_{i}A_{i}^{\prime}$ for some $x\in\mathbb{R}^{m}$. So $Y_{1}\bullet S=0$ and $S\succeq 0$, hence the last $r_{1}$ rows and columns of $S$ are zero; $Y_{2}\bullet S=0$ and $S\succeq 0$ imply that the next $r_{2}$ rows and columns of $S$ are zero, and so on. Inductively we find that the last $r_{1}+\dots+r_{\ell}=n-r$ rows and columns of $S$ are zero, hence $S$ must have rank at most $r$.

Thus we can prove that $Z$ is a maximum rank slack in $(P_{SD}^{\prime})$ (hence also in $(P_{SD})$) using

  1. (1)

    a vector $x\in\mathbb{R}^{m}$ such that $Z=B^{\prime}-\sum_{i=1}^{m}x_{i}A_{i}^{\prime}$, and

  2. (2)

    the $Y_{j}$ matrices of Lemma 5.

We next illustrate Lemma 5.

Example 7.

(Examples 3, 4, 5, and 6 continued) In all these examples it is easy to show why the maximum rank slack is indeed a slack. Also, in Example 3

Y_{1}=\begin{pmatrix}0&0\\ 0&1\end{pmatrix}

is orthogonal to all constraint matrices (using the $\bullet$ inner product), so it proves that the rank of any slack matrix is at most one.

In Example 4 the matrix $Y_{1}=0\oplus I_{1}$ proves that the rank of any slack is at most one, and in Example 5 the matrix $Y_{1}=0\oplus I_{2}$ proves that the rank of any slack is at most two. (So the first three examples do not even need to be reformulated to have a convenient proof that $Z$ is a maximum rank slack.)

In Example 6 we let

T=\begin{pmatrix}1&0&0\\ 0&1&3\\ 0&0&1\end{pmatrix},

and apply the rotation $T^{T}(\cdot)T$ to all matrices to obtain the system

x_{1}\begin{pmatrix}0&0&1\\ 0&1&0\\ 1&0&-1\end{pmatrix}+x_{2}\begin{pmatrix}0&1&0\\ 1&0&1\\ 0&1&0\end{pmatrix}+x_{3}\begin{pmatrix}1&1&\alpha\\ 1&1&1\\ \alpha&1&-1\end{pmatrix}\preceq\begin{pmatrix}2&2&\alpha+1\\ 2&2&2\\ \alpha+1&2&-2\end{pmatrix}. \qquad (4.29)

Now $Y_{1}=0\oplus I_{2}$ is orthogonal to all constraint matrices in (4.29), and this proves that the rank of any slack is at most one, so $Z=I_{1}\oplus 0$ is a maximum rank slack.
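This verification is mechanical, as the following purely illustrative sketch shows (Python with NumPy): it rotates the matrices of (3.18) and confirms that $Y_{1}=0\oplus I_{2}$ is orthogonal to every matrix of (4.29), for any value of $\alpha$:

    import numpy as np

    alpha = 1.0   # the orthogonality check below succeeds for any alpha
    A1 = np.array([[0., 0, 1], [0, 1, -3], [1, -3, 8]])
    A2 = np.array([[0., 1, -3], [1, 0, 1], [-3, 1, -6]])
    A3 = np.array([[1., 1, alpha - 3], [1, 1, -2], [alpha - 3, -2, 2]])
    B  = np.array([[2., 2, alpha - 5], [2, 2, -4], [alpha - 5, -4, 4]])
    T  = np.array([[1., 0, 0], [0, 1, 3], [0, 0, 1]])

    rotated = [T.T @ M @ T for M in (A1, A2, A3, B)]    # the system (4.29)
    Y1 = np.diag([0., 1, 1])                            # 0 (+) I_2
    print([float(np.trace(Y1 @ M)) for M in rotated])   # [0.0, 0.0, 0.0, 0.0]
    # hence every slack has a zero lower 2x2 block and rank at most 1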

In Appendix A we give a larger example, in which we need two $Y_{j}$ matrices to prove that any slack matrix has rank at most $2$.

Proof of Lemma 5

To find the reformulation, assume that for some $k\geq 0$ we have a reformulation of the form $(P_{SD}^{\prime})$ and matrices $Y_{1},\dots,Y_{k}$ such that (4.28) holds for all $i$ and for $j=1,\dots,k$. At the start $k=0$ and $B^{\prime}=B$, $A_{i}^{\prime}=A_{i}$ for all $i$. For brevity, let $s_{k}=r_{1}+\dots+r_{k}$. We claim that

s_{k}\leq n-r\ \text{holds}.

This indeed follows, since if $S\succeq 0$ is a slack in $(P_{SD}^{\prime})$ then (using the same argument that we used before) the last $s_{k}$ rows and columns of $S$ must be zero.

If $s_{k}=n-r$, we set $\ell=k$ and stop; otherwise, we define the cone $K=\mathcal{S}_{+}^{n}\cap Y_{1}^{\perp}\cap\dots\cap Y_{k}^{\perp}$. Clearly, $K$ and its dual cone $K^{*}$ are of the form

K=\left\{\begin{pmatrix}Y_{11}&0\\ 0&0\end{pmatrix}\,\middle|\ Y_{11}\in\mathcal{S}_{+}^{n-s_{k}}\right\},\quad K^{*}=\left\{\begin{pmatrix}Y_{11}&Y_{12}\\ Y_{12}^{T}&Y_{22}\end{pmatrix}\,\middle|\ Y_{11}\in\mathcal{S}_{+}^{n-s_{k}}\right\}.

Next, define the affine subspace

H=\operatorname{lin}\{A_{1}^{\prime},\dots,A_{m}^{\prime}\}+B^{\prime}.

Since $Z$ is also a maximum rank slack in $(P_{SD}^{\prime})$, and $r<n-s_{k}$, we have $H\cap K\neq\emptyset$ and $H\cap\operatorname{ri}K=\emptyset$, hence $H^{\perp}\cap(K^{*}\setminus K^{\perp})\neq\emptyset$ by a classic theorem of the alternative (see e.g. Lemma 1 in [31]).

Let

Y_{k+1}\in H^{\perp}\cap(K^{*}\setminus K^{\perp}).

Since $Y_{k+1}\bullet Z=0$, we have

Y_{k+1}=\begin{pmatrix}0&0&\times\\ 0&Y^{\prime}&\times\\ \times&\times&\times\end{pmatrix}\quad\text{(the first block column has width $r$, the last has width $s_{k}$)}

for some $Y^{\prime}\succeq 0$. (Again, the $\times$ symbols stand for submatrices with arbitrary elements.) Let $r_{k+1}$ be the number of positive eigenvalues of $Y^{\prime}$; since $Y_{k+1}\not\in K^{\perp}$, we have $r_{k+1}>0$.

Let $Q$ be an invertible matrix such that $Q^{T}Y^{\prime}Q=0\oplus I_{r_{k+1}}$, and let $T=I_{r}\oplus Q\oplus I_{s_{k}}$. We apply the rotation $T^{T}(\cdot)T$ to $Y_{1},\dots,Y_{k+1}$, and the rotation $T^{-1}(\cdot)T^{-T}$ to all $A_{i}^{\prime}$ and to $B^{\prime}$.

By (1.7) equation (4.28) holds for all $i$ and for $j=1,\dots,k+1$. By the form of $T$, the matrices $Y_{1},\dots,Y_{k+1}$ are now in the required shape (see equation (4.27)). We then set $k:=k+1$ and continue.

Clearly, our algorithm terminates in finitely many steps, so the proof is complete. ∎

4.3 Reformulating $(P_{SD})$ to verify that it is badly behaved

In Theorem 4 we give the final reformulation of $(P_{SD})$ to prove its bad behavior. We point out that in Theorem 4 the proof of the “if” direction is elementary, thus the reformulated system $(P_{\text{SD,bad}})$ is an easy-to-verify certificate that $(P_{SD})$ is badly behaved.

Theorem 4.

The system $(P_{SD})$ is badly behaved if and only if it has a reformulation

\sum_{i=1}^{k}x_{i}\begin{pmatrix}F_{i}&0\\ 0&0\end{pmatrix}+\sum_{i=k+1}^{m}x_{i}\begin{pmatrix}F_{i}&G_{i}\\ G_{i}^{T}&H_{i}\end{pmatrix}\preceq\begin{pmatrix}I_{r}&0\\ 0&0\end{pmatrix}=Z, \qquad (P_{\text{SD,bad}})

where

  1. (1)

    the matrix $Z$ is the maximum rank slack, and its maximality can be verified by matrices $Y_{1},\dots,Y_{\ell}$, as given by Lemma 5.

  2. (2)

    The matrices

    \begin{pmatrix}G_{i}\\ H_{i}\end{pmatrix}\ (i=k+1,\dots,m)

    are linearly independent.

  3. (3)

    H_{m}\succeq 0.

Proof (If) By Lemma 4 it is enough to prove that $(P_{\text{SD,bad}})$ is badly behaved. Let $x$ be feasible in $(P_{\text{SD,bad}})$ with a corresponding slack $S$. Note that the last $n-r$ rows and columns of $S$ must be zero, otherwise $\frac{1}{2}(S+Z)$ would be a slack with larger rank than $Z$. Hence, by condition (2) we must have $x_{k+1}=\dots=x_{m}=0$. Next, let us consider the SDP

\sup\{\,-x_{m}\mid x\ \text{is feasible in}\ (P_{\text{SD,bad}})\,\}, \qquad (4.30)

which, by the above argument, has optimal value $0$. We prove that its dual cannot have a feasible solution with value $0$, so suppose that

Y=\begin{pmatrix}Y_{11}&Y_{12}\\ Y_{12}^{T}&Y_{22}\end{pmatrix}\succeq 0

is such a solution. By $Y\bullet Z=0$ we get $Y_{11}=0$, hence by psdness of $Y$ we deduce $Y_{12}=0$. Thus

\begin{pmatrix}F_{m}&G_{m}\\ G_{m}^{T}&H_{m}\end{pmatrix}\bullet Y=H_{m}\bullet Y_{22}\geq 0,

which contradicts the assumption that $Y$ is feasible in the dual of (4.30).

Proof (Only if) We start with the system $(P_{SD}^{\prime})$ given by Lemma 5 and further reformulate it. For brevity we denote the constraint matrices on the left hand side by $A_{i}^{\prime}$ throughout the process.

We first replace $B^{\prime}$ by $Z$ in $(P_{SD}^{\prime})$. Since the resulting system is still badly behaved, by Theorem 2 there is a matrix of the form

V=\lambda_{0}Z+\sum_{i=1}^{m}\lambda_{i}A_{i}^{\prime}=\begin{pmatrix}V_{11}&V_{12}\\ V_{12}^{T}&V_{22}\end{pmatrix},

with $V_{11}\in\mathcal{S}^{r}$, $V_{22}\succeq 0$, and $\mathcal{R}(V_{12}^{T})\not\subseteq\mathcal{R}(V_{22})$. By the form of $Z$ we can assume $\lambda_{0}=0$ (otherwise we can replace $V$ by $V-\lambda_{0}Z$).

Note that the block of $V$ comprising the last $n-r$ columns must be nonzero. We pick an $i$ such that $\lambda_{i}\neq 0$, replace $A_{i}^{\prime}$ by $V$, then switch $A_{i}^{\prime}$ and $A_{m}^{\prime}$. Next we choose a maximal subset of the $A_{i}^{\prime}$ matrices so that their blocks comprising the last $n-r$ columns are linearly independent. We let $A_{m}^{\prime}$ be one of these matrices (this can be done, since $A_{m}^{\prime}$ is now the $V$ certificate matrix), and permute the $A_{i}^{\prime}$ so that this special subset becomes $A_{k+1}^{\prime},\dots,A_{m}^{\prime}$ for some $k\geq 0$.

We finally add suitable multiples of $A_{k+1}^{\prime},\dots,A_{m}^{\prime}$ to $A_{1}^{\prime},\dots,A_{k}^{\prime}$ to zero out the last $n-r$ columns and rows of the latter, and arrive at the required reformulation. ∎

Example 8.

(Examples 3, 4, and 6 continued) The first two of these examples are already in the standard form $(P_{\text{SD,bad}})$. Suppose now $\alpha\neq 1$ in Example 6, i.e., the system (3.18) is badly behaved. Recall that by a rotation we brought (3.18) to the simpler form (4.29). Then in (4.29) we set

\begin{array}{rcl}B&:=&B-A_{1}-A_{2}-A_{3},\\ A_{3}&:=&A_{3}-A_{1}-A_{2},\end{array}

and obtain the system

x_{1}\begin{pmatrix}0&0&1\\ 0&1&0\\ 1&0&-1\end{pmatrix}+x_{2}\begin{pmatrix}0&1&0\\ 1&0&1\\ 0&1&0\end{pmatrix}+x_{3}\begin{pmatrix}1&0&\alpha-1\\ 0&0&0\\ \alpha-1&0&0\end{pmatrix}\preceq\begin{pmatrix}1&0&0\\ 0&0&0\\ 0&0&0\end{pmatrix}, \qquad (4.31)

which is in the standard form (PSD,badP_{\text{SD,bad}}) (with k=0k=0). The objective function supx3\sup-x_{3} yields a zero optimal value over (4.31) but there is no dual solution with the same value: we can argue this as in the proof of the ”if” direction in Theorem 4.
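As a sanity check, conditions (2) and (3) of Theorem 4 are easy to verify numerically for (4.31). The numpy sketch below (our illustration, not part of the paper) extracts the blocks with $r=1$ and confirms that the stacked blocks $(G_{i};H_{i})$ are linearly independent whenever $\alpha\neq 1$, and that $H_{3}\succeq 0$; verifying condition (1) would additionally require the certificates $Y_{1},\dots,Y_{\ell}$ of Lemma 5.

```python
import numpy as np

alpha = 2.0  # any value with alpha != 1
A = [np.array([[0, 0, 1], [0, 1, 0], [1, 0, -1]], float),
     np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float),
     np.array([[1, 0, alpha - 1], [0, 0, 0], [alpha - 1, 0, 0]], float)]
r = 1  # Z = diag(1, 0, 0) is the maximum rank slack

# condition (2): the blocks (G_i; H_i), i.e. the last n-r columns of
# each constraint matrix, are linearly independent
tails = np.array([Ai[:, r:].ravel() for Ai in A])
assert np.linalg.matrix_rank(tails) == len(A)

# condition (3): H_m is psd (here H_3 is the zero matrix)
H_m = A[-1][r:, r:]
assert np.linalg.eigvalsh(H_m).min() >= 0
print("conditions (2) and (3) of Theorem 4 hold for (4.31)")
```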

Note that the certificate matrix $V$ of Theorem 2 appears in the system $(P_{\text{SD,bad}})$ as the last matrix on the left hand side.

4.4 Reformulating $(P_{SD})$ to verify that it is well behaved

We now turn to well behaved semidefinite systems, and in Theorem 5 we show how to reformulate them so that their good behavior is easy to verify. In Theorem 5 we also establish the block-diagonality of dual optimal solutions. Note that the proof of the "if" direction of Theorem 5 is easy, so the system $(P_{\text{SD,good}})$ is an easy-to-verify certificate of good behavior.

Theorem 5.

The system $(P_{SD})$ is well behaved if and only if it has a reformulation

$$\sum_{i=1}^{k}x_{i}\begin{pmatrix}F_{i}&0\\ 0&0\end{pmatrix}+\sum_{i=k+1}^{m}x_{i}\begin{pmatrix}F_{i}&G_{i}\\ G_{i}^{T}&H_{i}\end{pmatrix}\preceq\begin{pmatrix}I_{r}&0\\ 0&0\end{pmatrix}=Z, \qquad (P_{\text{SD,good}})$$

where

(1) the matrix $Z$ is the maximum rank slack;

(2) the matrices $H_{i}\ (i=k+1,\dots,m)$ are linearly independent;

(3) $H_{k+1}\bullet I=\dots=H_{m}\bullet I=0$.

Also, if $(P_{SD})$ is well behaved and the value of $(\mathit{SDP_{c}})$ is finite, then there is an optimal dual matrix in $\mathcal{S}_{+}^{r}\oplus\mathcal{S}_{+}^{n-r}$.

Proof (If and block-diagonality) Let $c$ be such that

$$v:=\sup\,\{\,\sum_{i=1}^{m}c_{i}x_{i}\,\mid\,x\ \text{is feasible in}\ (P_{\text{SD,good}})\,\} \qquad (4.32)$$

is finite. By the proof of Lemma 4 it suffices to prove that the dual of (4.32) has a block-diagonal solution with value $v$. An argument like the one in the proof of Theorem 4 shows that $x_{k+1}=\dots=x_{m}=0$ holds for any $x$ feasible in (4.32), so

$$v=\sup\,\{\,\sum_{i=1}^{k}c_{i}x_{i}\,\mid\,\sum_{i=1}^{k}x_{i}F_{i}\preceq I_{r}\,\}. \qquad (4.33)$$

Since (4.33) satisfies Slater's condition, there is a $Y_{11}$ feasible in its dual with $Y_{11}\bullet I_{r}=v$.

As the $H_{i}$ are linearly independent, we can choose $Y_{22}\in\mathcal{S}^{n-r}$ (which is possibly not psd) such that

$$Y:=\begin{pmatrix}Y_{11}&0\\ 0&Y_{22}\end{pmatrix}$$

satisfies the equality constraints of the dual of (4.32). We then add a positive multiple of the identity to $Y_{22}$ to make $Y$ psd; by condition (3) this leaves all equality constraints intact, since the lower right blocks of the first $k$ constraint matrices are zero and $H_{i}\bullet I=0$ for $i>k$. Thus after this operation $Y$ is feasible in the dual of (4.32), and clearly $Y\bullet Z=v$ holds. The proof is now complete.

Proof (Only if) We again start with the system $(P_{SD}')$ that Lemma 5 provides; now $(P_{SD}')$ is well behaved. (We also note that the $U$ matrix of Theorem 3 became the matrix $Y_{1}=0\oplus I_{n-r}$ of Lemma 5 after we rotated it.) We first replace $B'$ by $Z$. Next we choose a maximal subset of the $A_{i}'$ whose lower principal $(n-r)\times(n-r)$ blocks are linearly independent. We permute the $A_{i}'$, if needed, to make this subset $A_{k+1}',\dots,A_{m}'$ for some $k\geq 0$.

To complete the process we add multiples of $A_{k+1}',\dots,A_{m}'$ to $A_{1}',\dots,A_{k}'$ to zero out the lower principal $(n-r)\times(n-r)$ block of the latter. By Theorem 3 the upper right $r\times(n-r)$ blocks of $A_{1}',\dots,A_{k}'$ and their symmetric counterparts also become zero. This concludes the proof. ∎

Example 9.

(Examples 5 and 6 continued) In Example 5 the system (3.17) is already in the form of $(P_{\text{SD,good}})$.

Suppose now that $\alpha=1$ in Example 6, i.e., (3.18) is well behaved. Recall that we transformed this system into the system (4.31) in Example 8 (note that this transformation is valid independently of the value of $\alpha$). We then switch the first and third matrices in (4.31) to get

$$x_{1}\begin{pmatrix}1&0&0\\ 0&0&0\\ 0&0&0\end{pmatrix}+x_{2}\begin{pmatrix}0&1&0\\ 1&0&1\\ 0&1&0\end{pmatrix}+x_{3}\begin{pmatrix}0&0&1\\ 0&1&0\\ 1&0&-1\end{pmatrix}\preceq\begin{pmatrix}1&0&0\\ 0&0&0\\ 0&0&0\end{pmatrix} \qquad (4.34)$$

in the standard form $(P_{\text{SD,good}})$ (with $k=1$).
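Conditions (2) and (3) of Theorem 5 for (4.34) can be checked just as mechanically; a short numpy sketch (ours, mirroring the one after Example 8), with $r=1$ and $k=1$:

```python
import numpy as np

A = [np.array([[1, 0, 0], [0, 0, 0], [0, 0, 0]], float),
     np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float),
     np.array([[0, 0, 1], [0, 1, 0], [1, 0, -1]], float)]
r, k = 1, 1

# the lower principal (n-r) x (n-r) blocks H_2, H_3
H = [Ai[r:, r:] for Ai in A[k:]]

# condition (2): the H_i are linearly independent
assert np.linalg.matrix_rank(np.array([Hi.ravel() for Hi in H])) == len(H)

# condition (3): H_i . I = 0, i.e. each H_i has zero trace
assert all(abs(np.trace(Hi)) < 1e-12 for Hi in H)
print("conditions (2) and (3) of Theorem 5 hold for (4.34)")
```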

We next discuss some implications of Theorem 5. First, as the proof of the "if" direction shows, we can compute an optimal solution of (4.32) from an optimal solution of the reduced problem (4.33); to do so, we only need to solve a linear system of equations (to find $Y_{22}$) and do a linesearch (to make $Y_{22}$ psd).
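To make these two steps concrete, here is a numpy sketch (our own illustration) for the system (4.34) with the arbitrarily chosen objective vector $c=(1,2,0)$: the reduced problem (4.33) is $\sup\{x_{1}\mid x_{1}\leq 1\}$ with optimal dual solution $Y_{11}=(1)$, from which we recover a block-diagonal dual solution of value $v=1$.

```python
import numpy as np

# data from (4.34): trailing blocks and objective entries for i = 2, 3
H2 = np.array([[0., 1.], [1., 0.]])
H3 = np.array([[1., 0.], [0., -1.]])
c2, c3 = 2.0, 0.0

Y11 = np.array([[1.0]])  # optimal dual solution of the reduced problem (4.33)

# step 1: solve the linear system H_i . Y22 = c_i for a symmetric Y22
# (unknowns: y00, y01, y11; off-diagonal entries are counted twice)
M = np.array([[H2[0, 0], 2 * H2[0, 1], H2[1, 1]],
              [H3[0, 0], 2 * H3[0, 1], H3[1, 1]]])
y, *_ = np.linalg.lstsq(M, np.array([c2, c3]), rcond=None)
Y22 = np.array([[y[0], y[1]], [y[1], y[2]]])

# step 2: linesearch -- add a multiple of I to make Y22 psd; condition (3)
# (trace H_i = 0) guarantees the equality constraints are not disturbed
t = max(0.0, -np.linalg.eigvalsh(Y22).min())
Y22 += t * np.eye(2)

Y = np.block([[Y11, np.zeros((1, 2))], [np.zeros((2, 1)), Y22]])
assert np.linalg.eigvalsh(Y).min() >= -1e-9          # Y is psd
assert abs(np.tensordot(H2, Y22) - c2) < 1e-9        # dual constraints hold
assert abs(np.tensordot(H3, Y22) - c3) < 1e-9
print("block-diagonal dual solution with value", Y[0, 0])
```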

Second, loosely speaking, the system $(P_{\text{SD,good}})$ can be partitioned into a strictly feasible part and a linear part, the latter corresponding to the variables $x_{k+1},\dots,x_{m}$.

Third, how do we generate a well behaved semidefinite system? Theorem 5 can help us do this: we can choose matrices $Z,H_{i},G_{i},F_{i}$ to obtain a system in the form $(P_{\text{SD,good}})$, then arbitrarily reformulate it, while keeping it well behaved. In fact, according to Theorem 5, we can obtain any well behaved semidefinite system in this manner.

In related work, Bomze et al. [10] describe methods to generate pathological conic LP instances from other pathological conic LPs. Their results differ from ours, since they need to start with a pathological conic LP.

We also note that, using Lemma 1, the authors of [19] characterized in their Theorem 3.2 the situation when the projection of $\mathcal{S}_{+}^{n}$ onto a subset of its entries is closed; we can view Theorem 5 as a generalization of this result.

4.5 Badly behaved semidefinite systems are in $\mathcal{NP}\cap\text{co-}\mathcal{NP}$. Certificates to verify (non)closedness of the linear image of the semidefinite cone

We now state our main complexity result:

Theorem 6.

Badly (and well) behaved semidefinite systems are in $\mathcal{NP}\cap\text{co-}\mathcal{NP}$ in the real number model of computing.

Proof We give the following certificates to check the status of $(P_{SD})$: (1) a reformulation of $(P_{SD})$ into the form $(P_{\text{SD,bad}})$ or $(P_{\text{SD,good}})$; (2) the $Y_{j}$ matrices of Lemma 5, to verify that $Z$ is indeed a maximum rank slack; (3) a matrix $T=I_{r}\oplus M$ and a vector $\mu\in\mathbb{R}^{m}$, which were used to transform $(P_{SD})$ into $(P_{\text{SD,bad}})$ or $(P_{\text{SD,good}})$.

The verifier first checks that $(P_{\text{SD,bad}})$ or $(P_{\text{SD,good}})$ is indeed a reformulation of $(P_{SD})$; it then verifies the properties of $(P_{\text{SD,bad}})$ or $(P_{\text{SD,good}})$ given in Theorem 4 or 5; finally, the proofs of the "if" directions of Theorems 4 and 5 show that these systems are badly or well behaved, respectively. ∎

Assume that we are working in the real number model of computing. We don't claim to have a polynomial time algorithm to decide whether $(P_{SD})$ is badly behaved; in particular, we don't have a polynomial time algorithm to compute the excluded matrices $Z$ and $V$ of Theorem 2, or one to compute the reformulated systems $(P_{\text{SD,bad}})$ or $(P_{\text{SD,good}})$.

By analogy, if $(P_{SD})$ is feasible, we can verify this in polynomial time (by plugging in a feasible $x$). If $(P_{SD})$ is infeasible, we can also verify this in polynomial time, using one of the infeasibility certificates in [34, 24, 46, 25]. However, we don't know how to decide in polynomial time whether $(P_{SD})$ is feasible.

Thus the feasibility of a semidefinite system is similar to the bad behavior of a feasible system: both properties are in $\mathcal{NP}\cap\text{co-}\mathcal{NP}$, but neither is known to be in $\mathcal{P}$.

To conclude this section, we briefly discuss easy-to-verify certificates for the (non)closedness of the linear image of $\mathcal{S}_{+}^{n}$. All linear maps from $\mathcal{S}^{n}$ to $\mathbb{R}^{m}$ are of the form $\mathcal{A}^{*}:\mathcal{S}^{n}\rightarrow\mathbb{R}^{m}$, where

$$\mathcal{A}(x)=\sum_{i=1}^{m}x_{i}A_{i},\qquad\mathcal{A}^{*}(Y)=(A_{1}\bullet Y,\dots,A_{m}\bullet Y)^{T},$$

and $A_{i}\in\mathcal{S}^{n}$ for all $i$. We know that $\mathcal{A}^{*}(\mathcal{S}_{+}^{n})$ is closed if and only if the homogeneous system

$$\sum_{i=1}^{m}x_{i}A_{i}\preceq 0 \qquad (4.35)$$

is well behaved (this is immediate from Lemma 2). Thus reformulating this homogeneous system into the standard form $(P_{\text{SD,bad}})$ or $(P_{\text{SD,good}})$ gives an easy-to-verify certificate of the nonclosedness or closedness, respectively, of $\mathcal{A}^{*}(\mathcal{S}_{+}^{n})$.
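As an aside, the maps $\mathcal{A}$ and $\mathcal{A}^{*}$ above are straightforward to realize numerically; a minimal numpy sketch (ours, with arbitrarily generated $A_{i}$) including a check of the adjoint identity $\mathcal{A}(x)\bullet Y=\langle x,\mathcal{A}^{*}(Y)\rangle$:

```python
import numpy as np

def make_maps(As):
    """A(x) = sum_i x_i A_i and its adjoint A*(Y) = (A_1 . Y, ..., A_m . Y)."""
    A_map = lambda x: sum(xi * Ai for xi, Ai in zip(x, As))
    A_adj = lambda Y: np.array([np.tensordot(Ai, Y) for Ai in As])
    return A_map, A_adj

rng = np.random.default_rng(0)
S1, S2 = rng.standard_normal((2, 3, 3))
As = [S1 + S1.T, S2 + S2.T]          # any symmetric matrices A_i will do
A_map, A_adj = make_maps(As)

x = rng.standard_normal(2)
G = rng.standard_normal((3, 3))
Y = G @ G.T                          # a psd matrix Y
# adjoint identity under the trace inner product
assert abs(np.tensordot(A_map(x), Y) - x @ A_adj(Y)) < 1e-9
```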

To illustrate this point we revisit Examples 1 and 2. The semidefinite system

$$-\mathcal{M}(x)\preceq 0, \qquad (4.36)$$

where $\mathcal{M}$ is the linear map defined there, is badly behaved (since the image of the semidefinite cone under $\mathcal{M}^{*}$ is not closed). We can apply the machinery of this paper to study the system (4.36); e.g., we can find the excluded matrices $Z$ and $V$ of Theorem 2, and reformulate (4.36) into the standard form $(P_{\text{SD,bad}})$. We leave the details to the reader.

5 Concluding remarks

Theorem 2 gives the excluded matrices $Z$ and $V$ that characterize the bad behavior of $(P_{SD})$. We can carry this idea further, and prove the following result:

Corollary 2.

Suppose that in addition to the operations of Definition 2 we allow a sequence of the following operations:

(1) delete row $i$ and column $i$ from all matrices, where $i\in\{1,\dots,n\}$;

(2) delete a constraint matrix.

Then we can bring any badly behaved semidefinite system to the form

$$x_{1}\begin{pmatrix}\alpha&1\\ 1&0\end{pmatrix}\preceq\begin{pmatrix}1&0\\ 0&0\end{pmatrix}, \qquad (5.37)$$

where $\alpha$ is some real number.

Proof Suppose that $(P_{SD})$ is badly behaved, and let us recall the form of the maximum rank slack in Assumption 1. We first add multiples of the $A_{i}$ to $B$ to make sure that the right hand side is the maximum rank slack. Next we let $V$ be a certificate matrix as given by Theorem 2; we can assume that $V$ is a linear combination of the $A_{i}$ only; we reformulate so that $V$ becomes a constraint matrix.

As we show in Lemma 3, we can apply a rotation $T^{T}(\cdot)T$ to $V$ (where $T=I_{r}\oplus M$ for some invertible $M$) to bring $V$ to the form

$$V=\begin{pmatrix}V_{11}&V_{12}&V_{13}\\ V_{12}^{T}&I_{s}&0\\ V_{13}^{T}&0&0\end{pmatrix}, \qquad (5.38)$$

where $V_{11}$ is $r\times r$, $s\geq 0$, and $V_{13}\neq 0$. We apply the rotation $T^{T}(\cdot)T$ to all constraint matrices; after this operation $V$ is of the form specified in (5.38). Suppose now that $v_{ij}\neq 0$, where $1\leq i\leq r$ and $r+s+1\leq j\leq n$. We rescale $V$ to make sure that $v_{ij}=1$ holds, delete all other constraint matrices, then delete all rows and columns whose index is neither $i$ nor $j$, to obtain the system (5.37). ∎

Excluded minor results in graph theory, such as Kuratowski's theorem, show that a graph lacks a certain fundamental property if and only if it can be reduced to a minimal such graph by a sequence of elementary operations. Corollary 2 resembles such results, since system (5.37) is trivially badly behaved.
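To spell out why (5.37) is trivially badly behaved (a short verification of ours, in the spirit of the proof of Theorem 4): the slack

$$\begin{pmatrix}1&0\\ 0&0\end{pmatrix}-x_{1}\begin{pmatrix}\alpha&1\\ 1&0\end{pmatrix}=\begin{pmatrix}1-\alpha x_{1}&-x_{1}\\ -x_{1}&0\end{pmatrix}$$

is psd iff $x_{1}=0$, so the objective $\sup x_{1}$ has value $0$ over (5.37). On the dual side, any $Y\succeq 0$ with value $Y\bullet\begin{pmatrix}1&0\\ 0&0\end{pmatrix}=Y_{11}=0$ must have $Y_{12}=0$, hence $\begin{pmatrix}\alpha&1\\ 1&0\end{pmatrix}\bullet Y=\alpha Y_{11}+2Y_{12}=0\neq 1$; so no dual feasible solution attains the value $0$.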

We can define the well- or badly behaved nature of conic linear systems in a different form, and characterize such systems. For instance, we call the dual system

$$\mathcal{A}^{*}y=c,\;y\in K^{*}, \qquad (5.39)$$

well behaved if, for all dual objective functions $b$, the values of $(\mathit{D_{c}})$ and of $(\mathit{P_{c}})$ agree, and the latter value is attained when it is finite. System (5.39) can be recast in the primal form

$$\mathcal{B}x\leq_{K^{*}}y_{0}, \qquad (5.40)$$

where $\mathcal{B}$ and $y_{0}$ satisfy $\mathcal{R}(\mathcal{B})=\mathcal{N}(\mathcal{A}^{*})$ and $\mathcal{A}^{*}y_{0}=c$. It is straightforward to show that (5.39) is well behaved if and only if (5.40) is, and to translate the conditions of Theorem 1 to characterize when (5.39) is well or badly behaved. We leave the details to the reader.
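Numerically, a $\mathcal{B}$ and $y_{0}$ of this kind are easy to produce for a semidefinite system; a sketch (our own construction, assuming $c\in\mathcal{R}(\mathcal{A}^{*})$) that computes matrices spanning $\mathcal{N}(\mathcal{A}^{*})$ from an SVD, and a particular solution $y_{0}$ by least squares:

```python
import numpy as np

def smat(v, n):
    """Symmetric matrix from its diagonal (first n entries of v) and
    strict upper triangle (remaining entries)."""
    iu = np.triu_indices(n, 1)
    S = np.diag(v[:n]).astype(float)
    S[iu] = v[n:]
    return S + np.triu(S, 1).T

def recast_dual(As, c):
    """Matrices spanning N(A*) (their span is the range of the map B in
    (5.40)) and a particular solution y0 of A*(y0) = c."""
    n = As[0].shape[0]
    iu = np.triu_indices(n, 1)
    # row i is the functional Y -> A_i . Y in (diagonal, upper triangle)
    # coordinates; off-diagonal entries are counted twice
    rows = np.array([np.concatenate([np.diag(A), 2 * A[iu]]) for A in As])
    _, s, Vt = np.linalg.svd(rows)
    null = Vt[np.sum(s > 1e-10):]            # basis of the nullspace of A*
    y0 = smat(np.linalg.lstsq(rows, np.asarray(c, float), rcond=None)[0], n)
    return [smat(v, n) for v in null], y0
```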

In the special case of semidefinite systems we can obtain the following result:

Theorem 7.

Suppose that in the system

$$Y\succeq 0,\quad A_{i}\bullet Y=c_{i}\;(i=1,\dots,m) \qquad (5.41)$$

the maximum rank feasible matrix is

$$\bar{Y}=\begin{pmatrix}I_{r}&0\\ 0&0\end{pmatrix}\quad\text{for some}\ r\geq 0.$$

Then (5.41) is badly behaved if and only if there is a matrix $V$ and a real number $\lambda$ such that

$$A_{i}\bullet V=\lambda c_{i}\;(i=1,\dots,m),$$

and

$$V=\begin{pmatrix}V_{11}&V_{12}\\ V_{12}^{T}&V_{22}\end{pmatrix},$$

where $V_{11}$ is $r\times r$, $V_{22}\succeq 0$, and $\mathcal{R}(V_{12}^{T})\not\subseteq\mathcal{R}(V_{22})$.

We can apply similar arguments to conic linear systems in a subspace form

$$K\cap(L+x_{0}),$$

to characterize their well- or badly behaved status.

We can also characterize badly behaved second order conic systems, similarly to what we did for $(P_{SD})$ in Theorem 2. This result appears in version 2 of this paper on arxiv.org.

We finally mention a subject for possible future work. The interplay of algebraic geometry and optimization is an active research area: see for instance the recent monograph of Blekherman et al. [8], and the paper of Klep and Schweighofer [24]. It would be interesting to see how our certificates of bad and good behavior can be interpreted in the language of algebraic geometry.

Appendix A A larger badly behaved semidefinite system

In this appendix we give a larger badly behaved semidefinite system to illustrate the standard form reformulation (PSD,badP_{\text{SD,bad}}). What is nice about this example is that the bad behavior of the original (not reformulated) system is very difficult to verify by an ad hoc argument, whereas the bad behavior of the reformulated system is self-evident.

Example 10.

Consider the badly behaved semidefinite system

$$x_{1}\begin{pmatrix}4&3&-5&-3\\ 3&-2&0&-2\\ -5&0&-12&-8\\ -3&-2&-8&-4\end{pmatrix}+x_{2}\begin{pmatrix}14&10&-15&-9\\ 10&-6&0&-6\\ -15&0&-36&-24\\ -9&-6&-24&-12\end{pmatrix}+x_{3}\begin{pmatrix}8&6&-5&-3\\ 6&-4&0&-2\\ -5&0&-12&-8\\ -3&-2&-8&-4\end{pmatrix}+x_{4}\begin{pmatrix}20&15&-25&-13\\ 15&-10&-1&-9\\ -25&-1&-58&-38\\ -13&-9&-38&-18\end{pmatrix}\preceq\begin{pmatrix}45&32&-55&-31\\ 32&-19&-1&-21\\ -55&-1&-130&-86\\ -31&-21&-86&-42\end{pmatrix}. \qquad (A.42)$$

We show how to bring (A.42) into the form $(P_{\text{SD,bad}})$, so let us denote the constraint matrices on the left by $A_{i}\;(i=1,\dots,4)$, and the right hand side matrix by $B$. Let

$$T=\begin{pmatrix}1&0&0&0\\ 0&1&0&0\\ 0&0&-1/2&1/2\\ 0&0&3/2&-1/2\end{pmatrix},$$

apply the rotation $T^{T}(\cdot)T$ to all $A_{i}$ and to $B$, then perform the following operations, in the order listed (so the final assignment to $A_{1}$ uses the already updated $A_{2}$ and $A_{3}$):

$$\begin{aligned}B&:=B-A_{1}-2A_{2}+A_{3}-A_{4},\\ A_{4}&:=-5A_{1}+A_{4},\\ A_{3}&:=-2A_{1}+A_{3},\\ A_{2}&:=-3A_{1}+A_{2},\\ A_{1}&:=A_{1}-2A_{2}+A_{3}.\end{aligned}$$

We obtain the system

$$x_{1}\begin{pmatrix}0&1&0&0\\ 1&-2&0&0\\ 0&0&0&0\\ 0&0&0&0\end{pmatrix}+x_{2}\begin{pmatrix}2&1&0&0\\ 1&0&0&0\\ 0&0&0&0\\ 0&0&0&0\end{pmatrix}+x_{3}\begin{pmatrix}0&0&2&1\\ 0&0&3&-1\\ 2&3&0&2\\ 1&-1&2&0\end{pmatrix}+x_{4}\begin{pmatrix}0&0&3&-1\\ 0&0&2&-1\\ 3&2&2&0\\ -1&-1&0&0\end{pmatrix}\preceq\begin{pmatrix}1&0&0&0\\ 0&1&0&0\\ 0&0&0&0\\ 0&0&0&0\end{pmatrix}. \qquad (A.43)$$

In (A.43) the matrices

$$Y_{1}=\begin{pmatrix}0&0&0&0\\ 0&0&0&0\\ 0&0&0&0\\ 0&0&0&1\end{pmatrix},\quad Y_{2}=\begin{pmatrix}0&0&0&1\\ 0&0&0&1\\ 0&0&2&0\\ 1&1&0&0\end{pmatrix} \qquad (A.44)$$

are orthogonal to all the constraint matrices and to the right hand side, so every slack $S$ satisfies $Y_{1}\bullet S=Y_{2}\bullet S=0$. Since $Y_{1}$ and $S$ are psd, $Y_{1}\bullet S=0$ forces the last row and column of $S$ to be zero; then $Y_{2}\bullet S=2S_{33}=0$ forces the third row and column to be zero as well. Thus the rank of any slack is at most two, and in (A.43) the right hand side is the maximum rank slack.

It is easy to see that (A.43) is badly behaved: following the proof of the "if" direction of Theorem 4, one can check that the objective function $\sup -x_{4}$ yields a value of $0$ over (A.43), but there is no dual solution with the same value.
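Every claim in this example is mechanical to verify; the following numpy sketch (our own check, not part of the paper) replays the rotation and the assignments, asserts that the outcome matches (A.43), and confirms that $Y_{1},Y_{2}$ are orthogonal to all the matrices involved.

```python
import numpy as np

A1 = np.array([[4, 3, -5, -3], [3, -2, 0, -2], [-5, 0, -12, -8], [-3, -2, -8, -4]], float)
A2 = np.array([[14, 10, -15, -9], [10, -6, 0, -6], [-15, 0, -36, -24], [-9, -6, -24, -12]], float)
A3 = np.array([[8, 6, -5, -3], [6, -4, 0, -2], [-5, 0, -12, -8], [-3, -2, -8, -4]], float)
A4 = np.array([[20, 15, -25, -13], [15, -10, -1, -9], [-25, -1, -58, -38], [-13, -9, -38, -18]], float)
B  = np.array([[45, 32, -55, -31], [32, -19, -1, -21], [-55, -1, -130, -86], [-31, -21, -86, -42]], float)
T  = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1/2, 1/2], [0, 0, 3/2, -1/2]])

rot = lambda X: T.T @ X @ T
A1, A2, A3, A4, B = map(rot, (A1, A2, A3, A4, B))

# the assignments of Example 10, applied in the order listed
B  = B - A1 - 2 * A2 + A3 - A4
A4 = -5 * A1 + A4
A3 = -2 * A1 + A3
A2 = -3 * A1 + A2
A1 = A1 - 2 * A2 + A3        # uses the already updated A2 and A3

assert np.array_equal(B, np.diag([1., 1., 0., 0.]))
assert np.array_equal(A1, np.array([[0, 1, 0, 0], [1, -2, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]))

# Y1, Y2 of (A.44) are orthogonal to all constraint matrices and to B
Y1 = np.zeros((4, 4)); Y1[3, 3] = 1
Y2 = np.array([[0, 0, 0, 1], [0, 0, 0, 1], [0, 0, 2, 0], [1, 1, 0, 0]], float)
for Y in (Y1, Y2):
    assert all(abs(np.tensordot(Y, X)) < 1e-12 for X in (A1, A2, A3, A4, B))
print("reformulation (A.43) and certificates (A.44) verified")
```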

Appendix B Proof of Lemmas 2 and 3

In this section we prove Lemmas 2 and 3.

First we need some definitions and notation. For optimization problems we use the symbol $\operatorname{val}(\cdot)$ to denote their optimal value. For program $(\mathit{D_{c}})$ we say that $\{y_{i}\}\subseteq K^{*}$ is an asymptotically feasible (AF) solution if $\mathcal{A}^{*}y_{i}\rightarrow c$, and the asymptotic value of $(\mathit{D_{c}})$ is

$$\operatorname{aval}(\mathit{D_{c}})=\inf\,\{\,\lim b^{*}y_{i}\,\mid\,\{y_{i}\}\ \text{is asymptotically feasible in}\ (\mathit{D_{c}})\,\},$$

where the infimum is taken over those AF solutions for which $\lim b^{*}y_{i}$ exists.

We prove Lemma 2 by adapting an argument from [20]. We also rely on the following lemma due to Duffin:

Lemma 6.

(Duffin [21]) Problem $(\mathit{P_{c}})$ is feasible with $\operatorname{val}(\mathit{P_{c}})<+\infty$ if and only if $(\mathit{D_{c}})$ is asymptotically feasible with $\operatorname{aval}(\mathit{D_{c}})>-\infty$; if these equivalent statements hold, then

$$\operatorname{val}(\mathit{P_{c}})=\operatorname{aval}(\mathit{D_{c}}).$$

Proof of Lemma 2 We will use the notation

$$\mathcal{A}_{h}=\begin{pmatrix}\mathcal{A}&b\\ 0&1\end{pmatrix}$$

(which is also used in the proof of Theorem 1).

Proof (If) Suppose that $\mathcal{A}_{h}^{*}\left((K\times\mathbb{R}_{+})^{*}\right)$ is closed, and let $c$ be an objective vector such that $c_{0}:=\operatorname{val}(\mathit{P_{c}})$ is finite. Then $\operatorname{aval}(\mathit{D_{c}})=c_{0}$ holds by Lemma 6, so there is $\{y_{i}\}\subseteq K^{*}$ such that $\mathcal{A}^{*}y_{i}\rightarrow c$ and $b^{*}y_{i}\rightarrow c_{0}$, i.e.,

$$(c,c_{0})\in\operatorname{cl}\left((\mathcal{A},b)^{*}K^{*}\right)\subseteq\operatorname{cl}\mathcal{A}_{h}^{*}(K^{*}\times\mathbb{R}_{+})=\mathcal{A}_{h}^{*}(K^{*}\times\mathbb{R}_{+}).$$

Hence there are $y\in K^{*}$ and $s\geq 0$ such that $\mathcal{A}^{*}y=c$ and $b^{*}y+s=c_{0}$; since weak duality gives $b^{*}y\geq c_{0}$, we must have $s=0$, i.e., $b^{*}y=c_{0}$. So $y$ is a feasible solution of $(\mathit{D_{c}})$ with value $c_{0}$, and this completes the proof.

Proof (Only if) To obtain a contradiction, suppose that $\mathcal{A}_{h}^{*}\left((K\times\mathbb{R}_{+})^{*}\right)$ is not closed; we will show that $(P)$ is badly behaved. Let us choose $c$ and $c_{0}$ such that

$$(c,c_{0})\in\operatorname{cl}\mathcal{A}_{h}^{*}(K^{*}\times\mathbb{R}_{+})\setminus\mathcal{A}_{h}^{*}(K^{*}\times\mathbb{R}_{+}).$$

By $(c,c_{0})\in\operatorname{cl}\mathcal{A}_{h}^{*}(K^{*}\times\mathbb{R}_{+})$ there is $\{(y_{i},s_{i})\}\subseteq K^{*}\times\mathbb{R}_{+}$ such that $\mathcal{A}^{*}y_{i}\rightarrow c$ and $b^{*}y_{i}+s_{i}\rightarrow c_{0}$. Hence

$$\operatorname{val}(\mathit{P_{c}})=\operatorname{aval}(\mathit{D_{c}})\leq c_{0},$$

where the equality comes from Lemma 6.

However, $(c,c_{0})\not\in\mathcal{A}_{h}^{*}(K^{*}\times\mathbb{R}_{+})$ shows that no feasible solution of $(\mathit{D_{c}})$ can have value $\leq c_{0}$. Hence either $\operatorname{val}(\mathit{D_{c}})>c_{0}$ (this includes the case $\operatorname{val}(\mathit{D_{c}})=+\infty$, i.e., when $(\mathit{D_{c}})$ is infeasible), or $\operatorname{val}(\mathit{D_{c}})$ is not attained. ∎

To prove Lemma 3 we need another lemma, which is mostly based on results surveyed in [28].

Lemma 7.

Let $C$ be a closed convex cone, $x\in C$, and let $E$ be the smallest face of $C$ that contains $x$. Then

$$\begin{aligned}\operatorname{dir}(x,C)&=C+\operatorname{lin}E, &&\text{(B.45)}\\ \operatorname{ldir}(x,C)&=\operatorname{lin}E, &&\text{(B.46)}\\ \operatorname{cl}\operatorname{dir}(x,C)&=(C^{*}\cap x^{\perp})^{*}, &&\text{(B.47)}\\ \tan(x,C)&=(C^{*}\cap x^{\perp})^{\perp}. &&\text{(B.48)}\end{aligned}$$

Proof Statements (B.45) and (B.47) appear in Lemma 3.2.1 in [28] (Lemma 2.7 in the online version). We also proved statement (B.48) there, assuming that $C$ is nice; in fact, it follows from (B.47) and (1.6) in general.

In (B.46) the containment $\supseteq$ is trivial. To see $\subseteq$, let $y\in\operatorname{ldir}(x,C)$; then $x\pm\epsilon y\in C$ for some $\epsilon>0$. Hence $x\pm\epsilon y\in E$, so $\epsilon y\in\operatorname{lin}E$, and this completes the proof. ∎

Proof of Lemma 3 Let $F$ be the smallest face of $\mathcal{S}_{+}^{n}$ that contains $Z$. Then clearly $F=\mathcal{S}_{+}^{r}\oplus\{0\}$, and $\mathcal{S}_{+}^{n}\cap Z^{\perp}=\{0\}\oplus\mathcal{S}_{+}^{n-r}$. Hence statements (3.20)–(3.22) follow by taking $C=\mathcal{S}_{+}^{n}$, $x=Z$, $E=F$ in Lemma 7.

Next, fix $Y\in\operatorname{cl}\operatorname{dir}(Z,\mathcal{S}_{+}^{n})$, and partition it as in the right hand side set in (3.21). Then (3.23) is equivalent to

$$Y\in\operatorname{dir}(Z,\mathcal{S}_{+}^{n})\;\Leftrightarrow\;\mathcal{R}(Y_{12}^{T})\subseteq\mathcal{R}(Y_{22}). \qquad \text{(B.49)}$$

Let $P$ be an invertible matrix such that $P^{T}Y_{22}P=I_{s}\oplus 0$, where $s$ is the number of positive eigenvalues of $Y_{22}$ (such a $P$ exists since $Y_{22}\succeq 0$: we diagonalize $Y_{22}$ and scale its positive eigenvalues to one), and let $T=I_{r}\oplus P$.

Define

$$V:=T^{T}YT=\begin{pmatrix}Y_{11}&Y_{12}P\\ P^{T}Y_{12}^{T}&P^{T}Y_{22}P\end{pmatrix}=\begin{pmatrix}Y_{11}&Y_{12}P\\ P^{T}Y_{12}^{T}&I_{s}\oplus 0\end{pmatrix}.$$

Next we claim

$$\begin{aligned}Y\in\operatorname{dir}(Z,\mathcal{S}_{+}^{n})\;&\Leftrightarrow\;V\in\operatorname{dir}(Z,\mathcal{S}_{+}^{n}), &&\text{(B.50)}\\ \mathcal{R}(Y_{12}^{T})\subseteq\mathcal{R}(Y_{22})\;&\Leftrightarrow\;\mathcal{R}(P^{T}Y_{12}^{T})\subseteq\mathcal{R}(P^{T}Y_{22}P). &&\text{(B.51)}\end{aligned}$$

Indeed, (B.50) follows from $T^{T}ZT=Z$ and the definition of feasible directions. As to (B.51), the left hand side statement holds iff there is a matrix $D$ with

$$Y_{12}^{T}=Y_{22}D, \qquad \text{(B.52)}$$

and the right hand side statement holds iff there is a matrix $D'$ such that

$$P^{T}Y_{12}^{T}=P^{T}Y_{22}PD'. \qquad \text{(B.53)}$$

If $D$ satisfies (B.52), then $D':=P^{-1}D$ satisfies (B.53). Conversely, if (B.53) holds for $D'$, then $D:=PD'$ verifies (B.52).

Next, partition $Y_{12}P$ as $(V_{12},V_{13})$, so that $V_{12}$ has $s$ columns; then (B.51) is equivalent to $V_{13}=0$. So we only need to prove

$$V\in\operatorname{dir}(Z,\mathcal{S}_{+}^{n})\;\Leftrightarrow\;V_{13}=0. \qquad \text{(B.54)}$$

Consider the matrix $Z+\epsilon V$ for some $\epsilon>0$. If $V_{13}\neq 0$, then $Z+\epsilon V$ is not positive semidefinite for any $\epsilon>0$, and this proves the direction $\Rightarrow$. As to $\Leftarrow$, if $V_{13}=0$, then by the Schur complement condition for positive semidefiniteness we have $Z+\epsilon V\succeq 0$ iff

$$(I_{r}+\epsilon V_{11})-(\epsilon V_{12})(\epsilon I_{s})^{-1}(\epsilon V_{12}^{T})\succeq 0,$$

and the latter is clearly true for all small enough $\epsilon>0$, since the left hand side equals $I_{r}+\epsilon(V_{11}-V_{12}V_{12}^{T})$, which tends to $I_{r}$ as $\epsilon\rightarrow 0$. ∎
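A quick numerical illustration of this last step (our own sketch, with $r=s=1$, $n=3$ and an arbitrarily chosen $V$): when $V_{13}=0$ the matrix $Z+\epsilon V$ is psd for all small $\epsilon>0$, while any nonzero $V_{13}$ destroys positive semidefiniteness for every $\epsilon>0$.

```python
import numpy as np

Z = np.diag([1., 0., 0.])        # r = 1: Z = I_1 (+) 0

def min_eig(v13, eps):
    """Smallest eigenvalue of Z + eps*V for a sample V with given V13."""
    V = np.array([[-3., 2., v13],
                  [2., 1., 0.],   # lower right block of V is I_s (+) 0
                  [v13, 0., 0.]])
    return np.linalg.eigvalsh(Z + eps * V).min()

for eps in (1e-1, 1e-2, 1e-3):
    assert min_eig(0.0, eps) >= -1e-12   # psd for small eps when V13 = 0
    assert min_eig(0.5, eps) < 0         # never psd once V13 != 0
```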

Acknowledgement My sincere thanks are due to the anonymous referees, the Associate Editor, and Shu Lu for their careful reading of the manuscript, and thoughtful comments. I also thank Minghui Liu for helpful comments, and for his help in proving Theorem 5. My thanks are also due to Asen Dontchev for his support while writing this paper.

References

  • [1] Erling D. Andersen, Cees Roos, and Tamás Terlaky. Notes on duality in second order and pp-order cone optimization. Optimization, 51(4):627–643, 2002.
  • [2] Alfred Auslender. Closedness criteria for the image of a closed set by a linear operator. Numer. Funct. Anal. Optim., 17:503–515, 1996.
  • [3] Alexander Barvinok. A Course in Convexity. Graduate Studies in Mathematics. AMS, 2002.
  • [4] Heinz Bauschke and Jonathan M. Borwein. Conical open mapping theorems and regularity. In Proceedings of the Centre for Mathematics and its Applications 36, pages 1–10. Australian National University, 1999.
  • [5] Heinz Bauschke and Patrick Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, 2011.
  • [6] Aharon Ben-Tal and Arkadi Nemirovski. Lectures on modern convex optimization. MPS/SIAM Series on Optimization. SIAM, Philadelphia, PA, 2001.
  • [7] Dimitri Bertsekas and Paul Tseng. Set intersection theorems and existence of optimal solutions. Math. Program., 110:287–314, 2007.
  • [8] Grigoriy Blekherman, Pablo Parrilo, and Rekha Thomas, editors. Semidefinite Optimization and Convex Algebraic Geometry. MOS/SIAM Series in Optimization. SIAM, 2012.
  • [9] Lenore Blum, Felipe Cucker, Michael Shub, and Stephen Smale. Complexity and Real Computation. Springer, 1998.
  • [10] Immanuel Bomze, Werner Schachinger, and Gabriele Uchida. Think co(mpletely)positive! Matrix properties, examples and a clustered bibliography on copositive optimization. J. Global Optim., 52, 2012.
  • [11] Frédéric J. Bonnans and Alexander Shapiro. Perturbation analysis of optimization problems. Springer Series in Operations Research. Springer-Verlag, 2000.
  • [12] Jonathan M. Borwein and Adrian S. Lewis. Convex Analysis and Nonlinear Optimization: Theory and Examples. CMS Books in Mathematics. Springer, 2000.
  • [13] Jonathan M. Borwein and Warren B. Moors. Stability of closedness of convex cones under linear mappings. J. Convex Anal., 16(3–4), 2009.
  • [14] Jonathan M. Borwein and Warren B. Moors. Stability of closedness of convex cones under linear mappings II. J. Nonlin. Anal. Theory and Appl., 1(1), 2010.
  • [15] Jonathan M. Borwein and Henry Wolkowicz. Facial reduction for a cone-convex programming problem. J. Aust. Math. Soc., 30:369–380, 1981.
  • [16] Jonathan M. Borwein and Henry Wolkowicz. Regularizing the abstract convex program. J. Math. Anal. App., 83:495–530, 1981.
  • [17] Chek-Beng Chua. Relating homogeneous cones and positive definite cones via T-algebras. SIAM J. Optim., 14:500–506, 2003.
  • [18] Chek-Beng Chua and Levent Tunçel. Invariance and efficiency of convex representations. Math. Program. B, 111:113–140, 2008.
  • [19] Dmitriy Drusvyatskiy, Gábor Pataki, and Henry Wolkowicz. Coordinate shadows of semidefinite and Euclidean distance matrices. SIAM J. Opt., 25(2):1160–1178, 2015.
  • [20] Richard Duffin, Robert Jeroslow, and Les A. Karlovitz. Duality in semi-infinite linear programming. In Semi-infinite programming and applications (Austin, Tex., 1981), volume 215 of Lecture Notes in Econom. and Math. Systems, pages 50–62. Springer, Berlin, 1983.
  • [21] Richard J. Duffin. Infinite programs. In A.W. Tucker, editor, Linear inequalities and Related Systems, pages 157–170. Princeton University Press, 1956.
  • [22] Leonid Faybusovich. On Nesterov’s approach to semi-definite programming. Acta Appl. Math., 74:195–215, 2002.
  • [23] Jean-Baptiste Hiriart-Urruty and Claude Lemaréchal. Convex Analysis and Minimization Algorithms. Springer-Verlag, 1993.
  • [24] Igor Klep and Markus Schweighofer. An exact duality theory for semidefinite programming based on sums of squares. Math. Oper. Res., 38(3):569–590, 2013.
  • [25] Minghui Liu and Gábor Pataki. Exact duality in semidefinite programming based on elementary reformulations. SIAM J. Opt., 25(3):1441–1454, 2015.
  • [26] Bruno Lourenco, Masakazu Muramatsu, and Takashi Tsuchiya. A structural geometrical analysis of weakly infeasible SDPs. Journal of the Operations Research Society of Japan, 59(3):241–257, 2015.
  • [27] Zhi-Quan Luo, Jos Sturm, and Shuzhong Zhang. Duality results for conic convex programming. Technical Report Report 9719/A, Erasmus University Rotterdam, Econometric Institute, The Netherlands, 1997.
  • [28] Gábor Pataki. The geometry of semidefinite programming. In Romesh Saigal, Lieven Vandenberghe, and Henry Wolkowicz, editors, Handbook of semidefinite programming. Kluwer Academic Publishers, 2000.
  • [29] Gábor Pataki. On the closedness of the linear image of a closed convex cone. Math. Oper. Res., 32(2):395–412, 2007.
  • [30] Gábor Pataki. On the connection of facially exposed and nice cones. J. Math. Anal. App., 400:211–221, 2013.
  • [31] Gábor Pataki. Strong duality in conic linear programming: facial reduction and extended duals. In David Bailey, Heinz H. Bauschke, Frank Garvan, Michel Théra, Jon D. Vanderwerff, and Henry Wolkowicz, editors, Proceedings of Jonfest: a conference in honour of the 60th birthday of Jon Borwein. Springer, also available from http://arxiv.org/abs/1301.7717, 2013.
  • [32] Javier Pena and Vera Roshchina. A complementarity partition theorem for multifold conic systems. Math. Program., 142:579–589, 2013.
  • [33] Imre Pólik and Tamás Terlaky. Exact duality for optimization over symmetric cones. Technical report, Lehigh University, Bethlehem, PA, USA, 2009.
  • [34] Motakuri V. Ramana. An exact duality theory for semidefinite programming and its complexity implications. Math. Program. Ser. B, 77:129–162, 1997.
  • [35] Motakuri V. Ramana and Robert Freund. On the ELSD duality theory for SDP. Technical report, MIT, 1996.
  • [36] Motakuri V. Ramana, Levent Tunçel, and Henry Wolkowicz. Strong duality for semidefinite programming. SIAM J. Opt., 7(3):641–662, 1997.
  • [37] James Renegar. A Mathematical View of Interior-Point Methods in Convex Optimization. MPS-SIAM Series on Optimization. SIAM, Philadelphia, USA, 2001.
  • [38] R. Tyrrell Rockafellar. Convex Analysis. Princeton University Press, Princeton, NJ, USA, 1970.
  • [39] Vera Roshchina. Facially exposed cones are not nice in general. SIAM J. Opt., 24:257–268, 2014.
  • [40] Simon P. Schurr, André L. Tits, and Dianne P. O’Leary. Universal duality in conic convex optimization. Math. Program. Ser. A, 109:69–88, 2007.
  • [41] Michael J. Todd. Semidefinite optimization. Acta Numer., 10:515–560, 2001.
  • [42] Levent Tunçel. Polyhedral and Semidefinite Programming Methods in Combinatorial Optimization. Fields Institute Monographs, 2011.
  • [43] Levent Tunçel and Henry Wolkowicz. Strong duality and minimal representations for cone optimization. Comput. Optim. Appl., 53:619–648, 2012.
  • [44] Lieven Vandenberghe and Steven Boyd. Semidefinite programming. SIAM Review, 38(1):49–95, 1996.
  • [45] Hayato Waki. How to generate weakly infeasible semidefinite programs via Lasserre’s relaxations for polynomial optimization. Optim. Lett., 6(8):1883–1896, 2012.
  • [46] Hayato Waki and Masakazu Muramatsu. Facial reduction algorithms for conic optimization problems. J. Optim. Theory Appl., 158(1):188–215, 2013.
  • [47] Z. Waksman and M. Epelman. On point classification in convex sets. Math. Scand., 38:83–96, 1976.
  • [48] Hua Wei and Henry Wolkowicz. Generating and measuring instances of hard semidefinite programs. Math. Program., 125(1):31–45, 2010.