
Sum-of-squares hierarchy lower bounds for symmetric formulations
(Supported by the Swiss National Science Foundation project 200020-144491/1 “Approximation Algorithms for Machine Scheduling Through Theory and Experiments” and by Sciex Project 12.311.)

Adam Kurpisz  Samuli Leppänen  Monaldo Mastrolilli
IDSIA, 6928 Manno, Switzerland {adam, samuli, monaldo}@idsia.ch
Abstract

We introduce a method for proving Sum-of-Squares (SoS)/Lasserre hierarchy lower bounds when the initial problem formulation exhibits a high degree of symmetry. Our main technical theorem allows us to reduce the study of positive semidefiniteness to the analysis of “well-behaved” univariate polynomial inequalities.

We illustrate the technique on two problems, one unconstrained and the other with constraints. More precisely, we give a short elementary proof of the Grigoriev/Laurent lower bound for finding the integer cut polytope of the complete graph. We also show that the SoS hierarchy requires a non-constant number of rounds to improve the initial integrality gap of 2 for the Min-Knapsack linear program strengthened with cover inequalities.

1 Introduction

Proving lower bounds for the Sum-of-Squares (SoS)/Lasserre hierarchy [25, 31] has attracted notable attention in the theoretical computer science community during the last decade, see e.g. [7, 12, 13, 18, 19, 27, 28, 29, 34, 36]. This is partly because the hierarchy captures many of the best known approximation algorithms based on semidefinite programming (SDP) for several natural 0/1 optimization problems (see [28] for a recent result). Indeed, it can be argued that the SoS hierarchy is the strongest candidate to be the “optimal” meta-algorithm predicted by the Unique Games Conjecture (UGC) [22, 32]. On the other hand, the hierarchy is also one of the best known candidates for refuting the conjecture since it is still conceivable that one could show that the SoS hierarchy achieves better approximation guarantees than the UGC predicts (see [6] for discussion). Despite the interest in the algorithm and due to the many technical challenges presented by semidefinite programming, only relatively few techniques are known for proving lower bounds for the hierarchy. In particular, several integrality gap results follow from applying gadget reductions to the few known original lower bound constructions.

Indeed, many of the known lower bounds for the SoS hierarchy originated in the works of Grigoriev [18, 19]. (More precisely, Grigoriev considers the Positivstellensatz proof system, which is the dual of the SoS hierarchy considered in this paper; for brevity, we use SoS hierarchy/proof system interchangeably.) We defer the formal definition of the hierarchy for later and only point out that solving the hierarchy after $t$ rounds takes $n^{O(t)}$ time. In [19] Grigoriev showed that random 3Xor or 3Sat instances cannot be solved even by $\Omega(n)$ rounds of the SoS hierarchy (some of these results were later independently rediscovered by Schoenebeck [34]). Lower bounds, such as those of [7, 36], rely on [19, 34] combined with gadget reductions. Another important lower bound was given by Grigoriev [18] for the Knapsack problem (a simplified proof can be found in [20]), showing that the SoS hierarchy cannot prove within $\lfloor n/2\rfloor$ rounds that the polytope $\{x\in[0,1]^{n}:\sum_{i=1}^{n}x_{i}=n/2\}$ contains no integer point when $n$ is odd. Using essentially the same construction as in [20], Laurent [27] independently showed that $\lfloor n/2\rfloor$ rounds are not enough for finding the integer cut polytope of the complete graph with $n$ nodes, where $n$ is odd (this result was recently shown to be tight in [15]). (The two problems, Knapsack and Max-Cut in complete graphs, considered respectively in [18, 20] and in [27], are essentially the same, and we will use Max-Cut to refer to both.) By using several new ideas and techniques, but a similar starting point as in [20, 27], Meka, Potechin and Wigderson [29] were able to show a lower bound of $\Omega(\log^{1/2}n)$ for the Planted-Clique problem. Common to the works [20, 27] and [29] is that the matrix involved in the analysis has a large kernel, and they prove that a principal submatrix is positive definite by applying the theory of association schemes [16].
It is also interesting to point out that for the class of Max-CSPs, Lee, Raghavendra and Steurer [28] proved that the SoS relaxation yields the “optimal” approximation, meaning that SDPs of polynomial size are equivalent in power to those arising from $O(1)$ rounds of the SoS relaxations. Then, by appealing to the result by Grigoriev/Laurent [18, 27], they showed an exponential lower bound on the size of SDP formulations for the integer cut polytope. For different techniques to obtain lower bounds, we refer for example to the recent papers [5, 23, 24] (see also Section 5.4) and to the survey [13] for an overview of previous results.

In this paper we introduce a method for proving SoS hierarchy lower bounds when the initial problem formulation exhibits a high degree of symmetry. Our main technical theorem (Theorem 1) allows us to reduce the study of the positive semidefiniteness to the analysis of “well-behaved” univariate polynomial inequalities. The theorem applies whenever the solution and constraints are symmetric, informally meaning that all subsets of the variables of equal cardinality play the same role in the formulation (see Section 3 for the formal definition). For example, the solution in [18, 20, 27] for Max-Cut is symmetric in this sense.

We note that exploiting symmetry reduces the number of variables involved in the analysis, and different ways of utilizing symmetry have been widely used in the past for proving integrality gaps for different hierarchies; see for example [8, 17, 19, 21, 24, 35]. An interesting difference between our approach and others is that we establish several lower bounds without fully identifying the eigenvectors. More specifically, the common task in this context is to identify the spectral structure in order to get a simple diagonalized form. In the previous papers the moment matrices belong to the Bose-Mesner algebra of a well-studied association scheme, and hence one can use the existing theory. In this paper, instead of identifying the spectral structure completely, we identify only its possible forms and test all the possible candidates. This is in fact an important point, since the approach may extend even to cases where the underlying symmetry is imperfect or its spectral structure is not well understood.

The proof of Theorem 1 is obtained by a sequence of elementary operations, as opposed to relying on notions such as a large kernel of the matrix form, interlacing eigenvalues, the machinery of association schemes and various results about hypergeometric series, as in [18, 20, 27]. Thus Theorem 1 applies to the whole class of symmetric solutions, even when several conditions and tools exploited in [18, 20, 27] cannot be directly applied. For example, the kernel dimension, which was one of the key properties used to prove the results in [18, 20, 27], depends on the particular solution that is used and is not a general property of the class of symmetric solutions. Indeed, the solutions for the two problems considered in this paper yield analyzed matrices with completely different kernel sizes: one large and the other zero.

We demonstrate the technique with two illustrative and complementary applications. First, we show that the analysis of the lower bound for Max-Cut in [18, 20, 27] simplifies to a few elementary calculations once the main theorem is in place. This result is partially motivated by the open question posed by O’Donnell [30] of finding a simpler proof for Grigoriev’s lower bound for the Knapsack problem.

As a second application we consider a constrained problem. We show that after $\Omega(\log^{1-\epsilon}n)$ levels the SoS hierarchy does not improve the integrality gap of $2$ for the Min-Knapsack linear program formulation strengthened with the cover inequalities [10] introduced by Wolsey [37]. Adding cover inequalities is currently the most successful approach for capacitated covering problems of this type [1, 2, 3, 9, 11].

Our result is the first SoS lower bound for formulations with cover inequalities. In this application we demonstrate that our technique can also be used for suggesting the solution and for analyzing its feasibility.

Finally, we point out that the same analysis can be used to provide a non-trivial lower bound for an open question raised by Laurent [26] regarding the Lasserre rank of the knapsack problem (see Section 5.4 for a discussion).

2 The SoS hierarchy

Consider a $0/1$ optimization problem with $m\geq 0$ linear constraints $g_{\ell}(x)\geq 0$, for $\ell\in[m]$ and $x\in\mathbb{R}^{n}$. We are interested in approximating the convex hull of the integral points of the set $K=\{x\in\mathbb{R}^{n} \mid g_{\ell}(x)\geq 0,\ \forall\ell\in[m]\}$ with the SoS hierarchy defined in the following.

The form of the SoS hierarchy we use in this paper (Definition 1) is equivalent to the one used in literature (see e.g. [4, 25, 26]). It follows from applying a change of basis to the dual certificate of the refutation of the proof system [26] (see also [29] for discussion on the connection to the proof system). We use this change of basis in order to obtain a useful decomposition of the moment matrices as a sum of rank one matrices of special kind. This will play an important role in our analysis. We refer the reader to Appendix A for more details and for a mapping between the different forms.

For any $I\subseteq N=\{1,\ldots,n\}$, let $x_{I}$ denote the $0/1$ solution obtained by setting $x_{i}=1$ for $i\in I$, and $x_{i}=0$ for $i\in N\setminus I$. We denote by $g_{\ell}(x_{I})$ the value of the constraint evaluated at $x_{I}$. For each integral solution $x_{I}$, where $I\subseteq N$, the SoS hierarchy defined below has a variable $y^{N}_{I}$ that can be interpreted as the “relaxed” indicator variable for the solution $x_{I}$. We point out that in this formulation of the hierarchy the number of variables $\{y^{N}_{I}:I\subseteq N\}$ is exponential in $n$, but this is not a problem in our context since we are interested in proving lower bounds rather than solving an optimization problem.

Let $\mathcal{P}_{t}(N)$ be the collection of subsets of $N$ of size at most $t\in\mathbb{N}$. For every $I\subseteq N$, the $q$-zeta vector $Z_{I}\in\mathbb{R}^{\mathcal{P}_{q}(N)}$ is a $0/1$ vector whose $J$-th entry ($|J|\leq q$) equals $1$ if and only if $J\subseteq I$. (In order to keep the notation simple, we do not emphasize the parameter $q$, as the dimension of the vectors should be clear from the context.) Note that $Z_{I}Z_{I}^{\top}$ is a rank one matrix and the matrices considered in Definition 1 are linear combinations of these rank one matrices.

Definition 1.

The $t$-th round SoS hierarchy relaxation for the set $K$, denoted by $\text{SoS}_{t}(K)$, is the set of values $\{y^{N}_{I}\in\mathbb{R}:I\subseteq N\}$ that satisfy

$\sum_{I\subseteq N}y^{N}_{I} = 1$ (1)

$\sum_{I\subseteq N}y^{N}_{I}Z_{I}Z_{I}^{\top} \succeq 0$, where $Z_{I}\in\mathbb{R}^{\mathcal{P}_{t+d}(N)}$ (2)

$\sum_{I\subseteq N}g_{\ell}(x_{I})\,y^{N}_{I}Z_{I}Z_{I}^{\top} \succeq 0$, $\forall\ell\in[m]$, where $Z_{I}\in\mathbb{R}^{\mathcal{P}_{t}(N)}$ (3)

where $d=0$ if $m=0$ (no linear constraints), and $d=1$ otherwise.

It is straightforward to see that the SoS hierarchy formulation given in Definition 1 is a relaxation of the integral polytope. Indeed, consider any feasible integral solution $x_{I}\in K$ and set $y^{N}_{I}=1$ and the other variables to zero. This solution clearly satisfies Condition (1); Condition (2), because the rank one matrix $Z_{I}Z_{I}^{\top}$ is positive semidefinite (PSD); and Condition (3), since $x_{I}\in K$.
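The rank-one decomposition in Definition 1 is easy to experiment with numerically. The sketch below is our own illustration (names such as `zeta_vector` are not from the paper): it builds the $q$-zeta vectors for a small instance and checks Conditions (1) and (2) for an integral solution.

```python
from itertools import combinations
import numpy as np

def subsets_up_to(n, q):
    """All subsets of {0,...,n-1} of size at most q (index set for P_q(N))."""
    return [frozenset(c) for s in range(q + 1) for c in combinations(range(n), s)]

def zeta_vector(I, basis):
    """q-zeta vector Z_I: entry J equals 1 iff J is a subset of I."""
    return np.array([1.0 if J <= I else 0.0 for J in basis])

n, t = 4, 2
basis = subsets_up_to(n, t)                    # P_t(N), here of size 1 + 4 + 6 = 11
I = frozenset({0, 2})                          # an integral solution x_I
Z = zeta_vector(I, basis)
M = np.outer(Z, Z)                             # the moment matrix Z_I Z_I^T for y_I^N = 1

assert np.linalg.matrix_rank(M) == 1           # rank one, as claimed
assert np.all(np.linalg.eigvalsh(M) >= -1e-9)  # hence PSD: Condition (2) holds
```

Summing such rank-one terms with weights $y^{N}_{I}$ gives exactly the matrices appearing in (2) and (3).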

3 The main technical theorem

The main result of this paper (see Theorem 1 below) allows us to reduce the study of the positive semidefiniteness of matrices (2) and (3) to the analysis of “well-behaved” univariate polynomial inequalities. It can be applied whenever the solutions and constraints are symmetric, namely invariant under all permutations $\pi$ of the set $N$: $z_{I}^{N}=z^{N}_{\pi(I)}$ for all $I\subseteq N$ (equivalently, $z_{I}^{N}=z^{N}_{J}$ whenever $|I|=|J|$), where the set-valued permutation is defined by $\pi(I)=\{\pi(i) \mid i\in I\}$, and $z_{I}^{N}$ is understood to denote either $y^{N}_{I}$ or $g_{\ell}(x_{I})y^{N}_{I}$. For example, the solution for Max-Cut considered by Grigoriev [18] and Laurent [27] belongs to this class.

Theorem 1.

For any $t\in\{1,\ldots,n\}$, let $\mathcal{S}_{t}$ be the set of all polynomials $G_{h}(k)\in\mathbb{R}[k]$, for $h\in\{0,\ldots,t\}$, that satisfy the following conditions:

$G_{h}(k) \in \mathbb{R}[k]_{2t}$ (4)

$G_{h}(k) = 0$ for $k\in\{0,\ldots,h-1\}\cup\{n-h+1,\ldots,n\}$ (5)

$G_{h}(k) \geq 0$ for $k\in[h-1,\,n-h+1]$ (6)

For any fixed set of values $\{z^{N}_{k}\in\mathbb{R}:k=0,\ldots,n\}$, if the following holds

$\sum_{k=h}^{n-h}z^{N}_{k}\binom{n}{k}G_{h}(k) \geq 0 \qquad \forall G_{h}(k)\in\mathcal{S}_{t}$ (7)

then the matrix in (8) is positive semidefinite:

$\sum_{k=0}^{n}z^{N}_{k}\sum_{\substack{I\subseteq N \\ |I|=k}}Z_{I}Z_{I}^{\top} \qquad (\text{where } Z_{I}\in\mathbb{R}^{\mathcal{P}_{t}(N)})$ (8)

Note that in (6) the polynomial $G_{h}(k)$ is required to be nonnegative on a real interval, while in (5) it must vanish on a finite set of integers. Moreover, the constraints (7) are trivially satisfied for $h>\lfloor n/2\rfloor$.
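To make the conditions concrete, one family of polynomials satisfying (4)-(6) is $G_{h}(k)=\prod_{i=0}^{h-1}(k-i)(n-i-k)\cdot P(k)^{2}$ with $\deg P\leq t-h$. This family is our own illustration, not one singled out by the paper; the sketch below checks conditions (5) and (6) numerically for it.

```python
import numpy as np

def make_G(h, n, P_coeffs):
    """G_h(k) = prod_{i=0}^{h-1} (k-i)(n-i-k) * P(k)^2, with deg P <= t - h."""
    P = np.polynomial.Polynomial(P_coeffs)
    def G(k):
        prefactor = 1.0
        for i in range(h):
            prefactor *= (k - i) * (n - i - k)
        return prefactor * P(k) ** 2
    return G

n, t, h = 9, 3, 2
G = make_G(h, n, [1.0, -0.5])      # deg G = 2h + 2*deg(P) = 6 <= 2t: condition (4)

# condition (5): G_h vanishes on the integers {0,...,h-1} and {n-h+1,...,n}
for k in list(range(h)) + list(range(n - h + 1, n + 1)):
    assert abs(G(k)) < 1e-9
# condition (6): G_h is nonnegative on the real interval [h-1, n-h+1]
for k in np.linspace(h - 1, n - h + 1, 201):
    assert G(k) >= -1e-9
```

Each factor $(k-i)(n-i-k)$ is nonnegative on $[h-1,n-h+1]$ and contributes the required integer zeros, while the square $P(k)^{2}$ keeps the product nonnegative.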

Theorem 1 is actually a corollary of a more technical theorem that is not strictly necessary for the applications of this paper, and is therefore deferred to a later section (see Theorem 7 in Section 6). The proof (given in Section 6) is obtained by exploiting the high symmetry of the eigenvectors of the matrix appearing in (8). Condition (7) corresponds to the requirement that the Rayleigh quotient be non-negative when restricted to certain highly symmetric vectors (which we show are the only ones we need to consider).

4 Max-Cut for the complete graph

In the Max-Cut problem, we are given an undirected graph and we wish to find a partition of the vertices (a cut) which maximizes the number of edges whose endpoints are on different sides of the partition (the cut value). For the complete graph with $n$ vertices, consider any solution with $\omega$ vertices on one side and the remaining $n-\omega$ on the other side of the partition. This gives a cut of value $\omega(n-\omega)$. When $n$ is odd and for any $\omega\leq n/2$, Grigoriev [18] and Laurent [27] considered the following solution (reformulated in the basis considered in Definition 1, see Appendix B):

$y^{N}_{I}=(n+1)\binom{\omega}{n+1}\frac{(-1)^{n-|I|}}{\omega-|I|} \qquad \forall I\subseteq N$ (9)

Here $\binom{\omega}{n+1}=\frac{1}{(n+1)!}\prod_{j=0}^{n}(\omega-j)$ is the generalized binomial coefficient, since $\omega$ need not be an integer.

It is shown in [18, 27] that (9) is a feasible solution for the SoS hierarchy of value $\omega(n-\omega)$, for any $\omega\leq n/2$, up to round $t\leq\lfloor\omega\rfloor$. In particular, for $\omega=n/2$ the cut value of the SoS relaxation is strictly larger than the value of the optimal integral cut (i.e. $\lfloor\frac{n}{2}\rfloor(\lfloor\frac{n}{2}\rfloor+1)$), showing therefore an integrality gap at round $\lfloor n/2\rfloor$.

We note that the formula for the solution (9) is essentially implied by the requirement of having exactly $\omega$ vertices on one side of the partition (see [18, 27] and [29] for more details), and the core of the analysis in [18, 27] lies in showing that (9) is a feasible solution for the SoS hierarchy. By taking advantage of Theorem 1, the proof that (9) is a feasible solution for the SoS relaxation follows by observing the fact below.

Lemma 2.

For any polynomial $P(x)\in\mathbb{R}[x]$ of degree $\leq n$, and $y_{k}^{N}=y^{N}_{I}$ for $|I|=k$ as defined in (9), we have

$\sum_{k=0}^{n}\binom{n}{k}y_{k}^{N}\,P(k)=P(\omega)$
Proof.

By the polynomial remainder theorem, $P(k)=(\omega-k)Q(k)+P(\omega)$, where $Q(k)$ is a unique polynomial of degree at most $n-1$. It follows that

$\sum_{k=0}^{n}\binom{n}{k}y_{k}^{N}\,P(k) = \underbrace{\sum_{k=0}^{n}\binom{n}{k}y_{k}^{N}\,(\omega-k)Q(k)}_{=0} + P(\omega)\underbrace{\sum_{k=0}^{n}\binom{n}{k}y_{k}^{N}}_{=1} = P(\omega)$

since by (9) we have $\binom{n}{k}y_{k}^{N}(\omega-k)=(n+1)\binom{\omega}{n+1}(-1)^{n-k}\binom{n}{k}$, and $\sum_{k=0}^{n}(-1)^{k}\binom{n}{k}Q(k)=0$ for any polynomial of degree at most $n-1$.
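Lemma 2 is easy to verify numerically. The sketch below is our own check: it evaluates solution (9) using the generalized binomial coefficient and tests the identity, including the choice $P(k)=k(n-k)$, which recovers the cut value $\omega(n-\omega)$.

```python
from math import comb, factorial

def gbinom(omega, m):
    """Generalized binomial coefficient binom(omega, m) for real omega."""
    prod = 1.0
    for j in range(m):
        prod *= omega - j
    return prod / factorial(m)

n = 7
omega = n / 2                          # n odd, so omega is not an integer
y = [(n + 1) * gbinom(omega, n + 1) * (-1) ** (n - k) / (omega - k)
     for k in range(n + 1)]            # solution (9), one value per level k = |I|

def moment(P):
    """Left-hand side of Lemma 2 for a polynomial P given as a callable."""
    return sum(comb(n, k) * y[k] * P(k) for k in range(n + 1))

assert abs(moment(lambda k: 1) - 1) < 1e-9                              # Condition (1)
assert abs(moment(lambda k: k * (n - k)) - omega * (n - omega)) < 1e-9  # cut value
```
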

Now, by Lemma 2 we have $\sum_{k=0}^{n}y^{N}_{k}\binom{n}{k}G_{h}(k)=G_{h}(\omega)$, and the feasibility of (9) follows by Theorem 1, since $G_{h}(\omega)\geq 0$ whenever $t\leq\omega$ and $\omega\leq n/2$: in that case $\omega\in[h-1,n-h+1]$ for every $h\leq t$, so (6) applies.

5 Min-Knapsack with cover inequalities

The Min-Knapsack problem is defined as follows: we have $n$ items with costs $c_{i}$ and profits $p_{i}$, and we want to choose a subset of items such that the sum of the costs of the selected items is minimized and the sum of the profits is at least a given demand $P$. Formally, this can be formulated as an integer program $(IP)\ \min\{\sum_{j=1}^{n}c_{j}x_{j} : \sum_{j=1}^{n}p_{j}x_{j}\geq P,\ x\in\{0,1\}^{n}\}$. It is easy to see that the natural linear program $(LP)$, obtained by relaxing $x\in\{0,1\}^{n}$ to $x\in[0,1]^{n}$ in $(IP)$, has an unbounded integrality gap.

By adding the Knapsack Cover (KC) inequalities introduced by Wolsey [37] (see also [10]), the arbitrarily large integrality gap of the natural LP can be reduced to 2 (and this is tight [10]). The KC constraints are as follows: $\sum_{j\not\in A}p^{A}_{j}x_{j}\geq P-p(A)$ for all $A\subseteq N$, where $p(A)=\sum_{i\in A}p_{i}$ and $p^{A}_{j}=\min\{p_{j},P-p(A)\}$. These constraints are valid for all integral solutions: if a set $A$ of items is picked, we still need to cover $P-p(A)$; the remaining profits are “trimmed” to be at most $P-p(A)$, and this again does not remove any feasible integral solution.
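A minimal sketch of how a single KC inequality is generated (our own helper, assuming 0-indexed items; not code from the paper):

```python
def kc_inequality(profits, P, A):
    """Coefficients and right-hand side of the KC inequality for a set A:
    sum_{j not in A} min(p_j, P - p(A)) * x_j >= P - p(A)."""
    residual = P - sum(profits[j] for j in A)
    coeffs = {j: min(p, residual) for j, p in enumerate(profits) if j not in A}
    return coeffs, residual

profits = [3, 5, 2, 4]
coeffs, rhs = kc_inequality(profits, P=8, A={1})   # picking item 1 leaves demand 3
assert rhs == 3
assert coeffs == {0: 3, 2: 2, 3: 3}                # profits trimmed to at most 3
```

The trimming step is what makes the inequalities strong: no remaining item can single-handedly contribute more than the residual demand.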

The following instance [10] shows that the integrality gap implied by KC inequalities is 2: we have $n$ items of unit costs and profits, and we are asked to select a set of items in order to obtain a profit of at least $1+1/(n-1)$. The resulting linear program formulation with KC inequalities is as follows (for $x_{i}\in[0,1]$, $i=1,\ldots,n$):

$(LP^{+})\quad \min \sum_{j=1}^{n}x_{j}$ s.t. $\sum_{j=1}^{n}x_{j}\geq 1+1/(n-1)$ (10)

$\sum_{j\in N^{\prime}}x_{j}\geq 1 \qquad \forall N^{\prime}\subseteq N:|N^{\prime}|=n-1$ (11)

Note that the solution $x_{i}=1/(n-1)$ is a valid fractional solution of value $1+1/(n-1)$, whereas the optimal integral solution has value $2$. In the following we show that $\text{SoS}_{t}(LP^{+})$, with $t$ arbitrarily close to a logarithmic function of $n$, admits the same integrality gap as the initial linear program ($LP^{+}$) relaxation.
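These claims about $LP^{+}$ can be checked directly; the sketch below (our own) verifies that $x_{i}=1/(n-1)$ satisfies the demand constraint and all $n$ cover inequalities.

```python
n = 10
x = [1.0 / (n - 1)] * n      # the fractional solution of value 1 + 1/(n-1)

# demand constraint: the total must be at least 1 + 1/(n-1)
assert sum(x) >= 1 + 1.0 / (n - 1) - 1e-9
# cover inequalities: dropping any single item, the rest still sum to >= 1
for dropped in range(n):
    assert sum(x) - x[dropped] >= 1 - 1e-9
```

Any integral solution must pick at least two unit-profit items to reach demand $1+1/(n-1)$, so the ratio $2/(1+1/(n-1))$ tends to $2$.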

Theorem 3.

For any $\delta>0$ and sufficiently large $n^{\prime}$, let $t=\lfloor\log^{1-\delta}n^{\prime}\rfloor$, $n=\lfloor\frac{n^{\prime}}{t}\rfloor t$ and $\epsilon=o(t^{-1})$. Then the following solution is feasible for $\text{SoS}_{t}(LP^{+})$, with an integrality gap of $2-o(1)$:

$y^{N}_{I}=\binom{n}{|I|}^{-1}\cdot\begin{cases}\frac{(1+\epsilon)n}{(n-1)\lfloor\log n\rfloor} & \text{for }|I|=\lfloor\log n\rfloor \\ \frac{\epsilon t}{jn} & \text{for }|I|=j\frac{n}{t}\text{ and }j\in[t] \\ 1-\sum_{\emptyset\neq I\subseteq N}y^{N}_{I} & \text{for }I=\emptyset \\ 0 & \text{otherwise}\end{cases}$ (16)
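As a sanity check, the sketch below (our own; the paper leaves the base of $\log$ unspecified, so natural log and an illustrative $\epsilon$ are assumed) verifies at concrete parameters that (16) defines a probability distribution over the levels $|I|$ and that its objective value equals $\frac{n}{n-1}(1+\epsilon)+\epsilon t$, the expression computed in Section 5.2.

```python
from math import log, floor

def solution_levels(n, t, eps):
    """Level sums c_k = binom(n,k) * y_k^N of solution (16);
    the empty set absorbs the slack. Natural log assumed (base unspecified)."""
    beta = floor(log(n))
    c = {beta: (1 + eps) * n / ((n - 1) * beta)}
    for j in range(1, t + 1):
        c[j * n // t] = eps * t / (j * n)
    c[0] = 1 - sum(c.values())
    return c

t, eps = 3, 1e-4                 # illustrative values only; the paper needs eps = o(1/t)
n = (10 ** 6 // t) * t           # n divisible by t, as in Theorem 3

c = solution_levels(n, t, eps)
assert abs(sum(c.values()) - 1) < 1e-12                       # Condition (1)
assert all(v >= 0 for v in c.values())                        # nonnegativity, so (2) is free
obj = sum(k * v for k, v in c.items())
assert abs(obj - ((1 + eps) * n / (n - 1) + eps * t)) < 1e-9  # objective of Section 5.2
```
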

5.1 Overview of the proof

An integrality gap proof for the SoS hierarchy can in general be thought of as having two steps: first, choosing a solution to the hierarchy that attains a superoptimal value, and second, showing that this solution is feasible for the hierarchy. We take advantage of Theorem 1 in both steps. Here we give an overview of our integrality gap construction, keeping the discussion informal and technical details minimal; the full proof can be found in Section 5.2.

Choosing the solution.

We make the following simplifying assumptions about the structure of the variables $y_{I}^{N}$: due to symmetry in the problem we set $y_{I}^{N}=y_{J}^{N}=y_{k}^{N}$ for all $I,J$ such that $|I|=|J|=k$, and for every $I\subseteq N$ we set $y_{I}^{N}\geq 0$ in order to satisfy (2) for free. Furthermore, in order to have an integrality gap (i.e. a small objective function value), we guess that $y_{0}^{N}\approx 1$, forcing the other variables to be small due to (1).

We then show that satisfying (3) for every constraint follows from showing that

$\sum_{k=0}^{n}\binom{n}{k}y_{k}^{N}(k-2)\prod_{i=1}^{t}(k-r_{i})^{2}\geq 0$ (17)

for every choice of $t$ real variables $r_{i}$. We get this condition by observing similarities in the structure of the constraints and applying Theorem 1, and then expressing the polynomial in root form. (We show that the roots $r_{i}$ can be assumed to be real numbers.) If we set $y_{1}^{N}=0$, the only negative term in the sum corresponds to $y_{0}^{N}$. Then, it is clear that we need at least $t+1$ non-zero variables $y_{k}^{N}$, since otherwise the roots $r_{i}$ can be set such that the positive terms in (17) vanish and the inequality is not satisfied. Therefore, we choose exactly $t+1$ of the $y_{k}^{N}$ to be strictly positive (and the rest $0$, excluding $y_{0}^{N}$), and we distribute them “far away” from each other, so that no root can be placed such that the coefficients of two positive terms become small simultaneously. To take this idea further, for one “very small” $k^{\prime}$ (logarithmic in $n$), we set $y_{k^{\prime}}^{N}$ positive and space out the rest evenly.

Proving that the solution is feasible.

We show that (17) holds for all possible $r_{i}$ with our chosen solution by analysing two cases. In the first case we assume that all of the roots $r_{i}$ are larger than $\log^{3}n$. Then, we show that the “small” point $k^{\prime}$ we chose is enough to satisfy the condition. In the complementary case, we assume that there is at least one root $r_{i}$ smaller than $\log^{3}n$. It follows that one of the evenly spaced points is “far” from every remaining root, and can be used to show that the condition is satisfied.

5.2 Proof of Theorem 3

We start by proving the claimed integrality gap. The defined solution has an objective value that is arbitrarily close to 1, whereas the optimal integral value is 2. Indeed, the objective value of the relaxation is (see Appendix A): $\sum_{I\subseteq N}y^{N}_{I}|I|=\frac{n}{n-1}(1+\varepsilon)+\varepsilon t\ \longrightarrow\ 1$ as $n\rightarrow\infty$.

The remaining part of Theorem 3 follows by showing that the suggested solution satisfies (1), (2) and (3). Note that (1) is immediately satisfied by the definition of the variables $\{y_{I}^{N}\}$, and (2) is satisfied since $y_{I}^{N}\geq 0$ and the rank one matrix $Z_{I}Z_{I}^{\top}$ is positive semidefinite for every $I\subseteq N$. It remains to prove that Condition (3) is also satisfied for all the constraints (10) and (11). Note that a constraint (11) is not symmetric (one variable is missing, and sets of variables of the same size do not play the same role with respect to this constraint). However, the following lemma shows how to resolve this issue by reducing to the form (8) of Theorem 1.

Lemma 4.

Condition (3) holds for both (10) and (11) if the solution of Theorem 3 satisfies

$\sum_{\emptyset\neq I\subseteq N}y_{I}^{N}\left(|I|-2\right)Z_{I}Z_{I}^{\top}\succeq\frac{n}{n-1}Z_{\emptyset}Z_{\emptyset}^{\top}$ (18)
Proof.

We first show that (18) implies that (3) holds for the demand constraint $\sum_{j=1}^{n}x_{j}\geq 1+1/(n-1)$. Since for large $n$ we have $y_{I}^{N}=0$ for $|I|=1$ and $y_{\emptyset}^{N}\leq 1$, Condition (3) takes the following form

$\sum_{\emptyset\neq I\subseteq N}y_{I}^{N}\left(|I|-\frac{n}{n-1}\right)Z_{I}Z_{I}^{\top}\succeq y_{\emptyset}^{N}\frac{n}{n-1}Z_{\emptyset}Z_{\emptyset}^{\top}$

which is implied if (18) is satisfied (recall that $y^{N}_{I}\geq 0$ and $\frac{n}{n-1}\leq 2$). Next, we show that (18) also implies that (3) is satisfied for the cover constraint $\sum_{j=1}^{n-1}x_{j}\geq 1$ (the other cases are similar). For this constraint, Condition (3) can be written as

$\sum_{\substack{n\notin I\subseteq N \\ I\neq\emptyset}}y^{N}_{I}(|I|-1)Z_{I}Z_{I}^{\top}+\sum_{n\in I\subseteq N}y^{N}_{I}(|I|-2)Z_{I}Z_{I}^{\top}-y^{N}_{\emptyset}Z_{\emptyset}Z_{\emptyset}^{\top} \succeq \sum_{\substack{I\subseteq N \\ I\neq\emptyset}}y^{N}_{I}(|I|-2)Z_{I}Z_{I}^{\top}-y^{N}_{\emptyset}Z_{\emptyset}Z_{\emptyset}^{\top}\succeq 0$

which is also implied if (18) is satisfied.

Now, by Theorem 1, Condition (18) holds if we have $\sum_{k=1}^{n}y_{k}^{N}(k-2)\binom{n}{k}G_{h}(k)\geq\frac{n}{n-1}G_{h}(0)$ for $h=0,1,\ldots,t$ and every univariate polynomial $G_{h}(k)$ of degree $2t$ such that $G_{h}(k)\geq 0$ for $k\in[h-1,\,n-h+1]$ and $G_{h}(k)=0$ for $k\in\{0,\ldots,h-1\}\cup\{n-h+1,\ldots,n\}$.

Note that the only nontrivial case is $h=0$, since otherwise the above condition is immediately satisfied. Indeed, for $h>0$ we have $G_{h}(0)=0$, and the only remaining terms in the sum are non-negative. Thus, in order to complete the proof of Theorem 3, it is enough to show that the following is satisfied:

$\sum_{k=1}^{n}y_{k}^{N}(k-2)\binom{n}{k}P^{2}(k)\geq\frac{n}{n-1}P^{2}(0)\qquad\forall P:\deg(P)\leq t$ (19)

The following lemma (proved in Section 5.3) further reduces the interesting cases.

Lemma 5.

In order to prove that Solution (16) satisfies (19), it is sufficient to prove that (16) satisfies (19) for polynomials $P(x)$ with the following properties:

  (a) all the roots $r_{1},\ldots,r_{t}$ of $P(x)=0$ are real numbers,

  (b) all the roots $r_{1},\ldots,r_{t}$ of $P(x)=0$ are in the range $1\leq r_{j}\leq n$, for all $j=1,\ldots,t$,

  (c) the degree of $P(x)$ is exactly $t$.

Next we show that Solution (16) satisfies (19) and that there exists an $\varepsilon=o(t^{-1})$ as claimed.

The fundamental theorem of algebra states that any univariate polynomial of degree $t$ has exactly $t$ complex roots. By Lemma 5, we can restrict attention to polynomials with $t$ real roots. We prove that the suggested solution satisfies (19) by expressing the generic univariate polynomial $P(k)$ in terms of its roots $r_{1},\ldots,r_{t}$, so that (19) becomes

$\sum_{k=1}^{n}\binom{n}{k}y_{k}^{N}\left(k-2\right)\prod_{i=1}^{t}(r_{i}-k)^{2}\geq\left(1+\frac{1}{n-1}\right)\prod_{i=1}^{t}r_{i}^{2}$ (20)

To show that (20) is satisfied we separate two cases: when all of the roots of the polynomial are greater than or equal to a fixed threshold $\alpha=\log^{3}n$, and when at least one root is smaller than this threshold. In order to simplify the computations we write $\beta=\lfloor\log n\rfloor$.

  1. $r_{j}\geq\alpha$ for all $j$. It is sufficient to show that the left-hand side term in (20) corresponding to $k=\beta$ satisfies

    $\binom{n}{\beta}y_{\beta}^{N}\left(\beta-2\right)\prod_{i=1}^{t}(r_{i}-\beta)^{2}\geq\left(1+\frac{1}{n-1}\right)\prod_{i=1}^{t}r_{i}^{2}$

    Replacing the variables with their values we get

    $\frac{n}{n-1}\frac{1+\varepsilon}{\beta}\left(\beta-2\right)\prod_{i=1}^{t}(r_{i}-\beta)^{2}\geq\frac{n}{n-1}\prod_{i=1}^{t}r_{i}^{2} \Longleftrightarrow 1+\varepsilon\geq\prod_{i=1}^{t}\left(\frac{r_{i}}{r_{i}-\beta}\right)^{2}\frac{1}{1-2\beta^{-1}}$

    By Lemma 5 and the case assumption, every root satisfies $\alpha\leq r_{j}\leq n$, so $\frac{r_{i}}{r_{i}-\beta}\leq\frac{\alpha}{\alpha-\beta}$ and it is sufficient that $1+\varepsilon\geq\frac{1}{1-2\beta^{-1}}\left(\frac{\alpha}{\alpha-\beta}\right)^{2t}$ holds.

  2. There is at least one root $r_{j}$ such that $r_{j}<\alpha$. It can be shown by straightforward induction on the number of roots that if $r_{j}<\alpha$ for at least one $j$, then there exists a point $u=l\frac{n}{t}$, for some $l\in\{1,\ldots,t\}$, such that $\binom{n}{u}y_{u}^{N}>0$ and $|u-r_{i}|\geq\frac{n}{2t}$ for all $i=1,\ldots,t$. Let $u$ be such a point. It is sufficient to show that $\binom{n}{u}y_{u}^{N}\left(u-2\right)\prod_{i=1}^{t}(r_{i}-u)^{2}\geq\frac{n}{n-1}\prod_{i=1}^{t}r_{i}^{2}$.

    We have $\binom{n}{u}y_{u}^{N}=\frac{\varepsilon}{u}$ together with the estimates $u-2\geq\frac{u}{2}$, $(r_{i}-u)^{2}\geq\frac{n^{2}}{(2t)^{2}}$ and $\prod_{i=1}^{t}r_{i}\leq n^{t-1}\alpha$. Substituting these, we get the condition $\frac{\varepsilon}{2}\left(\frac{n}{2t}\right)^{2t}\geq\frac{n}{n-1}n^{2t-2}\alpha^{2}$, which gives the requirement $\varepsilon\geq\frac{2\alpha^{2}}{n^{2}}(2t)^{2t}\frac{n}{n-1}$.

These two cases suggest that we fix ε\varepsilon as

$\varepsilon=\max\left\{\frac{1}{1-2\beta^{-1}}\left(1-\frac{\beta}{\alpha}\right)^{-2t}-1,\ \frac{n}{n-1}\frac{2\alpha^{2}}{n^{2}}(2t)^{2t}\right\}$

The proof has now been reduced to showing that with this choice of $\varepsilon$ we have $\varepsilon t\rightarrow 0$, i.e., $\varepsilon=o(t^{-1})$. Assume first that $\varepsilon=\frac{1}{1-2\beta^{-1}}\left(1-\frac{\beta}{\alpha}\right)^{-2t}-1$. Then $\varepsilon t=t\left(\frac{1}{1-2\beta^{-1}}\left(1-\frac{\beta}{\alpha}\right)^{-2t}-1\right)\leq t\left(\frac{1}{1-2\beta^{-1}}e^{4t\frac{\beta}{\alpha}}-1\right)$ when $\beta/\alpha\leq 1/2$, using the estimate $1-x\geq e^{-2x}$, which gives $(1-x)^{-2t}\leq e^{4xt}$ and holds for $x\leq 1/2$. Furthermore, the same estimate yields $e^{x}-1\leq 2x$ for $x\leq 1/2$. Hence, we have the bound

$\varepsilon t\leq t\frac{1}{1-2\beta^{-1}}\cdot 8t\frac{\beta}{\alpha}+t\left(\frac{1}{1-2\beta^{-1}}-1\right)=\frac{8}{1-2\beta^{-1}}\cdot t^{2}\frac{\beta}{\alpha}+\frac{2t\beta^{-1}}{1-2\beta^{-1}}$

The right-hand side goes to $0$ if $\frac{t^{2}\beta}{\alpha}\rightarrow 0$ and $\frac{t}{\beta}\rightarrow 0$ as $n\rightarrow\infty$. This is clearly the case for $t\leq\log^{1-\delta}n$, for any $\delta>0$.

Next, assume $\varepsilon=\frac{n}{n-1}\frac{2\alpha^{2}}{n^{2}}(2t)^{2t}$. Then $\varepsilon t=t\frac{n}{n-1}\frac{2\alpha^{2}}{n^{2}}(2t)^{2t}$, so it suffices to show that $\frac{t\alpha^{2}}{n^{2}}(2t)^{2t}\rightarrow 0$ as $n\rightarrow\infty$. Substituting $\alpha=\log^{3}n$ and $t=\log^{1-\delta}n$, for any $\delta>0$, allows us to write this as $\frac{t\alpha^{2}}{n^{2}}(2t)^{2t}=\log^{1-\delta}n\cdot\frac{\log^{6}n}{n^{2}}(2\log^{1-\delta}n)^{2\log^{1-\delta}n}$. By the change of variables $w=\log^{1-\delta}n$ we get

\frac{w^{2w+\frac{7-\delta}{1-\delta}}2^{2w}}{e^{2w^{\frac{1}{1-\delta}}}}\leq\frac{w^{4w+\frac{7-\delta}{1-\delta}}}{e^{2w^{\frac{1}{1-\delta}}}}=\frac{e^{(4w+\frac{7-\delta}{1-\delta})\log w}}{e^{2w^{\frac{1}{1-\delta}}}}=e^{(4w+\frac{7-\delta}{1-\delta})\log w-2w^{\frac{1}{1-\delta}}}

which tends to 0 as $n\rightarrow\infty$.
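The asymptotics of the second branch can be sanity-checked numerically. The sketch below is an illustration only (the unfloored $t=\log^{1-\delta}n$ with $\delta=1/2$, and $\alpha=\log^{3}n$, are hypothetical test choices, not part of the proof) and checks that $\varepsilon t$ decreases towards 0 as $n$ grows:

```python
import math

def eps_t(n, delta=0.5):
    # second branch: eps = (n/(n-1)) * (2 alpha^2 / n^2) * (2t)^(2t),
    # with alpha = log^3 n and t = log^(1-delta) n (unfloored for smoothness)
    t = math.log(n) ** (1 - delta)
    alpha = math.log(n) ** 3
    eps = (n / (n - 1)) * (2 * alpha ** 2 / n ** 2) * (2 * t) ** (2 * t)
    return eps * t

vals = [eps_t(10.0 ** p) for p in (20, 40, 60)]
assert vals[0] > vals[1] > vals[2] and vals[2] < 1e-12
```

For moderate $n$ the quantity is still large; the decay only sets in for very large $n$, consistent with the asymptotic nature of the claim.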

5.3 Proof of Lemma 5

Lemma 5. In order to prove that Solution (26) satisfies (19) it is sufficient to prove that (26) satisfies (19) for polynomials $P(x)$ with the following properties:

  (a) all the roots $r_{1},\ldots,r_{t}$ of $P(x)=0$ are real numbers,

  (b) all the roots $r_{1},\ldots,r_{t}$ of $P(x)=0$ are in the range $1\leq r_{j}\leq n$ for all $j=1,\ldots,t$,

  (c) the degree of $P(x)$ is exactly $t$.

Proof.

First notice that (19) is equivalent to

\sum_{k=1}^{n}y_{k}^{N}\binom{n}{k}\left(k-2\right)\prod_{j=1}^{t}\left(\frac{r_{j}-k}{r_{j}}\right)^{2}\geq\left(1+\frac{1}{n-1}\right)\qquad(21)

where for fixed $n$ the right-hand side is constant.

  (a)
    Let $P(k)$ be a univariate polynomial with $2q$ complex roots (complex roots appear in conjugate pairs), i.e. $r_{2j-1}=a_{j}+b_{j}i$ and $r_{2j}=a_{j}-b_{j}i$ for $j=1,\ldots,q$, and with the remaining roots real. Let $P'(k)$ be the polynomial with all roots real such that $r'_{2j-1}=r'_{2j}=\sqrt{a_{j}^{2}+b_{j}^{2}}$ for $j=1,\ldots,q$ and $r'_{j}=r_{j}$ for $j>2q$.

    For any $k\in N$ and $j\in[t]$, a simple calculation shows that

    \left(\frac{r_{2j-1}-k}{r_{2j-1}}\right)^{2}\left(\frac{r_{2j}-k}{r_{2j}}\right)^{2}\geq\left(\frac{r'_{2j-1}-k}{r'_{2j-1}}\right)^{2}\left(\frac{r'_{2j}-k}{r'_{2j}}\right)^{2}

    Hence,

    \sum_{k=1}^{n}y_{k}^{N}\binom{n}{k}\left(k-2\right)\prod_{j=1}^{t}\left(\frac{r_{j}-k}{r_{j}}\right)^{2}\geq\sum_{k=1}^{n}y_{k}^{N}\binom{n}{k}\left(k-2\right)\prod_{j=1}^{t}\left(\frac{r'_{j}-k}{r'_{j}}\right)^{2}
  (b)
    Let $P(k)$ be a univariate polynomial with all roots positive except one, i.e. $r_{1}=-a$ for some $a>0$. Let $P'(k)$ be the univariate polynomial with all roots positive such that $r'_{1}=a$ and $r'_{j}=r_{j}$ for $j>1$. Since for any $k\in N$

    \left(\frac{-a-k}{-a}\right)^{2}\geq\left(\frac{a-k}{a}\right)^{2}

    it follows that

    \sum_{k=1}^{n}y_{k}^{N}\binom{n}{k}\left(k-2\right)\prod_{j=1}^{t}\left(\frac{r_{j}-k}{r_{j}}\right)^{2}\geq\sum_{k=1}^{n}y_{k}^{N}\binom{n}{k}\left(k-2\right)\prod_{j=1}^{t}\left(\frac{r'_{j}-k}{r'_{j}}\right)^{2}

    Now, let $P(k)$ be a univariate polynomial with $r_{1}\in(0,1)$ and $r_{j}\geq 1$ for $j>1$. Let $P'(k)$ be the univariate polynomial with $r'_{1}=1$ and $r'_{j}=r_{j}$ for $j>1$. Since for any $k\in N$

    \left(\frac{r_{1}-k}{r_{1}}\right)^{2}\geq\left(\frac{1-k}{1}\right)^{2}

    it follows that

    \sum_{k=1}^{n}y_{k}^{N}\binom{n}{k}\left(k-2\right)\prod_{j=1}^{t}\left(\frac{r_{j}-k}{r_{j}}\right)^{2}\geq\sum_{k=1}^{n}y_{k}^{N}\binom{n}{k}\left(k-2\right)\prod_{j=1}^{t}\left(\frac{r'_{j}-k}{r'_{j}}\right)^{2}

    Next, let $P(k)$ be a univariate polynomial with $r_{t}=an$ for some $a>1$ and $r_{j}\in[1,n]$ for $j\neq t$. Let $P'(k)$ be the univariate polynomial with $r'_{t}=n$ and $r'_{j}=r_{j}$ for $j\neq t$. Since for any $k\in N$

    \left(\frac{an-k}{an}\right)^{2}\geq\left(\frac{n-k}{n}\right)^{2}

    it follows that

    \sum_{k=1}^{n}y_{k}^{N}\binom{n}{k}\left(k-2\right)\prod_{j=1}^{t}\left(\frac{r_{j}-k}{r_{j}}\right)^{2}\geq\sum_{k=1}^{n}y_{k}^{N}\binom{n}{k}\left(k-2\right)\prod_{j=1}^{t}\left(\frac{r'_{j}-k}{r'_{j}}\right)^{2}
  (c)
    Let $P(k)$ be a univariate polynomial of degree $s<t$ with all roots real. Let $P'(k)$ be the polynomial of degree $t$ with all roots real such that $r'_{j}=r_{j}$ for $j\leq s$ and $r'_{j}=n$ for $s<j\leq t$.

    For any $k\in N$, we have

    1\geq\left(\frac{n-k}{n}\right)^{2}

    Hence,

    \left(\frac{r_{1}-k}{r_{1}}\right)^{2}\cdots\left(\frac{r_{s}-k}{r_{s}}\right)^{2}\geq\left(\frac{r_{1}-k}{r_{1}}\right)^{2}\cdots\left(\frac{r_{s}-k}{r_{s}}\right)^{2}\left(\frac{n-k}{n}\right)^{2(t-s)}

    and finally

    \sum_{k=1}^{n}y_{k}^{N}\binom{n}{k}\left(k-2\right)\prod_{j=1}^{t}\left(\frac{r_{j}-k}{r_{j}}\right)^{2}\geq\sum_{k=1}^{n}y_{k}^{N}\binom{n}{k}\left(k-2\right)\prod_{j=1}^{t}\left(\frac{r'_{j}-k}{r'_{j}}\right)^{2}\qquad\qed
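Each of the root replacements used in the proof rests on an elementary pointwise inequality between the squared factors. The following sketch checks these inequalities numerically for some arbitrary (hypothetical) values of the roots:

```python
import math

# Check the pointwise inequalities behind the root replacements of Lemma 5.
# The root values (a, b, r, aa) below are arbitrary test choices.
for k in range(0, 51):
    # (a) conjugate pair a +- bi replaced by the real double root m = |a + bi|;
    # the product of the two conjugate factors is ((a-k)^2 + b^2) / m^2
    a, b = 3.0, 4.0
    m = math.hypot(a, b)
    assert (((a - k) ** 2 + b ** 2) / m ** 2) ** 2 >= ((m - k) / m) ** 4 - 1e-12
    # (b) negative root -a replaced by the positive root a
    a = 2.5
    assert ((-a - k) / -a) ** 2 >= ((a - k) / a) ** 2 - 1e-12
    # (b) root r in (0,1) replaced by the root 1
    r = 0.3
    assert ((r - k) / r) ** 2 >= (1 - k) ** 2 - 1e-12
    # (b) root a*n with a > 1 replaced by the root n (here n = 50, so k <= n)
    n, aa = 50, 1.7
    assert ((aa * n - k) / (aa * n)) ** 2 >= ((n - k) / n) ** 2 - 1e-12
```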

5.4 Further Results

In a recent paper [24] the authors characterize the class of initial 0/1 relaxations that are maximally hard for the SoS hierarchy. Here, maximally hard means those relaxations that still have an integrality gap even after $n-1$ rounds of the SoS hierarchy (recall that at level $n$ the integrality gap vanishes). An illustrative natural member of this class is given by the simple LP relaxation of the Min-Knapsack problem, i.e.

(LP)\quad\min\left\{\sum_{j=1}^{n}x_{j}\ :\ \sum_{j=1}^{n}x_{j}\geq P,\ x\in[0,1]^{n}\right\}

In [24] it is shown that at level $n-1$ the integrality gap is $k$, for any $k\geq 2$, if and only if $P=\Theta(k)\cdot 2^{2n}$. A natural question is to understand whether the SoS hierarchy is able to reduce the gap when $P$ is “small”.

This problem, for $P=1/2$, was considered by Cook and Dash [14] as an example where the Lovász-Schrijver hierarchy rank is $n$. Laurent [26] showed that the Sherali-Adams hierarchy rank is also equal to $n$ and raised the open question of finding the rank for the Lasserre hierarchy. She also showed that for $n=2$ the Lasserre relaxation has an integrality gap at level 1, but left open whether or not this happens at level $n-1$ for general $n$. In [24] the possibility that the Lasserre/SoS rank is $n$ for $n\geq 3$ is ruled out.

The following theorem provides a feasible solution for $\text{SoS}_{t}(LP)$ with integrality gap arbitrarily close to $1/P$ for $t=O(\log^{1-\varepsilon}n)$ and any $P<1$. The proof is omitted since it is similar to the proof of Theorem 3.

Theorem 6.

For any $\delta>0$ and sufficiently large $n'$, let $t=\lfloor\log^{1-\delta}n'\rfloor$, $n=\lfloor\frac{n'}{t}\rfloor t$ and $\epsilon=o(t^{-1})$. Then the following solution is feasible for $\text{SoS}_{t}(LP)$ with integrality gap arbitrarily close to $1/P$.

y^{N}_{I}=\binom{n}{|I|}^{-1}\cdot\left\{\begin{array}{ll}\frac{1+\epsilon}{P\lfloor\log n\rfloor}&\text{for }|I|=\lfloor\log n\rfloor\\ \frac{\epsilon t}{jn}&\text{for }|I|=j\frac{n}{t}\text{ and }j\in[t]\\ 1-\sum_{\emptyset\neq I\subseteq N}y^{N}_{I}&\text{for }I=\emptyset\\ 0&\text{otherwise}\end{array}\right.\qquad(26)

6 Proof of Theorem 1

Theorem 1 is actually a corollary of a stronger statement (see Theorem 7 below) that provides necessary and sufficient conditions for the matrix (8) to be positive semidefinite.

Theorem 7 uses a special family of polynomials $G_{h}(k)\in\mathbb{R}[k]$ whose definition is deferred to Section 6.1 (see Definition 3), where it arises naturally in the flow of the proof of Theorem 7. Here we remark that the polynomials $G_{h}(k)$ of Definition 3 satisfy conditions (4), (5) and (6) of Theorem 1 (as shown in Lemma 14 below).

Theorem 7.

Let $z^{N}_{k}\in\mathbb{R}$ for $k\in\{0,\ldots,n\}$. Then for any $t\in\mathbb{N}$ the following matrix is positive semidefinite

\sum_{k=0}^{n}z^{N}_{k}\sum_{\begin{subarray}{c}I\subseteq N\\ |I|=k\end{subarray}}Z_{I}Z_{I}^{\top}\qquad(\text{where }Z_{I}\in\mathbb{R}^{\mathcal{P}_{t}(N)})\qquad(27)

if and only if

\sum_{k=0}^{n}z^{N}_{k}\binom{n}{k}G_{h}(k)\geq 0\qquad\text{ for }h\in\{0,\ldots,t\}\qquad(28)

for every univariate polynomial $G_{h}(x)\in\mathbb{R}[x]$ of degree at most $2t$ as defined in Definition 3.

By Lemma 14, Theorem 1 is a straightforward corollary of Theorem 7. In the following we provide a proof for the latter.

6.1 Proof of Theorem 7

We study when the matrix $M=\sum_{k=0}^{n}z_{k}\sum_{I\subseteq N,|I|=k}Z_{I}Z_{I}^{\top}$, where $Z_{I}\in\mathbb{R}^{\mathcal{P}_{t}(N)}$, is positive semidefinite. Theorem 7 allows us to reduce the condition $M\succeq 0$ to inequalities of the form $\sum_{k=0}^{n}\binom{n}{k}z_{k}p(k)\geq 0$, where $p(k)$ is a univariate polynomial of degree $2t$ with some additional remarkable properties.

A key idea used to obtain such a characterization is that the eigenvectors of $M$ are “very well” structured. This structure is used to obtain $p(k)$ with the claimed properties.

The structure of the eigenvectors.

Let $\Pi$ denote the group of all permutations of the set $N$, i.e. the symmetric group. Let $P_{\pi}$ be the permutation matrix of size $\mathcal{P}_{t}(N)\times\mathcal{P}_{t}(N)$ corresponding to a permutation $\pi$ of the set $N$, i.e. for any vector $v$ we have $[P_{\pi}v]_{I}=v_{\pi(I)}$ for any $I\in\mathcal{P}_{t}(N)$ (see Footnote 4). Note that $P_{\pi}^{-1}=P_{\pi}^{\top}$.

Lemma 8.

For every $\pi\in\Pi$ we have $P^{\top}_{\pi}MP_{\pi}=M$ or, equivalently, $M$ and $P_{\pi}$ commute: $MP_{\pi}=P_{\pi}M$.

Proof.

Let $e_{I}$ denote the vector with a $1$ in the $I$-th coordinate and $0$'s elsewhere. Observe that

P_{\pi}^{\top}Z_{I}=P_{\pi}^{\top}\sum_{Q\subseteq I}e_{Q}=\sum_{Q\subseteq I}P^{\top}_{\pi}e_{Q}=\sum_{Q\subseteq I}e_{\pi^{-1}(Q)}=\sum_{\pi(H)\subseteq I}e_{H}=\sum_{H\subseteq\pi^{-1}(I)}e_{H}=Z_{\pi^{-1}(I)}

Then $P_{\pi}^{\top}MP_{\pi}=\sum_{k=0}^{n}z_{k}\sum_{I\subseteq N,|I|=k}P_{\pi}^{\top}Z_{I}Z_{I}^{\top}P_{\pi}=\sum_{k=0}^{n}z_{k}\sum_{I\subseteq N,|I|=k}Z_{\pi^{-1}(I)}Z_{\pi^{-1}(I)}^{\top}=M$. ∎
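Lemma 8 can be verified directly on a small instance. The sketch below builds $M$ for arbitrary test values ($n=4$, $t=2$, random $z_{k}$; these are our own choices) and checks the equivalent entrywise invariance $M_{\pi^{-1}(I),\pi^{-1}(J)}=M_{I,J}$ for every permutation $\pi$:

```python
import itertools, random

# Small-instance check of Lemma 8 (n, t and the z_k are arbitrary test values).
n, t = 4, 2
N = tuple(range(1, n + 1))
Pt = [frozenset(c) for s in range(t + 1) for c in itertools.combinations(N, s)]
idx = {Q: i for i, Q in enumerate(Pt)}
random.seed(0)
z = [random.uniform(-1, 1) for _ in range(n + 1)]

d = len(Pt)
M = [[0.0] * d for _ in range(d)]
for k in range(n + 1):
    for I in itertools.combinations(N, k):
        I = frozenset(I)
        ZI = [1.0 if Q <= I else 0.0 for Q in Pt]  # [Z_I]_Q = 1 iff Q subset of I
        for a in range(d):
            if ZI[a]:
                for b in range(d):
                    M[a][b] += z[k] * ZI[a] * ZI[b]

# (P_pi^T M P_pi)_{I,J} = M_{pi^{-1}(I), pi^{-1}(J)}; Lemma 8 says this equals M_{I,J}
for perm in itertools.permutations(N):
    inv = {perm[i]: N[i] for i in range(n)}  # pi^{-1}
    for A in Pt:
        for B in Pt:
            pA = frozenset(inv[x] for x in A)
            pB = frozenset(inv[x] for x in B)
            assert abs(M[idx[pA]][idx[pB]] - M[idx[A]][idx[B]]) < 1e-9
```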

Corollary 9.

If $w\in\mathbb{R}^{\mathcal{P}_{t}(N)}$ is an eigenvector of $M$, then $v=P_{\pi}w$ is also an eigenvector of $M$ for any $\pi\in\Pi$.

Proof.

By assumption $Mw=\lambda w$, and by Lemma 8, $Mv=M(P_{\pi}w)=P_{\pi}Mw=\lambda v$. ∎

Using Corollary 9 we can show that the set of interesting eigenvectors has some “strong” symmetry properties that will be used in our analysis. In the simplest case, for any eigenvector $w$ we could take the vector $u=\sum_{\pi\in\Pi}P_{\pi}w$ and observe that the elements of $u$ satisfy $u_{I}=u_{J}$ for all $I,J$ such that $|I|=|J|$. If $\|u\|\neq 0$, then $u/\|u\|$ and $w/\|w\|$ are two eigenvectors corresponding to the same eigenvalue. The latter implies that by considering only eigenvectors of the form $u_{I}=u_{J}$ for all $|I|=|J|$ we would also cover the eigenvalue corresponding to the “unstructured” eigenvector $w$. This is not the case in general, however, since it is possible that $\sum_{\pi\in\Pi}P_{\pi}w=0$.

We overcome this obstacle by restricting the permutations in a way that guarantees $u$ to be non-zero. Before going into the details, we introduce some notation.

Definition 2.

For any $H\subseteq N$, we denote by $\Pi_{H}$ the permutation group that fixes the set $H$ in the following sense: $\pi\in\Pi_{H}\Leftrightarrow\pi(H)=H$.

Note that the definition is equivalent to saying that $\pi\in\Pi_{H}$ if and only if $\pi(i)\in H$ for every $i\in H$ and $\pi(i)\notin H$ for every $i\notin H$.

Now, we choose a subset $H\subseteq N$ such that $\sum_{\pi\in\Pi_{I}}P_{\pi}w=0$ for each $I$ with $|I|<|H|$, and $u=\sum_{\pi\in\Pi_{H}}P_{\pi}w\neq 0$. Such a set $H$ always exists, since otherwise $w$ would be the zero vector: if $w$ has a non-zero entry $w_{J}$, we can take $H=J$ and the resulting $u$ is non-zero. The choice of $H$ is not unique, but we can always assume that it consists of the first $h=|H|$ elements of $N$, i.e. $H=\{1,\ldots,h\}$. Indeed, if this is not the case, there exists a permutation $\pi\in\Pi$ that maps $H$ to the set of the first $|H|$ elements of $N$, and $P_{\pi}w$ is an eigenvector of $M$ by Corollary 9. Now it holds that $u\neq 0$, and the vector $u/\|u\|$ is a unit eigenvector corresponding to the same eigenvalue as $w$ that has many elements equal to each other.

Lemma 10.

Let $w\in\mathbb{R}^{\mathcal{P}_{t}(N)}$ be a unit eigenvector of $M$ corresponding to eigenvalue $\lambda$, and let $H$ be a smallest subset of $N$ such that $u=\sum_{\pi\in\Pi_{H}}P_{\pi}w\neq 0$. Then $u/\|u\|$ is also a unit eigenvector of $M$ corresponding to eigenvalue $\lambda$.

The following lemma shows the structure of eigenvectors obtained from summing the permutations of any “unstructured” eigenvector.

Lemma 11.

Let $u=\sum_{\pi\in\Pi_{H}}P_{\pi}w$. Then the vector $u$ is invariant under the permutations of $\Pi_{H}$, namely $u_{I}=u_{\pi(I)}$ for $\pi\in\Pi_{H}$. Equivalently, $u_{I}=u_{J}$ for all $|I|=|J|$ such that $|I\cap H|=|J\cap H|$.

Proof.

For any fixed $\pi\in\Pi_{H}$ we have $u_{\pi(I)}=\left[\sum_{\sigma\in\Pi_{H}}P_{\sigma}w\right]_{\pi(I)}=\sum_{\sigma\in\Pi_{H}}w_{\sigma(\pi(I))}=\sum_{\sigma'\in\Pi_{H}}w_{\sigma'(I)}=u_{I}$, where the second-to-last equality follows from the change of variable $\sigma'=\sigma\pi$, which is a bijection of $\Pi_{H}$ onto itself. The claim follows by observing that for all $|I|=|J|$ such that $|I\cap H|=|J\cap H|$ there exists $\pi\in\Pi_{H}$ such that $\pi(I)=J$. ∎
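The invariance stated in Lemma 11 can likewise be checked on a small instance: symmetrizing an arbitrary vector $w$ over $\Pi_{H}$ yields a vector whose entries depend only on the pair $(|I|,|I\cap H|)$. A sketch with arbitrary test values ($n$, $t$, $h$ and $w$ are our own choices):

```python
import itertools, random

# Small-instance check of Lemma 11.
n, t, h = 4, 2, 2
N = tuple(range(1, n + 1))
H = frozenset(range(1, h + 1))
Pt = [frozenset(c) for s in range(t + 1) for c in itertools.combinations(N, s)]
random.seed(2)
w = {Q: random.uniform(-1, 1) for Q in Pt}

# Pi_H: permutations of N that fix H setwise, pi(H) = H
perms = [dict(zip(N, p)) for p in itertools.permutations(N)
         if frozenset(p[i - 1] for i in H) == H]

# [P_pi w]_I = w_{pi(I)}, so u_I = sum over pi in Pi_H of w_{pi(I)}
u = {I: sum(w[frozenset(pi[x] for x in I)] for pi in perms) for I in Pt}

for I in Pt:
    for J in Pt:
        if len(I) == len(J) and len(I & H) == len(J & H):
            assert abs(u[I] - u[J]) < 1e-9
```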

Lemma 10, Lemma 11 and the arguments above imply Lemma 12.

Lemma 12.

For any eigenvalue $\lambda$ of $M$ there exists $h\in\{0,1,\dots,t\}$ such that the following is an eigenvector corresponding to $\lambda$:

u_{h}=\sum_{i=0}^{t}\sum_{j=0}^{\min\{h,i\}}\alpha_{i,j}b_{i,j}\qquad(29)

where $H=\{1,\ldots,h\}$, $\alpha_{i,j}\in\mathbb{R}$ and $b_{i,j}\in\mathbb{R}^{\mathcal{P}_{t}(N)}$ is such that $[b_{i,j}]_{Q}=1$ if $|Q|=i$ and $|Q\cap H|=j$, and $[b_{i,j}]_{Q}=0$ otherwise.

By Lemma 12, the positive semidefiniteness of $M$ follows by ensuring that for every $h=0,1,\ldots,t$ we have $u_{h}^{\top}Mu_{h}\geq 0$, i.e.

u_{h}^{\top}Mu_{h}=\sum_{k=0}^{n}z_{k}\underbrace{\sum_{\begin{subarray}{c}I\subseteq N\\ |I|=k\end{subarray}}\left(u_{h}^{\top}Z_{I}\right)^{2}}_{A_{k}}=\sum_{k=0}^{n}z_{k}\sum_{\begin{subarray}{c}I\subseteq N\\ |I|=k\end{subarray}}\left(\sum_{i=0}^{t}\sum_{j=0}^{\min\{h,i\}}\alpha_{i,j}b_{i,j}^{\top}Z_{I}\right)^{2}\geq 0

In Lemma 13 below we show that the above values $A_{k}$ are interpolated by the univariate polynomial $G_{h}(x)$ defined in Definition 3. In Lemma 14 we prove some remarkable properties of $G_{h}(x)$, as claimed in Theorem 1.

Definition 3.

For any $h\in\{0,\ldots,t\}$, let $G_{h}(k)\in\mathbb{R}[k]$ be the univariate polynomial defined as follows:

G_{h}(k)=\sum_{r=0}^{h}\binom{h}{r}h_{r}(k)\left(\sum_{j=0}^{h}\binom{r}{j}p_{j}(k-r)\right)^{2}\qquad(30)

where $h_{r}(k)={k}^{\underline{r}}\cdot{(n-k)}^{\underline{h-r}}$ and $p_{j}(k-r)=\sum_{i=0}^{t-j}\alpha_{i+j,j}\binom{k-r}{i}$ (for $\alpha_{i,j}\in\mathbb{R}$). Here ${x}^{\underline{m}}=x(x-1)\cdots(x-m+1)$ denotes the falling factorial (with the convention that ${x}^{\underline{0}}=1$).

Lemma 13.

For every $k=0,\ldots,n$ the following identity holds: $A_{k}=\binom{n}{k}\frac{1}{{n}^{\underline{h}}}G_{h}(k)$.

Proof.

We start by noting that for every $i=0,\ldots,t$ and $j=0,\ldots,|H|$ we have (recall that $\binom{n}{-k}=\binom{n}{n+k}=0$ for any positive integer $k$)

b_{i,j}^{\top}Z_{I}=\binom{|I\cap H|}{j}\binom{|I\setminus H|}{i-j}

Indeed

b_{i,j}^{\top}Z_{I}=\sum_{Q\in\mathcal{P}_{t}(N)}(b_{i,j})_{Q}(Z_{I})_{Q}=\sum_{\begin{subarray}{c}Q\subseteq I,|Q|=i\\ |Q\cap H|=j\end{subarray}}1=\binom{|I\cap H|}{j}\binom{|I\setminus H|}{i-j}

It follows that we have

\sum_{\begin{subarray}{c}I\subseteq N\\ |I|=k\end{subarray}}\left(u^{\top}Z_{I}\right)^{2}=\sum_{\begin{subarray}{c}I\subseteq N\\ |I|=k\end{subarray}}\left(\sum_{i=0}^{t}\sum_{j=0}^{|H|}\alpha_{i,j}b_{i,j}^{\top}Z_{I}\right)^{2}=\sum_{\begin{subarray}{c}I\subseteq N\\ |I|=k\end{subarray}}\left(\sum_{i=0}^{t}\sum_{j=0}^{|H|}\alpha_{i,j}\binom{|I\cap H|}{j}\binom{|I\setminus H|}{i-j}\right)^{2}

Splitting the sum over $I$ according to the size $r=0,\ldots,|H|$ of the intersection $I\cap H$, we have

\sum_{r=0}^{|H|}\sum_{\begin{subarray}{c}|I|=k\\ |I\cap H|=r\end{subarray}}\left(\sum_{i=0}^{t}\sum_{j=0}^{|H|}\alpha_{i,j}\binom{r}{j}\binom{k-r}{i-j}\right)^{2}=\sum_{r=0}^{|H|}\binom{|H|}{r}\binom{n-|H|}{k-r}\left(\sum_{j=0}^{|H|}\binom{r}{j}\sum_{i=0}^{t}\alpha_{i,j}\binom{k-r}{i-j}\right)^{2}

Finally, shifting the index of the sum over $i$ by $j$ justifies the equality

\sum_{\begin{subarray}{c}I\subseteq N\\ |I|=k\end{subarray}}\left(u^{\top}Z_{I}\right)^{2}=\sum_{r=0}^{|H|}\binom{|H|}{r}\binom{n-|H|}{k-r}\left(\sum_{j=0}^{|H|}\binom{r}{j}\sum_{i=0}^{t-j}\alpha_{i+j,j}\binom{k-r}{i}\right)^{2}

Now, the sum over $i$ is a Newton polynomial that we denote by $p_{j}(k-r)=\sum_{i=0}^{t-j}\alpha_{i+j,j}\binom{k-r}{i}$. Note that by definition $\deg(p_{j})=t-j$. Furthermore, observe that

\binom{n-|H|}{k-r}=\binom{n}{k}\frac{1}{{n}^{\underline{|H|}}}{k}^{\underline{r}}\cdot{(n-k)}^{\underline{|H|-r}}

and writing $h_{r}(k)={k}^{\underline{r}}\cdot{(n-k)}^{\underline{|H|-r}}$ yields the claim. ∎
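Lemma 13 can be checked numerically by computing $A_{k}$ directly from the eigenvector coordinates and comparing it with $\binom{n}{k}G_{h}(k)/{n}^{\underline{h}}$. The sketch below uses arbitrary small test values for $n$, $t$, $h$ and random coefficients $\alpha_{i,j}$ (all our own choices):

```python
import itertools, math, random

def falling(x, m):  # falling factorial x(x-1)...(x-m+1)
    out = 1.0
    for i in range(m):
        out *= x - i
    return out

def gbinom(x, i):  # binomial coefficient binom(x, i) as a polynomial in x
    return falling(x, i) / math.factorial(i)

n, t, h = 5, 2, 2
N = range(1, n + 1)
H = frozenset(range(1, h + 1))
random.seed(1)
alpha = {(i, j): random.uniform(-1, 1)
         for i in range(t + 1) for j in range(min(h, i) + 1)}

def p(j, x):  # Newton polynomial p_j of degree t - j
    return sum(alpha[i + j, j] * gbinom(x, i) for i in range(t - j + 1))

def G(k):  # the polynomial G_h(k) of Definition 3
    return sum(math.comb(h, r) * falling(k, r) * falling(n - k, h - r)
               * sum(gbinom(r, j) * p(j, k - r) for j in range(h + 1)) ** 2
               for r in range(h + 1))

def A(k):  # A_k = sum over |I| = k of (u_h^T Z_I)^2, computed directly
    subsets = [frozenset(c) for s in range(t + 1)
               for c in itertools.combinations(N, s)]
    tot = 0.0
    for I in itertools.combinations(N, k):
        I = frozenset(I)
        s = sum(alpha[len(Q), len(Q & H)] for Q in subsets if Q <= I)
        tot += s * s
    return tot

for k in range(n + 1):
    assert abs(A(k) - math.comb(n, k) * G(k) / falling(n, h)) < 1e-8
```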

It follows that for any unit eigenvector $u$ of the form (29) the corresponding eigenvalue is equal to $u^{\top}Mu=\frac{1}{{n}^{\underline{h}}}\sum_{k=0}^{n}z_{k}\binom{n}{k}G_{h}(k)$. Theorem 7 requires that $\sum_{k=0}^{n}z_{k}\binom{n}{k}G_{h}(k)\geq 0$, which implies that the eigenvalue $u^{\top}Mu$ is nonnegative. In the following section we complete the proof by showing that the polynomials $G_{h}(k)$ of Definition 3 satisfy conditions (4), (5) and (6) of Theorem 1 (Lemma 14).

6.2 Properties of the univariate polynomials

Lemma 14.

For any $h\in\{0,\ldots,t\}$, the polynomials $G_{h}(k)$ defined in Definition 3 have the following properties:

  (a) $G_{h}(k)$ is a univariate polynomial of degree at most $2t$,

  (b) $G_{h}(k)\geq 0$ for $k\in[h-1,n-h+1]$,

  (c) $G_{h}(k)=0$ for every $k\in\left\{0,\ldots,h-1\right\}\cup\left\{n-h+1,\ldots,n\right\}$.

Proof of (a).

G_{h}(k)=\sum_{r=0}^{h}\binom{h}{r}h_{r}(k)\left(\sum_{j=0}^{h}\binom{r}{j}p_{j}(k-r)\right)^{2}
=\sum_{r=0}^{h}\binom{h}{r}h_{r}(k)\left(\sum_{i=0}^{h}\sum_{j=0}^{h}\binom{r}{i}\binom{r}{j}p_{i}(k-r)p_{j}(k-r)\right)
=\sum_{r=0}^{h}\binom{h}{r}h_{r}(k)\left(\sum_{i=0}^{h}\sum_{j=0}^{h}\binom{r}{i}\binom{r}{j}\left(\sum_{a=0}^{t-i}\alpha_{a+i,i}\binom{k-r}{a}\right)\left(\sum_{b=0}^{t-j}\alpha_{b+j,j}\binom{k-r}{b}\right)\right)
=\sum_{i=0}^{h}\sum_{j=0}^{h}\sum_{a=0}^{t-i}\sum_{b=0}^{t-j}\alpha_{a+i,i}\alpha_{b+j,j}\sum_{q=0}^{a}\sum_{s=0}^{b}\underbrace{\binom{k-h}{q}\binom{k-h}{s}\left(\sum_{r=0}^{h}\binom{h}{r}h_{r}(k)\binom{r}{i}\binom{r}{j}\binom{h-r}{a-q}\binom{h-r}{b-s}\right)}_{B(k)}

Note that $\binom{k-r}{a}=\sum_{q=0}^{a}\binom{k-h}{q}\binom{h-r}{a-q}$ by Vandermonde's identity. We prove the claim by showing that $B(k)$ has degree not larger than $2t$.

B(k)=\binom{k-h}{q}\binom{k-h}{s}\underbrace{\left(\sum_{r=0}^{h}\binom{h}{r}{k}^{\underline{r}}{(n-k)}^{\underline{h-r}}\overbrace{\binom{r}{i}\binom{r}{j}\binom{h-r}{a-q}\binom{h-r}{b-s}}^{f(r)}\right)}_{C(k)}

By Lemma 15 below, the degree of $C(k)$ is at most $i+j+a-q+b-s$, and thus the degree of $B(k)$ is at most $i+j+a+b\leq 2t$. ∎

Lemma 15.

The degree of $C(k)$ is at most $i+j+a-q+b-s$.

Proof. The claim follows by showing that the degree of $C(k)$ is at most the degree of $f(r)$, which is $i+j+a-q+b-s$.

Recall that the forward difference of a function $g(X)$ with respect to the variable $X$ is the finite difference defined by $\Delta_{X}[g(X)]=g(X+1)-g(X)$. Higher order differences are obtained by repeated application of the forward difference operator. We use $\Delta_{X}^{\ell}[g(X)]_{X=b}$ to denote the $\ell$-th forward difference evaluated at $X=b$. We will use the following easy-to-check identity: $\Delta_{X}^{d}[{(k+X)}^{\underline{r+d}}]={(k+X)}^{\underline{r}}{(r+d)}^{\underline{d}}$.

First note that any polynomial $f(r)$ of degree $\delta$ can be written as a linear combination of the polynomials $P_{d}(r)={(r+1)}^{\overline{d}}={(r+d)}^{\underline{d}}$ with $0\leq d\leq\delta$. Hence the claim follows by showing that the degree of the following $C'(k)$ is at most the degree of $P_{d}(r)$:

C'(k)=\sum_{r=0}^{h}\binom{h}{r}{(n-k)}^{\underline{h-r}}\cdot{k}^{\underline{r}}{(r+d)}^{\underline{d}}
=\sum_{r=0}^{h}\binom{h}{r}{(n-k)}^{\underline{h-r}}\cdot\Delta_{X}^{d}\left[{(k+X)}^{\underline{r+d}}\right]_{X=0}
=\Delta_{X}^{d}\left[\sum_{r=0}^{h}\binom{h}{r}{(n-k)}^{\underline{h-r}}{(k+X)}^{\underline{r+d}}\right]_{X=0}
=\Delta_{X}^{d}\left[{(k+X)}^{\underline{d}}\sum_{r=0}^{h}\binom{h}{r}{(n-k)}^{\underline{h-r}}{(k+X-d)}^{\underline{r}}\right]_{X=0}
=\Delta_{X}^{d}\left[{(k+X)}^{\underline{d}}{(n+X-d)}^{\underline{h}}\right]_{X=0}

where we have used the linearity of the forward difference operator and Vandermonde's identity to derive the last equality. The claim follows by observing that the forward difference operator does not increase the degree of its argument, and therefore $C'(k)$ has degree at most $d$. ∎
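The forward-difference identity used above can be checked exactly over small integer ranges:

```python
# Exact check of the identity Delta_X^d[(k+X)^{falling r+d}] evaluated at X = 0,
# which should equal k^{falling r} * (r+d)^{falling d}.
def falling(x, m):  # falling factorial x(x-1)...(x-m+1)
    out = 1
    for i in range(m):
        out *= x - i
    return out

def fwd_diff(g, order):  # order-th forward difference operator
    if order == 0:
        return g
    return fwd_diff(lambda X: g(X + 1) - g(X), order - 1)

for k in range(8):
    for r in range(4):
        for d in range(4):
            g = lambda X, k=k, r=r, d=d: falling(k + X, r + d)
            assert fwd_diff(g, d)(0) == falling(k, r) * falling(r + d, d)
```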

Proof of (b).

Let $k\in[h-1,n-h+1]$. We have

G_{h}(k)=\sum_{r=0}^{h}\binom{h}{r}h_{r}(k)\left(\sum_{j=0}^{h}\binom{r}{j}p_{j}(k-r)\right)^{2}

where $h_{r}(k)={k}^{\underline{r}}\cdot{(n-k)}^{\underline{h-r}}\geq 0$ for each $r=0,\ldots,h$. Therefore $G_{h}(k)$ is a sum of the non-negative numbers $\left(\sum_{j=0}^{h}\binom{r}{j}p_{j}(k-r)\right)^{2}$ weighted by the non-negative coefficients $\binom{h}{r}h_{r}(k)$. ∎
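Since the argument is termwise, property (b) can also be checked numerically for random coefficients: every summand of $G_{h}(k)$ is non-negative on $[h-1,n-h+1]$. A sketch with arbitrary test values ($n$, $t$ and the random $\alpha_{i,j}$ are our own choices):

```python
import math, random

def falling(x, m):  # falling factorial, here evaluated at real x
    out = 1.0
    for i in range(m):
        out *= x - i
    return out

def gbinom(x, i):  # binomial coefficient binom(x, i) as a polynomial in x
    return falling(x, i) / math.factorial(i)

random.seed(4)
n, t = 8, 3
for h in range(t + 1):
    alpha = {(i, j): random.uniform(-1, 1)
             for i in range(t + 1) for j in range(min(h, i) + 1)}

    def p(j, x):
        return sum(alpha[i + j, j] * gbinom(x, i) for i in range(t - j + 1))

    def G(k):
        return sum(math.comb(h, r) * falling(k, r) * falling(n - k, h - r)
                   * sum(gbinom(r, j) * p(j, k - r) for j in range(h + 1)) ** 2
                   for r in range(h + 1))

    # sample real points k in [h-1, n-h+1]
    for s in range(101):
        k = (h - 1) + s * (n - 2 * h + 2) / 100
        assert G(k) >= -1e-9
```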

Proof of (c).

From Lemma 13 we have that

\frac{1}{{n}^{\underline{h}}}\binom{n}{k}G_{h}(k)=\sum_{\begin{subarray}{c}I\subseteq N\\ |I|=k\end{subarray}}\left(u^{\top}Z_{I}\right)^{2}

Therefore $G_{h}(k)=0$ for $k\in\left\{0,\ldots,h-1\right\}\cup\left\{n-h+1,\ldots,n\right\}$ if we can show that $u^{\top}Z_{Q}=0$ for all $Q\subseteq N$ such that $|Q|=k$.

With this aim, we start by noting that for every set $S\subseteq N$ the permutation group $\Pi_{S}$ is the same as $\Pi_{N\setminus S}$. Moreover, if $|S|<h$ then $\sum_{\pi\in\Pi_{S}}P_{\pi}u=0$, since otherwise we would obtain a set $S$ smaller than $H$ with $\sum_{\pi\in\Pi_{S}}P_{\pi}u\neq 0$, contradicting our assumption that $H$ is a set of the smallest size with $\sum_{\pi\in\Pi_{H}}P_{\pi}u\neq 0$.

Now consider any set $I$ such that $I\subseteq Q$ with $Q\in\{S,N\setminus S\}$ and $|S|<h$. By the previous observations it follows that $[\sum_{\pi\in\Pi_{Q}}P_{\pi}u]_{I}=\sum_{\pi\in\Pi_{Q}}[P_{\pi}u]_{I}=\sum_{\pi\in\Pi_{Q}}u_{\pi(I)}=0$. Note that since $I\subseteq Q$, the set $\{\pi(I):\pi\in\Pi_{Q}\}$ is equal to $\{J:J\subseteq Q,|J|=|I|\}$, since $\Pi_{Q}$ maps any subset $I$ of $Q$ to any other subset of $Q$ of the same size. Moreover, each such $J$ is obtained the same number $c>0$ of times, so $\sum_{\pi\in\Pi_{Q}}u_{\pi(I)}=c\sum_{J\subseteq Q,|J|=|I|}u_{J}=0$, and hence $\sum_{J\subseteq Q,|J|=|I|}u_{J}=0$. Using the latter we get

u^{\top}Z_{Q}=\sum_{J\subseteq Q}u_{J}=\sum_{i=0}^{|Q|}\sum_{J\subseteq Q,|J|=i}u_{J}=0

proving the claim. ∎

Acknowledgements.

The authors would like to express their gratitude to Ola Svensson for helpful discussions and ideas regarding this paper.

References

  • [1] N. Bansal, N. Buchbinder, and J. Naor. Randomized competitive algorithms for generalized caching. In STOC, pages 235–244, 2008.
  • [2] N. Bansal, A. Gupta, and R. Krishnaswamy. A constant factor approximation algorithm for generalized min-sum set cover. In SODA, pages 1539–1545, 2010.
  • [3] N. Bansal and K. Pruhs. The geometry of scheduling. In FOCS, pages 407–414, 2010.
  • [4] B. Barak, F. G. S. L. Brandão, A. W. Harrow, J. A. Kelner, D. Steurer, and Y. Zhou. Hypercontractivity, sum-of-squares proofs, and their applications. In STOC, pages 307–326, 2012.
  • [5] B. Barak, S. O. Chan, and P. Kothari. Sum of squares lower bounds from pairwise independence. In STOC, 2015.
  • [6] B. Barak and D. Steurer. Sum-of-squares proofs and the quest toward optimal algorithms. Electronic Colloquium on Computational Complexity (ECCC), 21:59, 2014.
  • [7] A. Bhaskara, M. Charikar, A. Vijayaraghavan, V. Guruswami, and Y. Zhou. Polynomial integrality gaps for strong sdp relaxations of densest k-subgraph. In SODA, pages 388–405, 2012.
  • [8] G. Blekherman, J. Gouveia, and J. Pfeiffer. Sums of squares on the hypercube. CoRR, abs/1402.4199, 2014.
  • [9] T. Carnes and D. B. Shmoys. Primal-dual schema for capacitated covering problems. In IPCO, pages 288–302, 2008.
  • [10] R. D. Carr, L. Fleischer, V. J. Leung, and C. A. Phillips. Strengthening integrality gaps for capacitated network design and covering problems. In SODA, pages 106–115, 2000.
  • [11] D. Chakrabarty, E. Grant, and J. Könemann. On column-restricted and priority covering integer programs. In IPCO, pages 355–368, 2010.
  • [12] K. K. H. Cheung. Computation of the Lasserre ranks of some polytopes. Mathematics of Operation Research, 32(1):88–94, 2007.
  • [13] E. Chlamtac and M. Tulsiani. Convex relaxations and integrality gaps. In Handbook on Semidefinite, Conic and Polynomial Optimization, volume 166, pages 139–169. Springer, 2011.
  • [14] W. Cook and S. Dash. On the matrix-cut rank of polyhedra. Mathematics of Operations Research, 26(1):19–30, 2001.
  • [15] H. Fawzi, J. Saunderson, and P. Parrilo. Sparse sum-of-squares certificates on finite abelian groups. CoRR, abs/1503.01207, 2015.
  • [16] C. Godsil. Association schemes. Lecture Notes available at http://quoll.uwaterloo.ca/mine/Notes/assoc2.pdf, 2010.
  • [17] M. X. Goemans and L. Tunçel. When does the positive semidefiniteness constraint help in lifting procedures? Math. Oper. Res., 26(4):796–815, 2001.
  • [18] D. Grigoriev. Complexity of positivstellensatz proofs for the knapsack. Computational Complexity, 10(2):139–154, 2001.
  • [19] D. Grigoriev. Linear lower bound on degrees of positivstellensatz calculus proofs for the parity. Theoretical Computer Science, 259(1-2):613–622, 2001.
  • [20] D. Grigoriev, E. A. Hirsch, and D. V. Pasechnik. Complexity of semi-algebraic proofs. In STACS, pages 419–430, 2002.
  • [21] S. Hong and L. Tunçel. Unification of lower-bound analyses of the lift-and-project rank of combinatorial optimization polyhedra. Discrete Applied Mathematics, 156(1):25–41, 2008.
  • [22] S. Khot. On the power of unique 2-prover 1-round games. In STOC, pages 767–775, 2002.
  • [23] A. Kurpisz, S. Leppänen, and M. Mastrolilli. A Lasserre lower bound for the min-sum single machine scheduling problem. In ESA, pages 853–864, 2015.
  • [24] A. Kurpisz, S. Leppänen, and M. Mastrolilli. On the hardest problem formulations for the 0/1 Lasserre hierarchy. In ICALP, pages 872–885, 2015.
  • [25] J. B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM Journal on Optimization, 11(3):796–817, 2001.
  • [26] M. Laurent. A comparison of the Sherali-Adams, Lovász-Schrijver, and Lasserre relaxations for 0-1 programming. Mathematics of Operations Research, 28(3):470–496, 2003.
  • [27] M. Laurent. Lower bound for the number of iterations in semidefinite hierarchies for the cut polytope. Math. Oper. Res., 28(4):871–883, 2003.
  • [28] J. R. Lee, P. Raghavendra, and D. Steurer. Lower bounds on the size of semidefinite programming relaxations. In STOC, pages 567–576, 2015.
  • [29] R. Meka, A. Potechin, and A. Wigderson. Sum-of-squares lower bounds for planted clique. In STOC, pages 87–96, 2015.
  • [30] R. O'Donnell. Approximability and proof complexity. Talk at ELC Tokyo. Slides available at http://www.cs.cmu.edu/~odonnell/slides/approx-proof-cxty.pps, 2013.
  • [31] P. Parrilo. Structured Semidefinite Programs and Semialgebraic Geometry Methods in Robustness and Optimization. PhD thesis, California Institute of Technology, 2000.
  • [32] P. Raghavendra. Optimal algorithms and inapproximability results for every csp? In STOC, pages 245–254, 2008.
  • [33] T. Rothvoß. The Lasserre hierarchy in approximation algorithms. Lecture Notes for the MAPSP 2013 Tutorial, June 2013.
  • [34] G. Schoenebeck. Linear level Lasserre lower bounds for certain k-csps. In FOCS, pages 593–602, 2008.
  • [35] T. Stephen and L. Tunçel. On a representation of the matching polytope via semidefinite liftings. Math. Oper. Res., 24(1):1–7, 1999.
  • [36] M. Tulsiani. Csp gaps and reductions in the Lasserre hierarchy. In STOC, pages 303–312, 2009.
  • [37] L. A. Wolsey. Facets for a linear inequality in 0–1 variables. Mathematical Programming, 8:168–175, 1975.

Appendix A The SoS hierarchy

In this section we recall the usual definition of the SoS/Lasserre hierarchy [25] and justify Definition 1. Notice that the SDP hierarchy discussed here is the dual certificate of a refutation in the Positivstellensatz proof system; for further information about the connection to the proof system we refer the reader to [29]. In our setting we restrict ourselves to problems with $0/1$-variables and linear constraints. More precisely, we consider the following general optimization problem $\mathbb{P}$: given a multilinear polynomial $f:\{0,1\}^{n}\rightarrow\mathbb{R}$,

\mathbb{P}:\quad\min\{f(x) \mid x\in\{0,1\}^{n}\cap K\} \qquad (31)

where $K$ is a polytope defined by $m$ linear inequalities $g_{\ell}(x)\geq 0$ for $\ell\in[m]$. Many basic optimization problems are special cases of $\mathbb{P}$. For example, any $k$-ary Boolean constraint satisfaction problem, such as Max-Cut, is captured by (31), where a degree-$k$ function $f(x)$ counts the number of satisfied constraints and no linear constraints $g_{\ell}(x)\geq 0$ are present. Any $0/1$ integer linear program is also a special case of (31), with $f(x)$ a linear function.

Lasserre [25] proposed a hierarchy of SDP relaxations parameterized by an integer $r$,

\min\{L(f) \mid L:\mathbb{R}[X]_{2r}\rightarrow\mathbb{R},\ L(1)=1,\ L(x^{2}-x)=0 \text{ and } L(u^{2}),\,L(u^{2}g_{\ell})\geq 0,\ \forall\text{ polynomial } u\} \qquad (32)

where $L:\mathbb{R}[X]_{2r}\rightarrow\mathbb{R}$ is a linear map, with $\mathbb{R}[X]_{2r}$ denoting the ring $\mathbb{R}[X]$ restricted to polynomials of degree at most $2r$ (in [4], $L(p)$ is written $\tilde{\mathbb{E}}[p]$ and called the “pseudo-expectation” of $p$). Note that (32) is a relaxation, since one can take $L$ to be the evaluation map $f\mapsto f(x^{*})$ for any optimal solution $x^{*}$.

Relaxation (32) can be equivalently formulated in terms of moment matrices [25]. In the context of this paper this matrix point of view is more convenient, and it is described below. In our notation we mainly follow the survey of Laurent [26] (see also [33]).

Variables and Moment Matrix.

Let $N$ denote the set $\{1,\ldots,n\}$. The collection of all subsets of $N$ is denoted by $\mathcal{P}(N)$. For any integer $t\geq 0$, let $\mathcal{P}_{t}(N)$ denote the collection of subsets of $N$ having cardinality at most $t$. Let $y\in\mathbb{R}^{\mathcal{P}(N)}$. For any nonnegative integer $t\leq n$, let $M_{t}(y)$ denote the matrix with $(I,J)$-entry $y_{I\cup J}$ for all $I,J\in\mathcal{P}_{t}(N)$. The matrix $M_{t}(y)$ is termed in the following the $t$-moment matrix of $y$. For a linear function $g(x)=\sum_{i=1}^{n}g_{i}x_{i}+g_{0}$, we define the shift operator $g*y$ as the vector whose $I$-th entry is $(g*y)_{I}=\sum_{i=1}^{n}g_{i}y_{I\cup\{i\}}+g_{0}y_{I}$. Let $f$ denote the vector of coefficients of the polynomial $f(x)$ (where $f_{I}$ is the coefficient of the monomial $\prod_{i\in I}x_{i}$ in $f(x)$).

Definition 4.

The $t$-th round SoS (or Lasserre) relaxation of problem (31), denoted $\textsc{SoS}_{t}(\mathbb{P})$, is the following:

\textsc{SoS}_{t}(\mathbb{P}):\quad\min\Big\{\sum_{I\subseteq N}f_{I}y_{I} \;\Big|\; y\in\mathbb{R}^{\mathcal{P}_{2t+2d}(N)} \text{ and } y\in\mathbb{M}\Big\} \qquad (33)

where $\mathbb{M}$ is the set of vectors $y\in\mathbb{R}^{\mathcal{P}_{2t+2d}(N)}$ that satisfy the following PSD conditions:

y_{\varnothing} = 1, \qquad (34)
M_{t+d}(y) \succeq 0, \qquad (35)
M_{t}(g_{\ell}*y) \succeq 0 \qquad \ell\in[m], \qquad (36)

where $d=0$ if $m=0$ (no linear constraints), and $d=1$ otherwise.
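To make Definition 4 concrete, the following self-contained sketch (our illustration, not from the paper; the constraint $g$ and the uniform distribution are toy choices) builds $y_I$ as the moments of a probability distribution over feasible $0/1$ points and checks that the resulting moment matrices are nonnegative combinations of rank-one matrices, so conditions (34)–(36) hold:

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

n, t, d = 3, 1, 1
N = range(n)
# index set P_{t+d}(N): all subsets of cardinality at most t+d
index = [frozenset(c) for s in range(t + d + 1) for c in combinations(N, s)]

g = lambda x: sum(x) - 1                       # toy constraint g(x) >= 0
feas = [x for x in product((0, 1), repeat=n) if g(x) >= 0]
p = {x: Fraction(1, len(feas)) for x in feas}  # uniform distribution

def expect(weights, I):                        # sum_x wt(x) * prod_{i in I} x_i
    return sum(wt * prod(x[i] for i in I) for x, wt in weights.items())

def rank_one_sum(weights, idx):                # sum_x wt(x) v_x v_x^T
    m = len(idx)
    M = [[Fraction(0)] * m for _ in range(m)]
    for x, wt in weights.items():
        v = [prod(x[i] for i in I) for I in idx]
        for a in range(m):
            for b in range(m):
                M[a][b] += wt * v[a] * v[b]
    return M

y = {I | J: expect(p, I | J) for I in index for J in index}
assert y[frozenset()] == 1                                  # condition (34)
M = [[y[I | J] for J in index] for I in index]
assert M == rank_one_sum(p, index)                          # (35): M_{t+d}(y) PSD
gw = {x: p[x] * g(x) for x in feas}                         # nonnegative weights
idx_t = [I for I in index if len(I) <= t]
Mg = [[expect(gw, I | J) for J in idx_t] for I in idx_t]
assert Mg == rank_one_sum(gw, idx_t)                        # (36): M_t(g*y) PSD
```

Since a sum of rank-one matrices $v v^{\top}$ with nonnegative weights is PSD, this confirms that moments of any distribution over feasible points are feasible for the relaxation.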

Change of variables.

A solution of the SoS hierarchy as defined in Definition 4 is given by a vector $y\in\mathbb{R}^{\mathcal{P}_{2t+2d}(N)}$. Next we show that we can make a change of basis and replace the variables $y_{I}$ with variables $y^{N}_{I}$ indexed by all the subsets of $N$. The variable $y^{N}_{I}$ can be interpreted as the “relaxed” indicator variable for the integral solution $x_{I}$, i.e. the $0/1$ solution obtained by setting $x_{i}=1$ for $i\in I$ and $x_{i}=0$ for $i\in N\setminus I$. We use this change of basis to obtain a useful decomposition of the moment matrix as a sum of rank-one matrices of a special kind. Here it is not necessary to distinguish between the moment matrices of the variables and of the constraints, so in what follows we denote a generic vector by $w\in\mathbb{R}^{\mathcal{P}_{2q}(N)}$, where $q$ is either $t$ or $t+1$.

Definition 5.

Let $w\in\mathbb{R}^{\mathcal{P}_{2q}(N)}$. For every $I\in\mathcal{P}(N)$, define a vector $w^{N}\in\mathbb{R}^{\mathcal{P}(N)}$ such that

w_{I}=\sum_{I\subseteq H\subseteq N}w^{N}_{H}.

Note that the inverse (for $|I|\leq 2t$) is

w_{I}^{N}=\sum_{H\subseteq N\setminus I,\;|H\cup I|\leq 2t}(-1)^{|H|}w_{I\cup H}. \qquad (37)
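As a quick sanity check of Definition 5 (our own illustration, not part of the paper's argument), one can verify numerically that the untruncated version of (37), i.e. summing over all $H\subseteq N\setminus I$, exactly inverts the change of basis on arbitrary data:

```python
from itertools import combinations
import random

n = 4
N = set(range(n))
subs = [frozenset(c) for s in range(n + 1) for c in combinations(sorted(N), s)]

random.seed(0)
wN = {H: random.randint(-9, 9) for H in subs}                # arbitrary w^N
w = {I: sum(wN[H] for H in subs if I <= H) for I in subs}    # w_I = sum_{I<=H} w^N_H

# Mobius-type inversion: w^N_I = sum_{H <= N\I} (-1)^{|H|} w_{I u H}
for I in subs:
    rest = sorted(N - I)
    inv = sum((-1) ** s * w[I | frozenset(c)]
              for s in range(len(rest) + 1)
              for c in combinations(rest, s))
    assert inv == wN[I]
```

The truncation to $|H\cup I|\leq 2t$ in (37) corresponds to the convention, used later, that $w^{N}_{H}=0$ whenever $|H|>2t$.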

To simplify the notation, we note that the moment matrix of the variables is structurally similar to the moment matrix of the constraints: if $z\in\mathbb{R}^{\mathcal{P}_{2q}(N)}$ is a vector such that $z_{I}=\sum_{i=1}^{n}A_{\ell i}y_{I\cup\{i\}}-b_{\ell}y_{I}$ for some $\ell$, then $[M_{t}(g_{\ell}*y)]_{I,J}=z_{I\cup J}$. Hence, the following lemma holds for the moment matrices of both variables and constraints.

Lemma 16.

Let $w\in\mathbb{R}^{\mathcal{P}_{2q}(N)}$, and let $M\in\mathbb{R}^{\mathcal{P}_{q}(N)\times\mathcal{P}_{q}(N)}$ be such that $M_{I,J}=w_{I\cup J}$. For $H\subseteq N$, let $Z_{H}\in\{0,1\}^{\mathcal{P}_{q}(N)}$ denote the vector with $[Z_{H}]_{I}=1$ if $I\subseteq H$ and $[Z_{H}]_{I}=0$ otherwise. Then

M=\sum_{H\subseteq N}w^{N}_{H}Z_{H}Z_{H}^{\top}.
Proof.

Since $M_{I,J}=w_{I\cup J}$, we have by the change of variables that

[M]_{I,J}=\sum_{I\cup J\subseteq H\subseteq N}w^{N}_{H}=\sum_{H\subseteq N}\chi_{I\cup J}(H)\,w^{N}_{H},

where $\chi_{I}(H)$ is the $0/1$ indicator function such that $\chi_{I}(H)=1$ if and only if $I\subseteq H$. On the other hand, $[Z_{H}Z_{H}^{\top}]_{I,J}=[Z_{H}]_{I}[Z_{H}]_{J}=1$ if $I\cup J\subseteq H$, and $0$ otherwise. Therefore $[Z_{H}Z_{H}^{\top}]_{I,J}=\chi_{I\cup J}(H)$.

By the previous lemma, given a solution in the variables $\{w_{I}^{N}\}$ we can obtain a solution in the variables $\{w_{I}:|I|\leq 2t\}$. Vice versa, given any assignment of the variables $\{w_{I}:|I|\leq 2t\}$ we can find an assignment of the variables $\{w_{I}^{N}\}$ such that $M_{I,J}=w_{I\cup J}$ and $M=\sum_{H\subseteq N}w^{N}_{H}Z_{H}Z_{H}^{\top}$. Indeed, set $w_{I}^{N}=0$ for every $I$ with $|I|>2t$. For the remaining variables, note that the square matrix corresponding to the equalities $w_{I}=\sum_{I\subseteq H\subseteq N}w^{N}_{H}$ for $|I|\leq 2t$ is invertible, since it is upper triangular with ones on the diagonal.
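The rank-one decomposition of Lemma 16 can be checked numerically on random data (an illustrative check of ours, not from the paper), comparing the matrix $M$ entrywise with $\sum_{H} w^{N}_{H} Z_{H} Z_{H}^{\top}$:

```python
from itertools import combinations
import random

n, q = 4, 2
N = set(range(n))
allsubs = [frozenset(c) for s in range(n + 1) for c in combinations(sorted(N), s)]
rows = [I for I in allsubs if len(I) <= q]     # index set P_q(N)

random.seed(1)
wN = {H: random.randint(-5, 5) for H in allsubs}             # arbitrary w^N
w = {I: sum(wN[H] for H in allsubs if I <= H) for I in allsubs}

# M with M_{I,J} = w_{I u J}
M = [[w[I | J] for J in rows] for I in rows]
# sum_H w^N_H Z_H Z_H^T, where [Z_H]_I = 1 iff I <= H
R = [[sum(wN[H] * (I <= H) * (J <= H) for H in allsubs) for J in rows]
     for I in rows]
assert M == R
```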

Lemma 17.

[26] Given $y\in\mathbb{R}^{\mathcal{P}_{2t+2}(N)}$, for the vector $z_{I}=\sum_{i=1}^{n}A_{\ell i}y_{I\cup\{i\}}-b_{\ell}y_{I}$ we have

z^{N}_{I}=g_{\ell}(x_{I})\,y^{N}_{I}, \qquad (38)

where $g_{\ell}(x_{I})=\sum_{i=1}^{n}A_{\ell i}x_{i}-b_{\ell}$ is the linear function corresponding to constraint $\ell$, evaluated at $x_{I}$, i.e. with $x_{i}=1$ if $i\in I$ and $x_{i}=0$ otherwise.

Proof.

We need to show that this choice of $z_{I}^{N}$ yields $z_{I}=\sum_{I\subseteq H\subseteq N}z^{N}_{H}$. Plugging in (38), we obtain

\sum_{I\subseteq H\subseteq N}z^{N}_{H}=\sum_{I\subseteq H\subseteq N}g_{\ell}(x_{H})y^{N}_{H}=\sum_{I\subseteq H\subseteq N}\Big[\sum_{i=1}^{n}A_{\ell i}x_{i}-b_{\ell}\Big]_{x=x_{H}}y^{N}_{H}
=\sum_{I\subseteq H\subseteq N}\Big(\sum_{i=1}^{n}\big[A_{\ell i}x_{i}\big]_{x=x_{H}}y^{N}_{H}-b_{\ell}y^{N}_{H}\Big)=\sum_{I\subseteq H\subseteq N}\sum_{i=1}^{n}\big[A_{\ell i}x_{i}\big]_{x=x_{H}}y^{N}_{H}-b_{\ell}y_{I}.

Here the term $\big[A_{\ell i}x_{i}\big]_{x=x_{H}}\,y^{N}_{H}$ equals $A_{\ell i}y^{N}_{H}$ if $i\in H$ and $0$ otherwise. Taking this into account and changing the order of summation, the above becomes

\sum_{i=1}^{n}\sum_{I\cup\{i\}\subseteq H\subseteq N}A_{\ell i}y^{N}_{H}-b_{\ell}y_{I}=\sum_{i=1}^{n}A_{\ell i}y_{I\cup\{i\}}-b_{\ell}y_{I},

which proves the claim.
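Lemma 17 can also be verified numerically on arbitrary data (an illustrative sketch of ours; the vector $A$ and scalar $b$ are a toy constraint, not from the paper):

```python
from itertools import combinations
import random

n = 4
N = set(range(n))
subs = [frozenset(c) for s in range(n + 1) for c in combinations(sorted(N), s)]

random.seed(2)
A = [random.randint(-3, 3) for _ in range(n)]   # toy constraint g(x) = A.x - b
b = random.randint(-3, 3)
yN = {H: random.randint(-5, 5) for H in subs}   # arbitrary y^N
y = {I: sum(yN[H] for H in subs if I <= H) for I in subs}

# z_I as in Lemma 17, computed directly from y
z = {I: sum(A[i] * y[I | {i}] for i in range(n)) - b * y[I] for I in subs}

# Lemma 17: summing z^N_H = g(x_H) y^N_H over H containing I recovers z_I
for I in subs:
    g_sum = sum((sum(A[i] for i in H) - b) * yN[H] for H in subs if I <= H)
    assert z[I] == g_sum
```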

The above discussion, together with the observation that $y_{\emptyset}=1$ implies $\sum_{J\subseteq N}y_{J}^{N}=1$, justifies Definition 1. Finally, we remark the following.

Lemma 18.

Let $f$ denote the vector of coefficients of the polynomial $f(x)$ in (31). Then the objective value of the solution $y$ is given by

\sum_{I\subseteq N}f_{I}y_{I}=\sum_{I\subseteq N}f(x_{I})\,y_{I}^{N}.
Proof.

The proof follows along the same lines as the proofs of Lemmas 16 and 17.
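A numeric check of Lemma 18 on random data (our illustration; the coefficients of $f$ and the values $y^{N}$ are arbitrary):

```python
from itertools import combinations
import random

n = 4
N = set(range(n))
subs = [frozenset(c) for s in range(n + 1) for c in combinations(sorted(N), s)]

random.seed(3)
f = {I: random.randint(-4, 4) for I in subs}    # coefficients of multilinear f
yN = {H: random.randint(-5, 5) for H in subs}   # arbitrary y^N
y = {I: sum(yN[H] for H in subs if I <= H) for I in subs}

lhs = sum(f[I] * y[I] for I in subs)
# f(x_I) = sum of f_J over all monomials J contained in I
rhs = sum(sum(f[J] for J in subs if J <= I) * yN[I] for I in subs)
assert lhs == rhs
```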

Appendix B Change of variables for Max-Cut

Grigoriev [18] and Laurent [27] proved that the following solution is feasible for any $\omega\leq n/2$ up to round $t\leq\lfloor\omega\rfloor$ of the SoS hierarchy given in Definition 4:

y_{I}=\frac{\binom{\omega}{|I|}}{\binom{n}{|I|}}\qquad\forall I\subseteq N:|I|\leq 2t.

Using the change of basis (37), the solution $\{y_{I}\}$ is equivalent to the solution $\{y^{N}_{I}\}$:

y^{N}_{I}=\sum_{H\subseteq N\setminus I}(-1)^{|H|}y_{I\cup H}=\sum_{h=0}^{n-|I|}\binom{n-|I|}{h}(-1)^{h}\frac{\binom{\omega}{|I|+h}}{\binom{n}{|I|+h}}
=y_{I}\binom{\omega-|I|-1}{n-|I|}(-1)^{n-|I|}=(n+1)\binom{\omega}{n+1}\frac{(-1)^{n-|I|}}{\omega-|I|}, \qquad (39)

where we use the identity $\sum_{k=0}^{m}(-1)^{k}\binom{n}{k}=(-1)^{m}\binom{n-1}{m}$.
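The closed form (39) can be checked exactly for a small non-integral parameter (a sketch of ours, using generalized binomial coefficients over the rationals; the choice $n=5$, $\omega=n/2$ is just one test case):

```python
from fractions import Fraction
from math import factorial

def binom(a, k):
    """Generalized binomial coefficient C(a, k) for rational a."""
    num = Fraction(1)
    for j in range(k):
        num *= (a - j)
    return num / factorial(k)

n, w = 5, Fraction(5, 2)              # e.g. w = n/2 as in Grigoriev/Laurent
for s in range(n + 1):                # s = |I|; (39) depends only on |I|
    # left-hand side: inversion sum applied to y_I = C(w,|I|)/C(n,|I|)
    lhs = sum(binom(n - s, h) * (-1) ** h * binom(w, s + h) / binom(n, s + h)
              for h in range(n - s + 1))
    # right-hand side: the closed form (n+1) C(w,n+1) (-1)^{n-|I|} / (w-|I|)
    rhs = (n + 1) * binom(w, n + 1) * (-1) ** (n - s) / (w - s)
    assert lhs == rhs
```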