Bounds on the Total Coefficient Size of Nullstellensatz Proofs of the Pigeonhole Principle and the Ordering Principle

Aaron Potechin and Aaron Zhang
Abstract

In this paper, we investigate the total coefficient size of Nullstellensatz proofs. We show that Nullstellensatz proofs of the pigeonhole principle on $n$ pigeons require total coefficient size $2^{\Omega(n)}$ and that there exist Nullstellensatz proofs of the ordering principle on $n$ elements with total coefficient size $2^{n}-n$.

Acknowledgement: This research was supported by NSF grant CCF-2008920 and NDSEG fellowship F-9422254702.

1 Introduction

Given a system $\{p_{i}=0:i\in[m]\}$ of polynomial equations over an algebraically closed field, a Nullstellensatz proof of infeasibility is an equality of the form $1=\sum_{i=1}^{m}p_{i}q_{i}$ for some polynomials $\{q_{i}:i\in[m]\}$. Hilbert’s Nullstellensatz (more precisely, its weak form) says that the Nullstellensatz proof system is complete, i.e., a system of polynomial equations has no solutions over an algebraically closed field if and only if there is a Nullstellensatz proof of infeasibility. (The full form of Hilbert’s Nullstellensatz says that given polynomials $p_{1},\ldots,p_{m}$ and another polynomial $p$, if $p(x)=0$ for all $x$ such that $p_{i}(x)=0$ for each $i\in[m]$, then there exists a natural number $r$ such that $p^{r}$ is in the ideal generated by $p_{1},\ldots,p_{m}$.) However, Hilbert’s Nullstellensatz does not give any bounds on the degree or size needed for Nullstellensatz proofs.

The degree of Nullstellensatz proofs has been extensively studied. Grete Hermann showed a doubly exponential degree upper bound for the ideal membership problem [10] which implies the same upper bound for Nullstellensatz proofs. Several decades later, W. Dale Brownawell gave an exponential upper bound on the degree required for Nullstellensatz proofs over algebraically closed fields of characteristic zero [6]. A year later, János Kollár showed that this result holds for all algebraically closed fields [12].

For specific problems, the degree of Nullstellensatz proofs can be analyzed using designs [7]. Using designs, Nullstellensatz degree lower bounds have been shown for many problems including the pigeonhole principle, the induction principle, the housesitting principle, and the mod $m$ matching principles [2, 1, 3, 5, 8]. More recent work showed that there is a close connection between Nullstellensatz degree and reversible pebbling games [9] and that lower bounds on Nullstellensatz degree can be lifted to lower bounds on monotone span programs, monotone comparator circuits, and monotone switching networks [14].

For analyzing the size of Nullstellensatz proofs, a powerful technique is the size-degree tradeoff shown by Russell Impagliazzo, Pavel Pudlák, and Jiří Sgall for polynomial calculus [11]. This tradeoff says that if there is a size $S$ polynomial calculus proof then there is a polynomial calculus proof of degree $O(\sqrt{n\log S})$. Thus, if we have an $\Omega(n)$ degree lower bound for polynomial calculus, this implies a $2^{\Omega(n)}$ size lower bound for polynomial calculus (which also holds for Nullstellensatz as Nullstellensatz is a weaker proof system). However, the size-degree tradeoff does not give any size lower bound when the degree is $O(\sqrt{n})$, and we know of very few other techniques for analyzing the size of Nullstellensatz proofs.

In this paper, we instead investigate the total coefficient size of Nullstellensatz proofs. We have two reasons for this. First, total coefficient size is interesting in its own right and to the best of our knowledge, it has not yet been explored. Second, total coefficient size may give insight into proof size in settings where we cannot apply the size-degree tradeoff and thus do not have good size lower bounds.

Remark 1.

Note that Nullstellensatz size lower bounds do not imply total coefficient size lower bounds because we could have a proof with many monomials but a small coefficient on each monomial. Thus, the exponential size lower bounds for the pigeonhole principle from Razborov’s $\Omega(n)$ degree lower bound for polynomial calculus [15] and the size-degree tradeoff [11] do not imply total coefficient size lower bounds for the pigeonhole principle.

1.1 Our results

In this paper, we consider two principles, the pigeonhole principle and the ordering principle. We show an exponential lower bound on the total coefficient size of Nullstellensatz proofs of the pigeonhole principle and we show an exponential upper bound on the total coefficient size of Nullstellensatz proofs of the ordering principle. More precisely, we show the following bounds.

Theorem 1.

For all $n\geq 2$, any Nullstellensatz proof of the pigeonhole principle with $n$ pigeons and $n-1$ holes has total coefficient size $\Omega\left(n^{\frac{3}{4}}\left(\frac{2}{\sqrt{e}}\right)^{n}\right)$.

Theorem 2.

For all $n\geq 3$, there is a Nullstellensatz proof of the ordering principle on $n$ elements with size and total coefficient size $2^{n}-n$.

After showing these bounds, we discuss total coefficient size for stronger proof systems. We observe that if we consider a stronger proof system which we call resolution-like proofs, our lower bound proof for the pigeonhole principle no longer works. We also observe that even though resolution is a dynamic proof system, the $O(n^{3})$ size resolution proof of the ordering principle found by Gunnar Stålmark [16] can be captured by a one line sum of squares proof.

2 Nullstellensatz total coefficient size

We start by defining total coefficient size for Nullstellensatz proofs and describing a linear program for finding the minimum total coefficient size of a Nullstellensatz proof.

Definition 1.

Given a polynomial $f$, we define the total coefficient size $T(f)$ of $f$ to be the sum of the magnitudes of the coefficients of $f$. For example, if $f(x,y,z)=2x^{2}y-3xyz+5z^{5}$ then $T(f)=2+3+5=10$.
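As a small illustration of Definition 1, the following sketch computes $T(f)$ for a polynomial stored as a dictionary. The representation (monomials as tuples of variable/exponent pairs) and the name `total_coefficient_size` are assumptions made for this example only.

```python
def total_coefficient_size(poly):
    """Return T(f): the sum of the magnitudes of the coefficients of f.

    `poly` maps each monomial, written as a tuple of (variable, exponent)
    pairs, to its coefficient. This encoding is an illustrative choice.
    """
    return sum(abs(coefficient) for coefficient in poly.values())


# f(x, y, z) = 2 x^2 y - 3 x y z + 5 z^5, the example from Definition 1
f = {
    (("x", 2), ("y", 1)): 2,
    (("x", 1), ("y", 1), ("z", 1)): -3,
    (("z", 5),): 5,
}

print(total_coefficient_size(f))  # 10
```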

Definition 2.

Given a system $\{p_{i}=0:i\in[m]\}$ of $m$ polynomial equations, a Nullstellensatz proof of infeasibility is an equality of the form

\[1=\sum_{i=1}^{m}p_{i}q_{i}\]

for some polynomials $\{q_{i}:i\in[m]\}$. We define the total coefficient size of such a Nullstellensatz proof to be $\sum_{i=1}^{m}T(q_{i})$.

The following terminology will be useful.

Definition 3.

Given a system $\{p_{i}=0:i\in[m]\}$ of polynomial equations, we call each of the equations $p_{i}=0$ an axiom. For each axiom $p_{i}=0$, we define a weakening of this axiom to be an equation of the form $rp_{i}=0$ for some monomial $r$.

Remark 2.

We do not include the total coefficient size of $p_{i}$ in the total coefficient size of the proof as we want to focus on the complexity of the proof as opposed to the complexity of the axioms. That said, in this paper we only consider systems of polynomial equations where each $p_{i}$ is a monomial, so this choice does not matter.

The minimum total coefficient size of a Nullstellensatz proof can be found using the following linear program. In general, this linear program will have infinite size, but as we discuss below, it has finite size when the variables are Boolean.

  1. Primal: Minimize $\sum_{i=1}^{m}T(q_{i})$ subject to $\sum_{i=1}^{m}p_{i}q_{i}=1$. More precisely, writing $q_{i}=\sum_{\text{monomials }r}c_{ir}r$, we minimize $\sum_{i=1}^{m}\sum_{\text{monomials }r}b_{ir}$ subject to the constraints that

    1. $b_{ir}\geq-c_{ir}$ and $b_{ir}\geq c_{ir}$ for all $i\in[m]$ and monomials $r$.

    2. $\sum_{i=1}^{m}\sum_{\text{monomials }r}c_{ir}rp_{i}=1$.

  2. Dual: Maximize $D(1)$ subject to the constraints that

    1. $D$ is a linear map from polynomials to $\mathbb{R}$.

    2. For each $i\in[m]$ and each monomial $r$, $|D(rp_{i})|\leq 1$.

Weak duality, which is what we need for our lower bound on the pigeonhole principle, can be seen directly as follows.

Proposition 1.

If $D$ is a linear map from polynomials to $\mathbb{R}$ such that $|D(rp_{i})|\leq 1$ for all $i\in[m]$ and all monomials $r$ then any Nullstellensatz proof of infeasibility has total coefficient size at least $D(1)$.

Proof.

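Given a Nullstellensatz proof $1=\sum_{i=1}^{m}p_{i}q_{i}$, write $q_{i}=\sum_{\text{monomials }r}c_{ir}r$. Applying $D$ to the proof and using linearity together with $|D(rp_{i})|\leq 1$ gives

\[D(1)=\sum_{i=1}^{m}D(p_{i}q_{i})=\sum_{i=1}^{m}\sum_{\text{monomials }r}c_{ir}D(rp_{i})\leq\sum_{i=1}^{m}\sum_{\text{monomials }r}|c_{ir}|=\sum_{i=1}^{m}T(q_{i}).\]

∎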

2.1 Special case: Boolean variables

In this paper, we only consider problems where all of our variables are Boolean, so we make specific definitions for this case. In particular, we allow monomials to contain terms of the form $(1-x_{i})$ as well as $x_{i}$ and we allow the Boolean axioms $x_{i}^{2}=x_{i}$ to be used for free. We also observe that we can define a linear map $D$ from polynomials to $\mathbb{R}$ by assigning a value $D(x)$ to each input $x$.

Definition 4.

Given Boolean variables $x_{1},\ldots,x_{N}$ where we have that $x_{i}=1$ if $x_{i}$ is true and $x_{i}=0$ if $x_{i}$ is false, we define a monomial to be a product of the form $\left(\prod_{i\in S}x_{i}\right)\left(\prod_{j\in T}(1-x_{j})\right)$ for some disjoint subsets $S,T$ of $[N]$.

Definition 5.

Given a Boolean variable $x$, we use $\bar{x}$ as shorthand for the negation $1-x$ of $x$.

Definition 6.

Given a set of polynomial equations $\{p_{i}=0:i\in[m]\}$ together with Boolean axioms $\{x_{j}^{2}-x_{j}=0:j\in[N]\}$, we define the total coefficient size of a Nullstellensatz proof

\[1=\sum_{i=1}^{m}p_{i}q_{i}+\sum_{j=1}^{N}g_{j}(x_{j}^{2}-x_{j})\]

to be $\sum_{i=1}^{m}T(q_{i})$. In other words, we allow the Boolean axioms $\{x_{j}^{2}-x_{j}=0:j\in[N]\}$ to be used for free.

Remark 3.

For the problems we consider in this paper, all of our non-Boolean axioms are monomials, so there is actually no need to use the Boolean axioms.

Remark 4.

We allow monomials to contain terms of the form $(1-x_{i})$ and allow the Boolean axioms to be used for free in order to avoid spurious lower bounds coming from difficulties in manipulating the Boolean variables rather than handling the non-Boolean axioms. In particular, with these adjustments, when the non-Boolean axioms are monomials, the minimum total coefficient size of a Nullstellensatz proof is upper bounded by the minimum size of a tree-resolution proof.

Since the Boolean axioms $\{x_{j}^{2}-x_{j}=0:j\in[N]\}$ can be used for free, to specify a linear map $D$ from polynomials to $\mathbb{R}$, it is necessary and sufficient to specify the value of $D$ on each input $x\in\{0,1\}^{N}$.

Definition 7.

Given a function $D:\{0,1\}^{N}\to\mathbb{R}$, we can view $D$ as a linear map from polynomials to $\mathbb{R}$ by taking $D(f)=\sum_{x\in\{0,1\}^{N}}f(x)D(x)$.
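The following sketch illustrates Definition 7 on a toy instance. The helper names `evaluate_monomial` and `apply_D`, the 0-indexed variables, and the particular choice of $D$ are assumptions made for illustration; they are not taken from the paper.

```python
from itertools import product


def evaluate_monomial(S, T, x):
    """Evaluate prod_{i in S} x_i * prod_{j in T} (1 - x_j) at a 0/1 point x."""
    return int(all(x[i] == 1 for i in S) and all(x[j] == 0 for j in T))


def apply_D(D, f, N):
    """D(f) = sum over x in {0,1}^N of f(x) * D(x), as in Definition 7."""
    return sum(f(x) * D(x) for x in product((0, 1), repeat=N))


# Toy example with N = 2 variables (0-indexed) and D(x) = 1 on every point.
N = 2
D = lambda x: 1
one = lambda x: 1
monomial = lambda x: evaluate_monomial({0}, {1}, x)  # x_0 * (1 - x_1)

print(apply_D(D, one, N))       # 4, one unit of weight per point of {0,1}^2
print(apply_D(D, monomial, N))  # 1, only the point (1, 0) contributes
```

Because $D(f)$ is a weighted sum of evaluations of $f$, linearity in $f$ is automatic, and any polynomial that vanishes on $\{0,1\}^{N}$ (such as a multiple of $x_{j}^{2}-x_{j}$) is mapped to 0.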

3 Total coefficient size lower bound for the pigeonhole principle

In this section, we prove Theorem 1, our total coefficient size lower bound on the pigeonhole principle. We start by formally defining the pigeonhole principle.

Definition 8 (pigeonhole principle ($\mathrm{PHP}_{n}$)).

Intuitively, the pigeonhole principle says that if $n$ pigeons are assigned to $n-1$ holes, then some hole must have more than one pigeon. Formally, for $n\geq 1$, we define $\mathrm{PHP}_{n}$ to be the statement that the following system of axioms is infeasible:

  • For each $i\in[n]$ and $j\in[n-1]$, we have a variable $x_{i,j}$. $x_{i,j}=1$ represents pigeon $i$ being in hole $j$, and $x_{i,j}=0$ represents pigeon $i$ not being in hole $j$.

  • For each $i\in[n]$, we have the axiom $\prod_{j=1}^{n-1}\bar{x}_{i,j}=0$ representing the constraint that each pigeon must be in at least one hole (recall that $\bar{x}_{i,j}=1-x_{i,j}$).

  • For each pair of distinct pigeons $i_{1},i_{2}\in[n]$ and each hole $j\in[n-1]$, we have the axiom $x_{i_{1},j}x_{i_{2},j}=0$ representing the constraint that pigeons $i_{1}$ and $i_{2}$ cannot both be in hole $j$.
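To make the axiom system concrete, here is a minimal sketch that builds the axioms of $\mathrm{PHP}_{n}$ as Python functions and confirms by brute force that no Boolean assignment satisfies all of them for small $n$. The names `php_axioms` and `is_feasible` are hypothetical, and the sketch assumes that each unordered pair of distinct pigeons contributes one axiom per hole.

```python
from itertools import product


def php_axioms(n):
    """Return the axioms of PHP_n as functions of an assignment x.

    x maps each pair (i, j), with pigeon i in 1..n and hole j in 1..n-1,
    to 0 or 1. Each returned function must evaluate to 0 for x to be feasible.
    """
    pigeons, holes = range(1, n + 1), range(1, n)
    axioms = []
    for i in pigeons:
        # prod_j (1 - x_{i,j}) = 0: pigeon i is in at least one hole
        def pigeon_axiom(x, i=i):
            value = 1
            for j in holes:
                value *= 1 - x[(i, j)]
            return value
        axioms.append(pigeon_axiom)
    for j in holes:
        for i1 in pigeons:
            for i2 in pigeons:
                if i1 < i2:
                    # x_{i1,j} * x_{i2,j} = 0: pigeons i1, i2 do not share hole j
                    axioms.append(
                        lambda x, i1=i1, i2=i2, j=j: x[(i1, j)] * x[(i2, j)]
                    )
    return axioms


def is_feasible(n):
    """Brute-force search over all 2^(n(n-1)) Boolean assignments."""
    variables = [(i, j) for i in range(1, n + 1) for j in range(1, n)]
    axioms = php_axioms(n)
    for bits in product((0, 1), repeat=len(variables)):
        x = dict(zip(variables, bits))
        if all(axiom(x) == 0 for axiom in axioms):
            return True
    return False


print(len(php_axioms(3)), is_feasible(3))  # 9 False
```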

We prove our lower bound on the total coefficient size complexity of $\mathrm{PHP}_{n}$ by constructing and analyzing a dual solution $D$. In our dual solution, the only assignments $x$ for which $D(x)\neq 0$ are those where each pigeon goes to exactly one hole (i.e., for each pigeon $i$, exactly one of the $x_{i,j}$ is 1). Note that there are $(n-1)^{n}$ such assignments. In the rest of this section, when we refer to assignments or write a summation or expectation over assignments $x$, we refer specifically to these $(n-1)^{n}$ assignments.

Recall that the dual constraints are

\[D(W)=\sum_{\text{assignments }x}D(x)W(x)\in[-1,1]\]

for all weakenings $W$ of an axiom. Note that since $D(x)$ is only nonzero for assignments $x$ where each pigeon goes to exactly one hole, for any weakening $W$ of an axiom of the form $\prod_{j=1}^{n-1}\bar{x}_{i,j}=0$, $D(W)=0$. Thus, it is sufficient to consider weakenings $W$ of the axioms $x_{i_{1},j}x_{i_{2},j}=0$. Further note that if $|D(W)|>1$ for some weakening $W$ then we can rescale $D$ by dividing by $\max_{W}|D(W)|$. Thus, we can rewrite the objective value of the dual program as $\frac{D(1)}{\max_{W}|D(W)|}$. Letting $\mathbb{E}$ denote the expectation over a uniform assignment where each pigeon goes to exactly one hole, $\frac{D(1)}{\max_{W}|D(W)|}=\frac{\mathbb{E}(D)}{\max_{W}|\mathbb{E}(DW)|}$, so it is sufficient to construct $D$ and analyze $\mathbb{E}(D)$ and $\max_{W}|\mathbb{E}(DW)|$.

Before constructing and analyzing $D$, we provide some intuition for our construction. The idea is that, if we consider a subset of $n-1$ pigeons, then $D$ should behave like the indicator function for whether those $n-1$ pigeons all go to different holes. More concretely, for any polynomial $p$ which does not depend on some pigeon $i$ (i.e. $p$ does not contain $x_{i,j}$ or $\bar{x}_{i,j}$ for any $j\in[n-1]$),

\[\mathbb{E}(Dp)=\frac{(n-1)!}{(n-1)^{n-1}}\mathbb{E}(p\mid\text{all pigeons in }[n]\setminus\{i\}\text{ go to different holes}).\]

Given this intuition, we now present our construction. Our dual solution DD will be a linear combination of the following functions:

Definition 9 (functions $J_{S}$).

Let $S\subsetneq[n]$ be a subset of pigeons of size at most $n-1$. We define the function $J_{S}$ that maps assignments to $\{0,1\}$. For an assignment $x$, $J_{S}(x)=1$ if all pigeons in $S$ are in different holes according to $x$, and $J_{S}(x)=0$ otherwise.

Note that if $|S|=0$ or $|S|=1$, then $J_{S}$ is the constant function 1. In general, the expectation of $J_{S}$ over a uniform assignment is $\mathbb{E}(J_{S})=\left(\prod_{k=1}^{|S|}(n-k)\right)/(n-1)^{|S|}$.
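The formula for $\mathbb{E}(J_{S})$ can be checked by brute force for small $n$. The sketch below assumes 0-indexed pigeons and holes and encodes an assignment as a tuple `x` with `x[i]` the hole of pigeon `i`; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def J(S, x):
    """1 if the pigeons in S occupy pairwise different holes under x, else 0."""
    holes = [x[i] for i in S]
    return int(len(set(holes)) == len(holes))


def expectation(g, n):
    """Average of g over the (n-1)^n assignments sending each pigeon to one hole."""
    assignments = list(product(range(n - 1), repeat=n))
    return Fraction(sum(g(x) for x in assignments), len(assignments))


def expected_J_formula(size, n):
    """prod_{k=1}^{|S|} (n - k) / (n - 1)^{|S|}."""
    value = Fraction(1)
    for k in range(1, size + 1):
        value *= Fraction(n - k, n - 1)
    return value


n = 4
for size in range(n):
    S = range(size)  # by symmetry only |S| matters
    print(size, expectation(lambda x: J(S, x), n), expected_J_formula(size, n))
```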

Definition 10 (dual solution $D$).

Our dual solution $D$ is:

\[D=\sum_{S\subsetneq[n]}c_{S}J_{S},\]

where the coefficients $c_{S}$ are $c_{S}=\frac{(-1)^{n-1-|S|}(n-1-|S|)!}{(n-1)^{n-1-|S|}}$.

We will lower-bound the dual value $\mathbb{E}(D)/\max_{W}|\mathbb{E}(DW)|$ by computing $\mathbb{E}(D)$ and then upper-bounding $\max_{W}|\mathbb{E}(DW)|$. In both calculations, we will use the following key property of $D$, which we introduced in our intuition for the construction:

Lemma 1.

If $p$ is a polynomial which does not depend on pigeon $i$ (i.e. $p$ does not contain any variables of the form $x_{i,j}$ or $\bar{x}_{i,j}$) then $\mathbb{E}(Dp)=\mathbb{E}(J_{[n]\setminus\{i\}}p)$.

Proof.

Without loss of generality, suppose $p$ does not contain any variables of the form $x_{1,j}$ or $\bar{x}_{1,j}$. Let $T$ be any subset of pigeons that does not contain pigeon 1 and that has size at most $n-2$. Observe that

\[\mathbb{E}(J_{T\cup\{1\}}p)=\frac{n-1-|T|}{n-1}\mathbb{E}(J_{T}p)\]

because regardless of the locations of the pigeons in $T$, the probability that pigeon $1$ goes to a different hole is $\frac{n-1-|T|}{n-1}$ and $p$ does not depend on the location of pigeon $1$. Since

\begin{align*}
c_{T\cup\{1\}} &=\frac{(-1)^{n-2-|T|}(n-2-|T|)!}{(n-1)^{n-2-|T|}}\\
&=-\frac{n-1}{n-1-|T|}\cdot\frac{(-1)^{n-1-|T|}(n-1-|T|)!}{(n-1)^{n-1-|T|}}=-\frac{n-1}{n-1-|T|}c_{T},
\end{align*}

we have that for all $T\subsetneq\{2,\dots,n\}$,

\[\mathbb{E}(c_{T\cup\{1\}}J_{T\cup\{1\}}p)+\mathbb{E}(c_{T}J_{T}p)=0.\]

Thus, all terms except for $J_{\{2,3,\ldots,n\}}$ cancel. Since $c_{\{2,3,\ldots,n\}}=1$, we have that $\mathbb{E}(Dp)=\mathbb{E}(J_{\{2,3,\ldots,n\}}p)$, as needed. ∎

The value of $\mathbb{E}(D)$ follows immediately:

Corollary 1.

\[\mathbb{E}(D)=\frac{(n-2)!}{(n-1)^{n-2}}.\]

Proof.

Let $p=1$. By Lemma 1, $\mathbb{E}(D)=\mathbb{E}(J_{\{2,\dots,n\}})=(n-2)!/(n-1)^{n-2}$. ∎
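Lemma 1 and Corollary 1 can be sanity-checked numerically for small $n$. The sketch below builds $D$ from Definition 10 by direct enumeration, using 0-indexed pigeons and holes; the helper names `coeff`, `J`, `D`, and `expectation` are assumptions of this example, and the enumeration is exponential in $n$, so it is only meant for tiny instances.

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial


def coeff(size, n):
    """c_S from Definition 10; it depends only on |S| = size."""
    k = n - 1 - size
    return Fraction((-1) ** k * factorial(k), (n - 1) ** k)


def J(S, x):
    """1 if the pigeons in S occupy pairwise different holes under x, else 0."""
    holes = [x[i] for i in S]
    return int(len(set(holes)) == len(holes))


def D(x, n):
    """D(x) = sum over proper subsets S of the n pigeons of c_S * J_S(x)."""
    return sum(coeff(size, n) * J(S, x)
               for size in range(n)
               for S in combinations(range(n), size))


def expectation(g, n):
    """Average of g over the (n-1)^n assignments sending each pigeon to one hole."""
    assignments = list(product(range(n - 1), repeat=n))
    return Fraction(sum(g(x) for x in assignments), len(assignments))


for n in (3, 4, 5):
    lhs = expectation(lambda x: D(x, n), n)
    rhs = Fraction(factorial(n - 2), (n - 1) ** (n - 2))
    print(n, lhs, rhs)  # the two values agree, as Corollary 1 states
```

The same helpers can test Lemma 1 on any function that ignores one pigeon, for example the indicator that pigeon 1 goes to hole 0, which does not depend on pigeon 0.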

3.1 Upper bound on $\max_{W}|\mathbb{E}(DW)|$

We introduce the following notation:

Definition 11 ($H_{W,i}$).

Given a weakening $W$, we define a set of holes $H_{W,i}\subseteq[n-1]$ for each pigeon $i\in[n]$ so that $W(x)=1$ if and only if each pigeon $i\in[n]$ is mapped to one of the holes in $H_{W,i}$. More precisely,

  • If $W$ contains terms $x_{i,j_{1}}$ and $x_{i,j_{2}}$ for distinct holes $j_{1},j_{2}$, then $H_{W,i}=\emptyset$ (i.e. it is impossible that $W(x)=1$ because pigeon $i$ cannot go to both holes $j_{1}$ and $j_{2}$). Similarly, if $W$ contains both $x_{i,j}$ and $\bar{x}_{i,j}$ for some $j$ then $H_{W,i}=\emptyset$ (i.e. it is impossible for pigeon $i$ to both be in hole $j$ and not be in hole $j$).

  • If $W$ contains exactly one term of the form $x_{i,j}$, then $H_{W,i}=\{j\}$ (i.e., for all $x$ such that $W(x)=1$, pigeon $i$ goes to hole $j$).

  • If $W$ contains no terms of the form $x_{i,j}$, then $H_{W,i}$ is the subset of holes $j$ such that $W$ does not contain the term $\bar{x}_{i,j}$ (i.e., if $W$ contains the term $\bar{x}_{i,j}$, then for all $x$ such that $W(x)=1$, pigeon $i$ does not go to hole $j$).

The key property we will use to bound $\max_{W}|\mathbb{E}(DW)|$ follows immediately from Lemma 1:

Lemma 2.

Let $W$ be a weakening. If there exists some pigeon $i\in[n]$ such that $H_{W,i}=[n-1]$ (i.e., $W$ does not contain any terms of the form $x_{i,j}$ or $\bar{x}_{i,j}$), then $\mathbb{E}(DW)=0$.

Proof.

Without loss of generality, suppose $W$ is a weakening of the axiom $x_{2,1}x_{3,1}=0$ and $H_{W,1}=[n-1]$. By Lemma 1, $\mathbb{E}(DW)=\mathbb{E}(J_{\{2,\dots,n\}}W)$. However, $\mathbb{E}(J_{\{2,\dots,n\}}W)=0$ because if $W=1$ then pigeons 2 and 3 must both go to hole 1. ∎

We make the following definition and then state a corollary of Lemma 2.

Definition 12 ($W^{\mathrm{flip}}_{S}$).

Let $W$ be a weakening of the axiom $x_{i_{1},j}x_{i_{2},j}=0$ for pigeons $i_{1},i_{2}$ and hole $j$. Let $S\subseteq[n]\setminus\{i_{1},i_{2}\}$. We define $W^{\mathrm{flip}}_{S}$, which is also a weakening of the axiom $x_{i_{1},j}x_{i_{2},j}=0$, as follows.

  • For each pigeon $i_{3}\in S$, we define $W^{\mathrm{flip}}_{S}$ so that $H_{W^{\mathrm{flip}}_{S},i_{3}}=[n-1]\setminus H_{W,i_{3}}$.

  • For each pigeon $i_{3}\notin S$, we define $W^{\mathrm{flip}}_{S}$ so that $H_{W^{\mathrm{flip}}_{S},i_{3}}=H_{W,i_{3}}$.

(Technically, there may be multiple ways to define $W^{\mathrm{flip}}_{S}$ to satisfy these properties; we can arbitrarily choose any such definition.)

In other words, $W^{\mathrm{flip}}_{S}$ is obtained from $W$ by flipping the sets of holes that the pigeons in $S$ can go to in order to make the weakening evaluate to 1. Now we state a corollary of Lemma 2:

Corollary 2.

Let $W$ be a weakening of the axiom $x_{i_{1},j}x_{i_{2},j}=0$ for pigeons $i_{1},i_{2}$ and hole $j$. Let $S\subseteq[n]\setminus\{i_{1},i_{2}\}$. Then

\[\mathbb{E}\left(DW^{\mathrm{flip}}_{S}\right)=(-1)^{|S|}\cdot\mathbb{E}(DW).\]

Proof.

It suffices to show that for $i_{3}\in[n]\setminus\{i_{1},i_{2}\}$, we have $\mathbb{E}\left(DW^{\mathrm{flip}}_{\{i_{3}\}}\right)=-\mathbb{E}(DW)$. Indeed, $W+W^{\mathrm{flip}}_{\{i_{3}\}}$ is a weakening satisfying $H_{W+W^{\mathrm{flip}}_{\{i_{3}\}},i_{3}}=[n-1]$. Therefore, by Lemma 2, $\mathbb{E}\left(D\left(W+W^{\mathrm{flip}}_{\{i_{3}\}}\right)\right)=0$. ∎

Using Corollary 2, we can bound $\max_{W}|\mathbb{E}(DW)|$ using Cauchy-Schwarz. We first show an approach that does not give a strong enough bound. We then show how to modify the approach to achieve a better bound.

3.1.1 Unsuccessful approach to upper bound $\max_{W}|\mathbb{E}(DW)|$

Consider $\max_{W}|\mathbb{E}(DW)|$. By Corollary 2, it suffices to consider only weakenings $W$ such that, if $W$ is a weakening of the axiom $x_{i_{1},j}x_{i_{2},j}=0$, then for all pigeons $i_{3}\in[n]\setminus\{i_{1},i_{2}\}$, we have $|H_{W,i_{3}}|\leq\lfloor(n-1)/2\rfloor$. For any such $W$, we have

\begin{align*}
\lVert W\rVert &=\sqrt{\mathbb{E}(W^{2})}\\
&\leq\sqrt{\left(\frac{1}{n-1}\right)^{2}\left(\frac{1}{2}\right)^{n-2}}\\
&=(n-1)^{-1}\cdot 2^{-(n-2)/2}.
\end{align*}

By Cauchy-Schwarz,

\begin{align*}
|\mathbb{E}(DW)| &\leq\lVert D\rVert\lVert W\rVert\\
&\leq\lVert D\rVert(n-1)^{-1}2^{-(n-2)/2}.
\end{align*}

Using the value of $\mathbb{E}(D)$ from Corollary 1, the dual value $\mathbb{E}(D)/\max_{W}|\mathbb{E}(DW)|$ is at least:

\[\frac{(n-2)!}{(n-1)^{n-2}}\cdot\frac{(n-1)2^{(n-2)/2}}{\lVert D\rVert}=\widetilde{\Theta}\left(\left(\frac{e}{\sqrt{2}}\right)^{-n}\cdot\frac{1}{\lVert D\rVert}\right)\]

by Stirling’s formula. Thus, in order to achieve an exponential lower bound on the dual value, we would need $1/\lVert D\rVert\geq\Omega(c^{n})$ for some $c>e/\sqrt{2}$. However, this requirement is too strong, as we will show that $1/\lVert D\rVert=\widetilde{\Theta}\left(\left(\sqrt{e}\right)^{n}\right)$. Directly applying Cauchy-Schwarz results in too loose a bound on $\max_{W}|\mathbb{E}(DW)|$, so we now modify our approach.

3.1.2 Successful approach to upper bound $\max_{W}|\mathbb{E}(DW)|$

Definition 13 ($W^{\{-1,0,1\}}$).

Let $W$ be a weakening of the axiom $x_{i_{1},j}x_{i_{2},j}=0$ for pigeons $i_{1},i_{2}$ and hole $j$. We define the function $W^{\{-1,0,1\}}$ that maps assignments to $\{-1,0,1\}$. For an assignment $x$,

  • If pigeons $i_{1}$ and $i_{2}$ do not both go to hole $j$, then $W^{\{-1,0,1\}}(x)=0$.

  • Otherwise, let $V(x)=|\{i_{3}\in[n]\setminus\{i_{1},i_{2}\}:\text{pigeon }i_{3}\text{ does not go to }H_{W,i_{3}}\}|$. Then $W^{\{-1,0,1\}}(x)=(-1)^{V(x)}$.

Note that $W^{\{-1,0,1\}}$ is a linear combination of the $W^{\mathrm{flip}}_{S}$:

Lemma 3.

Let $W$ be a weakening of the axiom $x_{i_{1},j}x_{i_{2},j}=0$ for pigeons $i_{1},i_{2}$ and hole $j$. We have:

\[W^{\{-1,0,1\}}=\sum_{S\subseteq[n]\setminus\{i_{1},i_{2}\}}(-1)^{|S|}\cdot W^{\mathrm{flip}}_{S}.\]

It follows that:

\[\mathbb{E}\left(DW^{\{-1,0,1\}}\right)=2^{n-2}\cdot\mathbb{E}(DW).\]

Proof.

To prove the first equation, consider any assignment $x$. If pigeons $i_{1}$ and $i_{2}$ do not both go to hole $j$, then both $W^{\{-1,0,1\}}$ and all the $W^{\mathrm{flip}}_{S}$ evaluate to 0 on $x$. Otherwise, exactly one of the $W^{\mathrm{flip}}_{S}(x)$ equals 1, and for this choice of $S$, we have $W^{\{-1,0,1\}}(x)=(-1)^{|S|}$.

The second equation follows because:

\begin{align*}
\mathbb{E}\left(DW^{\{-1,0,1\}}\right) &=\sum_{S\subseteq[n]\setminus\{i_{1},i_{2}\}}(-1)^{|S|}\cdot\mathbb{E}\left(DW^{\mathrm{flip}}_{S}\right)\\
&=\sum_{S\subseteq[n]\setminus\{i_{1},i_{2}\}}(-1)^{|S|}(-1)^{|S|}\cdot\mathbb{E}(DW) && \text{(Corollary 2)}\\
&=2^{n-2}\cdot\mathbb{E}(DW).
\end{align*}

∎
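The identity $\mathbb{E}\left(DW^{\{-1,0,1\}}\right)=2^{n-2}\cdot\mathbb{E}(DW)$ can be checked by enumeration on a small instance. The sketch below is an illustration under stated assumptions: pigeons and holes are 0-indexed, a weakening is encoded (hypothetically) by the pair of colliding pigeons `i1, i2`, the hole `j`, and a dictionary `H` mapping every other pigeon to its set $H_{W,i_{3}}$, and all helper names are invented for this example.

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial


def coeff(size, n):
    """c_S from Definition 10; it depends only on |S| = size."""
    k = n - 1 - size
    return Fraction((-1) ** k * factorial(k), (n - 1) ** k)


def J(S, x):
    holes = [x[i] for i in S]
    return int(len(set(holes)) == len(holes))


def D(x, n):
    return sum(coeff(size, n) * J(S, x)
               for size in range(n)
               for S in combinations(range(n), size))


def expectation(g, n):
    assignments = list(product(range(n - 1), repeat=n))
    return Fraction(sum(g(x) for x in assignments), len(assignments))


def W(x, i1, i2, j, H):
    """Weakening: pigeons i1, i2 both in hole j and each other pigeon i3 in H[i3]."""
    return int(x[i1] == j and x[i2] == j and all(x[i3] in H[i3] for i3 in H))


def W_signed(x, i1, i2, j, H):
    """W^{-1,0,1} from Definition 13."""
    if not (x[i1] == j and x[i2] == j):
        return 0
    violations = sum(1 for i3 in H if x[i3] not in H[i3])
    return (-1) ** violations


n = 4
H = {2: {1}, 3: {2}}  # pigeon 2 must use hole 1, pigeon 3 must use hole 2
plain = expectation(lambda x: D(x, n) * W(x, 0, 1, 0, H), n)
signed = expectation(lambda x: D(x, n) * W_signed(x, 0, 1, 0, H), n)
print(plain, signed, signed == 2 ** (n - 2) * plain)
```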

Using Lemma 3, we now improve on the approach to upper-bound $\max_{W}|\mathbb{E}(DW)|$ from section 3.1.1:

Lemma 4.

The dual value $\mathbb{E}(D)/\max_{W}|\mathbb{E}(DW)|$ is at least $\frac{(n-2)!}{(n-1)^{n-2}}\cdot\frac{(n-1)2^{n-2}}{\lVert D\rVert}$.

Proof.

For any $W$, we have:

\begin{align*}
|\mathbb{E}(DW)| &=2^{-(n-2)}\cdot\left|\mathbb{E}\left(DW^{\{-1,0,1\}}\right)\right| && \text{(Lemma 3)}\\
&\leq 2^{-(n-2)}\cdot\lVert D\rVert\lVert W^{\{-1,0,1\}}\rVert && \text{(Cauchy-Schwarz)}\\
&=2^{-(n-2)}\cdot\lVert D\rVert\sqrt{\mathbb{E}\left(\left(W^{\{-1,0,1\}}\right)^{2}\right)}\\
&=(n-1)^{-1}2^{-(n-2)}\cdot\lVert D\rVert.
\end{align*}

The last step holds because $\left(W^{\{-1,0,1\}}\right)^{2}$ is the indicator that pigeons $i_{1}$ and $i_{2}$ both go to hole $j$, which has expectation $(n-1)^{-2}$. Using the value of $\mathbb{E}(D)$ from Corollary 1, the dual value $\mathbb{E}(D)/\max_{W}|\mathbb{E}(DW)|$ is at least $\frac{(n-2)!}{(n-1)^{n-2}}\cdot\frac{(n-1)2^{n-2}}{\lVert D\rVert}$. ∎

It only remains to compute $\lVert D\rVert$:

Lemma 5.

\[{\lVert D\rVert}^{2}=\frac{(n-2)!}{(n-1)^{n-2}}\cdot n!\cdot\sum_{c=0}^{n-1}\frac{(-1)^{n-1-c}}{n-c}\cdot\frac{1}{(n-1)^{n-1-c}c!}\]

Proof.

Recall the definition of $D$ (Definition 10):

\begin{align*}
D &=\sum_{S\subsetneq[n]}c_{S}J_{S},\\
c_{S} &=\frac{(-1)^{n-1-|S|}(n-1-|S|)!}{(n-1)^{n-1-|S|}}.
\end{align*}

We compute $\lVert D\rVert^{2}=\mathbb{E}(D^{2})$ as follows.

\[\mathbb{E}(D^{2})=\sum_{S\subsetneq[n]}\sum_{T\subsetneq[n]}c_{S}c_{T}\cdot\mathbb{E}(J_{S}J_{T}).\]

Given $S,T\subsetneq[n]$, we have:

\begin{align*}
\mathbb{E}(J_{S}J_{T}) &=\mathbb{E}(J_{S})\mathbb{E}(J_{T}\mid J_{S}=1)\\
&=\left(\left(\prod_{i=1}^{|S|}(n-i)\right)/(n-1)^{|S|}\right)\left(\left(\prod_{j=|S\cap T|+1}^{|T|}(n-j)\right)/(n-1)^{|T\setminus S|}\right).
\end{align*}

Therefore,

\[c_{S}c_{T}\cdot\mathbb{E}(J_{S}J_{T})=\left(c_{S}\left(\prod_{i=1}^{|S|}(n-i)\right)/(n-1)^{|S|}\right)\left(c_{T}\left(\prod_{j=|S\cap T|+1}^{|T|}(n-j)\right)/(n-1)^{|T\setminus S|}\right).\]

Note that the product of $(-1)^{n-1-|S|}$ (from the $c_{S}$) and $(-1)^{n-1-|T|}$ (from the $c_{T}$) equals $(-1)^{|S|-|T|}$, so the above equation becomes:

\[c_{S}c_{T}\cdot\mathbb{E}(J_{S}J_{T})=(-1)^{|S|-|T|}\left(\frac{(n-2)!}{(n-1)^{n-2}}\right)\left(\frac{(n-1-|S\cap T|)!}{(n-1)^{n-1-|S\cap T|}}\right).\]

Now, we rearrange the sum for $\mathbb{E}(D^{2})$ in the following way:

\begin{align*}
\mathbb{E}(D^{2}) &=\sum_{S\subsetneq[n]}\sum_{T\subsetneq[n]}c_{S}c_{T}\cdot\mathbb{E}(J_{S}J_{T})\\
&=\frac{(n-2)!}{(n-1)^{n-2}}\sum_{c=0}^{n-1}\frac{(n-1-c)!}{(n-1)^{n-1-c}}\sum_{\begin{subarray}{c}S,T\subsetneq[n],\\ |S\cap T|=c\end{subarray}}(-1)^{|S|-|T|}.
\end{align*}

To evaluate this expression, fix $c\leq n-1$ and consider the inner sum. Consider the collection of tuples $\{(S,T)\mid S,T\subsetneq[n],|S\cap T|=c\}$. We can pair up (most of) these tuples in the following way. For each $S$, let $m_{S}$ denote the minimum element in $[n]$ that is not in $S$ (note that $m_{S}$ is well defined because $S$ cannot be $[n]$). We pair up the tuple $(S,T)$ with the tuple $(S,T\triangle\{m_{S}\})$, where $\triangle$ denotes symmetric difference. Since $m_{S}\notin S$, this pairing does not change $|S\cap T|$. The only tuples $(S,T)$ that cannot be paired up in this way are those where $|S|=c$ and $T=[n]\setminus\{m_{S}\}$, because $T$ cannot be $[n]$. There are $\binom{n}{c}$ unpaired tuples $(S,T)$, and for each of these tuples, we have $(-1)^{|S|-|T|}=(-1)^{n-1-c}$. On the other hand, each pair $(S,T),(S,T\triangle\{m_{S}\})$ contributes 0 to the inner sum. Therefore, the inner sum equals $(-1)^{n-1-c}\binom{n}{c}$, and we have:

\begin{align*}
\mathbb{E}(D^{2}) &=\frac{(n-2)!}{(n-1)^{n-2}}\sum_{c=0}^{n-1}\frac{(-1)^{n-1-c}(n-1-c)!}{(n-1)^{n-1-c}}\binom{n}{c}\\
&=\frac{(n-2)!}{(n-1)^{n-2}}\sum_{c=0}^{n-1}\frac{(-1)^{n-1-c}(n-1-c)!}{(n-1)^{n-1-c}}\cdot\frac{n!}{c!(n-c)!}\\
&=\frac{(n-2)!}{(n-1)^{n-2}}\cdot n!\cdot\sum_{c=0}^{n-1}\frac{(-1)^{n-1-c}}{n-c}\cdot\frac{1}{(n-1)^{n-1-c}c!}.
\end{align*}

∎

Corollary 3.

\[\mathbb{E}(D^{2})\leq\frac{n!}{(n-1)^{n-1}}\]

Proof.

Observe that the sum

\[\sum_{c=0}^{n-1}\frac{(-1)^{n-1-c}}{n-c}\cdot\frac{1}{(n-1)^{n-1-c}c!}\]

is an alternating series where the magnitudes of the terms decrease as $c$ decreases. The two largest magnitude terms are $1/(n-1)!$ and $-(1/2)\cdot 1/(n-1)!$. Therefore, the sum is at most $\frac{1}{(n-1)!}$, and we conclude that

\[\mathbb{E}(D^{2})\leq\frac{(n-2)!}{(n-1)^{n-2}}\cdot\frac{n!}{(n-1)!}=\frac{n!}{(n-1)^{n-1}}\]

as needed. ∎
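As a numerical cross-check of Lemma 5 and Corollary 3, the sketch below compares a brute-force computation of $\mathbb{E}(D^{2})$ with the closed-form sum and with the bound $n!/(n-1)^{n-1}$ for small $n$. It uses exact rational arithmetic; the helper names and the 0-indexed encoding of assignments are assumptions of this example.

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial


def coeff(size, n):
    """c_S from Definition 10; it depends only on |S| = size."""
    k = n - 1 - size
    return Fraction((-1) ** k * factorial(k), (n - 1) ** k)


def J(S, x):
    holes = [x[i] for i in S]
    return int(len(set(holes)) == len(holes))


def D(x, n):
    return sum(coeff(size, n) * J(S, x)
               for size in range(n)
               for S in combinations(range(n), size))


def norm_squared_brute_force(n):
    """E(D^2) over the (n-1)^n assignments sending each pigeon to one hole."""
    assignments = list(product(range(n - 1), repeat=n))
    return Fraction(sum(D(x, n) ** 2 for x in assignments), len(assignments))


def norm_squared_formula(n):
    """Closed form for E(D^2) stated in Lemma 5."""
    total = sum(
        Fraction((-1) ** (n - 1 - c),
                 (n - c) * (n - 1) ** (n - 1 - c) * factorial(c))
        for c in range(n)
    )
    return Fraction(factorial(n - 2), (n - 1) ** (n - 2)) * factorial(n) * total


for n in (3, 4, 5):
    bound = Fraction(factorial(n), (n - 1) ** (n - 1))  # Corollary 3
    print(n, norm_squared_brute_force(n), norm_squared_formula(n), bound)
```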

We can now complete the proof of Theorem 1.

Proof of Theorem 1.

By Lemma 4, any Nullstellensatz proof for $\mathrm{PHP}_{n}$ has total coefficient size at least $\frac{(n-2)!}{(n-1)^{n-2}}\cdot\frac{(n-1)2^{n-2}}{\lVert D\rVert}$. By Corollary 3, $\lVert D\rVert\leq\sqrt{\frac{n!}{(n-1)^{n-1}}}$. Combining these results, any Nullstellensatz proof for $\mathrm{PHP}_{n}$ has total coefficient size at least

\begin{align*}
\frac{(n-2)!}{(n-1)^{n-2}}\cdot\frac{(n-1)2^{n-2}}{\sqrt{\frac{n!}{(n-1)^{n-1}}}} &=\frac{2^{n-2}}{\sqrt{n}}\cdot\frac{\sqrt{(n-1)!}}{(n-1)^{\frac{n}{2}-\frac{3}{2}}}\\
&=\frac{2^{n-2}(n-1)}{\sqrt{n}}\sqrt{\frac{(n-1)!}{(n-1)^{n-1}}}.
\end{align*}

Using Stirling’s approximation that $n!$ is approximately $\sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}$, $\sqrt{\frac{(n-1)!}{(n-1)^{n-1}}}$ is approximately $\sqrt[4]{2\pi(n-1)}\left(\frac{1}{\sqrt{e}}\right)^{n-1}$, so this expression is $\Omega\left(n^{\frac{3}{4}}\left(\frac{2}{\sqrt{e}}\right)^{n}\right)$, as needed. ∎
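The explicit bound from the proof of Theorem 1 is easy to evaluate numerically. The sketch below (the function name `proven_lower_bound` is an assumption of this example) computes $\frac{2^{n-2}(n-1)}{\sqrt{n}}\sqrt{\frac{(n-1)!}{(n-1)^{n-1}}}$ in floating point for small $n$.

```python
from math import factorial, sqrt


def proven_lower_bound(n):
    """Evaluate 2^(n-2) (n-1) / sqrt(n) * sqrt((n-1)! / (n-1)^(n-1))."""
    return (2 ** (n - 2) * (n - 1) / sqrt(n)
            * sqrt(factorial(n - 1) / (n - 1) ** (n - 1)))


for n in range(3, 7):
    print(n, round(proven_lower_bound(n), 3))
# 3 1.633
# 4 2.828
# 5 4.382
# 6 6.4
```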

3.2 Experimental Results for PHPn\text{PHP}_{n}

For small nn, we computed the optimal dual values shown below. The first column of values is the optimal dual value for n=3,4n=3,4. The second column of values is the optimal dual value for n=3,4,5,6n=3,4,5,6 under the restriction that the only nonzero assignments are those where each pigeon goes to exactly one hole.

nn dual value dual value, each pigeon goes to exactly one hole
3 11 6
4 41.469¯41.4\overline{69} 27
5 - 100
6 - 293.75

For comparison, the table below shows the value we computed for our dual solution and the lower bound of $\frac{2^{n-2}(n-1)}{\sqrt{n}}\sqrt{\frac{(n-1)!}{(n-1)^{n-1}}}$ that we showed in the proof of Theorem 1. (Values are rounded to 3 decimals.)

| $n$ | value of $D$ | proven lower bound on value of $D$ |
| 3 | 4 | 1.633 |
| 4 | 18 | 2.828 |
| 5 | 64 | 4.382 |
| 6 | 210.674 | 6.4 |
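The proven lower bound column can be reproduced directly from the closed-form expression in the proof of Theorem 1. The following is a minimal sketch; the function name `proven_lower_bound` is ours and does not appear in the paper.

```python
from math import factorial, sqrt

def proven_lower_bound(n: int) -> float:
    """Evaluate 2^(n-2) (n-1) / sqrt(n) * sqrt((n-1)! / (n-1)^(n-1))."""
    ratio = factorial(n - 1) / (n - 1) ** (n - 1)
    return 2 ** (n - 2) * (n - 1) / sqrt(n) * sqrt(ratio)

# Print the bound for the values of n that appear in the table.
for n in range(3, 7):
    print(n, round(proven_lower_bound(n), 3))
```

Only the lower bound is recomputed here; the "value of $D$" column depends on the dual certificate itself and is not reproduced by this sketch.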

It is possible that our lower bound on the value of $D$ can be improved. The following experimental evidence suggests that the dual value $\mathbb{E}(D)/\max_{W}|\mathbb{E}(DW)|$ of $D$ may actually be $\widetilde{\Theta}(2^{n})$. For $n=3,4,5,6$, we found that the weakenings $W$ that maximize $|\mathbb{E}(DW)|$ are of the following form, up to symmetry. (By symmetry, we mean that we can permute pigeons/holes without changing $|\mathbb{E}(DW)|$, and we can flip sets of holes as in Lemma 2 without changing $|\mathbb{E}(DW)|$.)

  • For odd $n$ ($n=3,5$): $W$ is the weakening of the axiom $x_{1,1}x_{2,1}=0$ where, for $i=3,\dots,n$, we have $H_{W,i}=\{2,\dots,(n+1)/2\}$.

  • For even $n$ ($n=4,6$): $W$ is the following weakening of the axiom $x_{1,1}x_{2,1}=0$. For $i=3,\dots,n/2+1$, we have $H_{W,i}=\{2,\dots,n/2\}$. For $i=n/2+2,\dots,n$, we have $H_{W,i}=\{n/2+1,\dots,n-1\}$.

If this pattern continues to hold for larger $n$, then experimentally it seems that $\mathbb{E}(D)/\max_{W}|\mathbb{E}(DW)|$ is $\widetilde{\Theta}(2^{n})$, although we do not have a proof of this.

4 Total coefficient size upper bound for the ordering principle

In this section, we construct an explicit Nullstellensatz proof of infeasibility for the ordering principle $\text{ORD}_{n}$ with size and total coefficient size $2^{n}-n$. We start by formally defining the ordering principle.

Definition 14 (ordering principle ($\mathrm{ORD}_{n}$)).

Intuitively, the ordering principle says that any well-ordering on $n$ elements must have a minimum element. Formally, for $n\geq 1$, we define $\mathrm{ORD}_{n}$ to be the statement that the following system of axioms is infeasible:

  • We have a variable $x_{i,j}$ for each pair $i,j\in[n]$ with $i<j$. $x_{i,j}=1$ represents element $i$ being less than element $j$ in the well-ordering, and $x_{i,j}=0$ represents element $i$ being more than element $j$ in the well-ordering.

    We write $x_{j,i}$ as shorthand for $1-x_{i,j}$ (i.e. we take $x_{j,i}=\bar{x}_{i,j}=1-x_{i,j}$).

  • For each $i\in[n]$, we have the axiom $\prod_{j\in[n]\setminus\{i\}}{x_{i,j}}=0$ which represents the constraint that element $i$ is not a minimum element. We call these axioms non-minimality axioms.

  • For each triple $i,j,k\in[n]$ where $i<j<k$, we have the two axioms $x_{i,j}x_{j,k}x_{k,i}=0$ and $x_{k,j}x_{j,i}x_{i,k}=0$ which represent the constraints that elements $i,j,k$ satisfy transitivity. We call these axioms transitivity axioms.

In our Nullstellensatz proof, for each weakening $W$ of an axiom, its coefficient $c_{W}$ will either be $1$ or $0$. Non-minimality axioms will appear with coefficient $1$ and the only weakenings of transitivity axioms which appear have a special form which we describe below.

Definition 15 (nice transitivity weakening).

Let $W$ be a weakening of the axiom $x_{i,j}x_{j,k}x_{k,i}$ or the axiom $x_{k,j}x_{j,i}x_{i,k}$ for some $i<j<k$. Let $G(W)$ be the following directed graph. The vertices of $G(W)$ are $[n]$. For distinct $i^{\prime},j^{\prime}\in[n]$, $G(W)$ has an edge from $i^{\prime}$ to $j^{\prime}$ if $W$ contains the term $x_{i^{\prime},j^{\prime}}$. We say that $W$ is a nice transitivity weakening if $G(W)$ has exactly $n$ edges and all vertices are reachable from vertex $i$.

In other words, if $W$ is a weakening of the axiom $x_{i,j}x_{j,k}x_{k,i}$ or the axiom $x_{k,j}x_{j,i}x_{i,k}$ then $G(W)$ contains a 3-cycle on vertices $\{i,j,k\}$. $W$ is a nice transitivity weakening if and only if contracting this 3-cycle results in a (directed) spanning tree rooted at the contracted vertex. Note that if $W$ is a nice transitivity weakening and $x$ is an assignment with a minimum element then $W(x)=0$.
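The two conditions in Definition 15 (exactly $n$ edges, and every vertex reachable from $i$) are easy to test mechanically. Below is a minimal sketch under our own encoding, which is an assumption and not notation from the paper: a weakening is a set of ordered pairs `(a, b)`, one per variable $x_{a,b}$ it contains, and `root` is the vertex $i$ of the underlying transitivity axiom. The function does not check that the monomial is actually a weakening of a transitivity axiom.

```python
from collections import deque

def is_nice(weakening, n, root):
    """Check the edge-count and reachability conditions on G(W)."""
    edges = set(weakening)
    if len(edges) != n:
        return False
    # Breadth-first search from the root along directed edges a -> b.
    seen, queue = {root}, deque([root])
    while queue:
        u = queue.popleft()
        for a, b in edges:
            if a == u and b not in seen:
                seen.add(b)
                queue.append(b)
    return seen == set(range(1, n + 1))

# x_{1,2} x_{2,3} x_{3,1} x_{1,4} on 4 elements: 3-cycle plus one tree edge.
print(is_nice({(1, 2), (2, 3), (3, 1), (1, 4)}, 4, 1))
```

In this encoding, a 3-cycle with an edge pointing into the cycle, such as `{(1, 2), (2, 3), (3, 1), (4, 1)}`, has the right number of edges but fails the reachability condition.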

Theorem 3.

There is a Nullstellensatz proof of infeasibility for $\text{ORD}_{n}$ satisfying:

  1. The total coefficient size is $2^{n}-n$.

  2. Each $c_{W}$ is either 0 or 1.

  3. If $A$ is a non-minimality axiom, then $c_{A}=1$ and $c_{W}=0$ for all other weakenings of $A$.

  4. If $W$ is a transitivity weakening but not a nice transitivity weakening then $c_{W}=0$.

Proof. We prove Theorem 3 by induction on $n$. When $n=3$, the desired Nullstellensatz proof sets $c_{A}=1$ for each axiom $A$. It can be verified that $\sum_{W}c_{W}W$ evaluates to 1 on each assignment, and that this Nullstellensatz proof satisfies the properties of Theorem 3.

Now suppose we have a Nullstellensatz proof for $\text{ORD}_{n}$ satisfying Theorem 3, and let $S_{n}$ denote the set of transitivity weakenings $W$ for which $c_{W}=1$. The idea to obtain a Nullstellensatz proof for $\text{ORD}_{n+1}$ is to use two "copies" of $S_{n}$, the first copy on elements $\{1,\dots,n\}$ and the second copy on elements $\{2,\dots,n+1\}$. Specifically, we construct the Nullstellensatz proof for $\text{ORD}_{n+1}$ by setting the following $c_{W}$ to 1 and all other $c_{W}$ to 0.

  1. For each non-minimality axiom $A$ in $\text{ORD}_{n+1}$, we set $c_{A}=1$.

  2. For each $W\in S_{n}$, we define the transitivity weakening $W^{\prime}$ on $n+1$ elements by $W^{\prime}=W\cdot x_{1,n+1}$ and set $c_{W^{\prime}}=1$.

  3. For each $W\in S_{n}$, first we define the transitivity weakening $W^{\prime\prime}$ on $n+1$ elements by replacing each variable $x_{i,j}$ that appears in $W$ by $x_{i+1,j+1}$. (E.g., if $W=x_{1,2}x_{2,3}x_{3,1}$, then $W^{\prime\prime}=x_{2,3}x_{3,4}x_{4,2}$.) Then, we define $W^{\prime\prime\prime}=W^{\prime\prime}x_{n+1,1}$ and set $c_{W^{\prime\prime\prime}}=1$.

  4. For each $i\in\{2,\dots,n\}$, for each of the 2 transitivity axioms $A$ on $(1,i,n+1)$, we set $c_{W}=1$ for the following weakening $W$ of $A$:

    $$W=A\left(\prod_{j\in\{2,\dots,n\}\setminus\{i\}}{x_{i,j}}\right).$$

    In other words, $W(x)=1$ if and only if $A(x)=1$ and $i$ is the minimum element among the elements $[n+1]\setminus\{1,n+1\}$.

The desired properties 1 through 4 in Theorem 3 can be verified by induction. It remains to show that for each assignment $x$, there is exactly one nonzero $c_{W}$ for which $W(x)=1$. If $x$ has a minimum element $i\in[n+1]$, then the only nonzero $c_{W}$ for which $W(x)=1$ is the non-minimality axiom for $i$. Now suppose that $x$ does not have a minimum element. Consider two cases: either $x_{1,n+1}=1$, or $x_{n+1,1}=1$. Suppose $x_{1,n+1}=1$. Consider the two subcases:

  1. Suppose that, if we ignore element $n+1$, then there is still no minimum element among the elements $\{1,\dots,n\}$. Then there is exactly one weakening $W$ in point 2 of the construction for which $W(x)=1$, by induction.

  2. Otherwise, for some $i\in\{2,\dots,n\}$, we have that $i$ is a minimum element among $\{1,\dots,n\}$ and $x_{n+1,i}=1$. Then there is exactly one weakening $W$ in point 4 of the construction for which $W(x)=1$ (namely the weakening $W$ of the axiom $A=x_{i,1}x_{1,n+1}x_{n+1,i}$).

The case $x_{n+1,1}=1$ is handled similarly by considering whether there is a minimum element among $\{2,\dots,n+1\}$. Assignments that do not have a minimum element among $\{2,\dots,n+1\}$ are handled by point 3 of the construction, and assignments that do are handled by point 4 of the construction.
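The recursive construction can be checked by brute force for small $n$. The sketch below uses our own encoding (an assumption, not the paper's notation): a monomial is a frozenset of ordered pairs `(a, b)` standing for $x_{a,b}$, and the recursion goes from $n-1$ to $n$ rather than from $n$ to $n+1$. In point 4 the extra factor is taken over $j\in\{2,\dots,n-1\}\setminus\{i\}$, matching the verbal description that $i$ is the minimum among the elements other than the first and the last.

```python
from itertools import combinations, product

def transitivity_weakenings(n):
    """S_n: transitivity weakenings that receive coefficient 1."""
    if n == 3:
        return [frozenset({(1, 2), (2, 3), (3, 1)}),
                frozenset({(3, 2), (2, 1), (1, 3)})]
    prev = transitivity_weakenings(n - 1)  # S_{n-1}; the new element is n
    out = []
    for w in prev:  # point 2: copy on {1..n-1}, times x_{1,n}
        out.append(w | {(1, n)})
    for w in prev:  # point 3: shifted copy on {2..n}, times x_{n,1}
        out.append(frozenset((a + 1, b + 1) for a, b in w) | {(n, 1)})
    for i in range(2, n):  # point 4: i is the minimum among {2..n-1}
        rest = {(i, j) for j in range(2, n) if j != i}
        for axiom in ({(1, i), (i, n), (n, 1)}, {(n, i), (i, 1), (1, n)}):
            out.append(frozenset(axiom | rest))
    return out

def proof_monomials(n):
    """All monomials with coefficient 1: non-minimality axioms plus S_n."""
    non_min = [frozenset((i, j) for j in range(1, n + 1) if j != i)
               for i in range(1, n + 1)]
    return non_min + transitivity_weakenings(n)

def assignments(n):
    """Yield every 0/1 assignment, with x[(j, i)] = 1 - x[(i, j)]."""
    pairs = list(combinations(range(1, n + 1), 2))
    for bits in product((0, 1), repeat=len(pairs)):
        x = {}
        for (i, j), b in zip(pairs, bits):
            x[(i, j)], x[(j, i)] = b, 1 - b
        yield x

def is_valid_proof(n):
    """True if exactly one chosen monomial equals 1 on every assignment."""
    monomials = proof_monomials(n)
    return all(sum(all(x[e] for e in w) for w in monomials) == 1
               for x in assignments(n))

for n in (3, 4, 5):
    print(n, len(proof_monomials(n)), is_valid_proof(n))
```

The check enumerates all $2^{\binom{n}{2}}$ assignments, so it is only practical for very small $n$.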

4.1 Restriction to instances with no minimum element

We now observe that for the ordering principle, we can restrict our attention to instances which have no minimum element.

Lemma 6.

Suppose we have coefficients $c_{W}$ satisfying $\sum_{W}c_{W}W(x)=1$ for all assignments $x$ that have no minimum element (but it is possible that $\sum_{W}c_{W}W(x)\neq 1$ on assignments $x$ that do have a minimum element). Then there exist coefficients $c^{\prime}_{W}$ such that $\sum_{W}c^{\prime}_{W}W=1$ (i.e., the coefficients $c_{W}^{\prime}$ are a valid primal solution) with

$$\sum_{W}{|c^{\prime}_{W}|}\leq(n+1)\left(\sum_{W}{|c_{W}|}\right)+n.$$

This lemma says that, to prove upper or lower bounds for $\text{ORD}_{n}$ by constructing primal or dual solutions, it suffices to consider only assignments $x$ that have no minimum element, up to a factor of $O(n)$ in the solution value.

Proof.

Let $C$ denote the function on weakenings that maps $W$ to $c_{W}$. For $i\in[n]$, we will define the function $C_{i}$ on weakenings satisfying the properties:

  • If $x$ is an assignment where $i$ is a minimum element, then $\sum_{W}{C_{i}(W)W(x)}=\sum_{W}{C(W)W(x)}$.

  • Otherwise, $\sum_{W}{C_{i}(W)W(x)}=0$.

Let $A_{i}=\prod_{j\in[n]\setminus\{i\}}{x_{i,j}}$ be the non-minimality axiom for $i$. Intuitively, we want to define $C_{i}$ as follows: For all $W$, $C_{i}(A_{i}W)=C(W)$. (If $W$ is a weakening that is not a weakening of $A_{i}$, then $C_{i}(W)=0$.) The only technicality is that multiple weakenings $W$ may become the same when multiplied by $A_{i}$, so we actually define $C_{i}(A_{i}W)=\sum_{W^{\prime}:A_{i}W^{\prime}=A_{i}W}C(W^{\prime})$.

Finally, we use the functions $C_{i}$ to define the function $C^{\prime}$:

$$C^{\prime}=C-\left(\sum_{i=1}^{n}C_{i}\right)+\left(\sum_{i=1}^{n}A_{i}\right).$$

By taking $c^{\prime}_{W}=C^{\prime}(W)$, the $c^{\prime}_{W}$ are a valid primal solution with the desired bound on the total coefficient size. ∎
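For completeness, the two claims in the last sentence can be spelled out as follows. This is our own unpacking of the argument, using the shorthand $S(x)=\sum_{W}C(W)W(x)$ (not notation from the paper) and the fact that $A_{i}(x)=1$ exactly when $i$ is a minimum element of $x$.

```latex
% Validity: evaluate C' on an assignment x, using the two properties of C_i.
\sum_{W} C'(W)\,W(x) =
\begin{cases}
  S(x) - 0 + 0 = 1 & \text{if } x \text{ has no minimum element},\\
  S(x) - S(x) + 1 = 1 & \text{if } i \text{ is the (unique) minimum element of } x.
\end{cases}

% Size: by the triangle inequality, \sum_W |C_i(W)| \le \sum_W |C(W)| for each i,
% and each A_i contributes a single coefficient equal to 1.
\sum_{W} |c'_{W}|
  \le \sum_{W} |c_{W}| + n \sum_{W} |c_{W}| + n
  = (n+1)\Big(\sum_{W} |c_{W}|\Big) + n .
```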

4.2 Experimental results

For small values of $n$, we have computed both the minimum total coefficient size of a Nullstellensatz proof of the ordering principle and the value of the linear program where we restrict our attention to instances $x$ which have no minimum element.

We found that for $n=3,4,5$, the minimum total coefficient size of a Nullstellensatz proof of the ordering principle is $2^{n}-n$, so the primal solution given by Theorem 3 is optimal. However, for $n=6$ this solution is not optimal, as the minimum total coefficient size is $52$ rather than $2^{6}-6=58$.

If we restrict our attention to instances $x$ which have no minimum element then for $n=3,4,5,6$, the value of the resulting linear program is equal to $2\binom{n}{3}$, which is the number of transitivity axioms. However, this is no longer true for $n=7$, though we did not compute the exact value.

5 Analyzing Total Coefficient Size for Stronger Proof Systems

In this section, we consider the total coefficient size for two stronger proof systems: sum of squares proofs, and a proof system between Nullstellensatz and sum of squares proofs which we call resolution-like proofs.

Definition 16.

Given a system of axioms $\{p_{i}=0:i\in[m]\}$, we define a resolution-like proof of infeasibility to be an equality of the form

$$-1=\sum_{i=1}^{m}{{p_{i}}{q_{i}}}+\sum_{j}{{c_{j}}g_{j}}$$

where each $g_{j}$ is a monomial and each coefficient $c_{j}$ is non-negative. We define the total coefficient size of such a proof to be $\sum_{i=1}^{m}{T(q_{i})}+\sum_{j}{c_{j}}$.

We call this proof system resolution-like because it captures the resolution-like calculus introduced for Max-SAT by María Luisa Bonet, Jordi Levy, and Felip Manyà [4]. The idea is that if we have deduced that $x{r_{1}}\leq 0$ and $\bar{x}{r_{2}}\leq 0$ for some variable $x$ and monomials $r_{1}$ and $r_{2}$ then we can deduce that ${r_{1}}{r_{2}}\leq 0$ as follows:

$${r_{1}}{r_{2}}=x{r_{1}}-(1-r_{2})x{r_{1}}+\bar{x}{r_{2}}-(1-r_{1})\bar{x}{r_{2}}$$

where we decompose $(1-r_{1})$ and $(1-r_{2})$ into monomials using the observation that $1-\prod_{i=1}^{k}{x_{i}}=\sum_{j=1}^{k}{(1-x_{j})\left(\prod_{i=1}^{j-1}{x_{i}}\right)}$.
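Both identities used in this derivation are polynomial identities, so they can be sanity-checked numerically. The following is a small sketch that tests them on a grid of integer values; the helper names `telescoped` and `resolution_rhs` are ours.

```python
from itertools import product
from math import prod

def telescoped(xs):
    """Right-hand side of 1 - prod_i x_i = sum_j (1 - x_j) prod_{i<j} x_i."""
    return sum((1 - xs[j]) * prod(xs[:j]) for j in range(len(xs)))

def resolution_rhs(x, r1, r2):
    """Right-hand side of the resolution step, with xbar = 1 - x."""
    xbar = 1 - x
    return x * r1 - (1 - r2) * x * r1 + xbar * r2 - (1 - r1) * xbar * r2

# Check the telescoping identity for k = 4 variables on a small integer grid.
ok_telescope = all(1 - prod(xs) == telescoped(xs)
                   for xs in product(range(-2, 3), repeat=4))

# Check the resolution step for integer values of x, r1, r2.
ok_resolution = all(r1 * r2 == resolution_rhs(x, r1, r2)
                    for x, r1, r2 in product(range(-3, 4), repeat=3))

print(ok_telescope, ok_resolution)
```

Since the identities hold as polynomials, agreement on a grid is a consistency check rather than a proof; the algebraic argument is the one given in the text.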

The minimum total coefficient size of a resolution-like proof can be found using the following linear program.

  1. Primal: Minimize $\sum_{i=1}^{m}{T(q_{i})}+\sum_{j}{c_{j}}$ subject to $\sum_{i=1}^{m}{{p_{i}}{q_{i}}}+\sum_{j}{{c_{j}}g_{j}}=-1$.

  2. Dual: Maximize $D(1)$ subject to the constraints that

    1. $D$ is a linear map from polynomials to $\mathbb{R}$.

    2. For each $i\in[m]$ and each monomial $r$, $|D(rp_{i})|\leq 1$.

    3. For each monomial $r$, $D(r)\geq-1$.

Definition 17.

Given a system of axioms $\{p_{i}=0:i\in[m]\}$, a Positivstellensatz/sum of squares proof of infeasibility is an equality of the form

$$-1=\sum_{i=1}^{m}{{p_{i}}{q_{i}}}+\sum_{j}{g_{j}^{2}}$$

We define the total coefficient size of a Positivstellensatz/sum of squares proof to be $\sum_{i=1}^{m}{T(q_{i})}+\sum_{j}{T(g_{j})^{2}}$.

  1. Primal: Minimize $\sum_{i=1}^{m}{T(q_{i})}+\sum_{j}{T(g_{j})^{2}}$ subject to the constraint that $-1=\sum_{i=1}^{m}{{p_{i}}{q_{i}}}+\sum_{j}{g_{j}^{2}}$.

  2. Dual: Maximize $D(1)$ subject to the constraints that

    1. $D$ is a linear map from polynomials to $\mathbb{R}$.

    2. For each $i\in[m]$ and each monomial $r$, $|D(rp_{i})|\leq 1$.

    3. For each polynomial $g_{j}$, $D((g_{j})^{2})\geq-T(g_{j})^{2}$.

5.1 Failure of the dual certificate for resolution-like proofs

In this subsection, we observe that our dual certificate does not give a lower bound on the total coefficient size for resolution-like proofs of the pigeonhole principle because it has a large negative value on some monomials.

Theorem 4.

The value of the dual certificate on the polynomial $\prod_{i=1}^{n}{\bar{x}_{i1}}$ is

$$-\frac{(n-2)!}{(n-1)^{n-1}}\left(1-\frac{(-1)^{n-1}}{(n-1)^{n-2}}\right)$$

Proof.

To show this, we make the following observations.

  1. The value of the dual certificate on the polynomial $\prod_{i=2}^{n}{\bar{x}_{i1}}$ is $0$.

  2. The value of the dual certificate on the polynomial $x_{11}\prod_{i=3}^{n}{\bar{x}_{i1}}$ is $\frac{(n-2)!}{(n-1)^{n-1}}$.

  3. The value of the dual certificate on the polynomial $x_{11}x_{21}\prod_{i=3}^{n}{\bar{x}_{i1}}$ is $\frac{(-1)^{n-2}(n-2)!}{(n-1)^{2n-3}}$.

For the first observation, observe that since the first pigeon is unrestricted, every term of the dual certificate cancels except $J_{\{2,3,\ldots,n\}}$ which is $0$ as none of these pigeons can go to hole $1$. For the second observation, observe that since the second pigeon is unrestricted, every term of the dual certificate cancels except $J_{\{1,3,4,\ldots,n\}}$ which gives value $\frac{\mathbb{E}(D)}{n-1}=\frac{(n-2)!}{(n-1)^{n-1}}$. For the third observation, observe that by Lemma 2, the value of the dual certificate on the polynomial $x_{11}x_{21}\prod_{i=3}^{n}{\bar{x}_{i1}}$ is $(-1)^{n-2}$ times the value of the dual certificate on the polynomial $\prod_{i=1}^{n}{x_{i1}}$ which is

$$\frac{1}{(n-1)^{n}}\left(-\frac{(n-1)!}{(n-1)^{n-1}}+n\frac{(n-2)!}{(n-1)^{n-2}}\right)=\frac{(n-2)!}{(n-1)^{2n-3}}$$

Putting these observations together, the value of the dual certificate for the polynomial

$$\prod_{i=1}^{n}{\bar{x}_{i1}}=\prod_{i=2}^{n}{\bar{x}_{i1}}-x_{11}\prod_{i=3}^{n}{\bar{x}_{i1}}+x_{11}x_{21}\prod_{i=3}^{n}{\bar{x}_{i1}}$$

is $-\frac{(n-2)!}{(n-1)^{n-1}}+\frac{(-1)^{n-2}(n-2)!}{(n-1)^{2n-3}}=-\frac{(n-2)!}{(n-1)^{n-1}}\left(1-\frac{(-1)^{n-1}}{(n-1)^{n-2}}\right)$. ∎
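Two purely arithmetic steps in this proof can be checked independently of the dual certificate: the simplification of the value on $\prod_{i=1}^{n}x_{i1}$, and the decomposition of $\prod_{i=1}^{n}\bar{x}_{i1}$ into three polynomials. The sketch below does this with exact rationals; it does not evaluate the dual certificate itself, and the function names are ours.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod

def value_before_simplifying(n):
    """1/(n-1)^n * ( -(n-1)!/(n-1)^(n-1) + n (n-2)!/(n-1)^(n-2) )."""
    a = Fraction(factorial(n - 1), (n - 1) ** (n - 1))
    b = n * Fraction(factorial(n - 2), (n - 1) ** (n - 2))
    return Fraction(1, (n - 1) ** n) * (-a + b)

def simplified_value(n):
    """(n-2)! / (n-1)^(2n-3)."""
    return Fraction(factorial(n - 2), (n - 1) ** (2 * n - 3))

def decomposition_holds(n):
    """Check the three-term decomposition on all 0/1 values of x_{11}..x_{n1}."""
    for x in product((0, 1), repeat=n):
        xbar = [1 - v for v in x]
        lhs = prod(xbar)
        rhs = prod(xbar[1:]) - x[0] * prod(xbar[2:]) + x[0] * x[1] * prod(xbar[2:])
        if lhs != rhs:
            return False
    return True

print(all(value_before_simplifying(n) == simplified_value(n) for n in range(3, 12)))
print(all(decomposition_holds(n) for n in range(2, 8)))
```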

5.2 Small total coefficient size sum of squares proof of the ordering principle

In this subsection, we show that the small size resolution proof of the ordering principle [16], which seems to be dynamic in nature, can actually be mimicked by a sum of squares proof. Thus, while sum of squares requires degree $\tilde{\Theta}(\sqrt{n})$ to refute the negation of the ordering principle [13], there is a sum of squares proof which has polynomial size and total coefficient size. To make our proof easier to express, we define the following monomials.

Definition 18.
  1. Whenever $1\leq j\leq m\leq n$, let $F_{jm}=\prod_{i\in[m]\setminus\{j\}}{x_{ji}}$ be the monomial which is $1$ if $x_{j}$ is the first element in $x_{1},\dots,x_{m}$ and $0$ otherwise.

  2. For all $m\in[n-1]$ and all distinct $j,k\in[m]$, we define $T_{jmk}$ to be the monomial

    $$T_{jmk}=F_{jm}x_{(m+1)j}x_{k(m+1)}\prod_{i\in[k-1]\setminus\{j\}}{x_{(m+1)i}}$$

    Note that $T_{jmk}$ is a multiple of $x_{(m+1)j}x_{jk}x_{k(m+1)}$ so it is a weakening of a transitivity axiom.

With these definitions, we can now express our proof.

Theorem 5.

The following equality (modulo the axioms that $x_{ij}^{2}=x_{ij}$ and $x_{ij}x_{ji}=0$ for all distinct $i,j\in[n]$) gives an SoS proof that the total ordering axioms are infeasible.

$$-1=\sum_{m=1}^{n-1}{\left(\left(F_{(m+1)(m+1)}-\sum_{j=1}^{m}{F_{jm}F_{(m+1)(m+1)}}\right)^{2}-\sum_{j=1}^{m}{\sum_{k\in[m]\setminus\{j\}}{T_{jmk}}}\right)}-\sum_{j=1}^{n}{F_{jn}}$$

Proof.

Our key building block is the following lemma.

Lemma 7.

For all $m\in[1,n-1]$ and all $j\in[1,m]$,

$$F_{jm}=F_{j(m+1)}+\sum_{k\in[m]\setminus\{j\}}{T_{jmk}}+F_{jm}F_{(m+1)(m+1)}$$

Proof.

First note that

$$F_{jm}=F_{jm}x_{j(m+1)}+F_{jm}x_{(m+1)j}=F_{j(m+1)}+F_{jm}x_{(m+1)j}$$

We now use the following proposition.

Proposition 2.

$$F_{jm}x_{(m+1)j}=\sum_{k\in[m]\setminus\{j\}}{F_{jm}x_{(m+1)j}x_{k(m+1)}\prod_{i\in[k-1]\setminus\{j\}}{x_{(m+1)i}}}+F_{jm}F_{(m+1)(m+1)}x_{(m+1)j}$$

Proof.

If $x_{k(m+1)}=0$ for all $k\in[m]$ then $x_{k(m+1)}\prod_{i\in[k-1]\setminus\{j\}}{x_{(m+1)i}}=0$ for all $k\in[m]$ and $F_{(m+1)(m+1)}=1$. Otherwise, let $k^{\prime}$ be the first index in $[m]$ such that $x_{k^{\prime}(m+1)}=1$. Now observe that $F_{(m+1)(m+1)}=0$ and $x_{k(m+1)}\prod_{i\in[k-1]\setminus\{j\}}{x_{(m+1)i}}=1$ if $k=k^{\prime}$ and is $0$ if $k\neq k^{\prime}$. ∎

Finally, we observe that $F_{jm}F_{(m+1)(m+1)}x_{(m+1)j}=F_{jm}F_{(m+1)(m+1)}$ as $x_{(m+1)j}$ is contained in $F_{(m+1)(m+1)}$. Putting everything together,

$$F_{jm}=F_{j(m+1)}+\sum_{k\in[m]\setminus\{j\}}{T_{jmk}}+F_{jm}F_{(m+1)(m+1)}$$

We now verify that our proof, which we restate here for convenience, is indeed an equality modulo the axioms that $x_{ij}^{2}=x_{ij}$ and $x_{ij}x_{ji}=0$ for all distinct $i,j\in[n]$.

$$-1=\sum_{m=1}^{n-1}{\left(\left(F_{(m+1)(m+1)}-\sum_{j=1}^{m}{F_{jm}F_{(m+1)(m+1)}}\right)^{2}-\sum_{j=1}^{m}{\sum_{k\in[m]\setminus\{j\}}{T_{jmk}}}\right)}-\sum_{j=1}^{n}{F_{jn}}$$

Observe that $F_{(m+1)(m+1)}^{2}=F_{(m+1)(m+1)}$, $F_{jm}^{2}=F_{jm}$, and for all distinct $j,j^{\prime}\in[m]$, $F_{jm}F_{j^{\prime}m}=0$. Thus,

$$\left(F_{(m+1)(m+1)}-\sum_{j=1}^{m}{F_{jm}F_{(m+1)(m+1)}}\right)^{2}=\left(F_{(m+1)(m+1)}-\sum_{j=1}^{m}{F_{jm}F_{(m+1)(m+1)}}\right)$$

By Lemma 7, $F_{jm}F_{(m+1)(m+1)}+\sum_{k\in[m]\setminus\{j\}}{T_{jmk}}=F_{jm}-F_{j(m+1)}$. This implies that

$$\begin{aligned}
-\sum_{m=1}^{n-1}{\sum_{j=1}^{m}{\left(F_{jm}F_{(m+1)(m+1)}+\sum_{k\in[m]\setminus\{j\}}{T_{jmk}}\right)}} &=-\sum_{j=1}^{n-1}{\sum_{m=j}^{n-1}{\left(F_{jm}-F_{j(m+1)}\right)}}\\
&=\sum_{j=1}^{n-1}{(F_{jn}-F_{jj})}
\end{aligned}$$

Now observe that $\sum_{m=1}^{n-1}{F_{(m+1)(m+1)}}=\sum_{j=2}^{n-1}{F_{jj}}+F_{nn}=-1+\sum_{j=1}^{n-1}{F_{jj}}+F_{nn}$, where the last step uses $F_{11}=1$. Putting everything together, the equality holds. Since each $T_{jmk}$ is a weakening of a transitivity axiom and each $F_{jn}$ is a non-minimality axiom, this indeed gives a sum of squares proof that these axioms are unsatisfiable. ∎
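The identity of Theorem 5 can also be checked by brute force for small $n$: on every 0/1 assignment with $x_{ji}=1-x_{ij}$, the right-hand side should evaluate to $-1$. The sketch below does this under our own encoding (an assumption, not the paper's notation), where `x[(a, b)]` holds the value of $x_{ab}$; the functions `F` and `T` follow Definition 18.

```python
from itertools import combinations, product

def assignments(n):
    """Yield every 0/1 assignment, with x[(j, i)] = 1 - x[(i, j)]."""
    pairs = list(combinations(range(1, n + 1), 2))
    for bits in product((0, 1), repeat=len(pairs)):
        x = {}
        for (i, j), b in zip(pairs, bits):
            x[(i, j)], x[(j, i)] = b, 1 - b
        yield x

def F(x, j, m):
    """F_{jm}: 1 if j is the first element among 1..m, else 0."""
    return int(all(x[(j, i)] for i in range(1, m + 1) if i != j))

def T(x, j, m, k):
    """T_{jmk} = F_{jm} x_{(m+1)j} x_{k(m+1)} prod_{i in [k-1] minus {j}} x_{(m+1)i}."""
    tail = int(all(x[(m + 1, i)] for i in range(1, k) if i != j))
    return F(x, j, m) * x[(m + 1, j)] * x[(k, m + 1)] * tail

def right_hand_side(x, n):
    total = 0
    for m in range(1, n):
        last = F(x, m + 1, m + 1)
        g = last - sum(F(x, j, m) * last for j in range(1, m + 1))
        total += g * g  # the squared term
        total -= sum(T(x, j, m, k)
                     for j in range(1, m + 1)
                     for k in range(1, m + 1) if k != j)
    total -= sum(F(x, j, n) for j in range(1, n + 1))
    return total

for n in range(2, 6):
    print(n, all(right_hand_side(x, n) == -1 for x in assignments(n)))
```

Agreement on all 0/1 assignments checks the equality as functions on the Boolean cube, which is what the statement "modulo the axioms" amounts to for multilinear representatives; it is a sanity check for small $n$, not a replacement for the proof.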

6 Open Problems

Our work raises a number of open problems. For the pigeonhole principle, while we have proved an exponential total coefficient size lower bound on Nullstellensatz proofs, there is a lot of room for further work. Some questions are as follows.

  1. For the pigeonhole principle, our lower bound is $2^{\Omega(n)}$ while the trivial upper bound is $O(n!)$. Can we improve the lower and/or upper bound?

  2. If we increase the number of pigeons from $n$ to $n+1$ while still having $n-1$ holes, our lower bound proof no longer applies. Can we prove a total coefficient size lower bound on Nullstellensatz when there are $n+1$ or more pigeons? How does the minimum total coefficient size of a proof depend on the number of pigeons?

  3. Can we show total coefficient size lower bounds for resolution-like proofs of the pigeonhole principle?

  4. How much of an effect would adding the axioms that pigeons can only go to one hole have on the minimum total coefficient size needed to prove the pigeonhole principle?

We are still far from understanding the total coefficient size of proofs for the ordering principle. Two natural questions are as follows.

  1. Can we prove superpolynomial lower bounds on the total coefficient size of Nullstellensatz proofs for the ordering principle and/or improve the $O(2^{n})$ upper bound?

  2. Are there resolution-like proofs for the ordering principle with polynomial total coefficient size? If so, this shows that the seemingly dynamic $O(n^{3})$ size resolution proof of the ordering principle [16] can be captured by a one line resolution-like proof. If not, this gives a natural example separating resolution proof size and the total coefficient size of resolution-like proofs.

Finally, we can ask what relationships and separations we can show between all of these different proof systems. Some questions are as follows.

  1. Are there natural examples where the minimum total coefficient size is very different (either larger or smaller) than the minimum size for Nullstellensatz, resolution-like, or sum of squares proofs?

  2. Can the minimum total coefficient size of a strong proof system be used to lower bound the size of another proof system? For example, can resolution proof size be lower bounded by the minimum total coefficient size of a sum of squares proof or can we find an example where there is a polynomial size resolution proof but any sum of squares proof has superpolynomial total coefficient size?

References

  • BCE+ [98] Paul Beame, Stephen Cook, Jeff Edmonds, Russell Impagliazzo, and Toniann Pitassi. The relative complexity of np search problems. J. Comput. Syst. Sci., 57(1):3–19, 1998.
  • BIK+ [94] P. Beame, R. Impagliazzo, J. Krajicek, T. Pitassi, and P. Pudlak. Lower bounds on hilbert’s nullstellensatz and propositional proofs. In Proceedings of the 35th Annual Symposium on Foundations of Computer Science (FOCS), pages 794–806, 1994.
  • BIK+ [97] S. Buss, R. Impagliazzo, J. Krajíček, P. Pudlák, A. A. Razborov, and J. Sgall. Proof complexity in algebraic systems and bounded depth frege systems with modular counting. Comput. Complex., 6(3):256–298, 1997.
  • BLM [07] María Luisa Bonet, Jordi Levy, and Felip Manyà. Resolution for max-sat. Artificial Intelligence, 171(8):606–618, 2007.
  • BP [96] S.R. Buss and T. Pitassi. Good degree bounds on nullstellensatz refutations of the induction principle. In Proceedings of Computational Complexity (Formerly Structure in Complexity Theory), pages 233–242, 1996.
  • Bro [87] W. Dale Brownawell. Bounds for the degrees in the nullstellensatz. Annals of Mathematics, 126(3):577–591, 1987.
  • Bus [96] Samuel R. Buss. Lower bounds on nullstellensatz proofs via designs. volume 39 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 59–71. DIMACS/AMS, 1996.
  • CEI [96] Matthew Clegg, Jeffery Edmonds, and Russell Impagliazzo. Using the groebner basis algorithm to find proofs of unsatisfiability. In Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing (STOC), page 174–183, 1996.
  • dRNMR [19] Susanna F. de Rezende, Jakob Nordström, Or Meir, and Robert Robere. Nullstellensatz Size-Degree Trade-offs from Reversible Pebbling. In 34th Computational Complexity Conference (CCC), volume 137, pages 18:1–18:16, 2019.
  • Her [26] Grete Hermann. Die frage der endlich vielen schritte in der theorie der polynomideale. Mathematische Annalen, 95:736–788, 1926.
  • IPS [99] Russell Impagliazzo, Pavel Pudlák, and Jiří Sgall. Lower bounds for the polynomial calculus and the gröbner basis algorithm. Comput. Complex., 8(2):127–144, nov 1999.
  • Kol [88] János Kollár. Sharp effective nullstellensatz. Journal of the American Mathematical Society, pages 963–975, 1988.
  • Pot [20] Aaron Potechin. Sum of Squares Bounds for the Ordering Principle. In 35th Computational Complexity Conference (CCC), volume 169, pages 38:1–38:37, 2020.
  • PR [18] Toniann Pitassi and Robert Robere. Lifting nullstellensatz to monotone span programs over any field. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing (STOC), page 1207–1219, 2018.
  • Raz [98] Alexander A Razborov. Lower bounds for the polynomial calculus. computational complexity, 7(4):291–324, 1998.
  • Stå [96] Gunnar Stålmarck. Short resolution proofs for a sequence of tricky formulas. Acta Informatica, 33(3):277–280, 1996.