Bounds on the Total Coefficient Size of Nullstellensatz Proofs of the Pigeonhole Principle and the Ordering Principle

Aaron Potechin and Aaron Zhang

Abstract

In this paper, we investigate the total coefficient size of Nullstellensatz proofs. We show that Nullstellensatz proofs of the pigeonhole principle on $n$ pigeons require total coefficient size $2^{\Omega(n)}$ and that there exist Nullstellensatz proofs of the ordering principle on $n$ elements with total coefficient size $2^{n}-n$ .

Acknowledgement: This research was supported by NSF grant CCF-2008920 and NDSEG fellowship F-9422254702.

1 Introduction

Given a system $\{p_{i}=0:i\in[m]\}$ of polynomial equations over an algebraically closed field, a Nullstellensatz proof of infeasibility is an equality of the form $1=\sum_{i=1}^{m}{p_{i}{q_{i}}}$ for some polynomials $\{q_{i}:i\in[m]\}$ . Hilbert’s Nullstellensatz says that the Nullstellensatz proof system is complete, i.e. a system of polynomial equations has no solutions over an algebraically closed field if and only if there is a Nullstellensatz proof of infeasibility. (Strictly speaking, this is the weak form of Hilbert’s Nullstellensatz. The full statement says that given polynomials $p_{1},\ldots,p_{m}$ and another polynomial $p$ , if $p(x)=0$ for all $x$ such that $p_{i}(x)=0$ for each $i\in[m]$ then there exists a natural number $r$ such that $p^{r}$ is in the ideal generated by $p_{1},\ldots,p_{m}$ .) However, Hilbert’s Nullstellensatz does not give any bounds on the degree or size needed for Nullstellensatz proofs.

The degree of Nullstellensatz proofs has been extensively studied. Grete Hermann showed a doubly exponential degree upper bound for the ideal membership problem [10] which implies the same upper bound for Nullstellensatz proofs. Several decades later, W. Dale Brownawell gave an exponential upper bound on the degree required for Nullstellensatz proofs over algebraically closed fields of characteristic zero [6]. A year later, János Kollár showed that this result holds for all algebraically closed fields [12].

For specific problems, the degree of Nullstellensatz proofs can be analyzed using designs [7]. Using designs, Nullstellensatz degree lower bounds have been shown for many problems including the pigeonhole principle, the induction principle, the housesitting principle, and the mod $m$ matching principles [2, 1, 3, 5, 8]. More recent work showed that there is a close connection between Nullstellensatz degree and reversible pebbling games [9] and that lower bounds on Nullstellensatz degree can be lifted to lower bounds on monotone span programs, monotone comparator circuits, and monotone switching networks [14].

For analyzing the size of Nullstellensatz proofs, a powerful technique is the size-degree tradeoff shown by Russell Impagliazzo, Pavel Pudlák, and Jiří Sgall for polynomial calculus [11]. This tradeoff says that if there is a size $S$ polynomial calculus proof then there is a polynomial calculus proof of degree $O(\sqrt{n\log{S}})$ . Thus, if we have an $\Omega(n)$ degree lower bound for polynomial calculus, this implies a $2^{\Omega(n)}$ size lower bound for polynomial calculus (which also holds for Nullstellensatz as Nullstellensatz is a weaker proof system). However, the size-degree tradeoff does not give any size lower bound when the degree is $O(\sqrt{n})$ and we know of very few other techniques for analyzing the size of Nullstellensatz proofs.

In this paper, we instead investigate the total coefficient size of Nullstellensatz proofs. We have two reasons for this. First, total coefficient size is interesting in its own right and to the best of our knowledge, it has not yet been explored. Second, total coefficient size may give insight into proof size in settings where we cannot apply the size-degree tradeoff and thus do not have good size lower bounds.

Remark 1.

Note that Nullstellensatz size lower bounds do not imply total coefficient size lower bounds because we could have a proof with many monomials but a small coefficient on each monomial. Thus, the exponential size lower bounds for the pigeonhole principle from Razborov’s $\Omega(n)$ degree lower bound for polynomial calculus [15] and the size-degree tradeoff [11] do not imply total coefficient size lower bounds for the pigeonhole principle.

1.1 Our results

In this paper, we consider two principles, the pigeonhole principle and the ordering principle. We show an exponential lower bound on the total coefficient size of Nullstellensatz proofs of the pigeonhole principle and we show an exponential upper bound on the total coefficient size of Nullstellensatz proofs of the ordering principle. More precisely, we show the following bounds.

Theorem 1.

For all $n\geq 2$ , any Nullstellensatz proof of the pigeonhole principle with $n$ pigeons and $n-1$ holes has total coefficient size $\Omega\left(n^{\frac{3}{4}}\left(\frac{2}{\sqrt{e}}\right)^{n}\right)$ .

Theorem 2.

For all $n\geq 3$ , there is a Nullstellensatz proof of the ordering principle on $n$ elements with size and total coefficient size $2^{n}-n$ .

After showing these bounds, we discuss total coefficient size for stronger proof systems. We observe that if we consider a stronger proof system which we call resolution-like proofs, our lower bound proof for the pigeonhole principle no longer works. We also observe that even though resolution is a dynamic proof system, the $O(n^{3})$ size resolution proof of the ordering principle found by Gunnar Stålmark [16] can be captured by a one line sum of squares proof.

2 Nullstellensatz total coefficient size

We start by defining total coefficient size for Nullstellensatz proofs and describing a linear program for finding the minimum total coefficient size of a Nullstellensatz proof.

Definition 1.

Given a polynomial $f$ , we define the total coefficient size $T(f)$ of $f$ to be the sum of the magnitudes of the coefficients of $f$ . For example, if $f(x,y,z)=2{x^{2}}y-3xyz+5z^{5}$ then $T(f)=2+3+5=10$ .
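As a small illustration of Definition 1, the following Python sketch computes $T(f)$ for the example above. The encoding of a polynomial as a dictionary from exponent tuples to coefficients is an assumption made only for this example, and the function name is hypothetical.

```python
def total_coefficient_size(poly):
    """Sum of the magnitudes of the coefficients of a polynomial.

    `poly` maps a monomial, encoded here as a tuple of exponents
    (an illustrative encoding), to its coefficient.
    """
    return sum(abs(c) for c in poly.values())

# f(x, y, z) = 2 x^2 y - 3 x y z + 5 z^5, exponents ordered as (x, y, z)
f = {(2, 1, 0): 2, (1, 1, 1): -3, (0, 0, 5): 5}
print(total_coefficient_size(f))  # 10
```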

Definition 2.

Given a system $\{p_{i}=0:i\in[m]\}$ of $m$ polynomial equations, a Nullstellensatz proof of infeasibility is an equality of the form

1=\sum_{i=1}^{m}{p_{i}{q_{i}}}

for some polynomials $\{q_{i}:i\in[m]\}$ . We define the total coefficient size of such a Nullstellensatz proof to be $\sum_{i=1}^{m}{T(q_{i})}$ .

The following terminology will be useful.

Definition 3.

Given a system $\{p_{i}=0:i\in[m]\}$ of polynomial equations, we call each of the equations $p_{i}=0$ an axiom. For each axiom $p_{i}=0$ , we define a weakening of this axiom to be an equation of the form $rp_{i}=0$ for some monomial $r$ .

Remark 2.

We do not include the total coefficient size of $p_{i}$ in the total coefficient size of the proof as we want to focus on the complexity of the proof as opposed to the complexity of the axioms. That said, in this paper we only consider systems of polynomial equations where each $p_{i}$ is a monomial, so this choice does not matter.

The minimum total coefficient size of a Nullstellensatz proof can be found using the following linear program. In general, this linear program will have infinite size, but as we discuss below, it has finite size when the variables are Boolean.

Primal: Minimize $\sum_{i=1}^{m}{T(q_{i})}$ subject to $\sum_{i=1}^{m}{{p_{i}}{q_{i}}}=1$ . More precisely, writing $q_{i}=\sum_{\text{monomials }r}{c_{ir}r}$ , we minimize $\sum_{i=1}^{m}{\sum_{\text{monomials }r}{b_{ir}}}$ subject to the constraints that

1. $b_{ir}\geq-c_{ir}$ and $b_{ir}\geq c_{ir}$ for all $i\in[m]$ and monomials $r$ .
2. $\sum_{i=1}^{m}{\sum_{\text{monomials }r}{c_{ir}{r}p_{i}}}=1$ .

Dual: Maximize $D(1)$ subject to the constraints that

1. $D$ is a linear map from polynomials to $\mathbb{R}$ .
2. For each $i\in[m]$ and each monomial $r$ , $|D(rp_{i})|\leq 1$ .

Weak duality, which is what we need for our lower bound on the pigeonhole principle, can be seen directly as follows.

Proposition 1.

If $D$ is a linear map from polynomials to $\mathbb{R}$ such that $|D(rp_{i})|\leq 1$ for all $i\in[m]$ and all monomials $r$ then any Nullstellensatz proof of infeasibility has total coefficient size at least $D(1)$ .

Proof.

Given a Nullstellensatz proof $1=\sum_{i=1}^{m}{{p_{i}}{q_{i}}}$ , applying $D$ to it gives

D(1)=\sum_{i=1}^{m}{D({p_{i}}{q_{i}})}\leq\sum_{i=1}^{m}{T(q_{i})}

∎
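The inequality in the proof of Proposition 1 can be spelled out term by term. Writing $q_{i}=\sum_{\text{monomials }r}{c_{ir}r}$ as in the primal program, linearity of $D$ and the dual constraint $|D(rp_{i})|\leq 1$ give the following for each $i\in[m]$ (a worked expansion of the step, using only the notation already introduced):

```latex
D(p_{i}q_{i})
  =\sum_{\text{monomials }r}{c_{ir}D(rp_{i})}
  \leq\sum_{\text{monomials }r}{|c_{ir}|\cdot|D(rp_{i})|}
  \leq\sum_{\text{monomials }r}{|c_{ir}|}
  =T(q_{i})
```

Summing over $i\in[m]$ yields $D(1)\leq\sum_{i=1}^{m}{T(q_{i})}$ .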

2.1 Special case: Boolean variables

In this paper, we only consider problems where all of our variables are Boolean, so we make specific definitions for this case. In particular, we allow monomials to contain terms of the form $(1-x_{i})$ as well as $x_{i}$ and we allow the Boolean axioms $x_{i}^{2}=x_{i}$ to be used for free. We also observe that we can define a linear map $D$ from polynomials to $\mathbb{R}$ by assigning a value $D(x)$ to each input $x$ .

Definition 4.

Given Boolean variables $x_{1},\ldots,x_{N}$ where we have that $x_{i}=1$ if $x_{i}$ is true and $x_{i}=0$ if $x_{i}$ is false, we define a monomial to be a product of the form $\left(\prod_{i\in S}{x_{i}}\right)\left(\prod_{j\in T}{(1-x_{j})}\right)$ for some disjoint subsets $S,T$ of $[N]$ .

Definition 5.

Given a Boolean variable $x$ , we use $\bar{x}$ as shorthand for the negation $1-x$ of $x$ .

Definition 6.

Given a set of polynomial equations $\{p_{i}=0:i\in[m]\}$ together with Boolean axioms $\{x_{j}^{2}-x_{j}=0:j\in[N]\}$ , we define the total coefficient size of a Nullstellensatz proof

1=\sum_{i=1}^{m}{{p_{i}}{q_{i}}}+\sum_{j=1}^{N}{{g_{j}}(x_{j}^{2}-x_{j})}

to be $\sum_{i=1}^{m}{T(q_{i})}$ . In other words, we allow the Boolean axioms $\{x_{j}^{2}-x_{j}=0:j\in[N]\}$ to be used for free.

Remark 3.

For the problems we consider in this paper, all of our non-Boolean axioms are monomials, so there is actually no need to use the Boolean axioms.

Remark 4.

We allow monomials to contain terms of the form $(1-x_{i})$ and allow the Boolean axioms to be used for free in order to avoid spurious lower bounds coming from difficulties in manipulating the Boolean variables rather than handling the non-Boolean axioms. In particular, with these adjustments, when the non-Boolean axioms are monomials, the minimum total coefficient size of a Nullstellensatz proof is upper bounded by the minimum size of a tree-resolution proof.

Since the Boolean axioms $\{x_{j}^{2}-x_{j}=0:j\in[N]\}$ can be used for free, to specify a linear map $D$ from polynomials to $\mathbb{R}$ , it is necessary and sufficient to specify the value of $D$ on each input $x\in\{0,1\}^{N}$ .

Definition 7.

Given a function $D:\{0,1\}^{N}\to\mathbb{R}$ , we can view $D$ as a linear map from polynomials to $\mathbb{R}$ by taking $D(f)=\sum_{x\in\{0,1\}^{N}}{f(x)D(x)}$ .
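To make Definition 7 concrete, here is a minimal Python sketch for $N=2$ . The point values chosen for $D$ are arbitrary illustrative numbers, and the helper name `apply_D` is hypothetical.

```python
from itertools import product

def apply_D(D, f, N):
    """D(f) = sum over x in {0,1}^N of f(x) * D(x)."""
    return sum(f(x) * D[x] for x in product((0, 1), repeat=N))

N = 2
# Arbitrary illustrative point values D(x) for x in {0,1}^2.
D = {(0, 0): 3, (0, 1): -1, (1, 0): 2, (1, 1): 5}

def monomial(x):
    """The monomial x_1 * (1 - x_2), which is 1 only at x = (1, 0)."""
    return x[0] * (1 - x[1])

def one(x):
    """The constant polynomial 1."""
    return 1

print(apply_D(D, monomial, N))  # 2, the value of D at x = (1, 0)
print(apply_D(D, one, N))       # 9, the sum of all point values
```

In particular, $D(1)$ is the sum of the point values, which is the quantity the dual program maximizes.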

3 Total coefficient size lower bound for the pigeonhole principle

In this section, we prove Theorem 1, our total coefficient size lower bound on the pigeonhole principle. We start by formally defining the pigeonhole principle.

Definition 8 (pigeonhole principle ( $\mathrm{PHP}_{n}$ )).

Intuitively, the pigeonhole principle says that if $n$ pigeons are assigned to $n-1$ holes, then some hole must have more than one pigeon. Formally, for $n\geq 1$ , we define $\mathrm{PHP}_{n}$ to be the statement that the following system of axioms is infeasible:

•

For each $i\in[n]$ and $j\in[n-1]$ , we have a variable $x_{i,j}$ . $x_{i,j}=1$ represents pigeon $i$ being in hole $j$ , and $x_{i,j}=0$ represents pigeon $i$ not being in hole $j$ .
•

For each $i\in[n]$ , we have the axiom $\prod_{j=1}^{n-1}{\bar{x}_{i,j}}=0$ representing the constraint that each pigeon must be in at least one hole (recall that $\bar{x}_{i,j}=1-x_{i,j}$ ).
•

For each pair of distinct pigeons $i_{1},i_{2}\in[n]$ and each hole $j\in[n-1]$ , we have the axiom $x_{i_{1},j}x_{i_{2},j}=0$ representing the constraint that pigeons $i_{1}$ and $i_{2}$ cannot both be in hole $j$ .

We prove our lower bound on the total coefficient size complexity of $\text{PHP}_{n}$ by constructing and analyzing a dual solution $D$ . In our dual solution, the only assignments $x$ for which $D(x)\neq 0$ are those where each pigeon goes to exactly one hole (i.e., for each pigeon $i$ , exactly one of the $x_{i,j}$ is 1). Note that there are $(n-1)^{n}$ such assignments. In the rest of this section, when we refer to assignments or write a summation or expectation over assignments $x$ , we refer specifically to these $(n-1)^{n}$ assignments.

Recall that the dual constraints are

D(W)=\sum_{\text{assignments }x}{D(x)W(x)}\in[-1,1]

for all weakenings $W$ of an axiom. Note that since $D(x)$ is only nonzero for assignments $x$ where each pigeon goes to exactly one hole, for any weakening $W$ of an axiom of the form $\prod_{j=1}^{n-1}{\bar{x}_{i,j}}=0$ , $D(W)=0$ . Thus, it is sufficient to consider weakenings $W$ of the axioms $x_{i_{1},j}x_{i_{2},j}=0$ . Further note that if $|D(W)|>1$ for some weakening $W$ then we can rescale $D$ by dividing by $\max_{W}{|D(W)|}$ . Thus, we can rewrite the objective value of the dual program as $\frac{D(1)}{\max_{W}{|D(W)|}}$ . Letting $\mathbb{E}$ denote the expectation over a uniform assignment where each pigeon goes to exactly one hole, $\frac{D(1)}{\max_{W}{|D(W)|}}=\frac{\mathbb{E}(D)}{\max_{W}{|\mathbb{E}(DW)|}}$ so it is sufficient to construct $D$ and analyze $\mathbb{E}(D)$ and $\max_{W}{|\mathbb{E}(DW)|}$ .

Before constructing and analyzing $D$ , we provide some intuition for our construction. The idea is that, if we consider a subset of $n-1$ pigeons, then $D$ should behave like the indicator function for whether those $n-1$ pigeons all go to different holes. More concretely, for any polynomial $p$ which does not depend on some pigeon $i$ (i.e. $p$ does not contain $x_{i,j}$ or $\bar{x}_{i,j}$ for any $j\in[n-1]$ ),

\mathbb{E}(Dp)=\frac{(n-1)!}{(n-1)^{n-1}}\mathbb{E}(p\mid\text{all pigeons in }[n]\setminus\{i\}\text{ go to different holes})

Given this intuition, we now present our construction. Our dual solution $D$ will be a linear combination of the following functions:

Definition 9 (functions $J_{S}$ ).

Let $S\subsetneq[n]$ be a subset of pigeons of size at most $n-1$ . We define the function $J_{S}$ that maps assignments to $\{0,1\}$ . For an assignment $x$ , $J_{S}(x)=1$ if all pigeons in $S$ are in different holes according to $x$ , and $J_{S}(x)=0$ otherwise. ∎

Note that if $|S|=0$ or $|S|=1$ , then $J_{S}$ is the constant function 1. In general, the expectation of $J_{S}$ over a uniform assignment is $\mathbb{E}(J_{S})=\left(\prod_{k=1}^{|S|}(n-k)\right)/(n-1)^{|S|}$ .
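The formula for $\mathbb{E}(J_{S})$ can be checked by brute force for small $n$ . The sketch below is an illustrative sanity check rather than part of the argument; it indexes pigeons and holes from 0, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import prod

def expectation_J(n, S):
    """Brute-force E(J_S) over the (n-1)^n assignments of pigeons to single holes."""
    hits = 0
    total = 0
    for x in product(range(n - 1), repeat=n):   # x[i] = hole of pigeon i
        total += 1
        used = [x[i] for i in S]
        if len(set(used)) == len(used):          # pigeons in S are in different holes
            hits += 1
    return Fraction(hits, total)

def formula_J(n, size):
    """(prod_{k=1}^{|S|} (n-k)) / (n-1)^{|S|}."""
    return Fraction(prod(n - k for k in range(1, size + 1)), (n - 1) ** size)

n = 4
for size in range(n):                            # |S| ranges over 0, ..., n-1
    S = tuple(range(size))
    assert expectation_J(n, S) == formula_J(n, size)
print(formula_J(4, 3))  # 2/9
```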

Definition 10 (dual solution $D$ ).

Our dual solution $D$ is:

D=\sum_{S\subsetneq[n]}c_{S}J_{S},

where the coefficients $c_{S}$ are $c_{S}=\frac{(-1)^{n-1-|S|}(n-1-|S|)!}{(n-1)^{n-1-|S|}}$ .

We will lower-bound the dual value $\mathbb{E}(D)/\max_{W}|\mathbb{E}(DW)|$ by computing $\mathbb{E}(D)$ and then upper-bounding $\max_{W}|\mathbb{E}(DW)|$ . In both calculations, we will use the following key property of $D$ , which we introduced in our intuition for the construction:

Lemma 1.

If $p$ is a polynomial which does not depend on pigeon $i$ (i.e. $p$ does not contain any variables of the form $x_{i,j}$ or $\bar{x}_{i,j}$ ) then $\mathbb{E}(Dp)=\mathbb{E}(J_{[n]\setminus\{i\}}p)$ .

Proof.

Without loss of generality, suppose $p$ does not contain any variables of the form $x_{1,j}$ or $\bar{x}_{1,j}$ . Let $T$ be any subset of pigeons that does not contain pigeon 1 and that has size at most $n-2$ . Observe that

\mathbb{E}({J_{T\cup\{1\}}}p)=\frac{n-1-|T|}{n-1}\mathbb{E}({J_{T}}p)

because regardless of the locations of the pigeons in $T$ , the probability that pigeon $1$ goes to a different hole is $\frac{n-1-|T|}{n-1}$ and $p$ does not depend on the location of pigeon $1$ . Since

\begin{aligned}
c_{T\cup\{1\}}&=\frac{(-1)^{n-2-|T|}(n-2-|T|)!}{(n-1)^{n-2-|T|}}\\
&=-\frac{n-1}{n-1-|T|}\cdot\frac{(-1)^{n-1-|T|}(n-1-|T|)!}{(n-1)^{n-1-|T|}}=-\frac{n-1}{n-1-|T|}c_{T}
\end{aligned}

we have that for all $T\subsetneq\{2,\dots,n\}$ ,

\mathbb{E}(c_{T\cup\{1\}}{J_{T\cup\{1\}}}p)+\mathbb{E}(c_{T}{J_{T}}p)=0

Thus, all terms except for $J_{\{2,3,\ldots,n\}}$ cancel. Since $c_{\{2,3,\ldots,n\}}=1$ , we have that $\mathbb{E}(Dp)=\mathbb{E}(J_{\{2,3,\ldots,n\}}p)$ , as needed. ∎

The value of $\mathbb{E}(D)$ follows immediately:

Corollary 1.

\mathbb{E}(D)=\frac{(n-2)!}{(n-1)^{n-2}}.

Proof.

Let $p=1$ . By Lemma 1, $\mathbb{E}(D)=\mathbb{E}(J_{\{2,\dots,n\}})=(n-2)!/(n-1)^{n-2}$ . ∎
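Lemma 1 and Corollary 1 can be verified numerically for small $n$ . The following Python sketch builds $D$ from Definition 10 by brute force (pigeons and holes are indexed from 0, and all function names are hypothetical) and checks both statements for $n=4$ , using the test polynomial $p=x_{1,0}$ , which does not depend on pigeon 0.

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial

def dual_solution(n):
    """{assignment: D(assignment)}, where x[i] is the hole of pigeon i."""
    D = {}
    for x in product(range(n - 1), repeat=n):
        value = Fraction(0)
        for size in range(n):                        # proper subsets S of [n]
            k = n - 1 - size
            c_S = Fraction((-1) ** k * factorial(k), (n - 1) ** k)
            for S in combinations(range(n), size):
                if len({x[i] for i in S}) == size:   # J_S(x) = 1
                    value += c_S
        D[x] = value
    return D

n = 4
D = dual_solution(n)
num_assignments = (n - 1) ** n

# Corollary 1: E(D) = (n-2)! / (n-1)^(n-2)
E_D = sum(D.values()) / num_assignments
assert E_D == Fraction(factorial(n - 2), (n - 1) ** (n - 2))

# Lemma 1 with p = x_{1,0}: E(D p) = E(J_{[n] \ {0}} p)
lhs = sum(v for x, v in D.items() if x[1] == 0) / num_assignments
rhs = Fraction(
    sum(1 for x in D if x[1] == 0 and len(set(x[1:])) == n - 1),
    num_assignments,
)
assert lhs == rhs
print(E_D, lhs)  # 2/9 2/27
```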

3.1 Upper bound on $\max_{W}|\mathbb{E}(DW)|$

We introduce the following notation:

Definition 11 ( $H_{W,i}$ ).

Given a weakening $W$ , we define a set of holes $H_{W,i}\subseteq[n-1]$ for each pigeon $i\in[n]$ so that $W(x)=1$ if and only if each pigeon $i\in[n]$ is mapped to one of the holes in $H_{W,i}$ . More precisely,

•

If $W$ contains terms $x_{i,j_{1}}$ and $x_{i,j_{2}}$ for distinct holes $j_{1},j_{2}$ , then $H_{W,i}=\emptyset$ (i.e. it is impossible that $W(x)=1$ because pigeon $i$ cannot go to both holes $j_{1}$ and $j_{2}$ ). Similarly, if $W$ contains both $x_{i,j}$ and $\bar{x}_{i,j}$ for some $j$ then $H_{W,i}=\emptyset$ (i.e. it is impossible for pigeon $i$ to both be in hole $j$ and not be in hole $j$ ).
•

If $W$ contains exactly one term of the form $x_{i,j}$ , then $H_{W,i}=\{j\}$ (i.e., for all $x$ such that $W(x)=1$ , pigeon $i$ goes to hole $j$ ).
•

If $W$ contains no terms of the form $x_{i,j}$ , then $H_{W,i}$ is the subset of holes $j$ such that $W$ does not contain the term $\bar{x}_{i,j}$ . (i.e., if $W$ contains the term $\bar{x}_{i,j}$ , then for all $x$ such that $W(x)=1$ , pigeon $i$ does not go to hole $j$ .)

The key property we will use to bound $\max_{W}|\mathbb{E}(DW)|$ follows immediately from Lemma 1:

Lemma 2.

Let $W$ be a weakening. If there exists some pigeon $i\in[n]$ such that $H_{W,i}=[n-1]$ (i.e., $W$ does not contain any terms of the form $x_{i,j}$ or $\bar{x}_{i,j}$ ), then $\mathbb{E}(DW)=0$ .

Proof.

Without loss of generality, suppose $W$ is a weakening of the axiom $x_{2,1}x_{3,1}=0$ and $H_{W,1}=[n-1]$ . By Lemma 1, $\mathbb{E}(DW)=\mathbb{E}(J_{\{2,\dots,n\}}W)$ . However, $\mathbb{E}(J_{\{2,\dots,n\}}W)=0$ because if $W=1$ then pigeons 2 and 3 must both go to hole 1. ∎

We make the following definition and then state a corollary of Lemma 2.

Definition 12 ( $W^{\mathrm{flip}}_{S}$ ).

Let $W$ be a weakening of the axiom $x_{i_{1},j}x_{i_{2},j}=0$ for pigeons $i_{1},i_{2}$ and hole $j$ . Let $S\subseteq[n]\setminus\{i_{1},i_{2}\}$ . We define $W^{\mathrm{flip}}_{S}$ , which is also a weakening of the axiom $x_{i_{1},j}x_{i_{2},j}=0$ , as follows.

•

For each pigeon $i_{3}\in S$ , we define $W^{\mathrm{flip}}_{S}$ so that $H_{W^{\mathrm{flip}}_{S},i_{3}}=[n-1]\setminus H_{W,i_{3}}$ .
•

For each pigeon $i_{3}\notin S$ , we define $W^{\mathrm{flip}}_{S}$ so that $H_{W^{\mathrm{flip}}_{S},i_{3}}=H_{W,i_{3}}$ .

(Technically, there may be multiple ways to define $W^{\mathrm{flip}}_{S}$ to satisfy these properties; we can arbitrarily choose any such definition.) ∎

In other words, $W^{\mathrm{flip}}_{S}$ is obtained from $W$ by flipping the sets of holes that the pigeons in $S$ can go to in order to make the weakening evaluate to 1. Now we state a corollary of Lemma 2:

Corollary 2.

Let $W$ be a weakening of the axiom $x_{i_{1},j}x_{i_{2},j}=0$ for pigeons $i_{1},i_{2}$ and hole $j$ . Let $S\subseteq[n]\setminus\{i_{1},i_{2}\}$ . Then

\mathbb{E}\left(DW^{\mathrm{flip}}_{S}\right)=(-1)^{|S|}\cdot\mathbb{E}(DW).

Proof.

It suffices to show that for $i_{3}\in[n]\setminus\{i_{1},i_{2}\}$ , we have $\mathbb{E}\left(DW^{\mathrm{flip}}_{\{i_{3}\}}\right)=-\mathbb{E}(DW)$ . Indeed, $W+W^{\mathrm{flip}}_{\{i_{3}\}}$ is a weakening satisfying $H_{W+W^{\mathrm{flip}}_{\{i_{3}\}},i_{3}}=[n-1]$ . Therefore, by Lemma 2, $\mathbb{E}\left(D\left(W+W^{\mathrm{flip}}_{\{i_{3}\}}\right)\right)=0$ . ∎
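Lemma 2 and Corollary 2 can be illustrated on a concrete weakening. The sketch below is a sanity check under stated assumptions: pigeons and holes are indexed from 0, a weakening is represented only by its hole sets $H_{W,i}$ (which is all that matters on the assignments where $D$ is nonzero), and the function names are hypothetical. It takes $n=4$ and the weakening of $x_{0,0}x_{1,0}=0$ with $H_{W,2}=\{1\}$ and $H_{W,3}=\{2\}$ .

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial

def dual_solution(n):
    """{assignment: D(assignment)}, where x[i] is the hole of pigeon i."""
    D = {}
    for x in product(range(n - 1), repeat=n):
        value = Fraction(0)
        for size in range(n):                        # proper subsets S of [n]
            k = n - 1 - size
            c_S = Fraction((-1) ** k * factorial(k), (n - 1) ** k)
            for S in combinations(range(n), size):
                if len({x[i] for i in S}) == size:   # J_S(x) = 1
                    value += c_S
        D[x] = value
    return D

def expect_DW(n, D, i1, i2, j, H):
    """E(D W) for the weakening of x_{i1,j} x_{i2,j} = 0 whose remaining
    pigeons i are restricted to the hole sets H[i]."""
    total = Fraction(0)
    for x, value in D.items():
        if x[i1] == j and x[i2] == j and all(x[i] in H[i] for i in H):
            total += value
    return total / (n - 1) ** n

n = 4
D = dual_solution(n)
holes = set(range(n - 1))
H = {2: {1}, 3: {2}}

base = expect_DW(n, D, 0, 1, 0, H)
# Corollary 2: flipping the hole set of pigeon 2 negates E(D W).
flipped = expect_DW(n, D, 0, 1, 0, {2: holes - H[2], 3: H[3]})
# Lemma 2: if pigeon 2 is unconstrained, E(D W) vanishes.
free = expect_DW(n, D, 0, 1, 0, {2: holes, 3: H[3]})
print(base, flipped, free)  # 1/81 -1/81 0
```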

Using Corollary 2, we can bound $\max_{W}|\mathbb{E}(DW)|$ using Cauchy-Schwarz. We first show an approach that does not give a strong enough bound. We then show how to modify the approach to achieve a better bound.

3.1.1 Unsuccessful approach to upper bound $\max_{W}|\mathbb{E}(DW)|$

Consider $\max_{W}|\mathbb{E}(DW)|$ . By Corollary 2, it suffices to consider only weakenings $W$ such that, if $W$ is a weakening of the axiom $x_{i_{1},j}x_{i_{2},j}=0$ , then for all pigeons $i_{3}\in[n]\setminus\{i_{1},i_{2}\}$ , we have $|H_{W,i_{3}}|\leq\lfloor(n-1)/2\rfloor$ . For any such $W$ , we have

\begin{aligned}
\lVert W\rVert&=\sqrt{\mathbb{E}(W^{2})}\\
&\leq\sqrt{\left(\frac{1}{n-1}\right)^{2}\left(\frac{1}{2}\right)^{n-2}}\\
&=(n-1)^{-1}\cdot 2^{-(n-2)/2}.
\end{aligned}

By Cauchy-Schwarz,

\begin{aligned}
|\mathbb{E}(DW)|&\leq\lVert D\rVert\lVert W\rVert\\
&\leq\lVert D\rVert(n-1)^{-1}2^{-(n-2)/2}.
\end{aligned}

Using the value of $\mathbb{E}(D)$ from Corollary 1, the dual value $\mathbb{E}(D)/\max_{W}|\mathbb{E}(DW)|$ is at least:

\frac{(n-2)!}{(n-1)^{n-2}}\cdot\frac{(n-1)2^{(n-2)/2}}{\lVert D\rVert}=\widetilde{\Theta}\left(\left(\frac{e}{\sqrt{2}}\right)^{-n}\cdot\frac{1}{\lVert D\rVert}\right)

by Stirling’s formula. Thus, in order to achieve an exponential lower bound on the dual value, we would need $1/\lVert D\rVert\geq\Omega(c^{n})$ for some $c>e/\sqrt{2}$ . However, this requirement is too strong, as we will show that $1/\lVert D\rVert=\widetilde{\Theta}\left(\left(\sqrt{e}\right)^{n}\right)$ . Directly applying Cauchy-Schwarz results in too loose a bound on $\max_{W}|\mathbb{E}(DW)|$ , so we now modify our approach.

3.1.2 Successful approach to upper bound $\max_{W}|\mathbb{E}(DW)|$

Definition 13 ( $W^{\{-1,0,1\}}$ ).

Let $W$ be a weakening of the axiom $x_{i_{1},j}x_{i_{2},j}=0$ for pigeons $i_{1},i_{2}$ and hole $j$ . We define the function $W^{\{-1,0,1\}}$ that maps assignments to $\{-1,0,1\}$ . For an assignment $x$ ,

•

If pigeons $i_{1}$ and $i_{2}$ do not both go to hole $j$ , then $W^{\{-1,0,1\}}(x)=0$ .
•

Otherwise, let $V(x)=|\{i_{3}\in[n]\setminus\{i_{1},i_{2}\}:\text{pigeon }i_{3}\text{ does not go to }H_{W,i_{3}}\}|$ . Then $W^{\{-1,0,1\}}(x)=(-1)^{V(x)}$ .

Note that $W^{\{-1,0,1\}}$ is a linear combination of the $W^{\mathrm{flip}}_{S}$ :

Lemma 3.

Let $W$ be a weakening of the axiom $x_{i_{1},j}x_{i_{2},j}=0$ for pigeons $i_{1},i_{2}$ and hole $j$ . We have:

W^{\{-1,0,1\}}=\sum_{S\subseteq[n]\setminus\{i_{1},i_{2}\}}(-1)^{|S|}\cdot W^{\mathrm{flip}}_{S}.

It follows that:

\mathbb{E}\left(DW^{\{-1,0,1\}}\right)=2^{n-2}\cdot\mathbb{E}(DW).

Proof.

To prove the first equation, consider any assignment $x$ . If pigeons $i_{1}$ and $i_{2}$ do not both go to hole $j$ , then both $W^{\{-1,0,1\}}$ and all the $W^{\mathrm{flip}}_{S}$ evaluate to 0 on $x$ . Otherwise, exactly one of the $W^{\mathrm{flip}}_{S}(x)$ equals 1, and for this choice of $S$ , we have $W^{\{-1,0,1\}}(x)=(-1)^{|S|}$ .

The second equation follows because:

\begin{aligned}
\mathbb{E}\left(DW^{\{-1,0,1\}}\right)&=\sum_{S\subseteq[n]\setminus\{i_{1},i_{2}\}}(-1)^{|S|}\cdot\mathbb{E}\left(DW^{\mathrm{flip}}_{S}\right)\\
&=\sum_{S\subseteq[n]\setminus\{i_{1},i_{2}\}}(-1)^{|S|}(-1)^{|S|}\cdot\mathbb{E}(DW)&&\text{(Corollary 2)}\\
&=2^{n-2}\cdot\mathbb{E}(DW).
\end{aligned}

∎

Using Lemma 3, we now improve on the approach to upper-bound $\max_{W}|\mathbb{E}(DW)|$ from section 3.1.1:

Lemma 4.

The dual value $\mathbb{E}(D)/\max_{W}|\mathbb{E}(DW)|$ is at least $\frac{(n-2)!}{(n-1)^{n-2}}\cdot\frac{(n-1)2^{n-2}}{\lVert D\rVert}$

Proof.

For any $W$ , we have:

\begin{aligned}
\mathbb{E}(DW)&=2^{-(n-2)}\cdot\mathbb{E}\left(DW^{\{-1,0,1\}}\right)&&\text{(Lemma 3)}\\
&\leq 2^{-(n-2)}\cdot\lVert D\rVert\lVert W^{\{-1,0,1\}}\rVert&&\text{(Cauchy-Schwarz)}\\
&=2^{-(n-2)}\cdot\lVert D\rVert\sqrt{\mathbb{E}\left(\left(W^{\{-1,0,1\}}\right)^{2}\right)}\\
&=(n-1)^{-1}2^{-(n-2)}\cdot\lVert D\rVert.
\end{aligned}

Using the value of $\mathbb{E}(D)$ from Corollary 1, the dual value $\mathbb{E}(D)/\max_{W}|\mathbb{E}(DW)|$ is at least $\frac{(n-2)!}{(n-1)^{n-2}}\cdot\frac{(n-1)2^{n-2}}{\lVert D\rVert}$ . ∎

It only remains to compute $\lVert D\rVert$ :

Lemma 5.

{\lVert D\rVert}^{2}=\frac{(n-2)!}{(n-1)^{n-2}}\cdot n!\cdot\sum_{c=0}^{n-1}\frac{(-1)^{n-1-c}}{n-c}\cdot\frac{1}{(n-1)^{n-1-c}c!}

Proof.

Recall the definition of $D$ (Definition 10):

\begin{aligned}
D&=\sum_{S\subsetneq[n]}c_{S}J_{S},\\
c_{S}&=\frac{(-1)^{n-1-|S|}(n-1-|S|)!}{(n-1)^{n-1-|S|}}.
\end{aligned}

We compute $\lVert D\rVert^{2}=\mathbb{E}(D^{2})$ as follows.

\mathbb{E}(D^{2})=\sum_{S\subsetneq[n]}\sum_{T\subsetneq[n]}c_{S}c_{T}\cdot\mathbb{E}(J_{S}J_{T}).

Given $S,T\subsetneq[n]$ , we have:

\begin{aligned}
\mathbb{E}(J_{S}J_{T})&=\mathbb{E}(J_{S})\mathbb{E}(J_{T}\mid J_{S}=1)\\
&=\left(\left(\prod_{i=1}^{|S|}(n-i)\right)/(n-1)^{|S|}\right)\left(\left(\prod_{j=|S\cap T|+1}^{|T|}(n-j)\right)/(n-1)^{|T\setminus S|}\right)
\end{aligned}

Therefore,

c_{S}c_{T}\cdot\mathbb{E}(J_{S}J_{T})=\left(c_{S}\left(\prod_{i=1}^{|S|}(n-i)\right)/(n-1)^{|S|}\right)\left(c_{T}\left(\prod_{j=|S\cap T|+1}^{|T|}(n-j)\right)/(n-1)^{|T\setminus S|}\right).

Note that the product of $(-1)^{n-1-|S|}$ (from the $c_{S}$ ) and $(-1)^{n-1-|T|}$ (from the $c_{T}$ ) equals $(-1)^{|S|-|T|}$ , so the above equation becomes:

c_{S}c_{T}\cdot\mathbb{E}(J_{S}J_{T})=(-1)^{|S|-|T|}\left(\frac{(n-2)!}{(n-1)^{n-2}}\right)\left(\frac{(n-1-|S\cap T|)!}{(n-1)^{n-1-|S\cap T|}}\right).

Now, we rearrange the sum for $\mathbb{E}(D^{2})$ in the following way:

\begin{aligned}
\mathbb{E}(D^{2})&=\sum_{S\subsetneq[n]}\sum_{T\subsetneq[n]}c_{S}c_{T}\cdot\mathbb{E}(J_{S}J_{T})\\
&=\frac{(n-2)!}{(n-1)^{n-2}}\sum_{c=0}^{n-1}\frac{(n-1-c)!}{(n-1)^{n-1-c}}\sum_{\begin{subarray}{c}S,T\subsetneq[n],\\ |S\cap T|=c\end{subarray}}(-1)^{|S|-|T|}.
\end{aligned}

To evaluate this expression, fix $c\leq n-1$ and consider the inner sum. Consider the collection of tuples $\{(S,T)\mid S,T\subsetneq[n],|S\cap T|=c\}$ . We can pair up (most of) these tuples in the following way. For each $S$ , let $m_{S}$ denote the minimum element in $[n]$ that is not in $S$ (note that $m_{S}$ is well defined because $S$ cannot be $[n]$ ). We pair up the tuple $(S,T)$ with the tuple $(S,T\triangle\{m_{S}\})$ , where $\triangle$ denotes symmetric difference. The only tuples $(S,T)$ that cannot be paired up in this way are those where $|S|=c$ and $T=[n]\setminus\{m_{S}\}$ , because $T$ cannot be $[n]$ . There are $\binom{n}{c}$ unpaired tuples $(S,T)$ , and for each of these tuples, we have $(-1)^{|S|-|T|}=(-1)^{n-1-c}$ . On the other hand, each pair $(S,T),(S,T\triangle\{m_{S}\})$ contributes 0 to the inner sum. Therefore, the inner sum equals $(-1)^{n-1-c}\binom{n}{c}$ , and we have:

\begin{aligned}
\mathbb{E}(D^{2})&=\frac{(n-2)!}{(n-1)^{n-2}}\sum_{c=0}^{n-1}\frac{(-1)^{n-1-c}(n-1-c)!}{(n-1)^{n-1-c}}\binom{n}{c}\\
&=\frac{(n-2)!}{(n-1)^{n-2}}\sum_{c=0}^{n-1}\frac{(-1)^{n-1-c}(n-1-c)!}{(n-1)^{n-1-c}}\cdot\frac{n!}{c!(n-c)!}\\
&=\frac{(n-2)!}{(n-1)^{n-2}}\cdot n!\cdot\sum_{c=0}^{n-1}\frac{(-1)^{n-1-c}}{n-c}\cdot\frac{1}{(n-1)^{n-1-c}c!}.
\end{aligned}

∎

Corollary 3.

$\mathbb{E}(D^{2})\leq\frac{n!}{(n-1)^{n-1}}$

Proof.

Observe that the sum

\sum_{c=0}^{n-1}\frac{(-1)^{n-1-c}}{n-c}\cdot\frac{1}{(n-1)^{n-1-c}c!}

is an alternating series where the magnitudes of the terms decrease as $c$ decreases. The two largest magnitude terms are $1/(n-1)!$ and $-(1/2)\cdot 1/(n-1)!$ . Therefore, the sum is at most $\frac{1}{(n-1)!}$ , and we conclude that

\mathbb{E}(D^{2})\leq\frac{(n-2)!}{(n-1)^{n-2}}\cdot\frac{n!}{(n-1)!}=\frac{n!}{(n-1)^{n-1}}

as needed. ∎
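The closed form in Lemma 5 and the bound in Corollary 3 can be compared against a brute-force computation of $\mathbb{E}(D^{2})$ for small $n$ . The Python sketch below is only an illustrative check (pigeons and holes are indexed from 0, function names are hypothetical).

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial

def dual_solution(n):
    """{assignment: D(assignment)}, where x[i] is the hole of pigeon i."""
    D = {}
    for x in product(range(n - 1), repeat=n):
        value = Fraction(0)
        for size in range(n):                        # proper subsets S of [n]
            k = n - 1 - size
            c_S = Fraction((-1) ** k * factorial(k), (n - 1) ** k)
            for S in combinations(range(n), size):
                if len({x[i] for i in S}) == size:   # J_S(x) = 1
                    value += c_S
        D[x] = value
    return D

def second_moment_brute_force(n):
    """E(D^2) averaged over all (n-1)^n assignments."""
    D = dual_solution(n)
    return sum(v * v for v in D.values()) / (n - 1) ** n

def second_moment_formula(n):
    """Right-hand side of Lemma 5."""
    lead = Fraction(factorial(n - 2), (n - 1) ** (n - 2)) * factorial(n)
    series = sum(
        Fraction((-1) ** (n - 1 - c),
                 (n - c) * (n - 1) ** (n - 1 - c) * factorial(c))
        for c in range(n)
    )
    return lead * series

for n in (3, 4):
    exact = second_moment_brute_force(n)
    assert exact == second_moment_formula(n)                        # Lemma 5
    assert exact <= Fraction(factorial(n), (n - 1) ** (n - 1))      # Corollary 3
    print(n, exact)
```

For $n=3$ this gives $\mathbb{E}(D^{2})=1$ , and for $n=4$ it gives $16/27$ .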

We can now complete the proof of Theorem 1.

Proof of Theorem 1.

By Lemma 4, any Nullstellensatz proof for $\text{PHP}_{n}$ has total coefficient size at least $\frac{(n-2)!}{(n-1)^{n-2}}\cdot\frac{(n-1)2^{n-2}}{\lVert D\rVert}$ . By Corollary 3, $\lVert D\rVert\leq\sqrt{\frac{n!}{(n-1)^{n-1}}}$ . Combining these results, any Nullstellensatz proof for $\text{PHP}_{n}$ has total coefficient size at least

\begin{aligned}
\frac{(n-2)!}{(n-1)^{n-2}}\cdot\frac{(n-1)2^{n-2}}{\sqrt{\frac{n!}{(n-1)^{n-1}}}}&=\frac{2^{n-2}}{\sqrt{n}}\cdot\frac{\sqrt{(n-1)!}}{(n-1)^{\frac{n}{2}-\frac{3}{2}}}\\
&=\frac{2^{n-2}(n-1)}{\sqrt{n}}\sqrt{\frac{(n-1)!}{(n-1)^{n-1}}}
\end{aligned}

Using Stirling’s approximation that $n!$ is approximately $\sqrt{2{\pi}n}\left(\frac{n}{e}\right)^{n}$ , $\sqrt{\frac{(n-1)!}{(n-1)^{n-1}}}$ is approximately $\sqrt[4]{2{\pi}(n-1)}\left(\frac{1}{\sqrt{e}}\right)^{n-1}$ so this expression is $\Omega\left(n^{\frac{3}{4}}\left(\frac{2}{\sqrt{e}}\right)^{n}\right)$ , as needed. ∎
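For concreteness, the closed-form bound $\frac{2^{n-2}(n-1)}{\sqrt{n}}\sqrt{\frac{(n-1)!}{(n-1)^{n-1}}}$ from the proof can be evaluated numerically for small $n$ . The function name in this Python sketch is hypothetical.

```python
from math import factorial, sqrt

def proven_lower_bound(n):
    """2^(n-2) (n-1) / sqrt(n) * sqrt((n-1)! / (n-1)^(n-1))."""
    return (2 ** (n - 2) * (n - 1) / sqrt(n)
            * sqrt(factorial(n - 1) / (n - 1) ** (n - 1)))

for n in range(3, 7):
    print(n, round(proven_lower_bound(n), 3))
# 3 1.633
# 4 2.828
# 5 4.382
# 6 6.4
```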

3.2 Experimental Results for $\text{PHP}_{n}$

For small $n$ , we computed the optimal dual values shown below. The first column of values is the optimal dual value for $n=3,4$ . The second column of values is the optimal dual value for $n=3,4,5,6$ under the restriction that the only nonzero assignments are those where each pigeon goes to exactly one hole.

$n$	dual value	dual value, each pigeon goes to exactly one hole
3	11	6
4	$41.4\overline{69}$	27
5	-	100
6	-	293.75

For comparison, the table below shows the value we computed for our dual solution and the lower bound of $\frac{2^{n-2}(n-1)}{\sqrt{n}}\sqrt{\frac{(n-1)!}{(n-1)^{n-1}}}$ that we showed in the proof of Theorem 1. (Values are rounded to 3 decimals.)

$n$	value of $D$	proven lower bound on value of $D$
3	4	1.633
4	18	2.828
5	64	4.382
6	210.674	6.4
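The "value of $D$" entries for $n=3,4$ can be reproduced by enumerating weakenings. The Python sketch below is an illustrative brute force, not the authors' code: it indexes pigeons and holes from 0, represents each weakening of $x_{i_{1},j}x_{i_{2},j}=0$ by the hole sets $H_{W,i}$ of the remaining pigeons (as in Definition 11), and uses hypothetical function names.

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial

def dual_solution(n):
    """{assignment: D(assignment)}, where x[i] is the hole of pigeon i."""
    D = {}
    for x in product(range(n - 1), repeat=n):
        value = Fraction(0)
        for size in range(n):                        # proper subsets S of [n]
            k = n - 1 - size
            c_S = Fraction((-1) ** k * factorial(k), (n - 1) ** k)
            for S in combinations(range(n), size):
                if len({x[i] for i in S}) == size:   # J_S(x) = 1
                    value += c_S
        D[x] = value
    return D

def dual_value(n):
    """E(D) / max_W |E(D W)|, maximizing over all hole-set patterns."""
    D = dual_solution(n)
    holes = range(n - 1)
    subsets = [s for r in range(n) for s in combinations(holes, r)]
    best = Fraction(0)                               # max_W |sum_x D(x) W(x)|
    for i1, i2 in combinations(range(n), 2):
        others = [i for i in range(n) if i not in (i1, i2)]
        for j in holes:
            for Hs in product(subsets, repeat=len(others)):
                total = Fraction(0)
                for rest in product(*Hs):            # holes chosen by the other pigeons
                    x = [j] * n                      # pigeons i1 and i2 sit in hole j
                    for i, h in zip(others, rest):
                        x[i] = h
                    total += D[tuple(x)]
                best = max(best, abs(total))
    # The normalization 1/(n-1)^n cancels between numerator and denominator.
    return sum(D.values()) / best

print(dual_value(3), dual_value(4))  # 4 18
```

Larger $n$ would need a smarter enumeration, since the number of hole-set patterns grows as $2^{(n-1)(n-2)}$ per axiom.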

It is possible that our lower bound on the value of $D$ can be improved. The following experimental evidence suggests that the dual value $\mathbb{E}(D)/\max_{W}|\mathbb{E}(DW)|$ of $D$ may actually be $\widetilde{\Theta}(2^{n})$ . For $n=3,4,5,6$ , we found that the weakenings $W$ that maximize $|\mathbb{E}(DW)|$ are of the following form, up to symmetry. (By symmetry, we mean that we can permute pigeons/holes without changing $|\mathbb{E}(DW)|$ , and we can flip sets of holes as in Corollary 2 without changing $|\mathbb{E}(DW)|$ .)

•

For odd $n$ ( $n=3,5$ ): $W$ is the weakening of the axiom $x_{1,1}x_{2,1}=0$ where, for $i=3,\dots,n$ , we have $H_{W,i}=\{2,\dots,(n+1)/2\}$ .
•

For even $n$ ( $n=4,6$ ): $W$ is the following weakening of the axiom $x_{1,1}x_{2,1}=0$ . For $i=3,\dots,n/2+1$ , we have $H_{W,i}=\{2,\dots,n/2\}$ . For $i=n/2+2,\dots,n$ , we have $H_{W,i}=\{n/2+1,\dots,n-1\}$ .

If this pattern continues to hold for larger $n$ , then experimentally it seems that $\mathbb{E}(D)/\max_{W}|\mathbb{E}(DW)|$ is $\widetilde{\Theta}(2^{n})$ , although we do not have a proof of this.

4 Total coefficient size upper bound for the ordering principle

In this section, we construct an explicit Nullstellensatz proof of infeasibility for the ordering principle $\text{ORD}_{n}$ with size and total coefficient size $2^{n}-n$ . We start by formally defining the ordering principle.

Definition 14 (ordering principle ( $\mathrm{ORD}_{n}$ )).

Intuitively, the ordering principle says that any well-ordering on $n$ elements must have a minimum element. Formally, for $n\geq 1$ , we define $\mathrm{ORD}_{n}$ to be the statement that the following system of axioms is infeasible:

•

We have a variable $x_{i,j}$ for each pair $i,j\in[n]$ with $i<j$ . $x_{i,j}=1$ represents element $i$ being less than element $j$ in the well-ordering, and $x_{i,j}=0$ represents element $i$ being greater than element $j$ in the well-ordering.

We write $x_{j,i}$ as shorthand for $1-x_{i,j}$ (i.e. we take $x_{j,i}=\bar{x}_{i,j}=1-x_{i,j}$ ).
•

For each $i\in[n]$ , we have the axiom $\prod_{j\in[n]\setminus\{i\}}{x_{i,j}}=0$ which represents the constraint that element $i$ is not a minimum element. We call these axioms non-minimality axioms.
•

For each triple $i,j,k\in[n]$ where $i<j<k$ , we have the two axioms $x_{i,j}x_{j,k}x_{k,i}=0$ and $x_{k,j}x_{j,i}x_{i,k}=0$ which represent the constraints that elements $i,j,k$ satisfy transitivity. We call these axioms transitivity axioms.

In our Nullstellensatz proof, for each weakening $W$ of an axiom, its coefficient $c_{W}$ will either be $1$ or $0$ . Non-minimality axioms will appear with coefficient $1$ and the only weakenings of transitivity axioms which appear have a special form which we describe below.

Definition 15 (nice transitivity weakening).

Let $W$ be a weakening of the axiom $x_{i,j}x_{j,k}x_{k,i}$ or the axiom $x_{k,j}x_{j,i}x_{i,k}$ for some $i<j<k$ . Let $G(W)$ be the following directed graph. The vertices of $G(W)$ are $[n]$ . For distinct $i^{\prime},j^{\prime}\in[n]$ , $G(W)$ has an edge from $i^{\prime}$ to $j^{\prime}$ if $W$ contains the term $x_{i^{\prime},j^{\prime}}$ . We say that $W$ is a nice transitivity weakening if $G(W)$ has exactly $n$ edges and all vertices are reachable from vertex $i$ .

In other words, if $W$ is a weakening of the axiom $x_{i,j}x_{j,k}x_{k,i}$ or the axiom $x_{k,j}x_{j,i}x_{i,k}$ then $G(W)$ contains a 3-cycle on vertices $\{i,j,k\}$ . $W$ is a nice transitivity weakening if and only if contracting this 3-cycle results in a (directed) spanning tree rooted at the contracted vertex. Note that if $W$ is a nice transitivity weakening and $x$ is an assignment with a minimum element then $W(x)=0$ .
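Definition 15 can be tested mechanically. The following is a minimal sketch under stated assumptions: a weakening is encoded as a set of ordered pairs $(a,b)$, one for each factor $x_{a,b}$, and the function name `is_nice` and the two example weakenings are hypothetical illustrations rather than objects from the paper.

```python
def is_nice(edges, n, root):
    """Return True if the digraph on vertices 1..n with the given edges has
    exactly n edges and every vertex is reachable from `root`."""
    edges = set(edges)
    if len(edges) != n:
        return False
    seen, stack = {root}, [root]
    while stack:
        u = stack.pop()
        for a, b in edges:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen == set(range(1, n + 1))

# n = 4: the 3-cycle 2 -> 1 -> 4 -> 2 together with the tree edge 2 -> 3.
nice_example = {(2, 1), (1, 4), (4, 2), (2, 3)}
# Same 3-cycle, but the fourth edge 3 -> 2 points into the cycle,
# so vertex 3 cannot be reached from the cycle.
not_nice_example = {(2, 1), (1, 4), (4, 2), (3, 2)}
```

Because the three vertices of the axiom form a cycle, any of them can be passed as `root` with the same result.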

Theorem 3.

There is a Nullstellensatz proof of infeasibility for $\text{ORD}_{n}$ satisfying:

1.

The total coefficient size is $2^{n}-n$ .
2.

Each $c_{W}$ is either 0 or 1.
3.

If $A$ is a non-minimality axiom, then $c_{A}=1$ and $c_{W}=0$ for all other weakenings of $A$ .
4.

If $W$ is a transitivity weakening but not a nice transitivity weakening then $c_{W}=0$ .

Proof. We prove Theorem 3 by induction on $n$ . When $n=3$ , the desired Nullstellensatz proof sets $c_{A}=1$ for each axiom $A$ . It can be verified that $\sum_{W}c_{W}W$ evaluates to 1 on each assignment, and that this Nullstellensatz proof satisfies the properties of Theorem 3.

Now suppose we have a Nullstellensatz proof for $\text{ORD}_{n}$ satisfying Theorem 3, and let $S_{n}$ denote the set of transitivity weakenings $W$ for which $c_{W}=1$ . The idea to obtain a Nullstellensatz proof for $\text{ORD}_{n+1}$ is to use two “copies” of $S_{n}$ , the first copy on elements $\{1,\dots,n\}$ and the second copy on elements $\{2,\dots,n+1\}$ . Specifically, we construct the Nullstellensatz proof for $\text{ORD}_{n+1}$ by setting the following $c_{W}$ to 1 and all other $c_{W}$ to 0.

1.

For each non-minimality axiom $A$ in $\text{ORD}_{n+1}$ , we set $c_{A}=1$ .
2.

For each $W\in S_{n}$ , we define the transitivity weakening $W^{\prime}$ on $n+1$ elements by $W^{\prime}=W\cdot x_{1,n+1}$ and set $c_{W^{\prime}}=1$ .
3.

For each $W\in S_{n}$ , first we define the transitivity weakening $W^{\prime\prime}$ on $n+1$ elements by replacing each variable $x_{i,j}$ that appears in $W$ by $x_{i+1,j+1}$ . (e.g., if $W=x_{1,2}x_{2,3}x_{3,1}$ , then $W^{\prime\prime}=x_{2,3}x_{3,4}x_{4,2}$ .) Then, we define $W^{\prime\prime\prime}=W^{\prime\prime}x_{n+1,1}$ and set $c_{W^{\prime\prime\prime}}=1$ .
4.

For each $i\in\{2,\dots,n\}$ , for each of the 2 transitivity axioms $A$ on $(1,i,n+1)$ , we set $c_{W}=1$ for the following weakening $W$ of $A$ :

$W=A\left(\prod_{j\in\{2,\dots,n\}\setminus\{i\}}{x_{i,j}}\right).$

In other words, $W(x)=1$ if and only if $A(x)=1$ and $i$ is the minimum element among the elements $[n+1]\setminus\{1,n+1\}$ .

The desired properties 1 through 4 in Theorem 3 can be verified by induction. It remains to show that for each assignment $x$ , there is exactly one nonzero $c_{W}$ for which $W(x)=1$ . If $x$ has a minimum element $i\in[n+1]$ , then the only nonzero $c_{W}$ for which $W(x)=1$ is the non-minimality axiom for $i$ . Now suppose that $x$ does not have a minimum element. Consider two cases: either $x_{1,n+1}=1$ , or $x_{n+1,1}=1$ . Suppose $x_{1,n+1}=1$ . Consider the two subcases:

1.

Suppose that, if we ignore element $n+1$ , then there is still no minimum element among the elements $\{1,\dots,n\}$ . Then there is exactly one weakening $W$ in point 2 of the construction for which $W(x)=1$ , by induction.
2.

Otherwise, for some $i\in\{2,\dots,n\}$ , we have that $i$ is a minimum element among $\{1,\dots,n\}$ and $x_{n+1,i}=1$ . Then there is exactly one weakening $W$ in point 4 of the construction for which $W(x)=1$ (namely the weakening $W$ of the axiom $A=x_{i,1}x_{1,n+1}x_{n+1,i}$ ).

The case $x_{n+1,1}=1$ is handled similarly by considering whether there is a minimum element among $\{2,\dots,n+1\}$. Assignments that do not have a minimum element among $\{2,\dots,n+1\}$ are handled by point 3 of the construction, and assignments that do are handled by point 4 of the construction. ∎
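The inductive construction can be checked by brute force for small $n$. The following is a minimal sketch, not the authors' code. It encodes a weakening as a set of ordered pairs $(a,b)$ standing for the factors $x_{a,b}$, builds the weakenings from points 1 through 4, and tests every assignment. All function names are hypothetical. The extra factor in point 4 is taken over $j\in\{2,\dots,n\}\setminus\{i\}$, which is an assumption chosen to match the description that $i$ is the minimum among $[n+1]\setminus\{1,n+1\}$.

```python
from itertools import combinations, product

def non_minimality_axioms(n):
    # Axiom for element i: the product of x_{i,j} over all j != i.
    return [frozenset((i, j) for j in range(1, n + 1) if j != i)
            for i in range(1, n + 1)]

def transitivity_weakenings(n):
    """The set S_n of transitivity weakenings with coefficient 1 (n >= 3)."""
    if n == 3:
        return [frozenset({(1, 2), (2, 3), (3, 1)}),
                frozenset({(3, 2), (2, 1), (1, 3)})]
    prev = transitivity_weakenings(n - 1)
    new = n  # the element called n+1 in the inductive step
    out = [w | {(1, new)} for w in prev]                      # point 2
    out += [frozenset((a + 1, b + 1) for a, b in w) | {(new, 1)}
            for w in prev]                                    # point 3
    for i in range(2, new):                                   # point 4
        rest = {(i, j) for j in range(2, new) if j != i}
        out.append(frozenset({(i, 1), (1, new), (new, i)} | rest))
        out.append(frozenset({(1, i), (i, new), (new, 1)} | rest))
    return out

def assignments(n):
    # Each assignment is the set of ordered pairs (a, b) with x_{a,b} = 1.
    pairs = list(combinations(range(1, n + 1), 2))
    for bits in product((0, 1), repeat=len(pairs)):
        yield {(i, j) if b else (j, i) for (i, j), b in zip(pairs, bits)}

def check(n):
    """Return (number of weakenings, whether exactly one fires everywhere)."""
    proof = non_minimality_axioms(n) + transitivity_weakenings(n)
    ok = all(sum(w <= x for w in proof) == 1 for x in assignments(n))
    return len(proof), ok
```

Under this reading, `check(n)` returns `(2**n - n, True)` for $n=3,4,5$, in agreement with properties 1 and 2 of Theorem 3.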

4.1 Restriction to instances with no minimum element

We now observe that for the ordering principle, we can restrict our attention to instances which have no minimum element.

Lemma 6.

Suppose we have coefficients $c_{W}$ satisfying $\sum_{W}c_{W}W(x)=1$ for all assignments $x$ that have no minimum element (but it is possible that $\sum_{W}c_{W}W(x)\neq 1$ on assignments $x$ that do have a minimum element). Then there exist coefficients $c^{\prime}_{W}$ such that $\sum_{W}c^{\prime}_{W}W=1$ (i.e., the coefficients $c_{W}^{\prime}$ are a valid primal solution) with

\sum_{W}{|c^{\prime}_{W}|}\leq(n+1)\left(\sum_{W}{|c_{W}|}\right)+n.

This lemma says that, to prove upper or lower bounds for $\text{ORD}_{n}$ by constructing primal or dual solutions, it suffices to consider only assignments $x$ that have no minimum element, up to a factor of $O(n)$ in the solution value.

Proof.

Let $C$ denote the function on weakenings that maps $W$ to $c_{W}$ . For $i\in[n]$ , we will define the function $C_{i}$ on weakenings satisfying the properties:

•

If $x$ is an assignment where $i$ is a minimum element, then $\sum_{W}{C_{i}(W)W(x)}=\sum_{W}{C(W)W(x)}$ .
•

Otherwise, $\sum_{W}{C_{i}(W)W(x)}=0$ .

Let $A_{i}=\prod_{j\in[n]\setminus\{i\}}{x_{i,j}}$ be the non-minimality axiom for $i$ . Intuitively, we want to define $C_{i}$ as follows: For all $W$ , $C_{i}(A_{i}W)=C(W)$ . (If $W$ is a weakening that is not a weakening of $A_{i}$ , then $C_{i}(W)=0$ .) The only technicality is that multiple weakenings $W$ may become the same when multiplied by $A_{i}$ , so we actually define $C_{i}(A_{i}W)=\sum_{W^{\prime}:A_{i}W^{\prime}=A_{i}W}C(W^{\prime})$ .

Finally, we use the functions $C_{i}$ to define the function $C^{\prime}$ :

C^{\prime}=C-\left(\sum_{i=1}^{n}C_{i}\right)+\left(\sum_{i=1}^{n}A_{i}\right).

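Here the last sum denotes the function that maps each non-minimality axiom $A_{i}$ to $1$ and every other weakening to $0$. Take $c^{\prime}_{W}=C^{\prime}(W)$. If $x$ has a minimum element $i$ (which is necessarily unique), then $\sum_{W}C^{\prime}(W)W(x)=\sum_{W}C(W)W(x)-\sum_{W}C_{i}(W)W(x)+A_{i}(x)=1$. If $x$ has no minimum element, then every $C_{i}$ and every $A_{i}$ contributes $0$, so $\sum_{W}C^{\prime}(W)W(x)=\sum_{W}C(W)W(x)=1$. Since $\sum_{W}|C_{i}(W)|\leq\sum_{W}|C(W)|$ for each $i$, the $c^{\prime}_{W}$ are a valid primal solution with the desired bound on the total coefficient size. ∎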
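The transformation in this proof can be illustrated on a toy instance. The following is a minimal sketch with hypothetical names. Weakenings are encoded as sets of ordered pairs $(a,b)$ for the factors $x_{a,b}$, and a product that contains both $x_{a,b}$ and $x_{b,a}$ is dropped because it vanishes on every assignment. The input `C` for $n=3$ is an invented example that equals $1$ on the two assignments with no minimum element but equals $2$ when element $1$ is the minimum.

```python
from collections import defaultdict
from itertools import combinations, product

def non_minimality_axioms(n):
    return [frozenset((i, j) for j in range(1, n + 1) if j != i)
            for i in range(1, n + 1)]

def assignments(n):
    # Each assignment is the set of ordered pairs (a, b) with x_{a,b} = 1.
    pairs = list(combinations(range(1, n + 1), 2))
    for bits in product((0, 1), repeat=len(pairs)):
        yield {(i, j) if b else (j, i) for (i, j), b in zip(pairs, bits)}

def evaluate(coeffs, x):
    # Sum of c_W over the weakenings W with W(x) = 1.
    return sum(c for w, c in coeffs.items() if w <= x)

def lemma6_transform(coeffs, n):
    """Compute C' = C - sum_i C_i + sum_i A_i as a coefficient dictionary."""
    new = defaultdict(int, coeffs)
    for axiom in non_minimality_axioms(n):
        for w, c in coeffs.items():
            merged = axiom | w  # the product A_i * W
            if any((b, a) in merged for a, b in merged):
                continue  # contains x_{a,b} x_{b,a}, so it is identically zero
            new[merged] -= c  # coefficients of equal products accumulate
        new[axiom] += 1
    return {w: c for w, c in new.items() if c != 0}

cycle_a = frozenset({(1, 2), (2, 3), (3, 1)})
cycle_b = frozenset({(3, 2), (2, 1), (1, 3)})
A1 = non_minimality_axioms(3)[0]
C = {cycle_a: 1, cycle_b: 1, A1: 2}
C_prime = lemma6_transform(C, 3)
```

In this example `C_prime` assigns coefficient $1$ to the two transitivity axioms and to each of the three non-minimality axioms, so its total coefficient size is $5$, well within the bound $(n+1)\cdot 4+n=19$.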

4.2 Experimental results

For small values of $n$ , we have computed both the minimum total coefficient size of a Nullstellensatz proof of the ordering principle and the value of the linear program where we restrict our attention to instances $x$ which have no minimum element.

We found that for $n=3,4,5$ , the minimum total coefficient size of a Nullstellensatz proof of the ordering principle is $2^{n}-n$ so the primal solution given by Theorem 3 is optimal. However, for $n=6$ this solution is not optimal as the minimum total coefficient size is $52$ rather than $2^{6}-6=58$ .

If we restrict our attention to instances $x$ which have no minimum element then for $n=3,4,5,6$ , the value of the resulting linear program is equal to $2\binom{n}{3}$ , which is the number of transitivity axioms. However, this is no longer true for $n=7$ , though we did not compute the exact value.

5 Analyzing Total Coefficient Size for Stronger Proof Systems

In this section, we consider the total coefficient size for two stronger proof systems, sum of squares proofs and a proof system which is between Nullstellensatz and sum of squares proofs which we call resolution-like proofs.

Definition 16.

Given a system of axioms $\{p_{i}=0:i\in[m]\}$ , we define a resolution-like proof of infeasibility to be an equality of the form

-1=\sum_{i=1}^{m}{{p_{i}}{q_{i}}}+\sum_{j}{{c_{j}}g_{j}}

where each $g_{j}$ is a monomial and each coefficient $c_{j}$ is non-negative. We define the total coefficient size of such a proof to be $\sum_{i=1}^{m}{T(q_{i})}+\sum_{j}{c_{j}}$ .

We call this proof system resolution-like because it captures the resolution-like calculus introduced for Max-SAT by María Luisa Bonet, Jordi Levy, and Felip Manyà [4]. The idea is that if we have deduced that $x{r_{1}}\leq 0$ and $\bar{x}{r_{2}}\leq 0$ for some variable $x$ and monomials $r_{1}$ and $r_{2}$ then we can deduce that ${r_{1}}{r_{2}}\leq 0$ as follows:

{r_{1}}{r_{2}}=x{r_{1}}-(1-r_{2})x{r_{1}}+\bar{x}{r_{2}}-(1-r_{1})\bar{x}{r_{2}}

where we decompose $(1-r_{1})$ and $(1-r_{2})$ into monomials using the observation that $1-\prod_{i=1}^{k}{x_{i}}=\sum_{j=1}^{k}{(1-x_{j})\left(\prod_{i=1}^{j-1}{x_{i}}\right)}$ .
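Both identities used in this deduction are multilinear, so it suffices to check them on Boolean values. The following minimal sketch does so by brute force. The function names are hypothetical, and $r_{1}$ and $r_{2}$ are treated as single Boolean values, which is all that matters for the first identity.

```python
from itertools import product
from math import prod

def telescoping_rhs(xs):
    # sum_{j=1}^{k} (1 - x_j) * prod_{i<j} x_i, with 0-based indexing
    return sum((1 - xs[j]) * prod(xs[:j]) for j in range(len(xs)))

def identities_hold(k):
    # 1 - prod_i x_i equals the telescoping sum for every Boolean input.
    for xs in product((0, 1), repeat=k):
        if 1 - prod(xs) != telescoping_rhs(xs):
            return False
    # r1*r2 = x*r1 - (1-r2)*x*r1 + xbar*r2 - (1-r1)*xbar*r2
    for x, r1, r2 in product((0, 1), repeat=3):
        xbar = 1 - x
        rhs = x * r1 - (1 - r2) * x * r1 + xbar * r2 - (1 - r1) * xbar * r2
        if r1 * r2 != rhs:
            return False
    return True
```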

The minimum total coefficient size of a resolution-like proof can be found using the following linear program.

Primal: Minimize $\sum_{i=1}^{m}{T(q_{i})}+\sum_{j}{c_{j}}$ subject to $\sum_{i=1}^{m}{{p_{i}}{q_{i}}}+\sum_{j}{{c_{j}}g_{j}}=-1$
Dual: Maximize $D(1)$ subject to the constraints that
1.

$D$ is a linear map from polynomials to $\mathbb{R}$.
2.

For each $i\in[m]$ and each monomial $r$, $|D(rp_{i})|\leq 1$.
3.

For each monomial $r$, $D(r)\geq-1$.

Definition 17.

Given a system of axioms $\{p_{i}=0:i\in[m]\}$ , a Positivstellensatz/sum of squares proof of infeasibility is an equality of the form

-1=\sum_{i=1}^{m}{{p_{i}}{q_{i}}}+\sum_{j}{g_{j}^{2}}

We define the total coefficient size of a Positivstellensatz/sum of squares proof to be $\sum_{i=1}^{m}{T(q_{i})}+\sum_{j}{T(g_{j})^{2}}$.

Primal: Minimize $\sum_{i=1}^{m}{T(q_{i})}+\sum_{j}{T(g_{j})^{2}}$ subject to the constraint that $-1=\sum_{i=1}^{m}{{p_{i}}{q_{i}}}+\sum_{j}{g_{j}^{2}}$ .
Dual: Maximize $D(1)$ subject to the constraints that
1.

$D$ is a linear map from polynomials to $\mathbb{R}$.
2.

For each $i\in[m]$ and each monomial $r$, $|D(rp_{i})|\leq 1$.
3.

For each polynomial $g_{j}$, $D((g_{j})^{2})\geq-T(g_{j})^{2}$.

5.1 Failure of the dual certificate for resolution-like proofs

In this subsection, we observe that our dual certificate does not give a lower bound on the total coefficient size for resolution-like proofs of the pigeonhole principle because it has a large negative value on some monomials.

Theorem 4.

The value of the dual certificate on the polynomial $\prod_{i=1}^{n}{\bar{x}_{i1}}$ is
$-\frac{(n-2)!}{(n-1)^{n-1}}\left(1-\frac{(-1)^{n-2}}{(n-1)^{n-2}}\right)$.

Proof.

To show this, we make the following observations.

1.

The value of the dual certificate on the polynomial $\prod_{i=2}^{n}{\bar{x}_{i1}}$ is $0$.
2.

The value of the dual certificate on the polynomial $x_{11}\prod_{i=3}^{n}{\bar{x}_{i1}}$ is $\frac{(n-2)!}{(n-1)^{n-1}}$.
3.

The value of the dual certificate on the polynomial $x_{11}x_{21}\prod_{i=3}^{n}{\bar{x}_{i1}}$ is $\frac{(-1)^{n-2}(n-2)!}{(n-1)^{2n-3}}$.

For the first observation, observe that since the first pigeon is unrestricted, every term of the dual certificate cancels except $J_{\{2,3,\ldots,n\}}$, which is $0$ as none of these pigeons can go to hole $1$. For the second observation, observe that since the second pigeon is unrestricted, every term of the dual certificate cancels except $J_{\{1,3,4,\ldots,n\}}$, which gives value $\frac{\mathbb{E}(D)}{n-1}=\frac{(n-2)!}{(n-1)^{n-1}}$. For the third observation, observe that by Lemma 2, the value of the dual certificate on the polynomial $x_{11}x_{21}\prod_{i=3}^{n}{\bar{x}_{i1}}$ is $(-1)^{n-2}$ times the value of the dual certificate on the polynomial $\prod_{i=1}^{n}{x_{i1}}$, which is

\frac{1}{(n-1)^{n}}\left(-\frac{(n-1)!}{(n-1)^{n-1}}+n\frac{(n-2)!}{(n-1)^{n-2}}\right)=\frac{(n-2)!}{(n-1)^{2n-3}}

Putting these observations together, the value of the dual certificate for the polynomial

\prod_{i=1}^{n}{\bar{x}_{i1}}=\prod_{i=2}^{n}{\bar{x}_{i1}}-x_{11}\prod_{i=3}^{n}{\bar{x}_{i1}}+x_{11}x_{21}\prod_{i=3}^{n}{\bar{x}_{i1}}

is $-\frac{(n-2)!}{(n-1)^{n-1}}+\frac{(-1)^{n-2}(n-2)!}{(n-1)^{2n-3}}=-\frac{(n-2)!}{(n-1)^{n-1}}\left(1-\frac{(-1)^{n-2}}{(n-1)^{n-2}}\right)$. ∎
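Two steps of this proof are pure arithmetic and can be checked independently of the dual certificate itself. The following minimal sketch, with hypothetical function names, uses exact rational arithmetic to confirm the displayed simplification for the value on $\prod_{i=1}^{n}{x_{i1}}$, and checks the three-term decomposition of $\prod_{i=1}^{n}{\bar{x}_{i1}}$ on all Boolean inputs. It does not compute the dual certificate.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod

def unsimplified_value(n):
    # (1/(n-1)^n) * ( -(n-1)!/(n-1)^(n-1) + n*(n-2)!/(n-1)^(n-2) )
    inner = (-Fraction(factorial(n - 1), (n - 1) ** (n - 1))
             + n * Fraction(factorial(n - 2), (n - 1) ** (n - 2)))
    return Fraction(1, (n - 1) ** n) * inner

def simplified_value(n):
    # (n-2)! / (n-1)^(2n-3)
    return Fraction(factorial(n - 2), (n - 1) ** (2 * n - 3))

def decomposition_holds(n):
    # x[i] stands for x_{(i+1),1}; bar[i] is its negation.
    for x in product((0, 1), repeat=n):
        bar = [1 - v for v in x]
        lhs = prod(bar)
        rhs = (prod(bar[1:]) - x[0] * prod(bar[2:])
               + x[0] * x[1] * prod(bar[2:]))
        if lhs != rhs:
            return False
    return True
```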

5.2 Small total coefficient size sum of squares proof of the ordering principle

In this subsection, we show that the small size resolution proof of the ordering principle [16], which seems to be dynamic in nature, can actually be mimicked by a sum of squares proof. Thus, while sum of squares requires degree $\tilde{\Theta}(\sqrt{n})$ to refute the negation of the ordering principle [13], there is a sum of squares proof which has polynomial size and total coefficient size. To make our proof easier to express, we define the following monomials.

Definition 18.

1.

Whenever $1\leq j\leq m\leq n$ , let $F_{jm}=\prod_{i\in[m]\setminus\{j\}}{x_{ji}}$ be the monomial which is $1$ if $x_{j}$ is the first element in $x_{1},\dots,x_{m}$ and $0$ otherwise.
2.

For all $m\in[n-1]$ and all distinct $j,k\in[m]$ , we define $T_{jmk}$ to be the monomial

$T_{jmk}=F_{jm}x_{(m+1)j}x_{k(m+1)}\prod_{i\in[k-1]\setminus\{j\}}{x_{(m+1)i}}$

Note that $T_{jmk}$ is a multiple of $x_{(m+1)j}x_{jk}x_{k(m+1)}$ so it is a weakening of a transitivity axiom.

With these definitions, we can now express our proof.

Theorem 5.

The following equality (modulo the axioms that $x_{ij}^{2}=x_{ij}$ and $x_{ij}x_{ji}=0$ for all distinct $i,j\in[n]$ ) gives an SoS proof that the total ordering axioms are infeasible.

-1=\sum_{m=1}^{n-1}{\left(\left(F_{(m+1)(m+1)}-\sum_{j=1}^{m}{F_{jm}F_{(m+1)(m+1)}}\right)^{2}-\sum_{j=1}^{m}{\sum_{k\in[m]\setminus\{j\}}{T_{jmk}}}\right)}-\sum_{j=1}^{n}{F_{jn}}

Proof.

Our key building block is the following lemma.

Lemma 7.

For all $m\in[n-1]$ and all $j\in[m]$,

F_{jm}=F_{j(m+1)}+\sum_{k\in[m]\setminus\{j\}}{T_{jmk}}+F_{jm}F_{(m+1)(m+1)}

Proof.

First note that

F_{jm}=F_{jm}x_{j(m+1)}+F_{jm}x_{(m+1)j}=F_{j(m+1)}+F_{jm}x_{(m+1)j}

We now use the following proposition

Proposition 2.

F_{jm}x_{(m+1)j}=\sum_{k\in[m]\setminus\{j\}}{F_{jm}x_{(m+1)j}x_{k(m+1)}\prod_{i\in[k-1]\setminus\{j\}}{x_{(m+1)i}}}+F_{jm}F_{(m+1)(m+1)}x_{(m+1)j}

Proof.

If $x_{k(m+1)}=0$ for all $k\in[m]$ then $x_{k(m+1)}\prod_{i\in[k-1]\setminus\{j\}}{x_{(m+1)i}}=0$ for all $k\in[m]$ and $F_{(m+1)(m+1)}=1$. Otherwise, let $k^{\prime}$ be the first index in $[m]$ such that $x_{k^{\prime}(m+1)}=1$. Now observe that $F_{(m+1)(m+1)}=0$ and $x_{k(m+1)}\prod_{i\in[k-1]\setminus\{j\}}{x_{(m+1)i}}=1$ if $k=k^{\prime}$ and is $0$ if $k\neq k^{\prime}$. ∎

Finally, we observe that $F_{jm}F_{(m+1)(m+1)}x_{(m+1)j}=F_{jm}F_{(m+1)(m+1)}$ as $x_{(m+1)j}$ is contained in $F_{(m+1)(m+1)}$ . Putting everything together,

F_{jm}=F_{j(m+1)}+\sum_{k\in[m]\setminus\{j\}}{T_{jmk}}+F_{jm}F_{(m+1)(m+1)}

∎

We now verify that our proof, which we restate here for convenience, is indeed an equality modulo the axioms that $x_{ij}^{2}=x_{ij}$ and $x_{ij}x_{ji}=0$ for all distinct $i,j\in[n]$ .

-1=\sum_{m=1}^{n-1}{\left(\left(F_{(m+1)(m+1)}-\sum_{j=1}^{m}{F_{jm}F_{(m+1)(m+1)}}\right)^{2}-\sum_{j=1}^{m}{\sum_{k\in[m]\setminus\{j\}}{T_{jmk}}}\right)}-\sum_{j=1}^{n}{F_{jn}}

Observe that $F_{(m+1)(m+1)}^{2}=F_{(m+1)(m+1)}$, $F_{jm}^{2}=F_{jm}$, and for all distinct $j,j^{\prime}\in[m]$, $F_{jm}F_{j^{\prime}m}=0$. Thus,

\left(F_{(m+1)(m+1)}-\sum_{j=1}^{m}{F_{jm}F_{(m+1)(m+1)}}\right)^{2}=\left(F_{(m+1)(m+1)}-\sum_{j=1}^{m}{F_{jm}F_{(m+1)(m+1)}}\right)

By Lemma 7, $F_{jm}F_{(m+1)(m+1)}+\sum_{k\in[m]\setminus\{j\}}{T_{jmk}}=F_{jm}-F_{j(m+1)}$ . This implies that

$\displaystyle-\sum_{m=1}^{n-1}{\sum_{j=1}^{m}{\left(F_{jm}F_{(m+1)(m+1)}+\sum_{k\in[m]\setminus\{j\}}{T_{jmk}}\right)}}=-\sum_{j=1}^{n-1}{\sum_{m=j}^{n-1}{\left(F_{jm}-F_{j(m+1)}\right)}}=\sum_{j=1}^{n-1}{(F_{jn}-F_{jj})}$

Now observe that $\sum_{m=1}^{n-1}{F_{(m+1)(m+1)}}=\sum_{j=2}^{n-1}{F_{jj}}+F_{nn}=-1+\sum_{j=1}^{n-1}{F_{jj}}+F_{nn}$ . Putting everything together, the equality holds. Since each $T_{jmk}$ is a weakening of a transitivity axiom and each $F_{jn}$ is a non-minimality axiom, this indeed gives a sum of squares proof that these axioms are unsatisfiable. ∎
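The equality of Theorem 5 can also be confirmed numerically for small $n$. The following minimal sketch evaluates the right-hand side on every assignment of the variables $x_{ij}$ with $x_{ji}=1-x_{ij}$ and checks that it equals $-1$. The function names and the encoding of an assignment as a set of ordered pairs are assumptions made for this illustration.

```python
from itertools import combinations, product

def sos_right_hand_side(n, before):
    """Evaluate the right-hand side of Theorem 5 on one assignment.
    `before` is a set of ordered pairs; (a, b) in it means x_{ab} = 1."""
    def x(a, b):
        return 1 if (a, b) in before else 0

    def F(j, m):
        # F_{jm}: product of x_{ji} over i in [m] \ {j}
        out = 1
        for i in range(1, m + 1):
            if i != j:
                out *= x(j, i)
        return out

    def T(j, m, k):
        # T_{jmk} = F_{jm} x_{(m+1)j} x_{k(m+1)} prod_{i in [k-1]\{j}} x_{(m+1)i}
        out = F(j, m) * x(m + 1, j) * x(k, m + 1)
        for i in range(1, k):
            if i != j:
                out *= x(m + 1, i)
        return out

    total = 0
    for m in range(1, n):
        g = F(m + 1, m + 1) - sum(F(j, m) * F(m + 1, m + 1)
                                  for j in range(1, m + 1))
        weakenings = sum(T(j, m, k) for j in range(1, m + 1)
                         for k in range(1, m + 1) if k != j)
        total += g * g - weakenings
    return total - sum(F(j, n) for j in range(1, n + 1))

def identity_holds(n):
    pairs = list(combinations(range(1, n + 1), 2))
    for bits in product((0, 1), repeat=len(pairs)):
        before = {(i, j) if b else (j, i) for (i, j), b in zip(pairs, bits)}
        if sos_right_hand_side(n, before) != -1:
            return False
    return True
```

Checking on Boolean assignments is enough here because the equality is only claimed modulo the axioms $x_{ij}^{2}=x_{ij}$ and $x_{ij}x_{ji}=0$.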

6 Open Problems

Our work raises a number of open problems. For the pigeonhole principle, while we have proved an exponential total coefficient size lower bound on Nullstellensatz proofs, there is a lot of room for further work. Some questions are as follows.

1.

For the pigeonhole principle, our lower bound is $2^{\Omega(n)}$ while the trivial upper bound is $O(n!)$ . Can we improve the lower and/or upper bound?
2.

If we increase the number of pigeons from $n$ to $n+1$ while still having $n-1$ holes, our lower bound proof no longer applies. Can we prove a total coefficient size lower bound on Nullstellensatz proofs when there are $n+1$ or more pigeons? How does the minimum total coefficient size of a proof depend on the number of pigeons?
3.

Can we show total coefficient size lower bounds for resolution-like proofs of the pigeonhole principle?
4.

How much of an effect would adding the axioms that pigeons can only go to one hole have on the minimum total coefficient size needed to prove the pigeonhole principle?

We are still far from understanding the total coefficient size of proofs for the ordering principle. Two natural questions are as follows.

1.

Can we prove superpolynomial lower bounds on the total coefficient size of Nullstellensatz proofs for the ordering principle and/or improve the $O(2^{n})$ upper bound?
2.

Are there resolution-like proofs for the ordering principle with polynomial total coefficient size? If so, this shows that the seemingly dynamic $O(n^{3})$ size resolution proof of the ordering principle [16] can be captured by a one line resolution-like proof. If not, this gives a natural example separating resolution proof size and the total coefficient size of resolution-like proofs.

Finally, we can ask what relationships and separations we can show between all of these different proof systems. Some questions are as follows.

1.

Are there natural examples where the minimum total coefficient size is very different (either larger or smaller) than the minimum size for Nullstellensatz, resolution-like, or sum of squares proofs?
2.

Can the minimum total coefficient size of a strong proof system be used to lower bound the size of another proof system? For example, can resolution proof size be lower bounded by the minimum total coefficient size of a sum of squares proof or can we find an example where there is a polynomial size resolution proof but any sum of squares proof has superpolynomial total coefficient size?

References

[1] Paul Beame, Stephen Cook, Jeff Edmonds, Russell Impagliazzo, and Toniann Pitassi. The relative complexity of NP search problems. J. Comput. Syst. Sci., 57(1):3–19, 1998.
[2] P. Beame, R. Impagliazzo, J. Krajíček, T. Pitassi, and P. Pudlák. Lower bounds on Hilbert's Nullstellensatz and propositional proofs. In Proceedings of the 35th Annual Symposium on Foundations of Computer Science (FOCS), pages 794–806, 1994.
[3] S. Buss, R. Impagliazzo, J. Krajíček, P. Pudlák, A. A. Razborov, and J. Sgall. Proof complexity in algebraic systems and bounded depth Frege systems with modular counting. Comput. Complex., 6(3):256–298, 1997.
[4] María Luisa Bonet, Jordi Levy, and Felip Manyà. Resolution for Max-SAT. Artificial Intelligence, 171(8):606–618, 2007.
[5] S. R. Buss and T. Pitassi. Good degree bounds on Nullstellensatz refutations of the induction principle. In Proceedings of Computational Complexity (Formerly Structure in Complexity Theory), pages 233–242, 1996.
[6] W. Dale Brownawell. Bounds for the degrees in the Nullstellensatz. Annals of Mathematics, 126(3):577–591, 1987.
[7] Samuel R. Buss. Lower bounds on Nullstellensatz proofs via designs. Volume 39 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 59–71. DIMACS/AMS, 1996.
[8] Matthew Clegg, Jeffery Edmonds, and Russell Impagliazzo. Using the Groebner basis algorithm to find proofs of unsatisfiability. In Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing (STOC), pages 174–183, 1996.
[9] Susanna F. de Rezende, Jakob Nordström, Or Meir, and Robert Robere. Nullstellensatz size-degree trade-offs from reversible pebbling. In 34th Computational Complexity Conference (CCC), volume 137, pages 18:1–18:16, 2019.
[10] Grete Hermann. Die Frage der endlich vielen Schritte in der Theorie der Polynomideale. Mathematische Annalen, 95:736–788, 1926.
[11] Russell Impagliazzo, Pavel Pudlák, and Jiří Sgall. Lower bounds for the polynomial calculus and the Gröbner basis algorithm. Comput. Complex., 8(2):127–144, 1999.
[12] János Kollár. Sharp effective Nullstellensatz. Journal of the American Mathematical Society, pages 963–975, 1988.
[13] Aaron Potechin. Sum of squares bounds for the ordering principle. In 35th Computational Complexity Conference (CCC), volume 169, pages 38:1–38:37, 2020.
[14] Toniann Pitassi and Robert Robere. Lifting Nullstellensatz to monotone span programs over any field. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 1207–1219, 2018.
[15] Alexander A. Razborov. Lower bounds for the polynomial calculus. Comput. Complex., 7(4):291–324, 1998.
[16] Gunnar Stålmarck. Short resolution proofs for a sequence of tricky formulas. Acta Informatica, 33(3):277–280, 1996.
