Complexity of branch-and-bound and cutting planes in mixed-integer optimization - II

Amitabh Basu¹, Michele Conforti², Marco Di Summa², Hongyi Jiang¹

¹ Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD, USA (basu.amitabh@jhu.edu, hjiang32@jhu.edu).

² Dipartimento di Matematica “Tullio Levi-Civita”, Università degli Studi Padova, Italy (conforti@math.unipd.it, disumma@math.unipd.it).

Abstract

We study the complexity of cutting planes and branching schemes from a theoretical point of view. We give some rigorous underpinnings to the empirically observed phenomenon that combining cutting planes and branching into a branch-and-cut framework can be orders of magnitude more efficient than employing these tools on their own. In particular, we give general conditions under which a cutting plane strategy and a branching scheme give a provably exponential advantage in efficiency when combined into branch-and-cut. The efficiency of these algorithms is evaluated using two concrete measures: number of iterations and sparsity of constraints used in the intermediate linear/convex programs. To the best of our knowledge, our results are the first mathematically rigorous demonstration of the superiority of branch-and-cut over pure cutting planes and pure branch-and-bound.

1 Introduction

In this paper, we consider the following mixed-integer optimization problem:

$$\begin{array}{rcll}\sup&\langle c,x\rangle&&\\ \textrm{s.t.}&x&\in&C\cap({\mathbb{Z}}^{n}\times\mathbb{R}^{d})\end{array}\tag{1.1}$$

where $C$ is a closed, convex set in $\mathbb{R}^{n+d}$ .

State-of-the-art algorithms for integer optimization are based on two ideas that are at the origin of mixed-integer programming and have been constantly refined: cutting planes and branch-and-bound. Decades of theoretical and experimental research into both these techniques are at the heart of the outstanding success of integer programming solvers. Nevertheless, we feel that there is a lot of scope for widening and deepening our understanding of these tools. We have recently started building foundations for a rigorous, quantitative theory for analyzing the strengths and weaknesses of cutting planes and branching [3]. We continue this project in the current manuscript.

In particular, we provide a theoretical framework to explain an empirically observed phenomenon: algorithms that combine cutting planes and branching techniques are more efficient (sometimes by orders of magnitude) than algorithms that use either technique on its own. We hope that our insights can contribute to a better and more precise understanding of the interaction of cutting planes and branching: which cutting plane schemes and branching schemes complement each other with concrete, provable gains obtained with their combined use, and which do not? Not only is a theoretical understanding of this phenomenon lacking, but a deeper understanding of the interaction of these methods is also considered to be important by both practitioners and theoreticians in the mixed-integer optimization community. To quote an influential computational survey [39] “… it seems that a tighter coordination of the two most fundamental ingredients of the solvers, branching and cutting, can lead to strong improvements.”

The main computational burden in any cutting plane or branch-and-bound or branch-and-cut algorithm is the solution of the intermediate convex relaxations. Thus, there are two important aspects to deciding how efficient such an algorithm is: 1) How many linear programs (LPs) or convex optimization problems are solved? 2) How computationally challenging are these convex problems? The first aspect has been widely studied using the concepts of proof size and rank; see [21, 22, 23, 12, 11, 10, 17, 6, 27, 50] for a small sample of previous work. Formalizing the second aspect is somewhat tricky and we will focus on a very specific aspect: the sparsity of the constraints describing the linear program. The collective wisdom of the optimization community says that sparsity of constraints is a highly important aspect in the efficiency of linear programming [5, 28, 49, 53]. Additionally, most successful mixed-integer optimization solvers use sparsity as a criterion for cutting plane selection; see [25, 24, 26] for an innovative line of research. Compared to cutting planes, sparsity considerations have not been as prominent in the choice of branching schemes. This is primarily because for variable disjunctions sparsity is not an issue, and there is relatively less work on more general branching schemes; see [1, 45, 4, 20, 41, 42, 44, 19, 40, 36]. In our analysis, we are careful about the sparsity of the disjunctions as well – see Definition 1.3 below.

1.1 Framework for mathematical analysis.

We now present the formal details of our approach. A cutting plane for the feasible region of (1.1) is a halfspace $H=\{x\in\mathbb{R}^{n+d}:\langle a,x\rangle\leq\delta\}$ such that $C\cap({\mathbb{Z}}^{n}\times\mathbb{R}^{d})\subseteq H$ . The most useful cutting planes are those that are not valid for $C$ , i.e., $C\not\subseteq H$ . There are several procedures used in practice for generating cutting planes, all of which can be formalized by the general notion of a cutting plane paradigm. A cutting plane paradigm is a function $\mathcal{CP}$ that takes as input any closed, convex set $C$ and outputs a (possibly infinite) family $\mathcal{CP}(C)$ of cutting planes valid for $C\cap({\mathbb{Z}}^{n}\times\mathbb{R}^{d})$ . Two well-studied examples of cutting plane paradigms are the Chvátal-Gomory cutting plane paradigm [51, Chapter 23] and the split cut paradigm [14, Chapter 5]. We will assume that all cutting planes are rational in this paper.

State-of-the-art solvers embed cutting planes into a systematic enumeration scheme called branch-and-bound. The central notion is that of a disjunction, which is a union of polyhedra $D=Q_{1}\cup\ldots\cup Q_{k}$ such that ${\mathbb{Z}}^{n}\times\mathbb{R}^{d}\subseteq D$ , i.e., the polyhedra together cover all of ${\mathbb{Z}}^{n}\times\mathbb{R}^{d}$ . One typically uses a (possibly infinite) family of disjunctions for potential deployment in algorithms. A well-known example is the family of split disjunctions that are of the form $D_{\pi,\pi_{0}}:=\{x\in\mathbb{R}^{n+d}:\langle\pi,x\rangle\leq\pi_{0}\}\cup\{x\in\mathbb{R}^{n+d}:\langle\pi,x\rangle\geq\pi_{0}+1\}$ , where $\pi\in{\mathbb{Z}}^{n}\times\{0\}^{d}$ and $\pi_{0}\in{\mathbb{Z}}$ . When the first $n$ coordinates of $\pi$ correspond to a standard unit vector, we get variable disjunctions, i.e., disjunctions of the form $\{x:x_{i}\leq\pi_{0}\}\cup\{x:x_{i}\geq\pi_{0}+1\}$ , for $i=1,\ldots,n$ .

A family of disjunctions $\mathcal{D}$ can also form the basis of a cutting plane paradigm. Given any disjunction $D$ , any halfspace $H$ such that $C\cap D\subseteq H$ is a cutting plane, since $C\cap({\mathbb{Z}}^{n}\times\mathbb{R}^{d})\subseteq C\cap D$ by definition of a disjunction. The corresponding cutting plane paradigm $\mathcal{CP}(C)$ , called disjunctive cuts based on $\mathcal{D}$ , is the family of all such cutting planes derived from disjunctions in $\mathcal{D}$ . Two well-known examples are the family of split cuts, based on the family of split disjunctions defined above, and the family of lift-and-project cuts derived from variable disjunctions.

In the following we assume that all convex optimization problems that need to be solved have an optimal solution or are infeasible.

Definition 1.1.

A branch-and-cut algorithm based on a family $\mathcal{D}$ of disjunctions and a cutting plane paradigm $\mathcal{CP}$ maintains a list $\mathcal{L}$ of convex subsets of the initial set $C$ which are guaranteed to contain the optimal point, and a lower bound $LB$ that stores the objective value of the best feasible solution found so far (with $LB=-\infty$ if no feasible solution has been found). At every iteration, the algorithm selects one of these subsets $N\in\mathcal{L}$ and solves the convex optimization problem $\sup\{\langle c,x\rangle:x\in N\}$ to obtain $x^{N}$ . If the objective value is less than or equal to $LB$ , then this set $N$ is discarded from the list $\mathcal{L}$ . Else, if $x^{N}$ satisfies the integrality constraints, $LB$ is updated with the value of $x^{N}$ and $N$ is discarded from the list. Otherwise, the algorithm makes a decision whether to branch or to cut. In the former case, a disjunction $D=(Q_{1}\cup\ldots\cup Q_{k})\in\mathcal{D}$ is chosen such that $x^{N}\not\in D$ and the list is updated $\mathcal{L}:=\mathcal{L}\setminus\{N\}\cup\{Q_{1}\cap N,\ldots,Q_{k}\cap N\}$ . If the decision is to cut, then the algorithm selects a cutting plane $H\in\mathcal{CP}(N)$ such that $x^{N}\not\in H$ , and updates the relaxation $N$ by adding the cut $H$ , i.e., updates $\mathcal{L}:=\mathcal{L}\setminus\{N\}\cup\{N\cap H\}$ .

Motivated by the above, we will refer to a family $\mathcal{D}$ of disjunctions also as a branching scheme. In a branch-and-cut algorithm, if one always chooses to add a cutting plane and never uses a disjunction to branch, then it is said to be a (pure) cutting plane algorithm, and if one never uses any cutting planes, then it is called a (pure) branch-and-bound algorithm. We note here that in practice, when a decision to cut is made, several cutting planes are usually added as opposed to just one single cutting plane as in Definition 1.1. In our mathematical framework, allowing only a single cut makes for a seamless generalization from pure cutting plane algorithms, and also makes quantitative analysis easier.

Definition 1.2.

The execution of any branch-and-cut algorithm on a mixed-integer optimization instance can be represented by a tree. Every convex relaxation $N$ processed by the algorithm is denoted by a node in the tree. If the optimal value for $N$ is not better than the current lower bound, or is integral, $N$ is a leaf. Otherwise, in the case of a branching, its children are $Q_{1}\cap N,\ldots,Q_{k}\cap N$ , and in the case of a cutting plane, there is a single child representing $N\cap H$ (we use the same notation as in Definition 1.1). This tree is called the branch-and-cut tree (branch-and-bound tree, if no cutting planes are used). If no branching is done, this tree (which is really a path) is called a cutting plane proof. The size of the tree or proof is the total number of nodes.

Proof versus algorithm.

Although we use the word “algorithm” in Definition 1.1, it is technically a non-deterministic algorithm, or equivalently, a proof schema or proof system for optimality [2] (leaving aside the question of finite termination for now). This is because no indication is given on how the important decisions are made: Which set $N$ to process from $\mathcal{L}$ ? Branch or cut? Which disjunction or cutting plane to use? If these are made concrete, one would obtain a standard deterministic algorithm (assuming, for the moment, finite termination on all instances). Nevertheless, the proof system is very useful for obtaining information-theoretic lower bounds on the efficiency of any deterministic branch-and-cut algorithm. Moreover, one can prove the validity of any upper bound on the objective, i.e., the validity of $\langle c,x\rangle\leq\gamma$ by exhibiting a branch-and-cut tree where this inequality is valid for all the leaves. If $\gamma$ is the optimal value, this is a proof of optimality, but one may often be interested in the branch-and-cut/branch-and-bound/cutting plane proof complexity of other valid inequalities as well. The connection between integer programming and proof complexity has a long history; see [4, 20, 7, 48, 34, 8, 16, 29, 30, 13, 46, 47, 38, 37, 31], to cite a few. Our results can be interpreted in the language of proof complexity as well.

Another subtlety to keep in mind is that one could add to the power of such a branch-and-cut proof system by relaxing the requirement that the current optimal solution $x^{N}$ should be eliminated by the chosen disjunction or cutting plane. This can make a difference – an instance may have a finite proof in the strengthened system while no finite proof exists in the original system [43]. When required, we will use the phrase restricted proof to refer to a proof that imposes the restriction of eliminating $x^{N}$ at every node $N$ of the proof tree.

Recall that we quantify the complexity of any branch-and-bound/cutting plane/branch-and-cut algorithm using two aspects: the number of LP relaxations processed and the sparsity of the constraints defining the LPs. The number of LP relaxations processed is given precisely by the number of nodes in the corresponding tree (Definition 1.2). Sparsity is formalized in the following definitions.

Definition 1.3.

Let $1\leq s\leq n+d$ be a natural number that we call the sparsity parameter. Then the pair $(\mathcal{CP},s)$ will denote the restriction of the paradigm $\mathcal{CP}$ that only reports the sub-family of cutting planes that can be represented by inequalities with at most $s$ non-zero coefficients; the notation $(\mathcal{CP},s)(C)$ will be used to denote this sub-family for any particular convex set $C$ . Similarly, $(\mathcal{D},s)$ will denote the sub-family of the family of disjunctions $\mathcal{D}$ such that each polyhedron in the disjunction has an inequality description where every inequality has at most $s$ non-zero coefficients.
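To make the sparsity parameter concrete, the following minimal Python sketch computes, for an integer vector $\pi$ and a point $x$, the right-hand side $\pi_{0}$ of a split disjunction $D_{\pi,\pi_{0}}$ that excludes $x$ (when one exists), together with the number of non-zero coefficients of $\pi$. The function names `split_excluding` and `sparsity` are illustrative choices and do not come from the paper; the sketch assumes that `pi` is integral and is zero on any continuous coordinates.

```python
from fractions import Fraction
from math import floor


def sparsity(pi):
    """Number of non-zero coefficients of pi (the quantity bounded by s)."""
    return sum(1 for p in pi if p != 0)


def split_excluding(pi, x):
    """Return pi0 such that pi0 < <pi, x> < pi0 + 1, or None.

    If <pi, x> is an integer, no split disjunction with this pi excludes x,
    so None is returned.
    """
    value = sum(p * xi for p, xi in zip(pi, x))
    pi0 = floor(value)
    return None if value == pi0 else pi0


if __name__ == "__main__":
    x = (Fraction(1), Fraction(1, 2), Fraction(0))
    for pi in [(0, 1, 0), (1, 1, 1), (1, 0, 0)]:
        print(pi, "sparsity:", sparsity(pi), "pi0:", split_excluding(pi, x))
```

In this sketch, a disjunction belongs to $(\mathcal{S},s)$ exactly when `sparsity(pi) <= s`; variable disjunctions correspond to `sparsity(pi) == 1`.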

Cutting plane proof systems with restrictions on the “depth” of the cutting planes have been considered in the proof complexity literature; see [30, 33].

1.2 Our Results

1.2.1 Sparsity versus size.

Our first set of results considers the trade-off between the sparsity parameter $s$ and the number of LPs processed, i.e., the size of the tree. There are several avenues to explore in this direction. For example, one could compare pure branch-and-bound algorithms based on $(\mathcal{D},s_{1})$ and $(\mathcal{D},s_{2})$ , i.e., fix a particular disjunction family $\mathcal{D}$ and consider the effect of sparsity on the branch-and-bound tree sizes. One could also look at two different families of disjunctions $\mathcal{D}_{1}$ and $\mathcal{D}_{2}$ and look at their relative tree sizes as one turns the knob on the sparsity parameter. Similar questions could be asked about cutting plane paradigms $(\mathcal{CP}_{1},s_{1})$ and $(\mathcal{CP}_{2},s_{2})$ for interesting paradigms $\mathcal{CP}_{1},\mathcal{CP}_{2}$ . Even more interestingly, one could compare pure branch-and-bound and pure cutting plane algorithms against each other.

We first focus on pure branch-and-bound algorithms based on the family $\mathcal{S}$ of split disjunctions. A very well-known example of pure integer instances (i.e., $d=0$ ) due to Jeroslow [35] shows that if the sparsity of the splits used is restricted to be 1, i.e., one uses only variable disjunctions, then the branch-and-bound algorithm will generate an exponential (in the dimension $n$ ) sized tree. On the other hand, if one allows fully dense splits, i.e., sparsity is $n$ , then there is a tree with just 3 nodes (one root, and two leaves) that solves the problem. We ask what happens in Jeroslow’s example if one uses split disjunctions with sparsity $s>1$ . Our first result shows that unless the sparsity parameter $s=\Omega(n)$ , one cannot get constant size trees, and if the sparsity parameter $s=O(1)$ , then the tree is of exponential size.

Theorem 1.4.

Let $H$ be the halfspace defined by inequality $2\sum_{i=1}^{n}x_{i}\leq n$ , where $n$ is an odd number. Consider the instances of (1.1) with $d=0$ , the objective $\sum_{i=1}^{n}x_{i}$ and $C=H\cap[0,1]^{n}$ . The optimum is $\left\lfloor\frac{n}{2}\right\rfloor$ , and any branch-and-bound proof with sparsity $s\leq\left\lfloor\frac{n}{2}\right\rfloor$ that certifies $\sum_{i=1}^{n}x_{i}\leq\left\lfloor\frac{n}{2}\right\rfloor$ has size at least $\Omega(2^{\frac{n}{2s}})$ .

The above instance is a modification of Jeroslow’s instance; Jeroslow’s instance uses an equality constraint instead of an inequality. However, the same argument applies for Jeroslow’s instance.
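As an illustration of the sparsity $s=1$ case, the following Python sketch runs a pure branch-and-bound procedure in the sense of Definition 1.1 with variable disjunctions on the instance of Theorem 1.4 and counts the nodes of the resulting tree. It is a sketch under stated assumptions, not the construction used in the proofs: the relaxation is solved in closed form (a greedy fill is optimal because all objective coefficients and all knapsack coefficients are equal), the branching variable is the first fractional coordinate, and nodes are processed depth first. The names `solve_relaxation` and `branch_and_bound_size` are hypothetical.

```python
from fractions import Fraction


def solve_relaxation(n, fixed):
    """Solve max sum(x) s.t. 2*sum(x) <= n, 0 <= x <= 1, x_i = fixed[i].

    Returns (value, x), or None if the node is infeasible.
    """
    budget = Fraction(n, 2) - sum(fixed.values())
    if budget < 0:
        return None
    x = [Fraction(fixed.get(i, 0)) for i in range(n)]
    for i in range(n):
        if i not in fixed:
            # Greedy fill: raise free variables until the budget is used up.
            x[i] = min(Fraction(1), budget)
            budget -= x[i]
    return sum(x), x


def branch_and_bound_size(n):
    """Return (number of nodes processed, best objective value found)."""
    lower_bound = None          # None stands for LB = -infinity
    nodes = 0
    todo = [dict()]             # the list L; each node is a set of fixings
    while todo:
        fixed = todo.pop()
        nodes += 1
        result = solve_relaxation(n, fixed)
        if result is None:
            continue            # infeasible leaf
        value, x = result
        if lower_bound is not None and value <= lower_bound:
            continue            # pruned by bound
        frac = next((i for i, xi in enumerate(x) if xi.denominator != 1), None)
        if frac is None:
            lower_bound = value  # integral solution: update LB
            continue
        # Variable disjunction {x_frac <= 0} or {x_frac >= 1}.
        todo.append({**fixed, frac: 0})
        todo.append({**fixed, frac: 1})
    return nodes, lower_bound


if __name__ == "__main__":
    for n in (3, 5, 7, 9, 11):
        print(n, branch_and_bound_size(n))
```

For $n=3$ and $n=5$ this sketch processes 11 and 39 nodes respectively, and the count grows rapidly with $n$. By contrast, the single dense split $\sum_{i=1}^{n}x_{i}\leq\left\lfloor\frac{n}{2}\right\rfloor$ or $\sum_{i=1}^{n}x_{i}\geq\left\lfloor\frac{n}{2}\right\rfloor+1$ yields the 3-node tree mentioned above.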

Corollary 1.5.

Let $H$ be the hyperplane defined by equality $2\sum_{i=1}^{n}x_{i}=n$ , where $n$ is an odd number. Consider the instances of (1.1) with $d=0$ , the objective $\sum_{i=1}^{n}x_{i}$ and $C=H\cap[0,1]^{n}$ . This problem is infeasible, and any branch-and-bound proof of infeasibility with sparsity $s\leq\left\lfloor\frac{n}{2}\right\rfloor$ has size at least $\Omega(2^{\frac{n}{2s}})$ .

The bounds in Theorem 1.4 give a constant lower bound when $s=\Omega(n)$ . We establish another lower bound which does better in this regime.

Theorem 1.6.

Next we consider the relative strength of cutting planes and branch-and-bound. Our previous work has studied conditions under which one method can dominate the other, depending on which cutting plane paradigm and branching scheme one chooses [3]. For this paper, the following result from [3] is relevant: for every convex 0/1 pure integer instance, any branch-and-bound proof based on variable disjunctions can be “simulated” by a lift-and-project cutting plane proof without increasing the size of the proof (versions of this result for linear 0/1 programming were known earlier; see [21, 22]). Moreover, in [3] we constructed a family of stable set instances where lift-and-project cuts give exponentially shorter proofs than branch-and-bound. This is interesting because lift-and-project cuts are disjunctive cuts based on the same family of variable disjunctions, so it is not a priori clear that they have an advantage. These results were obtained with no regard for sparsity. We now show that once we also track the sparsity parameter, this advantage can disappear.

Theorem 1.7.

Let $H$ be the halfspace defined by inequality $2\sum_{i=1}^{n}x_{i}\leq n$ , where $n$ is an odd number. Consider the instances of (1.1) with $d=0$ , the objective $\sum_{i=1}^{\left\lceil\frac{n}{2}\right\rceil}x_{i}$ and $C=H\cap[0,1]^{n}$ . The optimum is $\left\lfloor\frac{n}{2}\right\rfloor$ , and there is a branch-and-bound algorithm based on variable disjunctions, i.e., the family of split disjunctions with sparsity $1$ , that certifies $\sum_{i=1}^{\left\lceil\frac{n}{2}\right\rceil}x_{i}\leq\left\lfloor\frac{n}{2}\right\rfloor$ in $O(n)$ steps. However, any cutting plane for $C$ with sparsity $s\leq\left\lfloor\frac{n}{2}\right\rfloor$ is trivial, i.e., valid for $[0,1]^{n}$ , no matter what cutting plane paradigm is used to derive it.
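The branch-and-bound half of Theorem 1.7 can be checked on small cases with the following Python sketch. It is only an illustration under assumptions that are not part of the paper: the relaxation is solved in closed form by greedily raising the free objective variables (those with index below $\left\lceil\frac{n}{2}\right\rceil$), the branching variable is the first fractional coordinate, and nodes are processed depth first. The names `solve_relaxation` and `tree_size` are hypothetical.

```python
from fractions import Fraction


def solve_relaxation(n, k, fixed):
    """Solve max sum_{i<k} x_i s.t. 2*sum_i x_i <= n, 0 <= x <= 1, x_i = fixed[i].

    Returns (value, x), or None if the node is infeasible.
    """
    budget = Fraction(n, 2) - sum(fixed.values())
    if budget < 0:
        return None
    x = [Fraction(fixed.get(i, 0)) for i in range(n)]
    for i in range(k):
        if i not in fixed:
            # Only objective variables are raised; the others stay at 0.
            x[i] = min(Fraction(1), budget)
            budget -= x[i]
    return sum(x[:k]), x


def tree_size(n):
    """Return (number of nodes processed, best objective value found)."""
    k = (n + 1) // 2            # ceil(n / 2) for odd n
    lower_bound = None          # None stands for LB = -infinity
    nodes = 0
    todo = [dict()]
    while todo:
        fixed = todo.pop()
        nodes += 1
        result = solve_relaxation(n, k, fixed)
        if result is None:
            continue            # infeasible leaf
        value, x = result
        if lower_bound is not None and value <= lower_bound:
            continue            # pruned by bound
        frac = next((i for i, xi in enumerate(x) if xi.denominator != 1), None)
        if frac is None:
            lower_bound = value  # integral solution: update LB
            continue
        # Variable disjunction on the fractional coordinate.
        todo.append({**fixed, frac: 0})
        todo.append({**fixed, frac: 1})
    return nodes, lower_bound


if __name__ == "__main__":
    for n in (3, 7, 21, 101):
        print(n, tree_size(n))
```

Under these choices the tree is a path of branching nodes with one leaf hanging off each, so the sketch processes $n+2$ nodes, in line with the $O(n)$ bound of the theorem.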

1.2.2 Superiority of branch-and-cut.

We next consider the question of when combining branching and cutting planes is provably advantageous. For this question, we leave aside the complications arising due to sparsity considerations and focus only on the size of proofs. The following discussion and results can be extended to handle the issue of sparsity as well, but we leave it out of this extended abstract.

Given a cutting plane paradigm $\mathcal{CP}$ , and a branching scheme $\mathcal{D}$ , are there families of instances where branch-and-cut based on $\mathcal{CP}$ and $\mathcal{D}$ does provably better than pure cutting planes based on $\mathcal{CP}$ alone and pure branch-and-bound based on $\mathcal{D}$ alone? If a cutting plane paradigm $\mathcal{CP}$ and a branching scheme $\mathcal{D}$ are such that either for every instance, $\mathcal{CP}$ gives cutting plane proofs of size at most a polynomial factor larger than the shortest branch-and-bound proofs with $\mathcal{D}$ , or vice versa, for every instance $\mathcal{D}$ gives proofs of size at most polynomially larger than the shortest cutting plane proofs based on $\mathcal{CP}$ , then combining them into branch-and-cut is likely to give no substantial improvement since one method can always do the job of the other, up to polynomial factors. As mentioned above, prior work [3] had shown that disjunctive cuts based on variable disjunctions (with no restriction on sparsity) dominate branch-and-bound based on variable disjunctions for pure 0/1 instances, and as a consequence branch-and-cut based on these paradigms is dominated by pure cutting planes. In the next theorem, we show that the situation completely reverses if one considers a broader family of disjunctions (still restricted to the pure integer case).

Theorem 1.8.

Let $C\subseteq\mathbb{R}^{n}$ be a closed, convex set. Let $k\in{\mathbb{N}}$ be a fixed natural number and let $\mathcal{D}$ be any family of disjunctions that contains all split disjunctions, such that all disjunctions in $\mathcal{D}$ have at most $k$ terms in the disjunction. If a valid inequality $\langle c,x\rangle\leq\delta$ for $C\cap{\mathbb{Z}}^{n}$ has a cutting plane proof of size $L$ using disjunctive cuts based on $\mathcal{D}$ , then there exists a branch-and-bound proof of size at most $(k+1)L$ based on $\mathcal{D}$ . Moreover, there is a family of instances where branch-and-bound based on split disjunctions solves the problem in $O(1)$ time whereas there is a polynomial lower bound on split cut proofs.

A consequence of Theorem 1.8 is that any cutting plane proof based on Chvátal-Gomory cuts can be replaced by a branch-and-bound proof based on split disjunctions with a constant blow up in size (since Chvátal-Gomory cuts are a subset of split cuts). This special case was also proved in earlier work by Beame et al. [4, Theorem 12]. We also emphasize that the proof of Theorem 1.8 crucially uses the fact that we have a class of disjunctions that is rich enough to include all split disjunctions.

With an analysis similar to that of Theorem 1.8, we obtain the following theorem, which takes sparsity into account as well.

Theorem 1.9.

Let $C\subseteq\mathbb{R}^{n}$ be a closed, convex set. Let $\langle c,x\rangle\leq\delta$ be a valid inequality for $C\cap{\mathbb{Z}}^{n}$ . If there exists a cutting plane proof of size $L$ and sparsity $s$ certifying the validity of this inequality, which is derived using general split disjunctions of sparsity $s$ , then there exists a branch-and-bound proof of sparsity $s$ which proves the validity and takes at most $O(L)$ iterations.

The above discussion and theorem motivate the following definition which formalizes the situation where no method dominates the other. To make things precise, we assume that there is a well-defined way to assign a concrete size to any instance of (1.1); see [32] for a discussion on how to make this formal. Additionally, when we speak of an instance, we allow the possibility of proving the validity of any inequality valid for $C\cap({\mathbb{Z}}^{n}\times\mathbb{R}^{d})$ , not necessarily related to an upper bound on the objective value. Thus, an instance is a tuple $(C,c,\gamma)$ such that $\langle c,x\rangle\leq\gamma$ for all $x\in C\cap({\mathbb{Z}}^{n}\times\mathbb{R}^{d})$ .

Definition 1.10.

A cutting plane paradigm $\mathcal{CP}$ and a branching scheme $\mathcal{D}$ are complementary if there is a family of instances where $\mathcal{CP}$ gives polynomial (in the size of the instances) size proofs and the shortest branch-and-bound proof based on $\mathcal{D}$ is exponential (in the size of the instances), and there is another family of instances where $\mathcal{D}$ gives polynomial size proofs while $\mathcal{CP}$ gives exponential size proofs.

We wish to formalize the intuition that branch-and-cut is expected to be exponentially better than branch-and-bound or cutting planes alone for complementary pairs of branching schemes and cutting plane paradigms. But we need to make some mild assumptions about the branching schemes and cutting plane paradigms. All known branching schemes and cutting plane methods from the literature satisfy these conditions.

Definition 1.11.

A branching scheme is said to be regular if no disjunction involves a continuous variable, i.e., each polyhedron in the disjunction is described using inequalities that involve only the integer constrained variables.

A branching scheme $\mathcal{D}$ is said to be embedding closed if disjunctions from higher dimensions can be applied to lower dimensions. More formally, let $n_{1}$ , $n_{2}$ , $d_{1}$ , $d_{2}\in{\mathbb{N}}$ . If $D\in\mathcal{D}$ is a disjunction in $\mathbb{R}^{n_{1}}\times\mathbb{R}^{d_{1}}\times\mathbb{R}^{n_{2}}\times\mathbb{R}^{d_{2}}$ with respect to ${\mathbb{Z}}^{n_{1}}\times\mathbb{R}^{d_{1}}\times{\mathbb{Z}}^{n_{2}}\times\mathbb{R}^{d_{2}}$ , then the disjunction $D\cap(\mathbb{R}^{n_{1}}\times\mathbb{R}^{d_{1}}\times\{0\}^{n_{2}}\times\{0\}^{d_{2}})$ , interpreted as a set in $\mathbb{R}^{n_{1}}\times\mathbb{R}^{d_{1}}$ , is also in $\mathcal{D}$ for the space $\mathbb{R}^{n_{1}}\times\mathbb{R}^{d_{1}}$ with respect to ${\mathbb{Z}}^{n_{1}}\times\mathbb{R}^{d_{1}}$ (note that $D\cap(\mathbb{R}^{n_{1}}\times\mathbb{R}^{d_{1}}\times\{0\}^{n_{2}}\times\{0\}^{d_{2}})$ , interpreted as a set in $\mathbb{R}^{n_{1}}\times\mathbb{R}^{d_{1}}$ , is certainly a disjunction with respect to ${\mathbb{Z}}^{n_{1}}\times\mathbb{R}^{d_{1}}$ ; we want $\mathcal{D}$ to be closed with respect to such restrictions).

A cutting plane paradigm $\mathcal{CP}$ is said to be regular if it has the following property, which says that adding “dummy variables” to the formulation of the instance should not change the power of the paradigm. Formally, let $C\subseteq\mathbb{R}^{n}\times\mathbb{R}^{d}$ be any closed, convex set and let $C^{\prime}=\{(x,t)\in\mathbb{R}^{n}\times\mathbb{R}^{d}\times\mathbb{R}:x\in C,\;\;t=\langle f,x\rangle\}$ for some $f\in\mathbb{R}^{n}$ . Then if a cutting plane $\langle a,x\rangle\leq b$ is derived by $\mathcal{CP}$ applied to $C$ , i.e., this inequality is in $\mathcal{CP}(C)$ , then it should also be in $\mathcal{CP}(C^{\prime})$ , and conversely, if $\langle a,x\rangle+\mu t\leq b$ is in $\mathcal{CP}(C^{\prime})$ , then the equivalent inequality $\langle a+\mu f,x\rangle\leq b$ should be in $\mathcal{CP}(C)$ .

A cutting plane paradigm $\mathcal{CP}$ is said to be embedding closed if cutting planes from higher dimensions can be applied to lower dimensions. More formally, let $n_{1},n_{2},d_{1},d_{2}\in{\mathbb{N}}$ . Let $C\subseteq\mathbb{R}^{n_{1}}\times\mathbb{R}^{d_{1}}$ be any closed, convex set. If the inequality $\langle c_{1},x_{1}\rangle+\langle a_{1},y_{1}\rangle+\langle c_{2},x_{2}\rangle+\langle a_{2},y_{2}\rangle\leq\gamma$ is a cutting plane for $C\times\{0\}^{n_{2}}\times\{0\}^{d_{2}}$ with respect to ${\mathbb{Z}}^{n_{1}}\times\mathbb{R}^{d_{1}}\times{\mathbb{Z}}^{n_{2}}\times\mathbb{R}^{d_{2}}$ that can be derived by applying $\mathcal{CP}$ to $C\times\{0\}^{n_{2}}\times\{0\}^{d_{2}}$ , then the cutting plane $\langle c_{1},x_{1}\rangle+\langle a_{1},y_{1}\rangle\leq\gamma$ that is valid for $C\cap({\mathbb{Z}}^{n_{1}}\times\mathbb{R}^{d_{1}})$ should also belong to $\mathcal{CP}(C)$ .

A cutting plane paradigm $\mathcal{CP}$ is said to be inclusion closed, if for any two closed convex sets $C\subseteq C^{\prime}$ , we have $\mathcal{CP}(C^{\prime})\subseteq\mathcal{CP}(C)$ . In other words, any cutting plane derived for $C^{\prime}$ can also be derived for a subset $C$ .

Theorem 1.12.

Let $\mathcal{D}$ be a regular, embedding closed branching scheme and let $\mathcal{CP}$ be a regular, embedding closed, and inclusion closed cutting plane paradigm such that $\mathcal{D}$ includes all variable disjunctions and $\mathcal{CP}$ and $\mathcal{D}$ form a complementary pair. Then there exists a family of instances of (1.1) which have polynomial size branch-and-cut proofs, whereas any branch-and-bound proof based on $\mathcal{D}$ and any cutting plane proof based on $\mathcal{CP}$ is of exponential size.

Example 1.13.

As a concrete example of a complementary pair that satisfies the other conditions of Theorem 1.12, consider $\mathcal{CP}$ to be the Chvátal-Gomory paradigm and $\mathcal{D}$ to be the family of variable disjunctions. From their definitions, they are both regular and $\mathcal{D}$ is embedding closed. The Chvátal-Gomory paradigm is also embedding closed and inclusion closed. For the Jeroslow instances from Theorem 1.4, the single Chvátal-Gomory cut $\sum_{i=1}^{n}x_{i}\leq\lfloor\frac{n}{2}\rfloor$ proves optimality, whereas variable disjunctions produce a tree of size $2^{\lfloor\frac{n}{2}\rfloor}$ . On the other hand, consider the set $T$ , where $T=\operatorname*{conv}\{(0,0),(1,0),(\frac{1}{2},h)\}$ and the valid inequality $x_{2}\leq 0$ for $T\cap{\mathbb{Z}}^{2}$ . Any Chvátal-Gomory paradigm based proof has size exponential in the size of the input, i.e., every proof has length at least $\Omega(h)$ [51]. On the other hand, a single disjunction on the variable $x_{1}$ solves the problem.

In [3], we also studied examples of disjunction families $\mathcal{D}$ such that disjunctive cuts based on $\mathcal{D}$ are complementary to branching schemes based on $\mathcal{D}$ .

Example 1.13 shows that the classical Chvátal-Gomory cuts and variable branching are complementary and thus, by Theorem 1.12, give rise to a superior branch-and-cut routine when combined. As discussed above, for 0/1 problems, lift-and-project cuts and variable branching do not form a complementary pair, and neither do split cuts and split disjunctions by Theorem 1.8. It would be nice to establish the converse of Theorem 1.12: if there is a family where branch-and-cut is exponentially superior, then the cutting plane paradigm and branching scheme are complementary. In Theorem 1.14 below, we prove a partial converse along these lines in the pure integer setting. This partial converse requires the disjunction family to include all split disjunctions. It would be more satisfactory to establish similar results without this assumption. More generally, it remains an open question if our definition of complementarity is an exact characterization of when branch-and-cut is superior.

Theorem 1.14.

Let $\mathcal{D}$ be a branching scheme that includes all split disjunctions and let $\mathcal{CP}$ be any cutting plane paradigm. Suppose that for every pure integer instance and any cutting plane proof based on $\mathcal{CP}$ for this instance, there is a branch-and-bound proof based on $\mathcal{D}$ of size at most a polynomial factor (in the size of the instance) larger. Then for any branch-and-cut proof based on $\mathcal{D}$ and $\mathcal{CP}$ for a pure integer instance, there exists a pure branch-and-bound proof based on $\mathcal{D}$ that has size at most polynomially larger than the branch-and-cut proof.

The high level message that we extract from our results is the formalization of the following simple intuition. For branch-and-cut to be superior to pure cutting planes or pure branch-and-bound, one needs the cutting planes and branching scheme to do “sufficiently different” things. For example, if they are both based on the same family of disjunctions (such as lift-and-project cuts and variable branching, or the setting of Theorem 1.8), then we do not get any improvements with branch-and-cut. The definition of a complementary pair attempts to make the notion of “sufficiently different” formal and Theorem 1.12 derives the concrete superior performance of branch-and-cut from this formalization.

2 Proofs

2.1 Proof of Theorem 1.4

We first give necessary definitions and prove a lemma.

Definition 2.1.

Consider the instances in Theorem 1.4, and the branch-and-bound tree $T$ produced by split disjunctions to solve it. Assume node $N$ of $T$ contains at least one integer point in $\{0,1\}^{n}$ , and $D_{1},D_{2},\ldots,D_{r}$ are the split disjunctions used to derive $N$ from the root of $T$ . For $1\leq j\leq r$ , $D_{j}$ is a true split disjunction of $N$ if both of the two halfspaces of $D_{j}$ have a nonempty intersection with the integer hull of the corresponding parent node, i.e., the parent node’s integer hull is split into two nonempty parts by $D_{j}$ . Otherwise, it is called a false split disjunction of $N$ . We define the generation variable set of $N$ as the index set $I\subseteq\{1,2,\ldots,n\}$ consisting of all the indices of the variables involved in the true split disjunctions of $N$ . The generation variable set of the root node is empty.

Lemma 2.2.

Consider the instances in Theorem 1.4, and the branch-and-bound tree $T$ produced by split disjunctions with sparsity parameter $s<\left\lfloor\frac{n}{2}\right\rfloor$ to solve it. For any node $N$ of $T$ with at least one feasible integer point $v=(v_{1},v_{2},\ldots,v_{n})\in\{0,1\}^{n}$ , let $P$ , $P_{I}$ and $I$ denote the relaxation, the integer hull and the generation variable set corresponding to $N$ . Define $V:=\{(x_{1},x_{2},\ldots,x_{n})\in\{0,1\}^{n}:x_{i}=v_{i}\mbox{ for }i\in I,\sum_{j=1}^{n}x_{j}=\left\lfloor\frac{n}{2}\right\rfloor\}$ .

If $\lvert I\rvert\leq\left\lfloor\frac{n}{2}\right\rfloor-s$ , then we have:

(i)

$V\neq\emptyset$ and $V\subseteq P_{I}\cap\{0,1\}^{n}$ ;
(ii)

the objective LP value of $N$ is $\frac{n}{2}$ .

Proof.

We first give a proof of (i). Since $v$ is a feasible integer point, $0\leq\sum_{i\in I}v_{i}\leq\sum_{i=1}^{n}v_{i}\leq\left\lfloor\frac{n}{2}\right\rfloor$ . Thus, there exists $v^{\prime}=(v^{\prime}_{1},v^{\prime}_{2},\ldots,v^{\prime}_{n})$ , where $v_{i}^{\prime}=v_{i}$ for $i\in I$ and $\sum_{i=1}^{n}v_{i}^{\prime}=\left\lfloor\frac{n}{2}\right\rfloor$ . So $v^{\prime}\in V\neq\emptyset$ .

For each $v^{*}\in V$ , we wish to show that $v^{*}\in P$ . Since $v^{*}$ is integral, this will show that $v^{*}\in P_{I}$ and hence $V\subseteq P_{I}$ . Consider any inequality describing $P$ . The original defining inequality $\sum_{i=1}^{n}x_{i}\leq\frac{n}{2}$ and the 0/1 bounds are satisfied by $v^{*}$ by the definition of $V$ . Any other inequality was introduced on the path from the root to $N$ . A false split disjunction cannot remove $v^{*}$ since $v^{*}$ is integral. Consider an inequality coming from a true split disjunction. Let $\sum_{i\in S}a_{i}x_{i}\leq\delta^{*}$ for some $S\subseteq I$ be such an inequality. Since $v\in P_{I}$ and $v^{*}_{i}=v_{i}$ for $i\in I$ , we observe that $\sum_{i\in S}a_{i}v_{i}=\sum_{i\in S}a_{i}v^{*}_{i}\leq\delta^{*}$ . Hence $v^{*}$ satisfies every inequality describing $P$ .

We will prove (ii) by contradiction, so we assume the objective LP value of $N$ is strictly less than $\frac{n}{2}$ . Let $P_{0}$ denote the relaxation corresponding to the root node. Fix any $\ell\in\{1,2,\ldots,n\}\backslash I$ .

Since $\lvert I\rvert\leq\left\lfloor\frac{n}{2}\right\rfloor-s$ , there exists $v^{1}=(v_{1}^{1},v_{2}^{1},\ldots,v_{n}^{1})\in V$ , where $v_{\ell}^{1}=0$ . Define $v^{2}=(v_{1}^{2},v_{2}^{2},\ldots,v_{n}^{2})$ , where $v_{\ell}^{2}=\frac{1}{2}$ , and $v_{i}^{2}=v_{i}^{1}$ for $i\in\{1,2,\ldots,n\}\backslash\{\ell\}$ . It is clear that $v^{2}\in P_{0}$ , and $v^{2}\notin P$ since the LP value is assumed to be strictly less than $\frac{n}{2}$ . Since $\ell\notin I$ , there must be a halfspace $\hat{H}$ coming from a false split disjunction of $N$ that excludes $v^{2}$ . The inequality describing this halfspace $\hat{H}$ must involve variable $x_{\ell}$ , otherwise $v^{1}$ also violates $\hat{H}$ , which leads to a contradiction since $\hat{H}$ comes from a false split disjunction and therefore cannot cut off any integer point. Hence assume the inequality describing $\hat{H}$ is $a_{\ell}x_{\ell}+\sum_{i\in S}a_{i}x_{i}\leq\delta$ for some $S\subseteq\{1,2,\ldots,n\}\backslash\{\ell\}$ , and $\lvert S\rvert\leq s-1$ (since the sparsity of the disjunctions is restricted to be at most $s$ ). Since $\sum_{i\in I}v^{1}_{i}\leq\left\lfloor\frac{n}{2}\right\rfloor-s$ , we have $\sum_{i\notin I\cup\{\ell\}}v^{1}_{i}\geq s$ , and there exists $r\in\{1,2,\ldots,n\}\backslash(S\cup I\cup\{\ell\})$ such that $v_{r}^{1}=1$ . Let $v^{3}=(v_{1}^{3},v_{2}^{3},\ldots,v_{n}^{3})$ , where $v_{\ell}^{3}=1$ , $v_{r}^{3}=0$ , and $v_{i}^{3}=v_{i}^{1}$ for $i\neq\ell,r$ . By definition of $V$ , $v^{3}\in V$ . Since $v^{1},v^{3}$ are integral, and $\hat{H}$ comes from a false split disjunction, $\hat{H}$ must be valid for $v^{1}$ and $v^{3}$ . Thus, we have

	$\displaystyle a_{\ell}\cdot 0+\sum_{i\in S}a_{i}v_{i}^{1}=a_{\ell}\cdot 0+\sum_{i\in S}a_{i}v_{i}^{2}\leq\delta,$		(2.1)
	$\displaystyle a_{\ell}\cdot 1+\sum_{i\in S}a_{i}v_{i}^{3}=a_{\ell}\cdot 1+\sum_{i\in S}a_{i}v_{i}^{1}=a_{\ell}\cdot 1+\sum_{i\in S}a_{i}v_{i}^{2}\leq\delta.$		(2.2)

Summing up (2.1) and (2.2) and dividing by 2, we get

a_{\ell}\cdot\frac{1}{2}+\sum_{i\in S}a_{i}v_{i}^{2}=a_{\ell}\cdot v^{2}_{\ell}+\sum_{i\in S}a_{i}v_{i}^{2}\leq\delta,

(2.3)

which implies that $\hat{H}$ is valid for $v^{2}$ . This is a contradiction. ∎

Proof of Theorem 1.4.

For a node $N$ of the branch-and-bound tree containing at least one integer point, if it is derived by exactly $m$ true split disjunctions, then we say it is a node of generation $m$ . By Lemma 2.2, if $m\leq\frac{1}{s}\big{\lfloor}\frac{n}{2}\big{\rfloor}-1$ , then a node $N$ of generation $m$ has LP objective value $\frac{n}{2}$ , and in the subtree rooted at $N$ there must exist at least two descendants from generation $m+1$ , since the leaf nodes must have LP values less than or equal to $\lfloor\frac{n}{2}\rfloor$ . Therefore, there are at least $2^{m}$ nodes of generation $m$ when $m\leq\frac{1}{s}\big{\lfloor}\frac{n}{2}\big{\rfloor}-1$ . This finishes the proof. ∎
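To get a feel for the counting argument above, the following minimal Python sketch evaluates the bound $2^{m}$ for the largest integer $m$ with $m\leq\frac{1}{s}\lfloor\frac{n}{2}\rfloor-1$ . The function name `generation_lower_bound` is ours and purely illustrative; it only performs the arithmetic in the proof and does not build any branch-and-bound tree.

```python
def generation_lower_bound(n: int, s: int) -> int:
    """Return 2**m for the largest integer m with m <= floor(n/2)/s - 1.

    For m < 0 the argument gives nothing beyond the root node, so we
    return the trivial bound 1 in that case.
    """
    if n < 1 or s < 1:
        raise ValueError("n and s must be positive integers")
    m = (n // 2) // s - 1  # equals floor(floor(n/2) / s) - 1
    return 2 ** max(m, 0)
```

For instance, with $n=101$ and $s=5$ the largest admissible generation is $m=9$ , so the argument yields at least $2^{9}=512$ nodes.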

2.2 Proof of Theorem 1.6

Lemma 2.3.

Let $w_{1},\ldots,w_{k}\in{\mathbb{Z}}\setminus\{0\}$ and $W\in{\mathbb{Z}}$ . Then the number of 0/1 solutions to $\sum_{j=1}^{k}w_{j}x_{j}=W$ is at most ${k\choose\lfloor k/2\rfloor}$ .

Proof.

Let $P:=\{i\in\{1,\ldots,k\}:w_{i}>0\}$ and $N:=\{i\in\{1,\ldots,k\}:w_{i}<0\}$ . By making the variable change $x_{i}=1-y_{i}$ for $i\in N$ and $x_{i}=y_{i}$ for $i\in P$ , it is seen that the number of 0/1 solutions to $\sum_{i=1}^{k}w_{i}x_{i}=W$ is the same as the number of 0/1 solutions to $\sum_{i\in P}w_{i}y_{i}+\sum_{i\in N}(-w_{i})y_{i}=W-\sum_{i\in N}w_{i}$ . Writing this a bit more cleanly, we want to upper bound the number of 0/1 solutions to $\sum_{i=1}^{k}w^{\prime}_{i}y_{i}=W^{\prime}$ , where $w_{i}^{\prime}>0$ for all $i\in\{1,\ldots,k\}$ and $W^{\prime}\in{\mathbb{Z}}$ . The collection of subsets $I\subseteq\{1,\ldots,k\}$ whose indicator vectors are solutions to $\sum_{i=1}^{k}w^{\prime}_{i}y_{i}=W^{\prime}$ is an antichain in the lattice of subsets with set inclusion as the partial order, because all the $w^{\prime}_{i}$ values are strictly positive and so a proper superset of a solution has a strictly larger weighted sum. By Sperner’s Theorem [52], the size of this collection is at most ${k\choose\lfloor k/2\rfloor}$ . ∎
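The bound of Lemma 2.3 can be checked by brute force for small $k$ . The sketch below is only a sanity check, not part of the proof: it enumerates all 0/1 vectors, groups them by the value of $\sum_{j}w_{j}x_{j}$ , and compares the largest group with ${k\choose\lfloor k/2\rfloor}$ . The helper names and the weight range $\{-5,\ldots,5\}\setminus\{0\}$ are our own choices.

```python
from itertools import product
from math import comb
import random


def solution_counts(weights):
    """Map each attained value W to the number of 0/1 vectors x with <w, x> = W."""
    counts = {}
    for x in product((0, 1), repeat=len(weights)):
        value = sum(w * xi for w, xi in zip(weights, x))
        counts[value] = counts.get(value, 0) + 1
    return counts


def respects_sperner_bound(weights):
    """Check that no right-hand side W has more than C(k, floor(k/2)) solutions."""
    k = len(weights)
    return max(solution_counts(weights).values()) <= comb(k, k // 2)


def random_check(k, trials=100, seed=0):
    """Test the bound on random nonzero integer weight vectors of length k."""
    rng = random.Random(seed)
    nonzero = [w for w in range(-5, 6) if w != 0]
    return all(
        respects_sperner_bound([rng.choice(nonzero) for _ in range(k)])
        for _ in range(trials)
    )
```

The bound is tight: with all weights equal to $1$ and $k=4$ , the value $W=2$ has exactly ${4\choose 2}=6$ solutions.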

Proof of Theorem 1.6.

We consider the instance from Theorem 1.6. For any split disjunction $D:=\{x:\langle a,x\rangle\leq b\}\cup\{x:\langle a,x\rangle\geq b+1\}$ , we define $V(D)$ to be the set of all the optimal LP vertices (of the original polytope) that lie strictly in the corresponding split set, i.e., in $\{x:b<\langle a,x\rangle<b+1\}$ . Let the support of $a$ be given by $T\subseteq\{1,\ldots,n\}$ with $t:=|T|\leq s\leq\lfloor n/2\rfloor$ . Since $a\in{\mathbb{Z}}^{n}$ and $b\in{\mathbb{Z}}$ , $V(D)$ is precisely the subset of the optimal LP vertices $\hat{x}$ such that $\langle a,\hat{x}\rangle=b+\frac{1}{2}$ . Fix some $\ell\in T$ and consider those optimal LP vertices $\hat{x}\in V(D)$ where $\hat{x}_{\ell}=\frac{1}{2}$ . This means that $\sum_{j\in T\setminus\{\ell\}}a_{j}\hat{x}_{j}=b+\frac{1}{2}-\frac{a_{\ell}}{2}$ . Let $r_{i}$ be the number of 0/1 solutions to $\sum_{j\in T\setminus\{\ell\}}a_{j}\hat{x}_{j}=b+\frac{1}{2}-\frac{a_{\ell}}{2}$ with exactly $i$ coordinates set to 1. Then the number of vertices from $V(D)$ with the $\ell$ -th coordinate equal to $\frac{1}{2}$ is

\sum_{i=0}^{t-1}r_{i}{n-t\choose\lfloor n/2\rfloor-i}\leq\left(\sum_{i=0}^{t-1}r_{i}\right){n-t\choose\lfloor n/2\rfloor-\lfloor t/2\rfloor},

since ${n-t\choose\lfloor n/2\rfloor-i}\leq{n-t\choose\lfloor n/2\rfloor-\lfloor t/2\rfloor}$ for all $i\in\{0,\ldots,t-1\}$ . Using Lemma 2.3, $\sum_{i=0}^{t-1}r_{i}\leq{t-1\choose\lfloor t/2\rfloor}$ and we obtain the upper bound ${t-1\choose\lfloor t/2\rfloor}{n-t\choose\lfloor n/2\rfloor-\lfloor t/2\rfloor}$ on the number of vertices from $V(D)$ with the $\ell$ -th coordinate equal to $\frac{1}{2}$ . Therefore, $|V(D)|\leq t{t-1\choose\lfloor t/2\rfloor}{n-t\choose\lfloor n/2\rfloor-\lfloor t/2\rfloor}=:p(t).$ Since $n$ is odd, we have

p(t)=\begin{cases}\displaystyle\frac{t!(n-t)!}{(t/2)!(t/2-1)!((n-t-1)/2)!((n-t+1)/2)!}&\mbox{if $t$ is even},\\[8.53581pt] \displaystyle\frac{t!(n-t)!}{((t-1)/2)!((t-1)/2)!((n-t)/2)!((n-t)/2)!}&\mbox{if $t$ is odd}.\end{cases}

A direct calculation then shows that

\frac{p(t+1)}{p(t)}=\begin{cases}\displaystyle\frac{(t+1)(n-t+1)}{t(n-t)}&\mbox{if $t$ is even},\\ \displaystyle 1&\mbox{if $t$ is odd}.\end{cases}

Let $h$ be the largest even number not exceeding $s$ . The ratios above are all at least $1$ , so $p$ is nondecreasing and $p(t)\leq p(s)\leq p(h+1)$ for every $t\in\{1,\dots,s\}$ . Since $p(1)={n-1\choose\lfloor n/2\rfloor}$ , we obtain, for every $t\in\{1,\dots,s\}$ ,

p(t)\leq p(h+1)={n-1\choose\lfloor n/2\rfloor}\prod_{\begin{subarray}{c}1\leq q\leq s\\ \mbox{$q$ even}\end{subarray}}\frac{q+1}{q}\cdot\frac{n-q+1}{n-q}={n-1\choose\lfloor n/2\rfloor}\cdot\frac{(h+1)!!}{h!!}\cdot\frac{(n-1)!!}{(n-2)!!}\cdot\frac{(n-h-2)!!}{(n-h-1)!!},

where $m!!$ denotes the product of all integers from $1$ up to $m$ of the same parity as $m$ . Using the fact that, for every even positive integer $\ell$ ,

\sqrt{\frac{\pi\ell}{2}}<\frac{\ell!!}{(\ell-1)!!}<\sqrt{\frac{\pi(\ell+1)}{2}}

(see, e.g., [54, 9]), we have (for $h\geq 2$ , i.e., $s\geq 2$ )

\begin{split}p(t)&\leq{n-1\choose\lfloor n/2\rfloor}\cdot\frac{(h+1)(h-1)!!}{h!!}\cdot\frac{(n-1)!!}{(n-2)!!}\cdot\frac{(n-h-2)!!}{(n-h-1)!!}\\ &\leq{n-1\choose\lfloor n/2\rfloor}(h+1)\sqrt{\frac{2}{\pi h}\cdot\frac{\pi n}{2}\cdot\frac{2}{\pi(n-h-1)}}\\ &={n-1\choose\lfloor n/2\rfloor}\sqrt{\frac{2n(h+1)^{2}}{\pi h(n-h-1)}}\\ &={n-1\choose\lfloor n/2\rfloor}O\left(\sqrt{\frac{ns}{n-s}}\right).\end{split}

This bounds $|V(D)|$ for every split disjunction $D$ of sparsity at most $s$ . The total number of optimal LP vertices of the instance is ${n{n-1\choose\lfloor n/2\rfloor}}$ , and each of them must lie strictly in the split set of at least one disjunction used in the proof. We therefore obtain the following lower bound on the size of a branch-and-bound proof: $\frac{{n{n-1\choose\lfloor n/2\rfloor}}}{|V(D)|}=\Omega\left(\sqrt{\frac{n(n-s)}{s}}\right).$ ∎
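The quantities in this proof are easy to check numerically for small odd $n$ . The sketch below evaluates $p(t)$ , compares $p(t+1)/p(t)$ with the stated closed form, and counts by brute force the optimal LP vertices lying strictly in a given split set. It assumes, as the counting in the proof suggests, that the optimal LP vertices are exactly the points with one coordinate equal to $\frac{1}{2}$ , $\lfloor n/2\rfloor$ coordinates equal to $1$ and the rest $0$ ; all function names are ours.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def p(n: int, t: int) -> int:
    """p(t) = t * C(t-1, floor(t/2)) * C(n-t, floor(n/2) - floor(t/2)), n odd."""
    return t * comb(t - 1, t // 2) * comb(n - t, n // 2 - t // 2)


def ratio(n: int, t: int) -> Fraction:
    """Closed form for p(t+1)/p(t) stated in the proof."""
    if t % 2 == 0:
        return Fraction((t + 1) * (n - t + 1), t * (n - t))
    return Fraction(1)


def optimal_lp_vertices(n: int):
    """Assumed vertex set: one coordinate 1/2, floor(n/2) ones, zeros elsewhere."""
    for ell in range(n):
        others = [i for i in range(n) if i != ell]
        for ones in combinations(others, n // 2):
            x = [Fraction(0)] * n
            x[ell] = Fraction(1, 2)
            for i in ones:
                x[i] = Fraction(1)
            yield x


def count_in_split_set(n: int, a, b: int) -> int:
    """Count the assumed optimal vertices x with <a, x> = b + 1/2."""
    target = b + Fraction(1, 2)
    return sum(
        1
        for x in optimal_lp_vertices(n)
        if sum(ai * xi for ai, xi in zip(a, x)) == target
    )
```

For $n=7$ and $a=(1,1,1,0,0,0,0)$ , $b=1$ , the brute-force count equals $p(3)=36$ , so the bound $|V(D)|\leq p(t)$ is attained in this small case.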

2.3 Proof of Theorem 1.7

Proof of Theorem 1.7.

We first exhibit a branch-and-bound proof of size $O(n)$ . Let the root node be $N_{0}$ . The objective LP value of $N_{0}$ is $\frac{n}{2}$ . Let $N_{1}^{0}$ and $N_{1}^{1}$ be the children of $N_{0}$ produced by the branches $x_{1}\leq 0$ and $x_{1}\geq 1$ respectively. Then the LP values of $N_{1}^{0}$ and $N_{1}^{1}$ are $\left\lfloor\frac{n}{2}\right\rfloor$ and $\frac{n}{2}$ . Therefore $N_{1}^{0}$ is a leaf node. Recursively, let $N_{j+1}^{0}$ and $N_{j+1}^{1}$ be the children of $N_{j}^{1}$ produced by $x_{j+1}\leq 0$ and $x_{j+1}\geq 1$ for $1\leq j\leq\left\lfloor\frac{n}{2}\right\rfloor$ . Note that this is well defined since the LP values of $N_{j}^{0}$ and $N_{j}^{1}$ are $\left\lfloor\frac{n}{2}\right\rfloor$ and $\frac{n}{2}$ for $1\leq j\leq\left\lfloor\frac{n}{2}\right\rfloor$ . It is clear that node $N_{j+1}^{0}$ is a leaf for $1\leq j\leq\left\lfloor\frac{n}{2}\right\rfloor$ . Node $N_{\left\lceil\frac{n}{2}\right\rceil}^{1}$ is an infeasible leaf since it has $\left\lceil\frac{n}{2}\right\rceil$ variables set to $1$ . Therefore, the whole branch-and-bound tree has $n+2$ nodes.

Next, we show that any cutting plane for the problem with sparsity $s\leq\left\lfloor\frac{n}{2}\right\rfloor$ is valid for $[0,1]^{n}$ . We will use the fact that $H\cap\{0,1\}^{n}=\{(x_{1},x_{2},\ldots,x_{n})\in\{0,1\}^{n}:\sum_{i=1}^{n}x_{i}\leq\left\lfloor\frac{n}{2}\right\rfloor\}$ .

Let $S\subseteq\{1,\ldots,n\}$ be the set of indices for the non-zero coefficients in an inequality defining the cutting plane, i.e., the inequality is given by $\sum_{i\in S}a_{i}x_{i}\leq\delta$ . Since this is a cutting plane it must be valid for all points in $H\cap\{0,1\}^{n}$ . Let $V_{S}=\{(x_{1},x_{2},\ldots,x_{n})\in\{0,1\}^{n}:x_{i}=0,i\not\in S\}$ . Since $|S|\leq s\leq\left\lfloor\frac{n}{2}\right\rfloor$ , we have $V_{S}\subseteq H\cap\{0,1\}^{n}$ . Therefore $\sum_{i\in S}a_{i}x_{i}\leq\delta$ is valid for all of $V_{S}$ . Since the inequality only involves $x_{i}$ , $i\in S$ , it must also be a valid inequality for all of $\{0,1\}^{n}$ . ∎
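The second half of the argument can be checked exhaustively for small $n$ . The sketch below uses only the fact stated above, namely that the feasible 0/1 points are those with at most $\lfloor n/2\rfloor$ ones. For a coefficient vector supported on $S$ , it compares the maximum of $\sum_{i\in S}a_{i}x_{i}$ over the feasible 0/1 points with the maximum over all of $\{0,1\}^{n}$ . When the two agree, every valid right-hand side $\delta$ for the feasible points is also valid for the whole cube. The function names and the coefficient values are illustrative assumptions.

```python
from itertools import combinations, product


def max_lhs(a, points):
    """Largest value of <a, x> over the given 0/1 points."""
    return max(sum(ai * xi for ai, xi in zip(a, x)) for x in points)


def sparse_cut_is_trivial(n, support, coeffs):
    """True if the tightest valid rhs over feasible 0/1 points is also valid on {0,1}^n."""
    a = [0] * n
    for i, c in zip(support, coeffs):
        a[i] = c
    cube = list(product((0, 1), repeat=n))
    feasible = [x for x in cube if sum(x) <= n // 2]
    return max_lhs(a, feasible) == max_lhs(a, cube)


def check_all_sparse_cuts(n, coeff_values):
    """Try every support of size at most floor(n/2) and every coefficient choice."""
    for size in range(1, n // 2 + 1):
        for support in combinations(range(n), size):
            for coeffs in product(coeff_values, repeat=size):
                if not sparse_cut_is_trivial(n, support, coeffs):
                    return False
    return True
```

By contrast, the dense inequality $\sum_{i=1}^{n}x_{i}\leq\lfloor n/2\rfloor$ is valid for the feasible points but not for the cube, which the same helper detects.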

2.4 Proof of Theorem 1.8

Proof of Theorem 1.8.

Let the cutting plane proof be $H_{1},H_{2},\ldots,H_{L}$ , and let the sequence of the corresponding disjunctions deriving them be $D_{1},D_{2},\ldots,D_{L}\in\mathcal{D}$ . Moreover, assume $H_{i}$ is $\langle\alpha_{i},x\rangle\leq\delta_{i}$ for $1\leq i\leq L$ . Since we assume all cutting planes are rational, we may assume $\alpha_{i}\in{\mathbb{Z}}^{n+d}$ and $\delta_{i}\in{\mathbb{Z}}$ . Let $H^{\prime}_{i}$ be $\langle\alpha_{i},x\rangle\geq\delta_{i}+1$ . Since $H_{i}$ is valid for $C\cap D_{i}$ , we must have that $(C\cap H^{\prime}_{i})\cap D_{i}=\emptyset$ .

Let $N_{0}=C$ be the root node of the branch-and-bound tree. Recursively, we define $N_{i}$ and $N_{i}^{\prime}$ to be the children of $N_{i-1}$ generated by applying the split disjunction $H_{i}\cup H^{\prime}_{i}$ for $1\leq i\leq L$ . Applying the disjunction $D_{i}$ on $N_{i}^{\prime}$ only generates infeasible nodes, as noted above. Meanwhile, $N_{i}$ shows the validity of $H_{i}$ . Thus, we have replaced the cut $H_{i}$ with $k+1$ nodes of the branch-and-bound tree: $k$ of these are infeasible and one is feasible. Therefore, we get a branch-and-bound tree of size $(k+1)L$ .

A well-known family of instances in $\mathbb{R}^{3}$ , given by $\operatorname*{conv}\{(0,0,0),(2,0,0),(0,2,0),(\frac{1}{2},\frac{1}{2},h)\}$ for $h\in{\mathbb{N}}$ , from [18] can be solved by branch-and-bound in $O(1)$ iterations with just variable disjunctions; however, there is a $\operatorname*{poly}(\log(h))$ lower bound on the split rank [15], and therefore, on the length of proofs based on split cuts. ∎

2.5 Proofs of Theorems 1.12 and 1.14

We will need some preliminary facts for comparing growth rate of instance sizes.

Definition 2.4.

A sequence of real numbers $(a_{n})_{n\in{\mathbb{N}}}$ is said to (asymptotically) polynomially dominate another sequence $(b_{n})_{n\in{\mathbb{N}}}$ if there exists a polynomial $p$ , and two natural numbers $n_{1},n_{2}\in{\mathbb{N}}$ such that

\lim_{n\to\infty}\frac{b_{n_{1}+n}}{p(a_{n_{2}+n})}<\infty.

If $(a_{n})_{n\in{\mathbb{N}}}$ polynomially dominates $(b_{n})_{n\in{\mathbb{N}}}$ and vice versa, we say that the two sequences are (asymptotically) polynomially equivalent.

Note that if $b_{n}=O(p(a_{n}))$ for some polynomial $p$ , then $(a_{n})_{n\in{\mathbb{N}}}$ polynomially dominates $(b_{n})_{n\in{\mathbb{N}}}$ (for example, $a_{n}=n$ is polynomially equivalent to the sequence $b_{n}=n^{3}$ ). However, our definition allows us to neglect a finite number of terms from both sequences. To illustrate the difference, consider the following two sequences. Define $a_{1}=2$ , and recursively $a_{n+1}=2^{a_{n}}$ for $n\geq 1$ . Define $b_{n}=a_{n+1}$ for $n\geq 1$ . There is no polynomial $p$ such that $b_{n}=O(p(a_{n}))$ . Nevertheless, the sequence $(b_{n})_{n\in{\mathbb{N}}}$ is simply a “shift” of the sequence $(a_{n})_{n\in{\mathbb{N}}}$ and we would like to say that both have the same growth rate. Our definition captures this situation.
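The tower example can be made concrete with exact integer arithmetic. The following sketch builds the first few terms of $(a_{n})$ and of the shifted sequence $b_{n}=a_{n+1}$ ; the function names are ours. It illustrates that $b_{n}$ outgrows a fixed power of $a_{n}$ , while $b_{n}$ coincides with $a_{n+1}$ , which corresponds to taking $n_{1}=0$ , $n_{2}=1$ and $p(x)=x$ in Definition 2.4.

```python
def tower(n_terms: int) -> list:
    """First n_terms of the sequence a_1 = 2, a_{n+1} = 2**a_n."""
    seq = [2]
    while len(seq) < n_terms:
        seq.append(2 ** seq[-1])
    return seq


def shifted(seq: list) -> list:
    """The sequence b_n = a_{n+1}, i.e. seq with its first term dropped."""
    return seq[1:]
```

Already for $n=4$ we have $a_{4}=65536$ and $b_{4}=2^{65536}$ , which exceeds $a_{4}^{1000}=2^{16000}$ .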

The following two lemmas are direct consequences of Definition 2.4.

Lemma 2.5.

Let $(a_{n})_{n\in{\mathbb{N}}}$ and $(b_{n})_{n\in{\mathbb{N}}}$ be two sequences such that $a_{n}\geq b_{n}$ for all $n\in{\mathbb{N}}$ . Then $(a_{n})_{n\in{\mathbb{N}}}$ polynomially dominates $(b_{n})_{n\in{\mathbb{N}}}$ .

Lemma 2.6.

Let $(a_{n})_{n\in{\mathbb{N}}}$ and $(b_{n})_{n\in{\mathbb{N}}}$ be two sequences such that $a_{n}\leq b_{n}\leq a_{n+1}$ for all $n\in{\mathbb{N}}$ . Then $(a_{n})_{n\in{\mathbb{N}}}$ and $(b_{n})_{n\in{\mathbb{N}}}$ are polynomially equivalent.

Proposition 2.7.

Let $(a_{n})_{n\in{\mathbb{N}}}$ and $(b_{n})_{n\in{\mathbb{N}}}$ be two sequences such that $\lim_{n\to\infty}a_{n}=\infty=\lim_{n\to\infty}b_{n}$ . Then there exist subsequences $(a^{\prime}_{n})_{n\in{\mathbb{N}}}$ and $(b^{\prime}_{n})_{n\in{\mathbb{N}}}$ of $(a_{n})_{n\in{\mathbb{N}}}$ and $(b_{n})_{n\in{\mathbb{N}}}$ respectively such that $(a^{\prime}_{n})_{n\in{\mathbb{N}}}$ and $(b^{\prime}_{n})_{n\in{\mathbb{N}}}$ are polynomially equivalent.

Proof.

Since $\lim_{n\to\infty}a_{n}=\infty=\lim_{n\to\infty}b_{n}$ , there exist subsequences $(a^{\prime}_{n})_{n\in{\mathbb{N}}}$ and $(b^{\prime}_{n})_{n\in{\mathbb{N}}}$ of $(a_{n})_{n\in{\mathbb{N}}}$ and $(b_{n})_{n\in{\mathbb{N}}}$ respectively such that $a^{\prime}_{n}\leq b^{\prime}_{n}\leq a^{\prime}_{n+1}$ for all $n\in{\mathbb{N}}$ . Indeed, one can build these subsequences inductively: Start with $a^{\prime}_{1}=a_{1}$ , and define $b^{\prime}_{1}$ to be the smallest number in the sequence $(b_{n})_{n\in{\mathbb{N}}}$ larger than or equal to $a^{\prime}_{1}$ . Suppose we have built the subsequences up to some $i\in{\mathbb{N}}$ : $a^{\prime}_{1},\ldots,a^{\prime}_{i}$ and $b^{\prime}_{1},\ldots,b^{\prime}_{i}$ such that $a^{\prime}_{k}\leq b^{\prime}_{k}\leq a^{\prime}_{k+1}$ for all $k\leq i-1$ and $a^{\prime}_{i}\leq b^{\prime}_{i}$ . Define $a^{\prime}_{i+1}$ to be the smallest number in the sequence $(a_{n})_{n\in{\mathbb{N}}}$ larger than or equal to $b^{\prime}_{i}$ , and define $b^{\prime}_{i+1}$ to be the smallest number in the sequence $(b_{n})_{n\in{\mathbb{N}}}$ larger than or equal to $a^{\prime}_{i+1}$ . By Lemma 2.6, these two subsequences are polynomially equivalent. ∎
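The inductive construction in this proof can be sketched on finite prefixes of the two sequences. The code below is a minimal illustration under two assumptions of ours: both inputs are strictly increasing lists, and each new term is taken at a strictly later index than the previous one, so that the outputs are genuine subsequences even when the two sequences share values. The function name `interleaved_subsequences` is hypothetical.

```python
from bisect import bisect_left


def interleaved_subsequences(a, b, length):
    """Greedily pick a_sub, b_sub with a_sub[k] <= b_sub[k] <= a_sub[k+1].

    a and b are strictly increasing lists (finite prefixes of the sequences).
    Stops once b_sub has `length` terms or a prefix is exhausted.
    """
    a_sub, b_sub = [a[0]], []
    i, j = 0, -1  # indices of the last chosen terms of a and b
    while len(b_sub) < length:
        # smallest later term of b that is >= the latest chosen term of a
        j = max(bisect_left(b, a_sub[-1]), j + 1)
        if j >= len(b):
            break
        b_sub.append(b[j])
        # smallest later term of a that is >= the latest chosen term of b
        i = max(bisect_left(a, b_sub[-1]), i + 1)
        if i >= len(a):
            break
        a_sub.append(a[i])
    return a_sub, b_sub
```

For example, with $a_{n}=n^{2}$ and $b_{n}=2^{n}$ the first four steps select $1,4,9,16,36$ from $(a_{n})$ and $2,4,16,32$ from $(b_{n})$ , which interleave as required by Lemma 2.6.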

We next derive some straightforward consequences of Definition 1.11.

Lemma 2.8.

Let $C\subseteq C^{\prime}$ be two closed, convex sets. Let $\mathcal{D}$ be any branching scheme and let $\mathcal{CP}$ be an inclusion closed cutting plane paradigm. If there is a branch-and-bound proof with respect to $C^{\prime}$ based on $\mathcal{D}$ for the validity of an inequality $\langle c,x\rangle\leq\gamma$ , then there is a branch-and-bound proof with respect to $C$ based on $\mathcal{D}$ for the validity of $\langle c,x\rangle\leq\gamma$ of the same size. The same holds for cutting plane proofs based on $\mathcal{CP}$ .

Proof.

For the branch-and-bound proofs, apply the same set of disjunctions on $C$ instead of $C^{\prime}$ . Since $C\subseteq C^{\prime}$ , all the nodes in the branch-and-bound tree for $C$ are subsets of the corresponding nodes in the branch-and-bound tree for $C^{\prime}$ . Thus, $\langle c,x\rangle\leq\gamma$ is valid for the leaves of the new branch-and-bound tree.

For the cutting plane proofs, apply the same sequence of cuts and the result follows from the inclusion closed property of $\mathcal{CP}$ (Definition 1.11).∎

Lemma 2.9.

Let $\mathcal{D}$ and $\mathcal{CP}$ be both embedding closed and let $C\subseteq\mathbb{R}^{n_{1}}\times\mathbb{R}^{d_{1}}$ be a closed, convex set. Let $\langle c,x\rangle\leq\gamma$ be a valid inequality for $C\cap({\mathbb{Z}}^{n_{1}}\times\mathbb{R}^{d_{1}})$ . If there is a branch-and-bound proof with respect to $C\times\{0\}^{n_{2}}\times\{0\}^{d_{2}}$ based on $\mathcal{D}$ for the validity of $\langle c,x\rangle\leq\gamma$ interpreted as a valid inequality in $\mathbb{R}^{n_{1}}\times\mathbb{R}^{d_{1}}\times\mathbb{R}^{n_{2}}\times\mathbb{R}^{d_{2}}$ for $(C\times\{0\}^{n_{2}}\times\{0\}^{d_{2}})\cap({\mathbb{Z}}^{n_{1}}\times\mathbb{R}^{d_{1}}\times{\mathbb{Z}}^{n_{2}}\times\mathbb{R}^{d_{2}})$ , then there is a branch-and-bound proof with respect to $C$ based on $\mathcal{D}$ for the validity of $\langle c,x\rangle\leq\gamma$ of the same size. The same holds for cutting plane proofs based on $\mathcal{CP}$ .

Proof.

Since $\mathcal{D}$ is embedding closed, for any disjunction $D$ used in the space $\mathbb{R}^{n_{1}}\times\mathbb{R}^{d_{1}}\times\mathbb{R}^{n_{2}}\times\mathbb{R}^{d_{2}}$ , we use the restriction of $D$ to the space $\mathbb{R}^{n_{1}}\times\mathbb{R}^{d_{1}}$ (Definition 1.11).

Similarly, the cutting plane claim follows from the fact that $\mathcal{CP}$ is embedding closed (Definition 1.11). ∎

Lemma 2.10.

Let $C\subseteq\mathbb{R}^{n+d}$ be a polytope and let $\langle c,x\rangle\leq\gamma$ be a valid inequality for $C\cap({\mathbb{Z}}^{n}\times\mathbb{R}^{d})$ . Let $X:=\{(x,t)\in\mathbb{R}^{n+d}\times\mathbb{R}:x\in C,\;\;t=\langle c,x\rangle\}$ . Then, for any regular branching scheme $\mathcal{D}$ or a regular cutting plane paradigm $\mathcal{CP}$ , any proof of validity of $\langle c,x\rangle\leq\gamma$ with respect to $C\cap({\mathbb{Z}}^{n}\times\mathbb{R}^{d})$ can be changed into a proof of validity of $t\leq\gamma$ with respect to $X\cap({\mathbb{Z}}^{n}\times\mathbb{R}^{d}\times\mathbb{R})$ with no change in length, and vice versa.

Proof.

A proof of $\langle c,x\rangle\leq\gamma$ with respect to $C\cap({\mathbb{Z}}^{n}\times\mathbb{R}^{d})$ never involves $t$ , and so can be carried over verbatim to a proof for $t=\langle c,x\rangle\leq\gamma$ with respect to $X\cap({\mathbb{Z}}^{n}\times\mathbb{R}^{d}\times\mathbb{R})$ . In the other direction, since we assume $\mathcal{D}$ is regular (Definition 1.11), no disjunction uses the variable $t$ and so it can be applied with the same effect on $C$ . Similarly, since $\mathcal{CP}$ is regular, by definition any cutting plane derived for $X$ can be converted into an equivalent cutting plane for $C$ .∎

Proof of Theorem 1.12.

Let $\{P_{k}\subseteq\mathbb{R}^{n_{k}}\times\mathbb{R}^{d_{k}}:k\in{\mathbb{N}}\}$ be a family of closed, convex sets, and $\{(c_{k},\gamma_{k})\in\mathbb{R}^{n_{k}}\times\mathbb{R}^{d_{k}}\times\mathbb{R}:k\in{\mathbb{N}}\}$ be a family of tuples such that $\langle c_{k},x\rangle\leq\gamma_{k}$ is valid for $P_{k}\cap({\mathbb{Z}}^{n_{k}}\times\mathbb{R}^{d_{k}})$ , and $\mathcal{CP}$ has polynomial size proofs for this family of instances, whereas $\mathcal{D}$ has exponential size proofs. Similarly, let $\{P^{\prime}_{k}\subseteq\mathbb{R}^{n^{\prime}_{k}}\times\mathbb{R}^{d^{\prime}_{k}}:k\in{\mathbb{N}}\}$ be a family of closed, convex sets, and $\{(c^{\prime}_{k},\gamma^{\prime}_{k})\in\mathbb{R}^{n^{\prime}_{k}}\times\mathbb{R}^{d^{\prime}_{k}}\times\mathbb{R}:k\in{\mathbb{N}}\}$ be a family of tuples such that $\langle c^{\prime}_{k},x\rangle\leq\gamma^{\prime}_{k}$ is valid for $P^{\prime}_{k}\cap({\mathbb{Z}}^{n^{\prime}_{k}}\times\mathbb{R}^{d^{\prime}_{k}})$ , and $\mathcal{D}$ has polynomial size proofs for this family of instances, whereas $\mathcal{CP}$ has exponential size proofs. By Proposition 2.7, we may assume that the sequences of sizes of the instances $(P_{k},c_{k},\gamma_{k})$ and $(P^{\prime}_{k},c^{\prime}_{k},\gamma^{\prime}_{k})$ in the two families are polynomially equivalent, by passing to infinite subfamilies if necessary. Since the polynomial or exponential behaviour of the proof sizes is defined with respect to the sizes of the instances, passing to infinite subfamilies maintains this behaviour.

We first embed $P_{k}$ and $P^{\prime}_{k}$ into a common ambient space for each $k\in{\mathbb{N}}$ . This is done by defining $\bar{n}_{k}=\max\{n_{k},n^{\prime}_{k}\}$ , $\bar{d}_{k}=\max\{d_{k},d^{\prime}_{k}\}$ , and embedding both $P_{k}$ and $P^{\prime}_{k}$ into the space $\mathbb{R}^{\bar{n}_{k}}\times\mathbb{R}^{\bar{d}_{k}}$ by defining $Q_{k}:=P_{k}\times\{0\}^{\bar{n}_{k}-n_{k}}\times\{0\}^{\bar{d}_{k}-d_{k}}$ and $Q^{\prime}_{k}:=P^{\prime}_{k}\times\{0\}^{\bar{n}_{k}-n^{\prime}_{k}}\times\{0\}^{\bar{d}_{k}-d^{\prime}_{k}}$ . By Lemma 2.9, $\mathcal{D}$ has an exponential lower bound on sizes of proofs for the inequality $\langle c_{k},x\rangle\leq\gamma_{k}$ , interpreted as an inequality in $\mathbb{R}^{\bar{n}_{k}}\times\mathbb{R}^{\bar{d}_{k}}$ , valid for $Q_{k}\cap({\mathbb{Z}}^{\bar{n}_{k}}\times\mathbb{R}^{\bar{d}_{k}})$ . By Lemma 2.9, $\mathcal{CP}$ has an exponential lower bound on sizes of proofs for the inequality $\langle c^{\prime}_{k},x\rangle\leq\gamma^{\prime}_{k}$ , interpreted as an inequality in $\mathbb{R}^{\bar{n}_{k}}\times\mathbb{R}^{\bar{d}_{k}}$ , valid for $Q^{\prime}_{k}\cap({\mathbb{Z}}^{\bar{n}_{k}}\times\mathbb{R}^{\bar{d}_{k}})$ .

We now make the objective vector common for both families of instances. Define $X_{k}:=\{(x,t)\in\mathbb{R}^{\bar{n}_{k}}\times\mathbb{R}^{\bar{d}_{k}}\times\mathbb{R}:x\in Q_{k},\;\;t=\langle c_{k},x\rangle\}$ and $X^{\prime}_{k}:=\{(x,t)\in\mathbb{R}^{\bar{n}_{k}}\times\mathbb{R}^{\bar{d}_{k}}\times\mathbb{R}:x\in Q^{\prime}_{k},\;\;t=\langle c^{\prime}_{k},x\rangle\}$ . By Lemma 2.10, the inequality $t\leq\gamma_{k}$ has an exponential lower bound on sizes of proofs based on $\mathcal{D}$ for $X_{k}$ and the inequality $t\leq\gamma^{\prime}_{k}$ has an exponential lower bound on sizes of proofs based on $\mathcal{CP}$ for $X^{\prime}_{k}$ .

We next embed these families as faces of the same closed convex set. Define $Z_{k}\subseteq\mathbb{R}^{\bar{n}_{k}}\times\mathbb{R}^{\bar{d}_{k}}\times\mathbb{R}\times\mathbb{R}$ , for every $k\in{\mathbb{N}}$ , as the convex hull of $X_{k}\times\{0\}$ and $X^{\prime}_{k}\times\{1\}$ .

The key point to note is that these constructions combine two families whose sizes are polynomially equivalent and therefore the new family that is created has sizes that are polynomially equivalent to the original two families.

We let $(x,t,y)$ denote points in the new space $\mathbb{R}^{\bar{n}_{k}}\times\mathbb{R}^{\bar{d}_{k}}\times\mathbb{R}\times\mathbb{R}$ , i.e., $y$ denotes the last coordinate. Consider the family of inequalities $t-\gamma_{k}(1-y)-\gamma^{\prime}_{k}y\leq 0$ for every $k\in{\mathbb{N}}$ . Note that this inequality reduces to $t\leq\gamma_{k}$ when $y=0$ and it reduces to $t\leq\gamma^{\prime}_{k}$ when $y=1$ . Thus, the inequality is valid for $Z_{k}\cap({\mathbb{Z}}^{\bar{n}_{k}}\times\mathbb{R}^{\bar{d}_{k}}\times\mathbb{R}\times{\mathbb{Z}})$ , i.e., when we constrain $y$ to be an integer variable. Since $X_{k}\times\{0\}\subseteq Z_{k}$ , by Lemma 2.8, proofs of $t-\gamma_{k}(1-y)-\gamma^{\prime}_{k}y\leq 0$ based on $\mathcal{D}$ have an exponential lower bound on their size. Similarly, since $X^{\prime}_{k}\times\{1\}\subseteq Z_{k}$ , by Lemma 2.8, proofs of $t-\gamma_{k}(1-y)-\gamma^{\prime}_{k}y\leq 0$ based on $\mathcal{CP}$ have an exponential lower bound on their size.

However, for branch-and-cut based on $\mathcal{CP}$ and $\mathcal{D}$ , we can first branch on the variable $y$ (recall from the hypothesis that $\mathcal{D}$ allows branching on any integer variable). Since $\mathcal{CP}$ has a polynomial proof for $P_{k}$ and $(c_{k},\gamma_{k})$ , and therefore for the valid inequality $t\leq\gamma_{k}$ for $X_{k}\times\{0\}$ , we can process the $y=0$ branch with polynomial size cutting plane proofs. Similarly, since $\mathcal{D}$ has a polynomial proof for $P^{\prime}_{k}$ and $(c^{\prime}_{k},\gamma^{\prime}_{k})$ , and therefore for the valid inequality $t\leq\gamma^{\prime}_{k}$ for $X^{\prime}_{k}\times\{1\}$ , we can process the $y=1$ branch with polynomial size branch-and-bound proofs. Thus, branch-and-cut gives polynomial size proofs overall for this family of instances. ∎

Proof of Theorem 1.14.

Recall that we restrict ourselves to the pure integer case, i.e., $d=0$ . Consider any branch-and-cut proof for some instance. If no cutting planes are used in the proof, this is a pure branch-and-bound proof and we are done. Otherwise, let $N$ be a node of the proof tree where a cutting plane $\langle a,x\rangle\leq\gamma$ is used. Since we assume all cutting planes are rational, we may assume $a\in{\mathbb{Z}}^{n}$ and $\gamma\in{\mathbb{Z}}$ . Thus, $N^{\prime}=N\cap\{x:\langle a,x\rangle\geq\gamma+1\}$ is integer infeasible. Since $\langle a,x\rangle\leq\gamma$ is in $\mathcal{CP}(N)$ , by our assumption, there must be a branch-and-bound proof of polynomial size based on $\mathcal{D}$ for the validity of $\langle a,x\rangle\leq\gamma$ with respect to $N$ . Since $N^{\prime}\subseteq N$ , by Lemma 2.8, there must be a branch-and-bound proof of the same size for the validity of $\langle a,x\rangle\leq\gamma$ with respect to $N^{\prime}$ , thus proving the infeasibility of $N^{\prime}$ . In the branch-and-cut proof, one can replace the child of $N$ by first applying the disjunction $\{x:\langle a,x\rangle\leq\gamma\}\cup\{x:\langle a,x\rangle\geq\gamma+1\}$ on $N$ , and then on $N^{\prime}$ , applying the above branch-and-bound proof of infeasibility. We now have a branch-and-cut proof for the original instance with one fewer cutting plane node. We can repeat this for all nodes where a cutting plane is added and convert the entire branch-and-cut tree into a pure branch-and-bound tree with at most a polynomial blow-up in size.∎
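The size accounting behind this conversion can be sketched as follows. The `Node` class and its fields are hypothetical and model only the shape of a branch-and-cut tree, not the underlying polyhedra: each cut node carries the size `bb_proof_size` of an assumed branch-and-bound refutation of $N^{\prime}$ . Replacing a cut node by a split disjunction keeps the node and its old child, and adds the refutation subtree rooted at $N^{\prime}$ .

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str                                   # "branch", "cut", or "leaf"
    children: List["Node"] = field(default_factory=list)
    bb_proof_size: int = 0                      # cut nodes only: size of the assumed B&B refutation of N'


def size(node: Node) -> int:
    """Number of nodes of the branch-and-cut tree."""
    return 1 + sum(size(child) for child in node.children)


def pure_bb_size(node: Node) -> int:
    """Size after every cut node is replaced by a split plus a B&B refutation of N'."""
    total = 1 + sum(pure_bb_size(child) for child in node.children)
    if node.kind == "cut":
        # the refutation subtree rooted at N' contributes bb_proof_size nodes
        total += node.bb_proof_size
    return total
```

In this model the new tree has the old size plus the sum of the refutation sizes, so if each refutation has size at most $q$ , the resulting tree has size at most $(1+q)$ times that of the original branch-and-cut tree.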

Acknowledgments

Amitabh Basu and Hongyi Jiang gratefully acknowledge support from ONR Grant N000141812096, NSF Grant CCF2006587, and AFOSR Grant FA95502010341. Michele Conforti and Marco Di Summa were supported by a SID grant of the University of Padova.

References

[1] Karen Aardal, Robert E Bixby, Cor AJ Hurkens, Arjen K Lenstra, and Job W Smeltink. Market split and basis reduction: Towards a solution of the Cornuéjols-Dawande instances. INFORMS Journal on Computing, 12(3):192–202, 2000.
[2] Sanjeev Arora and Boaz Barak. Computational complexity: a modern approach. Cambridge University Press, 2009.
[3] Amitabh Basu, Michele Conforti, Marco Di Summa, and Hongyi Jiang. Complexity of cutting plane and branch-and-bound algorithms for mixed-integer optimization. https://arxiv.org/abs/2003.05023, 2019.
[4] Paul Beame, Noah Fleming, Russell Impagliazzo, Antonina Kolokolova, Denis Pankratov, Toniann Pitassi, and Robert Robere. Stabbing Planes. In Anna R. Karlin, editor, 9th Innovations in Theoretical Computer Science Conference (ITCS 2018), volume 94 of Leibniz International Proceedings in Informatics (LIPIcs), pages 10:1–10:20, Dagstuhl, Germany, 2018. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.
[5] Robert E Bixby. Solving real-world linear programs: A decade and more of progress. Operations Research, 50(1):3–15, 2002.
[6] Alexander Bockmayr, Friedrich Eisenbrand, Mark Hartmann, and Andreas S Schulz. On the Chvátal rank of polytopes in the 0/1 cube. Discrete Applied Mathematics, 98(1-2):21–27, 1999.
[7] Maria Bonet, Toniann Pitassi, and Ran Raz. Lower bounds for cutting planes proofs with small coefficients. The Journal of Symbolic Logic, 62(3):708–728, 1997.
[8] Samuel R Buss and Peter Clote. Cutting planes, connectivity, and threshold logic. Archive for Mathematical Logic, 35(1):33–62, 1996.
[9] Chao-Ping Chen and Feng Qi. Completely monotonic function associated with the gamma functions and proof of Wallis’ inequality. Tamkang Journal of Mathematics, 36(4):303–307, 2005.
[10] Vašek Chvátal. Hard knapsack problems. Operations Research, 28(6):1402–1411, 1980.
[11] Vašek Chvátal. Cutting-plane proofs and the stability number of a graph, Report Number 84326-OR. Institut für Ökonometrie und Operations Research, Universität Bonn, Bonn, 1984.
[12] Vašek Chvátal, William J. Cook, and Mark Hartmann. On cutting-plane proofs in combinatorial optimization. Linear Algebra and its Applications, 114:455–499, 1989.
[13] Peter Clote. Cutting planes and constant depth Frege proofs. In Proceedings of the Seventh Annual IEEE Symposium on Logic in Computer Science, pages 296–307, 1992.
[14] Michele Conforti, Gérard Cornuéjols, and Giacomo Zambelli. Integer programming, volume 271. Springer, 2014.
[15] Michele Conforti, Alberto Del Pia, Marco Di Summa, Yuri Faenza, and Roland Grappe. Reverse Chvátal–Gomory rank. SIAM Journal on Discrete Mathematics, 29(1):166–181, 2015.
[16] William J. Cook, Collette R. Coullard, and Gy Turán. On the complexity of cutting-plane proofs. Discrete Applied Mathematics, 18(1):25–38, 1987.
[17] William J. Cook and Sanjeeb Dash. On the matrix-cut rank of polyhedra. Mathematics of Operations Research, 26(1):19–30, 2001.
[18] William J. Cook, Ravindran Kannan, and Alexander Schrijver. Chvátal closures for mixed integer programming problems. Mathematical Programming, 47:155–174, 1990.
[19] Gerard Cornuéjols, Leo Liberti, and Giacomo Nannicini. Improved strategies for branching on general disjunctions. Mathematical Programming, 130(2):225–247, 2011.
[20] Daniel Dadush and Samarth Tiwari. On the complexity of branching proofs. arXiv preprint arXiv:2006.04124, 2020.
[21] Sanjeeb Dash. An exponential lower bound on the length of some classes of branch-and-cut proofs. In International Conference on Integer Programming and Combinatorial Optimization (IPCO), pages 145–160. Springer, 2002.
[22] Sanjeeb Dash. Exponential lower bounds on the lengths of some classes of branch-and-cut proofs. Mathematics of Operations Research, 30(3):678–700, 2005.
[23] Sanjeeb Dash. On the complexity of cutting-plane proofs using split cuts. Operations Research Letters, 38(2):109–114, 2010.
[24] Santanu S Dey, Andres Iroume, and Marco Molinaro. Some lower bounds on sparse outer approximations of polytopes. Operations Research Letters, 43(3):323–328, 2015.
[25] Santanu S Dey, Marco Molinaro, and Qianyi Wang. Approximating polyhedra with sparse inequalities. Mathematical Programming, 154(1-2):329–352, 2015.
[26] Santanu S Dey, Marco Molinaro, and Qianyi Wang. Analysis of sparse cutting planes for sparse MILPs with applications to stochastic MILPs. Mathematics of Operations Research, 43(1):304–332, 2018.
[27] Friedrich Eisenbrand and Andreas S Schulz. Bounds on the Chvátal rank of polytopes in the 0/1-cube. Combinatorica, 23(2):245–261, 2003.
[28] Samuel K Eldersveld and Michael A Saunders. A block-LU update for large-scale linear programming. SIAM Journal on Matrix Analysis and Applications, 13(1):191–201, 1992.
[29] Andreas Goerdt. Cutting plane versus Frege proof systems. In International Workshop on Computer Science Logic, pages 174–194. Springer, 1990.
[30] Andreas Goerdt. The cutting plane proof system with bounded degree of falsity. In International Workshop on Computer Science Logic, pages 119–133. Springer, 1991.
[31] Dima Grigoriev, Edward A Hirsch, and Dmitrii V Pasechnik. Complexity of semi-algebraic proofs. In Annual Symposium on Theoretical Aspects of Computer Science (STACS), pages 419–430. Springer, 2002.
[32] Martin Grötschel, László Lovász, and Alexander Schrijver. Geometric Algorithms and Combinatorial Optimization, volume 2 of Algorithms and Combinatorics: Study and Research Texts. Springer-Verlag, Berlin, 1988.
[33] Edward A Hirsch and Sergey I Nikolenko. Simulating cutting plane proofs with restricted degree of falsity by resolution. In International Conference on Theory and Applications of Satisfiability Testing, pages 135–142. Springer, 2005.
[34] Russell Impagliazzo, Toniann Pitassi, and Alasdair Urquhart. Upper and lower bounds for tree-like cutting planes proofs. In Proceedings Ninth Annual IEEE Symposium on Logic in Computer Science, pages 220–228. IEEE, 1994.
[35] Robert G Jeroslow. Trivial integer programs unsolvable by branch-and-bound. Mathematical Programming, 6(1):105–109, 1974.
[36] Miroslav Karamanov and Gérard Cornuéjols. Branching on general disjunctions. Mathematical Programming, 128(1-2):403–436, 2011.
[37] Arist Kojevnikov. Improved lower bounds for tree-like resolution over linear inequalities. In International Conference on Theory and Applications of Satisfiability Testing, pages 70–79. Springer, 2007.
[38] Jan Krajíček. Discretely ordered modules as a first-order extension of the cutting planes proof system. The Journal of Symbolic Logic, 63(4):1582–1596, 1998.
[39] Andrea Lodi. Mixed integer programming computation. In 50 Years of Integer Programming 1958-2008, pages 619–645. Springer, 2010.
[40] Ashutosh Mahajan and Theodore K Ralphs. Experiments with branching using general disjunctions. In Operations Research and Cyber-Infrastructure, pages 101–118. Springer, 2009.
[41] Hanan Mahmoud and John W Chinneck. Achieving MILP feasibility quickly using general disjunctions. Computers & Operations Research, 40(8):2094–2102, 2013.
[42] James Ostrowski, Jeff Linderoth, Fabrizio Rossi, and Stefano Smriglio. Constraint orbital branching. In International Conference on Integer Programming and Combinatorial Optimization, pages 225–239. Springer, 2008.
[43] Jonathan H Owen and Sanjay Mehrotra. A disjunctive cutting plane procedure for general mixed-integer linear programs. Mathematical Programming, 89(3):437–448, 2001.
[44] Jonathan H Owen and Sanjay Mehrotra. Experimental results on using general disjunctions in branch-and-bound for general-integer linear programs. Computational Optimization and Applications, 20(2):159–170, 2001.
[45] Gábor Pataki, Mustafa Tural, and Erick B Wong. Basis reduction and the complexity of branch-and-bound. In Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1254–1261. SIAM, 2010.
[46] Pavel Pudlák. Lower bounds for resolution and cutting plane proofs and monotone computations. The Journal of Symbolic Logic, 62(3):981–998, 1997.
[47] Pavel Pudlák. On the complexity of the propositional calculus. London Mathematical Society Lecture Note Series, pages 197–218, 1999.
[48] Alexander A Razborov. On the width of semialgebraic proofs and algorithms. Mathematics of Operations Research, 42(4):1106–1134, 2017.
[49] John Ker Reid. A sparsity-exploiting variant of the Bartels-Golub decomposition for linear programming bases. Mathematical Programming, 24(1):55–69, 1982.
[50] Thomas Rothvoß and Laura Sanità. 0/1 polytopes with quadratic Chvátal rank. In International Conference on Integer Programming and Combinatorial Optimization (IPCO), pages 349–361. Springer, 2013.
[51] Alexander Schrijver. Theory of Linear and Integer Programming. John Wiley and Sons, New York, 1986.
[52] Emanuel Sperner. Ein Satz über Untermengen einer endlichen Menge. Mathematische Zeitschrift, 27(1):544–548, 1928.
[53] Robert J. Vanderbei. https://vanderbei.princeton.edu/tex/talks/IDA_CCR/SparsityMatters.pdf, 2017.
[54] George N. Watson. A note on gamma functions. Edinburgh Mathematical Notes, 42:7–9, 1959.