Circuit imbalance measures and linear programming ††thanks: This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement ScaleOpt–757481)
Abstract
We study properties and applications of various circuit imbalance measures associated with linear spaces. These measures describe possible ratios between nonzero entries of support-minimal nonzero vectors of the space. The fractional circuit imbalance measure turns out to be a crucial parameter in the context of linear programming, and two integer variants can be used to describe integrality properties of associated polyhedra.
We give an overview of the properties of these measures, and survey classical and recent applications, in particular, for linear programming algorithms with running time dependence on the constraint matrix only, and for circuit augmentation algorithms. We also present new bounds on the diameter and circuit diameter of polyhedra in terms of the fractional circuit imbalance measure.
1 Introduction
For a linear space , is an elementary vector if is a support minimal nonzero vector in , that is, no exists such that , where denotes the support of a vector. A circuit in is the support of some elementary vector; these are precisely the circuits in the associated linear matroid . We let and denote the set of elementary vectors and circuits in the space , respectively.
Elementary vectors were first studied in the 1960s by Camion [Cam64], Tutte [Tut65], Fulkerson [Ful68], and Rockafellar [Roc69]. Circuits play a crucial role in matroid theory and have been extremely well studied. For regular subspaces (i.e., kernels of totally unimodular matrices), elementary vectors have entries; this fact has been at the heart of several arguments in network optimization since the 1950s.
The focus of this paper is on various circuit imbalance measures. We give an overview of classical and recent applications, and their relationship with other condition measures. We will mainly focus on applications in linear programming, mentioning in passing also their relevance to integer programming.
Three circuit imbalance measures
There are multiple ways to quantify how ‘imbalanced’ elementary vectors of a subspace can be. We define three different measures that capture various fractionality and integrality properties.
We will need some simple definitions. The linear spaces and will be called trivial subspaces; all other subspaces are nontrivial. A linear subspace of is a rational linear space if it admits a basis of rational vectors. Equivalently, a rational linear space can be represented as the image of a rational matrix. For an integer vector , let denote the least common multiple of the entries , .
For every , the elementary vectors with support form a one-dimensional subspace of . We pick a representative from this subspace. If is not a rational subspace, we select arbitrarily. For rational subspaces, we select as an integer vector with the largest common divisor of the coordinates being 1; this choice is unique up to multiplication by . When clear from the context, we omit the index and simply write . We now define the fractional circuit imbalance measure and two variants of integer circuit imbalance measure.
Definition 1.1 (Circuit imbalances).
For a non-trivial linear subspace , let us define the following notions:
-
•
The fractional circuit imbalance measure of is
-
•
If is a rational linear space, the lcm-circuit imbalance measure is
-
•
If is a rational linear space, the max-circuit imbalance measure is
For trivial subspaces , we define . Further, we say that the rational subspace is anchored, if every vector , has a entry.
Equivalently, in an anchored subspace every elementary vector has a nonzero entry such that all other entries are integer multiples of this entry.
The term ‘circuit imbalance measure’ will refer to the fractional measure . Note that and implies . This case plays a distinguished role and turns out to be equivalent to being a regular linear space (see Theorem 3.4).
Another important case is when is a prime power. In this case, is anchored, and . The linear space will be often represented as for a a matrix . We will use , , , , to refer to the corresponding quantities in .
An earlier systematic study of elementary vectors was done in Lee’s work [Lee89]. He mainly focused on the max-circuit imbalance measure; we give a quick comparison to the results in Section 3. The fractional circuit imbalance measure played a key role in the paper [DHNV20] on layered-least-squares interior point methods; it turns out to be a close proxy to the well-studied condition number . As far as the authors are aware, the lcm-circuit imbalance measure has not been explicitly studied previously.
Overview and contributions
Section 2 introduces some background and notation. Section 3 gives an overview of fundamental properties of and . In particular, Section 3.1 relates circuit imbalances to subdeterminant bounds. We note that many extensions of totally unimodular matrices focus on matrices with bounded subdeterminants. Working with circuit imbalances directly can often lead to stronger and conceptually cleaner results. Section 3.2 presents an extension of the Hoffman-Kruskal characterization of TU matrices. Section 3.3 shows an important self-duality property of and . Section 3.4 studies ‘nice’ matrix representations of subspaces with given lcm-circuit imbalances. Section 3.5 proves a multiplicative triangle-inequality for . Many of these results were previously shown by Lee [Lee89], Appa and Kotnyek [AK04], and by Dadush et al. [DHNV20]. We present them in a unified framework, extend some of the results, and provide new proofs.
Section 4 reveals connections between and the well-studied condition numbers studied in the context of interior point methods, and studied—among other topics—in the analysis of the shadow simplex method. In particular, we show that previous diameter bounds for polyhedra can be translated to strong diameter bounds in terms of the condition number (Theorem 4.8).
Section 5 studies the best possible values of that can be achieved by rescaling the variables. We present the algorithm and min-max characterization from [DHNV20]. Further, we characterize when a subspace can be rescaled to a regular one; we also give a new proof of a theorem from [Lee89].
Section 6 shows variants of Hoffman-proximity bounds in terms of that will be used in subsequent algorithms. In Section 7, we study algorithms for linear programming whose running time only depends on the constraint matrix , and reveal the key role of in this context. Section 7.1 shows how the Hoffman-proximity bounds can be used to obtain a black-box algorithm with -dependence as in [DNV20], and Section 7.2 discusses layered least squares interior point methods [DHNV20, VY96].
Section 8 gives an overview of circuit diameter bounds and circuit augmentation algorithms, a natural class of LP algorithms that work directly with elementary vectors. As a new result, we present an improved iteration bound on the steepest-descent circuit augmentation algorithm, by extending the analysis of the minimum mean-cycle cancelling algorithm of Goldberg and Tarjan (Theorem 8.4).
2 Preliminaries
We let . For , a number is -integral if it is an integer multiple of . Let denote the set of primes. Let denote the set of positive reals, and the set of nonnegative reals.
For a prime number , the -adic valuation for is the function defined by
(1) |
We denote the support of a vector by . We let denote the -dimensional all-ones vector, or simply , whenever the dimension is clear from the context. Let denote the -th unit vector.
For vectors we denote by the vector with ; analogously for . Further, we use the notation and ; note that both and are nonnegative vectors. For two vectors , we let denote their scalar product. For sets we let .
We let denote the -dimensional identity matrix. We let denote the set of all positive definite diagonal matrices. For a vector , we denote by the diagonal matrix whose -th diagonal entry is and for a matrix , let denote the column vectors, and denote the row vectors, transposed. For , let denote the submatrix formed by the columns of and for , , we say that is in basis form for if .
We will use and vector norms, denoted as , and , respectively. By , we always mean the 2-norm . Further, for a matrix , will refer to the operator norm, and to the max-norm.
For an index subset , we use for the coordinate projection. That is, , and for a subset , . We let .
For a subspace , we let . It is easy to see that . Assume we are given a matrix such that . Then, , and we can obtain a matrix from such that by performing a Gaussian elimination of the variables in .
For a subspace , we define by the orthogonal projection onto .
For a set of vectors we let denote the linear space spanned by the vectors in . For a matrix , is the subspace spanned by the columns of . A circuit basis of a subspace is a set of linearly independent elementary vectors, i.e., .
Linear Programming (LP) in matrix formulation
We will use LPs in the following standard primal and dual form for , , .
(LP) |
Linear Programming in subspace formulation
Since our main focus is on properties of subspaces, it will be more natural to think about linear programming in the following subspace formulation. For , and as above, let and such that . We assume the existence of such a vector as otherwise the primal program is trivially infeasible. We can write LP in the following equivalent form:
(LP) |
Conformal circuit decompositions
We say that the vector conforms to if whenever . Given a subspace , a conformal circuit decomposition of a vector is a decomposition
where and are elementary vectors that are conformal with . A fundamental result on elementary vectors asserts the existence of a conformal circuit decomposition, see e.g. [Ful68, Roc69].
Lemma 2.1.
For every subspace , every admits a conformal circuit decomposition.
Proof.
Let be the set of vectors conformal with . is a polyhedral cone; its faces correspond to inequalities of the form , , or . The rays (edges) of are of the form for . Clearly, , and thus, can be written as a conic combination of at most rays by the Minkowski–Weyl theorem. Such a decomposition yields a conformal circuit decomposition. ∎
Linear matroids
For a linear subspace , let denote the associated linear matroid, i.e. the matroid defined by the set of circuits . Here, denotes the set of independent sets; if and only if there exists no with ; the maximal independent sets are the bases. We refer the reader to [Sch03, Chapter 39] or [Fra11, Chapter 5] for relevant definitions and background on matroid theory.
Assume and for . Then , is a basis in if and only if is nonsingular; then, is in basis form for such that .
The matroid is separable, if the ground set can be partitioned into two nonempty subsets such that if and only if . In this case, the matroid is the direct sum of its restrictions to and . In particular, every circuit is fully contained in or in . For the linear matroid , separability means that . In this case, we have and ; solving LP can be decomposed into two subproblems, restricted to the columns in and in .
Thus, for most concepts and problems considered in this paper, we can focus on the non-separable components of . The following characterization will turn out to be very useful, see e.g. [Fra11, Theorem 5.2.5].
Proposition 2.2.
A matroid is non-separable if and only if for any , there exists a circuit containing and .
3 Properties of the imbalance measures
Comparison to well-scaled frames
Lee’s work [Lee89] on ‘well-scaled frames’, investigated the following closely related concepts. For a set the rational linear space is -regular if for every elementary vector , there exists a such that all nonzero entries of are in . For , the subspace is called -regular. For , a subspace is -adic of order if it is -regular for . The frame of the subspace refers to the set of elementary vectors .
Using our terminology, a subspace is -regular if and only if , and every -adic subspace is anchored. Many of the properties in this section were explicitly or implicitly shown in Lee [Lee89]. However, it turns out that many properties are simpler and more natural to state in terms of either and . Roughly speaking, the fractional circuit imbalance is the key quantity of interest for continuous properties, particularly relevant for proximity results in linear programming. On the other hand, the lcm-circuit imbalance captures most clearly the integrality properties. The max-circuit imbalance interpolates between these two, although, as already noted by Lee, it is the right quantity for proximity results in integer programming (see Section 9).
The key lemma on basis forms
The following simple proposition turns out to be extremely useful in deriving properties of and . The first statement is from [DNV20].
Proposition 3.1.
For every matrix with ,
Moreover, for each nonsingular , all nonzero entries of have absolute values between and and are -integral.
Proof.
Consider the matrix for any non-singular submatrix . Let us renumber the columns such that corresponds to the first columns. Then, for every , the th column of corresponds to an elementary vector where , and for . Hence, gives a lower bound on . This also implies that all nonzero entries are between and . To see that all entries of are -integral, note that for a vector where all entries are integer divisors of . Since , it follows that itself is an integer divisor of .
To see that the maximum in the first statement is achieved, take the elementary vector that attains the maximum in the definition of ; let be the minimum absolute value element. Let us select a basis such that . Then, the largest absolute value in the -th column of will be . ∎
3.1 Bounds on subdeterminants
For an integer matrix , we define
(2) | ||||
The matrix is totally unimodular (TU), if : thus, all subdeterminants are or . This class of matrices plays a foundational role in combinatorial optimization, see e.g., [Sch98, Chapters 19-20]. A significant example is the node-arc incidence matrix of a directed graph. A key property is that they define integer polyhedra, see Theorem 3.5 below. A polynomial-time algorithm is known to decide whether a matrix is TU, based on the deep decomposition theorem by Seymour from 1980 [Sey80].
The next statement is implicit in [Lee89, Proposition 5.3].
Proposition 3.2.
For every integer matrix , and .
Proof.
Let be a circuit, and select a submatrix of where the columns are indexed by , and the rows are linearly independent. Let be the square submatrix resulting from deleting the column corresponding to from . From Cramer’s rule, we see that for some , . This implies both claims and . ∎
To see an example where can be much larger than , let be the node-edge incidence matrix of a complete undirected graph on nodes; assume is divisible by . The determinant corresponding to any submatrix corresponding to an odd cycle is . Let be an edge set of node-disjoint triangles. Then is a square submatrix with determinant . In fact, in this case, since for a node-edge incidence matrix equals the maximum number of node disjoint odd cycles, see [GKS95]. On the other hand, for the incidence matrix of any undirected graph; see Section 3.2.
For TU-matrices, the converse of Proposition 3.2 is also true. In 1956, Heller and Tompkins [Hel57, HT56] introduced the Dantzig property. A matrix has the Dantzig property if is a -matrix for every nonsingular submatrix . According to Proposition 3.1, this is equivalent to . Theorem 3.4 below can be attributed to Cederbaum [Ced57, Proposition (v)]; see also Camion’s PhD thesis [Cam64, Theorem 2.4.5(f)]. The key is the following lemma that we formulate for general for later use.
Lemma 3.3.
Let . Then, for any nonsingular square submatrix of , the inverse is -integral, with non-zero entries between and in absolute value.
Proof.
Let be any nonsingular submatrix of ; w.l.o.g., let us assume that it uses the first rows of . Let be the set of columns of , along with the additional columns , i.e., the last unit vectors from . Thus, is also nonsingular. After permuting the columns, this can be written in the form
for some . We now use Proposition 3.1 for . Note that the first columns of correspond to . Moreover, we see that
Thus, is -integral, with non-zero entries between and completing the proof. ∎
Appa and Kotnyek define -regular matrices as follows: a rational matrix is -regular if and only if the inverse of all nonsingular submatrices is -integral. From the above statement, it follows that is -regular in this sense for . See also Corollary 3.9.
Theorem 3.4 (Cederbaum, 1957).
Let be a linear subspace. Then, the following are equivalent.
-
(i)
.
-
(ii)
There exists a TU matrix , such that .
-
(iii)
For any matrix in basis form such that , is a TU-matrix.
3.2 Fractional integrality characterization
Hoffman and Kruskal [HK56] gave the following characterization of TU matrices. A polyhedron is integral, if all vertices (=basic feasible solutions) are integer.
Theorem 3.5 (Hoffman and Kruskal, 1956).
An integer matrix is totally unimodular if and only if for every , the polyhedron is integral.
Since is a property of the subspace, it will be more convenient to work with the standard equality form of an LP. Here as well as in Section 4.2, we use the following straightforward correspondence between the two forms. Recall that an edge of a polyhedron is a bounded one dimensional face; every edge is incident to exactly two vertices. The following statement is standard and easy to verify.
Lemma 3.6.
Let be of the form for . For a vector , let
Let denote the index set of . Then, , i.e., is the projection of to the coordinates in . For every vertex of , is a vertex of , and conversely, for every vertex of , there exists a unique vertex of such that . There is a one-to-one correspondence between the edges of and . Further, if , then is -integral if and only if is -integral.
Corollary 3.7.
Let be a linear space. Then, if and only if for every , the polyhedron is integral.
Proof.
Let . W.l.o.g., assume the last variables form a basis, and let us represent in a basis form as for , where . It follows by Theorem 3.4 that if and only if is TU, which is further equivalent to being TU.
Further, note that the system coincides with , where .
We provide the following natural generalization. Related statements, although in substantially more complicated forms, were given in [Lee89, Proposition 6.1 and 6.2].
Theorem 3.8.
Let be a linear space. Then, is the smallest integer such that for every , the polyhedron is -integral.
Proof.
Let , and let us represent for . Then, , can be written as , . Let be a basic feasible solution (i.e. vertex) of this system. Then, . By Proposition 3.1, is -integral. Thus, if then must be also -integral.
Let us now show the converse direction. Assume is -integral for every . For a contradiction, assume there exists a circuit such that the entries of the elementary vector are not all divisors of (or that is not even a rational vector if is not a rational space). In particular, select an index such that , or such that is not rational.
Let us select a basis such that . For simplicity of notation, let . We can represent in a basis form as . Let be defined by , for and otherwise; thus, .
Let us pick an integer , , and define by for , , and otherwise. Then, the basic solution of , corresponding to the basis is obtained as for and for . The choice of guarantees . By the assumption, is -integer, and therefore is also -integer. Recall that , where either with and , or is not rational. Both cases give a contradiction. ∎
Using again Lemma 3.6, we can write this theorem in a form similar to the Hoffman-Kruskal theorem.
Corollary 3.9.
Let . Then, is the smallest value such that for every , the polyhedron is -integral.
Appa and Kotnyek [AK04, Theorem 17] show that -regularity of (in the sense that the inverse of every square submatrix is -integral) is equivalent to the property above.
Subspaces with
The case is a particularly interesting class. As already noted, it includes incidence matrices of undirected graphs, and according to Theorem 3.8, it corresponds to half-integer polytopes. This class includes the following matrices, first studied by Edmonds and Johnson [EJ70]; the following result follows e.g. from [AK04, GS86, HMNT93].
Theorem 3.10.
Let such that for each column , . Then .
3.3 Self-duality
We next show that both and are self-dual. These rely on the following duality property of circuits. We introduce the following more refined quantities that will also come useful later on.
Definition 3.11 (Pairwise Circuit Imbalances).
For a space and variables we define
We call the pairwise imbalance between and .
Cleary, for a nontrivial linear space . We use the following simple lemma.
Lemma 3.12.
Consider a matrix in basis form for , i.e., . Let ; thus, . The following hold.
-
(i)
The rows of form a circuit basis of , denoted as .
-
(ii)
For any two rows , , , and , the vector fulfills .
Proof.
For part i, the rows are clearly linearly independent and span . Therefore, every must have , and if then . These two facts imply that each is support minimal in , that is, .
For part ii, there is nothing to prove if or ; for the rest, assume both are nonzero. Assume for a contradiction ; thus, there exists a , and . We have . If , as above we get that or , a contradiction since but . Hence, . By part i, we have ; and since it follows that ; thus, is a scalar multiple of , a contradiction. ∎
Lemma 3.13.
For any we have . Equivalently: for every elementary vector with indices there exists an elementary vector such that .
Proof.
Let such that . If then any with fulfills , so and .
Else, there exists . Let us select a basis of with . Let be a matrix in basis form for with , and let , an elementary vector in by Lemma 3.12ii.
By the construction, . On the other hand, and implies and similarly implies . The claim follows. ∎
For , duality is immediate from the above:
Proposition 3.14 ([DHNV20]).
For any linear subspace , we have .
Let us now show duality also for ; this was shown in [Lee89, Lemma 2.1] in a slightly different form.
Proposition 3.15.
For any rational linear subspace , we have .
Proof.
We next show that and are monotone under projections and restrictions of the subspace.
Lemma 3.16.
For any linear subspace , and , we have
Proof.
Proposition 3.17.
For any linear subspace and , we have
3.4 Matrix representations
Proposition 3.1 already tells us that any rational matrix of the form is -integral, and according to Lemma 3.3, the inverse of every non-singular square submatrix of is also -integral. It is natural to ask whether every linear subspace can be represented as for an integer matrix with the same property on the inverse matrices.
We show that this is true if the dual space is anchored but false in general. Recall that this means that every elementary vector , has a entry. In particular, for some prime number implies that both and are anchored; in this case we also have .
In [Lee89, Section 7], it is shown that if is a basis minimizing for a full rank , then every nonzero entry in is at least 1 in absolute value. Moreover, a simple greedy algorithm is proposed (called 1-OPT) that finds such a basis within pivots for -adic spaces. Our next statement can be seen as the variant of this for anchored-spaces, using the lcm-circuit imbalance . We note that finding a basis minimizing is computationally hard in general [Kha95].
Proposition 3.18.
Let be a rational subspace such that is an anchored space. Then there exists an integer matrix such that , and
-
(i)
All entries of divide .
-
(ii)
For all non-singular submatrices of , is -integral.
-
(iii)
is an integer divisor of .
Proof.
Let be an arbitrary matrix with . By performing row operations we can convert into where is positive diagonal and (after possibly permuting the columns). If , then we are already done. Property i follows by Proposition 3.1; property ii follows by Lemma 3.3, and property iii holds since , , and is -integral.
If is not the identity matrix, then we show that can be brought to the form with an integer by performing further basis exchanges. Let us assume that for all rows , . By Lemma 3.13, . Assume for some . As is a circuit and is anchored, there exists an index such that .
Let us perform a basis exchange between columns and . That is, subtract integer multiples of row from the other rows to turn column into . We then swap columns and and obtain the matrix again in the form . Notice that the matrix remains integral, , and for , . Hence, repeating this procedure at most times, we can convert the matrix to the integer form , completing the proof. ∎
Note that the proof gives an algorithm to find such a basis representation using a Gaussian elimination and at most additional pivot operations. If is not anchored, we show the following weaker statement.
Proposition 3.19.
Let be a rational subspace. Then there exists an integer matrix with such that
-
(i)
All entries of divide ;
-
(ii)
For all non-singular submatrices of , is -integral.
-
(iii)
is an integer divisor of .
Proof.
The proof is an easy consequence of Proposition 3.1 and Lemma 3.3. Consider any basis form with (after possibly permuting the columns). According to Proposition 3.1, all entries of are integral. By Lemma 3.13, the rows for . We can write for some and such that for each . By the definition of , the entries of each are divisors of . Since it follows that and . Let be the diagonal matrix with entries . Then, is an integer matrix where all entries divide , proving i. Part ii follows by Lemma 3.3 and noting that the subdeterminants get multiplied by a submatrix .
For part iii, let us start use a basis such that is maximal; w.l.o.g. assume . Then, in the basis form for , all subdeterminants are . This holds as for any submatrix of with we have that augmenting the columns of by the columns such that is not a row of results in a basis with by assumption on . After multiplying by as above, , all subdeterminants will be . ∎
Note that parts i and ii are true for any choice of the basis form, whereas iii requires one to select with maximum determinant. The maximum subdeterminant is NP-hard even to approximate better than for some [DSEFM14]. However, it is easy to see that even if we start with an arbitrary basis, then , since every subdeterminant of is at most follows by Lemma 3.3.
We now give an example to illustrate why Proposition 3.18ii cannot hold for arbitrary values of . The proof is given in the Appendix.
Proposition 3.20.
Consider the matrix
For this matrix holds, and there exists no such that and the inverse of every nonsingular submatrix of is -integral.
3.5 The triangle inequality
An interesting additional fact about circuit imbalances is that the logarithm of the weights satisfy the triangle inequality; this was shown in [DHNV20]. Here, we formulate a stronger version and give a simpler proof. Throughout, we assume that is non-separable. Thus, according to Proposition 2.2, for any there is a circuit with .
Theorem 3.21.
Let be a linear space, and assume is non-separable. Then,
-
(i)
for any distinct , ; and
-
(ii)
for any distinct , .
The proof relies on the following technical lemma that analyzes the scenario when almost all vectors in are elementary.
Lemma 3.22.
Let be a subspace s.t. is non-separable.
-
(i)
If , then .
-
(ii)
If there exists such that , then
Proof.
For part i, let and let such that and . If , then shows the claim.
Assume , and pick such that and let ; such a exists by Proposition 2.2. Then and , so by the assumption. If then and so with , therefore certifies the statement as . Otherwise, and fulfills as , . Now, using that and it is easy to see that
(4) |
We now turn to part ii. Since there exists with , we cannot have .
Let and such that . Consider any , such that . If there exists such that . We must have , since . Then fulfills , and , a contradiction to . ∎
Proof of Theorem 3.21.
Let and such that and for , . If then clearly . Otherwise, let us select such that , and is minimal. Let and .
Claim 3.22.1.
Let . Then for the space we have that .
Proof.
The statement that is clear as and the variables we project out fulfill . For the statement on assume that there exists such that . Then there exists a lift of and some such that ; note also that . The vector fulfills and .
Now pick any circuit such that and . Note that is independent, as . Therefore, . Hence, for we have that is non-separable. In particular there exists a circuit such that . As , this is a contradiction to the minimal choice of . ∎
If , then the reverse inclusion trivially holds, since 1 is the only element in these sets. In Proposition 5.4, we give a necessary and sufficient condition for .
One may ask under which circumstances an element is also contained in . We give a partial answer by stating a sufficient condition in a restrictive setting. For a basis of , recall from Lemma 3.12. Then, Lemmas 3.12 and 3.13 together imply:
Lemma 3.23.
Given a basis in and such that , and . Then .
4 Connections to other condition numbers
4.1 The condition number and the lifting operator
For a full row rank matrix , the condition number can be defined in the following two equivalent ways:
(5) | ||||
This condition number was first studied by Dikin [Dik67], Stewart [Ste89], and Todd [Tod90]. There is an extensive literature on the properties and applications of , as well as its relations to other condition numbers. In particular, it plays a key role in layered-least-squares interior point methods, see Section 7.2. We refer the reader to the papers [HT02, MT03, VY96] for further results and references.
It is important to note that—similarly to and — only depends on the subspace . Hence, we can also write for a subspace , defined to be equal to for some matrix with . We will use the notations and interchangeably. The following characterization reveals the connection between and .
Proposition 4.1 ([TTY01]).
For a full row rank matrix ,
Together with Proposition 3.1, this shows that the difference between and is in using instead of norm. This immediately implies the upper bound and a slightly weaker lower bound in the next theorem.
Approximating the condition number is known to be hard; by the same token, also cannot be approximated by any polynomial factor. The proof relies on the hardness of approximating the minimum subdeterminant by Khachiyan [Kha95].
Theorem 4.3 (Tunçel [Tun99]).
Approximating up to a factor of is NP-hard.
In connection with , it is worth mentioning the lifting map, a key concept in the algorithms presented in Section 7. The map lifts back a vector from a coordinate projection of to a minimum-norm vector in :
Note that is the unique linear map from to such that and is orthogonal to . The condition number can be equivalently defined as the maximum norm of any lifting map for an index subset.
Even though is defined with respect to the -norm, it can also be used to characterize .
Proposition 4.5 ([DNV20]).
For a linear subspace ,
Proof.
We first show that for any , and , holds. Let , and take a conformal decomposition as in Lemma 2.1. For each , let . We claim that all these circuits must intersect . Indeed, assume for a contradiction that one of them, say is disjoint from , and let . Then, and . Thus, also lifts to , but , contradicting the definition of as the minimum-norm lift of .
By the definition of , for each . The claim follows since , moreover, conformity guarantees that . Therefore,
We have thus shown that the maximum value in the statement is at most . To show that equality holds, let be the circuit and the corresponding elementary vector and such that .
Let us set , and define if and . Then , and the unique extension to is ; thus, . We have . Noting that , it follows that . ∎
4.2 The condition number and bounds on diameters of polyhedra
Another related condition number is , defined as follows:
Definition 4.6.
Let be a set of vectors. Then is the largest value such that for any set of linearly independent vectors and ,
For a matrix , we let denote the value associated with the rows of .
This can be equivalently characterized as follows: for a subset and , , the sine of the angle between the vector and the subspace is at least (see e.g. for the equivalence [DVZar]).
A line of work studied this condition number in the context of the simplex algorithm and diameter bounds. The diameter of a polyhedron is the diameter of the vertex-edge graph associated with ; Hirsch’s famous conjecture from 1957 asserted that the diameter of a polytope (a bounded polyhedron) in dimensions with facets is at most . This was disproved by Santos in 2012 [San12], but the polynomial Hirsch conjecture, i.e., a poly diameter bound remains wide open.
Consider the LP in standard inequality form with variables and constraints as
(6) |
for , . Using a randomized dual simplex algorithm, Dyer and Frieze [DF94] showed the polynomial Hirsch conjecture for TU matrices. Bonifas et al. [BDSE+14] strengthened and extended this to the bounded subdeterminant case, showing a diameter bound of for integer constraint matrices . Note that this is independent of the number of constraints .
Brunsch and Röglin [BR13] analyzed the shadow vertex simplex algorithm in terms of the condition number , noting that for integer matrices . They gave a diameter bound . Eisenbrand and Vempala [EV17] used a different approach to derive a bound poly that is independent of . Dadush and Hähnle [DH16] further improved these bounds to .
In recent work, Dadush et al. [DVZar] considered (6) in the oracle model, where for each point , the oracle returns or a violated inequality from the system . Their algorithm finds exact primal and dual solutions using oracle calls, where ; the running time is independent of the cost function . They also show the following relation between and :
Lemma 4.7 ([DVZar]).
-
(i)
Let be a matrix with full row rank and , with for all columns . Then, .
-
(ii)
Let be in basis form . Then, .
-
(iii)
If is the basis maximizing , then for , it holds that .
Proof.
Part i: Let be an elementary vector. Select an arbitrary , and let . Then, the columns are linearly independent, and . Thus,
and using that all columns have unit norm, we get for all . This shows that .
Parts ii and iii:
Let in basis form, and let . Let us first show
(7) |
Take any set of linearly independent columns of , along with coefficients . Without loss of generality, assume , i.e., is a basis, by allowing for some coefficients. Let . Then, . Lemma 3.3 implies that every column of has 2-norm at most . Hence, holds for all , implying (7).
Then, part ii follows since by Proposition 3.1. For part iii, let be a basis maximizing . Then, . Indeed, if there is an entry , then we can obtain a larger determinant by exchanging for . This implies . ∎
Using this correspondence between and , we can derive the following bound on the diameter of polyhedra in standard form from [DH16]. This verifies the polynomial Hirsch-conjecture whenever is polynomially bounded.
Theorem 4.8.
Consider a polyhedron in the standard equality form
for and . Then, the diameter of is at most .
Proof.
Without loss of generality, we can assume that has full row rank. Changing to a standard basis representation does neither change the geometry (in particular, the diameter) of , nor the value of . Let be the basis maximizing , and let us replace by ; w.l.o.g. assume that is the set of the last columns. Hence, for . According to Lemma 3.6, has the same diameter as defined as
in other words, , where and . There is a one-to-one correspondence between the vertices and edges of and , and hence, the two polyhedra have the same diameter. Thus, [DH16] gives a bound on the diameter of . By the choice of , from Lemma 4.7iii, we obtain the diameter bound . We claim that . Indeed, the kernels of and represent orthogonal complements, thus by Proposition 3.14. This completes the proof. ∎
The diameter bound in [DH16] is proved constructively, using the shadow simplex method. However, in the proof we choose maximizing , a hard computational problem to solve even approximately [DSEFM14]. However, we do not actually require a (near) maximizing subdeterminant. For the argument, we only need to find a basis such that for , for some constant . Then, (7), gives .
Such a basis corresponds to approximate local subdeterminant maximization, and can be found using the following simple algorithm proposed by Knuth [Knu85]. As long as there is an entry , then swapping for increases by a factor . Using that by Proposition 3.19, the algorithm terminates in iterations.
5 Optimizing circuit imbalances
Recall that is the set of positive definite diagonal matrices. For every , represents a column rescaling. This is a natural symmetry in linear programming, and particularly relevant in the context of interior point algorithms, as discussed in Section 7.2.
The condition number may vastly differ from . In terms of the subspace , this amounts to rescaling the subspace by ; we denote this by . It is natural to ask for the best possible value that can be achieved by rescaling:
In most algorithmic and polyhedral results in this paper, the dependence can be replaced by dependence. For example, the diameter bound in Theorem 4.8 is true in the stronger form with , since the diagonal rescaling maintains the geometry of the polyhedron.
Even the value can be arbitrarily large. As an example, let be defined as . Both generating vectors are elementary, and we see that . Rescaling the third and fourth coordinates have opposite effect on the two elementary vectors, therefore we also have , i.e. the original subspace is already optimally rescaled.
A key result in [DHNV20] shows that an approximately optimal rescaling can be found:
Theorem 5.1 ([DHNV20]).
There is an time algorithm that for any matrix , computes an estimate of such that
and a such that
This is in surprising contrast with the inapproximability result Theorem 4.3. Note that there is no contradiction since the approximation factor is not bounded as in general.
The key idea of the proof of Theorem 5.1 is to analyze the pairwise imbalances introduced in Section 3.3. In the 4-dimensional example above, we have . Let and let denote the diagonal elements; i.e. the rescaling multiplies the -th coordinate of every by . Then, we can see that . In particular, for any pair of variables and , . Consequently, we get a lower bound .
Theorem 5.1 is based on a combinatorial min-max characterization that extends this idea. For the rest of this section, let us assume that the matroid is non-separable. In case it is separable, we can obtain by taking a maximum over the non-separable components.
Let be the complete directed graph on vertices with edge weights . Since is assumed to be non-separable, Proposition 2.2 implies that for any . We will refer to this weighted digraph as the circuit ratio digraph.
Let be a cycle in , that is, a sequence of indices . We use to denote the length of the cycle. (In this terminology, cycles refer to objects in , whereas circuits to objects in .)
We use the notation . The observation for length-2 cycles remains valid in general: is invariant under any rescaling. This leads to the lower bound . The best of these bounds turns out to be tight:
Theorem 5.2 ([DHNV20]).
For a subspace , we have
The proof relies on the following formulation:
(8) |
Taking logarithms and substituting , we can rewrite this problem equivalently as
(9) | ||||
This is the dual of the minimum-mean cycle problem with weights , and can be solved in polynomial time (see e.g. [AMO93, Theorem 5.8]).
Whereas this formulation verifies Theorem 5.2, it does not give a polynomial-time algorithm to compute . In fact, the values are already NP-hard to approximate due to Theorem 4.3. Nevertheless, the bound implies that for any elementary vector with support , we have
(10) |
To find an efficient algorithm as in Theorem 5.1, we replace the exact values by estimates obtained as for an arbitrary circuit with ; these can be obtained using standard techniques from linear algebra and matroid theory. Thus, we can return as the estimate on the value of . To estimate , we solve (9) with the estimates in place of the ’s.
5.1 Perfect balancing:
Let us now show that can be efficiently checked.
Theorem 5.3.
There exists a strongly polynomial algorithm, that given a matrix , returns one of the following outcomes:
-
(a)
A diagonal matrix such that showing that . The algorithm also returns the exact value of . Further, if is a rational linear space, then we can select with integer diagonal entries that divide .
-
(b)
The answer , along with a cycle of circuits such that .
Proof.
As noted above, we can assume without loss of generality that the matroid is non-separable, as we can reduce the problem to solving on all connected components separately.
We obtain estimates for every edge of the circuit ratio graph using a circuit with . Assuming that , (10) implies that holds and the rescaling factors must satisfy
(11) |
If this system is infeasible, then using the circuits that provided the estimates , we can obtain a cycle such that , that is, outcome b. Let us now assume that (11) is feasible; then it has a unique solution up to scalar multiplication. We define with diagonal entries .
Since is non-separable, we can conclude that if and only if . By Theorem 3.4, this holds if and only if is a TU-matrix for any basis .
We run Seymour’s algorithm [Sey80] for . If it confirms that is TU (certified by a construction sequence), then we return outcome a. In this case, is the same for any circuit with ; therefore , and we can return .
Otherwise, Seymour’s algorithm finds a submatrix of with . As in the proof of Proposition 3.2, we can recover a circuit in with two entries such that for the corresponding elementary vector . Note that for the rescaled estimates. Hence, the circuit with used to obtain the estimate , together with certifies that as required for outcome b.
Finally, if is a rational linear space and we concluded , then let us select the solution to (11) such that and . We claim that for all . Indeed, let . For each pair , for two integers . Hence, for any prime , , implying . ∎
Let be a linear space such that is non-separable. Recall from Theorem 3.21 that for all . We now characterize when equality holds for all triples.
Proposition 5.4.
Let be a linear space such that is non-separable. Then, the following are equivalent:
-
(i)
,
-
(ii)
for all ,
-
(iii)
holds for all distinct .
Proof.
A surprising finding by Lee [Lee89, Lee90] is that if is an odd prime power, then holds.111The statement in the paper is slightly more general, for -adic subspaces with ; the proof is essentially the same. We first present a proof sketch following the lines of the one in [Lee89, Lee90]. We also present a second, almost self-contained proof, relying only on basic results on TU matrices.
Proof.
A theorem by Tutte [Tut65] asserts that can be represented as the kernel of a unimodular matrix, i.e. or has a minor such that where is the uniform matroid on four elements such that the independent sets are the sets of cardinality at most two. Here, a matroid minor corresponds to iteratively either deleting variables or projecting variables out. In the first case we are done, so let us consider the second case. Note that and by Lemma 3.16 we have that for all we have that and so in particular for some . An easy consequence of the proof of Proposition 3.18 and the congruence is that can be represented by , i.e. such that
(13) |
for and . Further, by and (Proposition 3.18) we have that
(14) |
It is immediate that (14) cannot be fulfilled for . ∎
Alternative Proof of Theorem 5.5.
Let be such that satisfying the properties in Proposition 3.18 in basis form ; for simplicity, assume the identity matrix is in the first columns. Let be a directed multigraph associated with with edge set where . Further, define where for we let . For a directed cycle in we define .
Claim 5.5.1.
All cycles in fulfill .
Proof.
For a contradiction, assume that there exists a cycle such that and let be a shortest cycle with this property. Then has no chord , as otherwise contains two shorter cycles such that and so in particular or . This also means that the support of the corresponding submatrix of where and is exactly the set of non-zeros of an incidence matrix of a cycle. We have that as the corresponding cycle has . Recall the Leibniz determinant formula. As is supported on the incidence matrix of a cycle there exist only two bijective maps , such that is non-vanishing. One of the maps corresponds to traversing the cycle forward, the other corresponds to traversing it backwards. As all the entries of are powers of we therefore have that for some . This contradicts Proposition 3.18iii for . ∎
The above claim implies the existence of a rescaling of rows and columns where , such that . If is TU, then we are done by Proposition 3.2 as now . Otherwise, we use a result by Gomory (see [Cam65] and [Sch98, Theorem 19.3]) that states that any matrix with entries in that is not totally unimodular has a submatrix with . Let and such that . Note that w.l.o.g. the diagonal entries of and are of the form for some . Therefore, for some . As we must have and . This again contradicts Proposition 3.18iii for . ∎
6 Hoffman proximity theorems
Hoffman’s seminal work [Hof52] has analyzed proximity of LP solutions. Given , , and norms and , we are interested in the minimum of over . Hoffman showed that this can be bounded as , where the Lipschitz-bound is a constant that only depends on and the norms. Such results are known as Hoffman proximity bounds in the literature and have been extensively studied; we refer the reader to [GHR95, KT95, PVZ20] for references. In particular, they are related to studied in Section 4.2, see e.g. [GHR95].
In this section, we show a Hoffman-bound for the system , namely. Related bounds using have been shown in [HT02]. We then extend it to proximity results on optimal LP solutions. These will be used in the black-box LP algorithms in Section 7.1, as well as for the improved analysis of the steepest descent circuit augmentation algorithm in Section 8.4.
A central tool in this section are conformal decompositions into circuits as in Lemma 2.1. The next proof is similar to that of Proposition 4.5.
Lemma 6.1.
If the system is feasible, then the system
is also feasible.
Proof.
Let be a solution to LP such that is minimal, and subject to that, is minimal. Let .
Take a conformal circuit decomposition of the vector as in Lemma 2.1 in the form for some . We claim that for all . Indeed, if , then for some is another solution with and .
Consider any index . For every elementary vector with , there exists an index such that . By conformity,
completing the proof. ∎
We note that [DNV20] also provides a strongly polynomial algorithm that, for a given , and an estimate on , either finds a solution as in Lemma 6.1, or finds an elementary vector that reveals .
We next provide proximity results for optimization. For vectors , let us define the set
(15) |
Note that for , . Consequently, if and , then and are optimal primal and dual solutions to LP.
Lemma 6.2.
If the system LP is feasible, bounded and , then there is an optimal solution such that
Proof.
Similarly to the proof of Lemma 6.1, let be an optimal solution to LP chosen such that is minimal, and subject to that, is minimal; let . Take a conformal circuit decomposition for some . With a similar argument as in the proof of Lemma 6.1, we can show that for each , either , or is an objective-reducing circuit, i.e. . Since , the latter requires that for some , and , implying . A similar bound as in the proof of Lemma 6.1 completes the proof. ∎
Lemma 6.3.
Let be a subspace and . Let be an optimal solution to LP. Then there exists an optimal solution to LP such that
Proof.
Let . Note that , and also . Thus, the systems LP and LP define the same problem.
We apply Lemma 6.2 to . This guarantees the existence of an optimal to LP such that
Since , we get that . Second, by the optimality of , we have , and thus . These together imply that
∎
We can immediately use Lemma 6.3 to derive a conclusion on the support of the optimal dual solutions to LP, using the optimal solution to LP.
Theorem 6.4.
Let be a subspace and . Let be an optimal solution to LP and
Then for every dual optimal solution to LP, we have .
Proof.
By Lemma 6.3 there exists an optimal solution to LP such that . Consequently, , implying for every dual optimal by complementary slackness. ∎
In Section 8.4, we use a dual version of this theorem, also including upper bound constraints in the primal side. We now adapt the required proximity result to the following primal and dual LPs, and formulate it in matrix language to conform to the algorithm in Section 8.4.
(16) |
Note that any induces a feasible dual solution with and for . A primal feasible solution and are optimal solutions if and only if if and if .
Theorem 6.5.
Let be optimal primal and dual solutions to (16) for input , and for input . Let
Then for every and for every .
Proof.
Let . It is easy to see that . Let such that . With , the primal system can be equivalently written as , , . The statement follows by Theorem 6.4 applied for . ∎
7 Linear programming with dependence on the constraint matrix only
Recent years have seen tremendous progress in the development of more efficient LP algorithms using interior point methods, see e.g. [CLS19, LS19, vdBLN+20, vdBLL+21] and references therein. These algorithms are weakly polynomial, i.e., their running time depends on the encoding length of the input of LP.
A fundamental open problem is the existence of a strongly polynomial LP algorithm; this was listed by Smale as one of the key open problems in mathematics for the 21st century [Sma98]. The number of arithmetic operations of such an algorithm would be polynomial in the number of variables and of constraints, but independent of the input length.
Towards this end, there is a line of work on developing algorithms with running time depending only on the constraint matrix , while removing the dependence on and . This direction was pioneered by Tardos’s 1985 paper [Tar86], giving an algorithm for LP with integral that has runtime .
A breakthrough work by Vavasis and Ye [VY96] introduced a Layered Least Squares Interior-Point Method that solves LP within iterations, each requiring to solve a linear system. Recall from Theorem 4.2 that ; also recall from Proposition 3.2 that .
Recently, [DHNV20] improved the Vavasis–Ye bound to linear system solves, where is the optimized version of , analogous to defined in Section 5. The key insight of this work is using the circuit imbalance measure as a proxy to . These results are discussed in Section 7.2.
Section 7.1 exhibits another recent paper [DNV20] that extends Tardos’s black-box framework to solve LP in runtime , based on the proximity results in Section 6. We note that using an initial rescaling as in Theorem 5.1, we can obtain runtimes from these algorithms.
7.1 A black box algorithm
The LP feasibility and optimization algorithms in [DNV20] rely on a black-box subroutine for approximate LP solutions, and use their outputs to find exact primal (and dual) optimal solutions in time . For the black-box, one can use the fast interior-point algorithms cited above.
More precisely, we require the following approximately feasible and optimal solution to LP in time . Here denotes the objective value of LP.
(APX-LP) | ||||
The feasibility algorithm makes calls, and the optimization algorithm makes calls to such a subroutine for .
We now give a high-level outline of the feasibility algorithm in [DNV20] for the system , . For the description, let us assume this system is feasible; in case of infeasibility, the algorithm recovers a Farkas-certificate. The main progress step in the algorithm is reducing the dimension of the linear space by one, based on information obtained form an approximate solution to (APX-LP). We can make such an inference using the proximity result Lemma 6.1.
If for the solution returned by the solver, we can terminate with . Otherwise, let denote the set of ‘large’ coordinates of , i.e., where . By Lemma 6.1, there must exist a feasible solution , such that is still sufficiently large for . Therefore, one can drop the sign-constraint on , as non-negativity can be enforced automatically. We recurse on for , i.e. project out the variables in .
Each recursive call decreases the dimension of the subspace, until a feasible vector is found. The feasible solution on the remaining variables now has to be lifted to the variables we projected out, to get a feasible solution to the original problem , . We use the lifting operator introduced in Section 4.1: for , is the minimum-norm vector in such that . According to Proposition 4.5, ; this bound can be used to guarantee that the lifted solution is nonnegative on .
Algorithm 1 gives a simplified description of the feasibility algorithm of [DNV20]. For simplicity, we ignore the infeasibility case and the details of the that may replace by it projection to in certain cases. This is needed to ensure . Further, we omit an additional proximity condition from the approximate system APX-LP.
As stated here, the computation complexity is dominated by the at most recursive calls to the solver for the system (APX-LP).
7.2 Layered least squares interior point methods
In this section, we briefly review layered least squares (LLS) interior-point methods, and highlight the role of the circuit imbalance measure in this context. The central path for the standard log-barrier function is the parametrized curve given by the solutions to the system following system for
(17) | ||||
A unique solution for each exist whenever LP possesses strictly feasible primal and dual solutions, i.e. primal resp. dual solutions with resp. ; the duality gap between these solutions is . The limit point at gives a pair of primal and dual optimal solutions. At a high level, interior point methods require an initial solution close to the central path for some large and proceed by following the central path in some proximity towards smaller and smaller , which corresponds to converging to an optimal solution. A standard variant is the Mizuno–Todd–Ye [MTY93] predictor-corrector method. This alternates between predictor and corrector steps. Each predictor step decreases the parameter at least by a factor , but moves further away from the central path. Corrector steps maintain the same but restore better centrality.
Let us now focus on the predictor step at a given point ; we use the subspace notation , as in LP. The augmentation direction is computed by the affine scaling (AS) step, that can be written as weighted least squares problems on the primal and dual sides:
(18) | ||||
The update is then performed by setting , for some . As such, this algorithm can find -approximate solutions in weakly polynomial time. However, it does not even terminate in finitely many iterations, because using the weighted -regressions problems (18) will never set variables exactly to 0 as required by complementary slackness. For standard interior point methods, a final rounding step is required.
The layered least squares interior-point method by Vavasis and Ye [VY96] not only terminates finitely, but has an iteration bound of , depending only on , but independent of and . We will refer to this as the VY algorithm.
Recall from Theorem 4.2 that . For certain predictor iterations, they use a layered least squared (LLS) step instead of affine scaling. Variables are split into layers according to the values: we order the variables as , and start a new layer whenever there is a big gap between consecutive variables, i.e. . For a point on the central path, the ordering on the ’s will be approximately reverse.
We illustrate their step based on a partition of the variable set into two layers ; the general step may use an arbitrary number of layers. The layered least squared step is given in a 2-stage approach via
(19) | ||||
Whereas the predictor-corrector algorithm—as most standard interior point variants—is invariant under rescaling the columns of the constraint matrix, the VY algorithm is not: the layers are chosen by comparing the values. For this reason, it was long sought to find a scaling-invariant version of [VY96], that would automatically improve the running time dependence from to the best possible value achievable under column rescaling. In this line of work fall the results of [MT08, MT03, MT05], but none of them achieving dependence on the constraint matrix only while being scaling-invariant.
This question was finally settled in [DHNV20] in the affirmative. A key ingredient is revealing the connection between $\bar\chi_A$ and the circuit imbalance measure $\kappa_A$. Preprocessing the instance via the algorithm in Theorem 5.1 to find a nearly optimal rescaling for $\kappa_A$ (and thus for $\bar\chi_A$), and then running the VY algorithm, already achieves an iteration bound in terms of $\bar\chi^*_A$. Beyond this, [DHNV20] also presents a new LLS interior point method based on the pairwise circuit imbalances that is inherently scaling-invariant, together with an improved iteration analysis. We give an outline next; details are omitted here, and a self-contained overview can be found in [DHNV20].
What determines a good layering?
We illustrate the LLS step in the following hypothetical situation. Assume that the partition $(J_1, J_2)$ is such that $x^\star_{J_1} > 0$ and $x^\star_{J_2} = 0$ for the optimal primal and dual solutions to LP. In particular, the LLS direction in (19) will set $x_{J_2} + \Delta x_{J_2} = 0$. Note that this does not hold for the AS direction that solves (18). This benefit of the LLS direction over the AS direction will allow us to choose a larger step size $\alpha$ for the LLS step compared to the AS step.
To terminate with an optimal solution in a single step, we need to be able to select the step size $\alpha = 1$, which requires that $x + \Delta x \ge 0$. But as the components in $J_1$ are ignored in the computation of $\Delta x_{J_2}$, we need to ensure that the choice of $\Delta x_{J_2}$ does not impact the $J_1$ components by too much. By that we mean that there is a vector $w \in W$ such that $w_j = \Delta x_j$ for all $j \in J_2$ and $\|w_{J_1}\|$ is small. The norm of this minimal extension is exactly governed by the lifting operator we introduced in Section 4.1. Let $W_\delta$ denote the space $W$ rescaled by the $\delta$ values. Then,
(20)  $w = L^{W_\delta}_{J_2}(\Delta x_{J_2}) = \arg\min\big\{ \|z\|_2 : z \in W_\delta,\ z_{J_2} = \Delta x_{J_2} \big\}.$
By Lemma 4.5, the norm of the lift outside $J_2$ is controlled by the circuit imbalance of the rescaled space:

(21)  $\big\| \big(L^{W_\delta}_{J_2}(\Delta x_{J_2})\big)_{J_1} \big\|_\infty \le \kappa_{W_\delta}\, \|\Delta x_{J_2}\|_1.$
Further, notice that the lifting costs imposed on variables in $J_1$ by $\Delta x_{J_2}$ are given by the circuit imbalances in the rescaled space $W_\delta$: for $i \in J_1$ and $j \in J_2$, we are interested in the pairwise imbalances $\kappa^{W_\delta}_{ij}$. In particular, if these quantities are small for all such $i$ and $j$, then the low lifting cost discussed above is achieved and we can select step size $\alpha = 1$.
The choice of layers
The Vavasis–Ye algorithm defines the layering based on the magnitudes of the entries $x_i$. This guarantees that the lifting cost is small, since there is a large gap between the values of $x_j$ and $x_i$ whenever $j$ is on a higher layer than $i$. However, this choice is inherently not scaling-invariant.
The LLS algorithm in [DHNV20] directly uses the scaling-invariant quantities $\kappa^{W_\delta}_{ij}$ to define the layering. In the ideal version of the algorithm, the layers are selected as the strongly connected components of the directed graph formed by the edges $(i, j)$ where this value is large. Hence, $\kappa^{W_\delta}_{ij}$ is small whenever $j$ is on a higher layer than $i$.
This ideal version cannot be implemented, since the pairwise imbalances $\kappa_{ij}$ are hard to compute or even approximate. The actual algorithm instead works with lower estimates of these quantities. Thus, we may miss some edges of the directed graph, in which case the lifting may fail. Such failures are detected by the algorithm, and in turn reveal better estimates for some pairs $(i, j)$.
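As an illustration of the ideal layering rule, the sketch below builds the directed graph with an edge $(i, j)$ whenever a given estimate of the pairwise imbalance exceeds a threshold, and returns its strongly connected components in a topological order. The estimates `kappa_est` and the `threshold` are assumed inputs; as discussed above, obtaining good estimates is precisely the hard part.

```python
from collections import defaultdict

def layers_from_imbalances(n, kappa_est, threshold):
    """Layers = strongly connected components (Kosaraju's algorithm) of the
    graph with an edge (i, j) whenever kappa_est[i][j] > threshold.
    kappa_est: dict of dicts with (lower estimates of) pairwise imbalances."""
    adj, radj = defaultdict(list), defaultdict(list)
    for i in range(n):
        for j in range(n):
            if i != j and kappa_est.get(i, {}).get(j, 0.0) > threshold:
                adj[i].append(j); radj[j].append(i)

    order, seen = [], [False] * n
    for u in range(n):                     # first pass: record finish order
        if seen[u]:
            continue
        stack, seen[u] = [(u, iter(adj[u]))], True
        while stack:
            node, it = stack[-1]
            nxt = next(it, None)
            if nxt is None:
                order.append(node); stack.pop()
            elif not seen[nxt]:
                seen[nxt] = True; stack.append((nxt, iter(adj[nxt])))

    comp, c = [-1] * n, 0                  # second pass on the reverse graph
    for u in reversed(order):
        if comp[u] != -1:
            continue
        stack, comp[u] = [u], c
        while stack:
            v = stack.pop()
            for w in radj[v]:
                if comp[w] == -1:
                    comp[w] = c; stack.append(w)
        c += 1
    layers = [[] for _ in range(c)]
    for i in range(n):
        layers[comp[i]].append(i)
    return layers   # components in a topological order of the condensation
```

In the actual algorithm, the threshold is adaptive, and failed liftings trigger updates of the estimates.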
7.3 The curvature of the central path
The condition number $\bar\chi_A$ also has an interesting connection to the geometry of the central path. In this context, Sonnevend, Stoer, and Zhao [SSZ91] introduced a primal-dual curvature notion. Monteiro and Tsuchiya [MT08] revealed strong connections between the curvature integral, the Mizuno–Todd–Ye predictor-corrector algorithm, and the Vavasis–Ye algorithm. In particular, they prove a bound on the curvature integral in terms of $n$ and $\bar\chi^*_A$.
Besides the above primal-dual curvature, one can also study the total curvature of the central path, a standard notion in algebraic geometry. De Loera, Sturmfels, and Vinzant [DLSV12] studied the central curve, defined as the solution set of the polynomial equations
(22)  $Ax = b, \qquad A^\top y + s = c, \qquad x_1 s_1 = x_2 s_2 = \dots = x_n s_n.$
This includes the usual central path in the region $x, s > 0$, but also the central paths of all other LPs with the same objective over the regions of the hyperplane arrangement defined by the hyperplanes $x_i = 0$; i.e., all LPs where some nonnegativity constraints $x_i \ge 0$ are flipped to $x_i \le 0$. In fact, [DLSV12] shows that (22) defines the smallest algebraic variety containing the central path.
They consider the average curvature taken over the bounded regions of the hyperplane arrangement, and show a bound for the primal central path (i.e., the projection of (22) to the $x$-space), and for the dual central path (the projection to the $(y, s)$-space). Their argument crucially relies on circuit polynomials defined via elementary vectors. See [DLSV12] for further pointers to the literature on the total curvature of the central path.
8 Circuit diameter bounds and circuit augmentation algorithms
Consider an LP in standard equality form with upper bounds, where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and $c, u \in \mathbb{R}^n$:
(LP)  $\min\ c^\top x \quad \text{s.t.}\quad Ax = b,\ 0 \le x \le u.$
In Section 4.2 we briefly mentioned the Hirsch conjecture and some progress towards the polynomial Hirsch conjecture; Theorem 4.8 gives a bound on the diameter of the feasible region in terms of the circuit imbalance measure.
Circuit diameter bounds were introduced by Borgwardt, Finhold, and Hemmecke [BFH15] as a relaxation of diameter bounds. Let $P$ denote the feasible region of LP. A circuit walk is a sequence of consecutive feasible points $x^{(1)}, x^{(2)}, \dots, x^{(k+1)}$ such that for each $t$, we have $x^{(t+1)} = x^{(t)} + \alpha_t g^{(t)}$ for some elementary vector $g^{(t)}$ and $\alpha_t > 0$, and further, $x^{(t)} + \alpha g^{(t)}$ is infeasible for any $\alpha > \alpha_t$, i.e., each consecutive circuit step is maximal. The circuit diameter of $P$ is the maximum over pairs of vertices of the minimum length of a circuit walk between them.
In contrast to walks in the vertex-edge graph, circuit walks are non-reversible, and the minimum length from $x$ to $y$ may differ from that from $y$ to $x$; this is due to the maximality requirement. The circuit-analogue of the Hirsch conjecture, formulated in [BFH15], asserts that the circuit diameter of a $d$-dimensional polytope with $f$ facets is at most $f - d$; this may be true even for unbounded polyhedra, see [BSY18].
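The maximality requirement is easy to state computationally: given a feasible $x$ and a direction $g$, the step size is the largest $\alpha$ keeping $0 \le x + \alpha g \le u$. A minimal helper (dense numpy, no degeneracy or tolerance handling):

```python
import numpy as np

def maximal_step(x, g, u):
    """Largest alpha >= 0 with 0 <= x + alpha * g <= u (np.inf if unbounded)."""
    alpha = np.inf
    for i in range(len(x)):
        if g[i] > 0:
            alpha = min(alpha, (u[i] - x[i]) / g[i])
        elif g[i] < 0:
            alpha = min(alpha, x[i] / -g[i])
    return alpha
```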
In this section we begin by presenting a recent, improved bound on the circuit diameter with $\log\kappa_A$ dependence. Section 8.2 gives an overview of circuit augmentation algorithms. We review existing algorithms for different augmentation rules (Theorem 8.2 and Theorem 8.5), and also show a new bound for the steepest-descent rule (Theorem 8.4). The bounds in these three theorems also translate directly into circuit diameter bounds, since they all consider algorithms with maximal augmentation steps.
8.1 An improved circuit diameter bound
In a recent paper, Dadush et al. [DKNV21] gave an improved bound on the circuit diameter (Theorem 8.1).
Let us highlight the main ideas of the proof of Theorem 8.1. The argument is constructive but non-algorithmic, in the sense that the augmentation steps are defined using the optimal solution. We first show the bound in Theorem 8.1 for LP without upper bounds $u$, and then extend the argument to systems of the form LP. Let $x^\star$ be a basic optimal solution to LP corresponding to a basis $B$, and consider a cost function for which $x^\star$ is the unique optimal solution.
For the current iterate $x^{(t)}$, consider a conformal circuit decomposition of $x^\star - x^{(t)}$, and select a term $g$ of the decomposition for which the objective improvement is maximized. We obtain the next iterate $x^{(t+1)} = x^{(t)} + \alpha g$ for the maximal step size $\alpha$. Note that the existence of such a decomposition does not by itself yield a circuit diameter bound, due to the maximality requirement in the definition of circuit walks. Nonetheless, it can be shown that we will not overshoot by too much: the step length can be lower bounded relative to the chosen term of the decomposition. Further, the choice of $g$ guarantees that the optimality gap decreases geometrically.
The analysis focuses on two index sets that track convergence to $x^\star$. For every index in the first set, the variable must already be 'large' and cannot be set to zero later in the algorithm; the second set collects indices that have essentially 'converged' to their final value $x^\star_i$. Since the optimality gap is decreasing, once an index enters either set, it can never leave again. Moreover, a new index is added to one of the two sets after a bounded number of iterations, leading to the overall bound of Theorem 8.1.
For a system LP with upper bounds $u$, the above argument yields a weaker bound via a simple reduction. To achieve a better bound, [DKNV21] gives a preprocessing sequence of circuit augmentations that reduces the number of 'relevant' variables. This preprocessing terminates once the columns $A_i$ corresponding to variables with $0 < x_i < u_i$ are linearly independent; since a basic solution may have at most $m$ entries not equal to the lower or upper bound, at this point there are at most $m$ such variables. This leads to the improved circuit diameter bound for the capacitated case.
8.2 Circuit augmentation algorithms
The generic circuit augmentation algorithm produces a circuit walk as defined above, where an initial feasible solution $x^{(0)}$ is given and $c^\top x^{(t+1)} < c^\top x^{(t)}$, i.e., the objective value decreases in every iteration. The elementary vector $g$ is an augmenting direction for the feasible solution $x$ if and only if $c^\top g < 0$, $g_i \ge 0$ for every $i$ with $x_i = 0$, and $g_i \le 0$ for every $i$ with $x_i = u_i$. By LP duality, $x$ is optimal if and only if no augmenting direction exists. Otherwise, the algorithm proceeds to the next iterate by a maximal augmentation in an augmenting direction.
The simplex algorithm can be seen as a circuit augmentation algorithm that is restricted to using special elementary vectors corresponding to edges of the polyhedron. (Simplex may perform degenerate pivots where the basic solution remains the same; we do not count these as augmentation steps.) In the general framework, the iterates need not be vertices. However, in the case of maximal augmentations, they must all lie on the boundary of the polyhedron.
In unpublished work, Bland [Bla76] extended the Edmonds–Karp–Dinic algorithm [Din70, EK72] to general LP; see also [Lee89, Proposition 3.1]. Circuit augmentation algorithms were revisited by De Loera, Hemmecke, and Lee in 2015 [DLHL15], who analyzed different augmentation rules and extended them to integer programming. We first give an overview of their results for linear programming. In particular, they studied the following three augmentation rules, each used with maximal step sizes. Let $x$ be the current feasible solution; we aim to select an augmenting elementary vector $g$ as follows.
• Dantzig-descent direction: select $g$ such that $-c^\top g$ is maximized, where $g = g^C$ is the normalized elementary vector for a circuit $C$.
• Deepest-descent direction: select $g$ such that $-\alpha\, c^\top g$ is maximized, where $\alpha$ is the maximal step size for $x$ and $g$.
• Steepest-descent direction: select $g$ such that $-c^\top g / \|g\|_1$ is maximized.
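The steepest-descent rule is the one that is directly an LP: minimize $c^\top g$ over $g \in \ker(A)$ with $\|g\|_1 \le 1$ and the sign restrictions at the bounds; a basic optimal solution is a multiple of an elementary vector. The sketch below uses scipy's HiGHS solver with the standard split $g = g^+ - g^-$; it is an illustration of the formulation, not the implementation from [BV20].

```python
import numpy as np
from scipy.optimize import linprog

def steepest_descent_direction(A, c, x, u, tol=1e-9):
    """min c^T(g+ - g-)  s.t.  A(g+ - g-) = 0,  1^T(g+ + g-) <= 1,
    g+_i = 0 if x_i = u_i and g-_i = 0 if x_i = 0 (augmenting sign rules).
    Returns None if x is already optimal."""
    m, n = A.shape
    bounds = [(0, 0) if x[i] >= u[i] - tol else (0, None) for i in range(n)] \
           + [(0, 0) if x[i] <= tol else (0, None) for i in range(n)]
    res = linprog(np.concatenate([c, -c]),
                  A_ub=np.ones((1, 2 * n)), b_ub=[1.0],
                  A_eq=np.hstack([A, -A]), b_eq=np.zeros(m),
                  bounds=bounds, method="highs")
    g = res.x[:n] - res.x[n:]
    return g if res.fun < -tol else None
```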
Computing Dantzig- and deepest-descent directions is NP-hard in general, see [DLKS19] and the discussion below. The steepest-descent direction can be formulated as an LP; without any restrictions on the input problem, this may not be simpler than the original one. However, it can be easier to solve in practice; Borgwardt and Viss [BV20] exhibit an implementation of a steepest-descent circuit augmentation algorithm with encouraging computational results.
8.2.1 Augmenting directions for flow problems
It is instructive to consider these algorithms for the special case of minimum-cost flows. We are given a directed graph $G = (V, E)$ with capacities $u \in \mathbb{R}^E_{>0}$, costs $c \in \mathbb{R}^E$, and node demands $b \in \mathbb{R}^V$ with $\sum_{v \in V} b_v = 0$. The objective is to find a minimum-cost flow $x$ that satisfies the capacity constraints $0 \le x \le u$ and the node demands: for each node $v$, the total incoming minus the total outgoing flow equals $b_v$. This can be written in the form LP with $A$ as the node-arc incidence matrix of $G$, a TU matrix. Let us define the residual graph $G_x = (V, E_x)$, where for $(i, j) \in E$ we let $(i, j) \in E_x$ if $x_{ij} < u_{ij}$, and $(j, i) \in E_x$ if $x_{ij} > 0$. The cost of a reverse arc is defined as $c_{ji} = -c_{ij}$. We will also refer to the residual capacities of arcs; these are $u_{ij} - x_{ij}$ in the first case and $x_{ij}$ in the second.
Let us observe that the augmenting directions correspond to directed cycles in the residual graph $G_x$. Circuit augmentation algorithms for the primal and dual problems yield the rich classes of cycle canceling and cut canceling algorithms; see the survey [SIM00].
The maximum flow problem between a source $s$ and a sink $t$ can be formulated as a special case as follows. We add a new arc $(t, s)$ with infinite capacity, set all node demands to zero, and set the costs as $c_{ts} = -1$ and $c_e = 0$ otherwise. Bland's [Bla76] observation was that the steepest-descent direction for this problem corresponds to finding a shortest residual $s$–$t$ path, as chosen in the Edmonds–Karp–Dinic algorithm.
More generally, a steepest-descent direction amounts to finding a residual cycle $C$ minimizing the mean cycle cost $c(C)/|C|$. Thus, the steepest-descent algorithm for minimum-cost flows corresponds to the classical Goldberg–Tarjan algorithm [GT89], which is strongly polynomial; tight bounds on the number of cancellations were given in [RG94].
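The minimum mean cycle itself can be computed exactly in $O(nm)$ time by Karp's classical dynamic program; a compact sketch (assumed input: node count `n` and a list of directed edges `(u, v, cost)`):

```python
def min_mean_cycle_value(n, edges):
    """Karp's theorem: the minimum cycle mean equals
    min over v of max over k of (d_n(v) - d_k(v)) / (n - k),
    where d_k(v) is the minimum cost of a walk with exactly k edges
    ending at v, started from an artificial source reaching every node."""
    INF = float("inf")
    d = [[INF] * n for _ in range(n + 1)]
    d[0] = [0.0] * n
    for k in range(1, n + 1):
        for u, v, c in edges:
            if d[k - 1][u] < INF and d[k - 1][u] + c < d[k][v]:
                d[k][v] = d[k - 1][u] + c
    best = INF
    for v in range(n):
        if d[n][v] < INF:
            best = min(best, max((d[n][v] - d[k][v]) / (n - k)
                                 for k in range(n) if d[k][v] < INF))
    return best          # float('inf') if the graph is acyclic
```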
Let us now consider the other two variants. A Dantzig-descent direction in this context asks for a most negative cycle, i.e., a cycle $C$ maximizing $-c(C)$. A deepest-descent direction asks for a cycle $C$ maximizing $-c(C)$ times the minimum residual capacity of the arcs of $C$. Computing both these directions exactly is NP-hard, since they generalize the Hamiltonian cycle problem: for every directed graph, we can set up a flow problem where $G_x$ coincides with the input graph, all residual capacities are equal to 1, and all costs are $-1$. We note that De Loera, Kafer, and Sanità [DLKS19] showed that computing the Dantzig- and deepest-descent directions is also NP-hard for the fractional matching polytope.
Nevertheless, the deepest-descent direction can be suitably approximated. Wallacher [Wal89] proposed selecting a minimum ratio cycle in the residual graph. This is a cycle $C$ in $G_x$ that minimizes $c(C)/w(C)$, where the weight $w_e$ of each residual arc $e$ is the inverse of its residual capacity; such a cycle can be found in strongly polynomial time. It is easy to show that this cycle approximates the deepest-descent direction within a polynomial factor. Wallacher's algorithm can be naturally extended to linear programming [MS00], has found several combinatorial applications, e.g., [WZ99, Way02], and has also been used in the context of integer programming [SW99]. We discuss an improved new variant in Section 8.3. A different relaxation of the deepest-descent algorithm was given by Barahona and Tardos [BT89], based on Weintraub's algorithm [Wei74].
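A simple (weakly polynomial) way to compute Wallacher's minimum ratio cycle is parametric: a cycle with $c(C)/w(C) < \lambda$ exists if and only if reweighting each arc by $c_e - \lambda w_e$ creates a negative cycle, which Bellman–Ford detects; binary search on $\lambda$ then converges to the optimum. Strongly polynomial alternatives exist (e.g., via parametric search); the sketch below is purely illustrative.

```python
def has_negative_cycle(n, edges):
    """Bellman-Ford relaxation from an implicit source attached to all nodes;
    an update in the n-th round certifies a negative cycle."""
    dist = [0.0] * n
    updated = False
    for _ in range(n):
        updated = False
        for u, v, w in edges:
            if dist[u] + w < dist[v] - 1e-12:
                dist[v] = dist[u] + w
                updated = True
        if not updated:
            return False
    return updated

def min_ratio_cycle_value(n, edges, lo=-1e6, hi=1e6, iters=60):
    """Approximates min over cycles C of c(C)/w(C); edges: (u, v, c_e, w_e)
    with all w_e > 0. lo/hi must bracket the optimum ratio."""
    for _ in range(iters):
        lam = (lo + hi) / 2.0
        if has_negative_cycle(n, [(u, v, c - lam * w) for u, v, c, w in edges]):
            hi = lam        # some cycle has ratio below lam
        else:
            lo = lam
    return hi
```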
8.2.2 Convergence bounds
We now state the convergence bounds from [DLHL15]. The original statements are in terms of subdeterminant bounds; we paraphrase them in terms of finding approximately optimal solutions.
Theorem 8.2 (De Loera, Hemmecke, Lee [DLHL15]).
Consider a linear program in the form LP. Assume we are given an initial feasible solution $x^{(0)}$, and let $OPT$ denote the optimum value. By an $\varepsilon$-optimal solution we mean an iterate $x^{(t)}$ such that $c^\top x^{(t)} - OPT \le \varepsilon\,(c^\top x^{(0)} - OPT)$.
(a) For given $\varepsilon > 0$, one can find an $\varepsilon$-optimal solution in $O(n \log(1/\varepsilon))$ deepest-descent augmentations.
(b) For given $\varepsilon > 0$, one can find an $\varepsilon$-optimal solution in a number of Dantzig-descent augmentations that is polynomial in $n$, $\log(1/\varepsilon)$, and $M$, where $M$ is an upper bound on the maximum entry in any feasible solution.
(c) One can find an exact optimal solution in at most $n\,\Phi$ steepest-descent augmentations, where $\Phi$ denotes the number of distinct values of $c^\top g / \|g\|_1$ over the elementary vectors $g$ of $\ker(A)$.
In general, circuit augmentation algorithms may not even terminate finitely; see [MS00] for an example of Wallacher's rule for minimum-cost flows. In parts (a) and (b), assume that all basic solutions are $\frac{1}{q}$-integral for some integer $q \ge 1$ and that the cost function $c$ is integral. If $x^{(t)}$ is an $\varepsilon$-optimal solution for sufficiently small $\varepsilon$, then we can identify an optimal vertex of the face containing $x^{(t)}$ using a Carathéodory decomposition argument; this can be implemented by a sequence of circuit augmentations (see [DLHL15, Lemma 5]).
According to part (c), steepest descent terminates with an optimal solution in a finite number of iterations; moreover, the bound depends only on the linear space $\ker(A)$ and $c$, and not on the parameters $b$ and $u$. However, $\Phi$ can be exponentially large.
Bland's original observation was that $\Phi$ is strongly polynomially bounded for the maximum flow problem. Recall that all elementary vectors correspond to cycles in the residual graph. Normalizing such that $g_{ts} = 1$ for every augmenting cycle (as these must use the arc $(t, s)$), the steepness is $1/\|g\|_1$, where $\|g\|_1$ is the number of arcs of the cycle, a value between 2 and $n$; hence $\Phi \le n - 1$. In fact, the crucial argument by Edmonds and Karp [EK72] and Dinic [Din70] shows that the length of the shortest augmenting path is non-decreasing, and must strictly increase within $m$ consecutive iterations.
For an integer cost function $c$, Lee [Lee89, Proposition 3.2] gave the following upper bound on $\Phi$:
Proposition 8.3.
If $c \in \mathbb{Z}^n$, then $\Phi$ is bounded by a polynomial in $n$, $\|c\|_1$, and the max-circuit imbalance $\hat\kappa_A$. Indeed, every steepness value $c^\top g/\|g\|_1$ for a normalized elementary vector $g$ is a fraction whose numerator is at most $\|c\|_1\hat\kappa_A$ and whose denominator is at most $n\hat\kappa_A$ in absolute value.
In order to bound the circuit distance between vertices $x$ and $y$, let us use the following cost function. For the basis $B$ defining $y$, let

(23)  $c_i = 0 \ \text{ for } i \in B, \qquad c_i = 1 \ \text{ for } i \notin B.$
With this cost function, Theorem 8.2(c) and Proposition 8.3 yield a circuit diameter bound polynomial in $n$ and $\hat\kappa_A$ via the steepest-descent algorithm.
Extending the analysis of the Goldberg–Tarjan algorithm [GT89], we present a new bound that depends only on the fractional circuit imbalance $\kappa_A$, and is independent of $c$. The same bound was independently obtained by Gauthier and Desrosiers [GD21]. The proof is given in Section 8.4.
Theorem 8.4.
For the problem LP with constraint matrix $A$, the steepest-descent algorithm terminates within $O(n^3\,\kappa_A \log(\kappa_A + n))$ augmentations starting from any feasible solution.
This improves on the above bound for most values of the parameters (recall that $\kappa_A \le \hat\kappa_A$). Moreover, this bounds the number of steepest-descent augmentations for an arbitrary cost function $c$, not necessarily of the form (23).
Both these bounds are independent of $b$; however, $\kappa_A$ and $\hat\kappa_A$ may be exponentially large in the encoding length of the matrix $A$. In contrast, Theorem 8.2(a) yields a bound on the number of deepest-descent iterations that is polynomial in the encoding length of the input. In what follows, we review a new circuit augmentation algorithm from [DKNV21] that achieves a $\log\kappa_A$ dependence; the running time is bounded in terms of $n$, $m$, and $\log(\kappa_A + n)$, independently of $b$ and $c$.
8.3 A circuit augmentation algorithm with $\log\kappa_A$ dependence
Recall that the diameter bound of Theorem 8.1 is non-algorithmic in the sense that the augmentation steps rely on knowing the optimal solution $x^\star$. Dadush et al. [DKNV21] complemented this with an efficient circuit augmentation algorithm, assuming oracles are provided for certain circuit directions.
Theorem 8.5 ([DKNV21]).
Consider the primal of LP. Given an initial feasible solution, there exists a circuit augmentation algorithm that finds an optimal solution or concludes unboundedness using a number of circuit augmentations polynomial in $n$, $m$, and $\log(\kappa_A + n)$.
The main circuit augmentation direction used in the paper is a step called Ratio-Circuit, a generalization of the previously mentioned augmentation step by Wallacher [Wal89] for minimum-cost flows. It finds a circuit that is a basic optimal solution to the following linear program:
(24)  $\min\ c^\top z \quad \text{s.t.}\quad Az = 0, \quad \sum_{i}\big(w^+_i z^+_i + w^-_i z^-_i\big) \le 1,$ where $z = z^+ - z^-$ is the split into positive and negative parts.
Equivalently, the goal is to minimize the cost-to-weight ratio of a circuit, where the unit weight of increasing variable $i$ is $w^+_i$ and that of decreasing it is $w^-_i$; for the current iterate $x$, the weights are taken as $w^+ = 1/(u - x)$ and $w^- = 1/x$, generalizing the inverse residual capacities in Wallacher's rule. This can be seen as an efficiently implementable relaxation of the deepest-descent direction: for these weights, it achieves a geometric decrease in the objective value. For a vector $z$, we let $1/z$ denote the vector with entries $1/z_i$ (in particular, $(1/z)_i = \infty$ if $z_i = 0$).
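Instantiating (24) as an explicit LP is straightforward; the sketch below uses the weights described above (an infinite weight, i.e., a variable at its bound, simply forces the corresponding part of $z$ to 0). Again a hedged illustration, not the oracle implementation assumed in [DKNV21].

```python
import numpy as np
from scipy.optimize import linprog

def ratio_circuit(A, c, x, u, tol=1e-9):
    """Sketch of Ratio-Circuit (24): min c^T(z+ - z-) s.t. A(z+ - z-) = 0 and
    sum_i (z+_i / (u_i - x_i) + z-_i / x_i) <= 1. Returns None at optimality."""
    m, n = A.shape
    w_plus = [1.0 / (u[i] - x[i]) if u[i] - x[i] > tol else None for i in range(n)]
    w_minus = [1.0 / x[i] if x[i] > tol else None for i in range(n)]
    weights = [0.0 if w is None else w for w in w_plus + w_minus]
    bounds = [(0, 0) if w is None else (0, None) for w in w_plus + w_minus]
    res = linprog(np.concatenate([c, -c]),
                  A_ub=np.array([weights]), b_ub=[1.0],
                  A_eq=np.hstack([A, -A]), b_eq=np.zeros(m),
                  bounds=bounds, method="highs")
    z = res.x[:n] - res.x[n:]
    return z if res.fun < -tol else None
```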
Lemma 8.6 ([MS00]). A maximal augmentation along the Ratio-Circuit direction with the above weights decreases the optimality gap $c^\top x - OPT$ by a factor $1 - \Omega(1/n)$.
Repeated applications of the Ratio-Circuit step thus provide iterates whose optimality gap decreases by a constant factor within every $O(n)$ iterations. However, as noted in [MS00], using only this rule does not terminate finitely, already for minimum-cost flows.
Support Circuits
For this reason, [DKNV21] occasionally interleaves a second circuit augmentation step called Support-Circuit. Roughly speaking, given a non-basic feasible point $x$ in the system LP, one can efficiently augment along a circuit $g$ with $\operatorname{supp}(g) \subseteq \operatorname{supp}(x)$ and $c^\top g \le 0$, thereby reducing the support of $x$ while not increasing the objective.
On a high level, the need for such an operation becomes clear from the following example. Assume that we are given an iterate $x$ close to, but not equal to, a basic solution. Geometric progress in the objective could be achieved by just reducing the norm of the difference, but geometric progress alone would not give the desired bound on the number of circuit augmentations. Note that the norms of all basic solutions lie within a factor depending on $\kappa_A$ of each other by Proposition 3.1. Therefore, instead of applying Ratio-Circuit, it is helpful to first reduce the support through at most $n$ Support-Circuit operations until a basic solution is reached. Subsequent applications of Ratio-Circuit will then, again by Proposition 3.1, not be able to reduce the norm of the iterate by more than a factor depending on $\kappa_A$, a fact that is exploited in the proximity arguments.
The main progress step in the algorithm of Theorem 8.5 is identifying a new index $i$ such that $x_i = 0$ in the current solution and in every optimal solution. Such a conclusion is derived using a variant of the proximity statement, Theorem 6.5. To implement the main subroutine that fixes a variable to zero, a careful combination of Ratio-Circuit and Support-Circuit iterations is used. Interestingly, the Ratio-Circuit iterations do not use the original cost function $c$, but a perturbed objective. The main progress in the subroutine is identifying new 'large' variables, similarly to the proof of Theorem 8.1. Perturbations are performed whenever a new large variable is identified.
8.4 An improved bound for steepest-descent augmentation
We now prove Theorem 8.4. The proof follows the same lines as that of the Goldberg–Tarjan algorithm; see also [AMO93, Section 10.5] for the analysis. An improvement over the original bound was given in [RG94]. A key property in the original analysis is that for a flow around a cycle (i.e., an elementary vector), every arc carries at least a constant fraction, at least $1/n$, of the $\ell_1$-norm of the flow. This can be naturally replaced by the observation that for every elementary vector $g$, the minimum nonzero value of $|g_i|$ is at least $\|g\|_1/(n\kappa_A)$.
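For completeness, here is the one-line derivation of this property from the definition of $\kappa_A$ as the largest ratio between the absolute values of two nonzero entries of an elementary vector: for every elementary $g$,

\[ \min_{i \in \operatorname{supp}(g)} |g_i| \;\ge\; \frac{\max_{j} |g_j|}{\kappa_A} \;\ge\; \frac{\|g\|_1}{n\,\kappa_A}. \]

For a cycle flow in a TU space, $\kappa_A = 1$ and this recovers the constant fraction used in the original analysis.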
The Goldberg–Tarjan algorithm has been generalized to separable convex minimization with linear constraints by Karzanov and McCormick [KM97]. Instead of $\kappa_A$, they use the maximum entry of a Graver basis element (see Section 9 below). Lemma 10.1 in their paper proves a weakly polynomial bound similar to Lemma 8.8 for the separable convex setting. However, no strongly polynomial analysis is given (which is in general not possible in the nonlinear setting).
Our arguments will be based on the dual of LP:
(25)  $\max\ b^\top y - u^\top w \quad \text{s.t.}\quad A^\top y - w + s = c, \quad s, w \ge 0.$
Recall the primal-dual slackness conditions from Section 6: if $x$ is feasible to LP and $(y, w, s)$ is feasible to (25), then they are primal and dual optimal solutions if and only if $x_i = 0$ whenever $s_i > 0$, and $x_i = u_i$ whenever $w_i > 0$.
Let us start by formulating the steepest-descent direction as an LP. For a feasible solution $x$ to LP, we define the residual variable sets $R^+ = \{i : x_i < u_i\}$ and $R^- = \{i : x_i > 0\}$, and consider the system

(26)  $\min\ c^\top(z^+ - z^-) \quad \text{s.t.}\quad A(z^+ - z^-) = 0, \quad \mathbf{1}^\top(z^+ + z^-) = 1, \quad z^+_i = 0 \ (i \notin R^+), \quad z^-_i = 0 \ (i \notin R^-), \quad z^+, z^- \ge 0.$
We can map a solution $(z^+, z^-)$ to $z = z^+ - z^-$. We will assume that $(z^+, z^-)$ is chosen as a basic optimal solution. Observe that every basic feasible solution to this program maps to a scalar multiple of an elementary vector in $\ker(A)$. The dual program can be equivalently written as
(27)  $\max\ \lambda \quad \text{s.t.}\quad \lambda \le c_i - (A^\top y)_i \ \ \forall i \in R^+, \quad \lambda \le (A^\top y)_i - c_i \ \ \forall i \in R^-.$
For the solution $x$, we let $(y, \lambda)$ denote an optimal solution to this dual problem; thus, $\lambda$ is also the optimum value of the primal (26). If $\lambda \ge 0$, then $x$ and the dual solution derived from $y$ are complementary primal and dual optimal solutions to LP. We first show that this quantity is monotone (a key step also in the analysis in [DLHL15]).
Lemma 8.7.
At every iteration $t$ of the circuit augmentation algorithm, $\lambda^{(t+1)} \ge \lambda^{(t)}$.
Proof.
Let $(y, \lambda)$ be an optimal solution to (27) for $x^{(t)}$. We show that the same $(y, \lambda)$ is also feasible to (27) for $x^{(t+1)}$; the claim follows immediately. Note that $\lambda < 0$, as otherwise $x^{(t)}$ is optimal and the algorithm terminates. Constraints may only be added for indices $i$ that enter $R^+$ or $R^-$. Assume first that $i$ enters $R^+$; that is, $x^{(t)}_i = u_i$ and $x^{(t+1)}_i < u_i$. This means that the augmenting direction $g$ has $g_i < 0$; thus, for the optimal solution $(z^+, z^-)$ to (26), we must have $z^-_i > 0$. By primal-dual slackness, $\lambda = (A^\top y)_i - c_i$; thus, $c_i - (A^\top y)_i = -\lambda > 0 > \lambda$, so the new constraint for $i \in R^+$ is satisfied.
The case when $i$ enters $R^-$ is analogous. ∎
The next lemma shows that within every $n$ iterations, $|\lambda|$ decreases by a factor depending on $n$ and $\kappa_A$.
Lemma 8.8.
For every iteration $t$, $\lambda^{(t+n)} \ge \big(1 - \tfrac{1}{n\kappa_A}\big)\,\lambda^{(t)}$.
Proof.
Let us set $x = x^{(t)}$, $\lambda = \lambda^{(t)}$, and let $y$ be an optimal dual solution to (27) for $x$. Let $T$ denote the set of indices whose constraint in (27) is tight: that is, $i \in T$ if $i \in R^+$ and $\lambda = c_i - (A^\top y)_i$, or if $i \in R^-$ and $\lambda = (A^\top y)_i - c_i$. In particular, the constraint has positive slack for every residual index outside $T$. Let $(z^+, z^-)$ be the basic optimal solution to (26) for $x$. By complementary slackness, every $i$ with $z^+_i > 0$ or $z^-_i > 0$ must have a tight constraint, and thus $\operatorname{supp}(z) \subseteq T$.
Claim 8.8.1. The pair $(y, \lambda)$ remains feasible to (27) throughout the iterations $t, t+1, \dots, t+n$, and the tight set loses at least one index in each of these iterations.
Proof.
For $t' \ge t$, let $T^{(t')}$ denote the tight set at iteration $t'$ with respect to $(y, \lambda)$. We show that $T^{(t'+1)} \subseteq T^{(t')}$; since $|T^{(t)}| \le n$, this implies the claim. Let $(z^+, z^-)$ be the basic optimal solution for (26) at iteration $t'$; recall that the augmenting direction is computed from it, and its support lies in $T^{(t')}$. Thus, we may only increase $x_i$ for tight indices in $R^+$ and decrease it for tight indices in $R^-$. Consequently, every index entering a residual set satisfies its new constraint with strict slack (as in the proof of Lemma 8.7), and therefore $(y, \lambda)$ remains feasible throughout.
We now turn to the proof of $T^{(t'+1)} \subseteq T^{(t')}$. Since we use a maximal augmentation, at least one index of the support of the direction leaves the residual sets at each iteration; this index belongs to $T^{(t')}$. For a contradiction, assume there exists $i \in T^{(t'+1)} \setminus T^{(t')}$. If $i \in R^+$, then $i$ must have entered $R^+$ during the augmentation; in particular, its constraint has strict slack, as shown above. But this would mean that $i \notin T^{(t'+1)}$, in contradiction with the definition of $i$. The argument for $i \in R^-$ is similar. ∎
Let us now consider the basic optimal solution $z$ to (26) at iteration $t + n$; by the above claim, $(y, \lambda)$ is still a feasible dual solution. Select an index $i \in \operatorname{supp}(z)$ whose constraint in (27) has strict slack for $(y, \lambda)$; such an index exists, as the tight set has been exhausted.
In the first inequality, we use that the reduced costs are at least $\lambda$ by the feasibility of $(y, \lambda)$, and that the constraint for $i$ has strict slack by the choice of $i$. In the second equality, we use the constraint $\mathbf{1}^\top(z^+ + z^-) = 1$. The final inequality uses that $z$ is a basic solution, and therefore an elementary vector in $\ker(A)$. In particular, $\min_{j \in \operatorname{supp}(z)} |z_j| \ge \|z\|_1/(n\kappa_A)$, and thus $|z_i| \ge 1/(n\kappa_A)$. Consequently, $\lambda^{(t+n)} \ge \big(1 - \tfrac{1}{n\kappa_A}\big)\lambda^{(t)}$. ∎
We say that the variable $x_i$ is frozen at iteration $t$ if $x^{(t')}_i = x^{(t)}_i$ for all $t' \ge t$. Thus, for a frozen variable, if $x^{(t)}_i = 0$ then $x_i$ remains 0, and if $x^{(t)}_i = u_i$ then $x_i$ remains at $u_i$, in all subsequent iterations. We show that a new frozen variable can be found every $O(n^2\kappa_A\log(\kappa_A + n))$ iterations; this implies Theorem 8.4.
Lemma 8.9.
For every iteration $t$, there is a variable that is frozen at iteration $t'$ for some $t' \le t + O(n^2\kappa_A\log(\kappa_A + n))$.
Proof.
By Lemma 8.8, we can choose $r \le t + O(n^2\kappa_A\log(\kappa_A + n))$ such that $|\lambda^{(r)}| \le |\lambda^{(t)}|/(2n\kappa_A)$. Consider the primal and dual optimal solutions to (26) and (27) at iteration $t$ and at iteration $r$.
Claim 8.9.1.
There exists an $i \in \operatorname{supp}(z^{(t)})$ such that $|c_i - (A^\top y^{(r)})_i| \ge 2n\kappa_A\,|\lambda^{(r)}|$.
Proof.
For a contradiction, assume that $|c_i - (A^\top y^{(r)})_i| < 2n\kappa_A|\lambda^{(r)}|$ for every $i \in \operatorname{supp}(z^{(t)})$. Then, using $Az^{(t)} = 0$ and $\|z^{(t)}\|_1 = 1$,

$|\lambda^{(t)}| = |c^\top z^{(t)}| = \big|(c - A^\top y^{(r)})^\top z^{(t)}\big| < 2n\kappa_A\,|\lambda^{(r)}|,$

contradicting the choice of $r$. ∎
We now show that all such indices $i$ are frozen at iteration $r$, making use of Theorem 6.5 on proximity. Let $t' \ge r$ be any subsequent iteration, and let $(y', \lambda')$ be optimal to (27) at iteration $t'$; we have $|\lambda'| \le |\lambda^{(r)}|$ by Lemma 8.7. Hence the reduced cost of $i$ remains large compared to $|\lambda'|$ in all subsequent iterations, and the proximity argument implies that $x_i$ must remain at the same bound; that is, $i$ is frozen.
9 Circuits, integer proximity, and Graver bases
We now briefly discuss implications of circuit imbalances for integer programs of the form
(IP)  $\min\ c^\top x \quad \text{s.t.}\quad Ax = b,\ x \ge 0,\ x \in \mathbb{Z}^n.$
Many algorithms for (IP) first solve the LP relaxation, and deduce information about the IP itself from an optimal solution of the relaxation. The following proximity lemma shows that if (IP) is feasible, then the distance between an optimal integral solution and an optimal solution of the relaxation can be bounded in terms of the max-circuit imbalance $\hat\kappa_A$. So, a local search within a radius of this guaranteed proximity will find an optimal solution of the IP; see [Lee89, Proposition 4.1].
Lemma 9.1. Let $x^\star$ be an optimal solution to the LP relaxation of (IP). If (IP) is feasible, then there exists an optimal solution $z^\star$ to (IP) such that $\|x^\star - z^\star\|_\infty < n\,\hat\kappa_A$.
Proof.
Let $z^\star$ be an optimal solution to (IP) that minimizes $\|z^\star - x^\star\|_1$, and consider a conformal circuit decomposition $z^\star - x^\star = \sum_k \lambda_k g^k$ for some $\lambda_k > 0$ and normalized elementary vectors $g^k$. Then, $c^\top g^k \ge 0$ for all $k$, as otherwise $x^\star + \lambda_k g^k$ would be a feasible solution to the LP relaxation with strictly better objective than $x^\star$. Further, note that $\lambda_k < 1$ for all $k$, as otherwise $z^\star - g^k$ would be a feasible solution to (IP) with objective at least as good as $z^\star$ that is strictly closer to $x^\star$ in $\ell_1$-norm. Therefore,

$\|x^\star - z^\star\|_\infty \le \sum_k \lambda_k \|g^k\|_\infty < n\,\hat\kappa_A.$
∎
Another popular and well-studied object in integer programming is the Graver basis, defined as follows.
Definition 9.2 (Graver basis).
The Graver basis of a matrix $A$, denoted by $\mathcal{G}(A)$, consists of all $g \in (\ker(A) \cap \mathbb{Z}^n) \setminus \{0\}$ such that there exists no $h \in (\ker(A) \cap \mathbb{Z}^n) \setminus \{0, g\}$ such that $h$ and $g$ are conformal and $|h_i| \le |g_i|$ for all $i$. We can further define
(28)  $\|\mathcal{G}(A)\|_\infty := \max\big\{ \|g\|_\infty : g \in \mathcal{G}(A) \big\}.$
See [LHK12] for an extensive treatment of the Graver basis and [EHK+19] for more recent developments. Clearly, elementary vectors, scaled such that their entries have greatest common divisor equal to one, belong to the Graver basis: every normalized elementary vector $g^C$ is in $\mathcal{G}(A)$.
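A small example (constructed here for illustration) shows that this containment can be strict. For $A = (1\ \ 2\ \ 3)$, the normalized elementary vectors are

\[ g^{\{1,2\}} = (2, -1, 0), \qquad g^{\{1,3\}} = (3, 0, -1), \qquad g^{\{2,3\}} = (0, 3, -2), \]

while $(1, 1, -1) \in \ker(A) \cap \mathbb{Z}^3$ also belongs to $\mathcal{G}(A)$: no nonzero integer kernel vector is conformal with it and entrywise smaller, yet its support $\{1, 2, 3\}$ is not a circuit.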
We will furthermore see how the max-circuit imbalance measure and Graver basis are related.
Lemma 9.3.
$\hat\kappa_A \le \|\mathcal{G}(A)\|_\infty \le n\,\hat\kappa_A$.
Proof.
The first inequality follows from the paragraph above, noting that $\hat\kappa_A = \max_C \|g^C\|_\infty$ for the normalized elementary vectors $g^C$. For the second inequality, let $g \in \mathcal{G}(A)$, and let $g = \sum_k \lambda_k g^{C_k}$ be a conformal circuit decomposition with $\lambda_k > 0$. Note that $\lambda_k < 1$ for all $k$, as otherwise $g - g^{C_k}$ would contradict $g \in \mathcal{G}(A)$. Therefore,

(29)  $\|g\|_\infty \le \sum_k \lambda_k \|g^{C_k}\|_\infty < n\,\hat\kappa_A.$
∎
Using the Steinitz lemma, Eisenbrand and Weismantel [EHK18, Lemma 2] gave a bound on $\|\mathcal{G}(A)\|_\infty$ that depends only on $m$ and $\|A\|_\infty$, but is independent of $n$:
Theorem 9.4.
Let $\Delta = \|A\|_\infty$ denote the largest absolute value of an entry of $A$. Then $\|\mathcal{G}(A)\|_\infty \le (2m\Delta + 1)^m$.
10 A decomposition conjecture
Let $W \subseteq \mathbb{R}^n$ be a linear space. As an analogue of maximal augmentations, we say that a conformal circuit decomposition of $v \in W$ is maximal if it can be obtained as follows. If $v$ is an elementary vector, return the decomposition containing the single vector $v$. Otherwise, select an arbitrary elementary vector $g^1$ that is conformal with $v$ (in particular, $\operatorname{supp}(g^1) \subseteq \operatorname{supp}(v)$), and set $h^1 = \lambda_1 g^1$ for the largest value $\lambda_1 > 0$ such that $v - h^1$ is conformal with $v$. Then, recursively apply this procedure to $v - h^1$ to obtain the other terms $h^2, \dots, h^k$. We have $k \le n$, since the support decreases by at least one at each step due to the maximal choice of $\lambda_1$. If $\kappa_W = 1$, then it is easy to verify the following.
Proposition 10.1.
Let $W$ be a linear space with $\kappa_W = 1$, and let $v \in W \cap \mathbb{Z}^n$. Then, for every maximal conformal circuit decomposition $v = \sum_k h^k$, each $h^k$ is an integer vector.
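To make the recursive procedure concrete, here is a sketch of a maximal conformal circuit decomposition. It finds a conformal elementary vector as a basic optimal solution of an auxiliary LP (a vertex of $\{z \in \ker(A) : z_i = v_i,\ z \text{ sign-compatible with } v\}$ has circuit support), then takes the maximal step. It uses floating point arithmetic without safeguards, so it is illustrative only.

```python
import numpy as np
from scipy.optimize import linprog

def conformal_elementary(A, v, tol=1e-9):
    """A (multiple of an) elementary vector of ker(A) conformal with v."""
    n = A.shape[1]
    supp = [j for j in range(n) if abs(v[j]) > tol]
    i = supp[0]
    bounds = []
    for j in range(n):
        if j == i:
            bounds.append((v[i], v[i]))                  # anchor z_i = v_i
        elif abs(v[j]) > tol:
            bounds.append((0, None) if v[j] > 0 else (None, 0))
        else:
            bounds.append((0, 0))                        # stay inside supp(v)
    # minimizing the l1-norm over this pointed polyhedron yields a vertex,
    # whose support is a circuit
    res = linprog(np.sign(v), A_eq=A, b_eq=np.zeros(A.shape[0]),
                  bounds=bounds, method="highs")
    return res.x

def maximal_conformal_decomposition(A, v, tol=1e-9):
    """Greedy maximal conformal circuit decomposition of v in ker(A)."""
    terms, v = [], np.array(v, dtype=float)
    while np.linalg.norm(v, np.inf) > tol:
        g = conformal_elementary(A, v, tol)
        supp = np.abs(g) > tol
        lam = np.min(v[supp] / g[supp])   # maximal step: zeroes out an entry
        terms.append(lam * g)
        v = v - lam * g
    return terms
```

Note that `lam <= 1` because the anchor coordinate has ratio one, and each round strictly shrinks the support, so at most $n$ terms are produced.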
We formulate a conjecture asserting that this property generalizes to arbitrary values of the imbalance measure. Note that in the conjecture, we only require the existence of some (not necessarily maximal) conformal circuit decomposition.
Conjecture 10.1.1.
Let $W \subseteq \mathbb{R}^n$ be a rational linear subspace. Then, for every $v \in W \cap \mathbb{Z}^n$, there exists a conformal circuit decomposition $v = \sum_k h^k$ such that each $h^k$ is a $\frac{1}{\hat\kappa_W}$-integral vector in $W$.
Note that it is equivalent to require the same property for the elements of the Graver basis. Hence, the conjecture asserts that every vector in the Graver basis is a 'nice' combination of elementary vectors.
We present some preliminary evidence towards this conjecture:
Proposition 10.2.
Let $W$ be a rational linear subspace with $\hat\kappa_W \le 2$. Then, for every $v \in W \cap \mathbb{Z}^n$ and every maximal conformal circuit decomposition $v = \sum_k h^k$, each $h^k$ is a $\frac{1}{2}$-integral vector in $W$.
By Theorem 5.5, this implies the conjecture whenever $\hat\kappa_W \le 2$. Let us now consider the case when the lcm-circuit imbalance measure is a power of 2. We verify the conjecture when the decomposition contains at most three terms.
Proposition 10.3.
Let $W$ be a rational linear subspace whose lcm-circuit imbalance measure equals $2^\ell$ for some $\ell \ge 1$. If $v \in W \cap \mathbb{Z}^n$ has a maximal conformal circuit decomposition with at most three terms, then each term is a $\frac{1}{2^\ell}$-integral vector in $W$.
Proof.
Let us write the maximal conformal circuit decomposition in the form $v = \lambda_1 g^1 + \lambda_2 g^2 + \lambda_3 g^3$ such that $\lambda_k \ge 0$ and the $g^k$ are normalized elementary vectors; by the assumption on the lcm-circuit imbalance, all entries $|g^k_i|$, $i \in \operatorname{supp}(g^k)$, are powers of two dividing $2^\ell$. There is nothing to prove if only $\lambda_1 > 0$. If $\lambda_1, \lambda_2 > 0$ and $\lambda_3 = 0$, then by the maximality of the decomposition, $\lambda_1 = v_i/g^1_i$ for an index $i$ that is zeroed out in the first step. Hence, $\lambda_1$ is $\frac{1}{2^\ell}$-integral. Consequently, both $h^1 = \lambda_1 g^1$ and $h^2 = v - h^1$ are $\frac{1}{2^\ell}$-integral.
If $\lambda_1, \lambda_2, \lambda_3 > 0$, then $\lambda_1$ is $\frac{1}{2^\ell}$-integral as above. It also follows that $\lambda_2$ and $\lambda_3$ are $\frac{1}{2^a}$-integral for some integer $a \ge 0$. Let us choose the smallest such $a$; we are done if $a \le \ell$.
Assume for a contradiction that $a > \ell$. Let $\mu_k = 2^a\lambda_k$ for $k = 2, 3$. Thus, $\mu_2, \mu_3 \in \mathbb{Z}$, $2^a\lambda_1$ is even, and at least one of $\mu_2$ and $\mu_3$ is odd by the minimal choice of $a$. We show that both $\mu_2$ and $\mu_3$ must be odd. Let us first assume that $\mu_2$ is odd. There exists an $i$ with $|g^2_i| = 1$, since the entries of $g^2$ are powers of two with greatest common divisor one. Then, the evenness of $2^a v_i - 2^a\lambda_1 g^1_i = \mu_2 g^2_i + \mu_3 g^3_i$ implies that $\mu_3 g^3_i$, and hence $\mu_3$, must also be odd. Similarly, if $\mu_3$ is odd then $\mu_2$ must also be odd.
Let us take any $i$ such that $g^3_i \neq 0$. Then, $\mu_2 g^2_i + \mu_3 g^3_i = 2^a v_i - 2^a\lambda_1 g^1_i$ is even. Noting that $|g^2_i|$ and $|g^3_i|$ are powers of 2, both at most $2^\ell$, it follows that $g^2_i \neq 0$; by conformality, we have $\operatorname{sign}(g^2_i) = \operatorname{sign}(g^3_i)$.
Consequently, $\operatorname{supp}(g^3) \subseteq \operatorname{supp}(g^2)$. Clearly, $g^3 \neq 0$, and the containment is strict by the maximality of the decomposition: there exists an index $i \in \operatorname{supp}(g^2)$ such that $g^3_i = 0$. This contradicts the fact that $g^2$ is support-minimal. ∎
Acknowledgements
The authors are grateful to Daniel Dadush for numerous inspiring discussions and joint work on circuit imbalances and linear programming, and to Luze Xu for pointing them to Jon Lee’s papers [Lee89, Lee90]. The authors would also like to thank Jesús De Loera, Martin Koutecký, and the anonymous reviewers for their helpful comments and suggestions.
References
- [AK04] G. Appa and B. Kotnyek. Rational and integral $k$-regular matrices. Discrete Mathematics, 275(1-3):1–15, 2004.
- [AMO93] R. K. Ahuja, T. L. Magnanti, and J. B. Orlin. Network Flows: Theory, Algorithms, and Applications. Prentice-Hall, Inc., 1993.
- [BDSE+14] N. Bonifas, M. Di Summa, F. Eisenbrand, N. Hähnle, and M. Niemeier. On sub-determinants and the diameter of polyhedra. Discrete & Computational Geometry, 52(1):102–115, 2014.
- [BFH15] S. Borgwardt, E. Finhold, and R. Hemmecke. On the circuit diameter of dual transportation polyhedra. SIAM Journal on Discrete Mathematics, 29(1):113–121, 2015.
- [Bla76] R. G. Bland. On the generality of network flow theory. Presented at the ORSA/TIMS Joint National Meeting, Miami, FL, 1976.
- [BR13] T. Brunsch and H. Röglin. Finding short paths on polytopes by the shadow vertex algorithm. In Proceedings of the 40th International Colloquium on Automata, Languages, and Programming (ICALP), pages 279–290. Springer, 2013.
- [BSY18] S. Borgwardt, T. Stephen, and T. Yusun. On the circuit diameter conjecture. Discrete & Computational Geometry, 60(3):558–587, 2018.
- [BT89] F. Barahona and É. Tardos. Note on Weintraub’s minimum-cost circulation algorithm. SIAM Journal on Computing, 18(3):579–583, 1989.
- [BV20] S. Borgwardt and C. Viss. An implementation of steepest-descent augmentation for linear programs. Operations Research Letters, 48(3):323–328, 2020.
- [Cam64] P. Camion. Matrices Totalement Unimodulaires et Problemes Combinatoires. PhD thesis, Communauté européenne de l’énergie atomique (EURATOM), 1964. EUR 1632.1.
- [Cam65] P. Camion. Characterization of totally unimodular matrices. Proceedings of the American Mathematical Society, 16(5):1068–1068, May 1965.
- [Ced57] I. Cederbaum. Matrices all of whose elements and subdeterminants are $1$, $-1$, or $0$. Journal of Mathematics and Physics, 36(1-4):351–361, 1957.
- [CLS19] M. B. Cohen, Y. T. Lee, and Z. Song. Solving linear programs in the current matrix multiplication time. In Proceedings of the 51st Annual ACM Symposium on Theory of Computing (STOC), pages 938–942, 2019.
- [DF94] M. Dyer and A. Frieze. Random walks, totally unimodular matrices, and a randomised dual simplex algorithm. Mathematical Programming, 64(1):1–16, 1994.
- [DH16] D. Dadush and N. Hähnle. On the shadow simplex method for curved polyhedra. Discrete & Computational Geometry, 56(4):882–909, 2016.
- [DHNV20] D. Dadush, S. Huiberts, B. Natura, and L. A. Végh. A scaling-invariant algorithm for linear programming whose running time depends only on the constraint matrix. In Proceedings of the 52nd Annual ACM Symposium on Theory of Computing (STOC), pages 761–774, 2020.
- [Dik67] I. Dikin. Iterative solution of problems of linear and quadratic programming. Doklady Akademii Nauk, 174(4):747–748, 1967.
- [Din70] E. A. Dinic. Algorithm for solution of a problem of maximum flow in networks with power estimation. In Soviet Math. Doklady, volume 11, pages 1277–1280, 1970.
- [DKNV21] D. Dadush, Z. K. Koh, B. Natura, and L. A. Végh. On circuit diameter bounds via circuit imbalances. arXiv preprint arXiv:2111.07913, 2021.
- [DLHL15] J. A. De Loera, R. Hemmecke, and J. Lee. On augmentation algorithms for linear and integer-linear programming: From Edmonds–Karp to Bland and beyond. SIAM Journal on Optimization, 25(4):2494–2511, 2015.
- [DLKS19] J. A. De Loera, S. Kafer, and L. Sanità. Pivot rules for circuit-augmentation algorithms in linear optimization. arXiv preprint arXiv:1909.12863, 2019.
- [DLSV12] J. A. De Loera, B. Sturmfels, and C. Vinzant. The central curve in linear programming. Foundations of Computational Mathematics, 12(4):509–540, 2012.
- [DNV20] D. Dadush, B. Natura, and L. A. Végh. Revisiting Tardos’s framework for linear programming: Faster exact solutions using approximate solvers. In Proceedings of the 61st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 931–942, 2020.
- [DSEFM14] M. Di Summa, F. Eisenbrand, Y. Faenza, and C. Moldenhauer. On largest volume simplices and sub-determinants. In Proceedings of the 26th annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 315–323, 2014.
- [DVZar] D. Dadush, L. A. Végh, and G. Zambelli. On finding exact solutions to linear programs in the oracle model. In Proceedings of the 2022 ACM-SIAM Symposium on Discrete Algorithms (SODA), 2022 (to appear).
- [EHK18] F. Eisenbrand, C. Hunkenschröder, and K.-M. Klein. Faster Algorithms for Integer Programs with Block Structure. In Proceedings of the 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018), pages 49:1–13, 2018.
- [EHK+19] F. Eisenbrand, C. Hunkenschröder, K.-M. Klein, M. Koutecký, A. Levin, and S. Onn. An algorithmic theory of integer programming. arXiv preprint arXiv:1904.01361, 2019.
- [EJ70] J. Edmonds and E. L. Johnson. Matching: A well-solved class of integer linear programs. In Combinatorial Structures and Their Applications, pages 89–92. Gordon and Breach, 1970.
- [EK72] J. Edmonds and R. M. Karp. Theoretical improvements in algorithmic efficiency for network flow problems. Journal of the ACM (JACM), 19(2):248–264, 1972.
- [EV17] F. Eisenbrand and S. Vempala. Geometric random edge. Mathematical Programming, 164(1-2):325–339, 2017.
- [Fra11] A. Frank. Connections in Combinatorial Optimization. Number 38 in Oxford Lecture Series in Mathematics and its Applications. Oxford University Press, 2011.
- [Ful68] D. Fulkerson. Networks, frames, blocking systems. Mathematics of the Decision Sciences, Part I, Lectures in Applied Mathematics, 2:303–334, 1968.
- [GD21] J. B. Gauthier and J. Desrosiers. The minimum mean cycle-canceling algorithm for linear programs. European Journal of Operational Research, 2021. (in press).
- [GHR95] O. Güler, A. J. Hoffman, and U. G. Rothblum. Approximations to solutions to systems of linear inequalities. SIAM Journal on Matrix Analysis and Applications, 16(2):688–696, 1995.
- [GKS95] J. W. Grossman, D. M. Kulkarni, and I. E. Schochetman. On the minors of an incidence matrix and its Smith normal form. Linear Algebra and its Applications, 218:213–224, March 1995.
- [GS86] A. M. Gerards and A. Schrijver. Matrices with the Edmonds–Johnson property. Combinatorica, 6(4):365–379, 1986.
- [GT89] A. V. Goldberg and R. E. Tarjan. Finding minimum-cost circulations by canceling negative cycles. Journal of the ACM (JACM), 36(4):873–886, 1989.
- [Hel57] I. Heller. On linear systems with integral valued solutions. Pacific Journal of Mathematics, 7(3):1351–1364, 1957.
- [HK56] A. Hoffman and J. Kruskal. Integral boundary points of convex polyhedra. In Linear Inequalities and Related Systems, pages 223–246. Princeton University Press, 1956.
- [HMNT93] D. S. Hochbaum, N. Megiddo, J. S. Naor, and A. Tamir. Tight bounds and 2-approximation algorithms for integer programs with two variables per inequality. Mathematical Programming, 62(1):69–83, 1993.
- [Hof52] A. J. Hoffman. On approximate solutions of systems of linear inequalities. Journal of Research of the National Bureau of Standards, 49(4):263–265, 1952.
- [HT56] I. Heller and C. Tompkins. An extension of a theorem of Dantzig’s. Linear inequalities and related systems, 38:247–254, 1956.
- [HT02] J. C. Ho and L. Tunçel. Reconciliation of various complexity and condition measures for linear programming problems and a generalization of Tardos’ theorem. In Foundations of Computational Mathematics, pages 93–147. World Scientific, 2002.
- [Kha95] L. Khachiyan. On the complexity of approximating extremal determinants in matrices. Journal of Complexity, 11(1):138–153, 1995.
- [KM97] A. V. Karzanov and S. T. McCormick. Polynomial methods for separable convex optimization in unimodular linear spaces with applications. SIAM Journal on Computing, 26(4):1245–1275, 1997.
- [Knu85] D. E. Knuth. Semi-optimal bases for linear dependencies. Linear and Multilinear Algebra, 17(1):1–4, 1985.
- [KT95] D. Klatte and G. Thiere. Error bounds for solutions of linear equations and inequalities. Zeitschrift für Operations Research, 41(2):191–214, 1995.
- [Lee89] J. Lee. Subspaces with well-scaled frames. Linear Algebra and its Applications, 114:21–56, 1989.
- [Lee90] J. Lee. The incidence structure of subspaces with well-scaled frames. Journal of Combinatorial Theory, Series B, 50(2):265–287, 1990.
- [LHK12] J. D. Loera, R. Hemmecke, and M. Köppe. Algebraic and Geometric Ideas in the Theory of Discrete Optimization. Society for Industrial and Applied Mathematics, USA, 2012.
- [LS19] Y. T. Lee and A. Sidford. Solving linear programs with linear system solves. arXiv preprint 1910.08033, 2019.
- [MS00] S. T. McCormick and A. Shioura. Minimum ratio canceling is oracle polynomial for linear programming, but not strongly polynomial, even for networks. Operations Research Letters, 27(5):199–207, 2000.
- [MT03] R. D. C. Monteiro and T. Tsuchiya. A variant of the Vavasis-Ye layered-step interior-point algorithm for linear programming. SIAM Journal on Optimization, 13(4):1054–1079, 2003.
- [MT05] R. D. C. Monteiro and T. Tsuchiya. A new iteration-complexity bound for the MTY predictor-corrector algorithm. SIAM Journal on Optimization, 15(2):319–347, 2005.
- [MT08] R. D. Monteiro and T. Tsuchiya. A strong bound on the integral of the central path curvature and its relationship with the iteration-complexity of primal-dual path-following LP algorithms. Mathematical Programming, 115(1):105–149, 2008.
- [MTY93] S. Mizuno, M. Todd, and Y. Ye. On adaptive-step primal-dual interior-point algorithms for linear programming. Mathematics of Operations Research - MOR, 18:964–981, 11 1993.
- [O’L90] D. P. O’Leary. On bounds for scaled projections and pseudoinverses. Linear Algebra and its Applications, 132:115–117, April 1990.
- [PVZ20] J. Pena, J. C. Vera, and L. F. Zuluaga. New characterizations of Hoffman constants for systems of linear constraints. Mathematical Programming, 187:1–31, 2020.
- [RG94] T. Radzik and A. V. Goldberg. Tight bounds on the number of minimum-mean cycle cancellations and related results. Algorithmica, 11(3):226–242, 1994.
- [Roc69] R. T. Rockafellar. The elementary vectors of a subspace of $\mathbb{R}^n$. In Combinatorial Mathematics and Its Applications: Proceedings North Carolina Conference, Chapel Hill, 1967, pages 104–127. The University of North Carolina Press, 1969.
- [San12] F. Santos. A counterexample to the Hirsch conjecture. Annals of Mathematics, pages 383–412, 2012.
- [Sch98] A. Schrijver. Theory of linear and integer programming. John Wiley & Sons, 1998.
- [Sch03] A. Schrijver. Combinatorial Optimization – Polyhedra and Efficiency. Springer, 2003.
- [Sey80] P. Seymour. Decomposition of regular matroids. Journal of Combinatorial Theory, Series B, 28(3):305–359, 1980.
- [Sey93] M. Seysen. Simultaneous reduction of a lattice basis and its reciprocal basis. Combinatorica, 13(3):363–376, 1993.
- [SIM00] M. Shigeno, S. Iwata, and S. T. McCormick. Relaxed most negative cycle and most positive cut canceling algorithms for minimum cost flow. Mathematics of Operations Research, 25(1):76–104, 2000.
- [Sma98] S. Smale. Mathematical problems for the next century. The Mathematical Intelligencer, 20:7–15, 1998.
- [SSZ91] G. Sonnevend, J. Stoer, and G. Zhao. On the complexity of following the central path of linear programs by linear extrapolation II. Mathematical Programming, 52(1-3):527–553, 1991.
- [Ste89] G. Stewart. On scaled projections and pseudoinverses. Linear Algebra and its Applications, 112:189–193, 1989.
- [SW99] A. S. Schulz and R. Weismantel. An oracle-polynomial time augmentation algorithm for integer programming. In Proceedings of the 10th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 967–968, 1999.
- [Tar86] É. Tardos. A strongly polynomial algorithm to solve combinatorial linear programs. Operations Research, pages 250–256, 1986.
- [Tod90] M. J. Todd. A Dantzig–Wolfe-like variant of Karmarkar’s interior-point linear programming algorithm. Operations Research, 38(6):1006–1018, 1990.
- [TTY01] M. J. Todd, L. Tunçel, and Y. Ye. Characterizations, bounds, and probabilistic analysis of two complexity measures for linear programming problems. Mathematical Programming, 90(1):59–69, Mar 2001.
- [Tun99] L. Tunçel. Approximating the complexity measure of Vavasis-Ye algorithm is NP-hard. Mathematical Programming, 86(1):219–223, Sep 1999.
- [Tut65] W. T. Tutte. Lectures on matroids. J. Research of the National Bureau of Standards (B), 69:1–47, 1965.
- [vdBLL+21] J. van den Brand, Y. P. Liu, Y.-T. Lee, T. Saranurak, A. Sidford, Z. Song, and D. Wang. Minimum cost flows, MDPs, and L1-regression in nearly linear time for dense instances. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 859–869, 2021.
- [vdBLN+20] J. van den Brand, Y.-T. Lee, D. Nanongkai, R. Peng, T. Saranurak, A. Sidford, Z. Song, and D. Wang. Bipartite matching in nearly-linear time on moderately dense graphs. In IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 919–930, 2020.
- [VY96] S. A. Vavasis and Y. Ye. A primal-dual interior point method whose running time depends only on the constraint matrix. Mathematical Programming, 74(1):79–120, 1996.
- [Wal89] C. Wallacher. A generalization of the minimum-mean cycle selection rule in cycle canceling algorithms. unpublished manuscript, Institute für Angewandte Mathematik, Technische Universität Braunschweig, 1989.
- [Way02] K. D. Wayne. A polynomial combinatorial algorithm for generalized minimum cost flow. Mathematics of Operations Research, pages 445–459, 2002.
- [Wei74] A. Weintraub. A primal algorithm to solve network flow problems with convex costs. Management Science, 21(1):87–97, 1974.
- [WZ99] C. Wallacher and U. T. Zimmermann. A polynomial cycle canceling algorithm for submodular flows. Mathematical programming, 86(1):1–15, 1999.
Appendix A: Proof of Proposition 3.20
We now prove Proposition 3.20, restated from Section 3.
Proof.
All other representations of the space as the image of a matrix of the same dimensions are of the form $BU$, where $U$ is an invertible matrix. To obtain an integral matrix $BU$, two entries of $U$ must be integers; furthermore, since the greatest common divisor of the numbers in the second column equals 1, the remaining two entries must be integers as well.
It can be verified by computer that only finitely many such matrices $U$ have the property that all entries of $BU$ are divisors of the required bound. Checking all of them, every resulting matrix contains a $2 \times 2$ submatrix whose inverse is not $\frac{1}{2}$-integral. ∎
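The computer verification can be reproduced along the following lines; since the concrete matrix of Proposition 3.20 is not restated here, `B`, the divisor bound 4, and the $\frac{1}{2}$-integrality test below are placeholder assumptions illustrating the enumeration.

```python
import itertools
import numpy as np

B = np.array([[2, 0], [0, 2], [2, 4], [4, 2]])   # hypothetical stand-in basis

def is_half_integral(M):
    return np.allclose(2 * M, np.round(2 * M), atol=1e-9)

failures = []
for entries in itertools.product(range(-4, 5), repeat=4):
    U = np.array(entries, dtype=float).reshape(2, 2)
    if abs(np.linalg.det(U)) < 1e-9:
        continue                                  # U must be invertible
    C = B @ U
    # keep only representations whose entries are (nonzero) divisors of 4
    if not all(abs(e - round(e)) < 1e-9 and round(abs(e)) in (1, 2, 4)
               for e in C.flatten()):
        continue
    for rows in itertools.combinations(range(C.shape[0]), 2):
        S = C[list(rows), :]
        if abs(np.linalg.det(S)) > 1e-9 and not is_half_integral(np.linalg.inv(S)):
            failures.append((U.tolist(), rows))
            break

print(len(failures), "surviving representations have a bad 2x2 submatrix")
```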