
Computing the nc-rank via discrete convex optimization on CAT(0) spaces

Masaki HAMADA and Hiroshi HIRAI
Department of Mathematical Informatics,
Graduate School of Information Science and Technology,
The University of Tokyo, Tokyo, 113-8656, Japan.
masaki_hamada@mist.i.u-tokyo.ac.jp
hirai@mist.i.u-tokyo.ac.jp
Abstract

In this paper, we address the noncommutative rank (nc-rank) computation of a linear symbolic matrix

A = A_{1}x_{1} + A_{2}x_{2} + \cdots + A_{m}x_{m},

where each $A_i$ is an $n \times n$ matrix over a field $\mathbb{K}$, and $x_i$ $(i = 1, 2, \ldots, m)$ are noncommutative variables. For this problem, polynomial time algorithms were given by Garg, Gurvits, Oliveira, and Wigderson for $\mathbb{K} = \mathbb{Q}$, and by Ivanyos, Qiao, and Subrahmanyam for an arbitrary field $\mathbb{K}$. We present a significantly different polynomial time algorithm that works on an arbitrary field $\mathbb{K}$. Our algorithm is based on a combination of submodular optimization on modular lattices and convex optimization on CAT(0) spaces.

Keywords: Edmonds’ problem, noncommutative rank, CAT(0) space, proximal point algorithm, submodular function, modular lattice, $p$-adic valuation, Euclidean building

1 Introduction

The present article addresses rank computation of a linear symbolic matrix—a matrix of the following form:

A = A_{1}x_{1} + A_{2}x_{2} + \cdots + A_{m}x_{m},   (1.1)

where each $A_i$ is an $n \times n$ matrix over a field $\mathbb{K}$, $x_i$ $(i = 1, 2, \ldots, m)$ are variables, and $A$ is viewed as a matrix over $\mathbb{K}(x_1, x_2, \ldots, x_m)$. This problem, sometimes called Edmonds’ problem, has fundamental importance in a wide range of applied mathematics and computer science; see [36]. Edmonds’ problem (on a large field $\mathbb{K}$) is a representative problem that belongs to RP—the class of problems having a randomized polynomial time algorithm—but is not known to belong to P. The existence of a deterministic polynomial time algorithm for Edmonds’ problem is one of the major open problems in theoretical computer science.

In 2015, Ivanyos, Qiao, and Subrahmanyam [29] introduced a noncommutative formulation of Edmonds’ problem, called the noncommutative Edmonds’ problem. In this formulation, the linear symbolic matrix $A$ is regarded as a matrix over the free skew field $\mathbb{K}(\langle x_1, \ldots, x_m \rangle)$, which is the “most generic” skew field of fractions of the noncommutative polynomial ring $\mathbb{K}\langle x_1, \ldots, x_m \rangle$. The rank of $A$ over the free skew field is called the noncommutative rank, or nc-rank, and is denoted by $\mathrm{nc\text{-}rank}\,A$. Contrary to the commutative case, the noncommutative Edmonds’ problem can be solved in polynomial time.

Theorem 1.1 ([17, 30]).

The nc-rank of a matrix $A$ of form (1.1) can be computed in polynomial time.

Beyond the result itself, the algorithms for the nc-rank have stimulated subsequent research. The first polynomial time algorithm is due to Garg, Gurvits, Oliveira, and Wigderson [17] for the case $\mathbb{K} = \mathbb{Q}$. They showed that Gurvits’ operator scaling algorithm [19], which was designed for solving a special class (the Edmonds–Rado class) of Edmonds’ problem, can solve nc-singularity testing (i.e., testing whether $n = \mathrm{nc\text{-}rank}\,A$) in polynomial time. The operator scaling algorithm has rich connections to various fields of the mathematical sciences. In particular, nc-singularity testing can be formulated as a geodesically convex optimization problem on the Riemannian manifold $GL_n(\mathbb{R})/O_n(\mathbb{R})$, and operator scaling can be viewed as a minimization algorithm on it; see [2]. For the explosive developments after [17], we refer to, e.g., [9] and the references therein.

Ivanyos, Qiao, and Subrahmanyam [29, 30] developed the first polynomial time algorithm for the nc-rank that works on an arbitrary field $\mathbb{K}$. Their algorithm can be viewed as a “vector-space generalization” of the augmenting path algorithm for the bipartite matching problem. This indicates a new direction in combinatorial optimization, since Edmonds’ problem generalizes several important combinatorial optimization problems. Inspired by their algorithm, [27] developed a combinatorial polynomial time algorithm for a certain algebraically constrained 2-matching problem on a bipartite graph, which corresponds to the (commutative) Edmonds’ problem for a linear symbolic matrix in [32]. Also, a noncommutative algebraic formulation that captures weighted versions of combinatorial optimization problems was studied in [25, 26, 40].

The main contribution of this paper is a significantly different polynomial time algorithm for computing the nc-rank over an arbitrary field $\mathbb{K}$. While the algorithms above and their validity proofs are rather involved, the algorithm and proof presented in this paper are conceptually simple, elementary, and relatively short. Further, our approach is relevant to the following two cutting-edge issues in discrete and continuous optimization:

  • submodular optimization on a modular lattice.

  • convex optimization on a CAT(0) space.

A submodular function $f$ on a lattice $\mathcal{L}$ is a function $f: \mathcal{L} \to \mathbb{R}$ satisfying $f(p) + f(q) \geq f(p \vee q) + f(p \wedge q)$ for $p, q \in \mathcal{L}$. Submodular functions on the Boolean lattice $\{0,1\}^n$ are well-studied, and have played central roles in the development of combinatorial optimization; see [15]. They are the counterparts of convex functions (discrete convex functions) in discrete optimization; see [37]. Optimization of submodular functions beyond Boolean lattices, particularly on modular lattices, is a new research area that has just started; see [16, 24, 34] on this subject.

A CAT(0) space is a (non-manifold) generalization of nonpositively curved Riemannian manifolds; see [8]. While CAT(0) spaces have been studied mainly in geometric group theory, their effective utilization in applied mathematics has gained attention; see e.g., [6]. A CAT(0) space is a uniquely geodesic metric space, and convexity concepts are defined along unique geodesics. The theory of algorithms and optimization on CAT(0) spaces is now being pioneered; see e.g., [3, 4, 5, 22, 41].

Our algorithm is obtained as a combination of these new optimization approaches. We hope that this will bring new interactions to the nc-rank literature. While it is somewhat related to the geodesically convex optimization mentioned above, we deal with optimization on combinatorially-defined non-manifold CAT(0) spaces. The most important implication of our result is that convex optimization algorithms on such spaces can be a tool for establishing polynomial-time complexity.

Outline.

Let us outline our algorithm. As shown by Fortin and Reutenauer [14], the nc-rank is given by the optimum value of an optimization problem:

Theorem 1.2 ([14]).

Let $A$ be a matrix of form (1.1). Then $\mathrm{nc\text{-}rank}\,A$ is equal to the optimal value of the following problem:

FR:  Min.  $2n - r - s$
     s.t.  $SAT$ has an $r \times s$ zero submatrix,
           $S, T \in GL_n(\mathbb{K})$.

As in [29, 30], our algorithm is designed to solve this optimization problem. The problem FR can also be formulated as an optimization problem on the modular lattice of vector subspaces of $\mathbb{K}^n$, as follows. Regard each matrix $A_i$ as a bilinear form $\mathbb{K}^n \times \mathbb{K}^n \to \mathbb{K}$ by

A_i(x, y) := x^{\top} A_i y \quad (x, y \in \mathbb{K}^n).

Then the condition of FR says that there is a pair of vector subspaces $U$ and $V$, of dimension $r$ and $s$ respectively, that annihilates all the bilinear forms, i.e., $A_i(U, V) = \{0\}$. The objective function is written as $2n - \dim U - \dim V$. Therefore, FR is equivalent to the following problem (the maximum vanishing subspace problem; MVSP):

MVSP:  Min.  $-\dim X - \dim Y$
       s.t.  $A_i(X, Y) = \{0\} \quad (i = 1, 2, \ldots, m)$,
             $X, Y$: vector subspaces of $\mathbb{K}^n$.

It is a basic fact that the family $\mathcal{L}$ of all vector subspaces of $\mathbb{K}^n$ forms a modular lattice with respect to the inclusion order. Hence, MVSP is an optimization problem over $\mathcal{L} \times \mathcal{L}$. Further, by reversing the order of the second $\mathcal{L}$, it can be viewed as a submodular function minimization (SFM) on the modular lattice $\mathcal{L} \times \mathcal{L}$; see Proposition 3.2 in Section 3.1.
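As a concrete illustration of the MVSP feasibility condition, the following sketch (our own, not from the paper; the toy instance and function names are invented) checks whether given subspaces $X, Y$, represented by basis matrices, annihilate all the bilinear forms $A_i$, using floating-point arithmetic over $\mathbb{Q}$ for simplicity:

```python
import numpy as np

def vanishes(As, X, Y, tol=1e-9):
    # A_i(X, Y) = {0} iff x^T A_i y = 0 for all basis vectors
    # x of X and y of Y (taken as the columns of X and Y).
    return all(np.allclose(X.T @ A @ Y, 0, atol=tol) for A in As)

# Toy instance with n = 2, m = 1.
A1 = np.array([[0.0, 1.0],
               [0.0, 0.0]])
X = np.array([[1.0], [0.0]])   # X = span{e_1}, dim X = 1
Y = np.array([[1.0], [0.0]])   # Y = span{e_1}, dim Y = 1
print(vanishes([A1], X, Y))          # True:  e_1^T A_1 e_1 = 0
print(vanishes([A1], X, np.eye(2)))  # False: Y = K^2 does not vanish
```

For the feasible pair above, the MVSP objective is $-\dim X - \dim Y = -2$, corresponding to the FR value $2n - r - s = 2$.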

Contrary to the Boolean case, it is not known in general whether a submodular function on a modular lattice can be minimized in polynomial time. The key to the polynomial-time solvability of SFM on the Boolean lattice $\{0,1\}^n$ is the Lovász extension [35]—a piecewise-linear interpolation $\bar{f}: [0,1]^n \to \mathbb{R}$ of a function $f: \{0,1\}^n \to \mathbb{R}$ such that $\bar{f}$ is convex if and only if $f$ is submodular. For SFM on a modular lattice, however, such a good convex relaxation to $\mathbb{R}^n$ is not known.

A recent study [24] introduced an approach for constructing a convex relaxation of SFM on a modular lattice, where the domain of the relaxation is a CAT(0) space. The construction is based on the concept of an orthoscheme complex [7]. Consider the order complex $K(\mathcal{L})$ of $\mathcal{L}$, and endow each simplex with a specific Euclidean metric. The resulting metric space $K(\mathcal{L})$ is called the orthoscheme complex of $\mathcal{L}$, and is dealt with as a continuous relaxation of $\mathcal{L}$. The details are given in Section 2.2.2. Figure 1 illustrates the orthoscheme complex of a modular lattice of rank 2, which is obtained by gluing Euclidean isosceles right triangles along their longer edges.

Figure 1: An orthoscheme complex

The orthoscheme complex of a modular lattice was shown to be CAT(0) [10]. This enables us to consider geodesic convexity for functions on $K(\mathcal{L})$. In this setting, a submodular function $f: \mathcal{L} \to \mathbb{R}$ is characterized by the convexity of its piecewise linear interpolation, i.e., the Lovász extension $\bar{f}: K(\mathcal{L}) \to \mathbb{R}$ [24]. Following this construction, we obtain an exact convex relaxation of MVSP in a CAT(0) space.

Our proposed algorithm is obtained by applying the splitting proximal point algorithm (SPPA) to this convex relaxation. SPPA is a generic algorithm that minimizes a convex function of the separable form $\sum_{i=1}^{N} f_i$, where each $f_i$ is a convex function. Each iteration of the algorithm updates the current point $x$ to its resolvent of $f_i$—a minimizer of $y \mapsto f_i(y) + (1/\lambda)\, d(y, x)^2$—where $i$ is chosen cyclically. Bačák [4] showed that SPPA generates a sequence convergent to a minimizer of $f$ (under a mild assumption). Subsequently, Ohta and Pálfia [39] proved a sublinear convergence rate for SPPA.

The main technical contribution is to show that SPPA is applicable to the convex relaxation of MVSP and yields a polynomial time algorithm for MVSP: We provide an equivalent convex relaxation of MVSP with a separable objective function $\sum_i f_i$, and show that the resolvent of each $f_i$ can be computed in polynomial time. By utilizing the sublinear convergence estimate, a polynomial number of iterations of SPPA identifies an optimal solution of MVSP.

Compared with the existing algorithms, this algorithm has advantages and drawbacks. As mentioned above, our algorithm and its validity proof are relatively simple. In particular, it can be written uniformly for an arbitrary field $\mathbb{K}$, where the only requirement on $\mathbb{K}$ is that its arithmetic operations be executable. No special care is needed for a small finite field, whereas the algorithm in [29, 30] needs a field extension. On the other hand, our algorithm is very slow; see Theorem 3.3. This is caused by using a generic and primitive algorithm (SPPA) for optimization on CAT(0) spaces. We believe that this will be naturally improved in future developments.

The problematic point of our algorithm is bit-complexity explosion in the case $\mathbb{K} = \mathbb{Q}$. Our algorithm updates feasible vector subspaces in MVSP, and can cause an exponential increase in the bit-size of the bases representing those vector subspaces. To resolve this problem and exploit the advantage in finite fields, we propose a reduction of nc-rank computation over $\mathbb{Q}$ to that over $GF(p)$. This reduction is an application of the $p$-adic valuation on $\mathbb{Q}$. We consider a weighted version of the nc-rank, which was introduced by [25] for $\mathbb{K}(t)$ and is definable for an arbitrary field with a discrete valuation. The corresponding optimization problem MVMP is a discrete convex optimization on a representative CAT(0) space—the Euclidean building for $GL_n(\mathbb{Q})$ (or $GL_n(\mathbb{Q}_p)$). This may be viewed as a $p$-adic counterpart of the above geodesically convex optimization approach on $GL_n(\mathbb{R})/O_n(\mathbb{R})$ for nc-singularity testing over $\mathbb{Q}$. By using the obvious relation between the $p$-adic valuation of a nonzero integer and its bit-length in base $p$, we show that nc-singularity testing over $\mathbb{Q}$ reduces to a polynomial number of nc-rank computations over the residue field $GF(p)$, in which the required bit-length is polynomially bounded.

Organization.

The rest of this paper is organized as follows. In Section 2, we present the necessary background on convex optimization on CAT(0) spaces, modular lattices, and submodular functions. In Section 3, we present our algorithm and show its validity. In Section 4, we present the $p$-adic reduction for nc-rank computation over $\mathbb{Q}$.

Original motivation: Block triangularization of a partitioned matrix.

The original version [21] of this paper dealt with block triangularization of a matrix with the following partition structure:

A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1\nu} \\ A_{21} & A_{22} & \cdots & A_{2\nu} \\ \vdots & \vdots & \ddots & \vdots \\ A_{\mu 1} & A_{\mu 2} & \cdots & A_{\mu\nu} \end{pmatrix},

where $A_{\alpha\beta}$ is an $n_\alpha \times m_\beta$ matrix over the field $\mathbb{K}$ for $\alpha \in [\mu]$ and $\beta \in [\nu]$. Consider the following block triangularization

A \mapsto PEAFQ = \left[\begin{array}{cccc} \ast & & & \text{\huge $\ast$} \\ & \ast & & \\ & & \ddots & \\ \text{\huge $O$} & & & \ast \end{array}\right],

where $P$ and $Q$ are permutation matrices, and $E$ and $F$ are regular transformations “within blocks,” i.e., $E$ and $F$ are block diagonal matrices with block diagonals $E_\alpha \in GL_{n_\alpha}(\mathbb{K})$ $(\alpha \in [\mu])$ and $F_\beta \in GL_{m_\beta}(\mathbb{K})$ $(\beta \in [\nu])$, respectively. Such a block triangularization was addressed by Ito, Iwata, and Murota [31], motivated by the analysis of physical systems with (restricted) symmetry. The most effective block triangularization is determined by arranging a maximal chain of maximum-size zero blocks exposed in $EAF$, where the size of a zero block is defined as the sum of its row and column numbers. This generalizes the classical Dulmage–Mendelsohn decomposition for bipartite graphs and Murota’s combinatorial canonical form for layered mixed matrices; see [23, 37].

Finding a maximum-size zero block is nothing but FR (or MVSP) for the linear symbolic matrix obtained by multiplying each block $A_{\alpha\beta}$ by a variable $x_{\alpha\beta}$; see [25, Appendix] for details. The original version of our algorithm was designed for this zero-block finding. Later, we found that this is essentially nc-rank computation. This new version improves the analysis (of Theorem 3.3), simplifies the arguments, particularly the proof of Theorem 3.9, and includes a new section on the $p$-adic reduction.

2 Preliminaries

Let $[n]$ denote $\{1, 2, \ldots, n\}$. Let $\mathbb{R}$, $\mathbb{Q}$, $\mathbb{Z}$ denote the sets of real, rational, and integer numbers, respectively. Let $1_X$ denote the vector in $\mathbb{R}^n$ such that $(1_X)_i = 1$ if $i \in X$ and zero otherwise. The $i$-th unit vector $1_{\{i\}}$ is simply written as $1_i$.

2.1 Convex optimization on CAT(0)-spaces

2.1.1 CAT(0)-spaces

Let $K$ be a metric space with distance function $d$. A path in $K$ is a continuous map $\gamma: [0,1] \to K$, and its length is defined as $\sup \sum_{i=0}^{N-1} d(\gamma(t_i), \gamma(t_{i+1}))$ over all $0 = t_0 < t_1 < t_2 < \cdots < t_N = 1$ and $N > 0$. If $\gamma(0) = x$ and $\gamma(1) = y$, then we say that the path $\gamma$ connects $x, y$. A geodesic is a path $\gamma$ satisfying $d(\gamma(s), \gamma(t)) = d(\gamma(0), \gamma(1))\,|s - t|$ for every $s, t \in [0,1]$. A geodesic metric space is a metric space $K$ in which any two points are connected by a geodesic. If, additionally, the geodesic connecting any two points is unique, then $K$ is called uniquely geodesic.

We next introduce CAT(0) spaces. Informally, a CAT(0) space is a geodesic metric space in which no triangle is thicker than the corresponding triangle in the Euclidean plane. We here adopt the following definition. A geodesic metric space $K$ is said to be CAT(0) if for every point $x \in K$, every geodesic $\gamma: [0,1] \to K$, and every $t \in [0,1]$, it holds that

d(x, \gamma(t))^2 \leq (1-t)\, d(x, \gamma(0))^2 + t\, d(x, \gamma(1))^2 - t(1-t)\, d(\gamma(0), \gamma(1))^2.   (2.1)
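In the Euclidean plane, inequality (2.1) holds with equality; a quick numerical check (our own illustration, not part of the paper) makes the comparison-triangle intuition concrete:

```python
import numpy as np

rng = np.random.default_rng(0)
x, a, b = rng.standard_normal((3, 2))   # three points in the plane
t = 0.3
g = (1 - t) * a + t * b                 # gamma(t): the segment from a to b

lhs = np.linalg.norm(x - g) ** 2
rhs = ((1 - t) * np.linalg.norm(x - a) ** 2
       + t * np.linalg.norm(x - b) ** 2
       - t * (1 - t) * np.linalg.norm(a - b) ** 2)
print(abs(lhs - rhs))  # equality in Euclidean space, up to rounding
```

A CAT(0) space is one where the left-hand side may only be smaller, i.e., geodesic triangles are at most as "thick" as Euclidean ones.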

The following property of CAT(0) spaces is the basis for introducing convexity.

Proposition 2.1 ([8, Proposition 1.4]).

A CAT(0) space is uniquely geodesic.

Suppose that $K$ is a CAT(0) space. For points $x, y$ in $K$, let $[x, y]$ denote the image of the unique geodesic $\gamma$ connecting $x, y$. For $t \in [0,1]$, the point $p$ on $[x, y]$ with $d(x, p)/d(x, y) = t$ is formally written as $(1-t)x + ty$.

A function $f: K \to \mathbb{R}$ is said to be convex if for all $x, y \in K$ and $t \in [0,1]$ it satisfies

f((1-t)x + ty) \leq (1-t) f(x) + t f(y).

If it satisfies a stronger inequality

f((1-t)x + ty) \leq (1-t) f(x) + t f(y) - \frac{\kappa}{2} t(1-t)\, d(x, y)^2

for some $\kappa > 0$, then $f$ is said to be strongly convex with parameter $\kappa$. In this paper, we always assume that a convex function is continuous. A function $f: K \to \mathbb{R}$ is said to be $L$-Lipschitz with parameter $L \geq 0$ if for all $x, y \in K$ it satisfies

|f(x) - f(y)| \leq L\, d(x, y).
Lemma 2.2.

For any $z \in K$, the function $x \mapsto d(z, x)^2$ is strongly convex with parameter $\kappa = 2$, and is $L$-Lipschitz with $L = 2 \operatorname{diam} K$, where $\operatorname{diam} K := \sup_{x, y \in K} d(x, y)$ denotes the diameter of $K$.

The former follows directly from the definition (2.1) of a CAT(0) space. The latter follows from $d(z,x)^2 - d(z,y)^2 \leq (d(z,x) + d(z,y))(d(z,x) - d(z,y)) \leq (d(z,x) + d(z,y))\, d(x,y) \leq (2 \operatorname{diam} K)\, d(x,y)$.

2.1.2 Proximal point algorithm

Let $K$ be a complete CAT(0) space (also called an Hadamard space). For a convex function $f: K \to \mathbb{R}$ and $\lambda > 0$, the resolvent of $f$ is the map $J_\lambda^f: K \to K$ defined by

J_\lambda^f(x) := \operatorname{argmin}_{y \in K} \left( f(y) + \frac{1}{2\lambda} d(x, y)^2 \right) \quad (x \in K).

Since the function $y \mapsto f(y) + \frac{1}{2\lambda} d(x, y)^2$ is strongly convex with parameter $1/\lambda > 0$, the minimizer is uniquely determined, and $J_\lambda^f$ is well-defined; see [5, Proposition 2.2.17].
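For intuition, in Euclidean $\mathbb{R}^n$ (a complete CAT(0) space) the resolvent of the squared-distance function $f(y) = \|y - c\|^2$ has a closed form, obtained by setting the gradient of $y \mapsto \|y - c\|^2 + \frac{1}{2\lambda}\|y - x\|^2$ to zero. The sketch below is our own illustration, not from the paper:

```python
import numpy as np

def resolvent_sqdist(x, c, lam):
    # argmin_y ||y - c||^2 + (1/(2*lam)) * ||y - x||^2.
    # Setting the gradient 2(y - c) + (y - x)/lam to zero gives
    # y = (2*lam*c + x) / (2*lam + 1).
    return (2 * lam * c + x) / (2 * lam + 1)

x = np.array([4.0, 0.0])
c = np.zeros(2)
for lam in (0.1, 1.0, 10.0):
    print(resolvent_sqdist(x, c, lam))  # pulled toward c as lam grows
```

As $\lambda \to \infty$ the resolvent approaches the minimizer $c$ of $f$; as $\lambda \to 0$ it stays near $x$, which is the usual proximal trade-off.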

The proximal point algorithm (PPA) iterates the update $x \leftarrow J_\lambda^f(x)$. This simple algorithm generates a sequence converging to a minimizer of $f$ under a mild assumption; see [3, 5]. The splitting proximal point algorithm (SPPA) [4, 5], which we will use, minimizes a convex function $f: K \to \mathbb{R}$ represented in the form

f := \sum_{i=1}^{m} f_i,

where each $f_i: K \to \mathbb{R}$ is a convex function. Consider a sequence $(\lambda_k)_{k=1,2,\ldots}$ satisfying

\sum_{k=0}^{\infty} \lambda_k = \infty, \quad \sum_{k=0}^{\infty} \lambda_k^2 < \infty.
Splitting Proximal Point Algorithm (SPPA)

  • Let $x_0 \in K$ be an initial point.

  • For $k = 0, 1, 2, \ldots$, repeat the following:

    x_{km+i} := J_{\lambda_k}^{f_i}(x_{km+i-1}) \quad (i = 1, 2, \ldots, m).

Bačák [4] showed that the sequence generated by SPPA converges to a minimizer of $f$ if $K$ is locally compact. Ohta and Pálfia [39] proved sublinear convergence of SPPA when $f$ is strongly convex, where $K$ is not necessarily locally compact.
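As a toy illustration of SPPA (our own sketch, taking the Euclidean plane as the Hadamard space $K$), we minimize $f = f_1 + f_2$ with $f_i(y) = \|y - c_i\|^2$, whose unique minimizer is the midpoint $(c_1 + c_2)/2$; here each resolvent is available in closed form:

```python
import numpy as np

def resolvent(x, c, lam):
    # closed-form resolvent of y -> ||y - c||^2 in Euclidean space
    return (2 * lam * c + x) / (2 * lam + 1)

c = [np.array([0.0, 0.0]), np.array([2.0, 2.0])]  # f_i(y) = ||y - c_i||^2
x = np.array([10.0, -5.0])                         # initial point x_0
for k in range(500):
    lam = 1.0 / (k + 1)      # sum lam_k = inf, sum lam_k^2 < inf
    for ci in c:             # cycle through the resolvents of f_1, f_2
        x = resolvent(x, ci, lam)
print(x)  # approaches the minimizer (1, 1)
```

The decaying step sizes are essential: with a fixed $\lambda$ the cyclic iteration stalls at a point biased away from the true minimizer.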

Theorem 2.3 ([39]).

Suppose that $f$ is strongly convex with parameter $\epsilon > 0$ and each $f_i$ is $L$-Lipschitz. Let $x^*$ be the unique minimizer of $f$. Define the sequence $(\lambda_k)$ by

\lambda_k := \frac{1}{\epsilon (k+1)}.

Then the sequence $(x_\ell)$ generated by SPPA satisfies

d(x_{km}, x^*)^2 = O\left( \frac{\log k}{k} \cdot \frac{L^2 m^2}{\epsilon^2} \right) \quad (k = 1, 2, \ldots).

2.2 Geometry of modular lattices

We use basic terminology and facts from lattice theory; see e.g., [18]. A lattice $\mathcal{L}$ is a partially ordered set in which every pair $p, q$ of elements has a meet $p \wedge q$ (greatest common lower bound) and a join $p \vee q$ (least common upper bound). Let $\preceq$ denote the partial order, where $p \prec q$ means $p \preceq q$ and $p \neq q$. A pairwise comparable subset of $\mathcal{L}$, arranged as $p_0 \prec p_1 \prec \cdots \prec p_k$, is called a chain (from $p_0$ to $p_k$), where $k$ is called its length. In this paper, we only consider lattices in which every chain has finite length. Let $\mathbf{0}$ and $\mathbf{1}$ denote the minimum and maximum elements of $\mathcal{L}$, respectively. The rank $r(p)$ of an element $p$ is defined as the maximum length of a chain from $\mathbf{0}$ to $p$. The rank of the lattice $\mathcal{L}$ is defined as the rank of $\mathbf{1}$. For elements $p, q$ with $p \preceq q$, the interval $[p, q]$ is the set of elements $u$ with $p \preceq u \preceq q$. Restricting $\preceq$ to $[p, q]$, the interval $[p, q]$ is a lattice with maximum $q$ and minimum $p$. If $p \neq q$ and $[p, q] = \{p, q\}$, we say that $q$ covers $p$ and write $p \prec: q$ or $q :\succ p$. For two lattices $\mathcal{L}, \mathcal{M}$, their direct product $\mathcal{L} \times \mathcal{M}$ becomes a lattice, where the partial order on $\mathcal{L} \times \mathcal{M}$ is defined by $(p, p') \preceq (q, q') \Leftrightarrow p \preceq q,\ p' \preceq q'$.

A lattice $\mathcal{L}$ is called modular if for every triple $x, a, b$ of elements with $x \preceq b$, it holds that $x \vee (a \wedge b) = (x \vee a) \wedge b$. A modular lattice satisfies the Jordan–Dedekind chain condition: the lengths of maximal chains of every interval are the same. We also often use the following property:

p \prec: p' \ \Rightarrow \ p \wedge q = p' \wedge q \ \ \text{or} \ \ p \wedge q \prec: p' \wedge q.   (2.2)

This can be seen from the definition of modular lattices, and it also holds with $\wedge$ replaced by $\vee$.

A modular lattice $\mathcal{L}$ is said to be complemented if every element can be represented as a join of atoms, where an atom is an element of rank 1. It is known that for a complemented modular lattice, every interval is complemented modular, and the lattice obtained by reversing the partial order is also complemented modular. The product of two complemented modular lattices is also complemented modular.

A canonical example of a complemented modular lattice is the family $\mathcal{L}$ of all subspaces of a vector space $U$, where the partial order is the inclusion order with $\wedge = \cap$ and $\vee = +$. Another important example is a Boolean lattice—a lattice isomorphic to the poset $2^{[n]}$ of all subsets of $[n]$ with respect to the inclusion order $\subseteq$.

2.2.1 Frames—Boolean sublattices in a complemented modular lattice

Let $\mathcal{L}$ be a complemented modular lattice of rank $n$, and let $r$ denote the rank function of $\mathcal{L}$. A complemented modular lattice is equivalent to a spherical building of type A [1]. We consider a lattice-theoretic counterpart of an apartment, which is a maximal Boolean sublattice of $\mathcal{L}$.

A base is a set of $n$ atoms $a_1, a_2, \ldots, a_n$ with $a_1 \vee a_2 \vee \cdots \vee a_n = \mathbf{1}$. The sublattice $\langle a_1, a_2, \ldots, a_n \rangle$ generated by a base $\{a_1, a_2, \ldots, a_n\}$ is called a frame, which is isomorphic to the Boolean lattice $2^{[n]}$ by the map

X \mapsto \bigvee_{i \in X} a_i.
Lemma 2.4 (see e.g., [18]).

Let $\mathcal{L}$ be a complemented modular lattice of rank $n$.

  • (1)

For chains $\mathcal{C}, \mathcal{D}$ in $\mathcal{L}$, there is a frame $\mathcal{F} \subseteq \mathcal{L}$ containing $\mathcal{C}$ and $\mathcal{D}$.

  • (2)

For a frame $\mathcal{F}$ and an ordering $a_1, a_2, \ldots, a_n$ of its base, define the map $\varphi_{a_1, a_2, \ldots, a_n}: \mathcal{L} \to \mathcal{F}$ by

p \mapsto \bigvee \{ a_i \mid i \in [n] : p \wedge (a_1 \vee a_2 \vee \cdots \vee a_i) :\succ p \wedge (a_1 \vee a_2 \vee \cdots \vee a_{i-1}) \}.   (2.3)

Then $\varphi_{a_1, a_2, \ldots, a_n}$ is a retraction to $\mathcal{F}$ that is rank-preserving (i.e., $r(p) = r(\varphi(p))$) and order-preserving (i.e., $p \preceq q \Rightarrow \varphi(p) \preceq \varphi(q)$).

This is nothing but a part of the axioms of a building, where the map in (2) is essentially the canonical retraction to an apartment.

Proof.

We show (1) by induction on $n$. Suppose that $\mathcal{C} = (\mathbf{0} = p_0 \prec p_1 \prec \cdots \prec p_n = \mathbf{1})$ and $\mathcal{D} = (\mathbf{0} = q_0 \prec q_1 \prec \cdots \prec q_n = \mathbf{1})$. Consider the maximal chains $\mathcal{C}', \mathcal{D}'$ from $\mathbf{0}$ to $p_{n-1}$, where $\mathcal{C}' := (\mathbf{0} = p_0 \prec p_1 \prec \cdots \prec p_{n-1})$ and $\mathcal{D}'$ consists of $q_i' := p_{n-1} \wedge q_i$ $(i = 0, 1, \ldots, n)$. Note that the maximality of $\mathcal{D}'$ follows from (2.2). By induction, there is a frame $\langle a_1, a_2, \ldots, a_{n-1} \rangle$ of the interval $[\mathbf{0}, p_{n-1}]$ (which is a complemented modular lattice of rank $n-1$) containing $\mathcal{C}', \mathcal{D}'$. Consider the first index $j$ such that $q_j \not\preceq p_{n-1}$. Then $q_i' = q_i$ for $i < j$, and $q_j' = q_{j-1}$. For $i \geq j$, by $p_{n-1} \vee q_j = \mathbf{1}$ and modularity, it holds that $q_i$ covers $q_i'$. Again by modularity, it must hold that $q_i' \vee q_j = q_i$ for $i \geq j$. By complementedness, we can choose an atom $a_n$ such that $q_{j-1} \vee a_n = q_j$. Now $\langle a_1, a_2, \ldots, a_n \rangle$ is a frame as required.

(2). By (2.2), $\{p \wedge (a_1 \vee \cdots \vee a_i)\}_i$ is a maximal chain from $\mathbf{0}$ to $p$. From this and the chain condition, the rank-preserving property follows. Suppose that $p \preceq q$ and $p \wedge b \prec: p \wedge b'$ for $b \prec: b'$. Then $[p \wedge b, b] \ni q \wedge b \preceq q \wedge b' \in [p \wedge b, b']$. By (2.2) and the chain condition from $p \wedge b$ to $b'$, it must hold that $q \wedge b \prec: q \wedge b'$. This means that any index $i$ appearing in (2.3) for $p$ also appears in that for $q$. The order-preserving property follows. ∎

Suppose that $\mathcal{L}$ is the lattice of all vector subspaces of $\mathbb{K}^n$, and that we are given two chains $\mathcal{C}$ and $\mathcal{D}$ of vector subspaces, where each subspace $X$ in the chains is given by a matrix $B$ with $\operatorname{Im} B = X$ (or $\ker B = X$). The above proof can be implemented via Gaussian elimination, obtaining vectors $a_1, a_2, \ldots, a_n$ with $\mathcal{C}, \mathcal{D} \subseteq \langle a_1, a_2, \ldots, a_n \rangle$ in polynomial time.

2.2.2 The orthoscheme complex of a modular lattice

Let $\mathcal{L}$ be a modular lattice of rank $n$. Let $K(\mathcal{L})$ denote the geometric realization of the order complex of $\mathcal{L}$. That is, $K(\mathcal{L})$ is the set of all formal convex combinations $x = \sum_{p \in \mathcal{L}} \lambda(p) p$ of elements in $\mathcal{L}$ such that the support $\{p \in \mathcal{L} \mid \lambda(p) \neq 0\}$ of $x$ is a chain of $\mathcal{L}$. Here “convex” means that the coefficients $\lambda(p)$ are nonnegative reals with $\sum_{p \in \mathcal{L}} \lambda(p) = 1$. The simplex corresponding to a chain $\mathcal{C}$ is the subset of points whose supports belong to $\mathcal{C}$.

We next introduce a metric on $K(\mathcal{L})$. For a maximal simplex $\sigma$ corresponding to a maximal chain $\mathcal{C} = p_0 \prec p_1 \prec \cdots \prec p_n$, define the map $\varphi_\sigma: \sigma \to \mathbb{R}^n$ by

\varphi_\sigma(x) = \sum_{i=1}^{n} \lambda_i 1_{[i]} \quad \left( x = \sum_{i=0}^{n} \lambda_i p_i \in \sigma \right).   (2.4)

This is a bijection from $\sigma$ to the $n$-dimensional simplex with vertices $0, 1_{[1]}, 1_{[2]}, 1_{[3]}, \ldots, 1_{[n]}$. This simplex is called the $n$-dimensional orthoscheme. The metric $d_\sigma$ on each simplex $\sigma$ of $K(\mathcal{L})$ is defined by

d_\sigma(x, y) := \| \varphi_\sigma(x) - \varphi_\sigma(y) \|_2 \quad (x, y \in \sigma).   (2.5)

Accordingly, the length $d(\gamma)$ of a path $\gamma: [0,1] \to K(\mathcal{L})$ is defined as the supremum of $\sum_{i=0}^{N-1} d_{\sigma_i}(\gamma(t_i), \gamma(t_{i+1}))$ over all $0 = t_0 < t_1 < t_2 < \cdots < t_N = 1$ and $N \geq 1$ such that $\gamma([t_i, t_{i+1}])$ belongs to a simplex $\sigma_i$ for each $i$. Then the metric $d(x, y)$ on $K(\mathcal{L})$ is defined as the infimum of $d(\gamma)$ over all paths $\gamma$ connecting $x, y$. The resulting metric space $K(\mathcal{L})$ is called the orthoscheme complex of $\mathcal{L}$ [7]. By Bridson’s theorem [8, Theorem 7.19], $K(\mathcal{L})$ is a complete geodesic metric space. Basic properties of the orthoscheme complex of a modular lattice are summarized as follows.
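To make the metric concrete, the following sketch (our own illustration; it only handles two points lying in a common maximal simplex) computes the orthoscheme coordinates (2.4) and the intra-simplex distance (2.5) for a rank-2 chain, whose orthoscheme is an isosceles right triangle as in Figure 1:

```python
import numpy as np

def phi(lam):
    # phi_sigma of (2.4) for a maximal chain p_0 < p_1 < ... < p_n:
    # the j-th coordinate of sum_i lam_i p_i is lam_j + ... + lam_n.
    lam = np.asarray(lam, dtype=float)
    return np.cumsum(lam[::-1])[::-1][1:]

# Rank-2 chain 0 = p_0 < p_1 < p_2 = 1; the vertices map to the
# orthoscheme with corners (0,0), (1,0), (1,1).
print(phi([1, 0, 0]), phi([0, 1, 0]), phi([0, 0, 1]))

x = [0.5, 0.5, 0.0]          # midpoint of p_0 and p_1
y = [0.0, 0.5, 0.5]          # midpoint of p_1 and p_2
print(np.linalg.norm(phi(x) - phi(y)))  # d_sigma(x, y) = sqrt(1/2)
```

For points in different simplices, the distance is instead the infimum of path lengths through the glued complex, which is what makes the global CAT(0) property nontrivial.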

Proposition 2.5.
  • (1)

[10] For a modular lattice $\mathcal{L}$, the orthoscheme complex $K(\mathcal{L})$ is a complete CAT(0) space.

  • (2)

[7, 10] For two modular lattices $\mathcal{L}, \mathcal{M}$, the orthoscheme complex $K(\mathcal{L} \times \mathcal{M})$ is isometric to $K(\mathcal{L}) \times K(\mathcal{M})$ with the metric given by

d((x, y), (x', y')) := \sqrt{d(x, x')^2 + d(y, y')^2} \quad ((x, y), (x', y') \in K(\mathcal{L}) \times K(\mathcal{M})).
  • (3)

[7, 10] For the Boolean lattice $\mathcal{L} = 2^{[n]}$, the orthoscheme complex $K(\mathcal{L})$ is isometric to the $n$-cube $[0,1]^n \subseteq \mathbb{R}^n$, where the isometry is given by

x = \sum_i \lambda_i X_i \mapsto \sum_i \lambda_i 1_{X_i}.   (2.6)
  • (4)

    [10] For a complemented modular lattice {\cal L} of rank nn and a frame {\cal F} of {\cal L} with an ordering a1,a2,,ana_{1},a_{2},\ldots,a_{n} of its basis, the map φ=φa1,a2,,an:\varphi=\varphi_{a_{1},a_{2},\ldots,a_{n}}:{\cal L}\to{\cal F} is extended to φ¯:K()K()\bar{\varphi}:K({\cal L})\to K({\cal F}) by

    x=iλipiiλiφ(pi).x=\sum_{i}\lambda_{i}p_{i}\mapsto\sum_{i}\lambda_{i}\varphi(p_{i}).

    Then φ¯\bar{\varphi} is a nonexpansive retraction from K()K({\cal L}) to K()K({\cal F}). In particular,

    • (4-1)

      K()[0,1]nK({\cal F})\simeq[0,1]^{n} is an isometric subspace of K()K({\cal L}), and

    • (4-2)

      diamK()=n\mathop{\rm diam}K({\cal L})=\sqrt{n}.

For a complemented modular lattice {\cal L}, the CAT(0)-property of K()K({\cal L}) is equivalent to the CAT(1)-property of the corresponding spherical building, as shown in [20].

The isometry between K()×K()K({\cal L})\times K({\cal M}) and K(×)K({\cal L}\times{\cal M}).

The isometry from K(×)K({\cal L}\times{\cal M}) to K()×K()K({\cal L})\times K({\cal M}) (Proposition 2.5 (2)) is given by

z=iλi(pi,qi)(iλipi,iλiqi).z=\sum_{i}\lambda_{i}(p_{i},q_{i})\mapsto\left(\sum_{i}\lambda_{i}p_{i},\sum_{i}\lambda_{i}q_{i}\right).

The inverse map is constructed as follows: for (x,y)=(\sum_{i}\mu_{i}p_{i},\sum_{j}\nu_{j}q_{j})=:(x^{\prime},y^{\prime}), starting from z=0, choose the maximal p_{i},q_{j} with \mu_{i}\neq 0 and \nu_{j}\neq 0, update z\leftarrow z+\min(\mu_{i},\nu_{j})(p_{i},q_{j}) and (x^{\prime},y^{\prime})\leftarrow(x^{\prime},y^{\prime})-\min(\mu_{i},\nu_{j})(p_{i},q_{j}), and repeat until (x^{\prime},y^{\prime})=(0,0). The resulting z satisfies \varphi(z)=(x,y).
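This merging procedure can be sketched in code; a minimal illustration, assuming each point is given as a list of (coefficient, element) pairs sorted with the maximal lattice element first (the element labels are opaque placeholders):

```python
from fractions import Fraction

def merge_combinations(x_terms, y_terms):
    # x = sum_i mu_i p_i and y = sum_j nu_j q_j, as lists of
    # (coefficient, element) pairs with coefficients summing to 1;
    # returns z = sum_k lambda_k (p, q) projecting back to (x, y).
    x_terms = [[Fraction(c), p] for c, p in x_terms]
    y_terms = [[Fraction(c), q] for c, q in y_terms]
    z, i, j = [], 0, 0
    while i < len(x_terms) and j < len(y_terms):
        (mu, p), (nu, q) = x_terms[i], y_terms[j]
        t = min(mu, nu)                  # consume min(mu_i, nu_j) of each
        if t > 0:
            z.append((t, (p, q)))
        x_terms[i][0] -= t
        y_terms[j][0] -= t
        if x_terms[i][0] == 0:           # advance to the next maximal p_i
            i += 1
        if y_terms[j][0] == 0:           # advance to the next maximal q_j
            j += 1
    return z
```

The number of output terms is at most the total number of input terms minus one, so the representation stays small.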

The {\cal F}-coordinate of a frame {\cal F}.

A frame {\cal F}=\langle a_{1},a_{2},\ldots,a_{n}\rangle is isomorphic to the Boolean lattice 2^{[n]} by a_{i_{1}}\vee a_{i_{2}}\vee\cdots\vee a_{i_{k}}\mapsto\{i_{1},i_{2},\ldots,i_{k}\}. Further, the subcomplex K({\cal F}) is viewed as an n-cube [0,1]^{n}, and a point x in K({\cal F}) is viewed as x=(x_{1},x_{2},\ldots,x_{n})\in[0,1]^{n} via the isometry (2.6). This n-dimensional vector (x_{1},x_{2},\ldots,x_{n}) is called the {\cal F}-coordinate of x. From the {\cal F}-coordinate (x_{1},x_{2},\ldots,x_{n}), the original expression of x is recovered by sorting x_{1},x_{2},\ldots,x_{n} in decreasing order as x_{i_{1}}\geq x_{i_{2}}\geq\cdots\geq x_{i_{n}}, and letting

x=(1xi1)𝟎+k=1n(xikxik+1)(ai1ai2aik),x=(1-x_{i_{1}}){\bf 0}+\sum_{k=1}^{n}(x_{i_{k}}-x_{i_{k+1}})(a_{i_{1}}\vee a_{i_{2}}\vee\cdots\vee a_{i_{k}}), (2.7)

where xin+1:=0x_{i_{n+1}}:=0.
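Formula (2.7) amounts to sorting and taking successive differences; a small sketch, encoding the lattice element a_{i_{1}}\vee\cdots\vee a_{i_{k}} by its index set (a hypothetical encoding for illustration):

```python
def from_F_coordinate(coords):
    # Recover the expression (2.7) of x from its F-coordinate
    # (x_1, ..., x_n): sort decreasingly and take successive differences
    # along the chain of prefix joins.
    n = len(coords)
    order = sorted(range(n), key=lambda i: -coords[i])  # x_{i1} >= ... >= x_{in}
    vals = [coords[i] for i in order] + [0.0]           # x_{i_{n+1}} := 0
    terms = [(1 - vals[0], frozenset())]                # coefficient of bottom 0
    for k in range(n):
        terms.append((vals[k] - vals[k + 1], frozenset(order[:k + 1])))
    return [(c, s) for c, s in terms if c > 0]
```

The returned coefficients sum to 1, as required of a point of the orthoscheme complex.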

2.2.3 Submodular functions and Lovász extensions

Let {\cal L} be a modular lattice. A function f:f:{\cal L}\to{\mathbb{R}} is said to be submodular if

f(p)+f(q)f(pq)+f(pq)(p,q).f(p)+f(q)\geq f(p\wedge q)+f(p\vee q)\quad(p,q\in{\cal L}).

For a function f:f:{\cal L}\to{\mathbb{R}}, the Lovász extension f¯:K()\overline{f}:K({\cal L})\to{\mathbb{R}} is defined by

f¯(x):=iλif(pi)(x=iλipiK()).\overline{f}(x):=\sum_{i}\lambda_{i}f(p_{i})\quad(x=\sum_{i}\lambda_{i}p_{i}\in K({\cal L})).

In the case of =2[n]{\cal L}=2^{[n]}, this definition of the Lovász extension coincides with the original one [15, 35] by K()[0,1]nK({\cal L})\simeq[0,1]^{n} (Proposition 2.5 (3)).
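In this Boolean case the definition can be evaluated directly; a minimal sketch, where the set function f is any Python callable on frozensets:

```python
def lovasz_extension(f, x):
    # f_bar(x) = sum_i lambda_i f(p_i) along the chain of prefix sets
    # determined by sorting the coordinates of x in decreasing order.
    n = len(x)
    order = sorted(range(n), key=lambda i: -x[i])
    vals = [x[i] for i in order] + [0.0]
    total = (1 - vals[0]) * f(frozenset())   # weight of the bottom element
    prefix = []
    for k in range(n):
        prefix.append(order[k])
        total += (vals[k] - vals[k + 1]) * f(frozenset(prefix))
    return total
```

For instance, for the submodular function f(S)=\min(|S|,1) this gives \overline{f}(x)=\max_{i}x_{i}, which is convex, in accordance with Proposition 2.6 (1).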

Proposition 2.6.

Let {\cal L} be a modular lattice of rank nn. For a function f:f:{\cal L}\to{\mathbb{R}}, we have the following.

  • (1)

    [24] ff is submodular if and only if the Lovász extension f¯\overline{f} is convex.

  • (2)

    The Lovász extension f¯\overline{f} is LL-Lipschitz with L=2nmaxp|f(p)|.L=2\sqrt{n}\max_{p\in{\cal L}}|f(p)|.

  • (3)

    Suppose that ff is integer-valued. For xK()x\in K({\cal L}), if f¯(x)minpf(p)<1\overline{f}(x)-\min_{p\in{\cal L}}f(p)<1, then a minimizer of ff exists in the support of xx.

Proof.

(1) [sketch]. For two points x,yK()x,y\in K({\cal L}), there is a frame {\cal F} such that K()K({\cal F}) contains x,yx,y. Since K()K({\cal F}) is an isometric subspace of K()K({\cal L}) (Proposition 2.5 (4)), the geodesic [x,y][x,y] belongs to K()K({\cal F}). Hence, a function on K()K({\cal L}) is convex if and only if it is convex on K()K({\cal F}) for every frame {\cal F}. For any frame {\cal F}, the restriction of a submodular function f:f:{\cal L}\to{\mathbb{R}} to {\cal F} is a usual submodular function on Boolean lattice 2[n]{\cal F}\simeq 2^{[n]}. Hence f¯:K()\overline{f}:K({\cal F})\to{\mathbb{R}} is viewed as the usual Lovász extension by [0,1]nK()[0,1]^{n}\simeq K({\cal F}), and is convex.

(2). We first show that the restriction f¯|σ\overline{f}|_{\sigma} of f¯\overline{f} to any maximal simplex σ\sigma is LL-Lipschitz with L2nmaxp|f(p)|L\leq 2\sqrt{n}\max_{p\in{\cal L}}|f(p)|. Suppose that σ\sigma corresponds to a chain 𝟎=p0p1pn=𝟏{\bf 0}=p_{0}\prec p_{1}\prec\cdots\prec p_{n}={\bf 1}. Let x=kλkpkx=\sum_{k}\lambda_{k}p_{k} and y=kμkpky=\sum_{k}\mu_{k}p_{k} be points in σ\sigma. Define u,vnu,v\in{\mathbb{R}}^{n} by

uk:=λk+λk+1++λn,vk:=μk+μk+1++μn.u_{k}:=\lambda_{k}+\lambda_{k+1}+\cdots+\lambda_{n},\quad v_{k}:=\mu_{k}+\mu_{k+1}+\cdots+\mu_{n}.

By (2.4) and (2.5), we have dσ(x,y)=uv2.d_{\sigma}(x,y)=\|u-v\|_{2}. Let C:=maxp|f(p)|C:=\max_{p\in{\cal L}}|f(p)|. Then we have

|f¯(x)f¯(y)|=|k=0n(λkμk)f(pk)|Ck=0n|λkμk|\displaystyle|\overline{f}(x)-\overline{f}(y)|=\left|\sum_{k=0}^{n}(\lambda_{k}-\mu_{k})f(p_{k})\right|\leq C\sum_{k=0}^{n}|\lambda_{k}-\mu_{k}|
=Ck=0n|ukuk+1(vkvk+1)|2Ck=1n|ukvk|2nCuv2,\displaystyle=C\sum_{k=0}^{n}|u_{k}-u_{k+1}-(v_{k}-v_{k+1})|\leq 2C\sum_{k=1}^{n}|u_{k}-v_{k}|\leq 2\sqrt{n}C\|u-v\|_{2},

where we let u0=v0:=1u_{0}=v_{0}:=1 and un+1=vn+1:=0u_{n+1}=v_{n+1}:=0. Thus, f¯|σ\overline{f}|_{\sigma} is 2nC2\sqrt{n}C-Lipschitz.

Next we show that \overline{f} is 2\sqrt{n}C-Lipschitz. For any x,y\in K({\cal L}), choose the geodesic \gamma between x and y, and 0=t_{0}<t_{1}<\cdots<t_{m}=1 such that \gamma([t_{i-1},t_{i}]) belongs to a simplex \sigma_{i} for each i. Then we have

|f¯(x)f¯(y)|i=1m|f¯(γ(ti))f¯(γ(ti1))|2nCi=1mdσi(γ(ti),γ(ti1))=2nCd(x,y).|\overline{f}(x)-\overline{f}(y)|\leq\sum_{i=1}^{m}|\overline{f}(\gamma(t_{i}))-\overline{f}(\gamma(t_{i-1}))|\leq 2\sqrt{n}C\sum_{i=1}^{m}d_{\sigma_{i}}(\gamma(t_{i}),\gamma(t_{i-1}))=2\sqrt{n}Cd(x,y).

(3). Let f:=minpf(p)f^{*}:=\min_{p\in{\cal L}}f(p), and let x=iλipix=\sum_{i}\lambda_{i}p_{i}. Suppose to the contrary that all pip_{i}’s satisfy f(pi)>ff(p_{i})>f^{*}. Then f(pi)f+1f(p_{i})\geq f^{*}+1. Hence f¯(x)=iλif(pi)iλi(f+1)=f+1\overline{f}(x)=\sum_{i}\lambda_{i}f(p_{i})\geq\sum_{i}\lambda_{i}(f^{*}+1)=f^{*}+1. However this contradicts f¯(x)f<1\overline{f}(x)-f^{*}<1. ∎

3 Algorithm

3.1 Nc-rank is submodular minimization

Consider MVSP for a linear symbolic matrix A=i=1mAixiA=\sum_{i=1}^{m}A_{i}x_{i}. Let us formulate MVSP as an unconstrained submodular function minimization over a complemented modular lattice. Let {\cal L} and {\cal M} denote the lattices of all vector subspaces of 𝕂n\mathbb{K}^{n}, where the partial order of {\cal L} is the inclusion order and the partial order of {\cal M} is the reverse inclusion order. Let Ri=RAi:×R_{i}=R_{A_{i}}:{\cal L}\times{\cal M}\to{\mathbb{Z}} be defined by

Ri(X,Y):=rankAi|X×Y((X,Y)×),R_{i}(X,Y):=\mathop{\rm rank}A_{i}|_{X\times Y}\quad((X,Y)\in{\cal L}\times{\cal M}),

where Ai|X×Y:X×Y𝕂A_{i}|_{X\times Y}:X\times Y\to\mathbb{K} is the restriction of AiA_{i} to X×YX\times Y. Then the condition Ai(X,Y)={0}A_{i}(X,Y)=\{0\} in MVSP can be written as Ri(X,Y)=0R_{i}(X,Y)=0. By using RiR_{i} as a penalty term, consider the following unconstrained problem:

MVSPR:Min.\displaystyle{\rm MVSP}_{R}:\quad{\rm Min.} dimXdimY+(2n+1)i=1mRi(X,Y)\displaystyle-\dim X-\dim Y+(2n+1)\sum_{i=1}^{m}R_{i}(X,Y)
s.t.\displaystyle{\rm s.t.} (X,Y)×.\displaystyle(X,Y)\in{\cal L}\times{\cal M}.

Then it is easy to see:

Lemma 3.1.

Any optimal solution of {\rm MVSP}_{R} is optimal for MVSP.
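To make the penalized objective concrete, here is a small sketch evaluating it over {\mathbb{Q}}, with subspaces X,Y given by lists of basis row vectors and ranks computed exactly by fraction-based Gaussian elimination (the helper names are ours, not the paper's):

```python
from fractions import Fraction

def rank_exact(M):
    # rank over Q by Gaussian elimination with exact arithmetic
    M = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(M[0]) if M else 0):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def restricted_rank(A, U, V):
    # rank of the bilinear form A restricted to span(U) x span(V)
    if not U or not V:
        return 0
    B = [[sum(Fraction(u[i]) * Fraction(A[i][j]) * Fraction(v[j])
              for i in range(len(u)) for j in range(len(v)))
          for v in V] for u in U]
    return rank_exact(B)

def mvsp_r_objective(As, U, V, n):
    # -dim X - dim Y + (2n+1) * sum_i rank(A_i | X x Y)
    dimX = rank_exact(U) if U else 0
    dimY = rank_exact(V) if V else 0
    return -dimX - dimY + (2 * n + 1) * sum(restricted_rank(A, U, V) for A in As)
```

The weight 2n+1 exceeds any possible gain in -\dim X-\dim Y, which is what forces the penalty terms to vanish at an optimum.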

Proposition 3.2.

The objective function of MVSPR is submodular on ×{\cal L}\times{\cal M}.

Proof.

Submodularity of XdimXX\mapsto-\dim X and YdimYY\mapsto-\dim Y directly follows from dimX+dimX=dim(XX)+dim(X+X)\dim X+\dim X^{\prime}=\dim(X\cap X^{\prime})+\dim(X+X^{\prime}). Thus it suffices to verify that R=Ri:×R=R_{i}:{\cal L}\times{\cal M}\to{\mathbb{Z}} is submodular:

R(X,Y)+R(X,Y)R(XX,Y+Y)+R(X+X,YY).R(X,Y)+R(X^{\prime},Y^{\prime})\geq R(X\cap X^{\prime},Y+Y^{\prime})+R(X+X^{\prime},Y\cap Y^{\prime}).

Note that an equivalent statement appeared in [32, Lemma 4.2].

By Lemma 2.4, there is a base \{a_{1},a_{2},\ldots,a_{n}\} of {\cal L} with X,X^{\prime},X\cap X^{\prime},X+X^{\prime}\in\langle a_{1},a_{2},\ldots,a_{n}\rangle, and there is a base \{b_{1},b_{2},\ldots,b_{n}\} of {\cal M} with Y,Y^{\prime},Y\cap Y^{\prime},Y+Y^{\prime}\in\langle b_{1},b_{2},\ldots,b_{n}\rangle. Consider the matrix representation A=(A(a_{i},b_{j})) with respect to these bases. For I,J\subseteq[n], let A[I,J] be the submatrix of A with row set I and column set J. Submodularity of R follows from the rank inequality

rankA[I,J]+rankA[I,J]rankA[II,JJ]+rankA[II,JJ].\mathop{\rm rank}A[I,J]+\mathop{\rm rank}A[I^{\prime},J^{\prime}]\geq\mathop{\rm rank}A[I\cap I^{\prime},J\cup J^{\prime}]+\mathop{\rm rank}A[I\cup I^{\prime},J\cap J^{\prime}].

See [37, Proposition 2.1.9]. ∎

Thus, MVSPR has a convex relaxation on CAT(0) space K(×)=K()×K()K({\cal L}\times{\cal M})=K({\cal L})\times K({\cal M}) with objective function gg that is the Lovász extension

g(x,y):=dim¯(x)dim¯(y)+(2n+1)i=1mRi¯(x,y).g(x,y):=-\overline{\dim}(x)-\overline{\dim}(y)+(2n+1)\sum_{i=1}^{m}\overline{R_{i}}(x,y). (3.1)

3.2 Splitting proximal point algorithm for nc-rank

We apply SPPA to the following perturbed version of the convex relaxation:

Min.\displaystyle\quad{\rm Min.} dim¯(x)dim¯(y)+(2n+1)i=1mRi¯(x,y)+(1/8n)(d(𝟎,x)2+d(𝟎,y)2)\displaystyle-\overline{\dim}(x)-\overline{\dim}(y)+(2n+1)\sum_{i=1}^{m}\overline{R_{i}}(x,y)+(1/8n)(d(\mathbf{0},x)^{2}+d(\mathbf{0},y)^{2})
s.t.\displaystyle{\rm s.t.} (x,y)K()×K().\displaystyle(x,y)\in K({\cal L})\times K({\cal M}).

We regard the objective function g~\tilde{g} as i=1m+2fi\sum_{i=1}^{m+2}f_{i}, where fif_{i} is defined by

f_{i}(x,y):=\left\{\begin{array}[]{cl}-\overline{\dim}(x)+(1/8n)d(\mathbf{0},x)^{2}&{\rm if}\ i=m+1,\\ -\overline{\dim}(y)+(1/8n)d(\mathbf{0},y)^{2}&{\rm if}\ i=m+2,\\ (2n+1)\overline{R_{i}}(x,y)&{\rm if}\ 1\leq i\leq m.\end{array}\right.
Theorem 3.3.

Let (z)(z_{\ell}) be the sequence obtained by SPPA applied to g~=i=1m+2fi\tilde{g}=\sum_{i=1}^{m+2}f_{i} with ϵ:=1/2n\epsilon:=1/2n. For =Ω(n12m5lognm)\ell=\Omega(n^{12}m^{5}\log nm), the support of z=(x,y)z_{\ell}=(x_{\ell},y_{\ell}) contains a minimizer of MVSP.

Proof.

We first show that fif_{i} is LL-Lipschitz with L=O(n5/2).L=O(n^{5/2}). By Lemma 2.2, Proposition 2.5 (4-2), and Proposition 2.6 (2), the Lipschitz constants of dim¯\overline{\dim} and d(𝟎,)2d({\bf 0},\cdot)^{2} are O(n3/2)O(n^{3/2}) and O(n)O(\sqrt{n}), respectively. Therefore, if i=m+1i=m+1 or m+2m+2, then the Lipschitz constant of fif_{i} is O(n3/2)O(n^{3/2}). The Lipschitz constant of other fif_{i} is O(n5/2)O(n^{5/2}).

The objective function is strongly convex with parameter 1/2n1/2n. Let z~\tilde{z} denote the minimizer of g~\tilde{g}. By Theorem 2.3, we have

\tilde{g}(z_{k(m+2)})-\tilde{g}(\tilde{z})\leq(m+2)Ld(z_{k(m+2)},\tilde{z})=O\left(\sqrt{\frac{\log k}{k}}\,n^{6}m^{2}\right).

Thus, for k=\Omega(n^{12}m^{4}\log nm), it holds that \tilde{g}(z_{k(m+2)})-\tilde{g}(\tilde{z})<1/2.

Let zz^{*} be a minimizer of gg of (3.1). Then we have g(zk(m+2))g(z)=g(zk(m+2))g(z~)+g(z~)g(z)g~(zk(m+2))g~(z~)+(1/8n)d(𝟎,z~)2+g~(z~)g~(z)+(1/8n)d(𝟎,z)2g~(zk(m+2))g~(z~)+1/2<1g(z_{k(m+2)})-g(z^{*})=g(z_{k(m+2)})-g(\tilde{z})+g(\tilde{z})-g(z^{*})\leq\tilde{g}(z_{k(m+2)})-\tilde{g}(\tilde{z})+(1/8n)d(\mathbf{0},\tilde{z})^{2}+\tilde{g}(\tilde{z})-\tilde{g}(z^{*})+(1/8n)d(\mathbf{0},z^{*})^{2}\leq\tilde{g}(z_{k(m+2)})-\tilde{g}(\tilde{z})+1/2<1. By Proposition 2.6 (3), the support of zk(m+2)z_{k(m+2)} contains a minimizer of MVSP. ∎

Thus, after a polynomial number of iterations, a minimizer (X,Y)(X^{*},Y^{*}) of MVSP exists in the support of zz_{\ell}. Our remaining task is to show that the resolvent of each summand fif_{i} can be computed in polynomial time.
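The surrounding iteration is the standard cyclic proximal scheme. The following Euclidean toy sketch (not the CAT(0) version; the step-size schedule of Theorem 2.3 also differs) shows the control flow, with the metric resolvents of Sections 3.2.1–3.2.2 playing the role of the prox maps:

```python
def sppa(prox_list, z0, steps, lam0=1.0):
    # splitting proximal point algorithm (Euclidean analogue):
    # cyclically apply the resolvent of each summand with diminishing lambda
    z = z0
    for k in range(steps):
        lam = lam0 / (k + 1)
        for prox in prox_list:
            z = prox(z, lam)
    return z

# toy instance: f1(z) = (z - 1)^2, f2(z) = |z|; their sum is minimized at z = 1/2
def prox_sq(z, lam):        # resolvent of (z - 1)^2
    return (z + 2 * lam) / (1 + 2 * lam)

def prox_abs(z, lam):       # resolvent of |z| (soft thresholding)
    return 0.0 if abs(z) <= lam else z - lam * (1 if z > 0 else -1)
```

For the toy instance, `sppa([prox_sq, prox_abs], 0.0, 2000)` returns a point close to 1/2, even though neither summand alone is minimized there.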

3.2.1 Computation of the resolvent for fi=dim¯+(1/8n)d(𝟎,)2f_{i}=-\overline{\dim}+(1/8n)d(\mathbf{0},\cdot)^{2}

First we consider the resolvent of dim¯+(1/8n)d(𝟎,)2-\overline{\dim}+(1/8n)d(\mathbf{0},\cdot)^{2}. This is an optimization problem over the orthoscheme complex of a single lattice. It suffices to consider the following problem.

P1:Min.\displaystyle{\rm P1:\quad Min}. dim¯(x)+ϵd(𝟎,x)2+12λd(x,x0)2\displaystyle-\overline{\dim}(x)+\epsilon d(\mathbf{0},x)^{2}+\frac{1}{2\lambda}d(x,x^{0})^{2}
s.t.\displaystyle{\rm s.t.} xK(),\displaystyle x\in K({\cal L}),

where ϵ,λ>0\epsilon,\lambda>0, and x0K()x^{0}\in K({\cal L}).

Lemma 3.4.

Suppose that x0x^{0} belongs to a maximal simplex σ\sigma. Then the minimizer xx^{*} of P1 exists in σ\sigma.

Proof.

Let x0=i=0nλipix^{0}=\sum_{i=0}^{n}\lambda_{i}p_{i} for the maximal chain {pi}\{p_{i}\} of σ\sigma. Let x=iμiqix^{*}=\sum_{i}\mu_{i}q_{i} be the unique minimizer of P1. Consider a frame =a1,a2,,an{\cal F}=\langle a_{1},a_{2},\ldots,a_{n}\rangle containing chains {pi}\{p_{i}\} and {qi}\{q_{i}\}. Notice K()[0,1]nK({\cal F})\simeq[0,1]^{n}. Let (x10,x20,,xn0)(x^{0}_{1},x^{0}_{2},\ldots,x^{0}_{n}) and (x1,x2,,xn)(x^{*}_{1},x^{*}_{2},\ldots,x^{*}_{n}) be the {\cal F}-coordinates of x0x^{0} and xx^{*}, respectively. By (2.6), it holds dim¯(x)=ixi\overline{\dim}(x)=\sum_{i}x_{i}, since x=k=0nλkai1ai2aikkλk1{i1,i2,,ik}x=\sum_{k=0}^{n}\lambda_{k}a_{i_{1}}\vee a_{i_{2}}\vee\cdots\vee a_{i_{k}}\simeq\sum_{k}\lambda_{k}1_{\{i_{1},i_{2},\ldots,i_{k}\}}. Hence the objective function of P1 is written as

i=1nxi+ϵi=1nxi2+12λi=1n(xixi0)2.-\sum_{i=1}^{n}x_{i}+\epsilon\sum_{i=1}^{n}x_{i}^{2}+\frac{1}{2\lambda}\sum_{i=1}^{n}(x_{i}-x^{0}_{i})^{2}.

We can assume that pi=a1a2aip_{i}=a_{1}\vee a_{2}\vee\cdots\vee a_{i} by relabeling. Then x10x20xn0x^{0}_{1}\geq x^{0}_{2}\geq\cdots\geq x^{0}_{n}. Suppose that xi0>xi+10x^{0}_{i}>x^{0}_{i+1}. Then xixi+1x^{*}_{i}\geq x^{*}_{i+1} must hold. If xi<xi+1x^{*}_{i}<x^{*}_{i+1}, then interchanging the ii-coordinate and (i+1)(i+1)-coordinate of xx^{*} gives rise to another point in K()K({\cal F}) having a smaller objective value. This is a contradiction to the optimality of xx^{*}. Suppose that xi0=xi+10x^{0}_{i}=x^{0}_{i+1}. If xixi+1x^{*}_{i}\neq x^{*}_{i+1}, then replace both xix_{i}^{*} and xi+1x_{i+1}^{*} by (xi+xi+1)/2(x^{*}_{i}+x^{*}_{i+1})/2 to decrease the objective value, which is a contradiction. Thus x1x2xnx^{*}_{1}\geq x^{*}_{2}\geq\cdots\geq x^{*}_{n}. By (2.7), the original coordinate is written as x=(1x1)𝟎+i=1n(xixi+1)(a1a2ai)=i(xixi+1)pix^{*}=(1-x^{*}_{1}){\bf 0}+\sum_{i=1}^{n}(x^{*}_{i}-x^{*}_{i+1})(a_{1}\vee a_{2}\vee\cdots\vee a_{i})=\sum_{i}(x^{*}_{i}-x^{*}_{i+1})p_{i} (with x0=1x_{0}^{*}=1 and xn+1=0x^{*}_{n+1}=0). This means that xx^{*} belongs to σ\sigma. ∎

As in the proof, to solve P1, consider (implicitly) a frame {\cal F} containing the chain {pi}\{p_{i}\} for x0=iλipix^{0}=\sum_{i}\lambda_{i}p_{i}, and the following Euclidean convex optimization problem:

P1:Min.\displaystyle{\rm P1^{\prime}:\quad Min}. i=1nxi+ϵi=1nxi2+12λi=1n(xixi0)2\displaystyle-\sum_{i=1}^{n}x_{i}+\epsilon\sum_{i=1}^{n}x_{i}^{2}+\frac{1}{2\lambda}\sum_{i=1}^{n}(x_{i}-x^{0}_{i})^{2}
s.t.\displaystyle{\rm s.t.} 0xi1(1in),\displaystyle 0\leq x_{i}\leq 1\quad(1\leq i\leq n),

where x and x^{0} are represented in the {\cal F}-coordinate. Then the optimal solution x^{*} of P1^{\prime} is obtained coordinate-wise: x^{*}_{i} is the projection of the unconstrained stationary point (x_{i}^{0}+\lambda)/(1+2\epsilon\lambda) to [0,1], that is, 0, 1, or (x_{i}^{0}+\lambda)/(1+2\epsilon\lambda) itself. According to (2.7), the expression of x^{*} in K({\cal L}) is recovered.
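As a sanity check, the coordinate-wise rule can be written out; a small sketch (clamping to [0,1] is valid here because each one-dimensional objective is convex):

```python
def solve_P1_coordinate(x0_i, eps, lam):
    # minimizer of -x + eps*x^2 + (1/(2*lam))*(x - x0_i)^2 over [0, 1]:
    # the unconstrained stationary point (x0_i + lam)/(1 + 2*eps*lam),
    # clamped to the interval
    x = (x0_i + lam) / (1 + 2 * eps * lam)
    return min(1.0, max(0.0, x))
```

Setting the derivative -1 + 2\epsilon x + (x-x^{0}_{i})/\lambda to zero recovers the stationary point used above.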

Theorem 3.5.

The resolvent of f_{i}=-\overline{\dim}+(1/8n)d(\mathbf{0},\cdot)^{2} is computed in polynomial time.

3.2.2 Computation of the resolvent for fi=(2n+1)Ri¯f_{i}=(2n+1)\overline{R_{i}}

Next we consider the computation of the resolvent of (2n+1)Ri¯(2n+1)\overline{R_{i}}. It suffices to consider the following problem for R=RAiR=R_{A_{i}}:

P2:Min.\displaystyle{\rm P2:\quad Min.} R¯(x,y)+12λ(d(x,x0)2+d(y,y0)2)\displaystyle\overline{R}(x,y)+\frac{1}{2\lambda}(d(x,x^{0})^{2}+d(y,y^{0})^{2})
s.t.\displaystyle{\rm s.t.} (x,y)K()×K(),\displaystyle(x,y)\in K({\cal L})\times K({\cal M}),

where λ>0\lambda>0, x0K()x^{0}\in K({\cal L}), and y0K()y^{0}\in K({\cal M}). As in the case of P1, we reduce P2 to a convex optimization over [0,1]2n[0,1]^{2n} by choosing a special frame e1,e2,,en,f1,f2,,fn\langle e_{1},e_{2},\ldots,e_{n},f_{1},f_{2},\ldots,f_{n}\rangle of ×{\cal L}\times{\cal M}.

For XX\in{\cal L}, let XX^{\bot} denote the subspace in {\cal M} defined by

X:={y𝕂nAi(x,y)=0(xX)}.X^{\bot}:=\{y\in{\mathbb{K}}^{n}\mid A_{i}(x,y)=0\ (x\in X)\}.

Namely XX^{\bot} is the orthogonal subspace of XX with respect to the bilinear form AiA_{i}. For YY\in{\cal M}, let YY^{\bot}\in{\cal L} be defined analogously. Let U0U_{0}\in{\cal L} and V0V_{0}\in{\cal M} denote the left and right kernels of AiA_{i}, respectively:

\displaystyle U_{0}:=\{x\in{\mathbb{K}}^{n}\mid A_{i}(x,y)=0\ (y\in{\mathbb{K}}^{n})\},
\displaystyle V_{0}:=\{y\in{\mathbb{K}}^{n}\mid A_{i}(x,y)=0\ (x\in{\mathbb{K}}^{n})\}.

Let k:=rankAik:=\mathop{\rm rank}A_{i}. An orthogonal frame =e1,e2,,en,f1,f2,,fn{\cal F}=\langle e_{1},e_{2},\ldots,e_{n},f_{1},f_{2},\ldots,f_{n}\rangle is a frame of ×{\cal L}\times{\cal M} satisfying the following conditions:

  • e1,e2,,en\langle e_{1},e_{2},\ldots,e_{n}\rangle is a frame of {\cal L}.

  • f1,f2,,fn\langle f_{1},f_{2},\ldots,f_{n}\rangle is a frame of {\cal M}.

  • ek+1ek+2en=U0e_{k+1}\vee e_{k+2}\vee\cdots\vee e_{n}=U_{0}.

  • f1f2fk=V0f_{1}\vee f_{2}\vee\cdots\vee f_{k}=V_{0} (\Leftrightarrow f1f2fk=V0f_{1}\cap f_{2}\cap\cdots\cap f_{k}=V_{0} ).

  • fi=eif_{i}={e_{i}}^{\bot} for i=1,2,,ki=1,2,\ldots,k.

Figure 2 is an intuitive illustration of an orthogonal frame.

Figure 2: An orthogonal frame
Proposition 3.6.

Let =e1,e2,,en,f1,f2,,fn{\cal F}=\langle e_{1},e_{2},\ldots,e_{n},f_{1},f_{2},\ldots,f_{n}\rangle be an orthogonal frame. The restriction of the Lovász extension R¯\overline{R} to K()[0,1]n×[0,1]nK({\cal F})\simeq[0,1]^{n}\times[0,1]^{n} can be written as

R¯(x,y)=i=1kmax{0,xiyi},\overline{R}(x,y)=\sum_{i=1}^{k}\max\{0,x_{i}-y_{i}\}, (3.2)

where (x1,x2,,xn)(x_{1},x_{2},\ldots,x_{n}) is the e1,e2,,en\langle e_{1},e_{2},\ldots,e_{n}\rangle-coordinate of xx and (y1,y2,,yn)(y_{1},y_{2},\ldots,y_{n}) is the f1,f2,,fn\langle f_{1},f_{2},\ldots,f_{n}\rangle-coordinate of yy.

Proposition 3.7.

Let 𝒳{\cal X} and 𝒴{\cal Y} be maximal chains of {\cal L} and {\cal M}, respectively. Then there exists an orthogonal frame =e1,e2,,en,f1,f2,,fn{\cal F}=\langle e_{1},e_{2},\ldots,e_{n},f_{1},f_{2},\ldots,f_{n}\rangle satisfying

𝒳𝒴e1,e2,,en,𝒳𝒴f1,f2,,fn.{\cal X}\cup{\cal Y}^{\bot}\subseteq\langle e_{1},e_{2},\ldots,e_{n}\rangle,\ {\cal X}^{\bot}\cup{\cal Y}\subseteq\langle f_{1},f_{2},\ldots,f_{n}\rangle. (3.3)

Such a frame can be found in polynomial time.

Proposition 3.8.

Let 𝒳{\cal X} and 𝒴{\cal Y} be maximal chains corresponding to maximal simplices containing x0x^{0} and y0y^{0}, respectively. For an orthogonal frame {\cal F} satisfying (3.3)(\ref{eqn:XuYbot}), the minimizer (x,y)(x^{*},y^{*}) of P2 exists in K()K({\cal F}).

The above three propositions are proved in Section 3.2.3. Assuming them, we proceed to the computation of the resolvent. For an orthogonal frame satisfying (3.3), the problem P2 is equivalent to

{\rm P2^{\prime}:\quad Min.}\quad\displaystyle\sum_{i=1}^{k}\max\{0,x_{i}-y_{i}\}+\frac{1}{2\lambda}\left\{\sum_{i=1}^{n}(x_{i}-x^{0}_{i})^{2}+\sum_{i=1}^{n}(y_{i}-y^{0}_{i})^{2}\right\}
{\rm s.t.}\quad\displaystyle 0\leq x_{i}\leq 1,\ 0\leq y_{i}\leq 1\quad(1\leq i\leq n).

Again, this problem is solved coordinate-wise. Clearly x^{*}_{i}=x^{0}_{i} and y^{*}_{i}=y^{0}_{i} for i>k. For i\leq k, (x^{*}_{i},y^{*}_{i}) is the minimizer of a 2-dimensional convex problem, which is solved in constant time.
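The 2-dimensional subproblem has a short closed form; a sketch (for x^{0}_{i},y^{0}_{i}\in[0,1], the unconstrained minimizer in each case already lies in the box, so no extra projection is needed):

```python
def prox_pair(x0, y0, lam):
    # minimizer of max(0, x - y) + (1/(2*lam))*((x - x0)^2 + (y - y0)^2)
    # over [0, 1]^2, assuming x0, y0 in [0, 1]
    if x0 <= y0:
        return x0, y0              # hinge term inactive
    if x0 - y0 >= 2 * lam:         # interior of the region x > y
        return x0 - lam, y0 + lam
    m = (x0 + y0) / 2              # optimum sits on the kink x = y
    return m, m
```

The three cases correspond to the hinge being inactive, active with slope (1,-1), and active exactly at its kink.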

Theorem 3.9.

The resolvent of fi=(2n+1)Ri¯f_{i}=(2n+1)\overline{R_{i}} is computed in polynomial time.

Remark 3.10 (Bit complexity).

In the above SPPA, the required bit-length for coefficients of zK(×)z\in K({\cal L}\times{\cal M}) is bounded polynomially in n,mn,m. Indeed, the transformation between the original coordinate and an {\cal F}-coordinate corresponds to multiplying a triangular matrix consisting of 0,±10,\pm 1 entries; see (2.7). In each iteration kk, the optimal solution of quadratic problem P1 or P2 is obtained by adding (fixed) rational functions in n,m,kn,m,k to (current points) xi0,yi0x_{i}^{0},y_{i}^{0} and multiplying a (fixed) 2×22\times 2 rational matrix in n,m,kn,m,k. Consequently, the bit increase is polynomially bounded.

On the other hand, in the case of 𝕂={\mathbb{K}}={\mathbb{Q}}, we could not exclude the possibility of an exponential increase of the bit-length for the basis of a vector subspace appearing in the algorithm.

3.2.3 Proofs of Propositions 3.6, 3.7, and 3.8

We start with basic properties of ()(\cdot)^{\bot}, which follow from elementary linear algebra.

Lemma 3.11.
  • (1)

    If XXX\subseteq X^{\prime}, then XXX^{\bot}\supseteq{X^{\prime}}^{\bot} and dimXdimXdimXdimX\dim{X}^{\bot}-\dim{X^{\prime}}^{\bot}\leq\dim X^{\prime}-\dim X.

  • (2)

    (X+X)=XX(X+X^{\prime})^{\bot}=X^{\bot}\cap{X^{\prime}}^{\bot}.

  • (3)

    XXX^{\bot\bot}\supseteq X.

  • (4)

    XXX\mapsto X^{\bot} induces an isomorphism between [U0,𝕂n][U_{0},{\mathbb{K}}^{n}] and [𝕂n,V0][{\mathbb{K}}^{n},V_{0}] with inverse YYY\mapsto Y^{\bot}. In particular, X=XX^{\bot\bot\bot}=X^{\bot}.

We next give an alternative expression of R in terms of (\cdot)^{\bot}.

Lemma 3.12.

R(X,Y)=\dim Y-\dim(Y\cap X^{\bot})=\dim X-\dim(X\cap Y^{\bot}).

Proof.

Consider bases \{a_{1},a_{2},\ldots,a_{\ell}\} of X and \{b_{1},b_{2},\ldots,b_{\ell^{\prime}}\} of Y. We can assume that \{b_{k^{\prime}+1},b_{k^{\prime}+2},\ldots,b_{\ell^{\prime}}\} is a base of Y\cap X^{\bot}. Consider the matrix representation (A_{i}(a_{i^{\prime}},b_{j^{\prime}})) of A_{i}|_{X\times Y} with respect to these bases. Its submatrix formed by the k^{\prime}+1,k^{\prime}+2,\ldots,\ell^{\prime}-th columns is a zero matrix. On the other hand, the submatrix formed by the 1,2,\ldots,k^{\prime}-th columns must have full column rank k^{\prime}: a nontrivial vanishing combination of these columns would yield a vector of Y\cap X^{\bot} outside the span of b_{k^{\prime}+1},\ldots,b_{\ell^{\prime}}. Thus, the rank R(X,Y) of A_{i}|_{X\times Y} is k^{\prime}=\ell^{\prime}-(\ell^{\prime}-k^{\prime})=\dim Y-\dim(Y\cap X^{\bot}). The second expression is obtained similarly. ∎

Proof of Proposition 3.6.

An orthogonal frame e1,e2,,en,f1,f2,,fn\langle e_{1},e_{2},\ldots,e_{n},f_{1},f_{2},\ldots,f_{n}\rangle is naturally identified with Boolean lattice 2[2n]2[n]×2[n]2^{[2n]}\simeq 2^{[n]}\times 2^{[n]}. Notice that ei=fi{e_{i}}^{\bot}=f_{i} if iki\leq k and ei=𝕂n{e_{i}}^{\bot}={\mathbb{K}}^{n} if i>ki>k. The latter fact follows from eiU0eiU0=𝕂ne_{i}\subseteq U_{0}\Rightarrow{e_{i}}^{\bot}\supseteq U_{0}^{\bot}={\mathbb{K}}^{n}. By Lemma 3.11 (2), we have X=X{1,2,,k}X^{\bot}=X\cap\{1,2,\ldots,k\} for X2[n]X\in 2^{[n]}. By Lemma 3.12 and dimY=n|Y|\dim Y=n-|Y| for Y2[n]f1,f2,,fnY\in 2^{[n]}\simeq\langle f_{1},f_{2},\ldots,f_{n}\rangle (with inclusion order reversed), we have

R(X,Y)=|Y(X[k])||Y|=|(XY)[k]|.R(X,Y)=|Y\cup(X\cap[k])|-|Y|=|(X\setminus Y)\cap[k]|.

Identify 2[n]×2[n]2^{[n]}\times 2^{[n]} with {0,1}n×{0,1}n\{0,1\}^{n}\times\{0,1\}^{n} by (X,Y)(1X,1Y)(X,Y)\mapsto(1_{X},1_{Y}). Then RR is also written as

R(x,y)=i=1kmax{0,xiyi}((x,y){0,1}n×{0,1}n).R(x,y)=\sum_{i=1}^{k}\max\{0,x_{i}-y_{i}\}\quad((x,y)\in\{0,1\}^{n}\times\{0,1\}^{n}).

Observe that the Lovász extension of (xi,yi)max{0,xiyi}(x_{i},y_{i})\mapsto\max\{0,x_{i}-y_{i}\} is obtained simply by extending the domain to [0,1]2[0,1]^{2}. Hence, we obtain the desired expression.

Proof of Proposition 3.7.

By Lemma 2.4, we can find (in polynomial time) a frame \langle e_{1},e_{2},\ldots,e_{n}\rangle containing the two chains {\cal X} and {\cal Y}^{\bot}. Suppose that {\cal X}=\{X_{i}\}_{i=0}^{n} and {\cal Y}=\{Y_{i}\}_{i=0}^{n}. We can assume that e_{k+1}\vee e_{k+2}\vee\cdots\vee e_{n}={Y_{0}}^{\bot}=U_{0}. Let f_{i}:={e_{i}}^{\bot} for i=1,2,\ldots,k. Then f_{1}\vee f_{2}\vee\cdots\vee f_{k}=V_{0} holds, since, by Lemma 3.11 (2), we have V_{0}=(e_{1}\vee e_{2}\vee\cdots\vee e_{n})^{\bot}={e_{1}}^{\bot}\vee{e_{2}}^{\bot}\vee\cdots\vee{e_{n}}^{\bot}=f_{1}\vee f_{2}\vee\cdots\vee f_{k}\vee{\mathbb{K}}^{n}\vee\cdots\vee{\mathbb{K}}^{n}=f_{1}\vee f_{2}\vee\cdots\vee f_{k}.

Consider the chain 𝒴{\cal Y}^{\bot\bot} in {\cal M}. Then 𝒴f1,f2,,fk{\cal Y}^{\bot\bot}\subseteq\langle f_{1},f_{2},\ldots,f_{k}\rangle since each YiY_{i}^{\bot} is the join of a subset of e1,e2,,ene_{1},e_{2},\ldots,e_{n}. Taking ()(\cdot)^{\bot} as above, YiY_{i}^{\bot\bot} is represented as the join of a subset of f1,f2,,fkf_{1},f_{2},\ldots,f_{k}. Consider a consecutive pair Yi1,YiY_{i-1},Y_{i} in 𝒴{\cal Y}. Consider Yi1{Y_{i-1}}^{\bot\bot} and Yi{Y_{i}}^{\bot\bot}. Then, by Lemma 3.11 (3), Yi1Yi1{Y_{i-1}}^{\bot\bot}\preceq Y_{i-1} and YiYi{Y_{i}}^{\bot\bot}\preceq Y_{i}. Suppose that Yi1Yi{Y_{i-1}}^{\bot\bot}\neq{Y_{i}}^{\bot\bot}. Then Yi1:Yi{Y_{i-1}}^{\bot\bot}\prec:{Y_{i}}^{\bot\bot} (by (2.2) and Lemma 3.11 (1)). Thus, for some fjf_{j} (1jk)(1\leq j\leq k), it holds Yi=fjYi1{Y_{i}}^{\bot\bot}=f_{j}\vee{Y_{i-1}}^{\bot\bot}. Here fjYi1f_{j}\not\preceq Y_{i-1} must hold. Otherwise Yi1fj=ej=fj{Y_{i-1}}^{\bot\bot}\succeq{f_{j}}^{\bot\bot}={e_{j}}^{\bot\bot\bot}=f_{j}, which contradicts Yi1:Yi=fjYi1{Y_{i-1}}^{\bot\bot}\prec:{Y_{i}}^{\bot\bot}=f_{j}\vee{Y_{i-1}}^{\bot\bot}. Also, fjYiYif_{j}\preceq Y_{i}^{\bot\bot}\preceq Y_{i}. Thus Yi=Yi1fjY_{i}=Y_{i-1}\vee f_{j}. Therefore, for each ii with Yi1=Yi{Y_{i-1}}^{\bot\bot}={Y_{i}}^{\bot\bot}, we can choose an atom ff with Yi=fYi1Y_{i}=f\vee Y_{i-1} to add to f1,f2,,fkf_{1},f_{2},\ldots,f_{k}, and obtain a required frame f1,f2,fn\langle f_{1},f_{2},\ldots f_{n}\rangle (containing 𝒳{\cal X}^{\bot} and 𝒴{\cal Y}).

Proof of Proposition 3.8.

Consider retractions φ:=φen,en1,,ek+1,e1,e2,,ek:e1,e2,,en\varphi:=\varphi_{e_{n},e_{n-1},\ldots,e_{k+1},e_{1},e_{2},\ldots,e_{k}}:{\cal L}\to\langle e_{1},e_{2},\ldots,e_{n}\rangle and ϕ:=φf1,f2,,fn:f1,f2,,fn\phi:=\varphi_{f_{1},f_{2},\ldots,f_{n}}:{\cal M}\to\langle f_{1},f_{2},\ldots,f_{n}\rangle; see Lemma 2.4 (2) for definition. Define a retraction (φ¯,ϕ¯):K()×K()K(e1,,en,f1,,fn)(\bar{\varphi},\bar{\phi}):K({\cal L})\times K({\cal M})\to K(\langle e_{1},\ldots,e_{n},f_{1},\ldots,f_{n}\rangle) by

(φ¯,ϕ¯)(x,y):=(φ¯(x),ϕ¯(y))((x,y)K()×K())(\bar{\varphi},\bar{\phi})(x,y):=(\bar{\varphi}(x),\bar{\phi}(y))\quad((x,y)\in K({\cal L})\times K({\cal M}))

Our goal is to show that (φ¯,ϕ¯)(\bar{\varphi},\bar{\phi}) does not increase the objective value of P2.

First we show

(ϕ(Y))=φ(Y)(Y).(\phi(Y))^{\bot}=\varphi(Y^{\bot})\quad(Y\in{\cal M}). (3.4)

Indeed, letting Fi:=f1f2fiF_{i}:=f_{1}\vee f_{2}\vee\cdots\vee f_{i} and Ei:=e1e2eiE_{i}:=e_{1}\vee e_{2}\vee\cdots\vee e_{i}, we have

(ϕ(Y))\displaystyle(\phi(Y))^{\bot} =\displaystyle= ({fii[n]:YFi:YFi1})\displaystyle\left(\bigvee\{f_{i}\mid i\in[n]:Y\wedge F_{i}:\succ Y\wedge F_{i-1}\}\right)^{\bot}
=\displaystyle= (V0{fii[n]:YFi:YFi1})\displaystyle\left(V_{0}\wedge\bigvee\{f_{i}\mid i\in[n]:Y\wedge F_{i}:\succ Y\wedge F_{i-1}\}\right)^{\bot}
=\displaystyle= ({fii[k]:YFi:YFi1})\displaystyle\left(\bigvee\{f_{i}\mid i\in[k]:Y\wedge F_{i}:\succ Y\wedge F_{i-1}\}\right)^{\bot}
=\displaystyle= ({fii[k]:(YV0)Fi:(YV0)Fi1})\displaystyle\left(\bigvee\{f_{i}\mid i\in[k]:(Y\wedge V_{0})\wedge F_{i}:\succ(Y\wedge V_{0})\wedge F_{i-1}\}\right)^{\bot}
=\displaystyle= {U0eii[k]:Y(U0Ei):Y(U0Ei1)}=φ(Y).\displaystyle\bigvee\{U_{0}\vee e_{i}\mid i\in[k]:Y^{\bot}\wedge(U_{0}\vee E_{i}):\succ Y^{\bot}\wedge(U_{0}\vee E_{i-1})\}=\varphi(Y^{\bot}).

The second equality follows from (V_{0}+Z)^{\bot}=V_{0}^{\bot}\cap Z^{\bot}={\mathbb{K}}^{n}\cap Z^{\bot}=Z^{\bot}. The third follows from modularity: let A:=\bigvee\{f_{i}\mid i\in[k]:Y\wedge F_{i}:\succ Y\wedge F_{i-1}\} and B:=\bigvee\{f_{i}\mid i\in[n]\setminus[k]:Y\wedge F_{i}:\succ Y\wedge F_{i-1}\}. Then V_{0}\wedge B={\mathbb{K}}^{n} and \phi(Y)=A\vee B. Thus we have A=(V_{0}\wedge B)\vee A=V_{0}\wedge\phi(Y). The fourth follows from f_{i}\preceq V_{0} for i\in[k]. The fifth follows from Lemma 3.11 (4). Note that, by Y^{\bot}\succeq U_{0}, each atom e_{i} with i\geq k+1 is taken in the join of the definition (2.3) of \varphi=\varphi_{e_{n},e_{n-1},\ldots,e_{k+1},e_{1},e_{2},\ldots,e_{k}}.

Next we show

R(φ(X),ϕ(Y))R(X,Y)(X,Y).R(\varphi(X),\phi(Y))\leq R(X,Y)\quad(X\in{\cal L},Y\in{\cal M}). (3.5)

Indeed, for r=\dim, we have R(\varphi(X),\phi(Y))=r(\varphi(X))-r(\varphi(X)\wedge\phi(Y)^{\bot})=r(X)-r(\varphi(X)\wedge\varphi(Y^{\bot}))\leq r(X)-r(\varphi(X\wedge Y^{\bot}))=r(X)-r(X\wedge Y^{\bot})=R(X,Y). In the second equality, we use (3.4) and the rank-preserving property of \varphi. The inequality follows from the order-preserving property \varphi(X)\wedge\varphi(Y^{\bot})\succeq\varphi(X\wedge Y^{\bot}).

By (3.5), we have \overline{R}(\bar{\varphi}(x),\bar{\phi}(y))\leq\overline{R}(x,y); recall the isometry between K({\cal L}\times{\cal M}) and K({\cal L})\times K({\cal M}) (Section 2.2.2). Since \bar{\varphi} and \bar{\phi} are nonexpansive retractions (Proposition 2.5 (4)), we have d(x^{0},x)\geq d(\bar{\varphi}(x^{0}),\bar{\varphi}(x))=d(x^{0},\bar{\varphi}(x)) and d(y^{0},y)\geq d(\bar{\phi}(y^{0}),\bar{\phi}(y))=d(y^{0},\bar{\phi}(y)). Thus, (\bar{\varphi},\bar{\phi}) has the desired property, which proves the statement. ∎

4 A pp-adic approach to nc-rank over {\mathbb{Q}}

In this section, we consider nc-rank computation of A=\sum_{i=1}^{m}A_{i}x_{i}, where each A_{i} is a matrix over {\mathbb{Q}}. Specifically, we assume that each A_{i} is an integer matrix. As remarked in Remark 3.10, the algorithm in the previous section has no polynomial guarantee on the bit-length of the bases of the vector subspaces that appear. Instead of controlling bit sizes, we reduce nc-rank computation over {\mathbb{Q}} to that over GF(p) (for a small prime p).

For simplicity, we deal with nc-singularity testing of A. Here A is called nc-singular if \mathop{\rm nc\mbox{-}rank}A<n, and nc-regular if \mathop{\rm nc\mbox{-}rank}A=n. We utilize a relationship between the nc-rank and the ordinary rank (over an arbitrary field {\mathbb{K}}). For a positive integer d, the d-blow up A^{\{d\}} of A is a linear symbolic matrix defined by

A{d}:=i=1mAiXi,A^{\{d\}}:=\sum_{i=1}^{m}A_{i}\otimes X_{i},

where \otimes denotes the Kronecker product and Xi=(xi,jk)X_{i}=(x_{i,jk}) is a d×dd\times d matrix with variable entries xi,jkx_{i,jk} (i[m],j,k[d])(i\in[m],j,k\in[d]).

Lemma 4.1 ([28, 33]).

A matrix AA of form (1.1) is nc-regular if and only if there is a positive integer dd such that A{d}A^{\{d\}} is regular.

There is an upper bound on such d. Derksen and Makam [13] proved a polynomial (in fact, linear) bound d\leq n-1 by utilizing the regularity lemma d\mid\mathop{\rm rank}A^{\{d\}} in [29]. Such bounds play an essential role in the validity of the algorithms of [17, 29, 30]. Interestingly, our reduction presented below does not use any bound on d.
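To make Lemma 4.1 concrete, here is a small self-contained Python sketch (ours, not from the paper; the helper names `kron` and `rank_Q` and the sampling range are arbitrary choices). It takes the classic 3×3 skew-symmetric pencil, whose commutative rank is 2, and checks by a random substitution that its 2-blow-up has full rank 6, which by Lemma 4.1 witnesses nc-regularity; as with any randomized rank test, the full-rank conclusion holds only with high probability.

```python
import random
from fractions import Fraction

def kron(A, B):
    """Kronecker product of two matrices given as lists of lists."""
    n, m, p, q = len(A), len(A[0]), len(B), len(B[0])
    return [[A[i][j] * B[k][l] for j in range(m) for l in range(q)]
            for i in range(n) for k in range(p)]

def rank_Q(M):
    """Exact rank over Q via Gaussian elimination with Fractions."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            for j in range(c, cols):
                M[i][j] -= f * M[r][j]
        r += 1
    return r

def add(M, N):
    return [[a + b for a, b in zip(rm, rn)] for rm, rn in zip(M, N)]

# A = A1 x1 + A2 x2 + A3 x3: the 3x3 skew-symmetric pencil.
def skew(i, j, n=3):
    M = [[Fraction(0)] * n for _ in range(n)]
    M[i][j], M[j][i] = Fraction(1), Fraction(-1)
    return M

As = [skew(0, 1), skew(0, 2), skew(1, 2)]

random.seed(0)
rnd = lambda: Fraction(random.randint(-10**6, 10**6))

# Commutative substitution (d = 1): any nonzero 3x3 skew matrix has rank 2.
A_scalar = [[Fraction(0)] * 3 for _ in range(3)]
for Ai in As:
    c = rnd()
    A_scalar = add(A_scalar, [[c * e for e in row] for row in Ai])
print(rank_Q(A_scalar))   # 2: A is (commutatively) singular

# 2-blow-up A^{2}: substitute random 2x2 matrices X_i for the variables.
d = 2
blown = [[Fraction(0)] * (3 * d) for _ in range(3 * d)]
for Ai in As:
    Xi = [[rnd() for _ in range(d)] for _ in range(d)]
    blown = add(blown, kron(Ai, Xi))
print(rank_Q(blown))      # 6 with high probability: A is nc-regular
```

Exact rational arithmetic is used so that the rank computation cannot suffer the false negatives a floating-point elimination could produce.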

Fix an arbitrary prime number p. Let v_{p}:{\mathbb{Q}}\to{\mathbb{Z}}\cup\{\infty\} denote the p-adic valuation:

vp(u):=kifu=pka/b,v_{p}(u):=k\ {\rm if}\ u=p^{k}a/b,

where a,ba,b are nonzero integers prime to pp, and we let vp(0):=v_{p}(0):=\infty. Every rational uu\in{\mathbb{Q}} is uniquely represented as the pp-adic expansion

u=i=kaipi,u=\sum_{i=k}^{\infty}a_{i}p^{i}, (4.1)

where k=v_{p}(u) and a_{i}\in\{0,1,2,\ldots,p-1\}. The leading (nonzero) coefficient a_{k} is obtained as the solution x of bx\equiv a\pmod p. Then u-a_{k}p^{k} is divisible by p^{k+1}. Repeating the same procedure for u-a_{k}p^{k}, we obtain the subsequent coefficients in (4.1).
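The expansion procedure just described can be sketched as follows (a minimal Python illustration, not part of the paper; it assumes Python 3.8+ for the modular inverse `pow(b, -1, p)`, and the function names are ours):

```python
from fractions import Fraction

def vp(u, p):
    """p-adic valuation of a rational u; vp(0) = infinity."""
    if u == 0:
        return float('inf')
    u = Fraction(u)
    k, a, b = 0, u.numerator, u.denominator
    while a % p == 0:
        a, k = a // p, k + 1
    while b % p == 0:
        b, k = b // p, k - 1
    return k

def p_adic_expansion(u, p, terms):
    """First `terms` coefficients a_k, a_{k+1}, ... of the expansion
    u = sum_{i >= k} a_i p^i, together with the offset k = vp(u).
    Requires u != 0."""
    u = Fraction(u)
    k = vp(u, p)
    coeffs, x = [], u
    for i in range(k, k + terms):
        y = x / Fraction(p) ** i          # vp(y) >= 0 by induction
        # leading coefficient: the solution c of b*c = a (mod p)
        c = (y.numerator * pow(y.denominator, -1, p)) % p
        coeffs.append(c)
        x -= c * Fraction(p) ** i         # now vp(x) >= i + 1
    return k, coeffs

assert vp(Fraction(12), 2) == 2 and vp(Fraction(3, 8), 2) == -3
# 1/3 = 1 + 2 + 8 + 32 + ... in the 2-adics:
assert p_adic_expansion(Fraction(1, 3), 2, 4) == (0, [1, 1, 0, 1])
```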

The 2-adic expansion of a nonnegative integer z is the same as its binary expression, where v_{2}(z) is equal to the number of consecutive zeros starting from the lowest bit. The analogous interpretation holds for an arbitrary prime p. In particular, the p-adic valuation of a nonzero integer is bounded by its digit length in base p:

vp(z)logp|z|(z{0}).v_{p}(z)\leq\log_{p}|z|\quad(z\in{\mathbb{Z}}\setminus\{0\}). (4.2)

The pp-adic valuation vpv_{p} on {\mathbb{Q}} is extended to (x1,x2,,xm){\mathbb{Q}}(x_{1},x_{2},\ldots,x_{m}) as follows. For a polynomial f[x1,x2,,xm]f\in{\mathbb{Q}}[x_{1},x_{2},\ldots,x_{m}], define vp(f)v_{p}(f) by

v_{p}(f):=\min\{v_{p}(a)\mid\mbox{$a$ is the coefficient of a term of $f$}\}. (4.3)

Accordingly, the valuation of a rational function f/gf/g is defined as vp(f)vp(g)v_{p}(f)-v_{p}(g). This is called the Gauss extension of vpv_{p}.
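As a sanity check of the Gauss extension, the following Python sketch (ours; polynomials are represented as dictionaries from exponent tuples to rational coefficients) verifies multiplicativity, v_p(fg)=v_p(f)+v_p(g), on a small instance — this multiplicativity (Gauss's lemma) is exactly what makes the extension a valuation:

```python
from fractions import Fraction
from itertools import product

def vp(u, p):
    """p-adic valuation of a rational u; vp(0) = infinity."""
    if u == 0:
        return float('inf')
    u = Fraction(u)
    k, a, b = 0, u.numerator, u.denominator
    while a % p == 0:
        a, k = a // p, k + 1
    while b % p == 0:
        b, k = b // p, k - 1
    return k

def vp_poly(f, p):
    """Gauss extension (4.3): minimum valuation of a coefficient."""
    return min(vp(c, p) for c in f.values())

def poly_mul(f, g):
    """Multiply two polynomials in dict-of-exponent-tuples form."""
    h = {}
    for (e1, c1), (e2, c2) in product(f.items(), g.items()):
        e = tuple(a + b for a, b in zip(e1, e2))
        h[e] = h.get(e, Fraction(0)) + c1 * c2
    return {e: c for e, c in h.items() if c != 0}

f = {(1, 0): Fraction(6), (0, 1): Fraction(1, 2)}   # 6x + y/2,  v_2 = -1
g = {(1, 0): Fraction(4), (0, 0): Fraction(3)}      # 4x + 3,    v_2 = 0
assert vp_poly(f, 2) == -1 and vp_poly(g, 2) == 0
# Gauss's lemma: v_2(fg) = v_2(f) + v_2(g).
assert vp_poly(poly_mul(f, g), 2) == vp_poly(f, 2) + vp_poly(g, 2)
```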

Our algorithm for testing nc-singularity is based on the following problem (maximum vanishing submodule problem; MVMP):

MVMP:Max.\displaystyle\mbox{MVMP:}\quad{\rm Max}. vpdetPvpdetQ\displaystyle-v_{p}\det P-v_{p}\det Q
s.t.\displaystyle{\rm s.t.} vp(PAQ)ij0(i,j[n]),\displaystyle v_{p}(PAQ)_{ij}\geq 0\quad(i,j\in[n]),
P,QGLn().\displaystyle P,Q\in GL_{n}({\mathbb{Q}}).

This problem can be defined for an arbitrary field with a discrete valuation, and the following arguments apply to such a field, whereas [25] introduced MVMP for the rational function field with one variable.

MVMP is also a discrete convex optimization problem on a CAT(0) space. Indeed, its domain can be viewed as the vertex set (the set of lattices, i.e., certain submodules of {\mathbb{Q}}^{n}) of the Euclidean building for GL_{n}({\mathbb{Q}}), and the objective function is an L-convex function; see [25, 26]. A Euclidean building is a representative space admitting a CAT(0) metric.

The optimal value of MVMP is denoted by vpDetA{}v_{p}\mathop{\rm Det}^{\prime}A\in{\mathbb{Z}}\cup\{\infty\}, where we let vpDetA:=v_{p}\mathop{\rm Det}^{\prime}A:=\infty if MVMP is unbounded. The motivation behind this notation vpDetAv_{p}\mathop{\rm Det}^{\prime}A is explained in Remark 4.6.

For a feasible solution (P,Q) of MVMP, consider the p-adic expansion PA_{i}Q=\sum_{k=0}^{\infty}(PA_{i}Q)^{(k)}p^{k} for each i. The leading matrix (PA_{i}Q)^{(0)} has entries in \{0,1,\ldots,p-1\} and is regarded as a matrix over GF(p). Then we can consider the linear symbolic matrix

(PAQ)(0):=i=1m(PAiQ)(0)xi(PAQ)^{(0)}:=\sum_{i=1}^{m}(PA_{i}Q)^{(0)}x_{i}

over GF(p)GF(p).
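Computing these leading matrices is elementary; the sketch below (our own helper names, exact rational arithmetic, Python 3.8+ for the modular inverse) reduces each entry of PA_{i}Q modulo p via a modular inverse of its denominator, after asserting feasibility of (P,Q):

```python
from fractions import Fraction

def vp(u, p):
    """p-adic valuation of a rational u; vp(0) = infinity."""
    if u == 0:
        return float('inf')
    u = Fraction(u)
    k, a, b = 0, u.numerator, u.denominator
    while a % p == 0:
        a, k = a // p, k + 1
    while b % p == 0:
        b, k = b // p, k - 1
    return k

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mod_p(u, p):
    """Reduce a rational u with vp(u) >= 0 modulo p."""
    u = Fraction(u)
    return (u.numerator * pow(u.denominator, -1, p)) % p

def leading_matrices(As, P, Q, p):
    """(P A_i Q)^{(0)} over GF(p); asserts feasibility of (P, Q)."""
    out = []
    for A in As:
        M = matmul(matmul(P, A), Q)
        assert all(vp(e, p) >= 0 for row in M for e in row), "(P, Q) infeasible"
        out.append([[mod_p(e, p) for e in row] for row in M])
    return out

I = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
A1 = [[Fraction(3, 5), Fraction(1)], [Fraction(0), Fraction(2)]]
# 3/5 has v_2 = 0 and reduces to 3 * 5^{-1} = 1 mod 2.
assert leading_matrices([A1], I, I, 2) == [[[1, 1], [0, 0]]]
```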

Lemma 4.2.

For a feasible solution (P,Q)(P,Q) of MVMP, the following hold:

  • (1)

    vpdetPvpdetQvpdetA-v_{p}\det P-v_{p}\det Q\leq v_{p}\det A. In particular, vpDetAvpdetAv_{p}\mathop{\rm Det}^{\prime}A\leq v_{p}\det A.

  • (2)

    If (PAQ)(0)(PAQ)^{(0)} is regular, then vpdetA=vpdetPvpdetQ=vpDetAv_{p}\det A=-v_{p}\det P-v_{p}\det Q=v_{p}\mathop{\rm Det}^{\prime}A.

Proof.

Both claims follow from 0\leq v_{p}\det PAQ=v_{p}\det P+v_{p}\det Q+v_{p}\det A. The inequality holds with equality precisely when the leading matrix (PAQ)^{(0)} is regular. ∎

The following algorithm for MVMP is due to [25]; it originated from Murota’s combinatorial relaxation algorithm [37] and can be viewed as a descent algorithm on the Euclidean building. For an integer vector z\in{\mathbb{Z}}^{n}, let (p^{z}) denote the diagonal matrix with diagonal entries p^{z_{1}},p^{z_{2}},\ldots,p^{z_{n}} in order.

Algorithm: Val-Det
0:

Let (P,Q):=(I,I)(P,Q):=(I,I).

1:

Solve FR (or MVSP) for (PAQ)(0)(PAQ)^{(0)}, and obtain optimal matrices S,TGLn(GF(p))S,T\in GL_{n}(GF(p)) such that S(PAQ)(0)TS(PAQ)^{(0)}T has an r×sr\times s zero submatrix in its upper-left corner.

2:

If (PAQ)(0)(PAQ)^{(0)} is nc-singular, i.e., n<r+sn<r+s, then let (P,Q)((p1[r])SP,QT(p1[n][s]))(P,Q)\leftarrow((p^{-1_{[r]}})SP,QT(p^{1_{[n]\setminus[s]}})) and go to step 1. Otherwise stop.

The initial (P,Q) in step 0 is feasible with objective value 0, as each A_{i} is an integer matrix. In step 2, S,T are regarded as matrices in GL_{n}({\mathbb{Q}}) with entries in \{0,1,\ldots,p-1\}. Observe that each entry in the r\times s upper-left submatrix of SPAQT is divisible by p. Thus, the update in step 2 keeps the feasibility of (P,Q). Further, it strictly increases the objective value: -v_{p}\det(p^{-1_{[r]}})SP-v_{p}\det QT(p^{1_{[n]\setminus[s]}})=(r+s-n)-v_{p}\det P-v_{p}\det Q. Note that \det S and \det T are not divisible by p, since S and T are invertible modulo p. Therefore, nc-regularity of (PAQ)^{(0)} is a necessary condition for optimality of (P,Q). In fact, it is also sufficient.
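The update of step 2 can be checked on a toy instance. The following Python sketch (ours; it takes n=2, p=2, m=1 with the hypothetical coefficient matrix A_1=[[2,4],[6,3]], for which the 1×2 zero corner of the leading matrix is exposed already by S=T=I) performs one update and verifies that feasibility is preserved and that the objective value increases by r+s-n=1:

```python
from fractions import Fraction

def vp(u, p):
    """p-adic valuation of a rational u; vp(0) = infinity."""
    if u == 0:
        return float('inf')
    u = Fraction(u)
    k, a, b = 0, u.numerator, u.denominator
    while a % p == 0:
        a, k = a // p, k + 1
    while b % p == 0:
        b, k = b // p, k - 1
    return k

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def diag(ds):
    n = len(ds)
    return [[ds[i] if i == j else Fraction(0) for j in range(n)] for i in range(n)]

def val_det_step(P, Q, S, T, r, s, p):
    """One Val-Det update: S (PAQ)^{(0)} T has an r x s zero upper-left
    corner with r + s > n; scale the first r rows by 1/p and the last
    n - s columns by p."""
    n = len(P)
    Dl = diag([Fraction(1, p)] * r + [Fraction(1)] * (n - r))
    Dr = diag([Fraction(1)] * s + [Fraction(p)] * (n - s))
    return matmul(Dl, matmul(S, P)), matmul(matmul(Q, T), Dr)

p = 2
A1 = [[Fraction(2), Fraction(4)], [Fraction(6), Fraction(3)]]
I = diag([Fraction(1)] * 2)
# (A1)^{(0)} mod 2 = [[0, 0], [0, 1]]: a 1 x 2 zero corner (r + s = 3 > 2).
P, Q = val_det_step(I, I, I, I, 1, 2, p)
M = matmul(matmul(P, A1), Q)                       # = [[1, 2], [6, 3]]
assert all(vp(e, p) >= 0 for row in M for e in row)  # still feasible
detP = P[0][0] * P[1][1] - P[0][1] * P[1][0]
detQ = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
assert -vp(detP, p) - vp(detQ, p) == 1             # objective rose by r + s - n
```

After this single step the leading matrix becomes the identity, which is regular; by Lemma 4.2 (2) the new (P,Q) is then optimal, and the objective value 1 indeed equals v_2 det A_1 = v_2(-18) = 1.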

Proposition 4.3 ([26]).

A feasible solution (P,Q) is optimal if and only if (PAQ)^{(0)} is nc-regular. In this case, it holds that v_{p}\mathop{\rm Det}^{\prime}A=(1/d)v_{p}\det A^{\{d\}} for some d>0.

Proof.

As in [26, Lemma 4.2 (1)], one can show vpDetA=(1/d)vpDetA{d}v_{p}\mathop{\rm Det}^{\prime}A=(1/d)v_{p}\mathop{\rm Det}^{\prime}A^{\{d\}} for all dd. By Lemma 4.2, vpDetA(1/d)vpdetA{d}v_{p}\mathop{\rm Det}^{\prime}A\leq(1/d)v_{p}\det A^{\{d\}} holds for all dd.

Suppose that (PAQ)^{(0)} is nc-regular. It suffices to show v_{p}\mathop{\rm Det}^{\prime}A\geq(1/d)v_{p}\det A^{\{d\}} for some d. By Lemma 4.1, for some d>0, ((PAQ)^{(0)})^{\{d\}}=((PAQ)^{\{d\}})^{(0)}=((P\otimes I)A^{\{d\}}(Q\otimes I))^{(0)} is regular. Observe that (P\otimes I,Q\otimes I) is feasible for MVMP for A^{\{d\}}. By Lemma 4.2 (2), we have v_{p}\det A^{\{d\}}=-v_{p}\det P\otimes I-v_{p}\det Q\otimes I=-d(v_{p}\det P+v_{p}\det Q)\leq dv_{p}\mathop{\rm Det}^{\prime}A. ∎

From the proof and Lemma 4.1, we have:

Corollary 4.4.

AA is nc-regular if and only if vpDetA<v_{p}\mathop{\rm Det}^{\prime}A<\infty.

Therefore, Val-Det does not terminate if AA is nc-singular. A stopping criterion guaranteeing nc-singularity of AA is obtained as follows:

Proposition 4.5.

Suppose that each A_{i} consists of integer entries whose absolute values are at most D. If A is nc-regular, then v_{p}\mathop{\rm Det}^{\prime}A=O(n\log_{p}nD). Thus, if Val-Det does not terminate within O(n\log_{p}nD) iterations, then A is certified to be nc-singular.

Proof.

Suppose that AA is nc-regular. By Proposition 4.3, vpDetA=(1/d)vpdetA{d}v_{p}\mathop{\rm Det}^{\prime}A=(1/d)v_{p}\det A^{\{d\}} for some dd. We estimate vpdetA{d}v_{p}\det A^{\{d\}}. The following argument is a sharpening of the proof of [26, Lemma 4.9]. Rewrite A{d}A^{\{d\}} as

A{d}=i[m],j,k[d]Ai,jkxi,jk,A^{\{d\}}=\sum_{i\in[m],j,k\in[d]}A_{i,jk}x_{i,jk},

where A_{i,jk} is an nd\times nd block matrix with block size n such that the (j,k)-th block is equal to A_{i} and the other blocks are zero. By multilinearity of the determinant, we have

detA{d}=α1,α2,,αnd±detA[α1,α2,,αnd]xα1xα2xαnd,\det A^{\{d\}}=\sum_{\alpha_{1},\alpha_{2},\ldots,\alpha_{nd}}\pm\det A[\alpha_{1},\alpha_{2},\ldots,\alpha_{nd}]x_{\alpha_{1}}x_{\alpha_{2}}\cdots x_{\alpha_{nd}},

where \alpha_{\gamma} (\gamma\in[nd]) ranges over \{(i,jk)\}_{i\in[m],j\in[d]} if \gamma belongs to the k-th block (i.e., k=\lceil\gamma/n\rceil), and A[\alpha_{1},\alpha_{2},\ldots,\alpha_{nd}] is the nd\times nd matrix whose \gamma-th column is the \gamma-th column of A_{i,jk} with \alpha_{\gamma}=(i,jk). A monomial in this expression is written as a_{z}\prod x_{i,jk}^{z_{i,jk}} for a nonnegative vector z=(z_{i,jk})\in{\mathbb{Z}}^{md^{2}} with \sum_{i,j}z_{i,jk}=n (k\in[d]). The coefficient a_{z} is given by

az=α1,α2,,αnd±detA[α1,α2,,αnd],a_{z}=\sum_{\alpha_{1},\alpha_{2},\ldots,\alpha_{nd}}\pm\det A[\alpha_{1},\alpha_{2},\ldots,\alpha_{nd}],

where α1,α2,,αnd\alpha_{1},\alpha_{2},\ldots,\alpha_{nd} are taken so that (i,jk)(i,jk) appears zi,jkz_{i,jk} times. The total number of such indices is

k=1dn!i,jzi,jk!nnd.\prod_{k=1}^{d}\frac{n!}{\prod_{i,j}z_{i,jk}!}\leq n^{nd}.

From Hadamard’s inequality and the fact that each column of A[α1,α2,,αnd]A[\alpha_{1},\alpha_{2},\ldots,\alpha_{nd}] has at most nn nonzero entries with absolute values at most DD, we have

|az|nnd(n1/2D)ndn3nd/2Dnd.|a_{z}|\leq n^{nd}(n^{1/2}D)^{nd}\leq n^{3nd/2}D^{nd}. (4.4)

Therefore, the digit length of a_{z} in base p is bounded by O(nd\log_{p}nD). By (4.2), we have v_{p}\det A^{\{d\}}=O(nd\log_{p}nD). Thus, v_{p}\mathop{\rm Det}^{\prime}A=O(n\log_{p}nD). ∎

For p=2, the algorithm Val-Det is executed as follows. Instead of updating (P,Q), update A as A\leftarrow(p^{-1_{[r]}})SAT(p^{1_{[n]\setminus[s]}}). Then A^{(0)} is computed as (A_{i})^{(0)}=A_{i}\bmod 2. In step 2, S,T are 0,1-matrices such that all entries of the r\times s corner of each SA_{i}T are divisible by 2. Hence, the next A_{i} is again an integer matrix. The bit-length bound of each entry in A_{i} increases by O(\log_{2}n) per iteration (starting from the initial bound O(\log_{2}D)). Therefore, until detecting nc-singularity of A, the required bit-length is O(n\log_{2}n\log_{2}nD).

Remark 4.6 (Valuations on the free skew field).

As shown by Cohn [11, Corollary 4.6], any valuation v on a field {\mathbb{K}} extends to the free skew field {\mathbb{K}}(\langle x_{1},\ldots,x_{m}\rangle). Then we can consider the valuation v\mathop{\rm Det}A of the Dieudonné determinant \mathop{\rm Det}A of A. If the extension v is discrete and coincides with the Gauss extension (4.3) on {\mathbb{K}}\langle x_{1},x_{2},\ldots,x_{m}\rangle, then one can show, by precisely the same argument as in [25], that v\mathop{\rm Det}A is given by MVMP. Such an extension seems to always exist; in that case, v_{p}\mathop{\rm Det}^{\prime}=v_{p}\mathop{\rm Det}. We verified the existence of an extension with the latter property (by adapting Cohn’s argument in [11, Section 4]); however, we could not prove its discreteness. Note that the arguments in this section are independent of this existence issue.

Acknowledgments

We thank Kazuo Murota, Satoru Iwata, Satoru Fujishige, and Yuni Iwamasa for helpful comments, and Koyo Hayashi for careful reading. The work was partially supported by JSPS KAKENHI Grant Numbers 25280004, 26330023, 26280004, 17K00029, and JST PRESTO Grant Number JPMJPR192A, Japan.

References

  • [1] P. Abramenko and K. S. Brown, Buildings—Theory and Applications. Springer, New York, 2008.
  • [2] Z. Allen-Zhu, A. Garg, Y. Li, R. Oliveira, and A. Wigderson. Operator scaling via geodesically convex optimization, invariant theory and polynomial identity testing. preprint, 2017, the conference version in STOC 2018.
  • [3] M. Bačák, The proximal point algorithm in metric spaces. Israel Journal of Mathematics 194 (2013), 689–701.
  • [4] M. Bačák, Computing medians and means in Hadamard spaces. SIAM Journal on Optimization 24 (2014), 1542–1566.
  • [5] M. Bačák, Convex Analysis and Optimization in Hadamard Spaces. De Gruyter, Berlin, 2014.
  • [6] L. J. Billera, S. P. Holmes, and K. Vogtmann, Geometry of the space of phylogenetic trees. Advances in Applied Mathematics 27 (2001), 733–767.
  • [7] T. Brady and J. McCammond, Braids, posets and orthoschemes. Algebraic and Geometric Topology 10 (2010), 2277–2314.
  • [8] M. R. Bridson and A. Haefliger, Metric Spaces of Non-positive Curvature. Springer-Verlag, Berlin, 1999.
  • [9] P. Bürgisser, C. Franks, A. Garg, R. Oliveira, M. Walter, and A. Wigderson, Towards a theory of non-commutative optimization: geodesic first and second order methods for moment maps and polytopes. preprint, 2019, the conference version in FOCS 2019.
  • [10] J. Chalopin, V. Chepoi, H. Hirai, and D. Osajda. Weakly modular graphs and nonpositive curvature. Memoirs of the AMS, to appear.
  • [11] P. M. Cohn, The construction of valuations of skew fields. Journal of the Indian Mathematical Society 54 (1989), 1–45.
  • [12] P. M. Cohn, Skew fields. Cambridge University Press, Cambridge, 1995.
  • [13] H. Derksen and V. Makam, Polynomial degree bounds for matrix semi-invariants. Advances in Mathematics 310 (2017), 44–63.
  • [14] M. Fortin and C. Reutenauer, Commutative/non-commutative rank of linear matrices and subspaces of matrices of low rank. Séminaire Lotharingien de Combinatoire 52 (2004), B52f.
  • [15] S. Fujishige, Submodular Functions and Optimization, 2nd Edition. Elsevier, Amsterdam, 2005.
  • [16] S. Fujishige, T. Király, K. Makino, K. Takazawa, and S. Tanigawa, Minimizing submodular functions on diamonds via generalized fractional matroid matchings. EGRES Technical Report TR-2014-14 (2014).
  • [17] A. Garg, L. Gurvits, R. Oliveira, and A. Wigderson, Operator scaling: theory and applications. Foundations of Computational Mathematics (2019).
  • [18] G. Grätzer, Lattice Theory: Foundation. Birkhäuser, Basel, 2011.
  • [19] L. Gurvits, Classical complexity and quantum entanglement. Journal of Computer and System Sciences 69 (2004), 448–484.
  • [20] T. Haettel, D. Kielak, and P. Schwer, The 6-strand braid group is CAT(0). Geometriae Dedicata 182 (2016), 263–286.
  • [21] M. Hamada and H. Hirai, Maximum vanishing subspace problem, CAT(0)-space relaxation, and block-triangularization of partitioned matrix. preprint, 2017.
  • [22] K. Hayashi, A polynomial time algorithm to compute geodesics in CAT(0) cubical complexes. Discrete & Computational Geometry, to appear.
  • [23] H. Hirai, Computing DM-decomposition of a partitioned matrix with rank-1 blocks. Linear Algebra and Its Applications 547 (2018), 105–123.
  • [24] H. Hirai, L-convexity on graph structures. Journal of the Operations Research Society of Japan 61 (2018), 71–109.
  • [25] H. Hirai, Computing the degree of determinants via discrete convex optimization on Euclidean buildings. SIAM Journal on Applied Geometry and Algebra 3 (2019), 523–557.
  • [26] H. Hirai and M. Ikeda, A cost-scaling algorithm for computing the degree of determinants, preprint, 2020.
  • [27] H. Hirai and Y. Iwamasa, A combinatorial algorithm for computing the rank of a generic partitioned matrix with 2×22\times 2 submatrices. preprint, 2020, the conference version in IPCO 2020.
  • [28] P. Hrubeš and A. Wigderson, Non-commutative arithmetic circuits with division. Theory of Computing 11 (2015), 357–393.
  • [29] G. Ivanyos, Y. Qiao, and K. V. Subrahmanyam, Non-commutative Edmonds’ problem and matrix semi-invariants. Computational Complexity 26 (2017), 717–763.
  • [30] G. Ivanyos, Y. Qiao, and K. V. Subrahmanyam, Constructive noncommutative rank computation in deterministic polynomial time over fields of arbitrary characteristics. Computational Complexity 27 (2018), 561–593.
  • [31] H. Ito, S. Iwata, and K. Murota, Block-triangularizations of partitioned matrices under similarity/equivalence transformations. SIAM Journal on Matrix Analysis and Applications 15 (1994), 1226–1255.
  • [32] S. Iwata and K. Murota, A minimax theorem and a Dulmage-Mendelsohn type decomposition for a class of generic partitioned matrices. SIAM Journal on Matrix Analysis and Applications 16 (1995), 719–734.
  • [33] D. S. Kaliuzhnyi-Verbovetskyi and V. Vinnikov, Noncommutative rational functions, their difference-differential calculus and realizations. Multidimensional Systems and Signal Processing 23 (2012), 49–77.
  • [34] F. Kuivinen, On the complexity of submodular function minimisation on diamonds. Discrete Optimization, 8 (2011), 459–477.
  • [35] L. Lovász, Submodular functions and convexity. In A. Bachem, M. Grötschel, and B. Korte (eds.): Mathematical Programming—The State of the Art (Springer-Verlag, Berlin, 1983), 235–257.
  • [36] L. Lovász, Singular spaces of matrices and their application in combinatorics. Boletim da Sociedade Brasileira de Matemática 20 (1989), 87–99.
  • [37] K. Murota, Matrices and Matroids for Systems Analysis. Springer-Verlag, Berlin, 2000.
  • [38] K. Murota, Discrete Convex Analysis. SIAM, Philadelphia, 2004.
  • [39] S. Ohta and M. Pálfia, Discrete-time gradient flows and law of large numbers in Alexandrov spaces. Calculus of Variations and Partial Differential Equations 54 (2015), 1591–1610.
  • [40] T. Oki, Computing the maximum degree of minors in skew polynomial matrices. preprint, 2019, the conference version in ICALP 2020.
  • [41] M. Owen, Computing geodesic distances in tree space. SIAM Journal on Discrete Mathematics 25 (2011), 1506–1529.