
Affiliations: University of Bergen, Norway; University of Utah, USA

Structural Optimal Jacobian Accumulation and Minimum Edge Count are NP-Complete Under Vertex Elimination

Matthias Bentert Alex Crane Pål Grønås Drange Yosuke Mizutani Blair D. Sullivan
Abstract

We study graph-theoretic formulations of two fundamental problems in algorithmic differentiation. The first (Structural Optimal Jacobian Accumulation) is that of computing a Jacobian while minimizing multiplications. The second (Minimum Edge Count) is to find a minimum-size computational graph. For both problems, we consider the vertex elimination operation. Our main contribution is to show that both problems are NP-complete, thus resolving longstanding open questions. In contrast to prior work, our reduction for Structural Optimal Jacobian Accumulation does not rely on any assumptions about the algebraic relationships between local partial derivatives; we allow these values to be mutually independent. We also provide $O^{*}(2^{n})$-time exact algorithms for both problems, and show that under the exponential time hypothesis these running times are essentially tight. Finally, we provide a data reduction rule for Structural Optimal Jacobian Accumulation by showing that false twins may always be eliminated consecutively.

1 Introduction

A core subroutine in numerous scientific computing applications is the efficient computation of derivatives. If inexactness is acceptable, finite difference methods [22] may be applicable. Meanwhile, computer algebra packages allow for exact symbolic computation, though these methods suffer from the requirement of a closed-form expression of the function $F$ to be differentiated as well as poor running time in practice. A third approach is algorithmic differentiation, also sometimes called automatic differentiation. Algorithmic differentiation provides almost exact computations, incurring only rounding errors (in contrast to finite difference methods which also incur truncation errors). Furthermore, the running time required by algorithmic differentiation is bounded by the time taken to compute $F$ (in contrast to symbolic methods). We refer to the works by Griewank and Walther [11] and by Naumann [20] for an introduction and numerous applications.

In algorithmic differentiation, we assume that the function $F$ is implemented as a numerical program. The key insight is that such programs consist of compositions of elementary functions, e.g., multiplication, sin, cos, etc., for which the derivatives are known or easily computable. Then, derivative computations for the function $F$ follow by application of the chain rule. The relevant numerical program can be modeled as a directed acyclic graph (DAG) $D=(S\uplus I,E)$, referred to as the computational graph of $F$ [10]. The source and sink vertices $S$ model the inputs and outputs of $F$, respectively. The internal vertices $I$ model elementary function calls. Arcs (directed edges) model data dependencies. Let $\mathcal{P}_{s,t}$ denote the set of all paths from source $s$ to sink $t$. If we associate to each arc $(u,v)$ the known (or easily computable) local partial derivative $\partial v/\partial u$, then the chain rule allows us to compute the derivative of $t$ with respect to $s$ by

$$\frac{\mathrm{d}t}{\mathrm{d}s}=\sum_{P\in\mathcal{P}_{s,t}}\prod_{(u,v)\in P}\frac{\partial v}{\partial u} \qquad (1)$$

as shown by Bauer [2]. This weighted DAG is the linearized computational graph.
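Equation (1) can be evaluated directly by enumerating paths, which is useful for building intuition (though the number of paths can be exponential in general). The following Python sketch, with our own helper names `all_paths` and `derivative`, computes the path-sum on a toy linearized computational graph:

```python
def all_paths(adj, s, t):
    """Enumerate all directed s-t paths in a DAG given as a dict of dicts
    mapping each vertex to {successor: arc weight}."""
    if s == t:
        yield [t]
        return
    for nxt in adj.get(s, {}):
        for rest in all_paths(adj, nxt, t):
            yield [s] + rest

def derivative(adj, s, t):
    """Bauer's formula: sum over all s-t paths of the product of the
    local partial derivatives along the path."""
    total = 0
    for path in all_paths(adj, s, t):
        prod = 1
        for u, v in zip(path, path[1:]):
            prod *= adj[u][v]
        total += prod
    return total

# Toy linearized computational graph: a = 2s, b = 3s + 11a, t = 5a + 7b,
# so dt/ds = 2*5 + 3*7 + 2*11*7 = 185.
adj = {'s': {'a': 2, 'b': 3}, 'a': {'b': 11, 't': 5}, 'b': {'t': 7}}
print(derivative(adj, 's', 't'))  # 185
```

Here the arc weights are constant, so the program is linear and the path-sum agrees with the ordinary derivative of the composed function.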

Using the linearized computational graph, we may model the computation of a Jacobian as follows. An elimination of an internal vertex $v$ is the deletion of $v$ and the creation of all arcs from each in-neighbor to each out-neighbor of $v$ which are not already present. We write $D_{/v}$ for the resulting DAG. An elimination sequence $\sigma=(v_{1},v_{2},\ldots,v_{\ell})$ of length $\ell$ is a tuple of internal vertices, and we denote by $D_{\sigma}=(((D_{/v_{1}})_{/v_{2}})\ldots)_{/v_{\ell}}$ the result of eliminating these vertices in the order given by $\sigma$. We call $\sigma$ a total elimination sequence if $\ell=|I|$. If $\sigma$ is a total elimination sequence, then with appropriate local partial derivative computations or updates during the sequence, $D_{\sigma}$ can be thought of as a bipartite DAG (with sources on one side and sinks on the other) representing the Jacobian matrix of the associated numerical program. To reflect the number of multiplications needed to maintain correctness of Equation (1), we say that the cost of eliminating an internal vertex $v$ is the Markowitz degree $\mu(v)$ of $v$, that is, the in-degree of $v$ times the out-degree of $v$. The cost of an elimination sequence is the sum of the costs of the involved eliminations. We can now phrase the problem of computing a Jacobian with few multiplications in purely graph-theoretic terms:
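The definitions above can be sketched in a few lines of Python (the names `eliminate`, `markowitz`, and `sequence_cost` are ours, not from the paper). The small example at the end shows that the cost of a total elimination sequence depends on the order in which vertices are eliminated:

```python
def eliminate(adj, v):
    """Eliminate v: add an arc from each in-neighbor to each out-neighbor
    of v (if not already present), then delete v."""
    preds = [u for u in adj if v in adj[u]]
    succs = set(adj[v])
    for u in preds:
        adj[u] |= succs       # create missing arcs u -> w for w in succs
        adj[u].discard(v)     # remove the arc u -> v
    del adj[v]

def markowitz(adj, v):
    """Markowitz degree: in-degree of v times out-degree of v."""
    indeg = sum(1 for u in adj if v in adj[u])
    return indeg * len(adj[v])

def sequence_cost(adj, order):
    """Total cost (number of multiplications) of an elimination sequence."""
    adj = {u: set(ws) for u, ws in adj.items()}  # work on a copy
    cost = 0
    for v in order:
        cost += markowitz(adj, v)
        eliminate(adj, v)
    return cost

# Source s, internal vertices a and b, sink t.
D = {'s': {'a', 'b'}, 'a': {'b'}, 'b': {'t'}}
print(sequence_cost(D, ['a', 'b']))  # 1 + 1 = 2
print(sequence_cost(D, ['b', 'a']))  # 2 + 1 = 3
```

Eliminating $a$ first costs $1\cdot 1$, after which $b$ has a single in-neighbor; eliminating $b$ first costs $2\cdot 1$ because both $s$ and $a$ are in-neighbors of $b$.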


Structural Optimal Jacobian Accumulation
Input: A DAG $D=(S\uplus I,A)$ and an integer $k$.
Question: Does there exist a total elimination sequence of cost at most $k$?

Optimal Jacobian Accumulation is known to be NP-complete only under certain assumptions regarding algebraic dependencies between local partial derivatives [19]. Despite this, heuristics used in practice are based on the purely structural formulation presented here [4, 6, 9, 21] and understanding the complexity of this formulation, open since at least 1993 [8], has recently been highlighted as an important problem in applied combinatorics [1].


A solution to Structural Optimal Jacobian Accumulation is always a total elimination sequence, resulting in a bipartite DAG representing the Jacobian. A related problem, first posed by Griewank and Vogel in 2005 [10], is to identify a (not necessarily total) elimination sequence which results in a computational graph with a minimum number of arcs:


Minimum Edge Count
Input: A DAG $D=(S\uplus I,A)$ and an integer $k$.
Question: Does there exist an elimination sequence $\sigma$ (of any length) such that $D_{\sigma}$ contains at most $k$ arcs?

The motivation to solve this problem is twofold. First, suppose that our function $F$ is a map from $\mathbb{R}^{n}$ to $\mathbb{R}^{m}$ and we wish to multiply its Jacobian (a matrix in $\mathbb{R}^{m\times n}$) by a matrix $S\in\mathbb{R}^{n\times q}$. We may model this computation by augmenting the linearized computational graph $D$ with $q$ new source vertices. For each new source $i\in\{1,2,\ldots,q\}$ and each original source $j\in\{1,2,\ldots,n\}$, we add the arc $(i,j)$. By labeling the new arcs with entries from the matrix $S$, we may obtain the result of the matrix multiplication via application of Equation (1). We refer to the work by Mosenkis and Naumann [17] for a formal presentation. The number of multiplications required to use Equation (1) in this way grows with the number of arcs in $D$, thus motivating the computation of a small (linearized) computational graph. A second motivation is that $D$ can sometimes reveal useful information not evident in the Jacobian matrix. This situation, known as scarcity, is described by Griewank and Vogel [10]. Thus, it is desirable to store $D$, rather than the Jacobian matrix, and consequently it is also desirable for $D$ to be as small as possible.

Despite these motivations and several algorithmic studies [10, 16, 17], the computational complexity of Minimum Edge Count has remained open since its introduction [10, 17]. Like Structural Optimal Jacobian Accumulation, resolving this question has recently been highlighted as an important open problem [1].

Our Results.

We show that both Structural Optimal Jacobian Accumulation and Minimum Edge Count are NP-complete, resolving the key complexity questions which have stood open since 1993 [8] and 2005 [10], respectively. Furthermore, we prove that unless the exponential time hypothesis (ETH) fails, neither problem admits a subexponential algorithm, i.e., an algorithm with running time $2^{o(n+m)}$. (The ETH is a popular complexity assumption stating that 3-Sat cannot be solved in subexponential time; see Section 2 for more details.) We complement our lower bounds by providing $O^{*}(2^{n})$-time algorithms for both problems.

2 Preliminaries and Basic Observations

In this section, we define the notation we use throughout the paper, introduce relevant concepts from the existing literature, and show two useful basic propositions.

Notation.

For a positive integer $n$, we use $[n]$ to denote the set $\{1,2,\ldots,n\}$. We use standard graph terminology. In particular, a graph $G=(V,E)$ or $D=(V,A)$ is a pair where $V$ denotes the set of vertices and $E$ and $A$ denote the set of (undirected) edges or (directed) arcs, respectively. We use $n$ to indicate the number of vertices in a graph and $m$ to indicate the number of edges or arcs. For an (undirected) edge between two vertices $u$ and $v$ we write $\{u,v\}$, and for an arc (a directed edge) from $u$ to $v$ we write $(u,v)$. Given a vertex $v\in V$, we denote by $N^{-}_{D}(v)$ and $N^{+}_{D}(v)$ the open in- and out-neighborhoods of $v$, respectively. The Markowitz degree is defined to be $\mu_{D}(v)=\deg_{D}^{-}(v)\cdot\deg_{D}^{+}(v)=|N^{-}_{D}(v)|\cdot|N^{+}_{D}(v)|$. If the graph is clear from context, we omit the subscript in the above notation. We say that two vertices $u$ and $v$ are false twins in a directed graph $D$ if $N^{-}_{D}(u)=N^{-}_{D}(v)$ and $N^{+}_{D}(u)=N^{+}_{D}(v)$. Given sequences $\sigma_{1}=(a_{1},a_{2},\ldots,a_{i})$ and $\sigma_{2}=(b_{1},b_{2},\ldots,b_{j})$, we write $(\sigma_{1},\sigma_{2})$ for the combined sequence $(a_{1},a_{2},\ldots,a_{i},b_{1},b_{2},\ldots,b_{j})$, and we generalize this notation to more than two sequences.
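As a small illustration of the false-twin definition, consider the following Python sketch (the helper names `in_nbrs` and `false_twins` are ours):

```python
def in_nbrs(adj, v):
    """Open in-neighborhood of v in a digraph given as an adjacency dict."""
    return {u for u in adj if v in adj[u]}

def false_twins(adj, u, v):
    """u and v are false twins iff their in- and out-neighborhoods coincide."""
    return in_nbrs(adj, u) == in_nbrs(adj, v) and adj[u] == adj[v]

# u and v share the in-neighborhood {s} and the out-neighborhood {t}.
adj = {'s': {'u', 'v'}, 'u': {'t'}, 'v': {'t'}, 't': set()}
print(false_twins(adj, 'u', 'v'))  # True
print(false_twins(adj, 'u', 's'))  # False
```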

Reductions and the ETH.

We assume the reader to be familiar with basic concepts in complexity theory like big-O (Bachmann–Landau) notation, NP-completeness, and polynomial-time many-one reductions (also known as Karp reductions). We refer to the standard textbook by Garey and Johnson [7] for an introduction. We use $O^{*}$ to hide factors that are polynomial in the input size and call a polynomial-time many-one reduction a linear reduction when the size of the constructed instance $I^{\prime}$ is linear in the size of the original instance $I$, that is, $|I^{\prime}|\in O(|I|)$. The exponential time hypothesis (ETH) [12] states that there is some $\varepsilon>0$ such that every algorithm solving 3-Sat takes at least $2^{\varepsilon n+o(n)}$ time, where $n$ is the number of variables in the input instance. Assuming the ETH, 3-Sat and many other problems cannot be solved in subexponential ($2^{o(n+m)}$) time [13]. It is known that if there is a linear reduction from a problem $A$ to a problem $B$ and $A$ cannot be solved in subexponential time, then $B$ cannot be solved in subexponential time either [13].

Fundamental Observations.

We next show two useful observations. The first states that the order in which vertices are eliminated does not affect the resulting graph (note that it may still affect the cost of the elimination sequence). This is a folklore result, but to our knowledge no proof has been published. Our argument can be seen as an adaptation of one used by Rose and Tarjan to prove a closely related result [23].

Proposition 1.

Let $D=(V,A)$ be a DAG, let $X\subseteq V$ be a set of vertices, and let $\sigma_{1}$ and $\sigma_{2}$ be two permutations of the vertices in $X$. Then, $D_{\sigma_{1}}=D_{\sigma_{2}}$.

Proof.

We first show for any DAG $D=(V,A)$ and any three vertices $u,v,w\in V$ that there is a directed path from $u$ to $v$ in $D$ if and only if there is a directed path from $u$ to $v$ in $D_{/w}$. To this end, first assume that there is a directed path $P$ from $u$ to $v$ in $D$. If $P$ does not contain $w$, then $P$ is also a directed path in $D_{/w}$. Otherwise, let $x$ and $y$ be the vertices before and after $w$ in $P$, respectively. Since the elimination of $w$ adds an arc from $x$ to $y$, there is also a directed path from $u$ to $v$ in this case. Now assume that there is a directed path $P$ from $u$ to $v$ in $D_{/w}$, and assume without loss of generality that $P$ is a shortest such path. There are again two cases: either $P$ is also a directed path in $D$, or $P$ contains at least one arc $(x,y)$ that is not present in $D$. In the first case, $P$ itself witnesses a directed path from $u$ to $v$ in $D$. In the second case, consider the first arc $(x,y)$ in $P$ that is not contained in $D$. Note that by construction $(x,y)$ is only added to $D_{/w}$ if $x$ is an in-neighbor of $w$ and $y$ is an out-neighbor of $w$ in $D$. Moreover, since $P$ is a shortest path, there is no other vertex $y^{\prime}$ in $P$ that is also an out-neighbor of $w$, as otherwise the arc $(x,y^{\prime})$ exists in $D_{/w}$ and could be used to shortcut $P$, contradicting that $P$ is a shortest path. We can therefore replace the arc $(x,y)$ in $P$ by the subpath $(x,w,y)$ to get a directed path from $u$ to $v$ in $D$.

Let $\sigma=(w_{1},w_{2},\ldots,w_{k})$ be a sequence. By induction on $k$ (using the above argument), it holds for any two vertices $u,v\in V\setminus\{w_{1},w_{2},\ldots,w_{k}\}$ that there is a directed path from $u$ to $v$ in $D$ if and only if there is one in $(((D_{/w_{1}})_{/w_{2}})\ldots)_{/w_{k}}=D_{\sigma}$.

Now assume towards a contradiction that $D_{\sigma_{1}}\neq D_{\sigma_{2}}$. Note that both contain the same set $V\setminus X$ of vertices. We assume without loss of generality that there is an arc $(u,v)$ that exists in $D_{\sigma_{1}}$ but not in $D_{\sigma_{2}}$. By the above argument, since $(u,v)$ appears in $D_{\sigma_{1}}$ there is a directed path from $u$ to $v$ in $D$. However, since $(u,v)$ does not appear in $D_{\sigma_{2}}$, there is no directed path from $u$ to $v$ in $D$, a contradiction. This concludes the proof. ∎

Let $\sigma$ be a sequence of internal vertices, and let $X$ be the set of vertices appearing in $\sigma$. In the rest of this paper, we may use $D_{X}$ to denote the graph $D_{\sigma}$. By Proposition 1, this notation is well-defined.
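Proposition 1 is easy to confirm by brute force on small examples. The Python sketch below (the helper names `eliminate` and `eliminate_all` are ours) eliminates a vertex set $X$ in every possible order and checks that the resulting arc sets coincide:

```python
from itertools import permutations

def eliminate(adj, v):
    """Eliminate v: connect each in-neighbor to each out-neighbor, delete v."""
    preds = [u for u in adj if v in adj[u]]
    succs = set(adj[v])
    for u in preds:
        adj[u] |= succs
        adj[u].discard(v)
    del adj[v]

def eliminate_all(adj, order):
    """Return the arc set of D_sigma for the elimination order `order`."""
    adj = {u: set(ws) for u, ws in adj.items()}
    for v in order:
        eliminate(adj, v)
    return frozenset((u, w) for u, ws in adj.items() for w in ws)

# Sources s1, s2; internal a, b; sinks t1, t2.
D = {'s1': {'a', 'b'}, 's2': {'b'}, 'a': {'b', 't1'}, 'b': {'t1', 't2'}}
results = {eliminate_all(D, p) for p in permutations(['a', 'b'])}
print(len(results))  # 1: the graph D_X is independent of the order
```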

To conclude this section, we show that false twins can be handled uniformly in Structural Optimal Jacobian Accumulation. Let $T$ be a set of false twins, i.e., $u$ and $v$ are false twins for every $u,v\in T$. Then we may assume, without loss of generality, that eliminating any $u\in T$ also entails eliminating the rest of the vertices in $T$ immediately afterward.

Proposition 2.

Let $D=(S\uplus I,A)$ be a DAG and let $T\subseteq I$ be a set of false twins. Then, there exists an optimal elimination sequence (for Structural Optimal Jacobian Accumulation) that eliminates the vertices of $T$ consecutively.

Proof.

Let $T\subseteq I$ be a set of false twins in $D$. We first prove the result when $|T|=2$. Let $T=\{u,v\}$, and let $\sigma$ be an optimal solution. We may assume that $u$ and $v$ are eliminated non-consecutively in $\sigma$, as otherwise the proof is complete. We further assume without loss of generality that $u$ is eliminated before $v$ in $\sigma$. Let $\sigma_{1}$ be the subsequence of $\sigma$ before $u$, let $\sigma_{2}$ be the subsequence between $u$ and $v$, and let $\sigma_{3}$ be the subsequence after $v$. Let $X_{1}$, $X_{2}$, and $X_{3}$ be the sets of vertices appearing in $\sigma_{1}$, $\sigma_{2}$, and $\sigma_{3}$, respectively. Note that $X_{1}$ and $X_{3}$ might be empty, but $X_{2}$ contains at least one vertex. Let $\sigma^{\prime}=(\sigma_{1},\sigma_{2},u,v,\sigma_{3})$ and $\sigma^{\prime\prime}=(\sigma_{1},u,v,\sigma_{2},\sigma_{3})$. Let $c$, $c^{\prime}$, and $c^{\prime\prime}$ be the costs of $\sigma$, $\sigma^{\prime}$, and $\sigma^{\prime\prime}$, respectively. Since $\sigma$ is optimal, it holds that $c\leq c^{\prime}$ and $c\leq c^{\prime\prime}$. Now, we claim that both $\sigma^{\prime}$ and $\sigma^{\prime\prime}$ are optimal (it suffices to show that either $\sigma^{\prime}$ or $\sigma^{\prime\prime}$ is optimal, but it will be useful later to prove that both are). Assume otherwise; then at least one of the inequalities $c\leq c^{\prime}$ and $c\leq c^{\prime\prime}$ is strict. This implies $2c<c^{\prime}+c^{\prime\prime}$. We will show that this inequality leads to a contradiction.

It holds that the cost of $\sigma_{1}$ in $D$ and the cost of $\sigma_{3}$ in $D_{X_{1}\cup X_{2}\cup\{u,v\}}$ are accounted for identically in the total costs of $\sigma$, $\sigma^{\prime}$, and $\sigma^{\prime\prime}$. Thus, any difference in the values of $c$, $c^{\prime}$, and $c^{\prime\prime}$ is attributable entirely to differing costs of eliminating $u$, $v$, and the vertices in $X_{2}$. Let $d_{1}$, $d_{2}$, and $d_{3}$ be the costs of $\sigma_{2}$ in $D_{X_{1}}$, $D_{X_{1}\cup\{u\}}$, and $D_{X_{1}\cup\{u,v\}}$, respectively. Note that these terms are well-defined by Proposition 1. This implies

$$2\big(\mu_{D_{X_{1}}}(u)+d_{2}+\mu_{D_{X_{1}\cup\{u\}\cup X_{2}}}(v)\big)<\big(d_{1}+\mu_{D_{X_{1}\cup X_{2}}}(u)+\mu_{D_{X_{1}\cup X_{2}\cup\{u\}}}(v)\big)+\big(\mu_{D_{X_{1}}}(u)+\mu_{D_{X_{1}\cup\{u\}}}(v)+d_{3}\big). \qquad (5)$$

Moreover, since $u$ and $v$ are false twins and the elimination of $u$ does not change the cost of eliminating $v$, it holds that $\mu_{D_{X_{1}\cup X_{2}}}(u)=\mu_{D_{X_{1}\cup X_{2}\cup\{u\}}}(v)$ and $\mu_{D_{X_{1}}}(u)=\mu_{D_{X_{1}\cup\{u\}}}(v)$. Substituting this into Inequality (5) yields $2d_{2}<d_{1}+d_{3}$. Next, let $\sigma_{2}=(w_{1},w_{2},\ldots,w_{k})$ and let $W_{i}=X_{1}\cup\{w_{1},w_{2},\ldots,w_{i-1}\}$ for each $i\in[k]$. Notice that $d_{1}=\sum_{i\in[k]}\mu_{D_{W_{i}}}(w_{i})$, $d_{2}=\sum_{i\in[k]}\mu_{D_{W_{i}\cup\{u\}}}(w_{i})$, and $d_{3}=\sum_{i\in[k]}\mu_{D_{W_{i}\cup\{u,v\}}}(w_{i})$. To conclude the proof, we will show that for each $w_{i}\in X_{2}$,

$$2\mu_{D_{W_{i}\cup\{u\}}}(w_{i})\geq\mu_{D_{W_{i}}}(w_{i})+\mu_{D_{W_{i}\cup\{u,v\}}}(w_{i}).$$

Note that this implies $2d_{2}\geq d_{1}+d_{3}$, yielding the desired contradiction. To show the above claim, we consider three cases: (i) $w_{i}$ is an out-neighbor of $u$ in $D_{W_{i}}$, (ii) $w_{i}$ is an in-neighbor of $u$ in $D_{W_{i}}$, and (iii) $w_{i}$ is neither an in- nor an out-neighbor of $u$ in $D_{W_{i}}$. Note that since $D_{W_{i}}$ is a DAG, $w_{i}$ cannot be both an in-neighbor and an out-neighbor of $u$. Moreover, since $u$ and $v$ are false twins, $w_{i}$ is an in-/out-neighbor of $u$ if and only if it is an in-/out-neighbor of $v$. In the first case, note that

  • $|N^{+}_{D_{W_{i}}}(w_{i})|=|N^{+}_{D_{W_{i}\cup\{u\}}}(w_{i})|=|N^{+}_{D_{W_{i}\cup\{u,v\}}}(w_{i})|$,

  • $|N^{-}_{D_{W_{i}}}(w_{i})|\leq|N^{-}_{D_{W_{i}\cup\{u,v\}}}(w_{i})|+2$, and

  • $|N^{-}_{D_{W_{i}\cup\{u\}}}(w_{i})|=|N^{-}_{D_{W_{i}\cup\{u,v\}}}(w_{i})|+1$.

The first holds as the out-degree of $w_{i}$ does not change if $u$ and/or $v$ are eliminated. To see the second, note that eliminating $u$ and $v$ can reduce the in-degree of $w_{i}$ by at most two. Finally, if $u$ is already eliminated, then eliminating $v$ does not add any new in-neighbors of $w_{i}$ since $u$ and $v$ are false twins (and this property remains true even if other vertices are eliminated). Thus, we get

$$\begin{aligned}2\mu_{D_{W_{i}\cup\{u\}}}(w_{i})&=2\big(|N^{-}_{D_{W_{i}\cup\{u\}}}(w_{i})|\cdot|N^{+}_{D_{W_{i}\cup\{u\}}}(w_{i})|\big)\\&=2\big((|N^{-}_{D_{W_{i}\cup\{u,v\}}}(w_{i})|+1)\cdot|N^{+}_{D_{W_{i}\cup\{u\}}}(w_{i})|\big)\\&=\big(2|N^{-}_{D_{W_{i}\cup\{u,v\}}}(w_{i})|+2\big)\cdot|N^{+}_{D_{W_{i}\cup\{u\}}}(w_{i})|\\&\geq\big(|N^{-}_{D_{W_{i}}}(w_{i})|+|N^{-}_{D_{W_{i}\cup\{u,v\}}}(w_{i})|\big)\cdot|N^{+}_{D_{W_{i}\cup\{u\}}}(w_{i})|\\&=\mu_{D_{W_{i}}}(w_{i})+\mu_{D_{W_{i}\cup\{u,v\}}}(w_{i}).\end{aligned}$$

The second case is analogous with the roles of in- and out-neighbors swapped, that is,

  • $|N^{-}_{D_{W_{i}}}(w_{i})|=|N^{-}_{D_{W_{i}\cup\{u\}}}(w_{i})|=|N^{-}_{D_{W_{i}\cup\{u,v\}}}(w_{i})|$,

  • $|N^{+}_{D_{W_{i}}}(w_{i})|\leq|N^{+}_{D_{W_{i}\cup\{u,v\}}}(w_{i})|+2$, and

  • $|N^{+}_{D_{W_{i}\cup\{u\}}}(w_{i})|=|N^{+}_{D_{W_{i}\cup\{u,v\}}}(w_{i})|+1$.

This yields

$$\begin{aligned}2\mu_{D_{W_{i}\cup\{u\}}}(w_{i})&=2\big(|N^{-}_{D_{W_{i}\cup\{u\}}}(w_{i})|\cdot|N^{+}_{D_{W_{i}\cup\{u\}}}(w_{i})|\big)\\&=2\big(|N^{-}_{D_{W_{i}\cup\{u\}}}(w_{i})|\cdot(|N^{+}_{D_{W_{i}\cup\{u,v\}}}(w_{i})|+1)\big)\\&=|N^{-}_{D_{W_{i}\cup\{u\}}}(w_{i})|\cdot\big(2|N^{+}_{D_{W_{i}\cup\{u,v\}}}(w_{i})|+2\big)\\&\geq|N^{-}_{D_{W_{i}\cup\{u\}}}(w_{i})|\cdot\big(|N^{+}_{D_{W_{i}}}(w_{i})|+|N^{+}_{D_{W_{i}\cup\{u,v\}}}(w_{i})|\big)\\&=\mu_{D_{W_{i}}}(w_{i})+\mu_{D_{W_{i}\cup\{u,v\}}}(w_{i}).\end{aligned}$$

Finally, in the third case it holds that

  • $|N^{-}_{D_{W_{i}}}(w_{i})|=|N^{-}_{D_{W_{i}\cup\{u\}}}(w_{i})|=|N^{-}_{D_{W_{i}\cup\{u,v\}}}(w_{i})|$, and

  • $|N^{+}_{D_{W_{i}}}(w_{i})|=|N^{+}_{D_{W_{i}\cup\{u\}}}(w_{i})|=|N^{+}_{D_{W_{i}\cup\{u,v\}}}(w_{i})|$.

Thus, we get

$$2\mu_{D_{W_{i}\cup\{u\}}}(w_{i})=2\cdot|N^{-}_{D_{W_{i}\cup\{u\}}}(w_{i})|\cdot|N^{+}_{D_{W_{i}\cup\{u\}}}(w_{i})|=\mu_{D_{W_{i}}}(w_{i})+\mu_{D_{W_{i}\cup\{u,v\}}}(w_{i}).$$

Since $2\mu_{D_{W_{i}\cup\{u\}}}(w_{i})\geq\mu_{D_{W_{i}}}(w_{i})+\mu_{D_{W_{i}\cup\{u,v\}}}(w_{i})$ holds in all cases, summing over $i\in[k]$ gives $2d_{2}\geq d_{1}+d_{3}$, contradicting the inequality $2d_{2}<d_{1}+d_{3}$ derived from Inequality (5). This concludes the proof when $|T|=2$.

To generalize the result to larger sets $T$, we use induction on the size $\ell$ of $T$. As shown above, the base case $\ell=2$ holds. Next, assume the proposition is true for $\ell-1$. Let $T=\{v_{1},v_{2},\ldots,v_{\ell}\}$, and let $\sigma$ be an optimal elimination sequence. Let $T^{\prime}=T\setminus\{v_{\ell}\}$. By the inductive hypothesis (on $T^{\prime}$) and Proposition 1, we may assume that $\sigma=(\sigma_{1},v_{1},v_{2},\ldots,v_{\ell-1},\sigma_{2},v_{\ell},\sigma_{3})$ or $\sigma=(\sigma_{1},v_{\ell},\sigma_{2},v_{1},v_{2},\ldots,v_{\ell-1},\sigma_{3})$ for some sequences $\sigma_{1}$, $\sigma_{2}$, and $\sigma_{3}$. By applying the argument from the case $|T|=2$ to $v_{\ell-1}$ and $v_{\ell}$, or to $v_{\ell}$ and $v_{1}$, we obtain optimal solutions $(\sigma_{1},v_{1},v_{2},\ldots,v_{\ell-1},v_{\ell},\sigma_{2},\sigma_{3})$ or $(\sigma_{1},\sigma_{2},v_{\ell},v_{1},v_{2},\ldots,v_{\ell-1},\sigma_{3})$, respectively. This completes the proof. ∎
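Proposition 2 can likewise be checked exhaustively on small instances. In the Python sketch below (helper names are ours), $u$ and $v$ are false twins, and among all $3!$ total elimination orders the minimum cost is attained by orders eliminating $u$ and $v$ consecutively:

```python
from itertools import permutations

def eliminate(adj, v):
    """Eliminate v: connect each in-neighbor to each out-neighbor, delete v."""
    preds = [u for u in adj if v in adj[u]]
    succs = set(adj[v])
    for u in preds:
        adj[u] |= succs
        adj[u].discard(v)
    del adj[v]

def markowitz(adj, v):
    """Markowitz degree of v: in-degree times out-degree."""
    return sum(1 for u in adj if v in adj[u]) * len(adj[v])

def sequence_cost(adj, order):
    """Total cost of eliminating the vertices in `order`."""
    adj = {u: set(ws) for u, ws in adj.items()}
    cost = 0
    for v in order:
        cost += markowitz(adj, v)
        eliminate(adj, v)
    return cost

# Sources s1, s2; internal u, v (false twins) and w; sinks t1, t2.
D = {'s1': {'u', 'v'}, 's2': {'u', 'v'},
     'u': {'w', 't1'}, 'v': {'w', 't1'}, 'w': {'t1', 't2'}}
costs = {p: sequence_cost(D, p) for p in permutations(['u', 'v', 'w'])}
best = min(costs.values())
optimal_adjacent = [p for p, c in costs.items()
                    if c == best and abs(p.index('u') - p.index('v')) == 1]
print(best, len(optimal_adjacent) > 0)  # 12 True
```

Here the orders separating $u$ and $v$ (such as $(u,w,v)$) cost 14, while orders eliminating them consecutively cost 12.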

3 Structural Optimal Jacobian Accumulation is NP-complete

In this section, we show that Structural Optimal Jacobian Accumulation is NP-complete. We reduce from Vertex Cover, which is defined as follows.


Vertex Cover
Input: An undirected graph $G=(V,E)$ and an integer $k$.
Question: Is there a set $C\subseteq V$ of at most $k$ vertices such that each edge in $E$ has at least one endpoint in $C$?
Theorem 1.

Structural Optimal Jacobian Accumulation is NP-complete. Assuming the ETH, it cannot be solved in $2^{o(n+m)}$ time.

Proof.

We first show containment in NP. Note that a total elimination sequence is a permutation of $I$ and can therefore be encoded in polynomial space. Moreover, given such a sequence, we can compute its cost in polynomial time by performing the vertex eliminations one after another.

To show hardness, we reduce from Vertex Cover. It is well known that Vertex Cover is NP-hard and cannot be solved in $2^{o(n+m)}$ time unless the ETH fails [13, 14]. We will provide a linear reduction from Vertex Cover to Structural Optimal Jacobian Accumulation, thereby proving the theorem. To this end, let $(G=(V,E),k)$ be an input instance of Vertex Cover. Let $n=|V|$ and $m=|E|$. We create an equivalent instance $(D=(S\uplus I,A),k^{\prime})$ of Structural Optimal Jacobian Accumulation as follows. For each vertex $v\in V$, we create five vertices $v_{1},v_{2},v_{3},v_{4},v_{5}$. The vertices $v_{1}$, $v_{4}$, and $v_{5}$ are contained in $S$ for all $v\in V$. Vertices $v_{2}$ and $v_{3}$ are contained in $I$. Next, we add the set $\{(v_{1},v_{2}),(v_{2},v_{3}),(v_{2},v_{4}),(v_{2},v_{5}),(v_{3},v_{4}),(v_{3},v_{5})\}$ of arcs to $A$ for each $v\in V$. Finally, for each edge $\{u,v\}\in E$, we add the arcs $(u_{1},v_{3})$, $(u_{2},v_{3})$, $(v_{1},u_{3})$, and $(v_{2},u_{3})$ to $A$. Notice that every arc goes from a lower-indexed vertex to a higher-indexed vertex. Hence, the constructed digraph is a DAG. To finish the construction, we set $k^{\prime}=6m+4n+k$. An illustration of the construction is depicted in Figure 1.

Figure 1: The input graph is shown on the left and the constructed graph is shown on the right. An optimal solution (corresponding to the vertex cover that only contains the middle vertex) first eliminates the red vertex, then all blue vertices, then all green vertices, and finally the yellow vertex.

Since the reduction can clearly be computed in polynomial time, it only remains to show correctness. We proceed to show that $(G,k)$ is a yes-instance of Vertex Cover if and only if the constructed instance $(D,k^{\prime})$ is a yes-instance of Structural Optimal Jacobian Accumulation.

First, assume that $G$ contains a vertex cover $C$ of size at most $k$. We show that eliminating all vertices $v_{2}$ with $v\in C$ first, then $u_{3}$ for all $u\in V\setminus C$, followed by $u_{2}$ for all $u\in V\setminus C$, and finally all vertices $v_{3}$ for $v\in C$ results in a total cost of at most $k^{\prime}$. To see this, note that the cost of eliminating $v_{2}$ for any vertex $v\in C$ is $\deg(v)+3$, as $v_{2}$ has a single in-neighbor $v_{1}$, three out-neighbors $v_{3}$, $v_{4}$, and $v_{5}$, and $\deg(v)$ out-neighbors $u_{3}$ (one for each $u\in N(v)$). The cost of eliminating $u_{3}$ for any $u\notin C$ afterwards is $2\deg(u)+2$, as by construction $u_{3}$ has in-neighbors $\{w_{1}\mid w\in N(u)\}\cup\{u_{2}\}$, two out-neighbors $u_{4}$ and $u_{5}$, and no in-neighbors in $\{w_{2}\mid w\in N(u)\}$, since each $w\in N(u)$ is by definition in $C$ and hence the corresponding $w_{2}$ has been eliminated before. The cost of eliminating $u_{2}$ for $u\in V\setminus C$ afterwards is $\deg(u)+2$, as $u_{2}$ has the single in-neighbor $u_{1}$, two out-neighbors $u_{4}$ and $u_{5}$, and $\deg(u)$ out-neighbors $\{v_{3}\mid v\in C\}$ (note that $u$ cannot have neighbors in $V\setminus C$ since $C$ is a vertex cover and $u\notin C$). Finally, the cost of eliminating $v_{3}$ for any $v\in C$ is $2\deg(v)+2$, since $v_{3}$ has $\deg(v)+1$ in-neighbors $\{w_{1}\mid w\in N[v]\}$ and two out-neighbors $v_{4}$ and $v_{5}$. Summing these costs over all vertices and applying the handshake lemma (the sum of vertex degrees is twice the number of edges [5]) gives a total cost of

$$\begin{aligned}&\sum_{v\in C}(\deg(v)+3)+\sum_{u\in V\setminus C}(2\deg(u)+2)+\sum_{u\in V\setminus C}(\deg(u)+2)+\sum_{v\in C}(2\deg(v)+2)\\&\qquad=\sum_{v\in C}(3\deg(v)+5)+\sum_{u\in V\setminus C}(3\deg(u)+4)\\&\qquad=\sum_{v\in V}(3\deg(v)+4)+|C|\\&\qquad\leq 6m+4n+k=k^{\prime}.\end{aligned}$$

This shows that the constructed instance $(D,k^{\prime})$ is a yes-instance of Structural Optimal Jacobian Accumulation.

In the other direction, assume that there is an ordering $\sigma$ of the vertices in $I$ resulting in a total cost of at most $k^{\prime}$. Let $J\subseteq V$ be the set of vertices $v$ such that $v_{2}$ is eliminated before $v_{3}$, or $v_{3}$ is eliminated before $u_{2}$ for some $u\in N(v)$, by $\sigma$. We will show that $J$ is a vertex cover of size at most $k$ in $G$. To this end, we first provide a lower bound for the cost of eliminating any vertex, regardless of which vertices have been eliminated previously. Note that $v_{3}$ for any vertex $v\in V$ has two out-neighbors $v_{4}$ and $v_{5}$ in $S$, $\deg(v)$ in-neighbors $\{w_{1}\mid w\in N(v)\}$ in $S$, and one additional in-neighbor, namely $v_{2}$ if $v_{2}$ has not been eliminated before and $v_{1}$ otherwise. Hence, the cost of eliminating $v_{3}$ is at least $2\deg(v)+2$. Moreover, the cost of eliminating $v_{2}$ for any vertex $v\in V$ is at least $\deg(v)+2$, as $v_{2}$ has the in-neighbor $v_{1}\in S$, two out-neighbors $v_{4},v_{5}\in S$, and for each $w\in N(v)$ at least one additional out-neighbor ($w_{3}$ if $w_{3}$ has not been eliminated before, or $w_{4}$ and $w_{5}$ if it has). Summing these costs over all vertices (and again applying the handshake lemma) gives a lower bound of $6m+4n=k^{\prime}-k$.

The next step is to prove that $J$ contains at most $k$ vertices. To this end, note that for each vertex $v\in J$, the cost increases by at least one over the analyzed lower bound. If $v_{3}$ is eliminated after $v_{2}$ for some $v\in J$, then the cost of eliminating $v_{2}$ increases by one, as $v_{2}$ has the additional out-neighbor $v_{3}$. If $v_{3}$ is eliminated before $u_{2}$ for some $v\in J$ and $u\in N(v)$, then the cost of eliminating $u_{2}$ increases by one, as the out-neighbor $v_{3}$ is replaced by the two out-neighbors $v_{4}$ and $v_{5}$. This immediately implies that $|J|\leq k$.

Finally, we show that $J$ is a vertex cover. Assume towards a contradiction that this is not the case. Then, there is some edge $\{u,v\}\in E$ with $u\notin J$ and $v\notin J$. By definition of $J$, the sequence $\sigma$ eliminates $u_{3}$ before $u_{2}$, $u_{3}$ after $v_{2}$, $v_{3}$ before $v_{2}$, and $v_{3}$ after $u_{2}$. This yields the cyclic chain of elimination-order constraints $v_{2}\prec u_{3}\prec u_{2}\prec v_{3}\prec v_{2}$, a contradiction. Thus, $J$ is a vertex cover of size at most $k$, and the initial instance of Vertex Cover is therefore a yes-instance. This concludes the proof. ∎
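The reduction is small enough to verify by brute force on a tiny instance. The Python sketch below (with our helper names `build_reduction` and `sequence_cost`) constructs $D$ for a path on three vertices, whose minimum vertex cover has size 1, and confirms that the cheapest total elimination sequence has cost exactly $6m+4n+1$:

```python
from itertools import permutations

def eliminate(adj, v):
    preds = [u for u in adj if v in adj[u]]
    succs = set(adj[v])
    for u in preds:
        adj[u] |= succs
        adj[u].discard(v)
    del adj[v]

def markowitz(adj, v):
    return sum(1 for u in adj if v in adj[u]) * len(adj[v])

def sequence_cost(adj, order):
    adj = {u: set(ws) for u, ws in adj.items()}
    cost = 0
    for v in order:
        cost += markowitz(adj, v)
        eliminate(adj, v)
    return cost

def build_reduction(vertices, edges):
    """Construct the DAG of the reduction; (v, i) plays the role of v_i."""
    adj = {(v, i): set() for v in vertices for i in range(1, 6)}
    for v in vertices:
        adj[(v, 1)] = {(v, 2)}
        adj[(v, 2)] = {(v, 3), (v, 4), (v, 5)}
        adj[(v, 3)] = {(v, 4), (v, 5)}
    for u, v in edges:
        adj[(u, 1)].add((v, 3)); adj[(u, 2)].add((v, 3))
        adj[(v, 1)].add((u, 3)); adj[(v, 2)].add((u, 3))
    return adj

V, E = ['a', 'b', 'c'], [('a', 'b'), ('b', 'c')]  # path; min vertex cover {b}
adj = build_reduction(V, E)
internal = [(v, i) for v in V for i in (2, 3)]
best = min(sequence_cost(adj, p) for p in permutations(internal))
print(best)  # 6*m + 4*n + 1 = 6*2 + 4*3 + 1 = 25
```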

4 Minimum Edge Count is NP-complete

In this section we show that Minimum Edge Count is NP-complete and, assuming the ETH, it cannot be solved in subexponential time. To this end, we reduce from Independent Set, which is defined as follows.


Independent Set
Input: An undirected graph $G=(V,E)$ and an integer $k$.
Question: Is there a set $X\subseteq V$ of at least $k$ vertices such that no edge in $E$ has both endpoints in $X$?
Theorem 2.

Minimum Edge Count is NP-complete. Assuming the ETH, it cannot be solved in $2^{o(n+m)}$ time.

Proof.

We again start by showing containment in NP. We can encode a (not necessarily total) elimination sequence in polynomial space. Moreover, given such a sequence, we can compute the resulting DAG in polynomial time and verify that it contains at most $k$ edges.
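Such a certificate check is straightforward to implement. The following is a minimal sketch (the function names are ours, not from the paper), assuming the DAG is represented as a set of arc pairs:

```python
def eliminate(arcs, v):
    # Vertex elimination: every in-neighbor of v gains an arc to every
    # out-neighbor of v; afterwards v and its incident arcs are removed.
    preds = {u for (u, w) in arcs if w == v}
    succs = {w for (u, w) in arcs if u == v}
    rest = {(u, w) for (u, w) in arcs if u != v and w != v}
    return rest | {(u, w) for u in preds for w in succs}

def verify(arcs, sequence, k):
    # NP certificate check: eliminate the vertices in the given order
    # and test whether at most k arcs remain.
    for v in sequence:
        arcs = eliminate(arcs, v)
    return len(arcs) <= k
```

For instance, in the DAG with arcs $s\to a\to t$ and $s\to b\to t$, eliminating both $a$ and $b$ leaves only the single arc $(s,t)$.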

To show hardness, we reduce from Independent Set in 2-degenerate subcubic graphs of girth at least five, that is, graphs in which each vertex has between two and three neighbors and each cycle has length at least five. This problem is known to be NP-hard and cannot be solved in $2^{o(n+m)}$ time assuming the ETH [15]. We will provide a linear reduction from that problem to Minimum Edge Count to show the theorem. To this end, let $(G=(V=\{v_1,v_2,\ldots,v_n\},E),k)$ be an instance of Independent Set where each vertex has between two and three neighbors and no cycles of length three or four are present in $G$. We will construct an instance $(D=(S\uplus I,A),k')$ of Minimum Edge Count. We begin by imposing an arbitrary total order $\pi$ on the vertex set $V$. We partition the vertices into four types based on their degree and the order $\pi$ as follows. Vertices of type 1 have degree 3 and either all neighbors come earlier with respect to $\pi$ or all neighbors come later. Vertices of type 2 have degree 3 and have at least one earlier and at least one later neighbor with respect to $\pi$. Vertices of type 3 have degree 2 and either both neighbors come earlier or both come later with respect to $\pi$. Finally, vertices of type 4 have degree 2 and one neighbor comes earlier with respect to $\pi$ while the other comes later.

We now describe the construction of $D$, depicted in Figure 2. We begin by creating a set $T\subseteq S$ of four sink vertices. Next, for each $v_i\in V$, we create a vertex $u_i\in I$ as well as two vertex sets $I_i\subseteq S$ and $O_i\subseteq S$, both of size 4. We add arcs from each vertex in $I_i$ to every vertex in $\{u_i\}\cup T$ to $A$. We also add arcs from $u_i$ to each vertex in $O_i\cup T$ to $A$. Finally, we add arcs from $I_i$ to $O_i$ based on the type of $v_i$ as follows. Note that since $|I_i|=|O_i|=4$, there are up to 16 possible arcs from vertices in $I_i$ to vertices in $O_i$. For type-1 vertices, we add 14 of the possible arcs (it does not matter which arcs we add). For vertices of types 2, 3, and 4, we add 16, 11, and 12 arcs to $A$, respectively. After completing this procedure for every vertex in $V$, we add an arc $(u_i,u_j)$ for any edge $\{v_i,v_j\}\in E$ where $v_i$ comes before $v_j$ in the order $\pi$. This concludes the construction of the graph $D$. Let $n'$ and $m'$ be the number of vertices and arcs in $D$. To conclude the construction, we set $k'=m'-k$. Observe that $D$ is a DAG and the number of vertices and arcs is linear in $n+m$.

Figure 2: An example instance of Independent Set on the left and the constructed instance on the right. We assume the order $\pi$ to be $v_1$ first, then $v_2$, and $v_3$ last. Each large node $I_i$, $O_i$, and $T$ (shaded gray) represents an independent set of size 4. A bold arc between two nodes represents all possible arcs between the respective vertex sets (in one direction) unless a number is shown next to the arc, in which case the number gives the number of arcs between the two sets of vertices.
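The construction above can be sketched in code as follows (a sketch under our own naming conventions; the particular choice of which $I_i\to O_i$ arcs to include is arbitrary, as noted above):

```python
from itertools import product

def build_instance(vertices, edges, k):
    # vertices: a list fixing the arbitrary total order pi.
    # edges: unordered pairs; every vertex is assumed to have degree 2 or 3,
    # and the graph is assumed to have girth at least five.
    pos = {v: i for i, v in enumerate(vertices)}
    nbrs = {v: set() for v in vertices}
    for x, y in edges:
        nbrs[x].add(y)
        nbrs[y].add(x)

    T = [("T", s) for s in range(4)]                # four common sinks
    arcs = set()
    for v in vertices:
        u = ("u", v)
        I = [("I", v, s) for s in range(4)]
        O = [("O", v, s) for s in range(4)]
        for x in I:                                 # I_i -> {u_i} union T
            arcs.add((x, u))
            arcs.update((x, t) for t in T)
        arcs.update((u, y) for y in O)              # u_i -> O_i union T
        arcs.update((u, t) for t in T)
        earlier = [w for w in nbrs[v] if pos[w] < pos[v]]
        later = [w for w in nbrs[v] if pos[w] > pos[v]]
        if len(nbrs[v]) == 3:                       # type 1 vs. type 2
            count = 16 if (earlier and later) else 14
        else:                                       # type 3 vs. type 4
            count = 12 if (earlier and later) else 11
        for x, y in list(product(I, O))[:count]:    # arbitrary choice of arcs
            arcs.add((x, y))
    for x, y in edges:                              # u_i -> u_j along pi
        a, b = (x, y) if pos[x] < pos[y] else (y, x)
        arcs.add((("u", a), ("u", b)))
    return arcs, len(arcs) - k                      # (A, k' = m' - k)
```

On a 5-cycle, for example, every vertex has degree two, and the construction produces $5\cdot 28$ fixed gadget arcs, $11+12+12+12+11$ arcs from the $I_i$ to the $O_i$, and five arcs between the $u_i$, for $m'=203$ arcs in total.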

Since the reduction can be computed in polynomial time and the constructed instance is linear in the size of the input instance, it only remains to show correctness. To this end, first assume that there exists an independent set $X\subseteq V$ of size $k$ in $G$. We eliminate the vertices $X'=\{u_i\mid v_i\in X\}$. Note that Proposition 1 ensures that the order in which we eliminate these vertices does not matter. We will show that the resulting graph contains at most $k'$ arcs.

First, note that new arcs between vertices $u_i$ and $u_j$ might be created while eliminating vertices in $X'$. However, since $X$ is an independent set, such arcs are only created between vertices with $N(v_i)\cap N(v_j)\neq\emptyset$. Since $G$ has girth at least 5, if the elimination of a vertex in $X'$ could create the arc $(u_i,u_j)$, then this arc was not there initially and was not added by the elimination of a different vertex in $X'$. We next show that the elimination of each vertex in $X'$ reduces the number of arcs in the graph by exactly one. Let $u_i\in X'$ be an arbitrary vertex. If $v_i$ has degree three, then $u_i$ has exactly 15 incident arcs, where 12 are to or from $I_i$, $O_i$, and $T$ and three are to vertices $u_{j_1},u_{j_2},u_{j_3}$. The number of new arcs created in this case is 14, as shown next. For each $\ell\in[3]$, four new arcs are created from vertices in $I_i$ to $u_{j_\ell}$ if $v_i$ comes before $v_{j_\ell}$ with respect to $\pi$, and four new arcs from $u_{j_\ell}$ to vertices in $O_i$ are created otherwise. Hence, in any case, 12 new arcs are created. If $v_i$ is of type 1, then two additional arcs are created from vertices in $I_i$ to vertices in $O_i$. If $v_i$ is of type 2, then two additional arcs are created between the vertices $u_{j_1}$, $u_{j_2}$, and $u_{j_3}$. Hence, if $v_i$ has degree 3, then the elimination of vertex $u_i$ removes 15 arcs and creates 14 new ones, that is, the number of arcs decreases by one. If $v_i$ has degree 2, then $u_i$ is incident to exactly 14 arcs (12 to and from vertices in $I_i$, $O_i$, and $T$ and two additional arcs to or from vertices $u_{j_1}$ and $u_{j_2}$). The number of new arcs created in this case is 13, as shown next. For each $\ell\in[2]$, four new arcs are created from vertices in $I_i$ to $u_{j_\ell}$ if $v_i$ comes before $v_{j_\ell}$ with respect to $\pi$, and four new arcs from $u_{j_\ell}$ to vertices in $O_i$ are created otherwise. In any case, 8 new arcs are created this way. If $v_i$ is of type 3, then five additional arcs are created from vertices in $I_i$ to vertices in $O_i$. If $v_i$ is of type 4, then four additional arcs are created from vertices in $I_i$ to vertices in $O_i$ and one additional arc is created between $u_{j_1}$ and $u_{j_2}$. Hence, if $v_i$ has degree 2, then the elimination of vertex $u_i$ removes 14 arcs and creates 13 new ones, that is, the number of arcs also decreases by one in this case. Since $k'=m'-k$ and each of the $k$ eliminations decreases the number of arcs by one, the resulting graph contains at most $k'$ arcs, showing that the constructed instance is a yes-instance.
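The bookkeeping in the two cases can be summarized as

$$\deg(v_i)=3:\quad 12+3=15 \text{ arcs removed},\qquad 3\cdot 4+2=14 \text{ arcs created},$$
$$\deg(v_i)=2:\quad 12+2=14 \text{ arcs removed},\qquad 2\cdot 4+5=13 \text{ arcs created},$$

so in either case the number of arcs decreases by exactly one.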

For the other direction, assume that $X'$ is a solution to $(D,k')$, that is, $D_{X'}$ has at most $k'$ arcs. Further, assume that $X'$ is minimal in the sense that, for each $u\in X'$, $D_{X'\setminus\{u\}}$ contains more arcs than $D_{X'}$. Note that this notion is well defined due to Proposition 1. Moreover, given any solution, a minimal one can be computed in polynomial time. Let $X=\{v_i\mid u_i\in X'\}$. We will show that $X$ induces an independent set in $G$ and that $|X|\geq k$, that is, $X$ is an independent set of size at least $k$ in $G$ and the original instance is therefore a yes-instance.

Assume toward a contradiction that $X$ is not an independent set in $G$, that is, there exist vertices $u_i,u_j\in X'$ such that $\{v_i,v_j\}\in E$. We claim that $X'$ is not minimal in this case. To prove this claim, note that eliminating a vertex $u_i$ does not decrease the in-degree or out-degree of any vertex $u_j$ (at any stage during an elimination sequence), and if $\{v_i,v_j\}\in E$, then one of the degrees of $u_j$ increases. If $u_i$ is neither an in-neighbor nor an out-neighbor of $u_j$, then eliminating $u_i$ does not change either degree of $u_j$. If $u_i$ is an in-neighbor, then the out-degree of $u_j$ remains unchanged and the in-degree increases, as the vertices in $I_i$ become in-neighbors of $u_j$ (and they cannot be in-neighbors of $u_j$ while $u_i$ is not eliminated). If $u_i$ is an out-neighbor of $u_j$, then the in-degree of $u_j$ remains unchanged and the out-degree increases, as the vertices in $O_i$ become new out-neighbors of $u_j$. Let $d$ be the number of vertices $w$ such that $(w,u_j)$ or $(u_j,w)$ is an arc in $D_{X'\setminus\{u_j\}}$ and $w\notin I_j\cup O_j\cup T$. Note that $d>\deg(v_j)$ since $u_i\in X'\setminus\{u_j\}$ and $\{v_i,v_j\}\in E$. Eliminating $u_j$ in $D_{X'\setminus\{u_j\}}$ removes $d+12$ arcs. If $v_j$ has degree 3, then eliminating $u_j$ in $D_{X'\setminus\{u_j\}}$ creates at least $4d=d+3d\geq d+12$ arcs since (i) $d>\deg(v_j)=3$ implies $d\geq 4$ and (ii) eliminating $u_j$ creates four arcs between vertices in $I_j\cup O_j$ and each other (in- or out-)neighbor of $u_j$ except for vertices in $T$. If $v_j$ has degree 2, then eliminating $u_j$ creates at least four arcs from vertices in $I_j$ to vertices in $O_j$ plus at least $4d$ further arcs, that is, at least $4d+4=d+3d+4\geq d+13>d+12$ arcs, since $d>\deg(v_j)=2$ implies $d\geq 3$. Hence, in any case the number of newly created arcs is at least as large as the number of removed arcs. That is, the number of arcs does not decrease, showing that $X'$ is not a minimal solution.

It only remains to show that $|X|\geq k$. As analyzed in the forward direction, the elimination of any vertex $u_i$ reduces the number of arcs by exactly one if no vertex $u_j$ with $\{v_i,v_j\}\in E$ was eliminated before. Since $k'=m'-k$, this shows that $|X|\geq k$, concluding the proof. ∎

5 Algorithms

In this section, we give two simple algorithms showing that Structural Optimal Jacobian Accumulation and Minimum Edge Count can be solved in $O^*(2^n)$ time. We begin with Minimum Edge Count.

Proposition 3.

Minimum Edge Count can be solved in $O(2^n n^3)$ time and with polynomial space.

Proof.

By Proposition 1, the order in which the vertices of an optimal solution are eliminated is irrelevant. Hence, we can simply test, for each subset $X$ of vertices, how many arcs remain when the vertices in $X$ are eliminated. Since there are $2^n$ possible subsets and each of the at most $n$ eliminations for each subset can be computed in $O(n^2)$ time, all subsets can be tested in $O(2^n n^3)$ time. ∎
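The algorithm admits a direct implementation. The following sketch (our naming) enumerates all subsets of the eliminable vertices:

```python
from itertools import combinations

def eliminate(arcs, v):
    # Vertex elimination: bypass v via in-neighbor/out-neighbor arcs,
    # then delete v and its incident arcs.
    preds = {u for (u, w) in arcs if w == v}
    succs = {w for (u, w) in arcs if u == v}
    rest = {(u, w) for (u, w) in arcs if u != v and w != v}
    return rest | {(u, w) for u in preds for w in succs}

def min_edge_count(arcs, internal):
    # By Proposition 1 the elimination order within a subset is irrelevant,
    # so it suffices to try every subset of the eliminable vertices.
    best = len(arcs)
    for r in range(len(internal) + 1):
        for subset in combinations(internal, r):
            cur = arcs
            for v in subset:
                cur = eliminate(cur, v)
            best = min(best, len(cur))
    return best
```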

We continue with Structural Optimal Jacobian Accumulation, where we use an algorithmic framework due to Bodlaender et al. [3].

Proposition 4.

Structural Optimal Jacobian Accumulation can be solved in $O(2^n n^4)$ time. It can also be solved in $O(4^n n^3)$ time using polynomial space.

Proof.

As shown by Bodlaender et al. [3], any vertex ordering problem can be solved in $O(2^n n^{c+1})$ time, and in $O(4^n n^c)$ time using polynomial space, if it can be reformulated as $\min_\pi \sum_{v\in V} f(D,\pi_{<v},v)$, where $\pi$ is a permutation of the vertices, $\pi_{<v}$ is the set of all vertices that appear before $v$ in $\pi$, and $f$ can be computed in $O(n^c)$ time. We show that Structural Optimal Jacobian Accumulation fits into this framework (with $c=3$). We only consider vertices in $I$, that is, non-terminal vertices, as these are exactly the vertices to be eliminated. We use the function

$$f(D,\pi_{<v},v)=|N^{-}_{D_{\pi_{<v}}}(v)|\cdot|N^{+}_{D_{\pi_{<v}}}(v)|.$$

Note that we can compute $D_{\pi_{<v}}$ (and therefore $f$) in $O(n^3)$ time. Moreover, given a permutation $\pi$, the cost of eliminating all vertices in $I$ corresponds exactly to $\sum_{v\in V} f(D,\pi_{<v},v)$, as the cost of eliminating a vertex $v$ in a solution sequence following $\pi$ is exactly $|N^{-}_{D_{\pi_{<v}}}(v)|\cdot|N^{+}_{D_{\pi_{<v}}}(v)|=f(D,\pi_{<v},v)$. This concludes the proof. ∎
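A sketch of the resulting subset dynamic program (the $O(2^n n^4)$-time variant; function names are ours) might look as follows, where dp[S] stores the cheapest cost of eliminating exactly the vertices in S:

```python
from itertools import combinations

def eliminate(arcs, v):
    # Vertex elimination: bypass v, then delete it.
    preds = {u for (u, w) in arcs if w == v}
    succs = {w for (u, w) in arcs if u == v}
    rest = {(u, w) for (u, w) in arcs if u != v and w != v}
    return rest | {(u, w) for u in preds for w in succs}

def cost_f(arcs, done, v):
    # f(D, pi_<v, v): eliminate the vertices preceding v (their internal
    # order does not affect the resulting DAG), then multiply the
    # remaining in-degree and out-degree of v.
    for u in done:
        arcs = eliminate(arcs, u)
    indeg = sum(1 for (a, b) in arcs if b == v)
    outdeg = sum(1 for (a, b) in arcs if a == v)
    return indeg * outdeg

def optimal_accumulation(arcs, internal):
    # dp[S] = cheapest total cost of eliminating exactly the vertices in S.
    dp = {frozenset(): 0}
    for size in range(1, len(internal) + 1):
        for S in map(frozenset, combinations(internal, size)):
            dp[S] = min(dp[S - {v}] + cost_f(arcs, S - {v}, v) for v in S)
    return dp[frozenset(internal)]
```

On the DAG with arcs $s\to a\to b\to t$ and $s\to b$, for example, eliminating $a$ before $b$ costs $1+1=2$, while the reverse order costs $2+1=3$; the program returns the cheaper value.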

6 Conclusion

We have resolved a pair of longstanding open questions by showing that Structural Optimal Jacobian Accumulation and Minimum Edge Count are both NP-complete. Our progress opens the door to many interesting questions. On the theoretical side, a key next step is to understand the complexities of both problems under the more expressive edge elimination operation [18]. There are also promising opportunities to develop approximation algorithms and/or establish lower bounds.

Acknowledgments

The authors would like to sincerely thank Paul Hovland for drawing their attention to the studied problems at Dagstuhl Seminar 24201, for insightful discussions, and for generously reviewing a preliminary version of this manuscript, providing valuable feedback and comments.

MB was supported by the European Research Council (ERC) project LOPRE (819416) under the Horizon 2020 research and innovation program. AC, YM, and BDS were partially supported by the Gordon & Betty Moore Foundation under grant GBMF4560 to BDS.

References

  • [1] S. G. Aksoy, R. Bennink, Y. Chen, J. Frías, Y. R. Gel, B. Kay, U. Naumann, C. O. Marrero, A. V. Petyuk, S. Roy, I. Segovia-Dominguez, N. Veldt, and S. J. Young. Seven open problems in applied combinatorics. Journal of Combinatorics, 14(4):559–601, 2023.
  • [2] F. L. Bauer. Computational graphs and rounding error. SIAM Journal on Numerical Analysis, 11(1):87–96, 1974.
  • [3] H. L. Bodlaender, F. V. Fomin, A. M. C. A. Koster, D. Kratsch, and D. M. Thilikos. A note on exact algorithms for vertex ordering problems on graphs. Theory of Computing Systems, 50(3):420–432, 2012.
  • [4] J. Chen, P. Hovland, T. Munson, and J. Utke. An integer programming approach to optimal derivative accumulation. In Proceedings of the 6th International Conference on Automatic Differentiation, pages 221–231, Berlin Heidelberg, 2012. Springer.
  • [5] R. Diestel. Graph Theory. Springer, Berlin Heidelberg, 2012.
  • [6] S. A. Forth, M. Tadjouddine, J. D. Pryce, and J. K. Reid. Jacobian code generated by source transformation and vertex elimination can be as efficient as hand-coding. ACM Transactions on Mathematical Software, 30(3):266–299, 2004.
  • [7] M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, USA, 1979.
  • [8] A. Griewank. Some bounds on the complexity of gradients, Jacobians, and Hessians. In Complexity in numerical optimization, pages 128–162. World Scientific, Berlin Heidelberg, 1993.
  • [9] A. Griewank and U. Naumann. Accumulating Jacobians as chained sparse matrix products. Mathematical Programming, 95:555–571, 2003.
  • [10] A. Griewank and O. Vogel. Analysis and exploitation of Jacobian scarcity. In Proceedings of the 2nd International Conference on High Performance Scientific Computing, pages 149–164, Berlin Heidelberg, 2003. Springer.
  • [11] A. Griewank and A. Walther. Evaluating derivatives: principles and techniques of algorithmic differentiation. SIAM, Philadelphia, 2008.
  • [12] R. Impagliazzo and R. Paturi. On the complexity of k-SAT. Journal of Computer and System Sciences, 62(2):367–375, 2001.
  • [13] R. Impagliazzo, R. Paturi, and F. Zane. Which problems have strongly exponential complexity? Journal of Computer and System Sciences, 63(4):512–530, 2001.
  • [14] R. M. Karp. Reducibility among Combinatorial Problems, pages 85–103. Springer US, Boston, MA, 1972.
  • [15] C. Komusiewicz. Tight running time lower bounds for vertex deletion problems. ACM Transactions on Computation Theory, 10(2):6:1–6:18, 2018.
  • [16] A. Lyons and J. Utke. On the practical exploitation of scarsity. In Proceedings of the 5th International Conference on Automatic Differentiation, pages 103–114, Berlin Heidelberg, 2008. Springer.
  • [17] V. Mosenkis and U. Naumann. On optimality preserving eliminations for the minimum edge count and optimal Jacobian accumulation problems in linearized DAGs. Optimization Methods and Software, 27(2):337–358, 2012.
  • [18] U. Naumann. Elimination Techniques for Cheap Jacobians, pages 247–253. Springer, Berlin Heidelberg, 2002.
  • [19] U. Naumann. Optimal Jacobian accumulation is NP-complete. Mathematical Programming, 112:427–441, 2008.
  • [20] U. Naumann. The art of differentiating computer programs: An introduction to algorithmic differentiation. SIAM, Philadelphia, 2011.
  • [21] J. D. Pryce and E. M. Tadjouddine. Fast automatic differentiation Jacobians by compact LU factorization. SIAM Journal on Scientific Computing, 30(4):1659–1677, 2008.
  • [22] A. Quarteroni, R. Sacco, and F. Saleri. Numerical mathematics. Springer Science & Business Media, Berlin Heidelberg, 2006.
  • [23] D. J. Rose and R. E. Tarjan. Algorithmic aspects of vertex elimination on directed graphs. SIAM Journal on Applied Mathematics, 34(1):176–197, 1978.