Finding the KT partition of a weighted graph in near-linear time
Abstract
In a breakthrough work, Kawarabayashi and Thorup (J. ACM’19) gave a near-linear time deterministic algorithm to compute the weight of a minimum cut in a simple graph $G = (V, E)$. A key component of this algorithm is finding the $(1+\varepsilon)$-KT partition of $G$, the coarsest partition $\{P_1, \ldots, P_k\}$ of $V$ such that for every non-trivial $(1+\varepsilon)$-near minimum cut with sides $\{S, \bar S\}$ it holds that $P_i$ is contained in either $S$ or $\bar S$, for $i = 1, \ldots, k$. In this work we give a near-linear time randomized algorithm to find the $(1+\varepsilon)$-KT partition of a weighted graph. Our algorithm is quite different from that of Kawarabayashi and Thorup and builds on Karger’s framework of tree-respecting cuts (J. ACM’00).
We describe a number of applications of the algorithm. (i) The algorithm makes progress towards a more efficient algorithm for constructing the polygon representation of the set of near-minimum cuts in a graph. This is a generalization of the cactus representation, and was initially described by Benczúr (FOCS’95). (ii) We improve the time complexity of a recent quantum algorithm for minimum cut in a simple graph in the adjacency list model from $\widetilde O(\sqrt{mn} + n^{3/2})$ to $\widetilde O(\sqrt{mn})$, when the graph has $n$ vertices and $m$ edges. (iii) We describe a new type of randomized algorithm for minimum cut in simple graphs with complexity $O(m) + \widetilde O(n)$. For graphs that are not too sparse, this matches the complexity of the current best algorithm, which uses a different approach based on random contractions.
The key technical contribution of our work is the following. Given a weighted graph $G$ with $m$ edges and a spanning tree $T$ of $G$, consider the graph $H$ whose nodes are the edges of $T$, and where there is an edge between two nodes of $H$ iff the corresponding 2-respecting cut of $T$ is a non-trivial near-minimum cut of $G$. We give a deterministic $\widetilde O(m)$ time algorithm to compute a spanning forest of $H$.
1 Introduction
Given a weighted and undirected graph $G$ with $n$ vertices and $m$ edges (throughout this paper we will use $n$ and $m$ to denote the number of vertices and edges of the input graph), the minimum cut problem is to find the minimum weight of a set of edges whose removal disconnects $G$. When $G$ is unweighted, this is simply the minimum number of edges whose removal disconnects $G$, also known as the edge connectivity of $G$. The minimum cut problem is a fundamental problem in theoretical computer science whose study goes back to at least the 1960s, when the first polynomial time algorithm computing edge connectivity was given by Gomory and Hu [GH61]. In the current state-of-the-art, there are near-linear time randomized algorithms for the minimum cut problem in weighted graphs [Kar00, GMW20, MN20] and near-linear time deterministic algorithms in the case of simple graphs (a simple graph is an unweighted graph with no self-loops and at most one edge between any pair of vertices) [KT19, HRW20]. Very recently, Li [Li21] has given an almost-linear time (i.e., $m^{1+o(1)}$ time) deterministic algorithm for weighted graphs as well.
The best known algorithms for weighted graphs all rely on a framework developed by Karger [Kar00] which, for an input graph $G$, relies on finding $O(\log n)$ spanning trees of $G$ such that with high probability one of these spanning trees will contain at most 2 edges of a minimum cut of $G$. In this case the cut is said to 2-respect the tree. A key insight of Karger is that, given a spanning tree $T$ of $G$, the problem of finding a 2-respecting cut of $T$ that has minimum weight in $G$ can be solved deterministically in near-linear time, specifically in time $O(m \log^2 n)$. After standing for 20 years, the bound for this minimum-weight 2-respecting cut problem was recently improved by Gawrychowski, Mozes, and Weimann [GMW20], who gave a deterministic $O(m \log n)$ time algorithm, and independently by Mukhopadhyay and Nanongkai [MN20], who gave a randomized algorithm with near-linear complexity.
The best algorithms in the case of a simple graph rely on a quite different approach, pioneered by Kawarabayashi and Thorup [KT19]. This approach begins by finding the minimum degree $\delta$ of a vertex in $G$. Then the question becomes whether there is a non-trivial cut, i.e. a cut where both sides of the corresponding bipartition have cardinality at least 2, whose weight is less than $\delta$. This problem is solved by finding what we call the $(1+\varepsilon)$-KT partition of the graph, for a small constant $\varepsilon > 0$. Let $\mathcal{B}^{nt}_\varepsilon$ be the set of all bipartitions of the vertex set corresponding to non-trivial cuts whose weight is at most $(1+\varepsilon)\delta$. The $(1+\varepsilon)$-KT partition of $G$ is the coarsest partition $\{P_1, \ldots, P_k\}$ of the vertex set such that for any $\{S, \bar S\} \in \mathcal{B}^{nt}_\varepsilon$ it holds that $P_i$ is contained in either $S$ or $\bar S$, for each $i$. If one considers the multigraph $G'$ formed from $G$ by identifying vertices in the same set $P_i$, then $G'$ preserves all non-trivial near-minimum cuts of $G$. Kawarabayashi and Thorup further show that $G'$ only has $\widetilde O(m/\delta)$ edges. This bound crucially uses that the original graph is simple. The edge connectivity of $G$ is thus the minimum of $\delta$ and the edge connectivity of $G'$. One can use Gabow’s deterministic edge connectivity algorithm [Gab95], which for a multigraph with $m'$ edges and edge connectivity $\lambda$ runs in $\widetilde O(\lambda m')$ time, to check in $\widetilde O(m)$ time if the edge connectivity of $G'$ is less than $\delta$ and, if so, compute it. In the most technical part of their work, Kawarabayashi and Thorup give a deterministic algorithm to find the $(1+\varepsilon)$-KT partition of a simple graph in near-linear time, giving a near-linear time deterministic algorithm overall for edge connectivity. The key tool in their algorithm is the PageRank algorithm, which they use for finding low conductance cuts in the graph.
The KT partition has proven to be a very useful concept. Rubinstein, Schramm, and Weinberg [RSW18] also go through the $(1+\varepsilon)$-KT partition to give a near-optimal randomized query algorithm determining the edge connectivity of a simple graph in the cut query model. In the cut query model one can query a subset $S$ of the vertices and receive in return the number of edges with exactly one endpoint in $S$. En route to their result, [RSW18] also improved the bound on the number of inter-component edges in the $(1+\varepsilon)$-KT partition of a simple graph to $O(n)$, for any $\varepsilon < 1$. In the case $\varepsilon = 0$ this was independently done by Lo, Schmidt, and Thorup [LST20]. The KT partition approach is also used in the current best randomized algorithm for edge connectivity, which runs in time $O(m + n \log^2 n)$ [GNT20]. (The bound quoted in [GNT20] is $O(m + n \log^3 n)$, but the improvement to Karger’s algorithm by [GMW20] reduces this to $O(m + n \log^2 n)$.)
1.1 Our results
In this work we give the first near-linear time randomized algorithm to find the $(1+\varepsilon)$-KT partition of a weighted graph, for $0 \le \varepsilon < 1$. An interesting aspect of our algorithm is that it uses Karger’s 2-respecting cut framework to find the $(1+\varepsilon)$-KT partition, thereby combining the two aforementioned major lines of work on the minimum cut problem. This makes progress on a number of problems.
1. The polygon representation is a compact representation of the set of near-minimum cuts of a weighted graph, originally described by Benczúr [Ben95, Ben97] and Benczúr–Goemans [BG08]. It extends the cactus representation [DKL76], which only works for the set of exact minimum cuts, and has played a key role in recent breakthroughs on the traveling salesperson problem [GSS11, KKG21]. For a general weighted graph the polygon representation has size $O(n^2)$, and Benczúr has given a randomized algorithm to construct a polygon representation of the near-mincuts of a graph in time $\widetilde O(n^2)$ [Ben97, Section 6.3] by building on the Karger–Stein algorithm. It is an open question whether we can construct a polygon representation faster, e.g. in near-linear time. In his thesis [Ben97, pg. 126], Benczúr says, “It already seems hard to directly identify the system of atoms within the time bound,” where the system of atoms is defined analogously to the $(1+\varepsilon)$-KT partition but for the set of all $(1+\varepsilon)$-near minimum cuts, not just the non-trivial ones. One can easily construct the set of atoms from a $(1+\varepsilon)$-KT partition, thus our KT partition algorithm gives a $\widetilde O(m)$ time algorithm for this task as well, making progress on this open question.
2. The $(1+\varepsilon)$-KT partition of a weighted graph is exactly what is needed to give an optimal quantum algorithm for minimum cut: Apers and Lee [AL21] showed that the quantum query and time complexity of minimum cut in the adjacency matrix model is $\widetilde\Theta(n^{3/2}\sqrt{\tau})$ for a weighted graph where the ratio of the largest to smallest edge weights is $\tau$, with the algorithm proceeding by finding a $(1+\varepsilon)$-KT partition.
In the case where the graph is instead represented as an adjacency list, they gave an algorithm with query complexity $\widetilde O(\sqrt{mn\tau})$ but whose running time is larger, at $\widetilde O(\sqrt{mn\tau} + n^{3/2})$. The bottleneck in the time complexity is the time taken to find a $(1+\varepsilon)$-KT partition of a weighted graph with $\widetilde O(n)$ edges. Using the near-linear time randomized algorithm we give here to find a $(1+\varepsilon)$-KT partition improves the time complexity of this algorithm to $\widetilde O(\sqrt{mn\tau})$, matching the query complexity. We detail the full algorithm in Section 6.1.
Both quantum algorithms also use the following observation [AL21, Lemma 2]: if in a weighted graph the ratio of the largest edge weight to the smallest is $\tau$, then the total weight of inter-component edges in a $(1+\varepsilon)$-KT partition of $G$ for $\varepsilon < 1$ is $O(\tau n)$ times the minimum edge weight, which can be tight.
3. The best randomized algorithm to compute the edge connectivity of a simple graph is the 2-out contraction approach of Ghaffari, Nowicki, and Thorup [GNT20], which has running time $O(m + n \log^2 n)$. Using our algorithm to find a $(1+\varepsilon)$-KT partition in a weighted graph we can follow Karger’s 2-respecting tree approach to compute the edge connectivity of a simple graph in time $O(m) + \widetilde O(n)$, thus also achieving the optimal bound of $O(m)$ on graphs that are not too sparse. We postpone details to Section 6.2.
Apart from these examples, we are hopeful that our near-optimal randomized algorithm for finding the KT partition of a weighted graph will find further applications.
In order to find a $(1+\varepsilon)$-KT partition in near-linear time, Apers and Lee [AL21] show that it suffices to solve the following problem in near-linear time. Let $G$ be a connected weighted graph and $T$ a spanning tree of $G$. Consider a graph $H$ whose nodes are the edges of $T$, and where two nodes $e, f$ of $H$ are connected by an edge iff the 2-respecting cut defined by $\{e, f\}$ is a non-trivial $(1+\varepsilon)$-near minimum cut of $G$. Then the problem is to find a spanning forest of $H$. Our main technical contribution is to give a deterministic $\widetilde O(m)$ time algorithm to solve this problem, where $m$ is the number of edges of the original graph $G$.
It is interesting to compare the problem of finding a spanning forest of $H$ with the original problem solved by Karger of finding a minimum-weight 2-respecting cut of $T$. To find a spanning forest of $H$ we potentially have to find many $(1+\varepsilon)$-near minimum cuts, which we accomplish with only an additional logarithmic overhead in the running time. The first insight into how this might be possible is to note that Karger’s original algorithm to find the minimum-weight 2-respecting cut actually does something stronger than needed. Let $\mathrm{cost}(e,f)$ be the weight of the 2-respecting cut of $G$ defined by $\{e, f\}$. For every edge $e$ of $T$, Karger’s algorithm attempts to find an $f \in \operatorname{argmin}_{f'} \mathrm{cost}(e, f')$. It does not always succeed in this task, but if the candidate returned for edge $e$ is not such a minimizer, then for a pair $(e, f^*)$ achieving the global minimum of $\mathrm{cost}$ it must be the case that the candidate $e'$ returned for $f^*$ satisfies $\mathrm{cost}(f^*, e') \le \mathrm{cost}(f^*, e)$. In this way, the algorithm still succeeds to find a minimum-weight 2-respecting cut in the end.
In contrast, we give an algorithm that for every edge $e$ of $T$ actually finds an $f^*(e) \in \operatorname{argmin}_f \mathrm{cost}(e, f)$.
We then show that this suffices to implement a round of Borůvka’s spanning forest algorithm [NMN01] on $H$ in near-linear time. Borůvka’s spanning forest algorithm consists of $O(\log n)$ rounds and maintains the invariant of having a partition $\{S_1, \ldots, S_t\}$ of the vertex set and a spanning tree for each set $S_i$. The algorithm terminates when there is no outgoing edge from any set of the partition, at which point the collection of spanning trees for the sets of the partition is a spanning forest of $H$. The sets of the partition are initialized to be individual nodes of $H$.
In each round of Borůvka’s algorithm the goal is to find an outgoing edge from each set of the partition which has one. Consider a node $e$ of $H$ with $e \in S_i$. We can find the best partner $f^*(e)$ for $e$ and check if $\{e, f^*(e)\}$ indeed gives rise to a non-trivial $(1+\varepsilon)$-near minimum cut and so is an edge of $H$. The problem is that $f^*(e)$ could also be in $S_i$, in which case the edge $\{e, f^*(e)\}$ is not an outgoing edge of $S_i$ as desired. To handle this, we maintain a data structure that allows us to find not only the best partner for $e$, but also the best partner for $e$ whose color differs from that of the first candidate. We call such a query a categorical top two query. If there actually is an edge of $H$ with one endpoint $e$ and the other endpoint outside of $S_i$, then one of these two candidates will be such an edge. Following the approach of [GMW20] to the minimum-weight 2-respecting cut problem, combined with an efficient data structure for handling categorical top two queries, we are able to do this for all nodes of $H$ in near-linear time, which allows us to implement a round of Borůvka’s algorithm in near-linear time.
1.2 Technical overview
We now give a more detailed description of our main result. Let $G = (V, E, w)$ be a weighted graph, where $E$ is the set of edges and $w : E \to \mathbb{R}_{>0}$ assigns a positive weight to each edge. For a set $S \subseteq V$ let $\Delta(S)$ be the set of all edges of $G$ with exactly one endpoint in $S$. A cut of $G$ is a set of edges of the form $\Delta(S)$ for some $\emptyset \ne S \subsetneq V$. We call $S$ and $V \setminus S$ the shores of the cut. Let $w(\Delta(S)) = \sum_{e \in \Delta(S)} w(e)$. We use $\lambda(G)$ for the minimum weight of a cut in $G$.
We will be interested in partitions of $V$ and the partial order on partitions induced by refinement. For two partitions $\mathcal{P}, \mathcal{Q}$ of $V$ we say that $\mathcal{P} \preceq \mathcal{Q}$ iff for every $P \in \mathcal{P}$ there is a $Q \in \mathcal{Q}$ with $P \subseteq Q$. In this case we say $\mathcal{P}$ is a refinement of $\mathcal{Q}$. The meet of two partitions $\mathcal{P}$ and $\mathcal{Q}$, denoted $\mathcal{P} \wedge \mathcal{Q}$, is the partition such that $\mathcal{P} \wedge \mathcal{Q} \preceq \mathcal{P}$, $\mathcal{P} \wedge \mathcal{Q} \preceq \mathcal{Q}$, and for any other partition $\mathcal{R}$ satisfying these two conditions $\mathcal{R} \preceq \mathcal{P} \wedge \mathcal{Q}$. In other words, $\mathcal{P} \wedge \mathcal{Q}$ is the greatest lower bound of $\mathcal{P}$ and $\mathcal{Q}$ under $\preceq$. For a set of partitions $\mathcal{S} = \{\mathcal{P}_1, \ldots, \mathcal{P}_k\}$ we write $\bigwedge \mathcal{S} = \mathcal{P}_1 \wedge \cdots \wedge \mathcal{P}_k$.
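For example, if $V = \{1,2,3,4\}$ then $\{\{1,2\},\{3,4\}\} \wedge \{\{1\},\{2,3,4\}\} = \{\{1\},\{2\},\{3,4\}\}$: two elements lie in the same set of the meet iff they lie in a common set of every partition.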
For our applications we need to consider not only minimum cuts, but also near-minimum cuts. For $\varepsilon \ge 0$, let $\mathcal{B}_\varepsilon(G)$ be the set of all bipartitions $\{S, V \setminus S\}$ of $V$ corresponding to $(1+\varepsilon)$-near minimum cuts, i.e. with $w(\Delta(S)) \le (1+\varepsilon)\lambda(G)$. Let $\mathcal{B}^{nt}_\varepsilon(G) \subseteq \mathcal{B}_\varepsilon(G)$ be the set of all the non-trivial cuts in $\mathcal{B}_\varepsilon(G)$, those where both shores have at least two vertices. The $(1+\varepsilon)$-KT partition of $G$ is exactly $\bigwedge \mathcal{B}^{nt}_\varepsilon(G)$.
Both $\bigwedge \mathcal{B}_0(G)$ and $\bigwedge \mathcal{B}^{nt}_0(G)$ are important objects for understanding the structure of (near-)minimum cuts in a graph. Consider first $\bigwedge \mathcal{B}_0(G)$, the meet of the set of all bipartitions corresponding to minimum cuts. This partition arises in the cactus decomposition of $G$ [DKL76], a compact representation of all minimum cuts of $G$. A cactus is a connected multigraph where every edge appears in exactly one cycle. The edge connectivity of a cactus is 2 and the minimum cuts are obtained by removing any two edges from the same cycle. A cactus decomposition of a graph $G = (V, E)$ is a cactus $C = (U, F)$ together with a mapping $\varphi : V \to U$ such that $S$ is the shore of a mincut of $G$ iff $S = \varphi^{-1}(A)$ for the shore $A$ of some mincut of $C$. The mapping $\varphi$ does not have to be injective, so multiple vertices of $G$ can map to the same vertex of $C$. In this case, however, the cactus decomposition property means that all vertices in $\varphi^{-1}(u)$ must be on the same side of every minimum cut of $G$, for every $u \in U$. Thus as $u$ ranges over $U$ the non-empty sets $\varphi^{-1}(u)$ give the elements of $\bigwedge \mathcal{B}_0(G)$ (note that $\varphi^{-1}(u)$ can also be empty). A cactus decomposition of a weighted graph can be constructed by a randomized algorithm in near-linear time [KP09], thus this also gives a near-linear time randomized algorithm to compute $\bigwedge \mathcal{B}_0(G)$.
Lo, Schmidt, and Thorup [LST20] give a version of the cactus decomposition that only represents the non-trivial minimum cuts. In fact, they give a deterministic linear-time algorithm that converts a standard cactus into one representing only the non-trivial minimum cuts. Combining this with the near-linear time algorithm to compute a cactus decomposition gives a near-linear time randomized algorithm to compute $\bigwedge \mathcal{B}^{nt}_0(G)$ as well.
The situation changes once we go to near-minimum cuts, which can no longer be represented by a cactus, but require the deformable polygon representation from [Ben95, Ben97, BG08]. This construction is fairly intricate, and the best known randomized algorithm to construct a deformable polygon representation of the near-mincuts of a graph builds on the Karger–Stein algorithm and takes time $\widetilde O(n^2)$ [Ben97, Section 6.3]. A prerequisite to constructing a deformable polygon representation is being able to compute $\bigwedge \mathcal{B}_\varepsilon(G)$ as, analogously to the case of a cactus, these sets will be the “atoms” that label regions of the polygons.
Our main result in this work is to give a randomized algorithm to compute $\bigwedge \mathcal{B}_\varepsilon(G)$ and $\bigwedge \mathcal{B}^{nt}_\varepsilon(G)$ in time $\widetilde O(m)$.
Theorem 1.
Let $G$ be a weighted graph with $n$ vertices and $m$ edges. For $0 \le \varepsilon < 1$ let $\mathcal{B}_\varepsilon(G)$ be the set of bipartitions of $V$ corresponding to $(1+\varepsilon)$-near minimum cuts of $G$, and $\mathcal{B}^{nt}_\varepsilon(G)$ be the subset of $\mathcal{B}_\varepsilon(G)$ containing only non-trivial cuts. Both $\bigwedge \mathcal{B}_\varepsilon(G)$ and $\bigwedge \mathcal{B}^{nt}_\varepsilon(G)$ can be computed with high probability by a randomized algorithm with running time $\widetilde O(m)$.
In the rest of this introduction we focus on computing $\bigwedge \mathcal{B}^{nt}_\varepsilon(G)$. It is easy to construct $\bigwedge \mathcal{B}_\varepsilon(G)$ from $\bigwedge \mathcal{B}^{nt}_\varepsilon(G)$ deterministically in $O(m)$ time.
The first obstacle we face in designing a near-linear time algorithm to compute the meet of $\mathcal{B}^{nt}_\varepsilon(G)$ is that the number of near-minimum cuts in $G$ can be $\Omega(n^2)$, so we cannot afford to consider all of them. An idea to get around this is to try the following:
1. Efficiently find a “small” subset $\mathcal{S} \subseteq \mathcal{B}^{nt}_\varepsilon(G)$ such that $\bigwedge \mathcal{S} = \bigwedge \mathcal{B}^{nt}_\varepsilon(G)$. We call such a subset a generating set.
A greedy argument shows that such a subset exists of size at most $n - 1$. We initialize $\mathcal{S} = \{b\}$ for some element $b \in \mathcal{B}^{nt}_\varepsilon(G)$. We then iterate through the elements $b'$ of $\mathcal{B}^{nt}_\varepsilon(G)$ and add $b'$ to $\mathcal{S}$ iff $\bigwedge(\mathcal{S} \cup \{b'\}) \ne \bigwedge \mathcal{S}$. Each bipartition added to $\mathcal{S}$ increases the number of sets in $\bigwedge \mathcal{S}$ by at least 1. As this size can be at most $n$, and $\bigwedge \mathcal{S}$ begins with size 2, the total number of bipartitions in $\mathcal{S}$ at termination is at most $n - 1$. This shows that a small generating set exists, but there still remains the problem of finding such a generating set efficiently.
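The following sketch (in Python, with names of our own choosing) makes the greedy counting argument explicit; it is quadratic and serves only as an illustration, not as an efficient procedure.

```python
def greedy_generating_set(bipartitions, universe):
    """Greedy witness that at most |universe| - 1 bipartitions suffice to
    generate the meet: keep a bipartition only if it strictly increases
    the number of blocks of the running meet."""
    def num_meet_blocks(bs):
        # block of v in the meet = its membership pattern across all shores
        return len({tuple(v in shore for shore, _ in bs) for v in universe})
    kept = []
    for b in bipartitions:  # b = (shore, complement)
        if not kept or num_meet_blocks(kept + [b]) > num_meet_blocks(kept):
            kept.append(b)
    return kept

V = {1, 2, 3, 4}
cuts = [({1, 2}, {3, 4}), ({2, 1}, {4, 3}), ({1}, {2, 3, 4})]
print(greedy_generating_set(cuts, V))  # the duplicate cut is dropped
```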
Assuming we get past the first obstacle, there remains a second obstacle. The most straightforward algorithm to compute the meet of $k$ partitions of a set of size $n$ takes time $O(kn)$, which is again too slow if $k = \Theta(n)$. Thus we will also need to
2. Exploit the structure of $\mathcal{S}$ to compute $\bigwedge \mathcal{S}$ efficiently.
Apers and Lee [AL21] give an approach to accomplish (1) and (2) following Karger’s framework of tree-respecting cuts. Karger shows that in near-linear time one can compute a set of $O(\log n)$ spanning trees $T_1, \ldots, T_h$ of $G$ such that with high probability every $(1+\varepsilon)$-near minimum cut of $G$ 2-respects at least one of these trees. Let $\mathcal{B}^{nt}_\varepsilon(G, T_i)$ be the bipartitions corresponding to non-trivial near-minimum cuts that 2-respect $T_i$. To compute $\bigwedge \mathcal{B}^{nt}_\varepsilon(G)$ it suffices to compute $\bigwedge \mathcal{B}^{nt}_\varepsilon(G, T_i)$ for each $i$ and then compute the meet of the $h$ resulting partitions. The latter can be done in time $O(n \log n)$ by the aforementioned straightforward algorithm. This leaves the problem of computing $\bigwedge \mathcal{B}^{nt}_\varepsilon(G, T)$ for a single tree $T$.
A key observation from [AL21] gives a generating set for $\bigwedge \mathcal{B}^{nt}_\varepsilon(G, T)$ of size $O(n)$. One initializes $\mathcal{S}$ to be empty and then adds to $\mathcal{S}$ the bipartitions in $\mathcal{B}^{nt}_\varepsilon(G, T)$ that 1-respect $T$. This is a set of size at most $n - 1$, and Karger has shown that all near-minimum cuts that 1-respect a tree can be found in time $O(m)$.
Now we focus on the cuts that strictly 2-respect $T$. To handle these one creates a graph $H$ whose nodes are the edges of $T$ and where there is an edge between nodes $e$ and $f$ iff the 2-respecting cut of $G$ defined by $\{e, f\}$ is a non-trivial near-minimum cut. One then adds to $\mathcal{S}$ the bipartitions corresponding to a set of 2-respecting cuts whose pairs $\{e, f\}$ form a spanning forest of $H$. The resulting set $\mathcal{S}$ has size $O(n)$ and it can be shown to be a generating set for $\bigwedge \mathcal{B}^{nt}_\varepsilon(G, T)$.
Apers and Lee give a quantum algorithm to find a spanning forest of $H$ with running time $\widetilde O(\sqrt{mn})$. They then give a randomized algorithm to compute $\bigwedge \mathcal{S}$ in time $\widetilde O(n)$. As our main technical contribution, we give a deterministic algorithm to find a spanning forest of $H$ in time $\widetilde O(m)$. We also replace the randomization used in the algorithm to compute $\bigwedge \mathcal{S}$ with an appropriate data structure to give a deterministic $\widetilde O(n)$ time algorithm to compute the meet.
2 Preliminaries
For a natural number $k$ we use $[k] = \{1, 2, \ldots, k\}$.
Graph notation
For a set $S$ we let $\binom{S}{2}$ denote the set of unordered pairs of elements of $S$. We represent an undirected edge-weighted graph as a triple $G = (V, E, w)$ where $E \subseteq \binom{V}{2}$ and $w : E \to \mathbb{R}_{>0}$ gives the weight of an edge. We will also use $V(G)$ to denote the vertex set of $G$ and $E(G)$ to denote the set of edges. We always use $n$ for the number of vertices in $G$ and $m$ for the number of edges. We will overload the function $w$ to let $w(F) = \sum_{e \in F} w(e)$ for a set of edges $F$, and for two disjoint sets $A, B \subseteq V$ we use $w(A, B)$ to denote $w(\{\{a, b\} \in E : a \in A, b \in B\})$, that is the sum of the weights of edges with one endpoint in $A$ and one endpoint in $B$. For a subset $S \subseteq V$ we let $\Delta(S)$ be the set of edges with exactly one endpoint in $S$. This is the cut defined by $S$. We let $\lambda(G)$ denote the weight of a minimum cut in $G$, i.e. $\lambda(G) = \min_{\emptyset \ne S \subsetneq V} w(\Delta(S))$.
Heavy path decomposition
We use the standard notion of heavy path decomposition of a rooted tree $T$ [ST83, HT84], which is a partition of the edges of $T$ into heavy paths. We define this partition recursively: first, find the heavy path starting at the root by repeatedly descending to the child of the current node with the largest subtree. This creates the topmost heavy path, starting at the root (called its head) and terminating at a leaf (called its tail). Second, remove the topmost heavy path from $T$ and repeat the reasoning on each of the obtained smaller trees. The crucial property is that, for any node $u$, the path from $u$ to the root in $T$ intersects at most $O(\log n)$ heavy paths.
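For concreteness, a minimal sketch of computing the decomposition in linear time, under the assumption that the tree is given by children lists:

```python
def heavy_path_decomposition(children, root):
    """Partition a rooted tree into heavy paths. children[v] lists the
    children of v. Returns head[v]: the topmost vertex of the heavy path
    containing v; each path runs from its head down to a leaf (its tail)."""
    order, stack = [], [root]          # preorder via explicit stack
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])
    size = {v: 1 for v in order}       # subtree sizes, bottom-up
    for v in reversed(order):
        for c in children[v]:
            size[v] += size[c]
    head = {root: root}
    for v in order:                    # extend paths into heaviest child
        if children[v]:
            heavy = max(children[v], key=lambda c: size[c])
            for c in children[v]:
                head[c] = head[v] if c is heavy else c
    return head

# Leaving a heavy path means entering a subtree of at most half the size,
# so any root-to-node path meets O(log n) heavy paths.
children = {0: [1, 2], 1: [3, 4], 2: [], 3: [], 4: []}
print(heavy_path_decomposition(children, 0))
```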
Algorithmic preliminaries
We collect here a few theorems from previous work that we will need. The first is Karger’s result [Kar00] about finding many spanning trees of a graph $G$ such that every minimum cut of $G$ will 2-respect at least one of these trees. We will need the easy extension of this result to near-minimum cuts, which has been explicitly stated by Apers and Lee.
Theorem 2 ([Kar00, Theorem 4.1], [AL21, Theorem 24]).
Let $G$ be a weighted graph with $n$ vertices and $m$ edges, and let $0 \le \varepsilon \le 1$. There is a randomized algorithm that in $\widetilde O(m)$ time constructs a set of $O(\log n)$ spanning trees such that every $(1+\varepsilon)$-near minimum cut of $G$ 2-respects a constant fraction of them with high probability.
We will also need the fact that for a weighted graph $G$ the weights in $G$ of all cuts that 1-respect a tree can be computed quickly. For a rooted spanning tree $T$ of $G$ and an edge $e \in T$, let $e^{\downarrow}$ be the set of vertices in the component not containing the root when $e$ is removed from $T$. We use the shorthand $C(e) = w(\Delta(e^{\downarrow}))$.
Lemma 3 ([Kar00, Lemma 5.1]).
Let $G$ be a weighted graph with $n$ vertices and $m$ edges, and $T$ a spanning tree of $G$. There is a deterministic algorithm that computes $C(e)$ for every $e \in T$ in time $O(m)$.
We will also make use of the improvement by Gawrychowski, Mozes and Weimann of Karger’s mincut algorithm.
Lemma 4 ([GMW20, Theorem 7]).
Let $G$ be a weighted graph with $n$ vertices and $m$ edges and $T$ a spanning tree of $G$. A cut of minimum weight in $G$ among those that 2-respect $T$ can be found deterministically in $O(m \log n)$ time. Using this, there is a randomized algorithm that finds a minimum cut in $G$ with high probability in time $O(m \log^2 n)$.
Finally we give the formal statement of the result from [AL21] that underlies our algorithm to construct a KT partition.
Lemma 5 ([AL21, Lemma 29]).
Let $T$ be a spanning tree of $G$ and $\mathcal{B}$ a family of subsets of $V$ such that $\Delta(S)$ 2-respects $T$ for each $S \in \mathcal{B}$. Let $A$ be the corresponding set of pairs of edges of $T$, where the pair for a 1-respecting cut consists of a single edge. Suppose $F$ is a spanning forest of the graph $H = (E(T), A)$. Then the set of shores of the 2-respecting cuts defined by the edges in $F$ is a generating set for $\bigwedge_{S \in \mathcal{B}} \{S, V \setminus S\}$.
3 Data structures
In this section we develop the data structure we will need for a fast implementation of our spanning forest algorithm. We want to maintain a tree $T$ with root $r$, in which each edge has a score and a color, so that we can support the following queries and updates. For an edge $e$ of the tree, let $T_e$ be the set of edges in the component not containing $r$ when $e$ is removed from the tree. On query $e$ we want to find the edge in $T_e$ with the smallest score, and the edge in $T_e$ with the smallest score among edges whose color is different from that of the first edge found. We call such a query a categorical top two query. We want to answer these queries while allowing the addition of a value to the score of every edge on the path between two nodes. We could use link-cut trees [ST83] to accomplish this with update and query time $O(\log n)$, using the fact that link-cut trees can be modified to support any semigroup operation under path updates. However, in our case the tree is static, and this allows for a simple and self-contained solution that requires only a well-known binary tree data structure coupled with the standard heavy path decomposition of a tree. This comes at the expense of implementing updates in $O(\log^2 n)$ time instead of $O(\log n)$ time. The construction can be seen as folklore and has been explicitly stated by Bhardwaj, Lovett, and Sandlund [BLS20] for the case where each edge maintains its score and there are no colors. We provide a detailed description of such an approach in Appendix A. We note that the increased update time does not affect the overall time complexity of our algorithm.
Lemma 6.
Let $A[1..n]$ be an array where each element $A[i]$ has two fields, a color $c(i)$ and a score $s(i)$. In $O(n)$ time we can create a data structure using $O(n)$ space and supporting the following operations in $O(\log n)$ time per operation.

1. $\mathrm{AddInterval}(i, j, x)$: for all $k \in \{i, \ldots, j\}$ do $s(k) \leftarrow s(k) + x$,

2. $\mathrm{CatTopTwo}(i, j)$: return $(k_1, k_2)$ where $k_1 \in \operatorname{argmin}\{s(k) : i \le k \le j\}$ and $k_2 \in \operatorname{argmin}\{s(k) : i \le k \le j, \ c(k) \ne c(k_1)\}$ if such an element exists, and $k_2 = \mathrm{null}$ otherwise.
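One standard way to realize Lemma 6 is a segment tree with lazy range-addition in which every node stores the best element of its range together with the best element whose color differs from that of the first; this is exactly the information needed to answer a categorical top two query. The sketch below (class and method names are ours) follows this approach.

```python
class IntervalCatTopTwo:
    """Sketch of Lemma 6: array of (score, color) pairs with AddInterval
    and CatTopTwo in O(log n) time. Elements are (score, index, color)
    triples; None stands for 'no element'."""

    def __init__(self, scores, colors):
        self.size = 1 << max(1, (len(scores) - 1).bit_length())
        self.lazy = [0.0] * (2 * self.size)
        self.best = [None] * (2 * self.size)  # min-score element in subtree
        self.alt = [None] * (2 * self.size)   # min-score of another color
        for i, (s, c) in enumerate(zip(scores, colors)):
            self.best[self.size + i] = (s, i, c)
        for v in range(self.size - 1, 0, -1):
            self._pull(v)

    @staticmethod
    def _combine(cands):
        cands = [x for x in cands if x is not None]
        if not cands:
            return None, None
        b = min(cands)
        rest = [x for x in cands if x[2] != b[2]]
        return b, (min(rest) if rest else None)

    @staticmethod
    def _shift(x, d):
        return None if x is None else (x[0] + d, x[1], x[2])

    def _pull(self, v):
        # a node's summary includes its own pending lazy addition
        b, a = self._combine([self.best[2*v], self.alt[2*v],
                              self.best[2*v+1], self.alt[2*v+1]])
        self.best[v] = self._shift(b, self.lazy[v])
        self.alt[v] = self._shift(a, self.lazy[v])

    def add_interval(self, lo, hi, x, v=1, l=0, r=None):
        """Add x to every score in [lo, hi)."""
        if r is None:
            r = self.size
        if hi <= l or r <= lo:
            return
        if lo <= l and r <= hi:
            self.lazy[v] += x
            self.best[v] = self._shift(self.best[v], x)
            self.alt[v] = self._shift(self.alt[v], x)
            return
        m = (l + r) // 2
        self.add_interval(lo, hi, x, 2*v, l, m)
        self.add_interval(lo, hi, x, 2*v + 1, m, r)
        self._pull(v)

    def cat_top_two(self, lo, hi, v=1, l=0, r=None, acc=0.0):
        """Min-score element in [lo, hi), and the min-score element there
        whose color differs from the first one's color."""
        if r is None:
            r = self.size
        if hi <= l or r <= lo:
            return None, None
        if lo <= l and r <= hi:
            return self._shift(self.best[v], acc), self._shift(self.alt[v], acc)
        m = (l + r) // 2
        acc += self.lazy[v]
        lb, la = self.cat_top_two(lo, hi, 2*v, l, m, acc)
        rb, ra = self.cat_top_two(lo, hi, 2*v + 1, m, r, acc)
        return self._combine([lb, la, rb, ra])

ds = IntervalCatTopTwo([5.0, 3.0, 4.0, 6.0], [0, 0, 1, 2])
ds.add_interval(0, 2, -2.0)       # scores become [3, 1, 4, 6]
print(ds.cat_top_two(0, 4))       # ((1.0, 1, 0), (4.0, 2, 1))
```

The combination rule is sound because the best element of a different color than the overall best is, within each child, either that child's best (if its color already differs) or that child's alternative.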
Lemma 7.
Let $T$ be a tree on $n$ nodes, with each edge having a color and a score. In $O(n)$ time we can create a data structure using $O(n)$ space and supporting the following operations.

1. $\mathrm{AddPath}(u, v, x)$: add $x$ to the score of every edge on the path between $u$ and $v$ in $T$, in $O(\log^2 n)$ time.

2. $\mathrm{CatTopTwo}(e)$: categorical top-two query in $T_e$, in $O(\log n)$ time.
4 Spanning tree of near-minimum 2-respecting cuts in near-linear time
Let $G = (V, E, w)$ be a weighted undirected graph. We will assume throughout that $G$ is connected, and in particular that $m \ge n - 1$, as the KT partition of a disconnected graph can be easily determined from its connected components. Let $T$ be a spanning tree of $G$. We will choose a vertex $r$ with degree 1 in $T$ to be the root of $T$. We view $T$ as a directed graph with all edges directed away from $r$. With some abuse of notation, we will also use $T$ to refer to this directed version. If we remove any edge $e$ from $T$ then $T$ becomes disconnected into two components. We use $e^{\downarrow}$ to denote the set of vertices in the component not containing the root, and $T_e$ to denote the set of edges in the subtree rooted at the head of $e$, i.e. the edges in the subgraph of $T$ induced by $e^{\downarrow}$. We further use the shorthand $C(e) = w(\Delta(e^{\downarrow}))$ for the weight of the cut with shore $e^{\downarrow}$.
Two edges $e, f \in T$ define a unique cut in $G$ which we denote by $\mathrm{cut}_T(e, f)$ (or $\mathrm{cut}(e, f)$ if it is clear from the context which $T$ we are referring to). The cut depends on the relationship between $e$ and $f$. If $e \in T_f$ or $f \in T_e$ then we say that $e$ and $f$ are descendant edges. Without loss of generality, say that $f \in T_e$. Then the cut defined by $e$ and $f$ is $\Delta(e^{\downarrow} \setminus f^{\downarrow})$. If $e$ and $f$ are not descendant edges, then we say they are independent. For independent edges we see that $\mathrm{cut}(e, f) = \Delta(e^{\downarrow} \cup f^{\downarrow})$. In both cases we use $\mathrm{cost}(e, f)$ to denote the weight of the corresponding cut.
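Concretely, these two cases give the standard identities of Karger’s framework:

$$
\mathrm{cost}(e, f) \;=\;
\begin{cases}
C(e) + C(f) - 2\,w\big(f^{\downarrow},\, V \setminus e^{\downarrow}\big) & \text{if } f \in T_e \text{ (descendant edges)},\\[2pt]
C(e) + C(f) - 2\,w\big(e^{\downarrow},\, f^{\downarrow}\big) & \text{if } e, f \text{ are independent.}
\end{cases}
$$

In each case the subtracted term accounts exactly for the edges counted by both $C(e)$ and $C(f)$ that do not cross $\mathrm{cut}(e, f)$.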
In a KT partition we are only interested in non-trivial cuts. We first prove the following simple claim that characterizes when $\mathrm{cut}(e, f)$ is trivial.
Proposition 8.
Let $G$ be a connected graph with $n \ge 3$ vertices and let $T$ be a spanning tree of $G$ with root $r$. For distinct $e, f \in T$, if $\mathrm{cut}(e, f)$ is trivial then:

1. If $e, f$ are independent then they must be the unique two edges incident to the root.

2. If $e, f$ are descendant then there is a vertex $v$ such that $e$ is the edge incoming to $v$ and $f$ is the unique edge outgoing from $v$, or vice versa.
Proof.
First suppose that $e, f$ are independent. Then a shore of $\mathrm{cut}(e, f)$ is $e^{\downarrow} \cup f^{\downarrow}$. We have $|g^{\downarrow}| \ge 1$ for any edge $g$, thus $|e^{\downarrow} \cup f^{\downarrow}| \ge 2$. Hence for $\mathrm{cut}(e, f)$ to be trivial we must have $|V \setminus (e^{\downarrow} \cup f^{\downarrow})| = 1$. The root $r$ is not contained in $g^{\downarrow}$ for any edge $g$, thus it must be the case that $V \setminus (e^{\downarrow} \cup f^{\downarrow}) = \{r\}$. For this to happen, both $e$ and $f$ must be incident to $r$, and $r$ cannot have any other outgoing edges besides $e$ and $f$.

Now consider the case that $e, f$ are descendant and suppose without loss of generality that $f \in T_e$. Let $S = e^{\downarrow} \setminus f^{\downarrow}$. In this case we have $|V \setminus S| \ge 2$, as $V \setminus S$ contains the root and the non-empty set $f^{\downarrow}$. Let us understand when $|S| = 1$. As all vertices on the path from the head of $e$ to and including the tail of $f$ are in $S$, it must be the case that the head of $e$ is the tail of $f$. Call this vertex $v$ and note $v \in S$. If $v$ had any other child besides the head of $f$ then that child would be in $S$ as well, thus $f$ must be the unique outgoing edge from $v$. ∎
By choosing a root for $T$ that has degree 1 we avoid the case of item 1 of Proposition 8. Thus we only have to worry about trivial cuts when $e, f$ are descendant edges.
With that out of the way, we now turn to the main theorem of this section. As outlined in Section 1.2, this theorem is the key routine in our $(1+\varepsilon)$-KT partition algorithm, which we fully describe in Section 5.
Theorem 9.
Let $G = (V, E, w)$ be a connected weighted graph with $n$ vertices and $m$ edges and let $T$ be a spanning tree of $G$. For a given parameter $\kappa$, define the graph $H = (V_H, E_H)$ with $V_H = E(T)$ and $E_H = \{\{e, f\} : \mathrm{cut}_T(e, f) \text{ is a non-trivial cut of } G \text{ with } \mathrm{cost}(e, f) \le \kappa\}$. There is a deterministic algorithm that, given adjacency list access to $G$, outputs a spanning forest of $H$ in $\widetilde O(m)$ time.
At a high level, we prove Theorem 9 by following Borůvka’s algorithm to find a spanning forest of $H$. Throughout the algorithm we maintain a subgraph $F$ of $H$ that is a forest, initialized to be the empty graph on vertex set $V_H$. At the end of the algorithm, $F$ will be a spanning forest of $H$. The algorithm proceeds in rounds. In each round, for every tree in the forest, we find an edge of $H$ connecting it to another tree in the forest, if such an edge exists. If $H$ has $c$ connected components, then in each round the number of trees in $F$ minus $c$ goes down by at least a factor of 2, and so the algorithm terminates in $O(\log n)$ rounds.

The main work is implementing a round of Borůvka’s algorithm. We will think of the nodes of $H$ as having colors, where nodes in the same tree of the forest have the same color, and nodes in distinct trees have distinct colors. The goal of a single round is to find, for each color $i$, a pair of edges $e, f \in T$ such that $\mathrm{col}(e) = i \ne \mathrm{col}(f)$ and $\{e, f\} \in E_H$, or detect that there is no such pair with these properties, in which case the nodes colored $i$ already form a connected component of $H$. As we need to refer to such pairs often we make the following definition.
Definition 10 (partner).
Let $G$, $T$, $\kappa$, and $H$ be as in Theorem 9. Given an assignment $\mathrm{col}$ of colors to the edges of $T$, we say that $f$ is a partner for $e$ if $\{e, f\} \in E_H$ and $\mathrm{col}(e) \ne \mathrm{col}(f)$.
We will actually do something stronger than what is required to implement a round of Borůvka’s algorithm, which we encapsulate in the following interface, called $\mathrm{FindPartners}$.
Input: Adjacency list access to $G$, a spanning tree $T$ of $G$, a parameter $\kappa$, and an assignment $\mathrm{col}(e)$ of colors to each $e \in E(T)$.
Output: For every $e \in E(T)$ output a partner $f$, or report that no partner for $e$ exists.
The implementation of $\mathrm{FindPartners}$ is our main technical contribution. Let us first see how to use $\mathrm{FindPartners}$ to find a spanning forest of $H$.
Lemma 11.
Let $G$, $T$, $\kappa$, and $H$ be as in Theorem 9. There is a deterministic algorithm that makes $O(\log n)$ calls to $\mathrm{FindPartners}$ and in $O(n \log n)$ additional time outputs a spanning forest of $H$.
Proof.
We construct a spanning forest of $H$ by maintaining a collection $F$ of trees that will be updated in rounds by Borůvka’s algorithm until it becomes a spanning forest. We initialize $F$ to the empty forest on $V_H$ and give all nodes distinct colors. We maintain the invariants that $F$ is a forest and that nodes in the same tree have the same color and those in different trees have distinct colors.

Consider a generic round where $F$ contains $t$ trees. We call $\mathrm{FindPartners}$ with the current color assignment. For every $e$ which has one we obtain a partner $f(e)$ such that $\{e, f(e)\} \in E_H$ and $\mathrm{col}(e) \ne \mathrm{col}(f(e))$. For each color class we select one $e$ with a returned partner (if it exists) and let $D$ be the set of selected pairs $\{e, f(e)\}$. We then find a maximal subset $D' \subseteq D$ of edges that do not create a cycle among the color classes by computing a spanning forest of the multigraph whose supervertices are given by the color classes and whose edges are given by $D$. We add the edges in $D'$ to $F$. Finally we merge the color classes of the connected components of $F$ by appropriately updating the color assignments, and we pass the updated forest and color assignments to the next round of the algorithm. Each of the steps in a single round can be executed in $O(n)$ time.

We have that $t \ge c$, where $c$ is the number of connected components of $H$. Each edge from $D'$ added to $F$ decreases the number of trees in $F$ by one. Thus the number of trees in $F$ minus $c$ decreases by at least a factor of 2 in each round and the algorithm terminates after $O(\log n)$ rounds. The time spent outside of the calls to $\mathrm{FindPartners}$ is $O(n)$ for each of the $O(\log n)$ rounds. This is $O(n \log n)$ overall. ∎
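A schematic rendering of this proof as code, with $\mathrm{FindPartners}$ abstracted as a callback and a simple union–find merging the color classes (all names are ours):

```python
def boruvka_spanning_forest(nodes, find_partners):
    """nodes: the edges of T, i.e. the vertices of H.
    find_partners(color): dict mapping a node to a partner of a different
    color, for every node that has one. Returns the forest edges of H."""
    color = {v: i for i, v in enumerate(nodes)}   # all-distinct colors
    forest = []
    while True:
        partner = find_partners(color)
        selected = {}                 # one outgoing edge per color class
        for e, f in partner.items():
            selected.setdefault(color[e], (e, f))
        if not selected:
            return forest
        parent = {c: c for c in color.values()}   # union-find on classes
        def find(c):
            while parent[c] != c:
                parent[c] = parent[parent[c]]
                c = parent[c]
            return c
        for e, f in selected.values():            # add cycle-free edges
            a, b = find(color[e]), find(color[f])
            if a != b:
                forest.append((e, f))
                parent[a] = b
        color = {v: find(c) for v, c in color.items()}
```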
If a node $e$ has a partner $f$, then $(e, f)$ can either be a pair of descendant or independent edges. To implement $\mathrm{FindPartners}$ we will separately handle these cases, as described in the next two subsections.
4.1 Descendant edges
We follow the approach from [GMW20], originally designed to find a single pair of descendant edges that minimizes $\mathrm{cost}(e, f)$ over all descendant pairs, in time $O(m \log n)$. Their approach actually does something stronger (as does Karger’s original algorithm): for every $e \in T$ it finds the best match in the subtree $T_e$, i.e., it returns an edge $f^*(e) \in \operatorname{argmin}\{\mathrm{cost}(e, f) : f \in T_e\}$. In order to implement the descendant-edge part of $\mathrm{FindPartners}$ we have three additional complications to handle:

1. The edge $f^*(e)$ might have the same color as $e$.

2. The resulting $\mathrm{cut}(e, f^*(e))$ might be a trivial cut.

3. Edge $e$ may have no partner in $T_e$ but still have a partner $f$ such that $e \in T_f$. This partnership may not be discovered when we are looking for partners of $f$ if there is another $e' \in T_f$ with $\mathrm{cost}(f, e') \le \mathrm{cost}(f, e)$.
Item 1 can be easily solved by, in addition to finding $f_1 \in \operatorname{argmin}\{\mathrm{cost}(e, f) : f \in T_e\}$, also finding $f_2 \in \operatorname{argmin}\{\mathrm{cost}(e, f) : f \in T_e, \ \mathrm{col}(f) \ne \mathrm{col}(f_1)\}$. Phrasing things in this way, rather than simply looking for the edge with color different from $\mathrm{col}(e)$ which minimizes $\mathrm{cost}(e, \cdot)$, helps to limit the dependence of the query on $e$ and thus reduce the query time. If there is an $f \in T_e$ with $\mathrm{col}(f) \ne \mathrm{col}(e)$ and $\mathrm{cost}(e, f) \le \kappa$ then at least one of $f_1, f_2$ will satisfy this too.

For item 2, we use the result of Proposition 8 that descendant edges that give rise to trivial cuts have a very constrained structure. This allows us to avoid trivial cuts when looking for a partner of $e$.

Item 3 is relatively subtle and does not arise in the minimum-weight 2-respecting cut problem. To explain the issue we have to first say something about the high-level structure of our implementation of $\mathrm{FindPartners}$. We will perform an Euler tour of $T$ and, when the tour visits edge $e$ for the first time, we will look for a partner for $e$ in $T_e$. The issue is the following, which we explain in the context of the very first round of Borůvka’s algorithm so we do not have to worry about nodes having different colors. Suppose that in the graph $H$ the only edge incident to node $e$ is $\{e, f\}$ with $e \in T_f$. Thus in the execution of $\mathrm{FindPartners}$ we want to find $f$ as a partner of $e$. When the Euler tour is at $e$ we will not find any suitable partner for $e$, as there is none in $T_e$. We would like to identify $f$ as a partner for $e$ when the Euler tour visits $f$ for the first time. However, if there is an $e' \in T_f$ with $\mathrm{cost}(f, e') \le \mathrm{cost}(f, e)$ then the algorithm may return $e'$ as a partner of $f$ rather than $e$. To handle this we will actually make two passes over $T$. In the first pass, when we visit edge $e$ for the first time we look for a partner in $T_e$. In the second pass, we handle the case where the partner of $e$ might be an ancestor of $e$. To do this we need to de-activate nodes. When the Euler tour visits $f$ for the first time, we first find the lowest cost partner for $f$ in $T_f$. We then de-activate this node, and again find the best active partner for $f$ in $T_f$. Repeating this process, we will eventually find $e$ as a partner of $f$ if $\{e, f\}$ is indeed an edge of $H$ and $e, f$ have different colors.
Now we turn to more specific implementation details. A key idea in [GMW20] is that we can do an Euler tour of $T$ while maintaining a data structure such that when we first visit an edge $e$ we can easily look up $\mathrm{cost}(e, f)$ for any $f \in T_e$. The way this is maintained can be best understood by noting that for $f \in T_e$:

$$\mathrm{cost}(e, f) = C(e) + C(f) - 2\,w(f^{\downarrow}, e^{\downarrow c}), \tag{1}$$

where for convenience we defined $e^{\downarrow c} = V \setminus e^{\downarrow}$, the superscript $c$ denoting the complement.

We begin the algorithm by computing $C(e)$ for every $e \in T$, which can be done in $O(m)$ time by Lemma 3. We then do an Euler tour of $T$ while maintaining a data structure from Lemma 7 such that, when we are considering $e$, for every $f \in T_e$ the value of the data structure at location $f$ is $\mathrm{score}(f) = \mathrm{cost}(e, f) - C(e)$. For $f \notin T_e$ this will not in general be the case.
As can be seen from Eq. (1), the key to maintaining this data structure is how to update the scores when we descend an edge. Consider the case where we are currently at edge $e$ and move to a descending edge $e'$ outgoing from the head of $e$. For two vertices $u, v$ let $P(u, v)$ be the set of edges on the path from $u$ to $v$ in $T$, and let $\mathrm{lca}(u, v)$ be their least common ancestor in $T$. For $f \in T_{e'}$ we see that

$$w(f^{\downarrow}, e'^{\downarrow c}) = w(f^{\downarrow}, e^{\downarrow c}) + w(f^{\downarrow}, e^{\downarrow} \setminus e'^{\downarrow}). \tag{2}$$

By the definition of $\mathrm{score}$ via Eq. (1) we can compute the scores with respect to $e'$ from those with respect to $e$ by subtracting $2 w(g)$ from $\mathrm{score}(f)$ for every edge $g = \{u, z\} \in E$ such that $\mathrm{lca}(u, z)$ is the head of $e$ and $f \in P(u, z)$. The for-loop on line 2 of Algorithm 2 implements this step for all $f \in T_{e'}$ at once by looping over all $g = \{u, z\}$ with $\mathrm{lca}(u, z)$ equal to the head of $e$ and performing the path update $\mathrm{AddPath}(u, z, -2 w(g))$. After this update we have $\mathrm{score}(f) = \mathrm{cost}(e', f) - C(e')$ for every $f \in T_{e'}$. This shows how to descend down $T$ while keeping the invariant. The full tree is then explored by taking an Euler tour through $T$, and whenever we go back up in the tree we revert the score updates (for-loop on line 10 of Algorithm 3). This allows us to find a candidate partner for every $e \in T$. To bound the number of updates, note that each of the $m$ edges of $G$ has a unique lca, and we only do an update corresponding to an edge when its lca is visited by the Euler tour. Since each vertex’s updates are applied once when the tour first enters it and reverted once when it finally leaves, the number of path updates is $O(m)$. In addition, the number of categorical top two queries is $O(n)$. The data structure from Lemma 7 then yields $O(m \log^2 n)$ time overall.
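The following sketch shows the traversal skeleton and this update pattern; a naive linear-time path update stands in for the $\mathrm{AddPath}$ of Lemma 7, and `by_lca[v]` is assumed to list the edges $\{u, z\}$ of $G$ with $\mathrm{lca}(u, z) = v$ together with their weights:

```python
def lca(u, z, parent):
    """Naive least common ancestor via ancestor sets."""
    anc, a = set(), u
    while True:
        anc.add(a)
        if a not in parent:
            break
        a = parent[a]
    while z not in anc:
        z = parent[z]
    return z

def traverse(v, children, parent, by_lca, score, report):
    """Euler tour of T: when edge e = (v, u) is about to be followed,
    score[f] = cost(e, f) - C(e) for every tree edge f below e, provided
    score was initialized to score[f] = C(f) (Lemma 12)."""
    def add_path(u, z, x):          # naive stand-in for AddPath of Lemma 7
        top = lca(u, z, parent)
        for a in (u, z):
            while a != top:
                score[(parent[a], a)] += x
                a = parent[a]
    for (u, z, wgt) in by_lca[v]:   # establish the invariant below v
        add_path(u, z, -2 * wgt)
    for u in children[v]:
        report((v, u), score)       # query point for edge (v, u)
        traverse(u, children, parent, by_lca, score, report)
    for (u, z, wgt) in by_lca[v]:   # revert on the way back up
        add_path(u, z, +2 * wgt)

# Toy run: path 0-1-2 with unit weights plus non-tree edge {0, 2} of weight 1.
children = {0: [1], 1: [2], 2: []}
parent = {1: 0, 2: 1}
by_lca = {0: [(2, 0, 1.0)], 1: [], 2: []}
score = {(0, 1): 2.0, (1, 2): 2.0}  # initialized to C(f)
traverse(0, children, parent, by_lca, score,
         lambda e, s: print(e, dict(s)))
```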
The algorithm is formalized in Algorithm 2, whose correctness we prove in the following lemma.
Lemma 12 (cf. [GMW20, Lemma 8]).
Assume that we first initialize $\mathrm{score}(f) = C(f)$ for every $f \in T$, and then run Algorithm 2 (doing nothing in line 6). Then whenever an edge $e$ is followed on line 5 in the call to $\mathrm{Traverse}$ it holds that $\mathrm{score}(f) = \mathrm{cost}(e, f) - C(e)$ for all $f \in T_e$.
Proof.
We will prove this by induction on the depth of $e$. Consider the case where $e$ is an edge outgoing from the root $r$. Before the call to $\mathrm{Traverse}(r)$ we initialized all scores as $\mathrm{score}(f) = C(f)$. Then, on line 3 of $\mathrm{Traverse}(r)$, for each $g = \{u, z\}$ with $\mathrm{lca}(u, z) = r$ we subtract $2 w(g)$ from the score of every edge on the $u$ to $z$ path in $T$. Let us refer to scores at this point in time as “at time zero.” We first claim that at time zero, for any outgoing edge $e$ of the root, it holds that $\mathrm{score}(f) = \mathrm{cost}(e, f) - C(e)$ for all $f \in T_e$.

By Eq. (1) we have $\mathrm{cost}(e, f) - C(e) = C(f) - 2\,w(f^{\downarrow}, e^{\downarrow c})$, thus it suffices to show that for any $f \in T_e$

$$w(f^{\downarrow}, e^{\downarrow c}) = \sum_{g = \{u, z\} :\ \mathrm{lca}(u, z) = r,\ f \in P(u, z)} w(g).$$

This holds because by definition $g \in \Delta(f^{\downarrow})$ iff one endpoint of $g$ is in $f^{\downarrow}$ and the other endpoint is in $f^{\downarrow c}$, which in turn happens iff $f$ lies on the path between the endpoints; and given that one endpoint $u$ of $g$ lies in $f^{\downarrow} \subseteq e^{\downarrow}$, the other endpoint lies in $e^{\downarrow c}$ iff $\mathrm{lca}(u, z) = r$.

To finish the base case, we claim that at each iteration of the for loop all scores are the same as at time zero. This is because in the recursive calls that follow, the update to the scores on line 3 is exactly canceled out by the reverse update on line 10 when the recursive call exits.

For the inductive step, let us suppose that when an edge $e$ is followed on line 5 in the call to $\mathrm{Traverse}$ it holds that $\mathrm{score}(f) = \mathrm{cost}(e, f) - C(e)$ for all $f \in T_e$. Let us now refer to scores at this point in time as “at time zero.” We then want to show that on line 5 in the call to $\mathrm{Traverse}(v')$, where $v'$ is the head of $e$, for an outgoing edge $e'$ of $v'$ it holds that $\mathrm{score}(f) = \mathrm{cost}(e', f) - C(e')$ for all $f \in T_{e'}$. The change in the scores from time zero to the execution of the for loop in the call to $\mathrm{Traverse}(v')$ occurs in the update on line 3. Let us refer to the scores after this update as “at time one.” We first show that at time one, for any outgoing edge $e'$ of $v'$, it holds that $\mathrm{score}(f) = \mathrm{cost}(e', f) - C(e')$ for all $f \in T_{e'}$. The key to this is to consider the difference between $\mathrm{cost}(e', f)$ and $\mathrm{cost}(e, f)$ for an $f \in T_{e'}$. By Eq. (1) we have $\mathrm{cost}(e', f) - C(e') = C(f) - 2\,w(f^{\downarrow}, e'^{\downarrow c})$, and by the inductive hypothesis at time zero $\mathrm{score}(f) = C(f) - 2\,w(f^{\downarrow}, e^{\downarrow c})$. Thus by Eq. (2) we need to update the score by

$$\mathrm{score}(f) \leftarrow \mathrm{score}(f) - 2\,w(f^{\downarrow}, e^{\downarrow} \setminus e'^{\downarrow}). \tag{3}$$

To see this, first note that $e'^{\downarrow c} = e^{\downarrow c} \cup (e^{\downarrow} \setminus e'^{\downarrow})$. This confirms that we should subtract something to perform this update. An edge $g$ contributes to $w(f^{\downarrow}, e'^{\downarrow c})$ but not to $w(f^{\downarrow}, e^{\downarrow c})$ iff one endpoint, say $u$, is in $f^{\downarrow}$ and the other endpoint $z$ is in $e^{\downarrow} \setminus e'^{\downarrow}$. This means that $\mathrm{lca}(u, z) = v'$, and the condition $u \in f^{\downarrow}$ is then equivalent to having $f$ on the path between $u$ and $z$. This confirms that the update on line 3 performs exactly the update of Eq. (3).

To finish the proof, we claim that the scores equal their time-one values not just at time one, but also at the time when the for loop with $e'$ is executed. This is again because the changes to the scores on line 3 that are made in a recursive call are reversed when the recursive call exits on line 10, thus every time the for loop is executed the scores are the same as the scores at time one. ∎
Given Lemma 12 to maintain $\mathrm{score}(f) = \mathrm{cost}(e, f) - C(e)$ for $f \in T_e$ during an Euler tour of the tree, and with the data structure of Lemma 7 to handle categorical top two queries, it is now straightforward to design an algorithm to find for every edge $e$ a partner that is a descendant or ancestor of $e$, if such a partner exists.
Theorem 13.
Given an assignment $\mathrm{col}(e)$ of colors for each $e \in T$, there is a deterministic algorithm that runs in time $O(m \log^2 n)$ and for each $e \in T$ finds an $f$ such that

1. $e$ and $f$ are descendant edges,

2. $\mathrm{col}(f) \ne \mathrm{col}(e)$, and

3. $\mathrm{cut}(e, f)$ is a non-trivial cut with $\mathrm{cost}(e, f) \le \kappa$,

if such an $f$ exists.
Proof.
The algorithm is given by Algorithm 3. Suppose that an edge $e$ has a partner $f$ satisfying the 3 conditions of the theorem. Then either $f \in T_e$ or $e \in T_f$. We claim that if $f \in T_e$ then we will find a partner of $e$ in the call to $\mathrm{Traverse}$ using $\mathrm{Below}$ to process edges, and if $e \in T_f$ then we will find a partner of $e$ in the call to $\mathrm{Traverse}$ using $\mathrm{Above}$ to process edges.

Let us show these statements separately, starting with the case $f \in T_e$. Consider the time when $e$ is considered in the for loop on line 5 in a recursive call from $\mathrm{Traverse}$ using $\mathrm{Below}$ to process edge $e$. In the call to $\mathrm{Below}$ we first check if the head of $e$ has a single outgoing edge $f'$. If this is the case then $\mathrm{cut}(e, f')$ is a trivial cut and thus we do not want to find $f'$ as a partner for $e$. We thus add a large value to the score of $f'$, ensuring that it will never be returned as a valid partner for $e$. By Proposition 8 this is the only situation where $\mathrm{cut}(e, f')$ is trivial for $f' \in T_e$. By Lemma 12, for all other $f' \in T_e$ it holds that $\mathrm{score}(f') = \mathrm{cost}(e, f') - C(e)$. Thus the call to $\mathrm{CatTopTwo}(e)$ will perform correctly, and one of the returned edges must be a valid partner for $e$. We then reset the score of $f'$, if it was changed, to maintain the property given by Lemma 12.

Now consider the case where $e$ has a partner $f$ with $e \in T_f$. Let $f$ be the first such partner that is encountered in an Euler tour of $T$. We claim that the pair $\{e, f\}$ will be added to the output in the call to $\mathrm{Traverse}$ using $\mathrm{Above}$ to process edge $f$. First note that after the previous call to $\mathrm{Traverse}$ terminates it holds that $\mathrm{score}(f') = C(f')$ for all $f'$, as all changes to the scores in the recursive calls are reverted after the call returns. Thus we are again in position to apply Lemma 12, although we have to be slightly more careful this time as scores are also modified within the body of the for loop on line 5 of Algorithm 2 when we run $\mathrm{Above}$ to process an edge. We again handle the possibility of trivial cuts as in the “below” case. We also modify the score of an edge after a partner for it has been found, but then our job for that edge is done and we no longer need to use its score. As by assumption $f$ is the first potential partner for $e$ encountered in the Euler tour, the score of $e$ has not been modified at this point. Thus by Lemma 12 it holds that $\mathrm{score}(e) = \mathrm{cost}(f, e) - C(f)$. This means that $e$ will eventually be found in the repeat loop on line 7 of the function $\mathrm{Above}$.

Let us now analyze the running time. Computing $C(e)$ for each edge $e$ can be done in $O(m)$ time by Lemma 3. Before proceeding with the traversal, we gather, for each node $v$, all edges $g = \{u, z\} \in E$ such that $\mathrm{lca}(u, z) = v$. This can be done in $O(m)$ time by constructing in $O(n)$ time a constant-time LCA structure [BF00] and iterating over the edges. Next consider the body of the for loop in the $\mathrm{Below}$ function. Here we make a single $\mathrm{CatTopTwo}$ query, which takes $O(\log n)$ time by Lemma 7. Thus over the entire Euler tour these queries contribute $O(n \log n)$ to the running time. In the $\mathrm{Above}$ function all $\mathrm{CatTopTwo}$ queries in the body of the for loop except for one (when noMore becomes true) result in de-activating an edge. Thus again the total query time over the Euler tour is $O(n \log n)$.

Finally, consider the cost of updating the scores in the Euler tour. As discussed earlier, over the course of the Euler tour this requires doing 2 calls to $\mathrm{AddPath}$ for every edge of $G$. Each $\mathrm{AddPath}$ call can be done in $O(\log^2 n)$ time by Lemma 7, thus the overall time for this is $O(m \log^2 n)$, which dominates the complexity of the algorithm. ∎
4.2 Independent edges
The goal now is to find, for every edge $e \in T$, a partner $f$ such that $e, f$ are independent, or decide that there is no such $f$. As we chose the root of $T$ to have degree 1, by Proposition 8 we do not have to worry about trivial cuts in the independent edge case. Instead of considering all edges one-by-one, we first find a heavy path decomposition of $T$ and then iterate over pairs of heavy paths to look for a partner for every $e \in T$. We cannot literally carry out this plan as the number of pairs of heavy paths can be $\Omega(n^2)$ and so we cannot explicitly consider every pair. We show next that many pairs result in a trivial case and that all these trivial pairs can be solved together in one batch. We then bound the number of non-trivial pairs and show that in near-linear time we can explicitly process all of them. The idea of processing pairs of heavy paths, and explicitly considering only the non-trivial ones, was introduced in the context of 2-respecting cuts by Mukhopadhyay and Nanongkai [MN20] (see also [GMW20]).
Consider two distinct heavy paths $P, Q$, where $P$ consists of the edges $e_1, \ldots, e_p$ and $Q$ of the edges $f_1, \ldots, f_q$, in both cases numbered starting from the tail of the path. It can be that not all pairs $(e_i, f_j)$ are independent, see Fig. 1. However, we can easily identify the subpaths of $P, Q$ containing pairwise independent edges in constant time by computing the least common ancestor $z$ of the tails of $P, Q$. If $z$ lies on $P$ then $(e_i, f_j)$ will be independent exactly for the $e_i$ below $z$ and all $f_j$, and similarly if $z$ lies on $Q$; if $z$ lies on neither path then all pairs are independent. In general we assume that $p'$ and $q'$ have been determined so that $(e_i, f_j)$ are independent for all $i \in [p']$ and $j \in [q']$, and that these pairs comprise all of the independent pairs on $P \times Q$. We can associate to $P, Q$ a $p'$-by-$q'$ matrix $M^{P,Q}$ where for $i \in [p']$ and $j \in [q']$

$$M^{P,Q}[i, j] = \mathrm{cost}(e_i, f_j) = C(e_i) + C(f_j) - 2\,w(e_i^{\downarrow}, f_j^{\downarrow}), \tag{4}$$

and $M^{P,Q}[i, j]$ is undefined otherwise. (We could restrict $M^{P,Q}$ to the submatrix on which it is defined, but find it notationally easier for the indices in $M^{P,Q}$ to match the edge labels.) By Lemma 3, all values $C(e)$ can be computed in total time $O(m)$. To efficiently evaluate the last term of Eq. (4), we will prepare a list $L_{P,Q}$ of all edges of $G$ that contribute to $w(e_i^{\downarrow}, f_j^{\downarrow})$ for independent $e_i, f_j$ with $e_i \in P$, $f_j \in Q$. For many pairs $P, Q$ the list $L_{P,Q}$ will be empty, leading to the trivial case mentioned above. The following lemma bounds the size of all the non-empty lists and shows they can be constructed efficiently.
Lemma 14.
The total length of all lists $L_{P,Q}$ is $O(m \log^2 n)$ and all non-empty lists can be constructed deterministically in time $O(m \log^2 n)$.
Proof.
Observe that an edge $\{x, y\} \in E$ can contribute to $w(e_i^{\downarrow}, f_j^{\downarrow})$ for independent $e_i \in P$, $f_j \in Q$ only if $x$ is in the subtree rooted at the head of $P$ and $y$ is in the subtree rooted at the head of $Q$, or vice versa. There are at most $O(\log n)$ heavy paths intersecting the path from $x$ to the root and from $y$ to the root, and we can iterate over all such heavy paths in time proportional to their number (for example, by storing, for each edge of $T$, a pointer to the head of the heavy path that contains it). Thus, we can iterate over all relevant pairs $P, Q$ in $O(\log^2 n)$ time per edge, and add a triple $(P, Q, \{x, y\})$ to an auxiliary list in which the heavy paths are identified by their heads. The total size of the auxiliary list is $O(m \log^2 n)$ and it can be lexicographically sorted in the same time with radix sort. After sorting, each non-empty list $L_{P,Q}$ constitutes a contiguous fragment of the auxiliary list. ∎
We can now describe how to find a partner for every $e$ such that $e$ and its partner are independent. The algorithm first solves together in one batch the case where the partner of $e$ is in a heavy path $Q$ for which $L_{P,Q}$ is empty, where $P$ is the heavy path containing $e$. After that we explicitly consider all pairs $P, Q$ with $L_{P,Q}$ non-empty. We consider these two cases in the next two subsections.
4.2.1 Empty lists
Lemma 15.
There is a deterministic algorithm that in time $O(m)$ finds a partner for every edge $e \in T$ that has a partner $f$ such that $e, f$ are independent and $L_{P,Q}$ is empty for the heavy paths $P, Q$ containing $e, f$.
Proof.
The key observation is that if $L_{P,Q}$ is empty then $\mathrm{cost}(e, f) = C(e) + C(f)$ by Eq. (4). As can be seen from Eq. (1) and Eq. (4), for any pair of edges $e, f$ it always holds that $\mathrm{cost}(e, f) \le C(e) + C(f)$, whether $e, f$ are descendant or independent. Thus in this case it suffices for us to find any $f$ of color different from $\mathrm{col}(e)$ such that $C(e) + C(f) \le \kappa$ and $\mathrm{cut}(e, f)$ is non-trivial, as this ensures $\mathrm{cost}(e, f) \le \kappa$. We are guaranteed such an $f$ exists as the promised partner satisfies these conditions.

By Lemma 3 we can compute $C(f)$ for every $f \in T$ in time $O(m)$. Then in time $O(n)$ with one pass over the edges of $T$ we compute the edge $f_1$ of lowest $C$-cost and the edge $f_2$ of lowest $C$-cost that is of color different to that of $f_1$. We then repeat this categorical top two query twice more, each time excluding all previously found edges. At the end we obtain edges $f_1, \ldots, f_6$. We claim that for every $e$ with a partner of the type described in the lemma, at least one of these must be a valid partner.

Consider any particular such $e$. The first categorical top two query can only fail to provide a valid partner for $e$ if one of $f_1, f_2$ creates a trivial cut with $e$. In this case, the second categorical top two query can only fail if one of $f_3, f_4$ creates a trivial cut with $e$ as well. By Proposition 8, however, there are at most two possible edges that can create a trivial cut with $e$, thus in this case the third categorical top two query must succeed and we find a valid partner for $e$. ∎
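A sketch of this batch step; the values $C$, the colors, the threshold $\kappa$, and the trivial-pair test of Proposition 8 are taken as given, and all helper names are ours:

```python
def batch_partners(edges, C, col, kappa, is_trivial_pair):
    """Six global candidates suffice: three categorical top-two passes,
    each excluding previously found edges (cf. Lemma 15). Valid here means
    C(e) + C(f) <= kappa, which upper bounds cost(e, f)."""
    cands, excluded = [], set()
    for _ in range(3):
        pool = [f for f in edges if f not in excluded]
        if not pool:
            break
        f1 = min(pool, key=lambda f: C[f])
        rest = [f for f in pool if col[f] != col[f1]]
        f2 = min(rest, key=lambda f: C[f]) if rest else None
        for f in (f1, f2):
            if f is not None:
                cands.append(f)
                excluded.add(f)
    partner = {}
    for e in edges:                 # check the <= 6 candidates per edge
        for f in cands:
            if (f != e and col[f] != col[e]
                    and C[e] + C[f] <= kappa
                    and not is_trivial_pair(e, f)):
                partner[e] = f
                break
    return partner
```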
4.2.2 Non-empty lists
The more difficult case is to find partners among pairs $P, Q$ with $L_{P,Q}$ non-empty. To solve this case we will use the special structure of $M^{P,Q}$. As above, say that $P$ is the path $e_1, \ldots, e_p$ and $Q$ is the path $f_1, \ldots, f_q$, both numbered from the tail, and suppose $(e_i, f_j)$ are independent for all $i \in [p']$, $j \in [q']$. We have $M^{P,Q}[i, j] = C(e_i) + C(f_j) - 2\,w(e_i^{\downarrow}, f_j^{\downarrow})$ for such $i, j$. Recall that $L_{P,Q}$ is defined precisely as the list of edges of $G$ that contribute to $w(e_i^{\downarrow}, f_j^{\downarrow})$ for independent $e_i, f_j$. The contribution of a specific edge $g = \{x, y\} \in L_{P,Q}$, with $x$ below $P$ and $y$ below $Q$, can be understood as follows: let $e_a$ be the lowest edge of $P$ such that $x \in e_a^{\downarrow}$, and $f_b$ be the lowest edge of $Q$ such that $y \in f_b^{\downarrow}$. Then the weight of $g$ contributes to $M^{P,Q}[i, j]$ for every $i \ge a$, $j \ge b$ within the range where $M^{P,Q}$ is defined. This is depicted in Fig. 1. We will compute these indices $a(g)$ and $b(g)$ for every $g \in L_{P,Q}$. This takes constant time per edge using an appropriate LCA structure [BF00], and so total time $O(m \log^2 n)$ over all lists. Let $L'_{P,Q}$ denote the resulting list of index pairs, each of which has an associated weight.
[Figure 1: two heavy paths $P$ and $Q$, and an edge $g = \{x, y\}$ of $G$ with $x$ below $P$ and $y$ below $Q$; the weight of $g$ contributes to $M^{P,Q}[i, j]$ exactly for the independent pairs with $i \ge a(g)$ and $j \ge b(g)$.]
Lemma 16.
Let $L = \sum_{P, Q} |L_{P,Q}|$. There is a deterministic algorithm to find a partner for every $e \in T$ that has a partner $f$ such that $e, f$ are independent and lie on heavy paths $P, Q$ with $L_{P,Q}$ non-empty, in time $O(m + L \log n)$.
Proof.
The algorithm is given in Algorithm 4. We describe the algorithm here and analyze its correctness and running time.
For every heavy path $R$ with edges $r_1, \ldots, r_{|R|}$ let $A_R$ be an array with $A_R[i].\mathrm{score} = C(r_i)$ and $A_R[i].\mathrm{color} = \mathrm{col}(r_i)$ for every $i$. Via Lemma 6 there is a data structure that supports interval updates and $\mathrm{CatTopTwo}$ queries on $A_R$ in $O(\log n)$ time per operation. The total time for this initialization step is $O(n)$.

Let $\Pi$ be an ordered list of pairs of heavy paths that contains $(P, Q)$ and $(Q, P)$ for every $P, Q$ with $L_{P,Q}$ non-empty. We sort $\Pi$ by the name of the first path with radix sort in time $O(m \log^2 n)$. We will follow $\Pi$ to iterate over all pairs with non-empty lists.

Let us describe what the algorithm does when considering a pair $(P, Q)$, where $P$ consists of edges $e_1, \ldots, e_p$ and $Q$ consists of the edges $f_1, \ldots, f_q$, where the edges $(e_i, f_j)$ are independent for $i \in [p']$, $j \in [q']$, and these comprise all the independent pairs in $P \times Q$. We iterate over the columns of $M^{P,Q}$, starting from $j = 1$ and going until $q'$, and maintain the invariant that, when considering column $j$, it holds that $A_P[i].\mathrm{score} = M^{P,Q}[i, j] - C(f_j)$ for every active edge with index $i \in [p']$ (where active will be defined later). We postpone describing how to maintain this invariant for the moment. Then we do a $\mathrm{CatTopTwo}$ query on $A_P[1..p']$ which returns potential candidates $e_{i_1}, e_{i_2}$. If there is an edge $e_i$ for which $f_j$ is a valid partner then $f_j$ must be a partner for either $e_{i_1}$ or $e_{i_2}$. This can be checked in constant time. If $f_j$ is not a partner for either then we move on to column $j + 1$; if it is a partner for, say, $e_{i_1}$, then we add a large value to $A_P[i_1].\mathrm{score}$ to “de-activate” $e_{i_1}$ and repeat the process by doing a $\mathrm{CatTopTwo}$ query again on $A_P[1..p']$ until no valid partner is returned.
The basic algorithm we have described considers every column of $M^{P,Q}$ from $1$ to $q'$. We now show how to accelerate this algorithm by restricting our attention to a subset of the columns in this interval. The pairs in $L'_{P,Q}$ can be assumed sorted by the second coordinate, as this can be ensured during the global radix sort of Lemma 14. Let $j_1 < j_2 < \cdots < j_t$ be the distinct values of the second coordinate that appear in this sorted list, where $t \le |L_{P,Q}|$. Set $j_0 = 1$ and $j_{t+1} = q' + 1$. For $0 \le k \le t$ we have that $w(e_i^{\downarrow}, f_j^{\downarrow})$ is constant over $j \in [j_k, j_{k+1})$ for every $i$, by the definition of the $j_k$. We call such an interval a void interval. Thus the minimum of $M^{P,Q}[i, j]$ over a void interval necessarily occurs at a column minimizing $C(f_j)$ over the interval. This means that if an edge $e_i$ has a partner $f_j$ with $j$ in a void interval, then one of the two columns returned by a $\mathrm{CatTopTwo}$ query on $A_Q$ restricted to the interval can be completed to a partner pair.

We can thus amend the algorithm to the following. For $k = 0, 1, \ldots, t$ we iterate over the void intervals. When considering $[j_k, j_{k+1})$ we maintain the invariant that $A_P[i].\mathrm{score} = C(e_i) - 2\,w(e_i^{\downarrow}, f_j^{\downarrow})$ for every active $i \in [p']$ and every $j$ in the interval. Thus for all such $i, j$ it holds that $M^{P,Q}[i, j] = A_P[i].\mathrm{score} + C(f_j)$. We then do a $\mathrm{CatTopTwo}$ query on $A_P[1..p']$ and a $\mathrm{CatTopTwo}$ query on $A_Q$ with the interval $[j_k, j_{k+1} - 1]$. If any active $e_i$ has a partner $f_j$ with $j$ in the void interval, at least one of the 4 possible pairs formed from the two returned rows and the two returned columns must be a pair of partners. We de-activate any $e_i$ which finds a partner by adding a large value to its score, and repeat the process until no valid partners are found, at which point we move on to the next void interval. If $s$ partners are found in a void interval then the total time spent in it is $O((1 + s) \log n)$ for the $\mathrm{CatTopTwo}$ queries and the updates to de-activate edges.

It remains to describe how to maintain the invariant when we advance to the void interval starting at $j_k$. To do this, for every pair $(a, j_k)$ in $L'_{P,Q}$ with associated weight $w$ we subtract $2w$ from the scores of $A_P$ in the interval $[a, p']$. Each such interval update can be done in $O(\log n)$ time by Lemma 6, so the total time for all updates is $O(|L_{P,Q}| \log n)$.

Once we finish processing the pair $(P, Q)$, we reverse all of the interval updates (but not the edge de-activations) so that we again have $A_P[i].\mathrm{score} = C(e_i)$ for all active edges $e_i$ of $P$. This again can be done in time $O(|L_{P,Q}| \log n)$. Once we finish processing all pairs associated with $P$ as the first path, we subtract the de-activation value from the scores of all edges of $P$ that were de-activated, to make them active again.

The total number of edge de-activations is at most $n$, thus this contributes $O(n \log n)$ to the running time and is low order. Over all pairs the total time spent is $O(m + L \log n)$. ∎
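A sketch of the sweep for a single pair $(P, Q)$, with naive linear-time scans standing in for the Lemma 6 data structure (so the sketch does not achieve the stated time bound, but follows the same logic; in particular, de-activations here persist only within the pair):

```python
def naive_cat_top_two(scores, colors, eligible):
    """O(n) stand-in for a CatTopTwo query of Lemma 6."""
    idx = [i for i in range(len(scores)) if eligible[i]]
    if not idx:
        return []
    b = min(idx, key=lambda i: scores[i])
    rest = [i for i in idx if colors[i] != colors[b]]
    return [b] + ([min(rest, key=lambda i: scores[i])] if rest else [])

def sweep_pair(CP, colP, CQ, colQ, pairs, kappa):
    """Process one pair (P, Q): M[i][j] = CP[i] + CQ[j] - 2 * (sum of w over
    triples (a, b, w) in pairs with a <= i and b <= j). Returns a partner
    column for every row that has one; found rows are de-activated."""
    p, q = len(CP), len(CQ)
    score, active = CP[:], [True] * p
    pairs = sorted(pairs, key=lambda t: t[1])
    bounds = sorted({0, q} | {b for (_, b, _) in pairs if b < q})
    found, k = {}, 0
    for lo, hi in zip(bounds, bounds[1:]):        # void intervals [lo, hi)
        while k < len(pairs) and pairs[k][1] <= lo:
            a, _, w = pairs[k]
            k += 1
            for i in range(a, p):                 # interval update on rows
                score[i] -= 2 * w
        while True:
            rows = naive_cat_top_two(score, colP, active)
            cols = naive_cat_top_two(CQ[lo:hi], colQ[lo:hi], [True] * (hi - lo))
            hit = None
            for i in rows:                        # check the <= 4 pairs
                for dj in cols:
                    j = lo + dj
                    if colP[i] != colQ[j] and score[i] + CQ[j] <= kappa:
                        hit = (i, j)
            if hit is None:
                break
            found[hit[0]] = hit[1]
            active[hit[0]] = False                # de-activate the row
            score[hit[0]] = float("inf")
    return found
```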
4.3 Spanning tree algorithm
We now have all components of the spanning tree algorithm, which we can combine together to implement $\mathrm{FindPartners}$.
Lemma 17.
There is a deterministic algorithm to implement $\mathrm{FindPartners}$ which runs in time $O(m \log^3 n)$.
Proof.
Given an assignment of colors to the edges of $T$, our task is to find a partner for every $e \in T$ which has one. If $e$ has a partner $f$ such that $e, f$ are in a descendant relationship then a partner for $e$ can be found in $O(m \log^2 n)$ time by Theorem 13. The other case is that $e$ has a partner $f$ such that $e, f$ are independent. This divides into two subcases. If the heavy paths $P, Q$ containing $e, f$ respectively are such that $L_{P,Q}$ is empty then we will find a partner for $e$ via Lemma 15 in time $O(m)$. The bottleneck of the algorithm is the case where $L_{P,Q}$ is non-empty, in which case we use Lemma 16 to find a partner in time $O(m + L \log n) = O(m \log^3 n)$, using the bound $L = O(m \log^2 n)$ of Lemma 14. ∎
We can now prove the main theorem of this section, Theorem 9: by Lemma 11, with $O(\log n)$ calls to the implementation of $\mathrm{FindPartners}$ from Lemma 17, we can find a spanning forest of $H$ in time $O(m \log^4 n) = \widetilde O(m)$.
5 KT partition algorithm
For completeness we state here the full KT partition algorithm, including the reductions from [AL21]. At a high level, we follow Karger’s algorithm to find $O(\log n)$ spanning trees so that with high probability every $(1+\varepsilon)$-near minimum cut 2-respects at least one of them. We then use our algorithm from Theorem 9 to find, for each tree $T_i$, a generating set for the meet of all non-trivial $(1+\varepsilon)$-near minimum cuts that 2-respect $T_i$. We are then left with two problems. The first is that we still have to compute the meet of the partitions in the generating set. A near-linear time randomized algorithm was given to do this in [AL21]; here we give a deterministic algorithm. Then we need to take the meet of the $O(\log n)$ partitions obtained, one for each tree. This is simple to do and we handle it first.
Lemma 18.
Let $\mathcal{P}_1, \ldots, \mathcal{P}_k$ be partitions of $[n]$. There is a deterministic algorithm to compute $\mathcal{P}_1 \wedge \cdots \wedge \mathcal{P}_k$ in time $O(kn)$.
Proof.
In $O(kn)$ time we can assign each $i \in [n]$ a length-$k$ key whose $j$-th entry indicates which set of $\mathcal{P}_j$ contains $i$. Collecting together elements with the same key, for instance with radix sort, gives $\mathcal{P}_1 \wedge \cdots \wedge \mathcal{P}_k$. ∎
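A direct rendering of this proof (hash-based grouping stands in for radix sort):

```python
def meet_of_partitions(partitions, n):
    """Lemma 18: meet of k partitions of {0, ..., n-1}. Building the keys
    takes O(kn) time; grouping equal keys (here by hashing, in the lemma
    by radix sort) yields the blocks of the meet."""
    key = [[] for _ in range(n)]
    for part in partitions:
        for block_id, block in enumerate(part):
            for v in block:
                key[v].append(block_id)
    blocks = {}
    for v in range(n):
        blocks.setdefault(tuple(key[v]), []).append(v)
    return list(blocks.values())

# Meet of {{0,1},{2,3}} and {{0},{1,2,3}} is {{0},{1},{2,3}}.
print(meet_of_partitions([[[0, 1], [2, 3]], [[0], [1, 2, 3]]], 4))
```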
Next we see how to explicitly construct, from a generating set, the meet of all non-trivial $(1+\varepsilon)$-near minimum cuts that 2-respect a tree. We follow the idea of the proof in [AL21] but make it deterministic by replacing random hashing with an appropriate data structure.
Lemma 19 (Mehlhorn, Sundar, and Uhrig [MSU97]).
A dynamic family of persistent sequences, each of length at most $n$, can be maintained under the following updates. A length-1 sequence is created in constant time, and a new sequence can be obtained by joining and splitting existing sequences in time polylogarithmic in $u$, where $u$ is the number of updates executed so far. Each sequence $s$ has an associated signature $\mathrm{sig}(s)$, a positive integer bounded polynomially in $u$, with the property that $\mathrm{sig}(s_1) = \mathrm{sig}(s_2)$ iff $s_1 = s_2$.
For the proof of the next lemma it will be useful to use the following definition.
Definition 20 (separate).
Let $V$ be a finite set and $S \subseteq V$. For $u, v \in V$ we say that $S$ separates $u$ and $v$ if exactly one of them is in $S$.
Lemma 21 (cf. [AL21, Lemma 31]).
Consider as input a tree $T$ on a vertex set $V$ of size $n$, and sets of edge singletons $S_1 \subseteq E(T)$ and edge pairs $S_2 \subseteq \binom{E(T)}{2}$. These sets define sets of 1-respecting and 2-respecting cuts of $T$, respectively. There is an algorithm that in time $\widetilde O(n + |S_1| + |S_2|)$ returns the meet of the bipartitions induced by these cuts.
Proof.
We root the tree at an arbitrary vertex $r$. When we speak of the shore of a cut we always refer to the shore not containing $r$. Arrange all elements of $S_1$ and $S_2$ in an arbitrary order to obtain a sequence of cuts $X_1, \ldots, X_k$ with $k = |S_1| + |S_2|$. Our goal is to construct, for each node $v$, a string $x(v) \in \{0, 1\}^k$ where the $i$-th bit of $x(v)$ is 1 iff the shore of the cut $X_i$ contains $v$. Assuming that we can indeed efficiently construct such strings, the meet is obtained by grouping together nodes with the same string $x(v)$. However, the difficulty is that we cannot afford to construct all the $x(v)$ explicitly, as this would require $\Omega(nk)$ bits. Instead, we will use Lemma 19 for representing the collection of strings $x(v)$, each of length $k$.

Consider the preorder traversal of $T$ starting from the root $r$. By definition $x(r) = 0^k$, which we create in the data structure by $k - 1$ joins of length-1 sequences. We then create $x(v)$ from the string $x(u)$ during the preorder traversal, where $u$ is the parent of $v$. To do this we set $x(v) = x(u)$ and then flip the bits of $x(v)$ corresponding to cuts whose shore contains $u$ but not $v$ or vice versa. Thus we need to understand when the shore of a cut separates $v$ from its parent $u$. The shore of a 1-respecting cut defined by edge $e$ is $e^{\downarrow}$, and hence it separates $u$ and $v$ iff $e = \{u, v\}$. A 2-respecting cut defined by edges $e, f$ separates two vertices iff exactly one of $e, f$ is on the path between them. Thus a 2-respecting cut $\{e, f\}$ will separate $u$ and $v$ iff either $e = \{u, v\}$ or $f = \{u, v\}$. Hence there will be at most $|S_1| + 2|S_2|$ bit flips in total, and in $O(n + |S_1| + |S_2|)$ time we can annotate the tree with which bits should be flipped at each node.

A bit flip can be implemented in the data structure by a constant number of splits, joins, and the creation of a length-1 sequence. As there are $O(n + |S_1| + |S_2|)$ total operations on the data structure, the total time for all updates is $\widetilde O(n + |S_1| + |S_2|)$ by Lemma 19.

Having obtained all the strings $x(v)$, we can group together nodes with the same string by sorting their signatures $\mathrm{sig}(x(v))$. Because each signature is a positive integer bounded polynomially in the number of updates by Lemma 19, this can be implemented with radix sort in $\widetilde O(n)$ time. This gives the lemma. ∎
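For intuition, here is the randomized variant that Lemma 21 derandomizes: give every cut a random machine word and let a node's key be the XOR of the words of the cuts whose shore contains it. The XOR updates along the preorder traversal mirror the bit flips above; persistent sequences with signatures replace the random words in the deterministic version.

```python
import random

def meet_of_tree_cuts(parent, order, singletons, pairs):
    """parent: parent[v] for each non-root v; order: preorder list of nodes.
    singletons: tree edges e = (u, v) (1-respecting cuts, shore below e).
    pairs: pairs of tree edges (2-respecting cuts).
    Returns the meet as lists of nodes, correct w.h.p. over the random words."""
    flip = {}                    # tree edge -> XOR of words of cuts flipping it
    def mark(e, word):
        flip[e] = flip.get(e, 0) ^ word
    for e in singletons:
        mark(e, random.getrandbits(64))
    for e, f in pairs:
        word = random.getrandbits(64)
        mark(e, word)            # the cut separates the endpoints of e
        mark(f, word)            # ... and the endpoints of f
    key = {order[0]: 0}          # the root is never on a shore
    for v in order[1:]:
        u = parent[v]
        key[v] = key[u] ^ flip.get((u, v), 0)
    groups = {}
    for v in order:
        groups.setdefault(key[v], []).append(v)
    return list(groups.values())

parent = {1: 0, 2: 0, 3: 1}
order = [0, 1, 3, 2]
print(meet_of_tree_cuts(parent, order,
                        singletons=[(0, 2)], pairs=[((0, 1), (1, 3))]))
```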
Input: A weighted graph $G = (V, E, w)$ and a parameter $0 \le \varepsilon < 1$
Output: The $(1+\varepsilon)$-KT partition $\bigwedge \mathcal{B}^{nt}_\varepsilon(G)$ of $G$
We are now ready to prove our main theorem, Theorem 1.
Proof.
We first prove the theorem for . The algorithm for computing is given in Algorithm 5. Let us first argue the correctness. Step 1 succeeds with high probability by Theorem 2, and the rest of the algorithm is deterministic. Thus if we show that the algorithm is correct assuming that Step 1 succeeds, then the algorithm will be correct with high probability.
Let us now assume that Step 1 succeeds. Then in Step 2 . Let be the set of bipartitions of all non-trivial -minimum cuts of that 2-respect , for . We have that . Therefore
For each we have by the correctness of our main algorithm Theorem 9 and Lemma 5. We compute via Lemma 21. Finally, we compute in Step 8 by Lemma 18.
Now let us go over the time complexity. Step 1 runs in time by Theorem 2. Step 2 takes time by Lemma 4. In the for loop, Step 4 takes time by Lemma 3; Step 5 takes time by Theorem 9; Step 6 takes time by Lemma 21. Thus the time in the for loop is dominated by Step 4, and the total time taken over the iterations is . The last step takes time . Thus the complexity overall is .
To finish the proof of the theorem let us handle the case $\varepsilon = 0$. We claim that given the value of $\lambda$ we can compute the $0$-KT partition from the cuts collected in the run for $\varepsilon > 0$ deterministically, in time dominated by that run. In linear time we can identify the subset of the collected cuts whose weight is exactly $\lambda$. Let $\mathcal{B}_0$ be the corresponding set of bipartitions, and note that the $0$-KT partition is $\bigwedge \mathcal{B}_0$. The meet of a single bipartition $\{X, V \setminus X\}$ is simply the partition consisting of the sets $X$ and $V \setminus X$. To take the meet of such a partition with a partition $P$ we simply cycle through each $S \in P$ and split $S$ into the sets $S \cap X$ and $S \setminus X$, which can be done in time linear in $|S|$. Thus the total time of computing the $0$-KT partition is dominated by the computation for $\varepsilon > 0$, and hence it can be done asymptotically in the same time. ∎
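The splitting step used in the $\varepsilon = 0$ case is elementary; here is a minimal sketch (names are ours, and a partition is represented as a list of vertex lists):

```python
def refine(partition, shore):
    """Take the meet of `partition` with the bipartition {shore, rest}.

    Splits every set S of the partition into S ∩ shore and S \\ shore,
    dropping empty pieces; one pass over all vertices, hence linear
    time per bipartition.
    """
    shore = set(shore)
    refined = []
    for part in partition:
        inside = [v for v in part if v in shore]
        outside = [v for v in part if v not in shore]
        if inside:
            refined.append(inside)
        if outside:
            refined.append(outside)
    return refined

# e.g. refine([[0, 1, 2, 3]], {1, 3}) == [[1, 3], [0, 2]]
```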
6 Applications
In this section we give two applications of our main result: an improved quantum algorithm for minimum cut in weighted graphs in the adjacency list model, and a new randomized algorithm with running time $O(m) + \widetilde{O}(n)$ to compute the edge connectivity of a simple graph.
6.1 Quantum algorithm for minimum cut in weighted graphs
In a recent work by Apers and Lee [AL21] the quantum complexity of the minimum cut problem was studied. They distinguish two models for querying a weighted input graph $G = (V, E, w)$. In the adjacency matrix model a query is a pair of vertices $\{u, v\}$ and the answer to the query reveals whether $\{u, v\} \in E$, and if so, also returns the weight $w(\{u, v\})$. In the adjacency list model a query is a vertex $v$ and an integer $i$, and the answer to the query is the $i$-th neighbor of $v$ (if it exists) together with the weight of the connecting edge. The main results from [AL21] depend on the edge-weight ratio $\tau$, defined as the ratio of the maximum edge weight over the minimum edge weight. These results can be summarized as follows:
• In the adjacency matrix model, finding a minimum cut of a weighted graph with edge-weight ratio $\tau$ has quantum query and time complexity $\widetilde{\Theta}(n^{3/2}\sqrt{\tau})$. This compares to the $\Omega(n^2)$ query complexity of any classical algorithm for minimum cut in this model [DHHM06].
• In the adjacency list model, finding a minimum cut of a weighted graph with edge-weight ratio $\tau$ has quantum query complexity $\widetilde{O}(\sqrt{mn\tau})$ and quantum time complexity $\widetilde{O}(\sqrt{mn\tau} + n^{3/2})$. There are also partially matching lower bounds for certain ranges of $\tau$. This compares to the query complexity of any classical algorithm for minimum cut in this model [BGMP21].
While this fully resolves the quantum complexity of minimum cut in the adjacency matrix model, there are two apparent gaps in the adjacency list model. On the one hand there is a gap between the upper and lower bounds on the quantum query complexity. On the other hand there is a gap between the upper bounds on the quantum query complexity and the quantum time complexity. Using our new result (Theorem 1) we can close this second gap.
Let $\mathsf{KT}(m, n)$ denote the (quantum) time complexity of finding an $\varepsilon$-KT partition of a weighted graph with $n$ vertices and $m$ edges. The following lemma is proven in [AL21].
Lemma 22 ([AL21, Lemma 22]).
Let $G$ be a weighted graph with $n$ vertices, $m$ edges, and edge-weight ratio $\tau$. There is a quantum algorithm to compute the weight and shores of a minimum cut of $G$ with time complexity $\widetilde{O}(\sqrt{mn\tau}) + \mathsf{KT}(\widetilde{O}(n), n)$ in the adjacency list model.
In [AL21] a quantum algorithm was proposed for finding the KT partition of a weighted graph with $m$ edges in time $\widetilde{O}(m + n^{3/2})$, giving an upper bound $\mathsf{KT}(m, n) = \widetilde{O}(m + n^{3/2})$ and hence an upper bound of $\widetilde{O}(\sqrt{mn\tau} + n^{3/2})$ on the quantum time complexity of minimum cut. Our main result gives a classical algorithm that improves this upper bound to $\mathsf{KT}(m, n) = \widetilde{O}(m)$, and hence yields a quantum algorithm for minimum cut with time complexity $\widetilde{O}(\sqrt{mn\tau})$.
Corollary 23.
Let $G$ be a weighted graph with $n$ vertices, $m$ edges, and edge-weight ratio $\tau$. There is a quantum algorithm to compute the weight and shores of a minimum cut of $G$ with time complexity $\widetilde{O}(\sqrt{mn\tau})$ in the adjacency list model.
6.2 Randomized algorithm for edge connectivity
We can use our algorithm for finding the KT partition of a weighted graph to give a randomized algorithm that computes the edge connectivity of a simple graph with high probability in time $O(m) + \widetilde{O}(n)$. For graphs that are not too sparse this matches the best known complexity, achieved by the random contraction based algorithm of Ghaffari, Nowicki and Thorup [GNT20].
Our new algorithm uses the key idea from Kawarabayashi and Thorup [KT19]: (i) find the KT partition of the graph and contract the components of the partition, and (ii) find a minimum cut in the contracted graph. By definition of the KT partition, this contraction preserves the set of non-trivial minimum cuts, so it suffices to find a minimum cut in the contracted graph and to compare it with the minimum degree of a vertex, which accounts for the trivial cuts. Moreover, the contracted graph has only $O(n)$ edges, and so we can find a minimum cut in this graph quickly.
Our algorithm follows the same blueprint, except that in order to obtain a leading complexity linear in $m$ we first find an $\varepsilon$-cut sparsifier $H$ of the input graph $G$, for a small constant $\varepsilon$. For this step we can use the sparsification algorithm from Fung, Hariharan, Harvey and Panigrahi [FHHP19, Theorem 1.22]. Provided that the sparsification step is successful, any minimum cut of the original simple graph will be a $(1+\varepsilon')$-near minimum cut of $H$ for $\varepsilon' = O(\varepsilon)$. Thus if we find an $\varepsilon'$-KT partition of $H$ and contract the sets of the resulting partition in $G$, we obtain a multigraph $G'$ which preserves all non-trivial minimum cuts of $G$. In this way we only need to find the KT partition of $H$, which has $\widetilde{O}(n)$ edges rather than $m$ edges. On the other hand, the sparsifier will in general be weighted, and hence we cannot run the near-linear time algorithm from [KT19] to find its KT partition. This is a prime example where finding the KT partition of a weighted graph is very useful.
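To make the contraction step concrete, here is a minimal sketch (the function name and input conventions are ours): it maps each vertex to the index of its set in the partition and keeps parallel edges, dropping only the edges internal to a set.

```python
def contract(n, edges, partition):
    """Contract each set of a partition of {0,...,n-1} to one node.

    `edges` is a list of (u, v, weight) triples; parallel edges are
    kept, so the result is a multigraph, and edges internal to a set
    become self-loops and are dropped.  Non-trivial (near-)minimum
    cuts refined by the partition survive the contraction.
    """
    comp = [0] * n
    for i, part in enumerate(partition):
        for v in part:
            comp[v] = i
    contracted = [(comp[u], comp[v], w)
                  for (u, v, w) in edges if comp[u] != comp[v]]
    return len(partition), contracted
```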
The next theorem fleshes out this algorithm. For this we need the fact that for a simple graph there are only $O(n)$ inter-component edges in a KT partition. We take the version from [AL21], which gives an explicit constant in the bound.
Lemma 24 ([RSW18, Lemma 2.6], [AL21, Lemma 2]).
Let $G$ be a simple graph with $n$ vertices. Let $\lambda$ be the edge connectivity of $G$. For a nonnegative $\varepsilon$, let $\mathcal{P}$ be the $\varepsilon$-KT partition of $G$, and let $G'$ be the multigraph formed from $G$ by contracting the sets in $\mathcal{P}$. Then $G'$ has $O(n)$ edges, with the explicit constant (depending on $\varepsilon$) given in [AL21].
Input: Adjacency list access to a simple graph $G$.
Output: A minimum cut of $G$.
Theorem 25.
Let $G$ be a simple graph with $m$ edges. There is a classical randomized algorithm that runs in time $O(m) + \widetilde{O}(n)$ and with high probability outputs the edge connectivity of $G$ and a cut realizing this value.
Proof.
The algorithm is given in Algorithm 6. The time complexity of each step is given in the comments. Let us prove correctness.
The algorithm either outputs a trivial cut or a cut coming from the contraction $G'$ of $G$. As contraction cannot decrease the edge connectivity, if the edge connectivity of $G$ is realized by a trivial cut the algorithm will be correct. Let us now assume that the edge connectivity $\lambda$ is realized by a non-trivial cut with shore $C$. In step 2 we use the sparsification algorithm of Fung, Hariharan, Harvey and Panigrahi [FHHP19, Theorem 1.22] to find an $\varepsilon$-cut sparsifier $H$ of $G$, which succeeds with high probability. Thus with high probability the weight of this cut in $H$ is at most $(1+\varepsilon)\lambda$. Also with high probability the weight of a minimum cut of $H$ is at least $(1-\varepsilon)\lambda$, in which case the cut will be a $(1+\varepsilon')$-near minimum cut of $H$ for $1 + \varepsilon' = \frac{1+\varepsilon}{1-\varepsilon}$. Hence with high probability the $\varepsilon'$-KT partition of $H$ will be a refinement of $\{C, V \setminus C\}$, and in the contraction $G'$ the cut survives, so the edge connectivity of $G'$ is $\lambda$. Further, if $H$ is a valid $\varepsilon$-cut sparsifier of $G$ then $G'$ has at most $O(n)$ edges by Lemma 24, and so we can find a minimum cut of $G'$ in $\widetilde{O}(n)$ time using the minimum cut algorithm of [GMW20] given in Lemma 4. Thus in this case with high probability the algorithm outputs a cut realizing the edge connectivity of $G$, and the algorithm is correct. ∎
7 Discussion
We find the $\varepsilon$-KT partition of a weighted graph in near-linear time for any sufficiently small constant $\varepsilon \ge 0$. The near-linear time deterministic algorithm of Kawarabayashi and Thorup [KT19] for finding a KT partition of a simple graph differs from ours in an interesting way with respect to this parameter. Recall that we defined $\mathcal{C}_\varepsilon$ to be the set of all bipartitions of the vertex set corresponding to non-trivial cuts whose weight is at most $(1+\varepsilon)\lambda$, and the $\varepsilon$-KT partition to be the meet $\bigwedge \mathcal{C}_\varepsilon$. Kawarabayashi and Thorup consider the larger set of bipartitions corresponding to non-trivial cuts of weight at most $(1+\varepsilon)\delta$, where $\delta$ is the minimum degree of $G$. When $G$ is simple they can compute the meet of this larger set for any constant $\varepsilon$ in near-linear time. Thus their result is stronger than ours with respect to the parameters in two ways: it allows any constant $\varepsilon$, and it lets $(1+\varepsilon)$ multiply the minimum degree $\delta \ge \lambda$ rather than $\lambda$.
There is an inherent barrier to extending the 2-respecting cut framework we employ here to this parameter regime. The reason is that Karger's tree packing lemma [Kar00, Lemma 2.3] only shows that a cut of weight less than $\frac{3}{2}\lambda$ will 2-respect a positive fraction of the trees from a maximum tree packing. To handle cuts of weight $\frac{3}{2}\lambda$ and beyond one would have to move instead to considering 3-respecting cuts, which seems to add a good deal of complexity. Thus while we have not tried to optimize the constant bounding $\varepsilon$ in our results, there is a natural barrier to extending our methods beyond $\varepsilon = 1/2$. Pushing to larger $\varepsilon$, and also allowing $(1+\varepsilon)$ to multiply the minimum weight of a vertex rather than $\lambda$, both seem to require new techniques, and we leave this as an open question.
References
- [AL21] Simon Apers and Troy Lee. Quantum complexity of minimum cut. In Proceedings of the 36th Computational Complexity Conference (CCC ’21), pages 28:1–28:3. LIPIcs, 2021.
- [Ben95] András A. Benczúr. A representation of cuts within 6/5 times the edge connectivity with applications. In Proceedings of 36th Annual Symposium on Foundations of Computer Science (FOCS ’95), pages 92–102. IEEE Computer Society, 1995.
- [Ben97] András Benczúr. Cut structures and randomized algorithms in edge-connectivity problems. PhD thesis, MIT, 1997.
- [BF00] Michael A. Bender and Martin Farach-Colton. The LCA problem revisited. In Proceedings of 4th Latin American Symposium on Theoretical Informatics (LATIN ’00), pages 88–94. Springer, 2000.
- [BG08] András A. Benczúr and Michel X. Goemans. Deformable Polygon Representation and Near-Mincuts, pages 103–135. Springer Berlin Heidelberg, Berlin, Heidelberg, 2008.
- [BGMP21] Arijit Bishnu, Arijit Ghosh, Gopinath Mishra, and Manaswi Paraashar. Query complexity of global minimum cut. In Proceedings of the 24th international conference on Approximation Algorithms for Combinatorial Optimization Problems (APPROX ’21), 2021.
- [BLS20] Nalin Bhardwaj, Antonio Molina Lovett, and Bryce Sandlund. A simple algorithm for minimum cuts in near-linear time. In 17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT ’20). Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2020.
- [DHHM06] Christoph Dürr, Mark Heiligman, Peter Høyer, and Mehdi Mhalla. Quantum query complexity of some graph problems. SIAM Journal on Computing, 35(6):1310–1328, 2006.
- [DKL76] Efim A. Dinitz, Alexander V. Karzanov, and Michael V. Lomonosov. On the structure of the system of minimum edge cuts in a graph. Issledovaniya po Diskretnoi Optimizatsii (Studies in Discrete Optimization), pages 290–306, 1976. In Russian.
- [FHHP19] Wai-Shing Fung, Ramesh Hariharan, Nicholas J. A. Harvey, and Debmalya Panigrahi. A general framework for graph sparsification. SIAM Journal on Computing, 48(4):1196–1223, 2019.
- [Gab95] Harold N. Gabow. A matroid approach to finding edge connectivity and packing arborescences. Journal of Computer and System Sciences, 50(2):259–273, 1995.
- [GH61] Ralph E. Gomory and Te C. Hu. Multi-terminal network flows. Journal of the Society for Industrial and Applied Mathematics, 9(4):551–570, 1961.
- [GMW20] Pawel Gawrychowski, Shay Mozes, and Oren Weimann. Minimum cut in $O(m \log^2 n)$ time. In Proceedings of the 47th International Colloquium on Automata, Languages, and Programming (ICALP ’20), volume 168 of LIPIcs, pages 57:1–57:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020.
- [GNT20] Mohsen Ghaffari, Krzysztof Nowicki, and Mikkel Thorup. Faster algorithms for edge connectivity via random 2-out contractions. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA ’20), pages 1260–1279. SIAM, 2020.
- [GSS11] Shayan Oveis Gharan, Amin Saberi, and Mohit Singh. A randomized rounding approach to the traveling salesman problem. In 52nd Annual IEEE Symposium on Foundations of Computer Science (FOCS ’11), pages 550–559. IEEE, 2011.
- [HRW20] Monika Henzinger, Satish Rao, and Di Wang. Local flow partitioning for faster edge connectivity. SIAM Journal on Computing, 49(1):1–36, 2020.
- [HT84] Dov Harel and Robert Endre Tarjan. Fast algorithms for finding nearest common ancestors. SIAM Journal on Computing, 13(2):338–355, 1984.
- [Kar00] David R. Karger. Minimum cuts in near-linear time. Journal of the ACM, 47(1):46–76, 2000. Announced at STOC 1996.
- [KKG21] Anna R. Karlin, Nathan Klein, and Shayan Oveis Gharan. A (slightly) improved approximation algorithm for metric TSP. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing (STOC ’21), pages 32–45, 2021.
- [KP09] David R. Karger and Debmalya Panigrahi. A near-linear time algorithm for constructing a cactus representation of minimum cuts. In Proceedings of the 20th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA ’09), pages 246–255. SIAM, 2009.
- [KT19] Ken-ichi Kawarabayashi and Mikkel Thorup. Deterministic edge connectivity in near-linear time. Journal of the ACM, 66(1):4:1–4:50, 2019. Announced at STOC 2015.
- [Li21] Jason Li. Deterministic mincut in almost-linear time. In Proceedings of the 53rd Annual ACM Symposium on Theory of Computing (STOC ’21), pages 384–395. ACM, 2021.
- [LST20] On-Hei S. Lo, Jens M. Schmidt, and Mikkel Thorup. Compact cactus representations of all non-trivial min-cuts. Discrete Applied Mathematics, 303:296–304, 2020.
- [MN20] Sagnik Mukhopadhyay and Danupon Nanongkai. Weighted min-cut: sequential, cut-query, and streaming algorithms. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing (STOC ’20), pages 496–509. ACM, 2020.
- [MSU97] Kurt Mehlhorn, R. Sundar, and Christian Uhrig. Maintaining dynamic sequences under equality tests in polylogarithmic time. Algorithmica, 17(2):183–198, 1997.
- [NMN01] Jaroslav Nesetril, Eva Milková, and Helena Nesetrilová. Otakar Borůvka on the minimum spanning tree problem: Translation of both the 1926 papers, comments, history. Discrete Mathematics, 233(1-3):3–36, 2001.
- [RSW18] Aviad Rubinstein, Tselil Schramm, and S. Matthew Weinberg. Computing exact minimum cuts without knowing the graph. In Proceedings of the 9th Innovations in Theoretical Computer Science Conference (ITCS ’18), pages 39:1–39:16. LIPIcs, 2018.
- [ST83] Daniel D. Sleator and Robert Endre Tarjan. A data structure for dynamic trees. Journal of Computer and System Sciences, 26(3):362–391, 1983.
Appendix A Data structures
We first show how to implement categorical top two queries on an array while allowing updates that add to the scores in an interval. This can be accomplished using a well-known binary tree data structure. We will then port this construction to a tree by means of the heavy path decomposition of [ST83, HT84].
The key to the binary tree data structure is the following simple fact. For a node $v$ of a tree let $L(v)$ be the set of labels of the leaves that are descendants of $v$.
Fact 26.
Let $q$ be a power of 2 and $B$ a complete binary tree with $q$ leaves labeled by $1, \dots, q$. For any interval $[i, j] \subseteq [1, q]$ there are $t = O(\log q)$ nodes $v_1, \dots, v_t$ of $B$ such that $L(v_1), \dots, L(v_t)$ partition $[i, j]$. Moreover, $v_1, \dots, v_t$ can be found in $O(\log q)$ time, and the total number of ancestors of $v_1, \dots, v_t$ is $O(\log q)$.
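The nodes promised by Fact 26 are the canonical segments familiar from segment trees; a short sketch in heap layout (our indexing: leaves $0, \dots, q-1$ sit at heap positions $q, \dots, 2q-1$, and the interval is half-open):

```python
def canonical_nodes(q, l, r):
    """The O(log q) tree nodes whose leaf sets partition [l, r).

    The complete binary tree is in 1-indexed heap layout with q leaves
    (q a power of two); leaf i is stored at heap position q + i.  Nodes
    are returned in left-to-right order of their leaf intervals.
    """
    left_side, right_side = [], []
    lo, hi = l + q, r + q
    while lo < hi:
        if lo & 1:            # lo is a right child: take it and move right
            left_side.append(lo)
            lo += 1
        if hi & 1:            # hi is a right child: take the node left of it
            hi -= 1
            right_side.append(hi)
        lo //= 2
        hi //= 2
    return left_side + right_side[::-1]

# e.g. with q = 8, canonical_nodes(8, 2, 7) == [5, 6, 14]
```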
Lemma 6 (restated). Given an array of $q$ entries, each having a score and a color, there is a data structure that can be built in $O(q)$ time and supports, in $O(\log q)$ time each, adding a value to the scores of all entries in an interval and answering a categorical top two query on an interval.
Proof.
By padding the array with scores of infinity and an arbitrary color we may assume that the length $q$ of the array is a power of $2$. The data structure will be a complete binary tree $B$ with leaves labeled as $1, \dots, q$. Each leaf stores a three-tuple consisting of a score, a color, and an index; at leaf $i$ this three-tuple is initialized to the score and color of entry $i$, together with the index $i$. Every internal node $v$ will store a pair of such 3-tuples. The data structure will maintain the invariant (Invariant 1) that at every internal node $v$ the indices in this pair of three-tuples are the answer to the categorical top two query for the interval $L(v)$. The answer to a categorical top two query for the interval $L(v)$ can be computed in constant time from the answers to this query at the children of $v$. Thus in $O(q)$ time we can propagate the answers to the categorical top two queries from the leaves to the root so that Invariant 1 holds.
Each node $v$ will also store an update value $d(v)$. We initialize the leaves to have update value zero and set the update value of all internal nodes of the tree to be zero as well. Thus we have the property (Invariant 2) that the sum of the update values on the path from leaf $i$ to the root equals the total amount added so far to the score of entry $i$, which will be maintained under the updates. This completes the pre-processing step, and the total pre-processing time is $O(q)$.
We now show that after an update we can adjust the tree to maintain Invariant 1 and Invariant 2 in $O(\log q)$ time. If the invariants hold, then we can answer a categorical top two query for an interval $[i, j]$ in time $O(\log q)$. This is done by first using Fact 26 to find, in $O(\log q)$ time, nodes $v_1, \dots, v_t$ such that $L(v_1), \dots, L(v_t)$ form a partition of $[i, j]$. Then, by building a binary tree on top of $v_1, \dots, v_t$ and propagating the categorical top two query answers up this tree (with scores offset by the sums of update values on the paths from the $v_k$ to the root, as guaranteed by Invariant 2), we can answer the categorical top two query for $[i, j]$ in time $O(\log q)$.
To restore the invariants after an update adding $x$ to the scores in $[i, j]$, we use Fact 26 to find in $O(\log q)$ time nodes $v_1, \dots, v_t$ such that $L(v_1), \dots, L(v_t)$ form a partition of $[i, j]$. Then for each $k \in [t]$ we set $d(v_k) \leftarrow d(v_k) + x$. This restores Invariant 2 under the update. To restore Invariant 1, we recompute the answers to the categorical top two query at all ancestors of $v_1, \dots, v_t$. By Fact 26 there are only $O(\log q)$ many such ancestors, thus we can perform this computation in $O(\log q)$ time as well. ∎
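The following self-contained sketch implements the array structure in this spirit, with two simplifications that we flag explicitly: it pushes update values down to children (standard lazy propagation) instead of recomputing along ancestors, and it interprets a categorical top two query as returning the smallest score in the interval together with the smallest score of a different color, matching the padding with scores of infinity above. All names are ours.

```python
INF = float('inf')
NONE = (INF, -1, -1)  # sentinel entry: (score, color, index)

def top2(a, b):
    """Combine two answers, each a pair of (score, color, index)
    entries of distinct colors: keep the smallest entry overall and
    the smallest entry whose color differs from it."""
    cands = sorted(a + b)
    best = cands[0]
    for c in cands[1:]:
        if c[1] != best[1]:
            return (best, c)
    return (best, NONE)

class CategoricalTop2:
    """Complete binary tree over (score, color) entries supporting
    interval adds and categorical top two queries in O(log q) time."""

    def __init__(self, scores, colors):
        q = 1
        while q < len(scores):
            q *= 2                            # pad to a power of two
        self.q = q
        self.ans = [(NONE, NONE)] * (2 * q)   # Invariant 1 per node
        self.lazy = [0] * (2 * q)             # pending interval adds
        for i, (s, c) in enumerate(zip(scores, colors)):
            self.ans[q + i] = ((s, c, i), NONE)
        for v in range(q - 1, 0, -1):
            self.ans[v] = top2(self.ans[2 * v], self.ans[2 * v + 1])

    def _apply(self, v, x):
        (s1, c1, i1), (s2, c2, i2) = self.ans[v]
        self.ans[v] = ((s1 + x, c1, i1), (s2 + x, c2, i2))
        if v < self.q:                        # internal node: defer
            self.lazy[v] += x

    def _push(self, v):
        if self.lazy[v]:
            self._apply(2 * v, self.lazy[v])
            self._apply(2 * v + 1, self.lazy[v])
            self.lazy[v] = 0

    def add(self, l, r, x, v=1, lo=0, hi=None):
        """Add x to every score in [l, r)."""
        hi = self.q if hi is None else hi
        if r <= lo or hi <= l:
            return
        if l <= lo and hi <= r:
            self._apply(v, x)
            return
        self._push(v)
        mid = (lo + hi) // 2
        self.add(l, r, x, 2 * v, lo, mid)
        self.add(l, r, x, 2 * v + 1, mid, hi)
        self.ans[v] = top2(self.ans[2 * v], self.ans[2 * v + 1])

    def query(self, l, r, v=1, lo=0, hi=None):
        """Smallest score in [l, r) plus smallest of another color."""
        hi = self.q if hi is None else hi
        if r <= lo or hi <= l:
            return (NONE, NONE)
        if l <= lo and hi <= r:
            return self.ans[v]
        self._push(v)
        mid = (lo + hi) // 2
        return top2(self.query(l, r, 2 * v, lo, mid),
                    self.query(l, r, 2 * v + 1, mid, hi))
```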
In order to extend this structure to a general tree $T$, we first construct its heavy path decomposition. Next, we concatenate the heavy paths to form a list $A$ of all edges of $T$ with the property that any subtree is described by a contiguous range of edges (but potentially containing many heavy paths). This is done recursively as follows. Let the topmost heavy path be $u_1 - u_2 - \dots - u_k$. We first write down its edges $(u_1, u_2), \dots, (u_{k-1}, u_k)$. Then, we remove them from the tree. We recurse on the trees consisting of more than one node rooted at $u_k, u_{k-1}, \dots, u_1$ (note that $u_k$ is always the root of a tree of size 1), in this order. This guarantees that, for any $u_i$, the subtree rooted at $u_i$ indeed consists of a contiguous range of edges, while for the other nodes this is guaranteed recursively.
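Equivalently, a depth-first search that always descends into the heavy child first produces the same two guarantees, namely that heavy paths occupy consecutive positions and that every subtree occupies a contiguous range, and it emits the light subtrees bottom-up along each heavy path exactly as in the recursive description above. A sketch:

```python
def heavy_first_edge_order(n, children, root=0):
    """List the edges of a rooted tree so that every heavy path and
    every subtree occupies a contiguous range of positions."""
    size = [1] * n
    order = []
    stack = [root]
    while stack:                      # iterative DFS to collect nodes
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])
    for v in reversed(order):         # children before parents
        for c in children[v]:
            size[v] += size[c]

    edges = []
    stack = [(None, root)]
    while stack:
        p, v = stack.pop()
        if p is not None:
            edges.append((p, v))
        # push light children first so the heavy child is popped, and
        # hence written, immediately after v
        for c in sorted(children[v], key=lambda u: size[u]):
            stack.append((v, c))
    return edges
```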
Lemma 7 (restated). Given a tree $T$ on $n$ nodes whose edges have scores and colors, there is a data structure that supports adding a value to the scores of all edges on a given path of $T$ in $O(\log^2 n)$ time, and answering a categorical top two query over the edges of a subtree of $T$ in $O(\log n)$ time.
Proof.
Consider a heavy path decomposition of $T$, and construct the edge array $A$ by concatenating the heavy paths as described above. We will use the data structure from Lemma 6 on $A$. Any path in $T$ can be decomposed into $O(\log n)$ infixes of heavy paths (in fact, at most one proper infix and a number of prefixes), and hence it corresponds to $O(\log n)$ contiguous ranges of $A$. Hence we implement the first operation by making $O(\log n)$ calls to the interval-add operation; by Lemma 6 this takes $O(\log^2 n)$ time. Finally, since the edge set of a subtree $T_u$ is described by a single contiguous range of $A$, a categorical top-two query on $T_u$ corresponds to a single query operation on $A$, which takes time $O(\log n)$. ∎
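For illustration, the decomposition of a root path into ranges of $A$ can be realized as follows (a sketch under our conventions: $\mathrm{pos}[u]$ is the position of the edge $(\mathrm{parent}[u], u)$ in the heavy-first order above, so positions along a heavy path are consecutive and are immediately preceded by the light edge entering the path; ranges are inclusive):

```python
def path_to_root_ranges(v, root, parent, head, pos):
    """Decompose the path from v to the root into O(log n) inclusive
    ranges of the heavy-first edge array; head[u] is the top node of
    u's heavy path."""
    ranges = []
    while v != root:
        h = head[v]
        if h == root:
            # the spine below the root starts at position 0
            ranges.append((0, pos[v]))
            break
        # the light edge (parent[h], h) sits at pos[h], immediately
        # followed by the spine edges down to v
        ranges.append((pos[h], pos[v]))
        v = parent[h]
    return ranges
```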