Folklore Sampling is Optimal for Exact Hopsets: Confirming the Barrier
[Footnote 1: This work was supported by NSF:AF 2153680.]
Abstract
For a graph , a -diameter-reducing exact hopset is a small set of additional edges that, when added to , maintains its graph metric but guarantees that all node pairs have a shortest path in using at most edges. A shortcut set is the analogous concept for reachability rather than distances. These objects have been studied since the early ’90s, due to applications in parallel, distributed, dynamic, and streaming graph algorithms.
For most of their history, the state-of-the-art construction for either object was a simple folklore algorithm, based on randomly sampling nodes to hit long paths in the graph. However, recent breakthroughs of Kogan and Parter [SODA ’22] and Bernstein and Wein [SODA ’23] have finally improved over the folklore algorithm for shortcut sets and for -approximate hopsets. For either object, it is now known that one can use hop-edges to reduce diameter to , improving over the folklore diameter bound of . The only setting in which folklore sampling remains unimproved is for exact hopsets. Can these improvements be continued?
We settle this question negatively by constructing graphs on which any exact hopset of edges has diameter . This improves on the previous lower bound of by Kogan and Parter [FOCS ’22]. Using similar ideas, we also polynomially improve the current lower bounds for shortcut sets, constructing graphs on which any shortcut set of edges reduces diameter to . This improves on the previous lower bound of by Huang and Pettie [SIAM J. Disc. Math. ’18]. We also extend our constructions to provide lower bounds against -size exact hopsets and shortcut sets for other values of ; in particular, we show that folklore sampling is near-optimal for exact hopsets in the entire range of parameters .
1 Introduction
In graph algorithms, many basic problems ask to compute information about the shortest path distances or reachability relation among node pairs in an input graph. In parallel, distributed, dynamic, or streaming settings, algorithm complexity often scales with the diameter of the graph, i.e., the smallest integer such that every connected node pair has a path of at most edges. Therefore, a popular strategy to optimize these algorithms is to add a few edges to the input graph in preprocessing, with the goal to reduce diameter while leaving the relevant distance or reachability information unchanged. In the context of reachability, this set of additional edges is called a shortcut set.
Definition 1 (Shortcut Sets).
For a directed graph , a -diameter reducing shortcut set is a set of additional edges such that:
• Every edge is in the transitive closure of ; that is, there exists a path in .
• For every pair of nodes in the transitive closure of , there exists an path in using at most edges.
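The two conditions of Definition 1 can be checked mechanically on small examples. The following is a minimal sketch, not taken from the paper: the function names `reachable_within` and `is_shortcut_set` are hypothetical, and the graph is assumed to be given as a node collection plus a list of directed edge pairs.

```python
from collections import deque

def reachable_within(adj, source):
    """BFS from source; maps each reachable node to its hop count."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    return hops

def is_shortcut_set(nodes, edges, shortcut_edges, d):
    """Return True iff shortcut_edges satisfies both conditions of Definition 1."""
    adj = {u: [] for u in nodes}
    for u, v in edges:
        adj[u].append(v)
    # Transitive closure of the input graph, one BFS per node.
    closure = {u: reachable_within(adj, u) for u in nodes}
    # Condition 1: every added edge already lies in the transitive closure.
    if any(v not in closure[u] for u, v in shortcut_edges):
        return False
    augmented = {u: list(adj[u]) for u in nodes}
    for u, v in shortcut_edges:
        augmented[u].append(v)
    # Condition 2: every pair in the closure has a path on at most d edges.
    for u in nodes:
        hops = reachable_within(augmented, u)
        if any(hops[v] > d for v in closure[u]):
            return False
    return True
```

For instance, on the directed path 0 → 1 → 2 → 3 → 4, the two edges (0, 2) and (2, 4) form a 2-diameter-reducing shortcut set but not a 1-diameter-reducing one.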
Shortcut sets were introduced by Thorup [33], after they were used implicitly in prior work. Many algorithmic applications of shortcut sets and their relatives were discovered in the following years [34, 26, 19, 18, 20, 16, 24, 15, 17, 6, 10, 25, 2], but actual constructions of shortcut sets were elusive. For most of their history, essentially the only known construction was the following simple algorithm: randomly sample a set of vertices, and add a shortcut edge between each pair of sampled nodes that lie in the transitive closure of the input graph. To argue correctness: for any nodes in the graph where the shortest path has length , with high probability we sample nodes in that respectively hit a prefix and suffix of of length . Using the added shortcut edge , we obtain an path of length . This analysis gives:
Theorem 1 (Folklore, [34]).
Every -node graph has a -diameter-reducing shortcut set on edges.
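The folklore sampling procedure described before Theorem 1 can be sketched as follows. This is an illustrative sketch only: the name `folklore_shortcut_set` and the free parameter `sample_size` are assumptions of the example, and the sampling rate used in the theorem is not reproduced here.

```python
import random
from collections import deque

def reachable_from(adj, source):
    """Set of nodes reachable from source (including source itself)."""
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def folklore_shortcut_set(nodes, edges, sample_size, rng=None):
    """Sample nodes, then connect every sampled pair in the transitive closure."""
    rng = rng or random.Random()
    nodes = list(nodes)
    adj = {u: [] for u in nodes}
    for u, v in edges:
        adj[u].append(v)
    # sample_size is a free parameter of this sketch (hypothetical name).
    sampled = rng.sample(nodes, min(sample_size, len(nodes)))
    shortcuts = set()
    for s in sampled:
        reach = reachable_from(adj, s)
        for t in sampled:
            if t != s and t in reach:
                shortcuts.add((s, t))
    return shortcuts
```

The number of added edges is at most the square of the sample size, which is why the sample size controls the edge budget.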
Remarkably, despite its simplicity, the diameter bound of achieved by the folklore sampling algorithm remained nearly unimproved for 30 years (log factors were removed in [5], improving the diameter bound to ). This led researchers to wonder if the bound could be improved in the exponent at all. This was finally answered affirmatively in a recent breakthrough of Kogan and Parter [29]:
Theorem 2 ([29]).
The folklore algorithm is polynomially suboptimal for shortcut sets. In particular, every -node graph has a -diameter-reducing shortcut set on edges.
Kogan and Parter proved this theorem via an elegant construction based on sampling vertices and sampling from a set of carefully-chosen paths from the input graph. Following this, there are two clear avenues for further progress. First, the new diameter bound of is still not necessarily tight. It was still conceivable to improve diameter as far as , at which point we encounter a lower bound construction of Huang and Pettie [22] (improving on a classic construction of Hesse [21]). Second, many algorithms aim to compute exact or approximate shortest paths of an input graph, rather than any path as in the case of shortcut sets/reachability. These algorithms benefit from shortcut-set-like structures that more strongly reduce the number of edges along (near-)shortest paths in the input graph. Such a structure is called a hopset:
Definition 2 (Hopsets).
For a graph and , a -diameter reducing hopset is a set of additional edges such that:
• Every edge has weight .
• For every pair of nodes in the transitive closure of , there exists an path in that uses at most edges, and which satisfies .
When , the path is required to be an exact shortest path in , so we call an exact hopset.
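For exact hopsets, the requirement is that hop-limited distances in the augmented graph match true distances in the input graph. The sketch below checks this with a round-limited Bellman-Ford. It is not from the paper: the names are hypothetical, edge weights are assumed nonnegative, and each hop edge is assumed to be weighted by the distance between its endpoints in the input graph.

```python
import math

def hop_limited_distances(nodes, weighted_edges, source, max_hops):
    """Least weight of any path from source that uses at most max_hops edges."""
    dist = {u: math.inf for u in nodes}
    dist[source] = 0
    for _ in range(max_hops):
        updated = dict(dist)  # synchronous round: one extra edge per round
        for u, v, w in weighted_edges:
            if dist[u] + w < updated[v]:
                updated[v] = dist[u] + w
        dist = updated
    return dist

def is_exact_hopset(nodes, weighted_edges, hop_pairs, d):
    """Check that d-hop distances with the hop edges equal true distances."""
    nodes = list(nodes)
    # With nonnegative weights, n - 1 rounds give true shortest path distances.
    exact = {s: hop_limited_distances(nodes, weighted_edges, s, len(nodes) - 1)
             for s in nodes}
    augmented = list(weighted_edges)
    for u, v in hop_pairs:
        if math.isinf(exact[u][v]):
            return False  # choice of this sketch: reject pairs with no path
        # Assumption: a hop edge weighs exactly the distance between its endpoints.
        augmented.append((u, v, exact[u][v]))
    for s in nodes:
        limited = hop_limited_distances(nodes, augmented, s, d)
        for t in nodes:
            if not math.isinf(exact[s][t]) and limited[t] > exact[s][t] + 1e-9:
                return False
    return True
```

Edges are directed triples `(u, v, w)` here; an undirected graph can be modeled by listing each edge in both directions.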
A nice feature of the folklore sampling algorithm is that it extends immediately to hopsets with no real changes. This yields:
Theorem 3 (Folklore).
Every -node graph has a -diameter-reducing (exact or ) hopset on edges.
Thus, the hunt is back on for a hopset construction algorithm that beats folklore sampling. Kogan and Parter partially achieved this goal: they extended their shortcut set construction to also show a new diameter bound of for hopsets [29]. Bernstein and Wein [7] then developed a clever extension of the Kogan-Parter construction, further improving the bound for hopsets to match the one achieved for shortcut sets:
Theorem 4 ([29, 7]).
The folklore algorithm is polynomially suboptimal for hopsets. In particular, for all fixed , every (possibly directed and weighted) -node graph has a -diameter-reducing hopset on edges.
Still, both of these improvements required , and so neither extended to exact hopsets, which remained the last holdout where the folklore algorithm had not been improved. The only progress for exact hopsets came on the lower bounds side, where a separate work of Kogan and Parter [28] showed a diameter lower bound of (see also [9]). Was it possible to translate the recent progress on shortcut sets and hopsets to exact hopsets, and finally move past folklore?
1.1 Our Results
Our main results are polynomial improvements on the lower bounds for -size shortcut sets and hopsets, which we depict in Figure 1. These lower bounds confirm that the folklore algorithm for exact hopsets is essentially the right one, showing that its diameter bound is optimal up to factors:
Theorem 5 (New).
The folklore algorithm is near-optimal for exact hopsets. In particular, there are -node graphs on which any exact hopset on edges reduces diameter to .
This provides a strong separation between exact and hopsets. Our lower bound holds for both directed and undirected input graphs, but it critically uses edge weights and thus does not extend to unweighted graphs as well. Our lower bound in the body of the paper (Theorem 7) is actually a bit more general than the one stated in Theorem 5 above. Although it is popular to focus on hopsets of size , one can also ask about hopsets of size for any parameter . The folklore sampling algorithm extends to any , by adjusting the size of the sampled vertex set to . Our generalized lower bound establishes that the diameter bound from folklore sampling is near-optimal in the entire range of parameters.
We now turn to shortcut sets. Our new lower bound is the following:
Theorem 6 (New).
There are -node directed input graphs on which any shortcut set on edges reduces diameter to .
This improves over the previous lower bound of by Huang and Pettie [22], but a polynomial gap to the upper bound of still remains [29]. It is an interesting open question to narrow this gap further. We note that every hopset is also a shortcut set, and so this lower bound extends to hopsets as well. [Footnote 2: Note that, since the shortcut set lower bound is only for directed graphs, the lower bound only extends to hopsets for directed input graphs. For hopsets in undirected (but possibly weighted) input graphs, Elkin, Gitlitz, and Neiman proved that much better diameter can be achieved [13].] In the body of the paper (Theorem 8) we prove a more general theorem giving improved lower bounds against size hopsets, but this time the parameter range in which our extended theorem is nontrivial is only .
1.2 Other Related Work and Open Questions
This work is only concerned with the existential bounds that can be achieved for shortcut sets and hopsets. Some prior work in the area has also focused on constructions that are efficient in the appropriate computational model. This was perhaps most famously achieved by Fineman [15], whose breakthrough algorithm for parallel reachability was centered around a new shortcut set construction. His construction reduced diameter to using edges; this is a worse diameter bound than the one achieved by folklore sampling, but crucially its work-efficiency was much better than folklore. This was later improved by Jambulapati, Liu, and Sidford [24], who achieved roughly the diameter bound from folklore sampling with work-efficiency comparable to Fineman [15]. Relatedly, another work of Kogan and Parter [27] gave a construction improving the (centralized) construction time of their shortcut set algorithm.
This work focuses on hopsets for weighted graphs, but hopsets can be studied for unweighted graphs as well. Specifically for hopsets in undirected unweighted graphs, it is known that far better diameter bounds are achievable. In particular, following preliminary constructions in [26, 32, 11], constructions of Huang and Pettie [23] and Elkin and Neiman [14] showed that one can reduce diameter to using hop edges. These papers actually provide a more fine-grained tradeoff between hopset size, , and diameter bound, which is shown to be essentially tight in [1]. Hopsets for undirected unweighted graphs with larger stretch values were studied in [4], and a unification of the various hopset constructions in this setting was developed in [31].
This work focuses on shortcut/hopset bounds in the setting where the edge budget is , or more generally some function of . These objects are also sometimes studied in a related setting where the edge budget is , where is the number of edges in the input graph. Lower bounds in this setting were achieved in [21, 22, 30]; most recently, Lu, Williams, Wein, and Xu showed graphs where any -size shortcut set reduces diameter to . On the upper bounds side, it is easy to get upper bounds as functions of both and – for example, folklore sampling with sampled nodes yields a diameter bound of , and the construction by Kogan and Parter [29] implies a bound of for -size shortcut sets. However, it is an interesting open problem to construct -size shortcut/hopsets with nontrivial diameter upper bounds that depend only on . By “nontrivial,” we mean that one can always assume without loss of generality that the input graph is connected, and thus , and so a construction of -edge shortcut/hopsets is always valid in the -setting. A nontrivial construction is one that beats that bound.
Open Question 1.
Prove that, for any , every -node, -edge directed graph has an -edge shortcut/hopset that reduces diameter to , where is a constant strictly less than the one that can currently be achieved for -edge shortcut/hopsets.
Finally, we remark that the setting of exact hopsets in unweighted graphs seems to be unexplored. Although unpublished to our knowledge, one can obtain a lower bound by applying a standard analysis in [22, 28, 9] to the -node undirected unweighted distance preserver lower bound graphs constructed by Coppersmith and Elkin [12]. This would imply that any -size exact hopset on the Coppersmith-Elkin graphs would reduce diameter to . Our new shortcut set lower bound can also be interpreted as an improved lower bound for this setting, as stated in Corollary 1. [Footnote 3: This corollary is not immediate from the discussion so far: since our shortcut set lower bound is directed, it is not clear that it would imply a lower bound against undirected unweighted exact hopsets. This holds specifically because our shortcut set lower bound construction happens to be layered.]
Corollary 1.
There are -node undirected unweighted input graphs on which any exact hopset on edges reduces diameter to .
But on the upper bounds side, folklore sampling remains the best known algorithm, and it only reduces diameter to . It would be interesting to narrow this gap, and in particular to confirm or refute whether folklore sampling is near-optimal.
Open Question 2.
Is the folklore algorithm near-optimal for exact hopsets in unweighted graphs? Or, alternately, does every -node unweighted graph have an -size exact hopset that reduces diameter to , for some ?
2 Technical Overview
2.1 Recap of Prior Work
In order to explain the strategy used for our improved lower bounds, it will be helpful to first recall the construction of Huang and Pettie [22] for shortcut set lower bounds, and the construction of Kogan and Parter [28] for hopset lower bounds. The Huang-Pettie shortcut set lower bound is a construction of a directed graph and a set of paths with the following properties:
1. Each path in is the unique path in between its endpoints
2. The paths in are pairwise edge-disjoint
3. There are paths, where is a constant that can be selected as large as we want
4. Subject to the above constraints, we want to make the paths in as long as possible. In particular, in [22], every path in has the same length .
Let us see why these properties imply a lower bound. Each time we add an edge to our shortcut set , by uniqueness and edge-disjointness of paths in , there can be at most one path where the distance between its endpoints decreases due to . Thus, if we build a shortcut set of size , then for at least one path the distance between its endpoints is the same in as in . Hence the final diameter of the graph is at least , giving the lower bound.
The Kogan-Parter exact hopset lower bound is similar, except that each path is only required to be a unique shortest path between its endpoints in the weighted graph . This is a more relaxed constraint than requiring to be the unique path of any kind, and this additional freedom in the construction lets us improve the path lengths to . Besides that, the argument is identical.
2.2 Allowing Paths to Overlap
The change in our construction is a relaxation of item (2); that is, the paths in our constructions are not pairwise edge-disjoint. This has appeared in prior work only in a rather weak form: all of the lower bounds against -size shortcut sets use paths that may intersect pairwise on a single edge [21, 22, 30]. These constructions begin with a system of paths as above, and then apply a tool called the alternation product which introduces path overlap. However, the alternation product is not a useful tool for our purposes, and it does not appear in this paper at all. The alternation product harms the path lengths of the construction (relative to its number of nodes), in exchange for also reducing the number of edges relative to the number of paths in the construction. This is useful for constructing lower bounds against -size shortcut/hopsets, but is not helpful for our goal of constructing lower bounds against -size shortcut/hopsets.
In our construction, paths may intersect pairwise on polynomially many edges. This property arises from an entirely different technique, and for an entirely different reason: our goal is to use this overlap to get improved path lengths. Let us first observe why we can tolerate some path overlap while maintaining correctness of the lower bound. Suppose our shortcut set has a budget of edges, and we construct a graph and a set of paths, where each path has length and each path is the unique path between its endpoints. Let be the set of node pairs that are the endpoints of paths in , and consider the following potential function over shortcut sets , which simply sums distances over critical pairs:
Initially, we have . Then we add edges to our shortcut set one at a time, gradually reducing . How much could any given shortcut edge reduce ? Clearly it could reduce by , in the case where are a pair in , since this reduces from to . This is acceptable: if all edges reduce by at most , then the final potential would be
Thus, over the critical pairs, the average distance in is , and so the lower bound still holds.
So the only overlap constraint we need to enforce in our lower bound is that no shortcut edge can reduce the potential by more than . This is a much more forgiving constraint than edge-disjoint paths. For example, for two internal nodes with , we could allow two different paths to coincide on a subpath: adding to the shortcut set would then reduce by only . In general, for two nodes at distance , we can safely allow paths to coincide on the subpath between these nodes while maintaining correctness of the lower bound.
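The potential argument above can be made concrete on a toy instance. The sketch below is illustrative only: the names `hop_distance` and `potential` are hypothetical, and the potential is taken to be the sum of hop distances over critical pairs in the graph augmented with the shortcut edges, as described in the text.

```python
from collections import deque

def hop_distance(adj, s, t):
    """Fewest edges on any s-to-t path; raises if t is unreachable."""
    hops = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return hops[u]
        for v in adj.get(u, ()):
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    raise ValueError("critical pair is not connected")

def potential(edges, shortcut_edges, critical_pairs):
    """Sum of distances over critical pairs after adding the shortcut edges."""
    adj = {}
    for u, v in list(edges) + list(shortcut_edges):
        adj.setdefault(u, []).append(v)
    return sum(hop_distance(adj, s, t) for s, t in critical_pairs)

# Two critical paths of length 5 that coincide on the subpath c0 -> c3.
edges = [("a", "c0"), ("b", "c0"), ("c0", "c1"), ("c1", "c2"),
         ("c2", "c3"), ("c3", "x"), ("c3", "y")]
pairs = [("a", "x"), ("b", "y")]
before = potential(edges, [], pairs)
after = potential(edges, [("c0", "c3")], pairs)
```

Here a single shortcut edge across the shared subpath lowers the potential by the number of overlapping paths times the hops saved per path, which is exactly the quantity the overlap constraint needs to bound.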
2.3 Constructing Overlapping Paths
The previous part explains why we are allowed overlapping paths, but it is still not clear how to leverage this freedom into an improved lower bound construction. This is where our technical contribution lies. It is again a bit easier to explain the new idea in the context of shortcut sets, but the intuition is essentially the same in the context of hopsets.
Let us return to the previous lower bound constructions. For the shortcut lower bounds of [22], one constructs an -layered directed graph where the nodes in each layer are a copy of a grid within . The next step is to construct a set of convex vectors . A key perspective shift in this paper is that we think of the vectors in as playing two independent roles in this previous construction:
• They play the role of edge vectors: we include an edge from a node in layer to a node in layer iff the difference between the grid points is a vector .
• They also play the role of (objective) direction vectors. The paths are indexed by a node in the first layer and a vector , and we generate by using as its first node, and then iteratively selecting its node in the next layer by adding to the node in the previous layer. (A technical detail is that only choices of are allowed that reach the last layer without the path falling off the side of the grid.) Notice the argument for path uniqueness: using as an objective direction, due to the convexity of the vectors in , the edge vector itself is the one that maximizes progress in the direction . Thus, no alternate path beginning at can reach the grid point within steps, since it necessarily makes less progress in the direction of , implying path uniqueness.
Our constructions disentangle these two uses of the vector set : we depart from prior work by explicitly using a separate edge vector set and direction vector set . These vector sets crucially do not have the same size: instead we will have , and this difference allows for a technical optimization of parameters leading to improved lower bounds. Roughly, we can choose large enough to have , while also allowing . This smaller size can be achieved using shorter convex vectors, which in turn lets us pack more layers into the construction without worrying about paths falling off the sides of the grid before reaching the final layer.
We use the following generalized process for iteratively generating critical paths. Each path is indexed by a node in the first layer and a direction vector , and at each layer, we greedily select the edge vector that maximizes progress in the objective direction . Since , by necessity many different objective directions will all select the same edge vector at each layer. This leads to the overlapping paths discussed above, but more technical ingredients are still needed to ensure that paths don’t overlap too much. We explain these next.
2.4 Symmetry Breaking
There is an important problem with the construction sketched so far. Consider two nearby direction vectors , which have the same optimizing edge vector . Then for any start node , the paths will simply select the same edge vector at each layer, and these two paths will entirely coincide. In other words, the “extra paths” bought by using are actually just identical copies of a much smaller set of paths, which is not interesting or useful.
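The greedy path generation, and the way nearby directions collapse onto the same path, can be seen in a small sketch. Everything concrete here is an assumption of the example rather than the paper's construction: the edge vectors are points (k, k²) on a parabola (a stand-in for a convex vector set), directions are arbitrary real vectors, and grid boundaries are ignored.

```python
def dot(u, v):
    """Standard Euclidean inner product in the plane."""
    return u[0] * v[0] + u[1] * v[1]

def greedy_path(start, direction, edge_vectors, num_layers):
    """At each layer, step along the edge vector maximizing progress in `direction`."""
    path = [start]
    for _ in range(num_layers - 1):
        step = max(edge_vectors, key=lambda w: dot(w, direction))
        x, y = path[-1]
        path.append((x + step[0], y + step[1]))
    return path

# Hypothetical strictly convex edge vector set: points on a parabola.
edge_vectors = [(k, k * k) for k in range(4)]

path_a = greedy_path((0, 0), (2.0, -1.0), edge_vectors, 5)
path_b = greedy_path((0, 0), (2.2, -1.0), edge_vectors, 5)  # nearby direction
path_c = greedy_path((0, 0), (4.0, -1.0), edge_vectors, 5)  # far direction
```

Because the same edge vector set is offered at every layer, `path_a` and `path_b` coincide entirely, while `path_c` selects a different edge vector throughout. This is the symmetry that the next steps of the construction have to break.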
We therefore need to somehow break the symmetry between paths that use nearby direction vectors, getting them to eventually choose different edge vectors at some layer to split apart. This is where our lower bound constructions diverge; we will need to use two different symmetry-breaking strategies for shortcut and hopset lower bounds.
Hopset Lower Bounds and -Shifting.
In our hopset lower bound construction , like [28], our vertices can be interpreted as points in . More specifically, they initially form a square grid within the integer lattice , and the columns of this grid act as layers of . Our edges initially have the form for edge vectors ; edge vectors always have first coordinate , so that they go from one layer to the next. Initially, the weight of an edge is the Euclidean distance between its endpoints.
Our symmetry-breaking step is a random operation where for each layer we choose a random variable sampled uniformly from the interval , and we shift the layer upwards so that its nodes are offset higher than the nodes in the previous layer. The shifts therefore compound across the layers. See Figure 2 for a picture.
Our -shifting strategy does not affect the edge set of , nor does it affect the set of direction vectors in any way, but it does affect the Euclidean distance between nodes in adjacent layers, and hence it changes the edge weights. It achieves symmetry-breaking for roughly the following reason. In our greedy generation of paths, a path with direction vector will use the closest edge vector at each level. If two paths and with direction vectors intersect at a node in the th layer of , then there will be an interval such that if lands in , then have different closest edge vectors after shifting. Thus, in this event, the paths and split apart at and never reconverge (this is formalized in Lemma 3). The size of the interval , and hence the probability that it gets hit by , is proportional to the distance between and . The effect is that paths generated by nearby direction vectors tend to intersect on long subpaths, while paths generated by far apart direction vectors intersect on shorter subpaths or perhaps just a single node, but with high probability all pairs of paths split apart eventually. Paths generated by the same direction vector remain parallel, and do not intersect at all.
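A toy calculation illustrates why the chance of splitting at a layer grows with the distance between direction vectors. The sketch makes several assumptions that are not taken from the paper: edge vectors have first coordinate 1 and integer second coordinates (`offsets`), a direction is represented by its slope, the layer shift is drawn from the interval (0, 1), and evenly spaced shift values stand in for random samples so the result is deterministic.

```python
def closest_edge_offset(slope, shift, offsets):
    """Second coordinate of the shifted edge vector closest to the direction."""
    return min(offsets, key=lambda j: abs(j + shift - slope))

def split_fraction(slope_a, slope_b, offsets, num_shifts=1000):
    """Fraction of shift values for which the two directions pick different edges."""
    splits = 0
    for k in range(num_shifts):
        # Evenly spaced stand-ins for a uniform sample from (0, 1) (assumed interval).
        shift = (k + 0.5) / num_shifts
        if (closest_edge_offset(slope_a, shift, offsets)
                != closest_edge_offset(slope_b, shift, offsets)):
            splits += 1
    return splits / num_shifts

offsets = range(6)
near = split_fraction(2.3, 2.4, offsets)  # directions differing by 0.1
far = split_fraction(2.3, 2.6, offsets)   # directions differing by 0.3
```

In this toy model the split fraction tracks the gap between the two slopes, matching the qualitative claim that nearby directions stay together longer than directions that are far apart.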
There is a technical detail remaining: we still need to prove that each critical path is a unique shortest path between its endpoints. In [28], the critical paths correspond to lines in Euclidean space, and since edge weights correspond to Euclidean distances, the analogous unique shortest paths property follows instantly from the geometry of . Since our critical paths are generated by a more involved process, it is much more technical to prove that they are unique shortest paths. Proposition 3 contains the optimization lemma that needs to hold for our process to generate unique shortest paths, and to push it through, it turns out that we essentially need the derivative of edge weights to be proportional to Euclidean distances. We therefore differ again from [28] by squaring all of our edge weights, meaning that our graph metric is ultimately quite different from . See Section 4.4 for additional details.
Shortcut Set Lower Bounds and Edge Vector Subsampling.
Shortcut set lower bounds are unweighted, and this makes the technique of -shifting essentially useless in this setting, since it only affects edge weights in the construction and it does not change the edge set. For shortcut sets, we need an entirely different symmetry-breaking strategy that actually changes the edge set from layer to layer.
Our starting graph is similar to the one used by Huang and Pettie [22], mentioned earlier. Each layer of the graph is an independent copy of a square grid in . We generate a large convex set ; initially, plays the role of both edge vectors and direction vectors . However, for the sake of symmetry-breaking, we do not put edges between all nodes in adjacent layers whose difference is a vector in . Instead, at each layer we randomly sample exactly two adjacent vectors , and we use only these two edge vectors to generate edges to the next layer. This is depicted in Figure 3.
The fact that the sampled vectors are adjacent, and hence typically close together, allows for a key optimization in the construction. The rate at which paths drift apart from each other is much slower than in [22], even when they are generated by very different direction vectors. This allows us to apply a carefully-chosen translation of the grid from layer to layer, in order to keep all of the paths contained in the grid. This in turn lets us pack many more layers into the construction while still ensuring that all of our paths stay within the confines of the grid.
As before, paths are generated greedily: for direction vector , an associated path will traverse the sampled edge vector in each layer that maximizes progress in the objective direction . For two paths with direction vectors and , these paths split at in the event that the sampled edge vectors lie between and in . This again leads to behavior where paths generated by nearby direction vectors tend to coincide on long subpaths, while paths generated by far apart direction vectors have smaller intersections, but with high probability all pairs of paths split apart eventually. Paths with the same direction vector again remain parallel. This is formalized in Lemma 10.
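The splitting rule for subsampled edge vectors can be checked on a toy convex set. The sketch below is an assumption-laden illustration, not the paper's construction: edge vector number k is the parabola point (k, k²), direction number a is the vector (2a, -1) (chosen so that edge vector a maximizes progress for direction a), and the grid translation between layers is omitted.

```python
import random

def progress(k, a):
    """Inner product of edge vector (k, k^2) with direction (2a, -1)."""
    return 2 * a * k - k * k

def chosen_vector(i, a):
    """Of the sampled adjacent pair (i, i + 1), the index direction `a` follows."""
    return max((i, i + 1), key=lambda k: progress(k, a))

def first_split_layer(a, b, num_vectors, num_layers, rng):
    """First layer at which directions a and b follow different sampled edges."""
    for layer in range(num_layers):
        i = rng.randrange(num_vectors - 1)  # sample an adjacent pair (i, i + 1)
        if chosen_vector(i, a) != chosen_vector(i, b):
            return layer
    return None  # the two paths coincided on every layer
```

In this model, directions `a < b` separate at a layer exactly when the sampled pair lies between them, that is, when `a <= i < b`. The number of such pairs is `b - a`, so directions that are far apart split sooner on average.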
3 Preliminaries
We use the following notations:
• For a path , we use to denote the number of nodes in . This is different from the (unweighted) length of . In weighted graphs, we write for the sum of edge weights in .
• We write for the shortest path distance from node to node in graph (counting edge weights, if is weighted). We write for the least number of edges contained in any shortest path.
• We use to denote the standard Euclidean inner product.
4 Exact Hopsets
In this section we will prove the following theorem.
Theorem 7.
For any parameter , there exists an -node weighted undirected graph such that for any exact hopset of size where , the graph must have hopbound .
We will prove this via a construction of the following type:
Lemma 1.
For any , there is an infinite family of -node undirected weighted graphs and sets of paths in with the following properties:
• has layers, and each path in starts in the first layer, ends in the last layer, and contains exactly one node in each layer.
• Each path in is the unique shortest path between its endpoints in .
• For any two nodes , there are at most paths in that contain both and .
• Each node lies on at most paths in .
4.1 Proving Theorem 7 using Lemma 1
Fix an and . Let be the graph in Lemma 1 with associated set of paths in . Let be an exact hopset of size . Let be the set of node pairs that are the endpoints of paths in . We define the following potential function over hopsets , which simply sums hopdistances over critical pairs:
Observe that by Lemma 1, we have . Now fix a pair of nodes , and let be the set of paths such that . We make the following observations.
• For all , if the unique shortest -path in is not in , then
• For all , if the unique shortest -path in is in , then
Then by Lemma 1, . We obtain the following sequence of inequalities:
Rearranging, we find that
Thus, over the pairs of path endpoints in , the average hopdistance in is , and so there must be a pair such that by the pigeonhole principle.
4.2 Constructing
Our goal is now to prove Lemma 1. Let be a sufficiently large positive integer, and let . [Footnote 4: We will handle the case where later.] For simplicity of presentation, we will frequently ignore issues related to non-integrality of expressions that arise in our construction; these issues affect our bounds only by lower-order terms. Initially, all edges in will be directed from to ; we will convert into an undirected graph in the final step of our construction.
Vertex Set .
• Let be a positive integer parameter of the construction to be specified later. Our graph will have layers , and each layer will have nodes, ordered from to . Initially, we will label the th node in layer with tuple . We will interpret the node labeled as a point in with integer coordinates. These nodes arranged in layers will be the node set of graph .
• We now perform the following random operation on the node labels of . For each layer , , uniformly sample a random real number in the interval and call it . Now for each node in layer of labeled , relabel this node with the label
Again, we interpret the resulting labels for nodes in as points in . In a slight abuse of notation, we will treat as either a node in or a point in , depending on the context. Less formally: for each layer , this step shifts the nodes in layer vertically upwards to be higher than the previous layer (and thus, these vertical shifts compound across the layers). See Section 2 for intuition on this design choice.
Edge Set .
• All the edges in will be between consecutive layers of . We will let denote the set of edges in between layers and .
• Just as our nodes in correspond to points in , we can interpret the edges in as vectors in . In particular, for every edge , we identify with the corresponding vector . Note that since all edges in are between adjacent layers and , the first coordinate of is for all . We will use to denote the 2nd coordinate of , i.e., for all , we write .
• We begin our construction of by defining the following set of vectors:
We will refer to the vectors in as edge vectors.
• For each , let
Intuitively: we want the edge vectors in to point between nodes in adjacent layers, and due to the random vertical shifts between layers applied to the nodes, we need to apply a similar shift to at each layer to adjust for this.
• For each and edge vector , if , then add edge to . After adding these edges to , we will have that
Additionally, note that the case only occurs if for some that is higher than any point in the st layer; that is, .
• For each , if , then we assign edge the weight .
This completes the construction of our graph .
4.3 Direction Vectors, Critical Pairs, and Critical Paths
Our next step is to generate a set of critical pairs , as well as a set of critical paths . Specifically, there will be one critical path going between each critical pair , and we will show that is the unique shortest (weighted) path in . We will identify our critical pairs and paths by first constructing a set of vectors that we call direction vectors, which we define next.
Direction Vectors .
• Let be a sufficiently large integer parameter to be specified later. The size of will roughly correspond to the maximum number of edges shared between any two critical paths in .
• We choose our set of direction vectors to be [Footnote 5: Note that if , then . However, if , then . This gap between and is needed to accommodate the -shifting operation used to obtain , and is relevant in the proof of Lemma 3.]
Note that there are direction vectors between adjacent vectors for . Additionally, adjacent direction vectors in differ only by in their second coordinate.
Proposition 1.
With probability , for every and every direction vector , there is a unique vector that minimizes over all choices of .
Proof.
There are only finitely many choices of that result in there being two distinct vectors such that . We conclude that the claim holds with probability . ∎
In the following we assume that this event holds, i.e., there is a unique minimizing vector in for all . Each of our critical paths in will have an associated direction vector , and for all , path will take an edge vector in that is closest to in the sense of Proposition 1 (see Section 2 for more intuition).
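The selection rule behind Proposition 1 can be illustrated with a minimal Python sketch. It is not the paper's exact construction: it assumes that the edge vectors of a layer have second coordinates of the form j + shift for consecutive integers j and a random real shift, and that "closest" means smallest absolute difference to the second coordinate of the direction vector. The names `closest_edge_index`, `direction`, `shift`, and `num_edges` are hypothetical.

```python
def closest_edge_index(direction, shift, num_edges):
    """Return the index j whose shifted coordinate j + shift is nearest to direction.

    Assumption (not from the paper): a layer's edge vectors have second
    coordinates j + shift for j = 0, ..., num_edges - 1.
    """
    candidates = [j + shift for j in range(num_edges)]
    gaps = [abs(c - direction) for c in candidates]
    best = min(gaps)
    winners = [j for j, g in enumerate(gaps) if g == best]
    if len(winners) != 1:
        # A tie only happens for exceptional shifts; a random real shift
        # avoids it, which is the content of Proposition 1.
        raise ValueError("tie between edge vectors; resample the shift")
    return winners[0]


# Two nearby directions in the same layer can select different edge vectors,
# which is the mechanism that later makes critical paths split.
index_a = closest_edge_index(direction=1.7, shift=0.3, num_edges=4)
index_b = closest_edge_index(direction=1.9, shift=0.3, num_edges=4)
```

With the shift fixed, the two directions above land on opposite sides of the midpoint between adjacent candidates, so they pick different edges.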
Critical Pairs and Critical Paths .
-
•
We first define a set containing half of the nodes in the first layer of :
We will choose our set of demand pairs so that . For every node and direction vector , we will create a critical pair and a corresponding critical path to add to and .
-
•
Let , and let . The associated path has start node . We iteratively grow , layer-by-layer, as follows. Suppose that currently , for , with each . To determine the next node , let be the edges in incident to , and let
By definition, is an edge whose first node is ; we define to be the other node in , and we append to .
-
•
This completes our construction of and . Note that
-
–
we will show that the paths generated in this way have distinct endpoints (with high probability), and therefore , and
-
–
every path contains one node in each layer, and therefore its number of nodes is .
-
–
An important feature for correctness of our construction is that, when we iteratively generate paths, we never reach a point such that (i.e. for some ). This follows by straightforward counting, based on the maximum second coordinate used in our edge vectors and also on our choice of start nodes as only the “lower half” of the nodes in the first layer. The following proposition expresses this correctness in a particular way, pointing out that for any node lying on a generated path , none of the edges from to the following layer are omitted from the graph due to falling off the top of the grid with a too-high second coordinate.
Proposition 2.
Let for some and . Then .
Proof.
Let , and let be the endpoints of . Since , we have , where
Moreover, since for all the corresponding vector satisfies , we have
Then observe that for all , we have that , where
Thus , and so we have for all . It follows that . ∎
4.4 Critical paths are unique shortest paths
We now verify that graph and paths have the unique shortest path property as stated in Lemma 1.
Lemma 2 (Unique shortest paths).
With probability , for every , path is a unique shortest (weighted) -path in .
We begin with a technical proposition:
Proposition 3.
Let . Now consider such that
-
•
for all , and
-
•
.
Then
with equality only if for all .
Proof.
We will prove the equivalent statement . Fix an . First we will show that
We split our analysis into four cases:
-
•
Case 1: . In this case, .
-
•
Case 2: . In this case, .
-
•
Case 3: . In this case, , since .
-
•
Case 4: . In this case, , since .
Then
This inequality is strict if for some . ∎
Proof of Lemma 2.
Fix an , and let be the direction vector associated with . Let be real numbers such that the th edge of has the corresponding vector for . Now consider an arbitrary -path in , where . Since all edges in are directed from to , it follows that has edges and the th edge of is in . Let be real numbers such that the th edge of has the corresponding vector for . Now observe that since and are both -paths, it follows that
Additionally, by our construction of , it follows that
for all . In particular, since , there must be some such that , and so by Proposition 1, with probability 1. Then by Proposition 3,
Path is a unique shortest -path in , as desired. ∎
4.5 Critical Paths Intersection Properties
Before finishing our proof of Lemma 1, we will need to establish several properties of the critical paths in .
Proposition 4.
Let be two critical paths with the same corresponding direction vector . Then .
Proof.
Let denote the th node of , where . Note that since and share the same direction vector , the edges and have the same corresponding vector for all by Proposition 2. By our construction of , for each node in the first layer , belongs to at most one path with direction vector , so . Then for all ,
Let be two critical paths, and let be a node in . We say that paths and split at if and the node following in is distinct from the node following in , and we simply say that and split if there exists some such that they split at . Note that since are unique shortest paths in , paths and can split at most once.
Lemma 3.
Fix a node , where , and let be critical paths with direction vectors such that and . Then paths and split at with probability at least .
(Footnote 6: For the sake of completeness, let us be more precise here about the probability claim being made in this lemma. Consider any two paths , indexed by two start nodes and two direction vectors, and consider a node . The event that we generate in such a way that depends only on the random choices of . If , then the event that split at depends only on the random choice of . The claim is: conditional on the event that are selected in such a way that , the probability that is selected such that split at is at least .)
Proof.
By Proposition 4, , and assume wlog that . Let be the event that the random variable was sampled so that
(where is the open interval with endpoints and ). Our proof strategy is to show that implies that split at , and then to show that occurs with the claimed probability.
implies that split at .
Assume that occurs. By construction there is a nonnegative integer such that is in the interval . Since , it follows that vectors are in , because . More generally, by our choice of sets and there are vectors such that
Now we claim that
To see this, suppose for the sake of contradiction that there is a vector such that
Then using our assumption that , we obtain
and
Together, these two sequences of inequalities imply that . But this contradicts our assumption that , so we conclude that
By Proposition 2, , so we have also shown that
Then and must split at by our construction of the critical paths in .
happens with good probability.
Since is sampled uniformly at random from the interval , it follows that with probability at least . ∎
We will use Lemma 3 to prove the following two lemmas, which capture the key properties of our graph .
Lemma 4 (Low path overlap).
Let be critical paths with distinct associated direction vectors . Then:
-
•
If , then with probability at least , we have
(Footnote 7: Formally, we consider any two paths indexed by two start nodes and direction vectors. When we iteratively generate these paths, the number of nodes in their intersection (possibly ) depends only on the random choices of . The probability claim in this lemma is with respect to these random choices.)
-
•
If , then (deterministically).
Proof.
We begin with the first point; suppose . Suppose we iteratively generate one layer at a time. Each time we choose a node that lies in both and , by Lemma 3, and split at with probability at least (over the random choice of ). Moreover, since and are unique shortest paths in and is acyclic, it follows that is a contiguous subpath of and ; thus, once they split, they can no longer intersect in later layers. The number of nodes in the intersection is one more than the number of consecutive nodes at which intersect but do not split. So by the above discussion, we have
For the second point of the lemma: if , then by Lemma 3, if there is a node , then and split at with probability , and then they can no longer intersect in later layers. So we have . ∎
Since , we can argue by a union bound that Lemma 4 holds for all simultaneously with probability at least . From now on, we will assume that this property holds for our constructed graph .
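The geometric-tail-plus-union-bound reasoning can be made concrete with a small Python sketch. It assumes that two paths sharing a node split there independently with probability at least `split_prob`, so that an overlap of more than k non-splitting nodes has probability at most (1 - split_prob)^k. The function name and all numeric values are illustrative placeholders, not the paper's parameters.

```python
import math


def overlap_threshold(num_paths, split_prob, failure_prob):
    """Smallest k such that num_paths^2 * (1 - split_prob)^k <= failure_prob.

    num_paths^2 is a crude upper bound on the number of path pairs, so a
    union bound over all pairs keeps the total failure probability at most
    failure_prob. All inputs are hypothetical placeholders.
    """
    num_pairs = num_paths * num_paths
    k = math.ceil(math.log(failure_prob / num_pairs) / math.log(1.0 - split_prob))
    return max(k, 0)


# Example: 1000 paths, split probability 1/2, target failure probability 1%.
k = overlap_threshold(num_paths=1000, split_prob=0.5, failure_prob=0.01)
```

The threshold grows only logarithmically in the number of paths, which is why a polylogarithmic overlap bound survives the union bound.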
Once we specify our construction parameters and , the following lemma will immediately imply the third property of as stated in Lemma 1.
Lemma 5.
Let be distinct nodes in , and let . Let
Then .
Proof.
Let and let be the direction vector associated with for . Since the paths in all intersect, by Proposition 4 we must have for . Let and let . Then
since for all such that . Thus, by Lemma 4 we must have , since we have at least two nodes . So by Lemma 4,
Since and and are unique shortest paths in , it follows that they coincide on their subpaths . Moreover, since the hopdistance from to in is , it follows that . Then taken together we have
Rearranging, we get
If , then this implies . Otherwise, if , then this implies that since . ∎
4.6 Finishing the proof of Lemma 1
We note that Theorem 7 is trivial in the parameter regime , since its lower bound on hopbound is . So we will assume in the following, with as small of an implicit constant as needed. Let
We now quickly verify that graph and associated critical paths satisfy the properties of Lemma 1:
-
•
By construction, has layers, and each path in travels from the first layer to the last layer.
-
•
Proposition 4 implies that each vertex has at most paths passing through it. By our choice of construction parameters and , we conclude that .
-
•
Each path is a unique shortest path between its endpoints in by Lemma 2.
-
•
Since and , Lemma 5 immediately implies that for all , there are at most
paths in that contain both and .
We have shown that our directed graph satisfies the properties of Lemma 1 in the regime of . Moreover, our construction still goes through even in the extended regime of for any constant . All that remains is to extend our construction to the entire regime of and make undirected.
Extending the construction to .
We can extend our construction to the regime of with a simple modification to that was previously used in the prior work of [29]. We will sketch the modification here and defer the proof of correctness to Lemma 6 in Appendix A.
Let denote an instance of our originally constructed graph with input parameters and . Let be a sufficiently large integer and let . Let where and divides . Now for each node in , replace with a directed path with nodes. For all , assign weight to all edges in . For each edge originally in , add edge to the graph. Let be the resulting graph, and let be the updated set of critical paths. This completes the modification.
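The node-to-path replacement can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the copies of a node v are named (v, 0), ..., (v, k - 1), each original edge (u, v) is reattached from the last copy of u to the first copy of v, and the weight given to the internal path edges is left as a parameter (`chain_weight`). The function name and the toy input are hypothetical.

```python
def subdivide(nodes, edges, k, chain_weight=0):
    """Replace every node by a directed path on k copies.

    nodes: iterable of hashable node names.
    edges: list of (u, v, weight) directed edges of the original graph.
    Assumptions (not confirmed by the text): original edges are reattached
    from the last copy of u to the first copy of v, and every internal path
    edge receives chain_weight.
    """
    new_edges = []
    for v in nodes:
        for j in range(k - 1):
            new_edges.append(((v, j), (v, j + 1), chain_weight))
    for u, v, w in edges:
        new_edges.append(((u, k - 1), (v, 0), w))
    return new_edges


# Toy input: a three-node directed path a -> b -> c, stretched with k = 3.
toy_edges = subdivide(["a", "b", "c"], [("a", "b", 5), ("b", "c", 7)], k=3)
toy_nodes = {x for u, v, _ in toy_edges for x in (u, v)}
```

Every original path through L nodes becomes a path through L * k nodes, so the number of nodes grows by a factor of k while the number of critical paths stays the same.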
Lemma 6.
The -node graph and the set of paths satisfy the properties of Lemma 1.
Proof.
We defer the proof to Appendix A, as it largely follows our earlier analysis. ∎
Making undirected.
To make undirected, we use the following standard simple blackbox reduction. Let be the sum of all edge weights in , i.e., . For every edge , add to the weight of , and treat as an undirected edge. Call the resulting graph .
We now argue correctness: in particular, we need to argue that for all such that is reachable from in , is a shortest weighted (directed) -path in if and only if is a shortest weighted (undirected) -path in .
-
•
First, note that for all -paths in , the number of nodes in satisfies by the construction of and .
-
•
Moreover, if , then has one more edge than . Thus, its weighted length in satisfies
and so is not a shortest path. We conclude that if is a shortest -path in , then .
-
•
Any -path in with edges must use exactly one node in each layer, and thus it respects the original edge directions in . We conclude that is a shortest weighted -path in if and only if is a shortest weighted (directed) -path in .
Lemma 1 is immediate from the above discussion.
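The undirected reduction can be checked on a toy instance with the Python sketch below. The three-layer graph, its weights, and the helper names (`dijkstra`, `directed_adjacency`, `lift_to_undirected`) are illustrative assumptions and not part of the construction. The sketch adds the total weight W to every edge, forgets directions, and compares shortest-path distances.

```python
import heapq

# Nodes are (layer, index); edges go from layer i to layer i + 1.
EDGES = [
    ((1, 0), (2, 0), 1), ((1, 0), (2, 1), 4), ((1, 1), (2, 0), 2),
    ((2, 0), (3, 0), 3), ((2, 1), (3, 0), 1), ((2, 1), (3, 1), 2),
    ((2, 0), (3, 1), 5),
]


def dijkstra(adj, source):
    """Shortest weighted distances from source; adj maps node -> [(nbr, w)]."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def directed_adjacency(edges):
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
    return adj


def lift_to_undirected(edges):
    """Add W (the sum of all weights) to each edge and drop directions."""
    W = sum(w for _, _, w in edges)
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w + W))
        adj.setdefault(v, []).append((u, w + W))
    return adj, W


source = (1, 0)
dist_directed = dijkstra(directed_adjacency(EDGES), source)
undirected_adj, W = lift_to_undirected(EDGES)
dist_undirected = dijkstra(undirected_adj, source)
```

For every node reachable from the source in the directed graph, the undirected distance equals the directed distance plus W times the number of layers crossed: any path with an extra edge pays at least one additional W, which exceeds the total original weight.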
5 Shortcut Sets
In this section we will prove the following theorem.
Theorem 8.
For any parameter , there exists an -node unweighted directed graph such that for any shortcut set of size , the graph must have diameter , where
In particular, when , must have diameter .
We will prove this via a construction of the following type:
Lemma 7.
For any , where is a sufficiently large constant, there is an infinite family of -node directed unweighted graphs and sets of paths in with the following properties:
-
•
has layers. Each path in starts in the first layer, ends in the last layer, and contains exactly one node in each layer.
-
•
Each path in is the unique path between its endpoints in .
-
•
For any two nodes , there are at most paths in that contain both and .
Proof of Theorem 8 using Lemma 7.
Fix an and . Let be the graph in Lemma 7 with associated set of paths in . Let be a shortcut set of size . Let be the set of node pairs that are the endpoints of paths in . Since all paths in are unique paths between their endpoints in , it follows that
Then when , we can achieve the bounds in Theorem 8 using the same potential function argument as in Section 4.1.
5.1 Constructing the strongly convex vector set
In our construction of the graph , we will implicitly use the following lemma from [3, 8] that establishes the existence of a dense set of vectors that each extend the farthest in their own direction. (Footnote 8: This is a slightly stronger property than convexity, and hence is sometimes referred to as “strong convexity” in the area.)
Lemma 8 (Theorem 1 of [3]; Lemma 7 of [8]).
For sufficiently large , there exists a strongly convex set of integer vectors of size , such that
-
•
for all , ,
-
•
every lies in the first quadrant, i.e., both coordinates are positive, and
-
•
for all distinct , .
In our construction of , we will use a vector set from this lemma to help generate edge and direction vectors. We will make use of the following technical property of the vectors in :
Proposition 5.
Let be a set of vectors as described in Lemma 8, with its vectors ordered counterclockwise. For all with , the following inequalities hold:
Proof.
We will only prove here that ; the other set of inequalities follows from an identical argument. By Lemma 8 we already have , so it remains only to show that .
Let be the inner angle between and and let be the inner angle between and ; thus the inner angle between and is . We first establish a useful inequality:
(The three steps of this derivation are justified, in order, by Lemma 8, by the cosine formula, and by algebra from the previous line.)
We will next use the trigonometric identity
since by Lemma 8.
We are now ready to show:
(The four steps of this derivation are justified, in order, by the cosine formula, by the second inequality, by the first inequality, and by the cosine formula.) ∎
5.2 Constructing
We next construct the graph that will be used for Lemma 7. Let be a sufficiently large positive integer, and let for a sufficiently large constant to be chosen later (we will extend our construction to other choices of later). For simplicity of presentation, we will frequently ignore issues related to non-integrality of expressions that arise in our construction; these issues affect our bounds only by lower-order terms. All edges in will be directed from to .
Vertex Set .
-
•
Let be a positive integer construction parameter to be specified later. Our graph will have layers , and each layer will have nodes.
-
•
We will label each node in layer , , with a distinct triple in . We will interpret the node in labeled as an integer point .
-
•
These nodes arranged in layers will compose our node set
of graph . In a slight abuse of notation, we will treat as either a node in or a point in , depending on the context.
Edge Set .
-
•
All the edges in will be between consecutive layers of . We will let denote the set of edges in between layers and .
-
•
Just as our nodes in correspond to points in , we can interpret the edges in as vectors in . In particular, for every edge , we identify with the corresponding vector . Note that since all edges in are between adjacent layers and , the first coordinate of is for all . We will use to denote the th coordinate of for (i.e. for all , we write ).
-
•
We begin our construction of by defining the set of vectors , where is the strongly convex set of vectors defined in Lemma 8. Let , where vectors are ordered counterclockwise and . We may assume wlog that (e.g. by removing vectors from until this is true).
-
•
Using , we define our set of edge vectors as:
Let denote the th vector of .
-
•
For each layer , , sample a random integer from and call it . We define the set as
where is the th vector of . Note that contains exactly two vectors, the zero vector and a randomly chosen vector from . Intuitively: for each layer , we are sampling two adjacent vectors and from and adding to each of them to obtain . The purpose of adding the normalizing vector to and is to reduce the magnitude of the vectors in , as we will formalize in Proposition 6.
-
•
The vectors in set will define the edges in . Specifically, for all and for all such that
we add the edge to .
This completes the construction of our graph . We now verify that the vectors in have small magnitude in expectation.
Proposition 6.
For all , .
Proof.
Note that the vectors in correspond to sides of a convex polygon whose vertices are the vectors in . Since this polygon is contained in a ball of radius in by Lemma 8, it follows that . Note that for sufficiently large . Then
The vectors have magnitude roughly , whereas the vectors in have expected magnitude at most by Proposition 6. Since each edge in our critical path corresponds to a vector in , ensuring that the vectors in have small magnitude (at least in expectation) will be essential for guaranteeing that the paths are long.
Let us comment here on a discrepancy between this construction and the technical overview. In the technical overview, we stated that we would use the convex vectors to generate the edge vectors . Instead, we are using the difference between adjacent convex vectors to generate . This is an optimization: our plan is to sample one edge vector from , and use it together with the zero vector as the two available vectors between pairs of adjacent layers. This is equivalent to sampling two adjacent edge vectors from , as advertised in the technical overview, and then applying an appropriate translation of the next layer in space. Our strategy lets us use vectors of length , instead of , and these shorter vectors ultimately lead to a stronger lower bound.
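The difference-vector optimization can be illustrated with a short Python sketch. The vector set below is a tiny hand-picked convex chain in the first quadrant used purely as a stand-in; it is not the set guaranteed by Lemma 8. The sketch works with the last two coordinates only, and the names `CONVEX`, `difference_vectors`, and `sample_layer_sets` are hypothetical.

```python
import math
import random

# Stand-in for a strongly convex set, ordered counterclockwise (assumption).
CONVEX = [(10, 1), (9, 4), (7, 7), (4, 9), (1, 10)]


def difference_vectors(convex):
    """Differences between adjacent vectors in counterclockwise order."""
    return [(b[0] - a[0], b[1] - a[1]) for a, b in zip(convex, convex[1:])]


def sample_layer_sets(convex, num_layers, rng):
    """For each layer, pair the zero vector with one random difference vector."""
    diffs = difference_vectors(convex)
    return [[(0, 0), rng.choice(diffs)] for _ in range(num_layers)]


def norm(v):
    return math.hypot(v[0], v[1])


DIFFS = difference_vectors(CONVEX)
LAYER_SETS = sample_layer_sets(CONVEX, num_layers=6, rng=random.Random(0))
```

In this toy set every difference vector has length below 4 while every convex vector has length near 10, which mirrors the length saving described above: shorter steps let a path cross more layers before it reaches the boundary of the grid.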
5.3 Direction Vectors, Critical Pairs, and Critical Paths
Our next step is to generate a set of critical pairs , as well as a set of critical paths . Specifically, there will be one critical path going between each critical pair , and we will show that is the unique path in .
Direction Vectors .
We choose our set of direction vectors to be , where is our strongly convex set of vectors. We will let be our list of direction vectors, and we will let the th vector of correspond to the th vector of , i.e. , for . We will simply use the name when we wish to emphasize the role of these vectors as direction vectors.
Note that since , Proposition 5 also holds for . That is, if and , then
Critical Pairs and Critical Paths .
-
•
We first define a set , containing a subset of the nodes in the first layer of :
Informally, is a middle square patch of the nodes in . The key property of is that all nodes in are of distance at least from the sides of the square grid corresponding to layer .
We will choose our set of demand pairs so that . For every node and direction vector , we will choose a critical pair and a corresponding critical path to add to and .
-
•
Let , and let . The associated path has start node . We iteratively grow , layer-by-layer, as follows. Suppose that currently with each . To determine the next node , let be the edges in incident to , and let be
If for some , then by definition, is an edge whose first node is ; we define to be the other node in , and we append to . Otherwise, if there is no such edge in , then we terminate our construction of path (i.e. will be the final node in ).
This completes our construction of and . We will show that the paths generated in this way have distinct endpoints (with high probability), and therefore , where .
An important feature for correctness of our construction is that, when we iteratively generate paths, if we reach a point such that (i.e. for some ), then we end our path at . As a consequence, our critical paths in may not travel through all layers of . However, with nonzero probability, paths in travel through a constant fraction of layers, as we prove in the following proposition.
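The layer-by-layer growth and the termination rule can be sketched in Python as follows. The sketch makes several assumptions that the text does not fix: each layer is a square grid with coordinates in range(side), each layer offers the zero vector and one nonzero vector, the path takes whichever of the two has larger inner product with its direction vector, and the path stops as soon as the nonzero vector would leave the grid. The name `grow_path` and all numeric inputs are hypothetical.

```python
def inner(a, b):
    return a[0] * b[0] + a[1] * b[1]


def grow_path(start, direction, layer_vectors, side):
    """Grow a path of (layer, x, y) nodes from start, one layer at a time.

    layer_vectors[i] is the nonzero vector available when leaving layer
    i + 1; the zero vector is always available as the alternative.
    """
    x, y = start
    path = [(1, x, y)]
    for layer, c in enumerate(layer_vectors, start=1):
        nx, ny = x + c[0], y + c[1]
        if not (0 <= nx < side and 0 <= ny < side):
            break  # an edge out of this node is missing, so the path ends here
        if inner(c, direction) > 0:
            x, y = nx, ny  # nonzero vector beats the zero vector
        path.append((layer + 1, x, y))
    return path


VECTORS = [(-1, 3), (-3, 1), (-2, 3)]
PATH_A = grow_path((10, 10), (7, 7), VECTORS, side=20)
PATH_B = grow_path((10, 10), (10, 1), VECTORS, side=20)
PATH_C = grow_path((1, 1), (7, 7), VECTORS, side=20)
```

Here `PATH_A` and `PATH_B` share a start node but split immediately because their direction vectors prefer different steps, while `PATH_C` starts near the boundary and is cut short, which is the reason start nodes are restricted to a middle patch of the first layer.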
Proposition 7.
Let . With probability at least , for all , .
Proof.
Each critical path starts at a node in . Each edge of the path corresponds to a vector such that for some . Our path ends when we reach the boundary of our vertex set . We must show that any such path travels through nodes before reaching the boundary. Note that , since .
Let be the random variable defined as . Then by Proposition 6 and the linearity of expectation,
where the final equality follows from the fact that . Then by Markov’s inequality,
Now we claim that if , then for all , . Let be a critical path for critical pair . Let and let . By our construction of critical paths , either or . In the first case, , since path traveled through all layers . In the second case, we must have that by our choice of set . But since , we conclude that . The claim follows. ∎
We have shown that with nonzero probability, all our paths in travel through the first layers of . However, we cannot guarantee that paths in travel to layers with . Because of this, we choose to remove all layers , , from . We replace all critical paths with the truncated subpath of containing only the first nodes of , and we update our critical pairs to be the set of all pairs of endpoints of the updated paths in .
5.4 Critical paths are unique paths
We now verify that graph and paths have the unique path property as stated in Lemma 7. This will follow straightforwardly from the properties of our set of direction vectors , particularly Proposition 5.
Lemma 9 (Unique paths).
For every , path is a unique -path in .
Proof.
Fix a direction vector in . We claim that for all , there is a unique vector such that maximizes . Recall that contains exactly two vectors: the zero vector and the vector .
Now assume that and observe the following sequence of equivalent inequalities:
(by Lemma 8 and Proposition 5)
When , an identical argument shows that . In either case, there is a unique vector maximizing for all .
Now fix a critical pair that has as its associated direction vector and as its critical path (where and ). Let be the function that projects each point in onto the subspace corresponding to the last two coordinates of , i.e. for all .
Let be an arbitrary -path, and note that since is a layered directed graph. Let , where and . By our construction of path , we must have that for all ,
Now suppose for the sake of contradiction that . Then for some . Then by the above discussion. But then since and are both -paths,
This is a contradiction, so we conclude that . Then path is a unique -path in . ∎
5.5 Critical Paths Intersection Properties
Before finishing our proof of Lemma 7, we will need to establish several properties of the critical paths in .
Proposition 8.
Let be two critical paths with the same corresponding direction vector . Then .
Proof.
Let . Let denote the th node of , where and . Note that since and share the same direction vector , edges and have the same corresponding vector for all by our construction of and . Also, for each node , belongs to at most one path with direction vector , so . Then for all ,
Let be two critical paths, and let be a node in . We say that paths and split at if and the node following in is distinct from the node following in , and we simply say that and split if there exists some such that and split at . Note that since are unique paths in , paths and can split at most once.
Lemma 10.
Fix a node , where , and let be critical paths with direction vectors and , , such that and . Then paths and split at with probability at least .
Proof.
Fix a node , where , and let be critical paths with direction vectors and , , such that and . By Proposition 8, , and assume wlog that . Let be the event that the random variable was sampled so that
Our proof strategy is to show that implies that split at , and then to show that occurs with the claimed probability.
implies that split at .
Assume that occurs. Then . Now observe the following sequence of equivalent inequalities:
(by Lemma 8 and Proposition 5)
Since , by our construction of the critical paths in , the above inequality implies that path takes an edge in corresponding to vector . An identical argument will show that , so path takes an edge in corresponding to vector . Since paths and take different edges in , they must split at .
happens with good probability.
Random variable is sampled uniformly from . Then the event occurs with probability . ∎
We will use Lemma 10 to prove the following two lemmas, which capture key properties of our graph .
Lemma 11.
Let be critical paths with associated direction vectors . Then with probability at least .
Proof.
If , then the claim is immediate, so assume there is a node . Suppose . By Lemma 10, and split at with probability at least . Moreover, conditioning on , the event that and split at given that and depends only on our choice of and is independent of for .
Since and are unique paths in , it follows that is a contiguous subpath of and . The number of nodes in the intersection is one more than the number of consecutive nodes at which intersect but do not split. So by the above discussion, we have
∎
Since , we can argue by a union bound that Proposition 7 holds and Lemma 11 holds for all simultaneously with probability at least . From now on, we will assume that this property holds for our constructed graph .
Once we specify our construction parameters and , the following lemma will immediately imply the third property of as stated in Lemma 7.
Lemma 12.
Let be nodes in such that the unweighted distance from to in is , where . Let be the following set of critical paths:
Then .
Proof.
Let and let be the direction vector associated with for . By Proposition 8, for . Let and let . Then , so by Lemma 11,
Additionally, since and and are unique paths, it follows that . Moreover, since the unweighted distance from to in is , it follows that . Then taken together we have
Rearranging, we get that
(We obtain the final inequality from the following observation: if , then this implies . Otherwise, if , then this implies that since .) ∎
5.6 Finishing the proof of Lemma 7
Let
Now recall that , and let be a constant such that . Then
We now quickly verify that graph and associated critical paths satisfy the properties of Lemma 7:
We have shown that our directed graph satisfies the properties of Lemma 7 when . All that remains is to extend our construction to the regime .
Extending the construction to .
We can extend our construction to the regime of using the same modification to that we performed on our exact hopset construction and that was previously used in the prior work of [29]. We will sketch the modification here. The proof of correctness follows from an argument identical to the proof of Lemma 6 in Appendix A.
We let denote an instance of our originally constructed graph with input parameters and . Let be a sufficiently large integer and let . Let where and divides . Now for each node in , replace with a directed path with nodes. For all , assign weight to all edges in . For each edge originally in , add edge to the graph. Let be the resulting graph, and let be the updated set of critical paths. This completes the modification.
References
- [1] Amir Abboud, Greg Bodwin, and Seth Pettie. A hierarchy of lower bounds for sublinear additive spanners. In Proceedings of the 28th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 568–576. Society for Industrial and Applied Mathematics, 2017.
- [2] Alexandr Andoni, Clifford Stein, and Peilin Zhong. Parallel approximate undirected shortest paths via low hop emulators. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages 322–335, 2020.
- [3] Imre Bárány and David G Larman. The convex hull of the integer points in a large ball. Mathematische Annalen, 312(1):167–181, 1998.
- [4] Uri Ben-Levy and Merav Parter. New (α, β) spanners and hopsets. In Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1695–1714. SIAM, 2020.
- [5] Piotr Berman, Sofya Raskhodnikova, and Ge Ruan. Finding sparser directed spanners. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2010). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2010.
- [6] Aaron Bernstein, Maximilian Probst Gutenberg, and Christian Wulff-Nilsen. Near-optimal decremental sssp in dense weighted digraphs. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 1112–1122. IEEE, 2020.
- [7] Aaron Bernstein and Nicole Wein. Closing the gap between directed hopsets and shortcut sets. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 163–182. SIAM, 2023.
- [8] Greg Bodwin and Gary Hoppenworth. New additive spanner lower bounds by an unlayered obstacle product. 2022.
- [9] Greg Bodwin, Gary Hoppenworth, and Ohad Trabelsi. Bridge girth: A unifying notion in network design. arXiv preprint arXiv:2212.11944, 2022.
- [10] Nairen Cao, Jeremy T Fineman, and Katina Russell. Efficient construction of directed hopsets and parallel approximate shortest paths. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages 336–349, 2020.
- [11] Edith Cohen. Polylog-time and near-linear work approximation scheme for undirected shortest paths. Journal of the ACM (JACM), 47(1):132–166, 2000.
- [12] Don Coppersmith and Michael Elkin. Sparse sourcewise and pairwise distance preservers. SIAM Journal on Discrete Mathematics, 20(2):463–501, 2006.
- [13] Michael Elkin, Yuval Gitlitz, and Ofer Neiman. Almost shortest paths with near-additive error in weighted graphs. arXiv preprint arXiv:1907.11422, 2019.
- [14] Michael Elkin and Ofer Neiman. Linear-size hopsets with small hopbound, and constant-hopbound hopsets in rnc. In The 31st ACM Symposium on Parallelism in Algorithms and Architectures, pages 333–341, 2019.
- [15] Jeremy T Fineman. Nearly work-efficient parallel algorithm for digraph reachability. SIAM Journal on Computing, 49(5):STOC18–500, 2019.
- [16] Sebastian Forster and Danupon Nanongkai. A faster distributed single-source shortest paths algorithm. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 686–697. IEEE, 2018.
- [17] Maximilian Probst Gutenberg and Christian Wulff-Nilsen. Decremental sssp in weighted digraphs: Faster and against an adaptive adversary. In Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2542–2561. SIAM, 2020.
- [18] Monika Henzinger, Sebastian Krinninger, and Danupon Nanongkai. Decremental single-source shortest paths on undirected graphs in near-linear total update time. In Foundations of Computer Science (FOCS), 2014 IEEE 55th Annual Symposium on, pages 146–155. IEEE, 2014.
- [19] Monika Henzinger, Sebastian Krinninger, and Danupon Nanongkai. Sublinear-time decremental algorithms for single-source reachability and shortest paths on directed graphs. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing, pages 674–683, 2014.
- [20] Monika Henzinger, Sebastian Krinninger, and Danupon Nanongkai. Improved algorithms for decremental single-source reachability on directed graphs. In Automata, Languages, and Programming: 42nd International Colloquium, ICALP 2015, Kyoto, Japan, July 6-10, 2015, Proceedings, Part I 42, pages 725–736. Springer, 2015.
- [21] William Hesse. Directed graphs requiring large numbers of shortcuts. In Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms, pages 665–669. Society for Industrial and Applied Mathematics, 2003.
- [22] Shang-En Huang and Seth Pettie. Lower Bounds on Sparse Spanners, Emulators, and Diameter-reducing shortcuts. In David Eppstein, editor, 16th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2018), volume 101 of Leibniz International Proceedings in Informatics (LIPIcs), pages 26:1–26:12, Dagstuhl, Germany, 2018. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.
- [23] Shang-En Huang and Seth Pettie. Thorup–Zwick emulators are universally optimal hopsets. Information Processing Letters, 142:9–13, 2019.
- [24] Arun Jambulapati, Yang P Liu, and Aaron Sidford. Parallel reachability in almost linear work and square root depth. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), pages 1664–1686. IEEE, 2019.
- [25] Adam Karczmarz and Piotr Sankowski. A deterministic parallel APSP algorithm and its applications. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 255–272. SIAM, 2021.
- [26] Philip N Klein and Sairam Subramanian. A randomized parallel algorithm for single-source shortest paths. Journal of Algorithms, 25(2):205–220, 1997.
- [27] Shimon Kogan and Merav Parter. Beating Matrix Multiplication for $n^{1/3}$-Directed Shortcuts. In Mikołaj Bojańczyk, Emanuela Merelli, and David P. Woodruff, editors, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022), volume 229 of Leibniz International Proceedings in Informatics (LIPIcs), pages 82:1–82:20, Dagstuhl, Germany, 2022. Schloss Dagstuhl – Leibniz-Zentrum für Informatik.
- [28] Shimon Kogan and Merav Parter. Having hope in hops: New spanners, preservers and lower bounds for hopsets. In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 766–777. IEEE, 2022.
- [29] Shimon Kogan and Merav Parter. New diameter-reducing shortcuts and directed hopsets: Breaking the $\sqrt{n}$ barrier. In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1326–1341. SIAM, 2022.
- [30] Kevin Lu, Virginia Vassilevska Williams, Nicole Wein, and Zixuan Xu. Better lower bounds for shortcut sets and additive spanners via an improved alternation product. In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 3311–3331. SIAM, 2022.
- [31] Ofer Neiman and Idan Shabat. A unified framework for hopsets. In 30th Annual European Symposium on Algorithms (ESA 2022). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2022.
- [32] Hanmao Shi and Thomas H Spencer. Time–work tradeoffs of the single-source shortest paths problem. Journal of Algorithms, 30(1):19–32, 1999.
- [33] Mikkel Thorup. On shortcutting digraphs. In International Workshop on Graph-Theoretic Concepts in Computer Science, pages 205–211. Springer, 1992.
- [34] Jeffrey D Ullman and Mihalis Yannakakis. High-probability parallel transitive-closure algorithms. SIAM Journal on Computing, 20(1):100–125, 1991.
Appendix A Proof of Lemma 6
Let be a sufficiently large integer, and let such that divides . Let . Recall that to obtain , we first construct graph , where denotes our initial construction of the graph in Lemma 1 on nodes with an associated set of paths. We let denote the nodes and denote the edges of . Let be the set of critical pairs associated with , and let be the corresponding canonical paths. Let and be the construction parameters used to construct . Then
Note that for all , we have that . We then modified by replacing each node with a path with nodes. If an edge was originally in , we replaced it with an edge . This gave us our final -node graph .
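The node-to-path replacement step described above can be illustrated with a minimal sketch. The function name `replace_nodes_with_paths`, the parameter `k` (the number of nodes in each replacement path), and the `(v, i)` labelling of path nodes are hypothetical names chosen for illustration. The sketch also assumes that an original edge from u to v is rewired to run from the last node of u's path to the first node of v's path.

```python
from typing import Hashable, Iterable, List, Set, Tuple

Node = Hashable
Edge = Tuple[Node, Node]


def replace_nodes_with_paths(
    nodes: Iterable[Node], edges: Iterable[Edge], k: int
) -> Tuple[List[Tuple[Node, int]], Set[Tuple[Tuple[Node, int], Tuple[Node, int]]]]:
    """Replace every node v by a directed path (v, 0) -> ... -> (v, k - 1).

    Assumption (illustrative): an original edge (u, v) becomes an edge
    from the last copy of u, (u, k - 1), to the first copy of v, (v, 0).
    """
    if k < 1:
        raise ValueError("each replacement path needs at least one node")
    nodes = list(nodes)
    new_nodes = [(v, i) for v in nodes for i in range(k)]
    new_edges = set()
    # Internal edges of each replacement path.
    for v in nodes:
        for i in range(k - 1):
            new_edges.add(((v, i), (v, i + 1)))
    # Rewired copies of the original edges.
    for u, v in edges:
        new_edges.add(((u, k - 1), (v, 0)))
    return new_nodes, new_edges
```

Under this rewiring, every path of the original graph corresponds to exactly one path of the new graph between the matching path endpoints, which is consistent with the observation that the replacement cannot increase the number of paths between pairs of nodes.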
Let , and for all , let be the path obtained by taking and replacing each node with the path . Let . Then it is clear from our construction of that for all , the path is a unique shortest -path in . Additionally, for all , the number of nodes in path is at least , where
We now quickly verify that graph and associated critical paths satisfy the properties of Lemma 1:
• By construction, has layers, and each path in travels from the first layer to the last layer.
• Each path is a unique shortest path between its endpoints in . This follows from Lemma 2 and the observation that our path replacement step cannot increase the number of paths between pairs of nodes in .
• What remains is to show that for any two nodes , there are at most paths in that contain both and . We prove this in two cases:
  – Case 1: for some . Note that . By Proposition 4, the number of paths such that is at most the number of direction vectors
    Then by our construction of , the number of paths such that is at most .
  – Case 2: and for distinct . By Lemma 5, the number of paths such that is at most
    (The final inequality follows from the fact that .)
Appendix B Extending our shortcut set lower bound
To extend our shortcut set lower bound so that it holds in the regime of , we will prove a more general statement about the behavior of the extremal function of shortcut sets. We write for the smallest integer such that every -node graph has a shortcut set of size such that has diameter at most .
Lemma 13.
For all positive integers and ,
Lemma 13 essentially states that if we decrease the number of shortcuts allowed in our shortcut set, the extremal function controlling the worst-case size of shortcut sets won’t increase by too much. We will use this lemma in the opposite direction to argue that our lower bound of (where is a sufficiently large constant) that we obtained from our shortcut set construction in Lemma 7 implies lower bounds for shortcut sets with greater than shortcuts.
Let , and let . Then by applying our lower bound from Lemma 7 to Lemma 13 we find that
Rearranging, we find that
as claimed in Theorem 8.
We now prove Lemma 13, which will follow from a simple path subsampling argument.
Proof of Lemma 13.
Let be a positive integer, , and . Let be a graph on nodes. We subsample nodes of to construct a smaller graph as follows.
-
•
Independently sample each node into set with probability . Then .
-
•
For all pairs of nodes such that , add directed edge to . This completes the construction of .
By Markov’s inequality, with probability at least . Assume for now that this does indeed hold, and we have . Then using the fact that if , we find that there exists a shortcut set of size such that the diameter of is at most
Now we claim that this implies that . For every pair of nodes such that is reachable from in , fix a shortest -path in . Then the following statement holds with high probability: for all pairs of nodes such that , path contains a node that was sampled into , i.e. . From now on, we will assume this property holds for our sampled set .
Consider a pair of nodes such that and is reachable from in , and let be an -path in . Let be the node in closest to , and let be the node in closest to . By our construction of and the above discussion, there must be a -path in . Moreover, . Then since for all edges in , it follows that
Then putting it all together,
Finally, to conclude the analysis, we made two assumptions: (1) that , which occurs with probability , and (2) that for all that are sufficiently far apart, we sampled a node on a path, which happens with high probability, i.e., . By a union bound over the two failure events, there is positive probability that both events happen at the same time. So a graph exists as described, completing the proof. ∎
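The subsampling construction used in the proof of Lemma 13 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the exact construction: the sampling probability `q` and the distance threshold `ell` are hypothetical parameters standing in for the quantities fixed in the proof, and the rule "add an edge between two sampled nodes whenever the second is reachable from the first within `ell` hops" is an assumed reading of the edge condition. The function names are likewise chosen for illustration.

```python
import random
from collections import deque
from typing import Dict, Hashable, List, Sequence, Set, Tuple

Node = Hashable


def bfs_distances(adj: Dict[Node, Sequence[Node]], source: Node) -> Dict[Node, int]:
    """Hop distances from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def subsample_graph(
    nodes: Sequence[Node],
    adj: Dict[Node, Sequence[Node]],
    q: float,
    ell: int,
    rng: random.Random,
) -> Tuple[List[Node], Set[Tuple[Node, Node]]]:
    """Sample each node independently with probability q, then connect
    sampled pairs (u, v) whenever v is within ell hops of u.

    Assumption (illustrative): q and ell are free parameters here; the
    proof fixes specific values for them.
    """
    sampled = [v for v in nodes if rng.random() < q]
    new_edges = set()
    for u in sampled:
        dist = bfs_distances(adj, u)
        for v in sampled:
            if v != u and v in dist and dist[v] <= ell:
                new_edges.add((u, v))
    return sampled, new_edges
```

A shortcut set found for the sampled graph can then be reused in the original graph, since every edge of the sampled graph stands for a path of at most `ell` hops between two sampled nodes.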