New algorithms for girth and cycle detection
Abstract
Let $G=(V,E)$ be an unweighted undirected graph with $n$ vertices and $m$ edges. Let $g$ be the girth of $G$, that is, the length of a shortest cycle in $G$. We present a randomized algorithm with a running time of that returns a cycle of length at most , where is an integer and , for every graph with .
Our algorithm generalizes an algorithm of Kadria et al. [SODA’22] that computes a cycle of length at most in time. Kadria et al. also presented an algorithm that finds a cycle of length at most in time, where must be an integer. Our algorithm generalizes this algorithm as well, by replacing the integer parameter in the running time exponent with a real-valued parameter , thereby offering greater flexibility in parameter selection and enabling a broader spectrum of combinations between running times and cycle lengths.
We also show that for sparse graphs a better tradeoff is possible, by presenting an time randomized algorithm that returns a cycle of length at most , where is an integer and , for every graph with .
To obtain our algorithms we develop several techniques and introduce a formal definition of hybrid cycle detection algorithms. Both may prove useful in broader contexts, including other cycle detection and approximation problems. Among our techniques is a new cycle searching technique, in which we search for a cycle from a given vertex and possibly all its neighbors in linear time. Using this technique together with more ideas we develop two hybrid algorithms. The first allows us to obtain a -time, -approximation of . The second is used to obtain our -time and -time approximation algorithms.
1 Introduction
Let $G=(V,E)$ be an unweighted undirected graph with $n$ vertices and $m$ edges. A set of vertices in , where , is a cycle of length if and , where . A is a cycle of length at most . The girth of $G$ is the length of a shortest cycle in $G$. The girth of a graph has been studied extensively since the 1970s by researchers from both the graph theory and the algorithms communities.
Itai and Rodeh [6] showed that the girth can be computed in time or in time, where [13], if Fast Matrix Multiplication (FMM) algorithms are used. They also proved that the problem of computing the girth is equivalent to the problem of deciding whether there is a (triangle) in the graph or not.
Interestingly, there is a close connection between the girth problem and the All Pairs Shortest Path (APSP) problem. Vassilevska W. and Williams [12] proved that a truly subcubic time algorithm that computes the girth, without FMM, implies a truly subcubic time algorithm that computes APSP, without FMM. Such an algorithm for APSP would be a major breakthrough. In light of this girth and APSP connection, it is natural to settle for an approximation algorithm for the girth instead of exact computation. An -approximation of (where and ), satisfies . We denote an approximation as an -approximation if and as a -approximation if .
Itai and Rodeh [6] presented a -approximation algorithm that runs in time. Notice that in contrast to the APSP problem, where a running time of is inevitable since the output size is , in the girth problem the output is a single number, thus, there is no natural barrier for sub-quadratic time algorithms. Indeed, Lingas and Lundell [8] presented a -approximation algorithm that runs in time, and Roditty and V. Williams [11] presented a -approximation algorithm that runs in time. Dahlgaard, Knudsen and Stöckel [5] presented two tradeoffs between running time and approximation. One generalizes the algorithms of [8, 11] and computes a cycle of length at most in time. The other computes, whp, a , for any integer , in time.
Kadria et al. [7] significantly improved upon the second algorithm of [5] and presented an algorithm, that for every integer , computes a in time. They also presented an algorithm, that for every , computes a cycle of length at most , in time, for every graph with .
These two algorithms of Kadria et al., as well as a few other approximation algorithms (see, for example, [8], [3], [9]), were obtained using a general framework for girth approximation in which a search is performed over the range of possible values of , using some algorithm that gets as input an integer which is a guess for the value of . In each step of the search, either returns a cycle , where is a non-decreasing function, or determines that . The goal of the search is to find the smallest for which returns a cycle, because for this value we have (and thus ), and algorithm returns a . This cycle is of length at most since and is a non-decreasing function. The two possible outcomes of and its usage in the general girth approximation framework inspired us to formally define the notion of a -hybrid algorithm as follows:
Definition 1.1.
A -hybrid algorithm is an algorithm that either outputs a or determines that .
When , the algorithm is referred to as a -hybrid algorithm. The girth approximation framework described above suggests that a possible approach to developing efficient girth approximation algorithms is to develop efficient -hybrid algorithms.
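To make the framework concrete, the following minimal Python sketch shows how any hybrid routine plugs into the search described above. The function names, the graph representation, and the starting guess of 3 are illustrative assumptions, not the paper's pseudocode.

```python
# Illustrative sketch of the girth-approximation framework (not the paper's pseudocode).
# `hybrid(H, t)` stands for any hybrid algorithm in the sense of Definition 1.1:
# it either returns a cycle (as a list of vertices) whose length is bounded by a
# non-decreasing function of t, or returns None, certifying that the girth exceeds t.
from typing import Callable, Dict, List, Optional

Graph = Dict[int, List[int]]                      # adjacency lists
Hybrid = Callable[[Graph, int], Optional[List[int]]]

def girth_approx(G: Graph, hybrid: Hybrid) -> Optional[List[int]]:
    """Scan the guesses t = 3, 4, ... and return the first cycle the hybrid finds.

    For the smallest successful guess t we have t <= g (every smaller guess was
    certified to be below the girth), so the returned cycle obeys the hybrid's
    length bound evaluated at a value that is at most g."""
    n = len(G)
    for t in range(3, n + 1):                     # g <= n whenever G has a cycle
        cycle = hybrid(G, t)
        if cycle is not None:
            return cycle
    return None                                   # G is acyclic
```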
Kadria et al. [7] designed several algorithms that satisfy the definition of -hybrid algorithms. Their girth approximation algorithms mentioned above were obtained using two different -hybrid algorithms. Additionally, for every , they presented a -hybrid algorithm and a -hybrid algorithm that run in time, and a -hybrid algorithm that runs in time. Therefore, for , and , there is a -hybrid algorithm that runs in time. A natural question is whether these three algorithms are only part of a general tradeoff between the running time, and .
Problem 1.1.
Let and be two integers and let . Is it possible to obtain a -hybrid algorithm that runs in time?
In this paper we present a -hybrid algorithm that runs, whp, in time. This algorithm provides an affirmative answer to Problem 1.1, up to the factor in the running time.
Using our -hybrid algorithm we obtain a generalization of the time algorithm of Kadria et al. [7] that computes a cycle of length at most . Our generalized algorithm runs in time, whp, and returns a cycle of length at most , where is an integer and , for every graph with . We also show that if the graph is sparse then the approximation can be improved. More specifically, we present an algorithm that runs in time, whp, and returns a cycle of length at most , where is an integer and , for every graph with .
Our -time algorithm also generalizes the -time algorithm of Kadria et al. [7], which computes a for every integer . In our algorithm, the integer parameter that appears in the exponent of the running time is replaced by a real-valued parameter . Thus, we introduce many new points on the tradeoff curve between running time and approximation ratio. Specifically, for every integer , up to additional tradeoff points are added (since for every such , when is a multiple of , we get a -time algorithm which computes a ). For example, consider and a graph with girth or . Our algorithm yields two additional points on the tradeoff curve, corresponding to and . For , we compute a in time, and for , we compute a in time. These points lie between the two points on the tradeoff curve given by the algorithm of Kadria et al. [7], which computes either a in time or a in time. See Figure 1 for a comparison.
[Figure 1: a table comparing running-time exponents and cycle-length bounds.]
The tradeoff curve to which we add new points encompasses many known algorithms, including those of Itai and Rodeh [6], Lingas and Lundell [8], and Kadria et al. [7] (and those of Roditty and V. Williams [11] for and when is an integer, and Dahlgaard et al. [5] for some values of and ). Notably, some of these algorithms have resisted improvement for many years. The addition of new points to this curve reinforces the possibility that it captures a fundamental relationship between running time and approximation quality. This, in turn, motivates further investigation into whether a matching lower bound exists for this tradeoff.
The rest of this paper is organized as follows. In Section 2 we provide an overview. Preliminaries are in Section 3. In Section 4, we present a new cycle searching technique that is used by our algorithms. In Section 5 we present a -hybrid algorithm and then use it to obtain a -approximation algorithm for the girth. In Section 6 we generalize the -hybrid algorithm and present a -hybrid algorithm. In Section 7 we use the hybrid algorithm from Section 6 to obtain two more approximation algorithms for the girth.
2 Overview
Among the techniques that we develop to obtain our new algorithms, is a new cycle searching technique that might be of independent interest. Our new technique exploits the property that if is not on a , then for any two neighbors and of , the set of vertices at distance exactly from and that are also at distance from are disjoint (see Figure 2). This allows us to check efficiently for all the neighbors of if they are on a . Using this technique, together with more tools that we develop, we obtain two hybrid algorithms.
The first is a relatively simple -time, -hybrid algorithm. We use this hybrid algorithm in the girth approximation framework described earlier, to obtain an -time, -approximation of the girth, where or . We remark that using an algorithm of [4] it is possible to obtain a -approximation in time. (Footnote 1: [4] showed that a , if it exists, can be found in time, and if not, then a , if it exists, can be found in the same time. Thus, if we run their algorithm with increasing values of we can obtain a -approximation for the girth in time, where or . However, the additional factor might be significant even for small values of .)
The second is the -hybrid algorithm that solves Problem 1.1. Its main component is an -hybrid algorithm that runs in -time, whp, and generalizes the first -time -hybrid algorithm, by introducing an additional parameter . Using we can tradeoff between the running time and the lower bound on and obtain a faster running time at the price of a worse lower bound.
We compare our -hybrid algorithm to algorithm Cycle of Kadria et al. [7], an -time -hybrid algorithm, where , that they used to obtain the -time, -approximation algorithm. (Footnote 2: Cycle runs in time, which can be reduced to time, as shown in [7].) As we show later, the running time of our -hybrid algorithm can be bounded by . Since in our algorithm is not necessarily a multiple of (compared to the of Cycle), our algorithm allows more flexibility, and we achieve many more possible tradeoffs between the running time and the output cycle length. For example, if we consider a multiplicative approximation better than , when the value of is a constant known in advance, our algorithm can return longer cycles that are still shorter than , in a faster running time. See Figure 3 for a comparison. (Footnote 3: [7] also presented an -time, -hybrid algorithm, where are integers. For , this is an -time, -hybrid algorithm, similar to our -hybrid algorithm. However, since , the possible values of are restricted and must satisfy . By choosing and an appropriate , the two algorithms have similar flexibility for a -approximation, but since our algorithm also allows larger values of , we can achieve a faster running time for a -approximation where .)
The flexibility of our algorithm is also demonstrated in Figure 4. For a given constant value of , if our -hybrid algorithm returns a cycle then its length is at most . If we want algorithm Cycle to output a , then is the largest that we can choose, since must be an integer. The running time is . Our algorithm achieves a better running time if is not divisible by . (In Figure 4 we choose .)
Next, we overview our -hybrid algorithm that either finds a or determines that in time. To determine that , we can check for every if is on a . If is on a , then all the vertices and edges of this are at distance at most from . If, for every , the number of edges at distance at most is then using standard techniques we can check for every if is on a in time. However, this is not necessarily the case, and the region at distance at most from some vertices might be dense. To deal with dense regions within the promised running time we develop an iterative sampling procedure (see BfsSample in Section 6), whose goal is to sparsify the graph, or to return a . One component of the iterative sampling procedure is a generalization of our new cycle searching technique mentioned above. In the generalization instead of checking whether a vertex and its neighbors are on a , we check whether all the vertices up to a possibly further distance from are on a , for , and if not we mark them so that they can be removed later.
If the iterative sampling procedure ends without finding a then there are two possibilities. Let . If then it holds that the number of edges at distance at most from every is , whp, as required. If then it holds that the number of edges at distance at most from every is , whp. This does not necessarily imply that the graph is sparse enough for checking whether . In this case, we run another algorithm (see HandleReminder in Section 6) that continues to sparsify the graph until the number of edges at distance at most from every is and checking whether is possible within the required running time of .
3 Preliminaries
Let $G=(V,E)$ be an unweighted undirected graph with $n$ vertices and $m$ edges. Let be a set of vertices and let be the graph obtained from by deleting all the vertices of together with their incident edges. For two graphs and , let be . We say that if and . For convenience, we use both and to say that . For every , let be the length of a shortest path between and in . The girth of is the length of a shortest cycle in . Let be the length of a cycle . For an integer , we denote a cycle of length (at most) by (). (Footnote 4: Both and might not be simple cycles. However, the cycles that our algorithms return are simple.) Let be the edges incident to and the th edge in . Let be the degree of in . Let be the set of neighbors of , namely . For an edge set , let be the endpoints of ’s edges, that is, . Let and . The distance between and is . For every and a real number let be the ball graph of , where and [7]. (Footnote 5: When the graph is clear from the context, we sometimes omit from the notation.)
We now turn to present several essential tools that are required in order to obtain our new algorithms. We first restate an important property of the ball graph .
Lemma 3.1 ([7]).
Let be two integers and let . If is a tree then no vertex in is part of a cycle of length at most in .
We use procedure [7, 8] (see Algorithm 1) that searches for a in the ball graph . We summarize the properties of BallOrCycle in the next lemma.
Lemma 3.2 ([7]).
Let . If the ball graph is not a tree then returns a from . If is a tree then returns . (Footnote 7: If is returned then we assume that is ordered by the distance from , and for every we store with . Thus, given the set , we can find for every in time.) The running time of is .
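As a rough illustration of what a BallOrCycle-style search does, here is a truncated-BFS sketch in Python. It is a simplified stand-in for Algorithm 1, assuming the procedure explores the ball of a given radius around v and stops at the first non-tree edge; the exact procedure of [7, 8] and its precise cycle-length guarantee may differ in details.

```python
# Simplified stand-in for a BallOrCycle-style search (assumed behaviour, not Algorithm 1).
from collections import deque
from typing import Dict, List, Optional, Union

Graph = Dict[int, List[int]]

def ball_or_cycle(G: Graph, v: int, r: int) -> Union[List[int], Dict[int, int]]:
    """Truncated BFS of radius r from v.

    If a non-tree edge is met among the explored vertices, return the cycle it
    closes.  Otherwise return the ball as a dict mapping each vertex within
    distance r of v to its BFS distance from v."""
    dist: Dict[int, int] = {v: 0}
    parent: Dict[int, Optional[int]] = {v: None}
    queue = deque([v])
    while queue:
        x = queue.popleft()
        if dist[x] == r:
            continue                               # do not expand beyond radius r
        for y in G.get(x, []):
            if y not in dist:
                dist[y] = dist[x] + 1
                parent[y] = x
                queue.append(y)
            elif y != parent[x]:                   # non-tree edge: the ball is not a tree
                return _close_cycle(parent, x, y)
    return dist                                    # the ball graph is a tree

def _close_cycle(parent: Dict[int, Optional[int]], x: int, y: int) -> List[int]:
    """Combine the BFS-tree paths of x and y up to their lowest common ancestor."""
    up_x = [x]
    index = {x: 0}
    while parent[up_x[-1]] is not None:
        up_x.append(parent[up_x[-1]])
        index[up_x[-1]] = len(up_x) - 1
    up_y = [y]
    while up_y[-1] not in index:
        up_y.append(parent[up_y[-1]])
    lca = up_y[-1]
    return up_x[:index[lca]] + [lca] + up_y[:-1][::-1]  # closed by the non-tree edge (y, x)
```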
Next, we obtain a simple -hybrid algorithm, called AllVtxBallOrCycle, using BallOrCycle. AllVtxBallOrCycle (see Algorithm 2) gets a graph and an integer , and runs from every as long as no cycle is found by BallOrCycle. If BallOrCycle finds a cycle then AllVtxBallOrCycle stops and returns that cycle. If no cycle is found then AllVtxBallOrCycle returns null. We prove the next Lemma.
Lemma 3.3.
either finds a or determines that , in time.
Proof.
By Lemma 3.2, if returns a cycle then . Also by Lemma 3.2, if does not return a cycle then the ball graph is a tree, and specifically is not part of a in . Therefore, if no cycle was found during any of the calls then all the vertices in are not part of a in . Hence, does not contain a , and we get that . By Lemma 3.2, the running time of is , which is in total. ∎
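Under the same assumptions, the AllVtxBallOrCycle loop just analyzed can be sketched in a few lines, reusing the ball_or_cycle sketch above (illustrative, not Algorithm 2 verbatim).

```python
from typing import Dict, List, Optional

Graph = Dict[int, List[int]]

def all_vtx_ball_or_cycle(G: Graph, r: int) -> Optional[List[int]]:
    """Run the ball_or_cycle sketch from every vertex and stop at the first cycle.

    If no call finds a cycle, every ball of radius r is a tree, which is the
    certificate used in Lemma 3.3 to conclude that no short cycle exists."""
    for v in G:
        result = ball_or_cycle(G, v, r)
        if isinstance(result, list):              # a cycle was found
            return result
    return None                                   # every ball is a tree
```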
We now show that if the input graph satisfies a certain sparsity property then the running time of can be bounded as follows.
Corollary 3.1.
If for every then runs in time.
Proof.
For every , we have , which is at most . Thus, by Lemma 3.3, the running time of is . ∎
Next, we present procedure from [7]. IsDense (see Algorithm 3) gets a graph , a vertex , a budget (real) and a distance (integer). In the procedure a BFS is executed from . The BFS counts the edges that are scanned as long as their total number is less than and the farthest vertex from is at distance at most .
Lemma 3.4 ([7]).
Procedure runs in time. If then . If then .
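The following budgeted-BFS sketch captures the behaviour stated in Lemma 3.4, under the assumption that IsDense stops as soon as the edge budget is exhausted or the distance limit is reached; the threshold test and the edge-counting convention are illustrative.

```python
# Budgeted BFS in the spirit of IsDense (assumed behaviour, not Algorithm 3).
from collections import deque
from typing import Dict, List

Graph = Dict[int, List[int]]

def is_dense(G: Graph, v: int, budget: int, radius: int) -> bool:
    """Return True if at least `budget` edge scans occur within distance `radius`
    of v; never scans more than `budget` edges, so the cost is O(budget)."""
    dist = {v: 0}
    queue = deque([v])
    scanned = 0
    while queue:
        x = queue.popleft()
        if dist[x] == radius:
            continue                               # stay inside the ball of radius `radius`
        for y in G.get(x, []):
            scanned += 1                           # edges inside the ball are counted (possibly twice)
            if scanned >= budget:
                return True                        # the ball around v is dense
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return False                                   # the ball holds fewer than `budget` edge scans
```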
Given a vertex and a distance we sometimes want to bound . Therefore, we adapt a lemma and a corollary of [7] from vertices to edges.
Lemma 3.5.
Let be positive integers, let be a real number, and let . If , and for every , then .
Proof.
Let and assume that . We also know that , for every . We denote . If then as required. Now assume that . Since , . Therefore, . As the ball graph is connected, we know that and since , . Thus, we get that so . ∎
Using Lemma 3.5, we prove the following corollary.
Corollary 3.2.
Let be a positive integer and let be a real number. If for every , then , for every and .
Proof.
The proof is by induction on . For it follows from our assumption that . Assume now that the claim holds for . This implies that , for every . Combining this with the fact that , for every , by Lemma 3.5 we get that , for every and . ∎
We also adapt procedure of [7] to our needs. SparseOrCycle (see Algorithm 4) gets a graph , a parameter , and two integers , and iterates over vertices using a for-each loop. Let be the vertex currently considered and the current graph. If then is called. If BallOrCycle returns a cycle then is returned by SparseOrCycle. Otherwise, the vertex set is removed from along with the edge set . After the loop ends, if no cycle was found, we return null. Let be the set of vertices for which BallOrCycle was called and no cycle was found, and the graph after SparseOrCycle ends.
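The pass performed by SparseOrCycle can be sketched as follows, reusing the is_dense and ball_or_cycle sketches above. The budget and radius parameters and the removal of the whole (tree) ball are illustrative assumptions about Algorithm 4, not its exact pseudocode.

```python
from typing import Dict, List, Optional

Graph = Dict[int, List[int]]

def remove_vertices(G: Graph, S: List[int]) -> None:
    """Delete the vertices of S and their incident edges from G, in place."""
    drop = set(S)
    for u in drop:
        G.pop(u, None)
    for u in G:
        G[u] = [w for w in G[u] if w not in drop]

def sparse_or_cycle(G: Graph, budget: int, radius: int) -> Optional[List[int]]:
    """One sparsification pass: for every surviving vertex whose ball is dense,
    either report a cycle found inside the ball or delete the (tree) ball."""
    for v in list(G):
        if v not in G:
            continue                               # already removed with an earlier ball
        if is_dense(G, v, budget, radius):
            result = ball_or_cycle(G, v, radius)
            if isinstance(result, list):
                return result                      # cycle returned by the inner search
            remove_vertices(G, list(result))       # the ball is a tree: safe to delete
    return None
```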
The following lemma is similar to the corresponding lemmas from [7], and the proof is omitted.
Lemma 3.6.
satisfies the following:
-
(i)
If a cycle is returned then
-
(ii)
If a cycle is not returned then , for every
-
(iii)
If then is not part of a in
-
(iv)
runs in time.
Similarly to AllVtxBallOrCycle, we show for SparseOrCycle that if satisfies a certain sparsity property, the running time can be bounded as follows.
Corollary 3.3.
If for every vertex then runs in time.
Proof.
By Lemma 3.6, runs in time. For every , the call to returned Yes, so it follows from Lemma 3.4 that . The edge set is removed while removing . Therefore, for every we remove at least edges. Since at most edges can be removed, the size of is at most . By our assumption, we have . Therefore, we get that . Thus, the running time of SparseOrCycle is . ∎
Lemma 3.7.
It is possible to obtain in time, using sampling, a set of edges of size , that hits, whp, the closest edges of every .
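Lemma 3.7 is the standard hitting-set sampling argument. A sketch, with an illustrative constant and seed, is given below: sampling roughly (m/x)·log n edges uniformly hits, whp, every fixed set of x edges, and in particular the x closest edges of every vertex.

```python
import math
import random
from typing import Dict, List, Set, Tuple

Graph = Dict[int, List[int]]

def sample_hitting_edges(G: Graph, x: int, c: float = 3.0, seed: int = 0) -> Set[Tuple[int, int]]:
    """Sample about c * (m / x) * ln n edges uniformly at random (without replacement).

    For any fixed set of x edges, the probability that the sample misses all of
    them is at most (1 - x/m)^{sample size} <= n^{-c}; a union bound over the
    vertices gives the 'hits the x closest edges of every vertex' guarantee whp."""
    rng = random.Random(seed)
    edges = [(u, w) for u in G for w in G[u] if u < w]     # each undirected edge once
    n, m = len(G), len(edges)
    if m == 0:
        return set()
    size = min(m, math.ceil(c * (m / max(x, 1)) * math.log(max(n, 2))))
    return set(rng.sample(edges, size))
```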
We remark that some of our algorithms get a graph that is being updated during their run. Within their scope, denotes the current graph that includes all updates done so far.
4 A new cycle searching technique
Consider a vertex . It is straightforward to check whether is on a , for every integer , using in time. If does not return a then for every it holds that , where , as otherwise there would be a passing through and would have returned a . We show that it is possible to exploit this property to check for every whether is on a , using , in time instead of . More specifically, we present algorithm (see Algorithm 5) that gets a graph , a vertex , and an integer . We first initialize to . Then, we run . If a cycle is found by then is returned by NbrBallOrCycle. Otherwise, we add the vertex to , keep the neighbors of in , and then remove from . Recall that equals . Next, for every we run , as long as a cycle is not found. If a cycle is returned by then is returned by NbrBallOrCycle. (Footnote 8: For our needs it suffices to stop and return a cycle passing through a neighbor once we find one, though BallOrCycle can be run from all the neighbors of in the same running time bound of .) Otherwise, is added to . After the loop ends, the vertex and its adjacent edges are added back to the graph, and the set is returned by NbrBallOrCycle. We prove the following lemma.
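The following sketch, built on the ball_or_cycle sketch above, illustrates the structure of NbrBallOrCycle: search a ball around v, then temporarily delete v and search a (one smaller) ball around each neighbour. The radii and the exact contents of the returned set are illustrative assumptions, not Algorithm 5 itself.

```python
from typing import Dict, List, Set, Union

Graph = Dict[int, List[int]]

def nbr_ball_or_cycle(G: Graph, v: int, r: int) -> Union[List[int], Set[int]]:
    """Either return a short cycle, or return a set of vertices (v and its
    neighbours, in this sketch) certified not to lie on a short cycle."""
    result = ball_or_cycle(G, v, r)
    if isinstance(result, list):
        return result                              # cycle through the ball of v
    neighbours = list(G.get(v, []))
    saved = G.pop(v)                               # temporarily delete v ...
    for u in neighbours:
        G[u] = [w for w in G[u] if w != v]         # ... together with its incident edges
    certified: Set[int] = {v}
    try:
        for u in neighbours:
            res_u = ball_or_cycle(G, u, r - 1)     # disjointness makes these searches cheap overall
            if isinstance(res_u, list):
                return res_u                       # cycle through a neighbour's ball
            certified.add(u)
        return certified
    finally:
        G[v] = saved                               # add v and its edges back in every case
        for u in neighbours:
            G[u].append(v)
```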
Lemma 4.1.
If algorithm finds a cycle then . Otherwise, no vertex in is part of a in , and the set is returned.
Proof.
If a cycle was found, it happened during the call to or one of the calls to . Therefore, by Lemma 3.2, the cycle length is at most .
If no cycle was found during the run of then is not on a in . In addition, for every did not find a cycle. Hence, is not on a in and therefore also in , since and is not on a . Since no cycle was found, and the vertices for every are added to , so by the definition of we have . ∎
To bound the running time of NbrBallOrCycle, we show how to use the fact that no was found by , to efficiently run for every .
Lemma 4.2.
Let . If the ball graph is a tree then the total cost of running for every is .
Proof.
By the definitions of , and , we know that and . Since contains no cycles, it follows that and , for any two distinct vertices . Therefore, (i) , and (ii) . From Lemma 3.2 it follows that the total cost of the calls to for every is . It holds that , for every . Thus, we get that the total cost is . This equals , and it follows from (i) and (ii) that this is at most . ∎
We use Lemma 4.2 to bound the running time of NbrBallOrCycle.
Lemma 4.3.
Algorithm NbrBallOrCycle runs in time.
Proof.
Running costs . If a cycle is found, it is returned and the running time is . If no cycle is found, then removing (and later adding back) and its edges costs . By Lemma 4.2, the cost of running for every is . Adding and to takes time. Thus, the total running time of NbrBallOrCycle is . ∎
5 A -hybrid algorithm and a -approximation of the girth
In this section we first show how to use algorithm NbrBallOrCycle from the previous section to obtain a -hybrid algorithm that in time, either returns a or determines that . Then, we use the -hybrid algorithm to compute a -approximation of .
5.1 A -hybrid algorithm
We first present algorithm that gets a graph and an integer . Let () be before (after) running -SparseOrCycle. either finds a or removes vertices that are not on a , such that for every , the ball graph is relatively sparse, that is, . -SparseOrCycle (see Algorithm 6) iterates over vertices using a for-each loop. Let be the vertex currently considered. If then is called. If NbrBallOrCycle returns a cycle then -SparseOrCycle returns . If NbrBallOrCycle returns a vertex set then is removed from . After the loop ends, if no cycle was found, we return null.
Remark. Notice that either finds a or removes vertices that are not on a , such that for every it holds that . Using NbrBallOrCycle instead of BallOrCycle in -SparseOrCycle enables us in the case that a cycle is found to bound the cycle length with rather than , while still maintaining the property that , for every , in the case that no cycle is found.
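The k-SparseOrCycle pass has the same skeleton as the sparse_or_cycle sketch above, with NbrBallOrCycle in place of BallOrCycle. Since the exact density test is not reproduced here, it is passed in as a predicate; this is an illustrative sketch, not Algorithm 6.

```python
from typing import Callable, Dict, List, Optional

Graph = Dict[int, List[int]]

def k_sparse_or_cycle(G: Graph, r: int,
                      is_heavy: Callable[[Graph, int], bool]) -> Optional[List[int]]:
    """For every surviving vertex that passes the density test, either report the
    cycle found by the neighbourhood search or delete the certified set of vertices."""
    for v in list(G):
        if v not in G:
            continue                               # removed with an earlier certified set
        if is_heavy(G, v):
            result = nbr_ball_or_cycle(G, v, r)
            if isinstance(result, list):
                return result                      # a short cycle was found
            remove_vertices(G, list(result))       # vertices not on a short cycle
    return None
```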
We prove the following lemma.
Lemma 5.1.
satisfies the following:
-
(i)
If a cycle is returned then
-
(ii)
If a cycle is not returned then , for every
-
(iii)
If then is not part of a in
-
(iv)
runs in time.
Proof.
-
(i)
Since -SparseOrCycle returns a cycle only if a call to returns a cycle , it follows from Lemma 4.1 that .
-
(ii)
Let . Since was not removed, was considered in the for-each loop at some stage during the execution of -SparseOrCycle. At this stage , as otherwise, since no cycle was returned, by Lemma 4.1 the call to NbrBallOrCycle with would have returned , so would have been removed while removing . Since we have . As edges can only be removed during the run of -SparseOrCycle, we have also in .
-
(iii)
Since it follows that there was a vertex such that after a call to did not return a cycle. By Lemma 4.1, no vertex in is part of a in . Therefore, is not part of a in . Since during the run of -SparseOrCycle we remove only vertices that are not part of a , is not part of a also in .
-
(iv)
Computing takes time, as all the degrees can be computed in advance in time. We compute this value for at most distinct vertices so the running time of this part is at most in total. By Lemma 4.1, running NbrBallOrCycle takes time. Each edge in contributes at most to the sum , so and . If a call to did not return a cycle, then by Lemma 4.1, the set is returned. -SparseOrCycle removes the set and by doing so, the edge set is also removed. We charge each edge of with . Thus the total cost that we charge for is , which covers the cost of .
Since each edge can be charged and removed from at most once during the execution of -SparseOrCycle, the running time of -SparseOrCycle is at most . ∎
Next, we use -SparseOrCycle to design a -hybrid algorithm called -Hybrid. Notice first that if for every , then it is straightforward to obtain an -time -hybrid algorithm, by running . Thus, in -Hybrid we ensure that if we call AllVtxBallOrCycle then it holds for every that . To do so, we run -SparseOrCycle and possibly SparseOrCycle. If no cycle was returned then it holds that for every , and we can safely run AllVtxBallOrCycle.
-Hybrid (see Algorithm 7) gets a graph and an integer . -Hybrid is composed of three stages. In the first stage we call . If -SparseOrCycle returns a cycle then -Hybrid stops and returns , otherwise we proceed to the second stage. In the second stage, if , we call . If SparseOrCycle returns a cycle then -Hybrid stops and returns , otherwise we proceed to the last stage. In the last stage, we call . If AllVtxBallOrCycle returns a cycle then -Hybrid stops and returns , otherwise -Hybrid returns null. In the next lemma we prove the correctness and analyze the running time of .
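Before turning to the lemma, here is how the three stages compose, reusing the earlier sketches. The guard on the second stage and the choice of budget are placeholders for the conditions in Algorithm 7, which depend on parameters not reproduced here.

```python
from typing import Callable, Dict, List, Optional

Graph = Dict[int, List[int]]

def hybrid_sketch(G: Graph, r: int, budget: Optional[int],
                  is_heavy: Callable[[Graph, int], bool]) -> Optional[List[int]]:
    """Stage 1 sparsifies via the neighbourhood search, stage 2 (only when the
    guard applies) sparsifies further, and stage 3 runs the all-vertices search,
    whose failure certifies that the graph has no short cycle."""
    cycle = k_sparse_or_cycle(G, r, is_heavy)      # stage 1
    if cycle is not None:
        return cycle
    if budget is not None:                         # stage 2: needed only in the dense case
        cycle = sparse_or_cycle(G, budget, r)
        if cycle is not None:
            return cycle
    return all_vtx_ball_or_cycle(G, r)             # stage 3
```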
Lemma 5.2.
either returns a or determines that , in time.
Proof.
First, returns a cycle only if , or returns a cycle . Therefore, if a cycle is returned then it follows from Lemmas 5.1, 3.6, or 3.3, respectively, that .
Second, we show that if no cycle was found then . If no cycle was found then it might be that some vertices were removed from the graph. A vertex can be removed either by SparseOrCycle or by -SparseOrCycle. It follows from Lemma 3.6 and Lemma 5.1, that is not part of a when is removed. Since only vertices that are not on a are removed, every that was in the input graph also belongs to the updated graph. We then call with the updated graph. Since we are in the case that no cycle was found, AllVtxBallOrCycle did not return a cycle. It follows from Lemma 3.3 that in the updated graph, and therefore also in the input graph.
Now we turn to analyze the running time of -Hybrid. At the beginning, -Hybrid calls -SparseOrCycle. By Lemma 5.1, -SparseOrCycle runs in time. Let be the graph after the call to -SparseOrCycle. By Lemma 5.1, for every we have .
Next, -Hybrid checks if . We divide the rest of the proof into the case that and the case that . If then -Hybrid calls AllVtxBallOrCycle. Since and since for every we have , it follows from Corollary 3.2 that for every . By Corollary 3.1, the running time of AllVtxBallOrCycle is .
We now turn to the case that . In this case it might be that for some vertices . Therefore, we first call . Notice that since we are in the case that it holds that . Moreover, for every . Thus, it follows from Corollary 3.2 that for every . By Corollary 3.3 if for every then the running time of is .
Let be the graph after SparseOrCycle ends. By Lemma 3.6, for every we have . It follows from Corollary 3.2 that . Now -Hybrid calls AllVtxBallOrCycle, and using Corollary 3.1 again, we get that the running time of AllVtxBallOrCycle is .
It follows from the above discussion that -Hybrid either returns a or determines that , and the running time is . ∎
5.2 A -approximation of the girth
Next, we describe algorithm AdtvGirthApprox, which uses -Hybrid and the framework described in Section 1, to obtain a -approximation of , when . AdtvGirthApprox (see Algorithm 8) gets a graph . In AdtvGirthApprox, we set to and start a while loop. In each iteration, we create a copy of and call . If -Hybrid finds a cycle then AdtvGirthApprox stops and returns , otherwise we increment by and continue to the next iteration. We prove the following theorem.
Theorem 5.1.
Algorithm returns either a or a , and runs in time, where or and . (Footnote 9: We note that when , the -time, -approximation of Itai and Rodeh [6] can be used.)
Proof.
We first prove the bound on the approximation. AdtvGirthApprox always returns a cycle in , so it cannot return a . AdtvGirthApprox starts with . It follows from Lemma 5.2 that, as long as , the calls to -Hybrid do not return a cycle, since the graph does not contain a . Consider now the iteration in which . AdtvGirthApprox calls where . It follows from Lemma 5.2 that -Hybrid either returns a or determines that . Since we assume that , -Hybrid returns a , which is either a or a , since or .
We now turn to analyze the running time. Creating a copy of takes time, and by Lemma 5.2 the running time of is . Therefore, for every , the running time of the iteration of the while loop with this value of is . From the previous part of this proof it follows that the last iteration of the while loop is when , thus, the running time of AdtvGirthApprox is . Therefore, when , the running time is . ∎
6 A general hybrid algorithm
Algorithm , presented in the previous section, either returns a or determines that , in time. In this section we introduce an additional parameter and present a -hybrid algorithm that either returns a or determines that , in time. In Section 7 we use the -hybrid algorithm to present two tradeoffs for girth approximation.
To obtain the -hybrid algorithm we first extend algorithm NbrBallOrCycle. Then, we use the extended NbrBallOrCycle together with additional tools that we develop to either return a or sparsify dense regions of the graph, so that we can check whether (or return a ) in time, by running .
6.1 Extending NbrBallOrCycle
In algorithm NbrBallOrCycle we mark vertices that can be removed from the graph, by using the property that if did not return a then is not on a . In [7], they introduced an additional parameter and used the following extended version of this property: If did not return a then no vertex of is on a . We use the same approach and modify NbrBallOrCycle to get an additional integer parameter such that . After each call to , where , if no cycle was found we add , instead of , to . The modified pseudo-code appears in Algorithm 9. We rephrase Lemma 4.1 to suit this modification.
Lemma 6.1.
Let . If finds a cycle then . Otherwise, no vertex in is part of a in , and the set is returned.
Proof.
The proof that if a cycle is returned then is as in Lemma 4.1. It is left to show that if no cycle is found then no vertex in is part of a in , and the set is returned.
From Lemma 3.1 it follows that if did not return a cycle then no vertex in is part of a in , and therefore also in , since and since itself is not on a as otherwise would not have been called. Thus, if no cycle was found, we have , as in this case contains and for each , which equals .
For the running time, we note that the sets are computed during the execution of , for every for which no cycle was found. Their total size is also , and we can obtain from them the sets and add these sets to in time. Therefore, the modified NbrBallOrCycle also runs in time.
6.2 A -hybrid algorithm
In this section we present a -hybrid algorithm called ShortCycle, where . ShortCycle (see Algorithm 10) gets a graph and two integers . If then we run algorithm (see Algorithm 11), which is based on algorithm DegenerateOrCycle of [7]. If then the main challenge is when . In this case we run algorithm (described later). The cases that or are relatively simple and treated in algorithm (see Section 6.2.1). We summarize the properties of in the next theorem.
Theorem 6.1.
Let be integers. runs whp in time and either returns a , or determines that .
The next corollary follows from Theorem 6.1, when .
Corollary 6.1.
Let . Algorithm runs whp in time and either returns a , or determines that .
In the rest of this section, we present the proof of Theorem 6.1. As follows from [7], if then returns in time a . We now consider the case in which . We prove in Section 6.2.1 that satisfies the claim of Theorem 6.1, when or .
Our main technical contribution is algorithm ShortCycleSparse that handles the case of . Notice that if for every , then is a -hybrid algorithm that either finds a (which is also a as ) or determines that , in time. Thus, in ShortCycleSparse we ensure that if we call AllVtxBallOrCycle, the property that , for every (whp), holds. To do so, we run BfsSample and possibly HandleReminder. If no cycle was returned, the property holds, and we can safely run AllVtxBallOrCycle.
ShortCycleSparse (see Algorithm 12) gets a graph and two integers such that , and is composed of three stages. In the first stage we call (described later). If BfsSample returns a cycle then ShortCycleSparse stops and returns , otherwise we proceed to the second stage. In the second stage, if , we call (also described later). If HandleReminder returns a cycle then ShortCycleSparse stops and returns , otherwise we proceed to the last stage. In the last stage, we call . If AllVtxBallOrCycle returns a cycle then ShortCycleSparse stops and returns , otherwise ShortCycleSparse returns null.
Next, we give a high level description of BfsSample. The goal of BfsSample is to either sparsify the graph without removing any , or to report a . For simplicity assume that . In such a case, if BfsSample does not report a , then the graph after BfsSample ends contains all the s that were in the original graph, and satisfies, whp, the following sparsity property: For every it holds that .
This implies that in BfsSample we need to find every which is in a dense region with , and to check if is in a , so that if not we can remove . Finding every such is possible within the time limit by running for every . The problem is that checking whether is on a for every such is too costly since there might be such vertices, and this check costs using .
One way to overcome this problem is to sample an edge set of size that hits the closest edges of each vertex, and then use to detect the vertices in the dense regions that are not on a . In BfsSample we use a detection process in which we call BallOrCycle or NbrBallOrCycle from the endpoints of ’s edges, and then, if no cycle was found, we use the information obtained from this call to identify vertices that are not on a . The detection process either detects vertices that are not on a and can be removed, or reports a . However, it is not clear how to implement this detection process efficiently, since just running BallOrCycle from the endpoints of ’s edges takes time which might be too much. Our solution is an iterative sampling procedure that starts with a smaller hitting set of edges, of size . For such a hitting set we can run our detection process. If a was not reported, then we remove the appropriate vertices and sparsify the graph without removing any . When the graph is sparser, the running time of our detection process becomes faster. Thus, in the following iteration we can sample a larger hitting set for which we run this process, and either return a or sparsify the graph further for the next iteration. We continue the iterative sampling procedure until we get to the required sparsity property in which for every (whp).
We remark that in the first iteration of BfsSample, the detection process calls NbrBallOrCycle, while in the rest of the iterations BallOrCycle is called. The use of NbrBallOrCycle allows us, in the case that no is reported, to bound with for every , rather than . This is used to achieve the required sparsity property. Since NbrBallOrCycle runs in time we can only use it in the first iteration when the sampled set is small enough. In the rest of the iterations we use BallOrCycle instead.
We now formally describe BfsSample. BfsSample (see Algorithm 13) gets a graph and two integers such that . We first set to . Then, we start the main for loop that has at most iterations. In the th iteration, we initialize to and sample a set of size . Next, we scan the endpoints in using an inner for-each loop.
If , we call from every endpoint . NbrBallOrCycle returns either a cycle or a set of vertices . If NbrBallOrCycle returns a cycle then the cycle is returned by BfsSample. Otherwise, we add to .
If then we call , where , from every endpoint . If a cycle is found by BallOrCycle then the cycle is returned by BfsSample. If BallOrCycle does not return a cycle then we add to .
Right after the inner for-each loop ends, we remove from , and continue to the next iteration of the main for loop. If no cycle was found after iterations, we return null. Let be the last iteration in which vertices were removed. Let be the set of vertices that were removed during the th iteration, where for . Let () be before (after) running BfsSample. Figure 5 and Figure 6 illustrate the key steps of the first and the following iterations of BfsSample, respectively. We summarize the properties of BfsSample in the next lemma.
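Before the lemma, the iterative sampling procedure can be summarized by the following sketch, built on the earlier sketches. The iteration-dependent sample sizes and radii, here supplied as the callables hit_target, search_radius and mark_radius, are placeholders for the paper's parameters; only the sample-search-mark-remove structure is taken from the description above.

```python
from typing import Callable, Dict, List, Optional, Set

Graph = Dict[int, List[int]]

def bfs_sample(G: Graph, iterations: int, r: int,
               hit_target: Callable[[int], int],
               search_radius: Callable[[int], int],
               mark_radius: Callable[[int], int]) -> Optional[List[int]]:
    """Each iteration samples an edge hitting set, searches around its endpoints,
    marks vertices certified not to lie on a short cycle, and removes them, so
    that later (larger) samples can be processed on an ever sparser graph."""
    for i in range(iterations):
        to_remove: Set[int] = set()
        sample = sample_hitting_edges(G, hit_target(i))
        endpoints = {u for e in sample for u in e if u in G}
        for u in endpoints:
            if i == 0:
                result = nbr_ball_or_cycle(G, u, r)
                if isinstance(result, list):
                    return result                  # a short cycle was found
                to_remove |= result                # the set returned by the neighbourhood search
            else:
                result = ball_or_cycle(G, u, search_radius(i))
                if isinstance(result, list):
                    return result
                # the ball is a tree: its inner part cannot lie on a short cycle
                to_remove |= {w for w, dw in result.items() if dw <= mark_radius(i)}
        remove_vertices(G, list(to_remove))        # sparsify before the next iteration
    return None
```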
Lemma 6.2.
satisfies the following:
-
(i)
If a cycle is returned then
-
(ii)
If a cycle is not returned then , for every , whp
-
(iii)
If then is not part of a in
-
(iv)
runs in time, whp.
Proof.
- (i)
-
(ii)
Next, we show that if a cycle is not returned then , for every , whp. Since , we have and there is at least one iteration.
Consider the th iteration of the main loop. We show that if no cycle was found during the th iteration then after the th iteration every satisfies the following property: , whp.
Let . By Lemma 3.7, the set hits the closest edges of every vertex of , whp. Assume that is indeed such a hitting set, and assume, towards a contradiction, that after the th iteration . Since we have . Now, since and since the graph is not updated in the inner for-each loop, it follows that there is an edge such that . By the definition of , either or , denoted with , satisfies that . By the definition of , we know that .
When , we have . Therefore, . At the first iteration, when , the input graph has not changed yet so . Since no cycle was found by , it follows from Lemma 6.1 that . Therefore, , and is added to after the call to . Hence, , a contradiction.
We now handle the case that . It holds that . Hence, . Since no cycle was found during the th iteration, is added to after the call to , and therefore, , a contradiction.
Now, if BfsSample does not return a cycle we get for that if then , whp.
-
(iii)
Next, we prove that if then is not part of a in . Since and since it holds that , where .
If then since it follows that there was in the first iteration a vertex such that after a call to did not return a cycle. By Lemma 6.1, and no vertex in is part of a in . Therefore, is not part of a in .
If then since it follows that there was in the th iteration a vertex such that after a call to did not return a cycle. As did not return a cycle, by Lemma 3.2 we know that is a tree. It follows from Lemma 3.1 that no vertex in , and in particular , is part of a in . Since during the run of BfsSample we remove only vertices that are not part of a , is not part of a also in .
-
(iv)
Finally, we show that BfsSample runs in time, whp. To do so, we show that the running time of the th iteration of the main for loop is whp .
We start with the first iteration, in which . The size of is . The size of is at most . For every we run . By Lemma 4.3, running NbrBallOrCycle from costs . Adding to costs . Therefore, the total running time for all is at most .
Now we assume that . We proved in (ii) that if and then after the th iteration, if no cycle was found, we have , whp. By Lemma 3.2, for every , the cost of running is . In our case this is at most . As the size of is and the size of is at most , the total running time of the calls to BallOrCycle for every is, whp,
The cost of adding to is . This is at most (whp), which is for all (similarly to the previous calculation).
The cost of removing a vertex is . Thus, for every , the total cost of removing all the vertices in is at most , so the total running time of the th iteration is , whp.
If we are in the scenario that a cycle is returned, then the th iteration stops at an earlier stage, and therefore the running time is also .
Now, since there are at most iterations of the main for loop, the running time of BfsSample is, whp, . (Footnote 10: This is the running time in the case that was a hitting set as described, for every . This happens whp since we assume that and therefore . For every , the probability that is not such a hitting set is at most . Therefore, using a standard union-bound argument, the probability that there exists such that is not a hitting set is at most . For large enough , we get that is a hitting set for every , whp.) ∎
Recall that our goal is to obtain the sparsity property that , for every , so that we can run . However, after running BfsSample the required sparsity property is guaranteed to hold (whp) only if . In the case that we need an additional step which is implemented in HandleReminder, to guarantee that the required sparsity property holds.
Next, we formally describe HandleReminder. HandleReminder (see Algorithm 14) gets a graph and two integers such that where and . We set to and to . Then, a while loop runs as long as . Let be the value of when the th iteration begins, so that is . Let be the total number of iterations and the value of after the th iteration. During the th iteration, we set to , where is the smallest multiple of that is at least (see Figure 7). Then, we call . If SparseOrCycle returns a cycle then HandleReminder returns . If SparseOrCycle does not return a cycle then it might be that some vertices were removed from , and we continue to the next iteration. If the while loop ends without returning a cycle then we return null. Let () be before (after) running HandleReminder. Next, we prove two properties on the value of during the run of HandleReminder.
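At the structural level, HandleReminder is a loop of SparseOrCycle passes in which the search radius is rounded up to a multiple of k (so that Corollary 3.3 can be applied). The sketch below, reusing sparse_or_cycle, treats the per-iteration radius, budget and stopping test as opaque callables, since the actual parameters are set by the analysis; it is an illustration of the loop structure only, not Algorithm 14.

```python
from typing import Callable, Dict, List, Optional

Graph = Dict[int, List[int]]

def handle_reminder(G: Graph, k: int,
                    radius_of: Callable[[int], int],
                    budget_of: Callable[[int], int],
                    done: Callable[[int], bool]) -> Optional[List[int]]:
    """Repeatedly sparsify with SparseOrCycle, rounding each radius up to a
    multiple of k, until either a cycle is found or the stopping test holds."""
    j = 0
    while not done(j):
        radius = k * ((radius_of(j) + k - 1) // k)  # smallest multiple of k >= radius_of(j)
        cycle = sparse_or_cycle(G, budget_of(j), radius)
        if cycle is not None:
            return cycle
        j += 1
    return None
```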
Claim 6.1.
Let and assume . (i) . (ii) , for every .
Proof.
-
(i)
First, since and we assume that , we have . Now we show that , for every . This implies that , as required.
We first show by induction that , for every . The base of the induction follows from the assumption that . We assume that and prove that . During the th iteration of the while loop, we set to . Since this occurs during the th iteration it must be that , as otherwise the th iteration would not have started. Since we have and therefore, . We get that , as required.
We now turn to prove that . As is the smallest multiple of that is at least , we know that . Therefore, . Since , we get that .
-
(ii)
Let . Since , it follows that is a multiple of , so . ∎
We now prove the main lemma regarding HandleReminder.
Lemma 6.3.
Let and assume that , , and that , for every . satisfies the following:
-
(i)
If a cycle is returned then
-
(ii)
If a cycle is not returned then , for every vertex
-
(iii)
If then is not on a in
-
(iv)
runs in time.
Proof.
- (i)
-
(ii)
Next, we show that if a cycle is not returned then , for every vertex . To do so, we show that if HandleReminder does not return a cycle then when the algorithm ends, for every vertex we have .
If we do not enter the while loop and then by our assumption , for every , as required. Now we assume that we enter the loop so . Consider the th iteration of the while loop. During the th iteration is called. If SparseOrCycle does not return a cycle, then it follows from Lemma 3.6 that if ( was not removed) then . Therefore, if no cycle was returned by HandleReminder we get for that after the th iteration, we have , for every .
-
(iii)
Next, we prove that if then is not on a in . If then by the definition of and , was removed while executing HandleReminder. During the run of HandleReminder, a vertex can be removed only by for some . Therefore, by Lemma 3.6, is not part of a in . Since during the run of HandleReminder only vertices that are not part of a are removed, is not part of a also in .
-
(iv)
Finally, we show that HandleReminder runs in time. To do so, we show that the running time of the th iteration of the while loop is . When there are no iterations and the running time is . Now we assume that . During the th iteration, we call . We proved in (ii) that for if SparseOrCycle did not return a cycle during the th iteration and ( was not removed) then after the th iteration . For by our assumption . Therefore, before the th iteration starts, by Corollary 3.2, for every integer . By Claim 6.1(ii), is divisible by , so for every vertex . It then follows from Corollary 3.3 that runs in time. By Claim 6.1(i), so , and the running time of the th iteration is . Now, it follows from Claim 6.1(i) that after at most iterations, the value of cannot decrease anymore (since it cannot become less than , and ) so the while loop ends. As we saw, the running time of each iteration is , hence the total running time of the while loop is . ∎
Now we are ready to prove the correctness and running time of ShortCycleSparse.
Lemma 6.4.
Let such that , where and are integers. Algorithm runs whp in time and either returns a , or determines that .
Proof.
First, returns a cycle only if , , or returns a cycle . If BfsSample or AllVtxBallOrCycle returns a cycle then by Lemma 6.2 or by Lemma 3.3, . If HandleReminder returns a cycle then by Lemma 6.3, since and hence , .
Second, we show that if no cycle was found then . If no cycle was found then it might be that some vertices were removed from the graph. A vertex can be removed either by BfsSample or by HandleReminder. It follows from Lemma 6.2 and Lemma 6.3, that is not part of a when is removed. Since only vertices that are not on a are removed, every that was in the input graph also belongs to the updated graph. After the (possible) removal of vertices, we call with the updated graph. Since we are in the case that no cycle was found, AllVtxBallOrCycle did not return a cycle. It follows from Lemma 3.3 that in the updated graph, and therefore also in the input graph.
Now we turn to analyze the running time of ShortCycleSparse. At the beginning, ShortCycleSparse calls BfsSample. By Lemma 6.2, BfsSample runs in time. Let be the graph after the call to BfsSample. Recall that . By Lemma 6.2, for every we have , whp. If then and . If then and .
Next, ShortCycleSparse checks if . We divide the rest of the proof into the case that and the case that . If then ShortCycleSparse calls AllVtxBallOrCycle. Since , we have after BfsSample. By Corollary 3.1, the running time of AllVtxBallOrCycle is .
We now turn to the case that . In this case it might be that for some vertices . Therefore, we first call , knowing that , and that whp, after BfsSample. By Lemma 6.3, the running time is . Let be the graph after HandleReminder ends. By Lemma 6.3, for every we have , where . Now ShortCycleSparse calls AllVtxBallOrCycle, and using Corollary 3.1 again, we get that the running time of AllVtxBallOrCycle is .
It follows from the above discussion that ShortCycleSparse either returns a or determines that , and the running time is, whp, . (Footnote 11: Throughout the run of ShortCycleSparse, some of the bounds that we get on for vertices and distances are whp, because the sets that we sample are hitting sets whp (see the proof of Lemma 6.2). Therefore, the running times of BfsSample, HandleReminder and AllVtxBallOrCycle are also whp, since they rely on these bounds.) ∎
Since ShortCycleSparse is run by ShortCycle when and when and so , we have . Thus, the running time of ShortCycleSparse is whp , which is at most .
6.2.1 Algorithm SpecialCases
We now present algorithm SpecialCases that handles special cases of and . gets as an input a graph and two integers and .
If , the algorithm simply runs from an arbitrary vertex in time, to check whether contains a cycle. If a cycle is found then its length is at most , and we return a . Otherwise, so for every integer we return that .
If , we check if the graph contains a for , using an algorithm of Alon, Yuster and Zwick [2]. Their algorithm decides whether contains s and s, and finds such cycles if it does, in time. Applying this algorithm with increasing cycle lengths until a length of (the values of are in the worst case ), we can either find the shortest cycle or determine that . The running time is time. (Footnote 12: It is possible to modify the algorithm of Alon et al. [2] to search, in time, for a shortest cycle of length at most instead of exactly or , and then run it only with , to avoid the factor in the running time.) Since , we have and therefore . In addition, since and since , we have . Therefore, the running time is .
By choosing which algorithm to run according to the relation between , and , we get that for every two integers and , algorithm runs whp in time and either returns a , or determines that . This completes the proof of Theorem 6.1.
7 Approximation of the girth
In this section we present two new tradeoffs for girth approximation that follow from Corollary 6.1. In these tradeoffs we use ShortCycle with , so by Corollary 6.1, ShortCycle is a -time, -hybrid algorithm.
7.1 Dense graphs
Kadria et al. [7] presented an -time algorithm that either returns a , or determines that , where are two integers. This is a -hybrid algorithm which, combined with a binary search, was used by [7] to compute for every a cycle such that , in time, if . We use ShortCycle in a similar way and prove:
Theorem 7.1.
Let be an integer, and . It is possible to compute, whp, in time, a cycle such that .
Proof.
For each in the range in increasing order, we call , where , and . When we find the smallest value for which ShortCycle returns a cycle, we stop and return that cycle. Since and we have , and it follows from Corollary 6.1 that ShortCycle either returns a or determines that in time, whp.
We first prove that the algorithm returns a cycle such that . Let be the smallest value for which ShortCycle returned a cycle. This implies that for the algorithm did not return a cycle, and hence . Since and are integers, we have . Also for we have since the girth is at least .
The call to returns a cycle such that . Thus, .
For the running time, there are at most calls to ShortCycle, and each call costs whp (with the values of and that correspond to that call). In each call, which is at most since . In addition, . Thus, the running time of each call is whp, which is since in each call. Therefore, the total running time is, whp, . (Footnote 13: The running time of each call to ShortCycle is whp. Since the number of calls to ShortCycle is at most , using a union-bound argument as in the proof of Lemma 6.2, we get that the total running time of all the calls is also whp.) ∎
7.2 Sparse graphs
We use a similar approach to obtain a tradeoff for girth approximation in sparse graphs. We prove the following theorem.
Theorem 7.2.
Let be an integer, and . It is possible to compute, whp, in time, a cycle such that .
Proof.
For each in the range in increasing order, we call , where , and . When we find the smallest value for which ShortCycle returns a cycle, we stop and return that cycle. Since and we have , and it follows from Corollary 6.1 that ShortCycle either returns a or determines that in time, whp.
We first prove that the algorithm returns a cycle such that . Let be the smallest value for which ShortCycle returned a cycle. As before, this implies that . The call to returns a cycle such that
Now, since , we get that .
For the running time, there are at most calls to ShortCycle, and each call costs whp (with the values of and that correspond to that call). We have . In addition, . Thus, the running time of each call is whp, which is since in each call. Therefore, the total running time is, whp, . (Footnote 14: See the previous footnote.) ∎
By setting we get an -time algorithm that computes a , as opposed to the time algorithm that computes a .
References
- [1] Donald Aingworth, Chandra Chekuri, Piotr Indyk, and Rajeev Motwani. Fast estimation of diameter and shortest paths (without matrix multiplication). SIAM Journal on Computing, 28(4):1167–1181, 1999.
- [2] Noga Alon, Raphael Yuster, and Uri Zwick. Finding and counting given length cycles. Algorithmica, 17(3):209–223, 1997.
- [3] Shiri Chechik, Yang P. Liu, Omer Rotem, and Aaron Sidford. Constant girth approximation for directed graphs in subquadratic time. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2020, Chicago, IL, USA, June 22-26, 2020, pages 1010–1023. ACM, 2020.
- [4] Søren Dahlgaard, Mathias Bæk Tejs Knudsen, and Morten Stöckel. Finding even cycles faster via capped k-walks. In Hamed Hatami, Pierre McKenzie, and Valerie King, editors, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 112–120. ACM, 2017.
- [5] Søren Dahlgaard, Mathias Bæk Tejs Knudsen, and Morten Stöckel. New subquadratic approximation algorithms for the girth. arXiv preprint arXiv:1704.02178, 2017.
- [6] Alon Itai and Michael Rodeh. Finding a minimum circuit in a graph. In Proceedings of the ninth annual ACM symposium on Theory of computing, pages 1–10, 1977.
- [7] Avi Kadria, Liam Roditty, Aaron Sidford, Virginia Vassilevska Williams, and Uri Zwick. Algorithmic trade-offs for girth approximation in undirected graphs. In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1471–1492. SIAM, 2022.
- [8] Andrzej Lingas and Eva-Marta Lundell. Efficient approximation algorithms for shortest cycles in undirected graphs. Information Processing Letters, 109(10):493–498, 2009.
- [9] Liam Roditty and Roei Tov. Approximating the girth. ACM Transactions on Algorithms (TALG), 9(2):1–13, 2013.
- [10] Liam Roditty and Virginia Vassilevska Williams. Fast approximation algorithms for the diameter and radius of sparse graphs. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 515–524, 2013.
- [11] Liam Roditty and Virginia Vassilevska Williams. Subquadratic time approximation algorithms for the girth. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, pages 833–845. SIAM, 2012.
- [12] Virginia Vassilevska Williams and R. Ryan Williams. Subcubic equivalences between path, matrix, and triangle problems. J. ACM, 65(5):27:1–27:38, 2018.
- [13] Virginia Vassilevska Williams, Yinzhan Xu, Zixuan Xu, and Renfei Zhou. New bounds for matrix multiplication: from alpha to omega. In Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 3792–3835. SIAM, 2024.