
Stochastic and Worst-Case Generalized Sorting Revisited

William Kuszmaul, Shyam Narayanan
(MIT)
Abstract

The generalized sorting problem is a restricted version of standard comparison sorting where we wish to sort n elements but only a subset of pairs are allowed to be compared. Formally, there is some known graph G=(V,E) on the n elements v_1,\dots,v_n, and the goal is to determine the true order of the elements using as few comparisons as possible, where all comparisons (v_i,v_j) must be edges in E. We are promised that if the true ordering is x_1<x_2<\cdots<x_n for \{x_i\} an unknown permutation of the vertices \{v_i\}, then (x_i,x_{i+1})\in E for all i: this Hamiltonian path ensures that sorting is actually possible.

In this work, we improve the bounds for generalized sorting on both random graphs and worst-case graphs. For Erdős-Rényi random graphs G(n,p) (with the promised Hamiltonian path added to ensure sorting is possible), we provide an algorithm for generalized sorting with an expected O(n\lg(np)) comparisons, which we prove to be optimal for query complexity. This strongly improves over the best known algorithm of Huang, Kannan, and Khanna (FOCS 2011), which uses \tilde{O}(\min(n\sqrt{np},n/p^{2})) comparisons. For arbitrary graphs G with n vertices and m edges (again with the promised Hamiltonian path), we provide an algorithm for generalized sorting with \tilde{O}(\sqrt{mn}) comparisons. This improves over the best known algorithm of Huang et al., which uses \min(m,\tilde{O}(n^{3/2})) comparisons.

1 Introduction

Whereas the standard comparison-based sorting problem was solved more than half a century ago, several variations on the problem have only recently begun to be understood. One of the most natural of these is the so-called generalized sorting problem.

Generalized sorting: sorting with forbidden comparisons.

In the generalized sorting problem, we wish to sort the vertices of a graph, but only certain pairs of vertices are permitted to be compared. The input to the problem is a graph G=(V,E), where the vertices represent the elements to be sorted, and the edges indicate the pairs of vertices that can be compared; we will use x_1,x_2,\ldots,x_n to denote the true ordering of the vertices V, and we say that x_i\prec x_j if i<j. The goal is to sort the vertices (i.e., discover the true ordering) with as few comparisons as possible. In this setting, comparisons are also referred to as edge queries, meaning that the goal is to achieve the minimum possible query complexity.

In order for an input to the generalized sorting problem to be valid, it is required that the true ordering x_1\prec x_2\prec\cdots\prec x_n of the vertices V appears as a Hamiltonian path in G. This ensures that, should an algorithm query every edge, the algorithm will be able to deduce the true order of the vertices.

A classic special case of the generalized sorting problem is the nuts-and-bolts problem, which considers the case where G is a complete bipartite graph. The vertices on one side of the graph represent “nuts” and vertices on the other side of the graph represent “bolts”. The goal is to compare nuts to bolts in order to match up each nut with its correspondingly sized bolt. Nuts cannot be compared to other nuts and bolts cannot be compared to other bolts, which is why the graph has a complete bipartite structure. The problem was first introduced in 1994 by Alon et al. [1], who gave a deterministic O(n\lg^{4}n)-time algorithm. Subsequent work [2, 13, 6] improved the running time to O(n\lg n), which is asymptotically optimal.

The first nontrivial bounds for the full generalized sorting problem were given by Huang et al. [11], who studied two versions of the problem:

  • Worst-case generalized sorting: In the worst-case version of the problem, G is permitted to have an arbitrary structure. A priori, it is unclear whether it is even possible to achieve o(n^{2}) comparisons in this setting. The paper [11] showed that this is always possible, with a randomized algorithm achieving query complexity \tilde{O}(n^{1.5}) for any graph G, with high probability.

  • Stochastic generalized sorting: In the stochastic version of the problem, every vertex pair (u,v) (that is not part of the Hamiltonian path for the true order) is included independently and randomly with some probability p. The paper [11] gave a randomized algorithm with \tilde{O}(\min\{n/p^{2},n^{3/2}\sqrt{p}\}) query complexity, which evaluates to \tilde{O}(n^{1.4}) for the worst-case choice of p.

Although there have been several subsequent papers on the topic [4, 14, 5] (which we will discuss further shortly), these upper bounds of \tilde{O}(n^{1.5}) and \tilde{O}(n^{1.4}) have remained the state of the art for the worst-case and stochastic generalized sorting problems, respectively. Moreover, no nontrivial lower bounds are known. Thus it is an open question how these problems compare to standard comparison-based sorting.

This paper.

The main result of this paper is an algorithm for stochastic generalized sorting with query complexity O(n\lg(pn)). This means that, in the worst case, the query complexity of stochastic generalized sorting is asymptotically the same as that of classical sorting. Perhaps even more remarkably, when p is small, stochastic generalized sorting is actually easier than its classical counterpart; for example, when p=\operatorname{polylog}(n)/n, the query complexity becomes O(n\lg\lg n).

When p=(\lg n+\omega(1))/n, which is the parameter regime in which the Erdős-Rényi random graph G(n,p) contains at least one Hamiltonian cycle with overwhelming probability, we prove a matching lower bound of \Omega(n\lg(pn)) for the query complexity of any stochastic-generalized-sorting algorithm. Thus our algorithm is optimal.

Given that the optimal running time for stochastic generalized sorting is faster for sparser graphs, a natural question is whether the running time for worst-case generalized sorting can also be improved in the sparse case. It is known that in the very dense case, where there are \binom{n}{2}-q edges, there is an algorithm with query complexity (n+q)\lg n [4]. However, in the case where there are m\ll\binom{n}{2} edges, the state of the art remains the \min(m,\tilde{O}(n^{1.5})) bound of [11], where one can always get an O(m)-query bound simply by querying all edges.

Our second result is a new algorithm for worst-case generalized sorting with query complexity \tilde{O}(\sqrt{nm}), where n is the number of vertices and m is the number of edges. The algorithm is obtained by combining the convex-geometric approach of [11] with the predictions-based approach of [4].

Interestingly, if we instantiate m=p\binom{n}{2} for some p\in(0,1], then the running time becomes \tilde{O}(n^{1.5}\sqrt{p}), which is precisely the previous state of the art for the stochastic generalized sorting problem (in the sparse case of p<1/n^{1/5}) [11].

Other related work.

The original bounds by Huang et al. [11] for both the worst-case and the stochastic generalized sorting problems remained unimproved until now. The difficulty of these problems has led researchers to consider alternative formulations that are more tractable. Banerjee and Richards [4] considered the worst-case generalized sorting problem in the setting where G is very dense, containing \binom{n}{2}-q edges for some parameter q, and gave an algorithm that performs O((q+n)\lg n) comparisons; whether this is optimal remains open, and the best known lower bound is \Omega(q+n\lg n) [5]. Banerjee and Richards [4] also gave an alternative algorithm for the stochastic generalized sorting problem, achieving \tilde{O}(\min\{n^{3/2},pn^{2}\}) comparisons, but this bound is never better than that of [11] for any p. Work by Lu et al. [14] considered a variation on the worst-case generalized sorting problem in which we are also given predicted outcomes for all possible comparisons, and all but w of the predictions are guaranteed to be true. The authors [14] show that, in this case, it is possible to sort with only O(n\lg n+w) comparisons.

The generalized sorting problem is also closely related to the problem of sorting with comparison costs. In this problem, we are given as input \binom{n}{2} nonnegative costs \{c_{u,v}\mid u,v\in V\}, where V is the set of elements that must be sorted and c_{u,v} is the cost of comparing u to v. The goal is to determine the true order x_1\prec x_2\prec\cdots\prec x_n of V at a cost that is as close as possible to the optimal cost

\text{OPT}=c_{x_{1},x_{2}}+c_{x_{2},x_{3}}+\cdots+c_{x_{n-1},x_{n}}.
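For concreteness, OPT is simply the total cost of the n-1 comparisons along the (unknown) sorted order. A minimal sketch, with a hypothetical symmetric cost table:

```python
def opt_cost(true_order, cost):
    """Cost of comparing each consecutive pair along the true order.

    `cost` maps an unordered pair {u, v} to its comparison cost; OPT
    sums the n - 1 costs c_{x_i, x_{i+1}} along the sorted order.
    """
    return sum(cost[frozenset((u, v))]
               for u, v in zip(true_order, true_order[1:]))
```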

If all comparisons have equal costs, the problem reduces to traditional sorting and the optimal competitive ratio is O(\lg n). On the opposite side of the spectrum, if the costs are allowed to be arbitrary, then the best achievable competitive ratio is known to be \Omega(n) [9]. Nonetheless, there are many specific families of cost functions for which better bounds are achievable. Gupta and Kumar [9] consider the setting in which each of the elements v\in V represents a database entry of some size s_v, and the costs c_{u,v} are a function of the sizes s_v and s_u; two especially natural cases are when c_{u,v}=s_v+s_u or c_{u,v}=s_u s_v, and in both cases, a competitive ratio of O(\lg n) can be achieved [9]. (The authors of [9] also describe an algorithm for arbitrary monotone cost functions, but the analysis of the algorithm has been observed by subsequent work to be incorrect in certain cases [12].) Several other families of cost functions have also been considered. Gupta and Kumar [10] consider the setting in which the costs induce a metric space over the elements V, and give an O(\lg^{2}n)-competitive algorithm. In another direction, Angelov et al. [3] consider costs that are drawn independently and uniformly from certain probability distributions (namely, the uniform distribution on [0,1], and the Bernoulli distribution with \Pr[X=1]=p), and give an O(\lg n)-competitive algorithm in each case. The unifying theme across these works is that, by focusing on specific families of cost functions, polylogarithmic competitive ratios can be achieved.

The generalized sorting problem can also be interpreted in this cost-based framework, with queryable edges having cost 1 and un-queryable edges having cost \infty. The presence of very large (infinite) costs significantly changes the nature of the problem, however. The fact that some edges simply cannot be queried makes it easy for an algorithm to get “stuck”, unable to query the edge that it needs to query in order to continue. Whereas sorting with well-behaved (and bounded) cost functions is, at this point, a relatively well understood problem, the problem of sorting with infinite costs on some edges (i.e., generalized sorting) has remained much more open.

1.1 Roadmap

In Section 2, we provide a technical overview of our stochastic generalized sorting algorithm and its analysis. This is our most interesting and technically involved result. In Section 3, we provide a full description and analysis of the stochastic generalized sorting algorithm, which uses O(n\lg(np)) queries on a random graph G(n,p). In Section 4, we prove that this algorithm is essentially optimal on random graphs. In Section 5, we provide an improved worst-case generalized sorting algorithm that uses O(\sqrt{mn}\lg n) queries on an arbitrary graph with n vertices and m edges.

Finally, in Appendix A, we provide pseudocode for the algorithms in Section 3 and Section 5.

2 Technical Overview

In this section, we give an overview of the main technical result of the paper: our algorithm for stochastic generalized sorting, which uses O(n\lg(np)) queries.

The setup.

Suppose that we are given an instance G=(V,E) of the stochastic generalized sorting problem, where |V|=n, and where p is the edge sampling probability. Let x_1,x_2,\ldots,x_n denote the true order of the vertices (and recall that we use x_i\prec x_j to denote that i<j). Note that the edges (x_i,x_{i+1}) are all included deterministically in E, and that each other edge is included independently and randomly with probability p. Further note that the edges in E are undirected, so the edge (u,v) is the same as the edge (v,u).

The basic algorithm design.

The algorithm starts by constructing sets of edges E_1,E_2,\ldots,E_q, where q=\lg(np) and where each E_i consists of roughly a 1/2^{i} fraction of all the edges E. These sets of edges are constructed once at the beginning of the algorithm, and are never modified.

The algorithm then places the vertices into levels L_1\subseteq L_2\subseteq\cdots\subseteq L_{q+c}, where c is some large positive constant. The top levels L_{q+1},\ldots,L_{q+c} automatically contain all of the vertices. To construct the lower levels, we use the following promotion rule to determine whether a given vertex v in a level L_{i+1} should also be placed into level L_i: if there is some vertex u\prec v such that u\in L_{i+c} and (u,v)\in E_i, then we consider v to be blocked by u; otherwise, if there is no such u, then v gets promoted to level L_i.
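In code, the promotion rule for a single vertex amounts to one scan over its potential blockers. A minimal sketch (names here are hypothetical; `precedes(u, v)` stands in for an edge query, and maintaining `levels` and `E_sets` is assumed to happen elsewhere):

```python
def is_promoted(v, i, levels, E_sets, precedes):
    """Apply the promotion rule to vertex v currently in level L_{i+1}.

    v is promoted into L_i unless some u in L_{i+c} with (u, v) in E_i
    and u before v in the true order blocks it.  `levels[j]` is the set
    L_j, and `E_sets[i]` is the edge set E_i (frozenset pairs).
    """
    c = 3  # stands in for the large constant c from the algorithm
    for u in levels[i + c]:
        if u != v and frozenset((u, v)) in E_sets[i] and precedes(u, v):
            return False  # v is blocked by u
    return True
```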

As we shall describe shortly, our algorithm discovers the true order of the vertices x_1,x_2,\ldots one vertex at a time. At any given moment, we will use x_\ell to denote the most recently discovered vertex in the true order, and we will say that each remaining vertex x_i, i>\ell, has rank r(x_i)=i-\ell.

Before we describe how to discover the true order of the vertices, let us first describe how the levels are updated as we go. Intuitively, the goal of this construction is to make it so that, at any given moment, each level L_i has size roughly s_i=2^{i}/p, and so that the majority of the elements in L_i have rank O(s_i). We refer to s_i=2^{i}/p as the target size for level L_i.

Whenever we discover a new vertex x_\ell (i.e., we discern which vertex is the \ell-th in the true order), we perform incremental updates to the levels as follows: we remove x_\ell from all of the levels, and then for each vertex v that was previously blocked from promotion by x_\ell, we apply the promotion rule as many times as necessary to determine what new level v should be promoted to. Note that, when we apply the promotion rule to a vertex v in a level L_{i+1}, we apply it based on the current set L_{i+c}, and even if the set L_{i+c} changes in the future, we do not consequently “unpromote” the vertex.

In addition to the incremental updates, we also periodically rebuild entire levels from scratch. In particular, we rebuild the levels L_i,L_{i-1},\ldots,L_1 whenever the number \ell of vertices that we have discovered is a multiple of s_i/32. (Recall that s_i=2^{i}/p is our target size for each level L_i.) To rebuild a level L_i, we take the vertices currently in L_{i+1} and apply the promotion rule to each of them (based on the current values of the sets L_{i+1} and L_{i+c}).
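Under this schedule, which levels get rebuilt after each discovery is determined purely by \ell, p, and the target sizes s_i = 2^i/p: since s_i/32 divides s_j/32 for i < j, triggering level j triggers everything below it. A small sketch of the trigger logic (hypothetical helper, constants as in the text):

```python
def levels_to_rebuild(ell, p, q):
    """Indices of the levels rebuilt after discovering the ell-th vertex.

    Level L_i has target size s_i = 2**i / p and is rebuilt (together
    with L_{i-1}, ..., L_1) whenever ell is a multiple of s_i / 32.
    We return the levels in the top-down order they are rebuilt.
    """
    top = 0
    for i in range(1, q + 1):
        s_i = 2 ** i / p
        period = max(1, round(s_i / 32))
        if ell % period == 0:
            top = i  # every smaller period also divides ell
    return list(range(top, 0, -1))
```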

We remark that, whereas incremental updates can only promote vertices (and cannot unpromote them), the process of rebuilding the levels L_i,L_{i-1},\ldots,L_1, one after another, can end up moving vertices in both directions. Moreover, there are interesting chains of cause and effect. For example, if a vertex v gets unpromoted from level L_i (so now it no longer appears in L_i), then that may cause some vertex v^{\prime}\in L_{i-c+1} to no longer be blocked, meaning that v^{\prime} now gets promoted to L_{i-c}; the addition of v^{\prime} to L_{i-c} may block some v^{\prime\prime}\in L_{i-2c+1} from being able to reside in level L_{i-2c}, causing v^{\prime\prime} to get unpromoted from L_{i-2c}, etc.

Having described how we maintain the levels L_1,L_2,\ldots, let us describe how we use the levels to discover the next vertex x_{\ell+1}. We first construct a candidate set C consisting of the vertices v\in L_1 such that (x_\ell,v)\in E. Then, for each v\in C, we determine whether v=x_{\ell+1} by going through the levels L_i in order of i=1,2,\ldots,q+c, and querying all of the edges between v and the level L_i; we can remove v from the candidate set C if we ever find an edge (u,v), where u is in some L_i and u\prec v. In particular, all vertices u\prec x_{\ell+1} have already been discovered and removed from the levels L_i, so the existence of such a vertex u\prec v implies that v\neq x_{\ell+1}.
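Putting the candidate-set logic together, discovering x_{\ell+1} can be sketched as follows (a simplified illustration with hypothetical helpers: `has_edge` is free knowledge since the graph is known, `precedes` stands in for a comparison query, and the levels are assumed to contain exactly the undiscovered vertices):

```python
def next_vertex(x_ell, levels, q, c, has_edge, precedes):
    """Identify x_{ell+1} from the candidate set (simplified sketch).

    Candidates are the vertices of L_1 adjacent to x_ell.  Each
    candidate v is scanned against L_1, L_2, ... and eliminated as soon
    as an edge (u, v) with u before v is found; since every candidate
    other than x_{ell+1} has an undiscovered predecessor in some level,
    the unique survivor is x_{ell+1}.
    """
    candidates = [v for v in levels[1] if has_edge(x_ell, v)]
    survivors = []
    for v in candidates:
        eliminated = False
        for i in range(1, q + c + 1):
            for u in levels[i]:
                if u != v and has_edge(u, v) and precedes(u, v):
                    eliminated = True
                    break
            if eliminated:
                break
        if not eliminated:
            survivors.append(v)
    assert len(survivors) == 1
    return survivors[0]
```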

In the worst case, the process for eliminating a vertex v from the candidate set C might have to look at all of the edges incident to v. One of the remarkable features of our algorithm is that, because of how the levels are structured, the expected number of queries needed to identify x_{\ell+1} ends up being only O(\lg(pn)).

The structure of the analysis.

In the rest of the section, we discuss how to bound the expected number of edge queries made by the algorithm. For now, we will ignore issues surrounding whether events are independent, and we will think of events that are “morally independent” as being fully independent. One of the main challenges in the full analysis is to define the events being analyzed in such a way that these interdependence issues can be overcome.

Our analysis proceeds in three parts. First, for each vertex v and each level L_i, we analyze the probability that v\in L_i at any given moment. Next, we use this to bound the total number of queries spent maintaining the levels L_i. Finally, we bound the number of queries spent eliminating candidates from candidate sets (i.e., actually identifying each x_\ell).

Understanding which vertices are in which levels.

Consider a vertex v with rank r(v)\approx s_j. We will argue that v is likely to appear in the levels L_j,L_{j+1},\ldots and is unlikely to appear in the levels L_{j-c},L_{j-c-1},\ldots,L_1.

Begin by considering \Pr[v\in L_{j+i}] for some i\geq 0. If v\not\in L_{j+i}, then there must be some vertex u with rank 1\leq r(u)\leq r(v)\approx s_j such that the edge (u,v) is in one of E_{j+i},E_{j+i+1},\ldots. However, there are only O(s_j)=O(2^{j}/p) total vertices u with ranks 1\leq r(u)\leq r(v), and only a p-fraction of them have an edge to v. Moreover, only an O(1/2^{i+j}) fraction of those edges are expected to appear in one of E_{j+i},E_{j+i+1},\ldots. So the probability of any such edge appearing is at most

O((2^{j}/p)\cdot p/2^{i+j})=O(1/2^{i}).

In other words, if v has rank r(v)\approx s_j, then

\Pr[v\in L_{j+i}]=1-O(1/2^{i}) (1)

for each level L_{j+i}.

An important consequence of (1) is that, for any given L_i, the vertices v with ranks less than O(s_i) (say, less than s_i/16) each have probability at least \Omega(1) of being in L_i. We refer to the set of such vertices in L_i as the anchor set A_i. At any given moment, we have with reasonably high probability that |A_i|\geq s_i/128.

The anchor sets A_i play a critical role in ensuring that vertices with large ranks do not make it into low levels. Consider again a vertex v with rank r(v)\approx s_j and let us bound the probability that v\in L_{j-c-i} for some i\geq 0. Assume for the moment that v\in L_{j-c-i+1}. The only way that v can be promoted to L_{j-c-i} is if there are no edges in E_{j-c-i} that connect v to an element of the anchor set A_{j-i} (in particular, if there is such an edge, then v will be blocked from promotion). However, the anchor set A_{j-i} has size at least s_{j-i}/128=\frac{2^{j-i}}{128p}, and thus the expected number of edges (in all of E) from v to A_{j-i} is roughly 2^{j-i}/128. A 1/2^{j-c-i} fraction of these edges are expected to be in E_{j-c-i}. Thus, the expected number of edges that block v from promotion is roughly

\frac{2^{j-i}/128}{2^{j-c-i}}=2^{c}/128.

Since these edge-blockages are (mostly) independent, the probability that no such edges block v from promotion ends up being at most

e^{-2^{c}/128}.

This analysis considers just the promotion of v from level L_{j-c-i+1} to level L_{j-c-i}. Applying the same analysis repeatedly, we get that for any vertex v with rank r(v)\approx s_j, and any i\geq 0,

\Pr[v\in L_{j-c-i}]\leq e^{-2^{c}i/128}. (2)

Or, to put it another way, for each level L_i, and for each vertex v satisfying r(v)\approx s_{i+k+c} for some k\geq 0, we have that

\Pr[v\in L_i]\leq e^{-2^{c}k/128}. (3)

The derivation of (3) reveals an important insight in the algorithm’s design. One of the interesting features of the promotion rule is that, when deciding whether to promote a vertex v from a given level L_{i+1} into L_i, we use the edges in E_i to compare v not to its companions in L_{i+1} (as might be tempting to do) but instead to the elements of the larger level L_{i+c}. For vertices v with small ranks, this distinction has almost no effect on whether v makes it into level L_i (note, in particular, that the proof of (1) is unaffected by the choice of c!); but for vertices v with large ranks (i.e., ranks at least s_{i+c}), this distinction makes the filtration process between levels much more effective (i.e., the larger that c is, the stronger that (3) becomes). It is important that c not be too large, however, since otherwise the application of (3) to a given L_i would only tell us information about vertices v with very large ranks. By setting c to be a large positive constant, we get the best of all worlds.

The derivation of (3) also reveals the reason why we must perform periodic rebuilds of levels. We rely on the anchor set A_{i+c} to ensure that vertices v with large ranks stay out of low levels L_i, but the anchor set A_{i+c} changes dynamically. Since A_{i+c} takes many different values over time, a given high-rank vertex v might at some point get lucky and encounter a state of A_{i+c} that allows v to get promoted to level L_i (despite v’s large rank!). The main purpose of performing regular rebuilds is to ensure that this doesn’t happen, and in particular, that the vertices in level L_i were all promoted into L_i based on a relatively recent version of the anchor set A_{i+c}.

Combined, (1) and (3) give us a clear picture of which levels we should expect each vertex to be in. Now we turn our attention to analyzing the query complexity of the algorithm.

Bounding the cost of level updates.

The next step of the analysis is to bound the number of queries spent rebuilding levels and performing incremental updates to them.

We can deduce from (3) that, at any given moment, \mathbb{E}[|L_i|]=O(s_{i+c})=O(s_i). For our discussion here we will simply assume that |L_i|=O(s_i) always holds.

Now let us consider the number of queries needed to perform a promotion test (i.e., to apply the promotion rule) on a given vertex v in a given level L_i. The promotion rule requires us to examine every edge in E_i that goes from v to any vertex in L_{i+c}. The expected number of such edges is roughly

\frac{p}{2^{i}}|L_{i+c}|=O\left(\frac{p}{2^{i}}s_{i}\right)=O(1). (4)

Thus we can treat each promotion test as taking O(1) queries.

We can now easily bound the total work spent rebuilding levels from scratch. Each level L_i is rebuilt O(n/s_i) times, and each rebuild requires us to apply the promotion rule to O(s_i) different vertices. It follows that each rebuild takes O(s_i) queries, and that the total number of queries spent performing rebuilds on L_i is O(n). Summing over the O(\lg(pn)) levels results in a total of O(n\lg(pn)) queries.
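The accounting here is simple enough to check numerically: each level contributes about (n/s_i) \cdot s_i = n promotion tests across all of its rebuilds, for roughly n\lg(np) in total. A quick sketch (constants suppressed; parameters hypothetical):

```python
import math

def rebuild_query_budget(n, p):
    """Count promotion tests across all rebuilds, constants suppressed.

    Level L_i is rebuilt about n / s_i times, and each rebuild applies
    an O(1)-query promotion test to about s_i vertices, so every level
    contributes about n tests; over q = lg(np) levels this is n*lg(np).
    """
    q = int(math.log2(n * p))
    total = 0.0
    for i in range(1, q + 1):
        s_i = 2 ** i / p
        total += (n / s_i) * s_i  # rebuild count times cost per rebuild
    return total, q
```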

What about the query complexity of the incremental updates to the levels? Recall that whenever we discover a new x_\ell, we must revisit every vertex v that was formerly blocked by x_\ell. For each such v, we must check whether v can be promoted, and if so then we must repeatedly promote v until it reaches a level where it is again blocked.

Recall that each promotion test takes O(1) expected queries. To bound the total number of promotion tests due to incremental updates, we break the promotion tests into two categories: the failed promotion tests (i.e., the promotion tests that do not result in a promotion) and the successful promotion tests (i.e., the promotion tests that do result in a promotion).

The number of failed promotion tests is easy to bound. It is straightforward to show that the expected number of distinct vertices v that we perform a promotion test on (each time that a new x_\ell is discovered) is O(\lg(pn)) (roughly speaking, there will be one such vertex per level). Each such v can contribute at most one failed promotion test. So summing over the x_\ell’s, we find that the total number of failed promotion tests is O(n\lg(pn)).

To bound the number of successful promotion tests, on the other hand, we can simply bound the number of promotions that occur (due to incremental updates). Each time that some vertex v is promoted from a level L_{i+1} to a level L_i, the size of L_i increases by one (and the size of L_{i+1} is unchanged). We know that the size of each L_i stays below O(s_i) at all times, however, so there can be at most O(s_i) such promotions between every two consecutive rebuilds of the level. Since L_i is rebuilt O(n/s_i) times, the total number of promotions into L_i is O(n). Summing over the levels, we get O(n\lg(pn)), as desired.

Bounding the cost of eliminating candidates from candidate sets.

The final and most interesting step in the analysis is to bound the total number of queries spent eliminating candidates from candidate sets. Suppose we have already discovered vertices x_1,x_2,\ldots,x_\ell, and we wish to discover x_{\ell+1}. How many queries does it take to identify which element of C is x_{\ell+1}?

For simplicity, we will assume here that all of the candidates v\in C (besides x_{\ell+1}) have ranks r(v)>1/p (this assumption can easily be removed with a bit of extra casework).

We can start by bounding the probability that a given v is in C. In order for v to be in C we need both that v\in L_1 and that (x_\ell,v) is an edge. With a bit of manipulation, one can deduce from (3) that if a vertex v has rank r(v)\geq 1/p, then

\Pr[v\in L_{1}]\leq O\left(\frac{1}{(r(v)\cdot p)^{10}}\right).

If v\in L_1 (but v\neq x_{\ell+1}), then \Pr[(x_\ell,v)\text{ is an edge}]=p. Thus

\Pr[v\in C]\leq O\left(\frac{p}{(r(v)\cdot p)^{10}}\right).

If v\in C, then we must look at all of the edges from v to the levels L_1,L_2,\ldots until we find an edge (u,v) with u\prec v. Since each level L_i has size O(s_i), the expected number of edges from v to a given L_i is O(ps_i)=O(2^{i}). If L_J is the highest level that we look at while eliminating v from C, then the total number of queries incurred will be roughly O(2^{J}).

So how many levels must we look at in order to eliminate v from C? Let u be the predecessor of v in the true order, and let L_K be the lowest level that contains u. Since u\prec v and since (u,v) is deterministically an edge, we are guaranteed to stop at a level J\leq K. We can bound K (and thus J) by applying (1); this gives us the identity

\Pr[J\geq\lg(p\cdot r(v))+i]\leq O\left(1/2^{i}\right) (5)

for every i\geq 0.

To summarize, if v\in C (and v\neq x_{\ell+1}), then the expected number of queries to remove v from C is roughly \mathbb{E}[2^{J}]; and for any i\geq 0, the probability that 2^{J}\geq pr(v)2^{i} is O(1/2^{i}). There are O(\lg pn) possible values for i, each of which contributes O(p\cdot r(v)) to \mathbb{E}[2^{J}]. Thus the expected number of queries to remove v from C is O(p\cdot r(v)\lg(pn)).

To bound the cost of eliminating all candidates from C, we must sum over the ranks r\geq p^{-1} to get

\sum_{v:r(v)\geq p^{-1}}\Pr[v\in C]\cdot\mathbb{E}[2^{J}]=O\left(\sum_{r\geq p^{-1}}\frac{p}{(rp)^{9}}\lg(pn)\right)=O\left(\lg(pn)\right).
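The convergence of this sum can be sanity-checked numerically: substituting s = rp, the tail \sum_{r\geq p^{-1}} p/(rp)^{9} behaves like the integral of s^{-9} from 1 to infinity, which equals 1/8, so the sum is O(1) independent of p. A quick check (parameters hypothetical):

```python
def tail_sum(p, n):
    """Numerically evaluate sum over r = 1/p, ..., n of p / (r*p)**9.

    Substituting s = r*p turns the sum into a Riemann sum for the
    integral of s**(-9) from 1 to infinity (which is 1/8), so the tail
    is O(1) regardless of p and the per-step cost stays O(lg(pn)).
    """
    r0 = int(1 / p)
    return sum(p / (r * p) ** 9 for r in range(r0, n + 1))
```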

This represents the cost to identify a given x_{\ell+1}. Summing over all \ell, the total contribution of these costs to the query complexity of the algorithm is O(n\lg(pn)).

3 Stochastic Generalized Sorting

3.1 Algorithm Design

Conventions and Notation.

Let G=(V,E) be the input graph and let x_1\prec x_2\prec\cdots\prec x_n denote the true order of the vertices.

There are two types of edges in the graph: those that were included randomly with probability p, and those that are part of the true path connecting x_1,x_2,\ldots,x_n. We say that a vertex pair (u,v) is stochastic if it is not an edge of the true-ordering path; otherwise the vertex pair is deterministic. Note that a stochastic vertex pair need not have an edge connecting it, but a deterministic vertex pair does.

Our algorithm will find x_1,x_2,\dots one at a time. Once we have found x_i, we say that x_i has been discovered. As a convention, we will use x_\ell to denote the most recently discovered vertex. When we refer to the rank of a vertex, we mean its rank among the not-yet-discovered vertices \{x_{\ell+1},\ldots,x_n\}. For a vertex v, we let r(v) be the rank of v, so v=x_{\ell+r(v)}.

To simplify our discussion in this section, we will consider only the task of discovering the vertices x_1,\ldots,x_{n/2}. This allows us to assume that the number of remaining vertices is always \Theta(n). Moreover, by a symmetric argument, we can recover x_n,x_{n-1},\ldots,x_{n/2+1} and therefore recover the complete order, so this assumption does not affect the correctness of the overall algorithm.

Constructing edge sets E_1,E_2,\ldots.

The first step of our algorithm is to use the following proposition to construct q=O(\lg(pn)) sets of edges E_1,E_2,\ldots,E_q (these sets are built once and then never modified). When discussing these sets of edges, we will use E_i(u,v) to denote the indicator random variable for whether (u,v)\in E_i, and we will use E(u,v) to denote the tuple \langle E_1(u,v),E_2(u,v),\ldots,E_q(u,v)\rangle. The proof of Proposition 1 is deferred to Subsection 3.3.

Proposition 1.

Suppose we are given EE but we are not told which vertex pairs are stochastic/deterministic. Suppose, on the other hand, that Alice is told which vertex pairs are stochastic/deterministic, but is not given EE. Based on EE alone, we can construct sets of edges E1,E2,,EqE_{1},E_{2},\ldots,E_{q} such that E=i=1qEiE=\bigcup_{i=1}^{q}E_{i}, and such that, from Alice’s perspective, the following properties hold:

  • For each stochastic vertex pair (u,v)(u,v), Pr[Ei(u,v)]=αp2i\Pr[E_{i}(u,v)]=\frac{\alpha\cdot p}{2^{i}}, where 1α21\leq\alpha\leq 2 is some constant that depends only on pp and nn.

  • For each deterministic vertex pair (u,v)(u,v), Pr[Ei(u,v)]=α2i\Pr[E_{i}(u,v)]=\frac{\alpha}{2^{i}}.

  • The random variables {Ei(u,v)i[q],(u,v) is stochastic}\{E_{i}(u,v)\mid i\in[q],(u,v)\text{ is stochastic}\} are mutually independent.

  • The random variables {E(u,v)}u,vV\{E(u,v)\}_{u,v\in V} are mutually independent.

We remark that the sets EiE_{i} do not partition EE, as the sets EiE_{i} are not necessarily disjoint. The sets EiE_{i} are constructed independently across edges, but for each edge (u,v)(u,v) in EE we anti-correlate the events {Ei(u,v)}i=1q\{E_{i}(u,v)\}_{i=1}^{q} so that the events are independent if we do not condition on the set EE. We formally show how to construct the sets EiE_{i} in the proof of the above proposition, in Subsection 3.3.

One thing that is subtle about the above proposition is that the randomization is coming from two sources, the random generation of EE and the random construction of E1,E2,,EqE_{1},E_{2},\ldots,E_{q} based on EE. The probabilities in the bullet points depend on both sources of randomness simultaneously, and they do not condition on the final edge set EE (hence, the use of Alice in the proposition statement).
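To make the construction concrete, the following is a minimal Python sketch of one natural way to realize such a decomposition via rejection sampling: each edge draws a tuple conditioned on being non-zero, and non-edges receive the zero tuple. All names here are illustrative, and the formal construction appears in Subsection 3.3.

```python
import random

def split_edges(E, p, q, alpha):
    """Sketch of the decomposition behind Proposition 1. Each edge of E draws
    a tuple <X_1, ..., X_q>, where X_i ~ Bernoulli(alpha * p / 2^i)
    independently, rejection-sampled so that at least one X_i is 1; edge e is
    then placed in E_i exactly when X_i = 1. Vertex pairs outside E get the
    all-zero tuple (and so appear in no E_i), guaranteeing union(E_i) = E."""
    E_sets = {i: set() for i in range(1, q + 1)}
    for e in E:
        while True:
            bits = [random.random() < alpha * p / 2 ** i for i in range(1, q + 1)]
            if any(bits):  # condition on the tuple being non-zero
                break
        for i, b in enumerate(bits, start=1):
            if b:
                E_sets[i].add(e)
    return E_sets
```

Rejection sampling draws from exactly the conditional distribution used in the proposition; for illustration we pass `alpha` directly rather than solving Equation (7) for it.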

Assigning the vertices to levels.

A central part of the algorithm design is to use the edge sets E1,E2,,EqE_{1},E_{2},\ldots,E_{q} in order to dynamically assign the vertices to levels L1L2Lq+cL_{1}\subseteq L_{2}\subseteq\cdots\subseteq L_{q+c}, where q=O(lg(pn))q=O(\lg(pn)) is as defined above and cc is some large but fixed constant. Before we can describe how the algorithm discovers the true order of the vertices, we must first describe how the levels {Li}\{L_{i}\} are constructed and maintained over time. In particular, the algorithm will make use of the structure of the levels in order to efficiently discover new vertices in the true order.

Intuitively, the levels are maintained so that, at any given moment, each level LiL_{i} has size roughly si=2i/ps_{i}=2^{i}/p, and so that the majority of the elements in LiL_{i} have ranks O(si)O(s_{i}). We refer to si=2i/ps_{i}=2^{i}/p as the target size for level LiL_{i}.

Initial construction of levels.

We perform the initial construction of the levels as follows. The final levels Lq+1,,Lq+cL_{q+1},\ldots,L_{q+c} all automatically contain all of the vertices. We then construct each remaining level LiL_{i}, proceeding downward from i=qi=q to i=1i=1, as

Li=Li+1{vLi+1(u,v)Ei for some uv such that uLi+c}.L_{i}=L_{i+1}\setminus\{v\in L_{i+1}\mid(u,v)\in E_{i}\text{ for some }u\prec v\text{ such that }u\in L_{i+c}\}. (6)

The vertices vv from Li+1L_{i+1} that are not included in LiL_{i} are said to be blocked by the edge (u,v)(u,v) defined in (6).

One point worth highlighting is that, when we are deciding whether a vertex vv should move from Li+1L_{i+1} to LiL_{i}, we query not just the edges in EiE_{i} that connect vv to other vertices in Li+1L_{i+1}, but also the edges that connect vv to other vertices in the larger set Li+cL_{i+c}. This may at first seem like a minor distinction, but as we shall later see, it is critical to ensuring that vertices with large ranks do not make it to levels LiL_{i} for small ii.
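The initial top-down construction can be sketched as follows. This is a minimal Python sketch with illustrative names (`E_sets` for the edge sets, `precedes` for one comparison query), not part of the formal algorithm description; in the real algorithm, each call to `precedes` costs one query.

```python
def build_levels(vertices, E_sets, precedes, q, c):
    """Top-down construction of L_1 subset ... subset L_{q+c} following Eq. (6).
    E_sets[i] holds the edges of E_i as frozensets; precedes(u, v) returns
    True iff u comes before v in the true order (one comparison query)."""
    # The top c levels automatically contain every vertex.
    levels = {i: set(vertices) for i in range(q + 1, q + c + 1)}
    for i in range(q, 0, -1):
        blocked = set()
        for v in levels[i + 1]:
            # v is blocked from L_i if some u in L_{i+c} has (u, v) in E_i with u before v
            if any(u != v and frozenset((u, v)) in E_sets[i] and precedes(u, v)
                   for u in levels[i + c]):
                blocked.add(v)
        levels[i] = levels[i + 1] - blocked
    return levels
```

Note that the loop runs downward so that, when LiL_{i} is built, both Li+1L_{i+1} and Li+cL_{i+c} are already available.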

Performing incremental updates to levels.

When we discover a given xx_{\ell}, we perform incremental updates to the levels as follows. First, we remove xx_{\ell} from all of the levels. Next we consider vertices vv that were formerly blocked by an edge of the form (x,v)(x_{\ell},v) in some level LiL_{i}: since xx_{\ell} is no longer in the levels, (x,v)(x_{\ell},v) no longer blocks vv. Since xx_{\ell} has been removed from the levels, we must give vv the opportunity to advance to lower levels. That is, if there is no edge (u,v)Ei1(u,v)\in E_{i-1} with uLi+c1u\in L_{i+c-1} such that uvu\prec v, then we advance vv to level Li1L_{i-1}; if, additionally, there is no such edge (u,v)Ei2(u,v)\in E_{i-2} with uLi+c2u\in L_{i+c-2}, then we advance vv to level Li2L_{i-2}, and so on. The vertex continues to advance until it is either in the bottom level L1L_{1} or it is again blocked by an edge.
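One incremental update can be sketched as follows; names are illustrative (`levels` and `E_sets` as in the construction above, `precedes` one comparison query), and this is a sketch of the descent step only, not of the full bookkeeping.

```python
def advance(v, i, levels, E_sets, precedes, c):
    """One incremental update: v, currently in level L_i and just unblocked
    there, tries to descend. To enter L_{i-1}, v must have no blocking edge
    (u, v) in E_{i-1} with u in L_{i+c-1} preceding v in the true order."""
    while i > 1:
        j = i - 1
        if any(u != v and frozenset((u, v)) in E_sets.get(j, set()) and precedes(u, v)
               for u in levels[j + c]):
            break  # v is blocked from L_j by some u in L_{j+c}; stop here
        i = j
        levels[i].add(v)  # v descends one level
    return i
```

The descent stops at the first blocking edge or at the bottom level, matching the rule above.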

Rebuilding levels periodically.

In addition to the incremental updates, we also periodically reconstruct each level in its entirety. Recall that we use si=2i/ps_{i}=2^{i}/p to denote the target size for each level LiL_{i}. For each ii, every time that the index \ell of the most recently discovered xx_{\ell} is a multiple of si/32s_{i}/32, we rebuild the levels Li,Li1,,L1L_{i},L_{i-1},\ldots,L_{1}, one after another, from scratch according to (6). That is, we rebuild level LiL_{i} based on the current values of Li+1L_{i+1} and Li+cL_{i+c}, we then rebuild Li1L_{i-1} based on the new value of LiL_{i} and the current value of Li+c1L_{i+c-1}, etc.

Note that, when rebuilding a level LjL_{j} from scratch, we also redetermine from scratch which of the vertices in Lj+1L_{j+1} are blocked from entering LjL_{j}. In particular, a vertex vv may have previously been blocked from entering LjL_{j} by an edge (u,v)(u,v) for some uLj+cu\in L_{j+c}; but if Lj+cL_{j+c} has since been rebuilt, then uu might no longer be in Lj+cL_{j+c}, and so vv may no longer be blocked from entering level LjL_{j}.

Finding the first vertex.

We are now ready to describe how the algorithm discovers the true order x1,x2,x_{1},x_{2},\ldots of the vertices. We discover the first vertex x1x_{1} in a different way from how we discover the other vertices (and, in particular, we do not make use of either the edge sets EiE_{i} or the levels LiL_{i}).

The algorithm for finding x1x_{1} works as follows. We always keep track of a single vertex v0v_{0} which will be the earliest vertex found so far (in the true ordering) and a set S={v1,,vr}S=\{v_{1},\dots,v_{r}\} of vertices that we know come after v0v_{0} in the true ordering. We begin by picking an arbitrary edge (u,v)(u,v) and querying the edge to find which vertex precedes the other. If uu precedes vv, we set v0=uv_{0}=u and v1=vv_{1}=v; else, we set v0=vv_{0}=v and v1=uv_{1}=u. For each subsequent step, if we currently have v0v_{0} and S={v1,,vr},S=\{v_{1},\dots,v_{r}\}, then we do the following. If there exists any edge connecting v0v_{0} and some u{v1,,vr},u\not\in\{v_{1},\dots,v_{r}\}, we query the edge (u,v0).(u,v_{0}). If v0v_{0} precedes uu, then we add vr+1:=uv_{r+1}:=u to the set SS. Otherwise, if uu precedes v0v_{0}, we add vr+1=v0v_{r+1}=v_{0} to the set SS, and then replace v0v_{0} with uu. Finally, if there is no edge connecting v0v_{0} with some u{v1,,vr},u\not\in\{v_{1},\dots,v_{r}\}, we conclude that x1=v0x_{1}=v_{0}.
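The x1x_{1}-search above can be sketched directly; this is an illustrative Python sketch (names `adj`, `precedes` are ours), where `adj` maps each vertex to its neighbor set and each call to `precedes` counts as one comparison query.

```python
def find_first(adj, precedes):
    """Find the minimum vertex x_1. adj: vertex -> set of neighbors;
    precedes(u, v): one query, True iff u comes first in the true order.
    Returns (x_1, number_of_queries); queries never exceed n, since every
    query adds exactly one new vertex to S."""
    queries = 0
    # Query an arbitrary starting edge.
    u = next(iter(adj))
    v = next(iter(adj[u]))
    queries += 1
    if precedes(u, v):
        v0, S = u, {v}
    else:
        v0, S = v, {u}
    while True:
        # Look for any edge from v0 to a vertex outside S.
        cand = next((w for w in adj[v0] if w not in S), None)
        if cand is None:
            return v0, queries  # no such edge: v0 must be x_1
        queries += 1
        if precedes(v0, cand):
            S.add(cand)
        else:
            S.add(v0)
            v0 = cand
```

The query bound matches Proposition 2: each query grows SS by one, so at most nn queries are made.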

Finding subsequent vertices.

Now suppose we have most recently discovered some vertex xx_{\ell} and we wish to discover x+1x_{\ell+1}. This is done using the levels {Li}\{L_{i}\} and the edge sets {Ei}\{E_{i}\} as follows.

We start by constructing a candidate set CC consisting of all of the vertices in L1L_{1} that have an edge to xx_{\ell}. Next, we perform the following potentially laborious process to remove vertices from the candidate set until we have gotten down to just one candidate. We go through the levels L1,L2,L3,L_{1},L_{2},L_{3},\ldots, and for each level LiL_{i} we query all of the edges between vertices in LiL_{i} and the remaining vertices in CC. We remove a vertex vv from the candidate set CC if we discover an edge (u,v)(u,v) with uvu\prec v. Once we have narrowed down the candidate set to a single vertex, we conclude that the vertex is x+1x_{\ell+1}.

By repeating this process over and over, we can discover all of the vertices x1,x2,,xn/2x_{1},x_{2},\ldots,x_{n/2}.
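One discovery step can be sketched as follows; this is an illustrative Python sketch (names are ours, not from the paper), where `levels` contains only undiscovered vertices and each `precedes` call is one comparison query.

```python
def find_next(x_ell, levels, adj, precedes, q, c):
    """One discovery step: start from the L_1-neighbors of x_ell and eliminate
    candidates level by level until one remains. A candidate v is eliminated
    once we find an undiscovered u (in some level) adjacent to v with u
    preceding v; the true successor x_{ell+1} can never be eliminated."""
    C = {v for v in levels[1] if v in adj[x_ell]}
    for i in range(1, q + c + 1):
        if len(C) <= 1:
            break
        for v in list(C):
            if any(u != v and u in adj[v] and precedes(u, v) for u in levels[i]):
                C.discard(v)
    return next(iter(C))
```

In the sketch, as in the algorithm, every wrong candidate has an undiscovered immediate predecessor sitting in the levels and adjacent to it, so the elimination terminates with exactly x+1x_{\ell+1}.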

3.2 Algorithm Correctness

In this subsection, we prove StochasticSort always outputs the correct order (with probability 11). First, we show that the algorithm finds x1x_{1} correctly.

Proposition 2.

The algorithm for finding x1x_{1} always succeeds. Moreover, it deterministically uses at most nn queries.

Proof.

First, we note that at any step, if we have vertex v0v_{0} and set S={v1,,vr}S=\{v_{1},\dots,v_{r}\}, then v0viv_{0}\prec v_{i} for all viSv_{i}\in S. This is obviously true at the first step after we compare the first two vertices (u,v)(u,v), and it continues to be true later on by the following inductive argument. If we find some edge (u,v0)(u,v_{0}) with uSu\not\in S, then if uv0u\prec v_{0}, then uv0viu\prec v_{0}\prec v_{i} for all viSv_{i}\in S, so adding v0v_{0} to SS and replacing v0v_{0} with uu means the new v0v_{0} still precedes all vertices in SS. On the other hand, if v0u,v_{0}\prec u, then since we just added uu to SS, we still have that v0v_{0} precedes everything in SS. Thus we always have v0viv_{0}\prec v_{i} for all viSv_{i}\in S.

Next we show that, whenever our algorithm for finding x1x_{1} terminates, it is guaranteed to have successfully found x1x_{1}. That is, we show that if v0v_{0} has no edges (u,v)(u,v) such that uSu\not\in S, then v0v_{0} must equal x1x_{1}. This is because if v0x1,v_{0}\neq x_{1}, then v0v_{0} is connected to its immediate predecessor in the true ordering (which we can call vv^{\prime}), which would mean vSv^{\prime}\in S. This, however, contradicts the fact that v0viv_{0}\prec v_{i} for all viSv_{i}\in S. Therefore, our algorithm is correct assuming it terminates.

Finally, observe that the algorithm must terminate after nn queries, since each query increases the size of SS by 11 (we either add v0v_{0} or uu to SS, neither of which was in SS before). This concludes the proof. ∎

Next, we give a condition that guarantees that a given vertex vv will be in a given level LiL_{i}.

Proposition 3.

Fix a vertex vv that has not been discovered yet, and suppose that at some point, for all uvu\prec v such that uu has not been discovered, (u,v)jiEj(u,v)\not\in\bigcup_{j\geq i}E_{j}. Then, vv is in level LiL_{i}.

Proof.

Suppose that vLiv\not\in L_{i}. Since every undiscovered vertex is contained in the top level Lq+cL_{q+c}, the algorithm currently states that vv is blocked by some edge (u,v)(u,v) with uvu\prec v.

Since the algorithm currently states that vv is blocked by (u,v)(u,v), the vertex uu cannot have been discovered yet. Indeed, once uu is discovered, we remove uu from all levels and perform incremental updates to all of the vertices that edges incident to uu blocked, including vv. This incremental update would either push vv all the way to level L1L_{1} or would result in vv being blocked by a new edge (different from the edge (u,v)(u,v)), contradicting the fact that vv is currently blocked by (u,v)(u,v). So, uvu\prec v and uu is not discovered. Finally, since vLiv\not\in L_{i}, there exists some jij\geq i such that vLj+1\Ljv\in L_{j+1}\backslash L_{j}, which means that since the algorithm decided that (u,v)(u,v) blocks vv from level LjL_{j}, we have that (u,v)Ej(u,v)\in E_{j}. ∎

As a direct corollary, we have the following:

Corollary 4.

After x1,,xx_{1},\dots,x_{\ell} are discovered and all level updates are performed, x+1L1x_{\ell+1}\in L_{1}.

Proof.

Indeed, there are no vertices ux+1u\prec x_{\ell+1} such that uu has not been discovered. So, by setting i=1i=1 and v=x+1v=x_{\ell+1} in Proposition 3, the corollary is proven. ∎

To finish the proof of algorithm correctness, suppose we have discovered xx_{\ell} and we are now in the process of discovering x+1x_{\ell+1}. Since x+1L1x_{\ell+1}\in L_{1} (by Corollary 4), the set CC of vertices in L1L_{1} that are connected to xx_{\ell} contains x+1x_{\ell+1}. For each vCv\in C such that vx+1v\neq x_{\ell+1}, the immediate predecessor of vv is undiscovered, so it is in Lq+1L_{q+1}. Moreover, the immediate predecessor of vv has an edge connecting it to vv. Therefore, the other vertices vCv\in C will be eliminated eventually. However, x+1x_{\ell+1} can never be eliminated, since x1,,xx_{1},\dots,x_{\ell} are the only vertices that precede x+1x_{\ell+1}, and they have been removed from all levels. Therefore, eventually we will narrow down the candidate set to precisely x+1x_{\ell+1}, meaning that we will successfully discover the correct value for x+1x_{\ell+1}. The proof of correctness follows by induction.

3.3 Analyzing the Query Complexity

In this section we bound the total number of queries made by the StochasticSort algorithm. First, as promised in Subsection 3.1, we prove Proposition 1.

Proof of Proposition 1.

Recall that qq (i.e., the number of edge sets E1,E2,E_{1},E_{2},\ldots) is defined solely as a function of pp and nn. Choose α\alpha so that

i=1q(1αp2i)=1p.\prod_{i=1}^{q}\left(1-\frac{\alpha\cdot p}{2^{i}}\right)=1-p. (7)

Note that g(α):=i=1q(1αp2i)g(\alpha):=\prod_{i=1}^{q}\left(1-\frac{\alpha\cdot p}{2^{i}}\right) is a continuous and strictly decreasing function over α>0\alpha>0. Moreover, g(α)1i=1qαp2i1αpg(\alpha)\geq 1-\sum_{i=1}^{q}\frac{\alpha\cdot p}{2^{i}}\geq 1-\alpha\cdot p and g(α)1αp2.g(\alpha)\leq 1-\frac{\alpha\cdot p}{2}. Thus, there is a unique solution α>0\alpha>0 to g(α)=1pg(\alpha)=1-p, and the solution α\alpha must be in the range [1,2][1,2].
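Since g is continuous and strictly decreasing, the solution α\alpha can be found numerically by bisection. The following is a minimal sketch (function and parameter names are ours); it is not needed by the algorithm's analysis, but makes Equation (7) concrete.

```python
def solve_alpha(p, q, tol=1e-12):
    """Solve prod_{i=1}^{q} (1 - alpha * p / 2^i) = 1 - p for alpha in [1, 2]
    by bisection. g is continuous and strictly decreasing in alpha, with
    g(1) >= 1 - p and g(2) <= 1 - p, so the root lies in [1, 2]."""
    def g(a):
        prod = 1.0
        for i in range(1, q + 1):
            prod *= 1.0 - a * p / 2 ** i
        return prod
    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 1 - p:
            lo = mid  # g(mid) too large, so alpha must be larger
        else:
            hi = mid
    return (lo + hi) / 2
```

The bracketing [1,2][1,2] follows from the two bounds on g(α)g(\alpha) stated above.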

Let 𝒟\mathcal{D} be the probability distribution over tuples of the form X1,X2,,Xq\langle X_{1},X_{2},\ldots,X_{q}\rangle, where each XiX_{i} is an indicator random variable that is independently set to 11 with probability αp/2i\alpha p/2^{i}. To construct the EiE_{i}s, for each edge (u,v)E,(u,v)\in E, we sample E(u,v)E(u,v) at random from the distribution 𝒟\mathcal{D} conditioned on at least one of the XiX_{i}s being 11; and for each vertex pair (u,v)E(u,v)\not\in E, we set E(u,v)E(u,v) to be the zero tuple. Note that, by design, iEi=E\bigcup_{i}E_{i}=E.

Now let us analyze the sets EiE_{i} from Alice’s perspective (i.e., conditioning on which edges are stochastic/deterministic but not on which edges are in EE). For each stochastic vertex pair (u,v)(u,v), the pair (u,v)(u,v) is included in EE with probability pp. This is exactly the probability that a tuple sampled from 𝒟\mathcal{D} is non-zero. Thus, for each stochastic vertex pair (u,v)(u,v), we independently have E(u,v)𝒟E(u,v)\sim\mathcal{D}. (This implies the first property.) On the other hand, for each deterministic vertex pair (u,v)(u,v), we independently have E(u,v)𝒟(iXi=1)E(u,v)\sim\mathcal{D}\mid\left(\vee_{i}X_{i}=1\right). Observe that for X1,X2,,XqD\langle X_{1},X_{2},\ldots,X_{q}\rangle\sim D,

Pr[XijXj=1]\displaystyle\Pr\left[X_{i}\mid\vee_{j}X_{j}=1\right] =Pr[Xi]Pr[jXj=1]\displaystyle=\frac{\Pr[X_{i}]}{\Pr[\vee_{j}X_{j}=1]}
=Pr[Xi]p\displaystyle=\frac{\Pr[X_{i}]}{p}
=α2i,\displaystyle=\frac{\alpha}{2^{i}},

where the first equality uses the fact that XiX_{i} can only hold if jXj\vee_{j}X_{j} holds, and the second equality uses (7). This establishes that for any deterministic vertex pair (u,v)(u,v), Pr[Ei(u,v)]=α2i\Pr[E_{i}(u,v)]=\frac{\alpha}{2^{i}}, hence the second property. Finally, the aforementioned independencies imply the third and fourth properties, with the third property (independence if we do not condition on EE) also using the fact that we chose α\alpha to satisfy Equation (7). ∎
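The marginals claimed in Proposition 1 can be checked exactly by enumerating all 2q2^{q} outcomes of the tuple, rather than by sampling. The following sketch (names are ours) computes Pr[iXi=1]\Pr[\vee_{i}X_{i}=1] and each conditional marginal and confirms that they equal pp and α/2i\alpha/2^{i} respectively.

```python
from itertools import product

def conditional_marginals(p, q):
    """Enumerate all 2^q outcomes of <X_1, ..., X_q> with
    X_i ~ Bernoulli(alpha * p / 2^i) independently, and compute
    Pr[some X_j = 1] and Pr[X_i = 1 | some X_j = 1] exactly."""
    # Bisection for alpha as in Eq. (7).
    def g(a):
        out = 1.0
        for i in range(1, q + 1):
            out *= 1.0 - a * p / 2 ** i
        return out
    lo, hi = 1.0, 2.0
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if g(mid) > 1 - p else (lo, mid)
    alpha = (lo + hi) / 2
    probs = [alpha * p / 2 ** i for i in range(1, q + 1)]
    pr_nonzero = 0.0
    pr_i = [0.0] * q
    for bits in product([0, 1], repeat=q):
        w = 1.0  # probability of this outcome
        for b, pi in zip(bits, probs):
            w *= pi if b else 1 - pi
        if any(bits):
            pr_nonzero += w
            for i, b in enumerate(bits):
                if b:
                    pr_i[i] += w
    return alpha, pr_nonzero, [x / pr_nonzero for x in pr_i]
```

By (7), the non-zero probability is exactly pp, so conditioning divides each marginal αp/2i\alpha p/2^{i} by pp, matching the second property.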

Throughout the rest of the algorithm analysis, whenever we discuss the events {Ei(u,v)}\{E_{i}(u,v)\}, we will be taking the perspective of Alice from Proposition 1. That is, our analysis will be considering both the randomness that is involved in selecting stochastic edges, and the randomness that our algorithm introduces when constructing the EiE_{i}s.

For now, we shall assume that 1/p1/p is at least a sufficiently large constant multiple of lgn\lg n; this assumption means that the target size sis_{i} for each level LiL_{i} satisfies si=Ω(lgn)s_{i}=\Omega(\lg n) for all ii, and will allow us to use high-probability Chernoff bounds in several places. We will remove the assumption at the end of the section and extend the analysis to consider arbitrary values of pp.

Assume that the first \ell vertices x1,,xx_{1},\dots,x_{\ell} have been discovered, and we are in the process of identifying x+1x_{\ell+1}. Recall that, for each vertex pair (u,v)(u,v), and for each level LiL_{i}, we use Ei(u,v)E_{i}(u,v) to denote the indicator random variable for the event that (u,v)Ei(u,v)\in E_{i}. Let Xi,vX_{i,v} be the indicator random variable

jixuvEj(u,v)\vee_{j\geq i}\vee_{x_{\ell}\prec u\prec v}E_{j}(u,v)

for the event that some vertex pair (u,v)(u,v) with xuvx_{\ell}\prec u\prec v is contained in one of the edge sets Ei,Ei+1,E_{i},E_{i+1},\ldots. Note that there is no restriction on uu as to which levels contain it (all xuvx_{\ell}\prec u\prec v are considered). For a given vertex vv satisfying xvx_{\ell}\prec v, if Xi,v=0X_{i,v}=0 then vv is guaranteed to be in LiL_{i}, but the reverse does not hold (vv could be in LiL_{i} despite Xi,vX_{i,v} being 11).

The random variables Xi,vX_{i,v} are independent across vertices (but not across levels), since even for deterministic edge pairs we still have independence of E(u,v)={Ei(u,v)}i=1qE(u,v)=\{E_{i}(u,v)\}_{i=1}^{q} across vertex pairs (see the fourth property of Proposition 1). Moreover, each Xi,vX_{i,v} satisfies the following useful inequality:

Lemma 5.

Suppose vv has rank rr. Then

Pr[Xi,v]α(1+rp)2i1,\Pr[X_{i,v}]\leq\frac{\alpha\cdot(1+rp)}{2^{i-1}},

where α\alpha is the positive constant defined in the construction of the EiE_{i}s.

Proof.

By the union bound,

Pr[Xi,v]\displaystyle\Pr[X_{i,v}] jixuvPr[Ej(u,v)].\displaystyle\leq\sum_{j\geq i}\sum_{x_{\ell}\prec u\prec v}\Pr[E_{j}(u,v)].

Recall from Proposition 1 that each stochastic vertex pair has probability at most αp/2j\alpha\cdot p/2^{j} of being included in EjE_{j}, and that each deterministic vertex pair has probability at most α/2j\alpha/2^{j} of being included in EjE_{j}. Thus

Pr[Xi,v]\displaystyle\Pr[X_{i,v}] ji(α2j+(r1)αp2j)\displaystyle\leq\sum_{j\geq i}\left(\frac{\alpha}{2^{j}}+(r-1)\frac{\alpha\cdot p}{2^{j}}\right)
α(1+rp)2i1.\displaystyle\leq\frac{\alpha\cdot(1+rp)}{2^{i-1}}.\qed

Whenever a level LiL_{i} is rebuilt, we define the anchor vertices AiA_{i} to be the vertices in LiL_{i} with ranks in the range

[si/16,3si/32).[s_{i}/16,3s_{i}/32).

Note that, even as LiL_{i} changes incrementally over time, the anchor set AiA_{i} does not change until the next time that LiL_{i} is rebuilt from scratch. Moreover, because LiL_{i} is rebuilt relatively frequently (once for every si/32s_{i}/32 vertices xx_{\ell} that are discovered), no anchor vertices are ever removed from LiL_{i} (until after the level is next rebuilt). Moreover, the rank of any anchor vertex vAiv\in A_{i} is always between si/32s_{i}/32 and 3si/323s_{i}/32, since the rank is initially at least si/16s_{i}/16 and decreases by a total of si/32s_{i}/32 between rebuilds.

Our next lemma establishes that, with high probability, there are a reasonably large number of anchor vertices in each level at any given moment. Recall that r(v)r(v) is used to denote the rank of a given vertex vv.

Lemma 6.

Let i>3i>3. Then, with probability at least 1n101-n^{-10},

|Ai|si/128.|A_{i}|\geq s_{i}/128.
Proof.

For any vertex vv, if Xi,v=0X_{i,v}=0, then vv must be in LiL_{i} by Proposition 3, as no edge remains that can block vv from reaching level LiL_{i}. Therefore,

|Ai|v:r(v)[si/16,3si/32)(1Xi,v).|A_{i}|\geq\sum_{v\;:\;r(v)\in[s_{i}/16,3s_{i}/32)}(1-X_{i,v}).

By Lemma 5,

𝔼[v:r(v)[si/16,3si/32)(1Xi,v)]\displaystyle\mathbb{E}\left[\sum_{v\;:\;r(v)\in[s_{i}/16,3s_{i}/32)}(1-X_{i,v})\right] si32(1α(1+3si32p)2i1)\displaystyle\geq\frac{s_{i}}{32}\left(1-\frac{\alpha\left(1+\frac{3s_{i}}{32}p\right)}{2^{i-1}}\right)
si32(12(1+3322i)2i1)\displaystyle\geq\frac{s_{i}}{32}\left(1-\frac{2\left(1+\frac{3}{32}\cdot 2^{i}\right)}{2^{i-1}}\right)
3si256,\displaystyle\geq\frac{3s_{i}}{256},

since α2\alpha\leq 2 and i4.i\geq 4. Moreover, since the variables Xi,vX_{i,v} are independent across vertices vv (by the fourth property of Proposition 1), we can apply a Chernoff bound to deduce that |Ai|si/128|A_{i}|\geq s_{i}/128 with probability at least 1n101-n^{-10}, as we are assuming sis_{i} is at least a sufficiently large constant multiple of lgn\lg n. ∎

So far, dependencies between random variables have not posed much of an issue. In the subsequent lemmas, however, we will need to be careful about the interaction between which vertices are in each level, which vertices are in each anchor set, and which random variables Ej(u1,u2)E_{j}(u_{1},u_{2}) hold. For this, we will need the following two propositions.

Proposition 7.

At any time in the algorithm, for each vertex vv and level ii, the event that vLiv\in L_{i} only depends on Ej(u1,u2)E_{j}(u_{1},u_{2}) over triples (j,u1,u2)(j,u_{1},u_{2}) where jij\geq i and u1,u2vu_{1},u_{2}\preceq v (including if u1u_{1} or u2u_{2} is already discovered).

Proof.

We prove this by induction on ii. For i>q,i>q, the proposition is trivial since every vertex vv (that is not yet discovered) is in level LiL_{i}. Assume the claim is true for levels i+1,i+2,i+1,i+2,\dots.

Define the sets Lj=Lj{uuv}L^{\prime}_{j}=L_{j}\cap\{u\mid u\preceq v\}. By the inductive hypothesis, the sets Li+1,Li+2,L^{\prime}_{i+1},L^{\prime}_{i+2},\ldots depend only on Ej(u1,u2)E_{j}(u_{1},u_{2}) over triples (j,u1,u2)(j,u_{1},u_{2}) where ji+1j\geq i+1 and u1,u2vu_{1},u_{2}\preceq v. If we fix the outcomes of those Ej(u1,u2)E_{j}(u_{1},u_{2})s, thereby fixing the outcomes of Li+1,Li+2,L^{\prime}_{i+1},L^{\prime}_{i+2},\ldots, then whether or not vLiv\in L_{i} depends only on Ej(u,v)E_{j}(u,v) where i=ji=j and uvu\preceq v. Thus whether or not vLiv\in L_{i} depends only on the allowed Ej(u1,u2)E_{j}(u_{1},u_{2}) variables. ∎

Proposition 8.

Conditioned on the sets AiA_{i} over all ii, the random variables Ei(u,v)E_{i}(u,v) are jointly independent over all triples (i,u,v)(i,u,v) with uAi+cu\in A_{i+c} and with vv satisfying r(v)si+c/8r(v)\geq s_{i+c}/8. Moreover, for each such triple (i,u,v)(i,u,v), the probability Pr[Ei(u,v)=1]\Pr[E_{i}(u,v)=1] remains αp/2i\alpha\cdot p/2^{i}, even after conditioning on the sets AiA_{i}.

Proof.

Note that AiA_{i} only depends on which vv of ranks between si/16s_{i}/16 and 3si/323s_{i}/32 are in level LiL_{i}. Therefore, by Proposition 7, {Ai}\{A_{i}\} over all ii is strictly a function of Ej(u,v)E_{j}(u,v) over all choices (i,j,u,v)(i,j,u,v) with jij\geq i and u,vx+(3si/32)x+(3sj/32)u,v\preceq x_{\ell+(3s_{i}/32)}\preceq x_{\ell+(3s_{j}/32)} (possibly including already discovered vertices u,vu,v). Simplifying, we get that {Ai}\{A_{i}\} over all ii is strictly a function of Ej(u,v)E_{j}(u,v) over all choices of (j,u,v)(j,u,v) satisfying u,vx+(3sj/32)u,v\preceq x_{\ell+(3s_{j}/32)}. Let T1T_{1} denote the set of such triples (j,u,v)(j,u,v).

We wish to prove the independence of {Ei(u,v)}\{E_{i}(u,v)\} over the set T2T_{2} of triples (i,u,v)(i,u,v) with uAi+cu\in A_{i+c} and r(v)si+c/8r(v)\geq s_{i+c}/8. Note that if a triple (i,u,v)(i,u,v) is in T2T_{2}, then we must have that x+(3si/32)u,vx_{\ell+(3s_{i}/32)}\prec u,v, so (i,u,v)T1(i,u,v)\not\in T_{1}. Moreover, r(v)si+c/83si+c/32+2r(u)+2,r(v)\geq s_{i+c}/8\geq 3s_{i+c}/32+2\geq r(u)+2, so (u,v)(u,v) is a stochastic vertex pair. Thus, the triples (i,u,v)T2(i,u,v)\in T_{2} do not include any deterministic vertex pairs (u,v)(u,v), and are disjoint from the triples T1T_{1} on which the anchor sets AiA_{i} depend. The conclusion follows from the first and third properties of Proposition 1. ∎

Now, for each vertex vv satisfying xvx_{\ell}\prec v and for each level LiL_{i} such that r(v)si+c/8r(v)\geq s_{i+c}/8, define the indicator random variable

Yi,v=uAi+cEi(u,v).Y_{i,v}=\vee_{u\in A_{i+c}}E_{i}(u,v).

In order for vv to be in any of the levels L1,,LiL_{1},\ldots,L_{i}, we must have that Yi,v=0Y_{i,v}=0.

The next lemma bounds the probability that Yi,v=0Y_{i,v}=0 for a given vertex vv.

Lemma 9.

Consider a vertex vv satisfying xvx_{\ell}\prec v, and suppose that we condition on the AiA_{i}s such that each AiA_{i} has size at least si/128s_{i}/128. Then, for each level LiL_{i} we have

Pr[Yi,v]1e2c/128.\Pr[Y_{i,v}]\geq 1-e^{-2^{c}/128}.

Moreover, conditioned on the AiA_{i}s, the Yi,vY_{i,v}s are mutually independent.

Proof.

By Proposition 8, we can use the independence of the Ei(u,v)E_{i}(u,v)s conditioned on the AiA_{i}s to conclude that

Pr[Yi,v]\displaystyle\Pr[Y_{i,v}] 1(1αp2i)|Ai+c|\displaystyle\geq 1-\left(1-\frac{\alpha\cdot p}{2^{i}}\right)^{|A_{i+c}|}
1(1p2i)si+c/128\displaystyle\geq 1-\left(1-\frac{p}{2^{i}}\right)^{s_{i+c}/128}
=1(1p2i)2i+c/(128p)\displaystyle=1-\left(1-\frac{p}{2^{i}}\right)^{2^{i+c}/(128p)}
1e2c/128.\displaystyle\geq 1-e^{-2^{c}/128}.

Moreover, conditioned on the AiA_{i}s, Proposition 8 implies that the random variables Yi,vY_{i,v} are all mutually independent. ∎

Using these observations, we can now bound the size of each LiL_{i}.

Lemma 10.

Each LiL_{i} has size at most O(si)O(s_{i}) with probability at least 12n101-2n^{-10}.

Proof.

Recall from Lemma 6 that, with probability at least 1n101-n^{-10}, we have |Aj|sj/128|A_{j}|\geq s_{j}/128 for every AjA_{j}. Condition on some fixed choice of the AjA_{j}s, such that |Aj|sj/128|A_{j}|\geq s_{j}/128 for each AjA_{j}. Fix ii, and for each vertex vv with r(v)si+c/8r(v)\geq s_{i+c}/8, let

Zv=j>i such that sj+c/8r(v)(Yj,v=0).Z_{v}=\wedge_{j>i\text{ such that }s_{j+c}/8\leq r(v)}(Y_{j,v}=0).

We claim that, if vLiv\in L_{i}, then ZvZ_{v} must occur. If ZvZ_{v} does not occur, then there is some j>ij>i such that sj+c/8r(v)s_{j+c}/8\leq r(v) and some uAj+cu\in A_{j+c} such that Ej(u,v)=1.E_{j}(u,v)=1. But then, at the last time level Lj+cL_{j+c} (and thus all lower levels) was rebuilt, uu would block vv from coming to level jj (and thus from coming to level ii), and since uu has not been removed yet, vv must not be in LiL_{i}.

For a given vv with r(v)si+c/8r(v)\geq s_{i+c}/8, since ZvZ_{v} depends on lg(r(v)/si)O(1)\lg(r(v)/s_{i})-O(1) different Yj,vY_{j,v}s, we have by Lemma 9 that

Pr[Zv](e2c/128)lg(r(v)/si)O(1)4lg(r(v)/si)(si/r(v))2,\Pr[Z_{v}]\leq(e^{-2^{c}/128})^{\lg(r(v)/s_{i})-O(1)}\leq 4^{-\lg(r(v)/s_{i})}\leq(s_{i}/r(v))^{2},

assuming the constant cc is sufficiently large. The expected number of vv satisfying r(v)si+c/8r(v)\geq s_{i+c}/8 for which ZvZ_{v} holds is therefore at most

r=si+c/8n(si/r)2O(si).\sum_{r=s_{i+c}/8}^{n}(s_{i}/r)^{2}\leq O(s_{i}).

By a Chernoff bound (which can be used since the ZvZ_{v}’s are independent by Lemma 9), the total number of vertices vv for which r(v)si+c/8r(v)\geq s_{i+c}/8 and ZvZ_{v} holds is O(si)O(s_{i}) with failure probability at most n10n^{-10}. This, in turn, means that

|{vr(v)si+c/8}Li|O(si).|\{v\mid r(v)\geq s_{i+c}/8\}\cap L_{i}|\leq O(s_{i}).

On the other hand, LiL_{i} can contain at most si+c/8s_{i+c}/8 vertices with r(v)<si+c/8r(v)<s_{i+c}/8. Thus, |Li|O(si)|L_{i}|\leq O(s_{i}). ∎

Having established the basic properties of the levels, we now bound the total number of comparisons of the various components of the algorithm.

Lemma 11.

The total expected number of comparisons spent rebuilding levels (from scratch) is O(nlg(pn))O(n\lg(pn)).

Proof.

It suffices to show that, for each level LiL_{i}, the expected number of comparisons spent performing rebuilds on LiL_{i} is O(n)O(n). We deterministically perform O(n/si)O(n/s_{i}) rebuilds on LiL_{i}. Each time that we perform a rebuild, we must query all of the edges in EiE_{i} that go from Li+1L_{i+1} to Li+cL_{i+c}. By Lemma 10, we know that |Li+1||L_{i+1}| and |Li+c||L_{i+c}| are O(si)O(s_{i}) with high probability in nn. Moreover, by Proposition 7, the levels Li+1L_{i+1} and Li+cL_{i+c} only depend on Ei+1,,EqE_{i+1},\dots,E_{q}, so by Proposition 1 they are independent of the random variables Ei(u,v)E_{i}(u,v) ranging over the stochastic vertex pairs (u,v)(u,v). Therefore, if we assume that |Li+1||L_{i+1}| and |Li+c||L_{i+c}| are O(si)O(s_{i}), then the expected number of edges that we must query for the rebuild is

O(si+si2αp2i)=O(si),O\left(s_{i}+s_{i}^{2}\cdot\frac{\alpha\cdot p}{2^{i}}\right)=O(s_{i}),

since there are at most |Li+1|+|Li+c|=O(si)|L_{i+1}|+|L_{i+c}|=O(s_{i}) deterministic vertex pairs and O(si2)O(s_{i}^{2}) stochastic vertex pairs that we might have to query, and each of the stochastic pairs is included in EiE_{i} with probability αp/2i\alpha\cdot p/2^{i}.

In summary, there are O(n/si)O(n/s_{i}) rebuilds of level LiL_{i}, each of which requires O(si)O(s_{i}) comparisons in expectation. The total number of comparisons from performing rebuilds on LiL_{i} is therefore O(n)O(n) in expectation. ∎

Recall that, each time that we discover a new vertex xx_{\ell} in the true order, we must revisit each vertex vv and each LiL_{i} such that vLiv\in L_{i} and (x,v)Ei1(x_{\ell},v)\in E_{i-1}. Because the vertex xx_{\ell} has been discovered, it is now removed from all of the levels, and thus the edge (x,v)(x_{\ell},v) no longer blocks vv from advancing to level Li1L_{i-1}. If there is another edge (u,v)Ei1(u,v)\in E_{i-1} such that uvu\prec v and uLi+c1u\in L_{i+c-1}, then the vertex vv remains blocked from advancing to level Li1L_{i-1}. If, however, vv is no longer blocked from advancing, then vv will advance some number k1k\geq 1 of levels. In this case, we say that vv performs kk incremental advancements.

Lemma 12.

The total expected number of queries performed for incremental advancements is O(nlg(pn))O(n\lg(pn)).

Proof.

For each level ii, we only attempt to increment vertices vLi\Li1v\in L_{i}\backslash L_{i-1} if vv is blocked by xx_{\ell}, meaning that (x,v)Ei1(x_{\ell},v)\in E_{i-1}. Except for when v=x+1,v=x_{\ell+1}, the event that (x,v)Ei1,(x_{\ell},v)\in E_{i-1}, which occurs with probability αp/2i1=O(p2i)\alpha\cdot p/2^{i-1}=O(p\cdot 2^{-i}), is independent of whether vLiv\in L_{i}, which only depends on Ei,Ei+1,E_{i},E_{i+1},\ldots by Proposition 7. By Lemma 10, |Li|=O(si)=O(2i/p)|L_{i}|=O(s_{i})=O(2^{i}/p) with probability 1O(n10)1-O(n^{-10}). It follows that the expected number of vertices vLi\Li1v\in L_{i}\backslash L_{i-1} that we attempt to increment is O(p2i2i/p)=O(1)O(p\cdot 2^{-i}\cdot 2^{i}/p)=O(1). Over all levels, we increment an expected O(lg(pn))O(\lg(pn)) vertices during each vertex discovery (i.e., when trying to find x+1x_{\ell+1}).

Whenever we try to incrementally advance a vertex vv that is currently at some level Lj+1,L_{j+1}, the number of queries that we must perform in order to determine whether vv should further advance to level LjL_{j} is equal to the number of vertices ww in Lj+cL_{j+c} such that (v,w)Ej(v,w)\in E_{j}. However, the only information about vv that we are conditioning on is that vv is in Lj+1L_{j+1} and that vv was blocked by xx_{\ell}. So, by Propositions 1 and 7, the events Ej(v,w)E_{j}(v,w) ranging over ww for which (v,w)(v,w) is a stochastic edge are independent of the information about vv that we have conditioned on; note that the only ww for which (v,w)(v,w) is not stochastic are the vertices ww that come immediately before or immediately after vv in the true order (which we call prec(v)\text{prec}(v), succ(v)\text{succ}(v)). Therefore, the expected number of queries we must perform is at most 2+|Lj+c|αp2j2+|L_{j+c}|\cdot\frac{\alpha\cdot p}{2^{j}}, since each vertex in Lj+c{v,prec(v),succ(v)}L_{j+c}\setminus\{v,\text{prec}(v),\text{succ}(v)\} has probability αp2j\frac{\alpha\cdot p}{2^{j}} of having an edge in EjE_{j} to vv. By Lemma 10, we know that |Lj+c|=O(sj)=O(2j/p)|L_{j+c}|=O(s_{j})=O(2^{j}/p) with probability at least 12n101-2n^{-10}. Thus, the expected number of queries that we must perform for each incremental advancement is O(1)O(1).

To prove the lemma, it suffices to show that the expected total number of attempted incremental advancements is O(nlg(pn))O(n\lg(pn)), since each one uses an expected O(1)O(1) queries. Recall that, each time that we try to discover a new vertex x+1,x_{\ell+1}, we attempt to increment O(lg(pn))O(\lg(pn)) vertices in expectation: each vertex vv can fail to be incremented only once, because we stop incrementing vv after that. So, the expected number of failed incremental advancements across the entire algorithm is O(nlg(pn))O(n\lg(pn)). In addition, whenever we perform a successful incremental advancement, we increase the size of some level by 11. We know that, with high probability in nn, each level LiL_{i} never has size exceeding O(si)O(s_{i}). Between rebuilds of LiL_{i}, O(si)O(s_{i}) total vertices are removed from LiL_{i}. In order so that |Li|=O(si)|L_{i}|=O(s_{i}), the total number of vertices that are added to LiL_{i} between rebuilds must be at most O(si)O(s_{i}). Thus, between rebuilds, at most O(si)O(s_{i}) vertices are incrementally advanced into LiL_{i}. Since LiL_{i} is rebuilt O(n/si)O(n/s_{i}) times, the total number of incremental advancements into LiL_{i} over all time is O(n)O(n). Summing over the levels LiL_{i}, the total number of incremental advancements is O(nlg(pn))O(n\lg(pn)). ∎

After doing the incremental advancements for a given xx_{\ell}, and performing any necessary rebuilds of levels, the algorithm searches for the next vertex x+1x_{\ell+1}. Our final task is to bound the expected number of queries needed to identify x+1x_{\ell+1}.

Lemma 13.

The expected number of comparisons needed to identify a given x+1x_{\ell+1} (i.e., to eliminate incorrect candidates from the candidate set) is O(lg(pn))O(\lg(pn)).

Proof.

Consider the candidate set CC for x+1x_{\ell+1}. In addition, consider a vertex vx+1v\succ x_{\ell+1}. For vv to be in the candidate set CC, the following two events must occur:

  • Event 1: vv must have an edge to xx_{\ell}.

  • Event 2: For all ii such that r(v)si+c/8r(v)\geq s_{i+c}/8, we have Yi,v=0Y_{i,v}=0.

Event 1 occurs with probability p, and only depends on {E_j(x_ℓ, v)} over all j. On the other hand, by Proposition 7, the sets {A_k}_{k≥1} only depend on {E_j(u_1, u_2)} over triples (j, u_1, u_2) with u_1, u_2 ⪯ x_{ℓ+3s_j/32}. Let

T1={(j,x,v)j[q]}{(j,u1,u2)u1,u2x+3sj/32,j[q]}T_{1}=\{(j,x_{\ell},v)\mid j\in[q]\}\cup\{(j,u_{1},u_{2})\mid u_{1},u_{2}\preceq x_{\ell+3s_{j}/32},j\in[q]\}

be the set of triples (j,u1,u2)(j,u_{1},u_{2}) whose corresponding variables Ej(u1,u2)E_{j}(u_{1},u_{2}) cumulatively determine Event 1 and the anchor sets {Ak}k1\{A_{k}\}_{k\geq 1}.

Event 2 considers Yi,vY_{i,v} for ii such that si+c/8r(v)s_{i+c}/8\leq r(v) (note that these are the only ii for which Yi,vY_{i,v} is defined), so assuming the AkA_{k}s are fixed, it only depends on Ei(u,v)E_{i}(u,v) for triples in the set

T2={(i,u1,v)si+c/8r(v),u1Ai+c}.T_{2}=\{(i,u_{1},v)\mid s_{i+c}/8\leq r(v),u_{1}\in A_{i+c}\}.

Note that T_2 is disjoint from T_1: u_1 ∈ A_{i+c} means that x_ℓ ≺ u_1, and s_{i+c}/8 ≤ r(v) means that x_{ℓ+s_{i+c}/8} ⪯ v, so (i, u_1, v) ∉ T_1. Moreover, T_2 consists exclusively of stochastic vertex pairs, since for any u_1 ∈ A_{i+c}, we have r(v) ≥ s_{i+c}/8 ≥ r(u_1) + 2. Thus, if we fix the outcomes of Event 1 and of the anchor sets {A_k}_{k≥1}, then Proposition 1 tells us that the outcomes of the E_i(u_1, v)s corresponding to T_2 are independent of the random variables that have already been fixed.

Now let us consider the probability of Event 2 if we condition on Event 1 and if we condition on the sets AkA_{k} each having size at least sk/128s_{k}/128. There are lg(r(v)p)O(1)\lg(r(v)\cdot p)-O(1) random variables Yi,vY_{i,v} for which r(v)si+c/8r(v)\geq s_{i+c}/8, and by Lemma 9 each Yi,vY_{i,v} independently has probability at most e2c/128e^{-2^{c}/128} of being 0 (based on the outcomes of the random variables Ei(u1,v)E_{i}(u_{1},v) where u1u_{1} is selected so that (i,u1,v)T2(i,u_{1},v)\in T_{2}). So conditioned on the outcome of Event 1 and on the sets AkA_{k} each having size at least sk/128s_{k}/128, Event 2 occurs with probability at most

(e2c/128)lg(r(v)p)O(1)O(1)(r(v)p)10.\left(e^{-2^{c}/128}\right)^{\lg(r(v)p)-O(1)}\leq\frac{O(1)}{(r(v)\cdot p)^{10}}.
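To spell out the inequality above (a sketch, under the assumption that the constant c is chosen large enough that 2^c/(128 ln 2) ≥ 10): writing lg(r(v)p) = ln(r(v)p)/ln 2, we have

```latex
\left(e^{-2^{c}/128}\right)^{\lg(r(v)p)-O(1)}
= e^{O(2^{c})}\cdot e^{-\frac{2^{c}}{128\ln 2}\,\ln(r(v)p)}
= O(1)\cdot (r(v)\cdot p)^{-2^{c}/(128\ln 2)}
\le \frac{O(1)}{(r(v)\cdot p)^{10}},
```

where the factor e^{O(2^c)} is a constant because c is a constant.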

Let 1\mathcal{E}_{1} and 2\mathcal{E}_{2} be the indicator random variables for Events 1 and 2, respectively, and let 𝒟\mathcal{D} be the indicator that |Ak|sk/128|A_{k}|\geq s_{k}/128 for all kk. To summarize, we have so far shown that Pr[1]=p\Pr[\mathcal{E}_{1}]=p and that

Pr[21,𝒟]O(1)(r(v)p)10.\Pr[\mathcal{E}_{2}\mid\mathcal{E}_{1},\mathcal{D}]\leq\frac{O(1)}{(r(v)\cdot p)^{10}}.

Let u be the true predecessor of v. As soon as we query the edge (u,v), we will be able to eliminate v from the candidate set. Thus, if v is in the candidate set, then we can upper bound the number of queries needed to eliminate v by the number of query-able edges from {x_{ℓ+1}, v} to each level L_{i+1} for which u ∉ L_i.

Now consider a fixed level L_i. For u ∉ L_i to hold, we must have:

  • Event 3: Xi,u=1X_{i,u}=1.

Event 3 depends only on E_j(u_1, u) where u_1 ≺ u and j ≥ i. This means that the events that determine Event 3 concern different pairs of vertices than do the events {E_j(x_ℓ, v)}_{j=1}^{q} that determine Event 1 (since the latter events all involve u's successor v); and the events that determine Event 3 concern different pairs of vertices than do the events {E_j(u_1, v)}_{(j,u_1,v)∈T_2} that determine Event 2 once the A_k's are fixed (since, once again, the latter events all involve u's successor v). Importantly, by the fourth property of Proposition 1, this means that Event 1 and Event 3 are independent if we do not condition on the A_k's, and that conditioning on Events 1 and 3 along with the A_k's does not affect the probability of Event 2 in comparison to conditioning only on Event 1 and the A_k's (and not on Event 3).

Let 3\mathcal{E}_{3} be the indicator for Event 3. By Lemma 5,

Pr[3]1+r(u)p2i1.\Pr[\mathcal{E}_{3}]\leq\frac{1+r(u)p}{2^{i-1}}.

Thus we have that

Pr[1,2,3]\displaystyle\Pr[\mathcal{E}_{1},\mathcal{E}_{2},\mathcal{E}_{3}] =Pr[1,3,𝒟]Pr[21,3,𝒟]+Pr[1,2,3,¬𝒟]\displaystyle=\Pr[\mathcal{E}_{1},\mathcal{E}_{3},\mathcal{D}]\cdot\Pr[\mathcal{E}_{2}\mid\mathcal{E}_{1},\mathcal{E}_{3},\mathcal{D}]+\Pr[\mathcal{E}_{1},\mathcal{E}_{2},\mathcal{E}_{3},\lnot\mathcal{D}]
Pr[1,3]Pr[21,3,𝒟]+Pr[¬𝒟]\displaystyle\leq\Pr[\mathcal{E}_{1},\mathcal{E}_{3}]\cdot\Pr[\mathcal{E}_{2}\mid\mathcal{E}_{1},\mathcal{E}_{3},\mathcal{D}]+\Pr[\lnot\mathcal{D}]
=Pr[1]Pr[3]Pr[21,𝒟]+Pr[¬𝒟]\displaystyle=\Pr[\mathcal{E}_{1}]\cdot\Pr[\mathcal{E}_{3}]\cdot\Pr[\mathcal{E}_{2}\mid\mathcal{E}_{1},\mathcal{D}]+\Pr[\lnot\mathcal{D}]
p1+r(u)p2i1min(1,O(1)(r(v)p)10)+1n9.\displaystyle\leq p\cdot\frac{1+r(u)\cdot p}{2^{i-1}}\cdot\min\left(1,\frac{O(1)}{(r(v)\cdot p)^{10}}\right)+\frac{1}{n^{9}}.

We remark that, although Events 1, 2, 3 are formally defined only for i>0i>0 (since there is no level L0L_{0}), if we also consider a level L0L_{0} to be the empty set (so uL0u\not\in L_{0} by default), we have that Pr[3]11+r(u)p2i1\Pr[\mathcal{E}_{3}]\leq 1\leq\frac{1+r(u)p}{2^{i-1}}, so our bound for Pr[1,2,3]\Pr[\mathcal{E}_{1},\mathcal{E}_{2},\mathcal{E}_{3}] is still true even in the case of i=0i=0.

For any vertex vx+1v\succ x_{\ell+1}, if vv’s predecessor uu is in Li+1\LiL_{i+1}\backslash L_{i} for some i0i\geq 0 (where L0=L_{0}=\emptyset), then once we have queried edges from vv to Li+1L_{i+1} we will have found ux+1u\neq x_{\ell+1}, eliminating vv from the candidate set. Hence, for any level Li+1L_{i+1}, we will only query edges from vv to Li+1L_{i+1} if Event 3 occurs for level ii, and we only query edges from x+1x_{\ell+1} to Li+1L_{i+1} if Event 33 occurs for some vv in the original candidate set (since otherwise we will have eliminated all vertices except x+1x_{\ell+1}).

We have already bounded the probability of Events 1, 2, 3 occurring for a given v and level L_i. Therefore, if we wish to bound the number of edges that are queried while discovering x_{ℓ+1}, our final task is the following: for each v and level L_i (including i = 0), we must bound the number h of query-able edges e from {x_{ℓ+1}, v} to L_{i+1}, conditioning on Events 1, 2, 3 occurring for v and L_i. We do not need to count any edge e that has already been queried in the past. There are at most O(1) deterministic edges incident to {x_{ℓ+1}, v}. On the other hand, if we condition on which edges have been queried in the past, and we also condition on Events 1, 2, and 3, then each remaining stochastic vertex pair (that has not yet been queried) has conditional probability at most p of being a query-able edge. In particular, for each stochastic vertex pair that has not been queried, the only information that our conditions can reveal about it is that there is some set of E_i's in which the vertex pair is known not to appear (note, in particular, that every time the algorithm checks whether some edge (u,v) is in E or in some E_i, if the algorithm finds that the edge is, then it immediately queries the edge, unless u = x_ℓ, in which case we already know the direction of the edge (u,v) since v ≻ x_ℓ, so we will never need to query it; thus, the only information that the algorithm ever learns about not-yet-queried edges is that those edges are not in some subset of the E_i's, which can also easily be verified from the pseudocode in Appendix A); and this can only decrease the probability that the vertex pair is a query-able edge.
Since |L_{i+1}| ≤ O(s_i) with high probability in n, the expected number of edges that we must query from {x_{ℓ+1}, v} to L_{i+1} (conditioning on Events 1, 2, 3) is O(p·s_i), unless the combined probability of Events 1, 2, 3 is already ≤ 1/poly(n), in which case we can use the trivial bound of O(n²) on the number of edges that we query (note that p·s_i ≥ Ω(1), and thus the bound of O(p·s_i) also covers the O(1) deterministic edges that we might have to query).

Combining the pieces of the analysis, and setting r = r(v), the expected number of comparisons that we incur while removing vertices from x_ℓ's candidate set is at most

1polyn+vi=1lg(pn)Pr[Events 1, 2, 3 for v and Li]O(psi)\displaystyle\hskip 14.22636pt\frac{1}{\operatorname{poly}n}+\sum_{v}\sum_{i=1}^{\lg(pn)}\Pr[\text{Events 1, 2, 3 for }v\text{ and }L_{i}]\cdot O(ps_{i})
=1polyn+r=1ni=1lg(pn)O(pmin(1,1(rp)10)1+rp2i1psi)\displaystyle=\frac{1}{\operatorname{poly}n}+\sum_{r=1}^{n}\sum_{i=1}^{\lg(pn)}O\left(p\cdot\min\left(1,\frac{1}{(r\cdot p)^{10}}\right)\cdot\frac{1+rp}{2^{i-1}}\cdot ps_{i}\right)
=1polyn+r=1ni=1lg(pn)O(pmin(1,1(rp)10)1+rp2i12i)\displaystyle=\frac{1}{\operatorname{poly}n}+\sum_{r=1}^{n}\sum_{i=1}^{\lg(pn)}O\left(p\cdot\min\left(1,\frac{1}{(r\cdot p)^{10}}\right)\cdot\frac{1+rp}{2^{i-1}}\cdot 2^{i}\right)
=1polyn+r=1ni=1lg(pn)O(pmin(1,1(rp)10)(1+rp))\displaystyle=\frac{1}{\operatorname{poly}n}+\sum_{r=1}^{n}\sum_{i=1}^{\lg(pn)}O\left(p\cdot\min\left(1,\frac{1}{(r\cdot p)^{10}}\right)\cdot(1+rp)\right)
=1polyn+O(lg(pn))r=1nO(pmin(1,1(rp)10)(1+rp)).\displaystyle=\frac{1}{\operatorname{poly}n}+O(\lg(pn))\sum_{r=1}^{n}O\left(p\cdot\min\left(1,\frac{1}{(r\cdot p)^{10}}\right)\cdot(1+rp)\right).

Furthermore, we can write the sum

r=1n(pmin(1,1(rp)10)(1+rp))\displaystyle\sum_{r=1}^{n}\left(p\cdot\min\left(1,\frac{1}{(r\cdot p)^{10}}\right)\cdot(1+rp)\right) =r=11/pp(1+rp)+r=1/p+1np(1+rp)(rp)10\displaystyle=\sum_{r=1}^{1/p}p\cdot(1+rp)+\sum_{r=1/p+1}^{n}\frac{p\cdot(1+rp)}{(r\cdot p)^{10}}
1pp2+2pr=1/p1(rp)9𝑑r\displaystyle\leq\frac{1}{p}\cdot p\cdot 2+2p\cdot\int_{r=1/p}^{\infty}\frac{1}{(r\cdot p)^{9}}dr
2+2r=11(r)9𝑑r=O(1),\displaystyle\leq 2+2\cdot\int_{r^{\prime}=1}^{\infty}\frac{1}{(r^{\prime})^{9}}\cdot dr^{\prime}=O(1),

substituting r′ = r·p. Therefore, in total, the expected number of comparisons that we incur while removing vertices from x_ℓ's candidate set is O(lg(pn)). ∎
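As a sanity check (not part of the proof), the boundedness of this sum is easy to verify numerically; the following sketch evaluates it directly for several values of n and p:

```python
# Evaluate sum_{r=1}^{n} p * min(1, 1/(rp)^10) * (1 + rp) and check that it
# stays below a fixed constant, as the integral comparison above predicts.
def candidate_sum(n: int, p: float) -> float:
    total = 0.0
    for r in range(1, n + 1):
        total += p * min(1.0, 1.0 / (r * p) ** 10) * (1 + r * p)
    return total

for n in (10**3, 10**5):
    for p in (0.01, 0.1, 0.5):
        assert candidate_sum(n, p) < 10.0
```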

The preceding lemmas, along with subsection 3.2, combine to give us the following theorem:

Theorem 14.

The expected number of comparisons made by the StochasticSort algorithm is O(nlg(pn))O(n\lg(pn)). Moreover, the algorithm always returns the correct ordering.

We conclude the section by revisiting the requirement that 1/p be at least a sufficiently large constant multiple of lg n. Recall that this requirement existed so that all of the level sizes would be large enough for us to apply Chernoff bounds to them. We now remove this requirement, thereby completing the proof that Theorem 14 holds for all p. Consider an index i for which s_i is a large constant multiple of lg n. We can analyze all of the levels L_j, j ≥ i, as before, but we must analyze the levels below level i differently. One way to upper bound the number of edge queries in the levels below L_i is to simply assume that every pair of vertices in L_i is queried as an edge. Even in this worst case, this would only add O(lg² n) queries between every pair of consecutive rebuilds of L_i (since the number of distinct vertices that reside in L_i at any point between a given pair of rebuilds is O(lg n)). There are O(n/lg n) total rebuilds of L_i, and thus the total number of distinct queries that occur at lower levels over the entire course of the algorithm is upper bounded by O(n lg n). Thus Theorem 14 holds even when 1/p = O(lg n).

4 Lower Bound for Stochastic Generalized Sorting

In this section, we prove that our stochastic generalized sorting bound of O(nlg(pn))O(n\lg(pn)) is tight for plnn+lnlnn+ω(1)np\geq\frac{\ln n+\ln\ln n+\omega(1)}{n}. In other words, for pp even slightly greater than (lnn)/n(\ln n)/n, generalized sorting requires at least Ω(nlg(pn))\Omega(n\lg(pn)) queries (to succeed with probability at least 2/32/3) on an Erdős-Renyi random graph G(n,p)G(n,p) (with a planted Hamiltonian path to ensure sorting is possible). Formally, we prove the following theorem:

Theorem 15.

Let nn and plnn+lnlnn+ω(1)np\geq\frac{\ln n+\ln\ln n+\omega(1)}{n} be known. Suppose that we are given a random graph GG created by starting with the Erdős-Renyi random graph G(n,p)G(n,p), deciding the true order of the vertices uniformly at random, and then adding the corresponding Hamiltonian path. Then, any algorithm that determines the true order of GG with probability at least 2/32/3 over the randomness of the algorithm and the randomness of GG must use Ω(nlg(pn))\Omega(n\lg(pn)) queries.

The main technical tool we use is the following bound on the number of Hamiltonian cycles in a random graph, due to Glebov and Krivelevich [8]:

Theorem 16.

For plnn+lnlnn+ω(1)np\geq\frac{\ln n+\ln\ln n+\omega(1)}{n}, the number of (undirected) Hamiltonian cycles in the random graph G(n,p)G(n,p) is at least n!pn(1o(1))nn!\cdot p^{n}\cdot(1-o(1))^{n}, with probability 1o(1)1-o(1) as nn goes to \infty.

A consequence of this is that, for the graph GG in Theorem 15, the total number of undirected Hamiltonian paths is, with probability 1o(1)1-o(1), also at least (np(1o(1))/e)n=(np)Ω(n)(np\cdot(1-o(1))/e)^{n}=(np)^{\Omega(n)}.
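To see the final step, one can apply Stirling's bound n! ≥ (n/e)^n to the count from Theorem 16 (a one-line sketch; the last equality uses that np ≥ ln n grows with n, so np/e ≥ (np)^{1/2} for large n):

```latex
n!\cdot p^{n}\cdot(1-o(1))^{n}
\;\ge\; \left(\frac{n}{e}\right)^{n}\cdot p^{n}\cdot(1-o(1))^{n}
\;=\; \left(\frac{np\,(1-o(1))}{e}\right)^{n}
\;=\; (np)^{\Omega(n)}.
```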

We will think of the graph GG in Theorem 15 as being constructed through the following process. First, Alice chooses a random permutation π1,,πn\pi_{1},\dots,\pi_{n} of the vertices V=[n]V=[n] and makes the true ordering π1π2πn\pi_{1}\prec\pi_{2}\prec\cdots\prec\pi_{n}. Then, she adds the edges (π1,π2),,(πn1,πn)(\pi_{1},\pi_{2}),\ldots,(\pi_{n-1},\pi_{n}). Finally, for each pair of vertices iji\neq j that are not consecutive in the true order, she adds (i,j)(i,j) to EE with probability pp. We use GG to denote the undirected graph that Alice has constructed, and we use (G,π)(G,\pi) to denote the directed graph.

Now, suppose an adversary sees the undirected graph G=(V,E)G=(V,E) with the planted Hamiltonian path. In addition, suppose the adversary knows some set of directed edges EE^{\prime} (where, as undirected edges, EEE^{\prime}\subset E). Now, consider a permutation σ\sigma such that (σ1,σ2),(σ2,σ3),,(σn1,σn)(\sigma_{1},\sigma_{2}),(\sigma_{2},\sigma_{3}),\cdots,(\sigma_{n-1},\sigma_{n}) are all undirected edges in EE, and such that the known orders in EE^{\prime} do not violate σ\sigma. We say that such permutations σ\sigma are consistent with GG and EE^{\prime}.

The next proposition tells us that the adversary cannot distinguish the true order π\pi from any other permutation π\pi^{\prime} that is consistent with GG and EE^{\prime}. In other words, the adversary must treat all possible Hamiltonian paths as equally viable.

Proposition 17.

Suppose we know GG as well as a subset EE^{\prime} of directed edges. Then, the posterior probability of the true ordering of the vertices is uniform over all σ\sigma that are consistent with GG and EE^{\prime}.

Proof.

For each permutation σ, define 𝒮(σ) to be the event that Alice chose the permutation σ. Define 𝒢 to be the event that the undirected graph constructed by Alice is the graph G, and define ℰ′ to be the event that every directed edge in E′ appears in the directed graph (G,π). Then, the initial probability that Alice created (G,σ) (before we are told G or any of the orders of E′) is precisely

$\Pr[\mathcal{S}(\sigma),\mathcal{G},\mathcal{E}^{\prime}]=\frac{1}{n!}\cdot p^{|E|-(n-1)}\cdot(1-p)^{n(n-1)/2-|E|}.$  (8)

This is because she chose π = σ with probability 1/n!, each stochastic edge in E (there are |E| − (n−1) of them) is included with probability p, and each non-edge vertex pair is excluded with probability (1−p). Finally, given 𝒮(σ) and 𝒢, the event ℰ′ holds automatically, since the directions in E′ were obtained by querying the true order. As a result, conditioning on 𝒢 and ℰ′ gives

Pr[𝒮(σ)𝒢,]=Pr[𝒮(σ),𝒢,]Pr[𝒢,]\Pr[\mathcal{S}(\sigma)\mid\mathcal{G},\mathcal{E}^{\prime}]=\frac{\Pr[\mathcal{S}(\sigma),\mathcal{G},\mathcal{E}^{\prime}]}{\Pr[\mathcal{G},\mathcal{E}^{\prime}]}

which by (8) is the same for every σ\sigma that is consistent with the fixed G,EG,E^{\prime}. Hence, knowing GG and EE^{\prime} means each permutation σ\sigma that is consistent with GG and EE^{\prime} is equally likely. ∎
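Proposition 17 can also be confirmed by brute force on a tiny instance (a sketch, not part of the proof). The following code enumerates Alice's process exactly for n = 4 and checks that, for every realized graph G (taking E′ = ∅), all consistent permutations receive the same joint probability, matching equation (8):

```python
from collections import defaultdict
from itertools import combinations, permutations

n, p = 4, 0.3
pairs = [frozenset(e) for e in combinations(range(n), 2)]

# weight[(G, sigma)] = Pr[Alice picks sigma and builds exactly the graph G].
weight = defaultdict(float)
for sigma in permutations(range(n)):
    path = {frozenset((sigma[i], sigma[i + 1])) for i in range(n - 1)}
    stochastic = [e for e in pairs if e not in path]
    for mask in range(2 ** len(stochastic)):
        chosen = {e for k, e in enumerate(stochastic) if (mask >> k) & 1}
        prob = (1 / 24) * p ** len(chosen) * (1 - p) ** (len(stochastic) - len(chosen))
        weight[(frozenset(path | chosen), sigma)] += prob  # 1/24 = 1/n! for n = 4

assert abs(sum(weight.values()) - 1.0) < 1e-9  # the process is a distribution

# For each graph G, all permutations whose Hamiltonian path lies in G are
# equally likely: the posterior is uniform over consistent permutations.
by_graph = defaultdict(list)
for (G, sigma), w in weight.items():
    by_graph[G].append(w)
assert all(max(ws) - min(ws) < 1e-12 for ws in by_graph.values())
```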

At this point, the remainder of the proof of the lower bound is relatively simple. The idea is that initially, by Theorem 16, there are (np)^{Ω(n)} possible choices for σ. But each query gives at most 1 bit of information about σ, so Ω(n lg(np)) queries are needed to determine σ. We now prove this formally.

Proof of Theorem 15.

Suppose that GG is a graph with K(np)cnK\geq(np)^{c\cdot n} Hamiltonian paths for some fixed constant cc, which is true with probability at least 1o(1)1-o(1). By Proposition 17, unless we have narrowed down the possible permutations to 11, we have at most a 1/21/2 probability of guessing the true permutation. Therefore, to guess the true permutation with probability 2/32/3 overall using TT queries, we must be able to perform TT queries to reduce the number of consistent permutations σ\sigma to 11, with probability at least 1/3o(1)>1/41/3-o(1)>1/4.

Now, after performing t ≤ T queries, suppose there are K_t potential consistent permutations. At the beginning, K_0 = K, which means lg K_0 ≥ cn·lg(pn). Now, suppose we have K_t possible permutations at some point and we query some edge e = (u,v). Then, for some integers a, b such that a + b = K_t, either we will reduce the number of permutations to a or reduce it to b, depending on the direction of e. Due to the uniform distribution over the consistent permutations (by Proposition 17), the first event occurs with probability a/K_t and the second with probability b/K_t. Therefore, the expected value of lg K_{t+1}, given that we query this edge e, is

aKtlga+bKtlgb=lgKt+(aKtlgaKt+bKtlgbKt)=lgKtH2(a/Kt),\frac{a}{K_{t}}\lg a+\frac{b}{K_{t}}\lg b=\lg K_{t}+\left(\frac{a}{K_{t}}\lg\frac{a}{K_{t}}+\frac{b}{K_{t}}\lg\frac{b}{K_{t}}\right)=\lg K_{t}-H_{2}(a/K_{t}),

where H2(p):=[plgp+(1p)lg(1p)]H_{2}(p):=-[p\lg p+(1-p)\lg(1-p)] is known to be at most 11 for p[0,1]p\in[0,1]. (Note that we are using the convention that 0lg0=00\lg 0=0).

Therefore, no matter what edge we query, the expected value of lgKt+1\lg K_{t+1} is at least (lgKt)1.(\lg K_{t})-1. Hence, for any t0,t\geq 0, 𝔼[lgKt](lgK)tcnlg(pn)t\mathbb{E}[\lg K_{t}]\geq(\lg K)-t\geq cn\lg(pn)-t, regardless of the choice of queries that we make. So, after T:=cnlg(pn)/4T:=cn\lg(pn)/4 queries, the random variable lgKT\lg K_{T} is bounded in the range [0,lgK][0,\lg K] and has expectation at least lgKT34lgK\lg K-T\geq\frac{3}{4}\lg K. Therefore, Pr[lgKT=0]\Pr[\lg K_{T}=0] is at most 14\frac{1}{4} by Markov’s inequality on the random variable lgKlgKT\lg K-\lg K_{T}. Since lgKT=0\lg K_{T}=0 is equivalent to there being exactly one choice for the true permutation σ\sigma, we have that T=cnlg(pn)/4T=cn\lg(pn)/4 queries is not sufficient to determine σ\sigma uniquely with success probability at least 1/3o(1)>1/41/3-o(1)>1/4. This concludes the proof. ∎
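The key step above, that one comparison removes at most one bit of entropy in expectation, can be checked exhaustively for small K (a sketch; the identity (a/K)·lg a + (b/K)·lg b = lg K − H₂(a/K) is what the loop verifies implicitly):

```python
import math

def expected_lg_after_query(a: int, b: int) -> float:
    """E[lg K_{t+1}] when a query splits the K_t = a + b consistent
    permutations into groups of size a and b."""
    K = a + b
    term = lambda x: 0.0 if x == 0 else (x / K) * math.log2(x)
    return term(a) + term(b)

# Since H_2(a/K) <= 1, the expected drop in lg K_t is at most one bit.
for K in range(2, 200):
    for a in range(K + 1):
        assert expected_lg_after_query(a, K - a) >= math.log2(K) - 1 - 1e-9
```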

5 Generalized Sorting in Arbitrary (Moderately Sparse) Graphs

In this section, we improve the worst-case generalized sorting bound of [11] from O(min(n3/2lgn,m))O(\min(n^{3/2}\lg n,m)) to O(mnlgn)O(\sqrt{mn}\cdot\lg n). Hence, for moderately sparse graphs, i.e., n(lgn)O(1)mn2/(lgn)O(1)n(\lg n)^{O(1)}\leq m\leq n^{2}/(\lg n)^{O(1)}, we provide a significant improvement over [11].

The main ingredients we use come from two papers: [11], which provides a convex-geometric approach that is especially useful when we are relatively clueless about the true order, and [14], which shows how to efficiently determine the true order given a sufficiently good approximation to it.

Suppose that the original (undirected) graph is some G=(V,E)G=(V,E) and that at some point, we have queried some subset EEE^{\prime}\subset E of (directed) edges and know their orders. We say that a permutation σ\sigma is compatible with GG and EE^{\prime} if for every directed edge (u,v)E(u,v)\in E^{\prime}, uvu\prec v according to σ\sigma. (Note that being compatible is a slightly weaker property than being consistent, as defined in Section 4, because compatibility does not require that σ\sigma appear as a Hamiltonian path in GG.) Let Σ(G,E)\Sigma(G,E^{\prime}) be the set of permutations that are compatible with GG and EE^{\prime}. Moreover, for any vertex uu, define SE,G(u)S_{E^{\prime},G}(u) to be the (unweighted) average of the ranks of uu among all σΣ(G,E)\sigma\in\Sigma(G,E^{\prime}). Huang et al. [11] proved the following theorem using an elegant geometric argument:

Theorem 18 (Lemma 2.1 of [11]).

Suppose that SE,G(u)SE,G(v).S_{E^{\prime},G}(u)\geq S_{E^{\prime},G}(v). Then, |Σ(G,E(u,v))|(11e)|Σ(G,E)|,|\Sigma(G,E^{\prime}\cup(u,v))|\leq\left(1-\frac{1}{e}\right)\cdot|\Sigma(G,E^{\prime})|, where (u,v)(u,v) represents the directed edge uv.u\to v.

The above theorem can be thought of as follows. If we see that uvu\prec v in the true ordering but previously the expectation of uu’s rank under all compatible permutations was larger than the expectation of vv’s rank under all compatible permutations, then the number of compatible permutations decreases by a constant factor.

In addition, we have the following theorem from [14]:

Theorem 19 (Theorem 1.1 of [14]).

Let E\vec{E} be the true directed graph of EE (with respect to the true permutation π\pi) and let E~\tilde{E} be some given directed graph of EE, such that at most ww of the edge directions differ from E\vec{E}. Then, given the graph G=(V,E)G=(V,E), E~\tilde{E} (but not E\vec{E}), and ww, there exists a randomized algorithm PredictionSort(G,E~,wG,\tilde{E},w) using O(w+nlgn)O(w+n\lg n) queries that can determine E\vec{E} with high probability.

Our algorithm, which we call SparseGeneralizedSort, works as follows. Set a parameter aa to be (1+c)m/n(1+c)\sqrt{m/n} for some sufficiently large positive constant cc. We repeatedly apply the following procedure until we have found the true ordering.

  1. Let E′ be the current set of queried edges and Σ(G,E′) be the set of compatible permutations. If |Σ(G,E′)| = 1, then there is only one option for the permutation, which we return.

  2. Else, choose σ to be the permutation of [n] in which u occurs before v whenever S_{E′,G}(u) ≤ S_{E′,G}(v), breaking ties arbitrarily.

  3. Let Ẽ be our “approximation” of the true directed graph E⃗, where (u,v) ∈ Ẽ if u comes before v in the σ order.

  4. Query a random edges from E.

  5. If all a queried edges match the directions in our approximation Ẽ, run the algorithm of Theorem 19 with the parameter w set to √(mn)·lg n. If not, add the mispredicted edges and their true directions to E′, and return to the first step.
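To make the loop concrete, here is a minimal brute-force sketch of SparseGeneralizedSort for tiny graphs. Two simplifications, both hypothetical stand-ins rather than the paper's polynomial-time machinery: the average ranks S_{E′,G}(u) are computed by exact enumeration of Σ(G,E′) instead of polytope sampling, and Step 5 simply queries all remaining edges in place of the PredictionSort algorithm of Theorem 19.

```python
import random
from itertools import permutations

def sparse_generalized_sort(n, edges, query, a):
    """Sort vertices 0..n-1. `edges` are undirected pairs (u, v); `query(u, v)`
    returns True iff u precedes v in the (hidden) true order."""
    Eprime = {}  # queried edges: (u, v) -> True iff u precedes v
    num_queries = 0

    def compatible():
        out = []
        for sigma in permutations(range(n)):
            rank = {v: i for i, v in enumerate(sigma)}
            if all((rank[u] < rank[v]) == d for (u, v), d in Eprime.items()):
                out.append(sigma)
        return out

    while True:
        Sigma = compatible()
        if len(Sigma) == 1:                      # Step 1
            return Sigma[0], num_queries
        # Steps 2-3: predict each edge's direction from average ranks.
        avg = {v: sum(s.index(v) for s in Sigma) / len(Sigma) for v in range(n)}
        predicted = {e: avg[e[0]] <= avg[e[1]] for e in edges}
        # Step 4: query a random edges.
        mispredicted = False
        for e in random.sample(edges, min(a, len(edges))):
            if e not in Eprime:
                num_queries += 1
                Eprime[e] = query(*e)
            if Eprime[e] != predicted[e]:
                mispredicted = True
        if not mispredicted:                     # Step 5 (stand-in for Theorem 19)
            for e in edges:
                if e not in Eprime:
                    num_queries += 1
                    Eprime[e] = query(*e)
            return compatible()[0], num_queries

# Demo: a planted Hamiltonian path plus two extra edges.
random.seed(1)
true = [3, 0, 5, 1, 4, 2]
rank = {v: i for i, v in enumerate(true)}
edges = sorted({tuple(sorted((true[i], true[i + 1]))) for i in range(5)} | {(0, 2), (1, 3)})
order, q = sparse_generalized_sort(6, edges, lambda u, v: rank[u] < rank[v], a=2)
assert list(order) == true and q <= len(edges)
```

Once every edge is queried, the planted Hamiltonian path is fully directed, so exactly one compatible permutation remains; that is why the stand-in for Step 5 terminates correctly.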

Theorem 20.

SparseGeneralizedSort makes at most O(mnlgn)O(\sqrt{mn}\lg n) queries and successfully identifies the true order of the vertices with high probability.

Proof.

To analyze the algorithm, we must bound the number of queries that it performs, and we must also bound the probability that it fails. Note that there are two ways the algorithm can terminate: either by achieving |Σ(G,E′)| = 1 in Step 1, or by invoking the algorithm of Theorem 19 in Step 5. (In the latter case, our algorithm terminates regardless of whether the algorithm from Theorem 19 successfully finds the true order.)

First, we bound the number of queries. The only queries we make are the a random queries in each iteration of the loop, plus the O(n lg n + w) queries made by a single application of Theorem 19. Hence, if we repeat the procedure at most k times, we make O(k·a + n lg n + w) total queries.

Now, each time we repeat the procedure, it is because we found an edge directed oppositely to our approximation Ẽ; that is, we found an edge u → v such that S_{E′,G}(u) ≥ S_{E′,G}(v). Then, by Theorem 18, |Σ(G, E′ ∪ (u,v))| ≤ (1 − 1/e)·|Σ(G,E′)|. Thus, each iteration of the procedure decreases the number of compatible permutations by at least a factor of 1 − 1/e. Since we start with n! compatible permutations, after k iterations we have at most n!·(1 − 1/e)^k remaining permutations. So after k = O(n lg n) iterations, we are guaranteed to be down to a single permutation, ensuring that Step 1 terminates the procedure. Therefore, we make at most O(n lg n·a + w) = O(√(mn)·lg n) total queries.

Finally, we show that the algorithm succeeds with high probability. There are two ways it could fail: first, Step 5 might invoke the algorithm from Theorem 19 even though there are more than w mispredicted edges in Ẽ; second, Step 5 might correctly invoke the algorithm from Theorem 19, but that algorithm fails. By Theorem 19, the latter happens with low probability, so our task is to bound the probability of the first case.

Suppose that, for some given Ẽ, the number of incorrect edges is more than w = √(mn)·lg n. Then the probability of a random edge being incorrect is at least w/m = lg n/√(m/n). So, if we sample a edges, at least one of them will be wrong with probability

$1-\left(1-\frac{\lg n}{\sqrt{m/n}}\right)^{a}\geq 1-e^{-\lg n\cdot a/\sqrt{m/n}}\geq 1-n^{-(1+c)}.$

Therefore, each time we query a random edges from E, with failure probability at most n^{−(1+c)}, we either find an incorrectly directed edge in Ẽ or the number of incorrectly directed edges in Ẽ is at most w. Since we repeat the loop at most O(n lg n) times, the probability that we ever incorrectly invoke the algorithm of Theorem 19 is at most O(n lg n)/n^{1+c} = 1/poly(n). ∎
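The inequality chain above is easy to check numerically in the moderately sparse regime m ≥ n lg² n (a sketch; the choices c = 2 and the particular m/n ratios are arbitrary):

```python
import math

c = 2
for lg_n in (8, 12, 16):
    n = 2 ** lg_n
    for ratio in ((2 * lg_n) ** 2, (4 * lg_n) ** 2):  # ratio = m/n
        a = (1 + c) * math.sqrt(ratio)                # number of sampled edges
        x = lg_n / math.sqrt(ratio)                   # chance one sample is wrong
        assert 0.0 < x < 1.0
        # failure probability of one round of a samples:
        assert (1 - x) ** a <= math.exp(-x * a) <= n ** -(1 + c)
```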

Remark.

Unlike the stochastic sorting algorithm, it is not immediately clear how to perform SparseGeneralizedSort in polynomial time, since the average rank SE,G(v)S_{E^{\prime},G}(v) is difficult to compute. We briefly note, however, that SE,G(v)S_{E^{\prime},G}(v) can be approximated with ±O(1)\pm O(1) error in polynomial time by sampling from the polytope in n\mathbb{R}^{n} with facets created by xixjx_{i}\leq x_{j} if we know that iji\prec j, for each pair (i,j)(i,j) (see Lemma 3.1 of [11] as well as discussion in [7] for more details). Moreover, these approximations suffice for our purposes, since it is known (see Lemma 2.1 of [11]) that Theorem 18 still holds even if SE,G(u)SE,G(v)O(1)S_{E^{\prime},G}(u)\geq S_{E^{\prime},G}(v)-O(1) (although the modified theorem now has a different constant than (11e)\left(1-\frac{1}{e}\right)).

Acknowledgments

S. Narayanan is funded by an NSF GRFP Fellowship and a Simons Investigator Award. W. Kuszmaul is funded by a Fannie & John Hertz Foundation Fellowship; by an NSF GRFP Fellowship; and by the United States Air Force Research Laboratory and the United States Air Force Artificial Intelligence Accelerator and was accomplished under Cooperative Agreement Number FA8750-19-2-1000. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the United States Air Force or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein.

The authors would like to thank Nicole Wein for several helpful conversations during the start of this project.

References

  • [1] N. Alon, M. Blum, A. Fiat, S. Kannan, M. Naor, and R. Ostrovsky. Matching nuts and bolts. In 5th Annual Symposium on Discrete Algorithms, SODA, pages 690–696. ACM/SIAM, 1994.
  • [2] N. Alon, P. G. Bradford, and R. Fleischer. Matching nuts and bolts faster. Inf. Process. Lett., 59(3):123–127, 1996.
  • [3] S. Angelov, K. Kunal, and A. McGregor. Sorting and selection with random costs. In Theoretical Informatics, 8th Latin American Symposium, LATIN, volume 4957 of Lecture Notes in Computer Science, pages 48–59. Springer, 2008.
  • [4] I. Banerjee and D. S. Richards. Sorting under forbidden comparisons. In 15th Scandinavian Symposium and Workshops on Algorithm Theory, SWAT, volume 53 of LIPIcs, pages 22:1–22:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2016.
  • [5] A. Biswas, V. Jayapaul, and V. Raman. Improved bounds for poset sorting in the forbidden-comparison regime. In Algorithms and Discrete Applied Mathematics - Third International Conference, CALDAM, volume 10156 of Lecture Notes in Computer Science, pages 50–59. Springer, 2017.
  • [6] P. G. Bradford. Matching nuts and bolts optimally. Technical Report MPI-I-95-1-025, 1995.
  • [7] M. E. Dyer, A. M. Frieze, and R. Kannan. A random polynomial time algorithm for approximating the volume of convex bodies. J. ACM, 38(1):1–17, 1991.
  • [8] R. Glebov and M. Krivelevich. On the number of hamilton cycles in sparse random graphs. SIAM J. Discret. Math., 27(1):27–42, 2013.
  • [9] A. Gupta and A. Kumar. Sorting and selection with structured costs. In 42nd Annual Symposium on Foundations of Computer Science, FOCS, pages 416–425. IEEE Computer Society, 2001.
  • [10] A. Gupta and A. Kumar. Where’s the winner? max-finding and sorting with metric costs. In 8th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX, volume 3624 of Lecture Notes in Computer Science, pages 74–85. Springer, 2005.
  • [11] Z. Huang, S. Kannan, and S. Khanna. Algorithms for the generalized sorting problem. In 52nd Annual Symposium on Foundations of Computer Science, FOCS, pages 738–747. IEEE Computer Society, 2011.
  • [12] S. Kannan and S. Khanna. Selection with monotone comparison costs. In 14th Annual Symposium on Discrete Algorithms, SODA, volume 12, pages 10–17. ACM-SIAM, 2003.
  • [13] J. Komlós, Y. Ma, and E. Szemerédi. Matching nuts and bolts in O(nlogn)O(n\log n) time. SIAM J. Discret. Math., 11(3):347–372, 1998.
  • [14] P. Lu, X. Ren, E. Sun, and Y. Zhang. Generalized sorting with predictions. In 4th Symposium on Simplicity in Algorithms, SOSA, pages 111–117. SIAM, 2021.

Appendix A Pseudocode

In this appendix, we provide pseudocode for both of our algorithms. Algorithms 1, 2, 3, 4, and 5 comprise the StochasticSort algorithm, with Algorithm 5 as the main routine. Algorithm 6 implements the SparseGeneralizedSort algorithm.

As notation, we use Query(u, v) to denote the comparison function that takes two vertices u and v connected by an edge, and returns 1 if u ≺ v and 0 if v ≺ u.
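As an illustration, this oracle can be modeled as a closure over a hidden total order (the helper name `make_query` is ours, not from the paper; a real instance would also restrict calls to edges of G):

```python
def make_query(true_order):
    """Build a comparison oracle from a hidden total order.

    true_order lists the vertices earliest-first. The returned
    query(u, v) gives 1 if u precedes v and 0 if v precedes u.
    """
    rank = {v: i for i, v in enumerate(true_order)}

    def query(u, v):
        return 1 if rank[u] < rank[v] else 0

    return query

# Hidden order: c precedes a precedes b.
query = make_query(["c", "a", "b"])
```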

Algorithm 1
1:procedure Findx1 ▷ Find the first vertex x_1
2:     v_0 ← NULL, S ← ∅
3:     Pick an arbitrary edge (u, v)
4:     if Query(u, v) = 1 then ▷ Means that u comes before v
5:         v_0 ← u, S ← {v}
6:     else
7:         v_0 ← v, S ← {u}
8:     while ∃ u such that (u, v_0) ∈ E and u ∉ S do
9:         if Query(u, v_0) = 1 then
10:              S ← S ∪ {v_0}
11:              v_0 ← u
12:         else
13:              S ← S ∪ {u}
14:     Return v_0
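A minimal Python sketch of this procedure, assuming the graph is given as an adjacency map and `query` is the comparison oracle (both names are our assumptions). Every vertex placed in S is known to come after the current minimum v_0, and v_0 only decreases, so when all of v_0's neighbors lie in S, the promised Hamiltonian path forces v_0 = x_1:

```python
def find_x1(adj, query):
    """Find the globally smallest vertex x_1.

    adj maps each vertex to an iterable of its neighbors;
    query(u, v) returns 1 iff u precedes v in the true order.
    """
    # Initialize v_0 and S from an arbitrary edge.
    u = next(iter(adj))
    v = next(iter(adj[u]))
    if query(u, v) == 1:
        v0, S = u, {v}
    else:
        v0, S = v, {u}
    # While v_0 has a neighbor outside S, compare against it.
    while True:
        candidates = [w for w in adj[v0] if w not in S]
        if not candidates:
            return v0
        u = candidates[0]
        if query(u, v0) == 1:  # u precedes v_0: u is the new minimum
            S.add(v0)
            v0 = u
        else:                  # v_0 precedes u: discard u
            S.add(u)
```

Each loop iteration adds one vertex to S, so the walk makes at most n comparisons before terminating.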
Algorithm 2
1:procedure CreateLevel(i) ▷ Create level L_i and all lower levels
2:     L_i ← L_{i+1}
3:     for each v in L_{i+1} do
4:         elim[v] ← NULL ▷ Recreating levels, so v may become unblocked
5:         for each u in L_{i+c} do
6:              if (u, v) ∈ E_i then
7:                  if Query(u, v) = 1 then ▷ Means that u comes before v
8:                       L_i ← L_i ∖ {v} ▷ Remove v from L_i, since it is blocked by u
9:                       elim[v] ← u ▷ Keep track of which vertex blocked v
10:     if i > 1 then
11:         CreateLevel(i−1) ▷ Recursively do lower levels
Algorithm 3
1:procedure Increment(ℓ) ▷ Do after the current lowest vertex x_ℓ is found
2:     for i = 1 to q+c do
3:         L_i ← L_i ∖ {x_ℓ} ▷ Remove x_ℓ from all levels
4:     for each v such that elim[v] = x_ℓ do
5:         elim[v] ← NULL ▷ x_ℓ is discovered, so v is no longer blocked by x_ℓ
6:         Let i be the lowest level containing v
7:         while i > 1 do
8:              for each u in L_{i+c} do
9:                  if (u, v) ∈ E_i then
10:                       if Query(u, v) = 1 then ▷ Means that u comes before v
11:                           elim[v] ← u ▷ Keep track of which vertex eliminated v
12:                           break ▷ End procedure for v
13:              i ← i − 1 ▷ Decrement level of v if v isn’t blocked by any u
14:              L_{i−1} ← L_{i−1} ∪ {v} ▷ Add v to level L_{i−1}
Algorithm 4
1:procedure Find(ℓ) ▷ Find the next vertex in the order, x_ℓ
2:     S ← ∅ ▷ S is the set of candidates that could be x_ℓ
3:     for each v ∉ {x_1, …, x_{ℓ−1}} do
4:         if v ∈ L_1 and (x_{ℓ−1}, v) ∈ E then
5:              S ← S ∪ {v} ▷ Add v to the set of candidates
6:     for i = 1 to q do
7:         for each v in S do
8:              for each u in L_i do ▷ Check if something in the i-th level has an edge to v
9:                  if (u, v) ∈ E then
10:                       if Query(u, v) = 1 then
11:                           S ← S ∖ {v} ▷ v is no longer a candidate
12:                  if |S| = 1 then ▷ Once we’ve reached a single candidate
13:                       Return the only element in S, and break
Algorithm 5
1:procedure StochasticSort(G = (V, E)) ▷ Recover x_1, …, x_{n/2}; n and p are known
2:     Initialize all levels L_1, …, L_{q+c} as [n]
3:     Initialize elim[v] ← NULL for all v ∈ [n]
4:     CreateLevel(q)
5:     for ℓ = 1 to n/2 do ▷ Recover x_ℓ in order
6:         if ℓ = 1 then
7:              x_ℓ ← Findx1 ▷ Use a different procedure for finding the first vertex
8:         else
9:              x_ℓ ← Find(ℓ)
10:         Increment(ℓ)
11:         if ℓ = ⌊ℓ′ · p^{−1}/16⌋ for some integer ℓ′ then
12:              CreateLevel(1 + ν_2(ℓ′)) ▷ ν_2(ℓ′): the exponent of the largest power of 2 dividing ℓ′
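The rebuild schedule above invokes CreateLevel(1 + ν_2(ℓ′)), where ν_2 is the 2-adic valuation; a one-line Python helper (the name `nu2` is our choice) computes it with a standard bit trick, since x & −x isolates the lowest set bit:

```python
def nu2(x):
    """nu_2(x): the exponent of the largest power of 2 dividing x (x >= 1)."""
    return (x & -x).bit_length() - 1
```

For example, nu2(12) = 2 because 12 = 4 · 3, so every third rebuild reaches one level deeper, every fourth two levels deeper, and so on.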
Algorithm 6
1:procedure SparseGeneralizedSort(G = (V, E)) ▷ Recover the sorted order for an arbitrary graph; n = |V|, m = |E|
2:     a ← 2√(m/n), w ← √(m/n) · lg n
3:     E′ ← ∅ ▷ E′ is the current set of queried edges, initially empty
4:     while |Σ(G, E′)| ≥ 2, where Σ(G, E′) is the set of permutations that do not violate any of the edge orders in E′, do
5:         for each vertex v do
6:              S_{E′,G}(v) := average rank of v among all permutations in Σ(G, E′)
7:         Ẽ ← ∅ ▷ Ẽ will be the predicted edge orders
8:         for each edge (u, v) ∈ E do
9:              if S_{E′,G}(u) ≤ S_{E′,G}(v) then
10:                  Ẽ ← Ẽ ∪ {(u, v)}
11:              else
12:                  Ẽ ← Ẽ ∪ {(v, u)}
13:         boolean order ← TRUE ▷ Checks whether the a random edges are predicted correctly by Ẽ
14:         for i from 1 to a do
15:              (u_i, v_i) ∼ E uniformly at random ▷ WLOG (u_i, v_i) is ordered so that (u_i, v_i) ∈ Ẽ
16:              if Query(u_i, v_i) = 0 then ▷ Means that v_i ≺ u_i in the true order
17:                  E′ ← E′ ∪ {(v_i, u_i)}
18:                  order ← FALSE
19:         if order = TRUE then
20:              Return PredictionSort(G, Ẽ, w)
21:     Return the unique permutation in Σ(G, E′)
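For concreteness, the quantities Σ(G, E′) and S_{E′,G}(v) can be computed by brute force over all n! permutations; this is only an illustrative sketch for tiny instances (the function names are ours), not how the algorithm would realize these steps efficiently:

```python
from itertools import permutations

def consistent_permutations(vertices, oriented_edges):
    """Sigma(G, E'): all total orders of `vertices` consistent with
    the queried orientations; each pair (u, v) means u precedes v."""
    out = []
    for perm in permutations(vertices):
        rank = {v: i for i, v in enumerate(perm)}
        if all(rank[u] < rank[v] for u, v in oriented_edges):
            out.append(perm)
    return out

def average_rank(vertices, oriented_edges):
    """S_{E',G}(v): the average position of v over Sigma(G, E')."""
    sigma = consistent_permutations(vertices, oriented_edges)
    return {
        v: sum(perm.index(v) for perm in sigma) / len(sigma)
        for v in vertices
    }

# Example: with only a < b queried, c is unconstrained,
# so Sigma contains the 3 permutations placing a before b.
ranks = average_rank(["a", "b", "c"], [("a", "b")])
```

The predicted orientation Ẽ then simply orders each edge by these average ranks.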