Technical Report:
Dealing with Undependable Workers
in Decentralized Network Supercomputing
This work is supported in part by the NSF award 1017232.
Abstract
Internet supercomputing is an approach to solving partitionable, computation-intensive problems by harnessing the power of a vast number of interconnected computers. This paper presents a new algorithm for the problem of using network supercomputing to perform a large collection of independent tasks, while dealing with undependable processors. The adversary may cause the processors to return bogus results for tasks with certain probabilities, and may cause a subset of the initial set of processors to crash. The adversary is constrained in two ways. First, for the set of non-crashed processors , the average probability of a processor returning a bogus result is inferior to . Second, the adversary may crash a subset of processors , provided the size of is bounded from below. We consider two models: the first bounds the size of by a fractional polynomial, the second bounds this size by a poly-logarithm. Both models yield adversaries that are much stronger than previously studied. Our randomized synchronous algorithm is formulated for processors and tasks, with , where, depending on the number of crashes, each live processor is able to terminate dynamically with the knowledge that the problem is solved with high probability. For the adversary constrained by a fractional polynomial, the round complexity of the algorithm is , its work is , and its message complexity is . For the poly-log constrained adversary, the round complexity is , work is , and message complexity is . All bounds are shown to hold with high probability.
1 Introduction
Cooperative network supercomputing is becoming increasingly popular for harnessing the power of the global Internet computing platform. A typical Internet supercomputer, e.g., [1, 2], consists of a master computer and a large number of computers called workers, performing computation on behalf of the master. Despite the simplicity and benefits of a single master approach, as the scale of such computing environments grows, it becomes unrealistic to assume the existence of an infallible master that is able to coordinate the activities of multitudes of workers. Large-scale distributed systems are inherently dynamic and are subject to perturbations, such as failures of computers and network links; thus it is also necessary to consider fully distributed peer-to-peer solutions.
One could address the single point of failure issue by providing redundant multiple masters, yet this would remain a centralized scheme that is not suitable for big data processing that involves a large amount of input and output data. For example, consider applications in molecular biology that require large reference databases of gene models or annotated protein sequences, and large sets of unknown protein sequences [15]. Dealing with such voluminous data requires a large scale platform providing the necessary computational power and storage.
Therefore, a more scalable approach is to use a decentralized system, where the input is distributed and, once the processing is complete, the output is distributed across multiple nodes. Interestingly, computers returning bogus results is a phenomenon of increasing concern. While this may occur unintentionally, e.g., as a result of over-clocked processors, workers may in fact wrongly claim to have performed assigned work so as to obtain incentives associated with the system, e.g., higher rank. To address this problem, several works, e.g., [5, 10, 11, 18], study approaches based on a reliable master coordinating unreliable workers. The drawback in these approaches is the reliance on a reliable, bandwidth-unlimited master processor.
In our recent work [6, 7] we began to address this drawback of centralized systems by removing the assumption of an infallible and bandwidth-unlimited master processor. We introduced a decentralized approach, where a collection of worker processors cooperates on a large set of independent tasks without the reliance on a centralized control. Our prior algorithm is able to perform all tasks with high probability (whp), while dealing with undependable processors under an assumption that the average probability of live (non-crashed) processors returning incorrect results remains inferior to during the computation. There the adversary is only allowed to crash a constant fraction of processors, and the correct termination of the -processor algorithm strongly depends on the availability of live processors.
The goal of this work is to develop a new -processor algorithm that is able to deal with much stronger adversaries, e.g., those that can crash all but a fractional polynomial in , or even a poly-log in , number of processors, while still remaining in the synchronous setting with reliable communication. One of the challenges here is to enable an algorithm to terminate efficiently in the presence of any allowable number of crashes. Of course, to be interesting, such a solution must be efficient in terms of its work and communication complexities.
Contributions. We consider the problem of performing tasks in a distributed system of workers without centralized control. The tasks are independent, they admit at-least-once execution semantics, and each task can be performed by any worker in constant time. We assume that tasks can be obtained from some repository (else we can assume that the tasks are initially known). The fully-connected message-passing system is synchronous and reliable. We deal with failure models where crash-prone workers can return incorrect results. We present a randomized decentralized algorithm and analyze it for two different adversaries of increasing strength: constrained by a fractional polynomial, and poly-log constrained. In each of these settings, we assume that at any point of the computation live processors return bogus results with the average probability inferior to . In more detail our contributions are as follows.
1. Given the initial set of processors , with , we formulate two adversarial models, where the adversary can crash a set of processors, subject to the model constraints:
- For the first adversary, constrained by a fractional polynomial, we have , for a constant .
- For the second, poly-log constrained model, we have , for a constant .
In both models the adversary may assign arbitrary constant probability to processors, provided that processors in return bogus results with the average probability inferior to . The adversary is additionally constrained, so that the average probability of returning bogus results for processors in must remain inferior to .
2. We present a randomized algorithm for processors and tasks that works in synchronous rounds, where each processor performs a random task and shares its cumulative knowledge of results with one randomly chosen processor. Each processor starts as a “worker,” and once a processor accumulates a “sufficient” number of results, it becomes “enlightened.” Enlightened processors then “profess” their knowledge by multicasting it to exponentially growing random subsets of processors. When a processor receives a “sufficient” number of such messages, it halts. We note that workers become enlightened without any synchronization, and using only the local knowledge. The values that control “sufficient” numbers of results and messages are established in our analysis and are used as compile-time constants.
We consider the protocol, by which the “enlightened” processors “profess” their knowledge and reach termination, to be of independent interest. The protocol’s message complexity does not depend on crashes, and the processors can terminate without explicit coordination. This addresses one of the challenges associated with termination when can vary broadly in both models.
3. We analyze the quality and performance of the algorithm for the two adversarial models. For each model we show that all live workers obtain the results of all tasks whp, and that these results are correct whp. Complexity results for the algorithm also hold whp:
- For the polynomially constrained adversary we show that the algorithm has work complexity and message complexity .
- For the poly-log constrained adversary we show that the algorithm has work complexity and message complexity , for any . For this model we note that trivial solutions with all workers doing all tasks may be work-efficient, but they do not guarantee that the results are correct.
Prior work. Earlier approaches explored ways of improving the quality of the results obtained from untrusted workers in the settings where a bandwidth-unlimited and infallible master is coordinating the workers. Fernandez et al. [11, 10] and Konwar et al. [18] consider a distributed master-worker system where the workers may act maliciously by returning wrong results. Works [11, 10, 18] design algorithms that help the master determine correct results whp, while minimizing work. The failure models assume that some fraction of processors can exhibit faulty behavior. Another recent work by Christoforou et al. [5] pursues a game-theoretic approach. Paquette and Pelc [21] consider a model of a fault-prone system in which a decision has to be made on the basis of unreliable information and design a deterministic strategy that leads to a correct decision whp.
As already mentioned, our prior work [6] introduced the decentralized approach that eliminates the master, and provided a synchronous algorithm that is able to perform all tasks whp. That algorithm requires live processors to terminate correctly. Our new algorithm uses a similar approach to performing tasks; however, it takes a completely different approach to termination that enables it to tolerate a much broader spectrum of crashes. The approach uses the new notion of “enlightened” processors that, having acquired sufficient knowledge, “profess” this knowledge to other processors, ultimately leading to termination. The behavior of the two algorithms is similar while processors remain in the computation, and we use the results from our prior analysis for this case.
A related problem, called Do-All, deals with the setting where a set of processors must perform a collection of tasks in the presence of adversity [12, 16]. For Do-All the termination condition is that all tasks must be performed and at least one processor is aware of that fact. The problem in this paper is different in that each non-crashed processor must learn the results of all tasks. Additionally, the failure model in our problem allows processors to return incorrect results, and our solution requires that each task is performed a certain minimum number of times so that the correct result can be discerned, whereas Do-All algorithms only guarantee that each task is performed at least once. Thus major changes are required to adapt a solution for Do-All to our setting. Do-All, being a key problem in the study of cooperative distributed computation, was considered in a variety of models, including message-passing [9, 22] and shared-memory models [17, 19]. Chlebus et al. [4] study the Do-All problem in the synchronous setting, considering work (total number of steps taken) and communication (total number of point-to-point messages) as equivalent, i.e., they consider the complexity work + communication as the cost metric. They derive upper bounds for the bounded adversary, upper and lower bounds for bounded adversary, and almost matching upper and lower bounds for the linearly-bounded adversary.
Another related problem is the Omni-Do problem [8, 14, 13]. Here the problem is to perform all tasks in a network that is prone to fragmentations (partitions), thus here the results also must be known to all processors. However the failure models are quite different (network fragmentation and merges), and so is the analysis for these models. For linearly-bounded weakly-adaptive adversaries Chlebus and Kowalski [3] give a very efficient randomized algorithm for with work and communication complexities .
Probabilistic quantification of trustworthiness of participants is also used in distributed reputation mechanisms. Yu et al. [23] propose a probabilistic model for distributed reputation management, where agents, i.e., participants, keep ratings of trustworthiness on one another, and update the values, using referrals and testimonies, through interactions.
2 Model of Computation and Definitions
System model. There are processors, each with a unique identifier (id) from set . We refer to the processor with id as processor . The system is synchronous and processors communicate by exchanging reliable messages. Computation is structured in terms of synchronous rounds, where in each round a processor performs three steps: send, receive, and compute. In these steps, respectively, processors can send and receive messages, and perform local polynomial computation, where the local computation time is assumed to be negligible compared to message latency. Messages received by a processor in a given step include all messages sent to it in the previous step. The duration of each round depends on the algorithm.
Tasks. There are tasks to be performed, each with a unique id from set . We refer to the task with id as . The tasks are (a) similar, meaning that any task can be done in constant time by any processor, (b) independent, meaning that each task can be performed independently of other tasks, and (c) idempotent, meaning that each task admits at-least-once execution semantics and can be performed concurrently. For simplicity, we assume that the outcome of each task is a binary value. The problem is most interesting when there are at least as many tasks as there are processors, thus we consider .
Models of Adversity. Processors are undependable in that a processor may compute the results of tasks incorrectly and it may crash. A processor can crash at any moment during the computation; following a crash, a processor performs no further actions.
Otherwise, each processor adheres to the protocol established by the algorithm it executes. We refer to non-crashed processors as live. We consider an oblivious adversary that decides prior to the computation what processors to crash and when to crash them. The maximum number of processors that can crash is established by the adversarial models (specified below).
For each processor , we define to be the probability of processor returning incorrect results, independently of other processors, such that . That is, the average probability of processors in returning incorrect results is inferior to 1/2. We use the constant to ensure that the average probability of incorrect computation does not become arbitrarily close to 1/2 as grows arbitrarily large. The individual probabilities of incorrect computation are unknown to the processors.
For an execution of an algorithm, let be the set of processors that the adversary crashes. The adversary is constrained in that the average probability of processors in computing results incorrectly remains inferior to 1/2. We define two adversarial models:
- Model , adversary constrained by a fractional polynomial: , for a constant .
- Model , poly-log constrained adversary: , for a constant .
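To make these constraints concrete, the following Python sketch (an illustration, not part of the algorithm) checks whether a crash set and an assignment of error probabilities are admissible under the two models. The exponent epsilon, the constant c, and the 1/2 bound on the average error probability are hypothetical stand-ins for the elided model parameters; the 1/2 bound reflects the plurality-vote requirement used in the analysis.

```python
import math

def admissible(n, crashed, error_prob, model="poly", epsilon=0.5, c=2):
    """Check an adversary configuration against the two crash models.

    n          -- initial number of processors (ids 0..n-1)
    crashed    -- set of ids the adversary crashes during the execution
    error_prob -- error_prob[i] is the probability that processor i returns a bogus result
    epsilon, c -- hypothetical stand-ins for the elided model constants
    """
    live = [i for i in range(n) if i not in crashed]
    if model == "poly":          # constrained by a fractional polynomial: at least n**epsilon survivors
        enough_left = len(live) >= n ** epsilon
    else:                        # poly-log constrained: at least (log n)**c survivors
        enough_left = len(live) >= math.log(n) ** c
    # The average error probability over the non-crashed processors must stay below 1/2,
    # so that the plurality vote in the Compute stage identifies correct results whp.
    avg_ok = sum(error_prob[i] for i in live) / len(live) < 0.5
    return enough_left and avg_ok

# Example: 1000 processors, all but the first 40 crash, each live processor errs with probability 0.3.
print(admissible(1000, set(range(40, 1000)), {i: 0.3 for i in range(1000)}))
```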
Measures of efficiency. We assess the efficiency of algorithms in terms of time, work, and message complexities. We use the conventional measures of time complexity and work complexity. Message complexity assesses the number of point-to-point messages sent during the execution of an algorithm. Lastly, we use the common definition of an event occurring with high probability (whp) to mean that the event occurs with probability at least 1 - 1/n^c for some constant c > 0.
3 Algorithm Description
We now present our decentralized solution, called algorithm daks (for Decentralized Algorithm with Knowledge Sharing), that employs no master and instead uses a gossip-based approach. We start by specifying in detail the algorithm for n processors and n tasks, then we generalize it for t tasks, where t ≥ n.
The algorithm is structured in terms of a main loop. The principal data structures at each processor are two arrays of size linear in : one accumulates knowledge gathered from the processors, and another stores the results. All processors start as workers. In each iteration, any worker performs one randomly selected task and sends its knowledge to just one other randomly selected processor. When a worker obtains “enough” knowledge about the tasks performed in the system, it computes the final results, stops being a worker, and becomes “enlightened.” Such processors no longer perform tasks, and instead “profess” their knowledge to other processors by means of multicasts to exponentially increasing random sets of processors. The main loop terminates when a certain number of messages is received from enlightened processors. The pseudocode for algorithm daks is given in Figure 1. We now give the details.
Local knowledge and state. The algorithm is parameterized by , the number of processors and tasks, and by compile-time constants and that are discussed later (they emerge from the analysis). Every processor maintains the following:
• Array of results , where element , for , is a set of results for . Each is a set of triples , where is the result computed for by processor in round (here the inclusion of ensures that the results computed by processor in different rounds are preserved).
• The array stores the final results.
• The stores the number of messages received from enlightened processors.
• is the round (iteration) number that is used by workers to timestamp the computed results.
• is the exponent that controls the number of messages multicast by enlightened processors.
Control flow. The algorithm iterations are controlled by the main while-loop, and we use the term round to refer to a single iteration of the loop. The loop contains three stages, viz., Send, Receive, and Compute.
Processors communicate using messages that contain pairs . Here is the sender’s array of results. When a processor is a worker, it sends messages of type share. When a processor becomes enlightened, it sends messages of type profess. The loop is controlled by the counter that keeps track of the received messages of type profess. We next describe the stages in detail.
- Send stage: Any worker chooses a target processor at random and sends its array of results to processor in a share message. Any enlightened processor chooses a set of processors at random and sends the array of results to processors in in a profess message. The size of the set is , where initially , and once a processor is enlightened, it increments by in every round. (Strictly speaking, is a multiset, because the random selection is with replacement. However this is done only for the purpose of the analysis, and can be safely treated as a set for the purpose of sending profess messages.)
- Receive stage: Processor receives messages (if any) sent to it in the preceding Send stage. The processor increments its by the number of profess messages received. For each task , the processor updates its by including the results received in all messages.
- Compute stage: Any worker randomly selects task , computes the result , and adds the triple for round to . For each task the worker checks whether “enough” results were collected. Once at least results for each task are obtained, the worker stores the final results in by taking the plurality of results for each task, and becomes enlightened. (In Section 4 we reason about the compile-time constant , and establish that results are sufficient for our claims.) Enlightened processors rest on their laurels in subsequent Compute stages.
Reaching Termination. We note that a processor must become enlightened before it can terminate. Processors can become enlightened at different times and without any synchronization. Once enlightened, they profess their knowledge by multicasting it to exponentially growing random subsets of processors. When a processor receives sufficiently many such messages, i.e., , it halts, again without any synchronization, and using only the local knowledge. We consider this protocol to be of independent interest. In Section 4 we reason about the compile-time constant , and establish that profess messages are sufficient for our claims; additionally we show that the protocol’s efficiency can be assessed independently of the number of crashes.
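To make the control flow concrete, the following is a minimal Python sketch of one processor's behavior per round. It is not the authors' Figure 1: the result and profess thresholds are placeholders for the elided compile-time constants, the profess multicast size is assumed to double each round after enlightenment, and the task computation and message delivery are simulated.

```python
import random
from collections import Counter

RESULT_THRESHOLD = 8    # placeholder for the elided per-task result threshold
PROFESS_THRESHOLD = 8   # placeholder for the elided profess-message threshold

class Processor:
    def __init__(self, pid, n):
        self.pid = pid
        self.n = n
        self.results = {j: set() for j in range(n)}   # task id -> set of (value, pid, round) triples
        self.final = {}                               # task id -> plurality result, once enlightened
        self.profess_count = 0                        # profess messages received so far
        self.round = 0
        self.exponent = 0                             # controls the profess multicast size
        self.enlightened = False
        self.halted = False

    def send(self):
        """Send stage: a worker shares with one random processor; an enlightened
        processor professes to an exponentially growing random multiset."""
        if self.halted:
            return []
        if not self.enlightened:
            return [(random.randrange(self.n), "share", self.results)]
        targets = [random.randrange(self.n) for _ in range(2 ** self.exponent)]  # chosen with replacement
        self.exponent += 1
        return [(t, "profess", self.results) for t in targets]

    def receive(self, messages):
        """Receive stage: merge incoming knowledge and count profess messages."""
        if self.halted:
            return
        for kind, remote_results in messages:
            if kind == "profess":
                self.profess_count += 1
            for task, triples in remote_results.items():
                self.results[task] |= triples
        if self.profess_count >= PROFESS_THRESHOLD:
            self.halted = True                        # terminate using only local knowledge

    def compute(self):
        """Compute stage: a worker performs one random task; with enough results
        for every task it takes the plurality and becomes enlightened."""
        self.round += 1
        if self.halted or self.enlightened:
            return
        task = random.randrange(self.n)
        value = random.randint(0, 1)                  # stand-in for computing the (binary) task
        self.results[task].add((value, self.pid, self.round))
        if all(len(tr) >= RESULT_THRESHOLD for tr in self.results.values()):
            for j, triples in self.results.items():
                self.final[j] = Counter(v for v, _, _ in triples).most_common(1)[0][0]
            self.enlightened = True
```

A driver would deliver each (target, type, results) message produced in a Send stage to the target's receive() at the start of the next round; in the actual algorithm the knowledge is, of course, sent by value rather than by reference.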
Extending the Algorithm for t ≥ n. We now show how to modify the algorithm to handle an arbitrary number of tasks t such that t ≥ n. Let T be the set of unique task identifiers, where |T| = t. We segment the tasks into chunks of ⌈t/n⌉ tasks, and construct a new array of chunk-tasks with identifiers in , where each chunk-task takes O(t/n) time to perform by any live processor. We now use algorithm daks, where the only difference is that each Compute stage takes O(t/n) time to perform a chunk-task. In the sequel, we use daks as the name of the algorithm when t = n, and we use dakst,n as the name of the algorithm when t ≥ n.
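A possible reading of the chunking step, assuming chunks of ⌈t/n⌉ task identifiers (consistent with the later accounting that multiplies time and work by the chunk size), is sketched below; the function name is illustrative only.

```python
import math

def make_chunk_tasks(task_ids, n):
    """Group t >= n task ids into at most n chunk-tasks of ceil(t/n) ids each.
    daks then treats each chunk-task as a single task whose Compute stage
    takes time proportional to the chunk size."""
    size = math.ceil(len(task_ids) / n)
    return [task_ids[i:i + size] for i in range(0, len(task_ids), size)]

print(make_chunk_tasks(list(range(10)), 3))   # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```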
4 Algorithm Analysis
We present the performance analysis of algorithm daks in the two adversarial failure models. We first present the analysis that deals with the case when , then extend the results to the general case with for algorithm dakst,n.
4.1 Foundational Lemmas
We proceed by giving lemmas relevant to both adversarial models, starting with the statement of the well known Chernoff bound.
Lemma 1 (Chernoff Bounds)
Let be independent Bernoulli random variables with and , then it holds for and that for all , (i) , and (ii) .
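For reference, a standard multiplicative form of these bounds (the constants in the authors' exact statement may differ) is the following: for independent Bernoulli random variables $X_1,\dots,X_m$, with $X=\sum_{i=1}^m X_i$ and $\mu=\mathbb{E}[X]$, and any $0<\delta<1$,
\[
\text{(i)}\;\; \Pr[X \ge (1+\delta)\mu] \le e^{-\delta^2\mu/3},
\qquad
\text{(ii)}\;\; \Pr[X \le (1-\delta)\mu] \le e^{-\delta^2\mu/2}.
\]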
Now we show that if profess messages are sent by the enlightened processors, then the algorithm terminates whp in one round.
Lemma 2
Let be the first round by which the total number of profess messages is . Then by the end of this round every live processor halts whp.
Proof. Let be the number of messages sent by round , where is a sufficiently large constant. We show that whp every live processor receives at least messages, for some constant . Let us assume that there exists processor that receives less than of such messages. We prove that whp such a processor does not exist.
Since messages are sent by round , there were random selections of processors from set in line 15 of algorithm daks on page 1, possibly by different enlightened processors. We denote by an index of one of the random selections in line 15. Let be a Bernoulli random variable such that if processor was chosen by an enlightened processor and otherwise.
We define a random variable to estimate the total number of times processor is selected by round . In line 15 every enlightened processor chooses a set of destinations for the message uniformly at random, and hence . Let , then by applying Chernoff bound, for the same chosen as above, we have:
where for some sufficiently large . We now define to be . Thus, with this , we have for some . Now let us denote by the fact that by the end of round , and let be the complement of that event. By Boole’s inequality we have , where . Hence each processor is the destination of at least messages whp, i.e.,
and hence, it halts (line 10).
We use the constant from the proof of Lemma 2 as a compile-time constant in algorithm daks (Figure 1). The constant is used in the main while loop (line 10) to determine when a sufficient number of profess messages is received from enlightened processors, causing the loop to terminate.
We now show that once a processor that the adversary does not crash becomes enlightened, then, whp, in rounds every other live processor becomes enlightened and halts.
Lemma 3
Once a processor becomes enlightened, every live processor halts in additional rounds whp.
Proof. According to Lemma 2 if messages are sent then every processor halts whp. Given that processor does not crash it takes at most rounds to send messages (per line 15 in Figure 1), regardless of the actions of other processors. Hence, whp every live processor halts in rounds.
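The round count in Lemma 3 can be checked directly under two assumptions that stand in for the elided parameters: the profess multicast size doubles every round after enlightenment, and Lemma 2 requires a total of order $n\log n$ profess messages. A single enlightened processor that does not crash has then sent, after $k$ rounds,
\[
\sum_{j=0}^{k-1} 2^{j} \;=\; 2^{k}-1 \;\ge\; c\,n\log n
\quad\text{as soon as}\quad
k \;\ge\; \log_2\!\bigl(c\,n\log n + 1\bigr) \;=\; O(\log n),
\]
so $O(\log n)$ rounds suffice regardless of the actions of the other processors.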
Next we establish the work and message complexities for algorithm daks for the case when number of crashes is small, specifically when at least a linear number of processors do not crash. As we mentioned in the introduction, while processors remain active in the computation, algorithm daks performs tasks in exactly the same pattern as algorithm in [6] (to avoid a complete restatement, we kindly refer the reader to that earlier paper). This forms the basis for the next lemma.
Lemma 4
Algorithm daks has work and message complexity when processors do not crash.
Proof. Algorithm daks chooses tasks to perform in the same pattern as algorithm in [6], however the two algorithms have very different termination strategies. Theorems 2 and 4 of [6] establish that in the presence of at most crashes, for a constant , the work and message complexities of algorithm are . The termination strategy of algorithm daks is completely different, however, per Lemmas 2 and 3, after at least one processor from is enlightened every live processor halts in rounds whp, having sent messages. Thus, with at least a linear number of processors remaining, the work and message complexities, relative to algorithm , increase by an additive term. The result follows.
We denote by the number of rounds required for a processor from the set to become enlightened. We next analyze the value of for models and .
4.2 Analysis for Model
In model we have . Let be the actual number of crashes that occur prior to round . For the purpose of analysis we divide an execution of the algorithm into two epochs: epoch consists of all rounds where is at most linear in , so that the number of live processors is at least for some suitable constant ; epoch consists of all rounds starting with the first round (it can be round 1) when the number of live processors drops below some and becomes for some suitable constant . Note that either epoch may be empty.
For the small number of crashes in epoch , Lemma 4 gives the worst case work and message complexities as ; the upper bounds apply whether or not the algorithm terminates in this epoch.
Next we consider epoch . If the algorithm terminates in round , the first round of the epoch, the cost remains the same as given by Lemma 4. If it does not terminate, it incurs additional costs associated with the processors in , where . We analyze the costs for epoch in the rest of this section. The final message and work complexities will be at most the worst case complexity for epoch plus the additional costs for epoch incurred while per model .
First we show that whp it will take rounds for a worker from the set to become enlightened in epoch .
Lemma 5
In rounds of epoch every task is performed times whp by processors in .
Proof. If the algorithm terminates within rounds of epoch , then each task is performed times as reasoned earlier. Suppose the algorithm does not terminate (in this case its performance is going to be worse).
Let us assume that after rounds of algorithm daks, where is a sufficiently large constant and is a constant, there exists a task that is performed less than times among all live workers, for some . We prove that whp such a task does not exist.
We define to be such that (the constant will play a role in establishing the value of the compile-time constant of algorithm daks; we come back to this in Section 4). According to the above assumption, at the end of round for some task , we have .
Let us consider all algorithm iterations individually performed by each processor in during the rounds. Let be the total number of such individual iterations. Then . During any such iteration, a processor from selects and performs task in line 24 independently with probability . Let us arbitrarily enumerate said iterations from to . Let be Bernoulli random variables, such that is if task is performed in iteration , and otherwise. We define , the random variable that describes the total number of times task is performed during the rounds by processors in . We define to be . Since , for , where , by linearity of expectation, we obtain . Now by applying Chernoff bound, for the same chosen as above, we have:
where for some sufficiently large . Now let us denote by the fact that by the round of the algorithm and we denote by the complement of that event. Next by Boole’s inequality we have , where . Hence each task is performed at least times by workers in whp, i.e.,
We now focus only on the set of live processors with . Our goal is to show that in rounds of algorithm daks at least one processor from becomes enlightened. In reasoning about Lemmas 6, 7 and 8, that follow, we note that if the algorithm terminates within rounds of epoch , then every processor in is enlightened as reasoned earlier. Suppose the algorithm does not terminate (in focusing on this case we note that the algorithm’s performance is going to be worse).
We first show that any triple generated by a processor in is known to all processors in in rounds of algorithm daks.
We denote by the set of processors that know a certain triple by round , and let . The next lemma shows that by round in epoch we have .
Lemma 6
By round of epoch , whp.
Proof. Consider a scenario where a processor generates a triple . Then the probability that the processor sends the triple to at least one other processor , where , in rounds is at least , for some appropriately chosen and a sufficiently large . Similarly, it is straightforward to show that the number of live processors that learn about doubles every rounds, hence whp after rounds the number of processors in that learn about is .
In the next lemma we reason about the growth of after round .
Lemma 7
Let be the first round after round in epoch such that . Then whp.
Proof. Per model , let constant be such that . We would like to apply the Chernoff bound to approximate the number of processors from that learn about triple by round . According to algorithm daks if a processor learns about triple in some round , then in round processor forwards to some randomly chosen processor (lines 12-13 of the algorithm). Let , where , be a random variable such that if processor receives the triple from some processor , in some round , and otherwise. It is clear that if some processor , where receives triple from processor in round , then random variables and are not independent, and hence, the Chernoff bound cannot be applied. To circumvent this, we consider the rounds between and and partition these rounds into blocks of consecutive rounds. For instance, rounds form the first block, rounds form the second block, etc. The final block may contain less than rounds.
We are interested in estimating the fraction of the processors in that learn about triple at the end of each block.
For the purpose of the analysis we consider another algorithm, called daks′. The difference between algorithms daks and daks′ is that in daks′ a processor does not forward triple in round if was first received in the round that belongs to the same block as does. This allows us to apply Chernoff bound (with negative dependencies) to approximate the number of processors in that learn about triple in a block. We let be the subset of processors in that are aware of triple by round in algorithm daks′, and we let . Note, that since in daks′ triple is forwarded less often than in daks, it follows that the number of processors from that learn about in daks is at least as large as the number of processors from that learn about in daks′, and, in particular, , for any . This allows us to consider algorithm daks′ instead of daks for assessing the number of processors from that learn about by round , and we do this by having serve as a lower bound for .
Let , where , be a random variable, s.t. if processor receives the triple from some processor in a block that starts with round , e.g., for the first block and otherwise. Let us next define the random variable to count the number of processors in that received triple in the block that starts with round .
Next, we calculate , the expected number of processors in that learn about triple at the end of the block that begins with round in algorithm daks′. There are processors in that are aware of triple . Note that there are consecutive rounds in a block; and during every round every processor picks a processor from uniformly at random, and sends the triple to it. Note also that in algorithm daks′, triple is not forwarded by a processor during the same round in which it is received. Therefore, every processor in has a probability of to be selected by a processor in one round. Conversely, the probability that is not selected by is . The number of trials is , hence the probability that processor is not selected is . Thus, the probability that a processor is selected is . Therefore, the expected number of processors from that learn about triple by the end of the block in algorithm daks′ is . Next, by applying the binomial expansion, we have:
The number of processors from that become aware of triple in the block of rounds that starts with round is , while, as shown above, the expected number of processors that learn about triple is .
On the other hand, because in algorithm daks′ no processor that learns about triple in a block forwards it in the same block, we have negative dependencies among the random variables . And hence, we can apply the regular Chernoff bound, with . Considering also that and that by Lemma 6, we obtain:
where for some sufficiently large .
Therefore, whp the number of processors that learn about triple in a block that starts with round is
for a sufficiently large , and given that (otherwise the lemma is proved).
Hence we have shown that the number of processors from that learn about triple by the end of the block that starts with round is at least whp. It remains to show that whp. Indeed, even assuming that processors that learn about triple following round do not disseminate it, after repeating the process described above for some times, it is clear that whp . On the other hand, since the block size is and there are blocks.
Thus whp we have for , and since we have .
In the proof of the next lemma we use the Coupon Collector’s problem [20]:
Definition 1
The Coupon Collector’s Problem (CCP). There are types of coupons and at each trial a coupon is chosen at random. Each random coupon is equally likely to be of any of the n types, and the random choices of the coupons are mutually independent. Let be the number of trials. The goal is to study the relationship between and the probability of having collected at least one copy of each of types.
In [20] it is shown that and that whp the number of trials for collecting all coupon types lies in a small interval centered about its expected value.
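The standard facts from [20] used below are that $\mathbb{E}[X] = n H_n = n\ln n + O(n)$, and the high-probability version, which follows from a union bound over the $n$ coupon types after $m$ trials:
\[
\Pr[\text{some type is uncollected after } m \text{ trials}]
\;\le\; n\Bigl(1-\tfrac{1}{n}\Bigr)^{m}
\;\le\; n\,e^{-m/n},
\]
so $m=(\beta+1)\,n\ln n$ trials collect all $n$ types with probability at least $1-n^{-\beta}$.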
Next we calculate the number of rounds required for the remaining processors in to learn . Let be the set of workers that do not learn after rounds of algorithm daks. According to Lemma 7 we have .
Lemma 8
Once every task is performed times in epoch by processors in then at least one worker from becomes enlightened in rounds, whp.
Proof. According to Lemmas 6 and 7 in rounds of algorithm daks at least of the workers are aware of triple generated by a processor in . Let us denote this subset of workers by , where is the first such round.
We are interested in the number of rounds required for every processor in to learn about whp by receiving a message from a processor in in some round following .
We show that, by an analysis similar to CCP, in rounds triple is known to all processors in , whp. Every processor in has a unique id, hence we consider these processors as different types of coupons and we assume that the processors in collectively represent the coupon collector. In this case, however, we do not require that every processor in contacts all processors in whp. Instead, we require only that the processors in collectively contact all processors in whp. According to our algorithm, in every round every processor in selects a processor uniformly at random and sends all its data to it. Let us denote by the collective number of trials by processors in to contact processors in . According to CCP if then whp processors in collectively contact every processor in , including those in . Since there are at least processors in , in every round the number of trials is at least , hence in rounds whp all processors in learn about . Therefore, in rounds whp all processors in , and thus in , learn about .
Let be the set of triples such that for every task there are triples generated by processors in , and hence . Now by applying Boole’s inequality we want to show that whp in rounds all triples in become known to all processors in .
Let be the event that some triple is not known to all processors in . In the preceding part of the proof we have shown that , where . By Boole’s inequality, the probability that there exists one triple in that is not known to all processors in can be bounded as
where . This implies that every processor in collects all triples generated by processors in , whp. And hence, at least one of these processors becomes enlightened after rounds.
Theorem 1
Algorithm daks makes known the correct results of all tasks at every live processor in epoch after rounds whp.
Proof. According to algorithm daks (line 28) every live processor computes the result of every task by taking a plurality among all the results. We want to prove that the majority of the results for any task are correct at any enlightened processor, whp.
To do that, for a task we estimate (with a concentration bound) the number of times the results are computed correctly, then we estimate the bound on the total number of times task is computed (whether correctly or incorrectly), and we show that a majority of the results are computed correctly.
Let us consider random variables that denote the success or failure of correctly computing the result of some task in round by worker . Specifically, if in round , worker computes the result of the task correctly, otherwise . According to our algorithm we observe that for a live processor we have and , where . We want to count the number of correct results calculated for task when a processor becomes enlightened. As before, we let be the set of processors that crashes prior to round .
Let denote the number of correctly computed results for task among all live workers during round . By linearity of expected values of a sum of random variables we have
We denote by the minimum number of rounds required for at least one processor from to become enlightened. It follows from line 26 of algorithm daks that a processor becomes enlightened only when there are at least results for every task (the constant is chosen later in this section). We see that , where . This is because there are tasks to be performed, and in epoch we have for a constant .
We further denote by the number of correctly computed results for task when the condition in line 26 of the algorithm is satisfied. Again, using the linearity of expected values of a sum of random variables we have
Note that, according to our adversarial model definition, for every round we have , for some fixed . Note also that , and hence, there exists some , such that, . Also, observe that the random variables are mutually independent, since we consider an oblivious adversary and the random variables correspond to different rounds of execution of the algorithm. Therefore, by applying Chernoff bound on we have:
where as above and for a sufficiently large .
Let us now count the total number of times task is chosen to be performed during the execution of the algorithm until every live processor halts. We represent the choice of task by worker during round by a random variable . We assume if is chosen by worker in round , otherwise .
At this juncture, we address a technical point regarding the total number of results for used for computing plurality. Note that even after round any processor that is still a worker continues to perform tasks, thereby adding more results for task . According to Lemma 3 every processor is enlightened in rounds after . Furthermore, in epoch following round , the number of processors that are still workers is . Hence, the expected number of results computed for every task by workers is , for some , that is, , for some . Therefore, the number of results computed for task , starting from round and until the termination is negligible. Let us denote by the total number of results computed for a task at termination. We express the random variable as , where is the last round prior to termination. As argued above, the total number of results computed for task between rounds and is , for some , and hence . Note that the outer sum terms of consisting of the inner sums are mutually independent because each sum pertains to a different round; this allows us to use Chernoff bounds. From above it is clear that . Therefore, by applying Chernoff bound for the same as chosen above we have:
where for a sufficiently large .
Then, by applying Boole’s inequality on the above two events, we have
where
Therefore, from above, and from the fact that , we have for some . Hence, at termination, whp, the majority of calculated results for task are correct. Let us denote this event by . It follows that . Now, by Boole’s inequality we obtain
where is the set of all tasks, and .
By Lemmas 3, 7, and 8, whp, in rounds of the algorithm, at least triples generated by processors in are disseminated across all workers. Thus, the majority of the results computed for any task is the same at all workers, and moreover these results are correct whp.
According to Lemma 8, after rounds of epoch at least one processor in becomes enlightened. Furthermore, once a processor in becomes enlightened, according to Lemma 3 after rounds of the algorithm every live processor becomes enlightened and then terminates, whp. Next we assess work and message complexities.
Theorem 2
For algorithm daks has work and message complexity .
Proof. To obtain the result we combine the costs associated with epoch with the costs of epoch . The work and message complexity bounds for epoch are given by Lemma 4 as .
For epoch (if it is not empty), where , the algorithm terminates after rounds whp and there are live processors, thus its work is . In every round if a processor is a worker it sends a share message to one randomly chosen processor. If a processor is enlightened then it sends profess messages to a randomly selected subset of processors. In every round share messages are sent. Since the algorithm terminates, whp, in rounds, share messages are sent. On the other hand, according to Lemma 2, if during the execution of the algorithm profess messages are sent then every processor terminates whp. Hence, the message complexity is .
The worst case costs of the algorithm correspond to the executions with non-empty epoch , where the algorithm does not terminate early. In this case the costs from epoch are asymptotically absorbed into the worst case costs of epoch computed above.
Finally, we consider the efficiency of algorithm dakst,n for tasks, where . Note that the only change in the algorithm is that, instead of one task, processors perform chunks of tasks. The communication pattern in the algorithm remains exactly the same. The following result is directly obtained from the analysis of algorithm daks for by multiplying the time and work complexities by the size of the chunk of tasks; the message complexity is unchanged.
Theorem 3
Algorithm dakst,n, with , computes the results of tasks correctly in model whp, with time complexity , work complexity , and message complexity .
Proof. For epoch algorithm daks has time , work , and message complexity is . The same holds for algorithm dakst,n. For epoch algorithm daks takes iterations for at least one processor from set to become enlightened whp. The same holds for dakst,n, except that each iteration is extended by rounds due to the size of chunks (recall that no communication takes place during these rounds). This yields round complexity . Work complexity is then . Message complexity remains the same as for algorithm daks at as the number of messages does not change. The final assessment is obtained by combining the costs of epoch and epoch .
4.3 Failure Model
We start with the analysis of algorithm daks, then extend the main result to algorithm dakst,n, for the adversarial model , where ; here we use the term to denote a member of the class of functions . As a motivation, first note that when a large number of crashes makes , one may attempt a trivial solution in which all live processors perform all tasks. While this approach has efficient work, it does not guarantee that workers compute correct results; in fact, since the overall probability of live workers producing bogus results can be close to , this may yield on average just slightly more than correct results.
For executions in , let be at least , for specific constants and satisfying the model constraints. Let be the actual number of crashes that occur prior to round . For the purpose of analysis we divide an execution of the algorithm into two epochs: epoch and epoch . In epoch we include all rounds where remains constrained as in model , i.e., , for some constants and ; for reference, this epoch combines epoch and epoch from the previous section. In epoch we include all rounds starting with the first round (it can be round 1) when the number of live processors drops below , but remains per model . Also note that either epoch may be empty.
In epoch the algorithm incurs costs exactly as in model . Next we consider epoch . If algorithm daks terminates in round , the first round of the epoch, the costs remain the same as the costs analyzed for in the previous section.
If it does not terminate, it incurs additional costs associated with the processors in , where . We analyze the costs for epoch next. The final message and work complexities are then at most the worst case complexity for epoch plus the additional costs for epoch .
In the next lemmas we use the fact that . The first lemma shows that within some rounds in epoch every task is chosen for execution times by processors in whp.
Lemma 9
In rounds of epoch every task is performed times whp by processors in .
Proof. If the algorithm terminates within rounds of epoch , then each task is performed times as reasoned earlier. Suppose the algorithm does not terminate (its performance is worse in this case). Let us assume that after rounds of algorithm daks, where ( is a sufficiently large constant), there exists a task that is performed less than times by the processors in , for some . We prove that whp such a task does not exist.
We define to be such that (the constant will play a role in establishing the value of the compile-time constant of algorithm daks; we come back to this at the end of Section 4). According to the above assumption, at the end of round for some task , we have .
Let us consider all algorithm iterations individually performed by each processor in during the rounds. Let be the total number of such individual iterations. Then . During any such iteration, a processor from selects and performs task in line 24 independently with probability . Let us arbitrarily enumerate said iterations from to . Let be Bernoulli random variables, such that is if task is performed in iteration , and otherwise. We define , the random variable that describes the total number of times task is performed during the rounds by processors in . We define to be . Since , for , where , by linearity of expectation, we obtain . Now by applying Chernoff bound for the same as chosen above, we have:
where for some sufficiently large . Now let denote the probability event by the round of the algorithm, and we let be the complement of that event. Next, by Boole’s inequality we have , where . Hence each task is performed at least times whp, i.e., .
Next we show that once each task is done a logarithmic number of times by processors in , then at least one worker in acquires a sufficient collection of triples in at most a linear number of rounds to become enlightened. We note that if the algorithm terminates within rounds of epoch , then every processor in is enlightened as reasoned earlier. Suppose the algorithm does not terminate (leading to its worst case performance).
Lemma 10
Once every task is performed times by processors in then at least one worker in becomes enlightened whp after rounds in epoch .
Proof. Assume that after rounds of algorithm daks, every task is done times by processors in , and let be the set of corresponding triples in the system. Consider a triple that was generated in some round . We want to prove that whp it takes rounds for the rest of the processors in to learn about .
Let be the number of processors in , then , by the constraint of model . While there may be more than processors that start epoch , we focus only on processors in . This is sufficient for our purpose of establishing an upper bound on the number of rounds for at least one worker to become enlightened: in line 12 of algorithm daks every live processor chooses a destination for a share message uniformly at random, and hence having more processors will only cause a processor in to become enlightened more quickly.
Let be the set of processors that become aware of triple in round . Beginning with round when the triple is generated, we have (at least one processor is aware of the triple). For any rounds and , where , we have because the considered processors that become aware of do not crash; thus is monotonically non-decreasing with respect to .
We want to estimate an upper bound on the total number of rounds required for to become . We will do this by constructing a sequence of random mutually independent variables, each corresponding to a contiguous segment of rounds , for in an execution of the algorithm. Let be the round that precedes round . Our contiguous segment of rounds has the following properties: (a) for , where during such rounds the set does not grow (the set of such rounds may be empty), and (b) , i.e., the size of the set grows.
For the purposes of analysis, we assume that , i.e., the set grows by exactly one processor. Of course it is possible that this set grows by more than one processor in a round. Thus we consider an ‘amnesiac’ version of the algorithm where, if more than one processor learns about the triple, then all but one processor ‘forget’ about that triple. Information propagates more slowly in the amnesiac algorithm, but this is sufficient for us to establish the desired upper bound on the number of rounds needed to propagate the triple in question.
Consider some round with . We define random variable that represents the number of rounds required for , i.e., corresponds to the number of rounds in the contiguous segment of rounds we defined above. The random variables are geometric, independent random variables. Hence, we acquire a sequence of random variables , since and according to our amnesiac algorithm for any round .
Let us define the random variable as . is the total number of rounds required for all processors in to learn about triple : By Markov’s inequality we have:
for some and to be specified later in the proof.
We say that a transmission in round is successful if processor sends a message to some processor ; otherwise we say that the transmission is unsuccessful. Let be the probability that a transmission is successful in a round, and the probability that it is unsuccessful. Note that if a transmission is unsuccessful, then in that round none of the processors in , where , were able to contact a processor in (here ), and hence we have:
By geometric distribution, we have the following:
In order to sum the infinite geometric series, we need to have . Assume that (note that we will need to choose such that the inequality is satisfied), hence using infinite geometric series we have:
In the remainder of the proof we focus on deriving a tight bound on the , and subsequently apply Boole’s inequality across all triples in .
Recall that we assumed that , for , and . Let be such that ; then we have the following:
In order to show that , it remains to show that is positive. Note that is increasing until ; we should also note that we consider cases for . Hence, the minimal value of is attained when either or , and in both cases , for sufficiently large .
Let us now evaluate the following expression:
Then, we have
The latter is true because achieves its minimal value when . Now, since , we have:
Since , by taking the natural logarithm of both sides and using the Taylor series for , where , we have . Hence, . We get
By taking , where is a sufficiently large constant, we get
where for some sufficiently large constant .
Thus we showed that if a new triple is generated by a worker in then whp it is known to all processors in in rounds. Now by applying Boole’s inequality we want to show that whp in rounds all triples in become known to all processors in .
Let be the event that some triple is not spread around among all workers in . In the preceding part of the proof we have shown that , where . By Boole’s inequality, the probability that there exists one triple that did not get spread to all workers in , can be bounded as
where . This implies that every worker in collects all triples generated by processors in whp. Thus, at least one worker in becomes enlightened after rounds.
The following theorem shows that, with high probability, during epoch the correct results for all tasks are available at all live processors in rounds.
Theorem 4
Algorithm daks makes known the correct results of all tasks at every live processor in epoch after rounds whp.
Proof sketch. The proof of this theorem is similar to the proof of Theorem 1. This is because, by Lemma 9, in rounds the processors in generate triples, where is a constant. According to Lemmas 3 and 10 in rounds every live worker becomes enlightened.
According to Lemma 10, after rounds of epoch at least one processor in becomes enlightened. Furthermore, once a processor in becomes enlightened, according to Lemma 3 after additional rounds every live processor becomes enlightened and then terminates, whp. Next we assess work and message complexities (using the approach in the proof of Theorem 2). Recall that we may choose arbitrary , such that .
Theorem 5
Under adversarial model algorithm daks has work complexity and message complexity , for any .
Proof. To obtain the result we combine the costs associated with epoch with the costs of epoch . As reasoned earlier, the worst case costs for epoch are given in Theorem 2.
For epoch (if it is not empty), where , algorithm daks terminates after rounds whp and there are up to live processors. Thus its work is . In every round, if a processor is a worker it sends a share message to one randomly chosen processor. If a processor is enlightened then it sends profess messages to a randomly selected subset of processors. In every round share messages are sent. Since whp algorithm daks terminates in rounds, share messages are sent. On the other hand, according to Lemma 2, if during an execution profess messages are sent then every processor terminates whp. Hence, the message complexity is .
The worst case costs of the algorithm correspond to executions with non-empty epoch , where the algorithm does not terminate early. In this case the costs from epoch are asymptotically absorbed into the worst case costs of epoch computed above.
Last, we extend our analysis to assess the efficiency of algorithm dakst,n for tasks, where . This is done based on the definition of algorithm dakst,n using the same observations as done in discussing Theorem 3.
Theorem 6
Algorithm dakst,n, with , computes the results of tasks correctly in adversarial model whp, in rounds, with work complexity and message complexity , for any .
Proof. The result for algorithm dakst,n is obtained (as in Theorem 3) by combining the costs from epoch (ibid.) with the costs of epoch derived from the analysis of algorithm daks for (Theorem 5). This is done by multiplying the number of rounds and work complexities by the size of the chunk ; the message complexity is unchanged.
We note that it should be possible to derive tighter bounds on the complexity of the algorithm. This is because we only assume that for all rounds in epoch the number of live processors is bounded by the generous range . In particular, if in all rounds of epoch there are live processors, the round and message complexities both become as follows from the arguments along the lines of the proofs of Theorems 5 and 6.
4.4 Finalizing Algorithm Parameterization
Lastly, we discuss the compile-time constants and that appear in algorithm daks (starting with line 2). Recall that we have already given the constant in Section 4.1; the constant stems from the proof of Lemma 2.
We compute as , where and come from the proofs of Lemmas 5 and 9. The constant , as we detail below, emerges from the proof of Lemma 2 of [6] in the same way that the constants and are established in Lemmas 5 and 9.
As we discussed in conjunction with Lemma 4, algorithm daks in epoch performs tasks in the same pattern as in algorithm [6] when processors do not crash. Lemma 2 of [6] shows that after rounds of algorithm there is no task that is performed fewer than times, whp, for a suitably large constant and some constant . Thus, we let be . This allows us to define to be , ensuring that the constant in algorithm daks (and thus in algorithm dakst,n) is large enough to satisfy all requirements of the analysis.
5 Conclusion
We presented a synchronous decentralized algorithm that can perform a set of tasks using a distributed system of undependable, crash-prone processors. Our randomized algorithm allows the processors to compute the correct results and make the results available at every live participating processor, whp. We provided time, message, and work complexity bounds for two adversarial strategies, viz., (a) all but , , processors can crash, and (b) all but a poly-logarithmic number of processors can crash. In this work our focus was on stronger adversarial behaviors, while still assuming synchrony and reliable communication. Future work will consider the problem in synchronous and asynchronous decentralized systems, with more virulent adversarial settings in both. We plan to derive strong lower bounds on the message, time, and work complexities in various models. Our algorithm solves the problem (whp) even if only one processor remains operational. Thus it is worthwhile to understand its behavior in light of failure dynamics during executions. Accordingly we plan to derive complexity bounds that depend on the number of processors and tasks, and also on the actual number of crashes.
References
- [1] Distributed.net. http://www.distributed.net/.
- [2] Seti@home. http://setiathome.ssl.berkeley.edu/.
- [3] B. S. Chlebus and D. R. Kowalski. Randomization helps to perform independent tasks reliably. Random Structures and Algorithms, 24(1):11–41, 2004.
- [4] Bogdan S. Chlebus, Leszek Gasieniec, Dariusz R. Kowalski, and Alexander A. Shvartsman. Bounding work and communication in robust cooperative computation. In DISC, pages 295–310, 2002.
- [5] E. Christoforou, A. Fernandez, Ch. Georgiou, and M. Mosteiro. Algorithmic mechanisms for internet supercomputing under unreliable communication. In NCA, pages 275–280, 2011.
- [6] S. Davtyan, K. M. Konwar, and A. A. Shvartsman. Robust network supercomputing without centralized control. In Proc. of the 15th Int-l Conf. on Principles of Distributed Systems, pages 435–450, 2011.
- [7] S. Davtyan, K. M. Konwar, and A. A. Shvartsman. Decentralized network supercomputing in the presence of malicious and crash-prone workers. In Proc. of 31st ACM Symp. on Principles of Distributed Computing, pages 231–232, 2012.
- [8] S. Dolev, R. Segala, and A.A. Shvartsman. Dynamic load balancing with group communication. Theoretical Computer Science, 369(1–3):348–360, 2006. A preliminary version appeared in SIROCCO 1999.
- [9] C. Dwork, J. Y. Halpern, and O. Waarts. Performing work efficiently in the presence of faults. SIAM J. Comput., 27(5):1457–1491, 1998.
- [10] A. Fernandez, C. Georgiou, L. Lopez, and A. Santos. Reliably executing tasks in the presence of malicious processors. Technical Report Numero 9 (RoSaC-2005-9), Grupo de Sistemas y Comunicaciones, Universidad Rey Juan Carlos, 2005. http://gsyc.escet.urjc.es/publicaciones/tr/RoSaC-2005-9.pdf.
- [11] A. Fernandez, C. Georgiou, L. Lopez, and A. Santos. Reliably executing tasks in the presence of untrusted entities. In SRDS, pages 39–50, 2006.
- [12] C. Georgiou and A.A. Shvartsman. Cooperative Task-Oriented Computing: Algorithms and Complexity. Morgan & Claypool Publishers, first edition, 2011.
- [13] Ch. Georgiou, A. Russell, and A.A. Shvartsman. Work-competitive scheduling for cooperative computing with dynamic groups. SIAM Journal on Computing, 34(4):848–862, 2005. A preliminary version appeared in STOC 2003.
- [14] Ch. Georgiou and A.A. Shvartsman. Cooperative computing with fragmentable and mergeable groups. Journal of Discrete Algorithms, 1(2):211–235, 2003. A preliminary version appeared in SIROCCO 2000.
- [15] N.W. Hanson, K.M. Konwar, S.-J. Wu, and S.J. Hallam. Metapathways v2.0: A master-worker model for environmental pathway/genome database construction on grids and clouds. In IEEE Conf. on Comput. Intelligence in Bioinf. and Comput. Biology, Hawaii, 2014 (to appear).
- [16] P. C. Kanellakis and A. A. Shvartsman. Fault-Tolerant Parallel Computation. Kluwer Academic Publishers, 1997.
- [17] Z.M. Kedem, K.V. Palem, A. Raghunathan, and P. Spirakis. Combining tentative and definite executions for dependable parallel computing. In Proceedings of the 23rd ACM Symposium on Theory of Computing (STOC), pages 381–390, 1991.
- [18] K. M. Konwar, S. Rajasekaran, and A. A. Shvartsman. Robust network supercomputing with malicious processes. In Proceedings of the 20th International Symposium on Distributed Computing, pages 474–488, 2006.
- [19] C. Martel and R. Subramonian. On the complexity of certified write-all algorithms. Journal of Algorithms, 16(3):361–387, 1994.
- [20] R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, 1995.
- [21] M. Paquette and A. Pelc. Optimal decision strategies in Byzantine environments. Journal of Parallel and Distributed Computing, 66(3):419–427, 2006.
- [22] R. De Prisco, A. Mayer, and M. Yung. Time-optimal message-efficient work performance in the presence of faults. In Proceedings of the 13th ACM Symp. on Principles of Distributed Computing, pages 161–172, 1994.
- [23] Bin Yu and Munindar P. Singh. An evidential model of distributed reputation management. In AAMAS, pages 294–301, 2002.