TU Wien, Vienna, Austria, stephan.felber@ecs.tuwien.ac.at, ORCID 0009-0003-6576-1468
École Polytechnique, Palaiseau, France, bernardo.hummes-flores@lix.polytechnique.fr, ORCID 0000-0003-2325-1497
TU Berlin, Berlin, Germany, hugo.galeana@ecs.tuwien.ac.at, ORCID 0000-0002-8152-1275
CCS Concepts: Theory of computation → Distributed computing models; Mathematics of computing
A Sheaf-Theoretic Characterization of Tasks in Distributed Systems
Abstract
We introduce a sheaf-theoretic characterization of task solvability in general distributed computing models, unifying distinct approaches to message-passing models. We establish cellular sheaves as a natural mathematical framework for analyzing the global consistency requirements of local computations. Our main contribution is a task sheaf construction that explicitly relates both the distributed system and task, in which terminating solutions are precisely its global sections. We prove that a task can only be solved by a system when such sections exist in a task sheaf obtained from an execution cut, the frontier in which processes have enough information to decide. Our characterization is model-independent, working under varying synchronicity, failures and message adversaries, as long as the model produces runs composed of global states of the system. Furthermore, we show that the cohomology of the task sheaf provides a linear algebraic description of the decision space of the processes, and encodes the obstructions to finding solutions. This opens the way to a computational approach to protocol synthesis, which we illustrate by deriving a protocol for approximate agreement. This work bridges distributed computing and sheaf theory, providing both theoretical foundations for analyzing task solvability and tools for protocol design leveraging computational topology.
keywords:
Task Solvability, Locality, Distributed computing, Sheaf Theory, Kripke Frame, Applied Category Theory, Cohomology Theory
1 Introduction
Assessing the solvability or impossibility of a given task in a distributed system is at the center of distributed systems research. A variety of theoretical tools have been developed by sourcing from different fields such as combinatorics, topology and epistemic logic. One of the earliest results is the two generals problem impossibility [2, 19], which roughly states that consensus is impossible in networks with message loss and unbounded delays. Numerous results have followed since. The celebrated FLP impossibility [11] shows that consensus cannot be achieved in asynchronous, crash-prone systems. The lossy-link impossibility [26] (which the authors referred to as mobile failures) shows that consensus is impossible even in a synchronous two-process model if just one message may be lost in each round. Consensus in distributed systems was characterized via epistemic logic in [13], and [23] provides a topological characterization of consensus under general message adversaries.
Although the aforementioned results are mostly related to problem solvability, they are specifically focused on consensus. Nevertheless, these results undoubtedly contributed to the development of frameworks that would later address general task solvability. Some such noteworthy results are Herlihy and Shavit’s characterization of wait-free t-resilient shared memory distributed computation, via combinatorial topology [17], Attiya, Nowak and Castañeda’s characterization of terminating tasks [4], and Alcántara, Castañeda, Flores and Rajsbaum’s characterization of look-compute-move wait-free robot tasks [3].
Sheaf theory originated from studying how local constraints give rise to global solutions, paired with cohomology theory as a measure of obstructions to such global constructions. Cellular sheaves were introduced by Curry [7] as a combinatorial counterpart to the common notion of sheaves found in the literature [20], with the interest of having a theory that is computable. Most importantly for us, the theory of cellular sheaves presents the concept of sheaves in an approachable manner, which makes clear its usefulness for depicting global properties of local information, as is the case for distributed computing. Sheaves have proven valuable in studying similar local-to-global phenomena, from opinion dynamics [14] to contextuality in quantum mechanics [1] and sensor integration [25, 18]. We now extend this framework to distributed systems.
In this paper, we extend the brief announcement [9], introducing a generic sheaf-theoretic framework for analyzing task solvability parametrized by indistinguishability relations on global states of the system, which form a graph-like structure. These indistinguishability graphs are closely related to Kripke frames in epistemic logic, as explored in [8]. Furthermore, our framework enables us to leverage the computational methods of cohomology theory to analyze task solvability, and even to derive solving algorithms in systems where the indistinguishability graph admits a cohomology group construction.
Contributions We introduce a novel sheaf-theoretic framework for analyzing distributed task solvability. Our approach complements the existing frameworks by connecting category theory with distributed computing. This novel bridge allows us to directly apply computational methods for decision function synthesis using linear algebra. We expect this first connection to provide fertile ground for more applications in distributed computing.
More precisely, we provide the following.
• A task sheaf construction (Definition 4.1) that captures both the local knowledge of each process (local constraints) and relates it to valid solutions, assessing global consistency;
• A finite substructure of the executions that preserves the task solvability as system slices (Definition 3.6);
• An equivalence between the existence of solutions to a task and the existence of sections over the corresponding system slice (Theorem 4.4);
• A way to use cohomology to transform impossibility arguments into computational problems (Theorem 5.5).
Paper Organization We start by defining the model of a distributed computing system we use in Section 2, giving special attention to the discrete structures used to depict the system executions (Section 2.1) and its tasks (Section 2.2). Section 3 introduces system slices, the smallest sufficient structure needed to assess task solvability, and how they are brought into the categorical machinery. Section 4 defines the task sheaf, first by its explicit construction (Section 4.1), then by its equivalent categorical formulation (Section 4.2), in order to state the main theorem relating the solvability of tasks to the existence of sections (Section 4.3). Section 5 uses the categorical machinery of task sheaves to define their cohomology, which is then used to computationally assess task solvability. Section 6 provides some concluding words.
2 System Model
We now define how processes evolve, communicate and make decisions, in order to define the runs of that system. Our main contribution on task solvability solely depends on the runs and the task specification and consequently our approach is applicable to any system providing runs consisting of global states consisting of local states. As there are multiple approaches to modeling distributed systems, we do not formally specify how to obtain said runs, but exemplify it via a synchronous message passing system, which is one common choice to represent a distributed system (e.g., [24]).
We call the local protocols processes; they are deterministic state machines containing the possible local states, together with distinguished input states. A local protocol is defined by a transition function, a communication function and a decision function mapping local states to task-specific outputs. Processes evolve depending on their previous local state and some partial information about other processes’ local states, usually represented via the messages they received. The possible ways in which processes receive (partial) information about the whole system state are captured by the adversary, also called the message adversary [26, 27, 28], heard-of predicates [6], time-varying graphs [5], scheduler [22], etc., which influences the behavior of the system. Since there is a palette of different adversaries, each differing in the way it impacts the system, and the precise capabilities of the adversary are not important in stating our results, we do not formalize it further. The only object we need is a run consisting of a sequence of configurations, defined in what follows.
2.1 System Runs and System Frames
A global configuration is a tuple of local states, one per process. A projection function is defined as mapping configurations to local states. A projection to the inputs is defined as retrieving the vector of input states, and accordingly they can be composed , projecting to ’s input. A configuration is called terminal if all processes’ decision functions map to a valid output value. If this is true for some processes, but not all, then the configuration is called partially terminated. A decision function that always eventually provides a terminal configuration is called terminating. Additionally, we assume that the input values of a process are encoded into its local state and are not forgotten. A sequence of global configurations is an execution or a run of the system. We identify the system with the set of all of its runs, denoted .
Example 2.1 (Synchronous message-passing system with lossy link adversary).
For example, the synchronous lossy link model for two processes produces a system frame. Informally, a system under a synchronous message adversary evolves in synchronous rounds, in which all processes simultaneously compute a new state from their current state and the messages received in the previous round. Messages sent at the beginning of one round either arrive at the end of that round, or are lost forever. The message adversary is allowed to drop at most one message in each round, implying that processes know that if they receive no message at the end of a round, then their own message arrived. Written differently, possible message arrival in each round may be indicated by an edge in , i.e., the no-arrow case is excluded. We assume that processes remember everything, i.e., they have unlimited memory to keep all their history, and additionally they transmit their whole history in every round (in the literature this is known as the full information protocol and is mostly used for impossibility arguments; see for example [10], where it is called a full information algorithm).
It is well known that terminating consensus is already impossible here, for example Santoro and Widmayer proved it in [26].
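The one-round behaviour of Example 2.1 can be enumerated directly. The following is a minimal sketch, assuming binary inputs, two processes indexed 0 and 1, and a hypothetical string encoding of the three admissible communication graphs; the names `GRAPHS`, `one_round` and `one_round_configurations` are illustrative and do not come from the paper.

```python
from itertools import product

# Hypothetical encoding of the lossy-link adversary: at most one of the two
# messages is dropped per round, so the empty graph never occurs.
# "->" means only the message from process 0 to process 1 arrives.
GRAPHS = ("->", "<-", "<->")

def one_round(inputs, graph):
    """Return the configuration (tuple of local states) after one round.

    A local state is (own input, received input or None), mimicking a
    full-information protocol restricted to a single round.
    """
    x0, x1 = inputs
    s0 = (x0, x1 if graph in ("<-", "<->") else None)
    s1 = (x1, x0 if graph in ("->", "<->") else None)
    return (s0, s1)

def one_round_configurations(values=(0, 1)):
    """Enumerate all (inputs, graph, configuration) triples after one round."""
    return [
        (inputs, graph, one_round(inputs, graph))
        for inputs in product(values, repeat=2)
        for graph in GRAPHS
    ]

configs = one_round_configurations()
print(len(configs))  # 4 input vectors times 3 graphs
```

Under this encoding a process that receives nothing still knows, from the adversary's restriction, that its own message was delivered, which is why the `None` entry carries information.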
We can now define execution graphs that do not depend on the system’s decision functions.
Definition 2.2 (Execution Graph).
Let be a set of runs. The execution graph of is a directed graph where:
The execution graph uniquely generated by the system is denoted .
From the system model established in Section 2, we formalize the requirement that the decision function must be deterministic. This takes the form of a system frame, where any set of configurations that constitute a run is equipped with process-wise indistinguishability relations. The system frame will later parametrize where, in the space of configurations, a decision function must behave constantly for lack of distinguishing power between identical states of the same process.
Definition 2.3 (Configuration Indistinguishability).
Two configurations , are indistinguishable for process , denoted , iff has the same local state in both and , i.e., .
Edges of the execution graph are also called causal links, as they arrange the configurations in causal sequences that follow the execution of a possible run. An acyclic execution graph also induces a partial order over configurations: we write if there exists a path from to .
We now define the system frame, which formalizes the information available to each process throughout the possible executions of the distributed system. This structure will later be used in Definition 4.1, in the construction of the task sheaf, to make precise the idea that a process must always choose the same value when the information available to it is the same.
Definition 2.4 (System Frame).
Let be the execution graph generated by the set of runs , and the set of indistinguishability relations on , indexed by each process . We say that is the system frame induced by .
Remark 2.5 (Equivalence Relation on the System Frame).
Note the indistinguishability relation on local states defines an equivalence relation.
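To make Definition 2.3 and Remark 2.5 concrete, here is a small sketch that partitions configurations into the equivalence classes of one process. It assumes configurations are represented as tuples of local states indexed by process number; the toy states `"a"`, `"b"`, `"x"`, `"y"` are placeholders.

```python
from collections import defaultdict

def indistinguishable(c, d, p):
    """Process p has the same local state in configurations c and d."""
    return c[p] == d[p]

def equivalence_classes(configurations, p):
    """Partition configurations by the local state of process p."""
    classes = defaultdict(list)
    for c in configurations:
        classes[c[p]].append(c)
    return list(classes.values())

# Toy configurations: (local state of process 0, local state of process 1).
configs = [("a", "x"), ("a", "y"), ("b", "y")]
classes_p0 = equivalence_classes(configs, 0)
classes_p1 = equivalence_classes(configs, 1)
print(classes_p0)
print(classes_p1)
```

Grouping by local state yields a partition automatically, which is another way of seeing that the relation is reflexive, symmetric and transitive.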
See Example 2.6 for the explicit construction of a system frame after one step.
Example 2.6.
A system frame after zero and one step corresponding to the lossy-link synchronous message adversary from the previous example is shown in Figure 2. The frame after zero steps could be written as and , . This is the finest execution graph possible under the lossy-link adversary. Dotted arrows are the causal links. The colored edges denote the indistinguishability edges between configurations, orange for , green for . The nodes here represent the configurations: for example, after the first step, cannot distinguish from as has not received a message in either configuration, whereas can distinguish the two configurations because it has different inputs in them.
An example for the causality relation would be the configuration depending on the configuration , written as .
2.2 Distributed Tasks and their Algebraic Structure
Given a system, we can now talk about the distributed tasks for which we will provide a sheaf-theoretic perspective.
Definition 2.7 (Tasks).
A task is a triple , where is the set of possible input vectors, is the set of possible output vectors and is a map associating to each input vector the set of valid output vectors. , denote the possible inputs and outputs restricted to a process .
Definition 2.8 (Terminating Task Solvability).
A decision function is said to solve a task if for every run the following holds:
• Termination: For every process , there is a step such that ,
• Validity: There is an output vector , such that for every process there is a step where and for every , .
Termination requires every process to eventually decide on some value, and validity requires that all individual decisions correspond to a valid output configuration for the respective input configuration. Note that the decision of process corresponds to the first value obtained by its decision function other than .
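As a concrete instance of Definitions 2.7 and 2.8, the sketch below encodes binary consensus as a triple of inputs, outputs and a validity map, and checks the validity condition for the final decisions of one run. The dictionary encoding and the names `consensus_task` and `is_valid` are assumptions made for illustration.

```python
from itertools import product

def consensus_task(n=2, values=(0, 1)):
    """Binary consensus as a triple (inputs, outputs, delta).

    delta maps each input vector to the set of valid output vectors: all
    processes output the same value, and that value is some process's input.
    """
    inputs = set(product(values, repeat=n))
    outputs = {(v,) * n for v in values}
    delta = {i: {(v,) * n for v in set(i)} for i in inputs}
    return inputs, outputs, delta

def is_valid(delta, input_vector, decisions):
    """Check the validity condition for the final decisions of one run."""
    return tuple(decisions) in delta[tuple(input_vector)]

inputs, outputs, delta = consensus_task()
print(is_valid(delta, (0, 1), (1, 1)))  # agreement on an input value
print(is_valid(delta, (0, 0), (1, 1)))  # agreement on a value nobody proposed
```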
The topological approach to distributed computing [16] provides a combinatorial structure for defining tasks as simplicial complexes. Recall that a chromatic simplicial complex is a set of vertices and a set of simplices defined from the vertices: a subset of the powerset of vertices that is closed under inclusion. Each vertex has a color, the process identity, and each face can depict at most one instance of each color. Vertices are local states and faces are then global states.
The sets of input and output values are now chromatic simplicial complexes, and the task specification works the same: a task consists of a relation between valid output simplices for a given set of input simplices. This formulation captures the combinatorial nature of a task specification. It will later be used in Section 4 as the data of the task that must be tracked by the processes.
3 From System Frames to Categories
Termination on an execution graph can be represented as a set of configurations that cuts the execution graph in half. Indeed, termination by definition is just a set of configurations that intersects any run, together with some extra conditions that we will define in the following. Unfortunately, not every such cut through the execution graph need be finite, and it is therefore not necessarily easy to find. Thus we do not concern ourselves with how to look for one, and instead describe its shape.
Definition 3.1 (Execution Cut).
Let be a set of configurations. We say that is an execution cut iff it is a cut set in , i.e., it intersects every run in .
An execution cut represents a set of “unavoidable” configurations within the system. Furthermore, we say that an execution cut is terminal iff any is terminal, i.e., at configuration , all processes must have decided.
Definition 3.2 (Local Star).
Let be a configuration, , and . That is, is the equivalence class of under . We define the local star of , denoted by , as the labeled graph obtained by and extend it over sets by
Remark 3.3.
Note that for any , induces a complete graph with all edges labeled by .
Definition 3.4 (Causal Closure).
Let be a terminal execution cut and its local star. The causal closure contains all configurations that lie between and :
A causal closure of a terminal execution cut extends to the partially terminated configurations in , where at least one process has already terminated, but not necessarily all of them. Any partially terminated configuration eventually results in a terminated configuration in . The causal closure contains all successors of partially terminated configurations up until they result in a fully terminated configuration in . As some processes have already decided in a partially terminated configuration , we ensure that they keep their decided values in any successor configurations (reflecting that decisions are final) by constructing the causal consistency relation such that .
Definition 3.5 (Causal Consistency Relation).
We define the causal consistency relation as the symmetric closure of the binary relation composition .
Intuitively, given a causal closure of a terminal execution cut , the causal consistency relation relates all configurations where a process terminated in a preceding or indistinguishable configuration. As and are reflexive, is reflexive, and contains both relations and . Observe that any causally dependent configurations in such that are related (and also ) for all processes. Configurations in that are not in , are related to a configuration in via the indistinguishability relation of and any successor of is also related to for . This ensures that the partially terminated process keeps its terminated value.
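The relation of Definition 3.5 can be computed on finite examples with elementary set operations. The sketch below assumes relations are stored as sets of ordered pairs and, following the intuition above, composes in the order "indistinguishable for the process, then causally before"; this order of composition and all names are assumptions, since the formula itself is not reproduced here.

```python
def compose(r, s):
    """Relational composition: (a, c) whenever a r b and b s c for some b."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def symmetric_closure(r):
    """Add the reverse of every pair."""
    return r | {(b, a) for (a, b) in r}

def causal_consistency(indist_p, causal_order):
    """Symmetric closure of 'indistinguishable for p, then causally before'."""
    return symmetric_closure(compose(indist_p, causal_order))

# Toy slice: t is terminal, u is partially terminated and indistinguishable
# from t for process p, and w is a causal successor of u.
nodes = ["t", "u", "w"]
identity = {(x, x) for x in nodes}
indist_p = identity | {("t", "u"), ("u", "t")}
causal_order = identity | {("u", "w")}

relation = causal_consistency(indist_p, causal_order)
print(("t", "w") in relation)  # w inherits the decision p took at t
```

Because both input relations are reflexive, the result contains each of them, matching the observation made in the text.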
Definition 3.6 (System Slice).
Let be an execution cut in the system frame , we define a system slice, denoted by , as a tuple consisting of the causal closure over the local star over , i.e., , together with its causal consistency relation.
In general, system slices need not be finite; see Example 3.7 for an explicit construction of an execution graph with a necessarily infinite system slice. A finite system slice is depicted in Example 4.6.
Example 3.7 (System slice of single shot message adversary execution).
In this example, we consider the tilted consensus task, where both processes have to decide on ’s value. The synchronous communication adversary here allows exactly one message from to per run, the set of all runs therefore consists of all , where . In Figure 3 ’s indistinguishability relations between configurations are again orange, ’s are blue. The configurations within dashed boxes are terminated, the yellow boxes are partially terminated. The dashed red and blue edges represent the causal consistency relation (most transitive edges are omitted favoring readability).
Clearly, just waits until something arrives, whereas can terminate immediately. As cannot distinguish whether it will end up in the or deciding half until it receives that message (i.e. doesn’t know ’s value), no configuration where no message has arrived yet can be terminal. At the same time, every configuration is partially terminated as it is indistinguishable for from a terminal configuration (marked in the dashed terminal regions). The smallest system slice therefore has infinite size in this execution graph.
System slices naturally form a graph structure on which we can define sheaves: the global configurations are its vertices and the causal consistency relations its edges. This suffices for our purposes, but note that it is an instance of the more general structure of a cellular complex, where vertices and edges are 0- and 1-cells, and the execution structure naturally satisfies its local finiteness requirement. Cellular complexes are fundamental objects in algebraic topology, and a full account can be found in [15]. Most importantly, a cellular complex has an underlying partial ordering of its cells (vertices and edges), denoted , which we exemplify now.
Finally, in order to access the categorical machinery needed to define the task sheaves, we need to introduce the concept of a cellular category [21], which categorifies the cellular complex by viewing its associated poset as a category. This preserves the topological structure while enabling a categorical perspective better suited for defining sheaves.
Example 3.8 (Undirected Graph).
An undirected graph , with a collection of vertices and a collection of edges, gives rise to a cellular category . Note that the data of an edge consists of an unordered pair of vertices , and the incidence relation of an edge to a vertex satisfies the inclusion . With this in mind, is obtained by constructing an object for each vertex and for each pair of vertices , with an arrow whenever . Objects obtained from vertices are called 0-cells and those obtained from edges are called 1-cells. This construction preserves the information of the graph, while adding a categorical structure.
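The construction of Example 3.8 is small enough to write out. The sketch below builds the objects and the non-identity arrows of the poset from a list of vertices and edges; representing an edge as a `frozenset` of its two endpoints is an illustrative choice, and the function name is hypothetical.

```python
def cellular_poset(vertices, edges):
    """Build the face poset of an undirected graph.

    Objects are the vertices (0-cells) and the edges (1-cells, stored as
    frozensets of two vertices). There is a non-identity arrow v -> e exactly
    when the vertex v is an endpoint of the edge e.
    """
    cells0 = list(vertices)
    cells1 = [frozenset(e) for e in edges]
    arrows = [(v, e) for e in cells1 for v in cells0 if v in e]
    return cells0, cells1, arrows

cells0, cells1, arrows = cellular_poset(["a", "b", "c"], [("a", "b"), ("b", "c")])
print(len(arrows))  # each edge contributes one arrow per endpoint
```

Identity arrows are left implicit, and no composition needs to be stored because no two non-identity arrows are composable in a graph.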
Observe that any system slice is an undirected graph and therefore induces a cellular category. We will make ample use of this in the following.
4 Task Sheaves
In this section, we introduce the task sheaf as a mathematical framework for encoding the solvability constraints of a distributed task. Sheaves provide a formalism that captures both the global structure of a task definition and the local constraints which protocols must respect. We present two equivalent formulations. First, an explicit construction, used to build the intuition over the running examples. Then, we present its categorical foundations, which enables the use of cohomology later in Section 5. We then state the main theorem relating the solvability of a task to the existence of sections in the appropriate task sheaf.
4.1 The Task Sheaf
Sheaves can be informally understood as a structure allowing to track data that is associated to pieces of a space. In our case, we will be tracking the task data, the possible solutions according to the task specification, across the possible global states that our distributed system may assume. As such, the runs of the system being analyzed will provide us with our base space: a system slice , derived from its system frame , turned into a cellular category , as seen in Section˜3.
A sheaf defined on a cellular category is a cellular sheaf (see [7] for a thorough treatment of cellular sheaves, and [20] for an overview of sheaf theory). A key characteristic of cellular sheaves is the combinatorial nature of this space, which allows us to look at discrete structures and their generalizations to higher dimensions, such as graphs and cellular complexes. This nature also gives us access to a much simpler theory. In full generality, sheaves must be shown to respect a technical sheaf condition, which is automatically satisfied in the case of cellular sheaves, as proven in [7, Theorem 4.2.10].
In a distributed system, configurations represent snapshots of the global state, while indistinguishability relations capture what each process can observe locally. The task specification defines which outputs are valid for given inputs. We now introduce the construction of a task sheaf, which connects those concepts and allows us to reason about task solvability.
Definition 4.1 (Task Sheaf).
Let be a system slice obtained from a set of runs , and let be a task. The task sheaf is a cellular sheaf defined as follows.
1. (stalks of configurations) For each configuration , the stalk is , i.e., the set of possible valid output configurations given the input assignments in .
2. (stalks of relations) For each edge between configurations and for process , the stalk is , i.e., the set of possible values that process can choose in either of the adjacent configurations.
3. (restriction maps) The restriction map from a configuration to an edge is , i.e., it projects an output configuration to its ’th entry.
The vertices of are configurations, while the edges in connect configurations where should have the same decision value.
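The three items of Definition 4.1 translate almost literally into data. The following sketch assembles stalks and restriction maps for a two-configuration slice; the dictionary-based representation, the configuration names `"c"` and `"d"`, and the function `task_sheaf` are assumptions made for illustration.

```python
def task_sheaf(configs, edges, delta):
    """Assemble stalks and restriction maps of a task sheaf on a small slice.

    configs: dict mapping a configuration name to its input vector
    edges:   list of (c, d, p), an edge between c and d labelled by process p
    delta:   dict mapping an input vector to its set of valid output vectors
    """
    # Item 1: the stalk over a configuration is the set of valid outputs.
    vertex_stalk = {c: set(delta[inp]) for c, inp in configs.items()}
    # Item 2: the stalk over an edge for p collects the values p may choose
    # in either adjacent configuration.
    edge_stalk = {
        (c, d, p): {out[p] for out in vertex_stalk[c] | vertex_stalk[d]}
        for (c, d, p) in edges
    }

    def restrict(edge, output_vector):
        """Item 3: project an output vector to the entry of the edge's process."""
        _, _, p = edge
        return output_vector[p]

    return vertex_stalk, edge_stalk, restrict

# Consensus restricted to two input vectors, with one edge for process 0.
delta = {(0, 0): {(0, 0)}, (0, 1): {(0, 0), (1, 1)}}
edge = ("c", "d", 0)
vertex_stalk, edge_stalk, restrict = task_sheaf({"c": (0, 0), "d": (0, 1)}, [edge], delta)
print(vertex_stalk["d"], edge_stalk[edge], restrict(edge, (1, 1)))
```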
Example 4.2 (Task sheaf after one iteration).
In Figure 4 we depict a cellular category (note that it is also the system frame of the lossy-link synchronous message adversary) with objects representing the possible task-solving outputs.
Indistinguishability edges are also assigned objects (in our case simply the union of both adjacent vertices’ objects, i.e., all possible task outputs of the specific agent that cannot distinguish both configurations) and are omitted for space except at two edges. The restriction maps go from vertices to edges and map the global task output to the non-distinguishing agent’s output.
This explicit construction captures the constraints imposed by the task on the decisions of the processes. In order to establish its mathematical properties, we now provide its equivalent formulation as a colimit of task sheaves defined for each process.
4.2 Categorical foundations
The task sheaf has an equivalent process-wise formulation using the chromatic semi-simplicial sets (csets) as data modeling the distributed tasks. Csets were introduced by Goubault et al. [12], and can be understood as sets of output values with enough structure to represent labeled configurations, a categorical generalization of the simplicial complexes used in Definition 2.7.
For each process , a sheaf is defined. Whenever the system slice and the task are clear from the context, we will write and . has the following structure:
• is the cellular category obtained by localizing the system slice to process , i.e., from the poset induced by over . Note that a pair of configurations induces the relations and .
• is the category of chromatic semi-simplicial sets.
In this formalism, configurations become 0-cells and causal consistency relations become 1-cells. The sheaf maps a set of configurations to the set of -simplices corresponding to its acceptable outputs, according to the task specification , and maps an inclusion of configurations to face maps , where a set of output decisions is sent to decisions process-wise, i.e., its th colored faces, such that the following diagram commutes.
The sheaves defined for all agents are a set of functors with common codomain.
These form a subcategory of the category of cellular sheaves, where each object is a cellular sheaf and morphisms are commutative squares
where all sheaves have a common codomain. The colimit of such cellular categories is well behaved and, under those assumptions, can be lifted to a colimit of cellular sheaves. The colimit of then gives us a sheaf that captures all of the information in the individual sheaves, where a global section exists iff there is a global section on the individual ones. The sheaf is defined over each configuration (resp. indistinguishability edge ) as the colimit of the individual sheaves localized at (resp. ). The sheaf coincides with the explicit definition given in Definition 4.1.
This categorical perspective ensures our construction is well-founded and enables the cohomology computations in Section 5.
4.3 Solvability as Sections
We now define sections, which capture globally consistent assignments of data across the cellular complex. They are a fundamental concept in sheaf theory and will complete the language needed for our analysis of distributed task solvability.
Definition 4.3 (Section).
Let be a cellular sheaf over . A section of is a choice of elements such that for every pair of cells with the values coincide through restriction maps, i.e., . The set of all global sections of is denoted by .
A section is a choice of values, one per cell, such that the same value is obtained if we restrict to a cell from each of its incident neighbors. In the case of a task sheaf, it is a choice of data (i.e., output value) for each vertex (i.e., configuration) and each edge (i.e., causal consistency relation) such that they all agree under the restriction maps. Explicitly, it is an element of the direct sum of the stalks:
We can now state our main characterization theorem, which establishes that task solvability is equivalent to the existence of sections in our task sheaf.
Theorem 4.4 (Terminating Task Solvability).
Let be a task and a system: there exists a terminating decision map solving iff there exists an execution cut such that its system slice, , together with its causal consistency relation , has a section over the task sheaf .
Proof 4.5.
Let us assume first that , solves . Therefore, for any run , there exists a configuration that is the earliest configuration of where each process has decided . Note that is an execution cut.
Consider the system slice . By assumption assigns a solving, non- value to every configuration , and satisfies Items 1, 2 and 3 of Definition 4.1, making a sheaf. The section condition in Definition 4.3 is satisfied as decisions are final, i.e., any successor configuration of a terminated configuration has the same decision, and is a function on local states, i.e., configurations that a process cannot distinguish are mapped identically by .
Now to prove the converse, assume that there exists an execution cut such that its system slice, , has a section . We construct a terminating decision map by setting
Clearly terminates because is an execution cut and satisfies Definition 2.8. solves because, first, the only vectors in the stalks over the configurations are task-solving vectors and, second, is well-behaved and maps identical inputs to identical outputs by Item 3 and the section condition in Definition 4.3, i.e., per-process outputs need to agree over indistinguishable configurations.
See the following Example 4.6 for an application of Theorem 4.4.
Example 4.6 (Consensus on Task Sheaf).
In Figure 5 we depict a section over a task sheaf defined on a system slice. The vectors over the configurations in the system slice correspond to the valid choice of data that the sheaf took. Restriction maps and stalks are not depicted.
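The section condition can also be checked by brute force on small cuts. The sketch below is an illustration under stated assumptions, not the construction used in the paper: it takes all configurations after one round as the cut, uses binary consensus as the task, and reuses a hypothetical string encoding of communication graphs. Values on edges are omitted because they are determined by the restriction maps.

```python
from itertools import product

GRAPHS = ("->", "<-", "<->")  # hypothetical encoding; "->" delivers 0's message to 1

def one_round(inputs, graph):
    """Local states (own input, received input or None) after one round."""
    x0, x1 = inputs
    return ((x0, x1 if graph in ("<-", "<->") else None),
            (x1, x0 if graph in ("->", "<->") else None))

def consensus_delta(inputs):
    """Valid consensus outputs: agree on a value that was some process's input."""
    return [(v, v) for v in sorted(set(inputs))]

def sections(graphs):
    """Brute-force all sections over the cut of round-one configurations."""
    configs = [(inp, one_round(inp, g))
               for inp in product((0, 1), repeat=2) for g in graphs]
    # Edge (i, j, p): process p has the same local state in configs i and j.
    edges = [(i, j, p)
             for i in range(len(configs)) for j in range(i + 1, len(configs))
             for p in (0, 1) if configs[i][1][p] == configs[j][1][p]]
    found = []
    for choice in product(*(consensus_delta(inp) for inp, _ in configs)):
        # Section condition: both endpoints restrict to the same value for p.
        if all(choice[i][p] == choice[j][p] for i, j, p in edges):
            found.append(choice)
    return found

print(len(sections(GRAPHS)))     # lossy link: no section on this cut
print(len(sections(("<->",))))   # reliable link: sections exist
```

With the lossy link, the indistinguishability edges chain a configuration with inputs (0, 0) to one with inputs (1, 1), so no assignment satisfies every edge; this agrees with the impossibility cited in Example 2.1 for this particular cut.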
Now that we have characterized solvability of a task under a protocol in terms of its task sheaf, , we will also establish a relation between local cohomology of the task sheaf and the task solvability.
5 Computing Solutions
Given a communication adversary representing some adversarial entity together with a task , we search for a protocol together with a decision map. As the full-information protocol provides the finest possible execution tree, it is natural to start the search process there. Slices that are finite in size can be recursively enumerated and, although this is computationally inefficient, tested. If we find a finite slice that allows for a terminating decision map, we can try to optimize the protocol inducing the execution tree. If we do not find a slice, but keep looking forever, then a finite slice does not exist and the task is not wait-free solvable, although it might be solvable given an infinite slice.
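A bounded version of this search can be sketched as follows. The sketch makes several simplifying assumptions that are not part of the paper: it only tries cuts in which every process decides at the same round, it uses two processes with binary consensus, and it finds a decision map by plain backtracking over local states. All function names are hypothetical.

```python
from itertools import product

GRAPHS = ("->", "<-", "<->")  # hypothetical encoding of the lossy-link graphs

def run_states(inputs, graph_seq):
    """Full-information local states after following graph_seq from inputs."""
    s0, s1 = inputs
    for g in graph_seq:
        # Each process keeps its history and appends what it received.
        s0, s1 = ((s0, s1 if g in ("<-", "<->") else None),
                  (s1, s0 if g in ("->", "<->") else None))
    return (s0, s1)

def consensus_delta(inputs):
    """Valid consensus outputs for an input vector."""
    return [(v, v) for v in sorted(set(inputs))]

def find_decision_map(configs, delta):
    """Backtracking search for a map (process, local state) -> output value."""
    decided = {}

    def extend(i):
        if i == len(configs):
            return True
        inputs, states = configs[i]
        keys = list(enumerate(states))
        for out in delta(inputs):
            # Skip outputs that contradict a value fixed for the same local state.
            if any(decided.get((p, s), out[p]) != out[p] for p, s in keys):
                continue
            fresh = [k for k in keys if k not in decided]
            for p, s in fresh:
                decided[(p, s)] = out[p]
            if extend(i + 1):
                return True
            for k in fresh:
                del decided[k]
        return False

    return dict(decided) if extend(0) else None

def first_solvable_depth(graphs, delta, max_depth):
    """Smallest round at which a uniform cut admits a decision map, if any."""
    for depth in range(1, max_depth + 1):
        configs = [(inp, run_states(inp, seq))
                   for inp in product((0, 1), repeat=2)
                   for seq in product(graphs, repeat=depth)]
        if find_decision_map(configs, delta) is not None:
            return depth
    return None

print(first_solvable_depth(("<->",), consensus_delta, 2))
print(first_solvable_depth(GRAPHS, consensus_delta, 2))
```

The bound `max_depth` turns the semi-decision procedure described above into a terminating one; a result of `None` only says that no uniform cut up to that depth works.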
In this section we focus on wait-free solvability. Assuming we have found a finite terminating slice , we computationally determine the space of all sections and therefore the decision map . We do this via cohomology, i.e., we turn our structure into an abelian group and extract from it all possible solutions.
Given a task sheaf , the process of obtaining the th sheaf cohomology can be understood as an iteration of the following steps.
Here is the category of simplicial abelian groups, is the category of (co)chain complexes with integer coefficients, and the category of abelian groups, where our cohomology lives.
We provide a brief explanation, and illustrate it below in Example 5.4. Given a cset containing the possible system output states, we can obtain a simplicial abelian group through the left adjoint of the forgetful functor that sends it to the underlying simplicial set (here, chromatic semi-simplicial sets are treated as simplicial sets for simplicity, as they only add the process labeling, which would require extra bookkeeping). The third map gets us a cochain complex with differentials defined from the alternating sum of the (co)face maps. Finally, we obtain the -th cohomology group, which corresponds to a space of sections of our sheaf. This construction is well known in the literature of algebraic topology [15] and we adapt it as a tool for understanding distributed tasks.
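The step from face maps to differentials can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it uses a single filled triangle with constant integer coefficients rather than a task sheaf, and the helper names `faces`, `coboundary` and `matmul` are hypothetical. It builds the differentials from the alternating sum of the face maps and checks that two consecutive differentials compose to zero.

```python
from itertools import combinations

def faces(simplex):
    """Yield (sign, face) pairs: the i-th face drops vertex i, with sign (-1)^i."""
    for i in range(len(simplex)):
        yield (-1) ** i, simplex[:i] + simplex[i + 1:]

def coboundary(low, high):
    """Matrix of the differential from cochains on `low` to cochains on `high`.

    Rows are indexed by the simplices in `high`, columns by those in `low`.
    """
    index = {simplex: j for j, simplex in enumerate(low)}
    matrix = [[0] * len(low) for _ in high]
    for i, simplex in enumerate(high):
        for sign, face in faces(simplex):
            matrix[i][index[face]] = sign
    return matrix

def matmul(a, b):
    """Plain integer matrix product a * b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Assumed toy complex: one filled triangle on the vertices 0, 1, 2.
vertices = [(0,), (1,), (2,)]
edges = list(combinations(range(3), 2))   # (0, 1), (0, 2), (1, 2)
triangles = [(0, 1, 2)]

d0 = coboundary(vertices, edges)      # 0-cochains -> 1-cochains
d1 = coboundary(edges, triangles)     # 1-cochains -> 2-cochains

# The alternating signs make consecutive differentials cancel.
print(matmul(d1, d0))
```

The zero product is what makes the quotient "kernel modulo image" well defined in every degree; for the zeroth degree there is no incoming differential, so the cohomology is just the kernel of `d0`.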
Definition 5.1 (Space of Zero- and One- Cochains).
Resembling Definition 4.3, we define
as the space of zero cochains of the sheaf , i.e., the vector space of all possible assignments to vertices (configurations) in . Similarly
is the space of one-cochains, i.e., the space of possible output choices for processors.
The two cochain groups are connected via a linear coboundary map . This maps a specific choice of output vectors to the individual choices along the indistinguishability edges defined by the restriction maps. To define we choose an arbitrary direction on each indistinguishability edge just to facilitate an algebraic representation.
Definition 5.2 (Coboundary Map).
We denote by the coboundary map, defined per edge in the sheaf as:
where we assume that the chosen direction goes from .
We can represent the coboundary map as a coboundary matrix where rows are indexed by edges and columns are indexed by configurations:
One can think of simply as computing the difference between two indistinguishable configurations. A section on our sheaf is a -cochain that is mapped to by , i.e., an assignment to configurations such that any process that cannot distinguish two configurations decides the same value in both. The set of all sections is then the kernel .
Definition 5.3 (Zeroth Cohomology).
The zero-th cohomology is , i.e., the kernel of the coboundary map.
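To make the coboundary matrix and its kernel concrete, here is a minimal Python sketch under several assumptions that are not taken from the definitions above: three configurations arranged in a path, two processes, stalks over configurations taken as the full rational plane (the restriction to task-solving vectors is ignored), edge stalks holding the output of the one process that cannot distinguish the endpoints, and the sign convention "target minus source". The function names are hypothetical.

```python
from fractions import Fraction

def coboundary_matrix(num_vertices, stalk_dim, edges):
    """Rows are indexed by edges, columns by configurations.

    `edges` holds tuples (u, v, r_u, r_v): an edge directed from u to v with
    restriction maps r_u, r_v given as lists of rows of length `stalk_dim`.
    Assumed convention: (delta x)_e = r_v x_v - r_u x_u.
    """
    rows = []
    for u, v, r_u, r_v in edges:
        for k in range(len(r_u)):
            row = [Fraction(0)] * (num_vertices * stalk_dim)
            for j in range(stalk_dim):
                row[v * stalk_dim + j] += Fraction(r_v[k][j])
                row[u * stalk_dim + j] -= Fraction(r_u[k][j])
            rows.append(row)
    return rows

def kernel_basis(matrix, num_cols):
    """Exact basis of the null space via Gauss-Jordan elimination."""
    m = [row[:] for row in matrix]
    pivots, r = [], 0
    for c in range(num_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        inv = 1 / m[r][c]
        m[r] = [x * inv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(num_cols) if c not in pivots):
        vec = [Fraction(0)] * num_cols
        vec[free] = Fraction(1)
        for row_idx, pcol in enumerate(pivots):
            vec[pcol] = -m[row_idx][free]
        basis.append(vec)
    return basis

def is_section(delta, x):
    """A 0-cochain is a section exactly when the coboundary sends it to zero."""
    return all(sum(a * b for a, b in zip(row, x)) == 0 for row in delta)

# Projections onto the output of process 0 and of process 1.
P0, P1 = [[1, 0]], [[0, 1]]
# Process 0 cannot distinguish configurations 0 and 1; process 1 cannot
# distinguish configurations 1 and 2.
edges = [(0, 1, P0, P0), (1, 2, P1, P1)]

delta = coboundary_matrix(num_vertices=3, stalk_dim=2, edges=edges)
basis = kernel_basis(delta, num_cols=6)
print(len(basis))   # dimension of the space of sections in this toy sheaf
```

In this toy sheaf the two edge constraints are independent, so the kernel has dimension four. Exact rational arithmetic is used so that "mapped to zero" is tested without floating-point tolerance.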
Example 5.4.
Let us consider the approximate agreement problem for 2 processes in the lossy link synchronous message adversary setting, illustrated in Figure 6. We are interested in whether, given a full-information protocol, the induced system execution graph allows for a terminal execution cut, such that by Theorem 4.4 we can find a section that gives us a decision map.
We set the possible input vectors to , the possible output vectors to and define the validity map as
Intuitively, in the initial configurations we cannot find a section, since both configurations, and , force the respective connecting configurations and to choose an output vector that projects to and , which does not exist in . We can formalize this impossibility starting with the coboundary matrix , where we number the configurations by the initial values interpreted in binary, and edges are tuples of configurations, with the direction following the written order. Note that this is the coboundary matrix of the system slice consisting of all initial configurations, i.e., after steps. The coboundary matrix for all configurations after one step has the unwieldy dimensions of .
In order to find the kernel of , we can assume some arbitrary assignment vector to the configurations (as the configurations and have only one possible choice by validity) and solve :
This proves that epsilon agreement is impossible in steps: the required solutions and are not possible solutions, so the kernel is trivial. The impossibility itself does not come as a surprise. The novelty lies in the fact that every step we took was purely deterministic and computable, meaning that such operations could have been done by a program.
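The mechanical nature of this check can be illustrated with a small brute-force search in Python. Everything concrete in it is an assumption made for illustration and not the example's actual data: binary inputs, an output grid of 0, 1/2 and 1, an agreement bound of 1/2, and a validity rule that forces the common value when both inputs agree. The sketch enumerates assignments directly instead of solving the linear system, which is feasible only because the slice of initial configurations is tiny.

```python
from fractions import Fraction
from itertools import combinations, product

HALF = Fraction(1, 2)
EPSILON = HALF                                # hypothetical agreement bound
VALUES = [Fraction(0), HALF, Fraction(1)]     # hypothetical output grid

# Output vectors whose two components are at most EPSILON apart.
OUTPUTS = [(a, b) for a, b in product(VALUES, repeat=2) if abs(a - b) <= EPSILON]

# Initial configurations: one binary input per process.
CONFIGS = list(product([0, 1], repeat=2))

def valid_outputs(config):
    """Assumed validity: equal inputs force that value as the only output."""
    if config[0] == config[1]:
        v = Fraction(config[0])
        return [(v, v)]
    return OUTPUTS

# Process i cannot distinguish two initial configurations that agree on input i.
EDGES = [(c, d, i)
         for c, d in combinations(CONFIGS, 2)
         for i in range(2)
         if c[i] == d[i]]

def sections():
    """All assignments of valid outputs that agree along every edge."""
    found = []
    for choice in product(*(valid_outputs(c) for c in CONFIGS)):
        assignment = dict(zip(CONFIGS, choice))
        if all(assignment[c][i] == assignment[d][i] for c, d, i in EDGES):
            found.append(assignment)
    return found

print(sections())
```

Under these assumptions no assignment survives: a mixed-input configuration inherits 0 from one neighbour and 1 from the other, and that pair is not an admissible output vector.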
We depict the system slice consisting of all configurations after one step in Figure 6. forces to choose an output that matches ’s decision, i.e., . This forces ’s decision in to be a valid output could be . Again, this forces ’s hand in to , we could choose here. But now we run into trouble: in configuration we cannot find an output vector that maps ’s value to , as the only possible choice here is . Therefore this assignment is not a section.
Note that this example is not a proof that one cannot solve approximate agreement after one step, and is only meant to illustrate the role of cohomology in determining task solvability, as the co-boundary matrix after one step is already huge. But, as already illustrated, any step here is deterministic and computable, therefore we can find a section after two steps, implying the existence of a protocol solving epsilon agreement.
We formalize this intuition in Theorem 5.5.
Theorem 5.5 (Computable Decision Maps).
Let be a task that can be solved in a finite system slice (i.e., with finitely many terminal configurations) in a given execution graph . Then its decision map is computable in finite time.
The idea is simple: compute the execution graph layer by layer and check whether any system slice admits a non-trivial zeroth cohomology.
Proof 5.6.
Assume the task can be solved in a finite system slice. We can iteratively build up the tree, up to some (as any configuration has at most finitely many directly causally dependent configurations, i.e., children in the execution graph, we can label any node by its depth; we then iteratively build the tree up to depth ). For each , choose any possible and corresponding system slice and compute its zeroth cohomology. If it admits a non-trivial kernel on , then derive a protocol as described in Theorem 4.4.
By assumption such a exists. By building the tree iteratively, eventually this will be found and the iteration terminates.
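The control flow of this procedure can be sketched in Python as follows. The sketch is deliberately abstract and rests on stated assumptions: it only inspects slices made of all configurations at a common depth (the proof allows arbitrary finite slices), it takes a depth bound so that the loop always stops, and the callbacks `successors` and `find_sections`, as well as the toy stand-ins below, are hypothetical placeholders for the execution graph and the cohomology computation.

```python
def search_layers(initial_layer, successors, find_sections, max_depth):
    """Return (depth, sections) for the first layer that admits sections.

    Returns None if no layer up to `max_depth` admits any. The bound is an
    addition for this sketch; the procedure in the proof is unbounded and
    terminates only because a solvable finite slice is assumed to exist.
    """
    layer = list(initial_layer)
    for depth in range(max_depth + 1):
        sections = find_sections(layer)
        if sections:
            return depth, sections
        # Expand every configuration to its finitely many children.
        layer = [child for config in layer for child in successors(config)]
    return None

# Toy stand-ins: a configuration is the tuple of binary choices made so far.
def toy_successors(config):
    return [config + (0,), config + (1,)]

def toy_find_sections(layer):
    # Pretend that sections first appear once two steps have been taken.
    if all(len(config) >= 2 for config in layer):
        return [{config: sum(config) for config in layer}]
    return []

result = search_layers([()], toy_successors, toy_find_sections, max_depth=5)
print(result[0])   # depth of the first layer with a section in the toy model
```

Replacing `toy_find_sections` by a routine that builds the coboundary matrix of the slice and returns a kernel basis would turn the skeleton into the search described in the proof, at the cost of the matrix sizes growing with each layer.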
6 Conclusions
Our results, and in particular our task sheaf construction, constitute, to the best of our knowledge, the first sheaf-theoretic characterization of general distributed computing tasks. Moreover, the generality of our model allows us to describe a wide range of systems that only need to satisfy minimal assumptions, namely, that the set of processes is finite and that communication takes place via messages.
By expressing tasks as a sheaf, we are able to incorporate cohomology theory as a powerful tool for distributed systems. For instance, the cohomology of a task sheaf is a group that represents the “obstructions” or “limitations” in the distributed system that prevent a specific task from being solved. Moreover, we show an impossibility result by simply looking at the cohomology group of its task sheaf. However, the cohomology of a task sheaf is not restricted to determining impossibilities; it may also be used for explicitly finding a protocol. Thus, sheaf cohomology is shown to be a powerful and promising tool for obtaining novel results and insights in distributed computing.
Finally, the rigorous categorical foundation of our approach provides a solid starting point for further research, such as incorporating failure models or exploring the complexity of protocol synthesis for different task definitions.
References
- [1] Samson Abramsky and Adam Brandenburger. The sheaf-theoretic structure of non-locality and contextuality. New Journal of Physics, 13(11):113036, 2011. doi:10.1088/1367-2630/13/11/113036.
- [2] E. A. Akkoyunlu, K. Ekanadham, and R. V. Huber. Some constraints and tradeoffs in the design of network communications. In SOSP ’75: Proceedings of the fifth ACM symposium on Operating systems principles, pages 67–74, New York, NY, USA, 1975. ACM. doi:10.1145/800213.806523.
- [3] Manuel Alcántara, Armando Castañeda, David Flores-Peñaloza, and Sergio Rajsbaum. The topology of look-compute-move robot wait-free algorithms with hard termination. Distrib. Comput., 32(3):235–255, June 2019. doi:10.1007/s00446-018-0345-3.
- [4] Hagit Attiya, Armando Castañeda, and Thomas Nowak. Topological Characterization of Task Solvability in General Models of Computation. In Rotem Oshman, editor, 37th International Symposium on Distributed Computing (DISC 2023), volume 281 of Leibniz International Proceedings in Informatics (LIPIcs), pages 5:1–5:21, Dagstuhl, Germany, 2023. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. ISSN: 1868-8969. URL: https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.DISC.2023.5, doi:10.4230/LIPIcs.DISC.2023.5.
- [5] Arnaud Casteigts, Paola Flocchini, Walter Quattrociocchi, and Nicola Santoro. Time-varying graphs and dynamic networks. International Journal of Parallel, Emergent and Distributed Systems, 27(5):387–408, 2012. doi:10.1080/17445760.2012.668546.
- [6] Bernadette Charron-Bost and André Schiper. The Heard-Of model: computing in distributed systems with benign faults. Distributed Computing, 22(1):49–71, April 2009. doi:10.1007/s00446-009-0084-6.
- [7] Justin M Curry. Sheaves, cosheaves and applications, 2014.
- [8] Ronald Fagin, Joseph Y. Halpern, Yoram Moses, and Moshe Y. Vardi. Reasoning About Knowledge. MIT Press, 1995. doi:10.7551/mitpress/5803.001.0001.
- [9] Stephan Felber, Bernardo Hummes Flores, and Hugo Rincon Galeana. Brief announcement: A sheaf-theoretic characterization of tasks in distributed systems. In Ulrich Schmid and Roman Kuznets, editors, Structural Information and Communication Complexity - 32nd International Colloquium, SIROCCO 2025, Delphi, Greece, June 2-4, 2025, Proceedings, volume 15671 of Lecture Notes in Computer Science, pages 425–430. Springer, 2025. doi:10.1007/978-3-031-91736-3_26.
- [10] Faith E. Fich and Eric Ruppert. Hundreds of impossibility results for distributed computing. Distributed Comput., 16(2-3):121–163, 2003. doi:10.1007/s00446-003-0091-y.
- [11] Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson. Impossibility of distributed consensus with one faulty process. J. ACM, 32(2):374–382, April 1985. doi:10.1145/3149.214121.
- [12] Éric Goubault, Roman Kniazev, Jérémy Ledent, and Sergio Rajsbaum. Semi-Simplicial Set Models for Distributed Knowledge. 2023 38th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), pages 1–13, 2023. doi:10.1109/LICS56636.2023.10175737.
- [13] Joseph Y. Halpern and Yoram Moses. Knowledge and common knowledge in a distributed environment. J. ACM, 37(3):549–587, 1990. doi:10.1145/79147.79161.
- [14] Jakob Hansen and Robert Ghrist. Opinion Dynamics on Discourse Sheaves. SIAM Journal on Applied Mathematics, 81(2):2033–2060, 2021. doi:10.1137/20M1341088.
- [15] Allen Hatcher. Algebraic topology. Cambridge University Press, Cambridge, 2002.
- [16] Maurice Herlihy, D. N. Kozlov, and Sergio Rajsbaum. Distributed Computing through Combinatorial Topology. Morgan Kaufmann, 2014.
- [17] Maurice Herlihy and Nir Shavit. The asynchronous computability theorem for t-resilient tasks. In STOC ’93: Proceedings of the twenty-fifth annual ACM symposium on Theory of computing, pages 111–120, New York, NY, USA, 1993. ACM. doi:10.1145/167088.167125.
- [18] Cliff A. Joslyn, Lauren Charles, Chris DePerno, Nicholas Gould, Kathleen Nowak, Brenda Praggastis, Emilie Purvine, Michael Robinson, Jennifer Strules, and Paul Whitney. A Sheaf Theoretical Approach to Uncertainty Quantification of Heterogeneous Geolocation Information. Sensors, 20(12):3418, 2020. doi:10.3390/s20123418.
- [19] Leslie Lamport. The weak byzantine generals problem. Journal of the ACM (JACM), 30(3):668–676, 1983.
- [20] Saunders Mac Lane and Ieke Moerdijk. Sheaves in Geometry and Logic: A First Introduction to Topos Theory. Universitext. Springer New York, 1994. doi:10.1007/978-1-4612-0927-0.
- [21] M. Makkai and J. Rosický. Cellular categories. Journal of Pure and Applied Algebra, 218(9):1652–1664, 2014. doi:10.1016/j.jpaa.2014.01.005.
- [22] Yoram Moses and Sergio Rajsbaum. A Layered Analysis of Consensus. SIAM Journal on Computing, 31(4):989–1021, 2002. doi:10.1137/S0097539799364006.
- [23] Thomas Nowak, Ulrich Schmid, and Kyrill Winkler. Topological characterization of consensus in distributed systems: Dedicated to the 2018 Dijkstra Prize winners Bowen Alpern and Fred B. Schneider. J. ACM, August 2024. Just Accepted. doi:10.1145/3687302.
- [24] Michel Raynal. Concurrent Programming: Algorithms, Principles, and Foundations. Springer, Berlin, Heidelberg, 2013. URL: https://link.springer.com/10.1007/978-3-642-32027-9, doi:10.1007/978-3-642-32027-9.
- [25] Michael Robinson. Sheaves are the canonical data structure for sensor integration. Information Fusion, 36:208–224, 2017. doi:10.1016/j.inffus.2016.12.002.
- [26] Nicola Santoro and Peter Widmayer. Time is not a healer. In Proc. 6th Annual Symposium on Theor. Aspects of Computer Science (STACS’89), LNCS 349, pages 304–313, Paderborn, Germany, February 1989. Springer-Verlag.
- [27] Nicola Santoro and Peter Widmayer. Agreement in synchronous networks with ubiquitous faults. Theoretical Computer Science, 384(2):232–249, 2007. Structural Information and Communication Complexity (SIROCCO 2005). URL: https://www.sciencedirect.com/science/article/pii/S0304397507003350, doi:10.1016/j.tcs.2007.04.036.
- [28] Ulrich Schmid, Manfred Schwarz, and Kyrill Winkler. On the Strongest Message Adversary for Consensus in Directed Dynamic Networks. In Structural Information and Communication Complexity, Cham, 2018. Springer International Publishing. doi:10.1007/978-3-030-01325-7_13.