
Convex Decreasing Algorithms: Distributed Synthesis and Finite-time Termination in Higher Dimension

James Melbourne, Govind Saraswat, Vivek Khatana,
Sourav Patel, and Murti V. Salapaka
James Melbourne, Vivek Khatana, Sourav Patel, and Murti V. Salapaka are with the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, USA. Govind Saraswat is with the National Renewable Energy Laboratory, Golden, CO, USA (e-mail: govind.saraswat@nrel.gov). This work was authored in part by the National Renewable Energy Laboratory, operated by Alliance for Sustainable Energy, LLC, for the U.S. Department of Energy (DOE) under Contract No. DE-AC36-08GO28308. Funding provided by the Advanced Research Projects Agency-Energy (ARPA-E) under grant no. DE-AR0000701. The views expressed in the article do not necessarily represent the views of the DOE or the U.S. Government. The U.S. Government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this work, or allow others to do so, for U.S. Government purposes.
Abstract

We introduce a general mathematical framework for distributed algorithms, and a monotonicity property frequently satisfied in applications. These properties are leveraged to provide finite-time guarantees for converging algorithms, suited for use in the absence of a central authority. A central application is to consensus algorithms in higher dimensions. These pursuits motivate a new peer-to-peer convex hull algorithm, which we demonstrate to be an instantiation of the described theory. To address the diversity of convex sets and the potential computation and communication costs of determining such sets in high dimension, a lightweight norm-based stopping criterion is developed. More explicitly, we give a distributed algorithm that terminates in finite time when applied to consensus problems in higher dimensions and guarantees the convergence of the consensus algorithm in norm, within any given tolerance. Applications to consensus-based least squares estimation and distributed function determination are developed. The practical utility of the algorithm is illustrated through MATLAB simulations.

I Introduction

Recent advancements in the field of intelligent communication technologies and ad-hoc wireless sensor networks for monitoring and control of multi-agent systems have necessitated the design of algorithms that establish global information without complete knowledge of the system. The task of obtaining the global information is often accomplished via a class of algorithms that employ strategies for achieving consensus. In a consensus algorithm, agents iteratively and in a distributed manner agree on a common state. The ideas of distributed consensus algorithms can be traced back to the seminal works [1, 2, 3, 4]. Recent works on consensus algorithms focus on designing protocols to drive agents to the average of their initial states, see [5, 6, 7, 8]. These protocols were designed for cases where the state of each agent is a scalar value. However, the increasing storage and computation capabilities of modern day sensor interfacing technologies have motivated large-scale applications, examples of which include distributed machine learning, see [9], multi-agent control and coordination, see [10, 11], distributed optimization problems, see [12, 13], and distributed sensor localization, see [14]. In order to meet the requirements of such applications there is a need for distributed consensus algorithms that allow for vector states. [15] presents such a higher dimensional consensus protocol. The framework in [15] is based on a leader-follower architecture with the agents being partitioned between anchors and sensors. Anchors are agents with fixed states behaving as leaders in the algorithm, while the sensors change their state by taking a convex combination of their state with the neighboring nodes' states.

Cyber-physical systems such as electrical power networks need to accommodate a large number of states for crucial applications such as state estimation, optimal dispatch, and demand response for ancillary services. [16] formulates the distributed apportioning problem using consensus protocols where only a single state is shared for each protocol. A similar situation involving higher dimensional states arises in distributed resource allocation problems, where a fixed amount of resource is to be apportioned among all participating agents in a network and a convex cost associated with each resource is to be minimized in a distributed manner. The method used to solve this resource allocation problem involves a higher dimensional consensus protocol (see [17]).

Other applications where accommodating higher dimensional data is required are distributed optimization and its applications to deep neural networks, as in the case of diffusion learning, where local agents perform model training using local datasets in parallel, see for example [18]; after the completion of the training process, model parameters are shared with the neighboring nodes, allowing for faster dissemination of the model parameters. This method also allows for asynchronous learning, where different agents learn with models of different parameters at every iteration. Another example is recent frameworks for unsupervised learning such as Generative Adversarial Networks (GANs), where the objective is to train learning models to be robust against adversarial attacks (see [19]). [20] proposes methods for scaling adversarial training and learning to large datasets on a distributed setup where each agent trains a discriminator model on local training data and a shared generator model is trained based on the feedback received from the discriminator models of the different agents. These applications need sharing of a high dimensional parameter vector, usually of the order of $32\times 32\times 3\sim 1024\times 1024\times 3$. Applications such as spectrum sensing in ad hoc cognitive radio networks, see [21], distributed detection of malicious attacks in finite-time consensus networks [22], and control of autonomous agents like unmanned aerial vehicles (UAVs) for search and survey operations, see [10], also depend on implementing higher dimensional consensus protocols.

Termination of consensus algorithms in finite time provides the advantage of obtaining an approximate consensus while saving valuable computation and communication resources. For the scalar average consensus protocols discussed earlier, the authors in [23, 24] have proposed such a finite-time stopping criterion utilizing two additional states, namely the global maximum and global minimum over the network. This allows each agent to distributively detect convergence to the (approximate) average and terminate further computations. The works in [25, 26] generalize this result to the cases of dynamic interconnection topology and communication delays. The authors in [27] present a method based on the minimal polynomial associated with the weight matrix in the state update iterations to achieve the consensus value in a finite number of iterations. However, to calculate the coefficients of the minimal polynomial each node has to run $n$ (the total number of agents) different linear iterations, each for at least $n+1$ time-steps.

The protocol developed here is based on a new geometric insight into the behavior of push-sum or ratio consensus algorithms that is of independent interest. We prove that many popular consensus algorithms fit into a general class of algorithms which we call “convex decreasing”. We demonstrate that convex decreasing consensus algorithms satisfy monotone convergence properties, which we leverage to give guarantees on network configurations. That is, if it can be determined that all nodes of a convex decreasing consensus algorithm belong to a convex set, then all nodes of the algorithm will remain in said convex set. We develop distributed stopping criteria based on this geometric insight.

Centralized algorithms for finding the convex hull of a finite set of $n$ points in a plane have long been studied. Such algorithms have a worst-case running time of $\mathcal{O}(n\log n)$, which is also the best achievable performance for obtaining the ordered hull (see [28]). However, with the recent advancements in distributed multi-agent systems, the problem of estimating the convex hull in a distributed manner, in the absence of a centralized entity, has become important. In this regard, distributed algorithms to estimate the convex hull have been proposed in the literature for applications such as classification problems, locational region estimation, and formation control, to name a few. In classification problems, [29, 30] have proposed a one-class scaled distributed convex hull algorithm where the geometric structure of the convex hull is used to model the boundary of the target class defining the one-class. However, approximating the convex hull in the original large dimensional space by means of randomly projected decisions on 2-D spaces results in a residual error when applied to finite time applications, as analyzed in [31]. In order to obtain the convex hull of the feature space (kernel space) by communicating the extremities, [32] employed a quadratic programming approach. However, the computational complexity of the proposed solution is limiting when extended to higher dimensions. [33] proposed a convex hull algorithm, but it requires the assumption that the feature space is generated from a Gaussian mixture model and thus is limited to applications of a special class of support vector machines. A difference in paradigm between the convex hull estimation pursued here and the literature on distributed programming for computation of convex hulls is that the current article constructs a protocol that can be implemented in a plug and play manner in the absence of a central authority or knowledge of the network.

The problem of computing a specified function of the sensor measurements is common in wireless sensor networks, see [34], [35]. The setup in [34] focuses on the problem of determining an arbitrary function of the sensor measurements from a specified sink node. The article studies the maximum rate at which the function can be calculated and communicated to the sink node, and provides a characterization of the achievable rates for three different classes of functions. The authors in [35] proposed a method to calculate any arbitrary function in networks utilizing a linear iteration, and show that the proposed method can be modeled as a dynamical system. Based on the structured system observability approach, it is shown that the linear iterations can determine any specified function of the initial values of the nodes after observing the evolution of the agent values for a large but finite number of time-steps. However, the method is based on forming observability matrices for all agents in the network; this poses limitations as the size of the network increases, leading to large computation and storage requirements. Further, the linear iterations require doubly-stochastic matrices for the agent state updates, which makes the method in [35] inapplicable to directed networks. In this article we propose a distributed algorithm to determine any arbitrary function of the initial values of the agents, applicable to general connected directed graph topologies. Moreover, the proposed method has a fixed storage requirement, and its communication overhead compared to the existing methods is insignificant.

In this article, we present a distributed stopping criterion for the higher dimensional consensus problem. Our first progress in this direction can be found in [36], where we investigated a distributed stopping algorithm for the special case of “ratio consensus”. Here we present a general theory of convex monotonicity, and demonstrate that one can apply the distributed termination techniques to any algorithm satisfying this criterion. In particular, we show that in popular consensus algorithms (ratio consensus and row stochastic updating, for example), the convex hulls of the network states (in any dimension), indexed by time, form a nested sequence of convex sets. This motivates an algorithm for distributively computing the convex hull within a time that scales linearly with the diameter of the network. We further provide a simpler algorithm which guarantees the convergence of the consensus algorithm in norm, within any given tolerance.

Statement of contribution:

  1. This article constructs a general mathematical framework for the convergence of network algorithms. In particular, a notion of monotonicity satisfied by important consensus algorithms is introduced. We show ratio consensus [5] and row stochastic updating of a network to be examples of “convex decreasing” algorithms.

  2. A convex hull algorithm, of independent interest, is developed for distributed determination of the extreme points of a set of vectors in the absence of a central authority. In the context of a distributed convex decreasing algorithm, the hull algorithm can be used by the agents to obtain the convex hull by a fixed time $t$, and thus give guarantees on the state of the convex decreasing algorithm for all times $t^{\prime}\geq t$.

  3. Feasibility concerns for convex hull computation are addressed for high dimensional data, and an alternative lightweight (in the sense of both computational and communication cost) stopping criterion is given that guarantees finite convergence within an $\varepsilon$-threshold of consensus with respect to an arbitrary norm.

  4. As applications of the theory developed, new stopping criteria are developed for consensus based least squares estimation as well as distributed function calculation that give the agents convergence guarantees in a peer-to-peer network.

The rest of the paper is organized as follows. In Section II, the basic definitions needed for subsequent developments are presented. Further, we discuss the setup for distributed average consensus in higher dimensions (called the vector consensus problem) using ratio consensus. Section III presents an analysis of the polytopes of the network states generated in the ratio consensus algorithm. Section IV develops a peer-to-peer convex hull algorithm. Section V establishes a norm-based finite-time termination criterion for the vector consensus problem. Theoretical findings are validated with simulations presented in Section VII, followed by conclusions in Section VIII.

II Definitions, and Problem Statement

II-A Definitions and Notations

In this section we present basic notions of graph theory and linear algebra which are essential for the subsequent developments. Detailed descriptions of graph theory and linear algebra notions are available in [37] and [38] respectively. We will also develop a general mathematical framework for consensus algorithms, and introduce a notion of convex monotonicity which will be crucial in the development of a stopping criterion for vector consensus.

Definition 1.

(Cardinality of a set) Let $A$ be a set. The cardinality of $A$, denoted by $|A|$, is the number of elements of $A$.

Definition 2.

(Directed Graph) A directed graph (denoted as digraph) $\mathcal{G}$ is a pair $(\mathcal{V},\mathcal{E})$ where $\mathcal{V}$ is a set of vertices or nodes and $\mathcal{E}$ is a set of edges, which are ordered pairs of distinct elements of $\mathcal{V}$. If an edge from $j\in\mathcal{V}$ to $i\in\mathcal{V}$ exists then it is denoted as $(i,j)\in\mathcal{E}$.

Definition 3.

(Path) In a directed graph, a directed path from node $i$ to $j$ exists if there is a sequence of distinct directed edges of $\mathcal{G}$ of the form $(k_{1},i),(k_{2},k_{1}),\dots,(j,k_{m})$. For the rest of the article, a path refers to a directed path.

Definition 4.

(Path Length) The path length, or length of a path, is the number of directed edges belonging to the path. By convention, we consider a node $i\in\mathcal{V}$ to be connected to itself by a path of length zero.

Definition 5.

(Strongly Connected Graph) A directed graph is strongly connected if for every $i,j\in\mathcal{V}$ there exists a directed path from node $i$ to node $j$.

Definition 6.

(In-Neighborhood) The set of in-neighbors of node $i\in\mathcal{V}$ is denoted by $N^{-}_{i}=\{j\ |\ (i,j)\in\mathcal{E}\}$. In this article, we assume $(i,i)\in\mathcal{E}$, so that $i\in N^{-}_{i}$ for all $i\in\mathcal{V}$.

Definition 7.

($m$-In-Neighborhood) For $m=0$ define $N^{-}_{i}(0)=\{i\}$, and for $m>0$ define $N^{-}_{i}(m)$ to be $\cup_{j\in N_{i}^{-}(m-1)}N^{-}_{j}$, so that $N^{-}_{i}(m)$ is the set of nodes $j\in\mathcal{G}$ from which $i$ can be reached in $m$ or fewer steps.

Definition 8.

(Diameter of a Graph) The diameter of a graph is the length of the longest shortest path between any two nodes in the network. We will take $D$ to be an upper bound on the diameter of the graph throughout the rest of the article.

Definition 9.

(Network State) For a vector space $W$, a $W$-valued network state is a function $f:\mathcal{V}\to W$; the set of such states is denoted by $W^{\mathcal{V}}$.

Definition 10.

(Network Update) A network update on $G=G(\mathcal{V},\mathcal{E})$ is a map $\phi:W^{\mathcal{V}}\to W^{\mathcal{V}}$.

Definition 11.

(Network and Consensus Algorithms) A discrete time network algorithm on $G=G(\mathcal{V},\mathcal{E})$ is a finite or countably infinite sequence of network updates $\{\phi_k\}_k$. When $W$ is endowed with a norm $\|\cdot\|$, a consensus algorithm is a network algorithm such that for any $f\in W^{\mathcal{V}}$,

$\Phi_{n}(f)\coloneqq\phi_{n}\circ\phi_{n-1}\circ\cdots\circ\phi_{1}(f)$ (1)

satisfies

$\lim_{n\to\infty}\|\Phi_{n}(f)(i)-\Phi_{n}(f)(j)\|=0,$

for all $i,j\in\mathcal{V}$.

We will only consider discrete time network algorithms in this work; a discrete time consensus algorithm will be referred to as a consensus algorithm hereafter.

Definition 12.

(Distributed Network Update and Algorithm) A network update $\phi$ is distributed if $f,g\in W^{\mathcal{V}}$ satisfying $f(j)=g(j)$ for $j\in N^{-}_{i}(1)$ implies $\phi(f)(i)=\phi(g)(i)$. A network algorithm $\{\phi_{k}\}_{k}$ is distributed if $\phi_{k}$ is a distributed network update for every $k$.

For $i\in\mathcal{V}$, define $\pi_{i}:W^{\mathcal{V}}\to W^{N_{i}^{-}(1)}$ to be the restriction of $f$ to $N_{i}^{-}(1)$. Explicitly, for $f\in W^{\mathcal{V}}$ and $j\in N_{i}^{-}(1)$, $\pi_{i}(f)(j)\coloneqq f(j)$. Also, define $\epsilon_{i}:W^{N_{i}^{-}(1)}\to W^{\mathcal{V}}$ by

$\epsilon_{i}(x)(j)=\begin{cases}x(j)&\text{ when }j\in N_{i}^{-}(1),\\ 0&\text{ else,}\end{cases}$

for $x\in W^{N_{i}^{-}(1)}$.

The following proposition gives an alternative formulation of the fact that distributed updates are determined locally.

Proposition II.1.

A network update $\phi$ is distributed if and only if for every $i$, $f\mapsto\phi(f)(i)$ can be expressed as $\psi_{i}(\pi_{i}(f))$ for a function $\psi_{i}:W^{N_{i}^{-}(1)}\to W$.

Proof.

First suppose a function $\psi_{i}$ satisfying $\psi_{i}(\pi_{i}(f))=\phi(f)(i)$ exists, and that $f$ and $g\in W^{\mathcal{V}}$ satisfy $f(j)=g(j)$ for $j\in N_{i}^{-}(1)$; then $\pi_{i}(f)=\pi_{i}(g)$. Thus $\phi(f)(i)=\psi_{i}(\pi_{i}(f))=\psi_{i}(\pi_{i}(g))=\phi(g)(i)$. Conversely, assume $\phi$ is a distributed network update and define, for $x\in W^{N_{i}^{-}(1)}$, $\psi_{i}(x)=\phi(\epsilon_{i}(x))(i)$. Observe that for $j\in N_{i}^{-}(1)$, $f(j)=\epsilon_{i}(\pi_{i}(f))(j)$, hence by the definition of a distributed network update

$\phi(f)(i)=\phi(\epsilon_{i}(\pi_{i}(f)))(i).$ (2)

Further, by the definition of the function $\psi_{i}$,

$\phi(\epsilon_{i}(\pi_{i}(f)))(i)=\psi_{i}(\pi_{i}(f)).$ (3)

Combining (2) and (3) completes the proof. ∎

Observe that in the case that $\psi_{i}$ is linear in $f$, in the sense that $\psi_{i}(f)=\sum_{j\in N_{i}^{-}(1)}\lambda_{ij}f_{j}$ for $\lambda_{ij}\in\mathbb{R}$, then $\phi_{k}(f)$ can be represented by a matrix ${\bf\Lambda_{k}}=(\lambda_{ij})$. Conversely, updates $\phi$ represented by matrices clearly induce linear $\psi_{i}$.

We will be concerned with linear consensus algorithms, those that can be built from matrix operations. That is, when $\phi_{n}$ can be represented by a matrix ${\bf\Lambda(n)}$, in the sense that

$(\phi_{n}f)(i)=\sum_{j}\Lambda_{ij}(n)f(j),$

we will write ${\bf\Lambda(n)}f$ in place of $\phi_{n}(f)$. Moreover, for brevity of exposition, our focus will be on the case that the dynamics are time homogeneous in the sense that ${\bf\Lambda(n)}={\bf\Lambda}$.
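To make the matrix representation concrete, the following is a minimal sketch in Python with numpy (our own tooling choice; the paper's simulations are in MATLAB) of applying a time homogeneous linear algorithm to a network state stored as a $|\mathcal{V}|\times d$ array, one row per node:

```python
import numpy as np

def apply_linear_algorithm(Lam: np.ndarray, f: np.ndarray, n: int) -> np.ndarray:
    """Return Phi_n(f) = Lam^n f for a time homogeneous linear algorithm.

    Lam : |V| x |V| update matrix, Lam[i, j] = Lambda_ij.
    f   : |V| x d network state, row i holds f(i) in W = R^d.
    """
    out = f.copy()
    for _ in range(n):
        out = Lam @ out  # (phi f)(i) = sum_j Lambda_ij f(j) for every node i
    return out
```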

Definition 13.

(Column Stochastic Matrix) A real $N\times N$ matrix ${\bf P}=[p_{ji}]$ is called a column stochastic matrix if $p_{ji}\geq 0$ for $1\leq i,j\leq N$ and $\sum_{j=1}^{N}p_{ji}=1$ for $1\leq i\leq N$.

Definition 14.

(Row Stochastic Matrix) A real $N\times N$ matrix ${\bf A}=[a_{ij}]$ is called a row stochastic matrix if $1\geq a_{ij}\geq 0$ for $1\leq i,j\leq N$ and $\sum_{j=1}^{N}a_{ij}=1$ for $1\leq i\leq N$.

Definition 15.

(Irreducible Matrix) An $N\times N$ matrix ${\bf A}$ is said to be irreducible if for any $1\leq i,j\leq N$ there exists $m\in\mathbb{N}$ such that $({\bf A}^{m})(i,j)>0$; that is, it is possible to reach any state from any other state in a finite number of hops.

Definition 16.

(Primitive Matrix) A non-negative matrix ${\bf A}$ is primitive if it is irreducible and has only one eigenvalue of maximum modulus.

As a notational convention, matrices will be written in bold face as above.

Definition 17.

(Convex hull) For a set $U\subseteq W$, the convex hull of $U$ is the smallest convex set containing $U$,

$co(U)=\bigcap_{\{F\text{ convex }:U\subseteq F\}}F.$ (4)

The topological closure of a set $U$ will be denoted

$\bar{U}=\bigcap_{\{F\text{ closed }:U\subseteq F\}}F,$

and the closure of a convex hull will be denoted $\overline{co}(U)\coloneqq\overline{co(U)}$.

For $f\in W^{\mathcal{V}}$ we consider $co(f)$ to be the convex hull of $f$, when considered as a set of elements of $W$ indexed by $\mathcal{V}$. More explicitly, if we denote the simplex by $\mathcal{S}_{n}\coloneqq\left\{t\in\mathbb{R}^{n}:t_{i}\geq 0,\sum_{i=1}^{n}t_{i}=1\right\}$ then for $f\in W^{\mathcal{V}}$,

$co(f)\coloneqq\left\{x\in W:x=\sum_{i\in\mathcal{V}}t_{i}f(i),\text{ for }t\in\mathcal{S}_{|\mathcal{V}|}\right\}$ (5)
Definition 18.

(Extreme point) For a convex set $U\subseteq W$, a point $u\in U$ is an extreme point of $U$ if $u=\frac{u_{1}+u_{2}}{2}$ for $u_{1},u_{2}\in U$ implies $u_{1}=u_{2}=u$; the set of extreme points of $U$ is denoted $\mathscr{E}(U)$. For a general $U$, define $\mathscr{E}(U)\coloneqq\mathscr{E}(co(U))$.

We will also have use for the following

Definition 19.

For a norm $\|\cdot\|$ and a set $U\subseteq W$, define the diameter of $U$ with respect to the norm $\|\cdot\|$ as $diam_{\|\cdot\|}(U)=\sup_{x,y\in U}\|x-y\|$.

Definition 20.

(Convex Decreasing) A sequence of sets $S_{n}\subseteq W$ is convex decreasing if $co(S_{n+1})\subseteq co(S_{n})$. A $W$-valued consensus algorithm is convex decreasing when the sets $S_{n}=\Phi_{n}(f)$ (with $\Phi_{n}(f)$ defined as in (1), and the convex hull of an element of $W^{\mathcal{V}}$ defined as in (5)) are convex decreasing for any $f\in W^{\mathcal{V}}$.

Our primary interest is in the case $W=\mathbb{R}^{d}$, and for this case we now recall a standard tool from convex geometry, the support function of a set, which we can use to give an analytic description of a convex decreasing sequence of sets. For $x,y\in\mathbb{R}^{d}$, we use the notation $\langle x,y\rangle=x^{T}y=\sum_{i=1}^{d}x_{i}y_{i}$, where $(\cdot)^{T}$ denotes the usual transpose operation.

Definition 21.

(Support function) For a non-empty set $A\subseteq\mathbb{R}^{d}$, define its support function

$h_{A}(u)\coloneqq\sup_{x\in A}\langle x,u\rangle.$
Proposition II.2.

Support functions satisfy the following:

  1. $A\subseteq B$ implies $h_{A}\leq h_{B}$.

  2. $h_{A}=h_{co(A)}$.

  3. $h_{A}=h_{\bar{A}}$.

  4. $h_{A}\leq h_{B}$ implies $\overline{co}(A)\subseteq\overline{co}(B)$.

  5. A sequence of compact sets $\{A_{n}\}$ is convex decreasing if and only if $h_{A_{n}}\geq h_{A_{n+1}}$ holds for all $n$.

Proof.

Observe that $A\subseteq B$ implies $h_{A}(u)=\sup_{a\in A}\langle a,u\rangle\leq\sup_{b\in B}\langle b,u\rangle=h_{B}(u)$, giving (1).

If $x\in co(A)$ then $x=\sum_{i=1}^{n}t_{i}a_{i}$ for some $t\in\mathcal{S}_{n}$ and $a_{i}\in A$. Thus $\langle x,u\rangle=\sum_{i}t_{i}\langle a_{i},u\rangle\leq\sum_{i}t_{i}\sup_{a\in A}\langle a,u\rangle=h_{A}(u)$. Thus $h_{co(A)}(u)\leq h_{A}(u)$, while the opposite inequality follows from (1), giving (2).

If $x\in\bar{A}$ then $x=\lim_{n}a_{n}$ for $a_{n}\in A$, so that $\langle x,u\rangle=\lim_{n}\langle a_{n},u\rangle\leq h_{A}(u)$. Thus $h_{\bar{A}}(u)\leq h_{A}(u)$. The opposite inequality follows from (1), and (3) follows.

Suppose that $A$ and $B$ are closed convex sets such that $h_{A}\leq h_{B}$, and take $a\in A\setminus B$. Then, by the hyperplane separation theorem [39], there exists $u$ such that $\langle a,u\rangle>\sup_{b\in B}\langle b,u\rangle=h_{B}(u)$. This would contradict $h_{A}\leq h_{B}$, so we must have $A\subseteq B$. For general $A$ and $B$, we need only recall from (2) and (3) that $h_{A}=h_{\overline{co}(A)}$ and $h_{B}=h_{\overline{co}(B)}$ and apply the previous argument to the closed convex hulls to obtain $\overline{co}(A)\subseteq\overline{co}(B)$. Thus (4) follows.

To prove (5), observe that if $\{A_{n}\}$ is a convex decreasing sequence, by definition $co(A_{n+1})\subseteq co(A_{n})$, so that $h_{A_{n}}=h_{co(A_{n})}\geq h_{co(A_{n+1})}=h_{A_{n+1}}$. Conversely, if $h_{A_{n}}\geq h_{A_{n+1}}$ then by (4), $\overline{co}(A_{n+1})\subseteq\overline{co}(A_{n})$, and since the $A_{n}$ are assumed compact their convex hulls are as well, and hence $co(A_{n+1})\subseteq co(A_{n})$. ∎
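As a concrete illustration of Proposition II.2(5), the following Python sketch (a hypothetical helper of our own, assuming numpy) evaluates the support function of a finite point set and checks $h_{A_{n+1}}\leq h_{A_{n}}$ on a sample of directions; sampling finitely many directions gives a necessary-condition test only, not a proof of nesting:

```python
import numpy as np

def support(points: np.ndarray, u: np.ndarray) -> float:
    """h_A(u) = max over the rows x of `points` of <x, u>."""
    return float(np.max(points @ u))

def looks_convex_decreasing(A_next, A_curr, n_dirs=1000, tol=1e-9, seed=0):
    """Sampled check of h_{A_{n+1}}(u) <= h_{A_n}(u), Proposition II.2(5)."""
    rng = np.random.default_rng(seed)
    dirs = rng.normal(size=(n_dirs, A_curr.shape[1]))  # random directions u
    return all(support(A_next, u) <= support(A_curr, u) + tol for u in dirs)
```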

We will also have use for a few basic results from convex geometry, which we collect below.

Lemma II.1.

For $K$ convex and compact, $co(\mathscr{E}(K))=K$. For $A_{\alpha}\subseteq\mathbb{R}^{d}$, $co\left(\cup_{\alpha}A_{\alpha}\right)=co\left(\cup_{\alpha}co(A_{\alpha})\right)$.

Proof.

The first result is standard (see [39]); its infinite dimensional generalization is the Krein-Milman theorem (see [40]). For the second result, clearly $\cup_{\alpha}co(A_{\alpha})\supseteq\cup_{\alpha}A_{\alpha}$, so that $co(\cup_{\alpha}co(A_{\alpha}))\supseteq co(\cup_{\alpha}A_{\alpha})$. The reverse inclusion follows by fixing $\alpha^{\prime}$ and observing $co(\cup_{\alpha}A_{\alpha})\supseteq co(A_{\alpha^{\prime}})$, which implies $co(\cup_{\alpha}A_{\alpha})\supseteq\cup_{\alpha}co(A_{\alpha})$. Since $co(\cup_{\alpha}co(A_{\alpha}))$ is the smallest convex set containing $\cup_{\alpha}co(A_{\alpha})$, the proof is complete. ∎

II-B Vector Consensus framework

Here, we extend a key result from [5, 6], where a ratio of two states was maintained to reach average consensus. We consider the network topology to be represented by a directed graph $\mathcal{G}(\mathcal{V},\mathcal{E})$ containing $|\mathcal{V}|<\infty$ nodes and satisfying the following assumptions throughout the rest of the paper.

Assumption 1.

The directed graph $\mathcal{G}(\mathcal{V},\mathcal{E})$ representing the agent interconnections is strongly connected.

Assumption 2.

Let ${\bf P}=[p_{ji}]$ be a primitive column stochastic matrix with digraph $\mathcal{G}(\mathcal{V},\mathcal{E})$, with $p_{ji}>0$ if and only if $(i,j)\in\mathcal{E}$.

Theorem II.1.

A sequence of matrices ${\bf A_{n}}$ defines a scalar consensus algorithm if and only if it defines an $\mathbb{R}^{d}$ vector consensus algorithm.

Proof.

The value of the $i$-th node in the $l$-th coordinate after $n$ iterations is the application of $n$ iterations to the $l$-th coordinate function evaluated at the $i$-th node, $(\prod_{k=1}^{n}{\bf A_{k}})(f)(i)_{l}=(\prod_{k=1}^{n}{\bf A_{k}})(f_{l})(i)$. Hence the theorem follows from the existence of $c$ and $C>0$ (dependent on dimension and choice of norm $\|\cdot\|$) such that

$c\max_{l}|\Phi_{n}(f)(i)_{l}-\Phi_{n}(f)(j)_{l}|\leq\|\Phi_{n}(f)(i)-\Phi_{n}(f)(j)\|\leq C\max_{l}|\Phi_{n}(f)(i)_{l}-\Phi_{n}(f)(j)_{l}|;$

with the fact above that $\Phi_{n}(f_{l})(i)=\Phi_{n}(f)(i)_{l}$ the result follows. ∎

Each node $i\in\mathcal{V}$ maintains three state estimates at time $k$, denoted by $x^{i}(k)\in\mathbb{R}^{d}$ (referred to as the numerator state of node $i$), $y_{i}(k)\in\mathbb{R}$ (referred to as the denominator state of node $i$) and $r^{i}(k)\in\mathbb{R}^{d}$ (referred to as the ratio state of node $i$). Here $d\ (\geq 1)$ is the dimension of each node's state. Node $j$ updates its numerator and denominator states at the $(k+1)^{th}$ discrete iteration according to the following update law:

$x^{j}(k+1)=\sum_{i\in N^{-}_{j}}p_{ji}x^{i}(k),$ (6)
$y_{j}(k+1)=\sum_{i\in N^{-}_{j}}p_{ji}y_{i}(k),$ (7)

where $N^{-}_{j}$ is the set of in-neighbors of node $j$. We will use the notation $x(k+1)={\bf P}x(k)$ as shorthand for (6), and observe that $x(n)={\bf P}^{n}x(0)$. The initial conditions for the numerator vector state and denominator state for any node $i\in\mathcal{V}$ are:

$x^{i}(0)=[x_{1}^{i}(0)\ x_{2}^{i}(0)\dots x_{d}^{i}(0)]^{T},\quad y_{i}(0)=1.$ (8)

Node $i$ further updates its ratio state as:

$r^{i}(k+1)=\frac{1}{y_{i}(k+1)}x^{i}(k+1).$ (9)

Under Assumptions 1-2 and the initialization in (8), the ratio state in (9) is well defined. The next theorem establishes the convergence of the ratio state; it is a direct and simple generalization of the result in [5, 6].

Theorem II.2.

Let $\{x^{i}(k)\}$, $\{y_{i}(k)\}$ and $\{r^{i}(k)\}$ be the sequences generated by (6), (7) and (9) respectively. Let the initial conditions for the network states be as defined in (8). Then, under Assumptions 1 and 2, the ratio state $r^{i}(k)$ asymptotically converges to $\overline{r}:=\lim_{k\rightarrow\infty}\frac{1}{y_{i}(k)}x^{i}(k)=\frac{1}{N}\sum_{j=1}^{N}x^{j}(0)$ for all $i\in\{1,\dots,N\}$.

Proof.

The proof follows by applying the result from [5, 6] component-wise to $x^{i}$ and $r^{i}$; thus we have convergence in the case when node states are vectors. ∎

In a slightly different framework, we consider the average consensus problem, where each node $i\in\mathcal{V}$ maintains a single state $z^{i}(k)\in\mathbb{R}^{d}$ for each time $k$, and updates its state according to the following update law:

$z^{i}(k+1)=\sum_{j\in N^{-}_{i}}a_{ij}z^{j}(k),$ (10)

where ${\bf A}=[a_{ij}]$ is primitive and row stochastic, with $a_{ij}>0$ if and only if $(i,j)\in\mathcal{E}$. In this case, $z^{i}(k)$ converges, independent of $i$, to $\sum_{j=1}^{N}\pi_{j}z^{j}(0)$ for some $\pi\in\mathcal{S}_{N}$ (see [41]), and $\pi_{j}=\frac{1}{N}$ in the case that ${\bf A}$ is column stochastic as well (see [11]).
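For concreteness, a minimal Python sketch of one iteration of the updates (6), (7), (9) and of (10) follows, with states stacked row-wise (numpy and the function names are our own assumptions, not from the paper):

```python
import numpy as np

def ratio_consensus_step(P, x, y):
    """One iteration of (6), (7) and (9).

    P : |V| x |V| column stochastic matrix with P[j, i] = p_ji.
    x : |V| x d numerator states; y : length-|V| denominator states.
    """
    x_next = P @ x                     # x^j(k+1) = sum_i p_ji x^i(k)
    y_next = P @ y                     # y_j(k+1) = sum_i p_ji y_i(k)
    r_next = x_next / y_next[:, None]  # r^j(k+1) = x^j(k+1) / y_j(k+1)
    return x_next, y_next, r_next

def row_stochastic_step(A, z):
    """One iteration of (10): A row stochastic, z is |V| x d."""
    return A @ z
```

Initializing $y(0)$ as the all-ones vector as in (8) and iterating, the rows of $r(k)$ approach the average of the rows of $x(0)$ by Theorem II.2.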

III Convex Hull based Finite-Time stopping Criterion

The following theorem shows that consensus algorithms that are convex decreasing converge to the same finite limit at all nodes, and that if one sets a threshold for convergence given by an open set about the consensus value, the threshold will be met in finite time. Further, when the threshold set is assumed convex, it is proven that if all agents possess a value within the set, their updated values remain within this threshold. In this sense, convex threshold sets provide a guarantee on the future behavior of the network.

Theorem III.1.

Suppose that $S_{n}=\Phi_{n}(f)$ represents the state of a convex decreasing consensus algorithm; then $S_{\infty}\coloneqq\lim_{n}\Phi_{n}(f)_{i}$ is finite and well defined, independent of $i$. Further, given a set $\mathcal{O}$ with non-empty interior containing $S_{\infty}$, there exists $n_{0}$ such that $n\geq n_{0}$ implies $S_{n}\subseteq\mathcal{O}$. If $K$ is a convex set such that $S_{M}\subseteq K$, then $S_{n}\subseteq K$ for $n\geq M$.

Proof.

For a convex set $K$, if $S_{M}\subseteq K$, then $co(S_{M})\subseteq K$, and since the $co(S_{n})$ are nested, the last statement follows immediately. Note that by Cantor's intersection theorem (see for example [42, Lemma 3.2.2]), since the $co(S_{n})$ are nested compact sets (the convex hull of finitely many points, $\{\Phi_{n}(f)_{i}\}_{i=1}^{|\mathcal{V}|}$, can be expressed as the continuous image of the simplex, a compact set), $\cap_{n}co(S_{n})$ is non-empty. Since the mapping $(x,y)\mapsto\|x-y\|$ is a convex map (indeed, the inequality $\|(1-t)x_{0}+tx_{1}-(1-t)y_{0}-ty_{1}\|\leq(1-t)\|x_{0}-y_{0}\|+t\|x_{1}-y_{1}\|$ follows from an application of the triangle inequality and scalar homogeneity, for any $t\in(0,1)$ and $x_{i},y_{i}\in W$), it follows that

$diam(co(S_{n}))=\max_{i,j}\|\Phi_{n}(f)_{i}-\Phi_{n}(f)_{j}\|.$

Thus the diameter of $co(S_{n})$ is the maximum of finitely many terms tending to zero, hence $\lim_{n}diam(co(S_{n}))=0$, and the non-empty set $\cap_{n}co(S_{n})$ can contain at most one point, which we denote $S_{\infty}$. Given an open set $\mathcal{O}$ containing $S_{\infty}$, $co(S_{n})\subseteq\mathcal{O}$ for large enough $n$, since $diam(co(S_{n}))$ tends to zero and $\mathcal{O}$ contains an $\varepsilon$-ball about $S_{\infty}$ with respect to $\|\cdot\|$ for small enough $\varepsilon$, since all finite dimensional norms are equivalent. ∎

Theorem III.1 allows us to provide stopping guarantees for convex decreasing consensus algorithms, particularly useful in distributed contexts. We will show that in both consensus frameworks (9) and (10), the network states $\{r^{i}(k)\}_{i=1}^{N}$ and $\{z^{i}(k)\}_{i=1}^{N}$ at time $k$ define sequences of polytopes $\{r(k)\}_{k=0}^{\infty}$ and $\{z(k)\}_{k=0}^{\infty}$,

$r(k)\coloneqq co(\{r^{i}(k)\}_{i=1}^{N}),$
$z(k)\coloneqq co(\{z^{i}(k)\}_{i=1}^{N}),$

that are convex decreasing.

Theorem III.2.

Consider the update equation (10) for $z(k)$, where $z(k+1)={\bf A}z(k)$ with ${\bf A}$ row stochastic, and the update equations for $r(k)$ as given in (6), (7) and (9), where $x(k+1)={\bf P}x(k)$, $y(k+1)={\bf P}y(k)$ and $r^{j}(k+1)=\frac{1}{y_{j}(k+1)}x^{j}(k+1)$ with ${\bf P}$ column stochastic. Then $z(k)$ and $r(k)$ form convex decreasing consensus algorithms.

Proof.

To see that $z(k)$ is convex decreasing is immediate, since by definition

$z^{i}(k+1)=\sum_{j\in N^{-}_{i}}a_{ij}z^{j}(k),$

where $\{a_{ij}\}_{j}$ is a sequence of non-negative numbers that sum to one. Thus $z^{i}(k+1)$ is a convex combination of elements of $z(k)$ and hence $z(k+1)=co(\{z^{i}(k+1)\})\subseteq z(k)$.
To see that the $r(k)$ are convex decreasing, since $r(k)$ is finite and hence compact for all $k$, by Proposition II.2 it is enough to show that their support functions are decreasing, that is, $h_{r(k+1)}\leq h_{r(k)}$. Note that from $r^{j}(k)=x^{j}(k)/y_{j}(k)$ the support function satisfies the following inequality for all $j$,

$\langle x^{j}(k),u\rangle\leq h_{r(k)}(u)y_{j}(k).$

With the ratio-consensus updates from ${\bf P}$ column stochastic, to prove convex decreasingness it suffices to show that $\langle r^{j}(k+1),u\rangle\leq h_{r(k)}(u)$, or equivalently,

$\langle x^{j}(k+1),u\rangle\leq h_{r(k)}(u)y_{j}(k+1).$

Computing,

$\langle x^{j}(k+1),u\rangle=\sum_{i}p_{ji}\langle x^{i}(k),u\rangle\leq\sum_{i}p_{ji}h_{r(k)}(u)y_{i}(k)=h_{r(k)}(u)y_{j}(k+1).$

That $r(k)$ and $z(k)$ are consensus algorithms follows from well known literature. In particular, $r(k)$ is a consensus algorithm by Theorem II.2; indeed, $r(k)$ converges to the average $\frac{1}{N}\sum_{i=1}^{N}x^{i}(0)\in\mathbb{R}^{d}$. For $z$, observe that $z(n)={\bf A}^{n}z(0)$. The connectivity properties of ${\bf A}$ ensure that

$\lim_{n\to\infty}{\bf A}^{n}=\begin{pmatrix}\pi_{1}&\pi_{2}&\dots&\pi_{|\mathcal{V}|}\\ \pi_{1}&\pi_{2}&\dots&\pi_{|\mathcal{V}|}\\ \vdots&\vdots&&\vdots\\ \pi_{1}&\pi_{2}&\dots&\pi_{|\mathcal{V}|}\end{pmatrix}$

for a $\pi\in\mathcal{S}_{|\mathcal{V}|}$ (see [41] for example, and note in the language of Markov chains that $a_{ii}>0$ ensures aperiodicity, while irreducibility follows from strong connectedness). As a consequence, $\lim_{n}z(n)=\lim_{n}{\bf A}^{n}z(0)$ has identical rows, each equal to $\sum_{i}\pi_{i}z^{i}(0)$. ∎

Below in Figure 1, we briefly illustrate the result of Theorem III.2 via an example. For the update $z(k+1)={\bf A}z(k)$ with ${\bf A}$ row stochastic, a 30 node Erdős-Rényi graph is initialized with values chosen uniformly at random from $(0,1)^{2}$. The points of $z(0)$ are displayed in red, $z(1)$ in blue, and $z(2)$ in green, with the convex hull boundaries of the respective sets traced in black. That row stochastic updating is convex decreasing is instantiated in the nestedness of the sets $z(i)$.

Figure 1: Iterations of a convex decreasing algorithm
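An experiment in the spirit of Figure 1 can be sketched as follows, a rough reconstruction under our own assumptions (numpy, scipy, and an edge probability of 0.2 are our choices, not the paper's exact simulation); by Theorem III.2 the hull of $z(k+1)$ lands inside the hull of $z(k)$ at every step:

```python
import numpy as np
from scipy.spatial import ConvexHull  # used only to trace hull boundaries

rng = np.random.default_rng(1)
n = 30
# Erdos-Renyi style adjacency with self-loops; normalizing each row
# yields a row stochastic update matrix A.
adj = (rng.random((n, n)) < 0.2) | np.eye(n, dtype=bool)
A = adj / adj.sum(axis=1, keepdims=True)

z = rng.random((n, 2))             # initial states, uniform on (0,1)^2
hulls = []
for k in range(3):                 # capture hulls of z(0), z(1), z(2)
    hulls.append(ConvexHull(z))    # nested by Theorem III.2
    z = A @ z
```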

When $d=1$, the convex hull is simply described: $co(r(k))=[\min_{j}\{r^{j}(k)\}_{j=1}^{n},\max_{j}\{r^{j}(k)\}_{j=1}^{n}]$, so that $co(r(k))\supseteq co(r(k+1))$ gives the monotonicity results from [26]: $\min_{j}\{r^{j}(k)\}_{j=1}^{n}\leq\min_{j}\{r^{j}(k+1)\}_{j=1}^{n}$ and $\max_{j}\{r^{j}(k)\}_{j=1}^{n}\geq\max_{j}\{r^{j}(k+1)\}_{j=1}^{n}$. Analogously, applying Theorem III.2 in the one dimensional case delivers the monotonicity of the min and max from [23]. In the $d$-dimensional case, taking $\mathcal{F}$ to be the family of all rectangular sets recovers Theorem 4 of [13].

We will use the monotonicity of convex decreasing consensus algorithms to develop a distributed stopping criterion, guaranteeing convergence of all nodes within an $\epsilon$-ball of the consensus value for a general norm. First we develop a distributed convex hull computation algorithm that is of independent interest. Here, given a function $w:\mathcal{V}\to\mathbb{R}^{d}$, the convex hull $co(w)\coloneqq co(\{w(i)\}_{i=1}^{n})$ is to be determined by the network in a distributed fashion. Consider an algorithm where, at stage one, agents share their value $w(j)$ with their neighbors. The nodes then update their approximation of the convex hull by determining the extreme points among all values received from their neighbors and their own. This new set of extreme points is communicated to all neighbors, and the process repeats. We show that after $D$ iterations ($D$ being the diameter of the network), every node will have determined the extreme points of $w$.

In the context of a consensus algorithm protocol, $w:\mathcal{V}\to\mathbb{R}^{d}$ represents $w(i)=r^{i}(k)$ or $z^{i}(k)$, the value at node $i$ in iteration $k$ of a consensus algorithm, and we implement the following stopping criterion. Given a norm $\|\cdot\|$ and a tolerance $\varepsilon$, implement the convex hull algorithm at time $k$; then at time $k+D$, at a node $j$, if $\max_{e(i),e(j)\in\mathscr{E}(w)}\|e(i)-e(j)\|\leq\varepsilon$, stop the consensus algorithm. In what follows we demonstrate that upon stopping, every node is within $\varepsilon$ of the consensus value in norm.

IV Peer to Peer Convex Hull Algorithm

We now describe a finite time algorithm for distributed convex hull computation. Suppose that there are $|\mathcal{V}|$ agents indexed by $i$, where each agent has a set $S_{i}$ of elements of $\mathbb{R}^{d}$. The agents can communicate with each other while respecting constraints imposed by a specific communication network. We provide a distributed consensus algorithm through which all agents obtain $\mathscr{E}\left(\cup_{i}S_{i}\right)$ in $D$ iterations of the algorithm. If we let $E$ denote the space of all finite sequences of elements of $\mathbb{R}^{d}$, the convex hull algorithm can be understood as a distributed consensus algorithm.

Definition 22.

For $S:\mathcal{V}\to E$, define the initialization $x_{i}(0)=\mathscr{E}(S_{i})$. Iteratively define

$x_{i}(t)=\mathscr{E}\left(\bigcup_{j\in N_{i}^{-}(1)}x_{j}(t-1)\right).$

We identify $x_{i}(t)$ with an element of $E$ by writing its elements in lexicographical order.

That is, at iteration $t$, an agent $i$ receives the extreme points known to its in-neighbors and forms a new set $s_{i}(t)$ comprised of its previous extreme points and those of its neighbors. Agent $i$ finds the extreme points of this new set, and then communicates the set to its out-neighbors to initiate another iteration.
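A minimal synchronous sketch of Definition 22 follows, assuming numpy and scipy (our choice of tooling; scipy's ConvexHull requires points in general position, so degenerate point sets would need a more careful extreme point routine):

```python
import numpy as np
from scipy.spatial import ConvexHull

def extreme_points(pts: np.ndarray) -> np.ndarray:
    """E(pts): the hull vertices; falls back to deduplication when there
    are too few points for a full-dimensional hull (a crude guard)."""
    if len(pts) <= pts.shape[1]:
        return np.unique(pts, axis=0)
    return pts[ConvexHull(pts).vertices]

def hull_round(x, in_neighbors):
    """One round: x[i] is node i's current point set (an array of rows),
    in_neighbors[i] lists N_i^-(1), including i itself per Definition 6."""
    return [extreme_points(np.vstack([x[j] for j in in_neighbors[i]]))
            for i in range(len(x))]

# After D rounds (D the graph diameter), every x[i] equals
# E(union of the S_j), as Theorem IV.1 below establishes.
```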

Theorem IV.1.

For any initial configuration defined by $S:\mathcal{V}\to E$, the algorithm for $x_{i}(t)$ described in Definition 22 is a distributed consensus algorithm. Moreover, considered as sets,

$x_{i}(t)=\mathscr{E}(S_{i}(t))$ (11)

where $S_{i}(t)\coloneqq\cup_{j\in N_{i}^{-}(t)}S_{j}$, and we recall that $N_{i}^{-}(t)$ is the $t$-in-neighborhood of node $i$ (see Definition 7). In particular, for $t\geq D$, $x_{i}(t)=x_{i}(D)=\mathscr{E}(\cup_{i\in\mathcal{V}}S_{i})$.

Proof.

The result is true by definition when $t=0$, since $S_{i}(0)=S_{i}$. Thus we proceed by induction and assume the result holds for $k<t$. By definition,

$x_{i}(t)=\mathscr{E}\left(\bigcup_{j\in N_{i}^{-}(1)}x_{j}(t-1)\right).$ (12)

By the induction hypothesis,

$\mathscr{E}\left(\bigcup_{j\in N_{i}^{-}(1)}x_{j}(t-1)\right)=\mathscr{E}\left(\bigcup_{j\in N_{i}^{-}(1)}\mathscr{E}(S_{j}(t-1))\right)=\mathscr{E}\left(\bigcup_{j\in N_{i}^{-}(1)}\mathscr{E}\left(\bigcup_{k\in N_{j}^{-}(t-1)}S_{k}\right)\right).$

Recall that for non-convex sets $U$, $\mathscr{E}(U)\coloneqq\mathscr{E}(co(U))$, so that

$\mathscr{E}\left(\cup_{j}\mathscr{E}\left(\cup_{k_{j}}S_{k_{j}}\right)\right)=\mathscr{E}\left(co(\cup_{j}\mathscr{E}\left(co(\cup_{k_{j}}S_{k_{j}})\right))\right)$

where the subscript $k_{j}$ ranges over the set $N_{j}^{-}(t-1)$. If we write $K_{j}=co(\cup_{k_{j}}S_{k_{j}})$ and apply Lemma II.1, with the fact that $K_{j}$ is convex and compact,

$co(\cup_{j}\mathscr{E}\left(K_{j}\right))=co(\cup_{j}co(\mathscr{E}(K_{j})))=co(\cup_{j}K_{j}).$

By the definition of $K_{j}$ and another application of Lemma II.1,

$co(\cup_{j}K_{j})=co(\cup_{j}co(\cup_{k_{j}}S_{k_{j}}))=co(\cup_{j}\cup_{k_{j}}S_{k_{j}}).$

Thus our result follows once we can show

$\bigcup_{j\in N_{i}^{-}(1)}\left(\bigcup_{k\in N_{j}^{-}(t-1)}S_{k}\right)=\bigcup_{j\in N_{i}^{-}(t)}S_{j}.$

Both sets can be considered as unions of $S_{k}$ indexed by paths of length not larger than $t$ terminating at $i$. More explicitly, both sets can be written as $\bigcup_{\lambda\in\Lambda}S_{\lambda}$ where $\Lambda$ is the space of all paths $v:\{0,1,2,\dots,k\}\to\mathcal{V}$ such that $k\leq t$ and $v(k)=i$. This gives (11). Since $N_{i}^{-}(t)=\mathcal{V}$ for $t\geq D$, $S_{i}(t)=\cup_{j\in\mathcal{V}}S_{j}$ and $x_{i}(D)=x_{j}(D)=\mathscr{E}(\cup_{j\in\mathcal{V}}S_{j})$. Thus the algorithm considered is a consensus algorithm. The algorithm is distributed as each $S_{i}(t)$ is a function of the $S_{j}(t-1)$ for $j\in N_{i}^{-}(1)$. ∎

This shows that agents in a distributed network can obtain exact knowledge of the convex hull in $D$ iterations. As an application, the convex hull algorithm can be used to provide a finite time stopping criterion for a convex decreasing consensus algorithm. We need the following lemma.

Lemma IV.1.

For a norm $\|\cdot\|$ and a convex set $K$,

$diam_{\|\cdot\|}(K)=\sup_{w_{1},w_{2}\in\mathscr{E}(K)}\|w_{1}-w_{2}\|.$
Proof.

For fixed $y\in K$, $x\mapsto\|x-y\|$ is convex and hence takes its maximum value on $K$ at an extreme point of $K$. Hence $\sup_{x\in K}\|x-y\|=\sup_{w_{1}\in\mathscr{E}(K)}\|w_{1}-y\|$. Applying the same argument to $y\mapsto\|w_{1}-y\|$ we obtain

$\sup_{y\in K}\sup_{w_{1}\in\mathscr{E}(K)}\|w_{1}-y\|=\sup_{w_{1}\in\mathscr{E}(K)}\sup_{w_{2}\in\mathscr{E}(K)}\|w_{1}-w_{2}\|,$

and our result follows. ∎

Theorem IV.2.

If $c^{i}(k)$ denotes the vector at node $i$ at time $k$ in a convex decreasing consensus algorithm, then for $k^{\prime}\geq k$,

$\|c^{i}(k^{\prime})-\lim_{n\to\infty}c^{i}(n)\|\leq\max_{w_{1},w_{2}\in\mathscr{E}(c(k))}\|w_{1}-w_{2}\|.$
Proof.

Denoting by c(k)c(k) the element of (d)𝒱(\mathbb{R}^{d})^{\mathcal{V}} defined by ici(k)i\mapsto c_{i}(k), the assumption that c(k)c(k) is convex decreasing implies co(c(k))co(c(k))co(c(k^{\prime}))\subseteq co(c(k)) for kkk^{\prime}\geq k, and hence ci(k)co(c(k))c_{i}(k^{\prime})\in co(c(k)) for all ii. Thus, limnci(n)co(c(k))\lim_{n}c^{i}(n)\in co(c(k)) as well and we have

ci(k)limnci(n)\displaystyle\|c^{i}(k^{\prime})-\lim_{n}c^{i}(n)\| diam(c(k))\displaystyle\leq diam_{\|\cdot\|}(c(k))
=maxw1,w2(ck)w1w2.\displaystyle=\max_{w_{1},w_{2}\in\mathscr{E}(c_{k})}\|w_{1}-w_{2}\|.

It follows that an agent can obtain exact bounds on the distance from convergence of the consensus with respect to an arbitrary norm.

Standard algorithms for computing the convex hull of a set of points in $d$ dimensions exist, see [43, 44]. However, such algorithms can easily be prohibitively expensive, especially in high dimension (the worst case runtime is of the order $O(n^{d/2})$), when computational resources or communication power are limited. Further, in the worst case, the number of extreme points can be of the same order as the number of nodes of the graph (take $w(i)$ to be points on a $d$-dimensional sphere, for instance), and hence their communication cost is equivalent to that of the entire system state. In the following section we develop a stopping algorithm to address these potential feasibility issues.

V Norm Based Finite-Time Termination

Similar to the convex hull comprising all the points (one corresponding to each agent), the radius of a minimal ball in $d$ dimensions enclosing all the points can also be used as a termination criterion. Once the radius is within some bound $\rho$, it can be shown that every agent's state is within $2\rho$ of the consensus value. We remark that even in the $p$-norm case, determination of a minimum norm ball in a distributed manner is a difficult problem (see [45]). Here, we provide an algorithm which distributedly finds an approximation of the minimal ball at each agent. We show that the minimal ball is enclosed in this approximation; thus if the approximate ball's radius is within $\rho$ then the minimal ball's radius is within $\rho$ as well. This is established in the next lemma.

Lemma V.1.

Let $\{c^{i}(k)\}$ be the sequence generated by a distributed convex decreasing consensus protocol. For all $i\in\mathcal{V}$, let

$R_{i}(k+1,k^{\prime})\coloneqq\max_{j\in N^{-}_{i}}\left\{\|c^{i}(k^{\prime}+k+1)-c^{j}(k^{\prime}+k)\|+R_{j}(k,k^{\prime})\right\}$ (13)

with $R_{i}(0,k^{\prime}):=0$ and $k^{\prime}\geq 0$. Then

$c^{j}(k^{\prime})\in B\{R_{i}(D,k^{\prime}),c^{i}(k^{\prime}+D)\},$ (15)

for all $j\in\mathcal{V}$, where $B\{R,x\}$ denotes the closed ball of radius $R$ centered at $x$ and $D$ is the diameter of the underlying graph topology.

Proof.

We first prove the following claim:

$c^{j}(k^{\prime})\in B\{R_{i}(k,k^{\prime}),c^{i}(k^{\prime}+k)\}$ (16)

for all $j\in\mathcal{V}$ such that the length of the shortest path from $i$ to $j$ is less than or equal to $k$. Clearly the above claim is sufficient to prove (15), as when $k=D$, (16) is valid for all $j\in\mathcal{V}$. Let the length of the shortest path from $i$ to $j$ be denoted $|path(i,j)|$. We prove the claim using induction. For $k=1$,

$R_{i}(1,k^{\prime})=\max_{j\in N^{-}_{i}}\|c^{i}(k^{\prime}+1)-c^{j}(k^{\prime})\|.$

Then for all $j\in N^{-}_{i}$, that is, for all $j$ such that $|path(i,j)|\leq 1$, we get

$\|c^{i}(k^{\prime}+1)-c^{j}(k^{\prime})\|\leq R_{i}(1,k^{\prime}),$ and thus $c^{j}(k^{\prime})\in B\{R_{i}(1,k^{\prime}),c^{i}(k^{\prime}+1)\}$.

Thus the assertion holds for $k=1$. Now let us assume (16) is true for $k$. Let $j$ be a node such that $|path(i,j)|\leq k+1$. Let $q$ be a neighbor of $i$ on the shortest path from $i$ to $j$; then $|path(q,j)|\leq k$. Then from the induction assumption,

$c^{j}(k^{\prime})\in B\{R_{q}(k,k^{\prime}),c^{q}(k+k^{\prime})\},$

that is,

$\|c^{q}(k^{\prime}+k)-c^{j}(k^{\prime})\|\leq R_{q}(k,k^{\prime}).$ (17)

From the definition of $R_{i}(k+1,k^{\prime})$,

$\|c^{i}(k^{\prime}+k+1)-c^{q}(k^{\prime}+k)\|+R_{q}(k,k^{\prime})\leq R_{i}(k+1,k^{\prime}).$

From the triangle inequality,

$\|c^{i}(k^{\prime}+k+1)-c^{j}(k^{\prime})\|\leq\|c^{i}(k^{\prime}+k+1)-c^{q}(k^{\prime}+k)\|+\|c^{q}(k^{\prime}+k)-c^{j}(k^{\prime})\|.$

Using (17),

$\|c^{i}(k^{\prime}+k+1)-c^{j}(k^{\prime})\|\leq\|c^{i}(k^{\prime}+k+1)-c^{q}(k^{\prime}+k)\|+R_{q}(k,k^{\prime}),$

which implies that

$\|c^{i}(k^{\prime}+k+1)-c^{j}(k^{\prime})\|\leq R_{i}(k+1,k^{\prime})$

and thus

$c^{j}(k^{\prime})\in B\{R_{i}(k+1,k^{\prime}),c^{i}(k^{\prime}+k+1)\},$

and the result follows. ∎

Lemma V.1 provides a distributed way to find a ball which encloses all the nodes. The only information needed by a node is the current radius of its neighbors (along with the states pertaining to ratio consensus), and it can determine the final radius within $D$ iterations. Further, since the ball $B\{R_{i}(D,k^{\prime}),c^{i}(k^{\prime}+D)\}$ encloses all the nodes, it also encloses the minimum ball, as mentioned earlier. Thus we have provided an algorithm to find an approximation of the minimum ball containing all nodes. We next present a framework which we use to prove that this radius converges to 0 and can be used as a distributed stopping criterion.
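A sketch of the recursion (13), again in Python with numpy and with names of our own choosing: each node needs only its in-neighbors' radii and states from the previous iteration.

```python
import numpy as np

def radius_step(R, c_new, c_old, in_neighbors):
    """One application of (13): R[j] holds R_j(k, k'); c_old and c_new are
    the |V| x d states at iterations k'+k and k'+k+1 respectively.
    Returns the vector of R_i(k+1, k')."""
    return np.array([
        max(np.linalg.norm(c_new[i] - c_old[j]) + R[j]
            for j in in_neighbors[i])
        for i in range(len(R))
    ])

# Starting from R = np.zeros(|V|) and applying radius_step D times
# alongside the consensus iterations yields R_i(D, k'), so every c^j(k')
# lies in the ball B{R_i(D, k'), c^i(k'+D)} by Lemma V.1.
```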

Let the coordinate-wise maximum and minimum of the states taken over all the agents at a time instant $k$ be given by $M(k)=[M_{1}(k)\ M_{2}(k)\ \dots M_{d}(k)]^{T}$ and $m(k)=[m_{1}(k)\ m_{2}(k)\ \dots m_{d}(k)]^{T}$ respectively. That is,

$M_{s}(k)\coloneqq\max_{i\in\mathcal{V}}\ c^{i}_{s}(k)$ (18)
$m_{s}(k)\coloneqq\min_{i\in\mathcal{V}}\ c^{i}_{s}(k)$ (19)

where $M_{s}(k)\in\mathbb{R}$, $m_{s}(k)\in\mathbb{R}$ for all $s\in\{1,2,\dots,d\}$ and $c_{s}^{i}(k)$ is the $s$-th element of $c^{i}(k)$. Then from [24], for all time instants $k^{\prime}\geq k$ and for all $i\in\mathcal{V}$ and $s\in\{1,2,\dots,d\}$,

$m_{s}(k)\leq c^{i}_{s}(k^{\prime})\leq M_{s}(k).$ (20)

Further from [24], for all $l\geq 0$ and $s\in\{1,2,\dots,d\}$,

$M_{s}((l+1)D)<M_{s}(lD)$ (21)
$m_{s}((l+1)D)>m_{s}(lD).$ (22)
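The quantities (18)-(20) are analysis devices rather than distributed computations; a short numpy sketch of them (helpers of our own, for illustration only) is:

```python
import numpy as np

def envelope(c: np.ndarray):
    """M(k), m(k) of (18)-(19) for states c stacked as a |V| x d array."""
    return c.max(axis=0), c.min(axis=0)

def inside_box(c_later: np.ndarray, M: np.ndarray, m: np.ndarray) -> bool:
    """Check (20): every later state stays in [m(k), M(k)] componentwise."""
    return bool(np.all(c_later <= M) and np.all(c_later >= m))
```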

By using (20), (21) and (22), we can prove the following theorem.

Theorem V.1.

Consider $c$ to be a convex decreasing consensus algorithm. Then

$\lim_{l\to\infty}M(lD)=\lim_{l\to\infty}m(lD)=c^{\infty}\coloneqq\lim_{l\to\infty}c^{i}(l),$

where $M(k)$ and $m(k)$ are as defined in (18) and (19).

Proof.

It follows from Theorem III.1 that $\lim_{k\rightarrow\infty}c^{i}(k)=c^{\infty}$ for all $i\in\mathcal{V}$. This implies that for all $i\in\mathcal{V}$, $s\in\{1,2,\dots,d\}$ and given $\epsilon>0$, there exists $L>0$ such that for all $k\geq L$, $|c^{i}_{s}(k)-c^{\infty}_{s}|<\epsilon$. This implies that there exists $Q>0$ such that for all $k\geq Q$, $|\max_{i}c^{i}_{s}(k)-c^{\infty}_{s}|<\epsilon$, and similarly $|\min_{i}c^{i}_{s}(k)-c^{\infty}_{s}|<\epsilon$. Thus it follows that $\lim_{k\rightarrow\infty}M_{s}(k)=c^{\infty}_{s}$ and $\lim_{k\rightarrow\infty}m_{s}(k)=c^{\infty}_{s}$. As subsequences of convergent sequences, the same conclusion follows for $M(lD)$ and $m(lD)$. ∎

Corollary V.1.

For $c$ a convex decreasing consensus algorithm,

$\lim_{l\rightarrow\infty}\|M(lD)-m(lD)\|=0.$ (23)

In particular, (23) holds for the ratio consensus protocol of (6), (7) and (9) when Assumptions 1 and 2 hold.

Proof.

The proof follows directly from Theorem V.1. ∎

It is clear from Lemma V.1 that at any instant $k$, all agents' states are within $2R_{i}(D,k)$ of each other, that is,

$\max_{i,j\in\mathcal{V}}\|c^{i}(k)-c^{j}(k)\|\leq 2R_{i}(D,k).$ (24)

Thus if $R_{i}(D,k)$ is within a tolerance $\rho/2$, all the agents' states will be within $\rho$ of consensus. We next provide a convergence result for $R_{i}(D,k)$ as $k\to\infty$.

Theorem V.2.

Consider a distributed convex decreasing consensus algorithm $c$ with update as in (13). Let $\overline{R_{i}}(l):=R_{i}(D,lD)$ for $l=0,1,2,\ldots$ and all $i\in\mathcal{V}$. Then

$\lim_{l\rightarrow\infty}\overline{R_{i}}(l)=0$

for all $i\in\mathcal{V}$.

Proof.

From definition,

Ri(1,lD)=maxjNi{ci(1+lD)cj(lD)+Rj(0,lD)}R_{i}(1,lD)=\underset{j\in N^{-}_{i}}{\max}\{\|c^{i}(1+lD)-c^{j}(lD)\|+R_{j}(0,lD)\}

which implies,

R_{i}(1, lD) \leq \max_{j\in N^{-}_{i}} \|c^{i}(1+lD) - c^{j}(lD)\| + \max_{j\in N^{-}_{i}} R_{j}(0, lD). (25)

Let M(lD) and m(lD) be as defined in Theorem V.1. As c^{i}(1+lD) \geq m(lD) and c^{j}(lD) \leq M(lD) coordinate-wise by (20), we get

\|c^{i}(1+lD) - c^{j}(lD)\| \leq \|M(lD) - m(lD)\|. (26)

Then using (26) in (25) and observing that R_{j}(0, lD) = 0 for all j\in\mathcal{V}, we get

R_{i}(1, lD) \leq \|M(lD) - m(lD)\|. (27)

Similarly,

R_{i}(2, lD) \leq \max_{j\in N^{-}_{i}} \|c^{i}(2+lD) - c^{j}(1+lD)\| + \max_{j\in N^{-}_{i}} R_{j}(1, lD). (28)

Again, as c^{i}(2+lD) \geq m(lD) and c^{j}(1+lD) \leq M(lD) coordinate-wise, we have

\|c^{i}(2+lD) - c^{j}(1+lD)\| \leq \|M(lD) - m(lD)\|. (29)

Then using (27), (28) and (29), we get

R_{i}(2, lD) \leq 2\|M(lD) - m(lD)\|.

Continuing in this manner up to step D, we have

\overline{R_{i}}(l) = R_{i}(D, lD) \leq D\|M(lD) - m(lD)\|. (30)

Then from Corollary V.1,

\lim_{l\rightarrow\infty} \overline{R_{i}}(l) = 0. ∎

Notice that \overline{R_{i}}(l) can differ across nodes, so each node might detect \rho-convergence (\overline{R_{i}}(l) < \rho) at a different time instant. By Lemma V.1, once \overline{R_{i}}(l) < \rho for any i\in\mathcal{V}, we have \|c^{i}(lD) - c^{j}(lD)\| < 2\rho for all j\in\mathcal{V}; that is, the ratio states are within 2\rho of the consensus value and consensus is achieved. Further, any node i which detects convergence can propagate a “converged flag” through the network. To take this into account, we run a separate 1-bit consensus algorithm (denoted convergence consensus) in which each node maintains a convergence state b_{i}(k) and shares it with its neighbors. Each node re-initializes b_{i} at every iteration lD, for l\in\{0,1,2,\dots\}, with 1 or 0 depending on whether it has detected convergence, and updates its value at every iteration using

b_{i}(k+1) = \bigcup_{j\in N^{-}_{i}} b_{j}(k), (31)

where \bigcup denotes the logical OR operation, k\geq 0, and b_{j}(0)=1 if node j has detected convergence at the initialization instant 0 and b_{j}(0)=0 otherwise. Clearly, if b_{j}(0)=1 for any j\in\mathcal{V}, then b_{i}(D)=1 for all i\in\mathcal{V}, where D is the diameter. Thus each node can use b_{i}(D) as a stopping criterion.
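For concreteness, here is a minimal sketch of the 1-bit convergence consensus (31); the 4-node ring and its in-neighbor lists are hypothetical examples, and D is an upper bound on the diameter:

# in_neighbors[i] lists N_i^- (including i itself); a hypothetical ring.
in_neighbors = {0: [0, 3], 1: [1, 0], 2: [2, 1], 3: [3, 2]}
D = 4                 # an upper bound on the graph diameter
b = [0, 0, 1, 0]      # node 2 detected convergence at this window's start
for _ in range(D):    # OR update (31), run synchronously
    b = [int(any(b[j] for j in in_neighbors[i])) for i in range(4)]
print(b)              # the flag floods the network: [1, 1, 1, 1]

Once a single bit is set, it reaches every node within D iterations, which is exactly why b_{i}(D) can serve as the stopping test.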

Using the above discussion and Theorem V.2, we present an algorithm (see Algorithm 1) instantiating the result for ratio consensus (it could easily be adapted to more general settings), which determines the radius \overline{R_{i}}(l) for l=0,1,2,\ldots and all i\in\mathcal{V} and provides a finite-time stopping criterion for vector consensus.

Theorem V.3.

Algorithm 1 terminates in finite time, simultaneously at each node.

Proof.

From Theorem V.2, it follows that \overline{R_{i}}(l) \rightarrow 0 as l\rightarrow\infty. Thus, for any given \rho>0 and node i\in\mathcal{V} there exists an integer t(\rho,i) such that for l=t(\rho,i), \overline{R_{i}}(l)<\rho. As each node has access to \overline{R_{i}}(l), convergence can be detected by each node and the convergence bit b_{i}(lD+1) is set to 1. Thus b_{i}(lD+D+1)=1 for all i\in\mathcal{V}, and the algorithm stops simultaneously at each node. ∎

Input:
ρ, x^i(0) ;
        // Initial condition
      
Initialize:
k := 0; R_i(0) := 0; y_i(0) = 1; b_i(0) = 0; l := 1;
Repeat:
       Input:
x^j(k), y_j(k), R_j(k), b_j(k), j \in N^{-}_{i}
      
      
/* ratio consensus updates of node i given by (6), (7) and (9) */
 x^i(k+1) := \sum_{j\in N^{-}_{i}} p_{ji}(k) x^j(k);
 y_i(k+1) := \sum_{j\in N^{-}_{i}} p_{ji}(k) y_j(k);
 r^i(k+1) := \frac{1}{y_i(k+1)} x^i(k+1);
 /* radius update of node i given by (13) */
 R_i(k+1) := \max_{j\in N^{-}_{i}} \{\|r^i(k+1) - r^j(k)\| + R_j(k)\};
 /* convergence bit update of node i given by (31) */
 b_i(k+1) := \bigcup_{j\in N^{-}_{i}} b_j(k);
if k = lD then
      if b_i(k+1) = 1 then
            break ;
              // stop x^i, y_i, r^i, R_i and b_i updates
      else
            \overline{R_{i}}(l) = R_i(k+1);
            if \overline{R_{i}}(l) < ρ then
                  b_i(k+1) = 1 ;
                    // set convergence bit to 1
            else
                  R_i(k+1) = 0; b_i(k+1) = 0;
            end if
            l = l+1;
      end if
end if
k = k+1;
      
Algorithm 1 Finite-time termination of ratio consensus in dimension d (at each node i \in \mathcal{V})
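The listing below is a compact, centralized simulation sketch of Algorithm 1 (an illustration under assumptions, not the paper's implementation): the 5-node digraph, the uniform out-degree weights p_{ji} = 1/|out-neighbors of j| (a standard column-stochastic choice for ratio consensus), and the tolerance are all hypothetical. It runs the updates (6), (7), (9), the radius update (13), and the bit update (31) synchronously, checking the criterion at every D-th iteration:

import numpy as np

rng = np.random.default_rng(0)
N, d, D, rho = 5, 3, 6, 1e-6       # nodes, dimension, diameter bound, tolerance
out = {0: [0, 1], 1: [1, 2], 2: [2, 3], 3: [3, 4, 0], 4: [4, 0]}  # j -> receivers
P = np.zeros((N, N))               # P[i, j] = p_ji
for j, recv in out.items():
    for i in recv:
        P[i, j] = 1.0 / len(recv)  # columns sum to 1 (column-stochastic)
x = rng.standard_normal((N, d))    # numerator states x^i(0)
y = np.ones(N)                     # denominator states y_i(0) = 1
r = x / y[:, None]                 # ratio states r^i(0)
R = np.zeros(N)                    # radii R_i(0) = 0
b = np.zeros(N, dtype=int)         # convergence bits b_i(0) = 0
k, l = 0, 1
while True:
    x, y = P @ x, P @ y            # updates (6) and (7)
    r_new = x / y[:, None]         # update (9)
    R = np.array([max(np.linalg.norm(r_new[i] - r[j]) + R[j]
                      for j in range(N) if P[i, j] > 0)
                  for i in range(N)])                     # radius update (13)
    b = np.array([int(any(b[j] for j in range(N) if P[i, j] > 0))
                  for i in range(N)])                     # OR update (31)
    r, k = r_new, k + 1
    if k == l * D:                 # check instant k = lD
        if b.all():                # converged flags have flooded: stop
            break
        b = (R < rho).astype(int)  # each node tests its window radius
        R = np.zeros(N)            # reset radii for the next window
        l += 1
print(k, np.linalg.norm(r - r.mean(axis=0)))  # all ratio states agree

In the actual distributed protocol each node i evaluates only its own row of these updates from data received from its in-neighbors; the matrix form above is only for compactness.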
Remark 1.

Notice that with the above protocol, all nodes terminate simultaneously. Further, the only global parameter needed for Algorithm 1 is the diameter D; in fact an upper bound on D suffices, and in most applications such a bound is readily available.

Remark 2.

Note that the only extra communication Algorithm 1 requires between nodes is the exchange of each node's current radius, a single scalar, together with one bit for the convergence consensus. Therefore the extra bandwidth required for each neighbor-to-neighbor interaction is B+1 bits, where B is the bit length (usually 32) of the floating-point representation. The protocol is thus well suited to ad-hoc communication networks where communication cost is high and bandwidth is limited.

A finite-time termination criterion for vector consensus was previously provided in [13]. There, each element of the ratio state requires a maximum-minimum protocol (see (18) and (19)), with stopping criterion given by

\max_{s\in\{1,2,\dots,d\}} \max_{i\in\mathcal{V}} r^{i}_{s}(k) - \min_{s\in\{1,2,\dots,d\}} \min_{i\in\mathcal{V}} r^{i}_{s}(k) < \rho.

This maximum-minimum protocol is a special case of finding a minimal convex set, in the form of a hyperrectangle (box), which encloses all the points. Here, at each iteration, two extra states are shared by each node: one for the element-wise maximum and one for the element-wise minimum. Thus the extra communication bandwidth required by this algorithm is 2Bd bits. For an example case with d=10 and B=32, this amounts to 640 extra bits per interaction, whereas Algorithm 1 requires only B+1 = 33 extra bits, a reduction of more than 19x. Thus for applications with high-dimensional vector consensus (like GANs, as described in the introduction, see [19]), the algorithm reported here provides a reliable distributed stopping criterion with significantly less communication bandwidth.

VI Applications of finite-time terminated average consensus in higher dimensions

VI-A Least Squares Estimation

We follow [46] in our exposition of the least squares problem as solved by consensus.

Consider the problem of estimating a function y = \varphi_{\theta}(x), given a noisy dataset \{(x_{j}, y_{j})\}_{j=1}^{N}, under the assumption that \varphi_{\theta}(x) is a linear combination of known functions g_{i}(x); explicitly, \varphi_{\theta}(x) = \sum_{i=1}^{M}\theta_{i}g_{i}(x). Define, for 1\leq j\leq N, the vectors g^{j} = (g_{1}(x_{j}),\dots,g_{M}(x_{j}))^{T}, and let {\bf G} be the N \times M matrix whose j-th row is (g^{j})^{T}. Then, taking for granted the invertibility of the relevant matrices, the least squares estimate \hat{\theta} of \theta is given by

\hat{\theta} = \operatorname{argmin}_{\theta} |y - {\bf G}\theta|^{2} (32)
= ({\bf G}^{T}{\bf G})^{-1}{\bf G}^{T}y (33)
= \left(\frac{1}{N}\sum_{j=1}^{N} g^{j}(g^{j})^{T}\right)^{-1}\left(\frac{1}{N}\sum_{j=1}^{N} g^{j}y_{j}\right), (34)

where the first equality is obtained by setting the gradient of \theta \mapsto |y-{\bf G}\theta|^{2} to zero and the second follows from algebra.
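As a numerical sanity check of (32)-(34) (a sketch with randomly generated placeholder data), one can verify that the averaged form (34) agrees with the normal-equations solution and with a library least squares solver:

import numpy as np

rng = np.random.default_rng(1)
N, M = 50, 4
G = rng.standard_normal((N, M))  # row j is (g^j)^T
y = rng.standard_normal(N)
theta_normal = np.linalg.solve(G.T @ G, G.T @ y)        # (33)
gram = sum(np.outer(G[j], G[j]) for j in range(N)) / N  # (1/N) sum g^j (g^j)^T
rhs = sum(G[j] * y[j] for j in range(N)) / N            # (1/N) sum g^j y_j
theta_avg = np.linalg.solve(gram, rhs)                  # (34)
assert np.allclose(theta_normal, theta_avg)
assert np.allclose(theta_normal, np.linalg.lstsq(G, y, rcond=None)[0])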

Initializing an average consensus algorithm on N nodes with f_{j}(0) = ({\bf M}_{j}(0), z_{j}(0)) = (g^{j}(g^{j})^{T}, g^{j}y_{j}), node i can form a consensus estimate of \hat{\theta} at time n by

\theta_{i}(n) = {\bf M}_{i}^{-1}(n) z_{i}(n). (35)

Indeed, as an average consensus algorithm,

{\bf M} \coloneqq \frac{1}{N}\sum_{j=1}^{N} g^{j}(g^{j})^{T} = \lim_{n}{\bf M}_{i}(n),

and

z \coloneqq \frac{1}{N}\sum_{j=1}^{N} g^{j}y_{j} = \lim_{n} z_{i}(n),

thus

\lim_{n}\theta_{i}(n) = \lim_{n}{\bf M}_{i}^{-1}(n) z_{i}(n) = {\bf M}^{-1}z = \hat{\theta}.

For a matrix {\bf A}, let \|{\bf A}\|_{op} = \inf\{c : |{\bf A}x| \leq c|x| \text{ for all } x\} denote its operator norm.

Lemma VI.1.

[38] For invertible matrices {\bf A} and {\bf B} with \|{\bf A}^{-1}\|_{op}\|{\bf B}-{\bf A}\|_{op} < 1,

\|{\bf A}^{-1} - {\bf B}^{-1}\|_{op} \leq \frac{\|{\bf A}^{-1}\|_{op}^{2}}{1 - \|{\bf A}^{-1}\|_{op}\|{\bf B}-{\bf A}\|_{op}}\|{\bf B}-{\bf A}\|_{op}.
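A numerical spot-check of Lemma VI.1 (illustrative only: the random matrices and perturbation size are arbitrary choices satisfying \|{\bf A}^{-1}\|_{op}\|{\bf B}-{\bf A}\|_{op} < 1, and the operator norm induced by the Euclidean norm is the spectral norm):

import numpy as np

rng = np.random.default_rng(2)
n = 5
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # well-conditioned
B = A + 0.01 * rng.standard_normal((n, n))         # a small perturbation of A
op = lambda X: np.linalg.norm(X, 2)                # spectral (operator) norm
lhs = op(np.linalg.inv(A) - np.linalg.inv(B))
a = op(np.linalg.inv(A))
rhs = a**2 / (1 - a * op(B - A)) * op(B - A)
assert lhs <= rhs                                  # the bound of Lemma VI.1
print(lhs, rhs)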
Theorem VI.1.

The average consensus estimate \theta_{i}(n) in (35) of the least squares estimator \hat{\theta} in (32) satisfies

|\theta_{i}(n) - \hat{\theta}| \leq m|z_{i}(n) - z| + C\|{\bf M}_{i}(n) - {\bf M}\|_{op},

where

m = \|{\bf M}^{-1}_{i}(n)\|_{op}
C = \frac{m^{2}(|z_{i}(n)| + |z_{i}(n)-z|)}{1 - m\|{\bf M}_{i}(n)-{\bf M}\|_{op}}.

Note that the terms \|{\bf M}_{i}(n)-{\bf M}\|_{op} and |z_{i}(n)-z| can be bounded through the finite-time stopping criteria, while m and C depend only on locally computable quantities. Thus the theorem demonstrates that agents can not only perform a distributed least squares estimate, but can also obtain error bounds on their estimates in a distributed fashion. The limits \lim_{n} m = \|{\bf M}^{-1}\|_{op} and \lim_{n} C = \|{\bf M}^{-1}\|_{op}^{2}|z| ensure that these bounds converge as well.
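As a sketch of how a node might evaluate these bounds locally (all values below are hypothetical placeholders; delta_M and delta_z stand in for whatever bounds on \|{\bf M}_{i}(n)-{\bf M}\|_{op} and |z_{i}(n)-z| the finite-time stopping criterion certifies):

import numpy as np

rng = np.random.default_rng(3)
Mi = np.eye(3) + 0.05 * rng.standard_normal((3, 3))  # node i's matrix state M_i(n)
zi = rng.standard_normal(3)                          # node i's vector state z_i(n)
delta_M, delta_z = 1e-3, 1e-3    # certified by the stopping criterion (assumed)
m = np.linalg.norm(np.linalg.inv(Mi), 2)             # locally computable
C = m**2 * (np.linalg.norm(zi) + delta_z) / (1 - m * delta_M)
err_bound = m * delta_z + C * delta_M  # bound on |theta_i(n) - theta_hat|
print(err_bound)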

Proof.

To suppress notation we write {\bf M}_{i} = {\bf M}_{i}(n) and z_{i} = z_{i}(n). By definition and the triangle inequality,

|\theta_{i}(n) - \hat{\theta}| = |{\bf M}_{i}^{-1}z_{i} - {\bf M}^{-1}z|
= |{\bf M}_{i}^{-1}(z_{i}-z) - ({\bf M}^{-1} - {\bf M}_{i}^{-1})(z)|
\leq |{\bf M}_{i}^{-1}(z_{i}-z)| + |({\bf M}^{-1} - {\bf M}_{i}^{-1})(z)|.

Applying the definition of the operator norm,

|{\bf M}_{i}^{-1}(z_{i}-z)| \leq \|{\bf M}_{i}^{-1}\|_{op}|z_{i}-z| (36)
= m|z_{i}-z|. (37)

Further,

|({\bf M}^{-1} - {\bf M}_{i}^{-1})(z)| (38)
\leq \|{\bf M}^{-1} - {\bf M}_{i}^{-1}\|_{op}|z| (39)
\leq \|{\bf M}^{-1} - {\bf M}_{i}^{-1}\|_{op}(|z_{i}| + |z-z_{i}|). (40)

Applying Lemma VI.1 with {\bf A} = {\bf M}_{i} and {\bf B} = {\bf M},

\|{\bf M}^{-1} - {\bf M}_{i}^{-1}\|_{op} \leq \frac{\|{\bf M}_{i}^{-1}\|_{op}^{2}}{1 - \|{\bf M}_{i}^{-1}\|_{op}\|{\bf M}-{\bf M}_{i}\|_{op}}\|{\bf M}-{\bf M}_{i}\|_{op}. (41)

Inserting (41) into (40) gives

|({\bf M}^{-1} - {\bf M}_{i}^{-1})(z)| \leq C\|{\bf M}_{i}-{\bf M}\|_{op}. (42)

Combining (37) and (42) gives our result. ∎

VI-B Distributed Function Calculation

Here we give another application of the average consensus protocol in higher dimensions. We focus on the problem of computing arbitrary functions of agent state values over sensor networks, see [34]. In particular, given a directed graph \mathcal{G}(\mathcal{V},\mathcal{E}) with N nodes representing the communication constraints in a sensor network, the objective is to design an interaction rule allowing the nodes to cooperatively compute a desired function f(u_{1}(0), u_{2}(0), \dots, u_{N}(0)) of the initial values u_{i}(0)\in\mathbb{R}, i\in\{1,2,\dots,N\}. Such a problem is of interest in wireless sensor networks, see [34], where the sink nodes have to carry out the task of communicating a relevant function of the raw sensor measurements. Another example is the case of coordination tasks in multi-agent systems as given in [47], [11], where all agents communicate with each other to coordinate their speed and direction of motion. Consider the directed graph \mathcal{G}(\mathcal{V},\mathcal{E}) modeling the interconnection topology among the N agents. Let each agent maintain three variables denoted by x^{i}(k)\in\mathbb{R}^{d}, y_{i}(k)\in\mathbb{R} and r^{i}(k)\in\mathbb{R}^{d}, with the following initialization:

x^{i}(0) = [0 \dots 0\ N u_{i}(0)\ 0 \dots 0]^{T} \in \mathbb{R}^{d} (here d = N, with N u_{i}(0) in the i-th entry) (43)
y_{i}(0) = 1 (44)
r^{i}(0) = \frac{1}{y_{i}(0)} x^{i}(0). (45)

The estimates x^{i}(k), y_{i}(k) and r^{i}(k) are updated according to (6), (7) and (9) respectively. The following theorem bounds the error in the distributed function calculation by the error in the consensus estimate.

Theorem VI.2.

Let \mathcal{G}(\mathcal{V},\mathcal{E}) and the matrix P = [p_{ji}] associated with \mathcal{G}(\mathcal{V},\mathcal{E}) satisfy Assumptions 1 and 2 respectively. Denote by \{x^{i}(k)\}_{k\geq 0}, \{y_{i}(k)\}_{k\geq 0} and \{r^{i}(k)\}_{k\geq 0} the sequences generated by (6), (7) and (9) respectively. Under the initialization (43)-(45), the estimates r^{i}(k) asymptotically converge to \bar{r} := \lim_{k\rightarrow\infty}\frac{1}{y_{i}(k)}x^{i}(k) = [u_{1}(0)\ u_{2}(0)\dots u_{N}(0)]^{T} for all i\in\{1,\dots,N\}. If f is C-Lipschitz, or more generally \alpha-Hölder continuous with constant C (Lipschitz being the case \alpha=1), then

|f(r^{i}(k)) - f(\bar{r})| \leq C\|r^{i}(k) - \bar{r}\|^{\alpha}.


Proof.

The proof follows from Theorem II.2. In particular, by Theorem II.2, under Assumptions 1 and 2 the updates (9) converge to \bar{r} := \lim_{k\rightarrow\infty}\frac{1}{y_{i}(k)}x^{i}(k) = \frac{1}{N}\sum_{j=1}^{N}x^{j}(0) for all i\in\{1,\dots,N\}. With the initialization (43)-(45), the limiting value \bar{r} is given by

\bar{r} = \frac{1}{N}\sum_{i=1}^{N}x^{i}(0) = [u_{1}(0)\ u_{2}(0)\dots u_{N}(0)]^{T}.

To complete the proof one only needs to apply the definition of \alpha-Hölder continuity, |f(x)-f(y)| \leq C\|x-y\|^{\alpha}, to the estimate r^{i}(k) and the consensus value \bar{r}. ∎

Remark 3.

Theorem VI.2 guarantees that for large enough values of k, agents following the update rules (6)-(9) have enough information to calculate an arbitrary (Hölder continuous) function of the initial values. Moreover, unlike existing methods in the literature [35], which require carefully designed matrices based on global information about the network, the proposed scheme allows for distributed synthesis. Further, the finite-time terminated protocol discussed here is applicable to arbitrary time-invariant connected directed graphs, unlike the stringent assumptions required for the applicability of some schemes in the literature, see [48].
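A minimal end-to-end sketch of the initialization (43)-(45) and the resulting function evaluation (the complete-graph uniform weights and the example function f are illustrative assumptions, not part of the scheme):

import numpy as np

rng = np.random.default_rng(4)
N = 4
u0 = rng.standard_normal(N)      # initial scalar values u_i(0)
x = N * np.diag(u0)              # row i is x^i(0) = N u_i(0) e_i, as in (43)
y = np.ones(N)                   # y_i(0) = 1, as in (44)
P = np.full((N, N), 1.0 / N)     # a column-stochastic mixing matrix (assumed)
for _ in range(50):              # ratio-consensus updates (6) and (7)
    x, y = P @ x, P @ y
r = x / y[:, None]               # each row converges to (u_1(0), ..., u_N(0))
f = lambda v: v.max() - v.min()  # an example Lipschitz function of the u_i(0)
print(f(r[0]), f(u0))            # node 1's estimate matches the true value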

VII Results

In this section, we present simulation results demonstrating the finite-time stopping criterion for high-dimensional ratio consensus. A network of 25 nodes is considered, represented by a randomly generated directed graph (see Fig. 2(a)) with diameter 6. The numerator state of each node is a 10-dimensional vector selected randomly. Equations (6), (7) and (9) are implemented in MATLAB and simulated. The 2-norm of each node's ratio state is plotted in Fig. 2(b), achieving convergence within 60 iterations.

Figure 2: (a) A communication network represented by a 25-node directed graph. (b) 2-norm of the 10-dimensional ratio states of all 25 nodes in the network.

Algorithm 1 is implemented in MATLAB and the radius \overline{R_{i}}(l) for all i\in\mathcal{V} is plotted in Fig. 3(a). The radius falls below the pre-specified tolerance (0.0166, 1% of the norm of the consensus vector) within 60 iterations and is used as a stopping criterion by each node. Fig. 3(b) plots the two-dimensional projection of the norm ball B\{\overline{R_{i}}(l), r^{i}(lD+D)\} for node 1 as l progresses. As expected, the balls shrink in size as l increases; similar behavior is observed at all other nodes. This illustrates how Algorithm 1 can be successfully used as a finite-time termination criterion for distributed ratio consensus.

Figure 3: (a) Radius \overline{R_{i}}(l) at each node. (b) 2-dimensional projection of the norm balls for node 1 as l increases.

VIII Conclusion

In this article, we presented a notion of monotonicity of network states in vector consensus algorithms, which we called convex decreasing consensus. We showed that this property can be used to construct finite-time stopping criteria and provided a distributed algorithm realizing them. We further provided an algorithm which calculates an approximation of the minimum norm ball containing all network states at a given iteration. The radius of these balls was shown to converge to zero, and an algorithm was presented that uses this fact as a finite-time stopping criterion. This algorithm was shown to have much smaller communication requirements than existing methods. The effectiveness of our algorithm was validated by simulating a vector (\in\mathbb{R}^{10}) ratio consensus algorithm on a network graph of 25 nodes. Further, we demonstrated how these stopping criteria can be applied to provide convergence guarantees for least squares estimation and distributed function calculation through consensus.

References

  • [1] K. J. Arrow and L. Hurwicz, Decentralization and computation in resource allocation. Stanford University, Department of Economics, 1958.
  • [2] M. H. DeGroot, “Reaching a consensus,” Journal of the American Statistical Association, vol. 69, no. 345, pp. 118–121, 1974.
  • [3] N. A. Lynch, Distributed algorithms. Elsevier, 1996.
  • [4] J. N. Tsitsiklis, “Problems in decentralized decision making and computation,” tech. rep., DTIC Document, 1984.
  • [5] D. Kempe, A. Dobra, and J. Gehrke, “Gossip-based computation of aggregate information,” in 44th Annual IEEE Symposium on Foundations of Computer Science, 2003. Proceedings., pp. 482–491, IEEE, 2003.
  • [6] A. D. Dominguez-Garcia and C. N. Hadjicostis, “Coordination and control of distributed energy resources for provision of ancillary services,” in Smart Grid Communications (SmartGridComm), 2010 First IEEE International Conference on, pp. 537–542, IEEE, 2010.
  • [7] C. N. Hadjicostis and T. Charalambous, “Average consensus in the presence of delays in directed graph topologies,” IEEE Transactions on Automatic Control, vol. 59, no. 3, pp. 763–768, 2013.
  • [8] K. Cai and H. Ishii, “Average consensus on general strongly connected digraphs,” Automatica, vol. 48, no. 11, pp. 2750–2761, 2012.
  • [9] J. B. Predd, S. R. Kulkarni, and H. V. Poor, “A collaborative training algorithm for distributed learning,” IEEE Transactions on Information Theory, vol. 55, no. 4, pp. 1856–1871, 2009.
  • [10] J. A. Fax and R. M. Murray, “Information flow and cooperative control of vehicle formations,” IFAC Proceedings Volumes, vol. 35, no. 1, pp. 115–120, 2002.
  • [11] R. Olfati-Saber, J. A. Fax, and R. M. Murray, “Consensus and cooperation in networked multi-agent systems,” Proceedings of the IEEE, vol. 95, no. 1, pp. 215–233, 2007.
  • [12] A. Nedić and A. Olshevsky, “Distributed optimization over time-varying directed graphs,” IEEE Transactions on Automatic Control, vol. 60, no. 3, pp. 601–615, 2014.
  • [13] V. Khatana, G. Saraswat, S. Patel, and M. V. Salapaka, “Gradient-consensus method for distributed optimization in directed multi-agent networks,” arXiv preprint arXiv:1909.10070, 2019.
  • [14] U. A. Khan, S. Kar, and J. M. Moura, “Distributed sensor localization in random environments using minimal number of anchor nodes,” IEEE Transactions on Signal Processing, vol. 57, no. 5, pp. 2000–2016, 2009.
  • [15] U. A. Khan, S. Kar, and J. M. Moura, “Higher dimensional consensus: Learning in large-scale networks,” IEEE Transactions on Signal Processing, vol. 58, no. 5, pp. 2836–2849, 2010.
  • [16] S. Patel, S. Attree, S. Talukdar, M. Prakash, and M. V. Salapaka, “Distributed apportioning in a power network for providing demand response services,” in 2017 IEEE International Conference on Smart Grid Communications (SmartGridComm), pp. 38–44, IEEE, 2017.
  • [17] A. Nedic, A. Ozdaglar, and P. A. Parrilo, “Constrained consensus and optimization in multi-agent networks,” IEEE Transactions on Automatic Control, vol. 55, no. 4, pp. 922–938, 2010.
  • [18] S. Chakraborty, A. Preece, M. Alzantot, T. Xing, D. Braines, and M. Srivastava, “Deep learning for situational understanding,” in 2017 20th International Conference on Information Fusion (Fusion), pp. 1–8, IEEE, 2017.
  • [19] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in neural information processing systems, pp. 2672–2680, 2014.
  • [20] A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial machine learning at scale,” ICLR, 2017.
  • [21] Z. Li, F. R. Yu, and M. Huang, “A distributed consensus-based cooperative spectrum-sensing scheme in cognitive radios,” IEEE Transactions on Vehicular Technology, vol. 59, no. 1, pp. 383–393, 2009.
  • [22] S. Patel, V. Khatana, G. Saraswat, and M. V. Salapaka, “Distributed detection of malicious attacks on consensus algorithms with applications in power networks,” 2020.
  • [23] V. Yadav and M. V. Salapaka, “Distributed protocol for determining when averaging consensus is reached,” in 45th Annual Allerton Conf, pp. 715–720, 2007.
  • [24] M. Prakash, S. Talukdar, S. Attree, S. Patel, and M. V. Salapaka, “Distributed Stopping Criterion for Ratio Consensus,” in 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 131–135, Oct. 2018.
  • [25] G. Saraswat, V. Khatana, S. Patel, and M. V. Salapaka, “Distributed finite-time termination for consensus algorithm in switching topologies,” arXiv preprint arXiv:1909.00059, 2019.
  • [26] M. Prakash, S. Talukdar, S. Attree, V. Yadav, and M. V. Salapaka, “Distributed stopping criterion for consensus in the presence of delays,” IEEE Transactions on Control of Network Systems, 2019.
  • [27] S. Sundaram and C. N. Hadjicostis, “Finite-time distributed consensus in graphs with time-invariant topologies,” in 2007 American Control Conference, pp. 711–716, IEEE, 2007.
  • [28] F. P. Preparata, “An optimal real-time algorithm for planar convex hulls,” Communications of the ACM, vol. 22, no. 7, pp. 402–405, 1979.
  • [29] D. Fernández-Francos, Ó. Fontenla-Romero, and A. Alonso-Betanzos, “One-class convex hull-based algorithm for classification in distributed environments,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2017.
  • [30] P. Casale, O. Pujol, and P. Radeva, “Approximate polytope ensemble for one-class classification,” Pattern Recognition, vol. 47, no. 2, pp. 854–864, 2014.
  • [31] L. Kavan, I. Kolingerova, and J. Zara, “Fast approximation of convex hull.,” ACST, vol. 6, pp. 101–104, 2006.
  • [32] E. Osuna and O. De Castro, “Convex hull in feature space for support vector machines,” in Ibero-American Conference on Artificial Intelligence, pp. 411–419, Springer, 2002.
  • [33] W. Kim, M. S. Stanković, K. H. Johansson, and H. J. Kim, “A distributed support vector machine learning over wireless sensor networks,” IEEE transactions on cybernetics, vol. 45, no. 11, pp. 2599–2611, 2015.
  • [34] A. Giridhar and P. R. Kumar, “Computing and communicating functions over sensor networks,” IEEE Journal on selected areas in communications, vol. 23, no. 4, pp. 755–764, 2005.
  • [35] S. Sundaram and C. N. Hadjicostis, “Distributed function calculation and consensus using linear iterative strategies,” IEEE journal on selected areas in communications, vol. 26, no. 4, pp. 650–660, 2008.
  • [36] J. Melbourne, G. Saraswat, V. Khatana, S. Patel, and M. V. Salapaka, “On the geometry of consensus algorithms with application to distributed termination in higher dimension,” the proceedings of International Federation of Automatic Control (IFAC), 2020.
  • [37] R. Diestel, Graph Theory. Berlin, Germany: Springer-Verlag, 2006.
  • [38] R. A. Horn and C. R. Johnson, Matrix analysis. Cambridge university press, 2012.
  • [39] R. T. Rockafellar, Convex analysis. No. 28, Princeton university press, 1970.
  • [40] P. Lax, Functional Analysis, vol. 1. Wiley-Interscience, 2002.
  • [41] D. A. Levin and Y. Peres, Markov chains and mixing times, vol. 107. American Mathematical Soc., 2017.
  • [42] R. M. Gray and R. Gray, Probability, random processes, and ergodic properties, vol. 1. Springer, 2009.
  • [43] K. L. Clarkson and P. W. Shor, “Applications of random sampling in computational geometry, ii,” Discrete & Computational Geometry, vol. 4, no. 5, pp. 387–421, 1989.
  • [44] C. B. Barber, D. P. Dobkin, and H. Huhdanpaa, “The quickhull algorithm for convex hulls,” ACM Transactions on Mathematical Software (TOMS), vol. 22, no. 4, pp. 469–483, 1996.
  • [45] K. Fischer, Smallest enclosing balls of balls. PhD thesis, ETH Zürich, 2005.
  • [46] F. Garin and L. Schenato, “A survey on distributed estimation and control applications using linear consensus algorithms,” in Networked control systems, pp. 75–107, Springer, 2010.
  • [47] W. Ren, R. W. Beard, and E. M. Atkins, “A survey of consensus problems in multi-agent coordination,” in Proceedings of the 2005, American Control Conference, 2005., pp. 1859–1864, IEEE, 2005.
  • [48] D. B. Kingston and R. W. Beard, “Discrete-time average-consensus under switching network topologies,” in 2006 American Control Conference, pp. 6–pp, IEEE, 2006.
James Melbourne received his Bachelors in Art History in 2006 and a Masters in Mathematics in 2009, both from the University of Kansas, and his PhD in Mathematics in 2015 from the University of Minnesota. He was a postdoctoral researcher in the University of Delaware Mathematics department from 2015 to 2017, and is currently a postdoctoral researcher at the University of Minnesota in Electrical and Computer Engineering. His research interests include convexity theory, particularly its application to probabilistic, geometric, and information theoretic inequalities, consensus algorithms, and stochastic energetics.
Govind Saraswat received his B.Tech degree in Electrical Engineering from the Indian Institute of Technology, Delhi, in 2007 and his PhD degree in Electrical Engineering from University of Minnesota, Twin Cities in 2014. Currently, he is part of Sensing and Predictive Analytics group at National Renewable Energy Laboratory, Golden, CO (NREL) where he works on data-driven technology for energy systems planning and operation. His research includes power system modeling and analysis, measurement-based operation and control, machine learning, and optimization.
Vivek Khatana received the B.Tech degree in Electrical Engineering from the Indian Institute of Technology, Roorkee, in 2018. Currently, he is working towards a Ph.D. degree at the department of Electrical Engineering at University of Minnesota. His Ph.D. research interests include distributed optimization, consensus algorithms, distributed control and stochastic calculus.
Sourav Kumar Patel received his B.Tech. degree in Instrumentation and Control Engineering from the National Institute of Technology, Jalandhar, in 2011. In the same year, he joined National Thermal Power Corporation Ltd. in Kaniha, Odisha, India as a Control and Instrumentation Engineer. He received his M.S. degree in Electrical Engineering from the University of Minnesota, Twin Cities, in 2018, where he is currently working towards his Ph.D. degree in Electrical Engineering. His research interests include control and systems theory, coordination and communication protocols for Distributed Energy Resources towards smart grid applications.
Murti V. Salapaka received the B.Tech. degree in Mechanical Engineering from the Indian Institute of Technology, Madras, in 1991 and the M.S. and Ph.D. degrees in Mechanical Engineering from the University of California at Santa Barbara, in 1993 and 1997, respectively. He was a faculty member in the Electrical and Computer Engineering Department, Iowa State University, Ames, from 1997 to 2007. Currently, he is the Director of Graduate Studies and the Vincentine Hermes Luh Chair Professor in the Electrical and Computer Engineering Department, University of Minnesota, Minneapolis. His research interests include control and network science, nanoscience and single molecule physics. Dr. Salapaka received the 1997 National Science Foundation CAREER Award and is an IEEE fellow.