
Distributed Control-Estimation Synthesis for Stochastic Multi-Agent Systems via Virtual Interaction between Non-neighboring Agents

Hojin Lee
Department of Mechanical Engineering
Ulsan National Institute of Science and Technology
Ulsan, 44919 Republic of Korea
hojinlee@unist.ac.kr
Cheolhyeon Kwon
Department of Mechanical Engineering
Ulsan National Institute of Science and Technology
Ulsan, 44919 Republic of Korea
kwonc@unist.ac.kr
© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Citation information: DOI 10.1109/LCSYS.2021.3086848, IEEE Control Systems Letters. This work was supported by the National Research Foundation of Korea (NRF) grants funded by the Korea government (MSIT) (No. 2020R1A5A8018822 and No. 2020R1C1C1007323).
Abstract

This paper considers the optimal distributed control problem for a linear stochastic multi-agent system (MAS). Due to the distributed nature of the MAS network, the information available to an individual agent is limited to its vicinity. From the perspective of the entire MAS, this imposes a structural constraint on the control law, making the optimal control law computationally intractable. This paper attempts to relax such a structural constraint by expanding the neighboring information of each agent to the entire MAS, enabled by a distributed estimation algorithm embedded in each agent. By exploiting the estimated information, each agent is not limited to interacting with its neighborhood but can further establish 'virtual interactions' with non-neighboring agents. The optimal distributed MAS control problem is then cast as a synthesized control-estimation problem. An iterative optimization procedure is developed to find the control-estimation law that minimizes the global objective cost of the MAS.

Keywords: Distributed control · Distributed estimation · Optimal control

1 Introduction

Distributed control within a cooperative multi-agent system (MAS) is a key enabling technology for various networked dynamical systems (Shi and Yan, 2020; Li et al., 2019; Zhu et al., 2020; Morita et al., 2015). Notwithstanding diverse distributed control strategies, optimality remains a stumbling block due to individual agents' limited information. In particular, finding the optimal distributed control under a network topological constraint is a well-known NP-hard problem (Gupta et al., 2005). To ease this problem, some former studies have focused on specific forms of the objective function, along with certain MAS network topology conditions, under which optimal distributed control laws can be designed (Ma et al., 2015). More broadly, different techniques have been investigated to design suboptimal distributed control laws for different MAS cooperative tasks (Gupta et al., 2005; Nguyen, 2015; Jiao et al., 2019). In this paper, a new avenue for accomplishing the optimal distributed control of MAS is presented that requires neither a restricted form of the objective function nor a particular network topology. The key idea is to expand the information available to each agent by employing a distributed estimation algorithm, and to use the expanded information to relax the network topological constraint in a tractable manner. In a nutshell, the main contributions are the following.

1.

    A synthesized distributed control-estimation framework is proposed, based on the authors' previously developed distributed estimation algorithm (Kwon and Hwang, 2018). The newly proposed framework enables interactions between non-neighboring agents, namely virtual interactions.

2.

    With the aid of virtual interactions, a design procedure is developed that solves for the optimal distributed control law of the stochastic MAS over a finite time horizon, which was originally an intractable non-convex problem due to the network topological constraint.

2 Problem Formulation

2.1 Dynamical Model of Stochastic MAS

Consider a stochastic linear multi-agent dynamical system comprising $N$ homogeneous agents whose dynamics are given by:

$$x_{i}(t+1)=Ax_{i}(t)+Bu_{i}(t)+w_{i}(t),\quad\forall i\in\{1,\cdots,N\}$$ (1)

where $x_{i}(t)\in\mathbb{R}^{n}$ and $u_{i}(t)\in\mathbb{R}^{p}$ are the state and the control input of the $i^{th}$ agent, respectively. $w_{i}(t)$ is a disturbance imposed on the $i^{th}$ agent, assumed to follow a zero-mean white Gaussian distribution with covariance $\Theta_{i}(t)\succ 0$. $t\in\mathbb{Z}_{+}=\{0,1,\cdots\}$ denotes the discrete-time index. $A,B$ are the system matrices with appropriate dimensions and are assumed to satisfy the controllability condition. Accordingly, the entire MAS dynamics can be written as

$$x(t+1)=\tilde{A}x(t)+\tilde{B}u(t)+\tilde{w}(t)$$ (2)
$$\begin{split}\tilde{A}&=\left(I_{N}\otimes A\right),\quad\tilde{B}=\left(I_{N}\otimes B\right)\\ x(t)&=\begin{bmatrix}x_{1}^{\mathrm{T}}(t)\cdots x_{N}^{\mathrm{T}}(t)\end{bmatrix}^{\mathrm{T}},\quad u(t)=\begin{bmatrix}u_{1}^{\mathrm{T}}(t)\cdots u_{N}^{\mathrm{T}}(t)\end{bmatrix}^{\mathrm{T}}\\ \tilde{w}(t)&=\begin{bmatrix}w_{1}^{\mathrm{T}}(t)\cdots w_{N}^{\mathrm{T}}(t)\end{bmatrix}^{\mathrm{T}}\end{split}$$

where $\otimes$ is the Kronecker product between matrices. The interactions between agents are rendered by the inter-agent network topology, described by a graph model $\mathcal{G}$ consisting of a node set $\mathcal{V}=\{1,2,\dots,N\}$ indexing each agent and an edge set $\mathcal{E}\subseteq\mathcal{V}\times\mathcal{V}$ indicating the network connectivity between the agents. Each edge $(i,j)\in\mathcal{E}$ denotes that node $i$ can acquire the state information of node $j$. An adjacency matrix $\mathcal{A}=[a_{ij}]\in\mathbb{R}^{N\times N}$ expresses the network connectivity of the graph model, where $a_{ij}=1$ if $(i,j)\in\mathcal{E}$ and $a_{ij}=0$ otherwise. A degree matrix is defined as $\mathcal{D}=\mathrm{diag}(d_{1}\cdots d_{N})$, where $d_{i}=\sum_{j}a_{ij}$ is the (weighted) degree of node $i$. The Laplacian matrix $\mathcal{L}$, given by $\mathcal{L}=\mathcal{D}-\mathcal{A}$, is useful for analysis of the network topology. The set of agents whose state information is available to the $i^{th}$ agent, i.e., the neighborhood of the $i^{th}$ agent, is denoted by $\Omega_{i}$, and its cardinality by $|\Omega_{i}|$. Based on the given network topology, the noisy measurement of the neighborhood states $\{x_{j}(t)|j\in\mathcal{V}\}$ from the $i^{th}$ agent's perspective can be represented as follows (Kwon and Hwang, 2018):

$$z_{ij}(t)=c_{ij}\left(x_{j}(t)+v_{ij}(t)\right),\quad\forall j\in\mathcal{V}$$ (3)

where $c_{ij}$ indicates the availability of the measurement of the $j^{th}$ agent's state from the $i^{th}$ agent, such that $c_{ij}=1$ when $j\in\Omega_{i}$ and $c_{ij}=0$ otherwise. The noise of the measurement of the $j^{th}$ agent taken by the $i^{th}$ agent is denoted by $v_{ij}(t)$, assumed to be an independent and identically distributed (i.i.d.) Gaussian random variable with zero mean and covariance $\mathrm{\Xi}_{ij}(t)\succ 0$. Further, the measurement and noise sets of the $i^{th}$ agent are denoted by $Z_{i}(t)=[z_{i1}^{\mathrm{T}}(t)\cdots z_{iN}^{\mathrm{T}}(t)]^{\mathrm{T}}$ and $v_{i}(t)=[v_{i1}^{\mathrm{T}}(t)\cdots v_{iN}^{\mathrm{T}}(t)]^{\mathrm{T}}$, respectively. Over a finite time horizon $t=0,\cdots,T$, one can rewrite (2) in a static form by stacking up the variables and matrices (Furieri and Kamgarpour, 2020):

$$x=P_{11}w+P_{12}u$$ (4)

where

$$\begin{split}P_{11}&=(I-\mathrm{D}\bar{A})^{-1},\quad P_{12}=(I-\mathrm{D}\bar{A})^{-1}\mathrm{D}\bar{B}\\ \bar{A}&=I_{T+1}\otimes\tilde{A},\quad\bar{B}=\begin{bmatrix}I_{T}\otimes\tilde{B}\\ 0_{Nn\times NpT}\end{bmatrix}\\ \mathrm{D}&=\begin{bmatrix}0_{Nn\times NnT}&0_{Nn\times Nn}\\ I_{NnT}&0_{NnT\times Nn}\end{bmatrix}\end{split}$$

where $I_{T}$ and $0_{T}$ respectively denote the identity and zero matrices of size $T\times T$, and $M_{i}=[0_{p}\cdots I_{p}\cdots 0_{p}]\in\mathbb{R}^{p\times Np}$ is the block matrix having $I_{p}$ in the $i^{th}$ block entry and $0_{p}$ in the other block entries. Further, $x=[x(0)^{\mathrm{T}}\dots x(T)^{\mathrm{T}}]^{\mathrm{T}}\in\mathbb{R}^{Nn(T+1)}$ and $u=\sum_{i}^{N}(I_{T}\otimes M_{i}^{\mathrm{T}})u_{i}\in\mathbb{R}^{NpT}$ are the stacked agents' states and control inputs over the horizon, where $u_{i}=[u_{i}(0)^{\mathrm{T}}\dots u_{i}(T-1)^{\mathrm{T}}]^{\mathrm{T}}\in\mathbb{R}^{pT},\forall i\in\mathcal{V}$. $w=[x(0)^{\mathrm{T}}\ \tilde{w}(0)^{\mathrm{T}}\dots\tilde{w}(T-1)^{\mathrm{T}}]^{\mathrm{T}}\in\mathbb{R}^{Nn(T+1)}$ is the vector containing the initial agents' states and the additive noise. Over the finite time horizon $T$, individual agents interact with their neighbors according to the control law $u_{i}$ embedded in each agent. Without loss of generality, $u_{i}$ can be designed by the following output feedback control law (Furieri and Kamgarpour, 2020):

$$u_{i}=(I_{T}\otimes M_{i})\mathcal{F}Z_{i,(0:T-1)}=(I_{T}\otimes M_{i})\mathcal{F}C(x+v_{i}),\quad\forall i\in\mathcal{V}$$ (5)

where $v_{i}=[v_{i}(0)^{\mathrm{T}}\dots v_{i}(T)^{\mathrm{T}}]^{\mathrm{T}}\in\mathbb{R}^{Nn(T+1)},\forall i\in\mathcal{V}$, $Z_{i,(0:T-1)}=[Z_{i}(0)^{\mathrm{T}}\cdots Z_{i}(T-1)^{\mathrm{T}}]^{\mathrm{T}}\in\mathbb{R}^{NnT}$, and $C=[I_{NnT}\ 0_{NnT\times Nn}]$. The crucial part is the design of the feedback gain, denoted by $\mathcal{F}\in\mathbb{F}$. Here, $\mathbb{F}\subset\mathbb{R}^{NpT\times NnT}$ is an invariant subspace that encodes the network topological constraints imposed on the distributed MAS by $\mathcal{A}$, as well as embeds causal feedback policies by forcing the future response entries to zero.
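For concreteness, the following minimal sketch assembles the graph matrices of this subsection and the stacked maps $P_{11}$, $P_{12}$ of (4). It is an illustration only: the 3-agent ring topology, the scalar dynamics, and the function name `stacked_maps` are assumptions, not quantities from the paper.

```python
import numpy as np

# --- toy network (an assumed 3-agent ring, for illustration only) ---
N = 3
A_adj = np.array([[0, 1, 1],
                  [1, 0, 1],
                  [1, 1, 0]])                  # adjacency matrix (a_ij)
D_deg = np.diag(A_adj.sum(axis=1))             # degree matrix
Lap = D_deg - A_adj                            # Laplacian L = D - A

# --- stacked static form (4): x = P11 w + P12 u ---
def stacked_maps(A, B, N, T):
    n, p = A.shape[0], B.shape[1]
    A_t = np.kron(np.eye(N), A)                # A~ = I_N ⊗ A
    B_t = np.kron(np.eye(N), B)                # B~ = I_N ⊗ B
    A_bar = np.kron(np.eye(T + 1), A_t)        # I_{T+1} ⊗ A~
    B_bar = np.vstack([np.kron(np.eye(T), B_t),
                       np.zeros((N * n, N * p * T))])
    Dsh = np.zeros((N * n * (T + 1), N * n * (T + 1)))   # block down-shift D
    Dsh[N * n:, :N * n * T] = np.eye(N * n * T)
    P11 = np.linalg.inv(np.eye(Dsh.shape[0]) - Dsh @ A_bar)
    P12 = P11 @ Dsh @ B_bar
    return P11, P12

P11, P12 = stacked_maps(np.array([[1.0]]), np.array([[1.0]]), N, T=3)
```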

2.2 Optimal MAS Distributed Control Problem

Given the equivalent static form (4) of the stochastic linear MAS dynamics over the time horizon $T$, we seek to address the optimal distributed control problem.

Problem 1.

Optimal distributed control law subject to structural constraint (Furieri and Kamgarpour, 2020):

$$\begin{split}&\min_{\mathcal{F}\in\mathbb{F}}\ \mathbb{E}\left[x^{\mathrm{T}}\mathcal{Q}x+u^{\mathrm{T}}\mathcal{R}u\right]\\ &\mathrm{subject\ to}\ \text{(4), (5)},\quad\forall i\in\mathcal{V}\end{split}$$ (6)

where $\mathcal{Q}\in\mathbb{R}^{Nn(T+1)\times Nn(T+1)}\succeq 0$ and $\mathcal{R}\in\mathbb{R}^{NpT\times NpT}\succ 0$ are the associated weight matrices.

Due to the structural constraint imposed on the control input space $\mathbb{F}$, Problem 1 is a highly non-convex problem, which is indeed NP-hard and a formidable computational burden (Gupta et al., 2005). To circumvent this difficulty, we propose the concept of a virtual network topology that allows for interactions between non-neighboring agents, i.e., virtual interactions, as depicted in Figure 1.

Figure 1: Virtual interaction based distributed control-estimation synthesis.

Since the state information of non-neighboring agents is not available, an appropriate estimator is required for each agent to obtain estimates of the non-neighboring agents' states. Taking a Bayesian approach, a Kalman-like filter is adopted for estimation, as we consider a linear MAS with Gaussian uncertainties.

Definition 1.

The state estimate and its covariance of the MAS using the $i^{th}$ agent's measurements are denoted by ${}^{i}\hat{x}(t):=\mathbb{E}\left[x(t)|Z_{i,(0:t)}\right]$ and ${}^{i}\Sigma(t):=\mathbb{E}\left[\left(x(t)-{}^{i}\hat{x}(t)\right)\left(x(t)-{}^{i}\hat{x}(t)\right)^{\mathrm{T}}|Z_{i,(0:t)}\right]$, respectively (Kwon and Hwang, 2018), where $\mathbb{E}[\bullet|\bullet]$ is the conditional expectation.

The nominal recursive structure of the Kalman-like filter is:

$${}^{i}\hat{x}(t)={}^{i}\hat{x}^{-}(t)+L_{i}(t)H_{i}\left(Z_{i}(t)-{}^{i}\hat{x}^{-}(t)\right)$$ (7)

where ${}^{i}\hat{x}^{-}(t):=\mathbb{E}\left[x(t)|Z_{i,(0:t-1)}\right]$ denotes the predicted state estimate from the $i^{th}$ agent's perspective. $H_{i}\in\mathbb{R}^{n|\Omega_{i}|\times nN}$ encodes only the neighbors of the $i^{th}$ agent, that is, $H_{i}=[h_{1}\ h_{2}\cdots h_{|\Omega_{i}|}]^{\mathrm{T}}\otimes I_{n}$, where $h_{m}\in\mathbb{R}^{N}$, $m=1,2,\dots,|\Omega_{i}|$, are the nonzero column vectors of the matrix $\mathrm{diag}(c_{i1},c_{i2},\dots,c_{iN})$. $L_{i}(t)\in\mathbb{R}^{nN\times n|\Omega_{i}|}$ represents the estimator gain at time step $t$ for estimating the states of the MAS from the $i^{th}$ agent's perspective. Once the entire MAS state estimate becomes available to each agent, one can replace (5) with an estimation-based feedback control law. Accordingly, Problem 1, the distributed control law subject to the structural constraint, can be reformulated into a problem that simultaneously designs both the distributed control and the distributed estimator.
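As a small illustration of this selector structure, $H_{i}$ can be assembled directly from the availability vector $(c_{i1},\dots,c_{iN})$. The sketch below is an assumption-level helper; the function name and example values are not from the paper.

```python
import numpy as np

def selector_Hi(c_i, n):
    """Build H_i of (7): rows pick out the neighbor blocks of agent i."""
    # Nonzero columns of diag(c_i1, ..., c_iN), stacked as rows [h_1 ... ]^T
    H_rows = np.eye(len(c_i))[np.flatnonzero(c_i)]   # |Omega_i| x N
    return np.kron(H_rows, np.eye(n))                # n|Omega_i| x nN

# e.g. an agent sees itself and agent 2 in a 3-agent MAS with n = 2 states:
H0 = selector_Hi([1, 0, 1], n=2)                     # shape (4, 6)
```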

Problem 2.

Optimal distributed control-estimation law with virtual interactions:

$$\begin{split}&\min_{\mathcal{F}\in\tilde{\mathbb{F}},\ \Upsilon_{i},\forall i\in\mathcal{V}}\ J(\mathcal{F},\Upsilon_{1},\cdots,\Upsilon_{N})\\ &\mathrm{subject\ to}\ \text{(1), and}\\ &u_{i}=(I_{T}\otimes M_{i})\mathcal{F}C\,{}^{i}\hat{x},\quad\forall i\in\mathcal{V}\end{split}$$ (8)

where ${}^{i}\hat{x}=[{}^{i}\hat{x}(0)^{\mathrm{T}}\cdots{}^{i}\hat{x}(T)^{\mathrm{T}}]^{\mathrm{T}}$, $J(\mathcal{F},\Upsilon_{1},\cdots,\Upsilon_{N}):=\mathbb{E}\left[x^{\mathrm{T}}\mathcal{Q}x+u^{\mathrm{T}}\mathcal{R}u\right]$, and $\Upsilon_{i}:=\{L_{i}(t)|t=0,\cdots,T\}$ is the set of estimator gains over the time horizon $T$ for the $i^{th}$ agent.

Remark 1.

It is worth noting that, compared to $\mathbb{F}$, $\tilde{\mathbb{F}}\subset\mathbb{R}^{NpT\times NnT}$ is a subspace that only encodes causal feedback policies, not restricted by any network topological constraint.

Although Problem 2 successfully relaxes the structural constraint on the control law $\mathcal{F}$, it is not straightforward to solve because the control law and the state estimation errors mutually affect each other (Kwon and Hwang, 2018). To resolve this complexity, we propose an iterative optimization procedure in a distributed fashion: i) divide the primal problem (Problem 2) into a set of sub-problems, each viewed from an individual agent's perspective; ii) sequentially design the distributed estimation and control laws for each sub-problem; iii) mix the results from the individual sub-problems to approximate the optimal solution to the primal problem. The overall schematic of the proposed distributed control-estimation synthesis is delineated in Figure 2.

Figure 2: Iterative optimization procedure for the optimal distributed control-estimation synthesis.

At the $l^{th}$ iteration, the optimization procedure consists of the following sub-steps. First, the distributed estimator design optimizes a set of estimator gains $\Upsilon_{i}^{(l)},\forall i\in\mathcal{V}$ based on the disturbance/noise model, the network topological constraint, and the suboptimal control law resulting from the previous iteration. Then, the distributed control law design computes a set of optimal control laws ${}^{i}\mathcal{F}^{(l)},\forall i\in\mathcal{V}$, each from an individual agent's perspective, based on the state estimation error information from the designed distributed estimator. Finally, the distributed control-estimation synthesis mixes the set of ${}^{i}\mathcal{F}^{(l)},\forall i\in\mathcal{V}$ to construct the solution candidate $\mathcal{F}^{(l)}$ for Problem 2. The constructed control law is evaluated to check convergence and is used for the next iteration. The iteration terminates once the pre-defined stopping criteria are fulfilled, yielding the optimal control-estimation law, denoted by $\mathcal{F}^{*}$ and $\Upsilon_{i}^{\ast},\forall i\in\mathcal{V}$.

3 Algorithm Development

This section details the proposed synthesis procedure of the optimal distributed control-estimation law, which can comply with an arbitrary network topology of the stochastic MAS.

3.1 Distributed estimator design

To begin with, the distributed estimation algorithm is optimized by means of the estimator gains $\Upsilon_{i}^{(l)},\forall i\in\mathcal{V}$. In the offline design phase, the individual estimators can be designed based on the entire MAS model information along with the control law computed at the previous iteration ($A$, $B$, and $\mathcal{F}^{(l-1)}$). For brevity, we use $\mathcal{F}$ to designate $\mathcal{F}^{(l-1)}$ in this subsection. Recalling (7), the distributed estimator embedded in the $i^{th}$ agent calculates ${}^{i}\hat{x}(t),\forall t\in\{0,\cdots,T\}$, whose performance can be measured by the estimation error.

Definition 2.

${}^{i}e(t):=x(t)-{}^{i}\hat{x}(t)$ is the MAS state estimation error from the $i^{th}$ agent's perspective, and its covariance is ${}^{i}\Sigma(t)$ by Definition 1. Further, $e(t)=[{}^{1}e(t)^{\mathrm{T}}\cdots{}^{N}e(t)^{\mathrm{T}}]^{\mathrm{T}}\in\mathbb{R}^{NnN}$ stacks all the estimation errors from the individual agents' estimators, and the corresponding covariance is denoted by $\Sigma(t):=\mathbb{E}[e(t)e(t)^{\mathrm{T}}]\in\mathbb{R}^{NnN\times NnN}$. Similarly, ${}^{i}e^{-}(t)$, ${}^{i}\Sigma^{-}(t)$, $e^{-}(t)$, and $\Sigma^{-}(t)$ are defined in terms of the predicted state estimate ${}^{i}\hat{x}^{-}(t)$ (Kwon and Hwang, 2018).

Assumption 1.

The initial conditions ${}^{i}\hat{x}(0),\forall i\in\mathcal{V}$, and $\Sigma(0)$ are given to the individual agents in order to initiate each of their distributed estimators.

Based on this prior knowledge, the estimation-based control input of the $i^{th}$ agent at time step $t$ can be written as:

$$u_{i}(t)=M_{i}\sum_{k=0}^{t}\mathcal{F}_{kt}\,{}^{i}\hat{x}(k)$$ (9)

where $\mathcal{F}_{kt}\in\mathbb{R}^{pN\times nN}$ is the block of the control law matrix $\mathcal{F}$ spanning the $(knN)^{th}$ to $(knN+nN-1)^{th}$ columns and the $(tpN)^{th}$ to $(tpN+pN-1)^{th}$ rows. With (9), the entire MAS dynamics (2) can be expressed as:

$$x(t+1)=\tilde{A}x(t)+\tilde{B}\mathcal{F}_{tt}x(t)+\sum_{k=0}^{t-1}\tilde{B}\mathcal{F}_{kt}x(k)-\sum_{k=0}^{t}\bar{B}\tilde{M}\tilde{\mathcal{F}}_{kt}e(k)+\tilde{w}(t)$$ (10)

where $\bar{B}=1_{N}^{\mathrm{T}}\otimes\tilde{B}$, $\tilde{M}=\mathrm{blkdg}(M_{1}^{\mathrm{T}}M_{1},\dots,M_{N}^{\mathrm{T}}M_{N})\in\mathbb{R}^{NpN\times NpN}$, and $\tilde{\mathcal{F}}_{kt}=I_{N}\otimes\mathcal{F}_{kt}$. $\mathrm{blkdg}(\bullet)$ denotes a block-diagonal matrix with block matrices $\bullet$, and $1_{N}\in\mathbb{R}^{N}$ is the vector whose elements all equal $1$. Then the predicted state estimate of the entire MAS from the $i^{th}$ agent's perspective is given by:

$${}^{i}\hat{x}^{-}(t+1)=\tilde{A}\,{}^{i}\hat{x}(t)+\tilde{B}\mathcal{F}_{tt}\,{}^{i}\hat{x}(t)+\sum_{k=0}^{t-1}\tilde{B}\mathcal{F}_{kt}\,{}^{i}\hat{x}(k)$$ (11)

Subtracting (11) from (10) and concatenating the results for all agents in $\mathcal{V}$ gives:

$$\begin{split}e^{-}(t+1)&=\Lambda_{t}e(t)+\sum_{k=0}^{t-1}\Psi_{kt}e(k)+1_{N}\otimes\tilde{w}(t)\\ \mathrm{where}\ \Lambda_{t}&=I_{N}\otimes(\tilde{A}+\tilde{B}\mathcal{F}_{tt})-1_{N}\otimes\bar{B}\tilde{M}\tilde{\mathcal{F}}_{tt},\quad\Psi_{kt}=I_{N}\otimes\tilde{B}\mathcal{F}_{kt}-1_{N}\otimes\bar{B}\tilde{M}\tilde{\mathcal{F}}_{kt}\end{split}$$ (12)

Correspondingly, $\Sigma^{-}(t+1)$ is represented by:

$$\Sigma^{-}(t+1)=\Lambda_{t}\Sigma(t)\Lambda_{t}^{\mathrm{T}}+\Sigma_{\tilde{w}}(t)+\sum_{q=0}^{t-1}\Lambda_{t}\mathbb{E}[e(t)e(q)^{\mathrm{T}}]\Psi_{qt}^{\mathrm{T}}+\sum_{p=0}^{t-1}\Psi_{pt}\mathbb{E}[e(p)e(t)^{\mathrm{T}}]\Lambda_{t}^{\mathrm{T}}+\sum_{p=0}^{t-1}\sum_{q=0}^{t-1}\Psi_{pt}\mathbb{E}[e(p)e(q)^{\mathrm{T}}]\Psi_{qt}^{\mathrm{T}}$$ (13)

where $\Sigma_{\tilde{w}}(t)=(1_{N}1_{N}^{\mathrm{T}})\otimes\mathrm{blkdg}(\Theta_{1}(t),\dots,\Theta_{N}(t))$. Note that the summation terms on the RHS of (13) (e.g., $\mathbb{E}[e(p)e(q)^{\mathrm{T}}],q\neq p$) capture the correlations of the state estimates over time induced by the control law (9). Now, the update equation (7) can be rewritten as:

$${}^{i}\hat{x}(t+1)={}^{i}\hat{x}^{-}(t+1)+L_{i}(t+1)H_{i}\left({}^{i}e^{-}(t+1)+v_{i}(t+1)\right)$$ (14)

Like the Kalman gain, $L_{i}(t+1)$ can be computed so as to minimize the mean-squared error of the state estimate, i.e., $\mathbb{E}\left[\left\|{}^{i}e(t+1)\right\|^{2}\right]$. This is in fact equivalent to minimizing the trace of the posterior covariance matrix, i.e., $\mathrm{Tr}\left({}^{i}\Sigma(t+1)\right),\forall i\in\mathcal{V}$. By the definition of ${}^{i}\Sigma(t+1)$, we have:

$$\begin{split}{}^{i}\Sigma(t+1):=&\ \mathbb{E}[{}^{i}e(t+1)\,{}^{i}e(t+1)^{\mathrm{T}}|Z_{i,(0:t+1)}]\\ =&\left(I_{nN}-L_{i}(t+1)H_{i}\right){}^{i}\Sigma^{-}(t+1)\left(I_{nN}-L_{i}(t+1)H_{i}\right)^{\mathrm{T}}+L_{i}(t+1)H_{i}\,{}^{i}\Xi(t+1)\left(L_{i}(t+1)H_{i}\right)^{\mathrm{T}}\end{split}$$ (15)

where

$$\begin{split}L_{i}(t+1)&={}^{i}\Sigma^{-}(t+1)H_{i}^{\mathrm{T}}(S_{i}(t+1))^{-1}\\ S_{i}(t+1)&=H_{i}({}^{i}\Sigma^{-}(t+1)+{}^{i}\mathrm{\Xi}(t+1))H_{i}^{\mathrm{T}}\\ {}^{i}\mathrm{\Xi}(t+1)&=\mathrm{blkdg}(\mathrm{\Xi}_{i1}(t+1),\ \mathrm{\Xi}_{i2}(t+1),\dots,\mathrm{\Xi}_{iN}(t+1))\end{split}$$ (16)

Correspondingly, $\Sigma(t+1)$ can be updated by:

$$\Sigma(t+1)=(I-\tilde{L}(t+1))\Sigma^{-}(t+1)(I-\tilde{L}(t+1))^{\mathrm{T}}+\tilde{L}(t+1)\Sigma_{\Xi}(t+1)\tilde{L}(t+1)^{\mathrm{T}}$$ (17)

where $\Sigma_{\Xi}(t+1)=\mathrm{blkdg}({}^{1}\mathrm{\Xi}(t+1),\dots,{}^{N}\mathrm{\Xi}(t+1))$ and $\tilde{L}(t+1)=\mathrm{blkdg}(L_{1}(t+1)H_{1},\dots,L_{N}(t+1)H_{N})$. Note that the covariances between the state estimation errors at the current and past steps, i.e., $\mathbb{E}[e(t+1)e(s)^{\mathrm{T}}]$ and $\mathbb{E}[e(s)e(t+1)^{\mathrm{T}}],\forall s<t$, need to be updated using the computed $\tilde{L}(t+1)$. The cross-covariance between the $i^{th}$ and $j^{th}$ agents' state estimates, ${}^{ij}\Sigma(t+1):=\mathbb{E}[{}^{i}e(t+1)\,{}^{j}e(t+1)^{\mathrm{T}}]\in\mathbb{R}^{Nn\times Nn}$, sits at the off-diagonal block entries, while ${}^{i}\Sigma(t+1)\in\mathbb{R}^{Nn\times Nn}$ sits at the diagonal block entries of $\Sigma(t+1)\in\mathbb{R}^{NnN\times NnN}$. The detailed expansion of $\Sigma(t)$ is as follows:

$$\Sigma(t)=\begin{bmatrix}{}^{1}\Sigma(t)&{}^{12}\Sigma(t)&\cdots&{}^{1N}\Sigma(t)\\ {}^{21}\Sigma(t)&{}^{2}\Sigma(t)&\cdots&{}^{2N}\Sigma(t)\\ \vdots&\vdots&\ddots&\vdots\\ {}^{N1}\Sigma(t)&{}^{N2}\Sigma(t)&\cdots&{}^{N}\Sigma(t)\end{bmatrix}$$

Based on the computed $\Upsilon_{i}$ from (16), each agent can update ${}^{i}\hat{x}(t)$, ${}^{i}\Sigma(t)$, and $\Sigma(t)$ using (14), (15), and (17), respectively. This completes the implementation of the distributed estimation algorithm.
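For reference, a minimal sketch of a single measurement-update step, combining the gain computation (16) with the Joseph-form covariance update (15) for one agent; the function and variable names are assumptions.

```python
import numpy as np

def measurement_update(Sigma_pred, H_i, Xi_i):
    """One posterior update for agent i, cf. (15)-(16) (a sketch).

    Sigma_pred : predicted covariance iΣ^-(t+1), shape (nN, nN)
    H_i        : neighbor selector of (7), shape (n|Ω_i|, nN)
    Xi_i       : stacked measurement-noise covariance iΞ(t+1), shape (nN, nN)
    """
    S_i = H_i @ (Sigma_pred + Xi_i) @ H_i.T            # innovation covariance
    L_i = Sigma_pred @ H_i.T @ np.linalg.inv(S_i)      # estimator gain, (16)
    I_LH = np.eye(Sigma_pred.shape[0]) - L_i @ H_i
    # Joseph-form posterior covariance, (15)
    Sigma_post = I_LH @ Sigma_pred @ I_LH.T + (L_i @ H_i) @ Xi_i @ (L_i @ H_i).T
    return L_i, Sigma_post
```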

Remark 2.

It is noted that $\Sigma(t)$ computed by each agent is irrespective of the agent's perspective, since the same initial condition $\Sigma(0)$ is given to every agent by Assumption 1.

3.2 Distributed control law design

In this subsection, a computationally tractable formulation of the optimal distributed control law is derived from the individual agents' perspectives. The main idea starts with relaxing the structural constraint on $\mathcal{F}$ by applying the estimator (14) to each agent.

Definition 3.

Let ${}^{i}e:=x-{}^{i}\hat{x}$ stack the time series of the estimation errors from the $i^{th}$ agent's perspective over the time horizon $T$. Given $\mathcal{F}$ and $\Upsilon_{i},\forall i\in\mathcal{V}$, one can construct the estimation error covariance over the time horizon $T$, $\Sigma_{i}:=\mathbb{E}[{}^{i}e\,{}^{i}e^{\mathrm{T}}],\forall i\in\mathcal{V}$, as well as the cross-covariance between two different agents, $\Sigma_{ij}:=\mathbb{E}[{}^{i}e\,{}^{j}e^{\mathrm{T}}],\forall i\neq j\in\mathcal{V}$.

In terms of the time series of the estimation errors, the state-estimation-based control law over the time horizon can be expressed as:

$$u=\sum_{i}^{N}\mathcal{M}_{i}\mathcal{F}C\,{}^{i}\hat{x}=\mathcal{F}Cx-\sum_{i}^{N}\mathcal{M}_{i}\mathcal{F}C\,{}^{i}e$$ (18)

where $\mathcal{M}_{i}=I_{T}\otimes(M_{i}^{\mathrm{T}}M_{i}),\forall i\in\mathcal{V}$. Plugging (18) into (4) yields the objective cost in (8) as follows:

$$\begin{split}J(\mathcal{F},\Upsilon_{1},\cdots,\Upsilon_{N})=&\ \|\mathcal{Q}^{\frac{1}{2}}(I-P_{12}\mathcal{F}C)^{-1}P_{11}\Sigma_{w}^{\frac{1}{2}}\|^{2}_{F}+\|\mathcal{R}^{\frac{1}{2}}(I-\mathcal{F}CP_{12})^{-1}\mathcal{F}CP_{11}\Sigma_{w}^{\frac{1}{2}}\|^{2}_{F}\\ &+\sum_{i,j}\|\mathcal{Q}^{\frac{1}{2}}P_{12}(I-\mathcal{F}CP_{12})^{-1}(\mathcal{M}_{i}\mathcal{F}C\Sigma_{ij}C^{\mathrm{T}}\mathcal{F}^{\mathrm{T}}\mathcal{M}_{j}^{\mathrm{T}})^{\frac{1}{2}}\|^{2}_{F}\\ &+\sum_{i,j}\|\mathcal{R}^{\frac{1}{2}}(I-\mathcal{F}CP_{12})^{-1}(\mathcal{M}_{i}\mathcal{F}C\Sigma_{ij}C^{\mathrm{T}}\mathcal{F}^{\mathrm{T}}\mathcal{M}_{j}^{\mathrm{T}})^{\frac{1}{2}}\|^{2}_{F}\\ &+\|\mathcal{Q}^{\frac{1}{2}}(I-P_{12}\mathcal{F}C)^{-1}P_{11}\mu_{w}\|^{2}_{2}+\|\mathcal{R}^{\frac{1}{2}}(I-\mathcal{F}CP_{12})^{-1}\mathcal{F}CP_{11}\mu_{w}\|^{2}_{2}\end{split}$$ (19)

where $\|\cdot\|^{2}_{2}$ and $\|\cdot\|^{2}_{F}$ denote the squared Euclidean and Frobenius norms, respectively; and $\mu_{w}:=\mathbb{E}[w]\in\mathbb{R}^{Nn(T+1)}$, $\Sigma_{w}:=\mathbb{E}[(w-\mu_{w})(w-\mu_{w})^{\mathrm{T}}]\in\mathbb{R}^{Nn(T+1)\times Nn(T+1)}$.
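Since every term in (19) is a squared weighted norm, the cost can be evaluated without matrix square roots via the identity $\|AB^{\frac{1}{2}}\|_{F}^{2}=\mathrm{Tr}(A^{\mathrm{T}}AB)$. A minimal evaluation sketch follows; the function name, argument layout, and the containers `Sigma_cross` and `M_sel` are assumptions.

```python
import numpy as np

def cost_J(F, C, P11, P12, Q, R, Sigma_w, mu_w, Sigma_cross, M_sel):
    """Trace-form evaluation of (19) (a sketch).

    Sigma_cross[i][j] : cross-covariance Σ_ij over the horizon
    M_sel[i]          : the selector M_i = I_T ⊗ (M_i^T M_i)
    """
    I_x = np.eye(P12.shape[0])
    I_u = np.eye(F.shape[0])
    X = np.linalg.inv(I_x - P12 @ F @ C)            # closed-loop state map
    Z = np.linalg.inv(I_u - F @ C @ P12)
    Y = Z @ F @ C                                   # closed-loop input map
    J = np.trace(P11.T @ X.T @ Q @ X @ P11 @ Sigma_w)
    J += np.trace(P11.T @ Y.T @ R @ Y @ P11 @ Sigma_w)
    N = len(M_sel)
    for i in range(N):
        for j in range(N):
            E = M_sel[i] @ F @ C @ Sigma_cross[i][j] @ C.T @ F.T @ M_sel[j].T
            J += np.trace(Z.T @ P12.T @ Q @ P12 @ Z @ E)
            J += np.trace(Z.T @ R @ Z @ E)
    m_x = X @ P11 @ mu_w                            # mean-state term
    m_u = Y @ P11 @ mu_w                            # mean-input term
    J += m_x.T @ Q @ m_x + m_u.T @ R @ m_u
    return float(J)
```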

Apparently, the objective cost (19) has high-dimensional, highly coupled optimization variables: $\mathcal{F}$, which is our main interest, and $\Sigma_{ij},\forall i,j\in\mathcal{V}$, which are implicit functions of both $\mathcal{F}$ and $\Upsilon_{i},\forall i\in\mathcal{V}$. The proposed iterative optimization procedure alleviates these coupling complexities in two ways. First, akin to the alternating direction method of multipliers (ADMM) technique (Lin et al., 2013), we hold $\Sigma_{ij},\forall i,j\in\mathcal{V}$ constant while optimizing $\mathcal{F}$ at the $l^{th}$ iteration, thereby treating $J$ as a function of $\mathcal{F}$ only. Note that $\Upsilon_{i},\forall i\in\mathcal{V}$ is designed with $\mathcal{F}$ held constant in the distributed estimator design phase of the next iteration. Second, we interpret the global objective cost from the individual agent's viewpoint and translate the primal problem (Problem 2) into agent-wise objective costs. The objective cost locally seen from the $i^{th}$ agent's viewpoint at the $l^{th}$ iteration is denoted by ${}^{i}J^{(l)}$, which can be constructed using the estimated MAS input ${}^{i}u^{(l)}={}^{i}\mathcal{F}^{(l)}C(x-{}^{i}e)$ instead of (18). ${}^{i}\mathcal{F}^{(l)}$ is constructed from the $i^{th}$ agent's perspective by optimizing the agent-wise objective cost ${}^{i}J^{(l)}$. The resulting agent-wise optimization problem is as follows:

Problem 3.

Optimal distributed control law from agent-wise viewpoint:

$$\min_{{}^{i}\mathcal{F}^{(l)}\in\tilde{\mathbb{F}}}\ {}^{i}J^{(l)}({}^{i}\mathcal{F}^{(l)})$$ (20)

where

$$\begin{split}{}^{i}J^{(l)}({}^{i}\mathcal{F}^{(l)})=&\ \|\mathcal{Q}^{\frac{1}{2}}(I-P_{12}\,{}^{i}\mathcal{F}^{(l)}C)^{-1}P_{11}\,{}^{i}\Sigma_{w}^{\frac{1}{2}}\|^{2}_{F}+\|\mathcal{R}^{\frac{1}{2}}(I-{}^{i}\mathcal{F}^{(l)}CP_{12})^{-1}\,{}^{i}\mathcal{F}^{(l)}CP_{11}\,{}^{i}\Sigma_{w}^{\frac{1}{2}}\|^{2}_{F}\\ &+\|\mathcal{Q}^{\frac{1}{2}}(I-P_{12}\,{}^{i}\mathcal{F}^{(l)}C)^{-1}P_{12}\,{}^{i}\mathcal{F}^{(l)}C\Sigma_{i}^{(l)\frac{1}{2}}\|^{2}_{F}+\|\mathcal{R}^{\frac{1}{2}}(I-{}^{i}\mathcal{F}^{(l)}CP_{12})^{-1}\,{}^{i}\mathcal{F}^{(l)}C\Sigma_{i}^{(l)\frac{1}{2}}\|^{2}_{F}\\ &+\|\mathcal{Q}^{\frac{1}{2}}(I-P_{12}\,{}^{i}\mathcal{F}^{(l)}C)^{-1}P_{11}\,{}^{i}\mu_{w}\|^{2}_{2}+\|\mathcal{R}^{\frac{1}{2}}(I-{}^{i}\mathcal{F}^{(l)}CP_{12})^{-1}\,{}^{i}\mathcal{F}^{(l)}CP_{11}\,{}^{i}\mu_{w}\|^{2}_{2}\end{split}$$ (21)

where ${}^{i}\mu_{w}=\mathbb{E}[w|Z_{i,(0:T)}]$ and ${}^{i}\Sigma_{w}:=\mathbb{E}[(w-{}^{i}\mu_{w})(w-{}^{i}\mu_{w})^{\mathrm{T}}|Z_{i,(0:T)}],\forall i\in\mathcal{V}$. Note that $\Sigma_{i}^{(l)}\in\mathbb{R}^{Nn(T+1)\times Nn(T+1)}$ is computed by Definition 3 at the $l^{th}$ iteration.

Definition 4.

A subspace $\tilde{\mathbb{F}}$ is quadratically invariant (QI) with respect to $CP_{12}$ if and only if ${}^{i}\mathcal{F}^{(l)}CP_{12}\,{}^{i}\mathcal{F}^{(l)}\in\tilde{\mathbb{F}}$ for all ${}^{i}\mathcal{F}^{(l)}\in\tilde{\mathbb{F}}$. It is trivial to show that $\tilde{\mathbb{F}}$ is QI with respect to $CP_{12}$ (Lessard and Lall, 2011).

It is a well-known fact that QI is a necessary and sufficient condition for an exact convex reformulation (Lessard and Lall, 2011). That is, one can apply an equivalent disturbance-feedback policy to bring (21) into a convex form, similar to (Furieri and Kamgarpour, 2020).

Definition 5.

Let us introduce the nonlinear mapping as:

$$h(\Phi)=(I+\Phi CP_{12})^{-1}\Phi,\quad h:\mathbb{R}^{NpT\times NnT}\mapsto\mathbb{R}^{NpT\times NnT}$$ (22)

and define the cost function $\tilde{J}:\mathbb{R}^{NpT\times NnT}\mapsto\mathbb{R}$ in terms of the design parameter ${}^{i}\Phi^{(l)}$ (Furieri and Kamgarpour, 2020):

$$\begin{split}{}^{i}\tilde{J}^{(l)}({}^{i}\Phi^{(l)})&=\|\mathcal{Q}^{\frac{1}{2}}(I+P_{12}\,{}^{i}\Phi^{(l)}C)P_{11}\,{}^{i}\Sigma_{w}^{\frac{1}{2}}\|^{2}_{F}+\|\mathcal{R}^{\frac{1}{2}}\,{}^{i}\Phi^{(l)}CP_{11}\,{}^{i}\Sigma_{w}^{\frac{1}{2}}\|^{2}_{F}+\|\mathcal{Q}^{\frac{1}{2}}P_{12}\,{}^{i}\Phi^{(l)}\Sigma_{i}^{(l)\frac{1}{2}}\|^{2}_{F}\\ &\ \ +\|\mathcal{R}^{\frac{1}{2}}\,{}^{i}\Phi^{(l)}\Sigma_{i}^{(l)\frac{1}{2}}\|^{2}_{F}+\|\mathcal{R}^{\frac{1}{2}}\,{}^{i}\Phi^{(l)}CP_{11}\,{}^{i}\mu_{w}\|^{2}_{2}+\|\mathcal{Q}^{\frac{1}{2}}(I+P_{12}\,{}^{i}\Phi^{(l)}C)P_{11}\,{}^{i}\mu_{w}\|^{2}_{2}\end{split}$$ (23)

By Theorem 11 in (Furieri and Kamgarpour, 2020), the following convex optimization problem is equivalent to Problem 3.

Problem 4.

Equivalent convex problem to optimal distributed control from agent-wise viewpoint:

$$\min_{{}^{i}\Phi^{(l)}\in h^{-1}(\tilde{\mathbb{F}})}\ {}^{i}\tilde{J}^{(l)}({}^{i}\Phi^{(l)})$$ (24)

By solving (24) using convex programming, one can find the optimal ${}^{i}\Phi^{(l)}$ and the corresponding ${}^{i}\mathcal{F}^{(l)}$ through the mapping (22). The same optimization routine (Problems 3 and 4), based on the locally seen costs from the other agents' perspectives, is processed to obtain the optimal control laws ${}^{i}\mathcal{F}^{(l)},\forall i\in\mathcal{V}$ at the $l^{th}$ iteration step.
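A small sketch of the mapping (22) and its algebraic inverse may help: solving $\mathcal{F}=h(\Phi)$ for $\Phi$ gives $\Phi=\mathcal{F}(I-CP_{12}\mathcal{F})^{-1}$, which is how a gain in $\tilde{\mathbb{F}}$ maps back into the disturbance-feedback parameterization. The function names are assumptions.

```python
import numpy as np

def h(Phi, CP12):
    """Forward map (22): h(Phi) = (I + Phi C P12)^{-1} Phi.
    Recovers the feedback gain F from the disturbance-feedback parameter Phi."""
    I = np.eye(Phi.shape[0])
    return np.linalg.solve(I + Phi @ CP12, Phi)

def h_inv(F, CP12):
    """Algebraic inverse of (22): solving F = h(Phi) yields
    Phi = F (I - C P12 F)^{-1}."""
    I = np.eye(CP12.shape[0])
    return F @ np.linalg.inv(I - CP12 @ F)
```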

3.3 Distributed control-estimation synthesis

The set of optimal control laws from the individual agents' viewpoints, ${}^{i}\mathcal{F}^{(l)},\forall i\in\mathcal{V}$, is mixed to approximate the solution to Problem 2 by the following agent-wise mixing strategy:

$$\mathcal{F}^{(l)}=\sum_{i}^{N}\mathcal{M}_{i}\,{}^{i}\mathcal{F}^{(l)}$$ (25)

The basic intuition of the proposed strategy is to compose the control law for the $i^{th}$ agent from the one computed in the sub-optimization problem (Problem 3) solved from the $i^{th}$ agent's perspective. Accordingly, the proposed mixing strategy allows the individual agents to retain distributed controllers to be executed, each keeping its own sub-optimal solution without interfering with the others.
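In matrix terms, the mixing (25) overwrites, for every time block, the rows associated with agent $i$'s input using agent $i$'s own solution ${}^{i}\mathcal{F}^{(l)}$. A minimal sketch (the function name is an assumption):

```python
import numpy as np

def mix_control_laws(F_list, p, N, T):
    """Agent-wise mixing (25) (a sketch).

    F_list : list of N matrices of shape (N*p*T, N*n*T), the iF^{(l)}
    """
    F_mixed = np.zeros_like(F_list[0])
    for i, F_i in enumerate(F_list):
        # M_i = I_T ⊗ (M_i^T M_i) keeps only agent i's input rows
        for t in range(T):
            rows = slice(t * p * N + i * p, t * p * N + (i + 1) * p)
            F_mixed[rows, :] = F_i[rows, :]
    return F_mixed
```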

3.4 Convergence check

The last step of the iteration loop evaluates the designed distributed control law (25) together with the estimator (7). First, let $S$ be a set that stores the designed control law $\mathcal{F}^{(l)}$ and the estimator gains $\Upsilon_{i}^{(l)},\forall i\in\mathcal{V}$ from each iteration step as follows:

$$S:=\left\{s^{(l)}\,\middle|\,s^{(l)}=\left(\mathcal{F}^{(l)},\Upsilon_{1}^{(l)},\cdots,\Upsilon_{N}^{(l)}\right),\ l\in\mathbb{N}\right\}$$ (26)

The iteration terminates if: i) the total iteration count reaches the threshold number $N_{max}$; or ii) consecutive iterations converge with respect to the following stopping condition:

$$\triangle J(l,l-1)\leq\epsilon_{stop}$$ (27)

where $\triangle J(l,l-1):=\lvert J(\mathcal{F}^{(l)},\Upsilon_{1}^{(l)},\cdots,\Upsilon_{N}^{(l)})-J(\mathcal{F}^{(l-1)},\Upsilon_{1}^{(l-1)},\cdots,\Upsilon_{N}^{(l-1)})\rvert$ and $\epsilon_{stop}$ is the threshold magnitude for convergence. The objective cost of the corresponding control law, $J(\mathcal{F}^{(l)},\Upsilon_{1}^{(l)},\cdots,\Upsilon_{N}^{(l)})$, is computed by plugging the designed control law $\mathcal{F}^{(l)}$ and the set of distributed estimator gains $\Upsilon_{i}^{(l)},\forall i\in\mathcal{V}$ into (19). The final output of the control-estimation synthesis is given by:

$$\begin{split}\mathcal{F}^{\ast}&=\mathcal{F}^{(l)},\ \Upsilon_{i}^{\ast}=\Upsilon_{i}^{(l)},\ \forall i\in\mathcal{V}\\ \mathrm{where}\ l&=\operatorname*{arg\,min}_{l\in\{1,\cdots,|S|\}}J(\mathcal{F}^{(l)},\Upsilon_{1}^{(l)},\cdots,\Upsilon_{N}^{(l)})\end{split}$$ (28)

The overall recursive structure of the control-estimation synthesis procedure is summarized in Algorithm 1.

Algorithm 1: Virtual interaction based distributed control-estimation synthesis.

Initialize the MAS dynamics information $A$, $B$, $\mathcal{L}$, $\Sigma(0)$, $\mathcal{F}^{(0)}$, $\epsilon_{stop}$, $N_{max}$, and the cost metrics $\mathcal{Q}$, $\mathcal{R}$.
for $l=1,2,\cdots,N_{max}$ do
  a) Distributed estimator design
     for $t=0$ to the termination time $T$ do
       1) Update $\Sigma(t+1)$ using $\mathcal{F}^{(l-1)}$, (13), (16), and (17)
     end for; Output $\longrightarrow\Upsilon_{i}^{(l)}$ and $\Sigma_{i}^{(l)},\forall i\in\mathcal{V}$
  b) Distributed control law design
     for $i=1$ to the total number of agents $N$ do
       2) Solve (24) and compute ${}^{i}\mathcal{F}^{(l)}$
     end for; Output $\longrightarrow{}^{i}\mathcal{F}^{(l)},\forall i\in\mathcal{V}$
  c) Distributed control-estimation synthesis
       3) Synthesize the control law $\mathcal{F}^{(l)}$ using (25)
  d) Convergence check
       4) Store $\mathcal{F}^{(l)}$ and $\Upsilon_{i}^{(l)},\forall i\in\mathcal{V}$ in the set $S$
       5) If (27) is satisfied or $l>N_{max}$, terminate
end for; Output $\Longrightarrow\mathcal{F}^{\ast}$ and $\Upsilon_{i}^{\ast},\forall i\in\mathcal{V}$
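A compact sketch of the outer loop of Algorithm 1, i.e., the bookkeeping (26), the stopping test (27), and the final selection (28). The callables `design_step` and `cost` stand for steps a)-c) and the evaluation of (19), respectively, and are assumptions.

```python
def synthesize(design_step, cost, N_max, eps_stop):
    """Outer loop of Algorithm 1 (a sketch)."""
    S = []                                  # stored tuples s^(l), cf. (26)
    J_prev = float("inf")
    for l in range(1, N_max + 1):
        s_l = design_step()                 # estimator design, control design, mixing
        S.append(s_l)
        J_l = cost(s_l)                     # evaluate (19) at s^(l)
        if abs(J_l - J_prev) <= eps_stop:   # stopping condition (27)
            break
        J_prev = J_l
    return min(S, key=cost)                 # final selection (28)
```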

It is noted that Algorithm 1 is executed in the offline design phase. Once the distributed control law $\mathcal{F}^{\ast}$ and the corresponding estimator gains $\Upsilon_{i}^{\ast}$ for each agent are designed, each agent is deployed into distributed online operation using its own prior knowledge. It is worth noting that the majority of the heavy computation occurs in the offline design phase. Therefore, the online operation is not burdensome to the limited on-board resources of each agent. Indeed, the computational complexity of the online operation of the proposed algorithm scales with the number of agents, i.e., $\mathcal{O}(N)$.

4 Stability Analysis

In this section, the stability analysis of the proposed distributed estimation algorithm from Section 3.1 is presented. To begin with, let us consider the control law as a static memoryless feedback gain $F$:

$$u_{i}(t)=M_{i}F\,{}^{i}\hat{x}(t)$$ (29)

where $M_{i}=[0_{p}\cdots I_{p}\cdots 0_{p}]\in\mathbb{R}^{p\times Np}$ is the block matrix having $I_{p}$ in the $i^{th}$ block entry and $0_{p}$ in the other entries. $F$ has structural constraints subject to the network topology of the MAS specified by the Laplacian $\mathcal{L}$. Note that the estimation stability is unrelated to the design of $F$, as will be discussed below, and the analysis is thus readily applicable to memory-based feedback control laws as in (9). Corresponding to (29), the predicted state estimation error of the $i^{th}$ agent can be written as follows:

$${}^{i}e^{-}(t+1)=D_{1}\,{}^{i}e(t)+D_{12}e(t)+\tilde{w}(t)$$ (30)

    where

$$\begin{split}D_{1}&=\tilde{A}+\tilde{B}F\\ D_{12}&=-(1_{N}^{\mathrm{T}}\otimes\tilde{B})\,\mathrm{blkdg}(M_{1}^{\mathrm{T}}M_{1},\dots,M_{N}^{\mathrm{T}}M_{N})(I_{N}\otimes F)\\ e(t)&=[{}^{1}e(t)^{\mathrm{T}}\cdots{}^{N}e(t)^{\mathrm{T}}]^{\mathrm{T}}\end{split}$$

It is noted from (30) that the estimation error of the individual agent, ${}^{i}e$, is coupled with the augmented estimation error of the entire MAS, $e$. The following two lemmas are required for proving the stability of the proposed distributed estimation algorithm.

Lemma 1.

The estimation error covariance of the $i^{th}$ agent, ${}^{i}\Sigma(t)$, is positive definite and bounded for all $t$ if the following system is observable (Kwon and Hwang, 2018):

$$\begin{split}x(t+1)&=\mathcal{L}x(t)\\ Z_{i}(t)&=C_{i}x(t)\end{split}$$ (31)

where $C_{i}=[h_{1}\ h_{2}\cdots h_{|\Omega_{i}|}]^{\mathrm{T}}\otimes I_{n}\in\mathbb{R}^{n|\Omega_{i}|\times nN}$ is an observer matrix which gathers the measurements available from the $i^{th}$ agent's perspective, i.e., those of the $i^{th}$ agent's neighbors. $h_{q}\in\mathbb{R}^{N},q=1,2,\dots,|\Omega_{i}|$, are the nonzero column vectors of the matrix $\mathrm{diag}(c_{i1},c_{i2},\dots,c_{iN})$. It is noted that the value of $c_{ij}$ is decided by the graph $\mathcal{G}=(\mathcal{V},\mathcal{E})$ of the given network, where $(i,j)\in\mathcal{E}$ indicates the availability of the measurement of the $j^{th}$ agent's state from the $i^{th}$ agent, i.e., $c_{ij}=1$, and $(i,j)\notin\mathcal{E}$ means $c_{ij}=0$.

Proof. The proof is referred to the authors' previous work (Kwon and Hwang, 2018). $\blacksquare$
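The observability condition of Lemma 1 can be checked numerically with a standard Kalman rank test. The sketch below assumes that the Laplacian acts blockwise on the stacked state, i.e., that the system matrix of (31) is $\mathcal{L}\otimes I_{n}$; the function name is also an assumption.

```python
import numpy as np

def lemma1_observable(Lap, C_i, n):
    """Kalman rank test for the pair in (31) (a sketch)."""
    A_sys = np.kron(Lap, np.eye(n))     # assumed blockwise Laplacian dynamics
    nN = A_sys.shape[0]
    blocks, M = [], C_i.copy().astype(float)
    for _ in range(nN):                 # O = [C; CA; ...; CA^{nN-1}]
        blocks.append(M)
        M = M @ A_sys
    return np.linalg.matrix_rank(np.vstack(blocks)) == nN
```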

To analyze the estimation stability of ${}^{i}e$, we first introduce $H_{i}(t)\in\mathbb{R}^{nN^{2}\times nN}$ as an affine mapping matrix and $\alpha_{i}(t)\in\mathbb{R}^{nN^{2}}$ as a lumped noise as follows:

$$\begin{split}H_{i}(t+1)&=\tilde{F}(t+1)H_{i}(t)({}^{i}D_{22}(t+1)\,{}^{i}D_{2}(t))^{-1}\\ \alpha_{i}(t+1)&=(\tilde{F}(t+1)-H_{i}(t+1)\,{}^{i}D_{22}(t+1)D_{12})\alpha_{i}(t)+\gamma(t+1)-H_{i}(t+1)\zeta_{i}(t+1)\\ {}^{i}D_{2}(t)&=D_{1}+D_{12}H_{i}(t)\\ {}^{i}D_{22}(t+1)&=I_{nN}-L_{i}(t+1)C_{i}\\ \zeta_{i}(t+1)&={}^{i}D_{22}(t+1)\tilde{w}(t)-L_{i}(t+1)C_{i}v_{i}(t+1)\\ \tilde{F}(t+1)&=\mathrm{blkdg}({}^{1}D_{22}(t+1)\,{}^{1}D_{2}(t),\ \dots,\ {}^{N}D_{22}(t+1)\,{}^{N}D_{2}(t))\\ \gamma(t+1)&=\begin{bmatrix}{}^{1}D_{22}(t+1)D_{12}\alpha_{1}(t)+\zeta_{1}(t+1)\\ \vdots\\ {}^{N}D_{22}(t+1)D_{12}\alpha_{N}(t)+\zeta_{N}(t+1)\end{bmatrix}\end{split}$$ (32)

It is noted that it is trivial to compute the initial affine transformation matrix $H_{i}(0),\forall i\in\mathcal{V}$, which satisfies $\Sigma(0)=H_{i}(0)\,{}^{i}\Sigma(0)H_{i}(0)^{\mathrm{T}}$ under Assumption 1. On the other hand, $\alpha_{i}$ follows the Gaussian distribution $\alpha_{i}(t)\sim\mathcal{N}_{nNN}(0,\eta_{i}(t))$, where $\eta_{i}(t)=\mathbb{E}[\alpha_{i}(t)\alpha_{i}^{\mathrm{T}}(t)]$ with initial value $\eta_{i}(0)=0_{nNN}$. Then, we can show that the augmented estimation error can be mapped to the estimation error from a single agent's perspective by the following lemma.

Lemma 2.

Suppose the system given in (31) is observable and Assumption 1 holds. Then, given the agent dynamics and the estimation-based control (29) with control law $F$, there exists an affine mapping between the estimation error of the $i^{th}$ agent, ${}^{i}e$, and the augmented MAS estimation error, $e$, as follows:

$$e(t+1)=H_{i}(t+1)\,{}^{i}e(t+1)+\alpha_{i}(t+1),\quad\forall i\in\mathcal{V},\ \forall t\geq 0$$ (33)

Proof. The proof proceeds by induction. Suppose the estimation error at time step $t$ satisfies (33). To verify that (33) is satisfied at the next time step $t+1$ with the definitions of $H_{i}(t+1)$ and $\alpha_{i}(t+1)$, the dynamics of the estimation error of the $i^{th}$ agent is considered. By (33), (30) can be restated as follows:

$$\begin{split}{}^{i}e^{-}(t+1)&=D_{1}\,{}^{i}e(t)+D_{12}(H_{i}(t)\,{}^{i}e(t)+\alpha_{i}(t))+\tilde{w}(t)\\ &={}^{i}D_{2}(t)\,{}^{i}e(t)+D_{12}\alpha_{i}(t)+\tilde{w}(t)\end{split}$$ (34)

And the updated estimation error is derived as follows:

$${}^{i}e(t+1)={}^{i}D_{22}(t+1)\,{}^{i}D_{2}(t)\,{}^{i}e(t)+{}^{i}D_{22}(t+1)D_{12}\alpha_{i}(t)+\zeta_{i}(t+1)$$ (35)

By concatenating (35) for all agents $i\in\mathcal{V}$, the MAS augmented estimation error, $e$, can be formulated as follows:

$$\begin{split}e(t+1)&=\tilde{F}(t+1)e(t)+\gamma(t+1)\\ &=\tilde{F}(t+1)H_{i}(t)\,{}^{i}e(t)+\tilde{F}(t+1)\alpha_{i}(t)+\gamma(t+1)\\ &=H_{i}(t+1)\,{}^{i}e(t+1)+\alpha_{i}(t+1)\end{split}$$ (36)

Without loss of generality, the matrix ${}^{i}D_{22}(t+1)\,{}^{i}D_{2}(t)$ is invertible, as it governs the estimation error dynamics (35) induced by the stochastic linear dynamical system. This completes the proof of Lemma 2. $\blacksquare$

Based on the derived affine mapping between $e$ and ${}^{i}e$, we are ready to present the stability of the proposed estimation algorithm. As this paper considers a stochastic MAS, the estimation error is analyzed through a supermartingale of Lyapunov functions, which satisfies the following conditions:

$$\left\{\begin{aligned}&V(e(t),t)=0,&&e(t)=\mathbf{0}\\&V(e(t),t)>0,&&e(t)\neq\mathbf{0}\\&V(e(t),t)\rightarrow\infty,&&e(t)\rightarrow\infty\end{aligned}\right.\quad\forall t$$ (37)
$$\Delta V(t+1,t)<0,\quad\forall t$$ (38)

where $\Delta V(t+1,t):=V(\mathbb{E}\left[e(t+1)|e(t)\right],t+1)-V(e(t),t)$.

Theorem 1.

Given the MAS dynamics and the control protocol (29), the proposed distributed estimation algorithm is globally asymptotically stable in the sense of Lyapunov if the system (31) is observable.

Proof. Let us define the Lyapunov function $V:\mathbb{R}^{nN}\times\mathbb{N}\rightarrow\mathbb{R}$ of the estimation error of the $i^{th}$ agent as follows:

$$\begin{split}V({}^{i}e(t),t):=&\ {}^{i}e^{\mathrm{T}}(t)\left({}^{i}\Sigma(t)\right)^{-1}{}^{i}e(t)\\ \triangle V(t+1,t):=&\ V(\mathbb{E}[{}^{i}e(t+1)|{}^{i}e(t)],t+1)-V({}^{i}e(t),t)\end{split}$$ (39)

As ${}^{i}\Sigma(t)$ is positive definite and bounded by Lemma 1, $({}^{i}\Sigma(t))^{-1}\succ 0$ exists. Therefore, $V$ is a quadratic function which satisfies the conditions in (37). Besides, using (34), the conditional expectation $\mathbb{E}[{}^{i}e(t+1)|{}^{i}e(t)]$ is given by

$$\mathbb{E}[{}^{i}e(t+1)|{}^{i}e(t)]={}^{i}D_{22}(t+1)\,{}^{i}D_{2}(t)\,{}^{i}e(t)$$

Then, by applying equation (35), $\triangle V$ can be written as follows:

$$\begin{split}\triangle V(t+1,t)=&-{}^{i}e^{\mathrm{T}}(t)\left(({}^{i}\Sigma(t))^{-1}-{}^{i}D_{2}^{\mathrm{T}}(t)\,{}^{i}D_{22}^{\mathrm{T}}(t+1)({}^{i}\Sigma(t+1))^{-1}\,{}^{i}D_{22}(t+1)\,{}^{i}D_{2}(t)\right){}^{i}e(t)\\ =&-{}^{i}e^{\mathrm{T}}(t)\,\mathcal{M}_{i}(t+1)\,{}^{i}e(t)\end{split}$$ (40)

To satisfy the condition in (38), $\mathcal{M}_{i}(t+1)$ should be a positive definite matrix. By applying Lemma 2, the predicted estimation error covariance of the $i^{th}$ agent is derived using (34) as:

$${}^{i}\Sigma^{-}(t+1)={}^{i}D_{2}(t)\,{}^{i}\Sigma(t)\,{}^{i}D_{2}^{\mathrm{T}}(t)+D_{12}\eta_{i}(t)D_{12}^{\mathrm{T}}+\Sigma_{\tilde{w}}$$ (41)

Correspondingly, the updated estimation error covariance can be computed as follows:

$$\begin{split}{}^{i}\Sigma(t+1)&={}^{i}D_{22}(t+1)\,{}^{i}\Sigma^{-}(t+1)\\ &={}^{i}\Sigma^{-}(t+1)-{}^{i}\Sigma^{-}(t+1)C_{i}^{\mathrm{T}}\left(C_{i}\,{}^{i}\Sigma^{-}(t+1)C_{i}^{\mathrm{T}}+C_{i}\mathrm{\Xi}_{i}C_{i}^{\mathrm{T}}\right)^{-1}C_{i}\,{}^{i}\Sigma^{-}(t+1)\end{split}$$ (42)

and, using the definition of ${}^{i}D_{22}$, ${}^{i}\Sigma(t+1)$ can be expressed as:

$${}^{i}\Sigma(t+1)={}^{i}D_{22}(t+1)\,{}^{i}\Sigma^{-}(t+1)$$ (43)

By applying the matrix inversion lemma, (42) is rewritten as:

$${}^{i}\Sigma(t+1)=\left(({}^{i}\Sigma^{-}(t+1))^{-1}+C_{i}^{\mathrm{T}}(C_{i}\mathrm{\Xi}_{i}C_{i}^{\mathrm{T}})^{-1}C_{i}\right)^{-1}$$ (44)

Using (44), multiplying $({}^{i}\Sigma(t+1))^{-1}$ on the left and right by ${}^{i}\Sigma(t+1)$ and applying (43) yields:

$${}^{i}\Sigma(t+1)={}^{i}D_{22}(t+1)\left({}^{i}\Sigma^{-}(t+1)+\mathcal{W}_{i}(t+1)\right){}^{i}D_{22}^{\mathrm{T}}(t+1)$$ (45)

where $\mathcal{W}_{i}(t+1)={}^{i}\Sigma^{-}(t+1)C_{i}^{\mathrm{T}}(C_{i}\mathrm{\Xi}_{i}C_{i}^{\mathrm{T}})^{-1}C_{i}\,{}^{i}\Sigma^{-}(t+1)$. It is trivial to show that $\mathcal{W}_{i}(t+1)\succ 0,\ \forall t$. Recalling ${}^{i}\Sigma^{-}(t+1)$ given in (41), $({}^{i}\Sigma(t+1))^{-1}$ can be rewritten as:

$$({}^{i}\Sigma(t+1))^{-1}=({}^{i}D_{22}^{\mathrm{T}}(t+1))^{-1}\left({}^{i}D_{2}(t)\,{}^{i}\Sigma(t)\,{}^{i}D_{2}^{\mathrm{T}}(t)+\Sigma_{\tilde{w}}+D_{12}\eta_{i}(t)D_{12}^{\mathrm{T}}+\mathcal{W}_{i}(t+1)\right)^{-1}({}^{i}D_{22}(t+1))^{-1}$$ (46)

By applying (46), $\mathcal{M}_{i}(t+1)$ can be redefined as:

$$\mathcal{M}_{i}(t+1)=({}^{i}\Sigma(t))^{-1}-{}^{i}D_{2}^{\mathrm{T}}(t)\left({}^{i}D_{2}(t)\,{}^{i}\Sigma(t)\,{}^{i}D_{2}^{\mathrm{T}}(t)+D_{12}\eta_{i}(t)D_{12}^{\mathrm{T}}+\Sigma_{\tilde{w}}+\mathcal{W}_{i}(t+1)\right)^{-1}{}^{i}D_{2}(t)$$ (47)

After tedious manipulation using the matrix inversion lemma, (47) can be rewritten as follows:

$$\mathcal{M}_{i}(t+1)=({}^{i}\Sigma(t))^{-1}\left(({}^{i}\Sigma(t))^{-1}+{}^{i}D_{2}^{\mathrm{T}}(t)(D_{12}\eta_{i}(t)D_{12}^{\mathrm{T}}+\Sigma_{\tilde{w}}+\mathcal{W}_{i}(t+1))^{-1}\,{}^{i}D_{2}(t)\right)^{-1}({}^{i}\Sigma(t))^{-1}$$ (48)

As $({}^{i}\Sigma(t))^{-1}\succ 0$ and ${}^{i}D_{2}^{\mathrm{T}}(t)(D_{12}\eta_{i}(t)D_{12}^{\mathrm{T}}+\Sigma_{\tilde{w}}+\mathcal{W}_{i}(t+1))^{-1}\,{}^{i}D_{2}(t)\succ 0$ in (48), one can verify that $\mathcal{M}_{i}(t+1)\succ 0,\ \forall t$, which guarantees that the Lyapunov function satisfies (37) and (38). This proves that the estimation error is globally asymptotically stable. $\blacksquare$

5 Numerical Simulation

The effectiveness of the proposed algorithm is demonstrated with an illustrative MAS example. The MAS consists of five agents whose dynamics and objective are specified by the following parameter sets: $A=1$, $B=1$, $\Theta_{i}(t)=1$, ${}^{i}\mathrm{\Xi}(t)=\mathrm{diag}(1,1,1,2,1),\forall i\in\mathcal{V},\forall t\in\{0,\cdots,T\}$, $T=5$, $\mathcal{Q}=I_{6}\otimes(5I_{5}-1_{5\times 5})$, and $\mathcal{R}=I_{25}$. The MAS network topology is set to be partially connected, the same as the one in (Kwon and Hwang, 2018). To validate the performance of the proposed algorithm, we conduct a comparative analysis with two different scenarios: i) MAS with a fully connected network, which is free from the network topological constraint; and ii) MAS with the same (partially connected) network topology, where non-neighboring agent information is not available to each agent. For the second case, we test the suboptimal method presented in (Gupta et al., 2005). The simulation results are shown in Figure 3. By virtue of the virtual interactions between non-neighboring agents, our algorithm outperforms the existing method in the partially connected network and even matches the fully connected network case despite the network topological constraints.

Figure 3: Cost value statistics histogram (Monte Carlo simulations with 100 runs).
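For reproducibility, the simulation parameters above translate directly into code; a minimal sketch with $n=p=1$ (the variable names are assumptions):

```python
import numpy as np

N, T = 5, 5                                        # five agents, horizon T = 5
A, B = np.array([[1.0]]), np.array([[1.0]])        # scalar agent dynamics
Theta = np.eye(1)                                  # process noise covariance Θ_i(t)
Xi_i = np.diag([1.0, 1.0, 1.0, 2.0, 1.0])          # iΞ(t), identical for all agents
Q = np.kron(np.eye(T + 1), 5 * np.eye(N) - np.ones((N, N)))  # Q = I_6 ⊗ (5 I_5 - 1_{5x5})
R = np.eye(N * T)                                  # R = I_25
```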

To further demonstrate the performance of the proposed algorithm, we have performed additional simulations with respect to different numbers of "real" interactions. It is worth noting that the network links between agents yield the "real" interactions, while those with no explicit links create the virtual interactions. The MAS with five agents has been simulated under the different network topologies shown in Figure 4.

Figure 4: Different network topologies with 5 agents.

All experiments followed the same setting except for the network topology. Apparently, the number of virtual interactions decreases as the number of network links between agents increases, whereby we can analyze the effect of the ratio between real and virtual interactions. The performance of our proposed method under the different network settings is depicted in Figure 5.

Figure 5: Cost value comparison with a different number of network links between agents.

The results show that the performance of our proposed method does not vary much with respect to the ratio of virtual to real interactions. At the cost of some on-board computational resources for estimating non-neighboring agents, our proposed method provides the clear advantage of achieving optimal performance with fewer network connections.

6 Conclusions

This paper has proposed a novel design procedure for the optimal distributed control of a linear stochastic MAS, which is generally subject to network topological constraints. The proposed method gets around the network topological constraint by employing a distributed estimator, whereby each agent can exploit non-neighboring agents' information. Future work will include a theoretical performance guarantee of the proposed distributed control-estimation synthesis, such as closed-loop stability analysis, and a further extension to the infinite time horizon case for practical use.

References

• Shi and Yan [2020] Peng Shi and Bing Yan. A survey on intelligent control for multiagent systems. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2020.
• Li et al. [2019] Xianwei Li, Fangzhou Liu, Martin Buss, and Sandra Hirche. Fully distributed consensus control for linear multi-agent systems: A reduced-order adaptive feedback approach. IEEE Transactions on Control of Network Systems, 2019.
• Zhu et al. [2020] Yunru Zhu, Liqi Zhou, Yuanshi Zheng, Jian Liu, and Shiming Chen. Sampled-data based resilient consensus of heterogeneous multiagent systems. International Journal of Robust and Nonlinear Control, 30(17):7370–7381, 2020.
• Morita et al. [2015] Ryosuke Morita, Takayuki Wada, Izumi Masubuchi, Toru Asai, and Yasumasa Fujisaki. Multiagent consensus with noisy communication: stopping rules based on network graphs. IEEE Transactions on Control of Network Systems, 3(4):358–365, 2015.
• Gupta et al. [2005] Vijay Gupta, Babak Hassibi, and Richard M. Murray. A sub-optimal algorithm to synthesize control laws for a network of dynamic agents. International Journal of Control, 78(16):1302–1313, 2005.
• Ma et al. [2015] Jingying Ma, Yuanshi Zheng, and Long Wang. LQR-based optimal topology of leader-following consensus. International Journal of Robust and Nonlinear Control, 25(17):3404–3421, 2015.
• Nguyen [2015] Dinh Hoa Nguyen. A sub-optimal consensus design for multi-agent systems based on hierarchical LQR. Automatica, 55:88–94, 2015.
• Jiao et al. [2019] Junjie Jiao, Harry L. Trentelman, and M. Kanat Camlibel. A suboptimality approach to distributed linear quadratic optimal control. IEEE Transactions on Automatic Control, 65(3):1218–1225, 2019.
• Kwon and Hwang [2018] Cheolhyeon Kwon and Inseok Hwang. Sensing-based distributed state estimation for cooperative multiagent systems. IEEE Transactions on Automatic Control, 64(6):2368–2382, 2018.
• Furieri and Kamgarpour [2020] Luca Furieri and Maryam Kamgarpour. First order methods for globally optimal distributed controllers beyond quadratic invariance. In 2020 American Control Conference (ACC), pages 4588–4593. IEEE, 2020.
• Lin et al. [2013] Fu Lin, Makan Fardad, and Mihailo R. Jovanović. Design of optimal sparse feedback gains via the alternating direction method of multipliers. IEEE Transactions on Automatic Control, 58(9):2426–2431, 2013.
• Lessard and Lall [2011] Laurent Lessard and Sanjay Lall. Quadratic invariance is necessary and sufficient for convexity. In Proceedings of the 2011 American Control Conference, pages 5360–5362. IEEE, 2011.