
Distributed Control-Estimation Synthesis for Stochastic Multi-Agent Systems via Virtual Interaction between Non-neighboring Agents

Hojin Lee
Department of Mechanical Engineering
Ulsan National Institute of Science and Technology
Ulsan, 44919 Republic of Korea
hojinlee@unist.ac.kr
Cheolhyeon Kwon
Department of Mechanical Engineering
Ulsan National Institute of Science and Technology
Ulsan, 44919 Republic of Korea
kwonc@unist.ac.kr
© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Citation information: DOI 10.1109/LCSYS.2021.3086848, IEEE Control Systems Letters. This work was supported by the National Research Foundation of Korea (NRF) grants funded by the Korea government (MSIT) (No. 2020R1A5A8018822 and No. 2020R1C1C1007323).
Abstract

This paper considers the optimal distributed control problem for a linear stochastic multi-agent system (MAS). Due to the distributed nature of the MAS network, the information available to an individual agent is limited to its vicinity. From the perspective of the entire MAS, this imposes a structural constraint on the control law, making the optimal control law computationally intractable. This paper attempts to relax such a structural constraint by expanding the neighboring information of each agent to the entire MAS, enabled by a distributed estimation algorithm embedded in each agent. By exploiting the estimated information, each agent is not limited to interacting with its neighborhood but can further establish 'virtual interactions' with non-neighboring agents. The optimal distributed MAS control problem is then cast as a synthesized control-estimation problem. An iterative optimization procedure is developed to find the control-estimation law that minimizes the global objective cost of the MAS.

Keywords: Distributed control · Distributed estimation · Optimal control

1 Introduction

Distributed control within a cooperative multi-agent system (MAS) is a key enabling technology for various networked dynamical systems (Shi and Yan, 2020; Li et al., 2019; Zhu et al., 2020; Morita et al., 2015). Notwithstanding diverse distributed control strategies, optimality remains a stumbling block due to individual agents' limited information. In particular, finding the optimal distributed control under a network topological constraint is a well-known NP-hard problem (Gupta et al., 2005). To ease this problem, some former studies have focused on specific forms of the objective function, along with certain MAS network topology conditions, under which optimal distributed control laws can be designed (Ma et al., 2015). More broadly, different techniques have been investigated to design suboptimal distributed control laws for different MAS cooperative tasks (Gupta et al., 2005; Nguyen, 2015; Jiao et al., 2019). In this paper, a new avenue for accomplishing the optimal distributed control of MAS is presented that requires neither a restricted form of the objective function nor a particular network topology. The key idea is to expand the information available to each agent by employing a distributed estimation algorithm, and to use the expanded information to relax the network topological constraint in a tractable manner. In a nutshell, the main contributions are the following.

1.

    A synthesized distributed control-estimation framework is proposed, based on the authors' previously developed distributed estimation algorithm (Kwon and Hwang, 2018). The newly proposed framework enables interactions between non-neighboring agents, namely virtual interactions.

2.

    With the aid of virtual interactions, a design procedure is developed that solves for the optimal distributed control law of the stochastic MAS over a finite time horizon, which was originally an intractable non-convex problem due to the network topological constraint.

2 Problem Formulation

2.1 Dynamical Model of Stochastic MAS

Consider a stochastic linear multi-agent dynamical system comprising $N$ homogeneous agents whose dynamics are given by:

$$x_{i}(t+1)=Ax_{i}(t)+Bu_{i}(t)+w_{i}(t),\quad\forall i\in\{1,\cdots,N\}$$ (1)

where $x_{i}(t)\in\mathbb{R}^{n}$ and $u_{i}(t)\in\mathbb{R}^{p}$ are the state and the control input of the $i^{th}$ agent, respectively. $w_{i}(t)$ is a disturbance imposed on the $i^{th}$ agent, assumed to follow a zero-mean white Gaussian distribution with covariance $\Theta_{i}(t)\succ 0$. $t\in\mathbb{Z}_{+}=\{0,1,\cdots\}$ denotes the discrete-time index. $A,B$ are the system matrices with appropriate dimensions and are assumed to satisfy the controllability condition. Accordingly, the entire MAS dynamics can be written as

$$x(t+1)=\tilde{A}x(t)+\tilde{B}u(t)+\tilde{w}(t)$$ (2)
$$\begin{split}\tilde{A}&=\left(I_{N}\otimes A\right),\quad\tilde{B}=\left(I_{N}\otimes B\right)\\ x(t)&=\begin{bmatrix}x_{1}^{\mathrm{T}}(t)\cdots x_{N}^{\mathrm{T}}(t)\end{bmatrix}^{\mathrm{T}},\quad u(t)=\begin{bmatrix}u_{1}^{\mathrm{T}}(t)\cdots u_{N}^{\mathrm{T}}(t)\end{bmatrix}^{\mathrm{T}}\\ \tilde{w}(t)&=\begin{bmatrix}w_{1}^{\mathrm{T}}(t)\cdots w_{N}^{\mathrm{T}}(t)\end{bmatrix}^{\mathrm{T}}\end{split}$$

where $\otimes$ is the Kronecker product between matrices. The interactions between agents are rendered by the inter-agent network topology, described by a graph model $\mathcal{G}$ consisting of a node set $\mathcal{V}=\{1,2,\dots,N\}$ indexing each agent and an edge set $\mathcal{E}\subseteq\mathcal{V}\times\mathcal{V}$ indicating the network connectivity between the agents. Each edge $(i,j)\in\mathcal{E}$ denotes that node $i$ can acquire the state information of node $j$. An adjacency matrix $\mathcal{A}=[a_{ij}]\in\mathbb{R}^{N\times N}$ expresses the network connectivity of the graph model, where $a_{ij}=1$ if $(i,j)\in\mathcal{E}$ and $a_{ij}=0$ otherwise. A degree matrix is defined as $\mathcal{D}=\mathrm{diag}(d_{1}\cdots d_{N})$, where $d_{i}=\sum_{j}a_{ij}$ is the (weighted) degree of node $i$. The Laplacian matrix $\mathcal{L}$, given by $\mathcal{L}=\mathcal{D}-\mathcal{A}$, is useful for analysis of the network topology. The set of agents whose state information is available to the $i^{th}$ agent, i.e., the neighborhood of the $i^{th}$ agent, is denoted by $\Omega_{i}$, and its cardinality by $|\Omega_{i}|$. Based on the given network topology, the noisy measurement of the neighborhood states $\{x_{j}(t)|j\in\mathcal{V}\}$ from the $i^{th}$ agent's perspective can be represented as follows (Kwon and Hwang, 2018):

$$z_{ij}(t)=c_{ij}\left(x_{j}(t)+v_{ij}(t)\right),\quad\forall j\in\mathcal{V}$$ (3)

where $c_{ij}$ indicates the availability of the measurement of the $j^{th}$ agent's state from the $i^{th}$ agent, such that $c_{ij}=1$ when $j\in\Omega_{i}$ and $c_{ij}=0$ otherwise. The noise of the measurement of the $j^{th}$ agent taken by the $i^{th}$ agent is denoted by $v_{ij}(t)$, assumed to be an independent and identically distributed (i.i.d.) Gaussian random variable with zero mean and covariance $\mathrm{\Xi}_{ij}(t)\succ 0$. Further, the measurement and noise sets of the $i^{th}$ agent are denoted by $Z_{i}(t)=[z_{i1}^{\mathrm{T}}(t)\cdots z_{iN}^{\mathrm{T}}(t)]^{\mathrm{T}}$ and $v_{i}(t)=[v_{i1}^{\mathrm{T}}(t)\cdots v_{iN}^{\mathrm{T}}(t)]^{\mathrm{T}}$, respectively. Over a finite time horizon $t=0,\cdots,T$, one can rewrite (2) in a static form by stacking up the variables and matrices (Furieri and Kamgarpour, 2020):

$$x=P_{11}w+P_{12}u$$ (4)

where

$$\begin{split}P_{11}&=(I-\mathrm{D}\bar{A})^{-1},\quad P_{12}=(I-\mathrm{D}\bar{A})^{-1}\mathrm{D}\bar{B}\\ \bar{A}&=I_{T+1}\otimes\tilde{A},\quad\bar{B}=\begin{bmatrix}I_{T}\otimes\tilde{B}\\ 0_{Nn\times NpT}\end{bmatrix}\\ \mathrm{D}&=\begin{bmatrix}0_{Nn\times NnT}&0_{Nn\times Nn}\\ I_{NnT}&0_{NnT\times Nn}\end{bmatrix}\end{split}$$

where $I_{T}$ and $0_{T}$ respectively denote the identity and zero matrices of size $T\times T$, and $M_{i}=[0_{p}\cdots I_{p}\cdots 0_{p}]\in\mathbb{R}^{p\times Np}$ is the block matrix having $I_{p}$ in the $i^{th}$ block entry and $0_{p}$ in the other block entries. Further, $x=[x(0)^{\mathrm{T}}\dots x(T)^{\mathrm{T}}]^{\mathrm{T}}\in\mathbb{R}^{Nn(T+1)}$ and $u=\sum_{i}^{N}(I_{T}\otimes M_{i}^{\mathrm{T}})u_{i}\in\mathbb{R}^{NpT}$ are the stacked agents' states and control inputs over the horizon, where $u_{i}=[u_{i}(0)^{\mathrm{T}}\dots u_{i}(T-1)^{\mathrm{T}}]^{\mathrm{T}}\in\mathbb{R}^{pT},\forall i\in\mathcal{V}$. $w=[x(0)^{\mathrm{T}}\ \tilde{w}(0)^{\mathrm{T}}\dots\tilde{w}(T-1)^{\mathrm{T}}]^{\mathrm{T}}\in\mathbb{R}^{Nn(T+1)}$ is the vector containing the initial agents' states and the additive noise. Over the finite time horizon $T$, individual agents interact with their neighbors according to the control law $u_{i}$ embedded in each agent. Without loss of generality, $u_{i}$ can be designed by the following output feedback control law (Furieri and Kamgarpour, 2020):

$$u_{i}=(I_{T}\otimes M_{i})\mathcal{F}Z_{i,(0:T-1)}=(I_{T}\otimes M_{i})\mathcal{F}C(x+v_{i}),\quad\forall i\in\mathcal{V}$$ (5)

where $v_{i}=[v_{i}(0)^{\mathrm{T}}\dots v_{i}(T)^{\mathrm{T}}]^{\mathrm{T}}\in\mathbb{R}^{Nn(T+1)},\forall i\in\mathcal{V}$, $Z_{i,(0:T-1)}=[Z_{i}(0)^{\mathrm{T}}\cdots Z_{i}(T-1)^{\mathrm{T}}]^{\mathrm{T}}\in\mathbb{R}^{NnT}$, and $C=[I_{NnT}\ 0_{NnT\times Nn}]$. The crucial part is the design of the feedback gain, denoted by $\mathcal{F}\in\mathbb{F}$. Here, $\mathbb{F}\subset\mathbb{R}^{NpT\times NnT}$ is an invariant subspace that encodes the network topological constraints imposed on the distributed MAS by $\mathcal{A}$, as well as embeds causal feedback policies by forcing the future response entries to zero.
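For concreteness, the following minimal sketch assembles the graph matrices of this subsection and the stacked maps $P_{11}$, $P_{12}$ of (4). It is an illustration only: the 3-agent ring topology, the scalar dynamics, and the function name `stacked_maps` are assumptions, not quantities from the paper.

```python
import numpy as np

# --- toy network (an assumed 3-agent ring, for illustration only) ---
N = 3
A_adj = np.array([[0, 1, 1],
                  [1, 0, 1],
                  [1, 1, 0]])                  # adjacency matrix (a_ij)
D_deg = np.diag(A_adj.sum(axis=1))             # degree matrix
Lap = D_deg - A_adj                            # Laplacian L = D - A

# --- stacked static form (4): x = P11 w + P12 u ---
def stacked_maps(A, B, N, T):
    n, p = A.shape[0], B.shape[1]
    A_t = np.kron(np.eye(N), A)                # A~ = I_N ⊗ A
    B_t = np.kron(np.eye(N), B)                # B~ = I_N ⊗ B
    A_bar = np.kron(np.eye(T + 1), A_t)        # I_{T+1} ⊗ A~
    B_bar = np.vstack([np.kron(np.eye(T), B_t),
                       np.zeros((N * n, N * p * T))])
    Dsh = np.zeros((N * n * (T + 1), N * n * (T + 1)))   # block down-shift D
    Dsh[N * n:, :N * n * T] = np.eye(N * n * T)
    P11 = np.linalg.inv(np.eye(Dsh.shape[0]) - Dsh @ A_bar)
    P12 = P11 @ Dsh @ B_bar
    return P11, P12

P11, P12 = stacked_maps(np.array([[1.0]]), np.array([[1.0]]), N, T=3)
```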

2.2 Optimal MAS Distributed Control Problem

Given the equivalent static form (4) of the stochastic linear MAS dynamics over the time horizon $T$, we seek to address the optimal distributed control problem.

Problem 1.

Optimal distributed control law subject to structural constraint (Furieri and Kamgarpour, 2020):

$$\begin{split}&\min_{\mathcal{F}\in\mathbb{F}}\ \mathbb{E}\left[x^{\mathrm{T}}\mathcal{Q}x+u^{\mathrm{T}}\mathcal{R}u\right]\\ &\mathrm{subject\ to}\ \text{(4), (5)},\quad\forall i\in\mathcal{V}\end{split}$$ (6)

where $\mathcal{Q}\in\mathbb{R}^{Nn(T+1)\times Nn(T+1)}\succeq 0$ and $\mathcal{R}\in\mathbb{R}^{NpT\times NpT}\succ 0$ are the associated weight matrices.

Due to the structural constraint imposed on the control input space $\mathbb{F}$, Problem 1 is a highly non-convex problem, which is indeed NP-hard and a formidable computational burden (Gupta et al., 2005). To circumvent this difficulty, we propose the concept of a virtual network topology that allows for interactions between non-neighboring agents, i.e., virtual interactions, as depicted in Figure 1.

Figure 1: Virtual interaction based distributed control-estimation synthesis.

Since the state information of non-neighboring agents is not available, an appropriate estimator is required for each agent to obtain estimates of the non-neighboring agents' states. Taking a Bayesian approach, a Kalman-like filter is adopted for estimation, as we consider a linear MAS with Gaussian uncertainties.

Definition 1.

The state estimate and its covariance of the MAS using the $i^{th}$ agent's measurements are denoted by ${}^{i}\hat{x}(t):=\mathbb{E}\left[x(t)|Z_{i,(0:t)}\right]$ and ${}^{i}\Sigma(t):=\mathbb{E}\left[\left(x(t)-{}^{i}\hat{x}(t)\right)\left(x(t)-{}^{i}\hat{x}(t)\right)^{\mathrm{T}}|Z_{i,(0:t)}\right]$, respectively (Kwon and Hwang, 2018), where $\mathbb{E}[\bullet|\bullet]$ is the conditional expectation.

The nominal recursive structure of the Kalman-like filter is:

$${}^{i}\hat{x}(t)={}^{i}\hat{x}^{-}(t)+L_{i}(t)H_{i}\left(Z_{i}(t)-{}^{i}\hat{x}^{-}(t)\right)$$ (7)

where ${}^{i}\hat{x}^{-}(t):=\mathbb{E}\left[x(t)|Z_{i,(0:t-1)}\right]$ denotes the predicted state estimate from the $i^{th}$ agent's perspective. $H_{i}\in\mathbb{R}^{n|\Omega_{i}|\times nN}$ encodes only the neighbors of the $i^{th}$ agent, that is, $H_{i}=[h_{1}\ h_{2}\cdots h_{|\Omega_{i}|}]^{\mathrm{T}}\otimes I_{n}$, where $h_{m}\in\mathbb{R}^{N}$, $m=1,2,\dots,|\Omega_{i}|$, are the nonzero column vectors of the matrix $\mathrm{diag}(c_{i1},c_{i2},\dots,c_{iN})$. $L_{i}(t)\in\mathbb{R}^{nN\times n|\Omega_{i}|}$ represents the estimator gain at time step $t$ for estimating the states of the MAS from the $i^{th}$ agent's perspective. Once the entire MAS state estimate becomes available to each agent, one can replace (5) with an estimation-based feedback control law. Accordingly, Problem 1, the distributed control law subject to the structural constraint, can be reformulated into a problem that simultaneously designs both the distributed control and the distributed estimator.
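As a small illustration of this selector structure, $H_{i}$ can be assembled directly from the availability vector $(c_{i1},\dots,c_{iN})$. The sketch below is an assumption-level helper; the function name and example values are not from the paper.

```python
import numpy as np

def selector_Hi(c_i, n):
    """Build H_i of (7): rows pick out the neighbor blocks of agent i."""
    # Nonzero columns of diag(c_i1, ..., c_iN), stacked as rows [h_1 ... ]^T
    H_rows = np.eye(len(c_i))[np.flatnonzero(c_i)]   # |Omega_i| x N
    return np.kron(H_rows, np.eye(n))                # n|Omega_i| x nN

# e.g. an agent sees itself and agent 2 in a 3-agent MAS with n = 2 states:
H0 = selector_Hi([1, 0, 1], n=2)                     # shape (4, 6)
```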

Problem 2.

Optimal distributed control-estimation law with virtual interactions:

$$\begin{split}&\min_{\mathcal{F}\in\tilde{\mathbb{F}},\ \Upsilon_{i},\forall i\in\mathcal{V}}\ J(\mathcal{F},\Upsilon_{1},\cdots,\Upsilon_{N})\\ &\mathrm{subject\ to}\ \text{(1), and}\\ &u_{i}=(I_{T}\otimes M_{i})\mathcal{F}C\,{}^{i}\hat{x},\quad\forall i\in\mathcal{V}\end{split}$$ (8)

where ${}^{i}\hat{x}=[{}^{i}\hat{x}(0)^{\mathrm{T}}\cdots{}^{i}\hat{x}(T)^{\mathrm{T}}]^{\mathrm{T}}$, $J(\mathcal{F},\Upsilon_{1},\cdots,\Upsilon_{N}):=\mathbb{E}\left[x^{\mathrm{T}}\mathcal{Q}x+u^{\mathrm{T}}\mathcal{R}u\right]$, and $\Upsilon_{i}:=\{L_{i}(t)|t=0,\cdots,T\}$ is the set of estimator gains over the time horizon $T$ for the $i^{th}$ agent.

Remark 1.

It is worth noting that, compared to $\mathbb{F}$, $\tilde{\mathbb{F}}\subset\mathbb{R}^{NpT\times NnT}$ is a subspace that only encodes causal feedback policies, not restricted by any network topological constraint.

Although Problem 2 successfully relaxes the structural constraint on the control law $\mathcal{F}$, it is not straightforward to solve because the control law and the state estimation errors mutually affect each other (Kwon and Hwang, 2018). To resolve this complexity, we propose an iterative optimization procedure in a distributed fashion: i) divide the primal problem (Problem 2) into a set of sub-problems, each viewed from an individual agent's perspective; ii) sequentially design the distributed estimation and control laws for each sub-problem; iii) mix the results from the individual sub-problems to approximate the optimal solution to the primal problem. The overall schematic of the proposed distributed control-estimation synthesis is delineated in Figure 2.

Figure 2: Iterative optimization procedure for the optimal distributed control-estimation synthesis.

At the $l^{th}$ iteration, the optimization procedure consists of the following sub-steps. First, the distributed estimator design optimizes a set of estimator gains $\Upsilon_{i}^{(l)},\forall i\in\mathcal{V}$ based on the disturbance/noise model, the network topological constraint, and the suboptimal control law resulting from the previous iteration. Then, the distributed control law design computes a set of optimal control laws ${}^{i}\mathcal{F}^{(l)},\forall i\in\mathcal{V}$, each from an individual agent's perspective, based on the state estimation error information from the designed distributed estimator. Finally, the distributed control-estimation synthesis mixes the set of ${}^{i}\mathcal{F}^{(l)},\forall i\in\mathcal{V}$ to construct the solution candidate $\mathcal{F}^{(l)}$ for Problem 2. The constructed control law is evaluated to check convergence and is used for the next iteration. The iteration terminates once the pre-defined stopping criteria are fulfilled, yielding the optimal control-estimation law, denoted by $\mathcal{F}^{*}$ and $\Upsilon_{i}^{\ast},\forall i\in\mathcal{V}$.

3 Algorithm Development

This section details the proposed synthesis procedure of the optimal distributed control-estimation law, which can comply with an arbitrary network topology of the stochastic MAS.

3.1 Distributed estimator design

To begin with, the distributed estimation algorithm is optimized by means of the estimator gains $\Upsilon_{i}^{(l)},\forall i\in\mathcal{V}$. In the offline design phase, the individual estimators can be designed based on the entire MAS model information along with the control law computed at the previous iteration ($A$, $B$, and $\mathcal{F}^{(l-1)}$). For brevity, we use $\mathcal{F}$ to designate $\mathcal{F}^{(l-1)}$ in this subsection. Recalling (7), the distributed estimator embedded in the $i^{th}$ agent calculates ${}^{i}\hat{x}(t),\forall t\in\{0,\cdots,T\}$, whose performance can be measured by the estimation error.

Definition 2.

${}^{i}e(t):=x(t)-{}^{i}\hat{x}(t)$ is the MAS state estimation error from the $i^{th}$ agent's perspective, and its covariance is ${}^{i}\Sigma(t)$ by Definition 1. Further, $e(t)=[{}^{1}e(t)^{\mathrm{T}}\cdots{}^{N}e(t)^{\mathrm{T}}]^{\mathrm{T}}\in\mathbb{R}^{NnN}$ stacks all the estimation errors from the individual agents' estimators, and the corresponding covariance is denoted by $\Sigma(t):=\mathbb{E}[e(t)e(t)^{\mathrm{T}}]\in\mathbb{R}^{NnN\times NnN}$. Similarly, ${}^{i}e^{-}(t)$, ${}^{i}\Sigma^{-}(t)$, $e^{-}(t)$, and $\Sigma^{-}(t)$ are defined in terms of the predicted state estimate ${}^{i}\hat{x}^{-}(t)$ (Kwon and Hwang, 2018).

Assumption 1.

The initial conditions ${}^{i}\hat{x}(0),\forall i\in\mathcal{V}$, and $\Sigma(0)$ are given to the individual agents in order to initiate each of their distributed estimators.

Based on this prior knowledge, the estimation-based control input of the $i^{th}$ agent at time step $t$ can be written as:

$$u_{i}(t)=M_{i}\sum_{k=0}^{t}\mathcal{F}_{kt}\,{}^{i}\hat{x}(k)$$ (9)

where $\mathcal{F}_{kt}\in\mathbb{R}^{pN\times nN}$ is the block of the control law matrix $\mathcal{F}$ spanning the $(knN)^{th}$ to $(knN+nN-1)^{th}$ columns and the $(tpN)^{th}$ to $(tpN+pN-1)^{th}$ rows. With (9), the entire MAS dynamics (2) can be expressed as:

$$x(t+1)=\tilde{A}x(t)+\tilde{B}\mathcal{F}_{tt}x(t)+\sum_{k=0}^{t-1}\tilde{B}\mathcal{F}_{kt}x(k)-\sum_{k=0}^{t}\bar{B}\tilde{M}\tilde{\mathcal{F}}_{kt}e(k)+\tilde{w}(t)$$ (10)

where $\bar{B}=1_{N}^{\mathrm{T}}\otimes\tilde{B}$, $\tilde{M}=\mathrm{blkdg}(M_{1}^{\mathrm{T}}M_{1},\dots,M_{N}^{\mathrm{T}}M_{N})\in\mathbb{R}^{NpN\times NpN}$, and $\tilde{\mathcal{F}}_{kt}=I_{N}\otimes\mathcal{F}_{kt}$. $\mathrm{blkdg}(\bullet)$ denotes a block-diagonal matrix with block matrices $\bullet$, and $1_{N}\in\mathbb{R}^{N}$ is the vector whose elements all equal $1$. Then the predicted state estimate of the entire MAS from the $i^{th}$ agent's perspective is given by:

$${}^{i}\hat{x}^{-}(t+1)=\tilde{A}\,{}^{i}\hat{x}(t)+\tilde{B}\mathcal{F}_{tt}\,{}^{i}\hat{x}(t)+\sum_{k=0}^{t-1}\tilde{B}\mathcal{F}_{kt}\,{}^{i}\hat{x}(k)$$ (11)

Subtracting (11) from (10) and concatenating the results for all agents in $\mathcal{V}$ gives:

$$\begin{split}e^{-}(t+1)&=\Lambda_{t}e(t)+\sum_{k=0}^{t-1}\Psi_{kt}e(k)+1_{N}\otimes\tilde{w}(t)\\ \mathrm{where}\ \Lambda_{t}&=I_{N}\otimes(\tilde{A}+\tilde{B}\mathcal{F}_{tt})-1_{N}\otimes\bar{B}\tilde{M}\tilde{\mathcal{F}}_{tt},\quad\Psi_{kt}=I_{N}\otimes\tilde{B}\mathcal{F}_{kt}-1_{N}\otimes\bar{B}\tilde{M}\tilde{\mathcal{F}}_{kt}\end{split}$$ (12)

Correspondingly, $\Sigma^{-}(t+1)$ is represented by:

$$\Sigma^{-}(t+1)=\Lambda_{t}\Sigma(t)\Lambda_{t}^{\mathrm{T}}+\Sigma_{\tilde{w}}(t)+\sum_{q=0}^{t-1}\Lambda_{t}\mathbb{E}[e(t)e(q)^{\mathrm{T}}]\Psi_{qt}^{\mathrm{T}}+\sum_{p=0}^{t-1}\Psi_{pt}\mathbb{E}[e(p)e(t)^{\mathrm{T}}]\Lambda_{t}^{\mathrm{T}}+\sum_{p=0}^{t-1}\sum_{q=0}^{t-1}\Psi_{pt}\mathbb{E}[e(p)e(q)^{\mathrm{T}}]\Psi_{qt}^{\mathrm{T}}$$ (13)

where $\Sigma_{\tilde{w}}(t)=(1_{N}1_{N}^{\mathrm{T}})\otimes\mathrm{blkdg}(\Theta_{1}(t),\dots,\Theta_{N}(t))$. Note that the summation terms on the RHS of (13) (e.g., $\mathbb{E}[e(p)e(q)^{\mathrm{T}}],q\neq p$) capture the correlations of the state estimates over time induced by the control law (9). Now, the update equation (7) can be rewritten as:

$${}^{i}\hat{x}(t+1)={}^{i}\hat{x}^{-}(t+1)+L_{i}(t+1)H_{i}\left({}^{i}e^{-}(t+1)+v_{i}(t+1)\right)$$ (14)

Like the Kalman gain, $L_{i}(t+1)$ can be computed so as to minimize the mean-squared error of the state estimate, i.e., $\mathbb{E}\left[\left\|{}^{i}e(t+1)\right\|^{2}\right]$. This is in fact equivalent to minimizing the trace of the posterior covariance matrix, i.e., $\mathrm{Tr}\left({}^{i}\Sigma(t+1)\right),\forall i\in\mathcal{V}$. By the definition of ${}^{i}\Sigma(t+1)$, we have:

$$\begin{split}{}^{i}\Sigma(t+1):=&\ \mathbb{E}[{}^{i}e(t+1)\,{}^{i}e(t+1)^{\mathrm{T}}|Z_{i,(0:t+1)}]\\ =&\left(I_{nN}-L_{i}(t+1)H_{i}\right){}^{i}\Sigma^{-}(t+1)\left(I_{nN}-L_{i}(t+1)H_{i}\right)^{\mathrm{T}}+L_{i}(t+1)H_{i}\,{}^{i}\Xi(t+1)\left(L_{i}(t+1)H_{i}\right)^{\mathrm{T}}\end{split}$$ (15)

where

$$\begin{split}L_{i}(t+1)&={}^{i}\Sigma^{-}(t+1)H_{i}^{\mathrm{T}}(S_{i}(t+1))^{-1}\\ S_{i}(t+1)&=H_{i}({}^{i}\Sigma^{-}(t+1)+{}^{i}\mathrm{\Xi}(t+1))H_{i}^{\mathrm{T}}\\ {}^{i}\mathrm{\Xi}(t+1)&=\mathrm{blkdg}(\mathrm{\Xi}_{i1}(t+1),\ \mathrm{\Xi}_{i2}(t+1),\dots,\mathrm{\Xi}_{iN}(t+1))\end{split}$$ (16)

Correspondingly, $\Sigma(t+1)$ can be updated by:

$$\Sigma(t+1)=(I-\tilde{L}(t+1))\Sigma^{-}(t+1)(I-\tilde{L}(t+1))^{\mathrm{T}}+\tilde{L}(t+1)\Sigma_{\Xi}(t+1)\tilde{L}(t+1)^{\mathrm{T}}$$ (17)

where $\Sigma_{\Xi}(t+1)=\mathrm{blkdg}({}^{1}\mathrm{\Xi}(t+1),\dots,{}^{N}\mathrm{\Xi}(t+1))$ and $\tilde{L}(t+1)=\mathrm{blkdg}(L_{1}(t+1)H_{1},\dots,L_{N}(t+1)H_{N})$. Note that the covariances between the state estimation errors at the current and past steps, i.e., $\mathbb{E}[e(t+1)e(s)^{\mathrm{T}}]$ and $\mathbb{E}[e(s)e(t+1)^{\mathrm{T}}],\forall s<t$, need to be updated using the computed $\tilde{L}(t+1)$. The cross-covariance between the $i^{th}$ and $j^{th}$ agents' state estimates, ${}^{ij}\Sigma(t+1):=\mathbb{E}[{}^{i}e(t+1)\,{}^{j}e(t+1)^{\mathrm{T}}]\in\mathbb{R}^{Nn\times Nn}$, sits at the off-diagonal block entries, while ${}^{i}\Sigma(t+1)\in\mathbb{R}^{Nn\times Nn}$ sits at the diagonal block entries of $\Sigma(t+1)\in\mathbb{R}^{NnN\times NnN}$. The detailed expansion of $\Sigma(t)$ is as follows:

$$\Sigma(t)=\begin{bmatrix}{}^{1}\Sigma(t)&{}^{12}\Sigma(t)&\cdots&{}^{1N}\Sigma(t)\\ {}^{21}\Sigma(t)&{}^{2}\Sigma(t)&\cdots&{}^{2N}\Sigma(t)\\ \vdots&\vdots&\ddots&\vdots\\ {}^{N1}\Sigma(t)&{}^{N2}\Sigma(t)&\cdots&{}^{N}\Sigma(t)\end{bmatrix}$$

Based on the computed $\Upsilon_{i}$ from (16), each agent can update ${}^{i}\hat{x}(t)$, ${}^{i}\Sigma(t)$, and $\Sigma(t)$ using (14), (15), and (17), respectively. This completes the implementation of the distributed estimation algorithm.
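For reference, a minimal sketch of a single measurement-update step, combining the gain computation (16) with the Joseph-form covariance update (15) for one agent; the function and variable names are assumptions.

```python
import numpy as np

def measurement_update(Sigma_pred, H_i, Xi_i):
    """One posterior update for agent i, cf. (15)-(16) (a sketch).

    Sigma_pred : predicted covariance iΣ^-(t+1), shape (nN, nN)
    H_i        : neighbor selector of (7), shape (n|Ω_i|, nN)
    Xi_i       : stacked measurement-noise covariance iΞ(t+1), shape (nN, nN)
    """
    S_i = H_i @ (Sigma_pred + Xi_i) @ H_i.T            # innovation covariance
    L_i = Sigma_pred @ H_i.T @ np.linalg.inv(S_i)      # estimator gain, (16)
    I_LH = np.eye(Sigma_pred.shape[0]) - L_i @ H_i
    # Joseph-form posterior covariance, (15)
    Sigma_post = I_LH @ Sigma_pred @ I_LH.T + (L_i @ H_i) @ Xi_i @ (L_i @ H_i).T
    return L_i, Sigma_post
```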

Remark 2.

It is noted that $\Sigma(t)$ computed by each agent is irrespective of the agent's perspective, since the same initial condition $\Sigma(0)$ is given to every agent by Assumption 1.

3.2 Distributed control law design

In this subsection, a computationally tractable formulation of the optimal distributed control law is derived from the individual agents' perspectives. The main idea starts with relaxing the structural constraint on $\mathcal{F}$ by applying the estimator (14) to each agent.

Definition 3.

Let ${}^{i}e:=x-{}^{i}\hat{x}$ stack the time series of the estimation errors from the $i^{th}$ agent's perspective over the time horizon $T$. Given $\mathcal{F}$ and $\Upsilon_{i},\forall i\in\mathcal{V}$, one can construct the estimation error covariance over the time horizon $T$, $\Sigma_{i}:=\mathbb{E}[{}^{i}e\,{}^{i}e^{\mathrm{T}}],\forall i\in\mathcal{V}$, as well as the cross-covariance between two different agents, $\Sigma_{ij}:=\mathbb{E}[{}^{i}e\,{}^{j}e^{\mathrm{T}}],\forall i\neq j\in\mathcal{V}$.

In terms of the time series of the estimation errors, the state-estimation-based control law over the time horizon can be expressed as:

$$u=\sum_{i}^{N}\mathcal{M}_{i}\mathcal{F}C\,{}^{i}\hat{x}=\mathcal{F}Cx-\sum_{i}^{N}\mathcal{M}_{i}\mathcal{F}C\,{}^{i}e$$ (18)

where $\mathcal{M}_{i}=I_{T}\otimes(M_{i}^{\mathrm{T}}M_{i}),\forall i\in\mathcal{V}$. Plugging (18) into (4) yields the objective cost in (8) as follows:

$$\begin{split}J(\mathcal{F},\Upsilon_{1},\cdots,\Upsilon_{N})=&\ \|\mathcal{Q}^{\frac{1}{2}}(I-P_{12}\mathcal{F}C)^{-1}P_{11}\Sigma_{w}^{\frac{1}{2}}\|^{2}_{F}+\|\mathcal{R}^{\frac{1}{2}}(I-\mathcal{F}CP_{12})^{-1}\mathcal{F}CP_{11}\Sigma_{w}^{\frac{1}{2}}\|^{2}_{F}\\ &+\sum_{i,j}\|\mathcal{Q}^{\frac{1}{2}}P_{12}(I-\mathcal{F}CP_{12})^{-1}(\mathcal{M}_{i}\mathcal{F}C\Sigma_{ij}C^{\mathrm{T}}\mathcal{F}^{\mathrm{T}}\mathcal{M}_{j}^{\mathrm{T}})^{\frac{1}{2}}\|^{2}_{F}\\ &+\sum_{i,j}\|\mathcal{R}^{\frac{1}{2}}(I-\mathcal{F}CP_{12})^{-1}(\mathcal{M}_{i}\mathcal{F}C\Sigma_{ij}C^{\mathrm{T}}\mathcal{F}^{\mathrm{T}}\mathcal{M}_{j}^{\mathrm{T}})^{\frac{1}{2}}\|^{2}_{F}\\ &+\|\mathcal{Q}^{\frac{1}{2}}(I-P_{12}\mathcal{F}C)^{-1}P_{11}\mu_{w}\|^{2}_{2}+\|\mathcal{R}^{\frac{1}{2}}(I-\mathcal{F}CP_{12})^{-1}\mathcal{F}CP_{11}\mu_{w}\|^{2}_{2}\end{split}$$ (19)

where $\|\cdot\|^{2}_{2}$ and $\|\cdot\|^{2}_{F}$ denote the squared Euclidean and Frobenius norms, respectively; and $\mu_{w}:=\mathbb{E}[w]\in\mathbb{R}^{Nn(T+1)}$, $\Sigma_{w}:=\mathbb{E}[(w-\mu_{w})(w-\mu_{w})^{\mathrm{T}}]\in\mathbb{R}^{Nn(T+1)\times Nn(T+1)}$.
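Since every term in (19) is a squared weighted norm, the cost can be evaluated without matrix square roots via the identity $\|AB^{\frac{1}{2}}\|_{F}^{2}=\mathrm{Tr}(A^{\mathrm{T}}AB)$. A minimal evaluation sketch follows; the function name, argument layout, and the containers `Sigma_cross` and `M_sel` are assumptions.

```python
import numpy as np

def cost_J(F, C, P11, P12, Q, R, Sigma_w, mu_w, Sigma_cross, M_sel):
    """Trace-form evaluation of (19) (a sketch).

    Sigma_cross[i][j] : cross-covariance Σ_ij over the horizon
    M_sel[i]          : the selector M_i = I_T ⊗ (M_i^T M_i)
    """
    I_x = np.eye(P12.shape[0])
    I_u = np.eye(F.shape[0])
    X = np.linalg.inv(I_x - P12 @ F @ C)            # closed-loop state map
    Z = np.linalg.inv(I_u - F @ C @ P12)
    Y = Z @ F @ C                                   # closed-loop input map
    J = np.trace(P11.T @ X.T @ Q @ X @ P11 @ Sigma_w)
    J += np.trace(P11.T @ Y.T @ R @ Y @ P11 @ Sigma_w)
    N = len(M_sel)
    for i in range(N):
        for j in range(N):
            E = M_sel[i] @ F @ C @ Sigma_cross[i][j] @ C.T @ F.T @ M_sel[j].T
            J += np.trace(Z.T @ P12.T @ Q @ P12 @ Z @ E)
            J += np.trace(Z.T @ R @ Z @ E)
    m_x = X @ P11 @ mu_w                            # mean-state term
    m_u = Y @ P11 @ mu_w                            # mean-input term
    J += m_x.T @ Q @ m_x + m_u.T @ R @ m_u
    return float(J)
```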

Apparently, the objective cost (19) has high-dimensional, highly coupled optimization variables: $\mathcal{F}$, which is our main interest, and $\Sigma_{ij},\forall i,j\in\mathcal{V}$, which are implicit functions of both $\mathcal{F}$ and $\Upsilon_{i},\forall i\in\mathcal{V}$. The proposed iterative optimization procedure alleviates these coupling complexities in two ways. First, akin to the alternating direction method of multipliers (ADMM) technique (Lin et al., 2013), we hold $\Sigma_{ij},\forall i,j\in\mathcal{V}$ constant while optimizing $\mathcal{F}$ at the $l^{th}$ iteration, thereby treating $J$ as a function of $\mathcal{F}$ only. Note that $\Upsilon_{i},\forall i\in\mathcal{V}$ is designed with $\mathcal{F}$ held constant in the distributed estimator design phase of the next iteration. Second, we interpret the global objective cost from the individual agent's viewpoint and translate the primal problem (Problem 2) into agent-wise objective costs. The objective cost locally seen from the $i^{th}$ agent's viewpoint at the $l^{th}$ iteration is denoted by ${}^{i}J^{(l)}$, which can be constructed using the estimated MAS input ${}^{i}u^{(l)}={}^{i}\mathcal{F}^{(l)}C(x-{}^{i}e)$ instead of (18). ${}^{i}\mathcal{F}^{(l)}$ is constructed from the $i^{th}$ agent's perspective by optimizing the agent-wise objective cost ${}^{i}J^{(l)}$. The resulting agent-wise optimization problem is as follows:

Problem 3.

Optimal distributed control law from agent-wise viewpoint:

$$\min_{{}^{i}\mathcal{F}^{(l)}\in\tilde{\mathbb{F}}}\ {}^{i}J^{(l)}({}^{i}\mathcal{F}^{(l)})$$ (20)

where

$$\begin{split}{}^{i}J^{(l)}({}^{i}\mathcal{F}^{(l)})=&\ \|\mathcal{Q}^{\frac{1}{2}}(I-P_{12}\,{}^{i}\mathcal{F}^{(l)}C)^{-1}P_{11}\,{}^{i}\Sigma_{w}^{\frac{1}{2}}\|^{2}_{F}+\|\mathcal{R}^{\frac{1}{2}}(I-{}^{i}\mathcal{F}^{(l)}CP_{12})^{-1}\,{}^{i}\mathcal{F}^{(l)}CP_{11}\,{}^{i}\Sigma_{w}^{\frac{1}{2}}\|^{2}_{F}\\ &+\|\mathcal{Q}^{\frac{1}{2}}(I-P_{12}\,{}^{i}\mathcal{F}^{(l)}C)^{-1}P_{12}\,{}^{i}\mathcal{F}^{(l)}C\Sigma_{i}^{(l)\frac{1}{2}}\|^{2}_{F}+\|\mathcal{R}^{\frac{1}{2}}(I-{}^{i}\mathcal{F}^{(l)}CP_{12})^{-1}\,{}^{i}\mathcal{F}^{(l)}C\Sigma_{i}^{(l)\frac{1}{2}}\|^{2}_{F}\\ &+\|\mathcal{Q}^{\frac{1}{2}}(I-P_{12}\,{}^{i}\mathcal{F}^{(l)}C)^{-1}P_{11}\,{}^{i}\mu_{w}\|^{2}_{2}+\|\mathcal{R}^{\frac{1}{2}}(I-{}^{i}\mathcal{F}^{(l)}CP_{12})^{-1}\,{}^{i}\mathcal{F}^{(l)}CP_{11}\,{}^{i}\mu_{w}\|^{2}_{2}\end{split}$$ (21)

where ${}^{i}\mu_{w}=\mathbb{E}[w|Z_{i,(0:T)}]$ and ${}^{i}\Sigma_{w}:=\mathbb{E}[(w-{}^{i}\mu_{w})(w-{}^{i}\mu_{w})^{\mathrm{T}}|Z_{i,(0:T)}],\forall i\in\mathcal{V}$. Note that $\Sigma_{i}^{(l)}\in\mathbb{R}^{Nn(T+1)\times Nn(T+1)}$ is computed by Definition 3 at the $l^{th}$ iteration.

Definition 4.

A subspace $\tilde{\mathbb{F}}$ is quadratically invariant (QI) with respect to $CP_{12}$ if and only if ${}^{i}\mathcal{F}^{(l)}CP_{12}\,{}^{i}\mathcal{F}^{(l)}\in\tilde{\mathbb{F}}$ for all ${}^{i}\mathcal{F}^{(l)}\in\tilde{\mathbb{F}}$. It is trivial to show that $\tilde{\mathbb{F}}$ is QI with respect to $CP_{12}$ (Lessard and Lall, 2011).

It is a well-known fact that QI is a necessary and sufficient condition for an exact convex reformulation (Lessard and Lall, 2011). That is, one can apply an equivalent disturbance-feedback policy to bring (21) into a convex form, similar to (Furieri and Kamgarpour, 2020).

Definition 5.

Let us introduce the nonlinear mapping as:

$$h(\Phi)=(I+\Phi CP_{12})^{-1}\Phi,\quad h:\mathbb{R}^{NpT\times NnT}\mapsto\mathbb{R}^{NpT\times NnT}$$ (22)

and define the cost function $\tilde{J}:\mathbb{R}^{NpT\times NnT}\mapsto\mathbb{R}$ in terms of the design parameter ${}^{i}\Phi^{(l)}$ (Furieri and Kamgarpour, 2020):

$$\begin{split}{}^{i}\tilde{J}^{(l)}({}^{i}\Phi^{(l)})&=\|\mathcal{Q}^{\frac{1}{2}}(I+P_{12}\,{}^{i}\Phi^{(l)}C)P_{11}\,{}^{i}\Sigma_{w}^{\frac{1}{2}}\|^{2}_{F}+\|\mathcal{R}^{\frac{1}{2}}\,{}^{i}\Phi^{(l)}CP_{11}\,{}^{i}\Sigma_{w}^{\frac{1}{2}}\|^{2}_{F}+\|\mathcal{Q}^{\frac{1}{2}}P_{12}\,{}^{i}\Phi^{(l)}\Sigma_{i}^{(l)\frac{1}{2}}\|^{2}_{F}\\ &\ \ +\|\mathcal{R}^{\frac{1}{2}}\,{}^{i}\Phi^{(l)}\Sigma_{i}^{(l)\frac{1}{2}}\|^{2}_{F}+\|\mathcal{R}^{\frac{1}{2}}\,{}^{i}\Phi^{(l)}CP_{11}\,{}^{i}\mu_{w}\|^{2}_{2}+\|\mathcal{Q}^{\frac{1}{2}}(I+P_{12}\,{}^{i}\Phi^{(l)}C)P_{11}\,{}^{i}\mu_{w}\|^{2}_{2}\end{split}$$ (23)

By Theorem 11 in (Furieri and Kamgarpour, 2020), the following convex optimization problem is equivalent to Problem 3.

Problem 4.

Equivalent convex problem to optimal distributed control from agent-wise viewpoint:

$$\min_{{}^{i}\Phi^{(l)}\in h^{-1}(\tilde{\mathbb{F}})}\ {}^{i}\tilde{J}^{(l)}({}^{i}\Phi^{(l)})$$ (24)

By solving (24) using convex programming, one can find the optimal ${}^{i}\Phi^{(l)}$ and the corresponding ${}^{i}\mathcal{F}^{(l)}$ through the mapping (22). The same optimization routine (Problems 3 and 4), based on the locally seen costs from the other agents' perspectives, is processed to obtain the optimal control laws ${}^{i}\mathcal{F}^{(l)},\forall i\in\mathcal{V}$ at the $l^{th}$ iteration step.
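A small sketch of the mapping (22) and its algebraic inverse may help: solving $\mathcal{F}=h(\Phi)$ for $\Phi$ gives $\Phi=\mathcal{F}(I-CP_{12}\mathcal{F})^{-1}$, which is how a gain in $\tilde{\mathbb{F}}$ maps back into the disturbance-feedback parameterization. The function names are assumptions.

```python
import numpy as np

def h(Phi, CP12):
    """Forward map (22): h(Phi) = (I + Phi C P12)^{-1} Phi.
    Recovers the feedback gain F from the disturbance-feedback parameter Phi."""
    I = np.eye(Phi.shape[0])
    return np.linalg.solve(I + Phi @ CP12, Phi)

def h_inv(F, CP12):
    """Algebraic inverse of (22): solving F = h(Phi) yields
    Phi = F (I - C P12 F)^{-1}."""
    I = np.eye(CP12.shape[0])
    return F @ np.linalg.inv(I - CP12 @ F)
```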

3.3 Distributed control-estimation synthesis

The set of optimal control laws from the individual agents' viewpoints, ${}^{i}\mathcal{F}^{(l)},\forall i\in\mathcal{V}$, is mixed to approximate the solution to Problem 2 by the following agent-wise mixing strategy:

$$\mathcal{F}^{(l)}=\sum_{i}^{N}\mathcal{M}_{i}\,{}^{i}\mathcal{F}^{(l)}$$ (25)

The basic intuition of the proposed strategy is to compose the control law for the $i^{th}$ agent from the one computed in the sub-optimization problem (Problem 3) solved from the $i^{th}$ agent's perspective. Accordingly, the proposed mixing strategy allows the individual agents to retain distributed controllers to be executed, each keeping its own sub-optimal solution without interfering with the others.
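In matrix terms, the mixing (25) overwrites, for every time block, the rows associated with agent $i$'s input using agent $i$'s own solution ${}^{i}\mathcal{F}^{(l)}$. A minimal sketch (the function name is an assumption):

```python
import numpy as np

def mix_control_laws(F_list, p, N, T):
    """Agent-wise mixing (25) (a sketch).

    F_list : list of N matrices of shape (N*p*T, N*n*T), the iF^{(l)}
    """
    F_mixed = np.zeros_like(F_list[0])
    for i, F_i in enumerate(F_list):
        # M_i = I_T ⊗ (M_i^T M_i) keeps only agent i's input rows
        for t in range(T):
            rows = slice(t * p * N + i * p, t * p * N + (i + 1) * p)
            F_mixed[rows, :] = F_i[rows, :]
    return F_mixed
```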

3.4 Convergence check

The last step of the iteration loop evaluates the designed distributed control law (25) together with the estimator (7). First, let $S$ be a set that stores the designed control law $\mathcal{F}^{(l)}$ and the estimator gains $\Upsilon_{i}^{(l)},\forall i\in\mathcal{V}$ from each iteration step as follows:

$$S:=\left\{s^{(l)}\,\middle|\,s^{(l)}=\left(\mathcal{F}^{(l)},\Upsilon_{1}^{(l)},\cdots,\Upsilon_{N}^{(l)}\right),\ l\in\mathbb{N}\right\}$$ (26)

The iteration terminates if: i) the total iteration count reaches the threshold number $N_{max}$; or ii) consecutive iterations converge with respect to the following stopping condition:

$$\triangle J(l,l-1)\leq\epsilon_{stop}$$ (27)

where $\triangle J(l,l-1):=\lvert J(\mathcal{F}^{(l)},\Upsilon_{1}^{(l)},\cdots,\Upsilon_{N}^{(l)})-J(\mathcal{F}^{(l-1)},\Upsilon_{1}^{(l-1)},\cdots,\Upsilon_{N}^{(l-1)})\rvert$ and $\epsilon_{stop}$ is the threshold magnitude for convergence. The objective cost of the corresponding control law, $J(\mathcal{F}^{(l)},\Upsilon_{1}^{(l)},\cdots,\Upsilon_{N}^{(l)})$, is computed by plugging the designed control law $\mathcal{F}^{(l)}$ and the set of distributed estimator gains $\Upsilon_{i}^{(l)},\forall i\in\mathcal{V}$ into (19). The final output of the control-estimation synthesis is given by:

$$\begin{split}\mathcal{F}^{\ast}&=\mathcal{F}^{(l)},\ \Upsilon_{i}^{\ast}=\Upsilon_{i}^{(l)},\ \forall i\in\mathcal{V}\\ \mathrm{where}\ l&=\operatorname*{arg\,min}_{l\in\{1,\cdots,|S|\}}J(\mathcal{F}^{(l)},\Upsilon_{1}^{(l)},\cdots,\Upsilon_{N}^{(l)})\end{split}$$ (28)

The overall recursive structure of the control-estimation synthesis procedure is summarized in Algorithm 1.

Algorithm 1: Virtual interaction based distributed control-estimation synthesis.

Initialize the MAS dynamics information $A$, $B$, $\mathcal{L}$, $\Sigma(0)$, $\mathcal{F}^{(0)}$, $\epsilon_{stop}$, $N_{max}$, and the cost metrics $\mathcal{Q}$, $\mathcal{R}$.
for $l=1,2,\cdots,N_{max}$ do
  a) Distributed estimator design
     for $t=0$ to the termination time $T$ do
       1) Update $\Sigma(t+1)$ using $\mathcal{F}^{(l-1)}$, (13), (16), and (17)
     end for; Output $\longrightarrow\Upsilon_{i}^{(l)}$ and $\Sigma_{i}^{(l)},\forall i\in\mathcal{V}$
  b) Distributed control law design
     for $i=1$ to the total number of agents $N$ do
       2) Solve (24) and compute ${}^{i}\mathcal{F}^{(l)}$
     end for; Output $\longrightarrow{}^{i}\mathcal{F}^{(l)},\forall i\in\mathcal{V}$
  c) Distributed control-estimation synthesis
       3) Synthesize the control law $\mathcal{F}^{(l)}$ using (25)
  d) Convergence check
       4) Store $\mathcal{F}^{(l)}$ and $\Upsilon_{i}^{(l)},\forall i\in\mathcal{V}$ in the set $S$
       5) If (27) is satisfied or $l>N_{max}$, terminate
end for; Output $\Longrightarrow\mathcal{F}^{\ast}$ and $\Upsilon_{i}^{\ast},\forall i\in\mathcal{V}$
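A compact sketch of the outer loop of Algorithm 1, i.e., the bookkeeping (26), the stopping test (27), and the final selection (28). The callables `design_step` and `cost` stand for steps a)-c) and the evaluation of (19), respectively, and are assumptions.

```python
def synthesize(design_step, cost, N_max, eps_stop):
    """Outer loop of Algorithm 1 (a sketch)."""
    S = []                                  # stored tuples s^(l), cf. (26)
    J_prev = float("inf")
    for l in range(1, N_max + 1):
        s_l = design_step()                 # estimator design, control design, mixing
        S.append(s_l)
        J_l = cost(s_l)                     # evaluate (19) at s^(l)
        if abs(J_l - J_prev) <= eps_stop:   # stopping condition (27)
            break
        J_prev = J_l
    return min(S, key=cost)                 # final selection (28)
```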

It is noted that Algorithm 1 is executed in the offline design phase. Once the distributed control law $\mathcal{F}^{\ast}$ and the corresponding estimator gains $\Upsilon_{i}^{\ast}$ for each agent are designed, each agent is deployed into distributed online operation using its own prior knowledge. It is worth noting that the majority of the heavy computation occurs in the offline design phase. Therefore, the online operation is not burdensome to the limited on-board resources of each agent. Indeed, the computational complexity of the online operation of the proposed algorithm scales with the number of agents, i.e., $\mathcal{O}(N)$.

4 Stability Analysis

In this section, the stability analysis of the proposed distributed estimation algorithm from Section 3.1 is presented. To begin with, let us consider the control law as a static memoryless feedback gain $F$:

$$u_{i}(t)=M_{i}F\,{}^{i}\hat{x}(t)$$ (29)

where $M_{i}=[0_{p}\cdots I_{p}\cdots 0_{p}]\in\mathbb{R}^{p\times Np}$ is the block matrix having $I_{p}$ in the $i^{th}$ block entry and $0_{p}$ in the other entries. $F$ has structural constraints subject to the network topology of the MAS specified by the Laplacian $\mathcal{L}$. Note that the estimation stability is unrelated to the design of $F$, as will be discussed below, and the analysis is thus readily applicable to memory-based feedback control laws as in (9). Corresponding to (29), the predicted state estimation error of the $i^{th}$ agent can be written as follows:

$${}^{i}e^{-}(t+1)=D_{1}\,{}^{i}e(t)+D_{12}e(t)+\tilde{w}(t)$$ (30)

    where

$$\begin{split}D_{1}&=\tilde{A}+\tilde{B}F\\ D_{12}&=-(1_{N}^{\mathrm{T}}\otimes\tilde{B})\,\mathrm{blkdg}(M_{1}^{\mathrm{T}}M_{1},\dots,M_{N}^{\mathrm{T}}M_{N})(I_{N}\otimes F)\\ e(t)&=[{}^{1}e(t)^{\mathrm{T}}\cdots{}^{N}e(t)^{\mathrm{T}}]^{\mathrm{T}}\end{split}$$

It is noted from (30) that the estimation error of the individual agent, ${}^{i}e$, is coupled with the augmented estimation error of the entire MAS, $e$. The following two lemmas are required for proving the stability of the proposed distributed estimation algorithm.

Lemma 1.

The estimation error covariance of the $i^{th}$ agent, ${}^{i}\Sigma(t)$, is positive definite and bounded for all $t$ if the following system is observable (Kwon and Hwang, 2018):

$$\begin{split}x(t+1)&=\mathcal{L}x(t)\\ Z_{i}(t)&=C_{i}x(t)\end{split}$$ (31)

where $C_{i}=[h_{1}\ h_{2}\cdots h_{|\Omega_{i}|}]^{\mathrm{T}}\otimes I_{n}\in\mathbb{R}^{n|\Omega_{i}|\times nN}$ is an observer matrix which gathers the measurements available from the $i^{th}$ agent's perspective, i.e., those of the $i^{th}$ agent's neighbors. $h_{q}\in\mathbb{R}^{N},q=1,2,\dots,|\Omega_{i}|$, are the nonzero column vectors of the matrix $\mathrm{diag}(c_{i1},c_{i2},\dots,c_{iN})$. It is noted that the value of $c_{ij}$ is decided by the graph $\mathcal{G}=(\mathcal{V},\mathcal{E})$ of the given network, where $(i,j)\in\mathcal{E}$ indicates the availability of the measurement of the $j^{th}$ agent's state from the $i^{th}$ agent, i.e., $c_{ij}=1$, and $(i,j)\notin\mathcal{E}$ means $c_{ij}=0$.

Proof. The proof is referred to the authors' previous work (Kwon and Hwang, 2018). $\blacksquare$
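The observability condition of Lemma 1 can be checked numerically with a standard Kalman rank test. The sketch below assumes that the Laplacian acts blockwise on the stacked state, i.e., that the system matrix of (31) is $\mathcal{L}\otimes I_{n}$; the function name is also an assumption.

```python
import numpy as np

def lemma1_observable(Lap, C_i, n):
    """Kalman rank test for the pair in (31) (a sketch)."""
    A_sys = np.kron(Lap, np.eye(n))     # assumed blockwise Laplacian dynamics
    nN = A_sys.shape[0]
    blocks, M = [], C_i.copy().astype(float)
    for _ in range(nN):                 # O = [C; CA; ...; CA^{nN-1}]
        blocks.append(M)
        M = M @ A_sys
    return np.linalg.matrix_rank(np.vstack(blocks)) == nN
```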

To analyze the estimation stability of ${}^{i}e$, we first introduce $H_{i}(t)\in\mathbb{R}^{nN^{2}\times nN}$ as an affine mapping matrix and $\alpha_{i}(t)\in\mathbb{R}^{nN^{2}}$ as a lumped noise as follows:

$$\begin{split}H_{i}(t+1)&=\tilde{F}(t+1)H_{i}(t)({}^{i}D_{22}(t+1)\,{}^{i}D_{2}(t))^{-1}\\ \alpha_{i}(t+1)&=(\tilde{F}(t+1)-H_{i}(t+1)\,{}^{i}D_{22}(t+1)D_{12})\alpha_{i}(t)+\gamma(t+1)-H_{i}(t+1)\zeta_{i}(t+1)\\ {}^{i}D_{2}(t)&=D_{1}+D_{12}H_{i}(t)\\ {}^{i}D_{22}(t+1)&=I_{nN}-L_{i}(t+1)C_{i}\\ \zeta_{i}(t+1)&={}^{i}D_{22}(t+1)\tilde{w}(t)-L_{i}(t+1)C_{i}v_{i}(t+1)\\ \tilde{F}(t+1)&=\mathrm{blkdg}({}^{1}D_{22}(t+1)\,{}^{1}D_{2}(t),\ \dots,\ {}^{N}D_{22}(t+1)\,{}^{N}D_{2}(t))\\ \gamma(t+1)&=\begin{bmatrix}{}^{1}D_{22}(t+1)D_{12}\alpha_{1}(t)+\zeta_{1}(t+1)\\ \vdots\\ {}^{N}D_{22}(t+1)D_{12}\alpha_{N}(t)+\zeta_{N}(t+1)\end{bmatrix}\end{split}$$ (32)

It is noted that it is trivial to compute the initial affine transformation matrix $H_{i}(0),\forall i\in\mathcal{V}$, which satisfies $\Sigma(0)=H_{i}(0)\,{}^{i}\Sigma(0)H_{i}(0)^{\mathrm{T}}$ under Assumption 1. On the other hand, $\alpha_{i}$ follows the Gaussian distribution $\alpha_{i}(t)\sim\mathcal{N}_{nNN}(0,\eta_{i}(t))$, where $\eta_{i}(t)=\mathbb{E}[\alpha_{i}(t)\alpha_{i}^{\mathrm{T}}(t)]$ with initial value $\eta_{i}(0)=0_{nNN}$. Then, we can show that the augmented estimation error can be mapped to the estimation error from a single agent's perspective by the following lemma.

Lemma 2.

Suppose the system given in (31) is observable and Assumption 1 holds. Then, given the agent dynamics and the estimation-based control (29) with control law $F$, there exists an affine mapping between the estimation error of the $i^{th}$ agent, ${}^{i}e$, and the augmented MAS estimation error, $e$, as follows:

$$e(t+1)=H_{i}(t+1)\,{}^{i}e(t+1)+\alpha_{i}(t+1),\quad\forall i\in\mathcal{V},\ \forall t\geq 0$$ (33)

Proof. The proof proceeds by induction. Suppose the estimation error at time step $t$ satisfies (33). To verify that (33) is satisfied at the next time step $t+1$ with the definitions of $H_{i}(t+1)$ and $\alpha_{i}(t+1)$, the dynamics of the estimation error of the $i^{th}$ agent is considered. By (33), (30) can be restated as follows:

$$\begin{split}{}^{i}e^{-}(t+1)&=D_{1}\,{}^{i}e(t)+D_{12}(H_{i}(t)\,{}^{i}e(t)+\alpha_{i}(t))+\tilde{w}(t)\\ &={}^{i}D_{2}(t)\,{}^{i}e(t)+D_{12}\alpha_{i}(t)+\tilde{w}(t)\end{split}$$ (34)

And the updated estimation error is derived as follows:

$${}^{i}e(t+1)={}^{i}D_{22}(t+1)\,{}^{i}D_{2}(t)\,{}^{i}e(t)+{}^{i}D_{22}(t+1)D_{12}\alpha_{i}(t)+\zeta_{i}(t+1)$$ (35)

By concatenating (35) for all agents $i\in\mathcal{V}$, the MAS augmented estimation error, $e$, can be formulated as follows:

$$\begin{split}e(t+1)&=\tilde{F}(t+1)e(t)+\gamma(t+1)\\ &=\tilde{F}(t+1)H_{i}(t)\,{}^{i}e(t)+\tilde{F}(t+1)\alpha_{i}(t)+\gamma(t+1)\\ &=H_{i}(t+1)\,{}^{i}e(t+1)+\alpha_{i}(t+1)\end{split}$$ (36)

Without loss of generality, the matrix ${}^{i}D_{22}(t+1)\,{}^{i}D_{2}(t)$ is invertible, as it governs the estimation error dynamics (35) induced by the stochastic linear dynamical system. This completes the proof of Lemma 2. $\blacksquare$

Based on the derived affine mapping between $e$ and ${}^{i}e$, we are ready to present the stability of the proposed estimation algorithm. As this paper considers a stochastic MAS, the estimation error is analyzed through a supermartingale of Lyapunov functions, which satisfies the following conditions:

$$\left\{\begin{aligned}&V(e(t),t)=0,&&e(t)=\mathbf{0}\\&V(e(t),t)>0,&&e(t)\neq\mathbf{0}\\&V(e(t),t)\rightarrow\infty,&&e(t)\rightarrow\infty\end{aligned}\right.\quad\forall t$$ (37)
$$\Delta V(t+1,t)<0,\quad\forall t$$ (38)

where $\Delta V(t+1,t):=V(\mathbb{E}\left[e(t+1)|e(t)\right],t+1)-V(e(t),t)$.

Theorem 1.

Given the MAS dynamics and the control protocol (29), the proposed distributed estimation algorithm is globally asymptotically stable in the sense of Lyapunov if the system (31) is observable.

Proof. Let us define the Lyapunov function $V:\mathbb{R}^{nN}\times\mathbb{N}\rightarrow\mathbb{R}$ of the estimation error of the $i^{th}$ agent as follows:

$$\begin{split}V({}^{i}e(t),t):=&\ {}^{i}e^{\mathrm{T}}(t)\left({}^{i}\Sigma(t)\right)^{-1}{}^{i}e(t)\\ \triangle V(t+1,t):=&\ V(\mathbb{E}[{}^{i}e(t+1)|{}^{i}e(t)],t+1)-V({}^{i}e(t),t)\end{split}$$ (39)

As ${}^{i}\Sigma(t)$ is positive definite and bounded by Lemma 1, $({}^{i}\Sigma(t))^{-1}\succ 0$ exists. Therefore, $V$ is a quadratic function which satisfies the conditions in (37). Besides, using (34), the conditional expectation $\mathbb{E}[{}^{i}e(t+1)|{}^{i}e(t)]$ is given by

$$\mathbb{E}[{}^{i}e(t+1)|{}^{i}e(t)]={}^{i}D_{22}(t+1)\,{}^{i}D_{2}(t)\,{}^{i}e(t)$$

Then, by applying equation (35), $\triangle V$ can be written as follows:

$$\begin{split}\triangle V(t+1,t)=&-{}^{i}e^{\mathrm{T}}(t)\left(({}^{i}\Sigma(t))^{-1}-{}^{i}D_{2}^{\mathrm{T}}(t)\,{}^{i}D_{22}^{\mathrm{T}}(t+1)({}^{i}\Sigma(t+1))^{-1}\,{}^{i}D_{22}(t+1)\,{}^{i}D_{2}(t)\right){}^{i}e(t)\\ =&-{}^{i}e^{\mathrm{T}}(t)\,\mathcal{M}_{i}(t+1)\,{}^{i}e(t)\end{split}$$ (40)

To satisfy the condition in (38), $\mathcal{M}_{i}(t+1)$ should be a positive definite matrix. By applying Lemma 2, the predicted estimation error covariance of the $i^{th}$ agent is derived using (34) as:

$${}^{i}\Sigma^{-}(t+1)={}^{i}D_{2}(t)\,{}^{i}\Sigma(t)\,{}^{i}D_{2}^{\mathrm{T}}(t)+D_{12}\eta_{i}(t)D_{12}^{\mathrm{T}}+\Sigma_{\tilde{w}}$$ (41)

Correspondingly, the updated estimation error covariance can be computed as follows:

$$\begin{split}{}^{i}\Sigma(t+1)&={}^{i}D_{22}(t+1)\,{}^{i}\Sigma^{-}(t+1)\\ &={}^{i}\Sigma^{-}(t+1)-{}^{i}\Sigma^{-}(t+1)C_{i}^{\mathrm{T}}\left(C_{i}\,{}^{i}\Sigma^{-}(t+1)C_{i}^{\mathrm{T}}+C_{i}\mathrm{\Xi}_{i}C_{i}^{\mathrm{T}}\right)^{-1}C_{i}\,{}^{i}\Sigma^{-}(t+1)\end{split}$$ (42)

and, using the definition of ${}^{i}D_{22}$, ${}^{i}\Sigma(t+1)$ can be expressed as:

$${}^{i}\Sigma(t+1)={}^{i}D_{22}(t+1)\,{}^{i}\Sigma^{-}(t+1)$$ (43)

By applying the matrix inversion lemma, (42) is rewritten as:

$${}^{i}\Sigma(t+1)=\left(({}^{i}\Sigma^{-}(t+1))^{-1}+C_{i}^{\mathrm{T}}(C_{i}\mathrm{\Xi}_{i}C_{i}^{\mathrm{T}})^{-1}C_{i}\right)^{-1}$$ (44)

Using (44), multiplying $({}^{i}\Sigma(t+1))^{-1}$ on the left and right by ${}^{i}\Sigma(t+1)$ and applying (43) yields:

$${}^{i}\Sigma(t+1)={}^{i}D_{22}(t+1)\left({}^{i}\Sigma^{-}(t+1)+\mathcal{W}_{i}(t+1)\right){}^{i}D_{22}^{\mathrm{T}}(t+1)$$ (45)

where $\mathcal{W}_{i}(t+1)={}^{i}\Sigma^{-}(t+1)C_{i}^{\mathrm{T}}(C_{i}\mathrm{\Xi}_{i}C_{i}^{\mathrm{T}})^{-1}C_{i}\,{}^{i}\Sigma^{-}(t+1)$. It is trivial to show that $\mathcal{W}_{i}(t+1)\succ 0,\ \forall t$. Recalling ${}^{i}\Sigma^{-}(t+1)$ given in (41), $({}^{i}\Sigma(t+1))^{-1}$ can be rewritten as:

$$({}^{i}\Sigma(t+1))^{-1}=({}^{i}D_{22}^{\mathrm{T}}(t+1))^{-1}\left({}^{i}D_{2}(t)\,{}^{i}\Sigma(t)\,{}^{i}D_{2}^{\mathrm{T}}(t)+\Sigma_{\tilde{w}}+D_{12}\eta_{i}(t)D_{12}^{\mathrm{T}}+\mathcal{W}_{i}(t+1)\right)^{-1}({}^{i}D_{22}(t+1))^{-1}$$ (46)

By applying (46), $\mathcal{M}_{i}(t+1)$ can be redefined as:

$$\mathcal{M}_{i}(t+1)=({}^{i}\Sigma(t))^{-1}-{}^{i}D_{2}^{\mathrm{T}}(t)\left({}^{i}D_{2}(t)\,{}^{i}\Sigma(t)\,{}^{i}D_{2}^{\mathrm{T}}(t)+D_{12}\eta_{i}(t)D_{12}^{\mathrm{T}}+\Sigma_{\tilde{w}}+\mathcal{W}_{i}(t+1)\right)^{-1}{}^{i}D_{2}(t)$$ (47)

After tedious manipulation using the matrix inversion lemma, (47) can be rewritten as follows:

$$\mathcal{M}_{i}(t+1)=({}^{i}\Sigma(t))^{-1}\left(({}^{i}\Sigma(t))^{-1}+{}^{i}D_{2}^{\mathrm{T}}(t)(D_{12}\eta_{i}(t)D_{12}^{\mathrm{T}}+\Sigma_{\tilde{w}}+\mathcal{W}_{i}(t+1))^{-1}\,{}^{i}D_{2}(t)\right)^{-1}({}^{i}\Sigma(t))^{-1}$$ (48)

As $({}^{i}\Sigma(t))^{-1}\succ 0$ and ${}^{i}D_{2}^{\mathrm{T}}(t)(D_{12}\eta_{i}(t)D_{12}^{\mathrm{T}}+\Sigma_{\tilde{w}}+\mathcal{W}_{i}(t+1))^{-1}\,{}^{i}D_{2}(t)\succ 0$ in (48), one can verify that $\mathcal{M}_{i}(t+1)\succ 0,\ \forall t$, which guarantees that the Lyapunov function satisfies (37) and (38). This proves that the estimation error is globally asymptotically stable. $\blacksquare$

5 Numerical Simulation

The effectiveness of the proposed algorithm is demonstrated with an illustrative MAS example. The MAS consists of five agents whose dynamics and objective are specified by the following parameter sets: $A=1$, $B=1$, $\Theta_{i}(t)=1$, ${}^{i}\mathrm{\Xi}(t)=\mathrm{diag}(1,1,1,2,1),\forall i\in\mathcal{V},\forall t\in\{0,\cdots,T\}$, $T=5$, $\mathcal{Q}=I_{6}\otimes(5I_{5}-1_{5\times 5})$, and $\mathcal{R}=I_{25}$. The MAS network topology is set to be partially connected, the same as the one in (Kwon and Hwang, 2018). To validate the performance of the proposed algorithm, we conduct a comparative analysis with two different scenarios: i) MAS with a fully connected network, which is free from the network topological constraint; and ii) MAS with the same (partially connected) network topology, where non-neighboring agent information is not available to each agent. For the second case, we test the suboptimal method presented in (Gupta et al., 2005). The simulation results are shown in Figure 3. By virtue of the virtual interactions between non-neighboring agents, our algorithm outperforms the existing method in the partially connected network and even matches the fully connected network case despite the network topological constraints.

Figure 3: Cost value statistics histogram (Monte Carlo simulations with 100 runs).
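For reproducibility, the simulation parameters above translate directly into code; a minimal sketch with $n=p=1$ (the variable names are assumptions):

```python
import numpy as np

N, T = 5, 5                                        # five agents, horizon T = 5
A, B = np.array([[1.0]]), np.array([[1.0]])        # scalar agent dynamics
Theta = np.eye(1)                                  # process noise covariance Θ_i(t)
Xi_i = np.diag([1.0, 1.0, 1.0, 2.0, 1.0])          # iΞ(t), identical for all agents
Q = np.kron(np.eye(T + 1), 5 * np.eye(N) - np.ones((N, N)))  # Q = I_6 ⊗ (5 I_5 - 1_{5x5})
R = np.eye(N * T)                                  # R = I_25
```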

To further demonstrate the performance of the proposed algorithm, we have performed additional simulations with respect to different numbers of "real" interactions. It is worth noting that the network links between agents yield the "real" interactions, while those with no explicit links create the virtual interactions. The MAS with five agents has been simulated under the different network topologies shown in Figure 4.

Figure 4: Different network topologies with 5 agents.

All experiments followed the same setting except for the network topology. Apparently, the number of virtual interactions decreases as the number of network links between agents increases, whereby we can analyze the effect of the ratio between real and virtual interactions. The performance of our proposed method under the different network settings is depicted in Figure 5.

Figure 5: Cost value comparison with a different number of network links between agents.

The results show that the performance of our proposed method does not vary much with respect to the ratio of virtual to real interactions. At the cost of some on-board computational resources for estimating non-neighboring agents, our proposed method provides the clear advantage of achieving optimal performance with fewer network connections.

6 Conclusions

This paper has proposed a novel design procedure for the optimal distributed control of a linear stochastic MAS, which is generally subject to network topological constraints. The proposed method gets around the network topological constraint by employing a distributed estimator, whereby each agent can exploit non-neighboring agents' information. Future work will include a theoretical performance guarantee of the proposed distributed control-estimation synthesis, such as closed-loop stability analysis, and a further extension to the infinite time horizon case for practical use.

References

• Shi and Yan [2020] Peng Shi and Bing Yan. A survey on intelligent control for multiagent systems. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2020.
• Li et al. [2019] Xianwei Li, Fangzhou Liu, Martin Buss, and Sandra Hirche. Fully distributed consensus control for linear multi-agent systems: A reduced-order adaptive feedback approach. IEEE Transactions on Control of Network Systems, 2019.
• Zhu et al. [2020] Yunru Zhu, Liqi Zhou, Yuanshi Zheng, Jian Liu, and Shiming Chen. Sampled-data based resilient consensus of heterogeneous multiagent systems. International Journal of Robust and Nonlinear Control, 30(17):7370–7381, 2020.
• Morita et al. [2015] Ryosuke Morita, Takayuki Wada, Izumi Masubuchi, Toru Asai, and Yasumasa Fujisaki. Multiagent consensus with noisy communication: stopping rules based on network graphs. IEEE Transactions on Control of Network Systems, 3(4):358–365, 2015.
• Gupta et al. [2005] Vijay Gupta, Babak Hassibi, and Richard M. Murray. A sub-optimal algorithm to synthesize control laws for a network of dynamic agents. International Journal of Control, 78(16):1302–1313, 2005.
• Ma et al. [2015] Jingying Ma, Yuanshi Zheng, and Long Wang. LQR-based optimal topology of leader-following consensus. International Journal of Robust and Nonlinear Control, 25(17):3404–3421, 2015.
• Nguyen [2015] Dinh Hoa Nguyen. A sub-optimal consensus design for multi-agent systems based on hierarchical LQR. Automatica, 55:88–94, 2015.
• Jiao et al. [2019] Junjie Jiao, Harry L. Trentelman, and M. Kanat Camlibel. A suboptimality approach to distributed linear quadratic optimal control. IEEE Transactions on Automatic Control, 65(3):1218–1225, 2019.
• Kwon and Hwang [2018] Cheolhyeon Kwon and Inseok Hwang. Sensing-based distributed state estimation for cooperative multiagent systems. IEEE Transactions on Automatic Control, 64(6):2368–2382, 2018.
• Furieri and Kamgarpour [2020] Luca Furieri and Maryam Kamgarpour. First order methods for globally optimal distributed controllers beyond quadratic invariance. In 2020 American Control Conference (ACC), pages 4588–4593. IEEE, 2020.
• Lin et al. [2013] Fu Lin, Makan Fardad, and Mihailo R. Jovanović. Design of optimal sparse feedback gains via the alternating direction method of multipliers. IEEE Transactions on Automatic Control, 58(9):2426–2431, 2013.
• Lessard and Lall [2011] Laurent Lessard and Sanjay Lall. Quadratic invariance is necessary and sufficient for convexity. In Proceedings of the 2011 American Control Conference, pages 5360–5362. IEEE, 2011.