
Stability of FFLS-based Diffusion Adaptive Filter Under Cooperative Excitation Condition

Die Gan, Siyu Xie, Zhixin Liu, Member, IEEE, and Jinhu Lü, Fellow, IEEE

Corresponding author: Zhixin Liu. This work was supported by the Natural Science Foundation of China under Grant T2293772, the National Key R&D Program of China under Grant 2018YFA0703800, the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant XDA27000000, and the National Science Foundation of Shandong Province under Grant ZR2020ZD26.

D. Gan is with the Zhongguancun Laboratory, Beijing, China (e-mail: gandie@amss.ac.cn). S. Y. Xie is with the School of Aeronautics and Astronautics, University of Electronic Science and Technology of China, Chengdu 611731, China (e-mail: syxie@uestc.edu.cn). Z. X. Liu is with the Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, and the School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China (e-mail: lzx@amss.ac.cn). J. H. Lü is with the School of Automation Science and Electrical Engineering, Beihang University, Beijing, China, and also with the Zhongguancun Laboratory, Beijing, China (e-mail: jhlu@iss.ac.cn).
Abstract

In this paper, we consider the distributed filtering problem over sensor networks, in which all sensors cooperatively track unknown time-varying parameters by using local information. A distributed forgetting factor least squares (FFLS) algorithm is proposed by minimizing a local cost function formulated as a linear combination of accumulated estimation errors. Stability analysis of the algorithm is provided under a cooperative excitation condition which contains spatial union information reflecting the cooperative effect of all sensors. Furthermore, we generalize the theoretical results to the case of Markovian switching directed graphs. The main difficulty of the theoretical analysis lies in analyzing the properties of products of non-independent and non-stationary random matrices. Techniques from stability theory, algebraic graph theory and Markov chain theory are employed to deal with this issue. Our theoretical results are obtained without relying on the independence or stationarity assumptions on regression vectors which are commonly used in the existing literature.

Index Terms

Distributed forgetting factor least squares, cooperative excitation condition, exponential stability, stochastic dynamic systems, Markovian switching topology

1 Introduction


Owing to their capability to process data collaboratively, wireless sensor networks (WSNs) have attracted increasing research attention in diverse areas, including consensus seeking [1][2], resource allocation [3][4], and formation control [5][6]. How to design distributed adaptive estimation and filtering algorithms to cooperatively estimate unknown parameters has become one of the most important research topics. Compared with centralized estimation algorithms, where a fusion center is needed to collect and process the information measured by all sensors, distributed algorithms estimate or track an unknown parameter process of interest cooperatively by using local noisy measurements. Distributed algorithms are therefore easier to implement, owing to their robustness to network link failures, their protection of privacy, and their reduced communication and computation costs.

Based on classical estimation algorithms and typical distributed strategies such as incremental, diffusion, and consensus strategies, a number of distributed adaptive estimation or filtering algorithms have been investigated (cf., [7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]), e.g., the consensus-based least mean squares (LMS) algorithm, the diffusion Kalman filter (KF), the diffusion least squares (LS) algorithm, the incremental LMS algorithm, the combined diffusion-consensus stochastic gradient (SG) algorithm, and the diffusion forgetting factor least squares (FFLS) algorithm. Performance analysis of these distributed algorithms has also been studied under various information conditions. For deterministic signals or deterministic system matrices, Battistelli and Chisci in [7] established the mean-square boundedness of the state estimation error of a distributed Kalman filter under a collective observability condition. Chen et al. in [8] studied the convergence of a distributed adaptive identification algorithm under a cooperative persistent excitation (PE) condition. Javed et al. in [9] presented a stability analysis of the cooperative gradient algorithm for deterministic regression vectors satisfying a cooperative PE condition. Note that signals are often random, since they are generated by dynamic systems affected by noises. For the random regression vector case, Barani et al. in [10] studied the convergence of a distributed stochastic gradient descent algorithm with independent and identically distributed (i.i.d.) signals. Schizas et al. in [11] provided a stability analysis of a distributed LMS-type adaptive algorithm under strictly stationary and ergodic regression vectors. Zhang et al. in [12] studied the mean square performance of a diffusion FFLS algorithm with independent input signals. Takahashi et al. in [13] established a performance analysis of the diffusion LMS algorithm for i.i.d. regression vectors. Lei and Chen in [14] established the convergence analysis of a distributed stochastic approximation algorithm with ergodic system signals. Mateos and Giannakis in [15] presented the stability and performance analysis of a distributed FFLS algorithm under spatio-temporally white regression vectors.

We remark that most theoretical results in the above literature were established by requiring the regression vectors either to be deterministic and satisfy PE conditions, or to be random but satisfy independence, stationarity, or ergodicity conditions. In fact, the observed data are often random and can hardly satisfy such statistical assumptions, since they are generated by complex dynamic systems in which feedback loops inevitably exist (cf., [20]). The main difficulty in the performance analysis of distributed algorithms is to analyze the product of the random matrices involved in the estimation error equations. In order to relax the above stringent conditions on random regression vectors, some progress has been made on distributed adaptive estimation and filtering algorithms under undirected graphs. For estimating time-invariant parameters, the convergence analysis of a distributed SG algorithm and a distributed LS algorithm was provided in [21] and [22] under cooperative excitation conditions. For tracking a time-varying parameter, Xie and Guo in [16] and [23] proposed the weakest possible cooperative information conditions to guarantee the stability and performance of consensus-based and diffusion-based LMS algorithms. Compared with the LMS algorithm, the FFLS algorithm can generate more accurate estimates in the transient phase (see, e.g., [24]), but a stability analysis for the distributed FFLS algorithm is still lacking. In this paper, we focus on the design and stability analysis of a distributed FFLS algorithm without relying on independence, stationarity, or ergodicity assumptions on the regression vectors.

The information exchange between sensors is an important factor for the performance of distributed estimation algorithms, and previous studies often assume that the networks are undirected and time-invariant. In practice, communication links might be neither bidirectional nor time-invariant, due to the heterogeneity of sensors and to signal losses caused by temporary deterioration of the communication links. One approach is to model networks that randomly change over time as an i.i.d. process, see e.g., [25, 26]. However, losses of connection usually occur with correlations [27]. Another approach is to model the random switching process as a Markov chain whose states correspond to the possible communication topologies, see [27, 28, 29, 30] among many others. Some studies on distributed algorithms with deterministic or temporally independent measurement matrices under Markovian switching topologies are given in, e.g., [31, 32].

In this paper, we consider the distributed filtering problem over sensor networks where all sensors aim at collectively tracking an unknown, randomly time-varying parameter vector. Based on the fact that recent observations reflect parameter changes better than older data, we introduce a forgetting factor into the local accumulated cost function, formulated as a linear combination of local estimation errors between the observation signals and the prediction signals. By minimizing this local cost function, we propose a distributed FFLS algorithm based on the diffusion strategy over a fixed undirected graph. A stability analysis of the distributed FFLS algorithm is provided under a cooperative excitation condition. Moreover, we generalize the theoretical results to the case of Markovian switching directed sensor networks. The key difference from the fixed undirected graph case is that the adjacency matrix becomes an asymmetric random matrix. We employ Markov chain theory to deal with the coupled relationship between the random adjacency matrices and the random regression vectors. The main contributions of this paper can be summarized as follows:

  • In comparison with [16] and [21], the main difficulty is that the random matrices in the error equation of the diffusion FFLS algorithm are not symmetric and the adaptive gain is no longer a scalar. We establish the exponential stability of the homogeneous part of the estimation error equation and the bound of the tracking error by virtue of the specific structure of the proposed diffusion FFLS algorithm and stability theory of stochastic dynamic systems.

  • Different from the theoretical results on distributed FFLS algorithms in [12] and [15], where the regression vectors are required to satisfy independence or spatio-temporal uncorrelatedness assumptions, our theoretical analysis is carried out without such stringent conditions, which makes it applicable to stochastic feedback systems.

  • The cooperative excitation condition introduced in this paper is a temporal and spatial union information condition on the random regression vectors, which reveals the cooperative effect of multiple sensors in a certain sense, i.e., the whole sensor network can cooperatively accomplish the estimation task even when no individual sensor can, due to lack of necessary information.

The remainder of this paper is organized as follows. In Section 2, we give the problem formulation of this paper. Section 3 presents the distributed FFLS algorithm. The stability of the proposed algorithm under fixed undirected graph and Markovian switching directed graphs are given in Section 4 and Section 5, respectively. Finally, we conclude the paper with some remarks in Section 6.

2 Problem Formulation

2.1 Matrix theory

In this paper, we use \mathbb{R}^{m} to denote the set of m-dimensional real vectors, \mathbb{R}^{m\times n} to denote the set of real matrices with m rows and n columns, and \bm{I}_{m} to denote the m\times m identity matrix. For a matrix \bm{A}\in\mathbb{R}^{m\times n}, \|\bm{A}\| denotes its Euclidean norm, i.e., \|\bm{A}\|\triangleq(\lambda_{\max}(\bm{A}\bm{A}^{T}))^{\frac{1}{2}}, where the superscript T denotes the transpose operator and \lambda_{\max}(\cdot) denotes the largest eigenvalue of a matrix. Correspondingly, \lambda_{\min}(\cdot) and tr(\cdot) denote the smallest eigenvalue and the trace of a matrix, respectively. The notation {\rm col}(\cdot,\cdots,\cdot) denotes a vector stacked from the specified vectors, and {\rm diag}(\cdot,\cdots,\cdot) denotes a block matrix formed in a diagonal manner from the corresponding vectors or matrices.

For a matrix \bm{A}=[a_{ij}]\in\mathbb{R}^{m\times m} with nonnegative entries, if \sum_{j=1}^{m}a_{ij}=1 holds for all i=1,\cdots,m, then \bm{A} is called stochastic. The Kronecker product of two matrices \bm{A} and \bm{B} is denoted by \bm{A}\otimes\bm{B}. For two real symmetric matrices \bm{X}\in\mathbb{R}^{n\times n} and \bm{Y}\in\mathbb{R}^{n\times n}, \bm{X}\geq\bm{Y} (\bm{X}>\bm{Y}, \bm{X}\leq\bm{Y}, \bm{X}<\bm{Y}) means that \bm{X}-\bm{Y} is a positive semidefinite (positive definite, negative semidefinite, negative definite) matrix. For a matrix sequence \{\bm{A}_{t}\} and a positive scalar sequence \{a_{t}\}, the notation \bm{A}_{t}=O(a_{t}) means that there exists a positive constant C, independent of t and a_{t}, such that \|\bm{A}_{t}\|\leq Ca_{t} holds for all t\geq 0.

The matrix inversion formula is frequently used in this paper; we state it as follows.

Lemma 2.1 (Matrix inversion formula [33])

For any matrices \bm{A}, \bm{B}, \bm{C} and \bm{D} with suitable dimensions, the following formula

(\bm{A}+\bm{B}\bm{D}\bm{C})^{-1}=\bm{A}^{-1}-\bm{A}^{-1}\bm{B}(\bm{D}^{-1}+\bm{C}\bm{A}^{-1}\bm{B})^{-1}\bm{C}\bm{A}^{-1}

holds, provided that the relevant matrices are invertible.
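The formula is easy to sanity-check numerically. The following minimal sketch (with arbitrarily chosen, well-conditioned test matrices of our own) verifies that both sides agree:

```python
import numpy as np

# Numerical sanity check of the matrix inversion formula (Lemma 2.1).
rng = np.random.default_rng(0)
m, r = 4, 2
A = np.eye(m) + 0.1 * rng.standard_normal((m, m))   # invertible A
B = rng.standard_normal((m, r))
C = rng.standard_normal((r, m))
D = np.eye(r) + 0.1 * rng.standard_normal((r, r))   # invertible D

lhs = np.linalg.inv(A + B @ D @ C)
Ai = np.linalg.inv(A)
rhs = Ai - Ai @ B @ np.linalg.inv(np.linalg.inv(D) + C @ Ai @ B) @ C @ Ai
assert np.allclose(lhs, rhs)   # the two sides agree up to rounding error
```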

2.2 Graph theory

We use graphs to model the communication topology between sensors. A directed graph \mathcal{G}=(\mathcal{V},\mathcal{E},\mathcal{A}) is composed of a vertex set \mathcal{V}=\{1,2,3,\cdots,n\}, which stands for the set of sensors (i.e., nodes), an edge set \mathcal{E}\subset\mathcal{V}\times\mathcal{V}, and a weighted adjacency matrix \mathcal{A}=[a_{ij}]_{1\leq i,j\leq n}. A directed edge (i,j)\in\mathcal{E} means that the j-th sensor can receive data from the i-th sensor; sensors i and j are then called the parent and child sensors, respectively. The elements of the matrix \mathcal{A} satisfy a_{ij}>0 if (i,j)\in\mathcal{E} and a_{ij}=0 otherwise. The in-degree and out-degree of sensor i are defined by \deg_{in}(i)=\sum^{n}_{j=1}a_{ji} and \deg_{out}(i)=\sum^{n}_{j=1}a_{ij}, respectively. The digraph \mathcal{G} is called balanced if \deg_{in}(i)=\deg_{out}(i) for i=1,\cdots,n. Here, we assume that \mathcal{A} is a stochastic matrix. The neighbor set of sensor i is denoted by \mathcal{N}_{i}=\{j\in\mathcal{V}:(j,i)\in\mathcal{E}\}, and sensor i itself is included in this set. For a given positive integer k, the union of k digraphs \{\mathcal{G}_{j}=(\mathcal{V},\mathcal{E}_{j},\mathcal{A}_{j}),1\leq j\leq k\} with the same vertex set is defined as \cup^{k}_{j=1}\mathcal{G}_{j}=(\mathcal{V},\cup^{k}_{j=1}\mathcal{E}_{j},\frac{1}{k}\sum^{k}_{j=1}\mathcal{A}_{j}). A directed path from i_{1} to i_{l} consists of a sequence of sensors i_{1},i_{2},\cdots,i_{l} (l\geq 2) such that (i_{k},i_{k+1})\in\mathcal{E} for k=1,\cdots,l-1. The digraph \mathcal{G} is said to be strongly connected if for any sensor there exist directed paths from this sensor to all other sensors. For the graph \mathcal{G}=(\mathcal{V},\mathcal{E},\mathcal{A}), if a_{ij}=a_{ji} for all i,j\in\mathcal{V}, then it is called an undirected graph. The diameter D_{\mathcal{G}} of an undirected graph \mathcal{G} is defined as the maximum length of the shortest paths between any two sensors.

2.3 Observation model

Consider a network consisting of n sensors (labeled 1,\cdots,n) whose task is to estimate an unknown time-varying parameter \bm{\theta}_{t} by cooperating with each other. We assume that the measurement \{y_{t,i},\bm{\varphi}_{t,i}\} at sensor i obeys the following discrete-time stochastic regression model,

y_{t+1,i}=\bm{\varphi}_{t,i}^{T}\bm{\theta}_{t}+w_{t+1,i},   (1)

where y_{t,i} is the scalar output of sensor i at time t, \bm{\varphi}_{t,i}\in\mathbb{R}^{m} is a random regression vector, \{w_{t,i}\} is a noise process, and \bm{\theta}_{t} is the unknown m-dimensional time-varying parameter whose variation at time t is denoted by \Delta\bm{\theta}_{t}, i.e.,

\Delta\bm{\theta}_{t}\triangleq\bm{\theta}_{t+1}-\bm{\theta}_{t},~~t\geq 0.   (2)

Note that when \Delta\bm{\theta}_{t}\equiv 0, \bm{\theta}_{t} becomes a constant vector. For the special case where w_{t+1,i} is a moving average process and \bm{\varphi}_{t,i} consists of current and past input-output data, i.e.,

\bm{\varphi}_{t,i}^{T}=[y_{t,i},\cdots,y_{t-p,i},u_{t,i},\cdots,u_{t-q,i}]

with u_{t,i} being the input signal of sensor i at time t, the model (1) reduces to an ARMAX model with time-varying coefficients.
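For intuition, the following minimal sketch simulates one trajectory of model (1). The random-walk parameter and the Gaussian regressors and noises are illustrative assumptions only; they are not required by the analysis in this paper.

```python
import numpy as np

# Simulate model (1): y_{t+1,i} = phi_{t,i}^T theta_t + w_{t+1,i},
# with theta_t drifting as a small random walk (Delta theta_t in (2)).
rng = np.random.default_rng(1)
n, m, T = 5, 3, 200                      # sensors, parameter dimension, horizon
theta = rng.standard_normal(m)           # theta_0
phi = rng.standard_normal((T, n, m))     # regressors phi_{t,i} (illustrative choice)
y = np.zeros((T, n))
for t in range(T - 1):
    w = 0.1 * rng.standard_normal(n)                 # noises w_{t+1,i}
    y[t + 1] = phi[t] @ theta + w                    # observations (1)
    theta = theta + 0.01 * rng.standard_normal(m)    # theta_{t+1} = theta_t + Delta theta_t
```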

3 The distributed FFLS Algorithm

Tracking a time-varying signal is a fundamental problem in system identification and signal processing. The well-known recursive least squares estimator with a constant forgetting factor \alpha\in(0,1) is often used to track time-varying parameters; it is defined by

\bm{\hat{\theta}}_{t+1,i}\triangleq\arg\min_{\bm{\beta}}\sum^{t}_{k=0}\alpha^{t-k}(y_{k+1,i}-{\bm{\beta}}^{T}\bm{\varphi}_{k,i})^{2}.   (3)

With some simple manipulations using the matrix inversion formula, we can obtain the following recursive FFLS algorithm (Algorithm 1) for an individual sensor.

Algorithm 1 Standard non-cooperative FFLS algorithm

For any given sensor i\in\{1,\cdots,n\}, begin with an initial estimate \bm{\hat{\theta}}_{0,i}\in\mathbb{R}^{m} and an initial positive definite matrix \bm{P}_{0,i}\in\mathbb{R}^{m\times m}. The standard FFLS is recursively defined for t\geq 0 as follows,

\bm{\hat{\theta}}_{t+1,i}=\bm{\hat{\theta}}_{t,i}+\frac{\bm{P}_{t,i}\bm{\varphi}_{t,i}}{\alpha+\bm{\varphi}^{T}_{t,i}\bm{P}_{t,i}\bm{\varphi}_{t,i}}(y_{t+1,i}-\bm{\varphi}^{T}_{t,i}\bm{\hat{\theta}}_{t,i}),
\bm{P}_{t+1,i}=\frac{1}{\alpha}\left(\bm{P}_{t,i}-\frac{\bm{P}_{t,i}\bm{\varphi}_{t,i}\bm{\varphi}^{T}_{t,i}\bm{P}_{t,i}}{\alpha+\bm{\varphi}^{T}_{t,i}\bm{P}_{t,i}\bm{\varphi}_{t,i}}\right).
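These two recursions translate directly into code. The following is a minimal NumPy sketch; the function name and the choice \bm{P}_{0,i}=100\bm{I} are our own illustrative conventions (any positive definite initialization is admissible):

```python
import numpy as np

def ffls_step(theta, P, phi, y_next, alpha):
    """One recursion of the standard non-cooperative FFLS (Algorithm 1)."""
    denom = alpha + phi @ P @ phi            # alpha + phi^T P phi (a scalar)
    gain = P @ phi / denom                   # adaptive gain
    theta = theta + gain * (y_next - phi @ theta)
    P = (P - np.outer(P @ phi, phi @ P) / denom) / alpha
    return theta, P

# Illustrative initialization: theta_0 = 0, P_0 = 100 I.
m, alpha = 3, 0.95
theta_hat, P = np.zeros(m), 100.0 * np.eye(m)
```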

However, due to the limited sensing ability of each sensor, it is often the case that the measurements obtained by a single sensor reflect only partial information about the unknown parameter. In such a case, if only the local measurements of the sensor itself are utilized to perform the estimation task (see Algorithm 1), then at most part of the unknown parameter, rather than the whole vector, can be estimated. Thus, in this paper, we aim at designing a distributed adaptive estimation algorithm such that all sensors cooperatively track the unknown time-varying parameter \bm{\theta}_{t} by using the random regression vectors and observation signals from their neighbors. To simplify the analysis, in this section we use a fixed undirected graph \mathcal{G}=(\mathcal{V},\mathcal{E},\mathcal{A}) to model the communication topology of the n sensors.

We first introduce the following local cost function \sigma_{t+1,i}(\bm{\beta}) for each sensor i at time t\geq 0, recursively formulated as a linear combination of its neighbors' local estimation errors between the observation signals and the prediction signals,

\sigma_{t+1,i}(\bm{\beta})=\sum_{j\in\mathcal{N}_{i}}a_{ij}\Big{(}\alpha\sigma_{t,j}(\bm{\beta})+(y_{t+1,j}-{\bm{\beta}}^{T}\bm{\varphi}_{t,j})^{2}\Big{)},   (4)

with \sigma_{0,i}(\bm{\beta})=0. Set

\bm{\sigma}_{t}(\bm{\beta})={\rm col}\{\sigma_{t,1}(\bm{\beta}),\cdots,\sigma_{t,n}(\bm{\beta})\},
\bm{e}_{t+1}(\bm{\beta})={\rm col}\{(y_{t+1,1}-{\bm{\beta}}^{T}\bm{\varphi}_{t,1})^{2},\cdots,(y_{t+1,n}-{\bm{\beta}}^{T}\bm{\varphi}_{t,n})^{2}\}.

Hence by (4), we have

\bm{\sigma}_{t+1}(\bm{\beta}) = \alpha\mathcal{A}\bm{\sigma}_{t}(\bm{\beta})+\mathcal{A}\bm{e}_{t+1}(\bm{\beta})
= \alpha^{2}\mathcal{A}^{2}\bm{\sigma}_{t-1}(\bm{\beta})+\alpha\mathcal{A}^{2}\bm{e}_{t}(\bm{\beta})+\mathcal{A}\bm{e}_{t+1}(\bm{\beta})
= \cdots
= \alpha^{t+1}\mathcal{A}^{t+1}\bm{\sigma}_{0}(\bm{\beta})+\sum^{t}_{k=0}\alpha^{t-k}\mathcal{A}^{t+1-k}\bm{e}_{k+1}(\bm{\beta})
= \sum^{t}_{k=0}\alpha^{t-k}\mathcal{A}^{t+1-k}\bm{e}_{k+1}(\bm{\beta}),

which implies that

\sigma_{t+1,i}(\bm{\beta})=\sum^{n}_{j=1}\sum^{t}_{k=0}\alpha^{t-k}a^{(t+1-k)}_{ij}(y_{k+1,j}-{\bm{\beta}}^{T}\bm{\varphi}_{k,j})^{2},   (5)

where a^{(t+1-k)}_{ij} is the i-th row, j-th column entry of the matrix \mathcal{A}^{t+1-k}.
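The unrolling of (4) into (5) is easy to verify numerically; the following sketch checks the stacked recursion against the closed form for an arbitrary row-stochastic \mathcal{A} and arbitrary error vectors, with \bm{\sigma}_{0}=0:

```python
import numpy as np

# Check that iterating sigma_{t+1} = alpha*A*sigma_t + A*e_{t+1} (stacked
# form of (4), sigma_0 = 0) reproduces the closed form underlying (5).
rng = np.random.default_rng(3)
n, T, alpha = 4, 6, 0.95
A = rng.random((n, n)); A /= A.sum(axis=1, keepdims=True)   # row-stochastic A
e = rng.random((T + 1, n))                                  # e[k] stands for e_{k+1}(beta)
sigma = np.zeros(n)
for k in range(T + 1):
    sigma = alpha * A @ sigma + A @ e[k]
closed = sum(alpha ** (T - k) * np.linalg.matrix_power(A, T + 1 - k) @ e[k]
             for k in range(T + 1))
assert np.allclose(sigma, closed)
```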

By minimizing the local cost function \sigma_{t+1,i}(\bm{\beta}) in (5), we obtain the distributed FFLS estimate \bm{\hat{\theta}}_{t+1,i} of the unknown time-varying parameter for sensor i, i.e.,

\bm{\hat{\theta}}_{t+1,i} \triangleq \arg\min_{\bm{\beta}}\sigma_{t+1,i}(\bm{\beta})   (6)
= \left[\sum^{n}_{j=1}\sum^{t}_{k=0}\alpha^{t-k}a^{(t+1-k)}_{ij}\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}\right]^{-1}\left(\sum^{n}_{j=1}\sum^{t}_{k=0}\alpha^{t-k}a^{(t+1-k)}_{ij}\bm{\varphi}_{k,j}y_{k+1,j}\right).

Denote \bm{P}_{t+1,i}=\left(\sum^{n}_{j=1}\sum^{t}_{k=0}\alpha^{t-k}a^{(t+1-k)}_{ij}\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}\right)^{-1}. This can be written in the following recursive form,

\bm{P}^{-1}_{t+1,i}=\sum_{j\in\mathcal{N}_{i}}a_{ij}(\alpha{\bm{P}}^{-1}_{t,j}+\bm{\varphi}_{t,j}\bm{\varphi}^{T}_{t,j}).   (7)

By (6), we similarly have

\bm{\hat{\theta}}_{t+1,i}=\bm{P}_{t+1,i}\sum_{j\in\mathcal{N}_{i}}a_{ij}(\alpha\bm{P}^{-1}_{t,j}\bm{\hat{\theta}}_{t,j}+\bm{\varphi}_{t,j}y_{t+1,j}).   (8)

Note that in the above derivation we assumed that the matrix \sum^{n}_{j=1}\sum^{t}_{k=0}\alpha^{t-k}a^{(t+1-k)}_{ij}\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j} is invertible, which is usually not satisfied for small t. To solve this problem, we take the initial matrix \bm{P}_{0,i} to be positive definite. Then (7) can be modified into the following equation,

\bm{P}_{t+1,i}=\Bigg{(}\sum^{n}_{j=1}\sum^{t}_{k=0}\alpha^{t-k}a^{(t+1-k)}_{ij}\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}+\sum^{n}_{j=1}\alpha^{t+1}a^{(t+1)}_{ij}\bm{P}^{-1}_{0,j}\Bigg{)}^{-1}.   (9)

The estimate given by (8) then differs slightly from (6), but this does not affect the analysis of the asymptotic properties of the estimates.

To design the distributed algorithm, we denote

\bm{\bar{P}}^{-1}_{t+1,i}=\alpha{\bm{P}}^{-1}_{t,i}+\bm{\varphi}_{t,i}\bm{\varphi}^{T}_{t,i}.   (10)

By Lemma 2.1, we have \bm{\bar{P}}_{t+1,i}=\frac{1}{\alpha}\left(\bm{P}_{t,i}-\frac{\bm{P}_{t,i}\bm{\varphi}_{t,i}\bm{\varphi}^{T}_{t,i}\bm{P}_{t,i}}{\alpha+\bm{\varphi}^{T}_{t,i}\bm{P}_{t,i}\bm{\varphi}_{t,i}}\right). Hence,

\bm{\bar{\theta}}_{t+1,i} \triangleq \bm{\bar{P}}_{t+1,i}(\alpha\bm{P}^{-1}_{t,i}\bm{\hat{\theta}}_{t,i}+\bm{\varphi}_{t,i}y_{t+1,i})
= \bm{\hat{\theta}}_{t,i}+\frac{\bm{P}_{t,i}\bm{\varphi}_{t,i}}{\alpha+\bm{\varphi}^{T}_{t,i}\bm{P}_{t,i}\bm{\varphi}_{t,i}}(y_{t+1,i}-\bm{\varphi}^{T}_{t,i}\bm{\hat{\theta}}_{t,i}).

Therefore, we get the following distributed FFLS algorithm of diffusion type, i.e., Algorithm 2.

Algorithm 2 Distributed FFLS algorithm

Input: \{\bm{\varphi}_{t,i},y_{t+1,i}\}^{n}_{i=1}, t=0,1,2,\cdots
Output: \{\bm{\hat{\theta}}_{t+1,i}\}^{n}_{i=1}, t=0,1,2,\cdots

Initialization: For each sensor i\in\{1,\cdots,n\}, begin with an initial vector \bm{\hat{\theta}}_{0,i} and an initial positive definite matrix \bm{P}_{0,i}>0.
for each time t=0,1,2,\cdots do
    for each sensor i=1,\cdots,n do
        Step 1. Adaptation (generate \bm{\bar{\theta}}_{t+1,i} and \bm{\bar{P}}_{t+1,i} based on \bm{\hat{\theta}}_{t,i}, \bm{P}_{t,i}, \bm{\varphi}_{t,i} and y_{t+1,i}):

\bm{\bar{\theta}}_{t+1,i}=\bm{\hat{\theta}}_{t,i}+\frac{\bm{P}_{t,i}\bm{\varphi}_{t,i}}{\alpha+\bm{\varphi}^{T}_{t,i}\bm{P}_{t,i}\bm{\varphi}_{t,i}}(y_{t+1,i}-\bm{\varphi}^{T}_{t,i}\bm{\hat{\theta}}_{t,i}),   (11)
\bm{\bar{P}}_{t+1,i}=\frac{1}{\alpha}\left(\bm{P}_{t,i}-\frac{\bm{P}_{t,i}\bm{\varphi}_{t,i}\bm{\varphi}^{T}_{t,i}\bm{P}_{t,i}}{\alpha+\bm{\varphi}^{T}_{t,i}\bm{P}_{t,i}\bm{\varphi}_{t,i}}\right),   (12)

        Step 2. Combination (generate \bm{P}^{-1}_{t+1,i} and \bm{\hat{\theta}}_{t+1,i} by a convex combination of \bm{\bar{P}}_{t+1,j} and \bm{\bar{\theta}}_{t+1,j}):

\bm{P}^{-1}_{t+1,i}=\sum_{j\in\mathcal{N}_{i}}a_{ij}\bm{\bar{P}}^{-1}_{t+1,j},   (13)
\bm{\hat{\theta}}_{t+1,i}=\bm{P}_{t+1,i}\sum_{j\in\mathcal{N}_{i}}a_{ij}\bm{\bar{P}}^{-1}_{t+1,j}\bm{\bar{\theta}}_{t+1,j}.   (14)
    end for
end for

Note that when \mathcal{A}=\bm{I}_{n}, the distributed FFLS algorithm degenerates to the classical FFLS (i.e., Algorithm 1), and when \alpha=1, it degenerates to the distributed LS in [22], which is used to estimate time-invariant parameters. The quantity 1-\alpha is usually referred to as the adaptation speed. Intuitively, when the parameter process \{\bm{\theta}_{t}\} is slowly time-varying, the adaptation should also be slow (i.e., \alpha should be large). The purpose of this paper is to establish the stability of the above diffusion FFLS-based adaptive filter without independence or stationarity assumptions on the random regression vectors \{\bm{\varphi}_{t,i}\}.
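For concreteness, one adapt-and-combine step of Algorithm 2 can be sketched in NumPy as follows. This is a minimal illustration rather than an optimized implementation: the function name is our own, and since a_{ij}=0 for j\notin\mathcal{N}_{i}, summing over all j realizes the neighborhood sums in (13)-(14).

```python
import numpy as np

def distributed_ffls_step(theta, P, phi, y_next, A, alpha):
    """One adapt-and-combine step of Algorithm 2 (sketch).

    theta: (n, m) local estimates; P: (n, m, m) matrices P_{t,i};
    phi: (n, m) regressors phi_{t,i}; y_next: (n,) observations y_{t+1,i};
    A: (n, n) stochastic adjacency matrix (a_ij = 0 for non-neighbors).
    """
    n, m = theta.shape
    bar_theta = np.empty_like(theta)
    bar_Pinv = np.empty_like(P)
    for i in range(n):                       # Step 1: adaptation (11)-(12)
        denom = alpha + phi[i] @ P[i] @ phi[i]
        gain = P[i] @ phi[i] / denom         # L_{t,i}
        bar_theta[i] = theta[i] + gain * (y_next[i] - phi[i] @ theta[i])
        # bar_P^{-1}_{t+1,i} = alpha P^{-1}_{t,i} + phi phi^T, cf. (10)
        bar_Pinv[i] = alpha * np.linalg.inv(P[i]) + np.outer(phi[i], phi[i])
    new_theta = np.empty_like(theta)
    new_P = np.empty_like(P)
    for i in range(n):                       # Step 2: combination (13)-(14)
        Pinv_i = np.einsum('j,jkl->kl', A[i], bar_Pinv)  # sum_j a_ij bar_P^{-1}_{t+1,j}
        new_P[i] = np.linalg.inv(Pinv_i)
        new_theta[i] = new_P[i] @ np.einsum('j,jkl,jl->k', A[i], bar_Pinv, bar_theta)
    return new_theta, new_P
```

In practice \bm{\bar{P}}_{t+1,i} would be propagated through the rank-one update (12) rather than by explicit inversion; the inverses above are kept only to mirror equations (10), (13) and (14).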

In order to analyze the distributed FFLS algorithm, we need to derive the estimation error equation. Denote \bm{\widetilde{\theta}}_{t,i}\triangleq\bm{\theta}_{t}-\bm{\hat{\theta}}_{t,i}; then from (13) and (14), we have

\bm{\widetilde{\theta}}_{t+1,i} = \bm{\theta}_{t+1}-\bm{P}_{t+1,i}\sum_{j\in\mathcal{N}_{i}}a_{ij}\bm{\bar{P}}^{-1}_{t+1,j}\bm{\bar{\theta}}_{t+1,j}
= \bm{P}_{t+1,i}\sum_{j\in\mathcal{N}_{i}}a_{ij}\bm{\bar{P}}^{-1}_{t+1,j}\bm{\theta}_{t+1}-\bm{P}_{t+1,i}\sum_{j\in\mathcal{N}_{i}}a_{ij}\bm{\bar{P}}^{-1}_{t+1,j}\bm{\bar{\theta}}_{t+1,j}
= \bm{P}_{t+1,i}\sum_{j\in\mathcal{N}_{i}}a_{ij}\bm{\bar{P}}^{-1}_{t+1,j}(\bm{\theta}_{t+1}-\bm{\bar{\theta}}_{t+1,j}).   (15)

By (1), (2), (11) and (12), we can obtain the following equation,

\bm{\theta}_{t+1}-\bm{\bar{\theta}}_{t+1,i}
= \bm{\theta}_{t}+\Delta\bm{\theta}_{t}-\bm{\hat{\theta}}_{t,i}-\frac{\bm{P}_{t,i}\bm{\varphi}_{t,i}}{\alpha+\bm{\varphi}^{T}_{t,i}\bm{P}_{t,i}\bm{\varphi}_{t,i}}(y_{t+1,i}-\bm{\varphi}^{T}_{t,i}\bm{\hat{\theta}}_{t,i})
= \Big{(}\bm{I}_{m}-\frac{\bm{P}_{t,i}\bm{\varphi}_{t,i}\bm{\varphi}^{T}_{t,i}}{\alpha+\bm{\varphi}^{T}_{t,i}\bm{P}_{t,i}\bm{\varphi}_{t,i}}\Big{)}\bm{\widetilde{\theta}}_{t,i}-\frac{\bm{P}_{t,i}\bm{\varphi}_{t,i}w_{t+1,i}}{\alpha+\bm{\varphi}^{T}_{t,i}\bm{P}_{t,i}\bm{\varphi}_{t,i}}+\Delta\bm{\theta}_{t}
= \alpha\bm{\bar{P}}_{t+1,i}\bm{P}^{-1}_{t,i}\bm{\widetilde{\theta}}_{t,i}-\frac{\bm{P}_{t,i}\bm{\varphi}_{t,i}w_{t+1,i}}{\alpha+\bm{\varphi}^{T}_{t,i}\bm{P}_{t,i}\bm{\varphi}_{t,i}}+\Delta\bm{\theta}_{t}.   (16)

For convenience of analysis, we introduce the following set of notations,

\bm{Y}_{t}={\rm col}\{y_{t,1},\cdots,y_{t,n}\},   (n\times 1)
\bm{\Phi}_{t}={\rm diag}\{\bm{\varphi}_{t,1},\cdots,\bm{\varphi}_{t,n}\},   (mn\times n)
\bm{W}_{t}={\rm col}\{w_{t,1},\cdots,w_{t,n}\},   (n\times 1)
\bm{P}_{t}={\rm diag}\{\bm{P}_{t,1},\cdots,\bm{P}_{t,n}\},   (mn\times mn)
\bm{\bar{P}}_{t}={\rm diag}\{\bm{\bar{P}}_{t,1},\cdots,\bm{\bar{P}}_{t,n}\},   (mn\times mn)
\bm{\Theta}_{t}={\rm col}\{\underbrace{\bm{\theta}_{t},\cdots,\bm{\theta}_{t}}_{n}\},   (mn\times 1)
\Delta\bm{\Theta}_{t}={\rm col}\{\underbrace{\Delta\bm{\theta}_{t},\cdots,\Delta\bm{\theta}_{t}}_{n}\},   (mn\times 1)
\bm{L}_{t}={\rm diag}\{\bm{L}_{t,1},\cdots,\bm{L}_{t,n}\},~{\rm where}~\bm{L}_{t,i}=\frac{\bm{P}_{t,i}\bm{\varphi}_{t,i}}{\alpha+\bm{\varphi}^{T}_{t,i}\bm{P}_{t,i}\bm{\varphi}_{t,i}},   (mn\times n)
\bm{\widetilde{\Theta}}_{t}={\rm col}\{\bm{\widetilde{\theta}}_{t,1},\cdots,\bm{\widetilde{\theta}}_{t,n}\},   (mn\times 1)
\mathscr{A}=\mathcal{A}\otimes\bm{I}_{m}.   (mn\times mn)

Hence by (15) and (16), we have the following equation about estimation error,

\bm{\widetilde{\Theta}}_{t+1}=\alpha\bm{P}_{t+1}\mathscr{A}\bm{P}^{-1}_{t}\bm{\widetilde{\Theta}}_{t}-\bm{P}_{t+1}\mathscr{A}\bm{\bar{P}}^{-1}_{t+1}(\bm{L}_{t}\bm{W}_{t+1}+\Delta\bm{\Theta}_{t}).   (17)

From (17), we see that the properties of the product of random matrices \prod_{t}\alpha\bm{P}_{t+1}\mathscr{A}\bm{P}^{-1}_{t} play an important role in the stability analysis of the homogeneous part of the error equation.
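Note that the successive factors telescope, since each \bm{P}^{-1}_{k}\bm{P}_{k} cancels: \prod^{t}_{k=j}\alpha\bm{P}_{k+1}\mathscr{A}\bm{P}^{-1}_{k}=\alpha^{t-j+1}\bm{P}_{t+1}\mathscr{A}^{t-j+1}\bm{P}^{-1}_{j}, an identity used repeatedly in Section 4. A quick numerical check of this identity (a sketch with arbitrary invertible matrices standing in for \bm{P}_{k}):

```python
import numpy as np

# Verify the telescoping of prod_{k=j}^{t} alpha P_{k+1} Ascr P_k^{-1}.
rng = np.random.default_rng(2)
n, m, alpha, j, t = 3, 2, 0.9, 0, 4
A = rng.random((n, n)); A /= A.sum(axis=1, keepdims=True)    # stochastic A
Ascr = np.kron(A, np.eye(m))                                 # A tensor I_m
Ps = [np.eye(n * m) + 0.1 * rng.standard_normal((n * m, n * m))
      for _ in range(t + 2)]                                 # arbitrary invertible P_k
prod = np.eye(n * m)
for k in range(j, t + 1):                                    # later factors multiply on the left
    prod = alpha * Ps[k + 1] @ Ascr @ np.linalg.inv(Ps[k]) @ prod
closed = (alpha ** (t - j + 1) * Ps[t + 1]
          @ np.linalg.matrix_power(Ascr, t - j + 1) @ np.linalg.inv(Ps[j]))
assert np.allclose(prod, closed)
```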

The analysis of products of random matrices is in general a difficult mathematical problem when the random matrices satisfy neither independence nor stationarity assumptions. Existing work on this problem focuses on either the symmetric random matrix case or the scalar gain case. For example, [21] and [16] investigated the convergence of a consensus-diffusion SG algorithm and the stability of a consensus normalized LMS algorithm, where the random matrices in the error equations are symmetric. Note that the random matrices \alpha\bm{P}_{t+1}\mathscr{A}\bm{P}^{-1}_{t} here are asymmetric. Although [23] studied the properties of the asymmetric random matrices in an LMS-based estimation error equation, the adaptive gain of the distributed LMS algorithm in [23] is a scalar, while the gain \frac{\bm{P}_{t,i}}{\alpha+\bm{\varphi}^{T}_{t,i}\bm{P}_{t,i}\bm{\varphi}_{t,i}} in (11) of this paper is a random matrix. Hence the methods used in the existing literature, including [16, 21, 23], are no longer applicable to our case. One of the main purposes of this paper is to overcome these difficulties by using both the specific structure of the diffusion FFLS algorithm and some results on FFLS in the single sensor case (see [34]).

4 Stability of distributed FFLS algorithm under fixed undirected graph

In this section, we establish the exponential stability of the homogeneous part of the error equation (17) and the tracking error bounds for the proposed distributed FFLS algorithm (Algorithm 2), without requiring statistical independence of the system signals. For this purpose, we introduce some definitions on the stability of random matrices (see [34]) and some assumptions on the graph and the random regression vectors.

4.1 Some definitions

Definition 4.1

A random matrix sequence \{\bm{A}_{t},t\geq 0\} defined on the basic probability space (\Omega,\mathscr{F},P) is called L_{p}-stable (p>0) if \sup_{t\geq 0}\mathbb{E}(\|\bm{A}_{t}\|^{p})<\infty, where \mathbb{E}(\cdot) denotes the mathematical expectation operator. We define \|\bm{A}_{t}\|_{L_{p}}\triangleq[\mathbb{E}(\|\bm{A}_{t}\|^{p})]^{\frac{1}{p}} as the L_{p}-norm of the random matrix \bm{A}_{t}.

Definition 4.2

A sequence of n\times n random matrices \bm{A}=\{\bm{A}_{t},t\geq 0\} is called L_{p}-exponentially stable (p\geq 0) with parameter \lambda\in[0,1) if it belongs to the following set

S_{p}(\lambda)=\Big\{\bm{A}:\Big\|\prod^{t}_{j=k+1}\bm{A}_{j}\Big\|_{L_{p}}\leq M\lambda^{t-k},~\forall t\geq k,~\forall k\geq 0,~{\rm for~some}~M>0\Big\}.   (18)

As demonstrated by Guo in [34], \{\bm{A}_{t},t\geq 0\}\in S_{p}(\lambda) is in some sense the necessary and sufficient condition for the stability of \{\bm{x}_{t}\} generated by \bm{x}_{t+1}=\bm{A}_{t}\bm{x}_{t}+\bm{\xi}_{t+1},~t\geq 0. Also, the stability analysis of a matrix sequence may be reduced to that of a certain class of scalar sequences, which can be further analyzed based on some excitation conditions on the regressors. To this end, we introduce the following subset of S_{1}(\lambda) for a scalar sequence a=(a_{t},t\geq 0):

S^{0}(\lambda)=\Big\{a:a_{t}\in[0,1),~\mathbb{E}\Big(\prod^{t}_{j=k+1}a_{j}\Big)\leq M\lambda^{t-k},~\forall t\geq k,~\forall k\geq 0,~{\rm for~some}~M>0\Big\}.

The set S^{0}(\lambda) will be used when we reduce the product of random matrices to the product of a scalar sequence.

Remark 4.1

It is clear that if there exists a constant a_{0}\in(0,1) such that a_{t}\leq a_{0} for all t, then \{a_{t}\}\in S^{0}(a_{0}). More properties of the set S^{0}(\lambda) can be found in [35].

4.2 Assumptions

Assumption 4.1

The undirected graph \mathcal{G} is connected.

Remark 4.2

For any k>1, we denote \mathcal{A}^{k}\triangleq(a_{ij}^{(k)}) with \mathcal{A} being the weighted adjacency matrix of the graph \mathcal{G}, i.e., a_{ij}^{(k)} is the i-th row, j-th column element of the matrix \mathcal{A}^{k}. Under Assumption 4.1, it is clear that \mathcal{A}^{k} is a positive matrix for k\geq D_{\mathcal{G}}, which means that a_{ij}^{(k)}>0 for all i and j (cf., [36]).

Assumption 4.2 (Cooperative Excitation Condition)

For the adapted sequences \{\bm{\varphi}_{t,i},\mathscr{F}_{t},t\geq 0\}, where \{\mathscr{F}_{t}\} is a sequence of non-decreasing \sigma-algebras, there exists an integer h>0 such that \{1-\lambda_{t}\}\in S^{0}(\lambda) for some \lambda\in(0,1), where \lambda_{t} is defined by

\lambda_{t}\triangleq\lambda_{\min}\left[\mathbb{E}\left(\frac{1}{n(1+h)}\sum^{n}_{i=1}\sum^{(t+1)h}_{k=th+1}\frac{\bm{\varphi}_{k,i}\bm{\varphi}^{T}_{k,i}}{1+\|\bm{\varphi}_{k,i}\|^{2}}\Big{|}\mathscr{F}_{th}\right)\right]

with \mathbb{E}(\cdot|\cdot) being the conditional mathematical expectation operator.

Remark 4.3

Assumption 4.2 is also used to guarantee the stability and performance of distributed LMS algorithms (see, e.g., [16, 23]). We give some intuitive explanations of the above cooperative excitation condition from the following two aspects.

(1) “Why excitation”. Consider an extreme case where all regression vectors \bm{\varphi}_{k,i} are equal to zero; then Assumption 4.2 cannot be satisfied. Moreover, from (1), we see that the unknown parameter \bm{\theta}_{t} cannot be estimated or tracked in this case, since the observations y_{t,i} do not contain any information about \bm{\theta}_{t}. In order to estimate \bm{\theta}_{t}, some nonzero information condition (named an excitation condition) should be imposed on the regression vectors \bm{\varphi}_{t,i}. In fact, Assumption 4.2 intuitively gives a lower bound (which may change over time) on the sequence \{\lambda_{t}\}. For example, if there exists a constant \lambda_{0}\in(0,1) such that \inf_{t}\lambda_{t}\geq\lambda_{0}, then by Remark 4.1, Assumption 4.2 is satisfied.

(2) “Why cooperative”. Compare Assumption 4.2 with the excitation condition for the FFLS algorithm in the single sensor case in [34], i.e., there exists a constant h>0 such that

\{1-\lambda'_{t},t\geq 0\}\in S^{0}(\lambda')   (19)

for some \lambda', where

\lambda'_{t}=\lambda_{\min}\left[\mathbb{E}\left(\frac{1}{1+h}\sum^{(t+1)h}_{k=th+1}\frac{\bm{\varphi}_{k,i}\bm{\varphi}^{T}_{k,i}}{1+\|\bm{\varphi}_{k,i}\|^{2}}\Big{|}\mathscr{F}_{th}\right)\right].

Assumption 4.2 contains not only temporal union information but also spatial union information of all the sensors, which means that Assumption 4.2 is much weaker than condition (19), since \lambda_{t}\geq\lambda'_{t} when n>1. Besides, Assumption 4.2 reduces to condition (19) when n=1. In fact, Assumption 4.2 reflects the cooperative effect of multiple sensors in the sense that the estimation task can still be fulfilled by the cooperation of multiple sensors even if none of them can fulfill it individually.
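To make the cooperative effect concrete, the following toy computation (our own illustration) takes n = m = 3 sensors where sensor i only ever measures the i-th coordinate, \bm{\varphi}_{t,i}\equiv e_{i}. Each individual sensor then fails the single-sensor condition (19), since its temporal average is rank one, while the spatio-temporal average in Assumption 4.2 is a positive multiple of the identity (the regressors here are deterministic, so the conditional expectations are trivial and \lambda_{t} is a positive constant, which suffices by Remark 4.1):

```python
import numpy as np

# Sensor i only measures coordinate i: phi_{t,i} = e_i for all t.
n = m = 3; h = 2
basis = np.eye(m)
single = np.zeros((m, m))    # one sensor's temporal average (rank one)
joint = np.zeros((m, m))     # temporal and spatial average over all sensors
for k in range(h):                                   # h consecutive time steps
    for i in range(n):
        R = np.outer(basis[i], basis[i]) / 2.0       # phi phi^T / (1 + |phi|^2)
        joint += R / (n * (1 + h))
        if i == 0:
            single += R / (1 + h)
print(np.linalg.eigvalsh(single).min())   # 0: sensor 1 alone is not excited
print(np.linalg.eigvalsh(joint).min())    # > 0: the network jointly is
```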

4.3 Main results

In order to establish the exponential stability of the product of the random matrices \alpha\bm{P}_{t+1}\mathscr{A}\bm{P}^{-1}_{t}, we first analyze the properties of the random matrix \bm{P}_{t} to obtain an upper bound.

Lemma 4.1

For \{\bm{P}_{t}\} generated by (12) and (13), under Assumptions 4.1-4.2, we have

T_{t+1}\leq\frac{1}{\alpha^{h^{\prime}}}(1-\beta_{t+1})(h^{\prime}-D_{\mathcal{G}})tr(\bm{P}_{th^{\prime}+1}),   (20)

where

T_{t}\triangleq\sum^{th^{\prime}}_{k=(t-1)h^{\prime}+D_{\mathcal{G}}+1}tr(\bm{P}_{k+1}),~~T_{0}=0,
\beta_{t+1}\triangleq\frac{a^{2}_{\min}\gamma_{t+1}}{n(h^{\prime}-D_{\mathcal{G}})\left(\alpha^{h^{\prime}}+\lambda_{\max}\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)\right)tr(\bm{P}_{th^{\prime}+1})},
\gamma_{t+1}\triangleq tr\left(\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)^{2}\sum^{(t+1)h^{\prime}}_{k=th^{\prime}+D_{\mathcal{G}}+1}\sum^{n}_{j=1}\frac{\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}}{1+\|\bm{\varphi}_{k,j}\|^{2}}\right),
a_{\min}\triangleq\min\limits_{i,j\in\{1,\cdots,n\}}a^{(D_{\mathcal{G}})}_{ij}>0,
h^{\prime}\triangleq 2h+D_{\mathcal{G}},

and h is given by Assumption 4.2.

Proof 4.1.

Note that a^{(k)}_{ij} is the i-th row, j-th column element of the matrix \mathcal{A}^{k}, k\geq 1, where a^{(1)}_{ij}=a_{ij}. By (10), we have \bm{P}^{-1}_{k+1,i}\geq\sum^{n}_{j=1}a_{ij}\alpha\bm{P}^{-1}_{k,j}. Hence by the inequality

\Big{(}\sum^{n}_{j=1}a_{ij}{\bm{A}_{j}}\Big{)}^{-1}\leq\sum^{n}_{j=1}a_{ij}{\bm{A}^{-1}_{j}}   (21)

with \bm{A}_{j}>0, we obtain for any t\geq 0 and any k\in[th^{\prime}+D_{\mathcal{G}}+1,(t+1)h^{\prime}],

\bm{P}_{k,i} \leq \Big{(}\sum^{n}_{j=1}a_{ij}{\alpha\bm{P}^{-1}_{k-1,j}}\Big{)}^{-1} \leq \frac{1}{\alpha}\sum^{n}_{j=1}a_{ij}{\bm{P}_{k-1,j}}
\leq \frac{1}{\alpha}\sum^{n}_{j=1}a_{ij}\left(\frac{1}{\alpha}\sum^{n}_{l=1}a_{jl}\bm{P}_{k-2,l}\right)
= \frac{1}{\alpha^{2}}\sum^{n}_{j=1}a^{(2)}_{ij}\bm{P}_{k-2,j} \leq \cdots
\leq \frac{1}{\alpha^{k-th^{\prime}-1}}\sum^{n}_{j=1}a^{(k-th^{\prime}-1)}_{ij}\bm{P}_{th^{\prime}+1,j}
\leq \frac{1}{\alpha^{h^{\prime}-1}}\sum^{n}_{j=1}a^{(k-th^{\prime}-1)}_{ij}\bm{P}_{th^{\prime}+1,j}.   (22)

Denote \bm{Q}^{k,th^{\prime}}_{i}=\sum^{n}_{j=1}a^{(k-th^{\prime}-1)}_{ij}\bm{P}_{th^{\prime}+1,j}. Then by (10), (13), (21) and (22), we have for k\in[th^{\prime}+D_{\mathcal{G}}+1,(t+1)h^{\prime}],

\bm{P}_{k+1,i} = \left(\sum^{n}_{j=1}a_{ij}(\alpha{\bm{P}}^{-1}_{k,j}+\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j})\right)^{-1}
\leq \sum^{n}_{j=1}a_{ij}(\alpha{\bm{P}}^{-1}_{k,j}+\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j})^{-1}
\leq \sum^{n}_{j=1}a_{ij}\left(\alpha\left(\frac{1}{\alpha^{h^{\prime}-1}}\bm{Q}^{k,th^{\prime}}_{j}\right)^{-1}+\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}\right)^{-1}.   (23)

By Lemma 2.1 and (23), it follows that

\bm{P}_{k+1,i} \leq \frac{1}{\alpha^{h^{\prime}}}\sum^{n}_{j=1}a_{ij}\Bigg{(}\bm{Q}^{k,th^{\prime}}_{j}-\frac{\bm{Q}^{k,th^{\prime}}_{j}\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}\bm{Q}^{k,th^{\prime}}_{j}}{\alpha^{h^{\prime}}+\bm{\varphi}^{T}_{k,j}\bm{Q}^{k,th^{\prime}}_{j}\bm{\varphi}_{k,j}}\Bigg{)}
= \frac{1}{\alpha^{h^{\prime}}}\sum^{n}_{j=1}a^{(k-th^{\prime})}_{ij}\bm{P}_{th^{\prime}+1,j}-\frac{1}{\alpha^{h^{\prime}}}\sum^{n}_{j=1}a_{ij}\frac{\bm{Q}^{k,th^{\prime}}_{j}\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}\bm{Q}^{k,th^{\prime}}_{j}}{\alpha^{h^{\prime}}+\bm{\varphi}^{T}_{k,j}\bm{Q}^{k,th^{\prime}}_{j}\bm{\varphi}_{k,j}}
\leq \frac{1}{\alpha^{h^{\prime}}}\sum^{n}_{j=1}a^{(k-th^{\prime})}_{ij}\bm{P}_{th^{\prime}+1,j}-\frac{1}{\alpha^{h^{\prime}}}\sum^{n}_{j=1}\frac{a_{ij}\bm{Q}^{k,th^{\prime}}_{j}\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}\bm{Q}^{k,th^{\prime}}_{j}}{\alpha^{h^{\prime}}+\lambda_{\max}(\bm{Q}^{k,th^{\prime}}_{j})(1+\|\bm{\varphi}_{k,j}\|^{2})}.   (24)

Then by (24), we have

tr(\bm{P}_{k+1})=tr\Bigg{(}\sum^{n}_{i=1}\bm{P}_{k+1,i}\Bigg{)}
\leq \frac{1}{\alpha^{h^{\prime}}}tr\Bigg{(}\sum^{n}_{i=1}\sum^{n}_{j=1}a^{(k-th^{\prime})}_{ij}\bm{P}_{th^{\prime}+1,j}\Bigg{)}-\frac{1}{\alpha^{h^{\prime}}}tr\Bigg{(}\sum^{n}_{i=1}\sum^{n}_{j=1}a_{ij}\frac{\bm{Q}^{k,th^{\prime}}_{j}\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}\bm{Q}^{k,th^{\prime}}_{j}}{\alpha^{h^{\prime}}+\lambda_{\max}(\bm{Q}^{k,th^{\prime}}_{j})(1+\|\bm{\varphi}_{k,j}\|^{2})}\Bigg{)}
= \frac{1}{\alpha^{h^{\prime}}}\Bigg{(}tr(\bm{P}_{th^{\prime}+1})-\sum^{n}_{j=1}\frac{tr\left(\bm{Q}^{k,th^{\prime}}_{j}\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}\bm{Q}^{k,th^{\prime}}_{j}\right)}{\alpha^{h^{\prime}}+\lambda_{\max}(\bm{Q}^{k,th^{\prime}}_{j})(1+\|\bm{\varphi}_{k,j}\|^{2})}\Bigg{)}.

Combining this with the inequality \sum^{n}_{j=1}\frac{a_{j}}{b_{j}}\geq\frac{\sum^{n}_{j=1}a_{j}}{\sum^{n}_{j=1}b_{j}}, where a_{j}\geq 0 and b_{j}>0, we obtain

tr(\bm{P}_{k+1}) \leq \frac{1}{\alpha^{h^{\prime}}}\left(tr(\bm{P}_{th^{\prime}+1})-\frac{tr\left(\sum^{n}_{j=1}\left(\bm{Q}^{k,th^{\prime}}_{j}\right)^{2}\frac{\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}}{1+\|\bm{\varphi}_{k,j}\|^{2}}\right)}{\sum^{n}_{j=1}\left(\alpha^{h^{\prime}}+\lambda_{\max}\left(\bm{Q}^{k,th^{\prime}}_{j}\right)\right)}\right).   (25)

By Remark 4.2, we know that a^{(k)}_{ij}\geq a_{\min} holds for all k\geq D_{\mathcal{G}}. Thus, by (25), we have for k\in[th^{\prime}+D_{\mathcal{G}}+1,(t+1)h^{\prime}],

tr(\bm{P}_{k+1}) \leq \frac{1}{\alpha^{h^{\prime}}}\Bigg{(}tr(\bm{P}_{th^{\prime}+1})-\frac{a^{2}_{\min}tr\left(\sum^{n}_{j=1}\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)^{2}\frac{\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}}{1+\|\bm{\varphi}_{k,j}\|^{2}}\right)}{n\left(\alpha^{h^{\prime}}+\lambda_{\max}\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)\right)}\Bigg{)}.   (26)

Summing both sides of (26) from th^{\prime}+D_{\mathcal{G}}+1 to (t+1)h^{\prime} and using the definition of \beta_{t+1}, we have

T_{t+1} = \sum^{(t+1)h^{\prime}}_{k=th^{\prime}+D_{\mathcal{G}}+1}tr(\bm{P}_{k+1}) \leq \frac{1}{\alpha^{h^{\prime}}}(1-\beta_{t+1})(h^{\prime}-D_{\mathcal{G}})tr(\bm{P}_{th^{\prime}+1}).

This completes the proof of the lemma.

Before establishing the boundedness of the random matrix \bm{P}_{t}, we first introduce two lemmas from [34].

Lemma 4.2.

[34] Let \{1-\xi_{t}\}\in S^{0}(\lambda) and 0<\xi_{t}\leq\xi^{*}<1, where \xi^{*} is a positive constant. Then for any \varepsilon\in(0,1), \{1-\varepsilon\xi_{t}\}\in S^{0}(\lambda^{(1-\xi^{*})\varepsilon}).

Lemma 4.3.

[34] Let \{x_{t},\mathscr{F}_{t}\} be an adapted process satisfying

x_{t+1}\leq\xi_{t+1}x_{t}+\eta_{t+1},~~t\geq 0,~~\mathbb{E}x^{2}_{0}<\infty,

where \{\xi_{t},\mathscr{F}_{t}\} and \{\eta_{t},\mathscr{F}_{t}\} are two adapted nonnegative processes with the properties:

\xi_{t}\geq\varepsilon_{0}>0,~~\forall t,
\mathbb{E}(\eta^{2}_{t+1}|\mathscr{F}_{t})\leq N<\infty,~~\forall t,
\left\|\prod^{t}_{k=j}\mathbb{E}(\xi^{4}_{k+1}|\mathscr{F}_{k})\right\|\leq M\eta^{t-j+1},~~\forall t\geq j,~\forall j,

where \varepsilon_{0}, M, N and \eta\in(0,1) are constants. Then we have

(i)~\left\|\prod^{t}_{k=j}\xi_{k}\right\|_{L_{2}}\leq M^{\frac{1}{4}}\eta^{\frac{1}{4}(t-j+1)},~~\forall t\geq j,~\forall j;
(ii)~\sup_{t}\mathbb{E}(\|x_{t}\|)<\infty.

The following lemma establishes the boundedness of the random matrix sequence \{\bm{P}_{t}\}.

Lemma 4.4.

For \{\bm{P}_{t}\} generated by (12) and (13), under Assumptions 4.1-4.2, \bm{P}_{t} is L_{p}-stable for any p\geq 1, i.e.,

\sup_{t\geq 0}\mathbb{E}(\|\bm{P}_{t}\|^{p})<\infty,

provided that \lambda^{\frac{a^{2}_{\min}}{32pmh(4h+D_{\mathcal{G}}-1)}}<\alpha<1, where \lambda and h are given by Assumption 4.2, and m is the dimension of \bm{\varphi}_{t,i}.

Proof 4.5.

For any t\geq 0, there exists an integer z_{t}=\lfloor\frac{th^{\prime}+D_{\mathcal{G}}}{h}\rfloor+1 such that

(z_{t}-1)h\leq th^{\prime}+D_{\mathcal{G}}+1\leq z_{t}h+1.   (27)

By the definition of \beta_{t+1} in Lemma 4.1, it is clear that

\beta_{t+1} \geq \frac{a^{2}_{\min}tr\Big{(}\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)^{2}\sum^{(z_{t}+1)h}_{k=z_{t}h+1}\sum^{n}_{j=1}\frac{\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}}{1+\|\bm{\varphi}_{k,j}\|^{2}}\Big{)}}{n(h^{\prime}-D_{\mathcal{G}})\left(\alpha^{h^{\prime}}+\lambda_{\max}\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)\right)tr(\bm{P}_{th^{\prime}+1})} \triangleq b_{t+1}.   (28)

Hence by Lemma 4.1 and (28), we obtain

T_{t+1}\leq\frac{1}{\alpha^{h^{\prime}}}(1-b_{t+1})(h^{\prime}-D_{\mathcal{G}})tr(\bm{P}_{th^{\prime}+1}).   (29)

By the inequality \bm{P}_{k,i}\leq\frac{1}{\alpha}\sum^{n}_{j=1}a_{ij}{\bm{P}_{k-1,j}} used in (22), it follows that

(h^{\prime}-D_{\mathcal{G}})tr(\bm{P}_{th^{\prime}+1}) = \sum^{th^{\prime}}_{k=(t-1)h^{\prime}+D_{\mathcal{G}}+1}tr(\bm{P}_{th^{\prime}+1})
= \sum^{th^{\prime}}_{k=(t-1)h^{\prime}+D_{\mathcal{G}}+1}\sum^{n}_{i=1}tr(\bm{P}_{th^{\prime}+1,i})
\leq \sum^{th^{\prime}}_{k=(t-1)h^{\prime}+D_{\mathcal{G}}+1}\sum^{n}_{i=1}tr\left(\frac{1}{\alpha^{th^{\prime}-k}}\sum^{n}_{j=1}a^{(th^{\prime}-k)}_{ij}\bm{P}_{k+1,j}\right)
\leq \frac{1}{\alpha^{h^{\prime}-D_{\mathcal{G}}-1}}\sum^{th^{\prime}}_{k=(t-1)h^{\prime}+D_{\mathcal{G}}+1}tr(\bm{P}_{k+1})=\frac{1}{\alpha^{h^{\prime}-D_{\mathcal{G}}-1}}T_{t}.

Hence by (29), we have

T_{t+1}\leq\frac{1}{\alpha^{2h^{\prime}-D_{\mathcal{G}}-1}}(1-b_{t+1})T_{t}.   (30)

For p\geq 1, denote

c_{t+1}=\frac{1}{\alpha^{p(2h^{\prime}-D_{\mathcal{G}}-1)}}\left(1-\frac{b_{t+1}}{2}\right)I_{\{tr(\bm{P}_{th^{\prime}+1})\geq 1\}},   (31)

where I_{\{\cdot\}} denotes the indicator function, whose value is 1 if the condition in the braces holds and 0 otherwise. Then by (29) and (30), we have

T^{p}_{t+1} \leq T^{p}_{t+1}\left(I_{\{tr(\bm{P}_{th^{\prime}+1})\geq 1\}}+I_{\{tr(\bm{P}_{th^{\prime}+1})\leq 1\}}\right)   (32)
\leq \frac{1}{\alpha^{p(2h^{\prime}-D_{\mathcal{G}}-1)}}(1-b_{t+1})^{p}T^{p}_{t}I_{\{tr(\bm{P}_{th^{\prime}+1})\geq 1\}}+T^{p}_{t+1}I_{\{tr(\bm{P}_{th^{\prime}+1})\leq 1\}}
\leq c_{t+1}T^{p}_{t}+\frac{1}{\alpha^{ph^{\prime}}}(h^{\prime}-D_{\mathcal{G}})^{p}.

Denote

\bm{H}_{z_{t}}=\mathbb{E}\left(\sum^{(z_{t}+1)h}_{k=z_{t}h+1}\sum^{n}_{j=1}\frac{\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}}{1+\|\bm{\varphi}_{k,j}\|^{2}}\Bigg{|}\mathscr{F}_{z_{t}h}\right).

By the inequality

tr\left(\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)^{2}\right)\geq m^{-1}\left(tr\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)\right)^{2}

and \bm{P}_{th^{\prime}+1,l}\in\mathscr{F}_{th^{\prime}}\subset\mathscr{F}_{z_{t}h}, from the definition of b_{t+1} in (28) we can conclude the following inequality,

\mathbb{E}(b_{t+1}|\mathscr{F}_{z_{t}h}) = \frac{a^{2}_{\min}tr\left[\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)^{2}\bm{H}_{z_{t}}\right]}{n(h^{\prime}-D_{\mathcal{G}})\left(\alpha^{h^{\prime}}+\lambda_{\max}\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)\right)tr(\bm{P}_{th^{\prime}+1})}
\geq \frac{a^{2}_{\min}\left(tr(\bm{P}_{th^{\prime}+1})\right)^{2}\lambda_{\min}(\bm{H}_{z_{t}})}{mn(h^{\prime}-D_{\mathcal{G}})\left(\alpha^{h^{\prime}}+\lambda_{\max}\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)\right)tr(\bm{P}_{th^{\prime}+1})}
\geq \frac{a^{2}_{\min}tr(\bm{P}_{th^{\prime}+1})\lambda_{z_{t}}(1+h)}{m(h^{\prime}-D_{\mathcal{G}})\left(\alpha^{h^{\prime}}+\lambda_{\max}\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)\right)}
\geq \frac{a^{2}_{\min}tr(\bm{P}_{th^{\prime}+1})\lambda_{z_{t}}(1+h)}{m(h^{\prime}-D_{\mathcal{G}})\left(1+tr(\bm{P}_{th^{\prime}+1})\right)}
\geq \frac{a^{2}_{\min}\lambda_{z_{t}}(1+h)}{2m(h^{\prime}-D_{\mathcal{G}})}~~~~{\rm on}~~\{tr(\bm{P}_{th^{\prime}+1})\geq 1\}.   (33)

Hence by the definition of c_{t+1} in (31),

\mathbb{E}(c_{t+1}|\mathscr{F}_{z_{t}h}) \leq \frac{1}{\alpha^{p(2h^{\prime}-D_{\mathcal{G}}-1)}}\left(1-\frac{a^{2}_{\min}\lambda_{z_{t}}(1+h)}{4m(h^{\prime}-D_{\mathcal{G}})}\right)I_{\{tr(\bm{P}_{th^{\prime}+1})\geq 1\}}.   (34)

Denote

d_{t+1}=\begin{cases}c_{t+1},&tr(\bm{P}_{th^{\prime}+1})\geq 1;\\\frac{1}{\alpha^{p(2h^{\prime}-D_{\mathcal{G}}-1)}}\left(1-\frac{a^{2}_{\min}\lambda_{z_{t}}(1+h)}{4m(h^{\prime}-D_{\mathcal{G}})}\right),&{\rm otherwise.}\end{cases}

Then by (32) and (34), we have

T^{p}_{t+1}\leq d_{t+1}T^{p}_{t}+\frac{1}{\alpha^{ph^{\prime}}}(h^{\prime}-D_{\mathcal{G}})^{p}.   (35)

Since \lambda_{z_{t}}\leq\frac{h}{1+h} and b_{t+1}\leq\frac{a^{2}_{\min}h}{h^{\prime}-D_{\mathcal{G}}}, we know that d_{t+1}\geq\varepsilon_{0} for some positive constant \varepsilon_{0}. Denote \mathscr{B}_{t}\triangleq\mathscr{F}_{z_{t}h}; then by the definition of z_{t}, it is clear that z_{t+1}\geq z_{t}+2. Thus, we obtain that d_{t+1}\in\mathscr{F}_{(z_{t}+1)h}\subset\mathscr{B}_{t+1}. Similar to the analysis of (34), we have

\mathbb{E}(c^{4}_{t+1}|\mathscr{B}_{t})\leq\frac{1}{\alpha^{4p(2h^{\prime}-D_{\mathcal{G}}-1)}}\left(1-\frac{a^{2}_{\min}\lambda_{z_{t}}(1+h)}{4m(h^{\prime}-D_{\mathcal{G}})}\right).   (36)

Hence by the definition of d_{t+1}, it follows that

\Big\|\prod^{t}_{k=j}\mathbb{E}(d^{4}_{k+1}|\mathscr{B}_{k})\Big\|_{L_{1}} \leq \Big\|\prod^{t}_{k=j}\left(\frac{1}{\alpha^{4p(2h^{\prime}-D_{\mathcal{G}}-1)}}\left(1-\frac{a^{2}_{\min}\lambda_{z_{k}}(1+h)}{8mh}\right)\right)\Big\|_{L_{1}}.   (37)

By Assumption 4.2 and the fact that \lambda_{z_{k}}\leq\frac{h}{1+h}, applying Lemma 4.2 we obtain \{1-\frac{a^{2}_{\min}\lambda_{z_{k}}(1+h)}{8mh}\}\in S^{0}\big{(}\lambda^{\frac{a^{2}_{\min}}{8mh}}\big{)}. Hence by (37), we see that there exists a positive constant N such that

\Big\|\prod^{t}_{k=j}\mathbb{E}(d^{4}_{k+1}|\mathscr{B}_{k})\Big\|_{L_{1}}\leq N\lambda_{1}^{t-j+1},

where \lambda_{1}=\frac{1}{\alpha^{4p(2h^{\prime}-D_{\mathcal{G}}-1)}}\lambda^{\frac{a^{2}_{\min}}{8mh}}\in(0,1). Furthermore, by Lemma 4.3, we have \sup_{t}\mathbb{E}(T^{p}_{t})<\infty, which implies that \sup_{t\geq 0}\mathbb{E}(\|\bm{P}_{t}\|^{p})<\infty. This completes the proof.

We then establish the exponential stability of the homogeneous part of the error equation (17).

Theorem 4.6.

Consider the distributed FFLS algorithm in Algorithm 2. If the forgetting factor $\alpha$ satisfies $\lambda^{\frac{a^{2}_{\min}}{32pmh(4h+D_{\mathcal{G}}-1)}}<\alpha<1$ and $\sup_{t}\|\bm{\varphi}_{t,i}\|_{L_{6p}}<\infty$ for any $i\in\{1,\cdots,n\}$, then under Assumptions 4.1 and 4.2, for any $p\geq 1$, the sequence $\{\alpha\bm{P}_{t+1}\mathscr{A}\bm{P}^{-1}_{t}\}$ is $L_{p}$-exponentially stable.

Proof 4.7.

By (10) and (13), we have

\displaystyle\bm{P}^{-1}_{t+1,i}=\sum^{n}_{j=1}a_{ij}\left(\alpha\bm{P}^{-1}_{t,j}+\bm{\varphi}_{t,j}\bm{\varphi}^{T}_{t,j}\right).

Then we can obtain the following equation,

\displaystyle tr(\bm{P}^{-1}_{t+1})=tr\left(\sum^{n}_{i=1}\bm{P}^{-1}_{t+1,i}\right)=tr\left(\sum^{n}_{j=1}\left(\alpha\bm{P}^{-1}_{t,j}+\bm{\varphi}_{t,j}\bm{\varphi}^{T}_{t,j}\right)\right)=\alpha\,tr(\bm{P}^{-1}_{t})+\sum^{n}_{j=1}\|\bm{\varphi}_{t,j}\|^{2}.

By the Minkowski inequality, it follows that

\displaystyle\|tr(\bm{P}^{-1}_{t+1})\|_{L_{3p}}\leq\alpha\|tr(\bm{P}^{-1}_{t})\|_{L_{3p}}+O\left(\sum^{n}_{j=1}\|\bm{\varphi}_{t,j}\|^{2}_{L_{6p}}\right)
\leq\alpha^{t+1}\|tr(\bm{P}^{-1}_{0})\|_{L_{3p}}+O\left(\sum^{t}_{k=0}\alpha^{k}\right).

Hence we have

\displaystyle\sup_{t}\|\bm{P}^{-1}_{t+1}\|_{L_{3p}}<\infty. (38)

By Lemma 4.4, we derive that

\displaystyle\Big\|\prod^{t}_{k=j}\alpha\bm{P}_{k+1}\mathscr{A}\bm{P}^{-1}_{k}\Big\|_{L_{p}}=\mathbb{E}\left(\Big\|\prod^{t}_{k=j}\alpha\bm{P}_{k+1}\mathscr{A}\bm{P}^{-1}_{k}\Big\|^{p}\right)^{\frac{1}{p}}=\mathbb{E}\left(\big\|\alpha^{t-j+1}\bm{P}_{t+1}\mathscr{A}^{t-j+1}\bm{P}^{-1}_{j}\big\|^{p}\right)^{\frac{1}{p}}
\leq\alpha^{t-j+1}\|\bm{P}_{t+1}\|_{L_{2p}}\|\bm{P}^{-1}_{j}\|_{L_{2p}}=O(\alpha^{t-j+1}).

This completes the proof of the theorem.
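The first display in the proof is an exact trace recursion, $tr(\bm{P}^{-1}_{t+1})=\alpha\,tr(\bm{P}^{-1}_{t})+\sum^{n}_{j=1}\|\bm{\varphi}_{t,j}\|^{2}$, which relies only on the weight matrix having unit column sums. As a sanity check, the following minimal Python sketch verifies the identity numerically; the network size, regressor distribution, and the uniform doubly stochastic weights are illustrative assumptions, not taken from the paper.

```python
# Numeric check of the trace recursion used in the proof of Theorem 4.6:
#   tr(P^{-1}_{t+1}) = alpha * tr(P^{-1}_t) + sum_j ||phi_{t,j}||^2,
# where P^{-1}_{t+1,i} = sum_j a_{ij} (alpha P^{-1}_{t,j} + phi_{t,j} phi_{t,j}^T).
import numpy as np

rng = np.random.default_rng(1)
n, m, alpha = 5, 3, 0.95
A = np.full((n, n), 1.0 / n)            # illustrative doubly stochastic weights

P_inv = [np.eye(m) for _ in range(n)]   # P^{-1}_{0,i}
for t in range(10):
    phi = [rng.standard_normal(m) for _ in range(n)]
    new = [sum(A[i, j] * (alpha * P_inv[j] + np.outer(phi[j], phi[j]))
               for j in range(n)) for i in range(n)]
    lhs = sum(np.trace(M) for M in new)
    rhs = alpha * sum(np.trace(M) for M in P_inv) + sum(p @ p for p in phi)
    assert np.isclose(lhs, rhs)         # the identity holds exactly
    P_inv = new
```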

Based on Theorem 4.6, we further establish the tracking error bound of Algorithm 2 under some conditions on the noises and parameter variation.

Theorem 4.8.

Consider the model (1) and the error equation (17). Under the conditions of Theorem 4.6, if for some $p\geq 1$, $\sigma_{3p}\triangleq\sup_{t}(\|\bm{W}_{t}\|_{L_{3p}}+\|\Delta\bm{\Theta}_{t}\|_{L_{3p}})<\infty$, then there exists a constant $c$ such that

\displaystyle\limsup_{t\rightarrow\infty}\|\bm{\widetilde{\Theta}}_{t}\|_{L_{p}}\leq c\sigma_{3p}.
Proof 4.9.

For convenience of analysis, let the state transition matrix $\bm{\Psi}(t,k)$ be recursively defined by

\displaystyle\bm{\Psi}(t+1,k)=\alpha\bm{P}_{t+1}\mathscr{A}\bm{P}^{-1}_{t}\bm{\Psi}(t,k),\quad\bm{\Psi}(k,k)=\bm{I}_{mn}. (39)

It is clear that $\bm{\Psi}(t+1,k)=\alpha^{t-k+1}\bm{P}_{t+1}\mathscr{A}^{t-k+1}\bm{P}^{-1}_{k}$. From the definition of $\bm{L}_{t}$ and (10), we have $\bm{\bar{P}}^{-1}_{t+1}\bm{L}_{t}=\bm{\Phi}_{t}$. Then by (17), we have

\displaystyle\bm{\widetilde{\Theta}}_{t+1}=\alpha\bm{P}_{t+1}\mathscr{A}\bm{P}^{-1}_{t}\bm{\widetilde{\Theta}}_{t}-\bm{P}_{t+1}\mathscr{A}\left(\bm{\Phi}_{t}\bm{W}_{t+1}+\bm{\bar{P}}^{-1}_{t+1}\Delta\bm{\Theta}_{t}\right).

Hence by Hölder's inequality, we have

\displaystyle\|\bm{\widetilde{\Theta}}_{t+1}\|_{L_{p}}=\Big\|\bm{\Psi}(t+1,0)\bm{\widetilde{\Theta}}_{0}-\sum^{t}_{k=0}\bm{\Psi}(t+1,k+1)\bm{P}_{k+1}\mathscr{A}\left(\bm{\Phi}_{k}\bm{W}_{k+1}+\bm{\bar{P}}^{-1}_{k+1}\Delta\bm{\Theta}_{k}\right)\Big\|_{L_{p}}
\leq\|\alpha^{t+1}\bm{P}_{t+1}\mathscr{A}^{t+1}\bm{P}^{-1}_{0}\bm{\widetilde{\Theta}}_{0}\|_{L_{p}}+\Big\|\sum^{t}_{k=0}\alpha^{t-k}\bm{P}_{t+1}\mathscr{A}^{t-k+1}\left(\bm{\Phi}_{k}\bm{W}_{k+1}+\bm{\bar{P}}^{-1}_{k+1}\Delta\bm{\Theta}_{k}\right)\Big\|_{L_{p}}
\leq O(\alpha^{t+1}\|\bm{P}_{t+1}\|_{L_{2p}})+\sum^{t}_{k=0}\alpha^{t-k}\|\bm{P}_{t+1}\|_{L_{3p}}\|\bm{\Phi}_{k}\|_{L_{3p}}\|\bm{W}_{k+1}\|_{L_{3p}}+\sum^{t}_{k=0}\alpha^{t-k}\|\bm{P}_{t+1}\|_{L_{3p}}\|\bm{\bar{P}}^{-1}_{k+1}\|_{L_{3p}}\|\Delta\bm{\Theta}_{k}\|_{L_{3p}}.

Hence by Lemma 4.4 and (38), it follows that

\displaystyle\limsup_{t\rightarrow\infty}\|\bm{\widetilde{\Theta}}_{t}\|_{L_{p}}\leq c\sigma_{3p},

where $c$ is a positive constant depending on $\alpha$ and the moment bounds of $\{\bm{P}_{t}\}$, $\{\bm{\Phi}_{t}\}$ and $\{\bm{P}^{-1}_{t}\}$. This completes the proof.

Remark 4.10.

From the proofs of Theorems 4.6 and 4.8, we can see that if the forgetting factor is allowed to differ across sensors, i.e., $\alpha$ is replaced with $\alpha_{i}$ in Algorithm 2, then the results of Theorems 4.6 and 4.8 still hold, provided that the condition $\lambda^{\frac{a^{2}_{\min}}{32pmh(4h+D_{\mathcal{G}}-1)}}<\alpha$ is replaced with $\lambda^{\frac{a^{2}_{\min}}{32pmh(4h+D_{\mathcal{G}}-1)}}<\alpha_{\min}\triangleq\min\{\alpha_{1},\ldots,\alpha_{n}\}$.
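To illustrate Theorems 4.6 and 4.8 concretely, the following Python sketch simulates a diffusion FFLS tracker in the spirit of Algorithm 2 (the exact algorithm is not reproduced here): each sensor runs an information-form local update $\bm{\bar{P}}^{-1}_{t+1,i}=\alpha\bm{P}^{-1}_{t,i}+\bm{\varphi}_{t,i}\bm{\varphi}^{T}_{t,i}$ and the network then fuses neighbors' information pairs. The signal model, the weights, the noise levels, and the choice that each sensor excites only one coordinate (so no single sensor can track $\theta_{t}$ alone, while the network can) are illustrative assumptions.

```python
# A minimal simulation sketch of a diffusion forgetting-factor LS tracker,
# illustrating the bounded tracking error of Theorem 4.8 under cooperative
# (but individually deficient) excitation. Assumptions are illustrative.
import numpy as np

rng = np.random.default_rng(2)
n, m, alpha, T = 6, 2, 0.95, 500
A = np.full((n, n), 1.0 / n)                  # doubly stochastic weights

theta = np.array([1.0, -1.0])                 # unknown time-varying parameter
P_inv = [np.eye(m) for _ in range(n)]
est = [np.zeros(m) for _ in range(n)]

for t in range(T):
    theta = theta + 0.01 * rng.standard_normal(m)      # slow random drift
    bar_P_inv, bar_info = [], []
    for i in range(n):
        phi = np.zeros(m)
        phi[i % m] = 1.0                # sensor i excites one coordinate only
        y = phi @ theta + 0.1 * rng.standard_normal()  # noisy measurement
        # Local FFLS update in information form (in the spirit of (10), (13)).
        bar_P_inv.append(alpha * P_inv[i] + np.outer(phi, phi))
        bar_info.append(alpha * P_inv[i] @ est[i] + phi * y)
    # Combination step: fuse neighbors' information pairs.
    P_inv = [sum(A[i, j] * bar_P_inv[j] for j in range(n)) for i in range(n)]
    est = [np.linalg.solve(P_inv[i],
                           sum(A[i, j] * bar_info[j] for j in range(n)))
           for i in range(n)]

print(max(np.linalg.norm(e - theta) for e in est))     # small, bounded error
```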

5 Stability of distributed FFLS algorithm over unreliable directed networks

In Section 4, we studied the stability of the distributed FFLS algorithm under a fixed undirected graph. However, in practical engineering applications, the information exchange between sensors might not be bidirectional. Moreover, communication is often disturbed by uncertain random factors, such as distance, obstacles and interference, which may lead to the interruption or reconstruction of communication links. Thus, in this section, we model the communication links between sensors as randomly switching directed communication topologies $\mathcal{G}_{r(t)}=(\mathcal{V},\mathcal{E}_{r(t)},\mathcal{A}_{r(t)})$. The switching process is governed by a homogeneous Markov chain $r(t)$ whose states belong to a finite set $\mathbb{S}=\{1,2,\ldots,s\}$, and the corresponding set of communication topology graphs is denoted by $\mathcal{C}=\{\mathcal{G}_{1},\ldots,\mathcal{G}_{s}\}$. The communication graph switches exactly at the instants when the value of $r(t)$ changes. Accordingly, the adjacency matrix and the neighbor set of sensor $i$ are denoted by $\mathcal{A}_{r(t)}=[a_{ij,r(t)}]_{1\leq i,j\leq n}$ and $\mathcal{N}_{i,r(t)}$, respectively. For the distributed FFLS algorithm over Markovian switching directed topologies, we only modify Step 2 in Algorithm 2 as follows:

\displaystyle\bm{P}^{-1}_{t+1,i}=\sum_{j\in\mathcal{N}_{i,r(t)}}a_{ji,r(t)}\bm{\bar{P}}^{-1}_{t+1,j}, (40)
\displaystyle\bm{\hat{\theta}}_{t+1,i}=\bm{P}_{t+1,i}\sum_{j\in\mathcal{N}_{i,r(t)}}a_{ji,r(t)}\bm{\bar{P}}^{-1}_{t+1,j}\bm{\bar{\theta}}_{t+1,j}. (41)
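For concreteness, a minimal Python sketch of the modified combination step (40)-(41) over one time instant is given below. The chain's transition matrix, the per-state weight matrices, and the availability of the local quantities $\bm{\bar{P}}^{-1}_{t+1,j}$ and $\bm{\bar{\theta}}_{t+1,j}$ from Steps (11)-(12) are assumptions made for the example; note that the weights enter as $a_{ji,r(t)}$, i.e., transposed relative to the undirected case.

```python
# Sketch of the combination step (40)-(41) under a Markovian switching
# topology. The local time update producing bar_P_inv[j], bar_theta[j]
# is assumed to have run already; T and A are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, m, s = 4, 3, 2                        # sensors, parameter dim, chain states

A = [np.full((n, n), 1.0 / n) for _ in range(s)]   # balanced weights per state
T = np.array([[0.9, 0.1],
              [0.2, 0.8]])               # irreducible, aperiodic chain

def switch(r):
    """Sample r(t+1) given r(t) = r from the transition matrix T."""
    return rng.choice(s, p=T[r])

def combine(bar_P_inv, bar_theta, r):
    """One application of (40)-(41): fuse in-neighbors' information pairs,
    with weight a_{ji,r(t)} on the edge from j to i."""
    P_inv = [sum(A[r][j, i] * bar_P_inv[j] for j in range(n))
             for i in range(n)]
    theta = [np.linalg.solve(P_inv[i],
                             sum(A[r][j, i] * bar_P_inv[j] @ bar_theta[j]
                                 for j in range(n)))
             for i in range(n)]
    return P_inv, theta

# Example: one step with dummy local quantities.
bP = [np.eye(m) for _ in range(n)]
bt = [np.ones(m) for _ in range(n)]
P_inv, theta = combine(bP, bt, switch(0))
```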

To analyze the stability of algorithm (11), (12), (40), (41), we introduce the following assumptions:

Assumption 5.1

All possible digraphs $\{\mathcal{G}_{1},\ldots,\mathcal{G}_{s}\}$ are balanced, and the union of all these digraphs is strongly connected.

Assumption 5.2

The Markov chain $\{r_{t},t\geq 0\}$ is irreducible and aperiodic with transition probability matrix $\bm{P}=[p_{ij}]_{1\leq i,j\leq s}$, where $p_{ij}=\Pr(r_{t+1}=j|r_{t}=i)$ and $\Pr(\cdot|\cdot)$ denotes the conditional probability.

According to Markov chain theory (cf. [37]), a discrete-time homogeneous Markov chain with finitely many states is ergodic if and only if it is irreducible and aperiodic. Hence Assumption 5.2 implies that the $l$-step transition matrix $\bm{P}^{l}$ converges, as $l\rightarrow\infty$, to a limit with identical rows.
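As a numerical illustration of this ergodicity property, the short sketch below raises an irreducible, aperiodic transition matrix (illustrative, not from the paper) to a high power and checks that the rows of $\bm{P}^{l}$ coincide up to numerical precision.

```python
import numpy as np

# An illustrative irreducible, aperiodic transition matrix.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4]])

Pl = np.linalg.matrix_power(P, 50)   # l-step transition matrix P^l
print(Pl)                            # all rows equal the stationary distribution
print(np.ptp(Pl, axis=0).max())      # maximal row spread is ~ 1e-16
```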

In the following, we analyze properties of strongly connected directed graphs. For convenience, we denote the $i$-th row, $j$-th column element of a matrix $\bm{A}$ by $\bm{A}(i,j)$.

Lemma 5.1.

Let $\mathcal{G}_{k}=(\mathcal{V},\mathcal{E}_{k},\mathcal{A}_{k})$, $1\leq k\leq n$, be $n$ strongly connected graphs with $\mathcal{V}=\{1,2,\cdots,n\}$. Then $\mathcal{A}_{1}\mathcal{A}_{2}\cdots\mathcal{A}_{n}$ is a positive matrix, i.e., every element of the matrix $\mathcal{A}_{1}\mathcal{A}_{2}\cdots\mathcal{A}_{n}$ is positive.

Proof 5.2.

It suffices to prove that the graph $\mathcal{G}^{n}_{1}$ corresponding to the matrix $\mathcal{A}_{1}\mathcal{A}_{2}\cdots\mathcal{A}_{n}$ is a complete graph. Denote the child node set of node $i$ in graph $\mathcal{G}_{k}$ by $\mathcal{O}_{k}(i)$, and the corresponding child node set of node $i$ in graph $\mathcal{G}^{n}_{1}$ by $\mathcal{O}^{n}_{1}(i)$. For any $i\in\mathcal{V}$ and $j\in\mathcal{O}_{1}(i)$, we have

\displaystyle(\mathcal{A}_{1}\mathcal{A}_{2})(i,j)=\sum^{n}_{k=1}\mathcal{A}_{1}(i,k)\mathcal{A}_{2}(k,j)\geq\mathcal{A}_{1}(i,j)\mathcal{A}_{2}(j,j)>0. (42)

Since $\mathcal{G}_{2}$ is strongly connected, if $\mathcal{O}_{1}(i)\neq\mathcal{V}$, then there exist two nodes $j_{1}\in\mathcal{V}\backslash\mathcal{O}_{1}(i)$ and $j_{2}\in\mathcal{O}_{1}(i)$ such that $(j_{2},j_{1})\in\mathcal{E}_{2}$; hence

\displaystyle(\mathcal{A}_{1}\mathcal{A}_{2})(i,j_{1})=\sum^{n}_{k=1}\mathcal{A}_{1}(i,k)\mathcal{A}_{2}(k,j_{1})\geq\mathcal{A}_{1}(i,j_{2})\mathcal{A}_{2}(j_{2},j_{1})>0. (43)

By (42) and (43), it is clear that $\{j_{1}\}\cup\mathcal{O}_{1}(i)\subset\mathcal{O}^{2}_{1}(i)$. Hence for any $j\in\{j_{1}\}\cup\mathcal{O}_{1}(i)$, we have

\displaystyle(\mathcal{A}_{1}\mathcal{A}_{2}\mathcal{A}_{3})(i,j)=\sum^{n}_{k=1}(\mathcal{A}_{1}\mathcal{A}_{2})(i,k)\mathcal{A}_{3}(k,j)\geq(\mathcal{A}_{1}\mathcal{A}_{2})(i,j)\mathcal{A}_{3}(j,j)>0. (44)

Since $\mathcal{G}_{3}$ is strongly connected, if $\{j_{1}\}\cup\mathcal{O}_{1}(i)\neq\mathcal{V}$, then there exist two nodes $j_{2}\in\mathcal{V}\backslash(\{j_{1}\}\cup\mathcal{O}_{1}(i))$ and $j_{3}\in\{j_{1}\}\cup\mathcal{O}_{1}(i)$ such that $(j_{3},j_{2})\in\mathcal{E}_{3}$; hence

\displaystyle(\mathcal{A}_{1}\mathcal{A}_{2}\mathcal{A}_{3})(i,j_{2})=\sum^{n}_{k=1}(\mathcal{A}_{1}\mathcal{A}_{2})(i,k)\mathcal{A}_{3}(k,j_{2})\geq(\mathcal{A}_{1}\mathcal{A}_{2})(i,j_{3})\mathcal{A}_{3}(j_{3},j_{2})>0. (45)

By (44) and (45), we can see that $\{j_{2}\}\cup\{j_{1}\}\cup\mathcal{O}_{1}(i)\subset\mathcal{O}^{3}_{1}(i)$. We repeat the above process until $\mathcal{O}^{n}_{1}(i)=\mathcal{V}$. The lemma then follows from the arbitrariness of the node $i$.
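As a quick numerical check of Lemma 5.1 (illustrative only, not part of the proof), the sketch below multiplies $n$ copies of the adjacency matrix of a strongly connected digraph with self-loops, mirroring the positive-diagonal structure used in (42)-(45), and verifies that the product is entrywise positive; the directed-ring topology is an assumption chosen for the example.

```python
import numpy as np

n = 4
S = np.roll(np.eye(n), 1, axis=1)    # directed ring 0 -> 1 -> ... -> n-1 -> 0
A = 0.5 * (np.eye(n) + S)            # self-loops keep every diagonal positive

prod = np.linalg.multi_dot([A] * n)  # plays the role of A_1 A_2 ... A_n
print((prod > 0).all())              # True: the product is a positive matrix
```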

Compared with the undirected graph case, the key difference is that the adjacency matrix in this section is asymmetric and random. Hence we need to deal with the coupling between the random adjacency matrices and the random regression vectors. By using the above lemma and Markov chain theory, we establish the stability of the algorithm (11), (12), (40), (41) under Markovian switching topologies.

Theorem 5.3.

Under Assumptions 4.2, 5.1 and 5.2, if $\sup_{t}\|\bm{\varphi}_{t,i}\|_{L_{6p}}<\infty$ for any $i\in\{1,\cdots,n\}$ and $\sigma_{3p}\triangleq\sup_{t}(\|\bm{W}_{t}\|_{L_{3p}}+\|\Delta\bm{\Theta}_{t}\|_{L_{3p}})<\infty$, then there exists a constant $c^{\prime}$ such that

\displaystyle\limsup_{t\rightarrow\infty}\|\bm{\widetilde{\Theta}}_{t}\|_{L_{p}}\leq c^{\prime}\sigma_{3p}.
Proof 5.4.

Following the line of the proof of Theorem 4.8 in Subsection 4.3, it can be seen that we only need to prove that (33) holds under the assumptions of the theorem. By Assumption 5.2, there exists a positive integer $q_{0}$ such that

\displaystyle\Pr(r(t+q_{0})=a|r(t)=b)>0 (46)

holds for all $t$ and all states $a,b\in\mathbb{S}$. Denote $\Pi^{t}_{k}=\mathscr{A}_{r(t)}\mathscr{A}_{r(t-1)}\cdots\mathscr{A}_{r(k)}$, and let $\Pi^{t}_{k}(i,j)$ denote the $i$-th row, $j$-th column element of $\Pi^{t}_{k}$. Following Lemmas 4.1 and 4.4, with a slight abuse of notation we set $h^{\prime}=2h+nsq_{0}$, $z_{t}=\lfloor\frac{th^{\prime}+nsq_{0}}{h}\rfloor+1$ and

\displaystyle b_{t+1}=\frac{tr\Big(\sum^{(z_{t}+1)h}_{k=z_{t}h+1}\sum^{n}_{j=1}\left(\sum^{n}_{l=1}\Pi^{k-1}_{th^{\prime}+1}(j,l)\bm{P}_{th^{\prime}+1,l}\right)^{2}\frac{\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}}{1+\|\bm{\varphi}_{k,j}\|^{2}}\Big)}{n(h^{\prime}-nsq_{0})\left(\alpha^{h^{\prime}}+\lambda_{\max}\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)\right)tr(\bm{P}_{th^{\prime}+1})}.

In the following, we analyze the term $\mathbb{E}(b_{t+1}|\mathscr{F}_{z_{t}h})$. By (46), we can see that there exists a positive constant $p_{0}$ such that for all $t$,

\displaystyle\Pr\Big(r(t+nsq_{0})=s,\,r(t+(ns-1)q_{0})=s-1,\,\cdots,\,r(t+((n-1)s+1)q_{0})=1;\ \cdots;\ r(t+2sq_{0})=s,\,r(t+(2s-1)q_{0})=s-1,\,\cdots,\,r(t+(s+1)q_{0})=1;\ r(t+sq_{0})=s,\,r(t+(s-1)q_{0})=s-1,\,\cdots,\,r(t+q_{0})=1\,\Big|\,\mathscr{F}\Big)
=\sum_{a_{0}}\Pr\Big(r(t+nsq_{0})=s\,\Big|\,r(t+(ns-1)q_{0})=s-1\Big)\cdots\Pr\Big(r(t+((n-1)s+1)q_{0})=1\,\Big|\,r(t+(n-1)sq_{0})=s\Big)\cdots\Pr\Big(r(t+q_{0})=1\,\Big|\,r(t)=a_{0}\Big)\Pr\Big(r(t)=a_{0}\,\Big|\,\mathscr{F}\Big)
\geq p_{0}\sum_{a_{0}}\Pr\Big(r(t)=a_{0}\,\Big|\,\mathscr{F}\Big)=p_{0}>0 (47)

with $\mathscr{F}$ being a $\sigma$-algebra. By (47), we know that the Markov chain $\{r_{t},t\geq 0\}$ can visit all states in $\mathbb{S}$ $n$ times with positive probability during the time interval $[t+q_{0},t+nsq_{0}]$. Hence for $k\in[z_{t}h+1,(z_{t}+1)h]$, by Assumption 5.1 and Lemma 5.1, there exists a positive constant $\sigma>0$ such that the following inequality holds:

\displaystyle\mathbb{E}\left(\left(\sum^{n}_{l=1}\Pi^{k-1}_{th^{\prime}+1}(j,l)\bm{P}_{th^{\prime}+1,l}\right)^{2}\,\Bigg|\,\mathscr{F}_{k}\right)
=\mathbb{E}\Big(\sum_{u\in\mathcal{V}}\sum_{v\in\mathcal{V}}\Pi^{k-1}_{th^{\prime}+1}(j,u)\Pi^{k-1}_{th^{\prime}+1}(j,v)\bm{P}_{th^{\prime}+1,u}\bm{P}_{th^{\prime}+1,v}\,\Big|\,\mathscr{F}_{k}\Big)
=\sum_{u\in\mathcal{V}}\sum_{v\in\mathcal{V}}\mathbb{E}\Big(\Pi^{k-1}_{th^{\prime}+1}(j,u)\Pi^{k-1}_{th^{\prime}+1}(j,v)\,\Big|\,\mathscr{F}_{k}\Big)\bm{P}_{th^{\prime}+1,u}\bm{P}_{th^{\prime}+1,v}
\geq\sigma\sum_{u\in\mathcal{V}}\sum_{v\in\mathcal{V}}\bm{P}_{th^{\prime}+1,u}\bm{P}_{th^{\prime}+1,v}=\sigma\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)^{2}.

By $\mathscr{F}_{z_{t}h}\subset\mathscr{F}_{k}$ and $\bm{\varphi}_{k,j}\in\mathscr{F}_{k}$, we conclude that

\displaystyle\mathbb{E}\left(\left(\sum^{n}_{l=1}\Pi^{k-1}_{th^{\prime}+1}(j,l)\bm{P}_{th^{\prime}+1,l}\right)^{2}\frac{\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}}{1+\|\bm{\varphi}_{k,j}\|^{2}}\,\Bigg|\,\mathscr{F}_{z_{t}h}\right)
=\mathbb{E}\Bigg(\mathbb{E}\left(\left(\sum^{n}_{l=1}\Pi^{k-1}_{th^{\prime}+1}(j,l)\bm{P}_{th^{\prime}+1,l}\right)^{2}\,\Bigg|\,\mathscr{F}_{k}\right)\frac{\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}}{1+\|\bm{\varphi}_{k,j}\|^{2}}\,\Bigg|\,\mathscr{F}_{z_{t}h}\Bigg)
\geq\sigma\mathbb{E}\Bigg(\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)^{2}\frac{\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}}{1+\|\bm{\varphi}_{k,j}\|^{2}}\,\Bigg|\,\mathscr{F}_{z_{t}h}\Bigg). (48)

From the above analysis, we can obtain the following inequality

\displaystyle\mathbb{E}(b_{t+1}|\mathscr{F}_{z_{t}h})\geq\frac{tr\Big(\sum^{(z_{t}+1)h}_{k=z_{t}h+1}\sum^{n}_{j=1}\sigma\mathbb{E}\Big(\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)^{2}\frac{\bm{\varphi}_{k,j}\bm{\varphi}^{T}_{k,j}}{1+\|\bm{\varphi}_{k,j}\|^{2}}\,\Big|\,\mathscr{F}_{z_{t}h}\Big)\Big)}{n(h^{\prime}-nsq_{0})\left(\alpha^{h^{\prime}}+\lambda_{\max}\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)\right)tr(\bm{P}_{th^{\prime}+1})}
=\frac{\sigma\,tr\left[\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)^{2}\bm{H}_{z_{t}}\right]}{n(h^{\prime}-nsq_{0})\left(\alpha^{h^{\prime}}+\lambda_{\max}\left(\sum^{n}_{l=1}\bm{P}_{th^{\prime}+1,l}\right)\right)tr(\bm{P}_{th^{\prime}+1})}.

The rest of the proof follows the proofs of Lemma 4.4 and Theorems 4.6 and 4.8, with the notation $D_{\mathcal{G}}$ replaced by $nsq_{0}$. This completes the proof of Theorem 5.3.

Remark 5.5.

From Theorem 5.3 (and also Theorems 4.6 and 4.8), we see that our results are obtained without imposing independence or stationarity assumptions on the regression signals, which makes it possible to apply the distributed FFLS algorithm to practical feedback systems.

6 Concluding Remarks

This paper proposed a distributed FFLS algorithm to collaboratively track an unknown time-varying parameter by minimizing a local loss function with a forgetting factor. By introducing a spatio-temporal cooperative excitation condition, we established the stability of the proposed distributed FFLS algorithm for the fixed undirected graph case. The theoretical results were then generalized to the case of Markovian switching directed graphs. The cooperative excitation condition reveals that the sensors can collaboratively accomplish the tracking task even when no individual sensor can do so alone. Since our theoretical results are established without independence or stationarity conditions on the regression vectors, a relevant research topic is how to combine distributed adaptive estimation with distributed control. How to establish the stability analysis of distributed algorithms in more complex settings, such as with quantization effects or time delays in communication channels, is another interesting research topic.

References

  • [1] W. Ren and R. Beard, “Consensus seeking in multiagent systems under dynamically changing interaction topologies,” IEEE Transactions on Automatic Control, vol. 50, no. 5, pp. 655–661, 2005.
  • [2] Y. Wang, L. Cheng, W. Ren, Z.-G. Hou, and M. Tan, “Seeking consensus in networks of linear agents: Communication noises and markovian switching topologies,” IEEE Transactions on Automatic Control, vol. 60, no. 5, pp. 1374–1379, 2015.
  • [3] K. Lu, H. Xu, and Y. Zheng, “Distributed resource allocation via multi-agent systems under time-varying networks,” Automatica, vol. 136, p. 110059, 2022.
  • [4] B. Wang, Q. Fei, and Q. Wu, “Distributed time-varying resource allocation optimization based on finite-time consensus approach,” IEEE Control Systems Letters, vol. 5, no. 2, pp. 599–604, 2021.
  • [5] Z. Lin, L. Wang, Z. Han, and M. Fu, “Distributed formation control of multi-agent systems using complex laplacian,” IEEE Transactions on Automatic Control, vol. 59, no. 7, pp. 1765–1777, 2014.
  • [6] Y. Zhi, L. Liu, B. Guan, B. Wang, Z. Cheng, and H. Fan, “Distributed robust adaptive formation control of fixed-wing uavs with unknown uncertainties and disturbances,” Aerospace Science and Technology, vol. 126, p. 107600, 2022.
  • [7] G. Battistelli and L. Chisci, “Kullback-Leibler average, consensus on probability densities, and distributed state estimation with guaranteed stability,” Automatica, vol. 50, no. 3, pp. 707–718, 2014.
  • [8] W. Chen, C. Wen, S. Hua, and C. Sun, “Distributed cooperative adaptive identification and control for a group of continuous-time systems with a cooperative pe condition via consensus,” IEEE Transactions on Automatic Control, vol. 59, no. 1, pp. 91–106, 2014.
  • [9] M. U. Javed, J. I. Poveda, and X. Chen, “Excitation conditions for uniform exponential stability of the cooperative gradient algorithm over weakly connected digraphs,” IEEE Control Systems Letters, vol. 6, pp. 67–72, 2022.
  • [10] F. Barani, A. Savadi, and H. S. Yazdi, “Convergence behavior of diffusion stochastic gradient descent algorithm,” Signal Processing, vol. 183, p. 108014, 2021.
  • [11] I. D. Schizas, G. Mateos, and G. B. Giannakis, “Distributed LMS for consensus-based in-network adaptive processing,” IEEE Transactions on Signal Processing, vol. 57, no. 6, pp. 2365–2382, 2009.
  • [12] L. Zhang, Y. Cai, C. Li, and R. C. de Lamare, “Variable forgetting factor mechanisms for diffusion recursive least squares algorithm in sensor networks,” EURASIP Journal on Advances in Signal Processing, vol. 57, 2017, doi:10.1186/s13634-017-0490-z.
  • [13] N. Takahashi, I. Yamada, and A. H. Sayed, “Diffusion least-mean squares with adaptive combiners: Formulation and performance analysis,” IEEE Transactions on Signal Processing, vol. 58, no. 9, pp. 4795–4810, 2010.
  • [14] J. Lei and H. Chen, “Distributed estimation for parameter in heterogeneous linear time-varying models with observations at network sensors,” Communications in Information and Systems, vol. 15, no. 4, pp. 423–451, 2015.
  • [15] G. Mateos and G. B. Giannakis, “Distributed recursive least-squares: Stability and performance analysis,” IEEE Transactions on Signal Processing, vol. 60, no. 7, pp. 3740–3754, 2012.
  • [16] S. Xie and L. Guo, “Analysis of normalized least mean squares-based consensus adaptive filters under a general information condition,” SIAM Journal on Control and Optimization, vol. 56, no. 5, pp. 3404–3431, 2018.
  • [17] D. Gan and Z. Liu, “Performance analysis of the compressed distributed least squares algorithm,” Systems & Control Letters, vol. 164, p. 105228, 2022.
  • [18] ——, “Distributed order estimation of arx model under cooperative excitation condition,” SIAM Journal on Control and Optimization, vol. 60, no. 3, pp. 1519–1545, 2022.
  • [19] D. Gan, S. Xie, and Z. Liu, “Stability of the distributed Kalman filter using general random coefficients,” Science China Information Sciences, vol. 64, pp. 172 204:1–172 204:14, 2021.
  • [20] L. Guo, “Estimation, control, and games of dynamical systems with uncertainty,” SCIENTIA SINICA Informationis, vol. 50, no. 9, pp. 1327–1344, 2020.
  • [21] D. Gan and Z. Liu, “Convergence of the distributed SG algorithm under cooperative excitation condition,” IEEE Transactions on Neural Networks and Learning Systems, pp. 1–15, 2022, doi:10.1109/TNNLS.2022.3213715.
  • [22] S. Xie, Y. Zhang, and L. Guo, “Convergence of a distributed least squares,” IEEE Transactions on Automatic Control, vol. 66, no. 10, pp. 4952–4959, 2021.
  • [23] S. Xie and L. Guo, “Analysis of distributed adaptive filters based on diffusion strategies over sensor networks,” IEEE Transactions on Automatic Control, vol. 63, no. 11, pp. 3643–3658, 2018.
  • [24] O. Macchi and E. Eweda, “Compared speed and accuracy of RLS and LMS algorithms with constant forgetting factors,” Traitement du Signal, vol. 22, pp. 255–267, 1988.
  • [25] Y. Hatano and M. Mesbahi, “Agreement over random networks,” IEEE Transactions on Automatic Control, vol. 50, no. 11, pp. 1867–1872, 2005.
  • [26] S. Kar, J. M. F. Moura, and K. Ramanan, “Distributed parameter estimation in sensor networks: Nonlinear observation models and imperfect communication,” IEEE Transactions on Information Theory, vol. 58, no. 6, pp. 3575–3605, 2012.
  • [27] I. Matei, N. Martins, and J. S. Baras, “Almost sure convergence to consensus in markovian random graphs,” in Proceedings of the 47th IEEE Conference on Decision and Control, Cancun, Mexico, December 2008, pp. 3535–3540.
  • [28] K. You, Z. Li, and L. Xie, “Consensus condition for linear multi-agent systems over randomly switching topologies,” Automatica, vol. 49, no. 10, pp. 3125–3132, 2013.
  • [29] Y. Wang, L. Cheng, W. Ren, Z.-G. Hou, and M. Tan, “Seeking consensus in networks of linear agents: Communication noises and markovian switching topologies,” IEEE Transactions on Automatic Control, vol. 60, no. 5, pp. 1374–1379, 2015.
  • [30] M. Meng, L. Liu, and G. Feng, “Adaptive output regulation of heterogeneous multiagent systems under markovian switching topologies,” IEEE Transactions on Cybernetics, vol. 48, no. 10, pp. 2962–2971, 2018.
  • [31] Q. Zhang and J.-F. Zhang, “Distributed parameter estimation over unreliable networks with markovian switching topologies,” IEEE Transactions on Automatic Control, vol. 57, no. 10, pp. 2545–2560, 2012.
  • [32] Q. Liu, Z. Wang, X. He, and D. Zhou, “Event-based distributed filtering over markovian switching topologies,” IEEE Transactions on Automatic Control, vol. 64, no. 4, pp. 1595–1602, 2019.
  • [33] G. Zielke, “Inversion of modified symmetric matrices,” Journal of the Association for Computing Machinery, vol. 15, no. 3, pp. 402–408, 1968.
  • [34] L. Guo, “Stability of recursive stochastic tracking algorithms,” SIAM Journal on Control and Optimization, vol. 32, no. 5, pp. 1195–1225, 1994.
  • [35] ——, Time-varying stochastic systems, stability and adaptive theory, Second edition.   Science Press, Beijing, 2020.
  • [36] C. Godsil and G. Royle, Algebraic Graph Theory.   Springer-Verlag, 2001.
  • [37] S. Karlin and H. Taylor, A Second Course in Stochastic Processes.   New York: Academic, 1981.