
Resilient Distributed Parameter Estimation in Sensor Networks

Jiaqi Yan, Kuo Li, and Hideaki Ishii. J. Yan and H. Ishii are with the Department of Computer Science, Tokyo Institute of Technology, Japan. Emails: jyan@sc.dis.titech.ac.jp, ishii@c.titech.ac.jp. K. Li is with the Center for Intelligent and Networked System, Department of Automation, BNRist, Tsinghua University, P.R. China. Email: li-k19@mails.tsinghua.edu.cn. This work was supported in part by JSPS under Grants-in-Aid for Scientific Research Grant No. 22H01508 and 21F40376.
Abstract

In this paper, we study the problem of parameter estimation in a sensor network, where the measurements and updates of some sensors might be arbitrarily manipulated by adversaries. Despite the presence of such misbehaviors, normally behaving sensors make successive observations of an unknown $d$-dimensional vector parameter and aim to infer its true value by cooperating with their neighbors over a directed communication graph. To this end, by leveraging the so-called dynamic regressor extension and mixing procedure, we transform the problem of estimating the vector parameter into that of estimating $d$ scalar ones. For each scalar problem, we propose a resilient combine-then-adapt diffusion algorithm, where each normal sensor performs a resilient combination to discard the suspicious estimates in its neighborhood and to fuse the remaining values, alongside an adaptation step to process its streaming observations. With a low computational cost, this estimator guarantees that each normal sensor exponentially infers the true parameter even if some of the sensors are not sufficiently excited.

I Introduction

As a fundamental problem in various applications such as system identification and adaptive control, distributed parameter estimation has been studied in the literature over decades (see, e.g., [1, 2, 3, 4, 5]). In this problem, each sensor observes (partial) information of a system with an unknown (vector) parameter, and attempts to consistently estimate the true parameter by cooperating with others.

As for a single sensor, it is well known that consistent estimation is possible only if its regressor meets certain excitation conditions. Moreover, exponential convergence can be further achieved if a persistent excitation (PE) condition is verified, which guarantees that the input signals of a plant are sufficiently rich that all modes of the plant can be excited [1]. However, in a distributed framework, the PE condition may not necessarily hold at every sensor side. Therefore, by properly introducing consensus algorithms, weaker excitation conditions have been proposed, with which sensors collectively satisfy the PE condition and cooperatively fulfill the estimation task (see, for example, [4, 6, 7]).

However, distributed estimation algorithms, despite relaxing the PE condition and providing better flexibility than their centralized counterparts, are vulnerable to adversarial behaviors in the network [8]. As the scale of the network increases, it becomes especially difficult to secure every sensor and communication channel. In particular, adversaries could manipulate the measurements or transmissions of sensors and disrupt the operation of conventional algorithms. In fact, as reported in [9], even a single adversary is able to steer the normal sensors to estimates of its own choosing.

Inspired by these issues, recent research efforts have been devoted to the design of secure estimation protocols. A large class of methods focuses on developing resilient algorithms which ensure that all the normally behaving sensors resiliently recover the unknown parameter even in the presence of attacks. To raise the resiliency against malicious behaviors, many approaches adopt the idea of simply ignoring the most extreme values in the neighborhood. Stemming from this idea, a family of strategies termed mean-subsequence-reduced (MSR) algorithms has been proposed and widely applied to solve the problem of resilient consensus (see, for example, [10, 11, 12, 13]). However, different from the consensus problem, which does not incorporate sensors’ measurements, in the problem of resilient estimation, each sensor must generate a secure estimate by processing its streaming data. To cope with this issue, the works [14, 9] have developed resilient estimators by extending the MSR algorithms. These estimators are shown to be resilient to Byzantine sensors if the network is sufficiently connected and the collective measurements from the normal sensors are observable for the system state. Other resilient approaches include [8, 15], where a pre-defined threshold is needed at each sensor to check the distance between its neighbors’ local estimates and its own, thereby limiting the effects of the misbehaving ones.

In this paper, we also investigate the problem of distributed parameter estimation in an adversarial environment. However, different from most of the aforementioned works, where the regressor is assumed to be a constant matrix, we consider a more general model. Specifically, the observations of each sensor are generated via a linear regression equation (LRE), which is able to describe the input-output relationship of a large class of linear and nonlinear dynamical systems [1]. Moreover, notice that in the literature [8, 15, 16], it is assumed that only the measurements of sensors can be manipulated. In contrast, this paper considers the case where not only the measurements, but also the updating rules of a sensor can be faulty and arbitrarily manipulated. Notice that the latter situation could happen if some “non-participating” sensors exist in the network, which weigh their private interests more than the public ones and are not willing to follow the given protocols.

Our estimator is inspired by the so-called dynamic regressor extension and mixing (DREM) algorithm. DREM was first introduced in the recent work [17], where, in a fault-free environment, it reveals decent performance in relaxing the excitation condition and guaranteeing asymptotic convergence at a fusion center. This paper, with subtle modifications on DREM, proposes a resilient combine-then-adapt (RCTA) diffusion algorithm to accommodate the difficulties brought by the distributed framework and malicious behaviors. To be specific, by leveraging DREM, we transform the problem of estimating a $d$-dimensional vector parameter into that of estimating $d$ scalar ones: one for each of the unknown parameters. For each scalar problem, an estimation strategy is given, where each normal sensor runs a resilient algorithm to discard the suspicious estimates in its in-neighborhood and fuse the remaining values, alongside an adaptation process to incorporate its own measurements by using least-mean squares. We provide sufficient conditions under which each normal sensor exponentially infers the true parameter over a directed graph, even if some of them are not sufficiently excited through their inputs. Notice that by decoupling the vector estimation problem into scalar ones, the algorithm proposed here admits a lightweight implementation, which yields lower cost in solving the more complicated distributed estimation problems.

The rest of this paper is organized as follows. We introduce some preliminaries and the problem formulation in Sections II and III, respectively. The main algorithm and its convergence analysis are presented in Section IV. We test the main results through some numerical examples in Section V. Finally, Section VI concludes the paper.

II Preliminaries

Consider a digraph $\mathcal{G}=(\mathcal{V},\mathcal{E})$. Let $\mathcal{V}$ be the set of sensors, and $\mathcal{E}\subseteq\mathcal{V}\times\mathcal{V}$ be the set of edges. An edge from sensor $j$ to $i$ is denoted by $e_{ij}\in\mathcal{E}$, indicating that sensor $i$ can receive information directly from sensor $j$. Accordingly, the sets of in-neighbors and out-neighbors of agent $i\in\mathcal{V}$ are defined, respectively, as

𝒩i+{j𝒱|eij},𝒩i{j𝒱|eji}.\mathcal{N}_{i}^{+}\triangleq\{j\in\mathcal{V}|e_{ij}\in\mathcal{E}\},\ \mathcal{N}_{i}^{-}\triangleq\{j\in\mathcal{V}|e_{ji}\in\mathcal{E}\}. (1)

For the algorithms employed in this paper, we shall characterize their resilience in terms of the definitions below [9]:

Definition 1 ($r$-reachable)

Consider the digraph $\mathcal{G}=(\mathcal{V},\mathcal{E})$. A set $\mathcal{S}\subseteq\mathcal{V}$ is said to be $r$-reachable if it contains at least one sensor that has at least $r$ in-neighbors from outside $\mathcal{S}$. That is, there exists $i\in\mathcal{S}$ such that $|\mathcal{N}_{i}^{+}\backslash\mathcal{S}|\geq r$.

Definition 2 (Strongly $r$-robust w.r.t. $\mathcal{S}$)

Consider the digraph $\mathcal{G}=(\mathcal{V},\mathcal{E})$ with $\mathcal{S}\subseteq\mathcal{V}$ being a nonempty subset of $\mathcal{V}$. The graph $\mathcal{G}$ is said to be strongly $r$-robust w.r.t. $\mathcal{S}$ if, for any nonempty subset $\mathcal{S}^{\prime}\subseteq\mathcal{V}\backslash\mathcal{S}$, $\mathcal{S}^{\prime}$ is $r$-reachable.

Intuitively, Definition 2 requires that any nonempty subset of $\mathcal{V}\backslash\mathcal{S}$ contain at least one sensor with a sufficient number of in-neighbors from outside it.
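Definitions 1 and 2 can be checked directly from the adjacency structure of a (small) digraph. The following sketch is our own illustration, not part of the paper; it enumerates every nonempty subset of $\mathcal{V}\backslash\mathcal{S}$, so it is exponential in $|\mathcal{V}\backslash\mathcal{S}|$ and only meant for small examples. Edges are encoded as pairs $(j,i)$ for an edge from $j$ to $i$.

```python
from itertools import combinations

def in_neighbors(edges, i):
    """In-neighbors of i: all j such that there is an edge (j -> i)."""
    return {j for (j, k) in edges if k == i}

def is_r_reachable(edges, subset, r):
    """Definition 1: some sensor in `subset` has >= r in-neighbors outside it."""
    return any(len(in_neighbors(edges, i) - subset) >= r for i in subset)

def is_strongly_r_robust(nodes, edges, S, r):
    """Definition 2: every nonempty subset of V \\ S is r-reachable."""
    rest = sorted(set(nodes) - set(S))
    return all(
        is_r_reachable(edges, set(c), r)
        for size in range(1, len(rest) + 1)
        for c in combinations(rest, size)
    )
```

For instance, with nodes $\{1,2,3,4\}$, $\mathcal{S}=\{1,2,3\}$, and edges from each of $1,2,3$ to $4$, the graph is strongly $3$-robust w.r.t. $\mathcal{S}$ but not strongly $4$-robust.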

III Problem Formulation

Let us consider the problem of distributed parameter estimation, where multiple sensors cooperatively infer an unknown parameter through local measurements. To be specific, for each sensor $i\in\{1,2,\ldots,m\}$, the measurable signals $y_{i}(k)\in\mathbb{R}$ and $\phi_{i}(k)\in\mathbb{R}^{d}$ are related via the following linear regression equation (LRE) (the results in this paper can be readily generalized to the case where a sensor outputs a vector measurement $y_{i}(k)$, by treating each of its entries independently as a scalar measurement):

yi(k)=θϕi(k),y_{i}(k)=\theta^{\prime}\phi_{i}(k), (2)

where $\theta\in\mathbb{R}^{d}$ is the parameter to be estimated and $\theta^{\prime}$ denotes its transpose.

The sensors aim to estimate $\theta$ from a stream of measurable signals. However, in a practical network, a single sensor may not be sufficiently excited through its inputs, in which case the signals available at its local side are not enough to consistently estimate the parameter $\theta$ [1]. In this respect, each sensor intends to obtain an exact estimate of $\theta$ through information exchange with the others. We use the digraph $\mathcal{G}=(\mathcal{V},\mathcal{E})$ to model the interaction among them.

III-A Attack model

This paper is concerned with parameter estimation in an adversarial environment, where some of the sensors might be faulty or misbehaving. Let us denote the set of indices of these sensors as $\mathcal{F}\subset\mathcal{V}$. Any sensor $i\in\mathcal{F}$ could be one that fails to follow the pre-defined estimation protocol or whose transmitted data is manipulated by an adversary. On the other hand, the normal or benign sensors always adopt the prescribed estimation algorithm; the set of such nodes is denoted by $\mathcal{R}$. Given the limited energy of adversaries, it is reasonable to assume an upper bound on the number of faulty sensors. In this paper, we shall consider an $f$-local attack model as defined below (note that the $f$-local attack model assumed here is more general than the $f$-total attack model, where the total number of faulty sensors is assumed to be upper bounded by $f$):

Assumption 1 (Attack model)

The network $\mathcal{G}=(\mathcal{V},\mathcal{E})$ is under an $f$-local attack model. That is, any normal sensor $i\in\mathcal{R}$ has no more than $f$ in-neighbors that are misbehaving, i.e., $|\mathcal{F}\cap\mathcal{N}^{+}_{i}|\leq f$.

III-B Resilient parameter estimation

Our goal is to ensure that all of the normal sensors consistently estimate the parameter $\theta$. Specifically, let $\hat{\theta}_{i}(k)\in\mathbb{R}^{d}$ be the local estimate produced by sensor $i$ at time $k$. This paper aims to develop a distributed estimation algorithm which works resiliently against the set of misbehaving sensors by solving the following problem:

Definition 3 (Resilient parameter estimation)

Resilient parameter estimation is said to be achieved if the local estimate of each normal sensor converges to the true parameter $\theta$, regardless of the initial states and network misbehaviors, namely,

limkθ^i(k)=θ,i.\displaystyle\lim_{k\to\infty}\hat{\theta}_{i}(k)=\theta,\;\forall i\in\operatorname*{\mathcal{R}}.

IV Main Results

This section provides a resilient algorithm for the problem of distributed parameter estimation. Specifically, by decoupling the $d$-dimensional estimation problem into $d$ scalar ones, every normal sensor performs $d$ independent MSR-based estimation algorithms in parallel, each of which infers one entry of the unknown vector parameter. We shall show that this protocol is lightweight and efficient even in the presence of faulty sensors.

IV-A Dynamic regressor extension and mixing (DREM)

Our estimator is developed based on the dynamic regressor extension and mixing (DREM) procedure, which was first introduced in [17]. DREM introduces three new variables for each sensor $i\in\mathcal{V}$, defined as follows:

Φi(k)[(ϕi(k))(ϕi(k1))(ϕi(kd+1))]d×d,y¯i(k)adj(Φi(t))[yi(k)yi(k1)yi(kd+1)]d,δi(k)det(Φi(k)),\begin{split}\Phi_{i}(k)&\triangleq\begin{bmatrix}\left(\phi_{i}(k)\right)^{\prime}\\ \left(\phi_{i}(k-1)\right)^{\prime}\\ \vdots\\ \left(\phi_{i}(k-d+1)\right)^{\prime}\end{bmatrix}\in\mathbb{R}^{d\times d},\\ \overline{y}_{i}(k)&\triangleq\operatorname*{adj}(\Phi_{i}(t))\left[\begin{array}[]{c}y_{i}(k)\\ y_{i}(k-1)\\ \vdots\\ y_{i}(k-d+1)\end{array}\right]\in\mathbb{R}^{d},\\ \delta_{i}(k)&\triangleq\det(\Phi_{i}(k)),\end{split} (3)

where we denote by $\operatorname{adj}(\Phi_{i}(k))$ and $\det(\Phi_{i}(k))$ the adjugate matrix and the determinant of the matrix $\Phi_{i}(k)$, respectively. Notice that $\overline{y}_{i}(k)$ is a $d$-dimensional column vector; for simplicity, we denote its $\ell$-th entry by $\overline{y}^{\ell}_{i}(k)$. By using DREM, the following lemma is obtained:

Lemma 1 ([17])

Consider the LRE (2). For any $\ell\in\{1,\cdots,d\}$, it holds for any $i\in\mathcal{V}$ and at any time $k$ that

y¯i(k)=δi(k)θ,\overline{y}_{i}^{\ell}(k)=\delta_{i}(k)\theta^{\ell}, (4)

where $\overline{y}_{i}(k)$ and $\delta_{i}(k)$ are given in (3), and $\theta^{\ell}$ is the $\ell$-th entry of the true parameter $\theta$.

Therefore, leveraging DREM, we generate $d$ scalar LREs as presented in (4): one for each entry of the unknown parameter.
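The transformation (3) and the identity (4) of Lemma 1 can be verified numerically. The following sketch is our own illustration (function names are not from the paper); it computes the adjugate via cofactors, which, unlike $\operatorname{adj}(A)=\det(A)A^{-1}$, remains valid when $\Phi_{i}(k)$ is singular.

```python
import numpy as np

def adjugate(A):
    """Adjugate via cofactors; valid for any square matrix (d is small here)."""
    d = A.shape[0]
    C = np.zeros_like(A, dtype=float)
    for i in range(d):
        for j in range(d):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T  # the adjugate is the transpose of the cofactor matrix

def drem(phis, ys):
    """DREM variables (3) from the d most recent regressors/measurements.

    phis: [phi(k), phi(k-1), ..., phi(k-d+1)], each a length-d array
    ys:   the corresponding measurements [y(k), ..., y(k-d+1)]
    Returns (y_bar, delta), which satisfy y_bar[l] = delta * theta[l] (Lemma 1).
    """
    Phi = np.vstack([p.reshape(1, -1) for p in phis])  # stack the rows phi'
    y_bar = adjugate(Phi) @ np.asarray(ys, dtype=float)
    delta = float(np.linalg.det(Phi))
    return y_bar, delta
```

Feeding `drem` any $d$ consecutive regressor/measurement pairs generated by (2) returns a pair with $\overline{y}_{i}(k)=\delta_{i}(k)\theta$ entrywise, which is exactly the family of scalar LREs (4).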

IV-B Description of the resilient algorithm

This subsection is devoted to developing the resilient estimator based on DREM. In order to reject the possible attacks, we propose a resilient combine-then-adapt (RCTA) diffusion algorithm, which overrules the malicious effects from faulty sensors by performing a resilient convex combination and then an adaptation. Specifically, each normal sensor $i\in\mathcal{R}$ starts with any initial estimate $\hat{\theta}_{i}(0)\in\mathbb{R}^{d}$. At any time $k>0$, it makes an estimation as in Algorithm 1.

1:  Collect the local estimates from all in-neighboring agents $j\in\mathcal{N}_{i}^{+}$, and place these values in the multiset $\mathcal{X}_{i}(k)$.

2:  for $\ell\in\{1,2,\ldots,d\}$ do

  1. a):

    Set 𝒥i(k)=𝒳i(k)\mathcal{J}^{\ell}_{i}(k)=\mathcal{X}_{i}(k). Then sort the points in 𝒥i(k)\mathcal{J}^{\ell}_{i}(k) according to their \ell-th entries in an ascending order.

  2. b):

    Based on the sorted set, remove from $\mathcal{J}^{\ell}_{i}(k)$ the first $f$ points, which have the smallest $\ell$-th entries. If there are fewer than $f$ points in $\mathcal{J}^{\ell}_{i}(k)$, remove all of them.

  3. c):

    Similarly, remove from $\mathcal{J}^{\ell}_{i}(k)$ the last $f$ points, which have the largest $\ell$-th entries. If there are fewer than $f$ points in $\mathcal{J}^{\ell}_{i}(k)$, remove all of them.

  4. d):

    Sensor $i$ resiliently combines the neighboring estimates as

    (Resilient combination)

    θ¯i(k)={θ^i(k),if 𝒥i(k)=,aii(k)θ^i(k)+j𝒥i(k)aij(k)θ^j(k),otherwise,\begin{split}&\bar{\theta}_{i}^{\ell}(k)\\ &=\begin{cases}\hat{\theta}_{i}^{\ell}(k),\quad\text{if }\mathcal{J}^{\ell}_{i}(k)=\varnothing,\\ a^{\ell}_{ii}(k)\hat{\theta}_{i}^{\ell}(k)+\sum_{j\in\mathcal{J}^{\ell}_{i}(k)}a^{\ell}_{ij}(k)\hat{\theta}_{j}^{\ell}(k),\;\;\text{otherwise},\end{cases}\end{split} (5)

    where each weight $a^{\ell}_{ij}(k)$ is lower bounded by $\alpha>0$ and $a^{\ell}_{ii}(k)+\sum_{j\in\mathcal{J}^{\ell}_{i}(k)}a^{\ell}_{ij}(k)=1$.

  5. e):

    Sensor $i$ updates the $\ell$-th entry of its local estimate as

    (Adaptation)

    θ^i(k+1)=θ¯i(k)+δi(k)μi+(δi(k))2(y¯i(k)δi(k)θ¯i(k)),\begin{split}&\hat{\theta}^{\ell}_{i}(k+1)\\ &\;=\bar{\theta}_{i}^{\ell}(k)+\frac{\delta_{i}(k)}{\mu_{i}+\left(\delta_{i}(k)\right)^{2}}\left(\overline{y}^{\ell}_{i}(k)-\delta_{i}(k)\bar{\theta}_{i}^{\ell}(k)\right),\end{split} (6)

    where $\mu_{i}>0$.

end for

3:  Transmit $\hat{\theta}_{i}(k+1)=\operatorname{col}\big(\hat{\theta}^{1}_{i}(k+1),\ldots,\hat{\theta}^{d}_{i}(k+1)\big)$ to the out-neighbors.

Algorithm 1 Resilient parameter estimation algorithm

In the proposed algorithm, for inferring each entry $\theta^{\ell}$ of the unknown parameter, each normal sensor $i$ sorts the received estimates based on their $\ell$-th entries and discards up to $f$ smallest and up to $f$ largest ones. As discussed in Section IV-C, each normal sensor $i\in\mathcal{R}\backslash\mathcal{S}_{\mathcal{R}}$ has at least $3f+1$ in-neighbors. Since it removes at most $2f$ values for each $\ell\in\{1,\cdots,d\}$, $\mathcal{J}^{\ell}_{i}(k)$ must be non-empty at any time $k$. On the other hand, if $i\in\mathcal{S}_{\mathcal{R}}$, it is possible that $\mathcal{J}^{\ell}_{i}(k)=\varnothing$. In either case, sensor $i$ calculates the resilient combination (5) by linearly combining its own estimate with the ones in $\mathcal{J}^{\ell}_{i}(k)$. After that, the adaptation (6) is performed using a least-mean-squares scheme to update sensor $i$'s local estimate of $\theta^{\ell}$. Finally, sensor $i$ aggregates its estimates of all the entries and sends them to its out-neighbors.
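The per-entry update of Algorithm 1 (steps a)–e)) can be sketched as follows. This is our own illustration: the function name is not from the paper, and the uniform weights used in the combination step are just one admissible choice of the $a^{\ell}_{ij}(k)$ satisfying the lower bound $\alpha$ and the sum-to-one constraint.

```python
def rcta_step(theta_hat_i, neighbor_vals, f, y_bar_i, delta_i, mu_i):
    """One RCTA update (steps a)-e) of Algorithm 1) for a single entry l.

    theta_hat_i:      sensor i's current scalar estimate of theta^l
    neighbor_vals:    the l-th entries received from the in-neighbors
    f:                trimming parameter of the f-local attack model
    y_bar_i, delta_i: the DREM quantities from (3); mu_i > 0
    """
    # a)-c): sort, then remove up to f smallest and up to f largest values
    J = sorted(neighbor_vals)
    J = J[f:]                         # drops everything if fewer than f remain
    J = J[:-f] if f > 0 else J        # likewise for the f largest
    # d): resilient combination (5); uniform convex weights over {i} and J
    if not J:
        theta_bar = theta_hat_i
    else:
        w = 1.0 / (len(J) + 1)        # a_ii = a_ij = w, summing to one
        theta_bar = w * (theta_hat_i + sum(J))
    # e): adaptation (6), a least-mean-squares step on the scalar LRE (4)
    gain = delta_i / (mu_i + delta_i ** 2)
    return theta_bar + gain * (y_bar_i - delta_i * theta_bar)
```

For example, with true entry $\theta^{\ell}=2.5$, $\delta_{i}=2$, $\overline{y}^{\ell}_{i}=5$, and $f=1$, the extreme received values are trimmed before the combination, and a single call already shrinks the estimation error.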

A main feature of Algorithm 1 is its simplicity in implementation and computation. Particularly, by leveraging DREM, we decouple the $d$-dimensional parameter estimation problem into $d$ scalar ones. The resilient combination (5) is then performed within each scalar system, where only coordinate-wise message sorting and trimming are needed. The computational complexity at each normal sensor $i$ can be shown to be $\mathcal{O}(d\cdot n_{i}\log_{2}n_{i})$, where $n_{i}\triangleq|\mathcal{N}_{i}^{+}|$. As compared with the existing solutions in multi-dimensional spaces, such as [19, 22, 18, 20], Algorithm 1 thus yields lower computational cost.

IV-C Assumptions

Before proceeding, we first introduce the assumptions that will be adopted in this paper.

It is well known that consistent estimation is possible only if the input signal satisfies certain excitation conditions. In particular, a persistent excitation (PE) condition, which guarantees that the signals are sufficiently rich, is usually required to achieve exponential convergence [1]. On the other hand, in order to counter the faulty behaviors in an adversarial environment, the network should contain a certain degree of redundancy in its communication structure. In this respect, let us introduce the following assumptions:

Assumption 2

There exists a subset of sensors $\mathcal{S}\subset\mathcal{V}$ such that the following statements hold:

  1. 1.

    [Persistent excitation] There exist $\Delta>0$ and a finite time $T\in\mathbb{N}_{+}$ such that the following PE condition is satisfied by each $i\in\mathcal{S}$:

    t=kk+T1(δi(t))2Δ,k.\sum_{t=k}^{k+T-1}(\delta_{i}(t))^{2}\geq\Delta,\;\forall k. (7)
  2. 2.

    [Network topology] The network $\mathcal{G}=(\mathcal{V},\mathcal{E})$ is strongly $(3f+1)$-robust w.r.t. $\mathcal{S}$.

According to Assumption 2, the set $\mathcal{S}$ consists of the sensors that are persistently excited. Moreover, two cases could arise: either all sensors in the network are persistently excited, and thus $\mathcal{S}=\mathcal{V}$, or only a subset of the sensors is persistently excited, but they are “sufficiently connected” to all the others. As for the second case, since the network is strongly $(3f+1)$-robust w.r.t. $\mathcal{S}$, one can verify from Definition 2 that $|\mathcal{S}|\geq 3f+1$. Moreover, each sensor $i\in\mathcal{V}\backslash\mathcal{S}$ has at least $3f+1$ in-neighbors. (To see this, suppose that a sensor $i$ outside the set $\mathcal{S}$ has at most $3f$ in-neighbors, and choose $\mathcal{S}^{\prime}=\{i\}$. Then, based on Definition 2, $\mathcal{S}^{\prime}$ is at most $3f$-reachable, so the network cannot be strongly $(3f+1)$-robust w.r.t. $\mathcal{S}$, which contradicts the assumption.) As will be proved later, in either case, our algorithm guarantees that each normal sensor, whether or not it satisfies the PE condition, consistently estimates the true parameter.
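Over a finite simulation record, the PE condition (7) amounts to checking every length-$T$ window of the sequence $\{\delta_{i}(t)\}$. A small sketch of such a check (our own helper, only meaningful for the windows the record actually contains):

```python
def satisfies_pe(delta_seq, T, Delta):
    """Check the PE condition (7) over every length-T window of a finite record."""
    windows = range(len(delta_seq) - T + 1)
    return all(sum(d * d for d in delta_seq[k:k + T]) >= Delta for k in windows)
```

For instance, the periodic sequence $1,0,0,1,0,0,\ldots$ satisfies (7) with $T=3$ and $\Delta=1$, but fails for $T=2$, since some windows then contain only zeros.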

Notice that sensors in $\mathcal{S}$ may also be faulty or misbehaving. For simplicity, we denote by $\mathcal{S}_{\mathcal{R}}$ the subset of $\mathcal{S}$ which contains only the normal sensors, that is, $\mathcal{S}_{\mathcal{R}}\triangleq\mathcal{S}\cap\mathcal{R}$. The following assumption is finally made:

Assumption 3

The set $\mathcal{S}_{\mathcal{R}}$ is nonempty.

Assumption 3 implies that there exists at least one normal sensor that is able to find the true parameter. Notice that this is necessary for $\theta$ to be consistently estimated by all normal sensors.

IV-D Convergence analysis

To theoretically analyze the performance of Algorithm 1, let us define the following set for each normal sensor $i\in\mathcal{R}$:

~i(𝒩i+){i},\widetilde{\mathcal{R}}_{i}\triangleq(\mathcal{N}_{i}^{+}\cap\mathcal{R})\cup\{i\}, (8)

which includes all the normal in-neighbors of sensor $i$ and sensor $i$ itself. This set is used only in our analysis and need not be known by sensor $i$. In the next lemma, we shall prove that $\bar{\theta}_{i}^{\ell}(k)$, obtained by performing the resilient combination (5), is indeed a convex combination of the local estimates from the normal sensors in $\widetilde{\mathcal{R}}_{i}$:

Lemma 2

Suppose that, under Assumptions 1–3, each normal sensor performs parameter estimation by following Algorithm 1. Then, for any normal sensor $i\in\mathcal{R}$ and $\ell\in\{1,2,\ldots,d\}$, there exists a set of weights $\{d^{\ell}_{ij}(k)\}$ such that $\bar{\theta}_{i}^{\ell}(k)$ can be represented as

θ¯i(k)=j~idij(k)θ^j(k),\bar{\theta}_{i}^{\ell}(k)=\sum_{j\in\widetilde{\mathcal{R}}_{i}}d^{\ell}_{ij}(k)\hat{\theta}_{j}^{\ell}(k), (9)

where θ¯i(k)\bar{\theta}_{i}^{\ell}(k) is calculated by (5). Moreover, the following statements hold:

  1. 1.

    θ¯i(k)\bar{\theta}_{i}^{\ell}(k) is a convex combination of its own estimate and the ones received from normal in-neighbors. That is, each weight in (9) is non-negative and j~idij(k)=1\sum_{j\in\widetilde{\mathcal{R}}_{i}}d^{\ell}_{ij}(k)=1;

  2. 2.

    dii(k)αd^{\ell}_{ii}(k)\geq\alpha;

  3. 3.

    For any normal sensor $j$ whose estimate at the $\ell$-th entry is retained by sensor $i$, i.e., $j\in\mathcal{J}^{\ell}_{i}(k)\cap\mathcal{R}$, it follows that $j\in\widetilde{\mathcal{R}}_{i}$ and $d^{\ell}_{ij}(k)\geq\alpha$.

As implied by Lemma 2, $\bar{\theta}_{i}^{\ell}(k)$ is updated in a safe manner, as it always remains in a region determined by the local estimates of only the normal sensors. Hence, by resiliently combining the in-neighbors' estimates, (5) prevents the faulty sensors from taking arbitrary control over the dynamics of any normal one.

Therefore, one concludes that the resilient combination step ensures that each normal sensor is updated in a safe manner. We shall next study the performance of Algorithm 1 on achieving the parameter estimation. This is particularly guaranteed by the adaptation step in (6).

To see this, for each sensor $i\in\mathcal{V}$, let us define its estimation error at the $\ell$-th entry as

θ~i(k)θ^i(k)θ.\tilde{\theta}_{i}^{\ell}(k)\triangleq\hat{\theta}_{i}^{\ell}(k)-\theta^{\ell}. (10)

Combining (4) and (6) with Lemmas 1 and 2, the dynamics of $\tilde{\theta}_{i}^{\ell}(k)$ are obtained as

θ~i(k+1)=θ^i(k+1)θ=j~idij(k)θ^j(k)θ+(δi(k))2μi+(δi(k))2(θj~idij(k)θ^j(k))=(1(δi(k))2μi+(δi(k))2)j~idij(k)(θ^j(k)θ)=νi(k)j~idij(k)θ~j(k),\begin{split}&\tilde{\theta}_{i}^{\ell}(k+1)=\hat{\theta}^{\ell}_{i}(k+1)-\theta^{\ell}\\ &=\sum_{j\in\widetilde{\mathcal{R}}_{i}}d^{\ell}_{ij}(k)\hat{\theta}_{j}^{\ell}(k)-\theta^{\ell}\\ &\qquad+\frac{(\delta_{i}(k))^{2}}{\mu_{i}+\left(\delta_{i}(k)\right)^{2}}\left(\theta^{\ell}-\sum_{j\in\widetilde{\mathcal{R}}_{i}}d^{\ell}_{ij}(k)\hat{\theta}_{j}^{\ell}(k)\right)\\ &=\left(1-\frac{(\delta_{i}(k))^{2}}{\mu_{i}+\left(\delta^{\ell}_{i}(k)\right)^{2}}\right)\sum_{j\in\widetilde{\mathcal{R}}_{i}}d^{\ell}_{ij}(k)(\hat{\theta}_{j}^{\ell}(k)-\theta^{\ell})\\ &=\nu_{i}(k)\sum_{j\in\widetilde{\mathcal{R}}_{i}}d^{\ell}_{ij}(k)\tilde{\theta}_{j}^{\ell}(k),\end{split} (11)

where νi(k)1(δi(k))2μi+(δi(k))21.\nu_{i}(k)\triangleq 1-\frac{(\delta_{i}(k))^{2}}{\mu_{i}+\left(\delta_{i}(k)\right)^{2}}\leq 1.

Then let us introduce the following results, whose proofs are omitted due to space limitations:

Lemma 3

Consider the network of sensors verifying the LRE (2). Suppose that, under Assumptions 1–3, each normal sensor performs parameter estimation by following Algorithm 1. Then, for any $\ell$, the maximum estimation error $m(k)$ among the normal sensors satisfies

m(k+1)m(k),k.m(k+1)\leq m(k),\;\forall k. (12)

Therefore, the maximum estimation error never increases throughout the execution. Further, we show that this error is guaranteed to decrease after a finite time:

Lemma 4

Consider the network of sensors verifying the LRE (2). Suppose that, under Assumptions 1–3, each normal sensor performs parameter estimation by following Algorithm 1. Then it holds for any $\ell$ that

m(k+T+M+1)\leq\big(1-\alpha^{T+M}(1-\gamma)\big)m(k). (13)

With the above preparations, we are now ready to state the main result of the paper on the convergence of the estimation error as follows:

Theorem 1

Consider the network of sensors satisfying the LRE (2). Suppose that Assumptions 1–3 hold. Let each normal sensor update with Algorithm 1. Then resilient parameter estimation is achieved at an exponential rate. Namely, it holds that

limkθ~i(k)=0,i,\lim\limits_{k\to\infty}\tilde{\theta}_{i}(k)=0,\;\forall i\in\operatorname*{\mathcal{R}}, (14)

regardless of the network misbehaviors.

In view of Theorem 1, the convergence of Algorithm 1 holds regardless of the presence and behaviors of faulty sensors. Therefore, the algorithm can be applied to withstand even the worst-case adversaries.

Remark 1

We further point out that the topology condition, i.e., that the network $\mathcal{G}=(\mathcal{V},\mathcal{E})$ is strongly $(3f+1)$-robust w.r.t. a set $\mathcal{S}$, is similar to those in the works [23, 9, 24]. In those works, agents in $\mathcal{S}$ are assumed to have direct knowledge of the state to be estimated. Thus, these agents act as leaders, while the others are followers that resiliently track the states of the leaders. Note that in the framework of leader-following consensus, the leaders only broadcast their estimates to the followers and do not take the states of others into account in their own updates. In the parameter estimation problem here, by contrast, the sensors in $\mathcal{S}$ are those satisfying the PE condition. Moreover, according to (5), their dynamics are inevitably affected by other sensors, including the ones outside $\mathcal{S}$. Therefore, the existing methods for convergence analysis cannot be directly applied. To resolve this, the adaptation step (6) is necessary to guarantee the convergence of the normal sensors.

V Numerical Example

This example considers a network of 8 sensors, which aim to cooperatively estimate a 2-dimensional parameter $\theta$ over the digraph given in Fig. 1. It can be checked that the graph is strongly 4-robust w.r.t. the set $\mathcal{S}=\{1,2,3,4\}$. Therefore, according to Theorem 1, the network is able to tolerate a 1-local attack.

Figure 1: A network that is strongly 4-robust w.r.t. the set $\mathcal{S}=\{1,2,3,4\}$.

To verify this, let sensors 1 and 5 be misbehaving. They intend to prevent the estimation task from being fulfilled by maliciously broadcasting their estimates as

\hat{\theta}^{1}_{1}(k)=2,\quad\hat{\theta}^{2}_{1}(k)=-2,\quad\hat{\theta}^{1}_{5}(k)=k/20+2,\quad\hat{\theta}^{2}_{5}(k)=0.5\sin(k/5). (15)

On the other hand, the regressor $\phi_{i}(t)$ for each normal sensor is given by

\begin{split}\phi_{2}(t)&=[a(t)\quad 1]^{\prime},\quad\phi_{3}(t)=[1\quad b(t)]^{\prime},\\ \phi_{4}(t)&=\begin{cases}[1\quad 2]^{\prime}, & \text{if }t\text{ is odd},\\ [2\quad 3]^{\prime}, & \text{if }t\text{ is even},\end{cases}\\ \phi_{6}(t)&=[1\quad 0]^{\prime},\quad\phi_{7}(t)=[0\quad 2]^{\prime},\quad\phi_{8}(t)=[1\quad 1]^{\prime},\end{split} (16)

where a(t)=a(t1)+cos(tπ4),b(t)=b(t1)+cos(tπ2),a(t)=a(t-1)+\cos\left(\frac{t\pi}{4}\right),\;b(t)=b(t-1)+\cos\left(\frac{t\pi}{2}\right), with a(0)=1a(0)=1 and b(0)=2b(0)=2. Notice that the PE condition is satisfied by δi(k),i={2,3,4}\delta_{i}(k),i=\{2,3,4\}. We set the initial estimates of all sensors as [00][0\quad 0]^{\prime} and other parameters as θ=[2.51],\theta=[2.5\quad-1]^{\prime}, and μi=0.1i,i.\mu_{i}=0.1i,\;\forall i.

The performance of Algorithm 1 is demonstrated in Fig. 2, which shows the Euclidean norm of each sensor’s estimation error. From the figure, we can see that, despite the presence of misbehaving nodes, the normal sensors can consistently estimate the true parameter, as expected from Theorem 1.

Figure 2: The Euclidean norm of each sensor’s estimation error, where the lines with ‘x’ are those of misbehaving sensors.
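Since the exact edge set of Fig. 1 is not reproduced in the text, a scaled-down, scalar ($d=1$) analogue of this experiment can be sketched as follows. All of its ingredients are illustrative assumptions: 6 sensors on a complete digraph, $f=1$, sensor 0 misbehaving by always broadcasting 100, constant regressor $\phi_{i}=1$ (so $\delta_{i}=1$ and PE holds), and $\mu_{i}=0.1$.

```python
# Scaled-down analogue of the experiment: with d = 1, the DREM variables (3)
# reduce to y_bar_i = y_i and delta_i = phi_i (fixed to 1 here, so PE holds).
theta = 2.5                        # true (scalar) parameter
mu, f, steps = 0.1, 1, 60
est = [0.0] * 6                    # est[0] is irrelevant: sensor 0 lies anyway

for _ in range(steps):
    sent = [100.0] + est[1:]       # values on the wire this round
    new = est[:]
    for i in range(1, 6):          # each normal sensor updates
        vals = sorted(sent[:i] + sent[i + 1:])   # in-neighbors' estimates
        vals = vals[f:-f]          # trim the f smallest and f largest
        w = 1.0 / (len(vals) + 1)
        bar = w * (est[i] + sum(vals))           # resilient combination (5)
        delta, y_bar = 1.0, theta * 1.0          # scalar LRE: y = theta * phi
        new[i] = bar + delta / (mu + delta**2) * (y_bar - delta * bar)  # (6)
    est = new

errors = [abs(e - theta) for e in est[1:]]       # normal sensors' errors
```

In this toy setting the attacker's value is always the largest and is trimmed away, and the per-step error contraction factor is $\nu=\mu/(\mu+\delta^{2})=1/11$, so the normal errors vanish quickly, mirroring the behavior expected from Theorem 1.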

VI Conclusion

This paper has considered the problem of distributed estimation in an adversarial environment, where some of the sensors might be Byzantine. To mitigate their misbehaviors, a resilient estimation algorithm has been proposed, which, with low computational complexity, guarantees that the normal sensors exponentially estimate the true parameter when certain requirements on the regressor vector and the network topology are met. As future work, we plan to extend the results in this paper to more challenging scenarios where the parameter to be estimated might be changing dynamically and the sensor measurements might be subject to stochastic noises.

References

  • [1] G. C. Goodwin and K. S. Sin, Adaptive Filtering Prediction and Control.   Courier, 2014.
  • [2] S. Xie, Y. Zhang, and L. Guo, “Convergence of a distributed least squares,” IEEE Transactions on Automatic Control, vol. 66, no. 10, pp. 4952–4959, 2020.
  • [3] I. D. Schizas, G. Mateos, and G. B. Giannakis, “Distributed LMS for consensus-based in-network adaptive processing,” IEEE Transactions on Signal Processing, vol. 57, no. 6, pp. 2365–2382, 2009.
  • [4] W. Chen, C. Wen, S. Hua, and C. Sun, “Distributed cooperative adaptive identification and control for a group of continuous-time systems with a cooperative PE condition via consensus,” IEEE Transactions on Automatic Control, vol. 59, no. 1, pp. 91–106, 2013.
  • [5] J. Yan, X. Yang, Y. Mo, and K. You, “A distributed implementation of steady-state Kalman filter,” IEEE Transactions on Automatic Control, 2022, DOI: 10.1109/TAC.2022.3175925.
  • [6] S. Xie and L. Guo, “Analysis of distributed adaptive filters based on diffusion strategies over sensor networks,” IEEE Transactions on Automatic Control, vol. 63, no. 11, pp. 3643–3658, 2018.
  • [7] A. S. Matveev, M. Almodarresi, R. Ortega, A. Pyrkin, and S. Xie, “Diffusion-based distributed parameter estimation through directed graphs with switching topology: Application of dynamic regressor extension and mixing,” IEEE Transactions on Automatic Control, vol. 67, no. 8, pp. 4256–4263, 2022.
  • [8] Y. Chen, S. Kar, and J. M. Moura, “Resilient distributed parameter estimation with heterogeneous data,” IEEE Transactions on Signal Processing, vol. 67, no. 19, pp. 4918–4933, 2019.
  • [9] A. Mitra and S. Sundaram, “Byzantine-resilient distributed observers for LTI systems,” Automatica, vol. 108, p. 108487, 2019.
  • [10] D. Dolev, N. A. Lynch, S. S. Pinter, E. W. Stark, and W. E. Weihl, “Reaching approximate agreement in the presence of faults,” Journal of the ACM, vol. 33, no. 3, pp. 499–516, 1986.
  • [11] H. J. LeBlanc, H. Zhang, X. Koutsoukos, and S. Sundaram, “Resilient asymptotic consensus in robust networks,” IEEE Journal on Selected Areas in Communications, vol. 31, no. 4, pp. 766–781, 2013.
  • [12] S. M. Dibaji, H. Ishii, and R. Tempo, “Resilient randomized quantized consensus,” IEEE Transactions on Automatic Control, vol. 63, no. 8, pp. 2508–2522, 2017.
  • [13] H. Ishii, Y. Wang, and S. Feng, “An overview on multi-agent consensus under adversarial attacks,” Annual Reviews in Control, 2022.
  • [14] H. J. LeBlanc and F. Hassan, “Resilient distributed parameter estimation in heterogeneous time-varying networks,” in Proceedings of the 3rd International Conference on High Confidence Networked Systems, 2014, pp. 19–28.
  • [15] M. Meng, X. Li, and G. Xiao, “Distributed estimation under sensor attacks: Linear and nonlinear measurement models,” IEEE Transactions on Signal and Information Processing over Networks, vol. 7, pp. 156–165, 2021.
  • [16] Y. Chen, S. Kar, and J. M. Moura, “Resilient distributed estimation through adversary detection,” IEEE Transactions on Signal Processing, vol. 66, no. 9, pp. 2455–2469, 2018.
  • [17] S. Aranovskiy, A. Bobtsov, R. Ortega, and A. Pyrkin, “Performance enhancement of parameter estimators via dynamic regressor extension and mixing,” IEEE Transactions on Automatic Control, vol. 62, no. 7, pp. 3546–3550, 2017.
  • [18] H. Mendes and M. Herlihy, “Multidimensional approximate agreement in Byzantine asynchronous systems,” in Proceedings of the 45th Annual ACM Symposium on Theory of Computing, 2013, pp. 391–400.
  • [19] X. Wang, S. Mou, and S. Sundaram, “A resilient convex combination for consensus-based distributed algorithms,” arXiv preprint arXiv:1806.10271, 2018.
  • [20] J. Yan, X. Li, Y. Mo, and C. Wen, “Resilient multi-dimensional consensus in adversarial environment,” Automatica, vol. 145, p. 110530, 2022.
  • [21] B. Yi and R. Ortega, “Conditions for convergence of dynamic regressor extension and mixing parameter estimators using LTI filters,” IEEE Transactions on Automatic Control, 2022, DOI: 10.1109/TAC.2022.3149964.
  • [22] N. H. Vaidya, L. Tseng, and G. Liang, “Iterative approximate Byzantine consensus in arbitrary directed graphs,” in Proceedings of the ACM Symposium on Principles of Distributed Computing, 2012, pp. 365–374.
  • [23] J. Usevitch and D. Panagou, “Resilient leader-follower consensus to arbitrary reference values in time-varying graphs,” IEEE Transactions on Automatic Control, vol. 65, no. 4, pp. 1755–1762, 2019.
  • [24] J. Yan, C. Deng, and C. Wen, “Resilient output regulation in heterogeneous networked systems under Byzantine agents,” Automatica, vol. 133, p. 109872, 2021.