
A Numerical Verification Framework for Differential Privacy in Estimation

Yunhai Han1,2 and Sonia Martínez1
1Department of Mechanical and Aerospace Engineering, University of California at San Diego, La Jolla, CA, 92093, USA (email: y8han@ucsd.edu; soniamd@ucsd.edu). 2Laboratory for Intelligent Decision and Autonomous Robots, George W. Woodruff School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA, 30313, USA (email: yhan364@gatech.edu). This work was supported by Grant ONR N00014-19-1-2471.
Abstract

This work proposes an algorithmic method to verify differential privacy for estimation mechanisms, with performance guarantees. Differential privacy makes it hard to distinguish the outputs of a mechanism produced by adjacent inputs. While obtaining theoretical conditions that guarantee differential privacy may be possible, evaluating these conditions in practice can be hard. This is especially true for estimation mechanisms that take values in continuous spaces, as this requires checking an infinite set of inequalities. Instead, our verification approach consists of testing the differential privacy condition for a suitably chosen finite collection of events, at the expense of some information loss. More precisely, our data-driven test framework for continuous-range mechanisms first finds a highly-likely, compact event set, as well as a partition of this set, and then evaluates differential privacy wrt this partition. This results in a type of differential privacy with high confidence, which we are able to quantify precisely. This approach is then used to evaluate the differential-privacy properties of the recently proposed $W_{2}$ Moving Horizon Estimator. We confirm its properties, while comparing its performance with alternative approaches in simulation.

I INTRODUCTION

A growing number of emerging, on-demand applications require data from users or sensors in order to make predictions and/or recommendations. Examples include smart grids, traffic networks, and home assistive technology. While more accurate information can benefit the quality of service, an important concern is that sharing personalized data may compromise the privacy of its users. This has been demonstrated on Netflix datasets [1], as well as on traffic monitoring systems [2].

Originally proposed in the database literature, Differential Privacy [3] addresses this issue and has become a standard privacy specification in commercial products. More recently, differential privacy has attracted the attention of the Systems and Control literature [4] and has been applied to control systems [5], optimization [6], and estimation and filtering [7]. In particular, the work [8] develops the concept of differential privacy for Kalman filter design. The work [9] proposes a more general moving-horizon estimator via a perturbed objective function to enable privacy. To the best of our knowledge, all of these works only provide theoretical sufficient conditions for differential privacy.

However, the design of such algorithms can be subtle and error-prone. It has been proved that a number of algorithms in the database literature are incorrect [10], [11], and that their claimed level of privacy cannot be achieved. Motivated by this, the work [12] introduces an approach to detect violations of differential privacy for discrete mechanisms. Yet, this method is only applicable to mechanisms that result in a small and finite number of events. Further, a precise characterization of its performance guarantees is not provided.

This motivates our work, which contributes in two directions. First, we build a tractable, data-driven framework to detect violations of differential privacy in system estimation. To handle the infinite collection of events of continuous spaces, the evaluation is conditioned on a highly-likely, compact set. This results in a type of approximate differential privacy with high confidence. We then approximate this set in a data-driven fashion. Further, tests are performed wrt a collectively-exhaustive and mutually-exclusive partition of the approximated highly-likely set. By assuming the probability of these events is upper bounded by a small constant, and implementing an exact hypothesis-test procedure, we are able to quantify the approximate differential privacy of the estimator wrt two adjacent inputs with high likelihood. Second, we employ this procedure to evaluate the differential privacy of the previously proposed $W_{2}$-MHE estimator. Our experiments show some interesting results, including: i) the theoretical conditions for the $W_{2}$-MHE seem to hold but may be rather conservative, ii) there is an indication that perturbing the output estimation mapping results in a better performance than perturbing the input sensor data, iii) differential privacy does depend on sensor locations, and iv) the $W_{2}$-MHE performs better than a differentially-private EKF.

II Problem Formulation

Consider a sensor network performing a distributed estimation task. Sensors may have different owners, who wish to keep their locations private. Even if communication among sensors is secure, an adversary may have access to the estimate of the target, and to other network side information (this can be any information including, but not limited to, the target's true location and all other sensor positions), to gain critical knowledge about any individual sensor; see Figure 1.

Figure 1: The solid circle represents the moving target that is being estimated; the squares represent the location of known sensors (side information known by an adversary); the star represents the sensor location of a particular sensor that an adversary is tracking. The actual location of the starred sensor can be anywhere within the shaded circle (hypothesis). The diameter of the hypothesis depends on the level of differential privacy of the estimator. The dashed curve represents the target’s trajectory and the arrows indicate the direction. The set of red/blue lines represent the output data released from sensors when the target is at the start/end point. In practice, an adversary can probably have access to a time history of data.

We now start by defining the concept of differential privacy in estimation, and state our problem objectives.

Let the system and observation models be of the form:

\Omega:\begin{cases}x_{k+1}=f(x_{k},w_{k}),\\ y_{k}=h(x_{k},v_{k}),\end{cases} \qquad (1)

where $x_{k}\in\mathbb{R}^{d_{X}}$, $y_{k}\in\mathbb{R}^{d_{Y}}$, $w_{k}\in\mathbb{R}^{d_{W}}$ and $v_{k}\in\mathbb{R}^{d_{V}}$. Here, $w_{k}$ and $v_{k}$ represent the iid process and measurement noises at time step $k$, respectively.

Let $\{0,\ldots,T\}$ be a time horizon, and the sensor data up to time $T$ be denoted by $\mathrm{y}_{0:T}=(y_{0}^{\top},\ldots,y_{T}^{\top})^{\top}$. An estimator or mechanism $\mathcal{M}$ of (1) is a stochastic mapping $\mathcal{M}:\mathbb{R}^{(T+1)d_{Y}}\rightarrow\mathbb{R}^{md_{X}}$, for some $m\geq 1$, which assigns sensor data $\mathrm{y}_{0:T}$ to a random state trajectory estimate. We will assume that the distribution of $\mathcal{M}(\mathrm{y}_{0:T})$, denoted $\mathbb{P}$, is independent of the distribution of $\mathrm{y}_{0:T}$. In Section V, we test a $W_{2}$-MHE filter that takes this form and assimilates sensor data online. Roughly speaking, the $W_{2}$-MHE employs a moving window $N$ and sensor data $(y_{k+1},\ldots,y_{k+N})$ to estimate the state at time $k$; see [9] for more information.

Definition 1

(($\varepsilon$,$d$-adjacent), $\lambda$-approximate, Differential Privacy) Let $\mathcal{M}$ be a state estimator of System (1) and $d_{\mathrm{y}}$ a distance metric on $\mathbb{R}^{(T+1)d_{Y}}$. Given $\varepsilon,\lambda,d\in\mathbb{R}_{\geq 0}$, $\mathcal{M}$ is ($\varepsilon$, $d$-adjacent), $\lambda$-approximate, differentially private if for any $\mathrm{y}_{0:T}^{1},\mathrm{y}_{0:T}^{2}\in\mathbb{R}^{(T+1)d_{Y}}$ with $d_{\mathrm{y}}(\mathrm{y}_{0:T}^{1},\mathrm{y}_{0:T}^{2})\leq d$ we have

\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{i})\in E)\leq\mathrm{e}^{\varepsilon}\,\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{j})\in E)+\lambda, \qquad (2)

for $i,j=1,2$, and for all $E\subseteq\text{range}(\mathcal{M})$. In what follows, we use the notation ($\varepsilon$,$d$-adj)-$\lambda$ (resp. ($\varepsilon$,$d$-adj), for $\lambda=0$).

This notion of privacy matches the standard definition of [4], [8], for a 1-adjacency relation given by a distance $\leq d$. We are mostly interested in the case $\lambda=0$ and the standard ($\varepsilon$,$d$-adj) differential privacy [9]. However, later we discuss how to choose a $\lambda$ to approximate it via ($\varepsilon$,$d$-adj)-$\lambda$ differential privacy. Ideally, $\lambda$ is to be chosen as small as possible. Here, $d_{\mathrm{y}}(\mathrm{y}_{0:T}^{1},\mathrm{y}_{0:T}^{2})$ is induced by the 2-norm; however, the results do not depend on the choice of metric.

Finding theoretical conditions that guarantee ($\varepsilon$,$d$-adj) differential privacy can be difficult, conservative, and hard to verify. Thus, in this work we aim to:

  1. Obtain a tractable, numerical test procedure to evaluate the differential privacy of an estimator, while providing quantifiable performance guarantees of its correctness.

  2. Verify numerically the differential-privacy guarantees of the $W_{2}$-MHE filter of [9], and compare its performance with that of an extended Kalman filter.

  3. Evaluate the differences in privacy/estimation when the perturbations are directly applied to the sensor data before the filtering process is done.

Our approach employs a statistical, data-driven method. Although a main motivation for this work is the evaluation of the $W_{2}$-MHE filter, the method can be used to verify the privacy of any mapping with a continuous range.

III Approximate ($\varepsilon$,$d$-adj) Differential Privacy

In this section, we start by introducing a notion of high-likelihood ($\varepsilon$,$d$-adj) differential privacy. This is a first step to simplify the evaluation of differential privacy via the proposed numerical framework. A second step lies in the identification of a suitable space partition.

Definition 2

(High-likelihood ($\varepsilon$,$d$-adj) Differential Privacy). Suppose that $\mathcal{M}$ is a state estimator of System (1). Given $\varepsilon,d\in\mathbb{R}_{\geq 0}$, we say that $\mathcal{M}$ is ($\varepsilon$,$d$-adj) differentially private with high likelihood $1-\theta$ if there exists an event $R$ with $\mathbb{P}(R)\geq 1-\theta$ such that, for any two $\mathrm{y}_{0:T}^{i}$, $i=1,2$, with $d_{\mathrm{y}}(\mathrm{y}_{0:T}^{1},\mathrm{y}_{0:T}^{2})\leq d$, we have:

((y0:Ti)E|R)eε((y0:Tj)E|R),\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{i})\in E|R)\leq e^{\varepsilon}\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{j})\in E|R),

for $i,j\in\{1,2\}$ and all events $E\subseteq\text{range}(\mathcal{M})$.

Lemma 1

Suppose that $\mathcal{M}$ is a high-likelihood ($\varepsilon$,$d$-adj) differentially private estimator, with likelihood $1-\theta$. Then, $\mathcal{M}$ is ($\varepsilon$,$d$-adj)-$\lambda$ differentially private with $\lambda=\theta$.

Proof:

Let $R$ be the high-likely event wrt which $\mathcal{M}$ is high-likely differentially private. Let $\overline{R}$ be its complement and $E\subseteq\text{range}(\mathcal{M})$. We have

\begin{aligned}
\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{1})\in E)
&=\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{1})\in E\,|\,R)\mathbb{P}(R)+\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{1})\in E\,|\,\overline{R})\mathbb{P}(\overline{R})\\
&\leq e^{\varepsilon}\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{2})\in E\,|\,R)\mathbb{P}(R)+\theta\,\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{1})\in E\,|\,\overline{R})\\
&\leq e^{\varepsilon}\big[\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{2})\in E\,|\,R)\mathbb{P}(R)+\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{2})\in E\,|\,\overline{R})\mathbb{P}(\overline{R})\big]+\theta\\
&=e^{\varepsilon}\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{2})\in E)+\theta.
\end{aligned}

Similarly, the roles of $\mathrm{y}_{0:T}^{1}$ and $\mathrm{y}_{0:T}^{2}$ can be exchanged. ∎

Definition 2 still requires checking conditions involving an infinite number of event sets. Our test framework limits evaluations to a finite collection as follows.

Definition 3 (Differential privacy wrt a space partition)

Let $\mathcal{M}$ be an estimator of System (1) and $\mathcal{P}=\{E_{1},\ldots,E_{n}\}$ be a space partition of $\text{range}(\mathcal{M})$ (by partition we mean a collection of mutually exclusive and collectively exhaustive events wrt $\mathbb{P}$). We say that $\mathcal{M}$ is ($\varepsilon$,$d$-adj) differentially private wrt $\mathcal{P}$ if the definition of ($\varepsilon$,$d$-adj) differential privacy holds for each $E_{k}\in\mathcal{P}$.

The following lemma explains the relationship between ($\varepsilon$,$d$-adj) differential privacy wrt a partition and the original ($\varepsilon$,$d$-adj) differential privacy.

Lemma 2

Let $\mathcal{M}$ be a state estimator of System (1), and consider a partition of $\text{range}(\mathcal{M})$, $\mathcal{P}_{1}=\{E_{1},\ldots,E_{n_{1}}\}$, which is finer than another partition $\mathcal{P}_{2}=\{F_{1},\ldots,F_{n_{2}}\}$ ($n_{1}>n_{2}$). That is, each $F_{i}$ can be represented by the disjoint union $F_{i}=\cup_{s=1}^{m_{i}}E_{l_{s}}$. Then, if $\mathcal{M}$ is ($\varepsilon$,$d$-adj) differentially private wrt $\mathcal{P}_{1}$, it is also differentially private wrt $\mathcal{P}_{2}$.

Proof:

By assumption, it holds that $\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{1})\in E_{i})\leq\mathrm{e}^{\varepsilon}\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{2})\in E_{i})$. Take $F_{i}=E_{l_{1}}\cup\ldots\cup E_{l_{m_{i}}}$; then, from the properties of the partition, we obtain:

\begin{aligned}
\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{1})\in F_{i})&=\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{1})\in E_{l_{1}}\cup\ldots\cup E_{l_{m_{i}}})=\sum_{s=1}^{m_{i}}\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{1})\in E_{l_{s}})\\
&\leq e^{\varepsilon}\sum_{s=1}^{m_{i}}\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{2})\in E_{l_{s}})=e^{\varepsilon}\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{2})\in F_{i}).
\end{aligned}

Similarly, the roles of $\mathrm{y}_{0:T}^{1}$ and $\mathrm{y}_{0:T}^{2}$ can be exchanged. ∎

Thus, it follows intuitively that $\mathcal{M}$ is ($\varepsilon$,$d$-adj) differentially private if it is ($\varepsilon$,$d$-adj) differentially private wrt infinitesimally small partitions. Now, by considering partitions of a given resolution, we can also guarantee a type of approximate ($\varepsilon$,$d$-adj) privacy:

Lemma 3

Consider a partition $\mathcal{P}=\{E_{i}\}_{i\in\mathcal{I}}$ such that $\mathbb{P}(E_{i})\leq\eta$ for all $i\in\mathcal{I}$. Then, if ($\varepsilon$,$d$-adj) differential privacy holds wrt the partition $\mathcal{P}$, $\mathcal{M}$ is ($\varepsilon$,$d$-adj)-$\lambda$ differentially private with $\lambda=2\eta e^{\varepsilon}$.

Proof:

For any $R$, we have $\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{1})\in R)=\sum_{i}\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{1})\in R\cap E_{i})$. By hypothesis,

\begin{aligned}
&\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{1})\in E_{i})\leq e^{\varepsilon}\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{2})\in E_{i})\Longleftrightarrow\\
&\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{1})\in E_{i}\cap(R\cup\overline{R}))\leq e^{\varepsilon}\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{2})\in E_{i}\cap(R\cup\overline{R})),
\end{aligned}

for $i\in\mathcal{I}$, where $\overline{R}$ is the complement of $R$. Thus,

\begin{aligned}
&\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{1})\in E_{i}\cap R)+\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{1})\in E_{i}\cap\overline{R})\\
&\leq e^{\varepsilon}\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{2})\in E_{i}\cap R)+e^{\varepsilon}\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{2})\in E_{i}\cap\overline{R}).
\end{aligned}

Now, if $\mathbb{P}(E_{i})\leq\eta$, we have

\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{1})\in E_{i}\cap R)\leq e^{\varepsilon}\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{2})\in E_{i}\cap R)+2\eta e^{\varepsilon}.

Similarly, the roles of $\mathrm{y}_{0:T}^{1}$ and $\mathrm{y}_{0:T}^{2}$ can be exchanged. ∎

Our approach is based on checking differential privacy wrt a partition given by $\{\overline{R},E_{1},\ldots,E_{n}\}$, where $\overline{R}$ is the complement of a highly likely event $R$, and $\{E_{i}\}_{i=1}^{n}$ is a partition of $R$.

IV Differential Privacy Test Framework

In this section, we present the components of our test framework (Section IV-A) and its theoretical guarantees (Section IV-B).

IV-A Overview of the differential-privacy test framework

Privacy is evaluated wrt two $d$-close sensor data sequences $\mathrm{y}_{0:T}^{1}$, $\mathrm{y}_{0:T}^{2}$ as follows:

  1. Instead of verifying ($\varepsilon$,$d$-adj) privacy for an infinite number of events, an EventListGenerator module extracts a finite collection EventList. This is done by partitioning an approximated high-likely event set.

  2. Next, a WorstEventSelector module identifies the worst-case event in EventList, i.e. the one that violates ($\varepsilon$,$d$-adj) differential privacy with the highest probability.

  3. Finally, a HypothesisTest module evaluates ($\varepsilon$,$d$-adj) differential privacy wrt the worst-case event.

The overall description is summarized in Algorithm 1.

1: function TestFramework($\mathcal{M},\varepsilon,\mathrm{y}_{0:T}^{1},\mathrm{y}_{0:T}^{2}$)
2:     Inputs: Target estimator $\mathcal{M}$, privacy level $\varepsilon$, sensor data ($\mathrm{y}_{0:T}^{1},\mathrm{y}_{0:T}^{2}$)
3:     EventList = EventListGenerator($\mathcal{M}$, $\mathrm{y}_{0:T}^{1}$)
4:     WorstEvent = WorstEventSelector($\mathcal{M},\varepsilon,\mathrm{y}_{0:T}^{1},\mathrm{y}_{0:T}^{2}$, EventList)
5:     $p^{+},p_{+}$ = HypothesisTest($\mathcal{M},\varepsilon,\mathrm{y}_{0:T}^{1},\mathrm{y}_{0:T}^{2}$, WorstEvent)
6:     Return $p^{+},p_{+}$
7: end function
Algorithm 1 ($\varepsilon$,$d$-adj) Differentially-private Test Framework

We now describe each module in detail.

IV-A1 EventListGenerator module.

Consider System (1), with initial condition $x_{0}\in\mathcal{K}_{0}\subset\mathbb{R}^{d_{X}}$. The estimated state $\hat{x}_{k}$ under $\mathcal{M}$ for a given $\mathrm{y}_{0:T}^{1}$ belongs to the set of all possible estimates given $x_{0}$ and $\mathrm{y}_{0:T}^{1}$. Denote this set as $R_{[0,k]}$.

In [9], this set is bounded, as all disturbances and the initial distribution are assumed to have a compact support. However, $R_{[0,k]}$ can be unbounded for other estimators. To reduce the set of events to be checked for ($\varepsilon$,$d$-adj) differential privacy: a) we approximate the set $R_{[0,k]}$ by a compact, high-likely set in a data-driven fashion, and b) we finitely partition this set by a mutually exclusive collection of events; see Algorithm 2.

1: function EventListGenerator($\mathcal{M}$, $\mathrm{y}_{0:T}^{1}$, $\beta,\gamma$)
2:     Input: Target estimator $\mathcal{M}$
3:     Input: Sensor data $\mathrm{y}_{0:T}^{1}$
4:     Input: Parameters for Algorithm 3 ($\beta,\gamma$)
5:     HighLikelySet $\leftarrow$ Apply Algorithm 3
6:     EventList $\leftarrow$ a partition of the HighLikelySet
7:     Return EventList
8: end function
Algorithm 2 EventListGenerator

Inspired by [13], which focuses on reachability, we employ the Scenario Optimization approach to approximate a high-likely set via a product of ellipsoids; see Algorithm 3. Here, $\Gamma\equiv\Gamma(\beta)$ defines the number of estimate samples (filter runs) required to guarantee that the output set contains $1-\beta$ of the probability mass of $R_{[0,k]}$ with high confidence $1-\gamma$. Then, a convex optimization problem is solved to find the output set as a hyper-ellipsoid with parameters $A^{k}$ and $b^{k}$.

1: Input: Target estimator $\mathcal{M}$ with dimension $d_{X}$
2: Input: Sensor data $\mathrm{y}_{0:T}^{1}$, parameters $\beta,\gamma$
3: Output: Matrix $A^{k}$ and vector $b^{k}$ representing a $(1-\beta)$-accurate high-likely set at time step $k$, $R_{k}(A^{k},b^{k})=\{x\in\mathbb{R}^{d_{X}}\,|\,\|A^{k}x+b^{k}\|_{2}\leq 1\}$, with confidence $1-\gamma$
4: Set number of samples $\Gamma=\left\lceil\frac{1}{\beta}\frac{e}{e-1}\left(\log\frac{1}{\gamma}+d_{X}(d_{X}+1)/2+d_{X}\right)\right\rceil$
5: for $k\in\{0,\ldots,T\}$ do
6:     for $i\in\{0,\ldots,\Gamma\}$ do
7:         Record $z^{k}_{i}=\mathcal{M}(\mathrm{y}_{0:k}^{1})$
8:     end for
9:     Solve the convex problem $\arg\min_{A^{k},b^{k}}-\log\det A^{k}$ subject to $\|A^{k}z^{k}_{i}-b^{k}\|_{2}-1\leq 0$, $i=0,\ldots,\Gamma$
10:    Store $A^{k},b^{k}$
11: end for
Algorithm 3 HighLikelySet
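As an illustration of the core of Algorithm 3, the following Python sketch fits the minimum-volume (log-det) ellipsoid to a set of recorded estimate samples using CVXPY. It is a reimplementation under stated assumptions, not the code used for our experiments (which were run in MATLAB), and it uses the $\|Ax+b\|\leq 1$ convention of the set $R_{k}$.

import numpy as np
import cvxpy as cp

def sample_count(beta, gamma, d_x):
    # Number of scenario samples Gamma(beta) from step 4 of Algorithm 3.
    e = np.e
    return int(np.ceil((1.0 / beta) * (e / (e - 1.0)) *
                       (np.log(1.0 / gamma) + d_x * (d_x + 1) / 2 + d_x)))

def high_likely_ellipsoid(samples):
    # Minimum-volume ellipsoid {x : ||A x + b|| <= 1} containing all rows of `samples`.
    d_x = samples.shape[1]
    A = cp.Variable((d_x, d_x), PSD=True)
    b = cp.Variable(d_x)
    constraints = [cp.norm(A @ z + b, 2) <= 1 for z in samples]
    cp.Problem(cp.Minimize(-cp.log_det(A)), constraints).solve()
    return A.value, b.value

# Example: with beta = 0.05, gamma = 1e-9 and d_X = 2 this gives Gamma = 814,
# matching Section V; the samples below are only a stand-in for Gamma filter runs.
Gamma = sample_count(beta=0.05, gamma=1e-9, d_x=2)
z = np.random.default_rng(0).normal(size=(Gamma, 2))
A, b = high_likely_ellipsoid(z)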

The data-driven approximation of the high-likely set can now be partitioned using, e.g., a grid per time step. This is what we do in simulation later. For the $W_{2}$-MHE, the last $N$ (window size) steps are not evaluated for differential privacy; thus, the high-likely set is the product of $T+1-N$ ellipsoids. Alternatively, an outline of how to re-use the sample runs of Algorithm 3 to find a finite partition of the approximated set with a common upper-bounded probability is provided as follows.

Observe that $\Gamma\equiv\Gamma(\beta)$ is a function of $\beta$, and denote the output of Algorithm 3 by $R$. For $\eta>\beta$, we can choose a number of samples $\Gamma(\eta)<\Gamma(\beta)$ and solve a convex problem similar to the one in Algorithm 3. The resulting hyper-ellipsoid, $\overline{E}$, satisfies $\overline{E}\cap R\neq\emptyset$, as they contain common sample runs, and $\mathbb{P}(\overline{E}\cap R)=\mathbb{P}(\overline{E})+\mathbb{P}(R)-\mathbb{P}(\overline{E}\cup R)\geq 1-\eta+1-\beta-1=1-\eta-\beta$ with confidence $1-\gamma$. Similarly, the complement of $\overline{E}$, denoted $E$, is such that $E\cap R\neq\emptyset$ and $\mathbb{P}(E\cap R)=\mathbb{P}(R)-\mathbb{P}(R\cap\overline{E})\leq 1-(1-\eta-\beta)=\eta+\beta$. This process can be repeated by choosing different subsets of sample runs to find a finite collection of sets $E_{1}\cap R,\dots,E_{n}\cap R$ such that $\mathbb{P}(E_{i}\cap R)\leq\eta+\beta$ and $(E_{1}\cap R)\cup\dots\cup(E_{n}\cap R)=R$. Without loss of generality, the sets can be assumed to be mutually exclusive by re-assigning set overlaps to one of the sets. This approach can be extended to achieve any desired upper bound $\eta_{d}$ for a desired $\beta_{d}$. To do this, first run Algorithm 3 with respect to $\bar{\beta}$ so that $\eta_{d}>2\bar{\beta}$. This results in a set $\bar{R}$ with probability at least $1-\bar{\beta}$ and high confidence $1-\gamma$. By selecting a subset of sample runs of size $\Gamma(\beta_{d})$ and solving the associated optimization problem, one can obtain $R$ with the desired probability lower bound $1-\beta_{d}$. Now, following the previous strategy for $\bar{\eta}=\eta_{d}-\bar{\beta}$, we can obtain a partition of $R$ and $\bar{R}$ with probability upper bounded by $\bar{\eta}+\bar{\beta}=\eta_{d}$ and high confidence $1-\gamma$.
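For the grid-based partition used later in our simulations (Section V), each per-step ellipsoid can be mapped to the unit ball by $x\mapsto A^{k}x+b^{k}$ and split into $r\times\dots\times r$ cells; an event in EventList is then a tuple of per-step cell indices. A minimal Python sketch of this indexing, assuming the $(A^{k},b^{k})$ pairs from Algorithm 3 are available, is:

import numpy as np

def cell_index(x, A, b, r):
    # Grid cell (per time step) containing the estimate x: the ellipsoid
    # ||A x + b|| <= 1 is mapped to [-1, 1]^d and each axis is split into r bins.
    u = np.clip(A @ x + b, -1.0, 1.0 - 1e-12)
    return tuple(((u + 1.0) / 2.0 * r).astype(int))   # indices in {0, ..., r-1}

def event_of_trajectory(traj, ellipsoids, r=2):
    # An event is the tuple of per-step cell indices of a trajectory estimate;
    # `traj` is a list of per-step estimates, `ellipsoids` a list of (A, b).
    return tuple(cell_index(x, A, b, r) for x, (A, b) in zip(traj, ellipsoids))

With $d_{X}=2$ and $r=2$ there are 4 cells per step, so over $T+1-N=4$ evaluated steps this yields the $4^{4}=256$ events used in Section V.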

We also note that a finite partition of a compact set with a given resolution is guaranteed to exist under mild conditions, as the following lemma states:

Lemma 4

Let $R$ be a compact set in $\mathbb{R}^{d}$. If $\mathbb{P}$ is an absolutely continuous measure wrt the Lebesgue measure in $\mathbb{R}^{d}$, and if the Radon-Nikodym derivative of $\mathbb{P}$ is a continuous function, then there is a finite partition of $R$ of a given resolution $\eta$.

Proof:

First of all, from absolute continuity, $\mathbb{P}(E)=\int_{E}f\,d\mu$, where $\mu$ is the Lebesgue measure and $f$ the Radon-Nikodym derivative. Second, observe that we can take arbitrarily small-volume neighborhoods $E_{x}$ of points $x\in R$ to make $\mathbb{P}(E_{x}\cap R)$ as small as we like. This follows from $\mathbb{P}(E_{x}\cap R)\leq\text{vol}(E_{x})\sup_{E_{x}\cap R}f\leq\text{vol}(E_{x})\max_{R}f\leq\eta$, where $\text{vol}(E_{x})$ is the standard volume of the set $E_{x}$ wrt the Lebesgue measure, and $\max_{R}f<\infty$ by compactness of $R$ and continuity of $f$. Using these neighborhoods, we can construct an open cover $\{E_{x}\}_{x\in R}$ of $R$, i.e. $R\subseteq\cup_{x\in R}E_{x}$. By compactness of $R$, there must exist a finite subcover $\{E_{1},\dots,E_{n}\}$, i.e. $R\subseteq E_{1}\cup\dots\cup E_{n}$. From this cover, we obtain $\{E_{1}\cap R,\dots,E_{n}\cap R\}$, with probabilities $\mathbb{P}(E_{i}\cap R)\leq\eta$ for each $i$. The sets $E_{i}\cap R$ can then be used to construct a finite partition of $R$, with probabilities less than or equal to $\eta$. ∎

IV-A2 WorstEventSelector module.

We now discuss how to select $E^{*}$, a most-likely event that leads to a violation of ($\varepsilon$,$d$-adj) differential privacy; see Algorithm 4.

1: function WorstEventSelector($n,\mathcal{M},\varepsilon,\mathrm{y}_{0:T}^{1},\mathrm{y}_{0:T}^{2}$, EventList)
2:     Input: Target estimator $\mathcal{M}$
3:     Input: Desired differential privacy $\varepsilon$
4:     Input: $d$-adjacent sensor data $\mathrm{y}_{0:T}^{1},\mathrm{y}_{0:T}^{2}$
5:     Input: EventList
6:     $O_{1}$ $\leftarrow$ Estimate set after $n$ runs of $\mathcal{M}(\mathrm{y}_{0:T}^{1})$
7:     $O_{2}$ $\leftarrow$ Estimate set after $n$ runs of $\mathcal{M}(\mathrm{y}_{0:T}^{2})$
8:     pvalues $\leftarrow$ [ ]
9:     for $E\in$ EventList do
10:        $c_{1}$ $\leftarrow$ $|\{i\,|\,O_{1}[i]\in E\}|$
11:        $c_{2}$ $\leftarrow$ $|\{i\,|\,O_{2}[i]\in E\}|$
12:        $p^{+},p_{+}$ $\leftarrow$ PVALUE($c_{1},c_{2},n,\varepsilon$)
13:        $p^{*}$ $\leftarrow$ $\min(p^{+},p_{+})$
14:        pvalues.append($p^{*}$)
15:    end for
16:    WorstEvent $\leftarrow$ EventList[$\operatorname{argmin}\{$pvalues$\}$]
17:    Return $E^{*}=$ WorstEvent
18: end function
Algorithm 4 WorstEventSelector

The returned WorstEvent ($E^{*}$) is then used in the HypothesisTest function of Algorithm 1. First, WorstEventSelector receives an EventList from EventListGenerator. The algorithm runs the estimator $n$ times with each sensor data sequence, and counts the number of estimates that fall in each event of the list. Then, the PVALUE function is run to obtain $p^{+},p_{+}$; see Algorithm 5 (top). The $p$-values quantify the probability of the Type I error of the test (see the next subsection).
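A compact Python sketch of this counting-and-selection step (Algorithm 4) is given below; it assumes an event-indexing function such as the one sketched in Section IV-A1 and the pvalue routine sketched in Section IV-A3 below, and is meant only to illustrate the logic.

from collections import Counter

def worst_event_selector(M, eps, y1, y2, event_list, n, event_of):
    # M(y) returns one random trajectory estimate; event_of maps an estimate
    # to its event (e.g. a tuple of grid-cell indices).
    counts1 = Counter(event_of(M(y1)) for _ in range(n))
    counts2 = Counter(event_of(M(y2)) for _ in range(n))
    p_star = {}
    for E in event_list:
        c1, c2 = counts1[E], counts2[E]
        p_plus, p_minus = pvalue(c1, c2, n, eps)   # PVALUE of Algorithm 5
        p_star[E] = min(p_plus, p_minus)
    return min(p_star, key=p_star.get)             # event with smallest p-value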

IV-A3 HypothesisTest module.

The hypothesis test module aims to verify ($\varepsilon$,$d$-adj) privacy in a data-driven fashion for a fixed $E^{*}$. Given a sequence of $m$ statistically-independent trials, the occurrence or non-occurrence of $\mathcal{M}(\mathrm{y}_{0:T}^{i})\in E$ is a Bernoulli sequence, and the number of occurrences in $m$ trials is distributed as a Binomial.

1: function PVALUE($c_{1},c_{2},n,\varepsilon$)
2:     $\bar{c}_{1}$ $\leftarrow$ B($c_{1},1/e^{\varepsilon}$)
3:     $s$ $\leftarrow$ $\bar{c}_{1}+c_{2}$
4:     $p^{+}$ $\leftarrow$ 1 − Hypergeom.cdf($\bar{c}_{1}-1\,|\,2n,n,s$)
5:     $\bar{c}_{2}$ $\leftarrow$ B($c_{2},1/e^{\varepsilon}$)
6:     $s$ $\leftarrow$ $\bar{c}_{2}+c_{1}$
7:     $p_{+}$ $\leftarrow$ 1 − Hypergeom.cdf($\bar{c}_{2}-1\,|\,2n,n,s$)
8:     return $p^{+},p_{+}$
9: end function
10: function HypothesisTest($m,\mathcal{M},\varepsilon,\mathrm{y}_{0:T}^{1},\mathrm{y}_{0:T}^{2},E^{*}$)
11:    Input: Target estimator $\mathcal{M}$
12:    Input: Desired differential privacy $\varepsilon$
13:    Input: $d$-adjacent sensor data $\mathrm{y}_{0:T}^{1},\mathrm{y}_{0:T}^{2}$
14:    Input: $E^{*}$ (WorstEvent)
15:    $O_{1}$ $\leftarrow$ Estimate set after $m$ runs of $\mathcal{M}(\mathrm{y}_{0:T}^{1})$
16:    $O_{2}$ $\leftarrow$ Estimate set after $m$ runs of $\mathcal{M}(\mathrm{y}_{0:T}^{2})$
17:    $c_{1}$ $\leftarrow$ $|\{i\,|\,O_{1}[i]\in E^{*}\}|$
18:    $c_{2}$ $\leftarrow$ $|\{i\,|\,O_{2}[i]\in E^{*}\}|$
19:    $p^{+},p_{+}$ $\leftarrow$ PVALUE($c_{1},c_{2},m,\varepsilon$)
20:    Return $p^{+},p_{+}$
21: end function
Algorithm 5 HypothesisTest

Thus, the verification reduces to evaluating how the parameters of two Binomial distributions differ. More precisely, define $p_{i}=\mathbb{P}(\mathcal{M}(\mathrm{y}_{0:T}^{i})\in E^{*})$, $i=1,2$. By running the estimator $m$ times, we count the number of times $\mathcal{M}(\mathrm{y}_{0:T}^{i})\in E^{*}$ as $c_{i}$, $i=1,2$. Thus, each $c_{i}$ can be seen as a sample of the Binomial distribution B($m,p_{i}$), for $i=1,2$. However, instead of evaluating $p_{1}\leq p_{2}$, we are interested in testing the null hypothesis $p_{1}\leq\mathrm{e}^{\varepsilon}p_{2}$, which carries an additional factor $\mathrm{e}^{\varepsilon}$. This can be addressed by considering samples $\bar{c}_{1}$ of the B($c_{1},1/e^{\varepsilon}$) distribution. It is easy to see the following:

Lemma 5 ([12])

Let $Y\sim$ B($n,p_{1}$), and let $Z$ be sampled from B($Y,1/e^{\varepsilon}$); then $Z$ is distributed as B($n,p_{1}/e^{\varepsilon}$).

Hence, the problem reduces to testing the null hypothesis $\bar{H}_{0}:\bar{p}_{1}:=p_{1}/e^{\varepsilon}\leq p_{2}$ on the basis of the samples $\bar{c}_{1}$, $c_{2}$. Checking whether or not $\bar{c}_{1}$, $c_{2}$ are generated from the same Binomial distribution can be done via Fisher's exact test [14], with $p$-value equal to 1 − Hypergeom.cdf($\bar{c}_{1}-1\,|\,2m,m,\bar{c}_{1}+c_{2}$) (cumulative hypergeometric distribution). As the test is exact, $c_{1}\gg e^{\varepsilon}c_{2}$ provides firm evidence against the null hypothesis with a given confidence.
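A direct Python transcription of the PVALUE routine might look as follows; it uses the Binomial down-sampling of Lemma 5 and the one-sided hypergeometric tail, with SciPy's hypergeom parameterized by (population size, number of marked items, number of draws), corresponding to the $(2n,s,n)$ arguments above.

import numpy as np
from scipy.stats import hypergeom

def pvalue(c1, c2, n, eps, rng=np.random.default_rng()):
    # One-sided p-values for H0: p1 <= e^eps * p2 and H0: p2 <= e^eps * p1,
    # where c1, c2 are the counts over n runs each (cf. Algorithm 5).
    def one_sided(ca, cb):
        ca_bar = rng.binomial(ca, np.exp(-eps))    # Lemma 5: thin ca by 1/e^eps
        s = ca_bar + cb
        # P(X >= ca_bar) for X ~ Hypergeom(population 2n, marked s, draws n)
        return hypergeom.sf(ca_bar - 1, 2 * n, s, n)
    return one_sided(c1, c2), one_sided(c2, c1)

For instance, pvalue(12, 27, 500, 1.0) returns the two one-sided $p$-values for counts $c_{1}=12$, $c_{2}=27$ over $n=500$ runs (this value of $n$ is arbitrary and for illustration only); values below the significance level $\alpha$ reject the corresponding null hypothesis.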

The $p$-value is the Type I error of the test, i.e. the probability of incorrectly rejecting $H_{0}$ when it is indeed true. The null hypothesis is rejected based on a significance level $\alpha$ (typically $0.1$, $0.05$ or $0.01$). If $p\leq\alpha$, then the probability of making a mistake by rejecting $H_{0}$ is very small (less than or equal to $\alpha$). Since Condition (2) involves the two inequalities $p_{1}\leq e^{\varepsilon}p_{2}$ and $p_{2}\leq e^{\varepsilon}p_{1}$, we need to evaluate two null hypotheses. This results in the $p_{+}$ and $p^{+}$ values, which should both be larger than $\alpha$ if we want to accept $H_{0}$; see Algorithm 5. In simulation, we choose $m>n$ in WorstEventSelector and HypothesisTest for the purpose of i) increasing the accuracy of the test for the worst event, and ii) some additional practical considerations, discussed next.

In the WorstEventSelector, the $p$-values for all the events are computed using the same $\varepsilon$, for example, $\varepsilon=1$. Then the worst event $E^{*}$, which gives the minimum $p$-value, is selected. Later, in HypothesisTest, $c_{1},c_{2}$ are counted with respect to the worst event, and we keep increasing $\varepsilon$ and running PVALUE with the same $c_{1},c_{2}$ until the $p$-value is larger than $\alpha$ (a sketch of this search is given after the list below). The critical values are reported in Section V. The reasons why we select the same $\varepsilon$ in WorstEventSelector are:

  • The $p$-value evolves consistently with respect to the test $\varepsilon$. For example, the event with $c_{1}=12,c_{2}=27$ is the event most likely to exhibit a violation of differential privacy. From Figure 2, we can see that no matter what value of $\varepsilon$ is chosen, the $p$-values for that event are the smallest, which means that, in WorstEventSelector, it will always be returned as the worst event $E^{*}$. Similarly, for the event with $c_{1}=14,c_{2}=27$, the $p$-values are always the largest. Hence, it is safe to select one specific $\varepsilon$ in WorstEventSelector to decide which event is the worst.

    Figure 2: $p$-value evolution with respect to different choices of event. It is clear that no matter what value of $\varepsilon$ is chosen, the $p$-values for the worst event are the smallest.
  • On the other hand, variations may come from scenarios where $c_{2}/c_{1}$ is constant but $(c_{1},c_{2})$ take different values, such as (10,20), (15,30), (20,40). From Figure 3, for different $\varepsilon$, the $p$-values are nearly identical. Hence, in this case, the worst event returned by WorstEventSelector is still one of the events, if any, that are most likely to exhibit a violation of differential privacy.

    Figure 3: $p$-value evolution with respect to different choices of event. For different choices of event, the variations are not significant.
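The critical-value search referred to above can be sketched as follows (using the pvalue function from the previous sketch); the grid range and step size are illustrative, and since the Binomial thinning in pvalue is randomized, the $p$-values may in practice be averaged over a few repetitions.

import numpy as np

def critical_epsilon(c1, c2, m, alpha=0.05, eps_grid=np.arange(0.0, 5.0, 0.005)):
    # Smallest tested epsilon for which neither null hypothesis is rejected,
    # i.e. both one-sided p-values exceed alpha; c1, c2 are the worst-event counts.
    for eps in eps_grid:
        p_plus, p_minus = pvalue(c1, c2, m, eps)
        if min(p_plus, p_minus) > alpha:
            return eps
    return None                                     # no epsilon in the grid passes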

IV-B Theoretical Guarantee

Here, we specify the guarantees of the numerical test.

Theorem 1

Let $\mathcal{M}$ be a state estimator of System (1), and let $\varepsilon,d,\beta,\gamma\in\mathbb{R}_{\geq 0}$. Denote two $d$-adjacent sensor data sequences as $\mathrm{y}_{0:T}^{i}$, $i\in\{1,2\}$, and a partition of the high-likely ($1-\beta$) set $R$ from Algorithm 3 (obtained with high confidence $1-\gamma$) as $\mathcal{P}=\{E_{1},\ldots,E_{n}\}$ such that $\mathbb{P}(E_{i})\leq\eta$ for all $i$. Then, if $\Gamma$ is selected accordingly, and the estimator passes the test in Algorithm 1, $\mathcal{M}$ is approximately ($\varepsilon$,$d$-adj) differentially private wrt $\mathrm{y}_{0:T}^{i}$, $i\in\{1,2\}$, with $\lambda=\beta+2\eta\mathrm{e}^{\varepsilon}$ and confidence $(1-\alpha)(1-\gamma)$.

Proof:

First, from Scenario Optimization, $R$ is a highly likely event ($1-\beta$) with high confidence ($1-\gamma$), and approximate differential privacy with $\lambda=\beta+2\eta e^{\varepsilon}$ is a consequence of the lemmas of Section III. Second, as Fisher's test is exact, the chosen significance level $\alpha$ characterizes exactly the Type I error of the test for any number of samples. If Fisher's test is applied to the worst-case event in the partition, the approximate differential privacy claim (which itself holds with confidence $1-\gamma$) fails to hold with probability at most $\alpha$, yielding the overall confidence $(1-\alpha)(1-\gamma)$. ∎

To extend the result to any two $d$-adjacent sensor data sequences, one would have to sample over the measurement space and evaluate the ratio of passing tests.

V EXPERIMENTS

In this section, we evaluate our test on a toy dynamical system. All simulations are performed in MATLAB (R2020a).

System Example.

Consider a non-isotropic oscillator in $\mathbb{R}^{2}$ with potential function $V(x^{1},x^{2})=\frac{1}{2}((x^{1})^{2}+4(x^{2})^{2})$. The corresponding oscillator particle with position $\mathbf{x}_{k}=(x^{1}_{k},x^{2}_{k})\in\mathbb{R}^{2}$ moves from initial conditions $x^{1}_{0}=5$, $x^{2}_{0}=0$, $\dot{x}^{1}_{0}=0$, $\dot{x}^{2}_{0}=2.5$ under the force $-\nabla V$. The discrete update equations of our (noiseless) dynamic system take the form $\mathbf{x}_{k+1}=f_{0}(\mathbf{x}_{k})=A\mathbf{x}_{k}$, $k\geq 0$, where $A$ is a constant matrix. At each time step $k$, the system state is perturbed by a uniform distribution over $[-0.001,0.001]^{2}$. The distribution of initial conditions is given by a truncated Gaussian mixture with mean vector $(5,0,0,2.5)$ such that $\operatorname{diam}(\mathcal{K}_{0})$ in Theorem 4 of [9] is 0.1.
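The matrix $A$ is not given explicitly; one natural choice, assuming exact discretization of the continuous oscillator dynamics $\ddot{x}^{1}=-x^{1}$, $\ddot{x}^{2}=-4x^{2}$ with an assumed sampling period $\Delta t$ (not specified in the text), is sketched below in Python.

import numpy as np
from scipy.linalg import expm

def oscillator_A(dt=0.1):
    # Exact discretization x_{k+1} = A x_k of the oscillator with potential
    # V = 0.5*(x1^2 + 4*x2^2); state = (x1, x2, x1dot, x2dot). The sampling
    # period dt is an assumption, not a value from the paper.
    K = np.diag([1.0, 4.0])                          # stiffness from grad V
    Ac = np.block([[np.zeros((2, 2)), np.eye(2)],
                   [-K, np.zeros((2, 2))]])
    return expm(Ac * dt)

x = np.array([5.0, 0.0, 0.0, 2.5])                   # initial condition from the text
A = oscillator_A()
x_next = A @ x
x_next[:2] += np.random.uniform(-0.001, 0.001, 2)    # per-step perturbation on the positions
                                                     # (one interpretation of [-0.001, 0.001]^2)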

The target is tracked by a sensor network of 10 nodes located on a circle with center $(0,0)$ and radius $10\sqrt{2}$. The sensor model is homogeneous and given by:

\mathrm{y}_{k}^{i}=h(\mathbf{x}_{k},\mathbf{q}_{i})+\mathrm{v}_{k}^{i}=100\tanh\left(0.1(\mathbf{x}_{k}-\mathbf{q}_{i})\right)+\mathrm{v}_{k}^{i},\quad i=1,\ldots,10,

where $\mathbf{q}_{i}\in\mathbb{R}^{2}$ is the position of sensor $i$ on the circle, and $\mathbf{x}_{k}=(x^{1}_{k},x^{2}_{k})$ is the position of the target at time $k$. Here, the hyperbolic tangent $\tanh$ is applied element-wise. The vector $\mathrm{v}_{k}^{i}\in\mathbb{R}^{2}$ represents the observation noise of sensor $i$, generated in simulation from the same truncated Gaussian mixture distribution at each time step $k$, indicating that the noise magnitude is bounded. All these observations are stacked together as the sensor data $\mathrm{y}_{k}=(\mathrm{y}_{k}^{1\top},\dots,\mathrm{y}_{k}^{10\top})^{\top}\in\mathbb{R}^{20}$ used by the $W_{2}$-MHE.
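The stacked measurement map is straightforward to reproduce; a minimal Python sketch (with one uniform sensor placement on the circle, since the exact angles of $\mathbf{Q_{1}}$ are not given, and noise generation omitted) is:

import numpy as np

angles = 2 * np.pi * np.arange(10) / 10              # assumed uniform placement
Q = 10 * np.sqrt(2) * np.stack([np.cos(angles), np.sin(angles)], axis=1)

def h_stacked(x, sensors=Q):
    # Stacked noiseless observation y_k in R^20 for target position x in R^2:
    # y_k^i = 100 * tanh(0.1 * (x - q_i)), applied element-wise.
    return np.concatenate([100.0 * np.tanh(0.1 * (x - q)) for q in sensors])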

In order to implement the $W_{2}$-MHE filter, we consider a time horizon $T=8$ and a moving horizon $N=5\leq T$.

With these simulation settings, $c_{f}$ and $l$ in Theorem 4 of [9] can be computed as $c_{f}=\|A\|$ and $l=2(N+1)\cdot 100\,\|A\|$.

$d$-adjacent sensor data.

Let $\theta^{1},\theta^{2}\in\mathbb{R}$ represent the angles of the single sensor that is moved to check for differential privacy, and denote $\Delta\theta=\|\theta^{1}-\theta^{2}\|$ (distance on the unit circle). By exploiting the Lipschitz properties of the function $h$, it can be verified that the corresponding measurements satisfy $d_{\mathrm{y}}(\mathrm{y}_{0:T}^{1},\mathrm{y}_{0:T}^{2})\leq 10\sqrt{2(T-N+1)}\,\Delta\theta=20\sqrt{2}\,\Delta\theta$. Thus, in order to generate $d$-adjacent sensor data, we take $\Delta\theta\leq\frac{d}{20\sqrt{2}}$. In the sequel, we take $d=10$.
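For the concrete values used here ($T=8$, $N=5$, $d=10$), this bound reads

\Delta\theta\;\leq\;\frac{d}{20\sqrt{2}}\;=\;\frac{10}{20\sqrt{2}}\;=\;\frac{1}{2\sqrt{2}}\;\approx\;0.354\ \text{rad},

so moving a single sensor by at most approximately $0.354$ rad in angle yields $d$-adjacent sensor data.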

Numerical Verification Results of the $W_{2}$-MHE. Here, we evaluate the differential privacy of the $W_{2}$-MHE with parameters $T=8$ and $N=5$. An entropy factor $s_{k}\in[0,1]$ determines the distribution of the filter, ranging from $s_{k}=1$, $\forall k$ (a deterministic $\mathcal{M}$) to $s_{k}=0$, $\forall k$ (a uniformly distributed random variable).

Fixing $\mathrm{y}_{0:T}^{1}$, we run the $W_{2}$-MHE filter for $\Gamma=814$ runs to obtain a high-likely set characterized by $\beta=0.05$, $\gamma=10^{-9}$. This allows us to produce an ellipsoid that contains at least $0.95$ of the probability mass with confidence $\geq 1-10^{-9}$. At each time step $k$, we consider a grid partition of the high-likely set consisting of 4 regions ($r=2$). The EventList is obtained by storing all of the possible combinations of these sets, which results in a total of $4^{4}=256$ events for $T=8$, $N=5$. Following this, we re-run the $W_{2}$-MHE enough times using both sets of sensor data. As shown in Algorithm 4, we obtain $c_{1},c_{2}$ for each event and, finally, the event with minimum $p$-value is returned as the WorstEvent. After this, we record $c_{1},c_{2}$ with respect to this event from another set of runs. Then, $p$-values are computed for different values of $\varepsilon$. We use these, together with the significance level $\alpha$ (0.05 in this work), to accept or reject the null hypothesis and decide whether ($\varepsilon$,$d$-adj) differential privacy is satisfied.
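The size of the EventList follows directly from the grid resolution and the number of evaluated time steps:

|\text{EventList}|=(r^{2})^{\,T+1-N}=4^{4}=256.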

For a fixed sensor setup $\mathbf{Q_{1}}$, we now report the numerical results. We first show the simulation results when no entropy term exists ($s_{k}=1$, $\forall k$ in Theorem 4 of [9]; we use $s$ to denote $s_{k}$, $\forall k$, below). In Figure 4 (left), the $p$-value is always equal to 0, which means that the null hypothesis should be rejected for every test $\varepsilon$. Thus, the two sets of sensor data are distinguishable when $s=1$. On the other hand, the estimation RMSE using the correct sensor data generated from $\mathbf{Q_{1}}$ ($E_{\text{correct}}$) is almost zero and much smaller than $E_{\text{adjacent}}$. The latter error ($E_{\text{adjacent}}$) employs sensor data that are in fact generated from sensor positions adjacent to $\mathbf{Q_{1}}$.

(a) Hypothesis test results. (b) Estimation accuracy.
Figure 4: $W_{2}$-MHE: State estimation & privacy test results ($s=1$)

Then, we set $s=0.8$; the simulation results are shown in Figure 5 (left). We obtain the critical value $\varepsilon_{c}=0.39947$, at which the $p$-value becomes larger than 0.05. This means that the null hypothesis is not rejected, and we accept that differential privacy holds for $\varepsilon\geq 0.39947$ and $d=10$ for these two sensor data sets. The corresponding estimation errors are found to be $E_{\text{correct}}=0.0040408$ and $E_{\text{adjacent}}=0.026032$ for the estimates using the sensor data generated from $\mathbf{Q_{1}}$ and from adjacent sensor positions, respectively. Recall that $s=0.8$ implies a relatively low noise injection level.

(a) $s=0.8$. (b) $s=0.7$.
Figure 5: $W_{2}$-MHE: Privacy test results

Decreasing $s$ to 0.7 leads to a larger entropy term in the $W_{2}$-MHE: as shown in Figure 5 (right), the critical value becomes smaller ($\varepsilon_{c}=0.11485$), confirming that a higher level of ($\varepsilon$,$d$-adj) differential privacy is achieved. There is also a decrease in accuracy, which can be seen from $E_{\text{correct}}=0.00599996$.

Therefore, the tests reflect the expected trade-off between differential privacy and accuracy. In order to choose between two given estimation methods, a designer can either (i) first set a bound on the tolerable estimation error and then compare the two methods based on the differential privacy level they guarantee according to the given test, or (ii) given a desired level of differential privacy, choose the estimation method that results in the smallest estimation error.

Regarding the approximation term $\lambda$, we can compute its value as discussed in Section IV-B. For each event $E_{i}$, $\mathbb{P}(E_{i})$ can be approximated as $c_{1}/n$, so $\eta$ is obtained via $\max_{i}(\mathbb{P}(E_{i}))$; $\beta=0.05$ is fixed and the $\varepsilon_{c}$ values are known from the tests. Therefore, when $s=0.8$, we get $\eta=0.013$ and $\lambda=0.0888$; when $s=0.7$, we get $\eta=0.010$ and $\lambda=0.0724$.
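For concreteness, the $s=0.8$ value follows from the formula $\lambda=\beta+2\eta\mathrm{e}^{\varepsilon_{c}}$ of Theorem 1:

\lambda=0.05+2(0.013)\,e^{0.39947}\approx 0.05+0.0388=0.0888,

and, analogously, $\lambda=0.05+2(0.010)\,e^{0.11485}\approx 0.0724$ for $s=0.7$.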

Input Perturbation.

The work [8] proposes two approaches to obtain differentially-private estimators. The first one randomizes the output of a regular estimator (the $W_{2}$-MHE belongs to this class). The second one perturbs the sensor data, which is then filtered through the original estimator. An advantage of the latter approach is that users do not need to rely on a trusted server to maintain their privacy, since they can themselves release noisy signals. The question is which of the two approaches leads to a better estimation result for the same level of differential privacy.

We can now compare these approaches numerically for the $W_{2}$-MHE estimator. By selecting $s=1$ and adding Gaussian noise directly to both sets of adjacent sensor data, we re-run our test to find the trade-off between accuracy and privacy. The Gaussian noise has zero mean and covariance matrix $Q=(1-\bar{s})(I+\frac{R+R^{\prime}}{2})$, where $I$ is the identity matrix and $R$ is a matrix of random numbers in $(0,1)$. A value $\bar{s}\in[0,1]$ quantifies how much the sensor data is perturbed. In simulation, we change the value of $\bar{s}$ and compare the results.

We first set $\bar{s}=0.944$ in $Q$; see Figure 6 (left). The critical value and estimation error are found to be $\varepsilon_{c}=0.70306$, $E_{\text{correct}}=0.0010771$, with $\lambda=0.1550$. Then, we decrease $\bar{s}$ to 0.894; see Figure 6 (right). We obtain $\varepsilon_{c}=0.41844$, $E_{\text{correct}}=0.0013998$ and $\lambda=0.1351$. These results again show that a higher level of differential privacy is achieved at a loss of accuracy. For all four cases, the $\lambda$ values are small, indicating that the approximation is meaningful.

(a) $\bar{s}=0.944$. (b) $\bar{s}=0.894$.
Figure 6: Input perturbation: Privacy test results

Comparing ($s=0.8$, $\varepsilon_{c}=0.39947$, $E_{\text{correct}}=0.0040408$) with ($\bar{s}=0.894$, $\varepsilon_{c}=0.41844$, $E_{\text{correct}}=0.0013998$), we find that, although the levels of differential privacy are close, the estimation error for $\bar{s}=0.894$ is only $1/3$ of that for $s=0.8$. Thus, the second mechanism (adding noise directly at the mechanism input) appears to lead to better accuracy while maintaining the same ($\varepsilon$,$d$-adj) differential privacy guarantee for this set of sensors.

We further compare the performance of the two mechanisms on other sensor setups (e.g., sensors uniformly located on the circle); see Table I.

Sensor Setup | $W_{2}$-MHE | Input Perturbation | Better choice
$\mathbf{Q_{2}}$ | $\varepsilon_{c}=0.53229$, $\lambda=0.1011$, $E_{\text{correct}}=0.0049874$ | $\varepsilon_{c}=0.72204$, $\lambda=0.2106$, $E_{\text{correct}}=0.0049674$ | $W_{2}$-MHE
$\mathbf{Q_{3}}$ | $\varepsilon_{c}=0.98768$, $\lambda=0.1037$, $E_{\text{correct}}=0.0030866$ | $\varepsilon_{c}=2.3423$, $\lambda=0.8408$, $E_{\text{correct}}=0.0037826$ | $W_{2}$-MHE
TABLE I: Comparison of the two mechanisms

From the table, we see that the performance depends on the specific sensor setup. In 2 out of 3 sensor setups, perturbing the filter output seems the better option; however, more simulation results are needed to reach a reliable conclusion. Besides, we see that the approximation is only meaningful when differential privacy holds (i.e., when $\varepsilon_{c}$ is relatively small).

Differentially private EKF [8].

The framework can also be applied to other differentially private estimators. For comparison, we evaluate the performance of an extended Kalman filter applied to the same examples. In the compared EKF, random noise $\frac{1-\hat{s}}{\hat{s}}w$ (with $w$ uniformly distributed over $[0,1]$) is added to the filter output at the update step, which makes the estimator differentially private. The initial guess $\mu_{0}$ is the same for the EKF and for the $W_{2}$-MHE.

Table II reports the EKF test results. Compared with the $W_{2}$-MHE, the performance of the EKF is worse with respect to both the privacy level and the RMSE (for all sensor setups). This is consistent with the fact that the $W_{2}$-MHE outperforms the EKF for multi-modal distributions.

Sensor Setup | $\varepsilon_{c}$ | $E_{\text{correct}}$ | Better choice
$\mathbf{Q_{1}}$ | 0.46223 | 0.0066205 | $W_{2}$-MHE
$\mathbf{Q_{2}}$ | 1.9239 | 0.0064686 | $W_{2}$-MHE
$\mathbf{Q_{3}}$ | 2.3085 | 0.0062608 | $W_{2}$-MHE
TABLE II: Differentially-private EKF test results
Correctness of the sufficient condition for the $W_{2}$-MHE.

Theorem 4 of [9] provides a theoretical formula to calculate the $s$ that guarantees ($\varepsilon$,$d$-adj) differential privacy. Since this condition is derived using several assumptions and upper bounds, the answer is in general expected to be conservative.

In order to make comparisons, we choose the sensor setup $\mathbf{Q_{2}}$ and take $s=0.8$ for simulation. The other parameter values are: $T=7$ (for less computation), $c_{f}=1.0777$, $c_{h}=100$, $l=1293.2$ ($N=5$), $\operatorname{diam}(\mathcal{K}_{0})=0.1$, $d=10$. Plugging these values into the theorem, we obtain $\varepsilon\geq 6474.3$. In simulation, the critical value is $\varepsilon_{c}=0.89281$. Upon inspection, it is clear that the theoretical answer is much more conservative than the approximated one, which indicates that if $\varepsilon\geq 0.89281$, differential privacy is satisfied with high confidence wrt the given space partition. While this is a necessary condition for privacy, we run a few more simulations to test how this changes for finer space partitions:

  1. $r=3$ (9 regions per time step): $\varepsilon_{c}=0.98162$

  2. $r=4$ (16 regions per time step): $\varepsilon_{c}=1.9939$

  3. $r\geq 5$: much more computation is required (the number of events increases exponentially)

As observed, a finer space partition leads to an increase of $\varepsilon_{c}$. However, the theoretical bound is still far from the observed values.

VI CONCLUSION

This work presents a numerical test framework to evaluate the differential privacy of continuous-range mechanisms such as state estimators. This includes a precise quantification of its performance guarantees. We then apply the numerical method to differentially-private versions of the $W_{2}$-MHE filter, and compare it with other competing approaches. Future work will be devoted to obtaining more efficient algorithms that, e.g., refine the considered partition adaptively.

References

  • [1] A. Narayanan and V. Shmatikov, “How to break anonymity of the netflix prize dataset,” preprint arxiv:cs/0610105, 2006.
  • [2] B. Hoh, T. Iwuchukwu, Q. Jacobson, D. Work, A. M. Bayen, R. Herring, J. Herrera, M. Gruteser, M. Annavaram, and J. Ban, “Enhancing privacy and accuracy in probe vehicle-based traffic monitoring via virtual trip lines,” IEEE Transactions on Mobile Computing, 2012.
  • [3] C. Dwork, M. Frank, N. Kobbi, and S. Adam, “Calibrating noise to sensitivity in private data analysis,” in Theory of Cryptography, 2006.
  • [4] J. Cortes, G. E. Dullerud, S. Han, J. L. Ny, S. Mitra, and G. J. Pappas, “Differential privacy in control and network systems,” in IEEE Int. Conf. on Decision and Control, 2016, pp. 4252–4272.
  • [5] Y. Wang, Z. Huang, S. Mitra, and G. E. Dullerud, “Entropy-minimizing mechanism for differential privacy of discrete-time linear feedback systems,” in IEEE Int. Conf. on Decision and Control, 2014.
  • [6] E. Nozari, P. Tallapragada, and J. Cortes, “Differentially private distributed convex optimization via functional perturbation,” IEEE Transactions on Control of Network Systems, 2019.
  • [7] J. L. Ny, “Privacy-preserving filtering for event streams,” preprint arXiv: 1407.5553, 2014.
  • [8] J. L. Ny and G. J. Pappas, “Differentially private filtering,” IEEE Transactions on Automatic Control, pp. 341–354, 2014.
  • [9] V. Krishnan and S. Martínez, “A probabilistic framework for moving-horizon estimation: Stability and privacy guarantees,” IEEE Transactions on Automatic Control, pp. 1–1, 06 2020.
  • [10] Y. Chen and A. Machanavajjhala, “On the privacy properties of variants on the sparse vector technique,” 2015.
  • [11] M. Lyu, D. Su, and N. Li, “Understanding the sparse vector technique for differential privacy,” 2016.
  • [12] Z. Y. Ding, Y. X. Wang, G. H. Wang, D. F. Zhang, and D. Kifer, “Detecting violations of differential privacy,” Proc.s of the 2018 ACM SIGSAC Conference on Computer and Communications Security.
  • [13] A. Devonport and M. Arcak, “Estimating reachable sets with scenario optimization,” in Annual Learning for Dynamics & Control Conference, 2020.
  • [14] R. A. Fisher, The design of experiments, 1935.