
Windowed total variation denoising and noise variance monitoring

Zhanhao Liu (Université de Lorraine, CNRS, Inria, IECL, F-54000 Nancy, France; Saint-Gobain Research Paris, 39 Quai Lucien Lefranc, F-93300 Aubervilliers, France), Marion Perrodin (Saint-Gobain Research Paris, 39 Quai Lucien Lefranc, F-93300 Aubervilliers, France), Thomas Chambrion (Institut de Mathématiques de Bourgogne, UMR 5584, CNRS, UBFC, F-21000 Dijon, France), Radu S. Stoica (Université de Lorraine, CNRS, Inria, IECL, F-54000 Nancy, France)
Abstract

We have proposed a real-time Total Variation denoising method with an automatic choice of the hyper-parameter $\lambda$; its good performance opens a large field of applications. In this article, we adapt this method to non-stationary signals by using a sliding window, and we propose a noise variance monitoring method. The simulation results show that our estimate follows well the variation of the noise variance.

I Introduction

The signal $y=(y_{1},\cdots,y_{n})\in\mathbb{R}^{n}$ collected by a sensor can be modeled as $y=u+\epsilon$: a random noise $\epsilon$, with $\mathbb{E}(\epsilon)=0$ and $\mathbb{V}(\epsilon)=\sigma^{2}$, is added to the useful physical quantity $u$.

We aim to recover the unknown vector $u=(u_{1},\cdots,u_{n})\in\mathbb{R}^{n}$ from the noisy sample vector $y=(y_{1},\cdots,y_{n})$, with $y_{i}$ the sample at time $t_{i}$, by minimizing the Total Variation (TV) restoration functional:

$F(u,y,\tau,\lambda)=\sum_{i=1}^{n}\tau_{i}(y_{i}-u_{i})^{2}+\lambda\sum_{i=2}^{n}|u_{i}-u_{i-1}|$   (1)

with the sampling period vector $\tau=(\tau_{1},\cdots,\tau_{n})$, where $\tau_{i}=t_{i}-t_{i-1}$ for $i=2,\ldots,n$ and $\tau_{1}=\tau_{2}$. The restored signal is given by:

$u^{*}(\lambda)=(u^{*}_{1}(\lambda),\cdots,u^{*}_{n}(\lambda))=\arg\min_{u} F(u,y,\tau,\lambda)$   (2)
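As an illustration, the functional (1) can be minimized numerically with a generic solver. The sketch below (Python with NumPy/SciPy; the function names are ours) is only a brute-force stand-in for the fast dedicated algorithms discussed in the rest of the paper:

```python
import numpy as np
from scipy.optimize import minimize

def tv_functional(u, y, tau, lam):
    """Weighted TV functional F(u, y, tau, lambda) of Eq. (1)."""
    fidelity = np.sum(tau * (y - u) ** 2)
    penalty = lam * np.sum(np.abs(np.diff(u)))
    return fidelity + penalty

def tv_restore(y, tau, lam):
    """Numerical minimizer of Eq. (1); a generic derivative-free solver,
    not the fast online algorithm of [3]."""
    res = minimize(tv_functional, x0=y, args=(y, tau, lam), method="Powell")
    return res.x

# toy piece-wise constant signal plus Gaussian noise
rng = np.random.default_rng(0)
y = np.concatenate([np.zeros(20), 2.0 * np.ones(20)]) + 0.1 * rng.standard_normal(40)
tau = np.ones_like(y)        # unit sampling period
u_star = tv_restore(y, tau, lam=1.0)
```

The restored `u_star` attains a lower value of the functional than the noisy input itself, and for a large `lam` it flattens toward the weighted mean of `y`, as stated in Section II.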

The Total Variation based restoration method was first proposed in [1]. The authors of [2] show a one-to-one correspondence between the noise variance $\sigma^{2}$ and $\lambda$ under the hypothesis of a constant $\sigma^{2}$.

Motivated by the local influence of a new sample on the current restoration, we proposed in [3] a new online TV-restoration algorithm with an automatic choice of $\lambda$ for stationary 1D signals, following the work of [4] and [5]. The simulations show that our choice of $\lambda$ performs similarly to the existing methods (SURE [6] and cross-validation). Furthermore, our method is well suited for real-time applications, especially for monitoring a large number of sensors in plants.

In this paper, we adapt our online TV restoration to non-stationary signals and propose a noise variance monitoring method. The key idea is to run our method on sliding windows, assuming a local stationarity hypothesis for the signal. This hypothesis is strong but already commonly used (e.g. Fourier analysis, wavelet transforms). In order to apply our method to non-stationary signals, two problems are to be solved:

  • Adaptation of the online method to the sliding window

  • Choice of the window length (denoted $m$)

The first point ensures the efficiency of the real-time estimation, while the second point guarantees the good performance of the method. Indeed, the choice of $m$ requires a compromise between the local stationarity assumption and the performance of the restoration: the local influence of a new sample vanishes inside a large window, but local stationarity may no longer hold. In this article, we deal with the first point by assuming that the influence of a new sample stays inside the current window. The choice of the window size is the immediate perspective of this work.

An increase of the ground noise is one of the indicators of failures of sensors or production lines. This motivates us to build a real-time application for detecting changes of the variance by tracking the windowed restoration residuals $y-u^{*}(\lambda_{ours})$, based on the automatic determination of the hyper-parameter $\lambda_{ours}$.

Some approaches for the estimation of the noise variance $\sigma^{2}$ are listed here:

  • Most of the existing methods are based on the separation of signal and noise by digital filters [7] or restoration methods (e.g. wavelet transforms) [8] [9] [10]. See [11] for more details.

  • The median absolute deviation (MAD) proposed in [12] is a heuristic method widely used in many applications, especially for choosing the threshold in wavelet restoration. $\sigma$ can be estimated by the median absolute deviation of the finest-scale wavelet coefficients $By$ of the signal.

With respect to the previously cited methods, our approach, based on the separation of signal and noise, provides a procedure for simultaneous signal denoising and variance tracking, while being implementable as an online application. Thanks to its low time and space complexity, our method is well suited to cases with limited computation resources.

The paper continues as follows: in Section II we first present our TV-based denoising method with an automatic choice of the parameter. Then, in Section III, we adapt our online implementation to the sliding window. After that, in Section IV, we present an application of our algorithms, noise variance monitoring, and compare it with an existing method. Finally, conclusions and perspectives are given.

II Automatic total variation denoising method

In this section, we present the automatic TV-denoising method proposed in [3]. We aim to recover the unknown vector $u$ from the noisy samples $y$ by minimizing (1). The restored signal is given by (2).

Since we work on a finite sample of the signal, the solution $u^{*}(\lambda)$ can be seen as piece-wise constant. We use the segment representation [5] for the constant pieces: a set of indices $\{j,j+1,\cdots,k\}$ of consecutive points whose restored values satisfy $u^{*}_{j}(\lambda)=\cdots=u^{*}_{k}(\lambda)$ is called a segment if it cannot be enlarged, i.e. if $u^{*}_{j-1}(\lambda)\neq u^{*}_{j}(\lambda)$ (or $j=1$) and $u^{*}_{k}(\lambda)\neq u^{*}_{k+1}(\lambda)$ (or $k=n$). The number of segments of $u^{*}(\lambda)$ is denoted $K(\lambda)$. The following notations are introduced for the $j^{th}$ segment:

  • Index set $\mathcal{N}_{j}(\lambda)=\{i^{j}_{1},\cdots,i^{j}_{n_{j}}\}$ with $n_{j}(\lambda)=\text{Card}(\mathcal{N}_{j}(\lambda))$, containing the points inside the $j^{th}$ segment.

  • Segment level $v^{*}_{j}(\lambda)=u^{*}_{i}(\lambda),\ \forall i\in\mathcal{N}_{j}(\lambda)$

An equivalent representation of $u^{*}(\lambda)$ is provided by the couple $\{v^{*}(\lambda),\mathcal{N}(\lambda)\}$, with the set of segment levels $v^{*}(\lambda)=(v^{*}_{1}(\lambda),\cdots,v^{*}_{K}(\lambda))$ and the cutting set $\mathcal{N}(\lambda)=\{\mathcal{N}_{1}(\lambda),\cdots,\mathcal{N}_{K}(\lambda)\}$. Knowing the cutting set $\mathcal{N}(\lambda)$, let $s(v)=\{s_{0}(v),s_{1}(v),\cdots,s_{K}(v)\}$ with $s_{i}(v)=\text{sign}(v_{i+1}-v_{i})$ for $i=1,\cdots,K-1$, $s_{i}(v)=0$ for $i\in\{0,K\}$ and $s^{*}=s(v^{*}(\lambda))$; the level of the $j^{th}$ segment is then given by:

$v^{*}_{j}(\lambda)=\overline{y}^{*}_{j}+\frac{\lambda}{2\mathcal{T}_{j}}(s^{*}_{j}-s^{*}_{j-1})$   (3)

with $j=1,\cdots,K(\lambda)$, the segment length $\mathcal{T}_{j}=\sum_{i\in\mathcal{N}_{j}(\lambda)}\tau_{i}$ and the mean value inside the $j^{th}$ segment $\overline{y}^{*}_{j}=\frac{\sum_{i\in\mathcal{N}_{j}(\lambda)}\tau_{i}y_{i}}{\mathcal{T}_{j}}$.
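The segment representation $\{v^{*},\mathcal{N}\}$ above can be extracted mechanically from any piece-wise constant vector. A minimal Python sketch (our own naming; indices are 0-based, unlike the paper):

```python
import numpy as np

def segments(u_star, tol=1e-12):
    """Extract the segment representation {v, N} of a piece-wise constant
    restoration: v holds the segment levels, N the index set of each segment."""
    breaks = np.where(np.abs(np.diff(u_star)) > tol)[0] + 1   # first index of each new segment
    starts = np.concatenate([[0], breaks]).astype(int)
    ends = np.concatenate([breaks, [len(u_star)]]).astype(int)
    N = [list(range(s, e)) for s, e in zip(starts, ends)]     # cutting set (0-based)
    v = [float(u_star[s]) for s in starts]                    # segment levels
    return v, N

v, N = segments(np.array([1.0, 1.0, 3.0, 3.0, 3.0, 2.0]))
# v == [1.0, 3.0, 2.0]; N == [[0, 1], [2, 3, 4], [5]]
```

The number of segments `len(v)` corresponds to $K(\lambda)$ in the notation above.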

Different values of $\lambda$ may provide solutions with distinct cutting sets: $\lambda=0$ gives $u^{*}=y$, while $\lambda=\infty$ implies the most parsimonious solution $u^{*}=\text{mean}(y)$. The authors of [4] and [5] show that there exists a sequence $\Lambda=(\lambda_{1},\cdots)$ such that for every $\lambda\in\Lambda$, two segments are merged together when moving from $\lambda-\eta$ to $\lambda$ with $\eta\to 0$ and $\eta>0$. In [3], we proposed a fast algorithm to estimate the dynamics of the restoration as a function of $\lambda$; the dynamics are stored in $\Lambda^{\circ}=(\lambda^{\circ}_{1},\lambda^{\circ}_{2},\cdots,\lambda^{\circ}_{n-1})$, with $\lambda^{\circ}_{i}$ the value of $\lambda$ for which the points $i$ and $i+1$ are merged into the same segment. $\Lambda^{\circ}$ allows the computation of the cutting set, and hence of the restoration (3), for every $\lambda$.

Based on the variation of the number of extrema (denoted $g(\lambda)$) of the restoration $u^{*}(\lambda)$ as a function of $\lambda$, an adaptive choice of the hyper-parameter $\lambda$ is proposed. The simulations show that our estimate $\lambda_{ours}$ performs similarly to the state of the art, all close to the optimal choice $\lambda_{op}=\arg\min|u^{*}(\lambda)-u_{net}|^{2}$, with $u_{net}$ the original signal. The variation $\Delta g(\lambda)=g(\lambda-\eta)-g(\lambda)$ for every $\lambda\in\Lambda$, with $\eta\to 0$ and $\eta>0$, can be estimated simultaneously with $\Lambda^{\circ}$.

Besides, we analysed the local influence on $\Lambda^{\circ}$ of introducing a new sample at the end of the sequence: only a small part of $\Lambda^{\circ}$ changes. Based on this local property, we proposed an online algorithm (cf. Algorithm 2 in [3]) for updating the changed part of $\Lambda^{\circ}$, which is more efficient than the offline estimation. The locality of the TV-denoising method motivates the adaptation to sliding windows in order to deal with non-stationary signals, by assuming local stationarity inside the window.

III Adaptation to sliding windows

In this section, we present some theoretical elements about the local behaviour of the restoration and adapt the online algorithm to the sliding window.

Consider two successive sliding windows $w_{i}^{m}=[i,i+m-1]$ and $w_{i+1}^{m}=[i+1,i+m]$. $w_{i+1}^{m}$ is indeed $w_{i}^{m}$ from which we remove the first point $(y_{i},\tau_{i})$ and add a new point $(y_{i+m},\tau_{i+m})$. For the sake of simplicity, $F(u_{\{i,\cdots,i+m-1\}},y_{\{i,\cdots,i+m-1\}},\tau_{\{i,\cdots,i+m-1\}},\lambda)$ is denoted $F^{m}_{i}(\lambda)$ for the window $w^{m}_{i}$.

We denote by $u^{*}=(u^{*}_{1},\cdots,u^{*}_{m})=\{v^{*},\mathcal{N}^{*}\}=\arg\min F^{m}_{i}$ the restoration of the window $w^{m}_{i}$ for a given $\lambda$, and by $\hat{u}=(\hat{u}_{1},\cdots,\hat{u}_{m})=\{\hat{v},\hat{\mathcal{N}}\}=\arg\min F^{m}_{i+1}$ that of $w^{m}_{i+1}$. To simplify the presentation, the quantities related to $w^{m}_{i}$ carry the symbol $*$, and those related to $w^{m}_{i+1}$ carry the hat symbol. For example, for a given $\lambda$, the segment length set of $u^{*}(\lambda)$ is $\mathcal{T}^{*}=\{\mathcal{T}^{*}_{1},\cdots,\mathcal{T}^{*}_{K^{*}}\}$ with $\mathcal{T}^{*}_{j}=\sum_{i\in\mathcal{N}^{*}_{j}}\tau_{i}$, and that of $\hat{u}(\lambda)$ is $\hat{\mathcal{T}}=\{\hat{\mathcal{T}}_{1},\cdots,\hat{\mathcal{T}}_{\hat{K}}\}$ with $\hat{\mathcal{T}}_{j}=\sum_{i\in\hat{\mathcal{N}}_{j}}\tau_{i}$.

III-A Influence of slide movement

The update of the restored signal and of $\Lambda^{\circ}$ needs to take into account the index shift from $w^{m}_{i}$ to $w^{m}_{i+1}$: a restored signal point is unchanged under the slide movement if $\hat{u}_{i-1}=u^{*}_{i}$. In [3], we have shown the following theorem:

Theorem 1.

If there exists an index $j\in\{2,\cdots,K^{*}(\lambda)\}$ such that $\text{sign}(v^{*}_{j-1}(\lambda)-v^{*}_{j}(\lambda))=\text{sign}(v^{*}_{K^{*}(\lambda)}(\lambda)-y_{i+m})$, then the new restoration $\hat{u}$ satisfies $\hat{u}_{i-1}(\lambda)=u^{*}_{i}(\lambda)$ for all $i<i^{j,*}_{1}$.

The diffusion from the “new” sample $y_{i+m}$ changes only the last part of the restoration of $w^{m}_{i}$, up to the junction of two segments for which the sign of the variation $v^{*}_{j-1}(\lambda)-v^{*}_{j}(\lambda)$ corresponds to that of $v^{*}_{K^{*}}(\lambda)-y_{i+m}$. We can establish a similar theorem (cf. Theorem 2) about the influence of the first sample $y_{i}$ of the sliding window $w^{m}_{i}$: removing $y_{i}$ changes only the first part of the restoration.

Theorem 2.

If there exists an index $j\in\{2,\cdots,K^{*}(\lambda)\}$ such that $\text{sign}(v^{*}_{j-1}(\lambda)-v^{*}_{j}(\lambda))=\text{sign}(v^{*}_{1}(\lambda)-y_{i})$, then the new restoration $\hat{u}$ satisfies $\hat{u}_{i-1}(\lambda)=u^{*}_{i}(\lambda)$ for all $i\geq i^{j+1,*}_{1}$.

To sum up, to update $u^{*}(\lambda)$ into $\hat{u}(\lambda)$, only the first part and the last part of $w^{m}_{i}$ are changed. We introduce the following definitions:

Definition 1.

With $l=i^{j}_{1}$ and $p=i^{k+1}_{1}-1$, where $j$ is the last segment which satisfies $\text{sign}(v^{*}_{j-1}(\lambda)-v^{*}_{j}(\lambda))=\text{sign}(v^{*}_{K^{*}}(\lambda)-y_{i+m})$ and $k$ is the first segment which satisfies $\text{sign}(v^{*}_{k-1}(\lambda)-v^{*}_{k}(\lambda))=\text{sign}(v^{*}_{1}(\lambda)-y_{i})$,

  • $(u^{*}_{l},\cdots,u^{*}_{m})$ is called the non right-isolated sequence.

  • $(u^{*}_{2},\cdots,u^{*}_{p})$ is called the non left-isolated sequence.

  • $(u^{*}_{p+1},\cdots,u^{*}_{l-1})$ is called the isolated sequence.

III-B Independence between segments

For a given $\lambda$ (denoted $\hat{\lambda}$), the cutting set $\mathcal{N}(\hat{\lambda})$ is given by $\{\lambda^{\circ}>\hat{\lambda}\}$. In [3], we showed that the points inside a segment of $u^{*}(\hat{\lambda})$ are merged for $\lambda\leq\hat{\lambda}$, and that the value of $\lambda$ provoking each merge is independent of the points outside the segment. By introducing the virtual segment (cf. Definition 2), the estimation of $\Lambda^{\circ}$ can be broken into sub-problems, one for each virtual segment, as shown in Proposition 1. Combined with Theorems 1 and 2, the elements of $\{\lambda^{\circ}\leq\hat{\lambda}\}$ inside the isolated sequence are not influenced by the sliding movement.

Definition 2.

Let $\hat{\lambda}$ and $\epsilon_{\lambda}>0$; we have $v^{*}(\hat{\lambda})=\{v_{1}^{*}(\hat{\lambda}),\cdots,v_{l}^{*}(\hat{\lambda})\}$, $\mathcal{N}^{*}(\hat{\lambda})=\{\mathcal{N}_{1},\cdots,\mathcal{N}_{l}\}$, $l=K(\hat{\lambda})$ and $s_{i}=\text{sign}(v_{i+1}^{*}(\hat{\lambda})-v_{i}^{*}(\hat{\lambda}))$. For each segment $\{y_{\mathcal{N}_{j}},\tau_{\mathcal{N}_{j}}\}$ with $j=1,\cdots,l$, let $c_{i}=\frac{\hat{\lambda}+\epsilon_{\lambda}}{2s_{i}}$; we introduce the virtual segment $\{y^{+}_{\mathcal{N}_{j}},\tau^{+}_{\mathcal{N}_{j}}\}$ where:

  • $y^{+}_{\mathcal{N}_{1}}=\{y_{\mathcal{N}_{1}},v_{1}^{*}(\hat{\lambda})+c_{1}\}$, $y^{+}_{\mathcal{N}_{i}}=\{v_{i}^{*}(\hat{\lambda})-c_{i-1},y_{\mathcal{N}_{i}},v_{i}^{*}(\hat{\lambda})+c_{i}\}$ for $i=2,\ldots,l-1$, and $y^{+}_{\mathcal{N}_{l}}=\{v_{l}^{*}(\hat{\lambda})-c_{l-1},y_{\mathcal{N}_{l}}\}$.

  • $\tau^{+}_{\mathcal{N}_{1}}=\{\tau_{\mathcal{N}_{1}},1\}$, $\tau^{+}_{\mathcal{N}_{i}}=\{1,\tau_{\mathcal{N}_{i}},1\}$ for $i=2,\ldots,l-1$, and $\tau^{+}_{\mathcal{N}_{l}}=\{1,\tau_{\mathcal{N}_{l}}\}$.
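Definition 2 is a purely combinatorial construction and can be sketched directly. The Python sketch below (our own naming; `eps` plays the role of $\epsilon_{\lambda}$, and the padding sample written here is $v_{i}\pm c_{i}$ with $c_{i}=(\hat{\lambda}+\epsilon_{\lambda})/(2s_{i})$):

```python
import numpy as np

def virtual_segments(y_segs, tau_segs, v, lam_hat, eps=1e-6):
    """Build the virtual segments of Definition 2.
    y_segs/tau_segs: per-segment lists of samples and sampling periods;
    v: segment levels; lam_hat: the cutting point lambda-hat."""
    l = len(v)
    s = [float(np.sign(v[i + 1] - v[i])) for i in range(l - 1)]
    c = [(lam_hat + eps) / (2.0 * s[i]) for i in range(l - 1)]
    out = []
    for i in range(l):
        y_plus, t_plus = list(y_segs[i]), list(tau_segs[i])
        if i > 0:                    # pad on the left towards the previous segment
            y_plus = [v[i] - c[i - 1]] + y_plus
            t_plus = [1.0] + t_plus
        if i < l - 1:                # pad on the right towards the next segment
            y_plus = y_plus + [v[i] + c[i]]
            t_plus = t_plus + [1.0]
        out.append((y_plus, t_plus))
    return out

# two-segment example: levels 0 and 1, lambda-hat = 1
out = virtual_segments([[0.0, 0.0], [1.0, 1.0]],
                       [[1.0, 1.0], [1.0, 1.0]],
                       [0.0, 1.0], lam_hat=1.0)
```

Each virtual segment can then be fed independently to the path estimation, as stated in Proposition 1 below.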

Proposition 1.

Let $\Lambda^{\circ}$ be the estimation with all the samples $\{y_{i},t_{i}\}_{1,\cdots,n}$, $\Lambda^{\circ}_{i}=\{\lambda_{i,1}^{\circ},\cdots\}$ the estimation with the $i^{th}$ virtual segment $(y^{+}_{\mathcal{N}_{i}},\tau_{\mathcal{N}_{i}}^{+})$, $\Lambda^{*}_{1}=\{\lambda_{1,1}^{\circ},\cdots,\lambda_{1,n_{1}-1}^{\circ}\}$ and $\Lambda^{*}_{i}=\{\lambda_{i,2}^{\circ},\cdots,\lambda_{i,n_{i}}^{\circ}\}$ for $i=2,\ldots,l$; then $\cup_{i=1,\cdots,l}\Lambda^{*}_{i}=\{\lambda\,|\,\lambda\leq\hat{\lambda}\ \text{and}\ \lambda\in\Lambda^{\circ}\}$.

III-C Proposed algorithms

Let $\Delta g^{\circ}=(\Delta g(\lambda^{\circ}_{1}),\Delta g(\lambda^{\circ}_{2}),\cdots,\Delta g(\lambda^{\circ}_{n-1}))$; we denote by $\Lambda^{\circ,i}$ and $\Delta g^{\circ,i}$, two vectors of size $m-1$, the result for $w^{m}_{i}$. The results after sliding ($\Lambda^{\circ,i+1}$ and $\Delta g^{\circ,i+1}$), based on $w^{m}_{i+1}$, can be obtained by updating $\Lambda^{\circ,i}$ and $\Delta g^{\circ,i}$. In this section, we propose an adaptation of the online algorithm of [3] to the sliding windows.

We only detail the online estimation of $\Lambda^{\circ,i+1}$ from $\Lambda^{\circ,i}$. In the following, we denote by $\Lambda^{\circ}=\text{DP-TV}(\{y\},\{\tau\})$ the application of Algorithm 1 in [3] to given sequences $\{y\}$ and $\{\tau\}$.

After choosing $\hat{\lambda}\geq 0$, called the cutting point, the restoration for $w^{m}_{i}$ is $u^{*}(\hat{\lambda})$. We denote by $p$ and $j^{*}_{p}$ respectively the last point and the last segment of the non left-isolated sequence of $u^{*}(\hat{\lambda})$, and by $l$ and $j^{*}_{l}$ respectively the first point and the first segment of the non right-isolated sequence of $u^{*}(\hat{\lambda})$. $\Lambda^{\circ,i+1}$ can be split into four parts: (1) some elements of $\{\lambda^{\circ,i+1}>\hat{\lambda}\}$ are changed following Proposition 1; (2) the non left-isolated sequence $\{\lambda^{\circ,i}_{j}\leq\hat{\lambda}\}$ for $j\leq p$ is influenced by the removal of the first point $y_{i}$; (3) the non right-isolated sequence $\{\lambda^{\circ,i}_{j}\leq\hat{\lambda}\}$ for $j\geq l$ is influenced by the new point $y_{i+m}$; (4) every $\lambda^{\circ,i}_{j}\leq\hat{\lambda}$ remains the same for $p<j<l$.

We treat at first the unchanged part of $\Lambda^{\circ,i+1}$: due to the sliding, we have $\lambda^{\circ,i+1}_{\{\lambda^{\circ,i+1}_{p,\cdots,l-2}\leq\hat{\lambda}\}}=\lambda^{\circ,i}_{\{\lambda^{\circ,i}_{p+1,\cdots,l-1}\leq\hat{\lambda}\}}$.

For the non right-isolated sequence, let $\epsilon_{\lambda}>0$; $\Lambda^{a}=\lambda^{\circ,i+1}_{\{\lambda^{\circ,i+1}_{l-1,\cdots,m}\leq\hat{\lambda}\}}$ can be estimated by $\text{DP-TV}(y^{+}_{a},\tau^{+}_{a})$ with the virtual segment:

  • $y^{+}_{a}=\{v^{*}_{j^{*}_{l}}(\hat{\lambda})-\frac{\hat{\lambda}+\epsilon_{\lambda}}{2\,\text{sign}(v_{j^{*}_{l}}-v_{j^{*}_{l}-1})},y_{\{l+i-1,\cdots,m+i\}}\}$

  • $\tau^{+}_{a}=\{1,\tau_{\{l+i-1,\cdots,m+i\}}\}$

For the non left-isolated sequence, $\Lambda^{b}=\lambda^{\circ,i+1}_{\{\lambda^{\circ,i+1}_{1,\cdots,p-1}\leq\hat{\lambda}\}}$ can be estimated by $\text{DP-TV}(y^{+}_{b},\tau^{+}_{b})$ with the virtual segment:

  • $y^{+}_{b}=\{y_{\{i+1,\cdots,i+p-1\}},v^{*}_{j^{*}_{p}}(\hat{\lambda})+\frac{\hat{\lambda}+\epsilon_{\lambda}}{2\,\text{sign}(v_{j^{*}_{p}+1}-v_{j^{*}_{p}})}\}$

  • $\tau^{+}_{b}=\{\tau_{\{i+1,\cdots,i+p-1\}},1\}$

The isolated and non-isolated sequences can be assembled into $\Lambda^{c}=\{\lambda^{c}_{1},\cdots,\lambda^{c}_{m}\}$ with $\lambda^{c}_{k}=\lambda^{a}_{k-l+3}$ for $k\geq l-1$, $\lambda^{c}_{k}=\lambda^{b}_{k}$ for $k\leq p-1$ and $\lambda^{c}_{k}=\lambda^{\circ,i}_{k+1}$ for $p\leq k\leq l-2$.

It remains to compute $\{\lambda^{\circ,i+1}>\hat{\lambda}\}$. Let $d=\{\lambda^{\circ,i+1}>\hat{\lambda}\}=\{\lambda^{c}_{k}>\hat{\lambda}\}$, which indeed contains all the last points of the segments of $\hat{u}(\hat{\lambda})$; we can then get $\Lambda^{d}=\lambda^{\circ,i+1}_{d}=\text{DP-TV}(\hat{v}(\hat{\lambda}),\hat{\mathcal{T}}(\hat{\lambda}))$.

Finally, Λ,i+1\Lambda^{\circ,i+1} can be assembled in the following way:

$\lambda^{\circ,i+1}_{k}=\begin{cases}\lambda^{c}_{k},&\text{if }\lambda^{c}_{k}\leq\hat{\lambda},\\ \lambda^{d}_{\alpha(k)},&\text{if }\lambda^{c}_{k}>\hat{\lambda},\end{cases}$   (4)

with $\alpha\colon\mathbb{Z}\to\mathbb{Z}$ giving the position of the index $k$ in the vector $d$.

Algorithm 1 Adaptation of the online implementation to the sliding window
Input: $(y_{i},\cdots,y_{i+m-1})$, $(\tau_{i},\cdots,\tau_{i+m-1})$, $(y_{i+m},\tau_{i+m})$, $\Lambda^{\circ,i}$, $\Delta g^{\circ,i}$, $\hat{\lambda}$
Find the non right-isolated sequence $(l,\cdots,m)$ of $u^{*}(\hat{\lambda})$
Find the non left-isolated sequence $(2,\cdots,p)$ of $u^{*}(\hat{\lambda})$
if $p<l-2$ then
     $(\Lambda^{a},\Delta g^{a})=\text{DP-TV}(y^{+}_{a},\tau^{+}_{a})$
     $(\Lambda^{b},\Delta g^{b})=\text{DP-TV}(y^{+}_{b},\tau^{+}_{b})$
     $\Lambda^{\circ,i+1}=\{\Lambda^{b}_{\{1,\cdots,p-1\}},\Lambda^{\circ,i}_{\{p+1,\cdots,l-1\}},\Lambda^{a}_{\{2,\cdots,m-l+2\}}\}$
     $\Delta g^{\circ,i+1}=\{\Delta g^{b}_{\{1,\cdots,p-1\}},\Delta g^{\circ,i}_{\{p+1,\cdots,l-1\}},\Delta g^{a}_{\{2,\cdots,m-l+2\}}\}$
     $d=\{\lambda^{\circ,i+1}>\hat{\lambda}\}$
     $(\lambda^{\circ,i+1}_{d},\Delta g^{\circ,i+1}_{d})=\text{DP-TV}(\hat{v}(\hat{\lambda}),\hat{\mathcal{T}}(\hat{\lambda}))$
else    ▷ Offline approach
     $(\Lambda^{\circ,i+1},\Delta g^{\circ,i+1})=\text{DP-TV}(\{y_{j},\tau_{j}\}_{j=i+1,\cdots,i+m})$
end if
return $\Lambda^{\circ,i+1}$ and $\Delta g^{\circ,i+1}$

The algorithm adapted to the sliding window is gathered in Algorithm 1. With a good choice of $\hat{\lambda}$, only a small part ($\Lambda^{a},\Lambda^{b},\Lambda^{d}$) of the window $w^{m}_{i+1}$ needs to be updated from $w^{m}_{i}$. The overall complexity for estimating the restoration in a window of size $m$ is in $O(m)$.

IV Performance analysis

IV-A Application: noise variance monitoring

We propose a method to detect a shift of the noise variance based on our restoration method. For an observed signal $y$ of size $n$, we take a sliding window of size $m$ with $m<n$. For each window $w_{i}^{m}=[i,i+m-1]$, we apply our denoising algorithm with the automatic choice of $\lambda$ to the sequences $(y_{i},\cdots,y_{i+m-1})$ and $(\tau_{i},\cdots,\tau_{i+m-1})$ to estimate the restored signal $u^{*}(\lambda_{ours})=(u^{*}_{1},\cdots,u^{*}_{m})$. The windowed restoration residual for $j=1,\cdots,m$ is given by $r_{j}=y_{j+i-1}-u^{*}_{j}(\lambda_{ours})$. The variance of the residual inside $w_{i}^{m}$ can be estimated by:

$(\sigma_{i}^{m*})^{2}=\frac{1}{m-1}\sum_{j=1}^{m}(r_{j}-\overline{r}_{i})^{2}$   (5)

with the mean value of the residual $\overline{r}_{i}=\frac{1}{m}\sum_{j=1}^{m}r_{j}$.
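The windowed residual statistic (5) is straightforward to compute once a per-window denoiser is available. In the Python sketch below, `denoise` is a placeholder for the TV restoration with the automatic choice of $\lambda$; we illustrate it here with a crude window-mean stand-in, so the names and the stand-in are ours, not the paper's algorithm:

```python
import numpy as np

def window_residual_std(y, tau, m, denoise):
    """Slide a window of length m over y and return, for each window,
    the residual standard deviation of Eq. (5)."""
    sigma = np.empty(len(y) - m + 1)
    for i in range(len(sigma)):
        y_win, tau_win = y[i:i + m], tau[i:i + m]
        r = y_win - denoise(y_win, tau_win)   # windowed restoration residual
        sigma[i] = r.std(ddof=1)              # 1/(m-1) normalisation as in Eq. (5)
    return sigma

# illustration: pure noise with sigma = 1, crude mean "denoiser"
crude = lambda yw, tw: np.full_like(yw, yw.mean())
rng = np.random.default_rng(1)
y = rng.standard_normal(200)
sigma_m = window_residual_std(y, np.ones_like(y), m=50, denoise=crude)
```

On this pure-noise example the residual standard deviations hover around the true value 1, which is the behaviour the monitoring method exploits.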

For the simulation, the realisation of the noise is $\hat{\epsilon}_{j}=y_{j}-u_{net,j}$, with $u_{net}$ the original signal and $j=1,\cdots,n$. The estimation of the noise variance inside $w_{i}^{m}$ is given by $(\hat{\sigma}_{i}^{m})^{2}=\frac{1}{m-1}\sum_{j=i}^{i+m-1}(\hat{\epsilon}_{j}-\overline{\hat{\epsilon}}_{i})^{2}$ with $\overline{\hat{\epsilon}}_{i}=\frac{1}{m}\sum_{j=i}^{i+m-1}\hat{\epsilon}_{j}$.

Since $u^{*}(\lambda_{ours})$ is a good restoration of $y$, $u^{*}(\lambda_{ours})$ is similar to $u_{net}$, which means that $\sigma_{i}^{m*}$ is close to $\hat{\sigma}_{i}^{m}$ inside each window $w^{m}_{i}$. The noise variance $\hat{\sigma}_{i}^{m}$ is not available for real data, so we propose to detect the variance shift using the residual standard deviation vector $\sigma^{m*}=(\sigma^{m*}_{1},\cdots,\sigma^{m*}_{n-m+1})$, with $\sigma^{m*}_{i}$ obtained by (5) inside $w^{m}_{i}$.

For estimating the noise variance in a window of size $m$, the complexity is in $O(m)$ with the online implementation of the sliding windows, and the overall complexity for monitoring the variance of a signal of size $n$ is in $O((n-m)m)$.

IV-B Results

For variance monitoring, the method needs to estimate the noise variance accurately in the ideal case, or at least to capture the variation of the noise variance in a more realistic case. We use the following criteria to evaluate the performance of variance monitoring:

  • Average bias: $\overline{\text{bias}}=\frac{1}{n-m+1}\sum_{i=1}^{n-m+1}(\hat{\sigma}^{m}_{i}-\sigma^{m*}_{i})$.

  • Ratio of $\hat{\sigma}^{m}$ variation explained:

    $\text{RVE}=1-\frac{\sum_{i=1}^{n-m+1}(\hat{\sigma}^{m}_{i}-\overline{\text{bias}}-\sigma^{m*}_{i})^{2}}{\sum_{i=1}^{n-m+1}(\hat{\sigma}^{m}_{i}-\overline{\hat{\sigma}^{m}})^{2}}$   (6)

    where $\overline{\hat{\sigma}^{m}}=\frac{1}{n-m+1}\sum_{i=1}^{n-m+1}\hat{\sigma}^{m}_{i}$. It is indeed the $R^{2}$ score between $\sigma^{m*}$ and $\hat{\sigma}^{m}$ after adjusting these two vectors to the same mean value. The RVE ranges between 0 and 1, and $\text{RVE}=1$ indicates that our estimation explains perfectly the variation of $\hat{\sigma}^{m}$.
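Both criteria are a few lines of NumPy. A sketch with our own function names, following $\overline{\text{bias}}$ and Eq. (6) directly:

```python
import numpy as np

def average_bias(sigma_hat, sigma_star):
    """Average bias between the oracle sigma-hat and the residual-based sigma*."""
    return np.mean(sigma_hat - sigma_star)

def rve(sigma_hat, sigma_star):
    """Ratio of variation explained (Eq. (6)): an R^2 score computed
    after removing the average bias between the two vectors."""
    b = average_bias(sigma_hat, sigma_star)
    num = np.sum((sigma_hat - b - sigma_star) ** 2)
    den = np.sum((sigma_hat - sigma_hat.mean()) ** 2)
    return 1.0 - num / den

# a constant offset is fully explained: RVE = 1
sigma_star = np.array([1.0, 1.1, 1.2, 1.3])
assert abs(rve(sigma_star + 0.2, sigma_star) - 1.0) < 1e-12
```

This makes explicit why a method with a systematic but constant underestimation can still score $\text{RVE}\approx 1$: the bias is subtracted before the squared differences are accumulated.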

We compare our method with the Median Absolute Deviation (MAD), following the propositions of [12] and [13]: $\sigma^{*}=1.4826\ \text{median}(|By-\text{median}(By)|)$ with $By=\frac{1}{\sqrt{2}}(y_{2}-y_{1},\cdots,y_{n}-y_{n-1})$.
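The MAD baseline is a one-liner in practice. A Python sketch (the $1/\sqrt{2}$ scaling of the first-order differences is the usual Haar finest-scale convention, which we assume here):

```python
import numpy as np

def mad_sigma(y):
    """MAD noise-std estimate on the scaled finest-scale differences of y:
    sigma* = 1.4826 * median(|By - median(By)|), By = diff(y)/sqrt(2)."""
    By = np.diff(y) / np.sqrt(2.0)
    return 1.4826 * np.median(np.abs(By - np.median(By)))

rng = np.random.default_rng(2)
sigma_est = mad_sigma(rng.normal(0.0, 2.0, size=10_000))  # true sigma = 2
```

Differencing cancels slowly varying signal content, which is why MAD is robust to the piece-wise constant part of $y$ but, as the experiments below show, it tracks a time-varying variance less faithfully than the windowed residuals.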

Figure 1: Up: example of a simulated signal $y=u_{net}+\epsilon$ with $\epsilon\sim\mathcal{N}(0,\sigma_{a}^{2}(t))$. Colors: $y$ (red) and $u_{net}$ (blue). Middle: the standard deviation of the noise realisation $\epsilon=y-u_{net}$ (red line), the one estimated by our method (green line) and by MAD (blue line) inside every $w^{400}_{i}$. Down: histogram of the non right-isolated sequence lengths inside each sliding window.

At first, we fix the window length $m=400$. Our method requires a parameter $q$ which stands for the length of the approximation of the derivative; see p. 13 in [3] for more details. We apply our method to a simulated piece-wise constant signal with the parameter $\log_{10}(q)=1$. An example of $y=u_{net}+\epsilon$ with $\epsilon\sim\mathcal{N}(0,\sigma_{a}^{2}(t))$, together with the noise variance estimates for the 400-point windows $w^{400}_{i}$, $i=1,\cdots,1601$, is shown in Figure 1. Both MAD and our method give an estimate similar to the variance of the noise realisation for every sliding window. In this example, our method underestimates the variance, but the variation of our estimate inside each sliding window follows well that of $\hat{\sigma}^{400}$. MAD gives a precise estimate, but does not follow the variation of $\hat{\sigma}^{400}$. The lengths of the non right-isolated sequence of each window (shown in Figure 1) are all smaller than the window size $m=400$, which means the influence of the new sample remains inside each window and validates our assumption about the local influence.

We tested the performance of MAD and of our approach under different noise hypotheses:

  1. $\mathcal{N}(0,\sigma_{a}^{2}(t))$ with $\sigma_{a}(t)=1+0.0005t$.

  2. $\mathcal{N}(0,\sigma_{b}^{2}(t))$ with $\sigma_{b}(t)=1$ for $t\leq 1000$ and $\sigma_{b}(t)=1+0.001t$ for $t>1000$.

  3. $\mathcal{U}(-d(t),d(t))$ with $d(t)=1+0.0005t$.

  4. $\mathcal{N}(0,\sigma_{a}^{2}(t))+\mathcal{U}(-1,1)$.

Figure 2: Result of 100 simulations with the different types of noise (1): $\mathcal{N}(0,\sigma_{a}^{2}(t))$, (2): $\mathcal{N}(0,\sigma_{b}^{2}(t))$, (3): $\mathcal{U}(-d(t),d(t))$ and (4): $\mathcal{N}(0,\sigma_{a}^{2}(t))+\mathcal{U}(-1,1)$. Up: RVE score of our method; Middle: RVE score of MAD; Down: $\overline{\text{Bias}}$ of our method.
Figure 3: RVE score of 100 simulations with different window lengths ($m$).

We ran 100 simulations for each type of noise, and the results are gathered in Figure 2. For most of the simulations, our method has $\text{RVE}>0.95$ and outperforms MAD. The high RVE score indicates that the variation of our estimate fits well that of the real values, which allows us to monitor the variation of the noise variance for real signals collected from plants. However, the $\overline{\text{Bias}}$ results show that our method always underestimates the noise variance, and the bias depends on the type of noise, which means we cannot estimate a universal offset to remove this bias in the general case.

The window length $m$ plays an important role in the restoration and the variance monitoring, and needs to be chosen carefully. A short window can capture the local variation of the noise variance, but the new point (i.e. the $(i+m)^{th}$ point) may still have a strong influence on the past points (i.e. $i-1,i-2,\cdots$), which limits the restoration performance for those points. We test $m=200,400,600$ with the noise $\epsilon\sim\mathcal{N}(0,\sigma^{2}_{a}(t))$, and the results are shown in Figure 3. A longer window provides a better monitoring performance. But the local variation of $\sigma^{2}$ is neglected for a long window, and the stationarity hypothesis inside the window is no longer valid. The choice of $m$ is out of the scope of this article and remains an open question.

To sum up, our method cannot provide a precise estimate of the noise variance values, but our estimate follows well the variation of the real value, with an estimation error depending on the type of noise and the shape of the original signal.

One of the applications of our method is to detect a shift of the noise variance. The residual variance may stay stable (i.e. $\sigma^{m*}_{i}=\sigma$) during the correct functioning period, and a noise variance shift can be detected by a prefixed threshold on $\sigma^{m*}_{i}$ compared to the stable regime: e.g. $\sigma^{m*}_{i}>1.2\sigma$ for several successive sliding windows $w^{m}_{i}$.
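The thresholding rule above can be sketched in a few lines. The factor 1.2 comes from the text; the requirement of several successive windows is expressed here by an illustrative `patience` parameter of our own choosing:

```python
import numpy as np

def detect_shift(sigma_m, sigma_ref, factor=1.2, patience=5):
    """Return the first window index where sigma_m exceeds factor*sigma_ref
    for `patience` successive windows, or -1 if no shift is detected."""
    above = sigma_m > factor * sigma_ref
    run = 0
    for i, flag in enumerate(above):
        run = run + 1 if flag else 0
        if run == patience:
            return i - patience + 1   # first window of the alarm run
    return -1

sigma_m = np.array([1.0, 1.1, 1.0, 1.5, 1.6, 1.5, 1.7, 1.6])
print(detect_shift(sigma_m, sigma_ref=1.0, patience=5))   # → 3
```

Requiring several successive windows above the threshold filters out isolated fluctuations of $\sigma^{m*}_{i}$ and only raises an alarm on a sustained variance shift.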

V Conclusion

In this article, we adapted the TV denoising method proposed in [3] to sliding windows, based on the local property of the TV-denoised signal. We believe our adaptive choice of $\lambda$ also works for non-stationary signals. We applied the adaptation to signals with time-varying noise variance. The simulation results show that the variance of the restoration residuals follows well that of the noise, and this method can be used to monitor the ground noise variance in real time with limited computation resources. However, the performance of these methods (both restoration and variance monitoring) depends on the choice of the window length $m$, which is our on-going research interest. The application to real data is also one of the on-going works.

Acknowledgment

This work has been supported by the EIPHI Graduate School (ANR-17-EURE-0002).

References

  • [1] L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Physica D: Nonlinear Phenomena, 1992.
  • [2] A. Chambolle and P.-L. Lions, “Image recovery via total variation minimization and related problems,” Numerische Mathematik, 1997.
  • [3] Z. Liu, M. Perrodin, T. Chambrion, and R. Stoica, “Revisit 1d total variation restoration problem with new real-time algorithms for signal and hyper-parameter estimations,” arXiv preprint arXiv:2012.09481, 2020.
  • [4] R. J. Tibshirani and J. Taylor, “The solution path of the generalized lasso,” The Annals of Statistics, 2011.
  • [5] I. Pollak, A. S. Willsky, and Y. Huang, “Nonlinear evolution equations as fast and exact solvers of estimation problems,” IEEE Trans. on Signal Processing, 2005.
  • [6] C. M. Stein, “Estimation of the mean of a multivariate normal distribution,” The Annals of Statistics, 1981.
  • [7] D.-H. Shin, R.-H. Park, S. Yang, and J.-H. Jung, “Block-based noise estimation using adaptive gaussian filtering,” IEEE Transactions on Consumer Electronics, 2005.
  • [8] S. Beheshti and M. A. Dahleh, “A new information-theoretic approach to signal denoising and best basis selection,” IEEE Transactions on Signal Processing, 2005.
  • [9] M. Hashemi and S. Beheshti, “Adaptive noise variance estimation in bayesshrink,” IEEE Signal Processing Letters, 2010.
  • [10] J. Han and L. Xu, “A new method for variance estimation of white noise corrupting a signal,” in I2MTC, 2006.
  • [11] S. Pyatykh, J. Hesser, and L. Zheng, “Image noise level estimation by principal component analysis,” IEEE Transactions on Image Processing, 2012.
  • [12] D. L. Donoho and J. M. Johnstone, “Ideal spatial adaptation by wavelet shrinkage,” Biometrika, 1994.
  • [13] S. Sardy and H. Monajemi, “Efficient threshold selection for multivariate total variation denoising,” Journal of Computational and Graphical Statistics, 2019.