This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Compressive Sensing of ECG Signals using Plug-and-Play Regularization111K. N. Chaudhury was partially supported by Core Research Grant CRG/2020/000527 and SERB-STAR Award STR/2021/000011 from the Department of Science and Technology, Government of India.

Unni V. S unniv@iisc.ac.in Ruturaj G. Gavaskar ruturajg@iisc.ac.in Kunal N. Chaudhury kunal@iisc.ac.in
Abstract

Compressive Sensing (CS) has recently attracted attention for ECG data compression. In CS, an ECG signal is projected onto a small set of random vectors. Recovering the original signal from such compressed measurements remains a challenging problem. Traditional recovery methods are based on solving a regularized minimization problem, where a sparsity-promoting prior is used. In this paper, we propose an alternative iterative recovery algorithm based on the Plug-and-Play (PnP) method, which has recently become popular for imaging problems. In PnP, a powerful denoiser is used to implicitly perform regularization, instead of using hand-crafted regularizers; this has been found to be more successful than traditional methods. In this work, we use a PnP version of the Proximal Gradient Descent (PGD) algorithm for ECG recovery. To ensure mathematical convergence of the PnP algorithm, the signal denoiser in question needs to satisfy some technical conditions. We use a high-quality ECG signal denoiser fulfilling this condition by learning a Bayesian prior for small-sized signal patches. This guarantees that the proposed algorithm converges to a fixed point irrespective of the initialization. Importantly, through extensive experiments, we show that the reconstruction quality of the proposed method is superior to that of state-of-the-art methods.

keywords:
ECG signals, compressive sensing, proximal gradient descent, plug-and-play regularization, GMM denoiser.
\affiliation

organization=Department of Electrical Engineering, Indian Institute of Science, city=Bangalore, postcode=560012, country=India

1 Introduction

Electrocardiogram (ECG) is widely used for diagnosis and monitoring cardiac conditions such as hypertension [1], heart failure [2], arrhythmia [3] etc. Essentially, an ECG signal is a representation of the electric activity in the heart over time. Extensive use of ECG in healthcare fosters a need for sophisticated signal processing approaches to efficiently compress, analyze, store and transmit ECG signals. For example, in wearable devices, there is a need to reduce energy consumption due to data transmission and to increase memory usage efficiency. Recently, several efforts have been made to develop wireless ECG sensors for continuous health monitoring, for which it is desirable to have devices with low power consumption or low complexity. However, continuous wireless transmission of long-term biomedical data consumes a significant amount of energy. Thus, compression of ECG signals would be helpful to achieve energy efficiency.

Compressive Sensing (CS) is a possible solution for signal compression. It employs random linear projections that aim to preserve the structure of the signal. The signal can be reconstructed from its projections using nonlinear recovery methods. In fact, several works have explored the application of CS to biomedical signal processing, including ECG [4, 5, 6], EMG [7, 8], EEG [9] signals and MRI images [10].

1.1 Compressive Sensing of ECG Signals

The data acquisition model in CS is given by [5, 6]

𝒚=𝚽𝒙+𝒏\boldsymbol{y}=\boldsymbol{\Phi}\boldsymbol{x}+\boldsymbol{n} (1)

where 𝒙N\boldsymbol{x}\in\mathbb{R}^{N} is the original ECG signal having length NN, 𝚽M×N\boldsymbol{\Phi}\in\mathbb{R}^{M\times N} is a compression matrix with MNM\ll N, and 𝒏\boldsymbol{n} denotes the noise in the acquisition system. Typically, 𝒏\boldsymbol{n} is assumed to be white Gaussian noise with mean 0, whereas 𝚽\boldsymbol{\Phi} is taken to be a random Gaussian or binary matrix [4, 5, 6].

The original signal 𝒙\boldsymbol{x} can be approximately recovered by solving the regularized inversion problem

min𝒙Nf(𝒙)+λg(𝒙),\underset{\boldsymbol{x}\in\mathbb{R}^{N}}{\text{min}}\ \ f(\boldsymbol{x})+\lambda g(\boldsymbol{x}), (2)

where the term f(𝒙)=12𝒚𝚽𝒙2f(\boldsymbol{x})=\frac{1}{2}\|\boldsymbol{y}-\boldsymbol{\Phi}\boldsymbol{x}\|^{2} forces consistency of the recovered signal w.r.t. the measurements, whereas g(𝒙)g(\boldsymbol{x}) (known as the regularizer) acts as a penalty function that forces the recovered signal to have some desirable properties such as smoothness. Here, λ\lambda is a positive scalar used to control the amount of regularization and \lVert\cdot\rVert denotes the 2\ell_{2} norm.

A good regularizer g(𝒙)g(\boldsymbol{x}) is necessary since recovering 𝒙\boldsymbol{x} from 𝒚\boldsymbol{y} is an ill-posed problem (as MNM\ll N). Several regularizers have been explored for the ECG compressive recovery task, such as the weighted 1\ell_{1} norm [11], various other p\ell_{p} norms [12], total variation (TV) [13], and second-order sparsity-promoting functions [6]. Moreover, efficient recovery algorithms have been derived by exploiting the temporal correlation between successive samples; examples include p1d\ell_{p}^{1d}-regularized least-squares [14], Block Sparse Bayesian Learning Bound-Optimization (BSBL-BO) [15], and Block Sparse Bayesian Learning with Expectation Maximization (BSBL-EM) [16]. The latter two are considered to be state-of-the-art.

1.2 Classical Regularization

It is well-known that the 1\ell_{1} norm promotes sparse solutions; moreover, natural signals such as ECG are known to be approximately sparse in suitably chosen domains [11, 13]. Therefore, sparsity-promoting regularizers based on the 1\ell_{1} norm in the wavelet and gradient domains (TV) have traditionally been used for ECG reconstruction. The downside is that the 1\ell_{1} norm is not differentiable. Hence, the objective function in (2) as a whole is non-differentiable, even though f(𝒙)f(\boldsymbol{x}) is differentiable. A good choice of iterative numerical solvers to solve such problems is the class of proximal algorithms [17], such as ADMM and Proximal Gradient Descent (PGD). A proximal algorithm generally consists of smaller subproblems which individually involve only one of the two functions. Consequently, it is possible to take advantage of the differentiability of f(𝒙)f(\boldsymbol{x}). In this paper, we focus on PGD since it is a particularly simple proximal algorithm. PGD is sometimes known as the Iterative Shrinkage-Thresholding Algorithm (ISTA) [18]. Starting from an initial point 𝒙0N\boldsymbol{x}_{0}\in\mathbb{R}^{N}, PGD creates a sequence of points 𝒙1,𝒙2,\boldsymbol{x}_{1},\boldsymbol{x}_{2},\ldots recursively using the rule

𝒙k+1=proxγλg(𝒙kγf(𝒙k)),\boldsymbol{x}_{k+1}=\mathrm{prox}_{\gamma\lambda g}\big{(}\boldsymbol{x}_{k}-\gamma\nabla f(\boldsymbol{x}_{k})\big{)}, (3)

where γ>0\gamma>0 is a fixed parameter (known as the step size) and proxγλg()\mathrm{prox}_{\gamma\lambda g}(\cdot) is a function known as the proximal operator of gg:

proxγλg(𝒛)=argmin𝒘N[12𝒘𝒛2+γλg(𝒘)].\mathrm{prox}_{\gamma\lambda g}(\boldsymbol{z})=\mathop{\mathrm{argmin}}_{\boldsymbol{w}\in\mathbb{R}^{N}}\left[\frac{1}{2}\|\boldsymbol{w}-\boldsymbol{z}\|^{2}+\gamma\lambda g(\boldsymbol{w})\right]. (4)

Note that if we put 𝚽=𝐈\boldsymbol{\Phi}=\mathbf{I} in (2), then (2) reduces to (4). Thus, the proximal operator can be interpreted effectively as a Gaussian denoising operator. For regularizers such as the 0\ell_{0} and 1\ell_{1} norms, the proximal operator has a closed-form formula [17]. Hence, the PGD algorithm is easy to implement. As discussed in Section 2, the algorithm is guaranteed to converge under some mild conditions to a minimum of f(𝒙)+λg(𝒙)f(\boldsymbol{x})+\lambda g(\boldsymbol{x}). Note that every PGD step can be seen as the composition of two operations: the first (computing 𝒙kγf(𝒙k)\boldsymbol{x}_{k}-\gamma\nabla f(\boldsymbol{x}_{k})) is effectively one step of the classical gradient descent algorithm and depends only on the function ff, while the second (computing the proximal operator) depends only on gg. This is why the algorithm is named as Proximal Gradient Descent.

1.3 Plug-and-Play Regularization

Plug-and-play (PnP) regularization is a novel regularization technique developed in the image processing community [19, 20]. The main step in PnP is to replace the proximal operator by a powerful signal denoiser. As discussed in Section 2, this is due to the similarity of the proximal operator with a denoising operation. In the context of PGD, the function proxγλg()\mathrm{prox}_{\gamma\lambda g}(\cdot) is replaced by a signal denoiser D()D(\cdot), so that the kk-th step now becomes 𝒙k+1=D(𝒙kγf(𝒙k))\boldsymbol{x}_{k+1}=D\big{(}\boldsymbol{x}_{k}-\gamma\nabla f(\boldsymbol{x}_{k})\big{)}. This algorithm is known as PnP-PGD. Essentially, this amounts to taking one step of gradient descent, followed by denoising. Note that we no longer need to choose a regularizer g(𝒙)g(\boldsymbol{x}) since gg does not appear in the modified algorithm; the regularization is performed implicitly by the denoiser.

PnP has yielded state-of-the-art results in many imaging problems. However, since there is no regularizer gg involved, the aforementioned convergence to a minimum of f(𝒙)+λg(𝒙)f(\boldsymbol{x})+\lambda g(\boldsymbol{x}) does not apply.

1.4 Contribution

The contributions of this work are as follows.

  1. 1.

    We introduce the PnP framework for reconstructing ECG signals from CS measurements. To the best of our knowledge, PnP has never been used for this application. Through extensive experiments, we show that the proposed method, based on PnP-PGD, outperforms the current state-of-the-art CS recovery methods for ECG signals.

  2. 2.

    Even though convergence of PGD to a minimum of f+λgf+\lambda g is not applicable to PnP-PGD, we show that a different form of convergence, known as fixed-point convergence, can be guaranteed if the ECG denoiser D()D(\cdot) satisfies a condition known as contractivity. Thus, the challenge lies in designing a contractive ECG denoiser.

  3. 3.

    We derive a high-quality contractive ECG denoiser D()D(\cdot) by modeling small patches as random vectors following a Gaussian Mixture Model (GMM). We experimentally show that the denoising performance of the GMM denoiser is comparable or better than existing state-of-the-art ECG denoisers.

The rest of this paper is organized as follows. In Section 2, we give an overview of the PnP-PGD algorithm and explain the motivation behind its development. We derive the GMM denoiser in Section 3 and compare its denoising quality with existing ECG signal denoisers. In Section 4, we discuss how the this denoiser can be used in a way that guarantees convergence of PnP-PGD for CS recovery. Numerical experiments on CS recovery of ECG signals are shown in Section 5, and we conclude the paper in Section 6. Some of the mathematical proofs are given in the Appendix.

2 Plug-and-Play PGD

We first state a standard convergence result for PGD and then move on to discuss some convergence-related aspects of PnP. What makes PGD a simple yet powerful algorithm is its guarantee of convergence to a minimum of f(𝒙)+λg(𝒙)f(\boldsymbol{x})+\lambda g(\boldsymbol{x}). In the following theorem (and the rest of the paper), we denote the largest singular value of a matrix by σmax()\sigma_{\mathrm{max}}(\cdot).

Theorem 1 ([18]).

Consider the PGD algorithm for minimizing the function f(𝐱)+λg(𝐱)f(\boldsymbol{x})+\lambda g(\boldsymbol{x}), where f(𝐱)=12𝐲𝚽𝐱2f(\boldsymbol{x})=\frac{1}{2}\|\boldsymbol{y}-\boldsymbol{\Phi}\boldsymbol{x}\|^{2} Suppose gg is continuous and convex, and that 0<γ<2/σmax(𝚽𝚽)0<\gamma<2/\sigma_{\mathrm{max}}(\boldsymbol{\Phi}^{\top}\boldsymbol{\Phi}). Then the sequence f(𝐱k)+γg(𝐱k)f(\boldsymbol{x}_{k})+\gamma g(\boldsymbol{x}_{k}) converges to the minimum of f(𝐱)+λg(𝐱)f(\boldsymbol{x})+\lambda g(\boldsymbol{x}) as kk\to\infty.

We now turn our attention to the PnP framework. Consider the definition of proxγλg(𝒛)\mathrm{prox}_{\gamma\lambda g}(\boldsymbol{z}) in (4). Note that the minimization problem in (4) is similar to (2) if we put 𝚽=𝐈\boldsymbol{\Phi}=\mathbf{I}, the identity matrix. Hence, proxγλg(𝒛)\mathrm{prox}_{\gamma\lambda g}(\boldsymbol{z}) is essentially a regularized inverse corresponding to the additive noise model 𝒛=𝒘+𝒏\boldsymbol{z}=\boldsymbol{w}+\boldsymbol{n}, where 𝒏\boldsymbol{n} is Gaussian noise. Thus, the proximal operator is simply an additive Gaussian denoising operator.

It is well-known in the image processing community that specially designed denoisers such as nonlocal means (NLM) [21] and BM3D [22] are superior to traditional denoisers based on regularization, e.g. 1\ell_{1} or TV-regularized denoising. Motivated by this observation, the work in [19] explored how the performance of a proximal algorithm for image recovery problems is affected if we replace proxγλg()\mathrm{prox}_{\gamma\lambda g}(\cdot) by some arbitrary Gaussian denoiser D()D(\cdot), such as NLM or BM3D. This scheme was named as plug-and-play, since the denoiser DD serves as a pluggable module that replaces the proximal operator in an already existing numerical solver. In the original work [19], the PnP scheme was explored for a different proximal algorithm – ADMM – but it was adapted to PGD in [20]. The PnP-PGD algorithm is thus recursively defined by

𝒙k+1=D(𝒙kγf(𝒙k)).\boldsymbol{x}_{k+1}=D\big{(}\boldsymbol{x}_{k}-\gamma\nabla f(\boldsymbol{x}_{k})\big{)}. (5)

Note that the same denoiser DD can be utilized for several kinds of image recovery problems using the PnP framework, since only the function ff changes from problem to problem. For this reason, in the past few years, PnP has gained a lot of interest in the imaging community. However, the use of PnP for recovering one-dimensional signals such as ECG signals has remained an unexplored territory.

An immediate question that arises from the PnP scheme is as follows: Does the sequence (𝒙k)(\boldsymbol{x}_{k}) converge to some 𝒙\boldsymbol{x}^{\ast}? And if so, is 𝒙\boldsymbol{x}^{\ast} optimal in some sense? The latter question can be resolved if D()D(\cdot) is expressible as the proximal operator of some function gg. In general, however, an arbitrary DD cannot be expressed in this way. As a result, the PnP-PGD algorithm cannot be interpreted as minimizing an objective function of the form f+λgf+\lambda g, and the convergence result in Theorem 1 is not generally applicable. Therefore, we are left with trying to determine whether at least the sequence (𝒙k)(\boldsymbol{x}_{k}) converges. It turns out that such a guarantee can indeed be given under a technical condition on DD.

Definition 1.

The denoiser D:NND:\mathbb{R}^{N}\to\mathbb{R}^{N} is said to be contractive if there exists δ[0,1)\delta\in[0,1) such that for all points 𝐳1,𝐳2N\boldsymbol{z}_{1},\boldsymbol{z}_{2}\in\mathbb{R}^{N},

D(𝒛1)D(𝒛2)δ𝒛1𝒛2\lVert D(\boldsymbol{z}_{1})-D(\boldsymbol{z}_{2})\rVert\leqslant\delta\lVert\boldsymbol{z}_{1}-\boldsymbol{z}_{2}\rVert

We now state a theorem that guarantees the convergence of PnP-PGD using a contractive denoiser.

Theorem 2.

Consider the PnP-PGD algorithm, 𝐱k+1=D(𝐱kγf(𝐱k))\boldsymbol{x}_{k+1}=D\big{(}\boldsymbol{x}_{k}-\gamma\nabla f(\boldsymbol{x}_{k})\big{)}, where f(𝐱)=(1/2)𝚽𝐱𝐲2f(\boldsymbol{x})=(1/2)\lVert\boldsymbol{\Phi}\boldsymbol{x}-\boldsymbol{y}\rVert^{2}. Suppose 0<γ2/σmax(𝚽𝚽)0<\gamma\leqslant 2/\sigma_{\mathrm{max}}(\boldsymbol{\Phi}^{\top}\boldsymbol{\Phi}). Moreover, suppose the denoiser DD is contractive. Then, as kk\to\infty, the sequence 𝐱1,𝐱2,\boldsymbol{x}_{1},\boldsymbol{x}_{2},\ldots converges linearly (at an exponential rate) to a unique fixed point 𝐱\boldsymbol{x}^{\ast} that does not depend on the initialization 𝐱0\boldsymbol{x}_{0}.

While Theorem 2 is proved in the Appendix, we mention here that the proof uses the Banach Fixed Point Theorem [23, Th. 9.23], from which the linear rate of convergence follows.

Note the difference between the types of convergence addressed in Theorems 1 and 2: Theorem 1 claims the convergence of the sequence of objective function values f(𝒙k)+λg(𝒙k)f(\boldsymbol{x}_{k})+\lambda g(\boldsymbol{x}_{k}), whereas Theorem 2 is concerned with the sequence of variables 𝒙k\boldsymbol{x}_{k}. Theorem 2 essentially claims that the PnP-PGD algorithm eventually stabilizes, in the sense that two consecutive iterates 𝒙k\boldsymbol{x}_{k} and 𝒙k+1\boldsymbol{x}_{k+1} are close to each other. This property is known as fixed-point convergence [24], and is desirable for any recovery algorithm. Thus, by Theorem 2, it is sufficient for the denoiser DD to be contractive, in order to have fixed-point convergence.

It is useful to compare our result with a similar result in a recent work [25]. In [25], the authors proved fixed-point convergence of PnP-PGD under a different set of assumptions than ours. The convergence result in [25] is applicable to the case where the loss function f(𝒙)f(\boldsymbol{x}) is strongly convex and DID-I is Lipschitz continuous, where II is the identity operator. In contrast, we require DD to be contractive and we do not require strong convexity of ff. In fact, in our case, ff is not strongly convex since the sensing matrix 𝚽\boldsymbol{\Phi} has a non-trivial null space.

Various methods for Gaussian denoising of ECG signals have been explored in the literature; see [26] for a review. The state-of-the-art techniques are optimization-based, e.g. TV, multiresolution analysis methods such as wavelets, empirical mode decomposition methods etc. A combination of these methods is sometimes used [27]. Further, nonlocal means (NLM) denoising has also been found to be promising [28]. However, to the best of our knowledge, there is no work that determines whether any of these denoisers are contractive. Can we design a high-quality contractive ECG signal denoiser? In the next section, we show that this can indeed be done. Specifically, we design a Gaussian denoiser which takes the form D(𝒛)=𝐖𝒛D(\boldsymbol{z})=\mathbf{W}\boldsymbol{z}, where 𝐖\mathbf{W} is a symmetric matrix. The resulting PnP-PGD algorithm outperforms state-of-the-art methods for ECG.

Refer to caption
Figure 1: Comparison of the denoising performance using various state-of-the-art signal denoisers on an ECG signal of length N=200N=200. The clean signal (green) is overlaid in every plot for comparison. Note that the denoised signal using GMM has more structural similarity with the clean signal.

3 GMM Denoiser

Our ECG denoiser is inspired from an observation that was made in the context of images [29, 30]: A small patch of some fixed size belonging to a clean (i.e. noiseless) image can be well-modeled as a random vector having a Gaussian Mixture Model (GMM) as its density. Such a density can be learned by fitting a GMM to a large collection of patches extracted from a set of clean images, usually belonging to a common class (e.g. face images). We apply this idea to model patches belonging to ECG signals. Specifically, we extract a large collection of patches of length PNP\ll N from a set of noiseless ECG signals as our training data set. This is used to fit a GMM density (with a pre-determined number of components KK) using the expectation-maximization (EM) algorithm. Essentially, we model clean ECG patches of length PP as random vectors 𝒗P\boldsymbol{v}\in\mathbb{R}^{P} drawn from this learned GMM density, which we denote by p(𝒗)p(\boldsymbol{v}). For j=1,,Kj=1,\ldots,K, let αj0\alpha_{j}\geqslant 0 be the mixture coefficient of the jj-th component, 𝝁jP\boldsymbol{\mu}_{j}\in\mathbb{R}^{P} be the mean and 𝚺j\boldsymbol{\Sigma}_{j} be the positive definite covariance matrix. Then p(𝒗)p(\boldsymbol{v}) is given by

p(𝒗)=j=1Kαj𝒩(𝒗;𝝁j,𝚺j),p(\boldsymbol{v})=\sum_{j=1}^{K}\alpha_{j}\mathcal{N}(\boldsymbol{v};\boldsymbol{\mu}_{j},\boldsymbol{\Sigma}_{j}), (6)

where 𝒩\mathcal{N} denotes a Gaussian density function.

How can this model be used for denoising an ECG signal corrupted with Gaussian noise? Again, we borrow from a patch-based denoising framework that is quite popular in image processing [29, 30]. At a high level, this framework consists of the following steps:

  1. 1.

    Extract all possible patches of length PP from the noisy signal; if the signal length is NN, then there are NN such patches (we apply circular padding to the signal). If 𝒛\boldsymbol{z} denotes the noisy signal, the collection of patches is given by 𝐏1𝒛,,𝐏N𝒛\mathbf{P}_{1}\boldsymbol{z},\ldots,\mathbf{P}_{N}\boldsymbol{z}, where 𝐏i:NP\mathbf{P}_{i}:\mathbb{R}^{N}\to\mathbb{R}^{P} is the linear operator that extracts the patch starting at the ii-th location. This is defined as the segment (zi,zi+1,,zi+P1)(z_{i},z_{i+1},\ldots,z_{i+P-1}).

  2. 2.

    Denoise each patch independently by computing a Bayesian estimate of its corresponding clean patch under an additive Gaussian noise model, using p(𝒗)p(\boldsymbol{v}) as the prior distribution of clean patches. Letting GG denote the denoising operator, the collection of denoised patches is given by G(𝐏1𝒛),,G(𝐏N𝒛)G(\mathbf{P}_{1}\boldsymbol{z}),\ldots,G(\mathbf{P}_{N}\boldsymbol{z}).

  3. 3.

    Place each denoised patch back into its corresponding location in the signal. Each sample location i{1,,N}i\in\{1,\ldots,N\} belongs to PP overlapping denoised patches; take the average of the PP values at this location as the estimate of the ii-th sample of the denoised signal. This completes the overall denoising process, which is given by

    D(𝒛)=1Pi=1N𝐏iG(𝐏i𝒛).D(\boldsymbol{z})=\frac{1}{P}\sum_{i=1}^{N}\mathbf{P}_{i}^{\top}G\big{(}\mathbf{P}_{i}\boldsymbol{z}\big{)}. (7)

The patch denoiser GG forms the core of this framework, and the overall denoising performance depends on the performance of GG. One approach to incorporate the Bayesian prior p()p(\cdot) in the patch denoising is to take G()G(\cdot) as the maximum a-posteriori (MAP) estimator of the clean patch under a Gaussian noise model. That is, for a noisy patch 𝒖=𝒗+𝒏\boldsymbol{u}=\boldsymbol{v}+\boldsymbol{n} (where 𝒏\boldsymbol{n} is Gaussian noise), we can define G(𝒖)G(\boldsymbol{u}) to be the mode of the conditional density p(𝒗|𝒖)p(\boldsymbol{v}|\boldsymbol{u}). However, it is known that this cannot be computed in closed form when the prior p(𝒗)p(\boldsymbol{v}) is a GMM [29]. Instead, motivated by [30], we choose GG to be the minimum mean-squared-error (MMSE) estimator of the clean patch:

G(𝒖)=𝔼[𝒗|𝒖].G(\boldsymbol{u})=\mathbb{E}[\boldsymbol{v}|\boldsymbol{u}]. (8)

The theorem below gives a closed-form expression for 𝔼[𝒗|𝒖]\mathbb{E}[\boldsymbol{v}|\boldsymbol{u}].

Theorem 3 ([30]).

Consider the additive noise model 𝐮=𝐯+𝐧\boldsymbol{u}=\boldsymbol{v}+\boldsymbol{n}, where 𝐧\boldsymbol{n} is zero-mean Gaussian noise having variance σ2\sigma^{2}. Suppose 𝐯\boldsymbol{v} has the GMM density given by (6). Then,

G(𝒖)=(j=1Kβj(𝒖)𝐂j)𝒖,G(\boldsymbol{u})=\left(\sum_{j=1}^{K}\beta_{j}(\boldsymbol{u})\mathbf{C}_{j}\right)\boldsymbol{u},

where 𝐂j=𝚺j(𝚺j+σ2𝐈)1\mathbf{C}_{j}=\boldsymbol{\Sigma}_{j}(\boldsymbol{\Sigma}_{j}+\sigma^{2}\mathbf{I})^{-1} and

βj(𝒖)=αj𝒩(𝒖;𝝁j,𝚺j+σ2𝐈)l=1Kαl𝒩(𝒖;𝝁l,𝚺l+σ2𝐈).\beta_{j}(\boldsymbol{u})=\frac{\alpha_{j}\mathcal{N}(\boldsymbol{u};\boldsymbol{\mu}_{j},\boldsymbol{\Sigma}_{j}+\sigma^{2}\mathbf{I})}{\sum_{l=1}^{K}\alpha_{l}\mathcal{N}(\boldsymbol{u};\boldsymbol{\mu}_{l},\boldsymbol{\Sigma}_{l}+\sigma^{2}\mathbf{I})}.

To summarize, the overall GMM denoiser DD is given by

D(𝒛)=1Pi=1N𝐏i[(j=1Kβj(𝐏i𝒛)𝐂j)𝐏i𝒛].D(\boldsymbol{z})=\frac{1}{P}\sum_{i=1}^{N}\mathbf{P}_{i}^{\top}\left[\left(\sum_{j=1}^{K}\beta_{j}\big{(}\mathbf{P}_{i}\boldsymbol{z}\big{)}\mathbf{C}_{j}\right)\mathbf{P}_{i}\boldsymbol{z}\right]. (9)

In order to gauge the quality of the GMM denoiser, we perform a denoising experiment on a noisy ECG signal. Specifically, we compare its performance with the following ECG signal denoising schemes: TV [31], NLM [28], and wavelet-1\ell_{1} regularization. In Figure. 1, we show a segment of signal #115\#115 (assumed noiseless) from the Physionet MIT-BIH Arrhythmia Database [32, 33, 34]. The segment has length N=200N=200. We add white Gaussian noise such that the signal-to-noise ratio (SNR) of the noisy signal is 3030 dB. The SNR for an estimated signal 𝒙^\hat{\boldsymbol{x}} (here, the denoised signal) with respect to a reference signal 𝒙\boldsymbol{x} (here, the clean signal) is defined as

SNR= 10log10(𝒙2𝒙𝒙^2).\text{SNR}=\ 10\log_{10}\Bigg{(}\frac{||\boldsymbol{x}||^{2}}{||\boldsymbol{x}-\hat{\boldsymbol{x}}||^{2}}\Bigg{)}.

A higher SNR value indicates a better estimation quality. The denoised signals obtained using the aforementioned denoising schemes are shown in Figure. 1. Observe that the GMM denoiser yields the highest SNR of all the methods; the visual quality is considerably better compared to TV and wavelet, and comparable to NLM.

For a more extensive comparison, we repeat this experiment for different SNR values of the noisy signal. The SNR values of the denoised signal are noted in Table 1. Again, it is observed that the GMM denoiser outperforms the other denoisers, while NLM is the second-best method. In fact, for high noise levels (SNR of 1515 and 2020 dB), the gap in performance between GMM and NLM is quite high. A possible explanation is that when the noise level is high, reliable computation of weights for NLM is difficult and can result in spikes in the denoised signal.

Table 1: Denoising performance (SNR in dB). The signal length is N=200N=200.
Noisy TV NLM Wavelet GMM
1515 25.72125.721 26.25726.257 23.21623.216 27.492\mathbf{27.492}
2020 26.90126.901 27.02327.023 25.86125.861 28.373\mathbf{28.373}
2525 29.25229.252 29.41529.415 29.15729.157 29.646\mathbf{29.646}
3030 33.08133.081 33.13933.139 31.39631.396 33.819\mathbf{33.819}
3535 35.32235.322 35.89235.892 32.69932.699 36.276\mathbf{36.276}
4040 40.39140.391 40.45540.455 33.68133.681 41.262\mathbf{41.262}

4 Convergence Analysis

Recall that we would like DD to be contractive; however, due to the complexity of the expression in (9), it is difficult to determine whether this is the case. Fortunately, while using it as part of the larger PnP-PGD framework, we can modify the denoiser to make it contractive using a simple trick. Note that the coefficients βj(𝐏i𝒛)\beta_{j}(\mathbf{P}_{i}\boldsymbol{z}) in (9) are nonnegative and sum to 11 for each ii. Consider the situation where we replace the βj(𝐏i𝒛)\beta_{j}(\mathbf{P}_{i}\boldsymbol{z})’s by some fixed universal constants bjib_{ji} that do not depend on 𝒛\boldsymbol{z}, but have the same properties: bji0b_{ji}\geqslant 0 for all i,ji,j, and jbji=1\sum_{j}b_{ji}=1 for all ii. Then D(𝒛)D(\boldsymbol{z}) becomes a linear function of 𝒛\boldsymbol{z}. In fact, we can write D(𝒛)=𝐖𝒛D(\boldsymbol{z})=\mathbf{W}\boldsymbol{z}, where 𝐖N×N\mathbf{W}\in\mathbb{R}^{N\times N} is given by

𝐖=1Pi=1N𝐏i(j=1Kbji𝐂j)𝐏i.\mathbf{W}=\frac{1}{P}\sum_{i=1}^{N}\mathbf{P}_{i}^{\top}\left(\sum_{j=1}^{K}b_{ji}\mathbf{C}_{j}\right)\mathbf{P}_{i}. (10)
Theorem 4.

Let 𝐖\mathbf{W} be defined as in (10), where bji0b_{ji}\geqslant 0 for all i,ji,j, and j=1Kbji=1\sum_{j=1}^{K}b_{ji}=1 for all ii. Suppose NN is a multiple of PP. Then the largest eigenvalue of 𝐖\mathbf{W}, λmax(𝐖)\lambda_{\mathrm{max}}(\mathbf{W}), is <1<1. Consequently, the denoiser D(𝐳)=𝐖𝐳D(\boldsymbol{z})=\mathbf{W}\boldsymbol{z} is contractive, with the constant δ\delta being λmax(𝐖)\lambda_{\mathrm{max}}(\mathbf{W}).

The proof is given in the appendix. Note that the requirement for NN to be a multiple of PP is not too restrictive, since we can pad the signal if it is not. We only need to find suitable coefficients bijb_{ij} to replace βj(𝐏i𝒛)\beta_{j}(\mathbf{P}_{i}\boldsymbol{z}) in (9). This can be done as follows. We first run a small number TT (say, T=10T=10 or 2020) of PnP-PGD iterations using the coefficients βj(𝐏i𝒛)\beta_{j}(\mathbf{P}_{i}\boldsymbol{z}) in the denoiser, to get an intermediate estimate 𝒙T\boldsymbol{x}_{T}. We then set bij=βj(𝐏i𝒙T)b_{ij}=\beta_{j}(\mathbf{P}_{i}\boldsymbol{x}_{T}) for all ii and jj. That is, we fix the bijb_{ij}’s as the coefficients obtained from the intermediate point 𝒙T\boldsymbol{x}_{T}. The subsequent PnP-PGD iterations are run using the denoiser in (10) that uses these fixed coefficients. Since 𝐖\mathbf{W} is contractive, it follows from Theorem 2 that the sequence 𝒙T+1,𝒙T+2,\boldsymbol{x}_{T+1},\boldsymbol{x}_{T+2},\ldots converges to some fixed point 𝒙\boldsymbol{x}^{\ast}. Consequently, the PnP-PGD iterations 𝒙1,𝒙2,\boldsymbol{x}_{1},\boldsymbol{x}_{2},\ldots converge to some fixed point 𝒙\boldsymbol{x}^{\ast}.

The idea behind fixing the coefficients after TT iterations is that as kk increases, 𝒙k\boldsymbol{x}_{k} is expected to become more refined (in the sense of looking similar to the unknown signal 𝒙\boldsymbol{x}); therefore, the coefficients βj(𝐏i𝒙T)\beta_{j}(\mathbf{P}_{i}\boldsymbol{x}_{T}) after TT iterations would be good enough to use for all subsequent iterates as well. In fact, this scheme has been used in PnP algorithms for image restoration [35, 24].

We note that although the patch denoiser GG in (8) is an MMSE estimator, the overall image denoiser DD is not. Therefore existing convergence results for PnP with MMSE denoisers, e.g. [36], do not apply to our case. We make an important remark on the similarity and differences of the proposed GMM denoiser with the denoiser in [37, 30]. Indeed, the idea of our GMM denoiser is inspired by that in [37, 30]. However, there are a couple of subtle differences:

  • The denoiser in [37, 30] is scene-adapted, in the sense that the GMM distribution is tailored for the specific scene being reconstructed. This is possible because the application considered there is hyperspectral image sharpening, in which a complementary image of the same scene is available to obtain training data tailored for that scene. In contrast, in this work, we learn just one GMM distribution that is kept common for all the signals being reconstructed.

  • The method used to replace βj(𝐏i𝒛)\beta_{j}(\mathbf{P}_{i}\boldsymbol{z}) by a fixed coefficient bijb_{ij} is different in our paper as compared to [2]. This is because the approach in [2] fundamentally relies on the availability of a complementary image, and thus cannot be applied to compressive sensing. In particular, in our paper, we take the bijb_{ij}’s to be the coefficients acquired from a surrogate signal that is obtained by running a few iterations of the PnP-PGD algorithm; this idea was inspired by [Sreehari]. On the other hand, in [2], the bijb_{ij}’s are taken to the coefficients obtained from the complementary image (multispectral or panchromatic image).

Refer to caption
Figure 2: Ten randomly selected eigenvectors corresponding to the largest few eigenvalues of the covariance matrices from the learned GMM model.
Refer to caption
Figure 3: Ten randomly selected eigenvectors corresponding to the smallest few eigenvalues of the covariance matrices from the learned GMM model.
Refer to caption
Figure 4: Visual comparison of a CS recovered ECG signal with 10%10\% measurements and no additive noise. The length of the original signal is N=512N=512. The proposed method produces more structurally similar (with the clean signal) output even under lower number of observations.
Refer to caption
Figure 5: Different test signals from the MIT-BIH Arrhythmia database.

5 Experimental Results

Database: To validate the proposed PnP-PGD method for ECG CS recovery, we use a subset of the data from the Physionet MIT-BIH Arrhythmia Database [32, 33, 34]. Every file in the database consists of two lead recordings sampled at 360Hz with 11 bits per sample of resolution. It contains 48 half-hour excerpts of two-channel ambulatory ECG recordings, obtained from 47 subjects studied by the BIH Arrhythmia Laboratory.

Metrics: We quantify the performance of the proposal using following metrics: SNR, which is defined in Section 3), and mean-squared error (MSE), which is defined below.

MSE=\displaystyle\text{MSE}= 1N𝒙𝒙^2,\displaystyle\ \frac{1}{N}||\boldsymbol{x}-\hat{\boldsymbol{x}}||^{2},

where 𝒙^\hat{\boldsymbol{x}} and 𝒙\boldsymbol{x} are the reconstructed and original ECG signals, respectively. Note that here we are assuming the Physionet signals are the true signals 𝒙\boldsymbol{x}; in reality these signals also contain noise, which the metrics above neglect, though at most SNRs the (simulated) additive noise dominates [28].

Compared methods: We compare with the following state-of-the-art methods: BSBL-BO [15], BSBL-EM [16], sparse prior on the ECG wavelet representation [11], and TV regularization [13]. We tuned the parameters of all the methods so that maximum SNR is obtained. The codes are used from the publicly available sites [5, 15, 16]. All simulations were performed using MATLAB (R2021a) on a Quad core, 3.80GHz machine with 32GB RAM.

The sensing matrix 𝚽\boldsymbol{\Phi} is constructed by randomly drawing each entry from the standard normal distribution 𝒩(0,1)\mathcal{N}(0,1), and applying an orthonormalization step to ensure that the rows of 𝚽\boldsymbol{\Phi} are orthonormal [12]. We trained the GMM on the set of all possible overlapping patches of size P=30P=30 extracted from signal #104104 in the dataset [32], which is of length 10,80010,800. The number of GMM components KK is set to be 1010. The training time is found to be 2.862.86s. In all the experiments on CS reconstruction, we terminate the PnP-PGD algorithm after 150150 iterations since we observed that the algorithm stabilizes by then.

In addition to the denoising results reported in Section 3, the results in the subsequent sections support the claim that GMM is a good prior for modeling ECG signal patches. Note that the entire training process can be done offline.

5.1 Goodness of GMM Modeling

In this experiment, we evaluate the goodness of GMM modeling by visualizing some of the eigenvectors of the covariance matrices from the learned GMM distribution. This type of visualization is commonly utilized in papers on image processing, e.g. [29]. In particular, in [29] it is observed that the eigenvectors corresponding to the largest few eigenvalues (of each covariance matrix) are relatively smooth and capture the large-scale structure of the patches, whereas the eigenvalues corresponding to the smallest few eigenvectors contain many fluctuations and thus capture the local structure. If the GMM is a good model, we expect to see a similar trend for ECG signal patches.

In our case, we fit a GMM with K=10K=10 components on ECG patches of size 3030. In Figure 3, we plot 1010 randomly selected eigenvectors (having unit norm) corresponding to eigenvalue indices 28\geqslant 28 (i.e. largest few eigenvalues) from the fitted GMM model. In Figure 3, we show similar plots but for eigenvalue indices 3\leqslant 3 (i.e. smallest few eigenvalues). It is evident that the expected trend described in the previous paragraph holds true in practice: the richness of textures, fluctuations and other local structures is captured by the signals in Figure 3, while most of the large-scale details are captured by the signals in Figure 2.

5.2 Recovery from Noiseless Measurements

We study the signal reconstruction performance of our method (under zero noise), especially when the number of measurements MM is much lesser than NN. In Figure 4, we show a segment of the original ECG signal #100\#100 from [33] and its reconstruction obtained using our proposed method with 10%10\% measurements. We can see that the recovered signal using GMM as the denoiser produces visually similar result on comparison with the other methods and also has better performance in terms of SNR.

5.3 Study of SNR for Different MM

We next perform an exhaustive experiment where we vary the number of measurements (MM) while fixing the length of the signal. The signal #105\#105 of length N=512N=512 is used for the experiment. The results are reported in Figure 6. Each instance of the SNR values shown is obtained by averaging of 500500 independent trials. We note that our proposed method achieves the best recovery among all the methods.

Refer to caption
Figure 6: Average SNR vs. measurements (MM) over 500500 runs, with no additive noise. The length of the original ECG signal is N=512N=512.

5.4 Study of Average SNR over Different Signals

In this section, we study the performace of the proposed method with that of various ECG signals from MIT-BIH database [33]. We consider a larger dataset of 1010 signals (#100 to #109), shown in Figure 5 for performance evaluation. All the signals are of length N=512N=512. We also compare with a more recent method, ECGLet [38], in this section. We measure the average SNR (over all the 1010 test signals) of the reconstructed signal for different values of MM. The results are reported in Figure 7. Note that the proposed method yields the highest SNR among all the methods considered, for every MM.

Refer to caption
Figure 7: Average SNR vs. measurements (MM) averaged over the 1010 ECG signals in Figure 5. The measurements do not contain any additive noise.

5.5 Effect of Compression Ratio

In this section we investigate the effect of compression ratio (CR) on the quality of the reconstructed ECG signals. The CR is defined as:

CR=NMN×100\text{CR}\ =\frac{N-M}{N}\times 100 (11)

where NN is the length of the original signal and MM is the length of the compressed signal. For each value of CR, we repeated the experiment 500500 times, and in each time, the sensing matrix was randomly generated [5]. Figure 8, shows the variation of SNR with CR. It is worth noting that we obtain superior performance over the whole range of CR values. The input signal is a segment of ECG signal #105\#105 from the MIT-BIH database [33].

Refer to caption
Figure 8: Average SNR vs. CR over 500500 runs. The length of the original ECG signal is N=512N=512.
Refer to caption
Figure 9: Average SNR vs. MSE over 500500 runs. The length of the original ECG signal is N=512N=512.
Refer to caption
Figure 10: Visual comparison of a CS recovered ECG signal with 50%50\% measurements and additive Gaussian noise. The length of the original signal is N=512N=512. The proposed method is able to capture the minute variations in the signal faithfully.

5.6 Recovery from Noisy Measurements

From previous experiments we notice that the proposed method performs well when compared with the other methods in noiseless scenario, i.e. when 𝒏=𝟎\boldsymbol{n}=\mathbf{0} in (1). Now we examine the recovery performance of our method from noisy compressed measurements. The signal #103\#103 of length N=512N=512 and M=256M=256 is used for this experiment. In Figure 9, we plot the MSE of the recovered signal as a function of the noise level in the input 𝒚\boldsymbol{y} (specified in terms of the SNR of the input). The values reported are averaged over 500500 independent trials. To simulate the noisy measurements, we followed the approach in [28]. It is evident that the proposed method produces quality reconstructions under noisy measurements. Finally we show a reconstruction result from noisy measurements in Figure. 10. In the experiment, we add random Gaussian noise to the compressed measurements. On comparison, none of the three methods in Figure 10 are able to completely mitigate the effect of noise. However, our method performs the best by a significant margin, resulting in an SNR of 24.2424.24 dB in the recovered signal, and is visually similar to the original signal. We observed that in low SNR scenarios, TV acts as a better denoiser than GMM, which might explain why the CS reconstruction performance is higher for TV as compared to the proposed method when the input SNR is low.

5.7 Numerical Convergence

In this section, we numerically verify the convergence of the proposed PnP-PGD algorithm. For all the signals in Figure 5, we use PnP-PGD for reconstruction from M=0.5NM=0.5N measurements. As explained in Section 4, we take the surrogate signal as the signal obtained after running the algorithm for T=10T=10 iterations. The plots of 𝒙k+1𝒙k\lVert\boldsymbol{x}_{k+1}-\boldsymbol{x}_{k}\rVert as a function of kk are shown in Figure 11 for each of the signals. Note that 𝒙k+1𝒙k\lVert\boldsymbol{x}_{k+1}-\boldsymbol{x}_{k}\rVert decays to 0 as kk increases, which is a necessary condition for convergence.

Refer to caption
Figure 11: Plots of 𝒙k+1𝒙k\lVert\boldsymbol{x}_{k+1}-\boldsymbol{x}_{k}\rVert vs kk in the proposed algorithm (with M=0.5NM=0.5N) for the test signals in Figure 5.
Table 2: Performance comparison with SDAE on FECG data, collected from Physionet database (patient id: ecgca154) by varying the compression ratio.
CR 25 37.5 50 62.5 75 87.5 93.75
SDAE 37.6937.69 36.7336.73 35.3935.39 32.4032.40 26.7826.78 23.56 22.29
GMM 42.85 39.56 37.18 33.29 27.16 20.5720.57 11.4411.44
Refer to caption
Figure 12: Variation of SNR with trials for BSBL-BO (blue) and the proposed method (red). The length of the original ECG signal N=512N=512. The proposed method exhibits stable recovery of the ECG signal even for a low compression ratio.

5.8 Comparison with Deep Learning

For completeness, we compare the proposed method with a deep learning method for CS reconstruction, known as SDAE [39]. In this experiment, we use the non-invasive Fetal ECG dataset from Physionet, which was used in [39]. This database contains a series of 55 multichannel abdominal non-invasive fetal electrocardiogram (FECG) recordings, taken from a single subject between 21 to 40 weeks of pregnancy. We conducted an experiment on the signal with patient id ecgca154. Table 2 shows the variation of SNR with CR for the proposed method and SDAE. We note that the proposed method is able to outperform SDAE in all cases except when the CR is very high.

5.9 Stable Recovery

We show that the proposed method produces very stable reconstructions. For this experiment, we considered the signal #105\#105 with 10%10\% measurements (N=512N=512). We ran 500500 trials of the method, so that its stability can be observed for different realizations of the sensing matrix 𝚽\boldsymbol{\Phi}. In Figure 12, we show the SNR variation for BSBL-BO (blue) and the proposed method (red). We noted a standard deviation of 6.906.90 dB in SNR for BSBL-BO and 2.702.70 dB for the proposed method. Thus, the proposed method is more stable as compared to BSBL-BO. In fact, we observed that the contrast in stability is more pronounced for smaller MM.

6 Conclusion

We introduced a novel framework for recovering ECG signals from compressively sensed measurements. Our method is based on the plug-and-play (PnP) paradigm that has recently become popular for image restoration problems. Essentially, the recovery method consists of repeating two main steps – inverting the forward model, and denoising – until stability is attained. We designed a high-quality ECG signal denoiser to be used in the denoising step. Moreover, we proved that the recovery algorithm is guaranteed to converge. Importantly, we showed via numerical experiments that our proposed method is superior to current state-of-the-art methods used for ECG CS recovery.

7 Appendix

7.1 Proof of Theorem 2

Since f(𝒙)=𝚽(𝚽𝒙𝒚)\nabla f(\boldsymbol{x})=\boldsymbol{\Phi}^{\top}(\boldsymbol{\Phi}\boldsymbol{x}-\boldsymbol{y}), we can write the PnP-PGD algorithm as 𝒙k+1=S(𝒙k)\boldsymbol{x}_{k+1}=S(\boldsymbol{x}_{k}), where

S(𝒙)=D(𝒙γ𝚽𝚽𝒙+γ𝚽𝒚).S(\boldsymbol{x})=D\big{(}\boldsymbol{x}-\gamma\boldsymbol{\Phi}^{\top}\boldsymbol{\Phi}\boldsymbol{x}+\gamma\boldsymbol{\Phi}^{\top}\boldsymbol{y}\big{)}.

It is enough to prove that the function S()S(\cdot) is contractive, since the convergence of (𝒙k)(\boldsymbol{x}_{k}) to some unique fixed point 𝒙\boldsymbol{x}^{\ast} at a linear rate would then follow by the Banach Fixed Point Theorem [23, Th. 9.23].

Let δ<1\delta<1 be the constant in Definition 1. Then for any 𝒛1,𝒛2N\boldsymbol{z}_{1},\boldsymbol{z}_{2}\in\mathbb{R}^{N}, we have,

S(𝒛1)S(𝒛2)\displaystyle\left\lVert S(\boldsymbol{z}_{1})-S(\boldsymbol{z}_{2})\right\rVert δ(𝒛1γ𝚽𝚽𝒛1)(𝒛2γ𝚽𝚽𝒛2)\displaystyle\leqslant\delta\lVert(\boldsymbol{z}_{1}-\gamma\boldsymbol{\Phi}^{\top}\boldsymbol{\Phi}\boldsymbol{z}_{1})-(\boldsymbol{z}_{2}-\gamma\boldsymbol{\Phi}^{\top}\boldsymbol{\Phi}\boldsymbol{z}_{2})\rVert
=δ(𝐈γ𝚽𝚽)(𝒛1𝒛2)\displaystyle=\delta\lVert(\mathbf{I}-\gamma\boldsymbol{\Phi}^{\top}\boldsymbol{\Phi})(\boldsymbol{z}_{1}-\boldsymbol{z}_{2})\rVert
δσmax(𝐈γ𝚽𝚽)𝒛1𝒛2.\displaystyle\leqslant\delta\cdot\sigma_{\mathrm{max}}(\mathbf{I}-\gamma\boldsymbol{\Phi}^{\top}\boldsymbol{\Phi})\cdot\lVert\boldsymbol{z}_{1}-\boldsymbol{z}_{2}\rVert.

Let L=σmax(𝚽𝚽)L=\sigma_{\mathrm{max}}(\boldsymbol{\Phi}^{\top}\boldsymbol{\Phi}). Since 𝚽𝚽\boldsymbol{\Phi}^{\top}\boldsymbol{\Phi} is positive semidefinite, its singular values are also its eigenvalues. In particular, its eigenvalues lie in [0,L][0,L]. Since 0<γ2/L0<\gamma\leqslant 2/L, the eigenvalues of 𝐈γ𝚽𝚽\mathbf{I}-\gamma\boldsymbol{\Phi}^{\top}\boldsymbol{\Phi} lie in [1,1][-1,1]. Therefore, σmax(𝐈γ𝚽𝚽)1\sigma_{\mathrm{max}}(\mathbf{I}-\gamma\boldsymbol{\Phi}^{\top}\boldsymbol{\Phi})\leqslant 1. Thus, for all 𝒛1,𝒛2N\boldsymbol{z}_{1},\boldsymbol{z}_{2}\in\mathbb{R}^{N} we have,

S(𝒛1)S(𝒛2)δ𝒛1𝒛2.\left\lVert S(\boldsymbol{z}_{1})-S(\boldsymbol{z}_{2})\right\rVert\leqslant\delta\lVert\boldsymbol{z}_{1}-\boldsymbol{z}_{2}\rVert.

Since δ<1\delta<1, the function S()S(\cdot) is contractive.

7.2 Proof of Theorem 4

A proof can be found in [30, Appendix B]; for completeness, here we give a different and more concise proof. Note that each 𝐂j\mathbf{C}_{j} is symmetric positive semidefinite (p.s.d.); hence, for each ii, the matrix 𝐁i:=jbji𝐂j\mathbf{\mathbf{B}}_{i}:=\sum_{j}b_{ji}\mathbf{C}_{j} is p.s.d. (as a convex combination of p.s.d. matrices). By the same logic, we get that 𝐖\mathbf{W} is p.s.d. Thus, to show λmax(𝐖)<1\lambda_{\mathrm{max}}(\mathbf{W})<1, we only need to show that 𝒛𝐖𝒛<𝒛2\boldsymbol{z}^{\top}\mathbf{W}\boldsymbol{z}<\lVert\boldsymbol{z}\rVert^{2} for all 𝒛N\boldsymbol{z}\in\mathbb{R}^{N}.

To prove this, first note that λmax(𝐂j)<1\lambda_{\mathrm{max}}(\mathbf{C}_{j})<1 for all jj. Since each 𝐁i\mathbf{B}_{i} is a convex combination of all 𝐂j\mathbf{C}_{j}’s and since λmax()\lambda_{\mathrm{max}}(\cdot) is a convex function on the set of symmetric matrices, we have λmax(𝐁i)<1\lambda_{\mathrm{max}}(\mathbf{B}_{i})<1 for all i=1,,Ni=1,\ldots,N. Let

δ=max(λmax(𝐁1),,λmax(𝐁N)).\delta=\max\big{(}\lambda_{\mathrm{max}}(\mathbf{B}_{1}),\ldots,\lambda_{\mathrm{max}}(\mathbf{B}_{N})\big{)}.

Clearly, δ<1\delta<1. Note that for each ii, we have 𝒖𝐁i𝒖λmax(𝐁i)𝒖2δ𝒖2\boldsymbol{u}^{\top}\mathbf{B}_{i}\boldsymbol{u}\leqslant\lambda_{\mathrm{max}}(\mathbf{B}_{i})\lVert\boldsymbol{u}\rVert^{2}\leqslant\delta\lVert\boldsymbol{u}\rVert^{2}. Therefore, for any 𝒛N\boldsymbol{z}\in\mathbb{R}^{N},

𝒛𝐖𝒛=1Pi=1N(𝐏i𝒛)𝐁i(𝐏i𝒛)δPi=1N𝐏i𝒛2,\boldsymbol{z}^{\top}\mathbf{W}\boldsymbol{z}=\frac{1}{P}\sum_{i=1}^{N}(\mathbf{P}_{i}\boldsymbol{z})^{\top}\mathbf{B}_{i}(\mathbf{P}_{i}\boldsymbol{z})\leqslant\frac{\delta}{P}\sum_{i=1}^{N}\lVert\mathbf{P}_{i}\boldsymbol{z}\rVert^{2},

Since NN is a multiple of PP, i=1N𝐏i𝒛2\sum_{i=1}^{N}\lVert\mathbf{P}_{i}\boldsymbol{z}\rVert^{2}, which is the sum of all patches of length PP extracted from 𝒛\boldsymbol{z} (using circular padding), is simply equal to P𝒛2P\lVert\boldsymbol{z}\rVert^{2}. Thus, we get that 𝒛𝐖𝒛δ𝒛2<𝒛2\boldsymbol{z}^{\top}\mathbf{W}\boldsymbol{z}\leqslant\delta\lVert\boldsymbol{z}\rVert^{2}<\lVert\boldsymbol{z}\rVert^{2} for all 𝒛N\boldsymbol{z}\in\mathbb{R}^{N}.

References

  • [1] S. Ahmad, S. Chen, K. Soueidan, I. Batkin, M. Bolic, H. Dajani, V. Groza, Electrocardiogram-assisted blood pressure estimation, IEEE Transactions on Biomedical Engineering 59 (3) (2012) 608–618.
  • [2] L. Pecchia, P. Melillo, M. Sansone, M. Bracale, Discrimination power of short-term heart rate variability measures for CHF assessment, IEEE Transactions on Information Technology in Biomedicine 15 (1) (2010) 40–46.
  • [3] M. I. Owis, A. H. Abou-Zied, A.-B. Youssef, Y. M. Kadah, Study of features based on nonlinear dynamical modeling in ECG arrhythmia detection and classification, IEEE Transactions on Biomedical Engineering 49 (7) (2002) 733–736.
  • [4] H. Mamaghanian, N. Khaled, D. Atienza, P. Vandergheynst, Compressed sensing for real-time energy-efficient ECG compression on wireless body sensor nodes, IEEE Transactions on Biomedical Engineering 58 (9) (2011) 2456–2466.
  • [5] Z. Zhang, T. P. Jung, S. Makeig, B. D. Rao, Compressed sensing for energy-efficient wireless telemonitoring of noninvasive fetal ECG via block sparse bayesian learning, IEEE Transactions on Biomedical Engineering 60 (2) (2012) 300–309.
  • [6] J. K. Pant, S. Krishnan, Compressive sensing of electrocardiogram signals by promoting sparsity on the second-order difference and by using dictionary learning, IEEE Transactions on Biomedical Circuits and Systems 8 (2) (2013) 293–302.
  • [7] A. Salman, E. G. Allstot, A. Y. Chen, A. M. Dixon, D. Gangopadhyay, D. J. Allstot, Compressive sampling of EMG bio-signals, Proc. IEEE International Symposium of Circuits and Systems (2011) 2095–2098.
  • [8] A. M. Dixon, E. G. Allstot, D. Gangopadhyay, D. J. Allstot, Compressed sensing system considerations for ECG and EMG wireless biosensors, IEEE Transactions on Biomedical Circuits and Systems 6 (2) (2012) 156–166.
  • [9] S. Aviyente, Compressed sensing framework for EEG compression, Proc. IEEE/SP 14th Workshop on Statistical Signal Processing (2007) 181–184.
  • [10] M. Lustig, D. Donoho, J. M. Pauly, Sparse MRI: The application of compressed sensing for rapid MR imaging, Magnetic Resonance in Medicine 58 (6) (2007) 1182–1195.
  • [11] L. F. Polania, K. E. Barner, A weighted 1\ell_{1} minimization algorithm for compressed sensing ECG, Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (2014) 4413–4417.
  • [12] J. K. Pant, W.-S. Lu, A. Antoniou, New improved algorithms for compressive sensing based on p\ell_{p} norm, IEEE Transactions on Circuits and Systems II: Express Briefs 61 (3) (2014) 198–202.
  • [13] Y. Liu, M. De Vos, I. Gligorijevic, V. Matic, Y. Li, S. Van Huffel, Multi-structural signal recovery for biomedical compressive sensing, IEEE Transactions on Biomedical Engineering 60 (10) (2013) 2794–2805.
  • [14] J. K. Pant, S. Krishnan, Reconstruction of ECG signals for compressive sensing by promoting sparsity on the gradient, Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (2013) 993–997.
  • [15] Z. Zhang, B. D. Rao, Sparse signal recovery with temporally correlated source vectors using sparse Bayesian learning, IEEE Journal of Selected Topics in Signal Processing 5 (5) (2011) 912–926.
  • [16] Z. Zhang, B. D. Rao, Extension of SBL algorithms for the recovery of block sparse signals with intra-block correlation, IEEE Transactions on Signal Processing 61 (8) (2013) 2009–2015.
  • [17] A. Beck, First-Order Methods in Optimization, SIAM, Philadelphia, PA, USA, 2017.
  • [18] A. Beck, M. Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM Journal on Imaging Sciences 2 (1) (2009) 183–202.
  • [19] S. V. Venkatakrishnan, C. A. Bouman, B. Wohlberg, Plug-and-play priors for model based reconstruction, Proc. IEEE Global Conference on Signal and Information Processing (2013) 945–948.
  • [20] U. S. Kamilov, H. Mansour, B. Wohlberg, A plug-and-play priors approach for solving nonlinear imaging inverse problems, IEEE Signal Processing Letters 24 (12) (2017) 1872–1876.
  • [21] A. Buades, B. Coll, J. M. Morel, A non-local algorithm for image denoising, Proc. IEEE Computer Vision and Pattern Recognition 2 (2005) 60–65.
  • [22] K. Dabov, A. Foi, V. Katkovnik, K. Egiazarian, Image denoising by sparse 3-D transform-domain collaborative filtering, IEEE Transactions on Image Processing 16 (8) (2007) 2080–2095.
  • [23] W. Rudin, Principles of Mathematical Analysis, New York: McGraw-Hill, 1976.
  • [24] P. Nair, R. G. Gavaskar, K. N. Chaudhury, Fixed-point and objective convergence of plug-and-play algorithms, IEEE Transactions on Computational Imaging (7) (2021) 337–348.
  • [25] E. Ryu, J. Liu, S. Wang, X. Chen, Z. Wang, W. Yin, Plug-and-play methods provably converge with properly trained denoisers, International Conference on Machine Learning (2019) 5546–5557.
  • [26] S. Chatterjee, R. S. Thakur, R. N. Yadav, L. Gupta, D. K. Raghuvanshi, Review of noise removal techniques in ECG signals, IET Signal Processing 14 (9) (2020) 569–590.
  • [27] S. Kumar, D. Panigrahy, P. K. Sahu, Denoising of Electrocardiogram (ECG) signal by using empirical mode decomposition (EMD) with non-local mean (NLM) technique, Biocybernetics and Biomedical Engineering 38 (2) (2018) 297–312.
  • [28] B. H. Tracey, E. L. Miller, Nonlocal means denoising of ECG signals, IEEE Transactions on Biomedical Engineering 59 (9) (2012) 2383–2386.
  • [29] D. Zoran, Y. Weiss, From learning models of natural image patches to whole image restoration, Proc. International Conference on Computer Vision (2011) 479–486.
  • [30] A. M. Teodoro, J. M. Bioucas-Dias, M. A. Figueiredo, A convergent image fusion algorithm using scene-adapted Gaussian-mixture-based denoising, IEEE Transactions on Image Processing 28 (1) (2018) 451–463.
  • [31] L. Condat, A direct algorithm for 1-D total variation denoising, IEEE Signal Processing Letters 101 (23) (2013) 1054–1057.
  • [32] A. L. Goldberger, L. A. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C. K. Peng, H. E. Stanley, Physiobank, Physiotoolkit, and Physionet: components of a new research resource for complex physiologic signals, Circulation 101 (23) (2000) e215–e220.
  • [33] G. B. Moody, R. G. Mark, The impact of the MIT-BIH arrhythmia database, IEEE Engineering in Medicine and Biology Magazine 20 (3) (2001) 45–50.
  • [34] Z. Lu, D. Y. Kim, W. A. Pearlman, Wavelet compression of ECG signals by the set partitioning in hierarchical trees algorithm, IEEE Transactions on Biomedical Engineering 47 (7) (2000) 849–856.
  • [35] S. Sreehari, S. V. Venkatakrishnan, B. Wohlberg, G. T. Buzzard, L. F. Drummy, J. P. Simmons, C. A. Bouman, Plug-and-play priors for bright field electron tomography and sparse interpolation, IEEE Transactions on Computational Imaging 2 (4) (2016) 408–423.
  • [36] X. Xu, Y. Sun, J. Liu, B. Wohlberg, U. S. Kamilov, Provable convergence of plug-and-play priors with mmse denoisers, IEEE Signal Processing Letters 27 (2020) 1280–1284.
  • [37] A. M. Teodoro, J. M. Bioucas-Dias, M. A. Figueiredo, Scene-adapted plug-and-play algorithm with convergence guarantees, IEEE International Workshop on Machine Learning for Signal Processing (2017) 1–6.
  • [38] N. Ansari, A. Gupta, Wnc-ecglet: Weighted non-convex minimization based reconstruction of compressively transmitted ecg using ecglet, Biomedical Signal Processing and Control 49 (2019) 1–13.
  • [39] P. R. Muduli, R. R. Gunukula, A. Mukherjee, A deep learning approach to fetal-ecg signal reconstruction, National Conference on Communications (2016) 1–6.