
On A Class of Greedy Sparse Recovery Algorithms

Gang Li, Qiuwei Li, Shuang Li, and Wu Angela Li

School of Automation and Electrical Engineering, Zhejiang University of Science & Technology, Hangzhou, Zhejiang, 310023, P.R. China. Email: ieligang@zjut.edu.cn
DAMO Academy, Alibaba Group US, Bellevue, Washington, 98004, USA. Email: liqiuweiss@gmail.com
Department of Electrical and Computer Engineering, Iowa State University, Ames, Iowa, 50014, USA. Email: lishuang@iastate.edu
Department of Computer Science, Rice University, Houston, Texas, 77005, USA. Email: awl@rice.edu
Abstract

Sparse signal recovery deals with finding the sparsest solution of an under-determined linear system 𝒙=𝑸𝒔\bm{x}=\bm{Q}\bm{s}. In this paper, we propose a novel greedy approach to addressing the challenges posed by this problem. The approach is based on a characterization of the solutions to the system, which allows us to work on sparse recovery directly in the 𝒔\bm{s}-space with a given measure. With an l2l_{2}-based measure, an orthogonal matching pursuit (OMP)-type algorithm is proposed, which significantly outperforms the classical OMP algorithm in terms of recovery accuracy while maintaining comparable computational complexity. An l1l_{1}-based algorithm, denoted as AlgGL1\text{Alg}_{GL1}, is also derived; it significantly outperforms the classical basis pursuit (BP) algorithm. Combined with the CoSaMP strategy for selecting atoms, a class of high-performance greedy algorithms is also derived. Extensive numerical simulations on both synthetic and image data are carried out, demonstrating the superior performance of the proposed algorithms in terms of sparse recovery accuracy and robustness against numerical instability of the system matrix 𝑸\bm{Q} and disturbance in the measurement 𝒙\bm{x}.

Keywords: Sparse signal recovery; Greedy methods; Basis pursuit methods; Under-determined linear system; Low-rank interference; Compressed sensing.

1 Introduction

The sparse recovery problem, also known as the sparse linear inverse problem, has found extensive applications in many areas such as data science [1, 2], signal processing [3, 4, 5, 6], communications engineering [7, 8, 9], structural health monitoring [10, 11, 12], hyperspectral imaging [13, 14, 15], and deep learning [16, 17]. It plays a crucial role in the framework of compressed sensing (CS) [18, 19, 20, 21, 22, 23], and in deep learning for feature selection and designing efficient convolutional neural networks [24, 25, 26, 27, 28, 29].

Consider the following popularly used synthetic model:

𝒙=𝑸𝒔,\bm{x}=\bm{Q}\bm{s}, (1)

where 𝑸N×L\bm{Q}\in\mathbb{R}^{N\times L} with N<LN<L, and 𝒔L×1\bm{s}\in\mathbb{R}^{L\times 1} is the representation of the measurement 𝒙\bm{x} in 𝑸\bm{Q}. The associated sparse recovery problem is, for a given (𝒙,𝑸)(\bm{x},\bm{Q}), to find the solution of (1), denoted as 𝒔\bm{s}^{\star}, that has fewer nonzero entries than any other solution. Mathematically, it is formulated as

𝒔argmin𝒔𝒔0s.t.𝒙=𝑸𝒔,\bm{s}^{\star}\triangleq\arg\min_{\bm{s}}\|\bm{s}\|_{0}~{}~{}~{}~{}\text{s.t.}~{}~{}\bm{x}=\bm{Q}\bm{s}, (2)

where 𝒗0\|\bm{v}\|_{0}, often denoted as l0l_{0} norm though it is not a norm in a strict sense, represents the number of nonzero elements in a vector 𝒗\bm{v}, that is 𝒗0=card[𝒗]\|\bm{v}\|_{0}=\text{card}[\mathcal{I}_{\bm{v}}] - the cardinality of the support 𝒗{l[L]:𝒗(l)0}\mathcal{I}_{\bm{v}}\triangleq\{l\in[L]:~{}\bm{v}(l)\neq 0\} of a vector 𝒗L×1\bm{v}\in\mathbb{R}^{L\times 1}. A signal 𝒙\bm{x} is said to be κ\kappa-sparse in 𝑸\bm{Q} if 𝒔0κ\|\bm{s}\|_{0}\leq\kappa.

As is well known, the sparse recovery problem (2) is NP-hard and has been investigated for many years. Determining how to solve problem (2) efficiently and accurately is crucial for a wide range of sparsity-related problems. To the best of our knowledge, this remains an open problem, and there is still a significant need for more effective algorithms.

1.1 Related work

To accurately solve the sparse recovery problem (2), the very first question to raise is: when does it have a unique solution 𝒔\bm{s}^{\star}? Let spark[𝑸]\text{spark}[\bm{Q}] denote the smallest number of columns of 𝑸\bm{Q} that are linearly dependent. It is well known that the following result [30] plays a key role in sparse recovery: Let 𝒙\bm{x} be any κ\kappa-sparse vector in 𝑸\bm{Q}, that is there exists a 𝒔~\tilde{\bm{s}} with 𝒔~0κ\|\tilde{\bm{s}}\|_{0}\leq\kappa such that 𝒙=𝑸𝒔~\bm{x}=\bm{Q}\tilde{\bm{s}}. Then, problem (2) has one and only one solution, 𝒔=𝒔~\bm{s}^{\star}=\tilde{\bm{s}}, if and only if 𝑸\bm{Q} satisfies

spark[𝑸]>2κ.\text{spark}[\bm{Q}]>2\kappa. (3)

Let κ𝑸\kappa_{\bm{Q}} be the largest integer that is not bigger than (spark[𝑸]1)/2(\text{spark}[\bm{Q}]-1)/2. Then, any κ\kappa-sparse solution 𝒔~\tilde{\bm{s}} of 𝒙=𝑸𝒔=𝑸𝒔~\bm{x}=\bm{Q}\bm{s}=\bm{Q}\tilde{\bm{s}} is the unique solution of (2) as long as κκ𝑸\kappa\leq\kappa_{\bm{Q}}, which is assumed in the sequel.

Another very important concept is the restricted isometry property (RIP), introduced in [31], which is defined as follows: a matrix 𝑸\bm{Q} is said to satisfy the (κ,δ)(\kappa,\delta)-RIP with 0δ<10\leq\delta<1 if

(1δ)𝒔22𝑸𝒔22(1+δ)𝒔22,𝒔Σκ,(1-\delta)\|\bm{s}\|^{2}_{2}\leq\|\bm{Q}\bm{s}\|^{2}_{2}\leq(1+\delta)\|\bm{s}\|^{2}_{2},~{}~{}~{}\forall~{}\bm{s}\in\Sigma_{\kappa}, (4)

where Σκ\Sigma_{\kappa} denotes the set of all κ\kappa-sparse signals in 𝑸\bm{Q}. It can be seen that if 𝑸\bm{Q} satisfies the (2κ,δ)(2\kappa,\delta)-RIP, then any 2κ2\kappa columns of 𝑸\bm{Q} are linearly independent, and hence condition (3) holds. Therefore, if 𝒙=𝑸𝒔\bm{x}=\bm{Q}\bm{s}^{\star} with 𝒔0κ\|\bm{s}^{\star}\|_{0}\leq\kappa, then, the underdetermined system 𝒙=𝑸𝒔\bm{x}=\bm{Q}\bm{s} has a unique κ\kappa-sparse solution, which is 𝒔\bm{s}^{\star}.

As an NP-hard problem, (2) is not tractable due to the l0l_{0} norm. To address this, a convex relaxation is commonly employed. In particular, two classes of methods are popularly used:

1.1.1 Iteratively re-weighted approaches

This class includes basis pursuit (BP)-based algorithms, which rely on replacing the l0l_{0} norm with the l1l_{1} norm. This approach relaxes problem (2) to the following convex problem [32, 31]:

𝒔argmin𝒔𝒔1s.t𝒙=𝑸𝒔.\displaystyle\bm{s}^{\ast}\triangleq\arg\min_{\bm{s}}\|\bm{s}\|_{1}~{}~{}~{}\text{s.t}~{}~{}~{}\bm{x}=\bm{Q}\bm{s}. (5)

This is a convex problem, and there exist standard algorithms designed to solve it efficiently, particularly when the problem’s dimension is manageable. We denote by BPC\text{BP}_{C} the algorithm implemented using the MATLAB command linprog.m.
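To make the linear-programming reformulation concrete, the following is a minimal Python/NumPy sketch of how (5) can be posed as an LP via the standard split s = u − v with u, v ≥ 0, using scipy.optimize.linprog in place of MATLAB's linprog.m; the function name bp_c and the solver choice are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linprog

def bp_c(x, Q):
    """Sketch of BP via linear programming: min ||s||_1 s.t. x = Q s,
    using the split s = u - v with u, v >= 0 so that ||s||_1 = sum(u + v)."""
    N, L = Q.shape
    c = np.ones(2 * L)                        # objective: sum(u) + sum(v)
    A_eq = np.hstack([Q, -Q])                 # Q u - Q v = x
    res = linprog(c, A_eq=A_eq, b_eq=x, bounds=(0, None), method="highs")
    u, v = res.x[:L], res.x[L:]
    return u - v                              # recovered s
```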

Note that 𝒔0=l=1Lsgn[|𝒔(l)|]\|\bm{s}\|_{0}=\sum^{L}_{l=1}\text{sgn}[|\bm{s}(l)|], where sgn[t]\text{sgn}[t] is the sign function. Thus, the l1l_{1}-minimization-based BP approach mentioned above simply approximates .0\|.\|_{0} by replacing sgn[|t|]\text{sgn}[|t|] with ϕ1(t)|t|\phi_{1}(t)\triangleq|t|.

To further improve the sparse recovery accuracy of the BP-based approach, a better approximating function ϕ(t)\phi(t) can be used, such as: ϕ2(t)log(|t|+ϵ);ϕ3(t)(|t|2+ϵ)q/2,0<q<1,\phi_{2}(t)\triangleq\text{log}(|t|+\epsilon);~{}\phi_{3}(t)\triangleq(|t|^{2}+\epsilon)^{q/2},~{}~{}0<q<1, where ϵ>0\epsilon>0 is a parameter used to regularize the singularity at t=0t=0. With this function, we define 𝒔χl=1Lϕ(𝒔(l)).\|\bm{s}\|_{\chi}\triangleq\sum^{L}_{l=1}\phi(\bm{s}(l)). The corresponding sparse recovery problem is formulated as the following constrained minimization

𝒔argmin𝒔𝒔χs.t.𝒙=𝑸𝒔.\displaystyle\bm{s}^{\ast}\triangleq\arg\min_{\bm{s}}\|\bm{s}\|_{\chi}~{}~{}\text{s.t.}~{}~{}\bm{x}=\bm{Q}\bm{s}.

This non-convex problem is then addressed with two classes of iteratively re-weighted algorithms [33, 34, 35, 36, 37, 38, 39]. These algorithms are based on a first-order approximation approach and can be unified as follows:

𝒔(k+1)\displaystyle\bm{s}^{(k+1)} \displaystyle\triangleq argmin𝒔𝑫p(k)𝒔pps.t𝒙=𝑸𝒔,\displaystyle\arg\min_{\bm{s}}\|\bm{D}^{(k)}_{p}\bm{s}\|^{p}_{p}~{}~{}~{}\text{s.t}~{}~{}~{}\bm{x}=\bm{Q}\bm{s},
𝑫p(k+1)\displaystyle\bm{D}^{(k+1)}_{p} =\displaystyle= diag{𝒘p[𝒔(k+1)]},p=1,2,\displaystyle\text{diag}\{\bm{w}_{p}[\bm{s}^{(k+1)}]\},~{}~{}~{}~{}~{}~{}~{}~{}~{}p=1,2, (6)

where 𝒘p[𝒔]L×1\bm{w}_{p}[\bm{s}]\in\mathbb{R}^{L\times 1} is determined by the weighting scheme. For p=1p=1, the formulation (6) is referred to as the iteratively re-weighted BP (IRBP) [37, 38, 36], and for p=2p=2, it corresponds to the iteratively re-weighted least squares (IRLS) [35, 39, 33, 34].

In general, the weighting 𝒘p[𝒔]\bm{w}_{p}[\bm{s}] is determined by ϕ(t)\phi(t). For example, the llth element of 𝒘1[𝒔]\bm{w}_{1}[\bm{s}] is given by

𝒘1[𝒔(l)]=1|𝒔(l)|+ϵ\bm{w}_{1}[\bm{s}(l)]=\frac{1}{|\bm{s}(l)|+\epsilon}

in [37], where ϕ(t)=ϕ2(t)\phi(t)=\phi_{2}(t) was used for IRBP. For IRLS with ϕ(t)=ϕ3(t)\phi(t)=\phi_{3}(t), the weight

𝒘2[𝒔(l)]=(|𝒔(l)|2+ϵ)(q/21)/2\bm{w}_{2}[\bm{s}(l)]=(|\bm{s}(l)|^{2}+\epsilon)^{(q/2-1)/2}

was used in [39].

The IRBP-based algorithms provide a powerful tool for solving sparse recovery problems. However, these algorithms are computationally demanding, especially for large-scale scenarios. To reduce computational complexity, an l1l_{1}-l2l_{2} minimization-based reformulation of (5) was proposed in [40, 41], and a class of iterative algorithms, including FISTA, was derived [42, 43].

The IRLS method is popularly used for non-smooth cost functions by solving a sequence of quadratic problems as specified in (6) with p=2p=2. The solution to the 1st equation of (6) is given by

𝒔(k+1)=(𝑫2(k))2𝑸[𝑸(𝑫2(k))2𝑸]1𝒙.\displaystyle\bm{s}^{(k+1)}=(\bm{D}^{(k)}_{2})^{-2}\bm{Q}^{\top}[\bm{Q}(\bm{D}^{(k)}_{2})^{-2}\bm{Q}^{\top}]^{-1}\bm{x}. (7)

Note that the algorithms (6) are ϵ\epsilon-sensitive, and a time-varying ϵ\epsilon is crucial for these algorithms to achieve state-of-the-art recovery performance. In [39], an ϵ\epsilon-regularization strategy was proposed, where (6) with p=2p=2 is iteratively run with a monotonically decreasing sequence {ϵj:108ϵj1}\{\epsilon_{j}:10^{-8}\leq\epsilon_{j}\leq 1\}. In this iterative procedure, 𝒔j\bm{s}_{j} - the solution given by (6) with ϵj\epsilon_{j}, is used as the initial iterate for the next iteration. We denote such an iterative procedure as IRLSC\text{IRLS}_{C}.
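For illustration, a minimal Python/NumPy sketch of such an ϵ-regularized IRLS iteration, built on the update (7) with the weight 𝒘2 above, might look as follows; the function name, the ϵ schedule, and the stopping test are assumptions for exposition and not the exact settings of [39].

```python
import numpy as np

def irls_c(x, Q, q=0.5, n_inner=100):
    """Sketch of an IRLS_C-style iteration: the weighted LS update (7), repeated
    for a monotonically decreasing sequence of epsilon values."""
    s = np.linalg.pinv(Q) @ x                              # minimum l2-norm initial iterate
    for eps in 10.0 ** np.arange(-1.0, -9.0, -1.0):        # eps = 1e-1, ..., 1e-8
        for _ in range(n_inner):
            w_inv = (np.abs(s) ** 2 + eps) ** (1 - q / 2)  # diagonal of (D_2)^{-2}
            s_new = w_inv * (Q.T @ np.linalg.solve((Q * w_inv) @ Q.T, x))  # update (7)
            done = np.linalg.norm(s_new - s) <= 1e-8 * (1.0 + np.linalg.norm(s))
            s = s_new
            if done:
                break
    return s
```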

1.1.2 Greedy methods

By nature, sparse recovery aims to identify the support 𝒔\mathcal{I}_{\bm{s}^{\star}} of the κ\kappa-sparse vector 𝒔\bm{s}^{\star} underlying the measurement 𝒙\bm{x}. Thus, one sparse algorithm differs from another in how it detects the support. The iteratively re-weighted methods estimate the support 𝒔\mathcal{I}_{\bm{s}^{\star}} through a global search in the 𝒔\bm{s}-space and then select the indices of the κ\kappa largest entries in magnitude from their minimizer 𝒔\bm{s}^{\ast}.

The greedy algorithms, as described in [44, 45], intend to address the sparse recovery problem (2) by solving the following alternative problem:

𝒔#min𝒔𝒙𝑸𝒔22s.t.𝒔0κ,\displaystyle\bm{s}^{\#}\triangleq\min_{\bm{s}}\|\bm{x}-\bm{Q}\bm{s}\|^{2}_{2}~{}~{}~{}\text{s.t.}~{}~{}~{}\|\bm{s}\|_{0}\leq\kappa,

where κ\kappa is the given sparsity level. The κ\kappa non-zero entries of 𝒔#\bm{s}^{\#} are identified sequentially through a procedure that minimizes the residual. Such a method is known as the orthogonal matching pursuit (OMP) method.

The OMP algorithm extracts the κ\kappa indices one by one iteratively. The kkth atom, indexed by iki_{k}, is identified as follows:

ikargmini,β𝒓k1β𝑸(:,i)22.i_{k}\triangleq\arg\min_{i,\beta}\|\bm{r}_{k-1}-\beta\bm{Q}(:,i)\|^{2}_{2}. (8)

Here, 𝒓k1\bm{r}_{k-1} is the residual obtained at the (k1)(k-1)th iteration:¹ 𝒓k1=𝒙𝑸(:,k1)𝜷k1,\bm{r}_{k-1}=\bm{x}-\bm{Q}(:,\mathcal{I}_{k-1})\bm{\beta}_{k-1}, where k1\mathcal{I}_{k-1} is the set of the k1k-1 indices detected before the kkth iteration and 𝜷k1\bm{\beta}_{k-1} is the associated coefficient vector obtained in the (k1)(k-1)th iteration.

¹ Throughout this paper, we use MATLAB notations: 𝒗(i)\bm{v}(i) denotes the iith entry of a vector 𝒗\bm{v}, while 𝑨(i,j),𝑨(:,j)\bm{A}(i,j),~{}\bm{A}(:,j), and 𝑨(i,:)\bm{A}(i,:) denote the (i,j)(i,j)th entry, the jjth column, and the iith row of a matrix 𝑨\bm{A}, respectively. Furthermore, let \mathcal{I} be a subset of [L]={1,2,,L}[L]=\{1,2,\cdots,L\}. 𝒗()\bm{v}(\mathcal{I}) is defined as the sub-vector of the vector 𝒗\bm{v} with dimension LL, obtained by excluding those entries whose indices do not belong to \mathcal{I}.
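As an illustration of the classical greedy iteration just described, here is a minimal Python/NumPy sketch of OMP; for (column) l2-normalized 𝑸 the selection rule (8) reduces to picking the atom most correlated with the residual. The function name and minor details are assumptions made for illustration only.

```python
import numpy as np

def omp_c(x, Q, kappa):
    """Sketch of classical OMP: pick the atom most correlated with the residual
    (equivalent to (8) for l2-normalized columns), then re-fit by least squares."""
    idx, r = [], x.copy()
    beta = np.zeros(0)
    for _ in range(kappa):
        corr = np.abs(Q.T @ r)
        corr[idx] = 0.0                                   # do not re-select an atom
        idx.append(int(np.argmax(corr)))
        beta, *_ = np.linalg.lstsq(Q[:, idx], x, rcond=None)
        r = x - Q[:, idx] @ beta                          # residual r_k
    s_hat = np.zeros(Q.shape[1])
    s_hat[idx] = beta
    return s_hat
```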

1.2 Problem formulation and contribution

It should be pointed out that the iteratively re-weighted algorithms provide state-of-the-art sparse recovery performance. However, they are computationally demanding. In particular, IRLS-based algorithms, though much faster than the IRBP-based ones, can encounter numerical issues when the 𝑸\bm{Q} matrix is ill-conditioned (See (7)). By contrast, OMP is significantly more efficient. The most computationally demanding stage involves solving the least squares problem to update the non-zero entries. A class of iterative methods, such as gradient pursuit (GP) and conjugate gradient pursuit (CGP), can be used to address this issue, as described in [46]. To improve the recovery accuracy and update the index set of the κ\kappa non-zero entries more efficiently, one approach is to select multiple entries at once, rather than just one as in OMP. This leads to the development of stage-wise OMP (StOMP) [47], compressive sampling match pursuit (CoSaMP) [48], and stagewise weak GP (StWGP) [49]. A comprehensive survey of these issues as well as performance guarantees can be found in [50].

As is well known, the greedy methods cannot yield a sparse recovery performance comparable to the iteratively re-weighted ones. Suppose that k1\mathcal{I}_{k-1} has its k1k-1 indices all falling within the support 𝒔\mathcal{I}_{\bm{s}^{\star}}. Then, 𝒓k1=𝑸𝒔𝑸(:,k1)𝜷k1𝑸𝒔~\bm{r}_{k-1}=\bm{Q}\bm{s}-\bm{Q}(:,\mathcal{I}_{k-1})\bm{\beta}_{k-1}\triangleq\bm{Q}\tilde{\bm{s}}, and this new system has a unique sparsest solution, denoted as 𝒔~\tilde{\bm{s}}^{\star}, with 𝒔~0κ\|\tilde{\bm{s}}^{\star}\|_{0}\leq\kappa. The l2l_{2}-norm based minimization (8) cannot ensure that the obtained index iki_{k} falls into 𝒔\mathcal{I}_{\bm{s}^{\star}}, because the cost function is a complicated function of the true indices to be detected and changes with 𝜷k1\bm{\beta}_{k-1}. This is the main reason why the OMP algorithm cannot generally yield a sparse recovery as accurate as the iteratively re-weighted algorithms, prompting the proposal of its variants such as CoSaMP and StOMP.

The question we ask ourselves is: can we transform the system 𝒓k1=𝑸𝒔~\bm{r}_{k-1}=\bm{Q}\tilde{\bm{s}} into the 𝒔\bm{s}-domain by characterizing its set of solutions and then work directly on the sparse recovery problem? If an estimate of 𝒔~\tilde{\bm{s}}^{\star} is available, the next index iki_{k} can be selected as the one corresponding to the largest entry in magnitude of this estimate. Intuitively, doing so should improve the accuracy of index detection. This motivates us to deal with the sparse recovery problem in the 𝒔\bm{s}-space instead of the measurement 𝒙\bm{x}-space.

The main objective of this paper is to derive a class of greedy algorithms that operate directly in the 𝒔\bm{s}-domain and outperform classical greedy algorithms (i.e., OMP, CoSaMP) in terms of sparse recovery accuracy while maintaining similar implementation complexity. Additionally, these algorithms offer greater robustness against the numerical instability caused by 𝑸\bm{Q} and interferences in the signal 𝒙\bm{x} compared to the classical algorithms, especially the IRLS-based ones. Our contributions are summarized as follows.

  • Based on a characterization of solutions to equation (1), we propose a class of greedy sparse recovery approaches for a given sparsity measure. Such an approach allows us to work directly in the 𝒔\bm{s}-space;

  • Two algorithms, denoted as AlgGL2\text{Alg}_{GL2} and AlgGL1\text{Alg}_{GL1}, are derived based on l2l_{2} and l1l_{1}, respectively. Despite having nearly the same computational complexity, AlgGL2\text{Alg}_{GL2} significantly outperforms the classical OMP algorithm and is comparable to the classical BP algorithm in terms of recovery accuracy. Meanwhile, the AlgGL1\text{Alg}_{GL1} algorithm significantly surpasses the classical OMP, BPC, and CoSaMP algorithms in terms of sparse recovery accuracy;

  • An improved version of AlgGL1\text{Alg}_{GL1} has been derived, denoted as the AlgGLQ\text{Alg}_{GLQ} algorithm, in which lql_{q}-minimization is used to enhance the sparsity. Fast variants of the algorithms AlgGL1\text{Alg}_{GL1} and AlgGLQ\text{Alg}_{GLQ}, denoted as AlgGL1F\text{Alg}^{F}_{GL1} and AlgGLQF\text{Alg}^{F}_{GLQ}, are developed to accelerate AlgGL1\text{Alg}_{GL1} and AlgGLQ\text{Alg}_{GLQ} by selecting multiple atoms per iteration;

  • It is shown that the proposed algorithms such as AlgGL1\text{Alg}_{GL1}, AlgGLQ\text{Alg}_{GLQ} and their fast variants can be used to deal with low-rank interferences in the measurements effectively, which owes to the proposed characterization and is a significant advantage over the classical algorithms.

All claims have been validated through extensive numerical simulations on both synthetic and image data.

The paper is organized as follows. In Section 2, we derive a characterization of the solution set for the linear system 𝒙=𝑸𝒔\bm{x}=\bm{Q}\bm{s}. Based on this characterization, we propose a greedy approach to sparse recovery. Two specific algorithms, denoted as AlgGL2\text{Alg}_{GL2} and AlgGL1\text{Alg}_{GL1}, are presented in this section. Section 3 analyzes the performance of the proposed algorithms. An improved version of AlgGL1\text{Alg}_{GL1}, denoted as AlgGLQ\text{Alg}_{GLQ}, is derived in Section 4, where lql_{q}-minimization is employed to enhance sparsity. Fast variants of both AlgGLQ\text{Alg}_{GLQ} and AlgGL1\text{Alg}_{GL1} are also derived in this section. Numerical examples and simulations are presented in Section 5 to validate the theoretical results, evaluate the performance of the proposed methods, and compare them with some existing approaches. Finally, concluding remarks are given in Section 6.

2 The Proposed Approach and Algorithms

Our proposed greedy approach is based on the following characterization of the solutions to the system (1).

Let 𝑸=𝑼[𝚺𝟎𝟎𝟎]𝑽\bm{Q}=\bm{U}\left[\begin{array}[]{cc}\bm{\Sigma}&{{\bf 0}}\\ {{\bf 0}}&{{\bf 0}}\end{array}\right]\bm{V}^{\top} be a singular value decomposition (SVD) of 𝑸\bm{Q}, where the diagonal matrix 𝚺0\bm{\Sigma}\succ 0 has dimension N~×N~\tilde{N}\times\tilde{N} with N~N\tilde{N}\leq N. It then follows from 𝑸𝒔=𝒙\bm{Q}\bm{s}=\bm{x} that

[𝚺𝟎𝟎𝟎]𝑽𝒔=𝑼𝒙.\left[\begin{array}[]{cc}\bm{\Sigma}&{{\bf 0}}\\ {{\bf 0}}&{{\bf 0}}\end{array}\right]\bm{V}^{\top}\bm{s}=\bm{U}^{\top}\bm{x}.

Denote 𝑽1𝑽(:,1:N~)\bm{V}_{1}\triangleq\bm{V}(:,1:\tilde{N}), 𝑾𝑽(:,N~+1:L)\bm{W}\triangleq\bm{V}(:,\tilde{N}+1:L), and 𝑼1𝑼(:,1:N~)\bm{U}_{1}\triangleq\bm{U}(:,1:\tilde{N}). With 𝑽𝒔=[𝒔¯1𝒛]\bm{V}^{\top}\bm{s}=\left[\begin{array}[]{c}\bar{\bm{s}}_{1}\\ \bm{z}\end{array}\right], we can show that any solution 𝒔\bm{s} to (1) is given by 𝒔=𝑽[𝒔¯1𝒛]\bm{s}=\bm{V}\left[\begin{array}[]{c}\bar{\bm{s}}_{1}\\ \bm{z}\end{array}\right] and hence is of the form 𝒔=𝑽1𝚺1𝑼1𝒙+𝑾𝒛\bm{s}=\bm{V}_{1}\bm{\Sigma}^{-1}\bm{U}^{\top}_{1}\bm{x}+\bm{W}\bm{z}, i.e.,

𝒔𝑽1𝒙c+𝑾𝒛𝒔0(𝒙)+𝑾𝒛𝒔(𝒛,𝒙),\displaystyle\bm{s}\triangleq\bm{V}_{1}\bm{x}_{c}+\bm{W}\bm{z}\triangleq\bm{s}_{0}(\bm{x})+\bm{W}\bm{z}\triangleq\bm{s}(\bm{z},\bm{x}), (9)

where 𝒙c=𝚺1𝑼1𝒙𝑻0𝒙\bm{x}_{c}=\bm{\Sigma}^{-1}\bm{U}^{\top}_{1}\bm{x}\triangleq\bm{T}_{0}\bm{x}. Consequently, the solution set 𝒮0{𝒔(𝒛,𝒙)}\mathcal{S}_{0}\triangleq\{\bm{s}(\bm{z},\bm{x})\} of system (1) is completely characterized by the vector variable 𝒛(LN~)×1\bm{z}\in\mathbb{R}^{(L-\tilde{N})\times 1} for a given 𝒙\bm{x}. The sparse recovery problem (2) can then be converted into the following equivalent form (for simplicity, 𝒔0(𝒙)\bm{s}_{0}(\bm{x}) and 𝒔(𝒛,𝒙)\bm{s}(\bm{z},\bm{x}) are sometimes denoted as 𝒔0\bm{s}_{0} and 𝒔(𝒛)\bm{s}(\bm{z}), respectively, in the sequel):

𝒛\displaystyle\bm{z}^{\star} \displaystyle\triangleq argmin𝒛𝒔0+𝑾𝒛0𝒔=𝒔0+𝑾𝒛.\displaystyle\arg\min_{\bm{z}}\|\bm{s}_{0}+\bm{W}\bm{z}\|_{0}\quad\mapsto~{}~{}~{}~{}\bm{s}^{\star}=\bm{s}_{0}+\bm{W}\bm{z}^{\star}. (10)

The alternative formulation transforms the classical problem (2) from a constrained minimization to an unconstrained one, allowing us to directly address sparse recovery in the 𝒔\bm{s}-space of dimension L×1L\times 1.

A traditional sparse recovery algorithm aims to find the κ\kappa-sparse solution 𝒔\bm{s}^{\star} for each measurement 𝒙\bm{x}, where the system is parametrized by 𝑸\bm{Q} according to the model (1). In this paper, we propose a class of greedy algorithms that obtain the solution based on the proposed model (9), where the system is parametrized by (𝑾,𝑽1)(\bm{W},\bm{V}_{1}). The input 𝒔0\bm{s}_{0} is computed as 𝒔0=𝑽1𝒙c=𝑽1𝑻0𝒙\bm{s}_{0}=\bm{V}_{1}\bm{x}_{c}=\bm{V}_{1}\bm{T}_{0}\bm{x}, serving as an initial estimate of 𝒔\bm{s}^{\star}. The parameters (𝑾,𝑽1,𝑻0)(\bm{W},\bm{V}_{1},\bm{T}_{0}), which are equivalent to 𝑸\bm{Q}, are determined at the design stage. When deriving the algorithms, we assume that the triple (𝑾,𝑽1,𝒔0)(\bm{W},\bm{V}_{1},\bm{s}_{0}) is available, similar to (𝑸,𝒙)(\bm{Q},\bm{x}) in traditional methods.
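For concreteness, a minimal Python/NumPy sketch of how the triple (𝑾,𝑽1,𝑻0) and the input 𝒔0 in (9) might be computed from an SVD of 𝑸 at the design stage is given below; the function name, the rank tolerance, and the use of numpy.linalg.svd are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def characterize(Q, x, tol=1e-10):
    """Sketch of the design-stage computation behind (9): s = s0 + W z solves
    x = Q s for every z, with W an orthonormal basis of the null space of Q."""
    U, sig, Vt = np.linalg.svd(Q)
    r = int(np.sum(sig > tol))           # numerical rank N~
    U1, V1 = U[:, :r], Vt[:r, :].T
    W = Vt[r:, :].T                      # V(:, N~+1:L)
    T0 = (U1 / sig[:r]).T                # Sigma^{-1} U1^T
    s0 = V1 @ (T0 @ x)                   # particular (minimum l2-norm) solution s0(x)
    return s0, W, V1, T0

# Sanity check under these assumptions: for any z, Q @ (s0 + W @ z) reproduces x.
```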

2.1 The proposed greedy approach

Suppose that i1i_{1} is a detected index, expected to be one of the elements in 𝒔\mathcal{I}_{\bm{s}^{\star}}. Let us consider how to detect the next index, say i2i_{2}, in a greedy manner.

First of all, it follows from 𝒙=𝑸𝒔\bm{x}=\bm{Q}\bm{s} that the residual is given by 𝒙β𝑸(:,i1)=𝑸𝒔β𝑸(:,i1)\bm{x}-\beta\bm{Q}(:,i_{1})=\bm{Q}\bm{s}-\beta\bm{Q}(:,i_{1}) or more simply 𝒙β𝑸(:,i1)=𝑸𝒔1\bm{x}-\beta\bm{Q}(:,i_{1})=\bm{Q}\bm{s}_{1}, where 𝒔1\bm{s}_{1}, according to the proposed characterization (9), can be expressed as

𝒔1\displaystyle\bm{s}_{1} =𝑽1𝑻0(𝒙β𝑸(:,i1))+𝑾𝒛=𝒔0+𝑾𝒛β𝑽1𝑻0𝑸(:,i1)=𝒔(𝒛)β𝑽1𝑻0𝑸(:,i1),\displaystyle=\bm{V}_{1}\bm{T}_{0}(\bm{x}-\beta\bm{Q}(:,i_{1}))+\bm{W}\bm{z}=\bm{s}_{0}+\bm{W}\bm{z}-\beta\bm{V}_{1}\bm{T}_{0}\bm{Q}(:,i_{1})=\bm{s}(\bm{z})-\beta\bm{V}_{1}\bm{T}_{0}\bm{Q}(:,i_{1}),

where 𝒔(𝒛)=𝒔0+𝑾𝒛𝒮0\bm{s}(\bm{z})=\bm{s}_{0}+\bm{W}\bm{z}\in\mathcal{S}_{0}, as defined in (9).

Recall that 𝑸=𝑼1𝚺𝑽1\bm{Q}=\bm{U}_{1}\bm{\Sigma}\bm{V}_{1}^{\top}. Together with 𝑻0=𝚺1𝑼1\bm{T}_{0}=\bm{\Sigma}^{-1}\bm{U}^{\top}_{1} and 𝒔0=𝑽1𝑻0𝒙=𝑽1𝑻0𝑸𝒔\bm{s}_{0}=\bm{V}_{1}\bm{T}_{0}\bm{x}=\bm{V}_{1}\bm{T}_{0}\bm{Q}\bm{s}^{\star}, we obtain

𝒔0=𝑽1𝑽1𝒔𝑸¯𝒔,𝑽1𝑻0𝑸(:,i1)=𝑸¯(:,i1),\displaystyle\bm{s}_{0}=\bm{V}_{1}\bm{V}^{\top}_{1}\bm{s}^{\star}\triangleq\bar{\bm{Q}}\bm{s}^{\star},\quad\bm{V}_{1}\bm{T}_{0}\bm{Q}(:,i_{1})=\bar{\bm{Q}}(:,i_{1}), (11)

where 𝑸¯𝑽1𝑽1\bar{\bm{Q}}\triangleq\bm{V}_{1}\bm{V}^{\top}_{1}. Define a set 𝒮1{𝒔1:𝒔1=𝒔(𝒛)β𝑸¯(:,i1)}\mathcal{S}_{1}\triangleq\{\bm{s}_{1}:\bm{s}_{1}=\bm{s}(\bm{z})-\beta\bar{\bm{Q}}(:,i_{1})\} that is characterized by (𝒛,β)(\bm{z},\beta).

Remark 2.1.

It is important to note that since we assume 𝒮0\mathcal{S}_{0} has a unique κ\kappa-sparse vector 𝐬=𝐬0+𝐖𝐳\bm{s}^{\star}=\bm{s}_{0}+\bm{W}\bm{z}^{\star}, 𝒮1\mathcal{S}_{1} possesses the following properties:

  • having a unique (κ1)(\kappa-1)-sparse element 𝒔1\bm{s}^{\star}_{1}, if and only if i1𝒔i_{1}\in\mathcal{I}_{\bm{s}^{\star}} and β=𝒔(i1)β1\beta=\bm{s}^{\star}(i_{1})\triangleq\beta^{\star}_{1};

  • 𝒔10κ,𝒔1𝒮1\|\bm{s}_{1}\|_{0}\geq\kappa,~{}\forall~{}\bm{s}_{1}\in\mathcal{S}_{1}, if i1𝒔i_{1}\notin\mathcal{I}_{\bm{s}^{\star}}.

Thus, any solution of

{i~,β~,𝒛~}argmini,β,𝒛𝒔(𝒛)β𝑸¯(:,i)0\{\tilde{i},\tilde{\beta},\tilde{\bm{z}}\}\triangleq\arg\min_{i,\beta,\bm{z}}\|\bm{s}(\bm{z})-\beta\bar{\bm{Q}}(:,i)\|_{0} (12)

satisfies i~𝐬,β~=𝐬(i~)\tilde{i}\in\mathcal{I}_{\bm{s}^{\star}},\tilde{\beta}=\bm{s}^{\star}(\tilde{i}), and 𝐳~=𝐳\tilde{\bm{z}}=\bm{z}^{\star}.

Assume that i1𝒔i_{1}\in\mathcal{I}_{\bm{s}^{\star}}. We note that problem (12) with i=i1i=i_{1} results in 𝒛~=𝒛\tilde{\bm{z}}=\bm{z}^{\star} and β~=β1\tilde{\beta}=\beta^{\star}_{1}. Then, the corresponding 𝒔1=𝒔𝒔(i1)𝑸¯(:,i1)\bm{s}^{\star}_{1}=\bm{s}^{\star}-\bm{s}^{\star}(i_{1})\bar{\bm{Q}}(:,i_{1}) is the unique (κ1)(\kappa-1)-sparse and the sparsest element in set 𝒮1\mathcal{S}_{1}. This fact implies that to detect the next index in 𝒔\mathcal{I}_{\bm{s}^{\star}}, say i2i_{2}, we can first identify the sparsest element in 𝒮1\mathcal{S}_{1}. Then, i2i_{2} can be determined as the index corresponding to the largest entry in magnitude of 𝒔1\bm{s}^{\star}_{1}.

It is based on this observation that we propose the following greedy method, whose (k+1)(k+1)th iteration involves

{{𝒛~k,𝜷~k}argmin𝒛,𝜷𝒔(𝒛)𝑸¯k𝜷χ𝒔~k=𝒔(𝒛~k)𝑸¯k𝜷~k,ik+1argmaxilkc|𝒔~k(il)|k+1=kik+1,\displaystyle\left\{\begin{array}[]{rcl}\{\tilde{\bm{z}}_{k},\tilde{\bm{\beta}}_{k}\}&\triangleq&\arg\min_{\bm{z},\bm{\beta}}\|\bm{s}(\bm{z})-\bar{\bm{Q}}_{k}\bm{\beta}\|_{\chi}\quad\mapsto~{}\tilde{\bm{s}}_{k}=\bm{s}(\tilde{\bm{z}}_{k})-\bar{\bm{Q}}_{k}\tilde{\bm{\beta}}_{k},\\ i_{k+1}&\triangleq&\arg\max_{\forall~{}i_{l}\in\mathcal{I}^{c}_{k}}|\tilde{\bm{s}}_{k}(i_{l})|\quad~{}~{}~{}\quad\mapsto~{}~{}\mathcal{I}_{k+1}=\mathcal{I}_{k}\cup i_{k+1},\end{array}\right. (15)

where 𝑸¯k𝑸¯(:,k)\bar{\bm{Q}}_{k}\triangleq\bar{\bm{Q}}(:,\mathcal{I}_{k}) with k{i1,i2,,ik}\mathcal{I}_{k}\triangleq\{i_{1},i_{2},\cdots,i_{k}\} being the set of indices detected before, and kc\mathcal{I}^{c}_{k} its complement.333Throughout this paper, k\mathcal{I}_{k} denotes a set of indices when kk is an integer, while, as defined before, 𝒗\mathcal{I}_{\bm{v}} denotes the support of a vector 𝒗\bm{v}. Here, .χ\|.\|_{\chi} denotes the l0l_{0} norm or one of its approximating measures.

Note that the pair (𝒔~k,𝜷~k)(\tilde{\bm{s}}_{k},\tilde{\bm{\beta}}_{k}) in (15) corresponds to an 𝒔k\bm{s}_{k} that satisfies 𝒙=𝑸𝒔k\bm{x}=\bm{Q}\bm{s}_{k}, where

𝒔k=𝒔~k+𝜷~k,\bm{s}_{k}=\tilde{\bm{s}}_{k}+\tilde{\bm{\beta}}^{\ast}_{k}, (16)

and 𝜷~k\tilde{\bm{\beta}}^{\ast}_{k} is kk-sparse with 𝜷~k(k)=𝜷~k\tilde{\bm{\beta}}^{\ast}_{k}(\mathcal{I}_{k})=\tilde{\bm{\beta}}_{k}.

Running (15) for κ\kappa iterations, the κ\kappa-sparse estimate of 𝒔\bm{s}^{\star}, denoted as 𝒔^\hat{\bm{s}}^{\star}, is given by 𝒔^(κ)=𝜷~κ\hat{\bm{s}}^{\star}(\mathcal{I}_{\kappa})=\tilde{\bm{\beta}}_{\kappa}. It will be shown in the next section that if 𝒔~κ=𝟎\tilde{\bm{s}}_{\kappa}=\bm{0}, then κ=𝒔\mathcal{I}_{\kappa}=\mathcal{I}_{\bm{s}^{\star}}, and 𝜷~κ=𝒔(κ),\tilde{\bm{\beta}}_{\kappa}=\bm{s}^{\star}(\mathcal{I}_{\kappa}), which implies that the unique κ\kappa-sparse solution 𝒔\bm{s}^{\star} of 𝒙=𝑸𝒔\bm{x}=\bm{Q}\bm{s} has been obtained.

Although the .0\|.\|_{0} norm is the best measure for dealing with sparsity-related problems, one of its approximating measures has to be used, as minimizing the .0\|.\|_{0} norm is NP-hard and hence not tractable. In the next two subsections, we further elaborate on the proposed greedy approach (15) with two approximating measures.

2.2 An l2l_{2}-relaxation

Now, consider χ=2\chi=2, that is, the l2l_{2}-relaxation of l0l_{0}. Denote 𝒔k=𝒔(𝒛)𝑸¯(:,k)𝜷\bm{s}_{k}=\bm{s}(\bm{z})-\bar{\bm{Q}}(:,\mathcal{I}_{k})\bm{\beta}. We have

𝒔k22=𝒔0𝑸¯(:,k)𝜷22+𝑾𝒛22\|\bm{s}_{k}\|^{2}_{2}=\|\bm{s}_{0}-\bar{\bm{Q}}(:,\mathcal{I}_{k})\bm{\beta}\|^{2}_{2}+\|\bm{W}\bm{z}\|^{2}_{2}

due to the fact that 𝒔0=𝑸¯𝒔\bm{s}_{0}=\bar{\bm{Q}}\bm{s}^{\star}, where 𝑸¯=𝑽1𝑽1\bar{\bm{Q}}=\bm{V}_{1}\bm{V}^{\top}_{1}, as defined in (11), and 𝑽1𝑾=𝟎\bm{V}_{1}^{\top}\bm{W}={\bf 0}. Then, the proposed greedy approach (15) turns to

{𝜷~kargmin𝜷𝒔0𝑸¯(:,k)𝜷22𝒔~k=𝒔0𝑸¯(:,k)𝜷~kik+1argmaxilkc|𝒔~k(il)|k+1=kik+1\left\{\begin{array}[]{rcl}\tilde{\bm{\beta}}_{k}&\triangleq&\arg\min_{\bm{\beta}}\|\bm{s}_{0}-\bar{\bm{Q}}(:,\mathcal{I}_{k})\bm{\beta}\|^{2}_{2}\quad\mapsto~{}~{}\tilde{\bm{s}}_{k}=\bm{s}_{0}-\bar{\bm{Q}}(:,\mathcal{I}_{k})\tilde{\bm{\beta}}_{k}\\ i_{k+1}&\triangleq&\arg\max_{\forall~{}i_{l}\in\mathcal{I}^{c}_{k}}|\tilde{\bm{s}}_{k}(i_{l})|\quad\quad\quad~{}\mapsto~{}~{}\mathcal{I}_{k+1}=\mathcal{I}_{k}\cup i_{k+1}\end{array}\right. (17)

as 𝒛~k=𝟎,k\tilde{\bm{z}}_{k}=\bm{0},~{}\forall~{}k. We name the above approach AlgGL2\text{Alg}_{GL2}, where the subscript “GL2” stands for greedy method with indices selected using l2l_{2} minimization. We briefly denote the approach as 𝒔^=AlgGL2(𝒔0,𝑽1,κ)\hat{\bm{s}}^{\star}=\text{Alg}_{GL2}(\bm{s}_{0},\bm{V}_{1},\kappa), where 𝒔0=𝑽1𝑻0𝒙\bm{s}_{0}=\bm{V}_{1}\bm{T}_{0}\bm{x}, as mentioned before, is the initial estimate of 𝒔\bm{s}^{\star}.
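A minimal Python/NumPy sketch of the iteration (17) is given below, assuming the triple (𝒔0, 𝑽1, κ) has been prepared as above; the function name, the least-squares routine, and the index-masking details are illustrative choices, not the authors' implementation.

```python
import numpy as np

def alg_gl2(s0, V1, kappa):
    """Sketch of Alg_GL2 (17): an OMP-type greedy loop run in the s-space,
    with Qbar = V1 V1^T and the transformed residual s~_k in place of r_k."""
    L = V1.shape[0]
    xc = V1.T @ s0                          # s0 = V1 xc and V1 has orthonormal columns
    idx = []
    s_tilde = s0.copy()
    beta = np.zeros(0)
    for _ in range(kappa):
        cand = np.abs(s_tilde)
        cand[idx] = 0.0                     # exclude already selected indices
        idx.append(int(np.argmax(cand)))
        Qc = V1[idx, :].T                   # so that Qbar(:, I) = V1 @ Qc
        beta, *_ = np.linalg.lstsq(Qc, xc, rcond=None)
        s_tilde = s0 - V1 @ (Qc @ beta)     # residual s~_k in the s-space
    s_hat = np.zeros(L)
    s_hat[idx] = beta
    return s_hat
```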

Noting that 𝒔0=𝑸¯𝒔\bm{s}_{0}=\bar{\bm{Q}}\bm{s}^{\star}, we realize that the proposed AlgGL2\text{Alg}_{GL2} algorithm (17) is closely related to the classical OMP algorithm applied to the following system, which is of higher dimension than 𝒙=𝑸𝒔\bm{x}=\bm{Q}\bm{s}:

𝒔0=𝑸¯𝒔.\bm{s}_{0}=\bar{\bm{Q}}\bm{s}. (18)

It can be seen that the AlgGL2\text{Alg}_{GL2} algorithm differs from the classical OMP applied to (18) in how it detects the next index ik+1i_{k+1}.

Remark 2.2.

It should be pointed out that, although the transformed system 𝐬0=𝐐¯𝐬\bm{s}_{0}=\bar{\bm{Q}}\bm{s} is theoretically equivalent to the original 𝐱=𝐐𝐬\bm{x}=\bm{Q}\bm{s}, the proposed AlgGL2\text{Alg}_{GL2} algorithm significantly outperforms the classical OMPC(𝐱,𝐐,κ)\text{OMP}_{C}(\bm{x},\bm{Q},\kappa) algorithm - the OMP algorithm applied directly to 𝐱=𝐐𝐬\bm{x}=\bm{Q}\bm{s}. The most important reason for this is that the “residual” in the transformed system is 𝐬~k\tilde{\bm{s}}_{k}, instead of the 𝐫k\bm{r}_{k} used in the classical greedy approach. More details are analyzed in the next section.

2.3 An l1l_{1}-relaxation

Note that 𝒔k=𝒔0+𝑾𝒛𝑸¯(:,k)𝜷\bm{s}_{k}=\bm{s}_{0}+\bm{W}\bm{z}-\bar{\bm{Q}}(:,\mathcal{I}_{k})\bm{\beta} can be rewritten as 𝒔k=𝒔0+𝛀k𝒛¯\bm{s}_{k}=\bm{s}_{0}+\bm{\Omega}_{k}\bar{\bm{z}}, where

𝛀k\displaystyle\bm{\Omega}_{k} \displaystyle\triangleq [𝑾𝑸¯(:,k)],𝒛¯[𝒛𝜷].\displaystyle\left[\begin{array}[]{cc}\bm{W}&-\bar{\bm{Q}}(:,\mathcal{I}_{k})\end{array}\right],~{}\bar{\bm{z}}\triangleq\left[\begin{array}[]{c}\bm{z}\\ \bm{\beta}\end{array}\right]. (22)

With χ=1\chi=1, the proposed greedy approach (15) becomes

{𝒛¯kargmin𝒛¯𝒔0+𝛀k𝒛¯1𝒔~k=𝒔0+𝛀k𝒛¯k,ik+1argmaxilkc|𝒔~k(il)|k+1=kik+1,\left\{\begin{array}[]{rcl}\bar{\bm{z}}_{k}&\triangleq&\arg\min_{\bar{\bm{z}}}\|\bm{s}_{0}+\bm{\Omega}_{k}\bar{\bm{z}}\|_{1}\quad\mapsto~{}~{}\tilde{\bm{s}}_{k}=\bm{s}_{0}+\bm{\Omega}_{k}\bar{\bm{z}}_{k},\\ i_{k+1}&\triangleq&\arg\max_{\forall~{}i_{l}\in\mathcal{I}^{c}_{k}}|\tilde{\bm{s}}_{k}(i_{l})|\quad\mapsto~{}~{}\mathcal{I}_{k+1}=\mathcal{I}_{k}\cup i_{k+1},\end{array}\right. (23)

which is a greedy l1l_{1}-based algorithm for sparse recovery.

As seen, such an iterative algorithm involves solving a series of l1l_{1}-minimizations of the form

𝒛~argmin𝒛𝒔0+𝛀𝒛1.\tilde{\bm{z}}\triangleq\arg\min_{\bm{z}}\|\bm{s}_{0}+\bm{\Omega}\bm{z}\|_{1}. (24)

As shown in [31], the above problem can be converted into a linear programming problem and hence solved with one of the standard convex optimization techniques, such as the MATLAB command linprog.m. For convenience, we denote the corresponding algorithm as BPalt\text{BP}_{alt}: [𝒛~,𝒔~]=BPalt(𝒔0,𝛀)[\tilde{\bm{z}},\tilde{\bm{s}}]=\text{BP}_{alt}(\bm{s}_{0},\bm{\Omega}) with the residual 𝒔~=𝒔0+𝛀𝒛~\tilde{\bm{s}}=\bm{s}_{0}+\bm{\Omega}\tilde{\bm{z}}.
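As an illustration of this reformulation, the following Python/SciPy sketch poses (24) as a linear program by introducing an auxiliary vector t ≥ |𝒔0+𝛀𝒛| elementwise; the function name bp_alt and the use of scipy.optimize.linprog are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import linprog

def bp_alt(s0, Omega):
    """Sketch of BP_alt (24): min_z ||s0 + Omega z||_1, posed as an LP over
    y = [z; t] with the auxiliary vector t >= |s0 + Omega z| elementwise."""
    L, m = Omega.shape
    c = np.concatenate([np.zeros(m), np.ones(L)])      # minimize sum(t)
    A_ub = np.block([[Omega, -np.eye(L)],              #   s0 + Omega z <= t
                     [-Omega, -np.eye(L)]])            # -(s0 + Omega z) <= t
    b_ub = np.concatenate([-s0, s0])
    bounds = [(None, None)] * m + [(0, None)] * L
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    z = res.x[:m]
    return z, s0 + Omega @ z                           # (z~, residual s~)
```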

The outline of the proposed greedy l1l_{1}-based algorithm (23), denoted as AlgGL1\text{Alg}_{GL1}, is then presented in Algorithm 1. For convenience, we denote this algorithm as 𝒔^=AlgGL1(𝒔0,𝑾,𝑽1,κ).\hat{\bm{s}}^{\star}=\text{Alg}_{GL1}(\bm{s}_{0},\bm{W},\bm{V}_{1},\kappa).

Algorithm 1 A greedy l1l_{1}-based method for sparse recovery (AlgGL1\text{Alg}_{GL1})
Inputs: (𝒔0,𝑾,𝑽1)(\bm{s}_{0},\bm{W},\bm{V}_{1}) and κ\kappa.
Initials: =[]\mathcal{I}=[~{}]. Run [𝒛~,𝒔~]=BPalt(𝒔0,𝑾)[\tilde{\bm{z}},\tilde{\bm{s}}]=\text{BP}_{alt}(\bm{s}_{0},\bm{W}), and assign the residual 𝒔~\tilde{\bm{s}} to 𝒔\bm{s};
for k=1:κk=1:\kappa do
     - Checking whether 𝒔\bm{s} is the desired solution: Let |𝒔|κ+1|\bm{s}|_{\kappa+1} denote the (κ+1)(\kappa+1)-th entry of the re-arranged sequence of {|𝒔(l)|}\{|\bm{s}(l)|\} sorted in a descending order.   𝐈𝐟|𝒔|κ+1104,𝐛𝐫𝐞𝐚𝐤𝑓𝑜𝑟𝑙𝑜𝑜𝑝.{\bf If}~{}~{}|\bm{s}|_{\kappa+1}\leq 10^{-4},~{}{\bf break}~{}~{}{\it for-loop}.
     - Updating index \mathcal{I}: Let imi_{m} be the index such that |𝒔~(im)|=maxi{|𝒔~(i)|}|\tilde{\bm{s}}(i_{m})|=\max_{i}\{|\tilde{\bm{s}}(i)|\}. Then, [im]\mathcal{I}~{}\leftarrow~{}[\mathcal{I}~{}~{}i_{m}].
     - Updating the residual 𝒔~\tilde{\bm{s}} and 𝜷~\tilde{\bm{\beta}}: Compute 𝛀=[𝑾𝑽1(𝑽1(,:))]\bm{\Omega}=\left[\begin{array}[]{cc}\bm{W}&-\bm{V}_{1}(\bm{V}_{1}(\mathcal{I},:))^{\top}\end{array}\right], then run [𝒛,𝒔~]=BPalt(𝒔0,𝛀),[\bm{z},\tilde{\bm{s}}]=\text{BP}_{alt}(\bm{s}_{0},\bm{\Omega}), set 𝜷~=𝟎\tilde{\bm{\beta}}=\bm{0} and 𝜷~()\tilde{\bm{\beta}}(\mathcal{I}) with the last length()\text{length}(\mathcal{I}) entries of 𝒛\bm{z};
     - Updating 𝒔\bm{s} using (16) with the pair (𝒔~,𝜷~)(\tilde{\bm{s}},\tilde{\bm{\beta}}):   𝒔=𝒔~+𝜷~.\bm{s}=\tilde{\bm{s}}+\tilde{\bm{\beta}}.
end for
Output: 𝒔^=𝒔\hat{\bm{s}}^{\star}=\bm{s}.
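A compact Python sketch of Algorithm 1 is given below; it reuses the hypothetical bp_alt helper sketched after (24), and the function name, tolerance, and index-masking details are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def alg_gl1(s0, W, V1, kappa, tol=1e-4):
    """Sketch of Algorithm 1 (Alg_GL1), built on the bp_alt sketch given after (24)."""
    L = V1.shape[0]
    idx = []
    _, s_tilde = bp_alt(s0, W)                          # initial residual, no atom selected
    s = s_tilde.copy()
    for _ in range(kappa):
        if np.sort(np.abs(s))[::-1][kappa] <= tol:      # |s|_{kappa+1} small: s is kappa-sparse
            break
        cand = np.abs(s_tilde)
        cand[idx] = 0.0
        idx.append(int(np.argmax(cand)))                # index of largest residual entry
        Omega = np.hstack([W, -V1 @ V1[idx, :].T])      # [W  -Qbar(:, I)]
        z, s_tilde = bp_alt(s0, Omega)
        beta = np.zeros(L)
        beta[idx] = z[-len(idx):]                       # last length(I) entries of z
        s = s_tilde + beta                              # update via (16)
    return s
```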

Before proceeding to the next section, we first evaluate the performance of the proposed greedy algorithms through the following numerical experiment. We generate a sequence {𝒙j}\{\bm{x}_{j}\} of J=1,000J=1,000 samples with 𝒙j=𝑸𝒔j,j\bm{x}_{j}=\bm{Q}\bm{s}_{j},~{}\forall~{}j. Here, 𝑸64×128\bm{Q}\in\mathbb{R}^{64\times 128} is a (column) l2l_{2}-normalized matrix, and the {𝒔j}\{\bm{s}_{j}\} with 𝒔j0=20\|\bm{s}_{j}\|_{0}=20 are generated randomly. We run each of the four algorithms: OMPC(𝒙,𝑸,κ)\text{OMP}_{C}(\bm{x},\bm{Q},\kappa), BPC\text{BP}_{C} - the classical BP method, AlgGL2(𝒔0,𝑽1,κ)\text{Alg}_{GL2}(\bm{s}_{0},\bm{V}_{1},\kappa), and AlgGL1(𝒔0,𝑾,𝑽1,κ)\text{Alg}_{GL1}(\bm{s}_{0},\bm{W},\bm{V}_{1},\kappa), where 𝒔0=𝑽1𝑻0𝒙\bm{s}_{0}=\bm{V}_{1}\bm{T}_{0}\bm{x} with 𝒙=𝒙j\bm{x}=\bm{x}_{j}, on all the 1,000 samples {𝒙j}\{\bm{x}_{j}\}.

Let λ[𝒗]\mathcal{I}_{\lambda}[\bm{v}] denote the index set of the λ\lambda largest absolute entries of 𝒗\bm{v}. A recovery of a κ\kappa-sparse 𝒔\bm{s}^{\star} is considered successful if the solution 𝒔^\hat{\bm{s}}^{\star} obtained by an algorithm satisfies κ[𝒔^]=𝒔.\mathcal{I}_{\kappa}[\hat{\bm{s}}^{\star}]=\mathcal{I}_{\bm{s}^{\star}}. The rate of successful recovery for an algorithm running JJ samples is defined as ϱokJok/J\varrho_{ok}\triangleq J_{ok}/J, where JokJ_{ok} is the number of successful recoveries out of the JJ samples.

Table 1: Recovery rates for the four algorithms.
OMPC\text{OMP}_{C} BPC\text{BP}_{C} AlgGL2\text{Alg}_{GL2} AlgGL1\text{Alg}_{GL1}
ϱok\varrho_{ok} 65.80% 94.30% 97.40% 100%

With the above setting, the successful recovery rates for the four algorithms are given in Table 1.

Let 𝒔^\hat{\bm{s}}^{\star} be an estimate of 𝒔\bm{s}^{\star}, both being κ\kappa-sparse. The signal-to-noise ratio of this estimation is defined as ϱsnr10×log10𝑸𝒔22𝑸(𝒔𝒔^)22(dB)\varrho_{snr}\triangleq 10\times\text{log}_{10}\frac{\|\bm{Q}\bm{s}^{\star}\|^{2}_{2}}{\|\bm{Q}(\bm{s}^{\star}-\hat{\bm{s}}^{\star})\|^{2}_{2}}~{}~{}~{}(\text{dB}). When some of the κ\kappa nonzero entries of 𝒔\bm{s}^{\star} are very small, these entries contribute little to 𝒙=𝑸𝒔\bm{x}=\bm{Q}\bm{s}^{\star}, yet it is very difficult to correctly identify their indices. In this paper, we consider a recovery to be successful if either all the indices of 𝒔\bm{s}^{\star} are detected or the corresponding ϱsnr\varrho_{snr} exceeds 60 dB.
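The two figures of merit can be computed along the following lines; this is a small Python/NumPy sketch in which the function name is an assumption made for illustration.

```python
import numpy as np

def success_metrics(Q, s_true, s_hat, kappa):
    """Sketch of the two figures of merit: exact support identification and the
    recovery SNR in dB defined above."""
    supp_true = set(np.flatnonzero(s_true))
    supp_hat = set(np.argsort(np.abs(s_hat))[::-1][:kappa])   # kappa largest in magnitude
    snr = 10.0 * np.log10(np.linalg.norm(Q @ s_true) ** 2
                          / np.linalg.norm(Q @ (s_true - s_hat)) ** 2)
    return supp_hat == supp_true, snr
```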

3 Performance Analysis of the Proposed Greedy Approach

From the numerical example presented above, we observe that the proposed AlgGL2\text{Alg}_{GL2} algorithm significantly outperforms the classical OMPC\text{OMP}_{C} and even surpasses BPC\text{BP}_{C}. Meanwhile, as expected, the proposed AlgGL1\text{Alg}_{GL1} achieves a perfect sparse recovery across all 1,000 samples. These results demonstrate that our proposed greedy approach successfully achieves the objective raised in Section 1.2. In this section, we conduct a detailed analysis of the proposed approach and the resulting algorithms.

3.1 Approach analysis

As understood, greedy methods, including our proposed approach (15), are designed based on minimizing a residual. Consequently, the generated sequence of residuals is non-increasing and hence converges. Beyond this, most greedy algorithms are expected to yield the unique solution 𝒔\bm{s}^{\star} within a finite number of iterations.

First of all, we present the following result associated with the proposed greedy approach (15).

Theorem 1.

Let k\mathcal{I}_{k} denote the set of indices detected, and let (𝛃~k,𝐳~k)(\tilde{\bm{\beta}}_{k},\tilde{\bm{z}}_{k}) be obtained using the proposed greedy approach (15). Furthermore, let 𝐬~k=𝐬(𝐳~k)𝐐¯k𝛃~k\tilde{\bm{s}}_{k}=\bm{s}(\tilde{\bm{z}}_{k})-\bar{\bm{Q}}_{k}\tilde{\bm{\beta}}_{k} be the residual, and denote 𝐬^kL×1\hat{\bm{s}}_{k}\in\mathbb{R}^{L\times 1} as the vector satisfying

𝒔^k(kc)=𝟎,𝒔^k(k)=𝜷~k.\hat{\bm{s}}_{k}(\mathcal{I}^{c}_{k})=\bm{0},~{}~{}\hat{\bm{s}}_{k}(\mathcal{I}_{k})=\tilde{\bm{\beta}}_{k}.

Then, when 𝐬~k=𝟎\tilde{\bm{s}}_{k}=\bm{0} for k2κk\leq 2\kappa, we have 𝐬^k=𝐬\hat{\bm{s}}_{k}=\bm{s}^{\star} as long as 𝐬k\mathcal{I}_{\bm{s}^{\star}}\subseteq\mathcal{I}_{k}.

Proof: First of all, we note that 𝒔~k=𝟎\tilde{\bm{s}}_{k}=\bm{0}, that is 𝒔0+𝑾𝒛~k𝑸¯(:,k)𝜷~k=𝟎,\bm{s}_{0}+\bm{W}\tilde{\bm{z}}_{k}-\bar{\bm{Q}}(:,\mathcal{I}_{k})\tilde{\bm{\beta}}_{k}=\bm{0}, which is equivalent to 𝑸¯(𝒔𝒔^k)+𝑾𝒛~k=𝟎\bar{\bm{Q}}(\bm{s}^{\star}-\hat{\bm{s}}_{k})+\bm{W}\tilde{\bm{z}}_{k}=\bm{0}. It then follows from 𝑾𝑽1=𝟎\bm{W}^{\top}\bm{V}_{1}={\bf 0} and 𝑸¯=𝑽1𝑽1\bar{\bm{Q}}=\bm{V}_{1}\bm{V}^{\top}_{1} that 𝒛~k=𝟎and𝑽1(𝒔𝒔^k)=𝟎.\tilde{\bm{z}}_{k}=\bm{0}~{}\text{and}~{}\bm{V}^{\top}_{1}(\bm{s}^{\star}-\hat{\bm{s}}_{k})=\bm{0}.

We note that 𝒔𝒔^k\bm{s}^{\star}-\hat{\bm{s}}_{k} is 2κ2\kappa-sparse when the condition given in the theorem holds and spark[𝑽1]=spark[𝑸]>2κ\text{spark}[\bm{V}^{\top}_{1}]=\text{spark}[\bm{Q}]>2\kappa due to the assumption of uniqueness on 𝒔\bm{s}^{\star}. Thus, 𝑽1(:,k)\bm{V}^{\top}_{1}(:,\mathcal{I}_{k}) is of full (column) rank, and hence the above equation leads to 𝒔𝒔^k=𝟎\bm{s}^{\star}-\hat{\bm{s}}_{k}=\bm{0}. This completes the proof.  

In fact, the claims in Theorem 1 also hold for the classical OMPC\text{OMP}_{C} algorithm, in which the upper bound for the number of iterations is set to κ\kappa. This means that OMPC\text{OMP}_{C} must select an index that falls within 𝒔\mathcal{I}_{\bm{s}^{\star}} at every iteration, leaving no room for mistakes in selecting indices. Such mistakes can, however, be rectified by taking more than κ\kappa iterations [50]. In general, for k\mathcal{I}_{k} with κ<k2κ\kappa<k\leq 2\kappa, nothing can guarantee that 𝒔k\mathcal{I}_{\bm{s}^{\star}}\subseteq\mathcal{I}_{k} holds unless 𝜷~k0=κ\|\tilde{\bm{\beta}}_{k}\|_{0}=\kappa is verified.

The primary advantage of the proposed greedy approach over classical methods, such as OMPC\text{OMP}_{C} and CoSaMP, lies in its index selection strategy. In the proposed greedy approach (15), the next index ik+1i_{k+1} is chosen as the one corresponding to the largest absolute entry of the residual 𝒔~k\tilde{\bm{s}}_{k}. As demonstrated in Remark 2.1 and the surrounding discussion, this selection ensures that ik+1i_{k+1} belongs to 𝒔\mathcal{I}_{\bm{s}^{\star}}. In contrast, OMPC\text{OMP}_{C} determines ik+1i_{k+1} by minimizing 𝒓kβ𝑸(:,i)22\|\bm{r}_{k}-\beta\bm{Q}(:,i)\|^{2}_{2} with respect to β\beta for ikci\in\mathcal{I}^{c}_{k}. However, such a selection does not necessarily guarantee that ik+1i_{k+1} belongs to 𝒔\mathcal{I}_{\bm{s}^{\star}}, even if k𝒔\mathcal{I}_{k}\subseteq\mathcal{I}_{\bm{s}^{\star}}, as the minimizer of 𝒓kβ𝑸(:,i)22\|\bm{r}_{k}-\beta\bm{Q}(:,i)\|^{2}_{2} may not correspond to an index within 𝒔\mathcal{I}_{\bm{s}^{\star}}.

Furthermore, let k\mathcal{I}_{k} be the detected indices and k\mathcal{I}^{\ast}_{k} be a subset of k\mathcal{I}_{k}, which contains kk^{\star} elements with k𝒔\mathcal{I}^{\ast}_{k}\subseteq\mathcal{I}_{\bm{s}^{\star}}. Using similar arguments to those in Remark 2.1, we have

{𝜷~k,𝒛~k}argmin𝜷,𝒛𝒔(𝒛)𝑸¯(:,k)𝜷0,\{\tilde{\bm{\beta}}_{k},\tilde{\bm{z}}_{k}\}\triangleq\arg\min_{\bm{\beta},\bm{z}}\|\bm{s}(\bm{z})-\bar{\bm{Q}}(:,\mathcal{I}_{k})\bm{\beta}\|_{0},

for which the residual is

𝒔~k=𝒔(𝒛~k)𝑸¯(:,k)𝜷~k=𝒔k\tilde{\bm{s}}_{k}=\bm{s}(\tilde{\bm{z}}_{k})-\bar{\bm{Q}}(:,\mathcal{I}_{k})\tilde{\bm{\beta}}_{k}=\bm{s}^{\star}_{k^{\star}}

as long as k>kkk>k/2k^{\star}>k-k^{\star}~{}~{}\Rightarrow~{}~{}k^{\star}>k/2 holds, where 𝒔k(l)=0,lk\bm{s}^{\star}_{k^{\star}}(l)=0,~{}l\in\mathcal{I}^{\ast}_{k} and otherwise, 𝒔k(l)=𝒔(l)\bm{s}^{\star}_{k^{\star}}(l)=\bm{s}^{\star}(l). This implies that with χ\chi taking l0l_{0}, the index ik+1i_{k+1} of the largest entry in magnitude of the residual 𝒔k\bm{s}^{\star}_{k^{\star}} should fall within 𝒔\mathcal{I}_{\bm{s}^{\star}}, provided that more than half of the kk detected indices belong to 𝒔\mathcal{I}_{\bm{s}^{\star}}. In other words, when the first few indices are detected correctly, the proposed greedy approach tolerates a small number of mistakes in index detection. This is indeed a remarkable property.

Remark 3.1.

The conclusion drawn above is based on the use of the ideal sparsity measure l0l_{0}. Intuitively, the same conclusion is likely to hold with a high probability when a good approximating measure of l0l_{0} is used. Indeed, the excellent performance of AlgGL1\text{Alg}_{GL1}, in which l1l_{1} is used, appears to support this argument. This highlighted property suggests that algorithms yielding better performance than AlgGL1\text{Alg}_{GL1} could be developed by using the proposed greedy approach with more suitable approximating measures χ\chi of l0l_{0}. This will be elaborated further in the next section.

3.2 Comparison between OMPC\text{OMP}_{C} and AlgGL2\text{Alg}_{GL2}

As argued above, the key to AlgGL2\text{Alg}_{GL2} outperforming the classical OMPC\text{OMP}_{C} algorithm lies in its method of detecting the support 𝒔\mathcal{I}_{\bm{s}^{\star}}. Another contributing factor is related to the numerical properties of the system matrix, including the concept of spark and RIP (see equations (3) &\& (4)), and mutual coherence. In fact, let 𝒙=𝑸𝒔\bm{x}=\bm{Q}\bm{s}^{\star} with 𝒔0κ\|\bm{s}^{\star}\|_{0}\leq\kappa. It was shown in [30] and [45] that both the classical BPC\text{BP}_{C} and OMPC\text{OMP}_{C} algorithms can yield the true solution 𝒔\bm{s}^{\star} from the system 𝒙=𝑸𝒔\bm{x}=\bm{Q}\bm{s} with 𝑸\bm{Q} (column) l2l_{2}-normalized, if the following condition holds:

κ<12(1+1μ[𝑸]),\kappa<\frac{1}{2}(1+\frac{1}{\mu[\bm{Q}]}), (25)

where μ[𝑸]\mu[\bm{Q}] is the mutual coherence of 𝑸N×L\bm{Q}\in\mathbb{R}^{N\times L}. As shown in [51], for a matrix 𝑸N×L\bm{Q}\in\mathbb{R}^{N\times L} with 𝑸(:,l)2=1,l\|\bm{Q}(:,l)\|_{2}=1,~{}\forall~{}l, μ[𝑸]LNN(L1)μ¯\mu[\bm{Q}]\leq\sqrt{\frac{L-N}{N(L-1)}}\triangleq\underline{\mu} and the lower bound is achieved if and only if the Gram matrix 𝑮𝑸𝑸𝑸\bm{G}_{\bm{Q}}\triangleq\bm{Q}^{\top}\bm{Q} satisfies 𝑮𝑸(i,j)=μ¯,ij\bm{G}_{\bm{Q}}(i,j)=\underline{\mu},\forall~{}i\neq j. Such a 𝑸\bm{Q} is called an equiangular tight-frame (ETF).

Condition (25) is just a sufficient condition to ensure a perfect recovery of 𝒔\bm{s}^{\star}. As shown in [52] and [53], the sparse recovery accuracy of OMPC\text{OMP}_{C} is more closely related to the averaged mutual coherence of 𝑸\bm{Q}, denoted as μav[𝑸]\mu_{av}[\bm{Q}], which is defined as the mean of all the squared off-diagonal entries of 𝑮𝑸\bm{G}_{\bm{Q}}, rather than to μ[𝑸]\mu[\bm{Q}]. This implies that it is desirable for the system matrix 𝑸\bm{Q} to have its Gram matrix close to the identity matrix 𝑰\bm{I}. Based on this argument, it was shown in [53] and [54] that an optimal 𝑸\bm{Q} is a tight-frame (TF), whose Gram matrix 𝑮𝑸\bm{G}_{\bm{Q}} is of the form 𝑮𝑸=𝑽[α2𝑰N𝟎𝟎𝟎]𝑽,\bm{G}_{\bm{Q}}=\bm{V}\left[\begin{array}[]{cc}\alpha^{2}\bm{I}_{N}&{\bf 0}\\ {\bf 0}&{\bf 0}\end{array}\right]\bm{V}^{\top}, where 𝑽\bm{V} is orthonormal. As ETFs exist only for some pairs (N,L)(N,L), tight-frames provide an alternative for designing sparse recovery systems [55, 56, 57].

Our proposed AlgGL2\text{Alg}_{GL2} algorithm is based on the system 𝒔0(𝒙)=𝑸¯𝒔\bm{s}_{0}(\bm{x})=\bar{\bm{Q}}\bm{s}. As shown in (11), 𝑸¯=𝑽1𝑻0𝑸\bar{\bm{Q}}=\bm{V}_{1}\bm{T}_{0}\bm{Q}. We note that

  • It is difficult to find an explicit relationship between μ[𝑸¯]\mu[\bar{\bm{Q}}] and μ[𝑸]\mu[\bm{Q}]. However, 𝑸¯=𝑽1𝑽1\bar{\bm{Q}}=\bm{V}_{1}\bm{V}^{\top}_{1} is actually a 1-tight frame and extensive examples have shown that μav[𝑸¯]<μav[𝑸]\mu_{av}[\bar{\bm{Q}}]<\mu_{av}[\bm{Q}];

  • The classical OMPC\text{OMP}_{C} algorithm aims to find (𝜷~,k)(\tilde{\bm{\beta}},\mathcal{I}_{k}) such that the residual 𝒓k=𝒙𝑸k𝜷𝒙𝑸𝒔~k\bm{r}_{k}=\bm{x}-\bm{Q}_{k}\bm{\beta}\triangleq\bm{x}-\bm{Q}\tilde{\bm{s}}_{k} is driven as close to zero as possible. Clearly, 𝒓k=𝑸(𝒔𝒔~k)𝑸Δ𝒔\bm{r}_{k}=\bm{Q}(\bm{s}^{\star}-\tilde{\bm{s}}_{k})\triangleq\bm{Q}\Delta\bm{s}. Assume that Δ𝒔\Delta\bm{s} has zero mean and uncorrelated entries. Then, 𝑹E[Δ𝒔(Δ𝒔)]\bm{R}\triangleq E[\Delta\bm{s}(\Delta\bm{s})^{\top}] is diagonal with 𝑹(l,l)=σl2,l\bm{R}(l,l)=\sigma^{2}_{l},~{}\forall~{}l, where E[.]E[.] denotes the statistical average operation. Then, ηavE[𝒓k22]\eta_{av}\triangleq E[\|\bm{r}_{k}\|^{2}_{2}] is given by ηav=Tr[𝑸𝑸𝑹]=l=1Lσl2\eta_{av}=\text{Tr}[\bm{Q}^{\top}\bm{Q}\bm{R}]=\sum^{L}_{l=1}\sigma^{2}_{l}, as 𝑸\bm{Q} is (column) l2l_{2}-normalized.

    Applying the above to the higher-dimensional system 𝒔0=𝑸¯𝒔\bm{s}_{0}=\bar{\bm{Q}}\bm{s}, we obtain η¯av=Tr[𝑸¯𝑹]=l=1L𝑽1(l,:)22σl2.\bar{\eta}_{av}=\text{Tr}[\bar{\bm{Q}}\bm{R}]=\sum^{L}_{l=1}\|\bm{V}_{1}(l,:)\|^{2}_{2}\sigma^{2}_{l}. Since 𝑽1(l,:)21,l\|\bm{V}_{1}(l,:)\|_{2}\leq 1,~\forall~l, we can see that η¯avηav\bar{\eta}_{av}\leq\eta_{av}. This implies that, in a statistical sense, AlgGL2\text{Alg}_{GL2} converges faster than OMPC\text{OMP}_{C};

  • Let 𝑸\bm{Q} be a matrix with the (κ,δ)(\kappa,\delta)-RIP property. It follows from 𝑸=𝑼1𝚺𝑽1\bm{Q}=\bm{U}_{1}\bm{\Sigma}\bm{V}^{\top}_{1} that

    \sigma^{-1}_{1}\|\bm{Q}\bm{s}\|_{2}\leq\|\bm{V}^{\top}_{1}\bm{s}\|_{2}\leq\sigma^{-1}_{\tilde{N}}\|\bm{Q}\bm{s}\|_{2},

    where σ1\sigma_{1} and σN~\sigma_{\tilde{N}} are the largest and smallest (nonzero) singular values of 𝑸\bm{Q}, respectively. It can be shown with some manipulations that 𝑸¯=𝑽1𝑽1\bar{\bm{Q}}=\bm{V}_{1}\bm{V}^{\top}_{1} satisfies the (κ,δ¯)(\kappa,\bar{\delta})-RIP property with δ¯<δ+(1δ)(σ1/σN~1)σ1/σN~+(1δ)/(1+δ).\bar{\delta}<\delta+\frac{(1-\delta)(\sigma_{1}/\sigma_{\tilde{N}}-1)}{\sigma_{1}/\sigma_{\tilde{N}}+(1-\delta)/(1+\delta)}. If 𝑸\bm{Q} is well-conditioned, i.e., the condition number σ1/σN~\sigma_{1}/\sigma_{\tilde{N}} is not far away from one (this is usually a necessary condition for sparsifying systems to function properly), then so is 𝑸¯\bar{\bm{Q}}. Thus, most of the theoretical results for the original system 𝒙=𝑸𝒔\bm{x}=\bm{Q}\bm{s} hold for the transformed system 𝒔0(𝒙)=𝑸¯𝒔\bm{s}_{0}(\bm{x})=\bar{\bm{Q}}\bm{s}.

Now, let us consider the implementation complexity of AlgGL2\text{Alg}_{GL2}. Define 𝑸c𝑽1.\bm{Q}_{c}\triangleq\bm{V}^{\top}_{1}. It follows from (9) and 𝑸¯=𝑽1𝑽1\bar{\bm{Q}}=\bm{V}_{1}\bm{V}^{\top}_{1} that the 1st equation of (17) can be rewritten as

𝒔0𝑸¯(:,k)𝜷22=𝒙c𝑸c(:,k)𝜷22.\|\bm{s}_{0}-\bar{\bm{Q}}(:,\mathcal{I}_{k})\bm{\beta}\|^{2}_{2}=\|\bm{x}_{c}-\bm{Q}_{c}(:,\mathcal{I}_{k})\bm{\beta}\|^{2}_{2}.

Compared with its counterpart in OMPC\text{OMP}_{C}: 𝒙𝑸(:,k)𝜷22\|\bm{x}-\bm{Q}(:,\mathcal{I}_{k})\bm{\beta}\|^{2}_{2}, we can see that at the stage of obtaining the optimal coefficients, our proposed AlgGL2\text{Alg}_{GL2} has exactly the same computational complexity if 𝒙c\bm{x}_{c} is available. We note that the extra computational burden, denoted as 𝒞b\mathcal{C}_{b}, for computing 𝒙c=𝑻0𝒙\bm{x}_{c}=\bm{T}_{0}\bm{x} involves N2N^{2} multiplications and N(N1)N(N-1) additions.

Regarding the index selection stage, i.e., the 2nd equation of (17), our proposed AlgGL2\text{Alg}_{GL2} algorithm needs to compute the residual as follows

𝒔^k=𝒔0𝑸¯(:,k)𝜷=𝑽1(𝒙c𝑸c(:,k)𝜷),\hat{\bm{s}}^{\star}_{k}=\bm{s}_{0}-\bar{\bm{Q}}(:,\mathcal{I}_{k})\bm{\beta}=\bm{V}_{1}(\bm{x}_{c}-\bm{Q}_{c}(:,\mathcal{I}_{k})\bm{\beta}),

while OMPC\text{OMP}_{C} requires computing 𝝃\bm{\xi} with

𝝃=𝑸𝒓k=𝑸(𝒙𝑸(:,k)𝜷).\bm{\xi}=\bm{Q}^{\top}\bm{r}_{k}=\bm{Q}^{\top}(\bm{x}-\bm{Q}(:,\mathcal{I}_{k})\bm{\beta}).

This implies that both algorithms have the same computational complexity at this stage.

Therefore, we claim that our proposed AlgGL2\text{Alg}_{GL2} algorithm significantly outperforms OMPC\text{OMP}_{C}, albeit at the price of an extra computational burden 𝒞b\mathcal{C}_{b}.⁵

⁵ Note that converting the original system 𝒙=𝑸𝒔\bm{x}=\bm{Q}\bm{s} into the higher dimensional system 𝒔0(𝒙)=𝑸¯𝒔\bm{s}_{0}(\bm{x})=\bar{\bm{Q}}\bm{s} requires the SVD of 𝑸\bm{Q}. Since this is part of designing the system/algorithm and is done only once, the computational complexity involved at this stage is not considered part of the algorithm’s overall complexity.

3.3 Comparison between BPC/BPalt\text{BP}_{C}/\text{BP}_{alt} and AlgGL1\text{Alg}_{GL1}

Recall that BPC\text{BP}_{C} refers to the BP algorithm based on the classical formulation (5), while BPalt\text{BP}_{alt} denotes the BP algorithm based on the reformulation (24) of (5). They are mathematically equivalent.

Compared to OMPC\text{OMP}_{C}, the two BP-based algorithms adopt a global optimization approach to detect 𝒔\mathcal{I}_{\bm{s}^{\star}}. However, the interactions among the entries in 𝒔\bm{s}, imposed by the constraint 𝒙=𝑸𝒔\bm{x}=\bm{Q}\bm{s}, make it challenging to identify the sparsest solution 𝒔\bm{s}^{\star}. The proposed AlgGL1\text{Alg}_{GL1} algorithm is specifically designed to mitigate the influence of the magnitudes of detected atoms on the identification of subsequent indices in a greedy manner. It leverages the strengths of both the classical BPalt\text{BP}_{alt} and OMPC\text{OMP}_{C} algorithms, effectively combining their advantages. Consequently, the substantial performance improvement over BPC/BPalt\text{BP}_{C}/\text{BP}_{alt}, as previously discussed, primarily stems from the method used to detect the support 𝒔\mathcal{I}_{\bm{s}^{\star}}, which is performed in a higher-dimensional space.

The proposed AlgGL1\text{Alg}_{GL1} algorithm exhibits excellent performance in terms of sparse recovery accuracy. Regarding implementation complexity, however, it requires solving κ\kappa l1l_{1}-minimization problems of the form min𝒛𝒔0+𝑾¯𝒛1\min_{\bm{z}}\|\bm{s}_{0}+\bar{\bm{W}}\bm{z}\|_{1}, i.e., (24), using BPalt\text{BP}_{alt}. Consequently, similar to BPC\text{BP}_{C} and BPalt\text{BP}_{alt}, both of which are implemented via linear programming, AlgGL1\text{Alg}_{GL1} is computationally demanding.

More efficient algorithms for solving (24) are needed to enable the use of AlgGL1\text{Alg}_{GL1} in large-scale sparse recovery systems. Indeed, a subgradient-based algorithm was proposed in [58] for solving (24), which significantly outperforms BPalt\text{BP}_{alt} in terms of speed and has substantially lower implementation complexity. In the next section, we introduce an IRLS-based algorithm for solving (24). By replacing BPalt\text{BP}_{alt} with this IRLS-based approach in AlgGL1\text{Alg}_{GL1}, the resulting algorithm not only achieves significantly faster convergence but also delivers superior sparse recovery performance.

4 Improved Algorithms

The proposed AlgGL1\text{Alg}_{GL1} algorithm is computationally slow due to two main factors: i) it relies on linear programming-based BP; and ii) similar to the classical OMP, it selects atoms one-by-one. Additionally, the proposed algorithm AlgGL1\text{Alg}_{GL1} is designed to solve the minimization problems of the form (15) with χ\chi defined as the l1l_{1}-norm. Higher performance algorithms can be developed by employing a more effective norm to approximate the sparsity norm .0\|.\|_{0}.

In this section, we derive a class of efficient algorithms that adopt an atom update scheme similar to that of CoSaMP and explore strategies for handling cases where the measurements are contaminated by noise.

4.1 Iteratively re-weighted approach to (15)

Note that the cost function F(𝒛,𝜷)F(\bm{z},\bm{\beta}) of (15) is of the form

f(\bar{\bm{z}})\triangleq\|\bm{s}_{0}+\bm{\Omega}\bar{\bm{z}}\|_{\chi}\quad\text{with}\quad\bm{\Omega}=\left[\begin{array}[]{cc}\bm{W}&-\bar{\bm{Q}}_{k}\end{array}\right],~\bar{\bm{z}}=\left[\begin{array}[]{c}\bm{z}\\ \bm{\beta}\end{array}\right], (29)

where χ\chi represents an approximating function ϕ(t)\phi(t) of the sign function sgn[|t|]\text{sgn}[|t|].

Thus, the key to (15) is to solve

min𝒛¯f(𝒛¯)=min𝒛¯lϕ(𝒔0(l)+𝛀(l,:)𝒛¯).\min_{\bar{\bm{z}}}f(\bar{\bm{z}})=\min_{\bar{\bm{z}}}\sum_{l}\phi(\bm{s}_{0}(l)+\bm{\Omega}(l,:)\bar{\bm{z}}). (30)

Here, we consider two approximating functions defined before: ϕ1(t)=|t|,ϕ3(t)=(|t|2+ϵ)q/2,\phi_{1}(t)=|t|,~{}~{}\phi_{3}(t)=(|t|^{2}+\epsilon)^{q/2}, where ϵ>0\epsilon>0 and 0<q<10<q<1. When χ\chi represents ϕ1(t)=|t|\phi_{1}(t)=|t|, (30) is convex and can be solved using BPalt\text{BP}_{alt}, resulting in the AlgGL1\text{Alg}_{GL1} algorithm.

When χ\chi represents the lql_{q}-norm ϕ3(t)\phi_{3}(t), f(𝒛¯)f(\bar{\bm{z}}) is highly non-convex and such a problem can be addressed using the IRLS method, which generates a sequence {𝒛¯(k)}\{\bar{\bm{z}}^{(k)}\} for a fixed ϵ\epsilon in the way:

𝒛¯(k+1)\displaystyle\bar{\bm{z}}^{(k+1)}\triangleq argmin𝒛¯𝒔¯0+𝛀¯𝒛¯22,𝒔(k+1)=𝒔0+𝛀𝒛¯(k+1),\displaystyle\arg\min_{\bar{\bm{z}}}\|\bar{\bm{s}}_{0}+\bar{\bm{\Omega}}\bar{\bm{z}}\|^{2}_{2},\quad\bm{s}^{(k+1)}=\bm{s}_{0}+\bm{\Omega}\bar{\bm{z}}^{(k+1)}, (31)

where 𝒔¯0𝑫2(k)𝒔0\bar{\bm{s}}_{0}\triangleq\bm{D}^{(k)}_{2}\bm{s}_{0} and 𝛀¯𝑫2(k)𝛀\bar{\bm{\Omega}}\triangleq\bm{D}^{(k)}_{2}\bm{\Omega}, with the diagonal 𝑫2(k)\bm{D}^{(k)}_{2} given by

𝑫2(k)(l,l)=[|𝒔(k)(l)|2+ϵ]q/21,l.\bm{D}^{(k)}_{2}(l,l)=\sqrt{[|\bm{s}^{(k)}(l)|^{2}+\epsilon]^{q/2-1}},~{}\forall~{}l. (32)

Note that the solution to the 1st term of (31) is given by

𝒛¯(k+1)=(𝛀¯𝛀¯)1𝛀¯𝒔¯0.\bar{\bm{z}}^{(k+1)}=-(\bar{\bm{\Omega}}^{\top}\bar{\bm{\Omega}})^{-1}\bar{\bm{\Omega}}^{\top}\bar{\bm{s}}_{0}. (33)

We use the same ϵ\epsilon-regularization scheme to set {ϵj}\{\epsilon_{j}\} and the same stopping criterion as in [39]. The corresponding ϵ\epsilon-regularized IRLS algorithm, denoted as IRLSLQalt\text{IRLS}_{LQ-alt} to differentiate it from the classical IRLSC\text{IRLS}_{C}, is presented in Algorithm 2. For convenience, we denote this algorithm as [𝒔,𝒛]=IRLSLQalt(𝒔0,𝛀,q).[\bm{s},\bm{z}]=\text{IRLS}_{LQ-alt}(\bm{s}_{0},\bm{\Omega},q). As understood, replacing BPalt\text{BP}_{alt} with IRLSLQalt\text{IRLS}_{LQ-alt} in AlgGL1\text{Alg}_{GL1} yields an algorithm similar to AlgGL1\text{Alg}_{GL1}. Such an algorithm is denoted as 𝒔=AlgGLQ(𝒔0,𝛀,q)\bm{s}^{\ast}=\text{Alg}_{GLQ}(\bm{s}_{0},\bm{\Omega},q).

Algorithm 2 An IRLS-based algorithm for solving (30) (IRLSLQalt\text{IRLS}_{LQ-alt})
Inputs: (𝒔0,𝛀,q)(\bm{s}_{0},\bm{\Omega},q);
Initials: Set ϵ=10\epsilon=10, 𝒛=𝟎\bm{z}=\bm{0}, 𝒔=𝒔0\bm{s}=\bm{s}_{0}, and 𝒔p=𝟎\bm{s}_{p}=\bm{0};
for p~=1,2,,9\tilde{p}=1,2,\cdots,9 do
     (1) set ϵ=ϵ/10\epsilon=\epsilon/10 and ct=0ct=0;
     (2) while 𝒔𝒔p2/𝒔2>ϵ/100\|\bm{s}-\bm{s}_{p}\|_{2}/\|\bm{s}\|_{2}>\sqrt{\epsilon}/100, set 𝒔p=𝒔\bm{s}_{p}=\bm{s} and ctct+1ct~{}\leftarrow~{}ct+1. Then, compute the weighting matrix 𝑫2\bm{D}_{2} with (32);
     (3) update 𝒔\bm{s} with 𝒛=(𝛀𝑫22𝛀)1𝛀𝑫22𝒔0,𝒔=𝒔0+𝛀𝒛;~{}~{}\bm{z}=-(\bm{\Omega}^{\top}\bm{D}_{2}^{2}\bm{\Omega})^{-1}\bm{\Omega}^{\top}\bm{D}_{2}^{2}\bm{s}_{0},\quad\bm{s}=\bm{s}_{0}+\bm{\Omega}\bm{z};
     (4) check   𝐈𝐟ct>4500,𝐛𝐫𝐞𝐚𝐤.{\bf If}~{}ct>4500,~{}{\bf break}.
     (5) Go to step (2);
end for
Output: 𝒔\bm{s}.
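A compact Python/NumPy sketch of Algorithm 2 is given below; the function name and minor control-flow details are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def irls_lq_alt(s0, Omega, q=0.5, max_inner=4500):
    """Sketch of Algorithm 2 (IRLS_LQ-alt): epsilon-regularized IRLS for problem (30)."""
    s = s0.copy()
    z = np.zeros(Omega.shape[1])
    eps = 10.0
    for _ in range(9):
        eps /= 10.0
        s_prev = np.zeros_like(s)
        ct = 0
        while (np.linalg.norm(s - s_prev) > np.sqrt(eps) / 100 * np.linalg.norm(s)
               and ct <= max_inner):
            s_prev = s.copy()
            ct += 1
            d2 = (np.abs(s) ** 2 + eps) ** (q / 2 - 1)     # diagonal of D_2^2, cf. (32)
            Od = Omega * d2[:, None]                        # D_2^2 Omega (row scaling)
            z = -np.linalg.solve(Omega.T @ Od, Od.T @ s0)   # weighted LS solution (33)
            s = s0 + Omega @ z
    return s, z
```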
Remark 4.1.

It should be noted that the solution to (31) is theoretically obtained through matrix inversion as shown in (33). However, this solution for the unconstrained LS problem can also be computed using more efficient iterative algorithms, such as GP and CGP, which avoid matrix inversion. See [46] for details. This provides a significant advantage of our proposed IRLSLQalt\text{IRLS}_{LQ-alt} over the classical IRLSC\text{IRLS}_{C}, which requires a large number of matrix inversions (see (7)). These inversions are not only computationally expensive but can also lead to numerical issues when the system matrix 𝐐\bm{Q} is ill-conditioned. This advantage will be further demonstrated in the simulation section.

4.2 CoSaMP-type algorithms for sparse recovery

As mentioned earlier, although Alg_GL1 achieves excellent sparse recovery performance, it is slow, partly because atoms are selected one at a time. Borrowing the selection strategy of CoSaMP, we can instead add multiple atoms to the selected set at each iteration.

Let AlgGLX\text{Alg}_{GLX} be an algorithm, say AlgGL1\text{Alg}_{GL1} or AlgGLQ\text{Alg}_{GLQ}, derived earlier. Its improved version, which uses a CoSaMP-type strategy for updating the set of selected indices, is denoted as AlgGLXF\text{Alg}^{F}_{GLX} and outlined in Algorithm 3 (for AlgGL1F\text{Alg}^{F}_{GL1}). For convenience, we denote this algorithm as 𝒔=AlgGL1F(𝒔0,𝑾,𝑽1,κ,κp).\bm{s}^{\ast}=\text{Alg}^{F}_{GL1}(\bm{s}_{0},\bm{W},\bm{V}_{1},\kappa,\kappa_{p}). By replacing BPalt\text{BP}_{alt} with IRLSLQalt\text{IRLS}_{LQ-alt} in AlgGL1F\text{Alg}^{F}_{GL1}, we obtain AlgGLQF\text{Alg}^{F}_{GLQ}.

Algorithm 3 A CoSaMP-type sparse recovery algorithm associated with BPalt\text{BP}_{alt} (AlgGL1F\text{Alg}^{F}_{GL1})
Inputs: (𝒔0,𝑾,𝑽1)(\bm{s}_{0},\bm{W},\bm{V}_{1}), κ\kappa, and κp\kappa_{p};
Initials: Run [z̃, s̃] = BP_alt(s_0, W). Then assign the residual s̃ to s, and set I as the set of indices of the κ_p largest entries of s in magnitude;
for k=1:Niterk=1:N_{iter} do  
     - Checking whether 𝒔\bm{s} is the desired solution: Let |𝒔|κ+1|\bm{s}|_{\kappa+1} denote the (κ+1)(\kappa+1)-th entry of the re-arranged sequence of {|𝒔(l)|}\{|\bm{s}(l)|\} sorted in a descending order. 𝐈𝐟|𝒔|κ+1104,𝐛𝐫𝐞𝐚𝐤𝑓𝑜𝑟𝑙𝑜𝑜𝑝.~{}~{}{\bf If}~{}~{}|\bm{s}|_{\kappa+1}\leq 10^{-4},~{}{\bf break}~{}~{}{\it for-loop}.
     - Updating the residual 𝒔~\tilde{\bm{s}} and 𝜷~\tilde{\bm{\beta}}: Compute 𝛀=[𝑾𝑽1(𝑽1(,:))]\bm{\Omega}=\left[\begin{array}[]{cc}\bm{W}&-\bm{V}_{1}(\bm{V}_{1}(\mathcal{I},:))^{\top}\end{array}\right], then run [𝒛,𝒔~]=BPalt(𝒔0,𝛀)[\bm{z},\tilde{\bm{s}}]=\text{BP}_{alt}(\bm{s}_{0},\bm{\Omega}). Set 𝜷~=𝟎\tilde{\bm{\beta}}=\bm{0} and 𝜷~()\tilde{\bm{\beta}}(\mathcal{I}) with the last length()\text{length}(\mathcal{I}) entries of 𝒛\bm{z};
     - Updating 𝒔\bm{s} using (16) with the pair (𝒔~,𝜷~)(\tilde{\bm{s}},\tilde{\bm{\beta}}):   𝒔=𝒔~+𝜷~.\bm{s}=\tilde{\bm{s}}+\tilde{\bm{\beta}}.
     - Updating the index set \mathcal{I} with the indices of the κ\kappa largest entries of 𝒔\bm{s} in magnitude.
end for
Output: 𝒔=𝒔\bm{s}^{\ast}=\bm{s}.
Remark 4.2.

We note that (i) the parameter κ_p depends on the sparse recovery performance of the associated algorithm, such as BP_alt or IRLS_LQ-alt; in general, κ_p is chosen such that 0.35 ≤ κ_p/κ ≤ 0.75; and (ii) the number of indices selected in I equals κ at every iteration except the first, where κ_p indices are included. This number can be slightly greater than κ but is definitely smaller than 2κ; see Theorem 1.
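For illustration, the core of one iteration of Algorithm 3 can be written in MATLAB as in the sketch below; here bp_alt stands for an (unspecified) implementation of BP_alt, and s, s0, W, V1, and kappa are assumed to be available from the earlier steps:

[~, order] = sort(abs(s), 'descend');
I = order(1:kappa);                        % kappa largest entries of s in magnitude
Omega = [W, -V1 * (V1(I, :)).'];           % augmented matrix for the selected atoms
[zc, s_tilde] = bp_alt(s0, Omega);         % residual update via BP_alt (assumed available)
beta = zeros(size(s0));
beta(I) = zc(end - numel(I) + 1:end);      % last length(I) entries of z
s = s_tilde + beta;                        % update of s as in (16)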

4.3 Robust sparse recovery algorithms against low-rank disturbance

So far, our discussion has been limited to the case where the measurement signal 𝒙\bm{x} is strictly sparse. Now, we consider a more practical scenario where the signal 𝒙\bm{x} is of the following form

𝒙=𝑸𝒔+𝒆𝒙0+𝒆,\bm{x}=\bm{Q}\bm{s}+\bm{e}\triangleq\bm{x}_{0}+\bm{e}, (34)

where e represents a disturbance that hinders the recovery of the clean signal x_0, and hence of the underlying sparse vector s^⋆, from x.

The objective here is to separate x_0 and e from the given x. A traditional approach is to first obtain a primary estimate s^∗ of the true s^⋆ underlying x_0 by applying a sparse recovery algorithm to x. Then, with I defined as the set of indices corresponding to the κ largest entries of s^∗ in magnitude, an estimate x̂_0 of the clean x_0 is given by

𝒙^0=𝑸𝒔^,\hat{\bm{x}}_{0}=\bm{Q}\hat{\bm{s}}^{\star}, (35)

where 𝒔^\hat{\bm{s}}^{\star} is κ\kappa-sparse and satisfies 𝒔^()=𝜷\hat{\bm{s}}^{\star}(\mathcal{I})=\bm{\beta}^{\ast}, with 𝜷argmin𝜷𝒙𝑸(:,)𝜷22.\bm{\beta}^{\ast}\triangleq\arg\min_{\bm{\beta}}||\bm{x}-\bm{Q}(:,\mathcal{I})\bm{\beta}||^{2}_{2}. This estimation procedure corresponds to the classical Oracle estimator [31] when =𝒔\mathcal{I}=\mathcal{I}_{\bm{s}^{\star}}. In general, the classical sparse recovery algorithms are highly sensitive to noise 𝒆\bm{e}. Even when =𝒔\mathcal{I}=\mathcal{I}_{\bm{s}^{\star}}, the obtained estimate 𝒙^0\hat{\bm{x}}_{0} does not necessarily equal 𝒙0\bm{x}_{0} unless 𝒆=𝟎\bm{e}=\bm{0}.
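As a simple illustration, the estimate (35) can be computed in MATLAB as in the sketch below, where s_ast denotes the primary estimate and the function name and variable names are ours:

function x0_hat = oracle_estimate(x, Q, s_ast, kappa)   % a sketch of the estimator in (35)
    [~, order] = sort(abs(s_ast), 'descend');
    I = order(1:kappa);                   % indices of the kappa largest entries in magnitude
    beta_star = Q(:, I) \ x;              % least-squares fit on the selected columns
    s_hat = zeros(size(Q, 2), 1);
    s_hat(I) = beta_star;                 % kappa-sparse estimate of s
    x0_hat = Q * s_hat;                   % estimate of the clean signal x_0
end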

In many scenarios, e is modeled as a low-rank signal [59, 60]:

𝒆=𝚪𝜶,\bm{e}=\bm{\Gamma}\bm{\alpha}^{\star}, (36)

where 𝚪N×Ne\bm{\Gamma}\in\mathbb{R}^{N\times N_{e}} with Ne<<NN_{e}<<N is a matrix satisfying 𝚪𝚪=𝑰Ne\bm{\Gamma}^{\top}\bm{\Gamma}=\bm{I}_{N_{e}}, and 𝜶\bm{\alpha}^{\star} is an unknown vector. In multiple measurement vector models, (34) transforms into a low-rank and sparse decomposition problem, a framework that has been extensively studied in the literature [61, 62, 63, 64, 65]. Similar to the Robust PCA (RPCA)-based approach introduced in [61], most of these works focus on the special case where 𝑸=𝑰N\bm{Q}=\bm{I}_{N}, investigating various methods for separating the low-rank and sparse components. In this paper, we address the problem of recovering the clean signal 𝒙0\bm{x}_{0} from 𝒙\bm{x} as given in (34), with 𝒆\bm{e} modeled by (36). We consider the scenario where both 𝑸\bm{Q} and 𝚪\bm{\Gamma} are known, with 𝑸\bm{Q} having dimensions N×LN\times L and assuming Ne<<NLN_{e}<<N\leq L.

Note 𝒙𝒆=𝑸𝒔\bm{x}-\bm{e}=\bm{Q}\bm{s}. Under our proposed characterization, we have 𝒔=𝑻0(𝒙𝒆)+𝑾𝒛\bm{s}=\bm{T}_{0}(\bm{x}-\bm{e})+\bm{W}\bm{z}, and hence

𝒔=𝑽1𝑻0𝒙+[𝑾𝑽1𝑻0𝚪][𝒛𝜶]𝒔0+𝑾c𝒛c.\bm{s}=\bm{V}_{1}\bm{T}_{0}\bm{x}+\left[\begin{array}[]{cc}\bm{W}&-\bm{V}_{1}\bm{T}_{0}\bm{\Gamma}\end{array}\right]\left[\begin{array}[]{c}\bm{z}\\ \bm{\alpha}\end{array}\right]\triangleq\bm{s}_{0}+\bm{W}_{c}\bm{z}_{c}.

Based on this formulation, we propose the following robust approach to sparse recovery

𝒛c\displaystyle\bm{z}^{\star}_{c} \displaystyle\triangleq argmin𝒛c𝒔0+𝑾c𝒛c0𝒔=𝒔0+𝑾c𝒛c.\displaystyle\arg\min_{\bm{z}_{c}}\|\bm{s}_{0}+\bm{W}_{c}\bm{z}_{c}\|_{0}\quad\mapsto~{}~{}~{}~{}\bm{s}^{\star}=\bm{s}_{0}+\bm{W}_{c}\bm{z}^{\star}_{c}. (37)

Let 𝑾cL×Lc\bm{W}_{c}\in\mathbb{R}^{L\times L_{c}}. It was shown in [58] that (37) has a unique κ\kappa-solution 𝒔\bm{s}^{\star} if and only if all the (L2κ)×Lc(L-2\kappa)\times L_{c} submatrices of 𝑾c\bm{W}_{c} are of rank LcL_{c}, which holds when 𝑾c=𝑾\bm{W}_{c}=\bm{W}. Since, in general, Lc=LN+NeL_{c}=L-N+N_{e}, ensuring this condition requires that L2κLcL-2\kappa\geq L_{c}, which simplifies to 2κ+NeN.2\kappa+N_{e}\leq N.

As observed, (37) has the same form as (10). Consequently, it can be addressed directly with BP_alt or the just-derived IRLS_LQ-alt. More interestingly, with (W, z) replaced by (W_c, z_c), our proposed greedy algorithms Alg_GL1, Alg_GLQ, Alg^F_GL1, and Alg^F_GLQ can also be applied to solve (37).
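A schematic MATLAB construction of the robust model in (37) is sketched below, assuming that (T_0, V_1, W) from the characterization of Section 3 and the known interference basis Γ are available; the subsequent solver call is only indicative:

s0 = V1 * (T0 * x);                       % the x-driven term of the characterization
Wc = [W, -V1 * (T0 * Gamma)];             % augmented matrix W_c in (37)
% any of the proposed solvers can then be applied, for instance
% [s_star, zc_star] = irls_lq_alt(s0, Wc, 0.5);   % IRLS_LQ-alt with q = 1/2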

Remark 4.3.

It is important to highlight the following key points:

  • The results obtained above indicate that the true support 𝒔\mathcal{I}_{\bm{s}^{\star}} can be recovered with high probability using our proposed algorithms, even in the presence of disturbance 𝒆\bm{e}. This demonstrates a significant advantage of our proposed sparse model 𝒔=𝒔0(𝒙)+𝑾𝒛\bm{s}=\bm{s}_{0}(\bm{x})+\bm{W}\bm{z} over the classical formulation 𝒙=𝑸𝒔\bm{x}=\bm{Q}\bm{s};

  • Let s^∗ be the primary estimate of s^⋆ obtained using one of our proposed algorithms, and suppose that I, the set of indices corresponding to the κ largest entries of s^∗ in magnitude, equals I_{s^⋆}. In this case, the estimate x̂_0 can be computed using the classical Oracle estimator (35). Alternatively, denoting by I^c the complement of I, it follows from s^⋆ = s_0 + W_c z_c^⋆ and s^⋆(I^c) = 0 that 0 = s_0(I^c) + W_c(I^c,:) z_c^⋆. Thus, we have

    z_c^⋆ = -((W_c^⋆)^T W_c^⋆)^{-1} (W_c^⋆)^T s_0(I^c) = [z^⋆; α^⋆],

    where W_c^⋆ ≜ W_c(I^c,:). The corresponding estimate of x_0 is given by x̂_0 = x - Γα^⋆ = x_0 (see the sketch following this remark). This result implies that the clean signal x_0 can be exactly recovered within the framework based on our proposed characterization, provided that I = I_{s^⋆};

  • Note 𝒙=𝑸𝒔+𝚪𝜶=[𝑸𝚪][𝒔𝜶]𝑸~𝒔~\bm{x}=\bm{Q}\bm{s}+\bm{\Gamma}\bm{\alpha}=\left[\begin{array}[]{cc}\bm{Q}&\bm{\Gamma}\end{array}\right]\left[\begin{array}[]{c}\bm{s}\\ \bm{\alpha}\end{array}\right]\triangleq\tilde{\bm{Q}}\tilde{\bm{s}}. The sparsity-based signal separation problem can then be formulated as the classical form:

    𝒔~argmin𝒔~𝑫0𝒔~0s.t.𝒙=𝑸~𝒔~,\tilde{\bm{s}}^{\star}\triangleq\arg\min_{\tilde{\bm{s}}}||\bm{D}_{0}\tilde{\bm{s}}||_{0}~{}~{}~{}\text{s.t.}~{}~{}\bm{x}=\tilde{\bm{Q}}\tilde{\bm{s}},

    where D_0 ≜ [I_L 0; 0 0]. However, solving this problem using classical algorithms such as OMP_C and IRLS_C is challenging due to the potential numerical ill-conditioning of Q̃. In particular, IRLS_C cannot be applied directly because D_0 is singular. Furthermore, even if the true support I_{s^⋆} is correctly identified by a classical algorithm, the clean signal x_0 cannot be recovered using the classical Oracle estimator, as previously discussed.
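The exact-recovery step described in the second bullet of Remark 4.3 can be sketched in MATLAB as follows, assuming the estimated support I equals the true support and that s0, Wc, Gamma, x, L, and Ne are as defined above:

Ic = setdiff(1:L, I);                     % complement of the estimated support
Wcs = Wc(Ic, :);                          % rows of W_c on the complement
zc_star = -(Wcs \ s0(Ic));                % solves 0 = s0(Ic) + Wc(Ic,:)*zc
alpha_star = zc_star(end - Ne + 1:end);   % the last Ne entries form alpha_star
x0_hat = x - Gamma * alpha_star;          % exact recovery of the clean signal x_0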

5 Numerical examples and simulations

In this section, we present a number of numerical examples and simulations in MATLAB to demonstrate the performance of five classical algorithms and five proposed algorithms:

  • OMPC\text{OMP}_{C} -the classical orthogonal matching pursuit (OMP) algorithm [44, 45];

  • CoSaMPC\text{CoSaMP}_{C} - the classical compressive sampling matching pursuit (CoSaMP) algorithm [48];

  • BPC\text{BP}_{C} - the classical basis pursuit (BP) algorithm [32, 31];

  • IRLSC\text{IRLS}_{C} - the iteratively re-weighted least squares (IRLS) algorithm proposed in [39] with q=1/2q=1/2;

  • FISTA - the fast iterative shrinkage-thresholding (FISTA) algorithm proposed in [42];

  • AlgGL2\text{Alg}_{GL2} - the proposed greedy algorithm using l2l_{2};

  • AlgGL1\text{Alg}_{GL1} - the proposed greedy algorithm with atoms selected one-by-one using BPalt\text{BP}_{alt};

  • AlgGL1F\text{Alg}^{F}_{GL1} - the proposed fast greedy algorithm with κ\kappa atoms selected at each iteration using BPalt\text{BP}_{alt};

  • AlgGLQ\text{Alg}_{GLQ} - the proposed greedy algorithm with atoms selected one-by-one using IRLSLQalt\text{IRLS}_{LQ-alt};

  • AlgGLQF\text{Alg}^{F}_{GLQ} - the proposed fast greedy algorithm with κ\kappa atoms selected at each iteration using IRLSLQalt\text{IRLS}_{LQ-alt}.

5.1 Data setting and measures

The synthetic data used in this section are generated as follows. For a given setting (κ, N, L), we create an N × L matrix using randn, and the system matrix Q is obtained by normalizing its columns. A sequence of J κ-sparse samples {s_j} is then generated: the positions of the nonzero entries are selected randomly using randperm, and the entries themselves are drawn using randn and scaled so that ||Q s_j||_2 = 1 for all j.
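For reproducibility, a minimal MATLAB sketch of this data-generation procedure is given below; the variable names are illustrative:

Qr = randn(N, L);
Q  = Qr ./ sqrt(sum(Qr.^2, 1));           % column-normalized system matrix
S  = zeros(L, J);
for j = 1:J
    sj = zeros(L, 1);
    sj(randperm(L, kappa)) = randn(kappa, 1);   % kappa-sparse coefficients
    S(:, j) = sj / norm(Q * sj);                % scale so that ||Q*s_j||_2 = 1
end
X = Q * S;                                % the J measurement vectors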

To evaluate the performance of a sparse recovery algorithm, we adopt the rate of successful sparse recovery ϱok\varrho_{ok} and the signal-to-noise ratio (SNR), both defined at the end of Section 2. Furthermore, the wall-clock time in seconds, denoted as TcT_{c}, represents the total time taken by an algorithm to process the JJ samples.

5.2 Sparse recovery accuracy and computational efficiency

Table 2: Statistics of the rate ϱok\varrho_{ok} and TcT_{c} of the ten algorithms for different settings (κ,N,L)(\kappa,N,L) with J=100J=100.
Algorithms, left to right: OMP_C, CoSaMP_C, BP_C, IRLS_C, FISTA, Alg_GL2, Alg_GL1, Alg^F_GL1, Alg_GLQ, Alg^F_GLQ.

(κ, N, L) = (10, 64, 128):
  ϱ_ok: 99%, 100%, 100%, 100%, 99%, 100%, 100%, 100%, 100%, 100%
  T_c: 3.05e-2, 2.09e-2, 8.43e-1, 3.20e-1, 4.04e-1, 2.22e-2, 2.00e+0, 2.00e+0, 2.26e-1, 2.06e-1
(κ, N, L) = (30, 64, 128):
  ϱ_ok: 6%, 0%, 11%, 92%, 3%, 53%, 94%, 78%, 99%, 94%
  T_c: 2.12e-1, 9.87e+0, 1.12e+0, 1.05e+0, 2.75e+0, 2.27e-1, 3.02e+1, 7.34e+0, 6.31e+0, 1.87e+0
(κ, N, L) = (35, 128, 256):
  ϱ_ok: 85%, 100%, 100%, 100%, 62%, 100%, 100%, 100%, 100%, 100%
  T_c: 3.88e-1, 1.76e-1, 4.71e+0, 2.84e+0, 1.91e+1, 5.40e-1, 1.47e+1, 1.47e+1, 2.32e+0, 2.29e+0
(κ, N, L) = (58, 128, 256):
  ϱ_ok: 8%, 0%, 4%, 100%, 0%, 56%, 100%, 99%, 100%, 100%
  T_c: 1.62e+0, 8.14e+1, 5.99e+0, 5.24e+0, 2.07e+1, 2.38e+0, 3.02e+2, 3.58e+1, 3.92e+0, 3.81e+0
(κ, N, L) = (60, 256, 512):
  ϱ_ok: 90%, 100%, 100%, 100%, 35%, 100%, 100%, 100%, 100%, 100%
  T_c: 2.68e+0, 7.78e-1, 3.13e+1, 9.07e+0, 5.55e+1, 3.69e+0, 1.18e+2, 1.18e+2, 6.93e+0, 6.77e+0
(κ, N, L) = (100, 256, 512):
  ϱ_ok: 16%, 0%, 53%, 100%, 0%, 90%, 100%, 100%, 100%, 100%
  T_c: 9.05e+0, 2.82e+2, 4.63e+1, 1.44e+1, 5.90e+1, 1.09e+1, 1.03e+3, 2.05e+2, 1.04e+1, 1.04e+1

Table 2 shows the performance of the ten algorithms for different settings (κ,N,L)(\kappa,N,L) with J=100J=100 samples. The ten algorithms are further evaluated under the setting N=64,L=128N=64,L=128 and J=1000J=1000, with κ\kappa varying from 4 to N/2=32N/2=32 in steps of 4. Fig. 1 shows the corresponding statistics of ϱok\varrho_{ok} and TcT_{c}. We note from the simulations above that 1) among the five classical algorithms, IRLSC\text{IRLS}_{C} achieves the best sparse recovery performance and is significantly faster than BPC\text{BP}_{C}; 2) the proposed AlgGL2\text{Alg}_{GL2} achieves the same sparse recovery accuracy as IRLSC\text{IRLS}_{C} when κ/N\kappa/N is small, but with much higher computational efficiency; 3) the other four proposed algorithms, i.e., AlgGL1\text{Alg}_{GL1}, AlgGLQ\text{Alg}_{GLQ}, AlgGL1F\text{Alg}^{F}_{GL1}, and AlgGLQF\text{Alg}^{F}_{GLQ}, demonstrate a similar performance to IRLSC\text{IRLS}_{C}. Notably, AlgGL1\text{Alg}_{GL1}, AlgGLQ\text{Alg}_{GLQ} and AlgGLQF\text{Alg}^{F}_{GLQ} even outperform the state-of-the-art algorithm IRLSC\text{IRLS}_{C} in terms of ϱok\varrho_{ok}.

Figure 1: The rate ϱok\varrho_{ok} of successful recovery and the wall-clock time TcT_{c} with respect to different κ\kappa for each of the ten algorithms for N=64,L=128,J=1000N=64,L=128,J=1000.

As observed in Table 2, the classical algorithms BP_C, CoSaMP_C, and FISTA, along with the proposed Alg_GL1 and Alg^F_GL1, become very time-consuming as the dimensions (N, L) of the system matrix Q grow large. Therefore, in the sequel, we focus on the remaining five algorithms for large-scale systems. Fig. 2 presents simulations similar to those in Fig. 1 for the five algorithms OMP_C, IRLS_C, Alg_GL2, Alg_GLQ, and Alg^F_GLQ, under the setting N=256, L=512, and J=100, with κ ranging from 64 to 144 in steps of 8. As observed, the proposed Alg_GLQ and Alg^F_GLQ perfectly reconstruct all J=100 samples, matching the performance of IRLS_C, when κ/N is small (e.g., κ/N ≤ 0.5), but with much higher computational efficiency. As κ/N increases, the proposed Alg_GLQ and Alg^F_GLQ demonstrate superior sparse recovery performance, albeit at higher computational cost.

Figure 2: The rate ϱok\varrho_{ok} of successful recovery and the wall-clock time TcT_{c} with respect to different κ\kappa for each of the five algorithms for N=256,L=512,J=100N=256,L=512,J=100.

To investigate the impact of sparsity level κ\kappa and signal dimension NN on the performance of the five algorithms: OMPC\text{OMP}_{C}, IRLSC\text{IRLS}_{C}, AlgGL2\text{Alg}_{GL2}, AlgGLQ\text{Alg}_{GLQ} and AlgGLQF\text{Alg}^{F}_{GLQ}, we repeat the above experiments with J=100J=100 samples and generate phase transition plots, as shown in Fig. 3. These experiments span signal dimensions 128N256128\leq N\leq 256 and sparsity levels 20κ9020\leq\kappa\leq 90. The results demonstrate that the proposed AlgGLQ\text{Alg}_{GLQ} algorithm significantly outperforms the other four algorithms in terms of successful recovery rates. In addition, its accelerated version, AlgGLQF\text{Alg}^{F}_{GLQ}, achieves a slightly higher successful recovery rate compared to IRLSC\text{IRLS}_{C}, as shown in Fig. 3 (f), which highlights the differences in successful recovery rates between AlgGLQF\text{Alg}^{F}_{GLQ} and IRLSC\text{IRLS}_{C}.

Figure 3: The rate of successful recovery ϱok\varrho_{ok} with respect to a series of sparsity level κ\kappa and signal dimension NN. These phase transition plots are generated with J=100J=100 samples and with the setting (κ,N,L=2N)(\kappa,N,L=2N).

5.2.1 Effects of system matrix conditioning

From the above, we note that the classical IRLS_C and our proposed Alg_GLQ and Alg^F_GLQ yield almost identical performance for system matrices Q generated with randn(N, L) and subsequently column-normalized; such matrices are well-conditioned. However, as seen from (7), IRLS_C may encounter numerical issues when Q is ill-conditioned.

Let C_Q ≜ σ_1/σ_N denote the condition number of Q, where σ_1 and σ_N are the largest and smallest singular values of Q ∈ R^{N×L}. For a given pair (N, L), we generate a system matrix Q with a specified C_Q in the same way as in [66]. We consider the following settings: (i) κ=25, N=256, L=512, and J=100; and (ii) κ=60, N=512, L=1024, and J=100. Figs. 4 and 5 depict the statistics of ϱ_ok and T_c under each setting, respectively, where the system matrices Q ∈ R^{N×L} are generated with C_Q varying over {5, 10, 50, 100, 500, 1000, 5000, 10000, 50000, 100000}.
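One common way to generate such a matrix, which we sketch below in MATLAB for illustration, is to prescribe geometrically spaced singular values; the exact construction used in [66] may differ in detail:

[U, ~] = qr(randn(N));                    % random N x N orthogonal factor
[V, ~] = qr(randn(L, N), 0);              % L x N matrix with orthonormal columns
sv = (1 / C_Q).^linspace(0, 1, N);        % singular values from 1 down to 1/C_Q
Q = U * diag(sv) * V';                    % cond(Q) = C_Q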

Figure 4: The rate ϱok\varrho_{ok} of successful recovery and the wall-clock time TcT_{c} with respect to 𝒞𝑸\mathcal{C}_{\bm{Q}} for each of the five algorithms for κ=25,N=256,L=512\kappa=25,N=256,L=512 and J=100J=100.
Figure 5: The rate ϱok\varrho_{ok} of successful recovery and the wall-clock time TcT_{c} with respect to 𝒞𝑸\mathcal{C}_{\bm{Q}} for each of the five algorithms for κ=60,N=512,L=1024\kappa=60,N=512,L=1024 and J=100J=100.

It is observed that the ϱ_ok of the classical OMP_C is very sensitive to C_Q, which is not the case for our proposed Alg_GL2. Furthermore, 1) the T_c of IRLS_C increases dramatically once C_Q exceeds 10^3, reaching values hundreds of times greater than those of Alg_GLQ and Alg^F_GLQ; indeed, when C_Q exceeds 10^6, IRLS_C encounters numerical issues and fails to work properly; 2) the condition number C_Q has almost no effect on the performance of our three proposed algorithms in terms of ϱ_ok and T_c. This is further confirmed under the setting (κ, N, L) = (500, 2048, 4096) and J=10, where C_Q = 10^m with m = 3, 4, ..., 10. For all three proposed algorithms, ϱ_ok is constantly equal to one, while the wall-clock time T_c is presented in Fig. 6.

Figure 6: The wall-clock time TcT_{c} of the three algorithms with respect to 𝒞𝑸\mathcal{C}_{\bm{Q}} for κ=500,N=2048,L=4096\kappa=500,N=2048,L=4096 and J=10J=10.

In summary,

  • for well-conditioned systems, our proposed algorithms Alg_GL1, Alg^F_GL1, Alg_GLQ, and Alg^F_GLQ achieve the same or even better sparse recovery performance than the state-of-the-art IRLS_C. Additionally, the proposed Alg_GL2 significantly outperforms the classical OMP_C and BP_C in terms of sparse recovery, with computational efficiency comparable to OMP_C;

  • for ill-conditioned systems, the performance of all classical algorithms degrades dramatically in terms of either sparse recovery accuracy or computational efficiency, or both. In contrast, the three proposed algorithms AlgGL2\text{Alg}_{GL2}, AlgGLQ\text{Alg}_{GLQ} and AlgGLQF\text{Alg}^{F}_{GLQ} can work properly even for systems with 𝒞𝑸\mathcal{C}_{\bm{Q}} up to 101010^{10} and large-scale dimensions.

5.3 Robustness against low-rank disturbance

Next, we evaluate the performance of our proposed algorithms under the noisy signal model introduced in Section 4.3, with a particular focus on comparing them to IRLSC\text{IRLS}_{C}.

5.3.1 Synthetic data

We repeat the previous experiments with parameters N=128, L=256, J=100, and κ=30. We set N_e=20 and construct Γ using the first N_e left singular vectors of an N × N Gaussian random matrix with entries drawn from N(0,1). The coefficient α^⋆ in (36) is then generated as α^⋆ = σα_0, where α_0 is a normalized Gaussian random vector with entries following N(0,1). Here, our primary focus is on comparing the performance of our proposed Alg_GLQ and Alg^F_GLQ algorithms against IRLS_C. As illustrated in Fig. 7, our proposed methods demonstrate significant advantages over the IRLS_C algorithm in terms of both successful recovery rate and computational efficiency. Notably, as the noise level σ increases to 0.05, IRLS_C completely fails to reconstruct the J=100 signals, whereas the proposed Alg_GLQ and Alg^F_GLQ maintain perfect recovery across all J=100 signals.
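A MATLAB sketch of this disturbance model is given below, where s denotes one κ-sparse sample and sigma the noise level; the variable names are ours:

[Ug, ~, ~] = svd(randn(N));
Gamma = Ug(:, 1:Ne);                      % orthonormal basis, Gamma'*Gamma = I_Ne
alpha0 = randn(Ne, 1);
alpha0 = alpha0 / norm(alpha0);           % normalized Gaussian coefficient vector
x = Q * s + Gamma * (sigma * alpha0);     % noisy measurement as in (34) and (36)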

Figure 7: The rate ϱok\varrho_{ok} of successful recovery and the wall-clock time TcT_{c} with respect to different noise variance σ\sigma for each of the three algorithms for N=128,L=256,J=100N=128,L=256,J=100.

5.3.2 Image transmission

We consider an image transmission system affected by a low-rank interference signal 𝒆\bm{e} modeled as in (36), with 𝚪64×5\bm{\Gamma}\in\mathbb{R}^{64\times 5} assumed to be known. The clean images intended for transmission are shown in the first column of Fig. 8. Prior to transmission, each clean image is divided into patches of size 8×88\times 8, resulting in an N×JN\times J data matrix with N=64N=64 and JJ corresponding to the number of patches. After transmission, the received noisy images are presented in the second column of Fig. 8.

Given the noisy images, we reconstruct the clean images using six different algorithms: OMPC\text{OMP}_{C}, IRLSC\text{IRLS}_{C}, AlgGL1\text{Alg}_{GL1}, AlgGL1F\text{Alg}^{F}_{GL1}, AlgGLQ\text{Alg}_{GLQ}, and AlgGLQF\text{Alg}^{F}_{GLQ}. In the reconstruction process, we employ an overcomplete discrete cosine transform (DCT) dictionary 𝑸\bm{Q} of size 64×14464\times 144, i.e., N=64,L=144N=64,L=144. The sparsity level κ\kappa is set differently for each image, with specific values provided in Table 3. The corresponding image sizes are also listed in Table 3. The reconstructed images obtained from the six algorithms are displayed in Fig. 8. Table 3 further presents the peak signal-to-noise ratio (PSNR, in dB) for each method. The results demonstrate that our four proposed algorithms achieve significantly higher PSNR values compared to OMPC\text{OMP}_{C} and IRLSC\text{IRLS}_{C}, highlighting their superior performance in reconstructing the clean images.
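For reference, one common MATLAB construction of a 64 × 144 overcomplete DCT dictionary for 8 × 8 patches is sketched below; the dictionary actually used in our experiments may differ in detail:

n = 8; m = 12;                            % patch side length; m^2 = 144 atoms
D1 = zeros(n, m);
for k = 0:m-1
    atom = cos((0:n-1)' * k * pi / m);    % 1-D DCT-type atom
    if k > 0, atom = atom - mean(atom); end
    D1(:, k+1) = atom / norm(atom);       % normalize each atom
end
Q = kron(D1, D1);                         % 64 x 144 separable 2-D dictionary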

(Rows: MRI1 and MRI2. Columns: Clean, Noisy, and the reconstructions by OMP_C, IRLS_C, Alg_GL1, Alg^F_GL1, Alg_GLQ, and Alg^F_GLQ.)
Figure 8: Image reconstruction by the six algorithms: OMPC\text{OMP}_{C}, IRLSC\text{IRLS}_{C}, AlgGL1\text{Alg}_{GL1}, AlgGL1F\text{Alg}^{F}_{GL1}, AlgGLQ\text{Alg}_{GLQ}, and AlgGLQF\text{Alg}^{F}_{GLQ}.
Table 3: Statistics of PSNR (in dB) for the six algorithms in image reconstruction.
Size κ\kappa Noisy OMPC\text{OMP}_{C} IRLSC\text{IRLS}_{C} AlgGL1\text{Alg}_{GL1} AlgGL1F\text{Alg}^{F}_{GL1} AlgGLQ\text{Alg}_{GLQ} AlgGLQF\text{Alg}^{F}_{GLQ}
MRI1 168×136168\times 136 20 24.42 25.03 25.13 50.45 49.72 49.94 49.90
MRI2 376×336376\times 336 12 20.47 22.05 22.22 55.22 55.01 55.01 54.83

6 Concluding remarks

In this paper, we have investigated the sparse signal recovery problem. Our study has led to several key findings and contributions to the field of sparse signal recovery. Based on the proposed system model, we have developed a novel approach leading to a class of greedy algorithms, including

  • three innovative greedy algorithms, AlgGL2\text{Alg}_{GL2}, AlgGL1\text{Alg}_{GL1} and AlgGLQ\text{Alg}_{GLQ}, which significantly improve the accuracy of classical methods such as OMP and BP;

  • two fast algorithms, AlgGL1F\text{Alg}^{F}_{GL1} and AlgGLQF\text{Alg}^{F}_{GLQ}, designed to accelerate AlgGL1\text{Alg}_{GL1} and AlgGLQ\text{Alg}_{GLQ}, achieving performance that even surpasses state-of-the-art sparse recovery methods.

Furthermore, the five proposed algorithms demonstrate excellent robustness against instability caused by the poor numerical properties of the system matrix 𝑸\bm{Q} and measurement interferences – issues that significantly affect classical sparse recovery algorithms.

Future work will focus on further enhancing the algorithms’ efficiency and exploring their applications in more complex high-dimensional data analysis tasks, such as hyperspectral imaging, seismic data processing, and wireless communication. In conclusion, the proposed greedy algorithms represent a significant advancement in sparse recovery techniques. By combining theoretical rigor with practical efficiency, our work paves the way for future innovations and applications in the field of signal processing and beyond.

References

  • [1] Vladimir Shikhman and David Müller. Sparse recovery. Mathematical Foundations of Big Data Analytics, pages 131–148, 2021.
  • [2] Stanley Osher. Sparse recovery for scientific data. Technical report, Univ. of California, Los Angeles, CA (United States), 2019.
  • [3] Ljubiša Stanković, Ervin Sejdić, Srdjan Stanković, Miloš Daković, and Irena Orović. A tutorial on sparse signal reconstruction and its applications in signal processing. Circuits, Systems, and Signal Processing, 38:1206–1263, 2019.
  • [4] Shuang Li, Michael B Wakin, and Gongguo Tang. Atomic norm denoising for complex exponentials with unknown waveform modulations. IEEE Transactions on Information Theory, 66(6):3893–3913, 2019.
  • [5] Mike E Davies and Yonina C Eldar. Rank awareness in joint sparse recovery. IEEE Transactions on Information Theory, 58(2):1135–1146, 2012.
  • [6] Natalie Durgin, Rachel Grotheer, Chenxi Huang, Shuang Li, Anna Ma, Deanna Needell, and Jing Qin. Jointly sparse signal recovery with prior info. In 2019 53rd Asilomar Conference on Signals, Systems, and Computers, pages 645–649. IEEE, 2019.
  • [7] Yuzhe Jin, Young-Han Kim, and Bhaskar D Rao. Limits on support recovery of sparse signals via multiple-access communication techniques. IEEE Transactions on Information Theory, 57(12):7877–7892, 2011.
  • [8] Xing Zhang, Haiyang Zhang, and Yonina C. Eldar. Near-field sparse channel representation and estimation in 6G wireless communications. IEEE Transactions on Communications, 72(1):450–464, 2023.
  • [9] Shuang Li, Daniel Gaydos, Payam Nayeri, and Michael B Wakin. Adaptive interference cancellation using atomic norm minimization and denoising. IEEE Antennas and Wireless Propagation Letters, 19(12):2349–2353, 2020.
  • [10] Yong Huang, James L Beck, Stephen Wu, and Hui Li. Bayesian compressive sensing for approximately sparse signals and application to structural health monitoring signals for data loss recovery. Probabilistic Engineering Mechanics, 46:62–79, 2016.
  • [11] Shuang Li, Dehui Yang, Gongguo Tang, and Michael B Wakin. Atomic norm minimization for modal analysis from random and compressed samples. IEEE Transactions on Signal Processing, 66(7):1817–1831, 2018.
  • [12] Zhiyi Tang, Yuequan Bao, and Hui Li. Group sparsity-aware convolutional neural network for continuous missing data recovery of structural health monitoring. Structural Health Monitoring, 20(4):1738–1759, 2021.
  • [13] Mohammad Golbabaee and Pierre Vandergheynst. Hyperspectral image compressed sensing via low-rank and joint-sparse matrix recovery. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2741–2744. IEEE, 2012.
  • [14] Boaz Arad and Ohad Ben-Shahar. Sparse recovery of hyperspectral signal from natural RGB images. In Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part VII 14, pages 19–34. Springer, 2016.
  • [15] Natalie Durgin, Rachel Grotheer, Chenxi Huang, Shuang Li, Anna Ma, Deanna Needell, and Jing Qin. Fast hyperspectral diffuse optical imaging method with joint sparsity. In 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pages 4758–4761. IEEE, 2019.
  • [16] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015.
  • [17] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT press, 2016.
  • [18] David L Donoho. Compressed sensing. IEEE Transactions on Information Theory, 52(4):1289–1306, 2006.
  • [19] Emmanuel J Candès and Michael B Wakin. An introduction to compressive sampling. IEEE Signal Processing Magazine, 25(2):21–30, 2008.
  • [20] Marco F Duarte and Yonina C Eldar. Structured compressed sensing: From theory to applications. IEEE Transactions on Signal Processing, 59(9):4053–4085, 2011.
  • [21] Zhihui Zhu, Gang Li, Jiajun Ding, Qiuwei Li, and Xiongxiong He. On collaborative compressive sensing systems: The framework, design, and algorithm. SIAM Journal on Imaging Sciences, 11(2):1717–1758, 2018.
  • [22] Roman Vershynin, Y. C. Eldar, and Gitta Kutyniok. Introduction to the non-asymptotic analysis of random matrices. In Y. C. Eldar and G. Kutyniok, editors, Compressed Sensing: Theory and Applications. Cambridge University Press, UK, 2012.
  • [23] Simon Foucart and Holger Rauhut. A Mathematical Introduction to Compressive Sensing. Birkhäuser New York, NY, 2013.
  • [24] Jie Gui, Zhenan Sun, Shuiwang Ji, Dacheng Tao, and Tieniu Tan. Feature selection based on structured sparsity: A comprehensive study. IEEE Transactions on Neural Networks and Learning Systems, 28(7):1490–1507, 2016.
  • [25] Shiyun Xu, Zhiqi Bu, Pratik Chaudhari, and Ian J Barnett. Sparse neural additive model: Interpretable deep learning with feature selection via group sparsity. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 343–359. Springer, 2023.
  • [26] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25, 2012.
  • [27] Baoyuan Liu, Min Wang, Hassan Foroosh, Marshall Tappen, and Marianna Pensky. Sparse convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 806–814, 2015.
  • [28] Jerome Friedman, Trevor Hastie, and Rob Tibshirani. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1):1, 2010.
  • [29] Zhonghua Liu, Zhihui Lai, Weihua Ou, Kaibing Zhang, and Ruijuan Zheng. Structured optimal graph based sparse feature extraction for semi-supervised learning. Signal Processing, 170:107456, 2020.
  • [30] David L Donoho and Michael Elad. Optimally sparse representation in general (nonorthogonal) dictionaries via 1\ell_{1} minimization. Proceedings of the National Academy of Sciences, 100(5):2197–2202, 2003.
  • [31] Emmanuel J Candès and Terence Tao. Decoding by linear programming. IEEE Transactions on Information Theory, 51(12):4203–4215, 2005.
  • [32] Scott Shaobing Chen, David L Donoho, and Michael A Saunders. Atomic decomposition by basis pursuit. SIAM Review, 43(1):129–159, 2001.
  • [33] Irina F Gorodnitsky and Bhaskar D Rao. Sparse signal reconstruction from limited data using FOCUSS: A re-weighted minimum norm algorithm. IEEE Transactions on Signal Processing, 45(3):600–616, 1997.
  • [34] Bhaskar D Rao and Kenneth Kreutz-Delgado. An affine scaling methodology for best basis selection. IEEE Transactions on Signal Processing, 47(1):187–200, 1999.
  • [35] Ingrid Daubechies, Ronald DeVore, Massimo Fornasier, and C Sinan Güntürk. Iteratively reweighted least squares minimization for sparse recovery. Communications on Pure and Applied Mathematics: A Journal Issued by the Courant Institute of Mathematical Sciences, 63(1):1–38, 2010.
  • [36] Xiaolin Huang, Yipeng Liu, Lei Shi, Sabine Van Huffel, and Johan AK Suykens. Two-level 1\ell_{1} minimization for compressed sensing. Signal Processing, 108:459–475, 2015.
  • [37] Emmanuel J Candes, Michael B Wakin, and Stephen P Boyd. Enhancing sparsity by reweighted 1\ell_{1} minimization. Journal of Fourier Analysis and Applications, 14:877–905, 2008.
  • [38] Yilun Wang and Wotao Yin. Sparse signal reconstruction via iterative support detection. SIAM Journal on Imaging Sciences, 3(3):462–491, 2010.
  • [39] Rick Chartrand and Wotao Yin. Iteratively reweighted algorithms for compressive sensing. In 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 3869–3872. IEEE, 2008.
  • [40] Scott Shaobing Chen, David L Donoho, and Michael A Saunders. Atomic decomposition by basis pursuit. SIAM Journal on Scientific Computing, 20(1):33–61, 1998.
  • [41] Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B: Statistical Methodology, 58(1):267–288, 1996.
  • [42] Amir Beck and Marc Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1):183–202, 2009.
  • [43] Ali Mousavi and Richard G Baraniuk. Learning to invert: Signal recovery via deep convolutional networks. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2272–2276. IEEE, 2017.
  • [44] Stéphane G Mallat and Zhifeng Zhang. Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing, 41(12):3397–3415, 1993.
  • [45] Joel A Tropp. Greed is good: Algorithmic results for sparse approximation. IEEE Transactions on Information Theory, 50(10):2231–2242, 2004.
  • [46] Thomas Blumensath and Mike E Davies. Gradient pursuits. IEEE Transactions on Signal Processing, 56(6):2370–2382, 2008.
  • [47] David L Donoho, Yaakov Tsaig, Iddo Drori, and Jean-Luc Starck. Sparse solution of underdetermined systems of linear equations by stagewise orthogonal matching pursuit. IEEE Transactions on Information Theory, 58(2):1094–1121, 2012.
  • [48] Deanna Needell and Joel A Tropp. CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. Applied and Computational Harmonic Analysis, 26(3):301–321, 2009.
  • [49] Thomas Blumensath and Mike E Davies. Stagewise weak gradient pursuits. IEEE Transactions on Signal Processing, 57(11):4333–4346, 2009.
  • [50] Thomas Blumensath, Michael E Davies, and Gabriel Rilling. Greedy algorithms for compressed sensing. In Y. C. Eldar and G. Kutyniok, editors, Compressed Sensing: Theory and Applications. Cambridge University Press, UK, 2012.
  • [51] Thomas Strohmer and Robert W Heath Jr. Grassmannian frames with applications to coding and communication. Applied and Computational Harmonic Analysis, 14(3):257–275, 2003.
  • [52] Michael Elad. Optimized projections for compressed sensing. IEEE Transactions on Signal Processing, 55(12):5695–5702, 2007.
  • [53] Gang Li, Zhihui Zhu, Dehui Yang, Liping Chang, and Huang Bai. On projection matrix optimization for compressive sensing systems. IEEE Transactions on Signal Processing, 61(11):2887–2898, 2013.
  • [54] Wei Chen, Miguel RD Rodrigues, and Ian J Wassell. Projection design for statistical compressive sensing: A tight frame based approach. IEEE Transactions on Signal Processing, 61(8):2016–2029, 2013.
  • [55] Joel A Tropp, Inderjit S Dhillon, Robert W Heath, and Thomas Strohmer. Designing structured tight frames via an alternating projection method. IEEE Transactions on Information Theory, 51(1):188–209, 2005.
  • [56] Yonina C Eldar and G David Forney. Optimal tight frames and quantum measurement. IEEE Transactions on Information Theory, 48(3):599–610, 2002.
  • [57] Evaggelia V Tsiligianni, Lisimachos P Kondi, and Aggelos K Katsaggelos. Construction of incoherent unit norm tight frames with application to compressed sensing. IEEE Transactions on Information Theory, 60(4):2319–2330, 2014.
  • [58] Gang Li, Xiao Li, and Wu Angela Li. Revisiting sparse error correction: model analysis and new algorithms. Available at SSRN 5069675, 2024.
  • [59] Minh Dao, Yuanming Suo, Sang Peter Chin, and Trac D Tran. Structured sparse representation with low-rank interference. In 2014 48th Asilomar Conference on Signals, Systems and Computers, pages 106–110. IEEE, 2014.
  • [60] Fok Hing Chi Tivive, Abdesselam Bouzerdoum, and Canicious Abeynayake. GPR target detection by joint sparse and low-rank matrix decomposition. IEEE Transactions on Geoscience and Remote Sensing, 57(5):2583–2595, 2018.
  • [61] Emmanuel J Candès, Xiaodong Li, Yi Ma, and John Wright. Robust principal component analysis? Journal of the ACM (JACM), 58(3):1–37, 2011.
  • [62] Tianyi Zhou and Dacheng Tao. GoDec: Randomized low-rank & sparse matrix decomposition in noisy case. In Proceedings of the 28th International Conference on Machine Learning, ICML 2011, 2011.
  • [63] Ricardo Otazo, Emmanuel Candes, and Daniel K Sodickson. Low-rank plus sparse matrix decomposition for accelerated dynamic MRI with separation of background and dynamic components. Magnetic Resonance in Medicine, 73(3):1125–1136, 2015.
  • [64] Sohail Bahmani and Justin Romberg. Near-optimal estimation of simultaneously sparse and low-rank matrices from nested linear measurements. Information and Inference: A Journal of the IMA, 5(3):331–351, 2016.
  • [65] John Wright, Arvind Ganesh, Kerui Min, and Yi Ma. Compressive principal component pursuit. Information and Inference: A Journal of the IMA, 2(1):32–68, 2013.
  • [66] Mark Borgerding, Philip Schniter, and Sundeep Rangan. AMP-inspired deep networks for sparse linear inverse problems. IEEE Transactions on Signal Processing, 65(16):4293–4308, 2017.