
On the Convergence of Orthogonal/Vector AMP: Long-Memory Message-Passing Strategy

(The author was in part supported by the Grant-in-Aid for Scientific Research (B) (JSPS KAKENHI Grant Number 21H01326), Japan.)

Keigo Takeuchi Dept. Electrical and Electronic Information Eng., Toyohashi University of Technology, Toyohashi 441-8580, Japan
Email: takeuchi@ee.tut.ac.jp
Abstract

This paper proves the convergence of Bayes-optimal orthogonal/vector approximate message-passing (AMP) to a fixed point in the large system limit. The proof is based on Bayes-optimal long-memory (LM) message-passing (MP) that is guaranteed to converge systematically. The dynamics of Bayes-optimal LM-MP is analyzed via an existing state evolution framework. The obtained state evolution recursions are proved to converge. The convergence of Bayes-optimal orthogonal/vector AMP is proved by confirming an exact reduction of the state evolution recursions to those for Bayes-optimal orthogonal/vector AMP.

I Introduction

Consider the problem of reconstructing an $N$-dimensional sparse signal vector $\boldsymbol{x}\in\mathbb{R}^{N}$ from compressed, noisy, and linear measurements $\boldsymbol{y}\in\mathbb{R}^{M}$ [1, 2] with $M\leq N$, given by

\boldsymbol{y}=\boldsymbol{A}\boldsymbol{x}+\boldsymbol{w}. (1)

In (1), $\boldsymbol{w}\sim\mathcal{N}(\boldsymbol{0},\sigma^{2}\boldsymbol{I}_{M})$ denotes an additive white Gaussian noise (AWGN) vector with variance $\sigma^{2}>0$. The matrix $\boldsymbol{A}\in\mathbb{R}^{M\times N}$ represents a known sensing matrix. The signal vector $\boldsymbol{x}$ has zero-mean independent and identically distributed (i.i.d.) elements with unit variance. The random variables $\boldsymbol{A}$, $\boldsymbol{x}$, and $\boldsymbol{w}$ are assumed to be mutually independent.

A promising approach to efficient reconstruction is message passing (MP), such as approximate MP (AMP) [3, 4] and orthogonal/vector AMP (OAMP/VAMP) [5, 6]. When the sensing matrix has zero-mean i.i.d. sub-Gaussian elements, AMP was proved to be Bayes-optimal in the large system limit [7, 8], where both $M$ and $N$ tend to infinity with the compression ratio $\delta=M/N\in(0,1]$ kept constant. However, AMP fails to converge for non-i.i.d. cases, such as the non-zero mean case [9] and the ill-conditioned case [10].

OAMP [5] or equivalently VAMP [6] is a powerful MP algorithm to solve this convergence issue for AMP. In this paper, these MP algorithms are referred to as OAMP. When the sensing matrix is right-orthogonally invariant, OAMP was proved to be Bayes-optimal in the large system limit [6, 11].

Strictly speaking, the Bayes-optimality of OAMP requires an implicit assumption under which state evolution recursions for OAMP converge to a fixed point after an infinite number of iterations [6, 11]. Thus, this assumption needs to be confirmed for individual problems [12, 13, 14]. The purpose of this paper is to prove this assumption for OAMP using the Bayes-optimal denoiser—called Bayes-optimal OAMP.

The proof is based on a Bayes-optimal long-memory (LM) MP algorithm that is guaranteed to converge systematically. LM-MP uses messages in all preceding iterations to update the current message while conventional MP utilizes messages only in the latest iteration. LM-MP was originally proposed via non-rigorous dynamical functional theory [15, 16] and formulated via rigorous state evolution [17]. A unified framework in [17] was used to propose convolutional AMP [18, 17], memory AMP [19], and VAMP with warm-started conjugate gradient (WS-CG) [20]. See [21, 22] for another state evolution approach.

A first step in the proof is a formulation of Bayes-optimal LM-OAMP, in which the message in the latest iteration is regarded as an additional measurement that depends on all preceding messages. Thus, use of the additional measurement never degrades the reconstruction performance as long as the statistical properties of the dependent messages are characterized completely and the current message is updated in a Bayes-optimal manner.

A second step is an application of the unified framework in [17] to state evolution analysis for LM-OAMP. It is sufficient to confirm that a general error model in [17] contains an error model for LM-OAMP. The obtained state evolution recursions represent asymptotic correlation structures for all messages in LM-OAMP. Furthermore, asymptotic Gaussianity for estimation errors [17] implies that the correlation structures provide full information on the asymptotic distributions of the estimation errors. As a result, it is possible to update the current message in a Bayes-optimal manner.

A third step is to prove that state evolution recursions for Bayes-optimal LM-OAMP converge to a fixed point under mild assumptions. While the convergence is intuitively expected from the formulation of Bayes-optimal LM-OAMP, a rigorous proof is non-trivial and based on a novel statistical interpretation for optimized LM damping in [19].

The last step is an exact reduction of state evolution recursions for Bayes-optimal LM-OAMP to those for conventional Bayes-optimal OAMP [5]. Thus, the convergence of Bayes-optimal LM-OAMP implies the convergence of conventional Bayes-optimal OAMP to a fixed point. As a by-product, the LM-MP proof strategy in this paper implies that conventional Bayes-optimal OAMP is the best in terms of convergence speed among all possible LM-MP algorithms included in the unified framework of [17].

The remainder of this paper is organized as follows: Section II reviews Bayes-optimal estimation based on dependent Gaussian measurements. The obtained results reveal a statistical interpretation for optimized LM damping [19], which is utilized to formulate LM-OAMP in Section III. Section IV presents state evolution analysis of LM-OAMP via a unified framework in [17]. Two-dimensional (2D) discrete systems—called state evolution recursions—are derived to describe the asymptotic dynamics of LM-OAMP. Furthermore, we prove the convergence and reduction of the state evolution recursions. See [23] for the details of the proof.

Finally, see a recent paper [24] for an application of the LM-MP strategy in this paper.

II Correlated AWGN Measurements

This section presents a background to define the Bayes-optimal denoiser in LM-OAMP. We review Bayesian estimation of a scalar signal $X\in\mathbb{R}$ from $t+1$ correlated AWGN measurements $\boldsymbol{Y}_{t}=(Y_{0},\ldots,Y_{t})\in\mathbb{R}^{1\times(t+1)}$, given by

\boldsymbol{Y}_{t}=X\boldsymbol{1}^{\mathrm{T}}+\boldsymbol{W}_{t}. (2)

In (2), $\boldsymbol{1}$ denotes a column vector whose elements are all one. The signal $X$ follows the same distribution as that of each element in the i.i.d. signal vector $\boldsymbol{x}$. The AWGN vector $\boldsymbol{W}_{t}\sim\mathcal{N}(\boldsymbol{0},\boldsymbol{\Sigma}_{t})$ is a zero-mean Gaussian row vector with covariance $\boldsymbol{\Sigma}_{t}$ and independent of $X$. The covariance matrix $\boldsymbol{\Sigma}_{t}$ is assumed to be positive definite.

This paper uses a two-step approach in computing the posterior mean estimator of $X$ given $\boldsymbol{Y}_{t}$. A first step is computation of a sufficient statistic $S_{t}\in\mathbb{R}$ for estimation of $X$ given $\boldsymbol{Y}_{t}$. The second step is evaluation of the posterior mean estimator of $X$ given the sufficient statistic $S_{t}$. This two-step approach is useful in proving Lemma 1 while it is equivalent to direct computation of the posterior mean estimator of $X$ given $\boldsymbol{Y}_{t}$, i.e. $\mathbb{E}[X|S_{t}]=\mathbb{E}[X|\boldsymbol{Y}_{t}]$.

As shown in [23, Section II], a sufficient statistic for estimation of $X$ is given by

S_{t}=\frac{\boldsymbol{Y}_{t}\boldsymbol{\Sigma}_{t}^{-1}\boldsymbol{1}}{\boldsymbol{1}^{\mathrm{T}}\boldsymbol{\Sigma}_{t}^{-1}\boldsymbol{1}}=X+\tilde{W}_{t},\quad \tilde{W}_{t}=\frac{\boldsymbol{W}_{t}\boldsymbol{\Sigma}_{t}^{-1}\boldsymbol{1}}{\boldsymbol{1}^{\mathrm{T}}\boldsymbol{\Sigma}_{t}^{-1}\boldsymbol{1}}, (3)

where $\{\tilde{W}_{\tau}\}$ are zero-mean Gaussian random variables with covariance

\mathbb{E}[\tilde{W}_{t^{\prime}}\tilde{W}_{t}]=\frac{\boldsymbol{1}^{\mathrm{T}}\boldsymbol{\Sigma}_{t^{\prime}}^{-1}\mathbb{E}[\boldsymbol{W}_{t^{\prime}}^{\mathrm{T}}\boldsymbol{W}_{t}]\boldsymbol{\Sigma}_{t}^{-1}\boldsymbol{1}}{\boldsymbol{1}^{\mathrm{T}}\boldsymbol{\Sigma}_{t^{\prime}}^{-1}\boldsymbol{1}\cdot\boldsymbol{1}^{\mathrm{T}}\boldsymbol{\Sigma}_{t}^{-1}\boldsymbol{1}}
=\frac{\boldsymbol{1}^{\mathrm{T}}\boldsymbol{\Sigma}_{t^{\prime}}^{-1}(\boldsymbol{I}_{t^{\prime}},\boldsymbol{O})\boldsymbol{\Sigma}_{t}\boldsymbol{\Sigma}_{t}^{-1}\boldsymbol{1}}{\boldsymbol{1}^{\mathrm{T}}\boldsymbol{\Sigma}_{t^{\prime}}^{-1}\boldsymbol{1}\cdot\boldsymbol{1}^{\mathrm{T}}\boldsymbol{\Sigma}_{t}^{-1}\boldsymbol{1}}=\frac{1}{\boldsymbol{1}^{\mathrm{T}}\boldsymbol{\Sigma}_{t}^{-1}\boldsymbol{1}} (4)

for all $t^{\prime}\leq t$. An important observation is that the covariance (4) is independent of $t^{\prime}$ as long as $t^{\prime}$ is smaller than or equal to $t$. This is a key property in proving the reduction of Bayes-optimal LM-OAMP to conventional Bayes-optimal OAMP.
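The following is a minimal numerical sketch of (2)-(4), assuming Python with NumPy; all names and parameter values (the Gaussian signal model, n_trials, and so on) are illustrative and not from the paper. It checks by Monte Carlo that the sufficient statistic $S_{t}$ in (3) behaves as a single AWGN observation of $X$ with noise variance $1/(\boldsymbol{1}^{\mathrm{T}}\boldsymbol{\Sigma}_{t}^{-1}\boldsymbol{1})$ from (4).

import numpy as np

# Sketch of (2)-(4): X is observed through t+1 correlated AWGN measurements
# Y_t = X 1^T + W_t, and S_t in (3) combines them with weights Sigma^{-1} 1.
rng = np.random.default_rng(0)
t_plus_1 = 4
B = rng.standard_normal((t_plus_1, t_plus_1))
Sigma = B @ B.T + np.eye(t_plus_1)              # random positive-definite noise covariance
ones = np.ones(t_plus_1)
Sigma_inv_1 = np.linalg.solve(Sigma, ones)
weights = Sigma_inv_1 / (ones @ Sigma_inv_1)    # weights of the sufficient statistic (3)

n_trials = 200000
X = rng.standard_normal(n_trials)               # unit-variance signal (Gaussian only for this sketch)
W = rng.multivariate_normal(np.zeros(t_plus_1), Sigma, size=n_trials)
S = (X[:, None] + W) @ weights                  # sufficient statistic (3), one value per trial

print(np.var(S - X))                            # empirical E[(S_t - X)^2]
print(1.0 / (ones @ Sigma_inv_1))               # theoretical value (4); the two should agree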

The Bayes-optimal estimator is defined as the posterior mean $f_{\mathrm{opt}}(S_{t};\mathbb{E}[\tilde{W}_{t}^{2}])=\mathbb{E}[X|S_{t}]$ of $X$ given the sufficient statistic $S_{t}$. The posterior covariance of $X$ given $S_{t^{\prime}}$ and $S_{t}$ is given by

C(S_{t^{\prime}},S_{t};\mathbb{E}[\tilde{W}_{t^{\prime}}^{2}],\mathbb{E}[\tilde{W}_{t}^{2}])=\mathbb{E}\left[\{X-f_{\mathrm{opt}}(S_{t^{\prime}};\mathbb{E}[\tilde{W}_{t^{\prime}}^{2}])\}\{X-f_{\mathrm{opt}}(S_{t};\mathbb{E}[\tilde{W}_{t}^{2}])\}\,\middle|\,S_{t^{\prime}},S_{t}\right]. (5)

Note that the posterior covariance also depends on the noise covariance $\mathbb{E}[\tilde{W}_{t^{\prime}}\tilde{W}_{t}]$; this dependence is not presented explicitly because $\mathbb{E}[\tilde{W}_{t^{\prime}}\tilde{W}_{t}]=\mathbb{E}[\tilde{W}_{t}^{2}]$ holds for $t^{\prime}\leq t$. These definitions are used to define the Bayes-optimal denoiser.

We next prove key technical results that are used to establish the convergence of state evolution recursions for Bayes-optimal LM-OAMP and their reduction to those for Bayes-optimal OAMP.

Lemma 1
  • $\mathbb{E}[\tilde{W}_{t^{\prime}}^{2}]\geq\mathbb{E}[\tilde{W}_{t}^{2}]$ and $\mathbb{E}[\{X-f_{\mathrm{opt}}(S_{t^{\prime}};\mathbb{E}[\tilde{W}_{t^{\prime}}^{2}])\}^{2}]\geq\mathbb{E}[\{X-f_{\mathrm{opt}}(S_{t};\mathbb{E}[\tilde{W}_{t}^{2}])\}^{2}]$ hold for all $t^{\prime}<t$.

  • $C(S_{t^{\prime}},S_{t};\mathbb{E}[\tilde{W}_{t^{\prime}}^{2}],\mathbb{E}[\tilde{W}_{t}^{2}])=C(S_{t},S_{t};\mathbb{E}[\tilde{W}_{t}^{2}],\mathbb{E}[\tilde{W}_{t}^{2}])$ holds for all $t^{\prime}<t$.

  • If $[\boldsymbol{\Sigma}_{t}]_{\tau^{\prime},\tau}=[\boldsymbol{\Sigma}_{t}]_{\tau,\tau^{\prime}}=[\boldsymbol{\Sigma}_{t}]_{\tau,\tau}$ holds for all $\tau^{\prime}<\tau$, then $\mathbb{E}[\tilde{W}_{t}^{2}]=[\boldsymbol{\Sigma}_{t}]_{t,t}$ holds.

Proof:

We prove the first property. Since $\{Y_{\tau}\}_{\tau=0}^{t^{\prime}}\subset\{Y_{\tau}\}_{\tau=0}^{t}$ holds for all $t^{\prime}<t$, the optimality of the posterior mean estimator implies $\mathbb{E}[\{X-f_{\mathrm{opt}}(S_{t^{\prime}};\mathbb{E}[\tilde{W}_{t^{\prime}}^{2}])\}^{2}]\geq\mathbb{E}[\{X-f_{\mathrm{opt}}(S_{t};\mathbb{E}[\tilde{W}_{t}^{2}])\}^{2}]$. The other monotonicity $\mathbb{E}[\tilde{W}_{t^{\prime}}^{2}]\geq\mathbb{E}[\tilde{W}_{t}^{2}]$ follows from the monotonicity of the MSE with respect to the noise variance.

We next prove the second property. Since (4) implies $\mathbb{E}[\tilde{W}_{t^{\prime}}\tilde{W}_{t}]=\mathbb{E}[\tilde{W}_{t}^{2}]$ for $t^{\prime}<t$, the sufficient statistic $S_{t^{\prime}}$ can be represented as

S_{t^{\prime}}=S_{t}+Z_{t^{\prime}},\quad Z_{t^{\prime}}\sim\mathcal{N}(0,\mathbb{E}[\tilde{W}_{t^{\prime}}^{2}]-\mathbb{E}[\tilde{W}_{t}^{2}]). (6)

It is straightforward to confirm $\mathbb{E}[(S_{t^{\prime}}-X)^{2}]=\mathbb{E}[\tilde{W}_{t^{\prime}}^{2}]$ and $\mathbb{E}[(S_{t^{\prime}}-X)(S_{t}-X)]=\mathbb{E}[\tilde{W}_{t}^{2}]$.

This representation implies that $S_{t}$ is a sufficient statistic for estimation of $X$ based on both $S_{t^{\prime}}$ and $S_{t}$. Thus, we have $\mathbb{E}[X|S_{t^{\prime}},S_{t}]=\mathbb{E}[X|S_{t}]$. Using this identity, we find that (5) reduces to

C(S_{t^{\prime}},S_{t};\mathbb{E}[\tilde{W}_{t^{\prime}}^{2}],\mathbb{E}[\tilde{W}_{t}^{2}])-C(S_{t},S_{t};\mathbb{E}[\tilde{W}_{t}^{2}],\mathbb{E}[\tilde{W}_{t}^{2}])
=\{\mathbb{E}[X|S_{t}]-\mathbb{E}[X|S_{t^{\prime}}]\}\{\mathbb{E}[X|S_{t^{\prime}},S_{t}]-\mathbb{E}[X|S_{t}]\}=0. (7)

Thus, the second property holds.

Before proving the last property, we prove the monotonicity $\Delta\Sigma_{\tau,t}=[\boldsymbol{\Sigma}_{t}]_{\tau,\tau}-[\boldsymbol{\Sigma}_{t}]_{\tau+1,\tau+1}>0$ for all $\tau$. For that purpose, we evaluate the determinant $\det\boldsymbol{\Sigma}_{t}$. Subtracting the $(\tau+1)$th column of $\boldsymbol{\Sigma}_{t}$ from the $\tau$th column for $\tau=0,\ldots,t-1$, we use the assumption $[\boldsymbol{\Sigma}_{t}]_{\tau^{\prime},\tau}=[\boldsymbol{\Sigma}_{t}]_{\tau,\tau^{\prime}}=[\boldsymbol{\Sigma}_{t}]_{\tau,\tau}$ for $\tau^{\prime}<\tau$ to have

\det\boldsymbol{\Sigma}_{t}=\begin{vmatrix}\Delta\Sigma_{0,t}&\cdots&\Delta\Sigma_{t-1,t}&[\boldsymbol{\Sigma}_{t}]_{t,t}\\ 0&\ddots&\vdots&\vdots\\ \vdots&\ddots&\Delta\Sigma_{t-1,t}&\vdots\\ 0&\cdots&0&[\boldsymbol{\Sigma}_{t}]_{t,t}\end{vmatrix}=[\boldsymbol{\Sigma}_{t}]_{t,t}\prod_{\tau=0}^{t-1}\Delta\Sigma_{\tau,t}. (8)

Since $\boldsymbol{\Sigma}_{t}$ has been assumed to be positive definite, the determinants of all upper-left square submatrices of $\boldsymbol{\Sigma}_{t}$ have to be positive. From (8) we arrive at $\Delta\Sigma_{\tau,t}>0$ for all $\tau\in\{0,\ldots,t-1\}$.

Finally, we prove the last property. Using $\Delta\Sigma_{\tau,t}>0$, the AWGN measurements $\{Y_{\tau}\}_{\tau=0}^{t}$ can be represented as

Y_{t}=X+V_{t},\quad Y_{\tau-1}=Y_{\tau}+V_{\tau-1} (9)

for $\tau\in\{1,\ldots,t\}$, where $\{V_{\tau}\}$ are independent zero-mean Gaussian random variables with variances $\mathbb{E}[V_{t}^{2}]=[\boldsymbol{\Sigma}_{t}]_{t,t}$ and $\mathbb{E}[V_{\tau-1}^{2}]=\Delta\Sigma_{\tau-1,t}>0$. This representation implies that $Y_{t}$ is a sufficient statistic for estimation of $X$ based on the AWGN measurements $\{Y_{\tau}\}_{\tau=0}^{t}$. Thus, we have the identity $\mathbb{E}[\{X-f_{\mathrm{opt}}(S_{t};\mathbb{E}[\tilde{W}_{t}^{2}])\}^{2}]=\mathbb{E}[\{X-f_{\mathrm{opt}}(Y_{t};[\boldsymbol{\Sigma}_{t}]_{t,t})\}^{2}]$, which is equivalent to $\mathbb{E}[\tilde{W}_{t}^{2}]=[\boldsymbol{\Sigma}_{t}]_{t,t}$. ∎

The first property in Lemma 1 is used to prove that the mean-square error (MSE) for Bayes-optimal LM-OAMP is monotonically non-increasing. This property was utilized in convergence analysis for memory AMP [19]. The remaining two properties are used to prove the reduction of Bayes-optimal LM-OAMP to Bayes-optimal OAMP.
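As a small numerical illustration of the last property in Lemma 1, the sketch below (Python with NumPy; the diagonal values are arbitrary examples) builds a covariance matrix satisfying $[\boldsymbol{\Sigma}_{t}]_{\tau^{\prime},\tau}=[\boldsymbol{\Sigma}_{t}]_{\tau,\tau}$ for $\tau^{\prime}<\tau$ and confirms that $1/(\boldsymbol{1}^{\mathrm{T}}\boldsymbol{\Sigma}_{t}^{-1}\boldsymbol{1})$ equals the last diagonal element $[\boldsymbol{\Sigma}_{t}]_{t,t}$.

import numpy as np

d = np.array([1.0, 0.7, 0.4, 0.25])            # decreasing diagonal [Sigma]_{tau,tau} (example values)
t = len(d) - 1
Sigma = np.empty((t + 1, t + 1))
for i in range(t + 1):
    for j in range(t + 1):
        Sigma[i, j] = d[max(i, j)]              # off-diagonal entries equal the later diagonal entry
ones = np.ones(t + 1)
suf_var = 1.0 / (ones @ np.linalg.solve(Sigma, ones))   # E[tilde{W}_t^2] from (4)
print(suf_var, d[t])                            # both equal [Sigma]_{t,t}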

III Long-Memory OAMP

III-A Long-Memory Processing

LM-OAMP is composed of two modules, called modules A and B. Module A uses a linear filter to mitigate multiuser interference while module B utilizes an element-wise nonlinear denoiser for signal reconstruction. An estimator of the signal vector $\boldsymbol{x}$ is computed via MP between the two modules.

Each module employs LM processing, in which messages in all preceding iterations are used to update the current message, while messages only in the latest iteration are utilized in conventional MP. Let $\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t}\in\mathbb{R}^{N}$ and $\{v_{\mathrm{A}\to\mathrm{B},t^{\prime},t}\in\mathbb{R}\}_{t^{\prime}=0}^{t}$ denote messages that are passed from module A to module B in iteration $t$. The former $\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t}$ is an estimator of $\boldsymbol{x}$ while the latter $v_{\mathrm{A}\to\mathrm{B},t^{\prime},t}$ corresponds to an estimator of the error covariance $N^{-1}\mathbb{E}[(\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t^{\prime}}-\boldsymbol{x})^{\mathrm{T}}(\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t}-\boldsymbol{x})]$. The messages used in iteration $t$ are written as $\boldsymbol{X}_{\mathrm{A}\to\mathrm{B},t}=(\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},0},\ldots,\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t})\in\mathbb{R}^{N\times(t+1)}$ and a symmetric matrix $\boldsymbol{V}_{\mathrm{A}\to\mathrm{B},t}\in\mathbb{R}^{(t+1)\times(t+1)}$ with $[\boldsymbol{V}_{\mathrm{A}\to\mathrm{B},t}]_{\tau^{\prime},\tau}=v_{\mathrm{A}\to\mathrm{B},\tau^{\prime},\tau}$ for all $\tau^{\prime}\leq\tau$.

Similarly, we define the corresponding messages passed from module B to module A in iteration $t$ as $\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},t}\in\mathbb{R}^{N}$ and $\{v_{\mathrm{B}\to\mathrm{A},t^{\prime},t}\in\mathbb{R}\}_{t^{\prime}=0}^{t}$. They are compactly written as $\boldsymbol{X}_{\mathrm{B}\to\mathrm{A},t}=(\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},0},\ldots,\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},t})\in\mathbb{R}^{N\times(t+1)}$ and a symmetric matrix $\boldsymbol{V}_{\mathrm{B}\to\mathrm{A},t}\in\mathbb{R}^{(t+1)\times(t+1)}$ with $[\boldsymbol{V}_{\mathrm{B}\to\mathrm{A},t}]_{\tau^{\prime},\tau}=v_{\mathrm{B}\to\mathrm{A},\tau^{\prime},\tau}$ for all $\tau^{\prime}\leq\tau$.

Asymptotic Gaussianity for estimation errors is postulated in formulating LM-OAMP. While asymptotic Gaussianity is defined and proved shortly, a rough interpretation is that the estimation errors are zero-mean and jointly Gaussian-distributed in the large system limit, with covariances $\mathbb{E}[(\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t}-\boldsymbol{x})(\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t^{\prime}}-\boldsymbol{x})^{\mathrm{T}}]=v_{\mathrm{A}\to\mathrm{B},t^{\prime},t}\boldsymbol{I}_{N}$ and $\mathbb{E}[(\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},t}-\boldsymbol{x})(\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},t^{\prime}}-\boldsymbol{x})^{\mathrm{T}}]=v_{\mathrm{B}\to\mathrm{A},t^{\prime},t}\boldsymbol{I}_{N}$. This rough interpretation is too strong to justify. Nonetheless, it helps us understand the update rules in LM-OAMP.

III-B Module A (Linear Estimation)

Module A utilizes $\boldsymbol{X}_{\mathrm{B}\to\mathrm{A},t}$ and $\boldsymbol{V}_{\mathrm{B}\to\mathrm{A},t}$ provided by module B to compute the mean and covariance messages $\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t}$ and $\{v_{\mathrm{A}\to\mathrm{B},t^{\prime},t}\}_{t^{\prime}=0}^{t}$ in iteration $t$. A first step is computation of a sufficient statistic for estimation of $\boldsymbol{x}$. According to (3) and (4), we define a sufficient statistic $\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},t}^{\mathrm{suf}}$ and the corresponding covariance $v_{\mathrm{B}\to\mathrm{A},t^{\prime},t}^{\mathrm{suf}}$ as

\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},t}^{\mathrm{suf}}=\frac{\boldsymbol{X}_{\mathrm{B}\to\mathrm{A},t}\boldsymbol{V}_{\mathrm{B}\to\mathrm{A},t}^{-1}\boldsymbol{1}}{\boldsymbol{1}^{\mathrm{T}}\boldsymbol{V}_{\mathrm{B}\to\mathrm{A},t}^{-1}\boldsymbol{1}}, (10)
v_{\mathrm{B}\to\mathrm{A},t^{\prime},t}^{\mathrm{suf}}=\frac{1}{\boldsymbol{1}^{\mathrm{T}}\boldsymbol{V}_{\mathrm{B}\to\mathrm{A},t}^{-1}\boldsymbol{1}}\quad\hbox{for all $t^{\prime}\leq t$.} (11)

In the initial iteration $t=0$, the initial values $\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},0}=\boldsymbol{0}$ and $v_{\mathrm{B}\to\mathrm{A},0,0}=\mathbb{E}[\|\boldsymbol{x}\|^{2}]/N$ are used to compute (10) and (11).

The sufficient statistic (10) is equivalent to optimized LM damping of all preceding messages $\{\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},t^{\prime}}\}_{t^{\prime}=0}^{t}$ in [19], which was obtained as a solution to an optimization problem based on state evolution results. This statistical interpretation of optimized LM damping is a key technical tool in proving the main theorem.
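A minimal sketch of the sufficient-statistic step (10)-(11), assuming Python with NumPy; the function name and argument layout are illustrative. The messages from all preceding iterations are combined with the weights $\boldsymbol{V}^{-1}\boldsymbol{1}/(\boldsymbol{1}^{\mathrm{T}}\boldsymbol{V}^{-1}\boldsymbol{1})$, which is exactly the optimized LM damping discussed above.

import numpy as np

def sufficient_statistic(X_msgs, V):
    # X_msgs: N x (t+1) matrix of mean messages; V: (t+1) x (t+1) covariance message.
    ones = np.ones(V.shape[0])
    Vinv_1 = np.linalg.solve(V, ones)
    denom = ones @ Vinv_1
    x_suf = X_msgs @ (Vinv_1 / denom)   # optimally damped combination (10)
    v_suf = 1.0 / denom                 # its common error covariance (11)
    return x_suf, v_suf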

A second step is computation of the posterior mean $\boldsymbol{x}_{\mathrm{A},t}^{\mathrm{post}}\in\mathbb{R}^{N}$ and covariance $\{v_{\mathrm{A},t^{\prime},t}^{\mathrm{post}}\}_{t^{\prime}=0}^{t}$. A linear filter $\boldsymbol{W}_{t}\in\mathbb{R}^{M\times N}$ is used to obtain

\boldsymbol{x}_{\mathrm{A},t}^{\mathrm{post}}=\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},t}^{\mathrm{suf}}+\boldsymbol{W}_{t}^{\mathrm{T}}(\boldsymbol{y}-\boldsymbol{A}\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},t}^{\mathrm{suf}}), (12)
v_{\mathrm{A},t^{\prime},t}^{\mathrm{post}}=\gamma_{t^{\prime},t}v_{\mathrm{B}\to\mathrm{A},t^{\prime},t}^{\mathrm{suf}}+\frac{\sigma^{2}}{N}\mathrm{Tr}\left(\boldsymbol{W}_{t^{\prime}}\boldsymbol{W}_{t}^{\mathrm{T}}\right), (13)

with

\gamma_{t^{\prime},t}=\frac{1}{N}\mathrm{Tr}\left\{\left(\boldsymbol{I}_{N}-\boldsymbol{W}_{t^{\prime}}^{\mathrm{T}}\boldsymbol{A}\right)^{\mathrm{T}}\left(\boldsymbol{I}_{N}-\boldsymbol{W}_{t}^{\mathrm{T}}\boldsymbol{A}\right)\right\}. (14)

In this paper, we focus on the linear minimum mean-square error (LMMSE) filter

\boldsymbol{W}_{t}=v_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}\left(\sigma^{2}\boldsymbol{I}_{M}+v_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}\boldsymbol{A}\boldsymbol{A}^{\mathrm{T}}\right)^{-1}\boldsymbol{A}. (15)

The LMMSE filter minimizes the posterior variance $v_{\mathrm{A},t,t}^{\mathrm{post}}$ among all possible linear filters.
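A minimal sketch of the posterior step (12)-(15) in module A with the LMMSE filter, assuming Python with NumPy; the function name and arguments are illustrative, and only the diagonal case $t^{\prime}=t$ of (13) and (14) is computed.

import numpy as np

def module_a_posterior(y, A, x_suf, v_suf, sigma2):
    M, N = A.shape
    G = sigma2 * np.eye(M) + v_suf * (A @ A.T)
    W = v_suf * np.linalg.solve(G, A)                         # LMMSE filter (15), shape M x N
    x_post = x_suf + W.T @ (y - A @ x_suf)                    # posterior mean (12)
    R = np.eye(N) - W.T @ A
    gamma = np.trace(R.T @ R) / N                             # (14) with t' = t
    v_post = gamma * v_suf + sigma2 * np.trace(W @ W.T) / N   # posterior variance (13) with t' = t
    return x_post, v_post, W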

The last step is computation of extrinsic messages $\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t}\in\mathbb{R}^{N}$ and $\{v_{\mathrm{A}\to\mathrm{B},t^{\prime},t}\}_{t^{\prime}=0}^{t}$ to realize asymptotic Gaussianity in module B. Let

\xi_{\mathrm{A},t^{\prime},t}=\xi_{\mathrm{A},t}\frac{\boldsymbol{e}_{t^{\prime}}^{\mathrm{T}}\boldsymbol{V}_{\mathrm{B}\to\mathrm{A},t}^{-1}\boldsymbol{1}}{\boldsymbol{1}^{\mathrm{T}}\boldsymbol{V}_{\mathrm{B}\to\mathrm{A},t}^{-1}\boldsymbol{1}}, (16)

where $\boldsymbol{e}_{t}$ is the $t$th column of $\boldsymbol{I}$, with

\xi_{\mathrm{A},t}=\frac{1}{N}\mathrm{Tr}\left(\boldsymbol{I}_{N}-\boldsymbol{W}_{t}^{\mathrm{T}}\boldsymbol{A}\right). (17)

The extrinsic mean $\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t}$ and covariance $\{v_{\mathrm{A}\to\mathrm{B},t^{\prime},t}\}_{t^{\prime}=0}^{t}$ are computed as

\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t}=\frac{\boldsymbol{x}_{\mathrm{A},t}^{\mathrm{post}}-\sum_{t^{\prime}=0}^{t}\xi_{\mathrm{A},t^{\prime},t}\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},t^{\prime}}}{1-\xi_{\mathrm{A},t}}, (18)
v_{\mathrm{A}\to\mathrm{B},t^{\prime},t}=\frac{v_{\mathrm{A},t^{\prime},t}^{\mathrm{post}}-\xi_{\mathrm{A},t^{\prime}}\xi_{\mathrm{A},t}v_{\mathrm{B}\to\mathrm{A},t^{\prime},t}^{\mathrm{suf}}}{(1-\xi_{\mathrm{A},t^{\prime}})(1-\xi_{\mathrm{A},t})}. (19)

The numerator in (18) is the so-called Onsager correction of the posterior mean $\boldsymbol{x}_{\mathrm{A},t}^{\mathrm{post}}$ to realize asymptotic Gaussianity. The denominator can be set to an arbitrary constant. In this paper, we set the denominator so as to minimize the extrinsic variance $v_{\mathrm{A}\to\mathrm{B},t,t}$ for the LMMSE filter (15). See [23, Section II] for the details.
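By (10) and (16), the Onsager sum $\sum_{t^{\prime}=0}^{t}\xi_{\mathrm{A},t^{\prime},t}\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},t^{\prime}}$ collapses to $\xi_{\mathrm{A},t}\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},t}^{\mathrm{suf}}$, so the extrinsic step only needs the sufficient statistic. The following sketch (Python with NumPy, illustrative names, diagonal covariance entry $t^{\prime}=t$ only) implements (17)-(19) in this form.

import numpy as np

def module_a_extrinsic(x_post, v_post, x_suf, v_suf, A, W):
    N = A.shape[1]
    xi = np.trace(np.eye(N) - W.T @ A) / N                # (17)
    x_ext = (x_post - xi * x_suf) / (1.0 - xi)            # (18), using the collapsed Onsager sum
    v_ext = (v_post - xi**2 * v_suf) / (1.0 - xi)**2      # (19) with t' = t
    return x_ext, v_ext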

III-C Module B (Nonlinear Estimation)

Module B uses $\boldsymbol{X}_{\mathrm{A}\to\mathrm{B},t}$ and $\boldsymbol{V}_{\mathrm{A}\to\mathrm{B},t}$ to compute the messages $\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},t+1}$ and $\{v_{\mathrm{B}\to\mathrm{A},t^{\prime},t+1}\}_{t^{\prime}=0}^{t+1}$ in the same manner as in module A. A sufficient statistic $\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t}^{\mathrm{suf}}\in\mathbb{R}^{N}$ and the corresponding covariance $\{v_{\mathrm{A}\to\mathrm{B},t^{\prime},t}^{\mathrm{suf}}\in\mathbb{R}\}_{t^{\prime}=0}^{t}$ are computed as

\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t}^{\mathrm{suf}}=\frac{\boldsymbol{X}_{\mathrm{A}\to\mathrm{B},t}\boldsymbol{V}_{\mathrm{A}\to\mathrm{B},t}^{-1}\boldsymbol{1}}{\boldsymbol{1}^{\mathrm{T}}\boldsymbol{V}_{\mathrm{A}\to\mathrm{B},t}^{-1}\boldsymbol{1}}, (20)
v_{\mathrm{A}\to\mathrm{B},t^{\prime},t}^{\mathrm{suf}}=\frac{1}{\boldsymbol{1}^{\mathrm{T}}\boldsymbol{V}_{\mathrm{A}\to\mathrm{B},t}^{-1}\boldsymbol{1}}\quad\hbox{for all $t^{\prime}\leq t$.} (21)

Module B next computes the posterior messages $\boldsymbol{x}_{\mathrm{B},t+1}^{\mathrm{post}}$ and $\{v_{\mathrm{B},t^{\prime}+1,t+1}^{\mathrm{post}}\}_{t^{\prime}=0}^{t}$ with the posterior mean $f_{\mathrm{opt}}(\cdot;\cdot)$ and covariance (5) for the correlated AWGN measurements,

\boldsymbol{x}_{\mathrm{B},t+1}^{\mathrm{post}}=f_{\mathrm{opt}}(\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t}^{\mathrm{suf}};v_{\mathrm{A}\to\mathrm{B},t,t}^{\mathrm{suf}}), (22)
v_{\mathrm{B},t^{\prime}+1,t+1}^{\mathrm{post}}=\frac{1}{N}\sum_{n=1}^{N}C\left([\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t^{\prime}}^{\mathrm{suf}}]_{n},[\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t}^{\mathrm{suf}}]_{n};v_{\mathrm{A}\to\mathrm{B},t^{\prime},t^{\prime}}^{\mathrm{suf}},v_{\mathrm{A}\to\mathrm{B},t,t}^{\mathrm{suf}}\right), (23)

where the right-hand side (RHS) of (22) means the element-wise application of $f_{\mathrm{opt}}$ to $\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t}^{\mathrm{suf}}$. The posterior mean (22) is used as an estimator of $\boldsymbol{x}$.

To realize asymptotic Gaussianity in module A, the extrinsic mean $\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},t+1}$ and covariance $\{v_{\mathrm{B}\to\mathrm{A},t^{\prime},t+1}\}_{t^{\prime}=0}^{t+1}$ are fed back to module A. Let

\xi_{\mathrm{B},t^{\prime},t}=\xi_{\mathrm{B},t}\frac{\boldsymbol{e}_{t^{\prime}}^{\mathrm{T}}\boldsymbol{V}_{\mathrm{A}\to\mathrm{B},t}^{-1}\boldsymbol{1}}{\boldsymbol{1}^{\mathrm{T}}\boldsymbol{V}_{\mathrm{A}\to\mathrm{B},t}^{-1}\boldsymbol{1}}, (24)

with

\xi_{\mathrm{B},t}=\frac{1}{N}\sum_{n=1}^{N}f_{\mathrm{opt}}^{\prime}([\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t}^{\mathrm{suf}}]_{n};v_{\mathrm{A}\to\mathrm{B},t,t}^{\mathrm{suf}}), (25)

where the derivative is taken with respect to the first variable. The extrinsic messages are computed as

\boldsymbol{x}_{\mathrm{B}\to\mathrm{A},t+1}=\frac{\boldsymbol{x}_{\mathrm{B},t+1}^{\mathrm{post}}-\sum_{t^{\prime}=0}^{t}\xi_{\mathrm{B},t^{\prime},t}\boldsymbol{x}_{\mathrm{A}\to\mathrm{B},t^{\prime}}}{1-\xi_{\mathrm{B},t}}, (26)
v_{\mathrm{B}\to\mathrm{A},t^{\prime}+1,t+1}=\frac{v_{\mathrm{B},t^{\prime}+1,t+1}^{\mathrm{post}}-\xi_{\mathrm{B},t^{\prime}}\xi_{\mathrm{B},t}v_{\mathrm{A}\to\mathrm{B},t^{\prime},t}^{\mathrm{suf}}}{(1-\xi_{\mathrm{B},t^{\prime}})(1-\xi_{\mathrm{B},t})} (27)

for $t^{\prime}\in\{0,\ldots,t\}$, with $v_{\mathrm{B}\to\mathrm{A},0,t+1}=v_{\mathrm{B},t+1,t+1}^{\mathrm{post}}/(1-\xi_{\mathrm{B},t})$.
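The paper keeps the prior of the signal elements generic. As a concrete illustration of the Bayes-optimal denoiser $f_{\mathrm{opt}}$ used in (22) and (25), the sketch below (Python with NumPy) assumes a Bernoulli-Gaussian prior $x\sim\rho\,\mathcal{N}(0,1/\rho)+(1-\rho)\delta_{0}$, which has zero mean and unit variance and yields a nonlinear Lipschitz-continuous denoiser; this specific prior is an assumption of the example, not of the paper.

import numpy as np

def f_opt(s, v, rho=0.1):
    # Posterior mean E[X | X + N(0, v) = s] for the Bernoulli-Gaussian prior above.
    vx = 1.0 / rho                                         # variance of the nonzero component
    # posterior probability that the entry is nonzero, in a numerically stable form
    odds = ((1.0 - rho) / rho) * np.sqrt((vx + v) / v) \
        * np.exp(-0.5 * s**2 * (1.0 / v - 1.0 / (vx + v)))
    pi = 1.0 / (1.0 + odds)
    return pi * vx / (vx + v) * s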

IV Main Results

IV-A State Evolution

The dynamics of LM-OAMP are analyzed via state evolution [17] in the large system limit. Asymptotic Gaussianity has been proved for a general error model proposed in [17]. Thus, the main part of the state evolution analysis is to prove that the error model for LM-OAMP is included in the general error model.

Before presenting state evolution results, we first summarize technical assumptions.

Assumption 1

The signal vector $\boldsymbol{x}$ has i.i.d. elements with zero mean, unit variance, and bounded $(2+\epsilon)$th moment for some $\epsilon>0$.

Assumption 1 is a simplifying assumption. To relax Assumption 1, we need non-separable denoisers [25, 26, 27].

Assumption 2

The sensing matrix $\boldsymbol{A}$ is right-orthogonally invariant: For any orthogonal matrix $\boldsymbol{\Phi}$ independent of $\boldsymbol{A}$, the equivalence in distribution $\boldsymbol{A}\boldsymbol{\Phi}\sim\boldsymbol{A}$ holds. More precisely, in the singular-value decomposition (SVD) $\boldsymbol{A}=\boldsymbol{U}\boldsymbol{\Sigma}\boldsymbol{V}^{\mathrm{T}}$ the orthogonal matrix $\boldsymbol{V}$ is independent of $\boldsymbol{U}\boldsymbol{\Sigma}$ and Haar-distributed [28, 29]. Furthermore, the empirical eigenvalue distribution of $\boldsymbol{A}^{\mathrm{T}}\boldsymbol{A}$ converges almost surely to a compactly supported deterministic distribution with unit first moment in the large system limit.

The right-orthogonal invariance is a key assumption in state evolution analysis. The unit-first-moment assumption implies the almost sure convergence $N^{-1}\mathrm{Tr}(\boldsymbol{A}^{\mathrm{T}}\boldsymbol{A})\overset{\mathrm{a.s.}}{\to}1$.

Assumption 3

The Bayes-optimal denoiser $f_{\mathrm{opt}}$ in module B is nonlinear and Lipschitz-continuous.

The nonlinearity is required to guarantee the asymptotic positive definiteness of $\boldsymbol{V}_{\mathrm{A}\to\mathrm{B},t}$ and $\boldsymbol{V}_{\mathrm{B}\to\mathrm{A},t}$. It is an interesting open question whether any Bayes-optimal denoiser is Lipschitz-continuous under Assumption 1.

We next define state evolution recursions for LM-OAMP, which are 2D discrete systems with respect to two positive-definite symmetric matrices $\bar{\boldsymbol{V}}_{\mathrm{A}\to\mathrm{B},t}\in\mathbb{R}^{(t+1)\times(t+1)}$ and $\bar{\boldsymbol{V}}_{\mathrm{B}\to\mathrm{A},t}\in\mathbb{R}^{(t+1)\times(t+1)}$. We write the $(\tau^{\prime},\tau)$ elements of $\bar{\boldsymbol{V}}_{\mathrm{A}\to\mathrm{B},t}$ and $\bar{\boldsymbol{V}}_{\mathrm{B}\to\mathrm{A},t}$ as $\bar{v}_{\mathrm{A}\to\mathrm{B},\tau^{\prime},\tau}$ and $\bar{v}_{\mathrm{B}\to\mathrm{A},\tau^{\prime},\tau}$ for $\tau^{\prime},\tau\in\{0,\ldots,t\}$, respectively.

Consider the initial condition $\bar{v}_{\mathrm{B}\to\mathrm{A},0,0}=1$. State evolution recursions for module A are given by

\bar{v}_{\mathrm{B}\to\mathrm{A},t^{\prime},t}^{\mathrm{suf}}=\frac{1}{\boldsymbol{1}^{\mathrm{T}}\bar{\boldsymbol{V}}_{\mathrm{B}\to\mathrm{A},t}^{-1}\boldsymbol{1}}\quad\hbox{for all $t^{\prime}\leq t$}, (28)
\bar{v}_{\mathrm{A},t^{\prime},t}^{\mathrm{post}}=\lim_{M=\delta N\to\infty}\left\{\gamma_{t^{\prime},t}\bar{v}_{\mathrm{B}\to\mathrm{A},t^{\prime},t}^{\mathrm{suf}}+\frac{\sigma^{2}}{N}\mathrm{Tr}\left(\boldsymbol{W}_{t^{\prime}}\boldsymbol{W}_{t}^{\mathrm{T}}\right)\right\}, (29)
\bar{v}_{\mathrm{A}\to\mathrm{B},t^{\prime},t}=\frac{\bar{v}_{\mathrm{A},t^{\prime},t}^{\mathrm{post}}-\bar{\xi}_{\mathrm{A},t^{\prime}}\bar{\xi}_{\mathrm{A},t}\bar{v}_{\mathrm{B}\to\mathrm{A},t^{\prime},t}^{\mathrm{suf}}}{(1-\bar{\xi}_{\mathrm{A},t^{\prime}})(1-\bar{\xi}_{\mathrm{A},t})}, (30)

with $\bar{\xi}_{\mathrm{A},t}=\bar{v}_{\mathrm{A},t,t}^{\mathrm{post}}/\bar{v}_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}$. In (29), $\gamma_{t^{\prime},t}$ is given by (14). The LMMSE filter $\boldsymbol{W}_{t}$ is defined as (15) with $v_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}$ replaced by $\bar{v}_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}$.

State evolution recursions for module B are given by

\bar{v}_{\mathrm{A}\to\mathrm{B},t^{\prime},t}^{\mathrm{suf}}=\frac{1}{\boldsymbol{1}^{\mathrm{T}}\bar{\boldsymbol{V}}_{\mathrm{A}\to\mathrm{B},t}^{-1}\boldsymbol{1}}\quad\hbox{for all $t^{\prime}\leq t$}, (31)
\bar{v}_{\mathrm{B},t^{\prime}+1,t+1}^{\mathrm{post}}=\mathbb{E}\left[\{f_{\mathrm{opt}}(x_{1}+z_{t^{\prime}};\bar{v}_{\mathrm{A}\to\mathrm{B},t^{\prime},t^{\prime}}^{\mathrm{suf}})-x_{1}\}\{f_{\mathrm{opt}}(x_{1}+z_{t};\bar{v}_{\mathrm{A}\to\mathrm{B},t,t}^{\mathrm{suf}})-x_{1}\}\right], (32)
\bar{v}_{\mathrm{B}\to\mathrm{A},t^{\prime}+1,t+1}=\frac{\bar{v}_{\mathrm{B},t^{\prime}+1,t+1}^{\mathrm{post}}-\bar{\xi}_{\mathrm{B},t^{\prime}}\bar{\xi}_{\mathrm{B},t}\bar{v}_{\mathrm{A}\to\mathrm{B},t^{\prime},t}^{\mathrm{suf}}}{(1-\bar{\xi}_{\mathrm{B},t^{\prime}})(1-\bar{\xi}_{\mathrm{B},t})} (33)

for $t^{\prime},t\geq 0$, with $\bar{v}_{\mathrm{B}\to\mathrm{A},0,t+1}=\bar{v}_{\mathrm{B},t+1,t+1}^{\mathrm{post}}/(1-\bar{\xi}_{\mathrm{B},t})$ and $\bar{\xi}_{\mathrm{B},t}=\bar{v}_{\mathrm{B},t+1,t+1}^{\mathrm{post}}/\bar{v}_{\mathrm{A}\to\mathrm{B},t,t}^{\mathrm{suf}}$. In (32), $\{z_{\tau}\}$ are zero-mean Gaussian random variables, independent of the signal element $x_{1}$, with covariance $\mathbb{E}[z_{\tau^{\prime}}z_{\tau}]=\bar{v}_{\mathrm{A}\to\mathrm{B},\tau^{\prime},\tau}^{\mathrm{suf}}$.
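The covariance update (32) can be evaluated by Monte Carlo. The sketch below (Python with NumPy) reuses the illustrative Bernoulli-Gaussian f_opt defined in Section III-C; by (31), the cross covariance $\mathbb{E}[z_{t^{\prime}}z_{t}]$ equals the later diagonal value $\bar{v}_{\mathrm{A}\to\mathrm{B},t,t}^{\mathrm{suf}}$, so only the two diagonal variances are needed as inputs. All names and sample sizes are illustrative.

import numpy as np

def v_B_post(v_suf_tptp, v_suf_tt, rho=0.1, n=200000, seed=0):
    rng = np.random.default_rng(seed)
    x = (rng.random(n) < rho) * rng.standard_normal(n) / np.sqrt(rho)   # Bernoulli-Gaussian signal
    cov = np.array([[v_suf_tptp, v_suf_tt], [v_suf_tt, v_suf_tt]])      # covariance of (z_{t'}, z_t)
    z = rng.multivariate_normal(np.zeros(2), cov, size=n)
    err_tp = f_opt(x + z[:, 0], v_suf_tptp, rho) - x
    err_t = f_opt(x + z[:, 1], v_suf_tt, rho) - x
    return np.mean(err_tp * err_t)                                      # approximates (32)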

Theorem 1

Suppose that Assumptions 1–3 hold. Then, the covariances $N^{-1}(\boldsymbol{x}_{\mathrm{A},t^{\prime}}^{\mathrm{post}}-\boldsymbol{x})^{\mathrm{T}}(\boldsymbol{x}_{\mathrm{A},t}^{\mathrm{post}}-\boldsymbol{x})$ and $N^{-1}(\boldsymbol{x}_{\mathrm{B},t^{\prime}+1}^{\mathrm{post}}-\boldsymbol{x})^{\mathrm{T}}(\boldsymbol{x}_{\mathrm{B},t+1}^{\mathrm{post}}-\boldsymbol{x})$ converge almost surely to $\bar{v}_{\mathrm{A},t^{\prime},t}^{\mathrm{post}}$ and $\bar{v}_{\mathrm{B},t^{\prime}+1,t+1}^{\mathrm{post}}$, respectively, in the large system limit, where $\bar{v}_{\mathrm{A},t^{\prime},t}^{\mathrm{post}}$ and $\bar{v}_{\mathrm{B},t^{\prime}+1,t+1}^{\mathrm{post}}$ satisfy the state evolution recursions (28)–(33).

Proof:

See [23, Theorems 1 and 2]. ∎

The almost sure convergence $N^{-1}(\boldsymbol{x}_{\mathrm{B},t^{\prime}+1}^{\mathrm{post}}-\boldsymbol{x})^{\mathrm{T}}(\boldsymbol{x}_{\mathrm{B},t+1}^{\mathrm{post}}-\boldsymbol{x})\overset{\mathrm{a.s.}}{\to}\bar{v}_{\mathrm{B},t^{\prime}+1,t+1}^{\mathrm{post}}$ is the precise meaning of asymptotic Gaussianity. As shown in (32), $\bar{v}_{\mathrm{B},t^{\prime}+1,t+1}^{\mathrm{post}}$ is given via the error covariance for the Bayes-optimal estimation of $x_{1}$ based on the correlated AWGN measurements $x_{1}+z_{t^{\prime}}$ and $x_{1}+z_{t}$.

Theorem 1 implies that the covariance messages $v_{\mathrm{A},t^{\prime},t}^{\mathrm{post}}$ and $v_{\mathrm{B},t^{\prime}+1,t+1}^{\mathrm{post}}$ in LM-OAMP are consistent estimators of the covariances $N^{-1}(\boldsymbol{x}_{\mathrm{A},t^{\prime}}^{\mathrm{post}}-\boldsymbol{x})^{\mathrm{T}}(\boldsymbol{x}_{\mathrm{A},t}^{\mathrm{post}}-\boldsymbol{x})$ and $N^{-1}(\boldsymbol{x}_{\mathrm{B},t^{\prime}+1}^{\mathrm{post}}-\boldsymbol{x})^{\mathrm{T}}(\boldsymbol{x}_{\mathrm{B},t+1}^{\mathrm{post}}-\boldsymbol{x})$ in the large system limit, respectively.

IV-B Convergence

We next prove that the state evolution recursions (28)–(33) for Bayes-optimal LM-OAMP are equivalent to those for conventional Bayes-optimal OAMP and that they converge to a fixed point, i.e. $\bar{v}_{\mathrm{B},t^{\prime},t}^{\mathrm{post}}$ converges to a constant as $t$ tends to infinity for all $t^{\prime}\leq t$. Note that, in general, the convergence of the diagonal elements $\bar{v}_{\mathrm{B},t,t}^{\mathrm{post}}$ does not necessarily imply the convergence of the off-diagonal elements [30].

Before analyzing the convergence of the state evolution recursions, we review state evolution recursions for conventional Bayes-optimal OAMP [5, 6, 11].

\bar{v}_{\mathrm{A},t}^{\mathrm{post}}=\bar{v}_{\mathrm{B}\to\mathrm{A},t}-\lim_{M=\delta N\to\infty}\frac{\bar{v}_{\mathrm{B}\to\mathrm{A},t}^{2}}{N}\mathrm{Tr}\left\{\boldsymbol{A}\boldsymbol{A}^{\mathrm{T}}\left(\sigma^{2}\boldsymbol{I}_{M}+\bar{v}_{\mathrm{B}\to\mathrm{A},t}\boldsymbol{A}\boldsymbol{A}^{\mathrm{T}}\right)^{-1}\right\}, (34)
\bar{v}_{\mathrm{A}\to\mathrm{B},t}=\left(\frac{1}{\bar{v}_{\mathrm{A},t}^{\mathrm{post}}}-\frac{1}{\bar{v}_{\mathrm{B}\to\mathrm{A},t}}\right)^{-1}, (35)
\bar{v}_{\mathrm{B},t+1}^{\mathrm{post}}=\mathbb{E}\left[\{f_{\mathrm{opt}}(x_{1}+z_{t};\bar{v}_{\mathrm{A}\to\mathrm{B},t})-x_{1}\}^{2}\right], (36)
\bar{v}_{\mathrm{B}\to\mathrm{A},t+1}=\left(\frac{1}{\bar{v}_{\mathrm{B},t+1}^{\mathrm{post}}}-\frac{1}{\bar{v}_{\mathrm{A}\to\mathrm{B},t}}\right)^{-1}, (37)

with the initial condition $\bar{v}_{\mathrm{B}\to\mathrm{A},0}=1$, where $z_{t}$ in (36) is independent of $x_{1}$ and follows the zero-mean Gaussian distribution with variance $\bar{v}_{\mathrm{A}\to\mathrm{B},t}$. The MSE for the OAMP estimator in iteration $t$ was proved to converge almost surely to $\bar{v}_{\mathrm{B},t}^{\mathrm{post}}$ in the large system limit [6, 11].
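The sketch below iterates the reduced recursions (34)–(37), again with the illustrative Bernoulli-Gaussian f_opt from Section III-C and a finite random matrix as a proxy for the asymptotic spectrum of $\boldsymbol{A}\boldsymbol{A}^{\mathrm{T}}$; all sizes and parameter values are examples, not from the paper. The printed sequence of $\bar{v}_{\mathrm{B},t}^{\mathrm{post}}$ values is non-increasing and settles at a fixed point, in line with Theorem 2 below.

import numpy as np

def run_state_evolution(M=300, N=600, sigma2=0.01, rho=0.1, n_iter=30, seed=1):
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((M, N)) / np.sqrt(M)        # E[N^{-1} Tr(A^T A)] = 1
    lam = np.linalg.eigvalsh(A @ A.T)                   # eigenvalues of A A^T
    v_BA = 1.0                                          # initial condition
    for t in range(n_iter):
        v_A_post = v_BA - (v_BA**2 / N) * np.sum(lam / (sigma2 + v_BA * lam))   # (34)
        v_AB = 1.0 / (1.0 / v_A_post - 1.0 / v_BA)      # (35)
        n = 100000                                      # Monte Carlo evaluation of (36)
        x = (rng.random(n) < rho) * rng.standard_normal(n) / np.sqrt(rho)
        z = np.sqrt(v_AB) * rng.standard_normal(n)
        v_B_post = np.mean((f_opt(x + z, v_AB, rho) - x)**2)
        v_BA = 1.0 / (1.0 / v_B_post - 1.0 / v_AB)      # (37)
        print(t, v_B_post)
    return v_B_post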

Theorem 2

The state evolution recursions (28)–(33) for Bayes-optimal LM-OAMP are equivalent to those (34)–(37) for Bayes-optimal OAMP, i.e. $\bar{v}_{\mathrm{B},t,t}^{\mathrm{post}}=\bar{v}_{\mathrm{B},t}^{\mathrm{post}}$ holds for any $t$. Furthermore, $\bar{v}_{\mathrm{B},t^{\prime},t}^{\mathrm{post}}$ converges to a constant as $t$ tends to infinity for all $t^{\prime}\leq t$.

Proof:

We first evaluate the RHS of (29). Let

\boldsymbol{\Xi}_{t}=\bar{v}_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}\left(\sigma^{2}\boldsymbol{I}_{M}+\bar{v}_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}\boldsymbol{A}\boldsymbol{A}^{\mathrm{T}}\right)^{-1}. (38)

Using $v_{\mathrm{B}\to\mathrm{A},t^{\prime},t}^{\mathrm{suf}}\overset{\mathrm{a.s.}}{\to}\bar{v}_{\mathrm{B}\to\mathrm{A},t^{\prime},t}^{\mathrm{suf}}=\bar{v}_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}$ for all $t^{\prime}\leq t$ obtained from (28), the definition of $\gamma_{t^{\prime},t}$ in (14), and the LMMSE filter (15), we have

\bar{v}_{\mathrm{A},t^{\prime},t}^{\mathrm{post}}\overset{\mathrm{a.s.}}{=}\frac{\bar{v}_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}}{N}\left\{N-\mathrm{Tr}\left(\boldsymbol{\Xi}_{t^{\prime}}\boldsymbol{A}\boldsymbol{A}^{\mathrm{T}}+\boldsymbol{\Xi}_{t}\boldsymbol{A}\boldsymbol{A}^{\mathrm{T}}\right)\right\}+\frac{1}{N}\mathrm{Tr}\left\{\boldsymbol{\Xi}_{t^{\prime}}\boldsymbol{A}\boldsymbol{A}^{\mathrm{T}}\boldsymbol{\Xi}_{t}\left(\sigma^{2}\boldsymbol{I}+\bar{v}_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}\boldsymbol{A}\boldsymbol{A}^{\mathrm{T}}\right)\right\}+o(1)
=\bar{v}_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}-\frac{\bar{v}_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}}{N}\mathrm{Tr}\left(\boldsymbol{\Xi}_{t}\boldsymbol{A}\boldsymbol{A}^{\mathrm{T}}\right)+o(1), (39)

which is equal to the RHS of (34) with $\bar{v}_{\mathrm{B}\to\mathrm{A},t}$ replaced by $\bar{v}_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}$. In the derivation of the second equality we have used (38). Note that (39) implies $\bar{v}_{\mathrm{A},t^{\prime},t}^{\mathrm{post}}=\bar{v}_{\mathrm{A},t,t}^{\mathrm{post}}$ for all $t^{\prime}\leq t$.

We next evaluate the RHSs of (30) and (33). Applying $\bar{v}_{\mathrm{B}\to\mathrm{A},t^{\prime},t}^{\mathrm{suf}}=\bar{v}_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}$ obtained from (28), $\bar{v}_{\mathrm{A},t^{\prime},t}^{\mathrm{post}}=\bar{v}_{\mathrm{A},t,t}^{\mathrm{post}}$, and $\bar{\xi}_{\mathrm{A},t}=\bar{v}_{\mathrm{A},t,t}^{\mathrm{post}}/\bar{v}_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}$ to (30), we have

\bar{v}_{\mathrm{A}\to\mathrm{B},t^{\prime},t}=\left(\frac{1}{\bar{v}_{\mathrm{A},t,t}^{\mathrm{post}}}-\frac{1}{\bar{v}_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}}\right)^{-1}. (40)

Similarly, we use $\bar{v}_{\mathrm{B},t^{\prime},t}^{\mathrm{post}}=\bar{v}_{\mathrm{B},t,t}^{\mathrm{post}}$ for (32), obtained from the second property in Lemma 1, to find that the RHS of (33) reduces to

\bar{v}_{\mathrm{B}\to\mathrm{A},t^{\prime}+1,t+1}=\left(\frac{1}{\bar{v}_{\mathrm{B},t+1,t+1}^{\mathrm{post}}}-\frac{1}{\bar{v}_{\mathrm{A}\to\mathrm{B},t,t}^{\mathrm{suf}}}\right)^{-1}. (41)

To prove $\bar{v}_{\mathrm{B},t,t}^{\mathrm{post}}=\bar{v}_{\mathrm{B},t}^{\mathrm{post}}$ for any $t$, it is sufficient to show $\bar{v}_{\mathrm{A}\to\mathrm{B},t,t}=\bar{v}_{\mathrm{A}\to\mathrm{B},t,t}^{\mathrm{suf}}$ and $\bar{v}_{\mathrm{B}\to\mathrm{A},t,t}=\bar{v}_{\mathrm{B}\to\mathrm{A},t,t}^{\mathrm{suf}}$. These identities follow immediately from the last property in Lemma 1. Thus, the state evolution recursions for Bayes-optimal LM-OAMP are equivalent to those for Bayes-optimal OAMP.

Finally, we prove the convergence of the state evolution recursions for Bayes-optimal LM-OAMP. The first property in Lemma 1 implies that $\{\bar{v}_{\mathrm{B},t,t}^{\mathrm{post}}\geq 0\}$ is a monotonically non-increasing sequence as $t$ grows. Thus, $\bar{v}_{\mathrm{B},t,t}^{\mathrm{post}}$ converges to a constant $\bar{v}_{\mathrm{B}}^{\mathrm{post}}$ as $t$ tends to infinity. Since $\bar{v}_{\mathrm{B},t^{\prime},t}^{\mathrm{post}}=\bar{v}_{\mathrm{B},t,t}^{\mathrm{post}}$ holds for all $t^{\prime}\leq t$, the convergence of the diagonal elements $\bar{v}_{\mathrm{B},t,t}^{\mathrm{post}}$ implies that of the off-diagonal elements $\{\bar{v}_{\mathrm{B},t^{\prime},t}^{\mathrm{post}}\}$. ∎

Theorem 2 implies that the state evolution recursions (34)–(37) for Bayes-optimal OAMP converge to a fixed point as $t$ tends to infinity. Furthermore, the LM-MP proof strategy developed in this paper implies the optimality of Bayes-optimal OAMP in terms of convergence speed among all possible LM-MP algorithms included in the unified framework of [17].

References

  • [1] D. L. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.
  • [2] E. J. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information,” IEEE Trans. Inf. Theory, vol. 52, no. 2, pp. 489–509, Feb. 2006.
  • [3] D. L. Donoho, A. Maleki, and A. Montanari, “Message-passing algorithms for compressed sensing,” Proc. Nat. Acad. Sci., vol. 106, no. 45, pp. 18 914–18 919, Nov. 2009.
  • [4] S. Rangan, “Generalized approximate message passing for estimation with random linear mixing,” in Proc. 2011 IEEE Int. Symp. Inf. Theory, Saint Petersburg, Russia, Aug. 2011, pp. 2168–2172.
  • [5] J. Ma and L. Ping, “Orthogonal AMP,” IEEE Access, vol. 5, pp. 2020–2033, Jan. 2017.
  • [6] S. Rangan, P. Schniter, and A. K. Fletcher, “Vector approximate message passing,” IEEE Trans. Inf. Theory, vol. 65, no. 10, pp. 6664–6684, Oct. 2019.
  • [7] M. Bayati and A. Montanari, “The dynamics of message passing on dense graphs, with applications to compressed sensing,” IEEE Trans. Inf. Theory, vol. 57, no. 2, pp. 764–785, Feb. 2011.
  • [8] M. Bayati, M. Lelarge, and A. Montanari, “Universality in polytope phase transitions and message passing algorithms,” Ann. Appl. Probab., vol. 25, no. 2, pp. 753–822, Apr. 2015.
  • [9] F. Caltagirone, L. Zdeborová, and F. Krzakala, “On convergence of approximate message passing,” in Proc. 2014 IEEE Int. Symp. Inf. Theory, Honolulu, HI, USA, Jul. 2014, pp. 1812–1816.
  • [10] S. Rangan, P. Schniter, A. Fletcher, and S. Sarkar, “On the convergence of approximate message passing with arbitrary matrices,” IEEE Trans. Inf. Theory, vol. 65, no. 9, pp. 5339–5351, Sep. 2019.
  • [11] K. Takeuchi, “Rigorous dynamics of expectation-propagation-based signal recovery from unitarily invariant measurements,” IEEE Trans. Inf. Theory, vol. 66, no. 1, pp. 368–386, Jan. 2020.
  • [12] L. Liu, C. Yuen, Y. L. Guan, Y. Li, and Y. Su, “Convergence analysis and assurance for Gaussian message passing iterative detector in massive MU-MIMO systems,” IEEE Trans. Wireless Commun., vol. 15, no. 9, pp. 6487–6501, Sep. 2016.
  • [13] C. Gerbelot, A. Abbara, and F. Krzakala, “Asymptotic errors for teacher-student convex generalized linear models (or : How to prove Kabashima’s replica formula),” [Online] Available:
    https://arxiv.org/abs/2006.06581.
  • [14] M. Mondelli and R. Venkataramanan, “PCA initialization for approximate message passing in rotationally invariant models,” [Online] Available: https://arxiv.org/abs/2106.02356.
  • [15] M. Opper, B. Çakmak, and O. Winther, “A theory of solving TAP equations for Ising models with general invariant random matrices,” J. Phys. A: Math. Theor., vol. 49, no. 11, p. 114002, Feb. 2016.
  • [16] B. Çakmak, M. Opper, O. Winther, and B. H. Fleury, “Dynamical functional theory for compressed sensing,” in Proc. 2017 IEEE Int. Symp. Inf. Theory, Aachen, Germany, Jun. 2017, pp. 2143–2147.
  • [17] K. Takeuchi, “Bayes-optimal convolutional AMP,” IEEE Trans. Inf. Theory, vol. 67, no. 7, pp. 4405–4428, Jul. 2021.
  • [18] ——, “Convolutional approximate message-passing,” IEEE Signal Process. Lett., vol. 27, pp. 416–420, 2020.
  • [19] L. Liu, S. Huang, and B. M. Kurkoski, “Memory approximate message passing,” in Proc. 2021 IEEE Int. Symp. Inf. Theory, Jul. 2021, pp. 1379–1384.
  • [20] N. Skuratovs and M. Davies, “Compressed sensing with upscaled vector approximate message passing,” [Online] Available:
    https://arxiv.org/abs/2011.01369.
  • [21] Z. Fan, “Approximate message passing algorithms for rotationally invariant matrices,” Ann. Statist., vol. 50, no. 1, pp. 197–224, Feb. 2022.
  • [22] R. Venkataramanan, K. Kögler, and M. Mondelli, “Estimation in rotationally invariant generalized linear models via approximate message passing,” [Online] Available: https://arxiv.org/abs/2112.04330.
  • [23] K. Takeuchi, “On the convergence of orthogonal/vector AMP: Long-memory message-passing strategy,” [Online] Available:
    https://arxiv.org/abs/2111.05522.
  • [24] L. Liu, S. Huang, and B. M. Kurkoski, “Sufficient statistic memory AMP,” Dec. 2021, [Online] Available: https://arxiv.org/abs/2112.15327.
  • [25] R. Berthier, A. Montanari, and P.-M. Nguyen, “State evolution for approximate message passing with non-separable functions,” Inf. Inference: A Journal of the IMA, 2019, doi:10.1093/imaiai/iay021.
  • [26] Y. Ma, C. Rush, and D. Baron, “Analysis of approximate message passing with non-separable denoisers and Markov random field priors,” IEEE Trans. Inf. Theory, vol. 65, no. 11, pp. 7367–7389, Nov. 2019.
  • [27] A. K. Fletcher, P. Pandit, S. Rangan, S. Sarkar, and P. Schniter, “Plug-in estimation in high-dimensional linear inverse problems: A rigorous analysis,” J. Stat. Mech.: Theory Exp., vol. 2019, pp. 124021-1–15, Dec. 2019.
  • [28] A. M. Tulino and S. Verdú, Random Matrix Theory and Wireless Communications.   Hanover, MA USA: Now Publishers Inc., 2004.
  • [29] F. Hiai and D. Petz, The Semicircle Law, Free Random Variables and Entropy.   Providence, RI, USA: Amer. Math. Soc., 2000.
  • [30] K. Takeuchi, “On the convergence of convolutional approximate message-passing for Gaussian signaling,” IEICE Trans. Fundamentals., vol. E105-A, no. 2, pp. 100–108, Feb. 2022.