
Learned Multi-layer Residual Sparsifying Transform Model for Low-dose CT Reconstruction

Xikai Yang, Xuehang Zheng, Yong Long, Saiprasad Ravishankar. X. Yang, X. Zheng, and Y. Long are with the University of Michigan–Shanghai Jiao Tong University Joint Institute, Shanghai Jiao Tong University, Shanghai 200240, China (e-mail: yangxk@sjtu.edu.cn; zhxhang@sjtu.edu.cn; yong.long@sjtu.edu.cn). S. Ravishankar is with the Department of Computational Mathematics, Science and Engineering and the Department of Biomedical Engineering, Michigan State University, East Lansing, MI 48824, USA (e-mail: ravisha3@msu.edu). Yong Long is the corresponding author.
Abstract

Signal models based on sparse representation have received considerable attention in recent years. Compared to synthesis dictionary learning, sparsifying transform learning involves highly efficient sparse coding and operator update steps. In this work, we propose a Multi-layer Residual Sparsifying Transform (MRST) learning model wherein the transform domain residuals are jointly sparsified over layers. In particular, the transforms for the deeper layers exploit the more intricate properties of the residual maps. We investigate the application of the learned MRST model for low-dose CT reconstruction using Penalized Weighted Least Squares (PWLS) optimization. Experimental results on Mayo Clinic data show that the MRST model outperforms conventional methods such as FBP and PWLS methods based on an edge-preserving (EP) regularizer and a single-layer transform (ST) model, especially in preserving subtle details.

Index Terms:
Low-dose CT, Statistical image reconstruction, Sparse representation, Transform learning, Unsupervised learning.

I Introduction

Signal models exploiting sparsity have been shown to be useful in a variety of applications such as compression, restoration, denoising, and reconstruction. Natural signals can be modeled as sparse in a synthesis dictionary (i.e., represented as linear combinations of a few dictionary atoms or columns) or in a sparsifying transform domain.

Transforms such as wavelets [1] and the discrete cosine transform (DCT) are well-known to sparsify images. Methods such as synthesis dictionary learning [2] and analysis dictionary learning [3] allow adapting the model to the data. These dictionary learning problems are NP-hard in general and involve computationally expensive alternating-type algorithms that limit applicability to large-scale data. In contrast, the recently proposed sparsifying transform learning approaches [4] involve exact and highly efficient updates. In particular, the transform model approximates the signal as sparse in a transformed domain. Adopting a multi-layer sparsifying transform model enables sparsifying an input image successively over layers [5], creating a rich and more complete sparsity model, which forms the core of this work.

One of the most important applications of such image models is medical image reconstruction. In particular, an important problem in computed tomography (CT) is reducing the X-ray exposure to patients while maintaining good image reconstruction quality. A conventional method for CT reconstruction is the analytical filtered back-projection (FBP) [6]. However, FBP image quality degrades severely when the radiation dose is reduced. In contrast, model-based image reconstruction (MBIR) achieves comparable or better image quality at reduced doses [7].

A typical MBIR method for low-dose CT (LDCT) is the penalized weighted least squares (PWLS) approach. The cost function for PWLS includes a weighted quadratic data-fidelity term and a penalty term or regularizer capturing prior information or a model of the object [8, 9]. Recent works have shown promising LDCT reconstruction quality by incorporating data-driven models into the regularizer, where the models are learned from datasets of images or image patches. In particular, PWLS reconstruction with adaptive sparsifying transform-based regularization has shown promise for tomographic reconstruction [10, 11, 12]. Recent work has also shown that such learned models may generalize better to new data than supervised deep learning schemes [13]. The adaptive transform-based image reconstruction algorithms can exploit a variety of image models [10, 11, 14] learned in an unsupervised manner, and involve efficient closed-form solutions for sparse coding.

In this work, we propose a new formulation and algorithm for learning a multi-layer transform model [5], where the transform domain residuals (the difference between transformed data and their sparse approximations) are successively sparsified over several layers. We refer to the model as a multi-layer residual sparsifying transform (MRST) model. The transforms are learned over several layers from images to jointly minimize the transform domain residuals across layers, while enforcing sparsity conditions in each layer. Importantly, the filters in deeper layers can help better exploit finer features (e.g., edges and correlations) in the residual maps. We investigate the performance of the (unsupervised) learned MRST model for LDCT reconstruction using PWLS. We propose efficient alternating minimization algorithms for both learning and reconstruction. Experimental results on Mayo Clinic data illustrate that the learned MRST model outperforms the conventional FBP as well as PWLS methods based on the non-adaptive edge-preserving (EP) regularizer and a learned single-layer transform model, especially in preserving subtle details. In the following sections, we first discuss the proposed model and formulation, followed by our algorithms and experimental results on Mayo Clinic data.

II Problem Formulation

Here, we will introduce the proposed general multi-layer learning framework and the formulation for LDCT image reconstruction. The MRST learning cost and constraints are shown in Problem (P0), which is an extension of simple single-layer transform learning.

\min_{\{\mathbf{\Omega}_l,\mathbf{Z}_l\}} \sum_{l=1}^{L} \Big\{ \|\mathbf{\Omega}_l\mathbf{R}_l-\mathbf{Z}_l\|_F^2 + \eta_l^2\|\mathbf{Z}_l\|_0 \Big\}   (P0)
\mathrm{s.t.}\ \ \mathbf{R}_l = \mathbf{\Omega}_{l-1}\mathbf{R}_{l-1}-\mathbf{Z}_{l-1},\ 2\leq l\leq L,\quad \mathbf{\Omega}_l^T\mathbf{\Omega}_l=\mathbf{I},\ \forall l.

Here, $\{\mathbf{\Omega}_l\in\mathbb{R}^{p\times p}\}$ and $\{\mathbf{Z}_l\in\mathbb{R}^{p\times N}\}$ denote the learned transforms and sparse coefficient maps for layers $1\leq l\leq L$. Parameter $\eta_l$ controls the sparsity level of $\mathbf{Z}_l$, penalized via the $\ell_0$ "norm". The residual maps $\{\mathbf{R}_l\in\mathbb{R}^{p\times N}\}$ are defined recursively over layers, with $\mathbf{R}_1$ denoting the input training data. Here, we assume $\mathbf{R}_1$ to be a matrix whose columns are (vectorized) patches drawn from image data sets. The unitarity constraints on $\{\mathbf{\Omega}_l\}$ enable closed-form solutions for the sparse coefficient and transform update steps in our algorithms. The learned MRST model can then be used to construct a data-driven regularizer in PWLS, as shown in Problem (P1).
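To make the recursive structure of (P0) concrete, the following is a minimal NumPy sketch of the residual-map recursion and the MRST learning objective. The function and variable names (residual_maps, mrst_objective, Omegas, Zs) are illustrative and not taken from the authors' code.

```python
# A minimal sketch, assuming the variables of (P0) are stored as NumPy arrays.
import numpy as np

def residual_maps(R1, Omegas, Zs):
    """Return [R_1, ..., R_L] with R_l = Omega_{l-1} R_{l-1} - Z_{l-1}, as in (P0).

    R1     : (p, N) matrix of vectorized training patches.
    Omegas : list of L unitary (p, p) transforms.
    Zs     : list of L (p, N) sparse coefficient maps.
    """
    R = [R1]
    for l in range(1, len(Omegas)):            # builds R_2, ..., R_L
        R.append(Omegas[l - 1] @ R[l - 1] - Zs[l - 1])
    return R

def mrst_objective(R1, Omegas, Zs, etas):
    """sum_l ||Omega_l R_l - Z_l||_F^2 + eta_l^2 ||Z_l||_0, the cost in (P0)."""
    R = residual_maps(R1, Omegas, Zs)
    return sum(np.linalg.norm(W @ Rl - Zl, 'fro')**2 + eta**2 * np.count_nonzero(Zl)
               for W, Rl, Zl, eta in zip(Omegas, R, Zs, etas))
```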

\min_{\mathbf{x}\geq\mathbf{0}} \frac{1}{2}\|\mathbf{y}-\mathbf{A}\mathbf{x}\|^2_{\mathbf{W}} + \beta\,\mathsf{S}(\mathbf{x}),   (P1)
\mathsf{S}(\mathbf{x}) \triangleq \min_{\{\mathbf{Z}_l\}} \sum_{l=1}^{L} \Big\{ \|\mathbf{\Omega}_l\mathbf{R}_l-\mathbf{Z}_l\|^2_F + \gamma_l^2\|\mathbf{Z}_l\|_0 \Big\}
\mathrm{s.t.}\ \ \mathbf{R}_l = \mathbf{\Omega}_{l-1}\mathbf{R}_{l-1}-\mathbf{Z}_{l-1},\ 2\leq l\leq L,\quad \mathbf{R}_1^j = \mathbf{P}^j\mathbf{x},\ \forall j.

In particular, we reconstruct the image $\mathbf{x}\in\mathbb{R}^{N_p}$ from noisy sinogram data $\mathbf{y}\in\mathbb{R}^{N_d}$ by solving (P1), where $\mathbf{A}\in\mathbb{R}^{N_d\times N_p}$ is the system matrix of the CT scan and $\mathbf{W}=\mathsf{diag}\{w_i\}\in\mathbb{R}^{N_d\times N_d}$ is the diagonal weighting matrix whose elements are the estimated inverse variances of the measurements $y_i$. Operator $\mathbf{P}^j\in\mathbb{R}^{p\times N_p}$ extracts the $j$th patch of $\mathbf{x}$ as $\mathbf{P}^j\mathbf{x}$. The $j$th columns of $\mathbf{R}_l$ and $\mathbf{Z}_l$ are denoted $\mathbf{R}_l^j$ and $\mathbf{Z}_l^j$, respectively. The non-negative parameters $\{\gamma_l\}$ control the sparsity of the coefficient maps in the different layers, and $\beta>0$ controls the relative trade-off between the data-fidelity term and the regularizer.

III Algorithms for Learning and Reconstruction

III-A Transform Learning Algorithm

We propose an exact block coordinate descent (BCD) algorithm for the nonconvex Problem (P0) that cycles over updating each $\mathbf{Z}_l$ for $1\leq l\leq L$ (sparse coding step) and each $\mathbf{\Omega}_l$ for $1\leq l\leq L$ (transform update step). In each step, the remaining variables (those not being optimized) are kept fixed. The exact BCD algorithm ensures that the objective in (P0) is monotone decreasing over iterations and converges. In particular, under the unitarity condition on the transforms, every subproblem in the block coordinate descent minimization can be solved exactly. We initialize the algorithm with the 2D DCT for $\mathbf{\Omega}_1$ and the identity matrix for $\{\mathbf{\Omega}_l\}$, $l\geq 2$, and with all-zero matrices for $\{\mathbf{Z}_l\}$. Since the residuals are defined recursively in (P0), to simplify the algorithmic description we first define the matrices $\mathbf{B}_p^q$ $(p<q)$ in (1), which can be regarded as backpropagation matrices from the $q$th to the $p$th layer.

\mathbf{B}_p^q = \mathbf{\Omega}_{p+1}^T\mathbf{Z}_{p+1} + \mathbf{\Omega}_{p+1}^T\mathbf{\Omega}_{p+2}^T\mathbf{Z}_{p+2} + \dots + \mathbf{\Omega}_{p+1}^T\mathbf{\Omega}_{p+2}^T\cdots\mathbf{\Omega}_q^T\mathbf{Z}_q = \sum_{k=p+1}^{q}\bigg(\prod_{s=p+1}^{k}\mathbf{\Omega}_s^T\bigg)\mathbf{Z}_k   (1)
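The matrices $\mathbf{B}_p^q$ in (1) can be accumulated with a running product of transposed transforms. Below is a small sketch under the same illustrative naming as above (Omegas and Zs are 0-indexed Python lists for layers $1,\ldots,L$).

```python
# A sketch of the backpropagation matrices B_p^q in (1); names are illustrative.
import numpy as np

def backprop_matrix(Omegas, Zs, p, q):
    """B_p^q = sum_{k=p+1}^{q} (Omega_{p+1}^T ... Omega_k^T) Z_k, with 1-indexed layers p < q."""
    prod = np.eye(Omegas[0].shape[0])      # running product Omega_{p+1}^T ... Omega_k^T
    B = np.zeros_like(Zs[0])
    for k in range(p + 1, q + 1):          # k = p+1, ..., q
        prod = prod @ Omegas[k - 1].T      # append Omega_k^T to the product
        B = B + prod @ Zs[k - 1]           # accumulate the k-th term
    return B
```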

III-A1 Sparse Coding Step for $\mathbf{Z}_l$

Here, we solve (P0) for 𝐙l\mathbf{Z}_{l} with all other variables fixed. The corresponding subproblem is as in (2).

\min_{\mathbf{Z}_l} \sum_{i=l}^{L} \Big\{ \|\mathbf{\Omega}_i\mathbf{R}_i-\mathbf{Z}_i\|_F^2 \Big\} + \eta_l^2\|\mathbf{Z}_l\|_0   (2)

Although the cost function is nonconvex and the residual maps also depend on $\mathbf{Z}_l$ (as in (P0)), we can exploit the unitarity of the transforms to rewrite Problem (2) as $\min_{\mathbf{Z}_l} (L-l+1)\,\|\mathbf{Z}_l-(\mathbf{\Omega}_l\mathbf{R}_l-\frac{1}{L-l+1}\sum_{i=l+1}^{L}\mathbf{B}_l^i)\|_F^2+\eta_l^2\|\mathbf{Z}_l\|_0$. This problem has a similar form to the single-transform sparse coding problem [4], and the optimal solution $\hat{\mathbf{Z}}_l$ is given in (3), where $H_{\eta}(\cdot)$ denotes the hard-thresholding operator that sets elements with magnitude less than the threshold $\eta$ to zero.

\hat{\mathbf{Z}}_l = \begin{cases} H_{\eta_l/\sqrt{L-l+1}}\Big(\mathbf{\Omega}_l\mathbf{R}_l - \frac{1}{L-l+1}\sum_{i=l+1}^{L}\mathbf{B}_l^i\Big), & 1\leq l\leq L-1,\\ H_{\eta_L}(\mathbf{\Omega}_L\mathbf{R}_L), & l=L. \end{cases}   (3)
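The sparse coding update (3) is a single hard-thresholding step. A minimal sketch follows, reusing the backprop_matrix helper from the earlier sketch; hard_threshold and update_Zl are illustrative names.

```python
# A sketch of the exact sparse coding update (3); assumes backprop_matrix from above.
import numpy as np

def hard_threshold(X, t):
    """H_t(.): zero out entries with magnitude below the threshold t."""
    Y = X.copy()
    Y[np.abs(Y) < t] = 0.0
    return Y

def update_Zl(Omegas, Zs, R, etas, l, L):
    """Update Z_l (layer l, 1-indexed) with all other variables fixed, per (3)."""
    if l == L:
        return hard_threshold(Omegas[L - 1] @ R[L - 1], etas[L - 1])
    c = L - l + 1
    B_sum = sum(backprop_matrix(Omegas, Zs, l, i) for i in range(l + 1, L + 1))
    return hard_threshold(Omegas[l - 1] @ R[l - 1] - B_sum / c,
                          etas[l - 1] / np.sqrt(c))
```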

III-A2 Transform Update Step for $\mathbf{\Omega}_l$

Here, we fix $\{\mathbf{Z}_l\}$ and all transforms except the target $\mathbf{\Omega}_l$ in (P0), and solve the following subproblem:

\min_{\mathbf{\Omega}_l} \sum_{i=l}^{L} \Big\{ \|\mathbf{\Omega}_i\mathbf{R}_i-\mathbf{Z}_i\|_F^2 \Big\} \quad \mathrm{s.t.}\quad \mathbf{\Omega}_l^T\mathbf{\Omega}_l=\mathbf{I}.   (4)

Incorporating the recursive dependence of the residual maps on the transforms (as in (P0)) along with the unitarity of the transforms simplifies the solution to (4). Denoting the full singular value decomposition (SVD) of the matrix $\mathbf{G}_l$ in (5) by $\mathbf{U}_l\mathbf{\Sigma}_l\mathbf{V}_l^T$, the optimal solution to (4) is $\hat{\mathbf{\Omega}}_l=\mathbf{V}_l\mathbf{U}_l^T$.

\mathbf{G}_l = \begin{cases} \mathbf{R}_l\Big(\mathbf{Z}_l + \frac{1}{L-l+1}\sum_{i=l+1}^{L}\mathbf{B}_l^i\Big)^T, & 1\leq l\leq L-1,\\ \mathbf{R}_L\mathbf{Z}_L^T, & l=L. \end{cases}   (5)
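The transform update (4)-(5) reduces to forming $\mathbf{G}_l$ and taking its full SVD. A minimal sketch, again reusing backprop_matrix from the earlier sketch; the function name is illustrative.

```python
# A sketch of the exact transform update: Omega_l = V_l U_l^T from the SVD of G_l in (5).
import numpy as np

def update_Omegal(Omegas, Zs, R, l, L):
    """Update Omega_l (layer l, 1-indexed) with all other variables fixed."""
    if l == L:
        G = R[L - 1] @ Zs[L - 1].T
    else:
        c = L - l + 1
        B_sum = sum(backprop_matrix(Omegas, Zs, l, i) for i in range(l + 1, L + 1))
        G = R[l - 1] @ (Zs[l - 1] + B_sum / c).T
    U, _, Vt = np.linalg.svd(G)            # full SVD: G = U diag(sigma) V^T
    return Vt.T @ U.T                      # optimal Omega_l = V U^T
```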

III-B Image Reconstruction Algorithm

The proposed PWLS-MRST algorithm for low-dose CT image reconstruction exploits the learned MRST model. We reconstruct the image by solving the PWLS problem (P1). We propose a block coordinate descent (BCD) algorithm for (P1) that cycles over updating the image $\mathbf{x}$ and each of the sparse coefficient maps $\mathbf{Z}_l$ for $1\leq l\leq L$.

III-B1 Image Update Step for $\mathbf{x}$

First, with the sparse coefficient maps $\{\mathbf{Z}_l\}$ fixed, we update $\mathbf{x}$ in (P1) by solving the following subproblem:

\min_{\mathbf{x}\geq\mathbf{0}} \frac{1}{2}\|\mathbf{y}-\mathbf{A}\mathbf{x}\|^2_{\mathbf{W}} + \beta\,\mathsf{S}_2(\mathbf{x}),   (6)

where $\mathsf{S}_2(\mathbf{x}) \triangleq \sum_{l=1}^{L}\|\mathbf{\Omega}_l\mathbf{R}_l-\mathbf{Z}_l\|_F^2$, with $\mathbf{R}_l=\mathbf{\Omega}_{l-1}\mathbf{R}_{l-1}-\mathbf{Z}_{l-1}$ for $2\leq l\leq L$, and $\mathbf{R}_1^j=\mathbf{P}^j\mathbf{x}$. We use the efficient relaxed linearized augmented Lagrangian method with ordered subsets (relaxed OS-LALM) [15] to solve (6). The algorithmic details are shown in Algorithm 1. In each inner iteration of the OS-LALM method, we update the image $M$ times, once for each of the $M$ ordered subsets. The matrices $\mathbf{A}_m$ and $\mathbf{W}_m$ and the vector $\mathbf{y}_m$ are the sub-matrices of $\mathbf{A}$ and $\mathbf{W}$ and the sub-vector of $\mathbf{y}$ for the $m$th subset, respectively. Matrix $\mathbf{D}_{\mathbf{A}}$ is a diagonal majorizing matrix of $\mathbf{A}^T\mathbf{W}\mathbf{A}$. To accelerate the algorithm, we precompute the Hessian of $\mathsf{S}_2(\mathbf{x})$, denoted $\mathbf{D}_{\mathsf{S}_2}$ in (8); the gradient of $\mathsf{S}_2(\mathbf{x})$ is given in (7).

\nabla\mathsf{S}_2(\mathbf{x}) = 2\beta\sum_{j=1}^{N}(\mathbf{P}^T)^j\bigg\{L\,\mathbf{P}^j\mathbf{x} - \sum_{k=1}^{L}(\mathbf{B}_0^k)^j\bigg\}   (7)
\mathbf{D}_{\mathsf{S}_2} \triangleq \nabla^2\mathsf{S}_2(\mathbf{x}) = 2L\beta\sum_{j=1}^{N}(\mathbf{P}^T)^j\mathbf{P}^j   (8)
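The gradient (7) and the diagonal Hessian (8) only involve patch extraction and its adjoint. The sketch below assumes patches are addressed through an integer index array patch_idx of shape (p, N), so that $\mathbf{P}^j\mathbf{x}$ equals x[patch_idx[:, j]]; this indexing scheme and the helper names are illustrative assumptions, and backprop_matrix is the helper sketched earlier.

```python
# A sketch of the regularizer gradient (7) and diagonal Hessian (8).
import numpy as np

def grad_S2(x, Omegas, Zs, patch_idx, beta):
    """2*beta * sum_j (P^T)^j { L P^j x - sum_k (B_0^k)^j }, as in (7)."""
    L = len(Omegas)
    Px = x[patch_idx]                                    # (p, N): all patches P^j x
    Bsum = sum(backprop_matrix(Omegas, Zs, 0, k) for k in range(1, L + 1))
    V = L * Px - Bsum                                    # bracketed term, per patch
    g = np.zeros_like(x)
    np.add.at(g, patch_idx, V)                           # adjoint: accumulate patches into the image
    return 2.0 * beta * g

def hessian_S2_diag(x, patch_idx, L, beta):
    """Diagonal of D_{S2} = 2*L*beta * sum_j (P^T)^j P^j (patch-overlap counts)."""
    counts = np.zeros_like(x)
    np.add.at(counts, patch_idx, 1.0)
    return 2.0 * L * beta * counts
```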

III-B2 Sparse Coding Step for $\mathbf{Z}_l$

Similar to the sparse coding step in transform learning, the solution of (P1) with respect to each sparse coefficient map $\mathbf{Z}_l$, i.e., the solution of (9), is given in (10).

\min_{\mathbf{Z}_l} \sum_{i=l}^{L} \Big\{ \|\mathbf{\Omega}_i\mathbf{R}_i-\mathbf{Z}_i\|_F^2 \Big\} + \gamma_l^2\|\mathbf{Z}_l\|_0   (9)
\mathrm{s.t.}\ \ \mathbf{R}_i = \mathbf{\Omega}_{i-1}\mathbf{R}_{i-1}-\mathbf{Z}_{i-1},\ l\leq i\leq L
\hat{\mathbf{Z}}_l = H_{\gamma_l/\sqrt{L-l+1}}\bigg\{\mathbf{\Omega}_l\mathbf{R}_l - \frac{1}{L-l+1}\sum_{i=l+1}^{L}\mathbf{B}_l^i\bigg\}   (10)
Algorithm 1 Image Reconstruction Algorithm
Input: initial image $\tilde{\mathbf{x}}^{(0)}$, all-zero initial $\{\tilde{\mathbf{Z}}_l^{(0)}\}$, pre-learned $\{\mathbf{\Omega}_l\}$, thresholds $\{\gamma_l\}$, $\alpha=1.999$, $\mathbf{D}_{\mathbf{A}}$, $\mathbf{D}_{\mathsf{S}_2}$, number of outer iterations $T_O$, number of inner iterations $N$, and number of subsets $M$.
Output: reconstructed image $\tilde{\mathbf{x}}^{(T_O)}$.
for $t=0,1,2,\cdots,T_O-1$ do
     1) Image Update: with $\{\tilde{\mathbf{Z}}_l^{(t)}\}$ fixed. Initialization: $\rho=1$, $\mathbf{x}^{(0)}=\tilde{\mathbf{x}}^{(t)}$, $\mathbf{g}^{(0)}=\bm{\zeta}^{(0)}=M\mathbf{A}_M^T\mathbf{W}_M(\mathbf{A}_M\mathbf{x}^{(0)}-\mathbf{y}_M)$, and $\mathbf{h}^{(0)}=\mathbf{D}_{\mathbf{A}}\mathbf{x}^{(0)}-\bm{\zeta}^{(0)}$.
     for $n=0,1,2,\cdots,N-1$ do
          for $m=0,1,2,\cdots,M-1$ do
               $r=nM+m$
               $\mathbf{s}^{(r+1)}=\rho(\mathbf{D}_{\mathbf{A}}\mathbf{x}^{(r)}-\mathbf{h}^{(r)})+(1-\rho)\mathbf{g}^{(r)}$
               $\mathbf{x}^{(r+1)}=\big[\mathbf{x}^{(r)}-(\rho\mathbf{D}_{\mathbf{A}}+\mathbf{D}_{\mathsf{S}_2})^{-1}(\mathbf{s}^{(r+1)}+\nabla\mathsf{S}_2(\mathbf{x}^{(r)}))\big]_+$
               $\bm{\zeta}^{(r+1)}\triangleq M\mathbf{A}_m^T\mathbf{W}_m(\mathbf{A}_m\mathbf{x}^{(r+1)}-\mathbf{y}_m)$
               $\mathbf{g}^{(r+1)}=\frac{\rho}{\rho+1}(\alpha\bm{\zeta}^{(r+1)}+(1-\alpha)\mathbf{g}^{(r)})+\frac{1}{\rho+1}\mathbf{g}^{(r)}$
               $\mathbf{h}^{(r+1)}=\alpha(\mathbf{D}_{\mathbf{A}}\mathbf{x}^{(r+1)}-\bm{\zeta}^{(r+1)})+(1-\alpha)\mathbf{h}^{(r)}$
               decrease $\rho$.
          end for
     end for
     $\tilde{\mathbf{x}}^{(t+1)}=\mathbf{x}^{(NM)}$.
     2) Sparse Coding: with $\tilde{\mathbf{x}}^{(t+1)}$ fixed, for each $1\leq l\leq L$, update $\tilde{\mathbf{Z}}_l^{(t+1)}$ by (10).
end for
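For reference, the following is a condensed sketch of one outer iteration of Algorithm 1, combining the relaxed OS-LALM image update with the sparse coding step (10). The per-subset forward/back projector callables Am[m] and AmT[m], the statistical weights Wm[m], the measurement subsets y_sub[m], the diagonal arrays D_A and D_S2, and the simple rho_schedule are illustrative placeholders; the actual decreasing-rho rule is that of relaxed OS-LALM [15]. It reuses residual_maps, update_Zl, and grad_S2 from the earlier sketches.

```python
# A minimal sketch of one outer iteration of Algorithm 1 under the stated assumptions.
import numpy as np

def pwls_mrst_outer_iter(x, Zs, Omegas, y_sub, Am, AmT, Wm, D_A, D_S2,
                         patch_idx, beta, gammas, N_inner, M, alpha=1.999,
                         rho_schedule=lambda r: 1.0 / (1.0 + r)):
    L = len(Omegas)
    # 1) Image update with the sparse coefficient maps {Z_l} fixed
    zeta = M * AmT[M - 1](Wm[M - 1] * (Am[M - 1](x) - y_sub[M - 1]))
    g = zeta.copy()
    h = D_A * x - zeta
    for n in range(N_inner):
        for m in range(M):
            r = n * M + m
            rho = rho_schedule(r)                        # rho = 1 at r = 0, then decreasing
            s = rho * (D_A * x - h) + (1.0 - rho) * g
            x = x - (s + grad_S2(x, Omegas, Zs, patch_idx, beta)) / (rho * D_A + D_S2)
            x = np.maximum(x, 0.0)                       # [.]_+ enforces x >= 0
            zeta = M * AmT[m](Wm[m] * (Am[m](x) - y_sub[m]))
            g = rho / (rho + 1.0) * (alpha * zeta + (1.0 - alpha) * g) + g / (rho + 1.0)
            h = alpha * (D_A * x - zeta) + (1.0 - alpha) * h
    # 2) Sparse coding with the image x fixed: update each Z_l by (10) with thresholds gamma_l
    R1 = x[patch_idx]
    for l in range(1, L + 1):
        R = residual_maps(R1, Omegas, Zs)                # residuals depend on Z_1, ..., Z_{l-1}
        Zs[l - 1] = update_Zl(Omegas, Zs, R, gammas, l, L)
    return x, Zs
```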

IV Experimental Results

In this section, we evaluate the image reconstruction quality of the proposed PWLS-MRST algorithm and compare it with the conventional FBP method, the PWLS-EP (edge-preserving) method [16], and PWLS-ST based on a learned single-layer transform.

IV-A Transform Learning

To better understand the potential of the MRST model, we vary the number of layers and pre-learn transforms for ST, MRST2, MRST3, MRST5, and MRST7, which have 1, 2, 3, 5, and 7 layers, respectively. We used 7 slices of the Mayo Clinic data to train the models. For each model, we ran 1000 to 2000 iterations of the learning algorithm to ensure convergence. Fig. 1 shows some of the learned transforms, with each transform matrix row displayed as a square patch for simplicity. The single-layer transform displays edge-like and directional structures that sparsify the image. With more layers, finer-level features are learned to sparsify transform domain residuals in deeper layers. Nonetheless, transforms in deep layers could be more easily contaminated by noise in the training data, since the main image features are successively filtered out over layers.
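The display in Fig. 1 simply reshapes each row of a learned transform into a square patch and tiles the patches into a grid. A small sketch (with matplotlib; the function name show_transform is illustrative):

```python
# A sketch of the visualization used in Fig. 1: rows of a (p, p) transform shown as patches.
import numpy as np
import matplotlib.pyplot as plt

def show_transform(Omega):
    p = Omega.shape[1]
    k = int(np.sqrt(p))                        # patch side length, e.g., 8 for 8x8 patches
    grid = np.vstack([np.hstack([Omega[i * k + j].reshape(k, k) for j in range(k)])
                      for i in range(k)])
    plt.imshow(grid, cmap='gray')
    plt.axis('off')
    plt.show()
```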

Figure 1: Pre-learned transforms for (a) ST, (b) MRST2, and (c) MRST3 (each transform matrix row is shown as a square patch).

IV-B Image Reconstruction

Figure 2: Comparison of reconstructions of L067-slice100 with FBP, PWLS-EP, PWLS-ST, PWLS-MRST2, PWLS-MRST3, PWLS-MRST5, and PWLS-MRST7, respectively, at incident photon intensity $I_0=10^4$. The display window is [800, 1200] HU.

Here, we simulated low-dose CT measurements from full-dose CT images in the Mayo Clinic dataset with a GE 2D LightSpeed fan-beam geometry corresponding to a monoenergetic source with $10^4$ incident photons per ray and no scatter. We ran 100 iterations of the PWLS-EP algorithm with FBP reconstructions as initializations, using the relaxed OS-LALM algorithm with 4 ordered subsets and regularization parameter $\beta=2^{15.5}$. For the MRST models, we used the OS-LALM algorithm for the image update step with 2 inner iterations and 4 subsets. We used $T_O=1000$ outer iterations for ST and MRST2, and $T_O=1500$ for MRST3, MRST5, and MRST7. We set the regularization parameters (after tuning over ranges of values) as $(\beta,\gamma)=(7\times 10^4,\,20)$ for ST, $(\beta,\gamma_1,\gamma_2)=(3\times 10^4,\,30,\,12)$ for MRST2, $(\beta,\gamma_1,\gamma_2,\gamma_3)=(3\times 10^4,\,30,\,12,\,10)$ for MRST3, $(\beta,\gamma_1,\gamma_2,\gamma_3,\gamma_4,\gamma_5)=(4\times 10^4,\,30,\,20,\,10,\,10,\,10)$ for MRST5, and $(\beta,\gamma_1,\gamma_2,\gamma_3,\gamma_4,\gamma_5,\gamma_6,\gamma_7)=(7\times 10^4,\,30,\,12,\,10,\,10,\,10,\,10,\,10)$ for MRST7, respectively.

To compare the performance of the various models quantitatively, we compute the root mean square error (RMSE), peak signal-to-noise ratio (PSNR), and structural similarity index (SSIM) of the reconstructions in a region of interest (ROI). The RMSE in Hounsfield units (HU) between the ground truth image and the reconstructed image is defined as RMSE $=\sqrt{\sum_{i\in\text{ROI}}(\hat{x}_i-x_i^*)^2/N_{\text{ROI}}}$, where $\hat{x}_i$ and $x_i^*$ denote the pixel intensities of the reconstructed and ground truth images, respectively, and $N_{\text{ROI}}$ is the number of pixels in the ROI. The ROI here was a circular region around the image center containing all the phantom tissues. Since the full-dose Mayo Clinic CT images still contain considerable noise, we cannot rely solely on RMSE and PSNR as metrics of image quality, as they are affected by this noise; SSIM helps better evaluate the preservation of structural details in the reconstructed image.
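For concreteness, a small sketch of the ROI-restricted RMSE described above; the circular-mask helper and the names are illustrative.

```python
# A sketch of the RMSE over a circular ROI, as defined in the text.
import numpy as np

def circular_roi_mask(shape, radius):
    """Boolean mask of a circle of the given radius around the image center."""
    ny, nx = shape
    yy, xx = np.ogrid[:ny, :nx]
    cy, cx = (ny - 1) / 2.0, (nx - 1) / 2.0
    return (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2

def rmse_roi(xhat, xtrue, mask):
    """RMSE in HU over the ROI between reconstruction xhat and ground truth xtrue."""
    diff = (xhat - xtrue)[mask]
    return np.sqrt(np.mean(diff ** 2))
```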

We conduct experiments on three slices (L067-slice100, L192-slice100, L506-slice100) of the Mayo Clinic data. Fig. 2 shows the reconstructions of L067-slice100 using FBP, PWLS-EP, PWLS-ST, PWLS-MRST2, PWLS-MRST3, PWLS-MRST5, and PWLS-MRST7, respectively, at incident photon intensity $I_0=10^4$. TABLE I lists the RMSE, PSNR, and SSIM values of the reconstructions of the three test slices. The two-layer model (MRST2) provides the best RMSE and PSNR values among the methods. However, under the SSIM criterion, MRST5 and MRST7 outperform ST and MRST2. So which MRST model is better? Inspecting the reconstructed images, we see that although MRST2 and ST achieve lower RMSE and higher PSNR values than MRST5 and MRST7, they sacrifice some sharpness in the central region and suffer from loss of detail. The deeper models are more effective at preserving subtle features, which is clearly more important for clinical diagnosis. Furthermore, after considerable parameter tuning, we observed that the deeper models offer more stable image quality as $\beta$ is varied, i.e., they are more robust to oversmoothing.

TABLE I: RMSE (HU), PSNR (dB), and SSIM of reconstructions with FBP, PWLS-EP, PWLS-ST, PWLS-MRST2, PWLS-MRST3, PWLS-MRST5, and PWLS-MRST7, for three slices of Mayo Clinic data at incident photon intensity $I_0=1\times 10^4$.
                     FBP     EP      ST      MRST2   MRST3   MRST5   MRST7
L067 slice100
  RMSE (HU)          101.1   34.2    30.3    29.1    29.8    30.5    31.4
  PSNR (dB)          27.4    36.8    37.8    38.2    38.0    37.8    37.5
  SSIM               0.359   0.728   0.724   0.731   0.731   0.732   0.731
L192 slice100
  RMSE (HU)          60.0    26.6    23.5    22.2    22.3    23.0    23.8
  PSNR (dB)          32.4    38.9    39.9    40.4    40.4    40.1    39.8
  SSIM               0.440   0.786   0.790   0.798   0.800   0.807   0.810
L506 slice100
  RMSE (HU)          56.8    32.6    27.1    26.0    26.2    27.5    28.7
  PSNR (dB)          32.8    37.6    39.2    39.6    39.5    39.1    38.7
  SSIM               0.489   0.780   0.790   0.791   0.793   0.797   0.802

V Conclusion

This paper proposes a general framework for Multi-layer Residual Sparsifying Transform (MRST) learning wherein the transform domain residual maps over several layers are jointly sparsified. Our work then applies MRST learning to low-dose CT (LDCT) image reconstruction by using a PWLS approach with a learned MRST-based regularizer. Experimental results illustrate the promising performance of the multi-layer scheme over single-layer learned sparsifying transforms. Learned MRST models also offer significant improvement over typical nonadaptive methods. In the future, we will investigate additional strategies such as pooling operations to reduce noise over layers.

VI Acknowledgment

The authors thank Dr. Cynthia McCollough, the Mayo Clinic, the American Association of Physicists in Medicine, and the National Institute of Biomedical Imaging and Bioengineering for providing the Mayo Clinic data.

References

  • [1] Y. Pati, R. Rezaiifar, and P. Krishnaprasad, “Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition,” in Asilomar Conf. on Signals, Systems and Computers, 1993, pp. 40–44 vol.1.
  • [2] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Trans. Sig. Proc., vol. 54, no. 11, pp. 4311–4322, Nov. 2006.
  • [3] R. Rubinstein, T. Peleg, and M. Elad, “Analysis K-SVD: A dictionary-learning algorithm for the analysis sparse model,” IEEE Trans. Sig. Proc., vol. 61, no. 3, pp. 661–677, Feb. 2013.
  • [4] S. Ravishankar and Y. Bresler, “Learning sparsifying transforms,” IEEE Trans. Sig. Proc., vol. 61, no. 5, pp. 1072–1086, Mar. 2013.
  • [5] S. Ravishankar and B. Wohlberg, “Learning multi-layer transform models,” in Allerton Conf. on Comm., Control, and Computing, 2018, pp. 160–165.
  • [6] L. A. Feldkamp, L. C. Davis, and J. W. Kress, “Practical cone beam algorithm,” J. Opt. Soc. Am. A, vol. 1, no. 6, pp. 612–619, June 1984.
  • [7] I. A. Elbakri and J. A. Fessler, “Statistical image reconstruction for polyenergetic X-ray computed tomography,” IEEE Trans. Med. Imag., vol. 21, no. 2, pp. 89–99, Feb. 2002.
  • [8] K. Sauer and C. Bouman, “A local update strategy for iterative reconstruction from projections,” IEEE Trans. Sig. Proc., vol. 41, no. 2, pp. 534–548, Feb. 1993.
  • [9] J-B. Thibault, K. Sauer, C. Bouman, and J. Hsieh, “A three-dimensional statistical approach to improved image quality for multi-slice helical CT,” Med. Phys., vol. 34, no. 11, pp. 4526–4544, Nov. 2007.
  • [10] L. Pfister and Y. Bresler, “Model-based iterative tomographic reconstruction with adaptive sparsifying transforms,” in Proc. SPIE, 2014, vol. 9020, pp. 90200H–1–90200H–11.
  • [11] X. Zheng, S. Ravishankar, Y. Long, and J. A. Fessler, “PWLS-ULTRA: An efficient clustering and learning-based approach for low-dose 3D CT image reconstruction,” IEEE Trans. Med. Imag., vol. 37, no. 6, pp. 1498–1510, June 2018.
  • [12] I. Y. Chun, X. Zheng, Y. Long, and J. A. Fessler, “Efficient sparse-view X-ray CT reconstruction using 1\ell_{1} regularization with learned sparsifying transform,” in Proc. Intl. Mtg. on Fully 3D Image Recon. in Rad. and Nuc. Med, June 2017, pp. 115–119.
  • [13] S. Ye, S. Ravishankar, Y. Long, and J. A. Fessler, “SPULTRA: Low-dose CT image reconstruction with joint statistical and learned image models,” IEEE Trans. Med. Imag., (Early Access), Aug. 2019.
  • [14] W. Zhou, J-F. Cai, and H. Gao, “Adaptive tight frame based medical image reconstruction: a proof-of-concept study for computed tomography,” Inverse Prob., vol. 29, no. 12, pp. 125006, Dec. 2013.
  • [15] H. Nien and J. A. Fessler, “Relaxed linearized algorithms for faster X-ray CT image reconstruction,” IEEE Trans. Med. Imag., vol. 35, no. 4, pp. 1090–1098, Apr. 2016.
  • [16] J. H. Cho and J. A. Fessler, “Regularization designs for uniform spatial resolution and noise properties in statistical image reconstruction for 3D X-ray CT,” IEEE Trans. Med. Imag., vol. 34, no. 2, pp. 678–689, Feb. 2015.