
Diffusion Modeling with Domain-conditioned Prior Guidance for Accelerated MRI and qMRI Reconstruction

Wanyu Bian    Albert Jang    and Fang Liu This work was supported by the grants R21EB031185; R01AR081344; R01AR079442. W. Bian, A. Jang and F. Liu are with the Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital and Harvard Medical School, Charlestown, MA 02129 USA. (corresponding author: F. Liu, email:fliu12@mgh.harvard.edu)
Abstract

This study introduces a novel approach for image reconstruction based on a diffusion model conditioned on the native data domain. Our method is applied to multi-coil MRI and quantitative MRI (qMRI) reconstruction, leveraging the domain-conditioned diffusion model within the frequency and parameter domains, respectively. Prior MRI physics is used as an embedding in the diffusion model, enforcing data consistency to guide the training and sampling process: k-space encoding is characterized for MRI reconstruction, and MR signal modeling is leveraged for qMRI reconstruction. Furthermore, gradient descent optimization is incorporated into the diffusion steps, enhancing feature learning and improving denoising. The proposed method demonstrates significant promise, particularly for reconstructing images at high acceleration factors. Notably, it maintains high reconstruction accuracy and efficiency for static and quantitative MRI reconstruction across diverse anatomical structures. Beyond its immediate applications, this method provides potential generalization capability, making it adaptable to inverse problems across various domains.

Index Terms

Diffusion Model, DDPM, MRI, Quantitative MRI, Reconstruction

1 Introduction

Magnetic resonance imaging (MRI) stands as an indispensable, non-invasive imaging tool, pivotal in both medical diagnosis and clinical research. Though MRI delivers unparalleled diagnostic value, its imaging time is lengthy compared to other imaging modalities, limiting patient throughput. This limitation has galvanized innovations to accelerate the MRI process, all with the common goal of drastically reducing scan time without compromising image quality[1, 2]. Recently, deep learning has shown great potential in addressing this issue. Numerous techniques have been introduced that enhance the performance of optimization algorithms using finely tuned, sophisticated neural networks, achieving excellent results[3]. A substantial portion of these state-of-the-art methods utilize conditional models, adeptly converting undersampled data inputs into outputs that accurately mirror fully sampled data acquisitions [4, 5, 6, 7, 8, 9, 10, 11].

Quantitative magnetic resonance imaging (qMRI) provides quantitative measures of the physical parameters of tissues, offering additional information regarding their microstructural environment. This is typically accomplished by modeling the acquired MR signal and extracting the parameter of interest. Sufficiently characterizing the signal model requires multiple acquisitions, making it both time-consuming and costly, even with well-established acceleration methods. For example, quantifying the spin-lattice relaxation time (T_{1}) using the variable flip angle (vFA) model[12, 13] requires acquisitions at multiple flip angles, leading to impractical scan times for clinical settings. Recent advances in deep learning have enabled innovative solutions to accelerate qMRI. Methods such as MANTIS[14, 15], RELAX[16], MoDL-QSM[17], and DOPAMINE[18] have utilized supervised or self-supervised learning to enable rapid MR parameter mapping using undersampled k-space data.

The desire for more robust and efficient techniques in both MRI and qMRI has led to the development of innovative approaches, among which diffusion models[19, 20, 21, 22, 23] have recently shown promise. A notable advancement in this area is the emergence of Denoising Diffusion Probabilistic Models (DDPMs)[24, 25, 26]. These models represent a new class of generative models that achieve high sample quality without the need for adversarial training. DDPMs have quickly gained interest in MRI reconstruction due to their robustness, especially under distribution shifts. A few studies have explored the concept of DDPM-based MRI reconstruction[23, 21]. In these methods, DDPMs are trained to generate noisy MRI images, and reconstruction is achieved by iteratively learning to denoise at each diffusion step. This can be implemented through either unconditional or conditional DDPMs. However, to the best of our knowledge, no studies have investigated diffusion models for qMRI.

While diffusion methods have shown promising results, they are not without challenges. Some limitations include the omission of physical constraints during training[27], the lack of a compromise solution and optimal efficiency due to the sampling being initiated from a random noise image[28], and the reliance on fully sampled images for training[29]. In particular, DDPMs for MRI reconstruction generally start in the image domain, where the unknown data distribution from training images goes through repeated Gaussian diffusion processes that can be reversed by learned transition kernels. Applying the diffusion process in the image domain overlooks the underlying MRI physical model (i.e., k-space encoding) that is embedded in the measured k-space data. It can underperform at large distribution shifts due to changes in scan parameters or differences in anatomy between training and testing. Since the raw MRI measurement is acquired in the frequency domain (i.e., k-space), it can be beneficial to apply the diffusion process directly in the frequency domain rather than the image domain. Likewise, since qMRI focuses on the quantification of tissue parameters such as T_{1} and proton density, it can also be advantageous to define the diffusion model conditioned on the parameter domain for qMRI.

In this paper, we propose a novel and unified method that applies domain-conditioned diffusion models to accelerated static MRI and qMRI reconstruction, which we denote as Static Diffusion Modeling (DiMo) and Quantitative DiMo, respectively. The conditional diffusion process is defined in k-space for Static DiMo and in parameter space for Quantitative DiMo.

Three points that distinguish our method from previous works are: (1) The forward (diffusion) process and reverse (sampling) process are defined on the native data domain rather than image domain. This model is applied to multicoil static MRI and quantitative MRI reconstruction, which reflects domain-specific adaptation. (2) Prior physics knowledge is embedded in the diffusion process as a data consistency component for characterizing MRI k-space encoding and MR signal modeling for MRI and qMRI reconstruction. Gradient descent is also integrated in the diffusion steps to augment feature learning and promote denoising. (3) The proposed method preserves high reconstruction accuracy and efficiency under large undersampling rates for both static MRI and quantitative MRI reconstruction of different anatomies.

The rest of the paper is organized as follows: Section II presents the methods for Static DiMo and Quantitative DiMo. Section III describes the experimental settings. Section IV presents the experimental results. Section V discusses the limitations of the method and concludes the paper.

2 Methods

2.1 DDPM

DDPMs are generative models which are highly effective in learning complex data distributions. The forward diffusion process adds noise to the input data, gradually increasing the noise level until the data is transformed into pure Gaussian noise. This process systematically perturbs the structure of the data distribution. The reverse diffusion process, also known as the denoising process, is then applied to recover the original structure of the data from the perturbed data distribution.

Forward Process DDPM[24] presents the forward diffusion mechanism as a Markov chain, wherein Gaussian noise is incrementally introduced across several steps to yield a collection of perturbed samples. If we draw a data sample x_{0} from the uncorrupted data distribution with density q(x_{0}), the forward diffusion process modifies this data point through the integration of Gaussian perturbations at each time step t, which can be mathematically represented as

q(x_{1:T}|x_{0}):=\prod^{T}_{t=1}q(x_{t}|x_{t-1}), (1a)
q(x_{t}|x_{t-1}):=\mathcal{N}(x_{t};\sqrt{1-\beta_{t}}x_{t-1},\beta_{t}\textbf{I}) (1b)

Here, T is the total number of diffusion steps, while the noise scaling sequence \beta_{1},...,\beta_{T}\in[0,1) defines the magnitude of variance introduced at each step. For isotropic Gaussian noise \epsilon\sim\mathcal{N}(\textbf{0},\textbf{I}), at each time step,

x_{t}=\sqrt{1-\beta_{t}}x_{t-1}+\sqrt{\beta_{t}}\epsilon. (2)

With a large number of forward iterations, x_{t} converges to an isotropic Gaussian sample, so we have q(x_{T})\approx\mathcal{N}(x_{T};\textbf{0},\textbf{I}). Let \alpha_{t}:=1-\beta_{t} and \bar{\alpha}_{t}:=\prod^{t}_{s=1}\alpha_{s}; then, given the initial sampled data x_{0} and sampling step t, we can obtain

x_{t}(x_{0},\epsilon)=\sqrt{\bar{\alpha}_{t}}x_{0}+\sqrt{1-\bar{\alpha}_{t}}\epsilon. (3)
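For illustration, a minimal sketch of the closed-form forward sampling in (3) is given below, assuming a hypothetical linear noise schedule; the schedule values and tensor shapes are illustrative only, not the settings used in our experiments.

```python
import torch

# Assumed (illustrative) linear noise schedule beta_1, ..., beta_T.
T = 1000
betas = torch.linspace(1e-5, 1e-2, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)  # \bar{alpha}_t

def q_sample(x0: torch.Tensor, t: int) -> torch.Tensor:
    """Draw x_t ~ q(x_t | x_0) in a single step, following Eq. (3)."""
    eps = torch.randn_like(x0)
    return alpha_bars[t].sqrt() * x0 + (1.0 - alpha_bars[t]).sqrt() * eps
```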

Reverse Process The reverse process aims to retrieve x_{0} by stripping away the noise from x_{T}. Specifically, starting with a noise vector x_{T}\sim\mathcal{N}(x_{T};\textbf{0},\textbf{I}), we iteratively sample from the learnable transition kernel x_{t-1}\sim p_{\theta}(x_{t-1}|x_{t}) until t=1 through a learnable Markov chain in the reverse time direction.

p_{\theta}(x_{0:T}):=p(x_{T})\prod^{T}_{t=1}p_{\theta}(x_{t-1}|x_{t}), (4a)
p_{\theta}(x_{t-1}|x_{t}):=\mathcal{N}(x_{t-1};\mu_{\theta}(x_{t},t),\sigma_{t}^{2}\textbf{I}) (4b)

The mean \mu_{\theta}(x_{t},t) is parametrized by deep neural networks and \sigma_{t}^{2}=\frac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_{t}}\beta_{t}. The objective during training is to minimize the discrepancy between the true data distribution and the estimated distribution, which can be achieved by minimizing the variational bound on the log likelihood:

L_{vb}=\mathbb{E}_{q(x_{0:T})}\Big[-\log p(x_{T})-\sum_{t\geq 1}\log\frac{p_{\theta}(x_{t-1}|x_{t})}{q(x_{t}|x_{t-1})}\Big] (5)

Ho et al.[24] proposed using deep neural networks to estimate the added noise \epsilon and perform parameterization on \mu_{\theta}:

\mu_{\theta}(x_{t},t)=\frac{1}{\sqrt{\alpha_{t}}}(x_{t}-\frac{\beta_{t}}{\sqrt{1-\bar{\alpha}_{t}}}\epsilon_{\theta}(x_{t},t)), (6)

Accordingly, the loss function simplifies to

L_{\text{simple}}=\mathbb{E}_{x_{t},t,\epsilon}\|\epsilon-\epsilon_{\theta}(x_{t},t)\|^{2} (7)
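As a sketch, the simplified objective in (7) can be implemented as follows, where eps_model stands for any noise-prediction network (an assumed placeholder, not a specific architecture) and alpha_bars is the cumulative product from the sketch above.

```python
import torch

def simple_loss(eps_model, x0: torch.Tensor, alpha_bars: torch.Tensor) -> torch.Tensor:
    """L_simple: predict the noise injected at a randomly drawn time step."""
    t = torch.randint(0, alpha_bars.numel(), (x0.shape[0],), device=x0.device)
    eps = torch.randn_like(x0)
    a_bar = alpha_bars.to(x0.device)[t].view(-1, *([1] * (x0.dim() - 1)))  # broadcast over image dims
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps                   # Eq. (3)
    return ((eps - eps_model(x_t, t)) ** 2).mean()
```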

2.2 Static DiMo: Accelerated MRI reconstruction

Accelerated MRI considers the following measurement model

f=Ax+\varepsilon, (8)

where x\in\mathbb{C}^{n} is the MR image of interest consisting of n pixels, f\in\mathbb{C}^{m} is its corresponding undersampled k-space measurement, and \varepsilon\in\mathbb{C}^{m} is measurement noise. The parameterized forward measurement matrix A\in\mathbb{C}^{m\times n} is defined as

A:=\mathcal{P}_{\Omega}\mathcal{F}\mathcal{S}, (9)

where \mathcal{S}:=[\mathcal{S}_{1},...,\mathcal{S}_{j}] are the coil sensitivity maps for j different coils, \mathcal{F}\in\mathbb{C}^{n\times n} is the 2D discrete Fourier transform, and \mathcal{P}_{\Omega}\in\mathbb{N}^{u\times n} (u\ll n) is the binary undersampling mask corresponding to u sampled data points using undersampling pattern \Omega. The classical approach to recover x from the undersampled k-space f is to solve the following optimization problem:

\min_{x}\frac{1}{2}\|Ax-f\|^{2}_{2}. (10)

The training of Static DiMo starts with the input of fully sampled k-space data, which we denote as \hat{f}. We can write x=\mathcal{F}^{-1}\hat{f}, where \mathcal{F}^{-1} is the inverse discrete Fourier transform. The task of Static DiMo is to estimate \hat{f}; thus we can rewrite equation (10) as:

\min_{\hat{f}}\frac{1}{2}\|A\mathcal{F}^{-1}\hat{f}-f\|^{2}_{2} (11)

The minimizer \hat{f} can be found by applying the gradient descent algorithm, whose update is straightforward to compute:

\hat{f}^{(k+1)}=\hat{f}^{(k)}-\eta_{k}\nabla_{\hat{f}^{(k)}}\frac{1}{2}\|A\mathcal{F}^{-1}\hat{f}^{(k)}-f\|^{2}_{2} (12)
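A minimal sketch of one gradient step in (12) is shown below, using automatic differentiation so the adjoint of A=\mathcal{P}_{\Omega}\mathcal{F}\mathcal{S} need not be written out by hand; f_hat denotes the current estimate of the full k-space of the coil-combined image, and smaps and mask are assumed coil-sensitivity and sampling-mask tensors with illustrative shapes.

```python
import torch

def A_op(x: torch.Tensor, smaps: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Forward operator A = P_Omega F S: image (H, W) -> undersampled multi-coil k-space (C, H, W)."""
    coil_images = smaps * x                             # apply sensitivities S
    kspace = torch.fft.fft2(coil_images, norm="ortho")  # apply F
    return mask * kspace                                # apply sampling mask P_Omega

def grad_step(f_hat: torch.Tensor, f: torch.Tensor, smaps: torch.Tensor,
              mask: torch.Tensor, eta: float) -> torch.Tensor:
    """One step of Eq. (12) on the data-fidelity term."""
    f_hat = f_hat.detach().requires_grad_(True)
    x = torch.fft.ifft2(f_hat, norm="ortho")            # F^{-1} f_hat
    loss = 0.5 * torch.sum(torch.abs(A_op(x, smaps, mask) - f) ** 2)
    (grad,) = torch.autograd.grad(loss, f_hat)
    return (f_hat - eta * grad).detach()
```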

Typical diffusion models aim to estimate q(x|\mathcal{P}_{\Omega},f,\mathcal{S}). Since Static DiMo is defined in the frequency domain and x=\mathcal{F}^{-1}\hat{f}, the problem can be interpreted as generating samples of q(\hat{f}|\mathcal{P}_{\Omega},f,\mathcal{S}), which is conditioned on the undersampling mask \mathcal{P}_{\Omega}, scanned measurement f and coil sensitivity maps \mathcal{S}.

According to DDPM[24], the diffusion process can be represented with a noise schedule \beta_{1},...,\beta_{T} as:

q(\hat{f}_{1:T}|\hat{f}_{0},\mathcal{P}_{\Omega},f,\mathcal{S}):=\prod^{T}_{t=1}q(\hat{f}_{t}|\hat{f}_{t-1},\mathcal{P}_{\Omega},f,\mathcal{S}), (13a)
q(\hat{f}_{t}|\hat{f}_{t-1},\mathcal{P}_{\Omega},f,\mathcal{S}):=\mathcal{N}(\hat{f}_{t};\sqrt{1-\beta_{t}}\hat{f}_{t-1},\beta_{t}\textbf{I}). (13b)

Let \alpha_{t}=1-\beta_{t} and \bar{\alpha}_{t}=\prod^{t}_{s=1}\alpha_{s}. We can track the forward process conditioned on the initial sample:

q(\hat{f}_{t}|\hat{f}_{0},\mathcal{P}_{\Omega},f,\mathcal{S})=\mathcal{N}(\hat{f}_{t};\sqrt{\bar{\alpha}_{t}}\hat{f}_{0},(1-\bar{\alpha}_{t})\textbf{I}) (14a)
q(\hat{f}_{t-1}|\hat{f}_{t},\hat{f}_{0},\mathcal{P}_{\Omega},f,\mathcal{S})=\mathcal{N}(\hat{f}_{t-1};\tilde{\mu}_{t}(\hat{f}_{t},\hat{f}_{0}),\tilde{\beta}_{t}\textbf{I}), (14b)
\text{where }\tilde{\mu}_{t}(\hat{f}_{t},\hat{f}_{0}):=\frac{\sqrt{\alpha_{t}}(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_{t}}\hat{f}_{t}+\frac{\sqrt{\bar{\alpha}_{t-1}}\beta_{t}}{1-\bar{\alpha}_{t}}\hat{f}_{0} (14c)
\text{and }\tilde{\beta}_{t}:=\frac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_{t}}\beta_{t}. (14d)

We choose \alpha_{t} so that \bar{\alpha}_{T}\approx 0; then we obtain

q(\hat{f}_{T}|\hat{f}_{0},\mathcal{P}_{\Omega},f,\mathcal{S})\approx\mathcal{N}(\hat{f}_{T};\textbf{0},\textbf{I}), (15)

which approaches an isotropic Gaussian distribution.
Sampling \hat{f}_{0}\sim q(\hat{f}_{0}), the diffusion model has the following form:

p_{\theta}(\hat{f}_{0}|\mathcal{P}_{\Omega},f,\mathcal{S})=\int p_{\theta}(\hat{f}_{0:T}|\mathcal{P}_{\Omega},f,\mathcal{S})\,\mathrm{d}\hat{f}_{1:T}, (16)

The sampling of p_{\theta}(\hat{f}_{0}|\mathcal{P}_{\Omega},f,\mathcal{S}) is the reverse diffusion process.

p_{\theta}(\hat{f}_{0:T}|\mathcal{P}_{\Omega},f,\mathcal{S}):=p_{\theta}(\hat{f}_{T}|\mathcal{P}_{\Omega},f,\mathcal{S})\prod_{t=1}^{T}p_{\theta}(\hat{f}_{t-1}|\hat{f}_{t},\mathcal{P}_{\Omega},f,\mathcal{S}), (17a)
p_{\theta}(\hat{f}_{t-1}|\hat{f}_{t},\mathcal{P}_{\Omega},f,\mathcal{S}):=\mathcal{N}(\hat{f}_{t-1};\mu_{\theta}(\hat{f}_{t},t,\mathcal{P}_{\Omega},f,\mathcal{S}),\sigma_{t}^{2}\textbf{I}), (17b)

Training p_{\theta}(\hat{f}_{0}|\mathcal{P}_{\Omega},f,\mathcal{S}) amounts to optimizing the variational bound on the log likelihood in (5), which can be broken down into optimizing random terms of L with stochastic gradient descent. Analyzing L_{t-1} using (14b) and (17b), we can write

L_{t-1}=D_{KL}(q(\hat{f}_{t-1}|\hat{f}_{t},\hat{f}_{0},\mathcal{P}_{\Omega},f,\mathcal{S})\,||\,p_{\theta}(\hat{f}_{t-1}|\hat{f}_{t},\mathcal{P}_{\Omega},f,\mathcal{S})) (18a)
=\mathbb{E}_{q}[\frac{1}{2\sigma^{2}}\|\tilde{\mu}_{t}(\hat{f}_{t},\hat{f}_{0})-\mu_{\theta}(\hat{f}_{t},t,\mathcal{P}_{\Omega},f,\mathcal{S})\|^{2}_{2}]+C, (18b)

where C is a constant independent of \theta. From (3) we get \hat{f}_{0}=\frac{1}{\sqrt{\bar{\alpha}_{t}}}(\hat{f}_{t}(\hat{f}_{0},\epsilon)-\sqrt{1-\bar{\alpha}_{t}}\epsilon); then, applying the forward posterior mean (14c), we have

L_{t-1}-C=\mathbb{E}_{q}[\frac{1}{2\sigma^{2}}\|\frac{1}{\sqrt{\alpha_{t}}}(\hat{f}_{t}(\hat{f}_{0},\epsilon)-\frac{\beta_{t}}{\sqrt{1-\bar{\alpha}_{t}}}\epsilon)-\mu_{\theta}(\hat{f}_{t}(\hat{f}_{0},\epsilon),t,\mathcal{P}_{\Omega},f,\mathcal{S})\|^{2}_{2}]. (19)

We can derive that \mu_{\theta} should be parametrized to predict \frac{1}{\sqrt{\alpha_{t}}}(\hat{f}_{t}-\frac{\beta_{t}}{\sqrt{1-\bar{\alpha}_{t}}}\epsilon), so we set

\mu_{\theta}(\hat{f}_{t},t,\mathcal{P}_{\Omega},f,\mathcal{S})=\frac{1}{\sqrt{\alpha_{t}}}(\hat{f}_{t}-\frac{\beta_{t}}{\sqrt{1-\bar{\alpha}_{t}}}\epsilon_{\theta}(\hat{f}_{t},t,\mathcal{P}_{\Omega},f,\mathcal{S})), (20)

where \hat{f}_{t}=\sqrt{\bar{\alpha}_{t}}\hat{f}_{0}+\sqrt{1-\bar{\alpha}_{t}}\epsilon and \epsilon\sim\mathcal{N}(\textbf{0},\textbf{I}). Then the simplified loss function is

\mathbb{E}_{\hat{f}_{0},t,\epsilon}\|\epsilon-\epsilon_{\theta}(\sqrt{\bar{\alpha}_{t}}\hat{f}_{0}+\sqrt{1-\bar{\alpha}_{t}}\epsilon,t,\mathcal{P}_{\Omega},f,\mathcal{S})\|^{2}_{2}, (21)

In equation (21), \hat{f}_{t}(\sqrt{\bar{\alpha}_{t}}\hat{f}_{0}+\sqrt{1-\bar{\alpha}_{t}}\epsilon,\mathcal{P}_{\Omega},f,\mathcal{S}) can be interpreted as a function of the conditioning inputs of \epsilon_{\theta}. Then we can write our simplified loss function as

L=\mathbb{E}_{\hat{f}_{0},t,\epsilon}\|\epsilon-\epsilon_{\theta}(\sqrt{\bar{\alpha}_{t}}\hat{f}_{0}+\sqrt{1-\bar{\alpha}_{t}}\epsilon,\mathcal{P}_{\Omega},f,\mathcal{S},t)\|^{2}_{2} (22a)
=\mathbb{E}_{\hat{f}_{t},t,\epsilon}\|\epsilon-\epsilon_{\theta}(\hat{f}_{t},t)\|^{2}_{2}. (22b)

Now we can design the function \hat{f}_{t} so that it provides prior guidance for the diffusion during training and sampling. The diffusion process starts with \hat{f}_{0}\sim q(\hat{f}); by (3) we have \hat{f}_{t}=\sqrt{\bar{\alpha}_{t}}\hat{f}_{0}+\sqrt{1-\bar{\alpha}_{t}}\epsilon. An intuitive way of designing the function \hat{f}_{t} is to incorporate a data consistency (DC) layer, which perturbs the k-space data \hat{f}_{t} with a linear combination of the partially scanned data f. The DC output can then be further refined using gradient descent (GD) as described in (12). The proposed algorithm, which synergizes DDPM, MRI data-consistency enforcement, and gradient descent optimization, is listed in Alg. 1:

Algorithm 1 Training Process of Static DiMo
Require: t\sim\text{Uniform}(\{1,\cdots,T\}), \epsilon\sim\mathcal{N}(\textbf{0},\textbf{I}), \hat{f}_{0}\sim q(\hat{f}_{0}), undersampling mask \mathcal{P}_{\Omega}, partially scanned k-space f, and coil sensitivities \mathcal{S}. Initialization: \eta_{0}
1:  \hat{f}_{t}\leftarrow\sqrt{\bar{\alpha}_{t}}\hat{f}_{0}+\sqrt{1-\bar{\alpha}_{t}}\epsilon
2:  \hat{f}_{t}\leftarrow\mathcal{P}_{\Omega}(\lambda_{t}f+(1-\lambda_{t})\hat{f}_{t})+(\mathbbm{1}-\mathcal{P}_{\Omega})\hat{f}_{t}  \triangleright DC
3:  for k=1 to K do
4:     \hat{f}_{t}\leftarrow\hat{f}_{t}-\eta_{k}\nabla_{\hat{f}_{t}}\frac{1}{2}\|A\mathcal{F}^{-1}\hat{f}_{t}-f\|^{2}_{2}  \triangleright GD
5:  end for
6:  Take a gradient descent step on \nabla_{\theta}\|\epsilon-\epsilon_{\theta}(\hat{f}_{t},t)\|^{2}_{2}; repeat until convergence.
Output: \hat{f}_{t}, t\in\{1,\cdots,T\}.

In step 2, \lambda_{t} is a scheduling factor that guides the data consistency dynamics, following an exponential schedule for the perturbation:

\lambda_{t}=\exp(-(t-1)/(T/10)). (23)

The weight on the prior knowledge of the scanned signal thus decreases exponentially as t grows. In doing so, the scheduling parameter balances network training and stabilizes parameter learning. \mathbbm{1} denotes the all-ones matrix and \eta_{k} is the learnable step size for gradient descent.
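As a sketch under these definitions, the DC perturbation of Alg. 1, step 2, together with the schedule in (23), can be written as follows; mask, f, and f_hat_t are assumed tensors of matching k-space shape.

```python
import math
import torch

def lambda_t(t: int, T: int) -> float:
    """Exponential DC weight of Eq. (23); equal to 1 at t = 1 and decaying as t grows."""
    return math.exp(-(t - 1) / (T / 10))

def data_consistency(f_hat_t: torch.Tensor, f: torch.Tensor,
                     mask: torch.Tensor, lam: float) -> torch.Tensor:
    """Alg. 1, step 2: blend acquired and generated k-space on sampled locations only."""
    return mask * (lam * f + (1.0 - lam) * f_hat_t) + (1.0 - mask) * f_hat_t
```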

Sampling \hat{f}_{t-1}\sim p_{\theta}(\hat{f}_{t-1}|\hat{f}_{t},\mathcal{P}_{\Omega},f,\mathcal{S}) leverages Langevin dynamics with \epsilon_{\theta} as the learned gradient of the data density, which provides local information about the data distribution. Here we compute \hat{f}_{t-1}=\frac{1}{\sqrt{\alpha_{t}}}(\hat{f}_{t}-\frac{\beta_{t}}{\sqrt{1-\bar{\alpha}_{t}}}\epsilon_{\theta}(\hat{f}_{t},t,\mathcal{P}_{\Omega},f,\mathcal{S}))+\sigma_{t}z, where z\sim\mathcal{N}(\textbf{0},\textbf{I}). Alg. 2 shows the sampling process, which is the reverse of the diffusion process.

Algorithm 2 Sampling Process of Static DiMo
Require: \hat{f}_{T}\sim\mathcal{N}(\textbf{0},\textbf{I}), undersampling mask \mathcal{P}_{\Omega}, partially scanned k-space f, and coil sensitivities \mathcal{S}.
1:  for t=T,...,1 do
2:     z\sim\mathcal{N}(\textbf{0},\textbf{I}) if t>1, else z=0
3:     \hat{f}_{t-1}=\mu_{\theta}(\hat{f}_{t},t)+\sigma_{t}z
4:     \hat{f}_{t-1}\leftarrow\mathcal{P}_{\Omega}(\lambda_{t-1}f+(1-\lambda_{t-1})\hat{f}_{t-1})+(\mathbbm{1}-\mathcal{P}_{\Omega})\hat{f}_{t-1}
5:     for k=0 to K do
6:        \hat{f}_{t-1}\leftarrow\hat{f}_{t-1}-\eta_{k}\nabla_{\hat{f}_{t-1}}\frac{1}{2}\|A\mathcal{F}^{-1}\hat{f}_{t-1}-f\|^{2}_{2}
7:     end for
8:  end for
Output: \hat{f}_{0}
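Putting the pieces together, a hedged sketch of one reverse step in Alg. 2 (steps 3-4) is given below; eps_model, betas, alphas, alpha_bars, lambda_t, and data_consistency are assumed to be defined as in the earlier sketches, and the per-step gradient refinement (steps 5-7) would reuse the grad_step sketch above.

```python
import torch

def reverse_step(f_hat_t, t, f, mask, eps_model, betas, alphas, alpha_bars, T):
    """One ancestral sampling step f_t -> f_{t-1}, followed by data consistency."""
    eps_hat = eps_model(f_hat_t, t)
    mean = (f_hat_t - betas[t] / (1.0 - alpha_bars[t]).sqrt() * eps_hat) / alphas[t].sqrt()
    if t > 0:
        sigma = ((1.0 - alpha_bars[t - 1]) / (1.0 - alpha_bars[t]) * betas[t]).sqrt()
        noise = sigma * torch.randn_like(f_hat_t)
    else:
        noise = torch.zeros_like(f_hat_t)
    f_prev = mean + noise
    return data_consistency(f_prev, f, mask, lambda_t(t, T))
```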

2.3 Quantitative DiMo: Accelerated qMRI reconstruction.

To extend from static MRI reconstruction to qMRI reconstruction, we first define the MR parameter maps as \Delta=\{\delta_{i}\}_{i=1}^{N}, where \delta_{i} denotes each MR parameter and N is the total number of MR parameters to be estimated. Given the MR signal model \mathbf{M}, which is a function of \Delta, the MR parameters \Delta can be estimated by minimizing the following:

\min_{\Delta}\frac{1}{2}\|A\mathbf{M}(\Delta)-f\|^{2}_{2}. (24)

Quantitative DiMo is defined in the parameter domain; therefore, we focus on estimating q(\hat{\Delta}|\mathcal{P}_{\Omega},f,\mathcal{S},\mathbf{M}) with one additional condition, the MR signal model \mathbf{M}. Since the diffusion and reverse processes closely parallel those of Static DiMo, we omit the derivation for Quantitative DiMo.

In contrast to the algorithm design of Static DiMo, a major difference is the function \hat{\Delta}_{t}(\sqrt{\bar{\alpha}_{t}}\hat{\Delta}_{0}+\sqrt{1-\bar{\alpha}_{t}}\epsilon,\mathcal{P}_{\Omega},f,\mathcal{S},\mathbf{M}). Following the same framework as Static DiMo, Quantitative DiMo starts with equation (3), whose output is passed to the quantitative DC (QDC) layer, followed by fine-tuning with gradient descent. Alg. 3 displays the training process of Quantitative DiMo and Alg. 4 shows the sampling process:

Algorithm 3 Training Process of Quantitative DiMo
Require: t\sim\text{Uniform}(\{1,\cdots,T\}), \epsilon\sim\mathcal{N}(\textbf{0},\textbf{I}), \hat{\Delta}_{0}\sim q(\Delta_{0}), and acquired data \mathcal{P}_{\Omega},f,\mathcal{S},\mathbf{M}. Initialization: \tau_{0}
1:  \hat{\Delta}_{t}\leftarrow\sqrt{\bar{\alpha}_{t}}\hat{\Delta}_{0}+\sqrt{1-\bar{\alpha}_{t}}\epsilon
2:  \hat{f}_{t}\leftarrow\mathcal{F}\mathcal{S}\mathbf{M}(\hat{\Delta}_{t})  \triangleright QDC (steps 2-4)
3:  \hat{f}_{t}\leftarrow\mathcal{P}_{\Omega}(\lambda_{t}f+(1-\lambda_{t})\hat{f}_{t})+(\mathbbm{1}-\mathcal{P}_{\Omega})\hat{f}_{t}
4:  \hat{\Delta}_{t}\leftarrow\mathbf{M}^{-1}\mathcal{S}^{-1}(\mathcal{F}^{-1}\hat{f}_{t})
5:  for k=0 to K do
6:     \hat{\Delta}_{t}\leftarrow\hat{\Delta}_{t}-\tau_{k}\nabla_{\hat{\Delta}_{t}}\frac{1}{2}\|A\mathbf{M}(\hat{\Delta}_{t})-f\|^{2}_{2}  \triangleright GD
7:  end for
8:  Take a gradient descent step on \nabla_{\theta}\|\epsilon-\epsilon_{\theta}(\hat{\Delta}_{t},t)\|^{2}_{2}; repeat until convergence.
Output: \hat{\Delta}_{t}, t\in\{1,\cdots,T\}.
Algorithm 4 Sampling Process of Quantitative DiMo
Require: \hat{\Delta}_{T}\sim\mathcal{N}(\textbf{0},\textbf{I}), and acquired data \mathcal{P}_{\Omega},f,\mathcal{S},\mathbf{M}.
1:  for t=T,...,1 do
2:     z\sim\mathcal{N}(\textbf{0},\textbf{I}) if t>1, else z=0
3:     \hat{\Delta}_{t-1}=\mu_{\theta}(\hat{\Delta}_{t},t)+\sigma_{t}z
4:     \hat{f}_{t-1}\leftarrow\mathcal{F}\mathcal{S}\mathbf{M}(\hat{\Delta}_{t-1})
5:     \hat{f}_{t-1}\leftarrow\mathcal{P}_{\Omega}(\lambda_{t-1}f+(1-\lambda_{t-1})\hat{f}_{t-1})+(\mathbbm{1}-\mathcal{P}_{\Omega})\hat{f}_{t-1}
6:     \hat{\Delta}_{t-1}\leftarrow\mathbf{M}^{-1}\mathcal{S}^{-1}(\mathcal{F}^{-1}\hat{f}_{t-1})
7:     for k=0 to K do
8:        \hat{\Delta}_{t-1}\leftarrow\hat{\Delta}_{t-1}-\tau_{k}\nabla_{\hat{\Delta}_{t-1}}\frac{1}{2}\|A\mathbf{M}(\hat{\Delta}_{t-1})-f\|^{2}_{2}
9:     end for
10:  end for
Output: \hat{\Delta}_{0}

The QDC layer is presented in steps 2-4 of Alg. 3. The QDC step perturbs the quantitative parameters and guides the parameter-space denoising and generation process. Steps 5-7 further optimize the perturbed \hat{\Delta} through K steps of gradient descent. Alg. 4 displays the sampling process, which is the reverse of the diffusion process.

To demonstrate the feasibility of Quantitative DiMo for reconstructing accelerated qMRI, we used T_{1} mapping with the vFA model [13] as an example. For MR images acquired at multiple flip angles \phi_{i}, i=1,...,M, the MR signal model \mathbf{M} reads:

\mathbf{M}_{i}(T_{1},I_{0})=I_{0}\cdot\frac{(1-e^{-TR/T_{1}})\sin{\phi_{i}}}{1-e^{-TR/T_{1}}\cos{\phi_{i}}}, (25)

where T_{1}\in\mathbb{R}^{n} denotes the spin-lattice relaxation time map and I_{0}\in\mathbb{C}^{n} is the proton density map. In this model, the estimated MR parameters are encapsulated in the set \Delta=\{T_{1},I_{0}\}. Other imaging parameters, such as the flip angles and repetition time TR, are known. Fig. 1 shows the framework of Static DiMo and Quantitative DiMo, and Fig. 2 illustrates the U-Net architecture for \epsilon_{\theta}(x_{t},t).
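A minimal sketch of the vFA signal model in (25) is given below; T1 and I0 are parameter maps, flip_angles are given in radians, and TR is in the same time unit as T1 (all illustrative assumptions about the interface).

```python
import math
import torch

def vfa_signal(T1: torch.Tensor, I0: torch.Tensor, flip_angles, TR: float) -> torch.Tensor:
    """Eq. (25): return one image per flip angle, stacked along a new leading dimension."""
    E1 = torch.exp(-TR / T1)                                   # e^{-TR/T1}, same shape as T1
    images = [I0 * (1.0 - E1) * math.sin(phi) / (1.0 - E1 * math.cos(phi))
              for phi in flip_angles]
    return torch.stack(images, dim=0)                          # (M, ...) signal model M(Delta)
```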

Figure 1: Diffusion model framework for Static DiMo and Quantitative DiMo (e.g., T_{1} mapping).
Figure 2: Network structure for \epsilon_{\theta}(x_{t},t).

3 Experiments

3.1 Experiments for Static DiMo

We acquired knee data from 25 subjects, using 20 for training and 5 for testing. These fully sampled data were obtained from a 3T GE Premier scanner equipped with an 18-element knee coil array. The data were acquired using two-dimensional fast spin-echo (FSE) sequences along the coronal plane, with both proton density-weighting (PD-weighting) and T_{2}-weighting. All images were resized to an in-plane matrix size of 320 × 320 with 3 mm slice thickness. The number of slices varied from 30 to 38 per dataset, amounting to 875 total slices for training and testing. The experiments were conducted retrospectively using three different acceleration factors (AF = 4×, 5×, and 6×) with 1D variable-density Cartesian undersampling masks[30] in which the 20 central k-space lines were fully sampled. We evaluated the performance of the proposed method against several state-of-the-art methods, including the Variational Network (VN)[6], ISTA-Net[31], and pMRI-Net[32], over two contrasts: PD- and T_{2}-weighting. Results were compared in terms of peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and normalized mean squared error (NMSE). The equations for assessing PSNR, SSIM, and NMSE between the reconstruction \mathrm{v} and the ground truth \mathrm{v}^{*} are:

PSNR(\mathrm{v},\mathrm{v}^{*})=20\log_{10}\big(\max(|\mathrm{v}^{*}|)/\sqrt{\tfrac{1}{N}\|\mathrm{v}-\mathrm{v}^{*}\|^{2}_{2}}\big), (26)
SSIM(\mathrm{v},\mathrm{v}^{*})=\frac{(2\mu_{\mathrm{v}}\mu_{\mathrm{v}^{*}}+c_{1})(2\sigma_{\mathrm{v}\mathrm{v}^{*}}+c_{2})}{(\mu_{\mathrm{v}}^{2}+\mu_{\mathrm{v}^{*}}^{2}+c_{1})(\sigma_{\mathrm{v}}^{2}+\sigma_{\mathrm{v}^{*}}^{2}+c_{2})}, (27)
NMSE(\mathrm{v},\mathrm{v}^{*})=\|\mathrm{v}^{*}-\mathrm{v}\|^{2}_{2}/\|\mathrm{v}^{*}\|^{2}_{2}, (28)

where \mu_{\mathrm{v}},\mu_{\mathrm{v}^{*}} represent the local means of pixel intensities with standard deviations \sigma_{\mathrm{v}},\sigma_{\mathrm{v}^{*}} for images \mathrm{v},\mathrm{v}^{*}, respectively, \sigma_{\mathrm{v}\mathrm{v}^{*}} denotes the covariance between \mathrm{v} and \mathrm{v}^{*}, and c_{1}=(K_{1}L)^{2}, c_{2}=(K_{2}L)^{2} are two constants introduced to stabilize the division. We chose the total number of sampling steps as T=1000, using a linear schedule from \beta_{1}=10^{-5} to \beta_{T}=10^{-2} and a training batch size of 4. The total number of epochs used was 7000. The initial step size for Alg. 1 was chosen to be \eta_{0}=10^{-4}.
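For reference, a small sketch of the PSNR and NMSE computations in (26) and (28) for magnitude images is shown below; SSIM in (27) is typically taken from an existing implementation (e.g., skimage.metrics.structural_similarity), so it is omitted here.

```python
import torch

def psnr(v: torch.Tensor, v_star: torch.Tensor) -> torch.Tensor:
    """Eq. (26): peak signal-to-noise ratio in dB, with the RMSE in the denominator."""
    mse = torch.mean(torch.abs(v - v_star) ** 2)
    return 20.0 * torch.log10(torch.abs(v_star).max() / mse.sqrt())

def nmse(v: torch.Tensor, v_star: torch.Tensor) -> torch.Tensor:
    """Eq. (28): normalized mean squared error."""
    return torch.sum(torch.abs(v_star - v) ** 2) / torch.sum(torch.abs(v_star) ** 2)
```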

3.2 Experiments for Quantitative DiMo

Quantitative DiMo experiments were carried out using in-vivo brain data of healthy volunteers, obtained from a Siemens 3T Prisma scanner equipped with a 20-channel head coil. Fully sampled vFA brain data from five subjects were acquired along the sagittal plane using a spoiled gradient echo sequence with imaging parameters TE/TR = 12/40 ms, FA = 5°, 10°, 20°, 40°, an in-plane matrix size of 176 × 176 with 3 mm slice thickness, and a total of 48 slices per subject. Leave-one-out cross-validation across all five subjects was used to train and test our method.

Two undersampling schemes were applied retrospectively: (1) 1D variable-density Cartesian undersampling[30] at AF = 4×, where the 16 central k-space lines were fully sampled, and (2) 2D Poisson disk undersampling at AF = 4×, with the central 51 × 51 k-space portion fully sampled. The undersampling patterns were varied for each flip angle, as in previous studies[14, 16]. We compared Quantitative DiMo with two advanced non-deep-learning qMRI reconstruction techniques, Locally Low Rank (LLR)[33] and Model-TGV[34], and a self-supervised deep learning method, RELAX[16]. For LLR and Model-TGV, the settings recommended in their original papers were used along with the available code. RELAX and Quantitative DiMo were trained through cross-validation. For the diffusion hyperparameters, we maintained the total number of sampling steps at T=1000, using a linear schedule from \beta_{1}=10^{-6} to \beta_{T}=0.05. The training batch size was 8 with 5500 total epochs. The initial step size for gradient descent in Alg. 3 was set to \tau_{0}=10^{-5}.

All in-vivo studies were carried out under a protocol approved by our institution's institutional review board. Additional experimental details are as follows: the coil sensitivity maps were estimated using ESPIRiT[35], and separate B_{1}^{+} maps were acquired[36] to correct for B_{1}^{+} bias in estimating T_{1}. All programming in this study was implemented in Python using the PyTorch package, and experiments were conducted on one NVIDIA A100 80GB GPU and an Intel Xeon 6338 CPU running a CentOS Linux system.

4 Results

4.1 Results of Static DiMo

Examples of reconstruction results obtained from different methods at AF = 6× are shown for PD-weighted (Fig. 3) and T_{2}-weighted (Fig. 4) images along the coronal plane. Zero-filled results show significant blurring and artifacts due to undersampling. Although VN, ISTA-Net, and pMRI-Net are able to remove these artifacts, compared with the fully sampled reference images, a blurring characteristic persists, with VN showing the most, while ISTA-Net and pMRI-Net are comparable. Static DiMo's reconstruction not only removes the undersampling artifacts but also significantly suppresses blurring, showing clarity and sharpness, thus outperforming both ISTA-Net and pMRI-Net. Furthermore, Static DiMo shows superior denoising capability, rooted in its denoising diffusion modeling. This not only proficiently removes noise and artifacts but also retains the integrity of high-frequency image detail. This is further illustrated in the zoomed-in images of Fig. 3(b) and Fig. 4(b). Static DiMo clearly distinguishes tissue boundaries across various tissue types, including cartilage, meniscus, bone, and muscle. In addition, the fidelity of tissue texture and sharpness are preserved. The pixel-wise error maps shown in Fig. 3(c) and Fig. 4(c) also demonstrate that Static DiMo produces the smallest reconstruction errors with respect to the fully sampled reference. For both contrasts, Static DiMo consistently outperforms the other methods. Overall, in agreement with the qualitative observations in Fig. 3 and Fig. 4, reconstruction results for all testing subjects are summarized in Table 1 using quantitative metrics, showing the superior performance of Static DiMo over the other methods in reconstruction fidelity, structure and texture preservation, and noise suppression.

Table 1: Comparison of different methods using quantitative metrics.
PD-weighting¹
AF 4× 5× 6×
Methods PSNR SSIM NMSE PSNR SSIM NMSE PSNR SSIM NMSE
VN [6] 30.5932 ± 0.9312 0.9016 ± 0.0390 0.1374 ± 0.0203 29.3312 ± 1.1914 0.8868 ± 0.0482 0.1467 ± 0.0269 27.5036 ± 1.2453 0.8503 ± 0.0594 0.1788 ± 0.0301
ISTA-Net[31] 32.3933 ± 1.1124 0.9640 ± 0.0260 0.1030 ± 0.0121 31.1878 ± 1.2343 0.9555 ± 0.0369 0.1150 ± 0.0152 29.1795 ± 1.2951 0.9603 ± 0.0415 0.1281 ± 0.0197
pMRI-Net[32] 33.5534 ± 1.2351 0.9695 ± 0.0144 0.0937 ± 0.0058 32.7408 ± 1.4180 0.9667 ± 0.0198 0.1003 ± 0.0086 30.8576 ± 1.5202 0.9560 ± 0.0235 0.1178 ± 0.0128
Static DiMo 35.3733 ± 1.2142 0.9704 ± 0.0137 0.0738 ± 0.0044 34.0121 ± 1.3185 0.9691 ± 0.0169 0.0839 ± 0.0081 32.7645 ± 1.4719 0.9597 ± 0.0119 0.1084 ± 0.0109
T_{2}-weighting¹
Methods PSNR SSIM NMSE PSNR SSIM NMSE PSNR SSIM NMSE
VN [6] 32.0078 ± 0.5800 0.9129 ± 0.0119 0.1464 ± 0.0048 30.8641 ± 0.6303 0.8975 ± 0.0128 0.1743 ± 0.0082 29.2356 ± 0.7866 0.8708 ± 0.0245 0.1934 ± 0.0102
ISTA-Net[31] 34.1552 ± 0.7873 0.9628 ± 0.0056 0.1142 ± 0.0042 32.7243 ± 1.0213 0.9472 ± 0.0093 0.1382 ± 0.0065 30.0678 ± 1.1038 0.9155 ± 0.0109 0.1864 ± 0.0080
pMRI-Net[32] 36.3567 ± 1.1123 0.9767 ± 0.0045 0.0867 ± 0.0032 35.5938 ± 1.1256 0.9728 ± 0.0078 0.0942 ± 0.0045 34.1399 ± 1.1321 0.9646 ± 0.0087 0.1108 ± 0.0059
Static DiMo 37.6438 ± 1.0343 0.9817 ± 0.0039 0.0776 ± 0.0025 36.2274 ± 1.1227 0.9812 ± 0.0071 0.0793 ± 0.0049 35.1207 ± 1.1348 0.9708 ± 0.0079 0.0983 ± 0.0056

¹ Data are presented as mean ± std.

Figure 3: Comparison of coronal PD-weighted contrast at AF = 6×.
Figure 4: Comparison of coronal T_{2}-weighted contrast at AF = 6×.

4.2 Results of Quantitative DiMo

T_{1} and I_{0} maps estimated from 4× 1D variable-density Cartesian undersampling using various methods are shown in Fig. 5. The zero-filled T_{1} map derived from pixel-wise fitting of the undersampled images (Fig. 5(a)) exhibits noise and ripple artifacts due to aliasing. While LLR manages to mitigate these artifacts to an extent, the noisy signature persists. In its attempt to neutralize the noise, Model-TGV yields a more refined tissue appearance with enhanced T_{1} contrast; however, the resulting maps are overly smooth, appearing blurry. Conversely, RELAX effectively suppresses both noise and artifacts, delivering sharper maps, albeit with a slightly blocky texture. This blockiness is speculated to arise from challenges in achieving convergence in the end-to-end network using limited data. Meanwhile, Quantitative DiMo generates clear and sharp T_{1} maps. Its proficiency in removing noise translates to superior map quality, both in appearance and contrast. This is further witnessed in the zoom-ins (Fig. 5(b)), which show that Quantitative DiMo clearly distinguishes the boundary between white matter and grey matter, appearing nearly identical to the fully sampled reference map. This is quantitatively confirmed in the error maps (Fig. 5(c)), where the zero-filled result incurs the largest error, followed by LLR, Model-TGV, and RELAX.

Figure 5: Exemplified comparison of T_{1} and I_{0} mapping among different methods using the 4× 1D Cartesian variable-density undersampling mask.

Mean T_{1} values obtained from representative white matter and grey matter regions are presented in Table 2 for all subjects. Overall, LLR shows the least agreement with the fully sampled reference method, followed by Model-TGV. On the other hand, both RELAX and Quantitative DiMo show good agreement. However, as shown by the Wilcoxon signed rank test results, Quantitative DiMo shows better agreement with the reference than RELAX.

Table 2: ROI analysis of representative brain regions for 1D Cartesian undersampling at AF = 4×.
T_{1} [s]¹
Region-of-interest LLR Model-TGV RELAX Quantitative DiMo Fully Sampled
White matter region
Corpus Callosum 0.957 ± 0.217 0.902 ± 0.186 0.895 ± 0.173 0.894 ± 0.167§ 0.895 ± 0.164
Frontal white matter 0.932 ± 0.226 0.918 ± 0.213 0.906 ± 0.207§ 0.905 ± 0.203§ 0.905 ± 0.202
Grey matter region
Putamen 1.201 ± 0.209 1.287 ± 0.175 1.278 ± 0.174 1.308 ± 0.170§ 1.311 ± 0.163
Thalamus 1.252 ± 0.201 1.224 ± 0.162 1.216 ± 0.134 1.223 ± 0.118 1.225 ± 0.109

¹ Data are presented as mean ± std. § P>0.05 vs. fully sampled T_{1} using the Wilcoxon signed rank test.

Figure 6: Exemplified comparison of T_{1} and I_{0} mapping among different methods using the 4× 2D Poisson undersampling mask.
Figure 7: Examples of mean, error, and variance maps at different numbers of samples for Static DiMo.

The I_{0} maps in Fig. 5(d-f) exhibit a similar signature in reconstruction quality as the T_{1} maps: LLR and Model-TGV show aliasing artifacts with the most blur, further confirmed by the zoom-ins, reflecting an undersampling pattern in which many high-frequency data points are skipped while the low-frequency data points are fully sampled. The deep learning method RELAX shows its ability to denoise through artifact removal and detail preservation; however, it still lacks sharpness, resulting in smoothed edges. Quantitative DiMo produces the sharpest, detail-preserving maps, which again is confirmed by its error maps showing the least error.

The T_{1} and I_{0} maps estimated from 4× 2D Poisson disk undersampling are shown in Fig. 6. Examining Fig. 6(a) and Fig. 6(d), it is apparent that the main differences compared to the 4× 1D undersampling case are the different artifact patterns, which in turn stem from the different undersampling patterns. As seen in the zoomed-in views of the T_{1} (Fig. 6(b)) and I_{0} (Fig. 6(e)) maps, compared to Fig. 5, LLR provides sharper map details but still retains artifacts and noise, whereas Model-TGV preserves detail but exhibits blurred edges due to its general averaging of noise. RELAX captures sharp structures but is not able to recover details in some fine regions. Quantitative DiMo produces artifact-free T_{1} and I_{0} maps with superior performance compared to the other methods. This is again confirmed by the error maps (Fig. 6(c) and Fig. 6(f)), with Quantitative DiMo showing the least error. This is likely achieved through integrating the unrolled gradient descent algorithm and the diffusion denoising network, prioritizing noise suppression without compromising the fidelity and clarity of the underlying tissue structure.

4.3 Ablation Study

An ablation study investigating model uncertainty was conducted by sampling the well-trained Static DiMo on PD data 100 times. From these 100 samples, we derived mean images, error maps, and variance maps. We then compared three distinct acceleration factors (4×, 5×, 6×) using 1D Cartesian masks, shown in Fig. 7. When compared with fully sampled images, the error and variance maps predominantly capture the edge and boundary details. Notably, higher acceleration factors increase the uncertainty with amplified errors, especially around tissue interfaces. Incrementally increasing the number of samples, starting from 10, increasing to 50, and finally reaching 100, we observed a progression in the variance maps: they began displaying smoother transitions, with the high-uncertainty edge regions gradually diminishing. This suggests that increasing the number of sampling instances can be instrumental in attenuating noise in the mean images.
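The uncertainty maps in this study can be derived by repeating the sampling process; a hedged sketch is shown below, where sample_fn stands for one full run of Alg. 2 followed by an inverse Fourier transform (an assumption about the interface, not the exact implementation used here).

```python
import torch

def uncertainty_maps(sample_fn, reference: torch.Tensor, n_samples: int = 100):
    """Repeat sampling, then compute pixel-wise mean, error, and variance maps."""
    samples = torch.stack([sample_fn() for _ in range(n_samples)], dim=0)
    mean_img = samples.mean(dim=0)
    var_map = samples.var(dim=0)
    err_map = (mean_img - reference).abs()
    return mean_img, err_map, var_map
```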

5 Discussion and Conclusion

While diffusion models, including ours, show impressive abilities in performing the image reconstruction task, there are still some limitations. For instance, the diffusion process is computationally intensive, especially for complex structures or high-dimensional data, requiring significant time and resources. Diffusion models can also be sensitive to the choice of parameters: a slight change in parameters can result in substantially different outcomes, making it crucial to fine-tune them for specific applications. For complex systems or datasets, it might be challenging to derive analytical solutions using diffusion models, necessitating numerical methods that can introduce their own set of issues. Future research is warranted to address these limitations.

To sum up, this paper proposed a diffusion model that is conditioned on the native data domain for reconstructing MRI and quantitative MRI. The reconstructions show promising results compared to other deep learning methods, reflecting the robustness and efficiency of the proposed Static DiMo and Quantitative DiMo.

References

  • [1] K. P. Pruessmann, M. Weiger, M. B. Scheidegger, and P. Boesiger, “Sense: Sensitivity encoding for fast mri,” Magnetic Resonance in Medicine, vol. 42, no. 5, pp. 952–962, 1999.
  • [2] M. A. Griswold, P. M. Jakob, R. M. Heidemann, M. Nittka, V. Jellus, J. Wang, B. Kiefer, and A. Haase, “Generalized autocalibrating partially parallel acquisitions (grappa),” Magnetic Resonance in Medicine, vol. 47, no. 6, pp. 1202–1210, 2002.
  • [3] D. Liang, J. Cheng, Z. Ke, and L. Ying, “Deep magnetic resonance image reconstruction: Inverse problems meet neural networks,” IEEE Signal Processing Magazine, vol. 37, no. 1, pp. 141–151, 2020.
  • [4] Y. Yang, J. Sun, H. Li, and Z. Xu, “Deep admm-net for compressive sensing mri,” in Advances in Neural Information Processing Systems, vol. 29.   Curran Associates, Inc., 2016.
  • [5] J. Schlemper, J. Caballero, J. V. Hajnal, A. Price, and D. Rueckert, “A deep cascade of convolutional neural networks for mr image reconstruction,” in Information Processing in Medical Imaging: 25th International Conference, IPMI 2017, Boone, NC, USA, June 25-30, 2017, Proceedings 25.   Springer, 2017, pp. 647–658.
  • [6] K. Hammernik, T. Klatzer, E. Kobler, M. P. Recht, D. K. Sodickson, T. Pock, and F. Knoll, “Learning a variational network for reconstruction of accelerated mri data,” Magnetic resonance in medicine, vol. 79, no. 6, pp. 3055–3071, 2018.
  • [7] M. Mardani, E. Gong, J. Y. Cheng, S. S. Vasanawala, G. Zaharchuk, L. Xing, and J. M. Pauly, “Deep generative adversarial neural networks for compressive sensing mri,” IEEE transactions on medical imaging, vol. 38, no. 1, pp. 167–179, 2018.
  • [8] B. Zhu, J. Z. Liu, S. F. Cauley, B. R. Rosen, and M. S. Rosen, “Image reconstruction by domain-transform manifold learning,” Nature, vol. 555, no. 7697, pp. 487–492, 2018.
  • [9] R. Liu, Y. Zhang, S. Cheng, Z. Luo, and X. Fan, “A deep framework assembling principled modules for cs-mri: Unrolling perspective, convergence behaviors, and practical modeling,” IEEE Transactions on Medical Imaging, vol. 39, no. 12, pp. 4150–4163, 2020.
  • [10] S. U. Dar, M. Yurt, M. Shahdloo, M. E. Ildız, B. Tınaz, and T. Cukur, “Prior-guided image reconstruction for accelerated multi-contrast mri via generative adversarial networks,” IEEE Journal of Selected Topics in Signal Processing, vol. 14, no. 6, pp. 1072–1087, 2020.
  • [11] S. Wang, Z. Ke, H. Cheng, S. Jia, L. Ying, H. Zheng, and D. Liang, “Dimension: dynamic mr imaging with both k-space and spatial prior knowledge obtained via multi-supervised network training,” NMR in Biomedicine, vol. 35, no. 4, p. e4131, 2022.
  • [12] E. K. Fram, R. J. Herfkens, G. Johnson, G. H. Glover, J. P. Karis, A. Shimakawa, T. G. Perkins, and N. J. Pelc, “Rapid calculation of t1 using variable flip angle gradient refocused imaging,” Magnetic Resonance Imaging, vol. 5, no. 3, pp. 201–208, 1987.
  • [13] H. Z. Wang, S. J. Riederer, and J. N. Lee, “Optimizing the precision in t1 relaxation estimation using limited flip angles,” Magnetic resonance in medicine, vol. 5, no. 5, pp. 399–416, 1987.
  • [14] F. Liu, L. Feng, and R. Kijowski, “Mantis: model-augmented neural network with incoherent k-space sampling for efficient mr parameter mapping,” Magnetic resonance in medicine, vol. 82, no. 1, pp. 174–188, 2019.
  • [15] F. Liu, R. Kijowski, L. Feng, and G. El Fakhri, “High-performance rapid mr parameter mapping using model-based deep adversarial learning,” Magnetic resonance imaging, vol. 74, pp. 152–160, 2020.
  • [16] F. Liu, R. Kijowski, G. El Fakhri, and L. Feng, “Magnetic resonance parameter mapping using model-guided self-supervised deep learning,” Magnetic resonance in medicine, vol. 85, no. 6, pp. 3211–3226, 2021.
  • [17] R. Feng, J. Zhao, H. Wang, B. Yang, J. Feng, Y. Shi, M. Zhang, C. Liu, Y. Zhang, J. Zhuang et al., “Modl-qsm: Model-based deep learning for quantitative susceptibility mapping,” NeuroImage, vol. 240, p. 118376, 2021.
  • [18] Y. Jun, H. Shin, T. Eo, T. Kim, and D. Hwang, “Deep model-based magnetic resonance parameter mapping network (dopamine) for fast t1 mapping using variable flip angle method,” Medical Image Analysis, vol. 70, p. 102017, 2021.
  • [19] H. Chung, B. Sim, and J. C. Ye, “Come-closer-diffuse-faster: Accelerating conditional diffusion models for inverse problems through stochastic contraction,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 12 413–12 422.
  • [20] Y. Xie and Q. Li, “Measurement-conditioned denoising diffusion probabilistic model for under-sampled medical image reconstruction,” in International Conference on Medical Image Computing and Computer-Assisted Intervention.   Springer, 2022, pp. 655–664.
  • [21] H. Chung and J. C. Ye, “Score-based diffusion models for accelerated mri,” Medical image analysis, vol. 80, p. 102479, 2022.
  • [22] J. Song, A. Vahdat, M. Mardani, and J. Kautz, “Pseudoinverse-guided diffusion models for inverse problems,” in International Conference on Learning Representations, 2022.
  • [23] A. Güngör, S. U. Dar, Ş. Öztürk, Y. Korkmaz, H. A. Bedel, G. Elmas, M. Ozbey, and T. Çukur, “Adaptive diffusion priors for accelerated mri reconstruction,” Medical Image Analysis, p. 102872, 2023.
  • [24] J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” Advances in neural information processing systems, vol. 33, pp. 6840–6851, 2020.
  • [25] A. Q. Nichol and P. Dhariwal, “Improved denoising diffusion probabilistic models,” in International Conference on Machine Learning.   PMLR, 2021, pp. 8162–8171.
  • [26] J. Song, C. Meng, and S. Ermon, “Denoising diffusion implicit models,” in International Conference on Learning Representations, 2020.
  • [27] M. Özbey, O. Dalmaz, S. U. Dar, H. A. Bedel, Ş. Özturk, A. Güngör, and T. Çukur, “Unsupervised medical image translation with adversarial diffusion models,” IEEE Transactions on Medical Imaging, 2023.
  • [28] G. Luo, M. Blumenthal, M. Heide, and M. Uecker, “Bayesian mri reconstruction with joint uncertainty estimation using diffusion models,” Magnetic Resonance in Medicine, vol. 90, no. 1, pp. 295–311, 2023.
  • [29] C. Peng, P. Guo, S. K. Zhou, V. M. Patel, and R. Chellappa, “Towards performant and reliable undersampled mr reconstruction via diffusion model sampling,” in International Conference on Medical Image Computing and Computer-Assisted Intervention.   Springer, 2022, pp. 623–633.
  • [30] M. Lustig, D. Donoho, and J. M. Pauly, “Sparse mri: The application of compressed sensing for rapid mr imaging,” Magnetic Resonance in Medicine: An Official Journal of the International Society for Magnetic Resonance in Medicine, vol. 58, no. 6, pp. 1182–1195, 2007.
  • [31] J. Zhang and B. Ghanem, “Ista-net: Interpretable optimization-inspired deep network for image compressive sensing,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 1828–1837.
  • [32] W. Bian, Y. Chen, and X. Ye, “An optimal control framework for joint-channel parallel mri reconstruction without coil sensitivities,” Magnetic Resonance Imaging, 2022.
  • [33] T. Zhang, J. M. Pauly, and I. R. Levesque, “Accelerating parameter mapping with a locally low rank constraint,” Magnetic resonance in medicine, vol. 73, no. 2, pp. 655–661, 2015.
  • [34] O. Maier, J. Schoormans, M. Schloegl, G. J. Strijkers, A. Lesch, T. Benkert, T. Block, B. F. Coolen, K. Bredies, and R. Stollberger, “Rapid t1 quantification from high resolution 3d data with model-based reconstruction,” Magnetic resonance in medicine, vol. 81, no. 3, pp. 2072–2089, 2019.
  • [35] M. Uecker, P. Lai, M. J. Murphy, P. Virtue, M. Elad, J. M. Pauly, S. S. Vasanawala, and M. Lustig, “Espirit—an eigenvalue approach to autocalibrating parallel mri: where sense meets grappa,” Magnetic resonance in medicine, vol. 71, no. 3, pp. 990–1001, 2014.
  • [36] L. I. Sacolick, F. Wiesinger, I. Hancu, and M. W. Vogel, “B1 mapping by bloch-siegert shift,” Magnetic resonance in medicine, vol. 63, no. 5, pp. 1315–1322, 2010.