
Multiple Residual Dense Networks for Reconfigurable Intelligent Surfaces Cascaded Channel Estimation

Yu Jin, Jiayi Zhang, Chongwen Huang, Liang Yang, Huahua Xiao, Bo Ai, and Zhiqin Wang Y. Jin and J. Zhang are with the School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China, and also with the Frontiers Science Center for Smart High-speed Railway System, Beijing Jiaotong University, Beijing 100044, China (e-mail: jiayizhang@bjtu.edu.cn).C. Huang is with Zhejiang Provincial Key Lab of information processing, communication and networking, Zhejiang University, Hangzhou 310007, China (e-mail: chongwenhuang@zju.edu.cn).L. Yang is with College of Computer Science, and Electronic Engineering, Hunan University, Changsha 410082, China (e-mail: liangy@hnu.edu.cn).H. Xiao is with ZTE Corporation, and State Key Laboratory of Mobile Network and Mobile Multimedia Technology, Shenzhen 518057, China (e-mail: xiao.huahua@zte.com.cn).B. Ai is with the State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing 100044, China, and also with the Frontiers Science Center for Smart High-speed Railway System, and also with Henan Joint International Research Laboratory of Intelligent Networking and Data Analysis, Zhengzhou University, Zhengzhou 450001, China, and also with Research Center of Networks and Communications, Peng Cheng Laboratory, Shenzhen 518055, China (e-mail: boai@bjtu.edu.cn).Z. Wang is with China Academy of Information and Communications Technology, Beijing 100191, P. R. China (e-mail: zhiqin.wang@caict.ac.cn).
Abstract

Reconfigurable intelligent surface (RIS) constitutes an essential and promising paradigm that relies on a programmable wireless environment and provides the capability for space-intensive communications, owing to the use of low-cost massive reflecting elements over the entire surfaces of man-made structures. However, accurate channel estimation is a fundamental technical prerequisite for achieving the huge performance gains promised by RIS. By leveraging the low-rank structure of RIS channels, three practical residual neural networks, namely the convolutional blind denoising network, the convolutional denoising generative adversarial network, and the multiple residual dense network, are proposed to obtain accurate channel state information, reflecting the impact of different architectures on estimation performance. Simulation results show the evolution direction of these three methods and their superior performance compared with existing benchmark schemes.

Index Terms:
Channel estimation, deep learning, multiple residual dense network, reconfigurable intelligent surface.

I Introduction

To meet the ultra-high data rate and ubiquitous coverage requirements of sixth-generation (6G) wireless networks, reconfigurable intelligent surface (RIS) aided massive multiple-input multiple-output (MIMO), as one of the most promising and innovative techniques, is envisioned to significantly reduce link blocking probability and system energy consumption while improving link quality through sophisticated beamforming [1, 2, 3]. RIS-aided MIMO has been explored with near-passive arrays to enable green and sustainable communications between the user equipment (UE) and the base station (BS): by appropriately and dynamically adjusting the magnitude and phase response, wireless signals can be coherently combined and steered toward desired directions [4, 5]. Each RIS reflective element can individually control the amplitude response and phase shift of the incident electromagnetic waves at the nanosecond level to achieve energy concentration. Through reflection, refraction, absorption, and transmission, the reshaped electromagnetic waves form new propagation paths. Owing to these passive and low-cost characteristics of the reflective elements, a RIS system requires very little energy to improve the electromagnetic environment and extend propagation coverage.

However, to benefit from such systematic performance improvements, the RIS system relies on the assumption of perfect channel state information (CSI); the above works assume perfect CSI without considering the difficulty of obtaining it. First, it is quite difficult to estimate the RIS-UE and RIS-BS channels separately unless the RIS is equipped with radio frequency (RF) chains. Second, the cascaded channel between the BS and the UE through the RIS can be very high-dimensional due to the massive number of reflecting elements. Assuming that the RIS elements are connected to RF chains, channel estimation can be performed with acceptable performance through compressed sensing (CS) based methods. However, owing to their extremely low deployment, hardware, and communication costs, purely passive RIS reflecting elements are undoubtedly more attractive.

By assuming active reflection patterns to reduce the active array size and hence the hardware complexity, a conventional least squares (LS) method has been proposed. In addition, by exploiting the low-rank characteristics of the MIMO channel, the training overhead can be reduced through sparse matrix decomposition. Considering the sparse representation of cascaded channels, a CS method is proposed in [6]. Furthermore, as the UE and BS are equipped with more antennas, the channel estimation complexity increases sharply. Using the angular-domain channel sparsity, a CS-based channel estimation scheme is proposed in [7]. However, differences in structured sparsity between channels cause performance loss. Moreover, deep learning (DL) has been proposed to predict the optimal RIS phase shift matrices [8], but obtaining accurate CSI remains essential.

In the field of image denoising, earlier convolutional neural network (CNN) structures construct pairs of training images by adding synthetic noise to noise-free images [9]. Considering the similarity between image denoising and channel estimation, a deep residual learning approach was proposed to learn the cascaded channels from noisy pilot-based observations [10, 11]. Recently, a new architecture called the multiple residual dense network (MRDN), which uses the residual dense network (RDN) as its building component, has been proposed and has received great attention [12].

In this correspondence, we propose three practical residual neural networks for cascaded channel estimation. The main contributions are as follows: First, the convolutional blind denoising network (CBDNet) and the generative adversarial network-based convolutional blind denoiser (GAN-CBD) are proposed to obtain accurate CSI with offline-trained neural networks; Second, the multiple residual dense network (MRDN) is proposed to flexibly adapt to online cascaded channel estimation; Finally, numerical results confirm that the proposed methods significantly outperform existing schemes such as ADMM and CV-DnCNN.

II System Model

Figure 1: MRDN-based channel estimation for RIS system.

We begin by considering the uplink of a time division duplex (TDD) RIS-aided mmWave communication system. As shown in Fig. 1, the system comprises one RIS, one controller, one base station (BS) equipped with $N_b$ antennas, and $K$ user equipments (UEs) each equipped with $N_u$ antennas. The planar RIS is equipped with $N=N_{\mathrm{v}}N_{\mathrm{h}}$ passive reflecting elements, where $N_{\mathrm{h}}$ and $N_{\mathrm{v}}$ denote the numbers of elements in the horizontal and vertical orientations, respectively. We define $\mathbf{h}_{\mathrm{r},u_k}\in\mathbb{C}^{N\times N_u}$ and $\mathbf{h}_{\mathrm{r},b}\in\mathbb{C}^{N\times N_b}$ as the channels from the $k$th UE to the RIS and from the BS to the RIS, respectively, and $\mathbf{h}_{u_k,b}\in\mathbb{C}^{N_b\times N_u}$ as the direct channel between the $k$th UE and the BS, where $\mathbb{C}^{M\times N}$ denotes an $M\times N$ complex-valued matrix. The received signal can then be expressed as

$$\mathbf{y}=\sum_{k=1}^{K}\Bigg(\underbrace{\mathbf{h}_{\mathrm{r},b}^{T}\bm{\Psi}_{k}\mathbf{h}_{\mathrm{r},u_{k}}\bm{\Phi}_{k}^{\mathrm{T}}}_{\text{RIS-assisted link}}+\underbrace{\mathbf{h}_{u_{k},b}\bm{\Phi}_{k}^{\mathrm{T}}}_{\text{Direct link}}\Bigg)+\mathbf{n}, \qquad (1)$$

where $\mathbf{n}\sim\mathcal{CN}\left(\mathbf{0},\sigma_{n}^{2}\mathbf{I}\right)$ denotes the noise vector at the BS, with $\sigma_{n}^{2}$ the noise power at each antenna, and $\bm{\Phi}_{k}=[\bm{\phi}_{k,1},\bm{\phi}_{k,2},\ldots,\bm{\phi}_{k,N_u}]\in\mathbb{C}^{\tau\times N_u}$ denotes the pilot matrix of the $k$th UE, where $\bm{\phi}_{k,n}\in\mathbb{C}^{\tau\times 1}$ is the orthogonal pilot sequence sent by the $n$th antenna of the $k$th UE ($\bm{\phi}_{k_1,i}^{H}\bm{\phi}_{k_2,j}=0$ if $k_1\neq k_2$ or $i\neq j$; $\bm{\phi}_{k_1,i}^{H}\bm{\phi}_{k_2,j}=1$ if $k_1=k_2$ and $i=j$, $\forall k_1,k_2\in\{1,2,\ldots,K\}$). Each antenna of each UE transmits a different pilot sequence; in particular, a pilot is allocated to only one UE, yielding an orthogonal pilot matrix. We consider a simple model in which the users in each time slot may have different optimal RIS phase shift matrices; accordingly, the RIS phase shift matrix $\bm{\Psi}_{k}$ represents the phase shifts introduced by the RIS to the impinging signal in the $k$th time slot. In addition, $\bm{\Psi}_{k}\triangleq\operatorname{diag}\{\bm{\psi}_{k}\}\in\mathbb{C}^{N\times N}$, with $\bm{\psi}_{k}\in\mathbb{C}^{N\times 1}$ representing the effective phase shifts of the RIS reflecting elements, whose $n$th element is $[\bm{\psi}_{k}]_{n}=\varpi_{n}e^{j\theta_{n}},\forall n\in\{1,2,\ldots,N\}$. Without loss of generality, we assume $\bm{\psi}_{k}=\mathbf{1}$.

By exploiting $\bm{\Phi}_{k}^{\mathrm{T}}\bm{\Phi}_{k}^{*}=\mathbf{I}$, and to simplify the design and analysis of the channel estimation algorithms in this work, we assume that there is no direct link between the UE and the BS due to blockages or negligible received power. The processed received signal of the $k$th UE at the BS is then given by

$$\mathbf{y}_{k}=\mathbf{y}\bm{\Phi}_{k}^{*}=\mathbf{h}_{\mathrm{r},b}^{T}\bm{\Psi}_{k}\mathbf{h}_{\mathrm{r},u}+\mathbf{n}\bm{\Phi}_{k}^{*}. \qquad (2)$$
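For illustration, the following is a minimal NumPy sketch of the processed received signal in (2), assuming synthetic complex Gaussian channels, toy dimensions, and the all-ones phase vector assumed above; it is not the simulation configuration of Section IV.

```python
# Minimal sketch of (2); channels are synthetic complex Gaussian and the
# dimensions are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
N, N_b, N_u = 64, 8, 4        # RIS elements, BS antennas, UE antennas (toy sizes)
sigma_n = 0.1                 # noise standard deviation

crandn = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)
h_rb = crandn(N, N_b)         # BS-to-RIS channel, C^{N x N_b}
h_ru = crandn(N, N_u)         # UE-to-RIS channel, C^{N x N_u}
psi = np.ones(N)              # RIS phase shifts, psi_k = 1 as assumed above

# y_k = h_rb^T diag(psi) h_ru + (noise after pilot despreading)
y_k = h_rb.T @ np.diag(psi) @ h_ru + sigma_n * crandn(N_b, N_u)
print(y_k.shape)              # (8, 4), i.e., (N_b, N_u)
```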

Since practical mmWave channels usually contain a limited number of scatterers, a LoS path is expected in RIS systems. The mmWave channels from the $k$th UE to the RIS and from the BS to the RIS are, respectively, given by

$$\mathbf{h}_{\mathrm{r},u}=\sum_{l=1}^{L_{\mathrm{T}}}z_{l}\,\bm{\alpha}_{\mathrm{R},t}\left(\phi_{\mathrm{R},l}^{\mathrm{azi}},\phi_{\mathrm{R},l}^{\mathrm{ele}}\right)\bm{\alpha}_{\mathrm{T},t}^{H}\left(\phi_{\mathrm{T},l}^{\mathrm{azi}},\phi_{\mathrm{T},l}^{\mathrm{ele}}\right), \qquad (3)$$
$$\mathbf{h}_{\mathrm{r},b}=\sum_{l=1}^{L_{\mathrm{R}}}z_{l}\,\bm{\alpha}_{\mathrm{R},r}\left(\phi_{\mathrm{R},l}^{\mathrm{azi}},\phi_{\mathrm{R},l}^{\mathrm{ele}}\right)\bm{\alpha}_{\mathrm{T},r}^{H}\left(\phi_{\mathrm{T},l}^{\mathrm{azi}},\phi_{\mathrm{T},l}^{\mathrm{ele}}\right), \qquad (4)$$

where $L\ll\min\left(N_{\text{act}},N_{t}\right)$ denotes the number of multipaths and $z_{l}\in\mathbb{C}$ denotes the distance-dependent pathloss of the $l$th path. $\phi_{\mathrm{R},l}^{\mathrm{ele}}$ ($\phi_{\mathrm{R},l}^{\mathrm{azi}}$) denotes the elevation (azimuth) angle-of-arrival of the $l$th path. The steering vectors depend on the array geometry: $\bm{\alpha}_{\mathrm{R}}(\phi_{\mathrm{R},l}^{\mathrm{azi}},\phi_{\mathrm{R},l}^{\mathrm{ele}})\in\mathbb{C}^{N_{r}\times 1}$ and $\bm{\alpha}_{\mathrm{T}}(\phi_{\mathrm{T},l}^{\mathrm{azi}},\phi_{\mathrm{T},l}^{\mathrm{ele}})\in\mathbb{C}^{N_{t}\times 1}$ are given by

$$\bm{\alpha}_{\mathrm{R},t}\left(\phi_{\mathrm{R},l}^{\mathrm{azi}},\phi_{\mathrm{R},l}^{\mathrm{ele}}\right)=\left[1,e^{j2\pi d\sin\phi_{\mathrm{R},l}^{\mathrm{azi}}\sin\phi_{\mathrm{R},l}^{\mathrm{ele}}/\lambda},\ldots,e^{j2\pi d(N_{\mathrm{v}}-1)\sin\phi_{\mathrm{R},l}^{\mathrm{azi}}\sin\phi_{\mathrm{R},l}^{\mathrm{ele}}/\lambda}\right]^{T}\otimes\left[1,e^{j2\pi d\cos\phi_{\mathrm{R},l}^{\mathrm{ele}}/\lambda},\ldots,e^{j2\pi d(N_{\mathrm{h}}-1)\cos\phi_{\mathrm{R},l}^{\mathrm{ele}}/\lambda}\right]^{T}, \qquad (5)$$
$$\bm{\alpha}_{\mathrm{T},t}\left(\phi_{\mathrm{T},l}^{\mathrm{azi}},\phi_{\mathrm{T},l}^{\mathrm{ele}}\right)=\left[1,e^{j2\pi d\sin\phi_{\mathrm{T},l}^{\mathrm{azi}}\sin\phi_{\mathrm{T},l}^{\mathrm{ele}}/\lambda},\ldots,e^{j2\pi d(N_{T1}-1)\sin\phi_{\mathrm{T},l}^{\mathrm{azi}}\sin\phi_{\mathrm{T},l}^{\mathrm{ele}}/\lambda}\right]^{T}\otimes\left[1,e^{j2\pi d\cos\phi_{\mathrm{T},l}^{\mathrm{ele}}/\lambda},\ldots,e^{j2\pi d(N_{T2}-1)\cos\phi_{\mathrm{T},l}^{\mathrm{ele}}/\lambda}\right]^{T}, \qquad (6)$$

where $\lambda$ denotes the wavelength, $d$ denotes the antenna spacing, and $\otimes$ is the Kronecker product. $\phi_{\mathrm{T},l}^{\mathrm{ele}}$ ($\phi_{\mathrm{T},l}^{\mathrm{azi}}$) denotes the elevation (azimuth) angle-of-departure of the $l$th path. $\bm{\alpha}_{\mathrm{T}}(\phi_{\mathrm{T},l}^{\mathrm{azi}},\phi_{\mathrm{T},l}^{\mathrm{ele}})$ and $\bm{\alpha}_{\mathrm{R}}(\phi_{\mathrm{R},l}^{\mathrm{azi}},\phi_{\mathrm{R},l}^{\mathrm{ele}})$ denote the steering vectors at the transmitter and receiver sides, respectively.
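As an illustration of (5), the sketch below builds the planar-array steering vector as the Kronecker product of two uniform-linear-array factors; the half-wavelength spacing and the angle values are assumptions.

```python
# Minimal sketch of the UPA steering vector in (5); d/lambda = 0.5 is assumed.
import numpy as np

def upa_steering(n_v, n_h, phi_azi, phi_ele, d_over_lambda=0.5):
    """N_v*N_h steering vector for given azimuth/elevation angles (radians)."""
    vertical = np.exp(1j * 2 * np.pi * d_over_lambda * np.arange(n_v)
                      * np.sin(phi_azi) * np.sin(phi_ele))
    horizontal = np.exp(1j * 2 * np.pi * d_over_lambda * np.arange(n_h)
                        * np.cos(phi_ele))
    return np.kron(vertical, horizontal)   # Kronecker product, as in (5)

a_r = upa_steering(8, 8, np.pi / 6, np.pi / 4)
print(a_r.shape)  # (64,)
```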

III Proposed Channel Estimation Methods

In this section, we introduce CBDNet, GAN-CBD, and MRDN for the cascaded channel estimation of RIS systems. Leveraging the sparsity of the cascaded mmWave channel, we first introduce the CBDNet-based method into cascaded channel estimation in line with previous works, and then use the GAN structure to improve the network. Specifically, the proposed MRDN combines the residual dense network (RDN) structure with the convolutional block attention module (CBAM) [13], which serves as a building block and can obtain accurate CSI for the sparse cascaded channel. Compared with existing baseline schemes, MRDN can reduce the complexity of the RIS hardware. In the following, we present the CBDNet, GAN-CBD, and MRDN channel estimators. Throughout this correspondence, $\mathbf{x}$ and $\mathbf{z}$ represent the input and output of the generic layers and networks, respectively.

III-A CBDNet-based Channel Estimator

Let $\textrm{DNN}_{E}$ and $\textrm{DNN}_{D}$ denote the noise level estimation subnetwork and the non-blind denoising subnetwork, respectively, and let $\Theta_{E}$ and $\Theta_{D}$ be their network parameters.

III-A1 Basic Structure

Assuming that $*$ denotes the convolution operator, with $\mathbf{x}$ and $\mathbf{z}$ the input and output of the $k$th Conv layer, the convolutional layer is expressed as

$$\mathbf{z}=W_{k}*\mathbf{x}+b_{k}, \qquad (7)$$

where the weight matrix $W_{k}$ and bias $b_{k}$ are the parameters of the $k$th Conv layer. We write $\mathbf{z}=c(\mathbf{x})$ with $\Theta_{k,c}=\{W_{k,c},b_{k,c}\}$ for “Conv” layers, and $\mathbf{z}=s(\mathbf{x})$ with $\Theta_{k,s}=\{W_{k,s},b_{k,s}\}$ for “SoftMax” layers. With $\max$ denoting the “ReLU” activation, the “ReLU” layer is

$$\mathbf{z}=\max\left(0,\mathbf{x}\right), \qquad (8)$$

which we denote as $\mathbf{z}=r(\mathbf{x})$ for “ReLU” layers.
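For concreteness, a PyTorch rendering of the layer primitives (7) and (8) might look as follows; the channel counts and the single-channel input layout are assumptions.

```python
# Minimal sketch of the "Conv" (7) and "ReLU" (8) layer primitives.
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=64, kernel_size=3, padding=1)  # z = W_k * x + b_k
relu = nn.ReLU()                                                            # z = max(0, x)

Y = torch.randn(1, 1, 64, 64)   # a super matrix treated as a 1-channel image
z = relu(conv(Y))
print(z.shape)                  # torch.Size([1, 64, 64, 64])
```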

III-A2 Noise Level Estimation Subnetwork

  • Input Layer: As the real and imaginary parts of the received signal matrix $\mathbf{y}_{k}\in\mathbb{C}^{N_b\times N_u}$ are independent at the BS, we first combine them into a super matrix $\mathbf{Y}\in\mathbb{R}^{N_b\times 2N_u}$ as the input of $\textrm{DNN}_{E}$, where $\mathbb{R}^{M\times N}$ denotes an $M\times N$ real-valued matrix.

  • Convolutional sensing: $\textrm{DNN}_{E}$ consists of $\mathrm{B}_{c}$ Conv layers and $\mathrm{K}$ SoftMax layers (see the sketch after this list). The recurrence relation of the main body of $\textrm{DNN}_{E}$ is

    $$\sigma=\mathcal{F}_{E}(\mathbf{Y},\Theta_{E})=c\circ\cdots\circ c\circ s\circ\cdots\circ s(\mathbf{Y})=(c)^{\mathrm{B}_{c}}\circ(s)^{\mathrm{K}}(\mathbf{Y}), \qquad (9)$$

    where the operator $\circ$ denotes function composition, $\sigma$ denotes the noise level of the space-invariant AWGN, $\mathbf{M}\in\mathbb{R}^{N_b\times 2N_u}$ is a uniform map whose elements all equal $\sigma$, and $\Theta_{E}=\{\Theta_{1,c},\ldots,\Theta_{\mathrm{B}_{c},c},\Theta_{1,s},\ldots,\Theta_{\mathrm{K},s}\}$. Here $\mathcal{F}_{E}:\mathbb{R}^{N_b\times 2N_u}\mapsto\mathbb{R}^{1\times 1}$ is the mapping function of $\textrm{DNN}_{E}$.
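A hedged PyTorch sketch of $\textrm{DNN}_{E}$ in (9) follows: a stack of Conv/ReLU stages with a SoftMax stage, pooled to a single noise level per sample. The layer counts and the final pooling are assumptions, since the text only specifies the mapping $\mathbb{R}^{N_b\times 2N_u}\mapsto\mathbb{R}^{1\times 1}$.

```python
# Hedged sketch of the noise-level estimation subnetwork DNN_E in (9).
import torch
import torch.nn as nn

class NoiseLevelNet(nn.Module):
    def __init__(self, feats=32, num_conv=4):
        super().__init__()
        layers = [nn.Conv2d(1, feats, 3, padding=1), nn.ReLU()]
        for _ in range(num_conv - 1):
            layers += [nn.Conv2d(feats, feats, 3, padding=1), nn.ReLU()]
        self.body = nn.Sequential(*layers)          # the (c)^{B_c} part of (9)
        self.softmax = nn.Softmax(dim=1)            # the (s)^{K} part of (9)
        self.head = nn.Conv2d(feats, 1, 3, padding=1)

    def forward(self, Y):                           # Y: (batch, 1, N_b, 2*N_u)
        w = self.softmax(self.body(Y))
        return self.head(w).mean(dim=(1, 2, 3))     # one sigma per sample (assumed pooling)

sigma = NoiseLevelNet()(torch.randn(4, 1, 64, 64))
print(sigma.shape)  # torch.Size([4])
```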

III-A3 Non-Blind Denoising Subnetwork

  • Input Layer: $\textrm{DNN}_{D}$ takes both $\mathbf{Y}$ and $\mathbf{M}$ as input to obtain the estimated channel $\widehat{\mathbf{H}}$.

  • Residual Blocks: $\textrm{DNN}_{D}$ consists of $\mathrm{B}$ residual blocks $c\circ b\circ r$; the recurrence relation of the main body of $\textrm{DNN}_{D}$ is

    $$\mathbf{H}_{\mathrm{m}}=\mathcal{F}_{D}(\mathbf{Y},\mathbf{M},\Theta_{D})=c\circ b\circ r\circ\cdots\circ c\circ b\circ r(\mathbf{Y},\mathbf{M})=(c\circ b\circ r)^{\mathrm{B}}(\mathbf{Y},\mathbf{M}). \qquad (10)$$

    The intermediate output is $\mathbf{H}_{\mathrm{m}}=\mathcal{F}_{D}(\mathbf{Y},\mathbf{M},\Theta_{D})$, where $\mathcal{F}_{D}:\mathbb{R}^{N_b\times 2N_u}\mapsto\mathbb{R}^{N_b\times 2N_u}$ is the mapping function of the stacked residual blocks.

  • Output Layer: By reversing the combining, the intermediate output $\mathbf{H}_{\mathrm{m}}\in\mathbb{R}^{N_b\times 2N_u}$ of $\textrm{DNN}_{D}$ yields the estimated channel matrix $\widehat{\mathbf{H}}\in\mathbb{C}^{N_b\times N_u}$.

  • Loss Function: In asymmetric learning, the estimated noise level enters the loss function to quantify the effectiveness of the $\textrm{DNN}_{D}$ criterion (see the sketch after this list). The loss function is

    $$\mathcal{L}_{\text{rec}}=\frac{1}{\sigma}\|\widehat{\mathbf{H}}-\mathbf{H}\|^{2}. \qquad (11)$$

    Given the estimated noise level $\sigma(\mathbf{Y})$ and the ground truth $\sigma(\mathbf{Y}_{i})$, more penalty is incorporated into their MSE when $\sigma(\mathbf{Y})<\sigma(\mathbf{Y}_{i})$.
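The following sketch pairs the reconstruction loss (11) with an asymmetric noise-level penalty of the kind described above, where under-estimates are weighted more heavily; the weighting form and the value of alpha follow CBDNet-style denoisers and are assumptions here.

```python
# Hedged sketch of (11) plus an asymmetric penalty on the noise level.
import torch

def reconstruction_loss(H_hat, H, sigma):
    # L_rec = (1/sigma) * ||H_hat - H||^2, from (11); sigma: (batch,)
    return ((H_hat - H) ** 2).sum(dim=(1, 2, 3)).div(sigma).mean()

def asymmetric_loss(sigma_hat, sigma_true, alpha=0.3):
    under = (sigma_hat < sigma_true).float()   # 1 where the level is under-estimated
    weight = torch.abs(alpha - under)          # 1 - alpha > alpha: heavier penalty
    return (weight * (sigma_hat - sigma_true) ** 2).mean()
```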

III-B GAN-based Channel Estimation

Motivated by the development of the generative adversarial network (GAN) structure and taking the preceding CBDNet as our generator subnetwork, we develop GAN-CBD for denoising. The GAN paradigm trains the generator to produce samples in the way CBDNet does, and the outputs of the GAN-CBD network are compared with the labeled results so that the discriminator $D$ works well. $D$ is trained to distinguish the training examples from the samples generated by $G$, while $G$ is trained under the judgment of $D$ to reduce the possibility of its samples being misclassified.

III-B1 Generator Network

In addition, to verify the effectiveness of the GAN structure, we use CBDNet as the generator network. $\textrm{GAN}_{D}$ consists of $\mathrm{B}$ residual blocks. We have

$$\widehat{\mathbf{H}}=\mathcal{G}_{d}(\mathbf{Y},\mathbf{M},\Theta_{G_{d}})=c\circ b\circ r\circ\cdots\circ c\circ b\circ r(\mathbf{Y},\mathbf{M})=(c\circ b\circ r)^{\mathrm{B}}(\mathbf{Y},\mathbf{M}), \qquad (12)$$

where $\mathcal{G}_{d}:\mathbb{R}^{N_b\times 2N_u}\mapsto\mathbb{C}^{N_b\times N_u}$ is the mapping function of the generator network and $\mathbf{M}\in\mathbb{R}^{N_b\times 2N_u}$ is the uniform map from $\textrm{GAN}_{E}$, with $\sigma=\mathcal{G}_{e}(\mathbf{Y},\Theta_{G_{e}})$.

III-B2 Discriminator Network

In the original formulation, the training procedure defines a continuous minimax game as

$$\underset{G}{\arg\min}\;\underset{D}{\arg\max}\;\mathbb{E}[\log D(\mathbf{x})]+\mathbb{E}[\log(1-D(G(\mathbf{n})))], \qquad (13)$$

where $D$ is a function that maps $\mathbb{R}^{N_b\times 2N_u}$ to the unit interval, and $G$ is a function that maps a noise vector $\mathbf{n}\in\mathbb{R}^{N_b\times 2N_u}$, drawn from a simple distribution $p(\mathbf{n})$, to the ambient space of the training data $\mathbb{R}^{N_b\times 2N_u}$.
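One alternating training step of the minimax game (13) could be sketched as below; `G`, `D`, and the two optimizers are assumed to exist, with `D` producing probabilities in (0, 1).

```python
# Hedged sketch of one alternating update of the minimax game in (13).
import torch

def gan_step(G, D, opt_g, opt_d, Y, M, H_true, eps=1e-8):
    # Discriminator: maximize log D(H) + log(1 - D(G(Y, M)))
    H_fake = G(Y, M).detach()
    d_loss = -(torch.log(D(H_true) + eps).mean()
               + torch.log(1.0 - D(H_fake) + eps).mean())
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: minimize log(1 - D(G(Y, M))), here in non-saturating form
    g_loss = -torch.log(D(G(Y, M)) + eps).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```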

III-C MRDN-based Channel Estimation

We define the feature concatenation of RDN and CBAM as shown in Fig. 2 and use it as a building module of MRDN.

III-C1 Basic Structure

Assuming that $*$ denotes the “Conv” layer function and $\max$ denotes the “ReLU” layer function, the residual block model is the composition of two cascaded functions:

$$\mathbf{z}_{-1}=W_{n,r}*\mathbf{x}+b_{n,r}, \qquad (14)$$
$$\mathbf{z}_{0}=\max\left(0,\mathbf{z}_{-1}\right), \qquad (15)$$

where the weight and bias matrices of the $n$th residual block are denoted by $\Theta_{n,r}=\{W_{n,r},b_{n,r}\},n\in\{1,2,\ldots,\mathrm{B}\}$, and $\mathbf{x}$ and $\mathbf{z}_{0}$ are the input and output of the residual block, respectively. Let $g_{n}$ denote the single recursion function of the $n$th residual block.

Figure 2: MRDN-based channel estimation system.
Figure 3: RDN system model.

III-C2 Residual Dense Network Structure

RDN performs well in image denoising problems. Motivated by recent image restoration networks including RDN, we include a global residual connection so that the network can focus on learning the difference between the noisy and ground-truth channel matrices. The main body of RDN has $\mathrm{B}$ layers. The recurrence relation for the $n$th layer is $F_{1}=g_{1}(\mathbf{Y})$ and (see the sketch below)

$$F_{n}=g_{n}(F_{n-1}(\mathbf{Y}),\cdots,F_{1}(\mathbf{Y}),\mathbf{Y}),\quad\forall n\in\{2,\ldots,\mathrm{B}\}. \qquad (16)$$
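A hedged PyTorch sketch of the recurrence (16) with a global residual connection is given below; the growth rate, depth, and the 1x1 fusion layer are assumptions in the spirit of RDN.

```python
# Hedged sketch of the densely connected recurrence (16): layer n takes
# the concatenation of Y and all earlier outputs F_1, ..., F_{n-1}.
import torch
import torch.nn as nn

class DenseBody(nn.Module):
    def __init__(self, in_ch=64, growth=16, depth=4):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Sequential(nn.Conv2d(in_ch + n * growth, growth, 3, padding=1),
                          nn.ReLU())
            for n in range(depth))
        self.fuse = nn.Conv2d(in_ch + depth * growth, in_ch, 1)  # feature fusion

    def forward(self, y):
        feats = [y]
        for g_n in self.layers:                        # F_n = g_n(F_{n-1}, ..., F_1, Y)
            feats.append(g_n(torch.cat(feats, dim=1)))
        return y + self.fuse(torch.cat(feats, dim=1))  # global residual connection
```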

III-C3 Convolutional Block Attention Module

The CBAM is composed of two Conv layers with a ReLU in between:
$$\mathbf{z}_{-1}=W_{-1,a}*\mathbf{x}+b_{-1,a}, \qquad (17)$$
$$\mathbf{z}_{0}=\max\left(0,\mathbf{z}_{-1}\right), \qquad (18)$$
$$\mathbf{z}_{1}=W_{1,a}*\mathbf{z}_{0}+b_{1,a}, \qquad (19)$$

where the weight and bias matrices constitute the CBAM parameters $\Theta_{a}=\{W_{-1,a},W_{1,a},b_{-1,a},b_{1,a}\}$, and $\mathbf{x}$ and $\mathbf{z}_{1}$ are the input and output of the CBAM. The recurrence relation of the CBAM in MRDN is $A(\mathbf{x})=c\circ r\circ c(\mathbf{x})$, sketched below.
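A minimal rendering of $A(\mathbf{x})=c\circ r\circ c(\mathbf{x})$ from (17)-(19) follows; note that the full CBAM of [13] additionally applies channel and spatial gating, which this simplified sketch omits.

```python
# Minimal sketch of the attention block A(x) = c ∘ r ∘ c(x) in (17)-(19).
import torch.nn as nn

class AttentionBlock(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1),   # z_{-1} = W_{-1,a} * x + b_{-1,a}
            nn.ReLU(),                         # z_0 = max(0, z_{-1})
            nn.Conv2d(ch, ch, 3, padding=1))   # z_1 = W_{1,a} * z_0 + b_{1,a}

    def forward(self, x):
        return self.body(x)
```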

Figure 4: Convergence of CBDNet, GAN-CBD and MRDN.
Figure 5: NMSE performance comparison of CBDNet, GAN-CBD, MRDN with CS methods.

III-C4 Input Layer

As the real and imaginary parts of the received signal matrix $\mathbf{y}_{k}\in\mathbb{C}^{N_b\times N_u}$ are independent at the RIS, we first combine them into a super matrix $\mathbf{Y}\in\mathbb{R}^{N_b\times 2N_u}$. In this case, the channel matrix can be treated as a 2D image, and the super matrix $\mathbf{Y}$ is the input of MRDN.

III-C5 Multiple Residual Dense Network Structure

We take advantage of novel ideas in RDN and RCAN (residual channel attention network) as follows; a sketch of the resulting network appears after this list.

  • RDN itself is an image restoration network, but we use it with modifications as a component of our network and construct a cascaded structure of $N_{R}$ RDNs as our denoising network.

  • The recurrence relation of the main body of MRDN is

    $$M(\mathbf{x})=F_{n,N_{R}}\circ F_{n,N_{R}-1}\circ\cdots\circ F_{n,1}(\mathbf{x}), \qquad (20)$$
    $$F(\mathbf{x})=M\circ A(\mathbf{x})=F_{n}^{N_{R}}\circ A(\mathbf{x}), \qquad (21)$$

    where the operator $\circ$ denotes function composition and $F_{n}^{N_{R}}$ denotes the $N_{R}$-fold composition of $F_{n}$. The intermediate output is $\mathbf{H}_{\mathrm{m}}=F(\mathbf{Y})$, where $F:\mathbb{R}^{N_b\times 2N_u}\mapsto\mathbb{R}^{N_b\times 2N_u}$ is the mapping function of MRDN.
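Putting the pieces together, a hedged sketch of the composition (20)-(21) reuses the DenseBody and AttentionBlock sketches above; the lifting/projection convolutions and the subtraction of the estimated noise map are assumptions consistent with the global residual design described earlier.

```python
# Hedged sketch of MRDN per (20)-(21): F(x) = M ∘ A(x), a cascade of N_R
# RDN blocks after the attention block, with a global residual.
import torch.nn as nn

class MRDN(nn.Module):
    def __init__(self, n_rdn=6, ch=64):
        super().__init__()
        self.lift = nn.Conv2d(1, ch, 3, padding=1)                   # to feature space
        self.attn = AttentionBlock(ch)                               # A(.)
        self.rdns = nn.Sequential(*[DenseBody(in_ch=ch) for _ in range(n_rdn)])  # M(.)
        self.proj = nn.Conv2d(ch, 1, 3, padding=1)                   # back to one channel

    def forward(self, Y):                   # Y: (batch, 1, N_b, 2*N_u)
        h = self.rdns(self.attn(self.lift(Y)))
        return Y - self.proj(h)             # subtract the learned noise estimate
```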

III-C6 Computational Complexity Analysis

The computational complexity of the training phase in CBDNet is given by

$$\mathcal{O}\left(N^{2}K^{2}st\left(L_{d}D_{l}^{2}+L_{e}E_{l}^{2}\right)\right), \qquad (22)$$

where $s$ denotes the size of the mini-batch, $t$ denotes the number of iterations, and $K^{2}$ denotes the size of the kernels. $L_{d}$ and $L_{e}$ denote the numbers of “Conv” layers in $\textrm{DNN}_{D}$ and $\textrm{DNN}_{E}$, and $D_{l}$ and $E_{l}$ denote the numbers of features in the $l$th layer of $\textrm{DNN}_{D}$ and $\textrm{DNN}_{E}$, respectively. The computational complexities of the training phase in GAN-CBD and MRDN are given by

$$\mathcal{O}\left(N^{2}K^{2}st\left(L_{g,d}D_{g,l}^{2}+L_{g,e}E_{g,l}^{2}+L_{a}E_{a}^{2}\right)\right), \qquad (23)$$

and

$$\mathcal{O}\left(N^{2}K^{2}stL_{m}^{2}D_{m}^{2}\right). \qquad (24)$$
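For a quick numeric comparison, the cost orders (22) and (24) can be evaluated directly; the symbol values below are purely illustrative assumptions, not the paper's configuration.

```python
# Hedged sketch: evaluating the training-cost orders (22) and (24).
def cbdnet_cost(N, K, s, t, L_d, D_l, L_e, E_l):
    return N**2 * K**2 * s * t * (L_d * D_l**2 + L_e * E_l**2)   # (22)

def mrdn_cost(N, K, s, t, L_m, D_m):
    return N**2 * K**2 * s * t * L_m**2 * D_m**2                 # (24)

# Illustrative values only: ratio of CBDNet cost to MRDN cost.
print(cbdnet_cost(64, 3, 20, 100, 12, 96, 4, 96) / mrdn_cost(64, 3, 20, 100, 6, 80))
```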

IV Simulation Results

Figure 6: NMSE performance of MRDN with different model capacity.

We consider the RIS-aided mmWave massive MIMO system with 20 UEs, where $N_b=64$, $N_u=32$, $N=4096$, $L=3$, and $d=\lambda/2$. In terms of hardware, we use an Intel Core i7-9700K @3.60GHz, 32 GB RAM, and an NVIDIA GeForce RTX 2080Ti to implement the above three models through the PyTorch library. From the perspective of normalized mean square error (NMSE) performance, this section illustrates the pros and cons of the three proposed channel estimators in terms of structure. All simulation results are produced in PyCharm Community Edition (Python 3.8 environment). The learning rate is set to 0.0001 for MRDN and 0.001 for CBDNet and GAN-CBD, and the mini-batch size is 20 for all three methods. The training, validation, and testing sets include 16,000, 6,000, and 8,000 samples, respectively, and the same data set samples are used for all three methods. The numbers of RDNs in MRDN and of residual blocks in both CBDNet and GAN-CBD are 6 and 12, respectively. The MRDN has 80 features, while CBDNet and GAN-CBD have 96 features. The NMSE is defined as (a minimal sketch follows the definition)

$$\text{NMSE}=\mathbb{E}\left(\|\widehat{\mathbf{H}}-\mathbf{H}\|^{2}/\|\widehat{\mathbf{H}}\|^{2}\right). \qquad (25)$$
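A direct implementation of the metric (25) over a batch of complex channel estimates might be:

```python
# Minimal sketch of the NMSE metric in (25), normalized by ||H_hat||^2
# as in the definition above; inputs have shape (batch, N_b, N_u).
import numpy as np

def nmse(H_hat, H):
    err = np.linalg.norm(H_hat - H, axis=(1, 2)) ** 2
    ref = np.linalg.norm(H_hat, axis=(1, 2)) ** 2
    return float(np.mean(err / ref))
```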

Figure 4 compares the three models, MRDN, CBDNet, and GAN-CBD. MRDN achieves the best NMSE performance and the fastest convergence. Because GAN-CBD brings the advantage of a judging network, it outperforms CBDNet. The computational complexity of training and offline operation is greatly reduced, and the robustness of the channel estimator to different scenarios is enhanced. The average running time of MRDN is 0.0075 seconds, while those of CBDNet and GAN-CBD are 0.0094 and 0.0098 seconds, respectively, so the computational complexity of training and offline operation for MRDN is reduced compared with CBDNet and GAN-CBD. At almost the same computational complexity, GAN-CBD achieves better NMSE performance and faster convergence than CBDNet, but compared with MRDN the improvement from the network structure is not significant.

Figure 5 compares the NMSE performance of the proposed MRDN-based channel estimator with that of different structures (e.g., CBDNet [14], GAN-CBD, CV-DnCNN [15]) and with existing conventional channel estimation methods (e.g., ADMM [16], PARAFAC [17]). The simulation results are averaged over 300 iterations for the three proposed methods. MRDN achieves better NMSE performance than GAN-CBD and CBDNet by 5.63 dB and 4.51 dB, respectively. Compared with CV-DnCNN, which is also CNN-based, as well as the conventional ADMM and PARAFAC methods, beyond the significant NMSE performance gains, the lower complexity of MRDN makes it better suited for practical application.

Figure 6 compares the NMSE performance for different numbers of features and RDNs. With more RDNs in the global residual dense connection, and hence more comprehensive receptive fields, the MRDN with 80 features and 6 dense connections per RDN performs best. Consequently, the main challenge in accurately describing the noise is the lack of observational dimensions and of neural network modeling capacity, such as features and layers.

V Conclusion

We proposed the CBDNet, GAN-CBD, and MRDN based cascaded channel estimators for RIS-aided mmWave massive MIMO communication systems. Utilizing the sparsity of the cascaded RIS channels and classic image processing techniques, we regard the channel matrix as a two-dimensional image. The proposed residual dense network structure increases the flexibility of the overall network, yielding better generalization and fitting capabilities, while the advantages brought by the GAN structure are not significant. Based on these advantages over the previous generation of methods, the MRDN-based deep learning network is designed to estimate the cascaded RIS channels. The simulation results show that the performance of the proposed MRDN estimator increases with the scale of the network structure under the same order of complexity as CBDNet and GAN-CBD.

Appendix A Forward and Backward Propagation

Assume that the weight matrix $W_{i}\in\mathbb{R}^{n_{i+1}\times n_{i}}$ and the bias vector $b_{i}\in\mathbb{R}^{n_{i+1}}$ are the parameters of the $i$th “Conv” layer $c_{i}$. In the multilayer perceptron (MLP), we can explicitly write $c_{i}\left(\mathbf{x}_{i};W_{i},b_{i}\right)=r_{i}\left(W_{i}\cdot\mathbf{x}_{i}+b_{i}\right)$, where $r_{i}$ is an elementwise function defined by $r_{i}(\mathbf{x})\equiv\max(0,\mathbf{x})$, whose first derivative is $r_{i}^{\prime}(\mathbf{x})=H(\mathbf{x})$. For any $\mathbf{x}_{i}\in\mathbb{R}^{n_{i}}$ and $U_{i}\in\mathbb{R}^{n_{i+1}\times n_{i}}$ in the inner product space,

$$\nabla_{W_{i}}c_{i}\left(\mathbf{x}_{i}\right)\cdot U_{i}=\mathrm{D}\Psi_{i}\left(\mathbf{z}_{i}\right)\cdot U_{i}\cdot\mathbf{x}_{i}, \qquad (26)$$
$$\nabla_{b_{i}}c_{i}\left(\mathbf{x}_{i}\right)=\mathrm{D}\Psi_{i}\left(\mathbf{z}_{i}\right), \qquad (27)$$

where $\mathbf{z}_{i}=W_{i}\cdot\mathbf{x}_{i}+b_{i}$, $\Psi(v)=\sum_{k=1}^{n}\psi\left(v_{k}\right)e_{k}$, and

$$\mathrm{D}f(x;\theta)\cdot v=\left.\frac{\mathrm{d}}{\mathrm{d}t}f(x+tv;\theta)\right|_{t=0}. \qquad (28)$$

The loss function of the MLP is

$$J(\mathbf{x},\mathbf{z};\theta)=\frac{1}{2}\|\mathbf{z}-F(\mathbf{x};\theta)\|^{2}=\frac{1}{2}\langle\mathbf{z}-F(\mathbf{x};\theta),\mathbf{z}-F(\mathbf{x};\theta)\rangle. \qquad (29)$$

Let $(\mathbf{x},\mathbf{z})\in E_{1}\times E_{L+1}$ be a network input-output pair; then

$$\nabla_{W_{i}}J(\mathbf{x},\mathbf{z};\theta)=\left[\Psi_{i}^{\prime}\left(\mathbf{z}_{i}\right)\odot\left(\mathrm{D}^{*}\omega_{i+1}\left(\mathbf{x}_{i+1}\right)\cdot e\right)\right]\mathbf{x}_{i}^{T}, \qquad (30)$$
$$\nabla_{b_{i}}J(\mathbf{x},\mathbf{z};\theta)=\Psi_{i}^{\prime}\left(\mathbf{z}_{i}\right)\odot\left(\mathrm{D}^{*}\omega_{i+1}\left(\mathbf{x}_{i+1}\right)\cdot e\right), \qquad (31)$$

where $\mathbf{x}_{i}=\alpha_{i-1}(\mathbf{x})$ and the prediction error is $e=F(\mathbf{x};\theta)-\mathbf{z}$, with

$$F(\mathbf{x};\theta)=\left(c_{L}\circ\cdots\circ c_{1}\right)(\mathbf{x}), \qquad (32)$$

so that, for all $i\in[L]$ and $\theta\in\{W_{i},b_{i}\}$,

$$\nabla_{\theta_{i}}J(\mathbf{x},\mathbf{z};\theta)=\nabla_{\theta_{i}}^{*}c_{i}\left(\mathbf{x}_{i}\right)\cdot\mathrm{D}^{*}\omega_{i+1}\left(\mathbf{x}_{i+1}\right)\cdot e. \qquad (33)$$

The generic layer of a CNN is a parameter-dependent map that takes as input an $m_{1}$-channeled tensor, where each channel is a matrix of size $n_{1}\times\ell_{1}$, and outputs an $m_{2}$-channeled tensor, where each channel is a matrix of size $n_{2}\times\ell_{2}$. The parameters are $W\in\mathbb{R}^{p\times q}\otimes\mathbb{R}^{m_{2}}$ and the input is $\mathbf{x}\in\mathbb{R}^{n_{1}\times\ell_{1}}\otimes\mathbb{R}^{m_{1}}$. With $\{e_{j}\}_{j=1}^{m_{1}}$ an orthonormal basis for $\mathbb{R}^{m_{1}}$ and $\{\bar{e}_{j}\}_{j=1}^{m_{2}}$ an orthonormal basis for $\mathbb{R}^{m_{2}}$, $\mathbf{x}$ and $W$ can be written as

$$\mathbf{x}=\sum_{j=1}^{m_{1}}\mathbf{x}_{j}\otimes e_{j},\quad W=\sum_{j=1}^{m_{2}}W_{j}\otimes\bar{e}_{j}. \qquad (34)$$

The convolution operator $C$ can then be written as

$$C(W,\mathbf{x})=\sum_{j=1}^{m_{2}}c_{j}(W,\mathbf{x})\otimes\bar{e}_{j}, \qquad (35)$$

where cjc_{j} is a bilinear operator that defines the mechanics of the convolution:

$$c_{j}(W,\mathbf{x})=\sum_{k=1}^{\widehat{n}_{1}}\sum_{l=1}^{\widehat{\ell}_{1}}\left\langle W_{j},\mathcal{K}_{\gamma(k,l,\Delta)}(\mathbf{x})\right\rangle\widehat{E}_{k,l}, \qquad (36)$$

where $\mathcal{K}$ is the cropping operator that defines the action of the convolution, $\gamma(k,l,\Delta)=(1+(k-1)\Delta,1+(l-1)\Delta)$, and $\Delta$ defines the stride of the Conv layer. The generic layer $c_{i}$ is

$$c_{i}\left(\mathbf{x}_{i}\right)=\Psi_{i}\left(C_{i}\left(W_{i},\mathbf{x}_{i}\right)\right). \qquad (37)$$

For the “Conv” layer,

$$\nabla_{W_{i}}^{*}c_{i}(\mathbf{x}_{i})=(\Psi_{i}\llcorner\mathbf{x}_{i})^{*}\mathrm{D}C_{i}(W_{i},\mathbf{x}_{i})\mathrm{D}^{*}\Psi_{i}(C_{i}(W_{i},\mathbf{x}_{i})), \qquad (38)$$
$$\mathrm{D}^{*}c_{i}(\mathbf{x}_{i})=(W_{i}\lrcorner C_{i})^{*}\mathrm{D}\Psi_{i}(C_{i}(W_{i},\mathbf{x}_{i}))\mathrm{D}^{*}\Psi_{i}(C_{i}(W_{i},\mathbf{x}_{i})), \qquad (39)$$

where $(e_{1}\lrcorner B)\cdot e_{2}=B(e_{1},e_{2})$ and $(B\llcorner e_{2})\cdot e_{1}=B(e_{1},e_{2})$. With learning rate $\eta\in\mathbb{R}_{+}$, the gradient descent step of backpropagation updates the gradient through

$$\nabla_{W_{i}}J(\mathbf{x},\mathbf{z};\theta)\leftarrow(C_{i}\llcorner\mathbf{x}_{i})^{*}\mathrm{D}C_{i}\left(W_{i},\mathbf{x}_{i}\right)\mathrm{D}^{*}\Psi_{i}(\mathbf{z}_{i})\,e_{i}, \qquad (40)$$

and the parameters are updated by $W_{i}\leftarrow W_{i}-\eta\nabla_{W_{i}}J(\mathbf{x},\mathbf{z};\theta)$. Owing to the derivative chain rule and error backpropagation, the high-dimensional neural network achieves excellent results; a minimal sketch of this update follows.
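In autograd form, the update rule above corresponds to the following minimal sketch for a single ReLU layer; shapes and the learning rate are illustrative assumptions.

```python
# Minimal sketch of W_i <- W_i - eta * grad J, mirroring (29)-(40) via autograd.
import torch

W = torch.randn(4, 3, requires_grad=True)
b = torch.zeros(4, requires_grad=True)
x, z = torch.randn(3), torch.randn(4)
eta = 0.1

loss = 0.5 * torch.sum((z - torch.relu(W @ x + b)) ** 2)  # J(x, z; theta) as in (29)
loss.backward()                                           # backpropagated gradients
with torch.no_grad():
    W -= eta * W.grad                                     # W_i <- W_i - eta * grad
    b -= eta * b.grad
```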

References

  • [1] C. Huang, A. Zappone, G. C. Alexandropoulos, M. Debbah, and C. Yuen, “Reconfigurable intelligent surfaces for energy efficiency in wireless communication,” IEEE Trans. Wireless Commun., vol. 18, no. 8, pp. 4157–4170, 2019.
  • [2] Q. Wu and R. Zhang, “Towards smart and reconfigurable environment: Intelligent reflecting surface aided wireless network,” IEEE Commun. Mag., vol. 58, no. 1, pp. 106–112, 2019.
  • [3] J. Zhang, E. Björnson, M. Matthaiou, D. W. K. Ng, H. Yang, and D. J. Love, “Prospective multiple antenna technologies for beyond 5G,” IEEE J. Sel. Areas Commun., vol. 38, no. 8, pp. 1637–1660, Aug. 2020.
  • [4] J. Lin, G. Wang, R. Fan, T. A. Tsiftsis, and C. Tellambura, “Channel estimation for wireless communication systems assisted by large intelligent surfaces,” arXiv preprint arXiv:1911.02158, 2019.
  • [5] J. Zhang, H. Du, Q. Sun, B. Ai, and D. W. K. Ng, “Physical layer security enhancement with reconfigurable intelligent surface-aided networks,” IEEE Trans. Inf. Forensic Secur., vol. 16, pp. 3480–3495, 2021.
  • [6] Z.-Q. He and X. Yuan, “Cascaded channel estimation for large intelligent metasurface assisted massive MIMO,” IEEE Wireless Commun. Lett., vol. 9, no. 2, pp. 210–214, Feb. 2019.
  • [7] J. Chen, Y.-C. Liang, H. V. Cheng, and W. Yu, “Channel estimation for reconfigurable intelligent surface aided multi-user MIMO systems,” arXiv preprint arXiv:1912.03619, 2019.
  • [8] A. Taha, M. Alrabeiah, and A. Alkhateeb, “Deep learning for large intelligent surfaces in millimeter wave and massive MIMO systems,” in Proc. IEEE Globecom, 2019, pp. 1–6.
  • [9] J. Yang, C.-K. Wen, S. Jin, and F. Gao, “Beamspace channel estimation in mmWave systems via cosparse image reconstruction technique,” IEEE Trans. Commun., vol. 66, no. 10, pp. 4767–4782, Oct. 2018.
  • [10] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, “Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising,” IEEE Trans. Image Process., vol. 26, no. 7, pp. 3142–3155, Jul. 2017.
  • [11] C. Liu, X. Liu, D. W. K. Ng, and J. Yuan, “Deep residual network empowered channel estimation for IRS-assisted multi-user communication systems,” in Proc. IEEE ICC, 2021, pp. 1–7.
  • [12] D. Kim, J. R. Chung, and S. Jung, “GRDN: Grouped residual dense network for real image denoising and GAN-based real-world noise modeling,” in Proc. CVF CVPRW, 2019, pp. 2086–2094.
  • [13] S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, “CBAM: Convolutional block attention module,” in Proc. ECCV, 2018, pp. 3–19.
  • [14] Y. Jin, J. Zhang, B. Ai, and X. Zhang, “Channel estimation for mmWave massive MIMO with convolutional blind denoising network,” IEEE Commun. Lett., vol. 24, no. 1, pp. 95–98, Jan. 2019.
  • [15] S. Liu, Z. Gao, J. Zhang, M. Di Renzo, and M.-S. Alouini, “Deep denoising neural network assisted compressive channel estimation for mmWave intelligent reflecting surfaces,” IEEE Trans. Veh. Technol., vol. 69, no. 8, pp. 9223–9228, Aug. 2020.
  • [16] E. Vlachos, G. C. Alexandropoulos, and J. Thompson, “Massive MIMO channel estimation for millimeter wave systems via matrix completion,” IEEE Signal Process. Lett., vol. 25, no. 11, pp. 1675–1679, Nov. 2018.
  • [17] L. Wei, C. Huang, G. C. Alexandropoulos, C. Yuen, Z. Zhang, and M. Debbah, “Channel estimation for RIS-empowered multi-user MISO wireless communications,” IEEE Trans. Commun., vol. 69, no. 6, pp. 4144–4157, Jun. 2021.