
Model-Driven Deep Learning for Massive Multiuser MIMO Constant Envelope Precoding

Yunfeng He, Hengtao He, Chao-Kai Wen, and Shi Jin
Y. He, H. He, and S. Jin are with the National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, P. R. China (e-mail: heyunfeng@seu.edu.cn, hehengtao@seu.edu.cn, jinshi@seu.edu.cn). C.-K. Wen is with the Institute of Communications Engineering, National Sun Yat-sen University, Kaohsiung 804, Taiwan (e-mail: chaokai.wen@mail.nsysu.edu.tw).
Abstract

Constant envelope (CE) precoding design is of great interest for massive multiuser multi-input multi-output systems because it can significantly reduce hardware cost and power consumption. However, existing CE precoding algorithms are hindered by excessive computational overhead. In this letter, a novel model-driven deep learning (DL)-based network that combines DL with the conjugate gradient algorithm is proposed for CE precoding. Specifically, the original iterative algorithm is unfolded and parameterized by trainable variables. With the proposed architecture, the variables can be learned efficiently from training data through an unsupervised learning approach. Thus, the proposed network learns to obtain the search step size and to adjust the search direction. Simulation results demonstrate the superiority of the proposed network in terms of multiuser interference suppression capability and computational overhead.

Index Terms:
Massive MIMO, constant envelope, precoding, deep learning, model-driven, unsupervised learning

I Introduction

The massive multiuser multi-input multi-output (MIMO) system has attracted considerable attention because of its superiority in terms of spectral efficiency and reliability [1]. The base station (BS) utilizes numerous antennas to serve multiple user terminals (UTs) in the same time-frequency resource. Linear precoders are usually used to mitigate multiuser interference (MUI) effectively [2]. However, the practical implementation of existing linear precoding algorithms causes several problems when the number of antennas at the BS is large. A crucial challenge is the dramatic increase in hardware cost and power consumption. Specifically, each transmit antenna needs an expensive linear power amplifier (PA) because the amplitude of the elements in the transmitted signal produced by existing precoding algorithms, e.g., zero-forcing precoding, is unconstrained.

The type of transmitted signal that facilitates the use of the most power-efficient, nonlinear PAs is a constant envelope (CE) signal, i.e., the amplitude of each symbol in the precoding vector is limited to a constant, and the information is carried on the phase for transmission. Mathematically, the CE precoding design can be formulated as a nonlinear least squares (NLS) problem, which is non-convex and has multiple suboptimal solutions. In [3], Mohammed and Larsson proposed a sequential gradient descent (GD) algorithm. Unfortunately, this method is greatly affected by the initial value of the iterative algorithm and easily falls into a local minimum, which may reduce the MUI suppression ability dramatically. Reference [4] proposed a cross-entropy optimization (CEO) method to solve the NLS problem. Although CEO can mitigate MUI effectively, its computational complexity is high, thereby hindering its practical use. In [5], a Riemannian manifold optimization (RMO)-based conjugate gradient (CG) algorithm that achieves a tradeoff between MUI performance and computational complexity was developed. However, the RMO method still relies on a large number of iterations, which remains a considerable challenge for high-speed communication.

Recently, deep learning (DL) has made remarkable achievements in physical layer communications [6] and has been introduced into precoding [7, 8]. However, most existing DL-based precoders are designed with a data-driven approach, i.e., treating the precoder as a black-box network, and thus suffer from excessively high training cost and computational overhead. Deep unfolding [9, 10, 11] is another DL technique, which unfolds an iterative algorithm and introduces trainable parameters to improve the convergence speed; it has been applied to physical layer communications [12, 13]. In this letter, a model-driven neural network named CEPNet, which combines DL with the RMO-based CG algorithm, is proposed for CE precoding. Compared with the RMO-based CG algorithm, the introduced trainable variables can be optimized efficiently through unsupervised learning. Thus, the MUI performance and computational cost of the proposed network are improved significantly. In addition, simulation results demonstrate that the CEPNet shows strong robustness to channel estimation error and channel model mismatch.

Notations—Throughout this letter, we use $\mathbb{R}$ and $\mathbb{C}$ to denote the sets of real and complex numbers, respectively. The superscripts $(\cdot)^{\rm T}$, $(\cdot)^{\rm H}$, and $(\cdot)^{*}$ represent the transpose, Hermitian (conjugate) transpose, and complex conjugate, respectively. $\circ$ denotes the Hadamard product of two matrices with identical size. $\mathfrak{Re}\{\cdot\}$ returns the real part of its input argument. $\|\cdot\|_{2}$ and $|\cdot|$ represent the Euclidean norm and absolute value, respectively. Finally, for any vector $\mathbf{z}$ and any positive integer $k$, $(\mathbf{z})_{k}$ returns the $k$th element of $\mathbf{z}$.

II System Model and Problem Formulation

We consider a downlink MIMO system, in which a BS with $N_{\rm t}$ transmit antennas serves $N_{\rm u}$ ($N_{\rm u}<N_{\rm t}$) single-antenna UTs. The collectively received signal, denoted by $\mathbf{y}=[y_{1},\ldots,y_{N_{\rm u}}]^{\rm T}\in\mathbb{C}^{N_{\rm u}}$, is given as follows:

$\mathbf{y}=\mathbf{Hx}+\mathbf{n},$ (1)

where $\mathbf{H}=[h_{mn}]\in\mathbb{C}^{N_{\rm u}\times N_{\rm t}}$, $\mathbf{x}=[x_{1},\ldots,x_{N_{\rm t}}]^{\rm T}\in\mathbb{C}^{N_{\rm t}}$, and $\mathbf{n}=[n_{1},\ldots,n_{N_{\rm u}}]^{\rm T}\in\mathbb{C}^{N_{\rm u}}$ denote the channel matrix, transmitted vector, and additive white Gaussian noise vector, respectively. The total MUI energy can be expressed as

$\left\|\mathbf{Hx}-\mathbf{s}\right\|_{2}^{2}\triangleq f\left(\mathbf{x}\right)=\sum_{m=1}^{N_{\rm u}}\left|\sum_{n=1}^{N_{\rm t}}h_{mn}x_{n}-s_{m}\right|^{2},$ (2)

where $\mathbf{s}=[s_{1},\ldots,s_{N_{\rm u}}]^{\rm T}\in\mathbb{C}^{N_{\rm u}}$ denotes the information symbol vector.

CE precoding, which imposes a constant amplitude constraint on the transmitted signal at each transmit antenna, has been proposed to enable the utilization of low-cost and highly energy-efficient PAs. Mathematically, the CE precoding design can be formulated as the constrained optimization problem:

$\underset{\mathbf{x}}{\operatorname{minimize}}\quad f\left(\mathbf{x}\right)$ (3)
${\rm subject\ to}\quad\left|x_{n}\right|=\sqrt{P_{\rm t}/N_{\rm t}},\quad{\rm for}~{}n=1,\ldots,N_{\rm t},$

where $P_{\rm t}$ denotes the total transmit power. Although no known method solves the non-convex problem (3) optimally, the RMO-based CG algorithm solves it with a good trade-off between MUI performance and complexity.
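To make the objective (2) and the CE constraint in (3) concrete, the following minimal NumPy sketch evaluates the MUI energy at a feasible CE point; the toy dimensions, the random Gaussian channel, and the random symbol vector are illustrative assumptions only:

```python
import numpy as np

def mui_energy(H, x, s):
    """Total MUI energy f(x) = ||Hx - s||_2^2 in (2)."""
    r = H @ x - s
    return np.real(np.vdot(r, r))

# Toy dimensions (assumed): Nt transmit antennas, Nu single-antenna UTs.
Nt, Nu, Pt = 64, 16, 1.0
rng = np.random.default_rng(0)
H = (rng.standard_normal((Nu, Nt)) + 1j * rng.standard_normal((Nu, Nt))) / np.sqrt(2)
s = (rng.standard_normal(Nu) + 1j * rng.standard_normal(Nu)) / np.sqrt(2)

# A feasible CE point: every entry has amplitude sqrt(Pt/Nt); only the phases are free.
x = np.sqrt(Pt / Nt) * np.exp(1j * rng.uniform(0, 2 * np.pi, Nt))
assert np.allclose(np.abs(x), np.sqrt(Pt / Nt))   # constraint in (3) holds
print(mui_energy(H, x, s))                        # MUI left by this (unoptimized) x
```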

III CEPNet

III-A Algorithm Review

RMO was proposed in [5] to solve the optimization problem (3); it transforms the constrained domain into a Riemannian manifold and solves the optimization problem directly on this specific manifold. We briefly review the RMO-based CG algorithm and refer the reader to [14] for technical details of the RMO method.

Considering the constant amplitude constraint on each element of $\mathbf{x}$ in problem (3) and assuming $P_{\rm t}=1$, the constraint domain of the CE precoding problem can be transformed into a Riemannian manifold given by

$\mathcal{M}=\left\{\mathbf{x}\in\mathbb{C}^{N_{\rm t}}:\left|x_{1}\right|=\left|x_{2}\right|=\cdots=\left|x_{N_{\rm t}}\right|=\frac{1}{\sqrt{N_{\rm t}}}\right\}.$ (4)

Given a point $\mathbf{x}_{k}\in\mathcal{M}$ at the $k$th iteration, the tangent space at $\mathbf{x}_{k}$ is defined as $\mathcal{T}_{\mathbf{x}_{k}}\mathcal{M}=\{\mathbf{z}\in\mathbb{C}^{N_{\rm t}}:\mathfrak{Re}\{\mathbf{z}\circ\mathbf{x}_{k}^{*}\}=\mathbf{0}_{N_{\rm t}}\}$. Let $\alpha_{k}$ and $\mathbf{d}_{k}\in\mathcal{T}_{\mathbf{x}_{k}}\mathcal{M}$ denote the search step size and search direction of the $k$th iteration, respectively. The point $\mathbf{x}_{k+1}$ is obtained by projecting the point $\mathbf{x}_{k}+\alpha_{k}\mathbf{d}_{k}$ back onto the manifold as follows:

$\mathbf{x}_{k+1}={\rm Retr}_{\mathbf{x}_{k}}\left(\alpha_{k}\mathbf{d}_{k}\right)\triangleq\frac{1}{\sqrt{N_{\rm t}}}\left[\frac{\left(\mathbf{x}_{k}+\alpha_{k}\mathbf{d}_{k}\right)_{1}}{\left|\left(\mathbf{x}_{k}+\alpha_{k}\mathbf{d}_{k}\right)_{1}\right|},~{}\ldots,~{}\frac{\left(\mathbf{x}_{k}+\alpha_{k}\mathbf{d}_{k}\right)_{N_{\rm t}}}{\left|\left(\mathbf{x}_{k}+\alpha_{k}\mathbf{d}_{k}\right)_{N_{\rm t}}\right|}\right]^{\rm T}.$ (5)
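In code, the retraction (5) reduces to an elementwise phase normalization. A minimal NumPy sketch (with $P_{\rm t}=1$ as above; the toy dimension is an assumption for illustration):

```python
import numpy as np

def retract(v, Nt):
    """Retraction (5): rescale every entry of v to amplitude 1/sqrt(Nt)."""
    return v / np.abs(v) / np.sqrt(Nt)

Nt = 8
rng = np.random.default_rng(0)
v = rng.standard_normal(Nt) + 1j * rng.standard_normal(Nt)  # ambient point x_k + alpha_k * d_k
x = retract(v, Nt)
assert np.allclose(np.abs(x), 1 / np.sqrt(Nt))              # x lies on the manifold M in (4)
```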

Next, we introduce how to determine the search direction and step size. Specifically, the CG algorithm is used to determine the search direction. The gradient direction in Euclidean space is denoted by $\nabla f(\mathbf{x}_{k})=-2\mathbf{H}^{\rm H}(\mathbf{s}-\mathbf{H}\mathbf{x}_{k})$, which should be projected onto the tangent space at point $\mathbf{x}_{k}$ as

${\rm Proj}_{\mathbf{x}_{k}}\nabla f\left(\mathbf{x}_{k}\right)\triangleq{\rm grad}f\left(\mathbf{x}_{k}\right)=\nabla f\left(\mathbf{x}_{k}\right)-N_{\rm t}\,\mathfrak{Re}\left\{\nabla f\left(\mathbf{x}_{k}\right)\circ\mathbf{x}_{k}^{*}\right\}\circ\mathbf{x}_{k}.$ (6)

Similarly, the search direction $\mathbf{d}_{k-1}\in\mathcal{T}_{\mathbf{x}_{k-1}}\mathcal{M}$ also needs to be projected onto the tangent space at point $\mathbf{x}_{k}$ as

${\rm Proj}_{\mathbf{x}_{k}}\mathbf{d}_{k-1}=\mathbf{d}_{k-1}-N_{\rm t}\,\mathfrak{Re}\left\{\mathbf{d}_{k-1}\circ\mathbf{x}_{k}^{*}\right\}\circ\mathbf{x}_{k}.$ (7)

Then, the search direction $\mathbf{d}_{k}$ is given by

$\mathbf{d}_{k}=-{\rm grad}f\left(\mathbf{x}_{k}\right)+\beta_{k}\,{\rm Proj}_{\mathbf{x}_{k}}\mathbf{d}_{k-1},$ (8)

where $\beta_{k}$ is the weight calculated by the Polak-Ribière formula as

$\beta_{k}=\frac{{\rm grad}f\left(\mathbf{x}_{k}\right)^{\rm H}\left({\rm grad}f\left(\mathbf{x}_{k}\right)-{\rm Proj}_{\mathbf{x}_{k}}{\rm grad}f\left(\mathbf{x}_{k-1}\right)\right)}{{\rm grad}f\left(\mathbf{x}_{k-1}\right)^{\rm H}\,{\rm grad}f\left(\mathbf{x}_{k-1}\right)}.$ (9)

In addition, the search step size $\alpha_{k}$ along the direction $\mathbf{d}_{k}$ can be determined according to the Armijo backtracking line search rule, that is,

$f\left(\mathbf{x}_{k}+\alpha_{k}\mathbf{d}_{k}\right)-f\left(\mathbf{x}_{k}\right)\leq c_{1}\alpha_{k}\,{\rm grad}f^{\rm H}\left(\mathbf{x}_{k}\right)\mathbf{d}_{k},$ (10)

where $c_{1}\in(0,1)$. The step size $\alpha_{k}$ is usually initialized to a large value and attenuated by a factor of $\tau$ until (10) is satisfied. The RMO-based CE precoding method is summarized in Algorithm 1. We elaborate the RMO-based CEPNet design in the following subsection.

Input: $\mathbf{x}_{0}\in\mathcal{M}$
Output: Return the final transmitted vector $\mathbf{x}$
1 Begin
2 $\mathbf{d}_{0}=-{\rm grad}f(\mathbf{x}_{0})$ and $k=0$;
3 Repeat
4 Determine the search step size $\alpha_{k}$ according to (10);
5 Update the new iterate $\mathbf{x}_{k+1}$ using the retraction in (5);
6 Compute the Riemannian gradient ${\rm grad}f(\mathbf{x}_{k+1})$ using (6);
7 Calculate ${\rm Proj}_{\mathbf{x}_{k+1}}(\mathbf{d}_{k})$ using (7);
8 Obtain the Polak-Ribière parameter $\beta_{k+1}$ according to (9);
9 Determine the conjugate direction $\mathbf{d}_{k+1}$ according to (8);
10 $k\leftarrow k+1$;
Until the predefined number of iterations is reached;
Algorithm 1: RMO-based CG Algorithm for CE Precoding
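Pulling (5)-(10) together, one possible NumPy realization of Algorithm 1 is sketched below. The starting point and the line-search hyperparameters (alpha0, c1, tau) are assumptions for illustration, not values taken from [5]:

```python
import numpy as np

def f(H, x, s):
    """MUI energy (2)."""
    r = H @ x - s
    return np.real(np.vdot(r, r))

def retract(v, Nt):
    """Retraction (5): elementwise phase normalization onto M."""
    return v / np.abs(v) / np.sqrt(Nt)

def egrad(H, x, s):
    """Euclidean gradient of f: -2 H^H (s - Hx)."""
    return -2.0 * H.conj().T @ (s - H @ x)

def proj(x, z, Nt):
    """Tangent-space projection (6)/(7) at x."""
    return z - Nt * np.real(z * x.conj()) * x

def rmo_cg(H, s, K=20, alpha0=0.1, c1=1e-4, tau=0.5):
    """Sketch of Algorithm 1 (starting point and hyperparameters are assumptions)."""
    Nu, Nt = H.shape
    rng = np.random.default_rng(0)
    x = retract(np.exp(1j * rng.uniform(0, 2 * np.pi, Nt)), Nt)  # x0 on M
    g = proj(x, egrad(H, x, s), Nt)      # Riemannian gradient (6) at x0
    d = -g                               # step 2: d0 = -grad f(x0)
    for _ in range(K):
        # Armijo backtracking line search (10).
        alpha, fx = alpha0, f(H, x, s)
        slope = np.real(np.vdot(g, d))   # grad f(x)^H d
        while f(H, retract(x + alpha * d, Nt), s) - fx > c1 * alpha * slope and alpha > 1e-12:
            alpha *= tau
        x_new = retract(x + alpha * d, Nt)            # retraction (5)
        g_new = proj(x_new, egrad(H, x_new, s), Nt)   # Riemannian gradient (6)
        d_t = proj(x_new, d, Nt)                      # transported direction (7)
        g_t = proj(x_new, g, Nt)                      # transported old gradient
        beta = np.real(np.vdot(g_new, g_new - g_t)) / np.real(np.vdot(g, g))  # (9)
        d = -g_new + beta * d_t                       # conjugate direction (8)
        x, g = x_new, g_new
    return x
```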

III-B CEPNet Design

Two primary factors significantly increase the computational overhead of Algorithm 1. First, the step that determines the search step size $\alpha_{k}$ takes up almost half of the time of one iteration, indicating an excessive and unbearable latency overhead when the number of iterations is large. Second, the projection and retraction operations in Algorithm 1 affect the convergence speed of the CG algorithm, thereby increasing computational complexity. Therefore, it is crucial to reduce the backtracking line search overhead when determining the step size as much as possible and to adjust the search direction appropriately to accelerate the convergence of the CG algorithm. To this end, we introduce trainable variables into the algorithm and employ DL tools.

Two improvements are applied to the CG algorithm, one for the search step size and the other for the search direction:

  1.

    Given the $k$th iteration, a trainable scalar $w_{k}^{\alpha}\in\mathbb{R}$ is defined as the search step size of iteration $k$. All trainable scalars for the search step size constitute the set $\Theta^{\alpha}=\{w_{k-1}^{\alpha}:k=1,2,\ldots,K\}$, where $K$ denotes the number of units that represent $K$ iterations. Each element of the set $\Theta^{\alpha}$ is randomly and uniformly initialized between $3\times 10^{-3}$ and $2\times 10^{-2}$ and trained using the stochastic GD (SGD) algorithm (the initialization interval is obtained by statistical analysis of the search step sizes determined by the traditional Armijo backtracking line search rule). All trainable variables are fixed after training so that they can be used directly during testing without re-searching on the basis of the Armijo backtracking line search rule.

  2.

    We focus on the weight $\beta_{k}$ calculated by the Polak-Ribière formula in (9) to adjust the search direction of iteration $k+1$. In particular, we calculate $\beta_{k}$ according to (9) to obtain a reasonable initial weight value. In addition, a trainable scalar $w_{k}^{\beta}\in\mathbb{R}$ is defined and multiplied by $\beta_{k}$ to determine a new weight factor. All trainable scalars for adjusting the search direction constitute the set $\Theta^{\beta}=\{w_{k}^{\beta}:k=1,2,\ldots,K\}$. Each element of the set $\Theta^{\beta}$ is initialized to 1 and trained with SGD. Similarly, the weights are fixed after training.

Finally, we can rewrite (5) as

$\mathbf{x}_{k+1}={\rm Retr}_{\mathbf{x}_{k}}\left(w_{k}^{\alpha}\times\mathbf{d}_{k}^{\star}\right),$ (11)

where $\mathbf{d}_{k}^{\star}$ is calculated as

$\mathbf{d}_{k}^{\star}=-{\rm grad}f\left(\mathbf{x}_{k}\right)+w_{k}^{\beta}\times\beta_{k}\,{\rm Proj}_{\mathbf{x}_{k}}\mathbf{d}_{k-1}.$ (12)

As such, we obtain a CE precoding network named CEPNet, which combines the traditional RMO-based CG algorithm with DL. An overview of the proposed DL architecture is shown in Fig. 1, in which the network inputs are $\mathbf{H}$ and $\mathbf{s}$, and the output is $\mathbf{x}$. Each unit can be regarded as one iteration of the traditional RMO method. Each unit contains two trainable variables, so the total number of trainable variables in the proposed CEPNet is $2K$. Equivalently, each unit contains two active neural layers, and the total number of neural layers in the CEPNet shown in Fig. 1 is $2K$. On this basis, we do not recommend that the CEPNet contain too many units because excessive neural layers make the network too “deep,” which may lead to gradient vanishing or exploding. The proposed network is easy to train because only a few trainable variables need to be optimized.
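As an illustration of how one unit operates, the following NumPy sketch runs the $K$ unfolded units with fixed (already trained) scalars w_alpha and w_beta, replacing the Armijo search by the learned step size in (11) and rescaling the Polak-Ribière weight as in (12); the all-ones starting point is an assumption:

```python
import numpy as np

def cepnet_forward(H, s, w_alpha, w_beta):
    """Forward pass through K = len(w_alpha) unfolded units, per (11)-(12)."""
    Nu, Nt = H.shape

    def retract(v):                                   # retraction (5)
        return v / np.abs(v) / np.sqrt(Nt)

    def proj(x, z):                                   # tangent projection (6)/(7)
        return z - Nt * np.real(z * x.conj()) * x

    def egrad(x):                                     # Euclidean gradient of (2)
        return -2.0 * H.conj().T @ (s - H @ x)

    x = retract(np.ones(Nt, dtype=complex))           # assumed fixed feasible start
    g = proj(x, egrad(x))
    d = -g
    for k in range(len(w_alpha)):
        x_new = retract(x + w_alpha[k] * d)           # (11): no line search needed
        g_new = proj(x_new, egrad(x_new))
        beta = (np.real(np.vdot(g_new, g_new - proj(x_new, g)))
                / np.real(np.vdot(g, g)))             # Polak-Ribiere weight (9)
        d = -g_new + w_beta[k] * beta * proj(x_new, d)  # (12)
        x, g = x_new, g_new
    return x
```

During training, w_alpha (randomly initialized in $[3\times 10^{-3},2\times 10^{-2}]$) and w_beta (initialized to 1) are the only quantities updated; everything else is the fixed structure of Algorithm 1.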

Figure 1: Architecture of the CEPNet. The CEPNet contains $K$ units that represent $K$ iterations, and the dashed boxes in each unit denote trainable variables. Apart from replacing (5) and (8) by (11) and (12), respectively, each unit updates $\mathbf{x}_{k}$ and $\mathbf{d}_{k}$ according to Algorithm 1.

To train the CEPNet, a supervised learning design is inflexible and inadequate because the optimal labels are unknown. Obtaining labels through existing algorithms is one approach; however, it would only make the network imitate the existing algorithms. Therefore, in our design, an unsupervised learning algorithm is used to train the CEPNet effectively. The set of trainable variables is denoted as $\Theta=\{\Theta^{\alpha},\Theta^{\beta}\}$. The inputs of the CEPNet are $\mathbf{H}$ and $\mathbf{s}$, and the transmitted vector is denoted by $\mathbf{x}=g(\mathbf{s};\mathbf{H};\Theta)$. To improve the robustness of the CEPNet to the signal-to-noise ratio (SNR), we use the MUI directly as the loss function rather than the average achievable rate or the bit-error rate (BER), both of which depend on the SNR. Specifically, the loss function is calculated as follows:

$L(\Theta)=10\times\log_{10}\left(\frac{1}{N_{\rm s}N_{\rm u}}\sum_{i=1}^{N_{\rm s}}\left\|\mathbf{H}_{i}\,g\left(\mathbf{s}_{i};\mathbf{H}_{i};\Theta\right)-\mathbf{s}_{i}\right\|_{2}^{2}\right),$ (13)

where $N_{\rm s}$ denotes the total number of samples in the training set, and $\mathbf{s}_{i}$ and $\mathbf{H}_{i}$ represent the information symbol vector and channel matrix associated with the $i$th sample, respectively.
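Over a batch, (13) can be evaluated directly on top of the forward pass sketched in Sec. III-B (cepnet_forward below refers to that illustrative sketch, not a released implementation); in the actual design the same computation is expressed in a DL framework so that the optimizer can update $\Theta$ by backpropagation:

```python
import numpy as np

def mui_loss_db(H_batch, s_batch, w_alpha, w_beta):
    """Unsupervised loss (13): residual MUI averaged over the batch, in dB."""
    Ns, Nu = len(s_batch), s_batch[0].shape[0]
    total = 0.0
    for H, s in zip(H_batch, s_batch):
        x = cepnet_forward(H, s, w_alpha, w_beta)  # g(s_i; H_i; Theta)
        total += np.linalg.norm(H @ x - s) ** 2    # ||H_i g(.) - s_i||_2^2
    return 10.0 * np.log10(total / (Ns * Nu))
```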

IV Experiments

IV-A Implementation Details

The CEPNet is constructed on top of the TensorFlow framework, and an NVIDIA GeForce GTX 1080 Ti GPU is used to accelerate training. The training, validation, and testing sets contain 40,000, 20,000, and 60,000 samples, respectively. We perform simulation experiments with the multipath channel, and the channel vector of the $\mu$th UT is given by

$\mathbf{h}^{\mu}=\frac{1}{\sqrt{L^{\mu}}}\sum_{l=0}^{L^{\mu}-1}g_{l}^{\mu}\,\mathbf{a}^{\mu}(\theta_{l}^{\mu}),$ (14)

where $L^{\mu}$ is the number of propagation paths of user $\mu$, $g_{l}^{\mu}$ denotes the complex gain of the $l$th propagation path in the $\mu$th UT’s channel, and

$\mathbf{a}^{\mu}(\theta_{l}^{\mu})=\left[1,~{}e^{j2\pi\frac{d}{\lambda}\sin\theta_{l}^{\mu}},~{}\ldots,~{}e^{j2\pi\frac{d}{\lambda}\left(N_{\rm t}-1\right)\sin\theta_{l}^{\mu}}\right]^{\rm T}$ (15)

denotes the steering vector of the $\mu$th user, where $d$, $\lambda$, and $\theta_{l}^{\mu}$ denote the distance between two horizontally or vertically adjacent antenna elements, the carrier wavelength, and the angle of departure of the $l$th propagation path in the $\mu$th UT’s channel, respectively. The channel matrix is $\mathbf{H}=[\mathbf{h}^{1},\mathbf{h}^{2},\ldots,\mathbf{h}^{N_{\rm u}}]^{\rm T}$. We set $L^{\mu}=8$; $g_{l}^{\mu}$ is drawn from $\mathcal{N}_{\mathbb{C}}(0,1)$, and $\theta_{l}^{\mu}$ is uniformly generated between 0 and $\pi$. The data sets are formed as $(\mathbf{s}_{i},\mathbf{H}_{i})$ pairs. The channel $\mathbf{H}_{i}$ is modeled as block fading, and one information symbol vector $\mathbf{s}_{i}$ corresponds to one channel matrix $\mathbf{H}_{i}$, where each $\mathbf{H}_{i}$ is generated independently and each element of $\mathbf{s}_{i}$ is drawn from the 16-QAM constellation. We set $K=20$ as the trade-off between MUI performance and computational complexity. The set of trainable variables is updated by the ADAM optimizer [15]. The number of training epochs, the learning rate, and the batch size are set as 500, 0.0002, and 4,000, respectively.
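The training pairs $(\mathbf{s}_{i},\mathbf{H}_{i})$ can then be generated by sampling (14)-(15) independently. A minimal NumPy sketch, in which the antenna spacing $d/\lambda=0.5$ is an assumption (the letter does not state it):

```python
import numpy as np

def multipath_channel(Nt, Nu, L=8, d_over_lambda=0.5, rng=None):
    """One channel matrix H = [h^1, ..., h^Nu]^T drawn from (14)-(15)."""
    rng = rng or np.random.default_rng()
    n = np.arange(Nt)
    H = np.zeros((Nu, Nt), dtype=complex)
    for u in range(Nu):
        g = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)  # CN(0,1) gains
        theta = rng.uniform(0.0, np.pi, L)                                       # AoDs in [0, pi]
        A = np.exp(1j * 2 * np.pi * d_over_lambda * np.outer(np.sin(theta), n))  # L x Nt steering
        H[u, :] = (g @ A) / np.sqrt(L)                                           # sum over paths, (14)
    return H
```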

IV-B Performance Analysis

IV-B1 Average achievable rate

In Fig. 2, we compare the proposed CEPNet with the existing CE precoding algorithms in terms of the average achievable rate against the SNR in a multipath channel, where the achievable rate at each UT can be calculated using [16, Eq. (43)]. The BS is equipped with $N_{\rm t}=64$ transmit antennas and serves $N_{\rm u}=16$ UTs. Fig. 2 indicates that the CEPNet trained with the matched multipath channel significantly outperforms the RMO-based CG algorithm with the same number of iterations, demonstrating that the proposed CEPNet can learn to reduce the total MUI through an unsupervised learning approach, i.e., the CEPNet learns to adjust the search step size and direction of each iteration appropriately through training. In addition, the CEPNet also outperforms the sequential GD algorithm with the same number of iterations at high SNRs. We infer that the CEPNet learns, through training, to cope with the channel singularity issue in the multipath channel, whereas the sequential GD algorithm fails to do so. In general, the average achievable rate performance of the CEPNet is better than that of the existing CE precoding algorithms in the multipath channel.

Figure 2: Average achievable rate against the SNR in the multipath channel, where $N_{\rm t}=64$ and $N_{\rm u}=16$. The numbers of iterations for the RMO-based CG, the sequential GD, and the CEPNet are all 20.

IV-B2 Bit-error rate

Fig. 3 compares the BER performance of the CEPNet with that of existing CE precoding algorithms in the multipath channel, where the BS is equipped with $N_{\rm t}=64$ transmit antennas and serves $N_{\rm u}=16$ UTs. In Fig. 3, the CEPNet trained with the matched channel model obtains the best BER performance among all CE precoders. Specifically, the CEPNet outperforms the RMO-based CG algorithm with the same number of iterations by approximately 5.8 dB at a target BER of 0.03. Similarly, the CEPNet outperforms the sequential GD algorithm with the same number of iterations by approximately 4.5 dB at a target BER of 0.01.

Figure 3: BER against the SNR in the multipath channel, where $N_{\rm t}=64$ and $N_{\rm u}=16$. The numbers of iterations for the RMO-based CG, the sequential GD, and the CEPNet are all 20.

IV-B3 Computational complexity

We compare the computational overhead of the RMO-based CG algorithm, the sequential GD algorithm, and the CEPNet with the same number of iterations in Table I, where $N_{\rm t}=64$ and $N_{\rm u}=16$. (The CEPNet is trained in TensorFlow as described in Sec. IV-A; once it converges, the trained parameters are stored into .csv files and loaded directly by the MATLAB implementation.) For fairness, the aforementioned CE precoders are all implemented in MATLAB. The time comparison is performed on a computer with OSX 10.12, an i5-6360U 2.9 GHz dual-core CPU, and 8 GB RAM.

TABLE I: Comparison of the computational overhead

CE precoding algorithm                  Time (seconds)
RMO-based CG (20 iterations)            0.00079
Sequential GD in [3] (20 iterations)    0.0321
Proposed CEPNet (20 units)              0.00036

The results indicate that CE precoding through the CEPNet can be executed with a lower overhead than that through the RMO-based CG algorithm because the former does not require any backtracking on the search step size. Specifically, the CEPNet with 20 units runs approximately 2.19 and 89.17 times faster than the RMO-based CG algorithm with 20 iterations and the sequential GD algorithm with 20 iterations, respectively.

IV-C Robustness Analysis

IV-C1 Robustness to channel estimation error

In Sec. IV-B, we assume that the BS obtains perfect channel state information (CSI) for precoding. However, considering that obtaining perfect CSI is impractical, we first investigate the robustness of the CEPNet to channel estimation error. The estimated channel matrix $\hat{\mathbf{H}}_{i}$ used for precoding is assumed to be given by

$\hat{\mathbf{H}}_{i}=\sqrt{1-\varepsilon}\,\mathbf{H}_{i}+\sqrt{\varepsilon}\,\mathbf{E}_{i},$ (16)

where $\varepsilon\in[0,1]$ and each element of $\mathbf{E}_{i}$ is drawn from $\mathcal{N}_{\mathbb{C}}(0,1)$. The value of $\varepsilon$ measures the magnitude of the channel estimation error. The CEPNet is trained with perfect CSI. We evaluate the RMO-based CG algorithm, the sequential GD algorithm, and the CEPNet with imperfect CSI. The aforementioned CE precoding algorithms are all performed with 20 iterations at SNR = 20 dB.
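A minimal sketch of the imperfect-CSI model (16), assuming elementwise i.i.d. $\mathcal{N}_{\mathbb{C}}(0,1)$ error:

```python
import numpy as np

def noisy_csi(H, eps, rng=None):
    """(16): hat-H = sqrt(1 - eps) * H + sqrt(eps) * E, with E ~ CN(0, 1) elementwise."""
    rng = rng or np.random.default_rng()
    E = (rng.standard_normal(H.shape) + 1j * rng.standard_normal(H.shape)) / np.sqrt(2)
    return np.sqrt(1.0 - eps) * H + np.sqrt(eps) * E
```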

Figure 4: Average achievable rate of the RMO-based CG algorithm, the sequential GD algorithm, and the CEPNet versus the channel estimation error $\varepsilon$ in a multipath channel at SNR = 20 dB, where $N_{\rm t}=64$ and $N_{\rm u}=16$. The numbers of iterations for the different CE precoding algorithms are all 20.

Fig. 4 illustrates the robustness of the different CE precoding algorithms to channel estimation error. The figure shows that the proposed CEPNet outperforms the RMO-based CG algorithm and the sequential GD algorithm when $\varepsilon\in[0,0.5]$, which indicates that the learned variables $\Theta$ are robust to channel estimation error. In addition, the performance of the aforementioned CE precoding algorithms is similar when $\varepsilon\in(0.5,1]$ because the channel estimation error then dominates.

IV-C2 Robustness to channel model mismatch

We investigate the robustness of the CEPNet to channel model mismatch. Specifically, the CEPNet is trained with a Rayleigh-fading channel and deployed with a multipath channel. Each element of the Rayleigh-fading channel is drawn from $\mathcal{N}_{\mathbb{C}}(0,1)$. Figs. 2 and 3 indicate that the CEPNet trained with the Rayleigh-fading channel can also achieve significant gains over existing CE precoders in the multipath channel, which demonstrates that the learned variables are robust to channel model mismatch.

In general, the CEPNet shows strong robustness to channel estimation error and channel model mismatch. The CEPNet need not be retrained if the channel variation is insignificant. However, the CEPNet should be retrained to improve performance when the channel variation is significant. Considering that the CEPNet contains only $2K$ parameters, it can be retrained with a low overhead to adapt to the changed channel.

V Conclusion

We proposed a novel model-driven DL network for multiuser MIMO CE precoding. The designed CEPNet inherits the advantages of both the conventional RMO-based CG algorithm and DL technology, thereby exhibiting excellent MUI suppression capability. Simulation results demonstrated that the CEPNet can reduce the precoding overhead significantly compared with existing CE precoding algorithms. Furthermore, the CEPNet shows strong robustness to channel estimation error and channel model mismatch.

References

  • [1] L. Lu, G. Ye Li, A. L. Swindlehurst, A. Ashikhmin, and R. Zhang, “An overview of massive MIMO: Benefits and challenges,” IEEE J. Sel. Topics Signal Process., vol. 8, no. 5, pp. 742–758, Oct. 2014.
  • [2] E. G. Larsson, F. Tufvesson, O. Edfors, and T. L. Marzetta, “Massive MIMO for next generation wireless systems,” IEEE Commun. Mag., vol. 52, no. 2, pp. 186–195, Feb. 2014.
  • [3] S. K. Mohammed and E. G. Larsson, “Per-antenna constant envelope precoding for large multiuser MIMO systems,” IEEE Trans. Commun., vol. 61, no. 3, pp. 1059–1071, Mar. 2013.
  • [4] J.-C. Chen, C.-K. Wen, and K.-K. Wong, “Improved constant envelope multiuser precoding for massive MIMO systems,” IEEE Commun. Lett., vol. 18, no. 8, pp. 1311–1314, Aug. 2014.
  • [5] J.-C. Chen, “Low-PAPR precoding design for massive multiuser MIMO systems via Riemannian manifold optimization,” IEEE Commun. Lett., vol. 21, no. 4, pp. 945–948, Apr. 2017.
  • [6] T. Wang, C.-K. Wen, H. Wang, F. Gao, T. Jiang and S. Jin, “Deep learning for wireless physical layer: Opportunities and challenges,” China Commun., vol. 14, no. 11, pp. 92–111, Nov. 2017.
  • [7] A. M. Elbir and A. Papazafeiropoulos, “Hybrid precoding for multi-user millimeter wave massive MIMO systems: A deep learning approach,” IEEE Trans. Veh. Technol., vol. 69, no. 1, pp. 552–563, Jan. 2020.
  • [8] W. Xia, G. Zheng, Y. Zhu, J. Zhang, J. Wang, and A. Petropulu, “A deep learning framework for optimization of MISO downlink beamforming,” IEEE Trans. Commun., vol. 68, no. 3, pp. 1866–1880, Mar. 2020.
  • [9] D. Ito, S. Takabe and T. Wadayama, “Trainable ISTA for Sparse Signal Recovery,” IEEE Trans. Signal Process., vol. 67, no. 12, pp. 3113–3125, Jun. 2019.
  • [10] A. Balatsoukas-Stimming and C. Studer, “Deep unfolding for communications systems: A survey and some new directions,” in Proc. IEEE SiPS, Nanjing, China, Oct. 2019, pp. 1–4.
  • [11] V. Monga, Y. Li, and Y. Eldar, “Algorithm unrolling: interpretable, efficient deep learning for signal and image processing,” preprint, 2019. [Online]. Available: http://arxiv.org/abs/1912.10557.
  • [12] H. He, S. Jin, C. Wen, F. Gao, G. Y. Li, and Z. Xu, “Model-driven deep learning for physical layer communications,” IEEE Wireless Commun., vol. 26, no. 5, pp. 77–83, Oct. 2019.
  • [13] A. Balatsoukas-Stimming, O. Castaneda, S. Jacobsson, G. Durisi, and C. Studer, “Neural-network optimized 1-bit precoding for massive MU-MIMO,” in Proc. IEEE SPAWC, Cannes, France, Jul. 2019, pp. 1–4.
  • [14] P.-A. Absil, R. Mahony, and R. Sepulchre, Optimization Algorithms on Matrix Manifolds. Princeton, NJ, USA: Princeton Univ. Press, 2009.
  • [15] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” preprint, 2014. [Online]. Available: http://arxiv.org/abs/1412.6980.
  • [16] C. Mollén, E. G. Larsson, and T. Eriksson, “Waveforms for the massive MIMO downlink: Amplifier efficiency, distortion, and performance,” IEEE Trans. Commun., vol. 64, no. 12, pp. 5050–5063, Dec. 2016.