
Vandermonde Constrained Tensor Decomposition for Hybrid Beamforming in Multi-Carrier MIMO Systems

Mohamed Salah Ibrahim, Akshay Malhotra, Mihaela Beluri, Arnab Roy, and Shahab Hamidi-Rad InterDigital Communications Inc., Conshohocken, PA, USA, InterDigital Communications Inc., Los Altos, CA, USA
InterDigital Communications Inc., New York, NY, USA
Email: {mohamedsalah.ibrahim, akshay.malhotra, mihaela.beluri, arnab.roy, shahab.hamidi-rad}@interdigital.com
Abstract

Hybrid beamforming has evolved as a promising technology that offers the balance between system performance and design complexity in mmWave MIMO systems. Existing hybrid beamforming methods either impose unit-modulus constraints or a codebook constraint on the analog precoders/combiners, which in turn results in a performance-overhead tradeoff. This paper puts forth a tensor framework to handle the wideband hybrid beamforming problem, with Vandermonde constraints on the analog precoders/combiners. The proposed method strikes the balance between performance, overhead and complexity. Numerical results on a 3GPP link-level test bench reveal the efficacy of the proposed approach relative to the codebook-based method while attaining the same feedback overhead. Moreover, the proposed method is shown to achieve comparable performance to the unit-modulus approaches, with substantial reductions in overhead.

I Introduction

Millimeter wave (mmWave) communication has emerged as a powerful technology that can handle the unprecedented demands on wireless connectivity by offering large available bandwidth [1]. However, the high propagation loss inherent to mmWave bands, if not mitigated, can severely degrade system performance. Large antenna arrays, which achieve high beamforming gains, are used to compensate for the propagation loss [2].

Implementing large-scale antenna systems, on the other hand, raises several practical challenges, including the high energy consumption and cost of radio frequency (RF) chains, as each antenna element requires a dedicated RF chain. Such hurdles limit the possibility of employing a fully digital beamforming design. As an efficient surrogate, hybrid (analog/digital) beamforming has been introduced in [3, 4] as a means of attaining a favorable complexity-performance tradeoff in mmWave multicarrier massive MIMO systems. Hybrid beamforming relies on a small number of RF chains to design high-dimensional analog precoders (implemented with only phase shifters) together with a low-dimensional (digital) baseband precoder. The combination of analog and digital precoders has the potential to approach the performance of a purely digital solution while providing substantial savings in energy consumption and design complexity.

Although maximizing the system spectral efficiency in the case of digital beamforming design admits a simple algebraic solution via singular value decomposition (SVD) [5], hybrid beamforming yields a highly non-convex problem that requires joint optimization of the hybrid precoders and combiners [3]. A more tractable formulation is to transform the hybrid beamforming design into a matrix factorization problem. In particular, the optimal SVD-based digital solution is first derived to maximize the spectral efficiency. Then, hybrid beamforming is posed as factorizing the fully digital precoder (combiner) into its hybrid precoding (combining) components. The factorization is usually solved either under unit-modulus constraints [3, 6] or with codebook constraints [7] on the analog precoder (combiner), to ensure that the analog precoder can be implemented using phase shifters. While the unit-modulus constraints generally yield a much better solution than the codebook constraint [6], the communication overhead of the latter is considerably lower [7], rendering it more appropriate for limited feedback systems [8]. Further, unlike the codebook-based approach, the feedback overhead under unit-modulus constraints scales linearly with the number of Tx/Rx antennas, thereby precluding its use in massive MIMO systems.

This raises the question of whether it is possible to achieve performance comparable to the unit-modulus based methods while incurring only the feedback overhead associated with the codebook-based approaches. This is the central question that this paper seeks to address. We answer it in the affirmative by modeling wideband hybrid beamforming as a low-rank tensor decomposition problem with Vandermonde constraints on the analog precoders/combiners. Invoking the so-called parallel factor (PARAFAC) analysis to decompose the resulting tensor, we show that PARAFAC yields high-quality hybrid precoders/combiners, with identifiability guarantees on the resulting factors. This paper adds to the broad variety of tensor applications in wireless communications [9, 10, 11]. Different from all prior hybrid beamforming works that adopt the spectral efficiency formula for performance evaluation, this paper evaluates the practical impact of the proposed method by integrating hybrid beamforming into an end-to-end communication scenario with time-varying channels. Numerical results demonstrate that the end-to-end performance of the proposed approach considerably exceeds that of the codebook-based method while remaining comparable to that of the unit-modulus based approaches. Further, the proposed method yields significantly lower communication overhead than the unit-modulus approaches.

II System Model

Consider a downlink transmission in a multi-carrier MIMO system comprising a base station (BS) and a single user equipment (UE). The BS is equipped with $N_t$ transmit antennas and $N_t^{\text{RF}}$ transmit radio frequency (RF) chains, while the UE is equipped with $N_r$ receive antennas and $N_r^{\text{RF}}$ receive RF chains. The BS aims at communicating $N_s$ data streams to the UE over $K$ subcarriers, where $N_s \leq N_t^{\text{RF}} \leq N_t$ and $N_s \leq N_r^{\text{RF}} \leq N_r$ [3]. The BS first applies a digital baseband precoding matrix $\mathbf{F}_{\text{BB}}[k] \in \mathbb{C}^{N_t^{\text{RF}} \times N_s}$ to the transmitted symbols $\mathbf{s}[k] \in \mathbb{C}^{N_s}$, $\forall k \in [K] := \{0, \cdots, K-1\}$, as shown in Fig. 1. Then, the data symbols are transformed to the time domain using an N-point inverse fast Fourier transform (IFFT). After a cyclic prefix (CP) is added to the time-domain signal, the BS applies an analog precoder $\mathbf{F}_{\text{RF}} \in \mathbb{C}^{N_t \times N_t^{\text{RF}}}$ implemented using analog phase shifters, i.e., $|\mathbf{F}_{\text{RF}}(i,j)| = 1$, $\forall i = 0, \cdots, N_t - 1$ and $j = 0, \cdots, N_t^{\text{RF}} - 1$. Notice that the same $\mathbf{F}_{\text{RF}}$ is applied across all subcarriers, i.e., $\mathbf{F}_{\text{RF}}$ is frequency independent. The transmitted complex signal from the BS can then be expressed as

$$\mathbf{x}[k] = \mathbf{F}_{\text{RF}} \mathbf{F}_{\text{BB}}[k] \mathbf{s}[k], \quad \forall k \in [K]. \tag{1}$$

It is assumed that i) $\mathbb{E}[\mathbf{s}[k]\mathbf{s}^H[k]] = \frac{\alpha}{K N_s}\mathbf{I}_{N_s}$, and ii) the total power budget $\alpha$ is respected by enforcing the constraint $\|\mathbf{F}_{\text{RF}}\mathbf{F}_{\text{BB}}[k]\|_F^2 = N_s$, $\forall k \in [K]$.
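As a minimal sketch of the signal model in (1) and the power constraint above, the following standard-library Python snippet builds a small unit-modulus analog precoder, rescales a baseband precoder so that the Frobenius-norm constraint holds, and forms the transmitted vector for one subcarrier. The helper names (`matmul`, `fro_norm_sq`, `normalize_power`), the toy dimensions, and all numerical values are illustrative assumptions, not taken from the paper.

```python
import cmath
import math

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def fro_norm_sq(A):
    """Squared Frobenius norm of a matrix."""
    return sum(abs(x) ** 2 for row in A for x in row)

def normalize_power(F_rf, F_bb, n_streams):
    """Scale F_bb so that ||F_rf F_bb||_F^2 equals n_streams."""
    scale = math.sqrt(n_streams / fro_norm_sq(matmul(F_rf, F_bb)))
    return [[scale * x for x in row] for row in F_bb]

# Toy sizes (assumptions): N_t = 4 antennas, N_t^RF = 2 RF chains, N_s = 1 stream.
N_t, N_rf, N_s = 4, 2, 1
phases = [0.3, -1.1]  # arbitrary phase-shifter settings
# Every entry has unit modulus, as required of the analog precoder.
F_rf = [[cmath.exp(1j * n * phases[c]) for c in range(N_rf)] for n in range(N_t)]
F_bb = normalize_power(F_rf, [[0.7 + 0.2j], [-0.1 + 0.5j]], N_s)

s = [[1 + 0j]]                      # one symbol on one subcarrier
x = matmul(matmul(F_rf, F_bb), s)   # x[k] = F_RF F_BB[k] s[k], as in (1)
```

The rescaling only changes the norm of the baseband precoder, not its direction, which is why it can be applied as a final step.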

At the receiver, the UE first employs an analog combiner $\mathbf{W}_{\text{RF}} \in \mathbb{C}^{N_r \times N_r^{\text{RF}}}$ followed by a digital baseband combiner $\mathbf{W}_{\text{BB}}[k] \in \mathbb{C}^{N_r^{\text{RF}} \times N_s}$ after CP removal and frequency transformation using an N-point FFT. Similar to the unit-modulus constraint on the entries of $\mathbf{F}_{\text{RF}}$, it is assumed that the $(i,j)$-th entry of $\mathbf{W}_{\text{RF}}$ has unit modulus, i.e., $|\mathbf{W}_{\text{RF}}(i,j)| = 1$, $\forall i = 0, \cdots, N_r - 1$ and $j = 0, \cdots, N_r^{\text{RF}} - 1$. Thus, the $N_s$-dimensional complex baseband signal at the UE at the $k$-th subcarrier is given by

$$\mathbf{y}[k] = \mathbf{W}_{\text{BB}}^H[k] \mathbf{W}_{\text{RF}}^H \mathbf{H}[k] \mathbf{x}[k] + \mathbf{W}_{\text{BB}}^H[k] \mathbf{W}_{\text{RF}}^H \mathbf{v}[k], \tag{2}$$

where $\mathbf{H}[k] \in \mathbb{C}^{N_r \times N_t}$ represents the downlink channel at the $k$-th subcarrier, and $\mathbf{v}[k] \in \mathbb{C}^{N_r}$ is the additive white Gaussian noise vector associated with the $k$-th subcarrier, $\forall k \in [K]$. It is assumed that the entries of $\mathbf{v}[k]$ are independent and identically distributed (i.i.d.) random variables with zero mean and variance $\sigma^2$, i.e., $\mathbf{v}[k] \sim \mathcal{N}(\mathbf{0}, \sigma^2 \mathbf{I}_{N_r})$. Throughout this work, we assume that the channel matrices across the subcarriers $\{\mathbf{H}[k]\}_{k=1}^{K}$ are perfectly known at the UE.

Remark 1.

It is worth pointing out that, in practical wireless systems, there is one representative channel matrix for each group of subcarriers or resource blocks, whose size is referred to as the subband size, and hence there is one baseband precoder/combiner for each subband as opposed to each subcarrier. The primary reason is to reduce the overhead associated with the channel and/or precoding-related feedback. In the hybrid beamforming context, this reduces the overhead associated with the baseband precoders/combiners, and it also reduces the complexity since fewer baseband precoders and combiners need to be computed. This fact will be utilized later in the simulations (Section VI) and also in the overhead computations associated with the proposed approach and the existing hybrid beamforming methods.

Figure 1: Block diagram of OFDM-MIMO system with a BS and a UE employing hybrid precoding and combining.

III Problem Definition

The wideband hybrid beamforming problem seeks to find the set of hybrid precoders and combiners $(\mathbf{F}_{\text{RF}}, \{\mathbf{F}_{\text{BB}}[k]\}_{k=1}^{K}, \mathbf{W}_{\text{RF}}, \{\mathbf{W}_{\text{BB}}[k]\}_{k=1}^{K})$ that maximizes the spectral efficiency. Assuming the transmitted symbols follow a Gaussian distribution, the achievable spectral efficiency associated with the $k$-th subcarrier can be expressed as [5]

$$R[k] = \log_2 \det\Big(\mathbf{I}_{N_s} + \frac{\alpha}{N_s} \boldsymbol{\Gamma}^{-1}[k] \mathbf{W}_{\text{BB}}^H[k] \mathbf{W}_{\text{RF}}^H \mathbf{H}[k] \mathbf{F}_{\text{RF}} \mathbf{F}_{\text{BB}}[k] \mathbf{F}_{\text{BB}}^H[k] \mathbf{F}_{\text{RF}}^H \mathbf{H}^H[k] \mathbf{W}_{\text{RF}} \mathbf{W}_{\text{BB}}[k]\Big), \tag{3}$$

where $\boldsymbol{\Gamma}[k] := \sigma^2 \mathbf{W}_{\text{BB}}^H[k] \mathbf{W}_{\text{RF}}^H \mathbf{W}_{\text{RF}} \mathbf{W}_{\text{BB}}[k]$ represents the covariance matrix of the post-processing noise term in (2). The goal is then to design the hybrid precoders and combiners $(\mathbf{F}_{\text{RF}}, \{\mathbf{F}_{\text{BB}}[k]\}_{k=1}^{K}, \mathbf{W}_{\text{RF}}, \{\mathbf{W}_{\text{BB}}[k]\}_{k=1}^{K})$ that maximize the overall spectral efficiency while satisfying the imposed constraints on the analog and digital precoders/combiners. Maximizing the spectral efficiency, though, yields a highly intractable optimization problem that requires the hybrid precoders and combiners to be jointly optimized.
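To make (3) concrete, the sketch below evaluates it in the special case of a single stream, where the determinant collapses to a scalar. This specialization, the function names, and the toy channel are assumptions made for illustration; the effective precoder `f` stands for the product of the analog and baseband precoders, and `w` for the product of the analog and baseband combiners.

```python
import math

def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def spectral_efficiency_single_stream(H, f, w, alpha, sigma2):
    """R[k] from (3) when N_s = 1.

    f = F_RF F_BB[k] (length N_t) and w = W_RF W_BB[k] (length N_r) are the
    effective precoder and combiner, so every matrix in (3) is a scalar.
    """
    gain = abs(inner(w, matvec(H, f))) ** 2     # |w^H H f|^2
    noise = sigma2 * inner(w, w).real           # Gamma[k] = sigma^2 w^H w
    return math.log2(1 + alpha * gain / noise)  # alpha / N_s with N_s = 1

# Toy 2x2 identity channel with matched unit-norm precoder and combiner.
H = [[1, 0], [0, 1]]
R = spectral_efficiency_single_stream(H, [1, 0], [1, 0], alpha=1.0, sigma2=1.0)
```

Because the noise covariance is built from the same combiner, rescaling `w` leaves the rate unchanged, which mirrors the role of $\boldsymbol{\Gamma}^{-1}[k]$ in (3).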

Instead of maximizing the spectral efficiency, one can decouple the precoder and combiner designs, and formulate the hybrid beamforming problem as two separate low-rank matrix factorization problems [3, 6, 12]. The precoder problem aims at factorizing the optimal digital precoder $\mathbf{F}_{\text{opt}}[k] \in \mathbb{C}^{N_t \times N_s}$ into $\mathbf{F}_{\text{RF}} \mathbf{F}_{\text{BB}}[k]$, where the columns of $\mathbf{F}_{\text{opt}}[k]$ are the $N_s$ dominant right singular vectors of $\mathbf{H}[k]$, $\forall k \in [K]$. On the other hand, the combiner problem seeks to factorize $\mathbf{W}_{\text{opt}}[k] \in \mathbb{C}^{N_r \times N_s}$ into $\mathbf{W}_{\text{RF}} \mathbf{W}_{\text{BB}}[k]$, where $\mathbf{W}_{\text{opt}}[k]$ is the WMMSE solution, i.e., $\mathbf{W}_{\text{opt}}^H[k] = (\overline{\mathbf{H}}^H[k] \overline{\mathbf{H}}[k] + \sigma^2 \mathbf{I}_{N_s})^{-1} \overline{\mathbf{H}}^H[k]$ and $\overline{\mathbf{H}}[k] = \mathbf{H}[k] \mathbf{F}_{\text{opt}}[k]$. Interestingly, it has been shown in [12] that solving the factorization problems implicitly leads to maximizing the system spectral efficiency. Since both problems exhibit a similar mathematical formulation, except that the precoder problem has an additional sum power constraint, we will focus on the precoder factorization problem. However, the proposed method may be easily applied to solve the combiner problem. From an optimization perspective, given the fully digital SVD-based precoder $\mathbf{F}_{\text{opt}}[k]$, the hybrid beamforming problem can be posed as [3, 6, 12]

$$\min_{\mathbf{F}_{\text{RF}}, \{\mathbf{F}_{\text{BB}}[k]\}_{k=1}^{K}} \; \sum_{k=1}^{K} \|\mathbf{F}_{\text{opt}}[k] - \mathbf{F}_{\text{RF}} \mathbf{F}_{\text{BB}}[k]\|_F^2 \tag{4a}$$
$$\text{s.t.} \quad \mathbf{F}_{\text{RF}} \in \mathcal{F}, \tag{4b}$$
$$\qquad \|\mathbf{F}_{\text{RF}} \mathbf{F}_{\text{BB}}[k]\|_F^2 = N_s, \quad \forall k \in [K], \tag{4c}$$

where $\mathcal{F}$ is the feasible set of the analog precoders. In the wideband hybrid beamforming literature, the feasible set either includes unit-modulus constraints on the entries of $\mathbf{F}_{\text{RF}}$ [6] (denoted as $\mathcal{F}_U$), or codebook-based selection of the columns of $\mathbf{F}_{\text{RF}}$ [7] (denoted as $\mathcal{F}_C$). The two feasible sets yield an interesting overhead-performance trade-off. While the feasible set $\mathcal{F}_C$ results in much lower overhead relative to the set $\mathcal{F}_U$, the solution associated with $\mathcal{F}_U$ performs much better than that of $\mathcal{F}_C$. The intuition is that $\mathcal{F}_U$ provides a much wider search space compared to $\mathcal{F}_C$, i.e., $\mathcal{F}_C \subset \mathcal{F}_U$, and hence better performance is expected.

In this paper, we introduce a new feasible set (denoted as $\mathcal{F}_V$) to the wideband hybrid beamforming problem in (4) by enforcing a Vandermonde structure on the columns of $\mathbf{F}_{\text{RF}}$, i.e., $\mathcal{F}_V := \{\mathbf{x} \in \mathbb{C}^{N_t} ~|~ \mathbf{x} = [1, e^{j\phi}, \cdots, e^{j(N_t-1)\phi}]^T\}$, with $\phi \in [-\pi, \pi]$. The problem that this paper seeks to solve is thus the following low-rank matrix optimization problem,

$$\text{Find} \quad \mathbf{F}_{\text{RF}}(\phi_0, \cdots, \phi_{N_t^{\text{RF}}-1}), ~\{\mathbf{F}_{\text{BB}}[k]\}_{k=1}^{K} \tag{5a}$$
$$\text{s.t.} \quad \mathbf{F}_{\text{opt}}[k] \approx \mathbf{F}_{\text{RF}} \mathbf{F}_{\text{BB}}[k], ~\mathbf{F}_{\text{RF}} \in \mathcal{F}_V, \quad \forall k \in [K]. \tag{5b}$$

Notice that the sum power constraint in (4c) is temporarily omitted, as it has been shown that such a constraint can be satisfied via a simple normalization step applied to the resulting baseband precoders [3]. To the best of our knowledge, the formulation in (5) has not been considered before in the hybrid beamforming literature. Such a formulation strikes a balance between the quality of the obtained solution and the resulting overhead. In particular, the resulting solution achieves the same overhead associated with the $\mathcal{F}_C$ set while achieving comparable performance to the solutions associated with the $\mathcal{F}_U$ set. In the subsequent section, we will show that (5) can be reformulated as a tensor factorization problem where efficient tensor decomposition methods can be applied.
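The Vandermonde feasible set can be sketched in a few lines of standard-library Python. The snippet below builds an analog precoder whose columns are each described by a single phase, and checks two properties: every entry has unit modulus, and an unnormalized DFT column is one particular Vandermonde column. The function names and the assumed form of the DFT column, $e^{j 2\pi m n / N_t}$, are illustrative choices; the paper does not spell out its codebook entries.

```python
import cmath
import math

def vandermonde_column(phi, n_antennas):
    """Return [1, e^{j phi}, ..., e^{j (n_antennas - 1) phi}]."""
    return [cmath.exp(1j * n * phi) for n in range(n_antennas)]

def analog_precoder(phis, n_antennas):
    """Stack one Vandermonde column per phase into a list-of-rows matrix."""
    cols = [vandermonde_column(phi, n_antennas) for phi in phis]
    return [[col[n] for col in cols] for n in range(n_antennas)]

N_t = 8
phis = [0.4, -2.0]                  # arbitrary phases in [-pi, pi]
F_rf = analog_precoder(phis, N_t)   # N_t x N_t^RF, fully described by phis

# Assumed unnormalized DFT codebook column with index m.
m = 3
dft_col = [cmath.exp(2j * math.pi * m * n / N_t) for n in range(N_t)]
van_col = vandermonde_column(2 * math.pi * m / N_t, N_t)
dft_is_vandermonde = all(abs(a - b) < 1e-12 for a, b in zip(dft_col, van_col))
```

Only the list `phis` needs to be communicated to rebuild `F_rf`, which is the source of the low feedback overhead.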

IV PARAFAC Decomposition

Before reformulating (5) as a tensor factorization problem and to facilitate our discussion, we briefly review some key concepts that will be used in the proposed tensor approach.

IV-A Tensor Preliminaries

A third-order tensor $\mathcal{X} \in \mathbb{C}^{I \times J \times P}$ is a three-way array whose elements are indexed by three indices $(i,j,p)$. The so-called Parallel Factor decomposition (PARAFAC), a.k.a. Canonical Polyadic Decomposition (CPD), is one powerful tensor decomposition method. A tensor $\mathcal{X}$ admits a PARAFAC decomposition if it can be written as a sum of vector outer products [13],

$$\mathcal{X} = \sum_{f=1}^{F} \mathbf{a}_f \circ \mathbf{b}_f \circ \mathbf{c}_f, \tag{6}$$

where $\circ$ denotes the vector outer product, and $F$ is a positive integer that we refer to as the tensor rank or CPD rank (the smallest value such that (6) holds). The terms $\mathbf{a}_f \in \mathbb{C}^{I}$, $\mathbf{b}_f \in \mathbb{C}^{J}$ and $\mathbf{c}_f \in \mathbb{C}^{P}$ are the $f$-th columns of the so-called low-rank factors $\mathbf{A} \in \mathbb{C}^{I \times F}$, $\mathbf{B} \in \mathbb{C}^{J \times F}$, and $\mathbf{C} \in \mathbb{C}^{P \times F}$, respectively, of the tensor $\mathcal{X}$.

Different from the tensor format in (6), PARAFAC can also be written in slab format. Let $\mathbf{X}_p := \mathcal{X}(:,:,p)$ represent the $p$-th frontal slab of $\mathcal{X}$, $\forall p \in [P] := \{0, \cdots, P-1\}$. (Note that we use the MATLAB notation $\mathcal{X}(:,:,p)$ to read the frontal slab of a three-way tensor.) The PARAFAC decomposition of $\mathcal{X}$ in slab format is given by

$$\mathcal{X}(:,:,p) = \mathbf{A} \mathbf{D}_p(\mathbf{C}) \mathbf{B}^T, \quad \forall p \in [P], \tag{7}$$

where $\mathbf{D}_p(\mathbf{C}) := \text{Diag}(\mathbf{C}(p,:)) \in \mathbb{C}^{F \times F}$ is the diagonal matrix whose diagonal elements are the entries of the $p$-th row of $\mathbf{C}$. Throughout this paper, we will use the notation $\mathcal{X} := \llbracket \mathbf{A}, \mathbf{B}, \mathbf{C} \rrbracket$ to denote (7).
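The equivalence between the outer-product form (6) and the slab form (7) can be checked numerically. The following sketch uses small made-up factor matrices and plain Python lists; the helper names are illustrative and not part of any tensor library.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def diag(v):
    """Square matrix with the entries of v on its diagonal."""
    return [[v[i] if i == j else 0 for j in range(len(v))] for i in range(len(v))]

def cpd_entry(A, B, C, i, j, p):
    """Entry (i, j, p) of the sum of outer products in (6)."""
    return sum(A[i][f] * B[j][f] * C[p][f] for f in range(len(A[0])))

def frontal_slab(A, B, C, p):
    """p-th frontal slab A Diag(C[p, :]) B^T, as in (7)."""
    return matmul(matmul(A, diag(C[p])), transpose(B))

# Made-up factors with I = 3, J = 2, P = 2 and F = 2 components.
A = [[1, 2], [0, 1j], [3, -1]]
B = [[1, 1], [2, -1j]]
C = [[1, 1], [0.5, 2j]]

max_err = max(abs(frontal_slab(A, B, C, p)[i][j] - cpd_entry(A, B, C, i, j, p))
              for p in range(2) for i in range(3) for j in range(2))
```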

IV-B Identifiability

One distinctive property of tensors is that the PARAFAC model is essentially unique under mild conditions, even if $F$ is greater than $\max(I,J,P)$. The definition of essential uniqueness is presented as follows.

Definition 1.

The PARAFAC decomposition of a tensor $\mathcal{X}$, $\mathcal{X} := \llbracket \mathbf{A}, \mathbf{B}, \mathbf{C} \rrbracket$, is said to be essentially unique if $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$ are identifiable up to scaling and permutation. This means that if $\mathcal{X} := \llbracket \overline{\mathbf{A}}, \overline{\mathbf{B}}, \overline{\mathbf{C}} \rrbracket$ for some $\overline{\mathbf{A}} \in \mathbb{C}^{I \times F}$, $\overline{\mathbf{B}} \in \mathbb{C}^{J \times F}$, and $\overline{\mathbf{C}} \in \mathbb{C}^{P \times F}$, then there exist a permutation matrix $\boldsymbol{\Pi} \in \mathbb{R}^{F \times F}$ and diagonal scaling matrices $\{\boldsymbol{\Lambda}_i\}_{i=1}^{3}$ such that

$$\mathbf{A} = \overline{\mathbf{A}} \boldsymbol{\Pi} \boldsymbol{\Lambda}_1, \quad \mathbf{B} = \overline{\mathbf{B}} \boldsymbol{\Pi} \boldsymbol{\Lambda}_2, \quad \mathbf{C} = \overline{\mathbf{C}} \boldsymbol{\Pi} \boldsymbol{\Lambda}_3, \quad \boldsymbol{\Lambda}_1 \boldsymbol{\Lambda}_2 \boldsymbol{\Lambda}_3 = \mathbf{I}. \tag{8}$$

If there is no structure imposed on the low-rank factors, then a generic identifiability condition for PARAFAC uniqueness is given in [14]. If, however, one or more of the low-rank factor matrices have a Vandermonde structure, then more relaxed uniqueness conditions based on the Kruskal rank can be found in [13, 15, 10]. The latest identifiability result that is most relevant to the problem considered herein is given as follows.

Theorem 1.

[16] Consider the data model in (7) and assume that the factors $\mathbf{A} \in \mathbb{C}^{I \times F}$ and $\mathbf{C} \in \mathbb{C}^{P \times F}$ are Vandermonde and that $\mathbf{B} \in \mathbb{C}^{J \times F}$ is tall and full rank. If

$$k_{\mathbf{A}} + \min(P-1, F) \geq F + 1, \tag{9}$$

then the PARAFAC decomposition of $\mathcal{X}$ in terms of $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ is essentially unique, where $k_{\mathbf{A}}$ denotes the Kruskal rank (k-rank) of the matrix $\mathbf{A}$.

It has been shown in [16] that a matrix with Vandermonde structure has full k-rank, i.e., $k_{\mathbf{A}} = \min(I, F)$. The condition in (9) will be interpreted later in the context of hybrid beamforming.

V Hybrid Beamforming via PARAFAC

In this section, it will be shown how the wideband hybrid beamforming problem in (5) can be reformulated as a tensor decomposition problem. Let us define the matrices $\mathbf{X} = [\mathbf{F}_{\text{opt}}[1], \cdots, \mathbf{F}_{\text{opt}}[K]] \in \mathbb{C}^{N_t \times K N_s}$ and $\mathbf{B} = [\mathbf{F}_{\text{BB}}[1], \cdots, \mathbf{F}_{\text{BB}}[K]]^T \in \mathbb{C}^{K N_s \times N_t^{\text{RF}}}$. Then it can be easily seen that (5) can be expressed in more compact form as

$$\text{Find} \quad \mathbf{F}_{\text{RF}}(\phi_0, \cdots, \phi_{N_t^{\text{RF}}-1}), ~\mathbf{B} \tag{10a}$$
$$\text{s.t.} \quad \mathbf{X} \approx \mathbf{F}_{\text{RF}} \mathbf{B}^T, ~\mathbf{F}_{\text{RF}} \in \mathcal{F}_V. \tag{10b}$$
Remark 2.

Notice that while (10) assumes a uniform linear array (ULA) structure on the columns of the analog beamformer $\mathbf{F}_{\text{RF}}$, the proposed tensor method can be further extended to handle other array structures, e.g., a uniform planar array (UPA) [16]. In that case, the proposed method can be used to recover azimuth and elevation estimates for each column of $\mathbf{F}_{\text{RF}}$. This is in fact a significant advantage of the proposed approach relative to the state of the art. Owing to space limitations, we present only the ULA structure here.

Let us construct the following two subarrays,

$$\mathbf{A} = \mathbf{F}_{\text{RF}}(1:\text{end}-1, :), \quad (\text{all rows except last}) \tag{11a}$$
$$\overline{\mathbf{A}} = \mathbf{F}_{\text{RF}}(2:\text{end}, :), \quad (\text{all rows except first}) \tag{11b}$$

Then, by exploiting the Vandermonde structure of the columns of the matrix $\mathbf{F}_{\text{RF}}$, it follows that the $(N_t-1) \times N_t^{\text{RF}}$ matrices $\mathbf{A}$ and $\overline{\mathbf{A}}$ are displaced but otherwise identical subarrays, i.e.,

$$\overline{\mathbf{A}} = \mathbf{A} \boldsymbol{\Phi}_1, \tag{12}$$

where $\boldsymbol{\Phi}_1 := \text{Diag}([e^{j\phi_0}, \cdots, e^{j\phi_{N_t^{\text{RF}}-1}}])$. Further, for consistency, let $\mathbf{A} = \mathbf{A} \boldsymbol{\Phi}_0$, where $\boldsymbol{\Phi}_0 = \mathbf{I}_{N_t^{\text{RF}}}$. Let $\mathbf{C} \in \mathbb{C}^{P \times N_t^{\text{RF}}}$ be a matrix holding the diagonal of $\boldsymbol{\Phi}_p$ on its $p$-th row, for $p = 0, \cdots, P-1$ and $P = 2$. Then, upon defining $\mathbf{X}_0 = \mathbf{X}(1:\text{end}-1, :) \in \mathbb{C}^{(N_t-1) \times K N_s}$, $\mathbf{X}_1 = \mathbf{X}(2:\text{end}, :) \in \mathbb{C}^{(N_t-1) \times K N_s}$ and $\mathbf{D}_p(\mathbf{C}) = \boldsymbol{\Phi}_p$, for $p = 0, 1$, we can write the following,

$$\mathbf{X}_0 = \mathbf{A} \mathbf{D}_0(\mathbf{C}) \mathbf{B}^T, \tag{13}$$
$$\mathbf{X}_1 = \mathbf{A} \mathbf{D}_1(\mathbf{C}) \mathbf{B}^T. \tag{14}$$
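The two-slab structure in (13) and (14) can be verified on a noiseless toy example. The sketch below builds a matrix that factors exactly as a Vandermonde analog precoder times a baseband factor, extracts the two row-shifted submatrices, and compares them with the model. It assumes that the diagonal of $\boldsymbol{\Phi}_1$ holds $e^{j\phi_i}$, which is what the Vandermonde column definition implies; all sizes, values, and helper names are made up for illustration.

```python
import cmath

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def vandermonde(phis, n_rows):
    """Rows n = 0..n_rows-1, one column [e^{j n phi}] per phase."""
    return [[cmath.exp(1j * n * phi) for phi in phis] for n in range(n_rows)]

def max_abs_diff(M, N):
    return max(abs(m - n) for rm, rn in zip(M, N) for m, n in zip(rm, rn))

N_t, phis = 5, [0.5, -1.2]             # N_t^RF = 2 columns
F_rf = vandermonde(phis, N_t)
B = [[1 + 2j, 0.5 - 1j], [-0.3j, 2 + 0j], [0.7 + 0.1j, -1 + 1j]]  # K*N_s = 3 rows
X = matmul(F_rf, transpose(B))         # exact version of X = F_RF B^T

A = F_rf[:-1]                          # all rows except last, as in (11a)
X0, X1 = X[:-1], X[1:]                 # the two slabs
Phi1 = [[cmath.exp(1j * phis[0]), 0], [0, cmath.exp(1j * phis[1])]]

err0 = max_abs_diff(X0, matmul(A, transpose(B)))                # checks (13)
err1 = max_abs_diff(X1, matmul(matmul(A, Phi1), transpose(B)))  # checks (14)
```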

From the PARAFAC decomposition slab format defined in (7), it is easy to see that (13) and (14) form a two-slab, i.e., $P = 2$, PARAFAC model with Vandermonde structure in one mode. Thus, solving (10) is tantamount to decomposing the tensor $\mathcal{X} \in \mathbb{C}^{(N_t-1) \times K N_s \times P}$ with its $p$-th slab defined as $\mathcal{X}(:,:,p) := \mathbf{X}_p$, for $p = 0, \cdots, P-1$ and $P = 2$. From an optimization perspective, this can be expressed as

$$\min_{\mathbf{A}, \mathbf{B}, \mathbf{C}} ~ \|\mathcal{X} - \llbracket \mathbf{A}, \mathbf{B}, \mathbf{C} \rrbracket\|_F^2 \tag{15}$$

Several algorithms have been developed to tackle the optimization problem (15) [17]. In this work, we adopt the trilinear alternating least square (TALS) algorithm implemented in the widely known Tensorlab MATLAB toolbox [18].

Considering the condition in (9) in the context of hybrid beamforming, one can easily see that with $P = 2$, and given that $\mathbf{A}$ is tall (i.e., $N_t \geq N_t^{\text{RF}} + 1$) and Vandermonde, the condition in (9) is always satisfied. The only remaining requirement to ensure essential uniqueness of $\mathcal{X}$ is that $\mathbf{B} \in \mathbb{C}^{K N_s \times N_t^{\text{RF}}}$ be tall and full rank. This requires the number of subcarriers multiplied by the number of streams to be greater than or equal to the number of transmit RF chains. This renders our proposed method inapplicable to single-carrier systems, i.e., $K = 1$, with $N_t^{\text{RF}} > N_s$; otherwise, such a condition can be easily satisfied with a modest number of subcarriers.
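These dimension checks are simple enough to script. The following sketch evaluates condition (9) for the two-slab tensor, using the full k-rank property of Vandermonde matrices quoted above (taken here to hold for distinct phases, which is an assumption of the sketch), and separately tests whether the baseband factor can be tall. The function names are illustrative, and the check on $\mathbf{B}$ covers only its shape, not its rank.

```python
def vandermonde_krank(n_rows, n_cols):
    """k-rank of a Vandermonde matrix, assumed full: min(rows, cols)."""
    return min(n_rows, n_cols)

def condition_9_holds(n_t, n_rf, n_slabs=2):
    """Check k_A + min(P - 1, F) >= F + 1 with A of size (N_t - 1) x N_t^RF."""
    k_a = vandermonde_krank(n_t - 1, n_rf)
    return k_a + min(n_slabs - 1, n_rf) >= n_rf + 1

def b_is_tall(n_subcarriers, n_streams, n_rf):
    """Shape check only: K * N_s >= N_t^RF. Full rank is not verified here."""
    return n_subcarriers * n_streams >= n_rf

# Setting resembling the simulations: N_t = 32, N_t^RF = 3, K = 30, N_s = 2.
ok_condition = condition_9_holds(32, 3)
ok_tall = b_is_tall(30, 2, 3)

# Single-carrier case with more RF chains than streams.
single_carrier_tall = b_is_tall(1, 1, 2)
```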

Let $\overline{\mathbf{A}} \in \mathbb{C}^{(N_t-1) \times N_t^{\text{RF}}}$, $\overline{\mathbf{B}} \in \mathbb{C}^{K N_s \times N_t^{\text{RF}}}$ and $\overline{\mathbf{C}} \in \mathbb{C}^{2 \times N_t^{\text{RF}}}$ be the resulting solution of (15). The goal now is to find $(\phi_0, \cdots, \phi_{N_t^{\text{RF}}-1})$ and $\{\mathbf{F}_{\text{BB}}[k]\}_{k=1}^{K}$ given $(\overline{\mathbf{A}}, \overline{\mathbf{B}}, \overline{\mathbf{C}})$. To do so, we first recover the $N_t^{\text{RF}}$ phases $\{\phi_i\}_{i=0}^{N_t^{\text{RF}}-1}$ from the columns of $\overline{\mathbf{A}}$ by simply reading the angles of the first elements of the columns of $\overline{\mathbf{A}}$.

Input: $\{\mathbf{F}_{\text{opt}}[k] \in \mathbb{C}^{N_t \times N_s}\}_{k=1}^{K}$, $N_t^{\text{RF}}$
Construct $\mathbf{X} = [\mathbf{F}_{\text{opt}}[1], \cdots, \mathbf{F}_{\text{opt}}[K]] \in \mathbb{C}^{N_t \times K N_s}$
Construct $\mathcal{X} \in \mathbb{C}^{(N_t-1) \times K N_s \times 2}$ as $\mathcal{X}(:,:,1) = \mathbf{X}(1:\text{end}-1, :)$ and $\mathcal{X}(:,:,2) = \mathbf{X}(2:\text{end}, :)$
Decompose $\mathcal{X} := \llbracket \overline{\mathbf{A}}, \overline{\mathbf{B}}, \overline{\mathbf{C}} \rrbracket$ using TALS
for $i = 0 : N_t^{\text{RF}} - 1$ do
       Recover $\phi_i$ by computing the angle of the first element of $\overline{\mathbf{A}}(:,i)$
       Form $\mathbf{F}_{\text{RF}}(:,i) = [1, e^{j\phi_i}, \cdots, e^{j(N_t-1)\phi_i}]^T$
       Obtain $\lambda_1^i = \overline{\mathbf{A}}(1,i)$, $\lambda_3^i = \overline{\mathbf{C}}(1,i)$, $\lambda_2^i = \frac{1}{\lambda_1^i \lambda_3^i}$
       Obtain $\mathbf{B}(:,i) = \overline{\mathbf{B}}(:,i) / \lambda_2^i$
end for
Reshape $\mathbf{B}$ to retrieve $\{\mathbf{F}_{\text{BB}}[k] \in \mathbb{C}^{N_t^{\text{RF}} \times N_s}\}_{k=1}^{K}$.
Algorithm 1 V-TPAR: Vandermonde Two-slab PARAFAC

To obtain $\{\mathbf{F}_{\text{BB}}[k]\}_{k=1}^{K}$, we need to resolve the complex scaling ambiguity that is inherent to PARAFAC (see Definition 1 for the essential uniqueness of PARAFAC). Note that we ignore the permutation ambiguity since, in the hybrid beamforming context, finding the analog and baseband precoders up to a common permutation is irrelevant, as it merely amounts to shuffling the RF chains. The complex scale ambiguity, though, is important, as it amounts to entirely changing the directions of the precoders. Fortunately, since the columns of both matrices $\overline{\mathbf{A}}$ and $\overline{\mathbf{C}}$ exhibit a Vandermonde structure, the column-wise scale ambiguity in both matrices can be resolved by simply dividing the elements of each column by the first element. Once the complex scale ambiguities associated with the columns of $\overline{\mathbf{A}}$ and $\overline{\mathbf{C}}$, denoted as $\boldsymbol{\Lambda}_1$ and $\boldsymbol{\Lambda}_3$, respectively, are resolved, it can be seen from (8) that the column-wise scale ambiguity of $\mathbf{B}$, denoted as $\boldsymbol{\Lambda}_2$, can be easily obtained as $\boldsymbol{\Lambda}_2 = (\boldsymbol{\Lambda}_3 \boldsymbol{\Lambda}_1)^{-1}$. The above procedures for solving the wideband hybrid beamforming problem using Vandermonde-constrained Two-slab PARAFAC (V-TPAR) are outlined in Algorithm 1.
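The scaling-resolution step can be illustrated for one column triple. The sketch below assumes the estimated columns equal the true ones up to complex scalars whose product is one, takes the scalars from the leading entries, and rescales the baseband column accordingly. As a choice of this sketch, the phase is read from the ratio of the second to the first entry of the estimated Vandermonde column, so that the unknown scaling cancels first. The function name and the synthetic values are hypothetical.

```python
import cmath

def resolve_column(a_bar, b_bar, c_bar):
    """Undo the PARAFAC scaling on one column triple (a_bar, b_bar, c_bar).

    Assumes a_bar = lam1 * a and c_bar = lam3 * c with a and c Vandermonde
    (leading entry 1), and b_bar = lam2 * b with lam1 * lam2 * lam3 = 1.
    """
    lam1 = a_bar[0]                         # leading entry of a is 1
    lam3 = c_bar[0]                         # leading entry of c is 1
    lam2 = 1 / (lam1 * lam3)
    phi = cmath.phase(a_bar[1] / a_bar[0])  # phase step after removing lam1
    b = [x / lam2 for x in b_bar]
    return phi, b

# Synthetic check with made-up values.
phi_true = 0.9
a = [cmath.exp(1j * n * phi_true) for n in range(4)]
c = [1, cmath.exp(1j * phi_true)]
b = [1 + 1j, -2j, 0.5]
lam1, lam3 = 2 * cmath.exp(0.7j), 0.5 * cmath.exp(-1.3j)
a_bar = [lam1 * x for x in a]
c_bar = [lam3 * x for x in c]
b_bar = [x / (lam1 * lam3) for x in b]

phi_hat, b_hat = resolve_column(a_bar, b_bar, c_bar)
```

With noisy factors, the leading entries are only approximate scalings, so the recovered phase and baseband column would be estimates rather than exact values.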

The complexity of Algorithm 1 is dominated by decomposing the tensor $\mathcal{X}$ using the iterative TALS algorithm. The per-iteration complexity of TALS is equal to the cost of inverting an $(N_t^{\text{RF}})^2 \times (N_t^{\text{RF}})^2$ matrix. The overall complexity then depends on the total number of iterations, which in turn depends on the problem and the size of the tensor (see [17] and references therein for convergence properties of TALS). As we will see later, for the considered problem, a few iterations of TALS seem to be sufficient to obtain a high-quality solution.

Figure 2: The BER performance with $N^{\text{RF}}=2$ and $N_s=1$.
Parameter                        Value
Carrier frequency                28 GHz
Subcarrier spacing               60 kHz
Modulation                       16-QAM
Code rate                        0.49
Number of transmit antennas      32
Number of receive antennas       8
UE speed                         0.5 km/hr
Delay spread                     300 ns
Channel model                    CDL-C
TABLE I: Parameter settings for the simulations.

VI Simulations

In this section, we provide numerical results on a 3GPP link-level channel model to assess the performance of the proposed method. The adopted simulation parameters are listed in Table I. We use the CDL-C channel model with the delay spread set to 300 ns. Both the BS and the UE are equipped with uniform linear arrays whose antenna elements are separated by half a wavelength. All results are averaged over 200 realizations. The number of subbands is set to 30, i.e., $K=30$, where each subband consists of one resource block (RB), i.e., 12 subcarriers. The channel matrix for each subband is obtained by averaging the channels across the 12 subcarriers. To implement the proposed method, we used the TALS algorithm implemented in the Tensorlab MATLAB toolbox. Finally, all simulations were performed on an Intel(R) Xeon(R) Gold 6234 CPU.
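The subband averaging described above amounts to an entrywise mean over consecutive groups of per-subcarrier channel matrices. A minimal sketch follows, assuming the channels are stored as a list of list-of-rows matrices ordered by subcarrier index; the function name and the toy values are illustrative.

```python
def average_subbands(channels, subband_size):
    """Average consecutive groups of `subband_size` channel matrices entrywise."""
    if len(channels) % subband_size != 0:
        raise ValueError("number of subcarriers must be a multiple of subband_size")
    averaged = []
    for start in range(0, len(channels), subband_size):
        group = channels[start:start + subband_size]
        rows, cols = len(group[0]), len(group[0][0])
        averaged.append([[sum(H[i][j] for H in group) / subband_size
                          for j in range(cols)] for i in range(rows)])
    return averaged

# Toy example: four 1x2 channels grouped into two subbands of two subcarriers.
toy = [[[1, 2]], [[3, 4]], [[5j, 0]], [[1j, 2]]]
toy_avg = average_subbands(toy, 2)

# With 12 subcarriers per resource block, 360 subcarriers give K = 30 subbands.
n_subbands = len(average_subbands([[[1.0]]] * 360, 12))
```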

Figure 3: The BER performance with $N^{\text{RF}}=3$ and $N_s=2$.

To benchmark the performance of the proposed method, we use the manifold optimization (MO) alternating minimization algorithm [6], the phase extraction (PE) alternating minimization algorithm [6] and the OMP algorithm [3] as baselines. Both MO and PE solve the wideband hybrid beamforming problem (4) with unit-modulus constraints on the entries of the analog beamformers, while the OMP algorithm solves (4) with a codebook constraint on the columns of the analog beamformers. For OMP, we use the DFT codebook for both $\mathbf{F}_{\text{RF}}$ and $\mathbf{W}_{\text{RF}}$.

From the feedback overhead perspective, one can see from Table II that the Vandermonde feasible set (our proposed method) attains the same overhead as the codebook one (OMP). In particular, the number of parameters to feed back is independent of the number of transmit (receive) antennas and is equal to the number of transmit (receive) RF chains if the analog precoders and combiners are computed at the UE (BS). On the other hand, the unit-modulus feasible set (MO and PE) suffers from a large overhead that scales with the number of transmit/receive antennas, thereby limiting its use in limited feedback systems.

${\bf F}_{\text{RF}}$ feasible set     $\mathcal{F}_{u}$     $\mathcal{F}_{c}$     $\mathcal{F}_{v}$
Num. of parameters     $N_{t}N_{t}^{\text{RF}}$     $N_{t}^{\text{RF}}$     $N_{t}^{\text{RF}}$
Method     MO and PE     OMP     T-VPAR
TABLE II: Feedback overhead associated with the different feasible sets of the analog beamformers.
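The parameter counts in Table II can be evaluated for a concrete array size as in the sketch below. The function name `feedback_parameters`, the set labels and the values $N_{t}=32$ and $N_{t}^{\text{RF}}=3$ are illustrative assumptions and are not taken from Table I.

```python
def feedback_parameters(feasible_set, n_antennas, n_rf_chains):
    """Number of parameters to feed back for the analog precoder (Table II)."""
    if feasible_set == "unit_modulus":
        # One phase per matrix entry: N_t * N_t^RF.
        return n_antennas * n_rf_chains
    if feasible_set in ("codebook", "vandermonde"):
        # One index (codebook) or one generator (Vandermonde) per RF chain.
        return n_rf_chains
    raise ValueError(f"unknown feasible set: {feasible_set}")


# Hypothetical sizes, chosen only for illustration.
n_t, n_rf = 32, 3
for name in ("unit_modulus", "codebook", "vandermonde"):
    print(name, feedback_parameters(name, n_t, n_rf))
# unit_modulus 96
# codebook 3
# vandermonde 3
```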

To evaluate the practical impact of the different hybrid beamforming algorithms, we report the coded BER in an end-to-end system. First, we consider a scenario with $N^{\text{RF}}=2$ and $N_{s}=1$, i.e., $N^{\text{RF}}=2N_{s}$, while the rest of the parameters are as listed in Table I. It is known from [6] that when $N^{\text{RF}}=N_{s}$, PE achieves the same performance as MO at much lower complexity, while the performance of PE degrades when $N^{\text{RF}}>N_{s}$. Fig. 2 shows the end-to-end coded BER performance of the different methods. One can see that, for this case, the proposed method achieves more than 1 dB SNR gain relative to OMP. More interestingly, the proposed approach outperforms the PE method, with more than an order of magnitude reduction in BER at -8 dB. Further, when $N_{s}<N^{\text{RF}}<2N_{s}$, as shown in Fig. 3, the proposed method significantly outperforms OMP, with roughly 4 dB SNR gain. Finally, one can see that both the tensor method and PE attain approximately the same performance, with 1 dB loss relative to MO.

Next, we simulated another scenario with $N^{\text{RF}}=N_{s}=2$. As Fig. 4 depicts, PE now achieves the same performance as MO, while the proposed method incurs roughly 2 dB SNR loss. In addition, one can see that the proposed algorithm considerably outperforms the OMP algorithm, with more than an order of magnitude reduction in BER when the SNR exceeds -10 dB.

Finally, to assess the complexity of the proposed tensor approach, Fig. 5 depicts the average run time of the proposed method relative to the considered baselines for two setups: $N^{\text{RF}}=N_{s}=2$, and $N^{\text{RF}}=3$ with $N_{s}=2$. We observe that the run time of the proposed method is comparable to that of PE and more than an order of magnitude lower than that of MO in both setups. OMP features the lowest run time, but this comes at the expense of performance.

Figure 4: The BER performance for $N^{\text{RF}}=N_{s}=2$.

VII Conclusions

This paper has considered single-user hybrid precoding and combining in wideband mmWave MIMO systems under Vandermonde constraints on the analog precoders and combiners. The problem is formulated as a tensor factorization problem where PARAFAC is invoked to find the Vandermonde-constrained analog beamformers and the set of baseband precoders, with identifiability guarantees. Numerical results on a 3GPP link-level test bench have shown that the proposed method outperforms the codebook-based OMP baseline at the same feedback overhead and performs comparably to the unit-modulus approaches at a much lower overhead. In particular, the proposed method has been shown to strike a balance between performance, overhead and complexity. As future work, we aim to extend the proposed framework to other array structures such as the uniform planar array (UPA). Further, we plan to explore the impact of increasing the number of subarrays (multi-slab PARAFAC as opposed to two-slab) on the estimation accuracy, and its trade-off with computational complexity.

Figure 5: Average run time results. (a) $N^{\text{RF}}=N_{s}=2$. (b) $N^{\text{RF}}=3$ and $N_{s}=2$.

References

  • [1] F. Boccardi, R. W. Heath, A. Lozano, T. L. Marzetta, and P. Popovski, “Five disruptive technology directions for 5G,” IEEE Commun. Mag., vol. 52, no. 2, pp. 74–80, Feb. 2014.
  • [2] S. Rangan, T. S. Rappaport, and E. Erkip, “Millimeter-wave cellular wireless networks: Potentials and challenges,” Proc. of the IEEE, vol. 102, no. 3, pp. 366–385, Feb. 2014.
  • [3] O. El Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. W. Heath, “Spatially sparse precoding in millimeter wave MIMO systems,” IEEE Trans. on Wir. Commun., vol. 13, no. 3, pp. 1499–1513, Nov. 2014.
  • [4] A. Alkhateeb, O. El Ayach, G. Leus, and R. W. Heath, “Channel estimation and hybrid precoding for millimeter wave cellular systems,” IEEE J. of Sel. Topics in Sig. Process., vol. 8, no. 5, pp. 831–846, Oct. 2014.
  • [5] A. Goldsmith, S. A. Jafar, N. Jindal, and S. Vishwanath, “Capacity limits of MIMO channels,” IEEE J. on Sel. Areas in Commun., vol. 21, no. 5, pp. 684–702, June 2003.
  • [6] X. Yu, J.-C. Shen, J. Zhang, and K. B. Letaief, “Alternating minimization algorithms for hybrid precoding in millimeter wave MIMO systems,” IEEE J. of Sel. Topics in Sig. Process., vol. 10, no. 3, pp. 485–500, Apr. 2016.
  • [7] A. Alkhateeb and R. W. Heath, “Frequency selective hybrid precoding for limited feedback millimeter wave systems,” IEEE Trans. on Commun., vol. 64, no. 5, pp. 1801–1818, May 2016.
  • [8] D. J. Love, R. W. Heath, V. K. Lau, D. Gesbert, B. D. Rao, and M. Andrews, “An overview of limited feedback in wireless communication systems,” IEEE J. on Sel. Areas in Commun., vol. 26, no. 8, pp. 1341–1365, Oct. 2008.
  • [9] A. L. de Almeida, G. Favier, and J. C. M. Mota, “PARAFAC-based unified tensor modeling for wireless communication systems with application to blind multiuser equalization,” Signal Processing, vol. 87, no. 2, pp. 337–351, Feb. 2007.
  • [10] N. D. Sidiropoulos, G. B. Giannakis, and R. Bro, “Blind PARAFAC receivers for DS-CDMA systems,” IEEE Trans. on Sig. Process., vol. 48, no. 3, pp. 810–823, June 2000.
  • [11] M. S. Ibrahim, C. I. Kanatsoulis, and N. D. Sidiropoulos, “Downlink channel feedback for FDD massive MIMO systems via tensor compression and sampling,” in 54th Asilomar Conference on Sig., Sys., and Comp., Pacific Grove, CA, USA, Nov. 2020, pp. 27–31.
  • [12] O. El Ayach, R. W. Heath, S. Abu-Surra, S. Rajagopal, and Z. Pi, “Low complexity precoding for large millimeter wave MIMO systems,” in IEEE Int. Conf. on Commun. (ICC), Ottawa, Canada, June 2012, pp. 3724–3729.
  • [13] N. D. Sidiropoulos and R. Bro, “On the uniqueness of multilinear decomposition of N-way arrays,” Journal of Chemometrics, vol. 14, no. 3, pp. 229–239, Nov. 2000.
  • [14] L. Chiantini and G. Ottaviani, “On generic identifiability of 3-tensors of small rank,” SIAM Journal on Matrix Analysis and Applications, vol. 33, no. 3, pp. 1018–1037, Sept. 2012.
  • [15] L. De Lathauwer, “A link between the canonical decomposition in multilinear algebra and simultaneous matrix diagonalization,” SIAM Journal on Matrix Analysis and Applications, vol. 28, no. 3, pp. 642–666, Sept. 2006.
  • [16] N. D. Sidiropoulos, R. Bro, and G. B. Giannakis, “Parallel factor analysis in sensor array processing,” IEEE Trans. on Sig. Process., vol. 48, no. 8, pp. 2377–2388, Aug. 2000.
  • [17] N. D. Sidiropoulos, L. De Lathauwer, X. Fu, K. Huang, E. E. Papalexakis, and C. Faloutsos, “Tensor decomposition for signal processing and machine learning,” IEEE Trans. on Sig. Process., vol. 65, no. 13, pp. 3551–3582, Apr. 2017.
  • [18] N. Vervliet, O. Debals, L. Sorber, M. Van Barel, and L. De Lathauwer, “Tensorlab v3.0,” Mar. 2016.