
Interleaved Training for Massive MIMO Downlink via Exploring Spatial Correlation

Cheng Zhang,  Chang Liu,  Yindi Jing, 
Minjie Ding,  Yongming Huang
This work was supported in part by the National Natural Science Foundation of China under Grants 62271140 and 62225107, the Fundamental Research Funds for the Central Universities under Grant 2242022k60002, the Natural Science Foundation on Frontier Leading Technology Basic Research Project of Jiangsu under Grant BK20222001, and the Major Key Project of PCL. (Corresponding authors: Y. Huang, C. Zhang.) C. Zhang, C. Liu, M. Ding and Y. Huang are with the National Mobile Communication Research Laboratory, Southeast University, Nanjing 210096, China, and also with the Purple Mountain Laboratories, Nanjing 211111, China (e-mail: zhangcheng_seu, 220210844, 220200816, huangym@seu.edu.cn). Y. Jing is with the Department of Electrical and Computer Engineering, University of Alberta, Edmonton T6G 1H9, Canada (e-mail: yindi@ualberta.ca).
Abstract

Interleaved training has been studied for single-user and multi-user massive MIMO downlink with either fully-digital or hybrid beamforming. However, the impact of channel correlation on its average training overhead is rarely addressed. In this paper, we explore the channel correlation to improve the interleaved training for single-user massive MIMO downlink. For the beam-domain interleaved training, we propose a modified scheme by optimizing the beam training codebook. The basic antenna-domain interleaved training is also improved by dynamically adjusting the training order of the base station (BS) antennas during the training process based on the values of the already trained channels. Exact and simplified approximate expressions of the average training length are derived in closed-form for the basic and modified beam-domain schemes and the basic antenna-domain scheme in correlated channels. For the modified antenna-domain scheme, a deep neural network (DNN)-based approximation is provided for fast performance evaluation. Analytical results and simulations verify the accuracy of our derived training length expressions and explicitly reveal the impact of system parameters on the average training length. In addition, the modified beam/antenna-domain schemes are shown to have a shorter average training length compared to the basic schemes.

Index Terms:
Massive MIMO, interleaved training, spatial correlation, conditional distribution, training overhead.

I Introduction

By exploiting the large number of spatial degrees of freedom provided by large-scale antenna arrays, massive MIMO systems can achieve significant performance improvements over conventional MIMO systems [1, 2]. One crucial practical issue for the massive MIMO downlink is the acquisition of channel state information (CSI) at the base station (BS), especially for frequency-division-duplexing (FDD) systems with no uplink-downlink channel reciprocity [3]. Traditional downlink training and channel estimation schemes incur prohibitive training overhead due to the massive number of channel coefficients to be estimated [4].

Existing studies on the downlink CSI acquisition of massive MIMO can be divided into the following categories. In [5, 6, 7, 8], the channel statistics, e.g., spatial and/or temporal correlation, are utilized to conduct beamformed channel estimation. In [9, 10, 11, 12, 13, 14], compressive sensing algorithms are designed to exploit the channel sparsity in the angular domain and/or the common sparsity among users and/or subcarriers. In [15, 16, 17, 18], either the partial reciprocity between the uplink and downlink channels, e.g., with similar angle and delay of propagation paths, or their implicit relationships, e.g., both being the functions of user location, are used for channel training designs.

These aforementioned schemes aim to obtain the complete antenna-domain CSI with the smallest possible pilot overhead before data transmission; the training design and the data transmission design are thus decoupled, which limits the achievable tradeoff between training overhead and performance. Further, only the throughput or diversity gain has been considered in these existing works; the quality-of-service (QoS) provided by the obtained CSI is not taken into account during the training process. The training length or pilot overhead is therefore fixed and does not adapt to specific channel realizations. For massive MIMO systems, it is possible to use partial CSI to design the beamforming scheme for the data transmission period, especially under the outage probability performance measure. A new training method, namely interleaved training, has thus been proposed to dynamically adjust the training overhead according to the required QoS and the specific channel realization.

The idea of interleaved training was proposed in [19, 20] for the downlink of single-user full-digital massive antenna systems with independent and identically distributed (i.i.d.) channels, where the channels of different antennas are trained sequentially and the estimated CSI or indicator is fed back at the end of each training step. With each feedback, the BS decides to conduct the training of another antenna’s channel or to terminate the training process based on whether an outage occurs. Compared to traditional schemes, the interleaved training can achieve a significant reduction in training overhead with no degradation of outage performance.

The work in [21] applied the idea of interleaved training to beam-domain transmission, where joint beam-based interleaved training and data transmission schemes are proposed for massive MIMO systems with single and multiple users. In [22], an improved codebook is further designed for interleaved training in the millimeter-wave hybrid massive MIMO downlink. In [23], a joint interleaved training and transmission design is proposed for large intelligent surface (LIS) assisted systems under i.i.d. Rayleigh fading channels. Recently, an interleaved training design was proposed for the multi-user massive MIMO downlink in [24, 25], in which analytical results on the training length and the transmission success rate are provided for maximum-ratio transmission (MRT) precoding. Different from the single-user scheme, the multi-user scheme needs to judge whether the signal-to-interference-plus-noise ratio (SINR) requirements of all users can be satisfied with partial CSI during the interleaved training procedure. The advantages of interleaved training are clearly demonstrated in the above studies.

Different from previous papers on interleaved training [19, 20, 21, 22, 23, 24, 25], in this paper, we focus on exploiting channel statistics to further improve the performance of interleaved training for single-user massive MIMO downlink systems. Both beam-domain and antenna-domain modified interleaved training designs are proposed, and we further provide analytical results on the average training length of the proposed schemes and their comparison with those of the basic schemes. Detailed contributions are summarized as follows.

  • In the modified beam-domain scheme, both the beam direction and the beam training order are optimized based on the channel correlation information. In the modified antenna-domain scheme, the BS antenna training order is dynamically adjusted during the training process according to the channel values of the already trained antennas. To this end, we derive the conditional distribution of the untrained BS channels given the values of the trained channels under general correlated channels. For exponentially correlated channels, we further demonstrate that the conditional distribution of any untrained antenna's channel depends only on the channels of its nearest trained antennas on each side, which significantly reduces the complexity of the modified antenna-domain scheme.

  • Closed-form expressions of the average training length are derived for the basic and modified beam-domain interleaved training schemes with general correlated channels. For exponentially correlated channels, we further provide a simplified approximation of the average training length for the modified scheme when the number of BS antennas is large.

  • A closed-form average-training-length expression is also derived for the basic antenna-domain interleaved training with general correlated channels, and its simple approximation is given for exponentially correlated channels. For the average training length of the modified antenna-domain interleaved training, we propose a deep neural network (DNN)-based approximation to achieve fast performance evaluation.

  • Simulation results verify our derived analytical expressions and the theoretical analysis on the impact of system parameters, e.g., the channel correlation, the antenna number, and the signal-to-noise ratio (SNR) requirement, on the average training length of the basic and modified antenna/beam-domain interleaved training schemes under two typical correlated channels. In addition, simulations demonstrate that the proposed modified antenna-domain and beam-domain interleaved training schemes both outperform the basic interleaved training schemes.

The remainder of this paper is organized as follows. In Section II, we introduce the single-user massive MIMO downlink system with two typical channel correlation models and the basic antenna/beam-domain interleaved training scheme. In Section III, the modified beam-domain interleaved training is proposed along with related analytical results. In Section IV, we propose the modified antenna-domain interleaved training and conduct theoretical analysis. Simulations are provided in Section V. Section VI summarizes this work. Some proofs are included in the appendix.

Notation: Bold upper-case and bold lower-case letters denote matrices and vectors, respectively. $\mathbb{C}^{m\times n}$ denotes the $m$ by $n$ dimensional complex space. $\mathbf{I}_{n}$ denotes the $n$-dimensional identity matrix. The conjugate transpose, transpose, determinant, adjugate matrix, rank, and inverse of $\mathbf{A}$ are denoted by $\mathbf{A}^{\rm H}$, $\mathbf{A}^{\rm T}$, ${\rm det}(\mathbf{A})$, ${\rm adj}(\mathbf{A})$, ${\rm rank}(\mathbf{A})$, and $\mathbf{A}^{-1}$. The vector $\mathbf{a}_{i}$ denotes the $i$-th column of the matrix $\mathbf{A}$. For a vector $\mathbf{a}$, $a_{n}$ is the $n$-th element of $\mathbf{a}$, and $\mathbf{a}_{\mathbb{S}}$ is the sub-vector composed of the $s$-th elements of $\mathbf{a}$ for $s\in\mathbb{S}$ when the subscript is an index set $\mathbb{S}$. Similarly, $[\mathbf{A}]_{m,n}$ is the $(m,n)$-th element of $\mathbf{A}$ and $[\mathbf{A}]_{\mathbb{S},\mathbb{T}}$ is the sub-matrix composed of the $(s,t)$-th elements of $\mathbf{A}$ for $s\in\mathbb{S}$ and $t\in\mathbb{T}$. Define $[m:n]$ as the set $\{m,m+1,\dots,n\}$. $\Pr(\mathcal{A})$ represents the probability of event $\mathcal{A}$. $\lfloor\cdot\rfloor$ represents the floor function. $\|\mathbf{a}\|$ denotes the Euclidean norm of $\mathbf{a}$. ${\rm diag}(\mathbf{a})$ is the diagonal matrix whose diagonal entries are the elements of $\mathbf{a}$. $f_{X}(\cdot)$ denotes the probability density function (PDF) of a random variable (RV) $X$. $\mathcal{CN}(\boldsymbol{\mu},\boldsymbol{\Sigma})$ denotes the circularly symmetric complex Gaussian distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$. $\chi^{2}(k)$ denotes the chi-squared distribution with $k$ degrees of freedom. $\chi^{2}(k,\lambda)$ denotes the noncentral chi-squared distribution with $k$ degrees of freedom and non-centrality parameter $\lambda$. $Q_{1}(a,b)$ is the first-order Marcum Q-function. $\cong$ denotes equality in distribution.

II System Model

We consider a massive MIMO downlink system with an $M$-antenna BS and a single-antenna user equipment (UE). The downlink BS-UE channel, denoted as $\mathbf{h}$, is modeled as a circularly symmetric complex Gaussian vector following the distribution $\mathcal{CN}(\mathbf{0},\mathbf{R}_{\mathbf{h}})$, where $\mathbf{R}_{\mathbf{h}}$ is the channel covariance matrix. One typical correlation model is the one-ring correlation model [5], [26], which is expressed as

$$\mathbf{R}_{\mathbf{h}}=\int_{\Theta_{\rm min}}^{\Theta_{\rm max}}g(\theta)\boldsymbol{\alpha}(\theta)\boldsymbol{\alpha}^{\rm H}(\theta)\,d\theta, \qquad (1)$$

where $[\Theta_{\rm min},\Theta_{\rm max}]$ is the angle interval of the channel power seen at the BS, $g(\cdot)$ represents the power angle spectrum (PAS), satisfying $\int_{\Theta_{\rm min}}^{\Theta_{\rm max}}g(\theta)d\theta=1$, and $\boldsymbol{\alpha}(\theta)\in\mathbb{C}^{M\times 1}$ is the BS array response vector. For a uniform linear array (ULA), $\boldsymbol{\alpha}(\theta)=\left[1,\dots,e^{-j2\pi D\sin(\theta)(M-1)}\right]^{\rm T}$, where $D$ is the antenna spacing ratio. Another typical correlation model is the exponential one [27], i.e.,

$$[\mathbf{R}_{\mathbf{h}}]_{m,n}=\rho^{m-n},\ \forall m\geq n,\ m,n=1,\dots,M, \qquad (2)$$

where $\rho$, satisfying $r=|\rho|<1$, is the channel correlation between adjacent antennas. This is a simple single-parameter model commonly used for many communication problems, and it is also physically reasonable in the sense that the correlation decreases with increasing distance between antennas, e.g., in the ULA.
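As a concrete illustration, the following minimal Python/NumPy sketch builds $\mathbf{R}_{\mathbf{h}}$ for both models; the angle interval, PAS, and $\rho$ below are illustrative assumptions, not values from the paper, and the one-ring integral in Eq. (1) is simply discretized over an angle grid.

    import numpy as np

    M, D = 64, 0.5                                     # assumed: antennas, spacing ratio
    # One-ring model, Eq. (1): discretize the integral over [Theta_min, Theta_max].
    theta = np.linspace(-np.pi/6, np.pi/6, 512)        # assumed angle interval
    g = np.full(theta.size, 1.0/(theta[-1] - theta[0]))  # uniform PAS, integrates to 1
    steer = np.exp(-2j*np.pi*D*np.outer(np.arange(M), np.sin(theta)))  # alpha(theta) columns
    R_onering = (steer * g) @ steer.conj().T * (theta[1] - theta[0])

    # Exponential model, Eq. (2): [R]_{m,n} = rho^{m-n} for m >= n, Hermitian overall.
    rho = 0.7*np.exp(1j*0.3)                           # assumed complex correlation, |rho| < 1
    idx = np.subtract.outer(np.arange(M), np.arange(M))
    R_exp = np.where(idx >= 0, rho**idx, np.conj(rho)**(-idx))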

The downlink transmission can be represented as

$$y=\sqrt{P}\,\mathbf{h}^{\rm H}\mathbf{w}s+n, \qquad (3)$$

where $y$ is the received signal at the user, $\mathbf{w}\in\mathbb{C}^{M}$ is the antenna-domain beamformer at the BS with unit norm, i.e., $\|\mathbf{w}\|=1$, $s$ is the transmitted symbol with unit average power, $P$ is the transmit power, and $n$ is the normalized receive noise at the UE, which follows $\mathcal{CN}(0,1)$. The received SNR can be written as

$${\rm SNR}=P\left|\mathbf{h}^{\rm H}\mathbf{w}\right|^{2}. \qquad (4)$$

If beam-domain transmission is conducted, $\mathbf{w}$ can be decomposed into two parts: $\mathbf{w}=\mathbf{W}_{\rm O}\mathbf{w}_{\rm I}$, where $\mathbf{W}_{\rm O}\in\mathbb{C}^{M\times B}$ ($B\leq M$) is the external beamforming matrix, $\mathbf{w}_{\rm I}\in\mathbb{C}^{B}$ is the beam-domain beamformer, and $B$ is the number of beams. One typical $\mathbf{W}_{\rm O}$ is the normalized discrete Fourier transform (DFT) matrix $\mathbf{D}\in\mathbb{C}^{M\times M}$ with $[\mathbf{D}]_{m,n}=e^{j2\pi\frac{(m-1)(n-1)}{M}}/\sqrt{M}$. Define the $B$-dimensional beam-domain channel as $\bar{\mathbf{h}}=\mathbf{W}_{\rm O}^{\rm H}\mathbf{h}$. We have $\bar{\mathbf{h}}\sim\mathcal{CN}(\mathbf{0},\mathbf{R}_{\bar{\mathbf{h}}})$ with $\mathbf{R}_{\bar{\mathbf{h}}}=\mathbf{W}_{\rm O}^{\rm H}\mathbf{R}_{\mathbf{h}}\mathbf{W}_{\rm O}$. Eq. (4) can then be written as

$${\rm SNR}=P\left|\bar{\mathbf{h}}^{\rm H}\mathbf{w}_{\rm I}\right|^{2}. \qquad (5)$$

For a given target data transmission rate $R_{\rm th}$, an outage event occurs if $\log_{2}(1+{\rm SNR})<R_{\rm th}$, or equivalently if ${\rm SNR}<P\alpha_{\rm th}$, where $\alpha_{\rm th}=(2^{R_{\rm th}}-1)/P$ is the normalized receive SNR threshold.

II-A General Framework of Interleaved Training and the Basic Training Scheme

In this subsection, we introduce the general antenna-domain and beam-domain interleaved training and give the basic interleaved training algorithms proposed in [20, 21] as the baseline of our study. For a unified representation, we define

$$\left(\tilde{\mathbf{h}},\tilde{\mathbf{w}},L\right)=\begin{cases}\left(\mathbf{h},\mathbf{w},M\right),&\text{for antenna-domain training}\\ \left(\bar{\mathbf{h}},\mathbf{w}_{\rm I},B\right),&\text{for beam-domain training}\end{cases}. \qquad (6)$$

In the general antenna/beam-domain interleaved training scheme, the BS trains the channel of one antenna/beam in each step, and the order of the antennas/beams during the training is determined according to a predefined criterion. After the $l$-th training step, the UE knows $\tilde{\mathbf{h}}_{\mathbb{A}_{l}}$, with $\mathbb{A}_{l}$ denoting the set of indices of the already trained BS antennas/beams. (The main purpose of interleaved training is to reduce the training overhead. To focus on the theoretical analysis and provide more insights, we do not consider errors resulting from channel estimation and feedback quantization in our study.) To maximize the receive SNR based on this currently acquired CSI, the BS can conduct the following downlink beamforming:

$$\tilde{w}_{n}=\begin{cases}\frac{\tilde{h}_{n}}{\left\|\tilde{\mathbf{h}}_{\mathbb{A}_{l}}\right\|},&\text{if }n\in\mathbb{A}_{l}\\ 0,&\text{if }n\notin\mathbb{A}_{l}\end{cases}. \qquad (7)$$

The receive SNR of this beamformer is thus ${\rm SNR}=P\|\tilde{\mathbf{h}}_{\mathbb{A}_{l}}\|^{2}$. Based on whether an outage occurs, i.e., whether $\|\tilde{\mathbf{h}}_{\mathbb{A}_{l}}\|^{2}<\alpha_{\rm th}$, the UE either notifies the BS to continue training with one bit 0, or feeds back one bit 1 together with the channel values of the already trained antennas/beams to the BS for transmission beamforming.
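As a small illustration (a sketch, not the paper's implementation), the beamformer in Eq. (7) and the resulting stopping test can be written as follows; it reuses the NumPy import from the model sketch above.

    def partial_mrt(h_tilde, trained):
        # Eq. (7): match the trained entries, zero elsewhere; unit norm overall.
        w = np.zeros_like(h_tilde)
        hA = h_tilde[list(trained)]
        w[list(trained)] = hA / np.linalg.norm(hA)
        return w

    # With this w, h^H w = ||h_A||, so SNR = P*||h_A||^2 and the UE's outage
    # test after step l reduces to checking ||h_A||^2 < alpha_th.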

In the basic antenna/beam-domain interleaved training scheme shown in Algorithm 1, the channel of one BS antenna or one DFT beam is trained in each step, and the antennas/beams are trained sequentially following their original indices, i.e., after the $l$-th training step, the index set of the already trained antennas/beams is $[1:l]$.

Algorithm 1 Basic antenna/beam-domain interleaved training scheme [20, 21]
1: Initialization: $\mathbb{A}_{1}=\{1\}$; $l=1$; the BS sends a pilot for the UE to acquire $\tilde{h}_{1}$;
2: While $\|\tilde{\mathbf{h}}_{\mathbb{A}_{l}}\|^{2}<\alpha_{\rm th}$ & $l<L$ do
3:     The UE sends one bit 0 to the BS;
4:     The BS sends a pilot for the UE to acquire $\tilde{h}_{l+1}$;
5:     $l=l+1$; $\mathbb{A}_{l}=\{\mathbb{A}_{l-1},l\}$;
6: end
7: if $\|\tilde{\mathbf{h}}_{\mathbb{A}_{l}}\|^{2}\geq\alpha_{\rm th}$
8:     The UE feeds back one bit 1 and $\tilde{\mathbf{h}}_{\mathbb{A}_{l}}$ to the BS;
9:     The BS conducts downlink beamforming according to Eq. (7);
10: else
11:     The UE feeds back one bit 0 to the BS;
12: end
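A runnable sketch of Algorithm 1 for one channel realization follows (error-free per-step CSI is assumed, as in the analysis; the channel draw reuses R_exp from the model sketch above, and alpha_th is an assumed value):

    def basic_interleaved(h_tilde, alpha_th):
        # Train entries in their original index order; stop once the accumulated
        # power meets the threshold (no outage) or all L entries are trained.
        power = 0.0
        for l, hl in enumerate(h_tilde, start=1):
            power += abs(hl)**2
            if power >= alpha_th:
                return l, False            # (training length, outage flag)
        return len(h_tilde), True

    rng = np.random.default_rng(0)
    C = np.linalg.cholesky(R_exp + 1e-12*np.eye(M))   # an R^(1/2) factor
    h = C @ (rng.standard_normal(M) + 1j*rng.standard_normal(M))/np.sqrt(2)
    print(basic_interleaved(h, alpha_th=4.0))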

III Modified Beam-Domain Interleaved Training and Performance Analysis

In the basic beam-domain interleaved training scheme [21], the adopted DFT beams are not guaranteed to align with the effective propagation paths, nor are beams with stronger average power given a higher training priority. In the following, we exploit the channel covariance matrix to improve the beam-domain interleaved training by addressing the above issues. In addition, we analyze the average training length of the modified beam-domain interleaved training and compare it with that of the basic beam-domain interleaved training to reveal the advantages of the modified design. Methods for acquiring the channel covariance matrix at the BS in massive MIMO systems can be found in [28, 29, 30].

III-A Modified Beam-Domain Training Design

Recall that the channel covariance matrix $\mathbf{R}_{\mathbf{h}}$ is positive semi-definite; denote its rank as $r_{M}$. We consider the compact eigenvalue decomposition of $\mathbf{R}_{\mathbf{h}}$: $\mathbf{R}_{\mathbf{h}}=\mathbf{U}\boldsymbol{\Sigma}\mathbf{U}^{\rm H}$, where $\mathbf{U}$ is an $M\times r_{M}$ semi-unitary matrix and $\boldsymbol{\Sigma}={\rm diag}\{\delta_{1},\dots,\delta_{r_{M}}\}$ with $\delta_{1}\geq\delta_{2}\geq\cdots\geq\delta_{r_{M}}>0$. With the knowledge of $\mathbf{U}$ and $\boldsymbol{\Sigma}$, the BS can set $\mathbf{W}_{\rm O}=\mathbf{U}$, implying that $B=r_{M}$, so that $\bar{\mathbf{h}}=\mathbf{U}^{\rm H}\mathbf{h}$ is the $B$-dimensional vector of beam-domain channel coefficients. In the modified scheme, the BS trains the $B$ effective beams $\mathbf{u}_{1},\mathbf{u}_{2},\dots,\mathbf{u}_{B}$ in turn, such that the beams are trained in order of decreasing average power. After $b$ steps of beam training, the BS obtains the beam-domain channels $\bar{\mathbf{h}}_{\mathbb{A}_{b}}$, where $\mathbb{A}_{b}=\{1,\dots,b\}$, and conducts the beam-domain precoding $\mathbf{w}_{\rm I}\in\mathbb{C}^{B}$ according to Eq. (7). An outage occurs if $\|\bar{\mathbf{h}}_{\mathbb{A}_{b}}\|^{2}<\alpha_{\rm th}$. With this beam ordering, the specific process of the modified beam-domain training scheme follows Algorithm 1. The modified scheme achieves both beam alignment and beam ordering through the eigenmatrix $\mathbf{U}$ of the channel covariance matrix, which is its major difference from the basic one.

From the Toeplitz eigen-subspace approximation result in [31], the eigenvectors of the one-ring covariance matrix in Eq. (1) and those of the exponential covariance matrix in Eq. (2) can both be well approximated by the columns of a DFT matrix for $M\gg 1$. As $M$ grows to infinity, both the modified scheme and the basic scheme use the DFT codebook for beam training, and their difference then lies only in the order in which the beams are trained. In this case, since the modified scheme trains the beams in order of decreasing average power, it has a shorter average training length.
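A sketch of the modified codebook construction (reusing R_onering, h, and basic_interleaved from the earlier sketches): the BS takes $\mathbf{W}_{\rm O}=\mathbf{U}$ with beams sorted by decreasing eigenvalue, so running Algorithm 1 on $\bar{\mathbf{h}}$ trains the strongest beams first.

    eigval, eigvec = np.linalg.eigh(R_onering)        # ascending eigenvalues
    order = np.argsort(eigval)[::-1]                  # strongest beams first
    eigval, eigvec = eigval[order], eigvec[:, order]
    B = int(np.sum(eigval > 1e-10*eigval[0]))         # numerical rank r_M
    W_O = eigvec[:, :B]                               # modified codebook U
    hbar = W_O.conj().T @ h                           # beam-domain channel
    print(basic_interleaved(hbar, alpha_th=4.0))      # modified beam-domain run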

III-B Average Training Length Analysis

Since the basic beam-domain interleaved training and the modified one differ only in the training beam codebook, we first analyze the average training length of the beam-domain interleaved training scheme with an arbitrary beam codebook. Based on this, the average training length of the modified scheme and its comparison with that of the basic scheme are subsequently given.

III-B1 Analysis for the General Beam-Domain Scheme

Recall that the receive SNR after the $b$-th training step is ${\rm SNR}=P\|\bar{\mathbf{h}}_{\mathbb{A}_{b}}\|^{2}$. From Algorithm 1, the training stops after the $b$-th training step with probability $\Pr\left(|\bar{\mathbf{h}}_{\mathbb{A}_{1}}|^{2}\geq\alpha_{\rm th}\right)$ for $b=1$, and with probability $\Pr\left(\|\bar{\mathbf{h}}_{\mathbb{A}_{b}}\|^{2}\geq\alpha_{\rm th}\right)-\Pr\left(\|\bar{\mathbf{h}}_{\mathbb{A}_{b-1}}\|^{2}\geq\alpha_{\rm th}\right)$ for $b=2,\dots,B-1$. The training stops after the $B$-th training step with probability $1-\Pr\left(\|\bar{\mathbf{h}}_{\mathbb{A}_{B-1}}\|^{2}\geq\alpha_{\rm th}\right)$. The average training length of the beam-domain interleaved training scheme can thus be expressed as

$$L_{t}=\Pr\left(|\bar{\mathbf{h}}_{\mathbb{A}_{1}}|^{2}\geq\alpha_{\rm th}\right)+\sum_{b=2}^{B-1}b\left[\Pr\left(\|\bar{\mathbf{h}}_{\mathbb{A}_{b}}\|^{2}\geq\alpha_{\rm th}\right)-\Pr\left(\|\bar{\mathbf{h}}_{\mathbb{A}_{b-1}}\|^{2}\geq\alpha_{\rm th}\right)\right]+B\left[1-\Pr\left(\|\bar{\mathbf{h}}_{\mathbb{A}_{B-1}}\|^{2}\geq\alpha_{\rm th}\right)\right]=1+\sum_{b=1}^{B-1}\Pr\left(\|\bar{\mathbf{h}}_{\mathbb{A}_{b}}\|^{2}<\alpha_{\rm th}\right). \qquad (8)$$
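Since Eq. (8) only needs the per-step CDF values, it can be cross-checked by Monte Carlo; the sketch below (assumed parameters, e.g., avg_length_mc(R_exp, 4.0)) draws $\bar{\mathbf{h}}\sim\mathcal{CN}(\mathbf{0},\mathbf{R}_{\bar{\mathbf{h}}})$ and accumulates beam powers in training order.

    def avg_length_mc(R, alpha_th, n_draws=20000, seed=1):
        # Estimate L_t = 1 + sum_{b=1}^{B-1} Pr(||hbar_{A_b}||^2 < alpha_th).
        L = R.shape[0]
        rng = np.random.default_rng(seed)
        C = np.linalg.cholesky(R + 1e-12*np.eye(L))
        Z = (rng.standard_normal((L, n_draws))
             + 1j*rng.standard_normal((L, n_draws)))/np.sqrt(2)
        cum = np.cumsum(np.abs(C @ Z)**2, axis=0)   # ||hbar_{A_b}||^2, b = 1..L
        return 1 + (cum[:L-1, :] < alpha_th).mean(axis=1).sum()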
Theorem 1.

For an arbitrary beam codebook $\mathbf{W}_{\rm O}$, recall that $\mathbf{R}_{\bar{\mathbf{h}}}=\mathbf{W}_{\rm O}^{\rm H}\mathbf{R}_{\mathbf{h}}\mathbf{W}_{\rm O}$ and define $\tilde{\mathbf{R}}_{b}=\left[\mathbf{R}_{\bar{\mathbf{h}}}^{1/2}\right]_{\mathbb{A}_{b},[1:M]}$. Consider the compact eigenvalue decomposition $\tilde{\mathbf{R}}_{b}^{\rm H}\tilde{\mathbf{R}}_{b}=\mathbf{U}_{b}\boldsymbol{\Sigma}_{b}\mathbf{U}_{b}^{\rm H}$, where $\mathbf{U}_{b}$ is an $M\times r_{b}$ semi-unitary matrix, $\boldsymbol{\Sigma}_{b}={\rm diag}\{\delta_{b,1},\dots,\delta_{b,r_{b}}\}$, and $r_{b}$ is the rank of $\tilde{\mathbf{R}}_{b}^{\rm H}\tilde{\mathbf{R}}_{b}$. Suppose that there are $T_{b}$ different eigenvalues with values $\bar{\delta}_{b,t}$ and multiplicities $r_{b,t}$ for $t=1,\dots,T_{b}$. Define $\mathbf{r}_{b}=[r_{b,1},\dots,r_{b,T_{b}}]^{\rm T}$. The average training length of the general beam-domain interleaved training scheme under correlated channels can be expressed as

$$L_{t}=1+\sum_{b=1}^{B-1}\prod_{t=1}^{T_{b}}\left(\frac{1}{\bar{\delta}_{b,t}}\right)^{r_{b,t}}\sum_{k=1}^{T_{b}}\sum_{s=1}^{r_{b,k}}(-1)^{r_{b,k}-s}\bar{\delta}_{b,k}^{\,r_{b,k}-s+1}\Psi_{b,k,s,\mathbf{r}_{b}}\left[1-e^{-\frac{\alpha_{\rm th}}{\bar{\delta}_{b,k}}}\sum_{u=0}^{r_{b,k}-s}\frac{\left(\frac{\alpha_{\rm th}}{\bar{\delta}_{b,k}}\right)^{u}}{u!}\right], \qquad (9)$$

where $\Psi_{b,k,s,\mathbf{r}_{b}}=(-1)^{r_{b,k}-1}\sum_{\mathbf{i}\in\Omega_{b,k,s}}\prod_{n\neq k}\binom{i_{n}+r_{b,n}-1}{i_{n}}\left(\frac{1}{\bar{\delta}_{b,n}}-\frac{1}{\bar{\delta}_{b,k}}\right)^{-(i_{n}+r_{b,n})}$, $\mathbf{i}=[i_{1},\dots,i_{T_{b}}]^{\rm T}$, and $\Omega_{b,k,s}=\left\{[i_{1},\dots,i_{T_{b}}]\in\mathbb{Z}^{T_{b}};\ \sum_{j=1}^{T_{b}}i_{j}=s-1,\ i_{k}=0,\ i_{j}\geq 0\text{ for all }j\right\}$.

Proof.

Recall that $\bar{\mathbf{h}}=\mathbf{W}_{\rm O}^{\rm H}\mathbf{h}\sim\mathcal{CN}(\mathbf{0},\mathbf{R}_{\bar{\mathbf{h}}})$. We have $\bar{\mathbf{h}}_{\mathbb{A}_{b}}\cong\tilde{\mathbf{R}}_{b}\mathbf{h}_{\rm iid}$ with $\mathbf{h}_{\rm iid}\sim\mathcal{CN}(\mathbf{0},\mathbf{I}_{M})$, and $\|\bar{\mathbf{h}}_{\mathbb{A}_{b}}\|^{2}=\mathbf{h}_{\rm iid}^{\rm H}\mathbf{U}_{b}\boldsymbol{\Sigma}_{b}\mathbf{U}_{b}^{\rm H}\mathbf{h}_{\rm iid}\cong\tilde{\mathbf{h}}_{\rm iid}^{b,{\rm H}}\boldsymbol{\Sigma}_{b}\tilde{\mathbf{h}}_{\rm iid}^{b}$ with $\tilde{\mathbf{h}}_{\rm iid}^{b}\sim\mathcal{CN}(\mathbf{0},\mathbf{I}_{r_{b}})$. Therefore,

$$\|\bar{\mathbf{h}}_{\mathbb{A}_{b}}\|^{2}=\sum_{j=1}^{r_{b}}\delta_{b,j}\left|\tilde{h}_{{\rm iid},j}^{b}\right|^{2}\cong\sum_{t=1}^{T_{b}}\frac{1}{2}\bar{\delta}_{b,t}Q_{b,t}, \qquad (10)$$

where $Q_{b,t}\sim\chi^{2}(2r_{b,t})$. By using the results in [32] on the sum of independent chi-square random variables, the PDF of $\|\bar{\mathbf{h}}_{\mathbb{A}_{b}}\|^{2}$ is

$$f\left(\|\bar{\mathbf{h}}_{\mathbb{A}_{b}}\|^{2}=x;\mathbf{r}_{b},\bar{\delta}_{b,1},\dots,\bar{\delta}_{b,T_{b}}\right)=\prod_{t=1}^{T_{b}}\left(\frac{1}{\bar{\delta}_{b,t}}\right)^{r_{b,t}}\sum_{k=1}^{T_{b}}\sum_{s=1}^{r_{b,k}}\frac{\Psi_{b,k,s,\mathbf{r}_{b}}}{(r_{b,k}-s)!}(-x)^{r_{b,k}-s}e^{-\frac{x}{\bar{\delta}_{b,k}}}. \qquad (11)$$

Therefore, we have

$$\Pr\left(\|\bar{\mathbf{h}}_{\mathbb{A}_{b}}\|^{2}<\alpha_{\rm th}\right)=\prod_{t=1}^{T_{b}}\left(\frac{1}{\bar{\delta}_{b,t}}\right)^{r_{b,t}}\sum_{k=1}^{T_{b}}\sum_{s=1}^{r_{b,k}}\frac{\Psi_{b,k,s,\mathbf{r}_{b}}}{(r_{b,k}-s)!}\int_{0}^{\alpha_{\rm th}}(-x)^{r_{b,k}-s}e^{-\frac{x}{\bar{\delta}_{b,k}}}dx=\prod_{t=1}^{T_{b}}\left(\frac{1}{\bar{\delta}_{b,t}}\right)^{r_{b,t}}\sum_{k=1}^{T_{b}}\sum_{s=1}^{r_{b,k}}(-1)^{r_{b,k}-s}\bar{\delta}_{b,k}^{\,r_{b,k}-s+1}\Psi_{b,k,s,\mathbf{r}_{b}}\left[1-e^{-\frac{\alpha_{\rm th}}{\bar{\delta}_{b,k}}}\sum_{u=0}^{r_{b,k}-s}\frac{\left(\frac{\alpha_{\rm th}}{\bar{\delta}_{b,k}}\right)^{u}}{u!}\right]. \qquad (12)$$

Substituting Eq. (12) into Eq. (8) yields Eq. (9). ∎

III-B2 Analysis for the Modified Beam-Domain Scheme

The average training length of the modified beam-domain interleaved training scheme follows from Theorem 1 by setting the beam codebook as $\mathbf{W}_{\rm O}=\mathbf{U}$. Therefore, $\mathbf{R}_{\bar{\mathbf{h}}}=\mathbf{U}^{\rm H}\mathbf{R}_{\mathbf{h}}\mathbf{U}=\boldsymbol{\Sigma}$ and $\tilde{\mathbf{R}}_{b}=\left[\boldsymbol{\Sigma}^{\frac{1}{2}}\right]_{[1:b],[1:M]}$, which leads to $\delta_{b,j}=\delta_{j}$ for $b=1,\dots,B$, $j=1,\dots,b$. It is noteworthy that since the modified scheme uses the eigenvectors of the channel covariance matrix as the beam codebook, $\delta_{b,j}$ is independent of the training step index $b$. Suppose that among the first $b$ eigenvalues $\delta_{j}$, $j=1,\dots,b$, of $\mathbf{R}_{\mathbf{h}}$ there are $\bar{T}_{b}$ different values $\bar{\delta}_{b,t}$ with multiplicities $\bar{r}_{b,t}$ for $t=1,\dots,\bar{T}_{b}$. Define $\bar{\mathbf{r}}_{b}=[\bar{r}_{b,1},\dots,\bar{r}_{b,\bar{T}_{b}}]^{\rm T}$.

Corollary 1.

The average training length of the modified beam-domain interleaved training scheme can be expressed as

$$L_{t}=1+\sum_{b=1}^{B-1}\prod_{t=1}^{\bar{T}_{b}}\left(\frac{1}{\bar{\delta}_{b,t}}\right)^{\bar{r}_{b,t}}\sum_{k=1}^{\bar{T}_{b}}\sum_{s=1}^{\bar{r}_{b,k}}(-1)^{\bar{r}_{b,k}-s}\bar{\delta}_{b,k}^{\,\bar{r}_{b,k}-s+1}\Psi_{b,k,s,\bar{\mathbf{r}}_{b}}\left[1-e^{-\frac{\alpha_{\rm th}}{\bar{\delta}_{b,k}}}\sum_{u=0}^{\bar{r}_{b,k}-s}\frac{\left(\frac{\alpha_{\rm th}}{\bar{\delta}_{b,k}}\right)^{u}}{u!}\right], \qquad (13)$$

where $\Psi_{b,k,s,\bar{\mathbf{r}}_{b}}=(-1)^{\bar{r}_{b,k}-1}\sum_{\mathbf{i}\in\Omega_{b,k,s}}\prod_{n\neq k}\binom{i_{n}+\bar{r}_{b,n}-1}{i_{n}}\left(\frac{1}{\bar{\delta}_{b,n}}-\frac{1}{\bar{\delta}_{b,k}}\right)^{-(i_{n}+\bar{r}_{b,n})}$, $\mathbf{i}=[i_{1},\dots,i_{\bar{T}_{b}}]^{\rm T}$, and $\Omega_{b,k,s}=\left\{[i_{1},\dots,i_{\bar{T}_{b}}]\in\mathbb{Z}^{\bar{T}_{b}};\ \sum_{j=1}^{\bar{T}_{b}}i_{j}=s-1,\ i_{k}=0,\ i_{j}\geq 0\text{ for all }j\right\}$.

Proof.

The result can be directly obtained from Theorem 1. ∎

Corollary 2.

For channels with the exponential covariance matrix in Eq. (2) and $0\leq r<1$, a large-$M$ approximation of the average training length for the modified beam-domain interleaved training scheme can be written as

$$L_{t}=1+\sum_{m=1}^{M-1}\sum_{j=1}^{m}l_{j}(0)\left(1-e^{-\frac{\alpha_{\rm th}}{\delta_{j}}}\right), \qquad (14)$$

where $l_{j}(0)=\prod_{k=1,k\neq j}^{m}\frac{\delta_{j}}{\delta_{j}-\delta_{k}}$ and

$$\delta_{j}\approx\frac{1-r^{2}}{1+r^{2}+2r\cos\left(\frac{(M+r)(M+1-j)\pi}{M(M+1)}\right)} \qquad (15)$$

for $j=1,\dots,M$.

Proof.

For the exponential covariance matrix $\mathbf{R}_{\mathbf{h}}$, its eigenvalues $\delta_{j}$ for $M\gg 1$ can be approximated as in Eq. (15) by following [33, Eq. (51)]. According to the monotonicity of $\cos(x)$ for $0<x<\pi$, we have $\delta_{1}>\delta_{2}>\cdots>\delta_{M}>0$, and thus $B=r_{M}=M$, $\bar{T}_{b}=b$, $\bar{\delta}_{b,t}=\delta_{t}$, and $\bar{r}_{b,t}=1$ for $t=1,\dots,\bar{T}_{b}$. Substituting these into Eq. (13) in Corollary 1 yields Eq. (14). ∎

The results in Eq. (9), Eq. (13) and Eq. (14) are in closed-form and can be used to evaluate the average training length for different system parameter values.
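As an example of such an evaluation, the following sketch implements Eq. (14) with the eigenvalue approximation of Eq. (15) (real $r$ assumed, so all eigenvalues are distinct and the inner sum is the CDF of a weighted sum of exponentials); the parameter values are illustrative assumptions.

    def modified_beam_length(M, r, alpha_th):
        j = np.arange(1, M + 1)
        delta = (1 - r**2)/(1 + r**2 + 2*r*np.cos((M + r)*(M + 1 - j)*np.pi
                                                  /(M*(M + 1))))      # Eq. (15)
        Lt = 1.0
        for m in range(1, M):                  # b = 1, ..., M-1 trained beams
            d = delta[:m]
            for k in range(m):
                lj0 = np.prod(d[k]/(d[k] - np.delete(d, k)))          # l_j(0)
                Lt += lj0*(1 - np.exp(-alpha_th/d[k]))
        return Lt

    print(modified_beam_length(M=32, r=0.6, alpha_th=4.0))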

In the following, we discuss the impact of the antenna number MM on the average training length of the modified beam-domain interleaved training.

Corollary 3.

For the modified beam-domain interleaved training scheme, when the use of all beams can avoid an outage, the average training length $L_{t}$ is a non-increasing function of the antenna number $M$, in both one-ring correlated channels with non-zero angular spread (AS) and exponentially correlated channels.

Proof.

Denote the eigenvalues of the channel covariance matrix $\mathbf{R}_{\mathbf{h}}$ in descending order as $\lambda_{1},\lambda_{2},\dots,\lambda_{r_{M}}$ and $\Lambda_{1},\Lambda_{2},\dots,\Lambda_{r_{M+1}}$ for the number of antennas being $M$ and $M+1$, respectively. Since the channel covariance matrix for $M$ BS antennas is a submatrix of that for $M+1$ BS antennas, we have either $r_{M+1}=r_{M}+1$, e.g., for channels with the full-rank exponential covariance matrix in Eq. (2), or $r_{M+1}=r_{M}$. From Eq. (8) and Eq. (10), the average training length can be expressed as $L_{t}=1+\sum_{b=1}^{r_{M}-1}\Pr\left(\sum_{j=1}^{b}|h_{{\rm iid},j}|^{2}\delta_{j}<\alpha_{\rm th}\right)$. The difference between the average training lengths for systems with $M$ and $M+1$ BS antennas in the case of $r_{M+1}=r_{M}+1$ is then

$$L_{t}(M+1)-L_{t}(M)=\Delta+\Pr\left(\sum_{j=1}^{r_{M}}|h_{{\rm iid},j}|^{2}\Lambda_{j}<\alpha_{\rm th}\right), \qquad (16)$$

where $\Delta=\sum_{b=1}^{r_{M}-1}\left[\Pr\left(\sum_{j=1}^{b}|h_{{\rm iid},j}|^{2}\Lambda_{j}<\alpha_{\rm th}\right)-\Pr\left(\sum_{j=1}^{b}|h_{{\rm iid},j}|^{2}\lambda_{j}<\alpha_{\rm th}\right)\right]$. From the Eigenvalue Interlacing Theorem [34], we have $\Lambda_{r_{M+1}}\leq\lambda_{r_{M}}\leq\Lambda_{r_{M}}\leq\lambda_{r_{M}-1}\leq\Lambda_{r_{M}-1}\leq\dots\leq\lambda_{2}\leq\Lambda_{2}\leq\lambda_{1}\leq\Lambda_{1}$, and thus $\Delta\leq 0$. The condition that the use of all beams can meet the transmission requirement gives $\Pr\left(\sum_{j=1}^{r_{M}}|h_{{\rm iid},j}|^{2}\Lambda_{j}<\alpha_{\rm th}\right)=0$. Therefore, $L_{t}$ is non-increasing with increasing $M$ under this condition. For the case of $r_{M+1}=r_{M}$, the conclusion still stands since $L_{t}(M+1)-L_{t}(M)=\Delta$. ∎

Remark 1.

Numerical evaluation of Eq. (13) shows that the average training length $L_{t}$ increases with $M$ for small $M$, while for large $M$, $L_{t}$ decreases with $M$ and converges to a constant value. This is because when $M$ is small, $\Pr\left(\sum_{j=1}^{r_{M}}|h_{{\rm iid},j}|^{2}\Lambda_{j}<\alpha_{\rm th}\right)$ is the dominant term in $L_{t}(M+1)-L_{t}(M)$ in Eq. (16), and it has a positive value. When $M$ is large, $\Pr\left(\sum_{j=1}^{r_{M}}|h_{{\rm iid},j}|^{2}\Lambda_{j}<\alpha_{\rm th}\right)\rightarrow 0$; therefore, as shown in Corollary 3, $L_{t}$ decreases with $M$.

Next, we discuss the effect of the channel correlation on the average training length of the modified beam-domain interleaved training scheme for channels with the exponential covariance matrix. Numerical evaluation of Eq. (14) shows that for relatively small $\alpha_{\rm th}$, higher channel correlation helps reduce the average training length of the modified beam-domain interleaved training. However, as $\alpha_{\rm th}$ continues to increase, an increase in the channel correlation may have the opposite effect. From the derivative of the eigenvalues in Eq. (15), i.e., $\delta_{j}=\frac{1-r^{2}}{1+r^{2}+2r\cos\left(\frac{(M+r)(M+1-j)\pi}{M(M+1)}\right)}$, $j=1,\dots,M$, with respect to $r$, the larger eigenvalues increase while the smaller eigenvalues decrease as $r$ increases. For very large $\alpha_{\rm th}$, $\Pr\left(\sum_{j=1}^{b}|h_{{\rm iid},j}|^{2}\delta_{j}<\alpha_{\rm th}\right)\approx 1$ for $b<r_{M}-1$, and $\Pr\left(\sum_{j=1}^{r_{M}-1}|h_{{\rm iid},j}|^{2}\delta_{j}<\alpha_{\rm th}\right)$ has the greatest impact on $L_{t}$. In this case, a smaller $r$ results in a flatter eigenvalue distribution, which yields a lower $\Pr\left(\sum_{j=1}^{r_{M}-1}|h_{{\rm iid},j}|^{2}\delta_{j}<\alpha_{\rm th}\right)$ and a shorter $L_{t}$. For small enough $\alpha_{\rm th}$, $\Pr\left(\sum_{j=1}^{b}|h_{{\rm iid},j}|^{2}\delta_{j}<\alpha_{\rm th}\right)\approx 0$ for $2<b\leq r_{M}-1$, and $\Pr\left(|h_{{\rm iid},1}|^{2}\delta_{1}<\alpha_{\rm th}\right)$ has the greatest impact on $L_{t}$. In this case, a larger $r$ results in a higher $\delta_{1}$ and a shorter $L_{t}$.

IV Modified Antenna-Domain Interleaved Training and Performance Analysis

In this section, we first discuss the impact of channel correlation on the average training length of the basic antenna-domain interleaved training. We then derive the conditional distribution of the channels of untrained BS antennas given the channel values of the already trained BS antennas during the interleaved training process, based on which we propose the modified antenna-domain interleaved training design.

IV-A Average Training Length Analysis and Impact of Channel Correlation

In the following, we give closed-form expressions of the average training length of the basic antenna-domain interleaved training scheme under general correlated channels and exponentially correlated channels respectively.

Compared to Theorem 1, the only difference in the derivation of the average training length of the basic antenna-domain interleaved training is that the covariance matrix of the trained channels after $m$ training steps is $\tilde{\mathbf{R}}_{m}=\left[\mathbf{R}_{\mathbf{h}}^{\frac{1}{2}}\right]_{[1:m],[1:M]}$ and the vector of the trained channels can be represented as $\mathbf{h}_{\mathbb{A}_{m}}\cong\tilde{\mathbf{R}}_{m}\mathbf{h}_{\rm iid}$. Consider the compact eigenvalue decomposition $\tilde{\mathbf{R}}_{m}^{\rm H}\tilde{\mathbf{R}}_{m}=\mathbf{U}_{m}\boldsymbol{\Sigma}_{m}\mathbf{U}_{m}^{\rm H}$, where $\boldsymbol{\Sigma}_{m}={\rm diag}\{\delta_{m,1},\dots,\delta_{m,r_{m}}\}$ and $r_{m}$ is the rank of $\tilde{\mathbf{R}}_{m}^{\rm H}\tilde{\mathbf{R}}_{m}$. Then we have $\|\mathbf{h}_{\mathbb{A}_{m}}\|^{2}\cong\tilde{\mathbf{h}}_{\rm iid}^{m,{\rm H}}\boldsymbol{\Sigma}_{m}\tilde{\mathbf{h}}_{\rm iid}^{m}$ with $\tilde{\mathbf{h}}_{\rm iid}^{m}\sim\mathcal{CN}(\mathbf{0},\mathbf{I}_{r_{m}})$. Suppose that there are $T_{m}$ different eigenvalues with values $\bar{\delta}_{m,t}$ and multiplicities $r_{m,t}$ for $t=1,\dots,T_{m}$. Define $\mathbf{r}_{m}=[r_{m,1},\dots,r_{m,T_{m}}]^{\rm T}$.

Theorem 2.

The average training length of the basic antenna-domain interleaved training scheme under general correlated channels can be expressed as

$$L_{t}=1+\sum_{m=1}^{M-1}\prod_{t=1}^{T_{m}}\left(\frac{1}{\bar{\delta}_{m,t}}\right)^{r_{m,t}}\sum_{k=1}^{T_{m}}\sum_{s=1}^{r_{m,k}}(-1)^{r_{m,k}-s}\bar{\delta}_{m,k}^{\,r_{m,k}-s+1}\Psi_{m,k,s,\mathbf{r}_{m}}\left[1-e^{-\frac{\alpha_{\rm th}}{\bar{\delta}_{m,k}}}\sum_{u=0}^{r_{m,k}-s}\frac{\left(\frac{\alpha_{\rm th}}{\bar{\delta}_{m,k}}\right)^{u}}{u!}\right], \qquad (17)$$

where $\Psi_{m,k,s,\mathbf{r}_{m}}=(-1)^{r_{m,k}-1}\sum_{\mathbf{i}\in\Omega_{m,k,s}}\prod_{n\neq k}\binom{i_{n}+r_{m,n}-1}{i_{n}}\left(\frac{1}{\bar{\delta}_{m,n}}-\frac{1}{\bar{\delta}_{m,k}}\right)^{-(i_{n}+r_{m,n})}$, $\mathbf{i}=[i_{1},\dots,i_{T_{m}}]^{\rm T}$, and $\Omega_{m,k,s}=\left\{[i_{1},\dots,i_{T_{m}}]\in\mathbb{Z}^{T_{m}};\ \sum_{j=1}^{T_{m}}i_{j}=s-1,\ i_{k}=0,\ i_{j}\geq 0\text{ for all }j\right\}$.

Proof.

Please refer to the proof of Theorem 1. ∎

To analyze the impact of channel correlation on the average training length, we consider two extreme cases: the i.i.d. channels with $\delta_{m,i}=1$, $\forall m=1,\dots,M$, $i=1,\dots,m$, and the fully correlated channels with $\delta_{m,1}=m$ and $\delta_{m,i}=0$, $\forall i=2,\dots,m$, for $m=1,\dots,M$.

For the i.i.d. channels, we have $r_{m}=m$, $T_{m}=1$, $\bar{\delta}_{m,1}=1$, $r_{m,1}=m$, and $\Psi_{m,k,s,\mathbf{r}_{m}}=(-1)^{m-1}$ for $s=1$ and $\Psi_{m,k,s,\mathbf{r}_{m}}=0$ for $s=2,\dots,m$. Substituting these into Eq. (17) in Theorem 2, we obtain

$$L_{t}^{\rm(i.i.d.)}=1+\sum_{m=1}^{M-1}\left(1-e^{-\alpha_{\rm th}}\sum_{i=0}^{m-1}\frac{\alpha_{\rm th}^{i}}{i!}\right). \qquad (18)$$

According to the result in [20, Theorem 2], we have $L_{t}^{\rm(i.i.d.)}\leq 1+\alpha_{\rm th}$ for $M\rightarrow\infty$.

For the fully correlated channels, we have $r_{m}=1$, $T_{m}=1$, $\bar{\delta}_{m,1}=m$, $r_{m,1}=1$, and $\Psi_{m,k,s,\mathbf{r}_{m}}=1$ for $s=1$ and $\Psi_{m,k,s,\mathbf{r}_{m}}=0$ for $s=2,\dots,m$. Substituting these into Eq. (17) in Theorem 2, we obtain

$$L_{t}^{\rm(FC)}=1+\sum_{m=1}^{M-1}\left(1-e^{-\frac{\alpha_{\rm th}}{m}}\right). \qquad (19)$$

The behavior of $L_{t}^{\rm(FC)}$ for $M\rightarrow\infty$ is given in the following corollary.

Corollary 4.

For $M\rightarrow\infty$, we have $1+\alpha_{\rm th}\gamma-\frac{\pi^{2}}{12}\alpha_{\rm th}^{2}\leq L_{t}^{\rm(FC)}-\alpha_{\rm th}\ln M\leq 1+\alpha_{\rm th}\gamma$, where $\gamma\approx 0.5772$ is Euler's constant.

Proof.

Define $G(x)=x+e^{-x}$. For $m>0$, we have $1-e^{-\frac{\alpha_{\rm th}}{m}}-\frac{\alpha_{\rm th}}{m}=-\left(G\left(\frac{\alpha_{\rm th}}{m}\right)-G(0)\right)=-\int_{0}^{\frac{\alpha_{\rm th}}{m}}G^{\prime}(u)du=-\int_{0}^{\frac{\alpha_{\rm th}}{m}}\left(1-e^{-u}\right)du\geq-\int_{0}^{\frac{\alpha_{\rm th}}{m}}u\,du=-\frac{\alpha_{\rm th}^{2}}{2m^{2}}$. Meanwhile, we have $1-e^{-\frac{\alpha_{\rm th}}{m}}-\frac{\alpha_{\rm th}}{m}\leq 0$. Therefore, $-\sum_{m=1}^{M-1}\frac{\alpha_{\rm th}^{2}}{2m^{2}}\leq\sum_{m=1}^{M-1}\left(1-e^{-\frac{\alpha_{\rm th}}{m}}\right)-\sum_{m=1}^{M-1}\frac{\alpha_{\rm th}}{m}\leq 0$. Since $\sum_{m=1}^{\infty}\frac{1}{m^{2}}=\frac{\pi^{2}}{6}$ and $\gamma=\lim_{M\to\infty}\left(\sum_{m=1}^{M}\frac{1}{m}-\ln M\right)\approx 0.5772$, we have $1+\alpha_{\rm th}\gamma-\frac{\pi^{2}}{12}\alpha_{\rm th}^{2}\leq L_{t}^{\rm(FC)}-\alpha_{\rm th}\ln M\leq 1+\alpha_{\rm th}\gamma$ for $M\rightarrow\infty$. ∎
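The two closed forms and the Corollary 4 bounds are easy to check numerically; since the inner sum in Eq. (18) is an Erlang CDF, a Poisson-CDF form keeps the evaluation stable for large $M$ (a sketch with assumed values):

    from scipy.stats import poisson

    def Lt_iid(M, a):      # Eq. (18): Erlang(m) CDF = 1 - PoissonCDF(m-1; a)
        m = np.arange(1, M)
        return 1 + np.sum(1 - poisson.cdf(m - 1, a))

    def Lt_fc(M, a):       # Eq. (19)
        m = np.arange(1, M)
        return 1 + np.sum(1 - np.exp(-a/m))

    M_, a = 4096, 4.0      # assumed values
    print(Lt_iid(M_, a), 1 + a)               # approaches the 1 + alpha_th bound
    print(Lt_fc(M_, a) - a*np.log(M_))        # lies within the Corollary 4 bounds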

Remark 2.

When $M$ increases asymptotically, under independent channels, the average training length $L_{t}^{\rm(i.i.d.)}$ of the basic scheme has the upper bound $1+\alpha_{\rm th}$, while under fully correlated channels, the average training length $L_{t}^{\rm(FC)}$ of the basic scheme grows proportionally to $\ln M$, which shows the negative effect of channel correlation on the average training length of the basic training scheme. Further, numerical calculations based on Eqs. (18) and (19) show that when $\alpha_{\rm th}$ is small compared to $M$, the basic scheme has a shorter average training length in the i.i.d. channels, which mainly benefits from the higher antenna diversity gain. Once $\alpha_{\rm th}$ becomes comparable to $M$, the basic scheme has a shorter average training length in the fully correlated channels, where the consistency of antenna energy is more important than the diversity gain.

Corollary 5.

For channels with the exponential covariance matrix in Eq. (2) and almost all values of $0<r<1$, the average training length of the basic antenna-domain interleaved training can be expressed as

$$L_{t}=2-e^{-\alpha_{\rm th}}+\sum_{m=2}^{M-1}\sum_{j=1}^{m}l_{m,j}(0)\left(1-e^{-\frac{\alpha_{\rm th}}{\delta_{m,j}}}\right), \qquad (20)$$

where $l_{m,j}(0)=\prod_{k=1,k\neq j}^{m}\frac{\delta_{m,j}}{\delta_{m,j}-\delta_{m,k}}$ and

$$\delta_{m,j}\approx\begin{cases}1-2r\cos\left(\frac{j\pi}{m+1}\right),&\text{if }0<r\ll 1\\ \frac{1-r}{2}\sec^{2}\left(\frac{j\pi}{2m}\right),&\text{if }0<1-r\ll 1\ \&\ r\neq 1-\frac{6m}{3\sec^{2}\left(\frac{j\pi}{2m}\right)+2(m^{2}-1)}\\ \Phi(r,m,j),&\text{else }\&\ m-\sum_{i=1}^{m-1}\Phi(r,m,i)\neq\Phi(r,m,j)\end{cases} \qquad (21)$$

for $j=1,\dots,m-1$, and

$$\delta_{m,m}\approx\begin{cases}1-2r\cos\left(\frac{m\pi}{m+1}\right),&\text{if }0<r\ll 1\\ m-\frac{(m^{2}-1)(1-r)}{3},&\text{if }0<1-r\ll 1\\ m-\sum_{i=1}^{m-1}\Phi(r,m,i),&\text{else}\end{cases} \qquad (22)$$

where $\Phi(r,m,i)\triangleq\frac{1-r^{2}}{1+r^{2}+2r^{2}\cos\left(\frac{i\pi}{m}\right)+2r(1-r)\cos\left(\frac{i\pi}{m+1}\right)}$.

Proof.

For the exponential covariance matrix, the approximations of $\delta_{m,j}$, $j=1,\dots,m$, can be written as Eq. (21) and Eq. (22) according to [33, Eq. (35), Eq. (43a-b), Eq. (49a-b)]. From the monotonicity of $\cos(x)$ in $0<x<\pi$ and that of $\sec^{2}(x)$ in $0<x<\frac{\pi}{2}$, we have that for $0<r\ll 1$, the $\delta_{m,j}$, $j=1,\dots,m$, are distinct, while for $0<1-r\ll 1$, the $\delta_{m,j}$, $j=1,\dots,m-1$, are distinct, and $\delta_{m,m}$ is different from $\delta_{m,j}$, $j=1,\dots,m-1$, for $r\neq 1-\frac{6m}{3\sec^{2}\left(\frac{j\pi}{2m}\right)+2(m^{2}-1)}$. For intermediate $r$ values, the $\delta_{m,j}$, $j=1,\dots,m-1$, are likewise distinct due to the monotonicity of $\cos(x)$ in $0<x<\pi$, and $\delta_{m,m}$ is different from $\delta_{m,j}$, $j=1,\dots,m-1$, for $m-\sum_{i=1}^{m-1}\Phi(r,m,i)\neq\Phi(r,m,j)$. Therefore, we have $r_{m}=m$, $T_{m}=m$, $\bar{\delta}_{m,t}=\delta_{m,t}$, and $r_{m,t}=1$ for $t=1,\dots,T_{m}$. Substituting these into Eq. (17) in Theorem 2 yields Eq. (20). ∎

For almost all values of $0<r<1$, Eq. (20) thus allows a faster evaluation and analysis of the average training length of the basic antenna-domain interleaved training scheme than Eq. (17). For the large $r$ values satisfying $r=1-\frac{6m}{3\sec^{2}\left(\frac{j\pi}{2m}\right)+2(m^{2}-1)}$ for some $j=1,\dots,m-1$, or the intermediate $r$ values satisfying $m-\sum_{i=1}^{m-1}\Phi(r,m,i)=\Phi(r,m,j)$ for some $j=1,\dots,m-1$, the training length can still be calculated according to Eq. (17).
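A quick sanity check (a sketch assuming a real $r$ and the intermediate-$r$ branch of Eqs. (21)-(22)) against a direct eigendecomposition of the exponential covariance matrix:

    def phi(r, m, i):      # the Phi(r, m, i) function from Corollary 5
        return (1 - r**2)/(1 + r**2 + 2*r**2*np.cos(i*np.pi/m)
                           + 2*r*(1 - r)*np.cos(i*np.pi/(m + 1)))

    r, m = 0.5, 16         # assumed values
    i = np.arange(1, m)
    approx = np.sort(np.append(phi(r, m, i), m - phi(r, m, i).sum()))
    Rm = r**np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
    exact = np.sort(np.linalg.eigvalsh(Rm))
    print(np.max(np.abs(approx - exact)))    # approximation error (expected small)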

IV-B Derivations on the Conditional PDF of the Untrained Channels

On the one hand, evaluation of Eq. (17) shows that channel correlation increases the average training length of the basic antenna-domain interleaved training under general rate requirements. On the other hand, if channel correlation exists and the system knows it a priori, this correlation can be used to improve the training efficiency. Specifically, the conditional PDF of the untrained channels can be derived given the values of the already trained channels, and based on this conditional PDF, the choice of the BS antenna for the next training step can be optimized. In this subsection, we derive this conditional PDF.

Lemma 1.

Given the channel values of the already trained BS antennas, $h_{m}$, $m\in\mathbb{A}=\{a_{1},a_{2},\dots,a_{|\mathbb{A}|}\}$, the conditional PDF of an untrained channel $h_{n}|\mathbf{h}_{\mathbb{A}}$, $n\in\mathbb{M}-\mathbb{A}$, follows $\mathcal{CN}(\bar{\mu}_{n},\bar{\sigma}_{n}^{2})$, where

$$\bar{\mu}_{n}=\left[[\mathbf{R}_{\mathbf{h}}]_{n,a_{1}},[\mathbf{R}_{\mathbf{h}}]_{n,a_{2}},\dots,[\mathbf{R}_{\mathbf{h}}]_{n,a_{|\mathbb{A}|}}\right]\mathbf{R}_{\mathbf{h}_{\mathbb{A}}}^{-1}\mathbf{h}_{\mathbb{A}}, \qquad (23)$$

$$\bar{\sigma}_{n}^{2}=1-\left[[\mathbf{R}_{\mathbf{h}}]_{n,a_{1}},[\mathbf{R}_{\mathbf{h}}]_{n,a_{2}},\dots,[\mathbf{R}_{\mathbf{h}}]_{n,a_{|\mathbb{A}|}}\right]\mathbf{R}_{\mathbf{h}_{\mathbb{A}}}^{-1}\left[[\mathbf{R}_{\mathbf{h}}]_{a_{1},n},[\mathbf{R}_{\mathbf{h}}]_{a_{2},n},\dots,[\mathbf{R}_{\mathbf{h}}]_{a_{|\mathbb{A}|},n}\right]^{\rm T}, \qquad (24)$$

and $\left[\mathbf{R}_{\mathbf{h}_{\mathbb{A}}}\right]_{i,j}=[\mathbf{R}_{\mathbf{h}}]_{a_{i},a_{j}}$, $i,j=1,\dots,|\mathbb{A}|$. The conditional cumulative distribution function (CDF) of the power of the untrained channel $h_{n}$, $n\in\mathbb{M}-\mathbb{A}$, is

$$\Pr\left(|h_{n}|^{2}\leq x\,|\,\mathbf{h}_{\mathbb{A}}\right)=1-Q_{1}\left(\sqrt{2}\frac{|\bar{\mu}_{n}|}{\bar{\sigma}_{n}},\sqrt{2}\frac{\sqrt{x}}{\bar{\sigma}_{n}}\right). \qquad (25)$$
Proof.

$\mathbf{R}_{\mathbf{h}_{\mathbb{A}}}$ is the covariance matrix of the vector of the trained channels $\mathbf{h}_{\mathbb{A}}$, which is a submatrix of the overall channel covariance matrix $\mathbf{R}_{\mathbf{h}}$. Recall that $\mathbf{h}$ is a circularly symmetric complex Gaussian vector. Then, from [35, Eq. (32)], the conditional mean in Eq. (23) and the conditional variance in Eq. (24) can be obtained. The CDF in Eq. (25) follows from properties of the noncentral chi-squared distribution: since $h_{n}|\mathbf{h}_{\mathbb{A}}\sim\mathcal{CN}(\bar{\mu}_{n},\bar{\sigma}_{n}^{2})$, we have $\frac{\sqrt{2}}{\bar{\sigma}_{n}}h_{n}|\mathbf{h}_{\mathbb{A}}\sim\mathcal{CN}\left(\sqrt{2}\frac{\bar{\mu}_{n}}{\bar{\sigma}_{n}},2\right)$. Then, $\left|\frac{\sqrt{2}}{\bar{\sigma}_{n}}h_{n}\right|^{2}|\mathbf{h}_{\mathbb{A}}\sim\chi^{2}\left(2,2\frac{|\bar{\mu}_{n}|^{2}}{\bar{\sigma}_{n}^{2}}\right)$ and the conditional CDF is

$$F_{\left|\frac{\sqrt{2}}{\bar{\sigma}_{n}}h_{n}\right|^{2}|\mathbf{h}_{\mathbb{A}}}(x)=1-Q_{1}\left(\sqrt{2}\frac{|\bar{\mu}_{n}|}{\bar{\sigma}_{n}},\sqrt{x}\right). \qquad (26)$$

Therefore, Eq. (25) can be obtained. ∎
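A sketch of Lemma 1 in code (the conditional statistics of Eqs. (23)-(24), and Eq. (25) with the Marcum Q-function expressed through SciPy's noncentral chi-squared CDF):

    from scipy.stats import ncx2

    def conditional_stats(R, trained, h_trained, n):
        # Eqs. (23)-(24): condition h_n on the trained entries h_A.
        A = list(trained)
        RA = R[np.ix_(A, A)]                  # covariance of trained channels
        cross = R[n, A]                       # [R]_{n,a_1}, ..., [R]_{n,a_|A|}
        mu = cross @ np.linalg.solve(RA, h_trained)
        var = 1 - np.real(cross @ np.linalg.solve(RA, R[A, n]))
        return mu, var

    def cond_power_cdf(x, mu, var):
        # Eq. (25): Pr(|h_n|^2 <= x | h_A) = 1 - Q_1(...) = noncentral chi^2 CDF
        return ncx2.cdf(2*x/var, df=2, nc=2*abs(mu)**2/var)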

Recall that $\mathbb{A}=\{a_{1},\dots,a_{|\mathbb{A}|}\}$ denotes the set of indices of the already trained BS antennas; for simplicity of presentation, we assume that $a_{1}<a_{2}<\cdots<a_{|\mathbb{A}|}$. If the index of an untrained BS antenna $n$ satisfies $a_{1}<n<a_{|\mathbb{A}|}$, we denote the index of its nearest trained BS antenna with a smaller index as $a_{x^{\star}}$, that is, $x^{\star}=\arg\min_{i:\,a_{i}\in\mathbb{A},a_{i}<n}\ n-a_{i}$; then $a_{x^{\star}+1}$ is the index of the trained BS antenna nearest to antenna $n$ with a larger index. Define $x_{1}=n-a_{x^{\star}}$ and $x_{2}=a_{x^{\star}+1}-n$.

Corollary 6.

Under the exponential correlation model, we have

$$\bar{\mu}_{n}=\begin{cases}\left(\rho^{*}\right)^{a_{1}-n}h_{a_{1}},&\text{if }n<a_{1}\\ \frac{\rho^{x_{1}}\left(1-r^{2x_{2}}\right)h_{a_{x^{\star}}}+\left(\rho^{*}\right)^{x_{2}}\left(1-r^{2x_{1}}\right)h_{a_{x^{\star}+1}}}{1-r^{2(x_{1}+x_{2})}},&\text{if }a_{1}<n<a_{|\mathbb{A}|}\\ \rho^{n-a_{|\mathbb{A}|}}h_{a_{|\mathbb{A}|}},&\text{if }n>a_{|\mathbb{A}|}\end{cases}, \qquad (27)$$

and

$$\bar{\sigma}_{n}^{2}=\begin{cases}1-r^{2(a_{1}-n)},&\text{if }n<a_{1}\\ \frac{\left(1-r^{2x_{1}}\right)\left(1-r^{2x_{2}}\right)}{1-r^{2(x_{1}+x_{2})}},&\text{if }a_{1}<n<a_{|\mathbb{A}|}\\ 1-r^{2(n-a_{|\mathbb{A}|})},&\text{if }n>a_{|\mathbb{A}|}\end{cases}. \qquad (28)$$

Hence, the conditional distribution of the channel of BS antenna $n\in\mathbb{M}-\mathbb{A}$ depends only on the channel values of the nearest trained BS antennas on each side, $a_{x^{\star}}$ and $a_{x^{\star}+1}$.

Proof.

See Appendix A. ∎

The results in Corollary 6 can help significantly reduce the computational complexity for the conditional CDF of the untrained channel power for scenarios with an exponential correlation model.
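For instance, for a real $\rho=r$ one can verify numerically that the two-neighbor expressions of Eqs. (27)-(28) coincide with the full conditioning of Lemma 1 (conditional_stats is from the sketch above; all parameter values are assumptions):

    r_, M_ = 0.8, 16
    R = r_**np.abs(np.subtract.outer(np.arange(M_), np.arange(M_)))
    A = [3, 8, 12]                        # trained antenna indices (0-based), sorted
    rng = np.random.default_rng(2)
    CA = np.linalg.cholesky(R[np.ix_(A, A)])
    hA = CA @ (rng.standard_normal(len(A))
               + 1j*rng.standard_normal(len(A)))/np.sqrt(2)
    n, x1, x2 = 10, 10 - 8, 12 - 10       # a_{x*} = 8, a_{x*+1} = 12
    mu = (r_**x1*(1 - r_**(2*x2))*hA[1]
          + r_**x2*(1 - r_**(2*x1))*hA[2])/(1 - r_**(2*(x1 + x2)))   # Eq. (27)
    var = (1 - r_**(2*x1))*(1 - r_**(2*x2))/(1 - r_**(2*(x1 + x2)))  # Eq. (28)
    mu_full, var_full = conditional_stats(R, A, hA, n)
    print(np.allclose([mu, var], [mu_full, var_full]))   # True: neighbors suffice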

IV-C Modified Antenna-Domain Interleaved Training Scheme

Based on the conditional PDF of the untrained channels in Lemma 1, we propose a modified antenna-domain interleaved training scheme where at the beginning of each training step, the BS antenna whose channel is to be trained is optimally selected. The basic idea is to use the channel values of the already trained antennas to calculate the probability of meeting the transmission requirement if each untrained BS antenna is selected. Then the antenna with the highest probability is chosen.

For the selection of the first antenna $n_0$ to be trained, the same approach cannot be used since no channel values have been obtained yet. Instead, we use the conditional variance in Eq. (24). Under the assumption that all antennas have the same average power, the first antenna to be trained can be the one that minimizes the overall conditional variance of the other antennas, i.e., $n_0=\arg\min_{m}\sum_{n=1,n\neq m}^{M}\bar{\sigma}^2_n$. It can be seen from Eq. (28) that $n_0=\left\lfloor\frac{M+1}{2}\right\rfloor$ under the exponential correlation model.
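A few lines suffice to confirm this choice of $n_0$ numerically; the sketch below assumes the exponential model, so that by Eq. (28) the conditional variance of antenna $n$ given only antenna $m$ is $1-r^{2|n-m|}$ (the values of $M$ and $r$ are ours).

```python
# Sketch: the middle antenna minimizes the total conditional variance of
# the remaining antennas under the exponential correlation model.
import numpy as np

M, r = 32, 0.8
n_idx = np.arange(1, M + 1)
total_var = [np.sum(1 - r ** (2 * np.abs(np.delete(n_idx, m - 1) - m)))
             for m in n_idx]
n0 = n_idx[int(np.argmin(total_var))]
print(n0, (M + 1) // 2)               # both give the middle antenna, 16
```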

Recall that $\mathbb{A}_m$ and $\mathbf{h}_{\mathbb{A}_m}$ denote, respectively, the set of indices of the $m$ BS antennas whose channels have been trained and the channel vector obtained for these antennas after $m$ training steps. At the beginning of the $(m+1)$-th training step, based on the already obtained channel vector $\mathbf{h}_{\mathbb{A}_m}$, the BS selects the antenna to be trained in the $(m+1)$-th step as follows. For each untrained BS antenna $n\in\mathbb{M}-\mathbb{A}_m$, the BS calculates the probability that acquiring the channel of antenna $n$ in the $(m+1)$-th training step meets the transmission requirement:

$$\Pr\left(\left|h_n\right|^2\geq\alpha_{\rm th}-\left\|\mathbf{h}_{\mathbb{A}_m}\right\|^2\right)=Q_1\left(\sqrt{2}\frac{\left|\bar{\mu}_n\right|}{\bar{\sigma}_n},\sqrt{2}\frac{\sqrt{\alpha_{\rm th}-\left\|\mathbf{h}_{\mathbb{A}_m}\right\|^2}}{\bar{\sigma}_n}\right). \qquad (29)$$

Then, the BS selects the antenna with the highest probability among all untrained antennas, i.e., the index of the BS antenna for the $(m+1)$-th training step is

$$n^\star=\arg\max_{n\in\mathbb{M}-\mathbb{A}_m}\Pr\left(\left|h_n\right|^2\geq\alpha_{\rm th}-\left\|\mathbf{h}_{\mathbb{A}_m}\right\|^2\right). \qquad (30)$$

The proposed modified antenna-domain interleaved training scheme is summarized in Algorithm 2; the major difference from the basic scheme is the antenna selection in Step 4.

Algorithm 2 Modified Antenna-Domain Interleaved Training Scheme
1: Initialization: $n^\star=n_0$; $\mathbb{A}_1=\{n^\star\}$; $m=1$; the BS sends a pilot for the UE to acquire $h_{n^\star}$;
2: while $\left\|\mathbf{h}_{\mathbb{A}_m}\right\|^2<\alpha_{\rm th}$ and $m<M$ do
3:     The UE feeds back one bit 0 and $h_{n^\star}$ to the BS;
4:     The BS calculates the probability value for each $n\in\mathbb{M}-\mathbb{A}_m$ according to Eq. (29) and then decides the index of the next training antenna $n^\star$ according to Eq. (30);
5:     The BS sends a pilot for the UE to acquire $h_{n^\star}$;
6:     $m=m+1$; $\mathbb{A}_m=\mathbb{A}_{m-1}\cup\{n^\star\}$;
7: end while
8: if $\left\|\mathbf{h}_{\mathbb{A}_m}\right\|^2\geq\alpha_{\rm th}$ then
9:     The UE feeds back one bit 1 and $h_{n^\star}$ to the BS;
10:    The BS conducts downlink precoding according to Eq. (7);
11: else
12:    The UE feeds back one bit 0 to the BS;
13: end if
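To make the control flow concrete, below is a minimal single-realization sketch of Algorithm 2 in Python, assuming the exponential correlation model and noiseless feedback of $h_{n^\star}$; the helper names cond_stats and marcum_q1 are ours, and $Q_1$ is evaluated through the noncentral chi-squared survival function. Averaging the printed training length over many channel realizations gives a Monte Carlo estimate of $L_t$ in Eq. (31).

```python
# Sketch of Algorithm 2 for one channel realization (assumed parameters).
import numpy as np
from scipy.stats import ncx2

def cond_stats(R, A, hA, n):
    """Conditional mean/variance of h_n given h_A (form of Eqs. (23)-(24))."""
    c = R[np.ix_([n], A)]
    w = c @ np.linalg.inv(R[np.ix_(A, A)])
    return (w @ hA).item(), (1 - (w @ c.conj().T)).item().real

def marcum_q1(a, b):
    # Q_1(a, b) = survival function of chi^2(2, a^2) at b^2
    return ncx2.sf(b**2, df=2, nc=a**2)

M, rho, alpha_th = 32, 0.8, 13.97
p, q = np.meshgrid(np.arange(M), np.arange(M), indexing="ij")
R = np.where(p >= q, (rho + 0j) ** (p - q), np.conj(rho + 0j) ** (q - p))
L = np.linalg.cholesky(R + 1e-12 * np.eye(M))
rng = np.random.default_rng(2)
h = L @ (np.sqrt(0.5) * (rng.standard_normal(M) + 1j * rng.standard_normal(M)))

A, n_star = [], (M + 1) // 2 - 1       # 0-based middle antenna as n_0
while True:
    A.append(n_star)                   # BS trains antenna n_star
    hA = h[A]
    if np.sum(np.abs(hA) ** 2) >= alpha_th or len(A) == M:
        break
    gap = alpha_th - np.sum(np.abs(hA) ** 2)
    probs = {}
    for n in set(range(M)) - set(A):   # Eq. (29) for each untrained antenna
        mu, var = cond_stats(R, A, hA, n)
        probs[n] = marcum_q1(np.sqrt(2) * abs(mu) / np.sqrt(var),
                             np.sqrt(2 * gap / var))
    n_star = max(probs, key=probs.get) # Eq. (30)
print("training length:", len(A))
```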

IV-C1 Complexity Analysis

The complexity of Algorithm 2 is dominated by Step 4 in the loop from Step 2 to Step 7. Denote the training length for a random channel realization as $N$ with $1\leq N\leq M$. For the $(m+1)$-th training step, where $m<N$, the number of operations needed for calculating $\mathbf{R}^{-1}_{\mathbf{h}_{\mathbb{A}}}$ in Eq. (23) and Eq. (24) scales as $m^3$, and the number of operations needed for the remaining matrix multiplications in Eq. (23) and Eq. (24) scales as $(M-m)m^2$. Therefore, the complexity of calculating the conditional means and variances for a training process with $N$ steps is $O(N^4+MN^3)$, and an upper bound on the complexity of Algorithm 2 is $O(M^4)$ since $N\leq M$. For channels with the exponential covariance matrix, the conditional mean and variance can be calculated according to Eq. (27) and Eq. (28) without matrix inversion or matrix multiplication, so an upper bound on the algorithm complexity is $O(M^2)$.

IV-C2 Average Training Length

Similar to the derivation of Eq. (8), the average training length of the modified antenna-domain interleaved training scheme can be expressed as

$$L_t=1+\sum_{m=1}^{M-1}\Pr\left(\left\|\mathbf{h}_{\mathbb{A}_m}\right\|^2<\alpha_{\rm th}\right). \qquad (31)$$

To derive an analytical or even closed-form expression of $L_t$ in Eq. (31), the key is to calculate $\Pr\left(\left\|\mathbf{h}_{\mathbb{A}_m}\right\|^2<\alpha_{\rm th}\right)$. From Step 4 of Algorithm 2 and Eq. (29), one must first calculate the conditional mean $\bar{\mu}_n$ and conditional variance $\bar{\sigma}^2_n$ for $n\in\mathbb{M}-\mathbb{A}_m$ based on both $\mathbb{A}_m$ and $\mathbf{h}_{\mathbb{A}_m}$ to decide the antenna index $n^\star$ for the $(m+1)$-th training step. This makes the derivation of the PDF of $\left\|\mathbf{h}_{\mathbb{A}_{m+1}}\right\|^2$ challenging because $\mathbb{A}_m$ changes with each channel realization. In addition, the $Q_1(a,b)$ function involves a two-fold infinite series summation, resulting in an implicit relationship between $n^\star$ and $\mathbb{A}_m$, $\mathbf{h}_{\mathbb{A}_m}$. These facts make the derivation of an analytical expression of $L_t$ in Eq. (31) intractable.

To circumvent the above difficulties, we introduce a deep neural network (DNN) $L_t=f\left(M,\mathbf{R}_{\mathbf{h}},\alpha_{\rm th};\boldsymbol{\Theta}\right)$, with $\boldsymbol{\Theta}$ being the network parameter matrix, to model $L_t$ as a function of the system parameters, e.g., the BS antenna number $M$, the channel covariance matrix $\mathbf{R}_{\mathbf{h}}$, and the normalized SNR threshold $\alpha_{\rm th}=\left(2^{R_{\rm th}}-1\right)/P$. For channels with the exponential covariance matrix, the correlation coefficient $\rho$ can replace the input parameter $\mathbf{R}_{\mathbf{h}}$. The simulation results in Section V show that $f$ can be well-fitted by a DNN model with fully connected hidden layers. This deep learning-based approximation of the average training length provides a much faster performance evaluation of the modified antenna-domain interleaved training scheme than Monte Carlo simulation.

V Simulation and Discussion

In this section, simulation results are shown for the proposed modified beam-domain and antenna-domain interleaved training schemes and their comparison with existing baseline schemes. The exponential correlation model in Eq. (2) is considered in Sections V-A to V-D. The one-ring correlation model in Eq. (1) is considered in Section V-E.

V-A Beam-Domain Interleaved Training Under the Exponential Correlation Model

Fig. 1 shows the average training lengths of the basic and modified beam-domain interleaved training schemes under the exponential correlation model, including the simulation values, the theoretical values in Eq. (9) of Theorem 1 and Eq. (13) of Corollary 1, and the approximate values in Eq. (14) of Corollary 2. We can see from Fig. 1a that the simulated and theoretical curves match well for different scenarios. The approximate curves for $M=32,64$ and $\rho=0.8$ in Fig. 1b show some gap from the simulation curves, while the gap for $M=64$ is relatively small. This is because Eq. (15) is a large-$M$ eigenvalue approximation. These observations verify our derivations in Section III-B.

Figure 1: Average training length of beam-domain interleaved training scheme. (a) Basic interleaved training scheme; (b) modified interleaved training scheme.

V-B Comparison of Basic and Modified Beam-Domain Interleaved Training Under the Exponential Correlation Model

Figure 2: Average training length of beam-domain interleaved training with different antenna numbers $M$. (a) Correlation coefficient $\rho=0.8$; (b) correlation coefficient $\rho=0.4$.

Fig. 2 shows the average training lengths of the basic and modified beam-domain interleaved training schemes with different antenna numbers $M$ for $\rho=0.8$ and $0.4$. The modified scheme outperforms the basic scheme in average training length for all three combinations of $R_{\rm th}$ and $P$: 1) $R_{\rm th}=5$ bit/s/Hz, $P=0$ dB and $\alpha_{\rm th}=31$; 2) $R_{\rm th}=4$ bit/s/Hz, $P=-2$ dB and $\alpha_{\rm th}=23.77$; 3) $R_{\rm th}=3$ bit/s/Hz, $P=-3$ dB and $\alpha_{\rm th}=13.97$. The advantage becomes larger as $\alpha_{\rm th}$ increases from $13.97$ to $31$, showing that the modified beam-domain scheme exhibits a greater performance advantage under more stringent transmission requirements. In addition, the average training lengths of both schemes first increase and then decrease as $M$ increases, and for large enough $M$ the training length levels off. These results are consistent with the description in Corollary 3 and Remark 1.
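For reference, these three operating points follow directly from $\alpha_{\rm th}=\left(2^{R_{\rm th}}-1\right)/P$ with $P$ converted from dB to linear scale, which a few lines reproduce:

```python
# Sanity check of the thresholds alpha_th = (2^Rth - 1)/P used in Fig. 2.
for Rth, P_dB in [(5, 0), (4, -2), (3, -3)]:
    P = 10 ** (P_dB / 10)                            # dB -> linear scale
    print(Rth, P_dB, round((2 ** Rth - 1) / P, 2))   # 31, 23.77, 13.97
```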

Figure 3: Average training length of beam-domain interleaved training with different channel correlation levels.

Fig. 3 shows the average training lengths of the basic and modified beam-domain interleaved training schemes with different channel correlation levels for $M=32$, $R_{\rm th}=3$ bit/s/Hz and $P=-6,-4,0$ dB, for which $\alpha_{\rm th}=27.87,17.58,7$, respectively. The basic scheme uses the DFT codebook. The shorter average training length of the modified scheme can also be observed in the figure, especially for relatively low transmit power, e.g., $P=-4,-6$ dB. With increasing $\rho$, the performance advantage of the modified scheme over the basic scheme enlarges for $P=0$ dB, while for $P=-4,-6$ dB it first increases for $\rho\leq 0.7$ and then decreases for $\rho>0.7$.

V-C Antenna-Domain Interleaved Training Under the Exponential Correlation Model

Figure 4: Average training length of antenna-domain interleaved training scheme. (a) Basic interleaved training scheme; (b) modified interleaved training scheme.

Fig. 4 shows the average training lengths of the basic and modified antenna-domain interleaved training schemes under the exponential correlation model for $M=32$, $R_{\rm th}=2,3$ bit/s/Hz and $\rho=0$ (i.e., i.i.d. channels), $0.4$, and $0.8$. Fig. 4a shows the simulation values, the theoretical values provided by Corollary 2, i.e., Eq. (9), and the approximate values of the basic scheme in Eq. (20) of Corollary 5. All three curves match well for different scenarios, which verifies the results in Section IV-A. For $R_{\rm th}=2$ bit/s/Hz, the average training length of the basic scheme increases when $\rho$ grows from $0.4$ to $0.8$ for $P>-9$ dB, i.e., $\alpha_{\rm th}<23.83$. For smaller $P$ and larger $\alpha_{\rm th}$, the increase of $\rho$, on the contrary, leads to a decrease of the average training length for the basic scheme.

In Fig. 4b, we use a fully connected DNN containing four hidden layers (with 4, 8, 16 and 32 ReLU neurons, respectively) to provide an approximate average training length of the modified antenna-domain interleaved training scheme. The DNN model is trained with the trainlm function based on the Levenberg-Marquardt algorithm, which has the fastest convergence speed for medium-sized DNNs; the loss is the mean-square error (MSE). The dataset has 3173 samples of $\langle\rho,\alpha_{\rm th},L_t\rangle$, split into training, validation, and test sets with ratio 0.7:0.15:0.15. As shown in the figure, the designed DNN fits the training-overhead function well, and its prediction of $L_t$ for unseen combinations of $\rho$, $\alpha_{\rm th}$ and $P$ matches the simulation results.
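A hedged sketch of such a surrogate is given below. The paper trains the 4/8/16/32 ReLU network with MATLAB's trainlm; scikit-learn offers no Levenberg-Marquardt solver, so 'lbfgs' is used here as a second-order stand-in, and the targets are synthetic placeholders rather than the 3173 simulated samples.

```python
# Sketch: fully connected surrogate L_t = f(rho, alpha_th; Theta).
# Placeholder data only; replace with <rho, alpha_th, L_t> samples from
# Monte Carlo runs of Algorithm 2 in practice.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
rho = rng.uniform(0.0, 0.95, 2000)
alpha_th = rng.uniform(1.0, 31.0, 2000)
L_t = 1 + alpha_th * (1 + rho)        # synthetic placeholder target

X = np.column_stack([rho, alpha_th])
X_tr, X_te, y_tr, y_te = train_test_split(X, L_t, test_size=0.3,
                                          random_state=0)
model = MLPRegressor(hidden_layer_sizes=(4, 8, 16, 32), activation="relu",
                     solver="lbfgs", max_iter=5000)
model.fit(X_tr, y_tr)
print("test MSE:", np.mean((model.predict(X_te) - y_te) ** 2))
```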

V-D Comparison of Basic and Modified Antenna-Domain Interleaved Training Under the Exponential Correlation Model

Figure 5: Average training length of antenna-domain interleaved training with different BS antenna numbers $M$. (a) Correlation coefficient $\rho=0.8$; (b) correlation coefficient $\rho=0.4$.

Fig. 5 shows the average training lengths of the basic and modified antenna-domain interleaved training schemes with different BS antenna numbers $M$ for $\rho=0.8$ and $0.4$. The modified scheme outperforms the basic scheme in average training length for all three combinations of $R_{\rm th}$ and $P$: 1) $R_{\rm th}=5$ bit/s/Hz, $P=0$ dB and $\alpha_{\rm th}=31$; 2) $R_{\rm th}=4$ bit/s/Hz, $P=-2$ dB and $\alpha_{\rm th}=23.77$; 3) $R_{\rm th}=3$ bit/s/Hz, $P=-3$ dB and $\alpha_{\rm th}=13.97$; and the advantage becomes larger as $\alpha_{\rm th}$ increases from $13.97$ to $31$. In addition, with increasing $M$, the average training length of the basic scheme increases and converges, with faster convergence and a smaller limit for $\rho=0.4$ than for $\rho=0.8$. However, the average training length of the modified scheme first increases, then decreases, and finally levels off with increasing $M$. This is because as $M$ increases, more untrained antennas are available after each interleaved training step, which increases the diversity of the untrained antennas' conditional distributions. Furthermore, the performance advantage of the modified scheme over the basic scheme first increases and then levels off as $M$ increases.

Figure 6: Average training length of antenna-domain interleaved training with different channel correlation levels.

Figure 7: Average training length of beam-domain interleaved training in one-ring correlated channels. (a) Channel AS $\sigma_A=5^{\circ}$; (b) channel AS $\sigma_A=10^{\circ}$; (c) channel AS $\sigma_A=20^{\circ}$.

Fig. 6 shows the average training lengths of the basic and modified antenna-domain interleaved training schemes with different channel correlation levels for $M=32$, $R_{\rm th}=3$ bit/s/Hz and $P=-6,-4,0$ dB. The modified scheme has a shorter average training length than the basic scheme. As the correlation coefficient $\rho$ increases, the average training length of the basic scheme increases for $P=0$ dB and $-4$ dB, but decreases for $P=-6$ dB. In contrast, with increasing $\rho$, the average training length of the modified scheme for $P=0,-4,-6$ dB first decreases for $\rho\leq 0.8$ and then increases for $\rho>0.8$.

Figure 8: Average training length of antenna-domain interleaved training scheme in one-ring correlated channels. (a) Channel AS $\sigma_A=5^{\circ}$; (b) channel AS $\sigma_A=10^{\circ}$; (c) channel AS $\sigma_A=20^{\circ}$.

V-E Antenna and Beam-Domain Interleaved Training Under One-Ring Correlated Channels

In this section, simulations demonstrate the applicability of part of the analytical results to the one-ring correlation model in Eq. (1). The uniform PAS model [36], i.e., $f\left(\theta\right)=\frac{1}{2\Delta\theta}$ for $\bar{\theta}-\Delta\theta\leq\theta\leq\bar{\theta}+\Delta\theta$, is considered, where $\bar{\theta}$ denotes the mean angle of departure (AoD) and the angular spread (AS) is $\sigma_A=\frac{\Delta\theta}{\sqrt{3}}$.
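For reference, the one-ring covariance entries under this uniform PAS can be generated by direct numerical integration. The sketch below assumes the common ULA form $[\mathbf{R}_{\mathbf{h}}]_{p,q}=\frac{1}{2\Delta\theta}\int_{\bar{\theta}-\Delta\theta}^{\bar{\theta}+\Delta\theta}e^{-j2\pi D(p-q)\sin\theta}\,{\rm d}\theta$; the exact form and sign convention of Eq. (1) may differ.

```python
# Sketch: one-ring covariance matrix under a uniform PAS (assumed form).
import numpy as np
from scipy.integrate import quad

M, D = 32, 0.5                         # antennas, spacing in wavelengths
theta_bar = np.deg2rad(45.0)           # mean AoD
sigma_A = np.deg2rad(5.0)              # angular spread
dtheta = np.sqrt(3) * sigma_A          # since sigma_A = dtheta / sqrt(3)

def R_entry(p, q):
    f = lambda t: np.exp(-2j * np.pi * D * (p - q) * np.sin(t)) / (2 * dtheta)
    re, _ = quad(lambda t: f(t).real, theta_bar - dtheta, theta_bar + dtheta)
    im, _ = quad(lambda t: f(t).imag, theta_bar - dtheta, theta_bar + dtheta)
    return re + 1j * im

R = np.array([[R_entry(p, q) for q in range(M)] for p in range(M)])
print(np.allclose(np.diag(R), 1.0))    # unit-power diagonal
```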

Fig. 7 shows the average training lengths of the basic and modified beam-domain interleaved training schemes with different transmit powers $P\in[-5,5]$ dB under the one-ring correlated channel model for $M=32$, $R_{\rm th}=3$ bit/s/Hz, $D=0.5$, $\bar{\theta}=45^{\circ}$ and $\sigma_A=5^{\circ},10^{\circ},20^{\circ}$. The theoretical values of the average training length in Eq. (9) of Theorem 1 and Eq. (13) of Corollary 1 match the simulation values well, and the modified scheme has an obvious performance advantage over the basic scheme for all three ASs. With decreasing AS, i.e., increasing channel correlation, this performance advantage enlarges for relatively high transmit power, e.g., $P=5$ dB, while it shrinks for low transmit power, e.g., $P=-5$ dB.

Fig. 8 shows the average training lengths of the basic and modified antenna-domain interleaved training schemes with different transmit powers $P\in[-5,5]$ dB under the one-ring correlated channel model for the same $M$, $R_{\rm th}$, $D$, $\bar{\theta}$ and $\sigma_A$ settings. The modified antenna-domain scheme also outperforms the basic scheme for all three ASs, and the theoretical average training length in Eq. (9) matches the simulation values well.

VI Conclusion

In this paper, channel spatial correlation was exploited to improve interleaved training for the single-user massive MIMO downlink. By optimizing the beam training codebook and the antenna training order based on the channel correlation, we proposed modified beam-domain and antenna-domain interleaved training schemes, respectively. For exponentially correlated channels, the conditional distribution of an untrained BS antenna's channel given the channel values of the already trained BS antennas was shown to depend only on the channels of its nearest trained antennas on either side, which significantly reduces the complexity of the modified antenna-domain scheme. Exact and approximate closed-form expressions of the average training length were derived for the basic and modified beam/antenna-domain schemes in correlated channels, and the impact of system parameters, e.g., the channel correlation, the antenna number, and the SNR requirement, was explicitly revealed. Simulations verified our derivations and demonstrated the performance advantage of the proposed modified schemes.

In addition to spatial correlation, channel temporal correlation can be exploited to improve the channel acquisition efficiency in massive MIMO systems. Unlike spatial correlation, temporal correlation is subject to causality constraints in the time dimension: only historical training results can be used for extrapolation, and these historical results are incomplete due to the nature of the interleaved scheme. How to exploit temporal correlation in interleaved training is worth further study.

Appendix A

A-A Proof of Corollary 6

Define $\mathbf{m}=\left[\rho^{n-a_1},\ldots,\rho^{n-a_{x^\star}},\left(\rho^*\right)^{a_{x^\star+1}-n},\ldots,\left(\rho^*\right)^{a_{|\mathbb{A}|}-n}\right]\mathbf{R}^{-1}_{\mathbf{h}_{\mathbb{A}}}=\left[m_1,\ldots,m_{|\mathbb{A}|}\right]$, and denote $\mathbf{R}_{\mathbf{h}_{\mathbb{A}}}$ as $\mathbf{R}$ for simplicity of presentation. Then we have $m_i=\frac{\det\left(\mathbf{R}_i\right)}{\det\left(\mathbf{R}\right)}$, $i\in\mathbb{I}=\{1,\ldots,|\mathbb{A}|\}$, where $\mathbf{R}_i$ is $\mathbf{R}$ with its $i$-th row replaced by $\left[\rho^{n-a_1},\ldots,\rho^{n-a_{x^\star}},\left(\rho^*\right)^{a_{x^\star+1}-n},\ldots,\left(\rho^*\right)^{a_{|\mathbb{A}|}-n}\right]$. Recall that $\mathbf{R}^{-1}=\frac{{\rm adj}\left(\mathbf{R}\right)}{\det\left(\mathbf{R}\right)}$, where ${\rm adj}\left(\mathbf{R}\right)$ is the adjugate matrix of $\mathbf{R}$, i.e., $\left[{\rm adj}\left(\mathbf{R}\right)\right]_{u,v}=R_{v,u}$, $\forall u,v=1,\ldots,|\mathbb{A}|$, with $R_{u,v}$ being the algebraic cofactor of $[\mathbf{R}]_{u,v}$. Therefore, we have $m_i=\frac{\sum_{j=1}^{x^\star}\rho^{n-a_j}R_{i,j}+\sum_{j=x^\star+1}^{|\mathbb{A}|}\left(\rho^*\right)^{a_j-n}R_{i,j}}{\det\left(\mathbf{R}\right)}=\frac{\det\left(\mathbf{R}_i\right)}{\det\left(\mathbf{R}\right)}$.

Here we prove that $m_{x^\star}$ and $m_{x^\star+1}$ are the only two non-zero elements of $\mathbf{m}$, or equivalently, that $\det\left(\mathbf{R}_i\right)=0$ for $i\in\mathbb{I}-\{x^\star,x^\star+1\}$ and $\det\left(\mathbf{R}_i\right)\neq 0$ for $i\in\{x^\star,x^\star+1\}$. For the first part, it suffices to show that the row vectors of $\mathbf{R}_i$ are linearly dependent when $i\in\mathbb{I}-\{x^\star,x^\star+1\}$, and we show this by construction. Let $c_{x^\star}=\rho^{n-a_{x^\star}}\frac{1-\left(\rho^*\rho\right)^{a_{x^\star+1}-n}}{1-\left(\rho^*\rho\right)^{a_{x^\star+1}-a_{x^\star}}}$, $c_{x^\star+1}=\left(\rho^*\right)^{a_{x^\star+1}-n}\frac{1-\left(\rho^*\rho\right)^{n-a_{x^\star}}}{1-\left(\rho^*\rho\right)^{a_{x^\star+1}-a_{x^\star}}}$, $c_i=-1$, and $c_j=0$ for $j\notin\{x^\star,x^\star+1,i\}$. Then straightforward calculations give $\sum_{j=1}^{|\mathbb{A}|}c_j[\mathbf{R}_i]_{j,[1:|\mathbb{A}|]}=c_{x^\star}[\mathbf{R}_i]_{x^\star,[1:|\mathbb{A}|]}+c_{x^\star+1}[\mathbf{R}_i]_{x^\star+1,[1:|\mathbb{A}|]}-[\mathbf{R}_i]_{i,[1:|\mathbb{A}|]}=\mathbf{0}$.

Next we prove that $m_i\neq 0$ for $i\in\{x^\star,x^\star+1\}$. Define $\mathbf{A}_j=\left[\mathbf{R}\right]_{[j:|\mathbb{A}|],\{1\}\cup[(1+j):|\mathbb{A}|]}$ for $j\in\{1,\ldots,|\mathbb{A}|-1\}$. Via splitting the $(1,2)$-th element of $\mathbf{A}_j$, i.e., the $(j,j+1)$-th element of $\mathbf{R}$, $\left(\rho^*\right)^{a_{j+1}-a_j}$, into $\left(\left(\rho^*\right)^{a_{j+1}-a_j}-\rho^{a_j-a_{j+1}}\right)+\rho^{a_j-a_{j+1}}$, we can split $\det\left(\mathbf{A}_j\right)$ into the sum of two determinants and obtain the recurrence formula via expanding the first determinant in Eq. (32) along the second column, i.e.,

$$\det\left(\mathbf{A}_j\right)=\left|\begin{matrix}\rho^{a_j-a_1}&\left(\rho^*\right)^{a_{j+1}-a_j}-\rho^{a_j-a_{j+1}}&\cdots&\left(\rho^*\right)^{a_{|\mathbb{A}|}-a_j}\\ \rho^{a_{j+1}-a_1}&0&\cdots&\left(\rho^*\right)^{a_{|\mathbb{A}|}-a_{j+1}}\\ \vdots&\vdots&\ddots&\vdots\\ \rho^{a_{|\mathbb{A}|}-a_1}&0&\cdots&1\end{matrix}\right|+\left|\begin{matrix}\rho^{a_j-a_1}&\rho^{a_j-a_{j+1}}&\cdots&\left(\rho^*\right)^{a_{|\mathbb{A}|}-a_j}\\ \rho^{a_{j+1}-a_1}&1&\cdots&\left(\rho^*\right)^{a_{|\mathbb{A}|}-a_{j+1}}\\ \vdots&\vdots&\ddots&\vdots\\ \rho^{a_{|\mathbb{A}|}-a_1}&\rho^{a_{|\mathbb{A}|}-a_{j+1}}&\cdots&1\end{matrix}\right|=-\left(\left(\rho^*\right)^{a_{j+1}-a_j}-\rho^{a_j-a_{j+1}}\right)\det\left(\mathbf{A}_{j+1}\right). \qquad (32)$$

Note that the second determinant in Eq. (32) is zero since the first column of its matrix is $\rho^{a_{j+1}-a_1}$ times the second column. Then we calculate $\det\left(\mathbf{R}\right)=\det\left(\mathbf{A}_1\right)$ as follows:

$$\det\left(\mathbf{R}\right)=\left(-1\right)^{|\mathbb{A}|-2}\det\left(\mathbf{A}_{|\mathbb{A}|-1}\right)\prod_{j=1}^{|\mathbb{A}|-2}\left[\left(\rho^*\right)^{a_{j+1}-a_j}-\rho^{a_j-a_{j+1}}\right]=\prod_{j=1}^{|\mathbb{A}|-1}\left[1-\left(\rho^*\rho\right)^{a_{j+1}-a_j}\right]. \qquad (33)$$

A similar procedure can be used to calculate $\det\left(\mathbf{R}_{x^\star}\right)$. The difference is that the $(x^\star,x^\star+1)$-th element $\left(\rho^*\right)^{a_{x^\star+1}-n}$ in $\mathbf{R}_{x^\star}$ is split into $\left(\left(\rho^*\right)^{a_{x^\star+1}-n}-\rho^{n-a_{x^\star+1}}\right)+\rho^{n-a_{x^\star+1}}$. Then we can obtain

$$m_{x^\star}=\frac{\det\left(\mathbf{R}_{x^\star}\right)}{\det\left(\mathbf{R}\right)}=\frac{\left(\rho^*\right)^{a_{x^\star+1}-n}-\rho^{n-a_{x^\star+1}}}{\left(\rho^*\right)^{a_{x^\star+1}-a_{x^\star}}-\rho^{a_{x^\star}-a_{x^\star+1}}}=\rho^{n-a_{x^\star}}\frac{1-\left(\rho^*\rho\right)^{a_{x^\star+1}-n}}{1-\left(\rho^*\rho\right)^{a_{x^\star+1}-a_{x^\star}}}\neq 0, \qquad (34)$$

for $0<\left|\rho\right|<1$. $\det\left(\mathbf{R}\right)$ can also be calculated by splitting the $(j+1,j)$-th element $\rho^{a_{j+1}-a_j}$ into $\left(\rho^{a_{j+1}-a_j}-\left(\rho^*\right)^{a_j-a_{j+1}}\right)+\left(\rho^*\right)^{a_j-a_{j+1}}$, and then splitting the determinant along the $(j+1)$-th row into the sum of two determinants while leaving the other rows unchanged. We calculate $\det\left(\mathbf{R}_{x^\star+1}\right)$ in the same way, the difference being that the $(x^\star+1,x^\star)$-th element $\rho^{n-a_{x^\star}}$ in $\mathbf{R}_{x^\star+1}$ is split into $\left(\rho^{n-a_{x^\star}}-\left(\rho^*\right)^{a_{x^\star}-n}\right)+\left(\rho^*\right)^{a_{x^\star}-n}$. Then we obtain

$$m_{x^\star+1}=\frac{\det\left(\mathbf{R}_{x^\star+1}\right)}{\det\left(\mathbf{R}\right)}=\frac{\rho^{n-a_{x^\star}}-\left(\rho^*\right)^{a_{x^\star}-n}}{\rho^{a_{x^\star+1}-a_{x^\star}}-\left(\rho^*\right)^{a_{x^\star}-a_{x^\star+1}}}=\left(\rho^*\right)^{a_{x^\star+1}-n}\frac{1-\left(\rho^*\rho\right)^{n-a_{x^\star}}}{1-\left(\rho^*\rho\right)^{a_{x^\star+1}-a_{x^\star}}}\neq 0. \qquad (35)$$
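As a numerical sanity check of this proof, the product form of $\det\left(\mathbf{R}\right)$ in Eq. (33) and the two non-zero weights in Eqs. (34)-(35) can be verified directly for an assumed complex $\rho$ and trained set $\mathbb{A}$:

```python
# Sketch: verify Eq. (33) and the weights of Eqs. (34)-(35) numerically.
import numpy as np

rho = 0.6 * np.exp(1j * 0.3)           # assumed complex correlation
A = np.array([1, 4, 9, 13])            # a_1 < ... < a_|A|
n = 6                                  # a_{x*} = 4, a_{x*+1} = 9
p, q = np.meshgrid(A, A, indexing="ij")
R = np.where(p >= q, rho ** (p - q), np.conj(rho) ** (q - p))

det_closed = np.prod(1 - np.abs(rho) ** (2 * np.diff(A)))
print(np.isclose(np.linalg.det(R), det_closed))     # Eq. (33)

v = np.where(A < n, rho ** (n - A), np.conj(rho) ** (A - n))
m = v @ np.linalg.inv(R)
x1, x2 = n - 4, 9 - n
rr = np.abs(rho) ** 2
m_x = rho**x1 * (1 - rr**x2) / (1 - rr**(x1 + x2))            # Eq. (34)
m_x1 = np.conj(rho)**x2 * (1 - rr**x1) / (1 - rr**(x1 + x2))  # Eq. (35)
print(np.allclose(m, [0, m_x, m_x1, 0]))            # only two non-zeros
```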

References

  • [1] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Trans. Wireless Commun., vol. 9, no. 11, pp. 3590–3600, 2010.
  • [2] H. Q. Ngo, E. G. Larsson, and T. L. Marzetta, “Energy and spectral efficiency of very large multiuser MIMO systems,” IEEE Trans. Commun., vol. 61, no. 4, pp. 1436–1449, 2013.
  • [3] C. Zhang, Y. Huang, Y. Jing, S. Jin, and L. Yang, “Sum-rate analysis for massive MIMO downlink with joint statistical beamforming and user scheduling,” IEEE Trans. Wireless Commun., vol. 16, no. 4, pp. 2181–2194, 2017.
  • [4] M. Biguesh and A. B. Gershman, “Training-based MIMO channel estimation: A study of estimator tradeoffs and optimal training signals,” IEEE Trans. Signal Process., vol. 54, no. 3, pp. 884–893, 2006.
  • [5] A. Adhikary, J. Nam, J.-Y. Ahn, and G. Caire, “Joint spatial division and multiplexing—The large-scale array regime,” IEEE Trans. Inf. Theory, vol. 59, no. 10, pp. 6441–6463, 2013.
  • [6] J. Choi, D. J. Love, and P. Bidigare, “Downlink training techniques for FDD massive MIMO systems: Open-loop and closed-loop training with memory,” IEEE J. Sel. Topics Signal Process., vol. 8, no. 5, pp. 802–814, 2014.
  • [7] W. Shen, L. Dai, B. Shim, S. Mumtaz, and Z. Wang, “Joint CSIT acquisition based on low-rank matrix completion for FDD massive MIMO systems,” IEEE Commun. Lett., vol. 19, no. 12, pp. 2178–2181, 2015.
  • [8] Y.-X. Zhang, A.-A. Lu, and X. Gao, “Sum-rate-optimal statistical precoding for FDD massive MIMO downlink with deterministic equivalents,” IEEE Trans. Veh. Technol., vol. 71, no. 7, pp. 7359–7370, 2022.
  • [9] X. Rao and V. K. Lau, “Distributed compressive CSIT estimation and feedback for FDD multi-user massive MIMO systems,” IEEE Trans. Signal Process., vol. 62, no. 12, pp. 3261–3271, 2014.
  • [10] Z. Gao, L. Dai, Z. Wang, and S. Chen, “Spatially common sparsity based adaptive channel estimation and feedback for FDD massive MIMO,” IEEE Trans. Signal Process., vol. 63, no. 23, pp. 6169–6183, 2015.
  • [11] W. Shen, L. Dai, Y. Shi, B. Shim, and Z. Wang, “Joint channel training and feedback for FDD massive MIMO systems,” IEEE Trans. Veh. Technol., vol. 65, no. 10, pp. 8762–8767, 2015.
  • [12] K. Venugopal, A. Alkhateeb, N. G. Prelcic, and R. W. Heath Jr., “Channel estimation for hybrid architecture-based wideband millimeter wave systems,” IEEE J. Sel. Areas Commun., vol. 35, no. 9, pp. 1996–2009, 2017.
  • [13] P. N. Alevizos, X. Fu, N. D. Sidiropoulos, Y. Yang, and A. Bletsas, “Limited feedback channel estimation in massive MIMO with non-uniform directional dictionaries,” IEEE Trans. Signal Process., vol. 66, no. 19, pp. 5127–5141, 2018.
  • [14] M. Ke, Z. Gao, Y. Wu, X. Gao, and R. Schober, “Compressive sensing-based adaptive active user detection and channel estimation: Massive access meets massive MIMO,” IEEE Trans. Signal Process., vol. 68, pp. 764–779, 2020.
  • [15] Y. Han, T.-H. Hsu, C.-K. Wen, K.-K. Wong, and S. Jin, “Efficient downlink channel reconstruction for FDD multi-antenna systems,” IEEE Trans. Wireless Commun., vol. 18, no. 6, pp. 3161–3176, 2019.
  • [16] W. Peng, W. Li, W. Wang, X. Wei, and T. Jiang, “Downlink channel prediction for time-varying FDD massive MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 13, no. 5, pp. 1090–1102, 2019.
  • [17] S. Kim, J. W. Choi, and B. Shim, “Downlink pilot precoding and compressed channel feedback for FDD-based cell-free systems,” IEEE Trans. Wireless Commun., vol. 19, no. 6, pp. 3658–3672, 2020.
  • [18] Y. Yang, F. Gao, G. Y. Li, and M. Jian, “Deep learning-based downlink channel prediction for FDD massive MIMO system,” IEEE Commun. Lett., vol. 23, no. 11, pp. 1994–1998, 2019.
  • [19] E. Koyuncu and H. Jafarkhani, “Interleaving training and limited feedback for point-to-point massive multiple-antenna systems,” in Proc. IEEE Int. Symp. Inf. Theor., Hong Kong, China, Jun. 2015, pp. 1242–1246.
  • [20] E. Koyuncu, X. Zou, and H. Jafarkhani, “Interleaving channel estimation and limited feedback for point-to-point systems with a large number of transmit antennas,” IEEE Trans. Wireless Commun., vol. 17, no. 10, pp. 6762–6774, 2018.
  • [21] C. Zhang, Y. Jing, Y. Huang, and L. Yang, “Interleaved training and training-based transmission design for hybrid massive antenna downlink,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 3, pp. 541–556, 2018.
  • [22] W. He, C. Zhang, and Y. Huang, “Interleaved training codebook design for millimeter-wave communication system,” in Proc. IEEE/CIC Int. Conf. Commun., Beijing, China, Aug. 2018, pp. 6–10.
  • [23] C. Zhang, Y. Jing, Y. Huang, and X. You, “Interleaved training for intelligent surface-assisted wireless communications,” IEEE Signal Process Lett., vol. 27, pp. 1774–1778, 2020.
  • [24] Y. Jing, S. ShahbazPanahi, and X. Yu, “SINR-based interleaved training design for multi-user massive MIMO downlink with MRT,” in Proc. IEEE Int. Conf. Commun., Seoul, South Korea, May 2022, pp. 237–242.
  • [25] Y. Jing, X. Yu, and S. ShahbazPanahi, “Interleaved training scheme for multi-user massive MIMO downlink with user SINR constraint,” IEEE Trans. Commun., 2023.
  • [26] L. You, X. Gao, X.-G. Xia, N. Ma, and Y. Peng, “Pilot reuse for massive MIMO transmission over spatially correlated Rayleigh fading channels,” IEEE Trans. Wireless Commun., vol. 14, no. 6, pp. 3352–3366, 2015.
  • [27] S. L. Loyka, “Channel capacity of MIMO architecture using the exponential correlation matrix,” IEEE Commun. Lett., vol. 5, no. 9, pp. 369–371, 2001.
  • [28] A. Decurninge, M. Guillaud, and D. T. Slock, “Channel covariance estimation in massive MIMO frequency division duplex systems,” in Proc. IEEE Globecom Workshops, San Diego, CA, USA, Dec. 2015, pp. 1–6.
  • [29] D. Neumann, M. Joham, and W. Utschick, “Covariance matrix estimation in massive MIMO,” IEEE Signal Process Lett., vol. 25, no. 6, pp. 863–867, 2018.
  • [30] K. Li, Y. Li, L. Cheng, Q. Shi, and Z.-Q. Luo, “Downlink channel covariance matrix reconstruction for FDD massive MIMO systems with limited feedback,” arXiv preprint arXiv:2204.00779, 2022.
  • [31] U. Grenander and G. Szegö, Toeplitz forms and their applications.   California, U.S.A.: California Univ. Press, 1958.
  • [32] E. Björnson, D. Hammarwall, and B. Ottersten, “Exploiting quantized channel norm feedback through conditional statistics in arbitrarily correlated MIMO systems,” IEEE Trans. Signal Process., vol. 57, no. 10, pp. 4027–4041, 2009.
  • [33] R. K. Mallik, “The exponential correlation matrix: Eigen-analysis and applications,” IEEE Trans. Wireless Commun., vol. 17, no. 7, pp. 4690–4705, 2018.
  • [34] S.-G. Hwang, “Cauchy’s interlace theorem for eigenvalues of Hermitian matrices,” Amer. Math. Monthly, vol. 111, no. 2, pp. 157–159, 2004.
  • [35] B. Picinbono, “Second-order complex random vectors and normal distributions,” IEEE Trans. Signal Process., vol. 44, no. 10, pp. 2637–2640, 1996.
  • [36] L. Schumacher, K. I. Pedersen, and P. E. Mogensen, “From antenna spacings to theoretical capacities-guidelines for simulating MIMO systems,” in Proc. IEEE Int. Symp. Person. Indoor Mobile Radio Commun., vol. 2, Lisboa, Portugal, Sep. 2002, pp. 587–592.
Cheng Zhang (Member, IEEE) received the B.Eng. degree from Sichuan University, Chengdu, China, in June 2009, the M.Sc. degree from the Xi’an Electronic Engineering Research Institute (EERI), Xi’an, China, in May 2012, and the Ph.D. degree from Southeast University (SEU), Nanjing, China, in Dec. 2018. From Nov. 2016 to Nov. 2017, he was a Visiting Student with the University of Alberta, Edmonton, AB, Canada. From June 2012 to Aug. 2013, he was a Radar Signal Processing Engineer with Xi’an EERI. Since Dec. 2018, he has been with SEU, where he is currently an Associate Professor, supported by the Zhishan Young Scholar Program of SEU. His current research interests include space-time signal processing and machine learning-aided optimization for B5G/6G wireless communications. He has authored or co-authored more than 50 IEEE journal and conference papers. He received the Excellent Doctoral Dissertation award of the China Education Society of Electronics in Dec. 2019 and that of Jiangsu Province in Dec. 2020, as well as the Best Paper Awards of 2023 IEEE WCNC and 2023 IEEE WCSP.
Chang Liu (Student Member, IEEE) received the B.Eng. degree in information engineering from the School of Information Science and Engineering, Southeast University, Nanjing, China, in 2021, where he is currently pursuing the M.Sc. degree. His research interests mainly focus on low-overhead massive MIMO channel acquisition.
Yindi Jing (Senior Member, IEEE) received the B.Eng. and M.Eng. degrees in automatic control from the University of Science and Technology of China, Hefei, China, in 1996 and 1999, respectively. She received the M.Sc. and Ph.D. degrees in electrical engineering from the California Institute of Technology, Pasadena, CA, in 2000 and 2004, respectively. From October 2004 to August 2005, she was a postdoctoral scholar at the Department of Electrical Engineering of the California Institute of Technology. From February 2006 to June 2008, she was a postdoctoral scholar at the Department of Electrical Engineering and Computer Science of the University of California, Irvine. In 2008, she joined the Electrical and Computer Engineering Department of the University of Alberta, where she is currently a professor. She was an Associate Editor for the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS and a Senior Area Editor for the IEEE SIGNAL PROCESSING LETTERS. She was a member of the IEEE Signal Processing Society Signal Processing for Communications and Networking (SPCOM) Technical Committee and a member of the NSERC Discovery Grant Evaluation Group for Electrical and Computer Engineering. Her research interests are in wireless communications and signal processing.
Minjie Ding (Student Member, IEEE) received the B.Eng. degree in information engineering from the School of Electronic Information and Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, China, in 2020. She is currently pursuing the M.Sc. degree in information and communication engineering with the School of Information Science and Engineering, Southeast University. Her research interests mainly focus on low-overhead massive MIMO channel acquisition.
Yongming Huang (M’10-SM’16) received the B.S. and M.S. degrees from Nanjing University, Nanjing, China, in 2000 and 2003, respectively, and the Ph.D. degree in electrical engineering from Southeast University, Nanjing, in 2007. Since March 2007, he has been a faculty member in the School of Information Science and Engineering, Southeast University, China, where he is currently a full professor. He has also been the Director of the Pervasive Communication Research Center, Purple Mountain Laboratories, since 2019. From 2008 to 2009, he was a visiting researcher at the Signal Processing Lab, Royal Institute of Technology (KTH), Stockholm, Sweden. He has published over 200 peer-reviewed papers and holds over 80 invention patents. His current research interests include intelligent 5G/6G mobile communications and millimeter wave wireless communications. He submitted around 20 technical contributions to IEEE standards and was awarded a certificate of appreciation for outstanding contribution to the development of IEEE standard 802.11aj. He served as an Associate Editor for the IEEE Transactions on Signal Processing and a Guest Editor for the IEEE Journal on Selected Areas in Communications. He is currently an Editor-at-Large for the IEEE Open Journal of the Communications Society.