
Improving Sum-Rate of Cell-Free Massive MIMO with Expanded Compute-and-Forward

Jiayi Zhang,  Jing Zhang, Derrick Wing Kwan Ng, 
Shi Jin,  and Bo Ai
J. Zhang and J. Zhang are with the School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China, and also with the Frontiers Science Center for Smart High-speed Railway System, Beijing Jiaotong University, Beijing 100044, China (e-mail: jiayizhang@bjtu.edu.cn). D. W. K. Ng is with the School of Electrical Engineering and Telecommunications, University of New South Wales, NSW 2052, Australia (e-mail: w.k.ng@unsw.edu.au). S. Jin is with the National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China (e-mail: jinshi@seu.edu.cn). B. Ai is with the State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing 100044, China, and also with the Frontiers Science Center for Smart High-speed Railway System, and also with the Henan Joint International Research Laboratory of Intelligent Networking and Data Analysis, Zhengzhou University, Zhengzhou 450001, China, and also with the Research Center of Networks and Communications, Peng Cheng Laboratory, Shenzhen, China (e-mail: boai@bjtu.edu.cn).
Abstract

Cell-free massive multiple-input multiple-output (MIMO) employs a large number of distributed access points (APs) to serve a small number of user equipments (UEs) over the same time/frequency resource. Due to the strong macro-diversity gain, cell-free massive MIMO can considerably improve the achievable sum-rate compared to conventional cellular massive MIMO. However, the performance of cell-free massive MIMO is limited by inter-user interference (IUI) when simple maximum ratio combining (MRC) is employed at the receivers. To mitigate the IUI, the expanded compute-and-forward (ECF) framework is adopted. In particular, we propose power control algorithms for parallel computation and successive computation in the ECF framework, respectively, to exploit the performance gain and thereby improve the system performance. Furthermore, we propose an AP selection scheme and the application of different decoding orders for successive computation. Finally, numerical results demonstrate that ECF frameworks outperform the conventional CF and MRC frameworks in terms of achievable sum-rate.

Index Terms:
Cell-free massive MIMO, expanded compute-and-forward, power control, sum rate.

I Introduction

Massive multiple-input multiple-output (MIMO) is a promising physical-layer technology for keeping up with the exponential traffic growth of future wireless communication systems. More specifically, massive MIMO can provide tremendous beamforming and spatial multiplexing gains to multiple user equipments (UEs) and increase the achievable system sum-rate [2, 3, 4]. Despite the potential performance gain brought by massive MIMO, cell-edge UEs may experience poor channel conditions and suffer from strong inter-cell interference (ICI). To alleviate this performance bottleneck, distributed massive MIMO has been proposed to combat ICI and to improve the performance of cell-edge UEs. However, there is a fundamental performance limitation for distributed massive MIMO even with full cooperation between different transmitters [5].

Recently, the authors in [6] proposed a practical network infrastructure for distributed massive MIMO under the name of cell-free massive MIMO [7, 8, 9]. In cell-free massive MIMO systems, a large number of access points (APs) are distributed over a wide area and connected to a central processing unit (CPU) via a fronthaul network. In particular, a small number of UEs are served by all APs on the same time/frequency resource [6, 10, 11]. Since there are no cells or cell boundaries, ICI does not exist. Indeed, cell-free massive MIMO is a specific realization of distributed massive MIMO [6].

The most outstanding aspect of cell-free massive MIMO is that many APs simultaneously serve a much smaller number of UEs, which yields a high degree of macro-diversity and can offer a huge spectral efficiency. Besides, some studies have reported that favorable propagation is another potential advantage of cell-free massive MIMO, which can be exploited to eliminate inter-user interference (IUI) [6]. Note that favorable propagation refers to the property that, when the number of AP antennas is sufficiently large, the channels between the UEs and APs become asymptotically orthogonal [12]. However, the favorable propagation property does not always hold in practical systems. The resulting non-negligible IUI is highly undesirable and leads to a considerable loss in achievable sum-rate. As a result, the question of how to mitigate the IUI has triggered many new coding and signal processing techniques.

I-A Related Works

As a new approach to linear physical-layer network coding that allows intermediate nodes to send out functions of their received packets [13, 14, 15, 16], the compute-and-forward (CF) scheme has recently been employed in cell-free massive MIMO systems to offer protection against noise and to reduce IUI through cooperation gain [17]. For the uplink transmission, the UEs employ a nested lattice coding strategy to encode data taking values in a prime-size finite field before transmission. The CF scheme then enables the APs to decode integer linear equations of the UEs' codewords from the noisy linear combinations provided by the channels. Relying on nested lattice codes, any linear combination of the UEs' codewords is still a valid codeword [18, 19]. Next, each AP forwards its decoded combination to the CPU through the fronthaul link. After receiving sufficiently many linear combinations, the CPU can recover every UE's original data by performing AP selection and solving the received equations [20, 21, 17].

However, the CF scheme requires all UEs to transmit with equal power, which is generally not the optimal strategy for improving the achievable sum-rate. Due to the different propagation conditions between APs and UEs, the performance can be improved by appropriate power control [12]. Moreover, with power control at the UEs, the effective noise variance across all APs whose linear combinations involve a given message can be reduced, and the achievable sum-rate can be further improved.

Motivated by the discussion above, we adopt the expanded compute-and-forward (ECF) framework, which was proposed in [22], for the uplink transmission in cell-free massive MIMO systems. The ECF framework is able to allocate transmit powers unequally while retaining the connection between the finite field data and the lattice codeword. We note that the coordinated multipoint (CoMP) framework can also be implemented with interference alignment at the transmitter side [23, 24, 25]; however, the distinction between CoMP and ECF is that CoMP, as conventionally defined, does not involve the CF strategy.

There are two types of ECF framework, named parallel computation and successive computation, respectively. The distinction between these schemes is that in parallel computation the CPU recovers the UEs' data independently, while in successive computation the CPU decodes the linear combinations using successive cancellation. Specifically, in successive computation, the combinations that have already been decoded can be used as side information in subsequent decoding steps to decrease both the effective noise variance and the number of UEs that must tolerate the effective noise. Applying successive computation thus improves the achievable sum-rate; however, parallel computation has an advantage in terms of processing delay. In other words, there is a trade-off between parallel computation and successive computation.

Besides, two key aspects dominate the performance of the ECF framework: coefficient vector selection and AP selection. Since the performance of ECF is captured by the computation rate, which is highest when the equation coefficients closely approximate the effective channel coefficients, a careful design of the coefficient vector is beneficial for improving the achievable sum-rate. As for AP selection, it is performed at the CPU when recovering the UEs' original data in both parallel computation and successive computation. With the help of AP selection, the computational complexity of power optimization is reduced. Furthermore, the noise tolerance on the UEs' data can be relaxed, which contributes to the improvement of the achievable sum-rate.

I-B Contributions

In this paper, we consider the application of ECF framework in cell-free massive MIMO systems to increase the achievable sum-rate, including both parallel computation and successive computation. The main contributions of this paper are as follows:

  • We apply a quadratic programming relaxation based coefficient vector selection method and a large-scale fading based low-complexity AP selection algorithm to improve the achievable sum-rate of the cell-free massive MIMO system.

  • We design efficient power control algorithms for parallel and successive computation schemes, respectively. For the successive computation scheme, we further derive a sub-optimal decoding order of combinations and develop three assignment algorithms to find a sub-optimal decoding order of UEs.

  • We quantitatively compare the performance of conventional combining and ECF frameworks under a practical channel model and scenarios, which shows that the ECF framework is an effective approach for fronthaul reduction. In particular, the successive computation scheme outperforms the parallel computation scheme at the cost of a larger fronthaul load.

Compared with our related conference paper [1], which focused only on parallel computation with power control based on uplink-downlink duality, in this paper we provide a thorough analysis of the successive computation scheme with power control for improving the achievable sum-rate. Besides, the problem-solving methodology for determining the sub-optimal decoding orders of combinations and UEs is investigated. Furthermore, the results of [1] are not applicable to the case considered in this paper because a different power control method and an additional AP selection algorithm are applied. More importantly, we also provide practical insights into the achievable sum-rate performance of the MRC, CF, centralized MMSE, parallel computation, and successive computation schemes.

The rest of this paper is organized as follows. In Section II, we describe the cell-free massive MIMO system model. A detailed introduction for ECF framework is given in Section III. Furthermore, AP selection methods and power control algorithm for parallel computation are introduced in Section III-B. In Section III-C, we investigate different decoding order methods of combinations for successive computation. Finally, numerical results and discussions are given in Section IV while Section V concludes the paper.

Table I shows the notations. Unless further specified, plain letters, boldface lowercase letters, and boldface uppercase letters denote scalars, column vectors, and matrices, respectively.

TABLE I: Notations
p — A prime number
ℝ, ℂ, ℤ_p — Reals, complex field, finite field of size p
q₁, q₂, w₁, w₂, r — Elements of ℤ_p
ℤ[i] = {a + bi | a, b ∈ ℤ} — Set of Gaussian integers, whose real and imaginary parts are both integers
∑ — Addition over the real or complex field
⊕ — Addition over the finite field
a mod p = r — The remainder r of dividing a by p
q₁w₁ ⊕ q₂w₂ — (q₁w₁ + q₂w₂) mod p
‖𝐚‖ — 2-norm of vector 𝐚
𝐚^T, 𝐚^H — Transpose and conjugate transpose of 𝐚
⌊a⌋ — Floor function of a
𝐈 — Identity matrix
𝔼{a} — Expectation of a
log⁺(a) — max(log(a), 0); the log function is with respect to base 2

II System Model

We consider an uplink cell-free massive MIMO system, in which M single-antenna APs and L (M > L) single-antenna UEs are randomly distributed over a wide geographical area [6, 10, 11]. The APs serve the UEs on the same time/frequency resource. In particular, each AP exchanges information with the CPU via a fronthaul link. As the number of APs is finite in practice, we assume that the IUI can still have a significant impact on the achievable sum-rate.

Figure 1: ECF framework based cell-free massive MIMO systems.

First, we provide some necessary definitions on nested lattice codes. An n-dimensional lattice Λ is a set of points in ℝ^n such that if 𝐬, 𝐭 ∈ Λ, then 𝐬 + 𝐭 ∈ Λ, and if 𝐬 ∈ Λ, then −𝐬 ∈ Λ. Note that a lattice can always be written in terms of a lattice generator matrix 𝐁 ∈ ℝ^{n×n}, i.e., Λ = {𝐬 = 𝐁𝐜 : 𝐜 ∈ ℤ^n}. Besides, a lattice Λ is said to be nested in a lattice Λ₁ if Λ ⊆ Λ₁. As shown in Fig. 1, without loss of generality, the l-th UE maps its original length-k data 𝐰_l ∈ ℤ_p^k into a length-n complex-valued lattice codeword 𝐱_l with the encoder ϕ_l : ℤ_p^k → ℤ[i]^n. The specific choices of n and p are studied in [22, Theorem 8]. To create generator matrices that encode the original data into nested lattice codewords, the blocklength needs to be sufficiently large; therefore, a longer blocklength n is preferable. Note that k_l is the number of symbols carrying information; the remaining k − k_l symbols are set to zero to meet the power constraint and the effective noise tolerance. The lattice codeword is subject to the power constraint 𝔼{‖𝐱_l‖²} ≤ nP_l, where P_l is the transmit power of the l-th UE.
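The lattice definitions above can be checked numerically. The following is a minimal sketch with a hypothetical 2-D generator matrix B (chosen for illustration only, not from the paper): sums and negations of lattice points stay in the lattice because integer coordinate vectors are closed under addition and negation, and Λ is nested in Λ₁ exactly when B factors as B₁ times an integer matrix.

```python
import numpy as np

# Hypothetical 2-D generator matrix (illustration only).
B = np.array([[2.0, 1.0],
              [0.0, 1.0]])

def lattice_point(c):
    """Map an integer coordinate vector c to the lattice point s = Bc."""
    return B @ np.asarray(c, dtype=float)

def in_lattice(s):
    """Check s in Lambda by solving Bc = s and testing that c is integer."""
    c = np.linalg.solve(B, s)
    return np.allclose(c, np.round(c))

s, t = lattice_point([1, -2]), lattice_point([3, 5])
assert in_lattice(s + t)   # closure under addition
assert in_lattice(-s)      # closure under negation

# A nested lattice Lambda in Lambda1 arises when B = B1 @ M for integer M;
# here Lambda1 = Z^2, so the factor is simply B itself.
B1 = np.eye(2)
M_int = np.linalg.solve(B1, B)
assert np.allclose(M_int, np.round(M_int))  # hence every point of Lambda lies in Lambda1
```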

Let g_{ml} represent the channel coefficient between the m-th AP and the l-th UE, which is given by

g_{ml} = β_{ml}^{1/2} h_{ml}, (1)

where β_{ml} denotes the large-scale fading and h_{ml} ∈ ℂ denotes the small-scale fading. With the help of [11, Eq. (17)], the path loss and shadowing are modeled as

β_{ml} [dB] = −30.5 − 36.7 log₁₀(d_{ml}/1 m) + F_{ml}, (2)

where d_{ml} represents the distance between the m-th AP and the l-th UE, and F_{ml} ∼ N(0, 4²) is the shadow fading in dB. We assume that h_{ml}, m = 1, …, M, l = 1, …, L, are independent and identically distributed (i.i.d.) CN(0, 1) random variables (RVs).
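A minimal sketch of this channel model, with hypothetical AP/UE counts and uniformly drawn distances (the placement is an assumption for illustration): large-scale fading follows Eq. (2) in dB, and the channel coefficients follow Eq. (1).

```python
import numpy as np

rng = np.random.default_rng(0)
M, L = 8, 4  # illustrative numbers of APs and UEs

# Hypothetical AP-UE distances in meters.
d = rng.uniform(10.0, 500.0, size=(M, L))

# Large-scale fading per Eq. (2): path loss plus log-normal shadowing.
F = rng.normal(0.0, 4.0, size=(M, L))              # shadow fading, N(0, 4^2) in dB
beta_dB = -30.5 - 36.7 * np.log10(d / 1.0) + F
beta = 10.0 ** (beta_dB / 10.0)                    # linear scale

# Small-scale fading h ~ CN(0, 1) and channel coefficients per Eq. (1).
h = (rng.standard_normal((M, L)) + 1j * rng.standard_normal((M, L))) / np.sqrt(2)
g = np.sqrt(beta) * h

assert g.shape == (M, L) and np.all(beta > 0)
```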

The length-n received signal vector at the m-th AP is

𝐲_m = ∑_{l=1}^{L} g_{ml} 𝐱_l + 𝐳_m, (3)

where the thermal noise 𝐳_m ∈ ℂ^n is elementwise i.i.d. CN(0, σ²).

The ECF framework exploits the algebraic structure whereby any Gaussian integer combination of lattice codewords is still a lattice point. In cell-free massive MIMO, each AP endeavours to represent its received length-n signal vector 𝐲_m by a Gaussian integer linear combination of the UEs' codewords. By applying an equalization factor b_m and selecting the coefficient vector 𝐚_m = [a_{m1}, a_{m2}, …, a_{mL}]^T ∈ ℤ[i]^L, the scaled received signal can be expressed as

b_m 𝐲_m = ∑_{l=1}^{L} a_{ml} 𝐱_l + ∑_{l=1}^{L} (b_m g_{ml} − a_{ml}) 𝐱_l + b_m 𝐳_m, (4)

where the last two terms constitute the effective noise.
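The decomposition in Eq. (4) is a pure algebraic identity and can be verified numerically. A minimal sketch with hypothetical channels, codewords, and an arbitrary equalization factor and Gaussian-integer coefficient vector (all illustrative values):

```python
import numpy as np

rng = np.random.default_rng(1)
L, n = 4, 16                      # illustrative numbers of UEs and blocklength
g = rng.standard_normal(L) + 1j * rng.standard_normal(L)            # channels to one AP
X = rng.standard_normal((L, n)) + 1j * rng.standard_normal((L, n))  # codeword matrix
z = rng.standard_normal(n) + 1j * rng.standard_normal(n)            # thermal noise

y = X.T @ g + z                   # received signal, Eq. (3)

b = 0.7 - 0.2j                    # arbitrary equalization factor (illustrative)
a = np.array([1, 0, 1 + 1j, -1])  # a Gaussian-integer coefficient vector

# Eq. (4): b*y = integer combination of codewords + effective noise.
combination = X.T @ a
effective_noise = X.T @ (b * g - a) + b * z
assert np.allclose(b * y, combination + effective_noise)
```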

Each AP is equipped with a decoder φ_m : ℤ[i]^n → ℤ_p^k. The m-th AP decodes its received signal 𝐲_m into the finite field as û_m = φ_m(𝐲_m), where û_m is an estimate of the linear combination of original data 𝐮_m = ⊕_{l=1}^{L} q_{ml} 𝐰_l = [∑_{l=1}^{L} a_{ml} 𝐱_l] mod p.¹ The specific procedure for recovering the UEs' messages is stated in [18]. Given L linear combinations of messages with real and imaginary coefficient matrices 𝐐^R = {q_{ml}^R} and 𝐐^I = {q_{ml}^I}, the CPU can recover message 𝐰_l if there exists a vector 𝐜 ∈ ℤ_p^{2M} such that

𝐜^T [𝐐^R, −𝐐^I; 𝐐^I, 𝐐^R] = 𝜹_l^T, (5)

where 𝜹_l denotes a unit column vector with 1 in the l-th entry and 0 elsewhere.² For traditional multiuser MIMO systems, where M = L, data recovery is a major challenge due to the high probability of rank deficiency. However, in cell-free massive MIMO systems the number of APs is far larger than that of the UEs. Since the probability of selecting L APs that provide L independent linear combinations increases with the number of APs [17], the extra APs ensure a much higher probability of avoiding rank deficiency, and hence of recovering the desired messages.

¹ If the codeword spacing for given data from the l-th UE can tolerate the maximum effective noise across the APs whose linear combinations involve that data, the probability of decoding error satisfies Pr(∪_{m=1}^{M_l} {û_m ≠ 𝐮_m}) < ε, where M_l represents the number of APs whose combinations contain the data and ε is a small positive number that tends to zero.
² The two decoders adopted at the APs and the CPU, respectively, have different functionalities. The decoders at the APs decode the received signals into linear combinations of the UEs' original data; these decoded combinations are then transmitted from the APs to the CPU. In contrast, the decoder at the CPU recovers each UE's data from those combinations. Specifically, under successive computation, the interference cancellation procedure takes place at the CPU's decoder.

III Expanded Compute-and-Forward

One of the major challenges in cell-free massive MIMO is the IUI in the uplink. The CF scheme can achieve a large gain by decoding linear functions of the transmitted signals with nested lattice codes. The performance of the CF scheme for cell-free massive MIMO has been compared with MRC in [17], which shows that, with equal power transmission at all UEs, the CF scheme offers a throughput improvement. Furthermore, the ECF framework can improve the achievable sum-rate by exploiting optimized power control.

In this section, two practical ECF frameworks are considered for cell-free massive MIMO systems. The first one is parallel computation, in which the CPU decodes each of the integer linear combinations independently. The second one is successive computation, which decodes the received combinations one-by-one and employs side information from previously decoded combinations to reduce the effective noise. We begin with the parallel computation.

III-A Coefficient Vector Selection

The goal of this paper is to evaluate the performance of the ECF framework for cell-free massive MIMO systems by deriving its computation rate region [22], which is defined as the set of achievable rates R_l ensuring successful data recovery:

ℛ_ECF(𝐏, 𝐠_m, 𝐚_m) ≜ {(R₁, R₂, …, R_L) ∈ ℝ₊^L : R_l ≤ log⁺(P_l / σ²(𝐏, 𝐠_m, 𝐚_m)) ∀(m, l) s.t. a_{ml} ≠ 0}, (6)

where σ²(𝐏, 𝐠_m, 𝐚_m) refers to the effective noise variance at the m-th AP and 𝐏 ≜ diag(P₁, P₂, …, P_L) is the diagonal matrix collecting the UEs' power constraints. In order to maximize the computation rate region, we need to find the optimal coefficient vector 𝐚_m and equalization factor b_m.

According to [22, Lemma 2], the equalization factor b_m that minimizes the effective noise variance in (4) is the MMSE projection. Then, we have

b_m = 𝐠_m^H 𝐏 𝐚_m (1 + 𝐠_m^H 𝐏 𝐠_m)^{−1}. (7)

Hence, the effective noise variance is given by

σ²(𝐏, 𝐠_m, 𝐚_m) ≜ (1/n) 𝔼{‖𝐗^T (b_m 𝐠_m − 𝐚_m) + b_m 𝐳_m‖²} = 𝐚_m^H (𝐏^{−1} + 𝐠_m 𝐠_m^H)^{−1} 𝐚_m, (8)

where 𝐗 = [𝐱₁, 𝐱₂, …, 𝐱_L]^T represents the codeword matrix. For the m-th AP, the aim is to find the optimal coefficient vector that maximizes the computation rate region:

𝐚_{m,opt} = argmax_{𝐚_m ∈ ℤ[i]^L} ℛ_ECF(𝐏, 𝐠_m, 𝐚_m) = argmin_{𝐚_m ∈ ℤ[i]^L} σ²(𝐏, 𝐠_m, 𝐚_m). (9)
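Equations (7)-(9) can be sketched for a toy setting. The sketch below uses hypothetical channels and powers, and stands in for the QP-based selection of Section III-A with an exhaustive search over a small Gaussian-integer range (an assumption made purely so the example stays tiny):

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(2)
L = 2
g = rng.standard_normal(L) + 1j * rng.standard_normal(L)  # channel to one AP
P = np.diag([2.0, 1.0])                                   # illustrative UE powers

def sigma2(a):
    """Effective noise variance, Eq. (8)."""
    M = np.linalg.inv(np.linalg.inv(P) + np.outer(g, g.conj()))
    return (a.conj() @ M @ a).real

def mmse_b(a):
    """MMSE equalization factor, Eq. (7)."""
    return (g.conj() @ P @ a) / (1.0 + (g.conj() @ P @ g).real)

# Brute-force Eq. (9) over Gaussian integers with |Re|, |Im| <= 2.
cands = [np.array([x + 1j * y, u + 1j * v])
         for x, y, u, v in product(range(-2, 3), repeat=4)]
cands = [a for a in cands if np.any(a != 0)]
a_opt = min(cands, key=sigma2)
b_opt = mmse_b(a_opt)

# The optimum is at least as good as the trivial single-UE coefficient vector.
assert sigma2(a_opt) <= sigma2(np.array([1, 0]))
```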

Since the channel coefficient between the m-th AP and the l-th UE is complex-valued, the received signal 𝐲_m can be divided into its real and imaginary parts:

Re(𝐲_m) = ∑_{l=1}^{L} (Re(g_{ml}) Re(𝐱_l) − Im(g_{ml}) Im(𝐱_l)) + Re(𝐳_m),
Im(𝐲_m) = ∑_{l=1}^{L} (Im(g_{ml}) Re(𝐱_l) + Re(g_{ml}) Im(𝐱_l)) + Im(𝐳_m).

Therefore, we can transform the complex-valued network with L UEs and M APs into a real-valued network with 2L UEs and 2M APs, and compute the real and imaginary parts of the coefficient vector 𝐚_m separately.³ Without loss of generality, we only consider Re(𝐚_m) for a given real-valued channel vector Re(𝐠_m) in the following.

³ Reducing the coefficient selection problem for complex channels to an equivalent real-valued channel is generally suboptimal. Methods for finding the optimal solution in polynomial time over complex-integer lattices and complex channels were proposed in [26]; however, they require substantially higher complexity. Investigating low-complexity methods that explicitly address the complex channel is left for future work.
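The complex-to-real transformation above is the standard real-valued representation of a complex linear system and can be verified numerically. A minimal sketch with hypothetical channels and symbols (noise omitted so the identity is exact):

```python
import numpy as np

rng = np.random.default_rng(3)
L = 3
g = rng.standard_normal(L) + 1j * rng.standard_normal(L)   # complex channels
x = rng.standard_normal(L) + 1j * rng.standard_normal(L)   # one symbol per UE

y = np.dot(g, x)   # noiseless received sample at one AP

# Real-valued equivalent: 2L inputs [Re(x); Im(x)], with the channel rows
# built from Re(g) and Im(g) exactly as in the displayed equations.
G_real = np.block([[g.real, -g.imag],
                   [g.imag,  g.real]])           # shape (2, 2L)
x_real = np.concatenate([x.real, x.imag])        # shape (2L,)
y_real = G_real @ x_real

assert np.allclose(y_real, [y.real, y.imag])
```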

For each real-valued channel vector Re(𝐠_m), we can find a signed permutation matrix 𝐒, which is unimodular and orthogonal, such that 𝐒 Re(𝐠_m) is nonnegative with its elements in nondecreasing order [27, Lemma 1]. Suppose Re(𝐚_m)_opt is the optimal coefficient vector for the power constraint 𝐏 and channel vector Re(𝐠_m); then ℛ(Re(𝐠_m), Re(𝐚_m)_opt) = ℛ(𝐒 Re(𝐠_m), 𝐒 Re(𝐚_m)_opt) [27, Lemma 3]. Define the resulting nonnegative, nondecreasingly ordered vector as ḡ_m ≜ 𝐒 Re(𝐠_m). Therefore, we can recover the desired coefficient vector through Re(𝐚_m)_opt = 𝐒^{−1} ā_{m,opt}, where ā_{m,opt} denotes the optimal coefficient vector for ḡ_m.
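A signed permutation matrix with these properties can be built directly from the signs and the magnitude ordering of the channel entries. A minimal sketch (the construction is one straightforward way to realize [27, Lemma 1], with an illustrative input vector):

```python
import numpy as np

def signed_permutation(v):
    """Build a signed permutation S so that S @ v is nonnegative
    and sorted in nondecreasing order (cf. [27, Lemma 1])."""
    n = len(v)
    signs = np.where(v < 0, -1.0, 1.0)
    order = np.argsort(np.abs(v))          # indices by increasing magnitude
    S = np.zeros((n, n))
    for row, idx in enumerate(order):
        S[row, idx] = signs[idx]           # one signed entry per row and column
    return S

v = np.array([3.0, -0.5, -7.0, 1.0])       # illustrative channel vector
S = signed_permutation(v)
w = S @ v
assert np.all(w >= 0) and np.all(np.diff(w) >= 0)  # nonnegative, nondecreasing
assert np.allclose(S @ S.T, np.eye(4))             # orthogonal
assert abs(round(np.linalg.det(S))) == 1           # unimodular
# Recovering the original-domain vector uses S^{-1}, which equals S^T here:
assert np.allclose(np.linalg.inv(S), S.T)
```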

In the following, we concentrate on acquiring ā_{m,opt} for ḡ_m by relaxing the optimization problem in (9) based on the quadratic programming (QP) method [28]. Recall that ā_m is in nondecreasing order; therefore, its maximum element is the last one, ā_{m,L}. According to [22], the search space for ā_{m,L} can be restricted by

ā_{m,L} ≤ λ_max(𝐈 + ḡ_m 𝐏 ḡ_m^T), (10)

where λ_max(·) denotes the maximum eigenvalue of its argument. Then, the problem in (9) can be rewritten as a series of QP problems:

minimize over ā_m ∈ ℝ^L:  ā_m^T 𝐆_m ā_m,  subject to  ā_{m,L} = k,  k = 1, 2, …, K, (11)

where 𝐆_m = (𝐏^{−1} + ḡ_m ḡ_m^T)^{−1}, consistent with the effective noise variance in (8), and K = ⌊λ_max(𝐈 + ḡ_m 𝐏 ḡ_m^T)⌋. Let ā_m^{(k)} represent the solution of (11) under the constraint ā_{m,L} = k. All K solutions can be obtained from ā_m^{(k)} = k ā_m^{(1)}. With the Lagrange multiplier method [28], we have

ā_m^{(1)} = [𝐫^T, 1]^T, (12)

where 𝐫 = −(𝐆_m(1:L−1, 1:L−1))^{−1} 𝐆_m(1:L−1, L). With the help of [27, Algorithm 1], the K real-valued solutions {ā_m^{(k)}} can be quantized to integer-valued candidates {ā_m^{int,k}}. We then select a sub-optimal coefficient vector for ḡ_m as

ā_{m,opt} = argmin_{ā_m ∈ {ā_m^{int,k}}} ā_m^T 𝐆_m ā_m.

Finally, the optimal coefficient vector Re(𝐚_m)_opt associated with the channel vector Re(𝐠_m) is recovered via Re(𝐚_m)_opt = 𝐒^{−1} ā_{m,opt}. Following a similar line of reasoning, the imaginary part of the coefficient vector can be derived.
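The QP-relaxation steps above can be sketched end-to-end for a toy real-valued channel. This is a simplified illustration, not the paper's full algorithm: it uses simple rounding in place of the quantization of [27, Algorithm 1], hypothetical channel and power values, and the matrix 𝐆_m = (𝐏^{−1} + ḡ ḡ^T)^{−1} matching the effective-noise expression in (8).

```python
import numpy as np

rng = np.random.default_rng(4)
L = 4
g_bar = np.sort(np.abs(rng.standard_normal(L)))   # nonnegative, nondecreasing channel
P = np.diag(rng.uniform(0.5, 2.0, L))             # illustrative UE power constraints

# Quadratic form matrix, matching the effective noise variance in Eq. (8).
G = np.linalg.inv(np.linalg.inv(P) + np.outer(g_bar, g_bar))

# Relaxed real-valued solution for the constraint a_L = 1, Eq. (12).
r = -np.linalg.solve(G[:L-1, :L-1], G[:L-1, L-1])
a1 = np.append(r, 1.0)

# Candidate count K from the eigenvalue bound, Eq. (10).
lam_max = np.linalg.eigvals(np.eye(L) + np.outer(g_bar, g_bar) @ P).real.max()
K = int(np.floor(lam_max))        # >= 1 since lam_max = 1 + g^T P g

# Scale each relaxed solution, round to integers, and keep the best candidate.
cands = [np.round(k * a1) for k in range(1, K + 1)]
cands = [a for a in cands if np.any(a != 0)]
a_opt = min(cands, key=lambda a: a @ G @ a)

assert len(cands) >= 1 and (a_opt @ G @ a_opt) > 0
```

In practice the selected integer vector is then mapped back to the original channel ordering with the inverse signed permutation, as described above.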

III-B Parallel Computation

For parallel computation, the integer linear combinations of the UEs' data are decoded independently. On this basis, we first introduce the computation rate region. Then, we provide a detailed description of the proposed power control algorithm, which improves the achievable sum-rate. To reduce the effective noise variance and the computational complexity, we further propose an AP selection algorithm based on large-scale fading.

III-B1 Computation Rate Region

Let us suppose that all APs have full channel state information. To obtain the estimate û_m of the integer combination of the UEs' original data, the m-th AP multiplies the received signal by an equalization factor b_m to obtain the effective channel as

𝐲̃_m = b_m 𝐲_m = b_m 𝐗^T 𝐠_m + b_m 𝐳_m = 𝐗^T 𝐚_m + 𝐗^T (b_m 𝐠_m − 𝐚_m) + b_m 𝐳_m, (13)

where the last two terms constitute the effective noise.

After choosing bmb_{m} to be the minimum mean-square error (MMSE) coefficient adopted at the mmth AP, the minimum effective noise variance for parallel computation is given by

σpara2(𝐏,𝐠m,𝐚m)=Δ𝐚mH(𝐏1+𝐠m𝐠mH)1𝐚m.\sigma_{\text{para}}^{2}\left({{\bf{P}},{{\bf{g}}_{m}},{{\bf{a}}_{m}}}\right)\buildrel\Delta\over{=}{\bf{a}}_{m}^{H}{\left({{{{{\bf{P}}}}^{-1}}+{{\bf{g}}_{m}}{\bf{g}}_{m}^{H}}\right)^{-1}}{{\bf{a}}_{m}}. (14)

We denote 𝐀{\bf{A}} as the matrix of the coefficient vectors, 𝐀=[𝐚1,𝐚2,…,𝐚M]{\bf{A}}=\left[{{{\bf{a}}_{1,}}{{\bf{a}}_{2}},\ldots,{{\bf{a}}_{M}}}\right]. Specifically, if the mmth column of 𝐀{\bf{A}} is a null vector, the mmth AP does not serve any UE; if the llth row of 𝐀{\bf{A}} is a zero vector, the llth UE is not served by any AP. When we remove such columns and rows from 𝐀{\bf{A}}, we obtain 𝐀[i]L×M{\bf{A}}\in{\mathbb{Z}}{\left[i\right]^{L^{\prime}\times M^{\prime}}}, where LL^{\prime} and MM^{\prime} refer to the numbers of effective UEs and APs, respectively. Due to the array gain, the sum-rate increases as MM^{\prime} grows. However, there is a trade-off between LL^{\prime} and the sum-rate performance, since increasing the number of effective UEs does not always increase the sum-rate [6]. According to the discussion of coefficient vector selection in Section III-A, the values of MM^{\prime} and LL^{\prime} are determined by the locations of APs and UEs; hence, MM^{\prime} and LL^{\prime} take their optimal values when the locations of APs and UEs are optimal. Define the rank of 𝐀{\bf{A}} by Mrank=ΔRank(𝐀){{M^{\prime}}_{{\rm{rank}}}}\buildrel\Delta\over{=}{\rm{Rank}}\left({\bf{A}}\right). According to [18], all effective UEs’ data can be recovered if Mrank=L{{M^{\prime}}_{{\rm{rank}}}}=L^{\prime}. Therefore, we only need LL^{\prime} integer linear combinations among the whole MM^{\prime} combinations. In other words, only LL^{\prime} APs need to transmit signals to the CPU through fronthaul links. The computation rate region for the parallel computation is given by

para(𝐏,𝐠m,𝐚m)=Δ{(R1,R2,,RL)+L:\displaystyle\!\!{\mathcal{R}_{{\rm{para}}}}\left({{\bf{P}},{{\bf{g}}_{m}},{{\bf{a}}_{m}}}\right)\buildrel\Delta\over{=}\Bigg{\{}{\left({{R_{1}},{R_{2}},\ldots,{R_{L^{\prime}}}}\right)\in\mathbb{R}_{+}^{L^{\prime}}:}
Rllog+(Plσpara2(𝐏,𝐠m,𝐚m))(m,l)s.t.aml0},\displaystyle\!\!{{R_{l}}\!\leq\!{{\log}^{+}}\!\left(\!{\frac{{{P_{l}}}}{{{\sigma_{\text{para}}^{2}}\left({{{{\bf{P}},{{\bf{g}}_{m}},{{\bf{a}}_{m}}}}}\!\right)}}}\right)\;\;\forall\!\left({m,l}\right)\;{\rm{s.}}{\rm{t.}}\;{a_{ml}}\neq 0}\Bigg{\}}, (15)

where aml=0a_{ml}=0 means the mmth AP does not serve the llth UE.
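To make the pruning of 𝐀 concrete, the following sketch (our own illustration; the helper name `prune_coefficient_matrix` is hypothetical) removes zero rows and columns to obtain the effective counts L′ and M′, and checks the rank condition under which all effective UEs’ data are recoverable:

```python
import numpy as np

def prune_coefficient_matrix(A):
    """Drop zero columns (APs serving no UE) and zero rows (UEs served by no AP)."""
    row_keep = np.any(A != 0, axis=1)
    col_keep = np.any(A != 0, axis=0)
    return A[np.ix_(row_keep, col_keep)]

# Toy example: 3 UEs, 4 APs; the second AP serves nobody, the second UE is unserved.
A = np.array([[1, 0, 0, 2],
              [0, 0, 0, 0],
              [1, 0, 1, 0]])
A_eff = prune_coefficient_matrix(A)
L_eff, M_eff = A_eff.shape          # L' effective UEs, M' effective APs
# All effective UEs' data are recoverable iff the rank equals L'.
recoverable = (np.linalg.matrix_rank(A_eff) == L_eff)
```

In this toy instance two effective UEs remain and the pruned matrix has full row rank, so both UEs can be recovered from L′ = 2 of the combinations.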

III-B2 Power Optimization

If aml0{a_{ml}}\neq 0, the computed achievable rate for the llth UE at the mmth AP is given as

R(l,m)=log+(Plσpara2(𝐏,𝐠m,𝐚m)){R^{\prime}_{\left({l,m}\right)}}={\log^{+}}\left({\frac{{{P_{l}}}}{{\sigma_{{\rm{para}}}^{2}\left({{\bf{P}},{{\bf{g}}_{m}},{{\bf{a}}_{m}}}\right)}}}\right) (16)

However, when recovering the data of the llth UE, the codeword spacing for that data should tolerate the maximum effective noise variance across APs, whose linear combinations involve that data. Therefore, the actual achievable rate of the llth UE is

Rl=minaml0R(l,m)=minaml0log+(Plσpara2(𝐏,𝐠m,𝐚m)).\!{R_{l}}\!=\!\mathop{\min}\limits_{{a_{ml}}\neq 0}\!{{R^{\prime}}_{\left({l,m}\right)}}\!=\!\mathop{\min}\limits_{{a_{ml}}\neq 0}{\log^{+}}\left({\frac{{{P_{l}}}}{{\sigma_{{\rm{para}}}^{2}\left({{\bf{P}},{{\bf{g}}_{m}},{{\bf{a}}_{m}}}\right)}}}\right). (17)

Hence, the achievable sum-rate of LL^{\prime} UEs is

l=1LRl=minaml0l=1L(log+(Plσpara2(𝐏,𝐠m,𝐚m))).\sum\limits_{l=1}^{L^{\prime}}{{R_{l}}}=\mathop{\min}\limits_{{a_{ml}}\neq 0}\sum\limits_{l=1}^{L^{\prime}}{\left({{{\log}^{+}}\left({\frac{{P_{l}}}{{\sigma_{{\rm{para}}}^{2}\left({{{\bf{P}}},{{\bf{g}}_{m}},{{\bf{a}}_{m}}}\right)}}}\right)}\right)}. (18)
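The chain (14), (17), (18) can be sketched numerically as follows. This is our own illustration with hypothetical helper names; we take log⁺ base 2 as an assumption, and P is the diagonal transmit-power matrix:

```python
import numpy as np

def sigma_para(P_diag, g, a):
    # Effective noise variance (14): a^H (P^{-1} + g g^H)^{-1} a
    inv = np.linalg.inv(np.diag(1.0 / P_diag) + np.outer(g, np.conj(g)))
    return np.real(np.conj(a) @ inv @ a)

def per_ue_rates(P_diag, G, A):
    """Rates (17): each UE tolerates the worst effective noise among its APs.

    G: L x M channels (column m is g_m); A: L x M integer coefficients.
    """
    L, M = A.shape
    noise = [sigma_para(P_diag, G[:, m], A[:, m]) for m in range(M)]
    rates = []
    for l in range(L):
        served = [m for m in range(M) if A[l, m] != 0]
        rates.append(min(max(np.log2(P_diag[l] / noise[m]), 0.0) for m in served))
    return np.array(rates)

P_diag = np.array([4.0, 4.0])
G = np.array([[1.0, 0.2],
              [0.2, 1.0]])
A = np.array([[1, 0],
              [0, 1]])
rates = per_ue_rates(P_diag, G, A)
sum_rate = rates.sum()              # achievable sum-rate (18)
```

The min over APs with aml ≠ 0 implements the worst-case noise tolerance described above; the sum then gives (18).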

Recall that all UEs transmit with equal power in the CF scheme. For fairness, we compare the performance of CF and ECF under the constraint of equal total transmit power. We aim at optimizing the power allocation to maximize the achievable sum-rate under a constraint on the total power consumption PtP_{t}. The optimization problem is formulated as follows:

maximize𝐏\displaystyle\mathop{{\rm{maximize}}}\limits_{\bf{P}} l=1LRl\displaystyle\sum\limits_{l=1}^{L^{\prime}}{{R_{l}}}
subjectto\displaystyle{\rm{subject\,to}} l=1LPl=Pt,\displaystyle\sum\nolimits_{l=1}^{L^{\prime}}{{P_{l}}}={P_{t}},
Pl0,l=1,2,.L.\displaystyle{P_{l}}\geq 0,l=1,2,\ldots.L^{\prime}. (19)

UEs can share a total power budget, which upper-bounds the performance achievable under individual per-UE power constraints, as their total maximum allowable transmit power [29]. Besides, (19) is handled at the CPU since the global information 𝐚1,⋯,𝐚M{{\bf{a}}_{1}},\cdots,{{\bf{a}}_{M}} and 𝐠1,⋯,𝐠M{{\bf{g}}_{1}},\cdots,{{\bf{g}}_{M}} is required.

As mentioned above, each AP decodes l=1Laml𝐱l\sum\nolimits_{l=1}^{L}{{a_{ml}}{{\mathbf{x}}_{l}}} as one regular codeword due to the lattice algebraic structure. All UEs served by the mmth AP need to tolerate the same effective noise. If the integer linear combination l=1Laml𝐱l\sum\nolimits_{l=1}^{L}{{a_{ml}}{{\mathbf{x}}_{l}}} can tolerate the effective noise variance σpara2(𝐏,𝐠m,𝐚m){{{\sigma_{{\rm{para}}}^{2}\left({{{\bf{P}}},{{\bf{g}}_{m}},{{\bf{a}}_{m}}}\right)}}}, then the data of all UEs served by the mmth AP, i.e., those with aml0{a_{ml}}\neq 0, can be successfully recovered from the linear combination with integer coefficient vector 𝐚m{{\bf{a}}_{m}}. In cell-free massive MIMO, we always emphasize a good quality-of-service for all users. However, directly maximizing the achievable sum-rate cannot achieve a good quality-of-service balance among all users [12]. Therefore, minimizing the maximum effective noise variance, which generally improves the achievable rate for most UEs, is more suitable for our model. In other words, we could minimize the maximum effective noise variance as

minimize𝐏maxm=1,,M\displaystyle\mathop{{\rm{minimize}}}\limits_{{{\bf{P}}}}\mathop{\max}\limits_{m=1,\ldots,M^{\prime}} {σpara2(𝐏,𝐠m,𝐚m)}\displaystyle\left\{{{{\sigma_{{\rm{para}}}^{2}\left({{{\bf{P}}},{{\bf{g}}_{m}},{{\bf{a}}_{m}}}\right)}}}\right\}
subjectto\displaystyle{\rm{subject\,to}} l=1LPl=Pt,\displaystyle\sum\nolimits_{l=1}^{L^{\prime}}{{P_{l}}}=P_{t},
Pl0,l=1,2,.L.\displaystyle{P_{l}}\geq 0,\;\;\;l=1,2,\ldots.L^{\prime}. (20)

According to (14), (III-B2) is equivalent to

minimize𝐏maxm=1,,M\displaystyle\mathop{{\rm{minimize}}}\limits_{\bf{P}}\mathop{\max}\limits_{m=1,\ldots,M^{\prime}} {𝐚mH(𝐏1+𝐠m𝐠mH)1𝐚m}\displaystyle\left\{{{\bf{a}}_{m}^{H}{\left({{{{{\bf{P}}}}^{-1}}+{{\bf{g}}_{m}}{\bf{g}}_{m}^{H}}\right)^{-1}}{{\bf{a}}_{m}}}\right\}
subjectto\displaystyle{\rm{subject\,to}} l=1LPl=Pt,\displaystyle\sum\nolimits_{l=1}^{L^{\prime}}{{P_{l}}}={P_{t}},
Pl0,l=1,,L.\displaystyle{P_{l}}\geq 0,\;\;\;l=1,\ldots,L^{\prime}. (21)

According to [12, Lemma B. 4], the matrix inversion can be equivalently represented by

(𝐏1+𝐠m𝐠mH)1=𝐏11+𝐠mH𝐏𝐠m𝐏𝐠m𝐠mH𝐏.{\left({{{{{{\bf{P}}}}}^{-1}}+{{\bf{g}}_{m}}{\bf{g}}_{m}^{H}}\right)^{-1}}={{\bf{P}}}-\frac{1}{{1+{\bf{g}}_{m}^{H}{{\bf{P}}}{{\bf{g}}_{m}}}}{{\bf{P}}}{{\bf{g}}_{m}}{\bf{g}}_{m}^{H}{{\bf{P}}}. (22)

Therefore, the effective noise variance for the mmth AP is

σpara2(𝐏,𝐚m,𝐠m)=𝐚mH𝐏𝐚m𝐚mH𝐏𝐠m𝐠mH𝐏𝐚m1+𝐠mH𝐏𝐠m.\sigma_{{\rm{para}}}^{2}\left({{\bf{P}},{{\bf{a}}_{m}},{{\bf{g}}_{m}}}\right)={\bf{a}}_{m}^{H}{\bf{P}}{{\bf{a}}_{m}}-\frac{{{\bf{a}}_{m}^{H}{\bf{P}}{{\bf{g}}_{m}}{\bf{g}}_{m}^{H}{\bf{P}}{{\bf{a}}_{m}}}}{{1+{\bf{g}}_{m}^{H}{\bf{P}}{{\bf{g}}_{m}}}}. (23)
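The identity (22) (a rank-one matrix-inversion update) and the resulting expression (23) can be checked numerically. The sketch below, on a random complex instance of our own construction, verifies that both sides of (22) agree and that (23) reproduces (14):

```python
import numpy as np

# Random complex instance: diagonal power matrix P, channel g, integer vector a.
rng = np.random.default_rng(0)
L = 4
P = np.diag(rng.uniform(0.5, 2.0, size=L))
g = rng.normal(size=L) + 1j * rng.normal(size=L)
a = np.array([1, -1, 2, 0], dtype=complex)

# (22): direct inversion vs. rank-one update form.
lhs = np.linalg.inv(np.linalg.inv(P) + np.outer(g, np.conj(g)))
rhs = P - (P @ np.outer(g, np.conj(g)) @ P) / (1.0 + np.real(np.conj(g) @ P @ g))

# Effective noise variance via (14) and via (23) must coincide.
sigma_14 = np.real(np.conj(a) @ lhs @ a)
sigma_23 = np.real(np.conj(a) @ P @ a
                   - (np.conj(a) @ P @ g) * (np.conj(g) @ P @ a)
                   / (1.0 + np.conj(g) @ P @ g))
```

This confirms term-by-term that (23) is simply (14) with the inverse expanded by (22), which avoids an explicit L′ × L′ inversion in the search.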

As a result, (III-B2) can be rewritten as

minimize𝐏maxm=1,,M\displaystyle\mathop{{\rm{minimize}}}\limits_{\bf{P}}\mathop{\max}\limits_{m=1,\ldots,M^{\prime}} {𝐚mH𝐏𝐚m𝐚mH𝐏𝐠m𝐠mH𝐏𝐚m1+𝐠mH𝐏𝐠m}\displaystyle{\left\{{{\bf{a}}_{m}^{H}{\bf{P}}{{\bf{a}}_{m}}-\frac{{{\bf{a}}_{m}^{H}{\bf{P}}{{\bf{g}}_{m}}{\bf{g}}_{m}^{H}{\bf{P}}{{\bf{a}}_{m}}}}{{1+{\bf{g}}_{m}^{H}{\bf{P}}{{\bf{g}}_{m}}}}}\right\}}
subjectto\displaystyle{\rm{subject\,to}} l=1LPl=Pt,\displaystyle\sum\limits_{l=1}^{L^{\prime}}{{P_{l}}}={P_{t}},
Pl0,l=1,,L.\displaystyle{P_{l}}\geq 0,\;\;\;l=1,\ldots,L^{\prime}. (24)

However, (III-B2) is NP-hard. To tackle this challenge, we introduce three auxiliary variables rr, ss, and tt and build the following optimization problem:

{min𝐏,r,sts.t.trs,m,r𝐚mH𝐏𝐚m,m,s𝐚mH𝐏𝐠m21+𝐠mH𝐏𝐠m,m,l=1LPl=Pt,Pl0,l.\left\{\begin{aligned} \mathop{\min}\limits_{{\bf{P}},r,s}\;\;\;&t\\ {\rm{s}}{\rm{.t}}{\rm{.}}\;\;\;&t\geq{r}-{s},\;\;\;\forall m,\\ &{r}\geq{\bf{a}}_{m}^{H}{\bf{P}}{{\bf{a}}_{m}},\;\;\;\forall m,\\ &{s}\leq\frac{{{{\left\|{{\bf{a}}_{m}^{H}{\bf{P}}{{\bf{g}}_{m}}}\right\|}^{2}}}}{{1+{\bf{g}}_{m}^{H}{\bf{P}}{{\bf{g}}_{m}}}},\;\;\;\forall m,\\ &\sum\limits_{l=1}^{L^{\prime}}{{P_{l}}}={P_{t}},\\ &{P_{l}}\geq 0,\;\;\;\forall l.\end{aligned}\right. (25)

In particular, the variables rr and ss have limited search spaces. For a given value of rr, the variable ss should be smaller than rr. Therefore, we employ a two-dimensional brute-force search over these two scalars. For fixed rr and ss, the optimization problem in (25) can be rewritten as the feasibility problem

{𝐚mH𝐏𝐚mr,m,(1/2)𝐩T𝐉𝐩s𝐯𝐩s0,m,l=1LPl=Pt,Pl0,l,\left\{\begin{aligned} &{\bf{a}}_{m}^{H}{\bf{P}}{{\bf{a}}_{m}}\leq r,\;\;\;\forall m,\\ &\left({1/2}\right){{\bf{p}}^{T}}{\bf{Jp}}-s{{\bf{v}}}{\bf{p}}-{{s}}\geq{{0}},\;\;\;\forall m,\\ &\sum\limits_{l=1}^{L^{\prime}}{{P_{l}}}={P_{t}},\\ &{P_{l}}\geq 0,\;\;\;\forall l,\end{aligned}\right. (26)

where 𝐩=[P1,…,PL]T{\bf{p}}={\left[{{P_{1}},\ldots,{P_{L^{\prime}}}}\right]^{T}} and 𝐯=[|gm1|2,…,|gmL|2]T{\bf{v}}={\left[{{{\left|{{g_{m1}}}\right|}^{2}},\ldots,{{\left|{{g_{mL^{\prime}}}}\right|}^{2}}}\right]^{T}}. 𝐉{\bf{J}} is a L×LL^{\prime}\times L^{\prime} matrix whose (l1,l2){\left({{l_{1}},{l_{2}}}\right)}th element is given as 𝐉(l1,l2)=2|aml1||gml1||aml2||gml2|{{\bf{J}}_{\left({{l_{1}},{l_{2}}}\right)}}=2{\left|{{a_{m{l_{1}}}}}\right|\left|{{g_{m{l_{1}}}}}\right|\left|{{a_{m{l_{2}}}}}\right|\left|{{g_{m{l_{2}}}}}\right|}. Clearly, 𝐉{\bf{J}} is a positive semi-definite matrix. Consequently, (26) can be solved efficiently by performing a brute-force search over the two scalars, where a feasibility problem needs to be solved in each step. Since transforming the nonconvex problem into an equivalent convex form is quite difficult, if not impossible, an off-the-shelf optimization solver, e.g., fmincon in Matlab, is adopted to obtain a suboptimal solution. Besides, simulations show that solving the nonconvex problem brings an obvious performance improvement, while the computational cost remains tolerable. Note that (III-B2) is equivalent to (25) only if the two terms in the objective of (III-B2) are independent; however, at the optimal point, we only care about minimizing the final maximum term. More specifically, Algorithm 1 solves (25). The parameters in Step 1, e.g., rmin{r_{\min}} and rmax{r_{\max}}, can be determined by solving the following two optimization problems:

{max𝐏𝐚mH𝐏𝐚ms.t.l=1LPl=Pt,Pl0,l,{min𝐏𝐚mH𝐏𝐚ms.t.l=1LPl=Pt,Pl0,l,\begin{cases}\begin{array}[]{l}\mathop{\max}\limits_{\bf{P}}\;\;\;{\bf{a}}_{m}^{H}{\bf{P}}{{\bf{a}}_{m}}\\ {\rm{s}}{\rm{.t}}{\rm{.}}\;\;\;\sum\limits_{l=1}^{L^{\prime}}{{P_{l}}}={P_{t}},\\ \;\;\;\;\;\;\;{P_{l}}\geq 0,\;\;\;\forall l,\end{array}\end{cases}\quad\begin{cases}\begin{array}[]{l}\mathop{\min}\limits_{\bf{P}}\;\;\;{\bf{a}}_{m}^{H}{\bf{P}}{{\bf{a}}_{m}}\\ {\rm{s}}{\rm{.t}}{\rm{.}}\;\;\;\sum\limits_{l=1}^{L^{\prime}}{{P_{l}}}={P_{t}},\\ \;\;\;\;\;\;{P_{l}}\geq 0,\;\;\;\forall l,\end{array}\end{cases} (27)

respectively. When the search is completed, all APs achieve the same minimal effective noise variance tt. The corresponding values of rr and ss are denoted by roptr_{\text{opt}} and sopts_{\text{opt}}. Finally, utilizing (18) and (23), the achievable sum-rate can be obtained. In Algorithm 1, there are at most

\left\lfloor {\frac{{\left( {2{r_{\max }} - {r_{sl}}\left\lfloor {\frac{{{r_{\max }} - {r_{\min }}}}{{{r_{sl}}}} - 1} \right\rfloor } \right)\left\lfloor {\frac{{{r_{\max }} - {r_{\min }}}}{{{r_{sl}}}}} \right\rfloor }}{{2{s_{sl}}}}} \right\rfloor (28)

feasibility problems that need to be solved, where rsl{r_{sl}} and ssl{s_{sl}} denote the step sizes for searching rr and ss, respectively.

Algorithm 1 Brute Force Search on Two Scalars for Solving Problem (25)
1:Initialization: Define the range of the values of rr by rmin{r_{\min}} and rmax{r_{\max}}. Choose step size for rr and ss as rslr_{sl} and ssls_{sl}, respectively. Set t=t=\infty and r=rmaxr={r_{\max}}.
2:Set s=rssls=r-s_{sl}. If s0s\geq 0, solve the feasibility problem in (26); else go to Step 4.
3:If (26) is feasible, set t=min(t,rs)t=\min\left({t,r-s}\right); else return to Step 2.
4:Update rr with r=rrslr=r-r_{sl}. Stop if rrminr\leq{r_{\min}}.
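The counting argument behind (28) can be sketched as follows. This is our own illustration with hypothetical function names: the actual feasibility solves are abstracted away, and only the (r, s) grid points visited by the two-scalar search are counted, then compared against the bound (28). Integer parameters keep the floor arithmetic exact:

```python
def search_count(r_max, r_min, r_sl, s_sl):
    """Count (r, s) pairs visited by the two-scalar search of Algorithm 1."""
    count = 0
    r = r_max
    while True:
        s = r - s_sl
        while s >= 0:
            count += 1          # one feasibility problem (26) per (r, s) pair
            s -= s_sl
        r -= r_sl
        if r <= r_min:
            break
    return count

def count_bound(r_max, r_min, r_sl, s_sl):
    """Upper bound (28) on the number of feasibility problems (integer inputs)."""
    n = (r_max - r_min) // r_sl
    return (n * (2 * r_max - r_sl * (n - 1))) // (2 * s_sl)
```

For instance, with r_max = 10, r_min = 2 and unit step sizes, the search visits the arithmetic-series total of pairs, matching the bound.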

III-B3 AP Selection

As mentioned above, recovering LL^{\prime} UEs’ original data only requires LL^{\prime} integer linear combinations. Therefore, we propose a low-complexity AP selection algorithm for two purposes. First, only LL^{\prime} effective noise variances participate in the brute-force search, which reduces the computational complexity. Second, the noise tolerance on UEs’ data can be relaxed, which improves the achievable sum-rate. Recall that (10) restricts the maximum value in the coefficient vector 𝐚m{{{\bf{a}}_{m}}} and the search space is generally small. Hence, for different APs, it is the difference in the second term of (23) that leads to a significant deviation in the effective noise variance.

Note that the average channel gain is -70 dB, while the noise power is -130 dBW. Therefore, the denominator of that term is close to 1 and is several orders of magnitude smaller than the numerator. Consequently, the main factor that affects the effective noise variance across different APs is the sum of all elements in 𝐉{\bf{J}}. In other words, decoding the estimates 𝐮^m{{{\bf{\hat{u}}}}_{m}} of APs with a high sum value of 𝐉{\bf{J}} will yield a higher achievable sum-rate. Therefore, we prefer selecting APs with a high sum value of 𝐉{\bf{J}}. Furthermore, according to (1), gmlg_{ml} is a function of βml\beta_{ml}. We thus propose an AP selection algorithm based on the large-scale fading coefficient βml{\beta_{ml}}, which generally stays constant over several coherence intervals.

We construct the matrix 𝐉{\bf{J}} for each AP by replacing the channel coefficient gmlg_{ml} with βml{\beta_{ml}}. For each 𝐉{\bf{J}}, we first sum all the elements and sort the sum values in ascending order. Then, we apply the greedy AP selection for message recovery stated in [17, Algorithm 1]. Compared with the AP selection in [17], we sort the APs by first replacing the channel coefficient gmlg_{ml} with βml\beta_{ml} and then calculating the sum of all elements in the matrix 𝐉{\bf{J}}, which is independent of the power allocation. Following [17], we check the columns of 𝐀{\bf{A}} one by one, where 𝐀=[𝐚1,𝐚M]{\bf{A}}={\left[{{\bf{a}}_{1}\ldots,{\bf{a}}_{M}}\right]}, until the rank requirement is satisfied; therefore, the computational complexity of the proposed AP selection method is no more than 𝒪(M+Mlog2(M)+M(M1)3){\cal O}\left({M^{\prime}+M^{\prime}{{\log}_{2}}\left({M^{\prime}}\right)+M^{\prime}{{\left({M^{\prime}-1}\right)}^{3}}}\right).
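The selection can be sketched as below. This is our own simplified rendering (the helper name `select_aps` is hypothetical, and the greedy rank check stands in for [17, Algorithm 1]): each AP is scored by the element-sum of its β-based 𝐉, APs are visited in order of preference for a high sum value of 𝐉, and APs are added until the coefficient vectors reach rank L′:

```python
import numpy as np

def select_aps(A, beta):
    """Greedy AP selection from large-scale fading (sketch).

    A:    L x M integer coefficient matrix [a_1, ..., a_M].
    beta: L x M large-scale fading coefficients, standing in for g_ml.
    """
    L, M = A.shape
    scores = np.empty(M)
    for m in range(M):
        w = np.abs(A[:, m]) * beta[:, m]           # |a_ml| * beta_ml
        scores[m] = (2.0 * np.outer(w, w)).sum()   # sum of all elements of J
    chosen, rank = [], 0
    for m in np.argsort(-scores):                  # high sum value of J first
        cand = chosen + [int(m)]
        r = np.linalg.matrix_rank(A[:, cand])
        if r > rank:                               # keep AP only if rank grows
            chosen, rank = cand, r
        if rank == L:
            break
    return chosen

A = np.array([[1, 0, 1, 1],
              [0, 1, 1, 0]])
beta = np.array([[1.0, 0.1, 0.8, 0.2],
                 [0.1, 1.0, 0.8, 0.2]])
selected = select_aps(A, beta)
```

In this toy instance the AP with the largest 𝐉-sum is picked first, and the procedure stops as soon as L′ = 2 linearly independent combinations are available.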

III-C Successive Computation

It is beneficial to remove the codewords that have been decoded successfully from the channel observation, so that subsequent decoding stages encounter less interference. This well-known technique is referred to as successive interference cancellation (SIC). In the ECF framework, we apply an analog of SIC for cell-free massive MIMO, named successive computation, which can be viewed as the combination of ECF and successive interference cancellation. Compared with parallel computation, successive computation reduces the effective noise variance and the number of users that need to tolerate that effective noise in each decoding step [22]. Hence, it can further improve the system performance. The SIC technique is also applied in [30], which proposes a hybrid deep reinforcement learning (DRL) model to design an IUI-aware receive diversity combining scheme. Compared with [30], our successive computation scheme benefits from the nested lattice coding strategy, which can effectively reduce the fronthaul load. In this subsection, the computation rate region for successive computation is first introduced. Since the decoding orders of the integer linear combinations 𝐮~m{{{\bf{\tilde{u}}}}_{m}} and of the UEs both have a significant impact on the performance of successive computation, we then present different methods to find sub-optimal decoding orders with power control.

III-C1 Computation Rate Region

In successive computation, AP mm sends the received signal 𝐲m{\bf{y}}_{m} and the integer linear combinations of codewords 𝐚mT𝐗{\bf{a}}_{m}^{T}{\bf{X}} to the CPU, instead of decoding the received signal 𝐲m{{\bf{y}}_{m}} into the combinations of UEs’ original data 𝐮^m{{{\bf{\hat{u}}}}_{m}} 444When applying successive computation, the interference cancellation procedure takes place at the decoder equipped at the CPU. Note that the signaling exchanges occurring per coherence interval and the fronthaul load are 2Mn2M^{\prime}n and 4Mn4M^{\prime}n with parallel computation and successive computation, respectively. Besides, the data of the UEs are conveyed from the CPU to the APs through fronthaul links and then distributed to the UEs with precoding.. Defining 𝐀m1=Δ[𝐚1,…,𝐚m1]T{{\bf{A}}_{m-1}}\buildrel\Delta\over{=}{\left[{{{\bf{a}}_{1}},\ldots,{{\bf{a}}_{m-1}}}\right]^{T}}, the CPU applies an equalization factor bmb_{m} and a vector 𝐜{\bf{c}} to the mmth combination as

𝐲~m=bm𝐲m+𝐗T𝐀m1𝐜\displaystyle{{{\bf{\tilde{y}}}}_{m}}={b_{m}}{{\bf{y}}_{m}}+{{\bf{X}}^{T}}{{\bf{A}}_{m-1}}{\bf{c}}
=𝐗T𝐚m+𝐗T(bm𝐠m+𝐀m1𝐜𝐚m)+bm𝐳meffectivenoise.\displaystyle={{\bf{X}}^{T}}{{\bf{a}}_{m}}+\underbrace{{{\bf{X}}^{T}}\left({{b_{m}}{{\bf{g}}_{m}}+{{\bf{A}}_{m-1}}{\bf{c}}-{{\bf{a}}_{m}}}\right)+{b_{m}}{{\bf{z}}_{m}}}_{{\rm{effective\;noise}}}. (29)

where 𝐜=[c1,…,cm1]T{\bf{c}}={\left[{{c_{1}},\ldots,{c_{m-1}}}\right]^{T}}. (III-C1) shows that the decoded linear combinations {𝐲1,…,𝐲m1}\left\{{{{\bf{y}}}_{1}},\ldots,{{{\bf{y}}}_{m-1}}\right\} can be used as side information to reduce the effective noise experienced by other UEs in later decoding stages.

After choosing bmb_{m} and 𝐜{\bf{c}} to be the MMSE projection scalar and vector, respectively, the minimum effective noise variance with transmit power matrix 𝐏{\bf{P}} is given by

σsucc2(𝐠m,𝐚m,𝐏|𝐀m1)=Δ𝐚mH𝐅T𝐍m1𝐅𝐚m,\sigma_{{\rm{succ}}}^{2}\left({{{\bf{g}}_{m}},{{\bf{a}}_{m}},\left.{\bf{P}}\right|{{\bf{A}}_{m-1}}}\right)\buildrel\Delta\over{=}{\bf{a}}_{m}^{H}{{\bf{F}}^{T}}{{\bf{N}}_{m-1}}{\bf{F}}{{\bf{a}}_{m}}, (30)

where

𝐅T𝐅=(𝐏1+𝐠m𝐠mH)1,\displaystyle{{\bf{F}}^{T}}{\bf{F}}={\left({{{\bf{P}}^{-1}}+{{\bf{g}}_{m}}{\bf{g}}_{m}^{H}}\right)^{-1}}, (31)

and

𝐍m1=𝐈𝐅𝐀m1T(𝐀m1𝐅T𝐅𝐀m1T)1𝐀m1𝐅T.\displaystyle{{\bf{N}}_{m-1}}\!=\!{\bf{I}}\!-\!{\bf{FA}}_{m-1}^{T}{\left({{{\bf{A}}_{m-1}}{{\bf{F}}^{T}}{\bf{FA}}_{m-1}^{T}}\right)^{-1}}{{\bf{A}}_{m-1}}{{\bf{F}}^{T}}. (32)

Then, the computation rate region for successive computation is given as

succ(𝐏,𝐠m,𝐚m|𝐀m1)=Δ{(R1,…,RL)+L:\displaystyle{\mathcal{R}_{{\rm{succ}}}}\left({{\bf{P}},{{\bf{g}}_{m}},\left.{{{\bf{a}}_{m}}}\right|{{\bf{A}}_{m-1}}}\right)\buildrel\Delta\over{=}\Bigg{\{}{\left({{R_{1}},\ldots,{R_{L^{\prime}}}}\right)\in\mathbb{R}_{+}^{L^{\prime}}:}
Rllog+(Plσsucc2(𝐏,𝐠m,𝐚m|𝐀m1))(m,l)s.t.aml0}.\displaystyle{{R_{l}}\!\leq\!{{\log}^{+}}\!\left(\!{\frac{{{P_{l}}}}{{\sigma_{{\rm{succ}}}^{2}\left({{\bf{P}},{{\bf{g}}_{m}},\left.{{{\bf{a}}_{m}}}\right|{{\bf{A}}_{m-1}}}\right)}}}\!\right)\forall\!\left({m,l}\right)\;{\text{s.t.}}\;{a_{ml}}\!\neq\!0}\!\Bigg{\}}. (33)
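The structure of (30)–(32) can be verified numerically: since 𝐍m−1 in (32) is an orthogonal projector, side information can only reduce the effective noise relative to the parallel-computation value (14). The sketch below is our own real-valued illustration (so conjugate transposes reduce to plain transposes), with F obtained from a Cholesky factorization satisfying (31):

```python
import numpy as np

# Fixed real-valued instance: power matrix P, channel g, coefficients a,
# and A_{m-1} holding two previously decoded coefficient vectors.
P = np.diag([1.0, 0.5, 2.0, 1.5])
g = np.array([0.9, -0.3, 0.5, 0.1])
a = np.array([1.0, -1.0, 2.0, 0.0])
A_prev = np.array([[1.0, 0.0, 1.0, 0.0],
                   [0.0, 1.0, 0.0, 1.0]])

Minv = np.linalg.inv(np.linalg.inv(P) + np.outer(g, g))
F = np.linalg.cholesky(Minv).T                 # F^T F = (P^{-1} + g g^T)^{-1}, cf. (31)

B = F @ A_prev.T
N = np.eye(4) - B @ np.linalg.inv(B.T @ B) @ B.T   # projector of (32)

sigma_succ = a @ F.T @ N @ F @ a               # (30): with side information
sigma_para = a @ Minv @ a                      # (14): no side information
```

Because N projects onto the orthogonal complement of the columns of F A_{m-1}^T, sigma_succ never exceeds sigma_para, which is exactly the gain of successive computation claimed above.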

III-C2 Searching Decoding Order for Combinations

Among the MM^{\prime} APs, we first select the effective UEs and APs that participate in the uplink transmission. Then, we find candidate integer combinations with small effective noise utilizing the AP selection algorithm; the detailed procedure was introduced in Section III-B3. According to (III-C1), the effective noise variance depends on the decoding order of the integer linear combinations. Therefore, we propose an efficient method for determining the side information matrix 𝐀m1{{\bf{A}}_{m-1}} with power control.

The main idea of successive computation is to utilize the side information and the decoded integer linear combinations to reduce the effective noise. Note that the integer linear combination decoded first does not have any side information to exploit. Hence, the effective noise expression for the first decoding step is similar to that of parallel computation, which is given by

σsucc,δ(1)2(𝐏,𝐠δ(1),𝐚δ(1))\displaystyle\sigma_{{\rm{succ}},\delta\left(1\right)}^{2}\left({{\bf{P}},{{\bf{g}}_{\delta\left(1\right)}},{{\bf{a}}_{\delta\left(1\right)}}}\right)
=𝐚δ(1)H(𝐏1+𝐠δ(1)𝐠δ(1)H)1𝐚δ(1).\displaystyle={\bf{a}}_{\delta\left(1\right)}^{H}{\left({{{\bf{P}}^{-1}}+{{\bf{g}}_{\delta\left(1\right)}}{\bf{g}}_{\delta\left(1\right)}^{H}}\right)^{-1}}{{\bf{a}}_{\delta\left(1\right)}}. (34)

where δ(m)\delta\left(m\right) denotes the decoding order. As the remaining combinations can reduce their effective noise with the help of side information, we select the integer linear combination with the minimum effective noise to decode first. To determine 𝐚δ(1)T𝐗{\bf{a}}_{\delta\left(1\right)}^{T}{\bf{X}}, we solve the following optimization problem to obtain a local sub-optimal power allocation. As problem (III-C2) is non-convex and transforming it into a convex form is quite difficult, an off-the-shelf optimization solver, e.g., fmincon in Matlab, is adopted to obtain a suboptimal solution. Note that although the obtained solution is suboptimal, the performance of the proposed framework is still superior to that of CF and MRC, which will be verified in the simulation section. Specifically, for each 𝐚mT𝐗{\bf{a}}_{m}^{T}{\bf{X}}, m=1,…,Lm=1,\ldots,L^{\prime}, we have

min𝐏\displaystyle\mathop{\min}\limits_{\bf{P}}\,\, σsucc,δ(1)2(𝐏,𝐠m,𝐚m)\displaystyle\sigma_{{\rm{succ}},\delta\left(1\right)}^{2}\left({{\bf{P}},{{\bf{g}}_{m}},{{\bf{a}}_{m}}}\right)
s.t.\displaystyle{\rm{s}}{\rm{.t}}{\rm{.}}\,\, l=1LPl=Pt,\displaystyle\sum\nolimits_{l=1}^{L^{\prime}}{{P_{l}}}={P_{t}},
Pl0,l=1,,L.\displaystyle{P_{l}}\geq 0,l=1,\ldots,L^{\prime}. (35)

Then, we calculate the effective noise σsucc,δ(1)2\sigma_{{\rm{succ}},\delta\left(1\right)}^{2} for each 𝐚mT𝐗{\bf{a}}_{m}^{T}{\bf{X}} with its own power allocation matrix and select the combination which has the minimal effective noise as 𝐚δ(1)T𝐗{\bf{a}}_{\delta\left(1\right)}^{T}{\bf{X}}.

After determining 𝐚δ(1)T𝐗{\bf{a}}_{\delta\left(1\right)}^{T}{\bf{X}}, we begin to determine the remaining decoding order. For the mmth step, we determine 𝐚δ(m)T𝐗{{\bf{a}}^{T}_{\delta\left(m\right)}}{\bf{X}} depending on the effective noise and the rank of the side information matrix. More specifically, we first calculate the effective noise variance for each of the remaining integer linear combinations according to (30) and sort them in ascending order. Then, following this order, we add in each turn the corresponding coefficient vector to the side information matrix 𝐀m1{{\bf{A}}_{m-1}}, which is known from steps 1 through m1m-1, to form 𝐀m=[𝐀m1;𝐚mT]{{\bf{A}}_{m}}=\left[{{{\bf{A}}_{m-1}};{\bf{a}}_{m^{\prime}}^{T}}\right], m=1,…,Mm+1m^{\prime}=1,\ldots,M^{\prime}-m+1. Finally, we select the integer linear combination which meets the constraint

Rank(𝐀m)=m,{\rm{Rank}}\left({{{\bf{A}}_{m}}}\right)=m, (36)

to update the side information matrix. The procedure for finding 𝐚δ(m)T𝐗{{\bf{a}}_{\delta\left(m\right)}^{T}}{\bf{X}} terminates when all integer linear combinations have been assigned decoding orders. The detailed procedure for searching the decoding order of combinations in successive computation is summarized in Algorithm 2. To determine 𝐚δ(1)T𝐗{\bf{a}}_{\delta\left(1\right)}^{T}{\bf{X}}, we need to solve the optimization problem in (III-C2) LL^{\prime} times. Then, the complexity of searching the decoding order for the remaining L1L^{\prime}-1 combinations, in terms of the number of complex multiplications, is

m=2L(Lm+2)(Lm+1)2[(L)3L3+(m1)3L\displaystyle\sum\limits_{m=2}^{L^{\prime}}{\frac{{\left({L^{\prime}-m+2}\right)\left({L^{\prime}-m+1}\right)}}{2}}{\left[{\frac{{{{\left({L^{\prime}}\right)}^{3}}-L^{\prime}}}{3}}+{{\left({m-1}\right)}^{3}}L^{\prime}\right.}
+(m1)2L+(m1)L2+(m1)(L)2+2(L)2].\displaystyle{\left.{+\frac{{{{\left({m-1}\right)}^{2}}L^{\prime}\!\!+\!\!\left({m-1}\right)L^{\prime}}}{2}\!\!+\!\!\left({m-1}\right){{\left({L^{\prime}}\right)}^{2}}\!\!+\!\!2{{\left({L^{\prime}}\right)}^{2}}}\right]}. (37)
Algorithm 2 Searching Decoding Order for Combinations in Successive Computation
1:Input: LL^{\prime} candidate integer linear combinations and the corresponding coefficient vectors which can be obtained through AP selection.
2:Initialization: m=2m=2, n=1n=1.
3:Solve the optimization problem (III-C2) for all LL^{\prime} combinations and calculate the effective noise variance for each combination using (III-C2). Select the combination which has the minimal effective noise as 𝐚δ(1)T𝐗{\bf{a}}_{\delta\left(1\right)}^{T}{\bf{X}} and remove the corresponding coefficient vector from the candidate set. The side information is obtained with 𝐀1=[𝐚δ(1)T]{{\bf{A}}_{1}}=\left[{\bf{a}}_{\delta\left(1\right)}^{T}\right]. The power constraint matrix 𝐏{\bf{P}} is determined with the power allocation for 𝐚δ(1)T𝐗{\bf{a}}_{\delta\left(1\right)}^{T}{\bf{X}}.
4:Calculate the effective noise variance for each of the remaining combinations based on 𝐀m1{{\bf{A}}_{m-1}} using (30) and sort them in ascending order. Update 𝐀m{{\bf{A}}_{m}} with 𝐀m=[𝐀m1;𝐚nT]{{\bf{A}}_{m}}=\left[{{\bf{A}}_{m-1}};{{\bf{a}}_{n}^{T}}\right]. If Rank(𝐀m)=m{\rm{Rank}}\left({{{\bf{A}}_{m}}}\right)=m, remove 𝐚n{{\bf{a}}_{n}} from the candidate set to update the side information matrix and set m:=m+1,n=1m:=m+1,n=1; else set n:=n+1n:=n+1.
5:Stop if Rank(𝐀L)=L{\rm{Rank}}\left({{{\bf{A}}_{L}^{\prime}}}\right)=L^{\prime}; otherwise, return to Step 4.
6:Output: 𝐀L{{{\bf{A}}_{L}^{\prime}}}.
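Algorithm 2 can be sketched as follows. This is our own simplified rendering (the name `decoding_order` is hypothetical): the per-combination power control of (III-C2) is omitted and P is held fixed, but the two essential ingredients survive, i.e., picking the minimal-noise combination first and then, at each step, the lowest-noise candidate that satisfies the rank constraint (36):

```python
import numpy as np

def decoding_order(A_cand, P_diag, G):
    """Order L' candidate combinations (sketch of Algorithm 2, fixed P)."""
    Lp = A_cand.shape[1]

    def noise(j, A_side):
        Minv = np.linalg.inv(np.diag(1.0 / P_diag)
                             + np.outer(G[:, j], G[:, j]))
        if A_side is None:                    # first step: no side information
            return A_cand[:, j] @ Minv @ A_cand[:, j]
        F = np.linalg.cholesky(Minv).T        # F^T F = Minv, cf. (31)
        B = F @ A_side.T
        N = np.eye(Lp) - B @ np.linalg.pinv(B.T @ B) @ B.T   # (32)
        v = F @ A_cand[:, j]
        return v @ N @ v                      # (30)

    remaining = list(range(Lp))
    first = min(remaining, key=lambda j: noise(j, None))
    order = [first]
    A_side = A_cand[:, [first]].astype(float).T
    remaining.remove(first)
    while remaining:
        # ascending effective noise; take the first that raises the rank (36)
        for j in sorted(remaining, key=lambda j: noise(j, A_side)):
            A_new = np.vstack([A_side, A_cand[:, j]])
            if np.linalg.matrix_rank(A_new) == A_new.shape[0]:
                order.append(j)
                A_side = A_new
                remaining.remove(j)
                break
        else:
            break                             # no rank-increasing candidate left
    return order

A_cand = np.array([[1, 0, 1],
                   [0, 1, 1],
                   [1, 1, 0]])
P_diag = np.array([1.0, 1.0, 1.0])
G = np.array([[1.0, 0.2, 0.5],
              [0.2, 1.0, 0.5],
              [0.3, 0.3, 1.0]])
order = decoding_order(A_cand, P_diag, G)
```

Since the toy coefficient matrix has full rank, every combination is eventually assigned a position, and the output is a permutation of the L′ candidates.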

III-C3 Searching Decoding Order for UEs

In successive computation, the effective noise of UEs whose data are decoded in later decoding stages can be reduced. At the mmth decoding step, it is possible to use the side information matrix 𝐀m1{{\bf{A}}_{m-1}} to cancel some known individual codewords and remove them from the integer linear combination 𝐚δ(m)T𝐗{\bf{a}}_{\delta\left(m\right)}^{T}{\bf{X}} without changing the effective noise variance. In particular, if the llth UE’s data has been recovered at the mmth step, the data only need to tolerate the maximum effective noise among {𝐚δ(1)T𝐗,…,𝐚δ(m)T𝐗}\left\{{{\bf{a}}_{\delta\left(1\right)}^{T}{\bf{X}},\ldots,{\bf{a}}_{\delta\left(m\right)}^{T}{\bf{X}}}\right\}. Therefore, the decoding order of UEs also affects the achievable sum-rate. Searching the decoding order with SIC has been studied in several works [31], [32]. However, our problem is generally intractable, and obtaining the optimal solution via convex optimization is not possible. Therefore, we propose several methods to determine the decoding order of UEs and select the best one as a suboptimal solution.

Received-Power-Based Algorithm

In conventional successive interference cancellation, the decoding order of UEs is determined by their received power. We calculate the received power of the llth UE with respect to the mmth integer linear combination as Pr,(l,m)=Plgml2{P_{r,\left({l,m}\right)}}={P_{l}}{\left\|{{g_{ml}}}\right\|^{2}}. Then, we can obtain the achievable rate of each UE with 𝐚δ(m)T𝐗{\bf{a}}_{\delta\left(m\right)}^{T}{\bf{X}} as

Rδ(m),l=log+(Plσsucc,δ(m)2(𝐏,𝐠δ(m),𝐚δ(m)|𝐀m1)),aδ(m),l0,{R_{{\delta\left(m\right)},l}}={\log^{+}}\left({\frac{{{P_{l}}}}{{\sigma_{{\rm{succ}},\delta\left(m\right)}^{2}\left({{\bf{P}},{{\bf{g}}_{\delta\left(m\right)}},\left.{{{\bf{a}}_{\delta\left(m\right)}}}\right|{{\bf{A}}_{m-1}}}\right)}}}\right),\;{a_{{\delta\left(m\right)},l}}\neq 0, (38)

and sort the UEs in descending order of received power. The UE whose rate with respect to the combination ranks first in this order is decoded at the δ(m)\delta\left(m\right)th step.
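The received-power ordering can be sketched in a few lines (our own illustration; the helper name `received_power_order` is hypothetical):

```python
def received_power_order(P, g_m, a_m):
    """UEs involved in a_m^T X, sorted by received power P_l * |g_ml|^2 (descending)."""
    ues = [l for l in range(len(a_m)) if a_m[l] != 0]
    return sorted(ues, key=lambda l: P[l] * abs(g_m[l]) ** 2, reverse=True)

# Three UEs with received powers 0.25, 0.02, and 1.0: UE 2 is decoded first.
order = received_power_order([1.0, 2.0, 1.0], [0.5, 0.1, 1.0], [1, 1, 1])
```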

Channel-Coefficient-Based Algorithm

As stated in Section III-B3, a better channel condition leads to a smaller effective noise variance. Therefore, UEs with good channel conditions contribute to the small effective noise of the selected APs. These UEs can be decoded first to relax their effective noise tolerance and thus achieve a larger rate. For the llth UE, let us define 𝐠l{{\bf{g}}_{l}} as the channel coefficients associated with the LL^{\prime} integer linear combinations, 𝐠l=[g1l,…,gLl]T{{\bf{g}}_{l}}={\left[{{g_{1l}},\ldots,{g_{L^{\prime}l}}}\right]}^{T}. We calculate the 2-norm of 𝐠l{{\bf{g}}_{l}} for all UEs and sort them in descending order.

Hungarian Algorithm

With the effective noise variances of the LL^{\prime} integer linear combinations and the power allocation for the LL^{\prime} UEs, we can find the assignment of each UE with PlP_{l}. It has been shown in [33] that the Hungarian algorithm may be the best solution to this combinatorial optimization problem. Therefore, we first construct a L×LL^{\prime}\times L^{\prime} matrix 𝐂{\bf{C}}, whose element in the llth row and mmth column represents the achievable rate of the llth UE with the mmth integer linear combination from (38), i.e., Cml=RmlC_{ml}=R_{ml}. The conventional Hungarian algorithm finds LL^{\prime} elements located in different rows and columns of 𝐂{\bf{C}} whose sum is minimized. However, we need to maximize the achievable sum-rate. Therefore, we first find the maximum value Cmax{C_{\max}} in 𝐂{\bf{C}} and replace each element with Cml=CmlCmax{C_{ml}}={C_{ml}}-{C_{\max}}. The detailed procedure of the Hungarian algorithm is summarized in Algorithm 3.

Algorithm 3 Hungarian Algorithm for Finding Optimal Decoding Order of UEs
1:Perform row operations on 𝐂{\bf{C}}. The minimum element of each row is selected and is subtracted from each element in that row.
2:Repeat the procedure stated in Step 1 for all columns.
3:Count the minimum number of rows and columns that cover all zeros. Test the optimality. If the number of counted lines is equal to LL^{\prime}, i.e., Nl=L{N_{l}}=L^{\prime}, stop the procedure.
4:Find the minimum value that is not covered in lines and add that to intersection points. Subtract that minimum value from elements that are not covered by counted lines.
5:Repeat Step 3 to check the optimality condition. If Nl<L{N_{l}}<L^{\prime}, repeat Step 4.
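Algorithm 3 seeks a maximum-sum assignment of UEs to combinations. For small L′ the same optimum can be obtained by exhaustive permutation search, which we use below as a stdlib-only reference sketch (our own illustration, not the Hungarian procedure itself; SciPy's `linear_sum_assignment` is a standard O(L′³) alternative):

```python
from itertools import permutations

def best_assignment(C):
    """Maximum-sum assignment of UEs (rows) to combinations (columns).

    Exhaustive reference for small L'; the Hungarian algorithm of
    Algorithm 3 returns the same optimum in O(L'^3) time.
    """
    Lp = len(C)
    best_sum, best_perm = float("-inf"), None
    for perm in permutations(range(Lp)):
        s = sum(C[l][perm[l]] for l in range(Lp))
        if s > best_sum:
            best_sum, best_perm = s, perm
    return best_sum, best_perm

# Toy rate matrix: C[l][m] = achievable rate of UE l with combination m.
C = [[4.0, 1.0, 3.0],
     [2.0, 0.0, 5.0],
     [3.0, 2.0, 2.0]]
best_sum, best_perm = best_assignment(C)
```

Here UE 0 is assigned combination 0, UE 1 combination 2, and UE 2 combination 1, attaining the maximum sum-rate of 11.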

To determine which UE’s data should be recovered at the mmth decoding step, we need to compute the achievable rates of L+1mL^{\prime}+1-m UEs. Therefore, the computational complexity of both the received-power-based and the channel-coefficient-based algorithms is 𝒪((L+1)L2){\cal O}\left({\frac{{\left({L^{\prime}+1}\right)L^{\prime}}}{2}}\right). Besides, the complexity of the Hungarian algorithm is 𝒪((L)3){\cal O}\left({{{\left({L^{\prime}}\right)}^{3}}}\right) [34]. After determining the decoding orders of the integer linear combinations and the UEs, the achievable sum-rate can be obtained: using (38) to calculate the achievable rate of the llth UE whose data is recovered with 𝐚δ(m)T𝐗{\bf{a}}_{\delta\left(m\right)}^{T}{\bf{X}}, the achievable sum-rate is given as Rsum=l=1LRml{R_{\text{sum}}}=\sum\nolimits_{l=1}^{L^{\prime}}{{R_{ml}}}.

IV Numerical Results

IV-A Parameters Setup

We adopt parameter settings similar to those in [6] as the basis of our simulation model. More specifically, all APs and UEs are uniformly distributed at random locations within a 1 × 1 km square. The square is wrapped around at the edges to avoid boundary effects. The COST231-Hata model is employed to characterize large-scale propagation.
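The two geometric ingredients of this setup can be sketched as follows; the carrier frequency and antenna heights below are assumed illustrative values, not parameters stated in this paper.

```python
import math

D = 1000.0  # side of the simulation square in meters

def wrap_distance(a, b, size=D):
    """Torus ("wrap-around") distance that removes boundary effects."""
    dx = min(abs(a[0] - b[0]), size - abs(a[0] - b[0]))
    dy = min(abs(a[1] - b[1]), size - abs(a[1] - b[1]))
    return math.hypot(dx, dy)

def cost231_pathloss_db(d_m, f_mhz=1900.0, h_ap=15.0, h_ue=1.65):
    """COST231-Hata urban path loss in dB (assumed parameter values)."""
    d_km = max(d_m, 10.0) / 1000.0  # clamp very short links
    a_h = (1.1 * math.log10(f_mhz) - 0.7) * h_ue \
        - (1.56 * math.log10(f_mhz) - 0.8)  # mobile antenna correction
    return (46.3 + 33.9 * math.log10(f_mhz) - 13.82 * math.log10(h_ap)
            - a_h + (44.9 - 6.55 * math.log10(h_ap)) * math.log10(d_km)
            + 3.0)  # +3 dB metropolitan correction term
```

With wrap-around, two nodes on opposite edges of the square are treated as close neighbors, so every UE sees a statistically identical AP deployment.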

IV-B Results and Discussion

IV-B1 Parallel Computation

Figure 2: Achievable sum-rate for the CF, PARA, and APS-PARA schemes with L = 10 and P_t = 200 mW.
Figure 3: Achievable sum-rate for the PARA, APS-PARA, LSF-PARA, and APS-LSF-PARA schemes with L = 10 and P_t = 200 mW.

First, we evaluate the performance of the proposed parallel computation (PARA) scheme in terms of the achievable sum-rate with power control. The APS-PARA scheme refers to PARA with AP selection. Fig. 2 shows the achievable sum-rate obtained via the CF, PARA, and APS-PARA schemes versus the number of APs with L = 10 and P_t = 200 mW. Owing to the array gain, the performance of all considered schemes increases as the number of APs M increases. Moreover, the PARA scheme with the proposed power control method outperforms the conventional CF scheme. For example, compared with the CF scheme, the PARA and APS-PARA schemes improve the achievable sum-rate by factors of more than 1.24 and 1.36, respectively, for M = 100. This is because the ECF framework enables optimal transmit power allocation among the UEs, which facilitates the exploitation of the performance gain. Furthermore, Fig. 2 shows that the APS-PARA scheme outperforms the PARA scheme, which is attributed to the lower IUI brought by the proposed AP selection. Since the effective noise variance that the UEs’ data must tolerate decreases considerably, AP selection is beneficial for improving the achievable sum-rate. The computational complexity is also reduced with AP selection: only L integer linear combinations, instead of M, need to be used in the power control to recover the UEs’ original information. Besides, when the number of APs is 60, the performance degradation caused by imperfect CSI, estimated at the APs via the MMSE estimation method [12], is only 4%.

Assuming that power control is performed at the CPU based on large-scale fading, we replace g_ml with β_ml when solving the optimization problem (III-B2). The PARA scheme using power control based on large-scale fading is referred to as the LSF-PARA scheme. Fig. 3 shows the achievable sum-rate obtained with the PARA, LSF-PARA, APS-PARA, and APS-LSF-PARA schemes against the number of APs. As expected, the achievable sum-rate of all schemes improves as the number of APs increases, and applying AP selection enhances the performance. Furthermore, the impact of neglecting the small-scale fading on the achievable sum-rate is not critical, especially when the ratio of APs to UEs becomes large. In particular, the performance gap due to ignoring the small-scale fading vanishes for M = 100. This is due to the channel hardening property [12]: as the number of antennas grows sufficiently large, the variance of the channel gain reduces and the channel behaves almost deterministically.

IV-B2 Successive Computation

Figure 4: Achievable sum-rate for the APS-LSF-SUCC scheme with power allocation, applying the Hungarian, RP, and NLSF algorithms, for L = 10 and P_t = 200 mW.

Next, we examine the performance of the successive computation (SUCC) schemes. We denote the successive computation scheme based on large-scale fading with AP selection, power allocation, and the Hungarian algorithm by APS-LSF-PA-Hungarian-SUCC. Fig. 4 shows the performance of the APS-LSF-PA-SUCC schemes with different algorithms for determining the decoding order of UEs: the Hungarian algorithm, the received-power-based (RP) algorithm, and the channel-coefficient-based algorithm. Since we assume that power control is employed at the CPU, i.e., the channel coefficient is replaced with the large-scale fading coefficient, the APS-LSF-PA-SUCC scheme that searches the decoding order of UEs through the 2-norm of the large-scale fading coefficients is named the APS-LSF-PA-NLSF-SUCC scheme. As shown in Fig. 4, the APS-LSF-PA-Hungarian-SUCC scheme achieves the best result among the compared schemes. Furthermore, the performance of the APS-LSF-PA-RP-SUCC scheme is similar to that of APS-LSF-PA-NLSF-SUCC. This is because the denominator of the second term in (23) is several orders of magnitude smaller than the numerator, as the transmit power normalized by the noise power is huge, while the effect of the first term in (23) on the effective noise variance is not significant. Therefore, according to (23), UEs with good channel conditions should be allocated more transmit power to reduce the effective noise, so that the system obtains a larger achievable sum-rate. In this way, the RP and NLSF algorithms lead to the same result.

Figure 5: Achievable sum-rate for the APS-LSF-PA-Hungarian-SUCC and APS-PA-Hungarian-SUCC schemes for M = 50 and M = 100.

In the previous simulation results, we have shown that instantaneous channel state information helps the parallel computation scheme improve the achievable sum-rate, at the cost of higher complexity. The conclusion for successive computation is similar. In Fig. 5, we compare the achievable sum-rate of the APS-PA-Hungarian-SUCC and APS-LSF-PA-Hungarian-SUCC schemes. Although the performance gap induced by replacing the channel coefficient grows with the number of UEs, it remains very small. At the same time, transmitting instantaneous channel state information significantly increases the fronthaul load. Note that the small-scale fading coefficient is static only during one coherence block, while the large-scale fading coefficient stays constant for a duration of at least 40 small-scale fading coherence intervals [6]. Therefore, using statistical channel state information works well for successive computation schemes. Besides, the benefit of employing AP selection is also obvious in successive computation. When searching the decoding order of combinations (36), in the mth step we only need L' − m computations to find the combination with minimal effective noise that increases the rank of the side information matrix. Furthermore, we can use u_1, …, u_{m−1} to eliminate certain symbols from the combination and thus remove the constraint on them. Determining the complete decoding order of combinations then has computational complexity O((L')^2), whereas abandoning the selection requires O((M')^2). Therefore, AP selection not only improves the performance but also decreases the computational complexity when the number of APs is larger than the number of UEs.
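The stepwise search over combinations can be sketched as a greedy selection on the effective noise variances; this simplified version omits the rank test on the side information matrix and the symbol elimination via u_1, …, u_{m−1}, keeping only the minimal-noise selection at each step.

```python
def greedy_decoding_order(noise_var):
    """Greedy decoding order: at each step pick the remaining combination
    with the smallest effective noise variance (rank checks omitted).

    noise_var[m] is the effective noise variance of combination m.
    """
    remaining = set(range(len(noise_var)))
    order = []
    while remaining:
        m = min(remaining, key=lambda i: noise_var[i])
        order.append(m)
        remaining.remove(m)
    return order
```

With L' candidate combinations this performs L' + (L' − 1) + … + 1 comparisons, consistent with the quadratic complexity noted in the text.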

IV-B3 Comparison of centralized MMSE, ECF, CF, and MRC scheme

Figure 6: Achievable sum-rate for the APS-LSF-PA-Hungarian-SUCC, APS-LSF-PARA, CF, and MRC schemes for L = 10 and P_t = 200 mW.
Figure 7: CDFs of the achievable sum-rate for the centralized MMSE, APS-LSF-PA-Hungarian-SUCC, APS-LSF-PARA, CF, local ZF, and MRC schemes for M = 100 and L = 20.

In Fig. 6, we compare the achievable sum-rate of the APS-LSF-PA-Hungarian-SUCC, APS-LSF-PARA, CF, and MRC schemes. The MRC scheme is the simple linear strategy in cell-free massive MIMO that has been widely used in previous works [6]. In the uplink data transmission, the received signal at the mth AP can be expressed as \mathbf{y}_m = \sum_{l=1}^{L} g_{ml}\sqrt{P_l}\,\mathbf{x}_l + \mathbf{z}_m. Then, the mth AP multiplies the received signal with the conjugate of its channel coefficient vector \mathbf{g}_m and forwards \mathbf{y}_m\mathbf{g}_m^{*} to the CPU. The CPU combines the signals from all M APs. Therefore, the achievable rate of the lth UE is given by

R_{\mathrm{mrc},l} = \log_2\left(1 + \frac{P_l\left|\mathbf{g}_l^{H}\mathbf{g}_l\right|^{2}}{\left\|\mathbf{g}_l\right\|^{2} + \sum_{l'\neq l} P_{l'}\left|\mathbf{g}_{l'}^{H}\mathbf{g}_l\right|^{2}}\right). (39)
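The per-UE MRC rate can be evaluated numerically; the sketch below assumes unit noise power and a flattened channel matrix G whose lth column stacks the coefficients g_ml over all APs.

```python
import numpy as np

def mrc_rates(G, p, noise=1.0):
    """Per-UE achievable rate under MRC combining at the CPU, following (39).

    G: (M, L) complex channel matrix, column l is UE l's channel to all APs.
    p: length-L list of UE transmit powers P_l.
    """
    M, L = G.shape
    rates = []
    for l in range(L):
        g = G[:, l]
        signal = p[l] * np.abs(g.conj() @ g) ** 2          # desired power
        interference = sum(p[lp] * np.abs(g.conj() @ G[:, lp]) ** 2
                           for lp in range(L) if lp != l)  # IUI term
        denom = noise * np.linalg.norm(g) ** 2 + interference
        rates.append(np.log2(1.0 + signal / denom))
    return np.array(rates)
```

For orthogonal channels the interference term vanishes and the rate reduces to log2(1 + P_l ||g_l||^2), which matches the IUI-free intuition discussed next.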

It is clear that the IUI limits the achievable sum-rate. However, the CF and ECF schemes can harness and even exploit the interference for cooperative gain, which leads to an increase in the achievable sum-rate, as verified in Fig. 6. When the number of APs is not very large, i.e., when the IUI affects the performance significantly, the advantage of applying the CF and ECF schemes is evident. For example, compared with the MRC scheme, the achievable sum-rate of the CF and ECF schemes improves by factors of more than 1.5 and 2.5, respectively, when M = 100.

Although utilizing the ECF framework can effectively improve the system performance, it is not the optimal choice for maximizing the achievable rate. Fig. 7 shows the cumulative distribution function (CDF) of the achievable sum-rate for the centralized MMSE, APS-LSF-PA-Hungarian-SUCC, APS-LSF-PARA, CF, and MRC schemes with M = 100 and L = 20. From Fig. 7, we first observe that our parallel ECF scheme, with the power control method that solves (26), outperforms both CF and MRC. Second, when comparing the ECF framework with the local MR and zero-forcing (ZF) schemes using quantized signals under the same fronthaul limit, our proposed ECF schemes, including parallel and successive computation, achieve superior performance. Specifically, compared with local ZF, applying the successive computation scheme leads to a 60.4% improvement in the average achievable sum-rate. Besides, the performance gap between the APS-LSF-PA-Hungarian-SUCC scheme and the centralized MMSE scheme is obvious. This is attributed to the fact that centralized MMSE adopts the optimal combining scheme for maximizing the instantaneous signal-to-interference-and-noise ratio [12]. Specifically, applying the centralized MMSE scheme leads to a 55% improvement in the average achievable sum-rate. However, compared with centralized MMSE, ECF is an efficient approach for fronthaul reduction, and hence a large achievable sum-rate can still be realized even if the fronthaul capacity is limited. In particular, each AP decodes the received signal into the finite field by applying the equalization factor and then forwards an integer combination of the transmitted symbols of all UEs. The cardinality of the signals transmitted over the fronthaul link is the same as the cardinality of the UEs' original data, which is the theoretical minimum fronthaul load required to achieve lossless transmission [17].
When the fronthaul capacity is restricted to R_0, the actual achievable rate is R = min{R_0, R_sum} [35], where R_sum denotes the achievable sum-rate without the fronthaul load constraint.

IV-B4 Trade-off between the performance and the complexity of ECF schemes

TABLE II: Computational complexity and performance of the various versions of the ECF framework.

For every scheme, the AP selection complexity is O(M' + M' log_2(M') + M'(M'−1)^3). The power optimization cost, measured in the number of feasibility problems, is

⌊(2 r_max − r_sl ⌊(r_max − r_min)/r_sl − 1⌋) ⌊(r_max − r_min)/r_sl⌋ / (2 s_sl)⌋ ⌊(r_max − r_min)/r_sl⌋

for parallel computation, and

Σ_{m=2}^{L'} [(L'−m+2)(L'−m+1)/2] × [((L')^3 − L')/3 + ((m−1)^2 L' + (m−1)L')/2 + (m−1)^3 L' + (m−1)(L')^2 + 2(L')^2]

for all successive computation variants. The remaining columns are:

Scheme | Searching decoding order | Sum rate [bits per channel use]
Parallel computation | N/A | 19.57
Successive computation, received-power-based algorithm | O((L'+1)L'/2) | 23.58
Successive computation, channel-coefficient-based algorithm | O((L'+1)L'/2) | 22.82
Successive computation, Hungarian algorithm | O((L')^3) | 25.51

In Table II, we summarize the sum-rate performance and computational complexity of the various versions of successive and parallel computation. A trade-off between performance and computational complexity can be observed. Specifically, the successive computation with the Hungarian algorithm for searching the decoding order of UEs has higher computational complexity but superior performance compared with the other two methods, i.e., the received-power-based and channel-coefficient-based algorithms.

IV-B5 Scalability

To achieve scalability [36], [37], our ECF framework needs to limit the number of UEs each AP serves. Specifically, according to the large-scale-fading-based AP selection criterion proposed in [38], APs are first selected to form a UE-centric cluster for each UE. Then, each AP sorts the UEs to be served according to the large-scale fading information and serves only the first several UEs with the best channel quality. Fig. 8 shows the performance comparison between the original non-scalable ECF and the scalable ECF schemes. We observe that the performance loss is small and decreases as the number of APs increases.
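The per-AP UE selection step can be sketched as follows; the serving-set size n_serve is a hypothetical parameter illustrating the large-scale-fading-based rule of [38], not a value fixed in this paper.

```python
def serving_sets(beta, n_serve):
    """For each AP, keep only its n_serve strongest UEs.

    beta[m][l] is the large-scale fading coefficient between AP m and UE l;
    returns, per AP, the indices of the served UEs sorted by channel quality.
    """
    return [sorted(range(len(row)), key=lambda l: row[l], reverse=True)[:n_serve]
            for row in beta]
```

Capping each AP's serving set keeps the per-AP processing load independent of the total number of UEs, which is the essence of the scalability requirement.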

Figure 8: Achievable sum-rate for the non-scalable and scalable ECF-SUCC and ECF-PARA schemes for L = 10 and P_t = 200 mW.

V Conclusions

In this work, we investigate the achievable sum-rate of the ECF framework for cell-free massive MIMO systems. Two types of ECF framework, parallel computation and successive computation, are proposed to improve the achievable sum-rate. An AP selection scheme is proposed to reduce the effective noise that the UEs must tolerate, which further improves the performance and reduces the computational complexity. We show that the proposed power control algorithms for parallel and successive computation with AP selection can improve the achievable sum-rate significantly. To obtain better system performance, methods for determining the decoding order of combinations and UEs are also presented. Numerical results show that, compared with the CF and MRC schemes, the ECF framework remarkably improves the achievable sum-rate of cell-free massive MIMO systems.

References

  • [1] J. Zhang, J. Zhang, J. Zheng, S. Jin, and B. Ai, “Expanded compute-and-forward for backhaul-limited cell-free massive MIMO,” in Proc. IEEE ICC Wkshps, 2019, pp. 1–6.
  • [2] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Trans. Wireless Commun., vol. 9, no. 11, pp. 3590–3600, Nov. 2010.
  • [3] J. Zhang, E. Björnson, M. Matthaiou, D. W. K. Ng, H. Yang, and D. J. Love, “Prospective multiple antenna technologies for beyond 5G,” IEEE J. Sel. Areas in Commun., vol. 38, no. 8, pp. 1637–1660, Aug. 2020.
  • [4] X. Chen, D. W. K. Ng, W. Yu, E. G. Larsson, N. Al-Dhahir, and R. Schober, “Massive access for 5G and beyond,” IEEE J. Sel. Areas in Commun., vol. 39, no. 3, pp. 615–637, Mar. 2021.
  • [5] A. Lozano, R. W. Heath, and J. G. Andrews, “Fundamental limits of cooperation,” IEEE Trans. Inf. Theory, vol. 59, no. 9, pp. 5213–5226, Sep. 2013.
  • [6] H. Q. Ngo, A. Ashikhmin, H. Yang, E. G. Larsson, and T. L. Marzetta, “Cell-free massive MIMO versus small cells,” IEEE Trans. Wireless Commun., vol. 16, no. 3, pp. 1834–1850, Mar. 2017.
  • [7] E. Nayebi, A. Ashikhmin, T. L. Marzetta, H. Yang, and B. D. Rao, “Precoding and power optimization in cell-free massive MIMO systems,” IEEE Trans. Wireless Commun., vol. 16, no. 7, pp. 4445–4459, Jul. 2017.
  • [8] M. Karlsson, E. Björnson, and E. G. Larsson, “Techniques for system information broadcast in cell-free massive MIMO,” IEEE Trans. Commun., vol. 67, no. 1, pp. 244–257, Jan. 2019.
  • [9] M. Bashar, K. Cumanan, A. G. Burr, M. Debbah, and H. Q. Ngo, “On the uplink max–min SINR of cell-free massive MIMO systems,” IEEE Trans. Wireless Commun., vol. 18, no. 4, pp. 2021–2036, Apr. 2019.
  • [10] G. Interdonato, E. Björnson, H. Q. Ngo, P. Frenger, and E. G. Larsson, “Ubiquitous cell-free massive MIMO communications,” EURASIP J. Wireless Commun. Netw., vol. 2019, no. 1, p. 197, Dec. 2019.
  • [11] E. Björnson and L. Sanguinetti, “Making cell-free massive MIMO competitive with MMSE processing and centralized implementation,” IEEE Trans. Wireless Commun., vol. 19, no. 1, pp. 77–90, Jan. 2020.
  • [12] E. Björnson, J. Hoydis, L. Sanguinetti et al., “Massive MIMO networks: Spectral, energy, and hardware efficiency,” Foundations and Trends® in Signal Processing, vol. 11, no. 3-4, pp. 154–655, 2017.
  • [13] R. Ahlswede, N. Cai, S.-Y. Li, and R. W. Yeung, “Network information flow,” IEEE Trans. Inf. Theory, vol. 46, no. 4, pp. 1204–1216, Apr. 2000.
  • [14] S.-Y. Li, R. W. Yeung, and N. Cai, “Linear network coding,” IEEE Trans. Inf. Theory, vol. 49, no. 2, pp. 371–381, Feb. 2003.
  • [15] T. Yang and I. B. Collings, “On the optimal design and performance of linear physical-layer network coding for fading two-way relay channels,” IEEE Trans. Wireless Commun., vol. 13, no. 2, pp. 956–967, May 2014.
  • [16] B. Nazer and M. Gastpar, “Reliable physical layer network coding,” Proc. IEEE, vol. 99, no. 3, pp. 438–460, Mar. 2011.
  • [17] Q. Huang and A. Burr, “Compute-and-forward in cell-free massive MIMO: Great performance with low backhaul load,” in Proc. IEEE ICC Wkshps, May 2017, pp. 601–606.
  • [18] B. Nazer and M. Gastpar, “Compute-and-forward: Harnessing interference through structured codes,” IEEE Trans. Inf. Theory, vol. 57, no. 10, pp. 6463–6486, Oct. 2011.
  • [19] S.-N. Hong and G. Caire, “Compute-and-forward strategies for cooperative distributed antenna systems,” IEEE Trans. Inf. Theory, vol. 59, no. 9, pp. 5227–5243, Sep. 2013.
  • [20] C. Feng, D. Silva, and F. R. Kschischang, “An algebraic approach to physical-layer network coding,” IEEE Trans. Inf. Theory, vol. 59, no. 11, pp. 7576–7596, Nov. 2013.
  • [21] M. Nokleby and B. Aazhang, “Lattice coding over the relay channel,” in Proc. IEEE ICC, 2011, pp. 1–5.
  • [22] B. Nazer, V. R. Cadambe, V. Ntranos, and G. Caire, “Expanding the compute-and-forward framework: Unequal powers, signal levels, and multiple linear combinations,” IEEE Trans. Inf. Theory, vol. 62, no. 9, pp. 4879–4909, Sep. 2016.
  • [23] Z. Li, J. Chen, L. Zhen, S. Cui, K. G. Shin, and J. Liu, “Coordinated multi-point transmissions based on interference alignment and neutralization,” IEEE Trans. Wireless Commun., vol. 18, no. 7, pp. 3347–3365, Jul. 2019.
  • [24] P. Marsch and G. P. Fettweis, Coordinated Multi-Point in Mobile Communications: from theory to practice. Cambridge University Press, 2011.
  • [25] V. V. Veeravalli and A. El Gamal, Interference management in wireless networks: Fundamental bounds and the role of cooperation. Cambridge University Press, 2018.
  • [26] Q. Huang and A. Burr, “Low complexity coefficient selection algorithms for compute-and-forward,” IEEE Access, vol. 5, pp. 19182–19193, 2017.
  • [27] B. Zhou and W. H. Mow, “A quadratic programming relaxation approach to compute-and-forward network coding design,” in Proc. IEEE ISIT, Jun. 2014, pp. 2296–2300.
  • [28] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
  • [29] W. He, B. Nazer, and S. S. Shitz, “Uplink-downlink duality for integer-forcing,” IEEE Trans. Inf. Theory, vol. 64, no. 3, pp. 1992–2011, Mar. 2018.
  • [30] Y. Al-Eryani, M. Akrout, and E. Hossain, “Multiple access in cell-free networks: Outage performance, dynamic clustering, and deep reinforcement learning-based design,” IEEE J. Sel. Areas in Commun., vol. 39, no. 4, pp. 1028–1042, 2020.
  • [31] W. Mesbah and H. Alnuweiri, “Joint rate, power, and decoding order optimization of MIMO-MAC with MMSE-SIC,” in Proc. IEEE Globecom Wkshps, 2009.
  • [32] Z. Zhou, T. Jiang, H. Bai, S. Sun, and H. Long, “Joint optimization of power and decoding order in CDMA based cognitive radio systems with successive interference cancellation,” in Proc. IEEE ICCT, 2011, pp. 187–191.
  • [33] R. R. Patel, T. T. Desai, and S. J. Patel, “Scheduling of jobs based on hungarian method in cloud computing,” in Proc. IEEE ICICCT, 2017, pp. 6–9.
  • [34] B.-L. Xu and Z.-Y. Wu, “A study on two measurements-to-tracks data assignment algorithms,” Inf. Sci., vol. 177, no. 19, pp. 4176–4187, 2007.
  • [35] B. Nazer, A. Sanderovich, M. Gastpar, and S. Shamai, “Structured superposition for backhaul constrained cellular uplink,” in Proc. IEEE ISIT, 2009, pp. 1530–1534.
  • [36] E. Björnson and L. Sanguinetti, “Scalable cell-free massive MIMO systems,” IEEE Trans. Commun., vol. 68, no. 7, pp. 4247–4261, Jul. 2020.
  • [37] G. Interdonato, P. Frenger, and E. G. Larsson, “Scalability aspects of cell-free massive MIMO,” in Proc. IEEE ICC, 2019, pp. 1–6.
  • [38] H. Q. Ngo, L.-N. Tran, T. Q. Duong, M. Matthaiou, and E. G. Larsson, “On the total energy efficiency of cell-free massive MIMO,” IEEE Trans. Green Commun. Netw., vol. 2, no. 1, pp. 25–39, Mar. 2018.