

A General Approach to Fully Linearize the Power Amplifiers in mMIMO with Less Complexity

Ganesh Prasad, Member, IEEE, Håkan Johansson, Senior Member, IEEE, and Rabul Hussain Laskar. G. Prasad and H. Johansson are with the Division of Communication Systems, Department of Electrical Engineering, Linköping University, 581 83 Linköping, Sweden (e-mail: {ganesh.prasad, hakan.johansson}@liu.se). R. H. Laskar is with the Department of Electronics and Communication Engineering, National Institute of Technology Silchar, 788 010 Silchar, India (e-mail: rhlaskar@ece.nits.ac.in).
Abstract

A radio frequency (RF) power amplifier (PA) plays an important role in amplifying the message signal to a high power level for transmission to a distant receiver. Due to the typically nonlinear behavior of a PA at high transmit power, a digital predistortion (DPD), which applies a preinverse of the nonlinearity, is used to linearize the PA. However, in a massive MIMO (mMIMO) transmitter, a single DPD is not sufficient to fully linearize hundreds of PAs. Further, assigning a separate DPD to each PA for full linearization is complex and uneconomical. In this work, we address these challenges via the proposed low-complexity DPD (LC-DPD) scheme. Initially, we describe the fully-featured DPD (FF-DPD) scheme to linearize multiple PAs and examine its complexity. Thereafter, using it, we derive the LC-DPD scheme that can adaptively linearize the PAs as required. The coefficients in the two schemes are learned using algorithms that adopt the indirect learning architecture based recursive prediction error method (ILA-RPEM), owing to its adaptive nature and its freedom from matrix inversion operations. Furthermore, for the LC-DPD structure, we propose three algorithms based on the correlation of its common coefficients with the distinct coefficients. Lastly, the performance of the algorithms is quantified using the obtained numerical results.

Index Terms:
Digital predistortion, massive MIMO, direct learning architecture, indirect learning architecture, recursive prediction error method.

I Introduction

In wireless transmitters, radio frequency (RF) power amplifiers (PAs) are used to amplify the modulated signals for distant transmission. However, in-band and out-of-band nonlinear distortions occur when signals are amplified near the saturation region of the PAs [1]. These distortions can be reduced by applying some backoff to the peak power of the signals, but this reduces the efficiency of the PAs. Therefore, preprocessing such as digital predistortion (DPD) of the transmit signals before the PAs is required to linearize the resultant signals toward the saturation region. Over the past decade, many works have focused on the linearization of multiple power amplifiers in transmitters such as massive MIMO (mMIMO) transmitters. However, they have focused on linearization in a particular beamforming direction instead of linearizing all the PAs, because the linearization of each PA requires a separate DPD block along with a driving RF chain. Due to this high complexity, the latter approach is not suitable for an economical mMIMO transmitter. To deal with this, in this work, we propose a general approach to fully linearize all the PAs with less complexity. We also discuss in detail the fundamentals behind the challenges and the procedure to tackle them.

I-A Related Works

The preprocessing using DPD has an inverse property with respect to the nonlinear PA so as to mitigate the nonlinearities in the desired transmit signal [2]. In the state of the art, mostly linear parametric models have been used for the DPD [3]. One method to identify the DPD coefficients is least squares (LS), owing to its fast convergence [4, 5, 6]. But, despite its mathematical simplicity, its computational complexity is high due to the inversion of large matrices corresponding to the estimation of a large number of DPD coefficients. Many works have therefore proposed algorithms to reduce the complexity of LS-based DPD identification [7, 8, 9, 10]. For example, the size of the matrix is reduced by normalization of the DPD basis functions (BFs) followed by their pruning [7]. Also, assuming a stationary random process, the time-varying matrix associated with the DPD coefficients is replaced by a constant covariance matrix [8]. Further, in an iterative LS-based algorithm, the number of samples of the DPD coefficients (or the size of the matrix) can be reduced by exploiting the correlation of the observation errors between two iterations [9]. Besides, the matrix size can also be reduced using eigenvalue decomposition and principal component analysis (PCA), which decrease the order of the memory polynomial model of the DPD [11, 10]. In eigenvalue decomposition, the number of DPD coefficients is reduced by retaining only the dominant eigenvectors, whereas in PCA, the reduction is achieved by converting the correlated BFs of the DPD into uncorrelated BFs.

Although the above techniques help reduce the size of the matrices, for time-varying and highly nonlinear PAs, the required number of DPD coefficients is still large, which leads to undesirably large matrix operations. Therefore, recursive algorithms like least mean squares (LMS) [12], recursive least squares (RLS) [13, 14], and the recursive prediction error method (RPEM) [15] are computationally more reliable, at the cost of slow convergence to the desired optimal values of the variables. Using LMS, the DPD adjusts its coefficients to minimize the mean square error (MSE) between the PA output and the desired signal. The coefficients are updated using the stochastic gradient descent method, which minimizes the instantaneous error in each iteration. However, LMS is quite unstable and very sensitive to the step size of the update [16]. In conventional LS estimation, a batch of input and output data samples of the PA is used to update the DPD coefficients. In RLS, by contrast, the LS estimation is represented recursively using a set of equations, and the coefficients are updated accordingly for each newly obtained input-output data sample. To discount the influence of older samples, it uses an exponential weighting known as the forgetting factor. The chosen value of the forgetting factor gives a trade-off between precision and convergence, and a low value causes high sensitivity to noise. The forgetting factor is improved further in RPEM by allowing it to vary with time [17]. In the existing works, these adaptive algorithms are mostly applied to two types of DPD learning architectures: (i) the direct learning architecture (DLA) [18, 19] and (ii) the indirect learning architecture (ILA) [13, 14]. DLA performs better in the presence of noise at the output of the PA, but ILA is more effective in terms of convergence rate [20]. Therefore, ILA is widely used for the identification of the DPD.
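To make the forgetting-factor mechanism concrete, the RLS recursion described above can be sketched as follows. This is a generic, textbook-style implementation for a linear-in-parameters model, not code from this paper; the function and variable names are our own.

```python
import numpy as np

def rls_step(phi, P, psi, d, lam=0.99):
    """One RLS step for a linear-in-parameters model d ~ phi^T psi.

    phi : current complex coefficient estimate
    P   : current inverse of the exponentially weighted correlation matrix
    psi : regressor, i.e., the basis-function vector for this sample
    d   : desired output sample
    lam : forgetting factor; a sample k steps old is discounted by lam**k
    """
    pi = P @ np.conj(psi)
    k = pi / (lam + psi @ pi)             # gain vector
    e = d - phi @ psi                     # a-priori prediction error
    phi = phi + k * e                     # coefficient update
    P = (P - np.outer(k, psi @ P)) / lam  # inverse-correlation update
    return phi, P, e
```

A small `lam` tracks time variation faster but amplifies noise, which is exactly the precision/convergence trade-off noted above; RPEM refines this by making the forgetting factor time-varying.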
Accordingly, in our proposed work, we consider the RPEM algorithm in an ILA for the DPD identification. Next, we describe the state of the art for the linearization of multi-antenna transmitters.

In multi-antenna systems like MIMO or mMIMO, although the PAs in the transmitters are of the same type, in practice they have different nonlinearities due to their sensitivity to process, supply voltage, and temperature (PVT) variations [21, 22]. Therefore, a single DPD is not capable of linearizing all the PAs [23] and, ideally, each PA requires a separate DPD [24]. But the ideal case entails undesirably high complexity in hardware implementation as well as in processing, and is not even feasible for an mMIMO transmitter where hundreds of PAs need to be linearized. Alternatively, instead of linearizing all PAs, a resultant single PA can be linearized whose output is the sum of the outputs of the PAs [25, 26]. However, this addresses only the average nonlinearities of the PAs, so none of them is fully linearized. On the other hand, instead of the sum, the beam-oriented (BO) output of the PAs in a given direction can be linearized using a single DPD [27, 28, 29]. As this addresses the nonlinearity of the BO output only in the desired direction (main lobe), the PAs are again not fully linearized. Thereby, the outputs in other directions are not linearized, which gives nonlinear sidelobes whose power level is typically only 10 dB below the linear main lobe [30]. This can be improved by frequently updating the DPD for different directions. Also, the number of DPD coefficients per update can be reduced using pruning algorithms [31]. However, the frequent update is not reliable for online operation and leads to high computational complexity. Moreover, at any one time, the DPD is identified for a particular BO direction and the PAs are not fully linearized, which still gives nonlinear sidelobes comparable to the main lobe. If we assume a similar distribution of nonlinearities over the PAs, the sidelobes can be reduced by optimally adjusting the amplitudes of the phase shifters in the BO output [32].
However, in general, this cannot provide full linearization of the PAs. The performance toward full linearization can be improved by adding an extra tuning box to each PA. The tuning boxes compensate the nonlinear differences between the PAs such that the resultant nonlinearity is the same for each PA; thereafter, using a single DPD, the PAs are fully linearized [33, 34, 35]. Nonetheless, for the compensation of the differences, each tuning box is modeled using a polynomial model which needs to be identified using a learning algorithm. Therefore, its complexity is approximately the same as incorporating a separate DPD for each PA. Different from DPD-only operation, the sidelobes in the BO output can be linearized more reliably using two layers of operations: DPD training followed by optimization of post-weighting coefficients that are multiplied by the DPD output signal and distributed to the respective PAs' inputs [36, 37]. In a simplified analysis, different post-weighting coefficients are assigned to each PA, but, within the branch of a PA, the same post-weighting coefficient is multiplied by all the BFs of the DPD. Thus, due to the low degree of freedom per branch, this scheme is less reliable in the post-weighting linearization of the PAs [36]. Also, to distribute different signals to the branches of the PAs, a separate RF chain is needed for each PA, which gives high complexity in an mMIMO transmitter. Later, in our previous work [37], we adopted an adaptive post-weighting architecture that increases the degree of freedom (DOF) per branch as well as reduces the number of RF chains required. But, still, because the post-weighting coefficients are optimized for a discrete range of directions, the PAs are not fully linearized.

I-B Motivation and Key Contribution

As described earlier, the PAs in a multi-antenna transmitter can be fully linearized by identifying a separate DPD for each PA [24]. But this leads to high complexity in the structure and in the computation to learn the coefficients. Also, for the distribution of the predistorted signals, it requires a separate RF chain for each PA. Based on this, we propose a general approach using a low-complexity DPD (LC-DPD) structure which approximates the separate DPD identification as well as reduces the number of RF chains as required in mMIMO transmitters. The key contribution of this work is four-fold, as follows. (i) First, we deduce the reduction in the number of coefficients for a given type of PAs in a subarray from measurement data and the obtained numerical result of a system setting. Then, we propose a fully-featured DPD (FF-DPD) scheme to fully linearize the PAs in a subarray and describe its complexity in terms of the number of multipliers, adders, and RF chains. (ii) Using the FF-DPD structure, we derive the less complex and non-trivial LC-DPD structure. The number of coefficients in it is reduced based on a geometric sequence, and the corresponding coefficients are represented in block vector form. Based on the geometric sequence, we derive expressions for the numbers of multipliers, adders, and RF chains, which are significantly reduced, thus reducing the complexity. (iii) Next, for the training of the coefficients of the two schemes, we propose four algorithms based on the indirect learning architecture based recursive prediction error method (ILA-RPEM): one for the FF-DPD scheme and three for the LC-DPD. Apart from the structural complexity of the FF-DPD, we also describe the computational complexity of its training. The performance of the three algorithms for the LC-DPD is determined by the correlation of its common coefficients with the distinct coefficients in the structure.
It is also shown that the complexities of the three algorithms are less than that of the algorithm for the FF-DPD. Further, for the various operations in the four algorithms, we define four operators and describe their properties. (iv) Lastly, we obtain numerical results in terms of power spectral density (PSD) and error vector magnitude (EVM) using the algorithms for the two schemes and obtain various insights by comparing their performances.

II Structures for Full Linearization

In this section, first, we describe the ideal structure for the predistortion to fully linearize the multiple PAs. Thereafter, we derive a low-complexity structure that approximates the full linearization.

Figure 1: An ideal structure to linearize a subarray of S=4 PAs.

If we consider a subarray of S PAs in an mMIMO transmitter as shown in Fig. 1 (where S=4), ideally, separate DPDs are applied to the respective PAs for full linearization. As each DPD output signal is different, a separate RF chain is employed for each. Thereafter, the resulting signals are phase shifted using the analog phase shifters (analog beamforming weights) {w_l}, l in {1,...,S}, to obtain the BO output from the PAs in a specific direction. Based on the general memory polynomial (GMP) [3], the l-th DPD output x_l(n) for the input message s(n) can be expressed as:

x_{l}(n)=\sum_{p=0}^{P_{l}-1}\sum_{m=0}^{M_{l}-1}\phi_{p,m}^{l}s(n-m)|s(n-m)|^{p}, \quad (1)

where \phi_{p,m}^{l} is the coefficient of the BF s(n-m)|s(n-m)|^{p} of p-th power and m-th delay. Eq. (1) represents the most general model, where the memory length M_l and the order P_l of the polynomial depend on the l-th PA. The outputs of the S DPDs in the subarray (for convenience, the time index of the signals is omitted hereafter) are represented by a vector X = [x_1,...,x_S]^T. Further, x_l is multiplied by the beamforming weight w_l and input to the respective l-th nonlinear PA. The output y_l(n) of the PA can be expressed as:

y_{l}(n)=f_{non}^{l}(w_{l}x_{l}(n)), \quad (2)

where f_{non}^{l}(\cdot) represents the nonlinear function of the l-th PA. For the S PAs in the subarray, the output vector can be expressed as Y = [y_1,...,y_S]^T. Nevertheless, the implementation of the general architecture in Fig. 1 to completely linearize all the PAs is highly complex, because the different sets of BFs with their coefficients for each of the DPDs require many delays, multipliers, and adders. Further, the computational complexity of the iterative/learning algorithm to identify the coefficients of each of the DPDs is undesirably high. Also, the number of RF chains equals the number of PAs, S, in the subarray, which is not economical for an mMIMO transmitter.
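To make the model of (1)-(2) concrete, a direct (unoptimized) evaluation of one memory-polynomial DPD branch can be sketched in Python as follows; the array shapes and the function name are our own assumptions for illustration.

```python
import numpy as np

def gmp_predistort(s, Phi):
    """Memory-polynomial DPD output per Eq. (1).

    s   : complex baseband input, shape (N,)
    Phi : coefficients phi[p, m], shape (P, M)
    Returns x, shape (N,), with x[n] = sum_{p,m} phi[p,m]*s[n-m]*|s[n-m]|**p.
    """
    P, M = Phi.shape
    x = np.zeros_like(s)
    for m in range(M):
        sm = np.roll(s, m)   # delayed copy s(n-m)
        sm[:m] = 0           # zero the samples that wrapped around
        for p in range(P):
            x += Phi[p, m] * sm * np.abs(sm) ** p
    return x
```

Each PA branch of Fig. 1 would run such a polynomial with its own coefficient set, which is precisely the per-branch cost the rest of this section seeks to reduce.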

Figure 2: Real and imaginary values of the identified 8 DPDs' coefficients to predistort the respective 8 PAs based on the Saleh model [38].

In order to simplify the structure, first, we analyze the values of the identified DPDs' coefficients for the given PAs in a subarray. In Fig. 2, we have plotted the real and imaginary values of the identified coefficients of 8 DPDs to fully linearize the respective S=8 traveling-wave tube (TWT) PAs based on the Saleh model [38], having different AM/AM and AM/PM nonlinearities as described in Section V. Here, P_l=5 and M_l=5 for all l in {1,...,S}. The DPDs are trained using the adaptive ILA-RPEM algorithm, which is described in detail in the following sections. From the figure, it can be observed that out of the 25 BFs, only the coefficients of the BFs with indices in the set I = {4, 5, 9, 10, 14, 15, 19, 20, 24, 25} are non-zero. Further, the index set I of the BFs with non-zero coefficients is the same for all PAs, because the PAs are of the same type (note that in the supplementary file of [39], from measurements of the outputs of 16 HMC943APM5E PA ICs for an input signal at 28.5 GHz, all the PAs' nonlinearities are identified using coefficients of the same BFs; therefore, they can be linearized using DPD coefficients of the same BFs). Also, the deviation in the value of a coefficient across different PAs is higher for coefficients of higher value than for those of lower value. For example, the mean deviations for the indices 5 and 10 are 0.0249 and 4.0248×10^{-4} for the real part and 0.0248 and 5.7523×10^{-4} for the imaginary part, respectively. Thus, the coefficients with higher values dominate in the linearization of the S PAs. Based on these observations, next, we reduce the number of coefficients in the proposed two DPD schemes.

Figure 3: Structures of (a) the FF-DPD and (b) the LC-DPD for a subarray of S=4 PAs with Q=4 BFs.

II-A Fully-Featured DPD (FF-DPD)

As described earlier in this section, the predistortion signals for a given type of PAs can be obtained using the same set of BFs based on the GMP (cf. Fig. 2). Thus, in the FF-DPD, we consider the same set of BFs of order P and memory length M for all the PAs in the subarray. Therefore, the l-th output x_l from the FF-DPD can be obtained again using (1) after substituting P_l=P and M_l=M. The total number of BFs in the set {s(n-m)|s(n-m)|^p} for p in {0,...,P-1} and m in {0,...,M-1} is PM. However, as observed in Fig. 2, out of the PM BFs, only some have nonzero DPD coefficients for a given type of PAs. Therefore, in general, we represent these Q BFs as a vector \Psi = [\psi_1,...,\psi_Q]^T with their respective nonzero coefficient vector for the l-th PA as \Phi_l = [\phi_{l,1},...,\phi_{l,Q}]^T, where \psi_i is the i-th BF (a function of s(n), given by \psi_i(s(n))) and \phi_{l,i} is its nonzero coefficient, for i in {1,...,Q}. Using \Psi and \Phi_l, x_l in (1) can be expressed in matrix form as:

x_{l}=\bm{\Phi}_{l}^{T}\bm{\Psi}. \quad (3)

Using x_l, the output y_l of the l-th PA is obtained by the same process as in (2): x_l is multiplied by the phase shifter w_l and input to the PA to get y_l; thus, the output vector Y is obtained. Moreover, the coefficient vectors for the S PAs in the subarray can be expressed as a block vector \Phi = [\Phi_1^T,...,\Phi_S^T]^T. From Fig. 3(a), S coefficients are multiplied by each BF. Thus, for Q BFs, the FF-DPD structure has QS coefficients. Besides, the number of multipliers N_m^F in the structure equals the number of coefficients, i.e., N_m^F = QS. Further, using the structure of the FF-DPD in Fig. 3(a), the number of adders N_a^F can be determined as follows (in this work, the numbers of adders for the different structures are determined under the assumption that an adder has two inputs and one output). Each predistorted output x_l, l in {1,2,3,4}, is determined through the sum of the multiplications of the Q coefficients by the respective BFs. Therefore, in the generation of each output, the number of adders is (Q-1), i.e., one less than the number of coefficients. Thus, for S outputs, the total number of adders is N_a^F = (Q-1)S. Moreover, the number of RF chains N_{RF}^F is the same as the number of PAs, i.e., N_{RF}^F = S. For instance, Fig. 3(a) depicts the FF-DPD for S=4 and Q=4; it has N_m^F = 16, N_a^F = 12, and N_{RF}^F = 4. If we compare Fig. 3(a) with Fig. 1, the ideal structure for the linearization of the subarray is the same as the structure of the FF-DPD, except that the same set of BFs is used for all PAs in the latter. Thus, the complexity of the FF-DPD is still high in terms of multipliers, adders, and RF chains. These complexities can be reduced using the LC-DPD, which is described next.
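The FF-DPD complexity counts above reduce to the closed forms below; the helper is trivial and included only to make the bookkeeping explicit.

```python
def ffdpd_complexity(Q, S):
    """Structural complexity of the FF-DPD (Section II-A):
    multipliers N_m^F = Q*S, adders N_a^F = (Q-1)*S (two-input adders),
    and RF chains N_RF^F = S."""
    return Q * S, (Q - 1) * S, S
```

For the example of Fig. 3(a) with S = 4 and Q = 4, this gives (16, 12, 4), matching the text.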

II-B Low-Complexity DPD (LC-DPD)

Using the LC-DPD, we reduce the complexity of fully linearizing the PAs as follows. As described earlier (cf. Fig. 2), the coefficient of a more dominant BF in the generation of the predistorted signals for different PAs has a higher deviation in its value. Therefore, to reduce the number of coefficients in the LC-DPD, the BFs in the vector \Psi = [\psi_1,...,\psi_Q]^T are arranged in decreasing order of their dominance. Then, more coefficients are multiplied by the more dominant BFs than by the less dominant BFs to generate the predistorted signals. Thus, different from the FF-DPD, in the LC-DPD, the number of coefficients multiplied by the BFs adapts to their dominance. We decrease the number of coefficients based on a geometric sequence as follows.

\{n_{1}Sr^{\nu},\cdots,n_{g}Sr^{(\nu+g-1)}\}, \quad \text{Case I}
\{n_{1}Sr^{\nu},\cdots,n_{g-1}Sr^{(\nu+g-2)},n_{g}\times 1\}, \quad \text{Case II} \quad (6)

where \nu \in Z^+ = {0,1,...}; \sum_{i=1}^{g} n_i = Q with n_i \in P = {1,2,...}; and the cases are, Case I: Sr^{(\nu+g-1)} \geq 1, and Case II: {Sr^{(\nu+g-2)} \geq 1} \wedge {Sr^{(\nu+g-1)} < 1}. According to the sequence in (6), the Q BFs are divided into g groups, where each of the n_i BFs in the i-th group is multiplied by Sr^{(\nu+i-1)} coefficients; thus, the total number of coefficients in the i-th group is n_i Sr^{(\nu+i-1)}. Further, over the groups, the number of coefficients multiplied per BF decreases as a geometric sequence with common ratio r (< 1). The sequence in Case II is the same as in Case I except that each BF in the last group g is multiplied by one coefficient, because Sr^{(\nu+g-1)} < 1 and the number of coefficients per BF cannot be a fraction. Now, using this sequence, we define the coefficient vector \overline{\Phi} of the LC-DPD as below.

Definition 1.

For the sequence of the number of coefficients multiplied by different BFs as expressed in (6), the coefficient vector 𝚽¯\bm{\overline{\Phi}} for the LC-DPD can be represented as:

\bm{\overline{\Phi}}=\left[\bm{\overline{\Phi}}_{1}^{T},\cdots,\bm{\overline{\Phi}}_{g}^{T}\right]^{T}, \quad (7a)
\bm{\overline{\Phi}}_{i}=\left[\phi_{1,(\sigma_{i}+1)},\phi_{(1+r^{-(\nu+i-1)}),(\sigma_{i}+1)},\cdots,\phi_{(1+(Sr^{(\nu+i-1)}-1)r^{-(\nu+i-1)}),(\sigma_{i}+1)},\cdots,\phi_{1,(\sigma_{i}+n_{i})},\phi_{(1+r^{-(\nu+i-1)}),(\sigma_{i}+n_{i})},\cdots,\phi_{(1+(Sr^{(\nu+i-1)}-1)r^{-(\nu+i-1)}),(\sigma_{i}+n_{i})}\right]^{T}, \quad (7b)
\bm{\overline{\Phi}}_{g}=\left[\phi_{1,(\sigma_{g}+1)},\phi_{1,(\sigma_{g}+2)},\cdots,\phi_{1,(\sigma_{g}+n_{g})}\right]^{T}; \text{ for Case II} \quad (7c)

where \overline{\Phi}_i is given in (7b) for i in {1,...,g} for both cases, except that \overline{\Phi}_g for Case II is given by (7c). Besides, \sigma_i = \sum_{j=1}^{i-1} n_j, and \phi_{l,q} is the coefficient which multiplies the q-th BF \psi_q to contribute to the l-th predistorted output of the LC-DPD.

Example:  For the LC-DPD in Fig. 3(b), where S=4 and Q=4, the BFs in the vector \Psi = [\psi_1, \psi_2, \psi_3, \psi_4]^T are divided into g=2 groups, with n_1 = 2 BFs ({\psi_1, \psi_2}) in the first group and n_2 = 2 BFs ({\psi_3, \psi_4}) in the second. For \nu=1 and r=1/2, the sequence of the numbers of coefficients is {n_1 Sr^\nu, n_2 Sr^{\nu+1}} = {4, 2}; thus, the total number of coefficients is 4+2=6. Here, each BF in the first group is assigned Sr^\nu = 2 coefficients, while for the second group it is Sr^{\nu+1} = 1, as depicted in the figure. Further, for the LC-DPD coefficient vector \overline{\Phi}, Case I in (6) is satisfied; thus, using (7), \sigma_1 = 0, and \sigma_2 = 2, the coefficient vector is \overline{\Phi} = [\phi_{1,1}, \phi_{3,1}, \phi_{1,2}, \phi_{3,2}, \phi_{1,3}, \phi_{1,4}]^T, where the first four elements form \overline{\Phi}_1 and the last two form \overline{\Phi}_2. \Box
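The grouping rule in (6) can also be enumerated programmatically. The sketch below is our own helper, assuming the per-BF counts S r^{\nu+i-1} come out integral for all groups except possibly the last, as the construction in the text requires; it reproduces the {4, 2} sequence of the example above.

```python
def lcdpd_sequence(S, Q, n_list, r, nu):
    """Numbers of coefficients per group per Eq. (6).

    Group i (1-based) holds n_list[i-1] BFs, each multiplied by
    S*r**(nu+i-1) coefficients; if that per-BF count drops below 1 for
    the last group (Case II), each of its BFs gets one coefficient.
    """
    assert sum(n_list) == Q
    g = len(n_list)
    seq = []
    for i, n_i in enumerate(n_list, start=1):
        per_bf = S * r ** (nu + i - 1)
        if i == g and per_bf < 1:       # Case II: last group, one coeff per BF
            seq.append(n_i)
        else:                           # Case I counts
            seq.append(round(n_i * per_bf))
    return seq
```

For instance, `lcdpd_sequence(4, 4, [2, 2], 0.5, 1)` yields `[4, 2]` as in the example.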

Furthermore, the total number of coefficients (or multipliers) in \overline{\Phi}, the number of adders, and the number of RF chains in the LC-DPD structure can be determined using Lemma 1, as described below.

Lemma 1.

Using the sequence in (6), the total number of coefficients (or multipliers) N_m^L in \overline{\Phi} can be determined as in (8c). Further, for the special case n_i = n_j = n for all i \neq j, N_{m,eq.}^L is expressed in (8f). Moreover, the number of adders N_a^L and the number of RF chains N_{RF}^L are given by (9).

N_{m}^{L}=\begin{cases}Sr^{\nu}\sum_{i=1}^{g}n_{i}r^{i-1},\ \sum_{i=1}^{g}n_{i}=Q; & \text{Case I}\\ Sr^{\nu}\sum_{i=1}^{g-1}n_{i}r^{i-1}+n_{g}; & \text{Case II}\end{cases} \quad (8c)
N_{m,eq.}^{L}=\begin{cases}nSr^{\nu}(1-r^{g})/(1-r); & \text{Case I}\\ nSr^{\nu}(1-r^{(g-1)})/(1-r)+n_{g}; & \text{Case II}\end{cases} \quad (8f)
N_{a}^{L}=N_{m}^{L}-\left\lceil Sr^{(\nu+g-1)}\right\rceil, \quad N_{RF}^{L}=Sr^{\nu} \quad (9)
Proof:

The expression for N_m^L in (8c) is trivial: it is the sum of the terms of the sequences in (6) for the two cases. After substituting the condition of the special case n_i = n_j = n for all i \neq j in (8c), n is taken as a common factor, and the sum becomes a geometric series with common ratio r. After simplification, we get N_{m,eq.}^L as in (8f). The number of adders N_a^L in the LC-DPD structure can be determined as follows. As described for the structure of the FF-DPD, in the generation of a predistorted output, if the coefficients of the respective BFs are all distinct, then the number of adders used in the generation is Q-1, i.e., one less than the number of coefficients. This can also be observed for the output x_1 of the LC-DPD in Fig. 3(b). But, for the output x_3, the coefficients for the BFs \psi_1 and \psi_2 are different, while the coefficients for \psi_3 and \psi_4 are equal to the respective coefficients for x_1. We find that for the generation of x_3, the number of adders is 2, which is equal to the number of distinct coefficients. Based on this, the total number of adders N_a^L is obtained as in (9): it equals the total number of coefficients (or multipliers) minus the number of predistorted outputs that use a completely distinct set of coefficients, given by Sr^{(\nu+g-1)}. For Case II in (6), Sr^{(\nu+g-1)} < 1 and only one output is generated by a completely distinct set of coefficients; so \lceil Sr^{(\nu+g-1)}\rceil represents the number of distinct sets of coefficients, and thus N_a^L in (9) holds for both cases. Moreover, the number of RF chains N_{RF}^L depends on the number of coefficients multiplied by each BF in the first group, i.e., N_{RF}^L = Sr^\nu as expressed in (9). For instance, in Fig. 3(b), each of \psi_1 and \psi_2 is assigned 4×(1/2)^1 = 2 coefficients; thus, N_{RF}^L = 2. ∎

If we compare the LC-DPD to the FF-DPD in Fig. 3, the numbers of multipliers, adders, and RF chains in the LC-DPD are reduced by the factors N_m^F/N_m^L = 2.67, N_a^F/N_a^L = 2.4, and N_{RF}^F/N_{RF}^L = 2, respectively.
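Lemma 1 can be checked numerically with a few lines. The sketch below mirrors (8c) and (9) (with the same integrality assumption on the per-BF counts as in the text) and reproduces the reduction factors quoted above for the Fig. 3 example (S = 4, n_1 = n_2 = 2, r = 1/2, \nu = 1).

```python
import math

def lcdpd_complexity(S, n_list, r, nu):
    """N_m^L, N_a^L, and N_RF^L of the LC-DPD per Eqs. (8c) and (9)."""
    g = len(n_list)
    last = S * r ** (nu + g - 1)      # coefficients per BF in the last group
    if last < 1:                      # Case II of Eq. (6)
        Nm = round(S * r**nu * sum(n * r**i for i, n in enumerate(n_list[:-1]))) + n_list[-1]
    else:                             # Case I
        Nm = round(S * r**nu * sum(n * r**i for i, n in enumerate(n_list)))
    Na = Nm - math.ceil(last)         # Eq. (9)
    Nrf = round(S * r**nu)            # Eq. (9)
    return Nm, Na, Nrf
```

Here `lcdpd_complexity(4, [2, 2], 0.5, 1)` returns (6, 5, 2), so against the FF-DPD values (16, 12, 4) the reduction factors are 16/6 ≈ 2.67, 12/5 = 2.4, and 4/2 = 2, as stated.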

III Training Based on ILA-RPEM: Part I

Now, we describe the ILA-RPEM based learning for the FF-DPD and LC-DPD schemes to fully linearize a subarray of PAs. In Part I, we focus on learning using the FF-DPD structure: first, we describe the learning of the FF-DPD coefficients; then, the learning of the LC-DPD coefficients is realized by utilizing the structure of the FF-DPD. In Part II, the learning completely exploits the structure of the LC-DPD.

Figure 4: Indirect learning architecture (ILA) based on RPEM.

Fig. 4 represents a general ILA architecture to linearize a subarray of PAs using the RPEM algorithm. Here, the message s(n) is input to the FF-DPD (or LC-DPD) block with coefficient vector \Phi (or \overline{\Phi}), which generates a vector X of predistorted signals using (3). Thereafter, these signals are input to the respective PAs to get the output vector Y using (2). To minimize the nonlinearities in Y, in the feedback loop, first, Y is scaled by the inverse gain 1/G of the PAs to get Y' = [y_1'(n), ..., y_S'(n)]^T, where y_l'(n) = (1/G) y_l(n). Then, it is input to the training block. Based on RPEM, the block estimates the FF-DPD (or LC-DPD) coefficient vector \tilde{\Phi} (or \tilde{\overline{\Phi}}), where \tilde{\Phi} is defined similarly to \Phi as in Section II-A for the FF-DPD structure, whereas \tilde{\overline{\Phi}} is defined like \overline{\Phi} in (7) for the LC-DPD structure. Thereafter, it is copied to the FF-DPD (or LC-DPD) block, i.e., \Phi = \tilde{\Phi} (or \overline{\Phi} = \tilde{\overline{\Phi}}), which again generates X followed by Y. The process repeats until \Phi (or \overline{\Phi}) converges. Thereafter, the FF-DPD (or LC-DPD) is trained to fully linearize the PAs.
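The loop just described can be sketched, for a single PA branch and with the coefficient-update rule left abstract, as follows. The callables `pa`, `basis`, and `update` are placeholders of our own for illustration, not interfaces from the paper.

```python
import numpy as np

def ila_train(s, pa, basis, Phi0, G, update, n_iter=50):
    """Skeleton of the ILA loop of Fig. 4 for one PA branch.

    s      : complex training samples, shape (N,)
    pa     : callable modeling the PA, y = pa(x)
    basis  : callable mapping a signal to its BF matrix, shape (N, Q)
    Phi0   : initial coefficient vector, shape (Q,)
    G      : intended linear gain of the PA
    update : callable (Phi, Psi_prime, e) -> new Phi, e.g. an
             RPEM/RLS-style step (left unspecified here)
    """
    Phi = Phi0.copy()
    for _ in range(n_iter):
        x = basis(s) @ Phi        # predistorted signal, Eq. (3)
        y = pa(x)                 # PA output, Eq. (2)
        y_scaled = y / G          # inverse-gain scaling in the feedback loop
        Psi_p = basis(y_scaled)   # postdistorter BFs driven by y'
        e = x - Psi_p @ Phi       # pre- vs. postdistorter error
        Phi = update(Phi, Psi_p, e)  # learn, then copy back (ILA)
    return Phi
```

With `update` chosen as an RPEM-type step, this reproduces the estimate-and-copy-back behavior of Fig. 4 until convergence.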

III-A Linearization of a PA using ILA-RPEM

We study the processing behind the linearization of the l-th PA of the subarray using the ILA-RPEM in Fig. 4 as follows. Using the input s(n), the predistorted output x_l(n) followed by the PA output y_l(n) are obtained using (3) and (2), respectively. To capture the inverse of the nonlinear behavior of the PA, the scaled output of the PA, y_l'(n), is input to the RPEM based learning block, which generates the postdistorter signal \tilde{x}_l(n), similar to (3), as:

x~l(n)=𝚽~lT𝚿l,\displaystyle\tilde{x}_{l}(n)=\bm{\tilde{\Phi}}_{l}^{T}\bm{\Psi}_{l}^{{}^{\prime}}, (10)

where 𝚽~l\bm{\tilde{\Phi}}_{l} is the llth vector of the block vector 𝚽~\bm{\tilde{\Phi}}, and 𝚿l\bm{\Psi}_{l}^{{}^{\prime}} is the vector of BFs defined using the same polynomial terms as in 𝚿\bm{\Psi}; the only difference is that yl(n)y_{l}^{{}^{\prime}}(n), instead of s(n)s(n), is the input to the BFs. Thus, the iith element of the vector 𝚿l\bm{\Psi}_{l}^{{}^{\prime}} is ψl,i=ψi(yl(n))\psi_{l,i}^{{}^{\prime}}=\psi_{i}(y_{l}^{{}^{\prime}}(n)). The goal of the RPEM algorithm is to iteratively minimize the difference between the postdistorted signal x~l\tilde{x}_{l} in (10) and the predistorted signal xlx_{l} in (3) by optimizing 𝚽~l\bm{\tilde{\Phi}}_{l}. At the end of each iteration, 𝚽~l\bm{\tilde{\Phi}}_{l} is copied to 𝚽l\bm{\Phi}_{l}, i.e., 𝚽l=𝚽~l\bm{\Phi}_{l}=\bm{\tilde{\Phi}}_{l}, and using it, the algorithm tries to capture the inverse of the nonlinear characteristics of the PA. Thus, using the estimated value 𝚽^l\bm{\widehat{\Phi}}_{l} at convergence, the FF-DPD generates the optimal predistorted signal x^l\widehat{x}_{l} to linearize the llth PA.
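Since 𝚿 and 𝚿l′ share the same polynomial terms and differ only in their input signal, one routine can generate both. The sketch below assumes a memory-polynomial basis with hypothetical nonlinearity order K and memory depth M; the paper's exact BF set is fixed earlier and may differ.

```python
import numpy as np

def basis_functions(u, K=4, M=1):
    """Hypothetical memory-polynomial basis: ψ(u)[n] = u[n-m] |u[n-m]|^(k-1)
    for k = 1..K and m = 0..M, giving Q = (M+1)K basis signals."""
    N = len(u)
    rows = []
    for m in range(M + 1):
        # delay the input by m samples (zero-padded at the start)
        um = np.concatenate([np.zeros(m, dtype=complex), u[:N - m]])
        for k in range(1, K + 1):
            rows.append(um * np.abs(um) ** (k - 1))
    return np.stack(rows)  # shape (Q, N)
```

Feeding s(n) to this routine yields 𝚿 for the predistorter, while feeding yl′(n) yields 𝚿l′ for the training block, exactly as stated above.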

Now, we describe the process of minimizing the difference el(n)=xl(n)x~l(n)e_{l}(n)=x_{l}(n)-\tilde{x}_{l}(n) in each iteration. The cost function (𝚽~l)\mathcal{L}(\bm{\tilde{\Phi}}_{l}), defined as the average power of el(n)e_{l}(n) over a long horizon, is:

(𝚽~l)=limN1Nn=1N𝔼[|el(n)|2],\displaystyle\mathcal{L}(\bm{\tilde{\Phi}}_{l})=\textstyle\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\mathbb{E}\left[|e_{l}(n)|^{2}\right], (11)

From [40], (𝚽~l)\mathcal{L}(\bm{\tilde{\Phi}}_{l}) in (11) can be minimized using the negative gradient of el(n)e_{l}(n) with respect to 𝚽~l\bm{\tilde{\Phi}}_{l}, which is obtained as:

del(n)d𝚽~l=dx~l(n)d𝚽~l=d𝚽~lT𝚿ld𝚽~l=𝚿l.\displaystyle-\frac{\mathrm{d}e_{l}(n)}{\mathrm{d}\bm{\tilde{\Phi}}_{l}}=\frac{\mathrm{d}\tilde{x}_{l}(n)}{\mathrm{d}\bm{\tilde{\Phi}}_{l}}=\frac{\mathrm{d}\bm{\tilde{\Phi}}_{l}^{T}\bm{\Psi}^{{}^{\prime}}_{l}}{\mathrm{d}\bm{\tilde{\Phi}}_{l}}=\bm{\Psi}^{{}^{\prime}}_{l}. (12)

Using the gradient in (12), the training block performs the computations in (13) based on RPEM to obtain the trained coefficients 𝚽~l(n)\bm{\tilde{\Phi}}_{l}^{(n)} for the llth PA in the nnth iteration [40].

el(n)\displaystyle\!\!\!\!e_{l}(n) =xl(n)x~l(n),\displaystyle=x_{l}(n)-\tilde{x}_{l}(n),\!\!\!\!\!\!\! (13a)
ξl(n)\displaystyle\!\!\!\!\xi_{l}^{(n)} =ρξl(n1)+1ρ,\displaystyle=\rho\xi_{l}^{(n-1)}+1-\rho,\!\!\!\!\!\!\! (13b)
Zl(n)\displaystyle\!\!\!\!Z_{l}^{(n)} =𝚿lT(n)Pl(n1)𝚿l(n)+ξl(n),\displaystyle=\bm{\Psi}_{l}^{{}^{\prime}T}(n)P_{l}^{(n-1)}{\bm{\Psi}_{l}^{{}^{\prime}}}^{*}(n)+\xi_{l}^{(n)},\!\!\!\!\!\!\! (13c)
Pl(n)\displaystyle\!\!\!\!P_{l}^{(n)} =(Pl(n1)Pl(n1)𝚿l(n)Zl(n)1𝚿lT(n)Pl(n1))/ξl(n),\displaystyle=(P_{l}^{(n-1)}-P_{l}^{(n-1)}{\bm{\Psi}_{l}^{{}^{\prime}}}^{*}(n){Z_{l}^{(n)}}^{-1}\bm{\Psi}_{l}^{{}^{\prime}T}(n)P_{l}^{(n-1)})/\xi_{l}^{(n)},\!\!\!\!\!\!\! (13d)
𝚽~l(n)\displaystyle\!\!\!\!\bm{\tilde{\Phi}}_{l}^{(n)} =𝚽~l(n1)+Pl(n)𝚿l(n)el(n),\displaystyle=\bm{\tilde{\Phi}}_{l}^{(n-1)}+P_{l}^{(n)}{\bm{\Psi}_{l}^{{}^{\prime}}}^{*}(n)e_{l}(n),\!\!\!\!\!\!\! (13e)

Here, el(n)e_{l}(n) is computed in (13a). Thereafter, the forgetting factor ξl\xi_{l} is updated recursively in (13b) using its value in the previous iteration and the growth rate ρ\rho. The initial value is ξl(0)=λ0\xi_{l}^{(0)}=\lambda_{0}, and ξl\xi_{l} grows exponentially to 11 with the iterations. Using ξl\xi_{l}, the BF vector 𝚿l\bm{\Psi}_{l}^{{}^{\prime}}, and the covariance matrix PlP_{l}, the scalar ZlZ_{l} is computed in (13c). The initial value is Pl(0)=μ𝑰P_{l}^{(0)}=\mu\bm{I}, where 𝑰\bm{I} is the identity matrix and μ\mu is a constant. Finally, the matrix PlP_{l} is updated in (13d), and 𝚽~l\bm{\tilde{\Phi}}_{l} is determined recursively in (13e) at the end of the iteration. Moreover, since ZlZ_{l} is a scalar, RPEM is free from the complex matrix inversion operations required in, e.g., an LS estimation. Now, using the above study for the linearization of the llth PA, we realize the full linearization of the PAs of a subarray.
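One iteration of the recursion (13a)-(13e) for a single PA can be sketched as follows; the function assumes the basis vector 𝚿l′(n) and the error el(n) are already computed, and all variable names are illustrative.

```python
import numpy as np

def rpem_step(phi, P, xi, psi, e, rho):
    """One RPEM update following (13b)-(13e) for the l-th PA.
    phi: coefficients (Q,), P: covariance (Q,Q), xi: forgetting factor,
    psi: basis vector Ψ'_l(n) (Q,), e: error e_l(n) from (13a), rho: growth rate."""
    xi = rho * xi + 1.0 - rho                               # (13b)
    Z = psi @ P @ psi.conj() + xi                           # (13c), a scalar
    P = (P - np.outer(P @ psi.conj(), psi @ P) / Z) / xi    # (13d)
    phi = phi + (P @ psi.conj()) * e                        # (13e)
    return phi, P, xi
```

Since Z is a scalar, the update involves only scalar divisions, never a matrix inversion.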

III-B FF-DPD to Linearize a Subarray using ILA-RPEM

As described earlier, the structure of the FF-DPD in Fig. 3(a) is similar to assigning an individual DPD, as in Fig. 1, to each of the PAs. Its complexity is reduced only by utilizing the same set of QQ BFs for a given type of PAs. Therefore, using the FF-DPD, the full linearization of SS PAs through ILA-RPEM is the same as the parallel linearization of each of the SS PAs using the same process as for the llth PA (cf. Section III-A). For the parallel operations, the different parameters are arranged in matrix form as follows. We define 𝑬=[e1,,eS]T\bm{E}=[e_{1},\cdots,e_{S}]^{T}, 𝝃diag(ξ1,,ξS)\bm{\xi}\triangleq\text{diag}(\xi_{1},\cdots,\xi_{S}), 𝒁diag(Z1,,ZS)\bm{Z}\triangleq\text{diag}(Z_{1},\cdots,Z_{S}), 𝚼diag(𝚿1,,𝚿S)\bm{\Upsilon}\triangleq\text{diag}(\bm{\Psi}_{1}^{{}^{\prime}},\cdots,\bm{\Psi}_{S}^{{}^{\prime}}), 𝑷diag(P1,,PS)\bm{P}\triangleq\text{diag}(P_{1},\cdots,P_{S}), and 𝚵diag(ξ1𝑰Q,,ξS𝑰Q)\bm{\Xi}\triangleq\text{diag}(\xi_{1}\bm{I}_{Q},\cdots,\xi_{S}\bm{I}_{Q}). Here, diag()\text{diag}(\cdot) denotes the (block) diagonal matrix constructed from the input scalar or matrix elements.

Algorithm 1 Iterative estimation of coefficients for FF-DPD.
1:The values of ρ\rho, λ0\lambda_{0}, μ\mu, and 𝚽~(0)\bm{\tilde{\Phi}}^{(0)}
2:The estimated coefficient vector 𝚽^\bm{\widehat{\Phi}}
3:𝑷(0)=diag(μ𝑰Q,,μ𝑰QS times)\bm{P}^{(0)}=\text{diag}(\underbrace{\mu\bm{I}_{Q},\cdots,\mu\bm{I}_{Q}}_{S\text{ times}})
4:𝝃(0)=λ0𝑰S\bm{\xi}^{(0)}=\lambda_{0}\bm{I}_{S}
5:Assign 𝚽(0)=𝚽~(0)\bm{\Phi}^{(0)}=\bm{\tilde{\Phi}}^{(0)} and compute 𝑿(1)\bm{X}(1) using (3), then determine 𝚼(1)\bm{\Upsilon}(1) using the outputs 𝒀(1)\bm{Y}(1) of the PAs
6:𝑿~(1)=𝚼(1)T𝚽~(0)\bm{\tilde{X}}(1)=\bm{\Upsilon}(1)^{T}\bm{\tilde{\Phi}}^{(0)}
7:n=1n=1
8:repeat
9:    𝑬(n)=𝑿(n)𝑿~(n)\bm{E}(n)=\bm{X}(n)-\bm{\tilde{X}}(n)
10:    𝝃(n)=ρ𝝃(n1)+𝑰Sρ𝑰S\bm{\xi}^{(n)}=\rho\bm{\xi}^{(n-1)}+\bm{I}_{S}-\rho\bm{I}_{S}
11:    𝒁(n)=𝚼T(n)𝑷(n1)𝚼(n)+𝝃(n)\bm{Z}^{(n)}=\bm{\Upsilon}^{T}(n)\bm{P}^{(n-1)}\bm{\Upsilon}^{*}(n)+\bm{\xi}^{(n)}
12:    𝑷(n)=(𝑷(n1)𝑷(n1)𝚼(n)𝒁(n)1𝚼T(n)𝑷(n1))𝚵(n)1\bm{P}^{(n)}=(\bm{P}^{(n-1)}-\bm{P}^{(n-1)}\bm{\Upsilon}^{*}(n){\bm{Z}^{(n)}}^{-1}\bm{\Upsilon}^{T}(n)\bm{P}^{(n-1)}){\bm{\Xi}^{(n)}}^{-1}
13:    𝚽~(n)=𝚽~(n1)+𝑷(n)𝚼(n)𝑬(n)\bm{\tilde{\Phi}}^{(n)}=\bm{\tilde{\Phi}}^{(n-1)}+\bm{P}^{(n)}\bm{\Upsilon}^{*}(n)\bm{E}(n)
14:    Assign 𝚽(n)=𝚽~(n)\bm{\Phi}^{(n)}=\bm{\tilde{\Phi}}^{(n)} and compute 𝑿(n+1)\bm{X}(n+1) using (3), then find 𝚼(n+1)\bm{\Upsilon}(n+1) using the outputs 𝒀(n+1)\bm{Y}(n+1) of the PAs
15:    𝑿~(n+1)=𝚼(n+1)T𝚽~(n)\bm{\tilde{X}}(n+1)=\bm{\Upsilon}(n+1)^{T}\bm{\tilde{\Phi}}^{(n)}
16:    n=n+1n=n+1
17:until 𝚽~(n)\bm{\tilde{\Phi}}^{(n)} converges
18:𝚽^=𝚽~(n)\bm{\widehat{\Phi}}=\bm{\tilde{\Phi}}^{(n)}

In Algorithm 1, values are first assigned to the independent parameters ρ\rho, λ0\lambda_{0}, and μ\mu. Then, the initial values of the covariance matrix 𝑷\bm{P} and the forgetting factor 𝝃\bm{\xi} are computed in Steps 3 and 4. Thereafter, the calculations from Step 9 to Step 15 are repeated until the coefficient vector 𝚽~\bm{\tilde{\Phi}} converges. Lastly, we obtain the estimated coefficient vector 𝚽~^\bm{\widehat{\tilde{\Phi}}} in Step 18, which is copied to the FF-DPD, i.e., 𝚽^=𝚽~^\bm{\widehat{\Phi}}=\bm{\widehat{\tilde{\Phi}}}.

Performance

If we examine Step 13 of Algorithm 1, 𝚽\bm{\Phi} is iteratively estimated using the correlation matrix 𝑷\bm{P}. Therefore, the coefficients in the vector 𝚽l\bm{\Phi}_{l} of the block vector 𝚽\bm{\Phi} are correlated with each other to provide the optimal predistorted signal xlx_{l} to linearize the llth PA. As each PA is provided a separate predistorted signal, the FF-DPD gives the best performance.

Complexity

As matrix multiplications dominate the complexity of an algorithm [41], we consider Steps 11, 12, and 13 to determine the computational complexity of Algorithm 1 in an iteration. In Step 11, the matrices 𝚼\bm{\Upsilon} and 𝑷\bm{P} have the sizes QS×SQS\times S and QS×QSQS\times QS, respectively. The computational complexity of 𝚼T𝑷\bm{\Upsilon}^{T}\bm{P} is O(SQSQS)=O(Q2S3)O(SQSQS)=O(Q^{2}S^{3}). The matrix 𝚼T𝑷\bm{\Upsilon}^{T}\bm{P} has the size S×QSS\times QS and is further multiplied by 𝚼\bm{\Upsilon}^{*} with complexity O(SQSS)=O(QS3)O(SQSS)=O(QS^{3}). Thus, the total complexity of Step 11 is O(Q2S3+QS3)O(Q^{2}S^{3}+QS^{3}). Similarly, the complexities of Steps 12 and 13 are O(2Q3S3+2Q2S3+QS3)O(2Q^{3}S^{3}+2Q^{2}S^{3}+QS^{3}) and O(Q2S3+QS2)O(Q^{2}S^{3}+QS^{2}), respectively. (Note that the algorithm is still free from matrix inversion operations: although 𝒁\bm{Z} and 𝚵\bm{\Xi} are matrices, they are diagonal, so their inverses require only the inverses of their diagonal elements.) Among these operations, the highest-order term is Q3S3Q^{3}S^{3}; therefore, per iteration, the computational complexity of the algorithm is O(Q3S3)O(Q^{3}S^{3}).

Lemma 2.

Using the property of diagonal matrix multiplication, the complexity of Algorithm 1 is reduced by a factor of S2S^{2}.

Proof:

The direct multiplication of diagonal matrices is computationally inefficient due to the redundant multiplications with the 0 entries. For example, two diagonal matrices of size 2×22\times 2 satisfy diag(a1,a2)diag(b1,b2)=diag(a1b1,a2b2)\text{diag}(a_{1},a_{2})\text{diag}(b_{1},b_{2})=\text{diag}(a_{1}b_{1},a_{2}b_{2}), where aia_{i} and bib_{i}; i{1,2}i\in\{1,2\} are scalars. With conventional matrix multiplication, the total number of multiplications is 232^{3}, whereas the diagonal multiplication requires only the 22 multiplications of the diagonal entries. Thus, the former method performs 66 redundant multiplications. The same multiplication property applies if aia_{i} and bib_{i} are matrices, provided their sizes admit the products aibia_{i}b_{i}. If we employ it in the matrix multiplication 𝚼T𝑷𝚼\bm{\Upsilon}^{T}\bm{P}\bm{\Upsilon}^{*} of Step 11, there are 2S2S multiplications of the respective diagonal elements. Further, in the llth diagonal multiplication, the matrix product is 𝚿lTPl𝚿l\bm{\Psi}_{l}^{{}^{\prime}T}P_{l}{\bm{\Psi}_{l}^{{}^{\prime}}}^{*} with O(Q2+Q)O(Q^{2}+Q) multiplications. Thus, the total number of multiplications is O(Q2S+QS)O(Q^{2}S+QS). Similarly, the computational complexities of Steps 12 and 13 are O(2Q3S+2Q2S+QS)O(2Q^{3}S+2Q^{2}S+QS) and O(Q2S+QS)O(Q^{2}S+QS), respectively. Thus, per iteration, the computational complexity is O(Q3S)O(Q^{3}S), which is S2S^{2} times less than that of the conventional method. ∎
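The saving stated in the lemma can be checked numerically: forming 𝚼T𝑷𝚼* from dense block-diagonal matrices and computing the same quantity block by block give identical results, while the latter touches only the S diagonal blocks. The sizes and random values below are arbitrary test data.

```python
import numpy as np

S, Q = 4, 3
rng = np.random.default_rng(1)
Psis = [rng.standard_normal(Q) + 1j * rng.standard_normal(Q) for _ in range(S)]
Ps = [rng.standard_normal((Q, Q)) for _ in range(S)]

# Dense block-diagonal Υ (QS x S) and P (QS x QS), as used in Step 11.
Upsilon = np.zeros((Q * S, S), dtype=complex)
P = np.zeros((Q * S, Q * S))
for l in range(S):
    Upsilon[l * Q:(l + 1) * Q, l] = Psis[l]
    P[l * Q:(l + 1) * Q, l * Q:(l + 1) * Q] = Ps[l]

dense = Upsilon.T @ P @ Upsilon.conj()          # conventional O(Q^2 S^3) route
blocks = np.diag([Psis[l] @ Ps[l] @ Psis[l].conj() for l in range(S)])  # O(Q^2 S)
```

The `blocks` computation performs S small products of size Q instead of dense products of size QS, which is the source of the S² saving.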

III-C Realization of Learning for LC-DPD using FF-DPD

To learn the LC-DPD coefficient vector 𝚽¯\bm{\overline{\Phi}}, we exploit the learning process of the FF-DPD. In this regard, the vector 𝚽¯\bm{\overline{\Phi}} is converted into the FF-DPD coefficient vector 𝚽\bm{\Phi} and vice versa. To realize the conversions mathematically, we define two operators, 𝕸1\bm{\mathfrak{M}}_{1} and 𝕸2\bm{\mathfrak{M}}_{2}.

Definition 2 (A Linear Operator 𝔐1\bm{\mathfrak{M}}_{1}).

The function f1()f_{1}(\cdot) that transforms the shape of the coefficient vector 𝚽¯\bm{\overline{\Phi}} into the shape of 𝚽\bm{\Phi} as expressed in (14a) is a linear operator 𝕸1\bm{\mathfrak{M}}_{1} as defined in (14b).

𝚽=f1(𝚽¯)=𝕸1𝚽¯,\displaystyle\!\!\!\!\!\bm{\Phi}=f_{1}(\bm{\overline{\Phi}})=\bm{\mathfrak{M}}_{1}\bm{\overline{\Phi}},\!\! (14a)
𝕸1[𝑴11𝑴1g𝑴S1𝑴Sg],𝑴ij=[m11ijm1LjijmQ1ijmQLjij]\displaystyle\!\!\!\!\!\bm{\mathfrak{M}}_{1}\triangleq\begin{bmatrix}\bm{M}_{11}&\cdots&\bm{M}_{1g}\\ \vdots&\ddots&\vdots\\ \bm{M}_{S1}&\cdots&\bm{M}_{Sg}\end{bmatrix}\!\!,\bm{M}_{ij}=\begin{bmatrix}m_{11}^{ij}&\cdots&m_{1L_{j}}^{ij}\\ \vdots&\ddots&\vdots\\ m_{Q1}^{ij}&\cdots&m_{QL_{j}}^{ij}\end{bmatrix}\!\! (14b)

where Lj=njSr(ν+j1)L_{j}=n_{j}Sr^{(\nu+j-1)}, muvij{0,1}m_{uv}^{ij}\in\{0,1\} for u{1,,Q}u\in\{1,\cdots,Q\}, v{1,,Lj}v\in\{1,\cdots,L_{j}\}, i{1,,S}i\in\{1,\cdots,S\} and j{1,,g}j\in\{1,\cdots,g\}. Here, muvij=1m_{uv}^{ij}=1 indicates that after performing the operation in (14a), the vvth element of vector 𝚽¯j\bm{\overline{\Phi}}_{j} is assigned to the uuth element of vector 𝚽i\bm{\Phi}_{i}. Furthermore, the operator 𝕸1\bm{\mathfrak{M}}_{1} has the following two properties.

  • (i)

    In each row vector of the matrix 𝕸1\bm{\mathfrak{M}}_{1}, only one element is 11 and the remaining elements are 0.

  • (ii)

    The sum of the elements in the zzth column vector of the operator gives the number of repetitions of the zzth element of vector 𝚽¯\bm{\overline{\Phi}} in the vector 𝚽\bm{\Phi}. Also, if the zzth element of 𝚽¯\bm{\overline{\Phi}} lies in the jjth vector 𝚽¯j\bm{\overline{\Phi}}_{j} of the block vector 𝚽¯\bm{\overline{\Phi}}, then the number of repetitions of the zzth coefficient of 𝚽¯\bm{\overline{\Phi}} in 𝚽\bm{\Phi} is r(ν+j1)r^{-(\nu+j-1)}, which equals the sum of the elements of the zzth column vector of 𝕸1\bm{\mathfrak{M}}_{1}.

Example:  From Fig. 3(b), the block vector 𝚽¯=[𝚽¯1T,𝚽¯2T]T\bm{\overline{\Phi}}=[\bm{\overline{\Phi}}_{1}^{T},\bm{\overline{\Phi}}_{2}^{T}]^{T}, where 𝚽¯1=[ϕ¯1,1,ϕ¯3,1,ϕ¯1,2,ϕ¯3,2]T\bm{\overline{\Phi}}_{1}=[\overline{\phi}_{1,1},\overline{\phi}_{3,1},\overline{\phi}_{1,2},\overline{\phi}_{3,2}]^{T}, and 𝚽¯2=[ϕ¯1,3,ϕ¯1,4]T\bm{\overline{\Phi}}_{2}=[\overline{\phi}_{1,3},\overline{\phi}_{1,4}]^{T}. Here, the parameters for LC-DPD are: ν=1\nu=1, r=1/2r=1/2, g=2g=2, n1=n2=2n_{1}=n_{2}=2, Q=4Q=4, and S=4S=4. To reshape 𝚽¯\bm{\overline{\Phi}} into the shape of 𝚽=[𝚽1T,𝚽2T,𝚽3T,𝚽4T]T\bm{\Phi}=[\bm{\Phi}_{1}^{T},\bm{\Phi}_{2}^{T},\bm{\Phi}_{3}^{T},\bm{\Phi}_{4}^{T}]^{T} as for FF-DPD in Fig. 3(a), we use the operator 𝔐1\mathfrak{M}_{1} as given below, where 𝚽i=[ϕi,1,ϕi,2,ϕi,3,ϕi,4]T\bm{\Phi}_{i}=[\phi_{i,1},\phi_{i,2},\phi_{i,3},\phi_{i,4}]^{T} for i{1,2,3,4}i\in\{1,2,3,4\}.

𝕸1=[𝑴11𝑴12𝑴21𝑴22𝑴31𝑴32𝑴41𝑴42]=[100000100000000000001001100000100000000000001001010000010000000000001001010000010000000000001001]\displaystyle\bm{\mathfrak{M}}_{1}=\begin{bmatrix}\bm{M}_{11}&\bm{M}_{12}\\ \bm{M}_{21}&\bm{M}_{22}\\ \bm{M}_{31}&\bm{M}_{32}\\ \bm{M}_{41}&\bm{M}_{42}\end{bmatrix}=\footnotesize\left[\begin{array}[]{@{}c|c@{}}\begin{matrix}1&0&0&0\\ 0&0&1&0\\ 0&0&0&0\\ 0&0&0&0\end{matrix}&\begin{matrix}0&0\\ 0&0\\ 1&0\\ 0&1\end{matrix}\\ \hline\cr\begin{matrix}1&0&0&0\\ 0&0&1&0\\ 0&0&0&0\\ 0&0&0&0\end{matrix}&\begin{matrix}0&0\\ 0&0\\ 1&0\\ 0&1\end{matrix}\\ \hline\cr\begin{matrix}0&1&0&0\\ 0&0&0&1\\ 0&0&0&0\\ 0&0&0&0\end{matrix}&\begin{matrix}0&0\\ 0&0\\ 1&0\\ 0&1\end{matrix}\\ \hline\cr\begin{matrix}0&1&0&0\\ 0&0&0&1\\ 0&0&0&0\\ 0&0&0&0\end{matrix}&\begin{matrix}0&0\\ 0&0\\ 1&0\\ 0&1\end{matrix}\end{array}\right] (19)

In (19), 𝕸1\bm{\mathfrak{M}}_{1} satisfies property (i), as each of its row vectors contains one element equal to 11 and the remaining are 0. Next, to verify property (ii), for instance, the sum of the elements of the 55th (z=5)(z=5) column vector is 44, which entails that the 55th coefficient ϕ¯1,3\overline{\phi}_{1,3} of 𝚽¯\bm{\overline{\Phi}} repeats 44 times in 𝚽\bm{\Phi}. It can also be determined using the expression r(ν+j1)r^{-(\nu+j-1)} with j=2j=2, since the 55th (z=5)(z=5) column vector lies in the 22nd (j=2)(j=2) group, and it gives 44 after substituting the values of the parameters. Besides, the block vector 𝚽=[𝚽1T,𝚽2T,𝚽3T,𝚽4T]T\bm{\Phi}=[\bm{\Phi}_{1}^{T},\bm{\Phi}_{2}^{T},\bm{\Phi}_{3}^{T},\bm{\Phi}_{4}^{T}]^{T} is obtained using the computation in (14a), where 𝚽1=[ϕ¯1,1,ϕ¯1,2,ϕ¯1,3,ϕ¯1,4]T\bm{\Phi}_{1}=[\overline{\phi}_{1,1},\overline{\phi}_{1,2},\overline{\phi}_{1,3},\overline{\phi}_{1,4}]^{T}, 𝚽2=[ϕ¯1,1,ϕ¯1,2,ϕ¯1,3,ϕ¯1,4]T\bm{\Phi}_{2}=[\overline{\phi}_{1,1},\overline{\phi}_{1,2},\overline{\phi}_{1,3},\overline{\phi}_{1,4}]^{T}, 𝚽3=[ϕ¯3,1,ϕ¯3,2,ϕ¯1,3,ϕ¯1,4]T\bm{\Phi}_{3}=[\overline{\phi}_{3,1},\overline{\phi}_{3,2},\overline{\phi}_{1,3},\overline{\phi}_{1,4}]^{T}, and 𝚽4=[ϕ¯3,1,ϕ¯3,2,ϕ¯1,3,ϕ¯1,4]T\bm{\Phi}_{4}=[\overline{\phi}_{3,1},\overline{\phi}_{3,2},\overline{\phi}_{1,3},\overline{\phi}_{1,4}]^{T}. \Box

Definition 3 (A Linear Operator 𝔐2\bm{\mathfrak{M}}_{2}).

The function f2()f_{2}(\cdot) that transforms the shape of the coefficient vector 𝚽\bm{\Phi} into the shape of 𝚽¯\bm{\overline{\Phi}} as expressed in (20a), where some of the elements of 𝚽¯\bm{\overline{\Phi}} are the averages of some of the elements of 𝚽\bm{\Phi}, is a linear operator 𝕸2\bm{\mathfrak{M}}_{2} as defined in (20b).

𝚽¯=f2(𝚽)=𝕸2𝚽,\displaystyle\!\!\!\!\!\bm{\overline{\Phi}}=f_{2}(\bm{\Phi})=\bm{\mathfrak{M}}_{2}\bm{\Phi},\!\! (20a)
𝕸2[𝑴11𝑴1S𝑴g1𝑴gS],𝑴ij=[m11ijm1QijmLi1ijmLiQij]\displaystyle\!\!\!\!\!\bm{\mathfrak{M}}_{2}\triangleq\begin{bmatrix}\bm{M}_{11}&\cdots&\bm{M}_{1S}\\ \vdots&\ddots&\vdots\\ \bm{M}_{g1}&\cdots&\bm{M}_{gS}\end{bmatrix}\!\!,\bm{M}_{ij}=\begin{bmatrix}m_{11}^{ij}&\cdots&m_{1Q}^{ij}\\ \vdots&\ddots&\vdots\\ m_{L_{i}1}^{ij}&\cdots&m_{L_{i}Q}^{ij}\end{bmatrix}\!\! (20b)
{muv1ij1==muvNiijNi=1/Ni;for ϕ¯i,u=1/Nit=1Niϕjt,vt,muvij=0; Otherwise,\displaystyle\!\!\!\!\!\left\{\begin{array}[]{cc}m_{uv_{1}}^{ij_{1}}=\cdots=m_{uv_{N_{i}}}^{ij_{N_{i}}}=1/{N_{i}};&\text{for }\overline{\phi}_{i,u}=1/{N_{i}}\sum_{t=1}^{N_{i}}\phi_{j_{t},v_{t}},\\ m_{uv}^{ij}=0;&\text{ Otherwise},\end{array}\right.\!\! (20e)

where Li=niSr(ν+i1)L_{i}=n_{i}Sr^{(\nu+i-1)}, Ni=r(ν+i1)N_{i}=r^{-(\nu+i-1)}, u{1,,Li}u\in\{1,\cdots,L_{i}\}, v,v1,v2,{1,,Q}v,v_{1},v_{2},\cdots\in\{1,\cdots,Q\}, i{1,,g}i\in\{1,\cdots,g\}, and j{1,,S}j\in\{1,\cdots,S\}. The values of the elements of the matrix 𝕸2\bm{\mathfrak{M}}_{2} are determined using (20e), based on the relationship between the elements of the vectors 𝚽¯\bm{\overline{\Phi}} and 𝚽\bm{\Phi}. Besides, the operator 𝕸2\bm{\mathfrak{M}}_{2} has the following two properties.

  • (i)

    In each row vector of the matrix 𝕸2\bm{\mathfrak{M}}_{2}, the sum of the elements is 11.

  • (ii)

    Each column vector of 𝕸2\bm{\mathfrak{M}}_{2} has only one nonzero element which takes the value 1/Ni1/{N_{i}}.

Example:  If we consider Fig. 3 for this example, from Fig. 3(a), the block vector 𝚽\bm{\Phi} for FF-DPD is expressed as 𝚽=[𝚽1T,𝚽2T,𝚽3T,𝚽4T]T\bm{\Phi}=[\bm{\Phi}_{1}^{T},\bm{\Phi}_{2}^{T},\bm{\Phi}_{3}^{T},\bm{\Phi}_{4}^{T}]^{T}, where 𝚽j=[ϕj,1,ϕj,2,ϕj,3,ϕj,4]T\bm{\Phi}_{j}=[\phi_{j,1},\phi_{j,2},\phi_{j,3},\phi_{j,4}]^{T} for S=4S=4 and Q=4Q=4. To transform 𝚽\bm{\Phi} into the shape of 𝚽¯=[𝚽¯1T,𝚽¯2T]T\bm{\overline{\Phi}}=[\bm{\overline{\Phi}}_{1}^{T},\bm{\overline{\Phi}}_{2}^{T}]^{T} as for the LC-DPD in Fig. 3(b), we use the operator 𝔐2\mathfrak{M}_{2} as given below, where 𝚽¯1=[ϕ¯1,1,ϕ¯3,1,ϕ¯1,2,ϕ¯3,2]T\bm{\overline{\Phi}}_{1}=[\overline{\phi}_{1,1},\overline{\phi}_{3,1},\overline{\phi}_{1,2},\overline{\phi}_{3,2}]^{T}, and 𝚽¯2=[ϕ¯1,3,ϕ¯1,4]T\bm{\overline{\Phi}}_{2}=[\overline{\phi}_{1,3},\overline{\phi}_{1,4}]^{T}.

𝕸2=[120000000012000000120000000012000000000012000000001200000012000000001200001400001400140000140014000014001400001400140000140014000014]\displaystyle\bm{\mathfrak{M}}_{2}=\footnotesize\left[\begin{array}[]{@{}c|c|c|c@{}}\begin{matrix}\frac{1}{2}&0&0&0\\ 0&0&0&0\\ 0&\frac{1}{2}&0&0\\ 0&0&0&0\end{matrix}&\begin{matrix}\frac{1}{2}&0&0&0\\ 0&0&0&0\\ 0&\frac{1}{2}&0&0\\ 0&0&0&0\end{matrix}&\begin{matrix}0&0&0&0\\ \frac{1}{2}&0&0&0\\ 0&0&0&0\\ 0&\frac{1}{2}&0&0\end{matrix}&\begin{matrix}0&0&0&0\\ \frac{1}{2}&0&0&0\\ 0&0&0&0\\ 0&\frac{1}{2}&0&0\end{matrix}\\ \hline\cr\begin{matrix}0&0&\frac{1}{4}&0\\ 0&0&0&\frac{1}{4}\end{matrix}&\begin{matrix}0&0&\frac{1}{4}&0\\ 0&0&0&\frac{1}{4}\end{matrix}&\begin{matrix}0&0&\frac{1}{4}&0\\ 0&0&0&\frac{1}{4}\end{matrix}&\begin{matrix}0&0&\frac{1}{4}&0\\ 0&0&0&\frac{1}{4}\end{matrix}\end{array}\right] (23)

In (23), 𝕸2\bm{\mathfrak{M}}_{2} satisfies property (i), as the sum of the elements in each of its row vectors is 11. Further, in each column vector, only one element is nonzero and its value is 1/Ni1/{N_{i}}. For instance, in the second column vector, the third element is nonzero, and for it, the parameter i=1i=1; so, N1=(1/2)(1+11)=2N_{1}=(1/2)^{-(1+1-1)}=2 and the value of the element is 1/N1=1/21/N_{1}=1/2. Thus, property (ii) is also satisfied by 𝕸2\bm{\mathfrak{M}}_{2}. Moreover, after performing the operation in (20a), the relationship between the elements of 𝚽¯\bm{\overline{\Phi}} and 𝚽\bm{\Phi} can be expressed as: ϕ¯1,1=(ϕ1,1+ϕ2,1)/2\overline{\phi}_{1,1}=(\phi_{1,1}+\phi_{2,1})/2, ϕ¯3,1=(ϕ3,1+ϕ4,1)/2\overline{\phi}_{3,1}=(\phi_{3,1}+\phi_{4,1})/2, ϕ¯1,2=(ϕ1,2+ϕ2,2)/2\overline{\phi}_{1,2}=(\phi_{1,2}+\phi_{2,2})/2, ϕ¯3,2=(ϕ3,2+ϕ4,2)/2\overline{\phi}_{3,2}=(\phi_{3,2}+\phi_{4,2})/2, ϕ¯1,3=(ϕ1,3+ϕ2,3+ϕ3,3+ϕ4,3)/4\overline{\phi}_{1,3}=(\phi_{1,3}+\phi_{2,3}+\phi_{3,3}+\phi_{4,3})/4, and ϕ¯1,4=(ϕ1,4+ϕ2,4+ϕ3,4+ϕ4,4)/4\overline{\phi}_{1,4}=(\phi_{1,4}+\phi_{2,4}+\phi_{3,4}+\phi_{4,4})/4. \Box
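For this example, both operators reduce to copy and average operations, which can be verified numerically. The sketch below builds 𝔐1 from the column pattern of (19), derives 𝔐2 from it by dividing each copy pattern by its count Ni (consistent with (23)), and performs the round trip; the coefficient values are stand-ins.

```python
import numpy as np

# Ordering of Φ̄ = [ϕ̄11, ϕ̄31, ϕ̄12, ϕ̄32, ϕ̄13, ϕ̄14]: PAs 1,2 share (ϕ̄11, ϕ̄12),
# PAs 3,4 share (ϕ̄31, ϕ̄32), and (ϕ̄13, ϕ̄14) is common to all four PAs.
cols = {0: [0, 2, 4, 5], 1: [0, 2, 4, 5], 2: [1, 3, 4, 5], 3: [1, 3, 4, 5]}
M1 = np.zeros((16, 6))
for i in range(4):                    # PA index
    for u, v in enumerate(cols[i]):   # row u of Φ_i takes element v of Φ̄
        M1[4 * i + u, v] = 1.0

# M2 divides each column pattern of M1 by N_i, the number of copies.
M2 = M1.T / M1.sum(axis=0)[:, None]

phi_bar = np.arange(1.0, 7.0)         # stand-in values for the 6 coefficients
Phi = M1 @ phi_bar                    # (14a): expand Φ̄ to the FF-DPD shape
phi_bar_back = M2 @ Phi               # (20a): average back to the LC-DPD shape
```

Because each row of 𝔐1 has a single 1, the composition 𝔐2𝔐1 is the identity, so the averaging exactly undoes the expansion.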

Algorithm 2 Estimation of coefficients for LC-DPD (Method-I).
1:The values of ρ\rho, λ0\lambda_{0}, μ\mu, 𝕸1\bm{\mathfrak{M}}_{1}, 𝕸2\bm{\mathfrak{M}}_{2}, and 𝚽¯~(0)\bm{\tilde{\overline{\Phi}}}^{(0)}
2:The estimated coefficient vector 𝚽¯^\bm{\widehat{\overline{\Phi}}}
3:Determine the initial values 𝑷(0)\bm{P}^{(0)} and 𝝃(0)\bm{\xi}^{(0)} using Steps 3 and 4 of Algorithm 1
4:Find 𝚽~(0)=𝔐1𝚽¯~(0)\bm{\tilde{\Phi}}^{(0)}=\mathfrak{M}_{1}\bm{\tilde{\overline{\Phi}}}^{(0)}, then, assign 𝚽(0)=𝚽~(0)\bm{\Phi}^{(0)}=\bm{\tilde{\Phi}}^{(0)} and compute 𝑿(1)\bm{X}(1), 𝚼(1)\bm{\Upsilon}(1), and 𝑿~(1)\bm{\tilde{X}}(1) as in Steps 5 and 6 of Algorithm 1
5:n=1n=1
6:repeat
7:    Compute 𝑬(n)\bm{E}(n), 𝝃(n)\bm{\xi}^{(n)}, 𝒁(n)\bm{Z}^{(n)}, 𝑷(n)\bm{P}^{(n)}, and 𝚽~(n)\bm{\tilde{\Phi}}^{(n)} from Step 9 to Step 13 of Algorithm 1.
8:    Using operator 𝕸2\bm{\mathfrak{M}}_{2} in (20b), compute 𝚽¯~(n)=𝕸2𝚽~(n)\bm{\tilde{\overline{\Phi}}}^{(n)}=\bm{\mathfrak{M}}_{2}\bm{\tilde{\Phi}}^{(n)}
9:    Then, using 𝕸1\bm{\mathfrak{M}}_{1} in (14b), compute 𝚽~(n)=𝕸1𝚽¯~(n)\bm{\tilde{\Phi}}^{(n)}=\bm{\mathfrak{M}}_{1}\bm{\tilde{\overline{\Phi}}}^{(n)}
10:    Assign 𝚽(n)=𝚽~(n)\bm{\Phi}^{(n)}=\bm{\tilde{\Phi}}^{(n)} and compute 𝑿(n+1)\bm{X}(n+1), 𝚼(n+1)\bm{\Upsilon}(n+1) as in Step 14 of Algorithm 1
11:    𝑿~(n+1)=𝚼(n+1)T𝚽~(n)\bm{\tilde{X}}(n+1)=\bm{\Upsilon}(n+1)^{T}\bm{\tilde{\Phi}}^{(n)}
12:    n=n+1n=n+1
13:until 𝚽¯~(n)\bm{\tilde{\overline{\Phi}}}^{(n)} converges
14:𝚽¯^=𝚽¯~(n)\bm{\widehat{\overline{\Phi}}}=\bm{\tilde{\overline{\Phi}}}^{(n)}

Now, using Algorithm 2, we train the coefficient vector 𝚽¯~\bm{\tilde{\overline{\Phi}}} of the LC-DPD as follows. Apart from the parameters ρ\rho, λ0\lambda_{0}, μ\mu, and 𝚽¯~(0)\bm{\tilde{\overline{\Phi}}}^{(0)}, the values of the operators 𝕸1\bm{\mathfrak{M}}_{1} and 𝕸2\bm{\mathfrak{M}}_{2} are also input to the algorithm. As we realize the training of 𝚽¯~\bm{\tilde{\overline{\Phi}}} by exploiting the training of 𝚽~\bm{\tilde{\Phi}}, the steps of Algorithm 2 are the same as those of Algorithm 1 except Steps 4, 8, and 9. In Step 4, using the operator 𝔐1\mathfrak{M}_{1}, 𝚽¯~(0)\bm{\tilde{\overline{\Phi}}}^{(0)} is converted into 𝚽~(0)\bm{\tilde{\Phi}}^{(0)} to compute the other initial values of the parameters for the FF-DPD structure. Steps 8 and 9 enforce the learning of the FF-DPD coefficients in 𝚽~\bm{\tilde{\Phi}} to incorporate the repetitive characteristic of the LC-DPD coefficients in 𝚽¯~\bm{\tilde{\overline{\Phi}}} in each iteration. The forward process in Step 8 takes the average of those coefficients of 𝚽~\bm{\tilde{\Phi}} that are repeated in the LC-DPD structure and assigns it to a coefficient of 𝚽¯~\bm{\tilde{\overline{\Phi}}} (cf. example of Definition 3). In the backward process in Step 9, this coefficient of 𝚽¯~\bm{\tilde{\overline{\Phi}}} is again repeated in 𝚽~\bm{\tilde{\Phi}} (cf. example of Definition 2). Thus, the repeated coefficients in 𝚽~\bm{\tilde{\Phi}} are enforced to have equal values in each iteration, while the learning itself is based on the FF-DPD. Finally, after the convergence, the algorithm returns the estimated LC-DPD coefficient vector 𝚽¯^\bm{\widehat{\overline{\Phi}}}.

Performance and Complexity

In Algorithm 2, the LC-DPD coefficient vector 𝚽¯\bm{\overline{\Phi}} is trained by exploiting the training of the FF-DPD coefficient vector 𝚽\bm{\Phi}, where, using the operator 𝔐2\mathfrak{M}_{2}, the common coefficients in 𝚽¯\bm{\overline{\Phi}} are obtained by averaging some of the coefficients in 𝚽\bm{\Phi} (cf. example of Definition 3). However, the common coefficients obtained after the averaging lose their correlation with the distinct coefficients. Therefore, the generated predistorted signals are not optimal, as in the FF-DPD, for linearizing the PAs; thus, its performance is lower. Further, the complexity of the algorithm is described as follows. Although the operators 𝔐1\mathfrak{M}_{1} and 𝔐2\mathfrak{M}_{2} are represented as matrices in (14b) and (20b) to analyze the operations mathematically, in practice they involve only assignment and averaging operations, whose complexities are negligible compared to the matrix multiplications. Therefore, the dominant operations in Algorithm 2 are the same as in Algorithm 1; thus, its complexity per iteration is O(Q3S)O(Q^{3}S). To enhance the performance and to reduce the complexity, we propose improved algorithms in the next section.

IV Training Based on ILA-RPEM: Part II

To reduce the complexity of the algorithm, we need to train the LC-DPD coefficient vector 𝚽¯\bm{\overline{\Phi}} by exploiting only the structure of the LC-DPD instead of enforcing its training through the FF-DPD, because the length of the vector 𝚽¯\bm{\overline{\Phi}}, as given by (8), is less than that of 𝚽\bm{\Phi}. Thus, training based only on the length of 𝚽¯\bm{\overline{\Phi}} reduces the sizes of the matrices 𝑷\bm{P} and 𝚼\bm{\Upsilon} in the dominant matrix multiplications. Based on this, we propose two algorithms.

In order to train 𝚽¯\bm{\overline{\Phi}} by completely exploiting the LC-DPD structure, we first need to represent 𝚽¯\bm{\overline{\Phi}} in a suitable form, i.e., as another block vector in which each vector consists of the coefficients that are multiplied by the BFs to generate a predistorted signal distributed to a subgroup of PAs. (In an LC-DPD assisted subarray, the SS PAs of the subarray are divided into σ¯g=Srν\overline{\sigma}_{g}=Sr^{\nu} subgroups, where each subgroup consists of nPA=S/σ¯g=rνn_{PA}=S/\overline{\sigma}_{g}=r^{-\nu} PAs. Thus, the LC-DPD structure generates σ¯g\overline{\sigma}_{g} predistorted signals, one for each subgroup.)

Definition 4 (Reshape of 𝚽¯\bm{\overline{\Phi}} as 𝚽¯\bm{\overline{\Phi}^{{}^{\prime}}}).

To generate the predistorted signals using the LC-DPD coefficient vector 𝚽¯\bm{\overline{\Phi}}, it can be reshaped as the block vector 𝚽¯\bm{\overline{\Phi}^{{}^{\prime}}}, given by:

𝚽¯=[𝚽¯1T,,𝚽¯σ¯1T1st gr., each of len. J1,,𝚽¯(σ¯g1+1)T,,𝚽¯σ¯gTgth gr., each of len. Jg]T,\displaystyle\bm{\overline{\Phi}}^{{}^{\prime}}=[\underbrace{\bm{\overline{\Phi}}_{1}^{{}^{\prime}T},\cdots,\bm{\overline{\Phi}}_{\overline{\sigma}_{1}}^{{}^{\prime}T}}_{1st\text{ gr., each of len. }J_{1}},\cdots,\underbrace{\bm{\overline{\Phi}}_{(\overline{\sigma}_{g-1}+1)}^{{}^{\prime}T},\cdots,\bm{\overline{\Phi}}_{\overline{\sigma}_{g}}^{{}^{\prime}T}}_{gth\text{ gr., each of len. }J_{g}}]^{T}, (24)

where 𝚽¯i\bm{\overline{\Phi}}_{i}^{{}^{\prime}} is the iith vector in 𝚽¯\bm{\overline{\Phi}}^{{}^{\prime}}. Here, the grouping of the vectors is based on their lengths, i.e., the vectors in a group have the same length. The total number of groups is gg, which equals the number of vectors in 𝚽¯\bm{\overline{\Phi}} (cf. (7)). In the jjth group, the number of vectors is TjT_{j}, and each vector has length JjJ_{j}, as shown in (24). Besides, σ¯j=q=1jTq\overline{\sigma}_{j}=\sum_{q=1}^{j}T_{q}. Further, TjT_{j} and JjJ_{j} are given by:

Tj\displaystyle T_{j} =Srν+gj[1ru~(r1j1)],\displaystyle=Sr^{\nu+g-j}\left[1-r\tilde{u}\left(r^{1-j}-1\right)\right], (25a)
Jj\displaystyle J_{j} =Qq=1j1ngq+1u~(r1j1),\displaystyle=Q-\sum_{q=1}^{j-1}n_{g-q+1}\tilde{u}\left(r^{1-j}-1\right), (25b)

where j{1,,g}j\in{\{1,\cdots,g\}}, r<1r<1 and u~(x)=1\tilde{u}(x)=1 for x>0x>0; otherwise, u~(x)=0\tilde{u}(x)=0.

Example:  Again, we consider the instance of the LC-DPD structure with S=4S=4, Q=4Q=4, r=1/2r=1/2, and ν=1\nu=1 in Fig. 3(b) to reshape 𝚽¯\bm{\overline{\Phi}} into 𝚽¯\bm{\overline{\Phi}^{{}^{\prime}}}. From (7), 𝚽¯=[𝚽¯1T,𝚽¯2T]T\bm{\overline{\Phi}}=[\bm{\overline{\Phi}}_{1}^{T},\bm{\overline{\Phi}}_{2}^{T}]^{T}, where 𝚽¯1=[ϕ¯1,1,ϕ¯3,1,ϕ¯1,2,ϕ¯3,2L1=4]T\bm{\overline{\Phi}}_{1}=[\underbrace{\overline{\phi}_{1,1},\overline{\phi}_{3,1},\overline{\phi}_{1,2},\overline{\phi}_{3,2}}_{L_{1}=4}]^{T}, 𝚽¯2=[ϕ¯1,3,ϕ¯1,4L2=2]T\bm{\overline{\Phi}}_{2}=[\underbrace{\overline{\phi}_{1,3},\overline{\phi}_{1,4}}_{L_{2}=2}]^{T}, and g=2g=2. Using (24), its reshaped form 𝚽¯\bm{\overline{\Phi}^{{}^{\prime}}} for generating the predistorted signals is: 𝚽¯=[𝚽¯1T1st gr.,𝚽¯2T2nd gr.]T\bm{\overline{\Phi}^{{}^{\prime}}}=[\underbrace{\bm{\overline{\Phi}}_{1}^{{}^{\prime}T}}_{1\text{st}\text{ gr.}},\underbrace{\bm{\overline{\Phi}}_{2}^{{}^{\prime}T}}_{2\text{nd}\text{ gr.}}]^{T}. Here, the number of groups is g=2g=2, and substituting the parameter values in (25), we get T1=T2=1T_{1}=T_{2}=1, J1=4J_{1}=4, and J2=2J_{2}=2. Thus, 𝚽¯1=[ϕ¯1,1,ϕ¯1,2,ϕ¯1,3,ϕ¯1,4]T\bm{\overline{\Phi}}_{1}^{{}^{\prime}}=[\overline{\phi}_{1,1},\overline{\phi}_{1,2},\overline{\phi}_{1,3},\overline{\phi}_{1,4}]^{T} and 𝚽¯2=[ϕ¯3,1,ϕ¯3,2]T\bm{\overline{\Phi}}_{2}^{{}^{\prime}}=[\overline{\phi}_{3,1},\overline{\phi}_{3,2}]^{T}. \Box
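The closed forms (25a)-(25b) can be evaluated directly; the small helper below (names illustrative) reproduces the values T1=T2=1, J1=4, J2=2 of this example, and the total ΣTiJi=6 matches the length of 𝚽̄.

```python
def T_and_J(S, Q, r, nu, n):
    """Evaluate (25a)-(25b); n = [n_1, ..., n_g], and u~(x) = 1 for x > 0 else 0."""
    g = len(n)
    u = lambda x: 1.0 if x > 0 else 0.0
    T = [S * r ** (nu + g - j) * (1 - r * u(r ** (1 - j) - 1))
         for j in range(1, g + 1)]
    J = [Q - sum(n[g - q] * u(r ** (1 - j) - 1) for q in range(1, j))
         for j in range(1, g + 1)]
    return T, J

# The example in Fig. 3(b): S = 4, Q = 4, r = 1/2, nu = 1, n_1 = n_2 = 2
T, J = T_and_J(S=4, Q=4, r=0.5, nu=1, n=[2, 2])
```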

Corollary 1.

As 𝚽¯\bm{\overline{\Phi}^{{}^{\prime}}} is a reshape of 𝚽¯\bm{\overline{\Phi}}, the elements of the former vector are the same as those of the latter; thus, the lengths of the two vectors are equal. The length of 𝚽¯\bm{\overline{\Phi}^{{}^{\prime}}} can be obtained as i=1gTiJi\sum_{i=1}^{g}T_{i}J_{i}; hence, from the length of 𝚽¯\bm{\overline{\Phi}} in (8c), NmL=i=1gTiJiN_{m}^{L}=\sum_{i=1}^{g}T_{i}J_{i}.

The above corollary can be proved by substituting Q=i=1gniQ=\sum_{i=1}^{g}n_{i} in (25) and then simplifying i=1gTiJi\sum_{i=1}^{g}T_{i}J_{i}, which gives NmLN_{m}^{L} in (8c). Furthermore, to reshape 𝚽¯\bm{\overline{\Phi}} into 𝚽¯\bm{\overline{\Phi}^{{}^{\prime}}}, we define a linear operator 𝕸3\bm{\mathfrak{M}}_{3} as below.

Definition 5 (A Linear Operator 𝔐3\bm{\mathfrak{M}}_{3}).

A linear operator 𝕸3\bm{\mathfrak{M}}_{3} is defined as in (26b) which is used to reshape the LC-DPD coefficient vector 𝚽¯\bm{\overline{\Phi}} into the vector 𝚽¯\bm{\overline{\Phi}^{{}^{\prime}}} as given by (24).

𝚽¯=𝕸3𝚽¯,\displaystyle\bm{\overline{\Phi}^{{}^{\prime}}}=\bm{\mathfrak{M}}_{3}\bm{\overline{\Phi}}, (26a)
𝕸3=[𝑴11𝑴1g𝑴σ¯11𝑴σ¯1g𝑴(σ¯g1+1)1𝑴(σ¯g1+1)g𝑴σ¯g1𝑴σ¯gg],\displaystyle\bm{\mathfrak{M}}_{3}=\begin{bmatrix}\bm{M}_{11}&\cdots&\bm{M}_{1g}\\ \vdots&\ddots&\vdots\\ \bm{M}_{\overline{\sigma}_{1}1}&\cdots&\bm{M}_{\overline{\sigma}_{1}g}\\ \vdots&\ddots&\vdots\\ \bm{M}_{(\overline{\sigma}_{g-1}+1)1}&\cdots&\bm{M}_{(\overline{\sigma}_{g-1}+1)g}\\ \vdots&\ddots&\vdots\\ \bm{M}_{\overline{\sigma}_{g}1}&\cdots&\bm{M}_{\overline{\sigma}_{g}g}\end{bmatrix}, (26b)
𝑴ij=[m11ijm1LjijmJτ1ijmJτLjij],\displaystyle\bm{M}_{ij}=\begin{bmatrix}m_{11}^{ij}&\cdots&m_{1L_{j}}^{ij}\\ \vdots&\ddots&\vdots\\ m_{J_{\tau}1}^{ij}&\cdots&m_{J_{\tau}L_{j}}^{ij}\end{bmatrix}, (26c)

where TτT_{\tau} and JτJ_{\tau} are given by (25) for τ{1,,g}\tau\in\{1,\cdots,g\}, Lj=njSr(ν+j1)L_{j}=n_{j}Sr^{(\nu+j-1)}, and muvij{0,1}m_{uv}^{ij}\in\{0,1\} for i{σ¯τ1+1,,σ¯τ}i\in\{\overline{\sigma}_{\tau-1}+1,\cdots,\overline{\sigma}_{\tau}\} (σ¯0=0)(\overline{\sigma}_{0}=0), j{1,,g}j\in\{1,\cdots,g\}, u{1,,Jτ}u\in\{1,\cdots,J_{\tau}\}, and v{1,,Lj}v\in\{1,\cdots,L_{j}\}. Moreover, the sum of the elements in each row or column vector of 𝕸3\bm{\mathfrak{M}}_{3} is 11.

Example:  Again, we consider the example for Fig. 3(b) to reshape 𝚽¯=[𝚽¯1T,𝚽¯2T]T\bm{\overline{\Phi}}=[\bm{\overline{\Phi}}_{1}^{T},\bm{\overline{\Phi}}_{2}^{T}]^{T} into 𝚽¯=[𝚽¯1T,𝚽¯2T]T\bm{\overline{\Phi}^{{}^{\prime}}}=[\bm{\overline{\Phi}}_{1}^{{}^{\prime}T},\bm{\overline{\Phi}}_{2}^{{}^{\prime}T}]^{T}, where 𝚽¯1=[ϕ¯1,1,ϕ¯3,1,ϕ¯1,2,ϕ¯3,2]T\bm{\overline{\Phi}}_{1}=[\overline{\phi}_{1,1},\overline{\phi}_{3,1},\overline{\phi}_{1,2},\overline{\phi}_{3,2}]^{T}, 𝚽¯2=[ϕ¯1,3,ϕ¯1,4]T\bm{\overline{\Phi}}_{2}=[\overline{\phi}_{1,3},\overline{\phi}_{1,4}]^{T}, 𝚽¯1=[ϕ¯1,1,ϕ¯1,2,ϕ¯1,3,ϕ¯1,4]T\bm{\overline{\Phi}}_{1}^{{}^{\prime}}=[\overline{\phi}_{1,1},\overline{\phi}_{1,2},\overline{\phi}_{1,3},\overline{\phi}_{1,4}]^{T}, and 𝚽¯2=[ϕ¯3,1,ϕ¯3,2]T\bm{\overline{\Phi}}_{2}^{{}^{\prime}}=[\overline{\phi}_{3,1},\overline{\phi}_{3,2}]^{T}. For it, the operator 𝕸3\bm{\mathfrak{M}}_{3} is given by:

𝕸3=[𝑴11𝑴12𝑴21𝑴22]=[[0.7]1000001000000000[0.7]00001001[0.7]01000001[0.7]0000]\displaystyle\bm{\mathfrak{M}}_{3}=\begin{bmatrix}\bm{M}_{11}&\bm{M}_{12}\\ \bm{M}_{21}&\bm{M}_{22}\end{bmatrix}=\footnotesize\left[\begin{array}[]{@{}c|c@{}}\begin{matrix}[0.7]1&0&0&0\\ 0&0&1&0\\ 0&0&0&0\\ 0&0&0&0\end{matrix}&\begin{matrix}[0.7]0&0\\ 0&0\\ 1&0\\ 0&1\end{matrix}\\ \hline\cr\begin{matrix}[0.7]0&1&0&0\\ 0&0&0&1\end{matrix}&\begin{matrix}[0.7]0&0\\ 0&0\end{matrix}\end{array}\right] (29)

\Box

Corollary 2.

The operator 𝕸3\bm{\mathfrak{M}}_{3} is always a square matrix, as it reshapes the vector 𝚽¯\bm{\overline{\Phi}} into the vector 𝚽¯\bm{\overline{\Phi}^{{}^{\prime}}} using the same elements. Also, the column vectors in 𝕸3\bm{\mathfrak{M}}_{3} are unit vectors that are orthogonal to each other; therefore, they form an orthonormal basis of the space NmL\mathbb{R}^{N_{m}^{L}}. Furthermore, the inverse of 𝕸3\bm{\mathfrak{M}}_{3} is its transpose, i.e., 𝕸31=𝕸3T\bm{\mathfrak{M}}_{3}^{-1}=\bm{\mathfrak{M}}_{3}^{T} [42]. Hence, from (26a), using 𝕸31\bm{\mathfrak{M}}_{3}^{-1}, 𝚽¯\bm{\overline{\Phi}^{{}^{\prime}}} can be reshaped back to 𝚽¯\bm{\overline{\Phi}} as: 𝚽¯=𝕸3T𝚽¯\bm{\overline{\Phi}}=\bm{\mathfrak{M}}_{3}^{T}\bm{\overline{\Phi}^{{}^{\prime}}}.
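The corollary can be checked numerically with the example operator in (29); a small numpy sketch (the coefficient values are illustrative stand-ins):

```python
import numpy as np

# The 6x6 operator from the example in (29), written as a permutation:
# row i has its single 1 in column cols[i].
cols = [0, 2, 4, 5, 1, 3]
M3 = np.eye(6)[cols]  # permutation matrix built from identity rows

# Columns are orthonormal unit vectors: M3^T M3 = I.
assert np.allclose(M3.T @ M3, np.eye(6))

# Hence the inverse is the transpose, and the reshape is reversible.
phi_bar = np.array([1.1, 3.1, 1.2, 3.2, 1.3, 1.4])  # illustrative values
phi_prime = M3 @ phi_bar
assert np.allclose(M3.T @ phi_prime, phi_bar)
```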

Algorithm 3 Estimation of coefficients for LC-DPD (Method-II).
1:The values of ρ\rho, λ0\lambda_{0}, μ\mu, 𝕸1\bm{\mathfrak{M}}_{1}, 𝕸2\bm{\mathfrak{M}}_{2}, 𝕸3\bm{\mathfrak{M}}_{3}, and 𝚽¯~(0)\bm{\tilde{\overline{\Phi}}}^{(0)}
2:The estimated coefficients 𝚽¯^\bm{\widehat{\overline{\Phi}}}
3:𝑷¯(0)=diag(μ𝑰J1,,μ𝑰J1T1 times,,μ𝑰Jg,,μ𝑰JgTg times)\bm{\overline{P}}^{(0)}=\text{diag}(\underbrace{\mu\bm{I}_{J_{1}},\cdots,\mu\bm{I}_{J_{1}}}_{T_{1}\text{ times}},\cdots,\underbrace{\mu\bm{I}_{J_{g}},\cdots,\mu\bm{I}_{J_{g}}}_{T_{g}\text{ times}})
4:𝝃¯(0)=λ0𝑰σ¯g\bm{\overline{\xi}}^{(0)}=\lambda_{0}\bm{I}_{\overline{\sigma}_{g}}
5:Operate 𝚽~(0)=𝕸1𝚽¯~(0)\bm{\tilde{\Phi}}^{(0)}=\bm{\mathfrak{M}}_{1}\bm{\tilde{\overline{\Phi}}}^{(0)} and assign 𝚽(0)=𝚽~(0)\bm{\Phi}^{(0)}=\bm{\tilde{\Phi}}^{(0)}
6:Obtain 𝑿(1)\bm{X}(1) using (3) followed by 𝒀(1)\bm{Y}(1) using (2), then, compute 𝚿\bm{\Psi}^{{}^{\prime}}, 𝚼(1)\bm{\Upsilon}(1) similar to Step 5 of Algorithm 1
7:Compute 𝚿¯(1)=𝕸3(𝕸2𝚿(1))\bm{{\overline{\Psi}^{{}^{\prime}}}}(1)=\bm{\mathfrak{M}}_{3}(\bm{\mathfrak{M}}_{2}\bm{\Psi}^{{}^{\prime}}(1)) and 𝚼¯(1)\bm{\overline{\Upsilon}}(1)
8:Compute 𝑿~(1)=𝚼(1)T𝚽~(0)\bm{\tilde{X}}(1)=\bm{\Upsilon}(1)^{T}\bm{\tilde{\Phi}}^{(0)} and 𝚽¯~(0)=𝔐3𝚽¯~(0)\bm{{\tilde{\overline{\Phi}}^{{}^{\prime}}}}^{(0)}=\mathfrak{M}_{3}\bm{\tilde{\overline{\Phi}}}^{(0)}
9:n=1n=1
10:repeat
11:    𝑬(n)=𝑿(n)𝑿~(n)\bm{E}(n)=\bm{X}(n)-\bm{\tilde{X}}(n)
12:    𝑬¯(n)=𝕸3(𝕸2(𝑬(n))𝟏Q)\bm{\overline{E}}(n)=\bm{\mathfrak{M}}_{3}(\bm{\mathfrak{M}}_{2}(\bm{E}(n))\bigotimes\bm{1}_{Q})
13:    𝝃¯(n)=ρ𝝃¯(n1)+𝑰σ¯gρ𝑰σ¯g\bm{\overline{\xi}}^{(n)}=\rho\bm{\overline{\xi}}^{(n-1)}+\bm{I}_{\overline{\sigma}_{g}}-\rho\bm{I}_{\overline{\sigma}_{g}}
14:    𝒁¯(n)=𝚼¯T(n)𝑷¯(n1)𝚼¯(n)+𝝃¯(n)\bm{\overline{Z}}^{(n)}=\bm{\overline{\Upsilon}}^{T}(n)\bm{\overline{P}}^{(n-1)}\bm{\overline{\Upsilon}}^{*}(n)+\bm{\overline{\xi}}^{(n)}
15:    𝑷¯(n)=(𝑷¯(n1)𝑷¯(n1)𝚼¯(n)𝒁¯(n)1𝚼¯T(n)𝑷¯(n1))𝚵¯(n)1\bm{\overline{P}}^{(n)}=(\bm{\overline{P}}^{(n-1)}-\bm{\overline{P}}^{(n-1)}\bm{\overline{\Upsilon}}^{*}(n){\bm{\overline{Z}}^{(n)}}^{-1}\bm{\overline{\Upsilon}}^{T}(n)\bm{\overline{P}}^{(n-1)}){\bm{\overline{\Xi}}^{(n)}}^{-1}
16:    𝚽¯~(n)=𝚽¯~(n1)+(𝑷¯(n)𝚼¯(n)𝟏σ¯g)𝑬¯(n)\bm{{\tilde{\overline{\Phi}}^{{}^{\prime}}}}^{(n)}=\bm{{\tilde{\overline{\Phi}}^{{}^{\prime}}}}^{(n-1)}+(\bm{\overline{P}}^{(n)}\bm{\overline{\Upsilon}}^{*}(n)\bm{1}_{\overline{\sigma}_{g}})\bigodot\bm{\overline{E}}(n)
17:    Operate 𝚽¯~(n)=𝔐3T𝚽¯~(n)\bm{\tilde{\overline{\Phi}}}^{(n)}=\mathfrak{M}_{3}^{T}\bm{{\tilde{\overline{\Phi}}^{{}^{\prime}}}}^{(n)} followed by 𝚽~(n)=𝕸1𝚽¯~(n)\bm{\tilde{\Phi}}^{(n)}=\bm{\mathfrak{M}}_{1}\bm{\tilde{\overline{\Phi}}}^{(n)}, then, assign 𝚽(n)=𝚽~(n)\bm{\Phi}^{(n)}=\bm{\tilde{\Phi}}^{(n)}
18:    Using obtained 𝑿(n+1)\bm{X}(n+1) followed by 𝒀(n+1)\bm{Y}(n+1) and then 𝚿(n+1)\bm{\Psi}^{{}^{\prime}}(n+1), find 𝚼(n+1)\bm{\Upsilon}(n+1), 𝚿¯(n+1)=𝕸3(𝕸2𝚿(n+1))\bm{{\overline{\Psi}^{{}^{\prime}}}}(n+1)=\bm{\mathfrak{M}}_{3}(\bm{\mathfrak{M}}_{2}\bm{\Psi}^{{}^{\prime}}(n+1)), and 𝚼¯(n+1)\bm{\overline{\Upsilon}}(n+1)
19:    𝑿~(n+1)=𝚼(n+1)T𝚽~(n)\bm{\tilde{X}}(n+1)=\bm{\Upsilon}(n+1)^{T}\bm{\tilde{\Phi}}^{(n)}
20:    n=n+1n=n+1
21:until 𝚽¯~(n)\bm{\tilde{\overline{\Phi}}}^{(n)} converges
22:𝚽¯^=𝚽¯~(n)\bm{\widehat{\overline{\Phi}}}=\bm{\tilde{\overline{\Phi}}}^{(n)}

Now, we utilize the operator 𝕸3\bm{\mathfrak{M}}_{3} to train the coefficient vector 𝚽¯\bm{\overline{\Phi}} in Algorithm 3. In the algorithm, which is based on ILA-RPEM, the sizes of the different matrices and vectors used in the training are determined according to the shape of the vector 𝚽¯\bm{\overline{\Phi}^{{}^{\prime}}} in (24). They are defined as: the forgetting matrix 𝝃¯diag(ξ1,,ξσ¯g)\bm{\overline{\xi}}\triangleq\text{diag}(\xi_{1},\cdots,\xi_{\overline{\sigma}_{g}}), 𝒁¯diag(Z1,,Zσ¯g)\bm{\overline{Z}}\triangleq\text{diag}(Z_{1},\cdots,Z_{\overline{\sigma}_{g}}), 𝑷¯diag(P1,,Pσ¯g)\bm{\overline{P}}\triangleq\text{diag}(P_{1},\cdots,P_{\overline{\sigma}_{g}}), and 𝚵¯diag(ξ1𝑰J1,,ξσ¯1𝑰J1,,ξ(σ¯g1+1)𝑰Jg,,ξσ¯g𝑰Jg)\bm{\overline{\Xi}}\triangleq\text{diag}(\xi_{1}\bm{I}_{J_{1}},\cdots,\xi_{\overline{\sigma}_{1}}\bm{I}_{J_{1}},\cdots,\xi_{(\overline{\sigma}_{g-1}+1)}\bm{I}_{J_{g}},\cdots,\xi_{\overline{\sigma}_{g}}\bm{I}_{J_{g}}). To determine the parameter 𝚼¯\bm{\overline{\Upsilon}}, first, using the outputs 𝒀\bm{Y} of the PAs, we obtain the block vector 𝚿=[𝚿1T,,𝚿ST]T\bm{\Psi}^{{}^{\prime}}=[\bm{\Psi}_{1}^{{}^{\prime}T},\cdots,\bm{\Psi}_{S}^{{}^{\prime}T}]^{T}, where 𝚿l=[ψl,1,,ψl,Q]T\bm{\Psi}_{l}^{{}^{\prime}}=[\psi_{l,1}^{{}^{\prime}},\cdots,\psi_{l,Q}^{{}^{\prime}}]^{T}. From 𝚿\bm{\Psi}^{{}^{\prime}}, the block vector 𝚿¯\bm{\overline{\Psi^{{}^{\prime}}}} for the LC-DPD can be obtained using the operator 𝕸2\bm{\mathfrak{M}}_{2} as: 𝚿¯=𝕸2𝚿\bm{\overline{\Psi^{{}^{\prime}}}}=\bm{\mathfrak{M}}_{2}\bm{\Psi}^{{}^{\prime}}. Here, 𝚿¯=[𝚿¯1T,,𝚿¯gT]T\bm{\overline{\Psi^{{}^{\prime}}}}=[\bm{\overline{\Psi^{{}^{\prime}}}}_{1}^{T},\cdots,\bm{\overline{\Psi^{{}^{\prime}}}}_{g}^{T}]^{T}, and 𝚿¯i\bm{\overline{\Psi^{{}^{\prime}}}}_{i} can be represented similarly to 𝚽¯i\bm{\overline{\Phi}}_{i} in (1) for i{1,,g}i\in\{1,\cdots,g\}, where ϕ\phi is replaced by ψ\psi^{{}^{\prime}}.
Further, using the operator 𝕸3\bm{\mathfrak{M}}_{3}, 𝚿¯\bm{\overline{\Psi^{{}^{\prime}}}} can be reshaped into 𝚿¯\bm{\overline{\Psi}^{{}^{\prime}}} as: 𝚿¯=𝕸3𝚿¯\bm{\overline{\Psi}^{{}^{\prime}}}=\bm{\mathfrak{M}}_{3}\bm{\overline{\Psi^{{}^{\prime}}}}. Now, using the vectors in the block vector 𝚿¯\bm{\overline{\Psi}^{{}^{\prime}}}, 𝚼¯\bm{\overline{\Upsilon}} can be expressed as777This procedure is used to compute 𝚼¯\bm{\overline{\Upsilon}} from 𝒀\bm{Y} in Steps 7 and 18 of Algorithm 3.: 𝚼¯=diag(𝚿¯1,,𝚿¯σ¯g)\bm{\overline{\Upsilon}}=\text{diag}(\bm{\overline{\Psi}^{{}^{\prime}}}_{1},\cdots,\bm{\overline{\Psi}^{{}^{\prime}}}_{\overline{\sigma}_{g}}). The process in Algorithm 3 can now be described as follows. The inputs to the algorithm are the same as in Algorithm 2, along with the operator 𝕸3\bm{\mathfrak{M}}_{3}. In the first two steps, the correlation matrix 𝑷¯\bm{\overline{P}} and the forgetting matrix 𝝃¯\bm{\overline{\xi}} are initialized. Using Steps 5 to 8, the initial values of 𝚼¯\bm{\overline{\Upsilon}} and the postdistorted signal vector 𝑿~\bm{\tilde{X}} are determined. Thereafter, similar to Algorithms 1 and 2, the iterative steps are followed to get the converged value of 𝚽¯^\bm{\widehat{\overline{\Phi}}}. In the loop, the operators \bigotimes and \bigodot denote the Kronecker and Hadamard products, respectively.
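The iterative core of the loop (Steps 11 to 16) is a recursive-prediction-error update of RLS type. A simplified, single-block numpy sketch with scalar desired and postdistorted samples (names are illustrative; the paper applies this block-diagonally over all subgroups and with complex signals):

```python
import numpy as np

def rpem_step(phi, P, xi, ups, x, x_hat, rho=0.95):
    """One recursive-prediction-error (RPEM) update for a single
    coefficient block -- a simplified sketch of Steps 11-16 of
    Algorithm 3.

    phi   : current coefficient estimate, shape (J,)
    P     : correlation matrix, shape (J, J)
    xi    : scalar forgetting factor
    ups   : regressor (basis-function) vector Upsilon, shape (J,)
    x     : desired (predistorted) sample
    x_hat : postdistorted sample from the current estimate
    """
    e = x - x_hat                               # prediction error E(n)
    xi = rho * xi + (1.0 - rho)                 # forgetting update (Step 13)
    Pu = P @ ups.conj()
    z = ups @ Pu + xi                           # Z = Y^T P Y* + xi (Step 14)
    P = (P - np.outer(Pu, ups @ P) / z) / xi    # correlation update (Step 15)
    phi = phi + (P @ ups.conj()) * e            # coefficient update (Step 16)
    return phi, P, xi
```

For a noiseless linear identification problem this update reduces to standard RLS, and the estimate converges to the true coefficients as the forgetting factor approaches one.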

Performance and Complexity

In Algorithm 3, the block vector 𝚽¯\bm{\overline{\Phi}} is reshaped into the block vector 𝚽¯\bm{{\overline{\Phi}}^{{}^{\prime}}} to correlate the coefficients in each of its vectors 𝚽¯i\bm{{\overline{\Phi}}^{{}^{\prime}}}_{i} through the correlation matrix 𝑷¯\bm{\overline{P}} during the training (cf. Step 16 of Algorithm 3). However, the common coefficients are correlated only with the distinct coefficients in the first σ¯1=T1\overline{\sigma}_{1}=T_{1} vectors of the block vector 𝚽¯\bm{{\overline{\Phi}}^{{}^{\prime}}} (cf. (24)). For example, in Fig. 3(b), the coefficients ϕ¯1,3\overline{\phi}_{1,3}, ϕ¯1,4\overline{\phi}_{1,4} are commonly shared with the remaining coefficients ϕ¯1,1\overline{\phi}_{1,1}, ϕ¯1,2\overline{\phi}_{1,2} and ϕ¯3,1\overline{\phi}_{3,1}, ϕ¯3,2\overline{\phi}_{3,2} to generate the predistorted signals x1x_{1} and x2x_{2}, respectively. But, using Algorithm 3, ϕ¯1,3\overline{\phi}_{1,3}, ϕ¯1,4\overline{\phi}_{1,4} are correlated only with ϕ¯1,1\overline{\phi}_{1,1}, ϕ¯1,2\overline{\phi}_{1,2}; thus, the predistorted signal x1x_{1} linearizes the 11st subgroup of PAs well, whereas x2x_{2} linearizes the 22nd subgroup poorly. Therefore, the predistorted signals generated using the coefficients in the first T1T_{1} vectors of 𝚽¯\bm{{\overline{\Phi}}^{{}^{\prime}}} are optimal for linearizing the respective subgroups of PAs, but the algorithm still provides low performance for the remaining PAs. Hence, we next propose an algorithm that establishes the correlation of the common coefficients with all the distinct coefficients. Besides, the complexity of Algorithm 3 in an iteration can be determined from the dominant matrix multiplications in Steps 14, 15, and 16.
As the matrices 𝚼¯\bm{\overline{\Upsilon}}, 𝑷¯\bm{\overline{P}}, 𝒁¯\bm{\overline{Z}}, and 𝚵¯\bm{\overline{\Xi}} are diagonal, the respective complexities of the three steps are: O(Q2T1+QT1)O(Q^{2}T_{1}+QT_{1}), O(2Q3T1+2Q2T1+QT1)O(2Q^{3}T_{1}+2Q^{2}T_{1}+QT_{1}), and O(Q2T1+QT1)O(Q^{2}T_{1}+QT_{1}). Based on the dominant term, the complexity of Algorithm 3 is O(Q3T1)O(Q^{3}T_{1}). As T1<ST_{1}<S, the complexity of Algorithm 3 is lower than that of Algorithms 1 and 2. Next, to correlate the common coefficients in the LC-DPD structure with all the remaining coefficients, we first define a sequence of linear operators as below.

Definition 6 (A Sequence of Linear Operators for the Back and Forth Operations).

To establish the correlation of the common coefficients with the r(g1)r^{-(g-1)} sets of distinct coefficients in the LC-DPD structure, a sequence of r(g1)r^{-(g-1)} operators is defined in (30a), where the ttth operator 𝕸4,t\bm{\mathfrak{M}}_{4,t} in the sequence is given by (30b) and its ijijth matrix element 𝐌ij,t\bm{M}_{ij,t} is expressed in (30c).

𝕸4={𝕸4,1,𝕸4,2,,𝕸4,r(g1)}\displaystyle\bm{\mathfrak{M}}_{4}=\{\bm{\mathfrak{M}}_{4,1},\bm{\mathfrak{M}}_{4,2},\cdots,\bm{\mathfrak{M}}_{4,r^{-(g-1)}}\} (30a)
𝕸4,t=[𝑴11,t𝑴1g,t𝑴T11,t𝑴T1g,t𝑴(T1+1)1,t𝑴(T1+1)g,t]\displaystyle\bm{\mathfrak{M}}_{4,t}=\begin{bmatrix}\bm{M}_{11,t}&\cdots&\bm{M}_{1g,t}\\ \vdots&\ddots&\vdots\\ \bm{M}_{T_{1}1,t}&\cdots&\bm{M}_{T_{1}g,t}\\ \bm{M}_{(T_{1}+1)1,t}&\cdots&\bm{M}_{(T_{1}+1)g,t}\end{bmatrix} (30b)
𝑴ij,t=[m11,tijm1Lj,tijmL¯i1,tijmL¯iLj,tij],\displaystyle\bm{M}_{ij,t}=\begin{bmatrix}m_{11,t}^{ij}&\cdots&m_{1L_{j},t}^{ij}\\ \vdots&\ddots&\vdots\\ m_{\overline{L}_{i}1,t}^{ij}&\cdots&m_{\overline{L}_{i}L_{j},t}^{ij}\end{bmatrix}, (30c)
𝚽tr.,t(n+1)=trunc(𝕸4,t𝚽¯(n),Q),\displaystyle\bm{\Phi}_{tr.,t}^{(n+1)}=\text{trunc}(\bm{\mathfrak{M}}_{4,t}\bm{\overline{\Phi}}^{(n)},Q), (30d)
𝚽¯(n+1)=𝕸4,tTmerge(𝚽tr.,t(n+1),𝕸4,t𝚽¯(n),Q)\displaystyle\bm{\overline{\Phi}}^{(n+1)}=\bm{\mathfrak{M}}_{4,t}^{T}\text{merge}(\bm{\Phi}_{tr.,t}^{(n+1)},\bm{\mathfrak{M}}_{4,t}\bm{\overline{\Phi}}^{(n)},Q) (30e)

where t{1,,r(g1)}t\in\{1,\cdots,r^{-(g-1)}\}, L¯i=Q\overline{L}_{i}=Q for i{1,,T1}i\in{\{1,\cdots,T_{1}\}}; otherwise, L¯i=NmLT1Q\overline{L}_{i}=N_{m}^{L}-T_{1}Q for i=T1+1i=T_{1}+1. Further, Lj=njSr(ν+j1)L_{j}=n_{j}Sr^{(\nu+j-1)} and muv,tij{0,1}m_{uv,t}^{ij}\in\{0,1\} for i{1,,T1+1}i\in\{1,\cdots,T_{1}+1\}, j{1,,g}j\in\{1,\cdots,g\}, u{1,,L¯i}u\in\{1,\cdots,\overline{L}_{i}\}, and v{1,,Lj}v\in\{1,\cdots,L_{j}\}. Again, like 𝕸3\bm{\mathfrak{M}}_{3} in (26b), the sum of the elements in each row or column of 𝕸4,t\bm{\mathfrak{M}}_{4,t} is 11. Further, from Corollary 2, 𝕸4,t1=𝕸4,tT\bm{\mathfrak{M}}_{4,t}^{-1}=\bm{\mathfrak{M}}_{4,t}^{T}. Besides, the common coefficients appear together with the ttth set of distinct coefficients in the vector 𝚽tr.,t\bm{\Phi}_{tr.,t}, which is obtained using the ttth operator in (30d). Here, first, 𝚽¯\bm{\overline{\Phi}} is multiplied by 𝕸4,t\bm{\mathfrak{M}}_{4,t}; then, using the truncation function trunc(𝕸4,t𝚽¯,Q)\text{trunc}(\bm{\mathfrak{M}}_{4,t}\bm{\overline{\Phi}},Q), the first QQ elements of the vector 𝕸4,t𝚽¯\bm{\mathfrak{M}}_{4,t}\bm{\overline{\Phi}} are extracted to get 𝚽tr.,t\bm{\Phi}_{tr.,t}. The reverse operation, i.e., the conversion of 𝚽tr.,t\bm{\Phi}_{tr.,t} back into 𝚽¯\bm{\overline{\Phi}}, can be performed using (30e), where merge(𝐚,𝐛,Q)\text{merge}(\bm{a},\bm{b},Q) updates the first QQ elements of 𝐛\bm{b} by replacing them with the length-QQ vector 𝐚\bm{a}.
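The trunc and merge operations in (30d) and (30e) can be sketched in numpy as follows; the 6-element vector and the permutation stand-in for 𝕸4,t\bm{\mathfrak{M}}_{4,t} are illustrative only:

```python
import numpy as np

def trunc(vec, Q):
    """Return the first Q elements of vec, as used in (30d)."""
    return np.asarray(vec)[:Q].copy()

def merge(a, b, Q):
    """Overwrite the first Q elements of b with the length-Q vector a,
    as used in (30e)."""
    out = np.asarray(b).copy()
    out[:Q] = a
    return out

# Round trip with a permutation stand-in for M4t (illustrative 6x6 case, Q = 4).
M4t = np.eye(6)[[0, 2, 4, 5, 1, 3]]
phi_bar = np.array([1.1, 3.1, 1.2, 3.2, 1.3, 1.4])
phi_tr = trunc(M4t @ phi_bar, 4)     # common + t-th distinct block
phi_tr_updated = phi_tr + 0.01       # pretend one RPEM update happened
phi_back = M4t.T @ merge(phi_tr_updated, M4t @ phi_bar, 4)
assert phi_back.shape == phi_bar.shape
```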

Example:  For the instance of the LC-DPD structure with S=4S=4, Q=4Q=4, r=1/2r=1/2, and ν=1\nu=1 in Fig. 3(b), r(g1)=(1/2)(21)=2r^{-(g-1)}=(1/2)^{-(2-1)}=2; thus, from (30a), 𝕸4={𝕸4,1,𝕸4,2}\bm{\mathfrak{M}}_{4}=\{\bm{\mathfrak{M}}_{4,1},\bm{\mathfrak{M}}_{4,2}\}. As described earlier, the coefficients ϕ¯1,3\overline{\phi}_{1,3}, ϕ¯1,4\overline{\phi}_{1,4} are commonly shared with ϕ¯1,1\overline{\phi}_{1,1}, ϕ¯1,2\overline{\phi}_{1,2} and ϕ¯3,1\overline{\phi}_{3,1}, ϕ¯3,2\overline{\phi}_{3,2}. From (30d), we can obtain the vectors 𝚽tr.,1=[ϕ¯1,1,ϕ¯1,2,ϕ¯1,3,ϕ¯1,4]T\bm{\Phi}_{tr.,1}=[\overline{\phi}_{1,1},\overline{\phi}_{1,2},\overline{\phi}_{1,3},\overline{\phi}_{1,4}]^{T} and 𝚽tr.,2=[ϕ¯3,1,ϕ¯3,2,ϕ¯1,3,ϕ¯1,4]T\bm{\Phi}_{tr.,2}=[\overline{\phi}_{3,1},\overline{\phi}_{3,2},\overline{\phi}_{1,3},\overline{\phi}_{1,4}]^{T} from 𝚽¯=[ϕ¯1,1,ϕ¯3,1,ϕ¯1,2,ϕ¯3,2,ϕ¯1,3,ϕ¯1,4]T\bm{\overline{\Phi}}=[\overline{\phi}_{1,1},\overline{\phi}_{3,1},\overline{\phi}_{1,2},\overline{\phi}_{3,2},\overline{\phi}_{1,3},\overline{\phi}_{1,4}]^{T} using the operators 𝕸4,1\bm{\mathfrak{M}}_{4,1} and 𝕸4,2\bm{\mathfrak{M}}_{4,2} given by (30b):

𝕸4,1=[100000100000000000001001010000010000],𝕸4,2=[010000010000000000001001100000100000].\displaystyle\bm{\mathfrak{M}}_{4,1}=\footnotesize\left[\begin{array}[]{@{}c|c@{}}\begin{matrix}1&0&0&0\\ 0&0&1&0\\ 0&0&0&0\\ 0&0&0&0\end{matrix}&\begin{matrix}0&0\\ 0&0\\ 1&0\\ 0&1\end{matrix}\\ \hline\cr\begin{matrix}0&1&0&0\\ 0&0&0&1\end{matrix}&\begin{matrix}0&0\\ 0&0\end{matrix}\end{array}\right],\;\;\bm{\mathfrak{M}}_{4,2}=\footnotesize\left[\begin{array}[]{@{}c|c@{}}\begin{matrix}0&1&0&0\\ 0&0&0&1\\ 0&0&0&0\\ 0&0&0&0\end{matrix}&\begin{matrix}0&0\\ 0&0\\ 1&0\\ 0&1\end{matrix}\\ \hline\cr\begin{matrix}1&0&0&0\\ 0&0&1&0\end{matrix}&\begin{matrix}0&0\\ 0&0\end{matrix}\end{array}\right]. (35)

This example can also be used to realize the reverse operation in (30e). \Box
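A small numpy check of this example, using numeric stand-ins for the coefficients (ordered as in 𝚽¯\bm{\overline{\Phi}} above):

```python
import numpy as np

# Stand-in values for [phi_11, phi_31, phi_12, phi_32, phi_13, phi_14].
phi_bar = np.array([1.1, 3.1, 1.2, 3.2, 1.3, 1.4])

# The operators in (35) written as permutations:
# row i has its single 1 in column cols[i].
M41 = np.eye(6)[[0, 2, 4, 5, 1, 3]]
M42 = np.eye(6)[[1, 3, 4, 5, 0, 2]]

Q = 4
phi_tr1 = (M41 @ phi_bar)[:Q]  # [phi_11, phi_12, phi_13, phi_14]
phi_tr2 = (M42 @ phi_bar)[:Q]  # [phi_31, phi_32, phi_13, phi_14]
```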

Algorithm 4 Estimation of coefficients for LC-DPD (Method-III).
1:The values of ρ\rho, λ0\lambda_{0}, μ\mu, 𝕸1\bm{\mathfrak{M}}_{1}, 𝕸2\bm{\mathfrak{M}}_{2}, 𝕸4\bm{\mathfrak{M}}_{4}, 𝚽¯~(0)\bm{\tilde{\overline{\Phi}}}^{(0)}, and 𝒩\mathcal{N}
2:The estimated coefficients 𝚽¯^\bm{\widehat{\overline{\Phi}}}
3:𝑷t(0)=diag(μ𝑰Q,,μ𝑰QT1=Sr(ν+g1))\bm{P}_{t}^{(0)}=\text{diag}(\underbrace{\mu\bm{I}_{Q},\cdots,\mu\bm{I}_{Q}}_{T_{1}=Sr^{(\nu+g-1)}}); t{1,,r(g1)}t\in\{1,\cdots,r^{-(g-1)}\}
4:𝝃t(0)=λ0𝑰T1\bm{\xi}_{t}^{(0)}=\lambda_{0}\bm{I}_{T_{1}}; t{1,,r(g1)}t\in\{1,\cdots,r^{-(g-1)}\}
5:Operate 𝚽~(0)=𝕸1𝚽¯~(0)\bm{\tilde{\Phi}}^{(0)}=\bm{\mathfrak{M}}_{1}\bm{\tilde{\overline{\Phi}}}^{(0)} and assign 𝚽(0)=𝚽~(0)\bm{\Phi}^{(0)}=\bm{\tilde{\Phi}}^{(0)}
6:Using obtained 𝑿(1)\bm{X}(1) followed by 𝒀(1)\bm{Y}(1) and then 𝚿\bm{\Psi}^{{}^{\prime}}, find 𝚼(1)\bm{\Upsilon}(1), 𝚿¯(1)=𝕸2𝚿(1)\bm{\overline{\Psi^{{}^{\prime}}}}(1)=\bm{\mathfrak{M}}_{2}\bm{\Psi}^{{}^{\prime}}(1), and 𝚼¯(1)\bm{\overline{\Upsilon}}(1)
7:Compute 𝑿~(1)=𝚼(1)T𝚽~(0)\bm{\tilde{X}}(1)=\bm{\Upsilon}(1)^{T}\bm{\tilde{\Phi}}^{(0)}
8:n=0n=0
9:repeat
10:    t=1t=1
11:    repeat
12:         repeat
13:             Operate 𝚽~(n)=𝕸1𝚽¯~(n)\bm{\tilde{\Phi}}^{(n)}=\bm{\mathfrak{M}}_{1}\bm{\tilde{\overline{\Phi}}}^{(n)} and assign 𝚽(n)=𝚽~(n)\bm{\Phi}^{(n)}=\bm{\tilde{\Phi}}^{(n)}
14:             Using obtained 𝑿(n+1)\bm{X}(n+1) followed by 𝒀(n+1)\bm{Y}(n+1) and then 𝚿(n+1)\bm{\Psi}^{{}^{\prime}}(n+1), find 𝚼(n+1)\bm{\Upsilon}(n+1), 𝚿¯(n+1)=𝕸2𝚿(n+1)\bm{\overline{\Psi^{{}^{\prime}}}}(n+1)=\bm{\mathfrak{M}}_{2}\bm{\Psi}^{{}^{\prime}}(n+1), 𝚿t(n+1)=trunc(𝕸4,t𝚿¯(n+1),Q)\bm{\Psi}^{{}^{\prime}}_{t}(n+1)=\text{trunc}(\bm{\mathfrak{M}}_{4,t}\bm{\overline{\Psi}}^{{}^{\prime}}(n+1),Q), 𝚼t(n+1)\bm{\Upsilon}_{t}(n+1), and 𝚽tr.,t(n)=trunc(𝕸4,t𝚽¯~(n),Q)\bm{\Phi}_{tr.,t}^{(n)}=\text{trunc}(\bm{\mathfrak{M}}_{4,t}\bm{\tilde{\overline{\Phi}}}^{(n)},Q)
15:             𝑿~(n+1)=𝚼(n+1)T𝚽~(n)\bm{\tilde{X}}(n+1)=\bm{\Upsilon}(n+1)^{T}\bm{\tilde{\Phi}}^{(n)}
16:             𝑬(n+1)=𝑿(n+1)𝑿~(n+1)\bm{E}(n+1)=\bm{X}(n+1)-\bm{\tilde{X}}(n+1)
17:             𝑬¯(n+1)=𝕸2(𝑬(n+1)𝟏Q)\bm{\overline{E}}(n+1)=\bm{\mathfrak{M}}_{2}(\bm{E}(n+1)\bigotimes\bm{1}_{Q})
18:             𝑬t(n+1)=𝕸4,t𝑬¯(n+1)\bm{E}_{t}(n+1)=\bm{\mathfrak{M}}_{4,t}\bm{\overline{E}}(n+1)
19:             𝝃t(n+1)=ρ𝝃t(n)+𝑰T1ρ𝑰T1\bm{\xi}_{t}^{(n+1)}=\rho\bm{\xi}_{t}^{(n)}+\bm{I}_{{}_{T_{1}}}-\rho\bm{I}_{{}_{T_{1}}}
20:             𝒁t(n+1)=𝚼tT(n+1)𝑷t(n)𝚼t(n+1)+𝝃t(n+1)\bm{Z}_{t}^{(n+1)}=\bm{\Upsilon}_{t}^{T}(n+1)\bm{P}_{t}^{(n)}\bm{\Upsilon}_{t}^{*}(n+1)+\bm{\xi}_{t}^{(n+1)}
21:             𝑷t(n+1)=(𝑷t(n)𝑷t(n)𝚼t(n+1)𝒁t(n+1)1𝚼tT(n+1)𝑷t(n))𝚵t(n+1)1\bm{P}_{t}^{(n+1)}=(\bm{P}_{t}^{(n)}-\bm{P}_{t}^{(n)}\bm{\Upsilon}_{t}^{*}(n+1){\bm{Z}_{t}^{(n+1)}}^{-1}\bm{\Upsilon}_{t}^{T}(n+1)\bm{P}_{t}^{(n)}){\bm{\Xi}_{t}^{(n+1)}}^{-1}
22:             𝚽tr.,t(n+1)=𝚽tr.,t(n)+(𝑷t(n+1)𝚼t(n+1)𝟏T1)𝑬t(n+1)\bm{\Phi}_{tr.,t}^{(n+1)}=\bm{\Phi}_{tr.,t}^{(n)}+(\bm{P}_{t}^{(n+1)}\bm{\Upsilon}_{t}^{*}(n+1)\bm{1}_{T_{1}})\bigodot\bm{E}_{t}(n+1)
23:             𝚽¯~(n+1)=𝕸4,tTmerge(𝚽tr.,t(n+1),𝕸4,t𝚽¯~(n),Q)\bm{\tilde{\overline{\Phi}}}^{(n+1)}=\bm{\mathfrak{M}}_{4,t}^{T}\text{merge}(\bm{\Phi}_{tr.,t}^{(n+1)},\bm{\mathfrak{M}}_{4,t}\bm{\tilde{\overline{\Phi}}}^{(n)},Q)
24:             n=n+1n=n+1
25:         until n%𝒩==0n\;\%\;\mathcal{N}==0
26:         t=t+1t=t+1
27:    until t>r(g1)t>r^{-(g-1)}
28:until 𝚽¯~(n)\bm{\tilde{\overline{\Phi}}}^{(n)} converges
29:𝚽¯^=𝚽¯~(n)\bm{\widehat{\overline{\Phi}}}=\bm{\tilde{\overline{\Phi}}}^{(n)}

Now, the use of the sequence of operators 𝕸4\bm{\mathfrak{M}}_{4} to correlate the common coefficients with the remaining distinct coefficients is described in Algorithm 4. Apart from the input parameters of Algorithm 2, 𝕸4\bm{\mathfrak{M}}_{4} and the number 𝒩\mathcal{N} (defined later) are inputs to Algorithm 4. Then, it initializes the correlation matrix 𝑷t\bm{P}_{t} and the forgetting matrix 𝝃t\bm{\xi}_{t} for t{1,,r(g1)}t\in\{1,\cdots,r^{-(g-1)}\}. In Steps 5, 6, and 7, similar to the earlier algorithms, it determines the initial values of 𝑿~\bm{\tilde{X}} and 𝚼¯\bm{\overline{\Upsilon}}. Thereafter, three nested loops are initialized. In Steps 13 and 14 of the innermost loop, the algorithm determines 𝚼t\bm{\Upsilon}_{t} and 𝚽tr.,t\bm{\Phi}_{tr.,t} using (30d) in the nnth iteration. Steps 15 to 22 follow the process to update 𝚽tr.,t\bm{\Phi}_{tr.,t} in the current iteration. Then, 𝚽tr.,t\bm{\Phi}_{tr.,t} is converted back to 𝚽¯~\bm{\tilde{\overline{\Phi}}} using (30e). This process repeats for 𝒩\mathcal{N} iterations to correlate the common coefficients with the ttth set of distinct coefficients. Thereafter, tt increases by unity to establish the correlation of the common coefficients with the next set of distinct coefficients. Thus, the two inner loops repeat until t>r(g1)t>r^{-(g-1)}, at which point the algorithm completes one cycle of correlating the common coefficients with all sets of distinct coefficients. The outermost loop repeats this cycle until 𝚽¯~\bm{\tilde{\overline{\Phi}}} converges.
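The three nested loops reduce to the following schedule skeleton; all names here are illustrative, and the RPEM update and convergence test are abstracted into callbacks:

```python
def lcdpd_method3_schedule(num_sets, N, update_fn, converged, max_cycles=100):
    """Loop skeleton of Algorithm 4 (names illustrative, not from the paper).

    num_sets : r^{-(g-1)}, the number of distinct-coefficient sets
    N        : iterations spent on each set before moving to the next
    update_fn(t) : performs one RPEM update against set t
    converged()  : convergence test on the coefficient estimate
    Returns the total number of RPEM iterations performed.
    """
    n = 0
    for _ in range(max_cycles):           # outermost loop: correlation cycles
        for t in range(1, num_sets + 1):  # middle loop: sweep over the sets
            for _ in range(N):            # innermost loop: N RPEM updates
                update_fn(t)
                n += 1
        if converged():
            break
    return n
```

With num_sets = 2 and N = 3, one cycle visits set 1 three times and then set 2 three times, matching the `n % N == 0` advancement of t in the algorithm.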

Performance and Complexity

Algorithm 4 performs better than Algorithms 2 and 3 because the common coefficients establish correlation with all the distinct coefficients in the vector 𝚽¯\bm{\overline{\Phi}}. Therefore, the generated predistorted signal vector 𝑿\bm{X} linearizes the PAs in the subarray better. Moreover, for a fair comparison of this algorithm with the earlier algorithms in terms of computational complexity, we assign 𝒩=1\mathcal{N}=1. The complexity of a correlation cycle depends on the dominant matrix multiplications in Steps 20, 21, and 22. For the correlation of the common coefficients with the r(g1)r^{-(g-1)} sets of distinct coefficients, the complexities of the three steps are: O(Q2T1+QT1)r(g1)O(Q^{2}T_{1}+QT_{1})r^{-(g-1)}, O(2Q3T1+2Q2T1+QT1)r(g1)O(2Q^{3}T_{1}+2Q^{2}T_{1}+QT_{1})r^{-(g-1)}, and O(Q2T1+QT1)r(g1)O(Q^{2}T_{1}+QT_{1})r^{-(g-1)}. Thus, considering the dominant term, the complexity is O(Q3T1)r(g1)O(Q^{3}T_{1})r^{-(g-1)}. Taking r(g1)r^{-(g-1)} inside the big O, the complexity can be approximated as: O(Q3T1)r(g1)O(Q3σ¯g)O(Q^{3}T_{1})r^{-(g-1)}\approx O(Q^{3}\overline{\sigma}_{g}). As T1<σ¯g<ST_{1}<\overline{\sigma}_{g}<S, the complexity of Algorithm 4 is greater than that of Algorithm 3 but lower than that of Algorithms 1 and 2.

Figure 5: Linearization of the PAs in a subarray using different schemes. (a) FF-DPD and Single-DPD; (b) LC-DPD I; (c) LC-DPD II; (d) LC-DPD III; (e) EVM.

V Numerical Results

V-A Evaluation Environment

To evaluate the performance of the proposed analysis, we consider a subarray of S=8S=8 PAs. The PAs follow the Saleh model as in [43]. The parameters αa,i\alpha_{a,i} and βa,i\beta_{a,i} for the AM/AM distortion and αϕ,i\alpha_{\phi,i} and βϕ,i\beta_{\phi,i} for the AM/PM distortion are given by: αa,i=0.9445+0.1ua,i\alpha_{a,i}=0.9445+0.1u_{a,i}, βa,i=0.5138+0.1va,i\beta_{a,i}=0.5138+0.1v_{a,i}, αϕ,i=4.0033+uϕ,i\alpha_{\phi,i}=4.0033+u_{\phi,i}, and βϕ,i=9.1040+vϕ,i\beta_{\phi,i}=9.1040+v_{\phi,i}, where ua,iu_{a,i}, va,iv_{a,i}, uϕ,iu_{\phi,i}, and vϕ,iv_{\phi,i} are uniformly distributed over [0,1][0,1] for i{1,,S}i\in\{1,\cdots,S\}. The GMP used for a DPD has the order P=5P=5, and each order has the memory length M=5M=5. But, as described using Fig. 2, only the BFs with indices in the set \mathcal{I} have nonzero coefficients; thus, Q=10Q=10. Further, for the LC-DPD scheme, the BFs are arranged in decreasing order of their dominance. The arrangement is represented using the indices of the BFs as: {4,5,14,15,19,20,24,25,9,10}\{4,5,14,15,19,20,24,25,9,10\}. Moreover, for the LC-DPD structure, the geometric sequence in (6) has the following parameter values: g=2g=2, n1=4n_{1}=4, n2=6n_{2}=6, ν=1\nu=1, and r=1/2r=1/2. The bandwidth of the input signal s(n)s(n) is 44 MHz. To gain insight into the linearization from the obtained results, the in-band average powers of the power spectral density (PSD) of the input signal s(n)s(n) and of the outputs of the PAs, yl(n)y_{l}(n); l{1,,S}l\in\{1,\cdots,S\}, are normalized to 0 dB. Moreover, for the algorithms based on ILA-RPEM, the input parameters are set as: λ0=0.99\lambda_{0}=0.99, μ=0.2\mu=0.2, and ρ=0.95\rho=0.95. The linearization of the PAs is quantified using the error vector magnitudes (EVMs) of their outputs with respect to the reference message signal s(n)s(n), computed as: EVMPAl=n(yl(n)s(n))2/ns2(n)\text{EVM}_{PA_{l}}=\sqrt{\sum_{n}(y_{l}(n)-s(n))^{2}/\sum_{n}s^{2}(n)}; l{1,,S}l\in\{1,\cdots,S\}.
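A minimal numpy sketch of the Saleh PA model and the EVM metric described above; the per-PA uniform perturbations are omitted here, and the EVM is written with |·|² so that it also covers complex baseband signals (an assumption beyond the real-valued formula above):

```python
import numpy as np

def saleh_pa(x, alpha_a=0.9445, beta_a=0.5138, alpha_p=4.0033, beta_p=9.1040):
    """Saleh PA model: AM/AM and AM/PM distortion of a complex baseband
    signal x. Defaults are the nominal parameter values of the setup."""
    r = np.abs(x)
    gain = alpha_a / (1.0 + beta_a * r**2)            # AM/AM: A(r) = a*r/(1+b*r^2)
    phase = alpha_p * r**2 / (1.0 + beta_p * r**2)    # AM/PM (radians)
    return gain * x * np.exp(1j * phase)

def evm(y, s):
    """Error vector magnitude of a PA output y against the reference s."""
    return np.sqrt(np.sum(np.abs(y - s)**2) / np.sum(np.abs(s)**2))
```

At small input amplitudes the model behaves almost linearly with gain close to αa\alpha_{a}, which is a quick sanity check for an implementation.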
The simulation for the proposed analysis is performed using MATLAB/Simulink®{}^{\text{\textregistered}}.

V-B Performance Comparison

In Fig. 5, the different DPD schemes are compared in terms of linearization of the PAs in the subarray. It can be observed in Fig. 5(a) that FF-DPD gives the best performance in linearizing all the PAs, whereas single-DPD performs the worst. This is because, in FF-DPD, each PA has a separate DPD to linearize it, while in single-DPD, all the PAs are linearized using a single DPD. Also, from the bar plot in Fig. 5(e), under FF-DPD, all the PAs have almost the same EVM, around 3.04%3.04\%. In contrast, for single-DPD, the EVM values of the PAs differ, and the maximum value goes up to 33.31%33.31\%. Comparing the three LC-DPD schemes in Figs. 5(b), 5(c), and 5(d), the linearization performance of LC-DPD I is the lowest, because its coefficients are determined using the structure of FF-DPD, where the correlation of the common coefficients with the distinct coefficients is the weakest. Therefore, although it gives better performance than single-DPD, none of the PAs is linearized properly. The average EVMs for the first four and the next four PAs are 17.99%17.99\% and 11.56%11.56\%, respectively. In the LC-DPD II scheme, the LC-DPD structure is fully exploited, but the common coefficients are correlated with the distinct coefficients of only the first two PAs. Therefore, the linearization of these PAs is the same as under FF-DPD, with an average EVM equal to 2.71%2.71\%, while the linearization of the remaining 66 PAs is worse, with an average EVM value of 12.09%12.09\%. In LC-DPD III, the common coefficients are partially correlated with each set of distinct coefficients; therefore, its overall performance is better than that of the previous two schemes. The average EVM values of the LC-DPD I, LC-DPD II, and LC-DPD III schemes are 14.77%14.77\%, 9.74%9.74\%, and 8.98%8.98\%, respectively.

VI Conclusion

In this work, we have proposed two schemes, FF-DPD and LC-DPD, to fully linearize the PAs in a subarray of an mMIMO transmitter. Although FF-DPD provides the best performance, it has high complexity. Using the structure of FF-DPD, we derive the less complex LC-DPD. For the two schemes, four algorithms based on ILA-RPEM are described, and their performances and complexities are investigated. From the obtained results, we find that FF-DPD almost fully linearizes the PAs, with an average EVM of 3.04%3.04\%. The three algorithms for LC-DPD are computationally less complex, but their EVM performance is worse than that of the algorithm for FF-DPD. Furthermore, among the three algorithms for LC-DPD, the third algorithm (LC-DPD III) provides the best performance, as it better correlates the common coefficients with the distinct coefficients of the scheme.

References

  • [1] J. Kenney and A. Leke, “Design considerations for multicarrier CDMA base station power amplifiers,” Microw. J., vol. 42, no. 2, pp. 76–83, Feb. 1999.
  • [2] A. Katz, J. Wood, and D. Chokola, “The evolution of PA linearization: From classic feedforward and feedback through analog and digital predistortion,” IEEE Microw. Mag., vol. 17, no. 2, pp. 32–40, Feb. 2016.
  • [3] D. R. Morgan, Z. Ma, J. Kim, M. G. Zierdt, and J. Pastalan, “A generalized memory polynomial model for digital predistortion of RF power amplifiers,” IEEE Trans. Signal Process., vol. 54, no. 10, pp. 3852–3860, Oct. 2006.
  • [4] A. N. D’Andrea, V. Lottici, and R. Reggiannini, “RF power amplifier linearization through amplitude and phase predistortion,” IEEE Trans. Commun., vol. 44, no. 11, pp. 1477–1484, Nov. 1996.
  • [5] Y. Liu, W. Pan, S. Shao, and Y. Tang, “A general digital predistortion architecture using constrained feedback bandwidth for wideband power amplifiers,” IEEE Trans. Microw. Theory Tech., vol. 63, no. 5, pp. 1544–1555, Feb. 2015.
  • [6] Z. Wang, W. Chen, G. Su, F. M. Ghannouchi, Z. Feng, and Y. Liu, “Low feedback sampling rate digital predistortion for wideband wireless transmitters,” IEEE Trans. Microw. Theory Tech., vol. 64, no. 11, pp. 3528–3539, Nov. 2016.
  • [7] S. Zhang, W. Chen, F. M. Ghannouchi, and Y. Chen, “An iterative pruning of 2-D digital predistortion model based on normalized polynomial terms,” in Proc. IEEE MTT-S Int. Microw. Symp. Digest, Seattle, WA, USA, Jun. 2013, pp. 1–4.
  • [8] Z. Wang, W. Chen, G. Su, F. M. Ghannouchi, Z. Feng, and Y. Liu, “Low computational complexity digital predistortion based on direct learning with covariance matrix,” IEEE Trans. Microw. Theory Tech., vol. 65, no. 11, pp. 4274–4284, Nov. 2017.
  • [9] L. Guan and A. Zhu, “Optimized low-complexity implementation of least squares based model extraction for digital predistortion of RF power amplifiers,” IEEE Trans. Microw. Theory Tech., vol. 60, no. 3, pp. 594–603, Mar. 2012.
  • [10] P. L. Gilabert, G. Montoro, D. López, N. Bartzoudis, E. Bertran, M. Payaro, and A. Hourtane, “Order reduction of wideband digital predistorters using principal component analysis,” in Proc. IEEE MTT-S Int. Microw. Symp. Digest, Seattle, WA, USA, Jun. 2013, pp. 1–7.
  • [11] R. N. Braithwaite, “Wide bandwidth adaptive digital predistortion of power amplifiers using reduced order memory correction,” in Proc. IEEE MTT-S Int. Microw. Symp. Digest, Atlanta, GA, USA, Jun. 2008, pp. 1517–1520.
  • [12] J. Swaminathan, P. Kumar, and M. Vinoth, “Performance analysis of LMS filter in linearization of different memoryless non linear power amplifier models,” in Proc. Int. Conf. Advances Computing, Commun. Control, Berlin, Heidelberg, Jan. 2013, pp. 459–464.
  • [13] P. M. Suryasarman and A. Springer, “A comparative analysis of adaptive digital predistortion algorithms for multiple antenna transmitters,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 62, no. 5, pp. 1412–1420, May 2015.
  • [14] D. R. Morgan, Z. Ma, and L. Ding, “Reducing measurement noise effects in digital predistortion of RF power amplifiers,” in Proc. IEEE ICC, vol. 4, Anchorage, AK, USA, May 2003, pp. 2436–2439.
  • [15] L. Gan and E. Abd-Elrady, “Digital predistortion of memory polynomial systems using direct and indirect learning architectures,” in Proc. Int. Conf. IASTED, vol. 654, p. 802.
  • [16] B. Mohr, W. Li, and S. Heinen, “Analysis of digital predistortion architectures for direct digital-to-RF transmitter systems,” in Proc. IEEE Int. Midwest Symp. Circuits Syst., Boise, ID, USA, Aug. 2012, pp. 650–653.
  • [17] T. Söderström and P. Stoica, System identification.   Prentice-Hall International, 1989.
  • [18] D. Zhou and V. E. DeBrunner, “Novel adaptive nonlinear predistorters based on the direct learning algorithm,” IEEE Trans. Signal Process., vol. 55, no. 1, pp. 120–133, Jan. 2006.
  • [19] H. Paaso and A. Mammela, “Comparison of direct learning and indirect learning predistortion architectures,” in Proc. IEEE Int. Symp. Wireless Commun. Syst., Reykjavik, Iceland, Oct. 2008, pp. 309–313.
  • [20] J. Chani-Cahuana, P. N. Landin, C. Fager, and T. Eriksson, “Iterative learning control for RF power amplifier linearization,” IEEE Trans. Microw. Theory Tech., vol. 64, no. 9, pp. 2778–2789, Sep. 2016.
  • [21] H. Chauhan, V. Kvartenko, and M. Onabajo, “A tuning technique for temperature and process variation compensation of power amplifiers with digital predistortion,” in Proc. IEEE North Atlantic Test Workshop, Providence, RI, USA, May 2016, pp. 38–45.
  • [22] E. Jarvinen, S. Kalajo, and M. Matilainen, “Bias circuits for GaAs HBT power amplifiers,” in Proc. IEEE MTT-S Int. Microw. Symps. Digest, vol. 1, Phoenix, AZ, USA, May 2001, pp. 507–510.
  • [23] E. Ng, Y. Beltagy, P. Mitran, and S. Boumaiza, “Single-input single-output digital predistortion of power amplifier arrays in millimeter wave RF beamforming transmitters,” in Proc. IEEE Int. Microw. Symp.-IMS, Philadelphia, PA, USA, Jun. 2018, pp. 481–484.
  • [24] K. Hausmair, P. N. Landin, U. Gustavsson, C. Fager, and T. Eriksson, “Digital predistortion for multi-antenna transmitters affected by antenna crosstalk,” IEEE Trans. Microw. Theory Tech., vol. 66, no. 3, pp. 1524–1535, Mar. 2018.
  • [25] S. Choi and E.-R. Jeong, “Digital predistortion based on combined feedback in MIMO transmitters,” IEEE Commun. Lett., vol. 16, no. 10, pp. 1572–1575, Oct. 2012.
  • [26] Q. Luo, X.-W. Zhu, C. Yu, and W. Hong, “Single-receiver over-the-air digital predistortion for massive MIMO transmitters with antenna crosstalk,” IEEE Trans. Microw. Theory Tech., vol. 68, no. 1, pp. 301–315, Jan. 2019.
  • [27] X. Liu, Q. Zhang, W. Chen, H. Feng, L. Chen, F. M. Ghannouchi, and Z. Feng, “Beam-oriented digital predistortion for 5G massive MIMO hybrid beamforming transmitters,” IEEE Trans. Microw. Theory Tech., vol. 66, no. 7, pp. 3419–3432, Jul. 2018.
  • [28] C. Tarver, A. Balatsoukas-Stimming, C. Studer, and J. R. Cavallaro, “OFDM-based beam-oriented digital predistortion for massive MIMO,” in Proc. IEEE Int. Symp. Circuits Syst., Daegu, Korea, May 2021, pp. 1–5.
  • [29] X. Liu, W. Chen, L. Chen, F. M. Ghannouchi, and Z. Feng, “Power scalable beam-oriented digital predistortion for compact hybrid massive MIMO transmitters,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 67, no. 12, pp. 4994–5006, Dec. 2020.
  • [30] C. Yu, J. Jing, H. Shao, Z. H. Jiang, P. Yan, X.-W. Zhu, W. Hong, and A. Zhu, “Full-angle digital predistortion of 5G millimeter-wave massive MIMO transmitters,” IEEE Trans. Microw. Theory Tech., vol. 67, no. 7, pp. 2847–2860, Jul. 2019.
  • [31] A. Brihuega, M. Abdelaziz, L. Anttila, M. Turunen, M. Allén, T. Eriksson, and M. Valkama, “Piecewise digital predistortion for mmWave active antenna arrays: Algorithms and measurements,” IEEE Trans. Microw. Theory Tech., vol. 68, no. 9, pp. 4000–4017, Sep. 2020.
  • [32] N. Tervo, B. Khan, J. P. Aikio, O. Kursu, M. Jokinen, M. E. Leinonen, M. Sonkki, T. Rahkonen, and A. Pärssinen, “Combined sidelobe reduction and omnidirectional linearization of phased array by using tapered power amplifier biasing and digital predistortion,” IEEE Trans. Microw. Theory Tech., vol. 69, no. 9, pp. 4284–4299, Sep. 2021.
  • [33] C. Yu, J. Jing, H. Shao, Z. H. Jiang, P. Yan, X.-W. Zhu, W. Hong, and A. Zhu, “Full-angle digital predistortion of 5G millimeter-wave massive MIMO transmitters,” IEEE Trans. Microw. Theory Tech., vol. 67, no. 7, pp. 2847–2860, Jul. 2019.
  • [34] P. Diao, L. Zhang, L. Tao, Y. Zhang, Y. Yi, H. Liu, and D. Zhao, “Full-angle digital predistortion technique for 5G millimeter-wave integrated phased array,” in Proc. IEEE MTT-S Int. Wireless Symp., vol. 1, Harbin, China, Aug. 2022, pp. 1–3.
  • [35] J. Zhao, P. Liu, L. Zhai, and F. Yang, “A novel digital predistortion based on flexible characteristic detection for 5G massive MIMO transmitters,” IEEE Microw. Wireless Compon. Lett., vol. 32, no. 4, pp. 363–366, Apr. 2022.
  • [36] J. Yan, H. Wang, and J. Shen, “Novel post-weighting digital predistortion structures for hybrid beamforming systems,” IEEE Commun. Lett., vol. 25, no. 12, pp. 3980–3984, Dec. 2021.
  • [37] G. Prasad and H. Johansson, “A low-complexity post-weighting predistorter in a mMIMO transmitter under crosstalk,” arXiv preprint arXiv:2304.05795, 2023.
  • [38] A. A. Saleh, “Frequency-independent and frequency-dependent nonlinear models of TWT amplifiers,” IEEE Trans. Commun., vol. 29, no. 11, pp. 1715–1720, Nov. 1981.
  • [39] A. Brihuega, L. Anttila, M. Abdelaziz, T. Eriksson, F. Tufvesson, and M. Valkama, “Digital predistortion for multiuser hybrid MIMO at mmWaves,” IEEE Trans. Signal Process., vol. 68, pp. 3603–3618, May 2020.
  • [40] L. Ljung and T. Söderström, Theory and Practice of Recursive Identification.   MIT Press, 1983.
  • [41] Y. Li, S.-L. Hu, J. Wang, and Z.-H. Huang, “An introduction to the computational complexity of matrix multiplication,” Journal of the Operations Research Society of China, vol. 8, pp. 29–43, Dec. 2020.
  • [42] G. Strang, Linear Algebra and Its Applications, 4th ed., 2012.
  • [43] C. Liu, W. Feng, Y. Chen, C.-X. Wang, and N. Ge, “Optimal beamforming for hybrid satellite terrestrial networks with nonlinear PA and imperfect CSIT,” IEEE Wireless Commun. Lett., vol. 9, no. 3, pp. 276–280, Mar. 2020.