
Deep-Learning-Aided Successive-Cancellation Decoding of Polar Codes

Seyyed Ali Hashemi¹, Nghia Doan², Thibaud Tonnellier², Warren J. Gross²
¹Department of Electrical Engineering, Stanford University, USA
²Department of Electrical and Computer Engineering, McGill University, Canada
ahashemi@stanford.edu, nghia.doan@mail.mcgill.ca, thibaud.tonnellier@mcgill.ca, warren.gross@mcgill.ca
Abstract

A deep-learning-aided successive-cancellation list (DL-SCL) decoding algorithm for polar codes is introduced, with deep-learning-aided successive-cancellation (DL-SC) decoding being a specific case of it. The DL-SCL decoder works by allowing additional rounds of SCL decoding when the first SCL decoding attempt fails, using a novel bit-flipping metric. The proposed bit-flipping metric exploits the inherent relations between the information bits in polar codes, which are represented by a correlation matrix. The correlation matrix is then optimized using emerging deep-learning techniques. Performance results on a polar code of length $128$ with $64$ information bits concatenated with a $24$-bit cyclic redundancy check show that the proposed bit-flipping metric in the proposed DL-SCL decoder requires up to $66\%$ fewer multiplications and up to $36\%$ fewer additions than the state of the art, without any need to perform transcendental functions, while providing almost the same error-correction performance.

Index Terms:
5G, polar codes, deep learning, SC, SCL, SC-Flip, SCL-Flip.

I Introduction

Polar codes represent a class of error-correcting codes that provably achieve the capacity of any symmetric binary-input memoryless channel under low-complexity successive-cancellation (SC) decoding [1]. Polar codes were recently selected for use in the enhanced mobile broadband (eMBB) control channel of the fifth generation of cellular technology (5G standard), where codes with short block lengths are used [2]. However, the error-correction performance of short polar codes under SC decoding does not satisfy the requirements of the 5G standard. SC list (SCL) decoding was introduced in [3] to improve the error-correction performance of SC decoding by keeping a list of candidate message words at each decoding step. In addition, it was observed that under SCL decoding, the error-correction performance is significantly improved when the polar code is concatenated with a cyclic redundancy check (CRC) code [3]. However, the decoding complexity of SCL grows as the list size increases.

Unlike SCL decoding, SC flip (SCF) decoding [4] performs multiple SC decoding attempts in series, where in each attempt, the information bit estimated to be the first-order error of the initial SC decoding attempt is flipped. Similar to SCL decoding, SCF decoding uses a CRC code to determine whether a decoding attempt is successful, and a bit-flipping metric is used to identify the erroneous information bit. Several methods have been proposed to improve the error-correction performance of SCF [5, 6, 7]. However, their bit-flipping metric for a given information bit is oversimplified, as only the log-likelihood ratio (LLR) corresponding to that bit is considered. To overcome this problem, dynamic SCF (DSCF) decoding [8] defines a more accurate bit-flipping metric, which utilizes the LLR values of all the previously decoded information bits. It was shown in [8] that at practical signal-to-noise ratio (SNR) values, DSCF decoding can achieve an error-correction performance comparable to that of SCL decoding, while maintaining an average decoding complexity close to that of SC decoding. However, the bit-flipping metric in DSCF decoding requires costly exponential and logarithmic computations, which hinders efficient hardware implementation of the algorithm.

In this paper, the likelihood of the correct decoding of each information bit under SC or SCL decoding is estimated by exploiting the inherent correlations among all the information bits. These correlations are expressed in the form of a trainable correlation matrix. Consequently, a bit-flipping metric based on the proposed correlation matrix is introduced. It only requires multiplication and addition operations in the LLR domain, completely avoiding the costly transcendental functions required by DSCF decoding. Motivated by recent developments that exploit deep learning (DL) to decode polar codes [9, 10, 11, 12, 13], DL techniques are applied to optimize the correlation matrix. The proposed decoding algorithm is thus called deep-learning-aided SCL (DL-SCL) decoding, with DL-SC decoding being its special case when the list size is one. Performance results on a polar code of length $128$ with $64$ information bits concatenated with a $24$-bit CRC show that the proposed bit-flipping metric in the proposed DL-SCL decoder requires up to $66\%$ fewer multiplications and up to $36\%$ fewer additions in comparison with the decoder that uses the bit-flipping metric in [8]. Moreover, the proposed decoder does not need to perform any transcendental functions and provides almost the same error-correction performance as the decoder that uses the bit-flipping metric in [8].

II Preliminaries

II-A Polar Codes, SC Decoding, and SCL Decoding

A polar code $\mathcal{P}(N,K)$ of block length $N$ with $K$ information bits is derived as $\bm{x}=\bm{u}\bm{G}^{\otimes n}$, where $\bm{x}=\{x_{0},x_{1},\ldots,x_{N-1}\}$ is the polar codeword, $\bm{u}=\{u_{0},u_{1},\ldots,u_{N-1}\}$ is the message word, $\bm{G}^{\otimes n}$ is the $n$-th Kronecker power of the polarizing matrix $\bm{G}=\bigl[\begin{smallmatrix}1&0\\ 1&1\end{smallmatrix}\bigr]$, and $n=\log_{2}N$. The vector $\bm{u}$ consists of a set $\mathcal{A}$ of the indices of the $K$ information bits and a set $\mathcal{A}^{c}$ of the indices of the $N-K$ frozen bits. The positions of the frozen bits are known to both the encoder and the decoder, and their values are set to $0$. In this paper, binary phase-shift keying (BPSK) modulation is considered. Therefore, the received signal corresponding to the transmitted codeword is represented as $\bm{y}=(\mathbf{1}-2\bm{x})+\bm{z}$, where $\mathbf{1}$ is an all-one vector of size $N$, and $\bm{z}\in\mathbb{R}^{N}$ is the additive white Gaussian noise (AWGN) vector with zero mean and variance $\sigma^{2}$. The LLR vector of the received signal is then given as $\bm{L}_{n}=\frac{2\bm{y}}{\sigma^{2}}$.
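As a concrete illustration of this setup, the following minimal Python sketch encodes a message word with the iterative polar transform and computes the channel LLRs for BPSK over AWGN. It is an illustrative sketch, not the authors' implementation; all names and toy parameters are assumptions.

import numpy as np

def polar_encode(u):
    """Compute x = u * (n-th Kronecker power of G) for u of length N = 2^n."""
    x = u.copy()
    n = int(np.log2(len(u)))
    for s in range(n):
        step = 1 << s
        for i in range(0, len(u), 2 * step):
            for j in range(i, i + step):
                x[j] ^= x[j + step]  # butterfly: upper branch absorbs the lower one
    return x

rng = np.random.default_rng(0)
N, sigma = 8, 0.5
u = rng.integers(0, 2, N)                     # message word (frozen positions would be 0)
x = polar_encode(u)                           # polar codeword
y = (1 - 2 * x) + sigma * rng.normal(size=N)  # BPSK mapping plus AWGN
L_n = 2 * y / sigma**2                        # channel LLRs initializing stage n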

Figure 1: (a) SC decoding on the factor graph of $\mathcal{P}(8,5)$ with $\mathcal{A}^{c}=\{0,1,2\}$, (b) a PE.

SC decoding can be illustrated on a polar code factor graph representation. Fig. 1(a) shows an example of a factor graph for $\mathcal{P}(8,5)$. To obtain the estimated message word, the LLR values and the hard bit estimations are propagated through all the processing elements (PEs) in the factor graph, one of which is depicted in Fig. 1(b). A PE performs LLR computations as

L_{s,i} = \min(|L_{s+1,i}|, |L_{s+1,i+2^{s}}|)\operatorname{sgn}(L_{s+1,i})\operatorname{sgn}(L_{s+1,i+2^{s}}),
L_{s,i+2^{s}} = (1-2\hat{v}_{s,i})L_{s+1,i} + L_{s+1,i+2^{s}}, (1)

where $L_{s,i}$ and $\hat{v}_{s,i}$ are the LLR value and the hard bit estimation at the $s$-th stage, $0\leq s\leq n$, and the $i$-th bit, $0\leq i\leq N-1$, respectively. The hard bit values of the PE are computed as

\hat{v}_{s+1,i} = \hat{v}_{s,i} \oplus \hat{v}_{s,i+2^{s}}, (2)
\hat{v}_{s+1,i+2^{s}} = \hat{v}_{s,i+2^{s}},

where $\oplus$ denotes the logical XOR operation.

The LLR values at the $n$-th stage are initialized to $\bm{L}_{n}$. In SC decoding, the hard bit estimations at the $0$-th stage are calculated as

\hat{u}_{i} = \hat{v}_{0,i} = \begin{cases} 0 & \text{if } i \in \mathcal{A}^{c}, \\ \frac{1-\operatorname{sgn}(L_{0,i})}{2} & \text{otherwise.} \end{cases} (3)
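The PE updates in (1), the partial-sum combination in (2), and the hard decision in (3) reduce to a handful of elementary operations. A minimal Python sketch, with illustrative names, is:

import numpy as np

def f(L_a, L_b):
    """Upper-branch update in (1): min-sum magnitude with sign recombination."""
    return np.sign(L_a) * np.sign(L_b) * np.minimum(np.abs(L_a), np.abs(L_b))

def g(L_a, L_b, v_hat):
    """Lower-branch update in (1), given the partial sum v_hat."""
    return (1 - 2 * v_hat) * L_a + L_b

def combine(v_a, v_b):
    """Partial-sum propagation (2): (v_a XOR v_b, v_b)."""
    return v_a ^ v_b, v_b

def hard_decision(L, is_frozen):
    """Bit estimate (3): frozen bits are 0; otherwise (1 - sgn(L)) / 2."""
    return 0 if is_frozen else int(L < 0)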

In SCL decoding, at the $0$-th stage, each information bit is estimated as either $0$ or $1$, and at each decoding step, only the $M$ most likely candidate paths are allowed to survive. After the last bit is estimated in SCL decoding, the path with the highest reliability metric is selected as the decoding result. If a CRC of length $c$ is used to help SCL decoding, after the last bit is estimated, the path that passes the CRC verification is selected as the decoding result.

II-B SCF and DSCF Decoding

SCF decoding is used to decode a polar code that is concatenated with a CRC of length $c$ for verification. It starts by performing SC decoding, and if the CRC verification fails after the initial SC decoding, it flips the bit estimation of the information bit with the smallest absolute LLR value [4]. However, this simple bit-flipping metric prevents SCF decoding from obtaining a satisfactory error-correction performance [8].

To determine the bit-flipping position, DSCF decoding estimates the probability $P^{*}_{i_{\omega}}$ of the $i_{\omega}$-th bit ($i_{\omega}\in\mathcal{A}$) being the first-order error bit after the initial SC decoding attempt as

P^{*}_{i_{\omega}} = (1-p^{*}_{i_{\omega}}) \times \prod_{\substack{\forall i \in \mathcal{A} \setminus i_{\omega} \\ i < i_{\omega}}} p^{*}_{i}, (4)

where $p^{*}_{i}$ is defined as

p^{*}_{i} = \text{Pr}(\hat{u}_{i} = u_{i} \,|\, \bm{y}, \bm{\hat{u}}_{0}^{i-1} = \bm{u}_{0}^{i-1}), (5)

with $\bm{\hat{u}}_{0}^{i-1}=\{\hat{u}_{0},\hat{u}_{1},\dots,\hat{u}_{i-1}\}$ and $\bm{u}_{0}^{i-1}=\{u_{0},u_{1},\dots,u_{i-1}\}$. Therefore, the bit-flipping position $i^{*}_{\omega}$ that maximizes the probability of $\bm{\hat{u}}$ being correctly decoded after the second SC decoding attempt can be calculated as

i^{*}_{\omega} = \operatorname*{arg\,max}_{\forall i_{\omega} \in \mathcal{A}} P^{*}_{i_{\omega}}. (6)

Note that $p^{*}_{i}$ cannot be obtained during the course of decoding, since the message word $\bm{u}$ is unknown to the decoder [8]. Therefore, DSCF approximates $p^{*}_{i}$ as

p^{*}_{i} \approx \max\left(\text{Pr}(\hat{u}_{i}=0|\bm{y},\bm{\hat{u}}_{0}^{i-1}), \text{Pr}(\hat{u}_{i}=1|\bm{y},\bm{\hat{u}}_{0}^{i-1})\right) = \frac{1}{1+\exp\left(-|L_{0,i}|\right)}. (7)

It was observed in [8] that the approximation in (7) does not result in a desirable error-correction performance. Therefore, a perturbation parameter $\alpha\in\mathbb{R}^{+}$ is introduced to obtain a better estimation of $p^{*}_{i}$ as

p^{*}_{i} \approx \frac{1}{1+\exp\left(-\alpha|L_{0,i}|\right)}. (8)

To enable numerically stable computations for a hardware implementation, the bit-flipping metric is defined as [8]

Q_{\text{DSCF}}(i_{\omega}) = -\frac{1}{\alpha}\ln(P^{*}_{i_{\omega}}) = |L_{0,i_{\omega}}| + \sum_{\substack{\forall i \in \mathcal{A} \\ i \leq i_{\omega}}} \frac{1}{\alpha}\ln\left(1+\exp\left(-\alpha|L_{0,i}|\right)\right). (9)

Consequently, the most probable bit-flipping position $i^{*}_{\omega}$ under DSCF decoding can be found as

i^{*}_{\omega} = \operatorname*{arg\,min}_{\forall i_{\omega} \in \mathcal{A}} Q_{\text{DSCF}}(i_{\omega}). (10)
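As a concrete illustration, the metric in (9) can be computed with a single running sum over the information-bit LLRs, followed by the selection in (10). The Python sketch below is illustrative; the value of alpha is a placeholder, not the tuned value from [8].

import numpy as np

def q_dscf(abs_llrs, alpha):
    """Q_DSCF in (9); abs_llrs[i] holds |L_{0,i}| of the i-th information bit,
    in decoding order."""
    penalties = np.log1p(np.exp(-alpha * abs_llrs)) / alpha  # (1/alpha) ln(1 + e^{-alpha|L|})
    return abs_llrs + np.cumsum(penalties)                   # cumulative sum over i <= i_omega

abs_llrs = np.array([2.1, 0.4, 3.7, 1.2])             # toy |L_0| values
i_flip = int(np.argmin(q_dscf(abs_llrs, alpha=0.3)))  # selection rule (10)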

In this paper, all the presented decoders only target the first-order error bit. However, the presented bit-flipping selection schemes can be directly extended to cover higher-order error bits [8, 13].

III Deep-Learning-Aided Successive-Cancellation Decoding

In this section, a general bit-flipping algorithm for SCL decoding of polar codes is proposed, with the bit-flipping algorithm for SC decoding arising as the special case where the list size is $1$. Moreover, a new bit-flipping metric is derived that directly utilizes the correlations of the information bits in terms of the likelihood that an information bit is correctly decoded. A training framework is then introduced as the optimization scheme for the decoder's parameters, followed by an evaluation of the proposed scheme.

III-A A Bit-Flipping Algorithm for SCL decoding

Consider a failure in SCL decoding with list size $M$ to be an SCL decoding attempt in which all $M$ decoding paths fail the CRC verification. Let $\bm{\hat{u}}[m]$, $0\leq m<M$, be the $m$-th candidate path after the first SCL decoding attempt, let $\bm{\hat{u}}[0]$ be the best path after the first SCL decoding attempt, i.e., the path with the smallest path metric [14], and let $i^{*}_{\omega}$ be the estimated first erroneous bit of $\bm{\hat{u}}[0]$. In the proposed scheme, a secondary SCL decoding attempt is performed by keeping only $\bm{\hat{u}}[0]$ and fixing all the information bits up to the $i^{*}_{\omega}$-th bit: all the estimated information bits before the $i^{*}_{\omega}$-th bit are believed to be correct, and the $i^{*}_{\omega}$-th bit is flipped to correct the first error bit of $\bm{\hat{u}}[0]$.

The information bits for the second SCL decoding attempt up to the $i^{*}_{\omega}$-th bit are fixed as

\hat{u}[m]_{i} = \begin{cases} \hat{u}[0]_{i} & \text{if } i \in \mathcal{A}, i < i^{*}_{\omega}, \\ 1-\hat{u}[0]_{i} & \text{if } i \in \mathcal{A}, i = i^{*}_{\omega}, \end{cases} (11)

for $0\leq m<M$. After the $i^{*}_{\omega}$-th information bit, the conventional SCL decoding procedure is performed by estimating each information bit $i>i^{*}_{\omega}$, $i\in\mathcal{A}$, as both $0$ and $1$ and by keeping the best $M$ paths at each decoding step. The path metrics of all the decoding paths are then given as [14]

\operatorname{PM}[m]_{i} = \operatorname{PM}[m]_{i-1} + \Delta, (12)

where $0\leq i<N$, $\operatorname{PM}[m]_{-1}=0$, and $\Delta\geq 0$ is the path metric penalty at the $i$-th bit, calculated as

\Delta = \begin{cases} \frac{|L[m]_{0,i}|\left(1-\operatorname{sgn}(L[m]_{0,i})\right)}{2} & \text{if } i \in \mathcal{A}^{c}, \\ \frac{|L[m]_{0,i}|\left(1-(1-2\hat{u}[m]_{i})\operatorname{sgn}(L[m]_{0,i})\right)}{2} & \text{otherwise,} \end{cases} (13)

where $L[m]_{0,i}$ is the LLR value of the $i$-th bit at the $m$-th path.
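A minimal Python sketch of the bit forcing in (11) and the path-metric recursion in (12)-(13) follows; the function names and toy values are illustrative assumptions.

import numpy as np

def pm_penalty(llr, u_hat, is_frozen):
    """Delta in (13) for the i-th bit of one path (frozen bits decode to 0)."""
    if is_frozen:
        return abs(llr) * (1 - np.sign(llr)) / 2
    return abs(llr) * (1 - (1 - 2 * u_hat) * np.sign(llr)) / 2

def force_prefix(u_best, i_flip):
    """(11): reuse the best path's decisions before i_flip and flip bit i_flip;
    the same forced prefix seeds all M paths of the second attempt."""
    u = np.array(u_best[: i_flip + 1])
    u[i_flip] ^= 1
    return u

# Path-metric recursion (12): PM[m]_i = PM[m]_{i-1} + Delta, with PM[m]_{-1} = 0.
pm = 0.0
for llr, u_hat, frozen in [(1.5, 0, False), (-0.7, 0, False), (2.0, 0, True)]:  # toy bits
    pm += pm_penalty(llr, u_hat, frozen)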

Note that the bit-flipping metric of DSCF can be used to estimate $i^{*}_{\omega}$. However, this approach requires costly logarithmic and exponential functions, hence it is not attractive for an efficient hardware implementation. In the next subsection, a novel bit-flipping metric is proposed that only requires multiplication and addition operations.

III-B The Proposed Bit-Flipping Metric

Unlike the DSCF decoder, which relies on the estimation of the probability $p^{*}_{i}$, $\forall i\in\mathcal{A}$, for the bit-flipping metric computation, a method is proposed to directly estimate the following likelihood ratio:

l^{*}_{i_{\omega}} = \max\left\{\frac{\text{Pr}(\hat{u}[0]_{i_{\omega}}=0|\bm{y},\bm{u})}{\text{Pr}(\hat{u}[0]_{i_{\omega}}=1|\bm{y},\bm{u})}, \frac{\text{Pr}(\hat{u}[0]_{i_{\omega}}=1|\bm{y},\bm{u})}{\text{Pr}(\hat{u}[0]_{i_{\omega}}=0|\bm{y},\bm{u})}\right\}. (14)

The value $l^{*}_{i_{\omega}}$ indicates how likely it is that the estimated message bit $\hat{u}[0]_{i_{\omega}}$ is correctly decoded, given the received signal $\bm{y}$ and the message word $\bm{u}$. The bit index $i^{*}_{\omega}$ that is most likely to be the first-order erroneous bit is then obtained as

i^{*}_{\omega} = \operatorname*{arg\,min}_{\forall i_{\omega} \in \mathcal{A}} l^{*}_{i_{\omega}}. (15)

Similar to $p^{*}_{i}$, the value of $l^{*}_{i_{\omega}}$ is not available during the decoding process, as $\bm{u}$ remains unknown to the decoder. Therefore, the following hypothesis is proposed for the estimation of $l^{*}_{i_{\omega}}$:

l^{*}_{i_{\omega}} \approx \prod_{\forall i \in \mathcal{A}} l_{i}^{\beta_{i_{\omega},i}}, (16)

where

l_{i} = \max\left\{\frac{\text{Pr}(\hat{u}_{i}=0|\bm{y},\bm{\hat{u}}^{i-1}_{0})}{\text{Pr}(\hat{u}_{i}=1|\bm{y},\bm{\hat{u}}^{i-1}_{0})}, \frac{\text{Pr}(\hat{u}_{i}=1|\bm{y},\bm{\hat{u}}^{i-1}_{0})}{\text{Pr}(\hat{u}_{i}=0|\bm{y},\bm{\hat{u}}^{i-1}_{0})}\right\} = \exp{|L_{0,i}|}, (17)

and $\beta_{i_{\omega},i}\in\mathbb{R}$ are perturbation parameters such that $\beta_{i_{\omega},i}=\beta_{i,i_{\omega}}$ and $\beta_{i_{\omega},i_{\omega}}=1$, for $i\in\mathcal{A}$ and $i_{\omega}\in\mathcal{A}$.

To enable numerically stable computations, the bit-flipping metric of the proposed decoder can be obtained by transforming the likelihood ratio $l^{*}_{i_{\omega}}$ to the LLR domain as

Q_{\text{DL-SCL}}(i_{\omega}) = \ln(l^{*}_{i_{\omega}}) \approx \ln\left(\prod_{\forall i \in \mathcal{A}} \exp\left(\beta_{i_{\omega},i}|L_{0,i}|\right)\right) = \sum_{\forall i \in \mathcal{A}} \beta_{i_{\omega},i}|L_{0,i}|. (18)

The most probable bit-flipping index $i^{*}_{\omega}$ can then be selected as

i^{*}_{\omega} = \operatorname*{arg\,min}_{\forall i_{\omega} \in \mathcal{A}} Q_{\text{DL-SCL}}(i_{\omega}). (19)

Note that the bit-flipping metric computation in (18) can be represented in the matrix form as

\bm{Q}_{\text{DL-SCL}} = |\bm{L}_{0}| \cdot \bm{\beta}, (20)

where $\bm{Q}_{\text{DL-SCL}}$ and $\bm{L}_{0}$ are row vectors of size $1\times(K+c)$, and $\bm{\beta}$ is a matrix of size $(K+c)\times(K+c)$. Equivalently, $i^{*}_{\omega}$ is the index of the element in $\bm{Q}_{\text{DL-SCL}}$ that has the smallest value.

With $\beta_{i_{\omega},i_{\omega}}=1$ and $\beta_{i_{\omega},i}=\beta_{i,i_{\omega}}$, $0\leq i,i_{\omega}<K+c$, $\bm{\beta}$ can be seen as a correlation matrix that captures the inherent relations among the absolute LLR values of all the information bits under SCL decoding. For simplicity, since only the LLR values of the information bits are considered, all bit indices used in the rest of this paper refer to information bit indices and therefore take values in the range $[0,K+c-1]$.
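In implementation terms, (19)-(20) amount to one vector-matrix product followed by an argmin. A toy Python sketch follows; the $\bm{\beta}$ values below are placeholders, not a trained matrix.

import numpy as np

K_c = 4                                    # K + c (toy size; 88 for the code evaluated later)
beta = 0.05 * np.ones((K_c, K_c))          # placeholder symmetric correlations
np.fill_diagonal(beta, 1.0)                # beta_{i,i} = 1 by construction
abs_L0 = np.array([2.1, 0.4, 3.7, 1.2])    # |L_0| of the information bits
Q = abs_L0 @ beta                          # (20): multiplications and additions only
i_flip = int(np.argmin(Q))                 # (19): most likely first-error position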

III-C Parameter Optimization

Note that $\beta_{i_{\omega},i_{\omega}}$ is fixed to $1$ and is not trainable for all $0\leq i_{\omega}<K+c$. The other elements of $\bm{\beta}$ are trainable, subject to the condition $\beta_{i_{\omega},i}=\beta_{i,i_{\omega}}$, $0\leq i,i_{\omega}<K+c$. In the proposed DL-SCL decoding, the number of trainable parameters of $\bm{\beta}$ is $\frac{(K+c)(K+c-1)}{2}$, which is too large to efficiently apply heuristic methods such as Monte Carlo simulation for parameter optimization. Therefore, the optimization of $\bm{\beta}$ is considered as a learning problem, and DL techniques are exploited to optimize $\bm{\beta}$. The bit-flipping metric $\bm{Q}_{\text{DL-SCL}}$ of the proposed decoder does not depend on the values of the message word $\bm{u}$; thus, all-zero codewords can be used during the training phase. This symmetry property is particularly useful for DL-based decoders of linear block codes, as it simplifies the training process [13, 12, 15].

Let $\bm{\hat{T}}$ be the estimated bit-flipping vector of the information bits, with $-1$ indicating a bit-flip and $+1$ indicating no bit-flip. From (19), the elements of the vector $\bm{\hat{T}}$ are defined as

\hat{T}_{i} = \begin{cases} -1 & \text{if } i = i^{*}_{\omega}, \\ +1 & \text{if } i \neq i^{*}_{\omega}, \end{cases} (21)

for $0\leq i<K+c$. In this paper, stochastic-gradient-descent (SGD) based techniques are used to update the values of $\bm{\beta}$ during training; thus, the computation of $\bm{\hat{T}}$ is modified to enable back-propagation during training [13]. Otherwise, learning is not feasible, as the derivative of (21) with respect to $\bm{Q}_{\text{DL-SCL}}[i]$, i.e., the $i$-th element of $\bm{Q}_{\text{DL-SCL}}$, is always $0$.

Let the soft estimation of $\hat{T}_{i}$ be

\tilde{T}_{i} = \tanh(\bm{Q}_{\text{DL-SCL}}[i] - \tau), (22)

where $\tau=\frac{\tau_{0}+\tau_{1}}{2}$, and $\tau_{0}$ and $\tau_{1}$ are the smallest and the second-smallest values of $\bm{Q}_{\text{DL-SCL}}$, respectively. The objective loss function is then defined as

\text{Loss} = \frac{1}{K+c}\sum_{i=0}^{K+c-1}\mathcal{L}\left(\frac{1-\tilde{T}_{i}}{2},\frac{1-T_{i}}{2}\right) + \lambda\sum_{i_{\omega}=0}^{K+c-1}\;\sum_{i=i_{\omega}+1}^{K+c-1}(\beta_{i_{\omega},i})^{2}, (23)

where $T_{i}$ is the ground-truth counterpart of $\hat{T}_{i}$, known during training, $\mathcal{L}(a,b)=-b\log(a)-(1-b)\log(1-a)$ is the binary cross-entropy function, and $\lambda$ is the scaling factor of the L2 regularization [16].

In this paper, PyTorch [17] is used as the DL framework. Training is done using the RMSprop optimizer [18] with a mini-batch size of $128$ and a learning rate of $10^{-4}$. The training set consists of $2^{18}$ samples of received channel signals that are not correctly decoded after the first SCL decoding attempt; the data is collected at $E_{b}/N_{0}=5$ dB. The L2 regularization hyperparameter $\lambda$ is set to $0.25$. The initial values of the non-diagonal elements of $\bm{\beta}$ are drawn i.i.d. from the range $[-0.2,0.2]$ before training takes place. The matrix $\bm{\beta}$ is trained for list sizes $M\in\{1,2,4,8\}$.¹

¹Optimized $\bm{\beta}$ matrices are available at https://github.com/nghiadt05/DLSCL-CorMatrices.
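A possible PyTorch sketch of one training step, combining (20)-(23) with the configuration above, is given below. It assumes the strict upper triangle of $\bm{\beta}$ holds the trainable parameters, mirrored for symmetry with a fixed unit diagonal; the toy mini-batch stands in for collected failure samples, and the clamping is only a numerical guard.

import torch

K_c = 88                                          # K + c = 64 + 24 for P(128, 64)
iu = torch.triu_indices(K_c, K_c, offset=1)       # strict upper triangle: trainable part
theta = torch.nn.Parameter(0.4 * torch.rand(iu.shape[1]) - 0.2)  # init in [-0.2, 0.2)
opt = torch.optim.RMSprop([theta], lr=1e-4)

def build_beta(theta):
    """Symmetric beta with unit diagonal, parametrized by its strict upper triangle."""
    beta = torch.zeros(K_c, K_c)
    beta[iu[0], iu[1]] = theta
    return beta + beta.T + torch.eye(K_c)

# One step on a toy mini-batch (real data: |L_0| of failed first SCL attempts).
abs_L0 = 6 * torch.rand(128, K_c)                 # toy |L_0| values, mini-batch size 128
T = torch.ones(128, K_c)                          # ground-truth flip labels in {-1, +1}
T[torch.arange(128), torch.randint(0, K_c, (128,))] = -1.0  # one flipped bit per sample

Q = abs_L0 @ build_beta(theta)                    # (20)
tau = Q.topk(2, dim=1, largest=False).values.mean(dim=1, keepdim=True)  # (tau0 + tau1) / 2
T_soft = torch.tanh(Q - tau)                      # (22)
p = ((1 - T_soft) / 2).clamp(1e-6, 1 - 1e-6)      # numerical guard for the cross-entropy
loss = torch.nn.functional.binary_cross_entropy(p, (1 - T) / 2) \
       + 0.25 * theta.pow(2).sum()                # (23) with lambda = 0.25
opt.zero_grad(); loss.backward(); opt.step()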

III-D Evaluation

Figure 2: FER comparison of various decoders for $\mathcal{P}(128,64)$ concatenated with a $24$-bit CRC (FER versus $E_{b}/N_{0}$ in dB).

In this section, the performance of the proposed DL-SCL decoder in terms of frame-error rate (FER) and computational complexity is examined. The polar code $\mathcal{P}(128,64)$ concatenated with a $24$-bit CRC is considered. The selected polar code and CRC polynomial are those used in the eMBB control channel of the 5G standard [2].

Fig. 2 compares the FER performance of various decoders for $\mathcal{P}(128,64)$. In this figure, DL-SCL$M$ denotes the proposed DL-SCL decoding algorithm with list size $M\in\{1,2,4,8\}$, and the bit-flipping SCL decoder with the bit-flipping metric proposed in [8] is denoted as SCLF$M$. In addition, the original SCL decoding in [14] is also considered for comparison. For all the bit-flipping SCL decoders, $8$ additional decoding attempts are allowed for the secondary SCL decoding. As observed from Fig. 2, the proposed bit-flipping metric in the proposed DL-SCL decoders results in almost no FER performance loss compared to that of the SCLF decoders.

Fig. 3 visualizes the values of the elements of the matrix $\bm{\beta}-\bm{I}$ in the form of a heat map², where $\bm{I}$ is the identity matrix of the same size as $\bm{\beta}$. It can be seen that $\bm{\beta}-\bm{I}$ (and thus $\bm{\beta}$) is a sparse matrix, with many of its elements having values close to $0$. This observation is exploited to reduce the computational complexity of computing the proposed bit-flipping metric, which in turn reduces the computational complexity of the proposed DL-SCL decoding algorithm.

²The matrix $\bm{\beta}-\bm{I}$ is shown to exclude the diagonal elements of $\bm{\beta}$, which all have a value of $1$.

Figure 3: Visualization of $\bm{\beta}-\bm{I}$ for the DL-SCL8 decoder.

Table I reports the computational complexity of the proposed bit-flipping metric of the DL-SCL decoder in comparison with that of the SCLF decoder, in terms of the number of different operations performed. Other than the bit-flipping metric, the proposed DL-SCL decoder and the SCLF decoder are identical. In this table, all the elements of $\bm{\beta}$ whose values fall in the range $[-10^{-4},10^{-4}]$ are set to $0$, removing the need to perform additions or multiplications over those elements without tangibly degrading the error-correction performance (a small sketch of this pruning step follows Table I). It can be seen that the bit-flipping metric computation in the proposed DL-SCL decoders requires up to $66\%$ fewer multiplications and up to $36\%$ fewer additions in comparison with that of the SCLF decoder. Moreover, unlike the SCLF decoder, the proposed bit-flipping metric in the DL-SCL decoders does not require the computation of any transcendental functions.

TABLE I: Computational complexity of the bit-flipping metric for $\mathcal{P}(128,64)$ in terms of the number of operations performed

Decoder     $\times$    $+$     $\ln$/$\exp$
SCLF$M$       7832      4004       7832
DL-SCL1       2652      2564          0
DL-SCL2       3116      3028          0
DL-SCL4       3238      3150          0
DL-SCL8       3176      3088          0
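The pruning used to obtain Table I can be sketched as follows; the random matrix stands in for a trained $\bm{\beta}$, so the resulting counts are illustrative rather than the reported ones.

import numpy as np

rng = np.random.default_rng(1)
beta = rng.normal(scale=0.05, size=(88, 88))  # stand-in for a trained beta (K + c = 88)
beta = (beta + beta.T) / 2                    # enforce symmetry
np.fill_diagonal(beta, 1.0)                   # unit diagonal
beta[np.abs(beta) <= 1e-4] = 0.0              # zero out entries in [-1e-4, 1e-4]

mults = int(np.count_nonzero(beta))           # one multiplication per surviving entry
adds = sum(max(int(np.count_nonzero(beta[:, j])) - 1, 0) for j in range(beta.shape[1]))
# 'adds' counts the additions that accumulate each column of |L_0| . beta in (20)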

IV Conclusion

In this paper, a deep-learning-aided successive-cancellation list (DL-SCL) decoding algorithm for polar codes is introduced. The proposed decoder improves the performance of successive-cancellation list (SCL) decoding by running additional SCL decoding attempts using a novel bit-flipping scheme. The bit-flipping metric of the proposed decoder is obtained by exploiting the inherent relations between the information bits. These relations are expressed in the form of a trainable correlation matrix, which is optimized using deep-learning (DL) techniques. Performance results on a polar code of length $128$ and rate $1/2$ show that the proposed bit-flipping metric in the proposed DL-SCL decoder requires up to $66\%$ fewer multiplications and up to $36\%$ fewer additions in comparison with the state of the art, while providing almost the same error-correction performance.

Acknowledgment

S. A. Hashemi is supported by a Postdoctoral Fellowship from the Natural Sciences and Engineering Research Council of Canada (NSERC).

References

  • [1] E. Arıkan, “Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels,” IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3051–3073, July 2009.
  • [2] 3GPP, “Multiplexing and channel coding (Release 10) 3GPP TS 21.101 v10.4.0.” Oct. 2018. [Online]. Available: http://www.3gpp.org/ftp/Specs/2018-09/Rel-10/21_series/21101-a40.zip
  • [3] I. Tal and A. Vardy, “List decoding of polar codes,” IEEE Trans. Inf. Theory, vol. 61, no. 5, pp. 2213–2226, March 2015.
  • [4] O. Afisiadis, A. Balatsoukas-Stimming, and A. Burg, “A low-complexity improved successive cancellation decoder for polar codes,” in 48th Asilomar Conf. on Sig., Sys. and Comp., Nov 2014, pp. 2116–2120.
  • [5] C. Condo, F. Ercan, and W. J. Gross, “Improved successive cancellation flip decoding of polar codes based on error distribution,” in IEEE Wireless Commun. and Net. Conf. Workshops, April 2018, pp. 19–24.
  • [6] F. Ercan, C. Condo, S. A. Hashemi, and W. J. Gross, “Partitioned successive-cancellation flip decoding of polar codes,” arXiv e-prints, p. arXiv:1711.11093v4, Nov 2017. [Online]. Available: https://arxiv.org/abs/1711.11093
  • [7] F. Ercan, C. Condo, and W. J. Gross, “Improved bit-flipping algorithm for successive cancellation decoding of polar codes,” IEEE Trans. on Commun., vol. 67, no. 1, pp. 61–72, Jan 2019.
  • [8] L. Chandesris, V. Savin, and D. Declercq, “Dynamic-SCFlip decoding of polar codes,” IEEE Trans. Commun., vol. 66, no. 6, pp. 2333–2345, June 2018.
  • [9] S. Cammerer, T. Gruber, J. Hoydis, and S. ten Brink, “Scaling deep learning-based decoding of polar codes via partitioning,” in IEEE Global Commun. Conf., December 2017, pp. 1–6.
  • [10] W. Xu, Z. Wu, Y.-L. Ueng, X. You, and C. Zhang, “Improved polar decoder based on deep learning,” in IEEE Int. Workshop on Signal Process. Syst., November 2017, pp. 1–6.
  • [11] N. Doan, S. A. Hashemi, and W. J. Gross, “Neural successive cancellation decoding of polar codes,” in 2018 IEEE 19th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), June 2018, pp. 1–5.
  • [12] N. Doan, S. A. Hashemi, E. N. Mambou, T. Tonnellier, and W. J. Gross, “Neural belief propagation decoding of CRC-polar concatenated codes,” in IEEE Int. Conf. on Commun., May 2019, pp. 1–6.
  • [13] N. Doan, S. A. Hashemi, F. Ercan, T. Tonnellier, and W. J. Gross, “Neural dynamic successive cancellation flip decoding of polar codes,” ArXiv, vol. abs/1907.11563, 2019. [Online]. Available: https://arxiv.org/abs/1907.11563
  • [14] A. Balatsoukas-Stimming, M. B. Parizi, and A. Burg, “LLR-based successive cancellation list decoding of polar codes,” IEEE Trans. Signal Process., vol. 63, no. 19, pp. 5165–5179, Oct. 2015.
  • [15] E. Nachmani, E. Marciano, L. Lugosch, W. J. Gross, D. Burshtein, and Y. Be’ery, “Deep learning methods for improved decoding of linear codes,” IEEE J. of Sel. Topics in Signal Process., vol. 12, no. 1, pp. 119–131, February 2018.
  • [16] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, p. 436, May 2015.
  • [17] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer, “Automatic differentiation in pytorch,” 2017.
  • [18] G. Hinton, N. Srivastava, and K. Swersky, “Neural networks for machine learning lecture 6a overview of mini-batch gradient descent.” [Online]. Available: https://cs.toronto.edu/csc321/slides/lecture_slides_lec6.pdf