
An Exploration of the
Heterogeneous Unsourced MAC

†A. Hao, †S. Rini, §V. K. Amalladinne, §A. K. Pradhan, §J.-F. Chamberland
†Electrical and Computer Engineering, National Chiao Tung University
§Electrical and Computer Engineering, Texas A&M University
This material is based upon work supported, in part, by the National Science Foundation (NSF) under Grant CCF-1619085 and by Qualcomm Technologies, Inc., through their University Relations Program.
Abstract

The unsourced MAC model was originally introduced to study the communication scenario in which a number of low-complexity, low-energy devices wish to upload their respective messages to a base station. In the original problem formulation, all devices communicate using the same information rate. This can be very inefficient in wireless situations where channel conditions, power budgets, and payload requirements vary across devices. This paper extends the original problem setting to allow for such variability. More specifically, we consider the scenario in which devices are clustered into two classes, possibly with different SNR levels or distinct payload requirements. In the cluster with higher power, devices transmit using a two-layer superposition modulation. In the cluster with lower energy, users transmit with the same base constellation as the high-power cluster. Within each layer, devices employ the same codebook. At the receiver, signal groupings are recovered using Approximate Message Passing (AMP), proceeding from the high to the low power levels using successive interference cancellation (SIC). This layered architecture is implemented using Coded Compressed Sensing (CCS) within every grouping. An outer tree code is employed to stitch fragments together across time and layers, as needed. This pragmatic approach to heterogeneous CCS is validated numerically, and design guidelines are identified.

Index Terms:
Unsourced random access, Coded compressed sensing, Approximate message passing, Superposition constellation.

I Introduction

The IoT paradigm of myriad unattended devices connected wirelessly to the Internet may pose a significant disruption to existing communication networks. The predicted number of such devices, orders of magnitude greater than human subscribers, and the usage profile of these devices, sporadic and fleeting, invalidate the type of connection-based architectures that form a foundation for existing deployments. Thus, new means of Internet access must be explored to reflect this change, with provisions for random access. Along these lines, one model attuned to this reality that has gained attention in recent years is unsourced random access (URA). The URA formulation, originally proposed by Polyanskiy [polyanskiy2017perspective], centers on concurrent up-link data transfers. There is a strong connection between URA and Compressed Sensing (CS), with the former problem being an instance of a noisy support recovery task [choi2017compressed, reeves2012sampling, gilbert2017all]. More precisely, in URA, the receiver seeks to identify the set of messages being transmitted by active devices, without regard for the identities of their sources. The identity of a source can be embedded in the message payload, if needed. The value of this approach lies in the fact that the access point does not need to determine the set of active devices at the onset of a frame, a step that can rapidly become overwhelming for connection-less settings with a very large population of candidate transmitters. URA raises both theoretical and practical challenges. Achievable bounds rooted in finite-block-length analysis for such systems can be found in [polyanskiy2017perspective]. These bounds are obtained without regard for complexity constraints, as they rely on joint maximum likelihood decoding.

Several pragmatic, low-complexity approaches for this problem have been proposed [ordentlich2017low, vem2019user, amalladinne2019coded, calderbank2018chirrup, fengler2019sparcs, pradhan2019sparseidma, marshakov2019polar, AKPolar]. Conceptually, each of these contributions offers a means to circumvent the difficulty associated with the dimensionality of the problem. Indeed, when viewed as a support recovery task, unsourced random access features a $K$-sparse state vector of length $2^{100}$ or longer. This reality prevents the straightforward application of standard CS solvers. To address this issue, many algorithms leverage lessons from random access and coding theory to design structured sensing matrices suitable for the efficient recovery of the sent messages. A line of research that has attracted attention in this context is the framework of coded compressed sensing (CCS), originally proposed by Amalladinne et al. [amalladinne2018couple, amalladinne2019coded]. This scheme is a divide-and-conquer approach in which a large CS problem is partitioned into smaller components, each of which can be solved using standard CS algorithms. The output of this step produces lists of message fragments, one list for every CS instance. The transmitted messages are recovered by stitching fragments together using an outer code. Overall, the approach can be abstracted as a concatenated coding scheme where an inner code is tasked with fragment recovery and the outer code is responsible for message disambiguation.

CCS has been ported, enhanced, and extended by multiple authors. It appears as a component of the ultra-low complexity CHIRRUP algorithm [calderbank2018chirrup], and it can be incorporated into activity detection in multi-antenna systems [fengler2019massive]. CCS can be employed to build neighbor discovery schemes and to handle signal asynchrony [zhang2013neighbor, thompson2018compressed, amalladinne2019asynchronous]. An enhanced version of the algorithm takes advantage of the fact that output from the early stages of CCS can be integrated into later stages as side information to improve execution [amalladinne2020enhanced]. This variant has inspired significant extensions related to sparse regression codes and Approximate Message Passing (AMP) [fengler2019sparcs, amalladinne2020unsourced].

Along similar lines of research, the main contributions of our article can be summarized as follows.

  • In Sec. II, we introduce a novel system model for unsourced random access. This new model captures the fact that, in practice, wireless IoT devices may have distinct payload requirements. Heterogeneity is addressed by introducing the notion of clustering, whereby users within a cluster share the same power budget and transmit at the same rate. We refer to this model as HetURA.

  • A pragmatic communication scheme for this setting is developed in Sec. III. The proposed algorithm borrows ideas from CCS [amalladinne2018couple, amalladinne2019coded], but also introduces a phased decoding approach akin to successive interference cancellation across layers. Portions of clusters with greater energy budgets are decoded first. This structure is enabled by a two-level superposition constellation.

The value of the proposed framework is examined in Sec. IV, where performance results showcase the validity of the approach. Finally, Sec. V concludes the paper.

II Heterogeneous URA

Figure 1: This notional diagram offers a synopsis of the proposed communication scheme, as described in Sec. III.

Our goal is to introduce and study a heterogeneous version of URA with groupings, where distinct groups have different power levels and data requirements. We refer to this model as heterogeneous URA (HetURA). HetURA is formally defined as the up-link scenario where the user population is divided into $K$ clusters, with cluster $k$ containing the set of devices $\mathcal{S}_{k}$. Of these devices, only a subset $\mathcal{A}_{k}\subset\mathcal{S}_{k}$ of size $|\mathcal{A}_{k}|=M_{k}$ is active, with users therein wishing to communicate with the base station. The output at the receiver is then equal to

\mathbf{y}=\sum_{k\in[K]}\sum_{i\in\mathcal{A}_{k}}\mathbf{x}_{ki}+\mathbf{z}, \qquad (1)

where $\mathbf{x}_{ki}\in\mathbb{R}^{N}$ is the channel input of the $i^{\mathrm{th}}$ user in the $k^{\mathrm{th}}$ cluster, and $N$ is the block-length. Note that $[K]\triangleq\{1,\ldots,K\}$ in (1). Each input sequence is subject to an expected power constraint $\|\mathbf{x}_{ki}\|_{2}^{2}\leq NP_{k}$, with clusters ordered so that $P_{k}\leq P_{k+1}$. The components of the additive noise $\mathbf{z}$ are independent, each with a standard normal distribution.
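As a minimal numerical sketch of the channel model in (1), the snippet below superimposes the signals of active users from two power-ordered clusters in additive white Gaussian noise. All parameter values are illustrative toy choices, not the ones used later in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 64                 # block-length (toy value)
M = [3, 2]             # number of active users per cluster, M_1 and M_2
P = [1.0, 4.0]         # cluster power budgets, ordered so P_1 <= P_2

# Each active user sends an N-sample +/- sqrt(P_k) signal, which meets the
# power constraint ||x||^2 <= N * P_k with equality.
signals = []
for m_k, p_k in zip(M, P):
    for _ in range(m_k):
        signals.append(rng.choice([-1.0, 1.0], size=N) * np.sqrt(p_k))

z = rng.standard_normal(N)        # unit-variance AWGN components
y = np.sum(signals, axis=0) + z   # superimposed received word, as in (1)
```

The receiver observes only the sum `y`; nothing in the waveform identifies which user contributed which term, which is the defining feature of the unsourced setting.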

A suitable transmission scheme for HetURA is defined as follows. Active user $i$ in cluster $k$ wishes to transmit message $\mathbf{w}_{ki}\in\left[\lfloor 2^{NR_{k}}\rfloor\right]$, where $R_{k}$ denotes the rate of cluster $k$. All the users within a cluster employ the same code and, hence, they share the same rate. This gives

\mathbf{x}_{ki}=f_{{\rm enc}\text{-}k}\left(\mathbf{w}_{ki}\right),\quad\forall\, i\in\mathcal{A}_{k}. \qquad (2)

Having observed $\mathbf{y}$, the receiver is tasked with decoding the list of messages transmitted by each cluster; that is,

{\cal W}_{k}=f_{{\rm dec}\text{-}k}(\mathbf{y}),\quad\forall\, k\in[K], \qquad (3)

with $|{\cal W}_{k}|=M_{k}$. Every entry on this list should take value in the set $\left[\lfloor 2^{NR_{k}}\rfloor\right]$. System performance is evaluated according to the per-user probability of error, defined as [polyanskiy2017perspective]

P_{\rm UE}=\max_{k\in[K]}\ \frac{1}{M_{k}}\sum_{i\in\mathcal{A}_{k}}\mathbb{P}\left[\mathbf{w}_{ki}\not\in{\cal W}_{k}\,|\,\mathbf{y}\right]. \qquad (4)

In words, this captures the (maximum) probability that a message sent by one of the devices is not recovered at the receiver. Note that, since all the users in a cluster use the same encoding function, as in (2), the receiver does not discover which user transmitted which message.

III A Coded Compressed Sensing Scheme

In this section, we describe an extension of the work found in [amalladinne2019coded] adapted to the HetURA scenario discussed in Sec. II. To put our contribution in context, we begin with a brief review of key CS notions. The original URA formulation can be viewed as sparse support recovery from the observation

\mathbf{y}=\boldsymbol{\Phi}\mathbf{m}+\mathbf{z}, \qquad (5)

where $\boldsymbol{\Phi}\in\mathbb{R}^{N\times 2^{\lfloor NR\rfloor}}$ is a dictionary of possible signals, $\mathbf{m}\in\{0,1\}^{2^{\lfloor NR\rfloor}}$ is a binary vector that indicates the indices of the transmitted codewords, and $\mathbf{z}$ is additive noise as in (1). We stress that $\mathbf{m}$ is a sparse vector, with $\|\mathbf{m}\|_{0}$ equal to the number of active URA devices.
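The support-recovery view in (5) can be illustrated at benign toy dimensions, where even plain correlation against the dictionary isolates the support. The sizes below are stand-ins chosen so that this works; the real URA dictionary width ($2^{100}$ columns or more) is precisely what rules out such direct approaches.

```python
import numpy as np

rng = np.random.default_rng(1)

N, W = 256, 512          # measurements and dictionary width (toy stand-ins)
K_active = 3             # number of active devices, i.e. ||m||_0

Phi = rng.choice([-1.0, 1.0], size=(N, W)) / np.sqrt(N)  # random dictionary
support = [17, 100, 411]                                 # sent codeword indices
m = np.zeros(W)
m[support] = 1.0

y = Phi @ m + 0.01 * rng.standard_normal(N)   # observation as in (5)

# With near-orthogonal random columns, matched filtering already separates
# the K_active true columns from the rest at these dimensions.
scores = Phi.T @ y
recovered = set(np.argsort(scores)[-K_active:].tolist())
```

A practical solver replaces the correlation step with NNLS, LASSO, or AMP, but the object being estimated, the sparse indicator vector `m`, is the same.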

As mentioned above, this article explores the extended scenario where the device population is partitioned into groups, and users from distinct clusters employ different codebooks. For ease of exposition, we restrict our treatment to the case where K=2K=2. When two groupings are present, the CS interpretation of URA becomes

\mathbf{y}=\boldsymbol{\Phi}_{1}\mathbf{m}_{1}+\boldsymbol{\Phi}_{2}\mathbf{m}_{2}+\mathbf{z}. \qquad (6)

In a manner analogous to the basic URA formulation, $\mathbf{m}_{1}$ denotes the collection of indices from the first cluster; and $\mathbf{m}_{2}$, the indices from the second cluster. As in Sec. II, we assume that the clusters are ordered by increasing transmit power, so that we may refer to the first/second cluster as the low/high-energy cluster, interchangeably. Other orderings are possible but, for simplicity, we do not discuss such alternatives in this paper. Recall that CCS was introduced as a means to tackle the dimensionality issue posed by the width of $\boldsymbol{\Phi}$. Quite obviously, when expanding the sensing matrix $\big[\boldsymbol{\Phi}_{1}\ \boldsymbol{\Phi}_{2}\big]$ to accommodate multiple groupings, a number of complexity issues arise. In particular, as with CCS, decoding through non-negative least squares (NNLS) or LASSO remains tractable only for limited problem dimensions in (6). To support longer transmission block-lengths, the transmitted bits are divided into fragments and sent in separate slots. Since the identity of the transmitter is not conveyed in the choice of encoding function, the individual fragments of the original messages must be pieced together through a low-complexity tree-based algorithm, as in [amalladinne2019coded].

To further reduce decoding complexity, the users in the high-energy cluster transmit their message bits using a superposition constellation with two layers: a top and a bottom layer. The symbols in the bottom layer are transmitted using the same codebook as the low-energy cluster, whereas the remaining bits are carried "on top" of the bottom bits through the superposition constellation. This coding choice allows the receiver, for each transmission slot, to first decode the top layer of the superposition constellation in the high-energy cluster; and then, after applying Successive Interference Cancellation (SIC), decode the low-energy users together with the bottom layer of the high-energy cluster. Upon decoding all the message fragments in all the slots and from all the users, the receiver can then employ the tree decoder to reconstruct the set of transmitted messages. We further detail the proposed transmission scheme below.

III-A Fragmenting

In the low-energy cluster, every message $\mathbf{w}_{1i}$ is converted into a binary vector and partitioned into $J$ sub-blocks, where the $j^{\mathrm{th}}$ sub-block consists of $B_{1j}$ bits, so that $\sum_{j\in[J]}B_{1j}=B_{1}=\lfloor NR_{1}\rfloor$. This results in a collection of information fragments $\{\mathbf{w}_{1ij}\}_{j\in[J]}$ for message $\mathbf{w}_{1i}$. On the other hand, users in the high-energy cluster first split their bits into two groups: one for the top layer and one for the bottom layer. Denote these two sets of bits by $\overline{\overline{\mathbf{w}}}_{2i}$ and $\overline{\mathbf{w}}_{2i}$ for the top and bottom portions, respectively. Likewise, let $\overline{\overline{B}}_{2}$, $\overline{B}_{2}$ be the total numbers of bits assigned to these two layers; and $\overline{\overline{R}}_{2}$, $\overline{R}_{2}$ be the corresponding rates. Subsequently, the bottom bits are fragmented exactly as in the low-energy cluster to form the set $\{\overline{\mathbf{w}}_{2ij}\}$. The top bits $\overline{\overline{\mathbf{w}}}_{2i}$ are also partitioned into $J$ fragments, but this time $\overline{\overline{B}}_{2j}$ is the size of the $j^{\mathrm{th}}$ fragment (not necessarily the same partitioning as in the bottom layer). Again, we must have $\sum_{j\in[J]}\overline{\overline{B}}_{2j}=\overline{\overline{B}}_{2}$ and, additionally, $\overline{B}_{2}+\overline{\overline{B}}_{2}=\lfloor NR_{2}\rfloor$.
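The fragmentation step above is a plain partition of a bit-string into consecutive sub-blocks of prescribed sizes; the sketch below shows it with toy sizes standing in for the $B_{1j}$ (the payload and sizes are illustrative, not from the paper).

```python
def fragment(bits, sizes):
    """Split a bit-string into consecutive sub-blocks of the given sizes;
    the sizes must sum to the total number of bits."""
    assert sum(sizes) == len(bits)
    out, pos = [], 0
    for b in sizes:
        out.append(bits[pos:pos + b])
        pos += b
    return out

# Toy payload: B = 12 information bits split into J = 3 fragments whose
# sizes play the role of B_{1j}.
frags = fragment("110010011101", [4, 4, 4])
```

The same routine applies to the top-layer bits of the high-energy cluster with a possibly different size vector, as long as the sizes sum to the layer's bit budget.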

III-B Tree Encoding

The role of the tree code is to enable the stitching of message fragments at the decoder. In CCS, this is accomplished by appending parity bits to the $j^{\mathrm{th}}$ fragment based on the preceding information bits. Every device in the low-energy cluster takes the message fragment $\mathbf{w}_{1i(j+1)}$ and encodes it into a vector $\mathbf{v}_{1i(j+1)}$ using a systematic random linear code,

\mathbf{v}_{1i(j+1)}=\left[\mathbf{w}_{1i(j+1)}\ ;\ \mathbf{G}_{1j}\otimes\left[\mathbf{w}_{1i1};\ldots;\mathbf{w}_{1ij}\right]\right], \qquad (7)

where $\mathbf{G}_{1j}$ produces $T-B_{1(j+1)}$ random parity bits from all the previous segments, and $\otimes$ indicates modulo-2 matrix multiplication. Thus, $\mathbf{v}_{1i(j+1)}$ is viewed as a binary vector. In this scheme, $B_{11}=T$; that is, the first fragment does not contain any parity bits. In (7), $T\geq B_{1j}$, so that effectively we have $T-B_{1j}$ random linear parity constraints embedded in this block to help stitch together fragments of information bits belonging to $\mathbf{w}_{1ij}$ when decoding codeword $\mathbf{v}_{1i}$.
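A minimal sketch of the systematic tree encoding in (7) follows: each fragment after the first is padded to length $T$ with random parity bits obtained as mod-2 combinations of all preceding information bits. The random matrices play the role of the $\mathbf{G}_{1j}$; fragment sizes are toy values.

```python
import numpy as np

rng = np.random.default_rng(2)

def tree_encode(fragments, T):
    """Pad each fragment after the first to length T with random parity
    bits, computed as mod-2 combinations (via a random G matrix) of all
    preceding information bits, mirroring the structure of eq. (7)."""
    blocks = [fragments[0]]              # first block carries info bits only
    for j in range(1, len(fragments)):
        prev = np.concatenate(fragments[:j])
        G = rng.integers(0, 2, size=(T - len(fragments[j]), len(prev)))
        parity = (G @ prev) % 2
        blocks.append(np.concatenate([fragments[j], parity]))
    return blocks

# Toy fragments with B_{11} = T = 4, as the scheme requires.
frags = [np.array([1, 0, 1, 1]), np.array([0, 1]), np.array([1, 1, 0])]
coded = tree_encode(frags, T=4)
```

The decoder can regenerate the same parity checks (the $\mathbf{G}_{1j}$ are shared) and use them to rule out invalid fragment combinations when stitching lists together.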

A user in the high-energy cluster performs a similar encoding process. Redundancy for the bottom bits is added paralleling the encoding in the low-energy cluster, yielding

\mathbf{\overline{v}}_{2i(j+1)}=\left[\mathbf{\overline{w}}_{2i(j+1)}\ ;\ \mathbf{G}_{1j}\otimes\left[\mathbf{\overline{w}}_{2i1};\ldots;\mathbf{\overline{w}}_{2ij}\right]\right], \qquad (8)

where 𝐆1j\mathbf{G}_{1j} is the same binary matrix that appears in (7). The top bits are encoded according to the information bits in each cluster, i.e.,

\mathbf{\overline{\overline{v}}}_{2i(j+1)}=\left[\mathbf{\overline{\overline{w}}}_{2i(j+1)}\ ;\ \mathbf{G}_{2j}\otimes\left[\mathbf{\overline{\overline{w}}}_{2i1};\ldots;\mathbf{\overline{\overline{w}}}_{2ij}\right]\right], \qquad (9)

where 𝐆2j\mathbf{G}_{2j} is, again, a random parity generating matrix.

III-C Superposition Coding & CS Encoding

After tree encoding is complete, each encoded block has size $T$. These blocks are then encoded using two sets of inner CS codes: (i) one for the segments of the low-energy cluster and the bottom segments of the high-energy cluster; and (ii) one for the top segments of the high-energy users. To apply the inner encoding, we convert the binary string $\mathbf{v}_{1ij}$ into the vector $\mathbf{m}_{1ij}\in\{0,1\}^{2^{T}}$ in which a single one is placed at the location corresponding to the integer value of $\mathbf{v}_{1ij}$. This is the emblematic index representation used in CCS. Blocks $\mathbf{\overline{v}}_{2ij}$ and $\mathbf{\overline{\overline{v}}}_{2ij}$ are converted to their index representations in a similar manner.
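The index representation amounts to a one-hot encoding of a length-$T$ bit-string into a vector of length $2^{T}$; a minimal sketch (with a toy $T$):

```python
import numpy as np

def index_representation(bits):
    """Map a length-T bit-string to the {0,1}^{2^T} vector containing a
    single one at the position given by the integer value of the bits."""
    m = np.zeros(2 ** len(bits))
    m[int(bits, 2)] = 1.0
    return m

m = index_representation("0101")   # T = 4; "0101" encodes the integer 5
```

When several users occupy the same slot, the receiver effectively observes the sum of their one-hot vectors, which is exactly the sparse vector the inner CS decoder must recover.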

The two CS codes differ as follows: the former has entries from $\{+\sqrt{P_{1}},-\sqrt{P_{1}}\}$, while the latter features entries from $\{+\sqrt{P_{2}-P_{1}},-\sqrt{P_{2}-P_{1}}\}$. This difference in support results in a superposition constellation; the top bits are effectively transmitted at a higher power than the bottom bits, based on our assumption $P_{2}>2P_{1}$. Accordingly, the CCS signal corresponding to section $\mathbf{v}_{1ij}$ is

\mathbf{x}_{1ij}=\mathbf{A}_{1}\mathbf{m}_{1ij}, \qquad (10)

where $\mathbf{A}_{1}\in\{+\sqrt{P_{1}},-\sqrt{P_{1}}\}^{Q\times 2^{T}}$ is a matrix formed by picking $Q=\lfloor NR/J\rfloor$ rows uniformly at random (excluding the row of all ones) from a Hadamard matrix of dimension $2^{T}\times 2^{T}$ and re-scaling them to meet the power constraint. Similarly, after tree encoding, the bottom bits are CS encoded,

\mathbf{\overline{x}}_{2ij}=\mathbf{A}_{1}\mathbf{\overline{m}}_{2ij} \qquad (11)

using the same sensing matrix. The top bits are processed using a different signal dictionary,

\mathbf{\overline{\overline{x}}}_{2ij}=\mathbf{A}_{2}\mathbf{\overline{\overline{m}}}_{2ij}. \qquad (12)

The rows of matrix $\mathbf{A}_{2}\in\{+\sqrt{P_{2}-P_{1}},-\sqrt{P_{2}-P_{1}}\}^{Q\times 2^{T}}$ are also scaled versions of randomly selected rows from a Hadamard matrix (excluding the row of all ones).
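The construction of $\mathbf{A}_{1}$ and $\mathbf{A}_{2}$ can be sketched with a Sylvester-type Hadamard matrix: draw $Q$ distinct rows (skipping the all-ones row) and scale the $\pm 1$ entries to the desired power level. Dimensions and power values below are toy choices.

```python
import numpy as np

rng = np.random.default_rng(3)

def hadamard(t):
    """Sylvester construction of a 2^t x 2^t Hadamard matrix."""
    H = np.array([[1.0]])
    for _ in range(t):
        H = np.block([[H, H], [H, -H]])
    return H

def sensing_matrix(T, Q, power, rng):
    """Draw Q distinct rows (excluding the all-ones row 0) from a 2^T
    Hadamard matrix and scale entries to +/- sqrt(power)."""
    H = hadamard(T)
    rows = rng.choice(np.arange(1, 2 ** T), size=Q, replace=False)
    return np.sqrt(power) * H[rows, :]

A1 = sensing_matrix(T=6, Q=16, power=1.0, rng=rng)   # toy A_1 at power P_1
A2 = sensing_matrix(T=6, Q=16, power=3.0, rng=rng)   # toy A_2 at P_2 - P_1
```

Because every non-trivial Hadamard row is a $\pm 1$ sequence, each column of the resulting matrix has squared norm $Q\cdot\text{power}$, which is what makes the per-slot power accounting exact.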

Finally, the channel inputs are obtained by concatenating the partial signals. For the low-energy users, we get

\mathbf{x}_{1i}=\left[\mathbf{x}_{1i1}\ \mathbf{x}_{1i2}\ \ldots\ \mathbf{x}_{1iJ}\right], \qquad (13)

and, for the high energy users, we have

\mathbf{\overline{x}}_{2i}=\left[\mathbf{\overline{x}}_{2i1}\ \mathbf{\overline{x}}_{2i2}\ \ldots\ \mathbf{\overline{x}}_{2iJ}\right] \qquad (14a)
\mathbf{\overline{\overline{x}}}_{2i}=\left[\mathbf{\overline{\overline{x}}}_{2i1}\ \mathbf{\overline{\overline{x}}}_{2i2}\ \ldots\ \mathbf{\overline{\overline{x}}}_{2iJ}\right] \qquad (14b)
\mathbf{x}_{2i}=\mathbf{\overline{x}}_{2i}+\mathbf{\overline{\overline{x}}}_{2i}. \qquad (14c)
Over the ensemble of random coding parameters, we obtain expected transmit power

\left\|\mathbf{x}_{2i}\right\|^{2}=\left\|\mathbf{\overline{x}}_{2i}\right\|^{2}+\left\|\mathbf{\overline{\overline{x}}}_{2i}\right\|^{2}=NP_{1}+N(P_{2}-P_{1})=NP_{2},

as mandated by the constraint associated with (1). The parameters of the scheme are summarized in Table I.

III-D Channel Transmission/Reception

As the transmitted channel inputs are composed of coded segments of the same length, we observe that the CCS construction naturally maps to the CS interpretation in (6). To see this, let $\mathbf{m}_{1j}$ be the sum of coded fragments $\mathbf{m}_{1ij}$ for $i\in\mathcal{A}_{1}$; then let $\mathbf{m}_{1}$ be the concatenation of the fragments $\mathbf{m}_{1j}$, as in (13). Define $\mathbf{\overline{m}}_{2j}$, $\mathbf{\overline{m}}_{2}$ and $\mathbf{\overline{\overline{m}}}_{2j}$, $\mathbf{\overline{\overline{m}}}_{2}$ in a similar manner. Let $\boldsymbol{\Phi}_{1}$ be the tensor product between $\mathbf{I}_{J}$ and $\mathbf{A}_{1}$; likewise, let $\boldsymbol{\Phi}_{2}$ be that between $\mathbf{I}_{J}$ and $\mathbf{A}_{2}$. Then, we can express the received signal as

\mathbf{y}=\boldsymbol{\Phi}_{1}\left(\mathbf{m}_{1}+\mathbf{\overline{m}}_{2}\right)+\boldsymbol{\Phi}_{2}\mathbf{\overline{\overline{m}}}_{2}+\mathbf{z}. \qquad (15)

The signal aggregate obtained by adding $\mathbf{m}_{1}$ and $\mathbf{\overline{m}}_{2}$ has a sparsity of $J(M_{1}+M_{2})$, whereas $\mathbf{\overline{\overline{m}}}_{2}$ is $JM_{2}$-sparse.
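The tensor-product structure of the sensing matrices can be made concrete with a Kronecker product: $\boldsymbol{\Phi}_{1}=\mathbf{I}_{J}\otimes\mathbf{A}_{1}$ is block-diagonal, applying the same per-slot dictionary to each of the $J$ concatenated sections. Dimensions are toy values.

```python
import numpy as np

rng = np.random.default_rng(6)

J, Q, W = 3, 8, 16
A1 = rng.choice([-1.0, 1.0], size=(Q, W))   # toy per-slot dictionary

# Phi_1 = I_J (x) A_1: the block-diagonal matrix that senses each of the
# J concatenated slot vectors with the same matrix A_1.
Phi1 = np.kron(np.eye(J), A1)
```

This structure is what lets the receiver process each slot independently with $\mathbf{A}_{1}$ rather than inverting one enormous matrix.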

III-E Two-Phase Decoding

Upon getting observation $\mathbf{y}$, the receiver separates it into sections $\{\mathbf{y}_{j}\}_{j\in[J]}$, each block corresponding to the summation of the sections $\mathbf{A}_{1}\mathbf{m}_{1j}$, $\mathbf{A}_{1}\mathbf{\overline{m}}_{2j}$, and $\mathbf{A}_{2}\mathbf{\overline{\overline{m}}}_{2j}$, as in (15). The receiver begins by decoding $\mathbf{\overline{\overline{m}}}_{2j}$ using the CCS algorithm with coding matrix $\mathbf{A}_{2}$. During this phase of the decoding process, it treats the remaining terms in $\mathbf{y}_{j}$ as additional noise. Parameters are selected to make sure that this portion of the decoding process is successful with high probability. Furthermore, for ease of exposition, we assume that $\mathbf{\overline{\overline{m}}}_{2j}$ is correctly recovered, although in reality an error at this stage will produce some interference at the subsequent stage.

Once $\mathbf{\overline{\overline{m}}}_{2j}$ is recovered, the decoder computes a residual, or effective observation, for every section,

\tilde{\mathbf{y}}_{j}=\mathbf{y}_{j}-\mathbf{A}_{2}\mathbf{\overline{\overline{m}}}_{2j}.

In the spirit of successive interference cancellation, these sections $\{\tilde{\mathbf{y}}_{j}\}_{j\in[J]}$ are then passed to the CCS algorithm with coding matrix $\mathbf{A}_{1}$ and sparsity level $M_{1}+M_{2}$.
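The two-phase structure reduces, per section, to one subtraction. The noiseless toy example below (dictionaries, indices, and the power ratio are all illustrative) shows that once the top-layer vector is known, cancelling its contribution leaves exactly the low-power component for the second decoding phase.

```python
import numpy as np

rng = np.random.default_rng(4)

Q, W = 16, 64
A1 = rng.choice([-1.0, 1.0], size=(Q, W))         # low-power dictionary
A2 = 2.0 * rng.choice([-1.0, 1.0], size=(Q, W))   # higher-power dictionary

m1 = np.zeros(W); m1[[3, 40]] = 1.0   # low-energy plus bottom-layer indices
m2 = np.zeros(W); m2[10] = 1.0        # top-layer index

y = A1 @ m1 + A2 @ m2                 # one noiseless section, for clarity

# Phase 1 decodes m2; phase 2 subtracts its contribution and decodes the rest.
y_tilde = y - A2 @ m2
```

With noise, the residual also carries the channel noise and any cancellation error, which is why the paper assumes the first phase succeeds with high probability.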

The recovery algorithm we adopt for individual sections is an AMP-based CS solver [maleki2010optimally]. Such composite algorithms iterate through two equations:

\mathbf{z}^{(t)}=\mathbf{y}-\mathbf{A}\mathbf{m}^{(t)}+\frac{\mathbf{z}^{(t-1)}}{Q}\operatorname{div}\boldsymbol{\eta}_{t-1}\left(\mathbf{A}^{\mathrm{T}}\mathbf{z}^{(t-1)}+\mathbf{m}^{(t-1)}\right) \qquad (16)
\mathbf{m}^{(t+1)}=\boldsymbol{\eta}_{t}\left(\mathbf{A}^{\mathrm{T}}\mathbf{z}^{(t)}+\mathbf{m}^{(t)}\right), \qquad (17)

with initial conditions $\mathbf{m}^{(0)}=\mathbf{0}$ and $\mathbf{z}^{(0)}=\mathbf{y}$. The function $\boldsymbol{\eta}_{t}(\cdot)$ in (17) is the denoiser, which can take the form of a posterior mean estimate [Giuseppe] or a standard soft-thresholding operator [daubechies2004iterative, beck2009fast]. Equation (16) can be interpreted as the computation of a residual enhanced with an Onsager correction term [bayati2011dynamics, donoho2013information].
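A minimal AMP sketch in the spirit of (16)-(17), using the soft-thresholding denoiser: for soft thresholding, the divergence term reduces to the number of entries surviving the threshold. The problem sizes, threshold, and iteration count are toy choices, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(5)

def soft(x, tau):
    """Soft-thresholding denoiser eta(x) = sign(x) * max(|x| - tau, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def amp(y, A, n_iter, tau):
    """AMP iterations following eqs. (16)-(17) with a fixed threshold; the
    Onsager term uses div(eta) = number of entries above the threshold."""
    Q, W = A.shape
    m, z = np.zeros(W), y.copy()
    for _ in range(n_iter):
        m_new = soft(A.T @ z + m, tau)                     # eq. (17)
        z = y - A @ m_new + (z / Q) * np.count_nonzero(m_new)  # eq. (16)
        m = m_new
    return m

Q, W = 64, 128
A = rng.standard_normal((Q, W)) / np.sqrt(Q)
m_true = np.zeros(W); m_true[[5, 77]] = 1.0
y = A @ m_true                        # noiseless toy observation
m_hat = amp(y, A, n_iter=20, tau=0.2)
```

In practice the threshold is tuned across iterations (e.g., via state evolution), and in CCS the denoiser can exploit the one-hot section structure; a fixed scalar threshold is used here only to keep the sketch short.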

After converting the terms $\mathbf{m}$ into $\mathbf{v}$, the tree decoder uses the random parity bits to stitch together the information bits from each of the users, thus reconstructing the transmitted messages $\mathbf{w}_{1}$ and $\mathbf{w}_{2}$.¹ (¹In actuality, one also needs some random parity bits to stitch together the top and the bottom bits from the high-energy users. For simplicity, this coding step is not presented here, as it is analogous to the step in Sec. III-B.)

The encoding and decoding process is also conceptually represented in Fig. 1. Successive steps in the scheme are represented in vertical sections, proceeding from left to right. Separate horizontal sections represent the processing of three sets of bits: the bits from cluster 1, the bottom bits from cluster 2, and the top bits from cluster 2. In the figure, "be2i" indicates the conversion from a binary string of length $l$ to the index representation in $\{0,1\}^{2^{l}}$, while "concat" indicates the concatenation of the segments, as in (13).

Quantity | Quantity
block-length $N$ | # fragments $J$
# clusters $K$ | len. low-energy fragments $B_{1j}$
# active users $M_{k}$ | len. high-energy bottom fragments $\overline{B}_{2j}$
power constraint $P_{k}$ | len. high-energy top fragments $\overline{\overline{B}}_{2j}$
transmission rate $R_{k}$ | len. tree-coded fragments $T$
# transmitted bits $B_{k}$ | len. CCS+tree-coded fragments $Q$
TABLE I: Summary of the quantities in Sec. II and Sec. III.

IV Numerical Evaluations

We now turn to the numerical simulation of the scheme introduced in Sec. III. Generally speaking, we are interested in showing that the scheme in Sec. III allows high-energy users to transmit at high rates while preserving the baseline performance of the scenario in which all $M_{1}+M_{2}$ nodes transmit at the low-energy level $P_{1}$, as in [amalladinne2019coded]. Indeed, this is the design reasoning behind the coding choice of treating the bottom bits of the high-energy cluster in the same manner as the bits of the low-energy cluster. Accordingly, in the simulations we fix the scheme parameters as in Table II and study the rate performance as a function of $P_{1}$ while the ratio $P_{2}/P_{1}=\alpha$ is kept fixed.

Parameter | Value | Parameter | Value
block-length | $N=200$ | # of fragments | $J=11$
# low-energy users | $M_{1}=10$ | section length | $L=5$
# high-energy users | $M_{2}=5$ | # of AMP iterations | $10$
low-energy payload | $B_{1}=100$ | target $P_{\rm UE}$ | $5\%$
TABLE II: Summary of the simulation parameters in Sec. IV.

IV-A Baseline Performance

In this section, we elaborate on the baseline settings for our simulation campaign. The performance associated with the original CCS code in each fragment is plotted in Fig. 2 for the settings in Table II.

Figure 2: The $P_{\rm UE}$ for a fragment in the baseline URA of Sec. IV-A, as a function of $P_{1}$, for various rates $R_{1}\in[0.03,0.075]$.

Let us consider the performance of the classic (coordinated) random-access scheme in which each active user transmits, using time-sharing, for a fraction $1/(M_{1}+M_{2})$ of the time. In this case, the largest attainable rate (ignoring block-length effects) is $R_{1}=0.067$; note that the CCS code here attains reliable decoding up to $R_{1}=0.07$ but shows a $P_{\rm UE}$ saturation at $R_{1}=0.08$.

To attain the desired block-length, the various fragments are stitched together using a tree code expressed as $\mathbf{B}_{1}=[0,5,5,5,5,5,5,5,5,5,9]$, where the $j^{\mathrm{th}}$ element of $\mathbf{B}_{1}$ is $B_{1j}$ in (7). With this choice of tree-coding parameters, we achieve an error rate below $1\%$ at the tree decoder.

IV-B Comparison with TDMA

Next, we wish to compare the performance of the scheme in Sec. III with a simpler scheme relying on time-division multiple access (TDMA), as follows. Given a block-length $N$, transmission takes place in a low-rate phase of duration $\lambda N$ and a high-rate phase of duration $(1-\lambda)N$. In the low-rate phase, all users transmit at the baseline rate of Sec. IV-A with power $P_{1}^{\prime}=P_{1}/\lambda$ while, in the high-rate phase, the high-energy users transmit at the baseline rate for $M_{2}$ users with power $P_{2}^{\prime}$ obtained as

P_{2}^{\prime}(1-\lambda)+\lambda P_{1}=P_{2}. \qquad (18)
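The power balance in (18) pins down the high-rate-phase power once $\lambda$ is chosen; solving for $P_{2}^{\prime}$ is a one-liner. The numerical values below are illustrative, with $\alpha=P_{2}/P_{1}=6$ matching the ratio used in the comparison.

```python
def tdma_high_power(P1, P2, lam):
    """Solve eq. (18) for the high-rate-phase power:
    P2' * (1 - lam) + lam * P1 = P2, hence P2' = (P2 - lam*P1)/(1 - lam)."""
    return (P2 - lam * P1) / (1.0 - lam)

# Example: P1 = 1, P2 = 6 (alpha = 6), and an even time split lam = 1/2.
P2_prime = tdma_high_power(P1=1.0, P2=6.0, lam=0.5)
```

Note that $P_{2}^{\prime}$ grows without bound as $\lambda\to 1$, reflecting the usual TDMA trade-off between phase duration and per-phase power.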

Denote the rates attained with this scheme in the two clusters by $(R_{1}^{\prime},R_{2}^{\prime})$. We can compare the performance of this TDMA approach with the approach in Sec. III by setting the rate of the low-energy users to $R_{1}^{\prime}$ and determining what rate is attainable for the high-energy users. We can then interpret $\overline{\overline{R}}_{2}$ and $R_{2}^{\prime}$ as the rates that the two approaches afford the high-energy users for a given degradation of the baseline performance.

Figure 3: Simulation results for the setting in Sec. IV-B for $\alpha=P_{2}/P_{1}=6$.

IV-C Improvement over IAN

Through numerical experiments, we have noted that the decoding procedure in Sec. III-E greatly outperforms the performance predicted by SIC under the assumption that the effective noise $\boldsymbol{\Phi}_{1}(\mathbf{m}_{1}+\mathbf{\overline{m}}_{2})+\mathbf{z}$ is normally distributed, i.e., interference as noise (IAN). One inherent feature of the CCS codes used in Sec. III is that the randomly generated codewords in $\mathbf{A}_{1}$/$\mathbf{A}_{2}$ are uniformly distributed in the code space. This means that, when $R_{1}$ and $R_{2}$ are sufficiently low, the codewords in $\mathbf{A}_{1}$ and $\mathbf{A}_{2}$ are perpendicular with high probability. We indeed observe that the decoding procedure in Sec. III is inherently able to exploit this codeword perpendicularity in the projection step. We are currently unable to precisely characterize this effect; nonetheless, we can investigate this phenomenon numerically, as in Fig. 4. There, we plot the largest attainable $\overline{\overline{R}}_{2}=R_{2}-R_{1}$ for different numbers of users, together with the IAN prediction.

Figure 4: Decoding improvement with respect to the interference-as-noise (IAN) prediction described in Sec. IV-C; the largest attainable $R_{2}-R_{1}$ is plotted as a function of $P_{1}$ for $M_{1}=10$ and $M_{1}=15$.