
Deep-Unfolded Massive Grant-Free Transmission
in Cell-Free Wireless Communication Systems

Gangle Sun, Mengyao Cao, Wenjin Wang,
Wei Xu, and Christoph Studer
This work was supported in part by the National Natural Science Foundation of China under Grant 62371122 and Grant 62341110, in part by the National Key Research and Development Program under Grant 2020YFB1806608, and in part by the Special Fund for Key Basic Research in Jiangsu Province under Grant BK20243015. The work of C. Studer was supported in part by an ETH Research Grant. Part of this work was presented at the IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC) 2023 [1] and at the 19th International Symposium on Wireless Communication Systems (ISWCS) [2]. This work extends our previous results by (i) unifying the JACD optimization problem in [1] and the JAD optimization problem in [2]; (ii) proposing a deep-unfolding-based approximate box-constrained algorithm, in which complex computations based on the KKT conditions in [1] are replaced by a simple shrinkage operation; (iii) implementing a novel model training strategy that facilitates the joint training of parameters in the FBS modules and the AUD module, fully utilizing the estimated channel information; and (iv) providing a comprehensive performance assessment of different algorithms across various scenarios. The associate editor coordinating the review of this article and approving it for publication was Prof. Andreas Burg. (Corresponding authors: Wenjin Wang; Christoph Studer.) Gangle Sun, Mengyao Cao, Wenjin Wang, and Wei Xu are with the National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China, and also with the Purple Mountain Laboratories, Nanjing 211100, China (e-mail: sungangle@seu.edu.cn; cmengyao@seu.edu.cn; wangwj@seu.edu.cn; wxu@seu.edu.cn). Christoph Studer is with the Department of Information Technology and Electrical Engineering, ETH Zurich, 8092 Zürich, Switzerland (e-mail: studer@ethz.ch).
Abstract

Grant-free transmission and cell-free communication are vital in improving coverage and quality-of-service for massive machine-type communication. This paper proposes a novel framework of joint active user detection, channel estimation, and data detection (JACD) for massive grant-free transmission in cell-free wireless communication systems. We formulate JACD as an optimization problem and solve it approximately using forward-backward splitting. To deal with the discrete symbol constraint, we relax the discrete constellation to its convex hull and propose two approaches that promote solutions from the constellation set. To reduce complexity, we replace costly computations with approximate shrinkage operations and approximate posterior mean estimator computations. To improve active user detection (AUD) performance, we introduce a soft-output AUD module that considers both the data estimates and channel conditions. To jointly optimize all algorithm hyper-parameters and to improve JACD performance, we further deploy deep unfolding together with a momentum strategy, resulting in two algorithms called DU-ABC and DU-POEM. Finally, we demonstrate the efficacy of the proposed JACD algorithms via extensive system simulations.

Index Terms:
Active user detection, cell-free communication, channel estimation, data detection, deep unfolding, grant-free transmission, massive machine-type communication.

I Introduction

Massive machine-type communication (mMTC) is an essential scenario of fifth-generation (5G) and beyond-5G wireless communication systems [3, 4, 5, 6, 7, 8], and focuses on supporting a large number of sporadically active user equipments (UEs) transmitting short packets to an infrastructure base station (BS) [9, 10, 11, 12, 13, 14, 15, 16]. On the one hand, grant-free transmission schemes [17, 18, 19, 20] are essential for mMTC scenarios, as they reduce signaling overhead, network congestion, and transmission latency compared with traditional grant-based transmission schemes [21, 22, 23]. In grant-free transmission schemes, UEs transmit signals directly to the BS over shared resources, bypassing the need for complex scheduling. On the other hand, cell-free communication offers a promising solution to improve coverage in mMTC [24, 25, 26, 27, 28]. In cell-free systems, numerous decentralized access points (APs) are connected to a central processing unit (CPU) and jointly serve UEs to effectively broaden coverage, mitigate inter-cell interference, and enhance spectral efficiency [29, 30]. The key tasks of the CPU for massive grant-free transmission in cell-free wireless communication systems are (i) identifying the set of active UEs, (ii) estimating their channels, and (iii) detecting their transmitted data.

I-A Contributions

This paper proposes a novel framework for joint active user detection, channel estimation, and data detection (JACD) in cell-free systems with grant-free access. We start by formulating JACD as an optimization problem that fully exploits the sparsity of the wireless channel and data matrices and then approximately solve it using forward-backward splitting (FBS) [31, 32]. To enable FBS, we relax the discrete constellation constraint to its convex hull and incorporate either a regularizer or a posterior mean estimator (PME) to guide symbols toward discrete constellation points, resulting in the box-constrained FBS and PME-based JACD algorithms, respectively. To reduce complexity, we replace the exact proximal operators with approximate shrinkage operations and approximate PME computations. To improve convergence and JACD performance, we include per-iteration step sizes and a momentum strategy. To avoid tedious manual parameter tuning, we employ deep unfolding (DU) to jointly tune all of the algorithm hyper-parameters using machine learning tools. To improve active user detection (AUD) performance, we include a novel soft-output AUD module that jointly considers the estimated data and channel matrix. Based on these modifications, we develop deep-unfolded versions of the box-constrained FBS and PME-based JACD algorithms, referred to as DU-ABC and DU-POEM, respectively. We use Monte–Carlo simulations to demonstrate the superiority of our framework compared to existing methods.

I-B Prior Work

I-B1 Massive Grant-Free Transmission in Cell-Free Wireless Communication Systems

Recent results have focused on AUD, channel estimation (CE), and data detection (DD) for massive grant-free transmission in cell-free wireless communication systems [1, 2, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46]. Reference [33] proposes two different AUD algorithms based on dominant APs and clustering, respectively. On this basis, a parallel AUD algorithm is developed to reduce complexity. Reference [34] proposes a covariance-based cooperative AUD method in which APs exchange their low-dimensional local information with neighbors. Reference [35] introduces a correlation-based AUD algorithm, accompanied by simulation results and empirical analysis, demonstrating that the cell-free system outperforms the collocated system in AUD performance. Reference [36] proposes centralized and distributed AUD algorithms for asynchronous transmission caused by low-cost oscillators. For near-real-time transmission, reference [37] introduces a deep-learning-based AUD algorithm, in which distributed computing units employ convolutional neural networks for preliminary AUD, and the CPU subsequently refines these results through transfer learning. Capitalizing on the a-priori distribution of channel coefficients, reference [38] introduces a modified expectation-maximization approximate message passing (AMP) algorithm for CE, followed by AUD through the posterior support probabilities. Reference [39] proposes a two-stage AUD and CE method, in which AUD is first conducted via adjacent APs utilizing vector AMP, and CE is performed through a linear estimator. Reference [40] performs joint AUD and CE through a single-measurement-vector-based minimum mean square error (MMSE) estimation approach at each AP independently. Considering both centralized and edge computing paradigms, reference [41] presents an AMP-based approach for joint AUD and CE while addressing quantization accuracy. In millimeter-wave systems, reference [42] introduces two distinct algorithms for joint AUD and CE, leveraging the inherent characteristic that each UE’s channel predominantly comprises a few significant propagation paths. Reference [43] presents a joint AUD and DD (JAD) algorithm, employing an adaptive AP selection method based on local log-likelihood ratios. Reference [44] performs AUD, MMSE-based CE, and successive interference cancellation (SIC)-based data decoding under a probabilistic KK-repetition scheme. Reference [45] first presents a joint AUD and CE approach for grant-free transmission using orthogonal time frequency space (OTFS) modulation in low Earth orbit (LEO) satellite communication systems and subsequently introduces a least squares-based parallel time domain signal detection method. Reference [46] presents a Gaussian approximation-based Bayesian message passing algorithm for JACD, combined with an advanced low-coherence pilot design. Our previous work in [2] introduces a DU-based JAD algorithm, in which all algorithm hyper-parameters are optimized using machine learning. In addition, our study in [1] presents a box-constrained FBS algorithm designed for JACD. Unlike these previous methods, this paper tackles the task of JACD for massive grant-free transmission in cell-free systems. To improve JACD performance, we capture the sporadic UE activity more accurately by representing both the channel matrix and the data matrix as sparse matrices.

I-B2 Joint Active User Detection, Channel Estimation, and Data Detection for Single-Cell Massive Grant-Free Transmission

JACD for single-cell massive grant-free transmission has been extensively investigated in [47, 48, 49, 50, 51, 52, 53]. Considering low-precision data converters, reference [47] utilizes bilinear generalized AMP (Bi-GAMP) with belief propagation algorithms for JACD in single-cell mMTC systems. Reference [48] proposes a bilinear message-scheduling generalized AMP for JACD, in which the channel decoder beliefs are used to refine AUD and DD. Reference [49] develops a Bi-GAMP algorithm for JACD, capturing the row-sparse channel matrix structure stemming from channel correlations. Reference [50] divides the JACD scheme into slot-wise AUD and joint signal and channel estimation, which are addressed using message passing. Reference [51] combines AMP and belief propagation (BP) to perform JACD for asynchronous mMTC systems, in which the UEs transmit data packets of different lengths. Reference [52] introduces a turbo-structured receiver for JACD and data decoding, utilizing the channel decoder’s information to improve CE and DD performance. Reference [53] introduces a JACD algorithm based on message passing and Markov random fields for LEO satellite-enabled mMTC scenarios, which employs OTFS modulation to capitalize on the sparsity in the delay-Doppler-angle domain. In contrast to these message-passing-based JACD methods, which have been designed for single-cell systems and primarily focus on UE activity sparsity, we consider the JACD problem in cell-free systems by taking into account two distinct sources of sparsity in the channel matrix: (i) column sparsity, which stems from the sporadic UE activity in mMTC scenarios, and (ii) block sparsity within each non-zero column, which stems from the vast discrepancies in large-scale channel fading between UEs and distributed APs in cell-free systems [54]. Furthermore, we propose our JACD methods within the FBS framework, incorporating efficient strategies to improve JACD performance and reduce computational complexity.

I-B3 Deep-Unfolding for Massive Grant-Free Transmission and Cell-Free Systems

DU techniques [55, 56, 57, 58], which utilize backpropagation and stochastic gradient descent to automatically learn algorithm hyper-parameters, have increasingly found application in the domain of massive grant-free transmission and cell-free systems [42, 51, 59, 60, 61, 62, 63]. Reference [42] applies DU to a linearized alternating direction method of multipliers and vector AMP, improving the performance of joint AUD and CE. Reference [51] applies DU to AMP-BP to improve JACD performance, fully exploiting the three-level sparsity inherent in UE activity, transmission delay, and data length diversity. Reference [59] introduces a model-driven sparse recovery algorithm to estimate the sparse channel matrix in mMTC scenarios, effectively utilizing the a-priori knowledge of partially known supports. Reference [60] unfolds AMP, tailored for JAD in mMTC under single-phase non-coherent schemes, wherein the embedded parameters are trained to mitigate the performance degradation caused by the non-ideal i.i.d. model. Reference [61] uses DU together with an iterative shrinkage thresholding algorithm for joint AUD and CE, wherein multiple computable matrices are treated as trainable parameters, thereby improving optimization flexibility. Reference [62] proposes a DU-based multi-user beamformer for cell-free systems, improving robustness to imperfect channel state information (CSI), where APs are equipped with fully digital or hybrid analog-digital arrays. Reference [63] unfolds a zero-forcing algorithm to achieve multi-user precoding, reducing complexity and improving robustness under imperfect CSI. In contrast to these results, we deploy DU to train all hyper-parameters in our proposed algorithms to improve JACD performance and employ approximations for high-complexity steps to decrease computational complexity. Furthermore, we train the hyper-parameters of our soft-output AUD module that generates information on UE activity probability.

I-C Notation

Matrices, column vectors, and sets are denoted by uppercase boldface letters, lowercase boldface letters, and uppercase calligraphic letters, respectively. 𝐀(m,n)\mathbf{A}(m,n) stands for the element of matrix 𝐀\mathbf{A} in the mmth row and nnth column, and 𝐚(m)\mathbf{a}(m) stands for the mmth entry of the vector 𝐚\mathbf{a}. The M×NM\times N all-ones matrix is 𝟏M×N\mathbf{1}_{M\times N} and the M×MM\times M identity matrix is 𝐈M\mathbf{I}_{M}. The unit vector, which is zero except for the mmth entry, is 𝐞m\mathbf{e}_{m}, and the zero vector is 𝟎\mathbf{0}; the dimensions of these vectors will be clear from the context. We use hat symbols to refer to the estimated values of a variable, vector, or matrix. The superscripts ()T\left(\cdot\right)^{T} and ()H\left(\cdot\right)^{H} represent transpose and conjugate transpose, respectively, and the superscript (k)(k) denotes the kkth iteration. The Frobenius norm is denoted by F\|\cdot\|_{F}. In addition, diag{x1,,xN}\operatorname{diag}\{x_{1},\ldots,x_{N}\} stands for a diagonal matrix with entries x1,,xNx_{1},\ldots,x_{N} on the main diagonal; det(𝐀)\operatorname{det}(\mathbf{A}) stands for the determinant of 𝐀\mathbf{A}. The cardinality of a set 𝒬\mathcal{Q} is |𝒬||\mathcal{Q}|. The operators \odot and \propto denote the Hadamard product and proportionality, respectively. For 𝐱N\mathbf{x}\in\mathbb{C}^{N}, Re{𝐱}N\text{Re}\{\mathbf{x}\}\in\mathbb{R}^{N} and Im{𝐱}N\text{Im}\{\mathbf{x}\}\in\mathbb{R}^{N} represent its real and imaginary part, respectively. {}\mathbb{P}\{\cdot\} and 𝔼{}\mathbb{E}\{\cdot\} denote probability and expectation, respectively; the indicator function 𝕀{}\mathbb{I}\left\{\cdot\right\} returns 11 if the condition holds and 0 otherwise. The multivariate complex Gaussian probability distribution with mean vector 𝐦\mathbf{m} and covariance matrix 𝚺\bm{\Sigma} evaluated at 𝐱M\mathbf{x}\in\mathbb{C}^{M} is

𝒞𝒩(𝐱;𝐦,𝚺)exp((𝐱𝐦)H𝚺1(𝐱𝐦))πMdet(𝚺),\displaystyle\mathcal{CN}\left(\mathbf{x};\mathbf{m},\bm{\Sigma}\right)\triangleq\frac{\exp\left(-\left(\mathbf{x}-\mathbf{m}\right)^{H}\bm{\Sigma}^{-1}\left(\mathbf{x}-\mathbf{m}\right)\right)}{\pi^{M}\operatorname{det}\left(\bm{\Sigma}\right)}, (1)

and the symbol \triangleq means “is defined as.” In addition, the shrinkage operation is defined as

Shrinkage(𝐱,μ){𝐱max{𝐱Fμ,0}𝐱F, if 𝐱𝟎,𝟎, if 𝐱=𝟎.\displaystyle\textsf{Shrinkage}\left(\mathbf{x},\mu\right)\triangleq\left\{\begin{array}[]{l}\mathbf{x}\frac{\max\{\left\|\mathbf{x}\right\|_{F}-\mu,0\}}{\left\|\mathbf{x}\right\|_{F}},\text{ if }\mathbf{x}\neq\mathbf{0},\\ \mathbf{0},\text{ if }\mathbf{x}=\mathbf{0}.\end{array}\right. (2)

and the element-wise clamp function is defined as

Clamp(𝐱,L,U)\displaystyle\textsf{Clamp}\left(\mathbf{x},L,U\right)\triangleq min{max{Re{𝐱},L},U}\displaystyle\min\left\{\max\left\{\text{Re}\left\{\mathbf{x}\right\},L\right\},U\right\} (3)
+jmin{max{Im{𝐱},L},U}.\displaystyle+j\min\left\{\max\left\{\text{Im}\left\{\mathbf{x}\right\},L\right\},U\right\}.
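To make these definitions concrete, the following minimal NumPy sketch implements the shrinkage operation (2) and the clamp function (3); the function names and test values are our own illustrative choices, not part of the paper.

import numpy as np

def shrinkage(x, mu):
    # Group-shrinkage operation of Eq. (2): pulls the whole vector toward zero by mu
    norm = np.linalg.norm(x)
    if norm == 0:
        return np.zeros_like(x)
    return x * max(norm - mu, 0.0) / norm

def clamp(x, lo, hi):
    # Element-wise complex clamp of Eq. (3): clips real and imaginary parts separately
    return np.clip(x.real, lo, hi) + 1j * np.clip(x.imag, lo, hi)

# Example: shrink a complex vector, then clamp it to the box [-1, 1] x [-1, 1]
x = np.array([1.5 + 0.5j, -0.2 + 2.0j])
print(shrinkage(x, 0.5))   # scaled-down copy of x
print(clamp(x, -1.0, 1.0)) # entries clipped to the box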

I-D Paper Outline

The rest of this paper is organized as follows. Section II presents the necessary prerequisites for the subsequent derivations. Section III introduces the system model and formulates the JACD optimization problem. Section IV introduces two distinct JACD algorithms: (i) box-constrained FBS and (ii) PME-based JACD. Section V deploys DU techniques and details the AUD module. Section VI investigates the efficacy of the proposed algorithms via Monte–Carlo simulations. Section VII concludes the paper. Proofs are relegated to the appendices.

II Prerequisites

In this section, we introduce some background knowledge and mathematical definitions related to the subsequent sections. Specifically, the introduction to FBS will provide insights into the derivation of the JACD algorithms under the FBS framework in Section IV. In addition, Definitions 1 and 3 are primarily used for the derivation of the PME-based JACD algorithm in Section IV-B, and Definition 2 is mainly utilized in the derivation of the DU-POEM algorithm in Section V-B.

II-A A Brief Introduction to FBS

FBS, also known as the proximal gradient method, is widely used to solve a variety of convex optimization problems [31, 32]. FBS splits the objective function into two components: a smooth function, denoted as f(𝐒)f(\mathbf{S}), and another not necessarily smooth function, g(𝐒)g(\mathbf{S}), and solves the following optimization problem:

𝐒^=argmin𝐒f(𝐒)+g(𝐒).\displaystyle\hat{\mathbf{S}}={\arg\min}_{\mathbf{S}}\;f(\mathbf{S})+g(\mathbf{S}). (4)

FBS iteratively performs a gradient step in the smooth function f(𝐒)f(\mathbf{S}) (denoted as the forward step) and a proximal operation to find a solution in the vicinity of the minimizer of the function g(𝐒)g(\mathbf{S}) (denoted as the backward step). The forward step proceeds as

𝐒^(k)=𝐒(k)τ(k)f(𝐒(k)),\hat{\mathbf{S}}^{(k)}=\mathbf{S}^{(k)}-\tau^{(k)}\nabla f\!\left(\mathbf{S}^{(k)}\right)\!, (5)

where the superscript (k)(k) denotes the kkth iteration, τ(k)\tau^{(k)} represents the step size at iteration kk, and f(𝐒)\nabla f\left(\mathbf{S}\right) is the gradient of f(𝐒)f(\mathbf{S}). The backward step proceeds as

𝐒(k+1)=argmin𝐒12𝐒𝐒^(k)F2+τ(k)g(𝐒).\mathbf{S}^{(k+1)}={\arg\min}_{\mathbf{S}}\;\frac{1}{2}\left\|\mathbf{S}-\hat{\mathbf{S}}^{(k)}\right\|_{F}^{2}+\tau^{(k)}g(\mathbf{S}). (6)

This process is iterated for k=1,2,k=1,2,\ldots until a predefined convergence criterion is met or a maximum number of KK iterations has been reached.
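For illustration, a generic FBS loop implementing (5) and (6) can be sketched as follows; grad_f and prox_g are placeholders for the problem-specific gradient and proximal operator, and the LASSO toy example at the end is our own, not taken from the paper.

import numpy as np

def fbs(S0, grad_f, prox_g, tau, K=100, tol=1e-6):
    # Generic forward-backward splitting: alternate Eq. (5) and Eq. (6)
    S = S0
    for _ in range(K):
        S_half = S - tau * grad_f(S)   # forward (gradient) step, Eq. (5)
        S_next = prox_g(S_half, tau)   # backward (proximal) step, Eq. (6)
        if np.linalg.norm(S_next - S) < tol * max(np.linalg.norm(S), 1.0):
            return S_next              # convergence criterion met
        S = S_next
    return S

# Toy usage: minimize 0.5*||A s - y||^2 + mu*||s||_1 (soft-thresholding prox)
A = np.array([[1.0, 0.2], [0.2, 1.0]])
y = np.array([1.0, -0.5])
mu = 0.1
grad_f = lambda s: A.T @ (A @ s - y)
prox_g = lambda s, tau: np.sign(s) * np.maximum(np.abs(s) - tau * mu, 0.0)
print(fbs(np.zeros(2), grad_f, prox_g, tau=0.5))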

II-B Some Mathematical Definitions

Here, we introduce Definitions 1 and 2 to specify the probability distributions that the random vector 𝐱\mathbf{x} may follow.

Definition 1: For a random vector 𝐱\mathbf{x} taken from the discrete set 𝒮\mathcal{S} (𝟎𝒮\mathbf{0}\in\mathcal{S}), we say that 𝐱\mathbf{x} follows the θ\theta-mixed discrete uniform distribution on 𝒮\mathcal{S}, denoted as 𝐱Uθ,𝒮{𝐱}\mathbf{x}\sim U_{\theta,\mathcal{S}}\{\mathbf{x}\}, if {𝐱}=Uθ,𝒮{𝐱}=θ|𝒮|1𝕀{𝐱𝟎}+(1θ)𝕀{𝐱=𝟎},𝐱𝒮\mathbb{P}\{\mathbf{x}\}=U_{\theta,\mathcal{S}}\{\mathbf{x}\}=\frac{\theta}{|\mathcal{S}|-1}\mathbb{I}\{\mathbf{x}\neq\mathbf{0}\}+(1-\theta)\mathbb{I}\{\mathbf{x}=\mathbf{0}\},\forall\mathbf{x}\in\mathcal{S}, where θ(0,1)\theta\in(0,1) is the probability of 𝐱\mathbf{x} being non-zero.

Definition 2: For a random vector 𝐱\mathbf{x} in a discrete set 𝒮\mathcal{S} (𝟎𝒮\mathbf{0}\notin\mathcal{S}), we say that 𝐱\mathbf{x} follows a discrete uniform distribution on 𝒮\mathcal{S}, denoted as 𝐱U𝒮{𝐱}\mathbf{x}\sim U_{\mathcal{S}}\{\mathbf{x}\}, if {𝐱}=U𝒮{𝐱}=1|𝒮|,𝐱𝒮\mathbb{P}\{\mathbf{x}\}=U_{\mathcal{S}}\{\mathbf{x}\}=\frac{1}{|\mathcal{S}|},\forall\mathbf{x}\in\mathcal{S}.

We also define the PME of a vector 𝐱\mathbf{x} under specific conditions as follows:

Definition 3: Given the observation vector 𝐱^=𝐱+𝐞\hat{\mathbf{x}}=\mathbf{x}+\mathbf{e} with 𝐱𝒮\mathbf{x}\in\mathcal{S} following the prior distribution {𝐱}\mathbb{P}\{\mathbf{x}\} and Gaussian estimation error 𝐞𝒞𝒩(𝟎,Ne𝐈)\mathbf{e}\sim\mathcal{CN}\left(\mathbf{0},N_{\text{e}}\mathbf{I}\right), we call

PME(𝐱^,𝒮,{𝐱},Ne)=𝐱𝒮{𝐱}𝒞𝒩(𝐱;𝐱^,Ne𝐈)𝐱d𝐱𝐱𝒮{𝐱}𝒞𝒩(𝐱;𝐱^,Ne𝐈)d𝐱,\displaystyle\text{PME}(\hat{\mathbf{x}},\mathcal{S},\mathbb{P}\{\mathbf{x}\},N_{\text{e}})=\frac{\int_{\mathbf{x}\in\mathcal{S}}\mathbb{P}\{\mathbf{x}\}\mathcal{CN}(\mathbf{x};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I})\mathbf{x}\,\mathrm{d}\mathbf{x}}{\int_{\mathbf{x}\in\mathcal{S}}\mathbb{P}\{\mathbf{x}\}\mathcal{CN}(\mathbf{x};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I})\,\mathrm{d}\mathbf{x}}, (7)

the PME of 𝐱\mathbf{x} under the prior {𝐱}\mathbb{P}\{\mathbf{x}\}.

III System Model and Problem Formulation

We consider an mMTC scenario in a cell-free wireless communication system as illustrated in Fig. 1. We model the situation using PP distributed APs, each with MM antennas, that jointly serve NN single-antenna, sporadically active UEs. We assume that the NaNN_{a}\ll N active UEs are synchronized in time and simultaneously transmit uplink signals to the APs over RR resource elements through frequency-flat and block-fading channels.

Figure 1: Illustration of mMTC in a cell-free wireless communication system.

III-A System Model

Following our previous work in [1, 2], we model the input-output relation of this scenario as follows:

𝐘=n=1Nξn𝐡n𝐱¯nT+𝐍.\mathbf{Y}=\sum_{n=1}^{N}\xi_{n}\mathbf{h}_{n}\bar{\mathbf{x}}_{n}^{T}+\mathbf{N}. (8)

Here, the received signal matrix from all APs is denoted by 𝐘MP×R\mathbf{Y}\in\mathbb{C}^{MP\times R}, and ξn{0,1}\xi_{n}\in\{0,1\} is the nnth UE’s activity indicator, with ξn=1\xi_{n}=1 indicating that the nnth UE is active and ξn=0\xi_{n}=0 otherwise. We assume that UEs are active independently of each other and that all UEs have the same activity probability α\alpha, i.e., {ξn=1}=α,n\mathbb{P}\{\xi_{n}=1\}=\alpha,\forall n. The channel vector between the nnth UE and all APs is 𝐡n[𝐡n,1T,,𝐡n,PT]TMP\mathbf{h}_{n}\triangleq[\mathbf{h}_{n,1}^{T},\ldots,\mathbf{h}_{n,P}^{T}]^{T}\in\mathbb{C}^{MP} with 𝐡n,pM\mathbf{h}_{n,p}\in\mathbb{C}^{M} being the channel vector between the nnth UE and the ppth AP. The nnth UE’s signal vector 𝐱¯n[𝐱P,nT,𝐱¯D,nT]TR\bar{\mathbf{x}}_{n}\triangleq[\mathbf{x}_{\text{P},n}^{T},\bar{\mathbf{x}}_{\text{D},n}^{T}]^{T}\in\mathbb{C}^{R} consists of the pilot vector 𝐱P,nRP\mathbf{x}_{\text{P},n}\in\mathbb{C}^{R_{\text{P}}} and the data vector 𝐱¯D,n𝒬RD\bar{\mathbf{x}}_{\text{D},n}\in\mathcal{Q}^{R_{\text{D}}} with data entries independently and uniformly sampled from the constellation set 𝒬\mathcal{Q}. Entries in the noise matrix 𝐍MP×R\mathbf{N}\in\mathbb{C}^{MP\times R} are assumed to be i.i.d. circularly-symmetric complex Gaussian variables with variance N0=1N_{0}=1.

Figure 2: Illustration of the signal matrix 𝐗\mathbf{X}, where only 10 out of 50 UEs are active. Active UEs transmit unique pilots of length RP=20R_{\text{P}}=20 and data symbols of length RD=40R_{\text{D}}=40, represented by dark-colored squares. Inactive UEs transmit no signals; however, their unique pilots, which are known to the BS and represented by light-colored squares, can be used to improve JACD performance.
𝒫1:{𝐇^,𝐗^D}=argmin𝐇MP×N𝐱D,n𝒬¯,n12𝐘𝐇[𝐗P,𝐗D]F2+μhn=1Np=1P𝐡n,pF+μxn=1N𝐱D,nF.\mathcal{P}_{1}:\;\left\{{\hat{\mathbf{H}},{{\hat{\mathbf{X}}}_{\text{D}}}}\right\}=\mathop{\arg\min}_{\scriptstyle{\mathbf{H}\in\mathbb{C}^{MP\times N}}\hfill\atop{\scriptstyle{\mathbf{x}_{\text{D},n}\in\bar{\mathcal{Q}},\forall n}\hfill}}\frac{1}{2}\left\|\mathbf{Y}-\mathbf{H}\left[\mathbf{X}_{\text{P}},\mathbf{X}_{\text{D}}\right]\right\|_{F}^{2}+\mu_{h}\sum_{n=1}^{N}\sum_{p=1}^{P}\left\|\mathbf{h}_{n,p}\right\|_{F}+\mu_{x}\sum_{n=1}^{N}\left\|\mathbf{x}_{\text{D},n}\right\|_{F}. (14)
𝒫2:{𝐇^,𝐗^D}=argmin𝐇MP×N𝐗DN×RD12𝐘𝐇[𝐗P,𝐗D]F2+μhn=1Np=1P𝐡n,pF+μxn=1N𝐱D,nF+λ𝒞(𝐗D).\mathcal{P}_{2}:\left\{{\hat{\mathbf{H}},{{\hat{\mathbf{X}}}_{\text{D}}}}\right\}=\mathop{\arg\min}_{\scriptstyle{\mathbf{H}\in\mathbb{C}^{MP\times N}}\hfill\atop{\scriptstyle{{\mathbf{X}}_{\text{D}}\in\mathcal{B}^{N\times R_{\text{D}}}}\hfill}}\frac{1}{2}\left\|\mathbf{Y}-\mathbf{H}\left[\mathbf{X}_{\text{P}},\mathbf{X}_{\text{D}}\right]\right\|_{F}^{2}+\mu_{h}\sum_{n=1}^{N}\sum_{p=1}^{P}\left\|\mathbf{h}_{n,p}\right\|_{F}+\mu_{x}\sum_{n=1}^{N}\left\|\mathbf{x}_{\text{D},n}\right\|_{F}+\lambda\,\mathcal{C}\left(\mathbf{X}_{\text{D}}\right). (16)

For ease of notation, we rewrite the system model in (8) as

𝐘=𝐇[𝐗P,𝐗D]+𝐍=𝐇𝐗+𝐍,\displaystyle\textstyle\mathbf{Y}=\mathbf{H}\left[\mathbf{X}_{\text{P}},\mathbf{X}_{\text{D}}\right]+\mathbf{N}=\mathbf{H}\mathbf{X}+\mathbf{N}, (9)

where 𝐘=[𝐘P,𝐘D]\mathbf{Y}=[\mathbf{Y}_{\text{P}},\mathbf{Y}_{\text{D}}] with 𝐘PMP×RP\mathbf{Y}_{\text{P}}\in\mathbb{C}^{MP\times R_{\text{P}}} and 𝐘DMP×RD\mathbf{Y}_{\text{D}}\in\mathbb{C}^{MP\times R_{\text{D}}} being the received pilot matrix and received data matrix, respectively. The channel matrix is given by 𝐇[ξ1𝐡1,,ξN𝐡N]MP×N\mathbf{H}\triangleq\left[\xi_{1}\mathbf{h}_{1},\ldots,\xi_{N}\mathbf{h}_{N}\right]\in\mathbb{C}^{MP\times N}. The signal matrix 𝐗=[𝐗P,𝐗D]N×R\mathbf{X}=[\mathbf{X}_{\text{P}},\mathbf{X}_{\text{D}}]\in\mathbb{C}^{N\times R} contains the pilot matrix 𝐗P[𝐱P,1,,𝐱P,N]TN×RP\mathbf{X}_{\text{P}}\triangleq[\mathbf{x}_{\text{P},1},\ldots,\mathbf{x}_{\text{P},N}]^{T}\in\mathbb{C}^{N\times R_{\text{P}}} and the data matrix 𝐗D[𝐱D,1,,𝐱D,N]TN×RD\mathbf{X}_{\text{D}}\triangleq[\mathbf{x}_{\text{D},1},\ldots,\mathbf{x}_{\text{D},N}]^{T}\in\mathbb{C}^{N\times R_{\text{D}}}, where 𝐱D,nξn𝐱¯D,n𝒬¯\mathbf{x}_{\text{D},n}\triangleq\xi_{n}\bar{\mathbf{x}}_{\text{D},n}\in\bar{\mathcal{Q}} with 𝒬¯{𝒬RD,𝟎}\bar{\mathcal{Q}}\triangleq\{\mathcal{Q}^{R_{\text{D}}},\mathbf{0}\}.
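To illustrate the dimensions and sparsity structure in (9), the following sketch generates a toy problem instance; the sizes, the QPSK alphabet, the pilot normalization, and the i.i.d. fading model (which omits the per-AP block sparsity discussed next) are our own example choices.

import numpy as np

rng = np.random.default_rng(0)
M, P, N, R_P, R_D = 4, 10, 50, 20, 40   # antennas per AP, APs, UEs, pilot/data lengths
alpha = 0.2                              # UE activity probability
R = R_P + R_D
qpsk = np.array([1+1j, 1-1j, -1+1j, -1-1j]) / np.sqrt(2)

xi = (rng.random(N) < alpha).astype(float)                        # activity indicators
H = (rng.standard_normal((M*P, N)) + 1j*rng.standard_normal((M*P, N))) / np.sqrt(2)
H = H * xi                                                        # column sparsity (inactive UEs)
X_P = (rng.standard_normal((N, R_P)) + 1j*rng.standard_normal((N, R_P))) / np.sqrt(2*R_P)
X_D = rng.choice(qpsk, size=(N, R_D)) * xi[:, None]               # row-sparse data matrix
X = np.concatenate([X_P, X_D], axis=1)
N0 = 1.0
noise = np.sqrt(N0/2) * (rng.standard_normal((M*P, R)) + 1j*rng.standard_normal((M*P, R)))
Y = H @ X + noise                                                 # received signal, Eq. (9)
print(Y.shape)                                                    # (40, 60)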

A typical signal matrix 𝐗\mathbf{X} is depicted in Fig. 2. Here, the data matrix 𝐗D\mathbf{X}_{\text{D}} is row-sparse due to sporadic UE activity, whereas the pilot matrix 𝐗P\mathbf{X}_{\text{P}} is non-sparse, since the pilots of all UEs are known and retained for subsequent optimization. A typical channel matrix 𝐇\mathbf{H} is depicted in Fig. 3, where sparsity has two causes: (i) the UEs’ sporadic activity results in column sparsity, and (ii) the inherent discrepancies in large-scale fading between UEs and different APs, caused by their varying distances, lead to block sparsity within each non-zero column.

While row/column sparsity and block sparsity have been widely studied in compressed sensing, the intricate interplay of these sparsities in our setting remains largely unexplored. In our system model (9), the estimation of both 𝐇\mathbf{H} and 𝐗\mathbf{X} poses a significant challenge, as 𝐇\mathbf{H} exhibits both column and block sparsity, while 𝐗D\mathbf{X}_{\text{D}} displays row sparsity. In the subsequent problem formulation and algorithm derivation, we will effectively leverage the sparsity of both the channel matrix 𝐇\mathbf{H} and the data matrix 𝐗D\mathbf{X}_{\text{D}} to enhance JACD performance.

Figure 3: Illustration of a channel matrix 𝐇\mathbf{H} for 1010 APs with 44 antennas each and 5050 UEs, where 1010 UEs are active. Darker colors indicate larger absolute values; the boxed columns correspond to the active UEs (cf. Fig. 2).

III-B Problem Formulation

Using the system model (9), we formulate the maximum-a-posteriori JACD problem for massive grant-free transmission in cell-free wireless communication systems as [1, 2]

{𝐇^,𝐗^D}=argmax𝐇MP×N𝐱D,n𝒬¯,nP(𝐘|𝐇,𝐗D)P(𝐇)P(𝐗D),\left\{\hat{\mathbf{H}},{{\hat{\mathbf{X}}}_{\text{D}}}\right\}=\mathop{\arg\max}\limits_{\scriptstyle{\mathbf{H}\in\mathbb{C}^{MP\times N}}\hfill\atop{\scriptstyle{\mathbf{x}_{\text{D},n}\in\bar{\mathcal{Q}},\forall n}\hfill}}P\!\left(\mathbf{Y}|\mathbf{H},\mathbf{X}_{\text{D}}\right)\!P\!\left(\mathbf{H}\right)\!P\!\left(\mathbf{X}_{\text{D}}\right), (10)

where the channel law P(𝐘|𝐇,𝐗D)P\!\left(\mathbf{Y}|\mathbf{H},\mathbf{X}_{\text{D}}\right) is given by

P(𝐘|𝐇,𝐗D)exp(𝐘𝐇[𝐗P,𝐗D]F2N0).P\!\left(\mathbf{Y}|\mathbf{H},\mathbf{X}_{\text{D}}\right)\propto\exp\Big{(}-\frac{\left\|\mathbf{Y}-\mathbf{H}\left[\mathbf{X}_{\text{P}},\mathbf{X}_{\text{D}}\right]\right\|_{F}^{2}}{N_{0}}\Big{)}. (11)

Here, we employ the complex-valued block-Laplace model and Laplace model for the block sparsity in 𝐇\mathbf{H} and the row sparsity in 𝐗D\mathbf{X}_{\text{D}}, respectively, as follows [1, 2]

P(𝐇)n=1Np=1Pexp(2μh𝐡n,pF),\displaystyle P\!\left(\mathbf{H}\right)\propto\prod_{n=1}^{N}\prod_{p=1}^{P}\exp\!\left(-2\mu_{h}\left\|\mathbf{h}_{n,p}\right\|_{F}\right)\!, (12)
P(𝐗D)n=1Nexp(2μx𝐱D,nF),\displaystyle P\!\left(\mathbf{X}_{\text{D}}\right)\propto\prod_{n=1}^{N}\exp\!\left(-2\mu_{x}\left\|\mathbf{x}_{\text{D},n}\right\|_{F}\right)\!, (13)

with μh\mu_{h} and μx\mu_{x} indicating the sparsity levels of 𝐇\mathbf{H} and 𝐗D\mathbf{X}_{\text{D}}, respectively.

By inserting (11), (12), and (13) into (10), we can rewrite the JACD problem as problem 𝒫1\mathcal{P}_{1} with N0=1N_{0}=1, expressed in problem (14) above. This problem aims to estimate the channel matrix 𝐇\mathbf{H} and the data matrix 𝐗D\mathbf{X}_{\text{D}} using the received signal matrix 𝐘\mathbf{Y} and the pilot matrix 𝐗P\mathbf{X}_{\text{P}}, in which UE activity is indicated by the column sparsity of 𝐇\mathbf{H} and row sparsity of 𝐗D\mathbf{X}_{\text{D}}.

III-C Problem Relaxation

The discrete set 𝒬¯\bar{\mathcal{Q}} renders 𝒫1\mathcal{P}_{1} a discrete-valued optimization problem for which a naïve exhaustive search is infeasible. To circumvent this limitation, as in [1, 2], we relax the discrete set 𝒬¯\bar{\mathcal{Q}} to its convex hull RD\mathcal{B}^{R_{\text{D}}}, thereby transforming 𝒫1\mathcal{P}_{1} into a continuous-valued optimization problem. The set \mathcal{B} is given by

{i=1|𝒬|δiqi:qi𝒬,δi0,i;i=1|𝒬|δi=1}.\textstyle\mathcal{B}\triangleq\left\{\sum_{i=1}^{|\mathcal{Q}|}\delta_{i}q_{i}:\,q_{i}\in\mathcal{Q},\,\delta_{i}\geq 0,\forall i;\,\sum_{i=1}^{|\mathcal{Q}|}\delta_{i}=1\right\}\!. (15)

For quadrature phase shift keying (QPSK), the set \mathcal{B} can be specified as ={x:BRe{x}B,BIm{x}B}\mathcal{B}=\{x\in\mathbb{C}:-B\leq\text{Re}\{x\}\leq B,-B\leq\text{Im}\{x\}\leq B\}. To push the entries in the data matrix 𝐗D\mathbf{X}_{\text{D}} towards points in the discrete set 𝒬¯\bar{\mathcal{Q}}, two distinct methodologies can be used: method (i) introduces a regularizer (also called a penalty term) into the objective function of 𝒫1\mathcal{P}_{1} [1, 64], and method (ii) leverages the PME to denoise the estimated data matrix 𝐗D\mathbf{X}_{\text{D}} [2]. As such, we can transform the original discrete-valued optimization problem 𝒫1\mathcal{P}_{1} into problem 𝒫2\mathcal{P}_{2} in (16) above. The penalty parameter λ\lambda is λ>0\lambda>0 for method (i) and λ=0\lambda=0 for method (ii). For the regularizer 𝒞(𝐗D)\mathcal{C}\left(\mathbf{X}_{\text{D}}\right), many alternatives are possible, such as 𝒞(𝐗D)=𝐗D𝐗DB2𝟏N×RDF2\mathcal{C}\left(\mathbf{X}_{\text{D}}\right)=-\|\mathbf{X}_{\text{D}}\odot\mathbf{X}_{\text{D}}^{*}-B^{2}\mathbf{1}_{N\times R_{\text{D}}}\|_{F}^{2} utilized in [1]. In the next section, we develop FBS algorithms based on methods (i) and (ii), respectively, to efficiently compute approximate solutions to problem 𝒫2\mathcal{P}_{2}.

IV JACD Algorithms

We develop two JACD algorithms that utilize FBS, each leveraging specific techniques to improve JACD performance. The first algorithm, based on method (i), is the box-constrained FBS algorithm [1], which utilizes a regularizer within the objective function to guide the estimated symbols toward the discrete constellation points, thereby improving DD accuracy. The second algorithm, based on method (ii), is the PME-based JACD algorithm, which employs the PME to denoise the data estimates and further improve DD performance.

IV-A Box-Constrained FBS Algorithm for JACD

For this method, we utilize FBS, incorporating the regularizer 𝒞(𝐗D)\mathcal{C}(\mathbf{X}_{\text{D}}), to approximately solve the non-convex problem 𝒫2\mathcal{P}_{2}. Using the definition 𝐒[𝐇H,𝐗D]H(MP+RD)×N\mathbf{S}\triangleq[\mathbf{H}^{H},\mathbf{X}_{\text{D}}]^{H}\in\mathbb{C}^{(MP+R_{\text{D}})\times N} in [1, 2, 54], we can split the objective function of 𝒫2\mathcal{P}_{2} into the two functions

f(𝐒)\displaystyle\!\!\!f(\mathbf{S}) =12𝐘𝐇[𝐗P,𝐗D]F2+λ𝒞(𝐗D),\displaystyle=\frac{1}{2}\left\|\mathbf{Y}-\mathbf{H}\left[\mathbf{X}_{\text{P}},\mathbf{X}_{\text{D}}\right]\right\|_{F}^{2}\!+\!\lambda\,\mathcal{C}(\mathbf{X}_{\text{D}}), (17)
g(𝐒)\displaystyle\!\!\!g(\mathbf{S}) =μhn=1Np=1P𝐡n,pF+μxn=1N𝐱D,nF+𝒳(𝐗D),\displaystyle=\mu_{h}\!\!\sum_{n=1}^{N}\sum_{p=1}^{P}\left\|\mathbf{h}_{n,p}\right\|_{F}\!+\!\mu_{x}\!\!\sum_{n=1}^{N}\left\|\mathbf{x}_{\text{D},n}\right\|_{F}\!+\!\mathcal{X}\!\left(\mathbf{X}_{\text{D}}\right), (18)

where 𝒳(𝐗D)=+𝕀{𝐗DN×RD}\mathcal{X}\left(\mathbf{X}_{\text{D}}\right)=+\infty\cdot\mathbb{I}\left\{\mathbf{X}_{\text{D}}\notin\mathcal{B}^{N\times R_{\text{D}}}\right\} represents the constraint 𝐗DN×RD{\mathbf{X}}_{\text{D}}\in\mathcal{B}^{N\times R_{\text{D}}} in problem 𝒫2\mathcal{P}_{2}. Following the FBS framework outlined in Section II-A, the corresponding forward and backward steps can be specified as follows.

IV-A1 Forward Step

Given the expression of f(𝐒)f(\mathbf{S}) in (17), we can derive the gradient of f(𝐒)f\!\left(\mathbf{S}\right)\! with respect to 𝐒\mathbf{S} in the forward step (5) as follows [1]:

f(𝐒)=[(f𝐇)T,(f𝐗DT)T]T,\nabla f(\mathbf{S})=\left[\left(\frac{\partial f}{\partial\mathbf{H}^{*}}\right)^{T},\left(\frac{\partial f}{\partial\mathbf{X}_{\text{D}}^{T}}\right)^{T}\right]^{T}, (19)

where

f𝐇\displaystyle\frac{\partial f}{\partial\mathbf{H}^{*}} =(𝐘𝐇𝐗)𝐗H,\displaystyle=-\!\left(\mathbf{Y}-\mathbf{H}\mathbf{X}\right)\!\mathbf{X}^{H}, (20)
f𝐗DT\displaystyle\frac{\partial f}{\partial\mathbf{X}_{\text{D}}^{T}} =(𝐘D𝐇𝐗D)H𝐇+λ𝒞(𝐗D)𝐗DT.\displaystyle=-\!\left(\mathbf{Y}_{\text{D}}-\mathbf{H}\mathbf{X}_{\text{D}}\right)\!^{H}\mathbf{H}+\lambda\,\frac{\partial\mathcal{C}\!\left(\mathbf{X}_{\text{D}}\right)\!}{\partial\mathbf{X}_{\text{D}}^{T}}. (21)

The expression 𝒞(𝐗D)𝐗DT\frac{\partial\mathcal{C}\!\left(\mathbf{X}_{\text{D}}\right)\!}{\partial\mathbf{X}_{\text{D}}^{T}} depends on the choice of 𝒞(𝐗D)\mathcal{C}\left(\mathbf{X}_{\text{D}}\right). For instance, 𝒞(𝐗D)𝐗DT=4(𝐗D(𝐗D𝐗DB2𝟏N×RD))T\frac{\partial\mathcal{C}\!\left(\mathbf{X}_{\text{D}}\right)\!}{\partial\mathbf{X}_{\text{D}}^{T}}=-4\left(\mathbf{X}_{\text{D}}^{*}\odot\left(\mathbf{X}_{\text{D}}\odot\mathbf{X}_{\text{D}}^{*}-B^{2}\mathbf{1}_{N\times R_{\text{D}}}\right)\right)^{T} for 𝒞(𝐗D)=𝐗D𝐗DB2𝟏N×RDF2\mathcal{C}\left(\mathbf{X}_{\text{D}}\right)=-\left\|\mathbf{X}_{\text{D}}\odot\mathbf{X}_{\text{D}}^{*}-B^{2}\mathbf{1}_{N\times R_{\text{D}}}\right\|_{F}^{2}.
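A direct NumPy translation of the gradients (20) and (21), instantiated for the example regularizer above, might read as follows; this is a sketch under our own naming conventions, and grad_XD returns the gradient reshaped to the dimensions of X_D.

import numpy as np

def grad_H(Y, H, X):
    # Gradient of f with respect to H*, Eq. (20)
    return -(Y - H @ X) @ X.conj().T

def grad_XD(Y_D, H, X_D, lam, B):
    # Gradient of f for X_D, Eq. (21), using C(X_D) = -||X_D .* X_D^* - B^2 1||_F^2
    dC = -4.0 * (X_D.conj() * (X_D * X_D.conj() - B**2)).T   # dC/dX_D^T, shape (R_D, N)
    g = -(Y_D - H @ X_D).conj().T @ H + lam * dC             # Eq. (21), shape (R_D, N)
    return g.T                                               # back to the shape of X_D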

IV-A2 Backward Step

The proximal operator for 𝐒\mathbf{S} can be decomposed into separate proximal operators for 𝐇\mathbf{H} and 𝐗D\mathbf{X}_{\text{D}}, respectively. The proximal operator for 𝐇\mathbf{H} is

𝐇(k+1)=argmin𝐇MP×N12𝐇𝐇^(k)F2+τ(k)μhn=1Np=1P𝐡n,pF,\displaystyle\mathbf{H}^{(k+1)}\!=\!\!\mathop{\arg\min}_{\scriptstyle{\mathbf{H}\in\mathbb{C}^{MP\times N}}}\!\frac{1}{2}\left\|\mathbf{H}\!-\!\hat{\mathbf{H}}^{(k)}\right\|_{F}^{2}\!\!+\!\tau^{(k)}\mu_{h}\!\!\sum_{n=1}^{N}\sum_{p=1}^{P}\left\|\mathbf{h}_{n,p}\right\|_{F}\!, (22)

with τ(k)\tau^{(k)} representing the step size at iteration kk in the forward step (5). The closed-form solution of problem (22) is given by [54, 31, 32]

𝐡n,p(k+1)=Shrinkage(𝐡^n,p(k),τ(k)μh),\mathbf{h}_{n,p}^{(k+1)}=\textsf{Shrinkage}\left(\hat{\mathbf{h}}_{n,p}^{(k)},\tau^{(k)}\mu_{h}\right), (23)

with the shrinkage operation defined in (2). The proximal operator for 𝐗D\mathbf{X}_{\text{D}} can be decomposed as independent proximal operators for 𝐱D,n,n\mathbf{x}_{\text{D},n},\,\forall n as follows

𝐱D,n(k+1)=argmin𝐱D,nRD12𝐱D,n𝐱^D,n(k)F2+τ(k)μx𝐱D,nF,\displaystyle\!\mathbf{x}_{\text{D},n}^{(k+1)}\!=\!\!\mathop{\arg\min}_{\scriptstyle{\mathbf{x}_{\text{D},n}\in\mathcal{B}^{R_{\text{D}}}}}\frac{1}{2}\!\left\|\mathbf{x}_{\text{D},n}\!-\!\hat{\mathbf{x}}_{\text{D},n}^{(k)}\right\|_{F}^{2}\!+\!\tau^{(k)}\mu_{x}\left\|\mathbf{x}_{\text{D},n}\right\|_{F}, (24)

which is a convex optimization problem whose optimal solution can be obtained from the KKT conditions outlined in [1]. For completeness, the following proposition details the closed-form solution to (24); its proof is given in Appendix A.

Proposition 1: The optimal solution of (24) is given by 𝐱D,n(k+1)=[𝐈RD,i𝐈RD]𝐫n(k+1)\mathbf{x}_{\text{D},n}^{(k+1)}=\left[\mathbf{I}_{R_{\text{D}}},i\mathbf{I}_{R_{\text{D}}}\right]\mathbf{r}_{n}^{(k+1)}, where 𝐫n(k+1)2RD\mathbf{r}_{n}^{(k+1)}\in\mathbb{R}^{2R_{\text{D}}} is expressed as:

𝐫n(k+1)(d)=\displaystyle\mathbf{r}_{n}^{(k+1)}(d)= b𝐫^n(k)(d)𝕀{d𝒮p𝒮q}+B𝕀{d𝒮p,b0}\displaystyle b\,\hat{\mathbf{r}}_{n}^{(k)}(d)\,\mathbb{I}\left\{d\notin\mathcal{S}_{p}\cup\mathcal{S}_{q}\right\}+B\,\mathbb{I}\left\{d\in\mathcal{S}_{p},b\neq 0\right\} (25)
B𝕀{d𝒮q,b0}.\displaystyle-B\,\mathbb{I}\left\{d\in\mathcal{S}_{q},b\neq 0\right\}.

Here, 𝐫^n(k)=[Re{𝐱^D,n(k)}T,Im{𝐱^D,n(k)}T]T2RD\hat{\mathbf{r}}_{n}^{(k)}=[\text{Re}\{\hat{\mathbf{x}}_{\text{D},n}^{(k)}\}^{T},\text{Im}\{\hat{\mathbf{x}}_{\text{D},n}^{(k)}\}^{T}]^{T}\in\mathbb{R}^{2R_{\text{D}}}, 𝒮p{d:𝐫n,tmp(k+1)(d)>B}\mathcal{S}_{p}\triangleq\{d:\,\mathbf{r}_{n,\text{tmp}}^{(k+1)}(d)>B\}, and 𝒮q{d:𝐫n,tmp(k+1)(d)<B}\mathcal{S}_{q}\triangleq\{d:\,\mathbf{r}_{n,\text{tmp}}^{(k+1)}(d)<-B\}, where 𝐫n,tmp(k+1)=Shrinkage(𝐫^n(k),τ(k)μx)\textstyle\mathbf{r}_{n,\text{tmp}}^{(k+1)}=\textsf{Shrinkage}\big{(}\hat{\mathbf{r}}_{n}^{(k)},\tau^{(k)}\mu_{x}\big{)}. In addition, bb is the solution of the quartic equation 2d𝒮p𝒮q𝐫^n(k)(d)2b44d𝒮p𝒮q𝐫^n(k)(d)2b3+(2d𝒮p𝒮q𝐫^n(k)(d)2+|𝒮p𝒮q|2(τ(k)μx)2)b22|𝒮p𝒮q|b+|𝒮p𝒮q|=02\sum_{d\notin\mathcal{S}_{p}\cup\mathcal{S}_{q}}\hat{\mathbf{r}}_{n}^{(k)}(d)^{2}b^{4}-4\sum_{d\notin\mathcal{S}_{p}\cup\mathcal{S}_{q}}\hat{\mathbf{r}}_{n}^{(k)}(d)^{2}b^{3}+(2\sum_{d\notin\mathcal{S}_{p}\cup\mathcal{S}_{q}}\hat{\mathbf{r}}_{n}^{(k)}(d)^{2}+|\mathcal{S}_{p}\cup\mathcal{S}_{q}|-2(\tau^{(k)}\mu_{x})^{2})b^{2}-2|\mathcal{S}_{p}\cup\mathcal{S}_{q}|b+|\mathcal{S}_{p}\cup\mathcal{S}_{q}|=0 within the interval (0,1], if it exists; otherwise, b=0b=0.

The pseudocode for the proposed box-constrained FBS algorithm is presented in Algorithm 1.

1 Input: 𝐘\mathbf{Y}, 𝐗P\mathbf{X}_{\text{P}}, μh\mu_{h}, μx\mu_{x}, λ\lambda, and KK.
2 Initialization: 𝐒(1)=[(𝐇(1))H,𝐗D(1)]H\mathbf{S}^{(1)}=[(\mathbf{H}^{(1)})^{H},\mathbf{X}_{\text{D}}^{(1)}]^{H}.
3 for k=1,,Kk=1,\ldots,K do
4      Forward Step: Calculate 𝐒^(k)\hat{\mathbf{S}}^{(k)} via (5) and (19).
5      Backward Step: Calculate 𝐇(k+1)\mathbf{H}^{(k+1)} and 𝐗D(k+1)\mathbf{X}_{\text{D}}^{(k+1)} via (23) and Proposition 1, respectively.
6      end for
Output: 𝐇(K+1)\mathbf{H}^{(K+1)} and 𝐗D(K+1)\mathbf{X}_{\text{D}}^{(K+1)}.
Algorithm 1 Box-Constrained FBS Algorithm
𝐱D,n(k+1)=PME(𝐱^D,n(k),𝒬¯,Uα,𝒬¯{𝐱},Ne(k))=𝐱𝒬RDα|𝒬RD|𝒞𝒩(𝐱;𝐱^D,n(k),Ne(k)𝐈RD)𝐱𝐱𝒬RDα|𝒬RD|𝒞𝒩(𝐱;𝐱^D,n(k),Ne(k)𝐈RD)+(1α)𝒞𝒩(𝟎;𝐱^D,n(k),Ne(k)𝐈RD).\mathbf{x}_{\text{D},n}^{(k+1)}=\text{PME}\left(\hat{\mathbf{x}}_{\text{D},n}^{(k)},\bar{\mathcal{Q}},U_{\alpha,\bar{\mathcal{Q}}}\{\mathbf{x}\},N_{\text{e}}^{(k)}\right)=\frac{\sum_{\mathbf{x}\in\mathcal{Q}^{R_{\text{D}}}}{\frac{\alpha}{|\mathcal{Q}^{R_{\text{D}}}|}\,\mathcal{CN}\left(\mathbf{x};\hat{\mathbf{x}}_{\text{D},n}^{(k)},N_{\text{e}}^{(k)}\mathbf{I}_{R_{\text{D}}}\right)}\mathbf{x}}{\sum_{\mathbf{x}\in\mathcal{Q}^{R_{\text{D}}}}{\frac{\alpha}{|\mathcal{Q}^{R_{\text{D}}}|}\,\mathcal{CN}\left(\mathbf{x};\hat{\mathbf{x}}_{\text{D},n}^{(k)},N_{\text{e}}^{(k)}\mathbf{I}_{R_{\text{D}}}\right)}+(1-\alpha)\mathcal{CN}\left(\mathbf{0};\hat{\mathbf{x}}_{\text{D},n}^{(k)},N_{\text{e}}^{(k)}\mathbf{I}_{R_{\text{D}}}\right)}. (27)
Figure 4: Architecture details of the DU-ABC and DU-POEM algorithms for JACD, differing only in the backward step.

IV-B PME-Based JACD Algorithm

An alternative technique that considers the discrete constellation constraint in 𝐗D\mathbf{X}_{\text{D}} uses a PME, which denoises the data matrix 𝐗D\mathbf{X}_{\text{D}}; for this approach, we set λ=0\lambda=0. We now only discuss the differences to the box-constrained FBS algorithm from Section IV-A with λ=0\lambda=0, which are in the computation of the proximal operator for 𝐗D\mathbf{X}_{\text{D}}.

In our paper, the PME assumes that one observes a signal of interest through noisy Gaussian observations (see Definition 3 in Section II-B) and yields the estimate that minimizes the mean square error by leveraging the known prior probability of the signal. As in [65, 66], instead of calculating a proximal operator for the data matrix 𝐗D\mathbf{X}_{\text{D}}, we simply denoise the output of the forward step using a carefully designed PME. To this end, we model the nnth UE’s estimated data vector 𝐱^D,n(k)\hat{\mathbf{x}}_{\text{D},n}^{(k)} in the kkth iteration as

𝐱^D,n(k)=𝐱D,n+𝐞(k).\hat{\mathbf{x}}_{\text{D},n}^{(k)}=\mathbf{x}_{\text{D},n}+\mathbf{e}^{(k)}. (26)

Here, 𝐱D,n\mathbf{x}_{\text{D},n} is unknown and 𝐞(k)RD\mathbf{e}^{(k)}\in\mathbb{C}^{R_{\text{D}}} is the estimation error at iteration kk, which we assume to be complex Gaussian following the distribution 𝐞(k)𝒞𝒩(𝟎,Ne(k)𝐈RD)\mathbf{e}^{(k)}\sim\mathcal{CN}(\mathbf{0},N_{\text{e}}^{(k)}\mathbf{I}_{R_{\text{D}}}). The vector 𝐱D,n\mathbf{x}_{\text{D},n} from (26) follows the α\alpha-mixed discrete uniform distribution on 𝒬¯\bar{\mathcal{Q}}, i.e., 𝐱D,nUα,𝒬¯{𝐱D,n}\mathbf{x}_{\text{D},n}\sim U_{\alpha,\bar{\mathcal{Q}}}\{\mathbf{x}_{\text{D},n}\} (see Definition 1 in Section II-B), where α\alpha is the UE activity probability. Accordingly, we can employ the PME of 𝐱D,n\mathbf{x}_{\text{D},n} under the prior Uα,𝒬¯{𝐱D,n}U_{\alpha,\bar{\mathcal{Q}}}\{\mathbf{x}_{\text{D},n}\}, i.e., PME(𝐱^D,n(k),𝒬¯,Uα,𝒬¯{𝐱},Ne(k))\text{PME}(\hat{\mathbf{x}}_{\text{D},n}^{(k)},\bar{\mathcal{Q}},U_{\alpha,\bar{\mathcal{Q}}}\{\mathbf{x}\},N_{\text{e}}^{(k)}) (refer to Definition 3 in Section II-B) as the estimate of the data matrix 𝐗D\mathbf{X}_{\text{D}} [2], expressed as in (27). Algorithm 2 outlines the pseudocode for the proposed PME-based JACD algorithm.

1 Input: 𝐘\mathbf{Y}, 𝐗P\mathbf{X}_{\text{P}}, μh\mu_{h}, μx\mu_{x}, and KK.
2 Initialization: 𝐒(1)=[(𝐇(1))H,𝐗D(1)]H\mathbf{S}^{(1)}=[(\mathbf{H}^{(1)})^{H},\mathbf{X}_{\text{D}}^{(1)}]^{H}.
3 for k=1,,Kk=1,\ldots,K do
4      Forward Step: Calculate 𝐒^(k)\hat{\mathbf{S}}^{(k)} via (5) and (19).
5      Backward Step: Calculate 𝐇(k+1)\mathbf{H}^{(k+1)} and 𝐗D(k+1)\mathbf{X}_{\text{D}}^{(k+1)} via (23) and (27), respectively.
6      end for
Output: 𝐇(K+1)\mathbf{H}^{(K+1)} and 𝐗D(K+1)\mathbf{X}_{\text{D}}^{(K+1)}.
Algorithm 2 PME-Based JACD Algorithm
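For intuition, the exact PME (27) can be evaluated by enumerating all candidate data vectors, which is feasible only for very short data blocks; the following sketch (our own, using QPSK and R_D = 3) makes the exponential cost of (27) explicit. The Gaussian normalization constants are identical in every term and cancel between numerator and denominator, so unnormalized likelihoods suffice.

import numpy as np
from itertools import product

def exact_pme(x_hat, alpha, Ne, constellation, R_D):
    # Exact PME of Eq. (27): mixture prior over {Q^{R_D}, 0} with activity weight alpha
    cands = [np.array(c) for c in product(constellation, repeat=R_D)]  # |Q|^{R_D} vectors
    w_nz = alpha / len(cands)                  # prior weight of each non-zero candidate
    num = np.zeros(R_D, dtype=complex)
    den = 0.0
    for x in cands:
        lik = np.exp(-np.sum(np.abs(x_hat - x)**2) / Ne)  # unnormalized Gaussian likelihood
        num += w_nz * lik * x
        den += w_nz * lik
    den += (1 - alpha) * np.exp(-np.sum(np.abs(x_hat)**2) / Ne)  # zero-vector (inactive) term
    return num / den

qpsk = np.array([1+1j, 1-1j, -1+1j, -1-1j]) / np.sqrt(2)
x_hat = np.array([0.6+0.7j, -0.5-0.6j, 0.1+0.05j])
print(exact_pme(x_hat, alpha=0.2, Ne=0.5, constellation=qpsk, R_D=3))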

IV-C Active User Detection and Data Detection

After estimating the channel and data matrices over KK iterations using box-constrained FBS or PME-based JACD, we proceed with AUD and DD for both algorithms. As in [1], we determine UE activity based on the channel energy. Specifically, if the channel energy of the nnth UE surpasses a threshold TAUDT_{\text{AUD}}, then the UE is deemed active; otherwise, we consider it inactive. Mathematically, we describe AUD as follows:

ξ^n=𝕀{𝐡n(K+1)F2TAUD}.\hat{\xi}_{n}=\mathbb{I}\left\{\left\|\mathbf{h}_{n}^{(K+1)}\right\|_{F}^{2}\geq T_{\text{AUD}}\right\}. (28)

The estimated UE activity indicators {ξ^n}n\{\hat{\xi}_{n}\}_{\forall n} can now be used to update the estimated data matrix as

𝐗^D=diag{ξ^1,,ξ^N}𝐗~D,\hat{\mathbf{X}}_{\text{D}}=\text{diag}\left\{\hat{\xi}_{1},\ldots,\hat{\xi}_{N}\right\}\tilde{\mathbf{X}}_{\text{D}}, (29)

where 𝐗~D𝒬N×RD\tilde{\mathbf{X}}_{\text{D}}\in\mathcal{Q}^{N\times R_{\text{D}}} is obtained by mapping the entries in 𝐗D(K+1)\mathbf{X}_{\text{D}}^{(K+1)} to the nearest symbols in 𝒬\mathcal{Q} as follows:

𝐗~D=argmin𝐗𝒬N×RD𝐗𝐗D(K+1)F2.\tilde{\mathbf{X}}_{\text{D}}=\mathop{\arg\min}_{\scriptstyle{\mathbf{X}\in\mathcal{Q}^{N\times R_{\text{D}}}}}\left\|\mathbf{X}-\mathbf{X}_{\text{D}}^{(K+1)}\right\|_{F}^{2}. (30)
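Putting (28)-(30) together, a compact sketch of the hard-decision AUD and DD steps might look as follows (function names and test values are ours):

import numpy as np

def detect(H_est, XD_est, constellation, T_aud):
    # Energy-based AUD (28), nearest-symbol mapping (30), and row masking (29)
    xi_hat = (np.sum(np.abs(H_est)**2, axis=0) >= T_aud).astype(float)  # per-UE energy test
    dist = np.abs(XD_est[..., None] - constellation[None, None, :])     # N x R_D x |Q|
    XD_tilde = constellation[np.argmin(dist, axis=-1)]                  # nearest symbols
    return xi_hat, xi_hat[:, None] * XD_tilde                           # zero out inactive rows

qpsk = np.array([1+1j, 1-1j, -1+1j, -1-1j]) / np.sqrt(2)
H_est = np.zeros((8, 3), dtype=complex); H_est[:, 1] = 0.8              # only UE 1 "active"
XD_est = 0.1 * (1 + 1j) * np.ones((3, 4))
xi_hat, XD_hat = detect(H_est, XD_est, qpsk, T_aud=1.0)
print(xi_hat, XD_hat[1])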

In Section V-C, we will introduce a trainable soft-output AUD module that leverages information from both the estimated channel and data matrices to improve AUD performance.

IV-D Complexity Comparison

We assess the computational complexity of our algorithms using the number of complex-valued multiplications, which are expressed in big-O notation. For each iteration, the complexity of the forward steps in these algorithms is O(MPNR+MPNRD+𝕀{λ0}NRD𝒞1)O(MPNR+MPNR_{\text{D}}+\mathbb{I}\{\lambda\neq 0\}NR_{\text{D}}\mathcal{C}_{1}). Here, 𝒞1\mathcal{C}_{1} varies depending on the chosen regularizer (for instance, 𝒞1=1\mathcal{C}_{1}=1 for the regularizer 𝒞(𝐗D)=𝐗D𝐗DB2𝟏N×RDF2\mathcal{C}\left(\mathbf{X}_{\text{D}}\right)=-\|\mathbf{X}_{\text{D}}\odot\mathbf{X}_{\text{D}}^{*}-B^{2}\mathbf{1}_{N\times R_{\text{D}}}\|_{F}^{2}). The per-iteration complexities of the backward steps in the box-constrained FBS and PME-based JACD algorithms are O(MPN+NRD)O(MPN+NR_{\text{D}}) and O(MPN+NRD|𝒬|RD)O(MPN+NR_{\text{D}}|\mathcal{Q}|^{R_{\text{D}}}), respectively. In addition, the computational complexity of the AUD and DD module is O(MPN+NRD)O(MPN+NR_{\text{D}}).

The summation in the PME expression (27) leads to a complexity that grows exponentially in the data length RDR_{\text{D}}, which makes the PME-based JACD algorithm more complex than the box-constrained FBS algorithm. In Section V, we leverage DU to tune the hyper-parameters in these algorithms automatically, improving their effectiveness. In addition, we introduce approximations to replace the costly computations in (25) and (27) in the backward steps of these algorithms to further reduce complexity.

V Algorithm Tuning Using Deep-Unfolding

The JACD algorithms introduced in Section IV involve numerous hyper-parameters, making manual parameter tuning challenging. We apply DU to tune all of the involved hyper-parameters automatically. The resulting deep-unfolded algorithms are called the Deep-Unfolding-based Approximate Box-Constrained (DU-ABC) algorithm and the Deep-Unfolding-based aPproximate pOsterior mEan estiMator (DU-POEM) algorithm. Their corresponding deep-unfolded architecture is outlined in Fig. 4.

𝐞rTPME(𝐱^D,n(k),𝒬RD,U𝒬RD{𝐱},Ne(k))=PME(𝐱^D,n(k)(r),𝒬,U𝒬{x},Ne(k))=x𝒬𝒞𝒩(x;𝐱^D,n(k)(r),Ne(k))xx𝒬𝒞𝒩(x;𝐱^D,n(k)(r),Ne(k)),\mathbf{e}_{r}^{T}\text{PME}\left(\hat{\mathbf{x}}_{\text{D},n}^{(k)},\mathcal{Q}^{R_{\text{D}}},U_{\mathcal{Q}^{R_{\text{D}}}}\{\mathbf{x}\},N_{\text{e}}^{(k)}\right)=\text{PME}\left(\hat{\mathbf{x}}_{\text{D},n}^{(k)}(r),\mathcal{Q},U_{\mathcal{Q}}\{x\},N_{\text{e}}^{(k)}\right)=\frac{\sum_{x\in\mathcal{Q}}\mathcal{CN}\left(x;\hat{\mathbf{x}}_{\text{D},n}^{(k)}(r),N_{\text{e}}^{(k)}\right)x}{\sum_{x\in\mathcal{Q}}\mathcal{CN}\left(x;\hat{\mathbf{x}}_{\text{D},n}^{(k)}(r),N_{\text{e}}^{(k)}\right)}, (40)

V-A DU-ABC Algorithm

V-A1 Forward Step

In the forward step of the box-constrained FBS algorithm, the same step size τ(k)\tau^{(k)} is utilized in each iteration to compute both 𝐇^(k)\hat{\mathbf{H}}^{(k)} and 𝐗^D(k)\hat{\mathbf{X}}_{\text{D}}^{(k)}. Due to the vast difference in the dynamic ranges of 𝐇\mathbf{H} and 𝐗D\mathbf{X}_{\text{D}}, for DU-ABC, we introduce separate step sizes τh(k)\tau_{h}^{(k)} and τx(k)\tau_{x}^{(k)} to update 𝐇^(k)\hat{\mathbf{H}}^{(k)} and 𝐗^D(k)\hat{\mathbf{X}}_{\text{D}}^{(k)}, respectively. To accelerate convergence, we apply a momentum strategy, in which the gradient information of all previous iterations is used to compute 𝐒^\hat{\mathbf{S}} in each iteration [2, 66]. Furthermore, we allow the penalty coefficient λ\lambda to be iteration-dependent, i.e., {λ(k)}k\{\lambda^{(k)}\}_{\forall k}. In summary, the forward step (5) of the box-constrained FBS algorithm is modified as

𝐇^(k)\displaystyle\hat{\mathbf{H}}^{(k)} =𝐇(k)+𝐃h(k),\displaystyle=\mathbf{H}^{(k)}+\mathbf{D}_{h}^{(k)}, (31)
𝐗^D(k)\displaystyle\hat{\mathbf{X}}_{\text{D}}^{(k)} =𝐗D(k)+𝐃x(k),\displaystyle=\mathbf{X}_{\text{D}}^{(k)}+\mathbf{D}_{x}^{(k)}, (32)

where the momentum terms 𝐃h(k)\mathbf{D}_{h}^{(k)} and 𝐃x(k)\mathbf{D}_{x}^{(k)} incorporate the gradient information from the first kk iterations, and are given by

𝐃h(k)\displaystyle\!\!\!\mathbf{D}_{h}^{(k)} =τh(k)(𝐘𝐇(k)𝐗(k))(𝐗(k))H+ηh(k)𝐃h(k1),\displaystyle=\tau_{h}^{(k)}\left(\mathbf{Y}-\mathbf{H}^{(k)}\mathbf{X}^{(k)}\right)\left(\mathbf{X}^{(k)}\right)\!^{H}+\eta_{h}^{(k)}\mathbf{D}_{h}^{(k-1)}, (33)
𝐃x(k)\displaystyle\!\!\!\mathbf{D}_{x}^{(k)} =τx(k)((𝐘D𝐇(k)𝐗D(k))H𝐇(k)+λ(k)𝒞(𝐗D)𝐗DT)\displaystyle=\tau_{x}^{(k)}\left(\left(\mathbf{Y}_{\text{D}}-\mathbf{H}^{(k)}\mathbf{X}_{\text{D}}^{(k)}\right)\!^{H}\mathbf{H}^{(k)}+\lambda^{(k)}\frac{\partial\mathcal{C}\left(\mathbf{X}_{\text{D}}\right)}{\partial\mathbf{X}_{\text{D}}^{T}}\right)
+ηx(k)𝐃x(k1).\displaystyle\quad\,+\eta_{x}^{(k)}\mathbf{D}_{x}^{(k-1)}. (34)

Here, ηh(k)\eta_{h}^{(k)} and ηx(k)\eta_{x}^{(k)} are weights for the momentum terms of the quantities 𝐇^(k)\hat{\mathbf{H}}^{(k)} and 𝐗^D(k)\hat{\mathbf{X}}_{\text{D}}^{(k)}, respectively, and 𝐃h(0)=𝟎\mathbf{D}_{h}^{(0)}=\mathbf{0} and 𝐃x(0)=𝟎\mathbf{D}_{x}^{(0)}=\mathbf{0}.
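A sketch of the momentum-accelerated forward step (31)-(34) is given below; the function signature is our own, and grad_C is a placeholder for the regularizer gradient, returned in the shape of X_D.

import numpy as np

def forward_step(Y, H, X_P, X_D, D_h, D_x, tau_h, tau_x, eta_h, eta_x, lam, grad_C):
    # Momentum-accelerated DU-ABC forward step, Eqs. (31)-(34)
    X = np.concatenate([X_P, X_D], axis=1)
    Y_D = Y[:, X_P.shape[1]:]                                 # data part of Y
    D_h = tau_h * (Y - H @ X) @ X.conj().T + eta_h * D_h      # Eq. (33)
    D_x = tau_x * (((Y_D - H @ X_D).conj().T @ H).T
                   + lam * grad_C(X_D)) + eta_x * D_x         # Eq. (34), in X_D's shape
    return H + D_h, X_D + D_x, D_h, D_x                       # Eqs. (31)-(32)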

V-A2 Backward Step

Since we introduce separate step sizes τh(k)\tau_{h}^{(k)} and τx(k)\tau_{x}^{(k)} in the forward step of DU-ABC, we accordingly define trainable parameters μ~h(k)τh(k)μh\tilde{\mu}_{h}^{(k)}\triangleq\tau_{h}^{(k)}\mu_{h} and μ~x(k)τx(k)μx\tilde{\mu}_{x}^{(k)}\triangleq\tau_{x}^{(k)}\mu_{x} for all kk, to facilitate subsequent computations. Consequently, the proximal operator for 𝐇\mathbf{H} in (23) is modified as follows:

𝐡n,p(k+1)=Shrinkage(𝐡^n,p(k),μ~h(k)).\mathbf{h}_{n,p}^{(k+1)}=\textsf{Shrinkage}\left(\hat{\mathbf{h}}_{n,p}^{(k)},\tilde{\mu}_{h}^{(k)}\right). (35)

As for the proximal operation on 𝐗D\mathbf{X}_{\text{D}}, the optimal solution of the problem (24) involves a complicated quartic equation, as mentioned in Proposition 1, thereby preventing the utilization of DU techniques. To solve this problem, we introduce a simpler alternative: first solve problem (24) without considering the constraints to obtain the optimal solution Shrinkage(𝐱^D,n(k),μ~x(k))\textsf{Shrinkage}\big{(}\hat{\mathbf{x}}_{\text{D},n}^{(k)},\tilde{\mu}_{x}^{(k)}\big{)}; then, clamp the result to the convex hull \mathcal{B}. The procedure is

𝐱D,n(k+1)=Clamp(ω(k)Shrinkage(𝐱^D,n(k),μ~x(k))+𝐛(k),B,B),\textstyle\mathbf{x}_{\text{D},n}^{(k+1)}\!=\!\textsf{Clamp}\big{(}\omega^{(k)}\textsf{Shrinkage}\big{(}\hat{\mathbf{x}}_{\text{D},n}^{(k)},\tilde{\mu}_{x}^{(k)}\big{)}\!+\!\mathbf{b}^{(k)},-B,B\big{)}, (36)

with the clamp function defined in (3). In addition, ω(k)\omega^{(k)}\in\mathbb{R} and 𝐛(k)RD\mathbf{b}^{(k)}\in\mathbb{C}^{R_{\text{D}}} are introduced trainable parameters for the coefficient and bias vector at the kkth iteration, respectively, to increase the flexibility of optimization.
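The resulting DU-ABC backward step, combining the per-block shrinkage (35) with the shrink-then-clamp update (36), could be sketched as follows (a loop-based sketch for clarity; names are ours):

import numpy as np

def backward_step_duabc(H_half, XD_half, mu_h, mu_x, omega, b, B, M):
    # DU-ABC backward step: per-(UE, AP) block shrinkage (35), then shrink-and-clamp (36)
    H_next = np.zeros_like(H_half)
    MP, N = H_half.shape
    for n in range(N):
        for p in range(MP // M):                  # shrink each M-antenna channel block
            blk = H_half[p*M:(p+1)*M, n]
            nrm = np.linalg.norm(blk)
            if nrm > 0:
                H_next[p*M:(p+1)*M, n] = blk * max(nrm - mu_h, 0.0) / nrm
    XD_next = np.zeros_like(XD_half)
    for n in range(XD_half.shape[0]):             # per-UE row: shrink, scale, bias, clamp
        row = XD_half[n]
        nrm = np.linalg.norm(row)
        shrunk = row * max(nrm - mu_x, 0.0) / nrm if nrm > 0 else np.zeros_like(row)
        z = omega * shrunk + b
        XD_next[n] = np.clip(z.real, -B, B) + 1j * np.clip(z.imag, -B, B)
    return H_next, XD_next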

For DU-ABC, the trainable hyper-parameters in forward and backward steps are τh(k),ηh(k),τx(k),ηx(k),λ(k),Θ𝒞(k)\tau_{h}^{(k)},\eta_{h}^{(k)},\tau_{x}^{(k)},\eta_{x}^{(k)},\lambda^{(k)},\Theta_{\mathcal{C}}^{(k)} and μ~h(k),μ~x(k),ω(k),𝐛(k)\tilde{\mu}_{h}^{(k)},\tilde{\mu}_{x}^{(k)},\omega^{(k)},\mathbf{b}^{(k)}, respectively, where Θ𝒞(k)\Theta_{\mathcal{C}}^{(k)} denotes the set of trainable hyper-parameters in the regularizer 𝒞(𝐗D)\mathcal{C}(\mathbf{X}_{\text{D}}).

V-B DU-POEM Algorithm

The forward step and the proximal operation on 𝐇\mathbf{H} in the backward step of the DU-POEM algorithm align with those of the DU-ABC algorithm with λ(k)=0,k\lambda^{(k)}=0,\forall k, as detailed in (31)-(35). We therefore focus only on the proximal operation for 𝐗D\mathbf{X}_{\text{D}} in DU-POEM, which differs from that of DU-ABC.

As for the PME of 𝐗D\mathbf{X}_{\text{D}} in the PME-based JACD algorithm, the summation of numerous exponential terms in (27) results in high computational complexity and can lead to numerical stability issues. To mitigate these issues, we present the following proposition, which reveals a linear relationship between the PMEs of 𝐱D,n\mathbf{x}_{\text{D},n} under two specific prior distributions: Uα,𝒬¯{𝐱D,n}U_{\alpha,\bar{\mathcal{Q}}}\{\mathbf{x}_{\text{D},n}\} and U𝒬RD{𝐱D,n}U_{\mathcal{Q}^{R_{\text{D}}}}\{\mathbf{x}_{\text{D},n}\} (refer to Definition 2 in Section II-B).

Proposition 2: Given observation vector 𝐱^=𝐱+𝐞\hat{\mathbf{x}}=\mathbf{x}+\mathbf{e} with Gaussian estimation error 𝐞𝒞𝒩(𝟎,Ne𝐈)\mathbf{e}\sim\mathcal{CN}\left(\mathbf{0},N_{\text{e}}\mathbf{I}\right), there is a linear relationship between PME(𝐱^,𝒮¯,Uα,𝒮¯{𝐱},Ne)\text{PME}(\hat{\mathbf{x}},\bar{\mathcal{S}},U_{\alpha,\bar{\mathcal{S}}}\{\mathbf{x}\},N_{\text{e}}) and PME(𝐱^,𝒮,U𝒮{𝐱},Ne)\text{PME}(\hat{\mathbf{x}},\mathcal{S},U_{\mathcal{S}}\{\mathbf{x}\},N_{\text{e}}), i.e.,

PME(𝐱^,𝒮¯,Uα,𝒮¯{𝐱},Ne)\displaystyle\text{PME}(\hat{\mathbf{x}},\bar{\mathcal{S}},U_{\alpha,\bar{\mathcal{S}}}\{\mathbf{x}\},N_{\text{e}}) (37)
=CPME(𝐱^,𝒮,α,Ne)PME(𝐱^,𝒮,U𝒮{𝐱},Ne),\displaystyle=C_{\text{PME}}(\hat{\mathbf{x}},\mathcal{S},\alpha,N_{\text{e}})\,\text{PME}(\hat{\mathbf{x}},\mathcal{S},U_{\mathcal{S}}\{\mathbf{x}\},N_{\text{e}}),

where 𝒮¯={𝒮,𝟎}\bar{\mathcal{S}}=\{\mathcal{S},\mathbf{0}\}, and the coefficient CPME(𝐱^,𝒮,α,Ne)C_{\text{PME}}(\hat{\mathbf{x}},\mathcal{S},\alpha,N_{\text{e}}) is defined as

CPME(𝐱^,𝒮,α,Ne)=α(α+(1α)|𝒮|𝒞𝒩(𝟎;𝐱^,Ne𝐈)𝐱𝒮𝒞𝒩(𝐱;𝐱^,Ne𝐈))1.\!\!\textstyle C_{\text{PME}}(\hat{\mathbf{x}},\mathcal{S},\alpha,N_{\text{e}})\!=\!\alpha\left(\alpha\!+\!\left(1\!-\!\alpha\right)\!\frac{{|\mathcal{S}|}\,\mathcal{CN}(\mathbf{0};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I})}{\sum_{\mathbf{x}\in\mathcal{S}}\mathcal{CN}(\mathbf{x};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I})}\right)^{-1}\!\!\!. (38)

The proof is given in Appendix B.

According to Proposition 2, we can reformulate the PME of 𝐱D,n\mathbf{x}_{\text{D},n} under the prior Uα,𝒬¯{𝐱}U_{\alpha,\bar{\mathcal{Q}}}\{\mathbf{x}\}, PME(𝐱^D,n(k),𝒬¯,Uα,𝒬¯{𝐱},Ne(k))\text{PME}(\hat{\mathbf{x}}_{\text{D},n}^{(k)},\bar{\mathcal{Q}},U_{\alpha,\bar{\mathcal{Q}}}\{\mathbf{x}\},N_{\text{e}}^{(k)}), in equation (27) as the product of a coefficient and PME(𝐱^D,n(k),𝒬RD,U𝒬RD{𝐱},Ne(k))\text{PME}(\hat{\mathbf{x}}_{\text{D},n}^{(k)},\mathcal{Q}^{R_{\text{D}}},U_{\mathcal{Q}^{R_{\text{D}}}}\{\mathbf{x}\},N_{\text{e}}^{(k)}). The main advantage of this reformulation is that we can further decouple the elements of PME(𝐱^D,n(k),𝒬RD,U𝒬RD{𝐱},Ne(k))\text{PME}(\hat{\mathbf{x}}_{\text{D},n}^{(k)},\mathcal{Q}^{R_{\text{D}}},U_{\mathcal{Q}^{R_{\text{D}}}}\{\mathbf{x}\},N_{\text{e}}^{(k)}) and compute them independently, as explained by Proposition 3. The proof is given in Appendix C.

Proposition 3: Given observation vector 𝐱^=𝐱+𝐞S\hat{\mathbf{x}}=\mathbf{x}+\mathbf{e}\in\mathbb{C}^{S} with 𝐱U𝒮{𝐱}\mathbf{x}\sim U_{\mathcal{S}}\{\mathbf{x}\} and Gaussian estimation error 𝐞𝒞𝒩(𝟎,Ne𝐈S)\mathbf{e}\sim\mathcal{CN}\left(\mathbf{0},N_{\text{e}}\mathbf{I}_{S}\right), we can express the ss-th entry of PME(𝐱^,𝒮,U𝒮{𝐱},Ne)\text{PME}(\hat{\mathbf{x}},\mathcal{S},U_{\mathcal{S}}\{\mathbf{x}\},N_{\text{e}}) as

𝐞sTPME(𝐱^,𝒮,U𝒮{𝐱},Ne)=PME(𝐱^(s),,U{x},Ne),\!\!\!\mathbf{e}_{s}^{T}\text{PME}(\hat{\mathbf{x}},\mathcal{S},U_{\mathcal{S}}\{\mathbf{x}\},N_{\text{e}})\!=\!\text{PME}(\hat{\mathbf{x}}(s),\mathcal{R},U_{\mathcal{R}}\{x\},N_{\text{e}}),\! (39)

where 𝒮=S\mathcal{S}=\mathcal{R}^{S}.

According to Proposition 3, we can calculate the rrth entry of PME(𝐱^D,n(k),𝒬RD,U𝒬RD{𝐱},Ne(k))\text{PME}(\hat{\mathbf{x}}_{\text{D},n}^{(k)},\mathcal{Q}^{R_{\text{D}}},U_{\mathcal{Q}^{R_{\text{D}}}}\{\mathbf{x}\},N_{\text{e}}^{(k)}) via expression (40), which depends only on 𝐱^D,n(k)(r)\hat{\mathbf{x}}_{\text{D},n}^{(k)}(r) and avoids summing a large number of exponential terms, thereby reducing complexity and avoiding numerical stability issues.

Figure 5: Diagrams of the coefficient CPME(x^,𝒮,α,Ne)C_{\text{PME}}(\hat{x},\mathcal{S},\alpha,N_{\text{e}}) with 𝒮={±0.5±j0.5}\mathcal{S}=\{\pm\sqrt{0.5}\pm j\sqrt{0.5}\}, α=0.02\alpha=0.02, and Ne=0.12N_{e}=0.12 (on the left) and its approximation CAPME(x^,ρ,ν)C_{\text{APME}}(\hat{x},\rho,\nu) with ρ=3.49\rho=3.49 and ν=2.46\nu=2.46 (on the right).

Although $\text{PME}(\hat{\mathbf{x}},\mathcal{S},U_{\mathcal{S}}\{\mathbf{x}\},N_{\text{e}})$ in (37) requires low computational complexity and alleviates numerical stability issues, Proposition 2 also introduces the coefficient $C_{\text{PME}}(\hat{\mathbf{x}},\mathcal{S},\alpha,N_{\text{e}})$ in (38), which remains costly to evaluate due to the summation of numerous exponential terms in its denominator. To address this, we propose the following approximate shrinkage operation as a simplified alternative:

$$C_{\text{APME}}(\hat{\mathbf{x}},\rho,\nu)=\begin{cases}\textsf{Clamp}\big(\rho-\frac{\nu}{\|\hat{\mathbf{x}}\|_{F}},0,1\big), & \text{if } \hat{\mathbf{x}}\neq\mathbf{0},\\ 0, & \text{if } \hat{\mathbf{x}}=\mathbf{0},\end{cases} \quad (41)$$

where $\rho$ and $\nu$ are tunable parameters. In Fig. 5, we illustrate $C_{\text{PME}}(\hat{x},\mathcal{S},\alpha,N_{\text{e}})$ with $\mathcal{S}=\{\pm\sqrt{0.5}\pm j\sqrt{0.5}\}$, $\alpha=0.02$, and $N_{\text{e}}=0.12$ alongside its approximation $C_{\text{APME}}(\hat{x},\rho,\nu)$ with $\rho=3.49$ and $\nu=2.46$ in a one-dimensional complex-valued space $\hat{x}\in\mathbb{C}$ for ease of visualization. Evidently, the resulting approximation $C_{\text{APME}}(\hat{x},\rho,\nu)$, shown on the right of Fig. 5, closely resembles $C_{\text{PME}}(\hat{x},\mathcal{S},\alpha,N_{\text{e}})$.

Consequently, we employ an approximate PME at the backward step to replace the exact PME (27) as done in [2]:

$$\mathbf{x}_{\text{D},n}^{(k+1)} = C_{\text{APME}}\big(\hat{\mathbf{x}}_{\text{D},n}^{(k)},\rho^{(k)},\nu^{(k)}\big)\,\text{PME}\big(\hat{\mathbf{x}}_{\text{D},n}^{(k)},\mathcal{Q}^{R_{\text{D}}},U_{\mathcal{Q}^{R_{\text{D}}}}\{\mathbf{x}\},N_{\text{e}}^{(k)}\big). \quad (42)$$

Here, the hyper-parameters $\rho^{(k)}$ and $\nu^{(k)}$ as well as the variance of the estimation error $N_{\text{e}}^{(k)}$ at iteration $k$ are trainable.
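To make the backward step concrete, the following sketch combines the approximate shrinkage (41) with the decoupled per-entry PME to form the update (42); `rho_k`, `nu_k`, and `Ne_k` stand in for the trainable $\rho^{(k)}$, $\nu^{(k)}$, and $N_{\text{e}}^{(k)}$, and the helper repeats the earlier illustrative PME sketch.

```python
# Sketch of the approximate shrinkage (41) and the backward update (42);
# all names and values are illustrative, not part of the specification.
import numpy as np

def pme_entrywise(x_hat, R, Ne):
    # Per-entry posterior mean over constellation R (as in the earlier sketch).
    log_w = -np.abs(R - np.asarray(x_hat)[..., None]) ** 2 / Ne
    w = np.exp(log_w - log_w.max(axis=-1, keepdims=True))
    return (w * R).sum(axis=-1) / w.sum(axis=-1)

def c_apme(x_hat, rho, nu):
    """Approximate coefficient C_APME(x_hat, rho, nu) from (41)."""
    norm = np.linalg.norm(x_hat)
    return 0.0 if norm == 0.0 else float(np.clip(rho - nu / norm, 0.0, 1.0))

qpsk = np.sqrt(0.5) * np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j])
x_hat = np.array([0.9 + 0.6j, -0.2 - 0.8j, 0.05 + 0.02j])
rho_k, nu_k, Ne_k = 3.49, 2.46, 0.12
x_next = c_apme(x_hat, rho_k, nu_k) * pme_entrywise(x_hat, qpsk, Ne_k)  # (42)
```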

In DU-POEM, the trainable hyper-parameters in the forward and backward steps are $\tau_{h}^{(k)},\eta_{h}^{(k)},\tau_{x}^{(k)},\eta_{x}^{(k)}$ and $\tilde{\mu}_{h}^{(k)},N_{\text{e}}^{(k)},\rho^{(k)},\nu^{(k)}$, respectively.

V-C Trainable Soft-Output Active User Detection and Data Detection Modules

Both the sparsity in the channel matrix and that in the data matrix indicate UE activity. To obtain accurate soft information on UE activity $\{L_{n}\}_{\forall n}$, we propose the use of a sigmoid function as in [2] that extracts activity information from both the channel and data matrices as follows:

$$L_{n} = \Big(1+\exp\Big(T_{\text{th}}-\omega_{h}\big\|\mathbf{h}_{n}^{(K+1)}\big\|_{F}^{2}-\omega_{x}\big\|\mathbf{x}_{\text{D},n}^{(K+1)}\big\|_{F}^{2}\Big)\Big)^{-1}. \quad (43)$$

Here, the parameters $\omega_{h}$, $\omega_{x}$, and $T_{\text{th}}$ are tuned using DU. Utilizing this soft information, we detect the UEs' active states by comparing it against a threshold $\bar{L}\in[0,1]$, i.e.,

$$\hat{\xi}_{n}=\mathbb{I}\{L_{n}>\bar{L}\}. \quad (44)$$

The choice of $\bar{L}$ depends on the desired UE miss-detection and false-detection rates, where the UE miss-detection rate is the ratio of the number of active UEs mistakenly deemed inactive to the total number of UEs, and the UE false-detection rate is the ratio of the number of inactive UEs incorrectly classified as active to the total number of UEs. Generally, a larger $\bar{L}$ results in more UEs being detected as inactive, which potentially increases the UE miss-detection rate and reduces the UE false-detection rate. Strictly speaking, as $\bar{L}$ increases, the UE miss-detection rate cannot decrease and the UE false-detection rate cannot increase.
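A minimal sketch of the soft-output AUD rule (43) and the hard decision (44) follows; the weights and threshold below are illustrative placeholders for the DU-tuned $\omega_{h}$, $\omega_{x}$, and $T_{\text{th}}$.

```python
# Sketch of the soft activity score (43) and the threshold test (44);
# w_h, w_x, T_th, and the test inputs are illustrative placeholders.
import numpy as np

def soft_activity(h_n, x_n, w_h, w_x, T_th):
    """Soft UE-activity score L_n from channel and data estimates, per (43)."""
    s = T_th - w_h * np.linalg.norm(h_n) ** 2 - w_x * np.linalg.norm(x_n) ** 2
    return 1.0 / (1.0 + np.exp(s))  # sigmoid

L_n = soft_activity(h_n=0.3 * np.ones(8), x_n=0.5 * np.ones(16),
                    w_h=1.0, w_x=1.0, T_th=2.0)
xi_hat = int(L_n > 0.5)  # hard activity decision (44) with L_bar = 0.5
```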

The DD modules of the DU-ABC and DU-POEM algorithms are the same as in (29) and (30).

V-D Training Procedure

In our previous work [2], we trained the hyper-parameters of the unfolded layers and of the AUD module separately. As a result, the AUD module could not guide the unfolded layers toward estimates that improve AUD performance, and the unfolded layers, in turn, were not fully optimized due to the lack of feedback from the AUD module; this one-way interaction limits the effectiveness of the parameter tuning. To achieve better JACD performance, we therefore jointly train the hyper-parameters of the unfolded layers and of the AUD module using the following loss function:

$$\text{Loss}=\big\|\operatorname{diag}\{L_{1},\ldots,L_{N}\}\,\mathbf{X}_{\text{D}}^{(K+1)}-\mathbf{X}_{\text{D}}\big\|_{F}^{2}. \quad (45)$$

This loss function emphasizes the precision of both AUD and DD, aligning with the objectives of practical wireless communication systems. We note that the loss function (45) contains no error term comparing the estimated channel matrix with the actual channel matrix, i.e., we do not explicitly optimize our algorithms for CE accuracy.
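Assuming a PyTorch training loop, the loss (45) reduces to a one-line broadcast; the tensor names and shapes below are illustrative.

```python
# Sketch of the joint training loss (45); L holds the soft activity scores
# and X_D_hat the unfolded network's data estimate (illustrative shapes).
import torch

def jacd_loss(L, X_D_hat, X_D):
    """|| diag(L_1,...,L_N) X_D^(K+1) - X_D ||_F^2, as in (45)."""
    return torch.sum(torch.abs(L.unsqueeze(1) * X_D_hat - X_D) ** 2)

N, R_D = 400, 200
L = torch.rand(N)                                  # soft activity scores
X_D_hat = torch.randn(N, R_D, dtype=torch.cfloat)  # estimated data matrix
X_D = torch.randn(N, R_D, dtype=torch.cfloat)      # ground-truth data matrix
loss = jacd_loss(L, X_D_hat, X_D)
```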

Note that the shrinkage operation (2) and the approximate shrinkage operation (41) applied in the DU-based algorithms have no gradient at $\mathbf{0}$. To circumvent this issue, we replace the denominator $\|\mathbf{x}\|_{F}$ in these equations with $\sqrt{\|\mathbf{x}\|_{F}^{2}+\epsilon}$, which produces valid gradients and avoids a zero denominator; here, $\epsilon$ is a small value (we use $\epsilon=10^{-40}$ in the simulations).
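The smoothing trick can be sketched as follows; the gradient at the all-zero point is then zero rather than NaN.

```python
# Sketch of the epsilon-smoothed norm used to obtain valid gradients at 0;
# it replaces ||x||_F with sqrt(||x||_F^2 + eps) inside (2) and (41).
import torch

def smooth_norm(x, eps=1e-40):
    """Differentiable surrogate for ||x||_F that avoids the NaN gradient at 0."""
    return torch.sqrt(torch.sum(torch.abs(x) ** 2) + eps)

x = torch.zeros(4, dtype=torch.double, requires_grad=True)
smooth_norm(x).backward()
print(x.grad)  # all zeros instead of NaN
```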

V-E Complexity Comparison

The computational complexity per iteration of the forward steps in both the DU-ABC and DU-POEM algorithms is $O(MPNR+MPNR_{\text{D}}+\mathbb{I}\{\lambda\neq 0\}NR_{\text{D}}\mathcal{C}_{1})$. For the backward steps, the per-iteration complexity is $O(MPN+NR_{\text{D}})$ for the DU-ABC algorithm and $O(MPN+NR_{\text{D}}|\mathcal{Q}|)$ for the DU-POEM algorithm. The computational complexity of the trainable AUD and DD modules is $O(MPN+NR_{\text{D}})$.

By approximating the PME expression (27) through Propositions 2 and 3, we significantly reduce the complexity of the backward step in the DU-POEM algorithm to $O(MPN+NR_{\text{D}}|\mathcal{Q}|)$, in contrast to $O(MPN+NR_{\text{D}}|\mathcal{Q}|^{R_{\text{D}}})$ for the PME-based JACD algorithm. This reduction makes the computational complexities of the DU-ABC and DU-POEM algorithms comparable, improving their efficiency and practicality.

VI Simulation Results

We now demonstrate the efficacy of our proposed JACD algorithms and compare them to existing baseline methods.

VI-A Simulation Setup

Building upon the system settings from [1, 2, 54], we consider a cell-free wireless communication system in an area of 500 m × 500 m. Unless stated otherwise, we use the following assumptions. We consider $P=60$ uniformly distributed APs at a height of $h_{\text{AP}}=15$ m, each with $M=4$ antennas, that serve $N=400$ uniformly distributed single-antenna UEs at a height of $h_{\text{UE}}=1.65$ m. We set the UE activity probability to $\alpha=0.2$. Active UEs transmit $R_{\text{P}}=50$ pilot signals, originating from a complex equiangular tight frame as described in [67], and $R_{\text{D}}=200$ QPSK data signals over a channel with a bandwidth of 20 MHz and a carrier frequency of $f_{c}=1900$ MHz. These signals satisfy the energy constraints $\|\mathbf{x}_{\text{P},n}\|_{F}^{2}=R_{\text{P}}$ and $\mathbb{E}\{\|\mathbf{X}_{\text{D}}(n,r)\|_{F}^{2}\}=1$, implying that $B=\sqrt{0.5}$ for QPSK. We assume that the UEs' transmit power is 0.1 W, with power control allowing for a dynamic range of up to 12 dB between the strongest and weakest UE [54]. Moreover, we account for a shadow-fading variance of $\sigma_{\text{sh}}^{2}=8$ dB, a noise figure of 9 dB, and a noise temperature of 290 K.

Regarding the channel model, we assume that the small-scale fading parameters follow the standard complex Gaussian distribution, while the large-scale fading follows a three-slope path-loss model [54, 25, 68]. Specifically, the large-scale fading between the $n$-th UE and the $p$-th AP, denoted as $\beta_{n,p}$, is given by [25, Eq. (52)], [54], [68]

$$\beta_{n,p}=\begin{cases}10^{\frac{-L+z_{n,p}}{10}}(d_{n,p})^{-3.5}, & \text{if } d_{n,p}>D_{1},\\ 10^{-\frac{L}{10}}(D_{1})^{-1.5}(d_{n,p})^{-2}, & \text{if } D_{0}<d_{n,p}\leq D_{1},\\ 10^{-\frac{L}{10}}(D_{1})^{-1.5}(D_{0})^{-2}, & \text{if } d_{n,p}\leq D_{0}.\end{cases} \quad (46)$$

Here, $d_{n,p}$ [km] is the distance between the $n$-th UE and the $p$-th AP, with breakpoints at $D_{0}=0.01$ km and $D_{1}=0.05$ km [25]. Besides, the shadow fading follows $z_{n,p}\sim\mathcal{N}(0,\sigma_{\text{sh}}^{2})$, and $L\triangleq 45.5+35.46\log_{10}(f_{c})-13.82\log_{10}(h_{\text{AP}})-(1.1\log_{10}(f_{c})-0.7)h_{\text{UE}}$ [25, Eq. (53)].
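The three-slope model (46) is straightforward to reproduce; the sketch below uses the simulation parameters above and applies shadow fading only on the first slope, as in (46).

```python
# Numerical sketch of the three-slope large-scale fading model (46);
# parameter values follow the simulation setup above.
import numpy as np

f_c, h_AP, h_UE = 1900.0, 15.0, 1.65   # MHz, m, m
sigma_sh = np.sqrt(8.0)                # shadow-fading std (variance 8 dB)
D0, D1 = 0.01, 0.05                    # breakpoints in km
L = (45.5 + 35.46 * np.log10(f_c) - 13.82 * np.log10(h_AP)
     - (1.1 * np.log10(f_c) - 0.7) * h_UE)

def beta(d, z_dB=0.0):
    """Large-scale fading beta_{n,p} for a UE-AP distance d [km], per (46)."""
    if d > D1:
        return 10 ** ((-L + z_dB) / 10) * d ** (-3.5)
    if d > D0:
        return 10 ** (-L / 10) * D1 ** (-1.5) * d ** (-2)
    return 10 ** (-L / 10) * D1 ** (-1.5) * D0 ** (-2)

print(beta(0.2, z_dB=sigma_sh * np.random.randn()))
```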

VI-B Baseline Methods

Since the PME-based JACD algorithm involves tuning numerous hyper-parameters and incurs a high computational complexity, this section presents performance simulations only for its deep-unfolded variant, DU-POEM, alongside the box-constrained FBS algorithm and DU-ABC. To assess the effectiveness of our algorithms, we introduce the following baseline methods for comparison:

  • Baseline 1: In this baseline, we first employ the FBS method [31] to estimate sparse channels from the system model $\mathbf{Y}_{\text{P}}=\mathbf{H}\mathbf{X}_{\text{P}}+\mathbf{N}$. Then, active UEs are identified via equation (28) based on the estimated channels $\tilde{\mathbf{H}}_{\text{tmp}}$, resulting in the active UE set $\{\hat{\xi}_{1},\ldots,\hat{\xi}_{N}\}$. Subsequently, we perform DD through $\tilde{\mathbf{X}}_{\text{D}}=(\tilde{\mathbf{H}}^{H}\tilde{\mathbf{H}})^{-1}\tilde{\mathbf{H}}^{H}\mathbf{Y}_{\text{D}}$, where $\tilde{\mathbf{H}}=\tilde{\mathbf{H}}_{\text{tmp}}\,\text{diag}\{\hat{\xi}_{1},\ldots,\hat{\xi}_{N}\}$. Finally, we map the result $\tilde{\mathbf{X}}_{\text{D}}$ to the nearest constellation symbols using equation (30); a code sketch of this detection stage follows the list below.

  • Baseline 2: In this baseline, we utilize the AMP algorithm [22] to estimate sparse channels, while all other components remain unchanged from Baseline 1.

  • Baseline 3: This baseline retains all components of Baseline 1, except that we employ a soft MMSE-based iterative detection method [69] for DD.

  • Baseline 4: We adopt a joint AUD-CE-DD method as proposed in [47], which combines Bi-GAMP for CE and DD, alongside sum-product loopy belief propagation (LBP) for AUD.

  • Baseline 5: This baseline implements the FBS-based approach from [54] for joint AUD-CE-DD, with active UEs identified using equation (28) based on the estimated channels.
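For concreteness, the following sketch reproduces the detection stage of Baseline 1 on synthetic inputs; all shapes and values are illustrative, and a pseudo-inverse replaces the matrix inverse since the zeroed columns of $\tilde{\mathbf{H}}$ make $\tilde{\mathbf{H}}^{H}\tilde{\mathbf{H}}$ singular.

```python
# Illustrative sketch of Baseline 1's DD stage, assuming the channel
# estimate H_tmp and the activity decisions xi_hat are already available.
import numpy as np

MP, N, R_D = 240, 400, 200
H_tmp = np.random.randn(MP, N) + 1j * np.random.randn(MP, N)  # est. channels
xi_hat = (np.random.rand(N) < 0.2).astype(float)              # AUD decisions
Y_D = np.random.randn(MP, R_D) + 1j * np.random.randn(MP, R_D)

H = H_tmp * xi_hat[None, :]                    # H_tmp diag{xi_1,...,xi_N}
# Least-squares data estimate; pinv handles the singular Gram matrix.
X_D = np.linalg.pinv(H.conj().T @ H) @ H.conj().T @ Y_D
qpsk = np.sqrt(0.5) * np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j])
X_D = qpsk[np.argmin(np.abs(X_D[..., None] - qpsk), axis=-1)]  # nearest symbol
```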

To accelerate convergence, we use the result of Baseline 1 to initialize Baseline 5, the box-constrained FBS, DU-ABC, and DU-POEM algorithms. In addition, we carry out a maximum of $K=200$ iterations for Baselines 1 to 5 and the box-constrained FBS algorithm, with a stopping tolerance of $10^{-3}$ for FBS. We use $K=10$ layers (equal to the maximum number of iterations) for DU-ABC and DU-POEM. To ensure a fair comparison with DU-ABC and DU-POEM, we also present the results of the high-performance Baselines 2, 4, and 5 and the box-constrained FBS algorithm with only 10 iterations.

To illustrate the trade-off between JACD performance and computational complexity, we provide the computational complexities of Baselines 1-5 before analyzing their performance. Specifically, their computational complexities are $O(MPNR_{\text{P}}K_{\text{iter}}+MPN^{2}+N^{3}+MPNR_{\text{D}})$, $O(MPNR_{\text{P}}K_{\text{iter}}+MPN^{2}+N^{3}+MPNR_{\text{D}})$, $O(MPNR_{\text{P}}K_{\text{iter}}+N^{3}R_{\text{D}}+N^{2}MPR_{\text{D}})$, $O(N^{3}K_{\text{iter}})$, and $O(MPNRK_{\text{iter}}+MPNR_{\text{D}}K_{\text{iter}})$, respectively, where $K_{\text{iter}}$ is the number of iterations of these baselines.

VI-C Performance Metrics

To evaluate the performance of the proposed algorithms and the baseline methods, we consider the following performance metrics: UE detection error rate (UDER), channel estimation normalized mean square error (NMSE), and average symbol error rate (ASER), which are defined as follows:

$$\textit{UDER}=\frac{\sum_{n=1}^{N}\big|\xi_{n}-\hat{\xi}_{n}\big|}{N}, \quad (47)$$
$$\textit{NMSE}=\frac{\big\|\mathbf{H}-\mathbf{H}^{(K+1)}\big\|_{F}^{2}}{\|\mathbf{H}\|_{F}^{2}}, \quad (48)$$
$$\textit{ASER}=\frac{\sum_{n,r}\xi_{n}\,\mathbb{I}\big\{\mathbf{X}_{\text{D}}(n,r)\neq\tilde{\mathbf{X}}_{\text{D}}(n,r)\big\}}{R_{\text{D}}N_{a}}. \quad (49)$$

Furthermore, we consider a receiver operating characteristic (ROC) curve analysis to explore the trade-off between the true positive rate (TPR) and the false positive rate (FPR) for AUD, which are defined as follows:

$$\textit{TPR}=\frac{\sum_{n=1}^{N}\mathbb{I}\{\xi_{n}=1,\hat{\xi}_{n}=1\}}{N_{a}}, \quad (50)$$
$$\textit{FPR}=\frac{\sum_{n=1}^{N}\mathbb{I}\{\xi_{n}=0,\hat{\xi}_{n}=1\}}{N-N_{a}}. \quad (51)$$
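The metrics (47)-(51) translate directly into code; the sketch below assumes the ground-truth and estimated quantities are available as NumPy arrays, with names of our choosing.

```python
# Sketch of the performance metrics (47)-(51); xi holds the true activity
# indicators, xi_hat the decisions, H/H_hat the channels, X_D/X_D_hat the data.
import numpy as np

def metrics(xi, xi_hat, H, H_hat, X_D, X_D_hat):
    N_a = xi.sum()  # number of active UEs
    uder = np.abs(xi - xi_hat).sum() / xi.size                            # (47)
    nmse = np.linalg.norm(H - H_hat) ** 2 / np.linalg.norm(H) ** 2        # (48)
    aser = (xi[:, None] * (X_D != X_D_hat)).sum() / (X_D.shape[1] * N_a)  # (49)
    tpr = np.sum((xi == 1) & (xi_hat == 1)) / N_a                         # (50)
    fpr = np.sum((xi == 0) & (xi_hat == 1)) / (xi.size - N_a)             # (51)
    return uder, nmse, aser, tpr, fpr
```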

The results shown next are from 5000 Monte Carlo trials.

Figure 6: AUD, CE, and DD performance comparison for numbers of APs $P=20,40,60,80,100$, as well as the ROC at $P=60$: (a) UDER, (b) NMSE, (c) ASER, (d) ROC.

VI-D Simulation Results

VI-D1 Performance Analysis for Different AP Numbers

In Fig. 6, we evaluate the JACD performance of the various algorithms for different numbers of APs. Figs. 6(a)-6(c) show that, when running 200 iterations, the box-constrained FBS algorithm consistently outperforms all other considered methods in terms of NMSE and ASER across the considered AP numbers. In addition, its UDER performance surpasses that of most baseline methods. The superior performance of the box-constrained FBS algorithm is primarily due to its effective utilization of the block sparsity of the channel matrix and the row sparsity of the data matrix.

Furthermore, when executing 10 iterations, DU-ABC and DU-POEM surpass all baseline methods in terms of ASER and UDER while maintaining comparable NMSE performance. The superiority of these DU-based algorithms in UDER and ASER can be attributed to the precise tuning of the algorithm hyper-parameters through DU. Their moderate NMSE performance stems from the fact that the loss function primarily focuses on AUD and DD without accounting for CE accuracy.

Figure 7: AUD, CE, and DD performance comparison under activity probabilities $\alpha=0.1,0.15,0.2,0.25,0.3$, as well as the ROC at $\alpha=0.15$: (a) UDER, (b) NMSE, (c) ASER, (d) ROC.

Some algorithms do not converge within these limited iterations, which results in the following phenomena: (i) the NMSE of Baseline 5 and of the box-constrained FBS algorithm with 10 iterations exhibits instability as the number of APs increases, and (ii) the AMP-based Baseline 2 and the Bi-GAMP-based Baseline 4 suffer significant JACD performance degradation when limited to 10 iterations compared to their performance with 200 iterations. In addition, the UDER of DU-ABC at $P=100$ is inferior to that at $P=80$, which results from the optimization objective (45) integrating both AUD and DD performance: in such scenarios, the optimization process may prioritize the precision of DD and compromise the performance of AUD to obtain a smaller loss value.

In Fig. 6(d), the ROC of the various algorithms is depicted at $P=60$, where Baselines 1 and 3 run 200 iterations while the remaining algorithms use only 10 iterations. Notably, the AMP-based Baseline 2 and the Bi-GAMP-based Baseline 4 perform worst with 10 iterations. In contrast, the DU-ABC and DU-POEM algorithms manifest superior AUD performance, as evidenced by their elevated TPR at the same FPR. On an ROC curve, an increase in TPR corresponds to a higher (or identical) FPR, which highlights that the selection of the thresholds $\bar{L}$ and $T_{\text{AUD}}$ embodies a balance between TPR and FPR, contingent upon the targeted values of each metric.

In summary, when running 200 iterations, the proposed box-constrained FBS algorithm generally achieves the best AUD, CE, and DD performance across various numbers of APs. When limited to only 10 iterations, the proposed DU-ABC and DU-POEM algorithms demonstrate the best and second-best AUD and DD performance, respectively, in most considered scenarios, while their CE performance is comparable to that of the baseline algorithms.

VI-D2 Performance Analysis for Different UE Activity Probabilities

In Fig. 7, we present a comparative analysis of JACD performance across various UE activity probabilities; the zero UDER of Baseline 2, running 200 iterations under the considered activity probabilities, cannot be shown on the logarithmic axis in Fig. 7(a). First, Figs. 7(a)-7(c) illustrate a pronounced inverse relationship between the UE activity probability and the JACD performance of the different algorithms. With 200 iterations, the proposed box-constrained FBS algorithm exhibits the best CE performance and achieves superior performance in AUD and DD. Conversely, with a reduced iteration count of 10, the DU-ABC and DU-POEM algorithms generally outperform the other benchmarks in AUD and DD despite their subpar CE performance, which is due to the exclusion of a channel estimation accuracy metric from the loss function.

Analogous to the observations in Fig. 6, the DD performance of both the DU-ABC and DU-POEM algorithms declines marginally at $\alpha=0.15$ relative to $\alpha=0.2$. This is because the loss function of our DU-based algorithms considers both AUD and DD performance, which can lead to the prioritization of one aspect over the other for a lower overall loss value. Figs. 7(a) and 7(c) show that the DU-based algorithms prioritize DD performance at higher activity probabilities ($\alpha=0.25,0.3$) and AUD accuracy at lower ones ($\alpha=0.1,0.15$).

In Fig. 7(d), we present the ROC comparison of the various algorithms at $\alpha=0.15$, where Baselines 1 and 3 undergo 200 iterations while all other algorithms complete 10 iterations each. The AMP-based Baseline 2 and the Bi-GAMP-based Baseline 4 demonstrate the worst AUD performance. Moreover, our proposed DU-ABC and DU-POEM algorithms exhibit superior AUD performance, attaining the highest TPR for a given FPR.

To sum up, when running 200 iterations, the proposed box-constrained FBS algorithm typically achieves the best performance in AUD, CE, and DD. Conversely, the proposed DU-based algorithms generally surpass the other baseline methods in AUD and DD performance with only $K=10$ iterations.

VII Conclusions

We have proposed a novel framework for joint active user detection, channel estimation, and data detection (JACD) for massive grant-free transmission in cell-free wireless communication systems. From this framework, we have developed several computationally efficient JACD algorithms, namely the box-constrained FBS and PME-based JACD algorithms, accompanied by their deep-unfolded versions, DU-ABC and DU-POEM. When running 200 algorithm iterations, the box-constrained FBS algorithm often exhibits superior JACD performance. When running only 10 iterations, the proposed DU-ABC and DU-POEM algorithms usually outperform all considered baseline methods significantly regarding active user detection and data detection performance. The findings of this paper are expected to establish a solid foundation for the development of algorithms for massive machine-type communication.

Appendix A Proof of Proposition 1

As delineated in [1], we recast the complex-valued optimization problem (24) into a real-valued problem:

$$\mathbf{r}_{n}^{(k+1)}=\mathop{\arg\min}_{\mathbf{r}_{n}^{(k+1)}\in\mathbb{R}^{2R_{\text{D}}}}\ \frac{1}{2}\big\|\mathbf{r}_{n}^{(k+1)}-\hat{\mathbf{r}}_{n}^{(k)}\big\|_{F}^{2}+\tau^{(k)}\mu_{x}\big\|\mathbf{r}_{n}^{(k+1)}\big\|_{F} \quad (52)$$
$$\text{s.t. } -B\leq\mathbf{r}_{n}^{(k+1)}(d)\leq B,\ \forall d\in\{1,\ldots,2R_{\text{D}}\},$$

where $\mathbf{r}_{n}^{(k+1)}\triangleq[\text{Re}\{\mathbf{x}_{\text{D},n}^{(k+1)}\}^{T},\text{Im}\{\mathbf{x}_{\text{D},n}^{(k+1)}\}^{T}]^{T}\in\mathbb{R}^{2R_{\text{D}}}$ and $\hat{\mathbf{r}}_{n}^{(k)}\triangleq[\text{Re}\{\hat{\mathbf{x}}_{\text{D},n}^{(k)}\}^{T},\text{Im}\{\hat{\mathbf{x}}_{\text{D},n}^{(k)}\}^{T}]^{T}\in\mathbb{R}^{2R_{\text{D}}}$. With the Lagrangian function $L(\mathbf{r}_{n}^{(k+1)},\mathbf{p},\mathbf{q})=\frac{1}{2}\|\mathbf{r}_{n}^{(k+1)}-\hat{\mathbf{r}}_{n}^{(k)}\|_{F}^{2}+\tau^{(k)}\mu_{x}\|\mathbf{r}_{n}^{(k+1)}\|_{F}+\sum_{d}\mathbf{p}(d)(\mathbf{r}_{n}^{(k+1)}(d)-B)-\sum_{d}\mathbf{q}(d)(\mathbf{r}_{n}^{(k+1)}(d)+B)$, the KKT conditions of the optimization problem (52) are as follows [1]:

$$\mathbf{r}_{n}^{(k+1)}+\frac{\tau^{(k)}\mu_{x}}{\|\mathbf{r}_{n}^{(k+1)}\|_{F}}\mathbf{r}_{n}^{(k+1)}-\hat{\mathbf{r}}_{n}^{(k)}+\mathbf{p}-\mathbf{q}=\mathbf{0}, \quad (53a)$$
$$\mathbf{r}_{n}^{(k+1)}(d)-B\leq 0,\ -\mathbf{r}_{n}^{(k+1)}(d)-B\leq 0,\ \forall d, \quad (53b)$$
$$\mathbf{p}(d)\geq 0,\ \mathbf{q}(d)\geq 0,\ \forall d, \quad (53c)$$
$$\mathbf{p}(d)\big(\mathbf{r}_{n}^{(k+1)}(d)-B\big)=0,\ \forall d, \quad (53d)$$
$$\mathbf{q}(d)\big(\mathbf{r}_{n}^{(k+1)}(d)+B\big)=0,\ \forall d, \quad (53e)$$

where $\mathbf{r}_{n}^{(k+1)}\neq\mathbf{0}$. The solution of equation (53a) is given by

$$\mathbf{r}_{n}^{(k+1)}=\frac{\max\big\{\|\hat{\mathbf{r}}_{n}^{(k)}-\mathbf{p}+\mathbf{q}\|_{F}-\tau^{(k)}\mu_{x},0\big\}}{\|\hat{\mathbf{r}}_{n}^{(k)}-\mathbf{p}+\mathbf{q}\|_{F}}\big(\hat{\mathbf{r}}_{n}^{(k)}-\mathbf{p}+\mathbf{q}\big), \quad (54)$$

with $\|\hat{\mathbf{r}}_{n}^{(k)}-\mathbf{p}+\mathbf{q}\|_{F}>\tau^{(k)}\mu_{x}$. Here, expression (54) implies that the vectors $\mathbf{p}$ and $\mathbf{q}$ serve to diminish the values of some entries of $\hat{\mathbf{r}}_{n}^{(k)}$, thereby confining the elements of $\mathbf{r}_{n}^{(k+1)}$ to the interval $[-B,B]$ and, consequently, guaranteeing that $\mathbf{r}_{n}^{(k+1)}$ meets the KKT conditions.

To identify the set of elements of $\hat{\mathbf{r}}_{n}^{(k)}$ that potentially fall outside the interval $[-B,B]$, we first compute an initial solution $\mathbf{r}_{n,\text{tmp}}^{(k+1)}=\textsf{Shrinkage}\big(\hat{\mathbf{r}}_{n}^{(k)},\tau^{(k)}\mu_{x}\big)$ of equation (52) without constraints, i.e., with $\mathbf{p}=\mathbf{q}=\mathbf{0}$. We thereby obtain the sets $\mathcal{S}_{p}\triangleq\{d:\mathbf{r}_{n,\text{tmp}}^{(k+1)}(d)>B\}$ and $\mathcal{S}_{q}\triangleq\{d:\mathbf{r}_{n,\text{tmp}}^{(k+1)}(d)<-B\}$, which indicate the indices of the non-zero elements of $\mathbf{p}$ and $\mathbf{q}$, respectively. There are three cases:

  • 1) If $d\in\mathcal{S}_{p}$, then we set $\mathbf{p}(d)>0$ and $\mathbf{q}(d)=0$ to reduce the proximal coefficient $\frac{\max\{\|\hat{\mathbf{r}}_{n}^{(k)}-\mathbf{p}+\mathbf{q}\|_{F}-\tau^{(k)}\mu_{x},0\}}{\|\hat{\mathbf{r}}_{n}^{(k)}-\mathbf{p}+\mathbf{q}\|_{F}}$ and to ensure that the corresponding value of the optimal vector satisfies $\mathbf{r}_{n}^{(k+1)}(d)=B$.

  • 2) If $d\in\mathcal{S}_{q}$, then we set $\mathbf{p}(d)=0$ and $\mathbf{q}(d)>0$ to reduce the proximal coefficient $\frac{\max\{\|\hat{\mathbf{r}}_{n}^{(k)}-\mathbf{p}+\mathbf{q}\|_{F}-\tau^{(k)}\mu_{x},0\}}{\|\hat{\mathbf{r}}_{n}^{(k)}-\mathbf{p}+\mathbf{q}\|_{F}}$ and to ensure that the corresponding value of the optimal vector satisfies $\mathbf{r}_{n}^{(k+1)}(d)=-B$.

  • 3) If $d\notin\mathcal{S}_{p}\cup\mathcal{S}_{q}$, then we set $\mathbf{p}(d)=0$ and $\mathbf{q}(d)=0$, because the non-negativity of $\mathbf{p}$ and $\mathbf{q}$ ensures that the proximal coefficient $\frac{\max\{\|\hat{\mathbf{r}}_{n}^{(k)}-\mathbf{p}+\mathbf{q}\|_{F}-\tau^{(k)}\mu_{x},0\}}{\|\hat{\mathbf{r}}_{n}^{(k)}-\mathbf{p}+\mathbf{q}\|_{F}}$ does not exceed $\frac{\max\{\|\hat{\mathbf{r}}_{n}^{(k)}\|_{F}-\tau^{(k)}\mu_{x},0\}}{\|\hat{\mathbf{r}}_{n}^{(k)}\|_{F}}$, so that $|\mathbf{r}_{n}^{(k+1)}(d)|\leq|\mathbf{r}_{n,\text{tmp}}^{(k+1)}(d)|$. Consequently, $\mathbf{r}_{n}^{(k+1)}(d)$ still satisfies the conditions.

Given the index sets $\mathcal{S}_{p}$ and $\mathcal{S}_{q}$ corresponding to the non-zero entries of the vectors $\mathbf{p}$ and $\mathbf{q}$, we can precisely determine the values of these non-zero entries. For brevity, we introduce the notation $\mathbf{m}=\hat{\mathbf{r}}_{n}^{(k)}-\mathbf{p}+\mathbf{q}$ and $b=\frac{\max\{\|\mathbf{m}\|_{F}-\tau^{(k)}\mu_{x},0\}}{\|\mathbf{m}\|_{F}}$. Since $\|\mathbf{m}\|_{F}>\tau^{(k)}\mu_{x}$, we have $b=\frac{\|\mathbf{m}\|_{F}-\tau^{(k)}\mu_{x}}{\|\mathbf{m}\|_{F}}\in(0,1]$ and $\mathbf{r}_{n}^{(k+1)}=b\,\mathbf{m}$, i.e., $\mathbf{m}(d)=\hat{\mathbf{r}}_{n}^{(k)}(d)\,\mathbb{I}\{d\notin\mathcal{S}_{p}\cup\mathcal{S}_{q}\}+\frac{B}{b}\,\mathbb{I}\{d\in\mathcal{S}_{p}\}-\frac{B}{b}\,\mathbb{I}\{d\in\mathcal{S}_{q}\}$. Accordingly, $\|\mathbf{m}\|_{F}$ can be rewritten as $\|\mathbf{m}\|_{F}=\sqrt{C_{2}B^{2}/b^{2}+C_{1}}$, where $C_{1}=\sum_{d\notin\mathcal{S}_{p}\cup\mathcal{S}_{q}}\hat{\mathbf{r}}_{n}^{(k)}(d)^{2}$ and $C_{2}=|\mathcal{S}_{p}\cup\mathcal{S}_{q}|$. Substituting this into $b=\frac{\|\mathbf{m}\|_{F}-\tau^{(k)}\mu_{x}}{\|\mathbf{m}\|_{F}}$ yields a quartic equation in $b$:

$$2C_{1}b^{4}-4C_{1}b^{3}+(2C_{1}+C_{2}+C_{3})b^{2}-2C_{2}b+C_{2}=0,$$

where $C_{3}=-2(\tau^{(k)}\mu_{x})^{2}$. Among the four solutions, the desired one lies in the range $(0,1]$.

If the quartic equation above has no solution in the range $(0,1]$, the case $\mathbf{r}_{n}^{(k+1)}\neq\mathbf{0}$ admits no solution of problem (52), and we set $b=0$ and $\mathbf{r}_{n}^{(k+1)}=\mathbf{0}$.
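Numerically, the root selection can be sketched as follows, assuming the quartic coefficients $C_{1}$, $C_{2}$, and $C_{3}$ defined above; `np.roots` returns all four (possibly complex) roots, from which a real root in $(0,1]$ is retained.

```python
# Sketch of solving the quartic in b and keeping a real root in (0, 1];
# the fallback b = 0 handles the no-solution case discussed above.
import numpy as np

def solve_b(C1, C2, C3):
    """Return a real root of the quartic in (0, 1], or 0.0 if none exists."""
    roots = np.roots([2 * C1, -4 * C1, 2 * C1 + C2 + C3, -2 * C2, C2])
    real = roots[np.abs(roots.imag) < 1e-9].real
    valid = real[(real > 0) & (real <= 1)]
    return float(valid[0]) if valid.size else 0.0
```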

This completes the proof.

Appendix B Proof of Proposition 2

Given observation vector $\hat{\mathbf{x}}=\mathbf{x}+\mathbf{e}$ with $\mathbf{x}\sim U_{\mathcal{S}}\{\mathbf{x}\}$ and Gaussian estimation error $\mathbf{e}\sim\mathcal{CN}(\mathbf{0},N_{\text{e}}\mathbf{I})$, we can express $\text{PME}(\hat{\mathbf{x}},\mathcal{S},U_{\mathcal{S}}\{\mathbf{x}\},N_{\text{e}})$ as

$$\text{PME}(\hat{\mathbf{x}},\mathcal{S},U_{\mathcal{S}}\{\mathbf{x}\},N_{\text{e}})=\frac{\sum_{\mathbf{x}\in\mathcal{S}}\mathcal{CN}(\mathbf{x};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I})\,\mathbf{x}}{\sum_{\mathbf{x}\in\mathcal{S}}\mathcal{CN}(\mathbf{x};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I})}. \quad (55)$$

Meanwhile, for $\bar{\mathcal{S}}=\{\mathcal{S},\mathbf{0}\}$, we can express the PME of $\mathbf{x}$ under $U_{\alpha,\bar{\mathcal{S}}}\{\mathbf{x}\}$ as

$$\text{PME}(\hat{\mathbf{x}},\bar{\mathcal{S}},U_{\alpha,\bar{\mathcal{S}}}\{\mathbf{x}\},N_{\text{e}}) = \frac{\sum_{\mathbf{x}\in\mathcal{S}}\frac{\alpha}{|\mathcal{S}|}\mathcal{CN}(\mathbf{x};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I})\,\mathbf{x}}{\sum_{\mathbf{x}\in\mathcal{S}}\frac{\alpha}{|\mathcal{S}|}\mathcal{CN}(\mathbf{x};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I})+(1-\alpha)\mathcal{CN}(\mathbf{0};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I})} \quad (56)$$
$$= \frac{\alpha}{\alpha+(1-\alpha)\frac{|\mathcal{S}|\,\mathcal{CN}(\mathbf{0};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I})}{\sum_{\mathbf{x}\in\mathcal{S}}\mathcal{CN}(\mathbf{x};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I})}}\cdot\frac{\sum_{\mathbf{x}\in\mathcal{S}}\mathcal{CN}(\mathbf{x};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I})\,\mathbf{x}}{\sum_{\mathbf{x}\in\mathcal{S}}\mathcal{CN}(\mathbf{x};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I})}$$
$$= C_{\text{PME}}(\hat{\mathbf{x}},\mathcal{S},\alpha,N_{\text{e}})\,\text{PME}(\hat{\mathbf{x}},\mathcal{S},U_{\mathcal{S}}\{\mathbf{x}\},N_{\text{e}}),$$

where

$$C_{\text{PME}}(\hat{\mathbf{x}},\mathcal{S},\alpha,N_{\text{e}}) \triangleq \alpha\left(\alpha+(1-\alpha)\frac{|\mathcal{S}|\,\mathcal{CN}(\mathbf{0};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I})}{\sum_{\mathbf{x}\in\mathcal{S}}\mathcal{CN}(\mathbf{x};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I})}\right)^{-1}. \quad (57)$$

This completes the proof.

Appendix C Proof of Proposition 3

If $\mathcal{S}=\mathcal{R}^{S}$, we can rewrite $U_{\mathcal{S}}\{\mathbf{x}\}$ as

$$U_{\mathcal{S}}\{\mathbf{x}\}=\frac{1}{|\mathcal{R}^{S}|}=\frac{1}{|\mathcal{R}|^{S}},\ \forall\mathbf{x}\in\mathcal{R}^{S}, \quad (58)$$

which indicates that the entries of $\mathbf{x}$ are independent of each other. Based on this, given observation vector $\hat{\mathbf{x}}=\mathbf{x}+\mathbf{e}\in\mathbb{C}^{S}$ with $\mathbf{x}\sim U_{\mathcal{S}}\{\mathbf{x}\}$ and Gaussian estimation error $\mathbf{e}\sim\mathcal{CN}(\mathbf{0},N_{\text{e}}\mathbf{I}_{S})$, the $s$-th entry of $\text{PME}(\hat{\mathbf{x}},\mathcal{S},U_{\mathcal{S}}\{\mathbf{x}\},N_{\text{e}})$ is given by

$$\mathbf{e}_{s}^{T}\,\text{PME}(\hat{\mathbf{x}},\mathcal{S},U_{\mathcal{S}}\{\mathbf{x}\},N_{\text{e}}) = \frac{\sum_{\mathbf{x}\in\mathcal{S}}\frac{1}{|\mathcal{S}|}\mathcal{CN}(\mathbf{x};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I}_{S})\,\mathbf{e}_{s}^{T}\mathbf{x}}{\sum_{\mathbf{x}\in\mathcal{S}}\frac{1}{|\mathcal{S}|}\mathcal{CN}(\mathbf{x};\hat{\mathbf{x}},N_{\text{e}}\mathbf{I}_{S})} \quad (59)$$
$$\overset{(a)}{=} \frac{\sum_{x\in\mathcal{R}}\frac{1}{|\mathcal{R}|}\mathcal{CN}(x;\hat{\mathbf{x}}(s),N_{\text{e}})\,x}{\sum_{x\in\mathcal{R}}\frac{1}{|\mathcal{R}|}\mathcal{CN}(x;\hat{\mathbf{x}}(s),N_{\text{e}})}$$
$$= \text{PME}(\hat{\mathbf{x}}(s),\mathcal{R},U_{\mathcal{R}}\{x\},N_{\text{e}}),$$

where $\mathbf{x}(s)=\mathbf{e}_{s}^{T}\mathbf{x}$ is replaced by $x$ in step $(a)$. This completes the proof.

Acknowledgements

The authors thank Victoria Palhares and Haochuan Song for their help in cell-free channel modeling and Gian Marti for his suggestions on deriving the FBS algorithm. We acknowledge Sueda Taner and Oscar Castañeda for their advice on training deep-unfolded algorithms. We are grateful to Mengyuan Feng for her suggestions on improving the figures in this paper.

References

  • [1] G. Sun, M. Cao, W. Wang, W. Xu, and C. Studer, “Joint active user detection, channel estimation, and data detection for massive grant-free transmission in cell-free systems,” in Proc. IEEE Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), Sept. 2023, pp. 406–410.
  • [2] G. Sun, W. Wang, W. Xu, and C. Studer, “Deep-unfolded joint activity and data detection for grant-free transmission in cell-free systems,” in Proc. Int. Symp. Wireless Commun. Syst. (ISWCS), Jul. 2024, pp. 1–5.
  • [3] ITU-R, “Framework and overall objectives of the future development of IMT for 2030 and beyond,” Jun. 2023.
  • [4] G. Sun, H. Hou, Y. Wang, W. Wang, W. Xu, and S. Jin, “Beam-sweeping design for mmWave massive grant-free transmission,” IEEE J. Sel. Topics Signal Process., vol. 18, no. 7, pp. 1249–1264, Oct. 2024.
  • [5] Z. Gao, M. Ke, L. Qiao, and Y. Mei, “Grant-free massive access in cell-free massive MIMO systems,” in Massive IoT Access for 6G.   Springer, 2022, pp. 39–75.
  • [6] Y. Mei, Z. Gao, Y. Wu, W. Chen, J. Zhang, D. W. K. Ng, and M. Di Renzo, “Compressive sensing-based joint activity and data detection for grant-free massive IoT access,” IEEE Trans. Wireless Commun., vol. 21, no. 3, pp. 1851–1869, Mar. 2022.
  • [7] G. Sun, X. Yi, W. Wang, W. Xu, and S. Jin, “Hybrid beamforming for millimeter-wave massive grant-free transmission,” IEEE Trans. Commun., pp. 1–1, Early Access, Sep. 2024.
  • [8] Z. Gao, M. Ke, L. Qiao, and Y. Mei, “Joint activity and data detection for grant-free massive access,” in Massive IoT Access for 6G.   Springer, 2022, pp. 127–161.
  • [9] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. K. Soong, and J. C. Zhang, “What will 5G be?” IEEE J. Sel. Areas Commun., vol. 32, no. 6, pp. 1065–1082, Jun. 2014.
  • [10] J. Fang, G. Sun, W. Wang, L. You, and R. Ding, “OFDMA-based unsourced random access in LEO satellite Internet of Things,” China Commun., vol. 21, no. 1, pp. 13–23, Jan. 2024.
  • [11] X. Chen, D. W. K. Ng, W. Yu, E. G. Larsson, N. Al-Dhahir, and R. Schober, “Massive access for 5G and beyond,” IEEE J. Sel. Areas Commun., vol. 39, no. 3, pp. 615–637, Mar. 2021.
  • [12] G. Sun, X. Yi, W. Wang, and W. Xu, “Hybrid beamforming for ergodic rate maximization of mmWave massive grant-free systems,” in Proc. IEEE Global Commun. Conf. (GLOBECOM), Dec. 2022, pp. 1612–1617.
  • [13] G. Sun, W. Wang, W. Xu, and C. Studer, “Low-coherence sequence design under PAPR constraints,” IEEE Wireless Commun. Lett., vol. 13, no. 12, pp. 3663–3667, Dec. 2024.
  • [14] E. Björnson, E. de Carvalho, J. H. Sørensen, E. G. Larsson, and P. Popovski, “A random access protocol for pilot allocation in crowded massive MIMO systems,” IEEE Trans. Wireless Commun., vol. 16, no. 4, pp. 2220–2234, Apr. 2017.
  • [15] W. Xu, Y. Huang, W. Wang, F. Zhu, and X. Ji, “Toward ubiquitous and intelligent 6G networks: From architecture to technology,” Sci. China Inf. Sci., vol. 66, no. 3, p. 130300, Feb. 2023.
  • [16] G. Sun, Y. Li, X. Yi, W. Wang, X. Gao, L. Wang, F. Wei, and Y. Chen, “Massive grant-free OFDMA with timing and frequency offsets,” IEEE Trans. Wireless Commun., vol. 21, no. 5, pp. 3365–3380, May 2022.
  • [17] L. Liu, E. G. Larsson, W. Yu, P. Popovski, C. Stefanovic, and E. de Carvalho, “Sparse signal processing for grant-free massive connectivity: A future paradigm for random access protocols in the Internet of Things,” IEEE Signal Process. Mag., vol. 35, no. 5, pp. 88–99, Sept. 2018.
  • [18] Z. Zhang, X. Wang, Y. Zhang, and Y. Chen, “Grant-free rateless multiple access: A novel massive access scheme for Internet of Things,” IEEE Commun. Lett., vol. 20, no. 10, pp. 2019–2022, Oct. 2016.
  • [19] G. Sun, Y. Li, X. Yi, W. Wang, X. Gao, and L. Wang, “OFDMA based massive grant-free transmission in the presence of timing offset,” in Proc. 13th Int. Conf. Wireless Commun. Signal Process. (WCSP), Oct. 2021, pp. 1–6.
  • [20] Y. Zhu, G. Sun, W. Wang, L. You, F. Wei, L. Wang, and Y. Chen, “Massive grant-free receiver design for OFDM-based transmission over frequency-selective fading channels,” in Proc. IEEE Int. Conf. Commun. (ICC), May 2022, pp. 2405–2410.
  • [21] L. Liu and W. Yu, “Massive connectivity with massive MIMO—Part I: Device activity detection and channel estimation,” IEEE Trans. Signal Process., vol. 66, no. 11, pp. 2933–2946, Jun. 2018.
  • [22] Z. Chen, F. Sohrabi, and W. Yu, “Sparse activity detection for massive connectivity,” IEEE Trans. Signal Process., vol. 66, no. 7, pp. 1890–1904, Apr. 2018.
  • [23] Y. Zhu, G. Sun, W. Wang, L. You, F. Wei, L. Wang, and Y. Chen, “OFDM-based massive grant-free transmission over frequency-selective fading channels,” IEEE Trans. Commun., vol. 70, no. 7, pp. 4543–4558, Jul. 2022.
  • [24] E. Björnson and L. Sanguinetti, “Scalable cell-free massive MIMO systems,” IEEE Trans. Commun., vol. 68, no. 7, pp. 4247–4261, Jul. 2020.
  • [25] H. Q. Ngo, A. Ashikhmin, H. Yang, E. G. Larsson, and T. L. Marzetta, “Cell-free massive MIMO versus small cells,” IEEE Trans. Wireless Commun., vol. 16, no. 3, pp. 1834–1850, Mar. 2017.
  • [26] A. Mishra, Y. Mao, L. Sanguinetti, and B. Clerckx, “Rate-splitting assisted massive machine-type communications in cell-free massive MIMO,” IEEE Commun. Lett., vol. 26, no. 6, pp. 1358–1362, Jun. 2022.
  • [27] H. Q. Ngo, A. Ashikhmin, H. Yang, E. G. Larsson, and T. L. Marzetta, “Cell-free massive MIMO: Uniformly great service for everyone,” in Proc. IEEE Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), Jun. 2015, pp. 201–205.
  • [28] Z. Gao, M. Ke, Y. Mei, L. Qiao, S. Chen, D. W. K. Ng, and H. V. Poor, “Compressive-sensing-based grant-free massive access for 6G massive communication,” IEEE Internet Things J., vol. 11, no. 5, pp. 7411–7435, Mar. 2024.
  • [29] S. Elhoushy, M. Ibrahim, and W. Hamouda, “Cell-free massive MIMO: A survey,” IEEE Commun. Surveys Tuts., vol. 24, no. 1, pp. 492–523, 1st Quart. 2021.
  • [30] J. Zhang, S. Chen, Y. Lin, J. Zheng, B. Ai, and L. Hanzo, “Cell-free massive MIMO: A new next-generation paradigm,” IEEE Access, vol. 7, pp. 99 878–99 888, Jun. 2019.
  • [31] T. Goldstein, C. Studer, and R. Baraniuk, “A field guide to forward-backward splitting with a FASTA implementation,” arXiv preprint: 1411.3406, Nov. 2014.
  • [32] A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM J. Imag. Sci., vol. 2, no. 1, pp. 183–202, 2009.
  • [33] U. K. Ganesan, E. Björnson, and E. G. Larsson, “Clustering-based activity detection algorithms for grant-free random access in cell-free massive MIMO,” IEEE Trans. Commun., vol. 69, no. 11, pp. 7520–7530, Nov. 2021.
  • [34] X. Shao, X. Chen, D. W. K. Ng, C. Zhong, and Z. Zhang, “Covariance-based cooperative activity detection for massive grant-free random access,” in Proc. IEEE Global Commun. Conf. (GLOBECOM), Dec. 2020, pp. 1–6.
  • [35] H. Wang, J. Wang, and J. Fang, “Grant-free massive connectivity in massive MIMO systems: Collocated versus cell-free,” IEEE Wireless Commun. Lett., vol. 10, no. 3, pp. 634–638, Mar. 2021.
  • [36] Y. Li, Q. Lin, Y.-F. Liu, B. Ai, and Y.-C. Wu, “Asynchronous activity detection for cell-free massive MIMO: From centralized to distributed algorithms,” IEEE Trans. Wireless Commun., vol. 22, no. 4, pp. 2477–2492, Apr. 2023.
  • [37] L. Diao, H. Wang, J. Li, P. Zhu, D. Wang, and X. You, “A scalable deep-learning-based active user detection approach for SEU-assisted cell-free massive MIMO systems,” IEEE Internet Things J., vol. 10, no. 22, pp. 19 666–19 680, Nov. 2023.
  • [38] S. Jiang, J. Dang, Z. Zhang, L. Wu, B. Zhu, and L. Wang, “EM-AMP-based joint active user detection and channel estimation in cell-free system,” IEEE Syst. J., vol. 17, no. 3, pp. 4026–4037, Sept. 2023.
  • [39] X. Wang, A. Ashikhmin, Z. Dong, and C. Zhai, “Two-stage channel estimation approach for cell-free IoT with massive random access,” IEEE J. Sel. Areas Commun., vol. 40, no. 5, pp. 1428–1440, May 2022.
  • [40] M. Guo and M. C. Gursoy, “Joint activity detection and channel estimation in cell-free massive MIMO networks with massive connectivity,” IEEE Trans. Commun., vol. 70, no. 1, pp. 317–331, Jan. 2022.
  • [41] M. Ke, Z. Gao, Y. Wu, X. Gao, and K.-K. Wong, “Massive access in cell-free massive MIMO-based Internet of Things: Cloud computing and edge computing paradigms,” IEEE J. Sel. Areas Commun., vol. 39, no. 3, pp. 756–772, Mar. 2021.
  • [42] J. Johnston and X. Wang, “Model-based deep learning for joint activity detection and channel estimation in massive and sporadic connectivity,” IEEE Trans. Wireless Commun., vol. 21, no. 11, pp. 9806–9817, Nov. 2022.
  • [43] R. B. Di Renna and R. C. de Lamare, “Adaptive LLR-based APs selection for grant-free random access in cell-free massive MIMO,” in Proc. IEEE Global Commun. Conf. Workshops (GLOBECOM Workshops), Dec. 2022, pp. 196–201.
  • [44] G. Femenias and F. Riera-Palou, “K-repetition for grant-free random access in cell-free massive MIMO networks,” IEEE Trans. Veh. Technol., vol. 73, no. 3, pp. 3623–3638, Mar. 2024.
  • [45] X. Zhou, K. Ying, Z. Gao, Y. Wu, Z. Xiao, S. Chatzinotas, J. Yuan, and B. Ottersten, “Active terminal identification, channel estimation, and signal detection for grant-free NOMA-OTFS in LEO satellite Internet-of-Things,” IEEE Trans. Wireless Commun., vol. 22, no. 4, pp. 2847–2866, Apr. 2023.
  • [46] H. Iimori, T. Takahashi, K. Ishibashi, G. T. F. de Abreu, and W. Yu, “Grant-free access via bilinear inference for cell-free MIMO with low-coherence pilots,” IEEE Trans. Wireless Commun., vol. 20, no. 11, pp. 7694–7710, Nov. 2021.
  • [47] Q. Zou, H. Zhang, D. Cai, and H. Yang, “A low-complexity joint user activity, channel and data estimation for grant-free massive MIMO systems,” IEEE Signal Process. Lett., vol. 27, pp. 1290–1294, Jul. 2020.
  • [48] R. B. Di Renna and R. C. de Lamare, “Joint channel estimation, activity detection and data decoding based on dynamic message-scheduling strategies for mMTC,” IEEE Trans. Commun., vol. 70, no. 4, pp. 2464–2479, Apr. 2022.
  • [49] S. Zhang, Y. Cui, and W. Chen, “Joint device activity detection, channel estimation and signal detection for massive grant-free access via BiGAMP,” IEEE Trans. Signal Process., vol. 71, pp. 1200–1215, Apr. 2023.
  • [50] S. Jiang, X. Yuan, X. Wang, C. Xu, and W. Yu, “Joint user identification, channel estimation, and signal detection for grant-free NOMA,” IEEE Trans. Wireless Commun., vol. 19, no. 10, pp. 6960–6976, Oct. 2020.
  • [51] Y. Bai, W. Chen, B. Ai, and P. Popovski, “Deep learning for asynchronous massive access with data frame length diversity,” IEEE Trans. Wireless Commun., vol. 23, no. 6, pp. 5529–5540, Jun. 2024.
  • [52] X. Bian, Y. Mao, and J. Zhang, “Joint activity detection, channel estimation, and data decoding for grant-free massive random access,” IEEE Internet Things J., vol. 10, no. 16, pp. 14 042–14 057, Aug. 2023.
  • [53] B. Shen, Y. Wu, W. Zhang, S. Chatzinotas, and B. Ottersten, “Joint device identification, channel estimation, and signal detection for LEO satellite-enabled random access,” in Proc. IEEE Global Commun. Conf. (GLOBECOM), 2023, pp. 679–684.
  • [54] H. Song, T. Goldstein, X. You, C. Zhang, O. Tirkkonen, and C. Studer, “Joint channel estimation and data detection in cell-free massive MU-MIMO systems,” IEEE Trans. Wireless Commun., vol. 21, no. 6, pp. 4068–4084, Jun. 2022.
  • [55] A. Balatsoukas-Stimming and C. Studer, “Deep unfolding for communications systems: A survey and some new directions,” in Proc. IEEE Int. Workshop Signal Process. Syst. (SiPS), Oct. 2019, pp. 266–271.
  • [56] J. R. Hershey, J. L. Roux, and F. Weninger, “Deep unfolding: Model-based inspiration of novel deep architectures,” arXiv preprint: 1409.2574, 2014.
  • [57] A. Jagannath, J. Jagannath, and T. Melodia, “Redefining wireless communication for 6G: Signal processing meets deep learning with deep unfolding,” IEEE Trans. Artif. Intell., vol. 2, no. 6, pp. 528–536, Dec. 2021.
  • [58] N. Ye, J. An, and J. Yu, “Deep-learning-enhanced NOMA transceiver design for massive MTC: Challenges, state of the art, and future directions,” IEEE Wireless Commun., vol. 28, no. 4, pp. 66–73, Aug. 2021.
  • [59] Y. Bai, W. Chen, B. Ai, Z. Zhong, and I. J. Wassell, “Prior information aided deep learning method for grant-free NOMA in mMTC,” IEEE J. Sel. Areas Commun., vol. 40, no. 1, pp. 112–126, Jan. 2022.
  • [60] Z. Ma, W. Wu, F. Gao, and X. Shen, “Model-driven deep learning for non-coherent massive machine-type communications,” IEEE Trans. Wireless Commun., vol. 23, no. 3, pp. 2197–2211, Mar. 2024.
  • [61] Y. Shi, S. Xia, Y. Zhou, and Y. Shi, “Sparse signal processing for massive device connectivity via deep learning,” in Proc. IEEE Int. Conf. Commun. Workshops (ICC Workshops), Jun. 2020, pp. 1–6.
  • [62] Z. Gao, S. Liu, Y. Su, Z. Li, and D. Zheng, “Hybrid knowledge-data driven channel semantic acquisition and beamforming for cell-free massive MIMO,” IEEE J. Sel. Topics Signal Process., vol. 17, no. 5, pp. 964–979, Sept. 2023.
  • [63] S. Liu, Z. Gao, C. Hu, S. Tan, L. Fang, and L. Qiao, “Model-driven deep learning based precoding for FDD cell-free massive MIMO with imperfect CSI,” in Proc. Int. Wireless Commun. Mobile Comput. (IWCMC), May 2022, pp. 696–701.
  • [64] O. Castañeda, S. Jacobsson, G. Durisi, M. Coldrey, T. Goldstein, and C. Studer, “1-bit massive MU-MIMO precoding in VLSI,” IEEE J. Emerg. Sel. Topics Circuits Syst., vol. 7, no. 4, pp. 508–522, Dec. 2017.
  • [65] H. Song, X. You, C. Zhang, and C. Studer, “Soft-output joint channel estimation and data detection using deep unfolding,” in Proc. IEEE Inf. Theory Workshop (ITW), 2021, pp. 1–5.
  • [66] G. Marti, T. Kölle, and C. Studer, “Mitigating smart jammers in multi-user MIMO,” IEEE Trans. Signal Process., vol. 71, pp. 756–771, Feb. 2023.
  • [67] J. Tropp, I. Dhillon, R. Heath, and T. Strohmer, “Designing structured tight frames via an alternating projection method,” IEEE Trans. Inf. Theory, vol. 51, no. 1, pp. 188–209, Jan. 2005.
  • [68] A. Tang, J. Sun, and K. Gong, “Mobile propagation loss with a low base station antenna for NLOS street microcells in urban area,” in Proc. IEEE Veh. Technol. Conf. Spring (VTC-Spring), vol. 1, May 2001, pp. 333–336.
  • [69] M. Zhang and S. Kim, “Evaluation of MMSE-based iterative soft detection schemes for coded massive MIMO system,” IEEE Access, vol. 7, pp. 10 166–10 175, 2019.