
An Efficient Algorithm for Device Detection and Channel Estimation in Asynchronous IoT Systems

Abstract

Considerable effort has recently been devoted to the joint device activity detection and channel estimation problem in massive machine-type communications. This paper targets two practical issues along this line that have not been addressed before: asynchronous transmission from uncoordinated users and efficient algorithms for real-time implementation in systems with a massive number of devices. Specifically, this paper considers a practical system where the preamble sent by each active device is delayed by some unknown number of symbols due to the lack of coordination. We show that the problem of detecting the active devices and estimating their delay and channels can be cast into a group LASSO problem. Then, a block coordinate descent algorithm is proposed to solve this problem globally, where a closed-form solution is available when updating each block of variables with the other blocks fixed, thanks to the special structure of the problem of interest. Our analysis shows that the overall complexity of the proposed algorithm is low, making it suitable for real-time applications.

Index Terms—  Massive machine-type communication, compressed sensing, asynchronous detection.

1 Introduction

Driven by the rapid advance of the Internet of Things (IoT), massive machine-type communications (mMTC), whose purpose is to provide wireless connectivity to a vast number of IoT devices, has attracted growing attention recently. To reduce the device access delay, a grant-free random access scheme was advocated in [1], where each active device first transmits its preamble to the base station (BS) and then directly transmits its data without waiting for a grant from the BS. To enable this low-latency access scheme, the BS should be able to detect the active devices and estimate their channels based on the received preambles [1, 2]. Recently, [3, 4] showed that joint device activity detection and channel estimation can be formulated as a compressed sensing problem because of the sparse device activity. Such a problem is then solved via the approximate message passing (AMP) algorithm [5] in [3, 4, 6], or via other sparse optimization techniques in [7, 8, 9, 10]. To practically implement joint device activity detection and channel estimation under the grant-free random access scheme, there are two crucial issues to address. First, in a practical mMTC system consisting of a large number of low-cost IoT devices, it is impossible to ensure that all the active devices are perfectly synchronized. As a result, the preamble sequence sent by each active device may be delayed by some unknown number of symbols at the beginning of each coherence block. In this case, it is unknown whether the device detection and channel estimation problem can still be solved under the compressed sensing framework as in the synchronous case [3, 4, 6]. Apart from the issue of asynchronous transmission, another challenge lies in the complexity. In mMTC, the number of devices is huge. Moreover, due to the recent success of the massive multiple-input multiple-output (MIMO) technique, the number of antennas at the BS is becoming large as well.
In this case, the joint device activity detection and channel estimation problem involves a vast number of unknown variables. It is thus important to devise efficient algorithms that can be implemented in practical, large-scale IoT systems.

To tackle the above two challenges, this paper aims to propose a low-complexity algorithm to solve the problem of detecting the active devices and estimating their delay and channels in asynchronous mMTC systems. Specifically, by introducing an enlarged sensing matrix that consists of all the effective preambles of the devices (for each device, each of its effective preambles denotes a preamble that is delayed by a particular number of symbols), we show that the above problem can be cast into a compressed sensing problem, similar to its counterpart in synchronous mMTC systems [3, 1, 4, 6]. To guarantee that at most one effective preamble is detected to be active for each device, the compressed sensing problem is further formulated as a group LASSO problem [11]. We propose a block coordinate descent (BCD) algorithm to solve this problem globally. Thanks to the problem's special structure, we show that when the BCD algorithm optimizes one block of variables with the other blocks fixed, the optimal solution can be obtained in closed form. Further, the overall complexity of our algorithm scales linearly with the number of devices and the number of antennas at the BS, which makes it appealing in large IoT systems. Last, we remark that the device activity detection problem considered here in asynchronous mMTC systems is in sharp contrast to its information decoding counterpart in asynchronous human-type communication systems, which has been widely studied in the literature, because of the different techniques used for activity detection and information decoding.

2 System Model

This paper considers the uplink communication in an mMTC system, which consists of a BS equipped with $M$ antennas and $N$ single-antenna IoT devices denoted by the set $\mathcal{N}\triangleq\{1,\ldots,N\}$. We assume quasi-static block-fading channels, in which all channels remain approximately constant in each fading block, but vary independently from block to block. The channel from device $n$ to the BS is denoted by $\bm{h}_{n}\in\mathbb{C}^{M\times 1}$, $\forall n\in\mathcal{N}$. We assume that the device channels follow the independent and identically distributed (i.i.d.) Rayleigh fading channel model, i.e., $\bm{h}_{n}\sim\mathcal{CN}(\bm{0},\alpha_{n}\bm{I})$, $\forall n$, where $\alpha_{n}$ denotes the path loss of device $n$.

Due to the sporadic IoT data traffic, only some of the devices become active within each channel coherence block. We thus define the device activity indicator functions as follows:

\lambda_{n}=\begin{cases}1,&\text{if device }n\text{ is active},\\ 0,&\text{otherwise},\end{cases}\quad\forall n\in\mathcal{N}. \qquad (3)

Then, the set of active devices is defined as $\mathcal{K}=\{n:\lambda_{n}=1,n\in\mathcal{N}\}$.

It is assumed that the two-phase grant-free random access scheme [1] is adopted for the considered system, where each active device first sends its preamble sequence to the BS for device activity detection and channel estimation, and then sends its data to the BS for decoding. This paper mainly focuses on the first phase of the above scheme. In this phase, each device $n$ is assigned a unique preamble sequence denoted by $\bm{a}_{n}=[a_{n,1},\ldots,a_{n,L}]^{T}\in\mathbb{C}^{L\times 1}$, $\forall n$, where $L$ denotes the length of the preamble sequence, and $a_{n,l}$ with $|a_{n,l}|=1$ denotes the $l$-th preamble symbol of device $n$, $l=1,\ldots,L$.
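The paper only requires the preamble symbols to be unit-modulus; it does not specify how the sequences are generated. A minimal sketch, assuming i.i.d. uniform random phases (the function name is ours, not the paper's):

```python
import numpy as np

def generate_preambles(N, L, seed=0):
    """Generate N unit-modulus preamble sequences of length L.

    Assumption: i.i.d. uniform random phases, which satisfies the
    paper's only requirement |a_{n,l}| = 1.
    """
    rng = np.random.default_rng(seed)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=(N, L))
    return np.exp(1j * phases)  # row n is the preamble a_n of device n

A = generate_preambles(N=100, L=20)  # one row per device
```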

At the beginning of each coherence block, all the active devices attempt to transmit their preambles to the BS. However, due to the lack of perfect coordination among a large number of low-cost devices, the preamble transmissions are in general asynchronous. Let $\tau_{n}$ denote the discrete delay (in terms of symbols) of device $n$ for transmitting $\bm{a}_{n}$, $\forall n\in\mathcal{K}$. It is assumed that at each coherence block, the preamble transmission delay of each device $n$ is a random integer in the range $\tau_{n}\in[0,\tau_{\max}]$, $\forall n$, where $\tau_{\max}$ denotes the maximum delay of all the devices over all the coherence blocks. Moreover, $\tau_{\max}$ is assumed to be known by the BS.

Since each active device $n\in\mathcal{K}$ starts to transmit its preamble at the $(\tau_{n}+1)$-th symbol in the coherence block, we define the effective preamble sequence for device $n$ as

\bar{\bm{a}}_{n}(\tau_{n})=[\bar{a}_{n,1}(\tau_{n}),\ldots,\bar{a}_{n,L+\tau_{\max}}(\tau_{n})]^{T}=[\underbrace{0,\ldots,0}_{\tau_{n}},a_{n,1},\ldots,a_{n,L},\underbrace{0,\ldots,0}_{\tau_{\max}-\tau_{n}}]^{T},\quad\forall n, \qquad (4)

where $\bar{a}_{n,j}(\tau_{n})$ denotes the $j$-th transmit symbol of device $n$ given a delay of $\tau_{n}$ symbols, $1\leq j\leq L+\tau_{\max}$. Note that $\bar{a}_{n,j}(\tau_{n})=0$ if $1\leq j\leq\tau_{n}$ or $L+\tau_{n}+1\leq j\leq L+\tau_{\max}$. Moreover, under the two-phase grant-free random access scheme, after transmitting its pilot in Phase I, each active device should wait for $\tau_{\max}$ symbols before transmitting its data, such that the pilot received at the BS in the first $L+\tau_{\max}$ time slots is not interfered with by the data. Then, at time slot $j$, $1\leq j\leq L+\tau_{\max}$, the received signal at the BS is contributed only by the pilots and is given by

\bm{y}_{j}=\sum_{n=1}^{N}\lambda_{n}\sqrt{p}\,\bar{a}_{n,j}(\tau_{n})\bm{h}_{n}+\bm{z}_{j},\quad j=1,\ldots,L+\tau_{\max}, \qquad (5)

where $p$ denotes the identical transmit power of all the devices, and $\bm{z}_{j}\sim\mathcal{CN}(\bm{0},\sigma^{2}\bm{I})$ denotes the additive white Gaussian noise (AWGN) at the BS. Further, the overall received signal at the BS over all the $L+\tau_{\max}$ time slots, denoted by $\bm{Y}=[\bm{y}_{1},\ldots,\bm{y}_{L+\tau_{\max}}]^{T}\in\mathbb{C}^{(L+\tau_{\max})\times M}$, is given by

\bm{Y}=\sum_{n=1}^{N}\lambda_{n}\sqrt{p}\,\bar{\bm{a}}_{n}(\tau_{n})\bm{h}_{n}^{T}+\bm{Z}=\sqrt{p}\,\bar{\bm{A}}(\tau_{1},\ldots,\tau_{N})\bm{X}+\bm{Z}, \qquad (6)

where $\bar{\bm{A}}(\tau_{1},\ldots,\tau_{N})=[\bar{\bm{a}}_{1}(\tau_{1}),\ldots,\bar{\bm{a}}_{N}(\tau_{N})]$, $\bm{X}=[\bm{x}_{1},\ldots,\bm{x}_{N}]^{T}$ with $\bm{x}_{n}=\lambda_{n}\bm{h}_{n}$, $\forall n$, and $\bm{Z}=[\bm{z}_{1},\ldots,\bm{z}_{L+\tau_{\max}}]^{T}$. The goal of the BS is to estimate the device activities $\lambda_{n}$'s and delays $\tau_{n}$'s as well as the active devices' channels $\bm{h}_{n}$'s based on its received signal $\bm{Y}$ given in (6) and its knowledge of the preamble sequences $\bm{a}_{n}$'s.
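The signal model (4)-(6) can be simulated directly. Below is a sketch under the stated assumptions (unit-modulus preambles, integer delays, AWGN); all function and variable names are ours, not the paper's:

```python
import numpy as np

def effective_preamble(a, tau, tau_max):
    """Delayed preamble of (4): tau leading zeros, the L preamble
    symbols, then tau_max - tau trailing zeros."""
    L = a.shape[0]
    a_bar = np.zeros(L + tau_max, dtype=complex)
    a_bar[tau:tau + L] = a
    return a_bar

def received_pilot(A, lam, taus, H, p, sigma2, tau_max, rng):
    """Received pilot signal Y of (6): every active device n contributes
    sqrt(p) * a_bar_n(tau_n) h_n^T; Z is i.i.d. CN(0, sigma2) noise."""
    N, L = A.shape
    M = H.shape[1]
    Y = np.zeros((L + tau_max, M), dtype=complex)
    for n in range(N):
        if lam[n]:
            Y += np.sqrt(p) * np.outer(
                effective_preamble(A[n], taus[n], tau_max), H[n])
    Z = np.sqrt(sigma2 / 2.0) * (rng.standard_normal(Y.shape)
                                 + 1j * rng.standard_normal(Y.shape))
    return Y + Z
```

Here `A` stacks the preambles row-wise, `lam` and `taus` hold the activity indicators and delays, and `H` stacks the channels row-wise.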

3 A Compressed Sensing Problem Formulation for Estimating $\{\lambda_{n},\tau_{n},\bm{h}_{n}\}$

In this section, we show that the detection of active devices as well as the estimation of their delay and channels can be cast into a compressed sensing problem. Specifically, define

\bm{A}^{\rm ext}=[\bar{\bm{a}}_{1}(0),\ldots,\bar{\bm{a}}_{1}(\tau_{\max}),\ldots,\bar{\bm{a}}_{N}(0),\ldots,\bar{\bm{a}}_{N}(\tau_{\max})]\in\mathbb{C}^{(L+\tau_{\max})\times(\tau_{\max}+1)N} \qquad (7)

as the collection of the possible effective preamble sequences of all the devices. Moreover, define the indicator functions of device activity and delay as follows:

\beta_{n,\tau}=\begin{cases}1,&\text{if }\lambda_{n}=1\text{ and }\tau=\tau_{n},\\ 0,&\text{otherwise},\end{cases}\quad\forall n,\tau. \qquad (10)

In other words, $\beta_{n,\tau}=1$ only if device $n$ is active and its delay is $\tau$ symbols. Note that if device $n$ is active, only one of the $\beta_{n,\tau}$'s, $\tau=0,\ldots,\tau_{\max}$, is equal to 1, i.e., $\sum_{\tau=0}^{\tau_{\max}}\beta_{n,\tau}\leq 1$, $\forall n$. Then, (6) can be reformulated as

\bm{Y}=\sqrt{p}\,\bm{A}^{\rm ext}\bm{X}^{\rm ext}+\bm{Z}, \qquad (11)

where $\bm{X}^{\rm ext}=[\bm{x}_{1,0}^{\rm ext},\ldots,\bm{x}_{1,\tau_{\max}}^{\rm ext},\ldots,\bm{x}_{N,0}^{\rm ext},\ldots,\bm{x}_{N,\tau_{\max}}^{\rm ext}]^{T}\in\mathbb{C}^{(\tau_{\max}+1)N\times M}$ with

\bm{x}_{n,\tau}^{\rm ext}=\beta_{n,\tau}\bm{h}_{n},\quad\forall n,\tau. \qquad (12)

Suppose that $\bm{X}^{\rm ext}$ can be estimated according to (11). If $\bm{x}_{n,\tau}^{\rm ext}\neq\bm{0}$, i.e., $\beta_{n,\tau}=1$, then device $n$ is active, i.e., $\lambda_{n}=1$, and its delay and channel are $\tau_{n}=\tau$ and $\bm{h}_{n}=\bm{x}_{n,\tau}^{\rm ext}$, respectively. If $\bm{x}_{n,\tau}^{\rm ext}=\bm{0}$, $\forall\tau$, i.e., $\beta_{n,\tau}=0$, $\forall\tau$, then device $n$ is inactive, i.e., $\lambda_{n}=0$. Thus, the key to the joint estimation of device activity, delay, and channels lies in estimating $\bm{X}^{\rm ext}$ based on (11).
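The enlarged sensing matrix of (7) can be assembled mechanically from the original preambles. A sketch (the column ordering $(\tau_{\max}+1)n+\tau$ matches the row ordering of $\bm{X}^{\rm ext}$ in (11)):

```python
import numpy as np

def build_A_ext(A, tau_max):
    """Enlarged sensing matrix A^ext of (7): column (tau_max+1)*n + tau
    holds the effective preamble of device n delayed by tau symbols."""
    N, L = A.shape
    A_ext = np.zeros((L + tau_max, (tau_max + 1) * N), dtype=complex)
    for n in range(N):
        for tau in range(tau_max + 1):
            # tau leading zeros, then the L preamble symbols, as in (4)
            A_ext[tau:tau + L, n * (tau_max + 1) + tau] = A[n]
    return A_ext
```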

Note that estimating $\bm{X}^{\rm ext}$ based on (11) is a compressed sensing problem, since $\bm{X}^{\rm ext}$ is a row-sparse matrix according to (10) and (12). In this paper, the compressed sensing problem of estimating $\bm{X}^{\rm ext}$ is formulated as follows:

\underset{\bm{X}^{\rm ext}}{\mathrm{minimize}} \quad \|\bm{Y}-\sqrt{p}\,\bm{A}^{\rm ext}\bm{X}^{\rm ext}\|_{\rm F}^{2} \qquad (13)
\mathrm{subject~to} \quad \big\|\big[\|\bm{x}_{n,0}^{\rm ext}\|_{2},\ldots,\|\bm{x}_{n,\tau_{\max}}^{\rm ext}\|_{2}\big]\big\|_{0}\leq 1,\quad\forall n, \qquad (14)

where $\|\bm{A}\|_{\rm F}$ denotes the Frobenius norm of matrix $\bm{A}$, i.e., $\|\bm{A}\|_{\rm F}=\sqrt{{\rm tr}(\bm{A}\bm{A}^{H})}$, and $\|\bm{a}\|_{0}$ denotes the $\ell_{0}$-norm of vector $\bm{a}$, i.e., the number of non-zero elements in $\bm{a}$.

In the above problem, constraint (14) guarantees that at most one delay pattern is detected to be active for each device $n$. Mathematically, (14) imposes a group sparsity constraint on the structure of $\bm{X}^{\rm ext}$: in each block consisting of the $\tau_{\max}+1$ vectors $\bm{x}_{n,0}^{\rm ext},\ldots,\bm{x}_{n,\tau_{\max}}^{\rm ext}$, at most one of them is a non-zero vector. However, this constraint is non-convex. In the rest of this paper, we adopt the group LASSO technique to deal with this non-convex group sparsity constraint.
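Constraint (14) is straightforward to check programmatically. A sketch, with the rows of $\bm{X}^{\rm ext}$ grouped per device as in (11):

```python
import numpy as np

def satisfies_group_constraint(X_ext, N, tau_max, tol=1e-10):
    """Constraint (14): for each device n, at most one of the rows
    x_{n,0}^ext, ..., x_{n,tau_max}^ext of X_ext is non-zero."""
    G = tau_max + 1
    for n in range(N):
        group_norms = np.linalg.norm(X_ext[n * G:(n + 1) * G], axis=1)
        if np.count_nonzero(group_norms > tol) > 1:
            return False
    return True
```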

Under the group LASSO technique, given any coefficient $\rho>0$, we are interested in the following convex problem [11]:

\underset{\bm{X}^{\rm ext}}{\mathrm{minimize}} \quad 0.5\|\bm{Y}-\sqrt{p}\,\bm{A}^{\rm ext}\bm{X}^{\rm ext}\|_{\rm F}^{2}+\rho\sum_{n=1}^{N}\sum_{\tau=0}^{\tau_{\max}}\|\bm{x}_{n,\tau}^{\rm ext}\|_{2}. \qquad (15)

In problem (15), we penalize the estimation error with a mixed $\ell_{2}/\ell_{1}$-norm term, i.e., $\rho\sum_{n=1}^{N}\sum_{\tau=0}^{\tau_{\max}}\|\bm{x}_{n,\tau}^{\rm ext}\|_{2}$. Note that for a given number of zero entries, this penalty is minimized when the zero entries are gathered into entire rows of $\bm{X}^{\rm ext}$. As a result, given a large value of $\rho$, the optimal solution of problem (15) is a row-sparse matrix. Moreover, if $\rho$ is large enough, the corresponding solution will be sufficiently sparse, and therefore, constraint (14) in problem (13) can be satisfied.

In the following two sections, we will introduce how to solve problem (15) efficiently given $\rho$, and how to select a proper value of $\rho$ so as to balance between activity sparsity and channel estimation error, respectively.

4 An Efficient BCD Algorithm for Problem (15)

BCD-type algorithms are efficient in solving large-scale optimization problems with a vast number of variables [12]. In this section, we introduce a low-complexity BCD algorithm to solve problem (15) for any given $\rho>0$.

4.1 Algorithm Design

Under the BCD algorithm, at each step, we optimize only one vector $\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}$ for some particular $1\leq\bar{n}\leq N$ and $0\leq\bar{\tau}\leq\tau_{\max}$, given $\bm{x}_{n,\tau}^{\rm ext}=\tilde{\bm{x}}_{n,\tau}^{\rm ext}$, $\forall(n,\tau)\neq(\bar{n},\bar{\tau})$. The corresponding optimization problem is formulated as

\underset{\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}}{\mathrm{minimize}} \quad 0.5\big\|\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}-\sqrt{p}\,\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})^{T}\big\|_{\rm F}^{2}+\rho\big\|\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\big\|_{2}, \qquad (16)

where

\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}=\bm{Y}-\sqrt{p}\sum_{(n,\tau)\neq(\bar{n},\bar{\tau})}\bm{a}_{n,\tau}^{\rm ext}(\tilde{\bm{x}}_{n,\tau}^{\rm ext})^{T}. \qquad (17)

Somewhat surprisingly, we can obtain the closed-form optimal solution of problem (16), as shown in the following theorem.

Theorem 1.

The objective function in problem (16) is strongly convex, and its global minimum is achieved by

(\hat{\bm{x}}_{\bar{n},\bar{\tau}}^{\rm ext})^{T}=\begin{cases}\gamma_{\bar{n},\bar{\tau}}(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext})^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}},&\text{if }\|(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext})^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}\|_{2}>\frac{\rho}{\sqrt{p}},\\ \bm{0},&\text{otherwise},\end{cases} \qquad (20)

where

\gamma_{\bar{n},\bar{\tau}}=\frac{1}{L\sqrt{p}}-\frac{\rho}{Lp\,\|(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext})^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}\|_{2}}>0. \qquad (21)
Proof.

Please refer to Appendix A. ∎

The optimal solution (20) in Theorem 1 indicates that the BS should keep applying the matched filters $\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}$'s to denoise $\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}$'s. Then, if the strength of the resulting signal, i.e., $\|(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext})^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}\|_{2}$, is larger than the threshold $\rho/\sqrt{p}$, the estimate of $\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}$ is a non-zero vector. Otherwise, the estimate of $\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}$ is a zero vector. This implies that the solution to problem (15) becomes sparser as $\rho$ increases.
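The update (20)-(21) is a group soft-thresholding rule. A sketch of the per-block update, using the fact that $\|\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}\|_{2}^{2}=L$ since the $L$ preamble symbols are unit-modulus (function name ours):

```python
import numpy as np

def block_update(a_ext, Y_tilde, p, rho, L):
    """Closed-form solution (20)-(21) of the per-block problem (16):
    group soft-thresholding of the matched-filter output a^H Y_tilde."""
    r = a_ext.conj() @ Y_tilde            # matched filter, shape (M,)
    r_norm = np.linalg.norm(r)
    if r_norm > rho / np.sqrt(p):
        gamma = 1.0 / (L * np.sqrt(p)) - rho / (L * p * r_norm)   # (21)
        return gamma * r                  # (x_hat^ext)^T in (20)
    return np.zeros_like(r)
```

For instance, with $p=1$, a single active block, and no noise, the update simply shrinks the matched-filter estimate toward zero by the factor $1-\rho/(pL\|\bm{x}\|_{2})$.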

Remark 1.

In general, a group LASSO problem can only be solved numerically. However, under the BCD framework, the sensing matrix reduces to a single vector $\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}$ in problem (16) when optimizing $\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}$. In this case, a closed-form solution exists, which is appealing for reducing the computational complexity in mMTC.

Based on Theorem 1, the BCD algorithm for solving problem (15) is summarized in Algorithm 1, which is an iterative algorithm. At each outer iteration, we first optimize $\bm{x}_{1,0}^{\rm ext}$ given $\bm{x}_{n,\tau}^{\rm ext}=\tilde{\bm{x}}_{n,\tau}^{\rm ext}$, $\forall(n,\tau)\neq(1,0)$, then optimize $\bm{x}_{1,1}^{\rm ext}$ given $\bm{x}_{n,\tau}^{\rm ext}=\tilde{\bm{x}}_{n,\tau}^{\rm ext}$, $\forall(n,\tau)\neq(1,1)$, and so on, as shown in Steps 2.1 to 2.5. When all the $\bm{x}_{n,\tau}^{\rm ext}$'s have been optimized once, we calculate the objective value of problem (15) achieved after the $t$-th iteration, denoted by $\Gamma^{(t)}$, as shown in Step 3. The algorithm terminates when the objective value of problem (15) does not decrease sufficiently over two consecutive iterations.

Initialization: Set $\tilde{\bm{x}}_{n,\tau}^{\rm ext}=\bm{0}$, $n=1,\ldots,N$, $\tau=0,\ldots,\tau_{\max}$, set $\bar{\bm{Y}}=\bm{Y}$, where $\bm{Y}$ is the received signal given in (6), and set $t=1$;
Repeat:
  1. Set $j=1$;
  2. While $j\leq N(\tau_{\max}+1)$:
    2.1 Set $\bar{n}=\lfloor(j-1)/(\tau_{\max}+1)\rfloor+1$ and $\bar{\tau}=j-1-(\bar{n}-1)(\tau_{\max}+1)$;
    2.2 Set $\bm{x}_{n,\tau}^{\rm ext}=\tilde{\bm{x}}_{n,\tau}^{\rm ext}$, $\forall(n,\tau)\neq(\bar{n},\bar{\tau})$, and $\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}=\bar{\bm{Y}}+\sqrt{p}\,\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}(\tilde{\bm{x}}_{\bar{n},\bar{\tau}}^{\rm ext})^{T}$;
    2.3 Find the optimal solution $\hat{\bm{x}}_{\bar{n},\bar{\tau}}^{\rm ext}$ to problem (16) based on (20);
    2.4 Set $\tilde{\bm{x}}_{\bar{n},\bar{\tau}}^{\rm ext}=\hat{\bm{x}}_{\bar{n},\bar{\tau}}^{\rm ext}$ and $\bar{\bm{Y}}=\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}-\sqrt{p}\,\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}(\hat{\bm{x}}_{\bar{n},\bar{\tau}}^{\rm ext})^{T}$;
    2.5 Set $j=j+1$;
  3. Set $\Gamma^{(t)}=0.5\|\bar{\bm{Y}}\|_{\rm F}^{2}+\rho\sum_{n=1}^{N}\sum_{\tau=0}^{\tau_{\max}}\|\tilde{\bm{x}}_{n,\tau}^{\rm ext}\|_{2}$ and $t=t+1$;
Until $(\Gamma^{(t-1)}-\Gamma^{(t)})/\Gamma^{(t-1)}\leq\xi$, where $\xi$ is a small positive number.

Algorithm 1: BCD Algorithm for Solving Problem (15)
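Putting the pieces together, a compact sketch of Algorithm 1 (our own variable names; the residual bookkeeping mirrors Steps 2.2 and 2.4, and the block update implements (20)-(21)):

```python
import numpy as np

def bcd_group_lasso(Y, A_ext, p, rho, L, xi=1e-8, max_iter=200):
    """Minimal sketch of Algorithm 1: cyclic closed-form block updates
    (Theorem 1) with the residual Y_bar = Y - sqrt(p) A_ext X
    maintained incrementally."""
    G, M = A_ext.shape[1], Y.shape[1]
    X = np.zeros((G, M), dtype=complex)       # rows of X^ext
    Y_bar = Y.astype(complex)                 # residual, Step "Initialization"
    obj_prev = None
    for _ in range(max_iter):
        for j in range(G):
            a = A_ext[:, j]
            Y_tilde = Y_bar + np.sqrt(p) * np.outer(a, X[j])      # Step 2.2
            r = a.conj() @ Y_tilde                                # matched filter
            r_norm = np.linalg.norm(r)
            if r_norm > rho / np.sqrt(p):                         # threshold in (20)
                X[j] = (1.0 / (L * np.sqrt(p)) - rho / (L * p * r_norm)) * r
            else:
                X[j] = 0.0
            Y_bar = Y_tilde - np.sqrt(p) * np.outer(a, X[j])      # Step 2.4
        obj = 0.5 * np.linalg.norm(Y_bar) ** 2 \
              + rho * np.linalg.norm(X, axis=1).sum()             # Step 3
        if obj_prev is not None and (obj_prev - obj) / obj_prev <= xi:
            break                                                  # stopping rule
        obj_prev = obj
    return X
```

Each block update costs $\mathcal{O}((L+\tau_{\max})M)$, which is the source of the per-sweep cost behind the complexity bound in Theorem 3.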

4.2 Algorithm Properties

Having introduced how the BCD algorithm works, in this subsection we present some theoretical properties of the algorithm regarding its optimality and complexity.

Theorem 2.

Every limit point of the iterates generated by Algorithm 1 is a global solution of problem (15). Moreover, for all sufficiently large $t$, it follows that

\Gamma^{(t)}-\Gamma^{\ast}\leq\mathcal{O}\!\left(\frac{1}{t}\right), \qquad (22)

where $\Gamma^{(t)}$, as given in Step 3 of Algorithm 1, is the objective value of problem (15) at the $t$-th iteration of Algorithm 1, and $\Gamma^{\ast}$ is the optimal value of problem (15).

Proof.

Please refer to Appendix B. ∎

Theorem 3.

Given any $\epsilon>0$, the total complexity of Algorithm 1 to find an $\epsilon$-optimal solution of problem (15) satisfying $\Gamma^{(t)}-\Gamma^{\ast}\leq\epsilon$ is given by

\mathcal{O}\!\left(\frac{(L+\tau_{\max})\tau_{\max}MN}{\epsilon}\right). \qquad (23)
Proof.

Please refer to Appendix C. ∎

Theorems 2 and 3 imply that the BCD algorithm can solve problem (15) globally with a complexity that is linear in $N$ and $M$.

5 The Approach to Select $\rho$

Having shown how to solve problem (15) for any given $\rho>0$, we introduce in this section how to determine the value of $\rho$ such that the solution to problem (15) is a good solution to problem (13). In this paper, we update the value of $\rho$ iteratively. Specifically, at the beginning, we set an initial value $\rho=\rho^{\rm initial}>0$ and solve problem (15) via Algorithm 1. Then, we keep updating $\rho$ as $\rho=\delta\rho$, where $\delta>1$, and solving problem (15) iteratively, until for some sufficiently large value of $\rho$, the solution to problem (15) satisfies constraint (14) in problem (13). The overall algorithm for solving problem (13) via solving a sequence of instances of problem (15) is summarized in Algorithm 2.

Initialization: Set an initial value $\rho=\rho^{\rm initial}>0$;
Repeat:
  1. Find the solution $\tilde{\bm{x}}_{n,\tau}^{\rm ext}$'s to problem (15) given $\rho$ via Algorithm 1;
  2. Set
     \bm{x}_{n,\tau}^{\rm ext}=\begin{cases}\tilde{\bm{x}}_{n,\tau}^{\rm ext},&\text{if }\frac{\|\tilde{\bm{x}}_{n,\tau}^{\rm ext}\|_{2}^{2}}{M}\geq\zeta\alpha_{n},\\ \bm{0},&\text{otherwise},\end{cases}\quad\forall n,\tau, \qquad (26)
     where $0<\zeta<1$ is a given parameter to control the sparsity of $\bm{X}^{\rm ext}$;
  3. Update $\rho=\delta\rho$, where $\delta>1$;
Until the solution $\bm{x}_{n,\tau}^{\rm ext}$'s satisfies constraint (14).

Algorithm 2: Proposed Algorithm for Solving Problem (13)

Note that for some inactive devices, the power of the corresponding estimated signals $\tilde{\bm{x}}_{n,\tau}^{\rm ext}$'s obtained via Algorithm 1 may be very weak but non-zero. This causes the so-called false alarm event, i.e., an inactive device is detected as an active device. To enhance the sparsity of the estimate of $\bm{X}^{\rm ext}$ and reduce the false alarm probability, after the convergence of Algorithm 1 in Step 1 of Algorithm 2, we set $\bm{x}_{n,\tau}^{\rm ext}=\bm{0}$ if $\|\tilde{\bm{x}}_{n,\tau}^{\rm ext}\|_{2}^{2}$ is less than some threshold, as in Step 2 of Algorithm 2.
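Algorithm 2 wraps the group LASSO solver in an outer search over $\rho$. A sketch, where `solve_lasso(rho)` stands for any routine returning the solution of (15) (e.g., an implementation of Algorithm 1) and `alphas` holds the path losses $\alpha_n$ used in the threshold (26); both names are ours:

```python
import numpy as np

def rho_search(solve_lasso, N, tau_max, alphas, M, zeta=0.1,
               rho_init=1.0, delta=2.0, max_rounds=30):
    """Sketch of Algorithm 2: solve (15) for increasing rho, zero out
    weak groups per (26), and stop once constraint (14) holds."""
    G = tau_max + 1
    rho = rho_init
    for _ in range(max_rounds):
        X = solve_lasso(rho).copy()                  # Step 1
        for n in range(N):
            for tau in range(G):
                row = n * G + tau
                if np.linalg.norm(X[row]) ** 2 / M < zeta * alphas[n]:
                    X[row] = 0.0                     # Step 2, cf. (26)
        # Constraint (14): at most one active delay pattern per device
        ok = all(np.count_nonzero(np.linalg.norm(X[n*G:(n+1)*G], axis=1)) <= 1
                 for n in range(N))
        if ok:
            return X, rho
        rho *= delta                                 # Step 3
    return X, rho
```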

Remark 2.

One main issue under the conventional LASSO technique is how to select a proper value of $\rho$ such that the resulting LASSO problem is a good approximation of the original problem. For our problem of interest (13), the constraint (14) enables an accurate stopping criterion for updating $\rho$ in Step 3 of Algorithm 2. This is another advantage of using LASSO in this work, in addition to the closed-form solution for any given $\rho$ shown in Theorem 1.

6 Numerical Results

In this section, we provide numerical examples to verify the effectiveness of our proposed algorithm for detecting the active devices and estimating their delay and channels in asynchronous IoT systems. We assume that there are $N=100$ IoT devices located in a cell with a radius of 250 m, and at each coherence block, only $K=10$ of them become active. Moreover, the maximum delay of all the devices is $\tau_{\max}=5$ symbols. The transmit power of the active devices is 23 dBm. Last, the power spectral density of the AWGN at the BS is $-169$ dBm/Hz, and the channel bandwidth is 10 MHz.
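As a sanity check on this setup, the noise power over the band follows from the stated PSD and bandwidth via the standard dBm conversions (the values below reproduce the simulation parameters, not results):

```python
import math

# Parameters stated in the simulation setup
psd_dbm_per_hz = -169.0     # AWGN power spectral density
bandwidth_hz = 10e6         # channel bandwidth
p_dbm = 23.0                # device transmit power

# Total noise power over the band: -169 dBm/Hz + 10*log10(10^7 Hz) = -99 dBm
noise_dbm = psd_dbm_per_hz + 10.0 * math.log10(bandwidth_hz)

# Linear-scale powers (in mW), as would be used when simulating (6)
sigma2_mw = 10.0 ** (noise_dbm / 10.0)
p_mw = 10.0 ** (p_dbm / 10.0)
```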

Fig. 1: Sublinear convergence rate of Algorithm 1.

First, we provide a numerical example to verify the convergence property of Algorithm 1, where the BS has $M=128$ antennas and the pilot sequence length is $L=20$. Fig. 1 shows the relative gap between the objective value of problem (15) achieved at each iteration of Algorithm 1 and the optimal value of problem (15), i.e., $(\Gamma^{(t)}-\Gamma^{\ast})/\Gamma^{\ast}$. As shown in Theorem 2, the solution generated by Algorithm 1 converges to the optimal solution sublinearly.

Fig. 2: Detection error probability versus $L$.

Next, we show the performance of the proposed algorithm by Monte Carlo simulation. Specifically, we generate $10^{4}$ realizations of device activity, location, and channels. Moreover, for each realization, if the detection of some $\beta_{n,\tau}$ is wrong, then we declare that this realization is under detection error. The overall detection error probability is defined as the ratio between the number of realizations under detection error and the total number of realizations, i.e., $10^{4}$. Similar to [3], the missed detection/false alarm probability is defined as the probability that an active/inactive device is detected as an inactive/active device. Fig. 2 shows the overall detection error probability and the missed detection probability (no false alarm events happen over the $10^{4}$ realizations) achieved by our proposed algorithm when $L$ ranges from 10 to 25 and $M=128$ or $32$. First, it is observed that the missed detection and false alarm probabilities for device activity detection are very low; e.g., when $M=128$ and $L\geq 15$, no missed detection or false alarm events are observed over the $10^{4}$ realizations. Next, it is observed that when $L$ is small, the overall detection error probability is high. This indicates that although the active devices can be detected, their delay estimation is in error with high probability. Note, however, that the priority of device activity detection is much higher than that of delay estimation. Moreover, delay estimation becomes more accurate as $L$ increases. Last, it is observed that massive MIMO is powerful in decreasing the detection error probability.

7 Conclusion

In this paper, we showed that the problem of jointly detecting the active devices and estimating their delay and channels in asynchronous mMTC systems can be formulated as a group LASSO problem. Utilizing the BCD technique, we proposed an efficient algorithm to solve the group LASSO problem, whose complexity is shown to be linearly proportional to the number of devices and the number of antennas at the BS. Future work may consider how to apply the covariance-based device detection approach [13, 14] in asynchronous IoT systems.

Appendix

A: Proof of Theorem 1

Given any $\bar{n}=1,\ldots,N$ and $\bar{\tau}=0,\ldots,\tau_{\max}$, define

f_{\bar{n},\bar{\tau}}(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})=0.5\big\|\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}-\sqrt{p}\,\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})^{T}\big\|_{\rm F}^{2}+\rho\big\|\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\big\|_{2}. \qquad (27)

Since $\|\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}-\sqrt{p}\,\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})^{T}\|_{F}^{2}$ is a strongly convex function and $\rho\|\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\|_{2}$ is a convex function, $f_{\bar{n},\bar{\tau}}(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})$ is a strongly convex function.

Next, we derive the optimal solution that minimizes $f_{\bar{n},\bar{\tau}}(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})$. It is observed that $f_{\bar{n},\bar{\tau}}(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})$ is differentiable when $\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\neq\bm{0}$, but not differentiable when $\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}=\bm{0}$. Moreover, when $\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\neq\bm{0}$, the gradient of $f_{\bar{n},\bar{\tau}}(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})$ is

$\big[\nabla f_{\bar{n},\bar{\tau}}(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})\big]^{T}=-\sqrt{p}\big(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}+pL\big(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{T}+\frac{\rho\big(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{T}}{\big\|\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\big\|_{2}}, \quad (28)$

while when $\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}=\bm{0}$, the sub-gradient of $f_{\bar{n},\bar{\tau}}(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})$ is

$\big[\partial f_{\bar{n},\bar{\tau}}(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})\big]^{T}=-\sqrt{p}\big(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}+pL\big(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{T}+\rho\,\partial\big\|\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\big\|_{2}. \quad (29)$

Since $f_{\bar{n},\bar{\tau}}(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})$ is a strongly convex function of $\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}$, a point $\hat{\bm{x}}_{\bar{n},\bar{\tau}}^{\rm ext}$ minimizes this function if and only if $\bm{0}$ is a sub-gradient of $f_{\bar{n},\bar{\tau}}(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})$ at this point, i.e.,

$\bm{0}\in\partial f_{\bar{n},\bar{\tau}}(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})\big|_{\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}=\hat{\bm{x}}_{\bar{n},\bar{\tau}}^{\rm ext}}. \quad (30)$

According to (28) and (29), we study the sub-gradient of the function $f_{\bar{n},\bar{\tau}}(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})$ in two cases: $\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\neq\bm{0}$ and $\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}=\bm{0}$.

First, consider the case when $\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\neq\bm{0}$. To make $\nabla f_{\bar{n},\bar{\tau}}(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})=\bm{0}$, according to (28), we must have

$-\sqrt{p}\big(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}+pL\big(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{T}+\frac{\rho\big(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{T}}{\big\|\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\big\|_{2}}=\bm{0}. \quad (31)$

It then follows that

$\big(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{T}=\gamma_{\bar{n},\bar{\tau}}\big(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}, \quad (32)$

where $\gamma_{\bar{n},\bar{\tau}}=\sqrt{p}\big/\big(pL+\rho/\|\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\|_{2}\big)>0$. As a result, $(\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext})^{T}$ must be a positive multiple of the vector $(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext})^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}$. The remaining job is to find the value of $\gamma_{\bar{n},\bar{\tau}}$ such that (31) holds. By substituting (32) into (31), it follows that

$-\sqrt{p}\big(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}+pL\gamma_{\bar{n},\bar{\tau}}\big(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}+\frac{\rho\gamma_{\bar{n},\bar{\tau}}\big(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}}{\big\|\gamma_{\bar{n},\bar{\tau}}\big(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}\big\|_{2}}$
$=-\sqrt{p}\big(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}+pL\gamma_{\bar{n},\bar{\tau}}\big(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}+\frac{\rho\gamma_{\bar{n},\bar{\tau}}\big(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}}{\gamma_{\bar{n},\bar{\tau}}\big\|\big(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}\big\|_{2}} \quad (33)$
$=\left(-\sqrt{p}+pL\gamma_{\bar{n},\bar{\tau}}+\frac{\rho}{\big\|\big(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}\big\|_{2}}\right)\big(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}} \quad (34)$
$=\bm{0}, \quad (35)$

where (33) holds because $\gamma_{\bar{n},\bar{\tau}}>0$. Note that (34) admits a unique positive solution $\gamma_{\bar{n},\bar{\tau}}$ if $\|(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext})^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}\|_{2}>\frac{\rho}{\sqrt{p}}$, and this solution is given by (21) in Theorem 1. Therefore, if $\|(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext})^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}\|_{2}>\frac{\rho}{\sqrt{p}}$, then the solution given in (32) and (21) minimizes the objective function of problem (16).

Next, consider the case when $\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}=\bm{0}$. In this case, according to (29), $\bm{0}$ minimizes problem (16) if

$-\sqrt{p}\big(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext}\big)^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}+\rho\left[\partial\big\|\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\big\|_{2}\big|_{\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}=\bm{0}}\right]^{T}=\bm{0}. \quad (36)$

Note that $\partial\|\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}\|_{2}\big|_{\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}=\bm{0}}=\{\bm{g}:\|\bm{g}\|_{2}\leq 1\}$. As a result, (36) holds if and only if $\|(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext})^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}\|_{2}\leq\rho/\sqrt{p}$. In this case, $\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}=\bm{0}$ minimizes the objective function of problem (16).

To summarize, the optimal solution to problem (16) is given by (20). Theorem 1 is thus proved.
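In other words, the closed-form block solution is a group soft-thresholding operation: the block is set to $\bm{0}$ when $\|(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext})^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}\|_{2}\leq\rho/\sqrt{p}$, and is otherwise a positive multiple of $(\bm{a}_{\bar{n},\bar{\tau}}^{\rm ext})^{H}\tilde{\bm{Y}}_{\bar{n},\bar{\tau}}$. The following minimal numerical sketch of this closed form uses generic random data and assumes $\|\bm{a}\|_{2}^{2}=L$ (consistent with the $pL$ term in (28)); it is an illustration, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
L, M, p, rho = 12, 4, 2.0, 0.5

# random complex stand-in for the preamble vector, scaled so ||a||_2^2 = L
a = rng.standard_normal(L) + 1j * rng.standard_normal(L)
a *= np.sqrt(L) / np.linalg.norm(a)
Y = rng.standard_normal((L, M)) + 1j * rng.standard_normal((L, M))

def f(x):
    """Block objective: 0.5 * ||Y - sqrt(p) a x^T||_F^2 + rho * ||x||_2."""
    return 0.5 * np.linalg.norm(Y - np.sqrt(p) * np.outer(a, x)) ** 2 \
        + rho * np.linalg.norm(x)

v = a.conj() @ Y                # (a)^H Y, a length-M complex vector
r = np.linalg.norm(v)
if r > rho / np.sqrt(p):
    # unique positive root of (34): gamma = (sqrt(p) - rho/r) / (pL)
    gamma = (np.sqrt(p) - rho / r) / (p * L)
    x_hat = gamma * v           # nonzero block: scaled copy of (a)^H Y
else:
    x_hat = np.zeros(M, dtype=complex)  # block thresholded to zero
```

Since the block objective is strongly convex, `x_hat` should attain a value of `f` no larger than that at any nearby perturbation, which gives a simple numerical check of the derivation.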

B: Proof of Theorem 2

Here, we provide a brief proof of Theorem 2. First, the proof of the global optimality of Algorithm 1 is based on [15, Theorem 2]. In particular, we need to check the following three conditions: 1) We set the upper bound function (for each block) in [15] to be the objective function in (16). Then, all conditions in Assumption 2 in [15] hold automatically. 2) It follows from Theorem 1 that the objective function in problem (16) is strongly convex and hence the solution to problem (16) is unique. 3) Because the nonsmooth term is decoupled among different blocks, the objective function in problem (15) is indeed regular. Combining 1), 2), and 3) together, it follows from [15, Theorem 2] that every limit point of the iterates generated by Algorithm 1 is a stationary point of problem (15). Moreover, since problem (15) is convex, any stationary point is also a global solution. As a result, Algorithm 1 solves problem (15) globally. Moreover, the sublinear convergence rate shown in (22) directly follows from [16, Theorem 2].
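To illustrate the monotone convergence behavior guaranteed above, the BCD iteration can be sketched on a generic group-LASSO instance. Everything below is a hedged toy example: the matrix `A` is a hypothetical random stand-in for the delay-expanded preamble matrix, with columns scaled so that each block subproblem matches the form of problem (16) with $\|\bm{a}\|_{2}^{2}=L$.

```python
import numpy as np

rng = np.random.default_rng(2)
L, M, N, p, rho = 16, 4, 6, 1.0, 1.0

# hypothetical stand-in for the delay-expanded preamble matrix
A = rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N))
A *= np.sqrt(L) / np.linalg.norm(A, axis=0)   # ||a_n||_2^2 = L for every n
Y = rng.standard_normal((L, M)) + 1j * rng.standard_normal((L, M))

def objective(X):
    """Group LASSO: 0.5 * ||Y - sqrt(p) A X||_F^2 + rho * sum_n ||x_n||_2."""
    return 0.5 * np.linalg.norm(Y - np.sqrt(p) * A @ X) ** 2 \
        + rho * np.sum(np.linalg.norm(X, axis=1))

X = np.zeros((N, M), dtype=complex)
obj_hist = [objective(X)]
for _ in range(50):                            # BCD sweeps over all blocks
    for n in range(N):
        # residual after removing block n's contribution
        R = Y - np.sqrt(p) * np.delete(A, n, axis=1) @ np.delete(X, n, axis=0)
        v = A[:, n].conj() @ R
        r = np.linalg.norm(v)
        if r > rho / np.sqrt(p):               # closed-form block update
            X[n] = (np.sqrt(p) - rho / r) / (p * L) * v
        else:
            X[n] = 0                           # block thresholded to zero
    obj_hist.append(objective(X))
```

Because each block is minimized exactly with the others held fixed, the objective sequence is non-increasing, mirroring the convergence result of Theorem 2.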

C: Proof of Theorem 3

First, the complexity of solving problem (16) for any block $\bm{x}_{\bar{n},\bar{\tau}}^{\rm ext}$ based on (20) is ${\cal O}((L+\tau_{\max})M)$. Since there are $(\tau_{\max}+1)N$ blocks, the total complexity of each iteration of Algorithm 1, in which all blocks are updated once, is ${\cal O}((L+\tau_{\max})\tau_{\max}MN)$. According to Theorem 2, obtaining an $\epsilon$-optimal solution of problem (15) requires a number of iterations on the order of $1/\epsilon$. Thus, the total complexity of Algorithm 1 to find an $\epsilon$-optimal solution is given by (23).

References

  • [1] L. Liu, E. G. Larsson, W. Yu, P. Popovski, C. Stefanovic, and E. de Carvalho, “Sparse signal processing for grant-free massive connectivity: A future paradigm for random access protocols in the Internet of Things,” IEEE Signal Process. Mag., vol. 35, no. 5, pp. 88–99, Sep. 2018.
  • [2] X. Chen, D. W. K. Ng, W. Yu, E. G. Larsson, N. Al-Dhahir, and R. Schober, “Massive access for 5G and beyond,” to appear in IEEE J. Sel. Areas Commun., 2021. [Online] Available: https://arxiv.org/abs/2002.03491.
  • [3] L. Liu and W. Yu, “Massive connectivity with massive MIMO-Part I: Device activity detection and channel estimation,” IEEE Trans. Signal Process., vol. 66, no. 11, pp. 2933–2946, Jun. 2018.
  • [4] Z. Chen, F. Sohrabi, and W. Yu, “Sparse activity detection for massive connectivity,” IEEE Trans. Signal Process., vol. 66, no. 7, pp. 1890–1904, Apr. 2018.
  • [5] D. L. Donoho, A. Maleki, and A. Montanari, “Message-passing algorithms for compressed sensing,” Proc. Nat. Acad. Sci., vol. 106, no. 45, pp. 18914–18919, Nov. 2009.
  • [6] K. Senel and E. G. Larsson, “Grant-free massive MTC-enabled massive MIMO: A compressive sensing approach,” IEEE Trans. Commun., vol. 66, no. 12, pp. 6164–6175, Dec. 2018.
  • [7] T. Jiang, Y. Shi, J. Zhang, and K. B. Letaief, “Joint activity detection and channel estimation for IoT networks: Phase transition and computation-estimation tradeoff,” IEEE Internet of Things J., vol. 6, no. 4, pp. 6212–6225, Aug. 2018.
  • [8] M. Ke, Z. Gao, Y. Wu, X. Gao, and R. Schober, “Compressive sensing-based adaptive active user detection and channel estimation: Massive access meets massive MIMO,” IEEE Trans. Signal Process., vol. 68, pp. 764–779, 2020.
  • [9] T. Ding, X. Yuan, and S. C. Liew, “Sparsity learning-based multiuser detection in grant-free massive-device multiple access,” IEEE Trans. Wireless Commun., vol. 18, no. 7, pp. 3569–3582, Jul. 2019.
  • [10] Z. Sun, Z. Wei, L. Yang, J. Yuan, X. Cheng, and L. Wan, “Exploiting transmission control for joint user identification and channel estimation in massive connectivity,” IEEE Trans. Commun., vol. 67, no. 9, pp. 6311–6326, Sep. 2019.
  • [11] M. Yuan and Y. Lin, “Model selection and estimation in regression with grouped variables,” J. Royal Statistical Society: Series B, vol. 68, no. 1, pp. 49–67, 2006.
  • [12] D. P. Bertsekas, Nonlinear Programming, Athena Scientific, Belmont, MA, U.S.A., 2nd ed. edition, 1999.
  • [13] S. Haghighatshoar, P. Jung, and G. Caire, “Improved scaling law for activity detection in massive MIMO systems,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Jun. 2018, pp. 381–385.
  • [14] Z. Chen, F. Sohrabi, Y.-F. Liu, and W. Yu, “Covariance based joint activity and data detection for massive random access with massive MIMO,” in Proc. IEEE Int. Conf. Commun. (ICC), May 2019.
  • [15] M. Razaviyayn, M. Hong, and Z.-Q. Luo, “A unified convergence analysis of block successive minimization methods for nonsmooth optimization,” SIAM J. Optim., vol. 23, no. 2, pp. 1126–1153, 2013.
  • [16] M. Hong, X. Wang, M. Razaviyayn, and Z.-Q. Luo, “Iteration complexity analysis of block coordinate descent methods,” Math. Program., vol. 163, no. 1-2, pp. 85–114, Aug. 2017.