An Efficient Algorithm for Device Detection and Channel Estimation in Asynchronous IoT Systems
Abstract
Considerable effort has recently been devoted to the joint device activity detection and channel estimation problem in massive machine-type communications. This paper targets two practical issues along this line that have not been addressed before: asynchronous transmission from uncoordinated users and efficient algorithms for real-time implementation in systems with a massive number of devices. Specifically, this paper considers a practical system where the preamble sent by each active device is delayed by some unknown number of symbols due to the lack of coordination. We cast the problem of detecting the active devices and estimating their delays and channels as a group LASSO problem. A block coordinate descent algorithm is then proposed to solve this problem globally; thanks to the special structure of the problem, each block of variables can be updated in closed form while the other blocks are held fixed. Our analysis shows that the overall complexity of the proposed algorithm is low, making it suitable for real-time applications.
Index Terms— Massive machine-type communication, compressed sensing, asynchronous detection.
1 Introduction
Driven by the rapid advance of the Internet of Things (IoT), massive machine-type communications (mMTC), whose purpose is to provide wireless connectivity to a vast number of IoT devices, have attracted increasing attention recently. To reduce the device access delay, a grant-free random access scheme was advocated in [1], where each active device first transmits its preamble to the base station (BS) and then directly transmits its data without waiting for a grant from the BS. To enable this low-latency access scheme, the BS should be able to detect the active devices and estimate their channels based on the received preambles [1, 2]. Recently, [3, 4] showed that joint device activity detection and channel estimation can be formulated as a compressed sensing problem because of the sparse device activity. Such a problem is then solved via the approximate message passing (AMP) algorithm [5] in [3, 4, 6] or via other sparse optimization techniques in [7, 8, 9, 10].
To practically implement joint device activity detection and channel estimation under the grant-free random access scheme, two crucial issues must be addressed. First, in a practical mMTC system consisting of a large number of low-cost IoT devices, it is impossible to ensure that all the active devices are perfectly synchronized. Consequently, the preamble sequence sent by each active device may be delayed by some unknown number of symbols at the beginning of each coherence block. In this case, it is unknown whether the device detection and channel estimation problem can still be solved under the compressed sensing framework, as in the synchronous case [3, 4, 6]. The second challenge is complexity. In mMTC, the number of devices is huge. Moreover, due to the recent success of the massive multiple-input multiple-output (MIMO) technique, the number of antennas at the BS is becoming large as well. In this case, the joint device activity detection and channel estimation problem involves a vast number of unknown variables. It is thus important to develop an efficient algorithm that can be implemented in a practical, large-scale IoT system.
To tackle the above two challenges, this paper proposes a low-complexity algorithm for detecting the active devices and estimating their delays and channels in asynchronous mMTC systems. Specifically, by introducing an enlarged sensing matrix that consists of all the effective preambles of the devices (for each device, each effective preamble is its preamble delayed by a particular number of symbols), we show that the above problem can be cast as a compressed sensing problem, similar to its counterpart in synchronous mMTC systems [1, 3, 4, 6]. To guarantee that at most one effective preamble is detected to be active for each device, the compressed sensing problem is further formulated as a group LASSO problem [11]. We propose a block coordinate descent (BCD) algorithm to solve this problem globally. Thanks to the problem’s special structure, we show that when the BCD algorithm optimizes one block of variables with the other blocks fixed, the optimal solution can be obtained in closed form. Further, the overall complexity of our algorithm is linear in the numbers of devices and antennas at the BS, which makes it appealing for large IoT systems. Last, we remark that the device activity detection problem considered here for asynchronous mMTC systems differs sharply from its information decoding counterpart in asynchronous human-type communication systems, which has been widely studied in the literature, because activity detection and information decoding rely on different techniques.
2 System Model
This paper considers the uplink communication in an mMTC system, which consists of a BS equipped with antennas and single-antenna IoT devices denoted by the set . We assume quasi-static block-fading channels, in which all channels remain approximately constant in each fading block, but vary independently from block to block. The channel from device to the BS is denoted by , . We assume that the device channels follow the independent and identically distributed (i.i.d.) Rayleigh fading channel model, i.e., , , where denotes the path loss of device .
Due to the sporadic IoT data traffic, only some of the devices become active within each channel coherence block. We thus define the device activity indicator functions as follows:
(3) |
Then, the set of active devices is defined by .
It is assumed that the two-phase grant-free random access scheme [1] is adopted for the considered system, where each active device first sends its preamble sequence to the BS for device activity detection and channel estimation, and then sends its data to the BS for decoding. This paper focuses on the first phase of this scheme. In this phase, each user is assigned a unique preamble sequence denoted by , , where denotes the length of the preamble sequence, and with denotes the -th preamble symbol of device , .
At the beginning of each coherence block, all the active users attempt to transmit their preambles to the BS. However, due to the lack of perfect coordination among a large number of low-cost devices, the preamble transmissions are in general asynchronous. Let denote the discrete delay (in terms of symbols) of user for transmitting , . It is assumed that in each coherence block, the preamble transmission delay for each device is a random integer in the range , , where denotes the maximum delay of all the devices over all the coherence blocks. Moreover, is assumed to be known by the BS.
Since each active device starts to transmit its preamble at the -th symbol in the coherence block, we define the effective preamble sequence for device as
(4) |
where denotes the -th transmit symbol of device given a delay of symbols, . Note that if or . Moreover, under the two-phase grant-free random access scheme, after transmitting the pilot in Phase I, each active device should wait for symbols before transmitting its data, so that the pilots received at the BS in the first time slots do not suffer interference from the data. Then, at time slot , the received signal at the BS consists solely of pilot contributions and is given by
(5) |
where denotes the identical transmit power of all the devices, and denotes the additive white Gaussian noise (AWGN) at the BS. Further, the overall received signal at the BS over all the time slots, denoted by , is given by
(6) |
where , with , , and . The goal of the BS is to estimate the device activity ’s and delay ’s as well as active devices’ channels ’s based on its received signal given in (6) and its knowledge of the preamble sequences ’s.
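To make the signal model concrete, the following NumPy sketch builds the zero-padded, delayed preamble of each active device and superimposes the contributions at the BS. It is only an illustration under stated assumptions: the function names, the argument names (`tau_max`, `p`, `noise_var`), and the array layout (time slots as rows, antennas as columns) are hypothetical choices and are not taken from the paper.

```python
import numpy as np

def effective_preamble(a, tau, tau_max):
    """Delay preamble `a` by `tau` symbols and zero-pad to length L + tau_max."""
    out = np.zeros(len(a) + tau_max, dtype=complex)
    out[tau:tau + len(a)] = a
    return out

def simulate_received(A, H, active, delays, tau_max, p, noise_var, rng):
    """Received signal over L + tau_max time slots (rows) and M antennas (columns).

    A: (N, L) preambles, H: (N, M) channels, active: (N,) booleans,
    delays: (N,) integers in [0, tau_max], p: common transmit power.
    """
    N, L = A.shape
    M = H.shape[1]
    Y = np.zeros((L + tau_max, M), dtype=complex)
    for n in range(N):
        if active[n]:
            a_eff = effective_preamble(A[n], delays[n], tau_max)
            # Each active device contributes a rank-one term: delayed pilot times channel.
            Y += np.sqrt(p) * np.outer(a_eff, H[n])
    # Circularly symmetric complex Gaussian noise with variance noise_var per entry.
    noise = rng.standard_normal(Y.shape) + 1j * rng.standard_normal(Y.shape)
    return Y + np.sqrt(noise_var / 2) * noise
```

Inactive devices contribute nothing, so with `noise_var=0` the rows of the output outside an active device's transmission window are exactly zero when only that device is active.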
3 A Compressed Sensing Problem Formulation for Estimating
In this section, we show that the detection of active devices as well as the estimation of their delay and channels can be cast into a compressed sensing problem. Specifically, define
(7) |
as the collection of the possible effective preamble sequences of all the devices. Moreover, define the indicator functions of device activity and delay as follows:
(10) |
In other words, only if device is active and its delay is of symbol duration. Note that if device is active, only one of ’s, , is equal to 1, i.e., , . Then, (6) can be reformulated as
(11) |
where with
(12) |
Suppose that can be estimated according to (11). If , i.e., , device is active, i.e., , and its delay and channel are and . If , , i.e., , , device is inactive, i.e., . Thus, the key of the joint estimation of device activity, delay, and channels lies in estimating based on (11).
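The construction above can be sketched in a few lines of NumPy. The column ordering (all delays of device 0 first, then device 0's successor, and so on), the function names, and the tolerance `tol` are assumptions made for illustration; the sketch also picks the strongest row per device, which is one possible tie-breaking rule rather than something stated in the text.

```python
import numpy as np

def sensing_matrix(A, tau_max):
    """Stack every delayed copy of every preamble as a column.

    A: (N, L) preambles. Column n * (tau_max + 1) + tau holds the preamble
    of device n delayed by tau symbols and zero-padded to length L + tau_max.
    """
    N, L = A.shape
    S = np.zeros((L + tau_max, N * (tau_max + 1)), dtype=complex)
    for n in range(N):
        for tau in range(tau_max + 1):
            S[tau:tau + L, n * (tau_max + 1) + tau] = A[n]
    return S

def decode_rows(X, N, tau_max, tol=1e-9):
    """Read activity and delay off the row support of X.

    Returns {device: delay} for every device with a non-zero row,
    choosing the strongest row if several are non-zero.
    """
    power = np.linalg.norm(X, axis=1).reshape(N, tau_max + 1)
    result = {}
    for n in range(N):
        tau = int(np.argmax(power[n]))
        if power[n, tau] > tol:
            result[n] = tau
    return result
```

With this layout, a matrix `X` that is zero except for the row of device `n` at delay `tau` reproduces exactly that device's delayed pilot times its channel when multiplied by the sensing matrix.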
Note that estimating based on (11) is a compressed sensing problem, since is a row-sparse matrix according to (10) and (12). In this paper, the compressed sensing problem of estimating is formulated as follows:
(13) | ||||
(14) |
where denotes the Frobenius norm of matrix , i.e., , and denotes the zero norm of vector , i.e., the number of non-zero elements in .
In the above problem, constraint (14) is to guarantee that at most one delay pattern is detected to be active for each device . Mathematically, (14) imposes a group sparsity constraint on the structure of : in each block consisting of vectors , at most one of them is a non-zero vector. However, this constraint is non-convex. In the rest of this paper, we adopt the group LASSO technique to deal with this non-convex group sparsity constraint.
Under the group LASSO technique, given any coefficient , we are interested in the following convex problem [11]
(15) |
In problem (15), we penalize the estimation error with a mixed norm, i.e., . Note that this penalty is minimized when all the zero entries are put together in some rows of . As a result, given a large value of , the optimal solution of problem (15) should be a row-sparse matrix. Moreover, if is large enough, the corresponding solution will be sufficiently sparse, and therefore, constraint (14) in problem (13) can be satisfied.
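The effect of the mixed norm can be checked numerically. The sketch below evaluates a group LASSO objective of the common form "half the squared Frobenius fitting error plus a weighted sum of row-wise 2-norms"; the factor of one half, the name `lam` for the penalty coefficient, and the function name are assumptions, and the exact scaling used in problem (15) may differ.

```python
import numpy as np

def group_lasso_objective(Y, S, X, lam, p=1.0):
    """0.5 * ||Y - sqrt(p) S X||_F^2 + lam * (sum of the 2-norms of the rows of X)."""
    residual = Y - np.sqrt(p) * (S @ X)
    fit = 0.5 * np.linalg.norm(residual, "fro") ** 2
    penalty = lam * np.linalg.norm(X, axis=1).sum()
    return fit + penalty

# Two matrices with the same entries' total energy (Frobenius norm 5):
X_one_row = np.array([[3.0, 4.0], [0.0, 0.0]])   # energy concentrated in one row
X_two_rows = np.array([[3.0, 0.0], [0.0, 4.0]])  # energy spread over two rows
print(np.linalg.norm(X_one_row, axis=1).sum())   # row-norm sum of the concentrated matrix
print(np.linalg.norm(X_two_rows, axis=1).sum())  # row-norm sum of the spread matrix
```

The concentrated matrix has the smaller row-norm sum, which is why a larger penalty coefficient pushes the solution towards having entire rows equal to zero.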
In the following two sections, we will introduce how to solve problem (15) efficiently given and how to select a proper value of so as to balance between activity sparsity and channel estimation error, respectively.
4 An Efficient BCD Algorithm for Problem (15)
BCD-type algorithms are efficient in solving large-scale optimization problems with a vast number of variables [12]. In this section, we introduce a low-complexity BCD algorithm to solve problem (15) given any .
4.1 Algorithm Design
Under the BCD algorithm, at each time, we merely optimize one vector for some particular and given ’s, . The corresponding optimization problem is formulated as
(16) |
where
(17) |
Somewhat surprisingly, we can obtain the closed-form optimal solution of problem (16), as shown in the following theorem.
Theorem 1.
The objective function in problem (16) is strongly convex, and its global minimum is achieved by
(20) |
where
(21) |
Proof.
Please refer to Appendix A. ∎
The optimal solution (20) in Theorem 1 indicates that the BS should keep applying the matched filters ’s to denoise ’s. Then, if the power of some resulting signal, i.e., , is larger than a threshold , then the estimation of is a non-zero vector. Otherwise, the estimation of is a zero vector. This implies that the solution to problem (15) is more sparse as increases.
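The matched-filter-then-threshold behaviour described above is the familiar group soft-thresholding operator. The following sketch implements it for a single block under an assumed objective of the form 0.5·‖R − c xᵀ‖_F² + lam·‖x‖₂, where `R` is the residual with the current block's contribution added back and `c` is the power-scaled effective preamble; these names and the one-half scaling are assumptions, so the threshold and scaling in (20) and (21) may be expressed differently.

```python
import numpy as np

def block_update(R, c, lam):
    """Minimise 0.5 * ||R - c x^T||_F^2 + lam * ||x||_2 over the vector x.

    R: (T, M) residual with this block's own contribution added back,
    c: (T,) scaled effective preamble, lam: non-negative penalty weight.
    """
    g = R.T @ np.conj(c)             # matched-filter output, shape (M,)
    g_norm = np.linalg.norm(g)
    if g_norm <= lam:                # filtered signal too weak: block is set to zero
        return np.zeros_like(g)
    shrink = 1.0 - lam / g_norm      # group soft-thresholding factor in (0, 1]
    return shrink * g / np.vdot(c, c).real
```

When `lam` is zero the update reduces to the least-squares estimate of the block, and as `lam` grows more blocks fall below the threshold and are zeroed, consistent with the sparsity behaviour noted above.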
Remark 1.
In general, a group LASSO problem can only be solved numerically. However, under the BCD framework, the sensing matrix reduces to a vector in problem (16) to optimize . In this case, there is a closed-form solution, which helps reduce the computational complexity in mMTC.
Based on Theorem 1, the BCD algorithm to solve problem (15) is summarized in Algorithm 1. Algorithm 1 is an iterative algorithm. At each outer iteration of the algorithm, we first optimize given ’s, , and then optimize given ’s, , and so on, as shown in Step 2.1 to Step 2.5. When ’s are all optimized once, we can calculate the objective value of problem (15) achieved after the -th iteration of the algorithm, denoted by as shown in Step 3. The algorithm terminates when the objective value of problem (15) does not decrease sufficiently over two iterations.
Initialization: Set the initial values of ’s as , , , , where is the received signal given in (6), and ;
Repeat:
-
1
Set ;
- 2
-
3
Set and ;
Until , where is a small positive number.
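A minimal sketch of such a cyclic BCD loop is given below. It assumes the objective 0.5·‖Y − S X‖_F² + lam·Σ‖row of X‖₂ with the transmit-power scaling folded into the sensing matrix `S`, starts from the all-zero estimate, and stops on a relative decrease of the objective; the function name, `tol`, `max_iter`, and the exact form of the stopping rule are assumptions rather than the paper's specification. Every column of `S` is assumed to be non-zero.

```python
import numpy as np

def bcd_group_lasso(Y, S, lam, tol=1e-6, max_iter=500):
    """Cyclic BCD for 0.5 * ||Y - S X||_F^2 + lam * sum_k ||X[k]||_2.

    Y: (T, M) received signal, S: (T, K) sensing matrix.
    Returns the estimate X of shape (K, M) and the objective after each sweep.
    """
    K = S.shape[1]
    X = np.zeros((K, Y.shape[1]), dtype=complex)
    R = Y.astype(complex)                        # residual Y - S X for X = 0 (a copy)
    col_energy = np.sum(np.abs(S) ** 2, axis=0)  # squared norm of each column
    history = []
    for _ in range(max_iter):
        for k in range(K):
            c = S[:, k]
            # Matched filter applied to the residual with block k added back.
            g = R.T @ np.conj(c) + col_energy[k] * X[k]
            g_norm = np.linalg.norm(g)
            if g_norm <= lam:
                new = np.zeros_like(g)
            else:
                new = (1.0 - lam / g_norm) * g / col_energy[k]
            R += np.outer(c, X[k] - new)         # keep the residual in sync with X
            X[k] = new
        obj = 0.5 * np.linalg.norm(R) ** 2 + lam * np.linalg.norm(X, axis=1).sum()
        converged = bool(history) and history[-1] - obj <= tol * max(history[-1], 1e-12)
        history.append(obj)
        if converged:
            break
    return X, history
```

Maintaining the residual incrementally means each block update costs on the order of the residual size, instead of recomputing the full product of the sensing matrix and the estimate at every step.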
4.2 Algorithm Properties
After introducing how the BCD algorithm works, in this subsection, we present some theoretical properties of this algorithm about its optimality and complexity.
Theorem 2.
Every limit point of the iterates generated by Algorithm 1 is a global solution of problem (15). Moreover, for all sufficiently large , it follows that
(22) |
where , as given in Step 3 of Algorithm 1, is the objective value of problem (15) at the -th iteration of Algorithm 1, and is the optimal value of problem (15).
Proof.
Please refer to Appendix B. ∎
Theorem 3.
Proof.
Please refer to Appendix C. ∎
5 The Approach to Select
After solving problem (15) with any given , we introduce in this section how to determine the value of such that the solution to problem (15) is a good solution to problem (13). In this paper, we update the value of iteratively. Specifically, at the beginning, we set an initial value to as and solve problem (15) via Algorithm 1. Then, we keep updating as , where , and solving problem (15) iteratively until for some large enough value of , the solution to problem (15) satisfies constraint (14) in problem (13). The overall algorithm to solve problem (13) via solving a sequence of problem (15) is summarized in Algorithm 2.
Initialization: Set an initial value of as ;
Repeat:
- 1
-
2
Set
(26) where is a given parameter to control the sparsity of .
-
3
Update where .
Until the solution of ’s satisfies constraint (14).
Note that for some inactive devices, the power of the corresponding estimated signals ’s obtained via Algorithm 1 may be very weak but non-zero. This causes the so-called false alarm event, i.e., an inactive device is detected as an active device. To enhance the sparsity of the estimation of and reduce the false alarm probability, after the convergence of Algorithm 1 in Step 1 of Algorithm 2, we set if is less than some threshold in Step 2 of Algorithm 2.
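The outer loop can be sketched as follows: solve the group LASSO problem, zero out weak rows, test whether each device has at most one non-zero delay hypothesis, and otherwise enlarge the penalty coefficient and repeat. The geometric growth rule, the argument names (`lam0`, `growth`, `threshold`, `max_rounds`), and the `solve` callable standing in for the inner solver are all assumptions made for illustration; the row layout assumes the delay hypotheses of each device occupy consecutive rows.

```python
import numpy as np

def prune_weak_rows(X, threshold):
    """Return a copy of X with every row of 2-norm below `threshold` set to zero."""
    X = X.copy()
    X[np.linalg.norm(X, axis=1) < threshold] = 0
    return X

def satisfies_one_delay_per_device(X, N, tau_max):
    """True if each device has at most one non-zero row among its tau_max + 1 delays."""
    nonzero = (np.linalg.norm(X, axis=1) > 0).reshape(N, tau_max + 1)
    return bool(np.all(nonzero.sum(axis=1) <= 1))

def select_lambda(solve, N, tau_max, lam0, growth, threshold, max_rounds=50):
    """Increase lam geometrically until the pruned solution meets the constraint.

    solve(lam) must return the (N * (tau_max + 1), M) group LASSO solution.
    Returns the final lam and the pruned solution.
    """
    lam = lam0
    for _ in range(max_rounds):
        X = prune_weak_rows(solve(lam), threshold)
        if satisfies_one_delay_per_device(X, N, tau_max):
            return lam, X
        lam *= growth
    raise RuntimeError("constraint not met within max_rounds")
```

Passing the inner solver as a callable keeps the stopping test independent of how the group LASSO problem is solved, so any solver that returns the row-stacked estimate can be plugged in.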
Remark 2.
One main issue under the conventional LASSO technique is how to select a proper value of such that the resulting LASSO problem is a good approximation of the original problem. For problem (13) considered here, the new constraint (14) enables an accurate stopping criterion for updating in Step 3 of Algorithm 2. This is another advantage of using LASSO in this work, in addition to the closed-form solution given any shown in Theorem 1.
6 Numerical Results
In this section, we provide numerical examples to verify the effectiveness of our proposed algorithm for detecting the active devices and estimating their delay and channels in asynchronous IoT systems. We assume that there are IoT devices located in a cell with a radius of m, and at each coherence block, only of them become active. Moreover, the maximum delay of all the devices is symbols. The transmit power of the active devices is dBm. Last, the power spectral density of the AWGN at the BS is dBm/Hz, and the channel bandwidth is MHz.
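For readers reproducing such a setup, the noise power at the BS follows from the power spectral density and the bandwidth by simple dB arithmetic. The sketch below shows the conversion; the numerical values in it are hypothetical placeholders chosen only for illustration and are not the parameters used in this paper.

```python
import math

def noise_power_dbm(psd_dbm_per_hz, bandwidth_hz):
    """Total AWGN power in dBm over the given bandwidth."""
    return psd_dbm_per_hz + 10.0 * math.log10(bandwidth_hz)

def dbm_to_watt(power_dbm):
    """Convert a power level from dBm to watts."""
    return 10.0 ** ((power_dbm - 30.0) / 10.0)

# Hypothetical values, for illustration only.
psd = -169.0       # noise power spectral density in dBm/Hz
bandwidth = 1e6    # channel bandwidth in Hz
tx_power = 23.0    # transmit power in dBm

noise = noise_power_dbm(psd, bandwidth)
print(noise, tx_power - noise)  # noise power (dBm) and transmit-to-noise ratio (dB), before path loss
```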

First, we provide one numerical example to verify the convergence property of Algorithm 1, where the BS has antennas, and the pilot sequence length is . Fig. 1 shows the relative gap between the objective value of problem (15) achieved at each iteration of Algorithm 1 and the optimal value of problem (15), i.e., . As shown in Theorem 2, the solution generated by Algorithm 1 converges to the optimal solution sublinearly.

Next, we show the performance of the proposed algorithm by Monte Carlo simulation. Specifically, we generate realizations of device activity, location, and channels. Moreover, for each realization, if the detection of some is wrong, then we declare that this realization is in detection error. The overall detection error probability is defined as the ratio between the number of realizations in detection error and the total number of realizations, i.e., . Similar to [3], the missed detection/false alarm probability is defined as the probability that an active/inactive device is detected as an inactive/active device. Fig. 2 shows the overall detection error probability and missed detection probability (no false alarm events occur over the realizations) achieved by our proposed algorithm when ranges from 10 to 25 and or . First, it is observed that the missed detection and false alarm probabilities for device activity detection are very low, e.g., when and , no missed detection or false alarm events are observed over the realizations. Next, it is observed that when is small, the overall detection error probability is high. This indicates that although the active devices can be detected, their delay estimation is in error with high probability. However, the priority of device activity detection is much higher than that of delay estimation. Moreover, delay estimation becomes more accurate as increases. Last, it is observed that massive MIMO is effective in reducing the detection error probability.
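The three error measures defined above can be computed as in the following sketch. It represents the ground truth and the estimate of each realization as dictionaries mapping active device indices to delays; this representation and the function names are assumptions for illustration, and the missed detection and false alarm rates are computed per device, pooled over all realizations.

```python
def detection_metrics(truth, estimate):
    """Outcome of one realization from true and estimated {device: delay} maps.

    Returns (missed, false_alarms, any_error): the number of active devices
    declared inactive, the number of inactive devices declared active, and
    whether any activity or delay decision is wrong.
    """
    missed = sum(1 for n in truth if n not in estimate)
    false_alarms = sum(1 for n in estimate if n not in truth)
    wrong_delay = any(estimate[n] != truth[n] for n in truth if n in estimate)
    return missed, false_alarms, bool(missed or false_alarms or wrong_delay)

def monte_carlo_rates(pairs, num_devices):
    """Aggregate a list of (truth, estimate) pairs into the three probabilities."""
    missed = false_alarms = errors = active = inactive = 0
    for truth, estimate in pairs:
        m, f, err = detection_metrics(truth, estimate)
        missed += m
        false_alarms += f
        errors += err
        active += len(truth)
        inactive += num_devices - len(truth)
    return {
        "detection_error": errors / len(pairs),
        "missed_detection": missed / max(active, 1),
        "false_alarm": false_alarms / max(inactive, 1),
    }
```

A realization with every device's activity detected correctly but one delay estimated wrongly counts towards the overall detection error without affecting the missed detection or false alarm rates, which matches the distinction drawn in the discussion above.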
7 Conclusion
In this paper, we showed that the problem of jointly detecting the active devices and estimating their delays and channels in asynchronous mMTC systems can be formulated as a group LASSO problem. Utilizing the BCD technique, we proposed an efficient algorithm to solve the group LASSO problem, the complexity of which is shown to be linearly proportional to the numbers of devices and antennas at the BS. Future work may consider how to apply the covariance-based device detection approach [13, 14] in asynchronous IoT systems.
Appendix
A: Proof of Theorem 1
Given any and , define
(27) |
Since is a strongly convex function and is a convex function, is a strongly convex function.
Next, we derive the optimal solution to minimize . It is observed that is differentiable when , but not differentiable when . Moreover, when , the gradient of is
(28) |
while when , the sub-gradient of is
(29) |
Since is a strongly convex function over , a point minimizes this function if and only if is a sub-gradient of the function at this point, i.e.,
(30) |
According to (28) and (29), we study the sub-gradient of the function in two cases: and .
First, consider the case when . To make , according to (28), we have
(31) |
It then follows that
(32) |
where . As a result, should be a scalar multiple of the vector . The remaining task is to find the value of such that (31) is true. By substituting into (31), it follows that
(33) | ||||
(34) | ||||
(35) |
where (33) is because . Note that there exists a unique positive solution to (34) if , and this solution is denoted by (21) in Theorem 1. Therefore, if , then the solution given in (32) and (21) minimizes the objective function of problem (16).
Next, consider the case when . In this case, according to (29), minimizes problem (16) if
(36) |
Note that . As a result, (36) is true only if . In this case, minimizes the objective function of problem (16).
B: Proof of Theorem 2
Here, we provide a brief proof of Theorem 2. First, the proof of the global optimality of Algorithm 1 is based on [15, Theorem 2]. In particular, we need to check the following three conditions: 1) We set the upper bound function (for each block) in [15] to be the objective function in (16). Then, all conditions in Assumption 2 in [15] hold automatically. 2) It follows from Theorem 1 that the objective function in problem (16) is strongly convex and hence the solution to problem (16) is unique. 3) Because the nonsmooth term is decoupled among different blocks, the objective function in problem (15) is indeed regular. Combining 1), 2), and 3) together, it follows from [15, Theorem 2] that every limit point of the iterates generated by Algorithm 1 is a stationary point of problem (15). Moreover, since problem (15) is convex, any stationary point is also a global solution. As a result, Algorithm 1 solves problem (15) globally. Moreover, the sublinear convergence rate shown in (22) directly follows from [16, Theorem 2].
C: Proof of Theorem 3
First, the complexity of solving problem (16) for any block based on (20) is . Since there are blocks, the total complexity of each iteration of Algorithm 1 to update all blocks once is . According to Theorem 2, to get an -optimal solution of problem (15), the total number of iterations should be in the order of . Thus, the total complexity of Algorithm 1 to find an -optimal solution is (23).
References
- [1] L. Liu, E. G. Larsson, W. Yu, P. Popovski, C. Stefanovic, and E. de Carvalho, “Sparse signal processing for grant-free massive connectivity: A future paradigm for random access protocols in the Internet of Things,” IEEE Signal Process. Mag., vol. 35, no. 5, pp. 88–99, Sep. 2018.
- [2] X. Chen, D. W. K. Ng, W. Yu, E. G. Larsson, N. Al-Dhahir, and R. Schober, “Massive access for 5G and beyond,” to appear in IEEE J. Sel. Areas Commun., 2021. [Online] Available: https://arxiv.org/abs/2002.03491.
- [3] L. Liu and W. Yu, “Massive connectivity with massive MIMO-Part I: Device activity detection and channel estimation,” IEEE Trans. Signal Process., vol. 66, no. 11, pp. 2933–2946, Jun. 2018.
- [4] Z. Chen, F. Sohrabi, and W. Yu, “Sparse activity detection for massive connectivity,” IEEE Trans. Signal Process., vol. 66, no. 7, pp. 1890–1904, Apr. 2018.
- [5] D. L. Donoho, A. Maleki, and A. Montanari, “Message-passing algorithms for compressed sensing,” Proc. Nat. Acad. Sci., vol. 106, no. 45, pp. 18914–18919, Nov. 2009.
- [6] K. Senel and E. G. Larsson, “Grant-free massive MTC-enabled massive MIMO: A compressive sensing approach,” IEEE Trans. Commun., vol. 66, no. 12, pp. 6164–6175, Dec. 2018.
- [7] T. Jiang, Y. Shi, J. Zhang, and K. B. Letaief, “Joint activity detection and channel estimation for IoT networks: Phase transition and computation-estimation tradeoff,” IEEE Internet of Things J., vol. 6, no. 4, pp. 6212–6225, Aug. 2019.
- [8] M. Ke, Z. Gao, Y. Wu, X. Gao, and R. Schober, “Compressive sensing-based adaptive active user detection and channel estimation: Massive access meets massive MIMO,” IEEE Trans. Signal Process., vol. 68, pp. 764–779, 2020.
- [9] T. Ding, X. Yuan, and S. C. Liew, “Sparsity learning-based multiuser detection in grant-free massive-device multiple access,” IEEE Trans. Wireless Commun., vol. 18, no. 7, pp. 3569–3582, Jul. 2019.
- [10] Z. Sun, Z. Wei, L. Yang, J. Yuan, X. Cheng, and L. Wan, “Exploiting transmission control for joint user identification and channel estimation in massive connectivity,” IEEE Trans. Commun., vol. 67, no. 9, pp. 6311–6326, Sep. 2019.
- [11] M. Yuan and Y. Lin, “Model selection and estimation in regression with grouped variables,” J. Royal Statistical Society: Series B, vol. 68, no. 1, pp. 49–67, 2006.
- [12] D. P. Bertsekas, Nonlinear Programming, 2nd ed. Belmont, MA, U.S.A.: Athena Scientific, 1999.
- [13] S. Haghighatshoar, P. Jung, and G. Caire, “Improved scaling law for activity detection in massive MIMO systems,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Jun. 2018, pp. 381–385.
- [14] Z. Chen, F. Sohrabi, Y.-F. Liu, and W. Yu, “Covariance based joint activity and data detection for massive random access with massive MIMO,” in Proc. IEEE Int. Conf. Commun. (ICC), May 2019.
- [15] M. Razaviyayn, M. Hong, and Z.-Q. Luo, “A unified convergence analysis of block successive minimization methods for nonsmooth optimization,” SIAM J. Optim., vol. 23, no. 2, pp. 1126–1153, 2013.
- [16] M. Hong, X. Wang, M. Razaviyayn, and Z.-Q. Luo, “Iteration complexity analysis of block coordinate descent methods,” Math. Program., vol. 163, no. 1-2, pp. 85–114, Aug. 2017.