Stacked Intelligent Metasurfaces for Multiuser Beamforming in the Wave Domain
Abstract
Reconfigurable intelligent surfaces (RISs) have recently emerged as a promising technology for shaping the wireless environment by leveraging a massive number of low-cost reconfigurable elements. Prior works mainly focus on a single-layer metasurface, which lacks the capability of suppressing multiuser interference. By contrast, we propose a stacked intelligent metasurface (SIM)-enabled transceiver design for multiuser multiple-input single-output downlink communications. Specifically, the SIM has a multilayer structure and is deployed at the base station to perform transmit beamforming directly in the electromagnetic wave domain. As a result, an SIM-enabled transceiver removes the need for digital beamforming and operates with low-resolution digital-to-analog converters and a moderate number of radio-frequency chains. This significantly reduces the hardware cost and energy consumption and, because the processing is performed in the wave domain, substantially decreases the precoding delay. To leverage the benefits of SIM-enabled transceivers, we formulate an optimization problem for maximizing the sum rate of all the users by jointly designing the transmit power allocated to them and the analog beamforming in the wave domain. Numerical results based on a customized alternating optimization algorithm corroborate the effectiveness of the proposed SIM-enabled analog beamforming design as compared with various benchmark schemes. Most notably, the proposed analog beamforming scheme is capable of substantially decreasing the precoding delay compared to its digital counterpart.
Index Terms:
Stacked intelligent metasurface (SIM), analog or wave-based beamforming, power allocation, reconfigurable intelligent surface (RIS).
I Introduction
Reconfigurable intelligent surface (RIS) has recently emerged as a promising technology for effectively improving the spectrum and energy efficiencies of wireless networks [1, 2, 3, 4]. In general, an RIS is made of a programmable metasurface consisting of a large number of low-cost passive elements integrated with low-power electronics, each of which can independently impose a phase shift on the incident waves [5, 6]. By adjusting the phase shifts of all the elements with the aid of a smart controller, an RIS is capable of manipulating the reflected/transmitted waves to shape the wireless channels, thus establishing favorable propagation environments and substantially improving the quality-of-service (QoS) of wireless networks [7, 8, 9, 10, 11, 12, 13].
Specifically, by deploying an RIS, Huang et al. [3] achieved 300% energy efficiency improvement in multiuser multiple-input single-output (MISO) downlink communication systems, as compared with conventional amplify-and-forward relays. Moreover, Wu et al. [5] demonstrated that a dB power gain is attained upon doubling the number of reflecting elements. In order to reduce the excessive pilot signaling overhead required for probing RIS-aided channels with a large number of programmable elements, An et al. [14, 1] proposed a novel codebook-based protocol, striking a beneficial trade-off between the QoS, the pilot overhead, and the computational complexity. However, existing works on RIS-aided wireless systems mostly rely on a single-layer metasurface structure, which severely limits the degrees-of-freedom of the attainable beam patterns [8, 11] and the capability of suppressing the multiuser interference [9].
Recently, a wave-based signal processing paradigm relying on a multilayer metasurface structure has gained prominent attention [15, 16]. Specifically, Lin et al. [15] conceived a diffractive deep neural network (D2NN) architecture, by stacking an array of printed optical lenses, which is capable of performing parallel calculations at the speed of light [15]. Subsequently, Liu et al. [16] designed a programmable D2NN architecture based on multi-layer metasurfaces, where each meta-atom acts as a reprogrammable artificial neuron. The authors demonstrated that the programmable D2NN architecture is able to execute various signal processing tasks, e.g., image classification, by flexibly manipulating the electromagnetic (EM) waves propagating through it [16].
In wireless communications, however, the design of transceivers based on multilayer metasurfaces remains largely unexplored. Motivated by these considerations, we develop a new stacked intelligent metasurface (SIM) enabled wireless transceiver design. As depicted in Fig. 1, by stacking multiple programmable metasurfaces in close proximity to the transmitter, an SIM has a structure that is similar to an artificial neural network and, therefore, offers enhanced signal processing capabilities compared to its single-layer counterpart. Benefiting from these processing capabilities, we consider a scenario in which an SIM is deployed at the base station (BS) to perform multiuser downlink beamforming directly in the EM wave domain. In sharp contrast to the recent multi-layer RIS that requires complex baseband signal processing capabilities and power-hungry radio-frequency (RF) chains at the transceiver [17], the proposed SIM-enabled transceiver completely removes the digital beamforming and only requires low-resolution digital-to-analog converters (DACs) at the BS.
Specifically, we propose a wave-based (analog) beamforming scheme for application to an SIM-assisted multiuser MISO downlink system. First of all, we formulate a sum rate maximization problem by jointly optimizing the transmit power and the wave-based beamforming. Since the transmit power coefficients and the large number of SIM phase shifts are coupled in a non-convex objective function, we propose an effective alternating optimization (AO) algorithm to solve the formulated joint optimization problem. Numerical results demonstrate the effectiveness of the novel wave-based beamforming paradigm and the performance improvement of the proposed AO algorithm against various benchmark schemes.
II System Model
As shown in Fig. 1, we consider an SIM-aided multiuser MISO wireless system, where an SIM is deployed to assist the downlink communication from a BS equipped with antennas to single-antenna users. Specifically, the SIM is composed of metasurface layers, each of which consists of meta-atoms. Moreover, the SIM is attached to a smart controller, which is capable of imposing an independent and adjustable phase shift onto the EM waves transmitted through each meta-atom [8]. By properly adjusting the phase shifts in each metasurface, the SIM implements the downlink beamforming directly in the EM wave domain [16].
Let , , and denote the set of metasurfaces, meta-atoms in each metasurface layer, and users, respectively. Moreover, let with denote the phase shift imposed by the -th meta-atom in the -th metasurface layer. Thus, the diagonal phase shift matrix of the -th metasurface layer is . Furthermore, let be the transmission matrix from the -th to the -th metasurface layer and let denote the transmission vector from the -th transmit antenna to the first metasurface layer of the SIM. According to the Rayleigh-Sommerfeld diffraction theory [15], the -th entry of is given by
(1) |
for , where is the wavelength, denotes the transmission distance, represents the angle between the propagation direction and the normal direction of the -th metasurface layer, and is the size of each meta-atom. Similarly, the -th entry of is obtained from (1). Therefore, the SIM-enabled wave-based beamforming matrix is written as
(2) |
It is worth noting that the inter-layer transmission coefficients may deviate from (1) due to practical hardware imperfections, which, however, can be readily calibrated before the deployment of the SIM [16].
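As an illustrative sketch of how (1) and (2) can be evaluated numerically, the following Python snippet builds the inter-layer transmission matrices and the resulting SIM response. It assumes the commonly used form of the Rayleigh-Sommerfeld coefficient, i.e., (area x cos(angle) / distance) x (1/(2 pi distance) - j/wavelength) x exp(j 2 pi distance / wavelength), and assumes that the SIM response is the ordered product of the per-layer phase shift matrices and the inter-layer transmission matrices, starting from the first layer. The function names, the layer geometry, and all numerical values are hypothetical choices made for illustration only.

```python
import numpy as np

def layer_positions(n_side, spacing, z):
    """Centres of an n_side x n_side planar array placed at height z."""
    idx = (np.arange(n_side) - (n_side - 1) / 2) * spacing
    x, y = np.meshgrid(idx, idx, indexing="ij")
    return np.stack([x.ravel(), y.ravel(), np.full(x.size, z)], axis=1)

def rs_transmission(src, dst, wavelength, atom_area):
    """Assumed Rayleigh-Sommerfeld coefficients from each src point to each dst point."""
    diff = dst[:, None, :] - src[None, :, :]      # shape (n_dst, n_src, 3)
    dist = np.linalg.norm(diff, axis=2)           # transmission distance
    cos_chi = np.abs(diff[:, :, 2]) / dist        # angle w.r.t. the layer normal (z-axis)
    return (atom_area * cos_chi / dist
            * (1 / (2 * np.pi * dist) - 1j / wavelength)
            * np.exp(2j * np.pi * dist / wavelength))

def sim_response(thetas, W_list):
    """Assumed ordering: G = Phi^L W^L ... Phi^2 W^2 Phi^1, with Phi^l = diag(exp(j theta^l))."""
    G = np.diag(np.exp(1j * thetas[0]))
    for theta, W in zip(thetas[1:], W_list):
        G = np.diag(np.exp(1j * theta)) @ W @ G
    return G

# Hypothetical geometry: 4 layers of 3 x 3 meta-atoms, half-wavelength spacing.
wavelength = 0.01
n_side, n_layers = 3, 4
spacing = wavelength / 2
layer_gap = wavelength                            # hypothetical inter-layer spacing
layers = [layer_positions(n_side, spacing, (l + 1) * layer_gap) for l in range(n_layers)]
W_list = [rs_transmission(layers[l - 1], layers[l], wavelength, spacing ** 2)
          for l in range(1, n_layers)]

rng = np.random.default_rng(0)
thetas = rng.uniform(0, 2 * np.pi, size=(n_layers, n_side ** 2))
G = sim_response(thetas, W_list)
```

Because consecutive layers share the same grid in this sketch, each inter-layer matrix is symmetric, which offers a quick sanity check on the geometry.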

As a sensible case study, we consider a quasi-static flat-fading channel model. Specifically, denotes the baseband equivalent channel from the last metasurface layer of the SIM to the -th user, which is modeled as a correlated Rayleigh fading channel, i.e., where is the distance-dependent path loss of the -th SIM-user link, and is the covariance matrix that characterizes the spatial correlation between different meta-atoms. By assuming an isotropic scattering environment with uniformly distributed multipath components, the -th entry of is [18], where is the normalized sinc function, and denotes the spacing between the meta-atoms.
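The channel model above can be sketched as follows. The snippet assumes that the correlation between two meta-atoms equals the normalized sinc function evaluated at twice their spacing divided by the wavelength, which is the isotropic-scattering model of [18], and that a channel realization is obtained by colouring a standard complex Gaussian vector with the matrix square root of the covariance and scaling by the path loss. The function names and numerical values are hypothetical.

```python
import numpy as np

def correlation_matrix(positions, wavelength):
    """Spatial correlation under isotropic scattering: sinc(2 d / wavelength)."""
    d = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=2)
    return np.sinc(2 * d / wavelength)            # np.sinc(x) = sin(pi x) / (pi x)

def sample_channel(R, path_loss, rng):
    """Draw one correlated Rayleigh channel with covariance path_loss * R."""
    vals, vecs = np.linalg.eigh(R)                # R is symmetric positive semidefinite
    sqrt_R = vecs @ np.diag(np.sqrt(np.clip(vals, 0, None))) @ vecs.T
    n = R.shape[0]
    z = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    return np.sqrt(path_loss) * sqrt_R @ z

# Hypothetical 3 x 3 metasurface with half-wavelength spacing.
wavelength = 0.01
spacing = wavelength / 2
grid = (np.arange(3) - 1) * spacing
x, y = np.meshgrid(grid, grid, indexing="ij")
positions = np.stack([x.ravel(), y.ravel()], axis=1)

R = correlation_matrix(positions, wavelength)
h = sample_channel(R, path_loss=1e-6, rng=np.random.default_rng(1))
```

With half-wavelength spacing, adjacent meta-atoms along one axis are uncorrelated under this model, whereas diagonal neighbours are not, so the covariance matrix is generally not the identity.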
In sharp contrast with conventional digital precoding schemes that assign each symbol to an individual beamforming vector, we consider a wave-based beamforming scheme with the aid of the SIM. Accordingly, each data stream is directly transmitted from the BS antennas. In general, the number of BS antennas can be greater than the number of users , and an antenna selection process needs to be carried out in advance [19]. For simplicity, we assume . In the considered case study, each antenna performs the modulation and detection of a single data stream. Therefore, the use of low-resolution ADC/DAC may be a suitable choice, while resulting in acceptable performance losses.
The information symbol intended to the -th user is denoted by , which is assumed to be an independent and identically distributed (i.i.d.) random variable with zero mean and unit variance. Let denote the power allocated to the -th user. Then, the total transmit power constraint at the BS reads as where is the transmit power budget at the BS. Furthermore, by superimposing all the signals that propagate through the SIM, the composite signal received at the -th user is expressed as
(3) |
where denotes the i.i.d. additive white Gaussian noise with being the receiver noise power at the -th user.
Therefore, the signal-to-interference-plus-noise-ratio (SINR) at the -th user is given by
(4) |
In this paper, we are interested in optimizing the phase shifts in to suppress the multiuser interference, which we refer to as the wave-based beamforming. Compared with conventional digital beamforming and RIS-aided hybrid active and passive beamforming schemes, the considered wave-based beamforming scheme entails the optimization of a greater number of phase shifts associated with each metasurface layer, which is a more challenging problem to solve as elaborated in the next section.
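To make the SINR and the sum rate concrete, the following sketch evaluates them for a given SIM response. It assumes that the useful power of a user is the squared magnitude of its effective channel (user channel, SIM response, and the transmission vector of its own antenna) weighted by its allocated power, and that the interference is the same quantity summed over the other users' antennas. The array layout and the names `H`, `G`, `W1`, `p`, and `noise` are hypothetical conventions chosen for the example.

```python
import numpy as np

def sinr(H, G, W1, p, noise):
    """Per-user SINR.

    H:     (K, M) array whose k-th row is the conjugate-transposed channel of user k
    G:     (M, M) SIM response
    W1:    (M, K) array whose k-th column is the antenna-to-first-layer vector of stream k
    p:     (K,) allocated powers
    noise: (K,) receiver noise powers
    """
    A = np.abs(H @ G @ W1) ** 2                   # A[k, j]: gain of stream j at user k
    signal = np.diag(A) * p
    interference = A @ p - signal
    return signal / (interference + noise)

def sum_rate(H, G, W1, p, noise):
    """Sum of log2(1 + SINR) over all users, in bit/s/Hz."""
    return np.sum(np.log2(1 + sinr(H, G, W1, p, noise)))

# Toy check with an interference-free effective channel (identity matrices).
K = 2
H, G, W1 = np.eye(K), np.eye(K), np.eye(K)
p = np.array([1.0, 3.0])
noise = np.ones(K)
gamma = sinr(H, G, W1, p, noise)
rate = sum_rate(H, G, W1, p, noise)
```

In the toy check the effective channel is diagonal, so each SINR reduces to the ratio of the allocated power to the noise power.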
III Joint Power Allocation and Wave-Based Beamforming: Problem Formulation and Solution
III-A Problem Formulation
We aim to maximize the sum rate of all the users, by jointly optimizing the transmit power at the BS and the wave-based beamforming at the SIM. To characterize the ultimate performance limits, we assume that the channel state information (CSI) of all the channels is perfectly known by the BS. (Footnote 1: In the considered SIM-aided system model, channel estimation can be performed similarly to conventional MIMO systems with a reduced number of RF chains, e.g., [20].) By defining , , and , the joint power allocation and wave-based beamforming optimization problem is formulated as
(5a) | |||||
s.t. | (5b) | ||||
(5c) | |||||
(5d) |
Since the optimization variables and are coupled in the non-convex objective function in (5a), it is not simple to obtain the optimal solution of problem . To tackle this issue, we propose an efficient AO algorithm.
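For orientation, the structure of the joint problem described above can be written out as follows. This is a sketch under assumed notation rather than a transcription of (5a)-(5d): the symbols for the per-user power, the power budget, the SINR, and the phase shifts are hypothetical names, and the constraints are those stated in the text, namely the total transmit power budget, the non-negativity of the powers, and the adjustable phase shifts of the meta-atoms.

```latex
\begin{align}
\max_{\{p_k\},\,\{\theta_m^l\}} \quad & R = \sum_{k=1}^{K} \log_2\!\left(1 + \gamma_k\right) \\
\text{s.t.} \quad & \sum_{k=1}^{K} p_k \le P_T, \\
& p_k \ge 0, \quad k = 1, \dots, K, \\
& \theta_m^l \in [0, 2\pi), \quad m = 1, \dots, M, \ l = 1, \dots, L.
\end{align}
```

The coupling arises because each SINR depends on the powers directly and on the phase shifts through the SIM response, so neither block of variables can be optimized in isolation.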
III-B AO Algorithm
The proposed AO algorithm decomposes the joint optimization problem into a pair of subproblems: the power allocation problem among interfering channels at the BS, and the phase shift optimization problem at the SIM. At any iteration of each subproblem, we update one of the optimization variables, i.e., or , while keeping the other one fixed to the value that it takes at the previous iteration.
III-B1 Algorithm 1 for Optimizing the Power Allocation Given
Assuming that the SIM phase shifts and the wave-based beamforming matrix in (2) are given, the problem reduces to
(6a) | |||||
s.t. | (6b) |
The problem is a conventional power allocation problem over interference channels with a given total power constraint. Therefore, it can be efficiently solved by applying the iterative water-filling algorithm [21].
Specifically, the power allocated to the -th user at each iteration is [1]
(7) |
for , where , and is the water-filling level that is determined by using the bisection method while fulfilling the constraint at each iteration.
Since the power allocated to all the users is simultaneously updated at each iteration, the iterative water-filling algorithm specified by (7) may not be stable when [21]. To circumvent this issue, we add a damping term. Specifically, at each iteration, the power allocation is a weighted sum of the power at the previous iteration and the power in (7). This ensures that the sum rate approaches its maximum value after several iterations and computations of (7).
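A minimal sketch of the damped iterative water-filling step is given below. It assumes that the update treats the interference from the previous iteration as additional noise, so that each power is the positive part of the water level minus the ratio of interference-plus-noise to the direct gain, with the water level found by bisection to meet the power budget. The damping weight, the iteration counts, and the names `A`, `P_T`, and `damping` are hypothetical.

```python
import numpy as np

def water_filling_step(A, p_prev, noise, P_T, n_bisect=100):
    """One simultaneous update; A[k, j] is the gain of stream j at user k."""
    gain = np.diag(A)
    interference = A @ p_prev - gain * p_prev
    floor = (interference + noise) / gain         # per-user "ground level"
    lo, hi = floor.min(), floor.max() + P_T       # bracket for the water level
    for _ in range(n_bisect):
        mu = 0.5 * (lo + hi)
        if np.maximum(mu - floor, 0).sum() > P_T:
            hi = mu
        else:
            lo = mu
    return np.maximum(lo - floor, 0)

def damped_iwf(A, noise, P_T, damping=0.5, n_iter=50):
    """Weighted sum of the previous power and the water-filling update."""
    K = A.shape[0]
    p = np.full(K, P_T / K)                       # start from uniform allocation
    for _ in range(n_iter):
        p = damping * p + (1 - damping) * water_filling_step(A, p, noise, P_T)
    return p

# Toy check without interference: classical water-filling is recovered.
A = np.diag([1.0, 0.5])
noise = np.array([1.0, 1.0])
p = damped_iwf(A, noise, P_T=2.0)
```

In the interference-free toy case the ground levels are 1 and 2, so a budget of 2 gives a water level of 2.5 and powers of 1.5 and 0.5, which the damped iteration approaches geometrically.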
III-B2 Algorithm 2 for Optimizing the SIM Phase Shifts Given
Given the power allocation , the phase shift optimization subproblem is formulated as
(8a) | |||||
s.t. | (8b) |
The problem is still challenging to solve due to the non-convex objective function. To tackle it, we propose a computationally efficient iterative gradient ascent algorithm.
First, the phase shifts are initialized by assuming, e.g., a uniform random distribution. Then, at each iteration, the partial derivative of with respect to is calculated as follows
(9) |
where and are given by
(10) | ||||
(11) |
In (11), and denote the -th row of the matrix and the -th column of the matrix , which are defined by
(12) | ||||
(13) |
After calculating all the partial derivatives of with respect to according to (9), all the phase shifts of the SIM are simultaneously updated as follows
(14) |
where is the Armijo step size, which is obtained by leveraging the backtracking line search at each iteration [10].
The computation of (14) is repeated until the fractional increase of the sum rate is less than a preset threshold, i.e., until convergence. The convergence of the gradient ascent algorithm to a local maximum is guaranteed because 1) the sum rate is upper bounded due to the constraint on the transmit power and the unit values of the amplitudes of the transmission coefficients of the SIM; and 2) the sum rate is non-decreasing provided that a suitable Armijo step size is utilized at each iteration.
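The iteration above can be sketched as a generic gradient ascent with an Armijo backtracking line search. To keep the example short, the closed-form partial derivatives in (9)-(13) are not reproduced; a central finite-difference gradient is used instead as a stand-in, and the inter-layer matrices and channels are random placeholders. All names, dimensions, and numerical values are hypothetical.

```python
import numpy as np

def sim_response(thetas, W_list):
    """Assumed ordering: G = Phi^L W^L ... Phi^2 W^2 Phi^1."""
    G = np.diag(np.exp(1j * thetas[0]))
    for theta, W in zip(thetas[1:], W_list):
        G = np.diag(np.exp(1j * theta)) @ W @ G
    return G

def sum_rate(thetas, W_list, H, W1, p, noise):
    A = np.abs(H @ sim_response(thetas, W_list) @ W1) ** 2
    signal = np.diag(A) * p
    return np.sum(np.log2(1 + signal / (A @ p - signal + noise)))

def numerical_gradient(f, thetas, eps=1e-6):
    """Central differences; a stand-in for the closed-form derivatives."""
    grad = np.zeros_like(thetas)
    for idx in np.ndindex(thetas.shape):
        step = np.zeros_like(thetas)
        step[idx] = eps
        grad[idx] = (f(thetas + step) - f(thetas - step)) / (2 * eps)
    return grad

def gradient_ascent(f, thetas, step0=1.0, shrink=0.5, c=1e-4, tol=1e-6, max_iter=100):
    rate = f(thetas)
    history = [rate]
    for _ in range(max_iter):
        grad = numerical_gradient(f, thetas)
        step = step0
        while step > 1e-12:                       # backtracking line search
            candidate = thetas + step * grad
            new_rate = f(candidate)
            if new_rate >= rate + c * step * np.sum(grad ** 2):   # Armijo condition
                break
            step *= shrink
        else:
            break                                 # no acceptable step size found
        gain = new_rate - rate
        thetas, rate = candidate, new_rate
        history.append(rate)
        if gain <= tol * abs(rate):               # fractional increase below threshold
            break
    return np.mod(thetas, 2 * np.pi), history

rng = np.random.default_rng(0)
n_layers, M, K = 2, 4, 2
cplx = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)
W_list = [cplx(M, M) for _ in range(n_layers - 1)]   # placeholder inter-layer matrices
H, W1 = cplx(K, M), cplx(M, K)
p, noise = np.full(K, 0.5), np.full(K, 0.1)

f = lambda th: sum_rate(th, W_list, H, W1, p, noise)
theta0 = rng.uniform(0, 2 * np.pi, size=(n_layers, M))
theta_opt, history = gradient_ascent(f, theta0)
```

Every accepted step satisfies the Armijo condition, so the recorded sum rate never decreases, which mirrors the second part of the convergence argument above.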
By alternately executing Algorithms 1 and 2 several times, the solution to problem is obtained.
IV Numerical Results
IV-A Simulation Setup
In this section, we provide numerical results to validate the effectiveness of the wave-based beamforming design by applying the proposed algorithms. As shown in Fig. 2, we consider an SIM-assisted multiuser MISO downlink system. The BS is a uniform linear array with antennas that are deployed parallel to the -axis. Furthermore, an SIM stacking multiple metasurfaces is integrated into the BS to perform the transmit beamforming in the EM wave domain. Each metasurface is a uniform planar array whose meta-atoms are deployed parallel to the - plane. The center of each antenna/meta-atom of the BS and metasurfaces are located in the -axis. The height of the BS is m, and the thickness of the SIM is . Thus, the spacing between adjacent metasurfaces in a -layer SIM is . Moreover, we assume that all metasurfaces comprise and meta-atoms along the -axis and -axis, respectively. Thus, the total number of meta-atoms in each layer is . For simplicity, we consider a square metasurface structure with . Furthermore, we assume half-wavelength spacing between adjacent antennas/meta-atoms at the BS and metasurfaces. The size of each meta-atom is . We assume that single-antenna users are evenly distributed on the -axis with spacing m, as shown in Fig. 2. Additionally, each antenna at the BS is assumed to have an antenna gain of dBi, while each user is equipped with a single dBi antenna [10]. The distance-dependent path loss is modeled as , where is the free space path loss at the reference distance of m, and is the path loss exponent [1]. The carrier frequency is GHz. Also, we set dB and . The noise power is dBm for .





We evaluate the performance of the proposed algorithm and compare it with two benchmark schemes. 1) Uniform power allocation: This corresponds to solving problem assuming a uniform power allocation strategy; 2) Codebook-based method: This corresponds to randomly generating a set of phase shift vectors and for each of them applying the iterative water-filling power allocation strategy [11]. The phase shift vector resulting in the maximum sum rate is selected to configure the SIM. The codebook size is . The threshold for stopping the AO algorithm is set to , and the maximum number of iterations for the AO algorithm and the inner iteration of the iterative water-filling and backtracking line search is set to . The simulation results are obtained by averaging over independent channel realizations.
IV-B Sum Rate versus the Number of Metasurface Layers
In Fig. 3a, we show the sum rate versus the number of metasurface layers , by considering the setup and dBm. In particular, we analyze two case studies with a different number of meta-atoms in each layer: and , while increasing the number of metasurface layers from to . We see that the sum rate resulting from the proposed power allocation strategy and wave-based beamforming scheme increases with the number of metasurface layers, benefiting from the capability of an SIM to suppress the multiuser interference in the EM wave domain. Nonetheless, the sum rate gradually converges as increases, and reaches the maximum at approximately , which results in a 30% improvement compared with a single-layer SIM when . Additionally, the proposed AO algorithm outperforms the codebook-based scheme and the uniform power allocation strategy in all considered setups. Specifically, the uniform power allocation strategy has a rate loss of about bps compared to the proposed iterative water-filling algorithm when . Nonetheless, as the number of meta-atoms in each layer increases from to , the sum rate obtained by using the uniform power allocation strategy improves significantly. Notably, the uniform power allocation strategy almost approaches the iterative water-filling power allocation scheme, which means that the multiuser interference is effectively reduced by an SIM as the number of meta-atoms increases. Finally, the codebook-based scheme with random phase shifts does not provide any performance gain as the number of metasurface layers increases. Also, it provides half of the rate compared with the proposed AO algorithm.
IV-C Sum Rate versus the Number of Users
In Fig. 3b, we show the achievable sum rate versus the number of users , by considering the setup . All the other parameters are the same as those in Fig. 3a. From Fig. 3b, we observe that the sum rate obtained with the optimized power allocation strategy and wave-based beamforming first increases with the number of users and then decreases once the number of users exceeds a certain value. This is because a finite-size SIM can hardly suppress the multiuser interference among a large number of users. In the considered setup with and , the maximum number of users that an SIM can effectively serve by satisfactorily suppressing the multiuser interference is . The number of users can be increased to if an SIM with a larger size, e.g., with meta-atoms in each layer, is utilized. Additionally, the uniform power allocation and the codebook-based schemes hardly suppress the multiuser interference, thus resulting in a noticeable drop of the sum rate as the number of users increases. If the number of users is large, the iterative water-filling algorithm has a more pronounced performance gain (e.g., 300% rate improvement when and ) compared with the uniform power allocation strategy, which implies that utilizing an appropriate power allocation algorithm is a suitable approach to further decrease the multiuser interference among a large number of users.
IV-D Sum Rate versus the Transmit Power
In Fig. 4a, we illustrate the sum rate versus the transmit power by considering a number of users . As expected, the sum rate of all the considered schemes increases as the transmit power increases. Again, the proposed AO algorithm outperforms the considered benchmark schemes in all considered setups. Specifically, the performance gain of the proposed algorithm over the codebook-based scheme is more pronounced as the transmit power increases, attaining more than bps rate improvement for moderate and high values of the transmit power. For low values of the transmit power, in addition, the uniform power allocation strategy suffers from some performance loss as compared with the proposed iterative water-filling algorithm. The gap, however, shrinks as the transmit power increases. This is consistent with the typical behavior of the water-filling algorithm in the high power regime, where the impact of power allocation becomes negligible.
IV-E Convergence Analysis of the Proposed Algorithm
In Fig. 4b, we show the convergence of the proposed AO algorithm by considering and , while keeping the other parameters the same as in Fig. 3a. In each case, we consider independent channel realizations. Notably, the AO algorithm converges fast in all cases, achieving its maximum value after about iterations. From Fig. 4b, we observe that a larger number of iterations is required for reaching convergence as the number of meta-atoms in each layer increases. This is due to the larger number of variables to be optimized and to the larger number of partial derivatives to be computed at each iteration step. Our simulation study reveals, in addition, that the proposed wave-based beamforming scheme with optimized SIM phase shifts can be implemented within a fraction of a nanosecond, which dramatically reduces the processing delay as compared to digital beamforming schemes, which usually require tens of microseconds to be computed.
V Conclusions
In this paper, we have proposed an SIM-enabled wave-based beamforming design for multiuser MISO downlink systems, which substantially reduces the precoding delay and hardware cost in comparison to its digital counterpart. Specifically, a joint transmit power allocation and phase shift optimization problem has been formulated to maximize the sum rate. Furthermore, we have proposed an effective AO algorithm to decompose the original non-convex joint optimization problem into two subproblems. The power allocation has been tackled by applying a modified iterative water-filling algorithm, while the phase shifts of the SIM have been optimized by customizing the gradient ascent algorithm. Extensive simulation results have demonstrated that the proposed wave-based beamforming design achieves significant performance gains compared to the currently available state-of-the-art benchmarks. Most notably, the wave-based beamforming significantly decreases the precoding delay. In a nutshell, the proposed SIM-enabled transceiver design constitutes a new paradigm to perform advanced signal processing operations in the wave domain.
References
- [1] J. An, C. Xu, L. Gan, and L. Hanzo, “Low-complexity channel estimation and passive beamforming for RIS-assisted MIMO systems relying on discrete phase shifts,” IEEE Trans. Commun., vol. 70, no. 2, pp. 1245–1260, Feb. 2022.
- [2] M. Di Renzo, A. Zappone, M. Debbah, M.-S. Alouini, C. Yuen, J. de Rosny, and S. Tretyakov, “Smart radio environments empowered by reconfigurable intelligent surfaces: How it works, state of research, and the road ahead,” IEEE J. Sel. Areas Commun., vol. 38, no. 11, pp. 2450–2525, Nov. 2020.
- [3] C. Huang, A. Zappone, G. C. Alexandropoulos, M. Debbah, and C. Yuen, “Reconfigurable intelligent surfaces for energy efficiency in wireless communication,” IEEE Trans. Wireless Commun., vol. 18, no. 8, pp. 4157–4170, Aug. 2019.
- [4] G. C. Alexandropoulos, K. Stylianopoulos, C. Huang, C. Yuen, M. Bennis, and M. Debbah, “Pervasive machine learning for smart radio environments enabled by reconfigurable intelligent surfaces,” Proc. IEEE, vol. 110, no. 9, pp. 1494–1525, Sept. 2022.
- [5] Q. Wu, S. Zhang, B. Zheng, C. You, and R. Zhang, “Intelligent reflecting surface-aided wireless communications: A tutorial,” IEEE Trans. Commun., vol. 69, no. 5, pp. 3313–3351, May 2021.
- [6] J. An, Q. Wu, and C. Yuen, “Scalable channel estimation and reflection optimization for reconfigurable intelligent surface-enhanced OFDM systems,” IEEE Wireless Commun. Lett., vol. 11, no. 4, pp. 796–800, Apr. 2022.
- [7] C. Huang, S. Hu, G. C. Alexandropoulos, A. Zappone, C. Yuen, R. Zhang, M. Di Renzo, and M. Debbah, “Holographic MIMO surfaces for 6G wireless networks: Opportunities, challenges, and trends,” IEEE Wireless Commun. Mag., vol. 27, no. 5, pp. 118–125, Oct. 2020.
- [8] Q. Wu and R. Zhang, “Beamforming optimization for wireless network aided by intelligent reflecting surface with discrete phase shifts,” IEEE Trans. Commun., vol. 68, no. 3, pp. 1838–1851, Mar. 2020.
- [9] H. Guo, Y.-C. Liang, J. Chen, and E. G. Larsson, “Weighted sum-rate maximization for reconfigurable intelligent surface aided wireless networks,” IEEE Trans. Wireless Commun., vol. 19, no. 5, pp. 3064–3076, May 2020.
- [10] A. Papazafeiropoulos, C. Pan, P. Kourtessis, S. Chatzinotas, and J. M. Senior, “Intelligent reflecting surface-assisted MU-MISO systems with imperfect hardware: Channel estimation and beamforming design,” IEEE Trans. Wireless Commun., vol. 21, no. 3, pp. 2077–2092, Mar. 2022.
- [11] J. An, C. Xu, L. Wang, Y. Liu, L. Gan, and L. Hanzo, “Joint training of the superimposed direct and reflected links in reconfigurable intelligent surface assisted multiuser communications,” IEEE Trans. Green Commun. Netw., vol. 6, no. 2, pp. 739–754, Jun. 2022.
- [12] L. Wei, C. Huang, G. C. Alexandropoulos, W. E. I. Sha, Z. Zhang, M. Debbah, and C. Yuen, “Multi-user holographic MIMO surfaces: Channel modeling and spectral efficiency analysis,” IEEE J. Sel. Topics Signal Process., vol. 16, no. 5, pp. 1112–1124, Aug. 2022.
- [13] M. Di Renzo, M. Debbah, D.-T. Phan-Huy, A. Zappone, M.-S. Alouini, C. Yuen, V. Sciancalepore, G. C. Alexandropoulos, J. Hoydis, H. Gacanin et al., “Smart radio environments empowered by reconfigurable AI meta-surfaces: An idea whose time has come,” EURASIP J. Wireless Commun. Netw., vol. 2019, no. 1, pp. 1–20, May 2019.
- [14] J. An, C. Xu, Q. Wu, D. W. K. Ng, M. Di Renzo, C. Yuen, and L. Hanzo, “Codebook-based solutions for reconfigurable intelligent surfaces and their open challenges,” IEEE Wireless Commun., pp. 1–8, Early Access, 2022.
- [15] X. Lin, Y. Rivenson, N. T. Yardimci, M. Veli, Y. Luo, M. Jarrahi, and A. Ozcan, “All-optical machine learning using diffractive deep neural networks,” Science, vol. 361, no. 6406, pp. 1004–1008, Jul. 2018.
- [16] C. Liu, Q. Ma, Z. J. Luo, Q. R. Hong, Q. Xiao, H. C. Zhang, L. Miao, W. M. Yu, Q. Cheng, L. Li et al., “A programmable diffractive deep neural network based on a digital-coding metasurface array,” Nat. Electron., vol. 5, no. 2, pp. 113–122, Feb. 2022.
- [17] K. Liu, Z. Zhang, L. Dai, and L. Hanzo, “Compact user-specific reconfigurable intelligent surfaces for uplink transmission,” IEEE Trans. Commun., vol. 70, no. 1, pp. 680–692, Jan. 2022.
- [18] E. Björnson and L. Sanguinetti, “Rayleigh fading modeling and channel hardening for reconfigurable intelligent surfaces,” IEEE Wireless Commun. Lett., vol. 10, no. 4, pp. 830–834, Apr. 2021.
- [19] S. Sanayei and A. Nosratinia, “Antenna selection in MIMO systems,” IEEE Commun. Mag., vol. 42, no. 10, pp. 68–73, Oct. 2004.
- [20] J. Lee, G.-T. Gil, and Y. H. Lee, “Channel estimation via orthogonal matching pursuit for hybrid MIMO systems in millimeter wave communications,” IEEE Trans. Commun., vol. 64, no. 6, pp. 2370–2386, June 2016.
- [21] N. Jindal, W. Rhee, S. Vishwanath, S. Jafar, and A. Goldsmith, “Sum power iterative water-filling for multi-antenna Gaussian broadcast channels,” IEEE Trans. Inf. Theory, vol. 51, no. 4, pp. 1570–1580, Apr. 2005.