
Efficient Task Offloading Algorithm for Digital Twin in Edge/Cloud Computing Environment

Ziru Zhang, Xuling Zhang, Guangzhi Zhu, Yuyang Wang, and Pan Hui

Ziru Zhang, Xuling Zhang, Guangzhi Zhu, Yuyang Wang, and Pan Hui are with the Information Hub, Hong Kong University of Science and Technology (Guangzhou), China, 511453.
E-mail: {zzhang758, xzhang659, gzhu305}@connect.hkust-gz.edu.cn, {yuyangwang, panhui}@ust.hk This work is supported by Science and Technology Bureau of Nansha District, Grant No.2022ZD012.
Abstract

In the era of the Internet of Things (IoT), Digital Twin (DT) is envisioned to empower various areas as a bridge between physical objects and the digital world. Through virtualization and simulation techniques, multiple functions can be achieved by leveraging computing resources. In this process, Mobile Cloud Computing (MCC) and Mobile Edge Computing (MEC) have become two of the key factors in achieving real-time feedback. However, current works only consider either edge servers or cloud servers in their DT system models. Besides, these models ignore DTs with more than one data resource. In this paper, we propose a new DT system model considering a heterogeneous MEC/MCC environment. Each DT in the model is maintained in one of the servers via multiple data collection devices. The offloading decision-making problem is also considered, and a new offloading scheme is proposed based on Distributed Deep Learning (DDL). Simulation results demonstrate that our proposed algorithm can effectively and efficiently decrease the system’s average latency and energy consumption, achieving significant improvement over the baselines under the dynamic environment of DTs.

Index Terms:
Digital Twin, Mobile Cloud Computing, Mobile Edge Computing, Distributed Deep Learning, Task Offloading.

1 Introduction

With the rapid development of digital technology, DT has triggered significant advancements and interest in both academia and industry [1, 2]. DT, a virtual model designed to accurately represent a physical object, can be regarded as consisting of three main parts: a physical object, its virtual twin, and a mapping between the two [3]. As an emerging and attractive technology, DT maps the states of physical objects into the virtual world through virtual descriptions or digital representations, significantly reshaping the design and engineering process [4]. In addition, by continuously collecting and analyzing sensor data, DT can model, simulate, monitor, analyze, and optimize the physical world and help people predict related states in the near future. In recent years, DT technology has been used in various areas, such as real-time remote monitoring, 6G networks, industrial control, MCC, and MEC. From a technical perspective, a DT system comes with a complex architecture to satisfy the functions of virtual-real mapping, real-time synchronization, and prediction [5]. Therefore, the infrastructure of DTs not only needs to guarantee sufficient computing power but also needs network transmission equipment that can achieve negligible transmission latency across devices [6].

To address the limitations of the computing power of the DT infrastructure, we expect to take advantage of MCC by offloading complex computing tasks from mobile devices to a central cloud. By leveraging the rich virtual resources and exploiting cloud servers’ computational power, we can reduce the stress on mobile devices handling tasks locally, reducing the task response time. However, when mobile devices communicate with MCC servers over long distances, this approach suffers from problems such as high latency, low bandwidth, and network congestion. Hence, frequently offloading tasks and data to MCC servers is infeasible. Researchers have therefore focused on MEC to solve the problems encountered by MCC servers. By utilizing the computing resources around the user, MEC provides end-users with powerful computing capability, storage capacity, energy efficiency, etc., and complements the capabilities of MCC servers. Furthermore, the proximity of MEC servers to mobile users considerably shrinks the communication cost of task offloading and significantly reduces network latency. Moreover, MEC is equipped with better offloading technology to meet the increasing demands for ultra-high bandwidth and ultra-low network latency [7]. However, when MEC systems deal with complex scenarios, generating offloading decisions using traditional optimization methods or mathematical models consumes much computation time and energy. Therefore, making optimal offloading decisions with less computation time and energy has been a significant challenge.

Artificial Intelligence (AI) algorithms have emerged as a promising approach to the above-mentioned challenge. However, as most machine learning algorithms require large amounts of data to train the network, optimal data for offloading decisions are difficult to obtain in a dynamic and heterogeneous MEC/MCC environment. Combining Reinforcement Learning (RL) with Deep Neural Networks (DNNs), Deep Reinforcement Learning (DRL) can learn to solve complex problems through trial and error. In addition, DRL can train the network without pre-collected data. Therefore, researchers have investigated various DRL-based offloading algorithms for MEC tasks. These methods treat the MEC system as a fixed environment and generate optimal offloading decisions without training data. In addition to DRL, DDL has become another promising method to generate near-optimal decisions [8]. DDL uses multiple parallel DNNs to generate offloading decisions for each task, independent of training data. Besides, employing DNNs also enables the model to handle more sophisticated environments and tasks.

It is envisioned that DT, MCC, MEC and AI are critical technologies in the era of IoT. Edge computing and cloud computing technologies are necessary for achieving real-time data synchronization. Furthermore, merging DTs with machine learning algorithms will bring great benefits in improving the utility of the servers. Besides, AI is one of the underlying core technologies of DT. AI algorithms can not only provide data processing and system optimization for DT systems but also shorten the task offloading decision time. However, since DT technologies are still developing, research on the collaboration between DT and AI has yet to mature. Specifically, very little work has explored applying DT and DDL to MEC and MCC.

Based on the existing work, we notice that current DT systems involve either cloud servers or edge servers in the model, yet cloud servers and edge servers have complementary advantages. We therefore design a new DT system that integrates DTs with both cloud servers and edge servers. With the help of MEC and MCC, DTs can be maintained in the proper server to minimize the average latency of the system. Besides, for a system with multiple DTs and multiple servers, we also want an intelligent way to allocate resources so as to further utilize the servers in the system. Moreover, most DT systems assume that each DT is linked to one data resource. However, DTs with multiple data resources should also be considered, as in the Internet of Vehicles (IoV) and the Metaverse [9].

In this paper, we first propose a new DT system model based on cloud computing and edge computing technologies. To better utilize the resources and decrease the system cost, such as average latency and energy consumption, the resource allocation problem is formalized into an optimization problem. Then we present a new offloading algorithm to the optimization problem based on DDL. Finally, the proposed model and algorithm are evaluated via simulation. The main contributions of this paper can be summarized as follows:

  • We design a new DT system model with one cloud server, multiple edge servers and various DTs. Each DT is maintained in one server using data from multiple resources. The data for the same DT are synchronized in the corresponding server.

  • We formulate the resource allocation problem for placing all the DTs in cloud servers or edge servers according to different scenarios. The problem is formulated as a Mixed-Integer Programming (MIP) problem. By solving the problem, we can lower the system cost and provide a better experience for the users.

  • Considering the dynamic environment, we design a distributed framework based on DDL to minimize the system cost by allocating server resources intelligently. A fully connected network layer is also added to accelerate the training process and improve portability by further extracting the feature of each group. The DDL model treats features as input rather than the data from all the entities.

  • To evaluate the effectiveness of the proposed algorithm, we conduct several experiments by simulating real MEC/MCC scenarios. A test dataset consisting of 1024 different test data is generated randomly, and the average system costs under several offloading schemes are tested. The numerical results indicate that our method can significantly reduce the system cost.

The remainder of this paper is organized as follows. In Section 2, we summarize related work on DT and offloading algorithms. Next, we present the system model and problem formulation in Section 3. In Section 4, we present the DDL driven decision-making algorithm. Then we discuss the simulation process and evaluation results in Section 5. In Section 6, we conclude this article.

2 Related Work

The application of DTs has been found in several fields and has achieved significant progress, e.g., IoT, IoV and Unmanned Aerial Vehicle (UAV). To improve the system performance and enhance the stability, researchers also proposed various DT system models according to diverse application scenarios and transmission environments. Besides, cloud computing and edge computing technologies are also widely applied in different models to reduce the latency and energy consumption of the system. Different algorithms are also proposed to find proper offloading decisions. In this section, we will briefly summarize some previous studies about different DT system models and state-of-the-art offloading decision approaches.

2.1 Digital Twin

With the big success of AI, it is possible to achieve more sophisticated, intelligent functions on various mobile devices such as drones, self-driving cars, and robots. DT technology, as a bridge between physical entities and virtual systems, has been used in many different areas.

Industrial IoT is an important concept for Industry 4.0. To further improve the performance of IoT devices in industry, DT technologies have been adopted in many works. For example, Sun et al. [10] proposed an architecture based on DTs to assist the federated learning process in the Industrial IoT scenario. Song et al. [11] designed a federated learning framework for the DT driven Industrial IoT, where DRL and a Lyapunov dynamic deficit queue are used to improve the communication efficiency. Therefore, the data security and training accuracy of federated learning are guaranteed with the help of DT. In addition to Industrial IoT, IoV is an important IoT extension. DTs can provide a digital simulation model of the vehicles and optimize the allocation of resources. For instance, Wang et al. [12] and Sun et al. [13] used dynamic DTs of aerial-assisted IoV to handle the unified resource scheduling and allocation problem. The Alternating Direction Method of Multipliers (ADMM) and Stackelberg game theory are employed to improve the overall efficiency of the model. Besides, Sun et al. [14] designed a dynamic DT and federated learning driven system for air-ground networks: different incentive mechanisms are applied to improve the accuracy and energy efficiency of the system. Non-Orthogonal Multiple Access (NOMA) is an important part of next-generation wireless communications. Hence, Wang et al. [15] developed an energy-efficient DT system based on NOMA transmission. All the devices collaborate via federated learning to update a universal DT model, and the action model is updated to optimize the system. Numerical results also prove the viability of the algorithm.

With the help of edge computing techniques, MEC has become a novel approach for meeting the strict real-time requirements of DTs. Digital Twin Edge Networks (DITEN), which emerged as a novel DT architecture, have attracted significant attention. Sun et al. [16] first proposed a new vision of DITEN that considers user mobility and the variability of MEC environments, where the Lyapunov optimization method and a DRL algorithm are leveraged to handle the long-term migration cost. IoT is one of the most common application scenarios of DT. For example, Lu et al. [17] integrated DTs with edge networks and proposed DITEN, while a blockchain-empowered federated learning scheme alleviates the data privacy protection and communication security problems in DITEN. A DT-assisted MEC model was proposed in [18] to minimize the end-to-end latency of industrial automation, where multiple IoT devices and servers were considered. Additionally, UAVs can be applied as MEC nodes, providing low consumption and high flexibility for mobile-edge services. Li et al. [19] studied the intelligent task offloading problem in UAV-enabled MEC assisted by DT: a Double Deep Q-Network algorithm and a closed-form expression method were deployed to optimize the system’s energy consumption. A social-aware vehicular edge caching mechanism was designed in [20], and a new concept of the vehicular cache cloud was developed. The system utility is intelligently optimized by a deep deterministic policy gradient learning approach. In the Industrial IoT scenario, DITEN and federated learning were applied to construct DTs for IoT devices in the digital space. The energy and time costs of the communication process are minimized with a deep neural network model [21].

From the above work, it is undeniable that MEC can improve the performance of DTs. Since cloud servers are always far from the physical entities, the transmission over mobile networks leads to higher latency for MCC. However, MCC still has its advantages. For example, the cloud server has higher computational power, and its highly integrated framework makes it more energy efficient. Therefore, the cloud server can provide lower time and energy costs for entities with huge workloads during execution. Hence, more work should be done on integrating DTs with heterogeneous MEC/MCC environments. Moreover, current DT system models mainly focus on improving performance by reducing the total consumption of each DT in the system. However, DTs with multiple data resources have yet to be discussed. No consideration has been given to the synchronization problem between various devices when formulating the system cost.

2.2 Intelligent Offloading Schemes

In addition to the DT system models, there is still plenty of work that mainly focuses on the offloading decision-making problem of MEC and MCC. Much of the current work is based on traditional optimization methods such as Lyapunov optimization [22, 23, 24, 25, 26, 27], graph theory [28, 29], Stackelberg game theory [30], and successive convex approximation [31]. These schemes can make optimal or near-optimal offloading decisions for various offloading environments. However, the problem complexity increases exponentially as the number of devices grows. Therefore, the time consumed generating offloading decisions using traditional optimization methods becomes unaffordable when dealing with complicated scenarios. To better solve this NP-hard problem within a relatively short time, machine learning-based intelligent algorithms have been the main research direction in recent years.

Most machine learning algorithms require large amounts of data to train the networks. However, the optimal offloading decision data is challenging to obtain when facing dynamic heterogeneous MEC/MCC environments. So, algorithms that can be trained without data, such as RL and DRL based algorithms, have attracted much attention from many researchers. Huang et al. [32] first proposed a Deep Q-Network based offloading algorithm to minimize the system cost in the MEC environment. In order to jointly optimize the task assignment and radio resource allocation in the MEC/MCC scenes, Dab et al. [33] proposed the QL-JTAR algorithm based on Q-learning. Simulations conducted in NS3 have proved the performance of the approach. Double Deep Q-Network algorithm and a Q-function decomposition technique were combined in work [34]. The problem of stochastic computation offloading is formulated as a Markov decision process, and two novel offloading algorithms for the MEC system, Deep-SARL and DARLING, are designed. When considering the application scenario of vehicular edge computing, which is a typical application of MEC, Zhan et al. [35] presented a new DRL architecture based on proximal policy optimization method called DRLOSM. A convolutional neural network is also used to approximate policy and value functions and extract representative features. Qiu et al. [36] studied a blockchain-based collective Q-learning approach in a networking integrated cloud–edge–end resource allocation in IoT.

RL and DRL-driven algorithms are promising solutions to problems that lack enough training data. However, another issue emerges when considering the dynamic environments in MEC/MCC: the model needs to be retrained when the offloading environment varies. Meta-learning is an excellent solution to improve the portability of different methods. Huang et al. [37] combined meta-learning with deep learning in the MEC network and proposed an algorithm called MELO. By training the DNNs with historical MEC task scenarios, the model can achieve 99% accuracy via one-step fine-tuning when facing a new MEC task scenario. Wang et al. [38] first applied Meta-Reinforcement Learning (MRL) in a multi-access edge computing network. They synergized the first-order MRL algorithm with a sequence-to-sequence neural network and proposed the MRLCO algorithm. The typical meta-learning algorithms MAML and Reptile are also applied in MRL and combined with DRL in [39, 40]. Simulations have proved that the training time can be remarkably reduced and the model’s portability can be improved significantly.

Apart from the meta-learning approach, distributed deep learning is another encouraging way to generate offloading decisions despite scarce training data. DDL algorithms use multiple parallel DNNs to generate offloading decisions. Each DNN receives the offloading environment as input and outputs offloading decisions for each task through the forward propagation process. In contrast to traditional DNNs, which require large amounts of data for training, distributed deep learning makes the DNNs train each other without data. Huang et al. [41] first proposed the DDLO algorithm based on DDL for MEC networks. As a result, near-optimal offloading decisions can be generated in less than one second. More work on the online computation offloading issue in MEC networks was presented in [42]. Thorough experiments and evaluations proved the feasibility of the designed DROO algorithm. Moreover, Wu et al. [34] extended the algorithm and proposed a distributed deep learning based offloading algorithm for the heterogeneous MEC/MCC system. Numerical results indicate that the average error can be kept under 6% compared to optimal decisions.

DDL has shown significant potential in MEC and/or MCC scenarios. Nevertheless, little work has been done on applying DDL to the DT system model. In addition, most approaches ignore the relationships between different devices during deployment. This hidden information should be extracted and utilized to simplify the network structure. Here, we propose a novel DT system model that considers the synchronization issues of DTs with various data collection devices. The distributed deep learning algorithm is combined with a fully connected network layer to further extract the features of each group. The offloading decision-making model takes features as input instead of all the offloading environment information.

3 System Model and Problem Formulation

This section elaborates on our DT system model framework of heterogeneous clouds, where each DT is maintained by one edge server or cloud server. Each DT is maintained via data collected from multiple devices, such as sensors and cameras. The data from different resources will be synchronized in the corresponding server. The optimization problem is then defined by formulating the system’s total latency and energy consumption. For convenience, the major notations used in this paper are listed in Table I.

TABLE I: Major Notations
Notation Description
$w_{n}$  The size of data for the $n^{th}$ device.
$b_{n}$  The bandwidth of the $n^{th}$ device.
$f_{c}$  The clock speed of the CPU in the cloud server.
$f_{e}^{s}$  The clock speed of the CPU in the $s^{th}$ edge server.
$\gamma$  The discount parameter of the network.
$\delta$  The number of instructions executed by the CPU for one unit of data.
$\theta_{c}$  The average energy cost for each instruction in the cloud server.
$\theta_{e}^{s}$  The average energy cost for each instruction in the $s^{th}$ edge server.
$T^{c}_{t,n}$  The transmission time cost when $w_{n}$ is executed in the cloud server.
$T^{c}_{c,n}$  The execution time cost when $w_{n}$ is executed in the cloud server.
$T^{e}_{t,ns}$  The transmission time cost when $w_{n}$ is executed in the $s^{th}$ edge server.
$T^{e}_{c,ns}$  The execution time cost when $w_{n}$ is executed in the $s^{th}$ edge server.
$G_{n}$  The ownership of the $n^{th}$ device.
$X_{m}$  The placement of the $m^{th}$ DT.
$D_{n}$  The offloading decision of $w_{n}$.
$\alpha$  The weight coefficient between latency and energy consumption.

3.1 Framework


Figure 1: Framework of the proposed DT system model

Figure 1 elaborates the framework of our DT system model. The framework can be divided into two main parts: the physical environment and the procedure of DT maintenance. In the physical environment, we assume that there are multiple edge servers and various data collection devices in the field, where all the edge servers and devices are deployed. Each device can connect to the edge servers via wireless transmission, and one cloud server can also be utilized through the mobile network.

Besides, we assume that several DTs are maintained by the devices and servers in the physical environment. The maintenance of each DT requires data from various devices, e.g., the vehicles on the same street, Metaverse users in the same virtual space, or the cameras of intelligent robots. Each DT is placed and maintained in one of the servers, where all the data is synchronized and computational tasks, such as simulation, prediction, and analysis, are executed.

DTs can achieve better performance with the advantages of MEC and MCC, but the placement of DTs in the dynamic scenario still matters. On the one hand, the utilization of computation power and bandwidth should be maximized in order to further reduce the delay; on the other hand, the average latency and energy consumption should be balanced accordingly. Therefore, the resource allocation problem during the DTs maintenance will be determined according to the environment.

3.2 Offloading Model

Our system model contains one cloud server and $S$ edge servers, denoted as $\mathcal{S}=\{1,2,\cdots,S,S+1\}$, and various devices used for data collection, denoted as $\mathcal{N}=\{1,2,\cdots,N\}$. Each device belongs to one of the DTs, and all the DTs can be represented as $\mathcal{M}=\{1,2,\cdots,M\}$. In addition, the maintenance of the DTs always requires great computational capacity, which is much higher than the capacity of the calculation unit in each device. The workload, including the newly collected information such as location and movement, is transmitted to the server via wired or wireless transmission techniques. The DTs are maintained in different servers instead of computing all the tasks locally. The whole process can be called offloading. We assume that the workload for the $n^{th}$ device is $w_{n}$. The DTs can be maintained on either the cloud server or the edge servers, and workloads are offloaded to the servers according to the offloading decision.

3.2.1 MCC Model

The cloud server is a physical or virtual infrastructure with powerful computational resources and large data storage capacity. By dividing a physical server into multiple virtual servers using virtualization approaches, users can buy cloud services according to their workloads. Multiple users can be serviced simultaneously, and less energy is consumed due to the high integration and better utilization rate of the resources.

When the DT chooses to be maintained in the cloud server, the data from the $n^{th}$ device is delivered via the mobile network. Since cloud servers are always far from the devices, we assume that the transmission time is only related to the data size and the allocated bandwidth. The transmission time can be formulated as follows:

$T^{c}_{t,n}=\frac{w_{n}}{b_{n}\gamma}$ (1)

where $b_{n}$ is the bandwidth between the $n^{th}$ device and the base station, and $\gamma$ is the discount parameter considering the network fluctuations and congestion. Generally speaking, the time consumed during the download period can be ignored, since only the necessary results are returned, which are much smaller than the size of the offloaded data.

During the execution process, the clock speed of the CPU in the cloud server can be represented as $f_{c}$, and we assume that $\delta$ instructions are executed when processing one unit of data. Thus the execution time can be denoted as follows:

$T^{c}_{c,n}=\frac{\delta w_{n}}{f_{c}}$ (2)

To further evaluate the overall energy cost during cloud computing, the average energy cost for transmitting one unit of data to the cloud server is set to $e^{c}_{t}$, and the average energy required to process each instruction is $\theta_{c}$. Then the total energy cost of $w_{n}$, which consists of the transmission energy and the computing energy, can be derived as:

$E^{c}_{n}=E^{c}_{t,n}+E^{c}_{c,n}=e^{c}_{t}w_{n}+\theta_{c}\delta w_{n}$ (3)
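To make equations (1)-(3) concrete, here is a minimal numerical sketch of the cloud-side cost terms; the parameter values and units are illustrative assumptions rather than the simulation settings of Section 5.

```python
# Illustrative parameters (assumptions, not the paper's simulation settings)
GAMMA = 0.8     # discount parameter for network fluctuations and congestion
DELTA = 1e3     # instructions executed per unit of data
F_C = 3.5e9     # cloud CPU clock speed (Hz)
E_T_C = 0.15    # transmission energy per unit of data to the cloud (mJ)
THETA_C = 0.1   # energy per instruction on the cloud server (mJ)

def cloud_cost(w_n, b_n):
    """Costs of offloading workload w_n (data units) over bandwidth b_n.

    T^c_{t,n} = w_n / (b_n * gamma)                   -- Eq. (1)
    T^c_{c,n} = delta * w_n / f_c                     -- Eq. (2)
    E^c_n     = e^c_t * w_n + theta_c * delta * w_n   -- Eq. (3)
    """
    t_transmit = w_n / (b_n * GAMMA)
    t_execute = DELTA * w_n / F_C
    energy = E_T_C * w_n + THETA_C * DELTA * w_n
    return t_transmit, t_execute, energy
```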

3.2.2 MEC Model

Compared to cloud servers, edge servers are placed near the users and devices in the same scenario. Benefiting from the short distance, the devices can directly communicate with the edge servers through wireless techniques such as WiFi and Bluetooth, and the transmission is much faster and more stable. However, limited by the radio range, the transmission rate decreases as the distance grows, which makes distance one of the main factors affecting transmission performance. For devices that are far from the edge servers, the latency of MCC transmission may even be lower.

To further define the latency, we use $L_{n}$ and $L_{s}$ to represent the locations of the $n^{th}$ device and the $s^{th}$ edge server. The maximum transmission rate $p_{ns}$ can be determined as:

$p_{ns}=\frac{\lambda}{||L_{n}-L_{s}||_{2}}$ (4)

where the distance between the device and the server is calculated by the $l_{2}$ norm, and $\lambda$ is the average discount parameter decided by the radio access distance of different devices and the interference in the scenario. Then the transmission time cost when choosing the $s^{th}$ edge server can be derived as:

$T^{e}_{t,ns}=\frac{w_{n}}{p_{ns}}=\frac{1}{\lambda}w_{n}||L_{n}-L_{s}||_{2}$ (5)

When executing the same task on a cloud server or an edge server, the number of instructions is the same, allowing the computation time to be expressed as:

$T^{e}_{c,ns}=\frac{\delta w_{n}}{f_{e}^{s}}$ (6)

where $f_{e}^{s}$ represents the clock speed of the CPU in the $s^{th}$ edge server.

Similar to cloud computing, the total energy consumption in edge computing of $w_{n}$ can be determined as:

$E^{e}_{n,s}=E^{e}_{t,ns}+E^{e}_{c,ns}=e^{e}_{t}w_{n}+\theta^{s}_{e}\delta w_{n}$ (7)

where $e^{e}_{t}$ is the transmission energy cost parameter of MEC and $\theta^{s}_{e}$ is the execution energy parameter of the $s^{th}$ edge server.
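Analogously, a sketch of the edge-side costs of equations (4)-(7) follows; $\lambda$ and the other constants are again illustrative assumptions.

```python
import numpy as np

LAM = 1e8       # assumed radio discount parameter lambda
DELTA = 1e3     # instructions executed per unit of data
E_T_E = 0.125   # transmission energy per unit of data in MEC (mJ)

def edge_cost(w_n, loc_n, loc_s, f_e_s, theta_e_s):
    """Costs of offloading w_n to the edge server located at loc_s.

    p_{ns}     = lambda / ||L_n - L_s||_2                 -- Eq. (4)
    T^e_{t,ns} = w_n / p_{ns}                             -- Eq. (5)
    T^e_{c,ns} = delta * w_n / f^s_e                      -- Eq. (6)
    E^e_{n,s}  = e^e_t * w_n + theta^s_e * delta * w_n    -- Eq. (7)
    """
    dist = np.linalg.norm(np.asarray(loc_n, float) - np.asarray(loc_s, float))
    p_ns = LAM / dist                    # rate decays with distance
    t_transmit = w_n / p_ns
    t_execute = DELTA * w_n / f_e_s
    energy = E_T_E * w_n + theta_e_s * DELTA * w_n
    return t_transmit, t_execute, energy
```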

3.2.3 Problem Formulation

Considering all the DTs and devices in the system, we use $\mathcal{G}=[G_{1},G_{2},\cdots,G_{N}]^{\top}$ to denote the ownership information of the devices. For the $n^{th}$ device, we have $G_{n}=[G_{n}^{1},G_{n}^{2},\cdots,G_{n}^{M}]$, where:

$G_{n}^{m}=\begin{cases}1,&\text{if the $n^{th}$ device belongs to the $m^{th}$ DT},\\ 0,&\text{otherwise}.\end{cases}$ (8)

Meanwhile, we define the offloading decision of the environment as $\mathcal{X}=[X_{1},X_{2},\cdots,X_{M}]^{\top}$, where $X_{m}=[X_{m,1},X_{m,2},\cdots,X_{m,S},X_{m,S+1}]$ can similarly be denoted as:

$X_{m,s}=\begin{cases}1,&\text{if the $m^{th}$ DT chooses the $s^{th}$ server},\\ 0,&\text{otherwise}.\end{cases}$ (9)

To be specific, it refers to the $s^{th}$ edge server when $1\leq s\leq S$ and the cloud server when $s=S+1$.

Then the offloading decision of the devices can be derived as:

$\mathcal{D}=[D_{1},D_{2},\cdots,D_{N}]^{\top}=\mathcal{G}\times\mathcal{X}$ (10)

where $D_{n}$ is the offloading decision of the $n^{th}$ device.

The expected transmission time and calculation time of the $n^{th}$ entity under offloading decision $\mathcal{X}$ can be derived as:

$T_{n}^{t}=G_{n}\mathcal{X}\psi_{n}^{t}$ (11)
$T_{n}^{c}=G_{n}\mathcal{X}\psi_{n}^{c}$ (12)

where $\psi_{n}^{t}=[T^{e}_{t,n1},T^{e}_{t,n2},\cdots,T^{e}_{t,nS},T^{c}_{t,n}]$ and $\psi_{n}^{c}=[T^{e}_{c,n1},T^{e}_{c,n2},\cdots,T^{e}_{c,nS},T^{c}_{c,n}]$ are the delay vectors according to the transmission time delays defined in (1) and (5) and the calculation time delays defined in (2) and (6).

Similarly, we can define the energy consumption vector as $\xi_{n}=[E^{e}_{n,1},E^{e}_{n,2},\cdots,E^{e}_{n,S},E^{c}_{n}]$, and the total energy cost of the $m^{th}$ DT according to (3) and (7) can be written as:

$E_{m}=\sum_{n\in\mathcal{N},G_{n}=G_{m}}G_{n}\mathcal{X}\xi_{n}$ (13)

As a consequence, the total energy cost of the whole system can be written as:

$E=\sum_{m=1}^{M}\Big(\sum_{n\in\mathcal{N},G_{n}=G_{m}}G_{n}\mathcal{X}\xi_{n}\Big)$ (14)

The data synchronization process is completed only when all the devices of the same DT finish the data transmission procedure. Besides, the server can receive data from multiple devices simultaneously. So, the latency during synchronization is decided by the longest time cost within each DT, which can be expressed as:

$T_{m}^{t}=\max\{T_{n}^{t}\,\big|\,n\in\mathcal{N};G_{n}=G_{m}\}$ (15)

Besides, we also need to take the differences in device numbers into account. Then the overall time cost of the group can be expressed as:

$T_{m}=||I_{m}||\Big(T_{m}^{t}+\sum_{n\in\mathcal{N},G_{n}=G_{m}}T_{n}^{c}\Big)$

where $||I_{m}||$ denotes the device number of the $m^{th}$ DT.

Therefore, the overall time expenditure of the MEC/MCC hybrid offloading model can be derived as:

$T=\sum_{m=1}^{M}||I_{m}||\Big(\max\{G_{n}\mathcal{X}\psi_{n}^{t}\,\big|\,n\in\mathcal{N};G_{n}=G_{m}\}+\sum_{n\in\mathcal{N},G_{n}=G_{m}}G_{n}\mathcal{X}\psi_{n}^{c}\Big)$ (16)

From the above equations, the system time cost defined in (16) and the system energy cost defined in (14) are related to the workload of each DT, the location of each device, the ownership of each device, and the offloading decision of the model. In other words, for a given offloading environment, we can calculate the system cost of different offloading decisions based on the above information.

To find the optimal offloading decision considering the time and energy of the DT system, we further define the weighted system cost function, which can be formulated as:

$Q(\mathcal{W},\mathcal{L},\mathcal{G},\mathcal{X})=\alpha T+(1-\alpha)E$ (17)

where $\mathcal{W}=\{w_{n}\,|\,n\in\mathcal{N}\}$ and $\alpha\in[0,1]$ is the weight coefficient that trades off energy consumption against total delay. For instance, the system delay is the only concern when $\alpha=1$, and we only focus on the energy cost if $\alpha=0$.
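To make the matrix formulation concrete, the following sketch computes the weighted cost $Q$ of a candidate placement from the quantities defined above; NumPy is used only for illustration, and all array shapes follow the definitions in (8)-(17).

```python
import numpy as np

def system_cost(G, X, psi_t, psi_c, xi, alpha=0.5):
    """Weighted system cost Q of Eq. (17) for a candidate placement X.

    G:     (N, M) device ownership matrix, Eq. (8)
    X:     (M, S+1) one-hot DT placement matrix, Eq. (9)
    psi_t: (N, S+1) transmission delays of each device on every server
    psi_c: (N, S+1) execution delays of each device on every server
    xi:    (N, S+1) energy costs of each device on every server
    """
    D = G @ X                              # device-level decisions, Eq. (10)
    T_t = np.sum(D * psi_t, axis=1)        # transmission times, Eq. (11)
    T_c = np.sum(D * psi_c, axis=1)        # execution times, Eq. (12)
    E_n = np.sum(D * xi, axis=1)           # per-device energy

    T = E = 0.0
    for m in range(G.shape[1]):
        members = G[:, m] == 1             # devices belonging to the m-th DT
        if not members.any():
            continue
        I_m = members.sum()                # ||I_m||, the DT's device count
        # synchronization waits for the slowest member, Eqs. (15)-(16)
        T += I_m * (T_t[members].max() + T_c[members].sum())
        E += E_n[members].sum()            # Eqs. (13)-(14)
    return alpha * T + (1 - alpha) * E     # Eq. (17)
```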

Therefore, according to the DT network scenario, the optimal decision-making problem can be transformed into an optimization problem 𝒫\mathcal{P}:

$(\mathcal{P})\quad \min:\;Q(\mathcal{W},\mathcal{L},\mathcal{G},\mathcal{X})=\alpha T+(1-\alpha)E$ (18a)
$\text{s.t.}:\;X_{m,s}\in\{0,1\},\;m\in\mathcal{M},\;s\in\mathcal{S}$ (18b)
$\sum_{s=1}^{S+1}X_{m,s}=1,\;m\in\mathcal{M}$ (18c)

The problem ($\mathcal{P}$) is a MIP and nonconvex problem with a high-dimensional state space. To find the optimal solution of this nonlinear function, one needs to select the target decision from $(S+1)^{M}$ possible decisions. The computational complexity of this NP-hard problem increases exponentially as the numbers of DTs and servers grow. In order to solve the problem efficiently, we develop a new approach for finding near-optimal decisions based on DDL.
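For reference, the exhaustive search over the $(S+1)^{M}$ candidate placements can be sketched as follows, reusing the `system_cost` helper above; it illustrates why enumeration is only tractable for very small $M$ and $S$.

```python
import itertools
import numpy as np

def brute_force(G, psi_t, psi_c, xi, M, S, alpha=0.5):
    """Enumerate all (S+1)**M placements and keep the cheapest one.

    Only feasible for tiny instances: with M = 15 DTs and S = 3 edge
    servers there are already 4**15 (about 1e9) candidate decisions.
    """
    best_Q, best_X = np.inf, None
    for choice in itertools.product(range(S + 1), repeat=M):
        X = np.eye(S + 1)[list(choice)]    # one-hot rows, shape (M, S+1)
        Q = system_cost(G, X, psi_t, psi_c, xi, alpha)
        if Q < best_Q:
            best_Q, best_X = Q, X
    return best_Q, best_X
```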

4 ALGORITHM


Figure 2: The procedure of the proposed algorithm

To solve the decision-making problem ($\mathcal{P}$) with efficiency and accuracy, we propose a DDL based algorithm. A fully connected network, which can extract features from the workloads of each DT and simplify the network structure, is also added to handle the synchronization problem. The framework of our algorithm is depicted in Figure 2.

4.1 Framework

Our framework can be divided into two parts, namely, the feature extraction module based on a fully connected network and the decision module based on the distributed deep learning approach. For any given workload information $\mathcal{W}$, locations $\mathcal{L}$, and device information $\mathcal{G}$, we aim to output the respective near-optimal offloading decision $\mathcal{X}$.

Before the decision-making module, for each DT we first input $\mathcal{W}$ and $\mathcal{L}$ to the feature extractor module according to $\mathcal{G}$. Features for each DT are extracted, which are then regarded as inputs of the decision module to reduce the dimension of the problem. Although $\mathcal{W}$, $\mathcal{L}$ and $\mathcal{G}$ could be fed to the DDL process directly, the relationship between the devices of the same DT would be ignored, and the number of trainable parameters would increase significantly, causing unnecessary computational cost. Besides, it is hard to use conventional machine learning approaches to extract the features: the training process always needs adequate training data, which is hard to get, and the features have to be pre-decided by experience. In our scheme, however, the feature extraction module can be trained together with the decision module without data by utilizing the advantage of distributed deep learning, which will be elaborated in the next section.

The decision module consists of $K$ parallel DNNs with the same structure but different initial weight parameters. Each DNN takes the data given by the feature extraction module as input and outputs offloading decisions for the application scenario. In the hidden layers, we use ReLU as the activation function, and in the output layer we use the Sigmoid activation function to map the output into $(0,1)$. Since the offloading decision is represented by a sequence of binary integers, we have to map the values given by the network into the desired format. For the $k^{th}$ DNN, we use $\hat{\mathcal{V}}^{k}$ to represent the output values and $\mathcal{V}^{k}$ as the corresponding binary output. For any element $\hat{V}_{i}$ in $\hat{\mathcal{V}}^{k}$, the binary offloading decision $V_{i}$ can be determined by:

$V_{i}=\begin{cases}1,&\text{if }\hat{V}_{i}\leq\frac{1}{2},\\ 0,&\text{if }\hat{V}_{i}>\frac{1}{2}.\end{cases}$ (19)

As a result, the offloading decision $\mathcal{V}^{k}$ of each DT is represented by a sequence in binary format. According to the definition of the offloading decision in problem ($\mathcal{P}$), each sequence is then converted from the binary format into the one-hot format and composed into the required offloading decision $\mathcal{X}^{k}$.
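As an illustration, the following sketch converts the thresholded outputs into the one-hot placement matrix $\mathcal{X}^{k}$. The specific binary encoding, with each DT encoded by $\lceil\log_{2}(S+1)\rceil$ outputs read as a server index, is our assumption; the paper only states that binary sequences are converted to the one-hot format.

```python
import numpy as np

def outputs_to_decision(v_hat, M, S):
    """Convert one DNN's sigmoid outputs into a placement matrix X.

    Assumption: each DT is encoded by B = ceil(log2(S+1)) outputs that
    are thresholded by Eq. (19) and read as a binary server index.
    """
    B = int(np.ceil(np.log2(S + 1)))
    bits = (np.asarray(v_hat).reshape(M, B) <= 0.5).astype(int)  # Eq. (19)
    weights = 2 ** np.arange(B - 1, -1, -1)  # most significant bit first
    idx = bits @ weights                     # binary sequence -> server index
    idx = np.minimum(idx, S)                 # clamp codes beyond the last server
    return np.eye(S + 1, dtype=int)[idx]     # one-hot rows, shape (M, S+1)
```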

During the decision-making procedure, the input data is replicated and fed into each DNN. Since the multiple DNNs have different weights, we can obtain $K$ possible offloading decisions $\{\mathcal{X}^{1},\mathcal{X}^{2},\cdots,\mathcal{X}^{K}\}$. The performance in terms of time delay and energy consumption of each decision can be calculated using the cost function $Q(\mathcal{W},\mathcal{L},\mathcal{G},\mathcal{X})$. Then the best decision $\tilde{\mathcal{X}}$ with the lowest system cost can be determined, which can be either used as the final decision or regarded as training data.

4.2 Train

Taking the complexity and variability of hybrid MEC/MCC environments into account, we notice the difficulty of collecting abundant training data of the target scenario for traditional deep learning based approaches. Besides, generating optimal decisions via conventional methods is also time-consuming when the numbers of servers and DTs increase. In our approach, multiple DNNs are used to train the DDL model without any pre-collected training data. The main procedure is described as follows.

We first randomly initialize the weight parameters of each DNN differently. To be specific, the weights of each layer obey a normal distribution, but the same layer of different DNNs has different weight parameters. Then, we randomly generate a set of input data consisting of $\mathcal{W}$, $\mathcal{L}$, and $\mathcal{G}$ to simulate one possible situation. After that, we input all the data to each DNN, and these $K$ parallel DNNs will output $K$ different decisions. Based on the generated situation and decisions, we can calculate the current best decision $\tilde{\mathcal{X}}$. Although the DNNs have not yet been trained well for the same input, the best decision still represents better performance and can be regarded as labeled data. Improvements will be achieved theoretically if we use the selected decision to train the other DNNs. Thus we regard $\{\mathcal{W},\mathcal{L},\mathcal{G},\tilde{\mathcal{X}}\}$ as one training sample and store it in the training data set. The process is iterated until the upper limit of the data size is reached.

To train the model, we randomly choose a batch of training data from the data set for each DNN. The batches are different from each other: if all the DNNs were trained with the same data, the weights of the neural networks would soon become identical as the iterations progress, which would hinder the exploration of possible decisions. Then the cross-entropy loss function is applied for each DNN and minimized based on gradient descent. We denote the training data set of the $k^{th}$ DNN as $\mathcal{R}^{k}=\{(\mathcal{W}_{i}^{k},\mathcal{L}_{i}^{k},\mathcal{G}_{i}^{k},\tilde{\mathcal{X}}_{i}^{k})\,\big|\,1\leq i\leq U\}$, where $U$ is the size of each data batch. Then the loss of the $k^{th}$ DNN can be calculated as:

$L(\pi_{k})=-\frac{1}{U}\sum_{i=1}^{U}\Big((\tilde{\mathcal{X}}_{i}^{k})^{\top}\log f_{\pi_{k}}(\mathcal{W}_{i}^{k},\mathcal{L}_{i}^{k})+(1-\tilde{\mathcal{X}}_{i}^{k})^{\top}\log\big(1-f_{\pi_{k}}(\mathcal{W}_{i}^{k},\mathcal{L}_{i}^{k})\big)\Big)$ (20)

where $\pi_{k}$ represents the parameters of the $k^{th}$ DNN and $f_{\pi_{k}}$ is the corresponding output.
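In TensorFlow, the framework used in Section 5, one training step of the $k^{th}$ DNN with the cross-entropy loss of (20) might look as follows; `model_k`, the feature tensor, and the flattened best-decision labels are placeholder names for this sketch.

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()  # realizes Eq. (20)
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

def train_step(model_k, features, x_best):
    """One gradient step for the k-th DNN on its sampled batch.

    model_k:  a Keras model with sigmoid outputs (f_{pi_k})
    features: extracted features of the batch
    x_best:   selected best decisions, flattened to the output format
    """
    with tf.GradientTape() as tape:
        y_pred = model_k(features, training=True)
        loss = bce(x_best, y_pred)  # cross-entropy against the best decision
    grads = tape.gradient(loss, model_k.trainable_variables)
    optimizer.apply_gradients(zip(grads, model_k.trainable_variables))
    return loss
```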

After updating the parameters $\{\pi_{1},\pi_{2},\dots,\pi_{K}\}$ with the Adam optimizer [43] according to the calculated loss values, the DNNs should have better performance and a higher decision-making level. Therefore, we repeat the data generation process described in the previous part and replace part of the data set with these new samples of higher accuracy.

The parallel DNNs keep learning from the best decisions among each other, and the data set is constantly updated to ensure the training performance. Iterations are conducted, and the neural networks are continuously trained to approximate the globally optimal offloading decisions. The pseudo-code of the proposed DDL algorithm is provided in Algorithm 1.

Algorithm 1 DDL-based Offloading Algorithm

Input: Workload $\mathcal{W}$, location $\mathcal{L}$, devices information $\mathcal{G}$
Output: Offloading decision $\tilde{\mathcal{X}}$

1: Initialization: Initialize each DNN with random parameters $\pi_{k}$ and empty the database
2: for $j=1,2,3,\cdots,N$ do
3:     Randomly generate a group of input $\mathcal{W}_{j}$, $\mathcal{L}_{j}$ and $\mathcal{G}_{j}$;
4:     for $k=1,2,3,\cdots,K$ do
5:         Input the generated data $\mathcal{W}_{j}$, $\mathcal{L}_{j}$ and $\mathcal{G}_{j}$ to the fully connected network;
6:         Input the extracted features to the $k^{th}$ DNN;
7:         Generate the $k^{th}$ offloading decision $\tilde{\mathcal{X}}_{j}^{k}$;
8:     end for
9:     Select decision $\tilde{\mathcal{X}}_{j}=\arg\min_{\{\tilde{\mathcal{X}}_{j}^{k}\}}Q(\mathcal{W}_{j},\mathcal{L}_{j},\mathcal{G}_{j},\tilde{\mathcal{X}}_{j}^{k})$;
10:     Calculate $Q(\mathcal{W}_{j},\mathcal{L}_{j},\mathcal{G}_{j},\tilde{\mathcal{X}}_{j})$ as $Q_{j}$;
11:     if the database is not full then
12:         Store $(\mathcal{W}_{j},\mathcal{L}_{j},\mathcal{G}_{j},\tilde{\mathcal{X}}_{j},Q_{j})$ into the database;
13:     else
14:         Replace the oldest data with $(\mathcal{W}_{j},\mathcal{L}_{j},\mathcal{G}_{j},\tilde{\mathcal{X}}_{j},Q_{j})$;
15:         Randomly choose $K$ batches of training data;
16:         Train each DNN using a selected batch of data and update its parameters $\pi_{k}$ using the Adam optimizer;
17:     end if
18: end for
19: for $k=1,2,3,\cdots,K$ do
20:     Input $\mathcal{W}$, $\mathcal{L}$ and $\mathcal{G}$ to the fully connected network;
21:     Input the extracted features to the $k^{th}$ DNN;
22:     Generate the $k^{th}$ offloading decision candidate $\tilde{\mathcal{X}}^{k}$;
23: end for
24: Select decision $\tilde{\mathcal{X}}=\arg\min_{\{\tilde{\mathcal{X}}^{k}\}}Q(\mathcal{W},\mathcal{L},\mathcal{G},\tilde{\mathcal{X}}^{k})$;
25: return Offloading decision $\tilde{\mathcal{X}}$

4.3 Test

For traditional deep learning based approaches, the model’s accuracy can be tested using the labeled data. However, the distributed deep learning based model is trained without labeled data. Besides, the optimal decision is hard to obtain using both traditional optimization methods and intelligent methods. To evaluate the performance and verify the feasibility of the proposed algorithm, we need to adopt a plausible verification method based on the characteristics of DDL. The testing process of the model can be separated into two parts, the convergence performance during training and the accuracy performance after the model is trained well.

Firstly, we need to ensure that the whole model converges after iterations. More specifically, after finite training iterations, the generated decisions for different inputs remain the same before and after extra iterations. Although the loss value can show the convergence performance of each DNN, the convergence of the model mainly concerns the best decision chosen from all the DNNs. So, we use $U_{0}$ randomly generated input data as the test data, denoted as $\mathcal{R}^{\mathcal{T}}=\{(\mathcal{W}_{i},\mathcal{L}_{i},\mathcal{G}_{i})\,\big|\,1\leq i\leq U_{0}\}$. We input $\mathcal{R}^{\mathcal{T}}$ into the model before and after each training epoch and generate the corresponding decisions. The system cost of each decision is then calculated, including the costs for the old decisions before training and the new decisions after training, written as $C^{old}=\{C_{i}^{old}\,|\,1\leq i\leq U_{0}\}$ and $C^{new}=\{C_{i}^{new}\,|\,1\leq i\leq U_{0}\}$. The convergence rate can be derived as follows:

$\mathcal{C}=\frac{1}{U_{0}}\sum_{i=1}^{U_{0}}\frac{\min(C_{i}^{old},C_{i}^{new})}{\max(C_{i}^{old},C_{i}^{new})}$ (21)

where $\mathcal{C}\in(0,1]$, and $\mathcal{C}$ approaches one as the model converges better.
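The convergence metric of (21) is straightforward to implement; a minimal version is:

```python
import numpy as np

def convergence_rate(cost_old, cost_new):
    """Convergence metric of Eq. (21).

    Returns the mean ratio of the smaller to the larger system cost
    before and after a training epoch; it approaches 1 as the
    generated decisions stop changing.
    """
    old = np.asarray(cost_old, dtype=float)
    new = np.asarray(cost_new, dtype=float)
    return float(np.mean(np.minimum(old, new) / np.maximum(old, new)))
```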

After the model is proven to converge well, we still need to make sure that the model makes near-optimal decisions rather than converging to some local optimum. So, several baselines are applied for comparison evaluations due to the lack of optimal decisions. We use the randomly generated dataset as the test data, too. For the same test data set, we measure the average cost under the given decision-making schemes and our approach, which shows the improvements and drawbacks of our approach. Besides, we also change the weight coefficient $\alpha$ to different values to ensure that the proposed algorithm has reliable accuracy in various trade-off situations.

5 PERFORMANCE EVALUATION

In this section, we set up numerical simulations under different application scenarios to demonstrate the effectiveness of the proposed approach. Experiments are carefully designed according to the testing criteria presented above.

5.1 Simulation Setup

In our simulation, we consider a heterogeneous edge/cloud environment consisting of one cloud server and three edge servers. Considering the system congestion and the number of threads, we set the equivalent clock frequency of the cloud server to $f_{c}=3.5$ GHz. Similarly, the clock frequencies of the edge servers are randomly distributed between $1.8$ and $3.0$ GHz. In addition, since the cloud server has better energy efficiency than the edge servers, we set the calculation energy consumption parameters to $\theta_{c}=0.1$ mJ and $\theta_{e}^{s}=0.125$ mJ, respectively.

Then we assume that the DT edge/cloud system contains up to 120 data collection devices, which belong to at most 15 different DTs. The locations and device ownership are randomly simulated in different situations, and the workload of each device is drawn from the range $[10,40]$ MB. The transmission cost for one unit of data is set as $e_{t}^{c}=0.15$ mJ and $e_{t}^{e}=0.125$ mJ. Then we assume that the bandwidth between each device and the base station is 1000 Mbps. A summary of our evaluation parameters is given in Table II.

TABLE II: Evaluation Parameters
Parameters Values
The number of cloud servers 1
The number of edge servers $S$ = 3
The number of devices for data collection $N$ = 120
The number of DTs $M$ = 15
The size of the field 1000 m $\times$ 800 m
The clock frequency of the cloud server $f_{c}=3.5$ GHz
The clock frequency of the edge servers $f_{e}^{s}\in[1.8,3.0]$ GHz
The data size of each device $[10,40]$ MB
The energy to transmit a unit of data in MEC $e_{t}^{e}=0.125$ mJ
The energy to transmit a unit of data in MCC $e_{t}^{c}=0.15$ mJ
The energy to execute each instruction in MCC $\theta_{c}=0.1$ mJ
The energy to execute each instruction in MEC $\theta_{e}^{s}=0.125$ mJ
The bandwidth of each device $b_{n}$ = 1000 Mbps
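For reproducibility, the settings of Table II can be collected into a single configuration; the sampling below shows one plausible way to draw a random scenario, and the generator choice is an assumption.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Settings from Table II; the random draws are one plausible way to
# sample a scenario, not necessarily the paper's exact generator.
params = {
    "num_cloud": 1,
    "num_edge": 3,                                  # S
    "num_devices": 120,                             # N
    "num_dts": 15,                                  # M
    "field_size": (1000.0, 800.0),                  # meters
    "f_c": 3.5e9,                                   # Hz
    "f_e": rng.uniform(1.8e9, 3.0e9, size=3),       # Hz, one per edge server
    "workload": rng.uniform(10.0, 40.0, size=120),  # MB per device
    "e_t_e": 0.125, "e_t_c": 0.15,                  # mJ per unit of data
    "theta_c": 0.1, "theta_e": 0.125,               # mJ per instruction
    "bandwidth": 1000.0,                            # Mbps per device
}
# Random device locations and ownership (which DT each device belongs to)
locations = rng.uniform((0.0, 0.0), params["field_size"], size=(120, 2))
ownership = rng.integers(0, params["num_dts"], size=120)
```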

For each DNN in our model, we consider a network with two hidden layers as the decision-making module and a fully connected network with one hidden layer to extract the features of each group. We implement the proposed algorithm in Python 3.8.0 with TensorFlow 2.9.1. All the simulations are performed on an Intel Core i7-12700H CPU with 16 GB memory.
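A minimal Keras sketch of this architecture is given below; the layer widths, feature dimension, and output dimension are illustrative assumptions, since the paper does not report them.

```python
import tensorflow as tf

def build_feature_extractor(input_dim, feature_dim, hidden=64):
    """Fully connected extractor with one hidden layer (widths assumed)."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(hidden, activation="relu"),
        tf.keras.layers.Dense(feature_dim, activation="relu"),
    ])

def build_decision_dnn(feature_dim, output_dim, hidden=(128, 64)):
    """Decision DNN: two ReLU hidden layers and a sigmoid output layer."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(feature_dim,)),
        tf.keras.layers.Dense(hidden[0], activation="relu"),
        tf.keras.layers.Dense(hidden[1], activation="relu"),
        tf.keras.layers.Dense(output_dim, activation="sigmoid"),
    ])

K = 12  # number of parallel DNNs chosen in Section 5.2.2
dnns = [build_decision_dnn(feature_dim=32, output_dim=30) for _ in range(K)]
```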

5.2 Training Evaluation

In this part, we measure the convergence performance under different learning rates, DNN numbers, and database sizes. We also generate a test set consisting of 1024 different scenarios as the test data set. After each iteration, we input the test set into the model and calculate the average cost, which represents the accuracy change of our approach.

5.2.1 Impact of Learning Rate


Figure 3: Convergence performance of learning rates


Figure 4: Accuracy performance of learning rates

The learning rate is an essential hyperparameter that affects the performance of our approach. For a traditional DNN-based model, if the learning rate is too low, the model converges very slowly, and more epochs are needed for training. On the contrary, if the learning rate is too large, the model converges fast but cannot achieve good accuracy.

In our experiments, we adjust the learning rate from 0.00001 to 0.01, and the convergence performance is shown in Figure 3. Since our DNNs are trained with the training data generated by the model itself, the impact of the learning rate is magnified by the iteration process. On the one hand, if the parameters change too fast during training, the database is replaced by data with low accuracy, which causes the model to skip a lot of hidden information and fall into a locally optimal solution or become under-fitted. From Figure 4, we can see that the accuracy under a high learning rate is unacceptable. On the other hand, the database is renewed at a low speed if the learning rate is too low. Thus the model trains more slowly, which requires additional training cost.

5.2.2 Impact of Number of DNNs


Figure 5: Convergence performance of DNN number


Figure 6: Accuracy performance of DNN number

During our experiments, the number of DNNs is adjusted from 2 to 16, and each model is trained for 1000 iterations. The convergence rate and accuracy performance are measured after every iteration using the approach defined in the previous section. From Figure 5, we can see that the model converges as training proceeds. The convergence rate changes quickly at the beginning and then becomes stable, since a decision does not change until the corresponding output crosses the dividing line defined in equation (19).

When the number of DNNs increases, fewer iterations are needed for the model to converge, and the stability during training increases. From Figure 6, we can also see that more DNNs yield better initial performance and higher final accuracy. However, additional DNNs cannot ensure better accuracy and convergence speed once the model already uses enough DNNs. As a consequence, we use 12 DNNs to avoid redundant training costs.

5.2.3 Impact of Database Size

Sufficient training data is needed when training a traditional DNN model. When the data is inadequate, the model will have low accuracy and may become over-fitted or under-fitted. To ensure the model's ability to explore more possible decisions, in our model we train different DNNs with different batches of data, and part of the database is renewed after each iteration.

As shown in Figures 7 and 8, the database size has little impact on the decision-making model compared with the learning rate and the number of DNNs. When the database size is already sufficient for each training iteration, extra data will not improve the accuracy of the model; instead, the database is renewed more slowly, and a relatively longer training time is needed. So, when facing a new application environment, we can first use a database with a higher volume to test the feasibility of the model and then decrease the size to minimize the training cost.


Figure 7: Convergence performance of database size


Figure 8: Accuracy performance of database size

5.3 Performance Comparison


Figure 9: Comparison with various schemes under different weighting parameters

By evaluating the convergence and accuracy performance under different hyperparameter settings, we have shown that the proposed algorithm can be trained without labeled data and significantly reduces system consumption. To gain better insight into the effectiveness of our approach, we choose several traditional schemes as baselines. We then change the weight coefficient from 0 to 1 and test the average system cost of the different schemes. The following schemes are tested for the comparison analysis (a brief sketch of the baselines follows the list):

  • Random Offloading (RO): In this method, all the decisions in the decision space are chosen randomly with the same probability.

  • Cloud Only (CO): In this method, all the DTs are maintained by the cloud server.

  • Average Distribution (AD): In this method, the DTs are divided among the servers according to the size of their workloads, so that each server carries an approximately equal amount of workload.

  • Our algorithm: In this method, we apply the proposed algorithm to generate offloading decisions.
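The baselines admit simple implementations; the sketch below shows one plausible reading of each scheme (the exact AD rule is our assumption), each returning a placement matrix $\mathcal{X}$ compatible with the cost function above.

```python
import numpy as np

def random_offloading(M, S, rng):
    """RO: pick one of the S+1 servers uniformly at random for each DT."""
    return np.eye(S + 1, dtype=int)[rng.integers(0, S + 1, size=M)]

def cloud_only(M, S):
    """CO: place every DT on the cloud server (index S)."""
    return np.eye(S + 1, dtype=int)[np.full(M, S)]

def average_distribution(dt_workloads, S):
    """AD: greedily place DTs so all servers carry similar workloads.

    This greedy rule is our interpretation of the scheme described above.
    """
    load = np.zeros(S + 1)
    X = np.zeros((len(dt_workloads), S + 1), dtype=int)
    for m in np.argsort(dt_workloads)[::-1]:  # heaviest DT first
        s = int(np.argmin(load))              # currently least-loaded server
        load[s] += dt_workloads[m]
        X[m, s] = 1
    return X
```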

The comparison results of the different offloading schemes are depicted in Figure 9. The weight coefficient $\alpha$ of the cost function (17) represents the trade-off between system delay and total energy consumption. It can be seen that our model converges and decreases the system cost significantly under different weight coefficients. Compared with the baselines, our approach achieves better performance, especially in reducing the system delay. Besides, our algorithm also has better capabilities for handling wider application scenarios: we do not need to retrain the model when the scenario changes, e.g., when each device's workload and location information are different. The decision can be obtained through a forward propagation process within 0.01 s.

6 CONCLUSION AND FUTURE WORK

In this paper, we designed a new DT system model based on cloud computing and edge computing technologies. In the model, we assume each DT has multiple data resources, and one server is chosen to synchronize the data and maintain the DT. To optimize the system cost and allocate the servers properly, we convert the resource allocation problem into a MIP problem. A new algorithm is then proposed based on DDL to generate near-optimal decisions intelligently. Simulations show that our model can be trained according to the cost function and converges with good performance. No labeled data is needed for the network training, which allows our model to be used in a broader range of cases. In the given environment, the trained model can make decisions for different workloads and device locations with portability and efficiency. According to the comparison analysis, the energy consumption and total delay are significantly decreased compared with various baselines.

Although our work only considers a basic DT edge/cloud system model and various variables are ignored, we mainly focus on proving the feasibility of the proposed algorithm. The model is highly prospective and can easily be expanded to complicated real-world scenarios by changing the cost function. In the future, we will further study applying our algorithm to other traditional transmission models. To address the problem more effectively in applications, it is essential to evaluate the performance of both meta-learning and deep reinforcement learning within the same model to determine the most efficient approach. Besides, privacy protection and user data security are other issues of the model because of the potential risks associated with data collection, storage, and usage. Since the distributed deep learning based algorithm is very compatible with federated learning and blockchain technologies, more work should be done to ensure the system's privacy.

References

  • [1] A. Fuller, Z. Fan, C. Day, and C. Barlow, “Digital twin: Enabling technologies, challenges and open research,” IEEE Access, vol. 8, pp. 108952–108971, 2020.
  • [2] J. Xiao, Y. Qian, W. Du, Y. Wang, Y. Jiang, and Y. Liu, “VR/AR/MR in the electricity industry: Concepts, techniques, and applications,” in 2023 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW). IEEE, 2023, pp. 82–88.
  • [3] Y. Wu, K. Zhang, and Y. Zhang, “Digital twin networks: A survey,” IEEE Internet of Things Journal, vol. 8, no. 18, pp. 13789–13804, 2021.
  • [4] F. Tao, H. Zhang, A. Liu, and A. Y. Nee, “Digital twin in industry: State-of-the-art,” IEEE Transactions on Industrial Informatics, vol. 15, no. 4, pp. 2405–2415, 2018.
  • [5] X. Fang, H. Wang, G. Liu, X. Tian, G. Ding, and H. Zhang, “Industry application of digital twin: From concept to implementation,” The International Journal of Advanced Manufacturing Technology, vol. 121, no. 7-8, pp. 4289–4312, 2022.
  • [6] Y. Wang, L.-H. Lee, T. Braud, and P. Hui, “Re-shaping post-COVID-19 teaching and learning: A blueprint of virtual-physical blended classrooms in the metaverse era,” in 2022 IEEE 42nd International Conference on Distributed Computing Systems Workshops (ICDCSW). IEEE, 2022, pp. 241–247.
  • [7] L. U. Khan, Z. Han, W. Saad, E. Hossain, M. Guizani, and C. S. Hong, “Digital twin of wireless systems: Overview, taxonomy, challenges, and opportunities,” IEEE Communications Surveys & Tutorials, 2022.
  • [8] X. Liu, J. Yu, Y. Liu, Y. Gao, T. Mahmoodi, S. Lambotharan, and D. H. Tsang, “Distributed intelligence in wireless networks,” IEEE Open Journal of the Communications Society, 2023.
  • [9] J. Yu, A. Alhilal, P. Hui, and D. H. Tsang, “6G mobile-edge empowered metaverse: Requirements, technologies, challenges and research directions,” arXiv preprint arXiv:2211.04854, 2022.
  • [10] W. Sun, S. Lei, L. Wang, Z. Liu, and Y. Zhang, “Adaptive federated learning and digital twin for industrial internet of things,” IEEE Transactions on Industrial Informatics, vol. 17, no. 8, pp. 5605–5614, 2020.
  • [11] Q. Song, S. Lei, W. Sun, and Y. Zhang, “Adaptive federated learning for digital twin driven industrial internet of things,” in 2021 IEEE Wireless Communications and Networking Conference (WCNC). IEEE, 2021, pp. 1–6.
  • [12] P. Wang, N. Xu, W. Sun, G. Wang, and Y. Zhang, “Distributed incentives and digital twin for resource allocation in air-assisted internet of vehicles,” in 2021 IEEE Wireless Communications and Networking Conference (WCNC). IEEE, 2021, pp. 1–6.
  • [13] W. Sun, P. Wang, N. Xu, G. Wang, and Y. Zhang, “Dynamic digital twin and distributed incentives for resource allocation in aerial-assisted internet of vehicles,” IEEE Internet of Things Journal, vol. 9, no. 8, pp. 5839–5852, 2021.
  • [14] W. Sun, N. Xu, L. Wang, H. Zhang, and Y. Zhang, “Dynamic digital twin and federated learning with incentives for air-ground networks,” IEEE Transactions on Network Science and Engineering, 2020.
  • [15] T. Wang, N. Huang, M. Dai, Y. Wu, L. Qian, and B. Lin, “Energy efficient digital twin with federated learning via non-orthogonal multiple access transmission,” in 2022 IEEE 95th Vehicular Technology Conference (VTC2022-Spring). IEEE, 2022, pp. 1–6.
  • [16] W. Sun, H. Zhang, R. Wang, and Y. Zhang, “Reducing offloading latency for digital twin edge networks in 6G,” IEEE Transactions on Vehicular Technology, vol. 69, no. 10, pp. 12240–12251, 2020.
  • [17] Y. Lu, X. Huang, K. Zhang, S. Maharjan, and Y. Zhang, “Communication-efficient federated learning and permissioned blockchain for digital twin edge networks,” IEEE Internet of Things Journal, vol. 8, no. 4, pp. 2276–2288, 2020.
  • [18] T. Do-Duy, D. Van Huynh, O. A. Dobre, B. Canberk, and T. Q. Duong, “Digital twin-aided intelligent offloading with edge selection in mobile edge computing,” IEEE Wireless Communications Letters, vol. 11, no. 4, pp. 806–810, 2022.
  • [19] B. Li, Y. Liu, L. Tan, H. Pan, and Y. Zhang, “Digital twin assisted task offloading for aerial edge computing and networks,” IEEE Transactions on Vehicular Technology, vol. 71, no. 10, pp. 10863–10877, 2022.
  • [20] K. Zhang, J. Cao, S. Maharjan, and Y. Zhang, “Digital twin empowered content caching in social-aware vehicular edge networks,” IEEE Transactions on Computational Social Systems, vol. 9, no. 1, pp. 239–251, 2021.
  • [21] Y. Lu, X. Huang, K. Zhang, S. Maharjan, and Y. Zhang, “Communication-efficient federated learning for digital twin edge networks in industrial iot,” IEEE Transactions on Industrial Informatics, vol. 17, no. 8, pp. 5709–5718, 2020.
  • [22] P. Shu, F. Liu, H. Jin, M. Chen, F. Wen, Y. Qu, and B. Li, “etime: Energy-efficient transmission between cloud and mobile devices,” in 2013 Proceedings IEEE INFOCOM.   IEEE, 2013, pp. 195–199.
  • [23] H. Wu, Y. Sun, and K. Wolter, “Energy-efficient decision making for mobile cloud offloading,” IEEE Transactions on Cloud Computing, vol. 8, no. 2, pp. 570–584, 2018.
  • [24] Y. Li, S. Xia, m. Zheng, B. Cao, and Q. Liu, “Lyapunov optimization-based trade-off policy for mobile cloud offloading in heterogeneous wireless networks,” IEEE Transactions on Cloud Computing, vol. 10, no. 1, pp. 491–505, 2019.
  • [25] G. Zhang, W. Zhang, Y. Cao, D. Li, and L. Wang, “Energy-delay tradeoff for dynamic offloading in mobile-edge computing system with energy harvesting devices,” IEEE Transactions on Industrial Informatics, vol. 14, no. 10, pp. 4642–4655, 2018.
  • [26] Y. Mao, J. Zhang, and K. B. Letaief, “Dynamic computation offloading for mobile-edge computing with energy harvesting devices,” IEEE Journal on Selected Areas in Communications, vol. 34, no. 12, pp. 3590–3605, 2016.
  • [27] H. Wu, K. Wolter, P. Jiao, Y. Deng, Y. Zhao, and M. Xu, “Eedto: an energy-efficient dynamic task offloading algorithm for blockchain-enabled iot-edge-cloud orchestrated computing,” IEEE Internet of Things Journal, vol. 8, no. 4, pp. 2163–2176, 2020.
  • [28] H. Wu, W. J. Knottenbelt, and K. Wolter, “An efficient application partitioning algorithm in mobile environments,” IEEE Transactions on Parallel and Distributed Systems, vol. 30, no. 7, pp. 1464–1480, 2019.
  • [29] W. Zhang and Y. Wen, “Energy-efficient task execution for application as a general topology in mobile cloud computing,” IEEE Transactions on cloud Computing, vol. 6, no. 3, pp. 708–719, 2015.
  • [30] M. Li, Q. Wu, J. Zhu, R. Zheng, and M. Zhang, “A computing offloading game for mobile devices and edge cloud servers,” Wireless Communications and Mobile Computing, vol. 2018, 2018.
  • [31] E. El Haber, T. M. Nguyen, D. Ebrahimi, and C. Assi, “Computational cost and energy efficient task offloading in hierarchical edge-clouds,” in 2018 IEEE 29th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC).   IEEE, 2018, pp. 1–6.
  • [32] L. Huang, X. Feng, C. Zhang, L. Qian, and Y. Wu, “Deep reinforcement learning-based joint task offloading and bandwidth allocation for multi-user mobile edge computing,” Digital Communications and Networks, vol. 5, no. 1, pp. 10–17, 2019.
  • [33] B. Dab, N. Aitsaadi, and R. Langar, “Q-learning algorithm for joint computation offloading and resource allocation in edge cloud,” in 2019 IFIP/IEEE Symposium on Integrated Network and Service Management (IM).   IEEE, 2019, pp. 45–52.
  • [34] X. Chen, H. Zhang, C. Wu, S. Mao, Y. Ji, and M. Bennis, “Optimized computation offloading performance in virtual edge computing systems via deep reinforcement learning,” IEEE Internet of Things Journal, vol. 6, no. 3, pp. 4005–4018, 2018.
  • [35] W. Zhan, C. Luo, J. Wang, C. Wang, G. Min, H. Duan, and Q. Zhu, “Deep-reinforcement-learning-based offloading scheduling for vehicular edge computing,” IEEE Internet of Things Journal, vol. 7, no. 6, pp. 5449–5465, 2020.
  • [36] C. Qiu, X. Wang, H. Yao, J. Du, F. R. Yu, and S. Guo, “Networking integrated cloud–edge–end in iot: A blockchain-assisted collective q-learning approach,” IEEE Internet of Things Journal, vol. 8, no. 16, pp. 12 694–12 704, 2020.
  • [37] L. Huang, L. Zhang, S. Yang, L. P. Qian, and Y. Wu, “Meta-learning based dynamic computation task offloading for mobile edge computing networks,” IEEE Communications Letters, vol. 25, no. 5, pp. 1568–1572, 2020.
  • [38] J. Wang, J. Hu, G. Min, A. Y. Zomaya, and N. Georgalas, “Fast adaptive task offloading in edge computing based on meta reinforcement learning,” IEEE Transactions on Parallel and Distributed Systems, vol. 32, no. 1, pp. 242–253, 2020.
  • [39] G. Qu, H. Wu, R. Li, and P. Jiao, “Dmro: A deep meta reinforcement learning-based task offloading framework for edge-cloud computing,” IEEE Transactions on Network and Service Management, vol. 18, no. 3, pp. 3448–3459, 2021.
  • [40] Z. Zhang, N. Wang, H. Wu, C. Tang, and R. Li, “Mr-dro: A fast and efficient task offloading algorithm in heterogeneous edge/cloud computing environments,” IEEE Internet of Things Journal, pp. 1–1, 2021.
  • [41] L. Huang, X. Feng, A. Feng, Y. Huang, and L. P. Qian, “Distributed deep learning-based offloading for mobile edge computing networks,” Mobile Networks and Applications, 2018.
  • [42] L. Huang, S. Bi, and Y. J. A. Zhang, “Deep reinforcement learning for online computation offloading in wireless powered mobile-edge computing networks,” 2018.
  • [43] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.