
PORA: Predictive Offloading and Resource Allocation in Dynamic Fog Computing Systems

Xin Gao, Xi Huang, Simeng Bian, Ziyu Shao, and Yang Yang

X. Gao, X. Huang, S. Bian, Z. Shao, and Y. Yang are with the School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China. (E-mail: {gaoxin, huangxi, biansm, shaozy, yangyang}@shanghaitech.edu.cn) (Corresponding author: Ziyu Shao)
Abstract

In multi-tiered fog computing systems, to accelerate the processing of computation-intensive tasks for real-time IoT applications, resource-limited IoT devices can offload part of their workloads to nearby fog nodes, whereafter such workloads may be further offloaded to upper-tier fog nodes with greater computation capacities. Such hierarchical offloading, though promising to shorten processing latencies, may also induce excessive power consumption and latencies for wireless transmissions. Given the temporal variation of various system dynamics, such a trade-off makes it rather challenging to conduct effective and online offloading decision making. Meanwhile, the fundamental benefits of predictive offloading to fog computing systems remain unexplored. In this paper, we focus on the problem of dynamic offloading and resource allocation with traffic prediction in multi-tiered fog computing systems. By formulating the problem as a stochastic network optimization problem, we aim to minimize the time-average power consumption while guaranteeing the stability of all queues in the system. We exploit the unique problem structure and propose PORA, an efficient and distributed predictive offloading and resource allocation scheme for multi-tiered fog computing systems. Our theoretical analysis and simulation results show that PORA incurs near-optimal power consumption with queue stability guarantees. Furthermore, PORA requires only a mild amount of predictive information to achieve a notable latency reduction, even with prediction errors.

Index Terms:
Internet of Things, fog computing, workload offloading, resource allocation, Lyapunov optimization, predictive offloading.

I Introduction

In the face of the proliferation of real-time IoT applications, fog computing has emerged as a promising complement to cloud computing: it extends the cloud to the edge of the network to meet the stringent latency requirements and intensive computation demands of such applications [1].

A typical fog computing system consists of a set of geographically distributed fog nodes deployed at the network periphery, with elastic resource provisioning such as storage, computation, and network bandwidth [2]. Depending on their distance to IoT devices, fog nodes are often organized in a hierarchical fashion, with each layer forming a fog tier. In such a way, resource-limited IoT devices, when heavily loaded, can delegate workloads via wireless links to nearby fog nodes, a.k.a. workload offloading, to reduce power consumption and accelerate workload processing; meanwhile, each fog node can offload workloads to nodes in its upper fog tier. However, along with these benefits come extended latencies and extra power consumption for wireless transmissions. Given such a power-latency tradeoff, two interesting questions arise. One is where and how much workload to offload between successive fog tiers. The other is how to allocate resources for workload processing and offloading. Timely decision making regarding these two questions is critical but challenging, due to the temporal variation of system dynamics in wireless environments, uncertainty in the resulting offloading latency, and unknown traffic statistics.

We summarize the main challenges of dynamic offloading and resource allocation in fog computing as follows:

  1. Characterization of system dynamics and the power-latency tradeoff: In practice, a fog system often consists of multiple tiers, with complex interplays between fog tiers and the cloud, not to mention the constantly varying dynamics and intertwined power-latency tradeoffs therein. A model that accurately characterizes the system and tradeoffs is the key to a fundamental understanding of the design space.

  2. Efficient online decision making: The decision making must be computationally efficient, so as to minimize the overheads. The difficulties often come from the uncertainty of traffic statistics, the online nature of workload arrivals, and the intrinsic complexity of the problem.

  3. Understanding the benefits of predictive offloading: One natural extension to online decision making is to employ predictive offloading to further reduce latencies and improve quality of service. For example, Netflix preloads videos onto users' devices based on user behavior prediction [3]. Despite the wide applications of such approaches, the fundamental limits of predictive offloading in fog computing still remain unknown.

TABLE I: Comparison of related works
Work | D2D-enabled IoT | IoT-Fog¹ | Fog-Fog² | Fog-Cloud³ | Dynamic | Prior Arrival Distribution | Prediction
[1]  |                 |          |          |            |         |                            |
[4]  |                 |          |          |            |         |                            |
[5]  |                 |          |          |            |         | Poisson                    |
[6]  |                 |          |          |            |         |                            |
[7]  |                 |          |          |            |         | Poisson                    |
[8]  |                 |          |          |            |         | Not Required               |
[9]  |                 |          |          |            |         | Not Required               |
[10] |                 |          |          |            |         | Not Required               |
[11] |                 |          |          |            |         | Not Required               |
Ours |                 |          |          |            |         | Not Required               |

¹ "IoT-Fog" means offloading from IoT devices to fog; ² "Fog-Fog" means offloading between fog tiers; ³ "Fog-Cloud" means offloading from fog to cloud.

In this paper, we focus on the workload offloading problem for multi-tiered fog systems. We address the above challenges by developing a fine-grained queueing model that accurately depicts such systems and by proposing an efficient online scheme that conducts offloading on a per-time-slot basis. To the best of our knowledge, we are the first to conduct a systematic study of predictive offloading in fog systems. Our key results and main contributions are summarized as follows:

  1. Problem Formulation: We formulate the problem of dynamic offloading and resource allocation as a stochastic optimization problem, aiming to minimize the long-term time-average expectation of the total power consumption of fog tiers with queue stability guarantees.

  2. Algorithm Design: Through a non-trivial transformation, we decouple the problem into a series of subproblems over time slots. By exploiting their unique structures, we propose PORA, an efficient scheme that exploits predictive scheduling to make decisions in an online manner.

  3. Theoretical Analysis and Experimental Verification: We conduct theoretical analysis and trace-driven simulations to evaluate the effectiveness of PORA. The results show that PORA achieves a tunable power-latency tradeoff while effectively reducing the average latency with only a mild amount of predictive information, even in the presence of prediction errors.

  4. New Degree of Freedom in the Design of Fog Computing Systems: We systematically investigate the fundamental benefits of predictive offloading in fog computing systems, with both theoretical analysis and numerical evaluations.

We organize the rest of the paper as follows. Section II discusses the related work. Next, in Section III, we provide an example that motivates our design for dynamic offloading and resource allocation in fog computing systems. Section IV presents the system model and problem formulation, followed by the algorithm design of PORA and its performance analysis in Section V. Section VI analyzes the results from trace-driven simulations, while Section VII concludes the paper.

II Related Work

In recent years, a series of works have been proposed to optimize the performance of fog computing systems from various aspects [1, 12, 13, 4, 5, 6, 7, 8, 9, 10, 11]. Among such works, the most related are those focusing on the design of effective offloading schemes. For example, by adopting alternating direction method of multipliers (ADMM) methods, Xiao et al. [1] and Wang et al. [4] proposed two offloading schemes for cloud-aided fog computing systems to minimize average task duration and average service response time under different energy constraints, respectively. Later, Liu et al. [5] took the social relationships among IoT users into consideration and developed a socially aware offloading scheme based on game-theoretic approaches. Misra et al. [6] studied the problem in software-defined fog computing systems and proposed a greedy heuristic scheme to conduct multi-hop task offloading with offloading path selection. Lei et al. [7] considered the joint minimization of delay and power consumption over all IoT devices; they formulated the problem as a continuous-time Markov decision process and solved it via approximate dynamic programming techniques. The above works, despite their effectiveness, generally assume the availability of statistical information on task arrivals, which is usually unattainable in practice with highly time-varying system dynamics [14].

In the face of such uncertainties, a number of works have applied stochastic optimization methods such as Lyapunov optimization techniques to online and dynamic offloading scheme design [8, 9, 10, 11]. For instance, Mao et al. [8] investigated the tradeoff between power consumption and execution delay, then developed a dynamic offloading scheme for energy-harvesting-enabled IoT devices. Chen et al. [9] designed an adaptive and efficient offloading scheme to minimize the transmission energy consumption with queueing latency guarantees. Gao et al. [10] investigated efficient offloading and social-awareness-aided network resource allocation for device-to-device-enabled (D2D-enabled) IoT users. Zhang et al. [11] designed an online rewards-optimal scheme for the computation offloading of energy-harvesting-enabled IoT devices based on Lyapunov optimization and the Vickrey-Clarke-Groves auction. Different from such works, which focus on fog computing systems with flat or two-tiered architectures, our solution is applicable to general multi-tiered fog computing systems with time-varying wireless channel states and unknown traffic statistics. Moreover, to the best of our knowledge, our solution is the first to proactively leverage predicted traffic information to optimize system performance with theoretical guarantees, and the first to investigate the fundamental benefits of predictive offloading in fog computing systems. We compare our work with the above-mentioned works in TABLE I.

III Motivating Example

In this section, we provide a motivating example to show the potential power-latency tradeoff in multi-tiered fog computing systems. The objective is to achieve low power consumption and short average packet latency.

Figure 1 shows an instance of a time-slotted fog computing system with two fog tiers, i.e., an edge fog tier and a central fog tier. Within each fog tier resides one fog node, i.e., an edge fog node (EFN) in the edge fog tier and a central fog node (CFN) in the central fog tier. The EFN connects to the CFN via a wireless link, while the CFN connects to the cloud data center over wired links. Each fog node maintains one queue to store packets. Figure 1(a) shows that during time slot $t_0$, both the EFN and the CFN store 8 packets in their queues.

We assume that each fog node sticks to one policy all the time to handle packets, i.e., either processing packets locally or offloading them to its next tier. The local processing capacities of the EFN and the CFN are 1 and 8 packets per time slot, respectively. The transmission capacities from EFN to CFN and from CFN to cloud are 4 and 5 packets per time slot, respectively. The power consumption is assumed to be linearly proportional to the number of processed/transmitted packets. In particular, processing one packet locally consumes 1 mW of power, while transmitting one packet over the wireless link consumes 0.5 mW. We ignore the processing latency in the cloud due to its powerful processing capacity.

TABLE II lists the total power consumption and average packet latency under all four possible settings. Figures 1(b)-1(d) show the case when the EFN sticks to offloading and the CFN sticks to local processing. In time slot $(t_0+1)$, the EFN offloads four packets to the CFN at its full transmission capacity, while the CFN processes all of its eight packets locally. In time slot $(t_0+2)$, the EFN offloads the remaining four packets to the CFN; meanwhile, the CFN locally processes the four packets that arrived in the previous time slot. In time slot $(t_0+3)$, the CFN finishes processing the remaining four packets. In this case, the system consumes 16 mW of power for local processing and 4 mW for transmission, with an average packet latency of 1.75 time slots.

Figure 1: Motivating example of dynamic offloading and resource allocation in multi-tiered fog computing systems.
TABLE II: Performance under different offloading policies
Policy of EFN | Policy of CFN | Total Power Consumption (mW) | Average Packet Latency (time slots)
Local   | Local   | 16 | 2.75
Local   | Offload | 8  | 2.9375
Offload | Local   | 20 | 1.75
Offload | Offload | 4  | 2.125

From TABLE II, we conclude the following. First, when the EFN sticks to offloading and the CFN sticks to local processing, the system achieves the lowest average packet latency of 1.75 slots but the maximum power consumption of 20 mW. Second, with the same offloading policy on the EFN, there is a tradeoff between the total power consumption and the average packet latency when the CFN switches policies: offloading to the cloud reduces power consumption but prolongs latency. Third, when the CFN sticks to local processing, there is a power-latency tradeoff between the two policies at the EFN, in that offloading to the CFN induces lower processing latency, but at the cost of even higher power consumption for wireless transmissions.
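As a quick sanity check, the following Python snippet reproduces the "Offload / Local" row of TABLE II from the per-packet figures above (the variable names are ours, for illustration only):

# Verify the "EFN offloads, CFN processes locally" row of TABLE II.
# 8 packets initially queued at each node; EFN->CFN capacity is 4 pkt/slot;
# the CFN processes 8 pkt/slot; 1 mW per processed packet, 0.5 mW per sent packet.

proc_power_per_pkt = 1.0   # mW
tx_power_per_pkt = 0.5     # mW

# Per-packet latencies in slots after t0:
# - the CFN's own 8 packets are processed in slot t0+1 -> latency 1 each;
# - the EFN's first 4 packets are offloaded in t0+1 and processed in t0+2 -> 2;
# - the EFN's last 4 packets are offloaded in t0+2 and processed in t0+3 -> 3.
latencies = [1] * 8 + [2] * 4 + [3] * 4

total_power = 16 * proc_power_per_pkt + 8 * tx_power_per_pkt  # 16 + 4 mW
avg_latency = sum(latencies) / len(latencies)                 # 28 / 16 slots
print(total_power, avg_latency)  # 20.0 1.75, matching TABLE II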

IV Model and Problem Formulation

We consider a multi-tiered fog computing system, as shown in Figure 2. The system evolves over time slots indexed by $t\in\{0,1,2,\dots\}$, each of length $\tau_0$. Inside the edge fog tier (EFT) is a set of edge fog nodes (EFNs) that offer low-latency access to IoT devices. The central fog tier (CFT), on the other hand, comprises central fog nodes (CFNs) with greater processing capacities than EFNs. We assume that the workload on each EFN can be offloaded to and processed by any of its accessible CFNs, and that each CFN can offload its workload to the cloud. In our model, we do not consider the power consumption and latencies within the cloud; we focus on the power consumption and latencies within the fog tiers, as shown in TABLE III. First, the power consumption we consider includes two parts: processing power and transmit power. The processing power consumption is induced by workload processing on both the EFT and the CFT. The transmit power is induced by the transmissions from the EFT to the CFT. We do not consider the transmit power consumption from the CFT to the cloud, because we assume that the CFT communicates with the cloud through wireline connections. Second, the latencies we consider include three parts: queueing latency, processing latency, and transmit latency. We focus on the queueing latency on both the EFT and the CFT. We assume that the workload processing in each time slot is completed by the end of the same time slot, so the processing latency can be ignored. Since the EFT communicates with the CFT through high-speed wireless connections and the CFT communicates with the cloud through high-speed wireline connections, we assume that the transmission latencies from EFT to CFT and from CFT to cloud are both negligible.

TABLE III: Performance Metrics in Our Model
          | Power Consumption       | Latency
          | Processing | Transmit   | Queueing | Processing | Transmit
EFT       | ✓          |            | ✓        |            |
EFT2CFT   |            | ✓          |          |            |
CFT       | ✓          |            | ✓        |            |
CFT2Cloud |            |            |          |            |

In the following, we first introduce the basic settings in Section IV-A, then elaborate on the queueing models in Sections IV-B through IV-D. Next, we define the optimization objective in Section IV-E and present the problem formulation in Section IV-F. We summarize the key notations in TABLE IV.

Figure 2: An example of fog computing systems with two fog tiers.
TABLE IV: Key notations
Notation | Description
$\tau_0$ | Length of each time slot
$\mathcal{N}$ | Set of EFNs, with $|\mathcal{N}|\triangleq N$
$\mathcal{M}$ | Set of CFNs, with $|\mathcal{M}|\triangleq M$
$\mathcal{N}_j$ | Set of accessible EFNs from CFN $j$
$\mathcal{M}_i$ | Set of accessible CFNs from EFN $i$
$A_i(t)$ | Amount of workload arriving at EFN $i$ in time slot $t$
$\lambda_i$ | Average workload arrival rate at EFN $i$, $\lambda_i\triangleq\mathbb{E}\{A_i(t)\}$
$W_i$ | Prediction window size of EFN $i$
$A_{i,-1}(t)$ | Arrival queue backlog of EFN $i$ in time slot $t$
$A_{i,w}(t)$ | Prediction queue backlog of EFN $i$ in time slot $t$, with $0\leq w\leq W_i-1$
$Q_i^{(e,a)}(t)$ | Integrate queue backlog of EFN $i$ in time slot $t$
$Q_i^{(e,l)}(t)$ | Local processing queue backlog of EFN $i$ in time slot $t$
$Q_i^{(e,o)}(t)$ | Offloading queue backlog of EFN $i$ in time slot $t$
$b_i^{(e,l)}(t)$ | Amount of workload to be sent to $Q_i^{(e,l)}(t)$ in time slot $t$
$b_i^{(e,o)}(t)$ | Amount of workload to be sent to $Q_i^{(e,o)}(t)$ in time slot $t$
$f_i^{(e)}(t)$ | CPU frequency of EFN $i$ in time slot $t$
$H_{i,j}(t)$ | Wireless channel gain between EFN $i$ and CFN $j$
$p_{i,j}(t)$ | Transmit power from EFN $i$ to CFN $j$ in time slot $t$
$R_{i,j}(t)$ | Transmit rate from EFN $i$ to CFN $j$ in time slot $t$
$Q_j^{(c,a)}(t)$ | Arrival queue backlog of CFN $j$ in time slot $t$
$Q_j^{(c,l)}(t)$ | Local processing queue backlog of CFN $j$ in time slot $t$
$Q_j^{(c,o)}(t)$ | Offloading queue backlog of CFN $j$ in time slot $t$
$b_j^{(c,l)}(t)$ | Amount of workload to be sent to $Q_j^{(c,l)}(t)$ in time slot $t$
$b_j^{(c,o)}(t)$ | Amount of workload to be sent to $Q_j^{(c,o)}(t)$ in time slot $t$
$f_j^{(c)}(t)$ | CPU frequency of CFN $j$ in time slot $t$
$P(t)$ | Total power consumption in time slot $t$

IV-A Basic Settings

The fog computing system consists of $N$ EFNs in the EFT and $M$ CFNs in the CFT. Let $\mathcal{N}$ and $\mathcal{M}$ be the sets of EFNs and CFNs, respectively. Each EFN $i$ has access to a subset of CFNs in its proximity, denoted by $\mathcal{M}_i\subset\mathcal{M}$. For each CFN $j$, $\mathcal{N}_j\subset\mathcal{N}$ denotes the set of its accessible EFNs. Accordingly, for any $i\in\mathcal{N}_j$ we have $j\in\mathcal{M}_i$.

IV-B Queueing Model for Edge Fog Node

During time slot $t$, an amount $A_i(t)$ ($\leq A_{\text{max}}$ for some constant $A_{\text{max}}$) of workload generated by IoT devices arrives to be processed on EFN $i$, with $\mathbb{E}\{A_i(t)\}=\lambda_i$. We assume that such arrivals are independent across time slots and EFNs. Each EFN $i$ is equipped with a learning module¹ that can predict the future workload within a prediction window of size $W_i$, i.e., the workload that will arrive in the next $W_i$ time slots. The predicted arrivals are pre-generated and recorded, then fed to EFN $i$ for pre-service. Once the predicted arrivals actually arrive after being pre-served, they are considered finished.

¹ We do not specify any particular learning method in this paper, since our work aims to explore the fundamental benefits of predictive offloading. In practice, one can leverage machine learning techniques such as time-series prediction methods [15] for workload arrival prediction.

On each EFN, as Figure 3 shows, there are four types of queues: prediction queues with backlogs $A_{i,0}(t),\dots,A_{i,W_i-1}(t)$, the arrival queue $A_{i,-1}(t)$, the local processing queue $Q_i^{(e,l)}(t)$, and the offloading queue $Q_i^{(e,o)}(t)$. In time slot $t$, prediction queue $A_{i,w}(t)$ ($0\leq w\leq W_i-1$) stores untreated workload that will arrive in time slot $(t+w)$. Workload that actually arrives at EFN $i$ is stored in the arrival queue $A_{i,-1}(t)$, awaiting forwarding to the local processing queue $Q_i^{(e,l)}(t)$ or the offloading queue $Q_i^{(e,o)}(t)$. Workload in $Q_i^{(e,l)}(t)$ will be processed locally by EFN $i$, while workload in $Q_i^{(e,o)}(t)$ will be offloaded to CFNs in the set $\mathcal{M}_i$.

Figure 3: Queueing model of the system.

IV-B1 Prediction Queues and Arrival Queues in EFNs

Within each time slot $t$, in addition to the current arrivals in the arrival queue, EFN $i$ can also forward future arrivals in the prediction queues. We define $\mu_{i,w}(t)$ as the amount of output workload from $A_{i,w}(t)$, for $w\in\{-1,0,\dots,W_i-1\}$. Such workload is distributed to the local processing queue and the offloading queue. We denote the amounts of workload distributed to the local processing queue and the offloading queue by $b_i^{(e,l)}(t)$ and $b_i^{(e,o)}(t)$, respectively, such that

0\leq b^{(e,\beta)}_{i}(t)\leq b^{(e,\beta)}_{i,\text{max}},\ \forall\beta\in\{l,o\},  (1)

where each $b^{(e,\beta)}_{i,\text{max}}$ is a positive constant. As a result, we have

\sum_{w=-1}^{W_{i}-1}\mu_{i,w}(t)=b^{(e,l)}_{i}(t)+b^{(e,o)}_{i}(t).  (2)

Next, we consider the queueing dynamics for different types of queues in EFN, respectively.

Regarding $A_{i,w}(t)$, it is updated whenever pre-service is finished, and the lookahead window moves one slot ahead at the end of each time slot. Therefore, we have

  (i) If $w=W_i-1$, then

    A_{i,W_{i}-1}(t+1)=A_{i}(t+W_{i}).  (3)

  (ii) If $0\leq w\leq W_i-2$, then

    A_{i,w}(t+1)=[A_{i,w+1}(t)-\mu_{i,w+1}(t)]^{+},  (4)

where $[x]^{+}\triangleq\max\{x,0\}$ for $x\in\mathbb{R}$. In time slot $(t+1)$, the amount of workload that will arrive after $(W_i-1)$ time slots is $A_i(t+W_i)$, which remains unknown until time slot $(t+1)$.

Regarding the arrival queue $A_{i,-1}(t)$, it records the actual backlog of EFN $i$, with the following update equation:

A_{i,-1}(t+1)=[A_{i,-1}(t)-\mu_{i,-1}(t)]^{+}+[A_{i,0}(t)-\mu_{i,0}(t)]^{+}.  (5)

Note that $\mu_{i,-1}(t)$ denotes the amount of distributed workload that is already in $A_{i,-1}(t)$.

Next, we introduce an integrate queue whose backlog is the sum of all prediction queue backlogs and the arrival queue backlog on EFN $i$, denoted by $Q_i^{(e,a)}(t)\triangleq\sum_{w=-1}^{W_i-1}A_{i,w}(t)$. Under a fully-efficient service policy [16], $Q_i^{(e,a)}(t)$ is updated as

Q_{i}^{(e,a)}(t+1)=[Q_{i}^{(e,a)}(t)-(b^{(e,l)}_{i}(t)+b^{(e,o)}_{i}(t))]^{+}+A_{i}(t+W_{i}).  (6)

The input of the integrate queue $Q_i^{(e,a)}(t)$ is the predicted workload that will arrive at EFN $i$ in time slot $(t+W_i)$, while its output consists of the workload forwarded to the local processing queue and the offloading queue. Note that $b_i^{(e,l)}(t)+b_i^{(e,o)}(t)$ is the output capacity of the integrate queue $Q_i^{(e,a)}(t)$ in time slot $t$. If this capacity is larger than the queue backlog, the true output amount will be smaller than $b_i^{(e,l)}(t)+b_i^{(e,o)}(t)$.
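To make the dynamics (3)-(6) concrete, the following Python sketch steps the prediction and arrival queues of a single EFN through one slot; the class and method names are ours for illustration, and we assume $W_i\geq 1$:

class EfnQueues:
    """Per-EFN queues: A[0] is the arrival queue A_{i,-1}(t);
    A[w+1] is the prediction queue A_{i,w}(t), for 0 <= w <= W-1."""

    def __init__(self, W):
        self.W = W                 # prediction window size W_i (assumed >= 1)
        self.A = [0.0] * (W + 1)

    def step(self, mu, future_arrival):
        """mu[w+1] is the service mu_{i,w}(t) drawn from A[w+1];
        future_arrival is A_i(t + W), revealed at the end of the slot."""
        # (5): the arrival queue keeps its own residual plus the residual
        # sliding in from the head prediction queue A_{i,0}.
        new_arrival = max(self.A[0] - mu[0], 0.0) + max(self.A[1] - mu[1], 0.0)
        # (4): every remaining prediction queue shifts one slot ahead.
        shifted = [max(self.A[w] - mu[w], 0.0) for w in range(2, self.W + 1)]
        # (3): the farthest prediction queue is refilled with A_i(t + W).
        self.A = [new_arrival] + shifted + [future_arrival]

    def integrate_backlog(self):
        # Q_i^{(e,a)}(t) in (6): total of arrival and prediction backlogs.
        return sum(self.A)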

IV-B2 Offloading Queues in EFNs

In time slot $t$, workload in queue $Q^{(e,o)}_i(t)$ will be offloaded to CFNs in the set $\mathcal{M}_i$. The transmission capacities are determined by the transmit power decisions $(p_{i,j}(t))_{j\in\mathcal{M}_i}$, where $p_{i,j}(t)$ is the transmit power from EFN $i$ to CFN $j$. The transmit power is nonnegative and the total transmit power of each EFN is upper bounded, i.e.,

p_{i,j}(t)\geq 0,\ \forall i\in\mathcal{N},j\in\mathcal{M}_{i}\text{ and }t,  (7)

\sum_{j\in\mathcal{M}_{i}}p_{i,j}(t)\leq p_{i,\text{max}},\ \forall i\in\mathcal{N}\text{ and }t.  (8)

According to Shannon's capacity formula [17], the transmission capacity from EFN $i$ to CFN $j$ is

R_{i,j}(t)\triangleq\hat{R}_{i,j}(p_{i,j}(t))=\tau_{0}B\log_{2}\left(1+\frac{p_{i,j}(t)H_{i,j}(t)}{N_{0}B}\right),  (9)

where $\tau_0$ is the length of each time slot, $B$ is the channel bandwidth, $H_{i,j}(t)$ is the wireless channel gain between EFN $i$ and CFN $j$, and $N_0$ is the power spectral density of the additive white Gaussian noise. Note that $H_{i,j}(t)$ is an uncontrollable environment state with positive upper bound $H_{\text{max}}$. We do not consider interference among fog nodes and tiers. By adjusting the transmit power $p_{i,j}(t)$, we can offload different amounts of workload from EFN $i$ to CFN $j$ in time slot $t$. Accordingly, the update equation of the offloading queue $Q^{(e,o)}_i(t)$ is

Q^{(e,o)}_{i}(t+1)\leq[Q^{(e,o)}_{i}(t)-\sum_{j\in\mathcal{M}_{i}}R_{i,j}(t)]^{+}+b^{(e,o)}_{i}(t),  (10)

where $\sum_{j\in\mathcal{M}_i}R_{i,j}(t)$ is the total transmission capacity of EFN $i$ in time slot $t$. The inequality here reflects that the actual arrival to $Q_i^{(e,o)}(t)$ may be less than $b_i^{(e,o)}(t)$, because $b^{(e,o)}_i(t)$ is the transmission capacity from the integrate queue $Q_i^{(e,a)}(t)$ to the offloading queue $Q_i^{(e,o)}(t)$ rather than the amount of actually transmitted workload. Recall that we assume the transmission latency from EFT to CFT is negligible compared to the length of each time slot, so the workload transmission in each time slot is accomplished by the end of that time slot.
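As an illustration, the following Python sketch evaluates the rate in (9) and the offloading-queue update in (10); for concreteness it defaults to the bandwidth and noise density used later in TABLE V, and the function names are ours:

import math

def transmit_rate(p, H, tau0=1.0, B=2e6, N0=10 ** (-174 / 10) * 1e-3):
    """R_{i,j}(t) in (9), in bits per slot, for transmit power p (W),
    channel gain H, bandwidth B (Hz), and noise density N0 (W/Hz)."""
    return tau0 * B * math.log2(1.0 + p * H / (N0 * B))

def update_offload_queue(Q_eo, rates, b_eo):
    """(10): drain by the total capacity to all accessible CFNs, then
    add the new input granted from the integrate queue."""
    return max(Q_eo - sum(rates), 0.0) + b_eo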

IV-C Queueing Model for Central Fog Node

Figure 3 also shows the queueing model on CFNs. Each CFN $j\in\mathcal{M}$ maintains three queues: an arrival queue $Q_j^{(c,a)}(t)$, a local processing queue $Q_j^{(c,l)}(t)$, and an offloading queue $Q_j^{(c,o)}(t)$. Similar to EFNs, workload offloaded from the EFT is first stored in the arrival queue, then distributed to $Q_j^{(c,l)}(t)$ for local processing and to $Q_j^{(c,o)}(t)$ for further offloading.

IV-C1 Arrival Queues in CFNs

The arrivals at CFN $j$ consist of workload offloaded from EFNs in the set $\mathcal{N}_j$. We denote the amounts of workload distributed to the local processing queue and the offloading queue in time slot $t$ by $b_j^{(c,l)}(t)$ and $b_j^{(c,o)}(t)$, respectively, such that

0\leq b^{(c,\beta)}_{j}(t)\leq b^{(c,\beta)}_{j,\text{max}},\ \forall\beta\in\{l,o\},  (11)

where each $b^{(c,\beta)}_{j,\text{max}}$ is a positive constant. Accordingly, $Q_j^{(c,a)}(t)$ is updated as follows:

Q_{j}^{(c,a)}(t+1)\leq[Q_{j}^{(c,a)}(t)-(b_{j}^{(c,l)}(t)+b_{j}^{(c,o)}(t))]^{+}+\sum_{i\in\mathcal{N}_{j}}R_{i,j}(t).  (12)

IV-C2 Offloading Queues in CFNs

For each CFN $j\in\mathcal{M}$, its offloading queue $Q^{(c,o)}_j(t)$ stores the workload to be offloaded to the cloud. We define $D_j(t)$ as the transmission capacity of the wired link from CFN $j$ to the cloud during time slot $t$, which depends on the network state and is upper bounded by some constant $D_{\text{max}}$ for all $j$ and $t$. Then we have the following update equation for $Q^{(c,o)}_j(t)$:

Q^{(c,o)}_{j}(t+1)\leq[Q^{(c,o)}_{j}(t)-D_{j}(t)]^{+}+b^{(c,o)}_{j}(t).  (13)

Note that the amount of workload actually offloaded to the cloud is $\min\{Q_j^{(c,o)}(t),D_j(t)\}$.

IV-D Local Processing Queues on EFNs and CFNs

We assume that all fog nodes are able to adjust their CPU frequencies in each time slot by applying dynamic voltage and frequency scaling (DVFS) techniques [18]. Next, we define $L_k^{(\alpha)}$ as the number of CPU cycles that fog node $k\in\mathcal{N}\cup\mathcal{M}$ requires to process one bit of workload, where $\alpha$ indicates fog node $k$'s type ($\alpha=e$ if $k$ is an EFN, and $\alpha=c$ if $k$ is a CFN). $L_k^{(\alpha)}$ is assumed constant and can be measured offline [19]. Therefore, the local processing capacity of fog node $k$ is $f_k^{(\alpha)}(t)/L_k^{(\alpha)}$. The local processing queue on fog node $k$ evolves as follows:

Q_{k}^{(\alpha,l)}(t+1)\leq[Q_{k}^{(\alpha,l)}(t)-\tau_{0}f_{k}^{(\alpha)}(t)/L_{k}^{(\alpha)}]^{+}+b_{k}^{(\alpha,l)}(t).  (14)

All CPU frequencies are nonnegative and finite:

0\leq f_{k}^{(\alpha)}(t)\leq f^{(\alpha)}_{k,\text{max}},\ \forall k\in\mathcal{N}\cup\mathcal{M}\text{ and }t,  (15)

where each $f^{(\alpha)}_{k,\text{max}}$ is a positive constant.

IV-E Power Consumptions

The total power consumption $P(t)$ of the fog tiers in time slot $t$ consists of the processing power consumption and the wireless transmit power consumption. Given a local CPU with frequency $f$, its power consumption per time slot is $\tau_0\varsigma f^3$, where $\varsigma$ is a parameter depending on the deployed hardware and is measurable in practice [20]. Thus $P(t)$ is defined as follows:

P(t)\triangleq\hat{P}(\boldsymbol{f}(t),\boldsymbol{p}(t))=\sum_{i\in\mathcal{N}}\tau_{0}\varsigma(f_{i}^{(e)}(t))^{3}+\sum_{j\in\mathcal{M}}\tau_{0}\varsigma(f_{j}^{(c)}(t))^{3}+\sum_{i\in\mathcal{N}}\sum_{j\in\mathcal{M}_{i}}\tau_{0}p_{i,j}(t),  (16)

where $\boldsymbol{f}(t)\triangleq((f_i^{(e)}(t))_{i\in\mathcal{N}},(f_j^{(c)}(t))_{j\in\mathcal{M}})$ is the vector of all CPU frequencies, and $\boldsymbol{p}(t)\triangleq(\boldsymbol{p}_i(t))_{i\in\mathcal{N}}$, in which $\boldsymbol{p}_i(t)=(p_{i,j}(t))_{j\in\mathcal{M}_i}$ is the transmit power allocation of EFN $i$.
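The following Python sketch evaluates (16) directly; $\varsigma$ defaults to the value later used in TABLE V, and the function name is ours:

def total_power(f_efn, f_cfn, p_tx, tau0=1.0, varsigma=1e-27):
    """P(t) in (16). f_efn/f_cfn: iterables of CPU frequencies (cycles/s)
    of EFNs/CFNs; p_tx: nested lists with p_tx[i][j] = transmit power (W)
    of EFN i toward its j-th accessible CFN."""
    proc = sum(tau0 * varsigma * f ** 3 for f in list(f_efn) + list(f_cfn))
    trans = sum(tau0 * p for row in p_tx for p in row)
    return proc + trans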

IV-F Problem Formulation

We define the long-term time-average expectation of the total power consumption, $\bar{P}$, and of the total queue backlog, $\bar{Q}$, as follows:

\bar{P}\triangleq\limsup_{T\rightarrow\infty}\frac{1}{T}\sum_{t=0}^{T-1}\mathbb{E}\{P(t)\},  (17)

\bar{Q}\triangleq\limsup_{T\rightarrow\infty}\frac{1}{T}\sum_{t=0}^{T-1}\sum_{\beta\in\{a,l,o\}}\Big(\sum_{i\in\mathcal{N}}\mathbb{E}\{Q_{i}^{(e,\beta)}(t)\}+\sum_{j\in\mathcal{M}}\mathbb{E}\{Q_{j}^{(c,\beta)}(t)\}\Big).  (18)

In this paper, we aim to minimize the long-term time-average expectation of the total power consumption $\bar{P}$, while ensuring the stability of all queues in the system, i.e., $\bar{Q}<\infty$. The problem formulation is given by

\begin{array}{cl}\underset{\{\boldsymbol{b}(t),\boldsymbol{f}(t),\boldsymbol{p}(t)\}_{t}}{\text{Minimize}}&\bar{P}\\ \text{Subject to}&(1),(7),(8),(11),(15),\\ &\bar{Q}<\infty.\end{array}  (19)

V Algorithm Design

V-A Predictive Algorithm

To solve problem (19), we adopt Lyapunov optimization techniques [16][21] to decouple the problem into a series of subproblems over time slots; we show the details of this process in Appendix A. By solving each of these subproblems during each time slot, we obtain PORA, an efficient predictive scheme that conducts workload offloading in an online and distributed manner. We show the pseudocode of PORA in Algorithm 1. Note that the symbol $\alpha\in\{e,c\}$ indicates the type of fog node: for each fog node $k$, $\alpha=e$ if $k$ is an EFN and $\alpha=c$ if it is a CFN.

Algorithm 1 Predictive Offloading and Resource Allocation (PORA) in One Time Slot
1:  Initialize $\boldsymbol{b}(t)\leftarrow\boldsymbol{0}$, $\boldsymbol{f}(t)\leftarrow\boldsymbol{0}$, $\boldsymbol{p}(t)\leftarrow\boldsymbol{0}$.
2:  for each fog node $k\in\mathcal{N}\cup\mathcal{M}$ do
3:     %% Make offloading decisions
4:     if $Q_k^{(\alpha,a)}(t)>Q_k^{(\alpha,l)}(t)$ then
5:        Set $b_k^{(\alpha,l)}(t)\leftarrow b^{(\alpha,l)}_{k,\text{max}}$.
6:     end if
7:     if $Q_k^{(\alpha,a)}(t)>Q_k^{(\alpha,o)}(t)$ then
8:        Set $b_k^{(\alpha,o)}(t)\leftarrow b^{(\alpha,o)}_{k,\text{max}}$.
9:     end if
10:    %% Local CPU resource allocation
11:    Set $f^{(\alpha)}_{k}(t)\leftarrow\min\{\sqrt{Q_{k}^{(\alpha,l)}(t)/3V\varsigma L_{k}^{(\alpha)}},f^{(\alpha)}_{k,\text{max}}\}$.
12: end for
13: %% Transmit power allocation
14: for each EFN $i\in\mathcal{N}$ do
15:    Set $\lambda_{\text{min}}\leftarrow 0$.
16:    Set $\lambda_{\text{max}}\leftarrow\max_{j\in\mathcal{M}_{i}}\frac{(Q_{i}^{(e,o)}(t)-Q_{j}^{(c,a)}(t))H_{i,j}(t)}{N_{0}}-V$.
17:    while $\lambda_{\text{max}}-\lambda_{\text{min}}>\varepsilon$ do
18:       %% Water filling with bisection method
19:       Set $\lambda^{*}\leftarrow(\lambda_{\text{min}}+\lambda_{\text{max}})/2$.
20:       Set $p_{i,j}(t)\leftarrow B\left[\frac{Q_{i}^{(e,o)}(t)-Q_{j}^{(c,a)}(t)}{V+\lambda^{*}}-\frac{N_{0}}{H_{i,j}(t)}\right]^{+}$ for each $j\in\mathcal{M}_{i}$.
21:       if $\sum_{j\in\mathcal{M}_{i}}p_{i,j}(t)>p_{i,\text{max}}$ then
22:          Set $\lambda_{\text{min}}\leftarrow\lambda^{*}$.
23:       else
24:          Set $\lambda_{\text{max}}\leftarrow\lambda^{*}$.
25:       end if
26:    end while
27: end for
28: Enforce scheduling decisions $\boldsymbol{b}(t)$, $\boldsymbol{f}(t)$, and $\boldsymbol{p}(t)$.

Next, we introduce PORA in detail.

V-A1 Offloading Decision

In each time slot $t$, under PORA, each fog node $k\in\mathcal{N}\cup\mathcal{M}$ decides the amounts of workload scheduled to the local processing queue and the offloading queue, denoted by $b_k^{(\alpha,l)}(t)$ and $b_k^{(\alpha,o)}(t)$, respectively. Such decisions are obtained by solving the following problem:

\underset{0\leq b_{k}^{(\alpha,\beta)}\leq b^{(\alpha,\beta)}_{k,\text{max}}}{\text{Minimize}}\ \left(Q^{(\alpha,\beta)}_{k}(t)-Q_{k}^{(\alpha,a)}(t)\right)b_{k}^{(\alpha,\beta)},  (20)

where $\beta\in\{l,o\}$. Accordingly, the optimal solution to (20) is

b_{k}^{(\alpha,\beta)}(t)=\begin{cases}b^{(\alpha,\beta)}_{k,\text{max}},&\text{if }Q_{k}^{(\alpha,\beta)}(t)<Q_{k}^{(\alpha,a)}(t),\\ 0,&\text{otherwise}.\end{cases}  (21)

From (21), we see that, to determine the optimal decisions $b_i^{(e,l)}(t)$ and $b_i^{(e,o)}(t)$, each EFN $i$ compares its integrate queue backlog $Q_i^{(e,a)}(t)$ with its local processing queue backlog $Q_i^{(e,l)}(t)$ and its offloading queue backlog $Q_i^{(e,o)}(t)$, respectively. In particular, if there is too much workload in its integrate queue compared to its local queue ($Q_i^{(e,l)}(t)<Q_i^{(e,a)}(t)$), then it forwards as much workload as possible (up to $b^{(e,l)}_{i,\text{max}}$) to its local queue. Likewise, if its integrate queue is loaded with more workload than its offloading queue ($Q_i^{(e,o)}(t)<Q_i^{(e,a)}(t)$), it forwards up to an amount $b^{(e,o)}_{i,\text{max}}$ of workload to its offloading queue.

Notably, if the backlog of EFN $i$'s integrate queue is larger than both its local queue and its offloading queue, then the EFN transmits the workload unit by unit (e.g., packet by packet); each unit of workload is sent either to EFN $i$'s local queue or to its offloading queue, such that the amounts of workload distributed to the two queues are no greater than $b_{i,\text{max}}^{(e,l)}$ and $b_{i,\text{max}}^{(e,o)}$, respectively. In practice, the workload distributing strategy is left as a degree of freedom to be specified in the implementation of PORA. In our simulation, we adopt the following strategy, sketched in code after this paragraph. When an EFN $i$'s integrate queue backlog is greater than both its local queue backlog and its offloading queue backlog, it first transmits workload to its local queue until the amount of transmitted workload reaches $b_{i,\text{max}}^{(e,l)}$; the remaining workload in the integrate queue is then transmitted to the offloading queue until the amount of distributed workload reaches $b_{i,\text{max}}^{(e,o)}$. The process terminates whenever the integrate queue becomes empty.
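The following Python sketch combines the threshold rule (21) with this local-first distributing strategy; all names are ours for illustration:

def offload_decision(Q_a, Q_l, Q_o, b_l_max, b_o_max):
    """(21): grant full output capacity to a queue whenever the integrate
    queue backlog Q_a exceeds that queue's backlog."""
    b_l = b_l_max if Q_a > Q_l else 0.0
    b_o = b_o_max if Q_a > Q_o else 0.0
    return b_l, b_o

def distribute(Q_a, b_l, b_o):
    """Local-first strategy: fill the local queue up to b_l, then the
    offloading queue up to b_o, stopping when the integrate queue empties."""
    to_local = min(Q_a, b_l)
    to_offload = min(Q_a - to_local, b_o)
    return to_local, to_offload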

The decision-making process is similar for CFNs. Specifically, each CFN $j$ determines $b_j^{(c,l)}(t)$ and $b_j^{(c,o)}(t)$ by comparing its arrival queue backlog $Q_j^{(c,a)}(t)$ with its local processing queue backlog $Q_j^{(c,l)}(t)$ and its offloading queue backlog $Q_j^{(c,o)}(t)$, respectively.

Remark: For each EFN, we can view the difference between the backlogs of its integrate queue and its local processing/offloading queue as its willingness to transmit workload. If such willingness is positive, then the EFN transmits as much workload as possible from its integrate queue; otherwise, the EFN leaves the workload undistributed in the current time slot. In this way, PORA always endeavors to balance the integrate queue backlog against the local/offloading queue backlogs. Likewise, under PORA, each CFN determines its offloading decisions based on the difference between the backlogs of its arrival queue and its local processing/offloading queue, so as to ensure queue stability.

V-A2 Local CPU Frequency Allocation

Under PORA, in each time slot $t$, each fog node $k\in\mathcal{N}\cup\mathcal{M}$ sets its local CPU frequency $f_k^{(\alpha)}(t)$ by solving the following subproblem:

\underset{0\leq f_{k}^{(\alpha)}\leq f^{(\alpha)}_{k,\text{max}}}{\text{Minimize}}\ V\varsigma(f_{k}^{(\alpha)})^{3}-Q_{k}^{(\alpha,l)}(t)f_{k}^{(\alpha)}/L_{k}^{(\alpha)}.  (22)

By setting the first derivative of the objective function in (22) to zero, we obtain the optimal CPU frequency $f_k^{(\alpha)}(t)$ to be set by fog node $k$ as

f_{k}^{(\alpha)}(t)=\min\left\{\sqrt{Q_{k}^{(\alpha,l)}(t)/3V\varsigma L_{k}^{(\alpha)}},f_{k,\text{max}}^{(\alpha)}\right\}.  (23)

We prove the optimality of (23) in Appendix B.
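A direct Python transcription of (23) is given below; the function name is ours, and $\varsigma$ defaults to the value later used in TABLE V:

import math

def cpu_frequency(Q_l, L_k, f_max, V, varsigma=1e-27):
    """(23): the unconstrained minimizer of (22), sqrt(Q_l / (3 V varsigma L_k)),
    clipped to the frequency cap f_max."""
    f_star = math.sqrt(Q_l / (3.0 * V * varsigma * L_k))
    return min(f_star, f_max)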

Remark: When $f^{(\alpha)}_k(t)<f_{k,\text{max}}^{(\alpha)}$, the allocated CPU frequency $f_k^{(\alpha)}(t)$ is proportional to the square root of the local processing queue backlog $Q_k^{(\alpha,l)}(t)$ and inversely proportional to the square root of the parameter $V$. This shows that, on the one hand, PORA allocates more CPU frequency as the local queue grows, so as to process the backlogged workload. On the other hand, the value of $V$ determines the tradeoff between power consumption and queue backlogs: a small value of $V$ encourages the fog node to allocate more CPU frequency to process the workload, and hence yields a small queue backlog; in contrast, a large value of $V$ makes the fog node more conservative in allocating resources, leading to lower power consumption but a larger queue backlog. In practice, the choice of the value of $V$ depends on the system design objective.

V-A3 Power Allocations for EFNs

In each time slot $t$, under PORA, each EFN $i\in\mathcal{N}$ determines its transmit power allocation $\boldsymbol{p}_i(t)$ by solving the following optimization problem:

\begin{split}\underset{\boldsymbol{p}_{i}}{\text{Minimize}}\ &\sum_{j\in\mathcal{M}_{i}}\Big[Vp_{i,j}-m_{i,j}(t)\log_{2}(1+l_{i,j}(t)p_{i,j})\Big]\\ \text{Subject to}\ &\sum_{j\in\mathcal{M}_{i}}p_{i,j}\leq p_{i,\text{max}},\\ &p_{i,j}\geq 0,\ \forall j\in\mathcal{M}_{i},\end{split}  (24)

where $m_{i,j}(t)\triangleq(Q_{i}^{(e,o)}(t)-Q_{j}^{(c,a)}(t))B$ and $l_{i,j}(t)\triangleq\frac{H_{i,j}(t)}{N_{0}B}$. By applying the water-filling algorithm [22], we obtain the optimal solution to problem (24) as

p_{i,j}(t)=[m_{i,j}(t)/(V+\lambda^{*})-1/l_{i,j}(t)]^{+},\ \forall j\in\mathcal{M}_{i},  (25)

where $\lambda^{*}$ is the optimal Lagrange multiplier that satisfies

\sum_{j\in\mathcal{M}_{i}}[m_{i,j}(t)/(V+\lambda^{*})-1/l_{i,j}(t)]^{+}=p_{i,\text{max}}.  (26)

The optimality of these solutions is proven in Appendix C. We adopt the bisection method (lines 15-26 in Algorithm 1) to compute $\lambda^{*}$, with its lower and upper bounds as $\lambda_{\text{min}}$ and $\lambda_{\text{max}}$, respectively. Note that the value of $\lambda^{*}$ converges to the optimum $\lambda^{\text{opt}}$ as the tolerance parameter $\varepsilon$ approaches zero, such that $|\lambda^{*}-\lambda^{\text{opt}}|\leq\varepsilon/2$.
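A runnable Python sketch of this water-filling step, mirroring lines 15-26 of Algorithm 1 (the function name is ours), is as follows:

def water_filling(m, l, V, p_max, eps=1e-6):
    """Solve (25)-(26) by bisection on lambda. m[j] = (Q_i^{(e,o)} -
    Q_j^{(c,a)}) * B and l[j] = H_{i,j} / (N0 * B), as in (24)."""
    def powers(lam):
        return [max(mj / (V + lam) - 1.0 / lj, 0.0) for mj, lj in zip(m, l)]

    # If the power budget is slack at lambda = 0, that allocation is optimal.
    if sum(powers(0.0)) <= p_max:
        return powers(0.0)
    # Otherwise bisect: the total power is non-increasing in lambda, and
    # all powers vanish once lambda reaches max_j m_j * l_j - V.
    lam_lo, lam_hi = 0.0, max(mj * lj for mj, lj in zip(m, l)) - V
    while lam_hi - lam_lo > eps:
        lam = 0.5 * (lam_lo + lam_hi)
        if sum(powers(lam)) > p_max:
            lam_lo = lam   # budget exceeded: raise lambda
        else:
            lam_hi = lam
    return powers(0.5 * (lam_lo + lam_hi))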

Remark: PORA tends to allocate more transmit power to the CFN with a smaller arrival queue backlog $Q_j^{(c,a)}(t)$, for load balancing. When $Q_j^{(c,a)}(t)\geq Q_i^{(e,o)}(t)$, we have $m_{i,j}(t)\leq 0$ and $p_{i,j}(t)=0$; i.e., EFN $i$ allocates no transmit power to CFN $j$ unless the backlog of the offloading queue on EFN $i$ exceeds that of the arrival queue on CFN $j$. By increasing the value of $V$, the transmit power consumption is reduced, but the backlogs increase as well.

V-B Computational Complexity of PORA

During each time slot, part of the computational cost lies in the CPU frequency settings and the offloading decisions. Since this calculation (lines 3-11) requires only constant time for each fog node, the total complexity of these steps is $O(N+M)$. Next, each EFN $i$ applies the bisection method (lines 15-26) to calculate the optimal dual variable, with a complexity of $O(\log_{2}((\lambda_{\text{max}}-\lambda_{\text{min}})/\varepsilon)+|\mathcal{M}_{i}|)$. After that, EFN $i$ determines the transmit power to each CFN in the set $\mathcal{M}_i$. In the worst case, each EFN is potentially connected to all CFNs; thus the total complexity of the PORA algorithm is $O(M\times N)$.

V-C Performance Analysis

We conduct theoretical analysis of the relationship between the average power consumption $\bar{P}$ and the queue backlog $\bar{Q}$ under the PORA scheme in the non-predictive case ($W_i=0,\ \forall i\in\mathcal{N}$), and then analyze the benefits of predictive offloading in terms of latency reduction.

V-C1 Time-average Power Consumption and Queue Backlog

Let $P^{*}$ be the achievable minimum of $\bar{P}$ over all feasible non-predictive policies. We have the following theorem.

Theorem 1

Assume the system arrival rate lies in the interior of the capacity region and $\boldsymbol{Q}(0)<\infty$. Under PORA, without prediction, there exist constants $\theta>0$ and $\epsilon>0$ such that

\bar{P}\leq\theta/V+P^{*},\qquad\bar{Q}\leq(\theta+VP_{\text{max}})/\epsilon,

where $\bar{P}$ and $\bar{Q}$ are defined in (17) and (18), respectively.

The proof is quite standard and hence omitted here.

Remark: By Little's theorem [23], the average queue backlog is proportional to the average queueing latency. Therefore, Theorem 1 implies that by adjusting the parameter $V$, PORA can achieve an $[O(1/V),O(V)]$ power-latency tradeoff in the non-predictive case. Furthermore, the average power consumption $\bar{P}$ approaches the optimum $P^{*}$ asymptotically as the value of $V$ increases to infinity.

V-C2 Latency Reduction

We analyze the latency reduction achieved by PORA under perfect prediction compared to the non-predictive case. In particular, we denote the prediction window vector $(W_i)_{i\in\mathcal{N}}$ by $\boldsymbol{W}$ and the corresponding latency reduction by $\eta(\boldsymbol{W})$. For each unit of workload on EFN $i$, let $\pi_{i,w}$ denote the steady-state probability that it experiences a latency of $w$ time slots in $A_{i,-1}(t)$. Without prediction, the average latency over all arrival queues is $d=\sum_{i\in\mathcal{N}}\lambda_{i}\sum_{w\geq 1}w\pi_{i,w}/\sum_{i\in\mathcal{N}}\lambda_{i}$. Then we have the following theorem.

Theorem 2

Suppose the system's steady-state behavior depends only on the statistics of the arrival and service processes. Then the latency reduction $\eta(\boldsymbol{W})$ is

\eta(\boldsymbol{W})=\frac{\sum_{i\in\mathcal{N}}\lambda_{i}\left(\sum_{1\leq w\leq W_{i}}w\pi_{i,w}+W_{i}\sum_{w\geq 1}\pi_{i,w+W_{i}}\right)}{\sum_{i\in\mathcal{N}}\lambda_{i}}.  (27)

Furthermore, if $d<\infty$, then as $\boldsymbol{W}\rightarrow\infty$, i.e., with infinite predictive information, we have

\lim_{\boldsymbol{W}\rightarrow\infty}\eta(\boldsymbol{W})=d.  (28)

We relegate the proof of Theorem 2 to Appendix D.

Remark: Theorem 2 implies that predictive offloading conduces to a shorter workload latency; in other words, with predicted information, PORA can break the barrier of the $[O(1/V),O(V)]$ power-latency tradeoff. Furthermore, the latency reduction grows with the prediction window sizes and, by (28), approaches the entire average arrival-queue latency $d$ as the window sizes go to infinity; equivalently, the residual arrival-queue latency approaches zero. In our simulations, we see that PORA can effectively shorten the average arrival queue latency with only a mild amount of future information.
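For intuition, the following Python snippet evaluates (27) on a toy latency distribution (the numbers are made up for illustration):

def latency_reduction(lams, pis, Ws):
    """eta(W) in (27). lams[i] = lambda_i; pis[i] maps a latency w (slots)
    to the steady-state probability pi_{i,w}; Ws[i] = W_i."""
    num = 0.0
    for lam, pi, W in zip(lams, pis, Ws):
        head = sum(w * p for w, p in pi.items() if 1 <= w <= W)
        tail = W * sum(p for w, p in pi.items() if w >= W + 1)
        num += lam * (head + tail)
    return num / sum(lams)

# One EFN with pi = {1: 0.5, 2: 0.3, 3: 0.2}, so d = 1.7 slots.
print(latency_reduction([7.0], [{1: 0.5, 2: 0.3, 3: 0.2}], [2]))  # 1.5
print(latency_reduction([7.0], [{1: 0.5, 2: 0.3, 3: 0.2}], [3]))  # 1.7 = d, as in (28)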

V-D Impact of Network Topology

Fog computing systems generally operate in wireless environments, so the network topology of such systems is usually dynamic and may change across time slots. However, the network topology is observed at the beginning of each time slot and deemed fixed until the end of that time slot. Therefore, in the following, we focus our discussion on the impact of the network topology within each time slot.

Recall that in our settings, each EFN has access to only a subset of CFNs in its vicinity. For each EFN $i$, the subset of its accessible CFNs is denoted by $\mathcal{M}_i$, with size $|\mathcal{M}_i|$. From the perspective of graph theory, we can view the interconnection among fog nodes of different tiers as a directed graph, in which each vertex corresponds to a fog node and each edge indicates a directed connection between nodes. Hence, the value of $|\mathcal{M}_i|$ can be regarded as the out-degree of EFN $i$, an important topological parameter that measures the number of directed connections originating from EFN $i$. Due to time-varying wireless dynamics, the out-degree of each fog node may vary over time slots; consequently, the resulting topology can significantly affect the system performance. In the following, we discuss such impacts under two channel conditions.

On the one hand, within each time slot, poor channel conditions (e.g., low SINR) often lead to unreliable or even unavailable connections among fog nodes, and hence a network topology with relatively small node out-degrees. In this case, each fog node has very limited freedom to choose the best target node to offload its workload, which further leads to backlog imbalance among fog nodes, or even overloading in its upper tier with a large cumulative queue backlog. Besides, poor channel conditions may also require more power consumption to ensure reliable communication between successive fog tiers.

On the other hand, within each time slot, good channel conditions allow each fog node broader access to the fog nodes in its upper tier, resulting in a network topology with relatively large node out-degrees. In this case, each fog node can make better decisions with more freedom in choosing among fog nodes in its upper tier, thereby achieving a better tradeoff between power consumption and backlog sizes.

TABLE V: Simulation Settings
Parameter | Value
$B$ | 2 MHz
$H_{i,j}(t),\forall i\in\mathcal{N},j\in\mathcal{M}$ | $24\log_{10}d_{i,j}+20\log_{10}5.8+60$ ᵃ
$N_{0}$ | $-174$ dBm/Hz
$p_{i,\text{max}},\forall i\in\mathcal{N}$ | 500 mW
$L^{(e)}_{i},\forall i\in\mathcal{N}$; $L^{(c)}_{j},\forall j\in\mathcal{M}$ | 297.62 cycles/bit
$f^{(e)}_{i,\text{max}},\forall i\in\mathcal{N}$ | 4 Gcycles/s
$f^{(c)}_{j,\text{max}},\forall j\in\mathcal{M}$ | 8 Gcycles/s
$\varsigma$ | $10^{-27}$ W·s³/cycle³
$b^{(e,l)}_{i,\text{max}},b^{(e,o)}_{i,\text{max}},\forall i\in\mathcal{N}$ | 6 Mb/s
$b^{(c,l)}_{j,\text{max}},b^{(c,o)}_{j,\text{max}},\forall j\in\mathcal{M}$ | 12 Mb/s
$D_{j}(t),\forall j\in\mathcal{M},t$ | 6 Mb/s

ᵃ $d_{i,j}$ is the distance between EFN $i$ and CFN $j$.

V-E Use Cases

In practice, PORA can be applied as a theoretical framework to design offloading schemes for fog computing systems in various use cases, such as public safety systems, intelligent transportation, and smart healthcare. For example, in a public safety system, each street is usually covered by multiple smart cameras (IoT devices). At runtime, such smart cameras upload real-time vision data to one of their accessible EFNs. Each EFN aggregates such data to extract or even analyze the instant road conditions within multiple streets. These EFNs can offload some of the workload to their upper-tier CFNs (each taking charge of one community consisting of several streets), which have greater computing capacities. Each CFN can further offload workload to the cloud via optical fiber links. For latency-sensitive applications, the real-time vision data is processed locally on EFNs or offloaded to CFNs. For latency-insensitive applications with intensive computation demands, the data is offloaded to the cloud through the fog nodes. PORA conduces to the design of dynamic, online offloading and resource allocation schemes to support such fog systems with various applications.

VI Numerical Results

We conduct extensive simulations to evaluate PORA and its variants. The parameter settings in our simulation are based on commonly adopted wireless environment settings that have been used in [24, 25]. The simulation is conducted on a MacBook Pro with a 2.3 GHz Intel Core i5 processor and 8 GB of 2133 MHz LPDDR3 memory, and the simulation program is implemented in Python 3.7. This section first presents the basic settings of our simulations, and then provides the key results under perfect and imperfect prediction, respectively.

VI-A Basic Settings

We simulate a hierarchical fog computing system with 80 EFNs and 20 CFNs. All EFNs have a uniform prediction window size $W$, which varies from 0 to 30; note that $W=0$ corresponds to the case without prediction. For each EFN $i$, its accessible CFN set $\mathcal{M}_i$ is chosen uniformly at random among the subsets of the CFN set with size $|\mathcal{M}_i|=5$. We set the time slot length $\tau_0=1$ second. During each time slot, workload arrives at the system in units of packets, each with a fixed size of 4096 bits. The packet arrivals are drawn from previous measurements [26], where the average flow arrival rate is 538 flows/s and the flow size distribution has a mean of 13 Kb. Given these settings, the average arrival rate is about 7 Mbps. All results are averaged over 50000 time slots. We list all other parameter settings in TABLE V.
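As an illustration of how such an arrival process can be generated, the following Python sketch draws one slot's workload from the quoted statistics; the Poisson flow counts and exponential flow sizes are our modeling assumptions for illustration, not the original traces:

import numpy as np

PACKET_BITS = 4096

def slot_arrival_bits(rng, flow_rate=538.0, mean_flow_bits=13e3, tau0=1.0):
    """One draw of A_i(t) in bits for a slot of length tau0 seconds."""
    n_flows = rng.poisson(flow_rate * tau0)                  # flows this slot
    flow_bits = rng.exponential(mean_flow_bits, size=n_flows)
    packets = np.ceil(flow_bits / PACKET_BITS)               # whole packets
    return int(packets.sum()) * PACKET_BITS

rng = np.random.default_rng(0)
mean_mbps = sum(slot_arrival_bits(rng) for _ in range(1000)) / 1000 / 1e6
print(mean_mbps)  # on the order of the ~7 Mbps average quoted above
# (rounding each flow up to whole packets adds some overhead)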

Figure 4: Offloading decisions when $W=10$.
Figure 5: Performance of PORA vs. $W$ when $V=10^{11}$: (a) queue backlogs; (b) power consumptions.

VI-B Evaluation with Perfect Prediction

Under the perfect prediction setting, we evaluate how the value of the parameter $V$ and the prediction window size $W$ influence the performance of PORA.

System Performance under Different Values of $V$: Figure 4 shows the impact of the parameter $V$ on the offloading decisions of PORA. When the value of $V$ is around $10^{10}$, the time-average amount of locally processed workload on EFNs reaches the bottom of its curve, while the other offloading decision curves peak. The reason is that the offloading decisions are not only determined by the value of $V$, but also influenced by the queue backlog sizes.

Figure 6 presents the impact of the value of $V$ on the different types of queues and power consumptions in the system. As the value of $V$ increases, we see a rising trend in all types of queue backlogs and a roughly falling trend in all types of power consumptions.

Figure 6: Performance of PORA when $W=10$: (a) queue backlogs; (b) power consumptions.
Figure 7: Performance of variants of PORA: (a) total queue backlogs; (b) total power consumptions.
Figure 8: Comparison between PORA and baselines: (a) total queue backlogs; (b) total power consumptions.

System Performance with Different Values of the Prediction Window Size $W$: Figures 5(a) and 5(b) show the system performance as the prediction window size $W$ varies from 0 to 30. With perfect prediction, PORA effectively shortens the average queueing latency on the EFN arrival queues, eventually driving it close to zero with no extra power consumption and only a mild prediction window size ($W=20$ in this case).

PORA vs. PORA-$d$ (Low-Sampling Variant): In practice, since PORA requires sampling system dynamics across various fog nodes, it may incur considerable sampling overheads. By adopting the idea of randomized load balancing [27], we propose PORA-$d$, a variant of PORA that reduces the sampling overheads by probing only $d$ ($d\in\{1,2,3,4\}$) CFNs, chosen uniformly at random for each EFN from its accessible CFN set, and conducting resource allocation over them. (When $d=1$, the scheme degenerates to uniform random sampling.)

Figure 7 compares the performance of PORA with PORA-$d$. We observe that PORA achieves the smallest queue backlog. The result is reasonable, since each EFN has access to 5 CFNs under PORA, more than the $d\leq 4$ CFNs under PORA-$d$; as a result, each EFN has a better chance of accessing CFNs with better wireless channel conditions and processing capacities under PORA than under PORA-$d$. The observation that the queue backlog increases as $d$ decreases further verifies our analysis. In fact, we can view $d$ as the out-degree of each EFN in the network topology: as $d$ decreases, the system performance degrades. However, when the value of $V$ is sufficiently large, PORA-$d$ achieves similar power consumption to PORA, and the relative increase in backlog is small. For example, when $V=2\times 10^{11}$, PORA-4 incurs a 4.3% larger backlog than PORA, and PORA-3 a 10.9% larger backlog. In summary, PORA-$d$ (with $d=2,3,4$) can reduce the sampling overheads at the cost of only slight performance degradation under large $V$.

Comparison of PORA and Baselines: We introduce four baselines to evaluate the performance of PORA: (1) NOL (No Offloading): all nodes in the EFT process packets locally; (2) O2CFT (Offload to CFT): all packets are offloaded to the CFT and processed therein; (3) O2CLOUD (Offload to Cloud): all packets are offloaded to the cloud; (4) RANDOM: each fog node randomly chooses, with equal probability, to offload each packet or process it locally. Note that all of the above baselines are also assumed capable of pre-serving future workloads in the prediction window. Figure 8 compares the instant total queue backlog sizes and power consumptions over time under the five schemes (PORA, NOL, O2CFT, O2CLOUD, RANDOM), where W=10 and V ∈ {10^{9}, 10^{11}}.

We observe that O2CLOUD achieves the minimum power consumption but incurs constantly increasing queue backlog sizes over time. The reasons are as follows. On one hand, in our settings, the mean power consumption for transmitting workload from the EFT to the CFT is smaller than that for processing the same amount of workload on fog nodes; under O2CLOUD, only wireless transmit power is consumed, and hence the minimum is achieved. On the other hand, all workload must traverse every fog tier before reaching the cloud, which causes network congestion within the fog tiers and thus workload accumulation with ever-growing queue backlogs.

As Figure 8 illustrates, PORA achieves the largest power consumption but the smallest backlog size when V=10^{9}. Upon convergence of PORA, the power consumptions under all schemes reach the same level, but the differences between their queue backlog sizes become more pronounced: PORA (V=10^{9}) reduces the queue backlog by 96% compared with NOL and RANDOM. The results demonstrate that with an appropriate choice of V, PORA achieves lower latency than the four baselines under the same power consumption.

VI-C Evaluation with Imperfect Prediction

In practice, prediction errors are inevitable. Hence, we investigate the performance of PORA in the presence of prediction errors [28]. Particularly, we consider two kinds of errors: false alarms and missed detections. A packet is falsely alarmed if it is predicted to arrive but never actually arrives; a packet is missed if it arrives but is not predicted. We assume that all EFNs have a uniform false-alarm rate p_1 and missed-detection rate p_2. In our simulations, we consider the pairs (p_1, p_2) ∈ {(0.0, 0.0), (0.05, 0.05), (0.5, 0.05), (0.05, 0.25), (0.5, 0.25)}, where (0.0, 0.0) corresponds to perfect prediction. A sketch of this error model is shown below.
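To make the error model concrete, the following Python sketch (our own illustrative code, not from the paper) corrupts a ground-truth 0/1 arrival sequence with the two error types:

```python
import numpy as np

rng = np.random.default_rng(0)

def predicted_arrivals(true_arrivals, p1, p2):
    """Corrupt a 0/1 arrival sequence with false alarms (rate p1:
    predicted to arrive but never does) and missed detections
    (rate p2: arrives but is not predicted)."""
    a = np.asarray(true_arrivals)
    false_alarm = (a == 0) & (rng.random(a.shape) < p1)
    missed      = (a == 1) & (rng.random(a.shape) < p2)
    return np.where(false_alarm, 1, np.where(missed, 0, a))

# Example: 10 slots of arrivals with (p1, p2) = (0.05, 0.25).
pred = predicted_arrivals([1, 0, 1, 1, 0, 0, 1, 0, 1, 1], p1=0.05, p2=0.25)
```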

Figure 9: Performance of PORA under imperfect prediction. (a) Total queue backlogs. (b) Total power consumptions.

Figure 9 presents the results under prediction window size W=10. We observe that when V ≤ 7.5×10^{10}, both the total queue backlog sizes and the power consumptions under imperfect prediction are larger than those under perfect prediction. The reason for this degradation is twofold: first, arrivals that are missed cannot be pre-served, leading to larger queue backlogs; second, PORA allocates redundant resources to handle falsely predicted arrivals, causing extra power consumption. As V increases, this degradation becomes negligible. Taking the total queue backlog under (p_1, p_2) = (0.5, 0.25) as an example, compared with the case under perfect prediction, it increases by 4.72% at V=10^{11} and by 2.24% at V=2×10^{11}. Moreover, there is no extra power consumption under imperfect prediction when V ≥ 7.5×10^{10}, since PORA tends to reserve resources to reduce power consumption under large V.

In summary, prediction errors degrade both the total queue backlog sizes and the power consumptions. However, this degradation decreases and eventually becomes negligible as V increases. Although a large V improves the robustness of PORA and yields small power consumptions, it also induces long workload latencies. In practice, the choice of V depends on how the system designer trades off these criteria.

VII Conclusion

In this paper, we studied the problem of dynamic offloading and resource allocation with prediction in a multi-tiered fog computing system. By formulating it as a stochastic network optimization problem, we proposed PORA, an efficient online scheme that exploits predictive offloading to minimize power consumption with queue stability guarantee. Our theoretical analysis and trace-driven simulations showed that PORA achieves a tunable power-latency tradeoff, while effectively shortening latency with only a moderate amount of future information, even in the presence of prediction errors. As future work, our model can be extended to more general settings in which the instantaneous wireless channel states are unknown at the moment of decision making, or the underlying system dynamics are non-stationary.

References

  • [1] Y. Xiao and M. Krunz, “QoE and power efficiency tradeoff for fog computing networks with fog node cooperation,” in Proceedings of IEEE INFOCOM, 2017.
  • [2] S. Yi, Z. Hao, Z. Qin, and Q. Li, “Fog computing: Platform and applications,” in Proceedings of IEEE HotWeb, 2015.
  • [3] J. Broughton, “Netflix adds download functionality,” https://technology.ihs.com/586280/netflix-adds-download-support, 2016.
  • [4] Y. Wang, X. Tao, X. Zhang, P. Zhang, and Y. T. Hou, “Cooperative task offloading in three-tier mobile computing networks: An ADMM framework,” IEEE Transactions on Vehicular Technology, vol. 68, no. 3, pp. 2763–2776, 2019.
  • [5] L. Liu, Z. Chang, and X. Guo, “Socially-aware dynamic computation offloading scheme for fog computing system with energy harvesting devices,” IEEE Internet of Things Journal, vol. 5, no. 3, pp. 1869–1879, 2018.
  • [6] S. Misra and N. Saha, “Detour: Dynamic task offloading in software-defined fog for IoT applications,” IEEE Journal on Selected Areas in Communications, vol. 37, no. 5, pp. 1159–1166, 2019.
  • [7] L. Lei, H. Xu, X. Xiong, K. Zheng, and W. Xiang, “Joint computation offloading and multi-user scheduling using approximate dynamic programming in NB-IoT edge computing system,” IEEE Internet of Things Journal, vol. 6, no. 3, pp. 5345–5362, 2019.
  • [8] Y. Mao, J. Zhang, S. Song, and K. B. Letaief, “Power-delay tradeoff in multi-user mobile-edge computing systems,” in Proceedings of IEEE GLOBECOM, 2016.
  • [9] Y. Chen, N. Zhang, Y. Zhang, X. Chen, W. Wu, and X. S. Shen, “Energy efficient dynamic offloading in mobile edge computing for Internet of Things,” IEEE Transactions on Cloud Computing, 2019, doi: 10.1109/TCC.2019.2898657.
  • [10] Y. Gao, W. Tang, M. Wu, P. Yang, and L. Dan, “Dynamic social-aware computation offloading for low-latency communications in IoT,” IEEE Internet of Things Journal, 2019, doi: 10.1109/JIOT.2019.2909299.
  • [11] D. Zhang, L. Tan, J. Ren, M. K. Awad, S. Zhang, Y. Zhang, and P.-J. Wan, “Near-optimal and truthful online auction for computation offloading in green edge-computing systems,” IEEE Transactions on Mobile Computing, 2019, doi: 10.1109/TMC.2019.2901474.
  • [12] M. Taneja and A. Davy, “Resource aware placement of IoT application modules in fog-cloud computing paradigm,” in Proceedings of IFIP/IEEE IM, 2017.
  • [13] M. Chen, W. Li, G. Fortino, Y. Hao, L. Hu, and I. Humar, “A dynamic service migration mechanism in edge cognitive computing,” ACM Transactions on Internet Technology, vol. 19, no. 2, p. 30, 2019.
  • [14] D. Zhang, Z. Chen, L. X. Cai, H. Zhou, S. Duan, J. Ren, X. Shen, and Y. Zhang, “Resource allocation for green cloud radio access networks with hybrid energy supplies,” IEEE Transactions on Vehicular Technology, vol. 67, no. 2, pp. 1684–1697, 2017.
  • [15] N. K. Ahmed, A. F. Atiya, N. E. Gayar, and H. El-Shishiny, “An empirical comparison of machine learning models for time series forecasting,” Econometric Reviews, vol. 29, no. 5-6, pp. 594–621, 2010.
  • [16] L. Huang, S. Zhang, M. Chen, and X. Liu, “When backpressure meets predictive scheduling,” IEEE/ACM Transactions on Networking, vol. 24, no. 4, pp. 2237–2250, 2016.
  • [17] R. G. Gallager, Principles of Digital Communication.   Cambridge University Press, 2008.
  • [18] Y. Mao, C. You, J. Zhang, K. Huang, and K. B. Letaief, “A survey on mobile edge computing: The communication perspective,” IEEE Communications Surveys & Tutorials, vol. 19, no. 4, pp. 2322–2358, 2017.
  • [19] A. P. Miettinen and J. K. Nurminen, “Energy efficiency of mobile clients in cloud computing,” in Proceedings of ACM HotCloud, 2010.
  • [20] Y. Kim, J. Kwak, and S. Chong, “Dual-side optimization for cost-delay tradeoff in mobile edge computing,” IEEE Transactions on Vehicular Technology, vol. 67, no. 2, pp. 1765–1781, 2018.
  • [21] M. J. Neely, “Stochastic network optimization with application to communication and queueing systems,” Synthesis Lectures on Communication Networks, vol. 3, no. 1, pp. 1–211, 2010.
  • [22] S. Boyd and L. Vandenberghe, Convex Optimization.   Cambridge University Press, 2004.
  • [23] A. Leon-Garcia, Probability, Statistics, and Random Processes for Electrical Engineering, 3rd ed.   Pearson Education, 2017.
  • [24] C.-F. Liu, M. Bennis, and H. V. Poor, “Latency and reliability-aware task offloading and resource allocation for mobile edge computing,” in Proceedings of IEEE GLOBECOM, 2017.
  • [25] J. Du, L. Zhao, J. Feng, and X. Chu, “Computation offloading and resource allocation in mixed fog/cloud computing systems with min-max fairness guarantee,” IEEE Transactions on Communications, vol. 66, no. 4, pp. 1594–1608, 2017.
  • [26] T. Benson, A. Akella, and D. A. Maltz, “Network traffic characteristics of data centers in the wild,” in Proceedings of ACM IMC, 2010.
  • [27] M. Mitzenmacher, “The power of two choices in randomized load balancing,” IEEE Transactions on Parallel and Distributed Systems, vol. 12, no. 10, pp. 1094–1104, 2001.
  • [28] K. Chen and L. Huang, “Timely-throughput optimal scheduling with prediction,” in Proceedings of IEEE INFOCOM, 2018.

Appendix A Design of Scheme PORA

First, we define the Lyapunov function [21] L(\boldsymbol{Q}(t)) as

L(\boldsymbol{Q}(t))\triangleq\frac{1}{2}\sum_{i\in\mathcal{N}}\sum_{\beta\in\{a,l,o\}}\big(Q_{i}^{(e,\beta)}(t)\big)^{2}+\frac{1}{2}\sum_{j\in\mathcal{M}}\sum_{\beta\in\{a,l,o\}}\big(Q_{j}^{(c,\beta)}(t)\big)^{2}. (29)

Next, we define the drift-plus-penalty term \Delta_{V}L(\boldsymbol{Q}(t)) as

\Delta_{V}L(\boldsymbol{Q}(t))\triangleq\mathbb{E}\left[L(\boldsymbol{Q}(t+1))-L(\boldsymbol{Q}(t))\,|\,\boldsymbol{Q}(t)\right]+V\,\mathbb{E}\left\{P(t)\,|\,\boldsymbol{Q}(t)\right\}, (30)

where V is a positive control parameter. According to definition (29) and the queue update equations (6), (10), (12), (13), and (14), there exists a positive constant \theta>0 such that

\begin{aligned}
L(\boldsymbol{Q}(t+1))-L(\boldsymbol{Q}(t))
\leq\;&\theta+\sum_{i\in\mathcal{N}}Q_{i}^{(e,a)}(t)\left(A_{i}(t+W_{i})-b_{i}^{(e,l)}(t)-b_{i}^{(e,o)}(t)\right)\\
&+\sum_{i\in\mathcal{N}}Q_{i}^{(e,l)}(t)\left(b_{i}^{(e,l)}(t)-\tau_{0}f_{i}^{(e)}(t)/L_{i}^{(e)}\right)\\
&+\sum_{i\in\mathcal{N}}Q_{i}^{(e,o)}(t)\Big(b_{i}^{(e,o)}(t)-\sum_{j\in\mathcal{M}_{i}}R_{i,j}(t)\Big)\\
&+\sum_{j\in\mathcal{M}}Q_{j}^{(c,a)}(t)\Big(\sum_{i\in\mathcal{N}_{j}}R_{i,j}(t)-b_{j}^{(c,l)}(t)-b_{j}^{(c,o)}(t)\Big)\\
&+\sum_{j\in\mathcal{M}}Q_{j}^{(c,l)}(t)\left(b_{j}^{(c,l)}(t)-\tau_{0}f_{j}^{(c)}(t)/L_{j}^{(c)}\right)\\
&+\sum_{j\in\mathcal{M}}Q_{j}^{(c,o)}(t)\left(b_{j}^{(c,o)}(t)-D_{j}(t)\right).
\end{aligned} (31)

Substituting (31) into the definition of the drift-plus-penalty term in (30), and using \mathbb{E}[A_{i}(t+W_{i})]=\lambda_{i}, we obtain

\begin{aligned}
\Delta_{V}L(\boldsymbol{Q}(t))
\leq\;&\theta+V\,\mathbb{E}\{P(t)\,|\,\boldsymbol{Q}(t)\}\\
&+\sum_{i\in\mathcal{N}}Q_{i}^{(e,a)}(t)\,\mathbb{E}\Big\{\lambda_{i}-\big(b_{i}^{(e,l)}(t)+b_{i}^{(e,o)}(t)\big)\,\Big|\,\boldsymbol{Q}(t)\Big\}\\
&+\sum_{i\in\mathcal{N}}Q_{i}^{(e,l)}(t)\,\mathbb{E}\big\{b_{i}^{(e,l)}(t)-\tau_{0}f_{i}^{(e)}(t)/L_{i}^{(e)}\,\big|\,\boldsymbol{Q}(t)\big\}\\
&+\sum_{i\in\mathcal{N}}Q_{i}^{(e,o)}(t)\,\mathbb{E}\Big\{b_{i}^{(e,o)}(t)-\sum_{j\in\mathcal{M}_{i}}R_{i,j}(t)\,\Big|\,\boldsymbol{Q}(t)\Big\}\\
&+\sum_{j\in\mathcal{M}}Q_{j}^{(c,a)}(t)\,\mathbb{E}\Big\{\sum_{i\in\mathcal{N}_{j}}R_{i,j}(t)-\big(b_{j}^{(c,l)}(t)+b_{j}^{(c,o)}(t)\big)\,\Big|\,\boldsymbol{Q}(t)\Big\}\\
&+\sum_{j\in\mathcal{M}}Q_{j}^{(c,l)}(t)\,\mathbb{E}\big\{b_{j}^{(c,l)}(t)-\tau_{0}f_{j}^{(c)}(t)/L_{j}^{(c)}\,\big|\,\boldsymbol{Q}(t)\big\}\\
&+\sum_{j\in\mathcal{M}}Q_{j}^{(c,o)}(t)\,\mathbb{E}\big\{b_{j}^{(c,o)}(t)-D_{j}(t)\,\big|\,\boldsymbol{Q}(t)\big\}.
\end{aligned} (32)

Then, by the expression of the transmission capacity from EFN i to CFN j in (9) and the expression of the total power consumption in (16), we have

\begin{aligned}
\Delta_{V}L(\boldsymbol{Q}(t))
\leq\;&\theta+\sum_{i\in\mathcal{N}}Q_{i}^{(e,a)}(t)\,\mathbb{E}\{A_{i}(t+W_{i})\,|\,\boldsymbol{Q}(t)\}\\
&+\sum_{i\in\mathcal{N}}\mathbb{E}\Big\{\big(Q_{i}^{(e,l)}(t)-Q_{i}^{(e,a)}(t)\big)b_{i}^{(e,l)}(t)\,\Big|\,\boldsymbol{Q}(t)\Big\}\\
&+\sum_{i\in\mathcal{N}}\mathbb{E}\Big\{\big(Q_{i}^{(e,o)}(t)-Q_{i}^{(e,a)}(t)\big)b_{i}^{(e,o)}(t)\,\Big|\,\boldsymbol{Q}(t)\Big\}\\
&+\sum_{i\in\mathcal{N}}\mathbb{E}\bigg\{V\tau_{0}\varsigma\big(f_{i}^{(e)}(t)\big)^{3}-\frac{\tau_{0}Q_{i}^{(e,l)}(t)}{L_{i}^{(e)}}f_{i}^{(e)}(t)\,\bigg|\,\boldsymbol{Q}(t)\bigg\}\\
&+\sum_{j\in\mathcal{M}}\mathbb{E}\Big\{\big(Q_{j}^{(c,l)}(t)-Q_{j}^{(c,a)}(t)\big)b_{j}^{(c,l)}(t)\,\Big|\,\boldsymbol{Q}(t)\Big\}\\
&+\sum_{j\in\mathcal{M}}\mathbb{E}\Big\{\big(Q_{j}^{(c,o)}(t)-Q_{j}^{(c,a)}(t)\big)b_{j}^{(c,o)}(t)\,\Big|\,\boldsymbol{Q}(t)\Big\}\\
&+\sum_{j\in\mathcal{M}}\mathbb{E}\bigg\{V\tau_{0}\varsigma\big(f_{j}^{(c)}(t)\big)^{3}-\frac{\tau_{0}Q_{j}^{(c,l)}(t)}{L_{j}^{(c)}}f_{j}^{(c)}(t)\,\bigg|\,\boldsymbol{Q}(t)\bigg\}\\
&+\sum_{i\in\mathcal{N}}\sum_{j\in\mathcal{M}_{i}}\mathbb{E}\Big\{V\tau_{0}p_{i,j}(t)-\tau_{0}m_{i,j}(t)\log_{2}\big(1+l_{i,j}(t)p_{i,j}(t)\big)\,\Big|\,\boldsymbol{Q}(t)\Big\}\\
&-\sum_{j\in\mathcal{M}}Q_{j}^{(c,o)}(t)\,\mathbb{E}\{D_{j}(t)\,|\,\boldsymbol{Q}(t)\},
\end{aligned} (33)

where m_{i,j}(t)\triangleq\big(Q_{i}^{(e,o)}(t)-Q_{j}^{(c,a)}(t)\big)B and l_{i,j}(t)\triangleq H_{i,j}(t)/(N_{0}B) for all i\in\mathcal{N}, j\in\mathcal{M}_{i}.

To solve problem (19), we minimize the upper bound of \Delta_{V}L(\boldsymbol{Q}(t)) in every time slot. Since minimizing a conditional expectation directly is hard, we follow the standard approach of opportunistically minimizing the expression inside the expectation [21], which yields the following deterministic problem in every time slot t:

\begin{split}\underset{\boldsymbol{b},\boldsymbol{f},\boldsymbol{p}}{\text{Minimize}}\quad&\sum_{i\in\mathcal{N}}\left(Q_{i}^{(e,l)}(t)-Q_{i}^{(e,a)}(t)\right)b_{i}^{(e,l)}+\sum_{i\in\mathcal{N}}\left(Q_{i}^{(e,o)}(t)-Q_{i}^{(e,a)}(t)\right)b_{i}^{(e,o)}\\
&+\sum_{i\in\mathcal{N}}\bigg(V\tau_{0}\varsigma\big(f_{i}^{(e)}\big)^{3}-\frac{\tau_{0}Q_{i}^{(e,l)}(t)}{L_{i}^{(e)}}f_{i}^{(e)}\bigg)\\
&+\sum_{j\in\mathcal{M}}\left(Q_{j}^{(c,l)}(t)-Q_{j}^{(c,a)}(t)\right)b_{j}^{(c,l)}+\sum_{j\in\mathcal{M}}\left(Q_{j}^{(c,o)}(t)-Q_{j}^{(c,a)}(t)\right)b_{j}^{(c,o)}\\
&+\sum_{j\in\mathcal{M}}\bigg(V\tau_{0}\varsigma\big(f_{j}^{(c)}\big)^{3}-\frac{\tau_{0}Q_{j}^{(c,l)}(t)}{L_{j}^{(c)}}f_{j}^{(c)}\bigg)\\
&+\sum_{i\in\mathcal{N}}\sum_{j\in\mathcal{M}_{i}}\Big[V\tau_{0}p_{i,j}-\tau_{0}m_{i,j}(t)\log_{2}\left(1+l_{i,j}(t)p_{i,j}\right)\Big]\\
\text{Subject to}\quad&\text{(1), (7), (8), (11), (15)}.\end{split} (34)

Problem (34) can be decomposed into the subproblems shown in Section V. By solving these subproblems, we develop PORA, an online scheme that independently makes predictive offloading decisions \boldsymbol{b}(t), sets CPU frequencies \boldsymbol{f}(t), and allocates transmit powers \boldsymbol{p}(t) in every time slot t. ∎
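As an illustration of the decomposition, note that (34) is linear in each offloading variable, so each optimal offloading decision sits at an endpoint of its feasible interval. The following Python sketch shows the resulting bang-bang rule under an assumed box constraint 0 ≤ b ≤ b_max (the actual feasible sets are given by constraints (1) and (11); names are ours, not the paper's implementation):

```python
def offload_amount(q_source, q_target, b_max):
    """Minimize (q_target - q_source) * b over 0 <= b <= b_max:
    move the maximum amount when the target backlog is smaller
    than the source backlog, and nothing otherwise."""
    return b_max if q_target < q_source else 0.0

# Example: an arrival queue of size 120 drains into a local
# processing queue of size 80, so offloading is worthwhile.
b = offload_amount(q_source=120.0, q_target=80.0, b_max=50.0)  # -> 50.0
```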

Appendix B Proof of Optimal Local CPU Frequency

To derive the optimal solution to subproblem (22), we denote its objective function by

F_{k}^{(\alpha,t)}\big(f_{k}^{(\alpha)}\big)\triangleq V\varsigma\big(f_{k}^{(\alpha)}\big)^{3}-\frac{Q_{k}^{(\alpha,l)}(t)}{L_{k}^{(\alpha)}}f_{k}^{(\alpha)}. (35)

Its first- and second-order derivatives are

\frac{dF^{(\alpha,t)}_{k}\big(f_{k}^{(\alpha)}\big)}{df_{k}^{(\alpha)}}=3V\varsigma\big(f_{k}^{(\alpha)}\big)^{2}-\frac{Q_{k}^{(\alpha,l)}(t)}{L_{k}^{(\alpha)}}, (36)

\frac{d^{2}F^{(\alpha,t)}_{k}\big(f_{k}^{(\alpha)}\big)}{\big(df_{k}^{(\alpha)}\big)^{2}}=6V\varsigma f_{k}^{(\alpha)}. (37)

From these two derivatives, we conclude that F^{(\alpha,t)}_{k}(\cdot) is convex on the interval [0,f_{k,\text{max}}^{(\alpha)}], since its second-order derivative satisfies d^{2}F^{(\alpha,t)}_{k}(\cdot)/(df_{k}^{(\alpha)})^{2}\geq 0 for f_{k}^{(\alpha)}\geq 0. Moreover, its first-order derivative vanishes at f_{k}^{(\alpha)}=\sqrt{Q_{k}^{(\alpha,l)}(t)/3V\varsigma L_{k}^{(\alpha)}}. Thus the minimizer of F^{(\alpha,t)}_{k}(\cdot) over [0,f_{k,\text{max}}^{(\alpha)}] is \min\left\{\sqrt{Q_{k}^{(\alpha,l)}(t)/3V\varsigma L_{k}^{(\alpha)}},f_{k,\text{max}}^{(\alpha)}\right\}. ∎
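In code, this closed-form frequency rule is a one-liner: the stationary point of (35) clipped to the feasible box. A minimal Python sketch (argument names and the example values are our own):

```python
import math

def optimal_cpu_freq(q_local, V, varsigma, L, f_max):
    """Minimize V*varsigma*f**3 - (q_local/L)*f over [0, f_max]:
    the unconstrained stationary point sqrt(q_local / (3*V*varsigma*L)),
    clipped at the maximum frequency f_max (Appendix B)."""
    f_star = math.sqrt(q_local / (3.0 * V * varsigma * L))
    return min(f_star, f_max)

# Example with assumed values: backlog 1e6 bits, V = 1e11, L = 1000.
f = optimal_cpu_freq(q_local=1e6, V=1e11, varsigma=1e-27, L=1e3, f_max=2e9)
```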

Appendix C Proof of Optimal Transmit Power Allocation

We denote the optimal solution to subproblem (24) by \boldsymbol{p}^{*}_{i}(t) and its objective function by G^{(t)}_{i}(\boldsymbol{p}_{i}). Moreover, we define the function

G^{(t)}_{i,j}(p_{i,j})\triangleq Vp_{i,j}-m_{i,j}(t)\log_{2}\left(1+l_{i,j}(t)p_{i,j}\right) (38)

for each j\in\mathcal{M}_{i}. Then G^{(t)}_{i}(\boldsymbol{p}_{i}) can be expressed as

G^{(t)}_{i}(\boldsymbol{p}_{i})=\sum_{j\in\mathcal{M}_{i}}G^{(t)}_{i,j}(p_{i,j}). (39)

We denote the minimizer of G^{(t)}_{i,j}(\cdot) over the interval [0,\infty) by \tilde{p}_{i,j}^{(t)}, i.e.,

\tilde{p}_{i,j}^{(t)}\triangleq\arg\min_{p_{i,j}\geq 0}G_{i,j}^{(t)}(p_{i,j}). (40)

When m_{i,j}(t)\leq 0, G^{(t)}_{i,j}(\cdot) is increasing on [0,\infty) and \tilde{p}_{i,j}^{(t)}=0; in this case, we have \boldsymbol{p}_{i}^{*}(t)=\tilde{\boldsymbol{p}}^{(t)}_{i}. When m_{i,j}(t)>0, G^{(t)}_{i,j}(\cdot) is convex on [0,\infty), since its second-order derivative satisfies

\frac{d^{2}G_{i,j}^{(t)}(p_{i,j})}{dp_{i,j}^{2}}=\frac{m_{i,j}(t)\big(l_{i,j}(t)\big)^{2}}{\big(1+l_{i,j}(t)p_{i,j}\big)^{2}}>0. (41)

Thus we obtain \tilde{p}_{i,j}^{(t)} by setting its first-order derivative to zero:

\frac{dG_{i,j}^{(t)}(p_{i,j})}{dp_{i,j}}\bigg|_{p_{i,j}=\tilde{p}_{i,j}^{(t)}}=V-\frac{m_{i,j}(t)l_{i,j}(t)}{1+l_{i,j}(t)\tilde{p}_{i,j}^{(t)}}=0. (42)

It follows that when m_{i,j}(t)>0,

\tilde{p}_{i,j}^{(t)}=\left[\frac{m_{i,j}(t)}{V}-\frac{1}{l_{i,j}(t)}\right]^{+}. (43)

If \sum_{j\in\mathcal{M}_{i}}\tilde{p}^{(t)}_{i,j}\leq p_{i,\text{max}}, then \boldsymbol{p}_{i}^{*}(t)=\tilde{\boldsymbol{p}}^{(t)}_{i}, since the constraints in (24) are satisfied. Otherwise, we have the following lemma.

Lemma 1

If \sum_{j\in\mathcal{M}_{i}}\tilde{p}^{(t)}_{i,j}>p_{i,\text{max}}, then \boldsymbol{p}_{i}^{*}(t) must satisfy \sum_{j\in\mathcal{M}_{i}}p_{i,j}^{*}(t)=p_{i,\text{max}}.

Proof: We prove Lemma 1 by contradiction. Suppose that there exists \theta_{1}>0 such that \sum_{j\in\mathcal{M}_{i}}p_{i,j}^{*}(t)+\theta_{1}=p_{i,\text{max}}. Since \sum_{j\in\mathcal{M}_{i}}\tilde{p}^{(t)}_{i,j}>p_{i,\text{max}}, there exist j^{\prime}\in\mathcal{M}_{i} and \theta_{2}>0 such that p_{i,j^{\prime}}^{*}(t)<\tilde{p}_{i,j^{\prime}}^{(t)}-\theta_{2}. Note that m_{i,j^{\prime}}(t)>0 must hold, since \tilde{p}_{i,j^{\prime}}^{(t)}>0. Now consider a solution \boldsymbol{p}_{i}^{0}(t) to subproblem (24) that satisfies

p_{i,j^{\prime}}^{0}(t)=p_{i,j^{\prime}}^{*}(t)+\theta_{3},\qquad p_{i,j}^{0}(t)=p_{i,j}^{*}(t),\ \forall j\in\mathcal{M}_{i}\setminus\{j^{\prime}\}, (44)

where \theta_{3}\in(0,\min(\theta_{1},\theta_{2})]. Then \boldsymbol{p}_{i}^{0}(t) is a feasible solution, since

\sum_{j\in\mathcal{M}_{i}}p_{i,j}^{0}(t)=\sum_{j\in\mathcal{M}_{i}}p_{i,j}^{*}(t)+\theta_{3}\leq\sum_{j\in\mathcal{M}_{i}}p_{i,j}^{*}(t)+\theta_{1}=p_{i,\text{max}}. (45)

By the definition of p_{i,j^{\prime}}^{0}(t) in (44), we have

p_{i,j^{\prime}}^{*}(t)<p_{i,j^{\prime}}^{0}(t)<\tilde{p}_{i,j^{\prime}}^{(t)}. (46)

Since G_{i,j^{\prime}}^{(t)}(\cdot) is convex and \tilde{p}_{i,j^{\prime}}^{(t)} is its unique minimizer, we have

G_{i,j^{\prime}}^{(t)}\big(p_{i,j^{\prime}}^{*}(t)\big)>G_{i,j^{\prime}}^{(t)}\big(p_{i,j^{\prime}}^{0}(t)\big)>G_{i,j^{\prime}}^{(t)}\big(\tilde{p}_{i,j^{\prime}}^{(t)}\big). (47)

It follows that

G_{i}^{(t)}\big(\boldsymbol{p}_{i}^{*}(t)\big)>G_{i}^{(t)}\big(\boldsymbol{p}_{i}^{0}(t)\big), (48)

which contradicts the fact that \boldsymbol{p}_{i}^{*}(t) is the optimal solution to (24). Thus \theta_{1} must equal zero, and \boldsymbol{p}_{i}^{*}(t) satisfies \sum_{j\in\mathcal{M}_{i}}p_{i,j}^{*}(t)=p_{i,\text{max}}. ∎

When \sum_{j\in\mathcal{M}_{i}}\tilde{p}^{(t)}_{i,j}>p_{i,\text{max}}, we also need the following lemma to find the optimal solution to problem (24).

Lemma 2

For any j\in\mathcal{M}_{i}, if m_{i,j}(t)\leq\frac{V}{l_{i,j}(t)}, then p_{i,j}^{*}(t)=\tilde{p}^{(t)}_{i,j}=0.

Proof: By (43), \tilde{p}_{i,j}^{(t)}=0 if and only if m_{i,j}(t)\leq\frac{V}{l_{i,j}(t)}. Next, we show by contradiction that if m_{i,j}(t)\leq\frac{V}{l_{i,j}(t)}, then the optimal p^{*}_{i,j}(t) must be zero.

Assume that p^{*}_{i,j}(t)>0. Then there must exist a feasible solution \boldsymbol{p}^{1}_{i}(t) such that p_{i,j^{\prime}}^{1}(t)=p_{i,j^{\prime}}^{*}(t) for all j^{\prime}\in\mathcal{M}_{i}\setminus\{j\} and p_{i,j}^{1}(t)=0<p_{i,j}^{*}(t). Then we have

G^{(t)}_{i}\big(\boldsymbol{p}_{i}^{*}(t)\big)-G_{i}^{(t)}\big(\boldsymbol{p}_{i}^{1}(t)\big)=Vp_{i,j}^{*}(t)-m_{i,j}(t)\log_{2}\left(1+l_{i,j}(t)p_{i,j}^{*}(t)\right). (49)

If m_{i,j}(t)\leq 0, then since p_{i,j}^{*}(t)>0, we have

G^{(t)}_{i}\big(\boldsymbol{p}_{i}^{*}(t)\big)-G^{(t)}_{i}\big(\boldsymbol{p}_{i}^{1}(t)\big)>0. (50)

If 0<m_{i,j}(t)\leq\frac{V}{l_{i,j}(t)}, then since \tilde{p}_{i,j}^{(t)}=0<p_{i,j}^{*}(t) is the unique minimizer of G_{i,j}^{(t)}(\cdot) over [0,\infty), we have

G^{(t)}_{i}\big(\boldsymbol{p}_{i}^{*}(t)\big)-G^{(t)}_{i}\big(\boldsymbol{p}_{i}^{1}(t)\big)=G_{i,j}^{(t)}\big(p_{i,j}^{*}(t)\big)>G_{i,j}^{(t)}\big(\tilde{p}_{i,j}^{(t)}\big)=0. (51)

In either case, the strict inequality contradicts the fact that \boldsymbol{p}_{i}^{*}(t) is the optimal solution to problem (24). Thus, for any j with m_{i,j}(t)\leq\frac{V}{l_{i,j}(t)}, the optimal p_{i,j}^{*}(t) must be zero. ∎

We define \mathcal{M}_{i}^{+}\triangleq\big\{j\,\big|\,j\in\mathcal{M}_{i},\ m_{i,j}(t)>\frac{V}{l_{i,j}(t)}\big\}. By applying Lemma 1 and Lemma 2, when \sum_{j\in\mathcal{M}_{i}}\tilde{p}^{(t)}_{i,j}>p_{i,\text{max}}, we only need to solve the following problem:

\begin{split}\underset{(p_{i,j})_{j\in\mathcal{M}_{i}^{+}}}{\text{Minimize}}\quad&\sum_{j\in\mathcal{M}_{i}^{+}}\Big[Vp_{i,j}-m_{i,j}(t)\log_{2}\left(1+l_{i,j}(t)p_{i,j}\right)\Big]\\ \text{Subject to}\quad&\sum_{j\in\mathcal{M}_{i}^{+}}p_{i,j}=p_{i,\text{max}},\\ &p_{i,j}\geq 0,\ \forall j\in\mathcal{M}_{i}^{+}.\end{split} (52)

Note that (p_{i,j}^{*}(t))_{j\in\mathcal{M}_{i}^{+}} is the optimal solution to problem (52), and it satisfies the following KKT conditions:

V-\frac{m_{i,j}(t)l_{i,j}(t)}{1+l_{i,j}(t)p_{i,j}^{*}(t)}+\lambda^{*}-\mu_{j}^{*}=0,\ \forall j\in\mathcal{M}_{i}^{+}, (53)
\mu_{j}^{*}\,p_{i,j}^{*}(t)=0,\ \forall j\in\mathcal{M}_{i}^{+}, (54)
\lambda^{*},\ \mu_{j}^{*}\geq 0,\ \forall j\in\mathcal{M}_{i}^{+}, (55)
\sum_{j\in\mathcal{M}_{i}^{+}}p_{i,j}^{*}(t)=p_{i,\text{max}}, (56)
p_{i,j}^{*}(t)\geq 0, (57)

where \lambda^{*} and (\mu_{j}^{*})_{j\in\mathcal{M}_{i}^{+}} are the corresponding optimal dual variables. Multiplying both sides of (53) by p_{i,j}^{*}(t), we have

\left(V-\frac{m_{i,j}(t)l_{i,j}(t)}{1+l_{i,j}(t)p_{i,j}^{*}(t)}+\lambda^{*}\right)p_{i,j}^{*}(t)-\mu_{j}^{*}\,p_{i,j}^{*}(t)=0. (58)

It follows by (54) that

\left(V-\frac{m_{i,j}(t)l_{i,j}(t)}{1+l_{i,j}(t)p_{i,j}^{*}(t)}+\lambda^{*}\right)p_{i,j}^{*}(t)=0. (59)

On the other hand, according to (53) and (55), we have

\lambda^{*}=\frac{m_{i,j}(t)l_{i,j}(t)}{1+l_{i,j}(t)p_{i,j}^{*}(t)}-V+\mu_{j}^{*}\geq\frac{m_{i,j}(t)l_{i,j}(t)}{1+l_{i,j}(t)p_{i,j}^{*}(t)}-V (60)

for every j\in\mathcal{M}_{i}^{+}. Now we consider two cases:

  1. If \lambda^{*}<m_{i,j}(t)l_{i,j}(t)-V, then (60) holds only if p_{i,j}^{*}(t)>0. It then follows from (59) that

    \lambda^{*}=\frac{m_{i,j}(t)l_{i,j}(t)}{1+l_{i,j}(t)p_{i,j}^{*}(t)}-V, (61)

    which yields p_{i,j}^{*}(t)=\frac{m_{i,j}(t)}{V+\lambda^{*}}-\frac{1}{l_{i,j}(t)}.

  2. If \lambda^{*}\geq m_{i,j}(t)l_{i,j}(t)-V, then condition (59) holds if and only if p_{i,j}^{*}(t)=0.

In conclusion, we have

p_{i,j}^{*}(t)=\begin{cases}\frac{m_{i,j}(t)}{V+\lambda^{*}}-\frac{1}{l_{i,j}(t)},&\text{if }\lambda^{*}<m_{i,j}(t)l_{i,j}(t)-V,\\ 0,&\text{if }\lambda^{*}\geq m_{i,j}(t)l_{i,j}(t)-V,\end{cases} (62)

or equivalently,

p_{i,j}^{*}(t)=\left[\frac{m_{i,j}(t)}{V+\lambda^{*}}-\frac{1}{l_{i,j}(t)}\right]^{+}. (63)

Note that the above expression also applies to the case when m_{i,j}(t)\leq\frac{V}{l_{i,j}(t)}. Then, substituting (63) into (56), we obtain

\sum_{j\in\mathcal{M}_{i}}\left[\frac{m_{i,j}(t)}{V+\lambda^{*}}-\frac{1}{l_{i,j}(t)}\right]^{+}=p_{i,\text{max}}. (64)

The left-hand side of (64) is continuous and decreasing in \lambda^{*} (strictly decreasing wherever it is positive), with a breakpoint at m_{i,j}(t)l_{i,j}(t)-V for each j. Therefore, the equation has a unique solution, which can be found numerically, e.g., by bisection. ∎
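For completeness, here is a minimal Python sketch of this water-filling computation for one EFN i, with vector arguments m = (m_{i,j}(t))_j and l = (l_{i,j}(t))_j (function and variable names are our own, not the authors' implementation):

```python
def optimal_powers(m, l, V, p_max, tol=1e-9):
    """Transmit power allocation of Appendix C: each power follows (63),
    p_j = [m_j/(V + lam) - 1/l_j]^+, with lam = 0 if the unconstrained
    powers (43) already fit the budget, and otherwise lam solving the
    budget equation (64) by bisection on its decreasing left-hand side."""
    def powers(lam):
        return [max(mj / (V + lam) - 1.0 / lj, 0.0) for mj, lj in zip(m, l)]

    if sum(powers(0.0)) <= p_max:  # budget constraint is not binding
        return powers(0.0)
    lo, hi = 0.0, max(mj * lj - V for mj, lj in zip(m, l))  # all powers vanish at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(powers(mid)) > p_max:
            lo = mid
        else:
            hi = mid
    return powers(hi)

# Example: three candidate CFNs with queue differentials m and channel terms l.
p_star = optimal_powers(m=[5.0, 3.0, 1.0], l=[2.0, 1.0, 0.5], V=1.0, p_max=2.0)
```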

Appendix D Proof of Theorem 2

Applying Corollary 1 in [16], given prediction window size W_{i}, the average latency of workload in the arrival queue A_{i,-1}(t) of EFN i under PORA is

d_{i}^{p}=\sum_{w\geq 1}w\,\pi_{i,w+W_{i}}. (65)

According to Little's theorem [23], the average arrival queue backlog size of EFN i under prediction is

\psi_{i}^{p}=\lambda_{i}d_{i}^{p}=\lambda_{i}\sum_{w\geq 1}w\,\pi_{i,w+W_{i}}. (66)

Therefore, the total average arrival queue backlog size over all EFNs is

\psi^{p}=\sum_{i\in\mathcal{N}}\psi_{i}^{p}=\sum_{i\in\mathcal{N}}\lambda_{i}\sum_{w\geq 1}w\,\pi_{i,w+W_{i}}. (67)

When the prediction window size is zero, i.e., when there is no prediction, the corresponding total average arrival queue backlog size over all EFNs is

\psi=\sum_{i\in\mathcal{N}}\psi_{i}=\sum_{i\in\mathcal{N}}\lambda_{i}\sum_{w\geq 1}w\,\pi_{i,w}. (68)

Using (67) and (68), we conclude that

\begin{aligned}
\psi-\psi^{p}&=\sum_{i\in\mathcal{N}}\lambda_{i}\bigg(\sum_{w\geq 1}w\,\pi_{i,w}-\sum_{w\geq 1}w\,\pi_{i,w+W_{i}}\bigg)\\
&=\sum_{i\in\mathcal{N}}\lambda_{i}\bigg(\sum_{w\geq 1}w\,\pi_{i,w}-\sum_{w\geq 1}(w+W_{i})\,\pi_{i,w+W_{i}}+\sum_{w\geq 1}W_{i}\,\pi_{i,w+W_{i}}\bigg)\\
&=\sum_{i\in\mathcal{N}}\lambda_{i}\bigg(\sum_{w\geq 1}w\,\pi_{i,w}-\sum_{w\geq W_{i}+1}w\,\pi_{i,w}+\sum_{w\geq 1}W_{i}\,\pi_{i,w+W_{i}}\bigg)\\
&=\sum_{i\in\mathcal{N}}\lambda_{i}\bigg(\sum_{1\leq w\leq W_{i}}w\,\pi_{i,w}+W_{i}\sum_{w\geq 1}\pi_{i,w+W_{i}}\bigg).
\end{aligned} (69)

Dividing both sides by \sum_{i\in\mathcal{N}}\lambda_{i} and applying Little's theorem, we obtain (27). A numerical sketch of this quantity is given below.
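To illustrate, the right-hand side of (69), normalized by \sum_{i}\lambda_{i}, can be evaluated directly from a (truncated) stationary distribution; a minimal Python sketch (all names and the example numbers are our own illustrative assumptions):

```python
def latency_reduction(lmbda, pi, W):
    """Average latency reduction from (69), normalized by sum(lmbda):
    for EFN i, sum_{1<=w<=W_i} w*pi[i][w] + W_i * sum_{w>W_i} pi[i][w],
    where pi[i] is a truncated stationary distribution over delays w."""
    total = 0.0
    for lam_i, pi_i, W_i in zip(lmbda, pi, W):
        w_max = len(pi_i) - 1
        head = sum(w * pi_i[w] for w in range(1, min(W_i, w_max) + 1))
        tail = W_i * sum(pi_i[w] for w in range(W_i + 1, w_max + 1))
        total += lam_i * (head + tail)
    return total / sum(lmbda)

# Example: two EFNs with truncated delay distributions and windows (2, 3).
eta = latency_reduction(
    lmbda=[1.0, 2.0],
    pi=[[0.5, 0.25, 0.125, 0.0625, 0.0625],
        [0.4, 0.3, 0.15, 0.1, 0.05]],
    W=[2, 3],
)
```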

Next, we prove (28). Taking the limit as \boldsymbol{W}\rightarrow\infty, we obtain

\lim_{\boldsymbol{W}\rightarrow\infty}\sum_{i\in\mathcal{N}}\lambda_{i}\sum_{1\leq w\leq W_{i}}w\,\pi_{i,w}=\psi. (70)

It follows that

\lim_{\boldsymbol{W}\rightarrow\infty}\eta(\boldsymbol{W})=d+\lim_{\boldsymbol{W}\rightarrow\infty}\frac{\sum_{i\in\mathcal{N}}\lambda_{i}W_{i}\sum_{w\geq 1}\pi_{i,w+W_{i}}}{\sum_{i\in\mathcal{N}}\lambda_{i}}. (71)

On the other hand, we have

\lim_{\boldsymbol{W}\rightarrow\infty}\eta(\boldsymbol{W})=\frac{\psi}{\sum_{i\in\mathcal{N}}\lambda_{i}}-\lim_{\boldsymbol{W}\rightarrow\infty}\frac{\psi^{p}}{\sum_{i\in\mathcal{N}}\lambda_{i}}\leq\frac{\psi}{\sum_{i\in\mathcal{N}}\lambda_{i}}. (72)

Combining (71) and (72), and noting that \psi/\sum_{i\in\mathcal{N}}\lambda_{i}=d by Little's theorem, we have

\lim_{\boldsymbol{W}\rightarrow\infty}\frac{\sum_{i\in\mathcal{N}}\lambda_{i}W_{i}\sum_{w\geq 1}\pi_{i,w+W_{i}}}{\sum_{i\in\mathcal{N}}\lambda_{i}}=0, (73)

since this term cannot be negative. Substituting (73) into (71), we obtain

\lim_{\boldsymbol{W}\rightarrow\infty}\eta(\boldsymbol{W})=d. (74) ∎