
Exploiting Data Locality to Improve Performance of Heterogeneous Server Clusters

Zhisheng Zhao¹, Debankur Mukherjee², Ruoyu Wu³
Abstract

We consider load balancing in large-scale heterogeneous server systems in the presence of data locality that imposes constraints on which tasks can be assigned to which servers. The constraints are naturally captured by a bipartite graph between the servers and the dispatchers handling assignments of various arrival flows. When a task arrives, the corresponding dispatcher assigns it to a server with the shortest queue among $d\geq 2$ randomly selected servers obeying the above constraints. Server processing speeds are heterogeneous and depend on the server type. For a broad class of bipartite graphs, we characterize the limit of the appropriately scaled occupancy process, both on the process level and in steady state, as the system size becomes large. Using such a characterization, we show that data locality constraints can be used to significantly improve the performance of heterogeneous systems. This is in stark contrast to either heterogeneous servers in a fully flexible system or data locality constraints in systems with homogeneous servers, both of which have been observed to degrade the system performance. Extensive numerical experiments corroborate the theoretical results.

¹ Georgia Institute of Technology. Email: zhisheng@gatech.edu
² Georgia Institute of Technology. Email: debankur.mukherjee@isye.gatech.edu
³ Iowa State University. Email: ruoyu@iastate.edu

Keywords and phrases. Heterogeneous load balancing system, data locality, compatibility constraint, power-of-two, mean-field, McKean-Vlasov process, stochastic coupling.

Acknowledgements. The work was partially supported by the NSF grant CIF-2113027.

1 Introduction

Over the last two decades, large-scale load balancing has emerged as a fundamental research problem. In simple terms, the goal is to investigate how to efficiently allocate tasks in large-scale service systems, such as data centers and cloud networks. As modern data centers continue to process massive amounts of data with increasingly stringent processing time requirements, the need for more efficient, scalable, and dynamic load balancing algorithms is greater than ever. The study of scalable load balancing algorithms started with the seminal works of Mitzenmacher [17, 1, 16] and Vvedenskaya et al. [32], where the popular ‘power-of-$d$ choices’ or JSQ($d$) algorithm was introduced. Here a canonical model was considered that consists of $N$ identical parallel servers, each serving a dedicated queue of tasks. Arriving tasks are routed by a centralized dispatcher to the shortest of $d\geq 2$ randomly selected queues, irrevocably and instantaneously, at the time of arrival. Since then, this model has received significant attention from the research community, and we have seen tremendous progress in our understanding of the performance of various algorithms; see [31] for a recent survey.

Despite this phenomenal progress, when it comes to modern large-scale systems, many of the existing insights turn out to be false. This is primarily because the above classical model fails to capture two of the most significant factors that impact the performance of these systems: (a) Data locality constraints: In simple terms, this means that tasks of a particular type can only be routed to a small subset of servers that are equipped with the appropriate resources to execute them [33, 22, 27, 29]. For example, an image classification request must be routed to a server that is equipped with an appropriate machine learning model, such as a deep convolutional neural network. Also, in online video services like Netflix and YouTube, users’ requests may only be routed to servers that store the required data (e.g., movies, music). The classical model ignores this effect and assumes full flexibility, that is, that any task can be assigned to any server in the system. In the presence of data locality constraints, the delay performance of the system may degrade drastically compared to fully flexible systems. (b) Heterogeneity in service rates: Servers in modern large-scale server clusters do not process tasks at equal speeds. This heterogeneity of the service rates is a major bottleneck in implementing the existing heuristics of the classical model. For example, if there are two groups of servers in the system, one faster and the other slower, then popular dynamic algorithms like JSQ($d$), which have provably excellent delay performance when all server speeds are identical, can be observed to be unstable (i.e., their queue lengths blow up) [10, 13, 21, 20]. In other words, heterogeneity shrinks the stability region, as formally established in [13]. This happens simply because, if all the servers are treated equally, the slower server pool may receive a higher flow of arrivals than it can process.

Takeaway. In summary, both data locality and heterogeneity of server speeds may significantly degrade the system performance. The main contribution of the current work is to establish that when these two aspects are considered together, the performance can in fact be drastically improved. That is, if servers are heterogeneous, then efficiently designing the data locality constraints (by appropriately placing the resource files in the server network) can recover the full stability region, which is shrunk for fully flexible systems. Moreover, we also establish that carefully designed data locality constraints can ensure the celebrated double-exponential decay of the tail probabilities of the steady-state queue length distribution even for heterogeneous systems.

1.1 Our Contributions

Motivated by this, in the current paper we consider a bipartite graph model for large-scale load balancing systems, which has recently gained popularity in the research community. In this model, a bipartite graph between the servers and the task types describes the compatibility between the two, where an edge represents the server’s ability to process the corresponding task type. This encompasses the classical full-flexibility models as those having a complete bipartite compatibility graph. An immediate difficulty of the new model is that when the graph is non-trivial (i.e., not a collection of isolated pairs or a complete bipartite graph), the mean-field techniques break down. This is because the queues no longer remain exchangeable, making aggregate processes such as the vector of the number of servers with queue length $i$, $i=0,1,2,\ldots$, non-Markovian. In addition, we also consider that each dispatcher handles the arrival flow of one of $K$ possible task types and that there are $M$ server types. The rate of service at a server depends on its type. Throughout the paper, the key quantity of interest will be the global occupancy process $\mathbf{q}^N(t)=(q^N_{m,l}(t),\,m=1,\ldots,M,\,l\geq 1)$, where $q^N_{m,l}(t)$ represents the fraction of servers of type $m$ with queue length at least $l$ at time $t$ in the $N$-th system with $N$ servers, and we will look at the large-system asymptotic regime $N\to\infty$.

Due to the compatibility constraints, the servers become non-exchangeable, even if they belong to the same type. This causes most of the existing frameworks [8, 17, 24] to break down. To characterize the process-level limit of the queue length process, we resort to the theory of weakly interacting particle systems and asymptotically couple the evolution of the $N$-dimensional vector of queue lengths with an appropriately defined infinite system of independent McKean-Vlasov processes [26, 15]. We also show the asymptotic independence of any finite number of queue length processes, also known as the propagation of chaos property. This convergence of the queue length processes (in the $L_2$ sense) is then used to establish the transient convergence of the occupancy process. One downside of the above convergence is that it relies on the assumption that the initial queue lengths within each set of servers of the same type are independent and identically distributed (i.i.d.) and are independent across the sets of servers of different types. Due to this assumption, this convergence result cannot be used to establish the interchange of the $t\to\infty$ and $N\to\infty$ limits, which is crucial in studying the limit of steady states.

To overcome this issue, we use the framework of [22], recently introduced in the context of homogeneous systems. There, a notion called proportional sparsity for graph sequences was introduced, which ensures that the empirical queue length distribution within the set of compatible servers of any dispatcher is close to the empirical queue length distribution of the entire system. This was used in [22] to construct conditions on graphs that match the performance of a fully flexible system. In the current setup, however, this notion is inadequate, since our goal is not to match the performance of the fully flexible system (which is usually poor under heterogeneity). That is why we extend this notion to what we call clustered proportional sparsity for a sequence of graphs of increasing size, to accommodate heterogeneous systems. The clustered proportional sparsity property allows us to construct a stochastic coupling between the system and another intermediate system whose task allocation is done by a carefully constructed algorithm called GWSQ($d$) (Algorithm 1). This coupling with the intermediate system, along with clustered proportional sparsity, helps us establish that if the initial occupancies of two systems are close, then the distance (in the $\ell_1$-norm) between their global occupancies remains small uniformly over any finite time interval. In turn, this implies that the limits of their global occupancy processes are the same. As a consequence, we can remove the i.i.d. assumption on the initial queue lengths, since the above guarantees that, under clustered proportional sparsity, the convergence of the occupancy process depends only on the initial occupancy and not on how the individual queues are distributed.

The above process-level limit result shows that the transient limit of the occupancy process can be described as a system of ODEs that depend on various graph parameters. Next, we also show that the interchange of limits holds and that the sequence of occupancy states in stationarity converges weakly to the unique fixed point of the ODE. One celebrated feature of the classical JSQ($d$) policy for homogeneous systems under full flexibility is that the steady-state queue length decays doubly exponentially as $\lambda^{(d^i-1)/(d-1)}$, where $\lambda\in(0,1)$ is the load per server [17, 32]. We establish this double-exponential decay property for the heterogeneous system.

It is worthwhile to note that the strength of the above results lies in the fact that they hold for arbitrary deterministic sequences of graphs satisfying certain properties. Moreover, we show that all these properties are satisfied almost surely by a sequence of inhomogeneous random graphs with parameters prescribed by the theorems. This makes it easy to design graphs with the desired favorable properties.

1.2 Related Works

The research on task allocation systems with limited flexibility can be traced back to the works of Turner [30] and Foss and Chernova [9]. Of particular importance to the current work, Foss and Chernova [9] studied stability properties of the system using the fluid model. Later, Bramson [4] generalized parts of the results in [9] to a broad class of Join-Shortest-Queue (JSQ)-type systems, including the JSQ($d$) policy, via a Lyapunov function approach. Stolyar [23] considered optimal routing in an output-queued flexible server system, which is essentially the bipartite graph model for the load balancing system. There the author considered a system with a fixed number of servers and dispatchers in the conventional heavy-traffic regime and proposed a routing policy that is optimal in terms of server workload. Recently, Cruise et al. [7] considered load balancing problems on hypergraphs and established their stability conditions. The above works, however, did not aim to precisely characterize the system performance in the large-scale scenario.

The analysis in the large-scale scenario became prominent in the last decade, with the emergence of its applications to load balancing in data centers and cloud networks. In the full-flexibility setup, the analysis of heterogeneous-server systems gained some attention. In this setting, Stolyar [24, 25] studied the zero-queueing property of the Join-Idle-Queue (JIQ) policy, Mukhopadhyay et al. [21] and Mukhopadhyay and Mazumdar [20] analyzed the JSQ($d$) policy in heterogeneous systems with the processor-sharing service discipline, Hurtado-Lange and Maguluri [13] studied the throughput and delay optimality properties of JSQ($d$), and Bhambay and Mukhopadhyay [3] studied a speed-aware JSQ policy. The above works on the JSQ($d$) policy observe that the stability region shrinks if the dispatcher applies the JSQ($d$) policy blindly. One way to mitigate this performance degradation is to take the server speeds into consideration while sampling servers or while assigning tasks to the sampled servers. Such a ‘hybrid JSQ($d$)’ scheme is able to recover the stability region. The current work can be contrasted with this approach. First, in the presence of data locality, both the server speeds and the underlying compatibility constraints need to be taken into account during the sampling procedure, and the approach becomes significantly more complicated. Second, we show how, by exploiting data locality, the blind JSQ($d$) policy can recover the stability region and even achieve the double-exponential decay of the tail probabilities of the steady-state queue length distribution. One advantage of the latter approach is that the dispatchers can be oblivious to the server speeds, which reduces the implementation complexity and also makes the system robust against changes to the servers (e.g., when servers are added or removed).

Recently, Allmeier and Gast [2] studied the application of (refined) mean-field approximations to heterogeneous systems. Their method uses an ODE to approximate the evolution of each server, and the approximation error vanishes as the system scales. However, this method cannot be directly used in our case. Due to the bipartite compatibility graph structure, it is hard to capture the interactions between two servers, which means that we cannot write the transition rates of the underlying Markov chain as [2] does. Also, an important assumption in their work is a finite buffer, whereas we consider the infinite-buffer case here.

The aspect of task-server compatibility constraints in large-scale load balancing and scheduling gained popularity only recently, as data locality became prominent in data centers and cloud networks. This led to many works in this area [11, 18, 5, 22, 33, 29, 28]. All these works consider homogeneous processing speeds at the servers. The initial works [30, 11] focused on certain fixed-degree graphs and showed that the flexibility to forward tasks to even a few neighbors with possibly shorter queues may significantly improve the waiting time performance as compared to dedicated arrival streams or a collection of independent M/M/1 queues. Tsitsiklis and Xu [28, 29] considered asymptotic optimality properties of the bipartite graph topology in an input-queued, dynamic scheduling framework. Later, in the (output-queued) load balancing setup, Mukherjee et al. [18] considered the JSQ policy and Budhiraja et al. [5] considered the transient analysis of the JSQ($d$) policy on non-bipartite graphs. The goal in these papers was to provide sufficient conditions on the graph sequence to asymptotically match the performance of a complete graph. Here we should mention that the non-bipartite graph model cannot be used to capture data locality constraints. In the presence of data locality constraints, the analysis of the JSQ($d$) policy for homogeneous systems, including both the transient behavior and the interchange of limits, was performed by Rutten and Mukherjee [22]. Weng et al. [33] were the first to consider the large-scale heterogeneous-server model under data locality. They showed that the Join-the-Fastest-Shortest-Queue (JFSQ) and Join-the-Fastest-Idle-Queue (JFIQ) policies achieve asymptotic optimality for minimizing the mean steady-state waiting time when the bipartite graph is sufficiently well connected. However, these results fall in the category of JSQ-type policies, where the asymptotic behavior is degenerate in the sense that the queue length at a server can be either 0 or 1. Naturally, the results and their analysis are very different from those for JSQ($d$)-type policies, where queues of any length are possible.

1.3 Notations

Let $\mathbb{N}_0=\mathbb{N}\cup\{0\}$. For a set $S$, its cardinality is denoted by $|S|$. For a Polish space $\mathcal{S}$, the space of right-continuous functions with left limits from $[0,\infty)$ to $\mathcal{S}$ is denoted by $\mathbb{D}([0,\infty),\mathcal{S})$, endowed with the Skorokhod topology. The distribution of an $\mathcal{S}$-valued random variable $X$ is denoted by $\mathcal{L}(X)$. For a function $f:[0,\infty)\rightarrow\mathbb{R}$, let $\lVert f\rVert_{*,t}\coloneqq\sup_{0\leq s\leq t}|f(s)|$. For $x\in\mathcal{S}$, the Dirac measure at the point $x$ is denoted by $\delta_x$. $\lVert\cdot\rVert_p$ denotes the $\ell_p$-norm. Define $\binom{X}{Y}=\frac{X(X-1)\cdots(X-Y+1)}{Y!}$ if $X\geq Y$, and $\binom{X}{Y}=0$ otherwise. RHS stands for right-hand side.

2 Model Description

The model below for large-scale systems with limited flexibility was considered by Tsitsiklis and Xu [28, 29] in the context of scheduling algorithms for input-queued systems. Subsequently, it was considered in [18, 5, 22, 33] for output-queued load balancing systems. Let $G^N=(W^N,V^N,E^N)$ be a system with $N$ single servers, each serving its own queue, and $W(N)$ dispatchers, where $W^N=\{1,\ldots,W(N)\}$ and $V^N=\{1,\ldots,N\}$ denote the sets of dispatchers and servers, respectively. Similar to [28, 29], we assume that $\lim_{N\rightarrow\infty}W(N)/N=\xi$, where $\xi>0$ is a constant. The set $E^N\subseteq W^N\times V^N$ of edges represents hard compatibility constraints between the dispatchers and the servers in the $N$-th system. In other words, tasks arriving at dispatcher $i$ can be assigned to a server $j$ if and only if $(i,j)\in E^N$. Tasks arriving at a dispatcher must be assigned instantaneously and irrevocably to one of the compatible servers.

  • Task types: A task can be of one of $K$ possible types labelled in $\mathcal{K}=\{1,\ldots,K\}$, and each dispatcher handles arrivals of exactly one task type. Thus, we will use the terms task-type and dispatcher-type interchangeably throughout the article. Let $W^N_k$ denote the set of all dispatchers handling type-$k$ tasks. As $N\to\infty$, assume that $|W^N_k|/W(N)\to w_k\in(0,1)$ for $k\in\mathcal{K}$ with $\sum_{k=1}^K w_k=1$. Tasks arrive at each dispatcher as an independent Poisson process with rate $\lambda$.

  • Server types: Each server belongs to one of $M$ possible types labelled in $\mathcal{M}=\{1,\ldots,M\}$. Let $V^N_m$ denote the set of type-$m$ servers; as $N\to\infty$, $|V^N_m|/N\to v_m\in(0,1)$ for $m\in\mathcal{M}$ with $\sum_{m=1}^M v_m=1$.

  • Service times: The processing time at a type-$m$ server is exponentially distributed with mean $1/u_m$, where $u_m$ is a positive constant. Throughout, we will assume that the system asymptotically has sufficient service capacity, in the sense that
\[
\lambda\xi<\sum_{m\in\mathcal{M}}u_m v_m. \tag{2.1}
\]
    Note that the left- and right-hand sides above represent the scaled total arrival rate and the scaled maximum departure rate, respectively.

For all the asymptotic results, we consider a general class of systems where the compatibility graph satisfies certain asymptotic criteria as specified in Condition 2.1 below. Define

\[
\begin{aligned}
\deg^N_w(i,m) &= \big|\{j\in V^N_m:(i,j)\in E^N\}\big|, &&i\in W^N,\ m\in\mathcal{M},\\
\deg^N_v(k,j) &= \big|\{i\in W^N_k:(i,j)\in E^N\}\big|, &&j\in V^N,\ k\in\mathcal{K}.
\end{aligned}
\]

Namely, $\deg^N_w(i,m)$ is the number of dispatcher $i$'s neighboring servers of type $m\in\mathcal{M}$. Similarly, $\deg^N_v(k,j)$ is the number of server $j$'s neighboring dispatchers of type $k\in\mathcal{K}$.

Condition 2.1.

The sequence $\{G^N\}_{N\geq 1}$ satisfies the following:

  (a) For each $k\in\mathcal{K}$ and $m\in\mathcal{M}$, let $E^N(k,m)=\{(i,j)\in W^N_k\times V^N_m:(i,j)\in E^N\}$. Then
\[
\lim_{N\rightarrow\infty}\frac{|E^N(k,m)|}{|W^N_k|\times|V^N_m|}=p_{k,m}\in[0,1]. \tag{2.2}
\]
    We call the matrix $\mathbf{p}=(p_{k,m},\,k\in\mathcal{K},\,m\in\mathcal{M})$ the compatibility matrix.

  (b) For each $k\in\mathcal{K}$ and $m\in\mathcal{M}$,
\[
\lim_{N\rightarrow\infty}\frac{\max_{i\in W^N_k}\deg^N_w(i,m)}{\min_{i\in W^N_k}\deg^N_w(i,m)}=1,\qquad \lim_{N\rightarrow\infty}\frac{\max_{j\in V^N_m}\deg^N_v(k,j)}{\min_{j\in V^N_m}\deg^N_v(k,j)}=1.
\]

Intuitively, the condition implies that the 'asymptotic density' of edges between type-$k$ dispatchers and type-$m$ servers is given by $p_{k,m}$, and that for each pair of task type and server type, the servers have similar levels of flexibility. The classical, well-studied setup where any task can be processed by any server corresponds to the complete bipartite graph with $p_{k,m}=1$ for all $k\in\mathcal{K}$, $m\in\mathcal{M}$. In Section 3.5, we show that for any given $\mathbf{p}:=(p_{k,m},\,k\in\mathcal{K},\,m\in\mathcal{M})$, a sequence of graphs satisfying Condition 2.1 can be obtained simply by placing edges suitably at random. This is a certain class of inhomogeneous random graphs, which we call irg($\mathbf{p}$); see Definition 3.15 for details. In fact, the irg($\mathbf{p}$) sequence of graphs will be proved to satisfy the required conditions for all the results of this article to hold.

State Space. In the $N$-th system, let $X^N_j(t)$ be the number of tasks (including those in service) in the queue of server $j\in V^N$ at time $t$. Let $q^N_{m,l}(t)$ be the proportion of type-$m$ servers with queue length at least $l$ at time $t$, namely,

\[
q^N_{m,l}(t)\coloneqq\frac{1}{|V^N_m|}\sum_{j\in V^N_m}\mathds{1}_{\{X^N_j(t)\geq l\}},\qquad t\geq 0,\ m\in\mathcal{M},\ l\in\mathbb{N}_0. \tag{2.3}
\]

Let $\mathbf{q}^N(t)=\big(q^N_{m,l}(t),\,m\in\mathcal{M},\,l\in\mathbb{N}_0\big)$. Then $\mathbf{q}^N\coloneqq\{\mathbf{q}^N(t)\}_{0\leq t<\infty}$ is a process with sample paths in $\mathbb{D}([0,\infty),\mathcal{S})$, where
\[
\mathcal{S}\coloneqq\Big\{\mathbf{q}\in[0,1]^{M\times\mathbb{N}_0}:\ q_{m,0}=1,\ q_{m,l}\geq q_{m,l+1},\ \text{and}\ \sum_{l\in\mathbb{N}_0}q_{m,l}<\infty,\ \forall\,m\in\mathcal{M},\ l\in\mathbb{N}_0\Big\}
\]
is equipped with the $\ell_1$-topology. Note that the space $\mathcal{S}$ is a complete metric space.
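To make the occupancy notation concrete, the following minimal Python sketch (the function and variable names are hypothetical, not from the paper) computes the vector $(q^N_{m,l})$ in (2.3) from a snapshot of queue lengths and server types.

```python
import numpy as np

def occupancy(queue_lengths, server_types, num_types, max_level):
    """Compute q[m, l] = fraction of type-m servers with queue length >= l, as in (2.3).
    Rows are server types m = 0,...,num_types-1; columns are levels l = 0,...,max_level.
    By construction q[m, 0] = 1."""
    queue_lengths = np.asarray(queue_lengths)
    server_types = np.asarray(server_types)
    q = np.zeros((num_types, max_level + 1))
    for m in range(num_types):
        x = queue_lengths[server_types == m]
        for l in range(max_level + 1):
            q[m, l] = np.mean(x >= l)
    return q

# Toy snapshot: 6 servers of two types with the queue lengths shown.
print(occupancy([0, 2, 1, 3, 0, 1], [0, 0, 0, 1, 1, 1], num_types=2, max_level=3))
```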

Local JSQ($d$) Policy. For any fixed $d\geq 2$, each dispatcher uses the JSQ($d$) policy [17, 32] to assign the incoming tasks to servers. To describe the policy, define the neighborhood of dispatcher $i\in W^N$ as $\mathcal{N}^N_w(i):=\{j\in V^N:(i,j)\in E^N\}$, with $\delta^N_i=|\mathcal{N}^N_w(i)|$. When a new task arrives at a dispatcher $i\in W^N$ with $\delta^N_i\geq d$, it is immediately assigned to the server with the shortest queue among $d$ servers selected uniformly at random from $\mathcal{N}^N_w(i)$. Ties are broken uniformly at random. If $\delta^N_i<d$, then the task is assigned to one server selected from $\mathcal{N}^N_w(i)$ uniformly at random. This $\delta^N_i<d$ scenario is asymptotically irrelevant for us, since all the graphs we consider have diverging degrees as $N\to\infty$.
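For illustration, here is a minimal Python sketch of the local JSQ($d$) assignment rule just described; the data structures and names are hypothetical and assume the neighborhoods $\mathcal{N}^N_w(i)$ have been precomputed.

```python
import random

def assign_local_jsq_d(dispatcher, neighbors, queue_lengths, d=2, rng=random):
    """Route one task arriving at `dispatcher` under the local JSQ(d) policy.

    neighbors[i]: list of servers compatible with dispatcher i (the set N_w(i));
    queue_lengths[j]: current queue length at server j (mutable list).
    Returns the chosen server and increments its queue length.
    """
    nbrs = neighbors[dispatcher]
    if len(nbrs) >= d:
        sampled = rng.sample(nbrs, d)                 # d distinct compatible servers
        shortest = min(queue_lengths[j] for j in sampled)
        ties = [j for j in sampled if queue_lengths[j] == shortest]
        chosen = rng.choice(ties)                     # ties broken uniformly at random
    else:
        chosen = rng.choice(nbrs)                     # degenerate case delta_i < d
    queue_lengths[chosen] += 1
    return chosen

# Tiny usage example: dispatcher 0 is compatible with servers 0, 1, 2.
queues = [3, 1, 1]
print(assign_local_jsq_d(0, {0: [0, 1, 2]}, queues, d=2), queues)
```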

3 Main Results

3.1 Mitigating the Stability Issue

As discussed earlier, when the server speeds are heterogeneous, fully flexible systems (with the complete bipartite compatibility graph) may not be stable under the JSQ($d$) policy, even if the sufficient service capacity condition (2.1) is satisfied. The next lemma provides a necessary and sufficient condition for ergodicity of the queue length process. Recall $\delta^N_i=|\mathcal{N}^N_w(i)|$. For any fixed $N$, define

\[
\rho^N\coloneqq\max_{\substack{U\subseteq V^N\\ U\neq\emptyset}}\Bigg\{\Big(\sum_{j\in U}\sum_{m\in\mathcal{M}}\mathds{1}_{\{j\in V^N_m\}}u_m\Big)^{-1}\sum_{i\in W^N}\Big(\mathds{1}_{\{\delta^N_i\geq d\}}\sum_{\substack{S\subseteq(U\cap\mathcal{N}^N_w(i)):\\ |S|=d}}\frac{\lambda}{\binom{\delta^N_i}{d}}+\mathds{1}_{\{\delta^N_i<d\}}\,\frac{\lambda\,|U\cap\mathcal{N}^N_w(i)|}{\delta^N_i}\Big)\Bigg\}. \tag{3.1}
\]
Lemma 3.1.

The queue length process $\big(X^N_j(t)\big)_{j\in V^N}$ under the local JSQ($d$) policy is ergodic if and only if $\rho^N<1$.

The above lemma is an immediate consequence of [9, Theorem 2.5]; see also [4]. We omit its proof. Intuitively, $\rho^N<1$ means that, in the $N$-th system, for any subset $U$ of servers with possibly long queues (compared to the rest of the servers), the total rate at which tasks are assigned to servers in this set must be less than the rate of departure from this set.

Since we are interested in the large-$N$ behavior, we will assume a certain asymptotic version of the above stability criterion. This is fairly standard in large-system analysis, as one wants to avoid the 'heavy-traffic' regime where $\rho^N\uparrow 1$ as $N\to\infty$. The behavior in the latter scenario is typically qualitatively different from that in the so-called 'subcritical' regime defined below.

Definition 3.2 (Subcritical Regime).

The sequence $\{G^N\}_N$ of systems defined as above is said to be in the subcritical regime with asymptotic load $\rho<1$ if $\rho^N\to\rho<1$ as $N\to\infty$.

Throughout this paper, we will assume that the sequence of systems under consideration is in the subcritical regime. From Lemma 3.1, it is immediate that if a sequence of systems is in the subcritical regime, then its queue length process is ergodic for all large enough $N$. The potential non-ergodicity of fully flexible, heterogeneous server clusters brings us to the following question: when the sufficient service capacity condition (2.1) is satisfied, can we design the underlying compatibility structure carefully so that the queue length process is ergodic? In other words, can we regain the stability region? Proposition 3.3 below shows that this is indeed the case. In some sense, this highlights the first-order improvement (i.e., in terms of stability properties) of a careful compatibility structure design in contrast to a fully flexible system.

Proposition 3.3.

Let the parameters $\lambda,\xi,d$ and $w_k,v_m,u_m$, $k\in\mathcal{K}$, $m\in\mathcal{M}$, be such that (2.1) is satisfied. Then there exists $(p_{k,m})_{k\in\mathcal{K},m\in\mathcal{M}}\in[0,1]^{K\times M}$ such that, for any sequence of systems $\{G^N\}_{N\geq 1}$ satisfying Condition 2.1, the queue length process $\big(X^N_j(t)\big)_{j\in V^N}$ is ergodic for all $N$ large enough. Moreover, such a $(p_{k,m})_{k\in\mathcal{K},m\in\mathcal{M}}$ can be obtained explicitly by solving a set of inequalities.

In the following sections, we will demonstrate how, in addition to this first-order improvement, the asymptotic queue length distribution can be improved as well, for example, in terms of having a double-exponential decay of tail probabilities.

The proof of Proposition 3.3 is provided in Appendix A. It relies on first building a simple criterion involving the system parameters which, for sequences of systems satisfying Condition 2.1, ensures stability for all large enough $N$ (Lemma 3.4). Then we show that, given the other parameters, a value of $(p_{k,m})_{k\in\mathcal{K},m\in\mathcal{M}}$ satisfying this criterion can be found by checking the feasibility region defined by $M$ inequalities.

We end this subsection by presenting the above-mentioned simple asymptotic criterion for subcriticality. The proof is given in Appendix A. Denote $\delta_k\coloneqq\sum_{m\in\mathcal{M}}p_{k,m}v_m$ for each $k\in\mathcal{K}$.

Lemma 3.4.

Let $\{G^N\}_N$ be a sequence satisfying Condition 2.1. The sequence of systems is in the subcritical regime if

\[
\frac{\lambda\xi}{u_m}\sum_{k\in\mathcal{K}}\frac{w_k p_{k,m}}{\delta_k}<1,\qquad\text{for all } m\in\mathcal{M}. \tag{3.2}
\]
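For illustration, the following sketch (with hypothetical parameter values) checks criterion (3.2) for a two-type example: under full flexibility the slow server pool is overloaded, while a sparser compatibility matrix that limits access to the slow servers makes the system subcritical.

```python
import numpy as np

def subcritical(lam, xi, w, v, u, p):
    """Evaluate the left-hand side of (3.2), (lam*xi/u_m) * sum_k w_k p_{k,m} / delta_k,
    for every server type m, where delta_k = sum_m p_{k,m} v_m.
    Returns the per-type loads and whether all of them are below 1."""
    w, v, u, p = map(np.asarray, (w, v, u, p))
    delta = p @ v                               # delta_k, one entry per task type k
    load = lam * xi * ((w / delta) @ p) / u     # one entry per server type m
    return load, bool(np.all(load < 1))

# Hypothetical example: fast servers (u = 2.0) and slow servers (u = 0.5), equal fractions,
# two task types with equal shares; total capacity 1.25 exceeds lambda*xi = 1.1, so (2.1) holds.
lam, xi = 1.1, 1.0
w, v, u = [0.5, 0.5], [0.5, 0.5], [2.0, 0.5]

print(subcritical(lam, xi, w, v, u, p=[[1.0, 1.0], [1.0, 1.0]]))  # fully flexible: slow pool overloaded
print(subcritical(lam, xi, w, v, u, p=[[1.0, 0.2], [1.0, 0.2]]))  # sparser access to slow servers: subcritical
```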

3.2 Process-level Limit: IID Case

Our first main result characterizes the process-level limit of the queue-length processes $\big(X^N_j,\,j\in V^N\big)$ as $N\to\infty$, when the starting states $\{X^N_j(0): j\in V^N_m\}$ are i.i.d. for all $m\in\mathcal{M}$ and independent across different $m$-values. When the sequence of graphs $\{G^N\}_N$ satisfies a stronger condition, called clustered proportional sparsity (Definition 3.8), the i.i.d. condition can be removed. This is the content of Section 3.3.

Now, note that for a fixed $N\geq 1$, $\{X^N_j: j\in V^N\}$ is a system of $N$ stochastic processes with mean-field type interactions. Exploiting tools from the theory of weakly interacting particles, we show in Theorem 3.5 below that, as the system size becomes large, the queue-length processes converge weakly to those of an infinite system of independent McKean-Vlasov processes $\{X_j: j\in\mathbb{N}\}$ (see, e.g., [26, 15]). In fact, using a suitable coupling described in more detail in Section 4.1, the convergence holds in $L_2$. For ease of describing such processes and the coupling, although the model description only specifies the asymptotic fractions of servers of each type, it will be convenient in this subsection to fix the type of each server $j\in\mathbb{N}$ by defining a membership map $\mathbf{M}:\mathbb{N}\rightarrow\mathcal{M}$, so that $V^N_m=\{j\in V^N:\mathbf{M}(j)=m\}$ with $\lim_{N\rightarrow\infty}|V^N_m|/N=v_m$ and $V_m=\lim_{N\rightarrow\infty}V^N_m$ for each $m\in\mathcal{M}$. With such fixed server types and $X^N_j(0)\equiv X_j(0)$, let

\[
\begin{aligned}
X_j(t) &= X_j(0)-\int_0^t\mathds{1}_{\{X_j(s-)>0\}}\,D_j(ds)+\int_{[0,t]\times\mathbb{R}_+}\mathds{1}_{\{0\leq y\leq C_j(s-)\}}\,A_j(ds\,dy), &&\text{(3.3)}\\
C_j(t) &= d\,\xi\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_k}{\delta_k}\sum_{(M_2,\ldots,M_d)\in\mathcal{M}^{d-1}}h_t(j,M_2,\ldots,M_d), &&\text{(3.4)}
\end{aligned}
\]

where $\mathbf{M}(j)=m$ and

\[
\begin{aligned}
h_t(j,M_2,\ldots,M_d) &= \prod_{h=2}^{d}\frac{v_{M_h}p_{k,M_h}}{\delta_k}\int_{\mathbb{N}^{d-1}}b\big(X_j(t),x_{j_2},\ldots,x_{j_d}\big)\,\mu^{M_2}_t(dx_{j_2})\cdots\mu^{M_d}_t(dx_{j_d}),\\
b(\mathbf{x}) &= b(x_1,\ldots,x_d)\coloneqq\sum_{r=1}^{d}\frac{1}{r}\mathds{1}_{\{x_1=\min_{j\in[d]}\mathbf{x},\,|\operatorname{arg\,min}\mathbf{x}|=r\}},\qquad\mathbf{x}=(x_1,\ldots,x_d)\in\mathbb{N}_0^{d}, &&\text{(3.5)}\\
\mu^m_t &= \mathcal{L}\big(X_i(t)\big),\qquad\forall\,i\in V_m,\ m\in\mathcal{M},\ t\geq 0.
\end{aligned}
\]

Here, $\{D_j: j\in V_m\}$ are i.i.d. Poisson processes with rate $u_m$ for each $m\in\mathcal{M}$, $\{A_j: j\in\mathbb{N}\}$ are i.i.d. Poisson random measures on $[0,\infty)\times\mathbb{R}_+$ with intensity $\lambda\,ds\,dy$, and all the $D_j$'s and $A_j$'s are independent. Loosely speaking, $A_j$ corresponds to the arrival process and $D_j$ to the departure process at server $j$. We note that the existence and uniqueness of solutions to (3.3) and (3.4) can be proved by standard arguments (see, e.g., [26, 15]) using the boundedness and Lipschitz property of the functions $b$ and $x\mapsto\mathds{1}_{\{x>0\}}$ on $\mathbb{N}_0$.
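For concreteness, here is a minimal sketch of the tie-breaking weight $b$ in (3.5): it equals $1/r$ when the first coordinate attains the minimum jointly with $r-1$ other coordinates, and $0$ otherwise.

```python
def b(x):
    """Tie-breaking weight b(x_1,...,x_d) from (3.5): the probability that the first
    sampled server receives the task when ties are broken uniformly at random."""
    m = min(x)
    if x[0] != m:
        return 0.0
    return 1.0 / sum(1 for xi in x if xi == m)

print(b((1, 3, 2)))   # 1.0: unique minimum at the first coordinate
print(b((1, 1, 2)))   # 0.5: two-way tie
print(b((2, 1, 3)))   # 0.0: first coordinate is not the minimum
```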

Theorem 3.5 (Convergence to McKean-Vlasov process and propagation of chaos).

Consider any fixed $\mathbf{q}^\infty=\big(q^\infty_{m,l},\,m\in\mathcal{M},\,l\in\mathbb{N}_0\big)\in\mathcal{S}$. Assume that all the $X^N_j(0)$'s are independent and, for each $m\in\mathcal{M}$, $\{X^N_j(0): j\in V^N_m\}$ are i.i.d. with $\mathbb{P}\big(X^N_j(0)\geq l\big)=q^\infty_{m,l}$, $l\in\mathbb{N}_0$. On any finite time interval $[0,T]$, $T>0$, for any $m\in\mathcal{M}$ and $j\in V_m$, the queue length process $X^N_j(\cdot)$ at server $j$ converges weakly to the process $X_j(\cdot)$ in (3.3). In fact, one can suitably couple $X^N_j$ with $X_j$ such that

\[
\max_{j\in V^N}\mathbb{E}\big\lVert X^N_j-X_j\big\rVert^2_{*,T}\xrightarrow{N\rightarrow\infty}0, \tag{3.6}
\]

and hence the propagation of chaos property holds, that is, for any $n\in\mathbb{N}$ and distinct $j_h\in V_{M_h}$, $h=1,\ldots,n$,

\[
\mathcal{L}\big(X^N_{j_1},\ldots,X^N_{j_n}\big)\xrightarrow{N\rightarrow\infty}\mathcal{L}\big(X_{j_1},\ldots,X_{j_n}\big)=\mu^{M_1}\otimes\cdots\otimes\mu^{M_n}. \tag{3.7}
\]

Theorem 3.5 gives us the limit law of all individual queues. Next, in Theorem 3.6, we show how such a server-level convergence can be used to obtain a convergence result for the global occupancy process $\mathbf{q}^N(\cdot)$ to a deterministic dynamical system, which was our primary goal. The proofs of Theorems 3.5 and 3.6 are provided in Section 4.

Theorem 3.6 (Process-level convergence for i.i.d. starting state).

Assume that all the $X^N_j(0)$'s are independent and, for each $m\in\mathcal{M}$, $\{X^N_j(0): j\in V^N_m\}$ are i.i.d. with $\mathbb{P}\big(X^N_j(0)\geq l\big)=q^\infty_{m,l}$, $l\in\mathbb{N}_0$, for some $\mathbf{q}^\infty=(q^\infty_{m,l},\,m\in\mathcal{M},\,l\in\mathbb{N}_0)\in\mathcal{S}$. Then on any finite time interval, the occupancy process $\mathbf{q}^N(\cdot)$ converges weakly, with respect to the Skorokhod $J_1$ topology, to the deterministic limit $\mathbf{q}(\cdot)\coloneqq(q_{m,l}(\cdot),\,m\in\mathcal{M},\,l\in\mathbb{N}_0)$ given by the unique solution to the following system of ODEs: for all $m\in\mathcal{M}$, $q_{m,0}(t)=1$, $q_{m,l}(0)=q^\infty_{m,l}$, and
\[
\frac{dq_{m,l}(t)}{dt} = -u_m\big(q_{m,l}(t)-q_{m,l+1}(t)\big) + \lambda\xi\big(q_{m,l-1}(t)-q_{m,l}(t)\big)\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_k}{\delta_k}\,\frac{(\tilde q_{k,l-1}(t))^d-(\tilde q_{k,l}(t))^d}{\tilde q_{k,l-1}(t)-\tilde q_{k,l}(t)},\qquad\forall\,l\in\mathbb{N}, \tag{3.8}
\]
where $\tilde q_{k,l}(t)=\sum_{m\in\mathcal{M}}\frac{v_m p_{k,m}}{\delta_k}\,q_{m,l}(t)$ for all $k\in\mathcal{K}$.

Remark 3.7.

Using the propagation of chaos property (3.7) and the fact that the processes $\{X_j(t): j\in\mathbb{N}\}$ are independent and $\{X_j(t): j\in V_m\}$ are i.i.d. for each $m\in\mathcal{M}$, it follows that the limit of the global occupancy process at any time instant $t$ corresponds, in fact, to the laws of $X_j(t)$ in (3.3) for each type of server $j$, that is,
\[
\mu^m_t[l,\infty)=\mathbb{P}(X_j(t)\geq l)=q_{m,l}(t),\qquad j\in V_m,\ m\in\mathcal{M},\ l\in\mathbb{N}_0,\ t\geq 0.
\]
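To get a feel for the limit dynamics, the sketch below numerically integrates the ODE system (3.8) with a crude Euler scheme, truncating queue lengths at a finite level; all parameter values are hypothetical (the same two-type example as in the stability check above).

```python
import numpy as np

def drift(q, lam, xi, w, v, u, p, d=2):
    """Right-hand side of the mean-field ODE (3.8); queue lengths are truncated at
    level L = q.shape[1] - 1, i.e., q[:, L+1] is treated as 0."""
    K, M = p.shape
    L = q.shape[1] - 1
    delta = p @ v                                         # delta_k = sum_m p_{k,m} v_m
    q_tilde = (p * v / delta[:, None]) @ q                # \tilde q_{k,l}, shape (K, L+1)
    q_next = np.hstack([q[:, 1:], np.zeros((M, 1))])      # q_{m,l+1}
    dq = np.zeros_like(q)
    for l in range(1, L + 1):
        a, b = q_tilde[:, l - 1], q_tilde[:, l]
        # difference quotient (a^d - b^d)/(a - b), with limit d*a^(d-1) when a == b
        ratio = np.where(np.isclose(a, b), d * a ** (d - 1),
                         (a ** d - b ** d) / np.where(np.isclose(a, b), 1.0, a - b))
        arrivals = lam * xi * (q[:, l - 1] - q[:, l]) * ((w / delta * ratio) @ p)
        dq[:, l] = -u * (q[:, l] - q_next[:, l]) + arrivals
    return dq

# Hypothetical two-type example (same parameters as the stability check above).
lam, xi, d, L = 1.1, 1.0, 2, 15
w, v, u = np.array([0.5, 0.5]), np.array([0.5, 0.5]), np.array([2.0, 0.5])
p = np.array([[1.0, 0.2], [1.0, 0.2]])

q = np.zeros((2, L + 1)); q[:, 0] = 1.0          # start from an empty system
dt = 0.05
for _ in range(10000):                           # Euler steps; run long enough to approach the fixed point
    q = np.clip(q + dt * drift(q, lam, xi, w, v, u, p, d), 0.0, 1.0)
print(np.round(q[:, :5], 4))                     # approximate fixed-point tails q*_{m,l}
```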

3.3 Process-level Limit: General Case

Theorem 3.6 requires the strong assumption that, for each $m\in\mathcal{M}$, the initial states $X^N_j(0)$, $j\in V^N_m$, are i.i.d. In order to argue the interchange of limits, we need to relax this assumption on the initial states. This is because the argument for the interchange of limits involves initiating the prelimit system at its steady state and then showing that, as $N\to\infty$, the system must converge to the unique fixed point of the limiting ODE. This requires us to characterize the (process-level) limiting trajectory of the system starting from an arbitrary occupancy state, which we achieve in this section.

Intuitively, the i.i.d. assumption in Theorems 3.5 and 3.6 ensures that the local occupancy observed by any dispatcher $i\in W^N_k$, $k\in\mathcal{K}$, is 'close', in a suitable sense, to the average occupancy of the entire system. This can be ensured asymptotically, even without the i.i.d. assumption, if the graph sequence satisfies a property that we call clustered proportional sparsity. This notion was first introduced for homogeneous systems in [22]. The definition below is a modified notion suitable for the current heterogeneous setting.

Definition 3.8 (Clustered Proportional Sparsity).

Recall $\mathcal{N}^N_w(i)=\{j\in V^N:(i,j)\in E^N\}$. The sequence $\{G^N\}_N$ is called clustered proportionally sparse if for any $\varepsilon>0$,
\[
\sup_{k\in\mathcal{K}}\sup_{U\subseteq V^N}\frac{1}{|W^N_k|}\,\Big|\Big\{i\in W^N_k:\Big|\frac{|\mathcal{N}^N_w(i)\cap U|}{|\mathcal{N}^N_w(i)|}-\frac{|E^N_k(U)|}{|E^N_k(V^N)|}\Big|\geq\varepsilon\Big\}\Big|\xrightarrow{N\rightarrow\infty}0, \tag{3.9}
\]
where $E^N_k(U)\coloneqq\{(i,j)\in W^N_k\times U:(i,j)\in E^N\}$.

Remark 3.9.

We can view the subset $U$ in the definition as a test set, say $U=\mathcal{Q}^N_{m,l}(t)$, where $\mathcal{Q}^N_{m,l}(t)$ is the set of type-$m$ servers with queue length at least $l\in\mathbb{N}_0$ at time $t$. Hence, Definition 3.8 ensures that, for all but $o(N)$ dispatchers, the empirical queue length distribution observed within a dispatcher's neighborhood is close to the global weighted empirical queue length distribution (Definition 4.6) of its corresponding type. Then the global occupancy process evolves similarly to (and converges to the same limit as) the case where the initial states are i.i.d.
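As a quick empirical illustration of (3.9), the sketch below (with hypothetical sizes and a single task type) samples an irg-type random bipartite graph, picks one random test set $U$ of servers, and estimates the fraction of dispatchers whose local fraction of neighbors in $U$ deviates from the global edge fraction by more than $\varepsilon$; for large dense random graphs this fraction is small.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single-type example: A[i, j] = 1 iff dispatcher i and server j are compatible.
W, N, eps = 800, 2000, 0.06
A = (rng.random((W, N)) < 0.3).astype(int)       # irg-type graph with edge probability 0.3

U = (rng.random(N) < 0.3).astype(int)            # one random test set U of servers
local = (A @ U) / A.sum(axis=1)                  # |N_w(i) ∩ U| / |N_w(i)| for each dispatcher i
global_frac = (A @ U).sum() / A.sum()            # |E(U)| / |E(V)|
bad_fraction = np.mean(np.abs(local - global_frac) >= eps)
print(bad_fraction)                              # typically a very small fraction of dispatchers
```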

Theorem 3.10 (Process-level convergence).

Let $\{G^N\}_N$ be a clustered proportionally sparse sequence of graphs. Assume that $\mathbf{q}^N(0)$ converges weakly to $\mathbf{q}^\infty\in\mathcal{S}$. Then on any finite time interval, the occupancy process $\mathbf{q}^N(\cdot)$ converges weakly, with respect to the Skorokhod $J_1$ topology, to the deterministic limit $\mathbf{q}(\cdot)\coloneqq(q_{m,l}(\cdot),\,m\in\mathcal{M},\,l\in\mathbb{N}_0)$ given by the unique solution to the system of ODEs (3.8) with initial state $\mathbf{q}(0)=\big(q^\infty_{m,l},\,m\in\mathcal{M},\,l\in\mathbb{N}_0\big)$.

The proof of Theorem 3.10 is given in Section 4.4.

3.4 Convergence of Steady States

In the last section, we showed the process-level convergence of the global occupancy process $\mathbf{q}^N(\cdot)$ to a mean-field limit $\mathbf{q}(\cdot)$. In this section, we establish the convergence of the sequence of stationary distributions to the unique fixed point of the mean-field limit by establishing the interchange of the large-$N$ and large-$t$ limits: $\lim_{t\rightarrow\infty}\lim_{N\rightarrow\infty}\mathbf{q}^N(t)=\lim_{N\rightarrow\infty}\lim_{t\rightarrow\infty}\mathbf{q}^N(t)$. Throughout this section, we will assume that the sequence of systems is in the subcritical regime (recall Definition 3.2). The first result below states that the limiting system of ODEs has a unique fixed point $\mathbf{q}^*$ and that it satisfies the global stability property, i.e., for any initial point $\mathbf{q}(0)\in\mathcal{S}$, $\lim_{t\rightarrow\infty}\mathbf{q}(t)=\mathbf{q}^*$.

Theorem 3.11 (Global stability).

Let $\bar{\mathbf{q}}(t,\mathbf{q}_0)$ be the solution to the system of ODEs (3.8) with initial point $\mathbf{q}(0)=\mathbf{q}_0\in\mathcal{S}$. Then there exists a unique fixed point $\mathbf{q}^*=\big(q^*_{m,l},\,m\in\mathcal{M},\,l\in\mathbb{N}_0\big)\in\mathcal{S}$ such that
\[
\lim_{t\rightarrow\infty}\bar{\mathbf{q}}(t,\mathbf{q}_0)=\mathbf{q}^*.
\]

The proof of Theorem 3.11 is given in Section 5. It relies on a monotonicity property of the system, which ensures that for two solutions $\mathbf{q}^1(\cdot)$ and $\mathbf{q}^2(\cdot)$, if $\mathbf{q}^1(0)\leq\mathbf{q}^2(0)$, then $\mathbf{q}^1(t)\leq\mathbf{q}^2(t)$ for all $t\geq 0$ (see [24, 14]).

The last ingredient that we need in order to prove the interchange of limits is the tightness of the sequence of random variables $\{\mathbf{q}^N(\infty)\}_{N\geq 1}$ under a suitable metric, where $\mathbf{q}^N(\infty):=\lim_{t\to\infty}\mathbf{q}^N(t)$. Here, as before, we should note that the process $(\mathbf{q}^N(t))_{t\geq 0}$ is not Markovian. That is why the random variable $\mathbf{q}^N(\infty)$ should be interpreted as the functional applied to the steady-state system. The tightness result is stated in the next theorem.

Theorem 3.12 (Tightness).

For any $\varepsilon>0$, there exists a compact subset $\bar{K}(\varepsilon)\subseteq\mathcal{S}$, with $\mathcal{S}$ equipped with the $\ell_1$-topology, such that
\[
\mathbb{P}\big(\mathbf{q}^N(\infty)\notin\bar{K}(\varepsilon)\big)<\varepsilon,\qquad\forall\,N\geq 1.
\]

Theorem 3.12 is proved in Section 5. The key idea is to use a Lyapunov function approach to bound the expected sum of the tails $q^N_{m,l}(\infty)$. Combining Theorems 3.10, 3.11, and 3.12, we can prove the following interchange of limits result.

Theorem 3.13 (Convergence of steady states).

Let $\{G^N\}_{N\geq 1}$ be a clustered proportionally sparse sequence of graphs satisfying Condition 2.1. Then the sequence of random variables $\{\mathbf{q}^N(\infty)\}_{N\geq 1}$ converges weakly to $\mathbf{q}^*$, the unique fixed point of the system of ODEs (3.8).

One major discovery about the JSQ($d$) policy for the classical, homogeneous, fully flexible system is that the limit of the stationary distribution (which, in our case, is given by $\mathbf{q}^*$) has a doubly exponential tail decay [17, 32] for any $d\geq 2$. This is in sharp contrast with the (single) exponential decay of the corresponding tail for random routing, i.e., $d=1$. In fact, in this case, for any $d\geq 2$, $\mathbf{q}^*$ can be characterized explicitly as $q^*_l=\lambda^{\frac{d^l-1}{d-1}}$, where $q^*_l$ is the (limiting) steady-state fraction of servers with queue length at least $l=1,2,\ldots$. In the current case of heterogeneous systems, it is intractable to characterize the fixed point $\mathbf{q}^*$ explicitly. However, as stated in the next theorem, we can still prove that the tails $q^*_{m,l}$ decay doubly exponentially for each $m\in\mathcal{M}$.
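As a quick sanity check of the homogeneous benchmark, the snippet below evaluates $q^*_l=\lambda^{(d^l-1)/(d-1)}$ and contrasts it with the exponential tail $\lambda^l$ of random routing ($d=1$); the parameter values are only illustrative.

```python
lam, d = 0.9, 2
for l in range(1, 7):
    jsq_d_tail = lam ** ((d ** l - 1) / (d - 1))   # doubly exponential decay under JSQ(d)
    random_routing_tail = lam ** l                 # (single) exponential decay for d = 1
    print(l, round(jsq_d_tail, 8), round(random_routing_tail, 8))
```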

Theorem 3.14 (Double-exponential tail decay).

Let $\mathbf{q}^*=\big(q^*_{m,l},\,m\in\mathcal{M},\,l\in\mathbb{N}_0\big)$ be the unique fixed point of the system of ODEs (3.8). Then, for all $m\in\mathcal{M}$, the sequence $\{q^*_{m,l},\,l\in\mathbb{N}_0\}$ decays doubly exponentially, i.e., there exist constants $l_m\in\mathbb{N}_0$, $a_m\in(0,1)$, and $b_m>0$ such that for all $l\geq l_m$,
\[
q^*_{m,l}\leq b_m a_m^{d^l}. \tag{3.10}
\]

3.5 Simple Data Locality Design using Randomization

Sections 3.1–3.4 characterize the performance of the occupancy process for arbitrary deterministic sequences of systems whose underlying graph sequences satisfy certain properties. In particular, Condition 2.1 and Definition 3.8 provide sufficient criteria under which both the process-level convergence (Theorem 3.10) and the interchange of limits (Theorem 3.13) hold. In this section, we show that graphs satisfying the required criteria can be obtained easily if the compatibility graph is designed suitably at random. Given the asymptotic edge-density parameters in Condition 2.1, we define a certain sequence of inhomogeneous random graphs, or irg, as follows.

Definition 3.15 (irg($\mathbf{p}$)).

Given $\mathbf{p}\coloneqq(p_{k,m},\,k\in\mathcal{K},\,m\in\mathcal{M})$, the $N$-th system of irg($\mathbf{p}$) is constructed as follows: for any $k\in\mathcal{K}$ and $m\in\mathcal{M}$, dispatcher $i\in W^N_k$ and server $j\in V^N_m$ share an edge with probability $p_{k,m}$, independently of all other pairs.
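A minimal sketch (with hypothetical sizes) of sampling the $N$-th irg($\mathbf{p}$) compatibility graph of Definition 3.15.

```python
import numpy as np

def sample_irg(p, dispatcher_types, server_types, rng=None):
    """Sample the bipartite graph of Definition 3.15: dispatcher i of type k and server j
    of type m share an edge independently with probability p[k, m].
    Returns the 0/1 adjacency matrix of shape (num_dispatchers, num_servers)."""
    rng = rng or np.random.default_rng()
    p = np.asarray(p)
    edge_prob = p[np.asarray(dispatcher_types)][:, np.asarray(server_types)]
    return (rng.random(edge_prob.shape) < edge_prob).astype(int)

# Hypothetical example: 6 dispatchers (types 0/1) and 10 servers (types 0/1).
p = [[1.0, 0.2], [0.5, 0.8]]
A = sample_irg(p, dispatcher_types=[0, 0, 0, 1, 1, 1],
               server_types=[0] * 5 + [1] * 5, rng=np.random.default_rng(1))
print(A)
```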

For any $\mathbf{p}$ for which the asymptotic stability criterion holds, we have the following result for the sequence of irg($\mathbf{p}$) graphs.

Theorem 3.16.

Let $\mathbf{p}=(p_{k,m},\,k\in\mathcal{K},\,m\in\mathcal{M})$ be such that the stability criterion (3.2) holds, and let $\{G^N\}_{N\geq 1}$ be a sequence of irg($\mathbf{p}$) graphs with increasing $N$. Then the conclusions of Theorems 3.10 and 3.13 hold for $\{G^N\}_{N\geq 1}$.

The proof of Theorem 3.16 is provided in Appendix I. It relies on verifying that the sequence of irg($\mathbf{p}$) graphs satisfies Condition 2.1 and the clustered proportional sparsity property almost surely. The verification uses concentration-of-measure arguments to establish structural properties of the compatibility graphs.

4 Proof of Transient Limit Results

In this section, we prove the transient limit results, Theorems 3.5, 3.6, and 3.10, in Sections 4.2, 4.3, and 4.4, respectively. We start by proving a few auxiliary results in Section 4.1.

4.1 Auxiliary Results

First, we will need a characterization of the evolution of the queue length process at each server. To describe this evolution, let us introduce the following notation:

\[
\begin{aligned}
\texttt{set}^N(j) &\coloneqq \Big\{(j_2,\ldots,j_d)\in[N]^{d-1}:(j,j_2,\ldots,j_d)\text{ are distinct}\Big\}, &&\text{(4.1)}\\
\texttt{sett}^N(j) &\coloneqq \Big\{(j_2,\ldots,j_d,j'_2,\ldots,j'_d)\in[N]^{2d-2}:(j_2,\ldots,j_d)\in\texttt{set}^N(j),\ (j'_2,\ldots,j'_d)\in\texttt{set}^N(j),\\
&\hphantom{\coloneqq\Big\{}\ (j_2,\ldots,j_d)\cap(j'_2,\ldots,j'_d)\neq\emptyset\Big\}. &&\text{(4.2)}
\end{aligned}
\]

To represent the graph, define the edge occupancy $\xi^N_{i,j}$ to be the binary variable

\[
\xi^N_{i,j}=\begin{cases}1,&\text{if }(i,j)\in E^N,\\ 0,&\text{otherwise},\end{cases}\qquad\text{for all } i\in W^N,\ j\in V^N.
\]

Recall the function $b$, the Poisson processes $\{D_j\}$, and the Poisson random measures $\{A_j\}$ defined in and below (3.5). By Condition 2.1, for all large enough $N$, all dispatchers in the $N$-th system have at least $d$ neighbors. Hence, without loss of generality, in the rest of this section we will only consider the case $\delta^N_i\geq d$ for all $i\in W^N$. In that case, due to the Poisson thinning property, note that we can write $X^N_j(t)$ as follows:

\[
X^N_j(t)=X^N_j(0)-\int_0^t\mathds{1}_{\{X^N_j(s-)>0\}}\,D_j(ds)+\int_{[0,t]\times\mathbb{R}_+}\mathds{1}_{\{0\leq y\leq C^N_j(s-)\}}\,A_j(ds\,dy), \tag{4.3}
\]

where

\[
\begin{aligned}
C^N_j(s) &= \sum_{i\in W^N}\xi^N_{i,j}\sum_{(j_2,\ldots,j_d)\in\texttt{set}^N(j)}\frac{\xi^N_{i,j_2}\cdots\xi^N_{i,j_d}}{\binom{\delta^N_i}{d}(d-1)!}\,b\big(X^N_j(s),X^N_{j_2}(s),\ldots,X^N_{j_d}(s)\big) &&\text{(4.4)}\\
&= \sum_{k\in\mathcal{K}}\sum_{i\in W^N_k}\xi^N_{i,j}\sum_{(j_2,\ldots,j_d)\in\texttt{set}^N(j)}\frac{\xi^N_{i,j_2}\cdots\xi^N_{i,j_d}}{\binom{\delta^N_i}{d}(d-1)!}\,b\big(X^N_j(s),X^N_{j_2}(s),\ldots,X^N_{j_d}(s)\big).
\end{aligned}
\]

For each $i\in W^N$, the corresponding term in the first line of (4.4) represents the probability that a task arriving at dispatcher $i$ is assigned to server $j\in V^N$, given the state $\big(X^N_j,\,j\in V^N\big)$. Moreover, by Condition 2.1, the term $C^N_j$ for all $j\in V^N$ can be bounded above, uniformly in $t$, by a constant for all large enough $N$; this is stated in Lemma 4.2 below.

When carrying out such estimates, for instance bounding the term $C^N_j$, we need uniform control on the number of neighbors of servers and dispatchers. Such uniformity is stated in Lemma 4.1 and is a direct consequence of Condition 2.1. Recall $\delta^N_i=|\mathcal{N}^N_w(i)|$ and $\delta_k=\sum_{m\in\mathcal{M}}p_{k,m}v_m$.

Lemma 4.1.

For each $k\in\mathcal{K}$,
\[
\lim_{N\rightarrow\infty}\max_{i\in W^N_k}\frac{\deg^N_w(i,m)}{|V^N_m|}=\lim_{N\rightarrow\infty}\min_{i\in W^N_k}\frac{\deg^N_w(i,m)}{|V^N_m|}=p_{k,m},\qquad m\in\mathcal{M}, \tag{4.5}
\]
and
\[
\lim_{N\rightarrow\infty}\max_{i\in W^N_k}\frac{\delta^N_i}{N}=\lim_{N\rightarrow\infty}\min_{i\in W^N_k}\frac{\delta^N_i}{N}=\delta_k. \tag{4.6}
\]
Also, for each $m\in\mathcal{M}$,
\[
\lim_{N\rightarrow\infty}\max_{j\in V^N_m}\frac{\deg^N_v(k,j)}{|W^N_k|}=\lim_{N\rightarrow\infty}\min_{j\in V^N_m}\frac{\deg^N_v(k,j)}{|W^N_k|}=p_{k,m},\qquad k\in\mathcal{K}. \tag{4.7}
\]
Lemma 4.2.

For all large enough $N$, we have that for any $m\in\mathcal{M}$, $j\in V^N_m$, and $t\geq 0$,
\[
C^N_j(t)\leq 2\xi d\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_k}{\delta_k}. \tag{4.8}
\]
Proof.

By the definition of $C^N_j(t)$, for any $t\geq 0$ and large enough $N$,

\[
C^N_j(t)\leq\sum_{k\in\mathcal{K}}\sum_{i\in W^N_k}\xi^N_{i,j}\sum_{(j_2,\ldots,j_d)\in\texttt{set}^N(j)}\frac{\xi^N_{i,j_2}\cdots\xi^N_{i,j_d}}{\binom{\delta^N_i}{d}(d-1)!}=\sum_{k\in\mathcal{K}}\sum_{i\in W^N_k}\xi^N_{i,j}\,\frac{\binom{\delta^N_i-1}{d-1}}{\binom{\delta^N_i}{d}}\leq 2\xi d\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_k}{\delta_k},
\]

where the first inequality is due to $b(\cdot)\leq 1$ and the last inequality follows from Lemma 4.1. ∎

By Lemma 4.1, we know that the neighborhoods of dispatchers of the same type are of almost the same size. As the system size grows, the local graph structure around each dispatcher of a given type converges to the average one. The following two lemmas give the necessary approximations of the graph structure for large-$N$ systems. Their proofs are combinatorial, based on Condition 2.1 and Lemma 4.1, and are provided in Appendix B.

Lemma 4.3.

Consider a sequence $\{G^N\}_N$ satisfying Condition 2.1. For each $m\in\mathcal{M}$,
\[
\max_{j\in V^N_m}\max_{k\in\mathcal{K}}\max_{(M_2,\ldots,M_d)\in\mathcal{M}^{d-1}}\Bigg|\sum_{i\in W^N_k}\xi^N_{i,j}\sum_{\substack{(j_2,\ldots,j_d)\in\texttt{set}^N(j)\\ \text{s.t. } j_2\in V^N_{M_2},\ldots,j_d\in V^N_{M_d}}}\frac{\xi^N_{i,j_2}\cdots\xi^N_{i,j_d}}{\binom{\delta^N_i}{d}(d-1)!}-\xi d\,\frac{p_{k,m}w_k}{\delta_k}\prod_{h=2}^{d}\frac{v_{M_h}p_{k,M_h}}{\delta_k}\Bigg|\xrightarrow{N\rightarrow\infty}0. \tag{4.9}
\]
Lemma 4.4.

Consider any $m\in\mathcal{M}$ and $j\in V_m$. For large enough $N$,
\[
\sum_{i\in W^N}\sum_{\texttt{sett}^N(j)}\frac{\xi^N_{i,j}\,\xi^N_{i,j_2}\cdots\xi^N_{i,j_d}}{\binom{\delta^N_i}{d}(d-1)!}\cdot\frac{\xi^N_{i,j}\,\xi^N_{i,j'_2}\cdots\xi^N_{i,j'_d}}{\binom{\delta^N_i}{d}(d-1)!}\leq\frac{C_1}{N^2}, \tag{4.10}
\]
where $C_1$ is a positive constant. Similarly,
\[
\sum_{\substack{i_1,i_2\in W^N,\\ i_1\neq i_2}}\sum_{\texttt{sett}^N(j)}\frac{\xi^N_{i_1,j}\,\xi^N_{i_1,j_2}\cdots\xi^N_{i_1,j_d}}{\binom{\delta^N_{i_1}}{d}(d-1)!}\cdot\frac{\xi^N_{i_2,j}\,\xi^N_{i_2,j'_2}\cdots\xi^N_{i_2,j'_d}}{\binom{\delta^N_{i_2}}{d}(d-1)!}\leq\frac{C_2}{N}, \tag{4.11}
\]
where $C_2$ is a positive constant.

4.2 Convergence to McKean-Vlasov Process: IID Case

Proof of Theorem 3.5.

It suffices to prove (3.6). Fix any $m\in\mathcal{M}$, $j\in V_m$, and $T>0$. We have that for any fixed $t\in[0,T]$ and any $N$ such that $j\in V^N$,

\[
\begin{aligned}
\mathbb{E}\big\lVert X^N_j-X_j\big\rVert^2_{*,t} &\leq c_0\,\mathbb{E}\big\lVert X^N_j(t)-X_j(t)\big\rVert^2\\
&\leq c_1\,\mathbb{E}\int_0^t\big|\mathds{1}_{\{X^N_j(s)>0\}}-\mathds{1}_{\{X_j(s)>0\}}\big|^2\,ds+c_1\,\mathbb{E}\Big(\int_0^t\big|\mathds{1}_{\{X^N_j(s)>0\}}-\mathds{1}_{\{X_j(s)>0\}}\big|\,ds\Big)^2\\
&\quad+c_1\,\mathbb{E}\int_{[0,t]\times\mathbb{R}_+}\big|\mathds{1}_{\{0\leq y\leq C^N_j(s)\}}-\mathds{1}_{\{0\leq y\leq C_j(s)\}}\big|^2\,ds\,dy\\
&\quad+c_1\,\mathbb{E}\Big(\int_{[0,t]\times\mathbb{R}_+}\big|\mathds{1}_{\{0\leq y\leq C^N_j(s)\}}-\mathds{1}_{\{0\leq y\leq C_j(s)\}}\big|\,ds\,dy\Big)^2\\
&\leq c_1\,\mathbb{E}\int_0^t\big|X^N_j(s)-X_j(s)\big|^2\,ds+c_1\,\mathbb{E}\Big(\int_0^t\big|X^N_j(s)-X_j(s)\big|\,ds\Big)^2\\
&\quad+c_1\,\mathbb{E}\int_0^t\big|C^N_j(s)-C_j(s)\big|^2\,ds+c_1\,\mathbb{E}\Big(\int_0^t\big|C^N_j(s)-C_j(s)\big|\,ds\Big)^2\\
&\leq c_2\int_0^t\mathbb{E}\big|X^N_j(s)-X_j(s)\big|^2\,ds+c_2\int_0^t\mathbb{E}\big|C^N_j(s)-C_j(s)\big|\,ds, &&\text{(4.12)}
\end{aligned}
\]

where $c_0$, $c_1$, and $c_2$ are positive constants. The first two inequalities are by Doob's inequality and the Cauchy-Schwarz inequality, respectively. The last inequality is due to Lemma 4.1. By adding and subtracting terms, we have

\[
\big|C^N_j(s)-C_j(s)\big|\leq\big|C^N_j(s)-C^{N,1}_j(s)\big|+\big|C^{N,1}_j(s)-C^{N,2}_j(s)\big|+\big|C^{N,2}_j(s)-C_j(s)\big|, \tag{4.13}
\]

where

\[
\begin{aligned}
C^{N,1}_j &= \sum_{k\in\mathcal{K}}\sum_{i\in W^N_k}\Big[\xi^N_{i,j}\sum_{(j_2,\ldots,j_d)\in\texttt{set}^N(j)}\frac{\xi^N_{i,j_2}\cdots\xi^N_{i,j_d}}{\binom{\delta^N_i}{d}(d-1)!}\,b\big(X_j(s),X_{j_2}(s),\ldots,X_{j_d}(s)\big)\Big],\\
C^{N,2}_j &= \sum_{k\in\mathcal{K}}\sum_{i\in W^N_k}\Big[\xi^N_{i,j}\sum_{(j_2,\ldots,j_d)\in\texttt{set}^N(j)}\frac{\xi^N_{i,j_2}\cdots\xi^N_{i,j_d}}{\binom{\delta^N_i}{d}(d-1)!}\int_{\mathbb{N}_0^{d-1}}b\big(X_j(s),x_{j_2},\ldots,x_{j_d}\big)\,\mu^{\mathbf{M}(j_2)}_s(dx_{j_2})\cdots\mu^{\mathbf{M}(j_d)}_s(dx_{j_d})\Big].
\end{aligned}
\]

First, consider |CjN(s)CjN,1(s)||C^{N}_{j}(s)-C^{N,1}_{j}(s)|. For large enough NN,

𝔼|CjN(s)CjN,1(s)|\displaystyle\mathbb{E}|C^{N}_{j}(s)-C^{N,1}_{j}(s)|
=𝔼|k𝒦iWkN[ξi,jN(j2,,jd)setN(j)ξi,j2N××ξi,jdN(δiNd)(d1)!(b(XjN(s),Xj2N(s),,XjdN(s))\displaystyle=\mathbb{E}\bigg{|}\sum_{k\in\mathcal{K}}\sum_{i\in W^{N}_{k}}\Big{[}\xi^{N}_{i,j}\sum_{(j_{2},...,j_{d})\in\texttt{set}^{N}(j)}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}\big{(}b(X^{N}_{j}(s),X^{N}_{j_{2}}(s),...,X^{N}_{j_{d}}(s))
b(Xj(s),Xj2(s),,Xjd(s)))]|\displaystyle\hskip 256.0748pt-b(X_{j}(s),X_{j_{2}}(s),...,X_{j_{d}}(s))\big{)}\Big{]}\bigg{|}
𝔼k𝒦iWkN[ξi,jN(j2,,jd)setN(j)ξi,j2N××ξi,jdN(δiNd)(d1)!(|XjN(s)Xj(s)|++|XjdN(s)Xjd(s)|)]\displaystyle\leq\mathbb{E}\sum_{k\in\mathcal{K}}\sum_{i\in W^{N}_{k}}\Big{[}\xi^{N}_{i,j}\sum_{(j_{2},...,j_{d})\in\texttt{set}^{N}(j)}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}\big{(}|X^{N}_{j}(s)-X_{j}(s)|+\cdots+|X^{N}_{j_{d}}(s)-X_{j_{d}}(s)|\big{)}\Big{]}
d×maxjVN𝔼[|XjN(s)Xj(s)|]×k𝒦iWkNξi,jN(j2,,jd)setN(j)ξi,j2N××ξi,jdN(δiNd)(d1)!\displaystyle\leq d\times\max_{j\in V^{N}}\mathbb{E}[|X^{N}_{j}(s)-X_{j}(s)|]\times\sum_{k\in\mathcal{K}}\sum_{i\in W^{N}_{k}}\xi^{N}_{i,j}\sum_{(j_{2},...,j_{d})\in\texttt{set}^{N}(j)}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}
c3maxjVN𝔼[|XjN(s)Xj(s)|],\displaystyle\leq c_{3}\max_{j\in V^{N}}\mathbb{E}[|X^{N}_{j}(s)-X_{j}(s)|], (4.14)

where c_{3} is a positive constant. The first inequality uses the fact that b(\cdot) is Lipschitz continuous with Lipschitz constant 1, and the last inequality follows from (4.9).

Second, consider |CjN,1(s)CjN,2(s)||C^{N,1}_{j}(s)-C^{N,2}_{j}(s)|.

𝔼[|CjN,1(s)CjN,2(s)|2]\displaystyle\mathbb{E}\big{[}|C^{N,1}_{j}(s)-C^{N,2}_{j}(s)|^{2}\big{]}
=𝔼|k𝒦iWkN[ξi,jN(j2,,jd)setN(j)ξi,j2N××ξi,jdN(δiNd)(d1)!b(Xj(s),Xj2(s),,Xjd(s))]\displaystyle=\mathbb{E}\Big{|}\sum_{k\in\mathcal{K}}\sum_{i\in W^{N}_{k}}\Big{[}\xi^{N}_{i,j}\sum_{(j_{2},...,j_{d})\in\texttt{set}^{N}(j)}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}b(X_{j}(s),X_{j_{2}}(s),...,X_{j_{d}}(s))\Big{]}
k𝒦iWkN[ξi,jN(j2,,jd)setN(j)ξi,j2N××ξi,jdN(δiNd)(d1)!0d1b(Xj(s),xj2,,xjd)\displaystyle\hskip 42.67912pt-\sum_{k\in\mathcal{K}}\sum_{i\in W^{N}_{k}}\Big{[}\xi^{N}_{i,j}\sum_{(j_{2},...,j_{d})\in\texttt{set}^{N}(j)}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}\int_{\mathbb{N}^{d-1}_{0}}b(X_{j}(s),x_{j_{2}},...,x_{j_{d}}) (4.15)
μs𝐌(j2)(dxj2)μs𝐌(jd)(dxjd)]|2\displaystyle\hskip 256.0748pt\mu^{\mathbf{M}(j_{2})}_{s}(dx_{j_{2}})\cdots\mu^{\mathbf{M}(j_{d})}_{s}(dx_{j_{d}})\Big{]}\Big{|}^{2}
𝔼[i1,i2WNsettN(j)ξi1,jN×ξi1,j2N××ξi1,jdN(δi1Nd)(d1)!ξi2,jN×ξi2,j2N××ξi2,jdN(δi2Nd)(d1)!]\displaystyle\leq\mathbb{E}\Big{[}\sum_{i_{1},i_{2}\in W^{N}}\sum_{\texttt{sett}^{N}(j)}\frac{\xi^{N}_{i_{1},j}\times\xi^{N}_{i_{1},j_{2}}\times\cdots\times\xi^{N}_{i_{1},j_{d}}}{{\delta^{N}_{i_{1}}\choose d}(d-1)!}\frac{\xi^{N}_{i_{2},j}\times\xi^{N}_{i_{2},j^{\prime}_{2}}\times\cdots\times\xi^{N}_{i_{2},j^{\prime}_{d}}}{{\delta^{N}_{i_{2}}\choose d}(d-1)!}\Big{]}
(a)𝔼[iWNsettN(j)ξi,jN×ξi,j2N××ξi,jdN(δiNd)(d1)!ξi,jN×ξi,j2N××ξi,jdN(δiNd)(d1)!\displaystyle\overset{(a)}{\leq}\mathbb{E}\Big{[}\sum_{i\in W^{N}}\sum_{\texttt{sett}^{N}{(j)}}\frac{\xi^{N}_{i,j}\times\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}\frac{\xi^{N}_{i,j}\times\xi^{N}_{i,j^{\prime}_{2}}\times\cdots\times\xi^{N}_{i,j^{\prime}_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}
+i1,i2WN,i1i2settN(j)ξi1,jN×ξi1,j2N××ξi1,jdN(δi1Nd)(d1)!ξi2,jN×ξi2,j2N××ξi2,jdN(δi2Nd)(d1)!],\displaystyle\hskip 42.67912pt+\sum_{i_{1},i_{2}\in W^{N},i_{1}\neq i_{2}}\sum_{\texttt{sett}^{N}{(j)}}\frac{\xi^{N}_{i_{1},j}\times\xi^{N}_{i_{1},j_{2}}\times\cdots\times\xi^{N}_{i_{1},j_{d}}}{{\delta^{N}_{i_{1}}\choose d}(d-1)!}\frac{\xi^{N}_{i_{2},j}\times\xi^{N}_{i_{2},j^{\prime}_{2}}\times\cdots\times\xi^{N}_{i_{2},j^{\prime}_{d}}}{{\delta^{N}_{i_{2}}\choose d}(d-1)!}\Big{]},
c4N2+c5N1\displaystyle\leq c_{4}N^{-2}+c_{5}N^{-1} (4.16)

where the first inequality is due to the fact that the initial states X_{j}(0), j\in V_{m}, are i.i.d. within each type m\in\mathcal{M} and independent across types, so that, for any fixed s>0 and each m\in\mathcal{M}, \{X_{j}(s),j\in V_{m}\} are also i.i.d., with independence across the server pools. Hence, if the indices (j,j_{2},...,j_{d},j^{\prime}_{2},...,j^{\prime}_{d}) are distinct, then

𝔼[(b(Xj(t),Xj2(t),,Xjd(t))d1b(Xj(t),xj2,,xjd)μt𝐌(j2)(dxj2)μt𝐌(jd)(dxjd))(b(Xj(t),Xj2(t),,Xjd(t))d1b(Xj(t),xj2,,xjd)μt𝐌(j2)(dxj2)μt𝐌(jd)(dxjd))]=0,\begin{split}\mathbb{E}\Big{[}&\big{(}b(X_{j}(t),X_{j_{2}}(t),...,X_{j_{d}}(t))-\int_{\mathbb{N}^{d-1}}b(X_{j}(t),x_{j_{2}},...,x_{j_{d}})\mu^{\mathbf{M}(j_{2})}_{t}(dx_{j_{2}})\cdots\mu^{\mathbf{M}(j_{d})}_{t}(dx_{j_{d}})\big{)}\\ &\big{(}b(X_{j}(t),X_{j^{\prime}_{2}}(t),...,X_{j^{\prime}_{d}}(t))-\int_{\mathbb{N}^{d-1}}b(X_{j}(t),x_{j^{\prime}_{2}},...,x_{j^{\prime}_{d}})\mu^{\mathbf{M}(j^{\prime}_{2})}_{t}(dx_{j^{\prime}_{2}})\cdots\mu^{\mathbf{M}(j^{\prime}_{d})}_{t}(dx_{j^{\prime}_{d}})\big{)}\Big{]}=0,\end{split}

and b()b(\cdot) and b()μ(d)\int b(\cdot)\mu(d\cdot) are both in [0,1][0,1]. The last inequality of (4.16) is by (4.10) and (4.11).

Third, consider |CjN,2(s)Cj(s)||C^{N,2}_{j}(s)-C_{j}(s)|.

𝔼[|CjN,2(s)Cj(s)|]\displaystyle\mathbb{E}\big{[}|C^{N,2}_{j}(s)-C_{j}(s)|\big{]}
=𝔼[|k𝒦iWkN[ξi,jN(j2,,jd)setN(j)ξi,j2N××ξi,jdN(δiNd)(d1)!d1b(Xj(t),xj2,,xjd)\displaystyle=\mathbb{E}\Big{[}\Big{|}\sum_{k\in\mathcal{K}}\sum_{i\in W^{N}_{k}}\Big{[}\xi^{N}_{i,j}\sum_{(j_{2},...,j_{d})\in\texttt{set}^{N}(j)}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}\int_{\mathbb{N}^{d-1}}b(X_{j}(t),x_{j_{2}},...,x_{j_{d}})
μt𝐌(j2)(dxj2)μt𝐌(jd)(dxjd)]\displaystyle\hskip 256.0748pt\mu^{\mathbf{M}(j_{2})}_{t}(dx_{j_{2}})\cdots\mu^{\mathbf{M}(j_{d})}_{t}(dx_{j_{d}})\Big{]}
dξk𝒦pk,mwkδk(M2,,Md)d1h=2dvMhpk,Mhδkd1b(Xj(t),xj2,,xjd)\displaystyle-d\xi\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_{k}}{\delta_{k}}\sum_{(M_{2},...,M_{d})\in\mathcal{M}^{d-1}}\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\int_{\mathbb{N}^{d-1}}b(X_{j}(t),x_{j_{2}},...,x_{j_{d}})
μt𝐌(j2)(dxj2)μt𝐌(jd)(dxjd)|]\displaystyle\hskip 256.0748pt\mu^{\mathbf{M}(j_{2})}_{t}(dx_{j_{2}})\cdots\mu^{\mathbf{M}(j_{d})}_{t}(dx_{j_{d}})\Big{|}\Big{]}
c6(N),\displaystyle\leq c_{6}(N), (4.17)

where c_{6}(N) depends only on N and goes to 0 as N\rightarrow\infty, and the inequality comes from (4.9) and the fact that \int b(\cdot)\mu(d\cdot)\in[0,1]. Now, by (4.12), (4.13), (4.14), (4.16) and (4.17), we have that for large enough N,

\max_{j\in V^{N}}\mathbb{E}\left\lVert X^{N}_{j}-X_{j}\right\rVert^{2}_{*,t}\leq c_{10}\int_{0}^{t}\max_{j\in V^{N}}\mathbb{E}\left\lVert X^{N}_{j}-X_{j}\right\rVert^{2}_{*,s}\,ds+f(N),

where c_{10} is a constant and f(N) is a function that goes to 0 as N\rightarrow\infty. Lastly, by Grönwall's inequality, we obtain (3.6), which completes the proof. ∎
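Spelled out, since the map t\mapsto\max_{j\in V^{N}}\mathbb{E}\left\lVert X^{N}_{j}-X_{j}\right\rVert^{2}_{*,t} is nondecreasing and finite on [0,T], Grönwall's inequality applied to the last display gives the explicit bound

\max_{j\in V^{N}}\mathbb{E}\left\lVert X^{N}_{j}-X_{j}\right\rVert^{2}_{*,T}\leq f(N)\,e^{c_{10}T},

whose right-hand side vanishes as N\rightarrow\infty because f(N)\rightarrow 0, which is precisely (3.6).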

4.3 Convergence of the Occupancy Process: IID Case

In this section, we want to show the convergence of the occupancy process 𝐪N()\mathbf{q}^{N}(\cdot) to the limit process 𝐪\mathbf{q} represented by the ODE (3.8). The first step is to investigate the existence and uniqueness of the solution of the ODE (3.8). Define

𝒮¯{𝐪[0,1]M×0:qm,0=1,qm,lqm,l+1,m,l0}\bar{\mathcal{S}}\coloneqq\Big{\{}\mathbf{q}\in[0,1]^{M\times\mathbb{N}_{0}}:q_{m,0}=1,q_{m,l}\geq q_{m,l+1},\forall m\in\mathcal{M},l\in\mathbb{N}_{0}\Big{\}}

and clearly, 𝒮𝒮¯\mathcal{S}\subseteq\bar{\mathcal{S}}.

Lemma 4.5.

If 𝐪(0)=𝐪0𝒮¯\mathbf{q}(0)=\mathbf{q}_{0}\in\bar{\mathcal{S}}, then the ODE system (3.8) has a unique solution denoted as 𝐪¯(t,𝐪0)\bar{\mathbf{q}}(t,\mathbf{q}_{0}), t0t\geq 0 in 𝒮¯\bar{\mathcal{S}}.

The proof of Lemma 4.5 is based on the Picard successive approximation method ([14, Theorem 1(i)]) and is provided in Appendix C.

Proof of Theorem 3.6.

Fix any T(0,)T\in(0,\infty). For each mm\in\mathcal{M}, consider random measures μmN=1|VmN|jVmNδXjN()\mu^{N}_{m}=\frac{1}{|V^{N}_{m}|}\sum_{j\in V^{N}_{m}}\delta_{X^{N}_{j}(\cdot)} and μ¯mN=1|VmN|jVmNδXj()\bar{\mu}^{N}_{m}=\frac{1}{|V^{N}_{m}|}\sum_{j\in V^{N}_{m}}\delta_{X_{j}(\cdot)} on 𝕊𝔻([0,T],0)\mathbb{S}\coloneqq\mathbb{D}([0,T],\mathbb{N}_{0}), where Xj()X_{j}(\cdot) is defined in (3.3). Denote the joint measures μN=(μ1N,,μMN)\mu^{N}=(\mu^{N}_{1},...,\mu^{N}_{M}) and μ¯N=(μ¯1N,,μ¯MN)\bar{\mu}^{N}=(\bar{\mu}^{N}_{1},...,\bar{\mu}^{N}_{M}). Denote by dBL(,)d_{BL}(\cdot,\cdot) the bounded-Lipschitz metric for probability measures on 𝕊\mathbb{S}:

dBL(μ1,μ2)supfBL1|𝕊f𝑑μ1𝕊f𝑑μ2|,fBLmax{f,supxyf(x)f(y)d(x,y)}.d_{BL}(\mu_{1},\mu_{2})\coloneqq\sup_{\left\lVert f\right\rVert_{BL}\leq 1}\Big{|}\int_{\mathbb{S}}fd\mu_{1}-\int_{\mathbb{S}}fd\mu_{2}\Big{|},\quad\left\lVert f\right\rVert_{BL}\coloneqq\max\Big{\{}\left\lVert f\right\rVert_{\infty},\sup_{x\neq y}\frac{f(x)-f(y)}{d(x,y)}\Big{\}}.

From (3.6) we have

𝔼dBL(μmN,μ¯mN)𝔼supfBL11|VmN|jVmN|f(XjN)f(Xj)|1|VmN|jVN𝔼XjNXj,TN0\begin{split}\mathbb{E}d_{BL}(\mu^{N}_{m},\bar{\mu}^{N}_{m})&\leq\mathbb{E}\sup_{\left\lVert f\right\rVert_{BL}\leq 1}\frac{1}{|V^{N}_{m}|}\sum_{j\in V^{N}_{m}}|f(X^{N}_{j})-f(X_{j})|\leq\frac{1}{|V_{m}^{N}|}\sum_{j\in V^{N}}\mathbb{E}\left\lVert X^{N}_{j}-X_{j}\right\rVert_{*,T}\xrightarrow{N\rightarrow\infty}0\end{split}

which implies that dBL(μmN,μ¯mN)0d_{BL}(\mu^{N}_{m},\bar{\mu}^{N}_{m})\xrightarrow{\ \mathbb{P}\ }0 for each mm\in\mathcal{M}. Since μ¯mNμm\bar{\mu}^{N}_{m}\xrightarrow{\ \mathbb{P}\ }\mu_{m} by LLN, we have μN=(μ1N,,μMN)(μ1,,μM)\mu^{N}=(\mu^{N}_{1},...,\mu^{N}_{M})\xrightarrow{\ \mathbb{P}\ }(\mu_{1},...,\mu_{M}) by Slutsky’s theorem. Also, it is easy to check that

supN𝔼[sup0tT𝐪N(t)12]<.\sup_{N}\mathbb{E}\Big{[}\sup_{0\leq t\leq T}\left\lVert\mathbf{q}^{N}(t)\right\rVert^{2}_{\ell_{1}}\Big{]}<\infty.

Thus, we have 𝐪N𝐪\mathbf{q}^{N}\xrightarrow{\ \mathbb{P}\ }\mathbf{q}. Next, we need to show that 𝐪\mathbf{q} satisfies (3.8). Define fl(x)=𝟙{xl}f_{l}(x)=\mathds{1}_{\{x\geq l\}}, l0l\in\mathbb{N}_{0}. By (3.3), we have that for any mm\in\mathcal{M} and jVmj\in V_{m},

𝔼fl(Xj(t))\displaystyle\mathbb{E}f_{l}(X_{j}(t)) =𝔼fl(Xj(0))+0tum𝔼𝟙{Xj(s)>0}(fl(Xj(s)1)fl(Xj(s)))𝑑s\displaystyle=\mathbb{E}f_{l}(X_{j}(0))+\int_{0}^{t}u_{m}\mathbb{E}\mathds{1}_{\{X_{j}(s)>0\}}\big{(}f_{l}(X_{j}(s)-1)-f_{l}(X_{j}(s))\big{)}ds
+0td1λξdk𝒦pk,mwkδk(M2,,Md)d1h=2dvMhpk,Mhδk\displaystyle\quad+\int_{0}^{t}\int_{\mathbb{N}^{d-1}}\lambda\xi d\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_{k}}{\delta_{k}}\sum_{(M_{2},...,M_{d})\in\mathcal{M}^{d-1}}\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}
×𝔼[b(Xj(s),xj2,,xjd)(fl(Xj(s)+1)fl(Xj(s)))]μsM2(dxj2)μsMd(dxjd)ds\displaystyle\qquad\times\mathbb{E}\big{[}b(X_{j}(s),x_{j_{2}},...,x_{j_{d}})\big{(}f_{l}(X_{j}(s)+1)-f_{l}(X_{j}(s))\big{)}\big{]}\mu^{M_{2}}_{s}(dx_{j_{2}})\cdots\mu^{M_{d}}_{s}(dx_{j_{d}})ds
=𝔼fl(Xj(0))+0tum𝔼𝟙{Xj(s)>0}(fl+1(Xj(s))fl(Xj(s)))𝑑s\displaystyle=\mathbb{E}f_{l}(X_{j}(0))+\int_{0}^{t}u_{m}\mathbb{E}\mathds{1}_{\{X_{j}(s)>0\}}(f_{l+1}(X_{j}(s))-f_{l}(X_{j}(s)))ds
+0td1λξdk𝒦pk,mwkδk(M2,,Md)d1h=2dvMhpk,Mhδk\displaystyle\quad+\int_{0}^{t}\int_{\mathbb{N}^{d-1}}\lambda\xi d\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_{k}}{\delta_{k}}\sum_{(M_{2},...,M_{d})\in\mathcal{M}^{d-1}}\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}
×𝔼[b(l1,xj2,,xjd)(fl1(Xj(s))fl(Xj(s)))]μsM2(dxj2)μsMd(dxjd)ds.\displaystyle\qquad\times\mathbb{E}[b(l-1,x_{j_{2}},...,x_{j_{d}})(f_{l-1}(X_{j}(s))-f_{l}(X_{j}(s)))]\mu^{M_{2}}_{s}(dx_{j_{2}})\cdots\mu^{M_{d}}_{s}(dx_{j_{d}})ds.

For any mm\in\mathcal{M}, if jVmj\in V_{m}, then 𝔼fl(Xj(t))=qm,l(t)=μtm[l,)\mathbb{E}f_{l}(X_{j}(t))=q_{m,l}(t)=\mu^{m}_{t}[l,\infty) for l=1,2,l=1,2,.... Hence,

qm,l(t)\displaystyle q_{m,l}(t) =qm,l(0)0tum(qm,l(s)qm,l+1(s))𝑑s+0tλξdk𝒦pk,mwkδk(qm,l1(s)qm,l(s))\displaystyle=q_{m,l}(0)-\int_{0}^{t}u_{m}(q_{m,l}(s)-q_{m,l+1}(s))ds+\int_{0}^{t}\lambda\xi d\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_{k}}{\delta_{k}}(q_{m,l-1}(s)-q_{m,l}(s))
×(M2,,Md)d1h=2dvMhpk,Mhδkd1b(l1,xj2,,xjd)μsM2(dxj2)μsMd(dxjd)ds\displaystyle\times\sum_{(M_{2},...,M_{d})\in\mathcal{M}^{d-1}}\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\int_{\mathbb{N}^{d-1}}b(l-1,x_{j_{2}},...,x_{j_{d}})\mu^{M_{2}}_{s}(dx_{j_{2}})\cdots\mu^{M_{d}}_{s}(dx_{j_{d}})ds (4.18)

Also,

(M2,,Md)d1h=2dvMhpk,Mhδkd1b(l1,xj2,,xjd)μsM2(dxj2)μsMd(dxjd)\displaystyle\sum_{(M_{2},...,M_{d})\in\mathcal{M}^{d-1}}\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\int_{\mathbb{N}^{d-1}}b(l-1,x_{j_{2}},...,x_{j_{d}})\mu^{M_{2}}_{s}(dx_{j_{2}})\cdots\mu^{M_{d}}_{s}(dx_{j_{d}})
=r¯¯r¯¯(r¯)11+|r¯|m(rmrm)(vmpk,mδk)rm(qm,l1(s)qm,l(s))rm(qm,l(s))rmrm\displaystyle=\sum_{\bar{r}\in\bar{\mathcal{R}}}\sum_{\bar{r}^{\prime}\in\bar{\mathcal{R}}^{\prime}(\bar{r})}\frac{1}{1+|\bar{r}^{\prime}|}\prod_{m\in\mathcal{M}}{r_{m}\choose r^{\prime}_{m}}\Big{(}\frac{v_{m}p_{k,m}}{\delta_{k}}\Big{)}^{r_{m}}(q_{m,l-1}(s)-q_{m,l}(s))^{r^{\prime}_{m}}(q_{m,l}(s))^{r_{m}-r^{\prime}_{m}}
=r=0d111+r(d1r)(mvmpk,mδkqm,l1(s)mvmpk,mδkqm,l(s))r(mvmpk,mδkqm,l(s))d1r\displaystyle=\sum_{r=0}^{d-1}\frac{1}{1+r}{d-1\choose r}\Big{(}\sum_{m\in\mathcal{M}}\frac{v_{m}p_{k,m}}{\delta_{k}}q_{m,l-1}(s)-\sum_{m\in\mathcal{M}}\frac{v_{m}p_{k,m}}{\delta_{k}}q_{m,l}(s)\Big{)}^{r}\Big{(}\sum_{m\in\mathcal{M}}\frac{v_{m}p_{k,m}}{\delta_{k}}q_{m,l}(s)\Big{)}^{d-1-r}
=r=1d1r(d1r1)(q~k,l1(s)q~k,l(s))r1(q~k,l(s))dr(Let q~k,l(s)=mvmpk,mδkqm,l(s))\displaystyle=\sum_{r=1}^{d}\frac{1}{r}{d-1\choose r-1}\big{(}\tilde{q}_{k,l-1}(s)-\tilde{q}_{k,l}(s)\big{)}^{r-1}(\tilde{q}_{k,l}(s))^{d-r}\quad(\text{Let }\tilde{q}_{k,l}(s)=\sum_{m\in\mathcal{M}}\frac{v_{m}p_{k,m}}{\delta_{k}}q_{m,l}(s))
=(q~k,l1(s))d(q~k,l(s))dd(q~k,l1(s)q~k,l(s))\displaystyle=\frac{(\tilde{q}_{k,l-1}(s))^{d}-(\tilde{q}_{k,l}(s))^{d}}{d(\tilde{q}_{k,l-1}(s)-\tilde{q}_{k,l}(s))} (4.19)

where \bar{\mathcal{R}}=\{\bar{r}=(r_{1},...,r_{M})\in\mathbb{N}_{0}^{M}:\sum_{m\in\mathcal{M}}r_{m}=d-1\} and, given \bar{r}\in\bar{\mathcal{R}}, \bar{\mathcal{R}}^{\prime}(\bar{r})=\{\bar{r}^{\prime}=(r^{\prime}_{1},...,r^{\prime}_{M})\in\mathbb{N}_{0}^{M}:r^{\prime}_{m}\leq r_{m},\forall m\in\mathcal{M}\}. Plugging (4.19) into (4.18), we get the desired result. ∎
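The last two steps of (4.19) rest on the combinatorial identity \sum_{r=1}^{d}\frac{1}{r}\binom{d-1}{r-1}(a-b)^{r-1}b^{d-r}=\frac{a^{d}-b^{d}}{d(a-b)}, with a=\tilde{q}_{k,l-1}(s) and b=\tilde{q}_{k,l}(s). A minimal numerical spot-check of this identity (the sample values of a, b, d below are arbitrary):

```python
from math import comb

def lhs(a, b, d):
    # sum_{r=1}^{d} (1/r) C(d-1, r-1) (a-b)^{r-1} b^{d-r}
    return sum(comb(d - 1, r - 1) / r * (a - b) ** (r - 1) * b ** (d - r)
               for r in range(1, d + 1))

def rhs(a, b, d):
    # (a^d - b^d) / (d (a - b)), assuming a != b
    return (a ** d - b ** d) / (d * (a - b))

# spot-check for a few values of d and 0 <= b < a <= 1
for d in (2, 3, 5):
    for (a, b) in [(0.9, 0.4), (0.7, 0.1), (1.0, 0.25)]:
        assert abs(lhs(a, b, d) - rhs(a, b, d)) < 1e-12
print("identity used in (4.19) verified on sample points")
```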

4.4 Convergence of the Occupancy Process: General Case

In this section, we discuss the case where the sequence \{G^{N}\}_{N} is clustered proportionally sparse, which allows us to remove the i.i.d. assumption and establish Theorem 3.10. Intuitively, if \{G^{N}\}_{N} is clustered proportionally sparse, then for each k\in\mathcal{K} and each dispatcher i\in W^{N}_{k}, the queue-length distribution of its neighborhood remains close (in an appropriate sense) to the corresponding global weighted queue-length distribution. Clustered proportional sparsity ensures that this statement holds uniformly over all occupancy states. Loosely speaking, this allows us to ensure that the occupancy process evolves in the same way for any initial state as it does in the case of i.i.d. initial states. For homogeneous systems, the notion of proportional sparsity was introduced in [22]. There, proportional sparsity was defined so that, for most dispatchers i, the fraction of the neighbors of i within any subset U of servers is proportional to the size of U. However, due to the heterogeneous compatibility between dispatchers and servers, such a fraction, in the current setup, depends on the type of the dispatcher as well (see the term \frac{E^{N}_{k}(U)}{E^{N}_{k}(V^{N})} in Definition 3.8). Thus, unlike the homogeneous case, where the local queue-length distribution is compared directly to the global queue-length distribution of the system, in the heterogeneous case we need to define K types of global weighted queue-length distributions (see Definition 4.6), where the weights are determined by the asymptotic properties of the graph structure: (v_{m},m\in\mathcal{M}) and (p_{k,m},k\in\mathcal{K},m\in\mathcal{M}). We then compare the local queue-length distribution of dispatcher i to the global weighted queue-length distribution of the corresponding type, as defined below.

Definition 4.6.

Consider any fixed NN\in\mathbb{N} and k𝒦k\in\mathcal{K}. Given the global occupancy 𝐪N=(qm,lN,m,l0)\mathbf{q}^{N}=(q^{N}_{m,l},m\in\mathcal{M},l\in\mathbb{N}_{0}) of the NN-th system, the global weighted queue-length distribution (GWQD) of type k is defined as (xk,m,lN,m,l0)\big{(}x^{N}_{k,m,l},m\in\mathcal{M},l\in\mathbb{N}_{0}\big{)}, where

x^{N}_{k,m,l}=\frac{v_{m}p_{k,m}}{\delta_{k}}(q^{N}_{m,l}-q^{N}_{m,l+1}).

Also, the local queue-length distribution is defined as follows.

Definition 4.7.

Consider any fixed NN\in\mathbb{N} and k𝒦k\in\mathcal{K}. Given the state (XjN,jVN)(X^{N}_{j},j\in V^{N}) of the NN-th system, the local queue-length distribution (LQD) of dispatcher iWkNi\in W^{N}_{k} is defined as (x^i,m,lN,m,l0)(\hat{x}^{N}_{i,m,l},m\in\mathcal{M},l\in\mathbb{N}_{0}), where

x^i,m,lN=|{jVmN:ξi,jN=1 and XjN=l}||𝒩wN(i)|.\hat{x}^{N}_{i,m,l}=\frac{|\{j\in V^{N}_{m}:\xi^{N}_{i,j}=1\text{ and }X^{N}_{j}=l\}|}{|\mathcal{N}^{N}_{w}(i)|}.

Although a dispatcher following the JSQ(d) policy selects a target server based on its LQD, if its LQD is close (in a suitable sense) to its corresponding GWQD, then the selection can be viewed as if the decision were based on the GWQD. The latter case is easier to analyze. Hence, if a dispatcher's LQD is close to its corresponding GWQD, we call it a good dispatcher:

Definition 4.8 (ε\varepsilon-Good Dispatcher).

Consider any fixed N\in\mathbb{N} and an \varepsilon>0. Given the state (X^{N}_{j},j\in V^{N}) of the N-th system, a dispatcher i\in W^{N}_{k}, k\in\mathcal{K}, is \varepsilon-good if

ml0|x^i,m,lNxk,m,lN|ε.\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|\hat{x}^{N}_{i,m,l}-x^{N}_{k,m,l}|\leq\varepsilon. (4.20)

Also, a dispatcher is ε\varepsilon-bad if it is not ε\varepsilon-good.
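To make Definitions 4.6–4.8 concrete, here is a minimal sketch (in Python) that computes the GWQD, the LQD of one dispatcher, and the \varepsilon-goodness criterion (4.20) on a toy instance; the parameter values, queue lengths, and neighborhood below are illustrative assumptions only, and the helper names are ours.

```python
from collections import Counter

# toy instance (illustrative assumptions): 2 server types, 1 dispatcher type
v = {1: 0.5, 2: 0.5}            # asymptotic fractions of server types
p = {(1, 1): 1.0, (1, 2): 0.5}  # p_{k,m}: connection probabilities for type-1 dispatchers
delta = {1: sum(p[(1, m)] * v[m] for m in v)}   # delta_k = sum_m p_{k,m} v_m

# queue lengths X_j and types M(j) of 6 servers, plus one dispatcher's neighborhood
types  = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2}
queues = {1: 0, 2: 1, 3: 1, 4: 0, 5: 2, 6: 1}
nbrs_i = [1, 2, 3, 5]           # neighbors of dispatcher i (of type k = 1)

def gwqd(k):
    """x_{k,m,l}: weighted fraction of type-m servers with queue length exactly l."""
    counts = Counter((types[j], queues[j]) for j in types)
    n_m = Counter(types[j] for j in types)
    return {(m, l): v[m] * p[(k, m)] / delta[k] * counts[(m, l)] / n_m[m]
            for (m, l) in counts}

def lqd(i_nbrs):
    """hat x_{i,m,l}: fraction of i's neighbors of type m with queue length exactly l."""
    counts = Counter((types[j], queues[j]) for j in i_nbrs)
    return {key: c / len(i_nbrs) for key, c in counts.items()}

x, x_hat = gwqd(1), lqd(nbrs_i)
gap = sum(abs(x_hat.get(key, 0.0) - x.get(key, 0.0)) for key in set(x) | set(x_hat))
print("dispatcher i is eps-good for any eps >=", gap)   # criterion (4.20)
```

Note that, written with the exact-queue-length increments q_{m,l}-q_{m,l+1}, both the GWQD and the LQD are probability distributions over the pairs (m,l), so the left-hand side of (4.20) is a total-variation-type distance.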

4.4.1 Consequences of Clustered Proportional Sparsity

The proof of Theorem 3.10 relies on the idea that if the local occupancy of each dispatcher within a particular type evolves similarly to the global occupancy of that type, then the process-level limiting behavior should not depend on any specific initial state; that is, it enables us to go beyond the i.i.d. assumption. The first step, for this approach to work, is to show that almost all dispatchers are \varepsilon-good for any \varepsilon>0. This is where the clustered proportional sparsity property is needed, and the statement is given in the next proposition.

Proposition 4.9.

Let {GN}N\{G^{N}\}_{N} be a sequence of clustered proportionally sparse graphs. For any T0T\geq 0 and ε1,ε2>0\varepsilon_{1},\varepsilon_{2}>0,

(supt[0,T]Nε1(t)ε2|WN|)N0,\mathbb{P}\Big{(}\sup_{t\in[0,T]}\mathscr{B}^{\varepsilon_{1}}_{N}(t)\geq\varepsilon_{2}|W^{N}|\Big{)}\xrightarrow{N\rightarrow\infty}0, (4.21)

where Nε1(t)\mathscr{B}^{\varepsilon_{1}}_{N}(t) is the number of ε1\varepsilon_{1}-bad dispatchers at time tt.

The intuition behind Proposition 4.9 is that the servers of type m\in\mathcal{M} with queue length l\in\mathbb{N}_{0} form a subset U^{N}_{m,l} of the server set V^{N}. If this set is large, then by clustered proportional sparsity, for any fixed k\in\mathcal{K} and almost all i\in W^{N}_{k}, the fraction of dispatcher i's neighbors within U^{N}_{m,l} is close to \frac{|E^{N}_{k}(U^{N}_{m,l})|}{|E^{N}_{k}(V^{N})|}, which in turn is close to x^{N}_{k,m,l} for large enough N by Condition 2.1. Also, in order to deal with the sum over l\in\mathbb{N}_{0}, we need to establish uniform bounds on the tail of the occupancy process on any finite time interval. The complete proof is given in Appendix D.

4.4.2 Coupling with an intermediate system

The main methodology for the proof of Theorem 3.10 is a stochastic coupling with a sequence {GN}N1\{G^{\prime N}\}_{N\geq 1} of carefully constructed systems where the evolution of each system GNG^{\prime N} can be coupled with that of the system GNG^{N}. For each NN, the system GNG^{\prime N} has the same sets of dispatchers and servers as GNG^{N}, i.e., WN=WNW^{\prime N}=W^{N} and VN=VNV^{\prime N}=V^{N}. However, the task assignment in GNG^{\prime N} happens differently. To describe the task assignment policy, let us introduce the following notations: Let XjN(t)X^{\prime N}_{j}(t) be the number of tasks (including those in service) in the queue of server jVNj\in V^{\prime N} at time tt. Let 𝐪N(t)=(qm,lN(t),m,l0)\mathbf{q}^{\prime N}(t)=\big{(}q^{\prime N}_{m,l}(t),m\in\mathcal{M},l\in\mathbb{N}_{0}\big{)} be the corresponding global occupancy at time tt, which is defined in the same way as 𝐪N\mathbf{q}^{N} for the system GNG^{N}. Then, the system GNG^{\prime N} assigns tasks under the Global Weighted Shortest Queue (GWSQ(dd)) policy as described in Algorithm 1. The GWSQ(dd) policy is essentially a variant of the JSQ(d) policy since for each new task, the dispatcher selects a target set of servers of size dd according to the global weighted queue-length distribution.

while a new task arrives at dispatcher i\in W^{N}_{k}, k\in\mathcal{K} do
  Get the current global occupancy \mathbf{q}^{N}=(q^{N}_{m,l},m\in\mathcal{M},l\in\mathbb{N}_{0});
  Calculate the global weighted queue-length distribution \mathbf{x}^{N}_{k}=(x^{N}_{k,m,l},m\in\mathcal{M},l\in\mathbb{N}_{0}) of type k, where
  x^{N}_{k,m,l}=\frac{v_{m}p_{k,m}}{\delta_{k}}(q^{N}_{m,l}-q^{N}_{m,l+1});
  Randomly select a set \texttt{select}^{N} of size d as follows:
  •  Let Y^{N}_{k,m,l}(t)\in\mathbb{N}_{0} be the number of servers of type m\in\mathcal{M} with queue length l\in\mathbb{N}_{0} in the set \texttt{select}^{N};
  •  (Y^{N}_{k,m,l}(t),m\in\mathcal{M},l\in\mathbb{N}_{0}) satisfies \sum_{m\in\mathcal{M},l\in\mathbb{N}_{0}}Y^{N}_{k,m,l}(t)=d;
  •  The probability of selecting (Y^{N}_{k,m,l}(t),m\in\mathcal{M},l\in\mathbb{N}_{0}) is
  \mathbb{P}(Y^{N}_{k,m,l}(t),m\in\mathcal{M},l\in\mathbb{N}_{0})=\prod_{m\in\mathcal{M},l\in\mathbb{N}_{0}}{X^{N}_{k,m,l}\choose Y^{N}_{k,m,l}(t)}\Big{/}{N\choose d},
  where X^{N}_{k,m,l}=N\times x^{N}_{k,m,l};
  Get l^{*}=\min(l\in\mathbb{N}_{0}:\exists\,m\in\mathcal{M}\text{ such that }Y^{N}_{k,m,l}>0);
  Assign the task to a type-m server with queue length l^{*} with probability \frac{Y^{N}_{k,m,l^{*}}}{\sum_{m^{\prime}\in\mathcal{M}}Y^{N}_{k,m^{\prime},l^{*}}};
end while
Algorithm 1: GWSQ(d)
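Operationally, the selection step in Algorithm 1 is a multivariate-hypergeometric draw: d labels are drawn without replacement from a fictitious pool of N servers in which the class (m,l) contains X_{k,m,l}=N\,x_{k,m,l} members, and the task then goes to the lowest sampled queue length, with ties over types broken proportionally to the sampled counts. The following is a minimal Python sketch of one such assignment step; the function name and the instance are illustrative assumptions, and the class counts are taken to be integers.

```python
import random
from collections import Counter

def gwsq_step(class_counts, d, rng=random):
    """One GWSQ(d) assignment step.

    class_counts: dict {(m, l): X_{k,m,l} = N * x_{k,m,l}} (assumed integer here),
                  summing to N.  Returns the (type, queue length) class that
                  receives the arriving task.
    """
    # draw d labels uniformly without replacement from the weighted pool;
    # only the (type, queue length) classes of the draws matter
    population = [cls for cls, n in class_counts.items() for _ in range(n)]
    sampled = Counter(rng.sample(population, d))          # Y_{k,m,l} in Algorithm 1
    l_star = min(l for (_, l) in sampled)                 # lowest sampled queue length
    at_l_star = {cls: y for cls, y in sampled.items() if cls[1] == l_star}
    # pick the server type proportionally to Y_{k,m,l*} among the sampled set
    classes, weights = zip(*at_l_star.items())
    return rng.choices(classes, weights=weights)[0]

# illustrative instance: N = 10 (weighted) servers over two types, queue lengths 0..2
counts = {(1, 0): 2, (1, 1): 3, (2, 0): 1, (2, 1): 2, (2, 2): 2}
print(gwsq_step(counts, d=3))
```

Under this reading, feeding the routine with the (weighted) occupancy of the system G'^N reproduces the selection probabilities stated in Algorithm 1.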

Next, we couple the evolution of the system G^{\prime N} with that of the system G^{N} by the optimal coupling method. The optimal coupling of two stochastic processes is analogous to the maximal coupling of two discrete random variables (say, X and Y), which maximizes the probability \mathbb{P}(X=Y).

    Optimal Coupling.

    Fix any NN. In both systems, within the pool of servers of each type, arrange the servers in the non-decreasing order of their queue lengths (ties are broken arbitrarily). Now, couple the evolution of the system GNG^{N} with the system GNG^{\prime N} in the following way:

    • Departure. For any mm\in\mathcal{M} and n=1,,|VmN|n=1,\ldots,|V^{N}_{m}|, synchronize the departure epochs of the nthn^{th} ordered servers of type mm in the two systems.

• Arrival. The coupling of arrivals is the tricky part. For this, first synchronize the arrival epochs at each dispatcher i in both systems G^{\prime N} and G^{N}. At an arrival epoch of dispatcher i\in W^{N}_{k}, let (\hat{x}^{N}_{i,m,l},m\in\mathcal{M},l\in\mathbb{N}_{0}) be the local queue-length distribution of dispatcher i in the system G^{N} and (x^{\prime N}_{k,m,l},m\in\mathcal{M},l\in\mathbb{N}_{0}) be the global weighted queue-length distribution of type k in the system G^{\prime N}. Then, in the system G^{N}, the probability that the task will be assigned to a server of type m\in\mathcal{M} with queue length l\in\mathbb{N}_{0} is given by

      pm,lN(i):=r=1dr1=1rr1r(|𝒩wN(i)|x^i,m,lNr1)(|𝒩wN(i)|{m}x^i,m,lNrr1)(|𝒩wN(i)|ll+1x^i,m,lNdr)(|𝒩wN(i)|d).\begin{split}p^{N}_{m,l}(i)&:=\frac{\sum_{r=1}^{d}\sum_{r_{1}=1}^{r}\frac{r_{1}}{r}{|\mathcal{N}^{N}_{w}(i)|\hat{x}^{N}_{i,m,l}\choose r_{1}}{|\mathcal{N}^{N}_{w}(i)|\sum_{\mathcal{M}\setminus\{m\}}\hat{x}^{N}_{i,m,l}\choose r-r_{1}}{|\mathcal{N}^{N}_{w}(i)|\sum_{\mathcal{M}}\sum_{l^{\prime}\geq l+1}\hat{x}^{N}_{i,m,l^{\prime}}\choose d-r}}{{|\mathcal{N}^{N}_{w}(i)|\choose d}}.\end{split} (4.22)

      In the system GNG^{\prime N}, the probability that the task will be assigned to a server of type mm\in\mathcal{M} with queue length l0l\in\mathbb{N}_{0} is given by

      pm,lN(k):=r=1dr1=1rr1r(Xk,m,lNr1)({m}Xk,m,lNrr1)(ll+1Xk,m,lNdr)(Nd).\begin{split}p^{\prime N}_{m,l}(k)&:=\frac{\sum_{r=1}^{d}\sum_{r_{1}=1}^{r}\frac{r_{1}}{r}{X^{\prime N}_{k,m,l}\choose r_{1}}{\sum_{\mathcal{M}\setminus\{m\}}X^{\prime N}_{k,m,l}\choose r-r_{1}}{\sum_{\mathcal{M}}\sum_{l^{\prime}\geq l+1}X^{\prime N}_{k,m,l^{\prime}}\choose d-r}}{{N\choose d}}.\end{split} (4.23)

      For convenience, we denote pm,lN(i)p^{N}_{m,l}(i) and pm,lN(k)p^{\prime N}_{m,l}(k) as pm,lNp^{N}_{m,l} and pm,lNp^{\prime N}_{m,l}, respectively. Denote p¯m,lN=min(pm,lN,pm,lN)\bar{p}^{N}_{m,l}=\min(p^{N}_{m,l},p^{\prime N}_{m,l}) for mm\in\mathcal{M} and l0l\in\mathbb{N}_{0}.

      Now, to couple the task assignment, let us draw a Uniform[0,1]\mathrm{Uniform}[0,1] random variable UU, independently of any other processes and across various arrival epochs. UU is used to generate the random variables (MN,LN)×0(M^{N},L^{N})\in\mathcal{M}\times\mathbb{N}_{0} and (MN,LN)×0(M^{\prime N},L^{\prime N})\in\mathcal{M}\times\mathbb{N}_{0} for the system GNG^{N} and the system GNG^{\prime N}, respectively. In the system GNG^{N}, set (MN,LN)=(m,l)×0(M^{N},L^{N})=(m,l)\in\mathcal{M}\times\mathbb{N}_{0}, if

      U[m=1m1l=0p¯m,lN+l=0l1p¯m,lN,m=1m1l=0p¯m,lN+l=0lp¯m,lN)[p¯N+m=1m1l=0(pm,lNp¯m,lN)+l=0l1(pm,lNp¯m,lN),p¯N+m=1m1l=0(pm,lNp¯m,lN)+l=0l(pm,lNp¯m,lN)),\begin{split}U\in&\Big{[}\sum_{m^{\prime}=1}^{m-1}\sum_{l^{\prime}=0}^{\infty}\bar{p}^{N}_{m^{\prime},l^{\prime}}+\sum_{l^{\prime}=0}^{l-1}\bar{p}^{N}_{m,l^{\prime}},\sum_{m^{\prime}=1}^{m-1}\sum_{l^{\prime}=0}^{\infty}\bar{p}^{N}_{m^{\prime},l^{\prime}}+\sum_{l^{\prime}=0}^{l}\bar{p}^{N}_{m,l^{\prime}}\big{)}\\ &\bigcup\big{[}\bar{p}^{N}+\sum_{m^{\prime}=1}^{m-1}\sum_{l^{\prime}=0}^{\infty}(p^{N}_{m^{\prime},l^{\prime}}-\bar{p}^{N}_{m^{\prime},l^{\prime}})+\sum_{l^{\prime}=0}^{l-1}(p^{N}_{m,l^{\prime}}-\bar{p}^{N}_{m,l^{\prime}}),\\ &\qquad\bar{p}^{N}+\sum_{m^{\prime}=1}^{m-1}\sum_{l^{\prime}=0}^{\infty}(p^{N}_{m^{\prime},l^{\prime}}-\bar{p}^{N}_{m^{\prime},l^{\prime}})+\sum_{l^{\prime}=0}^{l}(p^{N}_{m,l^{\prime}}-\bar{p}^{N}_{m,l^{\prime}})\Big{)},\end{split} (4.24)

      where p¯N=m=1Ml=0p¯m,lN\bar{p}^{N}=\sum_{m^{\prime}=1}^{M}\sum_{l^{\prime}=0}^{\infty}\bar{p}^{N}_{m^{\prime},l^{\prime}}, and assign the task to a server of type mm with queue length ll. Similarly, in the system GNG^{\prime N}, set (MN,LN)=(m,l)×0(M^{\prime N},L^{\prime N})=(m,l)\in\mathcal{M}\times\mathbb{N}_{0}, if

      U[m=1m1l=0p¯m,lN+l=0l1p¯m,lN,m=1m1l=0p¯m,lN+l=0lp¯m,lN)[p¯N+m=1m1l=0(pm,lNp¯m,lN)+l=0l1(pm,lNp¯m,lN),p¯N+m=1m1l=0(pm,lNp¯m,lN)+l=0l(pm,lNp¯m,lN)),\begin{split}U\in&\Big{[}\sum_{m^{\prime}=1}^{m-1}\sum_{l^{\prime}=0}^{\infty}\bar{p}^{N}_{m^{\prime},l^{\prime}}+\sum_{l^{\prime}=0}^{l-1}\bar{p}^{N}_{m,l^{\prime}},\sum_{m^{\prime}=1}^{m-1}\sum_{l^{\prime}=0}^{\infty}\bar{p}^{N}_{m^{\prime},l^{\prime}}+\sum_{l^{\prime}=0}^{l}\bar{p}^{N}_{m,l^{\prime}}\big{)}\\ &\bigcup\big{[}\bar{p}^{N}+\sum_{m^{\prime}=1}^{m-1}\sum_{l^{\prime}=0}^{\infty}(p^{\prime N}_{m^{\prime},l^{\prime}}-\bar{p}^{N}_{m^{\prime},l^{\prime}})+\sum_{l^{\prime}=0}^{l-1}(p^{\prime N}_{m,l^{\prime}}-\bar{p}^{N}_{m,l^{\prime}}),\\ &\qquad\bar{p}^{N}+\sum_{m^{\prime}=1}^{m-1}\sum_{l^{\prime}=0}^{\infty}(p^{\prime N}_{m^{\prime},l^{\prime}}-\bar{p}^{N}_{m^{\prime},l^{\prime}})+\sum_{l^{\prime}=0}^{l}(p^{\prime N}_{m,l^{\prime}}-\bar{p}^{N}_{m,l^{\prime}})\Big{)},\end{split} (4.25)

      and assign the task to a server of type mm with queue length ll.
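The interval construction in (4.24)–(4.25) is the standard maximal coupling of the two assignment distributions driven by a single uniform U. The following is a minimal Python sketch of that construction for two generic probability mass functions over the same label set (labels playing the role of the pairs (m,l)); the function and variable names are ours.

```python
import random

def maximal_coupling(p1, p2, u):
    """Couple two pmfs p1, p2 (dicts over the same label set) with one uniform u in [0, 1).

    Returns (X1, X2) with X1 ~ p1, X2 ~ p2 and P(X1 = X2) = sum_a min(p1(a), p2(a)),
    mirroring the interval construction in (4.24)-(4.25).
    """
    labels = sorted(set(p1) | set(p2))
    p_bar = {a: min(p1.get(a, 0.0), p2.get(a, 0.0)) for a in labels}
    matched_mass = sum(p_bar.values())
    if u < matched_mass:                 # matched region: both draws get the same label
        acc = 0.0
        for a in labels:
            acc += p_bar[a]
            if u < acc:
                return a, a
    # unmatched region: scan the residuals p_i - p_bar with the same u, per system
    def residual_draw(dist):
        acc = matched_mass
        for a in labels:
            acc += dist.get(a, 0.0) - p_bar[a]
            if u < acc:
                return a
        return labels[-1]                # guard against floating-point round-off
    return residual_draw(p1), residual_draw(p2)

# illustrative assignment distributions over (type, queue length) pairs
p_sys   = {("m1", 0): 0.5, ("m1", 1): 0.3, ("m2", 0): 0.2}
p_prime = {("m1", 0): 0.4, ("m1", 1): 0.4, ("m2", 1): 0.2}
print(maximal_coupling(p_sys, p_prime, random.random()))
```

The matched region [0,\sum_{m,l}\bar{p}^{N}_{m,l}) is exactly where the two systems assign to the same (type, queue length) pair, which is what the mismatch analysis below exploits.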

As alluded to before, the above coupling is constructed so as to maximize the probability that the two systems assign an arriving task to servers of the same type and queue length. Next, the difference between the occupancy processes of the two systems, on any finite time interval, can be upper-bounded by the number of times the two systems assign an arriving task to different (type, queue length) pairs. This is formalized by the notion of mismatch below, which was originally introduced in [19].

    Definition 4.10 (Mismatch).

    At an arrival epoch, the system GNG^{N} and the system GNG^{\prime N} are said to mismatch if (MN,LN)(MN,LN)(M^{N},L^{N})\neq(M^{\prime N},L^{\prime N}), that is, the arriving task is not assigned to servers of the same type with the same queue length in the two systems. Denote by ΔN(t)\Delta^{N}(t) the cumulative number of times the systems mismatch in queue length up to time tt.

    The next proposition provides a deterministic bound on the difference between the occupancy processes of the two systems in terms of the number of mismatches.

    Proposition 4.11.

    For any N1N\geq 1, consider the system GNG^{N} and the system GNG^{\prime N} coupled as above. Then the following holds almost surely on the coupled probability space: for t0t\geq 0,

    ml0|Qm,lN(t)Qm,lN(t)|2ΔN(t),\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|Q^{N}_{m,l}(t)-Q^{\prime N}_{m,l}(t)|\leq 2\Delta^{N}(t), (4.26)

    provided the inequality holds at t=0t=0. Qm,lN(t)Q^{N}_{m,l}(t) and Qm,lN(t)Q^{\prime N}_{m,l}(t) represent the number of servers of type mm\in\mathcal{M} with queue length at least l0l\in\mathbb{N}_{0} in the system GNG^{N} and the system GNG^{\prime N} at time tt, respectively.

Bounds of the form given in (4.26) were originally established in [19, Proposition 4] and have since been used in various contexts [18, 22]. The proof does not depend on any specific assignment policy and relies on showing inductively that if the inequality in (4.26) holds before an event epoch, then it is preserved after that epoch as well. The proof of Proposition 4.11 follows from similar arguments, and we omit the details.

    Lemma 4.12.

Suppose that \sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|Q^{N}_{m,l}-Q^{\prime N}_{m,l}|\leq 2\Delta^{N}. Then, there exist N_{0}\in\mathbb{N} and a positive constant L such that for any k\in\mathcal{K},

    ml0|xk,m,lNxk,m,lN|LΔN/N,NN0.\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|x^{N}_{k,m,l}-x^{\prime N}_{k,m,l}|\leq L\Delta^{N}/N,\quad\forall N\geq N_{0}. (4.27)
    Proof.

By the model assumption, there exists N_{0}\in\mathbb{N} such that for all N\geq N_{0}, |V^{N}_{m}|\geq\frac{1}{2}Nv_{m} for all m\in\mathcal{M}, which gives us that

    ml0|xk,m,lNxk,m,lN|=ml0vmpk,mδk|Qm,lNQm,lN|/|VmN|ml02pk,mδk|Qm,lNQm,lN|/NLΔN/N,\begin{split}\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|x^{N}_{k,m,l}-x^{\prime N}_{k,m,l}|&=\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}\frac{v_{m}p_{k,m}}{\delta_{k}}|Q^{N}_{m,l}-Q^{\prime N}_{m,l}|/|V^{N}_{m}|\\ &\leq\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}\frac{2p_{k,m}}{\delta_{k}}|Q^{N}_{m,l}-Q^{\prime N}_{m,l}|/N\leq L\Delta^{N}/N,\end{split} (4.28)

    where L=4maxk𝒦,mpk,mδkL=4\max_{k\in\mathcal{K},m\in\mathcal{M}}\frac{p_{k,m}}{\delta_{k}}. ∎

The final ingredient that we need is the probability of a mismatch at a particular arrival epoch under the optimal coupling method. The next lemma bounds this probability in terms of the \ell_{1}-distance between the LQD in the system G^{N} and the GWQD in the system G^{\prime N}.

    Lemma 4.13.

Consider an arrival epoch at dispatcher i, and assume that at this epoch, the LQD in the system G^{N} is given by (\hat{x}^{N}_{i,m,l},m\in\mathcal{M},l\in\mathbb{N}_{0}) and the GWQD of type k in the system G^{\prime N} is given by (x^{\prime N}_{k,m,l},m\in\mathcal{M},l\in\mathbb{N}_{0}). Then there exists a finite positive constant L_{1} such that for all large enough N,

    (Mismatch)L1ml0|x^i,m,lNxk,m,lN|.\mathbb{P}(\text{Mismatch})\leq L_{1}\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|\hat{x}^{N}_{i,m,l}-x^{\prime N}_{k,m,l}|. (4.29)

    The key step in the proof of Lemma 4.13 is that, given the queue-length distribution 𝐱=(xm,l,m,l0)\mathbf{x}=(x_{m,l},m\in\mathcal{M},l\in\mathbb{N}_{0}), the probability pm,lp_{m,l} that a task will be assigned to a server of type mm\in\mathcal{M} with queue length l0l\in\mathbb{N}_{0} can be approximated by

    pm,lr=1dr1=1rr1rd!r1!(rr1)!(dr)!(xm,l)r1({m}xm,l)rr1(ll+1xm,l)drp_{m,l}\approx\sum_{r=1}^{d}\sum_{r_{1}=1}^{r}\frac{r_{1}}{r}\frac{d!}{r_{1}!(r-r_{1})!(d-r)!}\big{(}x_{m,l}\big{)}^{r_{1}}\big{(}\sum_{\mathcal{M}\setminus\{m\}}x_{m,l}\big{)}^{r-r_{1}}\big{(}\sum_{\mathcal{M}}\sum_{l^{\prime}\geq l+1}x_{m,l^{\prime}}\big{)}^{d-r}

and that, for each fixed integer r\leq d, the function x\mapsto x^{r} is Lipschitz on [0,1]. The complete proof is given in Appendix E.
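To see the approximation in action, here is a minimal Python sketch evaluating the displayed expression; the distribution x below is an illustrative assumption, and the check at the end verifies that the resulting assignment probabilities sum to (approximately) one.

```python
from math import factorial

def assign_prob(x, m, l, d):
    """Approximate probability (as in the display above) that a task is sent to a
    type-m server with queue length exactly l, given the queue-length
    distribution x = {(type, length): probability} and sample size d."""
    x_ml = x.get((m, l), 0.0)
    same_l_other_type = sum(val for (m2, l2), val in x.items() if l2 == l and m2 != m)
    longer = sum(val for (m2, l2), val in x.items() if l2 > l)
    total = 0.0
    for r in range(1, d + 1):          # r sampled servers sit at the minimum level l
        for r1 in range(1, r + 1):     # r1 of them are of type m
            coef = factorial(d) / (factorial(r1) * factorial(r - r1) * factorial(d - r))
            total += (r1 / r) * coef * x_ml**r1 * same_l_other_type**(r - r1) * longer**(d - r)
    return total

x = {(1, 0): 0.3, (1, 1): 0.2, (2, 0): 0.1, (2, 1): 0.25, (2, 2): 0.15}
d = 3
print(sum(assign_prob(x, m, l, d) for (m, l) in x))   # should be close to 1.0
```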

    4.4.3 Proof of Theorem 3.10

    Now we have all the ingredients to prove Theorem 3.10. Let us explain the high-level proof scheme first.

    Step 1. Using the optimal coupling, we will show that the global occupancy processes {𝐪N()}N\{\mathbf{q}^{N}(\cdot)\}_{N} and {𝐪N()}N\{\mathbf{q}^{\prime N}(\cdot)\}_{N} must converge to the same limit process as NN\to\infty, if their initial states are the same, XjN(0)=XjN(0)X^{N}_{j}(0)=X^{\prime N}_{j}(0) for all jj. In other words, with the same initial states,

    limN𝐪N()=limN𝐪N().\lim_{N\rightarrow\infty}\mathbf{q}^{N}(\cdot)=\lim_{N\rightarrow\infty}\mathbf{q}^{\prime N}(\cdot).

Step 2. Since there is no graph structure in the system G^{\prime N}, all servers of the same type in the system G^{\prime N} are exchangeable. Hence, \mathbf{q}^{\prime N}(\cdot) is Markovian, which implies that, given \mathbf{q}^{\prime N}(0), its evolution does not depend on how the individual X^{\prime N}_{j}(0)'s are distributed. Denote the system G^{\prime N} under the i.i.d. assumption by G^{\prime N}_{1}, where the i.i.d. assumption refers to the assumption that, for any m\in\mathcal{M}, X^{\prime N}_{j}(0), j\in V^{N}_{m}, are i.i.d. Also, denote the system G^{\prime N} without the i.i.d. assumption by G^{\prime N}_{2}. Their occupancy processes are \mathbf{q}^{\prime N}_{1}(\cdot) and \mathbf{q}^{\prime N}_{2}(\cdot), respectively. Since the task assignment policy in G^{\prime N} does not distinguish between two servers of the same type with the same queue length, by a natural coupling, \mathbf{q}^{\prime N}_{1}(t)=\mathbf{q}^{\prime N}_{2}(t) holds for all t\geq 0, implying that

    limN𝐪1N()=limN𝐪2N().\lim_{N\rightarrow\infty}\mathbf{q}^{\prime N}_{1}(\cdot)=\lim_{N\rightarrow\infty}\mathbf{q}^{\prime N}_{2}(\cdot).

Step 3. Denote the system G^{N} under the i.i.d. assumption by G^{N}_{1} and the system G^{N} without the i.i.d. assumption by G^{N}_{2}, and denote their occupancy processes by \mathbf{q}^{N}_{1}(\cdot) and \mathbf{q}^{N}_{2}(\cdot), respectively. Combining Step 1 and Step 2, the following holds with the same initial global occupancy state:

    limN𝐪1N()=limN𝐪1N()=limN𝐪2N()=limN𝐪2N(),\lim_{N\rightarrow\infty}\mathbf{q}^{N}_{1}(\cdot)=\lim_{N\rightarrow\infty}\mathbf{q}^{\prime N}_{1}(\cdot)=\lim_{N\rightarrow\infty}\mathbf{q}^{\prime N}_{2}(\cdot)=\lim_{N\rightarrow\infty}\mathbf{q}^{N}_{2}(\cdot),

    where the first and last equalities are due to Step 1 and the second equality is due to Step 2.

Step 4. By Theorem 3.6, when the sequence \{G^{N}\}_{N} satisfies the assumption that, for each m\in\mathcal{M}, X^{N}_{j}(0), j\in V^{N}_{m}, are i.i.d., the scaled global occupancy process \mathbf{q}^{N} converges weakly to \mathbf{q} described by the system of ODEs in (3.8).

    Step 5. By Steps 3 and 4, Theorem 3.10 holds.

    In the above proof scheme, observe that all that remains is to show Step 1, which is given below.

    Proof of Theorem 3.10.

    For Step 1 described in the proof scheme above, by Proposition 4.11, it is sufficient to show that for any ε>0\varepsilon^{*}>0 and δ>0\delta^{*}>0, there exists an N01N_{0}\geq 1 such that

    (supt[0,T]ΔN(t)/Nε)δ,NN0.\mathbb{P}\big{(}\sup_{t\in[0,T]}\Delta^{N}(t)/N\geq\varepsilon^{*}\big{)}\leq\delta^{*},\quad\forall N\geq N_{0}. (4.30)

    Fix an ε>0\varepsilon>0, which will be chosen later. Let 𝒢Nε(t)\mathscr{G}^{\varepsilon}_{N}(t) and Nε(t)\mathscr{B}^{\varepsilon}_{N}(t) be the number of ε\varepsilon-good and ε\varepsilon-bad dispatchers in the system GNG^{N} at time tt, respectively. We couple the evolution of the system GNG^{N} with that of the system GNG^{\prime N} by the optimal coupling method. In system GNG^{N}, let (xk,m,lN(t),m,l0)(x^{N}_{k,m,l}(t),m\in\mathcal{M},l\in\mathbb{N}_{0}) be the global weighted queue-length distribution of type k𝒦k\in\mathcal{K} and (x^i,m,lN(t),m,l0)(\hat{x}^{N}_{i,m,l}(t),m\in\mathcal{M},l\in\mathbb{N}_{0}) be the local queue-length distribution of the dispatcher iWkNi\in W^{N}_{k}, k𝒦k\in\mathcal{K}. Also, let (xk,m,lN(t),m,l0)(x^{\prime N}_{k,m,l}(t),m\in\mathcal{M},l\in\mathbb{N}_{0}) be the global weighted queue-length distribution of type k𝒦k\in\mathcal{K} in system GNG^{\prime N}. Denote ρkN(t)=ml0|xk,m,lN(t)xk,m,lN(t)|\rho^{N}_{k}(t)=\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|x^{N}_{k,m,l}(t)-x^{\prime N}_{k,m,l}(t)|. At an arrival epoch t0t\geq 0, if a task arrives at an ε\varepsilon-good dispatcher iWkNi\in W^{N}_{k}, then

    ml0|x^i,m,lN(t)xk,m,lN(t)|\displaystyle\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|\hat{x}^{N}_{i,m,l}(t-)-x^{\prime N}_{k,m,l}(t-)|
\leq\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|\hat{x}^{N}_{i,m,l}(t-)-x^{N}_{k,m,l}(t-)|+\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|x^{N}_{k,m,l}(t-)-x^{\prime N}_{k,m,l}(t-)|\leq\varepsilon+\rho^{N}_{k}(t-). (4.31)

Recall the uniform random variable U and \bar{p}^{N}_{m,l} defined in the description of the optimal coupling method. The probability that the systems have a mismatch at such an arrival epoch is bounded by

    (U[0,ml0p¯m,lN])=1ml0p¯m,lN=ml0pm,lNml0p¯m,lNml0|pm,lNpm,lN|L1ml0|x^i,m,lN(t)xk,m,lN|L1(ρkN(t)+ε),\begin{split}\mathbb{P}\Big{(}U\notin[0,\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}\bar{p}^{N}_{m,l}]\Big{)}&=1-\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}\bar{p}^{N}_{m,l}=\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}p^{\prime N}_{m,l}-\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}\bar{p}^{N}_{m,l}\\ &\leq\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|p^{\prime N}_{m,l}-p^{N}_{m,l}|\leq L_{1}\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|\hat{x}^{N}_{i,m,l}(t-)-x^{\prime N}_{k,m,l}|\\ &\leq L_{1}(\rho^{N}_{k}(t)+\varepsilon),\end{split} (4.32)

where the second inequality is from Lemma 4.13. At an arrival epoch t\geq 0, if a task arrives at an \varepsilon-bad dispatcher i\in W^{N}_{k}, we simply bound the mismatch probability by one. By the Poisson thinning property, we can construct an independent unit-rate Poisson process (Z(t))_{t\geq 0} such that \Delta^{N}(t) can be upper bounded by a random time change of Z as follows: for all t\in[0,T],

    ΔN(t)Z(k𝒦iWkNλ0t[𝟙(i𝒢Nε(s))L1(ρkN(s)+ε)+𝟙(iNε(s))1]𝑑s)Z(k𝒦iWkNλ0t[𝟙(i𝒢Nε(s))L1(LΔN(s)/N+ε)+𝟙(iNε(s))1]𝑑s)=Z(λ0t[𝒢Nε(s)L1(LΔN(s)/N+ε)+Nε(s)1]𝑑s),\begin{split}\Delta^{N}(t)&\leq Z\Big{(}\sum_{k\in\mathcal{K}}\sum_{i\in W^{N}_{k}}\lambda\int_{0}^{t}[\mathds{1}_{(i\in\mathscr{G}^{\varepsilon}_{N}(s-))}L_{1}(\rho^{N}_{k}(s-)+\varepsilon)+\mathds{1}_{(i\in\mathscr{B}^{\varepsilon}_{N}(s-))}\cdot 1]ds\Big{)}\\ &\leq Z\Big{(}\sum_{k\in\mathcal{K}}\sum_{i\in W^{N}_{k}}\lambda\int_{0}^{t}[\mathds{1}_{(i\in\mathscr{G}^{\varepsilon}_{N}(s-))}L_{1}(L\Delta^{N}(s-)/N+\varepsilon)+\mathds{1}_{(i\in\mathscr{B}^{\varepsilon}_{N}(s-))}\cdot 1]ds\Big{)}\\ &=Z\Big{(}\lambda\int_{0}^{t}[\mathscr{G}^{\varepsilon}_{N}(s-)L_{1}(L\Delta^{N}(s-)/N+\varepsilon)+\mathscr{B}^{\varepsilon}_{N}(s-)\cdot 1]ds\Big{)},\end{split} (4.33)

    where the second inequality is due to Lemma 4.12. By Proposition 4.9, we have that for any ε>0\varepsilon^{\prime}>0, there exists an N(ε)N(\varepsilon^{\prime}) such that for all NN(ε)N\geq N(\varepsilon^{\prime}),

    (supt[0,T]Nε(t)ε|WN|)ε2.\mathbb{P}(\sup_{t\in[0,T]}\mathscr{B}^{\varepsilon}_{N}(t)\geq\varepsilon^{\prime}|W^{N}|)\leq\frac{\varepsilon^{\prime}}{2}. (4.34)

    Hence, by (4.33), (4.34), and Tonelli’s theorem, we have that for all NN(ε)N\geq N(\varepsilon^{\prime}) and t[0,T]t\in[0,T],

    𝔼(ΔN(t)N)λ0t[L1(LW(N)N𝔼(ΔN(s))N+ε)+W(N)N3ε2]𝑑s.\mathbb{E}\Big{(}\frac{\Delta^{N}(t)}{N}\Big{)}\leq\lambda\int_{0}^{t}\Big{[}L_{1}\Big{(}L\frac{W(N)}{N}\frac{\mathbb{E}(\Delta^{N}(s-))}{N}+\varepsilon\Big{)}+\frac{W(N)}{N}\frac{3\varepsilon^{\prime}}{2}\Big{]}ds. (4.35)

Also, by the assumption that \lim_{N\rightarrow\infty}\frac{W(N)}{N}=\xi, there exists N_{0} such that \frac{W(N)}{N}\leq 2\xi for all N\geq N_{0}. Hence, we have that for all N\geq\max(N(\varepsilon^{\prime}),N_{0}) and t\in[0,T],

    𝔼(ΔN(t)N)λ0t[L1(2Lξ𝔼(ΔN(s))N+ε)+3ξε]𝑑s.\mathbb{E}\Big{(}\frac{\Delta^{N}(t)}{N}\Big{)}\leq\lambda\int_{0}^{t}\Big{[}L_{1}\Big{(}2L\xi\frac{\mathbb{E}(\Delta^{N}(s-))}{N}+\varepsilon\Big{)}+3\xi\varepsilon^{\prime}\Big{]}ds. (4.36)

    By applying Grönwall’s inequality to (4.36), we have

    𝔼(ΔN(t)N)λ(L1ε+3ξε)texp(2LL1ξλt).\mathbb{E}\Big{(}\frac{\Delta^{N}(t)}{N}\Big{)}\leq\lambda(L_{1}\varepsilon+3\xi\varepsilon^{\prime})t\exp(2LL_{1}\xi\lambda t). (4.37)

Since \Delta^{N}(\cdot) is nondecreasing, \sup_{t\in[0,T]}\Delta^{N}(t)=\Delta^{N}(T); hence, by Markov's inequality and (4.37), we have

\mathbb{P}\big{(}\sup_{t\in[0,T]}\Delta^{N}(t)/N\geq\varepsilon^{*}\big{)}\leq\frac{1}{\varepsilon^{*}}\lambda(L_{1}\varepsilon+3\xi\varepsilon^{\prime})T\exp(2LL_{1}\xi\lambda T) (4.38)

    and we can choose small enough ε\varepsilon and ε\varepsilon^{\prime} such that (4.30) holds. ∎

    5 Proof of Interchange of Limits

    5.1 Properties of the Limiting System of ODEs

First, we define a fixed point of the ODE (3.8). Recall that \delta_{k}=\sum_{m\in\mathcal{M}}p_{k,m}v_{m} and \tilde{q}_{k,l}(t)=\sum_{m\in\mathcal{M}}\frac{v_{m}p_{k,m}}{\delta_{k}}q_{m,l}(t). We call \mathbf{q}^{*}=(q^{*}_{m,l}\in\mathbb{R}_{+},m\in\mathcal{M},l\in\mathbb{N}_{0}) a fixed point of the ODE (3.8) if, for all m\in\mathcal{M} and l\geq 1,

    um(qm,lqm,l+1)=λξ(qm,l1qm,l)k𝒦pk,mwkδk(q~k,l1)d(q~k,l)dq~k,l1q~k,l,u_{m}(q^{*}_{m,l}-q^{*}_{m,l+1})=\lambda\xi(q^{*}_{m,l-1}-q^{*}_{m,l})\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_{k}}{\delta_{k}}\frac{(\tilde{q}^{*}_{k,l-1})^{d}-(\tilde{q}^{*}_{k,l})^{d}}{\tilde{q}^{*}_{k,l-1}-\tilde{q}^{*}_{k,l}}, (5.1)

with q^{*}_{m,0}=1, m\in\mathcal{M}. The next proposition shows some important properties of the fixed point \mathbf{q}^{*} of the ODE (3.8).

    Proposition 5.1.

If there exists a fixed point \mathbf{q}^{*} of the ODE (3.8) such that, for each m\in\mathcal{M}, q^{*}_{m,0}=1 and q^{*}_{m,l}\xrightarrow{l\rightarrow\infty}0, then for each m\in\mathcal{M}, the sequence \{q^{*}_{m,l},l\in\mathbb{N}_{0}\} decreases doubly exponentially.

    The proof of Proposition 5.1 is provided in Appendix F. The key observation used in the proof is that by (5.1), qm,lq^{*}_{m,l} can be expressed in terms of qm,l1q^{*}_{m,l-1} and qm,l2q^{*}_{m,l-2}. Thus, we can recursively characterize the values of qm,lq^{*}_{m,l}, l2l\geq 2, if we know qm,0q^{*}_{m,0} and qm,1q^{*}_{m,1}, mm\in\mathcal{M}.
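For concreteness, (5.1) can be iterated forward in l: q^{*}_{m,l+1}=q^{*}_{m,l}-\frac{\lambda\xi}{u_{m}}(q^{*}_{m,l-1}-q^{*}_{m,l})\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_{k}}{\delta_{k}}\frac{(\tilde{q}^{*}_{k,l-1})^{d}-(\tilde{q}^{*}_{k,l})^{d}}{\tilde{q}^{*}_{k,l-1}-\tilde{q}^{*}_{k,l}}. The following is a minimal Python sketch of this recursion. The instance is deliberately symmetric (identical service rates and full compatibility), an illustrative assumption under which the classical JSQ(d) fixed point (\lambda\xi)^{(d^{l}-1)/(d-1)} is the natural reference, so the doubly exponential decay of Proposition 5.1 is visible directly; all parameter values and helper names are ours.

```python
def q_tilde(q_l, v, p, delta, k):
    # \tilde q_{k,l} = sum_m (v_m p_{k,m} / delta_k) q_{m,l}
    return sum(v[m] * p[(k, m)] / delta[k] * q_l[m] for m in v)

def fixed_point_levels(q1, v, u, w, p, lam_xi, d, L):
    """Iterate (5.1): from q*_{m,0} = 1 and a candidate q*_{m,1} = q1[m],
    compute q*_{m,l} for l = 2, ..., L."""
    delta = {k: sum(p[(k, m)] * v[m] for m in v) for k in w}
    q = [{m: 1.0 for m in v}, dict(q1)]
    for l in range(1, L):
        nxt = {}
        for m in v:
            s = 0.0
            for k in w:
                a, b = q_tilde(q[l - 1], v, p, delta, k), q_tilde(q[l], v, p, delta, k)
                ratio = d * a ** (d - 1) if abs(a - b) < 1e-12 else (a**d - b**d) / (a - b)
                s += p[(k, m)] * w[k] / delta[k] * ratio
            nxt[m] = q[l][m] - lam_xi / u[m] * (q[l - 1][m] - q[l][m]) * s
        q.append(nxt)
    return q

# symmetric sanity check: two identical server types, one dispatcher type
rho, d = 0.8, 2
q = fixed_point_levels({1: rho, 2: rho}, v={1: 0.5, 2: 0.5}, u={1: 1.0, 2: 1.0},
                       w={1: 1.0}, p={(1, 1): 1.0, (1, 2): 1.0},
                       lam_xi=rho, d=d, L=6)
for l, q_l in enumerate(q):
    target = rho ** ((d**l - 1) // (d - 1))
    assert all(abs(q_l[m] - target) < 1e-12 for m in (1, 2))
print("recursion (5.1) reproduces the doubly exponentially decaying tail")
```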

By Proposition 5.1, we know that if \mathbf{q}^{*} is a fixed point of the ODE (3.8) and, for all m\in\mathcal{M}, q^{*}_{m,l}\xrightarrow{l\rightarrow\infty}0, then such a \mathbf{q}^{*} must be in \mathcal{S}, so we only need to show that such a \mathbf{q}^{*} exists. For the proof of existence, we need a technical lemma, which will be used in (5.4).

    Lemma 5.2.

    Consider a sequence {GN}N\{G^{N}\}_{N} satisfying Condition 2.1. If {GN}N\{G^{N}\}_{N} is proportionally sparse and in the subcritical regime, then for any (α1,,αM)[0,1]M(\alpha_{1},...,\alpha_{M})\in[0,1]^{M} with mαm>0\sum_{m\in\mathcal{M}}\alpha_{m}>0, the following holds:

    (mαmvmum)1λξk𝒦wk(mαmpk,mvmδk)dρ<1.\Big{(}\sum_{m\in\mathcal{M}}\alpha_{m}v_{m}u_{m}\Big{)}^{-1}\lambda\xi\sum_{k\in\mathcal{K}}w_{k}\Big{(}\frac{\sum_{m\in\mathcal{M}}\alpha_{m}p_{k,m}v_{m}}{\delta_{k}}\Big{)}^{d}\leq\rho<1. (5.2)

    The proof of Lemma 5.2 is provided in Appendix G.

    Proof of Theorem 3.11.

We prove the existence of the fixed point first. From (5.1), we know that if (q^{*}_{m,1},m\in\mathcal{M}) are fixed, then all (q^{*}_{m,l},m\in\mathcal{M},l\geq 2) are determined as well. Hence, \mathbf{q}^{*} can be viewed as a function of (q^{*}_{m,1},m\in\mathcal{M}). Moreover, in steady state, \sum_{m\in\mathcal{M}}v_{m}u_{m}q^{*}_{m,1}=\lambda\xi, which implies that q^{*}_{M,1} is determined by the values of q^{*}_{m,1}, m\in\mathcal{M}\setminus\{M\}. Hence, we construct the sequence \mathbf{q}(\bar{\alpha})=(q_{m,l}(\bar{\alpha}),m\in\mathcal{M},l\in\mathbb{N}_{0}) as a function of the vector \bar{\alpha}=(\alpha_{1},...,\alpha_{M-1})\in(0,1)^{M-1} as follows:

    qm,0(α¯)=1,m,\displaystyle q_{m,0}(\bar{\alpha})=1,\forall m\in\mathcal{M},
    qm,1(α¯)=αm,m{M},andqM,1=λξm{M}αmvmumvMuM,\displaystyle q_{m,1}(\bar{\alpha})=\alpha_{m},\quad m\in\mathcal{M}\setminus\{M\},\quad\mbox{and}\quad q_{M,1}=\frac{\lambda\xi-\sum_{m\in\mathcal{M}\setminus\{M\}}\alpha_{m}v_{m}u_{m}}{v_{M}u_{M}},
    um(qm,l(α¯)qm,l+1(α¯))=λξ(qm,l1(α¯)qm,l(α¯))k𝒦pk,mwkδk(q~k,l1(α¯))d(q~k,l(α¯))dq~k,l1(α¯)q~k,l(α¯),l1.\displaystyle u_{m}(q_{m,l}(\bar{\alpha})-q_{m,l+1}(\bar{\alpha}))=\lambda\xi(q_{m,l-1}(\bar{\alpha})-q_{m,l}(\bar{\alpha}))\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_{k}}{\delta_{k}}\frac{(\tilde{q}_{k,l-1}(\bar{\alpha}))^{d}-(\tilde{q}_{k,l}(\bar{\alpha}))^{d}}{\tilde{q}_{k,l-1}(\bar{\alpha})-\tilde{q}_{k,l}(\bar{\alpha})},l\geq 1. (5.3)

Since, for all m\in\mathcal{M}, q_{m,1}(\bar{\alpha}) must lie in (0,1), the vector \bar{\alpha}=(\alpha_{1},...,\alpha_{M-1}) must lie in the polyhedron \mathbf{P}_{1} defined as follows:

    𝐏1{αm(max(0,λξm{m}vmumvmum),min(λξvmum,1)),mM1, and λξvMuM<mM1αmvmum<λξ}.\begin{split}&\mathbf{P}_{1}\coloneqq\Big{\{}\alpha_{m}\in\big{(}\max(0,\frac{\lambda\xi-\sum_{m^{\prime}\in\mathcal{M}\setminus\{m\}}v_{m^{\prime}}u_{m^{\prime}}}{v_{m}u_{m}}),\min(\frac{\lambda\xi}{v_{m}u_{m}},1)\big{)},\forall m\leq M-1,\\ &\hskip 170.71652pt\text{ and }\lambda\xi-v_{M}u_{M}<\sum_{m\leq M-1}\alpha_{m}v_{m}u_{m}<\lambda\xi\Big{\}}.\end{split}

    For all α¯𝐏1\bar{\alpha}\in\mathbf{P}_{1}, we have 1=qm,0(α¯)>qm,1(α¯)>01=q_{m,0}(\bar{\alpha})>q_{m,1}(\bar{\alpha})>0, m\forall m\in\mathcal{M}. Consider l=2l=2. By (5.3), we have that when αm=0\alpha_{m}=0, m{M}m\in\mathcal{M}\setminus\{M\},

    um(0qm,2(α¯))=λξ(10)k𝒦pk,mwkδk1(q~k,1(α¯))d1q~k,1(α¯)u_{m}(0-q_{m,2}(\bar{\alpha}))=\lambda\xi(1-0)\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_{k}}{\delta_{k}}\frac{1-(\tilde{q}_{k,1}(\bar{\alpha}))^{d}}{1-\tilde{q}_{k,1}(\bar{\alpha})}

    implying that qm,2(α¯)<0q_{m,2}(\bar{\alpha})<0; when αm=1\alpha_{m}=1, m{M}m\in\mathcal{M}\setminus\{M\},

    um(1qm,2(α¯))=0u_{m}(1-q_{m,2}(\bar{\alpha}))=0

    implying that qm,2(α¯)=1>0q_{m,2}(\bar{\alpha})=1>0; when αm=λξvmum\alpha_{m}=\frac{\lambda\xi}{v_{m}u_{m}}, m{M}m\in\mathcal{M}\setminus\{M\},

    um(λξvmumqm,2(α¯))=λξ(1λξvmum)k𝒦pk,mwkδk1(q~k,1(α¯))d1q~k,1(α¯)u_{m}(\frac{\lambda\xi}{v_{m}u_{m}}-q_{m,2}(\bar{\alpha}))=\lambda\xi(1-\frac{\lambda\xi}{v_{m}u_{m}})\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_{k}}{\delta_{k}}\frac{1-(\tilde{q}_{k,1}(\bar{\alpha}))^{d}}{1-\tilde{q}_{k,1}(\bar{\alpha})}

    implying that

    qm,2(α¯)=λξvmumλξum(1λξvmum)k𝒦pk,mwkδk1(q~k,1(α¯))d1q~k,1(α¯)>λξvmumλξum(1λξvmum)k𝒦pk,mwkδk>λξvmumλξum(1λξvmum)k𝒦pk,mwkpk,mvm=λξvmumλξvmum(1λξvmum)>0.\begin{split}q_{m,2}(\bar{\alpha})&=\frac{\lambda\xi}{v_{m}u_{m}}-\frac{\lambda\xi}{u_{m}}(1-\frac{\lambda\xi}{v_{m}u_{m}})\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_{k}}{\delta_{k}}\frac{1-(\tilde{q}_{k,1}(\bar{\alpha}))^{d}}{1-\tilde{q}_{k,1}(\bar{\alpha})}\\ &>\frac{\lambda\xi}{v_{m}u_{m}}-\frac{\lambda\xi}{u_{m}}(1-\frac{\lambda\xi}{v_{m}u_{m}})\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_{k}}{\delta_{k}}\\ &>\frac{\lambda\xi}{v_{m}u_{m}}-\frac{\lambda\xi}{u_{m}}(1-\frac{\lambda\xi}{v_{m}u_{m}})\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_{k}}{p_{k,m}v_{m}}\\ &=\frac{\lambda\xi}{v_{m}u_{m}}-\frac{\lambda\xi}{v_{m}u_{m}}(1-\frac{\lambda\xi}{v_{m}u_{m}})>0.\end{split}

Let r_{m,1}, m\leq M-1, be the maximum number satisfying the following:

(1) r_{m,1}<\min(\frac{\lambda\xi}{v_{m}u_{m}},1);

(2) \exists\,\bar{\alpha}\in\mathbf{P}_{1} with \alpha_{m}=r_{m,1} such that q_{m,2}(\bar{\alpha})=0.

    Define 𝐏1𝐏1\mathbf{P}^{\prime}_{1}\subseteq\mathbf{P}_{1} as the following:

    𝐏1{αm(max(rm,1,λξmM1vmumvmum),min(λξvmum,1)),mM1, and λξvMuM<mM1αmvmum<λξ}.\begin{split}&\mathbf{P}^{\prime}_{1}\coloneqq\Big{\{}\alpha_{m}\in\big{(}\max(r_{m,1},\frac{\lambda\xi-\sum_{m\leq M-1}v_{m^{\prime}}u_{m^{\prime}}}{v_{m}u_{m}}),\min(\frac{\lambda\xi}{v_{m}u_{m}},1)\big{)},\forall m\leq M-1,\\ &\hskip 184.9429pt\text{ and }\lambda\xi-v_{M}u_{M}<\sum_{m\leq M-1}\alpha_{m}v_{m}u_{m}<\lambda\xi\Big{\}}.\end{split}

Again, by using (5.3), we get that when \sum_{m\leq M-1}\alpha_{m}v_{m}u_{m}=\lambda\xi-v_{M}u_{M} (i.e., q_{M,1}(\bar{\alpha})=1),

    uM(1qM,2(α¯))=0u_{M}(1-q_{M,2}(\bar{\alpha}))=0

implying that q_{M,2}(\bar{\alpha})=1>0; when \sum_{m\leq M-1}\alpha_{m}v_{m}u_{m}=\lambda\xi (i.e., q_{M,1}(\bar{\alpha})=0),

    uM(0qM,2(α¯))=λξ(10)k𝒦pk,Mwkδk1(q~k,1(α¯))d1q~k,1(α¯)u_{M}(0-q_{M,2}(\bar{\alpha}))=\lambda\xi(1-0)\sum_{k\in\mathcal{K}}\frac{p_{k,M}w_{k}}{\delta_{k}}\frac{1-(\tilde{q}_{k,1}(\bar{\alpha}))^{d}}{1-\tilde{q}_{k,1}(\bar{\alpha})}

    implying that qM,2(α¯)<0q_{M,2}(\bar{\alpha})<0.

Let r_{1} be the minimum number satisfying the following:

1. r_{1}<\lambda\xi;

2. There exists \bar{\alpha}\in\mathbf{P}^{\prime}_{1} such that \sum_{m\leq M-1}\alpha_{m}v_{m}u_{m}=r_{1} and q_{M,2}(\bar{\alpha})=0.

    Define 𝐏2𝐏1𝐏1\mathbf{P}_{2}\subseteq\mathbf{P}^{\prime}_{1}\subseteq\mathbf{P}_{1} as the following:

    𝐏2{αm(max(rm,1,λξmM1vmumvmum),min(r1vmum,1)),mM1 and λξvMuMmM1αmvmumr1}.\begin{split}&\mathbf{P}_{2}\coloneqq\Big{\{}\alpha_{m}\in\big{(}\max(r_{m,1},\frac{\lambda\xi-\sum_{m^{\prime}\leq M-1}v_{m^{\prime}}u_{m^{\prime}}}{v_{m}u_{m}}),\min(\frac{r_{1}}{v_{m}u_{m}},1)\big{)},\forall m\leq M-1\\ &\hskip 170.71652pt\text{ and }\lambda\xi-v_{M}u_{M}\leq\sum_{m\leq M-1}\alpha_{m}v_{m}u_{m}\leq r_{1}\Big{\}}.\end{split}

Hence, for all \bar{\alpha}\in\mathbf{P}_{2}, we have 1=q_{m,0}(\bar{\alpha})>q_{m,1}(\bar{\alpha})>q_{m,2}(\bar{\alpha})>0, \forall m\in\mathcal{M}. Continuing this process, we can define a sequence \{\mathbf{P}_{1}\supseteq\mathbf{P}_{2}\supseteq\cdots\} of polyhedra such that for all \bar{\alpha}\in\mathbf{P}_{n}, we have 1=q_{m,0}(\bar{\alpha})>q_{m,1}(\bar{\alpha})>\cdots>q_{m,n}(\bar{\alpha})>0, \forall m\in\mathcal{M}. Thus, for some \bar{\alpha}, the sequences \{q_{m,l}(\bar{\alpha})\}_{l\in\mathbb{N}_{0}}, m\in\mathcal{M}, are decreasing. Since q_{m,l}(\bar{\alpha})\geq 0 for all m\in\mathcal{M} and l\in\mathbb{N}_{0}, for each m\in\mathcal{M} there exists x^{*}_{m} such that \lim_{l\rightarrow\infty}q_{m,l}(\bar{\alpha})=x^{*}_{m}. Next, we need to show that x^{*}_{m}=0 for all m\in\mathcal{M}. By (F.2), we have

    mvmumxm=λξk𝒦wk(mpk,mvmδkxm)d.\sum_{m\in\mathcal{M}}v_{m}u_{m}x^{*}_{m}=\lambda\xi\sum_{k\in\mathcal{K}}w_{k}\big{(}\sum_{m\in\mathcal{M}}\frac{p_{k,m}v_{m}}{\delta_{k}}x^{*}_{m}\big{)}^{d}. (5.4)

    Clearly, xm=0x^{*}_{m}=0, m\forall m\in\mathcal{M} is a solution of (5.4). It must be the unique solution, since by Lemma 5.2, for all (xm,m)[0,1]M(x^{*}_{m},m\in\mathcal{M})\in[0,1]^{M} with mxm>0\sum_{m\in\mathcal{M}}x^{*}_{m}>0,

    (mvmumxm)1λξk𝒦wk(mpk,mvmδkxm)d<1(\sum_{m\in\mathcal{M}}v_{m}u_{m}x^{*}_{m})^{-1}\lambda\xi\sum_{k\in\mathcal{K}}w_{k}\big{(}\sum_{m\in\mathcal{M}}\frac{p_{k,m}v_{m}}{\delta_{k}}x^{*}_{m}\big{)}^{d}<1

implying that (5.4) cannot hold. Now, set q^{*}_{m,l}=q_{m,l}(\bar{\alpha}) for all m\in\mathcal{M} and l\in\mathbb{N}_{0}.
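To see numerically why (5.4) pins down x^{*}=0, here is a minimal Python sketch on a subcritical toy instance (the parameters are assumptions for the example only; the general statement is exactly Lemma 5.2): on a grid of candidate x^{*}, the right-hand side of (5.4) stays strictly below the left-hand side whenever \sum_{m}x^{*}_{m}>0.

```python
import itertools

# illustrative subcritical instance (assumed parameters for this example only)
v, u, w = {1: 0.5, 2: 0.5}, {1: 2.0, 2: 1.0}, {1: 1.0}
p, lam_xi, d = {(1, 1): 1.0, (1, 2): 1.0}, 0.9, 2
delta = {k: sum(p[(k, m)] * v[m] for m in v) for k in w}

def gap(x):
    # right-hand side minus left-hand side of (5.4); a strictly negative value
    # rules x out as a solution of (5.4)
    rhs = lam_xi * sum(w[k] * sum(p[(k, m)] * v[m] / delta[k] * x[m] for m in v) ** d
                       for k in w)
    return rhs - sum(v[m] * u[m] * x[m] for m in v)

grid = [i / 20 for i in range(21)]
assert all(gap({1: x1, 2: x2}) < 0
           for x1, x2 in itertools.product(grid, grid) if x1 + x2 > 0)
print("on this instance, x* = 0 is the only candidate solution of (5.4) on the grid")
```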

Now, we show uniqueness. The proof is based on a monotonicity property of the system, which is stated in the following claim.

    Claim 5.3.

    If 𝐪𝐪^\mathbf{q}\leq\hat{\mathbf{q}} for 𝐪,𝐪^𝒮\mathbf{q},\hat{\mathbf{q}}\in\mathcal{S}, then 𝐪¯(t,𝐪)𝐪¯(t,𝐪^)\bar{\mathbf{q}}(t,\mathbf{q})\leq\bar{\mathbf{q}}(t,\hat{\mathbf{q}}) for all tt.

    Proof.

Consider any \mathbf{q}\leq\hat{\mathbf{q}} with \mathbf{q},\hat{\mathbf{q}}\in\mathcal{S}. It is easy to construct two copies of the N-th system with initial states \{X^{N}_{j}(0),j\in V^{N}\} and \{\hat{X}^{N}_{j}(0),j\in V^{N}\} satisfying:

1. For all j\in V^{N}, X^{N}_{j}(0)\leq\hat{X}^{N}_{j}(0);

2. \{X^{N}_{j}(0),j\in V^{N}\} has the corresponding global occupancy \mathbf{q}^{N}(0)=\mathbf{q}\in\mathcal{S}; similarly, \{\hat{X}^{N}_{j}(0),j\in V^{N}\} has \hat{\mathbf{q}}^{N}(0)=\hat{\mathbf{q}}\in\mathcal{S}.

By a natural coupling, we have that, for all j\in V^{N} and t\geq 0, X^{N}_{j}(t)\leq\hat{X}^{N}_{j}(t), which implies \mathbf{q}^{N}(t)\leq\hat{\mathbf{q}}^{N}(t). Since the systems are stable, \mathbf{q}^{N}(t),\hat{\mathbf{q}}^{N}(t)\in\mathcal{S} for all t\geq 0. Moreover, by Theorem 3.6, the claim follows. ∎

We continue the proof of uniqueness. It is sufficient to show that \lim_{t\rightarrow\infty}\bar{\mathbf{q}}(t,\mathbf{q}_{0})=\mathbf{q}^{*} for initial states \mathbf{q}_{0} with either \mathbf{q}_{0}\leq\mathbf{q}^{*} or \mathbf{q}_{0}\geq\mathbf{q}^{*} component-wise, since Claim 5.3 implies that

    𝐪¯(t,min(𝐪0,𝐪))𝐪¯(t,𝐪0)𝐪¯(t,max(𝐪0,𝐪)),𝐪0𝐒¯,t0.\bar{\mathbf{q}}(t,\min(\mathbf{q}_{0},\mathbf{q}^{*}))\leq\bar{\mathbf{q}}(t,\mathbf{q}_{0})\leq\bar{\mathbf{q}}(t,\max(\mathbf{q}_{0},\mathbf{q}^{*})),\quad\forall\mathbf{q}_{0}\in\bar{\mathbf{S}},t\geq 0.

We will prove that if \mathbf{q}_{0}\leq\mathbf{q}^{*}, then

    limt𝐪¯(t,𝐪0)=𝐪.\lim_{t\rightarrow\infty}\bar{\mathbf{q}}(t,\mathbf{q}_{0})=\mathbf{q}^{*}.

The case \mathbf{q}_{0}\geq\mathbf{q}^{*} is similar. Also, note that q_{m,l}(\infty), m\in\mathcal{M}, l\geq 2, can be solved recursively from (5.1) once q_{m,1}(\infty), m\in\mathcal{M}, are determined, so it is sufficient to show that q_{m,1}(\infty)=q^{*}_{m,1} for all m\in\mathcal{M}. By the ODE (3.8), we have

    dmvmqm,1(t)dt=mvmumqm,1(t)+λξ.\frac{d\sum_{m\in\mathcal{M}}v_{m}q_{m,1}(t)}{dt}=-\sum_{m\in\mathcal{M}}v_{m}u_{m}q_{m,1}(t)+\lambda\xi.

Since \mathbf{q}_{0}\leq\mathbf{q}^{*}, we have \bar{\mathbf{q}}(t,\mathbf{q}_{0})\leq\bar{\mathbf{q}}(t,\mathbf{q}^{*})=\mathbf{q}^{*}. Observe that \sum_{m\in\mathcal{M}}v_{m}u_{m}q^{*}_{m,1}=\lambda\xi. Hence, if q_{m,1}(t)<q^{*}_{m,1} for some m\in\mathcal{M}, then \frac{d\sum_{m\in\mathcal{M}}v_{m}q_{m,1}(t)}{dt}>0, which implies that

    limtmvmqm,1(t)=mvmqm,1()=λξ.\lim_{t\rightarrow\infty}\sum_{m\in\mathcal{M}}v_{m}q_{m,1}(t)=\sum_{m\in\mathcal{M}}v_{m}q_{m,1}(\infty)=\lambda\xi.

Since qm,1(t)qm,1q_{m,1}(t)\leq q^{*}_{m,1} for all mm\in\mathcal{M} and t0t\geq 0, it follows that limtqm,1(t)=qm,1\lim_{t\rightarrow\infty}q_{m,1}(t)=q^{*}_{m,1} must hold for all mm\in\mathcal{M}. ∎

    Proof of Theorem 3.14.

The result follows immediately from Proposition 5.1 and Theorem 3.11. ∎

    5.2 Proof of Tightness and Interchange of Limits

Next, we prove the tightness of the steady-state occupancy processes {𝐪N()}N\{\mathbf{q}^{N}(\infty)\}_{N}. Let q¯lN()=mqm,lN()\bar{q}^{N}_{l}(\infty)=\sum_{m\in\mathcal{M}}q^{N}_{m,l}(\infty) and 𝐪¯N()=(q¯lN(),l0)\bar{\mathbf{q}}^{N}(\infty)=(\bar{q}^{N}_{l}(\infty),l\in\mathbb{N}_{0}). In order to show the tightness of {𝐪N()}N\{\mathbf{q}^{N}(\infty)\}_{N}, it is sufficient to show that the sequence {𝐪¯N()}N\{\bar{\mathbf{q}}^{N}(\infty)\}_{N} is tight, which is stated in the next proposition. To show this, we first bound the tail of the expected global occupancy in steady state.

    Lemma 5.4.

    Let {GN}N\{G^{N}\}_{N} be a sequence of proportionally sparse graphs satisfying Condition 2.1. There exists an N0N_{0} such that for all NN0N\geq N_{0} and 1\ell\geq 1,

    l=𝔼(q¯lN())(1+ρ)/21(1+ρ)/2𝔼(q¯1N()).\sum_{l=\ell}^{\infty}\mathbb{E}(\bar{q}^{N}_{l}(\infty))\leq\frac{(1+\rho)/2}{1-(1+\rho)/2}\mathbb{E}(\bar{q}^{N}_{\ell-1}(\infty)). (5.5)

    Furthermore,

    𝔼(q¯N())(1+ρ2),0.\mathbb{E}(\bar{q}^{N}_{\ell}(\infty))\leq\Big{(}\frac{1+\rho}{2}\Big{)}^{\ell},\quad\forall\ell\in\mathbb{N}_{0}. (5.6)

The proof of Lemma 5.4 is similar to that of [22, Lemma 3]. We define a sequence {Lm,N}m,0\{L^{N}_{m,\ell}\}_{m\in\mathcal{M},\ell\in\mathbb{N}_{0}} of Lyapunov functions and bound the drift of Lm,NL^{N}_{m,\ell}, which enables us to bound the tail sum of q¯lN()\bar{q}^{N}_{l}(\infty) starting from \ell. Given the NN-th system state XN=(XjN,jVN)X^{N}=(X^{N}_{j},j\in V^{N}), let Qm,lN(X)Q^{N}_{m,l}(X) be the set of servers of type mm\in\mathcal{M} with queue length at least l0l\in\mathbb{N}_{0}. For each mm\in\mathcal{M}, we define the Lyapunov functions Lm,N(X)=i=l=i|Qm,lN(X)|L^{N}_{m,\ell}(X)=\sum_{i=\ell}^{\infty}\sum_{l=i}^{\infty}|Q^{N}_{m,l}(X)|, 0\ell\in\mathbb{N}_{0}. The complete proof is provided in Appendix H.

The next lemma, taken from [19], gives us a criterion for 1\ell_{1}-tightness.

    Lemma 5.5 ([19, Lemma 2]).

    Let {𝐗N}\{\mathbf{X}^{N}\} be a sequence of random variables in ζ\zeta. Then, the following are equivalent:

(i)

      {𝐗N}\{\mathbf{X}^{N}\} is tight with respect to the product topology, and for all ε>0\varepsilon>0,

      limklim¯N(ikxiN>ε)=0.\lim_{k\rightarrow\infty}\varlimsup_{N\rightarrow\infty}\mathbb{P}\big{(}\sum_{i\geq k}x^{N}_{i}>\varepsilon\big{)}=0. (5.7)
(ii)

      {𝐗N}\{\mathbf{X}^{N}\} is tight with respect to the 1\ell_{1}-topology.

    Proof of Theorem 3.12.

Since q¯lN[0,1]\bar{q}^{N}_{l}\in[0,1] for all l0l\in\mathbb{N}_{0}, it is easy to check that {𝐪¯N()}\{\bar{\mathbf{q}}^{N}(\infty)\} is tight with respect to the product topology. Hence, it is sufficient to show that for any ε>0\varepsilon>0,

    limlim¯N(lq¯lN()>ε)=0.\lim_{\ell\rightarrow\infty}\varlimsup_{N\rightarrow\infty}\mathbb{P}\Big{(}\sum_{l\geq\ell}\bar{q}^{N}_{l}(\infty)>\varepsilon\Big{)}=0. (5.8)

    By Markov’s inequality and Lemma 5.4, we have that for all NN0N\geq N_{0},

    (lq¯lN()>ε)1ε𝔼(lq¯lN())1ε(1+ρ)/21(1+ρ)/2𝔼(q¯1N())1ε((1+ρ)/2)1(1+ρ)/2,\begin{split}&\mathbb{P}\Big{(}\sum_{l\geq\ell}\bar{q}^{N}_{l}(\infty)>\varepsilon\Big{)}\leq\frac{1}{\varepsilon}\mathbb{E}\Big{(}\sum_{l\geq\ell}\bar{q}^{N}_{l}(\infty)\Big{)}\leq\frac{1}{\varepsilon}\frac{(1+\rho)/2}{1-(1+\rho)/2}\mathbb{E}\Big{(}\bar{q}^{N}_{\ell-1}(\infty)\Big{)}\leq\frac{1}{\varepsilon}\frac{\big{(}(1+\rho)/2\big{)}^{\ell}}{1-(1+\rho)/2},\end{split} (5.9)

    which implies that (5.8) holds. By Lemma 5.5, the desired result holds. ∎

    Proof of Theorem 3.13.

By Theorem 3.12, {𝐪N()}N\{\mathbf{q}^{N}(\infty)\}_{N} is tight with respect to the 1\ell_{1}-topology, so any subsequence has a further convergent subsequence. Let {𝐪Nn()}n\{\mathbf{q}^{N_{n}}(\infty)\}_{n} be such a convergent subsequence and assume 𝐪Nn()𝑑𝐪\mathbf{q}^{N_{n}}(\infty)\ \xrightarrow{d}\ \mathbf{q}^{*}. Clearly, 𝐪\mathbf{q}^{*} must lie in the space 𝒮\mathcal{S}. Now, initialize the NnN_{n}-th system in stationarity. Then the system is in steady state at every fixed finite time t0t\geq 0; that is, 𝐪Nn(t)𝐪Nn()\mathbf{q}^{N_{n}}(t)\sim\mathbf{q}^{N_{n}}(\infty) for all t[0,T]t\in[0,T]. Also, by Theorem 3.10,

    𝐪Nn(t)𝑑𝐪(t).\mathbf{q}^{N_{n}}(t)\ \xrightarrow{d}\ \mathbf{q}(t). (5.10)

    Thus, for all t[0,T]t\in[0,T],

    𝐪(t)𝐪,\mathbf{q}(t)\sim\mathbf{q}^{*}, (5.11)

    which implies that 𝐪\mathbf{q}^{*} is a stationary point of the limiting system. By Theorem 3.11, we know that 𝐪\mathbf{q}^{*} is unique. Therefore, the desired result holds. ∎

    6 Numerical Results

In this section, we present simulation results that validate the theoretical results. Using the insights they provide, we also show that systems with a carefully designed compatibility structure perform much better than classical, fully flexible systems. Throughout this section, we set the system parameters as follows:

    • K=2K=2: two types of dispatchers;

    • M=3M=3: three types of servers;

    • d=2d=2: the system follows the JSQ(22) policy;

    • 𝝁=(1,5,10)\boldsymbol{\mu}=(1,5,10), where each μm\mu_{m}, m=1,2,3m=1,2,3, is the service rate of type mm servers;

• λ=3\lambda=3: the arrival rate at each dispatcher;

• Q=[0.20.50.30.500.50.90.10]Q=\begin{bmatrix}0.2&0.5&0.3\\ 0.5&0&0.5\\ 0.9&0.1&0\end{bmatrix}, where entry qm,lq_{m,l} is the probability that a type mm server’s initial queue length is ll, for m=1,2,3m=1,2,3 and l=1,2,3l=1,2,3;

• Fractions of dispatcher types: [w1w2]=[0.20.8]\begin{bmatrix}w_{1}&w_{2}\end{bmatrix}=\begin{bmatrix}0.2&0.8\end{bmatrix};

• Fractions of server types: [v1v2v3]=[0.50.30.2]\begin{bmatrix}v_{1}&v_{2}&v_{3}\end{bmatrix}=\begin{bmatrix}0.5&0.3&0.2\end{bmatrix};

• ξ=1\xi=1: the limiting ratio of the number of dispatchers to the number of servers in the system.

Under this setting, the capacity sufficiency condition is satisfied: λξ=3<mvmum=4\lambda\xi=3<\sum_{m\in\mathcal{M}}v_{m}u_{m}=4. The first experiment compares the performance of the classical, fully flexible system with that of a system with a carefully designed compatibility structure.
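For concreteness, this setting and the capacity-sufficiency check can be encoded in a few lines of Python. The sketch below is ours (variable names are our own, and we identify the service rates u_m with the entries of 𝝁 above); it simply verifies λξ < ∑_m v_m u_m.

```python
import numpy as np

# Parameters of Section 6 (u_m are the service rates, i.e. the entries of mu).
K, M, d = 2, 3, 2
u = np.array([1.0, 5.0, 10.0])     # service rate of each server type
lam = 3.0                          # arrival rate at each dispatcher
w = np.array([0.2, 0.8])           # fractions of the dispatcher types
v = np.array([0.5, 0.3, 0.2])      # fractions of the server types
xi = 1.0                           # limiting ratio of dispatchers to servers

# Capacity sufficiency: lambda * xi < sum_m v_m * u_m.
assert lam * xi < v @ u, "capacity sufficiency violated"
print("arrival rate per server:", lam * xi, "< capacity per server:", v @ u)
```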

    Complete Bipartite vs. Designed Compatibility Structure.

The complete bipartite case is the one in which the compatibility matrix 𝐩0=(pm,k0,m,k𝒦)\mathbf{p}^{0}=(p^{0}_{m,k},m\in\mathcal{M},k\in\mathcal{K}) has all entries equal to 1. From Lemma 3.1, the NN-th system under JSQ(dd) is stable if and only if it satisfies the following:

    ρN=maxUVNU{(jUm𝟙(jVmN)um)1iWNS(U𝒩wN(i)):|S|=dλ(δiNd)}<1.\rho^{N}=\max_{\begin{subarray}{c}U\subseteq V^{N}\\ U\neq\emptyset\end{subarray}}\Big{\{}\Big{(}\sum_{j\in U}\sum_{m\in\mathcal{M}}\mathds{1}_{(j\in V^{N}_{m})}u_{m}\Big{)}^{-1}\sum_{i\in W^{N}}\sum_{\begin{subarray}{c}S\subseteq(U\cap\mathcal{N}^{N}_{w}(i)):\\ |S|=d\end{subarray}}\frac{\lambda}{{\delta^{N}_{i}\choose d}}\Big{\}}<1.

    By Lemma 5.2, for the complete bipartite case, we have that

    limNρNmax(mvmum)1λξk𝒦wk(mvm)d(0.5×1)1×3×(0.5)2>1,\lim_{N\rightarrow\infty}\rho^{N}\geq\max_{\mathcal{M}^{\prime}\subseteq\mathcal{M}}\big{(}\sum_{m\in\mathcal{M}^{\prime}}v_{m}u_{m}\big{)}^{-1}\lambda\xi\sum_{k\in\mathcal{K}}w_{k}\Big{(}\sum_{m\in\mathcal{M}^{\prime}}v_{m}\Big{)}^{d}\geq(0.5\times 1)^{-1}\times 3\times(0.5)^{2}>1,

which implies that for large enough NN, the system under JSQ(22) is unstable. The bottleneck here is that the slow type 1 servers receive a heavy workload. By Proposition 3.3, if the capacity sufficiency condition is satisfied, then there always exists a compatibility matrix 𝐩1[0,1]K×M\mathbf{p}^{1}\in[0,1]^{K\times M} making all large enough systems stable under JSQ(22). Checking the feasible region defined in Lemma 3.4, we obtain one suitable matrix 𝐩1\mathbf{p}^{1}, given by

    𝐩1=[0.050.610.10.71].\mathbf{p}^{1}=\begin{bmatrix}0.05&0.6&1\\ 0.1&0.7&1\end{bmatrix}.
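As a quick sanity check, the per-type condition (3.2) of Lemma 3.4 and the subset-based lower bound of Lemma 5.2 can be evaluated numerically for 𝐩^0 and 𝐩^1. The following sketch reflects our reading of those conditions, with δ_k = ∑_m v_m p_{k,m} and helper names of our own; it reports a per-type load above 1 for the complete bipartite matrix and per-type loads strictly below 1 for 𝐩^1.

```python
import numpy as np
from itertools import combinations

K, M, d = 2, 3, 2
u = np.array([1.0, 5.0, 10.0])
lam, xi = 3.0, 1.0
w = np.array([0.2, 0.8])
v = np.array([0.5, 0.3, 0.2])

p0 = np.ones((K, M))                                   # complete bipartite
p1 = np.array([[0.05, 0.6, 1.0], [0.1, 0.7, 1.0]])     # designed structure

def per_type_load(p):
    """Left-hand side of condition (3.2) for each server type m (our reading)."""
    delta = p @ v                                      # delta_k = sum_m v_m p_{k,m}
    return np.array([lam * xi / u[m] * np.sum(w * p[:, m] / delta) for m in range(M)])

def subset_bound(p):
    """Lower bound on the limiting load in the spirit of Lemma 5.2,
    maximized over nonempty subsets of server types."""
    delta = p @ v
    best = 0.0
    for r in range(1, M + 1):
        for sub in combinations(range(M), r):
            s = list(sub)
            num = lam * xi * np.sum(w * (p[:, s] @ v[s] / delta) ** d)
            best = max(best, num / np.sum(v[s] * u[s]))
    return best

print(per_type_load(p0), subset_bound(p0))   # load of type 1 and bound exceed 1
print(per_type_load(p1), subset_bound(p1))   # every per-type load is below 1
```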

The intuition behind a compatibility matrix such as 𝐩1\mathbf{p}^{1} is to lower the traffic intensity at type 1 servers by decreasing the fraction of type 1 servers in the neighborhood of each dispatcher. For the experiment, we set the number of servers to N=1000N=1000 and consider two systems, S1S1 and S2S2: S1S1 has the complete bipartite graph structure, while S2S2, generated by irg(𝐩1\mathbf{p}^{1}) (Definition 3.15), has compatibility matrix 𝐩1\mathbf{p}^{1}. We simulate the evolution of each system 100 times and plot the mean sample path in Figure 1.

Figure 1: Complete Bipartite vs. Appropriately Designed Structure

Figure 1 shows that the average queue length of type 1 servers in S1S1 increases almost monotonically in tt, suggesting that it grows without bound. In the system S2S2, by contrast, the average queue length of each server type remains bounded. From this numerical result, we observe that an appropriately designed graph structure can improve the performance of the system. We also computed the 95%95\% confidence interval (CI) at each point t=0.5,1.0,1.5,2.0,2.5t=0.5,1.0,1.5,2.0,2.5; the CIs are narrower than the plot markers and are therefore not visible.
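The finite systems used in these experiments can be reproduced with a simple event-driven simulation. The sketch below is our own reading of the model and of irg(𝐩) (Definition 3.15); for simplicity it starts from empty queues and drops tasks whose dispatcher has fewer than d compatible servers, which are our simplifications rather than the paper's conventions.

```python
import numpy as np

rng = np.random.default_rng(0)
K, M, d = 2, 3, 2
u = np.array([1.0, 5.0, 10.0])          # service rates by server type
lam, xi = 3.0, 1.0
w = np.array([0.2, 0.8])                # dispatcher-type fractions
v = np.array([0.5, 0.3, 0.2])           # server-type fractions

def simulate_jsq_d(p, N=1000, T=2.5):
    """Event-driven simulation of JSQ(d) on a random compatibility graph in which
    a dispatcher of type k is connected to a server of type m w.p. p[k, m]
    (our reading of irg(p)). Queues start empty; tasks whose dispatcher has fewer
    than d compatible servers are dropped (both are our own simplifications)."""
    W = int(round(xi * N))
    srv_type = rng.choice(M, size=N, p=v)
    dsp_type = rng.choice(K, size=W, p=w)
    nbrs = [np.flatnonzero(rng.random(N) < p[dsp_type[i], srv_type]) for i in range(W)]
    q = np.zeros(N, dtype=int)
    t = 0.0
    while t < T:
        srv_rates = u[srv_type] * (q > 0)
        total = lam * W + srv_rates.sum()
        t += rng.exponential(1.0 / total)
        if rng.random() < lam * W / total:              # next event: an arrival
            i = rng.integers(W)
            if len(nbrs[i]) >= d:
                cand = rng.choice(nbrs[i], size=d, replace=False)
                q[cand[np.argmin(q[cand])]] += 1        # join the shortest sampled queue
        else:                                            # next event: a departure
            busy = np.flatnonzero(q > 0)
            j = rng.choice(busy, p=srv_rates[busy] / srv_rates.sum())
            q[j] -= 1
    return q, srv_type

p1 = np.array([[0.05, 0.6, 1.0], [0.1, 0.7, 1.0]])
q, srv_type = simulate_jsq_d(p1)
for m in range(M):
    print("type", m + 1, "mean queue length:", q[srv_type == m].mean())
```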

    Convergence of global occupancy states.

In this experiment, we generate systems by irg(𝐩1\mathbf{p}^{1}) and simulate their evolution for sizes N=100,500,1000N=100,500,1000. For each size, we again run 100 simulations and plot the mean trajectories of qm,1Nq^{N}_{m,1} and qm,2Nq^{N}_{m,2}, m{1,2,3}m\in\{1,2,3\} in Figure 2, together with the evolution of qm,1q_{m,1} and qm,2q_{m,2}, m=1,2,3m=1,2,3, in the limit system. The simulation results show that the evolution of the global occupancy of the NN-th system converges to that of the limit system as NN grows. We also find that q1,1Nq^{N}_{1,1}, and especially q1,2Nq^{N}_{1,2}, decrease very fast when their initial values are large; in other words, when the average queue length of type 1 servers is large, it decreases quickly. This is a consequence of the designed compatibility matrix: compared with the other server types, type 1 servers are sampled much less often.

    Figure 2: The simulated trajectories of qm,1Nq^{N}_{m,1} and qm,2Nq^{N}_{m,2}, m=1,2,3m=1,2,3 converging to the solution of the system of ODEs as NN increases.
    Uniqueness of the fixed point of the limit system.

From Theorem 3.11, we have that for all 𝐪0𝒮\mathbf{q}_{0}\in\mathcal{S}, limt𝐪¯(t,𝐪0)=𝐪\lim_{t\rightarrow\infty}\bar{\mathbf{q}}(t,\mathbf{q}_{0})=\mathbf{q}^{*}. In order to verify this, we simulate the evolution of 𝐪¯(t,𝐪0)\bar{\mathbf{q}}(t,\mathbf{q}_{0}) with different 𝐪0𝒮\mathbf{q}_{0}\in\mathcal{S} (i.e., with different initial matrices QQ as above). In particular, we also simulate the system with Q1=[0.40.30.30.10.80.10.30.60.1]Q_{1}=\begin{bmatrix}0.4&0.3&0.3\\ 0.1&0.8&0.1\\ 0.3&0.6&0.1\end{bmatrix} and Q2=[0.60.30.10.80.10.10.70.20.1]Q_{2}=\begin{bmatrix}0.6&0.3&0.1\\ 0.8&0.1&0.1\\ 0.7&0.2&0.1\end{bmatrix}. Figure 3 shows that for the different choices of 𝐪0𝒮\mathbf{q}_{0}\in\mathcal{S}, the limits limtqm,1(t)\lim_{t\rightarrow\infty}q_{m,1}(t), m=1,2,3m=1,2,3, coincide. Once qm,1q_{m,1}, m=1,2,3m=1,2,3 are fixed, the values of all qm,lq_{m,l}, l2l\geq 2, m=1,2,3m=1,2,3 are determined as well by (5.1). Hence, Figure 3 verifies the uniqueness of the fixed point.

    Figure 3: Multiple trajectories of qm,1q_{m,1}, m=1,2,3m=1,2,3 in the limit system converging to the fixed point.
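The limiting dynamics can also be integrated directly. The sketch below applies forward Euler to a truncation of the ODE (3.8)/(C.2) at a maximum queue length L; the quantity q̃_{k,l} = ∑_m p_{k,m} v_m q_{m,l}/δ_k used inside the drift is our reading of the notation in (C.2). Starting from two different initial conditions, the trajectories approach the same fixed point, consistent with Figure 3.

```python
import numpy as np

K, M, d = 2, 3, 2
u = np.array([1.0, 5.0, 10.0])
lam, xi = 3.0, 1.0
w = np.array([0.2, 0.8])
v = np.array([0.5, 0.3, 0.2])
p = np.array([[0.05, 0.6, 1.0], [0.1, 0.7, 1.0]])
delta = p @ v                          # delta_k = sum_m v_m p_{k,m}
L, dt, T = 60, 1e-3, 10.0              # queue-length truncation, Euler step, horizon

def drift(q):
    """Truncated right-hand side of the ODE (3.8)/(C.2); q has shape (M, L+2) with
    q[m, l] ~ fraction of type-m servers holding at least l tasks and q[:, 0] = 1.
    q_tilde[k, l] = sum_m p[k,m] v[m] q[m,l] / delta[k] is our reading of the
    dispatcher-side occupancy appearing in the drift."""
    qt = (p * v) @ q / delta[:, None]
    a, b = qt[:, :-1], qt[:, 1:]
    ratio = np.where(np.isclose(a, b), d * a ** (d - 1),
                     (a ** d - b ** d) / np.where(a == b, 1.0, a - b))
    coupling = (p * w[:, None] / delta[:, None]).T @ ratio       # shape (M, L+1)
    q_up = np.concatenate([q[:, 2:], np.zeros((M, 1))], axis=1)  # q_{m,l+1}, truncated
    h = np.zeros_like(q)
    h[:, 1:] = -u[:, None] * (q[:, 1:] - q_up) + lam * xi * (q[:, :-1] - q[:, 1:]) * coupling
    return h

def integrate(q0):
    q = q0.copy()
    for _ in range(int(T / dt)):
        q = np.clip(q + dt * drift(q), 0.0, 1.0)
        q[:, 0] = 1.0
    return q

q_a = np.zeros((M, L + 2)); q_a[:, 0] = 1.0                    # empty system
q_b = np.zeros((M, L + 2)); q_b[:, 0] = 1.0; q_b[:, 1] = 0.9   # heavily loaded start
print(np.abs(integrate(q_a)[:, 1:3] - integrate(q_b)[:, 1:3]).max())  # ~0: same fixed point
```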

    References

    Appendix A Proofs for Stability Results

    The goal of this appendix is to prove Proposition 3.3. We start by proving Lemma 3.4, for which we need the next technical lemma. This lemma will help us to upper bound the probability that a new task will be assigned to a specific subset of servers (in particular, (A.5)).

    Lemma A.1.

    Consider the following optimization problem:

    maxi=1N(xid) s.t.\displaystyle\max\sum_{i=1}^{N}{x_{i}\choose d}\text{ s.t. } i=1Nxi=C and xi[0,D]\displaystyle\sum_{i=1}^{N}x_{i}=C\text{ and }x_{i}\in[0,D]

    where CC and DD are positive integers. Let k=C/Dk^{*}=\lfloor C/D\rfloor. Then, the optimal value is k(Dd)+(CDkd)k^{*}{D\choose d}+{C-Dk^{*}\choose d}, if N>kN>k^{*}; otherwise, the optimal value is N(Dd)N{D\choose d}.

    Proof.

We prove the result by contradiction. Suppose the maximizer {xi:i=1,,N}\{x_{i}^{*}:i=1,\dotsc,N\} contains some xj,xk{1,,D1}x_{j}^{*},x_{k}^{*}\in\{1,\dotsc,D-1\} with jkj\neq k. Note that

    (xjd)+(xkd)<(x~jd)+(x~kd),\binom{x_{j}^{*}}{d}+\binom{x_{k}^{*}}{d}<\binom{\tilde{x}_{j}}{d}+\binom{\tilde{x}_{k}}{d},

where x~j=min{xj+xk,D}\tilde{x}_{j}=\min\{x_{j}^{*}+x_{k}^{*},D\} and x~k=xj+xkx~j\tilde{x}_{k}=x_{j}^{*}+x_{k}^{*}-\tilde{x}_{j}; that is, the pair (xj,xk)(x_{j}^{*},x_{k}^{*}) gives a smaller value than the more extreme pair (x~j,x~k)(\tilde{x}_{j},\tilde{x}_{k}). This contradicts the assumption that {xi:i=1,,N}\{x_{i}^{*}:i=1,\dotsc,N\} is a maximizer. Therefore a maximizer must contain at most one xj{1,,D1}x_{j}^{*}\in\{1,\dotsc,D-1\}, with all the other xix_{i}^{*} equal to either 0 or DD, which yields the stated optimal value. This completes the proof. ∎
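Since the argument above only pins down the structure of a maximizer, the closed form of Lemma A.1 can also be checked by brute force on small instances (a sketch with integer-valued variables, as in the proof):

```python
from itertools import product
from math import comb

def brute_force(N, C, D, d):
    # Enumerate all integer vectors (x_1, ..., x_N) with entries in [0, D] summing to C.
    return max(sum(comb(x, d) for x in xs)
               for xs in product(range(D + 1), repeat=N) if sum(xs) == C)

def closed_form(N, C, D, d):
    k = C // D
    return k * comb(D, d) + comb(C - D * k, d) if N > k else N * comb(D, d)

for N, C, D, d in [(3, 5, 3, 2), (4, 7, 2, 2), (3, 9, 3, 2), (4, 6, 4, 3)]:
    assert brute_force(N, C, D, d) == closed_form(N, C, D, d)
print("Lemma A.1 closed form matches brute force on the test instances")
```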

    Proof of Lemma 3.4.

Suppose that (3.2) holds. Since \mathcal{M} is finite, there exists a ρ(0,1)\rho\in(0,1) such that λξumk𝒦wkpk,mδk<ρ\frac{\lambda\xi}{u_{m}}\sum_{k\in\mathcal{K}}\frac{w_{k}p_{k,m}}{\delta_{k}}<\rho for all mm\in\mathcal{M}. Fix any ε(0,1ρ1+3ρ)\varepsilon\in(0,\frac{1-\rho}{1+3\rho}). Recall δiN=|𝒩wN(i)|\delta^{N}_{i}=|\mathcal{N}^{N}_{w}(i)|. By our model assumption and Condition 2.1, there exists Nε0N_{\varepsilon}\in\mathbb{N}_{0} such that for all mm\in\mathcal{M} and jVmNj\in V^{N}_{m},

    pk,mwkW(N)(1ε)degvN(k,j)pk,mwkW(N)(1+ε),k𝒦,p_{k,m}w_{k}W(N)(1-\varepsilon)\leq\deg^{N}_{v}(k,j)\leq p_{k,m}w_{k}W(N)(1+\varepsilon),\quad\forall k\in\mathcal{K}, (A.1)

and for all k𝒦k\in\mathcal{K} and iWkNi\in W^{N}_{k},

    Nδk(1ε)δiNNδk(1+ε).N\delta_{k}(1-\varepsilon)\leq\delta^{N}_{i}\leq N\delta_{k}(1+\varepsilon). (A.2)

Consider the NN-th system and any nonempty subset UVNU\subseteq V^{N} of servers. If |U|C(λ,ρ)λρminmum|U|\leq C(\lambda,\rho)\coloneqq\frac{\lambda}{\rho\min_{m\in\mathcal{M}}u_{m}}, then there exists an N10N_{1}\in\mathbb{N}_{0} such that for all N(NεN1)N\geq(N_{\varepsilon}\vee N_{1}),

    (jUm𝟙(jVmN)um)1iWNS(U𝒩wN(i)):|S|=dλ(δiNd)1|U|minmumk𝒦iWkNλ(|C(λ,ρ)|d)(δiNd)ρ,\Big{(}\sum_{j\in U}\sum_{m\in\mathcal{M}}\mathds{1}_{(j\in V^{N}_{m})}u_{m}\Big{)}^{-1}\sum_{i\in W^{N}}\sum_{\begin{subarray}{c}S\subseteq(U\cap\mathcal{N}^{N}_{w}(i)):\\ |S|=d\end{subarray}}\frac{\lambda}{{\delta^{N}_{i}\choose d}}\leq\frac{1}{|U|\min_{m\in\mathcal{M}}u_{m}}\sum_{k\in\mathcal{K}}\sum_{i\in W^{N}_{k}}\frac{\lambda{|C(\lambda,\rho)|\choose d}}{{\delta^{N}_{i}\choose d}}\leq\rho,

    since for all iWNi\in W^{N}, δiN\delta^{N}_{i} goes to infinity as NN\rightarrow\infty uniformly by (A.2). Next, consider the case |U|>C(λ,ρ)|U|>C(\lambda,\rho). Denote αm=|UVmN|/|VmN|\alpha_{m}=|U\cap V^{N}_{m}|/|V^{N}_{m}| for each mm\in\mathcal{M}. Then,

    (jUm𝟙(jVmN)um)1iWNS(U𝒩wN(i)):|S|=dλ(δwN(i)d)(m|VmN|αmum)1k𝒦iWkNλ(|U𝒩wN(i)|d)(δwN(i)d).\Big{(}\sum_{j\in U}\sum_{m\in\mathcal{M}}\mathds{1}_{(j\in V^{N}_{m})}u_{m}\Big{)}^{-1}\sum_{i\in W^{N}}\sum_{\begin{subarray}{c}S\subseteq(U\cap\mathcal{N}^{N}_{w}(i)):\\ |S|=d\end{subarray}}\frac{\lambda}{{\delta^{N}_{w}(i)\choose d}}\leq\Big{(}\sum_{m\in\mathcal{M}}\lfloor|V_{m}^{N}|\alpha_{m}\rfloor u_{m}\Big{)}^{-1}\sum_{k\in\mathcal{K}}\sum_{i\in W^{N}_{k}}\frac{\lambda{|U\cap\mathcal{N}^{N}_{w}(i)|\choose d}}{{\delta^{N}_{w}(i)\choose d}}. (A.3)

    By (A.1), we have that for each k𝒦k\in\mathcal{K}

    iWkN|U𝒩wN(i)|=jUdegvN(k,j)m|VmN|αmpk,mwkW(N)(1+ε).\begin{split}&\sum_{i\in W^{N}_{k}}|U\cap\mathcal{N}^{N}_{w}(i)|=\sum_{j\in U}\deg^{N}_{v}(k,j)\leq\sum_{m\in\mathcal{M}}|V^{N}_{m}|\alpha_{m}p_{k,m}w_{k}W(N)(1+\varepsilon).\end{split} (A.4)

    By Lemma A.1 and (A.2), (A.4),

    (A.3)(m|VmN|αmum)1λk𝒦(m|VmN|αmpk,mwkW(N)(1+ε)δkN(1ε)+1)(δkN(1+ε)d)(δkN(1ε)d)C1(N)(1+ε1ε)d(mvmαmum)1λξk𝒦(wkmvmαmpk,m(1+ε)δk(1ε)+1N)\begin{split}\eqref{eq:subset-rho-1}&\leq\Big{(}\sum_{m\in\mathcal{M}}\lfloor|V_{m}^{N}|\alpha_{m}\rfloor u_{m}\Big{)}^{-1}\lambda\sum_{k\in\mathcal{K}}\Big{(}\left\lfloor\frac{\sum_{m\in\mathcal{M}}|V^{N}_{m}|\alpha_{m}p_{k,m}w_{k}W(N)(1+\varepsilon)}{\delta_{k}N(1-\varepsilon)}\right\rfloor+1\Big{)}\frac{{\delta_{k}N(1+\varepsilon)\choose d}}{{\delta_{k}N(1-\varepsilon)\choose d}}\\ &\leq C_{1}(N)\Big{(}\frac{1+\varepsilon}{1-\varepsilon}\Big{)}^{d}\Big{(}\sum_{m\in\mathcal{M}}v_{m}\alpha_{m}u_{m}\Big{)}^{-1}\lambda\xi\sum_{k\in\mathcal{K}}\Big{(}w_{k}\left\lfloor\frac{\sum_{m\in\mathcal{M}}v_{m}\alpha_{m}p_{k,m}(1+\varepsilon)}{\delta_{k}(1-\varepsilon)}\right\rfloor+\frac{1}{N}\Big{)}\\ \end{split} (A.5)

    where C1(N)C_{1}(N) only depends on NN and goes to 1 as NN\rightarrow\infty. Let

    𝒦{k𝒦:mvmαmpk,m(1+ε)δk(1ε)1}.\mathcal{K}^{\prime}\coloneqq\Big{\{}k\in\mathcal{K}:\left\lfloor\frac{\sum_{m\in\mathcal{M}}v_{m}\alpha_{m}p_{k,m}(1+\varepsilon)}{\delta_{k}(1-\varepsilon)}\right\rfloor\geq 1\Big{\}}.

    If 𝒦=\mathcal{K}^{\prime}=\emptyset, then

    (A.5)C1(N)(1+ε1ε)dλξNmvmαmumC1(N)(1+ε1ε)dλξC(λ,ρ)minmumρ(1+ε1ε)d.\begin{split}\eqref{eq:subset-rho-2}&\leq C_{1}(N)\Big{(}\frac{1+\varepsilon}{1-\varepsilon}\Big{)}^{d}\frac{\lambda\xi}{N\sum_{m\in\mathcal{M}}v_{m}\alpha_{m}u_{m}}\\ &\leq C_{1}(N)\Big{(}\frac{1+\varepsilon}{1-\varepsilon}\Big{)}^{d}\frac{\lambda\xi}{C(\lambda,\rho)\min_{m\in\mathcal{M}}u_{m}}\leq\rho\Big{(}\frac{1+\varepsilon}{1-\varepsilon}\Big{)}^{d}.\end{split} (A.6)

    Consider the case 𝒦\mathcal{K}^{\prime}\neq\emptyset. Then, we get

    (A.5)C1(N)(1+ε1ε)d(mvmαmum)1λξ(k𝒦wkmvmαmpk,m(1+ε)δk(1ε)+1N)C1(N)(1+ε1ε)d(mvmαmum)1λξ(k𝒦wkmvmαmpk,m(1+ε)δk(1ε))+C1(N)(1+ε1ε)dλξNmvmαmum.\begin{split}\eqref{eq:subset-rho-2}&\leq C_{1}(N)\Big{(}\frac{1+\varepsilon}{1-\varepsilon}\Big{)}^{d}\Big{(}\sum_{m\in\mathcal{M}}v_{m}\alpha_{m}u_{m}\Big{)}^{-1}\lambda\xi\Big{(}\sum_{k\in\mathcal{K}}w_{k}\frac{\sum_{m\in\mathcal{M}}v_{m}\alpha_{m}p_{k,m}(1+\varepsilon)}{\delta_{k}(1-\varepsilon)}+\frac{1}{N}\Big{)}\\ &\leq C_{1}(N)\Big{(}\frac{1+\varepsilon}{1-\varepsilon}\Big{)}^{d}\Big{(}\sum_{m\in\mathcal{M}}v_{m}\alpha_{m}u_{m}\Big{)}^{-1}\lambda\xi\Big{(}\sum_{k\in\mathcal{K}}w_{k}\frac{\sum_{m\in\mathcal{M}}v_{m}\alpha_{m}p_{k,m}(1+\varepsilon)}{\delta_{k}(1-\varepsilon)}\Big{)}\\ &\quad+C_{1}(N)\Big{(}\frac{1+\varepsilon}{1-\varepsilon}\Big{)}^{d}\frac{\lambda\xi}{N\sum_{m\in\mathcal{M}}v_{m}\alpha_{m}u_{m}}.\end{split} (A.7)

    By (3.2), we have that for all mm\in\mathcal{M} and αm(0,1)\alpha_{m}\in(0,1),

    (αmvmum)1λξ(k𝒦wkαmpk,mvmδk)1+ε1ερ(1+ε)1ε<1+ρ2,(\alpha_{m}v_{m}u_{m})^{-1}\lambda\xi\Big{(}\sum_{k\in\mathcal{K}}w_{k}\frac{\alpha_{m}p_{k,m}v_{m}}{\delta_{k}}\Big{)}\frac{1+\varepsilon}{1-\varepsilon}\leq\frac{\rho(1+\varepsilon)}{1-\varepsilon}<\frac{1+\rho}{2},

    which implies that

    λξ(k𝒦wkmvmαmpk,m(1+ε)δk(1ε))<1+ρ2(mvmαmum).\lambda\xi\Big{(}\sum_{k\in\mathcal{K}}w_{k}\frac{\sum_{m\in\mathcal{M}}v_{m}\alpha_{m}p_{k,m}(1+\varepsilon)}{\delta_{k}(1-\varepsilon)}\Big{)}<\frac{1+\rho}{2}\Big{(}\sum_{m\in\mathcal{M}}v_{m}\alpha_{m}u_{m}\Big{)}. (A.8)

Since 𝒦\mathcal{K}^{\prime} is nonempty, we may pick k𝒦k^{\prime}\in\mathcal{K}^{\prime}, i.e., mvmαmpk,m(1+ε)δk(1ε)\sum_{m\in\mathcal{M}}v_{m}\alpha_{m}p_{k^{\prime},m}(1+\varepsilon)\geq\delta_{k^{\prime}}(1-\varepsilon). Hence,

    λξNmvmαmumλξ(1+ε)Nδk(1ε)minmumλξρNδkminmum,\begin{split}\frac{\lambda\xi}{N\sum_{m\in\mathcal{M}}v_{m}\alpha_{m}u_{m}}\leq\frac{\lambda\xi(1+\varepsilon)}{N\delta_{k^{\prime}}(1-\varepsilon)\min_{m\in\mathcal{M}}u_{m}}\leq\frac{\lambda\xi\rho}{N\delta_{k^{\prime}}\min_{m\in\mathcal{M}}u_{m}},\end{split} (A.9)

    which implies that there exists N20N_{2}\in\mathbb{N}_{0} such that for all NN2N\geq N_{2},

    λξNmvmαmumλξ(1+ε)Nδk(1ε)minmum<C2(N)N0.\begin{split}\frac{\lambda\xi}{N\sum_{m\in\mathcal{M}}v_{m}\alpha_{m}u_{m}}\leq\frac{\lambda\xi(1+\varepsilon)}{N\delta_{k^{\prime}}(1-\varepsilon)\min_{m\in\mathcal{M}}u_{m}}<C_{2}(N)\xrightarrow{N\rightarrow\infty}0.\end{split} (A.10)

We choose ε\varepsilon such that (1+ε1ε)d1+ρ2<1\Big{(}\frac{1+\varepsilon}{1-\varepsilon}\Big{)}^{d}\frac{1+\rho}{2}<1. By (A.8) and (A.10), there exists a positive integer N3(NεN1N2)N_{3}\geq(N_{\varepsilon}\vee N_{1}\vee N_{2}) such that for all NN3N\geq N_{3},

    (jUm𝟙(jVmN)um)1iWNS(U𝒩wN(i)):|S|=dλ(δwN(i)d)<(1+ε1ε)d(C1(N)1+ρ2+C2(N))<1.\Big{(}\sum_{j\in U}\sum_{m\in\mathcal{M}^{\prime}}\mathds{1}_{(j\in V^{N}_{m})}u_{m}\Big{)}^{-1}\sum_{i\in W^{N}}\sum_{\begin{subarray}{c}S\subseteq(U\cap\mathcal{N}^{N}_{w}(i)):\\ |S|=d\end{subarray}}\frac{\lambda}{{\delta^{N}_{w}(i)\choose d}}<\Big{(}\frac{1+\varepsilon}{1-\varepsilon}\Big{)}^{d}\Big{(}C_{1}(N)\frac{1+\rho}{2}+C_{2}(N)\Big{)}<1. (A.11)

Since the subset UVNU\subseteq V^{N} is arbitrary, it follows that for all NN3N\geq N_{3}, the NN-th system is stable under the JSQ(dd) policy. ∎

    Proof of Proposition 3.3.

    By Lemma 3.4, it is sufficient to show that there exists some 𝐩\mathbf{p} such that for each mm\in\mathcal{M},

    λξk𝒦wkvmpk,mδk<vmum.\lambda\xi\sum_{k\in\mathcal{K}}w_{k}\frac{v_{m}p_{k,m}}{\delta_{k}}<v_{m}u_{m}. (A.12)

Let xk,m=vmpk,mδk[0,1]x_{k,m}=\frac{v_{m}p_{k,m}}{\delta_{k}}\in[0,1], k𝒦k\in\mathcal{K}, mm\in\mathcal{M}, with mxk,m=1\sum_{m\in\mathcal{M}}x_{k,m}=1 for each k𝒦k\in\mathcal{K}. Now, we can formulate the following linear optimization problem: the objective is minρ\min\ \rho and the constraints are

    λξk𝒦wkxk,mρvmum,m,mxk,m= 1,k𝒦,xk,m[0,1],k𝒦,m.\begin{split}\lambda\xi\sum_{k\in\mathcal{K}}w_{k}x_{k,m}\ &\leq\ \rho v_{m}u_{m},\quad\forall m\in\mathcal{M},\\ \sum_{m\in\mathcal{M}}x_{k,m}\ &=\ 1,\quad\forall k\in\mathcal{K},\\ x_{k,m}\ \in&\ [0,1],\quad\forall k\in\mathcal{K},m\in\mathcal{M}.\end{split} (A.13)

    Next we construct a specific solution 𝐱=(xk,m,k𝒦,m)\mathbf{x}^{\prime}=(x^{\prime}_{k,m},k\in\mathcal{K},m\in\mathcal{M}) satisfying the above constraints (A.13) with ρ0=λξ/mvmum\rho_{0}=\lambda\xi/\sum_{m\in\mathcal{M}}v_{m}u_{m}. Note that ρ0<1\rho_{0}<1 by (2.1). For convenience, we denote xk,0=0x^{\prime}_{k,0}=0 for all k𝒦k\in\mathcal{K}. First, consider k=1k=1. Let x1,1=min(ρ0v1u1,λξw1)λξw1x^{\prime}_{1,1}=\frac{\min\big{(}\rho_{0}v_{1}u_{1},\lambda\xi w_{1}\big{)}}{\lambda\xi w_{1}} and for m2m\geq 2,

    x1,m=min(ρ0vmum,λξw1(1m<mx1,m))λξw1.x^{\prime}_{1,m}=\frac{\min\big{(}\rho_{0}v_{m}u_{m},\lambda\xi w_{1}(1-\sum_{m^{\prime}<m}x^{\prime}_{1,m^{\prime}})\big{)}}{\lambda\xi w_{1}}.

Since λξw1λξ=ρ0mvmum\lambda\xi w_{1}\leq\lambda\xi=\rho_{0}\sum_{m\in\mathcal{M}}v_{m}u_{m}, we have mx1,m=1\sum_{m\in\mathcal{M}}x^{\prime}_{1,m}=1 and m1min{m:ρ0vmumx1,mλξw1>0}m_{1}\coloneqq\min\{m\in\mathcal{M}:\rho_{0}v_{m}u_{m}-x^{\prime}_{1,m}\lambda\xi w_{1}>0\}\in\mathcal{M}. Then, consider k=2k=2. For all m<m1m<m_{1}, let x2,m=0x^{\prime}_{2,m}=0. Let

    x2,m1=min(ρ0vm1um1x1,m1λξw1,λξw2)λξw2,x^{\prime}_{2,m_{1}}=\frac{\min(\rho_{0}v_{m_{1}}u_{m_{1}}-x^{\prime}_{1,m_{1}}\lambda\xi w_{1},\lambda\xi w_{2})}{\lambda\xi w_{2}},

    and

    x2,m=min(ρ0vmum,λξw2(1m<mx2,m))λξw2,m>m1.x^{\prime}_{2,m}=\frac{\min\big{(}\rho_{0}v_{m}u_{m},\lambda\xi w_{2}(1-\sum_{m^{\prime}<m}x^{\prime}_{2,m^{\prime}})\big{)}}{\lambda\xi w_{2}},\quad m>m_{1}.

Again, since λξ(w1+w2)λξρ0mvmum\lambda\xi(w_{1}+w_{2})\leq\lambda\xi\leq\rho_{0}\sum_{m\in\mathcal{M}}v_{m}u_{m}, we have mx2,m=1\sum_{m\in\mathcal{M}}x^{\prime}_{2,m}=1 and m2min{mm1:ρ0vmumx2,mλξw2>0}m_{2}\coloneqq\min\{m\geq m_{1}:\rho_{0}v_{m}u_{m}-x^{\prime}_{2,m}\lambda\xi w_{2}>0\}\in\mathcal{M}. We can construct xk,mx^{\prime}_{k,m}, mm\in\mathcal{M}, k3k\geq 3 by following the same steps as in the construction of x2,mx^{\prime}_{2,m}, mm\in\mathcal{M}. Hence, we obtain a specific solution 𝐱\mathbf{x}^{\prime} satisfying (A.13) with ρ0<1\rho_{0}<1. Therefore, minρ\min\ \rho is strictly less than 1 and the desired result holds. ∎
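The linear program (A.13) can also be solved numerically. The sketch below is our own formulation (recovering a compatibility matrix 𝐩 from the optimal x_{k,m} = v_m p_{k,m}/δ_k is a separate step that we omit); for the parameters of Section 6 it returns the optimal value ρ_0 = λξ/∑_m v_m u_m = 0.75 < 1.

```python
import numpy as np
from scipy.optimize import linprog

K, M = 2, 3
u = np.array([1.0, 5.0, 10.0])
lam, xi = 3.0, 1.0
w = np.array([0.2, 0.8])
v = np.array([0.5, 0.3, 0.2])

# Variables: x[k, m] flattened row by row, followed by rho; minimize rho.
c = np.zeros(K * M + 1); c[-1] = 1.0

# lam * xi * sum_k w_k x_{k,m} - rho * v_m * u_m <= 0 for every m.
A_ub = np.zeros((M, K * M + 1))
for m in range(M):
    for k in range(K):
        A_ub[m, k * M + m] = lam * xi * w[k]
    A_ub[m, -1] = -v[m] * u[m]
b_ub = np.zeros(M)

# sum_m x_{k,m} = 1 for every k.
A_eq = np.zeros((K, K * M + 1))
for k in range(K):
    A_eq[k, k * M:(k + 1) * M] = 1.0
b_eq = np.ones(K)

bounds = [(0.0, 1.0)] * (K * M) + [(0.0, None)]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print("optimal rho:", res.fun)          # equals lam*xi / sum_m v_m u_m = 0.75 here
```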

    Appendix B Approximation of Graph Structure for Large NN Systems

    Proof of Lemma 4.3.

    Consider any fixed mm\in\mathcal{M} and fixed jVmj\in V_{m}. Also, fix any k𝒦k\in\mathcal{K} and (M2,,Md)d1(M_{2},...,M_{d})\in\mathcal{M}^{d-1}.

    |iWkNξi,jN(j2,,jd)setN(j)s.t.j2VM2N,,jdVMdNξi,j2N××ξi,jdN(δiNd)(d1)!dξpk,mwkδkh=2dvMhpk,Mhδk|\displaystyle\Big{|}\sum_{i\in W^{N}_{k}}\xi^{N}_{i,j}\sum_{\begin{subarray}{c}(j_{2},...,j_{d})\in\texttt{set}^{N}(j)\\ s.t.\quad j_{2}\in V^{N}_{M_{2}},...,j_{d}\in V^{N}_{M_{d}}\end{subarray}}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}-d\xi\frac{p_{k,m}w_{k}}{\delta_{k}}\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\Big{|}
    |iWkNξi,jN(j2,,jd)setN(j)s.t.j2VM2N,,jdVMdNξi,j2N××ξi,jdN(δiNd)(d1)!iWkNξi,jN(δiNd1)(δiNd)h=2dvMhpk,Mhδk|\displaystyle\leq\Big{|}\sum_{i\in W^{N}_{k}}\xi^{N}_{i,j}\sum_{\begin{subarray}{c}(j_{2},...,j_{d})\in\texttt{set}^{N}(j)\\ s.t.\quad j_{2}\in V^{N}_{M_{2}},...,j_{d}\in V^{N}_{M_{d}}\end{subarray}}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}-\sum_{i\in W^{N}_{k}}\xi^{N}_{i,j}\frac{{\delta^{N}_{i}\choose d-1}}{{\delta^{N}_{i}\choose d}}\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\Big{|} (B.1)
    +|iWkNξi,jN(δiNd1)(δiNd)h=2dvMhpk,Mhδkdξpk,mwkδkh=2dvMhpk,Mhδk|\displaystyle\quad+\Big{|}\sum_{i\in W^{N}_{k}}\xi^{N}_{i,j}\frac{{\delta^{N}_{i}\choose d-1}}{{\delta^{N}_{i}\choose d}}\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}-d\xi\frac{p_{k,m}w_{k}}{\delta_{k}}\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\Big{|} (B.2)

    First,

    maxiWkN|(j2,,jd)setN(j)s.t.j2VM2N,,jdVMdNξi,j2N××ξi,jdN(δiNd1)(d1)!h=2dvMhpk,Mhδk|\displaystyle\max_{i\in W^{N}_{k}}\Big{|}\sum_{\begin{subarray}{c}(j_{2},...,j_{d})\in\texttt{set}^{N}(j)\\ s.t.\quad j_{2}\in V^{N}_{M_{2}},...,j_{d}\in V^{N}_{M_{d}}\end{subarray}}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d-1}(d-1)!}-\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\Big{|}
    \displaystyle\leq maxiWkN|(j2,,jd)setN(j)s.t.j2VM2N,,jdVMdNξi,j2N××ξi,jdN(δiNd1)(d1)!degwN(i,M2)××degwN(i,Md)(δiNd1)(d1)!|\displaystyle\max_{i\in W^{N}_{k}}\Big{|}\sum_{\begin{subarray}{c}(j_{2},...,j_{d})\in\texttt{set}^{N}(j)\\ s.t.\quad j_{2}\in V^{N}_{M_{2}},...,j_{d}\in V^{N}_{M_{d}}\end{subarray}}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d-1}(d-1)!}-\frac{\deg^{N}_{w}(i,M_{2})\times\cdots\times\deg^{N}_{w}(i,M_{d})}{{\delta^{N}_{i}\choose d-1}(d-1)!}\Big{|}
    +maxiWkN|degwN(i,M2)××degwN(i,Md)(δiNd1)(d1)!h=2dvMhpk,Mhδk|.\displaystyle+\max_{i\in W^{N}_{k}}\Big{|}\frac{\deg^{N}_{w}(i,M_{2})\times\cdots\times\deg^{N}_{w}(i,M_{d})}{{\delta^{N}_{i}\choose d-1}(d-1)!}-\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\Big{|}.

    For large enough NN,

    maxiWkN|(j2,,jd)setN(j)s.t.j2VM2N,,jdVMdNξi,j2N××ξi,jdN(δiNd1)(d1)!degwN(i,M2)××degwN(i,Md)(δiNd1)(d1)!|\displaystyle\max_{i\in W^{N}_{k}}\Big{|}\sum_{\begin{subarray}{c}(j_{2},...,j_{d})\in\texttt{set}^{N}(j)\\ s.t.\quad j_{2}\in V^{N}_{M_{2}},...,j_{d}\in V^{N}_{M_{d}}\end{subarray}}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d-1}(d-1)!}-\frac{\deg^{N}_{w}(i,M_{2})\times\cdots\times\deg^{N}_{w}(i,M_{d})}{{\delta^{N}_{i}\choose d-1}(d-1)!}\Big{|}
    maxiWkNd(d1)(δiNd1)(d1)!maxm(degwN(i,m))d2\displaystyle\leq\max_{i\in W^{N}_{k}}\frac{d(d-1)}{{\delta^{N}_{i}\choose d-1}(d-1)!}\max_{m\in\mathcal{M}}(\deg^{N}_{w}(i,m))^{d-2}
    d(d1)miniWkN(δiNd1)(d1)!maxiWkNmaxm(degwN(i,m))d2\displaystyle\leq\frac{d(d-1)}{\min_{i\in W^{N}_{k}}{\delta^{N}_{i}\choose d-1}(d-1)!}\max_{i\in W^{N}_{k}}\max_{m\in\mathcal{M}}(\deg^{N}_{w}(i,m))^{d-2}
    cN(m,k)d(d1)(Nmaxmvmpk,m)d2(Nδk)d1N0,\displaystyle\leq c^{N}(m,k)d(d-1)\frac{(N\max_{m\in\mathcal{M}}v_{m}p_{k,m})^{d-2}}{(N\delta_{k})^{d-1}}\xrightarrow{N\rightarrow\infty}0, (B.3)

    where cN(m,k)c^{N}(m,k) goes to 1 as NN goes to infinity, and only depends on kk and mm for each NN. The last inequality comes from Condition 2.1, Lemma 4.1, and δiN××(δiNd+2)(δiN)d1N1\frac{\delta^{N}_{i}\times\cdots\times(\delta^{N}_{i}-d+2)}{(\delta^{N}_{i})^{d-1}}\xrightarrow{N\rightarrow\infty}1. Similarly, we have

    maxiWkN|degwN(i,M2)××degwN(i,Md)(δiNd1)(d1)!h=2dvMhpk,Mhδk|\displaystyle\max_{i\in W^{N}_{k}}\Big{|}\frac{\deg^{N}_{w}(i,M_{2})\times\cdots\times\deg^{N}_{w}(i,M_{d})}{{\delta^{N}_{i}\choose d-1}(d-1)!}-\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\Big{|}
    max(h=2d(maxiWkNdegwN(i,Mh)miniWkN(δiNd)vMhpk,Mhδk),h=2d(miniWkNdegwN(i,Mh)maxiWkNδiNvMhpk,Mhδk))\displaystyle\leq\max\Big{(}\prod_{h=2}^{d}\big{(}\frac{\max_{i\in W^{N}_{k}}\deg^{N}_{w}(i,M_{h})}{\min_{i\in W^{N}_{k}}(\delta^{N}_{i}-d)}-\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\big{)},\prod_{h=2}^{d}\big{(}\frac{\min_{i\in W^{N}_{k}}\deg^{N}_{w}(i,M_{h})}{\max_{i\in W^{N}_{k}}\delta^{N}_{i}}-\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\big{)}\Big{)}
    cN(m,k,M2,,Md)N0,\displaystyle\leq c^{N}(m,k,M_{2},...,M_{d})\xrightarrow{N\rightarrow\infty}0, (B.4)

    where cN(m,k,M2,,Md)c^{N}(m,k,M_{2},...,M_{d}) depends on m,k,M2,,Mdm,k,M_{2},...,M_{d}. By (B.3) and (B.4), we have

    maxiWkN|(j2,,jd)setN(j)s.t.j2VM2N,,jdVMdNξi,j2N××ξi,jdN(δiNd1)(d1)!h=2dvMhpk,Mhδk|c1N(m,k,M2,,Md)N0,\max_{i\in W^{N}_{k}}\Big{|}\sum_{\begin{subarray}{c}(j_{2},...,j_{d})\in\texttt{set}^{N}(j)\\ s.t.\quad j_{2}\in V^{N}_{M_{2}},...,j_{d}\in V^{N}_{M_{d}}\end{subarray}}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d-1}(d-1)!}-\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\Big{|}\leq c_{1}^{N}(m,k,M_{2},...,M_{d})\xrightarrow{N\rightarrow\infty}0, (B.5)

    where c1N(m,k,M2,,Md)c_{1}^{N}(m,k,M_{2},...,M_{d}) depends on m,k,M2,,Mdm,k,M_{2},...,M_{d}. By Lemma 4.1, we have

    limNmaxiWkNN(δiNd1)(δiNd)\displaystyle\lim_{N\rightarrow\infty}\max_{i\in W^{N}_{k}}\frac{N{\delta^{N}_{i}\choose d-1}}{{\delta^{N}_{i}\choose d}} =limNminiWkNN(δiNd1)(δiNd)=dδk,\displaystyle=\lim_{N\rightarrow\infty}\min_{i\in W^{N}_{k}}\frac{N{\delta^{N}_{i}\choose d-1}}{{\delta^{N}_{i}\choose d}}=\frac{d}{\delta_{k}},
    limNmaxjVmNdegvN(k,j)N\displaystyle\lim_{N\rightarrow\infty}\max_{j\in V^{N}_{m}}\frac{\deg^{N}_{v}(k,j)}{N} =limNminjVmNdegvN(k,j)N=ξpk,mwk.\displaystyle=\lim_{N\rightarrow\infty}\min_{j\in V^{N}_{m}}\frac{\deg^{N}_{v}(k,j)}{N}=\xi p_{k,m}w_{k}.

    Then,

    |iWkNξi,jN(δiNd1)(δiNd)dξpk,mwkδk|\displaystyle\Big{|}\sum_{i\in W^{N}_{k}}\xi^{N}_{i,j}\frac{{\delta^{N}_{i}\choose d-1}}{{\delta^{N}_{i}\choose d}}-d\xi\frac{p_{k,m}w_{k}}{\delta_{k}}\Big{|} |iWkNξi,jN(δiNd1)(δiNd)degvN(k,j)dNδk|+|degvN(k,j)dNδkdξpk,mwkδk|\displaystyle\leq\Big{|}\sum_{i\in W^{N}_{k}}\xi^{N}_{i,j}\frac{{\delta^{N}_{i}\choose d-1}}{{\delta^{N}_{i}\choose d}}-\deg^{N}_{v}(k,j)\frac{d}{N\delta_{k}}\Big{|}+\Big{|}\deg^{N}_{v}(k,j)\frac{d}{N\delta_{k}}-d\xi\frac{p_{k,m}w_{k}}{\delta_{k}}\Big{|}
    c1N(m,k)N0,\displaystyle\leq c^{N}_{1}(m,k)\xrightarrow{N\rightarrow\infty}0, (B.6)

    where c1N(m,k)c^{N}_{1}(m,k) only depends on mm and kk.

    Consider (B.1)\eqref{eq:avg-deg-1-1}.

    |iWkNξi,jN(j2,,jd)setN(j)s.t.j2VM2N,,jdVMdNξi,j2N××ξi,jdN(δiNd)(d1)!iWkNξi,jN(δiNd1)(δiNd)h=2dvMhpk,Mhδk|\displaystyle\Big{|}\sum_{i\in W^{N}_{k}}\xi^{N}_{i,j}\sum_{\begin{subarray}{c}(j_{2},...,j_{d})\in\texttt{set}^{N}(j)\\ s.t.\quad j_{2}\in V^{N}_{M_{2}},...,j_{d}\in V^{N}_{M_{d}}\end{subarray}}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}-\sum_{i\in W^{N}_{k}}\xi^{N}_{i,j}\frac{{\delta^{N}_{i}\choose d-1}}{{\delta^{N}_{i}\choose d}}\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\Big{|}
    =|iWkNξi,jN(δiNd1)(δiNd)(j2,,jd)setN(j)s.t.j2VM2N,,jdVMdNξi,j2N××ξi,jdN(δiNd1)(d1)!iWkNξi,jN(δiNd1)(δiNd)h=2dvMhpk,Mhδk|\displaystyle=\Big{|}\sum_{i\in W^{N}_{k}}\xi^{N}_{i,j}\frac{{\delta^{N}_{i}\choose d-1}}{{\delta^{N}_{i}\choose d}}\sum_{\begin{subarray}{c}(j_{2},...,j_{d})\in\texttt{set}^{N}(j)\\ s.t.\quad j_{2}\in V^{N}_{M_{2}},...,j_{d}\in V^{N}_{M_{d}}\end{subarray}}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d-1}(d-1)!}-\sum_{i\in W^{N}_{k}}\xi^{N}_{i,j}\frac{{\delta^{N}_{i}\choose d-1}}{{\delta^{N}_{i}\choose d}}\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\Big{|}
    iWkNξi,jN(δiNd1)(δiNd)|(j2,,jd)setN(j)s.t.j2VM2N,,jdVMdNξi,j2N××ξi,jdN(δiNd1)(d1)!h=2dvMhpk,Mhδk|\displaystyle\leq\sum_{i\in W^{N}_{k}}\xi^{N}_{i,j}\frac{{\delta^{N}_{i}\choose d-1}}{{\delta^{N}_{i}\choose d}}\Big{|}\sum_{\begin{subarray}{c}(j_{2},...,j_{d})\in\texttt{set}^{N}(j)\\ s.t.\quad j_{2}\in V^{N}_{M_{2}},...,j_{d}\in V^{N}_{M_{d}}\end{subarray}}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d-1}(d-1)!}-\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\Big{|}
    (a)iWkNξi,jN(δiNd1)(δiNd)c2N(m,k,M2,,Md)\displaystyle\overset{(a)}{\leq}\sum_{i\in W^{N}_{k}}\xi^{N}_{i,j}\frac{{\delta^{N}_{i}\choose d-1}}{{\delta^{N}_{i}\choose d}}c_{2}^{N}(m,k,M_{2},...,M_{d})
    (b)c2N(m,k,M2,,Md)c2N(m,k)dξpk,mwkδkN0,\displaystyle\overset{(b)}{\leq}c_{2}^{N}(m,k,M_{2},...,M_{d})c^{N}_{2}(m,k)d\xi\frac{p_{k,m}w_{k}}{\delta_{k}}\xrightarrow{N\rightarrow\infty}0, (B.7)

    where c2N(m,k,M2,,Md)N0c_{2}^{N}(m,k,M_{2},...,M_{d})\xrightarrow{N\rightarrow\infty}0 and c2N(m,k)N1c^{N}_{2}(m,k)\xrightarrow{N\rightarrow\infty}1. (a) is from (B.5) and (b) is from (B.6). Hence, (B.1) goes to 0 as NN\rightarrow\infty. Then,

    |iWkNξi,jN(j2,,jd)setN(j)s.t.j2VM2N,,jdVMdNξi,j2N××ξi,jdN(δiNd)(d1)!dξpk,mwkδkh=2dvMhpk,Mhδk|c3N(m,k,M2,,Md)N0,\begin{split}&\Big{|}\sum_{i\in W^{N}_{k}}\xi^{N}_{i,j}\sum_{\begin{subarray}{c}(j_{2},...,j_{d})\in\texttt{set}^{N}(j)\\ s.t.\quad j_{2}\in V^{N}_{M_{2}},...,j_{d}\in V^{N}_{M_{d}}\end{subarray}}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}-d\xi\frac{p_{k,m}w_{k}}{\delta_{k}}\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\Big{|}\\ &\leq c_{3}^{N}(m,k,M_{2},...,M_{d})\xrightarrow{N\rightarrow\infty}0,\end{split} (B.8)

    where c3N(m,k,M2,,Md)c_{3}^{N}(m,k,M_{2},...,M_{d}) only depends on m,k,M2,,Mdm,k,M_{2},...,M_{d}. Since k𝒦k\in\mathcal{K} and (M2,,Md)d1(M_{2},...,M_{d})\in\mathcal{M}^{d-1} are arbitrary, and 𝒦\mathcal{K} and d1\mathcal{M}^{d-1} are finite sets, we have

    maxk𝒦max(M2,,Md)d1|iWkNξi,jN(j2,,jd)setN(j)s.t.j2VM2N,,jdVMdNξi,j2N××ξi,jdN(δiNd)(d1)!dξpk,mwkδkh=2dvMhpk,Mhδk|cN(m)N0,\begin{split}&\max_{k\in\mathcal{K}}\max_{(M_{2},...,M_{d})\in\mathcal{M}^{d-1}}\Big{|}\sum_{i\in W^{N}_{k}}\xi^{N}_{i,j}\sum_{\begin{subarray}{c}(j_{2},...,j_{d})\in\texttt{set}^{N}(j)\\ s.t.\quad j_{2}\in V^{N}_{M_{2}},...,j_{d}\in V^{N}_{M_{d}}\end{subarray}}\frac{\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}-d\xi\frac{p_{k,m}w_{k}}{\delta_{k}}\prod_{h=2}^{d}\frac{v_{M_{h}}p_{k,M_{h}}}{\delta_{k}}\Big{|}\\ &\leq c^{N}(m)\xrightarrow{N\rightarrow\infty}0,\end{split} (B.9)

    where cN(m)c^{N}(m) only depends on mm. Since cN(m)c^{N}(m) does not depend on jVmNj\in V^{N}_{m}, (4.9) holds. ∎

    Proof of Lemma 4.4.

    Fix any mm\in\mathcal{M} and jVmj\in V_{m}. Consider (4.10). When ξi,jN=1\xi^{N}_{i,j}=1, by the definition (4.2) of settN()\texttt{sett}^{N}(\cdot),

    settN(j)ξi,jN×ξi,j2N××ξi,jdN(δiNd)(d1)!ξi,jN×ξi,j2N××ξi,jdN(δiNd)(d1)!=[(d1)!(δiN1d1)]2(2d2)!(δiN12d2)(δiNd)2((d1)!)2.\sum_{\texttt{sett}^{N}{(j)}}\frac{\xi^{N}_{i,j}\times\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}\frac{\xi^{N}_{i,j}\times\xi^{N}_{i,j^{\prime}_{2}}\times\cdots\times\xi^{N}_{i,j^{\prime}_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}=\frac{\Big{[}(d-1)!{\delta^{N}_{i}-1\choose d-1}\Big{]}^{2}-(2d-2)!{\delta^{N}_{i}-1\choose 2d-2}}{{\delta^{N}_{i}\choose d}^{2}\big{(}(d-1)!\big{)}^{2}}.

    Also, by Lemma 4.1, we have that for all k𝒦k\in\mathcal{K} and iWki\in W_{k},

    [(d1)!(δiN1d)]2(2d2)!(δiN12d2)(δiNd)2((d1)!)2\displaystyle\frac{\Big{[}(d-1)!{\delta^{N}_{i}-1\choose d}\Big{]}^{2}-(2d-2)!{\delta^{N}_{i}-1\choose 2d-2}}{{\delta^{N}_{i}\choose d}^{2}\big{(}(d-1)!\big{)}^{2}} [(d1)!maxiWkN(δiN1d)]2(2d2)!miniWkN(δiN12d2)miniWkN(δiNd)2((d1)!)2\displaystyle\leq\frac{\Big{[}(d-1)!\max_{i\in W^{N}_{k}}{\delta^{N}_{i}-1\choose d}\Big{]}^{2}-(2d-2)!\min_{i\in W^{N}_{k}}{\delta^{N}_{i}-1\choose 2d-2}}{\min_{i\in W^{N}_{k}}{\delta^{N}_{i}\choose d}^{2}\big{(}(d-1)!\big{)}^{2}}
    c1(N)[(d1)!(Nδkd1)]2(2d2)!(Nδk2d2)(Nδkd)2((d1)!)2\displaystyle\leq c_{1}(N)\frac{\Big{[}(d-1)!{N\delta_{k}\choose d-1}\Big{]}^{2}-(2d-2)!{N\delta_{k}\choose 2d-2}}{{N\delta_{k}\choose d}^{2}\big{(}(d-1)!\big{)}^{2}}

where c1(N)c_{1}(N) only depends on NN and goes to 11 as NN\rightarrow\infty. By Lemma 4.1, we have that for all k𝒦k\in\mathcal{K}, maxjVmNdegvN(k,j)c2(N,m)|WkN|pk,m\max_{j\in V^{N}_{m}}\deg^{N}_{v}(k,j)\leq c_{2}(N,m)|W^{N}_{k}|p_{k,m}, where c2(N,m)c_{2}(N,m) only depends on NN and mm, and goes to 11 as NN\rightarrow\infty. Hence,

    iWNsettN(j)ξi,jN×ξi,j2N××ξi,jdN(δiNd)(d1)!ξi,jN×ξi,j2N××ξi,jdN(δiNd)(d1)!\displaystyle\sum_{i\in W^{N}}\sum_{\texttt{sett}^{N}{(j)}}\frac{\xi^{N}_{i,j}\times\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}\frac{\xi^{N}_{i,j}\times\xi^{N}_{i,j^{\prime}_{2}}\times\cdots\times\xi^{N}_{i,j^{\prime}_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}
    =k𝒦iWkN[(d1)!(δiN1d)]2(2d2)!(δiN12d2)(δiNd)2((d1)!)2\displaystyle=\sum_{k\in\mathcal{K}}\sum_{i\in W^{N}_{k}}\frac{\Big{[}(d-1)!{\delta^{N}_{i}-1\choose d}\Big{]}^{2}-(2d-2)!{\delta^{N}_{i}-1\choose 2d-2}}{{\delta^{N}_{i}\choose d}^{2}\big{(}(d-1)!\big{)}^{2}}
    c1(N)k𝒦degvN(k,j)[(d1)!(Nδkd1)]2(2d2)!(Nδk2d2)(Nδkd)2((d1)!)2\displaystyle\leq c_{1}(N)\sum_{k\in\mathcal{K}}\deg^{N}_{v}(k,j)\frac{\Big{[}(d-1)!{N\delta_{k}\choose d-1}\Big{]}^{2}-(2d-2)!{N\delta_{k}\choose 2d-2}}{{N\delta_{k}\choose d}^{2}\big{(}(d-1)!\big{)}^{2}}
    c1(N)c2(N,m)k𝒦|WkN|pk,m[(d1)!(Nδkd1)]2(2d2)!(Nδk2d2)(Nδkd)2((d1)!)2.\displaystyle\leq c_{1}(N)c_{2}(N,m)\sum_{k\in\mathcal{K}}|W^{N}_{k}|p_{k,m}\frac{\Big{[}(d-1)!{N\delta_{k}\choose d-1}\Big{]}^{2}-(2d-2)!{N\delta_{k}\choose 2d-2}}{{N\delta_{k}\choose d}^{2}\big{(}(d-1)!\big{)}^{2}}.

    Let c3(N)=maxmc1(N)c2(N,m)c_{3}(N)=\max_{m\in\mathcal{M}}c_{1}(N)c_{2}(N,m) with c3(N)N1c_{3}(N)\xrightarrow{N\rightarrow\infty}1. Then, we have that for large enough NN,

    iWNsettN(j)ξi,jN×ξi,j2N××ξi,jdN(δiNd)(d1)!ξi,jN×ξi,j2N××ξi,jdN(δiNd)(d1)!c3(N)k𝒦|WkN|pk,m[(d1)!(Nδkd1)]2(2d2)!(Nδk2d2)(Nδkd)2((d1)!)22k𝒦|WkN|pk,m[(d1)!(Nδkd1)]2(2d2)!(Nδk2d2)(Nδkd)2((d1)!)2\begin{split}&\sum_{i\in W^{N}}\sum_{\texttt{sett}^{N}{(j)}}\frac{\xi^{N}_{i,j}\times\xi^{N}_{i,j_{2}}\times\cdots\times\xi^{N}_{i,j_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}\frac{\xi^{N}_{i,j}\times\xi^{N}_{i,j^{\prime}_{2}}\times\cdots\times\xi^{N}_{i,j^{\prime}_{d}}}{{\delta^{N}_{i}\choose d}(d-1)!}\\ &\leq c_{3}(N)\sum_{k\in\mathcal{K}}|W^{N}_{k}|p_{k,m}\frac{\Big{[}(d-1)!{N\delta_{k}\choose d-1}\Big{]}^{2}-(2d-2)!{N\delta_{k}\choose 2d-2}}{{N\delta_{k}\choose d}^{2}\big{(}(d-1)!\big{)}^{2}}\\ &\leq 2\sum_{k\in\mathcal{K}}|W^{N}_{k}|p_{k,m}\frac{\Big{[}(d-1)!{N\delta_{k}\choose d-1}\Big{]}^{2}-(2d-2)!{N\delta_{k}\choose 2d-2}}{{N\delta_{k}\choose d}^{2}\big{(}(d-1)!\big{)}^{2}}\end{split} (B.10)

Since limN|WkN|/N=ξwk\lim_{N\rightarrow\infty}|W^{N}_{k}|/N=\xi w_{k} and [(d1)!(xd1)]2(2d2)!(x2d2)C3x2d3\Big{[}(d-1)!{x\choose d-1}\Big{]}^{2}-(2d-2)!{x\choose 2d-2}\leq C_{3}x^{2d-3} for some constant C3C_{3}, by choosing C1C_{1} appropriately, (4.10) holds for all large enough NN. We obtain (4.11) in a similar way. ∎

    Appendix C Unique Solution of ODE (3.8)

    Proof of Lemma 4.5.

    Recall 𝐪¯(t,𝐪0)\bar{\mathbf{q}}(t,\mathbf{q}_{0}) is a solution of (3.8) given the initial point 𝐪N(0)=𝐪0\mathbf{q}^{N}(0)=\mathbf{q}_{0}. For convenience, we denote 𝐪¯(t,𝐪0)\bar{\mathbf{q}}(t,\mathbf{q}_{0}) as 𝐪¯(t)\bar{\mathbf{q}}(t) and write the ODE (3.8) as the following:

    𝐪¯(0)=𝐪0,𝐪˙(t)=𝐡¯(𝐪¯(t)),\bar{\mathbf{q}}(0)=\mathbf{q}_{0},\quad\dot{\mathbf{q}}(t)=\bar{\mathbf{h}}(\bar{\mathbf{q}}(t)), (C.1)

    where for all mm\in\mathcal{M},

    h¯m,0(𝐪)\displaystyle\bar{h}_{m,0}(\mathbf{q}) =0,\displaystyle=0,
    h¯m,l(𝐪)\displaystyle\bar{h}_{m,l}(\mathbf{q}) =um(qm,lqm,l+1)+λξ(qm,l1qm,l)k𝒦pk,mwkδk(q~k,l1)d(q~k,l)dq~k,l1q~k,l,l1.\displaystyle=-u_{m}(q_{m,l}-q_{m,l+1})+\lambda\xi(q_{m,l-1}-q_{m,l})\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_{k}}{\delta_{k}}\frac{(\tilde{q}_{k,l-1})^{d}-(\tilde{q}_{k,l})^{d}}{\tilde{q}_{k,l-1}-\tilde{q}_{k,l}},\quad l\geq 1. (C.2)

Observe that under (C.2), if qm,l(t)=qm,l+1(t)q_{m,l}(t)=q_{m,l+1}(t) for some m,l0,t0m\in\mathcal{M},l\in\mathbb{N}_{0},t\geq 0, then h¯m,l(𝐪(t))0\bar{h}_{m,l}(\mathbf{q}(t))\geq 0 and h¯m,l+1(𝐪(t))0\bar{h}_{m,l+1}(\mathbf{q}(t))\leq 0; if qm,l(t)=0q_{m,l}(t)=0 for some m,l0,t0m\in\mathcal{M},l\in\mathbb{N}_{0},t\geq 0, then h¯m,l(𝐪(t))0\bar{h}_{m,l}(\mathbf{q}(t))\geq 0. Hence, if 𝐪𝒮¯\mathbf{q}\in\bar{\mathcal{S}}, then any solution of (C.1)-(C.2) remains within 𝒮¯\bar{\mathcal{S}}. In order to show existence and uniqueness, we use the Picard successive approximation method ([14, Theorem 1(i)]). In the rest of the proof, we use the norm:

    𝐪=supmsupl0|qm,l|l+1.\left\lVert\mathbf{q}\right\rVert=\sup_{m\in\mathcal{M}}\sup_{l\in\mathbb{N}_{0}}\frac{|q_{m,l}|}{l+1}.

    For any 𝐪\mathbf{q}, 𝐪𝒮¯\mathbf{q}^{\prime}\in\bar{\mathcal{S}},

    𝐡¯(𝐪)K1,𝐡¯(𝐪)𝐡¯(𝐪)K2𝐪𝐪,\left\lVert\bar{\mathbf{h}}(\mathbf{q})\right\rVert\leq K_{1},\quad\left\lVert\bar{\mathbf{h}}(\mathbf{q})-\bar{\mathbf{h}}(\mathbf{q}^{\prime})\right\rVert\leq K_{2}\left\lVert\mathbf{q}-\mathbf{q}^{\prime}\right\rVert, (C.3)

where K1maxmum+λξK_{1}\coloneqq\max_{m\in\mathcal{M}}u_{m}+\lambda\xi and K22maxmum+2dλξK_{2}\coloneqq 2\max_{m\in\mathcal{M}}u_{m}+2d\lambda\xi. For t0t\geq 0, let 𝐪(0)(t)=𝐪0\mathbf{q}^{(0)}(t)=\mathbf{q}_{0}, and following the Picard successive approximation method, let

    𝐪(n)(t)=𝐪0+0t𝐡¯(𝐪(n1)(s))𝑑s,n.\mathbf{q}^{(n)}(t)=\mathbf{q}_{0}+\int_{0}^{t}\bar{\mathbf{h}}(\mathbf{q}^{(n-1)}(s))ds,\quad n\in\mathbb{N}.

    By induction, we have that 𝐪(n)(t)\mathbf{q}^{(n)}(t) is continuous w.r.t. tt on [0,)[0,\infty) for all nn, and that

    𝐪(n+1)(t)𝐪(n)(t)K1K2ntn+1(n+1)!,n,t0.\left\lVert\mathbf{q}^{(n+1)}(t)-\mathbf{q}^{(n)}(t)\right\rVert\leq\frac{K_{1}K_{2}^{n}t^{n+1}}{(n+1)!},\quad\forall n\in\mathbb{N},t\geq 0.

Hence, for all t0t\geq 0, 𝐪()=limn𝐪(n)\mathbf{q}^{(\infty)}=\lim_{n\rightarrow\infty}\mathbf{q}^{(n)} exists, uniformly for s[0,t]s\in[0,t]. Also, by (C.3) and the Dominated Convergence Theorem, the following holds:

    𝐪()(t)=𝐪0+0t𝐡¯(𝐪()(s))𝑑s.\mathbf{q}^{(\infty)}(t)=\mathbf{q}_{0}+\int_{0}^{t}\bar{\mathbf{h}}(\mathbf{q}^{(\infty)}(s))ds. (C.4)

Next, we show the uniqueness. Suppose that 𝐪~()\tilde{\mathbf{q}}^{(\infty)} also satisfies

    𝐪~()(t)=𝐪0+0t𝐡¯(𝐪~()(s))𝑑s.\tilde{\mathbf{q}}^{(\infty)}(t)=\mathbf{q}_{0}+\int_{0}^{t}\bar{\mathbf{h}}(\tilde{\mathbf{q}}^{(\infty)}(s))ds.

    Then, we have

    𝐪~()(t)𝐪(n)(t)=0t[𝐡¯(𝐪~()(s))𝐡¯(𝐪(n1)(s))]𝑑s.\tilde{\mathbf{q}}^{(\infty)}(t)-\mathbf{q}^{(n)}(t)=\int_{0}^{t}\big{[}\bar{\mathbf{h}}(\tilde{\mathbf{q}}^{(\infty)}(s))-\bar{\mathbf{h}}(\mathbf{q}^{(n-1)}(s))\big{]}ds.

    Similarly, we get

    𝐪~()(t)𝐪(n)(t)K1K2ntn+1(n+1)!\left\lVert\tilde{\mathbf{q}}^{(\infty)}(t)-\mathbf{q}^{(n)}(t)\right\rVert\leq\frac{K_{1}K_{2}^{n}t^{n+1}}{(n+1)!}

which implies that 𝐪~()(t)=limn𝐪(n)(t)=𝐪()(t)\tilde{\mathbf{q}}^{(\infty)}(t)=\lim_{n\rightarrow\infty}\mathbf{q}^{(n)}(t)=\mathbf{q}^{(\infty)}(t) for all t0t\geq 0. ∎
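The Picard scheme used in this proof is straightforward to implement once the state is truncated to finitely many coordinates. The following generic sketch (ours; the time integral is approximated by the trapezoidal rule on a grid) iterates 𝐪^{(n+1)}(t) = 𝐪_0 + ∫_0^t 𝐡̄(𝐪^{(n)}(s)) ds and checks the iterates against a linear drift with a known solution.

```python
import numpy as np

def picard(h, q0, T, n_iter=40, n_grid=2001):
    """Picard successive approximations q^{(n+1)}(t) = q0 + int_0^t h(q^{(n)}(s)) ds,
    computed on a uniform grid with the trapezoidal rule (a generic sketch; h can be
    any Lipschitz drift, e.g. a finite truncation of (C.2))."""
    ts = np.linspace(0.0, T, n_grid)
    dt = ts[1] - ts[0]
    q = np.tile(q0, (n_grid, 1))                 # q^{(0)}(t) = q0 for all t
    for _ in range(n_iter):
        drift = np.array([h(x) for x in q])
        integral = np.vstack([np.zeros_like(q0),
                              np.cumsum(0.5 * (drift[1:] + drift[:-1]) * dt, axis=0)])
        q = q0 + integral
    return ts, q

# Check against a linear drift dq/dt = -A q + b with a known solution.
A = np.diag([1.0, 5.0, 10.0]); b = np.array([0.5, 1.5, 2.0]); T = 0.5
ts, q = picard(lambda x: -A @ x + b, q0=np.zeros(3), T=T)
exact = (b / np.diag(A)) * (1.0 - np.exp(-np.diag(A) * T))
print(np.abs(q[-1] - exact).max())               # small: iterates converge on [0, T]
```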

    Appendix D Proof of Proposition 4.9

    Lemma D.1.

If 𝐪N(0)\mathbf{q}^{N}(0) weakly converges to 𝐪(0)=𝐪𝒮\mathbf{q}(0)=\mathbf{q}^{\infty}\in\mathcal{S}, then for any ε>0\varepsilon>0, δ>0\delta>0, and T>0T>0, there exist 0\ell\in\mathbb{N}_{0} and N00N_{0}\in\mathbb{N}_{0}, depending on 𝐪\mathbf{q}^{\infty}, ε\varepsilon, δ\delta, and TT, such that, for all NN0N\geq N_{0},

    (supt[0,T]supmqm,N(t)ε)<δ.\mathbb{P}\Big{(}\sup_{t\in[0,T]}\sup_{m\in\mathcal{M}}q^{N}_{m,\ell}(t)\geq\varepsilon\Big{)}<\delta. (D.1)
    Proof.

Fix any ε>0\varepsilon>0 and δ>0\delta>0. Since 𝐪𝒮\mathbf{q}^{\infty}\in\mathcal{S}, there exists 10\ell_{1}\in\mathbb{N}_{0} such that supmqm,1ε/4\sup_{m\in\mathcal{M}}q^{\infty}_{m,\ell_{1}}\leq\varepsilon/4. By the weak convergence 𝐪N(0)𝐪\mathbf{q}^{N}(0)\Rightarrow\mathbf{q}^{\infty}, there exists N10N_{1}\in\mathbb{N}_{0} such that for all NN1N\geq N_{1},

(supmqm,1N(0)ε/2)(𝐪N(0)𝐪1ε/4)<δ2.\mathbb{P}\big{(}\sup_{m\in\mathcal{M}}q^{N}_{m,\ell_{1}}(0)\geq\varepsilon/2\big{)}\leq\mathbb{P}\Big{(}\left\lVert\mathbf{q}^{N}(0)-\mathbf{q}^{\infty}\right\rVert_{1}\geq\varepsilon/4\Big{)}<\frac{\delta}{2}. (D.2)

    Let =1+supm4ξλTvmε\ell=\ell_{1}+\sup_{m\in\mathcal{M}}\lceil\frac{4\xi\lambda T}{v_{m}\varepsilon}\rceil. Hence,

(supt[0,T]supmqm,N(t)ε)(supt[0,T]supmqm,N(t)ε|supmqm,1N(0)<ε/2)+(supmqm,1N(0)ε/2).\begin{split}&\mathbb{P}\Big{(}\sup_{t\in[0,T]}\sup_{m\in\mathcal{M}}q^{N}_{m,\ell}(t)\geq\varepsilon\Big{)}\\ &\leq\mathbb{P}\Big{(}\sup_{t\in[0,T]}\sup_{m\in\mathcal{M}}q^{N}_{m,\ell}(t)\geq\varepsilon|\sup_{m\in\mathcal{M}}q^{N}_{m,\ell_{1}}(0)<\varepsilon/2\Big{)}+\mathbb{P}\big{(}\sup_{m\in\mathcal{M}}q^{N}_{m,\ell_{1}}(0)\geq\varepsilon/2\big{)}.\end{split} (D.3)

Given supmqm,1N(0)<ε/2\sup_{m\in\mathcal{M}}q^{N}_{m,\ell_{1}}(0)<\varepsilon/2, i.e., qm,1N(0)|VmN|<ε/2|VmN|q^{N}_{m,\ell_{1}}(0)|V^{N}_{m}|<\varepsilon/2|V^{N}_{m}| for all mm\in\mathcal{M}, if qm,N(t)εq^{N}_{m,\ell}(t)\geq\varepsilon for some t[0,T]t\in[0,T] and mm\in\mathcal{M}, i.e., qm,N(t)|VmN|ε|VmN|q^{N}_{m,\ell}(t)|V^{N}_{m}|\geq\varepsilon|V^{N}_{m}|, then at least infm|VmN|ε(1)/2\inf_{m\in\mathcal{M}}|V^{N}_{m}|\varepsilon(\ell-\ell_{1})/2 tasks must have arrived in the system. Using the standard concentration inequality for Poisson random variables (see [12, Theorem 2.3(b)]), we have

    (supt[0,T]supmqm,N(t)ε|supmqm,1N(0)<ε/2)(Po(W(N)λ)infm|VmN|ε(1)/2)(Po(NξλT)2C(N)NξλT)exp(((2C(N)1)NξλT)22(NξλT+((2C(N)1)NξλT)/3))N0,\begin{split}&\mathbb{P}\Big{(}\sup_{t\in[0,T]}\sup_{m\in\mathcal{M}}q^{N}_{m,\ell}(t)\geq\varepsilon|\sup_{m\in\mathcal{M}}q^{N}_{m,\ell_{1}}(0)<\varepsilon/2\Big{)}\leq\mathbb{P}\big{(}\mathrm{Po}(W(N)\lambda)\geq\inf_{m\in\mathcal{M}}|V^{N}_{m}|\varepsilon(\ell-\ell_{1})/2\big{)}\\ &\leq\mathbb{P}\big{(}\mathrm{Po}(N\xi\lambda T)\geq 2C(N)N\xi\lambda T\big{)}\leq\exp\Big{(}-\frac{\big{(}(2C(N)-1)N\xi\lambda T\big{)}^{2}}{2\big{(}N\xi\lambda T+((2C(N)-1)N\xi\lambda T)/3\big{)}}\Big{)}\xrightarrow{N\rightarrow\infty}0,\end{split} (D.4)

where Po(x)\mathrm{Po}(x) denotes a Poisson random variable with mean xx, and C(N)C(N) is a positive constant that depends only on NN and goes to 11 as NN goes to infinity. The second inequality comes from the assumptions W(N)/NξW(N)/N\rightarrow\xi and |VmN|/Nvm|V^{N}_{m}|/N\rightarrow v_{m}, m\forall m\in\mathcal{M}. By (D.4), there exists N20N_{2}\in\mathbb{N}_{0} such that for all NN2N\geq N_{2},

    (supt[0,T]supmqm,N(t)ε|supmqm,1N(0)<ε/2)<δ2.\mathbb{P}\Big{(}\sup_{t\in[0,T]}\sup_{m\in\mathcal{M}}q^{N}_{m,\ell}(t)\geq\varepsilon|\sup_{m\in\mathcal{M}}q^{N}_{m,\ell_{1}}(0)<\varepsilon/2\Big{)}<\frac{\delta}{2}. (D.5)

    Let N0=max(N1,N2)N_{0}=\max(N_{1},N_{2}). By (D.2), (D.3) and (D.5),

(supt[0,T]supmqm,N(t)ε)<δ.\mathbb{P}\Big{(}\sup_{t\in[0,T]}\sup_{m\in\mathcal{M}}q^{N}_{m,\ell}(t)\geq\varepsilon\Big{)}<\delta. ∎
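The Poisson concentration bound invoked in (D.4) can be compared with exact tail probabilities. The sketch below uses our reading of [12, Theorem 2.3(b)], namely ℙ(Po(μ) ≥ μ + x) ≤ exp(−x²/(2(μ + x/3))).

```python
import numpy as np
from scipy.stats import poisson

def bernstein_poisson_tail(mu, x):
    """Upper bound on P(Po(mu) >= mu + x) used in (D.4)
    (our reading of [12, Theorem 2.3(b)])."""
    return np.exp(-x ** 2 / (2.0 * (mu + x / 3.0)))

for mu in [10.0, 100.0, 1000.0]:
    x = mu                              # threshold 2*mu, as in (D.4) with C(N) close to 1
    exact = poisson.sf(mu + x - 1, mu)  # P(Po(mu) >= mu + x)
    print(mu, exact, bernstein_poisson_tail(mu, x), exact <= bernstein_poisson_tail(mu, x))
```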

    Lemma D.2.

    For each mm\in\mathcal{M} and k𝒦k\in\mathcal{K},

    supUVmN||EkN(U)||EkN(VN)|vmpk,mδk|U||VmN||0 as N.\sup_{U\subseteq V^{N}_{m}}\Big{|}\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}-\frac{v_{m}p_{k,m}}{\delta_{k}}\frac{|U|}{|V^{N}_{m}|}\Big{|}\rightarrow 0\text{ as }N\rightarrow\infty. (D.6)
    Proof.

Fix any ε>0\varepsilon>0. By Condition 2.1 and Lemma 4.1, there exists N(ε)0N(\varepsilon)\in\mathbb{N}_{0} such that for all NN(ε)N\geq N(\varepsilon),

    (1ε)pk,m|WkN||U||EkN(U)|(1+ε)pk,m|WkN||U|,UVmN,(1-\varepsilon)p_{k,m}|W^{N}_{k}||U|\leq|E^{N}_{k}(U)|\leq(1+\varepsilon)p_{k,m}|W^{N}_{k}||U|,\quad\forall U\subseteq V^{N}_{m}, (D.7)

    and

    (1ε)mpk,m|WkN||VmN||EkN(VN)|(1+ε)mpk,m|WkN||VmN|.(1-\varepsilon)\sum_{m\in\mathcal{M}}p_{k,m}|W^{N}_{k}||V^{N}_{m}|\leq|E^{N}_{k}(V^{N})|\leq(1+\varepsilon)\sum_{m\in\mathcal{M}}p_{k,m}|W^{N}_{k}||V^{N}_{m}|. (D.8)

    Hence, for all NN(ε)N\geq N(\varepsilon),

    supUVmN||EkN(U)||VmN||EkN(VN)||U|vmpk,mδk|max{ε1(ε,N),ε2(ε,N)},\sup_{U\subseteq V^{N}_{m}}\Big{|}\frac{|E^{N}_{k}(U)||V^{N}_{m}|}{|E^{N}_{k}(V^{N})||U|}-\frac{v_{m}p_{k,m}}{\delta_{k}}\Big{|}\leq\max\Big{\{}\varepsilon_{1}(\varepsilon,N),\varepsilon_{2}(\varepsilon,N)\Big{\}}, (D.9)

    where ε1(ε,N)=|(1ε)pk,m|WkN||VmN|(1+ε)mpk,m|WkN||VmN|vmpk,mδk|\varepsilon_{1}(\varepsilon,N)=\Big{|}\frac{(1-\varepsilon)p_{k,m}|W^{N}_{k}||V^{N}_{m}|}{(1+\varepsilon)\sum_{m\in\mathcal{M}}p_{k,m}|W^{N}_{k}||V^{N}_{m}|}-\frac{v_{m}p_{k,m}}{\delta_{k}}\Big{|} and ε2(ε,N)=|(1+ε)pk,m|WkN||VmN|(1ε)mpk,m|WkN||VmN|vmpk,mδk|\varepsilon_{2}(\varepsilon,N)=\Big{|}\frac{(1+\varepsilon)p_{k,m}|W^{N}_{k}||V^{N}_{m}|}{(1-\varepsilon)\sum_{m\in\mathcal{M}}p_{k,m}|W^{N}_{k}||V^{N}_{m}|}-\frac{v_{m}p_{k,m}}{\delta_{k}}\Big{|}. Again, by Condition 2.1 and Lemma 4.1,

    limNsupUVmN||EkN(U)||VmN||EkN(VN)||U|vmpk,mδk|limNmax{ε1(ε,N),ε2(ε,N)}=max{|(1ε)vmpk,m(1+ε)δkvmpk,mδk|,|(1+ε)vmpk,m(1ε)δkvmpk,mδk|}\begin{split}&\lim_{N\rightarrow\infty}\sup_{U\subseteq V^{N}_{m}}\Big{|}\frac{|E^{N}_{k}(U)||V^{N}_{m}|}{|E^{N}_{k}(V^{N})||U|}-\frac{v_{m}p_{k,m}}{\delta_{k}}\Big{|}\\ &\leq\lim_{N\rightarrow\infty}\max\Big{\{}\varepsilon_{1}(\varepsilon,N),\varepsilon_{2}(\varepsilon,N)\Big{\}}\\ &=\max\Big{\{}\Big{|}\frac{(1-\varepsilon)v_{m}p_{k,m}}{(1+\varepsilon)\delta_{k}}-\frac{v_{m}p_{k,m}}{\delta_{k}}\Big{|},\Big{|}\frac{(1+\varepsilon)v_{m}p_{k,m}}{(1-\varepsilon)\delta_{k}}-\frac{v_{m}p_{k,m}}{\delta_{k}}\Big{|}\Big{\}}\end{split} (D.10)

    Since (D.10) holds for any ε>0\varepsilon>0, we have

    limNsupUVmN||EkN(U)||VmN||EkN(VN)||U|vmpk,mδk|limε0max{|(1ε)vmpk,m(1+ε)δkvmpk,mδk|,|(1+ε)vmpk,m(1ε)δkvmpk,mδk|}=0.\begin{split}&\lim_{N\rightarrow\infty}\sup_{U\subseteq V^{N}_{m}}\Big{|}\frac{|E^{N}_{k}(U)||V^{N}_{m}|}{|E^{N}_{k}(V^{N})||U|}-\frac{v_{m}p_{k,m}}{\delta_{k}}\Big{|}\\ &\leq\lim_{\varepsilon\downarrow 0}\max\Big{\{}\Big{|}\frac{(1-\varepsilon)v_{m}p_{k,m}}{(1+\varepsilon)\delta_{k}}-\frac{v_{m}p_{k,m}}{\delta_{k}}\Big{|},\Big{|}\frac{(1+\varepsilon)v_{m}p_{k,m}}{(1-\varepsilon)\delta_{k}}-\frac{v_{m}p_{k,m}}{\delta_{k}}\Big{|}\Big{\}}=0.\end{split} (D.11)

    Proof of Proposition 4.9.

    Consider any fixed k𝒦k\in\mathcal{K}. Also, fix ε1>0\varepsilon_{1}>0 and ε2>0\varepsilon_{2}>0. By the triangle inequality, we have

    (supt[0,T]|{iWkN:ml0|x^i,m,lN(t)xk,m,lN(t)|>ε1}|ε2M(N)/K)\displaystyle\mathbb{P}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|\hat{x}^{N}_{i,m,l}(t)-x^{N}_{k,m,l}(t)|>\varepsilon_{1}\Big{\}}\Big{|}\geq\varepsilon_{2}M(N)/K\Big{)}
    (supt[0,T]|{iWkN:m0l1|x^i,m,lN(t)xk,m,lN(t)|>ε1/4}|ε2M(N)/(4K))\displaystyle\leq\mathbb{P}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:\sum_{m\in\mathcal{M}}\sum_{0\leq l\leq\ell-1}|\hat{x}^{N}_{i,m,l}(t)-x^{N}_{k,m,l}(t)|>\varepsilon_{1}/4\Big{\}}\Big{|}\geq\varepsilon_{2}M(N)/(4K)\Big{)}
    +(supt[0,T]|{iWkN:mlx^i,m,lN(t)>ε1/2}|ε2M(N)/(2K))\displaystyle\quad+\mathbb{P}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:\sum_{m\in\mathcal{M}}\sum_{l\geq\ell}\hat{x}^{N}_{i,m,l}(t)>\varepsilon_{1}/2\Big{\}}\Big{|}\geq\varepsilon_{2}M(N)/(2K)\Big{)}
    +(supt[0,T]|{iWkN:mlxk,m,lN(t)>ε1/4}|ε2M(N)/(4K))\displaystyle\quad+\mathbb{P}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:\sum_{m\in\mathcal{M}}\sum_{l\geq\ell}x^{N}_{k,m,l}(t)>\varepsilon_{1}/4\Big{\}}\Big{|}\geq\varepsilon_{2}M(N)/(4K)\Big{)}
    0l1(supt[0,T]|{iWkN:m|x^i,m,lN(t)xk,m,lN(t)|>ε1/(4)}|ε2M(N)/(4K))\displaystyle\leq\sum_{0\leq l\leq\ell-1}\mathbb{P}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:\sum_{m\in\mathcal{M}}|\hat{x}^{N}_{i,m,l}(t)-x^{N}_{k,m,l}(t)|>\varepsilon_{1}/(4\ell)\Big{\}}\Big{|}\geq\varepsilon_{2}M(N)/(4\ell K)\Big{)}
    +(supt[0,T]|{iWkN:m|l(x^i,m,lN(t)xk,m,lN(t))|>ε1/4}|ε2M(N)/(4K))\displaystyle\quad+\mathbb{P}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:\sum_{m\in\mathcal{M}}\big{|}\sum_{l\geq\ell}(\hat{x}^{N}_{i,m,l}(t)-x^{N}_{k,m,l}(t))\big{|}>\varepsilon_{1}/4\Big{\}}\Big{|}\geq\varepsilon_{2}M(N)/(4K)\Big{)}
    +2(supt[0,T]|{iWkN:mlxk,m,lN(t)>ε1/4}|ε2M(N)/(4K))\displaystyle\quad+2\mathbb{P}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:\sum_{m\in\mathcal{M}}\sum_{l\geq\ell}x^{N}_{k,m,l}(t)>\varepsilon_{1}/4\Big{\}}\Big{|}\geq\varepsilon_{2}M(N)/(4K)\Big{)} (D.12)

    By the triangle inequality and Markov’s inequality,

\displaystyle\sum_{0\leq l\leq\ell-1}\mathbb{P}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:\sum_{m\in\mathcal{M}}|\hat{x}^{N}_{i,m,l}(t)-x^{N}_{k,m,l}(t)|>\varepsilon_{1}/(4\ell)\Big{\}}\Big{|}\geq\varepsilon_{2}M(N)/(4\ell K)\Big{)}
\displaystyle\leq\sum_{0\leq l\leq\ell-1}\Big{(}\mathbb{P}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:\sum_{m\in\mathcal{M}}\big{|}\hat{x}^{N}_{i,m,l}(t)-\frac{|E^{N}_{k}(U^{N}_{m,l}(t))|}{|E^{N}_{k}(V^{N})|}\big{|}>\varepsilon_{1}/(8\ell)\Big{\}}\Big{|}\geq\varepsilon_{2}M(N)/(4\ell K)\Big{)}
\displaystyle\hskip 42.67912pt+\mathbb{P}\Big{(}\sup_{t\in[0,T]}\sum_{m\in\mathcal{M}}\Big{|}\frac{|E^{N}_{k}(U^{N}_{m,l}(t))|}{|E^{N}_{k}(V^{N})|}-x^{N}_{k,m,l}(t)\Big{|}>\varepsilon_{1}/(8\ell)\Big{)}\Big{)}
\displaystyle\leq\frac{4\ell K}{\varepsilon_{2}M(N)}\sum_{0\leq l\leq\ell-1}\mathbb{E}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:\sum_{m\in\mathcal{M}}|\hat{x}^{N}_{i,m,l}(t)-\frac{|E^{N}_{k}(U^{N}_{m,l}(t))|}{|E^{N}_{k}(V^{N})|}|>\varepsilon_{1}/(8\ell)\Big{\}}\Big{|}\Big{)}
\displaystyle\hskip 42.67912pt+\sum_{0\leq l\leq\ell-1}\mathbb{P}\Big{(}\sum_{m\in\mathcal{M}}\sup_{U\subseteq V^{N}_{m}}\Big{|}\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}-\frac{v_{m}p_{k,m}}{\delta_{k}}\frac{|U|}{|V^{N}_{m}|}\Big{|}>\varepsilon_{1}/(8\ell)\Big{)}
\displaystyle\leq\frac{4\ell K}{\varepsilon_{2}M(N)}\sum_{0\leq l\leq\ell-1}\sum_{m\in\mathcal{M}}\mathbb{E}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:|\hat{x}^{N}_{i,m,l}(t)-\frac{|E^{N}_{k}(U^{N}_{m,l}(t))|}{|E^{N}_{k}(V^{N})|}|>\varepsilon_{1}/(8M\ell)\Big{\}}\Big{|}\Big{)}
\displaystyle\hskip 42.67912pt+\sum_{0\leq l\leq\ell-1}\sum_{m\in\mathcal{M}}\mathbb{P}\Big{(}\sup_{U\subseteq V^{N}_{m}}\Big{|}\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}-\frac{v_{m}p_{k,m}}{\delta_{k}}\frac{|U|}{|V^{N}_{m}|}\Big{|}>\varepsilon_{1}/(8M\ell)\Big{)}
\displaystyle\leq\frac{4\ell K}{\varepsilon_{2}M(N)}\sum_{0\leq l\leq\ell-1}\sum_{m\in\mathcal{M}}\sup_{U\subseteq V^{N}_{m}}\Big{|}\Big{\{}i\in W^{N}_{k}:\big{|}\frac{|\mathcal{N}^{N}_{w}(i)\cap U|}{|\mathcal{N}^{N}_{w}(i)|}-\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}\big{|}>\varepsilon_{1}/(8M\ell)\Big{\}}\Big{|}
\displaystyle\hskip 42.67912pt+\sum_{0\leq l\leq\ell-1}\sum_{m\in\mathcal{M}}\mathbb{P}\Big{(}\sup_{U\subseteq V^{N}_{m}}\Big{|}\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}-\frac{v_{m}p_{k,m}}{\delta_{k}}\frac{|U|}{|V^{N}_{m}|}\Big{|}>\varepsilon_{1}/(8M\ell)\Big{)}
\displaystyle\leq\frac{4\ell^{2}KM}{\varepsilon_{2}M(N)}\sup_{U\subseteq V^{N}}\Big{|}\Big{\{}i\in W^{N}_{k}:\big{|}\frac{|\mathcal{N}^{N}_{w}(i)\cap U|}{|\mathcal{N}^{N}_{w}(i)|}-\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}\big{|}>\varepsilon_{1}/(8M\ell)\Big{\}}\Big{|}
\displaystyle\hskip 42.67912pt+\sum_{0\leq l\leq\ell-1}\sum_{m\in\mathcal{M}}\mathbb{P}\Big{(}\sup_{U\subseteq V^{N}_{m}}\Big{|}\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}-\frac{v_{m}p_{k,m}}{\delta_{k}}\frac{|U|}{|V^{N}_{m}|}\Big{|}>\varepsilon_{1}/(8M\ell)\Big{)} (D.13)

By Lemma D.2, there exists $N_{1}\in\mathbb{N}_{0}$ such that for all $N\geq N_{1}$,

\sup_{m\in\mathcal{M}}\sup_{U\subseteq V^{N}_{m}}\Big{|}\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}-\frac{v_{m}p_{k,m}}{\delta_{k}}\frac{|U|}{|V^{N}_{m}|}\Big{|}\leq\varepsilon_{1}/(8M\ell), (D.14)

    implying

\begin{split}&\sum_{0\leq l\leq\ell-1}\mathbb{P}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:\sum_{m\in\mathcal{M}}|\hat{x}^{N}_{i,m,l}(t)-x^{N}_{k,m,l}(t)|>\varepsilon_{1}/(4\ell)\Big{\}}\Big{|}\geq\varepsilon_{2}M(N)/(4\ell K)\Big{)}\\ &\leq\frac{4\ell^{2}KM}{\varepsilon_{2}M(N)}\sup_{U\subseteq V^{N}}\Big{|}\Big{\{}i\in W^{N}_{k}:\big{|}\frac{|\mathcal{N}^{N}_{w}(i)\cap U|}{|\mathcal{N}^{N}_{w}(i)|}-\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}\big{|}>\varepsilon_{1}/(8M\ell)\Big{\}}\Big{|}\end{split} (D.15)

Similarly, there exists $N_{2}\in\mathbb{N}_{0}$ such that for all $N\geq N_{2}$,

\displaystyle\mathbb{P}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:\sum_{m\in\mathcal{M}}\big{|}\sum_{l\geq\ell}(\hat{x}^{N}_{i,m,l}(t)-x^{N}_{k,m,l}(t))\big{|}>\varepsilon_{1}/4\Big{\}}\Big{|}\geq\varepsilon_{2}M(N)/(4K)\Big{)}
\displaystyle\leq\frac{4K}{\varepsilon_{2}M(N)}\sup_{U\subseteq V^{N}}\Big{|}\Big{\{}i\in W^{N}_{k}:\big{|}\frac{|\mathcal{N}^{N}_{w}(i)\cap U|}{|\mathcal{N}^{N}_{w}(i)|}-\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}\big{|}>\varepsilon_{1}/4\Big{\}}\Big{|}. (D.16)

By (D.12), (D.15) and (D.16), setting $N_{3}=\max(N_{1},N_{2})$, we have for all $N\geq N_{3}$,

\begin{split}&\mathbb{P}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|\hat{x}^{N}_{i,m,l}(t)-x^{N}_{k,m,l}(t)|>\varepsilon_{1}\Big{\}}\Big{|}\geq\varepsilon_{2}M(N)/K\Big{)}\\ &\leq\frac{8\ell^{2}KM}{\varepsilon_{2}M(N)}\sup_{U\subseteq V^{N}}\Big{|}\Big{\{}i\in W^{N}_{k}:\big{|}\frac{|\mathcal{N}^{N}_{w}(i)\cap U|}{|\mathcal{N}^{N}_{w}(i)|}-\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}\big{|}>\varepsilon_{1}/4\Big{\}}\Big{|}\\ &\quad+2\mathbb{P}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:\sum_{m\in\mathcal{M}}\sum_{l\geq\ell}x^{N}_{k,m,l}(t)>\varepsilon_{1}/4\Big{\}}\Big{|}\geq\varepsilon_{2}M(N)/(4K)\Big{)}.\end{split} (D.17)

Fix any $\varepsilon_{3}>0$. By Definition 3.8, there exists $N_{4}\in\mathbb{N}_{0}$ such that for all $N\geq N_{4}$,

\sup_{U\subseteq V^{N}}\Big{|}\Big{\{}i\in W^{N}_{k}:\big{|}\frac{|\mathcal{N}^{N}_{w}(i)\cap U|}{|\mathcal{N}^{N}_{w}(i)|}-\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}\big{|}>\varepsilon_{1}/4\Big{\}}\Big{|}\leq\frac{\varepsilon_{2}M(N)\varepsilon_{3}}{16\ell^{2}KM}. (D.18)

By Lemma D.1, there exists $N_{5}\in\mathbb{N}_{0}$ such that for all $N\geq N_{5}$,

\mathbb{P}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:\sum_{m\in\mathcal{M}}\sum_{l\geq\ell}x^{N}_{k,m,l}(t)>\varepsilon_{1}/4\Big{\}}\Big{|}\geq\varepsilon_{2}M(N)/(4K)\Big{)}\leq\frac{\varepsilon_{3}}{4}. (D.19)

Hence, for all $N\geq\max(N_{3},N_{4},N_{5})$,

\mathbb{P}\Big{(}\sup_{t\in[0,T]}\Big{|}\Big{\{}i\in W^{N}_{k}:\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|\hat{x}^{N}_{i,m,l}(t)-x^{N}_{k,m,l}(t)|>\varepsilon_{1}\Big{\}}\Big{|}\geq\varepsilon_{2}M(N)/K\Big{)}\leq\varepsilon_{3}. (D.20)

Since $\varepsilon_{3}>0$ is arbitrary, the desired result holds. ∎

    Appendix E Bound the Mismatch

    Proof.

Define a function $F^{N}_{m,l}(\cdot):\mathcal{S}\rightarrow[0,1]$ as follows: for $\mathbf{x}=(x_{m,l},m\in\mathcal{M},l\in\mathbb{N}_{0})\in\mathcal{S}$,

F^{N}_{m,l}(\mathbf{x})=\frac{\sum_{r=1}^{d}\sum_{r_{1}=1}^{r}\frac{r_{1}}{r}{Nx_{m,l}\choose r_{1}}{N\sum_{\mathcal{M}\setminus\{m\}}x_{m,l}\choose r-r_{1}}{N\sum_{\mathcal{M}}\sum_{l^{\prime}\geq l+1}x_{m,l^{\prime}}\choose d-r}}{{N\choose d}}. (E.1)

Also, define a function $f_{m,l}(\cdot)$ as follows: for $\mathbf{x}\in\mathcal{S}$,

f_{m,l}(\mathbf{x})=\sum_{r=1}^{d}\sum_{r_{1}=1}^{r}\frac{r_{1}}{r}\frac{d!}{r_{1}!(r-r_{1})!(d-r)!}\big{(}x_{m,l}\big{)}^{r_{1}}\big{(}\sum_{\mathcal{M}\setminus\{m\}}x_{m,l}\big{)}^{r-r_{1}}\big{(}\sum_{\mathcal{M}}\sum_{l^{\prime}\geq l+1}x_{m,l^{\prime}}\big{)}^{d-r} (E.2)

Note that for any $0\leq y\leq x\leq 1$ and $1\leq k\leq d$, $x^{k}-(x-y)^{k}\leq kx^{k-1}y\leq ky$. Then, we have

\displaystyle\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|F^{N}_{m,l}(\mathbf{x})-f_{m,l}(\mathbf{x})|
\displaystyle\leq\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}\sum_{r=1}^{d}\sum_{r_{1}=1}^{r}\frac{r_{1}}{r}\frac{d!}{r_{1}!(r-r_{1})!(d-r)!}\Big{(}\big{(}x_{m,l}\big{)}^{r_{1}}\big{(}\sum_{\mathcal{M}\setminus\{m\}}x_{m,l}\big{)}^{r-r_{1}}\big{(}\sum_{\mathcal{M}}\sum_{l^{\prime}\geq l+1}x_{m,l^{\prime}}\big{)}^{d-r}
\displaystyle\hskip 99.58464pt-\big{(}x_{m,l}-\frac{r_{1}}{N}\big{)}^{r_{1}}\big{(}\sum_{\mathcal{M}\setminus\{m\}}x_{m,l}-\frac{r-r_{1}}{N}\big{)}^{r-r_{1}}\big{(}\sum_{\mathcal{M}}\sum_{l^{\prime}\geq l+1}x_{m,l^{\prime}}-\frac{d-r}{N}\big{)}^{d-r}\Big{)}
\displaystyle\leq\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}\sum_{r=1}^{d}\sum_{r_{1}=1}^{r}\frac{r_{1}}{r}\frac{d!r_{1}(r-r_{1})(d-r)}{r_{1}!(r-r_{1})!(d-r)!}x_{m,l}(\sum_{\mathcal{M}\setminus\{m\}}x_{m,l})(\sum_{\mathcal{M}}\sum_{l^{\prime}\geq l+1}x_{m,l^{\prime}})\big{(}\frac{d}{N}\big{)}^{d}
\displaystyle\leq\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}\sum_{r=1}^{d}\sum_{r_{1}=1}^{r}\frac{r_{1}}{r}\frac{d!r_{1}(r-r_{1})(d-r)}{r_{1}!(r-r_{1})!(d-r)!}x_{m,l}\big{(}\frac{d}{N}\big{)}^{d}
\displaystyle=\sum_{r=1}^{d}\sum_{r_{1}=1}^{r}\frac{r_{1}}{r}\frac{d!r_{1}(r-r_{1})(d-r)}{r_{1}!(r-r_{1})!(d-r)!}\big{(}\frac{d}{N}\big{)}^{d}\rightarrow 0\text{ as }N\rightarrow\infty. (E.3)
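
To make the gap between (E.1) and (E.2) concrete, the following is a minimal numerical sketch (not part of the proof): it evaluates $F^{N}_{m,l}$ and $f_{m,l}$ on a hypothetical state $\mathbf{x}$ and prints their difference, which (E.3) bounds by a quantity vanishing as $N\rightarrow\infty$. All parameter values in the snippet are illustrative assumptions, and $x_{m,l}$ is interpreted here as the fraction of servers of type $m$ with exactly $l$ tasks.

```python
# Illustrative sketch only: compare the finite-N selection probability F^N_{m,l}
# of (E.1) with its limit f_{m,l} of (E.2) on a toy state.  All numbers are
# assumptions made for this sketch, not values taken from the paper.
from math import comb, factorial

d = 2                                   # power-of-d choices (assumed)
N = 1000                                # system size (assumed)
# x[m][l]: fraction of servers of type m with exactly l tasks (assumed values,
# chosen so that N * x[m][l] is an integer and all fractions sum to 1)
x = {0: [0.20, 0.15, 0.10, 0.05], 1: [0.25, 0.15, 0.07, 0.03]}

def above(l):
    # total fraction of servers (all types) with more than l tasks
    return sum(sum(x[m][l + 1:]) for m in x)

def others(m, l):
    # fraction of servers of the other types with exactly l tasks
    return sum(x[mm][l] for mm in x if mm != m)

def F(m, l):
    # finite-N expression as in (E.1): sampling d servers without replacement
    total = 0.0
    for r in range(1, d + 1):
        for r1 in range(1, r + 1):
            total += (r1 / r) * comb(round(N * x[m][l]), r1) \
                     * comb(round(N * others(m, l)), r - r1) \
                     * comb(round(N * above(l)), d - r)
    return total / comb(N, d)

def f(m, l):
    # mean-field limit as in (E.2): multinomial form of the same expression
    total = 0.0
    for r in range(1, d + 1):
        for r1 in range(1, r + 1):
            coeff = factorial(d) / (factorial(r1) * factorial(r - r1) * factorial(d - r))
            total += (r1 / r) * coeff * x[m][l] ** r1 \
                     * others(m, l) ** (r - r1) * above(l) ** (d - r)
    return total

for m in x:
    for l in range(3):
        print(m, l, abs(F(m, l) - f(m, l)))   # gaps shrink as N grows, cf. (E.3)
```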

Let $\hat{\mathbf{x}}^{N}_{i}=(\hat{x}^{N}_{i,m,l},m\in\mathcal{M},l\in\mathbb{N}_{0})$ and $\mathbf{x}^{\prime N}_{k}=(x^{\prime N}_{k,m,l},m\in\mathcal{M},l\in\mathbb{N}_{0})$. By (4.22) and (4.23), $p^{N}_{m,l}(i)=F^{N}_{m,l}(\hat{\mathbf{x}}^{N}_{i})$ and $p^{\prime N}_{m,l}(k)=F^{N}_{m,l}(\mathbf{x}^{\prime N}_{k})$. By the Optimal Coupling, we have

\displaystyle\mathbb{P}(\textit{Mismatch})\leq\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|F^{N}_{m,l}(\hat{\mathbf{x}}^{N}_{i})-F^{N}_{m,l}(\mathbf{x}^{\prime N}_{k})|
\displaystyle\leq\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|F^{N}_{m,l}(\hat{\mathbf{x}}^{N}_{i})-f_{m,l}(\hat{\mathbf{x}}^{N}_{i})|+\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|F^{N}_{m,l}(\mathbf{x}^{\prime N}_{k})-f_{m,l}(\mathbf{x}^{\prime N}_{k})|
\displaystyle\quad+\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|f_{m,l}(\hat{\mathbf{x}}^{N}_{i})-f_{m,l}(\mathbf{x}^{\prime N}_{k})| (E.4)

Next, we show that $f_{m,l}(\cdot)$ is Lipschitz continuous on $\mathcal{S}$:

\displaystyle\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}|f_{m,l}(\hat{\mathbf{x}}^{N}_{i})-f_{m,l}(\mathbf{x}^{\prime N}_{k})|
\displaystyle\leq\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}\sum_{r=1}^{d}\sum_{r_{1}=1}^{r}\frac{r_{1}}{r}\frac{d!}{r_{1}!(r-r_{1})!(d-r)!}\Big{|}\big{(}\hat{x}^{N}_{i,m,l}\big{)}^{r_{1}}\big{(}\sum_{\mathcal{M}\setminus\{m\}}\hat{x}^{N}_{i,m,l}\big{)}^{r-r_{1}}\big{(}\sum_{\mathcal{M}}\sum_{l^{\prime}\geq l+1}\hat{x}^{N}_{i,m,l^{\prime}}\big{)}^{d-r}
\displaystyle\hskip 99.58464pt-\big{(}x^{\prime N}_{k,m,l}\big{)}^{r_{1}}\big{(}\sum_{\mathcal{M}\setminus\{m\}}x^{\prime N}_{k,m,l}\big{)}^{r-r_{1}}\big{(}\sum_{\mathcal{M}}\sum_{l^{\prime}\geq l+1}x^{\prime N}_{k,m,l^{\prime}}\big{)}^{d-r}\Big{|}
\displaystyle\leq\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}\sum_{r=1}^{d}\sum_{r_{1}=1}^{r}\frac{r_{1}}{r}\frac{d!r_{1}(r-r_{1})(d-r)}{r_{1}!(r-r_{1})!(d-r)!}\Big{|}\big{(}\hat{x}^{N}_{i,m,l}-x^{\prime N}_{k,m,l}\big{)}
\displaystyle\hskip 99.58464pt\big{(}\sum_{\mathcal{M}\setminus\{m\}}\hat{x}^{N}_{i,m,l}-\sum_{\mathcal{M}\setminus\{m\}}x^{\prime N}_{k,m,l}\big{)}\big{(}\sum_{\mathcal{M}}\sum_{l^{\prime}\geq l+1}\hat{x}^{N}_{i,m,l^{\prime}}-\sum_{\mathcal{M}}\sum_{l^{\prime}\geq l+1}x^{\prime N}_{k,m,l^{\prime}}\big{)}\Big{|}
\displaystyle\leq\sum_{m\in\mathcal{M}}\sum_{l\in\mathbb{N}_{0}}\sum_{r=1}^{d}\sum_{r_{1}=1}^{r}\frac{r_{1}}{r}\frac{d!r_{1}(r-r_{1})(d-r)}{r_{1}!(r-r_{1})!(d-r)!}\Big{|}\big{(}\hat{x}^{N}_{i,m,l}-x^{\prime N}_{k,m,l}\big{)}\Big{|}
\displaystyle=\sum_{r=1}^{d}\sum_{r_{1}=1}^{r}\frac{r_{1}}{r}\frac{d!r_{1}(r-r_{1})(d-r)}{r_{1}!(r-r_{1})!(d-r)!}\left\lVert\hat{\mathbf{x}}^{N}_{i}-\mathbf{x}^{\prime N}_{k}\right\rVert_{1}. (E.5)

Let $L=2\sum_{r=1}^{d}\sum_{r_{1}=1}^{r}\frac{r_{1}}{r}\frac{d!r_{1}(r-r_{1})(d-r)}{r_{1}!(r-r_{1})!(d-r)!}$. By (E.3), (E.4) and (E.5), we have that for large enough $N$,

\mathbb{P}(\textit{Mismatch})\leq L\left\lVert\hat{\mathbf{x}}^{N}_{i}-\mathbf{x}^{\prime N}_{k}\right\rVert_{1}. (E.6)
∎

    Appendix F Doubly Exponential Decay

    Proof of Proposition 5.1.

Since $\mathbf{q}$ is a fixed point of (3.8), we have

u_{m}(q_{m,l}-q_{m,l+1})=\lambda\xi(q_{m,l-1}-q_{m,l})\sum_{k\in\mathcal{K}}\frac{p_{k,m}w_{k}}{\delta_{k}}\frac{(\tilde{q}_{k,l-1})^{d}-(\tilde{q}_{k,l})^{d}}{\tilde{q}_{k,l-1}-\tilde{q}_{k,l}}.

Multiplying both sides by $v_{m}$ and summing over $m\in\mathcal{M}$ gives

\sum_{m\in\mathcal{M}}v_{m}u_{m}(q_{m,l}-q_{m,l+1})=\lambda\xi\sum_{k\in\mathcal{K}}w_{k}\big{(}(\tilde{q}_{k,l-1})^{d}-(\tilde{q}_{k,l})^{d}\big{)}. (F.1)

Also, since $q_{m,l}\xrightarrow{l\rightarrow\infty}0$ for all $m\in\mathcal{M}$, summing (F.1) over $l\geq\ell$ for any $\ell\geq 1$ gives

\sum_{m\in\mathcal{M}}v_{m}u_{m}q_{m,\ell}=\lambda\xi\sum_{k\in\mathcal{K}}w_{k}\big{(}\tilde{q}_{k,\ell-1}\big{)}^{d}. (F.2)

From (F.2) and $\sum_{k\in\mathcal{K}}w_{k}=1$, we have

\sum_{m\in\mathcal{M}}v_{m}u_{m}q_{m,\ell}\leq\lambda\xi(\tilde{q}_{\ell-1}^{*})^{d},

where $\tilde{q}_{\ell-1}^{*}=\max_{k\in\mathcal{K}}\tilde{q}_{k,\ell-1}$. Hence, for all $m\in\mathcal{M}$,

q_{m,\ell}\leq\frac{\lambda\xi}{v_{m}u_{m}}(\tilde{q}_{\ell-1}^{*})^{d}\leq c^{*}(\ell-1)\tilde{q}_{\ell-1}^{*},

where $c^{*}(\ell-1)=(\tilde{q}_{\ell-1}^{*})^{d-1}\max_{m\in\mathcal{M}}\lambda\xi/(v_{m}u_{m})$. Since we assume that $q_{m,\ell}\xrightarrow{\ell\rightarrow\infty}0$ for all $m\in\mathcal{M}$, we can choose $\ell$ large enough that $c^{*}(\ell-1)<1$. By definition, for each $k\in\mathcal{K}$,

\tilde{q}_{k,\ell}=\sum_{m\in\mathcal{M}}\frac{v_{m}p_{k,m}}{\delta_{k}}q_{m,\ell}\leq c^{*}(\ell-1)\tilde{q}^{*}_{\ell-1},

which implies that $\tilde{q}_{\ell}^{*}\leq c^{*}(\ell-1)\tilde{q}^{*}_{\ell-1}$ and

q_{m,\ell+1}\leq\frac{\lambda\xi}{v_{m}u_{m}}(\tilde{q}^{*}_{\ell})^{d}\leq(c^{*}(\ell-1)\tilde{q}_{\ell-1}^{*})^{d}\max_{m\in\mathcal{M}}\lambda\xi/(v_{m}u_{m})=(c^{*}(\ell-1))^{d+1}\tilde{q}_{\ell-1}^{*}.

By induction, we obtain that for $n\in\mathbb{N}_{0}$,

q_{m,\ell+n}\leq(c^{*}(\ell-1))^{e(n)}\tilde{q}^{*}_{\ell-1}\leq(c^{*}(\ell-1))^{d^{n}}\tilde{q}^{*}_{\ell-1}, (F.3)

where $e(n)=\sum_{i=0}^{n}d^{i}$. Thus, (F.3) implies that $\{q_{m,l},l\in\mathbb{N}_{0}\}$ decreases doubly exponentially. ∎
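
The following short sketch (illustrative only, with hypothetical constants) iterates the scalar recursion suggested by the argument above, $q_{\ell+1}=c\,q_{\ell}^{d}$ with $c<1$, to visualize the doubly exponential decay asserted in (F.3).

```python
# Illustrative sketch only: iterate q_{l+1} = c * q_l**d with assumed constants
# to see the doubly exponential decay of the fixed-point tail in (F.3).
c, d = 0.9, 2      # assumed contraction constant c < 1 and d = 2 choices
q = 0.5            # assumed starting tail value
for n in range(7):
    print(n, q)    # after n steps, q is of order c**((d**n - 1)/(d - 1)) * 0.5**(d**n)
    q = c * q ** d
```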

    Remark F.1.

Recall that $\tilde{q}_{k,l}=\sum_{m\in\mathcal{M}}\frac{v_{m}p_{k,m}}{\delta_{k}}q_{m,l}$. From Proposition 5.1, we know that $\{\tilde{q}_{k,l},l\in\mathbb{N}_{0}\}$ decreases doubly exponentially. In fact, the decay cannot be faster than doubly exponential. To see this, let $c_{0}=\min_{k\in\mathcal{K}}\min_{m\in\mathcal{M}}\frac{p_{k,m}}{\delta_{k}}\in(0,1]$. Then $\tilde{q}_{k,l}=\sum_{m=1}^{M}\frac{v_{m}p_{k,m}}{\delta_{k}}q_{m,l}\geq c_{0}\sum_{m\in\mathcal{M}}v_{m}q_{m,l}$. It then follows from (F.2) that

\min_{k\in\mathcal{K}}\tilde{q}_{k,\ell}\geq c_{0}\sum_{m\in\mathcal{M}}v_{m}q_{m,\ell}=\lambda c_{0}\sum_{k\in\mathcal{K}}w_{k}\big{(}\tilde{q}_{k,\ell-1}\big{)}^{d}\geq\lambda c_{0}(\min_{k\in\mathcal{K}}\tilde{q}_{k,\ell-1})^{d}.

    So

(\lambda c_{0})^{\frac{1}{d-1}}\min_{k\in\mathcal{K}}\tilde{q}_{k,\ell}\geq((\lambda c_{0})^{\frac{1}{d-1}}\min_{k\in\mathcal{K}}\tilde{q}_{k,\ell-1})^{d}\geq\dotsb\geq((\lambda c_{0})^{\frac{1}{d-1}}\min_{k\in\mathcal{K}}\tilde{q}_{k,0})^{d^{\ell}},

and hence $\min_{k\in\mathcal{K}}\tilde{q}_{k,\ell}\geq(\lambda c_{0})^{\frac{d^{\ell}-1}{d-1}}$.

    Appendix G Proof of Lemma 5.2

    Proof of Lemma 5.2.

Fix any $(\alpha_{1},\dots,\alpha_{M})\in(0,1)^{M}$ with $\sum_{m\in\mathcal{M}}\alpha_{m}>0$. Consider any sequence $\{U^{N}\}_{N}$ of subsets with $U^{N}\subseteq V^{N}$ and $\lim_{N\rightarrow\infty}\frac{|U^{N}\cap V^{N}_{m}|}{|V^{N}_{m}|}=\alpha_{m}$ for all $m\in\mathcal{M}$. By Condition 2.1, we have that for all $k\in\mathcal{K}$ and $m\in\mathcal{M}$,

\lim_{N\rightarrow\infty}\frac{|E^{N}_{k}(U^{N}\cap V^{N}_{m})|}{|E^{N}_{k}(V^{N}_{m})|}=\alpha_{m}v_{m}. (G.1)

Fix any $\varepsilon>0$, to be chosen later. Let $\mathcal{G}^{N}_{k,\varepsilon}=\Big{\{}i\in W^{N}_{k}:\Big{|}\frac{|\mathcal{N}^{N}_{w}(i)\cap U^{N}|}{|\mathcal{N}^{N}_{w}(i)|}-\frac{|E^{N}_{k}(U^{N})|}{|E^{N}_{k}(V^{N})|}\Big{|}<\varepsilon\Big{\}}$ and $\mathcal{B}^{N}_{k,\varepsilon}=W^{N}_{k}\setminus\mathcal{G}^{N}_{k,\varepsilon}$. By (G.1), for all large enough $N$ and $i\in\mathcal{G}^{N}_{k,\varepsilon}$,

N(1-2\varepsilon)\sum_{m\in\mathcal{M}}\alpha_{m}v_{m}p_{k,m}\leq|\mathcal{N}^{N}_{w}(i)\cap U^{N}|\leq N(1+2\varepsilon)\sum_{m\in\mathcal{M}}\alpha_{m}v_{m}p_{k,m}. (G.2)

Also, by Condition 2.1, for all large enough $N$,

N\delta_{k}(1-\varepsilon)\leq\delta^{N}_{i}\leq N\delta_{k}(1+\varepsilon). (G.3)

Since the sequence $\{G^{N}\}_{N}$ lies in the subcritical regime, for large enough $N$,

\begin{split}\rho\geq\rho^{N}\geq&\Big{(}\sum_{j\in U^{N}}\sum_{m\in\mathcal{M}^{\prime}}\mathds{1}_{(j\in V^{N}_{m})}u_{m}\Big{)}^{-1}\sum_{i\in W^{N}}\sum_{\begin{subarray}{c}S\subseteq(U^{N}\cap\mathcal{N}^{N}_{w}(i)):\\ |S|=d\end{subarray}}\frac{\lambda}{{|\mathcal{N}^{N}_{w}(i)|\choose d}}\\ \geq&c^{\prime}(N)\Big{(}\sum_{m\in\mathcal{M}^{\prime}}Nv_{m}\alpha_{m}u_{m}\Big{)}^{-1}\sum_{k\in\mathcal{K}}\sum_{i\in W^{N}_{k}}\frac{\lambda{|U^{N}\cap\mathcal{N}^{N}_{w}(i)|\choose d}}{{|\mathcal{N}^{N}_{w}(i)|\choose d}}\\ \geq&\Big{(}\sum_{m\in\mathcal{M}^{\prime}}Nv_{m}\alpha_{m}u_{m}\Big{)}^{-1}\sum_{k\in\mathcal{K}}\frac{\lambda|\mathcal{G}^{N}_{k,\varepsilon}|{N(1-2\varepsilon)\sum_{m\in\mathcal{M}}\alpha_{m}v_{m}p_{k,m}\choose d}}{{N\delta_{k}(1+\varepsilon)\choose d}},\end{split} (G.4)

where $c^{\prime}(N)$ is a constant depending only on $N$ with $c^{\prime}(N)\xrightarrow{N\rightarrow\infty}1$. Since the sequence $\{G^{N}\}$ is proportionally sparse, $\lim_{N\rightarrow\infty}\frac{|\mathcal{G}^{N}_{k,\varepsilon}|}{|W^{N}_{k}|}=1$. Letting $N\rightarrow\infty$ in (G.4), we obtain

\rho\geq\Big{(}\sum_{m\in\mathcal{M}}v_{m}\alpha_{m}u_{m}\Big{)}^{-1}\lambda\xi\sum_{k\in\mathcal{K}}w_{k}\Big{(}\frac{(1-2\varepsilon)\sum_{m\in\mathcal{M}}\alpha_{m}v_{m}p_{k,m}}{\delta_{k}(1+\varepsilon)}\Big{)}^{d}. (G.5)

Since (G.5) holds for all $\varepsilon>0$, letting $\varepsilon\downarrow 0$ gives

\rho\geq\Big{(}\sum_{m\in\mathcal{M}}v_{m}\alpha_{m}u_{m}\Big{)}^{-1}\lambda\xi\sum_{k\in\mathcal{K}}w_{k}\Big{(}\frac{\sum_{m\in\mathcal{M}}\alpha_{m}v_{m}p_{k,m}}{\delta_{k}}\Big{)}^{d}. (G.6)
∎
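
As a sanity check of Lemma 5.2, the following small sketch (illustrative only; all parameter values are hypothetical, and the bound is taken in the form (G.6) above) evaluates the right-hand side of (G.6) over a grid of $(\alpha_{1},\dots,\alpha_{M})$ and reports the largest resulting lower bound on $\rho$.

```python
# Illustrative sketch only: evaluate the RHS of (G.6) for hypothetical two-type
# parameters and scan alpha over a grid.  All values below are assumptions.
import itertools

lam, xi, d = 0.9, 1.0, 2                      # assumed lambda, xi, and d
v = [0.5, 0.5]                                # server-type fractions (assumed)
u = [2.0, 1.0]                                # service rates (assumed)
w = [0.5, 0.5]                                # dispatcher-type fractions (assumed)
p = [[1.0, 0.5], [0.5, 1.0]]                  # p[k][m], compatibility probabilities (assumed)
M, K = 2, 2
delta = [sum(v[m] * p[k][m] for m in range(M)) for k in range(K)]

def rhs(alpha):
    # right-hand side of (G.6) for a given alpha = (alpha_1, ..., alpha_M)
    denom = sum(v[m] * alpha[m] * u[m] for m in range(M))
    s = sum(w[k] * (sum(alpha[m] * v[m] * p[k][m] for m in range(M)) / delta[k]) ** d
            for k in range(K))
    return lam * xi * s / denom

grid = [i / 10 for i in range(1, 11)]         # alpha components in {0.1, ..., 1.0}
best = max(rhs(a) for a in itertools.product(grid, repeat=M))
print("largest lower bound on rho over the grid:", round(best, 4))
```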

    Appendix H Proof of Lemma 5.4

    Proof of Lemma 5.4.

Given the system state $X^{N}$, when a task arrives at the system, by the Poisson thinning property, the probability that the task is assigned to a server in the set $Q^{N}_{m,l}(X^{N})$ is

\mathbb{P}(\mathcal{E}(Q^{N}_{m,l}))=\frac{1}{W(N)}\sum_{i\in W^{N}}\sum_{\begin{subarray}{c}U\subseteq(Q^{N}_{m,l}\cap\mathcal{N}^{N}_{w}(i))\\ |U|=d\end{subarray}}\frac{1}{{|\mathcal{N}^{N}_{w}(i)|\choose d}} (H.1)

where $\mathcal{E}(Q^{N}_{m,l})$ denotes the event that the new task is assigned to a server in $Q^{N}_{m,l}(X^{N})$. Fix any $\varepsilon>0$. Since the sequence $\{G^{N}\}$ is subcritical, for large enough $N$ we have that

\mathbb{P}(\mathcal{E}(Q^{N}_{m,l}))\leq\frac{N}{W(N)}\frac{\rho^{N}}{\lambda}\frac{|Q^{N}_{m,l}(X^{N})|u_{m}}{N}\leq\frac{\rho}{\lambda\xi}q^{N}_{m,l}u_{m}(1+\varepsilon). (H.2)

We consider the system state at event times $t_{0}=0<t_{1}<t_{2}<\dots<t_{i}<\dots$; each $t_{i}$ is either an arrival epoch or a potential departure epoch. Define the drift $\Delta L^{N}_{m,\ell}(X^{N})$ as

\Delta L^{N}_{m,\ell}(X^{N})=\mathbb{E}\Big{(}L^{N}_{m,\ell}(X^{N}(t_{1}))-L^{N}_{m,\ell}(X^{N})|X^{N}(t_{0})=X^{N}\Big{)}. (H.3)

Again, by the Poisson thinning property, we have that for all large $N$,

\displaystyle\Delta L^{N}_{m,\ell}(X^{N})=\sum_{i=\ell}^{\infty}\Big{(}\frac{\lambda W(N)}{\lambda W(N)+\sum_{m\in\mathcal{M}}|V^{N}_{m}|u_{m}}\mathbb{P}(\mathcal{E}(Q^{N}_{m,i-1}))-\frac{\sum_{m\in\mathcal{M}}|V^{N}_{m}|u_{m}}{\lambda W(N)+\sum_{m\in\mathcal{M}}|V^{N}_{m}|u_{m}}\frac{|Q^{N}_{m,i}|u_{m}}{\sum_{m\in\mathcal{M}}|V^{N}_{m}|u_{m}}\Big{)}
\displaystyle\leq\sum_{i=\ell}^{\infty}\Big{(}\frac{\rho q^{N}_{m,i-1}u_{m}(1+\varepsilon)}{\lambda\xi+\sum_{m\in\mathcal{M}}v_{m}u_{m}}-\frac{q^{N}_{m,i}u_{m}}{\lambda\xi+\sum_{m\in\mathcal{M}}v_{m}u_{m}}\Big{)}
\displaystyle=\frac{\rho q^{N}_{m,\ell-1}u_{m}(1+\varepsilon)}{\lambda\xi+\sum_{m\in\mathcal{M}}v_{m}u_{m}}-\frac{1-(1+\varepsilon)\rho}{\lambda\xi+\sum_{m\in\mathcal{M}}v_{m}u_{m}}\sum_{i=\ell}^{\infty}q^{N}_{m,i}u_{m} (H.4)

By the definition of the steady state, $\mathbb{E}\big{(}\Delta L^{N}_{m,\ell}(X^{N}(\infty))\big{)}=0$. Choosing $\varepsilon$ such that $(1+\varepsilon)\rho\leq(1+\rho)/2<1$, we have

\sum_{i=\ell}^{\infty}\mathbb{E}(q^{N}_{m,i}(\infty))\leq\frac{(1+\rho)/2}{1-(1+\rho)/2}\mathbb{E}(q^{N}_{m,\ell-1}(\infty)). (H.5)

Finally, summing over $m\in\mathcal{M}$, we get the desired result. ∎
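
As an aside, not needed for the lemma itself, (H.5) self-improves to a geometric tail bound: writing $S_{\ell}=\sum_{i\geq\ell}\mathbb{E}(q^{N}_{m,i}(\infty))$, so that $\mathbb{E}(q^{N}_{m,\ell-1}(\infty))=S_{\ell-1}-S_{\ell}$, inequality (H.5) rearranges to

S_{\ell}\leq\frac{(1+\rho)/2}{1-(1+\rho)/2}\big{(}S_{\ell-1}-S_{\ell}\big{)}\quad\Longrightarrow\quad S_{\ell}\leq\frac{1+\rho}{2}S_{\ell-1}\leq\Big{(}\frac{1+\rho}{2}\Big{)}^{\ell-1}S_{1},

where $S_{1}<\infty$ by (H.5) applied with $\ell=1$.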

    Appendix I Proof for the Sequence of Random Graphs

    Proof of Proposition 3.16.

We first show that the sequence $\{G^{N}\}_{N}$ satisfies Condition 2.1. Consider any fixed $k\in\mathcal{K}$ and $m\in\mathcal{M}$. Let $e_{i,j}$ be a Bernoulli random variable with success probability $p_{k,m}$, independently for each $i\in W^{N}_{k}$ and $j\in V^{N}_{m}$. Then $E^{N}(k,m)=\sum_{(i,j)\in W^{N}_{k}\times V^{N}_{m}}e_{i,j}$, and by the law of large numbers, we have that

\lim_{N\rightarrow\infty}\frac{E^{N}(k,m)}{|W^{N}_{k}|\times|V^{N}_{m}|}=p_{k,m},

which implies that Condition 2.1 $(a)$ holds. Next, we prove that Condition 2.1 $(b)$ holds. By the definition of $\deg_{w}^{N}(i)$, we have $\deg_{w}^{N}(i)=\sum_{j\in V^{N}_{m}}e_{i,j}$, which is a $\mathrm{Binomial}(|V^{N}_{m}|,p_{k,m})$ random variable. By the Chernoff bound ([6, Theorem 2.4]), it follows that for $i\in W^{N}_{k}$,

\mathbb{P}\Big{(}\big{|}\deg^{N}_{w}(i)-\mathbb{E}(\deg^{N}_{w}(i))\big{|}\geq x\Big{)}\leq 2\exp\Big{(}-\frac{x^{2}}{2\mathbb{E}(\deg^{N}_{w}(i))+2x/3}\Big{)}.

Let $X(N)=p_{k,m}N^{3/4}\big{(}\ln(N)\big{)}^{1/4}$. Then, for some $c_{1}\in(0,\infty)$,

\mathbb{P}\Big{(}\big{|}\deg^{N}_{w}(i)-|V^{N}_{m}|p_{k,m}\big{|}\geq X(N)\Big{)}\leq c_{1}\exp\Big{(}-c_{1}p_{k,m}N^{1/2}(\ln(N))^{1/2}/v_{m}\Big{)}, (I.1)

for sufficiently large $N$. Also, by $\lim_{N\rightarrow\infty}\frac{|W^{N}_{k}|}{W(N)}=w_{k}$, $\lim_{N\rightarrow\infty}\frac{W(N)}{N}=\xi$, and the union bound, there exists $c_{2}\in(0,\infty)$ such that for large enough $N$,

\mathbb{P}\Big{(}\cup_{i\in W^{N}_{k}}\Big{\{}\big{|}\deg^{N}_{w}(i)-|V^{N}_{m}|p_{k,m}\big{|}\geq X(N)\Big{\}}\Big{)}\leq c_{2}w_{k}\xi N\exp\Big{(}-c_{1}p_{k,m}N^{1/2}(\ln(N))^{1/2}/v_{m}\Big{)}. (I.2)

The RHS of (I.2) is summable over $N$. Hence, by the Borel--Cantelli lemma, almost surely, for all large enough $N$,

\big{|}\deg^{N}_{w}(i)-|V^{N}_{m}|p_{k,m}\big{|}\leq X(N),\quad i\in W^{N}_{k},

which implies that

1\leq\lim_{N\rightarrow\infty}\frac{\max_{i\in W^{N}_{k}}\deg^{N}_{w}(i)}{\min_{i\in W^{N}_{k}}\deg^{N}_{w}(i)}\leq\lim_{N\rightarrow\infty}\frac{|V^{N}_{m}|p_{k,m}+X(N)}{|V^{N}_{m}|p_{k,m}-X(N)}=1,\quad\text{a.s.}

Thus, Condition 2.1 $(b)$ holds.
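
The degree concentration used above can also be seen numerically; the following sketch (illustrative only, with assumed parameter values) samples dispatcher degrees in a random bipartite graph of this type and shows the max/min degree ratio within a dispatcher type approaching $1$ as the number of servers grows.

```python
# Illustrative simulation only: dispatcher degrees concentrate around
# |V_m| * p_{k,m}, so the max/min degree ratio within a type tends to 1.
# The parameter values are assumptions made for this sketch.
import random

random.seed(0)
p_km = 0.3                          # assumed connection probability for one (k, m) pair
num_dispatchers = 50                # assumed number of type-k dispatchers
for n_servers in (200, 2000, 20000):
    degrees = [sum(random.random() < p_km for _ in range(n_servers))
               for _ in range(num_dispatchers)]
    print(n_servers, round(max(degrees) / min(degrees), 3))
```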

Now, we show that the sequence $\{G^{N}\}_{N}$ is clustered proportionally sparse. Fix any $k\in\mathcal{K}$, $i\in W^{N}_{k}$, $\varepsilon>0$, and $U\subseteq V^{N}$. Let $B_{i}(U)$ be the event that dispatcher $i$ is bad with respect to the set $U$, i.e.,

B_{i}(U)\coloneqq\Big{\{}\Big{|}\frac{|\mathcal{N}^{N}_{w}(i)\cap U|}{|\mathcal{N}^{N}_{w}(i)|}-\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}\Big{|}\geq\varepsilon\Big{\}}. (I.3)

Define $\alpha_{m}\coloneqq\frac{|U\cap V^{N}_{m}|}{|V^{N}|}$ for each $m\in\mathcal{M}$. By the union bound, we have that

\begin{split}\mathbb{P}\big{(}B_{i}(U)\big{)}&\leq\mathbb{P}\Big{(}B_{i}(U),\ \Big{|}|\mathcal{N}^{N}_{w}(i)\cap U|-\sum_{m\in\mathcal{M}}|V^{N}_{m}\cap U|p_{k,m}\Big{|}<\varepsilon_{1}\sum_{m\in\mathcal{M}}|V^{N}_{m}|p_{k,m},\\ &\qquad\Big{|}\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}-\frac{\sum_{m\in\mathcal{M}}\alpha_{m}p_{k,m}}{\sum_{m\in\mathcal{M}}v_{m}p_{k,m}}\Big{|}<\varepsilon_{2},\text{ and }\Big{|}|\mathcal{N}^{N}_{w}(i)|-\sum_{m\in\mathcal{M}}|V^{N}_{m}|p_{k,m}\Big{|}<\varepsilon_{3}\sum_{m\in\mathcal{M}}|V^{N}_{m}|p_{k,m}\Big{)}\\ &\quad+\mathbb{P}\Big{(}\Big{|}|\mathcal{N}^{N}_{w}(i)\cap U|-\sum_{m\in\mathcal{M}}|V^{N}_{m}\cap U|p_{k,m}\Big{|}\geq\varepsilon_{1}\sum_{m\in\mathcal{M}}|V^{N}_{m}|p_{k,m}\Big{)}\\ &\quad+\mathbb{P}\Big{(}\Big{|}|\mathcal{N}^{N}_{w}(i)|-\sum_{m\in\mathcal{M}}|V^{N}_{m}|p_{k,m}\Big{|}\geq\varepsilon_{3}\sum_{m\in\mathcal{M}}|V^{N}_{m}|p_{k,m}\Big{)}.\end{split} (I.4)

We now bound each term on the RHS of (I.4). Choose $\varepsilon_{1}$, $\varepsilon_{2}$ and $\varepsilon_{3}$ satisfying

\frac{\varepsilon_{3}\sum_{m\in\mathcal{M}}\alpha_{m}p_{k,m}+\varepsilon_{1}\sum_{m\in\mathcal{M}}v_{m}p_{k,m}}{(1-\varepsilon_{3})\sum_{m\in\mathcal{M}}v_{m}p_{k,m}}+\varepsilon_{2}<\varepsilon, (I.5)

so that, on the intersection of the three events in the first term of (I.4), we have

\begin{split}\frac{|\mathcal{N}^{N}_{w}(i)\cap U|}{|\mathcal{N}^{N}_{w}(i)|}-\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}&=\frac{|\mathcal{N}^{N}_{w}(i)\cap U|}{|\mathcal{N}^{N}_{w}(i)|}-\frac{\sum_{m\in\mathcal{M}}\alpha_{m}p_{k,m}}{\sum_{m\in\mathcal{M}}v_{m}p_{k,m}}+\frac{\sum_{m\in\mathcal{M}}\alpha_{m}p_{k,m}}{\sum_{m\in\mathcal{M}}v_{m}p_{k,m}}-\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}\\ <&\frac{\varepsilon_{3}\sum_{m\in\mathcal{M}}\alpha_{m}p_{k,m}+\varepsilon_{1}\sum_{m\in\mathcal{M}}v_{m}p_{k,m}}{(1-\varepsilon_{3})\sum_{m\in\mathcal{M}}v_{m}p_{k,m}}+\varepsilon_{2}<\varepsilon,\end{split} (I.6)

    and

\begin{split}\frac{|\mathcal{N}^{N}_{w}(i)\cap U|}{|\mathcal{N}^{N}_{w}(i)|}-\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}&=\frac{|\mathcal{N}^{N}_{w}(i)\cap U|}{|\mathcal{N}^{N}_{w}(i)|}-\frac{\sum_{m\in\mathcal{M}}\alpha_{m}p_{k,m}}{\sum_{m\in\mathcal{M}}v_{m}p_{k,m}}+\frac{\sum_{m\in\mathcal{M}}\alpha_{m}p_{k,m}}{\sum_{m\in\mathcal{M}}v_{m}p_{k,m}}-\frac{|E^{N}_{k}(U)|}{|E^{N}_{k}(V^{N})|}\\ >&-\frac{\varepsilon_{3}\sum_{m\in\mathcal{M}}\alpha_{m}p_{k,m}+\varepsilon_{1}\sum_{m\in\mathcal{M}}v_{m}p_{k,m}}{(1+\varepsilon_{3})\sum_{m\in\mathcal{M}}v_{m}p_{k,m}}-\varepsilon_{2}>-\varepsilon,\end{split} (I.7)

which implies that the first term on the RHS of (I.4) equals $0$ for this choice of $\varepsilon_{1}$, $\varepsilon_{2}$ and $\varepsilon_{3}$. Using the Chernoff bound again, we can bound the second and the third terms as follows: for some $c_{3}\in(0,\infty)$ and large enough $N$,

\mathbb{P}\Big{(}\Big{|}|\mathcal{N}^{N}_{w}(i)\cap U|-\sum_{m\in\mathcal{M}}|V^{N}_{m}\cap U|p_{k,m}\Big{|}\geq\varepsilon_{1}\sum_{m\in\mathcal{M}}|V^{N}_{m}|p_{k,m}\Big{)}\leq c_{3}\exp\Big{(}-c_{3}N\sum_{m\in\mathcal{M}}v_{m}p_{k,m}\Big{)}, (I.8)

    and

\mathbb{P}\Big{(}\Big{|}|\mathcal{N}^{N}_{w}(i)|-\sum_{m\in\mathcal{M}}|V^{N}_{m}|p_{k,m}\Big{|}\geq\varepsilon_{3}\sum_{m\in\mathcal{M}}|V^{N}_{m}|p_{k,m}\Big{)}\leq c_{3}\exp\Big{(}-c_{3}N\sum_{m\in\mathcal{M}}v_{m}p_{k,m}\Big{)}. (I.9)

Therefore, for large enough $N$, we have

\mathbb{P}(B_{i}(U))\leq 2c_{3}\exp\Big{(}-c_{3}N\sum_{m\in\mathcal{M}}v_{m}p_{k,m}\Big{)}, (I.10)

    and

\mathbb{P}(\cup_{i\in W^{N}_{k}}B_{i}(U))\leq 2c_{3}|W^{N}_{k}|\exp\Big{(}-c_{3}N\sum_{m\in\mathcal{M}}v_{m}p_{k,m}\Big{)}. (I.11)

Moreover, for some $c_{4}\in(0,\infty)$ and large enough $N$,

\mathbb{P}\big{(}\cup_{U\subseteq V^{N}}\cup_{i\in W^{N}_{k}}B_{i}(U)\big{)}\leq\exp(-c_{4}N). (I.12)

The RHS of (I.12) is summable over $N$ and the set $\mathcal{K}$ is finite, so by the Borel--Cantelli lemma, the sequence is clustered proportionally sparse.

Finally, if $\mathbf{p}$ satisfies (3.2), then by Lemma 3.4 there exists $N_{0}\in\mathbb{N}_{0}$ such that for all $N\geq N_{0}$, the queue length process $\big{(}X_{j}^{N}(t)\big{)}_{j\in V^{N}}$ under the local JSQ($d$) policy is ergodic, which implies that all assumptions of Theorem 3.13 hold. ∎