
A Mean-Field Team Approach to Minimize the Spread of Infection in a Network

Jalal Arabneydi and Amir G. Aghdam This work has been supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC) under Grant RGPIN-262127-17, and in part by Concordia University under Horizon Postdoctoral Fellowship.Jalal Arabneydi and Amir G. Aghdam are with the Department of Electrical and Computer Engineering, Concordia University, 1455 de Maisonneuve Blvd. West, Montreal, QC, Canada, Postal Code: H3G 1M8. Email: jalal.arabneydi@mail.mcgill.ca, aghdam@ece.concordia.ca
Abstract

In this paper, a stochastic dynamic control strategy is presented to prevent the spread of an infection over a homogeneous network. The infectious process is persistent, i.e., it continues to contaminate the network once it is established. It is assumed that there is a finite set of network management options available, such as the degrees of nodes and promotional plans, to minimize the number of infected nodes while taking the implementation cost into account. The network is modeled by an exchangeable controlled Markov chain whose transition probability matrices depend on three parameters: the selected network management option, the state of the infectious process, and the empirical distribution of infected nodes (where the dependence is not necessarily linear). Borrowing some techniques from mean-field team theory, the optimal strategy is obtained for any finite number of nodes using dynamic programming decomposition and the convolution of some binomial probability mass functions. For infinite-population networks, the optimal solution is described by a Bellman equation. It is shown that the infinite-population strategy is a meaningful sub-optimal solution for finite-population networks if a certain condition holds. The theoretical results are verified by an example of rumor control in social networks.

Proceedings of American Control Conference, 2019.

I Introduction

Networks are ubiquitous in today’s world, connecting people and organizations in various ways to improve the quality of day-to-day life in terms of, for example, health services [1], consumer demand [2], energy management [3], and social activities [4], to name only a few. There has been a growing interest in the literature recently on network analysis, and in particular, on enhancing network reliability and security [5, 6]. This problem has been in the spotlight ever since the dramatic influence of social media on public opinion was observed in a number of major events.

Controlling the spread of undesirable phenomena such as disease and misinformation over a network is an important problem for which different approaches have been proposed in the literature [7, 8]. The dynamics of an infection propagating in a network of $n$ nodes, where each node has a binary state (susceptible or infected), can be modeled by a Markov chain with a $2^{n}\times 2^{n}$ transition probability matrix. Since the computational complexity of such a model is exponential in $n$, mean-field theory has proved to be effective in approximating a large-scale dynamic network by an infinite-population one. For this purpose, the dynamics of the probability distribution of infected nodes can be described by a differential equation (called the diffusion equation) [9, 10].

In the analysis of infection spread, the main objective is to study the dynamics of the states of the nodes, especially after a sufficiently long time, in order to determine the rate of convergence to the steady state [11, 12, 13]. It is shown in [11] that if the ratio of the rate of spread of infection to the rate of cure is less than the inverse of the largest eigenvalue of the adjacency matrix, then the infinite-population network reaches an absorbing state in which all nodes are healthy, i.e., the infection is eventually cleared.

In the control of infection spread, on the other hand, the objective is to design the transition probabilities such that a prescribed performance index, which is a function of the implementation cost and the number of infected nodes, is minimized [14, 15, 16, 17, 18]. This problem is computationally difficult to solve, in general. However, in the special case when the diffusion equation and cost function have certain structures, the optimal strategy can be obtained analytically. For example, in [14, 15, 16] it is assumed that the network dynamics and cost are linear in the control action (i.e., the immunization or curing rates), which leads to a bang-bang control strategy. The interested reader is referred to [17, 18] for more details on optimal resource allocation methods.

This paper studies the optimal control of a network consisting of an arbitrary number of nodes that are influenced (coupled) by the empirical distribution of infected nodes, where such couplings are not necessarily linear. The infectious process is assumed to be persistent, in the sense that the infection does not disappear after the initial time. In contrast to the papers cited in the previous paragraph, which consider a continuous action set, in this paper it is assumed that there is a limited number of resources available, which means that the action set is finite. In addition, we address the practical question of when the solution of an infinite-population network provides a meaningful approximation for the finite-population one. Inspired by existing techniques for mean-field teams [19, 20, 21, 22, 23, 24], we first compute the optimal solution of a finite-population network for the case where the empirical distribution of infected nodes is observable. Next, we derive an infinite-population Bellman equation that requires no observation of infected nodes, and identify a stability condition under which the solution of the infinite-population network constitutes a near-optimal solution for the finite-population one.

The paper is structured as follows. In Section II the problem is formulated and the objectives are subsequently described. The optimal control strategies, as the main results of the paper, are derived on micro and macro scales in Section III. An illustrative example of a social network is presented in Section IV. The results are finally summarized in Section V.

II Problem Formulation

II-A Notational convention

Throughout this article, $\mathbb{R}$ denotes the set of real numbers and $\mathbb{N}$ the set of natural numbers. For any $n\in\mathbb{N}$, let $\mathbb{N}_{n}$ and $\mathcal{M}_{n}$ represent the finite sets $\{1,\ldots,n\}$ and $\{0,\frac{1}{n},\frac{2}{n},\ldots,1\}$, respectively, and let $x_{1:n}$ denote the vector $(x_{1},\ldots,x_{n})$. In addition, $\mathbb{E}[\cdot]$, $\mathbb{P}(\cdot)$ and $\mathbb{1}(\cdot)$ refer to the expectation, probability and indicator operators, respectively. For any $n\in\mathbb{N}$ and $p\in[0,1]$, $\operatorname{Bino}(\cdot,p,n)$ denotes the binomial probability mass function of $n$ Bernoulli trials with success probability $p$.

II-B Model

Consider a population of $n\in\mathbb{N}$ homogeneous users that are exposed to an infectious process (e.g., disease or fake news). Let $x^{i}_{t}\in\{S,I\}$ be the state of user $i\in\mathbb{N}_{n}$ at time $t\in\mathbb{N}$, where $S$ and $I$ stand for “susceptible” and “infected”, respectively. Denote by $m_{t}\in\mathcal{M}_{n}$ the empirical distribution of the infected users at time $t\in\mathbb{N}$, i.e., $m_{t}=\frac{1}{n}\sum_{i=1}^{n}\mathbb{1}(x^{i}_{t}=I)$.

II-B1 Resources

Let $\mathcal{U}$ denote the finite set of options available to the network manager (e.g., a company or a government). The objective of the network manager is to minimize the effect of the infectious process on the users by employing the available options effectively. For instance, one option may be the degree of the nodes; by varying the degree (i.e., the topology), the spread of an infection can be impeded. Alternatively, an option may be an action plan such as vaccination or health promotion, influencing the rates of infection and cure. Denote by $u_{t}\in\mathcal{U}$ the option taken by the network manager at time $t\in\mathbb{N}$.

II-B2 Infectious process

Let $z_{t}\in\mathcal{Z}$ be the state of the infectious process at time $t\in\mathbb{N}$, where $\mathcal{Z}$ is a finite set consisting of all possible states. Denote by $\mathbb{P}\big(z_{t+1}\mid z_{t},u_{t}\big)$ the transition probability according to which state $z_{t}\in\mathcal{Z}$ transits to state $z_{t+1}\in\mathcal{Z}$ under option $u_{t}\in\mathcal{U}$, $\forall t\in\mathbb{N}$. Note that the level of persistence of the infectious process is incorporated in the above transition probability matrix.

II-B3 Dynamics of users

Suppose that the state of user $i$ is susceptible, the state of the infectious process is $z_{t}$, option $u_{t}$ is chosen and the number of infected users is $nm_{t}$, $t\in\mathbb{N}$. Then, user $i$ becomes infected with the following probability:

$$\mathbb{P}\big(x^{i}_{t+1}=I\mid x^{i}_{t}=S,\,u_{t},m_{t},z_{t}\big):=f^{0}(u_{t},m_{t},z_{t}), \qquad (1)$$

where $f^{0}:\mathcal{U}\times\mathcal{M}_{n}\times\mathcal{Z}\rightarrow[0,1]$. In addition, when the state of user $i$ is infected, it changes to susceptible according to the following probability:

$$\mathbb{P}\big(x^{i}_{t+1}=S\mid x^{i}_{t}=I,\,u_{t},m_{t},z_{t}\big):=f^{1}(u_{t},m_{t},z_{t}), \qquad (2)$$

where $f^{1}:\mathcal{U}\times\mathcal{M}_{n}\times\mathcal{Z}\rightarrow[0,1]$. It is to be noted that the network topology is implicitly described in the transition probabilities (1) and (2).

II-B4 Per-step cost

Let $c(u,m,z)\in\mathbb{R}_{\geq 0}$ be the cost associated with implementing option $u\in\mathcal{U}$ when the empirical distribution of the infected users is $m\in\mathcal{M}_{n}$ and the state of the infectious process is $z\in\mathcal{Z}$. For practical purposes, the per-step cost function is considered to be an increasing function of the empirical distribution of the infected users, i.e., the more infection, the higher the cost.

At any time $t\in\mathbb{N}$, the network manager chooses its option according to the control law $g_{t}:(\mathcal{M}_{n}\times\mathcal{Z})^{t}\rightarrow\mathcal{U}$ as follows:

$$u_{t}=g_{t}(m_{1:t},z_{1:t}),\quad t\in\mathbb{N}. \qquad (3)$$

Note that $g:=\{g_{1},g_{2},\ldots\}$ is the strategy of the network manager.

II-C Problem statement

Assumption 1

The transition probabilities and cost function are time-homogeneous. In addition, the underlying primitive random variables of users as well as the infectious process are mutually independent in both space and time. Furthermore, the primitive random variables of users are identically distributed.

Given a discount factor $\beta\in(0,1)$, define the total expected discounted cost:

$$J_{n}(g)=\mathbb{E}^{g}\Big[\sum_{t=1}^{\infty}\beta^{t-1}c(u_{t},m_{t},z_{t})\Big], \qquad (4)$$

where the above cost depends on the choice of strategy $g$ and the number of users $n$.

Problem 1

Find an optimal strategy $g^{\ast}$ such that for any strategy $g$,

$$J^{\ast}_{n}:=J_{n}(g^{\ast})\leq J_{n}(g). \qquad (5)$$

Problem 2

Find a sub-optimal strategy $\tilde{g}:=\{\tilde{g}_{1},\tilde{g}_{2},\ldots\}$, $\tilde{g}_{t}:\mathcal{Z}^{t}\rightarrow\mathcal{U}$, $t\in\mathbb{N}$, whose performance converges to the optimal performance of the infinite-population network as the number of users increases, i.e.,

$$|J_{n}(\tilde{g})-J_{\infty}^{\ast}|\leq\varepsilon(n), \qquad (6)$$

where $\lim_{n\rightarrow\infty}\varepsilon(n)=0$.

III Theoretical results

Prior to solving Problems 1 and 2, it is necessary to understand the dynamics of the empirical distribution of the infected users, and more importantly, the way it evolves over time according to each option of the network manager and state of the infectious process. To this end, the following theorem is needed.

Theorem 1

Let Assumption 1 hold. Given any $m_{t}\in\mathcal{M}_{n}$, $u_{t}\in\mathcal{U}$ and $z_{t}\in\mathcal{Z}$, $t\in\mathbb{N}$, the transition probability of the empirical distribution of the infected users is characterized as follows:

$$\mathbb{P}\big(m_{t+1}\mid m_{t}=0,u_{t},z_{t}\big)=\operatorname{Bino}\big(nm_{t+1},f^{0}(u_{t},m_{t},z_{t}),n\big), \qquad (7)$$

$$\mathbb{P}\big(m_{t+1}\mid m_{t}=1,u_{t},z_{t}\big)=\operatorname{Bino}\big(nm_{t+1},1-f^{1}(u_{t},m_{t},z_{t}),n\big), \qquad (8)$$

$$\mathbb{P}\big(m_{t+1}\mid m_{t}\notin\{0,1\},u_{t},z_{t}\big)=\Big(\operatorname{Bino}\big(\cdot,f^{0}(u_{t},m_{t},z_{t}),n-nm_{t}\big)\ast\operatorname{Bino}\big(\cdot,1-f^{1}(u_{t},m_{t},z_{t}),nm_{t}\big)\Big)(nm_{t+1}). \qquad (9)$$

Proof

The proof proceeds in three steps. In the first step, suppose $m_{t}=0$, i.e., $x^{i}_{t}=S$, $\forall i\in\mathbb{N}_{n}$. In this case, $nm_{t+1}=\sum_{i=1}^{n}\mathbb{1}(x^{i}_{t+1}=I)$ is the sum of $n$ i.i.d. Bernoulli random variables with success probability $f^{0}(u_{t},m_{t},z_{t})$. In the second step, suppose $m_{t}=1$, i.e., $x^{i}_{t}=I$, $\forall i\in\mathbb{N}_{n}$. Therefore, $nm_{t+1}=\sum_{i=1}^{n}\mathbb{1}(x^{i}_{t+1}=I)$ is the sum of $n$ i.i.d. Bernoulli random variables with success probability $1-f^{1}(u_{t},m_{t},z_{t})$. In the last step, suppose that $m_{t}\notin\{0,1\}$. Then, $nm_{t+1}=\sum_{i=1}^{n}\mathbb{1}(x^{i}_{t+1}=I)$ is the sum of two independent random variables, where the first one is the sum of $n-nm_{t}$ i.i.d. Bernoulli random variables with success probability $f^{0}(u_{t},m_{t},z_{t})$ (the susceptible users that become infected), while the second one is the sum of $nm_{t}$ i.i.d. Bernoulli random variables with success probability $1-f^{1}(u_{t},m_{t},z_{t})$ (the infected users that remain infected). The proof is now complete, on noting that the probability mass function of the sum of two independent random variables is the convolution of their probability mass functions. $\hfill\blacksquare$
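As a small illustration of Theorem 1, the following Python sketch computes the probability vector of $nm_{t+1}$ by convolving the two binomial probability mass functions; the function name and the generic arguments `f0`, `f1` are illustrative placeholders for (1) and (2), not part of the original formulation.

```python
import numpy as np
from scipy.stats import binom

def empirical_transition(n, m, u, z, f0, f1):
    """P(n*m_{t+1} = k | m_t = m, u_t = u, z_t = z) for k = 0, ..., n (Theorem 1).

    f0(u, m, z): probability that a susceptible user becomes infected, eq. (1).
    f1(u, m, z): probability that an infected user is cured, eq. (2).
    Returns a length-(n + 1) probability vector over the number of infected users.
    """
    n_inf = int(round(n * m))          # current number of infected users
    p_inf = f0(u, m, z)                # infection probability of a susceptible user
    p_stay = 1.0 - f1(u, m, z)         # probability that an infected user stays infected

    # Newly infected among the n - n_inf susceptible users.
    pmf_new = binom.pmf(np.arange(n - n_inf + 1), n - n_inf, p_inf)
    # Still infected among the n_inf currently infected users.
    pmf_stay = binom.pmf(np.arange(n_inf + 1), n_inf, p_stay)

    # The two counts are independent, so the pmf of their sum is the convolution.
    return np.convolve(pmf_new, pmf_stay)   # length n + 1, indexed by n*m_{t+1}
```

For $m_{t}=0$ or $m_{t}=1$, one of the two factors degenerates to a point mass at zero, which recovers the special cases (7) and (8).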

Theorem 2

Let Assumption 1 hold. Problem 1 admits an optimal stationary strategy characterized by the following Bellman equation: for any $m\in\mathcal{M}_{n}$ and $z\in\mathcal{Z}$,

$$V(m,z)=\min_{u\in\mathcal{U}}\Big(c(u,m,z)+\beta\sum_{z^{+}\in\mathcal{Z}}\sum_{m^{+}\in\mathcal{M}_{n}}\mathbb{P}\big(z^{+}\mid z,u\big)\,\mathbb{P}\big(m^{+}\mid m,u,z\big)\,V(m^{+},z^{+})\Big). \qquad (11)$$

Let $g^{\ast}$ be a minimizer of the above Bellman equation; then the optimal action at time $t\in\mathbb{N}$ is given by $u^{\ast}_{t}=g^{\ast}(m_{t},z_{t})$.

Proof

From the proof of Theorem 1 and the fact that the infectious process evolves in a Markovian manner with a transition probability independent of the states of users, it follows that:

$$\mathbb{P}\big(m_{t+1},z_{t+1}\mid m_{1:t},z_{1:t},u_{1:t}\big)=\mathbb{P}\big(z_{t+1}\mid z_{t},u_{t}\big)\,\mathbb{P}\big(m_{t+1}\mid m_{t},u_{t},z_{t}\big), \qquad (12)$$

where the left-hand side of equation (12) does not depend on the control laws $g_{1:t}$. Hence, one can find the optimal solution of Problem 1 via the dynamic programming principle [25], which leads to the Bellman equation (11). $\hfill\blacksquare$

According to Theorem 2, the optimal strategy does not depend on the history of the empirical distribution of infected users and of the infectious process, i.e., it is sufficient to know their current values in order to optimally control the network.

Remark 1

The cardinality of the space of the Bellman equation (11), i.e., $\mathcal{M}_{n}$, is linear in the number of users $n$.
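To make the dynamic programming decomposition concrete, the following is a minimal value-iteration sketch for the Bellman equation (11); the kernel of Theorem 1 is precomputed inside the routine, the transition matrix of the infectious process is assumed to be available as an array `Pz[z, u, z_next]`, and all names are illustrative rather than part of the paper.

```python
import numpy as np
from scipy.stats import binom

def solve_bellman(n, Z, U, Pz, f0, f1, cost, beta, num_iter=500):
    """Value iteration for the finite-population Bellman equation (11).

    Pz[z, u, z_next] : transition matrix of the infectious process.
    cost(u, m, z)    : per-step cost c(u, m, z).
    Returns V[k, z] and a greedy strategy g[k, z], where k = 0, ..., n
    indexes the empirical distribution m = k / n in M_n.
    """
    M = np.arange(n + 1) / n
    # Precompute the kernel of Theorem 1: P[k, z, u, :] = P(n*m_{t+1} = . | m = k/n, u, z).
    P = np.zeros((n + 1, Z, U, n + 1))
    for k, m in enumerate(M):
        for z in range(Z):
            for u in range(U):
                new = binom.pmf(np.arange(n - k + 1), n - k, f0(u, m, z))
                stay = binom.pmf(np.arange(k + 1), k, 1.0 - f1(u, m, z))
                P[k, z, u] = np.convolve(new, stay)
    V = np.zeros((n + 1, Z))
    for _ in range(num_iter):
        Q = np.zeros((n + 1, Z, U))
        for k in range(n + 1):
            for z in range(Z):
                for u in range(U):
                    EV = sum(Pz[z, u, zp] * (P[k, z, u] @ V[:, zp]) for zp in range(Z))
                    Q[k, z, u] = cost(u, M[k], z) + beta * EV
        V = Q.min(axis=2)   # synchronous Bellman update over M_n x Z
    return V, Q.argmin(axis=2)
```

Consistent with Remark 1, the number of states swept per iteration, $(n+1)\,|\mathcal{Z}|$, grows linearly with the number of users.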

In the special case of $n=\infty$, the probability mass function of $m_{t+1}$ becomes a Dirac measure. In such a case, there is no loss of optimality in restricting attention to the deterministic controlled dynamics described below. More precisely, define

$$p_{t}:=\lim_{n\rightarrow\infty}\frac{1}{n}\sum_{i=1}^{n}\mathbb{1}\big(x^{i}_{t}=I\big). \qquad (13)$$

According to [20, Lemma 4], the following equality holds with probability one for any trajectory $z_{1:\infty}$:

$$p_{t+1}=(1-p_{t})f^{0}(u_{t},p_{t},z_{t})+p_{t}\big(1-f^{1}(u_{t},p_{t},z_{t})\big), \qquad (14)$$

with $p_{1}=\mathbb{E}\big[m_{1}\big]$. By incorporating the macro-scale (infinite-population) dynamics (14) into the Bellman equation (11), one arrives at the following Bellman equation:

$$\tilde{V}(p,z)=\min_{u\in\mathcal{U}}\Big(c(u,p,z)+\beta\sum_{\tilde{z}\in\mathcal{Z}}\mathbb{P}\big(\tilde{z}\mid z,u\big)\,\tilde{V}\big((1-p)f^{0}(u,p,z)+p(1-f^{1}(u,p,z)),\,\tilde{z}\big)\Big), \qquad (15)$$

for any $p\in[0,1]$ and $z\in\mathcal{Z}$. Let $\tilde{g}:[0,1]\times\mathcal{Z}\rightarrow\mathcal{U}$ be a minimizer of the right-hand side of equation (15), and define the following action at time $t$:

$$\tilde{u}_{t}:=\tilde{g}(p_{t},z_{t}). \qquad (16)$$

Notice that $p_{1:t+1}$ is a stochastic process adapted to the filtration generated by $z_{1:t}$ for any $t\in\mathbb{N}$, i.e.,

$$p_{t+1}=(1-p_{t})f^{0}\big(\tilde{g}(p_{t},z_{t}),p_{t},z_{t}\big)+p_{t}\Big(1-f^{1}\big(\tilde{g}(p_{t},z_{t}),p_{t},z_{t}\big)\Big). \qquad (17)$$
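As a small illustration of (14), (16) and (17), the sketch below propagates the deterministic infinite-population state $p_{t}$ along a given trajectory $z_{1:T}$ under a strategy $\tilde{g}$; the names are placeholders, and `g_tilde` is assumed to be obtained, for example, from a (quantized) solution of (15).

```python
def propagate_mean_field(p1, z_traj, g_tilde, f0, f1):
    """Roll the infinite-population dynamics (17) forward along a trajectory z_{1:T}.

    g_tilde(p, z): infinite-population strategy, a minimizer of (15).
    Returns the lists of fractions p_{1:T+1} and actions u_{1:T}.
    """
    p, ps, us = p1, [p1], []
    for z in z_traj:
        u = g_tilde(p, z)                                    # eq. (16)
        p = (1 - p) * f0(u, p, z) + p * (1 - f1(u, p, z))    # eq. (14)
        ps.append(p)
        us.append(u)
    return ps, us
```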

To establish the convergence result, the following assumptions are imposed on the model.

Assumption 2

There exist positive constants $k^{0},k^{1},k^{c}\in\mathbb{R}_{\geq 0}$ such that for any $u\in\mathcal{U}$, $z\in\mathcal{Z}$ and $m^{1},m^{2}\in[0,1]$,

$$|f^{0}(u,m^{1},z)-f^{0}(u,m^{2},z)|\leq k^{0}|m^{1}-m^{2}|, \qquad (18)$$
$$|f^{1}(u,m^{1},z)-f^{1}(u,m^{2},z)|\leq k^{1}|m^{1}-m^{2}|, \qquad (19)$$
$$|c(u,m^{1},z)-c(u,m^{2},z)|\leq k^{c}|m^{1}-m^{2}|. \qquad (20)$$

Assumption 3

The parameters introduced in Assumption 2 satisfy the inequality $\max(k^{0},k^{1})<\frac{1}{\beta}$.

Theorem 3

Let Assumptions 1, 2 and 3 hold. Then, $\tilde{g}$ is a sub-optimal strategy for Problem 2, i.e.,

$$|J_{n}(\tilde{g})-J_{\infty}^{\ast}|\leq\frac{k^{c}}{(1-\beta)\big(1-\beta\max(k^{0},k^{1})\big)}\,\mathcal{O}\Big(\frac{1}{\sqrt{n}}\Big), \qquad (21)$$

where $\lim_{n\rightarrow\infty}\mathcal{O}\big(\frac{1}{\sqrt{n}}\big)=0$.

Proof

Let $\tilde{m}_{t}$ denote the empirical distribution of the infected users under strategy $\tilde{g}$ at time $t\in\mathbb{N}$. For ease of display, let the function $\phi:\mathcal{U}\times[0,1]\times\mathcal{Z}\rightarrow[0,1]$ denote the dynamics (14), i.e., $p_{t+1}=\phi(u_{t},p_{t},z_{t})$, $t\in\mathbb{N}$. For any $\tilde{m}_{t}$, $p_{t}$ and $z_{t}$ at time $t\in\mathbb{N}$, the following inequality holds as a result of the triangle inequality, the monotonicity of expectation, Assumptions 1 and 2, and equations (14), (16) and (17):

$$\begin{aligned}
\mathbb{E}\big[|\tilde{m}_{t+1}-p_{t+1}|\big]&=\mathbb{E}\big[|\tilde{m}_{t+1}-\phi(\tilde{g}(p_{t},z_{t}),p_{t},z_{t})|\big]\\
&\leq\mathbb{E}\big[|\tilde{m}_{t+1}-\phi(\tilde{g}(p_{t},z_{t}),\tilde{m}_{t},z_{t})|\big]+\mathbb{E}\big[|\phi(\tilde{g}(p_{t},z_{t}),p_{t},z_{t})-\phi(\tilde{g}(p_{t},z_{t}),\tilde{m}_{t},z_{t})|\big]\\
&\leq\mathcal{O}\Big(\frac{1}{\sqrt{n}}\Big)+\max(k^{0},k^{1})\,\mathbb{E}\big[|\tilde{m}_{t}-p_{t}|\big],
\end{aligned} \qquad (22)$$

where $\mathcal{O}\big(\frac{1}{\sqrt{n}}\big)$ is the rate of convergence to the infinite-population limit [20, Lemma 4]. On the other hand, from the triangle inequality, the monotonicity of expectation, Assumptions 1 and 2, and equations (15), (16) and (17):

$$\begin{aligned}
J_{n}(\tilde{g})-J_{\infty}^{\ast}&=\mathbb{E}\Big[\sum_{t=1}^{\infty}\beta^{t-1}c(\tilde{g}(p_{t},z_{t}),\tilde{m}_{t},z_{t})\Big]-\mathbb{E}\big[\tilde{V}(p_{1},z_{1})\big]\\
&=\mathbb{E}\Big[\sum_{t=1}^{\infty}\beta^{t-1}c(\tilde{g}(p_{t},z_{t}),\tilde{m}_{t},z_{t})-\sum_{t=1}^{\infty}\beta^{t-1}c(\tilde{g}(p_{t},z_{t}),p_{t},z_{t})\Big]\\
&\leq\sum_{t=1}^{\infty}\beta^{t-1}k^{c}\,\mathbb{E}\big[|\tilde{m}_{t}-p_{t}|\big].
\end{aligned} \qquad (23)$$

From [20, Lemma 2], we have $\mathbb{E}\big[|\tilde{m}_{1}-p_{1}|\big]\leq\mathcal{O}\big(\frac{1}{\sqrt{n}}\big)$. The proof then follows from Assumption 3 by successively using inequality (22) in (23). $\hfill\blacksquare$
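To make the last step explicit (a brief expansion of the argument above, with $k:=\max(k^{0},k^{1})$): iterating inequality (22) from $\mathbb{E}\big[|\tilde{m}_{1}-p_{1}|\big]\leq\mathcal{O}\big(\frac{1}{\sqrt{n}}\big)$ yields

$$\mathbb{E}\big[|\tilde{m}_{t}-p_{t}|\big]\leq\mathcal{O}\Big(\frac{1}{\sqrt{n}}\Big)\sum_{s=0}^{t-1}k^{s},\quad t\in\mathbb{N},$$

and substituting this bound into (23) gives

$$J_{n}(\tilde{g})-J_{\infty}^{\ast}\leq k^{c}\,\mathcal{O}\Big(\frac{1}{\sqrt{n}}\Big)\sum_{t=1}^{\infty}\beta^{t-1}\sum_{s=0}^{t-1}k^{s}=\frac{k^{c}}{(1-\beta)(1-\beta k)}\,\mathcal{O}\Big(\frac{1}{\sqrt{n}}\Big),$$

where the interchange of the two sums and the geometric series are justified by $\beta k<1$ (Assumption 3); this is precisely the bound in (21).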

Remark 2

It is to be noted that no continuity assumption is imposed on the infinite-population strategy (16) in order to derive Theorem 3. In addition, an extra stability condition (i.e., Assumption 3) is needed to ensure that the infinite-population strategy is stable when applied to the finite-population network.

Since the optimization in (15) is over the infinite space $[0,1]$, it is computationally difficult to find the exact solution. However, it is shown in [20, Corollary 1] that if the optimization is carried out over the space $\mathcal{M}_{n}$, the resulting solution is a near-optimal solution for the finite-population case under Assumptions 1, 2 and 3.

Corollary 1

Let Assumptions 1, 2 and 3 hold. Let also $\tilde{g}_{n}:\mathcal{M}_{n}\times\mathcal{Z}\rightarrow\mathcal{U}$ be a minimizer of the quantized version of the Bellman equation (15), as proposed in [20, Corollary 1]. The performance of $\tilde{g}_{n}$ converges to $J^{\ast}_{\infty}$ at the rate $1/\sqrt{n}$.
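A minimal sketch of such a quantized solver is given below; restricting $p$ to the grid $\mathcal{M}_{n}$ and rounding the update (14) to the nearest grid point is an illustrative choice here and is not claimed to be the exact construction of [20, Corollary 1].

```python
import numpy as np

def solve_quantized_bellman(n, Z, U, Pz, f0, f1, cost, beta, num_iter=500):
    """Value iteration for a quantized version of the infinite-population Bellman
    equation (15), with p restricted to the grid M_n = {0, 1/n, ..., 1}.

    Returns V_tilde[k, z] and a greedy strategy g_tilde[k, z], where k indexes p = k / n.
    """
    grid = np.arange(n + 1) / n
    V = np.zeros((n + 1, Z))
    for _ in range(num_iter):
        Q = np.zeros((n + 1, Z, U))
        for k, p in enumerate(grid):
            for z in range(Z):
                for u in range(U):
                    p_next = (1 - p) * f0(u, p, z) + p * (1 - f1(u, p, z))  # eq. (14)
                    kn = int(round(p_next * n))        # project back onto the grid M_n
                    EV = sum(Pz[z, u, zp] * V[kn, zp] for zp in range(Z))
                    Q[k, z, u] = cost(u, p, z) + beta * EV
        V = Q.min(axis=2)
    return V, Q.argmin(axis=2)
```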

IV Simulations: A social network example

Nowadays, many people get their daily news via social media, where a small piece of false information may propagate and lead to widespread misinformation and potentially catastrophic consequences. As a result, it is crucial for network managers as well as governments to prevent large-scale misinformation on social media. Inspired by this objective, we present a simple rumor control problem, where the goal of a network manager is to minimize the number of misinformed users in the presence of a false rumor.

Figure 1: The optimal strategy for a network of size $200$ in Example 1.

Example 1: Consider $n\in\mathbb{N}$ users and a matter of public interest with an uncertain outcome, such as an election. Let $x^{i}_{t}=S$ mean that user $i\in\mathbb{N}_{n}$ is correctly informed about the topic at time $t\in\mathbb{N}$, and let $x^{i}_{t}=I$ mean otherwise. Denote by $m_{t}=\frac{1}{n}\sum_{i=1}^{n}\mathbb{1}(x^{i}_{t}=I)$ the empirical distribution of the misinformed users at time $t$.

Let $z_{t}\in\{0,1,2,3,4\}$ be the number of fake news items that the source of the rumor publishes on social media at time $t$. The source is assumed to be persistent, i.e., it will find a way to spread the rumor unless it is constantly blocked.

The network manager has three options at each time instant, i.e., $u_{t}\in\{1,2,3\}$, where:

  • $u_{t}=1$ means that the network manager does not intervene;

  • $u_{t}=2$ means that the network manager blocks the source of the rumor; and

  • $u_{t}=3$ means that the network manager broadcasts authenticated information to the users and addresses the issue publicly for transparency.

Let $w^{z}_{t}\in\{0,1\}$ be a random one-unit increment with success probability $0.3$ such that at time $t\in\mathbb{N}$,

$$z_{t+1}=\begin{cases}z_{t}+w^{z}_{t},& z_{t}<4,\ u_{t}\in\{1,3\},\\ 4,& z_{t}=4,\ u_{t}\in\{1,3\},\\ 0,& u_{t}=2.\end{cases} \qquad (24)$$

The number of fake news items may be viewed as the severity level of the misinformation induced by the rumor. In this example, we have implicitly assumed that when the “block” option is taken by the manager, the source of the rumor cautiously starts producing new fake news from zero again. The initial states of the users are independently and identically distributed with probability mass function $(0.85,0.15)$, where $0.15$ is the probability of being initially misinformed. At any time $t\in\mathbb{N}$, given the empirical distribution of misinformed users $m_{t}$, the number of fake news items $z_{t}$ and the option $u_{t}$ taken by the network manager, an informed user is misled by the rumor with the following probability:

$$f^{0}(u_{t},m_{t},z_{t})=\begin{cases}0.2\,m_{t}(z_{t}+1),& u_{t}=1,\\ 0.2\,m_{t},& u_{t}=2,\\ 0.1\,m_{t}^{2},& u_{t}=3,\end{cases} \qquad (25)$$

where a larger number of misinformed users and fake news items implies a higher probability that a user becomes misinformed. On the other hand, a misinformed user becomes informed, i.e., is convinced by the authenticated information provided by the network manager, with a high probability. More precisely,

$$f^{1}(u_{t})=\begin{cases}0,& u_{t}\in\{1,2\},\\ 0.8,& u_{t}=3.\end{cases} \qquad (26)$$

Denote by $\ell:\mathcal{U}\times\mathcal{Z}\rightarrow\mathbb{R}_{\geq 0}$ the implementation cost of each option, and let:

$$\ell(1,z)=0,\quad \ell(2,z)=0.2z+1,\quad \ell(3,z)=5, \qquad (27)$$

for any $z\in\{0,1,2,3,4\}$. It is desired to minimize the following cost function:

$$\mathbb{E}\Big[\sum_{t=1}^{\infty}0.9^{t-1}\big(3.8\,m_{t}+\ell(u_{t},z_{t})\big)\Big]. \qquad (28)$$
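For concreteness, the model of Example 1, i.e., equations (24)–(28), can be transcribed as follows and plugged into the value-iteration sketch given after Remark 1; the encoding (with options indexed from 0 instead of 1) is an illustrative assumption, not part of the paper.

```python
import numpy as np

Z, U, beta = 5, 3, 0.9           # 5 fake-news levels, 3 options, discount factor 0.9

def f0(u, m, z):                 # infection probability, eq. (25); u in {0,1,2} codes options 1-3
    return [0.2 * m * (z + 1), 0.2 * m, 0.1 * m ** 2][u]

def f1(u, m, z):                 # cure probability, eq. (26)
    return [0.0, 0.0, 0.8][u]

def cost(u, m, z):               # per-step cost 3.8*m + l(u, z), eqs. (27)-(28)
    return 3.8 * m + [0.0, 0.2 * z + 1.0, 5.0][u]

# Transition matrix of the fake-news level z_t, eq. (24).
Pz = np.zeros((Z, U, Z))
for z in range(Z):
    for u in range(U):
        if u == 1:                       # option 2: block the source, reset to 0
            Pz[z, u, 0] = 1.0
        elif z == Z - 1:                 # options 1 and 3 at the highest level
            Pz[z, u, z] = 1.0
        else:                            # one-unit increment with probability 0.3
            Pz[z, u, z] = 0.7
            Pz[z, u, z + 1] = 0.3

# e.g., V, g = solve_bellman(200, Z, U, Pz, f0, f1, cost, beta)
```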

To determine the optimal strategy, we first compute the transition probability matrix in Theorem 1 and then solve the Bellman equation (11) in Theorem 2 using the value-iteration method. The optimal strategy for $n=200$ is displayed in Figure 1 as a function of the empirical distribution of misinformed users and the number of fake news items. Under this optimal strategy, one realization of Example 1 is depicted in Figure 2.

Figure 2: Trajectory of the solution for a network of size $200$ in Example 1.

To verify the results of Theorem 3 and Corollary 1, let the Bellman equation (15) be quantized with step size $1/n$, and denote the resulting value function by $V^{Q}(q_{1},z_{1})$, where $q_{1}$ is the closest point in $\mathcal{M}_{n}$ to $p_{1}=0.15$. Figure 3 shows that $\mathbb{E}\big[V^{Q}(q_{1},z_{1})\big]$ converges to $J^{\ast}_{n}=\mathbb{E}\big[V(m_{1},z_{1})\big]$ as $n$ increases.

Figure 3: The quantized solution converges to the optimal solution as the number of users increases in Example 1.

V Conclusions

A stochastic dynamic control strategy was introduced over a homogeneous network to minimize the spread of a persistent infection. It was shown that the exact optimal solution can be efficiently computed by solving a Bellman equation whose state space increases linearly with the number of nodes. In addition, an approximate optimal solution was proposed based on the infinite-population network, where the approximation error was shown to be upper bounded by a term that decays to zero as the number of users tends to infinity. An example of a social network was then presented to verify the theoretical results.

As a future research direction, the obtained results can be extended to partially homogeneous networks wherein the nodes are categorized into several sub-populations of homogeneous nodes such as low-degree and high-degree nodes. In addition, various approximation methods may be used to further alleviate the computational complexity of the proposed solutions. The development of reinforcement learning algorithms based on the Bellman equations provided in this paper can be another interesting problem for future work.

References

  • [1] R. M. Anderson, The population dynamics of infectious diseases: Theory and applications.   Springer, 2013.
  • [2] O. Shy, “A short survey of network economics,” Review of Industrial Organization, vol. 38, no. 2, pp. 119–149, 2011.
  • [3] G. A. Pagani and M. Aiello, “The power grid as a complex network: A survey,” Physica A: Statistical Mechanics and its Applications, vol. 392, no. 11, pp. 2688–2700, 2013.
  • [4] D. Easley and J. Kleinberg, Networks, crowds, and markets: Reasoning about a highly connected world.   Cambridge University Press, 2010.
  • [5] K. Sha, A. Striege, and M. Song, Security, Privacy and Reliability in Computer Communications and Networks.   River Publishers, 2016.
  • [6] I. Friedberg, F. Skopik, G. Settanni, and R. Fiedler, “Combating advanced persistent threats: From network event correlation to incident detection,” Computers & Security, vol. 48, pp. 35–57, 2015.
  • [7] R. Pastor-Satorras, C. Castellano, P. Van Mieghem, and A. Vespignani, “Epidemic processes in complex networks,” Reviews of Modern Physics, vol. 87, no. 3, p. 925, 2015.
  • [8] C. Nowzari, V. M. Preciado, and G. J. Pappas, “Analysis and control of epidemics: A survey of spreading processes on complex networks,” IEEE Control Systems, vol. 36, no. 1, pp. 26–46, 2016.
  • [9] J. O. Kephart and S. R. White, “Directed-graph epidemiological models of computer viruses,” in Proceedings of IEEE Computer Society Symposium on Research in Security and Privacy, pp. 343–359, 1992.
  • [10] M. Nekovee, Y. Moreno, G. Bianconi, and M. Marsili, “Theory of rumour spreading in complex social networks,” Physica A: Statistical Mechanics and its Applications, vol. 374, no. 1, pp. 457–470, 2007.
  • [11] P. Van Mieghem, J. Omic, and R. Kooij, “Virus spread in networks,” IEEE/ACM Transactions on Networking, vol. 17, no. 1, pp. 1–14, 2009.
  • [12] Y. Wang, D. Chakrabarti, C. Wang, and C. Faloutsos, “Epidemic spreading in real networks: An eigenvalue viewpoint,” in Proceedings of the 22nd IEEE International Symposium on Reliable Distributed Systems, pp. 25–34, 2003.
  • [13] N. A. Ruhi and B. Hassibi, “SIRS epidemics on complex networks: Concurrence of exact Markov chain and approximated models,” in Proceedings of the 54th IEEE Conference on Decision and Control, pp. 2919–2926, 2015.
  • [14] R. Morton and K. H. Wickwire, “On the optimal control of a deterministic epidemic,” Advances in Applied Probability, vol. 6, no. 4, pp. 622–635, 1974.
  • [15] A. Khanafer and T. Başar, “An optimal control problem over infected networks,” in Proceedings of the International Conference of Control, Dynamic Systems, and Robotics, Ottawa, Ontario, Canada, 2014.
  • [16] S. Eshghi, M. Khouzani, S. Sarkar, and S. S. Venkatesh, “Optimal patching in clustered malware epidemics,” IEEE/ACM Transactions on Networking, vol. 24, no. 1, pp. 283–298, 2016.
  • [17] A. Di Liddo, “Optimal control and treatment of infectious diseases. the case of huge treatment costs,” Mathematics, vol. 4, no. 2, p. 21, 2016.
  • [18] C. Nowzari, V. M. Preciado, and G. J. Pappas, “Optimal resource allocation for control of networked epidemic models,” IEEE Transactions on Control of Network Systems, vol. 4, no. 2, pp. 159–169, 2017.
  • [19] J. Arabneydi, “New concepts in team theory: Mean field teams and reinforcement learning,” Ph.D. dissertation, Dep. of Electrical and Computer Engineering, McGill University, Montreal, Canada, 2016.
  • [20] J. Arabneydi and A. G. Aghdam, “A certainty equivalence result in team-optimal control of mean-field coupled Markov chains,” in Proceedings of the 56th IEEE Conference on Decision and Control, 2017, pp. 3125–3130.
  • [21] ——, “Optimal dynamic pricing for binary demands in smart grids: A fair and privacy-preserving strategy,” in Proceedings of American Control Conference, 2018, pp. 5368–5373.
  • [22] ——, “Near-optimal design for fault-tolerant systems with homogeneous components under incomplete information,” in Proceedings of the 61st IEEE International Midwest Symposium on Circuits and Systems, 2018, pp. 809–812.
  • [23] J. Arabneydi, M. Baharloo, and A. G. Aghdam, “Optimal distributed control for leader-follower networks: A scalable design,” in Proceedings of the 31st IEEE Canadian Conference on Electrical and Computer Engineering, 2018, pp. 1–4.
  • [24] M. Baharloo, J. Arabneydi, and A. G. Aghdam, “Near-optimal control strategy in leader-follower networks: A case study for linear quadratic mean-field teams,” in Proceedings of the 57th IEEE Conference on Decision and Control, 2018, pp. 3288–3293.
  • [25] D. P. Bertsekas, Dynamic programming and optimal control.   Athena Scientific, 4th edition, 2012.