
Proc. of the 19th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2020), B. An, N. Yorke-Smith, A. El Fallah Seghrouchni, G. Sukthankar (eds.), May 2020, Auckland, New Zealand.

Affiliations: University of Southern California; Carnegie Mellon University; World Wildlife Fund.

Green Security Game with Community Engagement

Taoan Huang (taoanhua@usc.edu), Weiran Shen (emersonswr@gmail.com), David Zeng (dzeng@andrew.cmu.edu), Tianyu Gu (tianyug@andrew.cmu.edu), Rohit Singh (rsingh@wwf.sg), and Fei Fang (feif@cs.cmu.edu)
Abstract.

While game-theoretic models and algorithms have been developed to combat illegal activities, such as poaching and over-fishing, in green security domains, none of the existing work considers the crucial aspect of community engagement: community members are recruited by law enforcement as informants and can provide valuable tips, e.g., the location of ongoing illegal activities, to assist patrols. We fill this gap and (i) introduce a novel two-stage security game model for community engagement, with a bipartite graph representing the informant-attacker social network and a level-$\kappa$ response model for attackers inspired by cognitive hierarchy; (ii) provide complexity results and exact, approximate, and heuristic algorithms for selecting informants and allocating patrollers against level-$\kappa$ ($\kappa<\infty$) attackers; (iii) provide a novel algorithm to find the optimal defender strategy against level-$\infty$ attackers, which converts the problem of optimizing a parameterized fixed point to a bi-level optimization problem, where the inner level is just a linear program, and the outer level has only a linear number of variables and a single linear constraint. We also evaluate the algorithms through extensive experiments.

Key words and phrases:
Security Game; Computational Sustainability; Community Engagement

1. Introduction

Despite the significance of protecting natural resources to environmental sustainability, a common lack of funding leads to an extremely low density of law enforcement units (referred to as defenders) to combat illegal activities such as wildlife poaching and overfishing (referred to as attacks). Due to insufficient sanctions, attackers are able to launch frequent attacks Le Gallic and Cox (2006); Leader-Williams and Milner-Gulland (1993), making it even more challenging to effectively detect and deter criminal activities through patrolling. To improve patrol efficiency, law enforcement agencies often recruit informants from local communities and plan defensive resources based on the tips they provide Linkie et al. (2015). Since attackers are often from the same local community and their activities can be observed by informants through social interactions, such tips contain detailed information about ongoing or upcoming criminal activities and, if known by defenders, can directly be used to guide the allocation of defensive resources. In fact, community engagement is listed by the World Wide Fund for Nature as one of the six pillars towards zero poaching WWF (2015). The importance of community engagement goes beyond these green security domains concerned with environmental conservation and extends to domains such as fighting urban crimes Tublitz and Lawrence (2014); Gill et al. (2014).

Previous research in computational game theory has led to models and algorithms that help defenders allocate limited resources in the presence of attackers, with applications to enforcing traffic laws Rosenfeld and Kraus (2017), combating oil siphoning Wang et al. (2018), and deceiving cyber adversaries Schlenker et al. (2018), in addition to protecting critical infrastructure Pita et al. (2008) and combating wildlife crime Fang et al. (2017). However, none of this work has considered the essential element of community engagement.

Community engagement leads to fundamentally new challenges that do not exist in previous literature. First, the defender not only needs to determine how to patrol but also needs to decide whom to recruit as informants. Second, there can be multiple attackers, and the existence of informants makes the success or failure of their attacks interdependent, since any tip about other attackers' actions can change the defender's patrol. Third, because of the combinatorial nature of the tips, representing the defender's strategy requires exponential space, making the problem of finding the optimal defender strategy extremely challenging. Fourth, attackers may notice the patrol pattern over time and adapt their strategies accordingly.

In this paper, we provide the first study to fill this gap, introducing a novel two-stage security game model for community engagement that represents the social network between potential informants and attackers as a bipartite graph. In the first stage of the game, the defender recruits a set of informants under a budget constraint, and in the second stage, the defender chooses a set of targets to protect based on tips from recruited informants. Inspired by the quantal cognitive hierarchy model Wright and Leyton-Brown (2014), we use a level-$\kappa$ response model for attackers, taking into account the fact that the attacker can perform iterative reasoning and that the attacker's strategy impacts the actual marginal strategy of the defender.

Our second contribution includes complexity results and algorithms for computing the optimal defender strategy against level-$\kappa$ ($\kappa<\infty$) attackers. We show that the problem of selecting the optimal set of informants is NP-hard. Further, based on sampling techniques, we develop an approximation algorithm to compute the optimal patrol strategy and a heuristic algorithm to find the optimal set of informants to recruit. For expository purposes, we mainly describe the algorithms for level-0 attackers and provide the extension to level-$\kappa$ ($0<\kappa<\infty$) attackers in the last section.

The third contribution is a novel algorithm to find the optimal defender strategy against level-$\infty$ attackers, which is an extremely challenging task: an attacker's strategy may affect the defender's marginal strategy, which in turn affects the attackers' strategies, and the level-$\infty$ attacker is defined through a fixed-point argument; as a result, the defender's utility relies crucially on solving a parameterized fixed-point problem. A naïve mathematical programming-based formulation is prohibitively large to solve. We instead reduce the program to a bi-level optimization problem, where both levels become more tractable. In particular, the inner-level optimization is a linear program, and the outer-level optimization has only a linear number of variables and a single linear constraint.

Finally, we conduct extensive experiments. We compare the running time and solution quality of the different algorithms and show that our bi-level optimization algorithm achieves better performance than an algorithm adapted from previous works. We also compare the case of level-0 attackers with the case of insider threat (i.e., the attacker is aware of the informants), where we formulate the problem as a mathematical program and solve it by adapting an algorithm from previous works. We show that the defender suffers a utility loss if the insider threat is not taken into consideration and the defender still assumes a naïve (level-0) attacker model.

2. Related Work and Background

Community engagement has been studied in criminology. Smith and Humphreys (2015); Moreto (2015); Duffy et al. (2015) investigate the role of community engagement in wildlife conservation, and Linkie et al. (2015); Gill et al. (2014) show the positive effects of community-oriented strategies. However, these works lack a mathematical model for strategic defender-attacker interactions.

Recruitment of informants has also been proposed as a way to study societal attitudes toward crime using evolutionary game theory models. Short et al. (2013) formulate the problem of solving for recruitment strategies as an optimal control problem to account for limited resources and budget. In contrast to their work, we emphasize the synergy between community engagement and the allocation of defensive resources, and we aim to find the best strategy for recruiting informants and allocating defensive resources.

In security domains, the Stackelberg Security Game (SSG) has been applied to a variety of security problems Tambe (2011), with variants accounting for alarm systems, surveillance cameras, and drones that can provide information in real time Basilico et al. (2017); Ma et al. (2018); Guo et al. (2017). Unlike the sensors that provide location-based information studied in previous works, the kind of tips the informants can provide depends on their social connections, an essential feature of community engagement.

Other than the full rationality model, boundedly rational behavioral models such as quantal response (QR) McKelvey and Palfrey (1995); Yang et al. (2012) and subjective utility quantal response Nguyen et al. (2013) have been explored in the study of SSGs. Our model and solution approach are compatible with most existing behavioral models in the SSG literature, but for expository purposes, we focus only on the QR model.

3. Model

In this section, we introduce our novel two-stage green security game with community engagement. The key addition is the consideration of informants from local communities. They can be recruited and trained by the defender to provide tips about ongoing or upcoming attacks.

Following existing works on SSGs Jain et al. (2010); Korzhyk et al. (2011), we consider a game with a set of targets $T=[n]=\{1,\dots,n\}$. The defender has $r$ units of defensive resources, each of which can protect or cover one target with no scheduling constraint. An attacker can choose a target to attack. If target $i$ is attacked, the defender (attacker) receives $R^{d}_{i}>0$ ($P^{a}_{i}<0$) if the target is covered, and otherwise receives $P^{d}_{i}<0$ ($R^{a}_{i}>0$).

Informants recruited by the defender can provide tips regarding the exact targets of ongoing or upcoming attacks, but tip frequency and usefulness may vary due to heterogeneity in the informants' social connections. We model the interactions and connections between potential informants $X$ (i.e., members of the community who are known to be non-attackers and can be recruited by the defender) and potential attackers $Y$ using a bipartite graph $G_{S}=(X,Y,E)$ with $X\cap Y=\emptyset$. Here we assume the defender has access to a list of potential attackers, which could be provided by the conservation site manager, since the deployment of our work relies on the manager's domain knowledge, experience, and understanding of the social connections among community members.

When an attacker decides to launch an attack, an informant who has interacted with the attacker previously may know his target location. Formally, for each $v\in Y$, we assume that $v$ will attack a target with probability $p_{v}$ but the target is unknown without informants, and each attacker takes actions independently. An edge $(u,v)\in E$ is associated with an information sharing intensity $w_{uv}$, representing the probability of attack activities of attacker $v$ being reported by $u$, given that $v$ attacks and $u$ is recruited as an informant.

In the first stage, the defender recruits $k$ informants, and in the second stage, the defender receives tips from the informants and allocates $r$ units of defensive resources. The defender's goal is to maximize the expected utility, defined as the sum of the utilities over all attacks.

Let $U$ denote the set of informants recruited in the first stage, where $|U|\leq k$, and let $V=\{v\mid\exists u\in U,(u,v)\in E\}$ denote the set of attackers that are connected with at least one informant in $U$. We represent tips as a vector of disjoint subsets of attackers $\mathbf{V}=(V_{1},\ldots,V_{n})$, where $V_{i}$ is the set of attackers who are reported to attack target $i\in T$, with $V_{i}\subseteq V$ and $V_{i}\cap V_{j}=\emptyset$ for any $i,j\in T$. An attacker $v$ is reported if there exists $i\in T$ such that $v\in V_{i}$; otherwise he is unreported. We also denote by $V_{0}=\bigcup_{i\in T}V_{i}$ the set of reported attackers. It is possible that $V_{0}=\emptyset$, and we say the defender is informed if $V_{0}\neq\emptyset$. Note that $\mathbf{V}$ is a compact representation of the tips received by the defender, as it neglects the identities of the informants, which are not crucial to the defender's decision making given that all tips are assumed to be correct.

In practice, tips are infrequent and the defender is often very protective of the informants. Thus, the attackers are often not aware of the existence of informants unless there is a significant insider threat. In addition, patrols can be divided into two categories: routine patrols and ambush patrols, where the latter are in response to tips from informants. Ambush patrols are costly, often requiring rangers to lie in wait for many hours for the possibility of catching a poacher. If not informed, the defender follows her routine patrol strategy $\mathbf{x}_{0}=(x_{1},\ldots,x_{n})$, with $x_{i}$ denoting the probability that target $i$ is covered. Naturally, under this assumption the defender should use a strategy $\mathbf{x}_{0}$ that is optimal against the QR model, which can be computed following Yang et al. (2012). If informed, she uses a different strategy $\mathbf{x}(\mathbf{V})$ based on the tip $\mathbf{V}$. Assume that each attacker, if deciding to attack a target, will respond to the defender's strategy following a known behavioral model, the QR model. We define $\mathsf{QR}(\mathbf{x}^{\prime}):=(q^{\prime}_{1},\ldots,q^{\prime}_{n})$, where $q^{\prime}_{i}$ is the probability of attacking target $i$ defined by

$$q^{\prime}_{i}=\frac{e^{\lambda\left[x^{\prime}_{i}P_{i}^{a}+(1-x^{\prime}_{i})R_{i}^{a}\right]}}{\sum_{j\in T}e^{\lambda\left[x^{\prime}_{j}P_{j}^{a}+(1-x^{\prime}_{j})R_{j}^{a}\right]}},\qquad(1)$$

and $\mathbf{x}^{\prime}$ is the attacker's subjective belief about the coverage probabilities. In the above equation, $\lambda\geq 0$ is the precision parameter McKelvey and Palfrey (1995), fixed throughout the paper. We discuss the relaxation of some of the assumptions mentioned above in Section 8.
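To make the response model concrete, the following is a minimal Python sketch of Equation (1); it is our own illustration (not the paper's implementation), and the example rewards, penalties, and coverage values are hypothetical.

import numpy as np

def quantal_response(x, R_a, P_a, lam=1.0):
    """Attack distribution q = QR(x') from Eq. (1).

    x   : subjective coverage probabilities x'_i, shape (n,)
    R_a : attacker rewards R^a_i (target uncovered), shape (n,)
    P_a : attacker penalties P^a_i (target covered), shape (n,)
    lam : precision parameter lambda >= 0
    """
    u = x * P_a + (1 - x) * R_a          # attacker's expected utility per target
    z = np.exp(lam * (u - u.max()))      # subtract the max for numerical stability
    return z / z.sum()

# Example: 3 hypothetical targets with one unit of coverage spread uniformly.
q = quantal_response(np.full(3, 1 / 3),
                     R_a=np.array([5., 3., 8.]),
                     P_a=np.array([-2., -1., -4.]))
print(q)   # the probabilities sum to 1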

3.1. Level-$\kappa$ Response Model

Motivated by the costly ambush patrols and inspired by cognitive hierarchy theory, we propose the level-$\kappa$ response model as the attackers' behavior model.

When the informants’ report intensities are negligible, the attackers are almost always faced with the routine patrol 𝐱0\mathbf{x}_{0}. But when the informants’ report intensities are not negligible, the attackers’ behavior will change the marginal probability that a target is covered. Thus we assume that level-0 attackers just play the quantal response against the routine patrol 𝐱0\mathbf{x}_{0}: 𝐪0=𝖰𝖱(𝐱0)\mathbf{q}^{0}=\mathsf{QR}(\mathbf{x}_{0}). Then the defender will likely get informed with different tips 𝐕\mathbf{V}, and respond with 𝐱(𝐕)\mathbf{x}(\mathbf{V}) accordingly. Over time, the attackers will learn about the change in the frequency that a target is covered. We denote the induced defender’s marginal strategy at level 0 by 𝐱^0=𝖬𝖲(𝐱0,𝐱,𝐪0)\hat{\mathbf{x}}^{0}=\mathsf{MS}(\mathbf{x}_{0},\mathbf{x},\mathbf{q}^{0}). After observing 𝐱^0\hat{\mathbf{x}}^{0} at level 0, level-1 attackers will update their strategies from 𝐪0\mathbf{q}^{0} to 𝐪1=𝖰𝖱(𝐱^0)\mathbf{q}^{1}=\mathsf{QR}(\hat{\mathbf{x}}^{0}). Similarly, attackers at level κ\kappa (0<κ<0<\kappa<\infty) will use quantal response against the defender’s marginal strategy at level κ1\kappa-1, i.e., 𝐪κ=𝖰𝖱(𝐱^κ1)\mathbf{q}^{\kappa}=\mathsf{QR}(\mathbf{\hat{x}}^{\kappa-1}), where 𝐱^κ1=𝖬𝖲(𝐱0,𝐱,𝐪κ1)\hat{\mathbf{x}}^{\kappa-1}=\mathsf{MS}(\mathbf{x}_{0},\mathbf{x},\mathbf{q}^{\kappa-1}). In Section 5, we also define level-\infty attackers.

Denote by $\mathsf{DefEU}(U)$ the defender's optimal utility when she recruits a set of informants $U$ and uses the optimal defending strategy. The key questions raised by this model are (i) how to recruit a set $U$ of at most $k$ informants and (ii) how to respond to the provided tips so as to maximize $\mathsf{DefEU}(U)$.

4. Defending against Level-0 Attackers

In this section, we first tackle the case where all attackers are level-0 by providing complexity results and algorithms to find the optimal set of informants. Designing efficient algorithms for this computationally hard problem is particularly challenging due to the combinatorial nature of the tips and the exponentially many possible informant selections. Furthermore, in the general case, attackers are heterogeneous and we do not know which attackers will be reported, making it hard to compute $\mathsf{DefEU}(U)$.

4.1. Complexity Results

Let $\mathbf{q}^{0}=(q_{1},\ldots,q_{n})$. Before presenting our complexity results, we first define some useful notation. Given the set of informants $U$ and the tips $\mathbf{V}=(V_{1},\ldots,V_{n})$, we denote by $\tilde{p}_{v}(V_{0})$ the probability of $v\in Y$ attacking a target given $V_{0}$, where $V_{0}=\bigcup_{i\in T}V_{i}$. We can compute $\tilde{p}_{v}(V_{0})$ as

$$\tilde{p}_{v}(V_{0})=\begin{cases}1&v\in V_{0}\\ \frac{(1-\tilde{w}_{v})p_{v}}{(1-\tilde{w}_{v})p_{v}+1-p_{v}}&v\in V\setminus V_{0}\\ p_{v}&v\in Y\setminus V,\end{cases}$$

where $\tilde{w}_{v}=1-\prod_{(u,v)\in E,u\in U}(1-w_{uv})$ is the probability of $v$ being reported given that he attacks. Given $V_{0}$ and $t_{i}=|V_{i}|$ reported attacks on each target $i$, we compute the expected utility on $i$ if $i$ is covered as $\mathsf{EU}_{i}^{c}(t_{i},V_{0}):=\left(t_{i}+q_{i}\sum_{v\in Y\setminus V_{0}}\tilde{p}_{v}(V_{0})\right)R_{i}^{d}$. We compute the expected utility if $i$ is uncovered, $\mathsf{EU}_{i}^{u}(t_{i},V_{0})$, similarly. Then, the expected gain of the target if covered can be written as $\mathsf{EG}_{i}(t_{i},V_{0}):=\mathsf{EU}_{i}^{c}(t_{i},V_{0})-\mathsf{EU}_{i}^{u}(t_{i},V_{0})$.
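For concreteness, the following sketch computes $\tilde{w}_{v}$, $\tilde{p}_{v}(V_{0})$, and $\mathsf{EG}_{i}(t_{i},V_{0})$; the function and argument names are our own (a sketch under the notation above, not the paper's code), and sets/dictionaries stand in for the graph data.

def w_tilde(v, U, w):
    """Probability that attacker v is reported given he attacks; w[(u, v)] = w_uv."""
    keep = 1.0
    for u in U:
        keep *= 1.0 - w.get((u, v), 0.0)
    return 1.0 - keep

def p_tilde(v, V0, V, U, w, p):
    """Posterior attack probability of v given the set of reported attackers V0."""
    if v in V0:
        return 1.0
    if v in V:
        wt = w_tilde(v, U, w)
        return (1 - wt) * p[v] / ((1 - wt) * p[v] + 1 - p[v])
    return p[v]

def expected_gain(i, t_i, V0, Y, V, U, w, p, q, R_d, P_d):
    """EG_i(t_i, V0) = EU^c_i - EU^u_i for target i (Y, V, V0 are sets)."""
    mass = t_i + q[i] * sum(p_tilde(v, V0, V, U, w, p) for v in Y - V0)
    return mass * (R_d[i] - P_d[i])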

Theorem 4.1. When the defender is informed by informants $U$, the optimal allocation of defensive resources can be determined in $O(|Y|+n)$ time given the tips $\mathbf{V}=(V_{1},\ldots,V_{n})$.

Given tips from recruited informants, the defender can find the optimal resource allocation by greedily protecting the targets with the highest expected gains. The proof of Theorem 4.1 is deferred to Appendix B. However, the problem of computing the optimal set of informants is still hard.
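As a small illustration of the greedy allocation behind Theorem 4.1 (our own sketch; the gains are assumed to be the quantities $\mathsf{EG}_{i}(t_{i},V_{0})$ defined above):

def allocate_resources(EG, r):
    """Cover the r targets with the largest expected gain EG_i(t_i, V0).

    Sorting is used here for clarity; replacing it with a linear-time
    selection of the top-r gains yields the O(|Y| + n) bound of Theorem 4.1
    (the |Y| term comes from computing the gains themselves).
    """
    order = sorted(range(len(EG)), key=lambda i: EG[i], reverse=True)
    return set(order[:r])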

Theorem 4.2. Computing the optimal set of informants to recruit is NP-hard.

The proof of Theorem 4.2 in Appendix C focuses on a relatively simple case and constructs a reduction from the maximum coverage problem ($\mathsf{MCP}$).

4.2. Finding the Optimal Set of Informants

In this subsection, we develop exact and heuristic informant selection algorithms to compute the optimal set of informants. To find the $U$ that maximizes $\mathsf{DefEU}(U)$, we first focus on computing $\mathsf{DefEU}(U)$ by providing a dynamic programming-based algorithm and approximation algorithms.

4.2.1. Calculating $\mathsf{DefEU}(U)$

Let $\mathsf{DefEU}_{0}$ be the expected utility when using the optimal regular defending strategy against a single attack, which can be obtained by the algorithms introduced in Yang et al. (2012). Then $\mathsf{DefEU}(U)$ can be explicitly written as

$$\mathsf{DefEU}(U)=\Pr[V_{0}=\emptyset]\,\mathsf{DefEU}_{0}+\Pr[V_{0}\neq\emptyset]\,\mathsf{E}\left[\sum_{i\in[n]}\mathbf{x}_{i}(\mathbf{V})\,\mathsf{EG}_{i}(t_{i},V_{0})+\mathsf{EU}_{i}(t_{i},V_{0})\,\Big|\,V_{0}\neq\emptyset\right].$$

Directly computing $\mathsf{DefEU}(U)$ from the above equation is formidable due to the exponential number of tip combinations. However, it is possible to avoid a significant amount of enumeration by handling the calculation carefully. We first develop an Enumeration and Dynamic Programming-based Algorithm (EDPA) to compute the exact $\mathsf{DefEU}(U)$, as shown in Algorithm 1.

First, we compute the utility when the defender is not informed (lines 4-6). Then, we focus on calculating the total utility $\mathsf{DefEU}^{\prime}(U)$ in the case when the defender is informed. By the linearity of expectation, $\mathsf{DefEU}^{\prime}(U)$ can be computed as the sum of the expected utilities obtained from all targets. Therefore, we focus on the calculation of the expected utility of a single target $i$. For each target $i$, Algorithm 1 enumerates all possible types of tips (lines 2-7). We denote each type of tip by a tuple $(t_{i},V_{0})$, which encodes the set of reported attackers $V_{0}\neq\emptyset$ and the number of reported attackers $t_{i}$ targeting location $i$. The probability of receiving $(t_{i},V_{0})$ can be written as

$$\Pr(t_{i},V_{0}|U)=P_{V_{0}}\binom{|V_{0}|}{t_{i}}q_{i}^{t_{i}}(1-q_{i})^{|V_{0}|-t_{i}},$$

where

$$P_{V_{0}}=\prod_{v\in V_{0}}(\tilde{w}_{v}p_{v})\prod_{v\in V\setminus V_{0}}(1-\tilde{w}_{v}p_{v})\qquad(2)$$

is the probability of $V_{0}$ being the set of reported attackers given $U$ (line 3). Let $P_{i,r}$ be the probability of $i$ being among the $r$ targets with the highest expected gain given $(t_{i},V_{0})$ and $U$ (lines 12-13). For a given tip type $(t_{i},V_{0})$, the expected contribution of target $i$ to $\mathsf{DefEU}^{\prime}(U)$ is

$$\begin{aligned}&\Pr(t_{i},V_{0}|U)\cdot\mathsf{EU}_{i}(t_{i},V_{0})+P_{V_{0}}\binom{|V_{0}|}{t_{i}}q_{i}^{t_{i}}\cdot P_{i,r}\,\mathsf{EG}_{i}(t_{i},V_{0})\\ &=P_{V_{0}}\binom{|V_{0}|}{t_{i}}q_{i}^{t_{i}}\left((1-q_{i})^{|V_{0}|-t_{i}}\mathsf{EU}_{i}(t_{i},V_{0})+P_{i,r}\,\mathsf{EG}_{i}(t_{i},V_{0})\right).\end{aligned}$$

We can then compute $\mathsf{DefEU}^{\prime}(U)$ by summing over all possible $t_{i}$ and $V_{0}\neq\emptyset$.

The calculation of $P_{i,r}$ is all that remains. This can be done efficiently via Algorithm 2, a dynamic programming-based calculation. Let $\{i_{1},\ldots,i_{n-1}\}$ denote the set of targets apart from $i$, i.e., $T\setminus\{i\}$ (line 1), and let $y!\cdot f(s,x,y)$ be the probability of having $y$ reported attacks among the first $s$ targets with $x$ of these targets having expected gain higher than $\mathsf{EG}_{i}$, given a tip of type $(t_{i},V_{0})$. Therefore, $f(s,x,y)$ can be written as

$$f(s,x,y)=\sum_{\substack{a_{1}+\cdots+a_{s}=y,\\ \sum_{j=1}^{s}\mathbf{1}\left[\mathsf{EG}_{i_{j}}(a_{j},V_{0})>\mathsf{EG}_{i}(t_{i},V_{0})\right]=x}}\frac{q_{i_{1}}^{a_{1}}q_{i_{2}}^{a_{2}}\cdots q_{i_{s}}^{a_{s}}}{a_{1}!a_{2}!\cdots a_{s}!},$$

which can be calculated using dynamic programming (lines 5-11). Computing $f(s,x,y)$ amounts to counting weighted $s$-partitions of the integer $y$, where we also account for the constraint imposed by the limited number of resources. To calculate $f(s,x,y)$, we enumerate $a_{s}$ as $\tilde{y}$ (line 6) and compare $\mathsf{EG}_{i_{s}}(\tilde{y},V_{0})$ with $\mathsf{EG}_{i}(t_{i},V_{0})$ (line 8). If $\mathsf{EG}_{i_{s}}(\tilde{y},V_{0})>\mathsf{EG}_{i}(t_{i},V_{0})$, we use the value of $f(s-1,x-1,y-\tilde{y})$ (line 9); otherwise we use $f(s-1,x,y-\tilde{y})$ (line 11). Thus, we have $P_{i,r}=(|V_{0}|-t_{i})!\left(\sum_{x=0}^{r-1}f(n-1,x,|V_{0}|-t_{i})\right)$. The time complexity is $O(nr|Y|^{2})$ for Algorithm 2 and $O(2^{|Y|}n^{2}r|Y|^{3})$ for Algorithm 1.

Algorithm 1 Calculate $\mathsf{DefEU}(U)$
1: $\mathsf{EU}\leftarrow 0$
2: for all possible sets of reported attackers $V_{0}\subseteq V$ do
3:     $P_{V_{0}}\leftarrow\prod_{v\in V_{0}}(\tilde{w}_{v}p_{v})\prod_{v\in V\setminus V_{0}}(1-\tilde{w}_{v}p_{v})$
4:     if $V_{0}=\emptyset$ then
5:         $\mathsf{EU}\leftarrow\mathsf{EU}+P_{V_{0}}\sum_{v\in Y}\tilde{p}_{v}(V_{0})\,\mathsf{DefEU}_{0}$
6:         Continue to line 2
7:     for target $i\in T$ and $0\leq t_{i}\leq|V_{0}|$ do
8:         Calculate $f(\cdot)$ given $|V_{0}|,i,t_{i}$
9:         $\mathsf{EG}_{i}\leftarrow(t_{i}+q_{i}\sum_{v\in Y\setminus V_{0}}\tilde{p}_{v}(V_{0}))(R_{i}^{d}-P_{i}^{d})$
10:        $\mathsf{EU}_{i}^{u}\leftarrow(t_{i}+q_{i}\sum_{v\in Y\setminus V_{0}}\tilde{p}_{v}(V_{0}))P_{i}^{d}$
11:        $P_{i,r}\leftarrow(|V_{0}|-t_{i})!\left(\sum_{x=0}^{r-1}f(s,x,|V_{0}|-t_{i})\right)$
12:        $\mathsf{EU}\leftarrow\mathsf{EU}+P_{V_{0}}\binom{|V_{0}|}{t_{i}}q_{i}^{t_{i}}\cdot P_{i,r}\cdot\mathsf{EG}_{i}$
13:        $\mathsf{EU}\leftarrow\mathsf{EU}+P_{V_{0}}\binom{|V_{0}|}{t_{i}}q_{i}^{t_{i}}(1-q_{i})^{|V_{0}|-t_{i}}\,\mathsf{EU}_{i}^{u}$
14: $\mathsf{DefEU}(U)\leftarrow\mathsf{EU}$
Algorithm 2 Calculate $f(\cdot)$ given $|V_{0}|,i,t_{i}$
1: $\{i_{1},\ldots,i_{n-1}\}\leftarrow T\setminus\{i\}$
2: $\mathsf{EG}_{i}\leftarrow(t_{i}+q_{i}\sum_{v\in Y\setminus V_{0}}\tilde{p}_{v}(V_{0}))(R_{i}^{d}-P_{i}^{d})$
3: Initialize $f(s,x,y)\leftarrow 0$ for all $s,x,y$
4: $f(0,0,0)\leftarrow 1$
5: for $s$ in $[1,n-1]$, $x$ in $[0,\min(s,r)]$, $y$ in $[0,|V_{0}|-t_{i}]$ do
6:     for $\tilde{y}$ in $[0,y]$ do
7:         $\mathsf{EG}_{i_{s}}\leftarrow(\tilde{y}+q_{i_{s}}\sum_{v\in Y\setminus V_{0}}\tilde{p}_{v}(V_{0}))(R_{i_{s}}^{d}-P_{i_{s}}^{d})$
8:         if $\mathsf{EG}_{i_{s}}>\mathsf{EG}_{i}$ then
9:             $f(s,x,y)\mathrel{+}=\frac{q_{i_{s}}^{\tilde{y}}}{\tilde{y}!}f(s-1,x-1,y-\tilde{y})$
10:        else
11:            $f(s,x,y)\mathrel{+}=\frac{q_{i_{s}}^{\tilde{y}}}{\tilde{y}!}f(s-1,x,y-\tilde{y})$
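A compact Python rendering of Algorithm 2 together with the final $P_{i,r}$ computation is given below. It is our own sketch under the same notation; the callable EG_other stands in for the gains $\mathsf{EG}_{i_{s}}(\tilde{y},V_{0})$ computed in line 7 of Algorithm 2.

from math import factorial

def P_i_r(t_i, m, q, EG_i, EG_other, r):
    """Probability that target i is among the r highest-gain targets.

    t_i      : number of reported attacks on target i
    m        : |V0| - t_i, reported attacks to distribute over the other targets
    q        : attack probabilities q_{i_1}, ..., q_{i_{n-1}} of the other targets
    EG_i     : expected gain EG_i(t_i, V0)
    EG_other : EG_other(s, a) = EG_{i_s}(a, V0) for the s-th other target (1-indexed)
    r        : number of defensive resources
    """
    n1 = len(q)                       # n - 1 other targets
    # f[x][y]: weighted probability of y reported attacks among the first s targets,
    # x of which have expected gain higher than EG_i (weights divided by the a_j!).
    f = [[0.0] * (m + 1) for _ in range(r + 1)]
    f[0][0] = 1.0
    for s in range(1, n1 + 1):
        g = [[0.0] * (m + 1) for _ in range(r + 1)]
        for x in range(0, min(s, r) + 1):
            for y in range(0, m + 1):
                for a in range(0, y + 1):          # attacks assigned to target i_s
                    weight = q[s - 1] ** a / factorial(a)
                    if EG_other(s, a) > EG_i:
                        if x >= 1:
                            g[x][y] += weight * f[x - 1][y - a]
                    else:
                        g[x][y] += weight * f[x][y - a]
        f = g
    return factorial(m) * sum(f[x][m] for x in range(r))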

Since EDPA runs in exponential time, we introduce approximation methods to estimate $\mathsf{DefEU}(U)$. Let $\mathsf{DefEU}(U,C)$ be the estimated defender utility returned by Algorithm 1 if only subsets of reported attackers $V_{0}$ with $|V_{0}|<C$ are enumerated in line 2. We refer to this approach of estimating $\mathsf{DefEU}(U)$ as C-Truncated. Next, we show that $\mathsf{DefEU}(U,C)$ is close to the exact $\mathsf{DefEU}(U)$ when it is unlikely that many attacks happen at the same time. Formally, assuming that the expected number of attacks is bounded by a constant $C^{\prime}$, that is, $\sum_{v\in Y}p_{v}\leq C^{\prime}$, then $\mathsf{DefEU}(U,C)$ for $C>C^{\prime}$ is an estimate of $\mathsf{DefEU}(U)$ with bounded error.

Lemma 4.3. Assume that $\sum_{v\in Y}p_{v}\leq C^{\prime}$ and $|P_{i}^{d}|,|R_{i}^{d}|\leq Q$. The estimation error $|\mathsf{DefEU}(U,C)-\mathsf{DefEU}(U)|$ for $C>C^{\prime}$ is at most

$$Q\cdot e^{-2(C-C^{\prime})^{2}/|Y|}\left(C+\frac{1}{1-e^{-4(C-C^{\prime})/|Y|}}\right).$$

The proof of Lemma 4.3 is deferred to Appendix G. The time complexity of C-Truncated is $O(n^{2}r|Y|^{C+3})$.

However, in cases where $\sum_{v\in Y}p_{v}$ is large, we have to set $C$ to be larger than $\sum_{v\in Y}p_{v}$ for C-Truncated to obtain a high-quality solution; otherwise the error becomes unbounded. To mitigate this limitation, we also propose an alternative sampling approach, T-Sampling, to estimate $\mathsf{DefEU}(U)$ for general cases without restrictions on $\sum_{v\in Y}p_{v}$. Instead of enumerating all possible $V_{0}$ as EDPA does, T-Sampling draws $\mathsf{T}$ i.i.d. samples of the set of reported attackers, where each sample $V_{0}$ is drawn with probability $P_{V_{0}}$, and takes the average of the expected defender utility when $V_{0}$ is the set of reported attackers over all samples as the estimate of $\mathsf{DefEU}(U)$. We can sample $V_{0}$ as follows: (i) let $V_{0}=\emptyset$ initially; (ii) for each $v\in V$, add $v$ to $V_{0}$ with probability $\tilde{w}_{v}p_{v}$; (iii) return $V_{0}$ as a sample of the set of reported attackers. By Equation (2), this sampling process is consistent with the distribution of $V_{0}$. T-Sampling returns an estimate of $\mathsf{DefEU}(U)$ in $O(\mathsf{T}n^{2}r|Y|^{3})$ time.
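A sketch of the T-Sampling estimator described above follows; the helper defender_eu_given_V0, which evaluates the defender's optimal expected utility for a fixed set of reported attackers (and $\mathsf{DefEU}_{0}$ terms when $V_{0}=\emptyset$), is assumed rather than shown.

import random

def t_sampling(V, U, w, p, num_samples, defender_eu_given_V0):
    """Monte Carlo estimate of DefEU(U) via T-Sampling.

    V, U : connected attackers and recruited informants
    w    : information sharing intensities, w[(u, v)] = w_uv
    p    : attack probabilities, p[v] = p_v
    defender_eu_given_V0 : callable returning the defender's expected utility
        when V0 is the set of reported attackers
    """
    # Probability that each connected attacker is reported: w_tilde_v * p_v.
    report_prob = {}
    for v in V:
        keep = 1.0
        for u in U:
            keep *= 1.0 - w.get((u, v), 0.0)
        report_prob[v] = (1.0 - keep) * p[v]

    total = 0.0
    for _ in range(num_samples):
        V0 = {v for v in V if random.random() < report_prob[v]}
        total += defender_eu_given_V0(V0)
    return total / num_samples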

Proposition 4.4. Let $\mathsf{DefEU}^{(\mathsf{T})}(U)$ be the estimate of $\mathsf{DefEU}(U)$ given by T-Sampling using $\mathsf{T}$ samples. We have $\lim_{\mathsf{T}\rightarrow\infty}\mathsf{DefEU}^{(\mathsf{T})}(U)=\mathsf{DefEU}(U)$.

4.2.2. Selecting Informants $U$

Given the algorithms for computing $\mathsf{DefEU}(U)$, a straightforward way of selecting informants is through enumeration (denoted Select).

When using C-Truncated as a subroutine to compute $\mathsf{DefEU}(U)$, the solution quality of the selected set of informants is guaranteed by the following theorem.

Theorem 4.5. Assume that $\sum_{v\in Y}p_{v}\leq C^{\prime}$ and $|P_{i}^{d}|,|R_{i}^{d}|\leq Q$. Let $U_{\mathsf{OPT}}$ and $U^{\prime}$ be the optimal set of informants and the one chosen by C-Truncated, respectively. Then for $C>C^{\prime}$, the error $|\mathsf{DefEU}(U_{\mathsf{OPT}})-\mathsf{DefEU}(U^{\prime})|$ is bounded by

$$2Q\cdot e^{-2(C-C^{\prime})^{2}/|Y|}\left(C+\frac{1}{1-e^{-4(C-C^{\prime})/|Y|}}\right).$$
Proposition 4.6. Using T-Sampling to estimate $\mathsf{DefEU}$, the optimal set of informants can be found as $\mathsf{T}\rightarrow\infty$.

Algorithm 3 $\mathsf{Search}(U^{\prime})$
1: if $|U^{\prime}|=k$ then
2:     Update $\mathsf{OPT}$ with $(U^{\prime},\mathsf{DefEU}(U^{\prime}))$
3:     return
4: $u_{1}\leftarrow\arg\max_{u\in X}\mathsf{DefEU}(U^{\prime}\cup\{u\})$
5: $u_{2}\leftarrow\arg\max_{u\in X\setminus\{u_{1}\}}\mathsf{DefEU}(U^{\prime}\cup\{u\})$
6: $\mathsf{Search}(U^{\prime}\cup\{u_{1}\})$, $\mathsf{Search}(U^{\prime}\cup\{u_{2}\})$

Based on existing results in submodular optimization Nemhauser et al. (1978), one may expect a greedy algorithm that adds, step by step, the informant yielding the largest utility gain to work well. However, the set function $\mathsf{DefEU}(U)$ in our problem violates submodularity (see Appendix F), and such a greedy algorithm does not guarantee an approximation ratio of $1-1/e$. Therefore, we propose GSA (Greedy-based Search Algorithm) for the selection of informants, as shown in Algorithm 3. GSA starts by calling $\mathsf{Search}(\emptyset)$. While $|U^{\prime}|<k$, $\mathsf{Search}(U^{\prime})$ expands the current set of informants $U^{\prime}$ by recursing on $U^{\prime}\cup\{u_{1}\}$ and $U^{\prime}\cup\{u_{2}\}$, where $u_{1}$ and $u_{2}$ are the two informants that give the largest marginal gain in $\mathsf{DefEU}$ (lines 4-5); otherwise, it updates the optimal solution with $U^{\prime}$ (lines 1-3).
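The branching search of Algorithm 3 can be sketched as a short recursive Python function; this is only an illustration (def_eu stands for any of the $\mathsf{DefEU}$ estimators above, and the function names are ours).

def gsa(X, k, def_eu, U=frozenset(), best=(None, float('-inf'))):
    """Greedy-based Search Algorithm (Algorithm 3): branch on the two best additions.

    X      : candidate informants
    k      : number of informants to recruit
    def_eu : callable estimating DefEU(U), e.g. C-Truncated or T-Sampling
    Returns (best_U, best_value) over all leaves of the search tree.
    """
    if len(U) == k:
        val = def_eu(U)
        return (U, val) if val > best[1] else best
    # Rank the remaining candidates by estimated marginal gain; recurse on the top two.
    ranked = sorted((u for u in X if u not in U),
                    key=lambda u: def_eu(U | {u}), reverse=True)
    for u in ranked[:2]:
        best = gsa(X, k, def_eu, U | {u}, best)
    return best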

We identify a tractable case to conclude the section.

Lemma 4.7. Given the set of recruited informants $U$, the defender's expected utility $\mathsf{DefEU}(U)$ can be computed in polynomial time if $w_{uv}=1$ for all $(u,v)\in E$. When $k$ is a constant, the optimal set of informants can be computed in polynomial time.

This represents the case where the informants have strong connections with a particular group of attackers and can get full access to their attack plans. We refer to the property that $w_{uv}=1$ for all $(u,v)\in E$ as SISI (Strong Information Sharing Intensity), and denote by ASISI (Algorithm for SISI) the polynomial-time algorithm in Lemma 4.7. We provide more details about the SISI case in Appendices D and E.

We summarize the time complexity of all algorithms for computing the optimal $U$ in Table 1 in the Appendix.

5. Defending Against Level-$\infty$ Attackers

As discussed in Section 3.1, a level-$\kappa$ attacker may keep adapting to the new marginal strategy induced by his current level of behavior. In this section, we first show in Theorem 5.1 that there exists a fixed-point strategy for the attacker in our level-$\kappa$ response model, and then use it to define level-$\infty$ attackers.

We formulate the problem of finding the optimal defender’s strategy for this case as a mathematical program. However, such a program can be too large to solve. We propose a novel technique that reduces the program to a bi-level optimization problem, with both levels much more tractable.

Theorem 5.1. Let $\Delta_{n}=\{\mathbf{q}\mid\mathbf{q}\in[0,1]^{n},\mathbf{1}^{\mathsf{T}}\mathbf{q}\leq 1\}$. Given the defender's strategies $\mathbf{x}_{0}$ and $\mathbf{x}(\mathbf{V})$, there exists $\mathbf{q}^{*}\in\Delta_{n}$ such that $\mathbf{q}^{*}=\mathsf{QR}(\mathsf{MS}(\mathbf{x}_{0},\mathbf{x},\mathbf{q}^{*}))$.

Proof. Since $\Delta_{n}$ is a compact convex set and $\mathsf{QR}(\mathsf{MS}(\mathbf{x}_{0},\mathbf{x},\mathbf{q}))$ is a continuous function of $\mathbf{q}$, by Brouwer's fixed-point theorem, there exists $\mathbf{q}^{*}\in\Delta_{n}$ such that $\mathbf{q}^{*}=\mathsf{QR}(\mathsf{MS}(\mathbf{x}_{0},\mathbf{x},\mathbf{q}^{*}))$. ∎

According to the definition of level-$\kappa$ attackers, we have $\mathbf{q}^{\kappa+1}=\mathsf{QR}(\mathsf{MS}(\mathbf{x}_{0},\mathbf{x},\mathbf{q}^{\kappa}))$. Slightly generalizing the definition, we define a level-$\infty$ attacker as follows.

Definition 5.2 (level-$\infty$ attacker). Given the defender's strategies $\mathbf{x}_{0}$ and $\mathbf{x}(\mathbf{V})$, the strategy $\mathbf{q}$ of a level-$\infty$ attacker satisfies $\mathbf{q}=\mathsf{QR}(\mathsf{MS}(\mathbf{x}_{0},\mathbf{x},\mathbf{q}))$.

Remark 5.3. Note that Definition 5.2 is not obtained by taking the limit of the level-$\kappa$ definition, since such a limit may not even exist (see Example H in Appendix H).

Remark 5.4. Although the level-$\infty$ attacker is defined through a fixed-point argument, we still stick to the Stackelberg assumption: the defender leads and the attacker follows. Notice that in the equation $\mathbf{q}=\mathsf{QR}(\mathsf{MS}(\mathbf{x}_{0},\mathbf{x},\mathbf{q}))$, $\mathbf{q}$ is only defined after the defender commits to strategies $\mathbf{x}_{0}$ and $\mathbf{x}$. However, this differs from the standard Strong Stackelberg Equilibrium Korzhyk et al. (2011) in that the attacker follows a level-$\infty$ response model, as defined by the fixed-point equation.

Also, as we will discuss in Section 7.1.3 on our experiments, when $r=n$, the defender's optimal strategy is not to use up all the available resources. This is clearly different from a Nash equilibrium, as the defender still has incentives to use more resources.

5.1. Convergence Condition for the Level-$\kappa$ Response Model

We focus on the single-attacker case, where there are only $n$ different types of tips. We use $\mathbf{V}_{i}$ to denote the tip in which the attacker is reported to attack target $i$. When the attacker uses strategy $\mathbf{q}$, the probability of receiving $\mathbf{V}_{i}$ is $\Pr\{\mathbf{V}_{i}\}=wq_{i}$.

Theorem 5.5. Let $\bar{x}_{i}=\max_{j}\{x_{i}(\mathbf{V}_{j})\}$. In the single-attacker case, if there exists a constant $L\in[0,1)$ such that $\bar{x}_{i}\leq\frac{L}{n\lambda(R^{a}_{i}-P^{a}_{i})}$ for all $i$, then level-$\kappa$ agents converge to level-$\infty$ agents as $\kappa$ approaches infinity.

The proof of Theorem 5.5 is omitted since it is immediate from the following lemma:

Lemma 5.6. In the single-attacker case, if there exists a constant $L\in[0,1)$ such that $\bar{x}_{i}\leq\frac{L}{n\lambda(R^{a}_{i}-P^{a}_{i})}$ for all $i$, then $g(\mathbf{q}):=\mathsf{QR}(\mathsf{MS}(\mathbf{x}_{0},\mathbf{x},\mathbf{q}))$ is $L$-Lipschitz with respect to the $L^{1}$-norm, i.e., $g(\mathbf{q})$ is a contraction.

The proof of Lemma 5.6 is deferred to Appendix I.

Corollary 5.7. In the single-attacker case, if there exists a constant $L\in[0,1)$ such that $\frac{L}{n\lambda(R^{a}_{i}-P^{a}_{i})}>1$ for all $i$, then level-$\kappa$ agents converge to level-$\infty$ agents as $\kappa$ goes to infinity.
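Under the contraction condition above, the level-$\infty$ strategy can be approximated by simply iterating the level-$\kappa$ update. Below is a minimal sketch for the single-attacker case with attack probability 1 (as in our experiments); the marginal-strategy map follows Section 5.2, and all names are our own illustration.

import numpy as np

def quantal_response(x, R_a, P_a, lam):
    u = x * P_a + (1 - x) * R_a
    z = np.exp(lam * (u - u.max()))
    return z / z.sum()

def level_infinity_q(x0, xV, w, R_a, P_a, lam=1.0, iters=1000, tol=1e-10):
    """Fixed point q = QR(MS(x0, x, q)) for one attacker who attacks with probability 1.

    x0 : routine patrol, shape (n,)
    xV : xV[j] is the patrol used when the attacker is reported at target j, shape (n, n)
    w  : probability that the (single) attack is reported
    """
    n = len(x0)
    q = np.full(n, 1.0 / n)
    for _ in range(iters):
        x_hat = (1 - w) * x0 + w * (q @ xV)   # marginal coverage MS(x0, x, q)
        q_new = quantal_response(x_hat, R_a, P_a, lam)
        if np.abs(q_new - q).sum() < tol:     # L1 distance, matching Lemma 5.6
            return q_new
        q = q_new
    return q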

5.2. A Bi-Level Optimization for Solving the Optimal Defender’s Strategy

In this section, we still consider the single-attacker case and assume the defender has $r\geq 1$ resources. Clearly, the optimal set of informants should contain the ones with the highest information sharing intensities. It remains to compute the optimal strategies $\mathbf{x}_{0}$ and $\mathbf{x}(\mathbf{V})$. Given the optimal set of informants $U^{*}$, the probability of receiving a tip is $w=1-\prod_{u\in U^{*}}(1-w_{u1})$. Let $\Pr\{\mathbf{V}\}$ be the probability of receiving tips $\mathbf{V}$, which depends on $\mathbf{q}$, and let $\mathbf{x}(\mathbf{V})=(x_{1}(\mathbf{V}),\ldots,x_{n}(\mathbf{V}))$ be the defender strategy when receiving tips $\mathbf{V}$.

Let $\mathbf{q}=(q_{1},\ldots,q_{n})$ be the strategy of the level-$\infty$ attacker. Given $\mathbf{V}$ and the corresponding $t_{i}$'s, the expected number of attackers that are going to attack target $i$ is $d_{i}=t_{i}+(1-\sum_{j}t_{j})\tilde{p}_{v}(\emptyset)q_{i}$. Therefore, the defender's expected utility $\mathsf{DefEU}(\mathbf{x}_{0},\mathbf{x})$ is

$$\mathsf{DefEU}(\mathbf{x}_{0},\mathbf{x})=\sum_{\mathbf{V},i}\Pr\{\mathbf{V}\}\,d_{i}\left[P_{i}^{d}+x_{i}(\mathbf{V})\left(R_{i}^{d}-P_{i}^{d}\right)\right].$$

Then the problem of finding the optimal defender strategy can be formulated as the following mathematical program:

$$\max\ \mathsf{DefEU}(\mathbf{x}_{0},\mathbf{x})\qquad\text{s.t.}\quad\mathbf{q}=\mathsf{QR}(\mathsf{MS}(\mathbf{x}_{0},\mathbf{x},\mathbf{q})).$$

In the single-attacker case, we need $n$ and $n^{2}$ variables to represent $\mathbf{x}_{0}$ and $\mathbf{x}$, respectively. We can use the QRI-MILP algorithm (an algorithm that computes an approximate optimal defender strategy against a variant of level-0 attackers who take into account the impact of informants when determining the target they attack; see Appendix J for more details) to find the solution. However, this approach needs to solve a mixed integer program and does not scale well.

To tackle the problem, we focus on the defender’s marginal strategy instead of the full strategy representation, and decompose the above program into a bi-level optimization problem.

Let $\hat{\mathbf{x}}=\mathsf{MS}(\mathbf{x}_{0},\mathbf{x},\mathbf{q})=\sum_{\mathbf{V}}\Pr\{\mathbf{V}\}\,\mathbf{x}(\mathbf{V})$, where we slightly abuse notation and use $\mathbf{V}=\emptyset$ to denote the case of receiving no tip and $\mathbf{x}(\emptyset)$ to denote $\mathbf{x}_{0}$. The bi-level optimization method works as follows. At the inner level, we fix an arbitrary feasible $\hat{\mathbf{x}}$ and solve the following mathematical program:

$$\begin{aligned}\max\quad&\mathsf{DefEU}(\hat{\mathbf{x}})\\ \text{s.t.}\quad&\sum_{\mathbf{V}}\Pr\{\mathbf{V}\}\,\mathbf{x}(\mathbf{V})=\hat{\mathbf{x}},\quad\mathbf{q}=\mathsf{QR}(\hat{\mathbf{x}})\\ &\mathbf{1}^{\mathsf{T}}\mathbf{x}(\mathbf{V})\leq r,\ \mathbf{x}(\mathbf{V})\in[0,1]^{n},\ \forall\mathbf{V}\end{aligned}$$

Since $\hat{\mathbf{x}}$ is fixed, $\mathbf{q}$ and $\Pr\{\mathbf{V}\}$ are also fixed. Thus, the program above becomes a linear program with the $\mathbf{x}(\mathbf{V})$ as variables. We can always find a feasible solution by simply setting $\mathbf{x}(\mathbf{V})=\hat{\mathbf{x}}$ for all $\mathbf{V}$. Solving this linear program gives the optimal defender utility $\mathsf{DefEU}(\hat{\mathbf{x}})$ for any possible $\hat{\mathbf{x}}$. To find the optimal defender strategy, we solve the outer-level optimization problem below:

$$\max\ \mathsf{DefEU}(\hat{\mathbf{x}})\qquad\text{s.t.}\quad\hat{\mathbf{x}}\text{ is feasible.}$$

Since the feasible region of $\hat{\mathbf{x}}$ is continuous, we can use any known algorithm (e.g., gradient descent) to solve the outer-level program. The inner-level linear program still suffers from a scalability problem. However, when there are multiple attackers, the optimal objective value can be well-approximated by sampling a subset of possible $\mathbf{V}$'s, or by focusing only on the $\mathbf{V}$'s with the highest probabilities. For those $\mathbf{V}$'s that are not considered, we can always use $\mathbf{x}_{0}$ as the default strategy for $\mathbf{x}(\mathbf{V})$.
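To illustrate the decomposition, the following is a schematic sketch for the single-attacker case with attack probability 1 (as in our experiments). It uses scipy's linear programming solver for the inner level and a crude random search in place of the gradient-based outer methods mentioned above; all names are our own illustration, not the deployed implementation.

import numpy as np
from scipy.optimize import linprog

def quantal_response(x, R_a, P_a, lam):
    u = x * P_a + (1 - x) * R_a
    z = np.exp(lam * (u - u.max()))
    return z / z.sum()

def inner_lp(x_hat, w, r, R_a, P_a, R_d, P_d, lam):
    """Inner level: given the marginal x_hat, maximize DefEU over x(empty), x(V_1..V_n)."""
    n = len(x_hat)
    q = quantal_response(x_hat, R_a, P_a, lam)
    pr = np.concatenate(([1 - w], w * q))        # Pr{empty}, Pr{V_1}, ..., Pr{V_n}
    num = (n + 1) * n                            # variable k*n + i is x_i of tip type k

    c = np.zeros(num)                            # linprog minimizes, so negate the gains
    c[:n] = -pr[0] * q * (R_d - P_d)             # routine patrol against expected attacks q_i
    for j in range(n):
        c[(j + 1) * n + j] = -pr[j + 1] * (R_d[j] - P_d[j])

    A_eq = np.zeros((n, num))                    # marginal consistency: MS(x0, x, q) = x_hat
    for i in range(n):
        for k in range(n + 1):
            A_eq[i, k * n + i] = pr[k]
    A_ub = np.zeros((n + 1, num))                # at most r resources for every tip type
    for k in range(n + 1):
        A_ub[k, k * n:(k + 1) * n] = 1.0

    res = linprog(c, A_ub=A_ub, b_ub=np.full(n + 1, r), A_eq=A_eq, b_eq=x_hat,
                  bounds=[(0, 1)] * num)
    if not res.success:
        return -np.inf, None
    const = pr[0] * (q @ P_d) + pr[1:] @ P_d     # utility terms independent of the patrols
    return const - res.fun, res.x.reshape(n + 1, n)

def outer_search(w, r, R_a, P_a, R_d, P_d, lam=1.0, trials=200, seed=0):
    """Outer level: crude random search over feasible marginals x_hat (illustration only)."""
    rng = np.random.default_rng(seed)
    n = len(R_a)
    best = (-np.inf, None)
    for _ in range(trials):
        x_hat = rng.uniform(0, 1, n)
        if x_hat.sum() > r:
            x_hat *= r / x_hat.sum()
        val, plan = inner_lp(x_hat, w, r, R_a, P_a, R_d, P_d, lam)
        if val > best[0]:
            best = (val, plan)
    return best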

6. Defending Against Informant-Aware Attackers

We now consider a variant of our model where attackers take into account the impact of informants when determining the target they attack. Specifically, we assume the attackers follow the QR behavioral model but incorporate the probability of being discovered when determining their expected utility for attacking a target. (Consider attackers that have had experience playing against the defender. Over time, the attacker might start to consider their expected utility in practice, which is affected by informants.) In this setting, the attackers' subjective belief $\mathbf{x}^{\prime}$ about the target coverage probabilities does not necessarily satisfy $\sum_{i}x^{\prime}_{i}\leq r$. Consider the example of a single attacker and a single informant with report intensity 1. Assume that the defender has $r=1$ and always protects the reported target with probability 1. Then no matter which target the attacker chooses to attack, it will always be covered.

We focus on the single-attacker case with $r\geq 1$. We first consider the problem of computing the optimal defender strategy given the set of informants $U$ and the associated probability of receiving a tip, $w=1-\prod_{u\in U}(1-w_{u1})$. In the general case with multiple attackers, we would need to specify the defender strategy for each combination of tips received. However, when there is only one attacker, we can succinctly describe the defender strategy by her default strategy without tips, $\mathbf{x}$, and her probability of defending a location after receiving a tip for that location, $\mathbf{z}$. Then, under the QR adversary model, the probability $q_{i}$ of the attacker targeting location $i$ is

$$q_{i}=\frac{e^{\lambda\left\{\left[(1-w)x_{i}+wz_{i}\right]P^{a}_{i}+\left[1-(1-w)x_{i}-wz_{i}\right]R^{a}_{i}\right\}}}{\sum_{j\in\mathcal{T}}e^{\lambda\left\{\left[(1-w)x_{j}+wz_{j}\right]P^{a}_{j}+\left[1-(1-w)x_{j}-wz_{j}\right]R^{a}_{j}\right\}}}.$$

This leads to the following optimization problem, QRI, to compute the optimal defender strategy:

$$\begin{aligned}\max_{\mathbf{x},\mathbf{z},\mathbf{y}}\quad&\frac{\sum_{i\in\mathcal{T}}e^{\lambda R^{a}_{i}}e^{-\lambda(R^{a}_{i}-P^{a}_{i})y_{i}}\left[(R^{d}_{i}-P^{d}_{i})y_{i}+P^{d}_{i}\right]}{\sum_{i\in\mathcal{T}}e^{\lambda R^{a}_{i}}e^{-\lambda(R^{a}_{i}-P^{a}_{i})y_{i}}}\\ \text{subject to}\quad&y_{i}=(1-w)x_{i}+wz_{i},\quad\forall i\in\mathcal{T}\qquad(3)\\ &\textstyle\sum_{i\in\mathcal{T}}x_{i}\leq r\qquad(4)\\ &0\leq x_{i},z_{i}\leq 1,\quad\forall i\in\mathcal{T}\qquad(5)\end{aligned}$$

We can compute the optimal defender strategy by adapting the approach used in the PASAQ algorithm Yang et al. (2012). The description of the algorithm is deferred to Appendix J.
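For concreteness, here is a small sketch that evaluates the QRI objective for a candidate pair $(\mathbf{x},\mathbf{z})$; it is only an evaluator under the formulas above, not the PASAQ-style solver described in Appendix J, and the example payoffs are hypothetical.

import numpy as np

def qri_objective(x, z, w, R_a, P_a, R_d, P_d, lam=1.0):
    """Defender's expected utility against an informant-aware QR attacker.

    x : default coverage without tips; z : coverage of a tipped target; both shape (n,)
    w : probability that the single attack generates a tip
    """
    y = (1 - w) * x + w * z                      # effective coverage seen by the attacker
    u_att = y * P_a + (1 - y) * R_a
    weights = np.exp(lam * (u_att - u_att.max()))
    q = weights / weights.sum()                  # informant-aware QR attack distribution
    return q @ (y * (R_d - P_d) + P_d)           # expected defender utility

# Hypothetical 3-target example with one routine resource and a near-perfect informant.
x = np.array([0.5, 0.3, 0.2])
z = np.ones(3)
print(qri_objective(x, z, w=0.8,
                    R_a=np.array([5., 3., 8.]), P_a=np.array([-2., -1., -4.]),
                    R_d=np.array([4., 2., 6.]), P_d=np.array([-3., -2., -5.])))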

7. Experiment

In this section, we demonstrate the effectiveness of our proposed algorithms through extensive experiments. In our experiments, all reported results are averaged over 30 randomly generated game instances. See Appendix L for details about generating game instances and parameters. Unless specified otherwise, all game instances are generated in this way.

7.1. Experimental Results

We compare the scalability and solution quality of Select (using EDPA, C-Truncated, or T-Sampling to obtain $\mathsf{DefEU}$) and GSA across different problem settings against level-0 attackers.

First, we test the case where $\sum_{v\in Y}p_{v}<3$. We set $|X|=6,k=4,n=8,r=3$ and vary $|Y|$ from 2 to 16. The results are shown in Figure 1(a). We also include Greedy as a baseline that always chooses the informants that maximize the probability of receiving tips. We can see that T-Sampling performs the best in terms of runtime but fails to provide high-quality solutions. While C-Truncated is slower than T-Sampling, it performs the best, with no error on all test cases. However, when there is no restriction on $\sum_{v\in Y}p_{v}$, as shown in Figure 1(b), C-Truncated performs badly, even worse than Greedy for large $|Y|$, while T-Sampling performs much better and GSA performs the best. We also fix $|X|=7$, $|Y|=10,k=3,r=5$ and vary the number of targets $n$ from 5 to 25 with $\sum_{v\in Y}p_{v}<3$. The results are shown in Figure 1(c). GSA is the fastest but provides slightly worse solutions than C-Truncated. The runtime of Greedy is less than 0.3s for all instances tested.

We then perform a case study to show the trade-off between the optimal number of resources to allocate and the optimal number of informants to recruit under budget constraints when defending against level-0 attackers. We set $|X|=|Y|=n=6$ and generate an instance of the game. We set the cost of allocating one defensive resource to $C_{r}=3$ and the cost of hiring one informant to $C_{i}=1$. Given a budget $B$, the defender can recruit $k$ informants and allocate $r$ resources whenever $k\cdot C_{i}+r\cdot C_{r}\leq B$. The trade-off between the optimal $k$ and $r$ is shown in Figure 1(d). In the same instance, we study how the defender's utility changes as the number of recruited informants increases with fixed $r$. Given a fixed number of resources, the defender should recruit as many informants as possible. We can also see that, assuming the defender can acquire sufficient resources, the importance of recruiting additional informants diminishes. This result provides useful guidance to defenders such as conservation agencies in allocating their budget and recruiting informants.

We run additional experiments for the SISI case and present a case study showing the estimation errors for all $U\subseteq X$ on 2 instances. The results are in Appendix K.

Figure 1. Experimental Results. (a) Runtime and solution quality with increasing $|Y|$, $\sum_{v\in Y}p_{v}<3$. (b) Runtime and solution quality with increasing $|Y|$ for general cases. (c) Runtime and solution quality with increasing $n$, $\sum_{v\in Y}p_{v}<3$. (d) Trade-off between $r$ and $k$, and increase of utility with fixed $r$ ($|X|=6$, $|Y|=6$, $n=6$). (e) Comparison between the defender utility against level-0 and level-$\infty$ attackers; "L-$\infty$/0 def" means that the defender uses the optimal strategy against a level-$\infty$/0 attacker. (f) Comparison between the bi-level optimization algorithm and QRI-MILP. (g) Comparison between defender utility against level-0 and informant-aware attackers.

7.1.1. Level-0 vs. Level-$\infty$ attackers

We set $|X|=n=6$, $|Y|=p_{1}=1$, and $G_{S}$ to be fully connected. We set $r=2,4,6$ and vary $k$ from 1 to 6. We first fix the defender's strategy to the one optimized against level-0 attackers and compare the utility achieved by the defender when facing a level-0 attacker and a level-$\infty$ attacker. We show how the defender utility varies with the number of informants and defensive resources in Figure 1(e). On average, we see that the defender utility against a level-$\infty$ attacker is lower than that against a level-0 attacker. We also show the utility of the defender using her optimal strategy against a level-$\infty$ attacker. We can see that when facing a level-$\infty$ attacker, the defender utility when using the optimal strategy is higher by a margin than when using the strategy optimized against level-0 attackers.

7.1.2. Level-0 vs. informant-aware defenders

We set $|X|=n=6$, $|Y|=p_{1}=1$, and $G_{S}$ to be fully connected. We vary $r$ from 1 to 6 and $k$ from 0 to 6. We assume that the defender recruits the $k$ informants with the highest information sharing intensity $w_{u1}$. The optimal defender strategy against the informant-aware attacker is found using QRI-MILP. The defender strategy against the level-0 attacker is computed using PASAQ Yang et al. (2012). The defender utility against the level-0 attacker is found by first computing $q_{i}$ and $\mathsf{DefEU}_{0}$ and then using the results to compute $\mathsf{DefEU}(U)$.

In Figure 1(g), we show how the defender utility in the two cases varies with the number of informants and defensive resources. On average, we see that the defender utility is marginally higher against the level-0 attacker than against the informant-aware attacker, particularly when the defender has either very few or very many defensive resources. We also compare the utility of the level-0 defender (defending against level-0 attackers) and the informant-aware defender (defending against informant-aware attackers); the results are deferred to Appendix K.

7.1.3. Comparison between the Bi-Level Algorithm and QRI-MILP

We empirically compare the bi-level optimization algorithm with QRI-MILP. We set $|X|=n=6$, $|Y|=p_{1}=1$, and $G_{S}$ to be fully connected. We vary $r$ from 1 to 6 and $k$ from 0 to 6.

In both cases, we assume that the defender recruits the $k$ informants with the highest information sharing intensity $w_{u1}$. The results are shown in Figure 1(f). In general, our bi-level algorithm gives higher expected defender utilities than the QRI-MILP algorithm, except when $r=1$. Our results show that both increasing the number of resources and hiring more informants increase the defender's utility. However, as the number of resources ($r$) increases, the utility gain from hiring more informants diminishes.

Intuitively, if the number of resources equals the number of targets, the defender should always cover all the targets. Interestingly, during our experiments, we observed that in this case the optimal defender strategy may not always use all her resources to cover all the targets. The reason is that in a general-sum game, by deliberately decreasing the probability of protecting a certain target, the defender can lure the attacker into attacking that target more frequently, and thus increase her expected utility. Such strategies can be found in real-world wildlife protection, where patrollers may sometimes deliberately ignore tips. This is also reflected in our bi-level algorithm. If the defender always uses all her resources, then both the defender's and the attacker's strategies are fixed, and hiring more informants does not increase the defender's expected utility. But if the defender does not always use all her resources, then hiring more informants can help (see the bi-level algorithm for the $r=6$ case in Figure 1(f)).

8. Discussion and Conclusion

In this paper, we introduced a novel two-stage security game model and a multi-level QR behavioral model that incorporate community engagement. We provided complexity results, developed algorithms to find (sub-)optimal groups of informants to recruit against level-0 attackers, and evaluated the algorithms through extensive experiments. Our results also generalize to the case where informants have heterogeneous recruitment costs and to different kinds of attacker response models, such as the SUQR model Nguyen et al. (2013), by calculating the attacker's response accordingly. See Appendix M for how to extend our algorithms to defend against level-$\kappa$ ($\kappa<\infty$) attackers. In Section 5, we defined a more powerful type of attacker that responds to the marginal strategy and developed a bi-level optimization algorithm to find the optimal defender strategy in this case.

In the anti-poaching domain, some conservation site managers run so-called "intelligence" operations that rely on informants in nearby villages to alert rangers when they learn of poachers' plans in advance. The deployment of this work relies on the site manager to provide their understanding of the social connections among community members. The edges and parameters of the bipartite graph in our model can be extracted from a local social media application or from historical data collected by site managers. Recruiting and training reliable informants is costly, and managers may only be able to afford a limited number of them. Our model and solution can help managers efficiently recruit informants, make the best use of tips, and evaluate the trade-off between allocating budget to hiring rangers and to recruiting informants in a timely fashion.

For future work, instead of using a particular behavior model, we can use historical records as training data and learn the attackers' behavior in different domains. It would also be interesting to consider the case where the informants can only provide inaccurate tips or other types of tips, e.g., that some subset of targets will be attacked instead of a single location. We can also model the informants as strategic agents: in real life, informants may provide fake information if they have their own utility structures, and we could reward them to elicit true information and maximize the defender's utility.

Acknowledgement

This work is supported in part by NSF grant IIS-1850477 and a research grant from Lockheed Martin.

References

  • Basilico et al. (2017) Nicola Basilico, Andrea Celli, Giuseppe De Nittis, and Nicola Gatti. 2017. Coordinating multiple defensive resources in patrolling games with alarm systems. In AAMAS’17. 678–686.
  • Duffy et al. (2015) Rosaleen Duffy, Freya AV St John, Bram Büscher, and DAN Brockington. 2015. The militarization of anti-poaching: undermining long term goals? Environmental Conservation 42, 4 (2015), 345–348.
  • Fang et al. (2017) Fei Fang, Thanh Hong Nguyen, Rob Pickles, Wai Y. Lam, Gopalasamy R. Clements, Bo An, Amandeep Singh, Brian C. Schwedock, Milind Tambe, and Andrew Lemieux. 2017. PAWS - A Deployed Game-Theoretic Application to Combat Poaching. AI Magazine (2017). http://www.aaai.org/ojs/index.php/aimagazine/article/view/2710
  • Gill et al. (2014) Charlotte Gill, David Weisburd, Cody W Telep, and Trevor Bennett. 2014. Community-oriented policing to reduce crime, disorder and fear and increase satisfaction and legitimacy among citizens: A systematic review. Journal of Experimental Criminology (2014).
  • Guo et al. (2017) Qingyu Guo, Boyuan An, Branislav Bosansky, and Christopher Kiekintveld. 2017. Comparing strategic secrecy and Stackelberg commitment in security games. In IJCAI-17.
  • Jain et al. (2010) Manish Jain, Jason Tsai, James Pita, Christopher Kiekintveld, Shyamsunder Rathi, Milind Tambe, and Fernando Ordónez. 2010. Software assistants for randomized patrol planning for the lax airport police and the federal air marshal service. Interfaces (2010).
  • Korzhyk et al. (2011) Dmytro Korzhyk, Zhengyu Yin, Christopher Kiekintveld, Vincent Conitzer, and Milind Tambe. 2011. Stackelberg vs. Nash in security games: An extended investigation of interchangeability, equivalence, and uniqueness. Journal of Artificial Intelligence Research 41 (2011), 297–327.
  • Le Gallic and Cox (2006) Bertrand Le Gallic and Anthony Cox. 2006. An economic analysis of illegal, unreported and unregulated (IUU) fishing: Key drivers and possible solutions. Marine Policy 30, 6 (2006), 689–695.
  • Leader-Williams and Milner-Gulland (1993) N Leader-Williams and EJ Milner-Gulland. 1993. Policies for the enforcement of wildlife laws: the balance between detection and penalties in Luangwa Valley, Zambia. Conservation Biology 7, 3 (1993), 611–617.
  • Linkie et al. (2015) Matthew Linkie, Deborah J. Martyr, Abishek Harihar, Dian Risdianto, Rudijanta T. Nugraha, Maryati, Nigel Leader‐Williams, and Wai‐Ming Wong. 2015. Safeguarding Sumatran tigers: evaluating effectiveness of law enforcement patrols and local informant networks. Journal of Applied Ecology (2015).
  • Ma et al. (2018) Xiaobo Ma, Yihui He, Xiapu Luo, Jianfeng Li, Mengchen Zhao, Bo An, and Xiaohong Guan. 2018. Camera Placement Based on Vehicle Traffic for Better City Security Surveillance. IEEE Intelligent Systems 33, 4 (Jul 2018), 49–61. https://doi.org/10.1109/mis.2018.223110904
  • McKelvey and Palfrey (1995) Richard D McKelvey and Thomas R Palfrey. 1995. Quantal response equilibria for normal form games. Games and economic behavior (1995).
  • Moreto (2015) William D Moreto. 2015. Introducing intelligence-led conservation: bridging crime and conservation science. Crime Science 4, 1 (2015), 15.
  • Nemhauser et al. (1978) George L Nemhauser, Laurence A Wolsey, and Marshall L Fisher. 1978. An analysis of approximations for maximizing submodular set functions—I. Mathematical programming 14, 1 (1978), 265–294.
  • Nguyen et al. (2013) Thanh Hong Nguyen, Rong Yang, Amos Azaria, Sarit Kraus, and Milind Tambe. 2013. Analyzing the Effectiveness of Adversary Modeling in Security Games.. In AAAI.
  • Pita et al. (2008) James Pita, Manish Jain, Janusz Marecki, Fernando Ordóñez, Christopher Portway, Milind Tambe, Craig Western, Praveen Paruchuri, and Sarit Kraus. 2008. Deployed ARMOR protection: the application of a game theoretic model for security at the Los Angeles International Airport. In AAMAS: industrial track.
  • Rosenfeld and Kraus (2017) Ariel Rosenfeld and Sarit Kraus. 2017. When Security Games Hit Traffic: Optimal Traffic Enforcement Under One Sided Uncertainty.. In IJCAI. 3814–3822.
  • Schlenker et al. (2018) Aaron Schlenker, Omkar Thakoor, Haifeng Xu, Fei Fang, Milind Tambe, Long Tran-Thanh, Phebe Vayanos, and Yevgeniy Vorobeychik. 2018. Deceiving Cyber Adversaries: A Game Theoretic Approach. In AAMAS.
  • Short et al. (2013) Martin B Short, Ashley B Pitcher, and Maria R D’Orsogna. 2013. External conversions of player strategy in an evolutionary game: A cost-benefit analysis through optimal control. European Journal of Applied Mathematics 24, 1 (2013), 131–159.
  • Smith and Humphreys (2015) MLR Smith and Jasper Humphreys. 2015. The Poaching Paradox: Why South Africa’s ‘Rhino Wars’ Shine a Harsh Spotlight on Security and Conservation. Ashgate Publishing Company.
  • Tambe (2011) Milind Tambe. 2011. Security and game theory: algorithms, deployed systems, lessons learned. Cambridge University Press.
  • Tublitz and Lawrence (2014) Rebecca Tublitz and Sarah Lawrence. 2014. The Fitness Improvement Training Zone Program. (2014).
  • Wang et al. (2018) Xinrun Wang, Bo An, Martin Strobel, and Fookwai Kong. 2018. Catching Captain Jack: Efficient Time and Space Dependent Patrols to Combat Oil-Siphoning in International Waters. (2018). https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16312
  • Wright and Leyton-Brown (2014) James R Wright and Kevin Leyton-Brown. 2014. Level-0 meta-models for predicting human behavior in games. In Proceedings of the fifteenth ACM conference on Economics and computation. ACM, 857–874.
  • WWF (2015) WWF. 2015. Developing an approach to community-based crime prevention. http://zeropoaching.org/pdfs/Community-based-crime%20prevention-strategies.pdf. (2015).
  • Yang et al. (2012) Rong Yang, Fernando Ordonez, and Milind Tambe. 2012. Computing optimal strategy against quantal response in security games. In AAMAS.

Appendix:
Green Security Game with Community Engagement

Appendix A Complexity of different algorithms

Algorithm      Time Complexity
EDPA           $O(|X|^{k}2^{|Y|}n^{2}r|Y|^{3})$
C-Truncated    $O(|X|^{k}n^{2}r|Y|^{C+3})$
T-Sampling     $O(|X|^{k}\mathsf{T}n^{2}r|Y|^{3})$
ASISI          $O(|X|^{k}n^{2}r|Y|^{4})$
GSA            $O(2^{k}|X|n^{2}r|Y|^{3})$

Table 1. Time complexity of the different algorithms.

Appendix B Proof of Theorem 4.1

See 4.1

Proof.

Given the tips $\mathbf{V}$, the defender should calculate $\mathsf{EG}_{i}(|V_{i}|,V_{0})$ for each target $i\in T$, and then allocate the resources to the $r$ targets with the highest $\mathsf{EG}_{i}$.

The above strategy is optimal: the expected utility with no resources is $\sum_{i\in T}\mathsf{EU}^{u}_{i}(|V_{i}|,V_{0})$, and each additional unit of resource should always be allocated to the uncovered target that yields the largest increment in expected utility, i.e., the target with the largest $\mathsf{EG}_{i}(|V_{i}|,V_{0})$.

The calculation of $\mathsf{EG}_{i}(|V_{i}|,V_{0})$ for each $i\in T$ can be done in $O(n+|Y|)$ time, and finding the $r$ largest $\mathsf{EG}_{i}(|V_{i}|,V_{0})$ can be done in $O(n)$ time, leading to an overall complexity of $O(|Y|+n)$. ∎
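For illustration, the tip-response allocation used in this proof can be sketched in Python as follows, assuming the expected gains $\mathsf{EG}_{i}(|V_{i}|,V_{0})$ have already been computed; the function name is illustrative.

def allocate_resources(expected_gains, r):
    """expected_gains: list of EG_i values, one per target; r: number of resources.
    Returns the indices of the r targets to cover."""
    # Rank targets by expected gain (descending) and cover the top r.
    ranked = sorted(range(len(expected_gains)), key=lambda i: expected_gains[i], reverse=True)
    return set(ranked[:r])

# Example: allocate_resources([0.3, 1.2, 0.7, 0.1], 2) -> {1, 2}

A sort is used here for brevity; a linear-time selection of the $r$ largest values gives the $O(n)$ bound stated in the proof.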

Appendix C Proof of Theorem 4.1

See 4.1

Proof.

Consider the case where $r=1$, $p_{v}=w_{uv}=1$ for all $u,v$, and the targets are uniform, i.e., the $R_{i}^{d}$'s (and likewise $R_{i}^{a},P_{i}^{d},P_{i}^{a}$) are the same for all $i\in T$. We write $R^{d}$ ($P^{d}$) instead of $R_{i}^{d}$ ($P_{i}^{d}$) for simplicity. Let $\lambda=0$.

To start with, we investigate how $\mathsf{DefEU}(U)$ depends on a given $U$. Since $p_{v}=1$ and $w_{uv}=1$ for all $u\in X,v\in Y$, every attacker in $V$ is reported. Let the random variable $X_{i}=|V_{i}|$ be the number of attackers reported to attack location $i$. Since the targets are uniform, an attacker attacks each location with probability $q_{i}=\frac{1}{n}$ if he attacks at all. The defender's expected utility $\mathsf{DefEU}(U)$ can then be written as

\begin{align*}
\mathsf{DefEU}(U) =\ & \left(\mathbb{E}\left[\max_{i\in T}X_{i}\right]+\frac{|Y|-|V|}{n}\right)R^{d} +\left(|V|-\mathbb{E}\left[\max_{i\in T}X_{i}\right]+\frac{n-1}{n}(|Y|-|V|)\right)P^{d} \\
=\ & \left(\mathbb{E}\left[\max_{i\in T}X_{i}\right]-\frac{|V|}{n}\right)\left(R^{d}-P^{d}\right)+\frac{|Y|}{n}R^{d}+\frac{(n-1)|Y|}{n}P^{d}.
\end{align*}

The latter two terms are independent of the choice of informants, so to maximize $\mathsf{DefEU}$ it suffices to maximize $\mathbb{E}\left[\max_{i\in T}X_{i}\right]-\frac{|V|}{n}$.

We can prove by induction on $|V|$ that $\mathbb{E}\left[\max_{i\in T}X_{i}\right]-\frac{|V|}{n}$ increases as $|V|$ increases, i.e., that $\mathbb{E}\left[\max_{i\in T}X_{i}\right]$ increases by at least $\frac{1}{n}$ when $|V|$ is increased by 1:

(1) Since $\mathbb{E}\left[\max_{i\in T}X_{i}\right]=1$ when $|V|=1$ and $\mathbb{E}\left[\max_{i\in T}X_{i}\right]=1+\frac{1}{n}$ when $|V|=2$, the claim holds for $|V|=1$.

(2) Consider $|V|\geq 1$ and the corresponding variables $\{X_{i}\}_{i=1}^{n}$, and let $X_{m}=\max_{1\leq i\leq n}\{X_{i}\}$. Add one attacker to $V$ and denote by $p$ the probability that he targets a location attaining the maximum $X_{m}$; the expected maximum then increases by $p$. Since $p\geq\frac{1}{n}$, a simple coupling argument shows that $\mathbb{E}\left[\max_{i\in T}X_{i}\right]$ increases by at least $\frac{1}{n}$.

Thus, in this case, solving the original problem optimally is equivalent to finding a $U$ that maximizes the size of $V$ in the first stage.

We show that this optimization problem is NP-hard via a reduction from the maximum coverage problem ($\mathsf{MCP}$): given a number $k$ and a collection of sets $S$, find a subset $S^{\prime}\subseteq S$ with $|S^{\prime}|\leq k$ that maximizes the number of covered elements $\left|\bigcup_{S_{i}\in S^{\prime}}S_{i}\right|$. Let $X=\{x_{1},\ldots,x_{|S|}\}$, $Y=\bigcup_{S_{i}\in S}S_{i}$, $E=\{(x_{i},y):i\in[|S|]\land y\in S_{i}\}$, $p_{v}=1$ for all $v\in Y$ and $w_{e}=1$ for all $e\in E$. Then finding $U\subseteq X$ with $|U|\leq k$ that maximizes the size of $V$ is equivalent to finding a subset of at most $k$ sets that maximizes the number of covered elements in the $\mathsf{MCP}$ instance. ∎
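To make the construction concrete, here is a small Python sketch of the reduction; the helper name mcp_to_graph and the representation of the input sets are illustrative.

def mcp_to_graph(sets):
    """sets: list of Python sets of elements (the MCP instance).
    Returns (X, Y, E) of the bipartite graph used in the reduction."""
    X = list(range(len(sets)))                              # one informant node per set
    Y = sorted(set().union(*sets))                          # one attacker node per element
    E = [(i, y) for i, s in enumerate(sets) for y in s]     # edge iff element belongs to set
    return X, Y, E   # with p_v = 1 for all v in Y and w_e = 1 for all e in E

# Selecting U ⊆ X with |U| ≤ k to maximize |V| (the attacker nodes observed by U)
# is then exactly the maximum coverage problem on `sets`.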

Appendix D Proof of Lemma 4.2.2

See 4.2.2

Proof.

Since $w_{uv}=1$ for all $u,v$, we have $\tilde{p}_{v}(V_{0})=0$ for each $v\in V\setminus V_{0}$ given $V_{0}$. Therefore, the expected gain of target $j$ with $\tilde{y}$ reported attacks can be written as $\mathsf{EG}_{j}=(\tilde{y}+q_{j}\sum_{v\in Y\setminus V}p_{v})(R_{j}^{d}-P_{j}^{d})$, and the calculation of $f(\cdot)$ depends only on $|V_{0}|$. Thus, instead of enumerating $V_{0}$, we enumerate $0\leq t_{0}\leq|V|$ as the size of $V_{0}$ in line 2 of Algorithm 1, and replace $P_{V_{0}}$ in Algorithm 1 with $P_{t_{0}}$, where $P_{t_{0}}=\Pr[|V_{0}|=t_{0}\,|\,U]$ can be obtained by expanding the polynomial $\prod_{v\in V}(1-p_{v}+p_{v}x)=\sum_{i=0}^{|V|}P_{i}x^{i}$. Therefore, $\mathsf{DefEU}(U)$ can be calculated in $O(n^{2}r|Y|^{4})$ time.

Since all possible $U$ can be enumerated in $O(|X|^{k})$ time, the optimal set of informants can be computed in $O(|X|^{k}n^{2}r|Y|^{4})$ time. ∎
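The polynomial expansion that yields $P_{t_{0}}$ can be computed by a standard convolution; a minimal Python sketch (with illustrative names) is given below.

def reported_count_distribution(p):
    """p: list of probabilities p_v for v in V. Returns [P_0, ..., P_{|V|}],
    the coefficients of prod_v (1 - p_v + p_v x)."""
    coeffs = [1.0]                       # the polynomial "1" before any factor
    for pv in p:
        new = [0.0] * (len(coeffs) + 1)
        for j, c in enumerate(coeffs):   # multiply the current polynomial by (1 - p_v + p_v x)
            new[j] += c * (1 - pv)
            new[j + 1] += c * pv
        coeffs = new
    return coeffs

# Example: reported_count_distribution([0.5, 0.5]) -> [0.25, 0.5, 0.25]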

Appendix E ASISI

In Algorithm 4 and Algorithm 5, we present the polynomial-time algorithm used to compute $\mathsf{DefEU}(U)$ when $w_{uv}=1$ for all $(u,v)\in E$.

Algorithm 4 Calculate $\mathsf{DefEU}(U)$
1: Expand the polynomial $\prod_{v\in V}(1-p_{v}+p_{v}x)=\sum_{j=0}^{|V|}P_{j}x^{j}$
2: $\mathsf{EU}\leftarrow 0$
3: for all possible numbers of reported attackers $0\leq t_{0}\leq|V|$ do
4:     if $t_{0}=0$ then
5:         $\mathsf{EU}\leftarrow\mathsf{EU}+P_{t_{0}}\sum_{v\in Y}p_{v}\mathsf{DefEU}_{0}$
6:         continue with the next $t_{0}$ in line 3
7:     for each target $i\in T$ and $0\leq t_{i}\leq t_{0}$ do \triangleright Enumerate target $i$ and the number of attackers $t_{i}$ targeting $i$
8:         Calculate $f(\cdot)$ given $t_{0},i,t_{i}$ (Algorithm 5)
9:         $\mathsf{EG}_{i}\leftarrow(t_{i}+q_{i}\sum_{v\in Y\setminus V}p_{v})(R_{i}^{d}-P_{i}^{d})$
10:        $\mathsf{EU}_{i}^{u}\leftarrow(t_{i}+q_{i}\sum_{v\in Y\setminus V}p_{v})P_{i}^{d}$
11:        $P_{i,r}\leftarrow(t_{0}-t_{i})!\left(\sum_{x=0}^{r-1}f(s,x,t_{0}-t_{i})\right)$
12:        $\mathsf{EU}\leftarrow\mathsf{EU}+P_{t_{0}}\binom{t_{0}}{t_{i}}q_{i}^{t_{i}}\cdot P_{i,r}\cdot\mathsf{EG}_{i}$
13:        $\mathsf{EU}\leftarrow\mathsf{EU}+P_{t_{0}}\binom{t_{0}}{t_{i}}q_{i}^{t_{i}}(1-q_{i}^{t_{i}})\mathsf{EU}_{i}^{u}$
14: $\mathsf{DefEU}(U)\leftarrow\mathsf{EU}$
Algorithm 5 Calculate $f(\cdot)$ given $t_{0}$, $i$, $t_{i}$
1: $\{i_{1},\ldots,i_{n-1}\}\leftarrow T\setminus\{i\}$
2: $\mathsf{EG}_{i}\leftarrow(t_{i}+q_{i}\sum_{v\in Y\setminus V}p_{v})(R_{i}^{d}-P_{i}^{d})$
3: Initialize $f(s,x,y)\leftarrow 0$ for all $s,x,y$
4: $f(0,0,0)\leftarrow 1$
5: for $s\leftarrow 1$ to $n-1$ do
6:     for $x\leftarrow 0$ to $\min(s,r)$ do
7:         for $y\leftarrow 0$ to $t_{0}-t_{i}$ do
8:             for $\tilde{y}\leftarrow 0$ to $y$ do
9:                 $\mathsf{EG}_{i_{s}}\leftarrow(\tilde{y}+q_{i_{s}}\sum_{v\in Y\setminus V}p_{v})(R_{i_{s}}^{d}-P_{i_{s}}^{d})$
10:                if $\mathsf{EG}_{i_{s}}>\mathsf{EG}_{i}$ then
11:                    $f(s,x,y)\leftarrow f(s,x,y)+\frac{q_{i_{s}}^{\tilde{y}}}{\tilde{y}!}f(s-1,x-1,y-\tilde{y})$
12:                else
13:                    $f(s,x,y)\leftarrow f(s,x,y)+\frac{q_{i_{s}}^{\tilde{y}}}{\tilde{y}!}f(s-1,x,y-\tilde{y})$
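For concreteness, the dynamic program in Algorithm 5 can be transcribed directly into code. The following Python sketch assumes the inputs (the attack probabilities and payoff gaps of the other targets, the leftover term $\sum_{v\in Y\setminus V}p_{v}$, $\mathsf{EG}_{i}$, $r$, $t_{0}$, and $t_{i}$) are given; all names are illustrative.

import math

def compute_f(q_others, gap_others, leftover, EG_i, r, t0, ti):
    """q_others[s-1], gap_others[s-1]: q_{i_s} and R^d_{i_s} - P^d_{i_s} of the s-th other target;
    leftover: sum of p_v over v in Y \\ V; returns the full table f[s][x][y]."""
    m = len(q_others)                     # n - 1 other targets
    y_max = t0 - ti
    f = [[[0.0] * (y_max + 1) for _ in range(r + 1)] for _ in range(m + 1)]
    f[0][0][0] = 1.0
    for s in range(1, m + 1):
        q_s, gap_s = q_others[s - 1], gap_others[s - 1]
        for x in range(0, min(s, r) + 1):
            for y in range(0, y_max + 1):
                for yt in range(0, y + 1):
                    EG_s = (yt + q_s * leftover) * gap_s
                    w = (q_s ** yt) / math.factorial(yt)
                    if EG_s > EG_i:
                        if x >= 1:                      # f(s-1, -1, .) is treated as 0
                            f[s][x][y] += w * f[s - 1][x - 1][y - yt]
                    else:
                        f[s][x][y] += w * f[s - 1][x][y - yt]
    return f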

Appendix F Defender Utility is not Submodular

We provide a counterexample that disproves the submodularity of $\mathsf{DefEU}(U)$.

Example.

Consider a network $G_{S}=(X,Y,E)$ where $X=\{u_{1},u_{2}\}$, $Y=\{v_{1},v_{2},v_{3}\}$, $E=\{(u_{1},v_{2}),(u_{2},v_{3})\}$, $p_{v}=1$ for all $v\in Y$ and $w_{uv}=1$ for all $(u,v)\in E$. There are 2 targets $T=\{1,2\}$ with $R_{i}^{d}=i$ and $P_{i}^{d}=-10^{-8}\approx 0$ for each $i\in T$. Letting $\lambda=0$ yields $q_{i}=0.5$. The defender has only 1 resource. We can see that $\mathsf{DefEU}(\emptyset)=\mathsf{DefEU}(\{1\})=\mathsf{DefEU}(\{2\})=3$, while $\mathsf{DefEU}(\{1,2\})=\frac{1}{4}(2+0.5)+\frac{1}{4}(4+1)+\frac{1}{2}(2+1)=3.375$. As a result, $\mathsf{DefEU}(\{1,2\})+\mathsf{DefEU}(\emptyset)>\mathsf{DefEU}(\{1\})+\mathsf{DefEU}(\{2\})$.

Appendix G Proof of Lemma 4.2.1

See 4.2.1

Proof.

Let the random variable $W$ be the number of attacks. Let $\mathcal{A}_{1}$ be the set of events with no less than $C$ reported attackers and $\mathcal{A}_{2}$ the set of events with no less than $C$ attacks. Let $E_{A}$ be the expected defender's utility taken over all possible tips given an event $A$. Noticing that $\mathcal{A}_{1}\subseteq\mathcal{A}_{2}$, we have

\begin{align}
|\mathsf{DefEU}(U)-\mathsf{DefEU}(U,C)| &\leq \sum_{A\in\mathcal{A}_{1}}\Pr[A]|E_{A}|\leq\sum_{A\in\mathcal{A}_{2}}\Pr[A]|E_{A}| \nonumber\\
&\leq Q\sum_{i=C}^{|Y|}\Pr[W=i]\cdot i \nonumber\\
&= Q\left(C\Pr[W\geq C]+\sum_{i=C+1}^{|Y|}\Pr[W\geq i]\right) \nonumber\\
&\leq Q\left(Ce^{-2(C-C^{\prime})^{2}/|Y|}+\sum_{i=C+1}^{|Y|}e^{-2(i-C^{\prime})^{2}/|Y|}\right) \tag{6}\\
&< Q\cdot e^{-2(C-C^{\prime})^{2}/|Y|}\left(C+\frac{1}{1-e^{-4(C-C^{\prime})/|Y|}}\right). \tag{7}
\end{align}

Inequality (6) follows by the Chernoff Bound, and inequality (7) follows since

\begin{align*}
\sum_{i=C+1}^{|Y|}e^{-2(i-C^{\prime})^{2}/|Y|} &\leq e^{-2(C+1-C^{\prime})^{2}/|Y|}\sum_{i\geq 0}e^{-4i(C+1-C^{\prime})/|Y|} \\
&< e^{-2(C-C^{\prime})^{2}/|Y|}\cdot\frac{1}{1-e^{-4(C-C^{\prime})/|Y|}}. \qquad ∎
\end{align*}

Appendix H Non-convergence of the level-$\kappa$ response

Example.

Suppose there is a single attacker and two targets with the following payoffs:

$R^{a}_{1}=0.6,\quad R^{a}_{2}=0.8,\quad P^{a}_{1}=-0.8,\quad P^{a}_{2}=-0.6.$

In this case, there are only two possible tips: the attacker attacks target 1 ($\mathbf{V}_{1}$) or target 2 ($\mathbf{V}_{2}$). Assume that only 1 informant is recruited, with report probability $w=0.5$. The defender has only 1 defensive resource and uses the following strategy:

$\mathbf{x}_{0}=(0.5,0.5),\quad\mathbf{x}(\mathbf{V}_{1})=(1.0,0.0),\quad\mathbf{x}(\mathbf{V}_{2})=(0.0,1.0).$

When the attacker has $\lambda=2.9$, the level-$\kappa$ response converges to $\mathbf{q}=(0.4283,0.5717)$. However, if $\lambda=3.0$, then the process eventually oscillates between $\mathbf{q}=(0.2924,0.7076)$ and $\mathbf{q}^{\prime}=(0.5676,0.4324)$.
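For illustration, the following Python sketch iterates the level-$\kappa$ response map $\mathbf{q}\mapsto\mathsf{QR}(\mathsf{MS}(\mathbf{x}_{0},\mathbf{x},\mathbf{q}))$ on this instance; the starting point and the number of iterations are illustrative choices.

import math

def level_k_iteration(lmbda, steps=60):
    R = [0.6, 0.8]; P = [-0.8, -0.6]           # attacker payoffs from the example
    x0 = [0.5, 0.5]                             # prior coverage
    xV = [[1.0, 0.0], [0.0, 1.0]]               # coverage after tip V_1 / V_2
    w = 0.5                                     # probability a tip is received
    q = [0.5, 0.5]                              # illustrative starting point
    for _ in range(steps):
        # marginal coverage induced by q
        xhat = [(1 - w) * x0[i] + w * sum(q[j] * xV[j][i] for j in range(2)) for i in range(2)]
        u = [R[i] - xhat[i] * (R[i] - P[i]) for i in range(2)]
        z = [math.exp(lmbda * ui) for ui in u]
        q = [zi / sum(z) for zi in z]           # quantal response against xhat
    return q

# level_k_iteration(2.9) approaches (0.4283, 0.5717); with lmbda = 3.0 the iterates
# keep alternating between roughly (0.2924, 0.7076) and (0.5676, 0.4324).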

Appendix I Proof of Lemma 5.1

See 5.1

Proof.

Given the defender's strategy $\mathbf{x}_{0}$ and $\mathbf{x}(\mathbf{V})$, define:

$g(\mathbf{q})=\mathsf{QR}(\mathsf{MS}(\mathbf{x}_{0},\mathbf{x},\mathbf{q})).$

Then a level-($\kappa$+1) attacker's strategy can be computed by

$\mathbf{q}^{\kappa+1}=g(\mathbf{q}^{\kappa})=g(g(\mathbf{q}^{\kappa-1}))=\cdots=g^{\kappa}(\mathbf{q}).$

The convergence of the level-$\kappa$ response is equivalent to the convergence of $g^{\kappa}(\mathbf{q})$.

The marginal strategy $\hat{\mathbf{x}}$ can be written as:

$\hat{\mathbf{x}}(\mathbf{q})=\sum_{\mathbf{V}}\Pr\{\mathbf{V}\}\mathbf{x}(\mathbf{V})=(1-w)\mathbf{x}_{0}+\sum_{i}wq_{i}\mathbf{x}(\mathbf{V}_{i}).$

Notice that the function $g(\mathbf{q})$ is just the quantal response against $\hat{\mathbf{x}}$:

$g_{i}(\mathbf{q})=\frac{e^{\lambda u^{a}_{i}(\hat{x}_{i})}}{\sum_{j}e^{\lambda u^{a}_{j}(\hat{x}_{j})}},$

where $u^{a}_{i}(\hat{x}_{i})$ is the attacker's expected utility of attacking target $i$ when the defender's marginal strategy is $\hat{\mathbf{x}}$: $u^{a}_{i}(\hat{x}_{i})=R^{a}_{i}-\hat{x}_{i}(R^{a}_{i}-P^{a}_{i})$. Therefore,

\begin{align*}
\frac{\partial g_{i}}{\partial q_{j}} &= \frac{\partial g_{i}}{\partial u^{a}_{i}}\frac{\partial u^{a}_{i}}{\partial q_{j}}+\sum_{l\neq i}\frac{\partial g_{i}}{\partial u^{a}_{l}}\frac{\partial u^{a}_{l}}{\partial q_{j}} \\
&= \lambda wg_{i}(1-g_{i})(P^{a}_{i}-R^{a}_{i})x_{i}(\mathbf{V}_{j})+\sum_{l\neq i}\lambda wg_{i}g_{l}(R^{a}_{l}-P^{a}_{l})x_{l}(\mathbf{V}_{j}) \\
&= \lambda wg_{i}\left[(P^{a}_{i}-R^{a}_{i})x_{i}(\mathbf{V}_{j})+\sum_{l}g_{l}(R^{a}_{l}-P^{a}_{l})x_{l}(\mathbf{V}_{j})\right].
\end{align*}

Note that in the above equation, $0\leq w,g_{i},g_{l},x_{i}(\mathbf{V}_{j}),x_{l}(\mathbf{V}_{j})\leq 1$, $P^{a}_{i}-R^{a}_{i}<0$ and $R^{a}_{l}-P^{a}_{l}>0$. Thus we have:

$$\lambda(P^{a}_{i}-R^{a}_{i})x_{i}(\mathbf{V}_{j})<\frac{\partial g_{i}}{\partial q_{j}}<\lambda\sum_{l}g_{l}(R^{a}_{l}-P^{a}_{l})x_{l}(\mathbf{V}_{j}). \tag{8}$$

On the other hand, $\bar{x}_{i}\leq\frac{L}{n\lambda(R^{a}_{i}-P^{a}_{i})}$ means that $x_{i}(\mathbf{V}_{j})\leq\frac{L}{n\lambda(R^{a}_{i}-P^{a}_{i})}$ for all $j$. Plugging this into Inequality (8), we get:

$$-\frac{L}{n}<\frac{\partial g_{i}}{\partial q_{j}}<\frac{L}{n}\sum_{l}g_{l}=\frac{L}{n}.$$

For any $\mathbf{q}^{\prime}\neq\mathbf{q}$, let $\mathbf{q}^{(i)}=(q_{1},\dots,q_{i},q^{\prime}_{i+1},\dots,q^{\prime}_{n})$, so that $\mathbf{q}^{(0)}=\mathbf{q}^{\prime}$ and $\mathbf{q}^{(n)}=\mathbf{q}$. Therefore,

\begin{align*}
\left\|g(\mathbf{q})-g(\mathbf{q}^{\prime})\right\|_{1} &= \sum_{i=1}^{n}\left|g_{i}(\mathbf{q})-g_{i}(\mathbf{q}^{\prime})\right| = \sum_{i=1}^{n}\left|\sum_{j=1}^{n}\left[g_{i}(\mathbf{q}^{(j)})-g_{i}(\mathbf{q}^{(j-1)})\right]\right| \\
&\leq \sum_{i=1}^{n}\sum_{j=1}^{n}\left|g_{i}(\mathbf{q}^{(j)})-g_{i}(\mathbf{q}^{(j-1)})\right| \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n}\left|\int_{q^{\prime}_{j}}^{q_{j}}\left.\frac{\partial g_{i}(\mathbf{q})}{\partial q_{j}}\right|_{\mathbf{q}=(q_{1},\dots,q_{j-1},s,q^{\prime}_{j+1},\dots,q^{\prime}_{n})}\,\mathrm{d}s\right| \\
&< \sum_{i=1}^{n}\sum_{j=1}^{n}\frac{L}{n}\left|q_{j}-q^{\prime}_{j}\right| = \sum_{i=1}^{n}\frac{L}{n}\|\mathbf{q}-\mathbf{q}^{\prime}\|_{1}=L\|\mathbf{q}-\mathbf{q}^{\prime}\|_{1}. \qquad ∎
\end{align*}

Appendix J Algorithm for Defending against Informant-Aware Attackers

Proposition.

The optimal objective of QRI is non-decreasing in $w$.

Proof.

Consider the two optimization problems induced by different values of $w$: $w_{1}$ and $w_{2}$ with $w_{2}>w_{1}$. Let $(\mathbf{x},\mathbf{y},\mathbf{z})$ be a solution for $w=w_{1}$. Then $(\mathbf{x},\mathbf{y},\frac{w_{1}}{w_{2}}\mathbf{z}+(1-\frac{w_{1}}{w_{2}})\mathbf{x})$ is a feasible solution for $w=w_{2}$ that achieves the same objective value. To see why it is feasible, observe that constraint (3) is satisfied by construction and constraint (5) is satisfied since the new value of each $z_{i}$ is a convex combination of the previous $x_{i}$ and $z_{i}$, which were both in $[0,1]$. ∎

Proposition J implies that when selecting informants, it is optimal to simply maximize $w$. Since $w=1-\prod_{u\in U}(1-w_{u1})$, we can select informants greedily and choose the $k$ informants with the largest information-sharing intensity $w_{u1}$. We can then solve the optimization problem to find the optimal allocation of resources. Finally, we discuss how to find an approximate solution to QRI using a MILP approach.
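A minimal sketch of this greedy selection (Python, with illustrative names) is given below.

def select_informants(w_u1, k):
    """w_u1: dict mapping informant -> information-sharing intensity w_{u1}.
    Returns the k chosen informants and the resulting tip probability w."""
    chosen = sorted(w_u1, key=w_u1.get, reverse=True)[:k]
    no_report = 1.0
    for u in chosen:
        no_report *= (1.0 - w_u1[u])      # probability that no chosen informant reports
    return set(chosen), 1.0 - no_report

# Example: select_informants({'u1': 0.3, 'u2': 0.7, 'u3': 0.5}, 2) -> ({'u2', 'u3'}, 0.85)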

We can compute the optimal defender strategy by adapting the approach used in the PASAQ algorithm Yang et al. (2012). Let $N(y)$ and $D(y)$ be the numerator and denominator of the objective in QRI. As with PASAQ, we binary search on the optimal value $\delta^{*}$. We can check the feasibility of a given $\delta$ by rewriting the objective as $\min_{x,y,z}\delta D(y)-N(y)$ and checking whether the optimal value is less than 0. To solve the new optimization problem, which still has a non-linear objective function, we adapt their approach of approximating the objective function with linear constraints and write a MILP.

First, let $\theta_{i}=e^{\lambda R^{a}_{i}}$, $\beta_{i}=\lambda(R^{a}_{i}-P^{a}_{i})$, and $\alpha_{i}=R^{d}_{i}-P^{d}_{i}$. We rewrite the objective as

$$\sum_{i\in\mathcal{T}}\theta_{i}(\delta-P^{d}_{i})e^{-\beta_{i}y_{i}}-\sum_{i\in\mathcal{T}}\theta_{i}\alpha_{i}y_{i}e^{-\beta_{i}y_{i}}.$$
Figure 2. Utility comparison between the level-0 defender and the informant-aware defender against an informant-aware attacker.

We have two non-linear functions that need approximation: $f^{1}_{i}(y)=e^{-\beta_{i}y}$ and $f^{2}_{i}(y)=ye^{-\beta_{i}y}$. Let $\gamma_{ij}$ be the slope of the linear approximation of $f^{1}_{i}(y)$ from $(\frac{j}{K},f^{1}_{i}(\frac{j}{K}))$ to $(\frac{j+1}{K},f^{1}_{i}(\frac{j+1}{K}))$, and similarly $\mu_{ij}$ for $f^{2}_{i}(y)$.
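For concreteness, the segment slopes can be computed as follows (a Python sketch; the segment indexing mirrors the description above and the function name is illustrative).

import math

def pwl_slopes(beta_i, K):
    """Returns (gamma, mu): slopes of the K linear segments approximating
    f1(y) = exp(-beta_i * y) and f2(y) = y * exp(-beta_i * y) on [0, 1]."""
    f1 = lambda y: math.exp(-beta_i * y)
    f2 = lambda y: y * math.exp(-beta_i * y)
    gamma, mu = [], []
    for j in range(K):
        a, b = j / K, (j + 1) / K
        gamma.append((f1(b) - f1(a)) * K)   # slope = change in f over a segment of width 1/K
        mu.append((f2(b) - f2(a)) * K)
    return gamma, mu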

The key change in our MILP compared to PASAQ is that we replace the original defender resource constraint with constraints (9) - (12), which take into account the ability of the defender to respond to tips.

QRI-MILP:

\begin{align}
\min_{x,y,z,a}\quad & \sum_{i\in\mathcal{T}}\theta_{i}(\delta-P^{d}_{i})\Big(1+\sum_{j=1}^{K}\gamma_{ij}y_{ij}\Big)-\sum_{i\in\mathcal{T}}\theta_{i}\alpha_{i}\sum_{j=1}^{K}\mu_{ij}y_{ij} \nonumber\\
\text{subject to}\quad & \sum_{j=1}^{K}y_{ij}=(1-w)x_{i}+wz_{i},\quad\forall i \tag{9}\\
& \sum_{i\in\mathcal{T}}x_{i}\leq r \tag{10}\\
& 0\leq x_{i}\leq 1,\quad\forall i \tag{11}\\
& 0\leq z_{i}\leq 1,\quad\forall i \tag{12}\\
& 0\leq y_{ij}\leq\frac{1}{K},\quad\forall i,\ j=1\ldots K \tag{13}\\
& a_{ij}\frac{1}{K}\leq y_{ij},\quad\forall i,\ j=1\ldots K-1 \tag{14}\\
& y_{i(j+1)}\leq a_{ij},\quad\forall i,\ j=1\ldots K-1 \tag{15}\\
& a_{ij}\in\{0,1\},\quad\forall i,\ j=1\ldots K-1 \tag{16}
\end{align}
Proposition.

The feasible region for $\mathbf{y}=\langle y_{i}=\sum_{j=1}^{K}y_{ij},\,i\in\mathcal{T}\rangle$ of QRI-MILP is equivalent to that of QRI.

Proof.

With the substitution $y_{i}=\sum_{j=1}^{K}y_{ij}$, constraints (9)–(12) are directly translated from QRI. The remaining constraints (13)–(16) can be shown to allow any feasible $y_{i}$, represented correctly with the appropriate $y_{ij}$. ∎

With the above claim shown, the proof of the approximate correctness of PASAQ applies here, and we can find an $\varepsilon$-optimal solution for arbitrarily small $\varepsilon$ Yang et al. (2012).
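The binary-search wrapper described at the beginning of this appendix can be sketched as follows (Python); solve_qri_milp is a placeholder for a routine that solves QRI-MILP at a fixed $\delta$ and returns its optimal value, and the bounds and tolerance are illustrative.

def binary_search_delta(solve_qri_milp, lo, hi, tol=1e-4):
    """lo, hi: lower/upper bounds on the defender's utility.
    solve_qri_milp(delta) returns the optimal value of min delta*D(y) - N(y)."""
    while hi - lo > tol:
        delta = (lo + hi) / 2.0
        if solve_qri_milp(delta) <= 0:   # delta is achievable: some y attains N(y)/D(y) >= delta
            lo = delta
        else:
            hi = delta
    return lo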

Appendix K Additional Experiment Results

In Figure 2, we compare the performance of the level-0 defender and the informant-aware defender when playing against an informant-aware attacker. Even though the level-0 defender's strategy is computed under the level-0 attacker assumption, its utility is only slightly lower than that of the informant-aware defender. For a fixed $r$, the difference in utility grows as the defender recruits more informants and thus has a higher probability of receiving a tip.

Figure 3. Runtime and solution quality as $|Y|$ increases, with $w_{uv}=1$ for all $u,v$.

Figure 4. Error of the $\mathsf{DefEU}$ estimates on 2 cases with fixed $C=6$, $C^{\prime}=2$, $|Y|=8$, $Q=2$.

We test the special case assuming SISI. We set $|X|=6$, $k=4$, $n=10$, $r=3$ and vary $|Y|$ from 2 to 16. The average runtime of all algorithms, including ASISI, together with the relative errors of GSA and T-Sampling, is shown in Figure 3. Though ASISI and GSA are the fastest of all the algorithms, the average relative error of GSA is only slightly above $0\%$. T-Sampling is slightly slower than ASISI and GSA, and its solutions are less accurate than those of the other two in this case.

Another experiment is a case study on 2 instances with $\sum_{v\in Y}p_{v}<2$ and fixed $|X|=6$, $|Y|=8$, $n=6$, $r=3$, $Q=2$. We run EDPA, C-Truncated ($C=6$) and T-Sampling on each instance and show the error of the estimations for all $U\subseteq X$. The results are shown in Figure 4, where the red lines indicate the error bound given by Lemma 4.2.1. We encode each set of informants in binary, e.g., the set with code $19=(010011)_{2}$ represents the set $\{u_{1},u_{2},u_{4}\}$. The first instance is constructed to show that the bound given by Lemma 4.2.1 is empirically tight, i.e., the error of the C-Truncated estimate of $\mathsf{DefEU}(U)$ can be large but remains bounded. In this case, we set $p_{v}=1$ and $w_{uv}=1$ for all $u,v$, $R_{i}^{d}=Q$ and $P_{i}^{d}=-10^{-3}$. The other instance is randomly generated. T-Sampling has larger errors with higher variances compared to C-Truncated.

Appendix L Experiment Setup

To generate $G_{S}=(X,Y,E)$, we first fix the sets $X$ and $Y$. For each $u\in X$, we sample the degree $d_{u}$ of $u$ uniformly from $[|Y|]$ and then sample a uniformly random subset of $Y$ of size $d_{u}$ as its neighborhood. For each $(u,v)\in E$, $w_{uv}$ is drawn from $U[0,0.2]$. For the attack probabilities, in the general case each $p_{v}$ is drawn from $U[0.4,1]$. When we restrict $\sum_{v\in Y}p_{v}\leq C^{\prime}$, we draw a vector $\mathbf{t}=(t_{1},\ldots,t_{|Y|})$ from $U[0,1]^{|Y|}$ and set $p_{v}=\min\{1,C^{\prime}\cdot\frac{t_{v}}{\|\mathbf{t}\|_{1}}\}$. For the payoff matrix, each $R_{i}^{d}$ ($R_{i}^{a}$) is drawn from $U(0,Q]$ and each $P_{i}^{d}$ ($P_{i}^{a}$) is drawn from $U[-Q,0)$, where $Q$ is set to 2. The precision parameter $\lambda$ is set to 2. $\mathsf{DefEU}_{0}$ and the $q_{i}$'s are obtained by a binary search with a convex optimization as introduced in Yang et al. (2012). The number of samples $\mathsf{T}$ used in T-Sampling is set to 100. In GSA, EDPA is used to calculate $\mathsf{DefEU}(U)$.
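For reproducibility, the instance generation can be sketched as follows (Python); the parameter ranges follow the text above, while the function name and data layout are illustrative.

import random

def generate_instance(nX, nY, n_targets, Q=2.0, C_prime=None):
    X, Y = list(range(nX)), list(range(nY))
    E, w = [], {}
    for u in X:
        d_u = random.randint(1, nY)                        # degree of u, uniform over [|Y|]
        for v in random.sample(Y, d_u):                     # random neighborhood of size d_u
            E.append((u, v))
            w[(u, v)] = random.uniform(0.0, 0.2)            # information-sharing intensity
    if C_prime is None:
        p = [random.uniform(0.4, 1.0) for _ in Y]           # general case
    else:
        t = [random.random() for _ in Y]
        p = [min(1.0, C_prime * tv / sum(t)) for tv in t]   # enforce sum_v p_v <= C'
    R_d = [random.uniform(0.0, Q) for _ in range(n_targets)]    # rewards in (0, Q]
    P_d = [-random.uniform(0.0, Q) for _ in range(n_targets)]   # penalties in [-Q, 0)
    R_a = [random.uniform(0.0, Q) for _ in range(n_targets)]
    P_a = [-random.uniform(0.0, Q) for _ in range(n_targets)]
    return X, Y, E, w, p, (R_d, P_d, R_a, P_a)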

In the QRI-MILP algorithm, the optimal defender strategy is found with approximation parameter $K=10$. The bi-level optimization algorithm is implemented in MATLAB R2017a: the inner linear program is solved with the linprog function and the outer optimization with the fmincon function.

Appendix M Defending Against Level-$\kappa$ Attackers

In Section 4, we deal with the case with only level-0 attackers and provide algorithms to find the optimal set of informants to recruit. In this section, we show how those approaches can be easily extended to the case with level-$\kappa$ attackers.

Given $\hat{\mathbf{x}}^{\kappa-1}$, $\mathbf{q}^{\kappa}$ can be easily obtained, and so can $\mathsf{DefEU}_{\kappa}$, the defender's expected utility of using $\mathbf{x}_{0}$ against a single attack by a level-$\kappa$ attacker. To get the solution, we simply replace $\mathsf{DefEU}_{0},\mathbf{q}^{0}$ with $\mathsf{DefEU}_{\kappa},\mathbf{q}^{\kappa}$ and apply Select or GSA. To calculate $\hat{\mathbf{x}}^{\kappa-1}$ by definition, all that remains is to calculate $\mathsf{MS}(\mathbf{x}_{0},\mathbf{x},\mathbf{q}^{i})$ for $i<\kappa$; the marginal probability of each target being covered can be calculated in a way similar to EDPA.