
Influence Maximization in Ising Models

Zongchen Chen Department of Computer Science and Engineering, University at Buffalo, zchen83@buffalo.edu. Research supported by EM’s Simons Investigator award (622132).    Elchanan Mossel Department of Mathematics, MIT, elmos@mit.edu. Research supported by the Vannevar Bush Faculty Fellowship ONR-N00014-20-1-2826, the NSF award CCF 1918421, and the Simons Investigator award (622132).
Abstract

Given a complex high-dimensional distribution over $\{\pm 1\}^{n}$, what is the best way to increase the expected number of $+1$'s by controlling the values of only a small number of variables? Such a problem is known as influence maximization and has been widely studied in social networks, biology, and computer science. In this paper, we consider influence maximization on the Ising model, which is a prototypical example of undirected graphical models and has wide applications in many real-world problems. We establish a sharp computational phase transition for influence maximization on sparse Ising models under a bounded budget: in the high-temperature regime, we give a linear-time algorithm for finding a small subset of variables and their values which achieve nearly optimal influence; in the low-temperature regime, we show that the influence maximization problem cannot be solved in polynomial time under commonly believed complexity assumptions. The critical temperature coincides with the tree uniqueness/non-uniqueness threshold for Ising models, which is also a critical point for other computational problems including approximate sampling and counting.

1 Introduction

Let $\mu$ be a distribution supported on $\{\pm 1\}^{V}$ where $V$ is a ground set of size $n$, and let $k\in\mathbb{N}^{+}$ be an integer corresponding to a budget. We consider the following version of the influence maximization problem, which asks to find a subset $S\subseteq V$ of size at most $k$ and a partial assignment $\sigma_{S}\in\{\pm 1\}^{S}$ maximizing the expectation of $\sum_{v\in V}X_{v}$ conditioned on the variables in $S$ receiving the values specified by $\sigma_{S}$. In other words, we want to solve the following combinatorial optimization problem:

\max_{\substack{S\subseteq V,\,|S|\leq k\\ \sigma_{S}\in\{\pm 1\}^{S}}}\left\{\mathbb{E}_{\mu}\left[\sum_{v\in V}X_{v}\,\Bigg|\,X_{S}=\sigma_{S}\right]\right\}.   (1)
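As a concrete (if exponential-time) illustration, the objective in (1) can be evaluated by brute force on a tiny instance. The sketch below is our own toy code, not the paper's algorithm; the helper names are hypothetical, and the Gibbs weights follow the Ising definition given later in the introduction with a single coupling $\beta$ and field $h$.

```python
import itertools
import math

def conditional_mean_sum(n, edges, beta, h, pins):
    """E[sum_v X_v | X_S = sigma_S] on an Ising model, by exhaustive enumeration."""
    num = den = 0.0
    for sigma in itertools.product([-1, 1], repeat=n):
        if any(sigma[v] != s for v, s in pins.items()):
            continue
        # Unnormalized Gibbs weight exp(beta * sum_{uv} s_u s_v + h * sum_v s_v).
        w = math.exp(beta * sum(sigma[u] * sigma[v] for u, v in edges)
                     + h * sum(sigma))
        num += w * sum(sigma)
        den += w
    return num / den

def best_seed_set(n, edges, beta, h, k):
    """Enumerate all (S, sigma_S) with |S| <= k and return the maximizer of (1)."""
    best = (conditional_mean_sum(n, edges, beta, h, {}), (), ())
    for size in range(1, k + 1):
        for S in itertools.combinations(range(n), size):
            for vals in itertools.product([-1, 1], repeat=size):
                val = conditional_mean_sum(n, edges, beta, h, dict(zip(S, vals)))
                if val > best[0]:
                    best = (val, S, vals)
    return best

# 4-cycle, ferromagnetic, zero field, budget 1: pinning some vertex to +1 wins.
val, S, vals = best_seed_set(4, [(0, 1), (1, 2), (2, 3), (3, 0)], 0.3, 0.0, 1)
```

This enumeration costs $2^{n}$ per conditional expectation, which is exactly the inefficiency the paper's linear-time algorithm avoids.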

Influence maximization is an important problem especially in the study of social networks and also has a vast number of applications in other areas [KKT03, KKT05].

The problem of influence maximization has been extensively studied both theoretically and in more applied work. However, the families of distributions for which it has been analyzed are somewhat limited. The theoretical foundations for the model were introduced in [KKT03] in terms of a dynamical model where agents are infected once a function of their infected neighborhood surpasses a certain threshold. Algorithmic results and computational hardness are both stated in terms of properties of these threshold functions. While the models introduced and analyzed in [KKT03, KKT05, MR10] allow for and vastly generalize standard infection models, they do not apply to other standard models of correlated opinions.

Our main interest in this paper is in Ising models, one of the simplest and most popular graphical models for the joint distribution of correlated discrete random variables. The Ising model was originally defined as a statistical physics model, and nowadays it is widely used to model social networks, computer networks, and biological systems; see e.g. [LYS10, MS10, APB10, MLO01, LMLA19, Lip22].

Consider a graph $G=(V,E)$, and let $\beta,h\in\mathbb{R}$. In the Ising distribution on $G$ parameterized by $\beta,h$, every configuration $\sigma\in\{\pm 1\}^{V}$ is assigned probability

\mu(\sigma)\propto\exp\left(\beta\sum_{uv\in E}\sigma_{u}\sigma_{v}+h\sum_{v\in V}\sigma_{v}\right).

Here, $\beta$ is the inverse temperature describing the interaction between adjacent vertices. In particular, if $\beta>0$ then neighboring vertices are more likely to receive the same value and the model is called ferromagnetic; meanwhile, if $\beta<0$ then neighboring vertices repel each other and the model is called antiferromagnetic. The parameter $h$ is the external field of the system, describing an outside bias on the variables. In general, every edge may have a distinct inverse temperature and every vertex a distinct external field; we refer to Section 2.1 for this more general definition.
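To make the ferromagnetic/antiferromagnetic distinction concrete, the following sketch (our own toy check, not from the paper) enumerates the Gibbs distribution on a triangle with $h=0$ and verifies that an edge's endpoints agree with probability above $1/2$ when $\beta>0$ and below $1/2$ when $\beta<0$.

```python
import itertools
import math

def edge_agreement_prob(n, edges, beta):
    """P(X_0 = X_1) under the Ising Gibbs distribution, uniform coupling beta, h = 0."""
    Z = agree = 0.0
    for sigma in itertools.product([-1, 1], repeat=n):
        w = math.exp(beta * sum(sigma[u] * sigma[v] for u, v in edges))
        Z += w
        if sigma[0] == sigma[1]:  # endpoints of edge (0, 1) agree
            agree += w
    return agree / Z

triangle = [(0, 1), (1, 2), (2, 0)]
p_ferro = edge_agreement_prob(3, triangle, 0.5)   # beta > 0: agreement favored
p_anti = edge_agreement_prob(3, triangle, -0.5)   # beta < 0: disagreement favored
```

At $\beta=0$ the distribution is uniform and the agreement probability is exactly $1/2$, which is a convenient sanity check.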

The problem of influence maximization for the Ising model was studied before in some special settings. Bresler, Koehler, and Moitra [BKM19] showed, using the GHS inequality, that for the ferromagnetic Ising model the influence maximization objective for the equilibrium measure is submodular, so the greedy algorithm achieves a $1-1/e$ approximation factor. More recently, a preprint by Chin, Moitra, Mossel, and Sandon [CMMS23] shows that for very high temperature (small $\beta$) ferromagnetic Ising models with fixed parameter $\beta$, the influence maximization problem is approximately solved by picking the highest-degree nodes. See also [LYS10] for applications of influence maximization on Ising models in social networks.

While prior works provided some interesting algorithms for special cases, much remains unknown. First, many of the most natural models are not ferromagnetic. Second, we may be interested in an approximation factor better than $1-1/e$. Finally and importantly, we would like to understand the computational hardness of the problem.

For Ising models, there exists a critical temperature $\beta_{c}$ which characterizes phase transitions of the model. The critical point $\beta_{c}$ depends on the maximum degree of the graph and is called the tree uniqueness/non-uniqueness threshold, since it characterizes whether there exists a unique Gibbs measure for the Ising model on infinite regular trees. More importantly, the threshold $\beta_{c}$ pinpoints whether or not the model exhibits correlation decay [Wei06, LLY13] or spectral independence [ALO20, CLV20], which are crucial properties for guaranteeing rapid mixing of natural Markov chains for sampling, such as Glauber dynamics, and polynomial-time algorithms for estimating the partition function.

We show that the critical temperature $\beta_{c}$ also pinpoints a computational phase transition for the influence maximization problem on sparse Ising models. In fact, we consider a more general version of influence maximization where we want to maximize the influence on an arbitrary linear function of the $X_{v}$'s under a bounded budget; see Section 2.3 for formal definitions.

Theorem 1.1 (Informal version of Theorems 2.3 and 2.4).

Consider Ising models on bounded-degree graphs and let $k\in\mathbb{N}^{+}$ be a constant budget.

  • If $|\beta|<\beta_{c}$, then one can find $(S,\sigma_{S})$ whose influence is $\varepsilon$-close to the optimal value within time $O(n)\cdot\mathrm{poly}(1/\varepsilon)$;

  • If $|\beta|>\beta_{c}$, then there is no $\mathrm{poly}(n,1/\varepsilon)$-time algorithm for influence maximization.

One important feature of our algorithmic result is that the running time of the algorithm is linear in $n$. Naively, one can easily obtain a polynomial-time algorithm by enumerating all possible $(S,\sigma_{S})$ and finding their corresponding influences. Since in the high-temperature regime we are able to approximately sample from the distribution or estimate the marginals in polynomial time, such a brute-force algorithm runs in polynomial time; however, the exponent of $n$ is a large constant depending on $k$. Our algorithm has the advantage of being linear-time, assuming a constant budget $k=O(1)$.

To obtain a linear-time algorithm, we utilize the decay of correlation property and the spectral independence technique in a novel way. In the high-temperature regime (i.e., $|\beta|<\beta_{c}$), the correlation/influence between a vertex $v$ and a subset $S\subseteq V$ of vertices is known to decay exponentially fast in their graph distance $\mathrm{dist}_{G}(v,S)$ [Wei06, LLY13, CLV20]; see Section 2.2 for details. The key in our approach is to approximate the global influence of $(S,\sigma_{S})$ on the whole vertex set $V$ by a local influence on only the vertices sufficiently close to $S$, when correlation decay and spectral independence hold; see Proposition 3.2. The proof of the algorithmic result is provided in Section 3.

Meanwhile, in the low-temperature regime (i.e., $|\beta|>\beta_{c}$) correlations or influences between two vertices can be non-vanishing even as their distance grows. For this reason simple Markov chain algorithms for sampling, such as Glauber dynamics, are known to be exponentially slow on such families, and our algorithmic approach fails for the same reason. In fact, it was known that approximate sampling and counting is $\mathsf{NP}$-hard in the antiferromagnetic case, i.e. when $\beta<-\beta_{c}$ [Sly10, SS14, GŠV16]. We establish hardness of influence maximization by giving a simple reduction from approximating the partition function of Ising models. The proof can be found in Section 4.

2 Preliminaries

Suppose $G=(V,E)$ is a graph. For two vertices $u,v\in V$, let $\mathrm{dist}_{G}(u,v)$ denote their graph distance in $G$. For any $u\in V$ and any $r>0$, let $\mathsf{B}(u,r)=\{v\in V:\mathrm{dist}_{G}(u,v)\leq r\}$ be the ball of radius $r$ around $u$. Further, for any $S\subseteq V$ let $\mathsf{B}(S,r)=\bigcup_{u\in S}\mathsf{B}(u,r)$.

For $r>0$, let $G^{\leq r}$ denote the graph with the same vertex set $V$ in which two vertices $u,v$ are adjacent iff $\mathrm{dist}_{G}(u,v)\leq r$. For $S\subseteq V$, let $G[S]$ be the subgraph induced on $S$.
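The balls $\mathsf{B}(u,r)$ and $\mathsf{B}(S,r)$ are straightforward to compute by breadth-first search; a minimal sketch, with adjacency represented as a dict of neighbor sets (our own convention, not the paper's):

```python
from collections import deque

def ball(adj, u, r):
    """B(u, r): vertices within graph distance r of u, via BFS."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if dist[x] == r:
            continue  # do not expand beyond radius r
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return set(dist)

def ball_of_set(adj, S, r):
    """B(S, r): union of B(u, r) over u in S."""
    return set().union(*(ball(adj, u, r) for u in S))

# Path 0 - 1 - 2 - 3 - 4:
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
```

On a bounded-degree graph each ball has size at most $1+\Delta+\dots+\Delta^{r}$, which is why radius-$r$ neighborhoods can be explored in time independent of $n$.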

2.1 Ising model

Suppose $G=(V,E)$ is a graph. Let $\beta\in\mathbb{R}^{E}$ be a vector of edge couplings and $h\in\mathbb{R}^{V}$ be a vector of external fields. The Gibbs distribution $\mu=\mu_{G,\beta,h}$ of the Ising model $(G,\beta,h)$ is given by

\mu(\sigma):=\frac{1}{Z}\exp\left(\sum_{uv\in E}\beta_{uv}\sigma_{u}\sigma_{v}+\sum_{v\in V}h_{v}\sigma_{v}\right),\qquad\forall\sigma\in\{\pm 1\}^{V}   (2)

where the partition function $Z=Z_{G,\beta,h}$ is defined by

Z=\sum_{\sigma\in\{\pm 1\}^{V}}\exp\left(\sum_{uv\in E}\beta_{uv}\sigma_{u}\sigma_{v}+\sum_{v\in V}h_{v}\sigma_{v}\right).

For an integer $\Delta\geq 3$ and a real $\gamma>0$, let $\mathcal{M}(\Delta,\gamma)$ be the family of all Ising models $(G,\beta,h)$ satisfying:

  1. The graph $G$ has maximum degree at most $\Delta$;

  2. For all $uv\in E$ it holds that $(\Delta-1)\tanh|\beta_{uv}|\leq\gamma$.

We remark that in the family $\mathcal{M}(\Delta,\gamma)$ every edge coupling can be either ferromagnetic (i.e., $\beta_{uv}>0$) or antiferromagnetic (i.e., $\beta_{uv}<0$).

The critical temperature is given by $\beta_{c}(\Delta)=\mathrm{arctanh}(1/(\Delta-1))$. Hence, for any Ising model from the family $\mathcal{M}(\Delta,\gamma)$ where $\gamma<1$, every edge coupling satisfies $|\beta_{uv}|<\beta_{c}(\Delta)$.
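The threshold and the family-membership condition are one-liners; a small sketch assuming nothing beyond the two formulas above (helper names are ours):

```python
import math

def beta_c(Delta):
    """Tree uniqueness threshold beta_c(Delta) = arctanh(1/(Delta - 1))."""
    return math.atanh(1.0 / (Delta - 1))

def in_family(beta_uv, Delta, gamma):
    """Check the coupling condition (Delta - 1) * tanh|beta_uv| <= gamma."""
    return (Delta - 1) * math.tanh(abs(beta_uv)) <= gamma

# For Delta = 3: beta_c = arctanh(1/2) = (1/2) * ln 3.
```

Note the sign-insensitivity: only $|\beta_{uv}|$ enters the condition, matching the remark that both ferromagnetic and antiferromagnetic couplings are allowed.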

2.2 Tree uniqueness, strong spatial mixing, total influence decay

In the high-temperature regime, strong spatial mixing (correlation decay) is known for the family $\mathcal{M}(\Delta,\gamma)$ for any $\gamma<1$ [Wei06, SST14, LLY13]. Recently, [CLV20] established $\ell_{\infty}$-spectral independence by showing the exponential decay of total influences via Weitz's self-avoiding walk tree approach [Wei06].

Lemma 2.1 ([Wei06, SST14, LLY13, CLV20]).

For any $\Delta\geq 3$ and $\delta\in(0,1)$, there exists a constant $C=C(\Delta,\delta)>0$ such that the following holds. Consider an Ising model on a graph $G=(V,E)$ from the family $\mathcal{M}(\Delta,1-\delta)$. Let $\Lambda\subseteq V$ and let $\tau\in\{\pm 1\}^{\Lambda}$ be an arbitrary pinning.

  • (Strong Spatial Mixing) For any $u\in V\setminus\Lambda$ and $L\in\mathbb{N}^{+}$, for any subset $W\subseteq V\setminus\Lambda\setminus\{u\}$ such that $\mathrm{dist}_{G}(u,W)\geq L$ and any two spin assignments $\sigma_{W},\xi_{W}\in\{\pm 1\}^{W}$, we have

    \left|\mathbb{P}_{\mu^{\tau}}\left(X_{u}=+\mid X_{W}=\sigma_{W}\right)-\mathbb{P}_{\mu^{\tau}}\left(X_{u}=+\mid X_{W}=\xi_{W}\right)\right|\leq C(1-\delta)^{L}.

  • (Total Influence Decay) For any $u\in V\setminus\Lambda$ and $L\in\mathbb{N}^{+}$, we have

    \sum_{v\in V\setminus\Lambda:\,\mathrm{dist}_{G}(u,v)\geq L}\left|\mathbb{P}_{\mu^{\tau}}(X_{v}=+\mid X_{u}=+)-\mathbb{P}_{\mu^{\tau}}(X_{v}=+\mid X_{u}=-)\right|\leq C(1-\delta)^{L}.

2.3 Influence maximization

Consider the Ising model on a graph $G=(V,E)$ with edge couplings $\beta\in\mathbb{R}^{E}$ and external fields $h\in\mathbb{R}^{V}$. Let $a\in\mathbb{R}^{V}$ be a vector of vertex weights.

Definition 2.2 (Global Influence).

For a subset $S\subseteq V$ of vertices and a partial assignment $\sigma_{S}\in\{\pm 1\}^{S}$ on $S$, define the global influence of $(S,\sigma_{S})$ on the linear function $a\cdot X$ to be

\Phi_{G,\beta,h,a}(S,\sigma_{S})=\mathbb{E}\left[\sum_{v\in V}a_{v}X_{v}\,\Bigg|\,X_{S}=\sigma_{S}\right]-\mathbb{E}\left[\sum_{v\in V}a_{v}X_{v}\right]

where $X\in\{\pm 1\}^{V}$ is sampled from the Ising model $(G,\beta,h)$.
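On small instances this definition can be evaluated exactly by enumeration. The sketch below uses hypothetical helper names and a single scalar coupling $\beta$ and field $h$ for simplicity; it is an illustration of the definition, not the paper's algorithm.

```python
import itertools
import math

def expect_aX(n, edges, beta, h, a, pins):
    """E[sum_v a_v X_v | X_S = sigma_S] by exhaustive enumeration over {-1,+1}^n."""
    num = den = 0.0
    for sigma in itertools.product([-1, 1], repeat=n):
        if any(sigma[v] != s for v, s in pins.items()):
            continue
        w = math.exp(beta * sum(sigma[u] * sigma[v] for u, v in edges)
                     + h * sum(sigma))
        num += w * sum(a[v] * sigma[v] for v in range(n))
        den += w
    return num / den

def global_influence(n, edges, beta, h, a, pins):
    """Phi(S, sigma_S) = E[a.X | X_S = sigma_S] - E[a.X]."""
    return (expect_aX(n, edges, beta, h, a, pins)
            - expect_aX(n, edges, beta, h, a, {}))

# Path 0 - 1 - 2, uniform weights, zero field; pin the center vertex.
path_edges = [(0, 1), (1, 2)]
phi_plus = global_influence(3, path_edges, 0.4, 0.0, [1, 1, 1], {1: +1})
phi_minus = global_influence(3, path_edges, 0.4, 0.0, [1, 1, 1], {1: -1})
```

With $h=0$ the model is symmetric under global spin flip, so pinning the center to $-1$ gives exactly the negated influence of pinning it to $+1$, and the empty pinning has influence $0$.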

Let $k\in\mathbb{N}^{+}$ be an integer representing the budget. In this paper, we consider the $k$-Inf-Max problem, where we want to select a subset $S\subseteq V$ of size at most $k$ and a partial assignment $\sigma_{S}\in\{\pm 1\}^{S}$ which achieves almost the maximum global influence. Formally, the problem $k$-Inf-Max is defined as follows.

$k$-Inf-Max

Input: an Ising model $(G,\beta,h)$; a vector $a\in\mathbb{R}^{V}$ of vertex weights; an error parameter $\varepsilon>0$.

Output: a subset $\hat{S}\subseteq V$ with $|\hat{S}|\leq k$ and a partial assignment $\sigma_{\hat{S}}\in\{\pm 1\}^{\hat{S}}$ such that

\Phi_{G,\beta,h,a}(\hat{S},\sigma_{\hat{S}})\geq\max_{\substack{S\subseteq V,\,|S|\leq k\\ \sigma_{S}\in\{\pm 1\}^{S}}}\left\{\Phi_{G,\beta,h,a}(S,\sigma_{S})\right\}-\varepsilon.

We say a weight vector $a\in\mathbb{R}^{V}$ is $L$-bounded if $\lVert a\rVert_{\infty}\leq L$, i.e., $|a_{v}|\leq L$ for all $v\in V$. We are interested in maximizing the global influence for bounded weights. Since $\Phi_{G,\beta,h,ta}(S,\sigma_{S})=t\cdot\Phi_{G,\beta,h,a}(S,\sigma_{S})$, we may assume that $a$ is $1$-bounded. Furthermore, we consider influence maximization with a constant budget, namely $k=O(1)$, which is already interesting and captures many real-world settings. Our goal is to find an algorithm for $k$-Inf-Max with running time polynomial in $n$ and $1/\varepsilon$, and to understand the computational complexity of the problem.

2.4 Main results

Theorem 2.3 (Algorithmic Result).

Suppose $\Delta\geq 3$ is an integer and $\delta\in(0,1)$ is a real. For any integer $k\in\mathbb{N}^{+}$, there exists a deterministic algorithm that solves $k$-Inf-Max for the family $\mathcal{M}(\Delta,1-\delta)$ and $1$-bounded vertex weights with running time $O(n)\cdot(1/\varepsilon)^{O(1)}$.

Theorem 2.4 (Hardness Result).

Suppose $\Delta\geq 3$ is an integer and $\delta>0$ is a real. For any integer $k\in\mathbb{N}^{+}$, there is no randomized algorithm that solves $k$-Inf-Max for the family $\mathcal{M}(\Delta,1+\delta)$ and $1$-bounded vertex weights with probability at least $3/4$ in time $\mathrm{poly}(n,1/\varepsilon)$, assuming $\mathsf{RP}\neq\mathsf{NP}$.

3 Algorithmic Result

We prove our algorithmic result Theorem 2.3 by localizing the global influence of a subset SS of vertices to a ball around SS. For high-temperature Ising models, such local influence approximates the global influence effectively. Furthermore, one can approximately maximize the local influence by the local nature of the problem. Together this gives an approximation algorithm for global influence maximization.

To begin, we define the notion of local influence.

Definition 3.1 (Local Influence).

Let $r\in\mathbb{N}^{+}$. For a subset $S\subseteq V$ of vertices and a partial assignment $\sigma_{S}\in\{\pm 1\}^{S}$ on $S$, define the local influence of $(S,\sigma_{S})$ on the linear function $a\cdot X$ to be

\Phi_{G,\beta,h,a}^{(r)}(S,\sigma_{S})=\mathbb{E}_{G[\mathsf{B}(S,r)]}\left[\sum_{v\in\mathsf{B}(S,r)}a_{v}X_{v}\,\Bigg|\,X_{S}=\sigma_{S}\right]-\mathbb{E}_{G[\mathsf{B}(S,r)]}\left[\sum_{v\in\mathsf{B}(S,r)}a_{v}X_{v}\right],

where $X\in\{\pm 1\}^{V}$ is sampled from the Ising model on the induced subgraph $G[\mathsf{B}(S,r)]$ with $\beta,h$ restricted to it.

Notice that if $S=\{u,w\}$ and $\mathrm{dist}_{G}(u,w)>2r+1$, then the induced subgraph $G[\mathsf{B}(\{u,w\},r)]$ is the disjoint union of $G[\mathsf{B}(u,r)]$ and $G[\mathsf{B}(w,r)]$, and we can further decompose the local influence as

\Phi_{G,\beta,h,a}^{(r)}(S,\sigma_{S})= \left(\mathbb{E}_{G[\mathsf{B}(u,r)]}\left[\sum_{v\in\mathsf{B}(u,r)}a_{v}X_{v}\,\Bigg|\,X_{u}=\sigma_{u}\right]-\mathbb{E}_{G[\mathsf{B}(u,r)]}\left[\sum_{v\in\mathsf{B}(u,r)}a_{v}X_{v}\right]\right)
+\left(\mathbb{E}_{G[\mathsf{B}(w,r)]}\left[\sum_{v\in\mathsf{B}(w,r)}a_{v}X_{v}\,\Bigg|\,X_{w}=\sigma_{w}\right]-\mathbb{E}_{G[\mathsf{B}(w,r)]}\left[\sum_{v\in\mathsf{B}(w,r)}a_{v}X_{v}\right]\right)
=\Phi_{G,\beta,h,a}^{(r)}(u,\sigma_{u})+\Phi_{G,\beta,h,a}^{(r)}(w,\sigma_{w}).

Thus, the local influence decomposes over clusters of vertices close to each other; more specifically, the clusters are the connected components of the induced subgraph $G[\mathsf{B}(S,r)]$. See Lemma 3.7 for a precise statement.

We now present two main propositions for establishing Theorem 2.3. Fix $\Delta\geq 3$, $\delta\in(0,1)$, and $k\in\mathbb{N}^{+}$. In the propositions below, $O(\cdot)=O_{\Delta,\delta,k}(\cdot)$ hides a constant depending on $\Delta,\delta,k$.

We first show that for high-temperature Ising models, the global influence is well-approximated by the local influence for a sufficiently large radius $r\in\mathbb{N}^{+}$.

Proposition 3.2.

Consider an Ising model on a graph $G=(V,E)$ from the family $\mathcal{M}(\Delta,1-\delta)$ and a $1$-bounded weight vector $a\in\mathbb{R}^{V}$. For any $\varepsilon>0$, there exists $r=O(\log(1/\varepsilon))$ such that for all $S\subseteq V$ with $|S|\leq k$ and all $\sigma_{S}\in\{\pm 1\}^{S}$, we have

\left|\Phi_{G,\beta,h,a}(S,\sigma_{S})-\Phi_{G,\beta,h,a}^{(r)}(S,\sigma_{S})\right|\leq\varepsilon.   (3)

Next, we give a linear-time algorithm for approximately maximizing the local influence.

Proposition 3.3.

Consider an Ising model on a graph $G=(V,E)$ from the family $\mathcal{M}(\Delta,1-\delta)$ and a $1$-bounded weight vector $a\in\mathbb{R}^{V}$. For any $\varepsilon>0$ and $r\in\mathbb{N}^{+}$, there exists an algorithm that finds a subset $\hat{S}\subseteq V$ with $|\hat{S}|\leq k$ and a partial assignment $\sigma_{\hat{S}}\in\{\pm 1\}^{\hat{S}}$ such that

\Phi_{G,\beta,h,a}^{(r)}(\hat{S},\sigma_{\hat{S}})\geq\max_{\substack{S\subseteq V,\,|S|\leq k\\ \sigma_{S}\in\{\pm 1\}^{S}}}\left\{\Phi_{G,\beta,h,a}^{(r)}(S,\sigma_{S})\right\}-\varepsilon.   (4)

The running time of the algorithm is $O(n)\cdot(1/\varepsilon)^{O(1)}\cdot e^{O(r)}$.

Theorem 2.3 follows immediately from Propositions 3.2 and 3.3. For ease of notation, we omit $G,\beta,h,a$ from the subscripts in the rest of the paper when clear from context.

Proof of Theorem 2.3.

Define the optimal solutions

(S^{*},\sigma_{S^{*}}) = \operatorname*{arg\,max}_{\substack{S\subseteq V,\,|S|\leq k\\ \sigma_{S}\in\{\pm 1\}^{S}}}\left\{\Phi(S,\sigma_{S})\right\}   (5)

\text{and}\qquad (S^{\dagger},\sigma_{S^{\dagger}}) = \operatorname*{arg\,max}_{\substack{S\subseteq V,\,|S|\leq k\\ \sigma_{S}\in\{\pm 1\}^{S}}}\left\{\Phi^{(r)}(S,\sigma_{S})\right\}.   (6)

Let $r=O(\log(1/\varepsilon))$ be from Proposition 3.2 such that Eq. (3) holds with the error on the right-hand side being $\varepsilon/3$. (Note that we can compute $r$ efficiently by Eqs. (7) and (8) from the proof of Proposition 3.2.) For this $r$, use the algorithm from Proposition 3.3 to find $\hat{S}\subseteq V$ and $\sigma_{\hat{S}}\in\{\pm 1\}^{\hat{S}}$ such that Eq. (4) holds with the error on the right-hand side being $\varepsilon/3$. Thus, we conclude that

\Phi(\hat{S},\sigma_{\hat{S}}) \overset{\text{(3)}}{\geq} \Phi^{(r)}(\hat{S},\sigma_{\hat{S}})-\frac{\varepsilon}{3} \overset{\text{(4)}}{\geq} \Phi^{(r)}(S^{\dagger},\sigma_{S^{\dagger}})-\frac{2\varepsilon}{3} \overset{\text{(6)}}{\geq} \Phi^{(r)}(S^{*},\sigma_{S^{*}})-\frac{2\varepsilon}{3} \overset{\text{(3)}}{\geq} \Phi(S^{*},\sigma_{S^{*}})-\varepsilon

as wanted. The running time of the algorithm is $O(n)\cdot(1/\varepsilon)^{O(1)}\cdot e^{O(r)}=O(n)\cdot(1/\varepsilon)^{O(1)}$. ∎

3.1 Proof of Proposition 3.2

Fix $S\subseteq V$ and $\sigma_{S}\in\{\pm 1\}^{S}$, and define

f(k,\ell) = \mathbb{E}_{G[\mathsf{B}(S,k)]}\left[\sum_{v\in\mathsf{B}(S,\ell)}a_{v}X_{v}\,\Bigg|\,X_{S}=\sigma_{S}\right]-\mathbb{E}_{G[\mathsf{B}(S,k)]}\left[\sum_{v\in\mathsf{B}(S,\ell)}a_{v}X_{v}\right]
= \sum_{v\in\mathsf{B}(S,\ell)}a_{v}\left(\mathbb{E}_{G[\mathsf{B}(S,k)]}\left[X_{v}\mid X_{S}=\sigma_{S}\right]-\mathbb{E}_{G[\mathsf{B}(S,k)]}\left[X_{v}\right]\right).

Define $\mathsf{B}(v,\infty)$ to be the connected component containing $v$ and $\mathsf{B}(S,\infty)=\bigcup_{v\in S}\mathsf{B}(v,\infty)$. Then we have $f(\infty,\infty)=\Phi(S,\sigma_{S})$ and $f(r,r)=\Phi^{(r)}(S,\sigma_{S})$; to see the former, observe that

\Phi(S,\sigma_{S}) = \mathbb{E}_{G}\left[\sum_{v\in V}a_{v}X_{v}\,\Bigg|\,X_{S}=\sigma_{S}\right]-\mathbb{E}_{G}\left[\sum_{v\in V}a_{v}X_{v}\right]
= \mathbb{E}_{G}\left[\sum_{v\in\mathsf{B}(S,\infty)}a_{v}X_{v}\,\Bigg|\,X_{S}=\sigma_{S}\right]-\mathbb{E}_{G}\left[\sum_{v\in\mathsf{B}(S,\infty)}a_{v}X_{v}\right]   ($X_{S}$ and $X_{V\setminus\mathsf{B}(S,\infty)}$ are independent)
= \mathbb{E}_{G[\mathsf{B}(S,\infty)]}\left[\sum_{v\in\mathsf{B}(S,\infty)}a_{v}X_{v}\,\Bigg|\,X_{S}=\sigma_{S}\right]-\mathbb{E}_{G[\mathsf{B}(S,\infty)]}\left[\sum_{v\in\mathsf{B}(S,\infty)}a_{v}X_{v}\right]   ($\mu_{G}=\mu_{G[\mathsf{B}(S,\infty)]}\otimes\mu_{G[V\setminus\mathsf{B}(S,\infty)]}$)
= f(\infty,\infty).

Therefore, it suffices to show that $|f(\infty,\infty)-f(r,r)|\leq\varepsilon$, which follows immediately from the following three lemmas.

Lemma 3.4.

There exists $\rho=O(\log(1/\varepsilon))$ such that $|f(\infty,\infty)-f(\infty,\rho)|\leq\varepsilon/3$.

Proof.

We define

\rho=\left\lceil\frac{1}{\delta}\log\left(\frac{6Ck}{\varepsilon}\right)\right\rceil.   (7)

For simplicity we write $\mathbb{P}=\mathbb{P}_{G[\mathsf{B}(S,\infty)]}$ for the Ising distribution $\mu_{G[\mathsf{B}(S,\infty)]}$, and $\mathbb{E}=\mathbb{E}_{G[\mathsf{B}(S,\infty)]}$ for the expectation over $\mu_{G[\mathsf{B}(S,\infty)]}$. By definition we have that

|f(\infty,\infty)-f(\infty,\rho)|=\left|\sum_{v\in\mathsf{B}(S,\infty)\setminus\mathsf{B}(S,\rho)}a_{v}\left(\mathbb{E}\left[X_{v}\mid X_{S}=\sigma_{S}\right]-\mathbb{E}\left[X_{v}\right]\right)\right|.

Suppose $S=\{v_{1},\dots,v_{k'}\}$ where $k'=|S|\leq k$. For $0\leq i\leq k'$ we define $S_{i}=\{v_{1},\dots,v_{i}\}$ and let $\sigma_{S_{i}}$ be $\sigma_{S}$ restricted to $S_{i}$. Then it follows that

|f(\infty,\infty)-f(\infty,\rho)|
= \left|\sum_{i=1}^{k'}\sum_{v\in\mathsf{B}(S,\infty)\setminus\mathsf{B}(S,\rho)}a_{v}\left(\mathbb{E}\left[X_{v}\mid X_{S_{i}}=\sigma_{S_{i}}\right]-\mathbb{E}\left[X_{v}\mid X_{S_{i-1}}=\sigma_{S_{i-1}}\right]\right)\right|
\overset{\text{(i)}}{\leq} \sum_{i=1}^{k'}\sum_{v\in\mathsf{B}(S,\infty)\setminus\mathsf{B}(S,\rho)}|a_{v}|\cdot\left|\mathbb{E}\left[X_{v}\mid X_{S_{i}}=\sigma_{S_{i}}\right]-\mathbb{E}\left[X_{v}\mid X_{S_{i-1}}=\sigma_{S_{i-1}}\right]\right|
\overset{\text{(ii)}}{\leq} \sum_{i=1}^{k'}\sum_{v\in\mathsf{B}(S,\infty)\setminus\mathsf{B}(S,\rho)}2\left|\mathbb{P}^{\sigma_{S_{i-1}}}(X_{v}=+1\mid X_{v_{i}}=+1)-\mathbb{P}^{\sigma_{S_{i-1}}}(X_{v}=+1\mid X_{v_{i}}=-1)\right|
\overset{\text{(iii)}}{\leq} 2Ck(1-\delta)^{\rho}
\overset{\text{(iv)}}{\leq} \frac{\varepsilon}{3},

where (i) is the triangle inequality, (ii) follows from |av|1|a_{v}|\leq 1 and expanding the expectation, (iii) follows from Total Influence Decay (Lemma 2.1), and (iv) is by our choice of ρ\rho. ∎

Lemma 3.5.

Given $\rho\in\mathbb{N}^{+}$, there exists $\rho<r=O(\rho+\log(1/\varepsilon))$ such that $|f(\infty,\rho)-f(r,\rho)|\leq\varepsilon/3$.

Proof.

We define

r=\rho+\left\lceil\frac{1}{\delta}\left(\log\left(\frac{24C}{\varepsilon}\right)+\rho\log\Delta\right)\right\rceil.   (8)
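The two radii can be computed directly from Eqs. (7) and (8). In the sketch below the constant $C=C(\Delta,\delta)$ from Lemma 2.1 is treated as a given input (an assumption for illustration; the paper does not prescribe its value), and the function names are ours.

```python
import math

def radius_rho(C, k, delta, eps):
    """Eq. (7): rho = ceil((1/delta) * log(6 C k / eps))."""
    return math.ceil(math.log(6 * C * k / eps) / delta)

def radius_r(C, Delta, delta, eps, rho):
    """Eq. (8): r = rho + ceil((1/delta) * (log(24 C / eps) + rho * log(Delta)))."""
    return rho + math.ceil((math.log(24 * C / eps) + rho * math.log(Delta)) / delta)

rho = radius_rho(C=1.0, k=2, delta=0.5, eps=0.1)
r = radius_r(C=1.0, Delta=3, delta=0.5, eps=0.1, rho=rho)
```

Both radii grow only logarithmically in $1/\varepsilon$, which is what makes the $e^{O(r)}$ factor of Proposition 3.3 collapse to $(1/\varepsilon)^{O(1)}$.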

By the triangle inequality and $\lVert a\rVert_{\infty}\leq 1$ we have that

|f(\infty,\rho)-f(r,\rho)|
\leq \sum_{v\in\mathsf{B}(S,\rho)}\left|\mathbb{E}_{G_{\infty}}\left[X_{v}\mid X_{S}=\sigma_{S}\right]-\mathbb{E}_{G_{r}}\left[X_{v}\mid X_{S}=\sigma_{S}\right]\right|+\left|\mathbb{E}_{G_{\infty}}\left[X_{v}\right]-\mathbb{E}_{G_{r}}\left[X_{v}\right]\right|
= \sum_{v\in\mathsf{B}(S,\rho)}2\left|\mathbb{P}_{G_{\infty}}^{\sigma_{S}}\left(X_{v}=+1\right)-\mathbb{P}_{G_{r}}^{\sigma_{S}}\left(X_{v}=+1\right)\right|+2\left|\mathbb{P}_{G_{\infty}}\left(X_{v}=+1\right)-\mathbb{P}_{G_{r}}\left(X_{v}=+1\right)\right|

where $G_{\infty}=G[\mathsf{B}(S,\infty)]$, $G_{r}=G[\mathsf{B}(S,r)]$, $\mathbb{P}_{G_{\infty}}^{\tau}=\mathbb{P}_{\mu_{G_{\infty}}^{\tau}}$, and $\mathbb{P}_{G_{r}}^{\tau}=\mathbb{P}_{\mu_{G_{r}}^{\tau}}$. Let $U=\{u\in V:\mathrm{dist}_{G}(u,S)=r\}$. For any $v\in\mathsf{B}(S,\rho)$, we can couple $X_{v}\sim\mathbb{P}_{G_{\infty}}(X_{v}=\cdot)$ and $X'_{v}\sim\mathbb{P}_{G_{r}}(X'_{v}=\cdot)$ by first revealing the spin assignments $X_{U}\sim\mathbb{P}_{G_{\infty}}(X_{U}=\cdot)$ and $X'_{U}\sim\mathbb{P}_{G_{r}}(X'_{U}=\cdot)$ on $U$ independently, and then coupling $X_{v},X'_{v}$ optimally conditioned on $X_{U},X'_{U}$ respectively. Therefore, we deduce that

\left|\mathbb{P}_{G_{\infty}}^{\sigma_{S}}\left(X_{v}=+1\right)-\mathbb{P}_{G_{r}}^{\sigma_{S}}\left(X_{v}=+1\right)\right| \leq \max_{\sigma_{U},\tau_{U}\in\{\pm 1\}^{U}}\left|\mathbb{P}_{G}^{\sigma_{S},\sigma_{U}}\left(X_{v}=+1\right)-\mathbb{P}_{G}^{\sigma_{S},\tau_{U}}\left(X_{v}=+1\right)\right| \leq C(1-\delta)^{r-\rho},

where the first inequality follows from the coupling procedure and the fact that

\mathbb{P}_{G_{\infty}}^{\sigma_{S},\sigma_{U}}(X_{v}=\cdot)=\mathbb{P}_{G}^{\sigma_{S},\sigma_{U}}(X_{v}=\cdot)=\mathbb{P}_{G_{r}}^{\sigma_{S},\sigma_{U}}(X_{v}=\cdot),

and the second inequality follows from Strong Spatial Mixing (Lemma 2.1) and $\mathrm{dist}_{G}(v,U)\geq\mathrm{dist}_{G}(S,U)-\mathrm{dist}_{G}(v,S)\geq r-\rho$. Similarly, we also have

\left|\mathbb{P}_{G_{\infty}}\left(X_{v}=+1\right)-\mathbb{P}_{G_{r}}\left(X_{v}=+1\right)\right|\leq C(1-\delta)^{r-\rho}.

Hence, combining everything above with $|\mathsf{B}(S,\rho)|\leq 2\Delta^{\rho}$ we get

|f(\infty,\rho)-f(r,\rho)|\leq 2\Delta^{\rho}\cdot 4C(1-\delta)^{r-\rho}\leq\frac{\varepsilon}{3},

as wanted. ∎

Lemma 3.6.

For $\rho\in\mathbb{N}^{+}$ as in Eq. (7) and any integer $r\geq\rho$, we have $|f(r,\rho)-f(r,r)|\leq\varepsilon/3$.

Proof.

The proof is exactly the same as that of Lemma 3.4; one only needs to replace $G[\mathsf{B}(S,\infty)]$ with $G[\mathsf{B}(S,r)]$, noting that Total Influence Decay still holds on any subgraph. ∎

Proof of Proposition 3.2.

For $\rho,r$ given in Eqs. (7) and (8), we obtain from the triangle inequality and Lemmas 3.4, 3.5, and 3.6 that

|f(\infty,\infty)-f(r,r)|\leq|f(\infty,\infty)-f(\infty,\rho)|+|f(\infty,\rho)-f(r,\rho)|+|f(r,\rho)-f(r,r)|\leq\varepsilon,

as claimed. ∎

3.2 Proof of Proposition 3.3

For a graph $G$, let $\mathsf{cc}(G)$ denote the set of all connected components of $G$, where each connected component is viewed as a subset of vertices. The following decomposition lemma is easy to verify.

Lemma 3.7.

For any $r\in\mathbb{N}$, $S\subseteq V$, and $\sigma_{S}\in\{\pm 1\}^{S}$, we have

\Phi^{(r)}(S,\sigma_{S})=\sum_{T\in\mathsf{cc}(G^{\leq 2r+1}[S])}\Phi^{(r)}(T,\sigma_{T}).

Proof.

This follows from the fact that

\mu_{G[\mathsf{B}(S,r)]}=\bigotimes_{T\in\mathsf{cc}(G^{\leq 2r+1}[S])}\mu_{G[\mathsf{B}(T,r)]},

and the same holds for the conditional distribution given a partial assignment $\sigma_{S}$ on $S$. ∎
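The clustering used in Lemma 3.7 can be sketched as follows: group the seeds in $S$ into connected components of $G^{\leq 2r+1}[S]$, i.e., chain together seeds whose pairwise graph distance is at most $2r+1$. This is our own toy code with hypothetical helper names, using BFS distances (quadratic in $|S|$, not the paper's efficient construction).

```python
from collections import deque

def graph_dist(adj, u, v):
    """BFS graph distance between u and v (inf if disconnected)."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return dist[x]
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return float("inf")

def clusters(adj, S, r):
    """Connected components of G^{<= 2r+1}[S], as a list of vertex sets."""
    S = sorted(S)
    seen = set()
    comps = []
    for u in S:
        if u in seen:
            continue
        comp = {u}
        stack = [u]
        seen.add(u)
        while stack:
            x = stack.pop()
            for y in S:
                if y not in seen and graph_dist(adj, x, y) <= 2 * r + 1:
                    seen.add(y)
                    comp.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps

# Path on 8 vertices; with r = 1 seeds chain through gaps of at most 3.
path = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 7} for i in range(8)}
```

Two seeds land in the same cluster exactly when their radius-$r$ balls can overlap or touch through a common edge, which is what makes the per-cluster influences independent.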

Hence, it suffices to consider the local influences of subsets of vertices that are connected in $G^{\leq 2r+1}$. Our algorithm is given below.

Outline of the Algorithm

  1. Step 1.

    Construct a graph H=(VH,EH)H=(V_{H},E_{H}) as follows.

    1. (1a)

      The vertex set VHV_{H} consists of all non-empty subsets TVT\subseteq V of vertices of size at most kk such that G2r+1[T]G^{\leq 2r+1}[T] is connected.

    2. (1b)

      Two distinct subsets T1,T2T_{1},T_{2} are adjacent iff G2r+1[T1T2]G^{\leq 2r+1}[T_{1}\cup T_{2}] is connected; equivalently, T1,T2T_{1},T_{2} are non-adjacent iff distG(T1,T2)>2r+1\mathrm{dist}_{G}(T_{1},T_{2})>2r+1.

    \vartriangleright (Lemma 3.8) We can construct HH in O(n)eO(r)O(n)\cdot e^{O(r)} time.

  2. Step 2.

    Each vertex TVHT\in V_{H} is assigned an integral cost cT+c_{T}\in\mathbb{N}^{+}, a real weight wTw_{T}\in\mathbb{R}, and a partial assignment ξT{±1}T\xi_{T}\in\{\pm 1\}^{T} as follows.

    1. (2a)

      The cost of TT is its size; i.e., cT=|T|c_{T}=|T|.

    2. (2b)

      For every σT{±1}T\sigma_{T}\in\{\pm 1\}^{T}, compute ψT(σT)\psi_{T}(\sigma_{T}) such that

      |ψT(σT)Φ(r)(T,σT)|ε2k.\displaystyle\left|\psi_{T}(\sigma_{T})-\Phi^{(r)}(T,\sigma_{T})\right|\leq\frac{\varepsilon}{2k}. (9)

      The weight of TT is the maximum value of ψT(σT)\psi_{T}(\sigma_{T}) and the associated partial assignment is the maximizer:

      ξT\displaystyle\xi_{T} =argmaxσT{±1}TψT(σT)\displaystyle=\operatorname*{arg\,max}_{\sigma_{T}\in\{\pm 1\}^{T}}\psi_{T}(\sigma_{T}) (10)
      andwT\displaystyle\text{and}\qquad w_{T} =ψT(ξT)=maxσT{±1}TψT(σT).\displaystyle=\psi_{T}(\xi_{T})=\max_{\sigma_{T}\in\{\pm 1\}^{T}}\psi_{T}(\sigma_{T}). (11)

    \vartriangleright (Lemma 3.9) For each TVHT\in V_{H}, we can compute cTc_{T}, wTw_{T}, and ξT\xi_{T} in (1/ε)O(1)eO(r)(1/\varepsilon)^{O(1)}\cdot e^{O(r)} time.

  3. Step 3.

    Given the graph HH, costs {cT}TVH\{c_{T}\}_{T\in V_{H}}, and weights {wT}TVH\{w_{T}\}_{T\in V_{H}}, find a maximum weighted independent set II^{*} of HH with total cost at most kk; namely,

    max\displaystyle\max\quad TIwT\displaystyle\sum_{T\in I}w_{T} (12)
    s.t.\displaystyle\mathrm{s.t.}\quad I is an independent set of H;\displaystyle\text{$I$ is an independent set of $H$};
    TIcTk.\displaystyle\sum_{T\in I}c_{T}\leq k.

    \vartriangleright (Lemma 3.10) We can find II^{*} in O(n)eO(r)O(n)\cdot e^{O(r)} time.

  4. Step 4.

    Output

    S^=TITandσS^=(ξT)TI.\hat{S}=\bigcup_{T\in I^{*}}T\qquad\text{and}\qquad\sigma_{\hat{S}}=(\xi_{T})_{T\in I^{*}}.

    \vartriangleright (Lemma 3.11) We have Φ(r)(S^,σS^)Φ(r)(S,σS)ε\Phi^{(r)}(\hat{S},\sigma_{\hat{S}})\geq\Phi^{(r)}(S^{\dagger},\sigma_{S^{\dagger}})-\varepsilon as desired.

We show the correctness of our algorithm and analyze its running time in the following sequence of lemmas. Throughout, we assume that GG has maximum degree at most Δ3\Delta\geq 3, the Ising model on GG is from the family (Δ,1δ)\mathcal{M}(\Delta,1-\delta), and k+k\in\mathbb{N}^{+} is fixed.

Both Steps 1 and 2 can be completed in O(n)(1/ε)O(1)eO(r)O(n)\cdot(1/\varepsilon)^{O(1)}\cdot e^{O(r)} time.

Lemma 3.8 (Step 1).

The graph HH has N=O(n)eO(r)N=O(n)\cdot e^{O(r)} vertices and maximum degree D=eO(r)D=e^{O(r)}. Furthermore, one can construct HH in O(n)eO(r)O(n)\cdot e^{O(r)} time.

Proof.

Follows from results in [PR17, BCKL13] for counting and enumerating bounded-size connected induced subgraphs in a bounded-degree graph. ∎
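For intuition, here is a simple Python sketch of the enumeration underlying Step 1 (our own illustration under stated assumptions; it trades the refined per-subset running time of [PR17, BCKL13] for clarity). Applied to the adjacency lists of the power graph G^{\leq 2r+1}, its output is exactly the vertex set V_H:

```python
def connected_subsets(adj, k):
    """Enumerate every vertex subset T with |T| <= k whose induced
    subgraph is connected, by growing subsets along their neighborhood
    boundary and deduplicating with a set. [PR17, BCKL13] give the
    refined enumeration behind the O(n) * e^{O(r)} bound; this sketch
    favors clarity over that running time."""
    found = set()
    stack = [frozenset([v]) for v in adj]
    while stack:
        T = stack.pop()
        if T in found:
            continue
        found.add(T)
        if len(T) < k:
            # any larger connected superset extends T through its boundary
            boundary = {w for u in T for w in adj[u]} - T
            for w in boundary:
                stack.append(T | {w})
    return found
```

Every connected subset is reachable by repeatedly growing from any one of its vertices, so the enumeration is exhaustive; in a bounded-degree graph each subset has a bounded boundary, which is what keeps the count N=O(n)eO(r)N=O(n)\cdot e^{O(r)} in Lemma 3.8.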

Lemma 3.9 (Step 2).

For each TVHT\in V_{H}, its cost cTc_{T}, weight wTw_{T}, and partial assignment ξT\xi_{T} can be computed in eO(r)(1/ε)O(1)e^{O(r)}\cdot(1/\varepsilon)^{O(1)} time.

Proof.

The cost cTc_{T} is trivial. For the weight wTw_{T}, we first compute ψT(σT)\psi_{T}(\sigma_{T}) which approximates Φ(r)(T,σT)\Phi^{(r)}(T,\sigma_{T}), for all choices of σT{±1}T\sigma_{T}\in\{\pm 1\}^{T}. The approximation of Φ(r)(T,σT)\Phi^{(r)}(T,\sigma_{T}) follows from [Wei06, SST14, Bar16, PR17] which present deterministic approximate counting algorithms (𝖥𝖯𝖳𝖠𝖲\mathsf{FPTAS}) for high-temperature Ising models. More specifically, observe that by definition Φ(r)(T,σT)\Phi^{(r)}(T,\sigma_{T}) is a linear combination of marginal probabilities at each vertex in 𝖡(T,r)\mathsf{B}(T,r) either with or without the pinning σT\sigma_{T}. Thus, one can estimate all such marginals within an additive error ε=ε4k|𝖡(T,r)|\varepsilon^{\prime}=\frac{\varepsilon}{4k|\mathsf{B}(T,r)|}, and then obtain ψT(σT)\psi_{T}(\sigma_{T}) from these estimates such that

|ψT(σT)Φ(r)(T,σT)|2|𝖡(T,r)|ε=ε2k.\left|\psi_{T}(\sigma_{T})-\Phi^{(r)}(T,\sigma_{T})\right|\leq 2|\mathsf{B}(T,r)|\cdot\varepsilon^{\prime}=\frac{\varepsilon}{2k}.

The running time of this is

|𝖡(T,r)|O(1)(1/ε)O(1)=eO(r)(1/ε)O(1).|\mathsf{B}(T,r)|^{O(1)}\cdot(1/\varepsilon^{\prime})^{O(1)}=e^{O(r)}\cdot(1/\varepsilon)^{O(1)}.

Given ψT(σT)\psi_{T}(\sigma_{T}) for all σT{±1}T\sigma_{T}\in\{\pm 1\}^{T}, we can then find ξT\xi_{T} and wTw_{T}. Note that the number of choices of σT\sigma_{T} is at most 2k2^{k} which is O(1)O(1). ∎

The algorithm for Step 3 is given by the following lemma.

Lemma 3.10 (Step 3).

Let H=(V,E)H=(V,E) be an NN-vertex graph of maximum degree at most D3D\geq 3. Suppose every vertex TVT\in V is assigned an integral cost cT+c_{T}\in\mathbb{N}^{+} and a real weight wTw_{T}\in\mathbb{R}. Then for any fixed k+k\in\mathbb{N}^{+} with k=O(1)k=O(1), there exists an algorithm that finds a maximum weighted independent set II of HH with total cost at most kk in time O(DN)+DO(1)O(DN)+D^{O(1)}.

Proof.

For 1ik1\leq i\leq k, define V(i)={TV:cT=i}V^{(i)}=\{T\in V:c_{T}=i\} to be the set of all vertices of cost ii. Let U(i)={T1(i),,Tti(i)}V(i)U^{(i)}=\{T^{(i)}_{1},\dots,T^{(i)}_{t_{i}}\}\subseteq V^{(i)} be the tit_{i} vertices of largest weight from V(i)V^{(i)} (breaking ties arbitrarily), where ti=|U(i)|=min{k(D+1),|V(i)|}t_{i}=|U^{(i)}|=\min\{k(D+1),|V^{(i)}|\}. Finally, let U=i=1kU(i)U=\bigcup_{i=1}^{k}U^{(i)}. Observe that UU can be found in O(DN)O(DN) time.

We claim that there exists a maximum weighted independent set II^{*} with total cost at most kk such that II^{*} is completely contained in UU. To prove the claim, define II^{*} to be a maximum weighted independent set with total cost at most kk that contains the most vertices of UU; it suffices to prove IUI^{*}\subseteq U. Suppose for the sake of contradiction that IUI^{*}\not\subseteq U. Take any TIUT\in I^{*}\setminus U, and assume that TV(i)T\in V^{(i)} for some ii. Since TU=j=1kU(j)T\not\in U=\bigcup_{j=1}^{k}U^{(j)}, we have TV(i)U(i)T\in V^{(i)}\setminus U^{(i)}, so |V(i)|>|U(i)|=k(D+1)|V^{(i)}|>|U^{(i)}|=k(D+1). We say a vertex T1T_{1} blocks a vertex T2T_{2} if either T1=T2T_{1}=T_{2} or T1,T2T_{1},T_{2} are adjacent; thus, every vertex blocks at most D+1D+1 vertices. Since II^{*} has total cost at most kk and every cost is a positive integer, we have |I{T}|k1|I^{*}\setminus\{T\}|\leq k-1, so the vertices in I{T}I^{*}\setminus\{T\} block at most (D+1)(k1)|U(i)|1(D+1)(k-1)\leq|U^{(i)}|-1 vertices altogether. Hence, there exists a vertex TU(i)T^{\prime}\in U^{(i)} which is not blocked by I{T}I^{*}\setminus\{T\}. In particular, I=I{T}{T}I^{\prime}=I^{*}\setminus\{T\}\cup\{T^{\prime}\} is an independent set with the same cost (cT=i=cTc_{T^{\prime}}=i=c_{T}) and no smaller weight (wTwTw_{T^{\prime}}\geq w_{T} by the definition of U(i)U^{(i)}), while containing one more vertex of UU than II^{*}. This is a contradiction.

Given the claim, one only needs to enumerate all subsets of UU of size at most kk to find the maximum weighted independent set with cost constraint kk. Since |U|=O(D)|U|=O(D), this can be done in DO(1)D^{O(1)} time, finishing the proof. ∎
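The prune-then-enumerate argument above can be sketched directly in Python (a minimal illustration under our own naming conventions; the interface is not from the paper):

```python
from itertools import combinations

def max_weight_is_with_budget(vertices, edges, cost, weight, k, D):
    """Maximum-weight independent set of total cost <= k, for constant k.
    Following Lemma 3.10: within each cost class i, only the k*(D+1)
    heaviest vertices can matter (the pruning claim in the proof), so we
    keep those and brute-force over subsets of the pruned set U."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    U = []
    for i in range(1, k + 1):  # a vertex of cost > k can never be used
        cls = sorted((v for v in vertices if cost[v] == i),
                     key=lambda v: weight[v], reverse=True)
        U.extend(cls[:k * (D + 1)])
    best, best_w = frozenset(), 0.0
    for s in range(1, k + 1):
        for I in combinations(U, s):
            if sum(cost[v] for v in I) > k:
                continue
            if any(u in adj[v] for u, v in combinations(I, 2)):
                continue
            w = sum(weight[v] for v in I)
            if w > best_w:
                best, best_w = frozenset(I), w
    return best, best_w
```

Since |U|=O(D)|U|=O(D) and k=O(1)k=O(1), the enumeration over subsets of UU of size at most kk takes DO(1)D^{O(1)} time, matching the lemma.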

Finally, we show the correctness of our algorithm.

Lemma 3.11 (Step 4).

Let (S,σS)(S^{\dagger},\sigma_{S^{\dagger}}) be the maximizer for the local influence defined in Eq. 6. We have that

Φ(r)(S^,σS^)Φ(r)(S,σS)ε.\Phi^{(r)}(\hat{S},\sigma_{\hat{S}})\geq\Phi^{(r)}(S^{\dagger},\sigma_{S^{\dagger}})-\varepsilon.
Proof.

By Lemma 3.7, the optimal solution (S,σS)(S^{\dagger},\sigma_{S^{\dagger}}) corresponds to an independent set II^{\dagger} of HH such that

S=TITandσS=(ηT)TIwhereηT=argmaxσT{±1}T{Φ(r)(T,σT)}.S^{\dagger}=\bigcup_{T\in I^{\dagger}}T\qquad\text{and}\qquad\sigma_{S^{\dagger}}=\left(\eta_{T}\right)_{T\in I^{\dagger}}~{}\text{where}~{}\eta_{T}=\operatorname*{arg\,max}_{\sigma_{T}\in\{\pm 1\}^{T}}\left\{\Phi^{(r)}(T,\sigma_{T})\right\}.

We then deduce that

Φ(r)(S^,σS^)\displaystyle\Phi^{(r)}(\hat{S},\sigma_{\hat{S}}) =Lem 3.7TIΦ(r)(T,ξT)Eq. 9TIψT(ξT)ε2=Eq. 11TIwTε2\displaystyle\overset{\text{Lem \ref{lem:decomp}}}{=}\sum_{T\in I^{*}}\Phi^{(r)}(T,\xi_{T})\overset{\text{\lx@cref{creftype~refnum}{eq:psi-approx}}}{\geq}\sum_{T\in I^{*}}\psi_{T}(\xi_{T})-\frac{\varepsilon}{2}\overset{\text{\lx@cref{creftype~refnum}{eq:wT}}}{=}\sum_{T\in I^{*}}w_{T}-\frac{\varepsilon}{2}
Eq. 12TIwTε2=Eq. 11TIψT(ξT)ε2Eq. 10TIψT(ηT)ε2\displaystyle\overset{\text{\lx@cref{creftype~refnum}{eq:I*-max}}}{\geq}\sum_{T\in I^{\dagger}}w_{T}-\frac{\varepsilon}{2}\overset{\text{\lx@cref{creftype~refnum}{eq:wT}}}{=}\sum_{T\in I^{\dagger}}\psi_{T}(\xi_{T})-\frac{\varepsilon}{2}\overset{\text{\lx@cref{creftype~refnum}{eq:xiT}}}{\geq}\sum_{T\in I^{\dagger}}\psi_{T}(\eta_{T})-\frac{\varepsilon}{2}
Eq. 9TIΦ(r)(T,ηT)ε=Lem 3.7Φ(r)(S,σS)ε,\displaystyle\overset{\text{\lx@cref{creftype~refnum}{eq:psi-approx}}}{\geq}\sum_{T\in I^{\dagger}}\Phi^{(r)}(T,\eta_{T})-\varepsilon\overset{\text{Lem \ref{lem:decomp}}}{=}\Phi^{(r)}(S^{\dagger},\sigma_{S^{\dagger}})-\varepsilon,

as claimed. ∎

4 Hardness Result

We establish computational hardness of kk-Inf-Max for low-temperature Ising models from the hardness of estimating the marginal probabilities of single vertices, which is a direct consequence of hardness of approximate counting [Sly10, SS14, GŠV16] and self-reducibility.

Theorem 4.1 ([Sly10, SS14, GŠV16]).

Suppose Δ3\Delta\geq 3 is an integer and δ>0\delta>0 is a real number. Assuming 𝖱𝖯𝖭𝖯\mathsf{RP}\neq\mathsf{NP}, there is no 𝖥𝖯𝖱𝖠𝖲\mathsf{FPRAS} for the following problem: Given an Ising model on a graph G=(V,E)G=(V,E) from the family (Δ,1+δ)\mathcal{M}(\Delta,1+\delta) and a vertex vVv\in V, estimate (Xv=+1)\mathbb{P}(X_{v}=+1).

Proof.

Given an 𝖥𝖯𝖱𝖠𝖲\mathsf{FPRAS} for estimating marginals, one can approximate the partition function efficiently. More specifically, write V={v1,,vn}V=\{v_{1},\dots,v_{n}\}; then

(X1=+1,,Xn=+1)=i=1n(Xi=+1X1=+1,,Xi1=+1).\mathbb{P}(X_{1}=+1,\dots,X_{n}=+1)=\prod_{i=1}^{n}\mathbb{P}(X_{i}=+1\mid X_{1}=+1,\dots,X_{i-1}=+1).

Each (Xi=+1X1=+1,,Xi1=+1)\mathbb{P}(X_{i}=+1\mid X_{1}=+1,\dots,X_{i-1}=+1) corresponds to the marginal at viv_{i} in an Ising model on the subgraph induced by {vi,,vn}\{v_{i},\dots,v_{n}\} where the pinning on {v1,,vi1}\{v_{1},\dots,v_{i-1}\} becomes external fields. Thus, we can approximate (X1=+1,,Xn=+1)\mathbb{P}(X_{1}=+1,\dots,X_{n}=+1) and hence the partition function via Eq. 2. We therefore deduce the theorem from the hardness results for computing the partition function in low-temperature Ising models [Sly10, SS14, GŠV16]. ∎
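The telescoping identity is easy to verify numerically. The sketch below (our own illustration; a brute-force conditional-marginal routine on a toy model stands in for the marginal oracle that an FPRAS would supply on large instances) reconstructs the joint probability of the all-plus assignment from conditional marginals:

```python
from itertools import product
import math

def ising_weight(beta, h, x):
    """Unnormalized weight exp(sum_{uv} beta_uv x_u x_v + sum_v h_v x_v)."""
    s = sum(b * x[u] * x[v] for (u, v), b in beta.items())
    s += sum(hv * xv for hv, xv in zip(h, x))
    return math.exp(s)

def cond_marginal_plus(beta, h, n, i, pinned):
    """P(X_i = +1 | X_j = +1 for all j in pinned), by brute force on a
    tiny model; this plays the role of the marginal oracle."""
    num = den = 0.0
    for x in product([-1, 1], repeat=n):
        if any(x[j] != 1 for j in pinned):
            continue
        w = ising_weight(beta, h, x)
        den += w
        if x[i] == 1:
            num += w
    return num / den

def joint_all_plus_via_chain_rule(beta, h, n):
    """P(X_1 = ... = X_n = +1) as the telescoping product of conditional
    marginals, as in the proof of Theorem 4.1."""
    return math.prod(cond_marginal_plus(beta, h, n, i, range(i))
                     for i in range(n))
```

Since the pinned vertices only contribute external fields to the remaining ones, each factor is itself a single-vertex marginal of a smaller Ising model, which is the self-reducibility step in the proof.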

We now give the proof of Theorem 2.4.

Proof of Theorem 2.4.

We may assume without loss of generality that δ1\delta\leq 1. Given a polynomial-time algorithm for kk-Inf-Max for the family (Δ,1+δ)\mathcal{M}(\Delta,1+\delta) and 11-bounded vertex weights, we show how to efficiently estimate (Xv=+1)\mathbb{P}(X_{v}=+1) for an Ising model on a graph G=(V,E)G=(V,E) from the family (Δ,1+δ)\mathcal{M}(\Delta,1+\delta) and a vertex vVv\in V.

Define a graph GG^{\prime} which is the disjoint union of GG and kk distinct isolated vertices u1,,uku_{1},\dots,u_{k}. Each uiu_{i} has the same external field h(ui)=xh(u_{i})=x, where xx is a value we can choose freely. Together with βE\beta\in\mathbb{R}^{E} and hVh\in\mathbb{R}^{V} this defines an Ising model on GG^{\prime} which is still in the family (Δ,1+δ)\mathcal{M}(\Delta,1+\delta). Let aV{u1,,uk}a\in\mathbb{R}^{V\cup\{u_{1},\dots,u_{k}\}} be a 11-bounded vertex weight vector defined by a(v)=1a(v)=1, a(ui)=1a(u_{i})=1 for i=1,,ki=1,\dots,k, and a(u)=0a(u)=0 for all other vertices.

Consider the kk-Inf-Max problem for the Ising model on GG^{\prime} and the weight vector aa. Let U={u1,,uk}U=\{u_{1},\dots,u_{k}\} and W={v,u1,,uk1}W=\{v,u_{1},\dots,u_{k-1}\}. For a subset SS of vertices, let +S{±1}S\boldsymbol{+}_{S}\in\{\pm 1\}^{S} denote the partial assignment that assigns ++ to all vertices in SS. We claim that

maxSVU,|S|kσS{±1}S{Φ(S,σS)}\displaystyle\max_{\begin{subarray}{c}S\subseteq V\cup U,\,|S|\leq k\\ \sigma_{S}\in\{\pm 1\}^{S}\end{subarray}}\left\{\Phi(S,\sigma_{S})\right\} =max{Φ(U,+U),Φ(W,+W)}.\displaystyle=\max\left\{\Phi(U,\boldsymbol{+}_{U}),\Phi(W,\boldsymbol{+}_{W})\right\}. (13)

To see this, consider a feasible pair (S,σS)(S,\sigma_{S}). If S=US=U, then

Φ(U,σU)=i=1k(σui𝔼[Xui])i=1k(1tanhx)=k(1tanhx)=Φ(U,+U).\displaystyle\Phi(U,\sigma_{U})=\sum_{i=1}^{k}(\sigma_{u_{i}}-\mathbb{E}[X_{u_{i}}])\leq\sum_{i=1}^{k}(1-\tanh x)=k(1-\tanh x)=\Phi(U,\boldsymbol{+}_{U}). (14)

If SUS\neq U, then without loss of generality suppose SU={u1,,uj}S\cap U=\{u_{1},\dots,u_{j}\} where jk1j\leq k-1 and we have

Φ(S,σS)\displaystyle\Phi(S,\sigma_{S}) =𝔼[XvXSU=σSU]𝔼[Xv]+i=1j(σui𝔼[Xui])\displaystyle=\mathbb{E}[X_{v}\mid X_{S\setminus U}=\sigma_{S\setminus U}]-\mathbb{E}[X_{v}]+\sum_{i=1}^{j}(\sigma_{u_{i}}-\mathbb{E}[X_{u_{i}}])
1𝔼[Xv]+(k1)(1tanhx)=Φ(W,+W).\displaystyle\leq 1-\mathbb{E}[X_{v}]+(k-1)(1-\tanh x)=\Phi(W,\boldsymbol{+}_{W}). (15)

Therefore, Eq. 13 follows from Eqs. 14 and 15.

Suppose the provided algorithm returns (S^,σS^)(\hat{S},\sigma_{\hat{S}}) which satisfies

Φ(S^,σS^)maxSVU,|S|kσS{±1}S{Φ(S,σS)}ε=max{Φ(U,+U),Φ(W,+W)}ε.\Phi(\hat{S},\sigma_{\hat{S}})\geq\max_{\begin{subarray}{c}S\subseteq V\cup U,\,|S|\leq k\\ \sigma_{S}\in\{\pm 1\}^{S}\end{subarray}}\left\{\Phi(S,\sigma_{S})\right\}-\varepsilon=\max\left\{\Phi(U,\boldsymbol{+}_{U}),\Phi(W,\boldsymbol{+}_{W})\right\}-\varepsilon.

If S^=U\hat{S}=U, then we deduce from Eq. 14 that

Φ(U,+U)Φ(S^,σS^)Φ(W,+W)ε,\Phi(U,\boldsymbol{+}_{U})\geq\Phi(\hat{S},\sigma_{\hat{S}})\geq\Phi(W,\boldsymbol{+}_{W})-\varepsilon,

implying 𝔼[Xv]tanhxε\mathbb{E}[X_{v}]\geq\tanh x-\varepsilon. If S^U\hat{S}\neq U, then we deduce from Eq. 15 that

Φ(W,+W)Φ(S^,σS^)Φ(U,+U)ε,\Phi(W,\boldsymbol{+}_{W})\geq\Phi(\hat{S},\sigma_{\hat{S}})\geq\Phi(U,\boldsymbol{+}_{U})-\varepsilon,

implying 𝔼[Xv]tanhx+ε\mathbb{E}[X_{v}]\leq\tanh x+\varepsilon. Thus, by binary searching over tanhx\tanh x we can estimate 𝔼[Xv]\mathbb{E}[X_{v}] efficiently within additive error ε\varepsilon with high probability. This yields an estimator for (Xv=+1)\mathbb{P}(X_{v}=+1) with multiplicative error ε\varepsilon, since (Xv=+1)\mathbb{P}(X_{v}=+1) is bounded away from zero when δ1\delta\leq 1. ∎
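The binary search in the reduction can be sketched as follows (our own illustration: `decide_U_wins` abstracts one call to the kk-Inf-Max algorithm, and the toy oracle at the end, with a hypothetical true mean of 0.37, is exact rather than ε\varepsilon-approximate; with the approximate algorithm one stops once the bracket has width O(ε)O(\varepsilon)):

```python
def estimate_mean_via_infmax(decide_U_wins, eps):
    """Binary search for E[X_v], following the reduction in the proof of
    Theorem 2.4: on G' (G plus k isolated vertices with field x), pinning
    U = {u_1, ..., u_k} to + beats pinning W = {v, u_1, ..., u_{k-1}}
    to + exactly when E[X_v] >= tanh x. `decide_U_wins(m)` abstracts one
    call to the k-Inf-Max algorithm with m = tanh x."""
    lo, hi = -1.0, 1.0
    while hi - lo > eps:
        m = (lo + hi) / 2
        if decide_U_wins(m):   # algorithm returned S_hat = U
            lo = m
        else:                  # algorithm returned S_hat != U
            hi = m
    return (lo + hi) / 2

# Toy stand-in for the comparison: with an exact solver,
# Phi(U, +_U) = k(1 - m)  >=  Phi(W, +_W) = 1 - E[X_v] + (k-1)(1 - m)
# holds iff E[X_v] >= m. Hypothetical true mean 0.37 for illustration.
toy_oracle = lambda m: 0.37 >= m
```

Each oracle call costs one run of the kk-Inf-Max algorithm, and O(log(1/ε))O(\log(1/\varepsilon)) calls pin down 𝔼[Xv]\mathbb{E}[X_{v}] to additive accuracy ε\varepsilon.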

References

  • [ALO20] Nima Anari, Kuikui Liu, and Shayan Oveis Gharan. Spectral independence in high-dimensional expanders and applications to the hardcore model. In Proceedings of the 61st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 1319–1330, 2020.
  • [APB10] Kristine Eia S Antonio, Chrysline Margus N Pinol, and Ronald S Banzon. An Ising model approach to malware epidemiology. arXiv preprint arXiv:1007.4938, 2010.
  • [Bar16] Alexander Barvinok. Combinatorics and Complexity of Partition Functions, volume 30. Springer Algorithms and Combinatorics, 2016.
  • [BCKL13] Christian Borgs, Jennifer Chayes, Jeff Kahn, and László Lovász. Left and right convergence of graphs with bounded degree. Random Structures & Algorithms, 42(1):1–28, 2013.
  • [BKM19] Guy Bresler, Frederic Koehler, and Ankur Moitra. Learning restricted Boltzmann machines via influence maximization. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 828–839, 2019.
  • [CLV20] Zongchen Chen, Kuikui Liu, and Eric Vigoda. Rapid mixing of Glauber dynamics up to uniqueness via contraction. In Proceedings of the 61st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 1307–1318, 2020.
  • [CMMS23] Byron Chin, Ankur Moitra, Elchanan Mossel, and Colin Sandon. The power of an adversary in Glauber dynamics. arXiv preprint arXiv:2302.10841, 2023.
  • [GŠV16] Andreas Galanis, Daniel Štefankovič, and Eric Vigoda. Inapproximability of the partition function for the antiferromagnetic Ising and hard-core models. Combinatorics, Probability and Computing, 25(4):500–559, 2016.
  • [KKT03] David Kempe, Jon Kleinberg, and Éva Tardos. Maximizing the spread of influence through a social network. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 137–146, 2003.
  • [KKT05] David Kempe, Jon Kleinberg, and Éva Tardos. Influential nodes in a diffusion model for social networks. In Proceedings of the 32nd International Colloquium on Automata, Languages and Programming (ICALP), pages 1127–1138, 2005.
  • [Lip22] Adam Lipowski. Ising model: Recent developments and exotic applications. Entropy, 24(12):1834, 2022.
  • [LLY13] Liang Li, Pinyan Lu, and Yitong Yin. Correlation decay up to uniqueness in spin systems. In Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 67–84, 2013.
  • [LMLA19] Cristina Gabriela Aguilar Lara, Eduardo Massad, Luis Fernandez Lopez, and Marcos Amaku. Analogy between the formulation of the Ising-Glauber model and the SI epidemiological model. Journal of Applied Mathematics and Physics, 7(05):1052, 2019.
  • [LYS10] Shihuan Liu, Lei Ying, and Srinivas Shakkottai. Influence maximization in social networks: An Ising-model-based approach. In Proceedings of the 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 570–576. IEEE, 2010.
  • [MLO01] Jacek Majewski, Hao Li, and Jurg Ott. The Ising model in physics and statistical genetics. The American Journal of Human Genetics, 69(4):853–862, 2001.
  • [MR10] Elchanan Mossel and Sébastien Roch. Submodularity of influence in social networks: From local to global. SIAM Journal on Computing, 39(6):2176–2188, 2010.
  • [MS10] Andrea Montanari and Amin Saberi. The spread of innovations in social networks. Proceedings of the National Academy of Sciences, 107(47):20196–20201, 2010.
  • [PR17] Viresh Patel and Guus Regts. Deterministic polynomial-time approximation algorithms for partition functions and graph polynomials. SIAM Journal on Computing, 46(6):1893–1919, 2017.
  • [Sly10] Allan Sly. Computational transition at the uniqueness threshold. In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 287–296, 2010.
  • [SS14] Allan Sly and Nike Sun. The computational hardness of counting in two-spin models on dd-regular graphs. The Annals of Probability, 42(6):2383–2416, 2014.
  • [SST14] Alistair Sinclair, Piyush Srivastava, and Marc Thurley. Approximation algorithms for two-state anti-ferromagnetic spin systems on bounded degree graphs. Journal of Statistical Physics, 155(4):666–686, 2014.
  • [Wei06] Dror Weitz. Counting independent sets up to the tree threshold. In Proceedings of the 38th Annual ACM Symposium on Theory of Computing (STOC), pages 140–149, 2006.