
Influence Maximization in Ising Models

Zongchen Chen Department of Computer Science and Engineering, University at Buffalo, zchen83@buffalo.edu. Research supported by EM’s Simons Investigator award (622132).    Elchanan Mossel Department of Mathematics, MIT, elmos@mit.edu. Research supported by the Vannevar Bush Faculty Fellowship ONR-N00014-20-1-2826, the NSF award CCF 1918421, and the Simons Investigator award (622132).
Abstract

Given a complex high-dimensional distribution over $\{\pm 1\}^{n}$, what is the best way to increase the expected number of $+1$'s by controlling the values of only a small number of variables? Such a problem is known as influence maximization and has been widely studied in social networks, biology, and computer science. In this paper, we consider influence maximization on the Ising model, which is a prototypical example of undirected graphical models and has wide applications in many real-world problems. We establish a sharp computational phase transition for influence maximization on sparse Ising models under a bounded budget: in the high-temperature regime, we give a linear-time algorithm for finding a small subset of variables and their values which achieve nearly optimal influence; in the low-temperature regime, we show that the influence maximization problem cannot be solved in polynomial time under commonly believed complexity assumptions. The critical temperature coincides with the tree uniqueness/non-uniqueness threshold for Ising models, which is also a critical point for other computational problems including approximate sampling and counting.

1 Introduction

Let $\mu$ be a distribution supported on $\{\pm 1\}^{V}$ where $V$ is a ground set of size $n$, and let $k\in\mathbb{N}^{+}$ be an integer corresponding to a budget. We consider the following version of the influence maximization problem, which asks to find a subset $S\subseteq V$ of size at most $k$ and a partial assignment $\sigma_{S}\in\{\pm 1\}^{S}$ maximizing the expectation of $\sum_{v\in V}X_{v}$ conditioned on the variables in $S$ receiving the values specified by $\sigma_{S}$. In other words, we want to solve the following combinatorial optimization problem:

\max_{\substack{S\subseteq V,\,|S|\leq k\\ \sigma_{S}\in\{\pm 1\}^{S}}}\left\{\mathbb{E}_{\mu}\left[\sum_{v\in V}X_{v}\,\Bigg|\,X_{S}=\sigma_{S}\right]\right\}.   (1)
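As a concrete (if exponential-time) illustration, the objective in (1) can be evaluated by brute force on a tiny instance. The sketch below is our own toy code, not the paper's algorithm; the helper names are hypothetical, and the Gibbs weights follow the Ising definition given later in the introduction with a single coupling $\beta$ and field $h$.

```python
import itertools
import math

def conditional_mean_sum(n, edges, beta, h, pins):
    """E[sum_v X_v | X_S = sigma_S] on an Ising model, by exhaustive enumeration."""
    num = den = 0.0
    for sigma in itertools.product([-1, 1], repeat=n):
        if any(sigma[v] != s for v, s in pins.items()):
            continue
        # Unnormalized Gibbs weight exp(beta * sum_{uv} s_u s_v + h * sum_v s_v).
        w = math.exp(beta * sum(sigma[u] * sigma[v] for u, v in edges)
                     + h * sum(sigma))
        num += w * sum(sigma)
        den += w
    return num / den

def best_seed_set(n, edges, beta, h, k):
    """Enumerate all (S, sigma_S) with |S| <= k and return the maximizer of (1)."""
    best = (conditional_mean_sum(n, edges, beta, h, {}), (), ())
    for size in range(1, k + 1):
        for S in itertools.combinations(range(n), size):
            for vals in itertools.product([-1, 1], repeat=size):
                val = conditional_mean_sum(n, edges, beta, h, dict(zip(S, vals)))
                if val > best[0]:
                    best = (val, S, vals)
    return best

# 4-cycle, ferromagnetic, zero field, budget 1: pinning some vertex to +1 wins.
val, S, vals = best_seed_set(4, [(0, 1), (1, 2), (2, 3), (3, 0)], 0.3, 0.0, 1)
```

This enumeration costs $2^{n}$ per conditional expectation, which is exactly the inefficiency the paper's linear-time algorithm avoids.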

Influence maximization is an important problem especially in the study of social networks and also has a vast number of applications in other areas [KKT03, KKT05].

The problem of influence maximization has been extensively studied both theoretically and in more applied work. However, the families of distributions for which it has been analyzed are somewhat limited. The theoretical foundations for the model were introduced in [KKT03] in terms of a dynamical model where agents are infected once a function of their infected neighborhood surpasses a certain threshold. Algorithmic results and computational hardness are both stated in terms of properties of these threshold functions. While the models introduced and analyzed in [KKT03, KKT05, MR10] allow for and vastly generalize standard infection models, they do not apply to other standard models of correlated opinions.

Our main interest in this paper is in Ising models, one of the simplest and most popular graphical models for the joint distribution of correlated discrete random variables. The Ising model was originally defined as a statistical physics model, and nowadays it is widely used to model social networks, computer networks, and biological systems; see e.g. [LYS10, MS10, APB10, MLO01, LMLA19, Lip22].

Consider a graph $G=(V,E)$, and let $\beta,h\in\mathbb{R}$. In the Ising distribution on $G$ parameterized by $\beta,h$, every configuration $\sigma\in\{\pm 1\}^{V}$ is assigned probability

\mu(\sigma)\propto\exp\left(\beta\sum_{uv\in E}\sigma_{u}\sigma_{v}+h\sum_{v\in V}\sigma_{v}\right).

Here, $\beta$ is the inverse temperature describing the interaction between adjacent vertices. In particular, if $\beta>0$ then neighboring vertices are more likely to receive the same value and the model is called ferromagnetic; meanwhile, if $\beta<0$ then neighboring vertices repel each other and the model is called antiferromagnetic. The parameter $h$ is the external field of the system, describing an outside bias on the variables. In general, every edge may have a distinct inverse temperature and every vertex a distinct external field; we refer to Section 2.1 for this more general definition.
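To make the ferromagnetic/antiferromagnetic distinction concrete, the following sketch (our own toy check, not from the paper) enumerates the Gibbs distribution on a triangle with $h=0$ and verifies that an edge's endpoints agree with probability above $1/2$ when $\beta>0$ and below $1/2$ when $\beta<0$.

```python
import itertools
import math

def edge_agreement_prob(n, edges, beta):
    """P(X_0 = X_1) under the Ising Gibbs distribution, uniform coupling beta, h = 0."""
    Z = agree = 0.0
    for sigma in itertools.product([-1, 1], repeat=n):
        w = math.exp(beta * sum(sigma[u] * sigma[v] for u, v in edges))
        Z += w
        if sigma[0] == sigma[1]:  # endpoints of edge (0, 1) agree
            agree += w
    return agree / Z

triangle = [(0, 1), (1, 2), (2, 0)]
p_ferro = edge_agreement_prob(3, triangle, 0.5)   # beta > 0: agreement favored
p_anti = edge_agreement_prob(3, triangle, -0.5)   # beta < 0: disagreement favored
```

At $\beta=0$ the distribution is uniform and the agreement probability is exactly $1/2$, which is a convenient sanity check.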

The problem of influence maximization for the Ising model was studied before in some special settings. Bresler, Koehler, and Moitra [BKM19] showed, using the GHS inequality, that for the ferromagnetic Ising model the influence maximization objective for the equilibrium measure is submodular, so the greedy algorithm achieves a $1-1/e$ approximation factor. More recently, a preprint by Chin, Moitra, Mossel, and Sandon [CMMS23] shows that for very high temperature (small $\beta$) ferromagnetic Ising models with fixed parameter $\beta$, the influence maximization problem is approximately solved by picking the highest-degree nodes. See also [LYS10] for applications of influence maximization on Ising models in social networks.

While prior works provided some interesting algorithms for special cases, much remains unknown. First, many of the most natural models are not ferromagnetic. Second, we may be interested in an approximation factor better than $1-1/e$. Finally and importantly, we would like to understand the computational hardness of the problem.

For Ising models, there exists a critical temperature $\beta_{c}$ which characterizes phase transitions of the model. The critical point $\beta_{c}$ depends on the maximum degree of the graph and is called the tree uniqueness/non-uniqueness threshold, since it characterizes whether there exists a unique Gibbs measure for the Ising model on infinite regular trees. More importantly, the threshold $\beta_{c}$ pinpoints whether or not the model exhibits correlation decay [Wei06, LLY13] or spectral independence [ALO20, CLV20], which are crucial properties for guaranteeing rapid mixing of natural Markov chains for sampling, such as Glauber dynamics, and polynomial-time algorithms for estimating the partition function.

We show that the critical temperature $\beta_{c}$ also pinpoints a computational phase transition for the influence maximization problem on sparse Ising models. In fact, we consider a more general version of influence maximization where we want to maximize the influence on an arbitrary linear function of the $X_{v}$'s under a bounded budget; see Section 2.3 for formal definitions.

Theorem 1.1 (Informal version of Theorems 2.3 and 2.4).

Consider Ising models on bounded-degree graphs and let $k\in\mathbb{N}^{+}$ be a constant budget.

  • If $|\beta|<\beta_{c}$, then one can find $(S,\sigma_{S})$ whose influence is $\varepsilon$-close to the optimal value within time $O(n)\cdot\mathrm{poly}(1/\varepsilon)$;

  • If $|\beta|>\beta_{c}$, then there is no $\mathrm{poly}(n,1/\varepsilon)$-time algorithm for influence maximization.

One important feature of our algorithmic result is that the running time of the algorithm is linear in $n$. Naively, one can easily obtain a polynomial-time algorithm by enumerating all possible $(S,\sigma_{S})$ and finding their corresponding influences. Since in the high-temperature regime we are able to approximately sample from the distribution or estimate the marginals in polynomial time, such a brute-force algorithm runs in polynomial time; however, the exponent of $n$ is a large constant depending on $k$. Our algorithm has the advantage of being linear-time, assuming a constant budget $k=O(1)$.

To obtain a linear-time algorithm, we utilize the decay of correlation property and the spectral independence technique in a novel way. In the high-temperature regime (i.e., $|\beta|<\beta_{c}$), the correlation/influence between a vertex $v$ and a subset $S\subseteq V$ of vertices is known to decay exponentially fast in their graph distance $\mathrm{dist}_{G}(v,S)$ [Wei06, LLY13, CLV20]; see Section 2.2 for details. The key in our approach is to approximate the global influence of $(S,\sigma_{S})$ on the whole vertex set $V$ by a local influence on only the vertices sufficiently close to $S$, when correlation decay and spectral independence hold; see Proposition 3.2. The proof of the algorithmic result is provided in Section 3.

Meanwhile, in the low-temperature regime (i.e., $|\beta|>\beta_{c}$) correlations or influences between two vertices can be non-vanishing even as their distance grows. For this reason simple Markov chain algorithms for sampling, such as Glauber dynamics, are known to be exponentially slow on such families, and our algorithmic approach fails for the same reason. In fact, it was known that approximate sampling and counting is $\mathsf{NP}$-hard in the antiferromagnetic case, i.e. when $\beta<-\beta_{c}$ [Sly10, SS14, GŠV16]. We establish hardness of influence maximization by giving a simple reduction from approximating the partition function of Ising models. The proof can be found in Section 4.

2 Preliminaries

Suppose $G=(V,E)$ is a graph. For two vertices $u,v\in V$, let $\mathrm{dist}_{G}(u,v)$ denote their graph distance in $G$. For any $u\in V$ and any $r>0$, let $\mathsf{B}(u,r)=\{v\in V:\mathrm{dist}_{G}(u,v)\leq r\}$ be the ball of radius $r$ around $u$. Further, for any $S\subseteq V$ let $\mathsf{B}(S,r)=\bigcup_{u\in S}\mathsf{B}(u,r)$.

For $r>0$, let $G^{\leq r}$ denote the graph with the same vertex set $V$ in which two vertices $u,v$ are adjacent iff $\mathrm{dist}_{G}(u,v)\leq r$. For $S\subseteq V$, let $G[S]$ be the subgraph induced on $S$.
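The balls $\mathsf{B}(u,r)$ and $\mathsf{B}(S,r)$ are straightforward to compute by breadth-first search; a minimal sketch, with adjacency represented as a dict of neighbor sets (our own convention, not the paper's):

```python
from collections import deque

def ball(adj, u, r):
    """B(u, r): vertices within graph distance r of u, via BFS."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if dist[x] == r:
            continue  # do not expand beyond radius r
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return set(dist)

def ball_of_set(adj, S, r):
    """B(S, r): union of B(u, r) over u in S."""
    return set().union(*(ball(adj, u, r) for u in S))

# Path 0 - 1 - 2 - 3 - 4:
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
```

On a bounded-degree graph each ball has size at most $1+\Delta+\dots+\Delta^{r}$, which is why radius-$r$ neighborhoods can be explored in time independent of $n$.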

2.1 Ising model

Suppose $G=(V,E)$ is a graph. Let $\beta\in\mathbb{R}^{E}$ be a vector of edge couplings and $h\in\mathbb{R}^{V}$ be a vector of external fields. The Gibbs distribution $\mu=\mu_{G,\beta,h}$ of the Ising model $(G,\beta,h)$ is given by

\mu(\sigma):=\frac{1}{Z}\exp\left(\sum_{uv\in E}\beta_{uv}\sigma_{u}\sigma_{v}+\sum_{v\in V}h_{v}\sigma_{v}\right),\qquad\forall\sigma\in\{\pm 1\}^{V}   (2)

where the partition function $Z=Z_{G,\beta,h}$ is defined by

Z=\sum_{\sigma\in\{\pm 1\}^{V}}\exp\left(\sum_{uv\in E}\beta_{uv}\sigma_{u}\sigma_{v}+\sum_{v\in V}h_{v}\sigma_{v}\right).

For an integer $\Delta\geq 3$ and a real $\gamma>0$, let $\mathcal{M}(\Delta,\gamma)$ be the family of all Ising models $(G,\beta,h)$ satisfying:

  1. The graph $G$ has maximum degree at most $\Delta$;

  2. For all $uv\in E$ it holds that $(\Delta-1)\tanh|\beta_{uv}|\leq\gamma$.

We remark that in the family $\mathcal{M}(\Delta,\gamma)$ every edge coupling can be either ferromagnetic (i.e., $\beta_{uv}>0$) or antiferromagnetic (i.e., $\beta_{uv}<0$).

The critical temperature is given by $\beta_{c}(\Delta)=\mathrm{arctanh}(1/(\Delta-1))$. Hence, for any Ising model from the family $\mathcal{M}(\Delta,\gamma)$ where $\gamma<1$, every edge coupling satisfies $|\beta_{uv}|<\beta_{c}(\Delta)$.
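The threshold and the family-membership condition are one-liners; a small sketch assuming nothing beyond the two formulas above (helper names are ours):

```python
import math

def beta_c(Delta):
    """Tree uniqueness threshold beta_c(Delta) = arctanh(1/(Delta - 1))."""
    return math.atanh(1.0 / (Delta - 1))

def in_family(beta_uv, Delta, gamma):
    """Check the coupling condition (Delta - 1) * tanh|beta_uv| <= gamma."""
    return (Delta - 1) * math.tanh(abs(beta_uv)) <= gamma

# For Delta = 3: beta_c = arctanh(1/2) = (1/2) * ln 3.
```

Note the sign-insensitivity: only $|\beta_{uv}|$ enters the condition, matching the remark that both ferromagnetic and antiferromagnetic couplings are allowed.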

2.2 Tree uniqueness, strong spatial mixing, total influence decay

In the high-temperature regime, strong spatial mixing (correlation decay) is known for the family $\mathcal{M}(\Delta,\gamma)$ for any $\gamma<1$ [Wei06, SST14, LLY13]. Recently, [CLV20] established $\ell_{\infty}$-spectral independence by showing the exponential decay of total influences via Weitz's self-avoiding walk tree approach [Wei06].

Lemma 2.1 ([Wei06, SST14, LLY13, CLV20]).

For any $\Delta\geq 3$ and $\delta\in(0,1)$, there exists a constant $C=C(\Delta,\delta)>0$ such that the following holds. Consider an Ising model on a graph $G=(V,E)$ from the family $\mathcal{M}(\Delta,1-\delta)$. Let $\Lambda\subseteq V$ and let $\tau\in\{\pm 1\}^{\Lambda}$ be an arbitrary pinning.

  • (Strong Spatial Mixing) For any $u\in V\setminus\Lambda$ and $L\in\mathbb{N}^{+}$, for any subset $W\subseteq V\setminus\Lambda\setminus\{u\}$ such that $\mathrm{dist}_{G}(u,W)\geq L$ and any two spin assignments $\sigma_{W},\xi_{W}\in\{\pm 1\}^{W}$, we have

    \left|\mathbb{P}_{\mu^{\tau}}\left(X_{u}=+\mid X_{W}=\sigma_{W}\right)-\mathbb{P}_{\mu^{\tau}}\left(X_{u}=+\mid X_{W}=\xi_{W}\right)\right|\leq C(1-\delta)^{L}.

  • (Total Influence Decay) For any $u\in V\setminus\Lambda$ and $L\in\mathbb{N}^{+}$, we have

    \sum_{v\in V\setminus\Lambda:\,\mathrm{dist}_{G}(u,v)\geq L}\left|\mathbb{P}_{\mu^{\tau}}(X_{v}=+\mid X_{u}=+)-\mathbb{P}_{\mu^{\tau}}(X_{v}=+\mid X_{u}=-)\right|\leq C(1-\delta)^{L}.

2.3 Influence maximization

Consider the Ising model on a graph $G=(V,E)$ with edge couplings $\beta\in\mathbb{R}^{E}$ and external fields $h\in\mathbb{R}^{V}$. Let $a\in\mathbb{R}^{V}$ be a vector of vertex weights.

Definition 2.2 (Global Influence).

For a subset $S\subseteq V$ of vertices and a partial assignment $\sigma_{S}\in\{\pm 1\}^{S}$ on $S$, define the global influence of $(S,\sigma_{S})$ on the linear function $a\cdot X$ to be

\Phi_{G,\beta,h,a}(S,\sigma_{S})=\mathbb{E}\left[\sum_{v\in V}a_{v}X_{v}\,\Bigg|\,X_{S}=\sigma_{S}\right]-\mathbb{E}\left[\sum_{v\in V}a_{v}X_{v}\right]

where $X\in\{\pm 1\}^{V}$ is sampled from the Ising model $(G,\beta,h)$.
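On small instances this definition can be evaluated exactly by enumeration. The sketch below uses hypothetical helper names and a single scalar coupling $\beta$ and field $h$ for simplicity; it is an illustration of the definition, not the paper's algorithm.

```python
import itertools
import math

def expect_aX(n, edges, beta, h, a, pins):
    """E[sum_v a_v X_v | X_S = sigma_S] by exhaustive enumeration over {-1,+1}^n."""
    num = den = 0.0
    for sigma in itertools.product([-1, 1], repeat=n):
        if any(sigma[v] != s for v, s in pins.items()):
            continue
        w = math.exp(beta * sum(sigma[u] * sigma[v] for u, v in edges)
                     + h * sum(sigma))
        num += w * sum(a[v] * sigma[v] for v in range(n))
        den += w
    return num / den

def global_influence(n, edges, beta, h, a, pins):
    """Phi(S, sigma_S) = E[a.X | X_S = sigma_S] - E[a.X]."""
    return (expect_aX(n, edges, beta, h, a, pins)
            - expect_aX(n, edges, beta, h, a, {}))

# Path 0 - 1 - 2, uniform weights, zero field; pin the center vertex.
path_edges = [(0, 1), (1, 2)]
phi_plus = global_influence(3, path_edges, 0.4, 0.0, [1, 1, 1], {1: +1})
phi_minus = global_influence(3, path_edges, 0.4, 0.0, [1, 1, 1], {1: -1})
```

With $h=0$ the model is symmetric under global spin flip, so pinning the center to $-1$ gives exactly the negated influence of pinning it to $+1$, and the empty pinning has influence $0$.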

Let $k\in\mathbb{N}^{+}$ be an integer representing the budget. In this paper, we consider the $k$-Inf-Max problem, where we want to select a subset $S\subseteq V$ of size at most $k$ and a partial assignment $\sigma_{S}\in\{\pm 1\}^{S}$ which achieves almost the maximum global influence. Formally, the problem $k$-Inf-Max is defined as follows.

$k$-Inf-Max

Input: an Ising model $(G,\beta,h)$; a vector $a\in\mathbb{R}^{V}$ of vertex weights; an error parameter $\varepsilon>0$.

Output: a subset $\hat{S}\subseteq V$ with $|\hat{S}|\leq k$ and a partial assignment $\sigma_{\hat{S}}\in\{\pm 1\}^{\hat{S}}$ such that

\Phi_{G,\beta,h,a}(\hat{S},\sigma_{\hat{S}})\geq\max_{\substack{S\subseteq V,\,|S|\leq k\\ \sigma_{S}\in\{\pm 1\}^{S}}}\left\{\Phi_{G,\beta,h,a}(S,\sigma_{S})\right\}-\varepsilon.

We say a weight vector $a\in\mathbb{R}^{V}$ is $L$-bounded if $\lVert a\rVert_{\infty}\leq L$, i.e., $|a_{v}|\leq L$ for all $v\in V$. We are interested in maximizing the global influence for bounded weights. Since $\Phi_{G,\beta,h,ta}(S,\sigma_{S})=t\cdot\Phi_{G,\beta,h,a}(S,\sigma_{S})$, we may assume that $a$ is $1$-bounded. Furthermore, we consider influence maximization with a constant budget, namely $k=O(1)$, which is already interesting and captures many real-world settings. Our goal is to find an algorithm for $k$-Inf-Max with running time polynomial in $n$ and $1/\varepsilon$, and to understand the computational complexity of the problem.

2.4 Main results

Theorem 2.3 (Algorithmic Result).

Suppose $\Delta\geq 3$ is an integer and $\delta\in(0,1)$ is a real. For any integer $k\in\mathbb{N}^{+}$, there exists a deterministic algorithm that solves $k$-Inf-Max for the family $\mathcal{M}(\Delta,1-\delta)$ and $1$-bounded vertex weights with running time $O(n)\cdot(1/\varepsilon)^{O(1)}$.

Theorem 2.4 (Hardness Result).

Suppose $\Delta\geq 3$ is an integer and $\delta>0$ is a real. For any integer $k\in\mathbb{N}^{+}$, there is no randomized algorithm that solves $k$-Inf-Max for the family $\mathcal{M}(\Delta,1+\delta)$ and $1$-bounded vertex weights with probability at least $3/4$ in time $\mathrm{poly}(n,1/\varepsilon)$, assuming $\mathsf{RP}\neq\mathsf{NP}$.

3 Algorithmic Result

We prove our algorithmic result Theorem 2.3 by localizing the global influence of a subset SS of vertices to a ball around SS. For high-temperature Ising models, such local influence approximates the global influence effectively. Furthermore, one can approximately maximize the local influence by the local nature of the problem. Together this gives an approximation algorithm for global influence maximization.

To begin, we define the notion of local influence.

Definition 3.1 (Local Influence).

Let $r\in\mathbb{N}^{+}$. For a subset $S\subseteq V$ of vertices and a partial assignment $\sigma_{S}\in\{\pm 1\}^{S}$ on $S$, define the local influence of $(S,\sigma_{S})$ on the linear function $a\cdot X$ to be

\Phi_{G,\beta,h,a}^{(r)}(S,\sigma_{S})=\mathbb{E}_{G[\mathsf{B}(S,r)]}\left[\sum_{v\in\mathsf{B}(S,r)}a_{v}X_{v}\,\Bigg|\,X_{S}=\sigma_{S}\right]-\mathbb{E}_{G[\mathsf{B}(S,r)]}\left[\sum_{v\in\mathsf{B}(S,r)}a_{v}X_{v}\right],

where $X\in\{\pm 1\}^{V}$ is sampled from the Ising model on the induced subgraph $G[\mathsf{B}(S,r)]$ with $\beta,h$ restricted to it.

Notice that if $S=\{u,w\}$ and $\mathrm{dist}_{G}(u,w)>2r+1$, then the induced subgraph $G[\mathsf{B}(\{u,w\},r)]$ is the disjoint union of $G[\mathsf{B}(u,r)]$ and $G[\mathsf{B}(w,r)]$, and we can further decompose the local influence as

\Phi_{G,\beta,h,a}^{(r)}(S,\sigma_{S})= \left(\mathbb{E}_{G[\mathsf{B}(u,r)]}\left[\sum_{v\in\mathsf{B}(u,r)}a_{v}X_{v}\,\Bigg|\,X_{u}=\sigma_{u}\right]-\mathbb{E}_{G[\mathsf{B}(u,r)]}\left[\sum_{v\in\mathsf{B}(u,r)}a_{v}X_{v}\right]\right)
+\left(\mathbb{E}_{G[\mathsf{B}(w,r)]}\left[\sum_{v\in\mathsf{B}(w,r)}a_{v}X_{v}\,\Bigg|\,X_{w}=\sigma_{w}\right]-\mathbb{E}_{G[\mathsf{B}(w,r)]}\left[\sum_{v\in\mathsf{B}(w,r)}a_{v}X_{v}\right]\right)
=\Phi_{G,\beta,h,a}^{(r)}(u,\sigma_{u})+\Phi_{G,\beta,h,a}^{(r)}(w,\sigma_{w}).

Thus, the local influence decomposes over clusters of vertices close to each other; more specifically, the clusters are the connected components of the induced subgraph $G[\mathsf{B}(S,r)]$. See Lemma 3.7 for a precise statement.

We now present two main propositions for establishing Theorem 2.3. Fix $\Delta\geq 3$, $\delta\in(0,1)$, and $k\in\mathbb{N}^{+}$. In the propositions below, $O(\cdot)=O_{\Delta,\delta,k}(\cdot)$ hides a constant depending on $\Delta,\delta,k$.

We first show that for high-temperature Ising models, the global influence is well-approximated by the local influence for a sufficiently large radius $r\in\mathbb{N}^{+}$.

Proposition 3.2.

Consider an Ising model on a graph $G=(V,E)$ from the family $\mathcal{M}(\Delta,1-\delta)$ and a $1$-bounded weight vector $a\in\mathbb{R}^{V}$. For any $\varepsilon>0$, there exists $r=O(\log(1/\varepsilon))$ such that for all $S\subseteq V$ with $|S|\leq k$ and all $\sigma_{S}\in\{\pm 1\}^{S}$, we have

\left|\Phi_{G,\beta,h,a}(S,\sigma_{S})-\Phi_{G,\beta,h,a}^{(r)}(S,\sigma_{S})\right|\leq\varepsilon.   (3)

Next, we give a linear-time algorithm for approximately maximizing the local influence.

Proposition 3.3.

Consider an Ising model on a graph $G=(V,E)$ from the family $\mathcal{M}(\Delta,1-\delta)$ and a $1$-bounded weight vector $a\in\mathbb{R}^{V}$. For any $\varepsilon>0$ and $r\in\mathbb{N}^{+}$, there exists an algorithm that finds a subset $\hat{S}\subseteq V$ with $|\hat{S}|\leq k$ and a partial assignment $\sigma_{\hat{S}}\in\{\pm 1\}^{\hat{S}}$ such that

\Phi_{G,\beta,h,a}^{(r)}(\hat{S},\sigma_{\hat{S}})\geq\max_{\substack{S\subseteq V,\,|S|\leq k\\ \sigma_{S}\in\{\pm 1\}^{S}}}\left\{\Phi_{G,\beta,h,a}^{(r)}(S,\sigma_{S})\right\}-\varepsilon.   (4)

The running time of the algorithm is $O(n)\cdot(1/\varepsilon)^{O(1)}\cdot e^{O(r)}$.

Theorem 2.3 follows immediately from Propositions 3.2 and 3.3. For ease of notation, we omit $G,\beta,h,a$ from the subscripts in the rest of the paper when clear from context.

Proof of Theorem 2.3.

Define the optimal solutions

(S^{*},\sigma_{S^{*}}) = \operatorname*{arg\,max}_{\substack{S\subseteq V,\,|S|\leq k\\ \sigma_{S}\in\{\pm 1\}^{S}}}\left\{\Phi(S,\sigma_{S})\right\}   (5)

\text{and}\qquad (S^{\dagger},\sigma_{S^{\dagger}}) = \operatorname*{arg\,max}_{\substack{S\subseteq V,\,|S|\leq k\\ \sigma_{S}\in\{\pm 1\}^{S}}}\left\{\Phi^{(r)}(S,\sigma_{S})\right\}.   (6)

Let $r=O(\log(1/\varepsilon))$ be from Proposition 3.2 such that Eq. (3) holds with the error on the right-hand side being $\varepsilon/3$. (Note that we can compute $r$ efficiently by Eqs. (7) and (8) from the proof of Proposition 3.2.) For this $r$, use the algorithm from Proposition 3.3 to find $\hat{S}\subseteq V$ and $\sigma_{\hat{S}}\in\{\pm 1\}^{\hat{S}}$ such that Eq. (4) holds with the error on the right-hand side being $\varepsilon/3$. Thus, we conclude that

\Phi(\hat{S},\sigma_{\hat{S}}) \overset{\text{(3)}}{\geq} \Phi^{(r)}(\hat{S},\sigma_{\hat{S}})-\frac{\varepsilon}{3} \overset{\text{(4)}}{\geq} \Phi^{(r)}(S^{\dagger},\sigma_{S^{\dagger}})-\frac{2\varepsilon}{3} \overset{\text{(6)}}{\geq} \Phi^{(r)}(S^{*},\sigma_{S^{*}})-\frac{2\varepsilon}{3} \overset{\text{(3)}}{\geq} \Phi(S^{*},\sigma_{S^{*}})-\varepsilon

as wanted. The running time of the algorithm is $O(n)\cdot(1/\varepsilon)^{O(1)}\cdot e^{O(r)}=O(n)\cdot(1/\varepsilon)^{O(1)}$. ∎

3.1 Proof of Proposition 3.2

Fix $S\subseteq V$ and $\sigma_{S}\in\{\pm 1\}^{S}$, and define

f(k,\ell) = \mathbb{E}_{G[\mathsf{B}(S,k)]}\left[\sum_{v\in\mathsf{B}(S,\ell)}a_{v}X_{v}\,\Bigg|\,X_{S}=\sigma_{S}\right]-\mathbb{E}_{G[\mathsf{B}(S,k)]}\left[\sum_{v\in\mathsf{B}(S,\ell)}a_{v}X_{v}\right]
= \sum_{v\in\mathsf{B}(S,\ell)}a_{v}\left(\mathbb{E}_{G[\mathsf{B}(S,k)]}\left[X_{v}\mid X_{S}=\sigma_{S}\right]-\mathbb{E}_{G[\mathsf{B}(S,k)]}\left[X_{v}\right]\right).

Define $\mathsf{B}(v,\infty)$ to be the connected component containing $v$ and $\mathsf{B}(S,\infty)=\bigcup_{v\in S}\mathsf{B}(v,\infty)$. Then we have $f(\infty,\infty)=\Phi(S,\sigma_{S})$ and $f(r,r)=\Phi^{(r)}(S,\sigma_{S})$; to see the former, observe that

\Phi(S,\sigma_{S}) = \mathbb{E}_{G}\left[\sum_{v\in V}a_{v}X_{v}\,\Bigg|\,X_{S}=\sigma_{S}\right]-\mathbb{E}_{G}\left[\sum_{v\in V}a_{v}X_{v}\right]
= \mathbb{E}_{G}\left[\sum_{v\in\mathsf{B}(S,\infty)}a_{v}X_{v}\,\Bigg|\,X_{S}=\sigma_{S}\right]-\mathbb{E}_{G}\left[\sum_{v\in\mathsf{B}(S,\infty)}a_{v}X_{v}\right]   ($X_{S}$ and $X_{V\setminus\mathsf{B}(S,\infty)}$ are independent)
= \mathbb{E}_{G[\mathsf{B}(S,\infty)]}\left[\sum_{v\in\mathsf{B}(S,\infty)}a_{v}X_{v}\,\Bigg|\,X_{S}=\sigma_{S}\right]-\mathbb{E}_{G[\mathsf{B}(S,\infty)]}\left[\sum_{v\in\mathsf{B}(S,\infty)}a_{v}X_{v}\right]   ($\mu_{G}=\mu_{G[\mathsf{B}(S,\infty)]}\otimes\mu_{G[V\setminus\mathsf{B}(S,\infty)]}$)
= f(\infty,\infty).

Therefore, it suffices to show that $|f(\infty,\infty)-f(r,r)|\leq\varepsilon$, which follows immediately from the following three lemmas.

Lemma 3.4.

There exists $\rho=O(\log(1/\varepsilon))$ such that $|f(\infty,\infty)-f(\infty,\rho)|\leq\varepsilon/3$.

Proof.

We define

\rho=\left\lceil\frac{1}{\delta}\log\left(\frac{6Ck}{\varepsilon}\right)\right\rceil.   (7)

For simplicity we write $\mathbb{P}=\mathbb{P}_{G[\mathsf{B}(S,\infty)]}$ for the Ising distribution $\mu_{G[\mathsf{B}(S,\infty)]}$, and $\mathbb{E}=\mathbb{E}_{G[\mathsf{B}(S,\infty)]}$ for the expectation over $\mu_{G[\mathsf{B}(S,\infty)]}$. By definition we have that

|f(\infty,\infty)-f(\infty,\rho)|=\left|\sum_{v\in\mathsf{B}(S,\infty)\setminus\mathsf{B}(S,\rho)}a_{v}\left(\mathbb{E}\left[X_{v}\mid X_{S}=\sigma_{S}\right]-\mathbb{E}\left[X_{v}\right]\right)\right|.

Suppose $S=\{v_{1},\dots,v_{k'}\}$ where $k'=|S|\leq k$. For $0\leq i\leq k'$ we define $S_{i}=\{v_{1},\dots,v_{i}\}$ and let $\sigma_{S_{i}}$ be $\sigma_{S}$ restricted to $S_{i}$. Then it follows that

|f(\infty,\infty)-f(\infty,\rho)|
= \left|\sum_{i=1}^{k'}\sum_{v\in\mathsf{B}(S,\infty)\setminus\mathsf{B}(S,\rho)}a_{v}\left(\mathbb{E}\left[X_{v}\mid X_{S_{i}}=\sigma_{S_{i}}\right]-\mathbb{E}\left[X_{v}\mid X_{S_{i-1}}=\sigma_{S_{i-1}}\right]\right)\right|
\overset{\text{(i)}}{\leq} \sum_{i=1}^{k'}\sum_{v\in\mathsf{B}(S,\infty)\setminus\mathsf{B}(S,\rho)}|a_{v}|\cdot\left|\mathbb{E}\left[X_{v}\mid X_{S_{i}}=\sigma_{S_{i}}\right]-\mathbb{E}\left[X_{v}\mid X_{S_{i-1}}=\sigma_{S_{i-1}}\right]\right|
\overset{\text{(ii)}}{\leq} \sum_{i=1}^{k'}\sum_{v\in\mathsf{B}(S,\infty)\setminus\mathsf{B}(S,\rho)}2\left|\mathbb{P}^{\sigma_{S_{i-1}}}(X_{v}=+1\mid X_{v_{i}}=+1)-\mathbb{P}^{\sigma_{S_{i-1}}}(X_{v}=+1\mid X_{v_{i}}=-1)\right|
\overset{\text{(iii)}}{\leq} 2Ck(1-\delta)^{\rho}
\overset{\text{(iv)}}{\leq} \frac{\varepsilon}{3},

where (i) is the triangle inequality, (ii) follows from |av|1|a_{v}|\leq 1 and expanding the expectation, (iii) follows from Total Influence Decay (Lemma 2.1), and (iv) is by our choice of ρ\rho. ∎

Lemma 3.5.

Given $\rho\in\mathbb{N}^{+}$, there exists $\rho<r=O(\rho+\log(1/\varepsilon))$ such that $|f(\infty,\rho)-f(r,\rho)|\leq\varepsilon/3$.

Proof.

We define

r=\rho+\left\lceil\frac{1}{\delta}\left(\log\left(\frac{24C}{\varepsilon}\right)+\rho\log\Delta\right)\right\rceil.   (8)
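The two radii can be computed directly from Eqs. (7) and (8). In the sketch below the constant $C=C(\Delta,\delta)$ from Lemma 2.1 is treated as a given input (an assumption for illustration; the paper does not prescribe its value), and the function names are ours.

```python
import math

def radius_rho(C, k, delta, eps):
    """Eq. (7): rho = ceil((1/delta) * log(6 C k / eps))."""
    return math.ceil(math.log(6 * C * k / eps) / delta)

def radius_r(C, Delta, delta, eps, rho):
    """Eq. (8): r = rho + ceil((1/delta) * (log(24 C / eps) + rho * log(Delta)))."""
    return rho + math.ceil((math.log(24 * C / eps) + rho * math.log(Delta)) / delta)

rho = radius_rho(C=1.0, k=2, delta=0.5, eps=0.1)
r = radius_r(C=1.0, Delta=3, delta=0.5, eps=0.1, rho=rho)
```

Both radii grow only logarithmically in $1/\varepsilon$, which is what makes the $e^{O(r)}$ factor of Proposition 3.3 collapse to $(1/\varepsilon)^{O(1)}$.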

By the triangle inequality and $\lVert a\rVert_{\infty}\leq 1$ we have that

|f(\infty,\rho)-f(r,\rho)|
\leq \sum_{v\in\mathsf{B}(S,\rho)}\left|\mathbb{E}_{G_{\infty}}\left[X_{v}\mid X_{S}=\sigma_{S}\right]-\mathbb{E}_{G_{r}}\left[X_{v}\mid X_{S}=\sigma_{S}\right]\right|+\left|\mathbb{E}_{G_{\infty}}\left[X_{v}\right]-\mathbb{E}_{G_{r}}\left[X_{v}\right]\right|
= \sum_{v\in\mathsf{B}(S,\rho)}2\left|\mathbb{P}_{G_{\infty}}^{\sigma_{S}}\left(X_{v}=+1\right)-\mathbb{P}_{G_{r}}^{\sigma_{S}}\left(X_{v}=+1\right)\right|+2\left|\mathbb{P}_{G_{\infty}}\left(X_{v}=+1\right)-\mathbb{P}_{G_{r}}\left(X_{v}=+1\right)\right|

where $G_{\infty}=G[\mathsf{B}(S,\infty)]$, $G_{r}=G[\mathsf{B}(S,r)]$, $\mathbb{P}_{G_{\infty}}^{\tau}=\mathbb{P}_{\mu_{G_{\infty}}^{\tau}}$, and $\mathbb{P}_{G_{r}}^{\tau}=\mathbb{P}_{\mu_{G_{r}}^{\tau}}$. Let $U=\{u\in V:\mathrm{dist}_{G}(u,S)=r\}$. For any $v\in\mathsf{B}(S,\rho)$, we can couple $X_{v}\sim\mathbb{P}_{G_{\infty}}(X_{v}=\cdot)$ and $X'_{v}\sim\mathbb{P}_{G_{r}}(X'_{v}=\cdot)$ by first revealing the spin assignments $X_{U}\sim\mathbb{P}_{G_{\infty}}(X_{U}=\cdot)$ and $X'_{U}\sim\mathbb{P}_{G_{r}}(X'_{U}=\cdot)$ on $U$ independently, and then coupling $X_{v},X'_{v}$ optimally conditioned on $X_{U},X'_{U}$ respectively. Therefore, we deduce that

\left|\mathbb{P}_{G_{\infty}}^{\sigma_{S}}\left(X_{v}=+1\right)-\mathbb{P}_{G_{r}}^{\sigma_{S}}\left(X_{v}=+1\right)\right| \leq \max_{\sigma_{U},\tau_{U}\in\{\pm 1\}^{U}}\left|\mathbb{P}_{G}^{\sigma_{S},\sigma_{U}}\left(X_{v}=+1\right)-\mathbb{P}_{G}^{\sigma_{S},\tau_{U}}\left(X_{v}=+1\right)\right| \leq C(1-\delta)^{r-\rho},

where the first inequality follows from the coupling procedure and the fact that

\mathbb{P}_{G_{\infty}}^{\sigma_{S},\sigma_{U}}(X_{v}=\cdot)=\mathbb{P}_{G}^{\sigma_{S},\sigma_{U}}(X_{v}=\cdot)=\mathbb{P}_{G_{r}}^{\sigma_{S},\sigma_{U}}(X_{v}=\cdot),

and the second inequality follows from Strong Spatial Mixing (Lemma 2.1) and $\mathrm{dist}_{G}(v,U)\geq\mathrm{dist}_{G}(S,U)-\mathrm{dist}_{G}(v,S)\geq r-\rho$. Similarly, we also have

\left|\mathbb{P}_{G_{\infty}}\left(X_{v}=+1\right)-\mathbb{P}_{G_{r}}\left(X_{v}=+1\right)\right|\leq C(1-\delta)^{r-\rho}.

Hence, combining everything above with $|\mathsf{B}(S,\rho)|\leq 2\Delta^{\rho}$ we get

|f(\infty,\rho)-f(r,\rho)|\leq 2\Delta^{\rho}\cdot 4C(1-\delta)^{r-\rho}\leq\frac{\varepsilon}{3},

as wanted. ∎

Lemma 3.6.

For $\rho\in\mathbb{N}^{+}$ as in Eq. (7) and any integer $r\geq\rho$, we have $|f(r,\rho)-f(r,r)|\leq\varepsilon/3$.

Proof.

The proof is exactly the same as that of Lemma 3.4; one only needs to replace $G[\mathsf{B}(S,\infty)]$ with $G[\mathsf{B}(S,r)]$, noting that Total Influence Decay still holds on any subgraph. ∎

Proof of Proposition 3.2.

For $\rho,r$ given in Eqs. (7) and (8), we obtain from the triangle inequality and Lemmas 3.4, 3.5, and 3.6 that

|f(\infty,\infty)-f(r,r)|\leq|f(\infty,\infty)-f(\infty,\rho)|+|f(\infty,\rho)-f(r,\rho)|+|f(r,\rho)-f(r,r)|\leq\varepsilon,

as claimed. ∎

3.2 Proof of Proposition 3.3

For a graph $G$, let $\mathsf{cc}(G)$ denote the set of all connected components of $G$, where each connected component is viewed as a subset of vertices. The following decomposition lemma is easy to verify.

Lemma 3.7.

For any $r\in\mathbb{N}$, $S\subseteq V$, and $\sigma_{S}\in\{\pm 1\}^{S}$, we have

\Phi^{(r)}(S,\sigma_{S})=\sum_{T\in\mathsf{cc}(G^{\leq 2r+1}[S])}\Phi^{(r)}(T,\sigma_{T}).

Proof.

This follows from the fact that

\mu_{G[\mathsf{B}(S,r)]}=\bigotimes_{T\in\mathsf{cc}(G^{\leq 2r+1}[S])}\mu_{G[\mathsf{B}(T,r)]},

and the same holds for the conditional distribution given a partial assignment $\sigma_{S}$ on $S$. ∎
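The clustering used in Lemma 3.7 can be sketched as follows: group the seeds in $S$ into connected components of $G^{\leq 2r+1}[S]$, i.e., chain together seeds whose pairwise graph distance is at most $2r+1$. This is our own toy code with hypothetical helper names, using BFS distances (quadratic in $|S|$, not the paper's efficient construction).

```python
from collections import deque

def graph_dist(adj, u, v):
    """BFS graph distance between u and v (inf if disconnected)."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return dist[x]
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return float("inf")

def clusters(adj, S, r):
    """Connected components of G^{<= 2r+1}[S], as a list of vertex sets."""
    S = sorted(S)
    seen = set()
    comps = []
    for u in S:
        if u in seen:
            continue
        comp = {u}
        stack = [u]
        seen.add(u)
        while stack:
            x = stack.pop()
            for y in S:
                if y not in seen and graph_dist(adj, x, y) <= 2 * r + 1:
                    seen.add(y)
                    comp.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps

# Path on 8 vertices; with r = 1 seeds chain through gaps of at most 3.
path = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 7} for i in range(8)}
```

Two seeds land in the same cluster exactly when their radius-$r$ balls can overlap or touch through a common edge, which is what makes the per-cluster influences independent.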

Hence, it suffices to consider the local influences of subsets of vertices that are connected in $G^{\leq 2r+1}$. Our algorithm is given below.

Outline of the Algorithm

  1. Step 1.

    Construct a graph H=(VH,EH)H=(V_{H},E_{H}) as follows.

    1. (1a)

      The vertex set VHV_{H} consists of all non-empty subsets TVT\subseteq V of vertices of size at most kk such that G2r+1[T]G^{\leq 2r+1}[T] is connected.

    2. (1b)

      Two distinct subsets T1,T2T_{1},T_{2} are adjacent iff G2r+1[T1T2]G^{\leq 2r+1}[T_{1}\cup T_{2}] is connected; equivalently, T1,T2T_{1},T_{2} are non-adjacent iff distG(T1,T2)>2r+1\mathrm{dist}_{G}(T_{1},T_{2})>2r+1.

    \vartriangleright (Lemma 3.8) We can construct HH in O(n)eO(r)O(n)\cdot e^{O(r)} time.

  2. Step 2.

    Each vertex TVHT\in V_{H} is assigned an integral cost cT+c_{T}\in\mathbb{N}^{+}, a real weight wTw_{T}\in\mathbb{R}, and a partial assignment ξT{±1}T\xi_{T}\in\{\pm 1\}^{T} as follows.

    1. (2a)

      The cost of TT is its size; i.e., cT=|T|c_{T}=|T|.

    2. (2b)

      For every σT{±1}T\sigma_{T}\in\{\pm 1\}^{T}, compute ψT(σT)\psi_{T}(\sigma_{T}) such that

      |ψT(σT)Φ(r)(T,σT)|ε2k.\displaystyle\left|\psi_{T}(\sigma_{T})-\Phi^{(r)}(T,\sigma_{T})\right|\leq\frac{\varepsilon}{2k}. (9)

      The weight of TT is the maximum value of ψT(σT)\psi_{T}(\sigma_{T}) and the associated partial assignment is the maximizer:

      ξT\displaystyle\xi_{T} =argmaxσT{±1}TψT(σT)\displaystyle=\operatorname*{arg\,max}_{\sigma_{T}\in\{\pm 1\}^{T}}\psi_{T}(\sigma_{T}) (10)
      andwT\displaystyle\text{and}\qquad w_{T} =ψT(ξT)=maxσT{±1}TψT(σT).\displaystyle=\psi_{T}(\xi_{T})=\max_{\sigma_{T}\in\{\pm 1\}^{T}}\psi_{T}(\sigma_{T}). (11)

    \vartriangleright (Lemma 3.9) For each TVHT\in V_{H}, we can compute cTc_{T}, wTw_{T}, and ξT\xi_{T} in (1/ε)O(1)eO(r)(1/\varepsilon)^{O(1)}\cdot e^{O(r)} time.

  3. Step 3.

    Given the graph HH, costs {cT}TVH\{c_{T}\}_{T\in V_{H}}, and weights {wT}TVH\{w_{T}\}_{T\in V_{H}}, find a maximum weighted independent set II^{*} of HH with total cost at most kk; namely,

    max\displaystyle\max\quad TIwT\displaystyle\sum_{T\in I}w_{T} (12)
    s.t.\displaystyle\mathrm{s.t.}\quad I is an independent set of H;\displaystyle\text{$I$ is an independent set of $H$};
    TIcTk.\displaystyle\sum_{T\in I}c_{T}\leq k.

    \vartriangleright (Lemma 3.10) We can find II^{*} in O(n)eO(r)O(n)\cdot e^{O(r)} time.

  4. Step 4.

    Output

    S^=TITandσS^=(ξT)TI.\hat{S}=\bigcup_{T\in I^{*}}T\qquad\text{and}\qquad\sigma_{\hat{S}}=(\xi_{T})_{T\in I^{*}}.

    \vartriangleright (Lemma 3.11) We have Φ(r)(S^,σS^)Φ(r)(S,σS)ε\Phi^{(r)}(\hat{S},\sigma_{\hat{S}})\geq\Phi^{(r)}(S^{\dagger},\sigma_{S^{\dagger}})-\varepsilon as desired.

We show the correctness of our algorithm and analyze its running time in the following sequence of lemmas. Throughout, we assume that GG has maximum degree at most Δ3\Delta\geq 3, the Ising model on GG is from the family (Δ,1δ)\mathcal{M}(\Delta,1-\delta), and k+k\in\mathbb{N}^{+} is fixed.

Both Steps 1 and 2 can be completed in O(n)(1/ε)O(1)eO(r)O(n)\cdot(1/\varepsilon)^{O(1)}\cdot e^{O(r)} time.

Lemma 3.8 (Step 1).

The graph HH has N=O(n)eO(r)N=O(n)\cdot e^{O(r)} vertices and maximum degree D=eO(r)D=e^{O(r)}. Furthermore, one can construct HH in O(n)eO(r)O(n)\cdot e^{O(r)} time.

Proof.

Follows from results in [PR17, BCKL13] for counting and enumerating bounded-size connected induced subgraphs in a bounded-degree graph. ∎
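For intuition, here is a simple Python sketch of the enumeration underlying Step 1 (our own illustration under stated assumptions; it trades the refined per-subset running time of [PR17, BCKL13] for clarity). Applied to the adjacency lists of the power graph G^{\leq 2r+1}, its output is exactly the vertex set V_H:

```python
def connected_subsets(adj, k):
    """Enumerate every vertex subset T with |T| <= k whose induced
    subgraph is connected, by growing subsets along their neighborhood
    boundary and deduplicating with a set. [PR17, BCKL13] give the
    refined enumeration behind the O(n) * e^{O(r)} bound; this sketch
    favors clarity over that running time."""
    found = set()
    stack = [frozenset([v]) for v in adj]
    while stack:
        T = stack.pop()
        if T in found:
            continue
        found.add(T)
        if len(T) < k:
            # any larger connected superset extends T through its boundary
            boundary = {w for u in T for w in adj[u]} - T
            for w in boundary:
                stack.append(T | {w})
    return found
```

Every connected subset is reachable by repeatedly growing from any one of its vertices, so the enumeration is exhaustive; in a bounded-degree graph each subset has a bounded boundary, which is what keeps the count N=O(n)eO(r)N=O(n)\cdot e^{O(r)} in Lemma 3.8.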

Lemma 3.9 (Step 2).

For each TVHT\in V_{H}, its cost cTc_{T}, weight wTw_{T}, and partial assignment ξT\xi_{T} can be computed in eO(r)(1/ε)O(1)e^{O(r)}\cdot(1/\varepsilon)^{O(1)} time.

Proof.

The cost cTc_{T} is trivial. For the weight wTw_{T}, we first compute ψT(σT)\psi_{T}(\sigma_{T}) which approximates Φ(r)(T,σT)\Phi^{(r)}(T,\sigma_{T}), for all choices of σT{±1}T\sigma_{T}\in\{\pm 1\}^{T}. The approximation of Φ(r)(T,σT)\Phi^{(r)}(T,\sigma_{T}) follows from [Wei06, SST14, Bar16, PR17] which present deterministic approximate counting algorithms (𝖥𝖯𝖳𝖠𝖲\mathsf{FPTAS}) for high-temperature Ising models. More specifically, observe that by definition Φ(r)(T,σT)\Phi^{(r)}(T,\sigma_{T}) is a linear combination of marginal probabilities at each vertex in 𝖡(T,r)\mathsf{B}(T,r) either with or without the pinning σT\sigma_{T}. Thus, one can estimate all such marginals within an additive error ε=ε4k|𝖡(T,r)|\varepsilon^{\prime}=\frac{\varepsilon}{4k|\mathsf{B}(T,r)|}, and then obtain ψT(σT)\psi_{T}(\sigma_{T}) from these estimates such that

|ψT(σT)Φ(r)(T,σT)|2|𝖡(T,r)|ε=ε2k.\left|\psi_{T}(\sigma_{T})-\Phi^{(r)}(T,\sigma_{T})\right|\leq 2|\mathsf{B}(T,r)|\cdot\varepsilon^{\prime}=\frac{\varepsilon}{2k}.

The running time of this is

|𝖡(T,r)|O(1)(1/ε)O(1)=eO(r)(1/ε)O(1).|\mathsf{B}(T,r)|^{O(1)}\cdot(1/\varepsilon^{\prime})^{O(1)}=e^{O(r)}\cdot(1/\varepsilon)^{O(1)}.

Given ψT(σT)\psi_{T}(\sigma_{T}) for all σT{±1}T\sigma_{T}\in\{\pm 1\}^{T}, we can then find ξT\xi_{T} and wTw_{T}. Note that the number of choices of σT\sigma_{T} is at most 2k2^{k} which is O(1)O(1). ∎

The algorithm for Step 3 is given by the following lemma.

Lemma 3.10 (Step 3).

Let H=(V,E)H=(V,E) be an NN-vertex graph of maximum degree at most D3D\geq 3. Suppose every vertex TVT\in V is assigned an integral cost cT+c_{T}\in\mathbb{N}^{+} and a real weight wTw_{T}\in\mathbb{R}. Then for any fixed k+k\in\mathbb{N}^{+} with k=O(1)k=O(1), there exists an algorithm that finds a maximum weighted independent set II of HH with total cost at most kk in time O(DN)+DO(1)O(DN)+D^{O(1)}.

Proof.

For 1ik1\leq i\leq k, define V(i)={TV:cT=i}V^{(i)}=\{T\in V:c_{T}=i\} to be the set of all vertices of cost ii. Let U(i)={T1(i),,Tti(i)}V(i)U^{(i)}=\{T^{(i)}_{1},\dots,T^{(i)}_{t_{i}}\}\subseteq V^{(i)} be the tit_{i} vertices of largest weight from V(i)V^{(i)} (breaking ties arbitrarily), where ti=|U(i)|=min{k(D+1),|V(i)|}t_{i}=|U^{(i)}|=\min\{k(D+1),|V^{(i)}|\}. Finally, let U=i=1kU(i)U=\bigcup_{i=1}^{k}U^{(i)}. Observe that UU can be found in O(DN)O(DN) time.

We claim that there exists a maximum weighted independent set II^{*} with total cost at most kk such that II^{*} is completely contained in UU. To prove the claim, define II^{*} to be a maximum weighted independent set with total cost at most kk that contains the most vertices of UU; it suffices to prove IUI^{*}\subseteq U. Suppose for the sake of contradiction that IUI^{*}\not\subseteq U. Take any TIUT\in I^{*}\setminus U, and assume that TV(i)T\in V^{(i)} for some ii. Since TU=j=1kU(j)T\not\in U=\bigcup_{j=1}^{k}U^{(j)}, we have TV(i)U(i)T\in V^{(i)}\setminus U^{(i)}, so |V(i)|>|U(i)|=k(D+1)|V^{(i)}|>|U^{(i)}|=k(D+1). We say a vertex T1T_{1} blocks a vertex T2T_{2} if either T1=T2T_{1}=T_{2} or T1,T2T_{1},T_{2} are adjacent; thus, every vertex blocks at most D+1D+1 vertices. Since II^{*} has total cost at most kk and every cost is a positive integer, we have |I{T}|k1|I^{*}\setminus\{T\}|\leq k-1, so the vertices in I{T}I^{*}\setminus\{T\} block at most (D+1)(k1)|U(i)|1(D+1)(k-1)\leq|U^{(i)}|-1 vertices altogether. Hence, there exists a vertex TU(i)T^{\prime}\in U^{(i)} which is not blocked by I{T}I^{*}\setminus\{T\}. In particular, I=I{T}{T}I^{\prime}=I^{*}\setminus\{T\}\cup\{T^{\prime}\} is an independent set with the same cost (cT=i=cTc_{T^{\prime}}=i=c_{T}) and no smaller weight (wTwTw_{T^{\prime}}\geq w_{T} by the definition of U(i)U^{(i)}), while containing one more vertex of UU than II^{*}. This is a contradiction.

Given the claim, one only needs to enumerate all subsets of UU of size at most kk to find the maximum weighted independent set with cost constraint kk. Since |U|=O(D)|U|=O(D), this can be done in DO(1)D^{O(1)} time, finishing the proof. ∎
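The prune-then-enumerate argument above can be sketched directly in Python (a minimal illustration under our own naming conventions; the interface is not from the paper):

```python
from itertools import combinations

def max_weight_is_with_budget(vertices, edges, cost, weight, k, D):
    """Maximum-weight independent set of total cost <= k, for constant k.
    Following Lemma 3.10: within each cost class i, only the k*(D+1)
    heaviest vertices can matter (the pruning claim in the proof), so we
    keep those and brute-force over subsets of the pruned set U."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    U = []
    for i in range(1, k + 1):  # a vertex of cost > k can never be used
        cls = sorted((v for v in vertices if cost[v] == i),
                     key=lambda v: weight[v], reverse=True)
        U.extend(cls[:k * (D + 1)])
    best, best_w = frozenset(), 0.0
    for s in range(1, k + 1):
        for I in combinations(U, s):
            if sum(cost[v] for v in I) > k:
                continue
            if any(u in adj[v] for u, v in combinations(I, 2)):
                continue
            w = sum(weight[v] for v in I)
            if w > best_w:
                best, best_w = frozenset(I), w
    return best, best_w
```

Since |U|=O(D)|U|=O(D) and k=O(1)k=O(1), the enumeration over subsets of UU of size at most kk takes DO(1)D^{O(1)} time, matching the lemma.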

Finally, we show the correctness of our algorithm.

Lemma 3.11 (Step 4).

Let (S,σS)(S^{\dagger},\sigma_{S^{\dagger}}) be the maximizer for the local influence defined in Eq. 6. We have that

Φ(r)(S^,σS^)Φ(r)(S,σS)ε.\Phi^{(r)}(\hat{S},\sigma_{\hat{S}})\geq\Phi^{(r)}(S^{\dagger},\sigma_{S^{\dagger}})-\varepsilon.
Proof.

By Lemma 3.7, the optimal solution (S,σS)(S^{\dagger},\sigma_{S^{\dagger}}) corresponds to an independent set II^{\dagger} of HH such that

S=TITandσS=(ηT)TIwhereηT=argmaxσT{±1}T{Φ(r)(T,σT)}.S^{\dagger}=\bigcup_{T\in I^{\dagger}}T\qquad\text{and}\qquad\sigma_{S^{\dagger}}=\left(\eta_{T}\right)_{T\in I^{\dagger}}~{}\text{where}~{}\eta_{T}=\operatorname*{arg\,max}_{\sigma_{T}\in\{\pm 1\}^{T}}\left\{\Phi^{(r)}(T,\sigma_{T})\right\}.

We then deduce that

Φ(r)(S^,σS^)\displaystyle\Phi^{(r)}(\hat{S},\sigma_{\hat{S}}) =Lem 3.7TIΦ(r)(T,ξT)Eq. 9TIψT(ξT)ε2=Eq. 11TIwTε2\displaystyle\overset{\text{Lem \ref{lem:decomp}}}{=}\sum_{T\in I^{*}}\Phi^{(r)}(T,\xi_{T})\overset{\text{\lx@cref{creftype~refnum}{eq:psi-approx}}}{\geq}\sum_{T\in I^{*}}\psi_{T}(\xi_{T})-\frac{\varepsilon}{2}\overset{\text{\lx@cref{creftype~refnum}{eq:wT}}}{=}\sum_{T\in I^{*}}w_{T}-\frac{\varepsilon}{2}
Eq. 12TIwTε2=Eq. 11TIψT(ξT)ε2Eq. 10TIψT(ηT)ε2\displaystyle\overset{\text{\lx@cref{creftype~refnum}{eq:I*-max}}}{\geq}\sum_{T\in I^{\dagger}}w_{T}-\frac{\varepsilon}{2}\overset{\text{\lx@cref{creftype~refnum}{eq:wT}}}{=}\sum_{T\in I^{\dagger}}\psi_{T}(\xi_{T})-\frac{\varepsilon}{2}\overset{\text{\lx@cref{creftype~refnum}{eq:xiT}}}{\geq}\sum_{T\in I^{\dagger}}\psi_{T}(\eta_{T})-\frac{\varepsilon}{2}
Eq. 9TIΦ(r)(T,ηT)ε=Lem 3.7Φ(r)(S,σS)ε,\displaystyle\overset{\text{\lx@cref{creftype~refnum}{eq:psi-approx}}}{\geq}\sum_{T\in I^{\dagger}}\Phi^{(r)}(T,\eta_{T})-\varepsilon\overset{\text{Lem \ref{lem:decomp}}}{=}\Phi^{(r)}(S^{\dagger},\sigma_{S^{\dagger}})-\varepsilon,

as claimed. ∎

4 Hardness Result

We establish computational hardness of kk-Inf-Max for low-temperature Ising models from the hardness of estimating the marginal probabilities of single vertices, which is a direct consequence of hardness of approximate counting [Sly10, SS14, GŠV16] and self-reducibility.

Theorem 4.1 ([Sly10, SS14, GŠV16]).

Suppose Δ3\Delta\geq 3 is an integer and δ>0\delta>0 is a real number. Assuming 𝖱𝖯𝖭𝖯\mathsf{RP}\neq\mathsf{NP}, there is no 𝖥𝖯𝖱𝖠𝖲\mathsf{FPRAS} for the following problem: Given an Ising model on a graph G=(V,E)G=(V,E) from the family (Δ,1+δ)\mathcal{M}(\Delta,1+\delta) and a vertex vVv\in V, estimate (Xv=+1)\mathbb{P}(X_{v}=+1).

Proof.

Given an 𝖥𝖯𝖱𝖠𝖲\mathsf{FPRAS} for estimating marginals, one can approximate the partition function efficiently. More specifically, write V={v1,,vn}V=\{v_{1},\dots,v_{n}\}; then

(X1=+1,,Xn=+1)=i=1n(Xi=+1X1=+1,,Xi1=+1).\mathbb{P}(X_{1}=+1,\dots,X_{n}=+1)=\prod_{i=1}^{n}\mathbb{P}(X_{i}=+1\mid X_{1}=+1,\dots,X_{i-1}=+1).

Each (Xi=+1X1=+1,,Xi1=+1)\mathbb{P}(X_{i}=+1\mid X_{1}=+1,\dots,X_{i-1}=+1) corresponds to the marginal at viv_{i} in an Ising model on the subgraph induced by {vi,,vn}\{v_{i},\dots,v_{n}\} where the pinning on {v1,,vi1}\{v_{1},\dots,v_{i-1}\} becomes external fields. Thus, we can approximate (X1=+1,,Xn=+1)\mathbb{P}(X_{1}=+1,\dots,X_{n}=+1) and hence the partition function via Eq. 2. We therefore deduce the theorem from the hardness results for computing the partition function in low-temperature Ising models [Sly10, SS14, GŠV16]. ∎
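The telescoping identity is easy to verify numerically. The sketch below (our own illustration; a brute-force conditional-marginal routine on a toy model stands in for the marginal oracle that an FPRAS would supply on large instances) reconstructs the joint probability of the all-plus assignment from conditional marginals:

```python
from itertools import product
import math

def ising_weight(beta, h, x):
    """Unnormalized weight exp(sum_{uv} beta_uv x_u x_v + sum_v h_v x_v)."""
    s = sum(b * x[u] * x[v] for (u, v), b in beta.items())
    s += sum(hv * xv for hv, xv in zip(h, x))
    return math.exp(s)

def cond_marginal_plus(beta, h, n, i, pinned):
    """P(X_i = +1 | X_j = +1 for all j in pinned), by brute force on a
    tiny model; this plays the role of the marginal oracle."""
    num = den = 0.0
    for x in product([-1, 1], repeat=n):
        if any(x[j] != 1 for j in pinned):
            continue
        w = ising_weight(beta, h, x)
        den += w
        if x[i] == 1:
            num += w
    return num / den

def joint_all_plus_via_chain_rule(beta, h, n):
    """P(X_1 = ... = X_n = +1) as the telescoping product of conditional
    marginals, as in the proof of Theorem 4.1."""
    return math.prod(cond_marginal_plus(beta, h, n, i, range(i))
                     for i in range(n))
```

Since the pinned vertices only contribute external fields to the remaining ones, each factor is itself a single-vertex marginal of a smaller Ising model, which is the self-reducibility step in the proof.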

We now give the proof of Theorem 2.4.

Proof of Theorem 2.4.

We may assume without loss of generality that δ1\delta\leq 1. Given a polynomial-time algorithm for kk-Inf-Max for the family (Δ,1+δ)\mathcal{M}(\Delta,1+\delta) and 11-bounded vertex weights, we show how to efficiently estimate (Xv=+1)\mathbb{P}(X_{v}=+1) for an Ising model on a graph G=(V,E)G=(V,E) from the family (Δ,1+δ)\mathcal{M}(\Delta,1+\delta) and a vertex vVv\in V.

Define a graph GG^{\prime} which is the disjoint union of GG and kk distinct isolated vertices u1,,uku_{1},\dots,u_{k}. Each uiu_{i} has the same external field h(ui)=xh(u_{i})=x, where xx is a value we can choose freely. Together with βE\beta\in\mathbb{R}^{E} and hVh\in\mathbb{R}^{V} this defines an Ising model on GG^{\prime} which is still in the family (Δ,1+δ)\mathcal{M}(\Delta,1+\delta). Let aV{u1,,uk}a\in\mathbb{R}^{V\cup\{u_{1},\dots,u_{k}\}} be a 11-bounded vertex weight vector defined by a(v)=1a(v)=1, a(ui)=1a(u_{i})=1 for i=1,,ki=1,\dots,k, and a(u)=0a(u)=0 for all other vertices.

Consider the kk-Inf-Max problem for the Ising model on GG^{\prime} and the weight vector aa. Let U={u1,,uk}U=\{u_{1},\dots,u_{k}\} and W={v,u1,,uk1}W=\{v,u_{1},\dots,u_{k-1}\}. For a subset SS of vertices, let +S{±1}S\boldsymbol{+}_{S}\in\{\pm 1\}^{S} denote the partial assignment that assigns ++ to all vertices in SS. We claim that

maxSVU,|S|kσS{±1}S{Φ(S,σS)}\displaystyle\max_{\begin{subarray}{c}S\subseteq V\cup U,\,|S|\leq k\\ \sigma_{S}\in\{\pm 1\}^{S}\end{subarray}}\left\{\Phi(S,\sigma_{S})\right\} =max{Φ(U,+U),Φ(W,+W)}.\displaystyle=\max\left\{\Phi(U,\boldsymbol{+}_{U}),\Phi(W,\boldsymbol{+}_{W})\right\}. (13)

To see this, consider a feasible pair (S,σS)(S,\sigma_{S}). If S=US=U, then

Φ(U,σU)=i=1k(σui𝔼[Xui])i=1k(1tanhx)=k(1tanhx)=Φ(U,+U).\displaystyle\Phi(U,\sigma_{U})=\sum_{i=1}^{k}(\sigma_{u_{i}}-\mathbb{E}[X_{u_{i}}])\leq\sum_{i=1}^{k}(1-\tanh x)=k(1-\tanh x)=\Phi(U,\boldsymbol{+}_{U}). (14)

If SUS\neq U, then without loss of generality suppose SU={u1,,uj}S\cap U=\{u_{1},\dots,u_{j}\} where jk1j\leq k-1 and we have

Φ(S,σS)\displaystyle\Phi(S,\sigma_{S}) =𝔼[XvXSU=σSU]𝔼[Xv]+i=1j(σui𝔼[Xui])\displaystyle=\mathbb{E}[X_{v}\mid X_{S\setminus U}=\sigma_{S\setminus U}]-\mathbb{E}[X_{v}]+\sum_{i=1}^{j}(\sigma_{u_{i}}-\mathbb{E}[X_{u_{i}}])
1𝔼[Xv]+(k1)(1tanhx)=Φ(W,+W).\displaystyle\leq 1-\mathbb{E}[X_{v}]+(k-1)(1-\tanh x)=\Phi(W,\boldsymbol{+}_{W}). (15)

Therefore, Eq. 13 follows from Eqs. 14 and 15.

Suppose the provided algorithm returns (S^,σS^)(\hat{S},\sigma_{\hat{S}}) which satisfies

Φ(S^,σS^)maxSVU,|S|kσS{±1}S{Φ(S,σS)}ε=max{Φ(U,+U),Φ(W,+W)}ε.\Phi(\hat{S},\sigma_{\hat{S}})\geq\max_{\begin{subarray}{c}S\subseteq V\cup U,\,|S|\leq k\\ \sigma_{S}\in\{\pm 1\}^{S}\end{subarray}}\left\{\Phi(S,\sigma_{S})\right\}-\varepsilon=\max\left\{\Phi(U,\boldsymbol{+}_{U}),\Phi(W,\boldsymbol{+}_{W})\right\}-\varepsilon.

If S^=U\hat{S}=U, then we deduce from Eq. 14 that

Φ(U,+U)Φ(S^,σS^)Φ(W,+W)ε,\Phi(U,\boldsymbol{+}_{U})\geq\Phi(\hat{S},\sigma_{\hat{S}})\geq\Phi(W,\boldsymbol{+}_{W})-\varepsilon,

implying 𝔼[Xv]tanhxε\mathbb{E}[X_{v}]\geq\tanh x-\varepsilon. If S^U\hat{S}\neq U, then we deduce from Eq. 15 that

Φ(W,+W)Φ(S^,σS^)Φ(U,+U)ε,\Phi(W,\boldsymbol{+}_{W})\geq\Phi(\hat{S},\sigma_{\hat{S}})\geq\Phi(U,\boldsymbol{+}_{U})-\varepsilon,

implying 𝔼[Xv]tanhx+ε\mathbb{E}[X_{v}]\leq\tanh x+\varepsilon. Thus, by binary searching over tanhx\tanh x we can estimate 𝔼[Xv]\mathbb{E}[X_{v}] efficiently within additive error ε\varepsilon with high probability. This yields an estimator for (Xv=+1)\mathbb{P}(X_{v}=+1) with multiplicative error ε\varepsilon, since (Xv=+1)\mathbb{P}(X_{v}=+1) is bounded away from zero when δ1\delta\leq 1. ∎
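The binary search in the reduction can be sketched as follows (our own illustration: `decide_U_wins` abstracts one call to the kk-Inf-Max algorithm, and the toy oracle at the end, with a hypothetical true mean of 0.37, is exact rather than ε\varepsilon-approximate; with the approximate algorithm one stops once the bracket has width O(ε)O(\varepsilon)):

```python
def estimate_mean_via_infmax(decide_U_wins, eps):
    """Binary search for E[X_v], following the reduction in the proof of
    Theorem 2.4: on G' (G plus k isolated vertices with field x), pinning
    U = {u_1, ..., u_k} to + beats pinning W = {v, u_1, ..., u_{k-1}}
    to + exactly when E[X_v] >= tanh x. `decide_U_wins(m)` abstracts one
    call to the k-Inf-Max algorithm with m = tanh x."""
    lo, hi = -1.0, 1.0
    while hi - lo > eps:
        m = (lo + hi) / 2
        if decide_U_wins(m):   # algorithm returned S_hat = U
            lo = m
        else:                  # algorithm returned S_hat != U
            hi = m
    return (lo + hi) / 2

# Toy stand-in for the comparison: with an exact solver,
# Phi(U, +_U) = k(1 - m)  >=  Phi(W, +_W) = 1 - E[X_v] + (k-1)(1 - m)
# holds iff E[X_v] >= m. Hypothetical true mean 0.37 for illustration.
toy_oracle = lambda m: 0.37 >= m
```

Each oracle call costs one run of the kk-Inf-Max algorithm, and O(log(1/ε))O(\log(1/\varepsilon)) calls pin down 𝔼[Xv]\mathbb{E}[X_{v}] to additive accuracy ε\varepsilon.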

References

  • [ALO20] Nima Anari, Kuikui Liu, and Shayan Oveis Gharan. Spectral independence in high-dimensional expanders and applications to the hardcore model. In Proceedings of the 61st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 1319–1330, 2020.
  • [APB10] Kristine Eia S Antonio, Chrysline Margus N Pinol, and Ronald S Banzon. An Ising model approach to malware epidemiology. arXiv preprint arXiv:1007.4938, 2010.
  • [Bar16] Alexander Barvinok. Combinatorics and Complexity of Partition Functions, volume 30. Springer Algorithms and Combinatorics, 2016.
  • [BCKL13] Christian Borgs, Jennifer Chayes, Jeff Kahn, and László Lovász. Left and right convergence of graphs with bounded degree. Random Structures & Algorithms, 42(1):1–28, 2013.
  • [BKM19] Guy Bresler, Frederic Koehler, and Ankur Moitra. Learning restricted Boltzmann machines via influence maximization. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 828–839, 2019.
  • [CLV20] Zongchen Chen, Kuikui Liu, and Eric Vigoda. Rapid mixing of Glauber dynamics up to uniqueness via contraction. In Proceedings of the 61st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 1307–1318, 2020.
  • [CMMS23] Byron Chin, Ankur Moitra, Elchanan Mossel, and Colin Sandon. The power of an adversary in Glauber dynamics. arXiv preprint arXiv:2302.10841, 2023.
  • [GŠV16] Andreas Galanis, Daniel Štefankovič, and Eric Vigoda. Inapproximability of the partition function for the antiferromagnetic Ising and hard-core models. Combinatorics, Probability and Computing, 25(4):500–559, 2016.
  • [KKT03] David Kempe, Jon Kleinberg, and Éva Tardos. Maximizing the spread of influence through a social network. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 137–146, 2003.
  • [KKT05] David Kempe, Jon Kleinberg, and Éva Tardos. Influential nodes in a diffusion model for social networks. In Proceedings of the 32nd International Colloquium on Automata, Languages and Programming (ICALP), pages 1127–1138, 2005.
  • [Lip22] Adam Lipowski. Ising model: Recent developments and exotic applications. Entropy, 24(12):1834, 2022.
  • [LLY13] Liang Li, Pinyan Lu, and Yitong Yin. Correlation decay up to uniqueness in spin systems. In Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 67–84, 2013.
  • [LMLA19] Cristina Gabriela Aguilar Lara, Eduardo Massad, Luis Fernandez Lopez, and Marcos Amaku. Analogy between the formulation of the Ising-Glauber model and the SI epidemiological model. Journal of Applied Mathematics and Physics, 7(05):1052, 2019.
  • [LYS10] Shihuan Liu, Lei Ying, and Srinivas Shakkottai. Influence maximization in social networks: An Ising-model-based approach. In Proceedings of the 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 570–576. IEEE, 2010.
  • [MLO01] Jacek Majewski, Hao Li, and Jurg Ott. The Ising model in physics and statistical genetics. The American Journal of Human Genetics, 69(4):853–862, 2001.
  • [MR10] Elchanan Mossel and Sébastien Roch. Submodularity of influence in social networks: From local to global. SIAM Journal on Computing, 39(6):2176–2188, 2010.
  • [MS10] Andrea Montanari and Amin Saberi. The spread of innovations in social networks. Proceedings of the National Academy of Sciences, 107(47):20196–20201, 2010.
  • [PR17] Viresh Patel and Guus Regts. Deterministic polynomial-time approximation algorithms for partition functions and graph polynomials. SIAM Journal on Computing, 46(6):1893–1919, 2017.
  • [Sly10] Allan Sly. Computational transition at the uniqueness threshold. In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 287–296, 2010.
  • [SS14] Allan Sly and Nike Sun. The computational hardness of counting in two-spin models on dd-regular graphs. The Annals of Probability, 42(6):2383–2416, 2014.
  • [SST14] Alistair Sinclair, Piyush Srivastava, and Marc Thurley. Approximation algorithms for two-state anti-ferromagnetic spin systems on bounded degree graphs. Journal of Statistical Physics, 155(4):666–686, 2014.
  • [Wei06] Dror Weitz. Counting independent sets up to the tree threshold. In Proceedings of the 38th Annual ACM Symposium on Theory of Computing (STOC), pages 140–149, 2006.