
Solving Satisfiability Modulo Counting for Symbolic and Statistical AI Integration
With Provable Guarantees

Jinzhao Li, Nan Jiang, Yexiang Xue
Abstract

Satisfiability Modulo Counting (SMC) encompasses problems that require both symbolic decision-making and statistical reasoning. Its general formulation captures many real-world problems at the intersection of symbolic and statistical Artificial Intelligence. SMC searches for policy interventions to control probabilistic outcomes. Solving SMC is challenging because of its highly intractable nature ($\text{NP}^{\text{PP}}$-complete), incorporating statistical inference and symbolic reasoning. Previous research on SMC solving lacks provable guarantees and/or suffers from sub-optimal empirical performance, especially when combinatorial constraints are present. We propose XOR-SMC, a polynomial algorithm with access to NP-oracles, to solve highly intractable SMC problems with constant approximation guarantees. XOR-SMC transforms the highly intractable SMC into satisfiability problems by replacing the model counting in SMC with SAT formulae subject to randomized XOR constraints. Experiments on solving important SMC problems in AI for social good demonstrate that XOR-SMC outperforms several baselines both in solution quality and running time.

1 Introduction

Symbolic and statistical approaches are two fundamental driving forces of Artificial Intelligence (AI). Symbolic AI, exemplified by SATisfiability (SAT) and constraint programming, finds solutions that satisfy constraints, but it requires rigid formulations and has difficulty incorporating probabilities. Statistical AI captures uncertainty but often lacks constraint satisfaction. Integrating symbolic and statistical AI remains an open field and has gained research attention recently (cpm 2023; nes 2023; Munawar et al. 2023).

Satisfiability Modulo Counting (SMC) is an umbrella problem at the intersection of symbolic and statistical AI. It encompasses problems that carry out symbolic decision-making (satisfiability) mixed with statistical reasoning (model counting). SMC searches for policy interventions to control probabilistic outcomes. Formally, SMC is an SAT problem involving predicates on model counts. Model counting computes the number of models (i.e., solutions) to an SAT formula. Its weighted form subsumes probabilistic inference on Machine Learning (ML) models.

As a motivating SMC application, stochastic connectivity optimization searches for the optimal plan to reinforce a network structure so that its connectivity is preserved under stochastic events – a central problem for a city planner who works on securing her residents multiple paths to emergency shelters in case of natural disasters. This problem is useful for disaster preparation (Wu, Sheldon, and Zilberstein 2015), bio-diversity protection (Dilkina and Gomes 2010), internet resilience (Israeli and Wood 2002), social influence maximization (Kempe, Kleinberg, and Tardos 2005), energy security (Almeida et al. 2019), etc. It requires symbolic reasoning (satisfiability) to decide which roads to reinforce and where to place emergency shelters, and statistical inference (model counting) to reason about the number of paths to shelters and the probabilities of natural disasters. Despite successes in many use cases, previous approaches (Williams and Snyder 2005; Conrad et al. 2012; Sheldon et al. 2010; Wu, Sheldon, and Zilberstein 2014) find solutions that lack certifiable guarantees, which are unfortunately needed for policy adoption in this safety-related application. Besides, their surrogate approximations of connectivity may overlook important probabilistic scenarios, resulting in suboptimal quality of the generated plans. As application domains for SMC solvers, this paper considers emergency shelter placement and supply chain network management – two important stochastic connectivity optimization problems.

It is challenging to solve SMC because of its highly intractable nature ($\text{NP}^{\text{PP}}$-complete) (Park and Darwiche 2004) – it remains intractable even with good satisfiability solvers (Biere et al. 2009; Rossi, van Beek, and Walsh 2006; Braunstein, Mézard, and Zecchina 2005) and model counters (Gomes, Sabharwal, and Selman 2006a; Ermon et al. 2013b; Achlioptas and Theodoropoulos 2017; Chakraborty, Meel, and Vardi 2013; Kisa et al. 2014; Cheng et al. 2012; Gogate and Dechter 2012). Previous research on SMC solves either a special case or domain-specific applications (Belanger and McCallum 2016; Welling and Teh 2003; Yedidia, Freeman, and Weiss 2000; Wainwright and Jordan 2008; Fang, Stone, and Tambe 2015; Conitzer and Sandholm 2006; Sheldon et al. 2010). The special case is the Marginal Maximum-A-Posterior (MMAP) problem, whose decision version can be formulated as a special case of SMC (Marinescu, Dechter, and Ihler 2014; Liu and Ihler 2013; Mauá and de Campos 2012; Jiang, Rai, and III 2011; Lee et al. 2016; Ping, Liu, and Ihler 2015). Both cases are solved by optimizing surrogate representations of the intractable model counting in variational forms (Liu and Ihler 2012; Kiselev and Poupart 2014), via knowledge compilation (Choi, Friedman, and den Broeck 2022; Ping, Liu, and Ihler 2015; Mei, Jiang, and Tu 2018), or via sample average approximation (Kleywegt, Shapiro, and Homem-de-Mello 2002; Shapiro 2003; Swamy and Shmoys 2006; Sheldon et al. 2010; Dyer and Stougie 2006; Wu et al. 2017; Xue, Fern, and Sheldon 2015; Verweij et al. 2003).

Nevertheless, previous approaches either cannot quantify the quality of their solutions, offer only one-sided guarantees, or offer guarantees that can be arbitrarily loose. First, the lack of tight guarantees results in delayed policy adoption in safety-related applications such as the stochastic connectivity optimization considered in this paper. Second, optimizing surrogate objectives without quantifying the quality of the approximation leads to empirically sub-optimal behavior. For example, previous stochastic connectivity optimization solvers occasionally produce suboptimal plans because their surrogate approximations overlook cases of significant probability. This problem is amplified when combinatorial constraints are present.

We propose XOR-SMC, a polynomial algorithm accessing NP oracles, to solve highly intractable SMC problems with constant approximation guarantees. These guarantees hold with high (e.g., $>99\%$) probability. The strong guarantees enable policy adoption in safety-related domains and improve the empirical performance of SMC solving (e.g., eliminating sub-optimal behavior and providing constraint satisfaction guarantees). The constant approximation means that the solver correctly decides the truth of an SMC formula as long as tightening or relaxing the bounds on the model counts by a multiplicative constant does not change its truth value. The embedding algorithm allows us to find approximate solutions to beyond-NP SMC problems by querying NP oracles, expanding the applicability of state-of-the-art SAT solvers to highly intractable problems.

The high-level idea behind XOR-SMC is as follows. Imagine a magic that randomly filters out half of the models (solutions) of an SAT formula. Model counting can be approximated using this magic and an SAT solver – we confirm that the SAT formula has more than $2^k$ models if it is still satisfiable after applying this magic $k$ times. This magic can be implemented by introducing randomized constraints. The idea was developed by a line of researchers (Valiant and Vazirani 1986; Jerrum, Valiant, and Vazirani 1986; Gomes, Sabharwal, and Selman 2006b, a; Ermon et al. 2013b, a; Kuck et al. 2019; Achlioptas and Theodoropoulos 2017; Chakraborty, Meel, and Vardi 2013; Chakraborty et al. 2014). In these works, model counting is approximated with guarantees using polynomial algorithms accessing NP oracles. XOR-SMC notices that such polynomial algorithms can themselves be encoded as SAT formulae. Hence, SAT-Modulo-Counting can be written as SAT-Modulo-SAT (or equivalently, SAT) when we embed the SAT formulae compiled from the model counting algorithms into SMC. The constant approximation guarantee also carries over.

We evaluate the performance of XOR-SMC on real-world stochastic connectivity optimization problems. In particular, we consider the applied problems of emergency shelter placement and supply chain management. For the shelter placement problem, our XOR-SMC finds shelter assignments of higher quality in less computation time than competing baselines. For wheat supply chain management, the solutions found by our XOR-SMC are better than those found by the baselines, and XOR-SMC also runs faster. The code is available at https://github.com/jil016/xor-smc; please refer to https://arxiv.org/abs/2309.08883 for the Appendix.

2 Preliminaries

Satisfiability Modulo Theories

Satisfiability Modulo Theory (SMT) determines the SATisfiability (SAT) of a Boolean formula that contains predicates whose truth values are determined by a background theory. SMT represents a line of successful efforts to build general-purpose logic reasoning engines, encompassing complex expressions containing bit vectors, real numbers, integers, strings, etc. (Barrett et al. 2021). Over the years, many good SMT solvers have been built, such as Z3 (de Moura and Bjørner 2008; Bjørner et al. 2018) and cvc5 (Barbosa et al. 2022). They play a crucial role in automated theorem proving, program analysis (Feser et al. 2020), program verification (K., Shoham, and Gurfinkel 2022), and software testing (de Moura and Bjørner 2007).

Model Counting and Probabilistic Inference

Model counting computes the number of models (i.e., satisfying variable assignments) to an SAT formula. Consider a Boolean formula $f(\mathbf{x})$, where the input $\mathbf{x}$ is a vector of Boolean variables and the output of $f$ is also Boolean. When we use 0 to represent false and 1 to represent true, $\sum_{\mathbf{x}}f(\mathbf{x})$ computes the model count. Model counting is closely related to probabilistic inference and machine learning because marginal inference on a wide range of probabilistic models can be formulated as a weighted model counting problem (Chavira and Darwiche 2008; Xue et al. 2016).
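As a minimal illustration of this definition (our own toy example in Python; the formula and names are ours, not from the paper), the model count of a small formula can be obtained by summing f over all assignments:

from itertools import product

# Toy formula f(x1, x2, x3) = (x1 OR x2) AND (NOT x1 OR x3).
def f(x1, x2, x3):
    return int((x1 or x2) and ((not x1) or x3))

# Summing f over all 0/1 assignments gives the model count.
model_count = sum(f(*assignment) for assignment in product([0, 1], repeat=3))
print(model_count)  # 4: the models are (0,1,0), (0,1,1), (1,0,1), (1,1,1)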

Exact approaches for probabilistic inference and model counting are often based on knowledge compilation (Darwiche and Marquis 2002; Kisa et al. 2014; Choi, Kisa, and Darwiche 2013; Xue, Choi, and Darwiche 2012). Approximate approaches include variational methods and sampling. Variational methods (Wainwright and Jordan 2008; Wainwright, Jaakkola, and Willsky 2003; Sontag et al. 2008; Hazan and Shashua 2010; Flerova et al. 2011) use tractable forms to approximate a complex probability distribution. Due to the tight relationship between counting and sampling (Jerrum, Valiant, and Vazirani 1986), sampling-based approaches are important for model counting. Importance sampling-based techniques such as SampleSearch (Gogate and Dechter 2007) are able to provide lower bounds. Markov Chain Monte Carlo is asymptotically accurate, but such methods cannot provide guarantees except in a limited number of cases (Jerrum and Sinclair 1997; Madras 2002). The authors of (Papandreou and Yuille 2010; Hazan and Jaakkola 2012; Balog et al. 2017) transform weighted integration into optimization queries using the extreme value distribution, which today is often called the “Gumbel trick” (Papandreou and Yuille 2011; Jang, Gu, and Poole 2017).

XOR Counting

There is an interesting connection between model counting and solving satisfiability problems subject to randomized XOR constraints. To illustrate this, hold $\mathbf{x}$ at $\mathbf{x}_0$ and suppose we would like to know whether $\sum_{\mathbf{y}\in\mathcal{Y}}f(\mathbf{x}_0,\mathbf{y})$ exceeds $2^q$. Consider the SAT formula:

f(\mathbf{x}_{0},\mathbf{y})\wedge\mathtt{XOR}_{1}(\mathbf{y})\wedge\ldots\wedge\mathtt{XOR}_{q}(\mathbf{y}).   (1)

Here, $\mathtt{XOR}_1,\ldots,\mathtt{XOR}_q$ are randomly sampled XOR constraints. $\mathtt{XOR}_i(\mathbf{y})$ is the logical XOR, or parity, of a randomly sampled subset of the variables in $\mathbf{y}$. In other words, $\mathtt{XOR}_i(\mathbf{y})$ is true if and only if an odd number of the randomly sampled variables in the subset are true.

1 Randomly sample $\mathtt{XOR}_1(\mathbf{y}),\ldots,\mathtt{XOR}_q(\mathbf{y})$;
2 if $f(\mathbf{x}_0,\mathbf{y})\wedge\mathtt{XOR}_1(\mathbf{y})\wedge\ldots\wedge\mathtt{XOR}_q(\mathbf{y})$ is satisfiable then
3     return True;
4 else
5     return False;
6 end if
Algorithm 1 XOR-Binary ($f$, $\mathbf{x}_0$, $q$)

Formula (1) is likely to be satisfiable if more than $2^q$ different $\mathbf{y}$ vectors render $f(\mathbf{x}_0,\mathbf{y})$ true. Conversely, Formula (1) is likely to be unsatisfiable if $f(\mathbf{x}_0,\mathbf{y})$ has fewer than $2^q$ satisfying assignments. The significance of this fact is that it essentially transforms model counting (beyond NP) into satisfiability problems (within NP). An intuitive explanation of why this holds is that each satisfying assignment $\mathbf{y}$ has a 50% chance of satisfying a randomly sampled XOR constraint. In other words, each XOR constraint “filters out” half of the satisfying assignments. For example, the number of models satisfying $f(\mathbf{x}_0,\mathbf{y})\wedge\mathtt{XOR}_1(\mathbf{y})$ is approximately half of the number satisfying $f(\mathbf{x}_0,\mathbf{y})$. Continuing this chain of reasoning, if $f(\mathbf{x}_0,\mathbf{y})$ has more than $2^q$ solutions, there are still satisfying assignments left after adding $q$ XOR constraints; hence Formula (1) is likely satisfiable. The reverse direction can be reasoned similarly. The precise mathematical statement of the constant approximation is in Lemma 1.
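To make the procedure concrete, the following is a hedged, brute-force sketch of Algorithm 1 in Python (our own illustration; sample_xors and xor_binary are hypothetical names). It samples q random XOR constraints, each the parity of a random subset of the m variables in y, and enumerates y directly; a practical implementation would instead pass the XOR-augmented formula to a SAT or CNF-XOR solver.

from itertools import product
import random

def sample_xors(q, m, rng):
    # q random XOR constraints, each over a random subset of the m variables
    return [[j for j in range(m) if rng.random() < 0.5] for _ in range(q)]

def xor_binary(f, x0, q, m, rng=random.Random(0)):
    # Algorithm 1: is f(x0, y), conjoined with q random XORs, still satisfiable?
    xors = sample_xors(q, m, rng)
    for y in product([0, 1], repeat=m):
        if f(x0, y) and all(sum(y[j] for j in s) % 2 == 1 for s in xors):
            return True
    return False

# If xor_binary(f, x0, q, m) returns True, f(x0, .) likely has more than 2^q models.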

Lemma 1.

(Jerrum, Valiant, and Vazirani 1986; Gomes, Sabharwal, and Selman 2006a; Ermon et al. 2013b) Given the Boolean function $f(\mathbf{x}_0,\mathbf{y})$ as defined above,

  • If $\sum_{\mathbf{y}}f(\mathbf{x}_0,\mathbf{y})\geq 2^{q_0}$, then for any $q\leq q_0$, with probability $1-\frac{2^{c}}{(2^{c}-1)^{2}}$, XOR-Binary($f$, $\mathbf{x}_0$, $q-c$) returns True.

  • If $\sum_{\mathbf{y}}f(\mathbf{x}_0,\mathbf{y})\leq 2^{q_0}$, then for any $q\geq q_0$, with probability $1-\frac{2^{c}}{(2^{c}-1)^{2}}$, XOR-Binary($f$, $\mathbf{x}_0$, $q+c$) returns False.
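For instance (our own instantiation of the bound), taking the slack constant $c=3$ or $c=5$ in Lemma 1 gives success probabilities of at least

1-\frac{2^{3}}{(2^{3}-1)^{2}}=1-\frac{8}{49}\approx 0.84\qquad\text{and}\qquad 1-\frac{2^{5}}{(2^{5}-1)^{2}}=1-\frac{32}{961}\approx 0.97,

so a modest slack already makes each call to XOR-Binary answer correctly with high probability.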

This idea of transforming model counting problems into SAT problems subject to randomized constraints is rooted in Leslie Valiant’s seminal work on unique SAT (Valiant and Vazirani 1986; Jerrum, Valiant, and Vazirani 1986) and has been developed by a rich line of work (Gomes, Sabharwal, and Selman 2006b, a; Ermon et al. 2013b, a; Kuck et al. 2019; Achlioptas and Theodoropoulos 2017; Chakraborty, Meel, and Vardi 2013; Chakraborty et al. 2014). This idea has recently gathered momentum thanks to the rapid progress in SAT solving (Maneva, Mossel, and Wainwright 2007; Braunstein, Mézard, and Zecchina 2005). The contribution of this work extends the success of SAT solvers to problems with even higher complexity, namely, $\text{NP}^{\text{PP}}$-complete SMC problems.

3 Problem Formulation

Satisfiability Modulo Counting (SMC) is Satisfiability Modulo Theory (SMT) (Barrett et al. 2009) with model counting as the background theory. A canonical definition of the SMC problem is to determine whether there exist $\mathbf{x}=(x_1,\dots,x_n)\in\{0,1\}^n$ and $\mathbf{b}=(b_1,\dots,b_k)\in\{0,1\}^k$ that satisfy the formula:

\phi(\mathbf{x},\mathbf{b}),\quad b_{i}\Leftrightarrow\left(\sum_{\mathbf{y}_{i}\in\mathcal{Y}_{i}}f_{i}(\mathbf{x},\mathbf{y}_{i})\geq 2^{q_{i}}\right),\quad\forall i\in\{1,\dots,k\}.   (2)

Here each $b_i$ is a Boolean predicate that is true if and only if the corresponding model count exceeds a threshold. Bold symbols (i.e., $\mathbf{x}$, $\mathbf{y}_i$, and $\mathbf{b}$) are vectors of Boolean variables. $\phi,f_1,\dots,f_k$ are Boolean functions (i.e., their inputs are Boolean vectors and their outputs are Boolean). We use 0 to represent false and 1 to represent true. Hence $\sum f_i$ computes the number of satisfying assignments (the model count) of $f_i$. The directions of the inequalities do not matter much because one can always negate each $f_i$. For instance, let $f(\mathbf{x},\mathbf{y})$ be a Boolean function (output 0 or 1). The predicate $\sum_{\mathbf{y}\in\mathcal{Y}}f(\mathbf{x},\mathbf{y})\leq 2^q$ can be converted by negating $f$ and modifying the threshold to $|\mathcal{Y}|-2^q$, resulting in the equivalent predicate $\sum_{\mathbf{y}\in\mathcal{Y}}(\neg f(\mathbf{x},\mathbf{y}))\geq|\mathcal{Y}|-2^q$.
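As a concrete illustration of Eq. (2) (our own toy instance; phi, f1, and toy_smc are hypothetical names, and real instances are far too large for enumeration), the following Python sketch searches over (x, b) and evaluates the counting predicate exactly:

from itertools import product

def phi(x, b):
    # hypothetical symbolic constraint: the counting predicate b_1 must hold
    return b[0] == 1

def f1(x, y):
    # hypothetical Boolean function over one x-bit and two y-bits
    return int(y[0] or (x[0] and y[1]))

def toy_smc(q1=1):
    for x in product([0, 1], repeat=1):
        for b in product([0, 1], repeat=1):
            count = sum(f1(x, y) for y in product([0, 1], repeat=2))
            # Eq. (2): b_1 is true if and only if the model count reaches 2^{q_1}
            if phi(x, b) and (b[0] == 1) == (count >= 2 ** q1):
                return True, x, b, count
    return False, None, None, None

print(toy_smc())  # (True, (0,), (1,), 2): two y-vectors satisfy f1, so b_1 = 1 is consistent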

Our XOR-SMC algorithm obtains a constant approximation guarantee for the following, slightly relaxed SMC problem. The problem $\mathtt{SMC}(\phi,f_1,\dots,f_k,q_1,\dots,q_k)$ finds a satisfying assignment $(\mathbf{x},\mathbf{b})$ for:

\phi(\mathbf{x},\mathbf{b})\wedge\left[b_{1}\Rightarrow\left(\sum_{\mathbf{y}_{1}\in\mathcal{Y}_{1}}f_{1}(\mathbf{x},\mathbf{y}_{1})\geq 2^{q_{1}}\right)\right]\wedge\dots\wedge\left[b_{k}\Rightarrow\left(\sum_{\mathbf{y}_{k}\in\mathcal{Y}_{k}}f_{k}(\mathbf{x},\mathbf{y}_{k})\geq 2^{q_{k}}\right)\right].   (3)

The only difference compared to the full-scale problem in Eq. (2) is the replacement of $\Leftrightarrow$ with $\Rightarrow$. This change allows us to derive a concise constant approximation bound. We also note that all the applied SMC problems considered in this paper can be formulated in this relaxed form. We thank the reviewers for pointing out the work of (Fredrikson and Jha 2014), which proposes a slightly different SMC formulation, focused applications in privacy, and an exact solver. Their formulation is a little more general than ours, since it allows predicates like $\sum f\geq\sum g$, while ours only allows $\sum f\geq$ a constant. However, our formulation can handle $\sum f\geq\sum g$ by formulating it as $(\sum f\geq\alpha)\wedge(\sum g\leq\alpha)$ and binary searching on $\alpha$.

4 The XOR-SMC Algorithm

The key motivation behind our proposed XOR-SMC algorithm is the observation that Algorithm 1 itself can be written as a Boolean formula via the Cook-Levin reduction. When we embed this Boolean formula into Eq. (3), the Satisfiability-Modulo-Counting problem translates into a Satisfiability-Modulo-SAT problem, or equivalently, an SAT problem. This embedding also preserves a constant approximation guarantee (see Theorem 2).

To illustrate the high-level idea, let us consider replacing each $\sum_{\mathbf{y}_i\in\mathcal{Y}_i}f_i(\mathbf{x},\mathbf{y}_i)\geq 2^{q_i}$ in Eq. (3) with the formula

f_{i}(\mathbf{x},\mathbf{y}_{i})\wedge\mathtt{XOR}_{1}(\mathbf{y}_{i})\wedge\ldots\wedge\mathtt{XOR}_{q_{i}}(\mathbf{y}_{i}).   (4)

We denote formula (4) as $\gamma(f_i,\mathbf{x},q_i,\mathbf{y}_i)$. This replacement results in the Boolean formula:

\phi(\mathbf{x},\mathbf{b})\wedge\left[b_{1}\Rightarrow\gamma(f_{1},\mathbf{x},q_{1},\mathbf{y}_{1})\right]\wedge\dots\wedge\left[b_{k}\Rightarrow\gamma(f_{k},\mathbf{x},q_{k},\mathbf{y}_{k})\right].   (5)
Figure 1: Our XOR-SMC (shown in Algorithm 2) replaces the intractable model counting with satisfiability problems subject to randomized XOR constraints and obtains constant approximation guarantees for SMC.

We argue that the satisfiability of formula (5) should be closely related to that of formula (3), due to the connection between model counting and satisfiability testing subject to randomized constraints (discussed in Section 2). To see this, note that Eq. (5) is satisfiable if and only if there exists $(\mathbf{x},\mathbf{b},\mathbf{y}_1,\ldots,\mathbf{y}_k)$ that renders Eq. (5) true (notice $\mathbf{y}_1,\ldots,\mathbf{y}_k$ are also its variables). Suppose $\mathtt{SMC}(\phi,f_1,\dots,f_k,q_1+c,\dots,q_k+c)$ is satisfiable (i.e., Eq. (3) is satisfiable when each $q_i$ is replaced with $q_i+c$). Let $(\mathbf{x},\mathbf{b})$ be a satisfying assignment. For any $b_i=1$ (true) in $\mathbf{b}$, we must have $\sum_{\mathbf{y}_i\in\mathcal{Y}_i}f_i(\mathbf{x},\mathbf{y}_i)\geq 2^{q_i+c}$. This implies that, with a good chance, there exists a $\mathbf{y}_i$ that renders $\gamma(f_i,\mathbf{x},q_i,\mathbf{y}_i)$ true, due to the discussed connection between model counting and SAT solving subject to randomized constraints. Hence $b_i\Rightarrow\gamma(f_i,\mathbf{x},q_i,\mathbf{y}_i)$ is true. For any $b_i=0$ (false), the implication is true by default. Combining these two facts with the fact that $\phi(\mathbf{x},\mathbf{b})$ is true, we see that Eq. (5) is true.

Conversely, suppose $\mathtt{SMC}(\phi,f_1,\dots,f_k,q_1-c,\dots,q_k-c)$ is not satisfiable. This implies that for every $(\mathbf{x},\mathbf{b})$, either $\phi(\mathbf{x},\mathbf{b})$ is false, or there exists at least one $j$ such that $b_j$ is true but $\sum_{\mathbf{y}_j\in\mathcal{Y}_j}f_j(\mathbf{x},\mathbf{y}_j)<2^{q_j-c}$. The first case implies Eq. (5) is false under the assignment. In the second case, $\sum_{\mathbf{y}_j\in\mathcal{Y}_j}f_j(\mathbf{x},\mathbf{y}_j)<2^{q_j-c}$ implies that, with a good chance, there is no $\mathbf{y}_j$ that makes $\gamma(f_j,\mathbf{x},q_j,\mathbf{y}_j)$ true. Combining these two cases, with a good chance Eq. (5) is not satisfiable.

In practice, to reduce the error probability, the determination of the model count needs to rely on the majority satisfiability status of a series of formulae of the form (4) (instead of a single one). Hence we develop Algorithm 2, which is a little more complex than the high-level idea discussed above. The idea is still to transform the highly intractable SMC problem into an SAT problem of polynomial size while ensuring a constant approximation guarantee. Fig. 1 displays the encoding of Algorithm 2. The core is still to replace the intractable model counting with satisfiability problems subject to randomized constraints. We prove that XOR-SMC has a constant approximation guarantee in Theorem 2. We leave the implementation details of XOR-SMC to Appendix B.

1  $T\leftarrow\lceil\frac{(n+k)\ln 2-\ln\eta}{\alpha(c,k)}\rceil$;
2  for $t=1$ to $T$ do
3      for $i=1$ to $k$ do
4          $\psi_{i}^{(t)}\leftarrow f_{i}(\mathbf{x},\mathbf{y}_{i}^{(t)})$;
5          for $j=1,\ldots,q_{i}$ do
6              $\psi_{i}^{(t)}\leftarrow\psi_{i}^{(t)}\wedge\mathtt{XOR}_{j}(\mathbf{y}_{i}^{(t)})$;
7          end for
8          $\psi_{i}^{(t)}\leftarrow\psi_{i}^{(t)}\vee\neg b_{i}$;
9      end for
10     $\psi_{t}\leftarrow\psi_{1}^{(t)}\wedge\dots\wedge\psi_{k}^{(t)}$;
11 end for
12 $\phi^{*}\leftarrow\phi\wedge\mathtt{Majority}(\psi_{1},\ldots,\psi_{T})$;
13 if there exists $(\mathbf{x},\mathbf{b},\{\mathbf{y}_{i}^{(1)}\}_{i=1}^{k},\ldots,\{\mathbf{y}_{i}^{(T)}\}_{i=1}^{k})$ that satisfies $\phi^{*}$ then
14     return True;
15 else
16     return False;
17 end if
Algorithm 2 XOR-SMC ($\phi$, $\{f_{i}\}_{i=1}^{k}$, $\{q_{i}\}_{i=1}^{k}$, $\eta$, $c$)
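For concreteness, the following is a hedged, brute-force sketch of Algorithm 2 in Python for tiny instances (our own illustration; all names are hypothetical, and a practical implementation would compile phi* into CNF and call a SAT solver instead of enumerating). Once (x, b) is fixed, the y_i^(t) blocks of each psi_t involve disjoint variables, so each psi_t can be checked independently.

from itertools import product
import random

def sample_xors(q, m, rng):
    # q random XOR constraints, each over a random subset of m variables
    return [[j for j in range(m) if rng.random() < 0.5] for _ in range(q)]

def block_satisfiable(f_i, x, xors, m):
    # does some y in {0,1}^m satisfy f_i(x, y) together with all sampled XORs?
    for y in product([0, 1], repeat=m):
        if f_i(x, y) and all(sum(y[j] for j in s) % 2 == 1 for s in xors):
            return True
    return False

def xor_smc(phi, fs, qs, n, k, m, T, rng=random.Random(0)):
    # fs[i](x, y) -> 0/1, qs[i] = threshold exponent, m = |y_i|;
    # T would be set from eta and c as in line 1 of Algorithm 2
    xors = [[sample_xors(qs[i], m, rng) for i in range(k)] for _ in range(T)]
    for x in product([0, 1], repeat=n):
        for b in product([0, 1], repeat=k):
            if not phi(x, b):
                continue
            # psi_t holds iff every predicate with b_i = 1 still has a surviving y
            good = sum(
                all((not b[i]) or block_satisfiable(fs[i], x, xors[t][i], m)
                    for i in range(k))
                for t in range(T))
            if good >= T / 2:  # Majority(psi_1, ..., psi_T)
                return True
    return False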
Theorem 2.

Let $0<\eta<1$ and $c\geq\log_2(k+1)+1$. Select $T=\lceil((n+k)\ln 2-\ln\eta)/\alpha(c,k)\rceil$; then we have

  • Suppose there exist $\mathbf{x}_0\in\{0,1\}^n$ and $\mathbf{b}_0\in\{0,1\}^k$ such that $\mathtt{SMC}(\phi,f_1,\dots,f_k,q_1+c,\dots,q_k+c)$ is true. In other words,

    \phi(\mathbf{x}_{0},\mathbf{b}_{0})\wedge\left(\bigwedge_{i=1}^{k}\left(b_{i}\Rightarrow\sum_{\mathbf{y}_{i}}f_{i}(\mathbf{x}_{0},\mathbf{y}_{i})\geq 2^{q_{i}+c}\right)\right),

    Then the algorithm XOR-SMC ($\phi$, $\{f_i\}_{i=1}^{k}$, $\{q_i\}_{i=1}^{k}$, $T$) returns true with probability greater than $1-\eta$.

  • Conversely, suppose $\mathtt{SMC}(\phi,f_1,\dots,f_k,q_1-c,\dots,q_k-c)$ is not satisfiable. In other words, for all $\mathbf{x}\in\{0,1\}^n$ and $\mathbf{b}\in\{0,1\}^k$,

    \neg\left(\phi(\mathbf{x},\mathbf{b})\wedge\left(\bigwedge_{i=1}^{k}\left(b_{i}\Rightarrow\sum_{\mathbf{y}_{i}}f_{i}(\mathbf{x},\mathbf{y}_{i})\geq 2^{q_{i}-c}\right)\right)\right),

    then XOR-SMC ($\phi$, $\{f_i\}_{i=1}^{k}$, $\{q_i\}_{i=1}^{k}$, $T$) returns false with probability greater than $1-\eta$.

Proof.

Claim 1: Suppose there exist $\mathbf{x}_0=[x_1,\dots,x_n]\in\{0,1\}^n$ and $\mathbf{b}_0=[b_1,\dots,b_k]\in\{0,1\}^k$ such that

\phi(\mathbf{x}_{0},\mathbf{b}_{0})\wedge\left(\bigwedge_{i=1}^{k}\left(b_{i}\Rightarrow\sum_{\mathbf{y}_{i}}f_{i}(\mathbf{x}_{0},\mathbf{y}_{i})\geq 2^{q_{i}+c}\right)\right)   (6)

holds true. Denote by $k_0$ the number of non-zero bits in $\mathbf{b}_0$. Without loss of generality, suppose those non-zero bits are the first $k_0$ bits, i.e., $b_1=b_2=\dots=b_{k_0}=1$ and $b_i=0$ for all $i>k_0$. Then Eq. (6) can be simplified to:

\phi(\mathbf{x}_{0},\mathbf{b}_{0})\wedge\left(\bigwedge_{i=1}^{k_{0}}\left(\sum_{\mathbf{y}_{i}}f_{i}(\mathbf{x}_{0},\mathbf{y}_{i})\geq 2^{q_{i}+c}\right)\right)   (7)

Consider the Boolean formula $\psi_t$ defined in the XOR-SMC algorithm (choosing any $t\in\{1,\dots,T\}$). $\psi_t$ can be simplified by substituting the values of $\mathbf{x}_0$ and $\mathbf{b}_0$. After simplification, we obtain:

\psi_{t}=\left(f_{1}(\mathbf{x}_{0},\mathbf{y}_{1}^{(t)})\wedge\mathtt{XOR}_{1}(\mathbf{y}_{1}^{(t)})\wedge\dots\wedge\mathtt{XOR}_{q_{1}}(\mathbf{y}_{1}^{(t)})\right)\wedge\dots\wedge\left(f_{k_{0}}(\mathbf{x}_{0},\mathbf{y}_{k_{0}}^{(t)})\wedge\mathtt{XOR}_{1}(\mathbf{y}_{k_{0}}^{(t)})\wedge\dots\wedge\mathtt{XOR}_{q_{k_{0}}}(\mathbf{y}_{k_{0}}^{(t)})\right).

Let $\gamma_i=\left(f_i(\mathbf{x}_0,\mathbf{y}_i^{(t)})\wedge\mathtt{XOR}_1(\mathbf{y}_i^{(t)})\wedge\dots\wedge\mathtt{XOR}_{q_i}(\mathbf{y}_i^{(t)})\right)$. Observe that $\sum_{\mathbf{y}_i}f_i(\mathbf{x}_0,\mathbf{y}_i)\geq 2^{q_i+c}$ for all $i=1,\dots,k_0$. According to Lemma 1, with probability at least $1-\frac{2^{c}}{(2^{c}-1)^{2}}$, there exists $\mathbf{y}_i^{(t)}$ such that $(\mathbf{x}_0,\mathbf{y}_i^{(t)})$ renders $\gamma_i$ true. The probability that $\psi_t$ is true under $(\mathbf{x}_0,\mathbf{b}_0,\mathbf{y}_1^{(t)},\dots,\mathbf{y}_k^{(t)})$ is:

Pr\left((\mathbf{x}_{0},\mathbf{b}_{0},\mathbf{y}_{1}^{(t)},\dots,\mathbf{y}_{k}^{(t)})\text{ renders }\psi_{t}\text{ true}\right)
= Pr\left(\bigwedge_{i=1}^{k_{0}}\left((\mathbf{x}_{0},\mathbf{y}_{i}^{(t)})\text{ renders }\gamma_{i}\text{ true}\right)\right)
= 1-Pr\left(\bigvee_{i=1}^{k_{0}}\left((\mathbf{x}_{0},\mathbf{y}_{i}^{(t)})\text{ renders }\gamma_{i}\text{ false}\right)\right)
\geq 1-\sum_{i=1}^{k_{0}}Pr\left((\mathbf{x}_{0},\mathbf{y}_{i}^{(t)})\text{ renders }\gamma_{i}\text{ false}\right)
\geq 1-\frac{k_{0}2^{c}}{(2^{c}-1)^{2}}\geq 1-\frac{k2^{c}}{(2^{c}-1)^{2}}.

Define $\Gamma_t$ as a binary indicator variable where

\Gamma_{t}=\begin{cases}1&\text{if }(\mathbf{x}_{0},\mathbf{b}_{0},\mathbf{y}_{1}^{(t)},\dots,\mathbf{y}_{k}^{(t)})\text{ renders }\psi_{t}\text{ true},\\ 0&\text{otherwise}.\end{cases}

Therefore $Pr(\Gamma_t=0)\leq\frac{k2^{c}}{(2^{c}-1)^{2}}$, and $Pr(\Gamma_t=0)<\frac{1}{2}$ when $c\geq\log_2(k+1)+1$. XOR-SMC returns true if the majority of $\psi_t$, $t=1,\dots,T$, are true; that is, $\sum_t\Gamma_t\geq\frac{T}{2}$. Let us define

\alpha(c,k)=D\left(\frac{1}{2}\;\Big\|\;\frac{k2^{c}}{(2^{c}-1)^{2}}\right)=\frac{1}{2}\ln\frac{(2^{c}-1)^{2}}{k2^{c+1}}+\left(1-\frac{1}{2}\right)\ln\frac{2(2^{c}-1)^{2}}{(2^{c}-1)^{2}-k2^{c+1}}.

When $c\geq\log_2(k+1)+1$, observing that $\alpha(c,k)>0$, we can apply the Chernoff-Hoeffding theorem to obtain:

Pr\left(\sum_{t=1}^{T}\Gamma_{t}\geq\frac{T}{2}\right)=1-Pr\left(\sum_{t=1}^{T}\Gamma_{t}<\frac{T}{2}\right)\geq 1-e^{-\alpha(c,k)T}.

For $T\geq\lceil\frac{(n+k)\ln 2-\ln\eta}{\alpha(c,k)}\rceil\geq\frac{-\ln\eta}{\alpha(c,k)}$, it follows that $e^{-\alpha(c,k)T}\leq\eta$. Therefore, with probability at least $1-\eta$, we have $\sum_t\Gamma_t\geq\frac{T}{2}$. In this scenario, XOR-SMC ($\phi$, $\{f_i\}_{i=1}^{k}$, $\{q_i\}_{i=1}^{k}$, $T$) returns true, as it discovers $\mathbf{x}_0$, $\mathbf{b}_0$, and $(\mathbf{y}_1^{(t)},\dots,\mathbf{y}_k^{(t)})$ for which the majority of the Boolean formulae in $\{\psi_t\}_{t=1}^{T}$ are true.

Claim 2: Suppose for all $\mathbf{x}\in\{0,1\}^n$ and $\mathbf{b}\in\{0,1\}^k$,

\neg\left(\phi(\mathbf{x},\mathbf{b})\wedge\left(\bigwedge_{i=1}^{k}\left(b_{i}\Rightarrow\sum_{\mathbf{y}_{i}}f_{i}(\mathbf{x},\mathbf{y}_{i})\geq 2^{q_{i}-c}\right)\right)\right).

Consider a fixed $\mathbf{x}_1$ and $\mathbf{b}_1$; the previous condition, with high probability, renders most $\psi_t$ false in Algorithm 2. We prove that the error probability is sufficiently low that XOR-SMC returns false with high probability after examining all $\mathbf{x}$ and $\mathbf{b}$. The detailed proof is left to Appendix A. ∎

5 Experiment 1: Locate Emergency Shelters

Problem Formulation. Disasters such as hurricanes and floods continue to endanger millions of lives. Shelters are safe zones that protect residents from possible damage, and evacuation routes are the paths from resident zones toward shelter areas. To enable the timely evacuation of resident zones, one should pick a set of shelter locations with sufficient routing from the resident areas. Given the unpredictable chaos during natural disasters, it is crucial to guarantee multiple paths, rather than a single path, from residential areas to shelters. This ensures that even if one route is obstructed, residents have alternative paths to safety.

Figure 2: Example assignment of shelters that guarantees sufficient alternative paths from the resident areas on Hawaii Island. Each orange dot corresponds to a shelter and the green dot indicates a resident area.

Consider a map $G=(V,E)$ where the nodes in $V=\{v_1,\dots,v_N\}$ represent $N$ areas and an edge $e=(v_i,v_j)\in E$ indicates a road from $v_i$ to $v_j$; $N$ and $M$ denote the number of nodes and edges, respectively. Given a subset of nodes $R=\{v_{r_1},\dots,v_{r_k}\}\subseteq V$ indicating the residential areas, the task is to choose at most $m$ of the remaining nodes as shelters such that the number of routes that can reach a shelter from each residential area is maximized. Fig. 2 gives an example with $m=4$ shelters, where sufficiently many roads connect the resident area to those shelters.

Current methods (Bayram and Yaman 2018; Amideo, Scaparra, and Kotiadis 2019) consider finding shelter locations that have at least one path from each residential area. However, these methods do not generalize to the problem of requiring sufficiently many alternative routes from residential areas to shelters, primarily because counting the number of paths is intractable. This complexity makes it difficult to solve large-scale problems of this type.

SMC Formulation. XOR-SMC transforms this optimization problem into a decision problem by gradually increasing the path-count threshold $q_r$. The decision problem decides whether there are at least $2^{q_r}$ paths connecting each residential area with a shelter. The assigned shelters are represented by a vector $\mathbf{b}=(b_1,\ldots,b_n)\in\{0,1\}^n$, where $b_i=1$ implies that node $v_i$ is chosen as a shelter. Let $\phi(\mathbf{b})=\left(\sum_{i=1}^{n}b_i\right)\leq m$ represent that there are at most $m$ shelters. Let $f(v_r,v_s,E^{\prime})$ be an indicator function that returns one if and only if the selected edges $E^{\prime}$ form a path from $v_r$ to $v_s$. The whole formula is:

\phi(\mathbf{b}),\quad b_{i}\Rightarrow\left(\sum_{v_{s}\in S,\,E^{\prime}\subseteq E}f(v_{r},v_{s},E^{\prime})\geq 2^{q_{r}}\right)\quad\text{for }1\leq i\leq n.

We leave the detailed implementation of $f(v_r,v_s,E^{\prime})$ to Appendix C.
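Although the exact Boolean encoding of f(v_r, v_s, E') is deferred to Appendix C, the indicator itself is easy to state directly; the Python sketch below is our own illustration (f_indicator and _connected are hypothetical names), treats roads as undirected, and returns 1 exactly when the selected edge set forms a simple path with endpoints v_r and v_s. The model count bounded in the constraint above is the number of edge subsets passing such a check.

from collections import Counter

def _connected(edges):
    # is the subgraph formed by the selected edges connected?
    nodes = {u for e in edges for u in e}
    adj = {u: set() for u in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes

def f_indicator(v_r, v_s, selected_edges):
    # 1 iff the selected edges form a simple path with endpoints v_r and v_s
    if not selected_edges or v_r == v_s:
        return 0
    deg = Counter()
    for u, v in selected_edges:
        deg[u] += 1
        deg[v] += 1
    odd = sorted(v for v, d in deg.items() if d % 2 == 1)
    if odd != sorted([v_r, v_s]) or any(d > 2 for d in deg.values()):
        return 0
    return 1 if _connected(selected_edges) else 0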

Empirical Experiment Analysis

Experiment Setting. We crawl the real-world dataset from the Hawaii Statewide GIS Program website. We extract the real Hawaii map with its major roads and manually label the resident areas on the map. We create problems of different scales by taking different sub-regions of the map. Three major resident areas are picked as $R$, and we set $m=5$.

Table 1: XOR-SMC takes less empirical running time than the baselines to find shelter location assignments over different graphs. Graph size is the number of nodes in the graph.

Method          | N=121   | N=183   | N=388
XOR-SMC (ours)  | 0.04 hr | 0.11 hr | 0.16 hr
GibbsSampler-LS | 0.56 hr | 0.66 hr | 6.97 hr
QuickSampler-LS | 0.31 hr | 0.29 hr | 0.62 hr
Unigen-LS       | 0.08 hr | 0.17 hr | 0.42 hr

In terms of baselines, we consider a local search algorithm with shelter locations as the state and the number of paths between shelters and resident areas as the heuristic. Due to the intractability of path counting in our formulation, the heuristic is approximated by querying sampling oracles. In particular, we consider 1) Gibbs sampling-based (Geman and Geman 1984) Local Search (Gibbs-LS); 2) uniform SAT sampler-based (Soos, Gocht, and Meel 2020) Local Search (Unigen-LS); and 3) Quick Sampler-based (Dutra et al. 2018) Local Search (Quick-LS). Each baseline runs 5 times, and the best result is reported. Each run continues until the approach finds a local minimum. For our XOR-SMC, we set a time limit of 12 hours. The algorithm runs repeatedly with increasing $q_r$ until it times out. The time shown in Table 1 and the number of paths in Fig. 3 correspond to the cumulative time and the best solutions found before the algorithm times out. For the evaluation metrics, we consider 1) the number of paths identified in the plan predicted by each algorithm, and 2) the total running time of the process.

Result Analysis. In terms of running time (Table 1), XOR-SMC takes less empirical running time than the baselines to find shelter location assignments over different graphs. In Fig. 3, we evaluate the quality of the predicted shelters by counting the number of connecting paths from residents to the shelters. The paths are counted by directly solving the counting predicate with SharpSAT-TD (Korhonen and Järvisalo 2021), given the shelter locations. The shelter locations selected by our XOR-SMC lead to a higher number of paths than those found by the baselines.

Figure 3: XOR-SMC finds better shelter locations, with orders of magnitude more paths from the resident area to the chosen shelters than competing baselines across different graphs (y-axis in log scale).
Table 2: Empirical average of the total production (tons) and running time on two supply networks. Our XOR-SMC finds better solutions (higher production) with less time usage compared to baselines.

                     | Small Supply Network                         | Real-world Supply Network
                     | Disaster Scale (% of affected edges)         | Disaster Scale (% of affected edges)
Method               | 10%          | 20%          | 30%            | 10%           | 20%           | 30%
XOR-SMC (ours)       | 95.0 (0.67s) | 96.0 (1.19s) | 119.3 (17.63s) | 232.2 (57.8s) | 210.2 (75.5s) | 147.8 (96.9s)
BP-SAA               | 86.5 (1.17s) | 82.7 (3.76s) | 118.8 (11.48s) | 187.2 (2hr)   | 143.7 (2hr)   | 130.5 (2hr)
Gibbs-SAA            | 88.0 (1.24s) | 95.4 (4.48s) | 113.3 (15.61s) | 155.0 (2hr)   | 147.1 (2hr)   | 111.0 (2hr)
IS-SAA               | 80.5 (0.98s) | 76.7 (3.29s) | 109.6 (21.74s) | 157.4 (2hr)   | 123.5 (2hr)   | 132.9 (2hr)
LoopyIS-SAA          | 88.0 (0.84s) | 95.4 (2.96s) | 112.3 (17.95s) | 200.2 (2hr)   | 149.5 (2hr)   | 130.5 (2hr)
Weighted-SAA         | 86.5 (1.11s) | 95.4 (3.57s) | 117.0 (11.80s) | 170.5 (2hr)   | 169.0 (2hr)   | 130.3 (2hr)
Best possible values | 100.0        | 104.0        | 128.0          | 237.0         | 226.0         | 235.0

6 Experiment 2: Robust Supply Chain Design

Problem Formulation. Supply chain management is of central importance in operations research and economics. The essence of supply chain management is to integrate the flow of products and finances to maximize value to the consumer. Its importance is underscored by the increasing complexity of business environments, where minor inefficiencies caused by random disasters result in significant extra costs.

Given a supply chain of $N$ suppliers, they form a supply-demand network $(V,E)$ where each node $v\in V$ represents a supplier and each edge $e\in E$ represents a supply-to-demand trade. Each supplier $v$ acts as a vendor to downstream suppliers and as a buyer from upstream suppliers. We assume a conservation-of-production condition for each node, i.e., the input-output ratio is 1. To guarantee substantial production, supplier $v$ should order the necessary raw materials from upstream suppliers in advance. Denote the cost of the trade between vendor $u$ and buyer $v$ as $c(u,v)$. Let the amount of goods of the trade between $u$ and $v$ be $f(u,v)$, and let $B(v)$ be the total budget of $v$ to get all its materials ready. Due to unpredictable natural disasters and equipment failures, the trade between $u$ and $v$ may fail. Denote by $\boldsymbol{\theta}=(\theta_1,\ldots,\theta_L)\in\{0,1\}^L$ the state of $L$ different stochastic events, where $\theta_l=1$ indicates that event $l$ occurs. The objective is to design a global trading plan maximizing the expected total production output, accounting for stochastic influences. Production is measured by the output of final-tier suppliers, who produce end goods without supplying others.

Existing works in supply chain optimization often gravitate towards mathematical programming approaches. However, when delving into more complex scenarios involving stochastic events, such as uncertain demand or supply disruptions, the task becomes considerably more intricate. Specifically, formulating counting constraints, such as guaranteeing a certain amount of supplies across multiple vendors under stochastic events, is intractable. These complexities necessitate innovative approaches that can capture the randomness and dynamism inherent in real-world supply chain systems without sacrificing optimality.

SMC Formulation. Similarly, we transform the maximization into a decision problem by gradually increasing the production threshold. Denote the trading plan as $\mathbf{x}=(x_{u,v},\dots)\in\{0,1\}^{|E|}$, where $x_{u,v}=1$ indicates that $v$ purchases raw material from $u$. The decision problem can be formulated as an SMC problem as follows:

\phi(\mathbf{x}),\quad\sum_{v\in D}\mathbb{E}_{\boldsymbol{\theta}}\left[\sum_{(u,v)\in E_{\mathbf{x},\boldsymbol{\theta}}}f(u,v)\right]\geq 2^{q}   (8)

where $D$ represents the final-tier supplier nodes, $2^q$ is a minimum production level to be guaranteed, and $E_{\mathbf{x},\boldsymbol{\theta}}$ is the remaining set of edges after applying the trading plan $\mathbf{x}$ under the stochastic disaster events $\boldsymbol{\theta}$. $\phi(\mathbf{x})$ encodes all essential constraints, e.g., adherence to budget limits, output not exceeding input at each node, etc. We leave the formulation details to Appendix C.
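The maximization-to-decision reduction used here (and in the shelter experiment) can be driven by a simple loop; the Python sketch below is our own illustration, where decide_smc is a hypothetical callable standing in for one invocation of the SMC solver on Eq. (8) with threshold exponent q:

import time

def maximize_by_threshold(decide_smc, time_limit_s):
    # repeatedly tighten the required production 2^q until the solver
    # answers false or the time budget is exhausted
    best_q, q = None, 0
    deadline = time.monotonic() + time_limit_s
    while time.monotonic() < deadline and decide_smc(q):
        best_q = q  # production of at least 2^q is certified feasible
        q += 1
    return best_q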

Empirical Experiment Analysis

Experiment Setting. The dataset is the wheat supply chain network from (Zokaee et al. 2017). The dataset only provides the cost of trade, the capacity of transportation, and the raw material demand. We further generate stochastic disaster events (see Appendix C) over different portions of the supply-demand edges. The random disasters make the expectation computation intractable. For a better comparison of running time, a small-scale synthetic network is included, in which the costs, budgets, and capacities are randomly generated. For the small supply network, the number of nodes in each layer is [4, 4, 5, 5]. For the real-world supply network, the number of nodes in each layer is [9, 7, 9, 19].

For the baselines, we use Sample Average Approximation (SAA)-based methods (Kleywegt, Shapiro, and Homem-de-Mello 2002). These baselines employ Mixed Integer Programming (MIP) to identify a trading plan that directly maximizes the average production across networks impacted by 100 sampled disasters. The average over samples serves as a proxy for the actual expected production. For the samplers, we consider Gibbs sampling (Gibbs-SAA), belief propagation (BP-SAA), importance sampling (IS-SAA), loopy importance sampling (LoopyIS-SAA), and weighted sampling (Weighted-SAA). For a fair comparison, we impose a time limit of 30 seconds for the small network and 2 hours for the real-world network. The times shown in Table 2 for the SAA approaches are their actual execution times. Our SMC solver again executes repeatedly with increasing $q$ until it times out. The time shown is the cumulative time at which it finds the best solution (the last one) before the time limit.

To evaluate the efficacy of a trading plan, we calculate its empirical average production under 10,000 i.i.d. disasters sampled from the ground-truth distribution. The production numbers are reported in Table 2. This method is adopted due to the computational infeasibility of calculating the expectation directly. For SAA approaches that exceed the time limit, the production numbers in Table 2 are for the best solutions found within the time limit.

Result Analysis. Table 2 shows the production and running times of the plans derived from the various methods. For small networks, while the SAA-based methods can complete the MIP within the time limit, they remain sub-optimal, as they optimize a surrogate expectation derived from sampling, which deviates from the true expectation. In the case of larger networks, these methods struggle further due to the poor scalability of the large MIP formulation and fail to find good solutions within the 2-hour time limit. In contrast, XOR-SMC, by formulating the task as an SMC problem, directly optimizes the intractable expectation using XOR counting and yields superior solution quality with less running time.

7 Conclusion

We presented XOR-SMC, a polynomial algorithm with access to NP oracles that solves the highly intractable Satisfiability Modulo Counting (SMC) problems with constant approximation guarantees. Solving SMC problems presents unique challenges due to their intricate nature, integrating statistical inference and symbolic reasoning. Prior work on SMC solving offers no or loose guarantees and may find suboptimal solutions. XOR-SMC transforms the intractable SMC problem into satisfiability problems by replacing the intricate model counting with SAT formulae subject to randomized XOR constraints. XOR-SMC obtains constant approximation guarantees on the solutions obtained. SMC solvers offer useful tools for many real-world problems at the nexus of symbolic and statistical AI. Extensive experiments on two real-world applications in AI for social good demonstrate that XOR-SMC outperforms other approaches in solution quality and running time.

Acknowledgments

We thank all the reviewers for their constructive comments. This research was supported by NSF grant CCF-1918327.

References

  • cpm (2023) 2023. AAAI-23 Constraint Programming and Machine Learning Bridge Program.
  • nes (2023) 2023. Neuro-symbolic AI for Agent and Multi-Agent Systems (NeSyMAS) Workshop at AAMAS-23.
  • Achlioptas and Theodoropoulos (2017) Achlioptas, D.; and Theodoropoulos, P. 2017. Probabilistic Model Counting with Short XORs. In SAT, volume 10491, 3–19. Springer.
  • Almeida et al. (2019) Almeida, R.; Shi, Q.; Gomes-Selman, J. M.; Wu, X.; Xue, Y.; Angarita, H.; Barros, N.; Forsberg, B.; García-Villacorta, R.; Hamilton, S.; Melack, J.; Montoya, M.; Perez, G.; Sethi, S.; Gomes, C.; and Flecker, A. 2019. Reducing greenhouse gas emissions of Amazon hydropower with strategic dam planning. Nature Communications, 10.
  • Amideo, Scaparra, and Kotiadis (2019) Amideo, A. E.; Scaparra, M. P.; and Kotiadis, K. 2019. Optimising shelter location and evacuation routing operations: The critical issues. Eur. J. Oper. Res., 279(2): 279–295.
  • Balog et al. (2017) Balog, M.; Tripuraneni, N.; Ghahramani, Z.; and Weller, A. 2017. Lost Relatives of the Gumbel Trick. In ICML, volume 70, 371–379. PMLR.
  • Barbosa et al. (2022) Barbosa, H.; Barrett, C. W.; Brain, M.; Kremer, G.; Lachnitt, H.; Mann, M.; Mohamed, A.; Mohamed, M.; Niemetz, A.; Nötzli, A.; Ozdemir, A.; Preiner, M.; Reynolds, A.; Sheng, Y.; Tinelli, C.; and Zohar, Y. 2022. cvc5: A Versatile and Industrial-Strength SMT Solver. In TACAS, volume 13243, 415–442. Springer.
  • Barrett et al. (2009) Barrett, C.; Sebastiani, R.; Seshia, S.; and Tinelli, C. 2009. Satisfiability Modulo Theories, chapter 26, 825–885.
  • Barrett et al. (2021) Barrett, C. W.; Sebastiani, R.; Seshia, S. A.; and Tinelli, C. 2021. Satisfiability Modulo Theories. In Handbook of Satisfiability, volume 336, 1267–1329. IOS Press.
  • Bayram and Yaman (2018) Bayram, V.; and Yaman, H. 2018. Shelter Location and Evacuation Route Assignment Under Uncertainty: A Benders Decomposition Approach. Transp. Sci., 52(2): 416–436.
  • Belanger and McCallum (2016) Belanger, D.; and McCallum, A. 2016. Structured Prediction Energy Networks. In ICML, volume 48, 983–992.
  • Biere et al. (2009) Biere, A.; Heule, M. J. H.; van Maaren, H.; and Walsh, T., eds. 2009. Handbook of Satisfiability, volume 185 of Frontiers in Artificial Intelligence and Applications.
  • Bjørner et al. (2018) Bjørner, N. S.; de Moura, L.; Nachmanson, L.; and Wintersteiger, C. M. 2018. Programming Z3. In SETSS, volume 11430, 148–201. Springer.
  • Braunstein, Mézard, and Zecchina (2005) Braunstein, A.; Mézard, M.; and Zecchina, R. 2005. Survey propagation: an algorithm for satisfiability. Random Struct. Algorithms, 27: 201–226.
  • Chakraborty et al. (2014) Chakraborty, S.; Fremont, D. J.; Meel, K. S.; Seshia, S. A.; and Vardi, M. Y. 2014. Distribution-Aware Sampling and Weighted Model Counting for SAT. In AAAI, 1722–1730.
  • Chakraborty, Meel, and Vardi (2013) Chakraborty, S.; Meel, K. S.; and Vardi, M. Y. 2013. A Scalable and Nearly Uniform Generator of SAT Witnesses. In CAV, volume 8044, 608–623. Springer.
  • Chavira and Darwiche (2008) Chavira, M.; and Darwiche, A. 2008. On probabilistic inference by weighted model counting. Artificial Intelligence, 172(6-7): 772–799.
  • Cheng et al. (2012) Cheng, Q.; Chen, F.; Dong, J.; Xu, W.; and Ihler, A. 2012. Approximating the Sum Operation for Marginal-MAP Inference. In AAAI, 1882–1887.
  • Choi, Kisa, and Darwiche (2013) Choi, A.; Kisa, D.; and Darwiche, A. 2013. Compiling Probabilistic Graphical Models Using Sentential Decision Diagrams. In ECSQARU, volume 7958, 121–132. Springer.
  • Choi, Friedman, and den Broeck (2022) Choi, Y.; Friedman, T.; and den Broeck, G. V. 2022. Solving Marginal MAP Exactly by Probabilistic Circuit Transformations. In AISTATS, volume 151, 10196–10208. PMLR.
  • Conitzer and Sandholm (2006) Conitzer, V.; and Sandholm, T. 2006. Computing the optimal strategy to commit to. In EC, 82–90. ACM.
  • Conrad et al. (2012) Conrad, J.; Gomes, C. P.; van Hoeve, W.-J.; Sabharwal, A.; and Suter, J. F. 2012. Wildlife corridors as a connected subgraph problem. Journal of Environmental Economics and Management, 63(1).
  • Darwiche and Marquis (2002) Darwiche, A.; and Marquis, P. 2002. A Knowledge Compilation Map. J. Artif. Int. Res.
  • de Moura and Bjørner (2007) de Moura, L. M.; and Bjørner, N. S. 2007. Efficient E-Matching for SMT Solvers. In CADE, volume 4603 of Lecture Notes in Computer Science, 183–198. Springer.
  • de Moura and Bjørner (2008) de Moura, L. M.; and Bjørner, N. S. 2008. Z3: An Efficient SMT Solver. In TACAS, volume 4963, 337–340. Springer.
  • Dilkina and Gomes (2010) Dilkina, B.; and Gomes, C. P. 2010. Solving Connected Subgraph Problems in Wildlife Conservation. In CPAIOR, 102–116.
  • Dutra et al. (2018) Dutra, R.; Laeufer, K.; Bachrach, J.; and Sen, K. 2018. Efficient sampling of SAT solutions for testing. In ICSE, 549–559. ACM.
  • Dyer and Stougie (2006) Dyer, M. E.; and Stougie, L. 2006. Computational complexity of stochastic programming problems. Math. Program., 106(3): 423–432.
  • Ermon et al. (2013a) Ermon, S.; Gomes, C. P.; Sabharwal, A.; and Selman, B. 2013a. Embed and Project: Discrete Sampling with Universal Hashing. In NIPS, 2085–2093.
  • Ermon et al. (2013b) Ermon, S.; Gomes, C. P.; Sabharwal, A.; and Selman, B. 2013b. Taming the Curse of Dimensionality: Discrete Integration by Hashing and Optimization. In ICML, volume 28, 334–342.
  • Fang, Stone, and Tambe (2015) Fang, F.; Stone, P.; and Tambe, M. 2015. When Security Games Go Green: Designing Defender Strategies to Prevent Poaching and Illegal Fishing. In IJCAI.
  • Feser et al. (2020) Feser, J. K.; Madden, S.; Tang, N.; and Solar-Lezama, A. 2020. Deductive optimization of relational data storage. Proc. ACM Program. Lang., 4(OOPSLA): 170:1–170:30.
  • Flerova et al. (2011) Flerova, N.; Ihler, E.; Dechter, R.; and Otten, L. 2011. Mini-bucket elimination with moment matching. In NIPS Workshop DISCML.
  • Fredrikson and Jha (2014) Fredrikson, M.; and Jha, S. 2014. Satisfiability modulo counting: a new approach for analyzing privacy properties. In CSL-LICS, 42:1–42:10. ACM.
  • Geman and Geman (1984) Geman, S.; and Geman, D. 1984. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images. IEEE Trans. Pattern Anal. Mach. Intell., 6(6): 721–741.
  • Gogate and Dechter (2007) Gogate, V.; and Dechter, R. 2007. SampleSearch: A Scheme that Searches for Consistent Samples. In AISTATS, volume 2, 147–154.
  • Gogate and Dechter (2012) Gogate, V.; and Dechter, R. 2012. Importance sampling-based estimation over AND/OR search spaces for graphical models. Artif. Intell., 184-185: 38–77.
  • Gomes, Sabharwal, and Selman (2006a) Gomes, C. P.; Sabharwal, A.; and Selman, B. 2006a. Model Counting: A New Strategy for Obtaining Good Bounds. In AAAI, 54–61.
  • Gomes, Sabharwal, and Selman (2006b) Gomes, C. P.; Sabharwal, A.; and Selman, B. 2006b. Near-Uniform Sampling of Combinatorial Spaces Using XOR Constraints. In NIPS, 481–488. MIT Press.
  • Hazan and Jaakkola (2012) Hazan, T.; and Jaakkola, T. S. 2012. On the Partition Function and Random Maximum A-Posteriori Perturbations. In ICML.
  • Hazan and Shashua (2010) Hazan, T.; and Shashua, A. 2010. Norm-Product Belief Propagation: Primal-Dual Message-Passing for Approximate Inference. IEEE Trans. Inf. Theory, 56(12): 6294–6316.
  • Israeli and Wood (2002) Israeli, E.; and Wood, R. K. 2002. Shortest-path network interdiction. Networks: An International Journal, 40(2): 97–111.
  • Jang, Gu, and Poole (2017) Jang, E.; Gu, S.; and Poole, B. 2017. Categorical Reparameterization with Gumbel-Softmax. In ICLR (Poster).
  • Jerrum and Sinclair (1997) Jerrum, M.; and Sinclair, A. 1997. The Markov chain Monte Carlo method: an approach to approximate counting and integration, 482–520. Boston, MA, USA.
  • Jerrum, Valiant, and Vazirani (1986) Jerrum, M.; Valiant, L. G.; and Vazirani, V. V. 1986. Random Generation of Combinatorial Structures from a Uniform Distribution. Theor. Comput. Sci., 43: 169–188.
  • Jiang, Rai, and III (2011) Jiang, J.; Rai, P.; and III, H. D. 2011. Message-Passing for Approximate MAP Inference with Latent Variables. In NIPS, 1197–1205.
  • K., Shoham, and Gurfinkel (2022) K., H. G. V.; Shoham, S.; and Gurfinkel, A. 2022. Solving constrained Horn clauses modulo algebraic data types and recursive functions. Proc. ACM Program. Lang., 6: 1–29.
  • Kempe, Kleinberg, and Tardos (2005) Kempe, D.; Kleinberg, J.; and Tardos, É. 2005. Influential nodes in a diffusion model for social networks. In Automata, languages and programming, 1127–1138. Springer.
  • Kisa et al. (2014) Kisa, D.; den Broeck, G. V.; Choi, A.; and Darwiche, A. 2014. Probabilistic Sentential Decision Diagrams. In KR.
  • Kiselev and Poupart (2014) Kiselev, I.; and Poupart, P. 2014. Policy optimization by marginal-map probabilistic inference in generative models. In AAMAS, 1611–1612. IFAAMAS/ACM.
  • Kleywegt, Shapiro, and Homem-de-Mello (2002) Kleywegt, A. J.; Shapiro, A.; and Homem-de-Mello, T. 2002. The Sample Average Approximation Method for Stochastic Discrete Optimization. SIAM J. Optim., 12(2): 479–502.
  • Korhonen and Järvisalo (2021) Korhonen, T.; and Järvisalo, M. 2021. Integrating tree decompositions into decision heuristics of propositional model counters (short paper). In 27th International Conference on Principles and Practice of Constraint Programming (CP 2021). Schloss Dagstuhl-Leibniz-Zentrum für Informatik.
  • Kuck et al. (2019) Kuck, J.; Dao, T.; Zhao, S.; Bartan, B.; Sabharwal, A.; and Ermon, S. 2019. Adaptive Hashing for Model Counting. In UAI, volume 115, 271–280. AUAI Press.
  • Lee et al. (2016) Lee, J.; Marinescu, R.; Dechter, R.; and Ihler, A. 2016. From Exact to Anytime Solutions for Marginal MAP. In AAAI, 3255–3262.
  • Liu and Ihler (2012) Liu, Q.; and Ihler, A. 2012. Belief Propagation for Structured Decision Making. In UAI, 523–532. AUAI Press.
  • Liu and Ihler (2013) Liu, Q.; and Ihler, A. T. 2013. Variational algorithms for marginal MAP. Journal of Machine Learning Research, 14.
  • Madras (2002) Madras, N. 2002. Lectures on Monte Carlo Methods. American Mathematical Society.
  • Maneva, Mossel, and Wainwright (2007) Maneva, E. N.; Mossel, E.; and Wainwright, M. J. 2007. A new look at survey propagation and its generalizations. J. ACM, 54(4): 17.
  • Marinescu, Dechter, and Ihler (2014) Marinescu, R.; Dechter, R.; and Ihler, A. T. 2014. AND/OR Search for Marginal MAP. In UAI.
  • Mauá and de Campos (2012) Mauá, D. D.; and de Campos, C. P. 2012. Anytime Marginal MAP Inference. In ICML.
  • Mei, Jiang, and Tu (2018) Mei, J.; Jiang, Y.; and Tu, K. 2018. Maximum A Posteriori Inference in Sum-Product Networks. In AAAI, 1923–1930.
  • Munawar et al. (2023) Munawar, A.; Lenchner, J.; Rossi, F.; Horesh, L.; Gray, A.; and Campbell, M., eds. 2023. IBM Neuro-Symbolic AI Workshop 2023 – Unifying Statistical and Symbolic AI.
  • Papandreou and Yuille (2010) Papandreou, G.; and Yuille, A. L. 2010. Gaussian sampling by local perturbations. In NIPS, 1858–1866.
  • Papandreou and Yuille (2011) Papandreou, G.; and Yuille, A. L. 2011. Perturb-and-MAP random fields: Using discrete optimization to learn and sample from energy models. In ICCV, 193–200.
  • Park and Darwiche (2004) Park, J. D.; and Darwiche, A. 2004. Complexity Results and Approximation Strategies for MAP Explanations. J. Artif. Int. Res.
  • Ping, Liu, and Ihler (2015) Ping, W.; Liu, Q.; and Ihler, A. 2015. Decomposition Bounds for Marginal MAP. In NIPS, 3267–3275.
  • Rossi, van Beek, and Walsh (2006) Rossi, F.; van Beek, P.; and Walsh, T., eds. 2006. Handbook of Constraint Programming, volume 2. Elsevier.
  • Shapiro (2003) Shapiro, A. 2003. Monte Carlo sampling methods. Handbooks in operations research and management science, 10: 353–425.
  • Sheldon et al. (2010) Sheldon, D.; Dilkina, B.; Elmachtoub, A. N.; Finseth, R.; Sabharwal, A.; Conrad, J.; Gomes, C. P.; Shmoys, D. B.; Allen, W.; Amundsen, O.; and Vaughan, W. 2010. Maximizing the Spread of Cascades Using Network Design. In UAI, 517–526.
  • Sontag et al. (2008) Sontag, D.; Meltzer, T.; Globerson, A.; Jaakkola, T.; and Weiss, Y. 2008. Tightening LP Relaxations for MAP using Message Passing. In UAI, 503–510.
  • Soos, Gocht, and Meel (2020) Soos, M.; Gocht, S.; and Meel, K. S. 2020. Tinted, Detached, and Lazy CNF-XOR Solving and Its Applications to Counting and Sampling. In CAV, volume 12224 of Lecture Notes in Computer Science, 463–484. Springer.
  • Swamy and Shmoys (2006) Swamy, C.; and Shmoys, D. B. 2006. Approximation Algorithms for 2-Stage Stochastic Optimization Problems. In FSTTCS, volume 4337, 5–19. Springer.
  • Valiant and Vazirani (1986) Valiant, L. G.; and Vazirani, V. V. 1986. NP is as Easy as Detecting Unique Solutions. Theor. Comput. Sci., 47(3): 85–93.
  • Verweij et al. (2003) Verweij, B.; Ahmed, S.; Kleywegt, A. J.; Nemhauser, G. L.; and Shapiro, A. 2003. The Sample Average Approximation Method Applied to Stochastic Routing Problems: A Computational Study. Comput. Optim. Appl., 24(2-3): 289–333.
  • Wainwright, Jaakkola, and Willsky (2003) Wainwright, M. J.; Jaakkola, T. S.; and Willsky, A. S. 2003. Tree-reweighted belief propagation algorithms and approximate ML estimation by pseudo-moment matching. In AISTATS. Society for Artificial Intelligence and Statistics.
  • Wainwright and Jordan (2008) Wainwright, M. J.; and Jordan, M. I. 2008. Graphical Models, Exponential Families, and Variational Inference. Found. Trends Mach. Learn., 1(1-2): 1–305.
  • Welling and Teh (2003) Welling, M.; and Teh, Y. W. 2003. Approximate inference in Boltzmann machines. Artif. Intell., 143(1): 19–50.
  • Williams and Snyder (2005) Williams, J. C.; and Snyder, S. A. 2005. Restoring Habitat Corridors in Fragmented Landscapes using Optimization and Percolation Models. Environmental Modeling and Assessment, 10(3): 239–250.
  • Wu et al. (2017) Wu, X.; Kumar, A.; Sheldon, D.; and Zilberstein, S. 2017. Robust Optimization for Tree-Structured Stochastic Network Design. In AAAI, 4545–4551.
  • Wu, Sheldon, and Zilberstein (2014) Wu, X.; Sheldon, D.; and Zilberstein, S. 2014. Stochastic Network Design in Bidirected Trees. In NIPS, 882–890.
  • Wu, Sheldon, and Zilberstein (2015) Wu, X.; Sheldon, D.; and Zilberstein, S. 2015. Fast Combinatorial Algorithm for Optimizing the Spread of Cascades. In IJCAI, 2655–2661.
  • Xue, Fern, and Sheldon (2015) Xue, S.; Fern, A.; and Sheldon, D. 2015. Scheduling Conservation Designs for Maximum Flexibility via Network Cascade Optimization. J. Artif. Intell. Res., 52: 331–360.
  • Xue, Choi, and Darwiche (2012) Xue, Y.; Choi, A.; and Darwiche, A. 2012. Basing Decisions on Sentences in Decision Diagrams. In AAAI, 842–849.
  • Xue et al. (2016) Xue, Y.; Li, Z.; Ermon, S.; Gomes, C. P.; and Selman, B. 2016. Solving Marginal MAP Problems with NP Oracles and Parity Constraints. In NIPS, 1127–1135.
  • Yedidia, Freeman, and Weiss (2000) Yedidia, J. S.; Freeman, W. T.; and Weiss, Y. 2000. Generalized Belief Propagation. In NIPS, 689–695. MIT Press.
  • Zokaee et al. (2017) Zokaee, S.; Jabbarzadeh, A.; Fahimnia, B.; and Sadjadi, S. J. 2017. Robust supply chain network design: an optimization model with real world application. Annals of Operations Research, 257: 15–44.

Appendix A Proofs of Theorem 2

Proof of Theorem 2 Claim 2

Suppose that for all $\mathbf{x}\in\{0,1\}^{n}$ and $\mathbf{b}\in\{0,1\}^{k}$, we have

\begin{align*}
&\neg\left(\phi(\mathbf{x},\mathbf{b})\wedge\left(\bigwedge_{i=1}^{k}\left(b_{i}\Rightarrow\sum_{y_{i}}f_{i}(\mathbf{x},y_{i})\geq 2^{q_{i}-c}\right)\right)\right)\\
\Rightarrow\ &\neg\phi(\mathbf{x},\mathbf{b})\vee\left(\bigvee_{i=1}^{k}\left(b_{i}\wedge\sum_{y_{i}}f_{i}(\mathbf{x},y_{i})\leq 2^{q_{i}-c}\right)\right).
\end{align*}

Consider a fixed $\mathbf{x}_{1}$ and $\mathbf{b}_{1}$. We examine the probability that the algorithm does not return true when it discovers $(\mathbf{x}_{1},\mathbf{b}_{1})$.

If $\neg\phi(\mathbf{x}_{1},\mathbf{b}_{1})$ holds true, then the probability that $\mathtt{XOR}$-$\mathtt{SMC}$($\phi$, $\{f_{i}\}_{i=1}^{k}$, $\{q_{i}\}_{i=1}^{k}$, $T$) does not return true upon discovering $(\mathbf{x}_{1},\mathbf{b}_{1})$ is one.

Otherwise, $\bigvee_{i=1}^{k}\left(b_{i}\wedge\sum_{y_{i}}f_{i}(\mathbf{x}_{1},y_{i})\leq 2^{q_{i}-c}\right)$ is true, i.e.,

\[
\exists i,\ \text{s.t.}\ b_{i}\wedge\sum_{y_{i}}f_{i}(\mathbf{x}_{1},y_{i})\leq 2^{q_{i}-c}.\tag{9}
\]

Select any $t=1,\dots,T$. The Boolean formula $\psi_{t}$ can be simplified by substituting the values of $\mathbf{x}_{1}$ and $\mathbf{b}_{1}$:

\begin{align*}
\psi_{t}=\ &\left(f_{1}(\mathbf{x}_{1},y_{1}^{(t)})\wedge\mathtt{XOR}_{1}(y_{1}^{(t)})\wedge\dots\wedge\mathtt{XOR}_{q_{1}}(y_{1}^{(t)})\right)\\
&\wedge\dots\wedge\left(f_{k_{1}}(\mathbf{x}_{1},y_{k_{1}}^{(t)})\wedge\mathtt{XOR}_{1}(y_{k_{1}}^{(t)})\wedge\dots\wedge\mathtt{XOR}_{q_{k_{1}}}(y_{k_{1}}^{(t)})\right),
\end{align*}

where $k_{1}$ is the number of non-zero bits of $\mathbf{b}_{1}$, and $1\leq k_{1}\leq k$.

According to Eq. (9), we define

\[
\gamma_{i}=\left(f_{i}(\mathbf{x}_{1},y_{i}^{(t)})\wedge\mathtt{XOR}_{1}(y_{i}^{(t)})\wedge\dots\wedge\mathtt{XOR}_{q_{i}}(y_{i}^{(t)})\right),
\]

where $i$ is an index satisfying $\sum_{y_{i}^{(t)}}f_{i}(\mathbf{x}_{1},y_{i}^{(t)})\leq 2^{q_{i}-c}$. Then, according to Lemma 1, with probability at least $1-\frac{2^{c}}{(2^{c}-1)^{2}}$, $(\mathbf{x}_{1},y_{i}^{(t)})$ renders $\gamma_{i}$ false for all $y_{i}^{(t)}$.

Then the probability that $\psi_{t}$ is false under $(\mathbf{x}_{1},\mathbf{b}_{1},y_{1}^{(t)},\dots,y_{k}^{(t)})$ for all $\{y_{i}^{(t)}\}_{i=1}^{k}$ is

\begin{align*}
&\Pr\left((\mathbf{x}_{1},\mathbf{b}_{1},y_{1}^{(t)},\dots,y_{k}^{(t)})\text{ renders }\psi_{t}\text{ false for all }\{y_{i}^{(t)}\}_{i=1}^{k}\right)\\
=\ &\Pr\left(\bigvee_{i=1}^{k_{1}}(\mathbf{x}_{1},y_{i}^{(t)})\text{ renders }\gamma_{i}\text{ false for all }y_{i}^{(t)}\right)\\
\geq\ &\Pr\left((\mathbf{x}_{1},y_{i}^{(t)})\text{ renders }\gamma_{i}\text{ false for all }y_{i}^{(t)}\right)\\
\geq\ &1-\frac{2^{c}}{(2^{c}-1)^{2}}.
\end{align*}

Define $\Gamma_{t}$ as a binary indicator variable with

\[
\Gamma_{t}=\begin{cases}0 & \text{if }(\mathbf{x}_{1},\mathbf{b}_{1},y_{1}^{(t)},\dots,y_{k}^{(t)})\text{ renders }\psi_{t}\text{ false for all }\{y_{i}^{(t)}\}_{i=1}^{k},\\ 1 & \text{otherwise.}\end{cases}
\]

Upon discovering $(\mathbf{x}_{1},\mathbf{b}_{1})$, $\mathtt{XOR}$-$\mathtt{SMC}$ does not return true if the majority of $\psi_{t}$, $t=1,\dots,T$, are false under $(\mathbf{x}_{1},\mathbf{b}_{1},y_{1}^{(t)},\dots,y_{k}^{(t)})$ for all $\{y_{i}^{(t)}\}_{i=1}^{k}$, that is, if $\sum_{t=1}^{T}\Gamma_{t}\leq\frac{T}{2}$. By the Chernoff–Hoeffding theorem,

\[
\Pr\left(\sum_{t=1}^{T}\Gamma_{t}\leq\frac{T}{2}\right)=1-\Pr\left(\sum_{t=1}^{T}\Gamma_{t}>\frac{T}{2}\right)\geq 1-e^{-\alpha(c,1)T}.
\]

For $T\geq\frac{(n+k)\ln 2-\ln\eta}{\alpha(c,1)}$, it follows that $e^{-\alpha(c,1)T}\leq\frac{\eta}{2^{n+k}}$. In this case, with probability at least $1-\frac{\eta}{2^{n+k}}$, $\mathtt{XOR}$-$\mathtt{SMC}$ does not return true upon discovering $(\mathbf{x}_{1},\mathbf{b}_{1})$.
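Spelling out the substitution (a one-line check, assuming $\alpha(c,1)>0$):

\[
e^{-\alpha(c,1)T}\leq e^{-\alpha(c,1)\cdot\frac{(n+k)\ln 2-\ln\eta}{\alpha(c,1)}}=e^{-(n+k)\ln 2+\ln\eta}=\frac{\eta}{2^{n+k}}.
\]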

Note that $\mathtt{XOR}$-$\mathtt{SMC}$ returns false only if it does not return true upon discovering any $(\mathbf{x}_{1},\mathbf{b}_{1})$. Therefore, we can write

\begin{align*}
&\Pr(\mathtt{XOR}\text{-}\mathtt{SMC}\text{ returns false})\\
=\ &\Pr(\forall\mathbf{x},\mathbf{b},\ \mathtt{XOR}\text{-}\mathtt{SMC}\text{ does not return true upon }(\mathbf{x},\mathbf{b}))\\
=\ &1-\Pr(\exists\mathbf{x}_{1},\mathbf{b}_{1},\ \mathtt{XOR}\text{-}\mathtt{SMC}\text{ returns true upon }(\mathbf{x}_{1},\mathbf{b}_{1}))\\
\geq\ &1-\sum_{\mathbf{x}_{1},\mathbf{b}_{1}}\Pr(\mathtt{XOR}\text{-}\mathtt{SMC}\text{ returns true upon }(\mathbf{x}_{1},\mathbf{b}_{1}))\\
\geq\ &1-\frac{\eta}{2^{n+k}}\cdot 2^{n+k}=1-\eta.
\end{align*}

Proof of Theorem 2

Combining Claims 1 and 2, we can choose $T=\frac{(n+k)\ln 2-\ln\eta}{\alpha(c,k)}$ so that $T\geq\max\left\{\frac{-\ln\eta}{\alpha(c,k)},\frac{(n+k)\ln 2-\ln\eta}{\alpha(c,1)}\right\}$. This completes the proof.

Appendix B Implementation of $\mathtt{XOR}$-$\mathtt{SMC}$

Our code is attached to the supplementary material. It contains 1) the implementation of our $\mathtt{XOR}$-$\mathtt{SMC}$ method, 2) the list of datasets, and 3) the implementation of several baseline algorithms.

We use the CPLEX solver (https://www.ibm.com/products/ilog-cplex-optimization-studio/cplex-optimizer), version 22.1.1, as the NP oracle. We use the Python package boolexpr (http://www.boolexpr.org/index.html) to process the parts involving the logic operations $\lor,\land,\neg$.

Computational Pipeline of $\mathtt{XOR}$-$\mathtt{SMC}$

Preprocess

The preprocessing step converts the problem into the standard form specified in Eq. (3):

\[
\phi(\mathbf{x},\mathbf{b})\wedge\left[b_{1}\Rightarrow\left(\sum_{\mathbf{y}_{1}\in\mathcal{Y}_{1}}f_{1}(\mathbf{x},\mathbf{y}_{1})\geq 2^{q_{1}}\right)\right]\wedge\dots\wedge\left[b_{k}\Rightarrow\left(\sum_{\mathbf{y}_{k}\in\mathcal{Y}_{k}}f_{k}(\mathbf{x},\mathbf{y}_{k})\geq 2^{q_{k}}\right)\right].
\]

The crucial aspects are to guarantee

  • $\forall i$, $\mathcal{Y}_{i}$ should be properly encoded such that $\mathcal{Y}_{i}\subseteq\{0,1\}^{d_{i}}$ for some $d_{i}$. (See the example in Shelter Location Assignment.)

  • $\forall i$, $f_{i}:\mathcal{X}\times\mathcal{Y}_{i}\rightarrow\{0,1\}$ must be a binary-output function. (See the example in Robust Supply Chain Design.)

Formulate a Binary-Output Function into a Boolean Formula

A binary-output function $f_{i}$ can be formulated as a Boolean formula in which $\mathbf{x}$ and $\mathbf{y}_{i}$ are treated as Boolean variables and $f_{i}(\mathbf{x},\mathbf{y}_{i})=1$ if and only if the Boolean formula is satisfied by $(\mathbf{x},\mathbf{y}_{i})$.

Although it may seem straightforward to use SAT solvers, after comparison we found it more efficient to transform the SAT instance into an equivalent Constraint Programming (CP) problem and query the CPLEX solver. Because the CP and SAT formulations are interchangeable here, we keep using the SAT formulation in the following discussion.
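As a rough illustration of this oracle call (not the released code), the following minimal sketch poses a small CNF formula as a 0-1 feasibility model and queries CPLEX through the docplex Python API; the clause list and variable names are purely illustrative.

```python
# Minimal sketch: encode a small CNF formula as a 0-1 feasibility model and
# query CPLEX as the NP oracle. Assumes the `docplex` package is installed.
from docplex.mp.model import Model

# CNF over x1..x3: (x1 or ~x2) and (x2 or x3) and (~x1 or ~x3)
clauses = [[1, -2], [2, 3], [-1, -3]]

m = Model(name="sat_as_cp")
x = {i: m.binary_var(name=f"x{i}") for i in range(1, 4)}

for clause in clauses:
    # A clause is satisfied iff the sum of its literals is at least one.
    m.add_constraint(m.sum(x[v] if v > 0 else 1 - x[-v] for v in clause) >= 1)

solution = m.solve()
if solution is not None:
    print({name: solution.get_value(var) for name, var in x.items()})
else:
    print("UNSAT")
```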

Add XOR Constraints

Then we need to add random XOR constraints to the Boolean formula. As an example, an XOR constraint for $\mathbf{y}\in\{0,1\}^{d}$ is

\[
\mathtt{XOR}(\mathbf{y})=y_{2}\oplus y_{3}\oplus y_{d}\oplus 1.
\]

In this $\mathtt{XOR}(\mathbf{y})$, each literal $y_{j}$, $j=1,\dots,d$, as well as the trailing constant $1$ (appending the constant $1$ negates the parity of the constraint), appears independently with probability $1/2$. In this manner, we sample multiple XOR constraints for each $\mathbf{y}_{i}$; the number of XOR constraints for $\mathbf{y}_{i}$ is the logarithm of the corresponding threshold, $\log_{2}(2^{q_{i}})=q_{i}$.
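A minimal sketch of this sampling step, assuming the convention above (each variable and the negating constant included independently with probability $1/2$); the function names are illustrative, not taken from the released code.

```python
import random

def sample_xor_constraint(d: int):
    """Sample one random parity (XOR) constraint over y_1..y_d.

    Each variable is included independently with probability 1/2, and the
    constant 1 (which negates the parity) is included with probability 1/2.
    Returns (indices, negate), meaning XOR of y_j for j in indices,
    XOR-ed with 1 if negate is True, must evaluate to 1.
    """
    indices = [j for j in range(1, d + 1) if random.random() < 0.5]
    negate = random.random() < 0.5
    return indices, negate

def sample_xor_constraints(d: int, q: int):
    """Sample the q independent XOR constraints used for one counting predicate."""
    return [sample_xor_constraint(d) for _ in range(q)]
```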

After sampling all XOR constraints, we obtain a Boolean formula:

\[
\begin{aligned}
&(b_{1}\Rightarrow f_{1}(\mathbf{x},\mathbf{y}_{1})\wedge\mathtt{XOR}_{1}(\mathbf{y}_{1})\wedge\ldots\wedge\mathtt{XOR}_{q_{1}}(\mathbf{y}_{1}))\ \wedge\\
&(b_{2}\Rightarrow f_{2}(\mathbf{x},\mathbf{y}_{2})\wedge\mathtt{XOR}_{1}(\mathbf{y}_{2})\wedge\ldots\wedge\mathtt{XOR}_{q_{2}}(\mathbf{y}_{2}))\ \wedge\cdots
\end{aligned}\tag{10}
\]

Repeat and Examine the Majority

Since all XOR constraints are sampled uniformly at random, the probability that Eq. (10) is satisfiable on a single draw when Eq. (3) holds true may not be large. We therefore boost the success probability by repeating the experiment $T$ times and checking whether the majority of the repetitions are satisfiable.
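An illustrative sketch of this repeat-and-take-majority step (not the actual implementation); `build_formula_with_random_xors` and `np_oracle_is_sat` are hypothetical helpers, the first assembling Eq. (10) with freshly sampled XOR constraints and the second querying the NP oracle (e.g., CPLEX) for satisfiability.

```python
def xor_smc_majority_vote(build_formula_with_random_xors, np_oracle_is_sat, T: int) -> bool:
    sat_votes = 0
    for _ in range(T):
        psi_t = build_formula_with_random_xors()  # fresh random XORs each repetition
        if np_oracle_is_sat(psi_t):
            sat_votes += 1
    # Return true only if the majority of the T repetitions are satisfiable.
    return sat_votes > T / 2
```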

Appendix C Experiment Settings

All experiments were run on a high-end server with two Skylake CPUs @ 2.60 GHz, 24 cores, and 96 GB of RAM.

Shelter Location Assignment

As discussed in the main text, the essential decision problem decides whether there are at least $2^{q_{r}}$ paths connecting each residential area with a shelter. The assigned shelters are represented by a vector $\mathbf{b}=(b_{1},\ldots,b_{n})\in\{0,1\}^{n}$, where $b_{i}=1$ implies node $v_{i}$ is chosen as a shelter. Let $\phi(\mathbf{b})=\left(\sum_{i=1}^{n}b_{i}\right)\leq m$ represent that there are at most $m$ shelters. Let $I(v_{r},v_{s},E^{\prime})$ be an indicator function that returns one if and only if the selected edges $E^{\prime}$ form a path from $v_{r}$ to $v_{s}$. The whole formula is:

\[
\phi(\mathbf{b})\wedge\left[b_{i}\Rightarrow\left(\sum_{v_{s}\in S,\,E^{\prime}\subseteq E}I(v_{r},v_{s},E^{\prime})\geq 2^{q_{r}}\right)\right]\quad\text{for }1\leq i\leq n.
\]

Path Encoding

The path indicator function $I(\cdot)$ is implemented as follows. Any path between two nodes can be mapped to a unique flow in which the flow along the path is 1 and all other flows are 0. One such unit integer flow can be represented by a binary-valued vector

\[
\mathbf{f}\in\{(f_{e_{1}},f_{e_{2}},\dots,f_{e_{M}})\,|\,f_{e_{i}}\in\{0,1\},\text{ for }1\leq i\leq M\},
\]

where $f_{e_{i}}=1$ if and only if the amount of flow on edge $e_{i}$ is 1. In this way, every path on the map can be encoded by $\mathbf{f}$, and the implementation can use a flow indicator function instead:

\[
I(v_{r},v_{s},\mathbf{f})=\begin{cases}1 & \text{if the vector }\mathbf{f}\text{ represents a path from }v_{r}\text{ to shelter }v_{s},\\ 0 & \text{otherwise.}\end{cases}
\]

Then $I(\cdot)$ can be implemented by encoding flow-conservation constraints, as illustrated in the sketch below.
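As a rough illustration under the assumption that the underlying graph is acyclic, the following sketch builds the flow-conservation equalities that force a 0/1 edge-flow vector to form a single path from a source $v_{r}$ to a sink $v_{s}$; the graph representation and helper names are illustrative, not taken from the released code.

```python
from collections import defaultdict

def path_flow_constraints(edges, source, sink):
    """Build linear flow-conservation constraints over 0/1 edge flows f_e.

    Each constraint is returned as (out_edges, in_edges, rhs), meaning
        sum(f_e for e in out_edges) - sum(f_e for e in in_edges) == rhs.
    rhs is +1 at the source, -1 at the sink, and 0 elsewhere, which forces the
    selected edges to carry one unit of flow, i.e., a single source-sink path
    (assuming the directed graph is acyclic).
    """
    out_edges, in_edges = defaultdict(list), defaultdict(list)
    for (u, v) in edges:
        out_edges[u].append((u, v))
        in_edges[v].append((u, v))

    constraints = []
    for node in set(out_edges) | set(in_edges):
        rhs = 1 if node == source else (-1 if node == sink else 0)
        constraints.append((out_edges[node], in_edges[node], rhs))
    return constraints
```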

Shelter Location Encoding

The set of nodes with shelters, $S$, can be further encoded by a binary vector

\[
\mathbf{x}_{S}=(x_{1},\dots,x_{N})\in\{0,1\}^{|V|},
\]

where $x_{i}=1$ represents a shelter assigned to node $v_{i}$, i.e., $v_{i}\in S$. For example,

\[
\mathbf{x}_{S}=(1,1,1,\underbrace{0,\dots,0}_{\text{all }0})\tag{11}
\]

implies we place only 3 shelters, on nodes $v_{1}$, $v_{2}$, and $v_{3}$.

With the notation defined above, the shelter location problem can be formulated as an SMC:

\[
\left(\sum_{i=1}^{|V|}x_{i}\leq m\right)\wedge\bigwedge_{v_{r}\in R}\left(\sum_{v_{s}\in S}\sum_{\mathbf{f}}I(v_{r},v_{s},\mathbf{f})\geq 2^{q_{r}}\right)\tag{12}
\]

If problem (12) holds true for some $S$, then the shelter assignment plan meets the safety requirement. Thus, in $\mathtt{XOR}$-$\mathtt{SMC}$, we are actually solving an SAT problem:

\[
\left(\sum_{i=1}^{|V|}x_{i}\leq m\right)\wedge\bigwedge_{v_{r}}\left(\underbrace{I(v_{r},v_{s},\mathbf{f})\wedge\mathtt{XOR}_{1}(\mathbf{f},\mathbf{x}_{S})\wedge\dots\wedge\mathtt{XOR}_{q_{r}}(\mathbf{f},\mathbf{x}_{S})}_{\text{repeated }T\text{ times; check the satisfiability of the majority}}\right),
\]

where we determine whether there exist $\mathbf{f}$ and $\mathbf{x}_{S}$ that render the Boolean formula above true.

Dataset

The dataset is from Shelters at Hawaii (https://iu.maps.arcgis.com/apps/Styler/index.html?appid=477fae419a014a3ba7c95cdcb39fd946). The case study is from Hawaii and involves the Mauna Loa volcano. The map offers an analysis of the most practical locations for emergency shelters for displaced people within 15 minutes of their original locations; relationships between population, emergency shelters, and high-risk lava-flow zones on the islands are analyzed. By choosing different sub-regions, we extract 3 maps with 121, 186, and 388 nodes and 237, 359, and 775 undirected edges, respectively, representing 3 difficulty levels. These maps are included in the code repository.

Evaluation Metric

Suppose we find a shelter assignment $\{v_{s_{1}},\dots,v_{s_{q}}\}$ for residential areas $\{v_{r_{1}},\dots,v_{r_{p}}\}$. The quality of the assignment is evaluated by the total number of paths from the residential areas to the chosen shelters, computed as

\[
\sum_{i=1}^{p}\sum_{j=1}^{q}\text{Number of paths from }v_{r_{i}}\text{ to }v_{s_{j}}.
\]

However, path counting is intractable even when the shelter locations are given. We use SharpSAT-TD (Korhonen and Järvisalo 2021), a model counting solver, to solve each counting predicate separately given the shelter locations (a brute-force illustration of this metric on a tiny graph is given below).
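The following sketch is only meant to illustrate what the metric measures; it enumerates simple paths by DFS, which is feasible only on tiny graphs, whereas the experiments use the exact model counter SharpSAT-TD.

```python
import itertools
from collections import defaultdict

def count_simple_paths(edges, source, target):
    """Count simple paths from source to target by DFS (tiny graphs only)."""
    adj = defaultdict(list)
    for u, v in edges:              # undirected edges
        adj[u].append(v)
        adj[v].append(u)

    def dfs(node, visited):
        if node == target:
            return 1
        return sum(dfs(nxt, visited | {nxt})
                   for nxt in adj[node] if nxt not in visited)

    return dfs(source, {source})

def assignment_quality(edges, residences, shelters):
    """Total number of paths from every residential area to every chosen shelter."""
    return sum(count_simple_paths(edges, r, s)
               for r, s in itertools.product(residences, shelters))
```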

Choice of Baseline

For the baseline methods, shelter locations are found by local search, using the shelter assignment as the search state and an approximate path-count estimation function

\[
h(\{v_{s_{1}},\dots,v_{s_{q}}\})=\sum_{i=1}^{p}\sum_{j=1}^{q}\text{Approximate number of paths from }v_{r_{i}}\text{ to }v_{s_{j}}.
\]

The approximate path counting is achieved by querying approximate sampling oracles, including a Gibbs sampler, the UniGen sampler (https://github.com/meelgroup/unigen), and the Quick sampler (https://github.com/RafaelTupynamba/quicksampler). A sketch of this local-search baseline follows.
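This is a hill-climbing sketch of the baseline under stated assumptions, not the released implementation; `approx_num_paths(r, s)` is a hypothetical callback that queries one of the approximate sampling oracles above for the number of paths from residence `r` to shelter `s`.

```python
import random

def local_search_shelters(nodes, residences, m, approx_num_paths, iters=1000):
    """Hill-climbing over shelter sets of size m (illustrative only)."""
    def score(shelters):
        return sum(approx_num_paths(r, s) for r in residences for s in shelters)

    current = set(random.sample(sorted(nodes), m))   # random initial assignment
    best_score = score(current)
    for _ in range(iters):
        # Neighbor: swap one chosen shelter with one unchosen node.
        out_node = random.choice(sorted(current))
        in_node = random.choice(sorted(set(nodes) - current))
        neighbor = (current - {out_node}) | {in_node}
        neighbor_score = score(neighbor)
        if neighbor_score > best_score:              # accept improving moves only
            current, best_score = neighbor, neighbor_score
    return current, best_score
```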

Hyperparameters

For all experiments, the residential nodes are $v_{r_{1}}=0$, $v_{r_{2}}=10$, and $v_{r_{3}}=20$. The maximum number of shelters is $m=5$. The parameter $T$ (the number of repetitions) is set to 2, which works well empirically.

Because local search is sensitive to the initialization of the search state, we initialize the starting point randomly and repeat 5 times. Each run continues until the approach reaches a local optimum, and only the best results are picked for comparison. For our $\mathtt{XOR}$-$\mathtt{SMC}$, we give a time limit of 12 hours. The algorithm runs repeatedly with increasing $q_{r}$ until it times out. The time shown in Table 1 and the number of paths in Figure 2 correspond to the cumulative time and the best solutions found before the algorithm times out.

Robust Supply Chain Design

Figure 4: Example of the supply chain design application. (a) The initial supply chain where nodes 1 and 2 are suppliers of wheat; nodes 3 and 4 are producers of bread; nodes 5 and 6 are markets. (b) Graph after trade (edge) selection of each demander. (c) Under the effect of random events, e.g., natural disasters, some trades (edges) are disabled.

Figure 4 shows an example of supply chain design. The supply chain (network) can be formulated as a directed acyclic graph $G(V,E)$. For each node $v\in V$ in the supply chain: (1) it represents a supplier, which consumes raw materials from upstream suppliers (nodes in the previous layer) and supplies products to downstream suppliers (nodes in the following layer); the number of nodes is $|V|$. (2) Each node $v$ has a limited budget, denoted by $Budget(v)$, representing the maximum budget it can spend on purchasing raw materials. (3) We call nodes without upstream suppliers in the initial graph "primary suppliers", e.g., nodes 1 and 2 in Fig. 4. Nodes without downstream suppliers are called "final-tier suppliers", e.g., nodes 5 and 6 in Fig. 4.

For each edge $(u,v)\in E$ in the graph: (1) it represents a possible trade from node $u$ to node $v$, directed from a supplier to a demander; the number of edges is denoted by $|E|$. (2) Each edge $(u,v)$ has a capacity $Capacity(u,v)$ representing the maximum amount of raw materials that can flow from the supplier to the demander. (3) Each edge $(u,v)$ has a cost for the demander, denoted by $Cost(u,v)$; it represents the cost of the trade (mainly purchase and transportation). (4) The actual amount of goods flowing on each edge can be below the capacity; we use $flow(e)$ to denote the actual flow on edge $e$.

Due to the limited budget, each node $v$ can only trade with a limited number of suppliers. Denote this selection as $\mathbf{x}\subseteq E$, where $(u,v)\in\mathbf{x}$ if and only if the trading edge $(u,v)$ is selected. The trade selection is shown in Figure 4 (i), and the resulting graph is Figure 4 (b). We have to ensure $\sum_{(u,v)\in\mathbf{x}}Capacity(u,v)\leq Budget(v)$ for all $v$.

A disaster is defined as $\boldsymbol{\theta}\in\{(\theta_{u,v},\dots)\,|\,(u,v)\in E\}\subseteq\{0,1\}^{|E|}$, where $\theta_{u,v}=1$ if and only if the disaster destroys edge $(u,v)$. The disaster $\boldsymbol{\theta}$ is associated with a probability distribution $\Pr(\boldsymbol{\theta})$, and the random disaster happens after the trade selection; an example is shown in Figure 4 (ii) and (c). In our study, we address the complex scenario where a disaster may simultaneously affect multiple trades, especially those located within the same geographical region; e.g., a wheat factory and a bread market in the same city are likely to face the same natural disaster. Bayesian networks are highly expressive and can effectively represent conditional dependencies and probabilistic relationships among entities, which makes them well suited for capturing the interdependencies among trades. By modeling $\Pr(\boldsymbol{\theta})$ as a Bayesian network, we reflect the real-world scenario more accurately, thus providing a more realistic case study.
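For concreteness, the data described above could be held in a small container like the following sketch; the class and field names are illustrative and the budget check follows the cost-based constraint used in the problem formulation below.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Edge = Tuple[int, int]  # (supplier u, demander v)

@dataclass
class SupplyChain:
    """Illustrative container for the supply chain data described above."""
    budget: Dict[int, float]            # Budget(v) for every node v
    capacity: Dict[Edge, float]         # Capacity(u, v) for every edge
    cost: Dict[Edge, float]             # Cost(u, v) for every edge
    primary: Set[int] = field(default_factory=set)     # no upstream supplier
    final_tier: Set[int] = field(default_factory=set)  # no downstream supplier

    def within_budget(self, selected: Set[Edge]) -> bool:
        """Check the budget constraint of a trade selection x ⊆ E."""
        spent: Dict[int, float] = {}
        for (u, v) in selected:
            spent[v] = spent.get(v, 0.0) + self.cost[(u, v)]
        return all(spent.get(v, 0.0) <= b for v, b in self.budget.items())
```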

Problem Formulation

As discussed in the main text (Eq. (8)), the core decision problem is to determine whether there is a trading plan $\mathbf{x}$ that satisfies

\[
\phi(\mathbf{x})\wedge\sum_{v\text{ in final-tier suppliers}}\mathbb{E}_{\boldsymbol{\theta}}\left[\sum_{(u,v)\text{ in selected edges }\mathbf{x}}flow(u,v;\boldsymbol{\theta})\right]\geq 2^{q},
\]

where the summation inside the expectation quantifies all incoming raw materials transported to $v$ on the actual trading map under trade selection $\mathbf{x}$ and random disaster $\boldsymbol{\theta}$, and the expectation represents the expected production of node $v$ under the disaster distribution. $\phi(\mathbf{x})$ captures the budget limitation $\sum_{(u,v)\text{ in selected edges }\mathbf{x}}Cost(u,v)\leq Budget(v)$ for every node $v$.

Note that if the flow on each edge $e$ can vary arbitrarily between 0 and $Capacity(e)$, then a representation of $flow(\cdot,\cdot;\boldsymbol{\theta})$ must be included among the decision variables alongside $\mathbf{x}$. This is because the same graph can have varying amounts and assignments of flow. As a result, the decision problem becomes overly complex, as encoding $flow(\cdot,\cdot;\boldsymbol{\theta})$ for all possible graphs requires exponential space.

To simplify, we assume that all suppliers, except the primary and final-tier suppliers, can meet the maximum production requirement with a single incoming trade, that is, $Capacity(u,v)\geq\sum_{(v,w)\in E}Capacity(v,w)$ for all $(u,v)\in E$. Consequently, the flow on the selected trading edges must be at full capacity to achieve the maximum flow. Therefore, we can drop $flow$ and consider $Capacity$ directly.

SMC Formulation

After simplification, we can expand the decision problem in Eq. (8) as

\begin{align*}
&\sum_{\boldsymbol{\theta}}\Pr(\boldsymbol{\theta})\sum_{\substack{v\text{ in final-tier suppliers,}\\ (u,v)\text{ in selected edges }\mathbf{x}}}Capacity(u,v)\,I(u,v;\boldsymbol{\theta})\geq 2^{q}\\
&\text{such that }\ \forall v,\ \sum_{(u,v)\text{ in selected edges }\mathbf{x}}Cost(u,v)\leq Budget(v),
\end{align*}

where $I(u,v;\boldsymbol{\theta})$ is an indicator function defined as

\[
I(u,v;\boldsymbol{\theta})=\begin{cases}1 & \text{if there exists a path from a primary supplier to node }v\text{ via edge }(u,v)\text{ subject to the disaster,}\\ 0 & \text{otherwise.}\end{cases}
\]

Implementation of $\mathtt{XOR}$-$\mathtt{SMC}$

To utilize $\mathtt{XOR}$-$\mathtt{SMC}$, we need to transform $\Pr(\boldsymbol{\theta})\sum_{\substack{v\text{ in final-tier suppliers,}\\ (u,v)\text{ in selected edges }\mathbf{x}}}Capacity(u,v)\,I(u,v;\boldsymbol{\theta})$ into a sum over a 0/1-output function (a Boolean formula).

  • Discretize Probability: First, we discretize $\Pr(\boldsymbol{\theta})$. Suppose we use 4 bits $\mathbf{b}=(b_{0},b_{1},b_{2},b_{3})\in\{0,1\}^{4}$ to represent probabilities from $\frac{0}{15}$ to $\frac{15}{15}$. Then we have

    \[
    \Pr(\boldsymbol{\theta})=\sum_{\mathbf{b}\in\{0,1\}^{4}}\underbrace{\mathbf{1}\left(\frac{1}{15}(2^{0}b_{0}+2^{1}b_{1}+2^{2}b_{2}+2^{3}b_{3})<\Pr(\boldsymbol{\theta})\right)}_{\text{this indicator function has }0/1\text{ output and is easy to implement in CPLEX}}\tag{13}
    \]

    We model $\Pr(\boldsymbol{\theta})$ as a Bayesian network, i.e., $\Pr(\boldsymbol{\theta})=\Pr(\boldsymbol{\theta}^{\prime})\cdot\Pr(\boldsymbol{\theta}^{\prime\prime}|\boldsymbol{\theta}^{\prime})$, where $\boldsymbol{\theta}^{\prime}$ and $\boldsymbol{\theta}^{\prime\prime}$ are two subsets of the random variables of $\boldsymbol{\theta}$.

    \[
    \begin{aligned}
    \Pr(\boldsymbol{\theta}^{\prime})\cdot\Pr(\boldsymbol{\theta}^{\prime\prime}|\boldsymbol{\theta}^{\prime})=\sum_{\mathbf{b}^{\prime}\in\{0,1\}^{4}}\sum_{\mathbf{b}^{\prime\prime}\in\{0,1\}^{4}}\ &\mathbf{1}\left(\frac{1}{15}(2^{0}b_{0}^{\prime}+2^{1}b_{1}^{\prime}+2^{2}b_{2}^{\prime}+2^{3}b_{3}^{\prime})<\Pr(\boldsymbol{\theta}^{\prime})\right)\times\\
    &\mathbf{1}\left(\frac{1}{15}(2^{0}b_{0}^{\prime\prime}+2^{1}b_{1}^{\prime\prime}+2^{2}b_{2}^{\prime\prime}+2^{3}b_{3}^{\prime\prime})<\Pr(\boldsymbol{\theta}^{\prime\prime}|\boldsymbol{\theta}^{\prime})\right)
    \end{aligned}\tag{14}
    \]

    In addition, we need to encode each factor of the Bayesian network, which has the form $\Pr(\boldsymbol{\theta}^{\prime})$ or $\Pr(\boldsymbol{\theta}^{\prime\prime}|\boldsymbol{\theta}^{\prime})$, in CPLEX. Using $\Pr(\boldsymbol{\theta}^{\prime\prime}|\boldsymbol{\theta}^{\prime})$ as an example, we need a CPLEX IloNumExpr-typed numerical expression that takes the value assignment of $\boldsymbol{\theta}^{\prime}\cup\boldsymbol{\theta}^{\prime\prime}$ as input and outputs the value of $\Pr(\boldsymbol{\theta}^{\prime\prime}|\boldsymbol{\theta}^{\prime})$.

    For a clear demonstration, assume we have a factor $P(\theta_{1}|\theta_{2})$ as in Tab. 3. Then we can encode it in CPLEX as follows (a small verification sketch of this encoding is given after this list):

    Table 3: Factor $P(\theta_{1}|\theta_{2})$

    $\theta_{1}$   $\theta_{2}$   $\Pr(\theta_{1}|\theta_{2})$
    0              0              $p_{0}$
    0              1              $p_{1}$
    1              0              $p_{2}$
    1              1              $p_{3}$
    \begin{align*}
    P(\theta_{1}|\theta_{2}) &= p_{0}u_{0,0}+p_{1}u_{0,1}+p_{2}u_{1,0}+p_{3}u_{1,1},\\
    \text{s.t.}\quad & u_{0,0}+u_{0,1}=1-\theta_{1},\\
    & u_{1,0}+u_{1,1}=\theta_{1},\\
    & u_{0,0}+u_{1,0}=1-\theta_{2},\\
    & u_{0,1}+u_{1,1}=\theta_{2},\\
    & u_{0,0},u_{0,1},u_{1,0},u_{1,1}\in\{0,1\},
    \end{align*}

    where $u_{i,j}$ are auxiliary variables. Then the corresponding indicator function can be implemented as

    \begin{align*}
    &\mathbf{1}\left(\frac{1}{15}(2^{0}b_{0}+2^{1}b_{1}+2^{2}b_{2}+2^{3}b_{3})<\Pr(\theta_{1}|\theta_{2})\right)\\
    \equiv\ &\left(\frac{1}{15}(2^{0}b_{0}+2^{1}b_{1}+2^{2}b_{2}+2^{3}b_{3})<p_{0}u_{0,0}+p_{1}u_{0,1}+p_{2}u_{1,0}+p_{3}u_{1,1}\right)\wedge\\
    &(u_{0,0}+u_{0,1}=1-\theta_{1})\wedge(u_{1,0}+u_{1,1}=\theta_{1})\wedge(u_{0,0}+u_{1,0}=1-\theta_{2})\wedge(u_{0,1}+u_{1,1}=\theta_{2}).
    \end{align*}

    After discretization, we can now transform the original problem into

    \[
    \exists\mathbf{x},\quad\sum_{\boldsymbol{\theta}}\sum_{\mathbf{b}_{\Pr}}\big(\text{indicator function for }\Pr(\boldsymbol{\theta})\big)\sum_{\substack{v\text{ in final-tier suppliers,}\\ (u,v)\text{ in selected edges }\mathbf{x}}}Capacity(u,v)\,I(u,v;\boldsymbol{\theta}),
    \]

    where $\mathbf{b}_{\Pr}$ represents the bits used to represent $\Pr(\boldsymbol{\theta})$, i.e., $\mathbf{b}$ in Eq. (13) and $\mathbf{b}^{\prime}\cup\mathbf{b}^{\prime\prime}$ in Eq. (14).

  • Discretize Capacity: The same idea applies to the capacity. We discretize $\sum_{\substack{v\text{ in final-tier suppliers,}\\ (u,v)\text{ in selected edges }\mathbf{x}}}Capacity(u,v)\,I(u,v;\boldsymbol{\theta})$ as a whole. Assume this total capacity can be represented by a 6-bit binary number $\mathbf{b}=(b_{0},b_{1},b_{2},b_{3},b_{4},b_{5})$. Then

    \begin{align*}
    &\sum_{\substack{v\text{ in final-tier suppliers,}\\ (u,v)\text{ in selected edges }\mathbf{x}}}Capacity(u,v)\,I(u,v;\boldsymbol{\theta})\\
    =\ &\sum_{\mathbf{b}\in\{0,1\}^{6}}\mathbf{1}\left(b_{0}+2b_{1}+\dots+2^{5}b_{5}<\sum_{\substack{v\text{ in final-tier suppliers,}\\ (u,v)\text{ in selected edges }\mathbf{x}}}Capacity(u,v)\,I(u,v;\boldsymbol{\theta})\right).
    \end{align*}
  • Practical example of $I(u,v;\boldsymbol{\theta})$: This indicator function indicates whether edge $(u,v)$ lies on some path from a primary supplier to $v$.

    Use Figure 4(a) as an example. Set $v=v_{5}$ and examine $I(v_{3},v_{5};\boldsymbol{\theta})$:

    \[
    I(v_{3},v_{5};\boldsymbol{\theta})=\underbrace{\left(I(v_{1},v_{3};\boldsymbol{\theta})\vee I(v_{2},v_{3};\boldsymbol{\theta})\right)}_{\text{node }v_{3}\text{ is connected to some primary supplier}}\wedge\underbrace{x_{v_{3},v_{5}}\wedge\neg\theta_{v_{3},v_{5}}}_{(v_{3},v_{5})\text{ is selected and not destroyed}}.
    \]
  • Overall: After discretization and transformation, the problem is a typical #SAT problem:

    \[
    \exists\mathbf{x},\quad\sum_{\boldsymbol{\theta}}\sum_{\mathbf{b}_{\Pr}}\sum_{\mathbf{b}_{c}}\big(\text{a }0/1\text{-output function}\big)(\mathbf{x},\boldsymbol{\theta},\mathbf{b}_{\Pr},\mathbf{b}_{c}),
    \]

    where $\mathbf{b}_{\Pr},\mathbf{b}_{c}$ are the binary variables introduced by discretizing the probability and the capacity.
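To make the factor encoding above concrete, here is a small, self-contained Python check (not part of the released code, with illustrative probabilities) that the auxiliary-variable constraints reproduce the conditional probability table of Tab. 3 for every assignment of $(\theta_{1},\theta_{2})$.

```python
import itertools

def factor_value(theta1, theta2, p):
    """Evaluate P(theta1 | theta2) via the auxiliary one-hot variables u_{i,j}.

    The constraints u00+u01 = 1-theta1, u10+u11 = theta1, u00+u10 = 1-theta2,
    u01+u11 = theta2 force exactly one u_{i,j} to equal 1, namely u_{theta1,theta2};
    the linear expression p0*u00 + p1*u01 + p2*u10 + p3*u11 then selects the
    matching table entry.
    """
    for u00, u01, u10, u11 in itertools.product([0, 1], repeat=4):
        if (u00 + u01 == 1 - theta1 and u10 + u11 == theta1
                and u00 + u10 == 1 - theta2 and u01 + u11 == theta2):
            return p[0] * u00 + p[1] * u01 + p[2] * u10 + p[3] * u11
    raise ValueError("no feasible assignment of the auxiliary variables")

# Table 3 with illustrative probabilities p0..p3.
p = [0.9, 0.4, 0.1, 0.6]
table = {(0, 0): p[0], (0, 1): p[1], (1, 0): p[2], (1, 1): p[3]}
assert all(abs(factor_value(t1, t2, p) - table[(t1, t2)]) < 1e-12
           for t1, t2 in itertools.product([0, 1], repeat=2))
```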

Dataset

We evaluate our algorithm on a real-world wheat supply chain network from Zokaee et al. (2017); the structure of the supply chain is shown in Fig. 5. For a better comparison, we also construct one small-scale synthetic network, in which the parameters are generated from a Gaussian distribution fitted to the real-world data. The statistics of the two supply chain networks are given in Table 4.

Table 4: Statistics of the wheat supply chain networks.

                                    Wheat suppliers   Flour factories   Bread factories   Markets
Real-world supply network                  9                 7                 9            19
Small synthesized supply network           4                 4                 5             5

In our experiments, we consider different disaster rates over the supply chain network. We randomly pick 10%, 20%, or 30% of the $M$ edges in the network. In each disaster distribution, the joint probability over the affected edges is defined by a randomly generated Bayesian network.

Evaluation Metric

To evaluate the efficacy of a trading plan, we calculate its empirical average actual production under $K=10000$ i.i.d. disasters $\{\boldsymbol{\theta}^{(1)},\dots,\boldsymbol{\theta}^{(K)}\}$ sampled from the ground-truth distribution. This method is adopted because directly calculating the expectation is computationally infeasible. Formally, given a trading plan $\mathbf{x}$, the performance of $\mathbf{x}$ is evaluated by

\[
\frac{1}{K}\sum_{k=1}^{K}\sum_{\substack{v\text{ in final-tier suppliers,}\\ (u,v)\text{ in selected edges }\mathbf{x}}}Capacity(u,v)\,I(u,v;\boldsymbol{\theta}^{(k)}).
\]
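An illustrative sketch of this Monte Carlo evaluation (not the released code); `sample_disaster` and `indicator` are hypothetical callbacks, the first drawing one disaster vector from the ground-truth distribution and the second returning $I(u,v;\boldsymbol{\theta})$ for a candidate final-tier edge.

```python
def evaluate_trading_plan(selected_final_tier_edges, capacity, sample_disaster,
                          indicator, K=10000):
    total = 0.0
    for _ in range(K):
        theta = sample_disaster()
        total += sum(capacity[(u, v)] * indicator(u, v, theta)
                     for (u, v) in selected_final_tier_edges)
    return total / K  # empirical average production over K i.i.d. disasters
```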

Choice of Baseline

For the baselines, we utilize Sample Average Approximation (SAA)-based methods (Kleywegt, Shapiro, and Homem-de-Mello 2002). These baselines employ Mixed Integer Programming (MIP) to identify a trading plan that directly maximizes the average production across networks impacted by 100 sampled disasters; the average over samples serves as a proxy for the actual expected production. For the sampler, we consider Gibbs sampling (Gibbs-SAA), belief propagation (BP-SAA), importance sampling (IS-SAA), loopy importance sampling, and weighted sampling (Weighted-SAA). For a fair comparison, we impose a time limit of 30 seconds for the small network and 2 hours for the real-world network. The time shown in Table 2 for the SAA approaches is their actual execution time. Our SMC solver again executes repeatedly with increasing $q$ until it times out; the time shown is the cumulative time at which it finds the best solution (the last one) before the time limit.

Figure 5: Example of the bread production supply chain network from Zokaee et al. (2017). The example includes 3 wheat suppliers, 2 flour factories, 2 bread factories, and 2 markets. A black solid arrow from $i$ to $j$ indicates a trade between them ($x_{i,j}=1$ in the trading plan), and a dashed arrow from $p$ to $q$ represents an unchosen trade ($x_{p,q}=0$). Each edge also has a cost and a capacity, which are not shown in the figure. A red cross indicates that the edge is destroyed by stochastic events (with a probability specified by $\Pr(\boldsymbol{\theta})$).

Hyperparameters

The parameter $T$ (the number of repetitions) is set to 2, which is sufficient to show superior performance over the baselines. $\Pr(\boldsymbol{\theta})$ is discretized into 4 bits, and the capacity is discretized into 12 bits. In the randomly generated disasters, each Bayesian network node has at most 8 parents, and the number of Bayesian network edges is around half of the maximum possible number. The disaster distributions as well as the generation script are available in the code repository.