
Approximating Pandora’s Box with Correlations

(This work was funded in part by NSF awards CCF-2225259 and CCF-2217069.)

Shuchi Chawla
UT-Austin
shuchi@cs.utexas.edu
   Evangelia Gergatsouli
UW-Madison
evagerg@cs.wisc.edu
   Jeremy McMahan
UW-Madison
jmcmahan@wisc.edu
   Christos Tzamos
UW-Madison & University of Athens
tzamos@wisc.edu

We revisit the classic Pandora’s Box (PB) problem under correlated distributions on the box values. Recent work of [CGT+20] obtained constant-factor approximation algorithms for a restricted class of policies that visit boxes in a fixed order. In this work, we study the complexity of approximating the optimal policy, which may adaptively choose which box to visit next based on the values seen so far.

Our main result establishes an approximation-preserving equivalence of PB to the well-studied Uniform Decision Tree (UDT) problem from stochastic optimization and to a variant of the Min-Sum Set Cover ($\textsc{MSSC}_f$) problem. For distributions of support $m$, UDT admits a $\log m$ approximation, and while a constant-factor approximation in polynomial time is a long-standing open problem, constant-factor approximations are achievable in subexponential time [LLM20]. Our main result implies that the same properties hold for PB and $\textsc{MSSC}_f$.

We also study the case where the distribution over values is given more succinctly as a mixture of $m$ product distributions. This problem is again related to a noisy variant of the Optimal Decision Tree problem, which is significantly more challenging. We give a constant-factor approximation that runs in time $n^{\tilde{O}(m^2/\varepsilon^2)}$ when the mixture components on every box are either identical or separated in TV distance by $\varepsilon$.

1 Introduction

Many everyday tasks involve making decisions under uncertainty; for example, driving to work using the fastest route or buying a house at the best price. Although we don’t know how the future outcomes of our current decisions will turn out, we can often use some prior information to facilitate the decision-making process. For example, having driven on the possible routes to work before, we know which is usually the busiest one. It is also common in such cases that we can remove part of the uncertainty by paying some additional cost. This type of online decision making in the presence of costly information can be modeled as the so-called Pandora’s Box problem, first formalized by Weitzman [Wei79]. In this problem, the algorithm is given $n$ alternatives called boxes, each containing a value from a known distribution. The exact value is not known, but can be revealed at a known opening cost specific to the box. The goal of the algorithm is to decide which box to open next and whether to select a value and stop, such that the total opening cost plus the minimum value revealed is minimized. In the case of independent distributions on the boxes’ values, this problem has a very elegant and simple optimal solution, as described by Weitzman [Wei79]: calculate an index for each box (a special case of the Gittins index [GJ74]), open the boxes in increasing order of index, and stop when the expected gain is worse than the value already obtained.
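Weitzman’s rule can be sketched concretely for discrete value distributions. This is a minimal illustrative sketch, not the paper’s notation: in the minimization version, the reservation value $\sigma_i$ of box $i$ solves $c_i = \mathbb{E}[(\sigma_i - v_i)^+]$; boxes are opened in increasing order of $\sigma_i$, and the search stops once the best value found is below the next box’s reservation value. The function names are our own.

```python
def reservation_value(cost, values, probs):
    """Solve cost = E[max(0, sigma - v)] for sigma; the right-hand side is
    continuous and nondecreasing in sigma, so binary search finds the root."""
    lo = min(values)
    hi = lo + cost / min(probs)  # gain at hi already exceeds cost
    for _ in range(100):
        mid = (lo + hi) / 2
        gain = sum(p * max(0.0, mid - v) for v, p in zip(values, probs))
        if gain < cost:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def weitzman_policy(costs, dists, realized):
    """Open boxes in increasing reservation value; stop when the best value
    seen is already below the next reservation value.  dists[i] is a
    (values, probs) pair for box i; realized[i] is the hidden value."""
    order = sorted(range(len(costs)),
                   key=lambda i: reservation_value(costs[i], *dists[i]))
    best, paid = float("inf"), 0.0
    for i in order:
        if best <= reservation_value(costs[i], *dists[i]):
            break  # stopping rule: no box is worth opening anymore
        paid += costs[i]
        best = min(best, realized[i])
    return best + paid  # minimum value selected plus opening costs
```

For instance, a unit-cost box with value $0$ or $10$ equally likely has reservation value $2$, since $1 = 0.5\cdot(\sigma - 0)$ gives $\sigma = 2$.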

Weitzman’s model makes the crucial assumption that the distributions on the values are independent across boxes. This, however, is not always the case in practice and as it turns out, the simple algorithm of the independent case fails to find the optimal solution under correlated distributions. Generally, the complexity of the Pandora’s Box with correlations is not yet well understood. In this work we develop the first approximately-optimal policies for the Pandora’s Box problem with correlated values.

We consider two standard models of correlation where the distribution over values can be specified explicitly in a succinct manner. In the first, the distribution over values has a small support of size $m$. In the second, the distribution is a mixture of $m$ product distributions, each of which can be specified succinctly. We present approximations for both settings.

A primary challenge in approximating Pandora’s Box with correlations is that the optimal solution can be an adaptive policy that determines which box to open depending on the instantiations of values in all of the boxes opened previously. It is not clear that such a policy can even be described succinctly. Furthermore, the choice of which box to open is complicated by the need to balance two desiderata – finding a low value box quickly versus learning information about the values in unopened boxes (a.k.a. the state of the world or realized scenario) quickly. Indeed, the value contained in a box can provide the algorithm with crucial information about other boxes, and inform the choice of which box to open next; an aspect that is completely missing in the independent values setting studied by Weitzman.

Contribution 1: Connection to Decision Tree and a general purpose approximation.

Some aspects of the Pandora’s Box problem have been studied separately in other contexts. For example, in the Optimal Decision Tree problem (DT) [GB09, LLM20], the goal is to identify an unknown hypothesis, out of $m$ possible ones, by performing a sequence of costly tests, whose outcomes depend on the realized hypothesis. This problem has an informational structure similar to that in Pandora’s Box. In particular, we can think of every possible joint instantiation of values in boxes as a possible hypothesis, and every opening of a box as a test. The difference between the two problems is that while in Optimal Decision Tree we want to identify the realized hypothesis exactly, in Pandora’s Box it suffices to terminate the process as soon as we have found a low value box.

Another closely related problem is the Min Sum Set Cover [FLT04], where boxes only have two kinds of values – acceptable or unacceptable – and the goal is to find an acceptable value as quickly as possible. A primary difference relative to Pandora’s Box is that unacceptable boxes provide no further information about the values in unopened boxes.

One of the main contributions of our work is to unearth connections between Pandora’s Box and the two problems described above. We show that Pandora’s Box is essentially equivalent to a special case of Optimal Decision Tree (called Uniform Decision Tree or UDT) where the underlying distribution over hypotheses is uniform – the approximation ratios of these two problems are related within log-log factors. Surprisingly, in contrast, the non-uniform DT appears to be harder than non-uniform Pandora’s Box. We relate these two problems by showing that both are in turn related to a new version of Min Sum Set Cover, which we call Min Sum Set Cover with Feedback ($\textsc{MSSC}_f$). These connections are summarized in Figure 1. We can thus build on the rich history and large collection of results on these problems to offer efficient algorithms for Pandora’s Box. We obtain a polynomial time $\tilde{O}(\log m)$ approximation for Pandora’s Box, where $m$ is the number of distinct value vectors (a.k.a. scenarios) that may arise, as well as constant factor approximations in subexponential time.

[Figure: PB, $\textsc{UMSSC}_f$, and UDT connected by the reductions of Sections 4 and 5, within log-log and constant factors.]
Figure 1: A summary of our approximation-preserving reductions

It is an important open question whether constant factor approximations exist for Uniform Decision Tree: the best known lower bound on the approximation ratio is $4$, while it is known that it is not NP-hard to obtain super-constant approximations under the Exponential Time Hypothesis. The same properties transfer to Pandora’s Box and Min Sum Set Cover with Feedback. Pinning down the tight approximation ratio for any of these problems will directly answer these questions for every other problem in the equivalence class we establish.

The key technical component in our reductions is to find an appropriate stopping rule for Pandora’s Box: after opening a few boxes, how should the algorithm determine whether a small enough value has been found or whether further exploration is necessary? We develop an iterative algorithm that in each phase finds an appropriate threshold, with the exploration terminating as soon as a value smaller than the threshold is found, such that there is a constant probability of stopping in each phase. Within each phase then the exploration problem can be solved via a reduction to UDT. The challenge is in defining the stopping thresholds in a manner that allows us to relate the algorithm’s total cost to that of the optimal policy.

Contribution 2: Approximation for the mixture of distributions model.

Having established the general purpose reductions between Pandora’s Box and DT, we turn to the mixture of product distributions model of correlation. This special case of Pandora’s Box interpolates between Weitzman’s independent values setting and the fully general correlated values setting. In this setting, we use the term “scenario” to denote the different product distributions in the mixture. The information gathering component of the problem is now about determining which product distribution in the mixture the box values are realized from. Once the algorithm has determined the realized scenario (a.k.a. product distribution), the remaining problem amounts to implementing Weitzman’s strategy for that scenario.

We observe that this model of correlation for Pandora’s Box is related to the noisy version of DT, where the results of some tests for a given realized hypothesis are not deterministic. One challenge for DT in this setting is that any individual test may give us very little information distinguishing different scenarios, and one needs to combine information across sequences of many tests in order to isolate scenarios. This challenge is inherited by Pandora’s Box.

Previous work on noisy DT obtained algorithms whose approximations and runtimes depend on the amount of noise. In contrast, we consider settings where the level of noise is arbitrary, but where the mixtures satisfy a separability assumption. In particular, we assume that for any given box, if we consider the marginal distributions of the value in the box under different scenarios, these distributions are either identical or sufficiently different (e.g., at least $\varepsilon$ apart in TV distance) across different scenarios. Under this assumption, we design a constant-factor approximation for Pandora’s Box that runs in time $n^{\tilde{O}(m^2/\varepsilon^2)}$ (Theorem 6.1), where $n$ is the number of boxes. The formal result and the algorithm are presented in Section 6.

1.1 Related work

The Pandora’s Box problem was first introduced by Weitzman in the Economics literature [Wei79]. Since then, there has been a long line of research studying Pandora’s Box and its many variants: non-obligatory inspection [Dov18, BK19, BC22, FLL22], order constraints [HAKS13, BFLL20], correlation [CGT+20, GT23], combinatorial costs [BEFF23], competitive information design [DFH+23], a delegated version [BDP22], and an online setting [EHLM19]. Multiple works also study the generalized setting where more information can be obtained for a price [CFG+00, GK01, CJK+15, CHKK15] and settings with more complex combinatorial constraints [Sin18, GGM06, GN13, ASW16, GNS16, GNS17, GJSS19].

Chawla et al. [CGT+20] were the first to study Pandora’s Box with correlated values, but they designed approximations relative to a simpler benchmark, namely the optimal performance achievable using a so-called Partially Adaptive strategy that cannot adapt the order in which it opens boxes to the values revealed. In general, optimal strategies can decide both the ordering of the boxes and the stopping time based on the values revealed. [CGT+20] designed an algorithm with performance no more than a constant factor worse than the optimal Partially Adaptive strategy.

The line of work on Min Sum Set Cover was initiated by [FLT04] and continued with improvements and generalizations to more complex constraints [AGY09, MBMW05, BGK10, SW11].

Optimal Decision Tree is an old problem studied in a variety of settings ([PKSR02, PD92, GB09, GKR10]), with its most notable application in active learning. It was proven to be NP-hard by Hyafil and Rivest [HR76]. Since then, the problem of finding the best approximation algorithm has been an active one [GG74, Lov85, KPB99, Das04, CPR+11, CPRS09, GB09, GNR17, CJLM10, AH12]; finally, a greedy $\log m$-approximation for the general case was given by [GB09]. This approximation ratio is proven to be the best possible [CPR+11]. For the case of Uniform Decision Tree less is known; until recently the best algorithm was the same as for Optimal Decision Tree, and the lower bound was $4$ [CPR+11]. The recent work of Li et al. [LLM20] showed that there is an algorithm strictly better than $\log m$ for the uniform case.

The noisy version of Optimal Decision Tree was first studied in [GKR10], which gave an algorithm with runtime that depends exponentially on the number of noisy outcomes. (This result is based on a result from [GK11] which turned out to be wrong [NS17]; the corrected results are presented in [GK17].) Subsequently, Jia et al. [JNNR19] gave a $(\min(r,h)+\log m)$-approximation algorithm, where $r$ (resp. $h$) is the maximum number of different test results per test (resp. scenario), using a reduction to the Adaptive Submodular Ranking problem [KNN17]. In the case of a large number of noisy outcomes, they obtain a $\log m$ approximation exploiting the connection to Stochastic Set Cover [LPRY08, INvdZ16].

2 Preliminaries

In this paper we study the connections between three different sequential decision making problems – Optimal Decision Tree, Pandora’s Box, and Min Sum Set Cover. We describe these problems formally below.

Optimal Decision Tree

In the Optimal Decision Tree problem (denoted DT) we are given a set $\mathcal{S}$ of $m$ scenarios, each scenario $s\in\mathcal{S}$ occurring with known probability $p_s$, and $n$ tests $\mathcal{T}=\{T_i\}_{i\in[n]}$, each with cost $1$. Nature picks a scenario $s\in\mathcal{S}$ from the distribution $p$, but this scenario is unknown to the algorithm. The goal of the algorithm is to determine which scenario is realized by running a subset of the tests in $\mathcal{T}$. When test $T_i$ is run and the realized scenario is $s$, the test returns a result $T_i(s)\in\mathbb{R}$.

Output.

The output of the algorithm is a decision tree where at each node there is a test that is performed, and the branches are the outcomes of the test. At each leaf there is a single scenario that is the only one consistent with the results of the tests on the unique path from the root to this leaf. Observe that there is a single leaf corresponding to each scenario $s$. We can represent the tree as an adaptive policy defined as follows:

Definition 2.1 (Adaptive Policy $\pi$).

An adaptive policy $\pi:\cup_{X\subseteq\mathcal{T}}\mathbb{R}^{X}\rightarrow\mathcal{T}$ is a function that, given the set of tests performed so far and their results, returns the next test to be performed.

Objective.

For a given decision tree or policy $\pi$, let $\mathrm{cost}_s(\pi)$ denote the total cost of all of the tests on the unique path in the tree from the root to the leaf labeled with scenario $s$. The objective of the algorithm is to find a policy $\pi$ that minimizes the average cost $\sum_{s\in\mathcal{S}} p_s\,\mathrm{cost}_s(\pi)$.
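This objective is easy to evaluate for an explicitly given tree. The representation below is an illustrative sketch of our own (leaves labeled by scenario names, internal nodes by test indices), not notation from the paper:

```python
def expected_cost(tree, scenarios, probs):
    """tree: a scenario name (str) at a leaf, or (test_index, {result: subtree})
    at an internal node; scenarios[s] is the outcome vector of scenario s.
    Returns sum_s probs[s] * (number of unit-cost tests on s's path)."""
    total = 0.0
    for s, outcomes in scenarios.items():
        node, n_tests = tree, 0
        while not isinstance(node, str):       # descend until a leaf
            test, branches = node
            n_tests += 1                       # each test costs 1
            node = branches[outcomes[test]]
        total += probs[s] * n_tests
    return total
```

For example, with three scenarios on two binary tests, a tree that runs test 0 first and test 1 only on one branch has average cost $(2+2+1)/3 = 5/3$ under the uniform distribution.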

We use the term Uniform Decision Tree (UDT) to denote the special case of the problem where $p_s=1/m$ for all scenarios $s$.

Pandora’s Box

In the Pandora’s Box problem we are given $n$ boxes, each with cost $c_i\geq 0$ and value $v_i$. The values $\{v_i\}_{i\in[n]}$ are distributed according to a known distribution $\mathcal{D}$. We assume that $\mathcal{D}$ is an arbitrary correlated distribution over vectors $\{v_i\}_{i\in[n]}\in\mathbb{R}^n$. We call vectors of values scenarios and use $s=\{v_i\}_{i\in[n]}$ to denote a possible realization of the scenario. As in DT, nature picks a scenario from the distribution $\mathcal{D}$, and this realization is a priori unknown to the algorithm. The goal of the algorithm is to pick a box of small value. The algorithm can observe the values realized in the boxes by opening any box $i$ at its respective cost $c_i$.

Output.

The output of the algorithm is an adaptive policy $\pi$ for opening boxes and a stopping condition. The policy $\pi$ takes as input a subset of the boxes and their associated values, and either returns the index of a box $i\in[n]$ to be opened next or stops and selects the minimum value seen so far. That is, $\pi:\cup_{X\subseteq[n]}\mathbb{R}^{X}\rightarrow[n]\cup\{\perp\}$, where $\perp$ denotes stopping.

Objective.

For a given policy $\pi$, let $\pi(s)$ denote the set of boxes opened by the policy prior to stopping when the realized scenario is $s$. The objective of the algorithm is to minimize the expected cost of the boxes opened plus the minimum value discovered, where the expectation is taken over all possible realizations of the values in each box. (In the original version of the problem studied by Weitzman [Wei79] the values are independent across boxes, and the goal is to maximize the value collected minus the costs paid, in contrast to the minimization version we study here.) Formally, the objective is given by

$$\mathbb{E}_{s\sim\mathcal{D}}\left[\min_{i\in\pi(s)}v_{is}+\sum_{i\in\pi(s)}c_{i}\right].$$

For simplicity of presentation, from now on we assume that $c_i=1$ for all boxes, but we show in Section E how to adapt our results to handle non-unit costs, without any loss in the approximation factors.
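To make the objective concrete, here is a small evaluator for an explicitly given adaptive policy over an explicit (finite-support) distribution. The interface is a hypothetical sketch of our own: the policy sees the values opened so far and returns the next box index, with `None` playing the role of $\perp$.

```python
def pb_cost(policy, scenarios, probs, costs):
    """Expected opening cost plus minimum value found.
    policy: maps {box: revealed value} to the next box index or None (stop);
    scenarios: list of value vectors, probs: their probabilities."""
    total = 0.0
    for values, p in zip(scenarios, probs):
        seen = {}                       # box index -> revealed value
        nxt = policy(seen)
        while nxt is not None:
            seen[nxt] = values[nxt]     # open box nxt, paying costs[nxt]
            nxt = policy(seen)
        total += p * (min(seen.values()) + sum(costs[i] for i in seen))
    return total
```

For two unit-cost boxes and two equally likely scenarios $(0,5)$ and $(5,0)$, the policy “open box 0; stop if it contains 0, else open box 1 and stop” pays $1$ in the first scenario and $2$ in the second, for an expected cost of $1.5$.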

We use UPB to denote the special case of the problem where the distribution $\mathcal{D}$ is uniform over $m$ scenarios.

Min Sum Set Cover with Feedback

In Min Sum Set Cover, we are given $n$ elements, a collection $\mathcal{S}$ of $m$ sets over them, and a distribution $\mathcal{D}$ over the sets. The output of the algorithm is an ordering $\pi$ over the elements. The cost of the ordering for a particular set $s\in\mathcal{S}$ is the index of the first element in the ordering that belongs to the set $s$, that is, $\mathrm{cost}_s(\pi)=\min\{i:\pi(i)\in s\}$. The goal of the algorithm is to minimize the expected cost $\mathbb{E}_{s\sim\mathcal{D}}\left[\mathrm{cost}_s(\pi)\right]$.
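For a fixed (non-adaptive) ordering, this cost is straightforward to evaluate; a minimal sketch, with names of our own choosing:

```python
def mssc_cost(order, sets, probs):
    """Expected 1-based position of the first element of `order`
    that belongs to the realized set."""
    total = 0.0
    for s, p in zip(sets, probs):
        pos = next(i for i, elem in enumerate(order, start=1) if elem in s)
        total += p * pos
    return total
```

For instance, with sets $\{a\}$, $\{b,c\}$, $\{c\}$ under the uniform distribution, the ordering $(c,a,b)$ has expected cost $(2+1+1)/3 = 4/3$, beating $(a,b,c)$ with cost $2$.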

We define a variant of the Min Sum Set Cover problem, called Min Sum Set Cover with Feedback ($\textsc{MSSC}_f$). As in the original problem, we are given a set of $n$ elements, a collection $\mathcal{S}$ of $m$ sets, and a distribution $\mathcal{D}$ over the sets. Nature instantiates a set $s\in\mathcal{S}$ from the distribution $\mathcal{D}$; the realization is unknown to the algorithm. Furthermore, in this variant, each element provides feedback to the algorithm when the algorithm “visits” this element; this feedback takes on the value $f_i(s)\in\mathbb{R}$ for element $i\in[n]$ if the realized set is $s\in\mathcal{S}$.

Output.

The algorithm once again produces an ordering $\pi$ over the elements. Observe that the feedback allows the algorithm to adapt its ordering to previously observed values. Accordingly, $\pi$ is an adaptive policy that maps a subset of the elements and their associated feedback to the index of another element $i\in[n]$. That is, $\pi:\cup_{X\subseteq[n]}\mathbb{R}^{X}\rightarrow[n]$.

Objective.

As before, the cost of the ordering for a particular set $s\in\mathcal{S}$ is the index of the first element in the ordering that belongs to the set $s$, that is, $\mathrm{cost}_s(\pi)=\min\{i:\pi(i)\in s\}$. The goal of the algorithm is to minimize the expected cost $\mathbb{E}_{s\sim\mathcal{D}}\left[\mathrm{cost}_s(\pi)\right]$.

Commonalities and notation

As the reader has observed, we capture the commonalities between the different problems through the use of similar notation. Scenarios in DT correspond to value vectors in PB and to sets in $\textsc{MSSC}_f$; all are denoted by $s$, lie in the set $\mathcal{S}$, and are drawn by nature from a known joint distribution $\mathcal{D}$. Tests in DT correspond to boxes in PB and elements in $\textsc{MSSC}_f$; we index each by $i\in[n]$. The algorithm for each problem produces an adaptive ordering $\pi$ over these tests/boxes/elements. Test outcomes $T_i(s)$ in DT correspond to box values $v_i(s)$ in PB and feedback $f_i(s)$ in $\textsc{MSSC}_f$. We will use the terminology and notation across different problems interchangeably in the rest of the paper.

2.1 Modeling Correlation

In this work we study two general ways of modeling the correlation between the values in the boxes.

Explicit Distributions.

In this case, $\mathcal{D}$ is a distribution over $m$ scenarios where the $j$’th scenario is realized with probability $p_j$, for $j\in[m]$. Every scenario corresponds to a fixed and known vector of values contained in each box. Specifically, box $i$ has value $v_{ij}\in\mathbb{R}^{+}\cup\{\infty\}$ for scenario $j$.

Mixture of Distributions.

We also consider a more general setting, where $\mathcal{D}$ is a mixture of $m$ product distributions. Specifically, each scenario $j$ is a product distribution; instead of containing a deterministic value for every box $i$, the value of box $i$ is drawn from distribution $\mathcal{D}_{ij}$. This setting is a generalization of the explicit distributions setting described before.
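One way to picture this model (a hypothetical representation of our own, not the paper’s notation): each scenario carries a mixture weight and, for every box, a discrete marginal; nature first draws a scenario, then fills every box independently from that scenario’s marginals.

```python
import random

def sample_scenario_and_values(mixture, rng):
    """mixture: list of (weight, marginals), where marginals[i] is a
    (values, probs) pair for box i.  Returns the realized scenario index
    and one value per box, drawn independently given the scenario."""
    weights = [w for w, _ in mixture]
    j = rng.choices(range(len(mixture)), weights=weights)[0]
    marginals = mixture[j][1]
    return j, [rng.choices(vals, weights=probs)[0] for vals, probs in marginals]
```

The explicit-distributions setting is recovered by making every marginal a point mass, in which case the drawn value vector is determined by the scenario alone.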

3 Roadmap of the Reductions and Implications

In Figure 2, we give an overview of all the main technical reductions shown in Sections 4 and 5. An arrow $A\rightarrow B$ means that we give an approximation-preserving reduction from problem $A$ to problem $B$. Therefore, an algorithm for $B$ that achieves approximation ratio $\alpha$ also gives an algorithm for $A$ with approximation ratio $O(\alpha)$ (or $O(\alpha\log\alpha)$ in the case of black dashed lines). For the exact guarantees we refer to the formal statement of the respective theorem. The gray lines denote less important claims or trivial reductions (e.g. in the case of $A$ being a subproblem of $B$).

[Figure: reduction diagram connecting PB, $\textsc{UMSSC}_f$, $\textsc{MSSC}_f$, UDT, and DT via Claims 4.1 and 5.1 and Theorems 4.2 and 5.2.]
Figure 2: Summary of all our reductions. Bold black lines denote our main theorems, gray dashed are minor claims, and dotted lines are trivial reductions.

3.1 Approximating Pandora’s Box

Given our reductions and using the best known results for Uniform Decision Tree from [LLM20], we immediately obtain efficient approximation algorithms for Pandora’s Box. We restate the results of [LLM20] below.

Theorem 3.1 (Theorems 3.1 and 3.2 from [LLM20]).
  • There is an $O(\log m/\log\mathrm{OPT})$-approximation algorithm for UDT that runs in polynomial time, where $\mathrm{OPT}$ is the cost of the optimal solution of the UDT instance.

  • There is a $\frac{9+\varepsilon}{\alpha}$-approximation algorithm for UDT that runs in time $n^{\tilde{O}(m^{\alpha})}$ for any $\alpha\in(0,1)$.

Using the results of Theorem 3.1 combined with Theorem 4.2 and Claim 5.1 we get the following corollary.

Corollary 3.1.

From the best-known results for UDT, we have that

  • There is an $\tilde{O}(\log m)$-approximation algorithm for PB that runs in polynomial time. (If additionally the possible number of outcomes is a constant $K$, this gives an $O(\log m)$ approximation without losing an extra logarithmic factor, since $\mathrm{OPT}\geq\log_K m$, as observed by [LLM20].)

  • There is an $\tilde{O}(1/\alpha)$-approximation algorithm for PB that runs in time $n^{\tilde{O}(m^{\alpha})}$ for any $\alpha\in(0,1)$.

An immediate implication of the above corollary is that it is not NP-hard to obtain a superconstant approximation for PB, formally stated below.

Corollary 3.2.

It is not NP-hard to achieve any superconstant approximation for PB assuming the Exponential Time Hypothesis.

Observe that the logarithmic approximation achieved in Corollary 3.1 loses a $\log\log m$ factor (hence the $\tilde{O}$) as it relies on the more complex reduction of Theorem 4.2. If we instead use the more direct reduction of Theorem A.1 to the Optimal Decision Tree problem with non-unit test costs (which also admits an $O(\log m)$-approximation [GNR17, KNN17]), we get the following corollary.

Corollary 3.3.

There exists an efficient algorithm that is $O(\log m)$-approximate for Pandora’s Box, with or without unit-cost boxes.

3.2 Constant approximation for Partially Adaptive PB

Moving on, we show how our reduction can be used to recover and improve the results of [CGT+20]. Recall that in [CGT+20] the authors presented a constant factor approximation algorithm against a Partially Adaptive benchmark where the order of opening boxes must be fixed up front.

In such a case, the reduction of Section 4 can be used to reduce PB to the standard Min Sum Set Cover (i.e. without feedback), which admits a 4-approximation [FLT04].

Corollary 3.4.

There exists a polynomial time algorithm for PB that is $O(1)$-competitive against the partially adaptive benchmark.

The same result applies even in the case of non-uniform opening costs, because a 4-approximate algorithm for Min Sum Set Cover is known even when elements have arbitrary costs [MBMW05]. The case of non-uniform opening costs was also considered for Pandora’s Box by [CGT+20], but they only provide an algorithm that handles polynomially bounded opening costs.

4 Connecting Pandora’s Box and $\textsc{MSSC}_f$

In this section we establish the connection between Pandora’s Box and Min Sum Set Cover with Feedback. We show that the two problems are equivalent up to logarithmic factors in approximation ratio.

One direction of this equivalence is in fact easy to see: Min Sum Set Cover with Feedback is a special case of Pandora’s Box. Note that in both problems we examine boxes/elements in an adaptive order. In PB we stop when we find a sufficiently small value; in $\textsc{MSSC}_f$ we stop when we find an element that belongs to the instantiated scenario. To establish a formal connection, given an instance of $\textsc{MSSC}_f$, we can define the “value” of each element $i$ in scenario $s$ as $0$ if the element belongs to the set $s$, and as $L+f_i(s)$ for some sufficiently large value $L$ otherwise, where $f_i(s)$ is the feedback of element $i$ for set $s$. This places the instance within the framework of PB, and a PB algorithm can be used to solve it. We formally describe this reduction in Section B of the Appendix.

Claim 4.1.

If there exists an $\alpha(n,m)$-approximation algorithm for PB then there exists an $\alpha(n,m)$-approximation for $\textsc{MSSC}_f$.

The more interesting direction is a reduction from PB to $\textsc{MSSC}_f$. In fact, we show that a general instance of PB can be reduced to the simpler uniform version of Min Sum Set Cover with Feedback. We devote the rest of this section to proving the following theorem.

Theorem 4.2 (Pandora’s Box to $\textsc{MSSC}_f$).

If there exists an $\alpha(n,m)$-approximation algorithm for $\textsc{UMSSC}_f$ then there exists an $O(\alpha(n+m,m^2)\log\alpha(n+m,m^2))$-approximation for PB.

Guessing a stopping rule and an intermediate problem

The feedback structure in PB and $\textsc{MSSC}_f$ is quite similar; the main component in which the two problems differ is the stopping condition. In $\textsc{MSSC}_f$, an algorithm can stop examining elements as soon as it finds one that “covers” the realized set. In PB, when the algorithm observes a value in a box, it is not immediately apparent whether the value is small enough to stop or whether the algorithm should probe further, especially if the scenario is not fully identified. The key to relating the two problems is to “guess” an appropriate stopping condition for PB, namely an appropriate threshold $T$ such that as soon as the algorithm observes a value smaller than this threshold, it stops. We then say that the realized scenario is “covered”.

To formalize this approach, we introduce an intermediate problem called Pandora’s Box with costly outside option $T$ (also called the threshold), denoted by $\textsc{PB}_{\leq T}$. In this version the objective is to minimize the cost of finding a value $\leq T$, while we have the extra option to quit searching by opening an outside option box of cost $T$. We say that a scenario is covered in a given run of the algorithm if the algorithm does not choose the outside option box.

We show that Pandora’s Box can be reduced to $\textsc{PB}_{\leq T}$ with a logarithmic loss in approximation factor, and then $\textsc{PB}_{\leq T}$ can be reduced to Min Sum Set Cover with Feedback with a constant factor loss. The following two results capture the details of these reductions.

Claim 4.3.

If there exists an $\alpha(n,m)$-approximation algorithm for $\textsc{UMSSC}_f$ then there exists a $3\alpha(n+m,m^2)$-approximation for $\textsc{UPB}_{\leq T}$.

It is also worth noting that $\textsc{PB}_{\leq T}$ is a special case of the Adaptive Ranking problem, which directly implies a $\log m$ approximation factor (given in [KNN17]).

Main Lemma 4.4.

Given a polynomial-time $\alpha(n,m)$-approximation algorithm for $\textsc{UPB}_{\leq T}$, there exists a polynomial-time $O(\alpha(n,m)\log\alpha(n,m))$-approximation for PB.

The relationship between $\textsc{PB}_{\leq T}$ and Min Sum Set Cover with Feedback is relatively straightforward and requires explicitly relating the structure of feedback in the two problems. We describe the details in Section B of the Appendix.

Putting it all together.

The proof of Theorem 4.2 follows by combining Claim 4.3 with Lemmas 4.5 and 4.4, presented in the following sections. The proofs of Claims 4.1 and 4.3 are deferred to Section B of the Appendix. The rest of this section is devoted to proving Lemmas 4.5 and 4.4. The landscape of reductions shown in this section is presented in Figure 3.

[Figure: reductions among PB, $\textsc{UPB}_{\leq T}$, $\textsc{PB}_{\leq T}$, $\textsc{UMSSC}_f$, and $\textsc{MSSC}_f$ via Lemmas 4.5 and 4.4 and Claims 4.1 and 4.3.]
Figure 3: Reductions shown in this section. Claim 4.3 alongside Lemmas 4.5 and 4.4 are part of Theorem 4.2.

4.1 Reducing Pandora’s Box to $\textsc{PB}_{\leq T}$

Recall that a solution to Pandora’s Box involves two components: (1) the order in which to open boxes, and (2) a stopping rule. The goal of the reduction to $\textsc{PB}_{\leq T}$ is to simplify the stopping rule by making values either $0$ or $\infty$, allowing us to focus on the order in which boxes are opened rather than on which value to stop at. We start by presenting our main tool, a reduction to Min Sum Set Cover with Feedback, in Section 4.1.1, and then improve upon it to reduce from the uniform version of $\textsc{MSSC}_f$ (Section 4.1.2).

4.1.1 Main Tool

The high level idea in this reduction is that we repeatedly run the algorithm for $\textsc{PB}_{\leq T}$ with increasingly larger values of $T$, with the goal of covering some mass of scenarios at every step. The thresholds for every run have to be carefully chosen to guarantee that enough mass is covered in every run. The distributions on the boxes remain the same, and this reduction does not increase the number of boxes, therefore avoiding the issues faced by the naive reduction given in Section A of the Appendix. Formally, we show the following lemma.

Main Lemma 4.5.

Given a polynomial-time α(n,m)\alpha(n,m)-approximation algorithm for PBT{\textsc{PB}_{\leq T}}, there exists a polynomial-time O(α(n,m)logα(n,m)){O}(\alpha(n,m)\log\alpha(n,m))-approximation for PB.

Input: Oracle 𝒜(T)\mathcal{A}(T) for PBT{\textsc{PB}_{\leq T}}, set of all scenarios 𝒮\mathcal{S}.
i0i\leftarrow 0 // Number of current phase
while 𝒮\mathcal{S}\neq\emptyset do
      Use 𝒜\mathcal{A} to find the smallest TiT_{i} via binary search s.t. Pr[accepting the outside option Ti]0.2\textbf{Pr}\left[\text{accepting the outside option }T_{i}\right]\leq 0.2
      Call the oracle 𝒜(Ti)\mathcal{A}(T_{i}) on set 𝒮\mathcal{S} to obtain policy πi\pi_{i}
      𝒮𝒮\mathcal{S}\leftarrow\mathcal{S}\setminus {scenarios with total cost Ti\leq T_{i}}
      ii+1i\leftarrow i+1
end while
for i0i\leftarrow 0 to \infty do
      Run policy πi\pi_{i} until it either terminates and selects a box, or accumulates probing cost TiT_{i}.
end for
Algorithm 1 Reduction from PB to PBT{\textsc{PB}_{\leq T}}.
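As a sanity check, the phase loop of Algorithm 1 can be sketched in Python. The `oracle` and `find_threshold` interfaces below are hypothetical stand-ins for the PB≤T approximation algorithm and its binary search; they are not part of the paper.

```python
def reduce_pb_to_threshold(scenarios, oracle, find_threshold):
    """Phase loop of Algorithm 1 (sketch).

    scenarios: dict mapping scenario -> probability mass.
    oracle(T, scenarios): returns (policy, covered), where covered is the
        set of scenarios whose total cost under the policy is at most T.
    find_threshold(scenarios): finds the smallest T whose outside-option
        mass is at most 0.2.
    All three interfaces are hypothetical placeholders.
    """
    phases = []
    remaining = dict(scenarios)
    while remaining:
        T = find_threshold(remaining)
        policy, covered = oracle(T, remaining)
        phases.append((T, policy))
        # Drop the mass covered in this phase and continue.
        remaining = {s: p for s, p in remaining.items() if s not in covered}
    # At run time, each policy pi_i is executed until it either selects a
    # box or accumulates probing cost T_i, and then the next phase starts.
    return phases
```

The returned list of (threshold, policy) pairs corresponds to the second loop of Algorithm 1.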


We will now analyze the policy produced by this algorithm.

Proof of Main Lemma 4.5.

We start with some notation. Given an instance \mathcal{I} of PB, we repeatedly run PBT{\textsc{PB}_{\leq T}} in phases. Phase ii consists of running PBT{\textsc{PB}_{\leq T}} with threshold TiT_{i} on a sub-instance of the original problem in which we are left with a smaller set of scenarios, with their probabilities reweighted to sum to 11. Call this set of scenarios 𝒮i\mathcal{S}_{i} for phase ii and the corresponding instance i\mathcal{I}_{i}. After every phase ii, we remove the probability mass that was covered (recall that a scenario is covered if it does not choose the outside option box), and run PBT{\textsc{PB}_{\leq T}} on this new instance with a new threshold Ti+1T_{i+1}. In each phase, the boxes, costs and values remain the same, but the stopping condition changes: the thresholds TiT_{i} increase in every subsequent phase. The thresholds are chosen such that at the end of each phase, 0.80.8 of the remaining probability mass is covered. The reduction is formally shown in Algorithm 1.

Accounting for the cost of the policy.

We first note that the total cost of the policy in phase ii, conditioned on reaching that phase, is at most 2Ti2T_{i}: the policy incurs probing cost at most TiT_{i} in the phase, and if it terminates in that phase, it selects a box with value at most TiT_{i}. Since each phase covers at least 0.80.8 of the remaining probability mass, phase ii is reached with probability at most (0.2)i(0.2)^{i}. We can therefore bound the total cost of the policy by 2i=0(0.2)iTi2\sum_{i=0}^{\infty}(0.2)^{i}T_{i}.

We will now relate the thresholds TiT_{i} to the cost of the optimal PB policy for \mathcal{I}. To this end, we define corresponding thresholds for the optimal policy that we call pp-thresholds. Let π\pi^{*}_{\mathcal{I}} denote the optimal PB policy for \mathcal{I} and let csc_{s} denote the cost incurred by π\pi^{*}_{\mathcal{I}} when scenario ss is realized. A pp-threshold is the minimum threshold TT such that at most pp mass of the scenarios has cost more than TT in PB, formally defined below.

Definition 4.6 (pp-Threshold).

Let \mathcal{I} be an instance of PB and let csc_{s} be the cost of scenario s𝒮s\in\mathcal{S} under π\pi^{*}_{\mathcal{I}}. We define the pp-threshold as

tp=min{T:Pr[cs>T]p}.t_{p}=\min\{T:\textbf{Pr}\left[c_{s}>T\right]\leq p\}.
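For an explicitly given distribution over scenario costs, the p-threshold of Definition 4.6 is a simple quantile computation; the list-of-pairs representation below is our own illustration.

```python
def p_threshold(scenario_costs, p):
    """Smallest T such that Pr[c_s > T] <= p (Definition 4.6).

    scenario_costs: list of (cost, probability) pairs summing to 1.
    """
    # Only the scenario costs themselves are candidate thresholds,
    # since the tail mass Pr[c_s > T] only changes at those points.
    for T in sorted({c for c, _ in scenario_costs}):
        if sum(q for c, q in scenario_costs if c > T) <= p:
            return T
```

For example, with costs 1, 3, 10 carrying mass 0.5, 0.3, 0.2, the 0.2-threshold is 3.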

The following two lemmas relate the cost of the optimal policy to the pp-thresholds, and the pp-thresholds to the thresholds TiT_{i} our algorithm finds. The proofs of both lemmas are deferred to Section B.1 of the Appendix. We first formally define a sub-instance of the given Pandora’s Box instance.

Definition 4.7 (Sub-instance).

Let \mathcal{I} be an instance of {PBT,PB}\{{\textsc{PB}_{\leq T}},\textsc{PB}\} with set of scenarios 𝒮\mathcal{S}_{\mathcal{I}} each with probability psp^{\mathcal{I}}_{s}. For any q[0,1]q\in[0,1] we call \mathcal{I}^{\prime} a qq-sub instance of \mathcal{I} if 𝒮𝒮\mathcal{S}_{\mathcal{I}^{\prime}}\subseteq\mathcal{S}_{\mathcal{I}} and s𝒮ps=q\sum_{s\in\mathcal{S}_{\mathcal{I}^{\prime}}}p_{s}^{\mathcal{I}}=q.

Lemma 4.8.

(Optimal Lower Bound) Let \mathcal{I} be an instance of PB. For any q<1q<1, any α>1\alpha>1, and β2\beta\geq 2, the optimal policy π\pi^{*}_{\mathcal{I}} for PB satisfies

cost(π)i=01βα(q)itqi/βα.\mathrm{cost}(\pi^{*}_{\mathcal{I}})\geq\sum_{i=0}^{\infty}\frac{1}{\beta\alpha}\cdot\left(q\right)^{i}t_{q^{i}/\beta\alpha}.
Lemma 4.9.

Given an instance \mathcal{I} of PB; an α\alpha-approximation algorithm 𝒜T\mathcal{A}_{T} to PBT{\textsc{PB}_{\leq T}}; and any q<1q<1 and β2\beta\geq 2, suppose that the threshold TT satisfies

Ttq/(βα)+βαcs[tq,tq/(βα)]s𝒮cspsq.T\geq t_{q/(\beta\alpha)}+\beta\alpha\sum_{\begin{subarray}{c}c_{s}\in[t_{q},t_{q/(\beta\alpha)}]\\ s\in\mathcal{S}\end{subarray}}c_{s}\frac{p_{s}}{q}.

Then if 𝒜T\mathcal{A}_{T} is run on a qq-sub instance of \mathcal{I} with threshold TT, at most a total mass of (2/β)q(2/\beta)q of the scenarios pick the outside option box TT.

Calculating the thresholds.

For every phase ii we choose a threshold TiT_{i} such that Ti=min{T:Pr[cs>T]0.2}T_{i}=\min\{T:\textbf{Pr}\left[c_{s}>T\right]\leq 0.2\}, i.e. at most 0.20.2 of the probability mass of the scenarios is not covered. To select this threshold, we perform binary search starting from T=1T=1, each time running the α\alpha-approximation algorithm for PBT{\textsc{PB}_{\leq T}} with outside option box TT and checking how much mass of scenarios selects it. We denote by Inti=[t(0.2)i,t(0.2)i/(10α)]\operatorname{Int}_{i}=[t_{(0.2)^{i}},t_{(0.2)^{i}/(10\alpha)}] the relevant interval of costs in every run of the algorithm. Then, by Lemma 4.9 with β=10\beta=10, we know that for remaining total probability mass (0.2)i(0.2)^{i}, any threshold which satisfies

Tit(0.2)i1/10α+10αs𝒮csInticsps(0.2)iT_{i}\geq t_{(0.2)^{i-1}/10\alpha}+10\alpha\sum_{\begin{subarray}{c}s\in\mathcal{S}\\ c_{s}\in\operatorname{Int}_{i}\end{subarray}}c_{s}\frac{p_{s}}{(0.2)^{i}}

also satisfies the desired covering property, i.e. at least 0.80.8 of the mass of the current scenarios is covered. Therefore the threshold TiT_{i} found by our binary search satisfies

Ti=t(0.2)i1/10α+10αs𝒮csInticsps(0.2)i.T_{i}=t_{(0.2)^{i-1}/10\alpha}+10\alpha\sum_{\begin{subarray}{c}s\in\mathcal{S}\\ c_{s}\in\operatorname{Int}_{i}\end{subarray}}c_{s}\frac{p_{s}}{(0.2)^{i}}. (1)
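The binary search for each threshold can be sketched as follows. Here `outside_mass` is a hypothetical wrapper that runs the PB≤T approximation with outside option T and reports the mass of scenarios selecting it; it is assumed non-increasing in T.

```python
def smallest_threshold(outside_mass, hi, target=0.2, tol=1e-6):
    """Binary search for the smallest T with outside-option mass <= target.

    outside_mass(T): hypothetical wrapper around the PB_{<=T} oracle,
        assumed non-increasing in T.
    hi: a known upper bound on the answer (outside_mass(hi) <= target).
    """
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if outside_mass(mid) <= target:
            hi = mid  # mid is feasible; the smallest such T is below it
        else:
            lo = mid  # mid is infeasible; the answer lies above it
    return hi
```

A step function jumping from mass 1 to mass 0.1 at T = 5, for instance, yields a threshold of (approximately) 5.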
Bounding the final cost.

To bound the final cost, we recall that at the end of every phase we cover 0.80.8 of the remaining scenarios. Furthermore, we observe that each threshold TiT_{i} is charged in the above Equation (1) to optimal costs of scenarios corresponding to intervals of the form Inti=[t(0.2)i,t(0.2)i/(10α)]\operatorname{Int}_{i}=[t_{(0.2)^{i}},t_{(0.2)^{i}/(10\alpha)}]. Note that these intervals are overlapping. We therefore get

cost(π)\displaystyle\mathrm{cost}(\pi_{\mathcal{I}}) 2i=0(0.2)iTi\displaystyle\leq 2\sum_{i=0}^{\infty}(0.2)^{i}T_{i}
=2i=0((0.2)it(0.2)i1/10α+10αs𝒮csInticsps)\displaystyle=2\sum_{i=0}^{\infty}\left((0.2)^{i}t_{(0.2)^{i-1}/10\alpha}+10\alpha\sum_{\begin{subarray}{c}s\in\mathcal{S}\\ c_{s}\in\operatorname{Int}_{i}\end{subarray}}c_{s}p_{s}\right) From equation (1)
410απ+20αi=0s𝒮csInticsps\displaystyle\leq 4\cdot 10\alpha\pi^{*}_{\mathcal{I}}+20\alpha\sum_{i=0}^{\infty}\sum_{\begin{subarray}{c}s\in\mathcal{S}\\ c_{s}\in\operatorname{Int}_{i}\end{subarray}}c_{s}p_{s} Using Lemma 4.8 for β=10,q=0.2\beta=10,q=0.2
40αlogαπ.\displaystyle\leq 40\alpha\log\alpha\cdot\pi^{*}_{\mathcal{I}}.

where the last inequality follows since each scenario with cost csc_{s} can belong to at most logα\log\alpha of the intervals Inti\operatorname{Int}_{i}. This proves the lemma. ∎

Notice the generality of this reduction: the distributions on the values are preserved, and we made no further assumptions on the scenarios or values throughout the proof. We can therefore apply this tool regardless of the type of correlation or the way it is given to us; e.g., the distribution could be parametric or explicitly given, as we see in the next section.

4.1.2 An Even Stronger Tool

Moving one step further, we show that if, instead of PBT{\textsc{PB}_{\leq T}}, we have an α\alpha-approximation algorithm for UPBT{\textsc{UPB}_{\leq T}}, we can obtain the same guarantees as those of Lemma 4.5. Observe that we cannot directly use Algorithm 1, since the oracle now requires all scenarios to have the same probability, which might not hold in the initial PB instance. The formal statement of the theorem follows.

See Main Lemma 4.4.

We highlight the differences from the proof of Main Lemma 4.5 and show how to change Algorithm 1 to work with the new oracle, which requires the scenarios to have uniform probability. The function Expand, used in Algorithm 2, transforms the set of scenarios into a uniform one in which every scenario has the same probability, by creating multiple copies of the more likely scenarios. The function is formally described in Algorithm 3 in Section B.2 of the Appendix, alongside the proof of Main Lemma 4.4.

Input: Oracle 𝒜(T)\mathcal{A}(T) for UPBT{\textsc{UPB}_{\leq T}}, set of all scenarios 𝒮\mathcal{S}, c=1/10,δ=0.1c=1/10,\delta=0.1.
i0i\leftarrow 0 // Number of current phase
while 𝒮\mathcal{S}\neq\emptyset do
      Let ={s𝒮:psc1|𝒮|}\mathcal{L}=\left\{s\in\mathcal{S}:p_{s}\leq c\cdot\frac{1}{|\mathcal{S}|}\right\} // Remove low-probability scenarios
      𝒮=𝒮\mathcal{S}^{\prime}=\mathcal{S}\setminus\mathcal{L}
      𝒰=\mathcal{UI}= Expand(𝒮\mathcal{S}^{\prime})
      In instance 𝒰\mathcal{UI} use 𝒜\mathcal{A} to find the smallest TiT_{i} via binary search s.t. Pr[accepting Ti]δ\textbf{Pr}\left[\text{accepting }T_{i}\right]\leq\delta
      Call the oracle 𝒜(Ti)\mathcal{A}(T_{i}) to obtain policy πi\pi_{i}
      𝒮(𝒮{s𝒮:csTi})\mathcal{S}\leftarrow\big{(}\mathcal{S}^{\prime}\setminus\{s\in\mathcal{S}^{\prime}:c_{s}\leq T_{i}\}\big{)}\cup\mathcal{L}
      ii+1i\leftarrow i+1
end while
Algorithm 2 Reduction from PB to UPBT{\textsc{UPB}_{\leq T}}.

5 Connecting MSSCf\textsc{MSSC}_{f} and Optimal Decision Tree

In this section we establish the connection between Min Sum Set Cover with Feedback and Optimal Decision Tree. We show that the uniform versions of these problems are equivalent up to constant factors in approximation ratio. The results of this section are summarized in Figure 4 and the two results below.

Figure 4: Summary of reductions in Section 5
Claim 5.1.

If there exists an α(n,m)\alpha(n,m)-approximation algorithm for DT (UDT) then there exists a (1+α(n,m))\left(1+\alpha(n,m)\right)-approximation algorithm for MSSCf\textsc{MSSC}_{f} (resp. UMSSCf\textsc{UMSSC}_{f}).

Theorem 5.2 (Uniform Decision Tree to UMSSCf\textsc{UMSSC}_{f}).

Given an α(m,n)\alpha(m,n)-approximation algorithm for UMSSCf\textsc{UMSSC}_{f}, there exists an O(α(n+m,m))O(\alpha(n+m,m))-approximation algorithm for UDT.

The formal proofs of these statements can be found in Section C of the Appendix. Here we sketch the main ideas.

One direction of this equivalence is again easy to see. The main difference between Optimal Decision Tree and MSSCf\textsc{MSSC}_{f} is that the former requires scenarios to be exactly identified whereas in the latter it suffices to simply find an element that covers the scenario. In particular, in MSSCf\textsc{MSSC}_{f} an algorithm could cover a scenario without identifying it by, for example, covering it with an element that covers multiple scenarios. To reduce MSSCf\textsc{MSSC}_{f} to DT we simply introduce extra feedback into all of the elements of the MSSCf\textsc{MSSC}_{f} instance such that the elements isolate any scenarios they cover. (That is, if the algorithm picks an element that covers some subset of scenarios, this element provides feedback about which of the covered scenarios materialized.) This allows us to relate the cost of isolation and the cost of covering to within the cost of a single additional test, implying Claim 5.1.

Proof Sketch of Theorem 5.2.

The other direction is more complicated, as we want to ensure that covering implies isolation. Given an instance of UDT, we create a special element for each scenario which is the unique element covering the scenario and also isolates the scenario from all other scenarios. The intention is that an algorithm for MSSCf\textsc{MSSC}_{f} on this new instance only chooses the special isolating element of a scenario after it has identified the scenario. If that happens, then the algorithm’s policy is a feasible solution to the UDT instance and incurs no extra cost. The problem is that an algorithm for MSSCf\textsc{MSSC}_{f} over the modified instance may use the special covering element before isolating a scenario. We argue that this choice can be “postponed” in the policy to a point at which isolation is nearly achieved, without incurring too much extra cost. This involves careful analysis of the policy’s decision tree, and we present the details in the appendix.

Why our reduction does not work for DT.

Our analysis above heavily uses the fact that the probabilities of all scenarios in the UDT instance are equal. This is because the “postponement” of elements charges the increased costs of some scenarios to the costs of other scenarios. In fact, our reduction fails in the case of non-uniform distributions over scenarios: it can generate an MSSCf\textsc{MSSC}_{f} instance with optimal cost much smaller than that of the original DT instance.

To see this, consider an example with mm scenarios where scenarios 11 through m1m-1 happen with probability ε/(m1)\varepsilon/(m-1) and scenario mm happens with probability 1ε1-\varepsilon. There are m1m-1 tests of cost 11 each. Test ii for i[m1]i\in[m-1] isolates scenario ii from all others. Observe that the optimal cost of this DT instance is at least (1ε)(m1)(1-\varepsilon)(m-1) as all m1m-1 tests need to be run to isolate scenario mm. Our construction of the MSSCf\textsc{MSSC}_{f} instance adds another isolating test for scenario mm. A solution to this instance can use this new test at the beginning to identify scenario mm and then run other tests with the remaining ε\varepsilon probability. As a result, it incurs cost at most (1ε)+ε(m1)(1-\varepsilon)+\varepsilon(m-1), which is a factor of 1/ε1/\varepsilon cheaper than that of the original DT instance.
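The gap in this example is easy to verify numerically; the two functions below simply restate the bounds from the text.

```python
def dt_cost_lower_bound(m, eps):
    """Cost of the DT instance: all m-1 unit-cost tests must be run to
    isolate scenario m, which occurs with probability 1 - eps."""
    return (1 - eps) * (m - 1)

def msscf_cost_upper_bound(m, eps):
    """Cost of the constructed MSSC_f instance: run the added isolating
    test for scenario m first; the remaining tests are only needed with
    the leftover probability eps."""
    return (1 - eps) + eps * (m - 1)
```

For m = 101 and ε = 0.01 the two bounds are 99 and 1.99, a ratio of about 50, i.e. on the order of 1/ε.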

6 Mixture of Product Distributions

In this section we switch gears and consider the case where we are given a mixture of mm product distributions. Observe that using the tool described in Section 4.1.1, we can reduce this problem to PBT{\textsc{PB}_{\leq T}}. This is now equivalent to the noisy version of DT [GK17, JNNR19], where for a specific scenario the result of each test is not deterministic and can take different values with different probabilities.

Comparison with previous work:

Previous work on the noisy decision tree problem either considers limited noise models or obtains runtime and approximation guarantees that depend on the type of noise. For example, in the main result of [JNNR19], the noise outcomes are binary with equal probability. The authors mention that it is possible to extend their result in the following ways:

  • to probabilities within [δ,1δ][\delta,1-\delta], incurring an extra 1/δ1/\delta factor in the approximation

  • to non-binary noise outcomes, incurring an extra factor of at most mm in the approximation

Additionally, their algorithm works by expanding the scenarios for every possible noise outcome (e.g. to 2m2^{m} for binary noise). In our work the number of noisy outcomes does not affect the number of scenarios whatsoever.

In our work, we obtain a constant approximation factor that does not depend in any way on the type of noise. Additionally, the outcomes of the noisy tests can be arbitrary, and affect neither the approximation factor nor the runtime. We only require a separability condition to hold: the per-box distributions of any two scenarios either differ enough or are exactly the same. Formally, we require that for any two scenarios s1,s2𝒮s_{1},s_{2}\in\mathcal{S} and for every box ii, the distributions 𝒟is1\mathcal{D}_{is_{1}} and 𝒟is2\mathcal{D}_{is_{2}} satisfy |𝒟is1𝒟is2|ε{0}\left|\mathcal{D}_{is_{1}}-\mathcal{D}_{is_{2}}\right|\in\mathbb{R}_{\geq\varepsilon}\cup\{0\}, where |𝒜||\mathcal{A}-\mathcal{B}| denotes the total variation distance between distributions 𝒜\mathcal{A} and \mathcal{B}.
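For explicitly given discrete distributions, the separability condition is straightforward to check; the dict-of-dicts representation used here is our own illustration, not an interface from the paper.

```python
def tv_distance(p, q):
    """Total variation distance between two finite discrete distributions,
    each given as a dict mapping outcome -> probability."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)

def is_separable(boxes, eps):
    """Check that for every box, every pair of per-scenario distributions
    is either identical or at least eps-far in TV distance.
    boxes[i] maps scenario -> distribution of box i's value."""
    for box in boxes:
        scen = list(box)
        for a in range(len(scen)):
            for b in range(a + 1, len(scen)):
                d = tv_distance(box[scen[a]], box[scen[b]])
                if 0.0 < d < eps:  # neither identical nor eps-separated
                    return False
    return True
```

For instance, two scenarios whose box distributions are 0.1 apart in TV distance violate separability for ε = 0.5.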

6.1 A DP Algorithm for noisy PBT{\textsc{PB}_{\leq T}}

We move on to designing a dynamic programming algorithm to solve the PBT{\textsc{PB}_{\leq T}} problem in the case of a mixture of product distributions. The guarantees of our dynamic programming algorithm are given in the following theorem.

Theorem 6.1.

For any β>0\beta>0, let π\scaletoDP5pt\pi_{\scaleto{DP}{5pt}} be the policy produced by Algorithm DP(β)\operatorname{DP}(\beta) described by Equation (2), let π\pi^{*} be the optimal policy, and let UB=m2ε2logm2Tcminβ\operatorname{UB}=\frac{m^{2}}{\varepsilon^{2}}\log\frac{m^{2}T}{c_{\min}\beta}. Then it holds that

c(π\scaletoDP5pt)(1+β)c(π).c(\pi_{\scaleto{\operatorname{DP}}{5pt}})\leq(1+\beta)c(\pi^{*}).

and the DP\operatorname{DP} runs in time nUBn^{\operatorname{UB}}, where nn is the number of boxes and cminc_{\min} is the minimum opening cost of any box.

Using the reduction described in Section 4.1.1 and the previous theorem, we can obtain a constant-factor approximation algorithm for the initial PB problem given a mixture of product distributions. Observe that in the reduction, for every instance of PBT{\textsc{PB}_{\leq T}} it runs, the chosen threshold TT satisfies T(β+1)c(πT)/0.2T\leq(\beta+1)c(\pi^{*}_{T})/0.2, where πT\pi^{*}_{T} is the optimal policy for threshold TT. The inequality holds since the algorithm for threshold TT is a (β+1)(\beta+1)-approximation and covers 80%80\% of the scenarios left (i.e. it pays 0.2T0.2T for the rest). This is formalized in the following corollary.

Corollary 6.1.

Given an instance of PB on mm scenarios and the DP algorithm described in Equation (2), using Algorithm 1 we obtain an O(1)O(1)-approximation algorithm for PB that runs in time nO~(m2/ε2)n^{\tilde{O}(m^{2}/\varepsilon^{2})}.

Observe that the naive DP, which keeps track of all the boxes and possible outcomes, requires space exponential in the number of boxes, which can be very large. In our DP, we exploit the separability property of the distributions by distinguishing the boxes into two types based on a given set of scenarios. Informally, the informative boxes help us distinguish between two scenarios by giving us enough TV distance, while the non-informative ones always have zero TV distance. The formal definition follows.

Definition 6.2 (Informative and non-informative boxes).

Let S𝒮S\subseteq\mathcal{S} be a set of scenarios. Then we call a box kk informative (with respect to SS) if there exist si,sjSs_{i},s_{j}\in S such that

|𝒟ksi𝒟ksj|ε.|\mathcal{D}_{ks_{i}}-\mathcal{D}_{ks_{j}}|\geq\varepsilon.

We denote the set of all informative boxes by IB(S)\operatorname{IB}(S). Similarly, the boxes for which the above does not hold are called non-informative and the set of these boxes is denoted by NIB(S)\operatorname{NIB}(S).
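Definition 6.2 translates directly into code; the dict representation of per-scenario box distributions is our own illustration.

```python
def tv_distance(p, q):
    """Total variation distance between two finite discrete distributions,
    each given as a dict mapping outcome -> probability."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)

def split_boxes(boxes, scenarios, eps):
    """Partition box indices into IB(S) and NIB(S) per Definition 6.2.
    boxes[k] maps scenario -> distribution of box k's value."""
    ib, nib = [], []
    for k, box in enumerate(boxes):
        # A box is informative if some pair of scenarios is eps-far on it.
        informative = any(
            tv_distance(box[si], box[sj]) >= eps
            for i, si in enumerate(scenarios)
            for sj in scenarios[i + 1:])
        (ib if informative else nib).append(k)
    return ib, nib
```

A box whose distribution is identical under all scenarios lands in NIB(S); a box that is ε-far for some pair lands in IB(S).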

Recursive calls of the DP:

Our dynamic program chooses at every step one of the following options:

  1.

    Open an informative box: this step contributes towards eliminating improbable scenarios. By the definition of informative boxes, every time such a box is opened, it yields TV distance at least ε\varepsilon between at least two scenarios, making one of them more probable than the other. We show (Lemma 6.3) that a bounded number of such openings suffices to decide, with high probability, which scenario is realized (i.e. to eliminate all but one scenario).

  2.

    Open a non-informative box: this is a greedy step; the best non-informative box to open next is the one that maximizes the probability of finding a value smaller than TT. Given a set SS of scenarios that are not yet eliminated, there is a unique best next non-informative box, and we denote by NIB(S)\operatorname{NIB}^{*}(S) the function that returns it. Observe that the non-informative boxes do not affect the greedy ordering of which box is next best, since they do not affect which scenarios are eliminated.
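The greedy rule NIB*(S) can be sketched as follows; the posterior reweighting over surviving scenarios and the dict interfaces are our own illustrative assumptions.

```python
def best_noninformative_box(nib, boxes, prior, surviving, T):
    """Greedy rule NIB*(S): among the non-informative boxes, return the
    one maximizing the probability of revealing a value at most T under
    the posterior over the surviving scenarios.

    boxes[k][s] is a dict mapping value -> probability for box k under
    scenario s; prior[s] is the prior mass of scenario s.
    """
    total = sum(prior[s] for s in surviving)

    def hit_probability(k):
        # Probability that box k reveals a value <= T, averaged over the
        # (renormalized) surviving scenarios.
        return sum(prior[s] / total *
                   sum(q for v, q in boxes[k][s].items() if v <= T)
                   for s in surviving)

    return max(nib, key=hit_probability)
```

For non-informative boxes the per-scenario distributions coincide, so the choice is the same regardless of which scenarios survive, matching the observation above.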

State space of the DP:

the DP keeps track of the following three quantities:

  1.

    A list MM which consists of the sets of informative boxes opened, together with the numbers of non-informative boxes opened in between those sets. Specifically, MM has the form M=S1|x1|S2|x2||SL|xLM=S_{1}|x_{1}|S_{2}|x_{2}|\ldots|S_{L}|x_{L}, where SiS_{i} is a set of informative boxes and xix_{i}\in\mathbb{N} is the number of non-informative boxes opened exactly after the boxes in set SiS_{i}. (For example, if bib_{i} for i[n]i\in[n] are boxes, the list MM may look like b3b6b13|5|b42b1|6|b2b_{3}b_{6}b_{13}|5|b_{42}b_{1}|6|b_{2}.) We denote by IB(M)\operatorname{IB}(M) the informative boxes in the list MM.

    To update MM at every recursive call, we either append a newly opened informative box bib_{i} (denoted by M|biM|b_{i}) or, when a non-informative box is opened, add 11 at the end, denoted by M+1M+1.

  2.

    A list EE of m2m^{2} tuples of integers (zij,tij)(z_{ij},t_{ij}), one for each pair of distinct scenarios (si,sj)(s_{i},s_{j}) with i,j[m]i,j\in[m]. The number tijt_{ij} is the total number of boxes informative for the pair (si,sj)(s_{i},s_{j}) opened so far, and zijz_{ij} counts how many of them revealed a value with higher probability under scenario sis_{i}. Every time an informative box is opened, we increase tijt_{ij} for every pair of scenarios the box is informative for, and add 11 to zijz_{ij} if the value discovered had higher probability under sis_{i}. When a non-informative box is opened, the list remains the same. We denote this update by E\scaleto++5ptE^{\scaleto{++}{5pt}}.

  3.

    A list SS of the scenarios not yet eliminated. Every time an informative test is performed and the list EE is updated, if for some scenario sis_{i} there exists another scenario sjs_{j} such that tij>1/ε2log(1/δ)t_{ij}>1/\varepsilon^{2}\log(1/\delta) and |zij𝔼[zij|si]|ε/2|z_{ij}-\mathbb{E}{\left[z_{ij}|s_{i}\right]}|\leq\varepsilon/2, then sjs_{j} is removed from SS; otherwise sis_{i} is removed. (This is the elimination process in the proof of Lemma 6.3.) We denote this update by S\scaleto++5ptS^{\scaleto{++}{5pt}}.

Base cases:

If a value below TT is found, the algorithm stops. The other base case is when |S|=1|S|=1, which means that the realized scenario has been identified; in this case we either take the outside option TT or search the boxes for a value below TT, whichever is cheaper. If the scenario is identified correctly, the DP achieves the expected optimal cost for this scenario. We later show that we make a mistake only with low probability, thus increasing the cost only by a constant factor. We denote by Nat(,,)\operatorname{Nat}(\cdot,\cdot,\cdot) “nature’s” move, where the value in the box we chose is realized, and by Sol(,,)\operatorname{Sol}(\cdot,\cdot,\cdot) the minimum cost obtained by opening boxes. The recursive formula is shown below.

\operatorname{Sol}(M,E,S)=\begin{cases}\min\left(T,\,c_{\operatorname{NIB}^{*}(S)}+\operatorname{Nat}(M+1,E,S)\right)&\text{if }|S|=1\\ \min\left(T,\,\min\limits_{i\in\operatorname{IB}(M)}\left(c_{i}+\operatorname{Nat}(M|i,E,S)\right),\,c_{\operatorname{NIB}^{*}(S)}+\operatorname{Nat}(M+1,E,S)\right)&\text{otherwise}\end{cases}

\operatorname{Nat}(M,E,S)=\begin{cases}0&\text{if }v_{\text{last box opened}}\leq T\\ \operatorname{Sol}(M,E^{++},S^{++})&\text{otherwise}\end{cases} (2)

The final solution is DP(β)=Sol(,E0,𝒮)\operatorname{DP}(\beta)=\operatorname{Sol}(\emptyset,E^{0},\mathcal{S}), where E0E^{0} is a list of tuples of the form (0,0)(0,0), and in order to update SS we set δ=βcmin/(m2T)\delta=\beta c_{\min}/(m^{2}T).

Lemma 6.3.

Let s1,s2𝒮s_{1},s_{2}\in\mathcal{S} be any two scenarios. Then after opening log(1/δ)ε2\frac{\log(1/\delta)}{\varepsilon^{2}} informative boxes, we can eliminate one scenario with probability at least 1δ1-\delta.

We defer the proof of this lemma and Theorem 6.1 to Section D of the Appendix.
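To make the quantities in Lemma 6.3 concrete, the sketch below computes the number of informative openings the lemma prescribes, and applies the elimination rule from the DP state description to a pair of scenarios. The normalized-frequency form of the test and the parameter `p1` are our own reading of the condition |z_{ij} − E[z_{ij}|s_i]| ≤ ε/2, not a verbatim interface from the paper.

```python
import math

def boxes_needed(eps, delta):
    """Number of informative openings prescribed by Lemma 6.3:
    log(1/delta) / eps^2, rounded up."""
    return math.ceil(math.log(1 / delta) / eps ** 2)

def eliminate_one(z, t, p1, eps):
    """Elimination rule (illustrative sketch): z of the t informative
    openings revealed values more likely under s1, and p1 is the
    per-opening probability of such an outcome under s1.  Keep s1 if
    the empirical frequency z/t is within eps/2 of p1; otherwise keep
    s2.  Returns the label of the eliminated scenario."""
    if abs(z / t - p1) <= eps / 2:
        return 's2'  # counts look like s1, so s2 is eliminated
    return 's1'      # counts inconsistent with s1, so s1 is eliminated
```

For ε = 0.4 and δ = 0.05, for instance, 19 informative openings suffice.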

References

  • [AGY09] Yossi Azar, Iftah Gamzu, and Xiaoxin Yin. Multiple intents re-ranking. In Michael Mitzenmacher, editor, Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009, pages 669–678. ACM, 2009.
  • [AH12] Micah Adler and Brent Heeringa. Approximating optimal binary decision trees. Algorithmica, 62(3-4):1112–1121, 2012.
  • [ASW16] Marek Adamczyk, Maxim Sviridenko, and Justin Ward. Submodular stochastic probing on matroids. Math. Oper. Res., 41(3):1022–1038, 2016.
  • [BC22] Hedyeh Beyhaghi and Linda Cai. Pandora’s problem with nonobligatory inspection: Optimal structure and a PTAS. CoRR, abs/2212.01524, 2022.
  • [BDP22] Curtis Bechtel, Shaddin Dughmi, and Neel Patel. Delegated pandora’s box. In David M. Pennock, Ilya Segal, and Sven Seuken, editors, EC ’22: The 23rd ACM Conference on Economics and Computation, Boulder, CO, USA, July 11 - 15, 2022, pages 666–693. ACM, 2022.
  • [BEFF23] Ben Berger, Tomer Ezra, Michal Feldman, and Federico Fusco. Pandora’s problem with combinatorial cost. CoRR, abs/2303.01078, 2023.
  • [BFLL20] Shant Boodaghians, Federico Fusco, Philip Lazos, and Stefano Leonardi. Pandora’s box problem with order constraints. In Péter Biró, Jason D. Hartline, Michael Ostrovsky, and Ariel D. Procaccia, editors, EC ’20: The 21st ACM Conference on Economics and Computation, Virtual Event, Hungary, July 13-17, 2020, pages 439–458. ACM, 2020.
  • [BGK10] Nikhil Bansal, Anupam Gupta, and Ravishankar Krishnaswamy. A constant factor approximation algorithm for generalized min-sum set cover. In Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2010, Austin, Texas, USA, January 17-19, 2010, pages 1539–1545, 2010.
  • [BK19] Hedyeh Beyhaghi and Robert Kleinberg. Pandora’s problem with nonobligatory inspection. In Anna Karlin, Nicole Immorlica, and Ramesh Johari, editors, Proceedings of the 2019 ACM Conference on Economics and Computation, EC 2019, Phoenix, AZ, USA, June 24-28, 2019, pages 131–132. ACM, 2019.
  • [CFG+00] Moses Charikar, Ronald Fagin, Venkatesan Guruswami, Jon M. Kleinberg, Prabhakar Raghavan, and Amit Sahai. Query strategies for priced information (extended abstract). In Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, May 21-23, 2000, Portland, OR, USA, pages 582–591, 2000.
  • [CGT+20] Shuchi Chawla, Evangelia Gergatsouli, Yifeng Teng, Christos Tzamos, and Ruimin Zhang. Pandora’s box with correlations: Learning and approximation. In 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Durham, NC, USA, November 16-19, 2020, pages 1214–1225. IEEE, 2020.
  • [CHKK15] Yuxin Chen, S. Hamed Hassani, Amin Karbasi, and Andreas Krause. Sequential information maximization: When is greedy near-optimal? In Proceedings of The 28th Conference on Learning Theory, COLT 2015, Paris, France, July 3-6, 2015, pages 338–363, 2015.
  • [CJK+15] Yuxin Chen, Shervin Javdani, Amin Karbasi, J. Andrew Bagnell, Siddhartha S. Srinivasa, and Andreas Krause. Submodular surrogates for value of information. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, January 25-30, 2015, Austin, Texas, USA., pages 3511–3518, 2015.
  • [CJLM10] Ferdinando Cicalese, Tobias Jacobs, Eduardo Sany Laber, and Marco Molinaro. On greedy algorithms for decision trees. In Otfried Cheong, Kyung-Yong Chwa, and Kunsoo Park, editors, Algorithms and Computation - 21st International Symposium, ISAAC 2010, Jeju Island, Korea, December 15-17, 2010, Proceedings, Part II, volume 6507 of Lecture Notes in Computer Science, pages 206–217. Springer, 2010.
  • [CPR+11] Venkatesan T. Chakaravarthy, Vinayaka Pandit, Sambuddha Roy, Pranjal Awasthi, and Mukesh K. Mohania. Decision trees for entity identification: Approximation algorithms and hardness results. ACM Trans. Algorithms, 7(2):15:1–15:22, 2011.
  • [CPRS09] Venkatesan T. Chakaravarthy, Vinayaka Pandit, Sambuddha Roy, and Yogish Sabharwal. Approximating decision trees with multiway branches. In Susanne Albers, Alberto Marchetti-Spaccamela, Yossi Matias, Sotiris E. Nikoletseas, and Wolfgang Thomas, editors, Automata, Languages and Programming, 36th International Colloquium, ICALP 2009, Rhodes, Greece, July 5-12, 2009, Proceedings, Part I, volume 5555 of Lecture Notes in Computer Science, pages 210–221. Springer, 2009.
  • [Das04] Sanjoy Dasgupta. Analysis of a greedy active learning strategy. In Advances in Neural Information Processing Systems 17 [Neural Information Processing Systems, NIPS 2004, December 13-18, 2004, Vancouver, British Columbia, Canada], pages 337–344, 2004.
  • [DFH+23] Bolin Ding, Yiding Feng, Chien-Ju Ho, Wei Tang, and Haifeng Xu. Competitive information design for pandora’s box. In Nikhil Bansal and Viswanath Nagarajan, editors, Proceedings of the 2023 ACM-SIAM Symposium on Discrete Algorithms, SODA 2023, Florence, Italy, January 22-25, 2023, pages 353–381. SIAM, 2023.
  • [Dov18] Laura Doval. Whether or not to open pandora’s box. J. Econ. Theory, 175:127–158, 2018.
  • [EHLM19] Hossein Esfandiari, Mohammad Taghi Hajiaghayi, Brendan Lucier, and Michael Mitzenmacher. Online pandora’s boxes and bandits. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019, pages 1885–1892. AAAI Press, 2019.
  • [FLL22] Hu Fu, Jiawei Li, and Daogao Liu. Pandora box problem with nonobligatory inspection: Hardness and improved approximation algorithms. CoRR, abs/2207.09545, 2022.
  • [FLT04] Uriel Feige, László Lovász, and Prasad Tetali. Approximating min sum set cover. Algorithmica, 40(4):219–234, 2004.
  • [GB09] Andrew Guillory and Jeff A. Bilmes. Average-case active learning with costs. In Ricard Gavaldà, Gábor Lugosi, Thomas Zeugmann, and Sandra Zilles, editors, Algorithmic Learning Theory, 20th International Conference, ALT 2009, Porto, Portugal, October 3-5, 2009. Proceedings, volume 5809 of Lecture Notes in Computer Science, pages 141–155. Springer, 2009.
  • [GG74] M. R. Garey and Ronald L. Graham. Performance bounds on the splitting algorithm for binary testing. Acta Informatica, 3:347–355, 1974.
  • [GGM06] Ashish Goel, Sudipto Guha, and Kamesh Munagala. Asking the right questions: model-driven optimization using probes. In Proceedings of the Twenty-Fifth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 26-28, 2006, Chicago, Illinois, USA, pages 203–212, 2006.
  • [GJ74] J.C. Gittins and D.M. Jones. A dynamic allocation index for the sequential design of experiments. Progress in Statistics, pages 241–266, 1974.
  • [GJSS19] Anupam Gupta, Haotian Jiang, Ziv Scully, and Sahil Singla. The markovian price of information. In Integer Programming and Combinatorial Optimization - 20th International Conference, IPCO 2019, Ann Arbor, MI, USA, May 22-24, 2019, Proceedings, pages 233–246, 2019.
  • [GK01] Anupam Gupta and Amit Kumar. Sorting and selection with structured costs. In 42nd Annual Symposium on Foundations of Computer Science, FOCS 2001, 14-17 October 2001, Las Vegas, Nevada, USA, pages 416–425, 2001.
  • [GK11] Daniel Golovin and Andreas Krause. Adaptive submodularity: Theory and applications in active learning and stochastic optimization. J. Artif. Intell. Res., 42:427–486, 2011.
  • [GK17] Daniel Golovin and Andreas Krause. Adaptive submodularity: A new approach to active learning and stochastic optimization. CoRR, abs/1003.3967, 2017.
  • [GKR10] Daniel Golovin, Andreas Krause, and Debajyoti Ray. Near-optimal bayesian active learning with noisy observations. In John D. Lafferty, Christopher K. I. Williams, John Shawe-Taylor, Richard S. Zemel, and Aron Culotta, editors, Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, Vancouver, British Columbia, Canada, pages 766–774. Curran Associates, Inc., 2010.
  • [GN13] Anupam Gupta and Viswanath Nagarajan. A stochastic probing problem with applications. In Integer Programming and Combinatorial Optimization - 16th International Conference, IPCO 2013, Valparaíso, Chile, March 18-20, 2013. Proceedings, pages 205–216, 2013.
  • [GNR17] Anupam Gupta, Viswanath Nagarajan, and R. Ravi. Approximation algorithms for optimal decision trees and adaptive TSP problems. Math. Oper. Res., 42(3):876–896, 2017.
  • [GNS16] Anupam Gupta, Viswanath Nagarajan, and Sahil Singla. Algorithms and adaptivity gaps for stochastic probing. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, pages 1731–1747, 2016.
  • [GNS17] Anupam Gupta, Viswanath Nagarajan, and Sahil Singla. Adaptivity gaps for stochastic probing: Submodular and XOS functions. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017, Barcelona, Spain, Hotel Porta Fira, January 16-19, pages 1688–1702, 2017.
  • [GT23] Evangelia Gergatsouli and Christos Tzamos. Weitzman’s rule for pandora’s box with correlations. CoRR, abs/2301.13534, 2023.
  • [HAKS13] Noam Hazon, Yonatan Aumann, Sarit Kraus, and David Sarne. Physical search problems with probabilistic knowledge. Artif. Intell., 196:26–52, 2013.
  • [HR76] Laurent Hyafil and Ronald L. Rivest. Constructing optimal binary decision trees is NP-complete. Inf. Process. Lett., 5(1):15–17, 1976.
  • [INvdZ16] Sungjin Im, Viswanath Nagarajan, and Ruben van der Zwaan. Minimum latency submodular cover. ACM Trans. Algorithms, 13(1):13:1–13:28, 2016.
  • [JNNR19] Su Jia, Viswanath Nagarajan, Fatemeh Navidi, and R. Ravi. Optimal decision tree with noisy outcomes. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d’Alché-Buc, Emily B. Fox, and Roman Garnett, editors, Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, pages 3298–3308, 2019.
  • [KNN17] Prabhanjan Kambadur, Viswanath Nagarajan, and Fatemeh Navidi. Adaptive submodular ranking. In Integer Programming and Combinatorial Optimization - 19th International Conference, IPCO 2017, Waterloo, ON, Canada, June 26-28, 2017, Proceedings, pages 317–329, 2017.
  • [KPB99] S. Rao Kosaraju, Teresa M. Przytycka, and Ryan S. Borgstrom. On an optimal split tree problem. In Frank K. H. A. Dehne, Arvind Gupta, Jörg-Rüdiger Sack, and Roberto Tamassia, editors, Algorithms and Data Structures, 6th International Workshop, WADS ’99, Vancouver, British Columbia, Canada, August 11-14, 1999, Proceedings, volume 1663 of Lecture Notes in Computer Science, pages 157–168. Springer, 1999.
  • [LLM20] Ray Li, Percy Liang, and Stephen Mussmann. A tight analysis of greedy yields subexponential time approximation for uniform decision tree. In Shuchi Chawla, editor, Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, Salt Lake City, UT, USA, January 5-8, 2020, pages 102–121. SIAM, 2020.
  • [Lov85] Donald W. Loveland. Performance bounds for binary testing with arbitrary weights. Acta Informatica, 22(1):101–114, 1985.
  • [LPRY08] Zhen Liu, Srinivasan Parthasarathy, Anand Ranganathan, and Hao Yang. Near-optimal algorithms for shared filter evaluation in data stream systems. In Jason Tsong-Li Wang, editor, Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2008, Vancouver, BC, Canada, June 10-12, 2008, pages 133–146. ACM, 2008.
  • [MBMW05] Kamesh Munagala, Shivnath Babu, Rajeev Motwani, and Jennifer Widom. The pipelined set cover problem. In Thomas Eiter and Leonid Libkin, editors, Database Theory - ICDT 2005, 10th International Conference, Edinburgh, UK, January 5-7, 2005, Proceedings, volume 3363 of Lecture Notes in Computer Science, pages 83–98. Springer, 2005.
  • [NS17] Feng Nan and Venkatesh Saligrama. Comments on the proof of adaptive stochastic set cover based on adaptive submodularity and its implications for the group identification problem in ”group-based active query selection for rapid diagnosis in time-critical situations”. IEEE Trans. Inf. Theory, 63(11):7612–7614, 2017.
  • [PD92] Krishna R. Pattipati and Mahesh Dontamsetty. On a generalized test sequencing problem. IEEE Trans. Syst. Man Cybern., 22(2):392–396, 1992.
  • [PKSR02] Vili Podgorelec, Peter Kokol, Bruno Stiglic, and Ivan Rozman. Decision trees: An overview and their use in medicine. Journal of medical systems, 26:445–63, 11 2002.
  • [Sin18] Sahil Singla. The price of information in combinatorial optimization. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2018, New Orleans, LA, USA, January 7-10, 2018, pages 2523–2532, 2018.
  • [SW11] Martin Skutella and David P. Williamson. A note on the generalized min-sum set cover problem. Oper. Res. Lett., 39(6):433–436, 2011.
  • [Wei79] Martin L Weitzman. Optimal Search for the Best Alternative. Econometrica, 47(3):641–654, May 1979.

Appendix A A Naive Reduction to PBT{\textsc{PB}_{\leq T}}

In this section we present a straightforward reduction from Pandora’s Box to ${\textsc{PB}_{\leq T}}$ as an alternative to Theorem 4.2. This reduction has a simpler construction than the reduction of Section 4 and does not lose a logarithmic factor in the approximation. However, it faces the following issues.

  1.

    It incurs an extra computational cost, since it adds a number of boxes that depends on the size of the values’ support.

  2.

    It requires opening costs, which means that the oracle for Pandora’s Box with outside option should be able to handle non-unit costs on the boxes, even if the original PB problem had unit-cost boxes.

We denote by PBTc{\textsc{PB}_{\leq T}^{c}} the version of Pandora’s Box with outside option that has non-unit cost boxes, and formally state the guarantees of our naive reduction below.

Theorem A.1.

For $n$ boxes and $m$ scenarios, given an $\alpha(n,m)$-approximation algorithm for ${\textsc{PB}_{\leq T}^{c}}$ for arbitrary $T$, there exists a $2\alpha(n\cdot|\operatorname{supp}(v)|,m)$-approximation for PB that runs in time polynomial in the number of scenarios, the number of boxes, and the number of values.

Figure 5 summarizes all the reductions from PB to ${\textsc{PB}_{\leq T}}$, and Table 1 compares the properties of the naive reduction of this section to the ones shown in Section 4. The main difference is that the naive reduction incurs a blow-up in the number of boxes that depends on the support size, while losing only constant factors in the approximation.

[Figure 5 diagram: reductions among ${\textsc{UPB}_{\leq T}}$, PB, ${\textsc{PB}_{\leq T}}$, and ${\textsc{PB}_{\leq T}^{c}}$, via the Main Lemmas 4.4 and 4.5 (losing $\log$ factors), Theorem A.1 (losing constant factors), and the subproblem relation.]
Figure 5: Reductions shown in Section 4.1
Reducing PB to:        ${\textsc{PB}_{\leq T}^{c}}$, Theorem A.1        $(\mathcal{U}){\textsc{PB}_{\leq T}}$, Main Lemma 4.5 (4.4)
Costs of boxes:        Introduces non-unit costs        Maintains costs
Probabilities:         Maintains probabilities          Maintains probabilities (makes probabilities uniform)
# of extra scenarios:  0                                0
# of extra boxes:      $n\cdot|\operatorname{supp}(v)|$        0
Approximation loss:    $2\alpha(n\cdot|\operatorname{supp}(v)|,m)$        $O(\alpha(n,m)\log\alpha(n,m))$
Table 1: Differences between the reduction of Theorem A.1 and the Main Lemmas 4.5 and 4.4 that comprise Theorem 4.2.

The main idea is that we can move the information about the values contained in the boxes into the costs of the boxes. We achieve this by creating one new box for every (box, value)-pair. Doing this, however, risks losing the information about the realized scenario that the original boxes revealed. To retain this information, we keep the original boxes but replace their values with high values; the high values guarantee that the effect of the new boxes is retained. We now formalize this intuition.

PBT{\textsc{PB}_{\leq T}} Instance.

Given an instance $\mathcal{I}$ of PB, we construct an instance $\mathcal{I}^{\prime}$ of ${\textsc{PB}_{\leq T}}$. We need $T$ to be sufficiently large so that the outside option is never chosen; the net effect is that a policy for PB is easily inferred from a policy for ${\textsc{PB}_{\leq T}}$. We define the instance $\mathcal{I}^{\prime}$ to have the same scenarios $s_{i}$ and the same scenario probabilities $p_{i}$ as $\mathcal{I}$. We choose $T=\infty$ (formally, we set $T$ to a value larger than $\sum_{i}c_{i}+\max_{i,j}v_{ij}$), and define the new values by $v^{\prime}_{i,j}=v_{i,j}+T+1$. Note that all of these values are larger than $T$, so a feasible policy cannot terminate after receiving such a value. At the same time, these values induce the same branching behaviour as before, since each distinct value is mapped one-to-one to a new distinct value. Next, we add an additional “final” box for each pair $(j,v)$, where $j$ is a box and $v$ a potential value of box $j$. The “final” box $(j,v)$ has cost $c_{j}+v$; it has value $0$ for the scenarios where box $j$ gives exactly value $v$, and value $T+1$ for all other scenarios. Formally,

vi,(j,v)={0if vi,j=vT+1else\displaystyle v^{\prime}_{i,(j,v)}=\begin{cases}0&\text{if }v_{i,j}=v\\ T+1&\text{else}\end{cases}

Intuitively, opening a “final” box indicates to a policy that this is the last box it opens: its cost accounts for the best value among the boxes chosen, which is thus charged as part of the opening cost of the solution.
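As a sanity check, the construction of $\mathcal{I}^{\prime}$ can be sketched in a few lines of Python. This is an illustrative sketch only; the list-of-lists layout `values[i][j]` (scenario $i$, box $j$) and the function name are our own assumptions.

```python
def build_threshold_instance(costs, values):
    """Sketch of the naive reduction: given box costs c_j and a value matrix
    values[i][j] (scenario i, box j), build the PB_{<=T} instance I'."""
    n, m = len(costs), len(values)
    # Footnote choice of T: larger than the sum of costs plus the largest value.
    T = sum(costs) + max(max(row) for row in values)
    # Original boxes keep their costs; values are shifted above the threshold,
    # preserving branching behaviour while forbidding termination on them.
    new_costs = list(costs)
    new_values = [[v + T + 1 for v in row] for row in values]
    # One "final" box per (box j, potential value v) pair, of cost c_j + v.
    for j in range(n):
        for v in sorted({values[i][j] for i in range(m)}):
            new_costs.append(costs[j] + v)
            for i in range(m):
                new_values[i].append(0 if values[i][j] == v else T + 1)
    return T, new_costs, new_values
```

On a toy instance with two boxes and two scenarios, the sketch adds one final box per distinct (box, value) pair, so at most $mn$ boxes overall, matching the blow-up reported in Table 1.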

In order to prove Theorem A.1, we use two key lemmas. In Lemma A.2 we show that the optimal value for the transformed instance $\mathcal{I}^{\prime}$ of ${\textsc{PB}_{\leq T}}$ is not much higher than the optimal value for the original instance $\mathcal{I}$. In Lemma A.3 we show how to obtain a policy for the initial instance, given a policy for the problem with a threshold.

Lemma A.2.

Given the instance \mathcal{I} of PB and the constructed instance \mathcal{I}^{\prime} of PBT{\textsc{PB}_{\leq T}} it holds that

c(π)2c(π).c(\pi^{*}_{\mathcal{I}^{\prime}})\leq 2c(\pi^{*}_{\mathcal{I}}).
Proof.

We show that given the optimal policy $\pi=\pi^{*}_{\mathcal{I}}$ for PB, we can construct a feasible policy $\pi^{\prime}$ for $\mathcal{I}^{\prime}$ such that $c(\pi^{\prime})\leq 2c(\pi^{*}_{\mathcal{I}})$. We construct the policy $\pi^{\prime}$ by opening the same boxes as $\pi$ and finally opening the corresponding “final” box, in order to find the value $0$ needed to stop.

Fix any scenario $i$, and suppose box $j$ achieved the smallest value $v_{i,j}$ of all boxes opened under scenario $i$. Since $j$ is opened, in the instance $\mathcal{I}^{\prime}$ we also open box $(j,v_{i,j})$, and from the construction of $\mathcal{I}^{\prime}$ we have that $v^{\prime}_{i,(j,v_{i,j})}=0$. Since on every branch we open a box with value $0$ (note that $\pi$ opens at least one box), we see that $\pi^{\prime}$ is a feasible policy for $\mathcal{I}^{\prime}$. Under scenario $i$, the cost of $\pi(i)$ is

c(π(i))=minkπ(i)vi,k+kπ(i)ck.c(\pi(i))=\min_{k\in\pi(i)}v_{i,k}+\sum_{k\in\pi(i)}c_{k}.

In contrast, the minimum value obtained by $\pi^{\prime}(i)$ is $0$, and there is the additional cost of the “final” box. Formally, the cost of $\pi^{\prime}(i)$ is

c(π(i))=0+kπ(i)ck+c(j,vi,j)=minkπ(i)vi,k+kπ(i)ck+cj=c(π(i))+cjc(\pi^{\prime}(i))=0+\sum_{k\in\pi(i)}c_{k}+c_{(j,v_{i,j})}=\min_{k\in\pi(i)}v_{i,k}+\sum_{k\in\pi(i)}c_{k}+c_{j}=c(\pi(i))+c_{j}

Since $c_{j}$ appears in the cost of $\pi(i)$, we know that $c(\pi(i))\geq c_{j}$. Thus, $c(\pi^{\prime}(i))=c(\pi(i))+c_{j}\leq 2c(\pi(i))$, which implies that $c(\pi^{\prime})\leq 2c(\pi^{*}_{\mathcal{I}})$ for our feasible policy $\pi^{\prime}$. Observing that $c(\pi^{\prime})\geq c(\pi^{*}_{\mathcal{I}^{\prime}})$, since $\pi^{*}_{\mathcal{I}^{\prime}}$ is optimal for $\mathcal{I}^{\prime}$, completes the proof. ∎

Lemma A.3.

Given a policy π\pi^{\prime} for the constructed instance \mathcal{I}^{\prime} of PBT{\textsc{PB}_{\leq T}}, there exists a feasible policy π\pi for the instance \mathcal{I} of PB with no larger expected cost. Furthermore, any branch of π\pi can be constructed from π\pi^{\prime} in polynomial time.

Proof of Lemma A.3.

We construct a policy π\pi for \mathcal{I} using the policy π\pi^{\prime}. Fix some branch of π\pi^{\prime}. If π\pi^{\prime} opens box jj along this branch, we define policy π\pi to open the same box along this branch. When π\pi^{\prime} opens a “final” box (j,v)(j,v), we define the policy π\pi to open box jj if it has not been opened already.

Next, we show this policy π\pi has no larger expected cost than π\pi^{\prime}. There are two cases to consider depending on where the “final” box (j,v)(j,v) is opened:

  1.

    “Final” box $(j,v)$ is at a leaf of $\pi^{\prime}$: since $\pi^{\prime}$ has finite expected cost and this is the first “final” box we encountered, the value revealed must be $0$. Therefore, under $\pi$ the value obtained will be $v$, by definition of $\mathcal{I}^{\prime}$. Observe that in this case $c(\pi)\leq c(\pi^{\prime})$, since the (at most) extra $v$ paid by $\pi$ for the value term has already been paid by the box cost in $\pi^{\prime}$ when box $(j,v)$ was opened.

  2.

    “Final” box $(j,v)$ is at an intermediate node of $\pi^{\prime}$: after $\pi$ opens box $j$, we copy the subtree of $\pi^{\prime}$ that follows the $0$ branch into the branch of $\pi$ that follows the $v$ branch. Also, we copy the subtree of $\pi^{\prime}$ that follows the $T+1$ branch into each branch of $\pi$ that has a value different from $v$ (the non-$v$ branches). The cost of this new subtree is $c_{j}$ instead of the original $c_{j}+v$. The $v$ branch may accrue an additional value term of at most $v$ (or smaller, if $j$ was not the smallest-value box on this branch), so in total the $v$ branch has cost at most its original cost.

    However, the non-$v$ branches have a $v$ term removed going down the tree. Specifically, since the feedback of $(j,v)$ down a non-$v$ branch was $T+1$, some other box with value $0$ has to be opened at some point, and this box is still available to be used as the final value of this branch later on (if the branch had already seen a $0$, it would have stopped). Thus, the cost of this subtree is at most its original cost, with one fewer “final” box opened.

Putting these cases together implies that c(π)c(π)c(\pi)\leq c(\pi^{\prime}).

Lastly, we argue that any branch of π\pi can be computed efficiently. To compute a branch for π\pi, we follow the corresponding branch of π\pi^{\prime}. As we go along this branch, we open box jj whenever π\pi^{\prime} opens box (j,v)(j,v) and remember the feedback. We use the feedback to know which boxes of π\pi^{\prime} to open in the future. Hence, we can compute a branch of π\pi from π\pi^{\prime} in polynomial time. ∎

We are now ready to give the proof of Theorem A.1.

Proof of Theorem A.1.

Suppose we have an $\alpha$-approximation algorithm for ${\textsc{PB}_{\leq T}^{c}}$. Given an instance $\mathcal{I}$ of PB, we construct the instance $\mathcal{I}^{\prime}$ for ${\textsc{PB}_{\leq T}^{c}}$ as described and then run the approximation algorithm on $\mathcal{I}^{\prime}$ to get a policy $\pi_{\mathcal{I^{\prime}}}$. Next, we prune the tree as described in Lemma A.3 to get a policy $\pi_{\mathcal{I}}$ of no larger cost. Our policy uses at most polynomially more time than the policy for ${\textsc{PB}_{\leq T}^{c}}$, since each branch of $\pi_{\mathcal{I}}$ can be computed in polynomial time from $\pi_{\mathcal{I^{\prime}}}$; hence, the runtime is polynomial in the size of $\mathcal{I}^{\prime}$. We also note that we added at most $mn$ “final” boxes to construct the new instance $\mathcal{I}^{\prime}$, so this algorithm runs in time polynomial in $m$ and $n$. Thus, by Lemmas A.2 and A.3, the cost of the constructed policy is

c(π)c(π)αc(π)2αc(π)c(\pi)\leq c(\pi^{\prime})\leq\alpha c(\pi^{*}_{\mathcal{I}^{\prime}})\leq 2\alpha c(\pi^{*}_{\mathcal{I}})

Hence, this algorithm is a $2\alpha$-approximation for PB. ∎

Appendix B Proofs from Section 4

See 4.1

Proof of Claim 4.1.

Let $\mathcal{I}$ be an instance of $\textsc{MSSC}_{f}$. We create an instance $\mathcal{I}^{\prime}$ of PB as follows: for every set $s_{j}$ of $\mathcal{I}$ that gives feedback $f_{ij}$ when element $e_{i}$ is selected, we create a scenario $s_{j}$ with the same probability whose value for box $i$ is either $0$ if $e_{i}\in s_{j}$, or $\infty_{f_{ij}}$ otherwise, where $\infty_{f_{ij}}$ denotes an extremely large value that is different for different values of the feedback $f_{ij}$. Observe that any solution to the PB instance gives a solution to the $\textsc{MSSC}_{f}$ instance at the same cost, and vice versa. ∎
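For concreteness, this mapping can be sketched as follows; a sketch under our own encoding assumptions, where $\infty_{f_{ij}}$ is represented by a huge sentinel offset by the feedback, so distinct feedbacks yield distinct huge values.

```python
def mssc_to_pb_values(sets, feedback, big=10**9):
    """Claim 4.1 sketch: set s_j -> scenario j, element e_i -> box i.
    The value of box i under scenario j is 0 when e_i covers s_j, and
    otherwise a huge value encoding the feedback f_{ij}."""
    m, n = len(sets), len(feedback[0])
    return [[0 if i in sets[j] else big + feedback[j][i] for i in range(n)]
            for j in range(m)]
```

Costs and set probabilities carry over unchanged, so the two instances have identical solution costs.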

See 4.3

Before formally proving this claim, recall the correspondence between scenarios and boxes of PB-type problems and elements and sets of MSSC-type problems. The idea of the reduction is to create $T$ copies of the set for each scenario in the initial ${\textsc{PB}_{\leq T}}$ instance and one element per box, where if the value a box gives for a scenario $i$ is $<T$, then the corresponding element belongs to all $T$ copies of the set $i$. The final step is to “simulate” the outside option $T$, for which we create $T$ elements where the $k$'th one belongs only to the $k$'th copy of each set.

Proof of Claim 4.3.

Given an instance $\mathcal{I}$ of ${\textsc{UPB}_{\leq T}}$ with outside option box $b_{T}$, we construct the instance $\mathcal{I}^{\prime}$ of $\textsc{UMSSC}_{f}$ as follows.

Construction of the instance.

For every scenario $s_{i}$ in the initial instance, we create $T$ sets denoted by $s_{ik}$, where $k\in[T]$. Each of these sets has equal probability $p_{ik}=1/(mT)$. We additionally create one element $e^{B}$ per box $B$, which belongs to every set $s_{ik}$ (for all $k$) iff $v_{Bi}<T$ in the initial instance; otherwise it gives feedback $v_{Bi}$. In order to simulate box $b_{T}$ without introducing an element with non-unit cost, we use a sequence of $T$ outside option elements $e^{T}_{k}$, where $e^{T}_{k}\in s_{ik}$ for all $i\in[m]$, i.e. element $e^{T}_{k}$ belongs to “copy $k$” of every set. (Observe that there are exactly $T$ possible options for $k$ for any set; choosing all these elements costs $T$ and covers all sets, thus simulating $b_{T}$.)

Construction of the policy.

We construct policy $\pi_{\mathcal{I}}$ by ignoring any outside option elements that $\pi_{\mathcal{I^{\prime}}}$ selects until $\pi_{\mathcal{I^{\prime}}}$ has chosen at least $T/2$ such elements, at which point $\pi_{\mathcal{I}}$ takes the outside option box $b_{T}$. To show feasibility, we need that for every scenario either $b_{T}$ is chosen or some box with $v_{ij}\leq T$. If $b_{T}$ is not chosen, then fewer than $T/2$ outside option elements were chosen, and therefore in the instance of $\textsc{UMSSC}_{f}$ some set copies have to be covered by another element $e^{B}$, corresponding to a box. This corresponding box, however, gives a value $\leq T$ in the initial ${\textsc{UPB}_{\leq T}}$ instance.

Approximation ratio.

Let $s_{i}$ be any scenario in $\mathcal{I}$. We distinguish between the following cases, depending on whether there are outside option elements on $s_{i}$'s branch.

  1.

    No outside option elements on $s_{i}$'s branch: scenario $s_{i}$ contributes equally to both policies, since the absence of outside option elements implies that all copies of scenario $s_{i}$ lie on the same branch (paying the same cost) in both $\pi_{\mathcal{I}^{\prime}}$ and $\pi_{\mathcal{I}}$.

  2.

    Some outside option elements on $s_{i}$'s branch: in this case, from Lemma B.1 we have that $c(\pi_{\mathcal{I}}(s_{i}))\leq 3c(\pi_{\mathcal{I}^{\prime}}(s_{i}))$.

Putting it all together we get

c(π)3c(π)2α(n+m,m2)c(π)3α(n+m,m2)c(π),c(\pi_{\mathcal{I}})\leq 3c(\pi_{\mathcal{I^{\prime}}})\leq 2\alpha(n+m,m^{2})c(\pi^{*}_{\mathcal{I^{\prime}}})\leq 3\alpha(n+m,m^{2})c(\pi^{*}_{\mathcal{I}}),

where the second inequality follows since we are given an $\alpha$-approximation, and the last inequality follows since an optimal policy for ${\textsc{UPB}_{\leq T}}$ translates to a feasible policy of the same cost for the instance $\mathcal{I^{\prime}}$ of $\textsc{UMSSC}_{f}$, hence $c(\pi^{*}_{\mathcal{I^{\prime}}})\leq c(\pi^{*}_{\mathcal{I}})$. We also used that $T\leq m$, since otherwise the initial policy would never take the outside option. ∎

Lemma B.1.

Let $\mathcal{I}$ be an instance of ${\textsc{UPB}_{\leq T}}$, and $\mathcal{I^{\prime}}$ the instance of $\textsc{UMSSC}_{f}$ constructed by the reduction of Claim 4.3. For a scenario $s_{i}$, if at least one outside option element is chosen on $s_{i}$'s branch of $\pi_{\mathcal{I^{\prime}}}$, then $c(\pi_{\mathcal{I}}(s_{i}))\leq 3c(\pi_{\mathcal{I^{\prime}}}(s_{i}))$.

Proof.

For the branch of scenario $s_{i}$, denote by $M$ the number of box elements chosen before $T/2$ outside option elements have been chosen, and by $N$ the number of outside option elements in $\pi_{\mathcal{I}^{\prime}}$. Note that the smallest cost is achieved if all the outside option elements are chosen first (since the outside option elements cause some copies to be isolated, and so can reduce their cost). The copies of scenario $s_{i}$ can be split into two groups: those that were isolated before $T/2$ outside option elements were chosen, and those that were isolated after. We distinguish between the following cases, based on the value of $N$.

  1.

    $N\geq T/2$: in this case each of the copies of $s_{i}$ isolated later pays at least $M+T/2$, for the initial box elements and the initial sequence of outside option elements. For the copies isolated earlier, we lower bound the cost by assuming all outside option elements are chosen first.

    The cost of all the copies in π\pi_{\mathcal{I}^{\prime}} then is at least

    \sum_{k=1}^{T/2}\frac{cp_{i}}{T}k+\sum_{k=T/2+1}^{T}\frac{cp_{i}}{T}(T/2+M)=cp_{i}\frac{\frac{T}{2}(\frac{T}{2}+1)}{2T}+cp_{i}\frac{\frac{T}{2}(T/2+M)}{T}
    \geq cp_{i}(3T/8+M/2)
    \geq\frac{3}{8}p_{i}(T+M)

    Since $N\geq T/2$, policy $\pi_{\mathcal{I}}$ takes the outside option box for $s_{i}$ immediately after choosing the $M$ initial boxes corresponding to the box elements. So, the total contribution of $s_{i}$ to the expected cost of $\pi_{\mathcal{I}}$ is at most $p_{i}(M+T)$ in this case. Hence, $s_{i}$'s contribution to $\pi_{\mathcal{I}}$ is at most $8/3\leq 3$ times $s_{i}$'s contribution to $\pi_{\mathcal{I}^{\prime}}$.

  2.

    $N<T/2$: policy $\pi_{\mathcal{I}}$ only selects the $M$ boxes (corresponding to box elements), and this was sufficient for finding a value less than $T$. The total contribution of $s_{i}$ to $c(\pi_{\mathcal{I}})$ is exactly $p_{i}M$. On the other hand, since $N<T/2$, we know that at least half of the copies pay $M$ for all of the box elements. The cost of all the copies is at least

    \sum_{k=N+1}^{T}\frac{cp_{i}}{T}M=cp_{i}\frac{T-N}{T}M\geq cp_{i}M/2,

    therefore the contribution of $s_{i}$ to $c(\pi_{\mathcal{I}^{\prime}})$ is at least $cp_{i}M/2$. Hence, we have $c(\pi_{\mathcal{I}})\leq 3c(\pi_{\mathcal{I}^{\prime}})$. ∎
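The case analysis above can be checked numerically; the sketch below is illustrative only, with $c=p_{i}=1$, even $T$, and a hypothetical helper `copy_cost_case1` mirroring the displayed sums for the case $N\geq T/2$.

```python
def copy_cost_case1(T, M, c=1.0, p=1.0):
    """Cost of s_i's copies in pi_{I'} when N >= T/2: copies isolated among
    the first T/2 outside option elements pay k each, the remaining T/2
    copies pay T/2 + M (mirrors the displayed sums)."""
    early = sum(c * p / T * k for k in range(1, T // 2 + 1))
    late = sum(c * p / T * (T / 2 + M) for _ in range(T // 2 + 1, T + 1))
    return early + late

# Sanity check: the (3/8) p (T+M) lower bound holds, and s_i's contribution
# p(M+T) in pi_I is within a factor 8/3 of the copies' cost in pi_{I'}.
for T in (10, 100):
    for M in (0, 5, 50):
        lb = copy_cost_case1(T, M)
        assert lb >= 3 / 8 * (T + M) - 1e-9
        assert (M + T) <= 8 / 3 * lb + 1e-9
```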

B.1 Proofs from subsection 4.1.1

See 4.9

Proof.

Consider a policy $\pi_{\mathcal{I}_{q}}$ which runs $\pi^{*}_{\mathcal{I}}$ on the instance $\mathcal{I}_{q}$, and for scenarios with cost $c_{s}\geq t_{q/(\beta\alpha)}$ aborts after spending this cost and chooses the outside option $T$. The cost of this policy is:

c(\pi^{*}_{\mathcal{I}_{q}})\leq c(\pi_{\mathcal{I}_{q}})=\frac{T+t_{q/(\beta\alpha)}}{\beta\alpha}+\sum_{\begin{subarray}{c}c_{s}\in[t_{q},t_{q/(\beta\alpha)}]\\ s\in\mathcal{S}\end{subarray}}c_{s}\frac{p_{s}}{q}. (3)

By our assumption on $T$, this cost is at most $2T/\beta\alpha$. On the other hand, since $\mathcal{A}_{T}$ is an $\alpha$-approximation to the optimum, the cost of the algorithm's solution is at most

αc(πq)2Tβ\alpha c(\pi^{*}_{\mathcal{I}_{q}})\leq\frac{2T}{\beta}

Since the expected cost of $\mathcal{A}_{T}$ is at most $2T/\beta$, using Markov's inequality we get that $\textbf{Pr}\left[c_{s}\geq T\right]\leq(2T/\beta)/T=2/\beta$. Therefore, $\mathcal{A}_{T}$ covers at least a $1-2/\beta$ fraction of the mass every time. ∎

See 4.8

Proof.

In every interval of the form $\mathcal{I}_{i}=[t_{q^{i}},t_{q^{i}/(\beta\alpha)}]$, the optimal policy for PB covers at least a $1/(\beta\alpha)$ fraction of the probability mass that remains. Since the costs in phase $i$ belong to the interval $\mathcal{I}_{i}$, the minimum possible cost that the optimal policy might pay is $t_{q^{i}}$, i.e. the lower end of the interval. Summing over all intervals, we get the lemma. ∎

B.2 Proofs from subsection 4.1.2

Input: Set of scenarios $\mathcal{S}$
1. Scale all probabilities by $c$ such that $c\sum_{s\in\mathcal{S}}p_{s}=1$
2. Let $p_{\text{min}}=\min_{s\in\mathcal{S}}p_{s}$
3. Construct $\mathcal{S}^{\prime}$ by creating $p_{s}/p_{\text{min}}$ copies of each $s\in\mathcal{S}$
4. Assign each copy probability $1/|\mathcal{S}^{\prime}|$
5. Return $\mathcal{S}^{\prime}$
Algorithm 3 Expand: rescales and returns an instance of UPB.
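The steps of Expand can be sketched in Python; a sketch assuming, as the algorithm implicitly does, rational probabilities so that every $p_{s}/p_{\text{min}}$ is an integer (exact arithmetic via the standard `fractions` module).

```python
from fractions import Fraction

def expand(probs):
    """Algorithm 3 (Expand) sketch: normalize probabilities, then replace
    scenario s by p_s / p_min copies, each of probability 1/|S'|."""
    total = sum(Fraction(p) for p in probs)
    scaled = [Fraction(p) / total for p in probs]   # step 1: rescale by c
    p_min = min(scaled)                             # step 2
    copies = []                                     # step 3: p_s/p_min copies
    for s, p in enumerate(scaled):
        copies.extend([s] * int(p / p_min))
    return copies, Fraction(1, len(copies))         # step 4: uniform mass
```

For example, probabilities proportional to $(1,2,1)$ yield four unit copies, each of probability $1/4$.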

See 4.4

Proof.

The proof in this case follows the steps of the proof of Theorem 4.5, so we only highlight the changes. The reduction proceeds as in Algorithm 1, with the only difference that we add two extra steps: (1) we initially remove all low probability scenarios (line 3; removing at most a $c$ fraction of the mass), and (2) we add them back after running ${\textsc{UPB}_{\leq T}}$ (line 8). The reduction is formally shown in Algorithm 2.

Calculating the thresholds.

For every phase $i$ we choose a threshold $T_{i}$ such that $T_{i}=\min\{T:\textbf{Pr}\left[c_{s}>T\right]\leq\delta\}$, i.e. at most a $\delta$ fraction of the probability mass of the scenarios is not covered, again using binary search as in Algorithm 1. We denote by $\operatorname{Int}_{i}=[t_{(1-c)(\delta+c)^{i}},t_{(1-c)(\delta+c)^{i}/(\beta\alpha)}]$ the relevant interval of costs at every run of the algorithm. Then, by Lemma 4.9, we know that for remaining total probability mass $(1-c)(\delta+c)^{i}$, any threshold which satisfies

Tit(1c)(δ+c)i1/βα+βαs𝒮csInticsps(1c)(δ+c)iT_{i}\geq t_{(1-c)(\delta+c)^{i-1}/\beta\alpha}+\beta\alpha\sum_{\begin{subarray}{c}s\in\mathcal{S}\\ c_{s}\in\operatorname{Int}_{i}\end{subarray}}c_{s}\frac{p_{s}}{(1-c)(\delta+c)^{i}}

also satisfies the desired covering property, i.e. at least (12/β)(1c)(δ+c)(1-2/\beta)(1-c)(\delta+c) mass of the current scenarios is covered. Therefore the threshold TiT_{i} found by our binary search satisfies

Ti=t(1c)(δ+c)i1/βα+βαs𝒮csInticsps(1c)(δ+c)i.T_{i}=t_{(1-c)(\delta+c)^{i-1}/\beta\alpha}+\beta\alpha\sum_{\begin{subarray}{c}s\in\mathcal{S}\\ c_{s}\in\operatorname{Int}_{i}\end{subarray}}c_{s}\frac{p_{s}}{(1-c)(\delta+c)^{i}}. (4)

Following the proof of Theorem 4.5, the steps “Constructing the final policy” and “Accounting for the values” remain exactly the same, as neither of them uses the fact that the scenarios are uniform.

Bounding the final cost.

Using the guarantee that at the end of every phase we cover (δ+c)(\delta+c) of the scenarios, observe that the algorithm for PBT{\textsc{PB}_{\leq T}} is run in an interval of the form Inti=[t(1c)(δ+c)i,t(1c)(δ+c)i/(βα)]\operatorname{Int}_{i}=[t_{(1-c)(\delta+c)^{i}},t_{(1-c)(\delta+c)^{i}/(\beta\alpha)}]. Note also that these intervals are overlapping. Bounding the cost of the final policy π\pi_{\mathcal{I}} for all intervals we get

c(\pi_{\mathcal{I}}) \displaystyle\leq\sum_{i=0}^{\infty}(1-c)(\delta+c)^{i}T_{i}
\displaystyle=\sum_{i=0}^{\infty}\left((1-c)(\delta+c)^{i}t_{(1-c)(\delta+c)^{i-1}/\beta\alpha}+\beta\alpha\sum_{\begin{subarray}{c}s\in\mathcal{S}\\ c_{s}\in\operatorname{Int}_{i}\end{subarray}}c_{s}p_{s}\right) From equation (4)
\displaystyle\leq 2\beta\alpha\,c(\pi^{*}_{\mathcal{I}})+\beta\alpha\sum_{i=0}^{\infty}\sum_{\begin{subarray}{c}s\in\mathcal{S}\\ c_{s}\in\operatorname{Int}_{i}\end{subarray}}c_{s}p_{s} Using Lemma 4.8
\displaystyle\leq 2\beta\alpha\log\alpha\cdot c(\pi^{*}_{\mathcal{I}}),

where the inequalities follow similarly to the proof of Theorem 4.5. Choosing c=δ=0.1c=\delta=0.1 and β=20\beta=20 we get the theorem. ∎

Appendix C Proofs from Section 5

See 5.1

Proof of Claim 5.1.

Let $\mathcal{I}$ be an instance of $\textsc{MSSC}_{f}$. We create an instance $\mathcal{I}^{\prime}$ of DT as follows: for every set $s_{j}$ we create a scenario $s_{j}$ with the same probability, and for every element $e_{i}$ we create a test $T_{e_{i}}$ with the same cost that reveals the scenario whenever the element belongs to the realized set, and otherwise returns only the element's feedback $f_{ij}$. Formally, the test $T_{e_{i}}$ under scenario $s_{j}$ returns

Tei(sj)={“The feedback is fijIf eisj“The scenario is jelse ,T_{e_{i}}(s_{j})=\begin{cases}\text{``The feedback is $f_{ij}$''}&\text{If }e_{i}\not\in s_{j}\\ \text{``The scenario is $j$''}&\text{else },\end{cases}

therefore the test isolates scenario jj when eisje_{i}\in s_{j}.
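The test construction can be sketched as follows; an illustrative sketch where `sets[j]` holds the elements of $s_{j}$ and `feedback[j][i]` the feedback $f_{ij}$, names of our own choosing.

```python
def make_test(i, sets, feedback):
    """Claim 5.1 sketch: the test T_{e_i} isolates scenario j when e_i
    belongs to s_j, and otherwise reveals only the feedback f_{ij}."""
    def run(j):
        if i in sets[j]:
            return ("scenario", j)           # "The scenario is j"
        return ("feedback", feedback[j][i])  # "The feedback is f_{ij}"
    return run
```

Running the test for element $e_{0}$ on two scenarios shows both branches: the covering scenario is isolated, the other only reveals its feedback.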

Constructing the policy.

Given a policy $\pi^{\prime}$ for the instance $\mathcal{I}^{\prime}$ of DT, we can construct a policy $\pi$ for $\mathcal{I}$ by selecting the element that corresponds to each test $\pi^{\prime}$ chooses. When $\pi^{\prime}$ finishes, all scenarios are identified, and for any scenario $s_{j}$ either (1) there is a test in $\pi^{\prime}$ that corresponds to an element in $s_{j}$ (in the instance $\mathcal{I}$), or (2) there is no such test, but we can pay an extra $\min_{i\in s_{j}}c_{i}$ to select the lowest cost element in this set (since the scenario is identified, we know exactly which element this is).

Observe also that any solution to the $\textsc{MSSC}_{f}$ instance $\mathcal{I}$ directly translates to a feasible solution for this instance of DT with the same cost: under each scenario, the first selected element that covers the corresponding set isolates it. Therefore

c(\pi^{*}_{\mathcal{I}^{\prime}})\leq c(\pi^{*}_{\mathcal{I}}) (5)
Bounding the cost of the policy.

As we described above the total cost of the policy is

c(π)\displaystyle c(\pi) c(π)+𝔼s𝒮[minisci]\displaystyle\leq c(\pi_{\mathcal{I}^{\prime}})+\mathbb{E}_{s\in\mathcal{S}}{\left[\min_{i\in s}c_{i}\right]}
c(π)+c(π)\displaystyle\leq c(\pi_{\mathcal{I}^{\prime}})+c(\pi^{*}_{\mathcal{I}})
a(n,m)c(π)+c(π)\displaystyle\leq a(n,m)c(\pi^{*}_{\mathcal{I}^{\prime}})+c(\pi^{*}_{\mathcal{I}})
\displaystyle\leq(1+a(n,m))c(\pi^{*}_{\mathcal{I}}),

where in the last inequality we used equation (5).

Note that this reduction does not change the probabilities of the scenarios; therefore, if we had started with uniform probabilities and had an oracle for UDT, we would still get an $(a(n,m)+1)$-approximation algorithm for $\textsc{UMSSC}_{f}$. ∎

In the reduction proof of Theorem 5.2, we use the following two lemmas, which show that the policy constructed for UDT via the reduction is feasible and has bounded cost.

Lemma C.1.

Given an instance \mathcal{I^{\prime}} of UDT and the corresponding instance \mathcal{I} of UMSSCf\textsc{UMSSC}_{f} in the reduction of Theorem 5.2, the policy π\pi_{\mathcal{I^{\prime}}} constructed for UDT is feasible.

Proof of Lemma C.1.

It suffices to show that every scenario is isolated. Fix a scenario sis_{i}. Observe that sis_{i}’s branch has chosen the isolating element EiE^{i} in the UMSSCf\textsc{UMSSC}_{f} solution, since that is the only element that belongs to set sis_{i}. Let SS be the set of scenarios just before EiE^{i} is chosen and note that by definition siSs_{i}\in S.

If |S|=1|S|=1, then since π\pi_{\mathcal{I}^{\prime}} runs tests giving the same branching behavior by definition of π\pi_{\mathcal{I}^{\prime}}, and sis_{i} is the only scenario left, we have that the branch of π\pi_{\mathcal{I}^{\prime}} isolates scenario sis_{i}.

If |S|>1|S|>1 then all scenarios/sets in S{si}S\setminus\{s_{i}\} are not covered by choosing element EiE^{i}, therefore they are covered at strictly deeper leaves in the tree. By induction on the depth of the tree, we can assume that each scenario sj(S{si})s_{j}\in\left(S\setminus\{s_{i}\}\right) is isolated in π\pi_{\mathcal{I}^{\prime}}. We distinguish the following cases based on when we encounter EiE^{i} among the isolating elements on sis_{i}’s branch.

  1.

    EiE^{i} was the first isolating element chosen on the branch: then policy π\pi_{\mathcal{I}^{\prime}} ignores element EiE^{i}. Since every leaf holds a unique scenario in S{si}S\setminus\{s_{i}\}, scenario sis_{i} follows some path of tests and is either isolated or ends up at a node that originally would have held only one scenario, as shown in Figure 6. Since there are only two scenarios at that node, policy π\pi_{\mathcal{I}^{\prime}} runs the cheapest test distinguishing sis_{i} from that scenario.

    Figure 6: Case 1: 𝒮\mathcal{S} is the set of scenarios remaining when EiE^{i} is chosen, sleafs_{\text{leaf}} is the scenario that sis_{i} ends up with.
  2.

    A different element EjE^{j} was chosen before EiE^{i}: by our construction, instead of ignoring EiE^{i} we now run the cheapest test that distinguishes sis_{i} from sjs_{j}, causing the two scenarios to go down separate branches, as shown in Figure 7. We apply the induction hypothesis to the scenarios in these sub-branches; therefore, both sis_{i} and sjs_{j} are either isolated or end up at a node with a single scenario, and are then distinguished by the last case of π\pi_{\mathcal{I}^{\prime}}’s construction.

    Figure 7: Case 2: run test Ti vs jT_{i\text{ vs }j} to distinguish sis_{i} and sjs_{j}. Sets 𝒮1\mathcal{S}_{1} and 𝒮2\mathcal{S}_{2} partition 𝒮\mathcal{S}

Hence, π\pi_{\mathcal{I}^{\prime}} is isolating for any scenario sis_{i}. Also, notice that any two scenarios that have isolating boxes on the same branch will end up in distinct subtrees of the lower node. ∎

Lemma C.2.

Given an instance \mathcal{I} of UMSSCf\textsc{UMSSC}_{f} and an instance \mathcal{I^{\prime}} of UDT, in the reduction of Theorem 5.2 it holds that

c(π)2c(π).c(\pi_{\mathcal{I^{\prime}}})\leq 2c(\pi_{\mathcal{I}}).
Proof of Lemma C.2.

Let sis_{i} be any scenario in 𝒮\mathcal{S}. We use induction on the number of isolating boxes along sis_{i}’s branch in \mathcal{I}^{\prime}. First, observe that EiE^{i} always appears on sis_{i}’s branch in any feasible solution to \mathcal{I}. We use c(Ej)c(E^{j}) and c(Tk)c(T_{k}) to denote the costs of box EjE^{j} and test TkT_{k}, for any k[n]k\in[n] and j[n+m]j\in[n+m].

  1.

    Only EiE^{i} is on the branch: since EiE^{i} is ignored, we end up with sis_{i} and some other not-yet-isolated scenario; let sleafs_{\text{leaf}} be that scenario. To isolate sis_{i} and sleafs_{\text{leaf}} we run the cheapest test that distinguishes between them. From the definition of the cost of EiE^{i} we know that c(Tsi vs sleaf)c(Ei)c(T_{s_{i}\text{ vs }s_{\text{leaf}}})\leq c(E^{i}). Additionally, since c(si)c(sleaf)c(s_{i})\leq c(s_{\text{leaf}}) and both sleafs_{\text{leaf}} and sis_{i} have probability 1/m1/m, overall we have c(π)2c(π)c(\pi_{\mathcal{I^{\prime}}})\leq 2c(\pi_{\mathcal{I}}). This is also shown in Figure 6.

  2.

    More than one isolating element is on the branch: similarly, observe that for any extra isolating element EjE^{j} we encounter, we substitute it with a test that distinguishes between sis_{i} and sjs_{j} and costs at most c(Ej)c(E^{j}). Given that c(si)c(sleaf)c(s_{i})\leq c(s_{\text{leaf}}) and the scenarios are uniform, we again have c(π)2c(π)c(\pi_{\mathcal{I^{\prime}}})\leq 2c(\pi_{\mathcal{I}}). ∎

We now restate and prove Theorem 5.2.

Proof of Theorem 5.2.

We begin by giving the construction of the policy in the reduction, and showing the final approximation ratio.

Constructing the policy.

Given a policy π\pi_{\mathcal{I}} for the instance \mathcal{I} of UMSSCf\textsc{UMSSC}_{f}, we construct a policy π\pi_{\mathcal{I}^{\prime}} for \mathcal{I}^{\prime}. For any test element BjB_{j} that π\pi_{\mathcal{I}} selects, π\pi_{\mathcal{I}^{\prime}} runs the equivalent test TjT_{j}. For the isolating elements EiE^{i} we distinguish the following cases.

  1.

    If π\pi_{\mathcal{I}} selects an isolating element EiE^{i} for the first time on the current branch, then π\pi_{\mathcal{I}^{\prime}} ignores this element but remembers the set/scenario sis_{i} to which EiE^{i} belonged.

  2.

    If π\pi_{\mathcal{I}} selects another isolating element EjE^{j} after some EiE^{i} on the branch, then π\pi_{\mathcal{I}^{\prime}} runs the minimum cost test that distinguishes scenario sjs_{j} from sks_{k} where EkE^{k} was the most recent isolating element chosen on this branch prior to EjE^{j}.

  3.

    If we are at the end of π\pi_{\mathcal{I}}, there can be at most 22 scenarios remaining on the branch, so π\pi_{\mathcal{I}^{\prime}} runs the minimum cost test that distinguishes these two scenarios.

By Lemma C.1, we have that the above policy is feasible for UDT.

Approximation ratio.

From Lemma C.2 we have that c(π)2c(π)c(\pi_{\mathcal{I^{\prime}}})\leq 2c(\pi_{\mathcal{I}}). For the optimal policies, we have that c(π)3c(π)c(\pi^{*}_{\mathcal{I}})\leq 3c(\pi^{*}_{\mathcal{I}^{\prime}}). This holds since, given an optimal solution to UDT, we can add an isolating element at every leaf to make it feasible for UMSSCf\textsc{UMSSC}_{f} while increasing the cost by a factor of at most 33 (for every two scenarios, the UDT solution must distinguish between them, but one of these scenarios is the max\max scenario from the definition of TjT_{j}, for which we pay less than TjT_{j}; hence c(π)c(\pi^{*}_{\mathcal{I}}) is at most the cost of this transformed UMSSCf\textsc{UMSSC}_{f} solution). Overall, if π\pi_{\mathcal{I}} is computed from an α(n,m)\alpha(n,m)-approximation for UMSSCf\textsc{UMSSC}_{f}, we have

c(π)2c(π)2α(n+m,m)c(π)6α(n+m,m)c(π)c(\pi_{\mathcal{I}^{\prime}})\leq 2c(\pi_{\mathcal{I}})\leq 2\alpha(n+m,m)c(\pi_{\mathcal{I}}^{*})\leq 6\alpha(n+m,m)c(\pi_{\mathcal{I}^{\prime}}^{*}). ∎

Appendix D Proofs from Section 6

We now restate and prove Lemma 6.3.

Proof.

Let s1,s2𝒮s_{1},s_{2}\in\mathcal{S} be any two scenarios in the instance of PB, and let viv_{i} be the value returned by opening the ii-th informative box, which has distributions 𝒟is1{\mathcal{D}_{is_{1}}} and 𝒟is2{\mathcal{D}_{is_{2}}} under scenarios s1s_{1} and s2s_{2} respectively. By the definition of informative boxes, for every such box opened there is a set of values vv for which Pr𝒟is1[v]Pr𝒟is2[v]\textbf{Pr}_{{\mathcal{D}_{is_{1}}}}\left[v\right]\geq\textbf{Pr}_{{\mathcal{D}_{is_{2}}}}\left[v\right] and a set for which the reverse holds. Denote these sets by Mis1M_{i}^{s_{1}} and Mis2M_{i}^{s_{2}} respectively. We also define the indicator variables Xis1=𝟙{viMis1}X_{i}^{s_{1}}=\mathbbm{1}{\{v_{i}\in M_{i}^{s_{1}}\}}. Define X¯=i[k]Xis1/k\overline{X}=\sum_{i\in[k]}X_{i}^{s_{1}}/k, and observe that 𝔼[X¯|s1]=i[k]Pr[Mis1]/k\mathbb{E}{\left[\overline{X}|s_{1}\right]}=\sum_{i\in[k]}\textbf{Pr}\left[M_{i}^{s_{1}}\right]/k. Since for every box we have an ε\varepsilon gap in TV distance between the scenarios s1,s2s_{1},s_{2}, we have that

|𝔼[X¯|s1]𝔼[X¯|s2]|ε,\left|\mathbb{E}{\left[\overline{X}|s_{1}\right]}-\mathbb{E}{\left[\overline{X}|s_{2}\right]}\right|\geq\varepsilon,

therefore if |X¯𝔼[X¯|s1]|ε/2\left|\overline{X}-\mathbb{E}{\left[\overline{X}|s_{1}\right]}\right|\leq\varepsilon/2 we conclude that scenario s2s_{2} is eliminated, otherwise we eliminate scenario s1s_{1}. The probability of error is Pr𝒟is1[X¯𝔼[X¯|s1]>ε/2]e2k(ε/2)2\textbf{Pr}_{{\mathcal{D}_{is_{1}}}}\left[\overline{X}-\mathbb{E}{\left[\overline{X}|s_{1}\right]}>\varepsilon/2\right]\leq e^{-2k(\varepsilon/2)^{2}}, where we used Hoeffding’s inequality since Xi{0,1}X_{i}\in\{0,1\}. Since we want the probability of error to be less than δ\delta, we need to open O(log1/δε2)O\left(\frac{\log 1/\delta}{\varepsilon^{2}}\right) informative boxes. ∎
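The sample-count calculation in the last step can be sketched as a minimal back-of-envelope helper (the function name is ours): solving e^{-2k(ε/2)²} ≤ δ for k gives k ≥ (2/ε²)·ln(1/δ).

```python
import math

def required_informative_boxes(eps: float, delta: float) -> int:
    """Smallest k with exp(-2k(eps/2)^2) <= delta; by Hoeffding's
    inequality, opening k informative boxes misclassifies the pair
    of scenarios with probability at most delta."""
    return math.ceil(2.0 * math.log(1.0 / delta) / eps ** 2)
```

For instance, eps = 0.1 and delta = 0.01 require 922 boxes, and shrinking delta further only adds logarithmically many more, matching the O(log(1/δ)/ε²) bound.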

Proof of Theorem 6.1.

We describe how to bound the final cost and calculate the runtime of the DP. Let L=m2/ε2log1/δL=m^{2}/\varepsilon^{2}\log 1/\delta; as we show below, to obtain a (1+β)(1+\beta)-approximation we set δ=βcminm2T\delta=\frac{\beta c_{\min}}{m^{2}T}.

Cost of the final solution.

Observe that the only case where the DP limits the search space is when |S|=1|S|=1. If the scenario is identified correctly, the DP finds the optimal solution by running the greedy order, every time choosing the box with the highest probability of a value below TT (when there is only one scenario, this is exactly Weitzman’s algorithm).

In order to eliminate all scenarios but one, we must eliminate all but one of the m2m^{2} pairs in the list EE. From Lemma 6.3 and a union bound over all m2m^{2} pairs, the probability that the last remaining scenario is the wrong one is at most m2δm^{2}\delta. Setting δ=βcmin/(m2T)\delta=\beta c_{\min}/(m^{2}T), the probability of error is at most βcmin/T\beta c_{\min}/T, in which case we pay at most TT, therefore incurring an extra additive βcminβc(π)\beta c_{\min}\leq\beta c(\pi^{*}) term.

Runtime.

The DP maintains a list MM of sets of informative boxes opened, together with the numbers of non-informative ones. Recall that MM has the form M=S1|x1|S2|x2||Sk|xkM=S_{1}|x_{1}|S_{2}|x_{2}|\ldots|S_{k}|x_{k}, where kLk\leq L by Lemma 6.3 and the fact that there are m2m^{2} pairs in EE. There are nn boxes in total and LL “positions” for them, therefore the size of the state space is (nL)=O(nL){n\choose L}=O(n^{L}). There is also an extra factor of nn for searching the list of informative boxes at every step of the recursion. Observe that the numbers of non-informative boxes add a factor of at most nn to the state space. The list EE adds another factor of at most nm2n^{m^{2}}, and the list SS a factor of 2m2^{m}, making the total runtime nO~(m2/ε2)n^{\tilde{O}(m^{2}/\varepsilon^{2})}. ∎
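The factor-by-factor count above can be tallied with a short, purely illustrative helper (names are ours; we work with logarithms to avoid huge integers, and upper-bound the binomial by nLn^{L}):

```python
import math

def log_num_states(n: int, m: int, L: float) -> float:
    """Natural log of the upper bound on the DP state space:
    n^L choices of informative-box positions, a factor n for the
    counts of non-informative boxes, n^(m^2) for the list E,
    and 2^m for the list S."""
    return (L * math.log(n)           # binom(n, L) <= n^L
            + math.log(n)             # non-informative box counts
            + m ** 2 * math.log(n)    # list E
            + m * math.log(2))        # list S
```

With L = Õ(m²/ε²), the dominant term is L·log n, which gives the claimed n^{Õ(m²/ε²)} total runtime.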

Appendix E Boxes with Non-Unit Costs: Revisiting our Results

In the original Pandora’s Box problem, denoted by PBc{\textsc{PB}^{c}}, each box ii has a different known cost ci>0c_{i}>0. Similarly we denote the non-unit cost version of both decision tree-like problems and Min Sum Set Cover-like problems by adding a superscript c to the problem name. Specifically, we now define DTc\textsc{DT}^{c}, UDTc\textsc{UDT}^{c}, MSSCfc\textsc{MSSC}_{f}^{c} and UMSSCfc\textsc{UMSSC}_{f}^{c}, where the tests (elements) have non-unit cost for the decision tree (min sum set cover) problems. We revisit our results and describe how our reductions change to incorporate non-unit cost boxes (summary in Figure 8).

Figure 8: Summary of all the reductions with non-unit costs. The only result that needs a changed proof is Corollary E.2 highlighted in bold (previously Theorem 5.2).

Note also that even though the known results for Optimal Decision Tree (e.g. [GB09, GNR17]) handle non-unit test costs, the currently known algorithms for Uniform Decision Tree do not. If, however, an algorithm for Uniform Decision Tree with non-unit costs exists, our reductions carry over and obtain the same approximation guarantees.

E.1 Connecting Pandora’s Box and MSSCf\textsc{MSSC}_{f}

Figure 9: Reductions shown in this section. The solid lines are part of Corollary E.1.

All the results of this section hold unchanged when all problem versions incorporate costs: we did not use the fact that the costs are unit in any of the proofs of Claims 4.1 and 4.3 or Lemmas 4.4 and 4.5. We formally restate the main theorem of Section 4 as the following corollary, where the only change is that it now holds for the cost versions of the problems.

Corollary E.1 (Pandora’s Box to MSSCf\textsc{MSSC}_{f} with non-unit costs).

If there exists an α(n,m)\alpha(n,m)-approximation algorithm for MSSCfc\textsc{MSSC}_{f}^{c}, then there exists an O(α(n+m,m2)logα(n+m,m2))O(\alpha(n+m,m^{2})\log\alpha(n+m,m^{2}))-approximation for PBc{\textsc{PB}^{c}}. The same result holds if the initial algorithm is for UMSSCfc\textsc{UMSSC}_{f}^{c}.

E.2 Connecting MSSCf\textsc{MSSC}_{f} and Optimal Decision Tree

In this section, the reduction of Theorem 5.2 uses the fact that the costs are uniform. However, we can easily circumvent this and obtain Corollary E.2. Using this, the results for the non-unit cost versions are summarized in Figure 10.

Figure 10: Summary of reductions for non unit cost boxes.
Corollary E.2 (Uniform Decision Tree with costs to UMSSCfc\textsc{UMSSC}_{f}^{c}).

Given an α(n,m)\alpha(n,m)-approximation algorithm for UMSSCfc\textsc{UMSSC}_{f}^{c}, there exists an O(α(n+m,m))O(\alpha(n+m,m))-approximation algorithm for UDTc\textsc{UDT}^{c}.

Proof.

The proof follows exactly the same way as the proof of Theorem 5.2, with one change: the cost of an isolating element is the minimum cost of a test needed to isolate sis_{i} from the scenario sks_{k} that maximizes this quantity. Formally, if c(i,k)=min{cj|Tj(i)Tj(k)}c(i,k)=\min\{c_{j}|T_{j}(i)\not=T_{j}(k)\}, then c(Ei)=maxk[m]c(i,k)c(E^{i})=\max_{k\in[m]}c(i,k). The reduction then follows the exact steps described in Appendix C. ∎
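The modified isolating-element cost can be sketched directly from the formula (illustrative code with our own data layout: `tests[j][s]` holds the outcome of test T_j on scenario s):

```python
def isolating_cost(i, costs, tests, num_scenarios):
    """c(E^i) = max_k c(i, k), where c(i, k) is the cost of the
    cheapest test whose outcome differs on scenarios i and k."""
    def c(k):
        return min(costs[j]
                   for j, t in enumerate(tests)
                   if t[i] != t[k])
    return max(c(k) for k in range(num_scenarios) if k != i)
```

The max over k ensures that, whenever the policy substitutes E^i with a test separating s_i from some other scenario, that test's cost is covered by c(E^i).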