Approximating Pandora’s Box with Correlations*
*This work was funded in part by NSF awards CCF-2225259 and CCF-2217069.
We revisit the classic Pandora’s Box (PB) problem under correlated distributions on the box values. Recent work of [CGT+20] obtained constant-factor approximation algorithms for a restricted class of policies that visit boxes in a fixed order. In this work, we study the complexity of approximating the optimal policy, which may adaptively choose which box to visit next based on the values seen so far.
Our main result establishes an approximation-preserving equivalence of PB to the well studied Uniform Decision Tree (UDT) problem from stochastic optimization and a variant of the Min-Sum Set Cover problem (MSSC_f). For distributions of support m, UDT admits an O(log m) approximation, and while a constant factor approximation in polynomial time is a long-standing open problem, constant factor approximations are achievable in subexponential time [LLM20]. Our main result implies that the same properties hold for PB and MSSC_f.
We also study the case where the distribution over values is given more succinctly as a mixture of m product distributions. This problem is again related to a noisy variant of Optimal Decision Tree, which is significantly more challenging. We give a constant-factor approximation that runs in time n^{Õ(m²/ε²)} when the mixture components on every box are either identical or separated by at least ε in TV distance.
1 Introduction
Many everyday tasks involve making decisions under uncertainty; for example, driving to work using the fastest route or buying a house at the best price. Although we don’t know how the future outcomes of our current decisions will turn out, we can often use some prior information to facilitate the decision making process. For example, having driven on the possible routes to work before, we know which is usually the busiest one. It is also common in such cases that we can remove part of the uncertainty by paying some additional cost. This type of online decision making in the presence of costly information can be modeled as the so-called Pandora’s Box problem, first formalized by Weitzman in [Wei79]. In this problem, the algorithm is given n alternatives called boxes, each containing a value from a known distribution. The exact value is not known, but can be revealed at a known opening cost specific to the box. The goal of the algorithm is to decide which box to open next and whether to select a value and stop, such that the total opening cost plus the minimum value revealed is minimized. In the case of independent distributions on the boxes’ values, this problem has a very elegant and simple optimal solution, as described by Weitzman [Wei79]: calculate an index for each box (a special case of the Gittins index [GJ74]), open the boxes in increasing order of index, and stop when the expected gain is worse than the value already obtained.
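To make the index rule concrete, here is a minimal Python sketch of Weitzman’s policy for the minimization variant with independent, discrete box distributions. The function names and the binary-search bounds are our own choices, not part of the original formulation.

```python
# Minimal sketch of Weitzman's index policy for the minimization variant of
# Pandora's Box with independent boxes. Function names and the binary-search
# bounds are our own choices, not part of the original formulation.
def reservation_value(cost, values, probs, lo=0.0, hi=1e9, iters=100):
    """Solve cost = E[(tau - V)^+] for the index tau; the left-hand side is
    nondecreasing in tau, so binary search applies."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        gain = sum(p * max(0.0, mid - v) for v, p in zip(values, probs))
        if gain < cost:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def run_policy(boxes, realized):
    """boxes: list of (cost, values, probs); realized[i]: actual value in box i.
    Open boxes in increasing index order; stop once the best value found so far
    is at most the next index. Returns opening cost plus value selected."""
    order = sorted(range(len(boxes)), key=lambda i: reservation_value(*boxes[i]))
    best, paid = float("inf"), 0.0
    for i in order:
        if best <= reservation_value(*boxes[i]):
            break
        paid += boxes[i][0]
        best = min(best, realized[i])
    return paid + best
```

For instance, a box of cost 0.5 containing 0 or 2 with equal probability has index 1, since E[(1 − V)^+] = 0.5.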
Weitzman’s model makes the crucial assumption that the distributions on the values are independent across boxes. This, however, is not always the case in practice and as it turns out, the simple algorithm of the independent case fails to find the optimal solution under correlated distributions. Generally, the complexity of the Pandora’s Box with correlations is not yet well understood. In this work we develop the first approximately-optimal policies for the Pandora’s Box problem with correlated values.
We consider two standard models of correlation in which the distribution over values can be specified explicitly in a succinct manner. In the first, the distribution over values has a small support of size m. In the second, the distribution is a mixture of m product distributions, each of which can be specified succinctly. We present approximations for both settings.
A primary challenge in approximating Pandora’s Box with correlations is that the optimal solution can be an adaptive policy that determines which box to open depending on the instantiations of values in all of the boxes opened previously. It is not clear that such a policy can even be described succinctly. Furthermore, the choice of which box to open is complicated by the need to balance two desiderata – finding a low value box quickly versus learning information about the values in unopened boxes (a.k.a. the state of the world or realized scenario) quickly. Indeed, the value contained in a box can provide the algorithm with crucial information about other boxes, and inform the choice of which box to open next; an aspect that is completely missing in the independent values setting studied by Weitzman.
Contribution 1: Connection to Decision Tree and a general purpose approximation.
Some aspects of the Pandora’s Box problem have been studied separately in other contexts. For example, in the Optimal Decision Tree problem (DT) [GB09, LLM20], the goal is to identify an unknown hypothesis, out of m possible ones, by performing a sequence of costly tests whose outcomes depend on the realized hypothesis. This problem has an informational structure similar to that of Pandora’s Box. In particular, we can think of every possible joint instantiation of values in boxes as a possible hypothesis, and every opening of a box as a test. The difference between the two problems is that while in Optimal Decision Tree we want to identify the realized hypothesis exactly, in Pandora’s Box it suffices to terminate the process as soon as we have found a low value box.
Another closely related problem is the Min Sum Set Cover [FLT04], where boxes only have two kinds of values – acceptable or unacceptable – and the goal is to find an acceptable value as quickly as possible. A primary difference relative to Pandora’s Box is that unacceptable boxes provide no further information about the values in unopened boxes.
One of the main contributions of our work is to unearth connections between Pandora’s Box and the two problems described above. We show that Pandora’s Box is essentially equivalent to a special case of Optimal Decision Tree (called Uniform Decision Tree, or UDT) where the underlying distribution over hypotheses is uniform – the approximation ratios of these two problems are related within log-log factors. Surprisingly, in contrast, the non-uniform DT appears to be harder than non-uniform Pandora’s Box. We relate these two problems by showing that both are in turn related to a new version of Min Sum Set Cover that we call Min Sum Set Cover with Feedback (MSSC_f). These connections are summarized in Figure 1. We can thus build on the rich history and large collection of results on these problems to offer efficient algorithms for Pandora’s Box. We obtain a polynomial time Õ(log m) approximation for Pandora’s Box, where m is the number of distinct value vectors (a.k.a. scenarios) that may arise, as well as constant factor approximations in subexponential time.
It is an important open question whether constant factor approximations exist for Uniform Decision Tree: the best known lower bound on the approximation ratio is a small constant, while it is known that obtaining super-constant approximations is not NP-hard, assuming the Exponential Time Hypothesis. The same properties transfer to Pandora’s Box and Min Sum Set Cover with Feedback: pinning down the tight approximation ratio for any one of these problems directly answers the question for every other problem in the equivalence class we establish.
The key technical component in our reductions is to find an appropriate stopping rule for Pandora’s Box: after opening a few boxes, how should the algorithm determine whether a small enough value has been found or whether further exploration is necessary? We develop an iterative algorithm that in each phase finds an appropriate threshold, with the exploration terminating as soon as a value smaller than the threshold is found, such that there is a constant probability of stopping in each phase. Within each phase then the exploration problem can be solved via a reduction to UDT. The challenge is in defining the stopping thresholds in a manner that allows us to relate the algorithm’s total cost to that of the optimal policy.
Contribution 2: Approximation for the mixture of distributions model.
Having established the general purpose reductions between Pandora’s Box and DT, we turn to the mixture of product distributions model of correlation. This special case of Pandora’s Box interpolates between Weitzman’s independent values setting and the fully general correlated values setting. In this setting, we use the term “scenario” to denote the different product distributions in the mixture. The information gathering component of the problem is now about determining which product distribution in the mixture the box values are realized from. Once the algorithm has determined the realized scenario (a.k.a. product distribution), the remaining problem amounts to implementing Weitzman’s strategy for that scenario.
We observe that this model of correlation for Pandora’s Box is related to the noisy version of DT, where the results of some tests for a given realized hypothesis are not deterministic. One challenge for DT in this setting is that any individual test may give us very little information distinguishing different scenarios, and one needs to combine information across sequences of many tests in order to isolate scenarios. This challenge is inherited by Pandora’s Box.
Previous work on noisy DT obtained algorithms whose approximations and runtimes depend on the amount of noise. In contrast, we consider settings where the level of noise is arbitrary, but where the mixtures satisfy a separability assumption. In particular, we assume that for any given box, if we consider the marginal distributions of the value in the box under different scenarios, these distributions are either identical or sufficiently different (e.g., at least ε apart in TV distance) across different scenarios. Under this assumption, we design a constant-factor approximation for Pandora’s Box that runs in time n^{Õ(m²/ε²)} (Theorem 6.1), where n is the number of boxes and m the number of mixture components. The formal result and the algorithm are presented in Section 6.
1.1 Related work
The Pandora’s Box problem was first introduced by Weitzman in the Economics literature [Wei79]. Since then, there has been a long line of research studying Pandora’s Box and its many variants: non-obligatory inspection [Dov18, BK19, BC22, FLL22], order constraints [HAKS13, BFLL20], correlation [CGT+20, GT23], combinatorial costs [BEFF23], competitive information design [DFH+23], a delegated version [BDP22], and an online setting [EHLM19]. Multiple works also study the generalized setting where more information can be obtained for a price [CFG+00, GK01, CJK+15, CHKK15], as well as settings with more complex combinatorial constraints [Sin18, GGM06, GN13, ASW16, GNS16, GNS17, GJSS19].
Chawla et al. [CGT+20] were the first to study Pandora’s Box with correlated values, but they designed approximations relative to a simpler benchmark, namely the optimal performance achievable using a so-called Partially Adaptive strategy that cannot adapt the order in which it opens boxes to the values revealed. In general, optimal strategies can decide both the ordering of the boxes and the stopping time based on the values revealed. [CGT+20] designed an algorithm with performance no more than a constant factor worse than the optimal Partially Adaptive strategy.
The line of work on Min Sum Set Cover was initiated by [FLT04] and continued with improvements and generalizations to more complex constraints [AGY09, MBMW05, BGK10, SW11].
Optimal Decision Tree is an old problem that has been studied in a variety of settings ([PKSR02, PD92, GB09, GKR10]); its most notable application is in active learning. It was proven to be NP-hard by Hyafil and Rivest [HR76]. Since then, finding the best approximation algorithm has been an active question [GG74, Lov85, KPB99, Das04, CPR+11, CPRS09, GB09, GNR17, CJLM10, AH12]; eventually a greedy O(log m)-approximation for the general case was given by [GB09], and this approximation ratio is proven to be the best possible [CPR+11]. For the case of Uniform Decision Tree less is known: until recently the best algorithm was the same as for Optimal Decision Tree, and the best lower bound was a small constant [CPR+11]. The recent work of Li et al. [LLM20] showed that there is an algorithm strictly better than O(log m) for Uniform Decision Tree.
The noisy version of Optimal Decision Tree was first studied in [GKR10] (this result relied on a result from [GK11] that turned out to be wrong [NS17]; the corrected results are presented in [GK17]), which gave an algorithm whose runtime depends exponentially on the number of noisy outcomes. Subsequently, Jia et al. [JNNR19] gave an approximation algorithm whose ratio depends logarithmically on m and on the maximum number of different test results per test and per scenario, using a reduction to the Adaptive Submodular Ranking problem [KNN17]. In the case of a large number of noisy outcomes they obtain an O(log m) approximation, exploiting the connection to Stochastic Set Cover [LPRY08, INvdZ16].
2 Preliminaries
In this paper we study the connections between three different sequential decision making problems – Optimal Decision Tree, Pandora’s Box, and Min Sum Set Cover. We describe these problems formally below.
Optimal Decision Tree
In the Optimal Decision Tree problem (denoted DT) we are given a set of m scenarios S = {s_1, …, s_m}, each occurring with a (known) probability p_i; and n tests T = {T_1, …, T_n}, each with a cost c_j. Nature picks a scenario from the distribution (p_1, …, p_m), but this scenario is unknown to the algorithm. The goal of the algorithm is to determine which scenario is realized by running a subset of the tests. When test T_j is run and the realized scenario is s, the test returns a result r_j(s).
Output.
The output of the algorithm is a decision tree in which each node specifies a test to be performed, and the branches out of the node correspond to the possible outcomes of the test. Each leaf is labeled with the unique scenario consistent with the results of the tests on the path from the root to that leaf. Observe that there is a single leaf corresponding to each scenario s. We can represent the tree as an adaptive policy, defined as follows:
Definition 2.1 (Adaptive Policy π).
An adaptive policy π is a function that, given the set of tests performed so far and their results, returns the next test to be performed.
Objective.
For a given decision tree or policy π, let c(π, s) denote the total cost of all of the tests on the unique path in the tree from the root to the leaf labeled with scenario s. The objective of the algorithm is to find a policy that minimizes the average cost Σ_s p_s · c(π, s).
We use the term Uniform Decision Tree (UDT) to denote the special case of the problem where p_i = 1/m for all scenarios.
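As an illustration of Definition 2.1 and the DT objective, the following sketch evaluates the expected identification cost of an adaptive policy. The tabular encoding (results[s][t] is the outcome of test t under scenario s, with unit test costs) is our own illustration.

```python
# Evaluating an adaptive DT policy on an explicit instance: results[s][t] is
# the (deterministic) outcome of test t under scenario s; unit test costs.
# This tabular encoding is our own illustration.
def identify_cost(policy, results, scenario):
    """Run the policy until only one scenario is consistent with everything
    observed; return the number of tests performed."""
    observed = {}  # test index -> result seen so far
    cost = 0
    while True:
        consistent = [s for s in range(len(results))
                      if all(results[s][t] == r for t, r in observed.items())]
        if len(consistent) == 1:
            return cost
        t = policy(observed)
        observed[t] = results[scenario][t]
        cost += 1

def expected_cost(policy, results, probs):
    return sum(p * identify_cost(policy, results, s) for s, p in enumerate(probs))
```

With four scenarios and two binary tests, any policy that runs both tests identifies the scenario at cost 2.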
Pandora’s Box
In the Pandora’s Box problem we are given n boxes, each box b with an opening cost c_b and a value v_b. The values are distributed according to a known distribution D; we allow D to be an arbitrary correlated distribution over value vectors in R^n_{≥0}. We call vectors of values scenarios and use s to denote a possible realization of the scenario. As in DT, nature picks a scenario from the distribution D, and this realization is a priori unknown to the algorithm. The goal of the algorithm is to pick a box of small value. The algorithm can observe the values realized in the boxes by opening any box at its respective cost c_b.
Output.
The output of the algorithm is an adaptive policy σ for opening boxes, together with a stopping condition. The policy takes as input a subset of the boxes and their associated values, and either returns the index of a box to be opened next or stops and selects the minimum value seen so far. That is, σ maps the opened boxes and their values to an element of [n] ∪ {⊥}, where ⊥ denotes stopping.
Objective.
For a given policy σ, let P_s denote the set of boxes opened by the policy prior to stopping when the realized scenario is s. The objective of the algorithm is to minimize the expected cost of the boxes opened plus the minimum value discovered, where the expectation is taken over all possible realizations of the values in each box. (In the original version of the problem studied by Weitzman [Wei79] the values are independent across boxes, and the goal is to maximize the value collected minus the costs paid, in contrast to the minimization version we study here.) Formally, the objective is given by

min_σ E_{s∼D} [ Σ_{b∈P_s} c_b + min_{b∈P_s} v_{sb} ].
For simplicity of presentation, from now on we assume that c_b = 1 for all boxes, but we show in Section E how to adapt our results to handle non-unit costs without any loss in the approximation factors.
We use UPB to denote the special case of the problem where the distribution D is uniform over the scenarios in its support.
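The objective above can be evaluated mechanically for an explicit instance. The sketch below (our notation: unit opening costs, values[s][b] for the value of box b in scenario s, and STOP standing in for ⊥) computes the opening cost plus the minimum observed value for an adaptive policy.

```python
# Evaluating the PB objective for an adaptive policy on an explicit instance
# with unit opening costs; values[s][b] is the value of box b in scenario s.
# STOP plays the role of the stopping symbol; the encoding is ours.
STOP = None

def pb_cost(policy, values, scenario):
    """Opening cost (= number of boxes opened) plus minimum value selected."""
    opened = {}  # box index -> observed value
    while True:
        b = policy(opened)
        if b is STOP:
            return len(opened) + min(opened.values())
        opened[b] = values[scenario][b]

def pb_objective(policy, values, probs):
    return sum(p * pb_cost(policy, values, s) for s, p in enumerate(probs))
```

Note how the value observed in the first box can inform whether a second box is opened at all, which is exactly the adaptivity discussed above.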
Min Sum Set Cover with Feedback
In Min Sum Set Cover, we are given n elements and a collection of m sets over them, together with a distribution over the sets. The output of the algorithm is an ordering π over the elements. The cost of the ordering for a particular set s is the index of the first element in the ordering that belongs to s, that is, cov(π, s) = min{ i : π(i) ∈ s }. The goal of the algorithm is to minimize the expected cost E_s[cov(π, s)].
We define a variant of the Min Sum Set Cover problem, called Min Sum Set Cover with Feedback (MSSC_f). As in the original problem, we are given a set of n elements, a collection of m sets and a distribution over the sets. Nature instantiates a set s from the distribution; the realization is unknown to the algorithm. Furthermore, in this variant, each element provides feedback to the algorithm when the algorithm "visits" the element; this feedback takes on the value f_{es} for element e if the realized set is s.
Output.
The algorithm once again produces an ordering over the elements. Observe that the feedback allows the algorithm to adapt its ordering to previously observed values. Accordingly, π is an adaptive policy that maps a subset of the elements and their associated feedback to the index of the next element to visit. That is, π maps (visited elements, feedback) to an element of [n].
Objective.
As before, the cost of the (now adaptive) ordering π for a particular set s is the index of the first element in the ordering that belongs to s, that is, cov(π, s) = min{ i : π(i) ∈ s }. The goal of the algorithm is to minimize the expected cost E_s[cov(π, s)].
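A toy instance makes the role of feedback concrete: below, element 0 covers no set, but its feedback identifies the realized set, letting an adaptive policy jump straight to a covering element. The representation (sets as Python sets, feedback[e][s]) is our own.

```python
# Toy MSSC_f instance: element 0 covers no set, but its feedback identifies the
# realized set, so an adaptive policy can go straight to a covering element.
# The representation (sets as Python sets, feedback[e][s]) is our own.
def mssc_f_cost(policy, sets, feedback, s):
    """Visit elements adaptively; return the 1-based position at which the
    realized set s is first covered."""
    seen = {}
    for step in range(1, len(feedback) + 1):
        e = policy(seen)
        if e in sets[s]:
            return step
        seen[e] = feedback[e][s]
    raise ValueError("no element covers the realized set")
```

In the test below the adaptive policy covers either realized set at position 2, whereas any fixed ordering of the three elements covers one of the two sets only at position 3.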
Commonalities and notation
As the reader may have observed, we capture the commonalities between the different problems through the use of similar notation. Scenarios in DT correspond to value vectors in PB and to sets in MSSC_f; all are denoted by s, lie in a set S of size m, and are drawn by nature from a known joint distribution. Tests in DT correspond to boxes in PB and elements in MSSC_f; we index each by i ∈ [n]. The algorithm for each problem produces an adaptive ordering over these tests/boxes/elements. Test outcomes in DT correspond to box values in PB and feedback in MSSC_f. We will use the terminology and notation across the different problems interchangeably in the rest of the paper.
2.1 Modeling Correlation
In this work we study two general ways of modeling the correlation between the values in the boxes.
Explicit Distributions.
In this case, D is a distribution over m scenarios, where the i’th scenario is realized with probability p_i, for i ∈ [m]. Every scenario corresponds to a fixed and known vector of values contained in the boxes. Specifically, box b has value v_{ib} for scenario i.
Mixture of Distributions.
We also consider a more general setting where D is a mixture of m product distributions. Specifically, each scenario i is a product distribution: instead of containing a deterministic value for every box b, the value of box b is drawn from a distribution D_{ib}. This setting generalizes the explicit distributions setting described above.
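One possible encoding of the mixture model, and the draw that nature performs in it, can be sketched as follows; the representation (component weights w[i] and per-box discrete marginals D[i][b]) is our own.

```python
import random

# One possible encoding of the mixture model: component i has weight w[i], and
# D[i][b] = (values, probs) is the marginal distribution of box b under
# scenario i. The representation is ours.
def sample_values(w, D, rng):
    i = rng.choices(range(len(w)), weights=w)[0]   # nature draws a scenario
    return i, [rng.choices(vals, weights=ps)[0] for vals, ps in D[i]]
```

The algorithm observes only the sampled value vector, box by box as it opens boxes, and never the component index i.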
3 Roadmap of the Reductions and Implications
In Figure 2, we give an overview of the main technical reductions shown in Sections 4 and 5. An arrow from problem A to problem B means that we gave an approximation-preserving reduction from A to B. Therefore an algorithm for B that achieves approximation ratio α also gives an algorithm for A with approximation ratio O(α) (or Õ(α) in the case of black dashed lines). For the exact guarantees we refer to the formal statements of the respective theorems. The gray lines denote less important claims or trivial reductions (e.g., MSSC_f being a special case of PB).
3.1 Approximating Pandora’s Box
Given our reductions, and using the best known results for Uniform Decision Tree from [LLM20], we immediately obtain efficient approximation algorithms for Pandora’s Box. We restate the results of [LLM20] below.
Theorem 3.1 (Theorems 3.1 and 3.2 from [LLM20]).
•
There is an O(log m / log OPT)-approximation algorithm for UDT that runs in polynomial time, where OPT is the cost of the optimal solution of the UDT instance.
•
There is an O(1/α)-approximation algorithm for UDT that runs in subexponential time m^{O(m^α)} · poly(n), for any constant α ∈ (0, 1).
Using the results of Theorem 3.1 combined with Theorem 4.2 and Claim 5.1 we get the following corollary.
Corollary 3.1.
From the best-known results for UDT, we have that:
•
There is an Õ(log m)-approximation algorithm for PB that runs in polynomial time. (If additionally the number of possible outcomes per box is a constant k, this gives an Õ(log m / log log m) approximation without losing an extra logarithmic factor, since OPT ≥ log_k m, as observed by [LLM20].)
•
There is an Õ(1/α)-approximation algorithm for PB that runs in subexponential time m^{O(m^α)} · poly(n), for any constant α ∈ (0, 1).
An immediate implication of the above corollary is that it is not NP-hard to obtain a superconstant approximation for PB, formally stated below.
Corollary 3.2.
It is not NP-hard to achieve any superconstant approximation for PB assuming the Exponential Time Hypothesis.
Observe that the logarithmic approximation achieved in Corollary 3.1 loses an extra logarithmic factor (hence the Õ), as it relies on the more complex reduction of Theorem 4.2. If we instead use the more direct reduction of Theorem A.1 to the Optimal Decision Tree problem in which tests have non-unit costs (which also admits an O(log m)-approximation [GNR17, KNN17]), we get the following corollary.
Corollary 3.3.
There exists an efficient algorithm that is O(log m)-approximate for Pandora’s Box, with or without unit-cost boxes.
3.2 Constant approximation for Partially Adaptive PB
Moving on, we show how our reduction can be used to recover and improve the results of [CGT+20]. Recall that [CGT+20] presented a constant factor approximation algorithm against a Partially Adaptive benchmark, in which the order of opening boxes must be fixed up front.
In such a case, the reduction of Section 4 can be used to reduce PB to the standard Min Sum Set Cover (i.e. without feedback), which admits a 4-approximation [FLT04].
Corollary 3.4.
There exists a polynomial time algorithm for PB that is O(1)-competitive against the partially adaptive benchmark.
The same result applies even in the case of non-uniform opening costs, since a 4-approximate algorithm for Min Sum Set Cover is known even when elements have arbitrary costs [MBMW05]. The case of non-uniform opening costs has also been considered for Pandora’s Box by [CGT+20], but their algorithm only handles polynomially bounded opening costs.
4 Connecting Pandora’s Box and
In this section we establish the connection between Pandora’s Box and Min Sum Set Cover with Feedback. We show that the two problems are equivalent up to logarithmic factors in approximation ratio.
One direction of this equivalence is in fact easy to see: Min Sum Set Cover with Feedback is a special case of Pandora’s Box. Note that in both problems we examine boxes/elements in an adaptive order. In PB we stop when we find a sufficiently small value; in MSSC_f we stop when we find an element that belongs to the instantiated scenario. To establish a formal connection, given an instance of MSSC_f, we can define the "value" of each element e in scenario s as 0 if the element belongs to the set s, and as M + f_{es} otherwise for some sufficiently large value M, where f_{es} is the feedback of element e for set s. This places the instance within the framework of PB, and a PB algorithm can be used to solve it. We formally describe this reduction in Section B of the Appendix.
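The value assignment described above can be sketched in a few lines; taking the large value to be M plus the feedback, with M exceeding any cost a reasonable policy could pay, is one concrete way to keep the feedback visible to a PB algorithm.

```python
# Sketch of the MSSC_f -> PB direction: membership becomes value 0, while a
# non-member element gets a prohibitively large value that still encodes its
# feedback. Taking the large value as M + feedback is one concrete choice.
def to_pb_values(n_elems, sets, feedback, M):
    """feedback[e][s] is the feedback of element e under set s; returns
    values[s][e] for the resulting PB instance."""
    return [[0 if e in sets[s] else M + feedback[e][s]
             for e in range(n_elems)]
            for s in range(len(sets))]
```

A PB policy on the resulting instance stops exactly when it finds a value of 0, i.e., when the realized set is covered.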
Claim 4.1.
If there exists an α-approximation algorithm for PB, then there exists an O(α)-approximation for MSSC_f.
The more interesting direction is a reduction from PB to MSSC_f. In fact, we show that a general instance of PB can be reduced to the simpler uniform version of Min Sum Set Cover with Feedback (UMSSC_f). We devote the rest of this section to proving the following theorem.
Theorem 4.2 (Pandora’s Box to UMSSC_f).
If there exists an α-approximation algorithm for UMSSC_f, then there exists an Õ(α)-approximation for PB.
Guessing a stopping rule and an intermediate problem
The feedback structure in PB and MSSC_f is quite similar; the main component in which the two problems differ is the stopping condition. In MSSC_f, an algorithm can stop examining elements as soon as it finds one that "covers" the realized set. In PB, when the algorithm observes a value in a box, it is not immediately apparent whether the value is small enough to stop or whether the algorithm should probe further, especially if the scenario is not fully identified. The key to relating the two problems is to "guess" an appropriate stopping condition for PB, namely an appropriate threshold such that as soon as the algorithm observes a value smaller than this threshold, it stops. In that case we say that the realized scenario is "covered".
To formalize this approach, we introduce an intermediate problem called Pandora’s Box with a costly outside option (also called the threshold version), denoted by PB_T. In this version the objective is to minimize the cost of finding a value of at most T, while we have the extra option to quit searching by opening an outside option box of cost T. We say that a scenario is covered in a given run of the algorithm if the algorithm does not choose the outside option box.
We show that Pandora’s Box can be reduced to PB_T with a logarithmic loss in the approximation factor, and that PB_T can in turn be reduced to Min Sum Set Cover with Feedback with a constant factor loss. The following two results capture the details of these reductions.
Claim 4.3.
If there exists an α-approximation algorithm for MSSC_f, then there exists an O(α)-approximation for PB_T.
It is also worth noting that PB_T is a special case of the Adaptive Submodular Ranking problem, which directly implies an O(log m) approximation factor (given in [KNN17]).
Main Lemma 4.4.
Given a polynomial-time α-approximation algorithm for UPB_T, there exists a polynomial-time Õ(α)-approximation for PB.
The relationship between PB_T and Min Sum Set Cover with Feedback is relatively straightforward and requires explicitly relating the structure of feedback in the two problems. We describe the details in Section B of the Appendix.
Putting it all together.
The proof of Theorem 4.2 follows by combining Claim 4.3 with Lemmas 4.5 and 4.4, presented in the following sections. The proofs of Claims 4.1 and 4.3 are deferred to Section B of the Appendix. The rest of this section is devoted to proving Lemmas 4.5 and 4.4. The landscape of reductions shown in this section is presented in Figure 3.
4.1 Reducing Pandora’s Box to PB_T
Recall that a solution to Pandora’s Box involves two components: (1) the order in which to open boxes, and (2) a stopping rule. The goal of the reduction to PB_T is to simplify the stopping rule of the problem by making values effectively binary (below or above the threshold), allowing us to focus on the order in which boxes are opened rather than on which value to stop at. We start by presenting our main tool, a reduction from PB to PB_T, in Section 4.1.1, and then improve upon it to reduce to the uniform version UPB_T (Section 4.1.2).
4.1.1 Main Tool
The high-level idea of this reduction is to repeatedly run the algorithm for PB_T with increasingly larger values of the threshold T, with the goal of covering a constant fraction of the probability mass of the scenarios at every step. The thresholds for every run have to be carefully chosen to guarantee that enough mass is covered in every run. The distributions on the boxes remain the same, and this reduction does not increase the number of boxes, thereby avoiding the issues faced by the naive reduction given in Section A of the Appendix. Formally, we show the following lemma.
Main Lemma 4.5.
Given a polynomial-time α-approximation algorithm for PB_T, there exists a polynomial-time Õ(α)-approximation for PB.
We will now analyze the policy produced by this algorithm.
Proof of Main Lemma 4.5.
We start with some notation. Given an instance of PB, we repeatedly run PB_T in phases. Phase i consists of running PB_T with threshold T_i on a sub-instance of the original problem in which we are left with a smaller set of scenarios, with their probabilities reweighted to sum to 1. Call this set of scenarios S_i and the corresponding instance I_i. After every phase i, we remove the probability mass that was covered (recall that a scenario is covered if it does not choose the outside option box), and run PB_T on this new instance with a new threshold T_{i+1}. In each phase, the boxes, costs and values remain the same, but the stopping condition changes: the thresholds increase in every subsequent phase. The thresholds are chosen such that at the end of each phase, a constant fraction of the remaining probability mass is covered. The reduction process is formally shown in Algorithm 1.
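The phase structure can be illustrated with a runnable toy rendition. Here naive_pbt is a deliberately crude stand-in for the assumed PB_T oracle, and covering half of the remaining mass per phase is an illustrative choice; this sketch is our own, not the paper’s Algorithm 1.

```python
# Runnable toy rendition of the phased reduction; naive_pbt is a crude
# stand-in for the assumed PB_T oracle, and covering half the remaining mass
# per phase is an illustrative choice. This is not the paper's Algorithm 1.
def naive_pbt(row, T):
    """Open boxes left to right until a value <= T is found or the probing
    budget T is exhausted; return (#boxes opened, covered?)."""
    for k, v in enumerate(row, start=1):
        if v <= T:
            return k, True
        if k >= T:
            return k, False
    return len(row), False

def phased_run(values, probs):
    """Double the threshold until at least half the remaining mass is covered,
    charge that phase, then recurse on the uncovered scenarios."""
    total, T = 0.0, 1.0
    remaining = list(range(len(values)))
    while remaining:
        mass = sum(probs[s] for s in remaining)
        covered = [s for s in remaining if naive_pbt(values[s], T)[1]]
        if sum(probs[s] for s in covered) < mass / 2:
            T *= 2          # threshold too small for this phase; grow it
            continue
        total += mass * T   # crude surrogate for the phase's probing + value cost
        remaining = [s for s in remaining if s not in covered]
    return total, T
```

The instance in the test needs two phases: the cheap scenario is covered at threshold 1, and the remaining one only once the threshold doubles.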
Accounting for the cost of the policy.
We first note that the total cost of the policy in phase i, conditioned on reaching that phase, is at most O(T_i): if the policy terminates in that phase, it selects a box with value at most T_i, and it incurs probing cost at most T_i in the phase. We can therefore bound the total cost of the policy by the sum, over the phases, of the probability of reaching phase i times O(T_i).
We will now relate the thresholds to the cost of the optimal PB policy for I. To this end, we define corresponding thresholds for the optimal policy that we call ε-thresholds. Let OPT denote the optimal PB policy for I and let OPT(s) denote the cost incurred by OPT when scenario s is realized. An ε-threshold is the minimum possible threshold such that at most an ε mass of the scenarios has cost more than the threshold in PB, formally defined below.
Definition 4.6 (ε-Threshold).
Let I be an instance of PB and let OPT(s) be the cost of scenario s in I. We define the ε-threshold of I as T_ε(I) = min{ T : Pr_s[ OPT(s) > T ] ≤ ε }.
The following two lemmas relate the cost of the optimal policy to the ε-thresholds, and the ε-thresholds to the thresholds our algorithm finds. The proofs of both lemmas are deferred to Section B.1 of the Appendix. We first formally define a sub-instance of a given Pandora’s Box instance.
Definition 4.7 (Sub-instance).
Let I be an instance of PB with set of scenarios S, each scenario s occurring with probability p_s. For any ε ∈ (0, 1), we call I′ an ε-sub-instance of I if its scenario set S′ satisfies S′ ⊆ S and Pr[S′] ≥ ε, with the probabilities renormalized to sum to 1.
Lemma 4.8 (Optimal Lower Bound).
Let I be an instance of PB. For any ε ∈ (0, 1) and any ε-sub-instance I′ of I, the expected cost of the optimal PB policy on I′ is lower bounded, up to constant factors, by the corresponding ε-threshold of I.
Lemma 4.9.
Given an instance I of PB, an α-approximation algorithm A for PB_T, and any ε, δ ∈ (0, 1), suppose that the threshold T is sufficiently large relative to the expected optimal cost of the sub-instance, by a factor on the order of α/δ. Then if A is run on an ε-sub-instance of I with threshold T, at most a total mass of δ of the scenarios picks the outside option box.
Calculating the thresholds.
For every phase i we choose a threshold T_i such that at most a constant fraction of the remaining probability mass of the scenarios is not covered. In order to select this threshold, we run a binary search starting from T_{i-1}, each time running the α-approximation algorithm for PB_T with the outside option box at the current cost and checking how much mass selects it. Denoting by [l_i, u_i] the relevant interval of costs at every run of the algorithm, Lemma 4.9 guarantees that, for the remaining total probability mass, any threshold that is sufficiently large relative to the optimal cost of the current sub-instance also satisfies the desired covering property, i.e., at least a constant fraction of the mass of the current scenarios is covered. The threshold T_i found by our binary search therefore satisfies an upper bound of the same form in terms of the optimal costs of the scenarios remaining in phase i; we refer to this bound as Equation (1).
Bounding the final cost.
To bound the final cost, we recall that at the end of every phase we cover a constant fraction of the remaining scenarios, so the probability of reaching phase i decreases geometrically with i. Furthermore, each threshold T_i is charged, via Equation (1), to the optimal costs of scenarios whose costs lie in an interval of thresholds ending at T_i; note that these intervals are overlapping. Summing the per-phase costs, applying Equation (1) to each threshold, and then applying Lemma 4.8 to each resulting term, we can bound the total cost of the policy by Õ(α) times the cost of the optimal policy; the last step uses the fact that each scenario with cost OPT(s) can belong to at most a bounded number of the intervals. This proves the lemma. ∎
Notice the generality of this reduction: the distributions on the values are preserved, and we made no further assumptions on the scenarios or values throughout the proof. Therefore we can apply this tool regardless of the type of correlation or the way it is given to us; e.g., we could be given a parametric distribution, or an explicitly given distribution, as we see in the next section.
4.1.2 An Even Stronger Tool
Moving one step further, we show that if, instead of , we had an -approximation algorithm for , we could obtain the same guarantees as those described in Lemma 4.5. Observe that we cannot directly use Algorithm 1, since the oracle now requires that all scenarios have the same probability, which might not be the case in the initial PB instance. The formal statement of the theorem follows.
See 4.4
We highlight the differences from the proof of Main Lemma 4.5 and show how to change Algorithm 1 to work with the new oracle, which requires the scenarios to have uniform probability. The function Expand shown in Algorithm 2 transforms the instance of scenarios into a uniform one, where every scenario has the same probability, by creating multiple copies of the more likely scenarios. The function is formally described in Algorithm 3 in Section B.2 of the Appendix, alongside the proof of Main Lemma 4.4.
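A minimal sketch of what an Expand-style uniformization can look like, assuming rational scenario probabilities (the paper's exact routine is Algorithm 3 in the Appendix; this is only illustrative):

```python
from fractions import Fraction
from math import lcm

def expand(probs):
    """Replace scenario i (probability p_i) by n_i identical copies so
    that all copies are equiprobable.  Returns the copy counts; copy i
    then has probability 1 / sum(counts).  Assumes the probabilities are
    (close to) rationals with moderate denominators."""
    fracs = [Fraction(p).limit_denominator() for p in probs]
    denom = lcm(*[f.denominator for f in fracs])
    # scenario i becomes counts[i] uniform copies of itself
    return [int(f * denom) for f in fracs]
```

For example, probabilities (1/2, 1/4, 1/4) expand into 2, 1, and 1 copies respectively, each of uniform probability 1/4.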
5 Connecting and Optimal Decision Tree
In this section we establish the connection between Min Sum Set Cover with Feedback and Optimal Decision Tree. We show that the uniform versions of these problems are equivalent up to constant factors in approximation ratio. The results of this section are summarized in Figure 4 and the two results below.
Claim 5.1.
If there exists an -approximation algorithm for DT (UDT) then there exists a -approximation algorithm for (resp. ).
Theorem 5.2 (Uniform Decision Tree to ).
Given an -approximation algorithm for , there exists an -approximation algorithm for UDT.
The formal proofs of these statements can be found in Section C of the Appendix. Here we sketch the main ideas.
One direction of this equivalence is again easy to see. The main difference between Optimal Decision Tree and is that the former requires scenarios to be exactly identified whereas in the latter it suffices to simply find an element that covers the scenario. In particular, in an algorithm could cover a scenario without identifying it by, for example, covering it with an element that covers multiple scenarios. To reduce to DT we simply introduce extra feedback into all of the elements of the instance such that the elements isolate any scenarios they cover. (That is, if the algorithm picks an element that covers some subset of scenarios, this element provides feedback about which of the covered scenarios materialized.) This allows us to relate the cost of isolation and the cost of covering to within the cost of a single additional test, implying Claim 5.1.
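The extra-feedback construction behind Claim 5.1 can be sketched as follows, under an assumed dict-based encoding of the instance (`feedback[e][s]` is element e's feedback on scenario s, `covers[e]` the set of scenarios e covers; the encoding is ours, chosen for illustration):

```python
def add_isolating_feedback(feedback, covers):
    """Augment each element's feedback so that picking an element that
    covers a scenario also reveals *which* covered scenario materialized.
    Uncovered scenarios keep only their original feedback symbol."""
    new_feedback = {}
    for e, fb in feedback.items():
        new_feedback[e] = {}
        for s, out in fb.items():
            if s in covers[e]:
                # append the scenario's identity: covering now isolates it
                new_feedback[e][s] = (out, s)
            else:
                new_feedback[e][s] = (out, None)
    return new_feedback
```

In the augmented instance, two covered scenarios that previously shared a feedback symbol are now distinguished by the covering element itself.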
Proof Sketch of Theorem 5.2.
The other direction is more complicated, as we want to ensure that covering implies isolation. Given an instance of UDT, we create a special element for each scenario which is the unique element covering the scenario and also isolates the scenario from all other scenarios. The intention is that an algorithm for on this new instance only chooses the special isolating element in a scenario after it has identified the scenario. If that happens, then the algorithm's policy is a feasible solution to the UDT instance and incurs no extra cost. The problem is that an algorithm for over the modified instance may use the special covering element before isolating a scenario. We argue that this choice can be “postponed” in the policy to a point at which isolation is nearly achieved, without incurring too much extra cost. This involves a careful analysis of the policy's decision tree, and we present the details in the appendix.
Why our reduction does not work for DT.
Our analysis above heavily uses the fact that the probabilities of all scenarios in the UDT instance are equal. This is because the “postponement” of elements charges the increased costs of some scenarios to the costs of other scenarios. In fact, our reduction above fails in the case of non-uniform distributions over scenarios: it can generate an instance with optimal cost much smaller than that of the original DT instance.
To see this, consider an example with scenarios where scenarios through happen with probability and scenario happens with probability . There are tests of cost each. Test for isolates scenario from all others. Observe that the optimal cost of this DT instance is at least as all tests need to be run to isolate scenario . Our construction of the instance adds another isolating test for scenario . A solution to this instance can use this new test at the beginning to identify scenario and then run other tests with the remaining probability. As a result, it incurs cost at most , which is a factor of cheaper than that of the original DT instance.
6 Mixture of Product Distributions
In this section we switch gears and consider the case where we are given a mixture of product distributions. Observe that using the tool described in Section 4.1.1, we can reduce this problem to . This is now equivalent to the noisy version of DT [GK17, JNNR19], where for a specific scenario the result of each test is not deterministic and can take different values with different probabilities.
Comparison with previous work:
Previous work on noisy decision trees considers limited noise models, or has runtime and approximation ratio that depend on the type of noise. For example, in the main result of [JNNR19], the noise outcomes are binary with equal probability. The authors mention that it is possible to extend the result in the following ways:
•
to probabilities within , incurring an extra factor in the approximation
•
to non-binary noise outcomes, incurring at most an extra factor in the approximation
Additionally, their algorithm works by expanding the scenarios for every possible noise outcome (e.g. to for binary noise). In our work the number of noisy outcomes does not affect the number of scenarios whatsoever.
In our work, we obtain a constant approximation factor that does not depend in any way on the type of noise. Additionally, the outcomes of the noisy tests can be arbitrary, and affect neither the approximation factor nor the runtime. We only require a separability condition to hold: the distributions either differ enough or are exactly the same. Formally, we require that for any two scenarios and for every box , the distributions and satisfy , where is the total variation distance of distributions and .
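The separability condition can be checked directly on finite-support distributions. Below is a minimal sketch, with distributions encoded as dicts mapping value to probability (the encoding and function names are illustrative):

```python
def tv_distance(p, q):
    """Total variation distance between two finite distributions,
    given as dicts value -> probability."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)

def satisfies_separability(dists_i, dists_j, eps):
    """Check the condition for one pair of scenarios: on every box, the
    two value distributions are either identical or eps-far in TV."""
    return all(
        tv_distance(pi, pj) == 0.0 or tv_distance(pi, pj) >= eps
        for pi, pj in zip(dists_i, dists_j)
    )
```

For instance, the pair ({a: 0.9, b: 0.1}, {a: 1.0}) has TV distance 0.1, so it violates separability at eps = 0.3: the distributions differ, but not by enough.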
6.1 A DP Algorithm for noisy
We move on to designing a dynamic programming algorithm that solves the problem in the case of a mixture of product distributions. The guarantees of our dynamic programming algorithm are given in the following theorem.
Theorem 6.1.
For any , let and be the policies produced by Algorithm described by Equation (2) and the optimal policy respectively and . Then it holds that
and the algorithm runs in time , where is the number of boxes and is the minimum cost box.
Using the reduction described in Section 4.1.1 together with the previous theorem, we obtain a constant-approximation algorithm for the initial PB problem given a mixture of product distributions. Observe that in the reduction, for every instance of it runs, the chosen threshold satisfies that where is the optimal policy for the threshold . The inequality holds since the algorithm for the threshold is a approximation and it covers of the scenarios left (i.e. pays for the rest). This is formalized in the following corollary.
Corollary 6.1.
Observe that the naive DP, which keeps track of all the boxes and possible outcomes, has space exponential in the number of boxes, which can be very large. In our DP, we exploit the separability property of the distributions by distinguishing two types of boxes, based on a given set of scenarios. Informally, the informative boxes help us distinguish between two scenarios by giving us enough TV distance, while the non-informative ones always have zero TV distance. The formal definition follows.
Definition 6.2 (Informative and non-informative boxes).
Let be a set of scenarios. Then we call a box informative if there exist such that
We denote the set of all informative boxes by . Similarly, the boxes for which the above does not hold are called non-informative and the set of these boxes is denoted by .
Recursive calls of the DP:
Our dynamic program chooses at every step one of the following options:
1. Open an informative box: this step contributes towards eliminating improbable scenarios. By the definition of informative boxes, every time such a box is opened, it gives TV distance at least between at least two scenarios, making one of them more probable than the other. We show (Lemma 6.3) that it takes a bounded number of these boxes to decide, with high probability, which scenario is the one realized (i.e. to eliminate all but one scenario).
2. Open a non-informative box: this is a greedy step; the best non-informative box to open next is the one that maximizes the probability of finding a value smaller than . Given a set of scenarios that are not yet eliminated, there is a unique next non-informative box which is best. We denote by the function that returns this next best non-informative box. Observe that opening non-informative boxes does not affect the greedy ordering of which one is next best, since it does not affect which scenarios are eliminated.
State space of the DP:
the DP keeps track of the following three quantities:
1. A list which consists of sets of informative boxes opened and numbers of non-informative ones opened in between the sets of informative ones. Specifically, has the following form (if for are boxes, the list looks like this: , where is a set of informative boxes, and is the number of non-informative boxes opened exactly after the boxes in set ). We also denote by the informative boxes in the list .
In order to update at every recursive call, we either append a new informative box opened (denoted by ) or, when a non-informative box is opened, we add at the end, denoted by .
2. A list of tuples of integers , one for each pair of distinct scenarios with . The number keeps track of how many of the informative boxes between and produced a value with higher probability under scenario , and the number is the total number of boxes informative for scenarios and opened so far. Every time an informative box is opened, we increase for the pairs of scenarios the box is informative for, and add to if the value discovered had higher probability in . When a non-informative box is opened, the list remains the same. We denote this update by .
3. A list of the scenarios not yet eliminated. Every time an informative test is performed and the list updated, if for some scenario there exists another scenario such that and , then is removed from ; otherwise is removed (this is the process of elimination in the proof of Lemma 6.3). This update is denoted by .
Base cases:
If a value below is found, the algorithm stops. The other base case is when , which means that the realized scenario has been identified; we either take the outside option or search the boxes for a value below , whichever is cheapest. If the scenario is identified correctly, the DP finds the expected optimal for this scenario. We later show that we make a mistake only with low probability, thus increasing the cost only by a constant factor. We denote by the “nature's” move, where the value in the box we chose is realized, and by the minimum value obtained by opening boxes. The recursive formula is shown below.
(2)
The final solution is , where is a list of tuples of the form , and in order to update we set .
Lemma 6.3.
Let be any two scenarios. Then after opening informative boxes, we can eliminate one scenario with probability at least .
References
- [AGY09] Yossi Azar, Iftah Gamzu, and Xiaoxin Yin. Multiple intents re-ranking. In Michael Mitzenmacher, editor, Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009, pages 669–678. ACM, 2009.
- [AH12] Micah Adler and Brent Heeringa. Approximating optimal binary decision trees. Algorithmica, 62(3-4):1112–1121, 2012.
- [ASW16] Marek Adamczyk, Maxim Sviridenko, and Justin Ward. Submodular stochastic probing on matroids. Math. Oper. Res., 41(3):1022–1038, 2016.
- [BC22] Hedyeh Beyhaghi and Linda Cai. Pandora’s problem with nonobligatory inspection: Optimal structure and a PTAS. CoRR, abs/2212.01524, 2022.
- [BDP22] Curtis Bechtel, Shaddin Dughmi, and Neel Patel. Delegated pandora’s box. In David M. Pennock, Ilya Segal, and Sven Seuken, editors, EC ’22: The 23rd ACM Conference on Economics and Computation, Boulder, CO, USA, July 11 - 15, 2022, pages 666–693. ACM, 2022.
- [BEFF23] Ben Berger, Tomer Ezra, Michal Feldman, and Federico Fusco. Pandora’s problem with combinatorial cost. CoRR, abs/2303.01078, 2023.
- [BFLL20] Shant Boodaghians, Federico Fusco, Philip Lazos, and Stefano Leonardi. Pandora’s box problem with order constraints. In Péter Biró, Jason D. Hartline, Michael Ostrovsky, and Ariel D. Procaccia, editors, EC ’20: The 21st ACM Conference on Economics and Computation, Virtual Event, Hungary, July 13-17, 2020, pages 439–458. ACM, 2020.
- [BGK10] Nikhil Bansal, Anupam Gupta, and Ravishankar Krishnaswamy. A constant factor approximation algorithm for generalized min-sum set cover. In Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2010, Austin, Texas, USA, January 17-19, 2010, pages 1539–1545, 2010.
- [BK19] Hedyeh Beyhaghi and Robert Kleinberg. Pandora’s problem with nonobligatory inspection. In Anna Karlin, Nicole Immorlica, and Ramesh Johari, editors, Proceedings of the 2019 ACM Conference on Economics and Computation, EC 2019, Phoenix, AZ, USA, June 24-28, 2019, pages 131–132. ACM, 2019.
- [CFG+00] Moses Charikar, Ronald Fagin, Venkatesan Guruswami, Jon M. Kleinberg, Prabhakar Raghavan, and Amit Sahai. Query strategies for priced information (extended abstract). In Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, May 21-23, 2000, Portland, OR, USA, pages 582–591, 2000.
- [CGT+20] Shuchi Chawla, Evangelia Gergatsouli, Yifeng Teng, Christos Tzamos, and Ruimin Zhang. Pandora’s box with correlations: Learning and approximation. In 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Durham, NC, USA, November 16-19, 2020, pages 1214–1225. IEEE, 2020.
- [CHKK15] Yuxin Chen, S. Hamed Hassani, Amin Karbasi, and Andreas Krause. Sequential information maximization: When is greedy near-optimal? In Proceedings of The 28th Conference on Learning Theory, COLT 2015, Paris, France, July 3-6, 2015, pages 338–363, 2015.
- [CJK+15] Yuxin Chen, Shervin Javdani, Amin Karbasi, J. Andrew Bagnell, Siddhartha S. Srinivasa, and Andreas Krause. Submodular surrogates for value of information. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, January 25-30, 2015, Austin, Texas, USA., pages 3511–3518, 2015.
- [CJLM10] Ferdinando Cicalese, Tobias Jacobs, Eduardo Sany Laber, and Marco Molinaro. On greedy algorithms for decision trees. In Otfried Cheong, Kyung-Yong Chwa, and Kunsoo Park, editors, Algorithms and Computation - 21st International Symposium, ISAAC 2010, Jeju Island, Korea, December 15-17, 2010, Proceedings, Part II, volume 6507 of Lecture Notes in Computer Science, pages 206–217. Springer, 2010.
- [CPR+11] Venkatesan T. Chakaravarthy, Vinayaka Pandit, Sambuddha Roy, Pranjal Awasthi, and Mukesh K. Mohania. Decision trees for entity identification: Approximation algorithms and hardness results. ACM Trans. Algorithms, 7(2):15:1–15:22, 2011.
- [CPRS09] Venkatesan T. Chakaravarthy, Vinayaka Pandit, Sambuddha Roy, and Yogish Sabharwal. Approximating decision trees with multiway branches. In Susanne Albers, Alberto Marchetti-Spaccamela, Yossi Matias, Sotiris E. Nikoletseas, and Wolfgang Thomas, editors, Automata, Languages and Programming, 36th International Colloquium, ICALP 2009, Rhodes, Greece, July 5-12, 2009, Proceedings, Part I, volume 5555 of Lecture Notes in Computer Science, pages 210–221. Springer, 2009.
- [Das04] Sanjoy Dasgupta. Analysis of a greedy active learning strategy. In Advances in Neural Information Processing Systems 17 [Neural Information Processing Systems, NIPS 2004, December 13-18, 2004, Vancouver, British Columbia, Canada], pages 337–344, 2004.
- [DFH+23] Bolin Ding, Yiding Feng, Chien-Ju Ho, Wei Tang, and Haifeng Xu. Competitive information design for pandora’s box. In Nikhil Bansal and Viswanath Nagarajan, editors, Proceedings of the 2023 ACM-SIAM Symposium on Discrete Algorithms, SODA 2023, Florence, Italy, January 22-25, 2023, pages 353–381. SIAM, 2023.
- [Dov18] Laura Doval. Whether or not to open pandora’s box. J. Econ. Theory, 175:127–158, 2018.
- [EHLM19] Hossein Esfandiari, Mohammad Taghi Hajiaghayi, Brendan Lucier, and Michael Mitzenmacher. Online pandora’s boxes and bandits. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019, pages 1885–1892. AAAI Press, 2019.
- [FLL22] Hu Fu, Jiawei Li, and Daogao Liu. Pandora box problem with nonobligatory inspection: Hardness and improved approximation algorithms. CoRR, abs/2207.09545, 2022.
- [FLT04] Uriel Feige, László Lovász, and Prasad Tetali. Approximating min sum set cover. Algorithmica, 40(4):219–234, 2004.
- [GB09] Andrew Guillory and Jeff A. Bilmes. Average-case active learning with costs. In Ricard Gavaldà, Gábor Lugosi, Thomas Zeugmann, and Sandra Zilles, editors, Algorithmic Learning Theory, 20th International Conference, ALT 2009, Porto, Portugal, October 3-5, 2009. Proceedings, volume 5809 of Lecture Notes in Computer Science, pages 141–155. Springer, 2009.
- [GG74] M. R. Garey and Ronald L. Graham. Performance bounds on the splitting algorithm for binary testing. Acta Informatica, 3:347–355, 1974.
- [GGM06] Ashish Goel, Sudipto Guha, and Kamesh Munagala. Asking the right questions: model-driven optimization using probes. In Proceedings of the Twenty-Fifth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 26-28, 2006, Chicago, Illinois, USA, pages 203–212, 2006.
- [GJ74] J.C. Gittins and D.M. Jones. A dynamic allocation index for the sequential design of experiments. Progress in Statistics, pages 241–266, 1974.
- [GJSS19] Anupam Gupta, Haotian Jiang, Ziv Scully, and Sahil Singla. The markovian price of information. In Integer Programming and Combinatorial Optimization - 20th International Conference, IPCO 2019, Ann Arbor, MI, USA, May 22-24, 2019, Proceedings, pages 233–246, 2019.
- [GK01] Anupam Gupta and Amit Kumar. Sorting and selection with structured costs. In 42nd Annual Symposium on Foundations of Computer Science, FOCS 2001, 14-17 October 2001, Las Vegas, Nevada, USA, pages 416–425, 2001.
- [GK11] Daniel Golovin and Andreas Krause. Adaptive submodularity: Theory and applications in active learning and stochastic optimization. J. Artif. Intell. Res., 42:427–486, 2011.
- [GK17] Daniel Golovin and Andreas Krause. Adaptive submodularity: A new approach to active learning and stochastic optimization. CoRR, abs/1003.3967, 2017.
- [GKR10] Daniel Golovin, Andreas Krause, and Debajyoti Ray. Near-optimal bayesian active learning with noisy observations. In John D. Lafferty, Christopher K. I. Williams, John Shawe-Taylor, Richard S. Zemel, and Aron Culotta, editors, Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, Vancouver, British Columbia, Canada, pages 766–774. Curran Associates, Inc., 2010.
- [GN13] Anupam Gupta and Viswanath Nagarajan. A stochastic probing problem with applications. In Integer Programming and Combinatorial Optimization - 16th International Conference, IPCO 2013, Valparaíso, Chile, March 18-20, 2013. Proceedings, pages 205–216, 2013.
- [GNR17] Anupam Gupta, Viswanath Nagarajan, and R. Ravi. Approximation algorithms for optimal decision trees and adaptive TSP problems. Math. Oper. Res., 42(3):876–896, 2017.
- [GNS16] Anupam Gupta, Viswanath Nagarajan, and Sahil Singla. Algorithms and adaptivity gaps for stochastic probing. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, pages 1731–1747, 2016.
- [GNS17] Anupam Gupta, Viswanath Nagarajan, and Sahil Singla. Adaptivity gaps for stochastic probing: Submodular and XOS functions. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017, Barcelona, Spain, Hotel Porta Fira, January 16-19, pages 1688–1702, 2017.
- [GT23] Evangelia Gergatsouli and Christos Tzamos. Weitzman’s rule for pandora’s box with correlations. CoRR, abs/2301.13534, 2023.
- [HAKS13] Noam Hazon, Yonatan Aumann, Sarit Kraus, and David Sarne. Physical search problems with probabilistic knowledge. Artif. Intell., 196:26–52, 2013.
- [HR76] Laurent Hyafil and Ronald L. Rivest. Constructing optimal binary decision trees is np-complete. Inf. Process. Lett., 5(1):15–17, 1976.
- [INvdZ16] Sungjin Im, Viswanath Nagarajan, and Ruben van der Zwaan. Minimum latency submodular cover. ACM Trans. Algorithms, 13(1):13:1–13:28, 2016.
- [JNNR19] Su Jia, Viswanath Nagarajan, Fatemeh Navidi, and R. Ravi. Optimal decision tree with noisy outcomes. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d’Alché-Buc, Emily B. Fox, and Roman Garnett, editors, Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, pages 3298–3308, 2019.
- [KNN17] Prabhanjan Kambadur, Viswanath Nagarajan, and Fatemeh Navidi. Adaptive submodular ranking. In Integer Programming and Combinatorial Optimization - 19th International Conference, IPCO 2017, Waterloo, ON, Canada, June 26-28, 2017, Proceedings, pages 317–329, 2017.
- [KPB99] S. Rao Kosaraju, Teresa M. Przytycka, and Ryan S. Borgstrom. On an optimal split tree problem. In Frank K. H. A. Dehne, Arvind Gupta, Jörg-Rüdiger Sack, and Roberto Tamassia, editors, Algorithms and Data Structures, 6th International Workshop, WADS ’99, Vancouver, British Columbia, Canada, August 11-14, 1999, Proceedings, volume 1663 of Lecture Notes in Computer Science, pages 157–168. Springer, 1999.
- [LLM20] Ray Li, Percy Liang, and Stephen Mussmann. A tight analysis of greedy yields subexponential time approximation for uniform decision tree. In Shuchi Chawla, editor, Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, Salt Lake City, UT, USA, January 5-8, 2020, pages 102–121. SIAM, 2020.
- [Lov85] Donald W. Loveland. Performance bounds for binary testing with arbitrary weights. Acta Informatica, 22(1):101–114, 1985.
- [LPRY08] Zhen Liu, Srinivasan Parthasarathy, Anand Ranganathan, and Hao Yang. Near-optimal algorithms for shared filter evaluation in data stream systems. In Jason Tsong-Li Wang, editor, Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2008, Vancouver, BC, Canada, June 10-12, 2008, pages 133–146. ACM, 2008.
- [MBMW05] Kamesh Munagala, Shivnath Babu, Rajeev Motwani, and Jennifer Widom. The pipelined set cover problem. In Thomas Eiter and Leonid Libkin, editors, Database Theory - ICDT 2005, 10th International Conference, Edinburgh, UK, January 5-7, 2005, Proceedings, volume 3363 of Lecture Notes in Computer Science, pages 83–98. Springer, 2005.
- [NS17] Feng Nan and Venkatesh Saligrama. Comments on the proof of adaptive stochastic set cover based on adaptive submodularity and its implications for the group identification problem in ”group-based active query selection for rapid diagnosis in time-critical situations”. IEEE Trans. Inf. Theory, 63(11):7612–7614, 2017.
- [PD92] Krishna R. Pattipati and Mahesh Dontamsetty. On a generalized test sequencing problem. IEEE Trans. Syst. Man Cybern., 22(2):392–396, 1992.
- [PKSR02] Vili Podgorelec, Peter Kokol, Bruno Stiglic, and Ivan Rozman. Decision trees: An overview and their use in medicine. Journal of medical systems, 26:445–63, 11 2002.
- [Sin18] Sahil Singla. The price of information in combinatorial optimization. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2018, New Orleans, LA, USA, January 7-10, 2018, pages 2523–2532, 2018.
- [SW11] Martin Skutella and David P. Williamson. A note on the generalized min-sum set cover problem. Oper. Res. Lett., 39(6):433–436, 2011.
- [Wei79] Martin L Weitzman. Optimal Search for the Best Alternative. Econometrica, 47(3):641–654, May 1979.
Appendix A A Naive Reduction to
In this section we present a straightforward reduction from Pandora's Box to as an alternative to Theorem 4.2. This reduction has a simpler construction than the reduction of Section 4 and does not lose a logarithmic factor in the approximation; it does, however, face the following issues.
1. It incurs an extra computational cost, since it adds a number of boxes that depends on the size of the values' support.
2. It requires opening costs, which means that the oracle for Pandora's Box with outside option should be able to handle non-unit costs on the boxes, even if the original PB problem had unit-cost boxes.
We denote by the version of Pandora’s Box with outside option that has non-unit cost boxes, and formally state the guarantees of our naive reduction below.
Theorem A.1.
For boxes and scenarios, given an -approximation algorithm for for arbitrary , there exists a -approximation for PB that runs in time polynomial in the number of scenarios, the number of boxes, and the number of values.
Figure 5 summarizes all the reductions from PB to , and in Table 1 we compare the properties of the naive reduction of this section to the ones shown in Section 4. The main difference is that there is a blow-up in the number of boxes that depends on the support, while only constant factors are lost in the approximation.
Reducing PB to
| | , Theorem A.1 | , Main Lemma 4.5 (4.4) |
| Costs of boxes | Introduces non-unit costs | Maintains costs |
| Probabilities | Maintains probabilities | |
| # of extra scenarios | 0 | 0 |
| # of extra boxes | 0 | |
| Approximation loss | | |
The main idea is that we can move the information about the values contained in the boxes into the costs of the boxes. We achieve this effect by creating one new box for every (box, value)-pair. Note that doing this risks losing the information about the realized scenario that the original boxes revealed. To retain this information, we keep the original boxes but replace their values by high values. The high values guarantee that the effect of the new boxes is retained. We now formalize this intuition.
Instance.
Given an instance of PB, we construct an instance of . We need to be sufficiently large so that the outside option is never chosen. The net effect is that a policy for PB is easily inferred from a policy for . We define the instance to have the same scenarios and same scenario probabilities as . We choose (we set to a value larger than ), and define the new values by . Note that all of these values are larger than , and so a feasible policy cannot terminate after receiving such a value. At the same time, these values ensure the same branching behaviour as before, since each distinct value is mapped one-to-one to a new distinct value. Next, we add additional “final” boxes, one for each pair where is a box and a potential value of box . Each “final” box has cost . Box has value for the scenarios where box gives exactly value , and values for all other scenarios. Formally,
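A minimal sketch of the construction, assuming values are given as `values[b][s]` (value of box b under scenario s) and `M` exceeds every value. The particular shift `v + M` for the original boxes, the sentinel `2 * M`, and the choice of final-box cost shown here are our illustrative assumptions — the paper's exact quantities are elided above:

```python
def build_naive_instance(values, M):
    """Sketch of the naive PB -> MSSC_f construction.  Original boxes
    keep their branching feedback but their values are shifted above M
    (so no original box can end the search); one new "final" box is
    added per (box, value) pair, cheap exactly on the scenarios where
    that box realizes that value."""
    # original boxes: same branching, values pushed above the threshold
    shifted = {b: {s: v + M for s, v in col.items()} for b, col in values.items()}
    finals = {}
    for b, col in values.items():
        for s, v in col.items():
            key = (b, v)
            finals.setdefault(key, {})
            for s2, v2 in col.items():
                # value v where box b gives exactly v, huge sentinel elsewhere
                finals[key][s2] = v if v2 == v else 2 * M
    return shifted, finals
```

Opening a final box is then only worthwhile once the policy knows (from the shifted originals' feedback) which value its box realized.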
Intuitively, these “final” boxes indicate to a policy that this will be the last box opened, and so its values, which are at least that of the best values of the boxes chosen, should now be taken into account in the cost of the solution.
In order to prove Theorem A.1, we use two key lemmas. In Lemma A.2 we show that the optimal value for the transformed instance of is not much higher than the optimal value for original instance . In Lemma A.3 we show how to obtain a policy for the initial instance with values, given a policy for the problem with a threshold.
Lemma A.2.
Given the instance of PB and the constructed instance of it holds that
Proof.
We show that given an optimal policy for PB, we can construct a feasible policy for such that . We construct the policy by opening the same boxes as and finally opening the corresponding “values” box, in order to find the needed to stop.
Fix any scenario , and suppose box achieved the smallest value of all boxes opened under scenario . Since is opened, in the instance we open box , and from the construction of we have that . Since on every branch we open a box with values (note that opens at least one box), we see that is a feasible policy for . Under scenario , the cost of is
In contrast, the minimum cost following is and there is the additional cost of the “values” box. Formally, the cost of is
Since appears in the cost of , we know that . Thus, , which implies that for our feasible policy . Observing that for any policy, completes the proof. ∎
Lemma A.3.
Given a policy for the constructed instance of , there exists a feasible policy for the instance of PB with no larger expected cost. Furthermore, any branch of can be constructed from in polynomial time.
Proof of Lemma A.3.
We construct a policy for using the policy . Fix some branch of . If opens box along this branch, we define policy to open the same box along this branch. When opens a “final” box , we define the policy to open box if it has not been opened already.
Next, we show this policy has no larger expected cost than . There are two cases to consider depending on where the “final” box is opened:
1. “Final” box is at a leaf of : since has finite expected cost and this is the first “final” box we encountered, the result must be . Therefore, under , the values will be by definition of . Observe that in this case , since the (at most ) extra paid by for the value term has already been paid by the box cost in when box was opened.
2. “Final” box is at an intermediate node of : after opens box , we copy the subtree of that follows the branch into the branch of that follows the branch. We also copy the subtree of that follows the branch into each branch that has a value different from (the non- branches). The cost of this new subtree is instead of the original . The branch may accrue an additional cost of or smaller if was not the smallest-values box on this branch, so in total the branch has cost at most its original cost.
However, the non- branches have a term removed going down the tree. Specifically, since the feedback of down the non- branch was , some other box with values had to be opened at some point, and this box is still available to be used as the final values for this branch later on (since if this branch already had a , it would have stopped). Thus, the cost of this subtree is at most its original cost, with one fewer “final” box opened.
Putting these cases together implies that .
Lastly, we argue that any branch of can be computed efficiently. To compute a branch for , we follow the corresponding branch of . As we go along this branch, we open box whenever opens box and remember the feedback. We use the feedback to know which boxes of to open in the future. Hence, we can compute a branch of from in polynomial time. ∎
We are now ready to give the proof of Theorem A.1.
Proof of Theorem A.1.
Suppose we have an -approximation for . Given an instance of PB, we construct the instance for as described, and then run the approximation algorithm on to get a policy . Next, we prune the tree as described in Lemma A.3 to get a policy of no worse cost. Our policy uses at most polynomially more time than the policy for , since each branch of can be computed in polynomial time from . Hence, the runtime is polynomial in the size of . We also note that we added at most total “final” boxes to construct our new instance , and so this algorithm runs in time polynomial in and . Thus, by Lemma A.3 and Lemma A.2, we know the cost of the constructed policy is
Hence, this algorithm is a -approximation for PB.
∎
Appendix B Proofs from Section 4
See 4.1
Proof of Claim 4.1.
Let be an instance of . We create an instance of PB in the following way: for every set of that gives feedback when element is selected, we create a scenario with the same probability, whose value for box is either if or otherwise, where denotes an extremely large value which is different for different values of the feedback . Observe that any solution to the PB instance gives a solution to the at the same cost, and vice versa. ∎
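This construction is mechanical enough to sketch in code; all names (`mssc_to_pb`, `sets`, `feedback`) are illustrative, and we use 0 as the covering value and distinct huge values to encode the feedback symbols, as in the claim.

```python
def mssc_to_pb(sets, elements, feedback):
    """Each set becomes a PB scenario; each element a box.  feedback[(e, S)]
    is the feedback element e gives on set S.  Covering elements get value 0;
    non-covering ones get a huge value, distinct per feedback symbol."""
    BIG = 10**9        # base for the "extremely large" values
    large = {}         # feedback symbol -> distinct large value
    def large_value(f):
        if f not in large:
            large[f] = BIG + len(large)
        return large[f]
    scenarios = {}
    for S, members in sets.items():
        scenarios[S] = {
            e: 0 if e in members else large_value(feedback[(e, S)])
            for e in elements
        }
    return scenarios
```

Opening a box and seeing a large value reveals exactly the feedback symbol, so policies for the two instances can mirror each other at equal cost.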
See 4.3 Before formally proving this claim, recall the correspondence of scenarios and boxes in PB-type problems to elements and sets in MSSC-type problems. The idea for the reduction is to create copies of sets for each scenario in the initial instance and one element per box, where if the price a box gives for a scenario is , then the corresponding element belongs to all copies of the set . The final step is to “simulate” the outside option , for which we create elements where the ’th one belongs only to the ’th copy of each set.
Proof of Claim 4.3.
Given an instance of with outside cost box , we construct the instance of as follows.
Construction of the instance.
For every scenario in the initial instance, we create sets denoted by where . Each of these sets has equal probability . We additionally create one element per box , which belongs to every set for all iff in the initial instance, and otherwise gives feedback . In order to simulate box without introducing an element with non-unit cost, we use a sequence of outside option elements where for all , i.e., element belongs to “copy ” of every set. (Observe that there are exactly possible options for for any set; choosing all these elements costs and covers all sets, thus simulating .)
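A sketch of this copy construction, with hypothetical names of our own: we use 0 as the “covering” value of a box and the pair `('o', j)` for the j-th outside-option element.

```python
def pb_to_msscf(scenarios, boxes, n):
    """Each PB scenario s becomes n equiprobable set copies (s, 0)..(s, n-1).
    A box element belongs to every copy of s iff it has the covering value
    (here 0) for s; outside-option element ('o', j) belongs only to copy j
    of every set, so choosing all n of them covers everything."""
    sets = {}
    for s, values in scenarios.items():
        covering = {e for e in boxes if values[e] == 0}
        for j in range(n):
            sets[(s, j)] = covering | {("o", j)}
    return sets
```

Because the j-th outside element sits in copy j of every set, picking all n of them simulates paying the outside option once per scenario without introducing a non-unit cost element.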
Construction of the policy.
We construct policy by ignoring any outside option elements that selects until has chosen at least such elements, at which point takes the outside option box . To show feasibility we need that for every scenario either is chosen or some box with . If is not chosen, then fewer than isolating elements were chosen; therefore, in instance of , some subsets will have to be covered by another element , corresponding to a box. This corresponding box, however, gives a value in the initial instance.
Approximation ratio.
Let be any scenario in . We distinguish between the following cases, depending on whether there are outside option tests on ’s branch.
1. No outside option tests on ’s branch: scenario contributes equally in both policies, since the absence of isolating elements implies that all copies of scenario will be on the same branch (paying the same cost) in both and .
2. Some outside option tests on ’s branch: for this case, from Lemma B.1 we have that .
Putting it all together we get
where the second inequality follows since we are given an -approximation, and the last inequality follows since, given an optimal policy for , the exact same policy is also feasible for any instance of UDT, which has cost at least . We also used that , since otherwise the initial policy would never take the outside option. ∎
Lemma B.1.
Let be an instance of , and the instance of constructed by the reduction of Claim 4.3. For a scenario , if there is at least one outside option test run in , then .
Proof.
For the branch of scenario , denote by the box elements chosen before there were outside option elements, and by the number of outside option elements in . Note that the smallest cost is achieved if all the outside option elements are chosen first (since the outside option tests cause some copies to be isolated, and so can reduce their cost). The copies of scenario can be split into two groups: those that were isolated before outside option elements were chosen, and those that were isolated after. We distinguish between the following cases, based on the value of .
1. : in this case, each of the copies of that are isolated after pays at least for the initial box elements and the initial sequence of outside option elements. For the copies isolated before, we lower bound the cost by choosing all outside option elements first.
The cost of all the copies in then is at least
Since , policy will take the outside option box for immediately after choosing the initial boxes corresponding to the box elements. So, the total contribution of on the expected cost of is at most in this case. Hence, we have that ’s contribution in is at most times ’s contribution in .
2. : policy will only select the boxes (corresponding to box elements), and this was sufficient for finding a value less than . The total contribution of on is exactly . On the other hand, since we know that at least half of the copies will pay for all of the box elements. The cost of all the copies is at least
therefore, the contribution of on is at least . Hence, we have
∎
B.1 Proofs from subsection 4.1.1
See 4.9
Proof.
Consider a policy which runs on the instance and, for scenarios with cost , aborts after spending this cost and chooses the outside option . The cost of this policy is:
(3)
By our assumption on , this cost is at most . On the other hand, since is an -approximation to the optimal, the cost of the algorithm’s solution is at most
Since the expected cost of is at most , then using Markov’s inequality, we get that . Therefore, covers at least mass every time. ∎
See 4.8
Proof.
In every interval of the form , the optimal policy for PB covers at least of the probability mass that remains. Since the values belong to the interval in phase , it follows that the minimum possible value that the optimal policy might pay is , i.e., the lower end of the interval. Summing up over all intervals, we get the lemma. ∎
B.2 Proofs from subsection 4.1.2
See 4.4
Proof.
The proof in this case follows the steps of the proof of Theorem 4.5; we only highlight the changes. The reduction process is the same as Algorithm 1, with the only difference that we add two extra steps: (1) we initially remove all low probability scenarios (line 3 - remove at most fraction) and (2) we add them back after running (line 8). The reduction process is formally shown in Algorithm 2.
Calculating the thresholds.
For every phase we choose a threshold such that , i.e., at most of the probability mass of the scenarios is not covered, again using binary search as in Algorithm 1. We denote by the relevant interval of costs at every run of the algorithm; then by Lemma 4.9, we know that for remaining total probability mass , any threshold which satisfies
also satisfies the desired covering property, i.e. at least mass of the current scenarios is covered. Therefore the threshold found by our binary search satisfies
(4)
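The binary search for a phase threshold can be sketched generically; `covered_mass` stands in for a (monotone non-decreasing) oracle reporting the probability mass covered when running with budget `T` — an interface of our own choosing, not the paper's notation.

```python
def find_threshold(covered_mass, lo, hi, delta, tol=1e-6):
    """Binary search for (approximately) the smallest threshold T in
    [lo, hi] whose covered probability mass is at least 1 - delta.
    Assumes covered_mass is monotone non-decreasing in T."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if covered_mass(mid) >= 1 - delta:
            hi = mid      # mid already covers enough mass; search lower
        else:
            lo = mid      # mid covers too little; search higher
    return hi
```

Monotonicity of the coverage in the budget is what makes the binary search valid; each probe is one run of the covering subroutine.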
Following the proof of Theorem 4.5, the steps “Constructing the final policy” and “Accounting for the values” remain exactly the same, as neither of them uses the fact that the scenarios are uniform.
Bounding the final cost.
Using the guarantee that at the end of every phase we cover of the scenarios, observe that the algorithm for is run in an interval of the form . Note also that these intervals are overlapping. Bounding the cost of the final policy over all intervals, we get
From equation (4)
Using Lemma 4.8
where the inequalities follow similarly to the proof of Theorem 4.5. Choosing and we get the theorem. ∎
Appendix C Proofs from Section 5
See 5.1
Proof of Claim 5.1.
Let be an instance of . We create an instance of DT in the following way: for every set we create a scenario with the same probability, and for every element we create a test with the same cost that gives full feedback whenever the element belongs to the set, and otherwise returns only the element’s feedback . Formally, the -test under scenario returns
therefore the test isolates scenario when .
Constructing the policy.
Given a policy for the instance of DT, we can construct a policy for by selecting the element that corresponds to the test that chose. When finishes, all scenarios are identified, and for any scenario either (1) there is a test in that corresponds to an element in (in the instance ), or (2) there is no such test, but we can pay an extra to select the lowest cost element in this set. (Since the scenario is identified, we know exactly which element this is.)
Observe also that in this instance of DT, if we were given the optimal solution, it would directly translate to a solution for with the same cost; therefore
(5)
Bounding the cost of the policy.
As described above, the total cost of the policy is
where in the last inequality we used equation (5).
Note that for this reduction we did not change the probabilities of the scenarios, therefore if we had started with uniform probabilities and had an oracle to UDT, we would still get an algorithm for . ∎
In the reduction proof of Theorem 5.2, we use the following two lemmas, which show that the policy constructed for UDT via the reduction is feasible and has bounded cost.
Lemma C.1.
Given an instance of UDT and the corresponding instance of in the reduction of Theorem 5.2, the policy constructed for UDT is feasible.
Proof of Lemma C.1.
It suffices to show that every scenario is isolated. Fix a scenario . Observe that ’s branch has chosen the isolating element in the solution, since that is the only element that belongs to set . Let be the set of scenarios just before is chosen, and note that by definition .
If , then since runs tests giving the same branching behavior by definition of , and is the only scenario left, we have that the branch of isolates scenario .
If then all scenarios/sets in are not covered by choosing element ; therefore, they are covered at strictly deeper leaves in the tree. By induction on the depth of the tree, we can assume that each scenario is isolated in . We distinguish the following cases based on when we encounter among the isolating elements on ’s branch.
1. was the first isolating element chosen on the branch: then policy ignores element . Since every leaf holds a unique scenario in , if we ignore it, follows some path of tests and either is isolated or ends up in a node that originally would have had only one scenario, as shown in Figure 6. Since there are only two scenarios at that node, policy runs the cheapest test distinguishing from that scenario.
Figure 6: Case 1: is the set of scenarios remaining when is chosen, is the scenario that ends up with.
2. A different element was chosen before : by our construction, instead of ignoring we now run the cheapest test that distinguishes from , causing and to go down separate branches, as shown in Figure 7. We apply the induction hypothesis again to the scenarios in these sub-branches; therefore, both and are either isolated or end up in a node with a single scenario, and then get distinguished by the last case of ’s construction.
Figure 7: Case 2: run test to distinguish and . Sets and partition
Hence, is isolating for any scenario . Also, notice that any two scenarios that have isolating boxes on the same branch will end up in distinct subtrees of the lower node. ∎
Lemma C.2.
Given an instance of and an instance of UDT, in the reduction of Theorem 5.2 it holds that
Proof of Lemma C.2.
Let be any scenario in . We use induction on the number of isolating boxes along ’s branch in . First, observe that will always exist on ’s branch in any feasible solution to . We use and to denote the costs of box and test , for any and .
1. Only is on the branch: since will be ignored, we end up with and some other not-yet-isolated scenario; let be that scenario. To isolate and , we run the cheapest test that distinguishes between them. From the definition of the cost of , we know that . Additionally, since and both and have probability , overall we have . This is also shown in Figure 6.
2. More than one isolating element is on the branch: similarly, observe that for any extra isolating element we encounter, we substitute it with a test that distinguishes between and and costs at most . Given that and the scenarios are uniform, we again have .
∎
See 5.2
Proof of Theorem 5.2.
We begin by giving the construction of the policy in the reduction, and showing the final approximation ratio.
Constructing the policy.
Given a policy for the instance of , we construct a policy . For any test element that selects, runs the equivalent test . For the isolating elements we distinguish the following cases.
1. If selects an isolating element for the first time on the current branch, then ignores this element but remembers the set/scenario that belonged to.
2. If selects another isolating element after some on the branch, then runs the minimum cost test that distinguishes scenario from , where was the most recent isolating element chosen on this branch prior to .
3. If we are at the end of , there can be at most two scenarios remaining on the branch, so runs the minimum cost test that distinguishes these two scenarios.
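The three cases above can be sketched on a single branch; the interface (`msscf_branch` as a list of tagged steps, `cheapest_test(s, s2)` returning the minimum-cost test distinguishing two scenarios) is hypothetical and chosen for illustration.

```python
def build_udt_branch(msscf_branch, cheapest_test):
    """Translate one branch of an MSSC-type policy into a UDT branch:
    ordinary elements map to their tests; the first isolating element on
    the branch is skipped (its scenario is remembered); each later
    isolating element for scenario s is replaced by the cheapest test
    separating s from the previously remembered scenario."""
    branch, remembered = [], None
    for kind, x in msscf_branch:
        if kind == "test":
            branch.append(("run", x))
        else:  # kind == "isolate"; x is the scenario being isolated
            if remembered is None:
                remembered = x                       # case 1: ignore, remember
            else:
                branch.append(("run", cheapest_test(x, remembered)))  # case 2
                remembered = x
    return branch
```

The end-of-branch case (at most two scenarios remain) would append one final distinguishing test in the same way.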
By Lemma C.1, we have that the above policy is feasible for UDT.
Approximation ratio.
From Lemma C.2 we have that . For the optimal policy, we have that . This holds since, given an optimal solution to UDT, we can add an isolating element at every leaf to make it feasible for , increasing the cost by only a factor of . (This is because for every two scenarios the UDT solution must distinguish between them, but one of these is the scenario from the definition of , for which we pay less than .) This means that will cost less than this transformed solution. Overall, if is computed from an -approximation for , we have
∎
Appendix D Proofs from Section 6
See 6.3
Proof.
Let be any two scenarios in the instance of PB, and let be the value returned by opening the ’th informative box, which has distributions and for scenarios and respectively. Then by the definition of informative boxes, for every such box opened there is a set of values for which and a set for which the reverse holds. Denote these sets by and respectively. We also define the indicator variables . Define , and observe that . Since for every box we have an gap in TV distance between the scenarios, we have that
therefore, if we conclude that scenario is eliminated; otherwise we eliminate scenario . The probability of error is , where we used Hoeffding’s inequality since . Since we want the probability of error to be less than , we need to open informative boxes. ∎
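The elimination rule can be sketched as a simple sign test; the predicates `in_A1`/`in_A2` stand in for membership in the sets where one scenario's density dominates the other's, and the zero threshold is an illustrative choice, not the paper's exact statistic.

```python
def eliminate(samples, in_A1, in_A2):
    """Sign test over the values seen from the informative boxes:
    count +1 for each value favoring scenario 1 and -1 for each value
    favoring scenario 2; keep scenario 1 if the sum is positive,
    otherwise keep scenario 2 (illustrative decision rule)."""
    z = sum((1 if in_A1(v) else 0) - (1 if in_A2(v) else 0) for v in samples)
    return 1 if z > 0 else 2
```

Since each summand is a bounded random variable whose mean is separated by the TV-distance gap, Hoeffding's inequality bounds the error probability of this rule exponentially in the number of boxes opened.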
Proof of Theorem 6.1.
We describe how to bound the final cost and calculate the runtime of the DP. Denote by ; we show that in order to get a -approximation we set .
Cost of the final solution.
Observe that the only case where the DP limits the search space is when . If the scenario is identified correctly, the DP finds the optimal solution by running the greedy order: every time choosing the box with the highest probability of a value below . (When there is only one scenario, this is exactly Weitzman’s algorithm.)
In order to eliminate all scenarios but one, we should eliminate all but one of the pairs in the list . From Lemma 6.3, and a union bound on all pairs, the probability of the last scenario being the wrong one is at most . By setting , we get that the probability of error is at most , in which case we pay at most , therefore getting an extra factor.
Runtime.
The DP maintains a list of sets of informative boxes opened, and numbers of non-informative ones. Recall that has the following form , where from Lemma 6.3 and the fact that there are pairs in . There are in total boxes and “positions” for them; therefore, the size of the state space is . There is also an extra factor for searching the list of informative boxes at every step of the recursion. Observe that the numbers of non-informative boxes also add a factor of at most in the state space. The list adds another factor of at most , and the list a factor of , making the total runtime . ∎
Appendix E Boxes with Non-Unit Costs: Revisiting our Results
In the original Pandora’s Box problem, denoted by , each box has a different known cost . Similarly we denote the non-unit cost version of both decision tree-like problems and Min Sum Set Cover-like problems by adding a superscript c to the problem name. Specifically, we now define , , and , where the tests (elements) have non-unit cost for the decision tree (min sum set cover) problems. We revisit our results and describe how our reductions change to incorporate non-unit cost boxes (summary in Figure 8).
Note also that even though the known results for Optimal Decision Tree (e.g., [GB09, GNR17]) handle non-unit test costs, the currently known results for Uniform Decision Tree do not. If, however, an algorithm for Uniform Decision Tree with non-unit costs is found, our reductions carry over, obtaining the same approximation guarantees.
E.1 Connecting Pandora’s Box and
All the results of this section hold as they are when we change all versions to incorporate costs: we did not use the fact that the costs are unit in any of the proofs of Claim 4.1, Claim 4.3, or Lemmas 4.4 and 4.5. We formally restate the main theorem of Section 4 as the following corollary, where the only change is that it now holds for the cost versions of the problems.
Corollary E.1 (Pandora’s Box to with non-unit costs).
If there exists an approximation algorithm for then there exists a -approximation for . The same result holds if the initial algorithm is for .
E.2 Connecting and Optimal Decision Tree
In this section, the reduction of Theorem 5.2 uses the fact that the costs are uniform. However, we can easily circumvent this and obtain Corollary E.2. Using this, the results for the non-unit cost versions are summarized in Figure 10.
Corollary E.2 (Uniform Decision Tree with costs to ).
Given an -approximation algorithm for , there exists an -approximation algorithm for .
Proof.
The proof proceeds exactly as the proof of Theorem 5.2, with one change: the cost of an isolating element is the minimum cost of a test needed to isolate from scenario , where is the scenario that maximizes this quantity. Formally, if , then . The reduction then follows exactly the steps described in Appendix C. ∎