Streaming Adaptive Submodular Maximization
Abstract
Many sequential decision making problems can be formulated as an adaptive submodular maximization problem. However, most existing studies in this field focus on the pool-based setting, where one can pick items in any order, and there have been few studies of the stream-based setting, where items arrive in an arbitrary order and one must decide immediately upon each item's arrival whether to select it. In this paper, we introduce a new class of utility functions, semi-policywise submodular functions, and develop a series of effective algorithms for maximizing a semi-policywise submodular function under the stream-based setting.
1 Introduction
Many machine learning and artificial intelligence tasks can be formulated as adaptive sequential decision making problems. The goal of such a problem is to sequentially select a group of items, where each selection is based on past observations, in order to maximize some given utility function. It has been shown that in a wide range of applications, including active learning [6] and adaptive viral marketing [16], the utility functions satisfy the property of adaptive submodularity [6], a natural diminishing-returns property under the adaptive setting. Several effective solutions have been developed for maximizing an adaptive submodular function subject to various practical constraints. For example, [6] developed a simple adaptive greedy policy that achieves a $(1-1/e)$ approximation ratio for maximizing an adaptive monotone and adaptive submodular function subject to a cardinality constraint. Recently, [13] extended the aforementioned studies to the non-monotone setting and proposed a $1/e$ approximate solution for maximizing a non-monotone adaptive submodular function subject to a cardinality constraint; in the same work, they develop a faster algorithm whose running time is linear in the number of items. [14] develops the first constant approximation algorithms subject to more general constraints such as knapsack and $k$-system constraints.
We note that most existing studies focus on the pool-based setting, where one is allowed to select items in any order. In this paper, we tackle this problem under the stream-based setting. Under our setting, items arrive one by one in an online fashion, where the order of arrivals is decided by an adversary. Upon the arrival of an item, one must decide immediately whether to select that item or not. If this item is selected, then we are able to observe its realized state; otherwise, we skip this item and wait for the next item. Our goal is to adaptively select a group of items in order to maximize the expected utility subject to a knapsack constraint. For solving this problem, we introduce the concept of semi-policywise submodularity, which is another adaptive extension of the classical notion of submodularity. We show that this property can be found in many real-world applications such as active learning and adaptive viral marketing. We develop a series of simple adaptive policies for this problem and prove that if the utility function is semi-policywise submodular, then our policies achieve constant approximation ratios against the optimal pool-based policy: one policy for a single cardinality constraint and one for a general knapsack constraint.
2 Related Work
Stream-based submodular optimization
Non-adaptive submodular maximization under the stream-based setting has been extensively studied. For example, [2] develop the first efficient non-adaptive streaming algorithm, SieveStreaming, which achieves a $(1/2-\epsilon)$ approximation ratio against the optimum solution; their algorithm requires only a single pass through the data and memory independent of the data size. [10] develop an enhanced streaming algorithm that requires less memory than SieveStreaming. Very recently, [11] propose a new algorithm that works well under the assumption that a single function evaluation is very expensive. [5] extend the previous studies from the non-adaptive setting to the adaptive setting and develop constant-factor approximation solutions for their problem; however, they assume that items arrive in a random order, in contrast to our adversarial arrival model. Our work is also related to submodular prophet inequalities [3, 12]. Although these works also consider an adversarial arrival model, their setting differs from ours in that (1) they assume item states are independent, and (2) they are allowed to observe an item's state before selecting it.
Adaptive submodular maximization
[6] introduce the concept of adaptive submodularity, which extends the notion of submodularity from sets to policies. They develop a simple adaptive greedy policy that achieves a $(1-1/e)$ approximation ratio if the function is adaptive monotone and adaptive submodular. When the utility function is non-monotone, [13] show that a randomized greedy policy achieves a $1/e$ approximation ratio subject to a cardinality constraint. Very recently, they generalize their previous study and develop the first constant approximation algorithms subject to more general constraints such as knapsack and $k$-system constraints [14]. Other variants of adaptive submodular maximization have been studied in [15, 18, 17, 19].
3 Preliminaries
3.1 Items
We consider a set $E$ of $n$ items. Each item $e \in E$ is in a random state $\Phi(e) \in O$, where $O$ represents the set of all possible states and $\Phi: E \to O$ denotes a random realization. Denote by $\phi$ a realization of $\Phi$; i.e., for each $e \in E$, $\phi(e)$ is a realization of $\Phi(e)$. In the application of experimental design, an item $e$ represents a test, such as blood pressure, and $\Phi(e)$ is the result of the test, such as high. We assume that there is a known prior probability distribution $p(\phi) = \Pr[\Phi = \phi]$ over realizations. The distribution $p$ completely factorizes if the item states are independent; however, we consider the general setting where states may be dependent. For any subset of items $S \subseteq E$, we use $\psi: S \to O$ to represent a partial realization, and $S$ is called the domain of $\psi$, denoted $\mathrm{dom}(\psi)$. For any pair of a partial realization $\psi$ and a realization $\phi$, we say $\phi$ is consistent with $\psi$, denoted $\phi \sim \psi$, if they are equal everywhere in $\mathrm{dom}(\psi)$. For any two partial realizations $\psi$ and $\psi'$, we say that $\psi$ is a subrealization of $\psi'$, denoted $\psi \subseteq \psi'$, if $\mathrm{dom}(\psi) \subseteq \mathrm{dom}(\psi')$ and they are consistent on $\mathrm{dom}(\psi)$. In addition, each item $e \in E$ has a cost $c(e)$. For any $S \subseteq E$, let $c(S) = \sum_{e \in S} c(e)$ denote the total cost of $S$.
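To make these definitions concrete, the following minimal Python sketch encodes a toy instance of the model; the item names, states, and probabilities are illustrative assumptions, not part of the formal development.

```python
# A toy instance of the model. Realizations are dicts mapping each item to
# its state; the prior is a list of (probability, realization) pairs, which
# naturally permits correlated item states.
items = ["blood_pressure", "heart_rate"]

prior = [
    (0.5, {"blood_pressure": "high", "heart_rate": "high"}),
    (0.3, {"blood_pressure": "high", "heart_rate": "normal"}),
    (0.2, {"blood_pressure": "normal", "heart_rate": "normal"}),
]

cost = {"blood_pressure": 1.0, "heart_rate": 1.0}

def total_cost(subset):
    """c(S): total cost of a subset of items."""
    return sum(cost[e] for e in subset)

def consistent(phi, psi):
    """phi ~ psi: realization phi agrees with partial realization psi."""
    return all(phi[e] == s for e, s in psi.items())
```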
3.2 Policies
In the stream-based setting, we assume that items arrive one by one in an adversarial order $\sigma$. A policy has to make an irrevocable decision on whether to select an item when it arrives. If an item is selected, then we are able to observe its realized state; otherwise, its realized state is never revealed. Formally, a stream-based policy $\pi$ is a partial mapping that maps a pair of a partial realization $\psi$ and an item $e$ to a distribution over $\{0, 1\}$: $\pi(\psi, e)$ specifies whether to select the arriving item $e$ based on the current observation $\psi$. For example, if the current observation is $\psi$ and the newly arrived item is $e$, then $\pi(\psi, e) = 1$ (resp. $\pi(\psi, e) = 0$) indicates that $\pi$ selects (resp. does not select) $e$.

Assume that there is a utility function $f: 2^E \times O^E \to \mathbb{R}_{\ge 0}$ defined over items and states. Letting $E(\pi, \phi, \sigma)$ denote the subset of items selected by a stream-based policy $\pi$ conditioned on a realization $\phi$ and a sequence of arrivals $\sigma$, the expected utility of $\pi$ conditioned on $\sigma$ can be written as
$$f_{avg}(\pi \mid \sigma) = \mathbb{E}_{\Phi \sim p, \Pi}\left[f(E(\pi, \Phi, \sigma), \Phi)\right],$$
where the expectation is taken over all possible realizations $\Phi$ and the internal randomness $\Pi$ of the policy $\pi$.
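Continuing the toy sketch above, $f_{avg}(\pi \mid \sigma)$ for a deterministic stream-based policy can be evaluated by enumerating the (small) prior; the set function `f(S, phi)` is assumed to be supplied by the application.

```python
# A sketch of evaluating f_avg(pi | sigma), assuming the toy `prior` above.
# A deterministic policy is any callable (psi, e) -> bool, matching the
# mapping pi(psi, e) in the text.
def expected_utility(policy, sigma, f):
    value = 0.0
    for p, phi in prior:
        psi = {}                      # current partial realization
        for e in sigma:               # items arrive in adversarial order sigma
            if policy(dict(psi), e):  # irrevocable accept/reject decision
                psi[e] = phi[e]       # selecting e reveals its state
        value += p * f(set(psi), phi)
    return value
```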
We next introduce the concept of policy concatenation which will be used in our proofs.
Definition 1 (Policy Concatenation)
Given two policies $\pi_1$ and $\pi_2$, let $\pi_1 @ \pi_2$ denote a policy that runs $\pi_1$ first, and then runs $\pi_2$, ignoring the observations obtained from running $\pi_1$.
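A minimal sketch of this operation in the simulation style used above, assuming each policy is represented by a function that returns its selected set under a full realization:

```python
# pi1 @ pi2: run pi1 to completion, then run pi2 from scratch, discarding
# pi1's observations, and take the union of the two selected sets.
def concatenate(run_pi1, run_pi2, phi):
    s1 = run_pi1(phi)   # items selected by pi1 under realization phi
    s2 = run_pi2(phi)   # pi2 ignores what pi1 observed
    return set(s1) | set(s2)
```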
3.2.1 Pool-based policy
When analyzing the performance of our stream-based policy, we compare it against the optimal pool-based policy, which is allowed to select items in any order. Note that any stream-based policy can be viewed as a special case of a pool-based policy; hence, an optimal pool-based policy performs at least as well as any optimal stream-based policy. Abusing notation slightly, we still use $\pi$ to represent a pool-based policy. Formally, a pool-based policy can be encoded as a partial mapping that maps partial realizations to a distribution over items: intuitively, $\pi(\psi)$ specifies which item to select next based on the current observation $\psi$. Letting $E(\pi, \phi)$ denote the subset of items selected by a pool-based policy $\pi$ conditioned on a realization $\phi$, the expected utility of a pool-based policy $\pi$ can be written as
$$f_{avg}(\pi) = \mathbb{E}_{\Phi \sim p, \Pi}\left[f(E(\pi, \Phi), \Phi)\right],$$
where the expectation is taken over all possible realizations and the internal randomness of the policy $\pi$. Note that if $\pi$ is a pool-based policy, then $f_{avg}(\pi \mid \sigma) = f_{avg}(\pi)$ for any sequence of arrivals $\sigma$, because the output of a pool-based policy does not depend on the sequence of arrivals.
3.3 Problem Formulation and Additional Notations
Our objective is to find a stream-based policy that maximizes the worst-case expected utility subject to a budget constraint $b$, i.e.,
$$\max_{\pi \in \Omega^s} \min_{\sigma} f_{avg}(\pi \mid \sigma), \tag{1}$$
where $\Omega^s$ represents the set of all feasible stream-based policies subject to the knapsack constraint $b$. That is, a feasible policy must satisfy $c(E(\pi, \phi, \sigma)) \le b$ under all possible realizations $\phi$ and sequences of arrivals $\sigma$.
We next introduce some additional notations and important assumptions in order to facilitate our study.
Definition 2 (Conditional Expected Marginal Utility of an Item)
Given a utility function $f$, the conditional expected marginal utility $\Delta(e \mid \psi)$ of an item $e$ on top of a partial realization $\psi$ is
$$\Delta(e \mid \psi) = \mathbb{E}_{\Phi}\left[f(\mathrm{dom}(\psi) \cup \{e\}, \Phi) - f(\mathrm{dom}(\psi), \Phi) \mid \Phi \sim \psi\right], \tag{2}$$
where the expectation is taken over $\Phi$ with respect to $p(\phi \mid \phi \sim \psi)$.
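In the toy sketch, Definition 2 amounts to conditioning the prior on $\phi \sim \psi$ and averaging the marginal gain of $e$; `prior` and `consistent` are the illustrative helpers defined earlier.

```python
# A sketch of Definition 2 on the toy model: condition the prior on
# realizations consistent with psi, then average f's marginal gain of e.
def marginal(e, psi, f):
    mass = sum(p for p, phi in prior if consistent(phi, psi))
    if mass == 0:
        return 0.0
    dom = set(psi)
    gain = sum(
        p * (f(dom | {e}, phi) - f(dom, phi))
        for p, phi in prior if consistent(phi, psi)
    )
    return gain / mass  # expectation over Phi ~ p(phi | phi ~ psi)
```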
Definition 3
([6], Adaptive Submodularity and Monotonicity) A function $f$ is adaptive submodular with respect to a prior $p$ if for any two partial realizations $\psi$ and $\psi'$ such that $\psi \subseteq \psi'$, and any item $e \in E \setminus \mathrm{dom}(\psi')$,
$$\Delta(e \mid \psi) \ge \Delta(e \mid \psi'). \tag{3}$$
Moreover, $f$ is adaptive monotone with respect to a prior $p$ if $\Delta(e \mid \psi) \ge 0$ for any partial realization $\psi$ and any item $e \in E$.
Definition 4 (Conditional Expected Marginal Utility of a Pool-based Policy)
Given a utility function $f$, the conditional expected marginal utility $\Delta(\pi \mid \psi)$ of a pool-based policy $\pi$ on top of a partial realization $\psi$ is
$$\Delta(\pi \mid \psi) = \mathbb{E}\left[f(\mathrm{dom}(\psi) \cup E(\pi, \Phi), \Phi) - f(\mathrm{dom}(\psi), \Phi) \mid \Phi \sim \psi\right],$$
where the expectation is taken over $\Phi$ with respect to $p(\phi \mid \phi \sim \psi)$ and the internal randomness of $\pi$.
We next introduce a new class of stochastic functions.
Definition 5 (Semi-Policywise Submodularity)
A function $f$ is semi-policywise submodular with respect to a prior $p$ and a knapsack constraint $b$ if for any partial realization $\psi$,
$$f_{avg}(\pi^*) \ge \max_{\pi \in \Omega^p} \Delta(\pi \mid \psi), \tag{4}$$
where $\Omega^p$ denotes the set of all pool-based policies feasible under the knapsack constraint $b$, i.e., $\Omega^p = \{\pi \mid c(E(\pi, \phi)) \le b \text{ for all } \phi\}$, and
$\pi^* \in \arg\max_{\pi \in \Omega^p} f_{avg}(\pi)$ represents an optimal pool-based policy subject to $b$.
In the rest of this paper, we always assume that our utility function $f$ is adaptive monotone, adaptive submodular, and semi-policywise submodular with respect to a prior $p$ and a knapsack constraint $b$. In the appendix, we show that this type of function can be found in a variety of important real-world applications. All missing proofs are deferred to the appendix.
4 Uniform Cost
We first study the case when all items have uniform costs, i.e., $c(e) = 1$ for all $e \in E$. Without loss of generality, assume the budget $b = k$ for some positive integer $k$. To solve this problem, we extend the non-adaptive solution of [2] to the adaptive setting.
4.1 Algorithm Design
Recall that $\pi^*$ represents an optimal pool-based policy subject to a budget constraint $k$. Suppose we can estimate $f_{avg}(\pi^*)$ approximately; i.e., we know a value $v$ such that $c \cdot f_{avg}(\pi^*) \le v \le f_{avg}(\pi^*)$ for some $c \in (0, 1]$. Our policy, called Online Adaptive Policy $\pi^a$, starts with an empty set. In each subsequent iteration, after observing an arriving item $e$, $\pi^a$ adds $e$ to its solution if the marginal value of $e$ on top of the current partial realization $\psi$ is at least $\frac{v}{2k}$, i.e., $\Delta(e \mid \psi) \ge \frac{v}{2k}$; otherwise, it skips $e$. This process iterates until there are no more arriving items or it reaches the cardinality constraint $k$. A detailed description of $\pi^a$ is listed in Algorithm 1.
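The following sketch mirrors this description, assuming the threshold $\frac{v}{2k}$ as reconstructed above and the `marginal` helper from Section 3.3; the stream is modeled as pairs of an item and a callable that reveals the item's state only upon selection.

```python
# A minimal sketch of the Online Adaptive Policy for unit costs, assuming an
# estimate v with c*OPT <= v <= OPT. Items are accepted when their conditional
# marginal clears the threshold v/(2k).
def online_adaptive_policy(stream, v, k, f):
    psi, selected = {}, []
    for e, reveal_state in stream:      # stream yields (item, state oracle)
        if len(selected) >= k:
            break                        # cardinality constraint reached
        if marginal(e, psi, f) >= v / (2 * k):
            selected.append(e)
            psi[e] = reveal_state()      # state observed only if selected
    return selected, psi
```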
4.2 Performance Analysis
We present the main result of this section in the following theorem.
Theorem 4.1
Assuming that we know a value $v$ such that $c \cdot f_{avg}(\pi^*) \le v \le f_{avg}(\pi^*)$ for some $c \in (0, 1]$, we have $f_{avg}(\pi^a \mid \sigma) \ge \frac{c}{4} \cdot f_{avg}(\pi^*)$ for any sequence of arrivals $\sigma$.
4.3 Offline Estimation of $f_{avg}(\pi^*)$
Recall that the design of $\pi^a$ requires that we know a good approximation $v$ of $f_{avg}(\pi^*)$. We next explain how to obtain such an estimate. It is well known that a simple greedy pool-based policy $\pi^g$ (outlined in Algorithm 2) provides a $(1-1/e)$ approximation for the pool-based adaptive submodular maximization problem subject to a cardinality constraint [6], i.e., $f_{avg}(\pi^g) \ge (1-1/e) \cdot f_{avg}(\pi^*)$. Hence, $f_{avg}(\pi^g)$ is a good approximation of $f_{avg}(\pi^*)$. In particular, if we set $v = f_{avg}(\pi^g)$, then we have $(1-1/e) \cdot f_{avg}(\pi^*) \le v \le f_{avg}(\pi^*)$. This, together with Theorem 4.1, implies that $\pi^a$ achieves a $\frac{1-1/e}{4}$ approximation ratio against $\pi^*$. One can estimate the value of $f_{avg}(\pi^g)$ by simulating $\pi^g$ on every possible realization $\phi$ to obtain $f(E(\pi^g, \phi), \phi)$ and letting $f_{avg}(\pi^g) = \sum_{\phi} p(\phi) \cdot f(E(\pi^g, \phi), \phi)$. When the number of possible realizations is large, one can sample a set of realizations according to $p$ and then run the simulation. Although obtaining a good estimate of $f_{avg}(\pi^*)$ may be time consuming, this only needs to be done once in an offline manner; thus, it does not contribute to the running time of the online implementation of $\pi^a$.
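A sketch of this offline estimation by sampling, where `run_greedy` is a hypothetical implementation of the pool-based greedy policy of Algorithm 2 (not specified in this paper's text):

```python
import random

# Estimate f_avg(pi^g) by sampling realizations from the prior and simulating
# the greedy policy on each sample. `run_greedy(phi, k, f)` is a hypothetical
# helper returning the k items greedy selects under realization phi.
def estimate_greedy_value(k, f, num_samples=1000):
    realizations = [phi for _, phi in prior]
    weights = [p for p, _ in prior]
    total = 0.0
    for _ in range(num_samples):
        phi = random.choices(realizations, weights=weights)[0]
        total += f(set(run_greedy(phi, k, f)), phi)
    return total / num_samples
```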
5 Nonuniform Cost
We next study the general case when items have nonuniform costs.
5.1 Algorithm Design
Suppose we can estimate $f_{avg}(\pi^*)$ approximately; i.e., we know a value $v$ such that $c \cdot f_{avg}(\pi^*) \le v \le f_{avg}(\pi^*)$ for some $c \in (0, 1]$. Our policy randomly selects a solution from $\lambda$ and $\pi^d$ with equal probability, where $\lambda$ is the best-singleton policy, which selects a single item $\arg\max_{e \in E: c(e) \le b} \Delta(e \mid \emptyset)$, and $\pi^d$, called Online Adaptive Policy with Nonuniform Cost, is a density-greedy policy. Hence, the expected utility of our policy is $\frac{1}{2}\big(f_{avg}(\lambda) + f_{avg}(\pi^d \mid \sigma)\big)$ for any given sequence of arrivals $\sigma$. We next explain the design of $\pi^d$. It starts with an empty set. In each subsequent iteration, after observing an arriving item $e$, it adds $e$ to its solution if the marginal value per unit budget of $e$ on top of the current partial realization $\psi$ is at least $\frac{v}{2b}$, i.e., $\Delta(e \mid \psi)/c(e) \ge \frac{v}{2b}$, and adding $e$ does not violate the budget constraint; otherwise, it skips $e$. This process iterates until there are no more arriving items or it reaches the first item (excluded) whose addition would violate the budget constraint. A detailed description of $\pi^d$ is listed in Algorithm 3.
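The following sketch mirrors the description of $\pi^d$, under the same assumptions as the uniform-cost sketch; the final policy then returns $\lambda$ or the output of this procedure with probability $1/2$ each.

```python
# A sketch of the density-greedy stream policy for nonuniform costs, assuming
# an estimate v with c*OPT <= v <= OPT and the `marginal` and `cost` helpers
# above. An item is accepted when its marginal-per-unit-cost clears v/(2b);
# the loop stops at the first dense item that would overflow the budget.
def density_greedy_policy(stream, v, b, f):
    psi, selected, spent = {}, [], 0.0
    for e, reveal_state in stream:
        if marginal(e, psi, f) / cost[e] >= v / (2 * b):
            if spent + cost[e] > b:
                break                    # first budget-violating item: stop
            selected.append(e)
            spent += cost[e]
            psi[e] = reveal_state()
    return selected, psi
```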
5.2 Performance Analysis
Before presenting the main theorem, we first introduce a technical lemma.
Lemma 1
Assuming that we know a value $v$ such that $c \cdot f_{avg}(\pi^*) \le v \le f_{avg}(\pi^*)$ for some $c \in (0, 1]$, we have $f_{avg}(\lambda) + f_{avg}(\pi^d \mid \sigma) \ge \frac{c}{4} \cdot f_{avg}(\pi^*)$ for any sequence of arrivals $\sigma$.
Proof: We first introduce an auxiliary policy $\pi'$ that follows the same procedure as $\pi^d$ except that $\pi'$ is allowed to add the first item that violates the budget constraint. Although $\pi'$ is not necessarily feasible, we next show that its expected utility is upper bounded by $f_{avg}(\pi^d \mid \sigma) + f_{avg}(\lambda)$ for any sequence of arrivals $\sigma$.
Proposition 1
For any sequence of arrivals $\sigma$, $f_{avg}(\pi' \mid \sigma) \le f_{avg}(\pi^d \mid \sigma) + f_{avg}(\lambda)$.
Proposition 1, whose proof is deferred to the appendix, implies that to prove this lemma, it suffices to show that $f_{avg}(\pi' \mid \sigma) \ge \frac{c}{4} \cdot f_{avg}(\pi^*)$. The rest of the analysis is devoted to proving this inequality for any fixed sequence of arrivals $\sigma$. We use $u$ to denote a fixed run of $\pi'$, where $\psi_u$ is the partial realization of the selected items and $z_u$ is the total number of items selected under run $u$. Let $U$ represent all possible runs of $\pi'$, let $U^+ = \{u \in U : c(\mathrm{dom}(\psi_u)) \ge b\}$ represent those runs where $\pi'$ meets or violates the budget, and let $U^- = U \setminus U^+$ represent those runs where $\pi'$ does not use up the budget. Therefore, $U = U^+ \cup U^-$. For each $u \in U$ and $t \le z_u$, let $e^u_t$ denote the $t$-th item selected under run $u$, and let $\psi^u_{t-1}$ denote the partial realization of the first $t-1$ selected items, with $\psi^u_0 = \emptyset$. Letting $\Pr[u]$ denote the probability that run $u$ occurs, we can represent $f_{avg}(\pi' \mid \sigma)$ as follows:
$$f_{avg}(\pi' \mid \sigma) = \sum_{u \in U} \Pr[u] \sum_{t=1}^{z_u} \Delta(e^u_t \mid \psi^u_{t-1}) \tag{5}$$
$$= \sum_{u \in U^+} \Pr[u] \sum_{t=1}^{z_u} \Delta(e^u_t \mid \psi^u_{t-1}) + \sum_{u \in U^-} \Pr[u] \sum_{t=1}^{z_u} \Delta(e^u_t \mid \psi^u_{t-1}). \tag{6}$$
Then we consider two cases. We first consider the runs in $U^+$ and show that the realized utility of each such run is lower bounded by $\frac{v}{2}$. According to the definition of $U^+$, we have $c(\mathrm{dom}(\psi_u)) \ge b$ for any $u \in U^+$. Moreover, recall that $\Delta(e^u_t \mid \psi^u_{t-1})/c(e^u_t) \ge \frac{v}{2b}$ for all $t \le z_u$, due to the design of our algorithm. Therefore, for any $u \in U^+$,
$$\sum_{t=1}^{z_u} \Delta(e^u_t \mid \psi^u_{t-1}) \ge \sum_{t=1}^{z_u} \frac{v}{2b} \cdot c(e^u_t) = \frac{v}{2b} \cdot c(\mathrm{dom}(\psi_u)) \ge \frac{v}{2}. \tag{7}$$
Because we assume that $v \ge c \cdot f_{avg}(\pi^*)$, we have
$$\sum_{t=1}^{z_u} \Delta(e^u_t \mid \psi^u_{t-1}) \ge \frac{v}{2} \ge \frac{c}{2} \cdot f_{avg}(\pi^*). \tag{8}$$
The first inequality is due to (7) and the second inequality is due to the assumption that $v \ge c \cdot f_{avg}(\pi^*)$. We conclude that the value of the $U^+$ part of (6) is no less than $\sum_{u \in U^+} \Pr[u] \cdot \frac{c}{2} \cdot f_{avg}(\pi^*)$, i.e.,
$$\sum_{u \in U^+} \Pr[u] \sum_{t=1}^{z_u} \Delta(e^u_t \mid \psi^u_{t-1}) \ge \sum_{u \in U^+} \Pr[u] \cdot \frac{c}{2} \cdot f_{avg}(\pi^*). \tag{9}$$
We next consider the runs in $U^-$. We show that in this case,
$$f_{avg}(\pi' \mid \sigma) \ge \sum_{u \in U^-} \Pr[u] \cdot \frac{1}{2} \cdot f_{avg}(\pi^*). \tag{10}$$
Because $f$ is adaptive monotone, we have $f_{avg}(\pi' @ \pi^* \mid \sigma) \ge f_{avg}(\pi^* \mid \sigma)$. To prove (10), it suffices to show that the gap between $f_{avg}(\pi' @ \pi^* \mid \sigma)$ and $f_{avg}(\pi' \mid \sigma)$ is at most $f_{avg}(\pi^*) - \sum_{u \in U^-} \Pr[u] \cdot \frac{f_{avg}(\pi^*)}{2}$.
Observe that we can represent this gap conditioned on $\sigma$ as follows:
$$f_{avg}(\pi' @ \pi^* \mid \sigma) - f_{avg}(\pi' \mid \sigma) = \sum_{u \in U^+} \Pr[u] \cdot \Delta(\pi^* \mid \psi_u) \tag{11}$$
$$\quad + \sum_{u \in U^-} \Pr[u] \cdot \Delta(\pi^* \mid \psi_u). \tag{12}$$
Because $f$ is semi-policywise submodular with respect to $p$ and $b$, we have $\max_{\pi \in \Omega^p} \Delta(\pi \mid \psi_u) \le f_{avg}(\pi^*)$ for any partial realization $\psi_u$. Moreover, because $\pi^* \in \Omega^p$, we have
$$\Delta(\pi^* \mid \psi_u) \le f_{avg}(\pi^*). \tag{13}$$
It follows that
$$\sum_{u \in U^+} \Pr[u] \cdot \Delta(\pi^* \mid \psi_u) \le \sum_{u \in U^+} \Pr[u] \cdot f_{avg}(\pi^*). \tag{14}$$
Next, we show that $\Delta(\pi^* \mid \psi_u)$ is upper bounded by $\frac{v}{2}$ for any $u \in U^-$. For any $u \in U^-$, we number all items $e \in E \setminus \mathrm{dom}(\psi_u)$ by decreasing ratio $\Delta(e \mid \psi_u)/c(e)$, i.e., $\Delta(e_1 \mid \psi_u)/c(e_1) \ge \Delta(e_2 \mid \psi_u)/c(e_2) \ge \cdots$. Let $l = \min\{i : c(\{e_1, \ldots, e_i\}) \ge b\}$. Define $M_u = \{e_1, \ldots, e_l\}$ as the set containing the first $l$ items. Intuitively, $M_u$ represents a set of best-looking items conditional on $\psi_u$. Consider any $e \in M_u$; assuming $e$ is the $i$-th item in $M_u$, let
$$x_e = \min\Big\{1, \frac{b - c(M_u^{i-1})}{c(e)}\Big\},$$
where $M_u^{i-1}$ represents the first $i-1$ items in $M_u$; by the choice of $l$, we have $\sum_{e \in M_u} x_e \cdot c(e) = b$.
In analogy to Lemma 1 of [9],
$$\Delta(\pi^* \mid \psi_u) \le \sum_{e \in M_u} x_e \cdot \Delta(e \mid \psi_u). \tag{15}$$
Note that for every $u \in U^-$, we have $c(\mathrm{dom}(\psi_u)) < b$; that is, $\pi'$ does not use up the budget under run $u$. This, together with the design of $\pi'$ and the adaptive submodularity of $f$, indicates that for any $e \in E \setminus \mathrm{dom}(\psi_u)$, its benefit-to-cost ratio on top of $\psi_u$ is less than $\frac{v}{2b}$, i.e., $\Delta(e \mid \psi_u)/c(e) < \frac{v}{2b}$. Therefore,
$$\sum_{e \in M_u} x_e \cdot \Delta(e \mid \psi_u) < \sum_{e \in M_u} x_e \cdot \frac{v}{2b} \cdot c(e) \tag{16}$$
$$= \frac{v}{2b} \cdot b = \frac{v}{2}. \tag{17}$$
We next provide an upper bound of $\sum_{u \in U^-} \Pr[u] \cdot \Delta(\pi^* \mid \psi_u)$:
$$\sum_{u \in U^-} \Pr[u] \cdot \Delta(\pi^* \mid \psi_u) \le \sum_{u \in U^-} \Pr[u] \cdot \frac{v}{2} \tag{18}$$
$$\le \sum_{u \in U^-} \Pr[u] \cdot \frac{f_{avg}(\pi^*)}{2}, \tag{19}$$
where the first inequality is due to (15) and (17), and the second inequality is due to $v \le f_{avg}(\pi^*)$.
Now we are in a position to bound the gap:
$$f_{avg}(\pi' @ \pi^* \mid \sigma) - f_{avg}(\pi' \mid \sigma) \le \sum_{u \in U^+} \Pr[u] \cdot f_{avg}(\pi^*) + \sum_{u \in U^-} \Pr[u] \cdot \frac{f_{avg}(\pi^*)}{2} \tag{20}$$
$$= \Big(1 - \sum_{u \in U^-} \Pr[u]\Big) \cdot f_{avg}(\pi^*) + \sum_{u \in U^-} \Pr[u] \cdot \frac{f_{avg}(\pi^*)}{2} \tag{21}$$
$$= f_{avg}(\pi^*) - \sum_{u \in U^-} \Pr[u] \cdot \frac{f_{avg}(\pi^*)}{2}. \tag{22}$$
The inequality is due to (14) and (19). Because $f_{avg}(\pi' @ \pi^* \mid \sigma) \ge f_{avg}(\pi^* \mid \sigma)$, which is due to $f$ being adaptive monotone, we have
$$f_{avg}(\pi' \mid \sigma) \ge f_{avg}(\pi' @ \pi^* \mid \sigma) - f_{avg}(\pi^*) + \sum_{u \in U^-} \Pr[u] \cdot \frac{f_{avg}(\pi^*)}{2} \tag{24}$$
$$\ge f_{avg}(\pi^* \mid \sigma) - f_{avg}(\pi^*) + \sum_{u \in U^-} \Pr[u] \cdot \frac{f_{avg}(\pi^*)}{2}, \tag{25}$$
where the first inequality is due to (22). This, together with the fact that $f_{avg}(\pi^* \mid \sigma) = f_{avg}(\pi^*)$, i.e., the optimal pool-based policy does not depend on the sequence of arrivals, implies (10).
Combining (9) and (10), and noting that $c \le 1$, we have $2 \cdot f_{avg}(\pi' \mid \sigma) \ge \sum_{u \in U^+} \Pr[u] \cdot \frac{c}{2} \cdot f_{avg}(\pi^*) + \sum_{u \in U^-} \Pr[u] \cdot \frac{c}{2} \cdot f_{avg}(\pi^*) = \frac{c}{2} \cdot f_{avg}(\pi^*)$; hence $f_{avg}(\pi' \mid \sigma) \ge \frac{c}{4} \cdot f_{avg}(\pi^*)$. This, together with Proposition 1, immediately concludes this lemma.
Recall that our final policy randomly picks a solution from $\lambda$ and $\pi^d$ with equal probability; thus, its expected utility is $\frac{1}{2}\big(f_{avg}(\lambda) + f_{avg}(\pi^d \mid \sigma)\big)$, which by Lemma 1 is lower bounded by $\frac{c}{8} \cdot f_{avg}(\pi^*)$. This implies the following main theorem.
Theorem 5.1
If we randomly pick a solution from $\lambda$ and $\pi^d$ with equal probability, then the resulting policy achieves a $\frac{c}{8}$ approximation ratio against the optimal pool-based policy $\pi^*$.
5.3 Offline Estimation of $f_{avg}(\pi^*)$
To complete the design of $\pi^d$, we next explain how to estimate the expected utility $f_{avg}(\pi^*)$ of the optimal pool-based policy. It has been shown that the better solution between $\lambda$ and a pool-based density-greedy policy $\pi^h$ (Algorithm 4) achieves a constant approximation for the pool-based adaptive submodular maximization problem subject to a knapsack constraint [20], i.e., $\max\{f_{avg}(\lambda), f_{avg}(\pi^h)\} \ge c_0 \cdot f_{avg}(\pi^*)$ for some constant $c_0 \in (0, 1]$. If we set $v = \max\{f_{avg}(\lambda), f_{avg}(\pi^h)\}$, then we have $c_0 \cdot f_{avg}(\pi^*) \le v \le f_{avg}(\pi^*)$. This, together with Theorem 5.1, implies that our policy achieves a constant approximation ratio against $\pi^*$. One can estimate the value of $f_{avg}(\pi^h)$ by simulating $\pi^h$ on every possible realization $\phi$ to obtain $f(E(\pi^h, \phi), \phi)$ and letting $f_{avg}(\pi^h) = \sum_{\phi} p(\phi) \cdot f(E(\pi^h, \phi), \phi)$. To estimate the value of $f_{avg}(\lambda)$, one can compute the value of $\Delta(e \mid \emptyset)$ for all $e \in E$ and return the best result.
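A sketch of the best-singleton policy $\lambda$ used above; since $\Delta(e \mid \emptyset)$ depends only on the prior, $\lambda$ can be computed entirely offline with the `marginal` and `cost` helpers from earlier.

```python
# The best-singleton policy lambda: the feasible item with the largest
# expected marginal utility on top of the empty partial realization.
def best_singleton(f, b):
    feasible = [e for e in items if cost[e] <= b]
    return max(feasible, key=lambda e: marginal(e, {}, f))
```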
References
- [1] Adibi, A., Mokhtari, A., Hassani, H.: Submodular meta-learning. Advances in Neural Information Processing Systems 33 (2020)
- [2] Badanidiyuru, A., Mirzasoleiman, B., Karbasi, A., Krause, A.: Streaming submodular maximization: Massive data summarization on the fly. In: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. pp. 671–680 (2014)
- [3] Chekuri, C., Livanos, V.: On submodular prophet inequalities and correlation gap. arXiv preprint arXiv:2107.03662 (2021)
- [4] Cuong, N.V., Lee, W.S., Ye, N., Chai, K.M., Chieu, H.L.: Active learning for probabilistic hypotheses using the maximum gibbs error criterion. Advances in Neural Information Processing Systems 26 (NIPS 2013) pp. 1457–1465 (2013)
- [5] Fujii, K., Kashima, H.: Budgeted stream-based active learning via adaptive submodular maximization. In: NIPS. vol. 16, pp. 514–522 (2016)
- [6] Golovin, D., Krause, A.: Adaptive submodularity: Theory and applications in active learning and stochastic optimization. Journal of Artificial Intelligence Research 42, 427–486 (2011)
- [7] Golovin, D., Krause, A., Ray, D.: Near-optimal bayesian active learning with noisy observations. In: NIPS (2010)
- [8] Gonen, A., Sabato, S., Shalev-Shwartz, S.: Efficient active learning of halfspaces: an aggressive approach. In: International Conference on Machine Learning. pp. 480–488. PMLR (2013)
- [9] Gotovos, A., Karbasi, A., Krause, A.: Non-monotone adaptive submodular maximization. In: Twenty-Fourth International Joint Conference on Artificial Intelligence (2015)
- [10] Kazemi, E., Mitrovic, M., Zadimoghaddam, M., Lattanzi, S., Karbasi, A.: Submodular streaming in all its glory: Tight approximation, minimum memory and low adaptive complexity. In: International Conference on Machine Learning. pp. 3311–3320. PMLR (2019)
- [11] Kuhnle, A.: Quick streaming algorithms for maximization of monotone submodular functions in linear time. In: International Conference on Artificial Intelligence and Statistics. pp. 1360–1368. PMLR (2021)
- [12] Rubinstein, A., Singla, S.: Combinatorial prophet inequalities. In: Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms. pp. 1671–1687. SIAM (2017)
- [13] Tang, S.: Beyond pointwise submodularity: Non-monotone adaptive submodular maximization in linear time. Theoretical Computer Science 850, 249–261 (2021)
- [14] Tang, S.: Beyond pointwise submodularity: Non-monotone adaptive submodular maximization subject to knapsack and $k$-system constraints. In: 4th international conference on “Modelling, Computation and Optimization in Information Systems and Management Sciences” (2021)
- [15] Tang, S.: Robust adaptive submodular maximization. CoRR abs/2107.11333 (2021), https://arxiv.org/abs/2107.11333
- [16] Tang, S., Yuan, J.: Influence maximization with partial feedback. Operations Research Letters 48(1), 24–28 (2020)
- [17] Tang, S., Yuan, J.: Adaptive regularized submodular maximization. In: 32nd International Symposium on Algorithms and Computation (ISAAC 2021). Schloss Dagstuhl-Leibniz-Zentrum für Informatik (2021)
- [18] Tang, S., Yuan, J.: Non-monotone adaptive submodular meta-learning. In: SIAM Conference on Applied and Computational Discrete Algorithms (ACDA21). pp. 57–65. SIAM (2021)
- [19] Tang, S., Yuan, J.: Optimal sampling gaps for adaptive submodular maximization. In: AAAI (2022)
- [20] Yuan, J., Tang, S.J.: Adaptive discount allocation in social networks. In: Proceedings of the 18th ACM International Symposium on Mobile Ad Hoc Networking and Computing. pp. 1–10 (2017)
6 Appendix
6.1 Proof of Theorem 4.1
Our proof is conducted conditional on a fixed sequence of arrivals $\sigma$. We use $u$ to denote a fixed run of $\pi^a$, where $\psi_u$ is the partial realization of the selected items and $z_u$ is the total number of items selected under run $u$. For any $t \le z_u$, let $e^u_t$ denote the $t$-th item selected under run $u$, and let $\psi^u_{t-1}$ denote the partial realization of the first $t-1$ selected items, with $\psi^u_0 = \emptyset$. Let $U$ represent all possible runs of $\pi^a$, let $U^+ = \{u \in U : z_u = k\}$ represent those runs where $\pi^a$ selects exactly $k$ items, and let $U^- = U \setminus U^+$ represent those runs where $\pi^a$ selects fewer than $k$ items. Therefore, $U = U^+ \cup U^-$. Letting $\Pr[u]$ denote the probability that run $u$ occurs, we can represent the expected utility of $\pi^a$ conditioned on $\sigma$ as follows:
$$f_{avg}(\pi^a \mid \sigma) = \sum_{u \in U} \Pr[u] \sum_{t=1}^{z_u} \Delta(e^u_t \mid \psi^u_{t-1}) \tag{27}$$
$$= \sum_{u \in U^+} \Pr[u] \sum_{t=1}^{z_u} \Delta(e^u_t \mid \psi^u_{t-1}) + \sum_{u \in U^-} \Pr[u] \sum_{t=1}^{z_u} \Delta(e^u_t \mid \psi^u_{t-1}). \tag{28}$$
We prove the theorem by considering two cases. We first consider the runs in $U^+$ and show that the corresponding part of (28) is lower bounded by $\sum_{u \in U^+} \Pr[u] \cdot \frac{c}{2} \cdot f_{avg}(\pi^*)$. According to the definition of $U^+$, we have $z_u = k$ for any $u \in U^+$. Moreover, recall that $\Delta(e^u_t \mid \psi^u_{t-1}) \ge \frac{v}{2k}$ for all $t \le z_u$, due to the design of our algorithm. Therefore, for any $u \in U^+$,
$$\sum_{t=1}^{z_u} \Delta(e^u_t \mid \psi^u_{t-1}) \ge k \cdot \frac{v}{2k} = \frac{v}{2}. \tag{29}$$
It follows that
$$\sum_{t=1}^{z_u} \Delta(e^u_t \mid \psi^u_{t-1}) \ge \frac{v}{2} \ge \frac{c}{2} \cdot f_{avg}(\pi^*). \tag{30}$$
The first inequality is due to (29) and the second inequality is due to the assumption that $v \ge c \cdot f_{avg}(\pi^*)$. We conclude that the value of the $U^+$ part of (28) is no less than $\sum_{u \in U^+} \Pr[u] \cdot \frac{c}{2} \cdot f_{avg}(\pi^*)$, i.e.,
$$\sum_{u \in U^+} \Pr[u] \sum_{t=1}^{z_u} \Delta(e^u_t \mid \psi^u_{t-1}) \ge \sum_{u \in U^+} \Pr[u] \cdot \frac{c}{2} \cdot f_{avg}(\pi^*). \tag{31}$$
We next consider the runs in $U^-$. We show that $f_{avg}(\pi^a \mid \sigma) \ge \sum_{u \in U^-} \Pr[u] \cdot \frac{1}{2} \cdot f_{avg}(\pi^*)$ for any sequence of arrivals $\sigma$ in this case. Observe that
$$f_{avg}(\pi^a @ \pi^* \mid \sigma) - f_{avg}(\pi^a \mid \sigma) = \sum_{u \in U^+} \Pr[u] \cdot \Delta(\pi^* \mid \psi_u) \tag{32}$$
$$\quad + \sum_{u \in U^-} \Pr[u] \cdot \Delta(\pi^* \mid \psi_u). \tag{33}$$
Because $f$ is semi-policywise submodular with respect to $p$ and $k$, we have $\max_{\pi \in \Omega^p} \Delta(\pi \mid \psi_u) \le f_{avg}(\pi^*)$ for any partial realization $\psi_u$. Moreover, because $\pi^* \in \Omega^p$, we have
$$\Delta(\pi^* \mid \psi_u) \le f_{avg}(\pi^*). \tag{34}$$
It follows that
$$\sum_{u \in U^+} \Pr[u] \cdot \Delta(\pi^* \mid \psi_u) \le \sum_{u \in U^+} \Pr[u] \cdot f_{avg}(\pi^*). \tag{35}$$
Next, we show that $\Delta(\pi^* \mid \psi_u)$ is upper bounded by $\frac{v}{2}$ for any $u \in U^-$. For any final partial realization $\psi_u$, let $M_u$ denote the set of $k$ items having the largest marginal utility on top of $\psi_u$. It has been shown [9] that if $f$ is adaptive submodular, then for any pool-based policy $\pi$ that selects at most $k$ items,
$$\Delta(\pi \mid \psi_u) \le \sum_{e \in M_u} \Delta(e \mid \psi_u). \tag{36}$$
Recall that for every $u \in U^-$, we have $z_u < k$; that is, $\pi^a$ selects fewer than $k$ items under run $u$. This, together with the design of $\pi^a$ and the adaptive submodularity of $f$, indicates that for any $e \in E \setminus \mathrm{dom}(\psi_u)$, its marginal utility on top of $\psi_u$ is less than $\frac{v}{2k}$, i.e., $\Delta(e \mid \psi_u) < \frac{v}{2k}$. Therefore,
$$\Delta(\pi^* \mid \psi_u) \le \sum_{e \in M_u} \Delta(e \mid \psi_u) \tag{37}$$
$$< k \cdot \frac{v}{2k} = \frac{v}{2}. \tag{38}$$
We next provide an upper bound of $\sum_{u \in U^-} \Pr[u] \cdot \Delta(\pi^* \mid \psi_u)$:
$$\sum_{u \in U^-} \Pr[u] \cdot \Delta(\pi^* \mid \psi_u) \le \sum_{u \in U^-} \Pr[u] \cdot \frac{v}{2} \le \sum_{u \in U^-} \Pr[u] \cdot \frac{f_{avg}(\pi^*)}{2}, \tag{39}$$
where the first inequality is due to (38) and the second inequality is due to $v \le f_{avg}(\pi^*)$.
Now we are in a position to bound the gap between $f_{avg}(\pi^a @ \pi^* \mid \sigma)$ and $f_{avg}(\pi^a \mid \sigma)$:
$$f_{avg}(\pi^a @ \pi^* \mid \sigma) - f_{avg}(\pi^a \mid \sigma) = \sum_{u \in U^+} \Pr[u] \cdot \Delta(\pi^* \mid \psi_u) + \sum_{u \in U^-} \Pr[u] \cdot \Delta(\pi^* \mid \psi_u) \tag{40}$$
$$\le \sum_{u \in U^+} \Pr[u] \cdot f_{avg}(\pi^*) + \sum_{u \in U^-} \Pr[u] \cdot \frac{f_{avg}(\pi^*)}{2} \tag{41}$$
$$= \Big(1 - \sum_{u \in U^-} \Pr[u]\Big) \cdot f_{avg}(\pi^*) + \sum_{u \in U^-} \Pr[u] \cdot \frac{f_{avg}(\pi^*)}{2} \tag{42}$$
$$= f_{avg}(\pi^*) - \sum_{u \in U^-} \Pr[u] \cdot \frac{f_{avg}(\pi^*)}{2}. \tag{43}$$
The inequality is due to (35) and (39). Because $f_{avg}(\pi^a @ \pi^* \mid \sigma) \ge f_{avg}(\pi^* \mid \sigma)$, which is due to $f$ being adaptive monotone, we have
$$f_{avg}(\pi^a \mid \sigma) \ge f_{avg}(\pi^a @ \pi^* \mid \sigma) - f_{avg}(\pi^*) + \sum_{u \in U^-} \Pr[u] \cdot \frac{f_{avg}(\pi^*)}{2} \tag{44}$$
$$\ge f_{avg}(\pi^* \mid \sigma) - f_{avg}(\pi^*) + \sum_{u \in U^-} \Pr[u] \cdot \frac{f_{avg}(\pi^*)}{2}, \tag{45}$$
where the first inequality is due to (43). This, together with the fact that $f_{avg}(\pi^* \mid \sigma) = f_{avg}(\pi^*)$, i.e., the optimal pool-based policy does not depend on the sequence of arrivals, implies that
$$f_{avg}(\pi^a \mid \sigma) \ge \sum_{u \in U^-} \Pr[u] \cdot \frac{1}{2} \cdot f_{avg}(\pi^*). \tag{46}$$
Combining the above two cases ((31) and (46)), and noting that $f_{avg}(\pi^a \mid \sigma) \ge \sum_{u \in U^+} \Pr[u] \sum_{t=1}^{z_u} \Delta(e^u_t \mid \psi^u_{t-1})$ and that $c \le 1$, we have
$$2 \cdot f_{avg}(\pi^a \mid \sigma) \ge \sum_{u \in U^+} \Pr[u] \cdot \frac{c}{2} \cdot f_{avg}(\pi^*) + \sum_{u \in U^-} \Pr[u] \cdot \frac{c}{2} \cdot f_{avg}(\pi^*) = \frac{c}{2} \cdot f_{avg}(\pi^*),$$
and hence
$$f_{avg}(\pi^a \mid \sigma) \ge \frac{c}{4} \cdot f_{avg}(\pi^*). \tag{47}$$
This completes the proof of Theorem 4.1.
6.2 Proof of Proposition 1
Let $u$ denote a fixed run of $\pi'$, where $\psi_u$ is the partial realization of the selected items and $z_u$ is the total number of items selected under run $u$. As in the proof of Lemma 1, let $U$ represent all possible runs of $\pi'$, let $U^+ = \{u \in U : c(\mathrm{dom}(\psi_u)) \ge b\}$ represent those runs where $\pi'$ meets or violates the budget, and let $U^- = U \setminus U^+$ represent those runs where $\pi'$ does not use up the budget. Therefore, $U = U^+ \cup U^-$. Using the above notations, we can represent $f_{avg}(\pi' \mid \sigma)$ as follows:
$$f_{avg}(\pi' \mid \sigma) = \sum_{u \in U^+} \Pr[u] \sum_{t=1}^{z_u} \Delta(e^u_t \mid \psi^u_{t-1}) + \sum_{u \in U^-} \Pr[u] \sum_{t=1}^{z_u} \Delta(e^u_t \mid \psi^u_{t-1}). \tag{48}$$
Note that the outputs of $\pi'$ and $\pi^d$ differ in at most one item. This occurs only when $\pi'$ selects some item that violates the budget constraint, and such an item can only be the last item selected in a run $u \in U^+$. Hence, by removing the last selected item from the output of $\pi'$ under all $u \in U^+$, we obtain a lower bound on the expected utility of $\pi^d$, using the same notations as for analyzing $\pi'$:
$$f_{avg}(\pi^d \mid \sigma) \ge \sum_{u \in U^+} \Pr[u] \sum_{t=1}^{z_u - 1} \Delta(e^u_t \mid \psi^u_{t-1}) + \sum_{u \in U^-} \Pr[u] \sum_{t=1}^{z_u} \Delta(e^u_t \mid \psi^u_{t-1}). \tag{50}$$
Hence,
$$f_{avg}(\pi' \mid \sigma) - f_{avg}(\pi^d \mid \sigma) \le \sum_{u \in U^+} \Pr[u] \cdot \Delta(e^u_{z_u} \mid \psi^u_{z_u - 1})$$
$$\le \sum_{u \in U^+} \Pr[u] \cdot \Delta(e^u_{z_u} \mid \emptyset)$$
$$\le \sum_{u \in U^+} \Pr[u] \cdot \max_{e \in E: c(e) \le b} \Delta(e \mid \emptyset) \le f_{avg}(\lambda).$$
The second inequality is due to the assumption that $f$ is adaptive submodular, which implies that $\Delta(e \mid \psi) \le \Delta(e \mid \emptyset)$ for any item $e$ and partial realization $\psi$. The third inequality assumes that every single item is feasible, i.e., $c(e) \le b$ for all $e \in E$, and the last inequality holds because $\lambda$ selects the single feasible item maximizing $\Delta(e \mid \emptyset)$ and $\sum_{u \in U^+} \Pr[u] \le 1$.
6.3 Applications
In this section, we show that both adaptive submodularity and semi-policywise submodularity can be found in several important applications. We first present the concept of policywise submodularity, which was first introduced in [19].
Definition 6
[19] A function $f$ is policywise submodular with respect to a prior $p$ and a knapsack constraint $b$ if for any two partial realizations $\psi$ and $\psi'$ such that $\psi \subseteq \psi'$, and any subset of items $V \subseteq E$ such that $V \cap \mathrm{dom}(\psi') = \emptyset$, we have $\max_{\pi \in \Omega^V} \Delta(\pi \mid \psi) \ge \max_{\pi \in \Omega^V} \Delta(\pi \mid \psi')$, where $\Omega^V$ denotes the set of feasible pool-based policies which are restricted to selecting items only from $V$.
In [19], it has been shown that many existing adaptive submodular functions used in various applications, including pool-based active learning [6, 7, 8, 4], stochastic submodular cover [1], and adaptive viral marketing [6], also satisfy policywise submodularity. Our next lemma shows that policywise submodularity implies semi-policywise submodularity, which indicates that all aforementioned applications satisfy both adaptive submodularity and semi-policywise submodularity.
Lemma 2
If $f$ is policywise submodular and adaptive monotone with respect to $p$ and all knapsack constraints, then $f$ is semi-policywise submodular with respect to $p$ and any knapsack constraint $b$.
Proof: Consider any partial realization $\psi$ and any knapsack constraint $b$, and let $V = E \setminus \mathrm{dom}(\psi)$. Because $\emptyset \subseteq \psi$, $V \cap \mathrm{dom}(\psi) = \emptyset$, and we assume $f$ is policywise submodular with respect to $p$ and all knapsack constraints, including $b$, we have
$$\max_{\pi \in \Omega^V} \Delta(\pi \mid \emptyset) \ge \max_{\pi \in \Omega^V} \Delta(\pi \mid \psi), \tag{61}$$
where $\Omega^V$ denotes the set of feasible pool-based policies which are restricted to selecting items only from $V$. Because items in $\mathrm{dom}(\psi)$ contribute no additional utility on top of $\psi$, it is easy to verify that $\max_{\pi \in \Omega^p} \Delta(\pi \mid \psi) = \max_{\pi \in \Omega^V} \Delta(\pi \mid \psi)$. Hence, (61) indicates that
$$\max_{\pi \in \Omega^p} \Delta(\pi \mid \psi) \le \max_{\pi \in \Omega^V} \Delta(\pi \mid \emptyset). \tag{62}$$
Moreover, because $\pi^*$ represents the best pool-based policy subject to $b$ and $\Omega^V \subseteq \Omega^p$, we have $\max_{\pi \in \Omega^V} \Delta(\pi \mid \emptyset) \le \max_{\pi \in \Omega^p} \Delta(\pi \mid \emptyset) = f_{avg}(\pi^*)$, where the equality assumes the standard normalization $f(\emptyset, \phi) = 0$ for all $\phi$. This, together with (62), implies that $\max_{\pi \in \Omega^p} \Delta(\pi \mid \psi) \le f_{avg}(\pi^*)$. Hence, $f$ is semi-policywise submodular with respect to $p$ and $b$.