Covering problems in edge- and node-weighted graphs
Abstract
This paper discusses the graph covering problem in which a set of edges in an edge- and node-weighted graph is chosen to satisfy some covering constraints while minimizing the sum of the weights. In this problem, because of the large integrality gap of a natural linear programming (LP) relaxation, LP rounding algorithms based on the relaxation yield poor performance. Here we propose a stronger LP relaxation for the graph covering problem. The proposed relaxation is applied to designing primal-dual algorithms for two fundamental graph covering problems: the prize-collecting edge dominating set problem and the multicut problem in trees. We obtain an exact polynomial-time algorithm for the former problem and a 2-approximation algorithm for the latter. These results match the currently known best results for purely edge-weighted graphs.
1 Introduction
1.1 Motivation
Choosing a set of edges in a graph that optimizes some objective function under constraints on the chosen edges constitutes a typical combinatorial optimization problem and has been investigated in many varieties. For example, the spanning tree problem seeks an acyclic edge set that spans all nodes in a graph, the edge cover problem finds an edge set such that each node is incident to at least one edge in the set, and the shortest path problem selects an edge set that connects two specified nodes. All these problems seek to minimize the sum of the weights assigned to edges.
This paper discusses several graph covering problems. Formally, the graph covering problem is defined as follows in this paper. Given a graph $G=(V,E)$ and a family $\mathcal{E} \subseteq 2^E$, find a subset $F$ of $E$ that satisfies $F \cap C \neq \emptyset$ for each $C \in \mathcal{E}$, while optimizing some function depending on $F$. As indicated above, the popular approaches assume that an edge weight function $w \colon E \to \mathbb{R}_+$ is given, where $\mathbb{R}_+$ denotes the set of non-negative real numbers, and seek to minimize $\sum_{e \in F} w(e)$. By contrast, we aim to minimize edge and node weights simultaneously. Formally, we let $V(F)$ denote the set of end nodes of edges in $F$. Given a graph $G=(V,E)$ and a weight function $w \colon V \cup E \to \mathbb{R}_+$, we seek a subset $F$ of $E$ that minimizes $\sum_{e \in F} w(e) + \sum_{v \in V(F)} w(v)$ under the constraints on $F$. Hereafter, we denote $\sum_{e \in F} w(e)$ and $\sum_{v \in V(F)} w(v)$ by $w(F)$ and $w(V(F))$, respectively.
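The objective just defined can be sketched in a few lines of Python. This is only an illustration: the function name `covering_cost`, the representation of edges as node pairs, and the sample weights are assumptions made for this example and do not come from the paper.

```python
def covering_cost(F, edge_weight, node_weight):
    """Return w(F) + w(V(F)) for an edge set F given as (u, v) pairs."""
    # V(F): every end node of an edge in F, collected once in a set.
    end_nodes = {node for edge in F for node in edge}
    edge_part = sum(edge_weight[e] for e in F)
    node_part = sum(node_weight[v] for v in end_nodes)
    return edge_part + node_part


# Hypothetical weights on the path a - b - c.
edge_weight = {("a", "b"): 2, ("b", "c"): 3}
node_weight = {"a": 1, "b": 5, "c": 1}

both = covering_cost([("a", "b"), ("b", "c")], edge_weight, node_weight)
separate = (covering_cost([("a", "b")], edge_weight, node_weight)
            + covering_cost([("b", "c")], edge_weight, node_weight))
```

Here the shared node `b` is charged once when both edges are chosen together, so `both` is smaller than `separate`.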
Most previous investigations of the graph covering problem have focused on edge weights. By contrast, node weights have been largely neglected, except in the problems of choosing node sets, such as the vertex cover and dominating set problems. To our knowledge, when node weights have been considered in graph covering problems for choosing edge sets, they have been restricted to the Steiner tree problem or its generalizations, possibly because the inclusion of node weights greatly complicates the problem. For example, the Steiner tree problem in edge-weighted graphs can be approximated within a constant factor (the best currently known approximation factor is 1.39 [5, 15]). By contrast, the Steiner tree problem in node-weighted graphs is known to extend the set cover problem (see [20]), indicating that achieving an approximation factor of $o(\log |V|)$ is NP-hard. The literature is reviewed in Section 2. As revealed later, the inclusion of node weights generalizes the set cover problem in numerous fundamental problems.
However, from another perspective, node weights can introduce rich structure into the above problems. In fact, node weights provide useful optimization problems. The objective function counts the weight of a node only once, even if the node is shared by multiple edges. Hence, the objective function defined from node weights includes a certain subadditivity, which cannot be captured by edge weights.
The aim of the present paper is to give algorithms for fundamental graph covering problems in edge- and node-weighted graphs. In solving the problems, we adopt a basic linear programming (LP) technique. Many algorithms for combinatorial optimization problems are typically designed using LP relaxations. However, in problems with node-weighted graphs, the integrality gap of natural relaxations may be excessively large. Therefore, we propose tighter LP relaxations that preclude unnecessary integrality gaps. We then discuss upper bounds on the integrality gap of these relaxations in two fundamental graph covering problems: the edge dominating set (EDS) problem and multicut problem in trees. We prove upper bounds by designing primal-dual algorithms for both problems. The approximation factors of our proposed algorithms match the current best approximations in purely edge-weighted graphs.
1.2 Problem definitions
The EDS problem covers edges by choosing adjacent edges in undirected graphs. For any edge $e \in E$, let $\delta(e)$ denote the set of edges that share end nodes with $e$, including $e$ itself. We say that an edge $e$ dominates another edge $f$ if $f \in \delta(e)$, and a set $F$ of edges dominates an edge $f$ if $F$ contains an edge that dominates $f$. Given an undirected graph $G=(V,E)$, a set of edges is called an EDS if it dominates each edge in $E$. The EDS problem seeks to minimize the weight of the EDS. In other words, the EDS problem is the graph covering problem with $\mathcal{E} = \{\delta(e) \colon e \in E\}$.
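The definition of an EDS can be checked directly. The following is a minimal sketch, assuming edges are stored as node pairs; the name `is_edge_dominating_set` is hypothetical. It uses the observation that an edge is dominated exactly when one of its end nodes is an end node of a chosen edge.

```python
def is_edge_dominating_set(edges, F):
    """Return True if every edge shares an end node with some edge of F."""
    # End nodes of the chosen edges; an edge of F dominates itself.
    covered_nodes = {node for edge in F for node in edge}
    return all(u in covered_nodes or v in covered_nodes for (u, v) in edges)


# Path a - b - c - d as a small example.
path_edges = [("a", "b"), ("b", "c"), ("c", "d")]
middle_only = is_edge_dominating_set(path_edges, [("b", "c")])
first_only = is_edge_dominating_set(path_edges, [("a", "b")])
```

On this path, the middle edge alone dominates all three edges, whereas the first edge leaves the last edge undominated.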
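In the multicut problem, an instance specifies an undirected graph $G=(V,E)$ and demand pairs $(s_1,t_1),\ldots,(s_k,t_k) \in V \times V$. A multicut is an edge set whose removal from $G$ disconnects the nodes in each demand pair. This problem seeks a multicut of minimum weight. Let $\mathcal{P}_i$ denote the set of paths connecting $s_i$ and $t_i$, where each path is regarded as the set of its edges. The multicut problem is equivalent to the graph covering problem with $\mathcal{E} = \bigcup_{i=1}^{k} \mathcal{P}_i$.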
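As an illustration of the definition, the sketch below tests whether an edge set is a multicut by removing it and running a breadth-first search from each source. The function name `is_multicut` and the data layout are assumptions of this example, not part of the paper.

```python
from collections import deque


def is_multicut(nodes, edges, demand_pairs, F):
    """Return True if removing F disconnects every demand pair (s, t)."""
    removed = {frozenset(e) for e in F}
    adjacency = {v: [] for v in nodes}
    for u, v in edges:
        if frozenset((u, v)) not in removed:
            adjacency[u].append(v)
            adjacency[v].append(u)

    def reachable(source):
        # Plain breadth-first search in the graph without the edges of F.
        seen = {source}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        return seen

    return all(t not in reachable(s) for s, t in demand_pairs)


# Star with center c and three leaves.
star_nodes = ["c", "l1", "l2", "l3"]
star_edges = [("c", "l1"), ("c", "l2"), ("c", "l3")]
pairs = [("l1", "l2"), ("l2", "l3")]
cut_l2 = is_multicut(star_nodes, star_edges, pairs, [("c", "l2")])
cut_l1 = is_multicut(star_nodes, star_edges, pairs, [("c", "l1")])
```

Cutting the edge to `l2` separates both pairs, while cutting the edge to `l1` leaves `l2` and `l3` connected.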
Our proposed algorithms for solving these problems assume that the given graph $G$ is a tree. In fact, our algorithms are applicable to the prize-collecting versions of these problems, which additionally specify a penalty function $\pi \colon \mathcal{E} \to \mathbb{R}_+$. In this scenario, an edge set $F$ is a feasible solution even if $F \cap C = \emptyset$ for some $C \in \mathcal{E}$, but this imposes a penalty $\pi(C)$. The objective is to minimize the sum of $w(F)$, $w(V(F))$, and the penalty $\sum_{C \in \mathcal{E} \colon F \cap C = \emptyset} \pi(C)$. The prize-collecting versions of the EDS and multicut problems are referred to as the prize-collecting EDS problem and the prize-collecting multicut problem, respectively.
1.3 Our results
Thus far, the EDS problem has been studied only in edge-weighted graphs. The vertex cover problem can be reduced to the EDS problem while preserving the approximation factors [6]. The vertex cover problem admits a 2-approximation algorithm, and this factor is widely believed to be the best possible. Indeed, assuming the unique games conjecture, Khot and Regev [19] proved that the vertex cover problem cannot be approximated within a factor of $2-\epsilon$ for any $\epsilon > 0$. Fujito and Nagamochi [11] showed that the EDS problem admits a 2-approximation algorithm, which matches the approximation hardness known for the vertex cover problem. In the Appendix, we show that the EDS problem in bipartite graphs generalizes the set cover problem if node weights are assigned, and generalizes the non-metric facility location problem if both edge and node weights are assigned. This implies that including node weights increases the difficulty of the problem even in bipartite graphs.
On the other hand, Kamiyama [18] proved that the prize-collecting EDS problem in an edge-weighted graph admits an exact polynomial-time algorithm if the graph is a tree. As one of our main results, we show that this idea is extendible to problems in edge- and node-weighted trees.
Theorem 1.
The prize-collecting EDS problem admits a polynomial-time exact algorithm for edge- and node-weighted trees.
Theorem 1 will be proven in Section 4. As demonstrated in the Appendix, the prize-collecting EDS problem in general edge- and node-weighted graphs admits an $O(\log |V|)$-approximation, which matches the approximation hardness of the set cover problem and the non-metric facility location problem.
The multicut problem is hard even in edge-weighted graphs; the best reported approximation factor is $O(\log k)$ [13]. The multicut problem is known to be both NP-hard and MAX SNP-hard [10], and admits no constant factor approximation algorithm under the unique games conjecture [7]. However, Garg, Vazirani, and Yannakakis [14] developed a 2-approximation algorithm for the multicut problem in edge-weighted trees. They also observed that, although the graphs are restricted to trees, the structure of the problem remains sufficiently rich. They showed that the tree multicut problem includes the set cover problem with tree-representable set systems. They also showed that the vertex cover problem in general graphs is simply reducible to the multicut problem in star graphs, while preserving the approximation factor. This suggests that the factor of 2 is tight for the multicut problem in trees. As a second main result, we extend this 2-approximation to edge- and node-weighted trees, as stated in the following theorem.
Theorem 2.
The prize-collecting multicut problem admits a 2-approximation algorithm for edge- and node-weighted trees.
Both algorithms claimed in Theorems 1 and 2 are primal-dual algorithms that use the LP relaxations we propose. These algorithms fall within the same frameworks as those proposed in [14, 18] for edge-weighted graphs. However, they need several new ideas to achieve the claimed performance because our LP relaxations are much more complicated than those used in [14, 18].
The remainder of this paper is organized as follows. After surveying related work in Section 2, we define our LP relaxation for the prize-collecting graph covering problem in Section 3. Using this relaxation, we prove Theorems 1 and 2 in Sections 4 and 5, respectively. The paper concludes with Section 6. In the Appendix, we show that the prize-collecting EDS problem in edge- and node-weighted graphs generalizes the set cover problem and the facility location problem, and admits an $O(\log |V|)$-approximation algorithm.
2 Related work
As mentioned in Section 1, the graph covering problem in node-weighted graphs has thus far been studied for the Steiner tree problem and its generalizations. Klein and Ravi [20] proposed an $O(\log |V|)$-approximation algorithm for the Steiner tree problem with node weights. Nutov [25, 26] extended this algorithm to the survivable network design problem with higher connectivity requirements. An $O(\log |V|)$-approximation algorithm for the prize-collecting Steiner tree problem with node weights was provided by Moss and Rabani [23]; however, as noted by Könemann, Sadeghian, and Sanità [21], the proof of this algorithm contains a technical error. This error was corrected in [21]. Bateni, Hajiaghayi, and Liaghat [1] proposed an $O(\log |V|)$-approximation algorithm for the prize-collecting Steiner forest problem and applied it to the budgeted Steiner tree problem. Chekuri, Ene, and Vakilian [8] gave an $O(k^2 \log |V|)$-approximation algorithm for the prize-collecting survivable network design problem with edge-connectivity requirements of maximum value $k$. Later, they improved their approximation factor to $O(k \log |V|)$, and also extended it to node-connectivity requirements (see [30]). Naor, Panigrahi, and Singh [24] established an online algorithm for the Steiner tree problem with node weights, which was extended to the Steiner forest problem by Hajiaghayi, Liaghat, and Panigrahi [16]. The survivable network design problem with node weights has also been extended to a problem called the network activation problem [28, 27, 12].
The prize-collecting EDS problem generalizes the $\{0,1\}$-EDS problem, in which given demand edges must be dominated by a solution edge set. The $\{0,1\}$-EDS problem in general edge-weighted graphs admits an $8/3$-approximation, which was proven by Berger et al. [2]. This $8/3$-approximation was extended to the prize-collecting EDS problem by Parekh [29]. Berger and Parekh [3] designed an exact algorithm for the $\{0,1\}$-EDS problem in edge-weighted trees, but their result contains an error [4]. Since the prize-collecting EDS problem contains the $\{0,1\}$-EDS problem as a special case, the latter problem can alternatively be solved by the algorithm for the prize-collecting EDS problem in edge-weighted trees proposed by Kamiyama [18].
3 LP relaxations
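This section discusses LP relaxations for the prize-collecting graph covering problem in edge- and node-weighted graphs.
In a natural integer programming (IP) formulation of the graph covering problem, each edge $e \in E$ is associated with a variable $x(e) \in \{0,1\}$, and each node $v \in V$ is associated with a variable $x(v) \in \{0,1\}$. $x(e)=1$ denotes that $e$ is selected as part of the solution set, while $x(v)=1$ indicates the selection of an edge incident to $v$. In the prize-collecting version, each demand set $C \in \mathcal{E}$ is also associated with a variable $z(C) \in \{0,1\}$, where $z(C)=1$ indicates that the covering constraint corresponding to $C$ is not satisfied. For $v \in V$ and $F \subseteq E$, we let $\delta_F(v)$ denote the set of edges incident to $v$ in $F$. The subscript may be removed when $F=E$. An IP of the prize-collecting graph covering problem is then formulated as follows.
\[
\begin{array}{lll}
\text{minimize} & \displaystyle\sum_{e \in E} w(e)x(e) + \sum_{v \in V} w(v)x(v) + \sum_{C \in \mathcal{E}} \pi(C)z(C) & \\
\text{subject to} & \displaystyle\sum_{e \in C} x(e) \geq 1 - z(C) & \text{for } C \in \mathcal{E}, \\
 & x(v) \geq x(e) & \text{for } v \in V,\ e \in \delta(v), \\
 & x(e) \in \{0,1\} & \text{for } e \in E, \\
 & x(v) \in \{0,1\} & \text{for } v \in V, \\
 & z(C) \in \{0,1\} & \text{for } C \in \mathcal{E}.
\end{array}
\]
In the above formulation, the first constraints specify the covering constraints, while the second constraints indicate that if the solution contains an edge incident to $v$, then $x(v) = 1$. In the graph covering problem (without penalties), $z$ is fixed at 0.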
To obtain an LP relaxation, we relax the integrality of $x$ and $z$ in the above IP to $x \geq 0$ and $z \geq 0$. However, this relaxation may introduce a large integrality gap into the graph covering problem with node-weighted graphs, as shown in the following example. Suppose that $\mathcal{E}$ comprises a single edge set $C$, and each edge in $C$ is incident to a node $v$. Let the weights of all edges and nodes other than $v$ be 0. In this scenario, the optimal value of the graph covering problem is $w(v)$. On the other hand, the LP relaxation admits a feasible solution $x$ such that $x(v) = 1/|C|$ and $x(e) = 1/|C|$ for each edge $e \in C$. The weight of this solution is $w(v)/|C|$, and the integrality gap of the relaxation for this instance is $|C|$.
This phenomenon occurs even in the EDS problem and the multicut problem in trees. For instance, consider a star with $n$ leaves in the EDS problem. The weight of all edges and nodes is 0 except the center node $v$, whose weight is 1. In this instance of the EDS problem, the weight of any EDS is 1. On the other hand, the LP relaxation admits a feasible solution $x$ such that $x(e) = 1/n$ for each edge $e$, and $x(u) = 1/n$ for each node $u$. Since the weight of this fractional solution is $1/n$, the integrality gap is $n$.
Let $uv$ denote an edge that joins nodes $u$ and $v$. To determine the integrality gap in the multicut problem, we take the star with $n$ leaves and subdivide each edge $vu$ between the center $v$ and a leaf $u$ into two edges $vu'$ and $u'u$. The subdivision imposes a weight of 1 on the new node $u'$. All edges and remaining nodes (i.e., the center node and all leaves) have weight 0. All pairs of leaves are demand pairs. A path between the center node and a leaf is called a leg. In this instance, any multicut must choose at least one edge from each of $n-1$ legs, and every edge of a leg is incident to the weight-1 node of that leg. Hence, the minimum multicut weight is $n-1$. On the other hand, if $x(e) = 1/4$ for every edge $e$ and $x(u) = 1/4$ for every node $u$, the weight is $n/4$ (such a fractional solution is feasible to the relaxation because each demand path consists of four edges). Hence, the integrality gap of the relaxation is at least $4(n-1)/n$. By contrast, Garg, Vazirani, and Yannakakis [14] proved that the integrality gap of the relaxation is at most 2 when node weights are not considered.
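The star example for the EDS problem can be verified numerically. The sketch below is an illustration only: it assumes the natural relaxation has covering constraints over each $\delta(e)$ and linking constraints $x(v) \geq x(e)$, and the name `star_gap` and the node labels are hypothetical. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction


def star_gap(n):
    """Ratio between the integral optimum and a fractional solution on a star."""
    edges = [("center", f"leaf{i}") for i in range(n)]
    x_edge = {e: Fraction(1, n) for e in edges}
    x_node = {"center": Fraction(1, n)}
    x_node.update({leaf: Fraction(1, n) for _, leaf in edges})

    # Covering constraints: in a star, delta(e) is the whole edge set.
    assert all(sum(x_edge.values()) >= 1 for _ in edges)
    # Linking constraints of the natural relaxation: x(v) >= x(e).
    assert all(x_node[v] >= x_edge[e] for e in edges for v in e)

    fractional_weight = x_node["center"]  # only the center has weight 1
    integral_weight = 1                   # any EDS touches the center
    return integral_weight / fractional_weight


gap_for_five_leaves = star_gap(5)
```

The returned ratio grows linearly with the number of leaves, which is the behavior the stronger relaxation is designed to rule out.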
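The above poor examples can be excluded if the second constraints in the relaxation are replaced by $x(v) \geq \sum_{e \in \delta(v)} x(e)$ for $v \in V$. However, the LP obtained by this modification does not relax the graph covering problem if the optimal solutions contain high-degree nodes. Thus, we introduce a new variable $y(C,e)$ for each pair of $C \in \mathcal{E}$ and $e \in C$, and replace the second constraints by $x(v) \geq \sum_{e \in \delta_C(v)} y(C,e)$, where $v \in V$ and $C \in \mathcal{E}$. $y(C,e)=1$ indicates that $e$ is chosen to satisfy the covering constraint of $C$, and $y(C,e)=0$ implies the opposite. Roughly speaking, $y(C,\cdot)$ represents a minimal fractional solution for covering a single demand set $C$. Since the degree of each node is at most one in any minimal integral solution for a single covering constraint, the LP remains a relaxation of the graph covering problem after this modification. Summing up, we formulate our LP relaxation $P(I)$ for an instance $I$ of the prize-collecting graph covering problem as follows.
\[
\begin{array}{lll}
\text{minimize} & \displaystyle\sum_{e \in E} w(e)x(e) + \sum_{v \in V} w(v)x(v) + \sum_{C \in \mathcal{E}} \pi(C)z(C) & \\
\text{subject to} & \displaystyle\sum_{e \in C} y(C,e) \geq 1 - z(C) & \text{for } C \in \mathcal{E}, \\
 & \displaystyle x(v) \geq \sum_{e \in \delta_C(v)} y(C,e) & \text{for } v \in V,\ C \in \mathcal{E}, \\
 & x(e) \geq y(C,e) & \text{for } C \in \mathcal{E},\ e \in C, \\
 & x(e),\ x(v),\ y(C,e),\ z(C) \geq 0 & \text{for } e \in E,\ v \in V,\ C \in \mathcal{E}.
\end{array}
\]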
Theorem 3.
Let $I$ be an instance of the prize-collecting graph covering problem in edge- and node-weighted graphs. The optimal value $P(I)$ of the LP relaxation for $I$ is not greater than the optimal value of $I$.
Proof.
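Let $F$ be an optimal solution of $I$. We define a solution $(x,y,z)$ of $P(I)$ from $F$. For each $C \in \mathcal{E}$, we set $z(C)$ to 0 if $F \cap C \neq \emptyset$, and to 1 otherwise. If $F \cap C \neq \emptyset$, we choose an arbitrary edge $e \in F \cap C$, and let $y(C,e)=1$. For the remaining edges $e' \in C \setminus \{e\}$, we assign $y(C,e')=0$; if $F \cap C = \emptyset$, we assign $y(C,e')=0$ for all $e' \in C$. In this way, the values of variables in $y$ are defined for each $C \in \mathcal{E}$. $x(e)$ is set to 1 if $e \in F$, and 0 otherwise. $x(v)$ is set to 1 if $F$ contains an edge incident to $v$, and 0 otherwise.
For each $C \in \mathcal{E}$ with $z(C)=0$, exactly one edge $e \in C$ satisfies $y(C,e)=1$, and this $e$ is included in $F$. If $y(C,e)=1$, then $x(e)=1$, and each end node $v$ of $e$ satisfies $x(v)=1$. For a pair of $v \in V$ and $C \in \mathcal{E}$, $y(C,e)$ is one for at most one edge $e \in \delta_C(v)$, and zero for the remaining edges in $\delta_C(v)$. Thus, $(x,y,z)$ is feasible. The objective value of $(x,y,z)$ in $P(I)$ is given by $w(F) + w(V(F)) + \sum_{C \in \mathcal{E} \colon F \cap C = \emptyset} \pi(C)$, which is the optimal value of $I$, and the theorem is proven. ∎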
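The construction in the proof can be sketched as follows. This is an illustration under stated assumptions: the relaxation is assumed to have the three constraint families $\sum_{e \in C} y(C,e) \geq 1 - z(C)$, $x(v) \geq \sum_{e \in \delta_C(v)} y(C,e)$, and $x(e) \geq y(C,e)$; the function names and the small instance are hypothetical.

```python
def lp_solution_from_edge_set(F, demand_sets):
    """Build the 0/1 vector (x, y, z) from an edge set F as in the proof."""
    F = set(F)
    x_edge = {e: 1 for e in F}
    x_node = {v: 1 for e in F for v in e}
    y, z = {}, {}
    for C in demand_sets:
        hit = sorted(F & C)
        z[C] = 0 if hit else 1
        if hit:
            y[(C, hit[0])] = 1  # one arbitrary edge of F inside C
    return x_edge, x_node, y, z


def is_feasible(nodes, demand_sets, x_edge, x_node, y, z):
    """Check the assumed constraints; entries missing from a dict count as 0."""
    for C in demand_sets:
        if sum(y.get((C, e), 0) for e in C) < 1 - z[C]:
            return False
        for e in C:
            if x_edge.get(e, 0) < y.get((C, e), 0):
                return False
        for v in nodes:
            load = sum(y.get((C, e), 0) for e in C if v in e)
            if x_node.get(v, 0) < load:
                return False
    return True


# Path a - b - c - d with three hypothetical demand sets.
nodes = ["a", "b", "c", "d"]
ab, bc, cd = ("a", "b"), ("b", "c"), ("c", "d")
demand_sets = [frozenset({ab, bc}), frozenset({bc, cd}), frozenset({cd})]
solution = lp_solution_from_edge_set({bc}, demand_sets)
feasible = is_feasible(nodes, demand_sets, *solution)
```

With $F$ consisting of the middle edge only, the third demand set is left uncovered, so its $z$ value is 1 and the covering constraint for it holds trivially.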
In some graph covering problems, $\mathcal{E}$ is not explicitly given, and $|\mathcal{E}|$ is not bounded by a polynomial in the input size of the problem. In such cases, the above LP may not be solvable in polynomial time because it cannot be written compactly. However, in this scenario, we may define a tighter LP than the natural relaxation if we can find $\mathcal{E}_1,\ldots,\mathcal{E}_t \subseteq 2^E$ such that $\mathcal{E} = \bigcup_{i=1}^{t} \mathcal{E}_i$, $t$ is bounded by a polynomial in the input size, and the degree of each node is small in any minimal edge set covering all demand sets in $\mathcal{E}_i$ for each $i$. Applying these conditions, the present author obtained a new approximation algorithm for solving a problem generalizing some prize-collecting graph covering problems [12].
4 Prize-collecting EDS problem in trees
In this section, we prove Theorem 1. We regard the input graph as a rooted tree, with an arbitrary node selected as the root. The depth of a node is the number of edges on the path between and . When lies on the path between and another node , we say that is an ancestor of and is a descendant of . If the depth of node is the maximum among all ancestors of , then is defined as the parent of . If is the parent of , then is a child of . The upper and lower end nodes of an edge are denoted by and , respectively. We say that an edge is an ancestor of a node and is a descendant of when or is an ancestor of . Similarly, an edge is a descendant of a node and is an ancestor of if or is an ancestor of . An edge is defined as an ancestor of another edge if is an ancestor of .
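The rooted-tree terminology above translates into a short helper. The sketch below assumes the tree is given as an adjacency dictionary; the names `root_tree` and `is_ancestor` are hypothetical and only illustrate the definitions of depth, parent, and ancestor.

```python
from collections import deque


def root_tree(adjacency, root):
    """Return (parent, depth) dictionaries for the tree rooted at root."""
    parent = {root: None}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in parent:  # v is a child of u
                parent[v] = u
                depth[v] = depth[u] + 1
                queue.append(v)
    return parent, depth


def is_ancestor(parent, u, v):
    """Return True if u lies on the path from the root to v and u != v."""
    v = parent[v]
    while v is not None:
        if v == u:
            return True
        v = parent[v]
    return False


# Small tree: r has children a and b, and a has a child c.
adjacency = {"r": ["a", "b"], "a": ["r", "c"], "b": ["r"], "c": ["a"]}
parent, depth = root_tree(adjacency, "r")
```

In this example, `c` has depth 2 and parent `a`, and both `r` and `a` are ancestors of `c`.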
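Recall that $\mathcal{E} = \{\delta(e) \colon e \in E\}$ in the EDS problem. Let $I$ be an instance of the prize-collecting EDS problem. We denote $\pi(\delta(e))$ by $\pi(e)$ for each $e \in E$. Then the dual $D(I)$ of $P(I)$ is formulated as follows, where $v \in e'$ ranges over the two end nodes of $e'$.
\[
\begin{array}{llll}
\text{maximize} & \displaystyle\sum_{e \in E} \xi(e) & & \\
\text{subject to} & \displaystyle\sum_{e' \in \delta(e)} \mu(e',e) \leq w(e) & \text{for } e \in E, & (1) \\
 & \displaystyle\sum_{e \in E} \nu(e,v) \leq w(v) & \text{for } v \in V, & (2) \\
 & \displaystyle\xi(e) \leq \mu(e,e') + \sum_{v \in e'} \nu(e,v) & \text{for } e \in E,\ e' \in \delta(e), & (3) \\
 & \xi(e) \leq \pi(e) & \text{for } e \in E, & (4) \\
 & \xi(e),\ \mu(e,e'),\ \nu(e,v) \geq 0 & \text{for } e \in E,\ e' \in \delta(e),\ v \in V. &
\end{array}
\]
For an edge set $F \subseteq E$, let $\tilde{F}$ denote the set of edges in $E$ that are not dominated by $F$, and let $\pi(\tilde{F})$ denote $\sum_{e \in \tilde{F}} \pi(e)$. For the instance $I$, our algorithm yields a solution $F$ and a feasible solution $(\xi,\nu,\mu)$ to $D(I)$, both satisfying
\[
w(F) + w(V(F)) + \pi(\tilde{F}) \leq \sum_{e \in E} \xi(e). \qquad (5)
\]
Since the right-hand side of (5) is at most $P(I)$ by weak duality, and $P(I)$ is at most the optimal value of $I$ by Theorem 3, $F$ is an optimal solution of $I$. We note that the dual solution is required only for proving the optimality of the solution and need not be computed.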
The algorithm operates by induction on the number of nodes of depth exceeding one. In the base case, all nodes are of depth one, indicating that is a star centered at . The alternative case is divided into two sub-cases: Case A, in which a leaf edge of maximum depth satisfies ; and Case B, which contains no such leaf edge.
Base case
In the base case, is a star centered at . Note that all edges in this graph are adjacent. Let and . An edge attaining is denoted by .
If , our algorithm sets as , and defines as for each . Otherwise (i.e., ), it specifies as , and sets so that , and for each , which is possible because . Note that and defined in this way satisfy (5).
To completely define the dual solution, we must define variables and . Let . To satisfy (3) for and , the sum of , , and cannot be smaller than for each . Note that in (1), is bounded from above for , while in (2) for , and are bounded for and , respectively. As an alternative interpretation, each has capacity shared by , , and each has capacity shared by , . The following lemma claims that and may be set to satisfy all of these constraints.
Lemma 1.
Suppose that is a star. If , let for each . Otherwise, suppose that is defined to satisfy , and for each . Then there exists a feasible solution to .
Proof.
First, we appropriately define and . All variables of and are initialized to . We fix an arbitrary ordering of edges in , and denote the -th edge by .
Let . We sequentially select edges to . On selection of , we first increase until the increase reaches or (2) becomes tight for . If (2) is tightened for before is increased by , then is increased until the total increase reaches or (1) becomes tight for . Once (1) has tightened for , is increased. The current iteration is terminated when the total increase reaches . If , the algorithm advances to the next iteration, and processes . Since , all edges in can be processed before (2) becomes tight for .
The above process defines , , and for each . This process is repeated for all , but is not increased beyond the first iteration. Note that is assigned the same value regardless of which edge we begin with. Thus, and have been completely defined, and the feasibility of follows from their definitions. ∎
Case A
In this case, a leaf edge of maximum depth satisfies . Since Case A is not the base case, the depth of exceeds one. Let denote the upper end node of . Also let denote the parent of , and let be the children of . Throughout this paper, the sets and are denoted by , and , respectively. The edge joining and is called , with ( is included in ). The relationships between these nodes and edges are illustrated in Figure 1. We define , , and . Let be the index of an edge that attains .
The algorithm constructs an instance as follows. Suppose that . In this case, is defined as , and is defined by
For , we denote by . The weight function is defined by
for each , and
for each . If , then is defined as the tree obtained by removing nodes and edges from , and is defined by
is defined identically to the case , ignoring and for .
If , the number of nodes with depth exceeding one is lower in than in . Hence, the algorithm inductively finds a solution to and a feasible dual solution to that satisfy (5). Otherwise, the number of leaf edges of maximum depth with is lower in than in . If lacks edges of this type, then instance is categorized into Case B, and and are found as demonstrated below. If such edges do exist in , the algorithm finds and by induction on the number of such edges. Therefore, it suffices to show that the required and can be constructed from and , provided that and exist.
We now define and . is defined by
Lemma 2.
There exists a feasible solution to that satisfies (5) with .
Proof.
We first consider the case of . In this case, follows from for each . We define as for . We also define , , and such that holds for each and . The other dual variables are set to their values assigned in . Note that, for each , we have . Hence, , , and can be defined without violating (1) or (2) as follows. We sequentially collect edges to . On selection of , is increased until the total increase reaches or (2) becomes tight for . If (2) is tightened for before has increased by , then is simultaneously increased for all until (1) becomes tight for . Once (1) has tightened for , is increased instead of .
We next consider the case of . In this scenario, holds because . We define , such that for each and , which is possible because . and are set to for each . We also define , , and such that for each and as specified for . The other variables are set to their values assigned in . The feasibility of follows from its definition.
We now prove that and satisfy (5). Without loss of generality, we can assume (if this condition is false, we can remove edges , where , from until ). The objective value of exceeds that of by at most , unless and . If and , then and by the definition of . Since , is counted in the objective value of . Thus, the objective value increases from to by . From , it follows that ; therefore, the objective function increases by . Since , (5) is satisfied. ∎
Case B
In this case, holds for all leaf edges of maximum depth. Let be the grandparent of a leaf node of maximum depth. Also, let be the children of , and be the edge joining and for . In the following discussion, we assume that has a parent, and that each node has at least one child. This discussion is easily modified to cases in which has no parent or some node has no child. We denote the parent of by , and the edge between and by . For each , let be the set of children of , and be the set of edges joining to its child nodes in . Also define as an edge that attains . The relationships between these nodes and edges are illustrated in Figure 2.
Now define , , and let . We denote the index of an edge that attains by , and specify .
We define as follows. If , then is the tree obtained by removing all edges in and all nodes in from , and is defined such that
for . In this case, is defined by
for , and
for . If , then , and their descendants are removed from to obtain , and is defined by
Moreover, for and is defined as in the case , disregarding the weights of edges and nodes removed from .
Since has fewer nodes of depth exceeding one than , the algorithm inductively finds a solution to , and a feasible solution to satisfying (5). is constructed from as follows.
We define such that for and , which is possible because . We also define for each . The other variables in are set to their values in . The following lemma states that this can form a feasible solution to .
Lemma 3.
Suppose that satisfy for each and . Further, suppose that holds for each , and the other variables in are set to their values in . Then there exist and such that is feasible to .
Proof.
For and , we define and such that . This may be achieved without violating the constraints, because . We also define as . These variables satisfy (1) for , (2) for and , and (3) for . for and , and for and are set to . The other variables in and are set to their values in and . To advance the proof, we introduce an algorithm that increases for and , and for and . At the completion of the algorithm, is a feasible solution to .
The algorithm performs iterations, and the -th iteration increases the variables to satisfy (3) for each pair of and , where . The algorithm retains a set of variables to be increased. We introduce a notion of time: Over one unit of time, the algorithm simultaneously increases all variables in by one. The time consumed by the -th iteration is .
At the beginning of the -th iteration, is initialized to . The algorithm updates during the -th iteration as follows.
• At time , is added to if ;
• If (2) becomes tight for under the increase of , then is replaced by for each ;
• If becomes tight for under the increase of with some , then is reset to .
We note that the time spent between two consecutive updates may be zero.
always contains a variable that appears in the right-hand side of (3) for with , and for after time . The algorithm updates so that (1) and (2) hold for all variables except . Hence, to show that is a feasible solution to , it suffices to show that (2) for does not become tight before the algorithm is completed.
We complete the proof by contradiction. Suppose that (2) for tightens at time in the -th iteration. Since at this moment, there exists such that (1) for and (2) for are tight. The variables in the left-hand sides of (1) for and (2) for and are not simultaneously increased. Nor are these variables increased over time in the -th iteration, and is initialized to . From this argument, it follows that . However, this result is contradicted by the definition of , which implies that . Thus, the claim is proven. ∎
Lemma 4.
and satisfy (5).
Proof.
For each , either holds, or holds (because ). Hence, . Therefore, it suffices to prove that .
Without loss of generality, we can assume (if false, we can remove edges , from until ). In the sequel, we discuss only the case of and . In the alternative case, the claim immediately follows from the definitions of and . implies that is counted in the objective value of . Moreover, follows from . Thus, the objective values increase from to by , which equals . ∎
5 Multicut problem in trees
In this section, we prove Theorem 2. Again, the input tree $G$ is rooted by selecting an arbitrary root node. For each $i \in \{1,\ldots,k\}$, we let $P_i$ denote the path connecting $s_i$ and $t_i$, and let $\mathrm{lca}(s_i,t_i)$ denote the maximum-depth common ancestor of $s_i$ and $t_i$. The paths $P_1,\ldots,P_k$ are called demand paths. We also denote the set of edges in $P_i$ by $E(P_i)$, and the set of nodes in $P_i$ by $V(P_i)$ for notational convenience. We say that an edge $e$ covers a demand path $P_i$ if $e \in E(P_i)$. The multicut problem in $G$ seeks a minimum weight set of edges that covers all demand paths.
The prize-collecting multicut problem can be reduced to the multicut problem as follows. For each , add new nodes and new edges to , and replace the -th demand pair by . Those new nodes and edges are weighted by , , and . Choosing into a solution to this new instance of the multicut problem corresponds to violating the -th demand in the original instance of the prize-collecting multicut problem.
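Due to this reduction, we consider only the multicut problem in trees, which is equivalent to assuming that the penalty of every demand pair is $+\infty$. For an instance $I$ of the multicut problem, the dual of the LP relaxation has a variable $\xi(i)$ for each demand path $P_i$, a variable $\mu(i,e)$ for each edge $e$ of $P_i$, and a variable $\nu(i,v)$ for each node $v$ of $P_i$, and is given by
\[
\begin{array}{llll}
\text{maximize} & \displaystyle\sum_{i=1}^{k} \xi(i) & & \\
\text{subject to} & \displaystyle\xi(i) \leq \mu(i,e) + \sum_{v \in e} \nu(i,v) & \text{for } i \in \{1,\ldots,k\},\ e \in E(P_i), & (6) \\
 & \displaystyle\sum_{i \colon e \in E(P_i)} \mu(i,e) \leq w(e) & \text{for } e \in E, & (7) \\
 & \displaystyle\sum_{i \colon v \in V(P_i)} \nu(i,v) \leq w(v) & \text{for } v \in V, & (8) \\
 & \xi(i),\ \mu(i,e),\ \nu(i,v) \geq 0 & \text{for } i \in \{1,\ldots,k\},\ e \in E(P_i),\ v \in V(P_i). &
\end{array}
\]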
Our algorithm initializes the solution set $F$ to the empty set and all dual variables to 0. The algorithm proceeds in two phases: the increase phase and the deletion phase. In each iteration of the increase phase, the algorithm selects edges that cover demand paths not yet covered by $F$ and adds them to $F$ while updating the dual solution. The increase phase terminates when all demand paths have been covered by $F$. In the deletion phase, $F$ is converted into a minimal solution by removing some edges.
The demand pairs are assumed to be sorted in decreasing order of the depth of $\mathrm{lca}(s_i,t_i)$, implying that $\mathrm{lca}(s_j,t_j)$ is not a descendant of $\mathrm{lca}(s_i,t_i)$ if $i < j$.
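The ordering of demand pairs can be sketched as follows, assuming that parent and depth dictionaries for the rooted tree are available. The names `lowest_common_ancestor` and `sort_demand_pairs` are hypothetical, and the simple walk-up procedure is chosen for clarity rather than efficiency.

```python
def lowest_common_ancestor(parent, depth, s, t):
    """Walk the deeper node upward until the two walks meet."""
    while s != t:
        if depth[s] >= depth[t]:
            s = parent[s]
        else:
            t = parent[t]
    return s


def sort_demand_pairs(parent, depth, pairs):
    """Sort pairs by decreasing depth of their lowest common ancestor."""
    def lca_depth(pair):
        return depth[lowest_common_ancestor(parent, depth, pair[0], pair[1])]
    return sorted(pairs, key=lca_depth, reverse=True)


# Tree: r has children a and b; a has children c and d.
parent = {"r": None, "a": "r", "b": "r", "c": "a", "d": "a"}
depth = {"r": 0, "a": 1, "b": 1, "c": 2, "d": 2}
ordered = sort_demand_pairs(parent, depth, [("c", "b"), ("c", "d")])
```

The pair `("c", "d")` meets at `a`, which is deeper than the meeting point `r` of `("c", "b")`, so it is processed first.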
Increase phase
At the beginning of each iteration in the increase phase, the algorithm selects the minimum index for which is not covered by the current solution . It then updates the dual solution , and adds several edges to , one of which covers . If all demand paths are covered by after this operation, the increase phase is terminated and the algorithm proceeds to the deletion phase; otherwise, it begins the next iteration. The iteration that processes is called the iteration for . In the following discussion, we explain the update process of the dual solution, and how edges are selected for addition to in the iteration for .
First, we define some terminology. We say that an edge $e$ is tight with regard to $i$ if (6) holds with equality for $i$ and $e$. We say that $e$ is a bottleneck edge with regard to $i$ if it is tight with regard to $i$, (7) holds with equality for $e$, and (8) holds with equality for both end nodes of $e$.
At the beginning of the iteration, is assumed to be minimal under the condition that is a feasible solution to . This condition can be assumed without loss of generality because arbitrarily decreasing makes it minimal. By this assumption, if for some and , then is tight with regard to .
The algorithm attempts to continuously increase . As in Section 4, we introduce a time interval, during which increases by one. To satisfy (6), , or must be increased at the speed of for each edge that is tight with regard to . If no bottleneck edge exists with regard to , the algorithm retains an edge set and a node set such that each tight edge in is included in or is incident to a node in . The algorithm increases , , and , at the same speed as . We note that and are computed greedily so that they are minimal.
We now explain how the algorithm handles a bottleneck edge . For an end node of the bottleneck edge , we define as . The algorithm attempts to decrease , defined at an end node of and . Below we detail how is decreased while retaining the feasibility of . We note that decrease of is not always possible. We call relaxable (with regard to ) if can be decreased for some . If contains bottleneck edges, the algorithm maintains
• a set of relaxable nodes such that each bottleneck edge is incident to at least one node in ,
• an edge set ,
• and a node set .
, , and are minimal under the condition that each tight edge is included in , or is incident to a node in . The algorithm increases , for , and for at the same speed, where increasing for involves decreasing for some and updating other variables, as explained below.
We now explain how is decreased for some , and formally define the relaxability of . implies that contains one or two edges incident to . Suppose that contains a single edge, . Let be the other end node of . If is not tight with regard to , then is decreased until becomes tight. Even if is tight with regard to , or is increased while is decreased at the same speed, provided that is not a bottleneck edge with regard to . This action retains the feasibility because (7) for or (8) for is not tight unless is a bottleneck edge with regard to . If is a bottleneck edge with regard to , the algorithm recursively attempts to decrease for some if is relaxable, and increase . Under these update rules, is decreased without violating the feasibility of . If contains two edges and incident to , decreases only when allowed for both and . We define as relaxable if one of these updates is possible. is not relaxable under the following conditions.
Fact 1.
A node is not relaxable with regard to if and only if , or for each , contains a bottleneck edge whose other end node is non-relaxable and which is incident to .
We note that decreasing for some relaxable node and may cause other variables to increase. In this case, if shares nodes or edges with , increasing by may increase the left-hand sides of (7) and (8) by more than . Hence, must be set small enough that the feasibility of is maintained. When implementing the increase phase, we recommend deciding the increment of in a single step: the maximum increment for can be computed by formulating the problem as an LP.
When increases no further, the algorithm adds several edges to . At this moment, includes a bottleneck edge such that is tight with regard to all , and neither of its end nodes is relaxable. If two or more such edges exist, the edge of maximum depth, denoted , is added to . We call the witness of .
The algorithm then completes the following operations for each end node of . By Fact 1, contains a bottleneck edge incident to for each , where possibly holds. The algorithm adds such to for each with . Since the other end node of is non-relaxable, also contains a bottleneck edge incident to for each . If is added to and , the algorithm adds each such to and repeats the process for the other end nodes of .
Lemma 5.
Let . At the completion of the increase phase, is a bottleneck edge with regard to each . Moreover, neither end node of is relaxable with regard to each .
Proof.
When is added to , it satisfies the above conditions. In later iterations, the algorithm does not decrease for any bottleneck edge nor for an end node of . Hence, satisfies the conditions at completion of the increase phase. ∎
Deletion phase
Let be the set of indices for which was considered in the increase phase. In other words, was the minimum index for which was covered by no edge in at the beginning of some iteration of the increase phase. The deletion phase is also iterated, sequentially processing in the decreasing order of . As defined in the increase phase, the iteration that considers is called the iteration for . Briefly, the deletion phase selects edges from to obtain a final solution in which each , contributes to at most two edges.
is initialized to the empty set. Suppose that the algorithm iterates . We denote by . Let be a node in with no ancestor in . By Lemma 5, implies that is non-relaxable with regard to . Suppose now that an edge in is added to in the iteration for with . implies that , and hence by Fact 1, contains a bottleneck edge incident to , whose other end node is non-relaxable. If contains two such edges, and if is the lower end node of one of those edges, becomes the edge satisfying . Otherwise, is arbitrarily selected from the candidate edges. If has not previously belonged to , it is added to . If contains an edge which is a descendant of , is deleted from .
The above can be selected from at most two choices in . The algorithm completes the above operation for each of these nodes. At the end of the operation, if does not contain the witness of or any ancestor of , the algorithm adds to . The algorithm performs no further tasks in the iteration for . Note that always holds. If , the algorithm proceeds to the next iteration. If , the deletion phase terminates, and the algorithm outputs .
Lemma 6.
Suppose that the algorithm outputs an edge set . Then it satisfies the following conditions.
- (i) for each .
- (ii) Let . Each subpath between an end node of and includes at most one edge in .
- (iii) Let with . If an edge in is incident to a node in , and if , then contains an edge incident to .
Proof.
To prove (i), we first consider the case of . The deletion phase adds an edge to before completing the iteration for . If is removed from in a later iteration for , then and the algorithm adds an ancestor of to . is a descendant of because and . Hence, also covers . Even if is eventually removed from , we can similarly prove that is covered by another edge that is added to in the iteration. Hence, at completion of the algorithm, is covered by if .
Before discussing the case of , we note that if , then contains the witness of or one of its ancestors at completion of the algorithm. This is guaranteed by the deletion phase during iteration for , which adds to if does not contain or any of its ancestors.
We now discuss the case of . By definition of the increase phase, contains an edge that covers , and is the witness of for some with . By the above observation, contains or one of its ancestors . also covers , because . Hence, contains an edge covering even if .
Next, we prove (ii) by induction on . Let us consider the case of maximum in . The first iteration of the deletion phase adds the witness of to . Suppose that another edge is added to in the iteration for in the deletion phase. If is added to before the iteration for in the increase phase, then does not belong to , which is a contradiction. Otherwise, is added to because it is incident to an end node of , and does not cover . Since , this implies that either is a descendant of or . In the former case, is not chosen as the witness of , because the witness is a bottleneck edge of maximum depth with both end nodes non-relaxable, which contradicts the definition of . In the latter case, , consistent with (ii) in Lemma 6.
We next consider the case of non-maximal in . Let be the set of edges in the subpath between an end node of and . Suppose that at the start of the iteration for in the deletion phase, there exist distinct edges . Without loss of generality, we assume that is an ancestor of . Assume that the deletion phase adds to during the iteration for with , and adds to during the iteration for with . Since is a descendant of , covers ; therefore, also covers . Hence, the subpath of between and is covered by two edges in , which is a contradiction by induction. Therefore, at the beginning of the iteration for in the deletion phase, is covered by at most one edge in .
At the start of the iteration for , suppose that is covered by an edge . If another edge is added to during this iteration, then is incident to a node , and another edge incident to is in . Suppose that is added to during the iteration for in the deletion phase. If is an ancestor of , then is removed from when is added to . If is a descendant of , then is covered by because . Therefore, the subpath of between and is covered by both and , which again is a contradiction by induction.
could also be covered by two edges if the witness of is in and is added to in the iteration for , even though one of its descendants is already in . We now demonstrate that this situation does not occur. Note that each edge in is either or is incident to a node in for some with . Since is not covered by and is not an ancestor of , is not an ancestor of any edge in . Hence, no descendant of is added to .
Finally, we prove (iii). Suppose that some violates the claim of (iii). We consider such a pair of and that minimizes . Let be an edge incident to a node . Suppose that . If contains no edge incident to , then the iteration for during the deletion phase adds an ancestor edge of in to . This edge also covers because is an ancestor of , which contradicts (ii). Next, we suppose that . Claim (iii) is obvious when covers ; thus, we suppose that does not cover . indicates that was not in at the start of the iteration for in the increase phase. Let be the index for which is added to during the iteration for in the increase phase, where . By the definition of the increase phase, the witness of is incident to , and because , an edge incident to in is added to simultaneously with . Since this edge is not in , the iteration for in the increase phase adds an ancestor edge of in to . This edge covers , indicating that the subpath of between its one end node and is covered by two edges if , which contradicts (ii). Therefore, . However, is not added to unless the witness of is added to , so a contradiction arises. ∎
Proof of Theorem 2.
Let denote the edge set output by the algorithm. By claim (i) of Lemma 6, is a multicut. Since is a lower bound on the optimal value, it suffices to show that .
Lemma 5 implies that for each , and that for each . Recall that . Hence,
(9) |
(iii) of Lemma 6 indicates that holds if . If , then for any and for any . Hence, the right-hand side of (9) is equal to
for each by (ii) of Lemma 6. Recall that for each . Therefore, the right-hand side of (9) is at most . ∎
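Claim (i) of Lemma 6 and the objective used in Theorem 2 can be checked mechanically on small instances. The following is a minimal sketch in Python, not the algorithm of this section: it only verifies that a given edge set is a multicut of a tree and evaluates the sum of the chosen edge weights plus the weights of their end nodes. The function names, the adjacency-list representation, and the weight dictionaries are hypothetical choices made for illustration.

```python
from collections import deque

def tree_path_edges(adj, s, t):
    """Return the edges (as frozensets) on the unique s-t path of a tree."""
    parent = {s: None}
    queue = deque([s])
    while queue:  # breadth-first search from s records a parent for every node
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    path = []
    while parent[t] is not None:  # walk back from t to s
        path.append(frozenset((t, parent[t])))
        t = parent[t]
    return path

def is_multicut(adj, demands, F):
    """True if every demand path contains at least one edge of F."""
    chosen = {frozenset(e) for e in F}
    return all(
        any(e in chosen for e in tree_path_edges(adj, s, t))
        for s, t in demands
    )

def objective(F, edge_w, node_w):
    """Weight of the edges in F plus the weight of the end nodes of F."""
    end_nodes = {v for e in F for v in e}
    return sum(edge_w[frozenset(e)] for e in F) + sum(node_w[v] for v in end_nodes)

# Hypothetical toy instance: path 1-2-3-4 with an extra leaf 5 attached to 2.
adj = {1: [2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
demands = [(1, 3), (5, 4)]
edge_w = {frozenset(e): w for e, w in [((1, 2), 1), ((2, 3), 2), ((3, 4), 1), ((2, 5), 1)]}
node_w = {1: 1, 2: 1, 3: 3, 4: 1, 5: 1}
```

On this toy instance, the single edge (2, 3) cuts both demand paths, whereas the edge (1, 2) leaves the path between 5 and 4 uncut.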
6 Conclusion
In this paper, we pointed out that the natural LP relaxation has a large integrality gap when applied to graph covering problems that minimize node weights. We then formulated an alternative LP relaxation for graph covering problems in edge- and node-weighted graphs that is stronger than the natural relaxation. This relaxation was incorporated into an exact algorithm for the prize-collecting EDS problem in trees, and a 2-approximation algorithm for the multicut problem in trees. The approximation guarantees for these algorithms match the previously known best results for purely edge-weighted graphs. In many other graph covering problems, the integrality gap of the proposed relaxation would increase if node weights were introduced, because the problems in node-weighted graphs admit stronger hardness results, as shown in the Appendix. Nonetheless, the proposed relaxation is a potentially useful tool for designing heuristics or using IP solvers to solve the above problems.
Acknowledgements
This work was partially supported by Japan Society for the Promotion of Science (JSPS), Grants-in-Aid for Young Scientists (B) 25730008. The author thanks an anonymous referee of SWAT 2014 for pointing out an error in the reduction from the prize-collecting multicut problem to the multicut problem in an earlier version of this paper.
References
- [1] M. Bateni, M. Hajiaghayi, and V. Liaghat. Improved approximation algorithms for (budgeted) node-weighted Steiner problems. In ICALP (1), vol. 7965 of Lecture Notes in Computer Science, pages 81–92, 2013.
- [2] A. Berger, T. Fukunaga, H. Nagamochi, and O. Parekh. Approximability of the capacitated b-edge dominating set problem. Theoretical Computer Science, 385(1-3):202–213, 2007.
- [3] A. Berger and O. Parekh. Linear time algorithms for generalized edge dominating set problems. Algorithmica, 50(2):244–254, 2008.
- [4] A. Berger and O. Parekh. Erratum to: Linear time algorithms for generalized edge dominating set problems. Algorithmica, 62(1-2):633–634, 2012.
- [5] J. Byrka, F. Grandoni, T. Rothvoß, and L. Sanità. Steiner tree approximation via iterative randomized rounding. Journal of the ACM, 60(1):6, 2013.
- [6] R. D. Carr, T. Fujito, G. Konjevod, and O. Parekh. A 2 1/10-approximation algorithm for a generalization of the weighted edge-dominating set problem. Journal of Combinatorial Optimization, 5(3):317–326, 2001.
- [7] S. Chawla, R. Krauthgamer, R. Kumar, Y. Rabani, and D. Sivakumar. On the hardness of approximating multicut and sparsest-cut. Computational Complexity, 15(2):94–114, 2006.
- [8] C. Chekuri, A. Ene, and A. Vakilian. Prize-collecting survivable network design in node-weighted graphs. In APPROX-RANDOM, vol. 7408 of Lecture Notes in Computer Science, pages 98–109, 2012.
- [9] V. Chvátal. A greedy heuristic for the set-covering problems. Mathematics of Operations Research, 4:233–235, 1979.
- [10] E. Dahlhaus, D. S. Johnson, C. H. Papadimitriou, P. D. Seymour, and M. Yannakakis. The complexity of multiterminal cuts. SIAM Journal on Computing, 23(4):864–894, 1994.
- [11] T. Fujito and H. Nagamochi. A 2-approximation algorithm for the minimum weight edge dominating set problem. Discrete Applied Mathematics, 118(3):199–207, 2002.
- [12] T. Fukunaga. Spider covers for prize-collecting network activation problem. CoRR, abs/1310.5422, 2013.
- [13] N. Garg, V. V. Vazirani, and M. Yannakakis. Approximate max-flow min-(multi)cut theorems and their applications. SIAM Journal on Computing, 25(2):235–251, 1996.
- [14] N. Garg, V. V. Vazirani, and M. Yannakakis. Primal-dual approximation algorithms for integral flow and multicut in trees. Algorithmica, 18(1):3–20, 1997.
- [15] M. X. Goemans, N. Olver, T. Rothvoß, and R. Zenklusen. Matroids and integrality gaps for hypergraphic Steiner tree relaxations. In STOC, pages 1161–1176, 2012.
- [16] M. Hajiaghayi, V. Liaghat, and D. Panigrahi. Online node-weighted Steiner forest and extensions via disk paintings. In FOCS, pages 558–567, 2013.
- [17] D. S. Hochbaum. Heuristics for the fixed cost median problem. Mathematical Programming, 22(1):148–162, 1982.
- [18] N. Kamiyama. The prize-collecting edge dominating set problem in trees. In P. Hliněný and A. Kučera, editors, Mathematical Foundations of Computer Science 2010, 35th International Symposium, MFCS 2010, Brno, Czech Republic, August 23-27, 2010. Proceedings, vol. 6281 of Lecture Notes in Computer Science, pages 465–476. Springer, 2010.
- [19] S. Khot and O. Regev. Vertex cover might be hard to approximate to within 2−ε. Journal of Computer and System Sciences, 74(3):335–349, 2008.
- [20] P. N. Klein and R. Ravi. A nearly best-possible approximation algorithm for node-weighted Steiner trees. Journal of Algorithms, 19(1):104–115, 1995.
- [21] J. Könemann, S. S. Sadeghabad, and L. Sanità. An LMP O(log n)-approximation algorithm for node weighted prize collecting Steiner tree. In FOCS, pages 568–577, 2013.
- [22] L. Lovász. On the ratio of optimal integral and fractional covers. Discrete Mathematics, 13(4):383–390, 1975.
- [23] A. Moss and Y. Rabani. Approximation algorithms for constrained node weighted Steiner tree problems. SIAM Journal on Computing, 37(2):460–481, 2007.
- [24] J. Naor, D. Panigrahi, and M. Singh. Online node-weighted Steiner tree and related problems. In FOCS, pages 210–219, 2011.
- [25] Z. Nutov. Approximating Steiner networks with node-weights. SIAM Journal on Computing, 39(7):3001–3022, 2010.
- [26] Z. Nutov. Approximating minimum-cost connectivity problems via uncrossable bifamilies. ACM Transactions on Algorithms, 9(1):1, 2012.
- [27] Z. Nutov. Survivable network activation problems. In LATIN, vol. 7256 of Lecture Notes in Computer Science, pages 594–605, 2012.
- [28] D. Panigrahi. Survivable network design problems in wireless networks. In SODA, pages 1014–1027, 2011.
- [29] O. Parekh. Approximation algorithms for partially covering with edges. Theoretical Computer Science, 400(1-3):159–168, 2008.
- [30] A. Vakilian. Node-weighted prize-collecting survivable network design problems. Master’s thesis, University of Illinois at Urbana-Champaign, 2013.
- [31] V. V. Vazirani. Approximation algorithms. Springer, 2001.
Appendix A EDS problem in general graphs
A.1 Hardness results
In this section, we discuss the EDS problem in general graphs. First, we show that the EDS problem in node-weighted graphs is as hard as the set cover problem, and the EDS problem in edge- and node-weighted graphs is as hard as the (non-metric) facility location problem. To this end, we reduce the set cover problem or the facility location problem to the EDS problem in bipartite graphs. Accordingly, our hardness results hold for the EDS problem in bipartite graphs.
We now define the set cover problem and facility location problem. In the set cover problem, we are given a set , family of subsets of , and cost function . A solution is a subfamily of such that . The objective is to minimize . Inputs in the facility location problem are a client set , a facility set , opening costs , and connection costs . A solution to this problem is a pair of and , and the objective is to minimize .
Given an instance of the set cover problem, define a facility set each of whose members corresponds to a set in ; we let denote the member of corresponding to . Define opening costs and connection costs such that , if , and otherwise. Then, the instance of the facility location problem is equivalent to the instance of the set cover problem, implying that the non-metric facility location problem generalizes the set cover problem.
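The construction above can be sketched in a few lines of Python. The concrete cost values are assumptions, since they are a standard reading of the construction rather than a quotation of it: each facility inherits the cost of its set as its opening cost, and a connection costs zero when the element belongs to the set and infinity otherwise. The function name and the dictionary-based instance encoding are hypothetical.

```python
import math

def set_cover_to_facility_location(universe, sets, cost):
    """Map a set cover instance to a (non-metric) facility location instance.

    universe: iterable of elements (these become the clients)
    sets:     dict mapping a set name to the set of elements it contains
    cost:     dict mapping a set name to its cost

    Assumed costs: opening cost = set cost; connection cost = 0 if the
    element lies in the set, and infinity otherwise.
    """
    clients = list(universe)
    facilities = list(sets)  # one facility per set in the family
    opening = {name: cost[name] for name in facilities}
    connection = {
        (c, name): (0.0 if c in sets[name] else math.inf)
        for c in clients
        for name in facilities
    }
    return clients, facilities, opening, connection

# Hypothetical toy instance.
universe = [1, 2, 3]
sets = {"A": {1, 2}, "B": {2, 3}, "C": {3}}
cost = {"A": 2.0, "B": 2.0, "C": 1.0}
clients, facilities, opening, connection = set_cover_to_facility_location(
    universe, sets, cost
)
```

Under these assumed costs, a finite-cost solution to the facility location instance must connect every client to an opened facility whose set contains it, so the opened facilities form a set cover of the same cost.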
Theorem 4.
If the EDS problem in node-weighted bipartite graphs admits a -approximation algorithm, then the set cover problem also admits a -approximation algorithm. If the EDS problem in edge- and node-weighted bipartite graphs admits a -approximation algorithm, then the facility location problem also admits a -approximation algorithm.
Proof.
To prove Theorem 4, we reduce the facility location problem to the EDS problem. From a client set and facility set , we construct the complete bipartite graph with bipartition and . Moreover, for each , we add a new node and an edge that joins and . Note that this operation retains the bipartite property of the graph. We define edge weights such that for each , where and and for each . The node weights are defined by and for each , and for each .
Let be an EDS for this graph. We can assume because . Since dominates each , contains an edge incident to each . Moreover, if contains more than one edge incident to , remains an EDS if one of these edges is arbitrarily discarded. Hence contains exactly one edge incident to , which joins and a node . Let be the opposite end node of the edge in incident to , and let . Then, is a solution to the facility location problem, with cost equaling the weight of . Conversely, given a solution to the facility location problem, define as . Then is an EDS of the graph, and the weight of equals the cost of the solution to the facility location problem. Hence, this reduction proves the latter part of the theorem.
As observed above, the set cover problem corresponds to the instances of the facility location problem with zero connection cost. For these instances, the reduction defines instances of the EDS problem in which all edges are weighted zero. The former part of the theorem follows from this statement. ∎
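The reduction in this proof can be exercised on a toy instance. The sketch below is one plausible reading of the construction, and every weight in it is an assumption: a pendant node of infinite weight is attached to each client by a zero-weight edge, client nodes have weight zero, facility nodes carry the opening costs, and client-facility edges carry the connection costs. The function names are hypothetical, and the brute-force solvers are exponential and intended only for checking small examples.

```python
import math
from itertools import combinations, product

def facility_location_to_eds(clients, facilities, opening, connection):
    """Build the bipartite EDS instance (all weight choices are assumptions)."""
    edges, edge_w, node_w = [], {}, {}
    for c in clients:
        pendant = ("pendant", c)
        node_w[c] = 0.0
        node_w[pendant] = math.inf      # makes the pendant edge useless in a solution
        edges.append((c, pendant))
        edge_w[(c, pendant)] = 0.0
        for f in facilities:
            edges.append((c, f))
            edge_w[(c, f)] = connection[c, f]
    for f in facilities:
        node_w[f] = opening[f]
    return edges, edge_w, node_w

def eds_weight(F, edge_w, node_w):
    """Edge weights of F plus node weights of the end nodes of F."""
    end_nodes = {v for e in F for v in e}
    return sum(edge_w[e] for e in F) + sum(node_w[v] for v in end_nodes)

def is_eds(F, edges):
    """Every edge of the graph shares an end node with some edge of F."""
    touched = {v for e in F for v in e}
    return all(u in touched or v in touched for u, v in edges)

def brute_force_eds(edges, edge_w, node_w):
    """Minimum weight over all edge dominating sets (exponential time)."""
    return min(
        eds_weight(F, edge_w, node_w)
        for r in range(len(edges) + 1)
        for F in combinations(edges, r)
        if is_eds(F, edges)
    )

def brute_force_facility_location(clients, facilities, opening, connection):
    """Minimum cost over all client-to-facility assignments (exponential time)."""
    return min(
        sum(opening[f] for f in set(assign))
        + sum(connection[c, f] for c, f in zip(clients, assign))
        for assign in product(facilities, repeat=len(clients))
    )

# Hypothetical toy instance.
clients, facilities = ["a", "b"], ["x", "y"]
opening = {"x": 3.0, "y": 1.0}
connection = {("a", "x"): 1.0, ("a", "y"): 4.0, ("b", "x"): 1.0, ("b", "y"): 2.0}
edges, edge_w, node_w = facility_location_to_eds(clients, facilities, opening, connection)
```

On this instance both brute-force optima coincide, which is the equivalence the proof establishes under the assumed weights.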
We omit the details, but the reductions in the proof of Theorem 4 also apply (with modification) to fundamental covering problems such as the Steiner tree problem, T-join problem, and edge cover problem.
A.2 -approximation algorithm
Since it is NP-hard to achieve an -approximation for the set cover problem, Theorem 4 implies that the same hardness holds for the EDS problem in node-weighted bipartite graphs. Here, we propose an -approximation algorithm for the prize-collecting EDS problem in general edge- and node-weighted graphs.
Our algorithm reduces the prize-collecting EDS problem to the edge cover problem. Given an undirected graph and a set of nodes, we define an edge cover as a set of edges such that for each . Given , the problem seeks an edge cover that minimizes . For an instance of the edge cover problem, we apply the following LP relaxation:
minimize | ||||
subject to | ||||
Lemma 7.
Suppose that an algorithm computes an edge cover of weight at most for any instance of the edge cover problem. Then the prize-collecting EDS problem in edge- and node-weighted graphs admits a -approximation algorithm.
Proof.
We first solve for the given instance of the prize-collecting EDS problem. Let be the obtained optimal solution to . Recall that in the EDS problem; thus we denote and by and respectively, where is the edge corresponding to . Let . An instance of the edge cover problem is assumed to consist of , , and .
We now prove that . In the following, is denoted by . Since for each , for each , implying that satisfies the first constraints of . By the minimality of , there exists such that for any . The second constraint for and in implies that for and an end node of . Hence, satisfies the second constraints of . Therefore, is a feasible solution to , and is at most .
Let be an edge cover of weight at most for . Let . By the first constraint of , for each . Let and be the end nodes of . Since , at least one of and is included in . Note that each node in has some incident edge in . Hence, each edge in is dominated by . Therefore, must pay at most as the penalty. Summing up, the objective value of for does not exceed
Since is a lower bound on the optimal value of , achieves a -approximation for . ∎
To obtain an -approximation algorithm for the prize-collecting EDS problem, it suffices to obtain an algorithm for the edge cover problem required in Lemma 7 with . Below, we observe that such an algorithm indeed exists.
Since each node in is included in for any edge cover , an algorithm that solves instances of for all is sufficient. Moreover, if contains an edge joining nodes and in , the instance is transformed into an equivalent instance by inserting a new node that subdivides the edge and setting the new node and edge weights as and . In the obtained instance, forms an independent set. Such instances of the edge cover problem are included in the (non-metric) facility location problem. In fact, we may regard and as the client and facility sets, respectively. The weights of edges between clients and facilities represent connection costs, and the weights of clients indicate opening costs. Each edge in edge covers joins a client and facility and naturally allocates the client to the facility.
Consider an instance of the facility location problem. We define a star as a set comprising a facility , and clients . The cost of the star is . Let be the set of all stars. Identifying the star as the subset of , the facility location problem can be regarded as the set cover problem with the set and family of subsets of . Hence, the greedy algorithm for the set cover problem achieves an -approximation for the facility location problem. An analysis of this greedy algorithm [22, 9] has shown that the costs of solutions are bounded by an LP relaxation of the set cover problem (see also [31]). Our LP relaxation applied to an instance of the edge cover problem is equivalent to this LP relaxation for the set cover problem derived from . The runtime of the greedy algorithm is a concern, because the size of in the set cover problem is not bounded by a polynomial of the input size of . Nevertheless, the greedy algorithm can be implemented to run in polynomial time for instances arising from the facility location problem and the edge cover problem [17]. Therefore, we can state the following theorem.
Theorem 5.
The prize-collecting EDS problem in general edge- and node-weighted graphs admits an -approximation algorithm.
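The greedy algorithm over stars described before Theorem 5 can be sketched as follows. This is a minimal illustration rather than the implementation of [17]: it assumes that the cost of a star is the opening cost of its facility plus the connection costs of its clients, and it avoids enumerating all stars by noting that, for a fixed facility and a fixed number of clients, the cheapest star takes the uncovered clients with the smallest connection costs. The function name is hypothetical, and at least one facility is assumed to exist.

```python
def greedy_stars(clients, facilities, opening, connection):
    """Greedy set cover over stars for facility location.

    Repeatedly picks the star minimising (star cost) / (newly covered clients).
    Returns the chosen stars and the sum of their costs; a facility picked in
    several stars is charged its opening cost each time, as in the set cover view.
    """
    uncovered = set(clients)
    stars, total = [], 0.0
    while uncovered:
        best = None  # (ratio, cost, facility, members)
        for f in facilities:
            # Cheapest uncovered clients first; ties broken by name for determinism.
            ranked = sorted(uncovered, key=lambda c: (connection[c, f], str(c)))
            cost = opening[f]
            for k, c in enumerate(ranked, start=1):
                cost += connection[c, f]   # cost of the star on the first k clients
                ratio = cost / k
                if best is None or ratio < best[0]:
                    best = (ratio, cost, f, ranked[:k])
        _, cost, f, members = best
        stars.append((f, members))
        total += cost
        uncovered -= set(members)
    return stars, total

# Hypothetical toy instance.
clients, facilities = ["a", "b"], ["x", "y"]
opening = {"x": 3.0, "y": 1.0}
connection = {("a", "x"): 1.0, ("a", "y"): 4.0, ("b", "x"): 1.0, ("b", "y"): 2.0}
stars, total = greedy_stars(clients, facilities, opening, connection)
```

Each round sorts the uncovered clients once per facility and scans the prefixes, so the number of stars examined stays polynomial even though the family of all stars is exponentially large.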