
The Power of Recourse: Better Algorithms for Facility Location in Online and Dynamic Models

Xiangyu Guo Department of Computer Science and Engineering, University at Buffalo, xiangyug@buffalo.edu    Janardhan Kulkarni The Algorithms Group, Microsoft Research, Redmond, jakul@microsoft.com    Shi Li Department of Computer Science and Engineering, University at Buffalo, shil@buffalo.edu    Jiayi Xian Department of Computer Science and Engineering, University at Buffalo, jxian@buffalo.edu
Abstract

In this paper we study the facility location problem in the online with recourse and dynamic algorithm models. In the online with recourse model, clients arrive one by one and our algorithm needs to maintain good solutions at all time steps with only a few changes to the previously made decisions (called recourse). We show that the classic local search technique can lead to a $(1+\sqrt{2}+\epsilon)$-competitive online algorithm for facility location with only $O\left(\frac{\log n}{\epsilon}\log\frac{1}{\epsilon}\right)$ amortized facility and client recourse.

We then turn to the dynamic algorithm model for the problem, where the main goal is to design fast algorithms that maintain good solutions at all time steps. We show that the result for online facility location, combined with the randomized local search technique of Charikar and Guha [10], leads to a $(1+\sqrt{2}+\epsilon)$-approximation dynamic algorithm with amortized update time of $\tilde{O}(n)$ in the incremental setting against adaptive adversaries. Notice that the running time is almost optimal, since in a general metric space it takes $\Omega(n)$ time to specify a new client's position. The approximation factor of our algorithm also matches the best offline analysis of the classic local search algorithm.

Finally, we study the fully dynamic model for facility location, where clients can both arrive and depart. Let $F$ denote the set of available facility locations. Our main result is an $O(1)$-approximation algorithm in this model with $O(|F|)$ preprocessing time and $O(\log^{3}D)$ amortized update time for HST metric spaces. Using the seminal results of Bartal [4] and Fakcharoenphol, Rao and Talwar [17], which show that any arbitrary $N$-point metric space can be embedded into a distribution over HSTs such that the expected distortion is at most $O(\log N)$, we obtain an $O(\log|F|)$-approximation with preprocessing time of $O(|F|^{2}\log|F|)$ and $O(\log^{3}D)$ amortized update time. The approximation guarantee holds in expectation at every time step of the algorithm, and the result holds in the oblivious adversary model.

1 Introduction

In the (uncapacitated) facility location problem, we are given a metric space $(F\cup C,d)$, where $F$ is the set of facility locations, $C$ is the set of clients, and $d:(F\cup C)\times(F\cup C)\rightarrow{\mathbb{R}}_{\geq 0}$ is a distance function, which is non-negative, symmetric and satisfies the triangle inequality. For each location $i\in F$, there is a facility opening cost $f_{i}\geq 0$. The goal is to open a subset $S\subseteq F$ of facilities so as to minimize the cost of opening the facilities plus the connection cost. The cost of connecting a client $j$ to an open facility $i$ is equal to $d(j,i)$. Hence, the objective function can be expressed concisely as $\min_{S\subseteq F}\left(f(S)+\sum_{j\in C}d(j,S)\right)$, where for a set $S\subseteq F$, $f(S):=\sum_{i\in S}f_{i}$ is the total facility cost of $S$ and $d(j,S):=\min_{i\in S}d(j,i)$ denotes the distance of $j$ to the nearest location in $S$. The facility location problem arises in countless applications: the placement of servers in data centers, network design, wireless networking, data clustering, location analysis for the placement of fire stations and medical centers, and so on. Hence, the problem has been studied extensively in many different communities: approximation algorithms, operations research, and computational geometry. In the approximation algorithms literature in particular, the problem occupies a prominent position, as the development of every major technique in the field is tied to its application to the facility location problem. See the textbook by Williamson and Shmoys [40] for more details. The problem is hard to approximate to a factor better than 1.463 [28]. The current best-known polynomial-time algorithm is given by the third author and achieves a 1.488-approximation [34].
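To make the objective concrete, here is a minimal brute-force evaluation of $f(S)+\sum_{j\in C}d(j,S)$ on a toy line-metric instance (the instance data and helper names are our own illustration, not from the paper):

from itertools import combinations

# Toy instance on the real line: facility position -> opening cost f_i.
F = {0: 3.0, 4: 1.0, 9: 2.0}
C = [1, 2, 5, 8]                      # client positions
d = lambda x, y: abs(x - y)           # line metric

def cost(S):
    # f(S) + sum_{j in C} d(j, S) for a non-empty facility set S.
    return sum(F[i] for i in S) + sum(min(d(j, i) for i in S) for j in C)

# Exhaustive minimization over all non-empty S (fine for |F| = 3).
print(min((cost(S), S) for r in range(1, len(F) + 1)
          for S in combinations(F, r)))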

In many real-world applications the set of clients arrives online, the metric space can change over time, and there can be memory constraints. This has motivated the study of the problem in various models: online [35, 21, 2, 20], dynamic [11, 27, 12, 39, 18, 15, 1], incremental [19, 9, 23], streaming [32, 22, 33, 13, 9], and game theoretic [38, 24, 25], to name a few. This paper is concerned with the online and dynamic models. Thus, to keep the flow of presentation linear, we restrict ourselves to the results in these two models here.

Motivated by its applications in network design and data clustering, Meyerson [35] initiated the study of the facility location problem in the online setting. Here, clients arrive online one by one, and the algorithm has to assign each newly arriving client to an already opened facility or open a new facility to serve the request. The decisions made by the algorithm are irrevocable, in the sense that a facility that is opened cannot be closed and clients cannot be reassigned. In this setting, Meyerson [35] designed a very elegant randomized algorithm that achieves an $O(\log n)$ competitive ratio, and also showed that no online algorithm can obtain an $O(1)$ competitive ratio. This result was later extended by Fotakis [21] to obtain an asymptotically optimal $O(\log n/\log\log n)$-competitive algorithm. Both the algorithms and analysis techniques in [21, 35] were influential, and found many applications in other models such as streaming [23]. The lower bound in Fotakis [21] holds even in very special metric spaces such as HSTs or the real line. Since then, several online algorithms have been designed achieving the same competitive ratio with more desirable properties, such as being deterministic [2], primal-dual [20], or having a small memory footprint [22]. We refer to the beautifully written survey by Fotakis [23] for more details.

The main reason to assume that decisions made by an algorithm are irrevocable is that the cost of changing the solution is expensive in some applications. However, if one examines the above applications closely, say connecting clients to servers in data centers, it is more natural to assume that decisions need not be irrevocable, but that the algorithm should not change the solution too much. This is even more true in modern data centers where topologies can be reconfigured; see [26] for more details. A standard way of quantifying the restriction that an online algorithm does not make too many changes is the notion of recourse. The recourse per step of an online algorithm is the number of changes it makes to the solution. Recourse captures the minimal amount of change an online algorithm has to make to maintain a desired competitive ratio due to information-theoretic limits. For the facility location problem, depending on the application, the recourse can correspond to: 1) the number of changes made to the opened facilities (called facility recourse); 2) the number of reconnections made to the clients (called client recourse). Notice that we can assume that for every facility we open/close, we have to connect/disconnect at least one client; thus the client recourse is at least the facility recourse. In clustering applications arising in massive data sets, the opened facilities represent cluster centers, which serve as summaries of the data. Here one is interested in making sure that summaries do not change too frequently as more documents are added online. Therefore, facility recourse is a good approximation to the actual cost of changing the solution [9, 19]. On the other hand, in network design problems, client recourse is the true indicator of the cost of implementing the changes in the solution. As a concrete example, consider the problem of connecting clients to servers in data centers, which was one of the main motivations for Meyerson [35] to initiate the study of the online facility location problem. Here, it is important that one does not reconnect clients to servers too many times, as such changes can incur significant costs both in terms of disruption of service and labor. Consider another scenario where a retail company tries to maintain stores to serve a dynamically changing set of clients. As the clients change so frequently, it would be infeasible to build/shut down even one store for every new client. In this application, a small client recourse per step is desirable, as that automatically forbids frequent changes of store locations.

In this light, a natural question that arises is:

Is it possible to maintain a constant approximation for the facility location problem if we require that the facility and client recourse is small?

Our first main result shows that indeed this is possible. In the following theorems, we use $n$ to denote the total number of facility locations and clients that ever arrive, and $D$ to denote the diameter of the metric $d$ (assuming all distances are integers).

Theorem 1.

There is a deterministic online algorithm for the facility location problem that achieves a competitive ratio of $(1+\sqrt{2}+\epsilon)$ with $O\left(\frac{\log n}{\epsilon}\log\frac{1}{\epsilon}\right)$ amortized facility and client recourse against an adaptive adversary.

Our algorithm for the above theorem differs from the previous approaches used in the context of online variants of the facility location problem, and is based on local search. Local search is one of the most widely used algorithms for the facility location problem in practice and is known to achieve an approximation factor of $(1+\sqrt{2})$ in the offline setting; see the influential paper by Arya et al. [3] and the survey by Munagala [36]. Thus our result matches the best known approximation ratio for offline facility location via local search. Further, our result shows that the local search algorithm, augmented with some small modifications, is inherently stable, as it does not make too many changes to the solution even when clients are added in an online fashion. This gives further justification for its popularity among practitioners.

Prior to Theorem 1, the known results [19, 14, 22] needed one or more of the following assumptions: 1) the facility costs are all the same; 2) we are interested in knowing only the cost of the solution; 3) we are interested only in bounding the facility recourse. In particular, there was no known algorithm that bounds the client recourse, which is an important consideration in many of the applications mentioned above. Moreover, our algorithm also achieves a better approximation factor; the previously best known algorithm for the facility location problem achieved a competitive ratio of 48 [23].

Our result in the recourse setting for the facility location problem should be contrasted with similar results shown recently for online Steiner tree [30], set cover [29], scheduling [31], and matchings and flows [6, 31]. Moreover, these results raise intriguing questions: is a polylogarithmic amount of recourse enough to beat the information-theoretic lower bounds in online algorithms? Is recourse as powerful as, or more powerful than, randomization?

While having a small client recourse is enough in data center applications, it is not enough in some others. Take wireless networks as a concrete example. Here, the set of clients (mobile devices) keeps changing over time, and it is necessary to update the assignment of clients to facilities as quickly as possible so as to minimize the service disruption. These applications motivated Cygan et al. [12], Goranci et al. [27] and Cohen-Addad et al. [11] to study the facility location problem in the framework of dynamic algorithms. The dynamic model of [12] and [11] is different from what we study here, so we discuss it at the end of this section.

The dynamic facility location problem is similar to the online setting, except that at each time step either a new client arrives or an existing client departs. The goal is to always maintain a solution that is a constant factor approximation to the optimal solution, while minimizing the total time spent updating the solution. We emphasize that we require our dynamic algorithms to maintain an actual assignment of clients to facilities, not just the set of open facilities and an estimate of the connection cost. This is important for the applications mentioned above. This setting was considered in [27], which showed that for metric spaces with doubling dimension $\kappa$, there is a deterministic fully dynamic algorithm with $\tilde{O}(2^{\kappa^{2}})$ update time that maintains a constant approximation. However, for more general metric spaces no results were known in the dynamic setting, and we give the first such results. First we consider the incremental setting, where clients only arrive and never depart.

Theorem 2.

In the incremental setting against an adaptive adversary, there is a randomized dynamic algorithm for the facility location problem that, with probability at least $1-1/n^{2}$, maintains an approximation factor of $(1+\sqrt{2}+\epsilon)$ and has total update time of $O(\frac{n^{2}}{\epsilon^{2}}\log^{3}n\log\frac{1}{\epsilon})$.

Note that it takes $\Theta(n|F|)$ space to specify the input in our model (see Section 2.2). Hence the running time of our algorithm is almost optimal up to polylogarithmic factors when $|F|=\Omega(n)$. The proof of the above theorem uses randomized local search and builds on our result in the recourse setting; we use randomization to convert the recourse bound into an update time bound. Further, our analysis also implies that one can obtain $O(\frac{n|F|}{\epsilon^{2}}\log^{3}n\log\frac{1}{\epsilon})$ running time by losing an $O(1)$ factor in the approximation ratio; see the remark at the end of Section 5.

Next we study the fully dynamic setting. Here, we first consider an important class of metric spaces called hierarchically well separated tree (HST) metrics [4]; see Definition 5 for the formal definition, and Section 2.2 for more details about how the input sequence is given. For HST metric spaces, we show the following result.

Theorem 3.

In the fully dynamic setting against adaptive adversaries, there is a deterministic algorithm for the facility location problem that achieves an $O(1)$ approximation factor with $O(|F|)$ preprocessing time and $O(n\log^{3}D)$ total update time for HST metric spaces.

A seminal result by Bartal [4], which was later tightened by Fakcharoenphol, Rao and Talwar [17], shows that any arbitrary $N$-point metric space can be embedded into a distribution over HSTs such that the expected distortion is at most $O(\log N)$, which is tight. Moreover, such a probabilistic embedding can be computed in $O(N^{2}\log N)$ time; see the recent result by Blelloch, Gu and Sun [7] for details. These results immediately imply the following theorem, provided the input is specified as in Section 2.2.

Theorem 4.

In the fully dynamic setting against an oblivious adversary, there is a randomized algorithm for the facility location problem that maintains an approximation factor of $O(\log|F|)$ with preprocessing time of $O(|F|^{2}\log|F|)$ and $O(n\log^{3}D)$ total update time. The approximation guarantee holds only in expectation at every time step of the algorithm.

Observe that unlike in the incremental setting, the above theorem holds only in the oblivious adversary model, as probabilistic embedding techniques preserve distances only in expectation, as can be seen by taking a cycle on $n$ points. Our result also shows that probabilistic tree embeddings using HSTs can be a very useful technique in the design of dynamic algorithms, similar to their role in online algorithms [4, 5, 37, 8].

Our algorithms in Theorems 3 and 4 for the fully dynamic setting also have the nice property that the amortized client and facility recourse is $O(\log^{3}D)$ (in fact, we can achieve the slightly better bound of $O(\log^{2}D)$, as can be seen from the analysis). This holds because our dynamic algorithms maintain the entire assignment of clients to facilities explicitly in memory at every time step; thus, the amortized number of client reconnections is at most the amortized update time. This is useful when one considers an online setting where clients arrive and depart and one is interested in small client recourse. A fully dynamic online model of the facility location problem, where clients arrive and depart, was recently studied by Cygan et al. [12] and Cohen-Addad et al. [11], but with a different assumption on recourse. In their model, when a client arrives, the algorithm has to assign it to an open facility immediately, while upon the departure of a client, if a facility was opened at the same location, then the clients that were assigned to that location should be reassigned immediately and irrevocably. Cygan et al. [12] studied the case when recourse is not allowed: they showed that a delicate extension of Meyerson's [35] algorithm obtains the asymptotically tight competitive ratio of $O(\log n/\log\log n)$. Cohen-Addad et al. [11] later showed that this can be improved to $O(1)$ if recourse is allowed. However, both results hold only for uniform facility costs, and Cygan et al. [12] even showed an unbounded lower bound for the non-uniform facility cost case in their model. Moreover, in their model reconnections of clients are assumed to be “automatic” and do not count towards the client recourse; it is not clear how many client reconnections their algorithm makes.

1.1 Our Techniques

Our main algorithmic technique for proving Theorems 1 and 2 is local search, which is one of the most powerful algorithm design paradigms. Indeed, for both results, the competitive (approximation) ratio we achieve is $1+\sqrt{2}+\epsilon$, which matches the best approximation ratio for offline facility location obtained using local search [3]. Both of our results are based on the following key lemma. Suppose we maintain local optimum solutions at every time step of our algorithm. When a new client $j_{t}$ comes at time $t$, we add it to our solution using a simple operation, and let $\Delta_{t}$ be the increase of our cost due to the arrival of $j_{t}$. The key lemma states that the sum of the $\Delta_{t}$ values over the first $T^{\prime}$ time steps can be bounded in terms of the optimum cost at time $T^{\prime}$. With a simple modification to the local search algorithm, in which we require each local operation to decrease the cost by enough for every client it reconnects, one can bound the total client recourse.

The straightforward implementation of the local search algorithm takes time $\Omega(n^{3})$. To derive a better running time, we leverage the randomized local search idea of Charikar and Guha [10]. At every iteration, we randomly choose a facility $i$ or a closing operation, and then perform the best operation that opens or swaps in $i$, or closes a facility if that is what we chose. By restricting the facility $i$, and with the help of a heap data structure, an iteration of the algorithm can be implemented in time $O(|C|\log|F|)$. As in [10], we can also show that each iteration makes reasonable progress in expectation, leading to a bound of $\tilde{O}(|F|)$ on the number of iterations needed for the algorithm to succeed with high probability. We remark that the algorithm in [10] used a different local search framework; therefore, our result shows that the classic algorithm of [3] can also be made fast.

However, directly replacing the randomized local search procedure with a deterministic one does not work: the solution at the end of each time step might not be a local optimum, as we did not enumerate all possible local operations, and thus the key lemma no longer holds. Nevertheless, we show that applying a few local operations around $j_{t}$ upon its arrival addresses the issue. With the key lemma, one can bound the number of times we perform the iterative randomized local search procedure, and thus the overall running time.

Our proof of Theorem 3 is based on a generalization of the greedy algorithm for facility location on HST metrics, which was developed in [16] in the context of differential privacy, but only for the case of uniform facility costs. The intuition of the algorithm is as follows: if for some vertex $v$ of the HST $T$, the number of clients in the tree $T_{v}$ (the sub-tree of $T$ rooted at $v$) times the length of the parent edge of $v$ is big compared to the cost of the cheapest facility in $T_{v}$, then we should open that facility. Otherwise, we should not open it and should let the clients in $T_{v}$ connect to the outside of $T_{v}$ through the parent edge. This intuition can be made formal: we mark $v$ in the former case; then simply opening the cheapest facility in $T_{v}$ for every lowest marked vertex $v$ leads to a constant approximation for facility location (see the sketch below).
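The sketch below is our own illustration of this marking rule for the static case, assuming a hypothetical Node class (fields level, num_clients, cheapest, children); the dynamic version in Section 6 additionally maintains per-vertex constants to prevent the marking status from flipping back and forth:

class Node:
    # One HST vertex: level, number of clients in T_v, cheapest facility
    # cost inside T_v, and the list of children.
    def __init__(self, level, num_clients, cheapest, children=()):
        self.level, self.num_clients = level, num_clients
        self.cheapest, self.children = cheapest, list(children)
        self.marked = False

def mark(v):
    # Mark v iff (#clients in T_v) * (parent-edge length 2^level(v))
    # is at least the cost of the cheapest facility in T_v.
    v.marked = v.num_clients * (2 ** v.level) >= v.cheapest
    for c in v.children:
        mark(c)

def lowest_marked(v):
    # Marked vertices with no marked descendant; opening the cheapest
    # facility in T_v for each of them gives the O(1)-approximation.
    below = [u for c in v.children for u in lowest_marked(c)]
    return below if below else ([v] if v.marked else [])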

The above offline algorithm leads to a dynamic data structure that maintains $O(1)$-approximate solutions, supports insertion and deletion of clients, and reports the connecting facility of a client in $O(\log D)$ time. This is because each time a client arrives or departs, only its ancestors are affected. However, in the dynamic algorithm setting, we need to maintain the assignment vector in memory, so that a client can be notified when its connecting facility changes. This requires the number of reconnections made by our algorithm to be small. To achieve this goal, we impose two constants for each $v$ when deciding whether $v$ should be marked and whether the cheapest facility in $T_{v}$ should be open. When a vertex $v$ changes its marking/opening status, we update the constants in such a way that it becomes hard for the status to change back.

2 Preliminaries

Throughout the paper, we use $F$ to denote the set of potential facilities for all the problems and models; we assume $F$ is given upfront. $C$ is the dynamic set of clients that our algorithm needs to connect. This is not necessarily the set of clients that are present: in the algorithms for online facility location with recourse and for dynamic facility location in the incremental setting, we fix the connections of some clients as the algorithms proceed. These clients are said to be “frozen” and are excluded from $C$. We shall always use $d$ to denote the hosting metric containing $F$ and all potential clients. For any point $j$ and subset $V$ of points in the metric, we define $d(j,V)=\min_{v\in V}d(j,v)$ to be the minimum distance from $j$ to a point in $V$. We assume all distances are integers and the minimum non-zero distance between two points is 1. We define $D$, the diameter or aspect ratio of a metric space, as the largest distance between two points in it. Let $n$ be $|F|$ plus the total number of clients that arrive during the whole process. The algorithms do not need to know the exact value of $n$ in advance, except that in the dynamic algorithm for facility location in the incremental setting (the problem in Theorem 2), to achieve the $1-1/n^{2}$ success probability, a sufficiently large $\Gamma=\mathrm{poly}(n,\log D,\frac{1}{\epsilon})$ needs to be given. (For an algorithm that might fail, we need some information about $n$ to obtain a failure probability that depends on $n$.)

In all the algorithms, we maintain a set $S$ of open facilities and a connection $\sigma\in S^{C}$ of the clients in $C$ to facilities in $S$. We do not require that $\sigma$ connect clients to their respective nearest open facilities. For any solution $(S^{\prime}\subseteq F,\sigma^{\prime}\in S^{\prime C})$, we use $\mathsf{cc}(\sigma^{\prime})=\sum_{j\in C}d(j,\sigma^{\prime}_{j})$ to denote the connection cost of the solution. For facility location, we use $\mathsf{cost}(S^{\prime},\sigma^{\prime})=f(S^{\prime})+\mathsf{cc}(\sigma^{\prime})$ to denote the total cost of the solution $(S^{\prime},\sigma^{\prime})$, where $f(S^{\prime}):=\sum_{i\in S^{\prime}}f_{i}$. Notice that $\sigma$ and the definitions of the $\mathsf{cc}$ and $\mathsf{cost}$ functions depend on the dynamic set $C$.

Throughout the paper, we distinguish between a “moment”, a “time” and a “step”. A moment refers to a specific time point during the execution of our algorithm. A time corresponds to an arrival or a departure event: at each time, exactly one client arrives or departs, and time $t$ refers to the period from the moment the $t$-th event happens until the moment the $(t+1)$-th event happens (or the end of the algorithm). A step refers to one statement in our pseudo-code indexed by a number.

2.1 Hierarchically Well Separated Trees

Definition 5.

A hierarchically-well-separated tree (or HST for short) is an edge-weighted rooted tree with the following properties:

  • all the root-to-leaf paths have the same number of edges,

  • if we define the level of a vertex $v$, ${\mathsf{level}}(v)$, to be the number of edges on a path from $v$ to any of its leaf descendants, then for a non-root vertex $v$, the weight of the edge between $v$ and its parent is exactly $2^{{\mathsf{level}}(v)}$.

Given an HST $T$ with set of leaves $X$, we use $d_{T}$ to denote the shortest path metric of the tree $T$ (with respect to the edge weights) restricted to $X$.
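A handy consequence of Definition 5: climbing from a leaf to its level-$\ell$ ancestor crosses edges of weights $2^{0},2^{1},\cdots,2^{\ell-1}$, so for two leaves $u,v$ whose lowest common ancestor is at level $\ell$ we get $d_{T}(u,v)=2(2^{\ell}-1)$. A small sketch of this computation (our own, assuming a hypothetical parent-pointer representation of the tree):

def hst_distance(u, v, parent, level):
    # parent: vertex -> its parent; level: vertex -> its level.
    # Walk the two leaves up to their lowest common ancestor.
    while u != v:
        if level[u] <= level[v]:
            u = parent[u]
        else:
            v = parent[v]
    # Each leaf-to-LCA path has total weight 2^level(LCA) - 1.
    return 2 * (2 ** level[u] - 1)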

The classic results of Bartal [4] and Fakcharoenphol, Rao and Talwar [17] state that we can embed any $N$-point metric $(X,d)$ (with minimum non-zero distance 1) into a distribution $\pi$ over expanding HST metrics $(X,d_{T})$ with distortion $O(\log N)$: for every $u,v\in X$, we have $d_{T}(u,v)\geq d(u,v)$ and $\operatorname*{\mathbb{E}}_{T\sim\pi}[d_{T}(u,v)]\leq O(\log N)\cdot d(u,v)$. (A metric $(X,d_{T})$ is expanding w.r.t. $(X,d)$ if for every $u,v\in X$, we have $d_{T}(u,v)\geq d(u,v)$.) Moreover, there is an efficient randomized algorithm [7] that outputs a sample of the tree $T$ from $\pi$. Thus, applying standard arguments, Theorem 3 implies Theorem 4.

2.2 Specifying Input Sequence

In this section we specify how the input sequence is given. For the online and dynamic facility location problems, we assume the facility locations $F$, their costs $(f_{i})_{i\in F}$, and the metric $d$ restricted to $F$ are given upfront, and they take $O(|F|^{2})$ space. Whenever a client $j\in C$ arrives, it specifies its distance to every facility $i\in F$ (notice that the connection cost of an assignment $\sigma\in S^{C}$ does not depend on distances between two clients, and thus these do not need to be given). Thus the whole input contains $O(n|F|)$ words.

For Theorems 3 and 4, as we do not try to optimize the constants, we do not need a client to specify its distance to every facility. By losing a multiplicative factor of 2 and an additive factor of 1 in the approximation ratio, we can assume that every client $j$ is collocated with its nearest facility in $F$ (see Appendix C). Thus, we only require that when a client $j$ comes, it reports the position of its nearest facility. For Theorem 3, the HST $T$ over $F$ is given at the beginning using $O(|F|)$ words. For Theorem 4, the metric $d$ over $F$ is given at the beginning using $O(|F|^{2})$ words. Then, we use an efficient algorithm [7] to sample an HST $T$.

2.3 Local Search for Facility Location

The local-search technique has been used to obtain the classic $(1+\sqrt{2})$-approximation offline algorithm for facility location [3]. We now give an overview of this algorithm, which will be the baseline of our online and dynamic algorithms for facility location. One can obtain a (tight) 3-approximation for facility location without scaling facility costs. By scaling the facility costs by a factor of $\lambda:=\sqrt{2}$ when deciding whether an operation decreases the cost, we can achieve the better approximation ratio $\alpha_{\mathsf{FL}}:=1+\sqrt{2}$. Throughout, we fix the constants $\lambda=\sqrt{2}$ and $\alpha_{\mathsf{FL}}=1+\sqrt{2}$. For a solution $(S^{\prime},\sigma^{\prime})$ to a facility location instance, we use $\mathsf{cost}_{\lambda}(S^{\prime},\sigma^{\prime})=\lambda f(S^{\prime})+\mathsf{cc}(\sigma^{\prime})$ to denote the cost of the solution $(S^{\prime},\sigma^{\prime})$ with facility costs scaled by $\lambda=\sqrt{2}$. We call $\mathsf{cost}_{\lambda}(S^{\prime},\sigma^{\prime})$ the scaled cost of $(S^{\prime},\sigma^{\prime})$.

Given the current solution $(S,\sigma)$ for a facility location instance defined by $F,C,d$ and $(f_{i})_{i\in F}$, we can apply a local operation that changes the solution $(S,\sigma)$. A valid local operation is one of the following.

  • An $\mathsf{open}$ operation, in which we open some facility $i\in F$ and reconnect a subset $C^{\prime}\subseteq C$ of clients to $i$. We allow $i$ to be already in $S$, in which case we simply reconnect $C^{\prime}$ to $i$. This needs to be allowed since our $\sigma$ does not connect clients to their nearest open facilities.

  • A $\mathsf{close}$ operation, in which we close some facility $i^{\prime}\in S$ and reconnect the clients in $\sigma^{-1}(i^{\prime})$ to facilities in $S\setminus\{i^{\prime}\}$.

  • In a $\mathsf{swap}$ operation, we open some facility $i\notin S$ and close some facility $i^{\prime}\in S$, reconnect the clients in $\sigma^{-1}(i^{\prime})$ to facilities in $S\setminus\{i^{\prime}\}\cup\{i\}$, and possibly some other clients to $i$. We say $i$ is swapped in and $i^{\prime}$ is swapped out by the operation.

Thus, in any valid operation, we open and/or close at most one facility. A client can be reconnected only if it is currently connected to the facility that will be closed, or if it will be connected to the newly opened facility. After we apply a local operation, $S$ and $\sigma$ are updated accordingly, so that $(S,\sigma)$ is always the current solution.

For the online algorithm with recourse model, since we need to bound the number of reconnections, we apply a local operation only if the decrease in scaled cost is large compared to the number of reconnections it makes. This motivates the following definition:

Definition 6 (Efficient operations for facility location).

Given $\phi\geq 0$, we say a local operation on a solution $(S,\sigma)$ for a facility location instance is $\phi$-efficient if it decreases $\mathsf{cost}_{\lambda}(S,\sigma)$ by more than $\phi$ times the number of clients it reconnects.
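In code form, the check is a single inequality on the scaled cost; a minimal helper (ours, not the paper's, with hypothetical argument names):

import math

LAM = math.sqrt(2)  # the scaling factor lambda

def is_phi_efficient(old_fS, old_cc, new_fS, new_cc, reconnections, phi):
    # phi-efficient: the drop in scaled cost lambda*f(S) + cc(sigma)
    # must exceed phi times the number of reconnected clients.
    drop = (LAM * old_fS + old_cc) - (LAM * new_fS + new_cc)
    return drop > phi * reconnections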

The following two theorems can be derived from the analysis of the local search algorithm for facility location. We include their proofs in Appendix A for completeness.

Theorem 7.

Consider a facility location instance with optimum solution cost $\mathsf{opt}$ (under the original cost function). Let $(S,\sigma)$ be the current solution in our algorithm and let $\phi\geq 0$ be a real number. If there are no $\phi$-efficient local operations on $(S,\sigma)$, then we have

\[\mathsf{cost}(S,\sigma)\leq\alpha_{\mathsf{FL}}\big(\mathsf{opt}+|C|\phi\big).\]

In particular, applying the theorem with $\phi=0$ shows that $(S,\sigma)$ is an $(\alpha_{\mathsf{FL}}=1+\sqrt{2})$-approximation for the instance.

The following theorem will be used to analyze our randomized local search procedure.

Theorem 8.

Let $(S,\sigma)$ be a solution to a facility location instance, $(S^{*},\sigma^{*})$ an optimum solution, and $\mathsf{opt}$ the optimum cost. Then there are two sets ${\mathcal{P}}_{\mathrm{C}}$ and ${\mathcal{P}}_{\mathrm{F}}$ of valid local operations on $(S,\sigma)$, where each operation $\mathrm{op}$ decreases the scaled cost $\mathsf{cost}_{\lambda}(S,\sigma)$ by $\nabla_{\mathrm{op}}>0$, such that the following holds:

  • $\sum_{\mathrm{op}\in{\mathcal{P}}_{\mathrm{C}}}\nabla_{\mathrm{op}}\geq\mathsf{cc}(\sigma)-(\lambda f(S^{*})+\mathsf{cc}(\sigma^{*}))$.

  • $\sum_{\mathrm{op}\in{\mathcal{P}}_{\mathrm{F}}}\nabla_{\mathrm{op}}\geq\lambda f(S)-(\lambda f(S^{*})+2\,\mathsf{cc}(\sigma^{*}))$.

  • There are at most $|F|$ $\mathsf{close}$ operations in ${\mathcal{P}}_{\mathrm{C}}\biguplus{\mathcal{P}}_{\mathrm{F}}$.

  • For every $i\in F$, there is at most one operation in each of ${\mathcal{P}}_{\mathrm{C}}$ and ${\mathcal{P}}_{\mathrm{F}}$ that opens or swaps in $i$.

2.4 Useful Lemmas

The following lemmas will be used repeatedly in our analysis and thus we prove them separately in Appendix B.

Lemma 9.

Let $b\in{\mathbb{R}}_{\geq 0}^{T}$ for some integer $T\geq 1$. Let $B_{T^{\prime}}=\sum_{t=1}^{T^{\prime}}b_{t}$ for every $T^{\prime}=0,1,\cdots,T$. Let $0<a_{1}\leq a_{2}\leq\cdots\leq a_{T}$ be a sequence of real numbers and $\alpha>0$ be such that $B_{t}\leq\alpha a_{t}$ for every $t\in[T]$. Then we have

\[\sum_{t=1}^{T}\frac{b_{t}}{a_{t}}\leq\alpha\left(\ln\frac{a_{T}}{a_{1}}+1\right).\]
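A quick numeric sanity check of Lemma 9 on random data satisfying its hypotheses (illustration only):

import math
import random

random.seed(0)
T, alpha = 50, 3.0
a = sorted(random.uniform(1, 100) for _ in range(T))  # 0 < a_1 <= ... <= a_T
b, B = [], 0.0
for t in range(T):
    # Choose b_t >= 0 while keeping every prefix sum B_t <= alpha * a_t.
    b_t = random.uniform(0, alpha * a[t] - B)
    b.append(b_t)
    B += b_t
lhs = sum(b[t] / a[t] for t in range(T))
rhs = alpha * (math.log(a[-1] / a[0]) + 1)
assert lhs <= rhs, (lhs, rhs)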
Lemma 10.

Assume that at some moment of an algorithm for facility location, $C$ is the set of clients and $(S,\sigma)$ is the solution for $C$. Let $i\in F$ and let $\tilde{C}\subseteq C$ be any non-empty set of clients. Assume moreover that at this moment there is no $\phi$-efficient operation that opens $i$, for some $\phi\geq 0$. Then we have

\[d(i,S)\leq\frac{f_{i}+2\sum_{\tilde{j}\in\tilde{C}}d(i,\tilde{j})}{|\tilde{C}|}+\phi.\]

Organization

The rest of the paper is organized as follows. In Section 3, we prove Theorem 1 by giving our online algorithm for facility location with recourse. Section 4 gives the randomized local search procedure, that will be used in the proof of Theorem 2 in Section 5. Section 6 is dedicated to the proof of Theorem 4, by giving the fully dynamic algorithm for facility location in HST metrics. We give some open problems and future directions in Section 7. Some proofs are deferred to the appendix for a better flow of the paper.

3 $(1+\sqrt{2}+\epsilon)$-Competitive Online Algorithm with Recourse

In this section, we prove Theorem 1 by giving the algorithm for online facility location with recourse.

3.1 The Algorithm

For any $\epsilon>0$, let $\epsilon^{\prime}=\Theta(\epsilon)$ be a parameter that is sufficiently small so that the approximation ratio $\alpha_{\mathsf{FL}}+O(\epsilon^{\prime})=1+\sqrt{2}+O(\epsilon^{\prime})$ achieved by our algorithm is at most $\alpha_{\mathsf{FL}}+\epsilon$. Our algorithm for online facility location is easy to describe. Whenever a client $j_{t}$ comes at time $t$, we use a simple rule to connect $j_{t}$, as defined in the procedure $\mathsf{initial\text{-}connect}$ in Algorithm 1: either connect $j_{t}$ to the nearest facility in $S$, or open and connect $j_{t}$ to its nearest facility in $F\setminus S$, whichever incurs the smaller cost. Then we repeatedly perform $\phi$-efficient operations (Definition 6) for $\phi=\frac{\epsilon^{\prime}\cdot\mathsf{cost}(S,\sigma)}{\alpha_{\mathsf{FL}}|C|}$, until no such operation can be found. (There is an exponential number of possible operations, but we can check efficiently whether a $\phi$-efficient one exists. $\mathsf{close}$ operations can be handled easily. To check if we can open a facility $i$, it suffices to check whether $\sum_{j\in C:d(j,i)+\phi<d(j,\sigma_{j})}(d(j,\sigma_{j})-d(j,i)-\phi)>\lambda f_{i}\cdot 1_{i\notin S}$; see the sketch after Algorithm 1. $\mathsf{swap}$ operations are more complicated but can be handled similarly.)

Algorithm 1 $\mathsf{initial\text{-}connect}(j)$
1:if $\min_{i\in F\setminus S}(f_{i}+d(i,j))<d(j,S)$ then
2:     let $i^{*}=\arg\min_{i\in F\setminus S}(f_{i}+d(i,j))$, $S\leftarrow S\cup\{i^{*}\}$, $\sigma_{j}\leftarrow i^{*}$
3:else $\sigma_{j}\leftarrow\arg\min_{i\in S}d(j,i)$
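The parenthetical check above for $\mathsf{open}$ operations translates directly into code; a sketch under our own naming (dist, sigma and the flag i_is_open are assumed inputs):

import math

LAM = math.sqrt(2)

def exists_phi_efficient_open(i, i_is_open, C, sigma, dist, f_i, phi):
    # The best open(i) operation reconnects exactly the clients that
    # each save more than phi; it is phi-efficient iff the total saving
    # beats the scaled opening cost lambda * f_i (0 if i is already open).
    gain = sum(dist(j, sigma[j]) - dist(j, i) - phi
               for j in C if dist(j, sigma[j]) - dist(j, i) > phi)
    return gain > LAM * f_i * (0 if i_is_open else 1)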

We can show that this algorithm gives an $(\alpha_{\mathsf{FL}}+\epsilon)$-approximation with amortized recourse $O(\log D\log n)$; recall that $D$ is the aspect ratio of the metric. To remove the dependence on $D$, we divide the algorithm into stages and freeze the connections of clients that arrived in earlier stages. The final algorithm is described in Algorithm 3, and Algorithm 2 gives one stage of the algorithm.

Algorithm 2 One Stage of Online Algorithm for Facility Location
1:
  • $C$: initial set of clients

  • $(S,\sigma)$: a solution for $C$ which is $O(1)$-approximate

  • Clients $j_{1},j_{2},\cdots$ arrive from time to time

2: Guaranteeing that $(S,\sigma)$ at the end of each time $t$ is $\frac{\alpha_{\mathsf{FL}}}{1-\epsilon^{\prime}}$-approximate
3:$\mathsf{init}\leftarrow\mathsf{cost}(S,\sigma)$
4:for $t\leftarrow 1,2,\cdots$, terminating if no more clients will arrive do
5:     $C\leftarrow C\cup\{j_{t}\}$, and call $\mathsf{initial\text{-}connect}(j_{t})$
6:     while there exists an $\frac{\epsilon^{\prime}\cdot\mathsf{cost}(S,\sigma)}{\alpha_{\mathsf{FL}}|C|}$-efficient local operation do perform the operation
7:     if $\mathsf{cost}(S,\sigma)>\mathsf{init}/\epsilon^{\prime}$ then terminate the stage
Algorithm 3 Online Algorithm for Facility Location
1:$C\leftarrow\emptyset$, $S\leftarrow\emptyset$, $\sigma\leftarrow()$
2:repeat
3:     $C^{\circ}\leftarrow C$, $(S^{\circ},\sigma^{\circ})\leftarrow(S,\sigma)$
4:     redefine the next time to be time 1 and run one stage as defined in Algorithm 2
5:     permanently open one copy of each facility in $S^{\circ}$, and permanently connect the clients in $C^{\circ}$ according to $\sigma^{\circ}$ (we call this operation freezing $S^{\circ}$ and $C^{\circ}$)
6:     $C\leftarrow C\setminus C^{\circ}$, and restrict the domain of $\sigma$ to the new $C$
7:until no clients come

In Algorithm 2, we proceed as described above, with two modifications. First, we are given an initial set $C$ of clients and a solution $(S,\sigma)$ for $C$ which is $O(1)$-approximate. Second, the stage terminates when the cost of our solution has increased by a factor of more than $1/\epsilon^{\prime}$. The main algorithm (Algorithm 3) is broken into many stages. Since we shall focus on one stage of the algorithm for most of our analysis, we simply redefine the time so that every stage starts at time 1. The improved recourse comes from the freezing operation: at the end of each stage, we permanently open one copy of each facility in $S^{\circ}$, and permanently connect the clients in $C^{\circ}$ to the copies of $S^{\circ}$ according to $\sigma^{\circ}$, where $C^{\circ}$ and $(S^{\circ},\sigma^{\circ})$ are the client set and the solution at the beginning of the stage. Notice that we assume the original facilities in $S^{\circ}$ still participate in the algorithm in the future; that is, they are subject to opening and closing. Thus each facility may be opened multiple times during the algorithm, and we take the facility costs of all copies into consideration. This assumption is only for the sake of analysis; the actual algorithm only needs to open one copy, and its cost can only be smaller than that of the described algorithm.

From now on, we focus on one stage of the algorithm and assume that the solution given at the beginning of the stage is $O(1)$-approximate. In the end we shall account for the loss due to the freezing of clients and facilities. Within a stage, the approximation ratio follows directly from Theorem 7: focus on the moment after the while loop at time step $t$ in Algorithm 2. Since there are no $\frac{\epsilon^{\prime}\cdot\mathsf{cost}(S,\sigma)}{\alpha_{\mathsf{FL}}|C|}$-efficient local operations on $(S,\sigma)$, the theorem gives $\mathsf{cost}(S,\sigma)\leq\alpha_{\mathsf{FL}}\left(\mathsf{opt}+|C|\cdot\frac{\epsilon^{\prime}\cdot\mathsf{cost}(S,\sigma)}{\alpha_{\mathsf{FL}}|C|}\right)=\alpha_{\mathsf{FL}}\mathsf{opt}+\epsilon^{\prime}\cdot\mathsf{cost}(S,\sigma)$, where $\mathsf{opt}$ is the cost of the optimum solution for $C$. Thus, at the end of each time, we have $\mathsf{cost}(S,\sigma)\leq\frac{\alpha_{\mathsf{FL}}}{1-\epsilon^{\prime}}\cdot\mathsf{opt}$.

3.2 Bounding Amortized Recourse in One Stage

We now bound the amortized recourse in a stage; we assume that $\mathsf{cost}(S,\sigma)>0$ at the beginning of the stage, since otherwise no recourse is involved in the stage (we terminate the stage once the cost becomes non-zero). We use $T$ to denote the last time of the stage. For every time $t$, let $C_{t}$ be the set $C$ at the end of time $t$, and let $\mathsf{opt}_{t}$ be the cost of the optimum solution for the set $C_{t}$. For every $t\in[T]$, we define $\Delta_{t}$ to be the value of $\mathsf{cost}(S,\sigma)$ after Step 5 at time step $t$ in Algorithm 2, minus that before Step 5. We can think of this as the cost increase due to the arrival of $j_{t}$.

The key lemma we can prove is the following:

Lemma 11.

For every $T^{\prime}\in[T]$, we have

\[\sum_{t=1}^{T^{\prime}}\Delta_{t}\leq O(\log T^{\prime})\,\mathsf{opt}_{T^{\prime}}.\]
Proof.

Consider the optimum solution for $C_{T^{\prime}}$ and focus on any star $(i,C^{\prime})$ in the solution; that is, $i$ is an open facility and $C^{\prime}$ is the set of clients connected to $i$. Assume $C^{\prime}\setminus C_{0}=\{j_{t_{1}},j_{t_{2}},\cdots,j_{t_{s}}\}$, where $1\leq t_{1}<t_{2}<\cdots<t_{s}\leq T^{\prime}$; recall that $C_{0}$ is the initial set of clients given at the beginning of the stage. We shall bound $\sum_{s^{\prime}=1}^{s}\Delta_{t_{s^{\prime}}}$ in terms of the cost of the star $(i,C^{\prime}\setminus C_{0})$.

By the rule specified in $\mathsf{initial\text{-}connect}$, we have $\Delta_{t_{1}}\leq f_{i}+d(i,j_{t_{1}})$. Now focus on any integer $k\in[2,s]$. Before Step 5 at time $t_{k}$, no $\phi$-efficient operation that opens $i$ is available, where $\phi:=\frac{\epsilon^{\prime}\cdot\mathsf{cost}(S,\sigma)}{\alpha_{\mathsf{FL}}|C_{t_{k}-1}|}\leq\frac{O(\epsilon^{\prime})\cdot\mathsf{opt}_{t_{k}-1}}{t_{k}-1}\leq\frac{O(\epsilon^{\prime})\cdot\mathsf{opt}_{T^{\prime}}}{t_{k}-1}$. Thus, we can apply Lemma 10 to $i$, $\tilde{C}=\{j_{t_{1}},j_{t_{2}},\cdots,j_{t_{k-1}}\}$ and $\phi$ to conclude that before Step 5, we have

\[d(i,S)\leq\frac{f_{i}+2\cdot\sum_{k^{\prime}=1}^{k-1}d(i,j_{t_{k^{\prime}}})}{k-1}+\frac{O(\epsilon^{\prime})\cdot\mathsf{opt}_{T^{\prime}}}{t_{k}-1}.\]

In $\mathsf{initial\text{-}connect}(j_{t_{k}})$, we have the option of connecting $j_{t_{k}}$ to its nearest open facility. Thus, we have

\[\Delta_{t_{k}}\leq d(i,S)+d(i,j_{t_{k}})\leq\frac{f_{i}+2\cdot\sum_{k^{\prime}=1}^{k-1}d(i,j_{t_{k^{\prime}}})}{k-1}+\frac{O(\epsilon^{\prime})\cdot\mathsf{opt}_{T^{\prime}}}{t_{k}-1}+d(i,j_{t_{k}}).\]

We now sum up the above inequality over all $k\in[2,s]$, together with $\Delta_{t_{1}}\leq f_{i}+d(i,j_{t_{1}})$. We get

\[\sum_{k=1}^{s}\Delta_{t_{k}}\leq O(\log s)\left(f_{i}+\sum_{k^{\prime}=1}^{s}d(i,j_{t_{k^{\prime}}})\right)+O(\epsilon^{\prime})\sum_{k=2}^{s}\frac{\mathsf{opt}_{T^{\prime}}}{t_{k}-1}. \qquad (1)\]

To see the above inequality, it suffices to consider the coefficients of $f_{i}$ and the $d(i,j_{t_{k^{\prime}}})$'s on the right-hand side. The coefficient of $f_{i}$ is at most $1+\frac{1}{1}+\frac{1}{2}+\cdots+\frac{1}{s-1}=O(\log s)$; the coefficient of each $d(i,j_{t_{k^{\prime}}})$ is $1+\frac{2}{k^{\prime}}+\frac{2}{k^{\prime}+1}+\cdots+\frac{2}{s-1}=O(\log s)$.

We now take the sum of (1) over all stars $(i,C^{\prime})$ in the optimum solution for $C_{T^{\prime}}$. The sum of the first terms on the right side of (1) is $O(\log T^{\prime})\,\mathsf{opt}_{T^{\prime}}$, since $f_{i}+\sum_{k^{\prime}=1}^{s}d(i,j_{t_{k^{\prime}}})$ is exactly the cost of the star $(i,C^{\prime}\setminus C_{0})$, which is at most the cost of $(i,C^{\prime})$. The sum of the second terms is $O(\epsilon^{\prime}\log T^{\prime})\cdot\mathsf{opt}_{T^{\prime}}$, since the integers $t_{k}-1$ over all stars $(i,C^{\prime})$ and all $k\geq 2$ are positive and distinct. Thus overall, we have $\sum_{t=1}^{T^{\prime}}\Delta_{t}\leq O(\log T^{\prime})\,\mathsf{opt}_{T^{\prime}}$. ∎

With Lemma 11, we can now bound the amortized recourse of one stage. At time $t$, $\mathsf{cost}(S,\sigma)$ first increases by $\Delta_{t}$ in Step 5. After that, it decreases by at least $\frac{\epsilon^{\prime}\mathsf{cost}(S,\sigma)}{\alpha_{\mathsf{FL}}|C|}\geq\frac{\epsilon^{\prime}\mathsf{opt}_{t}}{\alpha_{\mathsf{FL}}|C|}\geq\frac{\epsilon^{\prime}\mathsf{opt}_{t}}{\alpha_{\mathsf{FL}}|C_{T}|}$ for every reconnection we make. Let $\Phi_{T^{\prime}}=\sum_{t=1}^{T^{\prime}}\Delta_{t}$; Lemma 11 says $\Phi_{t}\leq\alpha\,\mathsf{opt}_{t}$ for some $\alpha=O(\log T)$ and every $t\in[T]$. Noticing that $(\mathsf{opt}_{t})_{t\in[T]}$ is a non-decreasing sequence, the total number of reconnections is at most

\[\frac{\mathsf{init}}{\epsilon^{\prime}\cdot\mathsf{opt}_{1}/(\alpha_{\mathsf{FL}}|C_{T}|)}+\sum_{t=1}^{T}\frac{\Delta_{t}}{\epsilon^{\prime}\cdot\mathsf{opt}_{t}/(\alpha_{\mathsf{FL}}|C_{T}|)}=\frac{\alpha_{\mathsf{FL}}|C_{T}|}{\epsilon^{\prime}}\left(\frac{\mathsf{init}}{\mathsf{opt}_{1}}+\sum_{t=1}^{T-1}\frac{\Delta_{t}}{\mathsf{opt}_{t}}+\frac{\Delta_{T}}{\mathsf{opt}_{T}}\right).\]

Notice that $\mathsf{init}\leq O(1)\mathsf{opt}_{0}\leq O(1)\mathsf{opt}_{1}$. Applying Lemma 9 with $T$ replaced by $T-1$, $b_{t}=\Delta_{t}$, $B_{t}=\Phi_{t}$ and $a_{t}=\mathsf{opt}_{t}$ for every $t$, we have $\sum_{t=1}^{T-1}\frac{\Delta_{t}}{\mathsf{opt}_{t}}\leq\alpha\left(\ln\frac{\mathsf{opt}_{T-1}}{\mathsf{opt}_{1}}+1\right)=O\left(\log T\log\frac{1}{\epsilon^{\prime}}\right)$, since $\mathsf{opt}_{T-1}\leq O(1/\epsilon^{\prime})\cdot\mathsf{opt}_{1}$. Also $\Delta_{T}\leq\mathsf{opt}_{T}$, since $\mathsf{opt}_{T}\geq\min_{i\in F}(f_{i}+d(i,j_{T}))\geq\Delta_{T}$. So the total number of reconnections is at most $O\left(\frac{\log T}{\epsilon^{\prime}}\log\frac{1}{\epsilon^{\prime}}\right)\cdot|C_{T}|$. The amortized recourse per client is $O\left(\frac{\log T}{\epsilon^{\prime}}\log\frac{1}{\epsilon^{\prime}}\right)\leq O\left(\frac{\log n}{\epsilon^{\prime}}\log\frac{1}{\epsilon^{\prime}}\right)$, where in the amortization we only count the clients involved in the stage. Recall that $n$ is the total number of clients that arrive.

As each client appears in at most 2 stages, the overall amortized recourse is $O\left(\frac{\log n}{\epsilon^{\prime}}\log\frac{1}{\epsilon^{\prime}}\right)$. Finally, we consider the loss in the approximation ratio due to the freezing of clients. Suppose we are in the $p$-th stage. Then the clients that arrived at or before the $(p-2)$-th stage have been frozen and removed. Let $\overline{\mathsf{opt}}$ be the cost of the optimum solution for all clients that arrived at or before the $(p-1)$-th stage. Then the frozen facilities and clients have cost at most $\overline{\mathsf{opt}}\cdot O\left(\epsilon^{\prime}+\epsilon^{\prime 2}+\epsilon^{\prime 3}+\cdots\right)=O(\epsilon^{\prime})\overline{\mathsf{opt}}$. At any time in the $p$-th stage, the optimum solution taking all arrived clients into consideration has cost $\overline{\mathsf{opt}}^{\prime}\geq\overline{\mathsf{opt}}$, and our solution has cost at most $(\alpha_{\mathsf{FL}}+O(\epsilon^{\prime}))\overline{\mathsf{opt}}^{\prime}$ without counting the frozen clients and facilities. Thus, our solution still has approximation ratio $\frac{(\alpha_{\mathsf{FL}}+O(\epsilon^{\prime}))\overline{\mathsf{opt}}^{\prime}+O(\epsilon^{\prime})\overline{\mathsf{opt}}}{\overline{\mathsf{opt}}^{\prime}}=\alpha_{\mathsf{FL}}+O(\epsilon^{\prime})$ when the frozen clients are taken into consideration.

4 Fast Local Search via Randomized Sampling

From now on, we will be concerned with dynamic algorithms. Towards proving Theorem 2 for the incremental setting, we first develop a randomized procedure that allows us to perform local search operations fast. In the next section, we use this procedure and ideas from the previous section to develop a dynamic algorithm with fast update time.

The high level idea is as follows: we partition the set of local operations into many “categories” depending on which facility the operation tries to open or swap in. In each iteration of the procedure, we sample a category according to some distribution and find the best local operation in that category. By focusing on one category, one iteration of the procedure runs in time $O(|C|\log|F|)$. On the other hand, the categories and the distribution over them are designed in such a way that in each iteration, the cost of our solution decreases by a multiplicative factor of $1-\Omega\big(\frac{1}{|F|}\big)$. This idea was used in [10] to obtain their $\tilde{O}(n^{2})$-time algorithm for approximating facility location. However, their algorithm was based on a different local search algorithm and analysis; for consistency and convenience of description, we stick to the original local search algorithm of [3] that leads to the $(1+\sqrt{2})$-approximation for the problem. Our algorithm uses the heap data structure.

4.1 Maintaining Heaps for Clients

Unlike in the online algorithm for facility location in Section 3, in the dynamic algorithm we guarantee that clients are connected to their nearest open facilities. That is, we always have $\sigma_{j}=\arg\min_{i\in S}d(j,i)$; we still keep $\sigma$ for convenience of description. We maintain $|C|$ min-heaps, one for each client $j\in C$: the min-heap for $j$ contains the facilities in $S\setminus\{\sigma_{j}\}$, with the priority value of $i$ being $d(j,i)$. This allows us to efficiently retrieve the second nearest open facility to each $j$: it is the facility at the top of the heap for $j$, and we use the procedure $\mathsf{heap\text{-}top}(j)$ to return it.
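Since standard binary heaps (e.g., Python's heapq) do not support deleting an arbitrary element, one common way to realize these per-client heaps is lazy deletion, sketched below (our own implementation choice; any heap with $O(\log|F|)$ insert/delete works):

import heapq

class ClientHeap:
    # Min-heap over the open facilities other than sigma_j, keyed by
    # d(j, i). Deletions are lazy: stale entries are skipped at the top.
    def __init__(self):
        self.heap, self.alive = [], set()

    def add(self, dist_ji, i):
        self.alive.add(i)
        heapq.heappush(self.heap, (dist_ji, i))

    def remove(self, i):
        self.alive.discard(i)          # actual removal deferred to top()

    def top(self):                     # heap-top(j)
        while self.heap and self.heap[0][1] not in self.alive:
            heapq.heappop(self.heap)   # drop stale entries
        return self.heap[0] if self.heap else None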

Algorithm 4 $\mathsf{\Delta\text{-}open}(i)$: return $\lambda f_{i}-\sum_{j\in C}\max\{0,d(j,\sigma_{j})-d(j,i)\}$
Algorithm 5 $\mathsf{try\text{-}open}(i)$
1:if $\mathsf{\Delta\text{-}open}(i)<0$ then open $i$ by updating $S$, $\sigma$ and the heaps accordingly
Algorithm 6 $\mathsf{\Delta\text{-}swap\text{-}in}(i)$
1:$C^{\prime}\leftarrow\{j\in C:d(j,i)<d(j,\sigma_{j})\}$ and $\Psi\leftarrow\lambda f_{i}-\sum_{j\in C^{\prime}}\big(d(j,\sigma_{j})-d(j,i)\big)$
2:$\Delta\leftarrow\min_{i^{\prime}\in S}\left\{\sum_{j\in\sigma^{-1}(i^{\prime})\setminus C^{\prime}}\big[\min\{d(j,i),d(j,\mathsf{heap\text{-}top}(j))\}-d(j,i^{\prime})\big]-\lambda f_{i^{\prime}}\right\}+\Psi$
3:return $(\Delta,\ \text{the }i^{\prime}\text{ above achieving the value of }\Delta)$
Algorithm 7 $\mathsf{\Delta\text{-}close}$
1:$\Delta\leftarrow\min_{i^{\prime}\in S}\left\{\sum_{j\in\sigma^{-1}(i^{\prime})}\big[d(j,\mathsf{heap\text{-}top}(j))-d(j,i^{\prime})\big]-\lambda f_{i^{\prime}}\right\}$
2:return $(\Delta,\ \text{the }i^{\prime}\text{ above achieving the value of }\Delta)$

We define four simple procedures, $\mathsf{\Delta\text{-}open}$, $\mathsf{try\text{-}open}$, $\mathsf{\Delta\text{-}swap\text{-}in}$ and $\mathsf{\Delta\text{-}close}$, described in Algorithms 4, 5, 6 and 7 respectively. Recall that we use the scaled cost for the local search algorithm, so we work with the scaled cost function in all these procedures. $\mathsf{\Delta\text{-}open}(i)$ for any $i\notin S$ returns $\Delta$, the increase in the scaled cost incurred by opening $i$. (For the operation to be useful, $\Delta$ should be negative, in which case $|\Delta|$ is the cost decrease achieved by opening $i$.) This is a one-line procedure, as in Algorithm 4; $\mathsf{try\text{-}open}$ opens $i$ if doing so reduces the scaled cost. $\mathsf{\Delta\text{-}swap\text{-}in}(i)$ for some $i\notin S$ returns a pair $(\Delta,i^{\prime})$, where $\Delta$ is the smallest scaled cost increase achievable by opening $i$ and closing some facility $i^{\prime}\in S$, and $i^{\prime}$ is the facility achieving this smallest value. (Again, for the operation to be useful, $\Delta$ should be negative, in which case $i^{\prime}$ is the facility that gives the maximum scaled cost decrease $|\Delta|$.) Similarly, $\mathsf{\Delta\text{-}close}$ returns a pair $(\Delta,i^{\prime})$ telling us the maximum scaled cost decrease achievable by closing one facility, and which facility achieves it. Notice that in all the procedures, the facility to open or swap in is given as a parameter, while the facility to close is chosen and returned by the procedure.

With the heaps, the procedures $\mathsf{\Delta\text{-}open}$, $\mathsf{\Delta\text{-}swap\text{-}in}$ and $\mathsf{\Delta\text{-}close}$ run in $O(|C|)$ time. We only analyze $\mathsf{\Delta\text{-}swap\text{-}in}(i)$, as the other two are easier. First, we define $C^{\prime}$ to be the set of clients $j$ with $d(j,i)<d(j,\sigma_{j})$; these are the clients that will surely be reconnected to $i$ once $i$ is swapped in. Let $\Psi=\lambda f_{i}-\sum_{j\in C^{\prime}}(d(j,\sigma_{j})-d(j,i))$ be the net increase in scaled cost from opening $i$ and connecting $C^{\prime}$ to it. The computation of $C^{\prime}$ and $\Psi$ in Step 1 takes $O(|C|)$ time. If additionally we close some $i^{\prime}\in S$, we need to reconnect each client $j$ in $\sigma^{-1}(i^{\prime})\setminus C^{\prime}$ to either $i$ or the top element in the heap for $j$, whichever is closer to $j$. Steps 2 and 3 compute and return the best scaled cost increase and the best $i^{\prime}$. Since $\sum_{i^{\prime}\in S}|\sigma^{-1}(i^{\prime})|=|C|$, the running time of this step can be bounded by $O(|C|)$.
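For concreteness, a direct transcription of $\mathsf{\Delta\text{-}swap\text{-}in}$ into Python (ours; dist, f, and a heap_top(j) callback returning $j$'s current second nearest open facility are assumed to exist, and $|S|\geq 2$ so heap_top is defined), grouping clients by their current facility so the scan stays linear in $|C|$:

import math
from collections import defaultdict

LAM = math.sqrt(2)

def delta_swap_in(i, S, C, sigma, dist, f, heap_top):
    # Clients strictly closer to i than to their current facility move
    # to i no matter which facility i' gets closed.
    Cp = {j for j in C if dist(j, i) < dist(j, sigma[j])}
    psi = LAM * f[i] - sum(dist(j, sigma[j]) - dist(j, i) for j in Cp)

    by_fac = defaultdict(list)          # sigma^{-1}(i') \ C', grouped by i'
    for j in C:
        if j not in Cp:
            by_fac[sigma[j]].append(j)

    best = None
    for ip in S:
        # Closing i' sends each j in sigma^{-1}(i') \ C' to i or to its
        # heap top, whichever is closer.
        delta = -LAM * f[ip] + sum(
            min(dist(j, i), dist(j, heap_top(j))) - dist(j, ip)
            for j in by_fac[ip])
        if best is None or delta < best[0]:
            best = (delta, ip)
    return best[0] + psi, best[1]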

The running time of $\mathsf{try\text{-}open}$, of swapping two facilities, and of closing a facility (operations that are not defined explicitly as procedures, but are used in Algorithm 8) can be bounded by $O(|C|\log|F|)$. The running times come from updating the heap structures: for each of the $|C|$ heaps, we need to delete and/or add at most 2 elements, and each heap operation takes time $O(\log|F|)$.

4.2 Random Sampling of Local Operations

Algorithm 8 sampled-local-search
1: if rand(0,1) < 1/3 then ▷ rand(0,1) returns a uniformly random number in [0,1]
2:      (Δ, i′) ← Δ-close
3:      if Δ < 0 then close i′, updating S, σ and the heaps accordingly
4: else
5:      i ← a random facility in F∖S
6:      Δ ← Δ-open(i), (Δ′, i′) ← Δ-swap-in(i)
7:      if Δ ≤ Δ′ and Δ < 0 then open i, updating S, σ and the heaps accordingly
8:      else if Δ′ < 0 then open i and close i′, updating S, σ and the heaps accordingly
Algorithm 9 FL-iterate(M)
1: (S^best, σ^best) ← (S, σ)
2: for ℓ ← 1 to M do
3:      call sampled-local-search
4:      if cost(S, σ) < cost(S^best, σ^best) then (S^best, σ^best) ← (S, σ)
5: return (S^best, σ^best)

With the support of the heaps, we can design a fast implementation of the randomized local search. sampled-local-search in Algorithm 8 gives one iteration of the local search. We first randomly decide which type of operation to perform. With probability 1/3, we perform the close operation that reduces the scaled cost the most (if one exists). With the remaining probability 2/3, we perform either an open or a swap operation: to reduce the running time, we choose a uniformly random facility i∈F∖S, find the best operation that opens or swaps in i, and perform it if it reduces the scaled cost. One iteration of sampled-local-search calls the procedures in Algorithms 4 to 7 at most once and performs at most one operation, and thus has running time O(|C|·log|F|).
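
A minimal Python sketch of one such iteration, assuming the Δ-procedures above and hypothetical mutators close_facility, open_facility and swap_facilities that update S, σ and the heaps as described in Section 4.1:

import random

def sampled_local_search(F, S, delta_close, delta_open, delta_swap_in,
                         close_facility, open_facility, swap_facilities):
    if random.random() < 1 / 3:
        # with probability 1/3, try the single best close operation
        delta, ip = delta_close()
        if delta < 0:
            close_facility(ip)
    else:
        # otherwise sample a facility and try the best open/swap for it
        i = random.choice([x for x in F if x not in S])
        delta = delta_open(i)
        delta_prime, ip = delta_swap_in(i)
        if delta <= delta_prime and delta < 0:
            open_facility(i)
        elif delta_prime < 0:
            swap_facilities(i, ip)  # open i and close i'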

In the procedure FL-iterate(M), described in Algorithm 9, we run sampled-local-search M times. It returns the best solution encountered over these iterations according to the original (non-scaled) cost, which is not necessarily the solution produced in the last iteration. So we have

Observation 12.

The running time of FL-iterate(M) is O(M·|C|·log|F|), where C is the set of clients at the time we run the procedure.

Throughout this section, we fix a facility location instance. Let (S*, σ*) be the optimum solution (w.r.t. the original cost) and opt = cost(S*, σ*) be the optimum cost. Fixing one execution of sampled-local-search, we use (S^0, σ^0) and (S^1, σ^1) to denote the solutions before and after the execution, respectively. Then we have

Lemma 13.

Consider an execution of sampled-local-search and fix (S^0, σ^0). We have

cost_λ(S^0, σ^0) − E[cost_λ(S^1, σ^1)] ≥ (1/(3|F|)) · max{ cc(σ^0) − (λf(S*) + cc(σ*)),  λf(S^0) − (λf(S*) + 2·cc(σ*)),  cost_λ(S^0, σ^0) − (2λf(S*) + 3·cc(σ*)) }.
Lemma 14.

Let (S°, σ°) be the (S, σ) at the beginning of an execution of FL-iterate(M), and assume it is an O(1)-approximation to the instance. Let Γ ≥ 2, and let M = O((|F|/ε′)·log Γ) be large enough. Then with probability at least 1 − 1/Γ, the solution returned by the procedure is (α_FL + ε′)-approximate.

5 (1+√2+ε)-Approximate Dynamic Algorithm for Facility Location in the Incremental Setting

In this section, we prove Theorem 2 by combining the ideas from Sections 3 and 4 to derive a dynamic algorithm for facility location in the incremental setting. As for the online algorithm in Section 3, we divide our algorithm into stages. Whenever a client arrives, we use a simple rule to accommodate it. We can no longer afford to consider all possible local operations as in Section 3; instead, we use the randomized local search idea from Section 4 by calling the procedure FL-iterate. We call the procedure only when the cost of our solution has increased by a factor of 1+ε′ (where ε′ = Θ(ε) is small enough). In our analysis, we show a lemma similar to Lemma 11: the total increase of cost due to the arrivals of clients is small compared to the optimum cost for these clients. This allows us to bound the number of times we call FL-iterate. Recall that we are given a sufficiently large integer Γ = poly(n, log D, 1/ε): we aim at a success probability of 1 − 1/Γ for each call of FL-iterate, and the final running time depends on Γ only through an O(log Γ) factor.

The main algorithm is the same as Algorithm 3, except that we use Algorithm 10 as the algorithm for one stage. As before, we only need to design one stage of the algorithm. Recall that at the start of a stage we are given an initial set C of clients and an O(1)-approximate solution (S, σ) for C. Clients arrive one by one, and our goal is to maintain an (α_FL + O(ε′))-approximate solution at all times. The stage terminates if no more clients arrive or if our solution costs more than 1/ε′ times the initial solution.

Algorithm 10 One Stage of Dynamic Algorithm for Facility Location
1: Input:
   • C: the initial set of clients
   • (S, σ): an initial solution for C that is O(1)-approximate
2: let M = O((|F|/ε′)·log Γ) be large enough
3: (S, σ) ← FL-iterate(M), init ← cost(S, σ), last ← init
4: for t ← 1, 2, 3, …, terminating if no more clients arrive do
5:      for q = ⌈log(last/|F|)⌉ to ⌈log(last/ε′)⌉ do
6:           if i ← argmin_{i∈F∖S, f_i ≤ 2^q} d(j_t, i) exists then call try-open′(i) ▷ try-open′ is the same as try-open except that it considers the original cost instead of the scaled cost
7:      C ← C∪{j_t} and call try-open′(argmin_{i∈F∖S}(d(j_t, i) + f_i))
8:      if cost(S, σ) > (1+ε′)·last then
9:           (S, σ) ← FL-iterate(M)
10:          if cost(S, σ) > last then last ← cost(S, σ)
11:          if last > init/ε′ then terminate the stage

Notice that within a stage we consider the original costs of solutions (instead of the scaled costs used inside FL-iterate). During a stage we maintain a value last, which gives an estimate of the cost of the current solution (S, σ). Whenever a client j_t arrives, we apply some rules to open facilities and connect j_t (Steps 5 to 7); these operations keep the cost increase due to the arrival of j_t (defined as Δ_t later) small. In the algorithm, try-open′ is the same as try-open, except that it uses the original cost instead of the scaled cost (this is not essential, only convenient). If cost(S, σ) becomes too large, i.e., cost(S, σ) > (1+ε′)·last, then we call (S, σ) ← FL-iterate(M) for the M defined in Step 2 (Step 9), and we update last to cost(S, σ) if cost(S, σ) > last (Step 10). We terminate the stage once last > init/ε′, where init is the value of cost(S, σ) at the beginning of the stage (Step 11).
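
The control flow of one stage can be summarized by the following Python sketch. Here cost() returns the original (non-scaled) cost of the maintained solution, fl_iterate runs Algorithm 9, try_open_prime implements try-open′, and nearest_with_budget(j, b) and add_client(j) are hypothetical helpers: the former returns the facility in F∖S with f_i ≤ b closest to j (or None), and the latter inserts j into C and connects it to its nearest open facility.

import math

def run_stage(arrivals, F, S, f, d, eps_prime, M, cost, fl_iterate,
              try_open_prime, nearest_with_budget, add_client):
    fl_iterate(M)                            # Step 3
    init = cost()
    last = init
    for jt in arrivals:                      # time steps t = 1, 2, 3, ...
        q_lo = math.ceil(math.log2(last / len(F)))
        q_hi = math.ceil(math.log2(last / eps_prime))
        for q in range(q_lo, q_hi + 1):      # Steps 5-6
            i = nearest_with_budget(jt, 2.0 ** q)
            if i is not None:
                try_open_prime(i)
        add_client(jt)                       # Step 7
        try_open_prime(min((i for i in F if i not in S),
                           key=lambda i: d(jt, i) + f[i]))
        if cost() > (1 + eps_prime) * last:  # Steps 8-11
            fl_iterate(M)
            last = max(last, cost())
            if last > init / eps_prime:
                return                       # stage ends; a new one begins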

We say an execution of FL-iterate(M) is successful if the event in Lemma 14 happens. Then we have

Lemma 15.

If all executions of FL-iterate are successful, then the solution (S, σ) at the end of each time step is (1+ε′)(α_FL+ε′)-approximate.

Proof.

This holds since at the end of each time step we have cost(S, σ) ≤ (1+ε′)·last, where last is the cost of some (α_FL+ε′)-approximate solution at an earlier moment. As we only add clients to C, the optimum cost can only increase, and the claim follows. ∎

Now we argue that each execution of FL-iterate(M) is successful with probability at least 1 − 1/Γ. By Lemma 14, it suffices to make sure that the solution (S, σ) is O(1)-approximate before each execution. This is easy to see: before Step 7 at time t, we have cost(S, σ) ≤ O(1)·opt; the increase of cost(S, σ) in the step is at most the value of opt after the step (i.e., with the client j_t included when defining opt). Thus we have cost(S, σ) ≤ O(1)·opt after the step.

5.1 Bounding the Number of Calls to FL-iterate

It remains to bound the number of times we call FL-iterate. Again, we use T to denote the last time step of Algorithm 10 (i.e., of one stage of the dynamic algorithm) and Δ_t to denote the cost increase due to the arrival of j_t: it is the value of cost(S, σ) after Step 7 minus that before Step 7 at time t. For every time t∈[T], let C_t be the set C at the end of time t, and let opt_t be the cost of the optimum solution for C_t. Let last_t be the value of last at the beginning of time t.

Due to Step 7, we have the following observation:

Observation 16.

For every t∈[T], we have Δ_t ≤ min_{i∈F}(f_i + d(i, j_t)).

Proof.

Let i = argmin_{i∈F}(f_i + d(i, j_t)) and consider Step 7 at time t. If d(j_t, S) ≤ f_i + d(i, j_t) before the step, then Δ_t ≤ d(j_t, S) ≤ f_i + d(i, j_t). Otherwise, i∉S and d(j_t, S) > f_i + d(i, j_t); then try-open′(i) in the step will open i, and we have Δ_t ≤ f_i + d(i, j_t). ∎

We can also prove the following lemma that bounds Δ_t:

Lemma 17.

Let t∈[T], let i*∈F be such that f_{i*} ≤ last_t/ε′, and let C′ ⊆ C_{t−1} be any non-empty subset. Then we have

Δ_t ≤ (2/|C′|)·(max{f_{i*}, last_t/|F|} + Σ_{j∈C′} d(i*, j)) + 5·d(i*, j_t).
Proof.

In this proof, we focus on time t of the algorithm. If i*∈S before Step 7, then Δ_t ≤ d(i*, j_t); thus we may assume i*∉S before Step 7. Since Loop 5 only adds facilities to S, we have i*∉S at every moment during Loop 5.

Let q = ⌈log max{f_{i*}, last_t/|F|}⌉; notice that this q is considered in Loop 5. Let i∈F∖S be the facility with f_i ≤ 2^q nearest to j_t at the beginning of iteration q; this is the facility we try to open in Step 6 of iteration q. Notice that d(j_t, i) ≤ d(j_t, i*), since i* is a candidate facility.

Since we called try-open′(i) in Step 6, there is no 0-efficient opening operation that opens i after the step. We can therefore apply Lemma 10 to this facility i, the set C′ and ϕ = 0. So, after Step 6 of iteration q, we have

d(j_t, S) ≤ (1/|C′|)·(f_i + 2·Σ_{j∈C′} d(i, j)) + d(i, j_t).

Notice that d(i, i*) ≤ d(i, j_t) + d(j_t, i*) ≤ 2·d(j_t, i*), that f_i ≤ 2^q ≤ 2·max{f_{i*}, last_t/|F|}, and that S can only grow before the end of Step 7. We have

Δ_t ≤ (1/|C′|)·(2·max{f_{i*}, last_t/|F|} + 2·Σ_{j∈C′}(d(i*, j) + d(i*, i))) + d(i*, j_t)
    ≤ (2/|C′|)·(max{f_{i*}, last_t/|F|} + Σ_{j∈C′} d(i*, j)) + 5·d(i*, j_t). ∎

With this lemma, we can then prove the following:

Lemma 18.

For every T′∈[T−1], we have

Σ_{t=1}^{T′} Δ_t ≤ O(log T′)·opt_{T′}.
Proof.

The proof is similar to that of Lemma 11. Let (S*, σ*) be the optimum solution for the client set C_{T′}. Focus on some i*∈S* and assume (C_{T′}∖C_0)∩σ*^{-1}(i*) = {j_{t_1}, j_{t_2}, …, j_{t_s}} with 1 ≤ t_1 < t_2 < ⋯ < t_s ≤ T′.

We have Δ_{t_1} ≤ f_{i*} + d(i*, j_{t_1}) by Observation 16. Now focus on any k∈[2, s]. If f_{i*} > last_{t_k}/ε′, then we must have opt_{t_k} ≥ last_{t_k}/ε′ and the stage terminates at time t_k; thus t_k = T, contradicting the assumption that t_k ≤ T′ ≤ T−1. So we may assume f_{i*} ≤ last_{t_k}/ε′. We can then apply Lemma 17 with i* and C′ = {j_{t_1}, j_{t_2}, …, j_{t_{k−1}}} to obtain Δ_{t_k} ≤ (2/(k−1))·(max{f_{i*}, last_{t_k}/|F|} + Σ_{k′=1}^{k−1} d(i*, j_{t_{k′}})) + 5·d(i*, j_{t_k}). We can replace last_{t_k} with last_{T′}, since last_{t_k} ≤ last_{T′}.

The sum of these upper bounds over all k∈[s] is a linear combination of max{f_{i*}, last_{T′}/|F|} and the d(i*, j_{t_{k′}})'s. In this linear combination, the coefficient of max{f_{i*}, last_{T′}/|F|} is at most 1 + 2/1 + 2/2 + 2/3 + ⋯ + 2/(s−1) = O(log s) = O(log T′); the coefficient of d(i*, j_{t_{k′}}) is at most 5 + 2/k′ + 2/(k′+1) + ⋯ + 2/(s−1) = O(log s) = O(log T′). Thus, overall, we have Σ_{k=1}^{s} Δ_{t_k} ≤ O(log T′)·(max{f_{i*}, last_{T′}/|F|} + Σ_{k′=1}^{s} d(i*, j_{t_{k′}})).

Taking the sum of the above inequality over all i*∈S*, we obtain Σ_{t=1}^{T′} Δ_t ≤ O(log T′)·(cost(S*, σ*) + |S*|·last_{T′}/|F|). This bound is at most O(log T′)·(opt_{T′} + last_{T′}) = O(log T′)·opt_{T′}, since |S*| ≤ |F| and last_{T′} ≤ O(1)·opt_{T′−1} ≤ O(1)·opt_{T′}. ∎

Between two consecutive calls of FL-iterate in Step 9, at times t_1 and t_2 > t_1, cost(S, σ) must have increased by at least ε′·last_{t_2}: at the end of time t_1 we have cost(S, σ) ≤ last_{t_1+1} = last_{t_2}, since otherwise last would have been updated at time t_1, and we need cost(S, σ) > (1+ε′)·last_{t_2} after Step 7 at time t_2 in order to call FL-iterate. Thus the increase of the cost during this period is at least ε′·last_{t_2}, and hence Σ_{t=t_1+1}^{t_2} Δ_t/(ε′·last_t) ≥ 1, since last_t = last_{t_2} for every t∈(t_1, t_2]. The argument also holds when t_1 = 0 and t_2 > t_1 is the first time at which we call FL-iterate. Counting the call of FL-iterate in Step 3, we can bound the total number of calls to the procedure by 1 + (1/ε′)·Σ_{t=1}^{T} Δ_t/last_t.

Again let Φ_{T′} = Σ_{t=1}^{T′} Δ_t for every T′∈[0, T]. Lemma 18 says Φ_t ≤ O(log t)·opt_t for every t∈[0, T−1]. For every t∈[T], since Δ_t ≤ opt_t, we have Φ_t = Φ_{t−1} + Δ_t ≤ O(log t)·opt_{t−1} ≤ O(log T)·last_t, since last_t is at least the cost of some solution for C_{t−1}. Applying Lemma 9 with a_t = last_t, b_t = Δ_t and B_t = Φ_t for every t, the number of calls to FL-iterate can be bounded by

1 + (1/ε′)·Σ_{t=1}^{T} Δ_t/last_t ≤ (1/ε′)·O(log T)·(ln(last_T/last_1) + 1) = O((log T/ε)·log(1/ε)).

We can now analyze the running time and the success probability of our algorithm. Focus on one stage of the algorithm. By Observation 12, each call to FL-iterate(M) takes time O(M·|C|·log|F|) = O((|F|/ε′)·(log Γ)·|C|·log n) = O((n·|C_T|/ε)·log²n), where C is the set of clients at the time we call the procedure, C_T ⊇ C is the set of clients at the end of time T, and M = O((|F|/ε′)·log Γ) is as defined in Step 2. The total number of calls to the procedure is at most O((log T/ε)·log(1/ε)) ≤ O((log n/ε)·log(1/ε)). Thus the time spent on FL-iterate is O((n·|C_T|/ε²)·log³n·log(1/ε)). The running time of Steps 5 to 7 is at most T·O(log(|F|/ε′))·O(|C_T|·log|F|) = O(|C_T|·T·log²(|F|/ε)) ≤ O(n·|C_T|·log²(n/ε)). Thus the total running time of a stage is at most O((n·|C_T|/ε²)·log³n·log(1/ε)). Now consider all stages together. The sum of the |C_T| values over all stages is at most 2n, since every client appears in at most 2 stages. So the total running time of our algorithm is O((n²/ε²)·log³n·log(1/ε)).

For the success probability: the total number of calls to FL-iterate(M) is at most O(log_{1/ε}(nD)·(log n/ε)·log(1/ε)) = poly(log n, log D, 1/ε). If Γ is at least n² times this number, which is still poly(n, log D, 1/ε), then the success probability of our algorithm is at least 1 − 1/n².

Finally, we remark that the success of the algorithm depends only on the success of all executions of FL-iterate, and each execution succeeds with probability at least 1 − 1/Γ even if the adversary is adaptive. This finishes the proof of Theorem 2.

Remark

We can in fact obtain an algorithm that has both O(log T) amortized client recourse and Õ(n²) total running time, by defining ϕ = cost(S, σ)/(α_FL·ε′) and performing only ϕ-efficient local operations. However, this would require carrying ϕ through the entire analysis and would make the analysis less clean. We therefore choose to present the two features, small recourse and Õ(n²) total running time, in two separate algorithms.

We also remark that the total running time of all calls to FL-iterate is only Õ(n|F|); the Õ(n²) term comes from Steps 5 to 7. By losing a multiplicative factor of 2 and an additive term of 1 in the approximation ratio, we can assume every client is collocated with its nearest facility (see Appendix C). Then at any time there are only O(|F|) distinct positions for clients, and the running time of the algorithm improves to O((n|F|/ε²)·log³n·log(1/ε)).

6 Fully Dynamic Algorithm for Facility Location on Hierarchically Well Separated Tree Metrics

In this section, we give our fully dynamic algorithm for facility location on hierarchically-well-separated-tree (HST) metrics. Our algorithm achieves an O(1)-approximation with O(log²D) amortized recourse and O(log³D) amortized update time. As mentioned earlier, we assume each client is collocated with a facility. From now on, we fix the HST T and assume the set of leaves of T is X = F; let V be the set of all nodes of T, and let d_T be the metric induced by T over the vertex set V.

Notations. Recall that level(v) is the level of v in T. For every vertex v∈V, define Λ_v to be the set of children of v, X_v to be the set of leaf descendants of v, and T_v to be the maximal subtree of T rooted at v. We extend the facility costs from X to all vertices of V: for every v∈V∖X, we define f_v = min_{i∈X_v} f_i. We can assume that each internal vertex v is a facility; by opening v we mean opening a copy of the i∈X_v with f_i = f_v. This assumption loses only a factor of 2 in the approximation ratio: on one hand, having more facilities can only make our problem easier; on the other hand, the cost of connecting a client to any i∈X_v is at most twice that of connecting it to v. By definition, the facility costs along any root-to-leaf path are non-decreasing.
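
The vertex information used throughout this section can be kept in a small data structure; the following Python sketch (which the later sketches in this section build on) stores it and extends the facility costs bottom-up. It is a minimal illustration under the assumptions above, not the full algorithm.

class HSTNode:
    def __init__(self, level, children=(), leaf_cost=None):
        self.level = level              # level(v); leaves are at level 0
        self.children = list(children)  # Lambda_v
        self.parent = None
        for u in self.children:
            u.parent = self
        self.f = leaf_cost              # f_v; given at leaves, computed below
        self.N = 0                      # number of clients in X_v

def extend_costs(v):
    # f_v = min_{i in X_v} f_i for internal vertices; along any
    # root-to-leaf path the resulting costs are non-decreasing.
    if v.children:
        v.f = min(extend_costs(u) for u in v.children)
    return v.f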

6.1 Offline Algorithm for Facility Location on HST Metrics

In this section, we first give an offline O(1)-approximation algorithm for facility location on the HST metric d_T, as a baseline. Notice that facility location on trees can be solved exactly using dynamic programming; however, that algorithm is hard to analyze in the dynamic algorithm model, since its solution is sensitive to client arrivals and departures. Our algorithm generalizes the algorithm of [16] for facility location with uniform facility costs, which was used there to achieve a differential privacy requirement.

For every vertex v∈V, let N_v be the number of clients at locations in X_v. Although by definition the N_v's are integers, in most of the analysis we allow them to be arbitrary non-negative reals; this will be useful when we design the dynamic algorithm. Let α∈{1,2}^V and β∈{1,2}^{V∖X} be vectors given to our algorithm. They are introduced solely for the purpose of extending the algorithm to the dynamic setting; for the offline algorithm, α and β can be taken to be all-1 vectors.

Marked and Open Facilities

For every vertex v∈V, we say v is marked w.r.t. the vectors N and α if

N_v·2^level(v) > f_v/α_v

and unmarked otherwise. The following observation can be made:

Observation 19.

Let u be the parent of v. If v is marked w.r.t. N and α, then so is u.

Proof.

That v is marked w.r.t. N and α implies N_v·2^level(v) > f_v/α_v. Notice that N_u ≥ N_v, level(u) = level(v)+1, α_v ≤ 2α_u and f_u ≤ f_v. So N_u·2^level(u) ≥ 2N_v·2^level(v) > 2f_v/α_v ≥ f_u/α_u. ∎

Thus the marking status of vertices in T enjoys a monotonicity property. We say a vertex v is highest unmarked (w.r.t. N and α) if it is unmarked and its parent is marked; we say a vertex v is lowest marked if it is marked but all its children are unmarked. However, we sometimes say a vertex u is the lowest marked ancestor of a leaf v∈X if either u = v is marked, or u ≠ v is marked and the child of u on the u-v path is unmarked; notice that in this case u might not be a lowest marked vertex, since it may have some other marked children. When we need to distinguish between the two notions, we say u is globally lowest marked to mean that u is a lowest marked vertex.

If a leaf vertex v∈X is marked, then we open v. For every marked vertex v∈V∖X, we open v if and only if

(Σ_{u∈Λ_v: u unmarked} N_u)·2^level(v) > f_v/(α_v·β_v).

Notice that all unmarked vertices are closed.
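
Using the HSTNode sketch from above, the marking and opening rules translate directly into the following predicates (here alpha and beta are dictionaries mapping vertices to values in {1, 2}):

def marked(v, alpha):
    # v is marked iff N_v * 2^{level(v)} > f_v / alpha_v
    return v.N * 2 ** v.level > v.f / alpha[v]

def is_open(v, alpha, beta):
    if not marked(v, alpha):
        return False                    # unmarked vertices are closed
    if not v.children:
        return True                     # a marked leaf is open
    load = sum(u.N for u in v.children if not marked(u, alpha))
    return load * 2 ** v.level > v.f / (alpha[v] * beta[v])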

Observation 20.

If v is lowest marked, then v is open.

Proof.

We may assume v∉X, since otherwise v is open by definition. So N_v·2^level(v) > f_v/α_v and all children of v are unmarked. Thus Σ_{u∈Λ_v: u unmarked} N_u = Σ_{u∈Λ_v} N_u = N_v, and therefore (Σ_{u∈Λ_v: u unmarked} N_u)·2^level(v) = N_v·2^level(v) > f_v/α_v ≥ f_v/(α_v·β_v). Thus v will be open. ∎

With the set of open facilities defined, every client is connected to its nearest open facility according to d_T, using a consistent tie-breaking rule (e.g., the nearest open facility with the smallest index). We assume the root r of T satisfies f_r/2^level(r) ≤ 1, by increasing the number of levels if necessary; thus r is marked whenever N_r ≥ 1. This finishes the description of the offline algorithm.

Analysis of the O(1)-Approximation Ratio.

We show the algorithm achieves an O(1)-approximation. First we give a lower bound on the optimum cost. For every v∈V, let

LB(v) = min{N_v·2^level(v), f_v}.

Then we have

Lemma 21.

Let U be a set of vertices of T containing no ancestor-descendant pair; i.e., for every two distinct vertices u and v in U, u is not an ancestor of v. Then the cost of the optimum solution is at least Σ_{v∈U} LB(v).

Proof.

Fix an optimum solution and consider any v∈U. We consider the cost incurred inside T_v by the optimum solution: the connection cost of the clients in T_v plus the cost of the open facilities in T_v. This cost is at least LB(v) = min{N_v·2^level(v), f_v}: if the solution opens a facility in T_v, the facility cost is at least f_v; otherwise, all N_v clients in T_v must be connected to facilities outside T_v, incurring a cost of at least N_v·2^level(v). The lemma follows since the trees T_v over all v∈U are disjoint, and thus we are not over-counting the costs of the optimum solution. ∎

Then let U be the set of highest unmarked vertices and marked leaves; clearly U contains no ancestor-descendant pair. By Lemma 21, the optimum cost is at least Σ_{v∈U} LB(v). We prove the following lemma.

Lemma 22.

The solution produced by our algorithm has cost at most O(1)·Σ_{u∈U} LB(u).

Proof.

First consider the facility cost of our solution. If a leaf v is marked and open, we have N_v > f_v/α_v (as level(v) = 0) and thus LB(v) = min{N_v, f_v} ≥ f_v/α_v; hence f_v is at most α_v·LB(v) ≤ 2·LB(v). If v∈V∖X is marked and open, then by our algorithm (Σ_{u∈Λ_v: u unmarked} N_u)·2^level(v) ≥ f_v/(α_v·β_v). Since each u in the summation is unmarked, we have LB(u) = N_u·2^level(u). Thus Σ_{u∈Λ_v: u unmarked} LB(u) = (1/2)·Σ_u N_u·2^level(v) ≥ (1/2)·f_v/(α_v·β_v) ≥ (1/8)·f_v; that is, f_v is at most 8·Σ_{u∈Λ_v: u unmarked} LB(u). Notice that each u in the summation is highest unmarked and hence belongs to U. Summing these bounds over all open facilities v shows that the facility cost of our solution is at most 8·Σ_{u∈U} LB(u).

Now consider the connection cost. For every v∈X, let u be the highest unmarked ancestor of v (if v itself is open, its connection cost is 0 and we need not consider this case). Let w be the parent of u, so w is marked. Then there must be an open facility in the maximal subtree rooted at w: any lowest marked vertex in the subtree rooted at w is open by Observation 20. Thus any client at v has connection cost at most 2×2^level(w) = 4×2^level(u). Therefore the total connection cost of our solution is at most 4·Σ_{u∈U∖X} N_u·2^level(u) = 4·Σ_{u∈U∖X} LB(u). This finishes the proof of the lemma. ∎

Combining Lemmas 21 and 22 shows that our algorithm is an O(1)-approximation. The following lemma will be useful in the analysis of the dynamic algorithm:

Lemma 23.

For any open facility v in our solution, the number of clients connected to v from outside T_v is at most O(log D)·f_v/2^level(v).

Proof.

We consider each ancestor u of v and count the number of clients connected to v whose lowest common ancestor with v is u. Focus on a child w of u that is neither v nor an ancestor of v. If w is marked, then no client in T_w is connected to v, since some facility in T_w is open. So let U′ be the set of unmarked children of u that are neither v nor an ancestor of v. If (Σ_{w∈U′} N_w)·2^level(u) ≥ f_u/(α_u·β_u), then u is marked and open, and the clients in T_w, w∈U′, are not connected to v. Otherwise, Σ_{w∈U′} N_w < f_u/(α_u·β_u·2^level(u)) ≤ f_u/2^level(u) ≤ f_v/2^level(v), as f_u ≤ f_v and level(u) ≥ level(v). The lemma follows since v has at most O(log D) ancestors. ∎

Remark

The algorithm so far gives a data structure that supports the following operations in O(log D) time: (i) updating N_v for some v∈X, and (ii) returning the nearest open facility of a leaf v∈X. Indeed, the algorithm can be made simpler: we set α to be the all-1 vector and open the set of globally lowest marked facilities (so neither α nor β is needed). For every vertex u∈V, we maintain the nearest open facility ψ_u to u in T_u. Whenever a client at v arrives or departs, we only need to change N_u, ψ_u and the marking and opening status of u for the ancestors u of v. To return the closest open facility to a leaf v∈X, we travel up the tree from v until we find an ancestor u with ψ_u defined, and return ψ_u. Both operations take O(log D) time. However, our goal is to maintain the solution (S, σ) explicitly in memory; thus we also have to bound the number of reconnections performed by the algorithm, since that is a lower bound on the total running time.
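
A minimal sketch of this simpler structure, building on the predicates above: psi[u] caches an open facility in T_u (or None); the nearest-selection and tie-breaking among children are deliberately simplified here.

def nearest_open_facility(v, psi):
    # walk up from leaf v to the first ancestor with a cached open facility;
    # in an HST this visits O(log D) vertices
    u = v
    while u is not None:
        if psi.get(u) is not None:
            return psi[u]
        u = u.parent
    return None

def on_client_change(v, delta, alpha, beta, psi):
    # a client arrives (delta = +1) or departs (delta = -1) at leaf v;
    # only the O(log D) ancestors of v need their information refreshed
    u = v
    while u is not None:
        u.N += delta
        if is_open(u, alpha, beta):
            psi[u] = u
        else:
            below = [psi[w] for w in u.children if psi.get(w) is not None]
            psi[u] = below[0] if below else None
        u = u.parent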

6.2 Dynamic Algorithm for Facility Location on HST Metrics

In this section, we extend the offline algorithm to a dynamic algorithm with O(log³D) amortized update time; recall that D is the aspect ratio of the metric. We maintain the vectors α, β and N, and at any moment of the algorithm, the marking and opening status of the vertices is exactly what the offline algorithm produces for the current α, β and N.

Initially, let α and β be all-1 vectors and N the all-0 vector, so all vertices are unmarked. Whenever a client at some v∈X arrives or departs, the α, β values and the marking and opening status of the ancestors of v may change; we show how to handle these changes. Vertices that are not ancestors of v are unaffected.

When a client at v arrives or departs, we increase or decrease the N_u values of all ancestors u of v by 1, continuously and at the same rate (one can think of the number of clients at v as increasing or decreasing by 1 continuously). During this process, the marking and opening status of these vertices may change. If such an event happens, we change the α and/or β values of the vertex so that it becomes harder for its status to change back in the future. Specifically, we use the following rules (a sketch of the rules in code follows the list):

  • If a vertex u changes to marked (from being unmarked), then we change α_u to 2 (notice that u remains marked w.r.t. the new α) and β_u to 1. In this case, we do not consider the opening status change of u as an event.

  • If a vertex u changes to unmarked (from being marked), then we change α_u to 1 (notice that u remains unmarked w.r.t. the new α); the value β_u becomes irrelevant. In this case, we also do not consider the opening status change of u as an event.

  • If a marked vertex u becomes open (from being closed), then we change β_u to 2 (notice that u remains open w.r.t. the new β).

  • If a marked vertex u becomes closed (from being open), then we change β_u to 1 (notice that u remains closed w.r.t. the new β).

We call the four types of events above marking, unmarking, opening and closing events.
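
The four rules amount to the following Python sketch, again building on the predicates above; was_marked and was_open are dictionaries recording each vertex's status before the current infinitesimal change of the N values, and the α/β toggles make it harder for a status to flip back.

def handle_status_change(u, alpha, beta, was_marked, was_open):
    events = []
    now_marked = marked(u, alpha)
    if now_marked and not was_marked[u]:
        alpha[u], beta[u] = 2, 1   # marking event; the opening-status
        events.append('marking')   # change of u is not counted as an event
    elif was_marked[u] and not now_marked:
        alpha[u] = 1               # unmarking event; beta_u becomes irrelevant
        events.append('unmarking')
    else:
        now_open = is_open(u, alpha, beta)
        if now_open and not was_open[u]:
            beta[u] = 2            # opening event
            events.append('opening')
        elif was_open[u] and not now_open:
            beta[u] = 1            # closing event
            events.append('closing')
    was_marked[u] = marked(u, alpha)
    was_open[u] = is_open(u, alpha, beta)
    return events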

Now we discuss the order in which the events happen. When we increase the N_u values of the ancestors of v continuously, one of the following two events may happen:

  • The highest unmarked ancestor u of v may become globally lowest marked, and this may induce a closing event at the parent w of u.

  • The lowest marked ancestor u of v may become open.

Similarly, when we decrease the N_u values of the ancestors of v continuously, one of the following two events may happen:

  • The lowest marked ancestor u of v may become unmarked (note that u must have been globally lowest marked), and this may induce an opening event at the parent w of u.

  • The lowest marked ancestor u of v may become closed.

Above, if two events happen at the same time, we handle one of them arbitrarily. Notice that after we handle that event, the condition for the other event may no longer hold, in which case we do not handle it.

Once we have finished the process of increasing or decreasing the N_u values by 1, the clients are connected to their respective nearest open facilities, breaking ties using the consistent rule. A reconnection happens whenever a client becomes connected to a different facility.

Bounding Number of Reconnections

Now we analyze the reconnections made by the algorithm. When a client at v∈X arrives or departs, at most O(log D) vertices u have their N_u values changed by 1. We distribute 4 tokens to each ancestor u of v, one each of type-A, type-B, type-C and type-D (the types are only defined for convenience). We will use these tokens to pay for the events that happen.

First focus on the sequence of marking/unmarking events at a vertex u. Right before u becomes unmarked we have N_u ≤ f_u/(2×2^level(u)), since at that moment α_u = 2; immediately after that, α_u is changed to 1. For u to become marked again, we need N_u > f_u/2^level(u), so during this period N_u must have increased by at least f_u/(2×2^level(u)). Similarly, right before u becomes marked we have N_u ≥ f_u/2^level(u), since at that moment α_u = 1; then α_u is changed to 2 immediately, and for u to become unmarked again, N_u must decrease by at least f_u/(2×2^level(u)). So, when a marking/unmarking event happens at u, we can spend Ω(f_u/2^level(u)) type-A tokens owned by u.

Then we focus on the sequence 𝒮 of opening/closing events at u between two adjacent marking/unmarking events at u. At these moments, u is marked and α_u = 2. For the first event in 𝒮, we can spend Ω(f_u/2^level(u)) type-B tokens owned by u. If some opening/closing event e in 𝒮 is induced by an unmarking/marking event at some child u′ of u, then we can spend Ω(f_{u′}/2^level(u′)) ≥ Ω(f_u/2^level(u)) type-C tokens owned by u′ for e, and for the event e′ after e in 𝒮 if it exists. Notice that we already argued that u′ has collected enough type-C tokens.

Now we focus on an event e′ in 𝒮 such that neither e′ nor the event e preceding e′ in 𝒮 is induced. First, assume e is an opening event and e′ is a closing event. Then after e we have Σ_{u′∈Λ_u: u′ unmarked} N_{u′} = f_u/(2×2^level(u)), and before e′ we have Σ_{u′∈Λ_u: u′ unmarked} N_{u′} = f_u/(4×2^level(u)). Notice that the set of unmarked children of u may change in between; let U′ and U″ be the sets of unmarked children of u at the moments after e and before e′, respectively. Again, if there is some u′∈(U′∖U″)∪(U″∖U′), we spend Ω(f_{u′}/2^level(u′)) ≥ Ω(f_u/2^level(u)) type-C tokens owned by u′. Otherwise U′ = U″, and f_u/(4×2^level(u)) clients in T_u must have departed between e and e′, so we can spend Ω(f_u/2^level(u)) type-D tokens for e′. The case when e is a closing event and e′ is an opening event is argued in the same way.

Thus, whenever an event happens at u, we can spend Ω(f_u/2^level(u)) tokens; moreover, if an opening/closing event at u was induced by an unmarking/marking event at some child u′ of u, then we can spend Ω(f_{u′}/2^level(u′)) tokens for the event at u. A facility u changes its opening status exactly when an event happens at u. Notice that we reconnect a client only if it was connected to a facility that is about to close, or if it needs to be connected to a newly opened facility. By Lemma 23, at any moment the number of clients connected to u from outside T_u is at most O(log D)·f_u/2^level(u). If u changes its opening status because of a non-induced event, then before and after the event the number of clients connected to u from within T_u is of order O(f_u/2^level(u)). If u changes its opening status due to a marking/unmarking event at some child u′ of u, then before and after the event the number of clients connected to u from within T_u is of order Θ(f_{u′}/2^level(u′)). Thus, on average, for each token we spend we reconnect at most O(log D) clients. Since each client arrival or departure distributes at most O(log D) tokens, the amortized number of reconnections (per client arrival/departure) is at most O(log²D).

Analyzing Update Time

With the bound on the number of reconnections (the recourse), we can bound the update time easily. We maintain ψ_u for every u∈V, the nearest open facility to u in T_u∖{u} (ψ_u may be undefined). We also maintain, for every marked vertex u, a value N′_u = Σ_{v∈Λ_u: v unmarked} N_v. Whenever a client at v arrives or departs, we need to change α_u, β_u, N_u, N′_u, ψ_u and the marking and opening status of u only for the ancestors u of v; using the information stored at the vertices, these updates take O(log D) time per client arrival or departure. The bottleneck of the algorithm is reconnecting clients. We already argued that the amortized number of reconnections per client arrival/departure is O(log²D), so it suffices to give an algorithm that locates the clients to be reconnected efficiently.

For every vertex u, we maintain a doubly linked list of the unmarked children u′ of u with N_{u′} ≥ 1. With this structure, every client that needs to be reconnected can be located in O(log D) time. If u becomes open, we consider each unmarked child u′ of u and reconnect the clients in T_{u′} to u; locating these clients takes O(log D) time per client. For every strict ancestor w of u with no open facility in between, we use the information ψ_w to decide whether the clients in T_w need to be reconnected; if so, for every unmarked child w′ of w with N_{w′} ≥ 1 that is not an ancestor of u, we connect the clients in T_{w′} to u. Again, enumerating these clients takes O(log D) time per client. Similarly, if u becomes closed, we need to connect all clients connected to u to the nearest open facility to u, which can be computed using the ψ values of u and its ancestors; enumerating the clients again takes O(log D) time per client. Overall, the amortized running time per client arrival/departure is O(log³D).

7 Open Problems and Discussions

We initiated the study of the facility location problem in general metric spaces in the recourse and dynamic models. Several interesting problems remain open. The most obvious one is whether we can obtain O(1)-competitive online/dynamic algorithms with polylogarithmic amortized recourse or fast update times in the fully dynamic setting. Another interesting direction is whether our results extend to capacitated facility location and capacitated k-median, where there is an upper bound on the number of clients that can be assigned to a single open facility. From a technical point of view, it would be interesting to find more applications of local search and probabilistic tree embedding techniques in the dynamic algorithms model. Finally, as alluded to in the introduction, an exciting research direction is to understand the power of recourse in the online model.

References

  • [1] Hyung-Chan An, Ashkan Norouzi-Fard, and Ola Svensson. Dynamic facility location via exponential clocks. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, San Diego, CA, USA, January 4-6, 2015, pages 708–721, 2015.
  • [2] Aris Anagnostopoulos, Russell Bent, Eli Upfal, and Pascal Van Hentenryck. A simple and deterministic competitive algorithm for online facility location. Inf. Comput., 194(2):175–202, November 2004.
  • [3] Vijay Arya, Naveen Garg, Rohit Khandekar, Adam Meyerson, Kamesh Munagala, and Vinayaka Pandit. Local search heuristic for k-median and facility location problems. In Proceedings on 33rd Annual ACM Symposium on Theory of Computing, July 6-8, 2001, Heraklion, Crete, Greece, pages 21–29, 2001.
  • [4] Y. Bartal. Probabilistic approximation of metric spaces and its algorithmic applications. In Proceedings of the 37th Annual Symposium on Foundations of Computer Science, FOCS ’96, pages 184–, Washington, DC, USA, 1996. IEEE Computer Society.
  • [5] Yair Bartal, Avrim Blum, Carl Burch, and Andrew Tomkins. A polylog(n)-competitive algorithm for metrical task systems. In Proceedings of the Twenty-Ninth Annual ACM Symposium on the Theory of Computing, El Paso, Texas, USA, May 4-6, 1997, pages 711–719, 1997.
  • [6] Aaron Bernstein, Jacob Holm, and Eva Rotenberg. Online bipartite matching with amortized O(log² n) replacements. J. ACM, 66(5):37:1–37:23, 2019.
  • [7] Guy E. Blelloch, Yan Gu, and Yihan Sun. Efficient construction of probabilistic tree embeddings. In 44th International Colloquium on Automata, Languages, and Programming, ICALP 2017, July 10-14, 2017, Warsaw, Poland, pages 26:1–26:14, 2017.
  • [8] Sébastien Bubeck, Michael B. Cohen, Yin Tat Lee, James R. Lee, and Aleksander Madry. k-server via multiscale entropic regularization. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, pages 3–16, 2018.
  • [9] Moses Charikar, Chandra Chekuri, Tomás Feder, and Rajeev Motwani. Incremental clustering and dynamic information retrieval. In Proceedings of the Twenty-ninth Annual ACM Symposium on Theory of Computing, STOC ’97, pages 626–635, New York, NY, USA, 1997. ACM.
  • [10] Moses Charikar and Sudipto Guha. Improved combinatorial algorithms for facility location problems. SIAM J. Comput., 34(4):803–824, April 2005.
  • [11] Vincent Cohen-Addad, Niklas Hjuler, Nikos Parotsidis, David Saulpic, and Chris Schwiegelshohn. Fully dynamic consistent facility location. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d’Alché-Buc, Emily B. Fox, and Roman Garnett, editors, Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, 8-14 December 2019, Vancouver, BC, Canada, pages 3250–3260, 2019.
  • [12] Marek Cygan, Artur Czumaj, Marcin Mucha, and Piotr Sankowski. Online facility location with deletions. In 26th Annual European Symposium on Algorithms, ESA 2018, August 20-22, 2018, Helsinki, Finland, pages 21:1–21:15, 2018.
  • [13] Artur Czumaj, Christiane Lammersen, Morteza Monemizadeh, and Christian Sohler. (1+ε)-approximation for facility location in data streams. In Proceedings of the Twenty-fourth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’13, pages 1710–1728, Philadelphia, PA, USA, 2013. Society for Industrial and Applied Mathematics.
  • [14] Gabriella Divéki and Csanád Imreh. Online facility location with facility movements. Central European Journal of Operations Research, 19(2):191–200, Jun 2011.
  • [15] David Eisenstat, Claire Mathieu, and Nicolas Schabanel. Facility location in evolving metrics. In Javier Esparza, Pierre Fraigniaud, Thore Husfeldt, and Elias Koutsoupias, editors, Automata, Languages, and Programming, pages 459–470, Berlin, Heidelberg, 2014. Springer Berlin Heidelberg.
  • [16] Yunus Esencayi, Marco Gaboardi, Shi Li, and Di Wang. Facility location problem in differential privacy model revisited. CoRR, abs/1910.12050, 2019.
  • [17] Jittat Fakcharoenphol, Satish Rao, Satish Rao, and Kunal Talwar. A tight bound on approximating arbitrary metrics by tree metrics. In Proceedings of the Thirty-fifth Annual ACM Symposium on Theory of Computing, STOC ’03, pages 448–455, New York, NY, USA, 2003. ACM.
  • [18] Reza Zanjirani Farahani, Maryam Abedian, and Sara Sharahi. Dynamic Facility Location Problem, pages 347–372. Physica-Verlag HD, Heidelberg, 2009.
  • [19] Dimitris Fotakis. Incremental algorithms for facility location and k-median. Theor. Comput. Sci., 361(2-3):275–313, 2006.
  • [20] Dimitris Fotakis. A primal-dual algorithm for online non-uniform facility location. J. of Discrete Algorithms, 5(1):141–148, March 2007.
  • [21] Dimitris Fotakis. On the competitive ratio for online facility location. Algorithmica, 50(1):1–57, 2008.
  • [22] Dimitris Fotakis. Memoryless facility location in one pass. ACM Trans. Algorithms, 7(4):49:1–49:24, 2011.
  • [23] Dimitris Fotakis. Online and incremental algorithms for facility location. SIGACT News, 42(1):97–131, March 2011.
  • [24] Dimitris Fotakis and Christos Tzamos. On the power of deterministic mechanisms for facility location games. In Automata, Languages, and Programming - 40th International Colloquium, ICALP 2013, Riga, Latvia, July 8-12, 2013, Proceedings, Part I, pages 449–460, 2013.
  • [25] Dimitris Fotakis and Christos Tzamos. Winner-imposing strategyproof mechanisms for multiple facility location games. Theor. Comput. Sci., 472:90–103, 2013.
  • [26] Monia Ghobadi, Ratul Mahajan, Amar Phanishayee, Nikhil R. Devanur, Janardhan Kulkarni, Gireeja Ranade, Pierre-Alexandre Blanche, Houman Rastegarfar, Madeleine Glick, and Daniel C. Kilper. Projector: Agile reconfigurable data center interconnect. In Proceedings of the ACM SIGCOMM 2016 Conference, Florianopolis, Brazil, August 22-26, 2016, pages 216–229, 2016.
  • [27] Gramoz Goranci, Monika Henzinger, and Dariusz Leniowski. A tree structure for dynamic facility location. In 26th Annual European Symposium on Algorithms, ESA 2018, August 20-22, 2018, Helsinki, Finland, pages 39:1–39:13, 2018.
  • [28] Sudipto Guha and Samir Khuller. Greedy strikes back: Improved facility location algorithms. In Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’98, pages 649–657, Philadelphia, PA, USA, 1998. Society for Industrial and Applied Mathematics.
  • [29] Anupam Gupta, Ravishankar Krishnaswamy, Amit Kumar, and Debmalya Panigrahi. Online and dynamic algorithms for set cover. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 537–550, 2017.
  • [30] Anupam Gupta and Amit Kumar. Greedy algorithms for steiner forest. In Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, June 14-17, 2015, pages 871–878, 2015.
  • [31] Anupam Gupta, Amit Kumar, and Cliff Stein. Maintaining assignments online: Matching, scheduling, and flows. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2014, Portland, Oregon, USA, January 5-7, 2014, pages 468–479, 2014.
  • [32] Piotr Indyk. Algorithms for dynamic geometric problems over data streams. In Proceedings of the Thirty-sixth Annual ACM Symposium on Theory of Computing, STOC ’04, pages 373–380, New York, NY, USA, 2004. ACM.
  • [33] Christiane Lammersen and Christian Sohler. Facility location in dynamic geometric data streams. In Dan Halperin and Kurt Mehlhorn, editors, Algorithms - ESA 2008, pages 660–671, Berlin, Heidelberg, 2008. Springer Berlin Heidelberg.
  • [34] Shi Li. A 1.488 approximation algorithm for the uncapacitated facility location problem. Inf. Comput., 222:45–58, 2013.
  • [35] A. Meyerson. Online facility location. In Proceedings of the 42nd IEEE Symposium on Foundations of Computer Science, FOCS ’01, pages 426–, Washington, DC, USA, 2001. IEEE Computer Society.
  • [36] Kamesh Munagala. Local search for k-medians and facility location. In Encyclopedia of Algorithms, pages 1139–1143. 2016.
  • [37] Seeun Umboh. Online network design algorithms via hierarchical decompositions. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, San Diego, CA, USA, January 4-6, 2015, pages 1373–1387, 2015.
  • [38] Adrian Vetta. Nash equilibria in competitive societies, with applications to facility location, traffic routing and auctions. In Proceedings of the 43rd Symposium on Foundations of Computer Science, FOCS ’02, pages 416–, Washington, DC, USA, 2002. IEEE Computer Society.
  • [39] George O. Wesolowsky. Dynamic facility location. Manage. Sci., 19(11):1241–1248, July 1973.
  • [40] David P. Williamson and David B. Shmoys. The Design of Approximation Algorithms. Cambridge University Press, New York, NY, USA, 1st edition, 2011.

Appendix A Analysis of Offline Local Search Algorithms for Facility Location

In this section, we prove theorems related to the local search algorithms for facility location.

A.1 Local Search for Facility Location

See 7

Proof.

This proof is almost identical to the analysis of the α_FL-approximation local search algorithm for facility location, except that we take ϕ into account in all the inequalities; eventually this yields an α_FL·|C|·ϕ term on the right-hand side of the inequality.

Formally, let (S*, σ*) be the optimum solution to the facility location instance. Focus on an i*∈S*. Since there is no ϕ-efficient operation that opens i* (recall that we may open i* even if i*∈S already), we have

Σ_{j∈σ*^{-1}(i*)} d(j, σ_j) ≤ λf_{i*}·1_{i*∉S} + Σ_{j∈σ*^{-1}(i*)}(d(j, i*) + ϕ).

This implies

\displaystyle\sum_{j\in\sigma^{*-1}(i^{*})}d(j,\sigma_{j})\leq\lambda f_{i^{*}}+\sum_{j\in\sigma^{*-1}(i^{*})}d(j,i^{*})+|\sigma^{*-1}(i^{*})|\phi. (2)

Summing these inequalities over all i^{*}\in S^{*} gives us

\displaystyle\mathsf{cc}(\sigma)\leq\lambda f(S^{*})+\mathsf{cc}(\sigma^{*})+|C|\phi. (3)

For every i\in S, let \psi(i) be the nearest facility in S^{*} to i. For every i^{*}\in S^{*} with \psi^{-1}(i^{*})\neq\emptyset, let \psi^{*}(i^{*}) be the nearest facility in \psi^{-1}(i^{*}) to i^{*}.

Focus on some i\in S and i^{*}=\psi(i) such that \psi^{*}(i^{*})=i. The operation that swaps in i^{*}, swaps out i and connects \sigma^{*-1}(i^{*})\cup\sigma^{-1}(i) to i^{*} is not \phi-efficient. This implies

\displaystyle\lambda f_{i}+\sum_{j\in\sigma^{*-1}(i^{*})\cup\sigma^{-1}(i)}d(j,\sigma_{j})\leq\lambda f_{i^{*}}+\sum_{j\in\sigma^{*-1}(i^{*})}d(j,i^{*})+\sum_{j\in\sigma^{-1}(i)\setminus\sigma^{*-1}(i^{*})}d(j,i^{*})+\big{|}\sigma^{*-1}(i^{*})\cup\sigma^{-1}(i)\big{|}\phi
\displaystyle\leq\lambda f_{i^{*}}+\sum_{j\in\sigma^{*-1}(i^{*})}d(j,i^{*})+\sum_{j\in\sigma^{-1}(i)\setminus\sigma^{*-1}(i^{*})}\left[d(j,\sigma^{*}(j))+2d(j,i)\right]+\big{|}\sigma^{*-1}(i^{*})\cup\sigma^{-1}(i)\big{|}\phi.

To see the second inequality, notice that d(j,i^{*})\leq d(j,i)+d(i,i^{*})\leq d(j,i)+d(i,\sigma^{*}(j))\leq 2d(j,i)+d(j,\sigma^{*}(j)), where the second step holds because i^{*}=\psi(i) is the facility of S^{*} nearest to i and \sigma^{*}(j)\in S^{*}. Canceling \sum_{j\in\sigma^{-1}(i)\setminus\sigma^{*-1}(i^{*})}d(j,i) on both sides and relaxing the right side a bit gives us

\displaystyle\lambda f_{i}+\sum_{j\in\sigma^{*-1}(i^{*})}d(j,\sigma_{j})\leq\lambda f_{i^{*}}+\sum_{j\in\sigma^{*-1}(i^{*})}d(j,i^{*})+\big{|}\sigma^{*-1}(i^{*})\cup\sigma^{-1}(i)\big{|}\phi+\sum_{j\in\sigma^{-1}(i)}\left(d(j,i)+d(j,\sigma^{*}(j))\right). (4)

Notice that it could happen that i=i^{*} in the above setting; in that case the inequality is implied by the operation that opens i=i^{*} and connects \sigma^{*-1}(i^{*}) to i.

Now, focus on an i\in S with \psi^{*}(\psi(i))\neq i. Then closing i and connecting each client j\in\sigma^{-1}(i) to \psi^{*}(\sigma^{*}(j))\neq i is not \phi-efficient. So, we have

\displaystyle\lambda f_{i}+\sum_{j\in\sigma^{-1}(i)}d(j,i)\leq\sum_{j\in\sigma^{-1}(i)}d(j,\psi^{*}(\sigma^{*}(j)))+\big{|}\sigma^{-1}(i)\big{|}\phi\leq\sum_{j\in\sigma^{-1}(i)}\left[2d(j,\sigma^{*}(j))+d(j,i)\right]+\big{|}\sigma^{-1}(i)\big{|}\phi.

To see the second inequality, we have d(j,\psi^{*}(\sigma^{*}(j)))\leq d(j,\sigma^{*}(j))+d(\sigma^{*}(j),\psi^{*}(\sigma^{*}(j)))\leq d(j,\sigma^{*}(j))+d(\sigma^{*}(j),i)\leq 2d(j,\sigma^{*}(j))+d(j,i). This implies

\displaystyle\lambda f_{i}\leq 2\sum_{j\in\sigma^{-1}(i)}d(j,\sigma^{*}(j))+\big{|}\sigma^{-1}(i)\big{|}\phi. (5)

Now, consider the inequality obtained by summing up (4) for all pairs (i,i^{*}) with i^{*}=\psi(i) and \psi^{*}(i^{*})=i, (5) for all i with \psi^{*}(\psi(i))\neq i, and (2) for all i^{*} with \psi^{-1}(i^{*})=\emptyset. This inequality will be \lambda f(S)+\mathsf{cc}(\sigma)\leq\lambda f(S^{*})+2\mathsf{cc}(\sigma^{*})+\mathsf{cc}(\sigma)+2|C|\phi, which is

\displaystyle\lambda f(S)\leq\lambda f(S^{*})+2\mathsf{cc}(\sigma^{*})+2|C|\phi. (6)

Summing up Inequality (3) and 1/\lambda times Inequality (6) gives f(S)+\mathsf{cc}(\sigma)\leq(1+\lambda)f(S^{*})+(1+2/\lambda)\left(\mathsf{cc}(\sigma^{*})+|C|\phi\right)=\alpha_{\mathsf{FL}}\left(\mathsf{opt}+|C|\phi\right), since 1+\lambda=1+2/\lambda=1+\sqrt{2}=\alpha_{\mathsf{FL}}. This finishes the proof of Theorem 7. ∎
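To make the notion of \phi-efficiency used in this proof concrete, here is a minimal Python sketch of the test for an open operation, directly encoding the negation of the first displayed inequality above; the data layout (d[j][i] for distances, sigma[j] for the facility currently serving j) and the helper name are ours, purely for illustration.

```python
def phi_efficient_open(i, C_tilde, S, sigma, d, f, lam, phi):
    """Does 'open facility i and reconnect the clients in C_tilde to i'
    decrease the scaled cost lam * f(S) + cc(sigma) by more than
    |C_tilde| * phi?  (Sketch; hypothetical data layout.)"""
    old_cost = sum(d[j][sigma[j]] for j in C_tilde)     # current connection cost
    opening = lam * f[i] if i not in S else 0.0         # the 1_{i not in S} term
    new_cost = opening + sum(d[j][i] for j in C_tilde)  # cost after the operation
    return old_cost - new_cost > len(C_tilde) * phi
```

The close and swap operations admit the same kind of test, with the saved facility cost \lambda f_{i} counted on the old-cost side.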

See Theorem 8.

The theorem follows from the proof of Theorem 7. Let \phi=0 in the theorem statement and the proof. Inequalities (3) and (6) were obtained by adding many inequalities of the forms (2), (4) and (5), and each of these inequalities corresponds to a local operation. In the setting of Theorem 8, the inequalities do not hold anymore, since we no longer have the condition that 0-efficient operations do not exist. However, for the inequality corresponding to an operation \textrm{op}, we can add \nabla_{\textrm{op}} to the right side so that the inequality becomes satisfied. Then adding all the inequalities that were used to obtain (3), we obtain

\displaystyle\mathsf{cc}(\sigma)\leq\lambda f(S^{*})+\mathsf{cc}(\sigma^{*})+\sum_{\textrm{op}\in{\mathcal{P}}_{\mathrm{C}}}\nabla_{\textrm{op}},

where {\mathcal{P}}_{\mathrm{C}} is the set of operations corresponding to the inequalities. Similarly, we can obtain a set {\mathcal{P}}_{\mathrm{F}} of operations such that

\displaystyle\lambda f(S)\leq\lambda f(S^{*})+2\mathsf{cc}(\sigma^{*})+\sum_{\textrm{op}\in{\mathcal{P}}_{\mathrm{F}}}\nabla_{\textrm{op}}.

It is easy to check that, for every i^{*}\in S^{*}\subseteq F, each of {\mathcal{P}}_{\mathrm{C}} and {\mathcal{P}}_{\mathrm{F}} contains at most one operation that opens or swaps in i^{*}, and neither contains an operation that opens or swaps in a facility outside S^{*}. Moreover, {\mathcal{P}}_{\mathrm{C}}\uplus{\mathcal{P}}_{\mathrm{F}} contains at most |S|\leq|F| close operations. Summing the two inequalities also gives \mathsf{cost}_{\lambda}(S,\sigma)=\lambda f(S)+\mathsf{cc}(\sigma)\leq 2\lambda f(S^{*})+3\mathsf{cc}(\sigma^{*})+\sum_{\textrm{op}\in{\mathcal{P}}_{\mathrm{C}}\uplus{\mathcal{P}}_{\mathrm{F}}}\nabla_{\textrm{op}}, the form used in the proof of Lemma 13. Rewriting the two inequalities almost gives us Theorem 8, except for the requirement that each \textrm{op}\in{\mathcal{P}}_{\mathrm{C}}\cup{\mathcal{P}}_{\mathrm{F}} has \nabla_{\textrm{op}}>0; this can be ensured by removing the operations with \nabla_{\textrm{op}}\leq 0 from {\mathcal{P}}_{\mathrm{C}} and {\mathcal{P}}_{\mathrm{F}}.

Appendix B Proofs of Useful Lemmas

See Lemma 9.

Proof.

Define a_{T+1}=+\infty. Then

\displaystyle\sum_{t=1}^{T}\frac{b_{t}}{a_{t}}=\sum_{t=1}^{T}\frac{B_{t}-B_{t-1}}{a_{t}}=\sum_{t=1}^{T}B_{t}\left(\frac{1}{a_{t}}-\frac{1}{a_{t+1}}\right)=\sum_{t=1}^{T}\frac{B_{t}}{a_{t}}\left(1-\frac{a_{t}}{a_{t+1}}\right)\leq\alpha\sum_{t=1}^{T}\left(1-\frac{a_{t}}{a_{t+1}}\right)
\displaystyle=\alpha T-\alpha\sum_{t=1}^{T-1}\frac{a_{t}}{a_{t+1}}\leq\alpha T-\alpha(T-1)\Big{(}\frac{a_{1}}{a_{T}}\Big{)}^{1/(T-1)}
\displaystyle=\alpha(T-1)\left(1-e^{-\ln\frac{a_{T}}{a_{1}}/(T-1)}\right)+\alpha\leq\alpha(T-1)\cdot\frac{1}{T-1}\ln\frac{a_{T}}{a_{1}}+\alpha=\alpha\left(\ln\frac{a_{T}}{a_{1}}+1\right).

The inequality in the second line used the following fact: if the product of T-1 positive numbers equals \frac{a_{1}}{a_{T}}, then their sum is minimized when they are all equal. The inequality in the third line used that 1-e^{-x}\leq x for every x. ∎
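Since the hypotheses of the lemma enter the proof only through B_{0}=0, B_{t}\leq\alpha a_{t} and 0<a_{1}\leq\dots\leq a_{T}, the bound is easy to sanity-check numerically. The following Python sketch (illustrative only, under exactly these assumptions) draws random sequences and verifies the inequality.

```python
import math
import random

def check_lemma9(T=50, alpha=2.0, trials=1000, seed=0):
    """Check: if 0 < a_1 <= ... <= a_T and the prefix sums
    B_t = b_1 + ... + b_t satisfy B_t <= alpha * a_t, then
    sum_t b_t / a_t <= alpha * (ln(a_T / a_1) + 1)."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = sorted(rng.uniform(1.0, 100.0) for _ in range(T))
        b = [rng.uniform(0.0, 1.0) for _ in range(T)]
        B = 0.0
        for t in range(T):
            B += b[t]
            if B > alpha * a[t]:           # clip so that B_t <= alpha * a_t holds
                b[t] -= B - alpha * a[t]   # (b_t may go negative; the argument
                B = alpha * a[t]           #  above only uses the prefix sums)
        lhs = sum(bt / at for bt, at in zip(b, a))
        rhs = alpha * (math.log(a[-1] / a[0]) + 1.0)
        assert lhs <= rhs + 1e-9, (lhs, rhs)

check_lemma9()
```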

See Lemma 10.

Proof.

By the conditions in the lemma, opening facility i and reconnecting \tilde{C} to i is not \phi-efficient. This gives that, at the moment, we have

\sum_{\tilde{j}\in\tilde{C}}d(\tilde{j},S)\leq\sum_{\tilde{j}\in\tilde{C}}d(\tilde{j},\sigma_{\tilde{j}})\leq f_{i}+\sum_{\tilde{j}\in\tilde{C}}d(i,\tilde{j})+|\tilde{C}|\cdot\phi.

By the triangle inequality, we have d(\tilde{j},S)\geq d(i,S)-d(i,\tilde{j}) for every \tilde{j}\in\tilde{C}. Combining this with the previous inequality yields

\displaystyle d(i,S)\leq\frac{1}{|\tilde{C}|}\sum_{\tilde{j}\in\tilde{C}}\left(d(\tilde{j},S)+d(i,\tilde{j})\right)\leq\frac{f_{i}+2\sum_{\tilde{j}\in\tilde{C}}d(i,\tilde{j})}{|\tilde{C}|}+\phi.\qquad\qed
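The lemma is likewise easy to test numerically. The Python sketch below (illustrative names; a one-dimensional metric, the opening cost unscaled exactly as in the displayed inequality, and \sigma taken to be the nearest-facility assignment) checks that whenever the open operation is not \phi-efficient, the concluded bound on d(i,S) holds.

```python
import random

def check_lemma10(trials=500, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        # points on a line, d(x, y) = |x - y|
        S = [rng.uniform(0, 10) for _ in range(3)]         # open facilities
        C_tilde = [rng.uniform(0, 10) for _ in range(4)]   # clients to reconnect
        i, f_i, phi = rng.uniform(0, 10), rng.uniform(0, 5), rng.uniform(0, 1)
        d_S = lambda x: min(abs(x - s) for s in S)         # distance to the set S
        not_efficient = (sum(d_S(j) for j in C_tilde)
                         <= f_i + sum(abs(i - j) for j in C_tilde)
                         + len(C_tilde) * phi)
        if not_efficient:                                  # premise of the lemma
            bound = (f_i + 2 * sum(abs(i - j) for j in C_tilde)) / len(C_tilde) + phi
            assert d_S(i) <= bound + 1e-9

check_lemma10()
```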

Appendix C Moving Clients to Facility Locations

In this section we show that by moving clients to their nearest facility locations, we lose a multiplicative factor of 2 and an additive factor of 1 in the approximation ratio. That is, an \alpha-approximate solution for the new instance is (2\alpha+1)-approximate for the original instance. Throughout this section, we simply use the set of open facilities to define a solution, and all clients are connected to their respective nearest open facilities.

Let a facility location instance be given by F, (f_{i})_{i\in F}, C and d. For every j\in C, let \psi_{j} be the nearest facility in F to j. By moving each client j to \psi_{j}, we obtain a new instance. Let S^{*} be the optimum solution to the original instance, and suppose we have a solution S for the new instance that is \alpha-approximate. Thus f(S)+\sum_{j\in C}d(\psi_{j},S)\leq\alpha\left(f(S^{*})+\sum_{j\in C}d(\psi_{j},S^{*})\right). We show that S is (2\alpha+1)-approximate for the original instance.

Notice that for every j\in C, we have d(j,S)-d(j,\psi_{j})\leq d(\psi_{j},S)\leq d(j,S)+d(j,\psi_{j}) by the triangle inequality. Therefore,

\displaystyle f(S)+\sum_{j\in C}d(j,S)\leq f(S)+\sum_{j\in C}\left(d(\psi_{j},S)+d(j,\psi_{j})\right)\leq\alpha\left(f(S^{*})+\sum_{j\in C}d(\psi_{j},S^{*})\right)+\sum_{j\in C}d(j,\psi_{j}).

For every j\in C, since \psi_{j} is the nearest facility in F to j and S^{*}\subseteq F, we have d(\psi_{j},S^{*})\leq d(j,\psi_{j})+d(j,S^{*})\leq 2d(j,S^{*}) and d(j,\psi_{j})\leq d(j,S^{*}). Thus, we have

\displaystyle f(S)+\sum_{j\in C}d(j,S)\leq\alpha f(S^{*})+2\alpha\sum_{j\in C}d(j,S^{*})+\sum_{j\in C}d(j,\psi_{j})\leq\alpha f(S^{*})+(2\alpha+1)\sum_{j\in C}d(j,S^{*}).

Thus, S is a (2\alpha+1)-approximate solution for the original instance.
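On tiny instances the whole reduction can be verified end to end by brute force. In the Python sketch below (an illustrative instance generator, not the paper's construction), we take S to be the optimum of the moved instance, so \alpha=1, and check that its cost on the original instance is within the claimed factor 2\alpha+1=3.

```python
import itertools
import random

def cost(S, clients, d, f):
    """Facility cost plus nearest-open-facility connection cost."""
    return sum(f[i] for i in S) + sum(min(d[j][i] for i in S) for j in clients)

def brute_force_opt(F, clients, d, f):
    return min((cost(S, clients, d, f), S)
               for r in range(1, len(F) + 1)
               for S in itertools.combinations(F, r))

rng = random.Random(0)
F, C = range(4), range(4, 10)                        # facility / client ids
pos = {p: rng.uniform(0, 10) for p in [*F, *C]}      # points on a line
f = {i: rng.uniform(0, 5) for i in F}
d = {p: {i: abs(pos[p] - pos[i]) for i in F} for p in [*F, *C]}

psi = {j: min(F, key=lambda i: d[j][i]) for j in C}  # nearest facility location
d_moved = {j: {i: d[psi[j]][i] for i in F} for j in C}

opt_orig, _ = brute_force_opt(F, C, d, f)
_, S = brute_force_opt(F, C, d_moved, f)             # an alpha = 1 solution
assert cost(S, C, d, f) <= 3 * opt_orig + 1e-9       # 2 * alpha + 1 = 3
```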

Appendix D Missing Proofs from Section 4

See Lemma 13.

Proof.

We are going to lower bound the expected value of \mathsf{cost}_{\lambda}(S^{0},\sigma^{0})-\mathsf{cost}_{\lambda}(S^{1},\sigma^{1}). By Theorem 8, there are two sets {\mathcal{P}}_{\mathrm{C}} and {\mathcal{P}}_{\mathrm{F}} of local operations satisfying the properties stated there. Below, we let {\mathcal{Q}} be one of the following three sets: {\mathcal{P}}_{\mathrm{C}}, {\mathcal{P}}_{\mathrm{F}}, or {\mathcal{P}}_{\mathrm{C}}\biguplus{\mathcal{P}}_{\mathrm{F}}.

For every i\in F, let {\mathcal{Q}}_{i} be the set of operations in {\mathcal{Q}} that open or swap in i, and let {\mathcal{Q}}_{\emptyset} be the set of \mathsf{close} operations in {\mathcal{Q}}. Let \Phi_{i} be the maximum of \nabla_{\mathsf{op}} over all \mathsf{op}\in{\mathcal{Q}}_{i} (define \Phi_{i}=0 if {\mathcal{Q}}_{i}=\emptyset); define \Phi_{\emptyset} similarly. Notice that if i\in S, then opening i will not decrease the cost, since we maintain that all clients are connected to their nearest open facilities; thus {\mathcal{Q}}_{i}=\emptyset for i\in S. Conditioned on the event that we consider \mathsf{close} operations in \mathsf{sampled\text{-}local\text{-}search}, the cost decrement of the iteration is at least \Phi_{\emptyset}; conditioned on the event that we consider opening or swapping in i in the iteration, the decrement is at least \Phi_{i}. Thus, \mathsf{cost}_{\lambda}(S^{0},\sigma^{0})-\operatorname*{\mathbb{E}}[\mathsf{cost}_{\lambda}(S^{1},\sigma^{1})]\geq\frac{\Phi_{\emptyset}}{3}+\sum_{i\in F\setminus S}\frac{2\Phi_{i}}{3|F\setminus S|}. Therefore,

\displaystyle\sum_{\mathsf{op}\in{\mathcal{Q}}}\nabla_{\mathsf{op}}\leq|{\mathcal{Q}}_{\emptyset}|\Phi_{\emptyset}+\sum_{i\in F\setminus S}|{\mathcal{Q}}_{i}|\Phi_{i}\leq|F|\Phi_{\emptyset}+2\sum_{i\in F\setminus S}\Phi_{i}\leq 3|F|\left(\mathsf{cost}_{\lambda}(S^{0},\sigma^{0})-\operatorname*{\mathbb{E}}[\mathsf{cost}_{\lambda}(S^{1},\sigma^{1})]\right),

since the third and fourth properties in the theorem imply |{\mathcal{Q}}_{\emptyset}|\leq|F| and |{\mathcal{Q}}_{i}|\leq 2 for every i\in F\setminus S. Replacing {\mathcal{Q}} with each of {\mathcal{P}}_{\mathrm{C}}, {\mathcal{P}}_{\mathrm{F}} and {\mathcal{P}}_{\mathrm{C}}\biguplus{\mathcal{P}}_{\mathrm{F}}, we obtain

\displaystyle\mathsf{cost}_{\lambda}(S^{0},\sigma^{0})-\operatorname*{\mathbb{E}}[\mathsf{cost}_{\lambda}(S^{1},\sigma^{1})]\geq\frac{1}{3|F|}\max\left\{\begin{array}{c}\mathsf{cc}(\sigma^{0})-(\lambda f(S^{*})+\mathsf{cc}(\sigma^{*}))\\ \lambda f(S)-(\lambda f(S^{*})+2\mathsf{cc}(\sigma^{*}))\\ \mathsf{cost}_{\lambda}(S^{0},\sigma^{0})-(2\lambda f(S^{*})+3\mathsf{cc}(\sigma^{*}))\end{array}\right\}.

This finishes the proof of the lemma. ∎
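For intuition, the following Python sketch shows one iteration consistent with the sampling probabilities used in this proof (\mathsf{close} operations with probability 1/3, and each i\in F\setminus S with probability \frac{2}{3|F\setminus S|}); the actual \mathsf{sampled\text{-}local\text{-}search} procedure is specified in Section 4, so treat this as an assumption-labeled illustration that maintains the nearest-facility assignment throughout.

```python
import random

def cost_lam(S, clients, d, f, lam):
    """Scaled cost: lam * f(S) plus nearest-open-facility connection cost."""
    return lam * sum(f[i] for i in S) + sum(min(d[j][i] for i in S) for j in clients)

def sampled_step(S, F, clients, d, f, lam, rng=random):
    """One sampled iteration (sketch): w.p. 1/3 try the close operations,
    w.p. 2/3 pick i uniformly from F \\ S and try opening or swapping in i.
    S and F are sets of facility ids; the best improving move is kept."""
    best_cost, best_S = cost_lam(S, clients, d, f, lam), S

    def consider(T):
        nonlocal best_cost, best_S
        if T:  # at least one facility must stay open
            c = cost_lam(T, clients, d, f, lam)
            if c < best_cost:
                best_cost, best_S = c, T

    if rng.random() < 1 / 3:             # close operations
        for i in S:
            consider(S - {i})
    else:                                # open / swap in a sampled facility
        outside = sorted(F - S)
        if outside:
            i = rng.choice(outside)      # each i w.p. 2 / (3 |F \ S|)
            consider(S | {i})            # open i
            for i2 in S:                 # swap: open i, close i2
                consider((S | {i}) - {i2})
    return best_S
```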

See Lemma 14.

Proof.

We break the procedure into two stages. The first stage consists of the first M_{1}=O\left(|F|\log\frac{\Gamma}{\epsilon^{\prime}}\right) iterations of the for-loop in \mathsf{FL\text{-}iterate}(M), where M_{1} is sufficiently large. Applying Lemma 13 and using the third term in the \max operator, for any execution of \mathsf{sampled\text{-}local\text{-}search}, we have

\displaystyle\operatorname*{\mathbb{E}}\big{[}\big{(}\mathsf{cost}_{\lambda}(S^{1},\sigma^{1})-(2\lambda f(S^{*})+3\mathsf{cc}(\sigma^{*}))\big{)}_{+}\big{]}\leq\left(1-\frac{1}{3|F|}\right)\big{(}\mathsf{cost}_{\lambda}(S^{0},\sigma^{0})-(2\lambda f(S^{*})+3\mathsf{cc}(\sigma^{*}))\big{)}_{+},

where (S^{0},\sigma^{0}) and (S^{1},\sigma^{1}) are as defined w.r.t. the execution, and x_{+} denotes \max\{x,0\} for every real number x. Notice that when \mathsf{cost}_{\lambda}(S^{0},\sigma^{0})\leq 2\lambda f(S^{*})+3\mathsf{cc}(\sigma^{*}), the inequality holds trivially, as the cost of the maintained solution never increases. Truncating at 0 is needed later when we apply Markov's inequality.

So, after M_{1} iterations, we have

\displaystyle\operatorname*{\mathbb{E}}\big{[}\big{(}\mathsf{cost}_{\lambda}(S,\sigma)-(2\lambda f(S^{*})+3\mathsf{cc}(\sigma^{*}))\big{)}_{+}\big{]}\leq\left(1-\frac{1}{3|F|}\right)^{M_{1}}\big{(}\mathsf{cost}_{\lambda}(S^{\circ},\sigma^{\circ})-(2\lambda f(S^{*})+3\mathsf{cc}(\sigma^{*}))\big{)}_{+}\leq\frac{\epsilon^{\prime}}{2\Gamma}\mathsf{opt}.

The second inequality holds since \mathsf{cost}_{\lambda}(S^{\circ},\sigma^{\circ})\leq\lambda\cdot\mathsf{cost}(S^{\circ},\sigma^{\circ})\leq O(1)\cdot\mathsf{opt} and M_{1}=O\left(|F|\log\frac{\Gamma}{\epsilon^{\prime}}\right) is sufficiently large. Using Markov's inequality, with probability at least 1-\frac{1}{2\Gamma}, we have at the end of the first stage,

(\mathsf{cost}_{\lambda}(S,\sigma)-(2\lambda f(S^{*})+3\mathsf{cc}(\sigma^{*})))_{+}\leq\epsilon^{\prime}\cdot\mathsf{opt}.

If the event happens, we say the first stage is successful.
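Spelling out the "sufficiently large" claim for M_{1} above: writing C=O(1) for the constant with \mathsf{cost}_{\lambda}(S^{\circ},\sigma^{\circ})\leq C\cdot\mathsf{opt}, a routine calculation gives

```latex
\left(1-\frac{1}{3|F|}\right)^{M_{1}} \cdot C\cdot\mathsf{opt}
  \;\le\; e^{-M_{1}/(3|F|)}\cdot C\cdot\mathsf{opt}
  \;\le\; \frac{\epsilon'}{2\Gamma}\,\mathsf{opt}
\qquad\text{whenever}\qquad
M_{1} \;\ge\; 3|F|\ln\frac{2C\Gamma}{\epsilon'}
      \;=\; O\!\left(|F|\log\frac{\Gamma}{\epsilon'}\right).
```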

We assume the first stage is successful and analyze the second stage. The second stage contains \log_{2}(2\Gamma) phases, and each phase contains \frac{48|F|}{\epsilon^{\prime}} iterations. We focus on one phase in the stage. Assume that at the beginning of an iteration in the phase, we have

\displaystyle\mathsf{cc}(\sigma)\leq\big{(}\lambda+\frac{\epsilon^{\prime}}{2}\big{)}f(S^{*})+\big{(}1+\frac{\epsilon^{\prime}}{2}\big{)}\mathsf{cc}(\sigma^{*})\text{ and }\lambda f(S)\leq\big{(}\lambda+\frac{\lambda\epsilon^{\prime}}{2}\big{)}f(S^{*})+\big{(}2+\frac{\lambda\epsilon^{\prime}}{2}\big{)}\mathsf{cc}(\sigma^{*}).

Then at that moment, we have \mathsf{cost}(S,\sigma)\leq(1+\lambda+\epsilon^{\prime})f(S^{*})+(1+2/\lambda+\epsilon^{\prime})\mathsf{cc}(\sigma^{*})=(\alpha_{\mathsf{FL}}+\epsilon^{\prime})\mathsf{opt} (obtained by adding the first inequality and 1/\lambda times the second inequality). Then we must have \mathsf{cost}(S^{\mathsf{best}},\sigma^{\mathsf{best}})\leq(\alpha_{\mathsf{FL}}+\epsilon^{\prime})\mathsf{opt} at the end of this execution of \mathsf{FL\text{-}iterate}, since (S^{\mathsf{best}},\sigma^{\mathsf{best}}) is the best solution according to the original (i.e., non-scaled) cost.

Thus, we say a phase in the second stage is successful if both inequalities hold at the end of some iteration in the phase; then we can pretend that the phase ends at the moment it is successful. If one of the two inequalities does not hold at the end of an iteration, then by Lemma 13, for the execution of \mathsf{sampled\text{-}local\text{-}search} in the next iteration, we have \mathsf{cost}_{\lambda}(S^{0},\sigma^{0})-\operatorname*{\mathbb{E}}[\mathsf{cost}_{\lambda}(S^{1},\sigma^{1})]\geq\frac{\epsilon^{\prime}}{6|F|}(f(S^{*})+\mathsf{cc}(\sigma^{*}))=\frac{\epsilon^{\prime}}{6|F|}\mathsf{opt}. Then, by the theory of stopping times of martingales, in expectation the phase stops within \frac{24|F|}{\epsilon^{\prime}} iterations, since at the beginning of the phase we have \mathsf{cost}_{\lambda}(S,\sigma)\leq\max\{3+\epsilon^{\prime},2\lambda+\epsilon^{\prime}\}(f(S^{*})+\mathsf{cc}(\sigma^{*}))\leq 4\cdot\mathsf{opt} and \mathsf{cost}_{\lambda}(S,\sigma) is always positive. By Markov's inequality, the probability that a phase does not stop early (i.e., is not successful) is at most 1/2. Hence the probability that the second stage succeeds, i.e., that at least one of its phases succeeds, is at least 1-2^{-\log_{2}(2\Gamma)}=1-\frac{1}{2\Gamma}. Thus, with probability at least 1-1/\Gamma, both stages succeed and we have \mathsf{cost}(S^{\mathrm{best}},\sigma^{\mathrm{best}})\leq(\alpha_{\mathsf{FL}}+\epsilon^{\prime})\mathsf{opt}. The number of iterations we need in the two stages is O\left(\frac{|F|}{\epsilon^{\prime}}\log\Gamma\right). ∎