Fleet Sizing and Allocation for On-demand Last-Mile Transportation Systems
Abstract
The last-mile problem refers to the provision of travel service from the nearest public transportation node to home or other destination. Last-Mile Transportation Systems (LMTS), which have recently emerged, provide on-demand shared transportation. In this paper, we investigate the fleet sizing and allocation problem for the on-demand LMTS. Specifically, we consider the perspective of a last-mile service provider who wants to determine the number of servicing vehicles to allocate to multiple last-mile service regions in a particular city. In each service region, passengers demanding last-mile services arrive in batches, and allocated vehicles deliver passengers to their final destinations. The passenger demand (i.e., the size of each batch of passengers) is random and hard to predict in advance, especially with limited data during the planning process. The quality of fleet-allocation decisions is a function of vehicle fixed cost plus a weighted sum of passenger’s waiting time before boarding a vehicle and in-vehicle riding time. We propose and analyze two models—a stochastic programming model and a distributionally robust optimization model—to solve the problem, assuming known and unknown distribution of the demand, respectively. We conduct extensive numerical experiments to evaluate the models and discuss insights and implications into the optimal fleet sizing and allocation for the on-demand LMTS under demand uncertainty.
Keywords: last-mile transportation, on-demand transportation, fleet sizing and allocation, demand uncertainty, stochastic optimization
1 Introduction
The last-mile problem refers to the design and provision of travel services from a public transportation node to a passenger’s final destination. It is a fundamental practical problem that has attracted intense attention in the past decade for several reasons. First, governments worldwide are under pressure to increase public transport’s share of urban trips to reduce road congestion and air pollution. Perhaps unsurprisingly, urban planners have recently recognized that the unavailability of last-mile services is one of the main deterrents to the use of public transport. Second, the aging of populations has increased the demand for such services. Third, more countries are imposing legal requirements to ensure adequate mobility for particular demographic groups who are most likely to need last-mile services, such as people with physical disabilities.
Any passenger needing on-demand last-mile service may provide advance notice to the last-mile transportation system (LMTS) of their impending arrival at the alighting station and their specific final destination. Once this information is received, the LMTS assigns the passenger to one of the vehicles in the LMTS fleet, plans the vehicle’s route so that it includes a stop at the passenger’s destination, estimates the vehicle’s departure time, and notifies the passenger accordingly. Once all of the passengers assigned to a vehicle are on board, the vehicle executes a delivery route with stops at each passenger’s destination and returns to the station to pick up passengers for its next delivery tour. Many papers address various models and case studies of LMTS. With the high penetration of services such as Uber worldwide, most people are aware of the benefits of on-demand transportation services and request even more specialized forms, including last-mile service (Wang and Yang, 2019).
In this paper, we investigate the fleet sizing and allocation problem for the on-demand LMTS. Specifically, we consider the perspective of a last-mile service provider who wants to determine the number of servicing vehicles to allocate to multiple service regions in a particular city. In each service region (e.g., an area around a metro station), passengers demanding last-mile services arrive in batches (e.g., as a result of consecutive arrivals of metros or trains), and allocated vehicles deliver passengers to their final destinations, i.e., last-mile stops. The size of each batch of passengers (demand henceforth) is random and hard to predict in advance. The quality of fleet-allocation decisions is a function of vehicle fixed cost (vehicle rental or purchase cost) plus a weighted sum of passenger waiting time before boarding a vehicle and in-vehicle riding time.
The fleet sizing and allocation problem is challenging, especially since the practical passenger demand is stochastic (i.e., uncertain). If we know the exact probability distribution of random demand, we may formulate the problem as a two-stage stochastic programming model. In the first stage, we decide the number of vehicles to allocate to each service region and their routes before the realization of random demand. Then in the second stage, we observe the realizations of stochastic demands and make optimal recourse actions (assigning passengers to vehicles), conditioning on first-stage decisions, and accordingly compute the associated passenger waiting and riding times. Mathematically, the stochastic programming model identifies vehicle allocation and routing decisions that minimize the total cost, comprising the fixed cost of vehicle allocation and the expected weighted sum of waiting time and riding time for passengers, where the expectation is taken with respect to this known probability distribution of random demand.
In reality, it is often difficult to accurately estimate the probability distribution of demand, especially with limited data during the planning stage. LMTS is a relatively new transportation mode, and thus sufficient data on LMTS operations is not readily available. Even if LMTS companies collect data on their operations, the data may be insufficient in quantity or quality to model the demand distribution. Moreover, it is challenging to obtain LMTS data from companies due to privacy issues, among others. If we calibrate a stochastic programming model to a data sample from a biased distribution of random demand, then the resulting biased optimal decisions may demonstrate disappointing out-of-sample performance (in terms of passenger waiting and riding times) under the true distribution, a phenomenon known as the optimizer’s curse (see Esfahani and Kuhn (2018) and Smith and Winkler (2006) for a detailed discussion). Alternatively, one can construct an ambiguity set (i.e., a family) of all plausible probability distributions compatible with the available limited data or expert knowledge about demand.
In this paper, we address the uncertainty of passengers’ demand for last-mile service. We propose, analyze, and evaluate the computational and operational performance of two models for the fleet sizing and allocation problem, assuming known and unknown distribution of the demand, respectively. First, a stochastic programming model is proposed to minimize the fixed cost of allocated vehicles and the expectation of a weighted sum of passenger waiting and riding times, under a distributional belief of demand. Second, a distributionally robust model is proposed to minimize the fixed cost of vehicles and the worst-case (i.e., maximum) expectation of passenger waiting and riding times; the worst-case expectation is evaluated over an ambiguity set (i.e., a family of all possible distributions of uncertain demand), which we characterize by known mean and support information of demand. We conduct extensive numerical experiments and discuss the insights and implications by examining trade-offs between total cost, fleet size, and passenger waiting and riding times.
The proposed model generalizes the single-service-region LMTS routing and scheduling formulation in the literature by (1) considering multiple service regions, (2) considering fleet sizing and allocation decisions, and (3) incorporating the uncertainty of passenger demand. To the best of our knowledge, and according to our literature review in Section 2, this is the first paper that provides a theoretical and computational analysis of stochastic optimization models for the fleet sizing and allocation problem for LMTS.
The remainder of the paper is structured as follows. In Section 2, we review the relevant literature. In Section 3, we formally define the problem and propose two model formulations. In Section 4, we present computational results and discuss managerial implications. Finally, we draw conclusions and discuss future directions in Section 5.
2 Relevant Literature
Existing literature has addressed various models and case studies of the LMTS. Several case studies analyze LMTS in different contexts, including Liu et al. (2012)’s study of a bicycle-sharing program for a passenger LMTS in Beijing. Some studies have examined the design and performance evaluation of an LMTS from a planning perspective. For example, Wang and Odoni (2016) address the planning side by focusing on passenger LMTS from a stochastic and planning perspective and provide closed-form approximations for the performance of an LMTS as a function of the system’s fundamental design parameters. Zhu et al. (2020) study passengers’ multi-modal commuting behavior with ride-splitting and ride-sourcing systems, while considering their feeding effects on public transit—i.e., the ride-splitting fleet provides first- and last-mile services to public transit.
Recent studies have examined the operation of an LMTS from an optimization perspective. For example, Wang (2019) focuses on LMTS from an operational perspective and provides efficient strategies for passenger assignment, vehicle routing, and scheduling operations based on a set of last-mile demand information. Similarly, Agussurja et al. (2019) study the use of ride-sharing in satisfying last-mile demands with the assumption that last-mile demands are uncertain and come in batches, and propose a two-level Markov decision process framework that is capable of generating a vehicle-dispatching policy. Liu et al. (2019) focus on the fleet size and scheduling of feeder transit services while considering the influence of bike-sharing systems, propose several hybrid operation modes that combine fixed and dynamic frequencies in a bimodal period, and compare these with conventional bus scheduling with constant service frequencies. Chen et al. (2020a) focus on solving the first-mile ride-sharing problem using autonomous vehicles and propose a mixed-integer linear programming model to determine autonomous vehicle dispatch and ride-sharing schemes for minimum operational costs. Serra et al. (2019) study the scheduling problem of last-mile service while considering uncertainty in the system, and propose a two-stage stochastic programming formulation for scheduling a set of known passengers and uncertain passengers that are realized from a finite set of scenarios. Chen and Wang (2018a, b) study the pricing problem of multiple types of passengers in an LMTS using a queueing model to approximate passenger waiting time.
Personal rapid transit (PRT) and demand responsive transit (DRT), which refer to a variety of on-demand transportation systems with characteristics that are similar, in some ways, to LMTS, have also attracted significant attention in recent years. Examples include PRT system control frameworks by Anderson (1998); financial assessments by Bly and Teychenne (2005) and Berger et al. (2011); performance approximations by Lees-Miller et al. (2009); and case studies by Mueller and Sgouridis (2011). Other papers focus on DRT concept discussions, practical implementation, and assessment of simulations in case studies, such as Horn (2002) and Quadrifoglio et al. (2008), among others. Relevant fleet sizing problems have also been studied for on-demand ride-pooling service by Ke et al. (2020); one-way car sharing service by Xu and Meng (2019); and autonomous electric vehicles considering charging system planning by Zhang et al. (2019).
In contrast to the increased awareness for LMTS modeling and methodology research, the availability of practical datasets is a limiting factor in LMTS research. In our numerical study, we construct a real case study based on New York City’s travel demand data. In another paper, Hao et al. (2021) curated last-mile transportation demand data arising from job-related commute in the United States, with an emphasis on the correlation between last-mile demand and household income level.
To address the challenge of stochastic and uncertain demand, we draw on three frameworks for optimization under uncertainty: stochastic programming (SP), robust optimization (RO), and distributionally robust optimization (DRO). When using stochastic programming, the goal is to optimize a certain measure of a random outcome (e.g., the expected operational cost) for a given fully known distribution of the uncertain parameters. We refer to Birge and Louveaux (2011) and Shapiro et al. (2014) and references therein for thorough discussions about applications, formulations, and solution algorithms. RO assumes complete ignorance about the probability distribution of uncertain parameters. Instead, it assumes that an uncertain parameter’s values may vary in a given constrained set, called an “uncertainty set” (Ben-Tal et al., 2015; Bertsimas and Sim, 2004; Soyster, 1973). Optimization is performed with respect to the worst-case scenario in the uncertainty set, which may lead to over-conservatism and suboptimal decisions for other more likely scenarios (Chen et al., 2020b; Delage and Ye, 2010; Thiele, 2010). Moreover, RO solutions often have poor expected performance because they do not capture distributional information about the uncertainty. DRO is a third approach to modeling uncertainty that bridges the gap between the conservatism of RO and the specificity of SP. DRO optimal solutions are sought for the worst-case probability distribution within a family of candidate distributions, called an “ambiguity set”. One can use easy-to-approximate information such as the mean and range of random parameters to construct the ambiguity sets and models that better mimic reality and are less conservative than RO models.
In addition, DRO models with some types of carefully designed ambiguity sets are often more tractable than their SP counterparts (Delage and Ye, 2010; Rahimian and Mehrotra, 2019).
3 Formulation and Analysis
In this section, we formally define the fleet sizing and allocation problem for the on-demand LMTS. We propose and analyze two optimization models, namely, a two-stage stochastic mixed-integer linear programming model in Section 3.2 and a two-stage distributionally robust model in Section 3.3.
3.1 Definitions and Random Parameters
We consider the perspective of a last-mile service provider who wants to determine the number of vehicles to allocate to each service region (e.g., an area around a metro station) in a particular city. Each service region consists of a known number of last-mile stops . For each stop there is a known number, , of feasible vehicle routes (i.e., a sequence of last-mile stops that a vehicle should visit on a trip). In each planning period (e.g., morning or afternoon), a set of punctual trains arrives at each service region . Each train dispatches a batch of passengers demanding last-mile service, and allocated vehicles deliver them to their final last-mile destinations. The number of passengers (demand henceforth) that need rides to each last-mile stop is random. The randomness of the demand stems from the uncertain number of passengers who make a last-minute request for last-mile service, in addition to those who request their last-mile service in advance. We make the following assumptions as in Wang (2019):
Indices | |
---|---|
index of train | |
index of last-mile stop | |
index of route | |
Sets | |
set of service regions | |
set of pre-selected feasible routes in service region | |
set of trains arriving to bring passengers in sequence in service region | |
set of pre-specified last-mile stops in service region | |
Parameters | |
maximum total number of vehicles in the fleet | |
(random) number of passengers demanding last-mile stop arriving at the station from train | |
lower/upper bound of | |
1 if last-mile stop is served by route ; 0 otherwise | |
total travel time of route , in terms of intervals between arrival trains | |
travel time to last-mile stop on route | |
vehicle capacity (i.e., number of seats) for vehicle | |
inter-arrival time (headway) between trains (demand batches) | |
fixed cost of each vehicle | |
weight of passenger waiting time before boarding in the objective function | |
weight of passenger in-vehicle riding time in the objective function | |
Decision variables | |
number of vehicles allocated to service region | |
number of passengers with destination at last-mile stop assigned to route right after arrival of train | |
number of trips on route starting right after arrival of train | |
Intermediate variables | |
number of unserved passengers with destination at last-mile stop waiting at the station after the arrival of train | |
number of available vehicles at the station after the arrival of train in service region and its corresponding service assignment | |
A1. The delivery fleet consists of at most vehicles, each with integer capacity ;
A2. The set of last-mile stops in each service region is finite;
A3. The set of feasible routes for LMTS vehicles in each service region is finite and preselected based on geometry, historical demand patterns, and some practical constraints, e.g., limits on the maximum number of last-mile stops on a single route or the route’s maximum travel distance or travel time;
A4. The inter-arrival time (headway) between arrival trains is deterministic and equal to .
Given a fleet of at most vehicles and the sets of last-mile stops and preselected routes for LMTS vehicles, we aim to identify: (1) the number of vehicles to allocate for each service region, (2) a routing plan for the allocated vehicles (i.e., route selection) in each service region, and (3) the assignment of passengers to vehicles for different realizations of demand patterns. Decisions (1)–(2) are first-stage decisions that we make before observing demand patterns. If a route is selected, a vehicle should visit all of the last-mile stops specified on this route. The assignment decisions (3) represent the recourse actions in response to the first-stage decisions and the realizations of demand patterns. The quality of fleet-allocation decisions is a function of the vehicle fixed cost (which may include vehicle rental or purchase cost, etc.) plus a weighted sum of passenger waiting time before boarding a vehicle and in-vehicle riding time.
General notation: For , we define and . The abbreviations “w.l.o.g.” and “w.l.o.o.” represent “without loss of generality” and “without loss of optimality,” respectively. For notational brevity, we use (, , ) to denote both the number and set of (trains, last-mile stops, routes) in service region . For notational and modeling convenience, we assume w.l.o.g. that last-mile stops (and routes) are numbered sequentially, e.g., , , , , etc.
3.2 Two-stage Stochastic Model (SP) for Fleet Sizing and Allocation
In this section, we assume that we know the joint probability distribution of the random number of passengers arriving at the station from train with a destination of last-mile stop in each service region , , for all and and formulate the problem as a two-stage a priori stochastic mixed-integer programming model (SP). In the first stage, we determine the number of vehicles to allocate to each service region, their routes, and the number of trips on each route. In the second stage, we assign passengers to vehicles for different realizations of demand and compute the associated riding and waiting-time costs for passengers. A priori optimization has a managerial advantage, since it guarantees the regularity of service, which is beneficial for both passengers and the service provider.
Let integer decision variable represent the number of vehicles allocated to service region . The feasible region of variables is defined in (2) such that the number of allocated vehicles is less than or equal to the total number of vehicles in the fleet:
(2) |
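As a sketch only, the feasible region described above can be written in LaTeX under assumed notation. Here $x_s$ denotes the number of vehicles allocated to service region $s$, $S$ the set of service regions, and $N$ the maximum total number of vehicles in the fleet; these symbols are assumptions chosen to match the verbal definitions in Table 1 and may differ from the authors' notation.

```latex
\mathcal{X} \;=\; \Big\{\, x \in \mathbb{Z}_{+}^{|S|} \;:\; \sum_{s \in S} x_s \,\le\, N \,\Big\}
```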
Let variable represent the number of trips on route starting right after the arrival of train at service region , for all . Let variable represent the number of vehicles waiting at the metro station after arrival of the train (demand batch) in service region , for all . Feasible region of in (6) defines and constrains the number of vehicles waiting at each metro station after the arrival of each train (demand batch) and its corresponding service assignment.
(6) |
Given a feasible , , and a joint realization of uncertain parameters , we can compute: (1) the number of passengers with a destination of last-mile stop assigned to route right after the arrival of train at service region , and (2) the number of unserved passengers with a destination at each last-mile stop waiting at the metro station after the arrival of train (demand batch) at service region and its corresponding service assignment using the following linear program (see Table 1 for notation):
(7a) | ||||
(7b) | ||||
(7c) | ||||
(7d) | ||||
(7e) |
The objective function (7a) minimizes a linear cost function of the total waiting time and riding time for passengers. Constraints (7b) and (7c) are passenger flow constraints, i.e., they define and constrain the number of unserved passengers with a destination at each last-mile stop who are waiting at the metro station after the arrival of train and its corresponding service assignment. Constraint (7d) enforces the vehicle service capacity restriction. Finally, constraint (7e) specifies the feasible ranges of the decision variables. Accordingly, we formulate the stochastic fleet sizing and allocation problem as follows:
(8) |
The SP formulation in (8) searches for vehicle sizing, allocation, routing, and scheduling decisions that minimize the vehicle fixed cost plus a weighted expected sum of passenger waiting time and riding time, where the expectation is taken with respect to a known joint probability distribution of , where is a vector of demand (i.e., a vector of all for all and ). The formulation generalizes the deterministic LMTS routing and scheduling formulation of Wang (2019) by (1) considering multiple service regions, (2) considering fleet sizing and allocation decisions, and (3) incorporating the uncertainty of passenger demand for LMTS. In Proposition 1, we show that the number of trips on routes that have at least two last-mile stops is at most one after each train arrival in the optimal solution.
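To make the second-stage cost that enters (8) more concrete, the following Python sketch evaluates the cost of one given passenger assignment in a single service region and checks it against flow-balance and capacity conditions of the kind described for (7b)-(7d). It is an illustration under stated assumptions, not the authors' implementation: the function name, argument names, and the specific cost form (each passenger left at the station after a train waits one headway) are hypothetical.

```python
def recourse_cost(demand, assign, trips, serves, ride_time,
                  capacity, headway, lam_wait, lam_ride):
    """Cost of a given assignment in one region, or None if it is infeasible.

    demand[t][k]     passengers from train t bound for stop k
    assign[t][k][r]  passengers for stop k put on route r after train t
    trips[t][r]      vehicles dispatched on route r after train t
    serves[k][r]     1 if route r visits stop k, else 0
    ride_time[k][r]  in-vehicle time to stop k on route r
    """
    n_stops, n_routes = len(serves), len(serves[0])
    unserved = [0] * n_stops          # passengers left at the station, per stop
    wait, ride = 0.0, 0.0
    for t, batch in enumerate(demand):
        # Capacity: passengers on route r cannot exceed the seats dispatched on r.
        for r in range(n_routes):
            load = sum(assign[t][k][r] for k in range(n_stops))
            if load > capacity * trips[t][r]:
                return None
        for k in range(n_stops):
            # A passenger may only use a route that visits their stop.
            if any(assign[t][k][r] > 0 and not serves[k][r]
                   for r in range(n_routes)):
                return None
            served = sum(assign[t][k][r] for r in range(n_routes))
            unserved[k] += batch[k] - served   # passenger flow balance
            if unserved[k] < 0:
                return None
            # Assumption: each leftover passenger waits one headway.
            wait += headway * unserved[k]
            ride += sum(ride_time[k][r] * assign[t][k][r]
                        for r in range(n_routes))
    return lam_wait * wait + lam_ride * ride
```

The linear program (7) would search over all feasible assignments for the cheapest one; this function only scores a single candidate, which is enough to see how waiting and riding time trade off in the objective.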
Proposition 1.
For any , if , then in the optimal solution of formulation (8).
Proof.
For any such that , w.l.o.g., assume that the last-mile stops served by route are visited in the sequence of in route . We use to denote the value of objective function (7a) for a solution with route . The corresponding passenger assignments are and with . Assume that route only serves stop and route only serves stop .
1. If and , we can construct another feasible solution with route , , and passenger assignment and .
2. If and , we can construct another feasible solution with route and , and passenger assignment , and .
3. If and , we can construct another feasible solution with route and , and passenger assignment , and .
Since routes and are sub-routes of route , we have , , and . Apparently, the value of objective function (7a) for the solution , which means that the solution with is not the optimal solution. Similarly, we can justify that all solutions with and are not the optimal solution. Therefore, in the optimal solutions, we have for any with .
∎
Proposition 1 implies that, at any vehicle dispatch decision after each train arrives, to reduce passenger riding time, we would never dispatch multiple identical routes (i.e., ) that visit multiple stops (i.e., ). In other words, if a route visits multiple stops (i.e., ), we would dispatch at most one vehicle to serve this route (i.e., ) after each train arrives. This proposition allows many integer decision variables to be restricted to binary values, which can improve the computational efficiency of the optimization model.
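One way Proposition 1 could be exploited when building the model is to tighten the upper bound of each trip variable before calling the solver. The sketch below is hypothetical (the function name and the choice of the fleet size as the generic bound for single-stop routes are assumptions): routes that visit two or more stops receive an upper bound of one, which makes the corresponding variable binary.

```python
def trip_upper_bounds(serves, fleet_size):
    """Upper bound on the number of trips per route after one train arrival.

    serves[k][r] is 1 if route r visits stop k, else 0.
    Routes visiting two or more stops get bound 1 (a binary variable);
    single-stop routes keep a generic bound, here the fleet size.
    """
    n_routes = len(serves[0])
    bounds = []
    for r in range(n_routes):
        stops_on_route = sum(row[r] for row in serves)
        bounds.append(1 if stops_on_route >= 2 else fleet_size)
    return bounds
```

For example, with two stops and three routes where only the second route visits both stops, `trip_upper_bounds([[1, 1, 0], [0, 1, 1]], 5)` gives `[5, 1, 5]`.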
3.3 Distributionally Robust Model (DR) for Fleet Sizing and Allocation
The SP formulation in (8) assumes that we know the probability distributions of , where and . However, in reality, it is challenging, if not impossible, to accurately identify (estimate) the true distribution of random parameters for the demand. In this section, we assume that is not perfectly known. However, we know the support (i.e., upper and lower bounds) and the mean values of the random parameters. Mathematically, we consider support
In addition, we let represent the mean (expected) value of . Then we consider the following mean-support ambiguity set :
(11) |
where in represents the set of probability distributions supported on , and each distribution matches the mean values of . Using the ambiguity set , we formulate the fleet sizing and allocation as the following min-max problem:
(12) |
The formulation (12) seeks to identify vehicle sizing, allocation, routing, and scheduling decisions that minimize the worst-case expected cost of passenger waiting time and riding time over a family of distributions of random parameters residing in the ambiguity set for the demand. is the recourse problem defined in (7).
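For intuition about the worst-case expectation over a mean-support ambiguity set, consider the one-dimensional case with a convex cost. There, the supremum is attained by a two-point distribution on the endpoints of the support whose weights reproduce the mean (the classical Edmundson-Madansky bound). The Python sketch below illustrates only this special case; the model in (12) is multi-dimensional, and the function name and the toy cost are hypothetical.

```python
def worst_case_expectation(cost, lower, upper, mean):
    """Sup of E[cost(W)] over distributions on [lower, upper] with E[W] = mean.

    Valid for a convex `cost`: the supremum is attained by the two-point
    distribution on the endpoints whose weights reproduce the mean.
    """
    if not lower <= mean <= upper:
        raise ValueError("mean must lie inside the support")
    if upper == lower:
        return cost(mean)
    p_upper = (mean - lower) / (upper - lower)   # probability mass at `upper`
    return (1.0 - p_upper) * cost(lower) + p_upper * cost(upper)


# Toy cost: passengers beyond 4 available seats are left waiting.
overflow = lambda w: max(w - 4, 0)
worst = worst_case_expectation(overflow, lower=0, upper=10, mean=3)
```

In this toy case the cost at the mean demand is zero, while the worst-case expected overflow is strictly positive, which is why a distributionally robust solution tends to allocate more capacity than one planned for the mean.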
3.4 Reformulation
In this section, we use duality theory and follow a standard approach in distributionally robust optimization to reformulate the min-max model in (12) to one that is solvable. We first consider the inner maximization problem for a fixed vehicle allocation decision and , where is the decision variable—i.e., we are choosing the distribution that maximizes the expected value of .
(13a) | |||
(13b) | |||
(13c) |
where if and if . As we show in the proof of Proposition 2, problem (13) is equivalent to problem (14).
Proposition 2.
For a fixed and , problem (13) is equivalent to
(14) |
Proof.
For a fixed we can formulate problem (13) as the following linear functional optimization problem:
(15a) | |||
(15b) | |||
(15c) |
Letting and be the dual variable associated with constraints (15b) and (15c), respectively, we present problem (15) in its dual form:
(16a) | ||||
(16b) |
where and are unrestricted in sign, and constraint (16b) is associated with the primal variable . Under the standard assumption that belongs to the interior of the set is a probability distribution over support , strong duality holds between (15) and (16) (see Bertsimas and Popescu (2005) for a detailed discussion of this assumption and Jiang et al. (2017); Shehadeh and Padman (2021); Shehadeh and Sanci (2021) for applications). Note that for fixed (), constraints (16b) are equivalent to . Since we are minimizing in (16), the dual formulation of (15) is equivalent to
∎
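The chain of equivalences used in this proof follows a standard pattern, which can be sketched in LaTeX under assumed notation: $Q(y, W)$ for the recourse value, $\mathcal{S}$ for the support, $\mu$ for the mean vector, and $\rho$, $\theta$ for the dual variables of the mean and normalization constraints. These symbols and sign conventions are assumptions and may differ from those of the authors.

```latex
\begin{aligned}
\sup_{\mathbb{P} \ge 0}\ & \int_{\mathcal{S}} Q(y, W)\, d\mathbb{P}
  \quad \text{s.t.}\ \int_{\mathcal{S}} W\, d\mathbb{P} = \mu,\ \
  \int_{\mathcal{S}} d\mathbb{P} = 1 \\
=\ \min_{\rho,\, \theta}\ & \mu^{\top}\rho + \theta
  \quad \text{s.t.}\ \rho^{\top} W + \theta \ge Q(y, W)\ \ \forall\, W \in \mathcal{S} \\
=\ \min_{\rho}\ & \Big\{ \mu^{\top}\rho
  + \max_{W \in \mathcal{S}} \big[\, Q(y, W) - \rho^{\top} W \,\big] \Big\}
\end{aligned}
```

The last step uses the fact that, for fixed $\rho$, the smallest feasible $\theta$ equals the inner maximum, so $\theta$ can be eliminated from the minimization.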
Note that the recourse problem is a minimization problem. Thus, in (14) we have an inner max-min problem. Next, we use properties to derive an equivalent single minimization reformulation of (14). For fixed , , and a realized value of , is a linear program. We formulate in its dual form as
(17a) | ||||
(17b) | ||||
(17c) | ||||
(17d) |
where and are the dual variables associated with constraints (7b)–(7c) and (7d), respectively. Given the dual formulation of in (17) and a fixed and feasible (), we can rewrite the inner maximization problem in (14) as follows
(18a) | ||||
s.t. | (18b) |
As we show in the proof of Proposition 3 in Appendix A, for fixed and , problem (18) is equivalent to the minimization problem in (19).
Proposition 3.
For fixed and problem (18) is equivalent to
(19a) | ||||
s.t. | (19b) | |||
(19c) | ||||
(19d) | ||||
(19e) |
4 Computational Experiments and Implications
The primary objective of our computational study is to evaluate the computational and operational performance of the proposed models. We solve the two-stage SP in (8) via the sample average approximation (SAA) approach in Appendix B (see, e.g., Kim et al. (2015); Kleywegt et al. (2002) for a detailed discussion of SAA). Section 4.1 presents the details of data generation and experimental design. In Section 4.2, we evaluate the solution times of the SP and DR models. In Section 4.3, we evaluate the optimal solutions of the SP and DR models and their out-of-sample simulation performance. We close by analyzing the sensitivity of the optimal solutions to different parameter settings in Section 4.4.
4.1 Experimental Design and Computational Setup
We first construct four instances (instances 1-4 henceforth), in part based on the parameter settings and assumptions made by Wang (2019), which addresses the deterministic counterpart LMTS routing and scheduling problem for one service region. We summarize our test instances in Table 2. Each of the four instances is characterized by the number of regions , number of last-mile stops in each region , and number of routes in each region . The sizes of the instances vary and correspond to different practical contexts. In addition to these four instances, we construct an instance based on actual on-demand transportation data from New York City (NYC instance henceforth)¹, which consists of regions. The procedure to construct the NYC instance, the details of last-mile stops and routes, and the empirical statistics of batch demand are summarized in Appendix C.
¹ Source: https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page. The dataset contains yellow and green taxi trip records, which include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations, and driver-reported passenger counts. It was collected and provided to the NYC Taxi and Limousine Commission (TLC) by technology providers authorized under the Taxicab & Livery Passenger Enhancement Programs (TPEP/LPEP). Although it is not a demand record for an existing LMTS, the dataset contains real information on on-demand passengers and thus reflects the actual spatial and temporal patterns and uncertainty of the demand.
To generate demand profiles for instances 1-4, we use a magnitude of passenger numbers similar to that in the LMTS literature (e.g., Wang and Odoni (2016); Wang (2019)), as well as the same random parameter generation procedures as in the distributionally robust scheduling and optimization literature (e.g., Jiang et al. (2017); Shehadeh (2021); Mak et al. (2014)). For instances 1-4, we randomly generate the mean values of from a uniform distribution and standard deviation . We randomly generate in-sample realizations by following lognormal (LogN) distributions with the generated and , for all . LogN is a standard distribution for modeling customer demand and service times in a wide range of applications. The results of Kamath and Pakkala (2002) suggest that the LogN is a suitable distribution for modeling stochastic demands in an economic context. For the NYC instance, in Table 2 in Appendix C, we present the empirical and of batch demand .
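The sampling step described above can be sketched in Python using only the standard library. Because `random.lognormvariate` takes the parameters of the underlying normal distribution, the target mean and standard deviation are first converted with the standard moment identities. The function names, the seed, and the rounding of realizations to integer passenger counts are assumptions made for this illustration.

```python
import math
import random


def lognormal_params(mean, std):
    """Parameters (mu, sigma) of the underlying normal for a lognormal
    variable with the given mean and standard deviation."""
    sigma2 = math.log(1.0 + (std / mean) ** 2)
    return math.log(mean) - 0.5 * sigma2, math.sqrt(sigma2)


def sample_demand(mean, std, n, seed=0):
    """Draw n lognormal demand realizations, rounded to integer counts
    (the rounding is an assumption of this sketch)."""
    mu, sigma = lognormal_params(mean, std)
    rng = random.Random(seed)
    return [round(rng.lognormvariate(mu, sigma)) for _ in range(n)]
```

A call such as `sample_demand(10.0, 5.0, 1000)` would produce one in-sample data set for a single stop and train; the values 10 and 5 are placeholders, not the settings used in the experiments.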
According to Gomez-Ibanez et al. (1999), for work trips in San Francisco, the monetary value of a unit of transfer waiting time is 195% of the user’s after-tax wages, and the monetary value of a unit of in-vehicle riding time is 76% of the user’s after-tax wages. In general, we should have in the objective function. Following similar parameter selections as in Wang (2019), we normalize and (we perform sensitivity analysis in Section 4.4). As for the vehicle fixed cost , if the service provider already has a fleet of vehicles, could be very small; if the service provider rents vehicles, should include the rental fee and operating cost (e.g., fuel and gas); if the service provider needs to purchase a new fleet of vehicles, may be very large and depreciation should be considered. Unless stated otherwise, we test two values of : (1) (ignoring the fixed cost and assuming an existing vehicle fleet), and (2) from a range of [2,000, 6,000] (renting vehicles to serve passengers who are less sensitive to riding and waiting times)², considering vehicle rental fee and/or depreciation and operating cost (see Appendix D). We investigate the impact of in Section 4.4.
² The range is approximated considering the general after-tax wage for city residents, the rental price of vehicles with capacity 4-10, vehicle price and possible depreciation, fuel cost, etc.
As in prior applied distributionally robust literature (see, e.g., Jiang et al. (2017), Shehadeh and Sanci (2021), Shehadeh and Padman (2021), and references therein), we respectively use the -quantile and -quantile values of the in-sample data to approximate the lower and upper bounds of . We optimize the SP model by using all of the data points, and the DR model with the corresponding mean, lower bounds, and upper bounds. We implemented the models using AMPL2016 programming language calling CPLEX V12.6.2 as a solver with default settings. We ran all experiments on a computer with an Intel Core i7 processor, 2.5 GHz CPU, and 16 GB (1600MHz DDR3) of memory, and imposed a solver time limit of 1 hour.
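The data preparation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the mean, standard deviation, sample size, and the two quantile levels (0.2 and 0.8) are hypothetical placeholders, since the values used in the paper are not reproduced here, and demand is left continuous rather than rounded to passenger counts.

```python
import numpy as np

def lognormal_params(mean, std):
    """Map a target mean/std of demand to the (mu, sigma) of the underlying normal."""
    sigma2 = np.log(1.0 + (std / mean) ** 2)
    return np.log(mean) - 0.5 * sigma2, np.sqrt(sigma2)

def in_sample_summary(mean, std, n_samples, q_low, q_high, seed=0):
    """Draw LogN in-sample data and summarize it for the SP and DR models."""
    rng = np.random.default_rng(seed)
    mu, sigma = lognormal_params(mean, std)
    data = rng.lognormal(mean=mu, sigma=sigma, size=n_samples)
    return {
        "data": data,                          # SP (SAA) uses every data point
        "mean": data.mean(),                   # DR uses the sample mean ...
        "lower": np.quantile(data, q_low),     # ... an approximate lower bound ...
        "upper": np.quantile(data, q_high),    # ... and an approximate upper bound
    }

# Hypothetical inputs: mean 3 passengers, std 1.5, 1,000 scenarios, 20%/80% quantiles.
summary = in_sample_summary(mean=3.0, std=1.5, n_samples=1000, q_low=0.2, q_high=0.8)
print({k: round(float(v), 2) for k, v in summary.items() if k != "data"})
```

The SP model would consume `summary["data"]` directly, whereas the DR model only needs the three summary statistics, which is one reason it is less sensitive to the exact shape of the in-sample distribution.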
| Inst | | | | | Inst | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 4 | 12 | 4 | 10 | 2 | 4 | 12 | 4 | 13 |
| | | 12 | 6 | 23 | | | 12 | 6 | 31 |
| | | 12 | 6 | 30 | | | 12 | 6 | 24 |
| | | 12 | 8 | 39 | | | 12 | 8 | 40 |
| 3 | 5 | 12 | 4 | 13 | 4 | 6 | 12 | 4 | 13 |
| | | 12 | 6 | 31 | | | 12 | 6 | 31 |
| | | 12 | 6 | 24 | | | 12 | 6 | 24 |
| | | 12 | 8 | 40 | | | 12 | 8 | 40 |
| | | 12 | 8 | 49 | | | 12 | 8 | 49 |
| | | | | | | | 12 | 8 | 59 |
4.2 CPU Time
In this section, we analyze the solution times of the SP and DR models. For each instance in Table 2, we first generate the mean demand for each last-mile stop in each service region from (low to average demand) and (high demand), and set . Then we generate the in-sample data from LogN using the generated and . To approximate the lower and upper bounds of , we respectively use the -quantile and -quantile values of the in-sample data. We optimize the SP model using all of the data points, and the DR model with the corresponding mean, lower bounds, and upper bounds. We use , , and . For each instance, we impose a time limit of 7,200 seconds (i.e., 2 hours).
Our choice of the sample size to solve the SAA of SP was motivated by the trade-off between the computational effort required to solve the resulting mixed-integer linear programs (MILPs) and the quality of the approximation of the expected-value objective of SP by its SAA. On the one hand, the sizes of the MILP instances, and hence their solution times, increase with the sample size. On the other hand, optimal solutions of SAA instances with larger sample sizes are likely to be closer to optimal for the true expected-value objective.
The literature on the SAA method provides theoretical insights and guidance for selecting a sample size from this perspective. We implemented the so-called Monte Carlo Optimization (MCO) procedure to compute statistical lower and upper bounds on the optimal value of SP based on an optimal solution to its SAA approximation, which in turn provides a statistical estimate of the relative approximation gap between the optimal value of SP and its SAA approximation (see, e.g., Homem-de Mello and Bayraksan (2014) and Linderoth et al. (2006) for a thorough discussion of MCO). Applying the MCO procedure to our SP model with , we estimate the relative approximation gaps for the SP instances described in Table 2 to range between 1% and 5%. In contrast, larger sample sizes resulted in longer solution times without consistent and significant improvements in the relative approximation gaps. Based on these considerations, we selected for our computational experiments.
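To make the MCO idea concrete, the sketch below applies it to a deliberately simple stand-in problem rather than to the SP model of this paper: a single fleet size is chosen to balance a fixed cost against a shortage penalty under LogN demand. All parameter values and function names are hypothetical. The statistical lower bound is the average optimal value over independent SAA replications, and the statistical upper bound is the cost of one candidate solution evaluated on a large independent sample.

```python
import numpy as np

# Hypothetical toy parameters: unit fixed cost, unit shortage penalty, max fleet size.
F, G, K = 4.0, 10.0, 40

def sample_demand(rng, n, mean=10.0, std=5.0):
    """Draw n LogN demand scenarios with the given mean and standard deviation."""
    sigma2 = np.log(1.0 + (std / mean) ** 2)
    return rng.lognormal(np.log(mean) - 0.5 * sigma2, np.sqrt(sigma2), size=n)

def saa_cost(x, demand):
    """Sample-average cost of fleet size x: fixed cost plus mean shortage penalty."""
    return F * x + G * np.maximum(demand - x, 0.0).mean()

def solve_saa(demand):
    """Solve the toy SAA exactly by enumerating fleet sizes 0..K."""
    costs = [saa_cost(x, demand) for x in range(K + 1)]
    x_best = int(np.argmin(costs))
    return x_best, float(costs[x_best])

def mco_gap(n=50, replications=20, n_eval=100_000, seed=0):
    """Return (candidate, lower bound, upper bound, relative gap) for sample size n."""
    rng = np.random.default_rng(seed)
    results = [solve_saa(sample_demand(rng, n)) for _ in range(replications)]
    lower = float(np.mean([value for _, value in results]))   # estimates E[v_N] <= v*
    x_hat = results[0][0]                                      # any SAA solution is a valid candidate
    upper = float(saa_cost(x_hat, sample_demand(rng, n_eval))) # out-of-sample cost >= v*
    return x_hat, lower, upper, (upper - lower) / upper

x_hat, lower, upper, gap = mco_gap()
print(f"candidate={x_hat}, LB={lower:.2f}, UB={upper:.2f}, gap={100 * gap:.1f}%")
```

Rerunning `mco_gap` with larger `n` shows the trade-off discussed above: the gap estimate tends to shrink, while each SAA replication becomes more expensive to solve (trivially so here, but substantially for a MILP).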
Inst | Range | DR | SAA | Range | DR | SAA | ||||
---|---|---|---|---|---|---|---|---|---|---|
1 | 40 | 0 | 0.30 | 197 | 40 | 0 | 1.67 | 2,477 | ||
4,000 | 1.7 | 7,110 | 4,000 | 0.2 | 7,200 | |||||
60 | 0 | 0.25 | 60 | 60 | 0 | 2 | 305 | |||
4,000 | 25 | 7,200 | 4,000 | 13 | 7,200 | |||||
80 | 0 | 0.22 | 31 | 80 | 0 | 0.27 | 230 | |||
4,000 | 18 | 7,200 | 4,000 | 12 | 2,542 | |||||
2 | 40 | 0 | 0.3 | 7,200 | 40 | 0 | 1 | 7,200 | ||
4,000 | 2 | 7,200 | 4,000 | 1 | 7,200 | |||||
60 | 0 | 1 | 0.13 | 60 | 0 | 1 | 91 | |||
4,000 | 2 | 7,200 | 4,000 | 1 | 7,200 | |||||
80 | 0 | 0.16 | 20 | 80 | 0 | 0.3 | 54 | |||
4,000 | 2 | 7,200 | 4,000 | 2 | 4,290 | |||||
3 | 40 | 0 | 9 | 7,202 | 40 | 0 | 1 | 7,200 | ||
4,000 | 3 | 7,200 | 4,000 | 1 | 7,200 | |||||
60 | 0 | 500 | 60 | 0 | 9 | 7,200 | ||||
4,000 | 38 | 7,200 | 4,000 | 2 | 7,200 | |||||
80 | 0 | 1 | 38 | 80 | 0 | 3 | 160 | |||
4,000 | 2.3 | 7,200 | 4,000 | 2 | 7,200 | |||||
4 | 40 | 0 | 3,600 | – | 40 | 0 | 1 | – | ||
4,000 | 32 | – | 4,000 | 1 | – | |||||
60 | 0 | 4 | – | 60 | 0 | 1 | – | |||
4,000 | 22 | – | 4,000 | 1 | – | |||||
80 | 0 | 0.30 | – | 80 | 0 | 4 | 10% | |||
4,000 | 50 | – | 4,000 | 40 | 33% |
Table 3 presents the solution times (in seconds) of the SP and DR models for different values of (maximum number of vehicles). We first observe that the DR model quickly solves all instances under the two ranges of the average number of passengers and all values of and , and does so significantly faster than the SP model. In contrast, the SP model fails to solve any of the SAA instances corresponding to instance 4 to optimality within the time limit; it terminates either with a large relative MIP (relMIP) gap (relMIP := (UB − LB)/UB × 100%, where UB is the best upper bound and LB is the linear programming relaxation-based lower bound obtained at termination) or without any feasible MIP solution, and thus without an upper bound. Additionally, the SP’s solution times differ under the two ranges of the average number of passengers and values of and .
Under zero vehicle fixed cost, the two models allocate all available vehicles. The SP’s solution times decrease as the maximum number of vehicles increases from 40 to 80. Consider instance 1, for example: the SP’s solution times decrease from 197 sec and 2,477 sec to 31 sec and 230 sec under demand ranges [1,4] and [3,7], respectively. A possible explanation is that when the maximum number of vehicles is large, passenger demand can be satisfied by many different vehicle allocation, routing, and scheduling decisions. That is, we can allocate enough vehicles to each region to transport passengers to their last-mile stops via direct routes to those stops (e.g., each route serves only one stop). When the maximum number of vehicles is small, vehicle allocation, routing, and scheduling decisions are subtle and difficult to optimize.
Under a vehicle fixed cost of 4,000, the two models allocate only a subset of the vehicles to minimize the total cost function (see Table 4). The larger SP solution times indicate that the SP’s routing and scheduling decisions are subtle and difficult to optimize in this case.
4.3 Analysis of optimal solutions
In this section, we compare the DR and SP optimal vehicle sizing and allocation decisions and their out-of-sample performance using the same settings as in Section 4.2. Under zero vehicle fixed cost, the two models have similar performance for all three instances. We show such results for instance 1 in Table 1 in Appendix E. Therefore, in this section we mainly compare the optimal sizing and allocation decisions under a positive vehicle fixed cost. For presentation brevity and illustrative purposes, we fix and present results for instances 1–3 and NYC, as the SAA-SP can solve all such instances.
First, we analyze the optimal vehicle sizing and allocation decisions yielded by the DR and SP models, which are presented in Table 4. From this table, we first observe that, for instances 1–3, both models allocate more vehicles to each service region under demand range [3,7] than under [1,4]. This makes sense, as a larger batch of passengers arrives with each train in the former case. Second, we observe that by incorporating ambiguity in the number of passengers arriving at each station after each train in each service region, the DR model always allocates more vehicles in total than the SP model, and at least as many in every region. As we show next, allocating more vehicles results in a better quality of service in terms of passenger waiting and riding times, but a higher vehicle fixed cost.
| Inst | Range | Model | Total | Region 1 | Region 2 | Region 3 | Region 4 | Region 5 |
|---|---|---|---|---|---|---|---|---|
| 1 | [1,4] | DR | 28 | 4 | 7 | 7 | 10 | |
| | | SP | 16 | 2 | 4 | 4 | 6 | |
| | [3,7] | DR | 59 | 11 | 16 | 12 | 20 | |
| | | SP | 36 | 7 | 9 | 8 | 12 | |
| 2 | [1,4] | DR | 22 | 2 | 6 | 6 | 8 | |
| | | SP | 17 | 2 | 4 | 5 | 6 | |
| | [3,7] | DR | 50 | 7 | 11 | 12 | 19 | |
| | | SP | 28 | 5 | 7 | 7 | 10 | |
| NYC | | DR | 21 | 3 | 6 | 7 | 5 | |
| | | SP | 10 | 2 | 3 | 3 | 2 | |
| 3 | [1,4] | DR | 31 | 2 | 6 | 6 | 8 | 9 |
| | | SP | 20 | 2 | 4 | 4 | 5 | 5 |
| | [3,7] | DR | 60 | 8 | 10 | 11 | 16 | 15 |
| | | SP | 42 | 5 | 7 | 7 | 11 | 12 |
Next, we analyze the in-sample performance of the optimal vehicle sizing and allocation decisions of DR and SP under “perfect information” (known distributions) and out-of-sample performance with “misspecified distribution information.” Specifically, we simulate the optimal solutions of DR and SP using the following two sets of out-of-sample data , for all .
1. Perfect distribution information: The data follow the same LogN distribution as the in-sample data.
2. Misspecified distribution information: To simulate the out-of-sample performance of the DR and SP optimal solutions when the in-sample data are biased, we keep the same mean , standard deviation , and range values of as in the in-sample data, but we vary the distribution type of to generate the data. Specifically, we follow a Uniform distribution to generate realizations , for all . We follow the same standard statistical method as in prior distributionally robust literature to design the parameters of the joint Uniform distribution with varying levels of correlation, while keeping the mean and support of the out-of-sample data the same as those of the in-sample data.
We evaluate the out-of-sample performance of the optimal SP and DR solutions as follows. First, we fix the optimal first-stage allocation decisions (, for all ) in the SP model. Then we simulate the second-stage recourse problem with dynamic routing using the data to compute passenger waiting and riding time costs.
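The evaluation loop just described can be sketched as below. This is only an illustration under strong simplifying assumptions: the second-stage routing problem is replaced by a hypothetical stand-in recourse (a penalty on passengers exceeding the seats allocated to a region), and the fixed cost, capacity, weight, and demand range are placeholder values.

```python
import numpy as np

def evaluate_allocation(x, demand, fixed_cost, capacity, wait_weight):
    """Simulate a fixed first-stage allocation x over demand scenarios.

    demand has shape (scenarios, regions). The recourse below is a hypothetical
    stand-in for the routing and scheduling problem solved in the paper.
    """
    x = np.asarray(x, dtype=float)
    unserved = np.maximum(demand - capacity * x, 0.0)   # passengers left waiting
    second_stage = wait_weight * unserved.sum(axis=1)   # one cost per scenario
    total = fixed_cost * x.sum() + second_stage         # first stage is constant

    def stats(values):
        return {
            "Mean": float(values.mean()),
            "Median": float(np.quantile(values, 0.50)),
            "75%-q": float(np.quantile(values, 0.75)),
            "95%-q": float(np.quantile(values, 0.95)),
        }

    return {"TC": stats(total), "2nd-stage": stats(second_stage)}

# Hypothetical out-of-sample data: 10,000 scenarios for four regions.
rng = np.random.default_rng(1)
demand = rng.uniform(low=10.0, high=60.0, size=(10_000, 4))
report = evaluate_allocation(x=[4, 7, 7, 10], demand=demand,
                             fixed_cost=4000.0, capacity=4, wait_weight=2.0)
for metric in ("Mean", "Median", "75%-q", "95%-q"):
    print(metric, round(report["TC"][metric]), round(report["2nd-stage"][metric]))
```

Because the first-stage cost does not depend on the scenario, every reported statistic of the total cost equals the fixed cost plus the same statistic of the second-stage cost, which is a useful consistency check when reading cost tables of this kind.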
Table 5 presents the means and quantiles of the total cost (TC), second-stage cost (2nd-stage), total waiting time per region (TWT), and total riding time per region (TRT) yielded by the optimal solutions of the DR and SP for instances 1–3 under perfect distributional information (i.e., LogN distribution). Table 6 presents the results for the NYC instance. Clearly, by allocating more vehicles in each region, the DR results in a higher vehicle fixed (one-time) cost, and thus a higher total cost, than the SP. However, the DR also results in a significantly lower second-stage cost and, in particular, substantially lower waiting time on average and at all quantiles, and hence offers a better quality of service and greater passenger satisfaction. For example, consider instance 2 and
| Inst | R | Metric | Model | TC | 2nd-stage | TWT 1 | TWT 2 | TWT 3 | TWT 4 | TWT 5 | TRT 1 | TRT 2 | TRT 3 | TRT 4 | TRT 5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | [1,4] | Mean | DR | 117,944 | 5,944 | 354 | 68 | 104 | 111 | | 708 | 1,043 | 1,129 | 1,793 | |
| | | | SP | 88,097 | 24,097 | 2,792 | 2,102 | 2,007 | 2,826 | | 604 | 1,061 | 1,140 | 1,837 | |
| | | Median | DR | 117,836 | 5,836 | 320 | 50 | 100 | 110 | | 708 | 1,045 | 1,128 | 1,795 | |
| | | | SP | 87,866 | 23,866 | 2,770 | 2,090 | 1,980 | 2,770 | | 604 | 1,066 | 1,140 | 1,836 | |
| | | 75%-q | DR | 118,275 | 6,275 | 400 | 80 | 120 | 130 | | 732 | 1,077 | 1,164 | 1,842 | |
| | | | SP | 90,959 | 27,959 | 3,110 | 2,420 | 2,340 | 3,230 | | 614 | 1,086 | 1,174 | 1,885 | |
| | | 95%-q | DR | 119,440 | 7,440 | 670 | 190 | 170 | 180 | | 760 | 1,121 | 1,218 | 1,921 | |
| | | | SP | 95,817 | 31,817 | 3,610 | 2,980 | 2,960 | 3,910 | | 625 | 1,110 | 1,213 | 1,949 | |
| | [3,7] | Mean | DR | 245,399 | 9,399 | 26 | 19 | 50 | 9 | | 1,779 | 2,155 | 2,171 | 3,087 | |
| | | | SP | 176,342 | 32,342 | 1,681 | 2,686 | 3,238 | 3,910 | | 1,811 | 2,162 | 2,211 | 3,129 | |
| | | Median | DR | 245,353 | 9,353 | 20 | 10 | 40 | 10 | | 1,777 | 2,156 | 2,171 | 3,089 | |
| | | | SP | 175,619 | 31,619 | 1,600 | 2,580 | 3,150 | 3,820 | | 1,809 | 2,164 | 2,273 | 3,134 | |
| | | 75%-q | DR | 245,794 | 9,794 | 40 | 30 | 70 | 10 | | 1,844 | 2,232 | 2,244 | 3,174 | |
| | | | SP | 181,013 | 37,013 | 2,010 | 3,280 | 3,820 | 4,600 | | 1,879 | 2,236 | 2,273 | 3,205 | |
| | | 95%-q | DR | 246,447 | 10,447 | 70 | 60 | 110 | 30 | | 1,932 | 2,321 | 2,359 | 3,295 | |
| | | | SP | 189,398 | 45,398 | 2,700 | 4,200 | 4,890 | 5,950 | | 1,974 | 2,313 | 2,355 | 3,276 | |
| 2 | [1,4] | Mean | DR | 94,617 | 6,617 | 613 | 139 | 215 | 301 | | 532 | 921 | 1,211 | 1,418 | |
| | | | SP | 78,238 | 10,238 | 485 | 1,098 | 348 | 1,083 | | 514 | 968 | 1,271 | 1,454 | |
| | | Median | DR | 94,421 | 6,421 | 590 | 100 | 200 | 280 | | 530 | 922 | 1,209 | 1,420 | |
| | | | SP | 78,049 | 10,049 | 470 | 1,070 | 330 | 1,050 | | 514 | 968 | 1,268 | 1,459 | |
| | | 75%-q | DR | 95,137 | 7,137 | 680 | 170 | 260 | 350 | | 551 | 947 | 1,250 | 1,469 | |
| | | | SP | 79,282 | 11,282 | 540 | 1,290 | 400 | 1,240 | | 532 | 995 | 1,313 | 1,502 | |
| | | 95%-q | DR | 96,619 | 8,619 | 860 | 390 | 390 | 460 | | 583 | 979 | 1,317 | 1,540 | |
| | | | SP | 81,439 | 13,439 | 680 | 1,650 | 570 | 1,550 | | 557 | 1,034 | 1,384 | 1,564 | |
| | [3,7] | Mean | DR | 207,918 | 7,918 | 175 | 84 | 97 | 32 | | 1,299 | 1,500 | 2,100 | 2,241 | |
| | | | SP | 144,785 | 32,785 | 1,194 | 3,357 | 3,070 | 5,123 | | 1,362 | 1,625 | 2,176 | 2,134 | |
| | | Median | DR | 207,836 | 7,836 | 220 | 70 | 90 | 30 | | 1,300 | 1,498 | 2,101 | 2,237 | |
| | | | SP | 144,200 | 32,200 | 1,160 | 3,290 | 2,970 | 5,030 | | 1,358 | 1,628 | 2,177 | 2,137 | |
| | | 75%-q | DR | 208,441 | 8,441 | 220 | 120 | 130 | 50 | | 1,351 | 1,553 | 2,171 | 2,326 | |
| | | | SP | 149,331 | 37,331 | 1,490 | 3,890 | 3,620 | 5,920 | | 1,416 | 1,664 | 2,239 | 2,172 | |
| | | 95%-q | DR | 209,384 | 9,384 | 320 | 200 | 200 | 80 | | 1,424 | 1,623 | 2,302 | 2,435 | |
| | | | SP | 157,029 | 45,029 | 2,000 | 4,760 | 4,710 | 7,170 | | 1,490 | 1,714 | 2,318 | 2,227 | |
| 3 | [1,4] | Mean | DR | 132,669 | 8,669 | 853 | 104 | 125 | 232 | 190 | 475 | 953 | 1,229 | 1,405 | 1,596 |
| | | | SP | 102,232 | 22,232 | 513 | 1,004 | 1,211 | 2,751 | 2,752 | 522 | 1,000 | 1,326 | 1,320 | 1,599 |
| | | Median | DR | 132,460 | 8,460 | 840 | 80 | 90 | 210 | 190 | 476 | 955 | 1,226 | 1,407 | 1,596 |
| | | | SP | 101,955 | 21,955 | 510 | 980 | 1,170 | 2,730 | 2,700 | 522 | 1,001 | 1,328 | 1,320 | 1,604 |
| | | 75%-q | DR | 133,344 | 9,344 | 940 | 140 | 170 | 280 | 220 | 492 | 983 | 1,272 | 1,450 | 1,647 |
| | | | SP | 104,734 | 24,734 | 580 | 1,180 | 1,420 | 3,130 | 3,100 | 541 | 1,026 | 1,372 | 1,344 | 1,631 |
| | | 95%-q | DR | 135,111 | 11,111 | 1,110 | 270 | 370 | 440 | 310 | 517 | 1,020 | 1,339 | 1,524 | 1,711 |
| | | | SP | 109,032 | 29,032 | 720 | 1,510 | 1,830 | 3,700 | 3,700 | 570 | 1,067 | 1,426 | 1,381 | 1,668 |
| | [3,7] | Mean | DR | 252,442 | 12,442 | 84 | 107 | 96 | 184 | 648 | 1,299 | 1,571 | 2,221 | 2,293 | 3,155 |
| | | | SP | 203,413 | 35,413 | 1,005 | 1,683 | 3,284 | 3,530 | 2,940 | 1,366 | 1,677 | 2,154 | 2,136 | 3,195 |
| | | Median | DR | 252,740 | 12,740 | 80 | 100 | 80 | 130 | 600 | 1,299 | 1,571 | 2,221 | 2,292 | 3,154 |
| | | | SP | 202,673 | 34,673 | 940 | 1,610 | 3,200 | 3,440 | 2,880 | 1,363 | 1,679 | 2,158 | 2,136 | 3,197 |
| | | 75%-q | DR | 254,142 | 14,142 | 110 | 140 | 120 | 190 | 790 | 1,348 | 1,623 | 2,293 | 2,375 | 3,243 |
| | | | SP | 208,480 | 40,480 | 1,270 | 2,060 | 3,880 | 4,180 | 3,440 | 1,419 | 1,732 | 2,215 | 2,179 | 3,275 |
| | | 95%-q | DR | 253,026 | 13,026 | 170 | 220 | 250 | 340 | 1,180 | 1,425 | 1,701 | 2,417 | 2,499 | 3,364 |
| | | | SP | 217,696 | 49,696 | 1,850 | 2,790 | 4,940 | 5,300 | 4,360 | 1,497 | 1,805 | 2,283 | 2,248 | 3,383 |
| Metric | Model | TC | 2nd-stage | TWT 1 | TWT 2 | TWT 3 | TWT 4 | TRT 1 | TRT 2 | TRT 3 | TRT 4 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mean | DR | 86,018 | 2,018 | 142 | 92 | 31 | 16 | 223 | 473 | 544 | 212 |
| | SP | 50,210 | 10,210 | 817 | 1,243 | 1,445 | 658 | 325 | 628 | 630 | 296 |
| Median | DR | 85,982 | 1,982 | 140 | 90 | 30 | 10 | 221 | 472 | 538 | 211 |
| | SP | 49,341 | 9,341 | 730 | 1,160 | 1,270 | 580 | 319 | 625 | 625 | 292 |
| 75%-q | DR | 86,259 | 2,259 | 170 | 110 | 40 | 30 | 224 | 507 | 591 | 237 |
| | SP | 52,374 | 12,374 | 1,010 | 1,510 | 1,810 | 820 | 367 | 691 | 681 | 335 |
| 95%-q | DR | 86,727 | 2,727 | 210 | 150 | 70 | 50 | 278 | 549 | 662 | 278 |
| | SP | 58,652 | 18,652 | 1,590 | 2,170 | 3,050 | 1,310 | 443 | 788 | 788 | 393 |
range [1,4]. By allocating 22 and 17 vehicles, respectively, the DR and SP incur vehicle fixed costs of 88,000 and 68,000. However, the average second-stage cost and total waiting time (over all regions) of the SP are 55% and 137% higher than those of the DR, respectively. It is not surprising that the total cost of the SP is lower than that of the DR, since we assume perfect information about the exact demand distribution in this case.
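The figures quoted for instance 2 with range [1,4] can be reproduced from the mean rows of Table 5, reading each percentage as the SP value relative to the DR value. The short check below is a sketch; the unit fixed cost of 4,000 is not stated in this passage and is inferred here from 88,000 / 22.

```python
# Mean rows of Table 5 for instance 2, demand range [1,4].
dr = {"vehicles": 22, "second_stage": 6_617, "twt": [613, 139, 215, 301]}
sp = {"vehicles": 17, "second_stage": 10_238, "twt": [485, 1_098, 348, 1_083]}

UNIT_FIXED_COST = 4_000  # inferred from 88,000 / 22; an assumption of this sketch

def pct_higher(value, baseline):
    """Percentage by which value exceeds baseline."""
    return 100.0 * (value - baseline) / baseline

fixed_dr = UNIT_FIXED_COST * dr["vehicles"]
fixed_sp = UNIT_FIXED_COST * sp["vehicles"]
second_stage_gap = pct_higher(sp["second_stage"], dr["second_stage"])
waiting_gap = pct_higher(sum(sp["twt"]), sum(dr["twt"]))

print(fixed_dr, fixed_sp)             # vehicle fixed costs of DR and SP
print(f"{second_stage_gap:.1f}%")     # SP second-stage cost relative to DR
print(f"{waiting_gap:.1f}%")          # SP total waiting time relative to DR
```

Measured against the SP instead, the DR's second-stage cost is about 35% lower, which is why the direction of the comparison matters when quoting these numbers.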
Table 7 presents the means and quantiles of the total and second-stage costs of the optimal solutions of DR and SP under misspecified distributional information. For total cost, the out-of-sample results in Table 7 show that there is still no clear winner for instances 1–3, although DR has significantly lower costs in some worst-case instances, for example in instance 2 with the larger demand range [3,7] (the average TC of DR and SP are $215,208 and $226,406, respectively). For second-stage costs, DR outperforms SP significantly, since DR is designed to be robust against worst-case demand distributions in the second stage. Interestingly, the DR model provides significantly lower total and second-stage costs than the SP model for the NYC instance, which may be due to the much stronger demand uncertainty in the NYC instance constructed using real data. These simulation results demonstrate the value of incorporating both uncertainty and ambiguity into fleet sizing, allocation, routing, and scheduling models.
4.4 Sensitivity Analysis
In this section, we study the sensitivity of DR and SP solutions to various input parameter settings. For illustrative purposes and presentation brevity, we consider instance 1 for this experiment (we observe similar results for instances 2-4 and the NYC instance). For each experiment, we simulate the optimal solutions of the two-stage DR and two-stage SP (called TSM henceforth) under a sample of 10,000 scenarios of the number of passengers demanding last-mile service.
| Instance | R | Metric | Model | TC | 2nd-Stage | Instance | R | Metric | Model | TC | 2nd-Stage |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | [1,4] | Mean | DR | 126,262 | 14,262 | 3 | [1,4] | Mean | DR | 144,217 | 20,217 |
| | | | SP | 116,912 | 52,912 | | | | SP | 139,367 | 43,367 |
| | | Median | DR | 123,791 | 11,791 | | | Median | DR | 148,987 | 24,987 |
| | | | SP | 119,346 | 55,346 | | | | SP | 148,516 | 52,515 |
| | | 75%-q | DR | 138,444 | 26,444 | | | 75%-q | DR | 156,193 | 32,193 |
| | | | SP | 145,899 | 81,899 | | | | SP | 162,518 | 66,518 |
| | | 95%-q | DR | 138,444 | 26,444 | | | 95%-q | DR | 156,193 | 32,193 |
| | | | SP | 145,899 | 81,899 | | | | SP | 162,518 | 66,518 |
| | [3,7] | Mean | DR | 257,150 | 21,150 | | [3,7] | Mean | DR | 282,210 | 42,210 |
| | | | SP | 216,848 | 68,848 | | | | SP | 314,953 | 146,953 |
| | | Median | DR | 252,486 | 16,486 | | | Median | DR | 272,873 | 32,873 |
| | | | SP | 210,143 | 62,143 | | | | SP | 293,358 | 125,358 |
| | | 75%-q | DR | 272,729 | 36,729 | | | 75%-q | DR | 321,239 | 81,239 |
| | | | SP | 254,225 | 106,225 | | | | SP | 441,640 | 273,640 |
| | | 95%-q | DR | 277,311 | 41,311 | | | 95%-q | DR | 329,795 | 89,795 |
| | | | SP | 269,561 | 121,561 | | | | SP | 461,111 | 293,111 |
| 2 | [1,4] | Mean | DR | 102,883 | 14,883 | NYC | | Mean | DR | 87,093 | 3,093 |
| | | | SP | 94,451 | 26,451 | | | | SP | 147,923 | 107,923 |
| | | Median | DR | 106,262 | 18,262 | | | Median | DR | 87,165 | 3,165 |
| | | | SP | 101,276 | 33,276 | | | | SP | 114,264 | 103,443 |
| | | 75%-q | DR | 110,987 | 22,987 | | | 75%-q | DR | 88,210 | 4,210 |
| | | | SP | 108,761 | 40,762 | | | | SP | 182,414 | 168,191 |
| | | 95%-q | DR | 110,987 | 22,987 | | | 95%-q | DR | 90,838 | 6,838 |
| | | | SP | 108,761 | 40,762 | | | | SP | 262,214 | 222,791 |
| | [3,7] | Mean | DR | 215,208 | 15,208 | | | | | | |
| | | | SP | 226,406 | 110,406 | | | | | | |
| | | Median | DR | 210,996 | 10,995 | | | | | | |
| | | | SP | 223,017 | 107,017 | | | | | | |
| | | 75%-q | DR | 230,932 | 30,932 | | | | | | |
| | | | SP | 298,678 | 182,678 | | | | | | |
| | | 95%-q | DR | 233,606 | 33,605 | | | | | | |
| | | | SP | 313,528 | 197,528 | | | | | | |
Impact of variability in demand/demand ranges
First, we analyze the sensitivity of the DR and SP solutions to the variability and volume of the number of passengers arriving at each service region with each train. In addition to the base demand range (Range 1, [1,4]), we consider four additional ranges: [1,6], [1,8], [4,7], and [6,9]. In [1,6] and [1,8], we increase the variability of demand (the difference between its lower and upper bounds) from 3 to 5 and 7, respectively. In [4,7] and [6,9], we keep the difference between the upper and lower bounds as in the base range (i.e., 3) and increase the demand volume by shifting both bounds upward. We keep cost parameters as in the base case settings, i.e., , , and .
Figure 1 presents the optimal fleet size (i.e., ) and the average second-stage cost (waiting+riding time costs) as a function of the demand range. It is quite apparent from Figure 1 that both models tend to allocate more vehicles under higher variability and volume of demand. By allocating more vehicles, the DR mitigates the increase in passengers’ variability and volume by maintaining significantly lower waiting and riding time costs.


Impact of cost parameters
Next, we analyze the sensitivity of the DR and SP solutions to the cost parameters. We fix the demand range to [1,4] and [4,7] as examples of a low and a relatively high volume of passengers, and vary and . Figures 2 and 3 present the optimal fleet size, , and the associated second-stage cost (waiting+riding time costs) as a function of and under demand ranges [1,4] and [4,7], respectively.






We first observe that the optimal fleet size, , yielded by the DR model is always larger than that of the SP model under all values of and (). Second, we observe that both models allocate more vehicles under [4,7] than under [1,4], irrespective of the values of the cost parameters, which makes sense given that the former implies a higher volume of demand for last-mile service that needs fulfilling. Third, we observe that for a fixed value of (), the optimal fleet sizes of the DR and SP models decrease as the unit fixed cost increases. For example, when ()= (2,1) and the range is [4,7], the of the DR and SP decrease from 62 and 39 to 12 and 7 vehicles, respectively, as increases from to (see Figures 3(a) and 3(c)).
Fourth, we observe that as the values of () increase (i.e., passenger waiting and riding times become more important/expensive), the optimal fleet sizes yielded by DR and SP increase regardless of the unit fixed cost and range of passengers. Finally, we observe that by allocating a larger fleet, the DR always yields a substantially lower second-stage cost (i.e., waiting and riding times) than the SP, which indicates a better quality of service for passengers. However, this of course comes at a higher fixed cost. The relative difference in the fixed cost between DR and SP ranges from 0 to 40%, and the relative difference in the second-stage cost () ranges from 15% to 333%. Practitioners may be willing to pay the extra one-time fixed cost of DR solutions to provide a better quality of service in terms of lower waiting and riding times, and thus maintain customer satisfaction and a good business reputation.






5 Conclusion
In this paper, we investigate the fleet sizing and allocation problem for on-demand last-mile transportation systems. Specifically, we consider the perspective of a last-mile service provider who wants to determine the number of servicing vehicles allocated to multiple service regions. In each service region, passengers demanding last-mile services arrive in batches, and allocated vehicles deliver passengers to their final destinations. The size of each batch of passengers is random and hard to predict in advance. The quality of fleet-allocation decisions is a function of vehicle fixed cost plus a weighted sum of passengers’ waiting time before boarding a vehicle and in-vehicle riding time.
We propose, analyze, and evaluate the computational and operational performance of two models for the fleet sizing and allocation problem, assuming known and unknown distribution of the demand, respectively. First, we propose a stochastic programming model to minimize the fixed cost of allocated vehicles and the expectation of a weighted sum of passenger waiting and riding times, under a distributional belief of demand. Second, we propose a distributionally robust model to minimize the fixed cost of vehicles and the worst-case (i.e., maximum) expectation of passenger waiting and riding times. We also conduct a set of numerical experiments and discuss the insights and implications by examining trade-offs between total cost, fleet size, and passenger waiting and riding times.
Our study opens other avenues that merit further exploration. These include (1) LMTS fleet sizing and allocation given train arrival uncertainty; (2) LMTS planning and operations under certain special types of demand uncertainty, e.g., multi-modal distribution of demand; (3) fleet sizing and allocation for an on-demand transportation system that combines last- and first-mile services; (4) pricing for last-mile services for multiple service regions with demand uncertainty; (5) incentive and subsidy mechanism design if drivers in the fleet are independent income-seeking decision-makers (e.g., Sun et al. (2019), Zhu et al. (2021)); and (6) distributionally robust optimization models for other optimization problems in on-demand transportation, e.g., vehicle allocation, routing, and scheduling for hybrid services with both fixed and flexible routes. Finally, our computational experiments are not entirely based on real-world case studies or exact data, owing to the lack of benchmark instances for the specific LMTS problem that we address in this paper. We hope that our approach and results will also motivate future data collection efforts and standardized benchmark instances. The availability of high-quality data will enable the development of data-driven approaches for this and other emerging LMTS problems.
Acknowledgment
We want to thank all of our colleagues and practitioners who have contributed significantly to the related literature. We are grateful to the anonymous reviewers for their insightful comments and suggestions that allowed us to improve the paper. Dr. Karmel S. Shehadeh dedicates her effort in this paper to every little dreamer in the whole world who has a dream so big and so exciting. Believe in your dreams and do whatever it takes to achieve them–the best is yet to come for you.
References
- Agussurja et al. (2019) Agussurja, L., Cheng, S.-F., Lau, H. C., 2019. A state aggregation approach for stochastic multiperiod last-mile ride-sharing problems. Transportation Science 53 (1), 148–166.
- Anderson (1998) Anderson, J. E., 1998. Control of personal rapid transit systems. Journal of Advanced Transportation 32 (1), 57–74.
- Ben-Tal et al. (2015) Ben-Tal, A., Den Hertog, D., Vial, J.-P., 2015. Deriving robust counterparts of nonlinear uncertain inequalities. Mathematical Programming 149 (1-2), 265–299.
- Berger et al. (2011) Berger, T., Sallez, Y., Raileanu, S., Tahon, C., Trentesaux, D., Borangiu, T., 2011. Personal rapid transit in an open-control framework. Computers & Industrial Engineering 61 (2), 300–312.
- Bertsimas and Popescu (2005) Bertsimas, D., Popescu, I., 2005. Optimal inequalities in probability theory: A convex optimization approach. SIAM Journal on Optimization 15 (3), 780–804.
- Bertsimas and Sim (2004) Bertsimas, D., Sim, M., 2004. The price of robustness. Operations Research 52 (1), 35–53.
- Birge and Louveaux (2011) Birge, J. R., Louveaux, F., 2011. Introduction to stochastic programming. Springer Science & Business Media.
- Bly and Teychenne (2005) Bly, P., Teychenne, P., 2005. Three financial and socio-economic assessments of a personal rapid transit system. In: Automated People Movers 2005: Moving to Mainstream. pp. 1–16.
- Chen et al. (2020a) Chen, S., Wang, H., Meng, Q., 2020a. Solving the first-mile ridesharing problem using autonomous vehicles. Computer-Aided Civil and Infrastructure Engineering 35 (1), 45–60.
- Chen and Wang (2018a) Chen, Y., Wang, H., 2018a. Pricing for a last-mile transportation system. Transportation Research Part B: Methodological 107, 57–69.
- Chen and Wang (2018b) Chen, Y., Wang, H., 2018b. Why are fairness concerns so important? Lessons from pricing a shared last-mile transportation system. Available at SSRN: https://ssrn.com/abstract=3168324.
- Chen et al. (2020b) Chen, Z., Sim, M., Xiong, P., 2020b. Robust stochastic optimization made easy with RSOME. Management Science.
- Delage and Ye (2010) Delage, E., Ye, Y., 2010. Distributionally robust optimization under moment uncertainty with application to data-driven problems. Operations Research 58 (3), 595–612.
- Esfahani and Kuhn (2018) Esfahani, P. M., Kuhn, D., 2018. Data-driven distributionally robust optimization using the Wasserstein metric: Performance guarantees and tractable reformulations. Mathematical Programming 171 (1-2), 115–166.
- Gomez-Ibanez et al. (1999) Gomez-Ibanez, J., Tye, W. B., Winston, C., 1999. Essays in transportation economics and policy. The Brookings Institution, Washington, DC.
- Hao et al. (2021) Hao, H., Wang, H., Zhang, P., 2021. U.S. household commuting dataset and transportation fairness. (under review). URL https://github.com/peteryz/employment-od
- Homem-de Mello and Bayraksan (2014) Homem-de Mello, T., Bayraksan, G., 2014. Monte Carlo sampling-based methods for stochastic optimization. Surveys in Operations Research and Management Science 19 (1), 56–85.
- Horn (2002) Horn, M., 2002. Multi-modal and demand-responsive passenger transport systems: a modelling framework with embedded control systems. Transportation Research Part A: Policy and Practice 36 (2), 167–188.
- Jiang et al. (2017) Jiang, R., Shen, S., Zhang, Y., 2017. Integer programming approaches for appointment scheduling with random no-shows and service durations. Operations Research 65 (6), 1638–1656.
- Kamath and Pakkala (2002) Kamath, K. R., Pakkala, T., 2002. A Bayesian approach to a dynamic inventory model under an unknown demand distribution. Computers & Operations Research 29 (4), 403–422.
- Ke et al. (2020) Ke, J., Yang, H., Li, X., Wang, H., Ye, J., 2020. Pricing and equilibrium in on-demand ride-pooling markets. Transportation Research Part B: Methodological 139, 411–431.
- Kim et al. (2015) Kim, S., Pasupathy, R., Henderson, S. G., 2015. A guide to sample average approximation. In: Handbook of simulation optimization. Springer, pp. 207–243.
- Kleywegt et al. (2002) Kleywegt, A. J., Shapiro, A., Homem-de Mello, T., 2002. The sample average approximation method for stochastic discrete optimization. SIAM Journal on Optimization 12 (2), 479–502.
- Lees-Miller et al. (2009) Lees-Miller, J., Hammersley, J., Davenport, N., 2009. Ride sharing in personal rapid transit capacity planning. In: Automated People Movers 2009: Connecting People, Connecting Places, Connecting Modes. pp. 321–332.
- Linderoth et al. (2006) Linderoth, J., Shapiro, A., Wright, S., 2006. The empirical behavior of sampling methods for stochastic programming. Annals of Operations Research 142 (1), 215–241.
- Liu et al. (2019) Liu, L., Sun, L., Chen, Y., Ma, X., 2019. Optimizing fleet size and scheduling of feeder transit services considering the influence of bike-sharing systems. Journal of Cleaner Production 236, 117550.
- Liu et al. (2012) Liu, Z., Jia, X., Cheng, W., 2012. Solving the last mile problem: Ensure the success of public bicycle system in Beijing. Procedia-Social and Behavioral Sciences 43, 73–78.
- Mak et al. (2014) Mak, H.-Y., Rong, Y., Zhang, J., 2014. Appointment scheduling with limited distributional information. Management Science 61 (2), 316–334.
- Mueller and Sgouridis (2011) Mueller, K., Sgouridis, S. P., 2011. Simulation-based analysis of personal rapid transit systems: service and energy performance assessment of the Masdar City PRT case. Journal of Advanced Transportation 45 (4), 252–270.
- Quadrifoglio et al. (2008) Quadrifoglio, L., Dessouky, M. M., Ordóñez, F., 2008. A simulation study of demand responsive transit system design. Transportation Research Part A: Policy and Practice 42 (4), 718–737.
- Rahimian and Mehrotra (2019) Rahimian, H., Mehrotra, S., 2019. Distributionally robust optimization: A review. arXiv preprint arXiv:1908.05659.
- Serra et al. (2019) Serra, T., Raghunathan, A. U., Bergman, D., Hooker, J., Kobori, S., 2019. Last-mile scheduling under uncertainty. In: International Conference on Integration of Constraint Programming, Artificial Intelligence, and Operations Research. Springer, pp. 519–528.
- Shapiro et al. (2009) Shapiro, A., Dentcheva, D., Ruszczyński, A., 2009. Lectures on stochastic programming: modeling and theory. SIAM.
- Shapiro et al. (2014) Shapiro, A., Dentcheva, D., Ruszczyński, A., 2014. Lectures on stochastic programming: modeling and theory. SIAM.
- Shehadeh (2021) Shehadeh, K. S., 2021. Distributionally robust optimization approaches for a stochastic mobile facility routing and scheduling problem. arXiv preprint arXiv:2009.10894.
- Shehadeh and Padman (2021) Shehadeh, K. S., Padman, R., 2021. A distributionally robust optimization approach for stochastic elective surgery scheduling with limited intensive care unit capacity. European Journal of Operational Research 290 (3), 901–913.
- Shehadeh and Sanci (2021) Shehadeh, K. S., Sanci, E., 2021. Distributionally robust facility location with bimodal random demand. Computers & Operations Research 134, 105257.
- Smith and Winkler (2006) Smith, J. E., Winkler, R. L., 2006. The optimizer’s curse: Skepticism and postdecision surprise in decision analysis. Management Science 52 (3), 311–322.
- Soyster (1973) Soyster, A. L., 1973. Convex programming with set-inclusive constraints and applications to inexact linear programming. Operations Research 21 (5), 1154–1157.
- Sun et al. (2019) Sun, H., Wang, H., Wan, Z., 2019. Model and analysis of labor supply for ride-sharing platforms in the presence of sample self-selection and endogeneity. Transportation Research Part B: Methodological 125, 76–93.
- Thiele (2010) Thiele, A., 2010. A note on issues of over-conservatism in robust optimization with cost uncertainty. Optimization 59 (7), 1033–1040.
- Wang (2019) Wang, H., 2019. Routing and scheduling for a last-mile transportation system. Transportation Science 53 (1), 131–147.
- Wang and Odoni (2016) Wang, H., Odoni, A., 2016. Approximating the performance of a “last mile” transportation system. Transportation Science 50 (2), 659–675.
- Wang and Yang (2019) Wang, H., Yang, H., 2019. Ridesourcing systems: A framework and review. Transportation Research Part B: Methodological 129, 122–155.
- Xu and Meng (2019) Xu, M., Meng, Q., 2019. Fleet sizing for one-way electric carsharing services considering dynamic vehicle relocation and nonlinear charging profile. Transportation Research Part B: Methodological 128, 23–49.
- Zhang et al. (2019) Zhang, H., Sheppard, C. J., Lipman, T. E., Moura, S. J., 2019. Joint fleet sizing and charging system planning for autonomous electric vehicles. IEEE Transactions on Intelligent Transportation Systems 21 (11), 4725–4738.
- Zhu et al. (2021) Zhu, Z., Ke, J., Wang, H., 2021. A mean-field Markov decision process model for spatial-temporal subsidies in ride-sourcing markets. Transportation Research Part B: Methodological 150, 540–565.
- Zhu et al. (2020) Zhu, Z., Qin, X., Ke, J., Zheng, Z., Yang, H., 2020. Analysis of multi-modal commute behavior with feeding and competing ridesplitting services. Transportation Research Part A: Policy and Practice 132, 713–727.
Appendix A Proof of Proposition 3
Proof.
(21a) | ||||
s.t. | (21b) | |||
(21c) | ||||
(21d) |
First, we rewrite constraints (21c) as . Given that and the objective of maximizing a positive number times in (21), then, without loss of optimality, we can assume that , for all and . Note that if , then a positive number. Given that , then in this case, condition a positive number is relaxed and the first term in the objective will be negative for . It follows that in the optimal solution. Second, we consider the following cases, for fixed and fixed :
1. Case 1: when
   (a)
   (b) If . In this case, , , and the first term in the objective function (21a) reduces to:
   (23)
2.
It follows from the above analysis that is optimal to (21). Thus, without loss of optimality (w.l.o.o.), we can set in (21). Accordingly, (21) is equivalent to
(25a) | ||||
s.t. | (25b) | |||
(25c) | ||||
(25d) |
For fixed and , problem (25) is a bounded and feasible linear program. We formulate (25) in its dual form as
(26a) | ||||
s.t. | (26b) | |||
(26c) | ||||
(26d) |
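For reference, the step from (25) to (26) relies on standard linear programming duality. In generic notation (the symbols $A$, $b$, $c$, $x$, and $y$ below are placeholders chosen for illustration, not the paper's notation), a feasible and bounded linear program and its dual attain the same optimal value:

```latex
\max_{x \ge 0} \left\{ c^{\top} x \;:\; A x \le b \right\}
\;=\;
\min_{y \ge 0} \left\{ b^{\top} y \;:\; A^{\top} y \ge c \right\}.
```

Because the inner problem is a maximization, replacing it with its minimization dual allows it to be merged with the outer minimization into a single minimization problem.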
Combining the inner problem in the form of (26) with the outer minimization problems in (14) and (12), we derive the following equivalent reformulation of the DR model in (12)
(27a) | ||||
s.t. | (27b) | |||
(27c) | ||||
(27d) | ||||
(27e) |
∎
Appendix B Sample average approximation of SP
There are two well-known difficulties in obtaining an (exact) optimal solution to the SP in (8). First, evaluating the value of involves taking multi-dimensional integrals and solving a large number of similar integer programs. Second, both and are non-convex and discontinuous (Birge and Louveaux, 2011; Shapiro et al., 2009). In view of these two difficulties, we resort to approximate solution approaches, in particular the sample average approximation (SAA) approach.
In SAA, we replace the distribution of with a (discrete) empirical distribution based on independent and identically distributed (i.i.d.) samples of random demand, and then solve the sample average approximation (28) of (8). Note that in the SAA formulation (28), we associate all scenario-dependent parameters, variables, and constraints with a scenario index for all . For example, parameters are replaced by to represent the number of passengers realized in scenario , and variables are replaced by to represent the number of unserved passengers in scenario . In addition, constraints (7b)–(7e) are incorporated in each scenario.
(28a) | |||
(28b) | |||
(28c) |
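As a minimal illustration of the SAA idea, the following Python sketch applies it to a deliberately simplified single-region problem, not to model (8) itself. The cost structure (a fixed cost per vehicle plus a penalty per unserved passenger), the function names, and all numerical values are assumptions made for illustration only.

```python
import random


def saa_objective(x, scenarios, fixed_cost, penalty, capacity):
    """Sample-average cost of allocating x vehicles (toy model, hypothetical)."""
    # Second-stage cost per scenario: penalty for passengers beyond total capacity.
    second_stage = [penalty * max(0, d - capacity * x) for d in scenarios]
    return fixed_cost * x + sum(second_stage) / len(scenarios)


def solve_saa(scenarios, fixed_cost, penalty, capacity, max_vehicles):
    """Enumerate integer fleet sizes 0..max_vehicles; return (best x, its cost)."""
    best_x = min(
        range(max_vehicles + 1),
        key=lambda x: saa_objective(x, scenarios, fixed_cost, penalty, capacity),
    )
    return best_x, saa_objective(best_x, scenarios, fixed_cost, penalty, capacity)


# Hypothetical data: 200 i.i.d. batch sizes drawn uniformly from {0, ..., 20}.
rng = random.Random(0)
scenarios = [rng.randint(0, 20) for _ in range(200)]
x_star, cost = solve_saa(
    scenarios, fixed_cost=3.0, penalty=1.0, capacity=5, max_vehicles=6
)
print(x_star, round(cost, 2))
```

Enumeration is viable here only because the toy problem has a single integer decision; the full SAA formulation with routing and scenario-indexed constraints would instead be passed to a mixed-integer programming solver.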
Appendix C Construction and Statistics of the NYC Instance
We construct the NYC instance from the dataset using the following procedure:
1. Select 4 metro stations that are relatively far away from each other in Manhattan, NYC;
2. For each station, construct a 1-mile by 1-mile square as a last-mile service region with the station located at the center;
3. For each service region, consider the passengers with destinations within the region as potential demand for the LMTS;
4. For each service region , cluster the passenger destinations into clusters; take the center of each cluster as the location of a last-mile stop that covers all passengers in that cluster;
5. For each last-mile stop in service region , record the number of passengers going to that stop (i.e., with destination in that cluster) within each time interval (e.g., every 5 or 10 minutes) as the batch demand for the LMTS;
6. For each last-mile stop in service region , compute the upper bound, lower bound, percentile, and percentile of the batch demand across the time intervals in a certain period (e.g., 10 am to 11 am);
7. Using the locations of all last-mile stops in each service region , generate routes that each serve a subset of stops (e.g., 1, 2, or 3 stops), all of which start from and return to the metro station;
8. For each route , record its total travel time , stop-route configuration , and travel time to each stop .
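Steps 5 and 6 of the procedure above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function names, the nearest-rank percentile rule, the percentile levels passed as `low_q` and `high_q`, and the sample arrival times are all hypothetical and are not taken from the dataset.

```python
import math
from statistics import mean, stdev


def batch_demand(arrival_minutes, interval, horizon):
    """Count passengers per time interval of length `interval` over [0, horizon)."""
    n_bins = horizon // interval
    counts = [0] * n_bins
    for t in arrival_minutes:
        if 0 <= t < n_bins * interval:
            counts[int(t // interval)] += 1
    return counts


def percentile(sorted_values, q):
    """Nearest-rank percentile for q in (0, 100] (one of several conventions)."""
    rank = max(1, math.ceil(q / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(counts, low_q, high_q):
    """Bounds, two percentiles, and empirical mean/std of the batch demand."""
    s = sorted(counts)
    return {
        "lower": s[0],
        "upper": s[-1],
        "low_pct": percentile(s, low_q),
        "high_pct": percentile(s, high_q),
        "mean": mean(counts),
        "std": stdev(counts),  # sample standard deviation; needs >= 2 intervals
    }


# Hypothetical arrival times (minutes after the start of a one-hour period)
# of passengers bound for a single last-mile stop, with 10-minute intervals.
arrivals = [1, 2, 7, 8, 9, 31]
counts = batch_demand(arrivals, interval=10, horizon=60)
stats = summarize(counts, low_q=50, high_q=95)
print(counts, stats)
```

In practice, `batch_demand` would be called once per last-mile stop after the clustering in step 4 has assigned each passenger destination to a stop.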
Notation: is the number of regions, is a region, is the number of last-mile stops in region , and is the number of routes in region .
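Instance | Number of regions | Region | Number of last-mile stops | Number of routes
---|---|---|---|---
NYC | 4 | 1 | 5 | 31
NYC | 4 | 2 | 6 | 30
NYC | 4 | 3 | 4 | 15
NYC | 4 | 4 | 5 | 20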
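The route generation in steps 7 and 8 of the construction procedure can be sketched as follows. This is a hedged illustration only: it assumes straight-line travel at a constant speed and keeps, for each subset of stops, the visiting order with the shortest total travel time. Neither assumption is stated in the procedure, and the function name, coordinates, and speed are hypothetical.

```python
import math
from itertools import combinations, permutations


def enumerate_routes(station, stops, max_stops, speed):
    """Build one route per subset of at most `max_stops` stops.

    Each route starts from and returns to `station`. For each subset, the
    fastest visiting order is kept (an assumption of this sketch).
    """
    routes = []
    for k in range(1, max_stops + 1):
        for subset in combinations(sorted(stops), k):
            best = None
            for order in permutations(subset):
                points = [station] + [stops[s] for s in order] + [station]
                legs = [math.dist(a, b) / speed for a, b in zip(points, points[1:])]
                total = sum(legs)
                if best is None or total < best["total_time"]:
                    elapsed, arrival = 0.0, {}
                    for s, leg in zip(order, legs):  # last leg is the return trip
                        elapsed += leg
                        arrival[s] = elapsed
                    best = {
                        "stops": order,            # stop-route configuration
                        "total_time": total,       # total travel time of the route
                        "time_to_stop": arrival,   # travel time to each stop
                    }
            routes.append(best)
    return routes


# Hypothetical region: station at the origin, two stops, unit speed.
station = (0.0, 0.0)
stops = {"a": (1.0, 0.0), "b": (0.0, 1.0)}
for route in enumerate_routes(station, stops, max_stops=2, speed=1.0):
    print(route)
```

The number of routes produced by this sketch depends on the subset rule chosen and need not match the route counts reported for the NYC instance.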
Notation: and are, respectively, the empirical mean and standard deviation of batch demand .
Region | LM stop | Mean | Std. dev.
---|---|---|---
1 | 1 | 1.11 | 1.64
1 | 2 | 1.13 | 1.88
1 | 3 | 2.23 | 2.72
1 | 4 | 1.13 | 2.45
1 | 5 | 1.30 | 2.25
2 | 1 | 1.40 | 1.72
2 | 2 | 1.43 | 1.93
2 | 3 | 1.20 | 1.92
2 | 4 | 2.63 | 2.54
2 | 5 | 2.63 | 2.54
2 | 6 | 1.83 | 1.89
3 | 1 | 2.43 | 2.86
3 | 2 | 3.33 | 2.37
3 | 3 | 2.63 | 3.10
3 | 4 | 3.23 | 3.58
4 | 1 | 1.37 | 2.86
4 | 2 | 1.13 | 2.37
4 | 3 | 1.90 | 3.10
4 | 4 | 1.20 | 3.58
4 | 5 | 1.70 | 2.58
Appendix D Values of parameters , , and
Let the average after-tax hourly wage of passengers be /hour. According to Gomez-Ibanez et al. (1999):
1. Monetary value of riding time ( of after-tax wage): /hour = /minute.
2. Monetary value of waiting time ( of after-tax wage): /hour = /minute.
3. Average hourly total fixed cost of a vehicle (with capacity 5), including the cost to rent the vehicle, wage paid to the driver, fuel cost, and other maintenance and operating costs: /hour.
In the LMTS, we have trains with headway minute. The duration of the time that vehicles are needed is slightly longer than . Then, we can approximate the fixed cost , parameter , and as ; ; .
Scenario 1: Assuming there is an existing fleet with no additional cost, we have , , and . In the numerical experiments, we normalize to 1 and evaluate this scenario with the following parameters: , ; .
Scenario 2: Assuming the average after-tax hourly wage of passengers (e.g., working adults who are more sensitive to riding and waiting times) /hour and the average hourly total cost of a vehicle (e.g., regular vehicle) /hour, then: ; ; . In the numerical experiments, we normalize to 1 and evaluate this scenario with the following parameters: , ; .
Scenario 3: Assuming the equivalent average after-tax hourly wage of passengers (e.g., children, students, seniors, and the disabled, who are less sensitive to riding and waiting times) /hour and the average hourly total cost of a vehicle (e.g., vehicle with special equipment for children, seniors, and the disabled) /hour, then ; ; . In the numerical experiments, we normalize to 1 and evaluate this scenario with the following parameters: , ; .
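The unit conversions used in this appendix can be sketched as follows. The function names and every numerical input below (wage, shares of wage, hourly vehicle cost, service duration) are hypothetical placeholders, not the values used in the paper. The choice to normalize by the riding-time value is likewise an assumption of this sketch.

```python
def cost_parameters(hourly_wage, riding_share, waiting_share,
                    vehicle_cost_per_hour, service_minutes):
    """Convert hourly figures to the per-minute and per-period units used here.

    riding_share / waiting_share: value of time as a fraction of the wage.
    service_minutes: how long a vehicle is needed per train arrival.
    """
    return {
        "fixed": vehicle_cost_per_hour * service_minutes / 60.0,  # per vehicle
        "riding": riding_share * hourly_wage / 60.0,              # per minute
        "waiting": waiting_share * hourly_wage / 60.0,            # per minute
    }


def normalize(params, reference):
    """Rescale all parameters so that params[reference] equals 1."""
    base = params[reference]
    return {name: value / base for name, value in params.items()}


# Hypothetical inputs only.
raw = cost_parameters(hourly_wage=30.0, riding_share=0.5, waiting_share=1.0,
                      vehicle_cost_per_hour=60.0, service_minutes=15.0)
scaled = normalize(raw, reference="riding")
print(raw, scaled)
```

Normalizing by a time-value parameter rather than by the fixed cost keeps the rescaling well defined even when the fixed cost is zero, as in a scenario with an existing fleet.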
Appendix E Example of In-sample performance under
Under , both models allocate the same number of vehicles. As such, they have similar in-sample and out-of-sample simulation performance for all three instances. As an example, Table 1 presents in-sample simulation results (i.e., under set 1; perfect information) for instance 1.
Instance | R | Metric | Model | TC | 2nd-Stage
---|---|---|---|---|---
1 | [1,4] | Mean | DR | 4,590 | 4,590
1 | [1,4] | Mean | SP | 4,572 | 4,572
1 | [1,4] | Median | DR | 4,592 | 4,592
1 | [1,4] | Median | SP | 4,573 | 4,573
1 | [1,4] | 75%-q | DR | 4,722 | 4,722
1 | [1,4] | 75%-q | SP | 4,750 | 4,750
1 | [1,4] | 95%-q | DR | 4,921 | 4,921
1 | [1,4] | 95%-q | SP | 5,001 | 5,001
1 | [3,7] | Mean | DR | 9,370 | 9,370
1 | [3,7] | Mean | SP | 9,618 | 9,618
1 | [3,7] | Median | DR | 9,332 | 9,332
1 | [3,7] | Median | SP | 9,605 | 9,605
1 | [3,7] | 75%-q | DR | 9,776 | 9,776
1 | [3,7] | 75%-q | SP | 9,941 | 9,941
1 | [3,7] | 95%-q | DR | 10,403 | 10,403
1 | [3,7] | 95%-q | SP | 10,509 | 10,509
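The summary metrics reported in the table above (mean, median, 75% quantile, and 95% quantile of simulated cost) can be computed as in the following sketch. The function name, the interpolation rule used for the quantiles, and the sample data are assumptions made for illustration. The paper does not state which quantile convention it uses.

```python
from statistics import mean, median, quantiles


def summarize_costs(costs):
    """Mean, median, and 75%/95% quantiles of simulated total costs."""
    # quantiles(..., n=100) returns the 99 percentile cut points;
    # index 74 is the 75th percentile and index 94 is the 95th.
    cuts = quantiles(costs, n=100, method="inclusive")
    return {
        "mean": mean(costs),
        "median": median(costs),
        "q75": cuts[74],
        "q95": cuts[94],
    }


# Hypothetical simulated costs for one model: 0, 1, ..., 100.
summary = summarize_costs(list(range(101)))
print(summary)
```

Applying the same function to the simulated costs of each model on a common set of scenarios gives a like-for-like comparison of the kind shown in the table.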