
ComOpT: Combination and Optimization for Testing
Autonomous Driving Systems

Changwen Li1,5#, Chih-Hong Cheng2#, Tiantian Sun3, Yuhang Chen4,5, and Rongjie Yan1,5
https://github.com/safeautonomy/ComOpT
1 State Key Laboratory of Computer Science, ISCAS, China
2 Independent contributor, Germany
3 Beijing University of Technology, China
4 Technology Center of Software Engineering, ISCAS, China
5 University of Chinese Academy of Sciences, China
# The first two authors are listed in reverse alphabetical order but contribute equally. Chih-Hong Cheng works on this project as a voluntary, not-for-profit hobby activity; opinions stated in this paper shall not be viewed as an official statement from his organization. Correspondence to yrj@ios.ac.cn, cheng.chihhong@gmail.com
Abstract

ComOpT is an open-source research tool for coverage-driven testing of autonomous driving systems, focusing on planning and control. Starting with (i) a meta-model characterizing discrete conditions to be considered and (ii) constraints specifying the impossibility of certain combinations, ComOpT first generates constraint-feasible abstract scenarios while maximally increasing the coverage of k-way combinatorial testing. Each abstract scenario can be viewed as a conceptual equivalence class, which is then instantiated into multiple concrete scenarios by (1) randomly picking one local map that fulfills the specified geographical condition, and (2) assigning all actors accordingly, with parameters within the specified ranges. Finally, ComOpT evaluates each concrete scenario against a set of KPIs and performs local scenario variation by spawning a new agent that might lead to a collision at designated points. We use ComOpT to test the Apollo 6 autonomous driving software stack. ComOpT can generate highly diversified scenarios with limited test budgets while uncovering problematic situations such as the inability to make simple right turns, uncomfortable accelerations, and dangerous driving patterns. ComOpT participated in the 2021 IEEE AI Autonomous Vehicle Testing Challenge and won first place among more than 110 contending teams.

I Introduction

The development of autonomous driving (AD) technologies has reached the stage where the safety of such systems is a dominating factor for defining success. For verification and validation of autonomous vehicles in a fixed operational design domain (ODD), simulation-based testing is one of the highly recommended methods for modular testing of planning and control systems. The simulation environment can provide object labels such as bounding boxes (as a replacement of the perception module), allowing the prediction and planning modules to be tested in isolation. Nevertheless, the critical challenge remains to be designing the test case generation and management module. Within a limited test budget, the test case generation module should outsmart the AD system under test by creating scenarios that lead to undesired behavior (e.g., collision). Simultaneously, to demonstrate sufficient coverage over the ODD, the generation of test cases should be coverage-driven while ensuring diversity.

Towards the aforementioned challenges, we present ComOpT, an open-source research tool (under the AGPL v3 license) for coverage-driven testing of autonomous driving systems. As of September 2021, ComOpT interfaces to the open-source simulator LGSVL [1] and the Baidu Apollo autonomous driving software stack [2]. Internally, ComOpT integrates an axiomatic approach to generate abstract scenarios from a pre-defined list of categories. Every element in a category has its physical interpretation reflected in the simulation environment. One combination forms an abstract scenario, which can be instantiated in the simulation environment by (a) assigning a local map as well as (b) concretizing every element by picking a value in its associated parameter range. Although the axiomatic approach of abstract scenario generation sounds intuitive at first glance, it suffers from three crucial limitations in terms of realization:

  1. Combinatorial explosion: Given N categories with each category having only 2 elements, there exist 2^N possible abstract scenarios in the worst case.

  2. Feasibility considerations: Certain combinations are semantically unclear (e.g., making a left-turn in a straight-line road segment) or simply nonexistent in the ODD under consideration (e.g., no roundabout with pedestrian crossings in the map).

  3. Semantically enriched map: Concretizing an abstract scenario requires searching for high-level concepts such as intersection types; these concepts are nonexistent in standard map formats.

For the first two issues, ComOpT utilizes the technique of constraint-based k-way combinatorial testing [3, 4]. Combinatorial testing [5] provides a coverage metric by requiring test cases to cover all element tuples for any k categories. The constraint-based variation allows integrating feasibility constraints, while implementing it using optimization solvers (e.g., mixed-integer linear programming) allows suggesting new test cases that maximally increase coverage. For the third issue, ComOpT implements a special module that allows querying high-level semantic information. In particular, it allows searching for local maps that satisfy a specific combination of conditions (e.g., find all T-way junctions having pedestrian crossings).

Finally, given a concrete scenario tested against a set of KPIs, ComOpT includes methods that can perform local perturbation over the scenario based on an innovation called agent spawning. Concretely, ComOpT introduces new agents (e.g., vehicles) to challenge the AD system. How these agents are introduced depends on the high-level behavioral pattern of the ego vehicle extracted from the simulated trace.

ComOpT participated in the 2021 IEEE AI Autonomous Vehicle Testing Challenge. Apart from convincing scenario coverage and diversity metrics, ComOpT discovered numerous undesired scenarios for the Apollo autonomous driving SW stack, including situations where it hits crossing pedestrians or fails to make trivial right turns. Ultimately, ComOpT won first place in the competition (among more than 110 contending teams), serving as objective evidence of the system's performance.

II Related Work

Testing autonomous driving systems has been an active research field, and many autonomous driving safety standards such as ISO PAS 21448 SOTIF (Safety of Intended Functionality) [6] or UL 4600 [7] consider simulation-based testing to be instrumental. There are recent programmatic methods such as Scenic [8] or Paracosm [9] that allow specifying and generating scenarios for AD testing. The abstraction that ComOpT takes is one layer higher, supported by an automatic translation from the map of the ODD to a list of sub-maps matching the specification. This implies that given a specific ODD (e.g., Munich city), methods suggested by Scenic or Paracosm cannot easily provide coverage claims over the ODD. These methods are more applicable to re-specify a small number of scenarios, such as known crash scenarios. The Scenic tool, together with the CPS falsification tool VerifAI [10], also participated as an integrated testing framework in the 2021 IEEE AI testing competition [11] but failed to compete against ComOpT (for the complete evaluation criteria, we refer readers to the IEEE AI AV testing competition website: http://av-test-challenge.org/).

Within the scope of autonomous driving, prior work on combinatorial testing focuses on testing deep neural networks [3, 12, 4]. However, the black-box nature also makes it highly applicable for testing prediction and planning modules, as demonstrated by this work. With the semantic enrichment of maps, we can use combinatorial testing to argue the completeness of abstract scenarios against the ODD, thereby further differentiating ComOpT from other works. Apart from the axiomatic approach demonstrated by ComOpT, other commonly seen approaches include replaying existing crash scenarios (e.g., the NHTSA accident database [13]) or scenario databases created by physical driving or collective efforts (e.g., Pegasus [14] or SafetyPool [15]). One can run these scenarios (from standard formats such as ASAM OpenSCENARIO [16]) and subsequently perform scenario variations via various CPS falsification techniques [17, 18, 19, 10, 20, 21]. While we surely understand the benefit of scenario replay, ComOpT aims for more: scenario replay and variation are incapable of arguing diversity and completeness. Finally, the previously mentioned CPS falsification techniques [17, 18, 19, 10, 20, 21] largely take low-level information traces such as vehicle position and velocity profiles and perform the search over them. In contrast, our scenario variation method (agent spawning) also integrates high-level semantic information such as behavioral patterns of the ego vehicle and positions relative to lanes, to avoid performing a search in unnecessarily high parameter dimensions. Some of the low-level scenario variation methods also integrate constraint-free combinatorial testing in dealing with discrete variables [17, 9]. However, ComOpT uses a constraint-based version for abstract (high-level) scenario generation; the constraints are introduced to encode scenario infeasibility.

III Inside ComOpT

In this section, we detail the underlying techniques integrated inside ComOpT. To ease understanding, Figure 1 provides a summary of the workflow and associates each function with the corresponding subsection in this paper.

Figure 1: The underlying workflow of ComOpT

III-A Abstract scenario generation via combinatorial testing

In this stage, the input of ComOpT is a list of discrete categories serving as the basis for abstract scenario generation. To ease understanding, throughout this paper, we consider the following three overly simplified categories.

  • weather ∈ {sunny, rainy, cloudy},

  • road ∈ {straight, T-shaped}, and

  • ego-action ∈ {drive-straight, left-turn, u-turn}.

Apart from categories, ComOpT also takes explicitly stored constraints stating that certain combinations are impossible (it is also possible to add constraints on the fly: if some abstract scenarios turn out not to be realizable, further generation of such scenarios is blocked via constraint assignments). For example, if the road is straight, it is impossible for the autonomous vehicle to take a left turn. Utilizing the defined categories, the statement can be written as the logical formula "road.straight → ¬ego-action.left-turn".

ComOpT uses k-way combinatorial testing as a coverage criterion, whose goal is to ensure that the set of tested scenarios covers all value combinations for arbitrary k categories. To increase k-way coverage while avoiding infeasible scenario combinations, the abstract scenario generation problem is reduced to a mixed-integer linear programming (MILP) problem, with the objective of maximally increasing the coverage. The encoding is borrowed from our earlier work on testing ML-based systems [3]. A simple illustration of 2-way combinatorial testing can be found in Figure 2. First, ComOpT builds tables for the three categories mentioned earlier, i.e., C1 = weather, C2 = road, and C3 = ego-action. A table is associated with every pair of categories; therefore there are C(3,2) = 3 tables in total. With one particular combination being impossible (in this simple example, it is stated that it is impossible to make a left turn when the road is straight), the total number of empty cells to be filled equals 20 (C1, C2: 6; C2, C3: 5; C1, C3: 9). Given the first abstract scenario ⟨sunny, drive-straight, straight⟩, the MILP proposes the second scenario ⟨rainy, u-turn, T-shaped⟩, which maximally increases the metric by filling 3 cells. When all cells are filled, we have achieved a relative form of completeness where every category pair is covered. Observe that the two abstract scenarios in Figure 2 have very different characteristics.
By proceeding with such methods, the testing is truly driven by coverage, and we can guarantee that every abstract scenario differs from previously generated ones in at least one category, thereby guaranteeing diversity.
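To make the coverage objective concrete, the following sketch greedily picks, among constraint-feasible scenarios, the one that fills the most uncovered 2-way cells for the running example. This is an illustration only, not ComOpT's actual MILP encoding; all names are our own.

```python
from itertools import combinations, product

# The paper's running example: three categories and one constraint.
categories = {
    "weather": ["sunny", "rainy", "cloudy"],
    "road": ["straight", "T-shaped"],
    "ego-action": ["drive-straight", "left-turn", "u-turn"],
}

def feasible(scenario):
    # Constraint: road.straight -> not ego-action.left-turn
    return not (scenario["road"] == "straight"
                and scenario["ego-action"] == "left-turn")

def pairs(scenario):
    # All 2-way (category, value) cells covered by one abstract scenario.
    return {frozenset([(c1, scenario[c1]), (c2, scenario[c2])])
            for c1, c2 in combinations(categories, 2)}

def next_scenario(covered):
    # Greedy stand-in for the MILP objective: pick the feasible scenario
    # that fills the most not-yet-covered cells.
    names = list(categories)
    best, best_gain = None, -1
    for values in product(*categories.values()):
        s = dict(zip(names, values))
        if not feasible(s):
            continue
        gain = len(pairs(s) - covered)
        if gain > best_gain:
            best, best_gain = s, gain
    return best, best_gain

covered, suite = set(), []
while True:
    s, gain = next_scenario(covered)
    if gain == 0:
        break
    suite.append(s)
    covered |= pairs(s)
```

Running this covers all 20 feasible cells from the example; each generated scenario is feasible and fills at least one new cell, matching the diversity guarantee stated above.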

Figure 2: Generating the 2nd scenario ⟨rainy, u-turn, T-shaped⟩, maximizing the 2-way projection coverage

Finally, readers may raise concerns about generating complex routes via an axiomatic approach. Nevertheless, any complex route within the ODD can always be viewed as a concatenation of multiple simple sub-journeys, with each sub-journey crossing a junction or driving along a lane segment (without intersection). Therefore, when ComOpT performs systematic testing, it only considers driving scenarios with various initial configurations within a single junction (e.g., T-shaped, roundabout) or a lane segment without intersection.

III-B Concrete scenario generation

From a synthesized abstract scenario, ComOpT generates a concrete scenario by querying the map and the simulator. Detailed actions include finding a sub-map that satisfies all geographical conditions, locating the ego vehicle (here referring to the vehicle being controlled by the autonomous driving software stack), configuring Non-Player Character (NPC) vehicles and pedestrians, and setting up simulation parameters such as weather.

III-B1 Sub-map finding based on semantic information

For a specific road structure matching the semantic information, such as T-shaped junction, ComOpT needs to search for the corresponding sub-map in the map and assign all agents accordingly. Due to space limits, we only highlight how a T-shaped junction is discovered using the junction example in Figure 3. We refer readers to the ComOpT documentation (in the source tree: scripts/comopt/map_parse/README.md) for details on identifying other road structures.

First, for a given junction, ComOpT considers the number of related roads. Road structures with the same number of related roads are allocated into the same group. In Figure 3, three roads connect to junction J_5. Then ComOpT computes the angles between related roads. The angle sequence of adjacent roads is used to match the unique angle-range sequences of each road structure class. In Figure 3, the roads with identifiers road_115, road_116, and road_117 form angles of 181.7, 90.1, and 88.2 degrees. The computed degrees match the specification of a T-shaped junction, where for an ideal T-shaped junction, the incoming roads should form angles of 180, 90, and 90 degrees. Junction J_5 will not be categorized as a Y-shaped junction, since for a Y-shaped junction, the incoming roads should form angles of around 120, 120, and 120 degrees.
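The angle-sequence matching above can be sketched as follows. The patterns and tolerance are illustrative assumptions, not ComOpT's exact configuration.

```python
# Ideal angle sequences per road-structure class (from the paper's example);
# the tolerance value is an assumption for illustration.
JUNCTION_PATTERNS = {
    "T-shaped": [180, 90, 90],
    "Y-shaped": [120, 120, 120],
}
TOLERANCE = 15.0  # degrees (assumed)

def matches(angles, pattern, tol=TOLERANCE):
    """Check whether the measured angle sequence matches an ideal pattern,
    trying every cyclic rotation since the traversal start is arbitrary."""
    n = len(angles)
    if n != len(pattern):
        return False
    for shift in range(n):
        rotated = angles[shift:] + angles[:shift]
        if all(abs(a - p) <= tol for a, p in zip(rotated, pattern)):
            return True
    return False

def classify_junction(angles):
    """Return the road-structure class for the angles between adjacent
    incident roads (in traversal order), or 'unclassified'."""
    for name, pattern in JUNCTION_PATTERNS.items():
        if matches(angles, pattern):
            return name
    return "unclassified"
```

With the measured angles from Figure 3, `classify_junction([181.7, 90.1, 88.2])` yields "T-shaped", while a near-symmetric three-way split is classified as "Y-shaped".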

Figure 3: Example of road angles in a junction
Figure 4: Example of generating a concrete scenario from map data

III-B2 Concrete scenario generation outline

To demonstrate how ComOpT generates concrete scenarios with map data, we consider the example in Figure 4, which is a detailed visualization of junction J_5. The detailed process is as follows:

  • Step 1 - assigning the initial road for the ego vehicle: We first decide the initial road segment for the ego vehicle, such that the specified ego behavior in an abstract scenario is legal and feasible (without collision with other agents). For example, the initial road where the ego vehicle starts its journey in the example is marked as “self” (road_115).

  • Step 2 - assigning the destination road for the ego vehicle: The choice of the destination road for the ego vehicle considers the road structure and the behavior of the ego vehicle. For example, suppose the ego vehicle should perform a left turn, as indicated by the abstract scenario. The road road_117 is marked as left, as it is to the left of the road on which the ego vehicle is located. Therefore, we choose a feasible position on road_117 as the destination.

  • Step 3 - other assignments: Finally, ComOpT considers other road-related configurations associated with the scenario, such as the path of NPCs and pedestrians and the traffic light control strategy.
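For Step 2, marking a candidate road as left, right, or straight relative to the ego vehicle can be sketched with a simple heading comparison. This is a simplified stand-in for ComOpT's map query; the heading convention (degrees, counterclockwise positive) is our assumption.

```python
def relative_direction(ego_heading_deg, dest_heading_deg):
    """Classify a candidate destination road as left/right/straight/u-turn
    relative to the ego vehicle's driving direction. Headings are in
    degrees with counterclockwise positive (assumed convention), so a
    positive normalized difference means a left turn."""
    # Normalize the heading difference into [-180, 180).
    diff = (dest_heading_deg - ego_heading_deg + 180) % 360 - 180
    if abs(diff) <= 30:
        return "straight"
    if abs(abs(diff) - 180) <= 30:
        return "u-turn"
    return "left" if diff > 0 else "right"
```

For instance, with the ego vehicle heading east (0 degrees), a road heading north (90 degrees) is classified as "left" and a road heading south (-90 degrees) as "right".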

Figure 5: Illustrating the process of agent spawning

III-B3 Non-road related parameter instantiation

For other parameters, the instantiation from an abstract scenario to a concrete scenario requires a mapping process to the simulated world. As an example, the parameter mapping file specifies that the weather “cloudy” stated in the abstract scenario is mapped to the LGSVL simulator with the following parameter ranges:

  • “cloudiness”: [0.3, 1.0],

  • “rain”: [0.0, 0.1],

  • “wetness”: [0.0, 0.3],

  • “fog”: [0.0, 0.3].

Therefore, when instantiating concrete scenarios, one can either select the middle value or assign a random value within the range. ComOpT predominantly uses random assignment, but for parameters to be perturbed in later scenario variation (e.g., number-of-vehicles), ComOpT carefully selects the value to avoid it lying on the boundary.
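A minimal sketch of this instantiation step, using the "cloudy" ranges above; the boundary margin is an assumption, not ComOpT's exact rule.

```python
import random

# Parameter ranges for the abstract weather condition "cloudy"
# (taken from the example mapping above).
CLOUDY_RANGES = {
    "cloudiness": (0.3, 1.0),
    "rain": (0.0, 0.1),
    "wetness": (0.0, 0.3),
    "fog": (0.0, 0.3),
}

def instantiate(ranges, mode="random", margin=0.1, rng=random):
    """Pick one concrete value per parameter. 'midpoint' takes the middle
    value; 'random' samples away from the boundaries by a relative margin
    (the margin value is an illustrative assumption)."""
    concrete = {}
    for name, (lo, hi) in ranges.items():
        if mode == "midpoint":
            concrete[name] = (lo + hi) / 2
        else:
            pad = (hi - lo) * margin
            concrete[name] = rng.uniform(lo + pad, hi - pad)
    return concrete
```

Every sampled value stays strictly inside its range, so the concrete scenario remains within the equivalence class defined by the abstract scenario.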

III-C KPI for evaluating the quality of the scenario

Once a concrete scenario is created, ComOpT runs the scenario and uses a set of metrics to evaluate whether an undesired situation occurs. The set of undesired situations considered by ComOpT includes (a) collision or being very close to another agent, (b) uncomfortable braking or sudden acceleration, (c) lateral jerk, (d) route deviation, and finally (e) ignoring traffic signals. Note that route deviation includes events such as driving out-of-road (e.g., onto the sidewalk) as well as the inability to make simple right turns when the destination is simply placed to the right of the intersection.
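A few of these KPI checks can be sketched as threshold tests over the recorded trace. All threshold values below are illustrative assumptions, not ComOpT's calibrated KPIs.

```python
def detect_issues(trace, dt=0.1, accel_max=3.0, brake_max=4.0, gap_min=1.0):
    """Scan a timed trace for undesired situations. Each trace entry is
    (ego_speed_mps, min_gap_to_other_agents_m), sampled every dt seconds;
    the thresholds are assumptions for illustration."""
    issues = set()
    for (v0, _), (v1, gap) in zip(trace, trace[1:]):
        a = (v1 - v0) / dt  # finite-difference acceleration
        if a > accel_max:
            issues.add("sudden-acceleration")
        if -a > brake_max:
            issues.add("uncomfortable-brake")
        if gap < gap_min:
            issues.add("too-close")
    return issues
```

For example, a trace in which the ego vehicle gains 1 m/s within one 0.1 s step and later closes to 0.5 m of another agent is flagged with both "sudden-acceleration" and "too-close".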

III-D Local scenario variation via agent spawning

When a concrete scenario does not demonstrate undesired behavior, ComOpT introduces a new scenario perturbation technique called agent spawning to find scenario variations that make the resulting simulation demonstrate undesired behavior. Initially, the concrete scenario may have some agents (e.g., pedestrians or other road vehicles). The maximum number of additional agents that can be added is constrained by the equivalence class specified in the abstract-to-concrete mapping information. E.g., for a mild vehicle density, the concrete scenario may have a number of vehicles within the range [3, 6]. Therefore, if the current concrete scenario has only 4 vehicles, it is still possible to spawn 2 additional vehicles while the scenario remains within the equivalence class defined by the abstract scenario.

The premise of applying agent spawning is analyzing the timed trace of a concrete scenario. A timed trace represents the state of every agent (ego vehicle, NPC, traffic light, etc.) under a predefined time granularity (e.g., 0.1 second) recorded in the simulation. Analyzing the timed trace serves two goals: (a) extract the high-level behavioral information of the ego vehicle, and, based on this information, (b) decide how to inject a new NPC into the existing scenario that may lead to a collision. In the following, we detail the underlying workflow.

III-D1 Extracting the behavioral sequence

ComOpT first analyzes the timed trace and extracts the behavioral sequence of the ego vehicle, which is a summary of all intermediate actions conducted by the ego vehicle when moving from the source waypoint to the destination waypoint. Consider the example in Figure 5a. It shows the trace of the ego vehicle in a simulated episode from a concrete scenario. Intuitively, the ego vehicle first follows the lane, then performs a lane change, and finally continues to follow the new lane. The behavioral sequence can thus be summarized as ⟨lane-following, lane-change-right, lane-following⟩. In ComOpT, each pattern in the alphabet (for building the behavioral sequence) has a precise interpretation mapping it to a segment of the timed trace. For example,

  • the pattern “lane-change-right” matches a segment of the timed trace starting when the bounding box of the ego vehicle intersects with the right lane separator until the bounding box is fully contained in the right adjacent lane being in the same driving direction, and

  • the pattern “encroaching-change-left” is similar to lane-change-right with the difference that the vehicle moves to the adjacent left lane being in the opposite driving direction.
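The extraction step above can be sketched as follows. The paper's precise definition uses bounding-box overlap with lane separators; this simplified stand-in uses the lane index of the vehicle center per timestep, with lane indices assumed to increase to the right.

```python
def behavioral_sequence(lane_ids):
    """Summarize a per-timestep lane-index trace of the ego vehicle into a
    behavioral sequence. Simplification (our assumption): the vehicle
    center's lane index per sample, with indices increasing to the right,
    instead of the bounding-box/separator definition used by ComOpT."""
    seq = ["lane-following"]
    for prev, cur in zip(lane_ids, lane_ids[1:]):
        if cur == prev:
            continue  # still following the same lane
        move = "lane-change-right" if cur > prev else "lane-change-left"
        seq += [move, "lane-following"]
    return seq
```

For the trace in Figure 5a, a lane-index trace such as [0, 0, 0, 1, 1] yields ⟨lane-following, lane-change-right, lane-following⟩.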

III-D2 Extracting the targeted collision points

Subsequently, ComOpT extracts the targeted collision points from the behavioral sequence, where we expect the introduced NPC to collide with the ego vehicle at one of these points. Consider again Figure 5a, where two points A and B are selected for the first segment lane-following, and point C is selected for the segment lane-change-right. Again, ComOpT has predefined rules for extracting targeted collision points. For instance, for the segment lane-change-right, ComOpT always extracts from the timed trace the position where the center of the ego vehicle first crosses the lane boundary.
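The lane-change-right rule can be sketched as a scan for the first sign change of the signed distance between the vehicle center and the lane boundary; the data layout here is our own assumption.

```python
def first_boundary_crossing(signed_dists, positions):
    """Return the recorded position where the ego vehicle's center first
    crosses the lane boundary, i.e., where its signed distance to the
    boundary (positive on the original-lane side) first becomes
    non-positive. A simplified stand-in for ComOpT's extraction rule."""
    for i, (d0, d1) in enumerate(zip(signed_dists, signed_dists[1:])):
        if d0 > 0 >= d1:  # crossed from the original lane onto the boundary
            return positions[i + 1]
    return None  # no crossing in this trace segment
```

Given a trace whose signed distances go from positive to negative, the function returns the sample at which the crossing happened, which becomes a targeted collision point such as C in Figure 5a.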

III-D3 Altering the scenario by adding one new NPC

Subsequently, ComOpT decides, for each targeted collision point and its associated behavioral pattern, possible ways of introducing an NPC to induce a collision. We again use the scenario in Figure 5a to explain the concept.

  • For targeted collision point C, where the ego vehicle performs a lane-change-right, a natural way of introducing an NPC is to allocate it on the adjacent right lane, and to control the configuration such that, when the NPC drives along the lane, at the time t when the ego vehicle reaches point C, the position of the NPC is also very close to C. This leads to the new scenario demonstrated in Figure 5c.

  • For targeted collision point B, where the ego vehicle performs lane-following, a natural way of introducing an NPC is to allocate it on the same lane with a configuration such as abrupt braking. This leads to the new scenario demonstrated in Figure 5b.

While for each pattern (e.g., lane-following) there is a fixed set of NPC control strategies (e.g., braking or being stationary), how to derive the concrete configuration remains to be solved. For simple control strategies such as setting the NPC to be stationary, deriving the configuration is trivial. For other cases, one needs to utilize physics to infer the possible configuration. For example, consider the scenario in Figure 5c, where ComOpT plans to spawn an NPC to drive along the adjacent lane. Provided that the NPC shall be close to position C at time t, if the NPC drives at a fixed velocity v, then it should initially be placed at a position that is approximately vt meters away from point C. In the implementation, the distance to point C is relaxed to the interval [vt − Δ, vt + Δ], with Δ being a constant that allows considering nearby starting positions.
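The placement computation above amounts to one line of kinematics; the clamping to non-negative distances is our own addition for well-formedness.

```python
def npc_spawn_interval(v, t, delta):
    """Candidate interval for the NPC's initial along-lane distance from the
    targeted collision point C: driving at fixed speed v (m/s), the NPC
    reaches C at roughly time t (s) if it starts about v*t meters away;
    delta relaxes the distance to allow nearby starting positions. The
    clamp at zero is our addition to keep the interval physically valid."""
    return (max(0.0, v * t - delta), v * t + delta)
```

For instance, an NPC at 10 m/s that should reach C at t = 3 s with Δ = 5 m is placed somewhere between 25 and 35 meters from C.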

III-D4 Scenario testing via optimization-based parameter search

Summarizing the process so far: given a concrete scenario, for each targeted collision point, ComOpT derives a parameterized scenario where the parameter is the initial distance between the newly spawned NPC and the targeted collision point. Finally, ComOpT performs a systematic search over the parameterized scenario. One can adopt multiple strategies, such as randomization or uniform sampling, to derive concrete parameters in order to find scenarios where the ego vehicle demonstrates undesired behavior. However, testing a scenario is computationally expensive: it requires executing the autonomous driving stack in the simulation environment. Therefore, we use an optimization-based approach to guide the search for suitable parameters that lead to a collision.

The optimization target is to reach the situation where (1) the newly introduced NPC reaches the targeted collision point first, thereby creating possibilities for the ego vehicle to hit the NPC, while (2) the time when the introduced NPC reaches the targeted collision point is ideally only slightly earlier than that of the ego vehicle.

Therefore, if the time when the NPC drives through the targeted collision point is substantially earlier than the time when the ego vehicle reaches there, ComOpT increases the distance between the NPC’s initial location and the targeted collision point. On the contrary, when the NPC arrives at the targeted collision point much later than the ego vehicle, ComOpT decreases the initial distance. Due to space limits, we refer readers to the source code for concrete configurations such as step size.
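The adjustment rule above can be sketched as the following search loop. The step size, budget, and slack are illustrative assumptions (the paper refers readers to the source code for the concrete configuration), and `run_simulation` is a hypothetical callback wrapping one simulated episode.

```python
def search_spawn_distance(run_simulation, d0, step=2.0, budget=20, slack=0.5):
    """Adjust the NPC's initial distance to the targeted collision point so
    that the NPC arrives slightly earlier than the ego vehicle.
    run_simulation(d) is assumed to return (collision, t_npc, t_ego) for a
    given initial distance d; step, budget, and slack are assumptions."""
    d = d0
    for _ in range(budget):
        collision, t_npc, t_ego = run_simulation(d)
        if collision:
            return d
        if t_npc < t_ego - slack:
            d += step   # NPC arrives far too early: start it farther away
        elif t_npc > t_ego:
            d -= step   # NPC arrives too late: start it closer
        else:
            break       # NPC slightly earlier than ego: desired regime
    return None  # no collision found within the simulation budget
```

Each iteration costs one full simulation run, which is why this guided adjustment is preferred over blind random sampling of the distance parameter.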

III-D5 Managing the ordering of scenarios to be tested, and termination

In the previous subsections, we detailed how to derive a new scenario to be tested for a single targeted collision point. However, as shown in Figure 5a, there are multiple targeted collision points (A, B, C, D) to be considered. To this end, one requires a meta-level search strategy to manage the ordering of the scenarios to be tested, as well as to set up proper termination criteria. We exemplify the underlying meta-level search strategy using Figure 5.

First, ComOpT maintains a priority queue over the targeted collision points, where the priority is based on the behavioral pattern associated with each point. For example, lane-change-right has higher priority than lane-following. Therefore, in Figure 5a, the targeted collision point C is explored first, leading to the parameterized scenario in Figure 5c. By varying the initial location of the NPC and running the simulation, one of the following situations may occur:

  • The ego vehicle collides with other NPCs, as demonstrated in Figure 5f. Then the search terminates.

  • The ego vehicle generates the same behavioral sequence ⟨lane-following, lane-change-right, lane-following⟩, as demonstrated in Figure 5e. As ComOpT stores all visited behavioral sequences, this scenario is not further explored.

  • The ego vehicle generates a new behavioral sequence ⟨lane-following, lane-change-left, lane-following⟩, as demonstrated in Figure 5d. This scenario will be further explored. Therefore, targeted collision points E, F, G, and H will be added to the existing priority queue containing A, B, and D. ComOpT then selects point G to be explored next, since its associated pattern lane-change-left has higher priority.

Finally, to ensure termination, ComOpT sets a fixed budget on the number of simulations to be executed whenever the local scenario perturbation algorithm is triggered.
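The meta-level search can be sketched as a priority queue with a simulation budget. The priority ordering mirrors the paper's example (lane changes before lane following); the `try_point` callback and its outcome labels are hypothetical names standing in for one round of local perturbation.

```python
import heapq

# Lower number = higher priority; this ordering follows the paper's
# example and is otherwise an illustrative assumption.
PATTERN_PRIORITY = {"lane-change-left": 0, "lane-change-right": 0,
                    "lane-following": 1}

def explore(points, try_point, budget):
    """Meta-level search over targeted collision points. points is a list of
    (pattern, point_id); try_point(point_id) perturbs the scenario at that
    point and returns ("collision", []), ("seen", []), or
    ("new-sequence", new_points) when a new behavioral sequence appears.
    Stops on the first collision or when the simulation budget runs out."""
    queue = [(PATTERN_PRIORITY[p], pid) for p, pid in points]
    heapq.heapify(queue)
    used = 0
    while queue and used < budget:
        _, pid = heapq.heappop(queue)
        outcome, new_points = try_point(pid)
        used += 1
        if outcome == "collision":
            return pid
        for p, npid in new_points:  # add points of the new sequence
            heapq.heappush(queue, (PATTERN_PRIORITY[p], npid))
    return None
```

Replaying the Figure 5 example: C (lane-change-right) is popped first; its perturbation yields the new sequence with points E, F, G, H, after which G (lane-change-left) is popped next.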

IV Evaluation on Baidu Apollo and LGSVL

This section details the results of evaluating the Baidu Apollo 6.0 AD stack (master branch taken on 2021.04.14) with the LGSVL simulator (version 2021.01). The configuration reported in this paper is based on the version that participated in the 2021 IEEE Autonomous Vehicle Testing Challenge. We refer readers to the accompanying YouTube channel for a summary of results, including (1) a two-minute teaser video highlighting the techniques and some undesired scenarios, and (2) an additional 40 problematic AD driving scenarios discovered by ComOpT.

For the version used in the competition, ComOpT is set with a default configuration to generate the first 15 abstract scenarios (equivalence classes) that maximally increase coverage governed by 2-way combinatorial testing. We assume that the ODD is defined using the San Francisco map as supported by the LGSVL simulator. Therefore, the atomic map geometry only contains straight lanes, T-way junctions, and 4-way intersections. We need to explicitly specify constraints such as “no roundabouts” to prevent the abstract scenario generator from creating such candidates.

Table I highlights the statistical summary of the generated scenarios. Every column lists the number of such scenarios. The scenario set base has 45 concrete scenarios, as each abstract scenario (15 in total) is instantiated 3 times to build concrete scenarios. The row perturbed considers the scenarios generated by perturbing (as detailed in Sec. III-D) the concrete scenarios in base. The problems too-close and collision are regarded as safety-critical. The others are undesirable but regarded as performance issues. A scenario may contain both safety-critical and performance issues. From Table I, we observe that our local scenario perturbation technique, at least within this experiment, is effective in uncovering safety-critical scenarios. We observe only two types of collision accidents between vehicles in the simulation results: rear collisions and lateral collisions. The reason is that the San Francisco map offered by LGSVL only covers a part of the city. Roads within this part of the city do not have dashed yellow lines, and overtaking by borrowing the lane in the opposite direction is not allowed in Apollo. In such a case, head-to-head collisions cannot happen unless agents are instructed to intentionally violate traffic law.

TABLE I: Statistics on the generated scenarios
types total problematic safety-critical performance
base 45 39 5 39
perturbed 273 243 130 225

Studying the root cause of undesired behaviors, we realize that Apollo may (1) fail to make trivial right turns, as it does not perform a lane change first, (2) fail to keep a safe distance, (3) have frequent interleaving of acceleration and deceleration, even when no external traffic signal exists, (4) violate the traffic rules such as running over red lights when there is still sufficient space to stop, and finally (5) run into trouble when inconsistencies exist between the map in the LGSVL simulator and the map internally stored in Apollo.

V Summary

ComOpT is our initial step towards a vision where testing autonomous driving systems can be systematically approached with scientific rigor. Within ComOpT, we designed and implemented a coverage-driven, two-layered approach that guarantees abstract scenario diversity while being capable of performing local scenario perturbation.

ComOpT still has much potential to be improved and matured, such as integrating methods from the literature for generating scenarios, designing other traffic agents beyond the NPC agents made available by the simulators, or providing proper interfaces to other autonomous driving SW stacks such as Autoware.Auto [22].

References

  • [1] G. Rong, B. H. Shin, H. Tabatabaee, Q. Lu, S. Lemke, M. Možeiko, E. Boise, G. Uhm, M. Gerow, S. Mehta, et al., “LGSVL simulator: A high fidelity simulator for autonomous driving,” in ITSC.   IEEE, 2020, pp. 1–6.
  • [2] “Baidu Apollo Autonomous Driving,” https://apollo.auto/, 2021.
  • [3] C.-H. Cheng, C.-H. Huang, and H. Yasuoka, “Quantitative projection coverage for testing ml-enabled autonomous systems,” in ATVA.   Springer, 2018, pp. 126–142.
  • [4] C.-H. Cheng, C.-H. Huang, and G. Nührenberg, “nn-dependability-kit: Engineering neural networks for safety-critical autonomous driving systems,” in ICCAD.   IEEE, 2019, pp. 1–6.
  • [5] C. Nie and H. Leung, “A survey of combinatorial testing,” ACM Computing Surveys (CSUR), vol. 43, no. 2, pp. 1–29, 2011.
  • [6] “ISO/PAS 21448:2019 road vehicles — safety of the intended functionality,” https://www.iso.org/standard/70939.html, 2019.
  • [7] P. Koopman, U. Ferrell, F. Fratrik, and M. Wagner, “A safety standard approach for fully autonomous vehicles,” in SAFECOMP.   Springer, 2019, pp. 326–332.
  • [8] D. J. Fremont, T. Dreossi, S. Ghosh, X. Yue, A. L. Sangiovanni-Vincentelli, and S. A. Seshia, “Scenic: a language for scenario specification and scene generation,” in PLDI.   ACM, 2019, pp. 63–78.
  • [9] R. Majumdar, A. Mathur, M. Pirron, L. Stegner, and D. Zufferey, “Paracosm: A test framework for autonomous driving simulations,” in FASE.   Springer, 2021, pp. 172–195.
  • [10] T. Dreossi, D. J. Fremont, S. Ghosh, E. Kim, H. Ravanbakhsh, M. Vazquez-Chanlatte, and S. A. Seshia, “VerifAI: A toolkit for the design and analysis of artificial intelligence-based systems,” in CAV.   Springer, 2019, pp. 432–442.
  • [11] K. Viswanadha, F. Indaheng, J. Wong, E. Kim, E. Kalvan, Y. Pant, D. J. Fremont, and S. A. Seshia, “Addressing the IEEE AV test challenge with Scenic and VerifAI,” arXiv preprint arXiv:2108.13796, 2021.
  • [12] L. Ma, F. Zhang, M. Xue, B. Li, Y. Liu, J. Zhao, and Y. Wang, “Combinatorial testing for deep learning systems,” arXiv preprint arXiv:1806.07723, 2018.
  • [13] E. Thorn, S. C. Kimmel, M. Chaka, B. A. Hamilton, et al., “A framework for automated driving system testable cases and scenarios,” NHTSA, Tech. Rep., 2018.
  • [14] A. Zlocki, J. Bock, and L. Eckstein, “Database of relevant traffic scenarios for highly automated vehicles,” in Autonomous Vehicle Test and Development Symposium, 2017, pp. 20–22.
  • [15] “The Safety Pool Initiative,” https://www.safetypool.ai/, 2021.
  • [16] “ASAM OpenSCENARIO 1.1,” https://www.asam.net/standards/detail/openscenario/, 2021.
  • [17] Y. Annpureddy, C. Liu, G. Fainekos, and S. Sankaranarayanan, “S-taliro: A tool for temporal logic falsification for hybrid systems,” in TACAS.   Springer, 2011, pp. 254–257.
  • [18] R. Ben Abdessalem, S. Nejati, L. C. Briand, and T. Stifter, “Testing advanced driver assistance systems using multi-objective search and neural networks,” in ASE.   ACM, 2016, pp. 63–74.
  • [19] M. Koren, S. Alsaif, R. Lee, and M. J. Kochenderfer, “Adaptive stress testing for autonomous vehicles,” in IV.   IEEE, 2018, pp. 1–7.
  • [20] Z. Zhang, P. Arcaini, and I. Hasuo, “Constraining counterexamples in hybrid system falsification: Penalty-based approaches,” in NFM.   Springer, 2020, pp. 401–419.
  • [21] B. Barbot, N. Basset, T. Dang, A. Donzé, J. Kapinski, and T. Yamaguchi, “Falsification of cyber-physical systems with constrained signal spaces,” in NFM.   Springer, 2020, pp. 420–439.
  • [22] “Autoware.Auto AD stack,” https://www.autoware.org/autoware-auto, 2021.