Planning-Assisted Context-Sensitive Autonomous Shepherding of Dispersed Robotic Swarms in Obstacle-Cluttered Environments

Jing Liu jing.liu5@unsw.edu.au/ liujing2605@gmail.com Hemant Singh h.singh@adfa.edu.au Saber Elsayed s.elsayed@unsw.edu.au Robert Hunjet r.hunjet@adfa.edu.au Hussein Abbass h.abbass@unsw.edu.au School of Engineering and Information Technology, University of New South Wales, Canberra ACT, Australia

Abstract

Robotic shepherding is a bio-inspired approach to autonomously guiding a swarm of agents towards a desired location. The area has attracted increasing research interest because a large number of agents in a swarm (sheep) can be controlled using a smaller number of actuators (sheepdogs). However, shepherding a highly dispersed swarm in an obstacle-cluttered environment remains challenging for existing methods. To improve the efficacy of shepherding in complex environments with obstacles and dispersed sheep, this paper proposes a planning-assisted context-sensitive autonomous shepherding framework with collision avoidance abilities. The proposed approach models the swarm shepherding problem as a single Travelling Salesperson Problem (TSP), with two sheepdog modes: no-interaction and interaction. An adaptive switching approach is integrated into the framework to guide real-time path planning for avoiding collisions with static and dynamic obstacles, the latter representing moving sheep swarms. We then propose an overarching hierarchical mission planning system, which is made of three sub-systems: a clustering approach to group and distinguish sheep sub-swarms, an Ant Colony Optimisation algorithm as a TSP solver for determining the optimal herding sequence of the sub-swarms, and an online path planner for calculating optimal paths for both sheepdogs and sheep. Experiments on various environments, both with and without obstacles, demonstrate the effectiveness of the proposed shepherding framework and planning approaches.

keywords:

swarm shepherding, path planning, travelling salesperson problem, ant colony optimisation


1 Introduction

As a bio-inspired swarm guidance approach, robotic shepherding seeks to guide a swarm of agents (e.g., sheep flock, crowd) to a goal area by controlling the movement of one or more outside robots (known as sheepdogs or shepherds) [1]. Simulating shepherding behaviour has attracted increasing scholarly attention because the level of abstraction in the shepherding problem maps to many real-world applications such as crowd control [2], precision agriculture [3], object collection [4], robotic manipulation [5], and preventing birds from entering the airspace around airports [6].

One of the most challenging issues in shepherding is how to increase the success rate while reducing the mission completion time when herding a large number of sheep that are highly dispersed in an environment with obstacles (more on this challenge in Section 2.1).

Existing swarm shepherding methods can be roughly classified as rule-based methods [7] [8], learning-based methods [9] [10], and planning-based methods [11] [12]. Rule-based algorithms lack the flexibility and adaptability required to manage a wide range of environments [13]. While learning-based methods have the potential to address adaptability, they rely heavily on training and require a large amount of data and/or significant computational time for training [14]. Planning-based methods integrate planning approaches (e.g., optimisation techniques) into rule-based methods to guide sheepdog behaviour. However, current literature is limited to motion (e.g., path) planning algorithms, mostly for a single agent or for multiple agents exhibiting self-control only (i.e., they do not need to exercise indirect control over groups). Therefore, existing methods have difficulty addressing the shepherding problem. The problem is compounded when sheepdogs have limited influence ranges and need to herd a large flock that is dispersed into several sub-swarms in environments containing obstacles.

Planning is an important research field in robotics and artificial intelligence [15] [16] [17]. The field promises to improve shepherding performance in terms of success rate and completion time [18]. Aiming at effective shepherding of multiple sub-swarms in obstacle-cluttered environments, this paper focuses on planning-based methods. We capitalise on the similarity between multiple sub-swarm shepherding and the Travelling Salesperson Problem (TSP) to realise the benefits of path planning in obstacle-cluttered environments. TSP is a well-known route planning problem for determining the optimal visiting sequence of a list of cities [19] and has been extensively studied, as described in Section 2.2. However, the similarities between TSP and shepherding problems, and the application of TSP to them, have not been studied. Path planning using metaheuristics such as Evolutionary Computation (EC) algorithms [20] and the Rapidly Exploring Random Tree algorithm [12] has shown promising early results in facilitating swarm shepherding.

This paper proposes a planning-assisted swarm shepherding framework by integrating the TSP with path planning to improve the effectiveness of shepherding, especially for a highly dispersed sheep swarm in an obstacle-cluttered environment. In the proposed shepherding framework, the sheep swarm is firstly divided into sub-swarms to identify the set of virtual ‘cities’ and then the shepherding problem is transformed into a single TSP to determine the optimal push sequence of sheep sub-swarms. Path planning is integrated with TSP for finding the optimal path for the sheep sub-swarm to be herded towards the next ‘city’ sequentially without collision with obstacles, and for the sheepdog to move towards the driving point of the sheep sub-swarm in real time. A primary difference between a classic TSP and the way it is adopted for shepherding is that the travelling salesperson actuates on itself, while in shepherding, it actuates on a group with unpredictable responses from the members of the groups. To handle this challenge effectively, we needed to combine the offline TSP with on-line heuristics to manage the emerging dynamics from these indirect interactions.

We consider two modes for a sheepdog to respond to context-sensitive information. The first is a context-sensitive interaction mode, in which the sheepdog forces the sheep sub-swarm to move. The second is a no-interaction mode, in which the influence of sheepdogs on the sheep swarm is kept minimal to avoid undesired movements of sheep. An adaptive switching approach is proposed to assist sheepdogs in switching between the interaction and no-interaction modes based on context during real-time operations. Subsequently, we present a hierarchical mission planning algorithm, which combines the offline grouping and TSP solver with the online path planner to solve the optimisation problems involved in the framework. The grouping method divides the sheep swarm by evaluating whether cohesion forces exist among the sheep, and a well-known optimisation approach, the Max-Min Ant System (MMAS) [21], is used to address the TSP. In addition, a two-layer path planner, A*-Post Processing (A*-PP), is presented to optimise the paths for both the sheepdogs and the sheep swarm.

The contributions of this paper include the following:

1. A planning-assisted swarm shepherding model is proposed to effectively herd a highly dispersed sheep swarm to the goal in obstacle-cluttered environments.

2. The swarm shepherding problem is formulated as a single TSP to determine the optimal herding sequence of sub-swarms.

3. A context-sensitive response model is introduced in which the sheepdog adaptively switches between two modes of operation during real-time path planning.

4. A hierarchical mission planning system is designed, consisting of offline grouping and MMAS-based sequencing as well as online path planning.

The remainder of the paper is organised as follows. Section 2 provides a review of the works related to swarm shepherding and mission planning. Section 3 presents the basic shepherding model, while Section 4 describes the proposed planning-assisted shepherding model. The planning algorithms used in the proposed shepherding model are presented in Section 5, followed by the experimental results and analysis in Section 6. Section 7 concludes the paper.

2 Related works

This section reviews work related to swarm shepherding and to planning approaches.

2.1 Swarm shepherding

The success of robotic swarm shepherding relies on the modelling of sheep flocking behaviour and the design of sheepdog control strategies. The rules of BOIDS [22] [23] [24] are the most common sheep modelling method, where separation, cohesion and alignment of sheep are considered. To improve robustness when herding larger flocks, Harrison et al. [25] viewed the flock as an abstracted deformable shape, while Hu et al. [26] used adaptive protocols and artificial potential field methods to model the sheep flocking behaviour.

The shepherding field of research has focused more on the design of sheepdog control strategies. As a representative work of rule-based shepherding methods, Strömbom et al.'s shepherding algorithm [7] laid the foundations for many other shepherding methods, such as the modulation model in [8] and the Reinforcement Learning (RL) approach in [27]. Strömbom et al. [7] (described in Section 3) simulated two typical sheepdog behaviours: collecting the dispersed sheep, and driving the aggregated sheep swarm to a specific location. They found that the mission completion time increases and the success rate decreases as the number of agents in the swarm increases.

A coordination algorithm was designed in [26] to employ multiple robotic sheepdogs to herd two flocks of sheep, consisting of 20 and 30 sheep, respectively. It was observed that the proposed algorithm could not handle the shepherding of a large flock. El-Fiqi et al. [13] investigated the influence of some key factors (e.g., the density of obstacles and the initial spatial distribution of sheep) on the complexity of shepherding and identified the limitations of reactive shepherding. It was suggested that an increase in the density of obstacles and in the sheep's initial level of dispersion in the environment escalates problem complexity and reduces the mission success rate.

Learning-based methods have also been studied [10] [9] [28]. Go et al. [27] extended Strömbom et al.’s model by applying RL for learning the sheepdog’s behaviour policy. Hussein et al. [14] decomposed the shepherding problem into two sub-problems: learning to push an agent from a location to the destination and selecting whether to collect scattered agents or drive the largest flock to the destination. They aimed to reduce the problem’s complexity and proposed a curriculum-based RL to accelerate the learning process. However, the investigation of the swarm shepherding problem with multiple sub-swarms randomly dispersed in the obstacle-cluttered environment is still limited, and an efficient way to address this problem is lacking.

2.2 Planning approaches

Mission planning approaches such as path planning algorithms have been well investigated and applied for mobile robots. The planning sub-problems (e.g., path planning, route planning, task assignment) involved in mission planning are defined, and the promising applications for swarm shepherding are discussed in [18]. Long et al. [1] also suggested that the sheepdog should take charge of high-level planning, such as path planning and task allocation, for completing complex shepherding tasks. Some research on the application of planning approaches to swarm shepherding exists [11] [12]. For instance, Lien and Pratt [2] presented a computer-human interactive motion planning method to address the shepherding problem. They observed that the planner lacks efficiency when the flock separates into several sub-groups. To modify the shepherding model in environments with obstacles, Elsayed et al. [20] presented a 2-stage differential evolution-based path planning algorithm that optimises the path for the sheepdog and sheep. They demonstrated that the path planning algorithm could reduce the time to complete the shepherding task.

TSP is a well-known NP-hard route planning problem that aims to find the route with the optimal cost for a salesperson to visit each city exactly once and return to the initial city, given a set of cities and the travelling cost between each pair of cities [19]. TSP is a generalisation of, or can be applied to, many real-world problems, such as the vehicle routing problem (VRP) [29], multi-robot task allocation [30], multi-regional coverage path planning [31], and transportation and delivery [32]. Significant research has been conducted on TSP [33] [34]. Some effective approaches for solving the TSP include EC [35] [36] and swarm-intelligence algorithms such as Ant Colony Optimisation (ACO) [37], which has demonstrated its ability to solve TSP in multiple studies [21] [38] [39]. Many variants of TSP, such as Multiple TSP (MTSP) and Dynamic TSP (DTSP) [35], exist. For example, when there are multiple salespersons, the problem is called MTSP and can be further classified as single-depot or multi-depot based on where the salespersons depart from [40]. Transformation methods have been used to convert a complex TSP to a classic single TSP for which general and efficient TSP solvers exist [41]. Shepherding problems, especially with multiple sub-swarms, share some similarities with TSP. For example, in both problems there are some locations (swarm locations or 'cities') that must be visited by some agents (sheepdogs or salespersons). However, to the best of our knowledge, TSP has not been applied to the robotic shepherding problem before.

3 Strömbom model

Before moving to the proposed approach, we briefly describe the model proposed by Strömbom et al. to introduce the terminology associated with shepherding that will be used subsequently. Let the sheep swarm be $\Pi=\{\pi_{1},...,\pi_{i},...,\pi_{N}\}$ where $\pi_{i}$ denotes a sheep agent and $N$ is the number of sheep agents in the swarm. $B=\{\beta_{1},...,\beta_{j},...,\beta_{M}\}$ is the set of sheepdog agents (UGVs) with $M$ sheepdogs denoted as $\beta_{j}$ . The goal position which sheepdogs herd the sheep swarm towards is denoted as $P_{G}$ . The position of $\pi_{i}$ / $\beta_{j}$ at time step $t$ is denoted as $P_{\pi_{i}}^{t}$ / $P_{\beta_{j}}^{t}$ . As per [8] [20] [13], sheep $\pi_{i}$ total force $F_{\pi_{i}}^{t}$ and sheepdog $\beta_{j}$ total force $F_{\beta_{j}}^{t}$ are calculated as Equation (1) and Equation (2) respectively.

\begin{split}F_{\pi_{i}}^{t}=W_{\pi_{v}}F_{\pi_{i}}^{t-1}+W_{\pi\Lambda}F_{\pi_{i}\Lambda_{\pi_{i}}^{t}}^{t}+W_{\pi\beta}F_{\pi_{i}\beta_{j}}^{t}\\ +W_{\pi\pi}F_{\pi_{i}\pi_{i_{1}}}^{t}+W_{\pi o}F_{\pi_{i}o}^{t}+W_{e\pi_{i}}F_{\pi_{i}\epsilon}^{t}\end{split}

(1)

F_{\beta_{j}}^{t}=F_{\beta_{j}cd}^{t}+W_{e\beta_{j}}F_{\beta_{j}\epsilon}^{t}

(2)

where each $W$ is the weight of the corresponding force vector. Each force vector is described as follows:

For sheep $\pi_{i}$ :

1. $F_{\pi_{i}}^{t-1}$ is the previous total force vector;

2. $F_{\pi_{i}\Lambda_{\pi_{i}}^{t}}^{t}$ represents the attraction force to its neighbours $\Lambda_{\pi_{i}}^{t}$ within the cohesion range $R_{\Lambda}$ ;

3. $F_{\pi_{i}\beta_{j}}^{t}$ represents the repulsion force from sheepdog $\beta_{j}$ if $\pi_{i}$ is within the influence range of the sheepdog $R_{\pi\beta}$ ;

4. $F_{\pi_{i}\pi_{i_{1}}}^{t}$ is the repulsion force from other sheep $\pi_{i_{1}},i_{1}\neq i$ within the sheep avoidance radius $R_{\pi\pi}$ ;

5. $F_{\pi_{i}o}^{t}$ is the repulsion force from the obstacles $o$ within the obstacle avoidance radius $R_{\pi o}$ ;

6. $F_{\pi_{i}\epsilon}^{t}$ is the random force added to sheep $\pi_{i}$ .

For sheepdog $\beta_{j}$ :

1. $F_{\beta_{j}cd}^{t}$ represents the normalised force vector that makes the sheepdog move to the driving point $P_{D}^{t}$ or collection point $P_{C}^{t}$ ;

2. $F_{\beta_{j}\epsilon}^{t}$ is the random force added to sheepdog $\beta_{j}$ to help avoid deadlocks.

To complete the shepherding mission, sheepdog agents switch between driving behaviour and collecting behaviour by evaluating whether any sheep has strayed too far from the flock, as shown in Algorithm 1. Specifically, if the distance between any sheep and the Global Centre of Mass (GCM) of the flock is greater than the neighbourhood range $R_{n}$ , the sheepdog moves to the collecting point $P_{C}^{t}$ , which is located behind the furthest sheep $\pi_{f}^{t}$ in the direction of the GCM; otherwise, the sheep are clustered in the flock and the sheepdog executes a driving behaviour by moving to the driving point $P_{D}^{t}$ , which is located behind the GCM relative to the final goal $P_{G}$ . $P_{D}^{t}$ and $P_{C}^{t}$ are calculated as follows:

P_{D}^{t}=GCM^{t}+(R_{n}+R_{s})\frac{GCM^{t}-P_{G}}{||GCM^{t}-P_{G}||}

(3)

P_{C}^{t}=P_{\pi_{f}}^{t}+R_{s}\frac{P_{\pi_{f}}^{t}-GCM^{t}}{||P_{\pi_{f}}^{t}-GCM^{t}||}

(4)

R_{n}=R_{\pi\pi}\sqrt{2N}

(5)

where $R_{s}$ is the safe operation distance between a sheepdog and a sheep.

Algorithm 1 Herding( $P_{G}$ , $GCM^{t}$ , $\Pi$ )

1: Input: $P_{G}$ , $GCM^{t}$ , $\Pi=\{\pi_{1},...,\pi_{i},...,\pi_{N}\}$

2: Locate the furthermost sheep $\pi_{f}$

3: if the distance between $\pi_{f}$ and $GCM^{t}$ $>R_{n}$ then

4: Calculate the collecting point $P_{C}^{t}$ using Equation (4)

5: else

6: Calculate the driving point $P_{D}^{t}$ using Equation (3)

7: end if

8: Output: $P_{D}^{t}$ or $P_{C}^{t}$
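The switching rule of this section can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function name `herding_target`, the tuple-based 2-D points and the returned mode labels are assumptions made for the example. Following the prose, the driving point is placed behind the GCM as seen from the goal and the collecting point behind the furthest sheep as seen from the GCM; that sign convention is also an assumption of the sketch.

```python
import math


def herding_target(goal, sheep, r_pipi, r_s):
    """Return (mode, point) for one sheepdog decision step.

    goal   : (x, y) of the final goal P_G
    sheep  : list of (x, y) sheep positions
    r_pipi : sheep avoidance radius R_pipi
    r_s    : safe operation distance R_s
    Assumes the GCM does not coincide with the goal or the furthest sheep.
    """
    n = len(sheep)
    gcm = (sum(x for x, _ in sheep) / n, sum(y for _, y in sheep) / n)
    r_n = r_pipi * math.sqrt(2 * n)  # neighbourhood range, Equation (5)

    # Furthest sheep from the global centre of mass.
    far = max(sheep, key=lambda p: math.dist(p, gcm))
    d_far = math.dist(far, gcm)

    if d_far > r_n:
        # Collecting: stand R_s behind the stray sheep, away from the GCM.
        ux, uy = (far[0] - gcm[0]) / d_far, (far[1] - gcm[1]) / d_far
        return "collect", (far[0] + r_s * ux, far[1] + r_s * uy)

    # Driving: stand R_n + R_s behind the GCM, away from the goal.
    d_goal = math.dist(gcm, goal)
    ux, uy = (gcm[0] - goal[0]) / d_goal, (gcm[1] - goal[1]) / d_goal
    return "drive", (gcm[0] + (r_n + r_s) * ux, gcm[1] + (r_n + r_s) * uy)
```

Calling the function once per time step with the current sheep positions reproduces the collect-or-drive decision; the returned point is the location the sheepdog should head for.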

Then the position $P_{\beta_{j}}^{t+1}$ of sheepdog $\beta_{j}$ and the position $P_{\pi_{i}}^{t+1}$ of sheep $\pi_{i}$ are updated according to Equation (6) and Equation (7), respectively.

P_{\beta_{j}}^{t+1}=P_{\beta_{j}}^{t}+S_{\beta_{j}}^{t}F_{\beta_{j}}^{t}

(6)

P_{\pi_{i}}^{t+1}=P_{\pi_{i}}^{t}+S_{\pi_{i}}^{t}F_{\pi_{i}}^{t}

(7)

where $S_{\beta_{j}}^{t}$ and $S_{\pi_{i}}^{t}$ represent the moving speed of sheepdog $\beta_{j}$ and sheep $\pi_{i}$ .
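As a small illustration of Equations (1), (6) and (7), the sketch below combines named force vectors into a weighted total and applies one position update. The helper names `total_force` and `step`, and the dictionary keys used for the force components, are hypothetical; how each individual force vector is computed is not covered here.

```python
def total_force(components, weights):
    """Weighted sum of 2-D force vectors, in the style of Equation (1).

    components : dict mapping a force name to its (fx, fy) vector
    weights    : dict mapping the same names to scalar weights W
    """
    fx = sum(weights[name] * vec[0] for name, vec in components.items())
    fy = sum(weights[name] * vec[1] for name, vec in components.items())
    return (fx, fy)


def step(position, speed, force):
    """One position update, P^{t+1} = P^t + S^t * F^t (Equations (6) and (7))."""
    return (position[0] + speed * force[0], position[1] + speed * force[1])
```

For example, a sheep with an inertia term `(1.0, 0.0)` weighted by 0.5 and a sheepdog repulsion term `(0.0, 2.0)` weighted by 1.0 has total force `(0.5, 2.0)`; moving at speed 2.0 from the origin places it at `(1.0, 4.0)`.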

4 Planning-assisted swarm shepherding framework

As discussed in Section 2, existing shepherding models are inefficient when the sheep agents are too dispersed and the density of obstacles is high. To address this issue, this section proposes a planning-assisted swarm shepherding framework that improves shepherding efficacy by integrating a grouping/clustering approach, a TSP solver, and localised path planning and navigation.

4.1 Grouping of dispersed sheep in the environment

Given a highly dispersed sheep swarm $\Pi$ in an environment, the first step in the planning-assisted shepherding framework is to group the dispersed sheep into some sub-swarms and locate the Local Centre of Mass (LCM) of each sub-swarm. The set of sheep sub-swarms is denoted as

\Phi=\{\phi_{1},...,\phi_{q},...,\phi_{Q}\}

(8)

where $Q$ is the number of sub-swarms and the sub-swarms $\phi_{q}$ satisfy $\bigcup_{q=1}^{Q}\phi_{q}=\Pi$ , $\bigcap_{q=1}^{Q}\phi_{q}=\emptyset$ and $\phi_{q}\neq\emptyset,\ \forall q\in\{1,...,Q\}$ . A sheep is assigned to a sub-swarm $\phi_{q}$ if it is within the cohesion range $R_{\Lambda}$ of any sheep in this sub-swarm. The LCM of $\phi_{q}$ at time step $t$ is calculated as

LCM_{q}^{t}=\frac{1}{N_{s}^{q}}\sum_{l=1}^{N_{s}^{q}}P_{\pi_{l}^{q}}^{t}

(9)

where $N_{s}^{q}$ is the number of sheep in the sub-swarm and $P_{\pi_{l}^{q}}^{t}$ is the position of the $l$-th sheep grouped in $\phi_{q}$ at $t$ . The LCM of each $\phi_{q}$ is regarded as a target location, which the sheepdog should visit.
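The grouping rule and Equation (9) can be illustrated with a short Python sketch. It is a plain flood-fill stand-in that links any two sheep closer than the cohesion range and takes connected components as sub-swarms; it is not the paper's own grouping procedure, and the function names `group_sheep` and `local_centre_of_mass` are hypothetical.

```python
import math


def group_sheep(positions, r_cohesion):
    """Partition sheep indices into sub-swarms.

    Two sheep belong to the same sub-swarm when a chain of sheep, each
    within r_cohesion of the next, connects them.
    """
    unvisited = set(range(len(positions)))
    groups = []
    while unvisited:
        seed = min(unvisited)  # deterministic choice of the next seed
        unvisited.remove(seed)
        stack, members = [seed], [seed]
        while stack:
            i = stack.pop()
            near = [j for j in unvisited
                    if math.dist(positions[i], positions[j]) <= r_cohesion]
            for j in near:
                unvisited.remove(j)
            stack.extend(near)
            members.extend(near)
        groups.append(sorted(members))
    return groups


def local_centre_of_mass(positions, members):
    """Mean position of the sheep in one sub-swarm, Equation (9)."""
    n = len(members)
    return (sum(positions[i][0] for i in members) / n,
            sum(positions[i][1] for i in members) / n)
```

With positions `[(0, 0), (1, 0), (2, 0), (10, 10), (10, 11)]` and a cohesion range of 1.5, the sketch yields the two sub-swarms `[0, 1, 2]` and `[3, 4]`, whose LCMs are `(1.0, 0.0)` and `(10.0, 10.5)`.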

4.2 Transforming the swarm shepherding problem to the TSP for task sequencing

After obtaining LCMs $\{LCM_{1}^{t},...,LCM_{q}^{t},...,LCM_{Q}^{t}\}$ of sub-swarms, the swarm shepherding problem can be transformed into a variant of TSP. This section discusses how to transform the single-sheepdog shepherding and bi-sheepdog shepherding problems to a single TSP and presents the mathematical formulation of the shepherding-transformed single TSP. Subsequently, the general TSP solver (presented in Section 5.1) can be employed to find the optimal push sequence of sheep sub-swarms to guide the sheepdog(s)’ behaviours.

4.2.1 Transforming the single-sheepdog shepherding problem

To transform the single-sheepdog shepherding problem, we first describe how the shepherding mission is expected to be completed in our proposed model. For illustrative purposes, Fig. 1(a) presents a single-sheepdog swarm shepherding problem with sheep dispersed randomly in the obstacle-free environment and a sheepdog located in the top-right corner. The grouping result is indicated in Fig. 1(b) with 5 sub-swarms $\{\phi_{1},\phi_{2},\phi_{3},\phi_{4},\phi_{5}\}$ in different colours and the LCMs $\{LCM_{1}^{t},LCM_{2}^{t},LCM_{3}^{t},LCM_{4}^{t},LCM_{5}^{t}\}$ are represented as black crosses. Assuming the optimal push sequence of sub-swarms is $[1,2,3,4,5]$ , Fig. 1(b) illustrates how the sheepdog is going to drive the sequenced sheep sub-swarms to reach the goal area.

Similar to the description in Section 3, the driving point of each sub-swarm is located behind the sub-swarm in the direction of the next target location, maintaining the distance of $R_{n}+R_{s}$ from the LCM of the sub-swarm. The driving point $P_{D_{q}}$ for each sub-swarm $\phi_{q}$ is represented as a yellow square in Fig. 1(b). To control the sheep sub-swarms to move as indicated by the grey thick directed line segments in Fig. 1(b), the sheepdog should follow the route represented by the red directed line segments and curves. To assist the sheepdog in switching to drive another sub-swarm, switch points $P_{SW_{q}}^{t}$ are introduced in the proposed model and are represented as blue triangles in Fig. 1(b). Specifically, the sheepdog $\beta$ departs from its initial position $P_{\beta}^{0}$ for $P_{D_{1}}^{t}$ , pushes $\phi_{1}$ to $LCM_{2}^{t}$ by travelling to $P_{SW_{2}}^{t}$ , then switches to $P_{D_{2}}^{t}$ for pushing $\phi_{2}$ , and repeats this process until all the sub-swarms reach $P_{G}$ .

To define a TSP, two vital issues need to be addressed. These are 1) identifying the list of cities and 2) evaluating the travelling cost between each pair of cities. In the single-sheepdog swarm shepherding problem, $P_{\beta}^{0}$ , $P_{G}$ and the areas in which the sub-swarms $\phi_{q}$ are located (blue dashed circles in Fig. 1(b)) constitute the set of cities. For convenience, let the sheepdog’s initial position $P_{\beta}^{0}$ be $LCM_{0}^{t}$ , $\phi_{0}=\emptyset$ and the final goal $P_{G}$ be $LCM_{Q+1}^{t}$ . The route’s start and end city are fixed to be $LCM_{0}^{t}$ and $LCM_{Q+1}^{t}$ to ensure that the sheepdog departs from $P_{\beta}^{0}$ and pushes all the sheep to $P_{G}$ . The travelling cost between each pair of cities should be evaluated by the cost $C_{qq^{\prime}}$ for pushing $\phi_{q}$ from $LCM_{q}^{t}$ to $LCM_{q^{\prime}}^{t},\ q,\ q^{\prime}\in\{0,1,2,...,Q+1\},\ q\neq q^{\prime}$ . However, it is challenging to precisely evaluate $C_{qq^{\prime}}$ as shepherding is a complex, interactive, dynamic process involving some uncontrollable factors. In this study, we simplify the evaluation of $C_{qq^{\prime}}$ by calculating it as the distance between $LCM_{q}^{t}$ and $LCM_{q^{\prime}}^{t}$ for the obstacle-free environment, and as the cost of the generated path between $LCM_{q}^{t}$ and $LCM_{q^{\prime}}^{t}$ ( $C_{qq^{\prime}}=C\_{Pa}(LCM_{q}^{t},LCM_{q^{\prime}}^{t})$ , as calculated in Equation (17)) for the obstacle-laden environment.
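For the obstacle-free case, the city list and cost matrix described above can be sketched as follows. The function names `city_list` and `euclidean_cost_matrix` are hypothetical, and the sketch covers only the straight-line distance simplification, not the path-based cost used when obstacles are present.

```python
import math


def city_list(dog_start, lcms, goal):
    """Cities in index order: LCM_0 (sheepdog start), LCM_1..LCM_Q, LCM_{Q+1} (goal)."""
    return [dog_start, *lcms, goal]


def euclidean_cost_matrix(cities):
    """C[q][q'] = straight-line distance between city q and city q'."""
    return [[math.dist(a, b) for b in cities] for a in cities]
```

For instance, with a sheepdog at `(0, 0)`, one sub-swarm LCM at `(3, 4)` and a goal at `(3, 0)`, the matrix gives a cost of 5.0 from the start to the sub-swarm and 4.0 from the sub-swarm to the goal.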

4.2.2 Transforming the bi-sheepdog shepherding problem

Similarly, the bi-sheepdog shepherding problem can be regarded as a multiple TSP where multiple sheepdogs depart from their corresponding initial locations $P_{\beta_{j}}^{0}$ (depots) to visit each LCM (city) exactly once for collecting the dispersed sub-swarms $\Phi=\{\phi_{1},...,\phi_{q},...,\phi_{Q}\}$ and finally drive them to reach the goal location $P_{G}$ (terminal). The sheepdogs are not required to return to their initial locations. In this section, we further convert the shepherding-transformed multiple TSP to a single TSP so that the general single TSP solver can be employed to address the problem.

To solve the bi-sheepdog shepherding problem as a single TSP (STSP), we regard the initial position of one sheepdog $P_{\beta_{1}}^{0}$ as $LCM_{0}^{t}$ , the goal location $P_{G}$ as $LCM_{Q+1}^{t}$ and the other sheepdog’s initial position $P_{\beta_{2}}^{0}$ as $LCM_{Q+2}^{t}$ . $\{LCM_{0}^{t},...,LCM_{q}^{t},...,LCM_{Q+2}^{t}\}$ is the set of city locations. The start and the end city of the route are fixed to be $LCM_{0}^{t}$ and $LCM_{Q+2}^{t}$ . Then a solution of the STSP can be converted to a solution of the MTSP by splitting it into two lists at $LCM_{Q+1}^{t}$ and reversing the order of the latter list. In this way, the first city of each list is the initial position of a sheepdog ( $P_{\beta_{1}}^{0}$ or $P_{\beta_{2}}^{0}$ ) and the end city is the goal ( $P_{G}$ ). The other cities on each list are the sub-swarms to be driven by the sheepdog located at the start of that list. Fig. 2(a) shows a solution of the STSP, and Fig. 2(b) illustrates the transformed solution of the MTSP and the shepherding process guided by the MTSP solution.
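The split-and-reverse step that turns an STSP route into two sheepdog routes can be sketched in a few lines of Python. The function name `split_route` and the use of integer city indices are assumptions made for illustration.

```python
def split_route(route, goal_index):
    """Split a single-TSP route into one route per sheepdog.

    route      : list of city indices starting at 0 (first sheepdog) and
                 ending at Q+2 (second sheepdog), passing through the goal
    goal_index : index of the goal city, Q+1
    Both returned routes start at a sheepdog and end at the goal.
    """
    k = route.index(goal_index)
    first = route[:k + 1]                # first sheepdog: start ... goal
    second = list(reversed(route[k:]))   # second sheepdog: start ... goal
    return first, second
```

With four sub-swarms (goal index 5, second sheepdog index 6), the route `[0, 2, 1, 5, 4, 3, 6]` splits into `[0, 2, 1, 5]` for the first sheepdog and `[6, 3, 4, 5]` for the second.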

4.2.3 Mathematical formulation of the TSP

Based on the abovementioned discussion, the solution of the shepherding-transformed TSP is formulated as follows:

\psi_{qq^{\prime}},\ q,\ q^{\prime}\in\{0,1,2,...,Q+M\},\ q\neq q^{\prime}

(10)

where $q,\ q^{\prime}$ are the indexes of LCMs. If the sheepdog pushes $\phi_{q}$ from $LCM_{q}^{t}$ to $LCM_{q^{\prime}}^{t}$ , $\psi_{qq^{\prime}}=1$ ; otherwise, $\psi_{qq^{\prime}}=0$ . $M$ is the number of sheepdogs. $M$ is limited to $\{1,2\}$ here.

The optimisation objective of the TSP is:

Minimize:

\displaystyle F=\sum_{q=0}^{Q+M}\sum_{q^{\prime}=0}^{Q+M}C_{qq^{\prime}}\cdot\psi_{qq^{\prime}}

(11)

Subject to:

\sum_{q=0,q\neq q^{\prime}}^{Q+M}\psi_{qq^{\prime}}=1,\ \forall q^{\prime}\in\{1,2,...,Q+M\}

(12)

\sum_{q^{\prime}=0,q^{\prime}\neq q}^{Q+M}\psi_{qq^{\prime}}=1,\ \forall q\in\{0,1,...,Q+M-1\}

(13)

\sum_{q=1}^{Q+M}\psi_{q0}=0

(14)

\sum_{q^{\prime}=0}^{Q+M-1}\psi_{(Q+M)q^{\prime}}=0

(15)

Here, $F$ is the total cost and $C_{qq^{\prime}}$ is the cost to push $\phi_{q}$ from $LCM_{q}^{t}$ to $LCM_{q^{\prime}}^{t}$ . Constraints (12) and (13) ensure that each target location is visited exactly once. Constraints (14) and (15) ensure that the sheepdog $\beta_{j}$ departs from $P_{\beta_{j}}^{0}$ and finally reaches $P_{G}$ .
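To make the objective and constraints concrete, the sketch below evaluates the cost of an open route with a fixed first and last city and finds the best ordering of the intermediate cities by exhaustive search. Brute force is used here only as a stand-in that is practical for a handful of sub-swarms; it is not the MMAS solver used in the paper, and the names `route_cost` and `best_route` are hypothetical.

```python
from itertools import permutations


def route_cost(route, cost):
    """Sum of C[q][q'] over consecutive cities, as in Equation (11)."""
    return sum(cost[a][b] for a, b in zip(route, route[1:]))


def best_route(cost):
    """Cheapest route from city 0 to the last city visiting every other city once.

    Fixing the first and last city mirrors constraints (14) and (15);
    enumerating permutations of the rest mirrors constraints (12) and (13).
    """
    n = len(cost)
    inner = range(1, n - 1)
    best = min(permutations(inner),
               key=lambda p: route_cost((0, *p, n - 1), cost))
    route = [0, *best, n - 1]
    return route, route_cost(route, cost)
```

Because `route_cost` reads `cost[a][b]` in travel order, the sketch also works for an asymmetric cost matrix, where pushing a sub-swarm from one location to another need not cost the same in both directions.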

4.3 Path planning for sheepdog(s) and sheep swarm

Given the sequenced sub-swarms $\{\phi_{1}^{\prime},...,\phi_{q}^{\prime},...,\phi_{Q}^{\prime}\}$ and the corresponding LCMs $\{LCM_{1}^{\prime},...,LCM_{q}^{\prime},...,LCM_{Q}^{\prime}\}$ , the mission of the sheepdog can be regarded as a set of sequential sub-tasks, i.e., pushing $\phi_{q}^{\prime}$ from $LCM_{q}^{\prime}$ to $LCM_{q+1}^{\prime}$ , $\forall q\in\{1,...,Q\}$ . Path planning is crucial for both the sheepdog(s) and the sheep swarm to reduce detours and mission completion time. Next, we present the mathematical formulation of path planning and discuss how to integrate the path planning algorithm into shepherding based on a proposed classification of sheepdog moving modes.

4.3.1 Mathematical formulation of path planning

In this paper, the path is defined as a sequence of way-points that can be connected as a set of path segments. Denoting the start and goal points as $W_{0}$ and $W_{D+1}$ , respectively, the solution of path planning between the two points could be represented as:

Path(W_{0},W_{D+1})=\{W_{0},W_{1},...,W_{d},...,W_{D+1}\}

(16)

In the obstacle-cluttered environment, collision avoidance is a hard constraint which means that a path is infeasible once it intersects, i.e. collides, with any obstacle in the environment.

Referring to [42] [43], the cost evaluation function of a feasible path, which is also the optimisation objective of path planning, is defined as:

C\_{Pa}(W_{0},W_{D+1})=\alpha_{1}\cdot C\_L(D)+\alpha_{2}\cdot C\_Th(D)

(17)

where $\alpha_{1}$ and $\alpha_{2}$ are the weights of costs.

$C\_L$ is the path length cost and is calculated as:

C\_L(D)=\sum_{d=0}^{D}||W_{d}-W_{d+1}||

(18)

$C\_Th$ is the threat cost evaluating the unwanted disturbance of sheepdogs on sheep and is calculated as:

C\_Th(D)=\sum_{d=0}^{D}Threat_{d}

(19)

$Threat_{d}=1$ if the path segment from $W_{d}$ to $W_{d+1}$ collides with the threat area which is defined as a set of circles with the centre points $P^{t}_{\pi_{i}},\forall i\in\{1,...N\}$ and the radius $R_{th}$ . $R_{th}$ is the threat range, representing the distance that the sheepdog should keep from the sheep to avoid the unwanted influence. A large $R_{th}$ will increase the path length cost of the sheepdog to reach the target point while a small $R_{th}$ might disturb the sheep and cause unexpected movements.
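Equations (17) to (19) can be sketched in Python as below. The function names are hypothetical, and treating a segment that exactly touches a threat circle as a collision is an assumption of the sketch; feasibility with respect to static obstacles is not checked here.

```python
import math


def _segment_hits_circle(a, b, centre, radius):
    """True if the segment a-b comes within `radius` of `centre`."""
    ax, ay = a
    bx, by = b
    cx, cy = centre
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        # Parameter of the closest point on the segment, clamped to [0, 1].
        t = max(0.0, min(1.0, ((cx - ax) * dx + (cy - ay) * dy) / seg_len_sq))
    px, py = ax + t * dx, ay + t * dy
    return math.hypot(cx - px, cy - py) <= radius


def path_cost(waypoints, sheep, r_th, alpha1=1.0, alpha2=1.0):
    """alpha1 * path length + alpha2 * number of segments entering a threat circle."""
    length = 0.0
    threat = 0
    for a, b in zip(waypoints, waypoints[1:]):
        length += math.dist(a, b)                      # Equation (18)
        if any(_segment_hits_circle(a, b, s, r_th) for s in sheep):
            threat += 1                                # Equation (19)
    return alpha1 * length + alpha2 * threat           # Equation (17)
```

For the path `[(0, 0), (4, 0), (4, 3)]` with one sheep at `(2, 0.5)` and a threat range of 1, the length is 7 and only the first segment enters the threat area, so the cost is 8.0 with both weights equal to 1 and 7.0 when `alpha2` is 0.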

4.3.2 Path planning for shepherding

To integrate path planning into context-sensitive shepherding, we design two modes of sheepdog operation, a no-interaction mode and an interaction mode, based on whether the influence of sheepdogs on the sheep swarm, $C\_Th$ , should be minimised or not.

No-interaction mode: The no-interaction mode is usually triggered when the sheepdog departs from its initial location $P_{\beta_{j}}^{0}$ for the driving point of the sub-swarm $\phi_{1}^{\prime}$ , or when the sheepdog has just finished the sub-task of pushing $\phi_{q-1}^{\prime}$ and switches to the next sub-task by moving towards the driving point of the next sub-swarm $\phi_{q}^{\prime}$ . During the no-interaction mode, the influence of sheepdogs on the sheep swarm should be minimised to avoid unwanted movements of the sheep swarm. Sheep are therefore treated as obstacles to be avoided, so that the flocks are not disturbed while the sheepdog positions itself at a driving point. The path planning algorithm, A*-PP (presented in Section 5.2), is used to find the optimal path $Path(P_{\beta_{j}}^{t},P_{D_{q}}^{t})$ , which is obstacle-free and has the lowest cost, for the sheepdog $\beta_{j}$ to follow from its current location $P_{\beta_{j}}^{t}$ to the driving point $P_{D_{q}}^{t}$ of the target sub-swarm. The path cost $C\_{Pa}(P_{\beta_{j}}^{t},P_{D_{q}}^{t})$ is evaluated according to Equation (17) by setting $\alpha_{1},\alpha_{2}$ as positive numbers.

Interaction mode: The interaction mode is activated when the sheepdog is driving the sheep sub-swarm $\phi_{q}^{\prime}$ from $LCM_{q}^{{}^{\prime}t}$ towards a sub-goal $P_{SG}^{t}$ . During this phase, the sheepdog continues to influence the sheep sub-swarm by switching between collecting and driving behaviours as it evaluates the furthest distance of the sheep to $LCM_{q}^{{}^{\prime}t}$ . The path $Path(P_{\beta_{j}}^{t},P_{D_{q}}^{t}/P_{C_{q}}^{t})$ from the sheepdog’s current position $P_{\beta_{j}}^{t}$ to the driving/collecting point $P_{D_{q}}^{t}$ / $P_{C_{q}}^{t}$ of the target sub-swarm is also optimised using A*-PP. Unlike in the no-interaction mode, only the path length $C\_L$ is considered for the path cost evaluation in the interaction mode. Therefore, $\alpha_{2}$ in Equation (17) should be set to 0.

The driving point $P_{D_{q}}^{t}$ is calculated as follows:

P_{D_{q}}^{t}=LCM_{q}^{{}^{\prime}t}+(R_{n}+R_{s})\frac{LCM_{q}^{{}^{\prime}t}-P_{SG}^{t}}{||LCM_{q}^{{}^{\prime}t}-P_{SG}^{t}||}

(20)

where $LCM_{q}^{{}^{\prime}t}$ is the LCM of the sheep sub-swarm $\phi_{q}^{\prime}$ that the sheepdog will be driving or is currently driving; $P_{SG}^{t}$ is the sub-goal that the sheepdog is driving $\phi_{q}^{\prime}$ towards. $P_{SG}^{t}$ is set as the waypoint in the optimised path $Path(LCM_{q}^{{}^{\prime}t},LCM_{q+1}^{{}^{\prime}t})$ of $\phi_{q}^{\prime}$ from $LCM_{q}^{{}^{\prime}t}$ to $LCM_{q+1}^{{}^{\prime}t}$ , which is obtained by A*-PP as well. In this way, the sheepdog will push $\phi_{q}^{\prime}$ to move towards the optimal path so as to reduce the detours of both sheepdogs and sheep sub-swarms.
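The two modes differ only in the cost weights passed to the planner and in the target point, which can be sketched as follows. The constant and function names are hypothetical, the default weight values are placeholders, and the driving point is placed behind the LCM as seen from the sub-goal, which is an assumed sign convention consistent with the description of the driving point in Section 4.2.1.

```python
import math

NO_INTERACTION, INTERACTION = 0, 1


def cost_weights(mode, alpha1=1.0, alpha2=1.0):
    """Weights (alpha1, alpha2) for Equation (17) in the given mode.

    In interaction mode the threat term is dropped (alpha2 = 0); in
    no-interaction mode both weights stay positive.
    """
    if mode == INTERACTION:
        return alpha1, 0.0
    return alpha1, alpha2


def sub_swarm_driving_point(lcm, sub_goal, r_n, r_s):
    """Point R_n + R_s behind the sub-swarm LCM, on the side away from the sub-goal.

    Assumes the LCM does not coincide with the sub-goal.
    """
    d = math.dist(lcm, sub_goal)
    ux, uy = (lcm[0] - sub_goal[0]) / d, (lcm[1] - sub_goal[1]) / d
    return (lcm[0] + (r_n + r_s) * ux, lcm[1] + (r_n + r_s) * uy)
```

For a sub-swarm centred at the origin with the sub-goal at `(10, 0)`, `r_n = 3` and `r_s = 1`, the driving point is `(-4.0, 0.0)`, so the sheepdog approaches from the far side and pushes the sub-swarm towards the sub-goal.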

4.4 Planning-assisted shepherding framework

Based on the discussion above, the overall planning-assisted shepherding framework is presented in Algorithm 2, where $Mode=0$ represents the no-interaction mode while $Mode=1$ represents the interaction mode. Before the real-time shepherding, the offline planner obtains the grouping and sequencing results (lines 3 and 4). Then the sheepdog starts in the no-interaction mode to reach the driving point of $\phi_{1}^{\prime}$ .

Algorithm 2 Planning-assisted shepherding model

1: Input: $\Pi=\{\pi_{1},...,\pi_{i},...,\pi_{N}\}$ , $\beta$ , $P_{G}$
2: Initialise: $q=1,t=0,Mode=0$ , $T$
3: Get $\Phi=\{\phi_{1},...,\phi_{q},...,\phi_{Q}\}$ via Algorithm 3
4: Get sequenced sub-swarms $\Phi^{\prime}=\{\phi_{0}^{\prime},...,\phi_{q}^{\prime},...,\phi_{Q}^{\prime}\}$ and corresponding LCMs $\{LCM_{0}^{\prime},...,LCM_{q}^{\prime},...,LCM_{Q+1}^{\prime}\}$ using MMAS
5: while $t<T$ && sheep have not all reached $LCM_{Q+1}^{\prime}$ do
6:   $t=t+1$
7:   if $\phi_{q}^{\prime}$ encounters $\phi_{q+1}^{\prime}$ then
8:     $\phi_{q+1}^{\prime}=\phi_{q+1}^{\prime}\cup\phi_{q}^{\prime}$ ;
9:     $q=q+1$ ; $Mode=0$ ;
10:   end if
11:   Update LCMs;
12:   if $Mode=0$ || $mod(t,10)=1$ then
13:     Calculate the optimal path of $\phi_{q}^{\prime}$ : $Path_{\phi^{\prime}_{q}}=\{LCM_{q}^{\prime},W_{1},W_{2},...,W_{D},LCM_{q+1}^{\prime}\}$
14:     Let $W_{1}$ be the sub-goal $P_{SG}^{t}$
15:   else
16:     Locate the nearest waypoint on $Path_{\phi^{\prime}_{q}}$ ahead of $\beta$ as the sub-goal $P_{SG}^{t}$
17:   end if
18:   if $Mode=0$ then
19:     Calculate $P_{D_{q}}^{t}$ based on Equation (20)
20:     Optimise $Path_{\beta_{j}}(P_{\beta_{j}}^{t},P_{D_{q}}^{t})$ as per no-interaction mode
21:   else
22:     Calculate $P_{D_{q}}^{t}$ / $P_{C_{q}}^{t}$ based on Algorithm 1: $P_{D_{q}}^{t}/P_{C_{q}}^{t}$ = Herding( $P_{SG}^{t}$ , $LCM_{q}^{{}^{\prime}t}$ , $\phi_{q}^{\prime}$ )
23:     Optimise $Path_{\beta_{j}}(P_{\beta_{j}}^{t},P_{D_{q}}^{t}/P_{C_{q}}^{t})$ as per interaction mode
24:   end if
25:   Sheepdog moves following $Path_{\beta_{j}}$ with the limitation of maximum speed
26:   Update the sheep position according to Equation (7)
27:   if $Mode=0$ && $\beta$ reaches $P_{D_{q}}^{t}$ then
28:     $Mode=1$
29:   end if
30: end while
31: Output: $t$

During the shepherding process, the sheepdog switches between the no-interaction and the interaction modes based on context in order to complete the set of sequenced sub-tasks, i.e., pushing the sheep sub-swarms $\phi_{q}^{\prime}$ to $LCM_{q+1}^{\prime}$ , $\forall\leavevmode\nobreak\ q\in\{1,...,Q\}$ . An adaptive switching approach is designed based on the real-time evaluation of the shepherding progress and is integrated into the shepherding framework. As presented in lines 7-10 of Algorithm 2, the switch from the interaction mode to the no-interaction mode happens when $\phi_{q}^{\prime}$ encounters $\phi_{q+1}^{\prime}$ , indicating that the sheepdog has just successfully pushed $\phi_{q}^{\prime}$ to $LCM_{q+1}^{\prime}$ to merge with $\phi_{q+1}^{\prime}$ and is preparing for the next sub-task by moving to the driving point of $\phi_{q+1}^{\prime}$ . Once the driving point of $\phi_{q+1}^{\prime}$ is reached in the no-interaction mode, the sheepdog switches to the interaction mode (lines 27-29), which means that the sheepdog starts pushing the sub-swarm $\phi_{q+1}^{\prime}$ . This process continues until all the sheep reach the goal area $LCM_{Q+1}^{\prime}$ or the pre-defined maximum number of time steps $T$ is reached. Figure 3 shows the corresponding flowchart.
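The switching rule described above can be condensed into a small state-transition function. This is a hedged sketch rather than the authors' implementation: the names `next_mode`, `merged_with_next` and `reached_driving_point` are hypothetical, and the two checks that Algorithm 2 performs at different points of a time step are collapsed into a single call in which the merge check takes precedence.

```python
NO_INTERACTION, INTERACTION = 0, 1


def next_mode(mode, merged_with_next, reached_driving_point):
    """Return the sheepdog mode after one time step.

    merged_with_next: the current sub-swarm has encountered the next one.
    reached_driving_point: the sheepdog has arrived at the driving point.
    """
    if merged_with_next:
        # Sub-task finished: reposition for the merged sub-swarm.
        return NO_INTERACTION
    if mode == NO_INTERACTION and reached_driving_point:
        # Positioned behind the target sub-swarm: start pushing.
        return INTERACTION
    return mode
```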

5 Hierarchical mission planning for shepherding

To assist the proposed swarm shepherding framework, a hierarchical mission planning system is proposed in this section by combining the approach for grouping, MMAS for TSP, and A*-PP for online path planning.

5.1 Grouping and MMAS: offline task planner

Given the sheep swarm $\Pi$ , a cohesion range-based grouping method, presented in Algorithm 3, is used to obtain the set of sheep sub-swarms and calculate the LCMs. Then MMAS [21], a well-known ACO algorithm, is introduced to solve the shepherding-transformed TSP and obtain the optimal push sequence of sub-swarms, owing to its outstanding performance on TSP. ACO is inspired by the foraging behaviour of real ant colonies. During the foraging process, ants deposit pheromone trails on their return routes if they find food sources. These trails enable other ants to find optimal paths to the food sources.

Algorithm 3 Grouping of dispersed sheep

1: Input: a sheep swarm $\Pi=\{\pi_{1},...,\pi_{i},...,\pi_{N}\}$
2: Initialise: $Q=0$ , $\phi_{q}=Null$
3: Calculate the distance $D_{\pi_{1}\pi}$ between $\pi_{1}$ and all other sheep, and sort $D_{\pi_{1}\pi}$ in ascending order to get the sorted index $Idx$ of sheep
4: for i=1:N do
5:   $k=Idx(i)$
6:   Find the neighborhood sheep $\Lambda_{\pi_{k}}$ of $\pi_{k}$ within $R_{\Lambda}$
7:   if $\Lambda_{\pi_{k}}\neq\emptyset$ & any of $\Lambda_{\pi_{k}}$ or $\pi_{k}$ belongs to an existing sub-swarm $\phi_{q}$ then
8:     $\phi_{q}=\phi_{q}\cup\pi_{k}\cup\Lambda_{\pi_{k}}^{t}$
9:   else
10:     $Q=Q+1$ ; $\phi_{Q}=\pi_{k}\cup\Lambda_{\pi_{k}}$
11:   end if
12: end for
13: Calculate the LCM of each $\phi_{q}$ as Equation (9)
14: Output: the set of sheep sub-swarms $\Phi=\{\phi_{1},...,\phi_{q},...,\phi_{Q}\}$ and the corresponding LCMs $\{LCM_{1}^{t},...,LCM_{q}^{t},...,LCM_{Q}^{t}\}$
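The grouping procedure of Algorithm 3 can be sketched in a few lines of Python. The sketch below makes several assumptions that are not stated in the pseudo-code: positions are 2-D tuples, a sheep that already belongs to a sub-swarm is never reassigned, the first existing sub-swarm found among the neighbours is the one that is extended, and the LCM of Equation (9) is taken to be the mean position of the members. The name `group_sheep` is hypothetical.

```python
import math


def group_sheep(positions, r_lambda):
    """Group sheep by cohesion range; returns (groups of indices, LCMs)."""
    n = len(positions)
    # Visit sheep in ascending order of distance to the first sheep.
    order = sorted(range(n), key=lambda i: math.dist(positions[0], positions[i]))
    label = [None] * n
    groups = []
    for k in order:
        # Sheep k together with its neighbours within the cohesion range.
        members = [i for i in range(n)
                   if math.dist(positions[k], positions[i]) <= r_lambda]
        existing = [label[i] for i in members if label[i] is not None]
        if existing:
            q = existing[0]          # assumption: extend the first group found
        else:
            q = len(groups)          # otherwise open a new sub-swarm
            groups.append([])
        for i in members:
            if label[i] is None:     # assumption: never reassign a sheep
                label[i] = q
                groups[q].append(i)
    lcms = [(sum(positions[i][0] for i in g) / len(g),
             sum(positions[i][1] for i in g) / len(g)) for g in groups]
    return groups, lcms
```

For five sheep at `[(0, 0), (1, 0), (2, 0), (10, 10), (11, 10)]` and `r_lambda=1.5`, the sketch yields two sub-swarms, one per visible cluster.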

The core components of MMAS are solution construction and pheromone update. To find the optimal visiting sequence of sheep sub-swarms, a solution in MMAS is constructed by selecting sub-goals (each regarded as a solution component $c_{q}^{q^{\prime}}$ ) one by one, based on the pheromone values $\tau_{qq^{\prime}}$ and the heuristic values $\eta(c_{q}^{q^{\prime}})$ of the edges, until the complete solution, which gives the visiting sequence, is formed. Pheromone values $\tau_{qq^{\prime}}$ are updated every generation based on the quality of the constructed solutions and the evaporation of existing pheromones. The values of $\eta(c_{q}^{q^{\prime}})$ are defined as $\eta(c_{q}^{q^{\prime}})=1/C_{qq^{\prime}}$ where $C_{qq^{\prime}}$ is the travelling cost from $LCM_{q}^{t}$ to $LCM_{q^{\prime}}^{t}$ . In MMAS, only the best ant is used to update the pheromone, and the pheromone values are bounded within the predefined range $[\tau_{min},\tau_{max}]$ . The implementation of MMAS for finding the optimal push sequence of sub-swarms is described as follows:

Step 1: Initialise the parameters of MMAS, including the ant colony size $N_{s}$ , two parameters $\alpha$ and $\gamma$ , evaporation rate $\rho$ , pheromone values of each edge $\tau_{qq^{\prime}}=\tau_{max}$ , the values of each edge $\eta(c_{q}^{q^{\prime}})=1/C_{qq^{\prime}}$ , the best solution $s^{*}=$ NULL;

Step 2: Construct new ant solutions:

Step 2.1: Start with a partial solution $s_{p}=LCM_{0}^{t}$ ;

Step 2.2: For each $LCM_{q^{\prime}}^{t},q^{\prime}\in\{1,2,...,Q\}$ , calculate the probability $p(c_{q}^{q^{\prime}}|s_{p})$ of moving from the current location $LCM_{q}^{t}$ to $LCM_{q^{\prime}}^{t}$ using Equation (21);

p(c_{q}^{q^{\prime}}|s_{p})=\frac{\tau^{\alpha}_{qq^{\prime}}\cdot[\eta(c_{q}^{q^{\prime}})]^{\gamma}}{\sum_{c_{q}^{l}\in N(s_{p})}\tau^{\alpha}_{ql}\cdot[\eta(c_{q}^{l})]^{\gamma}},\ \forall c_{q}^{q^{\prime}}\in N(s_{p})

(21)

where $N(s_{p})$ is a set of available solution components for the current partial solution $s_{p}$ ;

Step 2.3: Select the next solution component based on the probability $p(c_{q}^{q^{\prime}}|s_{p})$ and add the selected $LCM_{q^{\prime}}^{t}$ to the current partial solution $s_{p}$ ;

Step 2.4: If all $LCM_{q^{\prime}}^{t},q^{\prime}\in\{1,2,...,Q\}$ are visited, add $LCM_{Q+1}^{t}$ to $s_{p}$ and go to Step 2.5; otherwise, go to Step 2.2;

Step 2.5: If $N_{s}$ new solutions are constructed, go to Step 3; otherwise, go to Step 2.1;

Step 3: Calculate the cost $F(s)$ and record the best solution $s^{*}$ found with the lowest cost;

Step 4: Update the pheromone values according to:

\tau_{qq^{\prime}}=(1-\rho)\tau_{qq^{\prime}}+\Delta\tau_{qq^{\prime}}^{best}

(22)

\tau_{qq^{\prime}}=\min(\max(\tau_{qq^{\prime}},\tau_{min}),\tau_{max})

(23)

where $\Delta\tau_{qq^{\prime}}^{best}=1/F(s^{*})$ ;

Step 5: If the termination condition is met, output the best solution which represents the optimal travelling sequence; otherwise, go to Step 2.
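Steps 1-5 can be sketched compactly in Python. The code below is a minimal illustration under stated assumptions, not the authors' implementation: `cost` is a hypothetical square matrix of positive travelling costs in which index 0 is the start $LCM_{0}^{t}$ , indices $1..Q$ are the sub-swarm LCMs and the last index is the goal $LCM_{Q+1}^{t}$ ; the deposit of Equation (22) is applied only to edges of the best-so-far solution; the termination condition is a fixed iteration count; and the default parameter values (`rho`, `tau_min`, `n_iter`) are illustrative rather than the settings used in the experiments.

```python
import math
import random


def route_cost(cost, seq):
    """Cost F(s) of the route start -> seq -> goal."""
    route = [0] + list(seq) + [len(cost) - 1]
    return sum(cost[a][b] for a, b in zip(route, route[1:]))


def mmas_sequence(cost, n_ants=None, n_iter=100, alpha=1.0, gamma=2.0,
                  rho=0.02, tau_min=0.01, tau_max=1.0, seed=0):
    rng = random.Random(seed)
    n = len(cost)
    q_count = n - 2
    n_ants = n_ants or q_count
    tau = [[tau_max] * n for _ in range(n)]           # Step 1
    best_seq, best_cost = None, math.inf
    for _ in range(n_iter):
        for _ in range(n_ants):                       # Step 2
            current, unvisited, seq = 0, list(range(1, q_count + 1)), []
            while unvisited:
                # Unnormalised Equation (21); rng.choices normalises the weights.
                weights = [tau[current][m] ** alpha
                           * (1.0 / cost[current][m]) ** gamma
                           for m in unvisited]
                nxt = rng.choices(unvisited, weights=weights)[0]
                seq.append(nxt)
                unvisited.remove(nxt)
                current = nxt
            c = route_cost(cost, seq)                 # Step 3
            if c < best_cost:
                best_seq, best_cost = seq, c
        route = [0] + best_seq + [n - 1]
        best_edges = set(zip(route, route[1:]))
        for a in range(n):                            # Step 4
            for b in range(n):
                deposit = 1.0 / best_cost if (a, b) in best_edges else 0.0
                updated = (1 - rho) * tau[a][b] + deposit      # Equation (22)
                tau[a][b] = min(max(updated, tau_min), tau_max)  # Equation (23)
    return best_seq, best_cost                        # Step 5
```

Calling `mmas_sequence(cost)` returns the visiting order of the sub-swarm indices together with its route cost; fixing `seed` makes a run repeatable.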

5.2 A*-PP: online path planner

As discussed in Sections 4.3 and 4.4, path planning is crucial for reducing detours of both sheepdogs and sheep swarm, and is invoked during the online shepherding process. This section presents a two-layer path planning algorithm A*-PP, where the first layer, A*, finds the path with optimal cost, and the second layer, post-processing, eliminates the redundant waypoints in the path.

A* [44] is a well-known node-based path search algorithm that searches in a landscape represented by graphs. A* starts from the specified start node of the search graph and expands the nodes on candidate paths by adding one node at a time until it reaches the goal node in the graph. To decide which node on the candidate paths to extend next, A* employs an evaluation function $f(n)$ , calculated as in Equation (24), to estimate the cost of the path going through node $n$ .

f(n)=g(n)+h(n)

(24)

where $g(n)$ is the cost of the optimal path from the start node to the current node $n$ and $h(n)$ is the heuristic function for estimating the cost from the current node $n$ to the goal node. In this paper, $g(n)$ is calculated as follows:

g(n)=C_{L}(n)+C_{Th}(n),

(25)

where $C_{L}(n)$ and $C_{Th}(n)$ are the length cost and the threat cost, respectively. $h(n)$ is calculated as the straight-line distance from the current node $n$ to the goal node, which is an admissible heuristic and therefore guarantees that A* returns the optimal path.

The pseudo-code of A* is given in Algorithm 4. $Open$ is the set of nodes that can be considered for expansion. $Closed$ is the set of nodes that have been expanded, which ensures that each node is expanded at most once. $c(n,m)$ is the cost from node $n$ to node $m$ . $Parent(m)$ records the predecessor of $m$ on the lowest-cost path found so far. A* starts the search from the initial point $W_{init}$ . At each iteration of the main loop, A* selects the node $n$ with the lowest $f(n)$ from $Open$ and moves it from $Open$ to $Closed$ . Then A* checks the neighbours of $n$ and inserts each feasible neighbour node $m$ into $Open$ if $m$ is not in $Open$ , or updates $f(m)$ if $m$ is already in $Open$ and $g(m)+h(m)$ is better than the old $f(m)$ . The loop continues until the node with the lowest $f(n)$ is the goal point $W_{goal}$ or $Open$ is empty, meaning no feasible path exists.

Algorithm 4 The pseudo code of A*

1: $g(W_{init})=0;\ Open=\emptyset;\ Closed=\emptyset;\ Parent(W_{init})=W_{init}$
2: Insert $W_{init}$ into $Open$ with $g(W_{init})+h(W_{init})$ ;
3: while $Open\neq\emptyset$ do
4:   Select the node $n$ from $Open$ with the lowest value of $f(n)$ ;
5:   if $n=W_{goal}$ then
6:     Extract $Path$ from $Parent$
7:     Return $Path$ ;
8:   end if
9:   Remove the node $n$ from $Open$ ;
10:   Add the node $n$ to $Closed$ ;
11:   for each neighbour $m$ of $n$ do
12:     if $m\notin Closed$ then
13:       if $m\notin Open$ then
14:         $g(m)=\infty$ ;
15:         $Parent(m)=NULL$ ;
16:       end if
17:       if $g(n)+c(n,m)<g(m)$ then
18:         $g(m)=g(n)+c(n,m)$ ;
19:         $Parent(m)=n$ ;
20:         Insert $m$ into $Open$ or update $f(m)$ in $Open$ with $g(m)+h(m)$ ;
21:       end if
22:     end if
23:   end for
24: end while
25: Output: $Path$
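A runnable counterpart of Algorithm 4 on an 8-connected occupancy grid is sketched below. Several choices are assumptions made for illustration only: the grid is a list of rows in which a truthy cell is an obstacle, the optional `threat` callable stands in for the threat cost $C_{Th}$ of Equation (25), a binary heap with lazy deletion replaces the explicit "update $f(m)$ in $Open$" step, and diagonal moves between two obstacle cells are not forbidden.

```python
import heapq
import math


def a_star(grid, start, goal, threat=None):
    """Return the lowest-cost path from start to goal as a list of cells, or None."""
    rows, cols = len(grid), len(grid[0])
    g = {start: 0.0}
    parent = {start: start}
    open_heap = [(math.dist(start, goal), start)]     # entries are (f, node)
    closed = set()
    while open_heap:
        _, n = heapq.heappop(open_heap)
        if n in closed:
            continue                                  # stale heap entry
        if n == goal:
            path = [n]
            while parent[n] != n:                     # walk back through Parent
                n = parent[n]
                path.append(n)
            return path[::-1]
        closed.add(n)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                m = (n[0] + dr, n[1] + dc)
                if m == n or not (0 <= m[0] < rows and 0 <= m[1] < cols):
                    continue
                if grid[m[0]][m[1]] or m in closed:
                    continue
                # c(n, m): step length plus the (optional) threat cost at m.
                step = math.hypot(dr, dc) + (threat(m) if threat else 0.0)
                if g[n] + step < g.get(m, math.inf):
                    g[m] = g[n] + step
                    parent[m] = n
                    heapq.heappush(open_heap, (g[m] + math.dist(m, goal), m))
    return None                                       # Open is empty: no path
```

Because the threat cost is non-negative, the straight-line heuristic never overestimates the remaining cost in this sketch, so the returned path is optimal for the chosen grid resolution.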

However, the original path obtained by A* usually contains many waypoints, and the sub-path between two waypoints may take an unnecessary detour when a straight line could connect them without colliding with any obstacle. Therefore, a path post-processing method, line-of-sight path pruning [45], is introduced to remove redundant waypoints on the path and further reduce the path cost. The pseudo-code of the path post-processing is presented in Algorithm 5. The core of this process is to replace the original sub-path between two waypoints with a straight line if the straight line does not collide with any obstacles, meaning that the waypoints on the original sub-path, except the start point and end point, are removed.

Algorithm 5 The pseudo code of path post-processing

1: Input: $Path$ ; $N_{nodes}=size(Path,1)$
2: while $i<=N_{nodes}$ do
3:   for $j=2:N_{nodes}-1$ do
4:     Check the collision between the $i^{th}$ node and the $(i+j)^{th}$ node
5:     if No collision exists then
6:       if $j>=N_{nodes}-1$ then
7:         Add the last node on $Path$ to $Processed\_path$ and Break
8:       else
9:         Continue
10:       end if
11:     end if
12:     Add the $(i+j-1)^{th}$ node to $Processed\_path$ and Break
13:   end for
14:   $i=i+j-1$
15: end while
16: Add the last node on $Path$ to $Processed\_path$ if it is not already included
17: Output: $Processed\_path$
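The forward scan of Algorithm 5 can be sketched as follows. This is an illustrative variant, not the authors' code: the collision check `line_of_sight` samples points along the segment on an occupancy grid (truthy cell = obstacle) and its sampling density is an arbitrary choice, consecutive waypoints of the input path are assumed to be mutually visible, and both function names are hypothetical.

```python
import math


def line_of_sight(grid, a, b, samples_per_cell=4):
    """True if the sampled segment from cell a to cell b hits no obstacle."""
    steps = max(1, int(math.dist(a, b) * samples_per_cell))
    for s in range(steps + 1):
        t = s / steps
        r = round(a[0] + t * (b[0] - a[0]))
        c = round(a[1] + t * (b[1] - a[1]))
        if grid[r][c]:
            return False
    return True


def prune_path(grid, path):
    """Drop waypoints that a collision-free straight line can bypass."""
    if len(path) < 3:
        return list(path)
    pruned = [path[0]]
    i = 0
    while i < len(path) - 1:
        j = i + 1
        # Advance while the next waypoint is still visible from waypoint i.
        while j + 1 < len(path) and line_of_sight(grid, path[i], path[j + 1]):
            j += 1
        pruned.append(path[j])   # last waypoint reachable in a straight line
        i = j
    return pruned
```

On an obstacle-free grid the pruned path collapses to its two end points, which is the limiting case of the straight-line replacement described above.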

6 Numerical Experiments

6.1 Experimental setting

To evaluate the planning-assisted shepherding model and the hierarchical mission planning algorithm, experiments are conducted on a set of synthetic shepherding problems with different levels of complexity. Table 1 presents the details of the 20 benchmark problems, showing the environment size (mostly $100\times 100$ ), the number of sheep $N$ (20, 50, 100) and whether each case contains obstacles. The benchmark set consists of three groups: the obstacle-free group, the obstacle-contained group with small swarms and the obstacle-contained group with large swarms. The cases in each group have an increasing level of complexity. Fig. 4 shows the visualised initialisation of each case with red dots representing the sheep, red asterisks representing sheepdogs, black areas denoting obstacles and a blue circle representing the goal area. Cases 1-6 are obstacle-free environments with an increasing level of complexity obtained by varying the environment size, $N$ , the goal location and the swarm initialisation. Cases 7-20 are obstacle-contained environments where the density of obstacles further impacts the problem’s complexity. The initialisation in cases 11 and 18 is based on randomly distributed sheep individuals, while the initialisation in other cases is based on randomly distributed sub-swarms.

For each problem instance, the experiments are conducted 20 times to capture the statistical behaviour. The number of the maximum time steps for each run is set to $T=300+20*N$ . Three metrics are recorded to evaluate shepherding performance, including 1) SR: the success rate, i.e., number of times the shepherding mission was completed out of 20 runs; 2) No. of steps: the number of time steps consumed to complete the shepherding mission, and 3) path length: the total moving distance of the sheepdog. The mean and standard deviation of only the successful runs are presented for the no. of steps and path length.
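The time budget and the three metrics can be computed as in the following sketch. The record layout (`success`, `steps`, `path_length`) and the function names are hypothetical, and the use of the population standard deviation is an assumption, chosen because it is defined even when only one run succeeds.

```python
from statistics import mean, pstdev


def max_time_steps(n_sheep):
    """Maximum number of time steps per run, T = 300 + 20 * N."""
    return 300 + 20 * n_sheep


def summarise(runs):
    """Success rate plus (mean, std) of steps and path length over successful runs."""
    ok = [r for r in runs if r["success"]]
    summary = {"SR": len(ok) / len(runs)}
    for key in ("steps", "path_length"):
        values = [r[key] for r in ok]
        # Failed runs are excluded; None marks a case with no successful run.
        summary[key] = (mean(values), pstdev(values)) if values else None
    return summary
```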

Table 1: Basic features of the benchmark problems

Group	Group 1: Obstacle-free cases						Group 2: Obstacle-contained cases with small sheep swarms
Case	1	2	3	4	5	6	7	8	9	10
Environment size	50*50	100*100	100*100	100*100	100*100	100*100	50*50	100*100	100*100	100*100
Number of sheep $N$	20	20	50	50	100	100	20	20	20	50
Obstacles	N	N	N	N	N	N	Y	Y	Y	Y
Group	Group 2 (continued)			Group 3: Obstacle-contained cases with large sheep swarms
Case	11	12	13	14	15	16	17	18	19	20
Environment size	100*100	100*100	100*100	100*100	100*100	100*100	100*100	100*100	100*100	100*100
Number of sheep $N$	50	50	50	100	100	100	100	100	100	100
Obstacles	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y

Table 2: Parameters setting of the shepherding model

$W_{\pi_{v}}$	$W_{\pi\Lambda}$	$W_{\pi\beta}$	$W_{\pi\pi}$	$W_{\pi o}$	$W_{e\pi_{i}}$	$W_{e\beta_{j}}$	$R_{\Lambda}$	$R_{\pi\beta}$	$R_{\pi\pi}$	$R_{\pi o}$	$R_{s}$
0.5	1.05	1	2	3	0.3	0.3	4	8	0.4	2	4

6.2 Parameter values and the effects of adaptive switching

The parameters in MMAS are set according to [21]. To be specific, the maximum number of iterations is set to 600; the population size is set to the problem dimension; $\alpha=1$ , $\gamma=2$ , $\rho=0.98$ , $\tau_{min}=1/D$ , $\tau_{max}=1$ . The number of neighbours in the A* search algorithm is set to 8, and the scaling factor is 1. Table 2 presents the setting of most parameters involved in the shepherding model by referring to [7]. The newly introduced parameter $R_{th}$ and the cost weights $\alpha_{1},\ \alpha_{2}$ are analysed in the following.

$R_{th}$ determines the threat area size that the sheepdog should try to avoid in the no-interaction mode. It should be no less than $R_{s}=4$ to keep the safe operating distance from the sheep and no more than the sheepdog’s influence range $R_{\pi\beta}=8$ . Therefore, we tested the effects of $R_{th}$ on the shepherding performance of 6 representative cases by setting $R_{th}$ to 4, 5, 6, 7 and 8. Table 3 presents the experimental results. Unless stated otherwise, the best results are shown in bold in all the following tables, and Wilcoxon rank-sum tests are conducted between the best results and the other results for each case to test whether their performances are statistically different, with a significance level of 0.05. ‘*’ indicates a significant difference.

Table 3: The effects of $R_{th}$ on shepherding performance

Case	SR	No. of steps	Path length
$R_{th}=4$
Case1	1.00	131.00 $\pm$ 12.14	214.70 $\pm$ 20.69
Case3	1.00	396.05 $\pm$ 18.85	510.17 $\pm$ 38.77
Case7	1.00	232.50 $\pm$ 47.04	409.36 $\pm$ 94.24
Case11	0.95	996.21 $\pm$ 89.63	1401.94 $\pm$ 166.63
Case16	1.00	1191.75 $\pm$ 137.19	955.65 $\pm$ 143.42
Case18	0.60	2071.17 $\pm$ 156.11	1858.70 $\pm$ 235.51
$R_{th}=5$
Case1	1.00	133.85 $\pm$ 13.57	220.77 $\pm$ 24.12
Case3	1.00	432.25 $\pm$ 33.11*	567.76 $\pm$ 53.05*
Case7	0.95	249.05 $\pm$ 70.08	439.90 $\pm$ 130.25
Case11	0.80	991.00 $\pm$ 113.56	1391.99 $\pm$ 238.58
Case16	1.00	1304.50 $\pm$ 136.34*	1152.18 $\pm$ 257.64*
Case18	0.25	2088.00 $\pm$ 108.46	1892.80 $\pm$ 293.40
$R_{th}=6$
Case1	1.00	146.05 $\pm$ 15.13*	239.70 $\pm$ 29.18*
Case3	1.00	433.80 $\pm$ 25.89*	577.58 $\pm$ 46.55*
Case7	1.00	258.85 $\pm$ 110.51	451.99 $\pm$ 212.87
Case11	0.90	1091.89 $\pm$ 78.26	1560.16 $\pm$ 194.04
Case16	1.00	1345.30 $\pm$ 155.60*	1156.08 $\pm$ 250.54*
Case18	0.35	2038.29 $\pm$ 172.45	1875.59 $\pm$ 215.09
$R_{th}=7$
Case1	1.00	156.70 $\pm$ 14.75*	262.80 $\pm$ 30.15*
Case3	1.00	449.55 $\pm$ 25.79*	595.32 $\pm$ 49.50*
Case7	1.00	254.65 $\pm$ 74.52	442.00 $\pm$ 140.60
Case11	0.85	1023.06 $\pm$ 132.27	1442.46 $\pm$ 245.77
Case16	0.95	1337.58 $\pm$ 167.01*	1127.46 $\pm$ 242.16*
Case18	0.25	2052.00 $\pm$ 129.39	1806.39 $\pm$ 71.94
$R_{th}=8$
Case1	1.00	156.50 $\pm$ 16.94*	257.28 $\pm$ 28.99*
Case3	1.00	453.10 $\pm$ 45.54*	605.75 $\pm$ 110.63*
Case7	1.00	274.60 $\pm$ 123.49	485.03 $\pm$ 232.67
Case11	0.95	1109.37 $\pm$ 104.44	1605.64 $\pm$ 173.35
Case16	1.00	1326.00 $\pm$ 131.83*	1146.23 $\pm$ 247.51*
Case18	0.40	1999.12 $\pm$ 225.17	1717.57 $\pm$ 234.66
* represents the statistical significance

We can observe from Table 3 that $R_{th}=4$ performs better than the other values, achieving the highest SR on all 6 cases and the minimum number of time steps and path length on 4 cases. It should also be noted that the differences in the results obtained with different $R_{th}$ values are not always significant. This is due to the high randomness of sheep behaviours, which are impacted by many factors such as the obstacles and the neighbourhood. However, only $R_{th}=4$ is not significantly worse than the other values in any of these cases. This is probably because, although $R_{th}=4$ cannot minimise the influence of sheepdogs on some sheep on the edge of the swarm, it can avoid most of the poor disturbance behaviours, e.g., crossing the swarm. On the contrary, a large $R_{th}$ might cause unnecessary detours for sheepdogs. Therefore, $R_{th}$ is set to 4 in the following experiments.

The parameters $\alpha_{1}$ and $\alpha_{2}$ determine the weights of the path length cost and the threat cost. We examined the effects of the ratio of $\alpha_{2}$ to $\alpha_{1}$ on the shepherding performance of some representative cases by fixing $\alpha_{1}$ to 1 and setting $\alpha_{2}$ to 0, 20, 40, 60, 80 and 100. In particular, $\alpha_{2}=0$ means that the no-interaction mode turns into the interaction mode, and the adaptive switch between these two modes is disabled. Table 4 shows that, similar to the effects of $R_{th}$ , the influence of different $\alpha_{2}$ values is not significant in some cases. In particular, the shepherding performance is not very sensitive to the change of $\alpha_{2}$ if it is non-zero. This is because changing a non-zero $\alpha_{2}$ value only slightly impacts the path planning results in the no-interaction mode, which does not cause a significant difference in the shepherding. However, when $\alpha_{2}=0$ , which disables the adaptive switch, the shepherding performance of more cases is impacted. Therefore, $\alpha_{1}=1,\ \alpha_{2}=100$ are set in the following experiments.

Table 4: The effects of $\alpha_{2}$ on shepherding performance ( $\alpha_{1}=1$ )

	$\alpha_{2}=0$			$\alpha_{2}=20$			$\alpha_{2}=40$
Case	SR	No. of steps	Path length	SR	No. of steps	Path length	SR	No. of steps	Path length
C1	1.00	140.95 $\pm$ 14.53*	232.33 $\pm$ 23.90*	1.00	131.45 $\pm$ 14.80	218.62 $\pm$ 26.35	1.00	128.85 $\pm$ 13.18	213.01 $\pm$ 23.70
C3	1.00	425.35 $\pm$ 36.43*	565.53 $\pm$ 72.38*	1.00	410.20 $\pm$ 25.86	530.70 $\pm$ 41.55	1.00	405.35 $\pm$ 25.48	521.85 $\pm$ 38.94
C7	1.00	234.95 $\pm$ 73.25	413.66 $\pm$ 135.80	1.00	208.25 $\pm$ 53.61	364.71 $\pm$ 104.67	1.00	225.50 $\pm$ 50.62	399.21 $\pm$ 95.84
C11	1.00	976.25 $\pm$ 130.14	1396.81 $\pm$ 247.64	1.00	978.00 $\pm$ 120.45	1380.84 $\pm$ 228.90	0.95	978.00 $\pm$ 130.36	1376.36 $\pm$ 254.97
C16	1.00	1249.50 $\pm$ 124.46	1002.57 $\pm$ 170.85	1.00	1202.60 $\pm$ 184.64	1009.26 $\pm$ 233.03	1.00	1282.00 $\pm$ 136.54	1091.85 $\pm$ 245.40
C18	0.20	2171.25 $\pm$ 138.81*	1768.19 $\pm$ 398.51	0.20	2095.25 $\pm$ 189.02*	1594.14 $\pm$ 271.94	0.50	2198.40 $\pm$ 33.07*	1606.22 $\pm$ 145.12
	$\alpha_{2}=60$			$\alpha_{2}=80$			$\alpha_{2}=100$
Case	SR	No. of steps	Path length	SR	No. of steps	Path length	SR	No. of steps	Path length
C1	1.00	134.25 $\pm$ 11.57	214.39 $\pm$ 18.91	1.00	133.90 $\pm$ 11.24	221.59 $\pm$ 20.24	1.00	131.00 $\pm$ 12.14	214.70 $\pm$ 20.69
C3	1.00	407.45 $\pm$ 15.67*	523.76 $\pm$ 31.46	1.00	406.85 $\pm$ 28.58	521.03 $\pm$ 44.73	1.00	396.05 $\pm$ 18.85	510.17 $\pm$ 38.77
C7	1.00	239.90 $\pm$ 74.86	421.64 $\pm$ 144.35	1.00	238.35 $\pm$ 59.09	424.55 $\pm$ 111.83	1.00	232.50 $\pm$ 47.04	409.36 $\pm$ 94.24
C11	1.00	1004.20 $\pm$ 140.56	1420.79 $\pm$ 232.61	0.90	950.06 $\pm$ 139.47	1336.35 $\pm$ 265.27	0.95	996.21 $\pm$ 89.63	1401.94 $\pm$ 166.63
C16	1.00	1245.05 $\pm$ 147.89	1012.75 $\pm$ 206.21	1.00	1225.45 $\pm$ 135.50	969.26 $\pm$ 132.54	1.00	1191.75 $\pm$ 137.19	955.65 $\pm$ 143.42
C18	0.20	2104.25 $\pm$ 127.41*	1572.75 $\pm$ 263.46	0.30	2201.33 $\pm$ 106.89*	1594.05 $\pm$ 213.48	0.60	2071.17 $\pm$ 156.11	1858.70 $\pm$ 235.51*
* represents the statistical significance

6.3 Performance of the planning-assisted shepherding

The proposed method is compared to the reactive shepherding from Strömbom et al. [7], referred to as Method 1 for convenience, to validate the effectiveness of the proposed shepherding model. As the proposed model consists of offline task planning (grouping and TSP-based sequencing) and online path planning, we further include a shepherding method assisted by task planning only, referred to as Method 2, so that the impact of task planning and path planning can be evaluated separately. The proposed planning-assisted shepherding method is referred to as Method 3 in the comparisons.

6.3.1 Planning-assisted shepherding with single-sheepdog

The comparative results of shepherding methods using a single sheepdog are presented in Table 5. The planning results and the trajectories generated by each method during shepherding for three representative cases (one per group) are visualised in Fig. 5. The blue lines in Fig. 5(a), 5(e), 5(i) represent the planning results. In the other figures, the grey lines represent the sheep trajectories, and the blue lines represent the sheepdog trajectories.

As we can see from Table 5, the proposed planning-assisted swarm shepherding method performed the best overall among the three methods in almost all cases. In terms of the SR, as the shepherding complexity increases, it becomes increasingly difficult for reactive shepherding to complete the mission within the limited number of time steps, and its SR drops from 1 to 0. While reactive shepherding only obtained 100% SR on 3 cases of Group 1 (Cases 1, 5 and 6) and failed in 10 cases, task planning-assisted shepherding achieved 100% SR on 6 cases (Cases 1-5 of Group 1 and Case 7 of Group 2), which include most obstacle-free cases and one obstacle-contained case. However, task planning-assisted shepherding failed in 7 cases (Cases 10, 12-13, 17-20) and had low SR (no more than 50%) on 4 cases (Cases 8, 11, 15-16). This indicates that task planning, which divides the sheep swarm and determines the optimal pushing sequence of sub-swarms, could significantly increase the SR of shepherding in environments without obstacles. It could also address some relatively simple shepherding missions in environments with obstacles, but is unable to deal with complex situations with cluttered obstacles and a large sheep swarm size. On the contrary, the proposed planning-assisted shepherding, with both task planning and path planning integrated, succeeded in all cases of Group 1 and more than half of Groups 2 and 3, and achieved higher SR on Cases 8-18 compared to Method 2. This demonstrates the effectiveness of path planning in improving the shepherding SR in obstacle-cluttered environments. Fig. 5(f)- 5(h) also show that the sheepdog easily reached a deadlock without path planning (Methods 1 and 2), while Method 3 was able to effectively avoid this situation.

Table 5: Comparative results of shepherding methods with single-sheepdog

Case	Method 1: Reactive shepherding			Method 2: Task planning-assisted shepherding			Method 3: Planning-assisted shepherding
	SR	No. of steps	Path length	SR	No. of steps	Path length	SR	No. of steps	Path length
C1	1.00	219.45 $\pm$ 46.38*	427.43 $\pm$ 87.90*	1.00	143.35 $\pm$ 12.09*	257.31 $\pm$ 24.19*	1.00	131.00 $\pm$ 12.14	214.70 $\pm$ 20.69
C2	0.80	540.94 $\pm$ 95.78*	1058.34 $\pm$ 184.48*	1.00	281.55 $\pm$ 35.88	494.55 $\pm$ 67.65*	1.00	272.80 $\pm$ 24.27	457.75 $\pm$ 43.86
C3	0.95	898.53 $\pm$ 186.88*	1801.14 $\pm$ 363.64*	1.00	410.70 $\pm$ 32.80	604.55 $\pm$ 41.13*	1.00	396.05 $\pm$ 18.85	510.17 $\pm$ 38.77
C4	0.75	968.07 $\pm$ 145.03*	1898.39 $\pm$ 279.91*	1.00	464.10 $\pm$ 36.62	633.36 $\pm$ 37.50*	1.00	456.75 $\pm$ 24.57	563.06 $\pm$ 35.75
C5	1.00	1082.45 $\pm$ 192.37	2132.50 $\pm$ 362.61*	1.00	1098.10 $\pm$ 73.27	645.07 $\pm$ 26.22*	1.00	1118.30 $\pm$ 95.97	560.12 $\pm$ 30.96
C6	1.00	1584.65 $\pm$ 182.53*	3083.65 $\pm$ 373.67*	0.95	1410.84 $\pm$ 210.77	677.38 $\pm$ 56.06*	1.00	1434.05 $\pm$ 255.51	616.90 $\pm$ 56.48
C7	0.85	438.53 $\pm$ 80.00*	807.11 $\pm$ 154.04*	1.00	251.85 $\pm$ 52.26	455.44 $\pm$ 100.63	1.00	232.50 $\pm$ 47.04	409.36 $\pm$ 94.24
C8	0.05	685.00 $\pm$ 0.00*	1082.91 $\pm$ 0.00*	0.05	695.00 $\pm$ 0.00*	614.87 $\pm$ 0.00*	1.00	293.70 $\pm$ 22.20	490.47 $\pm$ 37.76
C9	0.00	—-	—-	0.80	544.75 $\pm$ 37.62*	2292.09 $\pm$ 129.15*	1.00	484.30 $\pm$ 13.09	283.20 $\pm$ 18.14
C10	0.00	—-	—-	0.00	—-	—-	1.00	567.45 $\pm$ 56.21	725.97 $\pm$ 88.96
C11	0.00	—-	—-	0.20	1212.00 $\pm$ 68.00*	1653.71 $\pm$ 259.48*	0.95	996.21 $\pm$ 89.63	1401.94 $\pm$ 166.63
C12	0.00	—-	—-	0.00	—-	—-	0.75	1230.60 $\pm$ 31.84	749.71 $\pm$ 57.93
C13	0.00	—-	—-	0.00	—-	—-	0.75	1216.40 $\pm$ 43.09	773.66 $\pm$ 45.49
C14	0.10	1808.00 $\pm$ 48.08*	2652.05 $\pm$ 63.06*	0.80	1528.75 $\pm$ 145.03*	804.04 $\pm$ 92.00*	1.00	1441.85 $\pm$ 109.28	714.76 $\pm$ 58.81
C15	0.10	1763.00 $\pm$ 206.48*	2701.55 $\pm$ 285.92*	0.50	1223.30 $\pm$ 203.73*	794.82 $\pm$ 94.96*	1.00	1129.40 $\pm$ 96.65	722.48 $\pm$ 64.41
C16	0.00	—-	—-	0.30	1837.83 $\pm$ 311.60*	1652.02 $\pm$ 285.06*	1.00	1191.75 $\pm$ 137.19	955.65 $\pm$ 143.42
C17	0.00	—-	—-	0.00	—-	—-	0.15	1984.00 $\pm$ 52.12	967.57 $\pm$ 138.94
C18	0.00	—-	—-	0.00	—-	—-	0.45	2071.17 $\pm$ 156.11	1858.70 $\pm$ 235.51*
C19	0.00	—-	—-	0.00	—-	—-	0.00	—-	—-
C20	0.00	—-	—-	0.00	—-	—-	0.00	—-	—-

The integration of task planning and path planning also significantly reduces the number of steps and the path length required to herd the sheep swarm to the goal in almost all cases. This indicates that the planning-assisted shepherding method can save time and reduce the energy consumption of robots when completing the shepherding mission, which is a significant benefit in real-world shepherding applications. In detail, we can observe from Table 5 that, in most of the obstacle-free environments (Cases 2-6) and the relatively simple obstacle-cluttered environment (Case 7), the reduction in the number of steps is mainly due to task planning, as the difference between the numbers of steps obtained by task planning-assisted shepherding and planning-assisted shepherding is not significant. However, when the shepherding complexity increases in Cases 8-18, path planning plays an important role in further reducing the number of time steps, so that the planning-assisted shepherding achieves the minimum number of steps. In terms of the path length, the task planning-assisted shepherding performed better than the reactive shepherding on all data-applicable cases, while the planning-assisted shepherding obtained the best performance, as presented in Table 5 and Fig. 5. This demonstrates that both task planning and path planning are very effective in reducing the detours of the sheepdog. However, the single-sheepdog planning-assisted shepherding still has difficulty in addressing the most complex Cases 19 and 20 within the limited time and cannot guarantee 100% SR for a few cases.

6.3.2 Planning-assisted shepherding with bi-sheepdog

The failure of the planning-assisted shepherding with a single sheepdog in Cases 19 and 20 motivates the use of multiple sheepdogs in shepherding. We evaluated the performance of the three methods with two sheepdogs on the benchmark set, and Table 6 presents the results. When employing multiple agents, the mission completion time and the minimum cruising ability requirement are determined by the agent which consumes the most time and travels the longest distance, respectively. Therefore, the No. of steps and Path length presented in Table 6 are calculated based on the larger of the values of the two sheepdogs. The visualised planning results for the 3 representative cases are presented in Fig. 6(a), 6(e), 6(i), where the lines in different colours represent the optimal routes for the sheepdogs. The trajectories of sheep (represented as lines in grey) and sheepdogs (represented as lines in blue and red) generated during the bi-sheepdog shepherding process by the different methods for these cases are shown in the other panels of Fig. 6.
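The worst-case aggregation over sheepdogs described above can be written as a short sketch. The function names and the representation of a trajectory as a list of 2-D points are assumptions made for illustration; note that the two maxima may come from different sheepdogs.

```python
import math


def trajectory_length(points):
    """Total distance travelled along a sequence of (x, y) positions."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def mission_metrics(steps_per_dog, trajectories):
    """No. of steps and path length of a multi-sheepdog run (largest per metric)."""
    longest = max(trajectory_length(t) for t in trajectories)
    return max(steps_per_dog), longest
```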

Table 6: Comparative results of shepherding methods with bi-sheepdog

Case	Method 1: Reactive shepherding			Method 2: Task planning-assisted shepherding			Method 3: Planning-assisted shepherding
	SR	No. of steps	Path length	SR	No. of steps	Path length	SR	No. of steps	Path length
C1	1.00	192.05 $\pm$ 60.30*	378.07 $\pm$ 113.59*	1.00	77.50 $\pm$ 8.29	153.00 $\pm$ 13.99*	1.00	76.85 $\pm$ 7.94	132.49 $\pm$ 15.92
C2	0.95	412.84 $\pm$ 136.45*	811.45 $\pm$ 264.00*	1.00	132.45 $\pm$ 7.34	250.01 $\pm$ 13.10*	1.00	133.10 $\pm$ 7.37	232.43 $\pm$ 13.68
C3	1.00	688.05 $\pm$ 124.54*	1392.42 $\pm$ 242.39*	1.00	289.40 $\pm$ 20.38	494.00 $\pm$ 37.71*	1.00	284.45 $\pm$ 19.16	414.43 $\pm$ 31.54
C4	0.95	662.53 $\pm$ 122.04*	1322.85 $\pm$ 243.08*	1.00	195.65 $\pm$ 10.52	374.66 $\pm$ 20.84*	1.00	198.30 $\pm$ 8.89	317.29 $\pm$ 20.73
C5	1.00	768.90 $\pm$ 158.46*	1543.09 $\pm$ 317.80*	1.00	433.35 $\pm$ 28.25	424.59 $\pm$ 22.60*	1.00	431.90 $\pm$ 49.90	365.22 $\pm$ 28.34
C6	1.00	1077.00 $\pm$ 208.75*	2128.58 $\pm$ 420.37*	1.00	263.65 $\pm$ 15.46	366.74 $\pm$ 19.44*	1.00	261.35 $\pm$ 20.29	298.20 $\pm$ 25.56
C7	0.90	349.72 $\pm$ 118.35*	647.45 $\pm$ 209.27*	0.95	204.89 $\pm$ 113.89	357.89 $\pm$ 146.12*	1.00	157.40 $\pm$ 36.19	282.79 $\pm$ 67.60
C8	0.85	463.94 $\pm$ 113.64*	854.12 $\pm$ 194.77*	1.00	174.30 $\pm$ 15.00	323.17 $\pm$ 29.17*	1.00	168.25 $\pm$ 9.63	293.42 $\pm$ 19.03
C9	0.00	-	-	0.00	-	-	1.00	407.25 $\pm$ 11.39	281.33 $\pm$ 23.32
C10	0.40	906.50 $\pm$ 177.10*	1556.23 $\pm$ 267.23*	0.00	-	-	1.00	249.65 $\pm$ 11.50	383.25 $\pm$ 24.44
C11	0.00	-	-	0.70	755.50 $\pm$ 267.97*	1149.11 $\pm$ 288.59*	1.00	468.95 $\pm$ 190.20	765.19 $\pm$ 244.20
C12	0.00	-	-	0.00	-	-	1.00	568.90 $\pm$ 12.33	350.87 $\pm$ 17.13
C13	0.00	-	-	0.00	-	-	1.00	668.95 $\pm$ 14.39	419.06 $\pm$ 23.33
C14	0.85	1138.35 $\pm$ 336.01*	1907.98 $\pm$ 477.13*	0.00	-	-	1.00	352.50 $\pm$ 14.99	359.98 $\pm$ 26.04
C15	0.60	1656.17 $\pm$ 394.33*	2690.52 $\pm$ 491.41*	1.00	423.20 $\pm$ 312.16*	555.73 $\pm$ 208.74*	1.00	296.95 $\pm$ 18.88	395.81 $\pm$ 47.19
C16	0.10	1807.50 $\pm$ 245.37*	2469.13 $\pm$ 34.49*	0.75	1234.27 $\pm$ 459.81*	1111.22 $\pm$ 453.89*	1.00	605.45 $\pm$ 209.93	564.11 $\pm$ 116.75
C17	0.00	-	-	0.00	-	-	1.00	720.95 $\pm$ 168.93	558.64 $\pm$ 87.80
C18	0.05	2077.00 $\pm$ 0.00*	2770.34 $\pm$ 0.00*	0.30	1421.00 $\pm$ 600.48*	1673.98 $\pm$ 613.43*	0.85	689.29 $\pm$ 134.12	1028.32 $\pm$ 195.86
C19	0.10	1765.50 $\pm$ 77.07*	2892.37 $\pm$ 71.84*	0.80	1030.62 $\pm$ 351.46*	1436.14 $\pm$ 471.77*	1.00	527.25 $\pm$ 48.09	618.39 $\pm$ 156.38
C20	0.00	-	-	0.00	-	-	1.00	1162.45 $\pm$ 27.88	559.21 $\pm$ 36.11

We can observe from Table 6 and Fig. 6 that planning-assisted bi-sheepdog shepherding performed best among the 3 methods and that Method 2 generally performed better than Method 1. This agrees with the findings for single-sheepdog shepherding above: task planning and path planning can significantly improve shepherding performance, especially for complex shepherding missions. Specifically, bi-sheepdog planning-assisted shepherding achieved 100% SR on all cases except Case 18 (SR of 0.85), while Methods 1 and 2 with two sheepdogs still failed completely in some cases. Furthermore, compared with single-sheepdog shepherding, deploying 2 sheepdogs, regardless of the method, significantly improved the SR on the complex shepherding tasks and reduced the number of time steps and the path length needed to complete the mission, as shown in Table 6. For example, bi-sheepdog planning-assisted shepherding increased the SR from 0 to 100% on Cases 19 and 20 and required fewer time steps and a shorter path length than single-sheepdog planning-assisted shepherding in all cases. We can conclude that deploying multiple sheepdogs is an efficient way to reduce both the completion time of the shepherding mission and the cruising ability required of each sheepdog.

To further validate the effectiveness of bi-sheepdog shepherding, we also compare the total number of steps and the total path length of the 2 sheepdogs obtained by planning-assisted shepherding with the best results of single-sheepdog shepherding. The results are presented in Table 7, where boldface denotes that bi-sheepdog shepherding achieves better results than single-sheepdog shepherding and ‘*’ indicates a statistically significant difference. Even in terms of these totals, planning-assisted bi-sheepdog shepherding still significantly outperformed single-sheepdog shepherding in most cases. This further demonstrates the efficiency of bi-sheepdog planning-assisted shepherding in reducing the total time and energy consumption needed to complete the mission.
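The relation between the per-mission values in Table 6 and the totals compared here can be written down for a single run. The symbols $T_1$ and $T_2$ are introduced only for this illustration and denote the two sheepdogs' values (number of steps or path length) in one run:

$$
T_{\max} = \max(T_1, T_2), \qquad T_{\text{total}} = T_1 + T_2, \qquad T_{\max} \le T_{\text{total}} \le 2\,T_{\max}.
$$

Because the total is never smaller than the maximum within a run, a comparison on totals is the stricter of the two tests. The bounds carry over to reported means only when both means are taken over the same set of runs.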

Table 7: The comparison of bi-sheepdog shepherding to single sheepdog shepherding

* indicates a statistically significant difference between bi-sheepdog and single-sheepdog shepherding
Case		The total
	SR	No. of steps	Path length
C1	1.00	112.75 $\pm$ 7.99*	195.62 $\pm$ 15.68*
C2	1.00	257.50 $\pm$ 9.74	444.74 $\pm$ 19.52
C3	1.00	338.70 $\pm$ 18.77	503.27 $\pm$ 28.82
C4	1.00	332.65 $\pm$ 10.80*	531.97 $\pm$ 22.48*
C5	1.00	547.70 $\pm$ 49.69	541.10 $\pm$ 38.67
C6	1.00	467.90 $\pm$ 26.30*	548.44 $\pm$ 35.06*
C7	1.00	196.45 $\pm$ 36.49*	347.06 $\pm$ 67.72*
C8	1.00	307.25 $\pm$ 46.74*	537.17 $\pm$ 85.49*
C9	1.00	562.05 $\pm$ 12.17*	378.06 $\pm$ 23.17*
C10	1.00	518.15 $\pm$ 153.67*	840.70 $\pm$ 283.54*
C11	1.00	704.45 $\pm$ 220.95*	1183.11 $\pm$ 267.44*
C12	1.00	866.65 $\pm$ 15.21*	601.40 $\pm$ 29.48*
C13	1.00	897.10 $\pm$ 16.47*	606.20 $\pm$ 25.21*
C14	1.00	572.30 $\pm$ 69.50	704.61 $\pm$ 178.78
C15	1.00	503.55 $\pm$ 38.45*	612.12 $\pm$ 62.65*
C16	1.00	687.70 $\pm$ 85.22	900.81 $\pm$ 161.92
C17	1.00	827.45 $\pm$ 242.41	947.27 $\pm$ 209.32
C18	0.85	1137.29 $\pm$ 221.63	1728.28 $\pm$ 234.88
C19	1.00	836.45 $\pm$ 122.29*	1039.61 $\pm$ 218.38*
C20	1.00	1628.65 $\pm$ 39.02*	908.14 $\pm$ 58.52*

7 Conclusion

This paper presents a planning-assisted context-sensitive swarm shepherding model and a hierarchical mission planning system for effectively herding a large flock of highly dispersed sheep to the destination in an environment with obstacles. In the proposed shepherding model, the sheep swarm is first grouped into sub-swarms. The shepherding problem is then transformed into a TSP that determines the optimal pushing sequence of the sub-swarms by regarding each sub-swarm as a ‘city’ to visit. Online path planning is then integrated with a context-sensitive response model to find the optimal paths along which the sheep sub-swarms are pushed to the next ‘city’ and the optimal paths for the sheepdogs that push them. The hierarchical mission planning system solves the planning problems in the proposed shepherding model by combining a cohesion range-based method for grouping, ACO for the TSP, and A*-PP for path planning.
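The first two layers of this hierarchy can be illustrated with a small sketch. It is a simplified stand-in rather than the method evaluated in the paper: grouping is approximated by single-linkage connected components under a distance threshold, the TSP is solved by exhaustive search instead of ACO (feasible only for a handful of sub-swarms), the tour is treated as an open path from the sheepdog's start through the sub-swarm centres to the goal, and obstacle-aware path planning is omitted. All function names and coordinates are hypothetical.

```python
from itertools import permutations
from math import dist


def group_by_range(sheep, cohesion_range):
    """Group sheep into sub-swarms by chaining neighbours within cohesion_range.

    Assumed stand-in for the cohesion range-based grouping: two sheep share a
    sub-swarm if a chain of sheep, each within cohesion_range of the next,
    connects them.
    """
    unvisited = set(range(len(sheep)))
    groups = []
    while unvisited:
        stack = [unvisited.pop()]
        members = []
        while stack:
            i = stack.pop()
            members.append(i)
            near = {j for j in unvisited
                    if dist(sheep[i], sheep[j]) <= cohesion_range}
            unvisited -= near
            stack.extend(near)
        groups.append(sorted(members))
    return sorted(groups)


def centroid(points):
    """Mean position of a list of 2-D points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def herding_order(start, centres, goal):
    """Visiting order of sub-swarm centres that minimises straight-line length.

    Brute force replaces ACO here, and Euclidean distance replaces the
    planned path cost; both are simplifications for illustration.
    """
    def length(order):
        stops = [start] + [centres[i] for i in order] + [goal]
        return sum(dist(a, b) for a, b in zip(stops, stops[1:]))

    return min(permutations(range(len(centres))), key=length)


# Made-up positions for illustration.
sheep = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 8)]
groups = group_by_range(sheep, cohesion_range=2)
centres = [centroid([sheep[i] for i in g]) for g in groups]
order = herding_order(start=(-5, 0), centres=centres, goal=(20, 0))
print(groups, order)
```

In the full system each leg of the chosen sequence would be replaced by a planned, collision-free path, so the straight-line cost used here only approximates the true travelling cost.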

Experiments conducted on 20 shepherding cases, consisting of three groups with different levels of complexity, demonstrated the effectiveness of the planning-assisted swarm shepherding model in increasing the success rate and reducing the time and energy consumption needed to complete the mission. The planning-assisted swarm shepherding model can also be extended to employ multiple sheepdogs, and experiments validated the performance improvements for bi-sheepdog shepherding. However, there remain several opportunities for extending this research. The use of more than 2 sheepdogs for shepherding has not been studied in this work. In addition, when the swarm shepherding problem is transformed into a TSP, the dynamics of shepherding are not considered, and the travelling cost between each pair of cities does not account for the influence of the swarm size. Our future research will focus on how to model the multi-sheepdog swarm shepherding problem as a multiple dynamic TSP with a more accurate cost evaluation.

8 Funding

This work is supported by a U.S. Office of Naval Research-Global (ONR-G) Grant and a Defence Science and Technology Group grant.

Declaration of Competing Interest

None.

References

[1] N. K. Long, K. Sammut, D. Sgarioto, M. Garratt, H. A. Abbass, A comprehensive review of shepherding as a bio-inspired swarm-robotics guidance approach, IEEE Transactions on Emerging Topics in Computational Intelligence 4 (4) (2020) 523–537.
[2] J.-M. Lien, E. Pratt, Interactive planning for shepherd motion., in: AAAI Spring Symposium: Agents that Learn from Human Teachers, 2009, pp. 95–102.
[3] M. Evered, P. Burling, M. Trotter, et al., An investigation of predator response in robotic herding of sheep, International Proceedings of Chemical, Biological and Environmental Engineering 63 (2014) 49–54.
[4] D. Strömbom, A. J. King, Robot collection and transport of objects: A biomimetic process, Frontiers in Robotics and AI (2018) 48.
[5] B. Bat-Erdene, O.-E. Mandakh, Shepherding algorithm of multi-mobile robot system, in: 2017 First IEEE International Conference on Robotic Computing (IRC), IEEE, 2017, pp. 358–361.
[6] A. A. Paranjape, S.-J. Chung, K. Kim, D. H. Shim, Robotic herding of a flock of birds using an unmanned aerial vehicle, IEEE Transactions on Robotics 34 (4) (2018) 901–915.
[7] D. Strömbom, R. P. Mann, A. M. Wilson, S. Hailes, A. J. Morton, D. J. Sumpter, A. J. King, Solving the shepherding problem: heuristics for herding autonomous, interacting agents, Journal of the royal society interface 11 (100) (2014) 20140719.
[8] H. Singh, B. Campbell, S. Elsayed, A. Perry, R. Hunjet, H. Abbass, Modulation of force vectors for effective shepherding of a swarm: A bi-objective approach, in: 2019 IEEE Congress on Evolutionary Computation (CEC), IEEE, 2019, pp. 2941–2948.
[9] T. Nguyen, J. Liu, H. Nguyen, K. Kasmarik, S. Anavatti, M. Garratt, H. Abbass, Perceptron-learning for scalable and transparent dynamic formation in swarm-on-swarm shepherding, in: 2020 International Joint Conference on Neural Networks (IJCNN), IEEE, 2020, pp. 1–8.
[10] J. Zhi, J.-M. Lien, Learning to herd agents amongst obstacles: Training robust shepherding behaviors using deep reinforcement learning, IEEE Robotics and Automation Letters 6 (2) (2021) 4163–4168.
[11] V. S. Chipade, D. Panagou, Multiagent planning and control for swarm herding in 2-d obstacle environments under bounded inputs, IEEE Transactions on Robotics 37 (6) (2021) 1956–1972.
[12] H. Song, A. Varava, O. Kravchenko, D. Kragic, M. Y. Wang, F. T. Pokorny, K. Hang, Herding by caging: a formation-based motion planning framework for guiding mobile agents, Autonomous Robots 45 (5) (2021) 613–631.
[13] H. El-Fiqi, B. Campbell, S. Elsayed, A. Perry, H. K. Singh, R. Hunjet, H. A. Abbass, The limits of reactive shepherding approaches for swarm guidance, IEEE Access 8 (2020) 214658–214671.
[14] A. Hussein, E. Petraki, S. Elsawah, H. A. Abbass, Autonomous swarm shepherding using curriculum-based reinforcement learning., in: AAMAS, 2022, pp. 633–641.
[15] S. M. LaValle, Planning algorithms, Cambridge university press, 2006.
[16] Z. Zhao, M. Jin, E. Lu, S. X. Yang, Path planning of arbitrary shaped mobile robots with safety consideration, IEEE Transactions on Intelligent Transportation Systems (2021).
[17] J. Müller, J. Strohbeck, M. Herrmann, M. Buchholz, Motion planning for connected automated vehicles at occluded intersections with infrastructure sensors, IEEE Transactions on Intelligent Transportation Systems (2022).
[18] J. Liu, S. Anavatti, M. Garratt, H. A. Abbass, Mission planning for shepherding a swarm of uninhabited aerial vehicles, Shepherding UxVs for Human-Swarm Teaming: An Artificial Intelligence Approach to Unmanned X Vehicles (2021) 87–114.
[19] J. K. Lenstra, A. R. Kan, Some simple applications of the travelling salesman problem, Journal of the Operational Research Society 26 (4) (1975) 717–733.
[20] S. Elsayed, H. Singh, E. Debie, A. Perry, B. Campbell, R. Hunjel, H. Abbass, Path planning for shepherding a swarm in a cluttered environment using differential evolution, in: 2020 IEEE Symposium Series on Computational Intelligence (SSCI), IEEE, 2020, pp. 2194–2201.
[21] T. Stützle, H. H. Hoos, Max–min ant system, Future generation computer systems 16 (8) (2000) 889–914.
[22] C. W. Reynolds, Flocks, herds and schools: A distributed behavioral model, in: Proceedings of the 14th annual conference on Computer graphics and interactive techniques, 1987, pp. 25–34.
[23] T. Miki, T. Nakamura, An effective rule based shepherding algorithm by using reactive forces between individuals, International Journal of InnovativeComputing, Information and Control 3 (4) (2007) 813–823.
[24] K. Fujioka, Effective herding in shepherding problem in V-formation control, Transactions of the Institute of Systems, Control and Information Engineers 31 (1) (2018) 21–27.
[25] J. F. Harrison, C. Vo, J.-M. Lien, Scalable and robust shepherding via deformable shapes, in: International Conference on Motion in Games, Springer, 2010, pp. 218–229.
[26] J. Hu, A. E. Turgut, T. Krajník, B. Lennox, F. Arvin, Occlusion-based coordination protocol design for autonomous robotic shepherding tasks, IEEE Transactions on Cognitive and Developmental Systems (2020).
[27] C. K. Go, B. Lao, J. Yoshimoto, K. Ikeda, A reinforcement learning approach to the shepherding task using sarsa, in: 2016 International Joint Conference on Neural Networks (IJCNN), IEEE, 2016, pp. 3833–3836.
[28] H. T. Nguyen, T. D. Nguyen, V. P. Tran, M. Garratt, K. Kasmarik, S. Anavatti, M. Barlow, H. A. Abbass, Continuous deep hierarchical reinforcement learning for ground-air swarm shepherding, arXiv preprint arXiv:2004.11543 (2020).
[29] K. Bérczi, M. Mnich, R. Vincze, Efficient approximations for many-visits multiple traveling salesman problems, arXiv preprint arXiv:2201.02054 (2022).
[30] A. Ayari, S. Bouamama, Acd3gpso: automatic clustering-based algorithm for multi-robot task allocation using dynamic distributed double-guided particle swarm optimization, Assembly Automation 40 (2) (2019) 235–247.
[31] J. Xie, J. Chen, Multiregional coverage path planning for multiple energy constrained uavs, IEEE Transactions on Intelligent Transportation Systems (2022).
[32] P. Baniasadi, M. Foumani, K. Smith-Miles, V. Ejov, A transformation technique for the clustered generalized traveling salesman problem with applications to logistics, European Journal of Operational Research 285 (2) (2020) 444–457.
[33] I. Khoufi, A. Laouiti, C. Adjih, A survey of recent extended variants of the traveling salesman and vehicle routing problems for unmanned aerial vehicles, Drones 3 (3) (2019) 66.
[34] X. Xu, J. Li, M. Zhou, X. Yu, Precedence-constrained colored traveling salesman problem: An augmented variable neighborhood search approach, IEEE Transactions on Cybernetics (2021).
[35] M. Mavrovouniotis, F. M. Müller, S. Yang, Ant colony optimization with local search for dynamic traveling salesman problems, IEEE transactions on cybernetics 47 (7) (2016) 1743–1756.
[36] I. M. Ali, D. Essam, K. Kasmarik, A novel design of differential evolution for solving discrete traveling salesman problems, Swarm and Evolutionary Computation 52 (2020) 100607.
[37] M. Dorigo, Optimization, learning and natural algorithms, PhD Thesis, Politecnico di Milano (1992).
[38] M. Dorigo, L. M. Gambardella, Ant colony system: a cooperative learning approach to the traveling salesman problem, IEEE Transactions on evolutionary computation 1 (1) (1997) 53–66.
[39] X. Xiang, Y. Tian, X. Zhang, J. Xiao, Y. Jin, A pairwise proximity learning-based ant colony algorithm for dynamic vehicle routing problems, IEEE Transactions on Intelligent Transportation Systems 23 (6) (2021) 5275–5286.
[40] O. Cheikhrouhou, I. Khoufi, A comprehensive survey on the multiple traveling salesman problem: Applications, approaches and taxonomy, Computer Science Review 40 (2021) 100369.
[41] P. Oberlin, S. Rathinam, S. Darbha, Today’s traveling salesman problem, IEEE robotics & automation magazine 17 (4) (2010) 70–77.
[42] V. Roberge, M. Tarbouchi, G. Labonté, Comparison of parallel genetic algorithm and particle swarm optimization for real-time UAV path planning, IEEE Transactions on Industrial Informatics 9 (1) (2012) 132–141.
[43] X. Yu, W.-N. Chen, T. Gu, H. Yuan, H. Zhang, J. Zhang, ACO-A*: Ant colony optimization plus A* for 3-D traveling in environments with dense obstacles, IEEE Transactions on Evolutionary Computation 23 (4) (2018) 617–631.
[44] P. E. Hart, N. J. Nilsson, B. Raphael, A formal basis for the heuristic determination of minimum cost paths, IEEE transactions on Systems Science and Cybernetics 4 (2) (1968) 100–107.
[45] K. Yang, Anytime synchronized-biased-greedy rapidly-exploring random tree path planning in two dimensional complex environments, International Journal of Control, Automation and Systems 9 (4) (2011) 750–758.