Planning-Assisted Context-Sensitive Autonomous Shepherding of Dispersed Robotic Swarms in Obstacle-Cluttered Environments

Jing Liu jing.liu5@unsw.edu.au/ liujing2605@gmail.com Hemant Singh h.singh@adfa.edu.au Saber Elsayed s.elsayed@unsw.edu.au Robert Hunjet r.hunjet@adfa.edu.au Hussein Abbass h.abbass@unsw.edu.au School of Engineering and Information Technology, University of New South Wales, Canberra ACT, Australia
Abstract

Robotic shepherding is a bio-inspired approach to autonomously guiding a swarm of agents towards a desired location. The area has attracted increasing research interest recently due to the efficacy of controlling a large number of agents in a swarm (sheep) using a smaller number of actuators (sheepdogs). However, shepherding a highly dispersed swarm in an obstacle-cluttered environment remains challenging for existing methods. To improve the efficacy of shepherding in complex environments with obstacles and dispersed sheep, this paper proposes a planning-assisted context-sensitive autonomous shepherding framework with collision avoidance abilities. The proposed approach models the swarm shepherding problem as a single Travelling Salesperson Problem (TSP), with two sheepdog modes: no-interaction and interaction. An adaptive switching approach is integrated into the framework to guide real-time path planning for avoiding collisions with static and dynamic obstacles, the latter representing moving sheep swarms. We then propose an overarching hierarchical mission planning system, which is made of three sub-systems: a clustering approach to group and distinguish sheep sub-swarms, an Ant Colony Optimisation algorithm as a TSP solver for determining the optimal herding sequence of the sub-swarms, and an online path planner for calculating optimal paths for both sheepdogs and sheep. Experiments on various environments, both with and without obstacles, demonstrate the effectiveness of the proposed shepherding framework and planning approaches.

keywords:
swarm shepherding, path planning, travelling salesperson problem, ant colony optimisation

1 Introduction

As a bio-inspired swarm guidance approach, robotic shepherding seeks to guide a swarm of agents (e.g., sheep flock, crowd) to a goal area by controlling the movement of one or more outside robots (known as sheepdogs or shepherds) [1]. Simulating shepherding behaviour has attracted increasing attention from scholars because the level of abstraction in the shepherding problem can be mapped to many real-world applications such as crowd control [2], precision agriculture [3], object collection [4], robotic manipulation [5], and preventing birds from entering the airspace of airports [6].

One of the most challenging issues in shepherding is how to increase the success rate while reducing the mission completion time when herding a large number of sheep that are highly dispersed in an environment with obstacles (more on this challenge in Section 2.1).

Existing swarm shepherding methods can be roughly classified as rule-based methods [7] [8], learning-based methods [9] [10], and planning-based methods [11] [12]. Rule-based algorithms lack the flexibility and adaptability required to manage a wide range of environments [13]. While learning-based methods have the potential to address adaptability, they rely heavily on training and require a large amount of data and/or significant computational time for training [14]. Planning-based methods integrate planning approaches (e.g., optimisation techniques) into rule-based methods to guide sheepdog behaviour. However, current literature is limited to motion (e.g., path) planning algorithms, mostly for a single agent or for multiple agents exhibiting self-control only (i.e., they do not need to exercise indirect control over groups). Therefore, existing methods face difficulty addressing the shepherding problem. The problem is compounded when sheepdogs have limited influence ranges and need to herd a large flock dispersed into several sub-swarms in environments containing obstacles.

Planning is an important research field in robotics and artificial intelligence [15] [16] [17]. The field promises to improve shepherding performance in terms of success rate and completion time [18]. Aiming at effective shepherding of multiple sub-swarms in obstacle-cluttered environments, this paper focuses on planning-based methods. We capitalise on the similarity between multiple sub-swarm shepherding and the Travelling Salesperson Problem (TSP) to realise the benefits of path planning in obstacle-cluttered environments. TSP is a well-known route planning problem for determining the optimal visiting sequence of a list of cities [19] and has been extensively studied as described in Section 2.2. However, the similarities between TSP and shepherding problems, and the application of TSP to them, have not been studied. Path planning using metaheuristics such as Evolutionary Computation (EC) algorithms [20] and the Rapidly Exploring Random Tree algorithm [12] has shown promising early results in facilitating swarm shepherding.

This paper proposes a planning-assisted swarm shepherding framework by integrating the TSP with path planning to improve the effectiveness of shepherding, especially for a highly dispersed sheep swarm in an obstacle-cluttered environment. In the proposed shepherding framework, the sheep swarm is firstly divided into sub-swarms to identify the set of virtual ‘cities’ and then the shepherding problem is transformed into a single TSP to determine the optimal push sequence of sheep sub-swarms. Path planning is integrated with TSP for finding the optimal path for the sheep sub-swarm to be herded towards the next ‘city’ sequentially without collision with obstacles, and for the sheepdog to move towards the driving point of the sheep sub-swarm in real time. A primary difference between a classic TSP and the way it is adopted for shepherding is that the travelling salesperson actuates on itself, while in shepherding, it actuates on a group with unpredictable responses from the members of the groups. To handle this challenge effectively, we needed to combine the offline TSP with on-line heuristics to manage the emerging dynamics from these indirect interactions.

We consider two modes for a sheepdog to respond to context-sensitive information. First, a context-sensitive interaction mode where the sheepdog is forcing the sheep sub-swarm to move. Second, a no-interaction mode where the influence of sheepdogs on the sheep swarm is minimal to avoid undesired movements of sheep. An adaptive switching approach is proposed to assist sheepdogs in switching between the interaction and no-interaction modes based on context during real-time operations. Subsequently, we present a hierarchical mission planning algorithm, which combines the offline grouping and TSP solver, as well as the online path planner, to solve the optimisation problems involved in the framework. The grouping method divides the sheep swarm by evaluating whether there are cohesion forces among the sheep, and a well-known optimisation approach, the Max-Min Ant System (MMAS) [21], is introduced for addressing the TSP. In addition, a two-layer path planner, A*-Post Processing (A*-PP), is presented to optimise the path for both the sheepdogs and the sheep swarm.

The contributions of this paper include the following:

  1. A planning-assisted swarm shepherding model is proposed to effectively herd a highly dispersed sheep swarm to the goal in obstacle-cluttered environments.

  2. The swarm shepherding problem is formulated as a single TSP to determine the optimal herding sequence of sub-swarms.

  3. A context-sensitive response model is proposed in which the sheepdog adaptively switches between two modes of operation during real-time path planning.

  4. A hierarchical mission planning system, consisting of offline grouping and MMAS-based sequencing as well as online path planning, is designed.

The remainder of the paper is organised as follows. Section 2 provides a review of the works related to swarm shepherding and mission planning. Section 3 presents the basic shepherding model while Section 4 describes the proposed planning-assisted shepherding model. The planning algorithms used in the proposed shepherding model are presented in Section 5, followed by the experimental results and analysis in Section 6. Section 7 concludes the paper.

2 Related works

This section reviews work related to swarm shepherding and to planning approaches.

2.1 Swarm shepherding

The success of robotic swarm shepherding relies on the modelling of sheep flocking behaviour and the design of sheepdog control strategies. The rules of BOIDS [22] [23] [24] are the most common sheep modelling method, where separation, cohesion and alignment of sheep are considered. To improve robustness when herding larger flocks, Harrison et al. [25] viewed the flock as an abstracted deformable shape, while Hu et al. [26] used adaptive protocols and artificial potential field methods to model the sheep flocking behaviour.

The shepherding field of research has focused more on the design of sheepdog control strategies. As a representative work of rule-based shepherding methods, Strömbom et al.’s shepherding algorithm [7] laid the foundations for many other shepherding methods, such as the modulation model in [8] and the Reinforcement Learning (RL) approach in [27]. Strömbom et al. [7] (described in Section 3) simulated two typical sheepdog behaviours: collecting the dispersed sheep, and driving the aggregated sheep swarm to a specific location. They found that the mission completion time increases and the success rate decreases as the number of agents in the swarm increases.

A coordination algorithm was designed in [26] to employ multiple robotic sheepdogs to herd two flocks of sheep, consisting of 20 and 30 sheep, respectively. It was observed that the proposed algorithm could not handle the shepherding of a large flock. El-Fiqi et al. [13] investigated the influence of some key factors (e.g., the density of obstacles and the initial spatial distribution of sheep) on the complexity of shepherding and identified the limitations of reactive shepherding. It was suggested that increases in the density of obstacles and in the sheep’s initial level of dispersion in the environment escalate problem complexity and reduce the mission success rate.

Learning-based methods have also been studied [10] [9] [28]. Go et al. [27] extended Strömbom et al.’s model by applying RL for learning the sheepdog’s behaviour policy. Hussein et al. [14] decomposed the shepherding problem into two sub-problems: learning to push an agent from a location to the destination and selecting whether to collect scattered agents or drive the largest flock to the destination. They aimed to reduce the problem’s complexity and proposed a curriculum-based RL to accelerate the learning process. However, the investigation of the swarm shepherding problem with multiple sub-swarms randomly dispersed in the obstacle-cluttered environment is still limited, and an efficient way to address this problem is lacking.

2.2 Planning approaches

Mission planning approaches such as path planning algorithms have been well investigated and applied for mobile robots. The planning sub-problems (e.g., path planning, route planning, task assignment) involved in mission planning are defined, and the promising applications for swarm shepherding are discussed in [18]. Long et al. [1] also suggested that the sheepdog should take charge of high-level planning, such as path planning and task allocation, for completing complex shepherding tasks. Some research on the application of planning approaches to swarm shepherding exists [11] [12]. For instance, Lien and Pratt [2] presented a computer-human interactive motion planning method to address the shepherding problem. They observed that the planner lacks efficiency when the flock separates into several sub-groups. To adapt the shepherding model to environments with obstacles, Elsayed et al. [20] presented a 2-stage differential evolution-based path planning algorithm that optimises the path for the sheepdog and sheep. They demonstrated that the path planning algorithm could reduce the time to complete the shepherding task.

TSP is a well-known NP-hard route planning problem that aims to find the route with the optimal cost for a salesperson to visit each city exactly once and return to the initial city, given a set of cities and the travelling cost between each pair of cities [19]. TSP is a generalisation of, or can be applied to, many real-world problems, such as the vehicle routing problem (VRP) [29], multi-robot task allocation [30], multi-regional coverage path planning [31], and transportation and delivery [32]. Significant research has been conducted on TSP [33] [34]. Some effective approaches for solving the TSP include EC [35] [36] and swarm-intelligence algorithms such as Ant Colony Optimisation (ACO) [37], which has demonstrated its ability to solve TSP in multiple studies [21] [38] [39]. Many variants of TSP, such as Multiple TSP (MTSP) and Dynamic TSP (DTSP) [35], exist. For example, when there are multiple salespersons, the problem is called MTSP and can be further classified as single-depot or multi-depot based on where the salespersons depart from [40]. Transformation methods have been used to convert a complex TSP variant to a classic single TSP, for which general and efficient TSP solvers exist [41]. Shepherding problems, especially with multiple sub-swarms, share some similarities with TSP. For example, in both problems there are some locations (swarm locations or ‘cities’) that must be visited by some agents (sheepdogs or salespersons). However, to the best of our knowledge, TSP has not been applied to the robotic shepherding problem before.

3 Strömbom model

Before moving to the proposed approach, we briefly describe the model proposed by Strömbom et al. to introduce the terminology associated with shepherding that will be used subsequently. Let the sheep swarm be $\Pi=\{\pi_{1},...,\pi_{i},...,\pi_{N}\}$, where $\pi_{i}$ denotes a sheep agent and $N$ is the number of sheep agents in the swarm. $B=\{\beta_{1},...,\beta_{j},...,\beta_{M}\}$ is the set of sheepdog agents (UGVs), with $M$ sheepdogs each denoted as $\beta_{j}$. The goal position towards which the sheepdogs herd the sheep swarm is denoted as $P_{G}$. The positions of $\pi_{i}$ and $\beta_{j}$ at time step $t$ are denoted as $P_{\pi_{i}}^{t}$ and $P_{\beta_{j}}^{t}$, respectively. As per [8] [20] [13], the total force $F_{\pi_{i}}^{t}$ on sheep $\pi_{i}$ and the total force $F_{\beta_{j}}^{t}$ on sheepdog $\beta_{j}$ are calculated as in Equation (1) and Equation (2), respectively.

$$\begin{split}F_{\pi_{i}}^{t}={}&W_{\pi_{v}}F_{\pi_{i}}^{t-1}+W_{\pi\Lambda}F_{\pi_{i}\Lambda_{\pi_{i}}^{t}}^{t}+W_{\pi\beta}F_{\pi_{i}\beta_{j}}^{t}\\ &+W_{\pi\pi}F_{\pi_{i}\pi_{i_{1}}}^{t}+W_{\pi o}F_{\pi_{i}o}^{t}+W_{e\pi_{i}}F_{\pi_{i}\epsilon}^{t}\end{split} \tag{1}$$

$$F_{\beta_{j}}^{t}=F_{\beta_{j}cd}^{t}+W_{e\beta_{j}}F_{\beta_{j}\epsilon}^{t} \tag{2}$$

where each $W$ is the weight of the corresponding force vector. Each force vector is described as follows:

For sheep $\pi_{i}$:

  1. $F_{\pi_{i}}^{t-1}$ is the previous total force vector;

  2. $F_{\pi_{i}\Lambda_{\pi_{i}}^{t}}^{t}$ represents the attraction force to its neighbours $\Lambda_{\pi_{i}}^{t}$ within the cohesion range $R_{\Lambda}$;

  3. $F_{\pi_{i}\beta_{j}}^{t}$ represents the repulsion force from sheepdog $\beta_{j}$ if $\pi_{i}$ is within the influence range of the sheepdog $R_{\pi\beta}$;

  4. $F_{\pi_{i}\pi_{i_{1}}}^{t}$ is the repulsion force from other sheep $\pi_{i_{1}},i_{1}\neq i$ within the sheep avoidance radius $R_{\pi\pi}$;

  5. $F_{\pi_{i}o}^{t}$ is the repulsion force from the obstacles $o$ within the obstacle avoidance radius $R_{\pi o}$;

  6. $F_{\pi_{i}\epsilon}^{t}$ is the random force added to sheep $\pi_{i}$.

For sheepdog $\beta_{j}$:

  1. $F_{\beta_{j}cd}^{t}$ represents the normalised force vector that makes the sheepdog move to the driving point $P_{D}^{t}$ or collection point $P_{C}^{t}$;

  2. $F_{\beta_{j}\epsilon}^{t}$ is the random force added to sheepdog $\beta_{j}$ to help avoid deadlocks.

To complete the shepherding mission, sheepdog agents switch between driving behaviour and collecting behaviour by evaluating whether any sheep is too far away from the sheep flock, as shown in Algorithm 1. Specifically, if the distance between any sheep and the Global Centre of Mass (GCM) of the flock exceeds the neighbourhood range $R_{n}$, the sheepdog moves to the collecting point $P_{C}^{t}$, which is located behind the furthest sheep $\pi_{f}^{t}$ in the direction of the GCM; otherwise, the sheep are clustered in the flock and the sheepdog needs to execute a driving behaviour by moving to the driving point $P_{D}^{t}$, which is located behind the GCM relative to the final goal $P_{G}$. $P_{D}^{t}$ and $P_{C}^{t}$ are calculated as follows:

$$P_{D}^{t}=GCM^{t}+(R_{n}+R_{s})\frac{P_{G}-GCM^{t}}{||P_{G}-GCM^{t}||} \tag{3}$$

$$P_{C}^{t}=P_{\pi_{f}}^{t}+R_{s}\frac{GCM^{t}-P_{\pi_{f}}^{t}}{||GCM^{t}-P_{\pi_{f}}^{t}||} \tag{4}$$

$$R_{n}=R_{\pi\pi}\sqrt{2N} \tag{5}$$

where $R_{s}$ is the safe operation distance between a sheepdog and a sheep.

Algorithm 1 Herding($P_{G}$, $GCM^{t}$, $\Pi$)
1: Input: $P_{G}$, $GCM^{t}$, $\Pi=\{\pi_{1},...,\pi_{i},...,\pi_{N}\}$
2: Locate the furthermost sheep $\pi_{f}$
3: if the distance between $\pi_{f}$ and $GCM^{t}$ is greater than $R_{n}$ then
4:    Calculate the collecting point $P_{C}^{t}$ using Equation (4)
5: else
6:    Calculate the driving point $P_{D}^{t}$ using Equation (3)
7: end if
8: Output: $P_{D}^{t}$/$P_{C}^{t}$
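The switching rule described in the text (collect when the furthest sheep lies beyond the neighbourhood range, otherwise drive) can be illustrated with a minimal Python sketch. The function names `neighbourhood_range` and `select_behaviour` are hypothetical, positions are assumed to be 2-D tuples, and the GCM is assumed to be the arithmetic mean of the sheep positions; the sketch covers only the behaviour selection and Equation (5), not the computation of the driving or collecting points.

```python
import math


def neighbourhood_range(r_pipi, n_sheep):
    """Neighbourhood range R_n = R_pipi * sqrt(2N), as in Equation (5)."""
    return r_pipi * math.sqrt(2 * n_sheep)


def select_behaviour(sheep_positions, r_pipi):
    """Choose between collecting and driving for one time step.

    Returns (behaviour, index of the furthest sheep, GCM).
    Assumption: the GCM is the mean of all sheep positions.
    """
    n = len(sheep_positions)
    gcm = (sum(p[0] for p in sheep_positions) / n,
           sum(p[1] for p in sheep_positions) / n)
    distances = [math.dist(p, gcm) for p in sheep_positions]
    furthest = max(range(n), key=distances.__getitem__)
    # Collect if the furthest sheep is outside the neighbourhood range.
    if distances[furthest] > neighbourhood_range(r_pipi, n):
        return "collect", furthest, gcm
    return "drive", furthest, gcm


if __name__ == "__main__":
    compact = [(0, 0), (1, 0), (0, 1), (1, 1)]
    print(select_behaviour(compact, r_pipi=2.0)[0])
    print(select_behaviour(compact + [(100, 100)], r_pipi=2.0)[0])
```

With the compact flock the sketch selects driving; adding one distant sheep makes it select collecting, with that sheep reported as the furthest one.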

Then the position $P_{\beta_{j}}^{t+1}$ of sheepdog $\beta_{j}$ and the position $P_{\pi_{i}}^{t+1}$ of sheep $\pi_{i}$ are updated according to Equation (6) and Equation (7), respectively.

$$P_{\beta_{j}}^{t+1}=P_{\beta_{j}}^{t}+S_{\beta_{j}}^{t}F_{\beta_{j}}^{t} \tag{6}$$

$$P_{\pi_{i}}^{t+1}=P_{\pi_{i}}^{t}+S_{\pi_{i}}^{t}F_{\pi_{i}}^{t} \tag{7}$$

where $S_{\beta_{j}}^{t}$ and $S_{\pi_{i}}^{t}$ represent the moving speed of sheepdog $\beta_{j}$ and sheep $\pi_{i}$, respectively.

4 Planning-assisted swarm shepherding framework

As discussed in Section 2, existing shepherding models are inefficient when the sheep agents are too dispersed and the density of obstacles is high. To address this issue, this section proposes a planning-assisted swarm shepherding framework that improves shepherding efficacy by integrating a grouping/clustering approach, a TSP solver, and localised path planning and navigation.

4.1 Grouping of dispersed sheep in the environment

Given a highly dispersed sheep swarm $\Pi$ in an environment, the first step in the planning-assisted shepherding framework is to group the dispersed sheep into some sub-swarms and locate the Local Centre of Mass (LCM) of each sub-swarm. The set of sheep sub-swarms is denoted as

$$\Phi=\{\phi_{1},...,\phi_{q},...,\phi_{Q}\} \tag{8}$$

where $Q$ is the number of sub-swarms and the sub-swarms $\phi_{q}$ are subject to $\bigcup_{q=1}^{Q}\phi_{q}=\Pi$, $\bigcap_{q=1}^{Q}\phi_{q}=\emptyset$ and $\phi_{q}\neq\emptyset,\ \forall q\in\{1,...,Q\}$. A sheep is assigned to a sub-swarm $\phi_{q}$ if it is within the cohesion range $R_{\Lambda}$ from any sheep of this sub-swarm. The LCM of $\phi_{q}$ at time step $t$ is calculated as

$$LCM_{q}^{t}=\frac{1}{N_{s}^{q}}\sum_{l=1}^{N_{s}^{q}}P_{\pi_{l}^{q}}^{t} \tag{9}$$

where $N_{s}^{q}$ is the number of sheep in the sub-swarm and $P_{\pi_{l}^{q}}^{t}$ is the position of the $l$-th sheep grouped in $\phi_{q}$ at $t$. The LCM of each $\phi_{q}$ is regarded as a target location, which the sheepdog should visit.
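As a rough illustration of the grouping rule and Equation (9), the sketch below treats each sub-swarm as a connected component of the graph that links any two sheep closer than the cohesion range. This is one plausible reading of the assignment rule stated above, not necessarily the grouping procedure used in the paper; the function names `group_sheep` and `local_centre_of_mass` are hypothetical, and positions are assumed to be 2-D tuples.

```python
import math


def group_sheep(positions, r_cohesion):
    """Group sheep indices into sub-swarms.

    Assumption: a sheep joins a sub-swarm if it is within r_cohesion of
    any member, so sub-swarms are connected components.
    """
    unvisited = set(range(len(positions)))
    groups = []
    while unvisited:
        seed = min(unvisited)
        unvisited.remove(seed)
        group, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            near = [k for k in unvisited
                    if math.dist(positions[current], positions[k]) <= r_cohesion]
            for k in near:
                unvisited.remove(k)
            group.extend(near)
            frontier.extend(near)
        groups.append(sorted(group))
    return groups


def local_centre_of_mass(positions, group):
    """LCM of one sub-swarm: mean position of its members (Equation (9))."""
    xs = [positions[i][0] for i in group]
    ys = [positions[i][1] for i in group]
    return (sum(xs) / len(group), sum(ys) / len(group))


if __name__ == "__main__":
    sheep = [(0, 0), (1, 0), (2, 0), (10, 10), (11, 10)]
    for g in group_sheep(sheep, r_cohesion=1.5):
        print(g, local_centre_of_mass(sheep, g))
```

In this example the first three sheep form one chain-connected sub-swarm even though the end points are further apart than the cohesion range, which is the behaviour implied by "within the cohesion range from any sheep of this sub-swarm".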

4.2 Transforming the swarm shepherding problem to the TSP for task sequencing

After obtaining the LCMs $\{LCM_{1}^{t},...,LCM_{q}^{t},...,LCM_{Q}^{t}\}$ of the sub-swarms, the swarm shepherding problem can be transformed into a variant of TSP. This section discusses how to transform the single-sheepdog shepherding and bi-sheepdog shepherding problems to a single TSP and presents the mathematical formulation of the shepherding-transformed single TSP. Subsequently, the general TSP solver (presented in Section 5.1) can be employed to find the optimal push sequence of sheep sub-swarms to guide the behaviours of the sheepdog(s).

4.2.1 Transforming the single-sheepdog shepherding problem

To transform the single-sheepdog shepherding problem, we first describe how the shepherding mission is expected to be completed in our proposed model. For illustrative purposes, Fig. 1(a) presents a single-sheepdog swarm shepherding problem with sheep dispersed randomly in the obstacle-free environment and a sheepdog located in the top-right corner. The grouping result is indicated in Fig. 1(b) with 5 sub-swarms $\{\phi_{1},\phi_{2},\phi_{3},\phi_{4},\phi_{5}\}$ in different colours, and the LCMs $\{LCM_{1}^{t},LCM_{2}^{t},LCM_{3}^{t},LCM_{4}^{t},LCM_{5}^{t}\}$ are represented as black crosses. Assuming the optimal push sequence of sub-swarms is $[1,2,3,4,5]$, Fig. 1(b) illustrates how the sheepdog is going to drive the sequenced sheep sub-swarms to reach the goal area.

Similar to the description in Section 3, the driving point of each sub-swarm is located behind the sub-swarm in the direction of the next target location, maintaining the distance of $R_{n}+R_{s}$ from the LCM of the sub-swarm. The driving point $P_{D_{q}}$ for each sub-swarm $\phi_{q}$ is represented as a yellow square in Fig. 1(b). To control the sheep sub-swarms to move as indicated by the grey thick directed line segments in Fig. 1(b), the sheepdog should follow the route represented by the red directed line segments and curves. To assist the sheepdog in switching to drive another sub-swarm, switch points $P_{SW_{q}}^{t}$ are introduced in the proposed model and are represented as blue triangles in Fig. 1(b). Specifically, the sheepdog $\beta$ departs from its initial position $P_{\beta}^{0}$ for $P_{D_{1}}^{t}$, pushes $\phi_{1}$ to $LCM_{2}^{t}$ by travelling to $P_{SW_{2}}^{t}$, then switches to $P_{D_{2}}^{t}$ for pushing $\phi_{2}$, and repeats this process until all the sub-swarms reach $P_{G}$.

Figure 1: Illustration of a single-sheepdog swarm shepherding problem with multiple sub-swarms, panels (a) and (b)

To define a TSP, two vital issues need to be addressed: 1) identifying the list of cities and 2) evaluating the travelling cost between each pair of cities. In the single-sheepdog swarm shepherding problem, $P_{\beta}^{0}$, $P_{G}$ and the areas in which the sub-swarms $\phi_{q}$ are located (blue dashed circles in Fig. 1(b)) constitute the set of cities. For convenience, let the sheepdog’s initial position $P_{\beta}^{0}$ be $LCM_{0}^{t}$, $\phi_{0}=\emptyset$ and the final goal $P_{G}$ be $LCM_{Q+1}^{t}$. The route’s start and end cities are fixed to be $LCM_{0}^{t}$ and $LCM_{Q+1}^{t}$ to ensure that the sheepdog departs from $P_{\beta}^{0}$ and pushes all the sheep to $P_{G}$. The travelling cost between each pair of cities should be evaluated by the cost $C_{qq^{\prime}}$ for pushing $\phi_{q}$ from $LCM_{q}^{t}$ to $LCM_{q^{\prime}}^{t}$, $q,\ q^{\prime}\in\{0,1,2,...,Q+1\},\ q\neq q^{\prime}$. However, it is challenging to precisely evaluate $C_{qq^{\prime}}$ as shepherding is a complex, interactive, dynamic process involving some uncontrollable factors. In this study, we simplify the evaluation of $C_{qq^{\prime}}$ by calculating it as the distance between $LCM_{q}^{t}$ and $LCM_{q^{\prime}}^{t}$ for the obstacle-free environment, and as the cost of the generated path between $LCM_{q}^{t}$ and $LCM_{q^{\prime}}^{t}$ ($C_{qq^{\prime}}=C\_Pa(q,q^{\prime})$ as calculated in Equation (17)) for the obstacle-laden environment.

4.2.2 Transforming the bi-sheepdog shepherding problem

Similarly, the bi-sheepdog shepherding problem can be regarded as a multiple TSP (MTSP) where multiple sheepdogs depart from their corresponding initial locations $P_{\beta_{j}}^{0}$ (depots) to visit each LCM (city) exactly once for collecting the dispersed sub-swarms $\Phi=\{\phi_{1},...,\phi_{q},...,\phi_{Q}\}$ and finally drive them to reach the goal location $P_{G}$ (terminal). The sheepdogs are not required to return to their initial locations. In this section, we further convert the shepherding-transformed multiple TSP to a single TSP so that the general single TSP solver can be employed to address the problem.

To solve the bi-sheepdog shepherding problem as a single TSP (STSP), we regard the initial position of one sheepdog $P_{\beta_{1}}^{0}$ as $LCM_{0}^{t}$, the goal location $P_{G}$ as $LCM_{Q+1}^{t}$ and the other sheepdog’s initial position $P_{\beta_{2}}^{0}$ as $LCM_{Q+2}^{t}$. $\{LCM_{0}^{t},...,LCM_{q}^{t},...,LCM_{Q+2}^{t}\}$ is the set of the cities’ locations. The start and the end city of the route are fixed to be $LCM_{0}^{t}$ and $LCM_{Q+2}^{t}$. Then a solution of the STSP can be converted to a solution of the MTSP by splitting it into two lists at $LCM_{Q+1}^{t}$ and reversing the order of the latter list. In this way, the first city of each list is the initial position of a sheepdog ($P_{\beta_{1}}^{0}$ or $P_{\beta_{2}}^{0}$) and the end city is the goal ($P_{G}$). Other cities on the list are the sub-swarms to be driven by the corresponding sheepdog located at the start of the list. Fig. 2(a) shows a solution of the STSP, and Fig. 2(b) illustrates the transformed solution of the MTSP and the shepherding process guided by the MTSP solution.

Figure 2: Illustration of a multi-sheepdog swarm shepherding problem with scattered sub-swarms, panels (a) and (b)
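The conversion from an STSP solution to the two sheepdog routes can be sketched in a few lines of Python. The function name `split_stsp_route` is hypothetical, and cities are assumed to be identified by their LCM index, so that index $0$ is the first sheepdog, $Q+1$ is the goal and $Q+2$ is the second sheepdog.

```python
def split_stsp_route(route, goal_index):
    """Split a single-TSP route into two sheepdog routes ending at the goal.

    route: list of city indices starting at sheepdog 1 (index 0) and
           ending at sheepdog 2 (index Q+2), containing the goal (Q+1).
    """
    cut = route.index(goal_index)
    first = route[:cut + 1]        # sheepdog 1: start, sub-swarms, goal
    second = route[cut:][::-1]     # sheepdog 2: reversed tail, ends at goal
    return first, second


if __name__ == "__main__":
    # Q = 4 sub-swarms: 0 = sheepdog 1, 5 = goal, 6 = sheepdog 2.
    stsp_solution = [0, 2, 1, 5, 4, 3, 6]
    dog1, dog2 = split_stsp_route(stsp_solution, goal_index=5)
    print(dog1)  # [0, 2, 1, 5]
    print(dog2)  # [6, 3, 4, 5]
```

In the example, the first sheepdog pushes sub-swarms 2 and 1 towards the goal, while the second sheepdog handles sub-swarms 3 and 4; the goal appears at the end of both lists, as described above.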

4.2.3 Mathematical formulation of the TSP

Based on the abovementioned discussion, the solution of the shepherding-transformed TSP is formulated as follows:

$$\psi_{qq^{\prime}},\ q,\ q^{\prime}\in\{0,1,2,...,Q+M\},\ q\neq q^{\prime} \tag{10}$$

where $q,\ q^{\prime}$ are the indexes of LCMs. If the sheepdog pushes $\phi_{q}$ from $LCM_{q}^{t}$ to $LCM_{q^{\prime}}^{t}$, $\psi_{qq^{\prime}}=1$; otherwise, $\psi_{qq^{\prime}}=0$. $M$ is the number of sheepdogs and is limited to $\{1,2\}$ here.

The optimisation objective of the TSP is:

Minimize:

$$F=\sum_{q=0}^{Q+M}\sum_{q^{\prime}=0}^{Q+M}C_{qq^{\prime}}\cdot\psi_{qq^{\prime}} \tag{11}$$

Subject to:

$$\sum_{q=0,q\neq q^{\prime}}^{Q+M}\psi_{qq^{\prime}}=1,\ \forall q^{\prime}\in\{1,2,...,Q+M\} \tag{12}$$

$$\sum_{q^{\prime}=0,q^{\prime}\neq q}^{Q+M}\psi_{qq^{\prime}}=1,\ \forall q\in\{0,1,...,Q+M-1\} \tag{13}$$

$$\sum_{q=1}^{Q+M}\psi_{q0}=0 \tag{14}$$

$$\sum_{q^{\prime}=0}^{Q+M-1}\psi_{(Q+M)q^{\prime}}=0 \tag{15}$$

Here, $F$ is the total cost and $C_{qq^{\prime}}$ is the cost to push $\phi_{q}$ from $LCM_{q}^{t}$ to $LCM_{q^{\prime}}^{t}$. Constraints (12) and (13) ensure that each target location is visited exactly once. Constraints (14) and (15) ensure that the sheepdog $\beta_{j}$ departs from $P_{\beta_{j}}$ and finally reaches $P_{G}$.
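To make the objective concrete, the following sketch evaluates the total cost of an open route with a fixed start and a fixed end city and finds the cheapest one by exhaustive enumeration. This brute-force search is only a stand-in for illustration on very small instances; the paper uses MMAS as the solver. The function names are hypothetical, and the cost is assumed to be the Euclidean distance between LCMs, which corresponds to the obstacle-free case.

```python
import itertools
import math


def route_cost(route, cost):
    """Sum of pairwise costs along consecutive cities of the route."""
    return sum(cost[a][b] for a, b in zip(route, route[1:]))


def best_push_sequence(points):
    """Cheapest open route from points[0] to points[-1] via all others.

    points[0] is the fixed start city, points[-1] the fixed end city,
    and the interior entries are sub-swarm LCMs.
    """
    n = len(points)
    cost = [[math.dist(p, q) for q in points] for p in points]
    best_route, best_cost = None, math.inf
    for middle in itertools.permutations(range(1, n - 1)):
        route = [0, *middle, n - 1]
        c = route_cost(route, cost)
        if c < best_cost:
            best_route, best_cost = route, c
    return best_route, best_cost


if __name__ == "__main__":
    # Start at x = 0, end at x = 4, three LCMs in between (unsorted).
    cities = [(0, 0), (3, 0), (1, 0), (2, 0), (4, 0)]
    print(best_push_sequence(cities))
```

Fixing the first and last entries of every candidate route plays the role of Constraints (14) and (15), while enumerating permutations of the interior cities guarantees that each one is visited exactly once.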

4.3 Path planning for sheepdog(s) and sheep swarm

Given the sequenced sub-swarms $\{\phi_{1}^{\prime},...,\phi_{q}^{\prime},...,\phi_{Q}^{\prime}\}$ and the corresponding LCMs $\{LCM_{1}^{\prime},...,LCM_{q}^{\prime},...,LCM_{Q}^{\prime}\}$, the mission of the sheepdog can be regarded as a set of sequential sub-tasks, i.e., pushing $\phi_{q}^{\prime}$ from $LCM_{q}^{\prime}$ to $LCM_{q+1}^{\prime}$, $\forall q\in\{1,...,Q\}$. Path planning is crucial for both the sheepdog(s) and the sheep swarm to reduce detours and mission completion time. Next, we present the mathematical formulation of path planning and discuss how to integrate the path planning algorithm into shepherding based on a proposed classification of sheepdog moving modes.

4.3.1 Mathematical formulation of path planning

In this paper, the path is defined as a sequence of way-points that can be connected as a set of path segments. Denoting the start and goal points as $W_{0}$ and $W_{D+1}$, respectively, the solution of path planning between the two points can be represented as:

$$Path(W_{0},W_{D+1})=\{W_{0},W_{1},...,W_{d},...,W_{D+1}\} \tag{16}$$

In the obstacle-cluttered environment, collision avoidance is a hard constraint which means that a path is infeasible once it intersects, i.e. collides, with any obstacle in the environment.

Referring to [42] [43], the cost evaluation function of a feasible path, which is also the optimisation objective of path planning, is defined as:

$$C\_{Pa}(W_{0},W_{D+1})=\alpha_{1}\cdot C\_L(D)+\alpha_{2}\cdot C\_Th(D) \tag{17}$$

where $\alpha_{1}$ and $\alpha_{2}$ are the weights of the costs.

$C\_L$ is the path length cost and is calculated as:

$$C\_L(D)=\sum_{d=0}^{D}||W_{d}-W_{d+1}|| \tag{18}$$

$C\_Th$ is the threat cost evaluating the unwanted disturbance of sheepdogs on sheep and is calculated as:

$$C\_Th(D)=\sum_{d=0}^{D}Threat_{d} \tag{19}$$

$Threat_{d}=1$ if the path segment from $W_{d}$ to $W_{d+1}$ intersects the threat area, which is defined as a set of circles with the centre points $P^{t}_{\pi_{i}},\forall i\in\{1,...,N\}$ and the radius $R_{th}$, and $Threat_{d}=0$ otherwise. $R_{th}$ is the threat range, representing the distance that the sheepdog should keep from the sheep to avoid unwanted influence. A large $R_{th}$ will increase the path length cost for the sheepdog to reach the target point, while a small $R_{th}$ might disturb the sheep and cause unexpected movements.
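The cost in Equations (17) to (19) can be sketched as follows. The helper names `segment_point_distance` and `path_cost` are hypothetical, way-points and sheep positions are assumed to be 2-D tuples, and a segment is assumed to hit the threat area when its distance to any sheep is at most $R_{th}$ (whether touching the circle counts as an intersection is an assumption of this sketch).

```python
import math


def segment_point_distance(a, b, p):
    """Shortest distance from point p to the segment from a to b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.dist(a, p)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.dist((ax + t * dx, ay + t * dy), p)


def path_cost(waypoints, sheep_positions, r_th, alpha1=1.0, alpha2=1.0):
    """alpha1 * path length + alpha2 * number of threatened segments."""
    length_cost = 0.0
    threat_cost = 0
    for a, b in zip(waypoints, waypoints[1:]):
        length_cost += math.dist(a, b)
        if any(segment_point_distance(a, b, s) <= r_th
               for s in sheep_positions):
            threat_cost += 1
    return alpha1 * length_cost + alpha2 * threat_cost


if __name__ == "__main__":
    path = [(0, 0), (4, 0), (4, 3)]
    sheep = [(2, 0.5)]
    print(path_cost(path, sheep, r_th=1.0))               # length 7 + 1 threat
    print(path_cost(path, sheep, r_th=1.0, alpha2=0.0))   # length only
```

Setting `alpha2` to zero reduces the cost to the path length alone, so the same function can score paths whether or not disturbance of the sheep is penalised.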

4.3.2 Path planning for shepherding

To integrate path planning into context-sensitive shepherding, we design two modes of sheepdog operation, a no-interaction mode and an interaction mode, based on whether the influence of sheepdogs on the sheep swarm, $C\_Th$, should be minimised or not.

No-interaction mode: The no-interaction mode is usually triggered when the sheepdog departs from its initial location $P_{\beta_{j}}^{0}$ for the driving point of the sub-swarm $\phi_{1}^{\prime}$, or when the sheepdog has just finished a sub-task of pushing $\phi_{q-1}^{\prime}$ and switches to the next sub-task by moving towards the driving point of the next sub-swarm $\phi_{q}^{\prime}$. During the no-interaction mode, the influence of sheepdogs on the sheep swarm should be minimised to avoid unwanted movements of the sheep swarm. Sheep are treated as obstacles to be avoided so that the flocks are not disturbed while the sheepdog is positioning itself at a driving position. The path planning algorithm, A*-PP (presented in Section 5.2), is used to find the optimal path $Path(P_{\beta_{j}}^{t},P_{D_{q}}^{t})$, which is obstacle-free and has the lowest cost, for the sheepdog $\beta_{j}$ to follow from its current location $P_{\beta_{j}}^{t}$ to the driving point $P_{D_{q}}^{t}$ of the target sub-swarm. The path cost $C\_{Pa}(P_{\beta_{j}}^{t},P_{D_{q}}^{t})$ is evaluated according to Equation (17) by setting $\alpha_{1},\alpha_{2}$ as positive numbers.

Interaction mode: The interaction mode is activated when the sheepdog is driving the sheep sub-swarm $\phi_{q}^{\prime}$ from $LCM_{q}^{\prime t}$ towards a sub-goal $P_{SG}^{t}$. During this phase, the sheepdog continues to influence the sheep sub-swarm by switching between collecting and driving behaviours as it evaluates the furthest distance of the sheep to $LCM_{q}^{\prime t}$. The path $Path(P_{\beta_{j}}^{t},P_{D_{q}}^{t}/P_{C_{q}}^{t})$ from the sheepdog’s current position $P_{\beta_{j}}^{t}$ to the driving/collecting point $P_{D_{q}}^{t}$/$P_{C_{q}}^{t}$ of the target sub-swarm is also optimised using A*-PP. Different from the no-interaction mode, only the path length cost $C\_L$ is considered for the path cost evaluation in the interaction mode. Therefore, $\alpha_{2}$ in Equation (17) should be set to 0.

The driving point $P_{D_{q}}^{t}$ is calculated as follows:

$$P_{D_{q}}^{t}=LCM_{q}^{\prime t}+(R_{n}+R_{s})\frac{P_{SG}^{t}-LCM_{q}^{\prime t}}{\|P_{SG}^{t}-LCM_{q}^{\prime t}\|}$$ (20)

where $LCM_{q}^{\prime t}$ is the LCM of the sheep sub-swarm $\phi_{q}^{\prime}$ that the sheepdog will be driving or is currently driving; $P_{SG}^{t}$ is the sub-goal that the sheepdog is driving $\phi_{q}^{\prime}$ towards. $P_{SG}^{t}$ is set as the waypoint in the optimised path $Path(LCM_{q}^{\prime t},LCM_{q+1}^{\prime t})$ of $\phi_{q}^{\prime}$ from $LCM_{q}^{\prime t}$ to $LCM_{q+1}^{\prime t}$, which is obtained by A*-PP as well. In this way, the sheepdog will push $\phi_{q}^{\prime}$ to move towards the optimal path so as to reduce the detours of both sheepdogs and sheep sub-swarms.

4.4 Planning-assisted shepherding framework

Based on the discussion above, the overall planning-assisted shepherding framework is presented in Algorithm 2, where $Mode=0$ represents the no-interaction mode and $Mode=1$ represents the interaction mode. Before the real-time shepherding starts, the offline planner obtains the grouping and sequencing results (lines 3 and 4). The sheepdog then starts in the no-interaction mode to reach the driving point of $\phi_{1}^{\prime}$.

Algorithm 2 Planning-assisted shepherding model
1: Input: $\Pi=\{\pi_{1},...,\pi_{i},...,\pi_{N}\}$, $\beta$, $P_{G}$
2: Initialise: $q=1$, $t=0$, $Mode=0$, $T$
3: Get $\Phi=\{\phi_{1},...,\phi_{q},...,\phi_{Q}\}$ via Algorithm 3
4: Get sequenced sub-swarms $\Phi^{\prime}=\{\phi_{0}^{\prime},...,\phi_{q}^{\prime},...,\phi_{Q}^{\prime}\}$ and corresponding LCMs $\{LCM_{0}^{\prime},...,LCM_{q}^{\prime},...,LCM_{Q+1}^{\prime}\}$ using MMAS
5: while $t<T$ && sheep have not all reached $LCM_{Q+1}^{\prime}$ do
6:    $t=t+1$
7:    if $\phi_{q}^{\prime}$ encounters $\phi_{q+1}^{\prime}$ then
8:       $\phi_{q+1}^{\prime}=\phi_{q+1}^{\prime}\cup\phi_{q}^{\prime}$;
9:       $q=q+1$; $Mode=0$;
10:    end if
11:    Update LCMs;
12:    if $Mode=0$ || $mod(t,10)=1$ then
13:       Calculate the optimal path of $\phi_{q}^{\prime}$: $Path_{\phi^{\prime}_{q}}=\{LCM_{q}^{\prime},W_{1},W_{2},...,W_{D},LCM_{q+1}^{\prime}\}$
14:       Let $W_{1}$ be the sub-goal $P_{SG}^{t}$
15:    else
16:       Locate the nearest waypoint on $Path_{\phi^{\prime}_{q}}$ ahead of $\beta$ as the sub-goal $P_{SG}^{t}$
17:    end if
18:    if $Mode=0$ then
19:       Calculate $P_{D_{q}}^{t}$ based on Equation (20)
20:       Optimise $Path_{\beta_{j}}(P_{\beta_{j}}^{t},P_{D_{q}}^{t})$ as per no-interaction mode
21:    else
22:       Calculate $P_{D_{q}}^{t}$ or $P_{C_{q}}^{t}$ based on Algorithm 1: $P_{D_{q}}^{t}/P_{C_{q}}^{t}$ = Herding($P_{SG}^{t}$, $LCM_{q}^{\prime t}$, $\phi_{q}^{\prime}$)
23:       Optimise $Path_{\beta_{j}}(P_{\beta_{j}}^{t},P_{D_{q}}^{t}/P_{C_{q}}^{t})$ as per interaction mode
24:    end if
25:    Sheepdog moves following $Path_{\beta_{j}}$ with the limitation of maximum speed
26:    Update the sheep position according to Equation (7)
27:    if $Mode=0$ & $\beta$ reaches $P_{D_{q}}^{t}$ then
28:       $Mode=1$
29:    end if
30: end while
31: Output: $t$

During the shepherding process, the sheepdog switches between the no-interaction and the interaction modes based on context in order to complete the set of sequenced sub-tasks, i.e., pushing the sheep sub-swarms $\phi_{q}^{\prime}$ to $LCM_{q+1}^{\prime}$, $\forall q\in\{1,...,Q\}$. An adaptive switching approach is designed based on the real-time evaluation of the shepherding progress and is integrated into the shepherding framework. As presented in lines 7-10 of Algorithm 2, the switch from the interaction mode to the no-interaction mode happens when $\phi_{q}^{\prime}$ encounters $\phi_{q+1}^{\prime}$, indicating that the sheepdog has just successfully pushed $\phi_{q}^{\prime}$ to $LCM_{q+1}^{\prime}$ to merge with $\phi_{q+1}^{\prime}$ and is preparing for the next sub-task by moving to the driving point of $\phi_{q+1}^{\prime}$. Once the driving point of $\phi_{q+1}^{\prime}$ is reached in the no-interaction mode, the sheepdog switches to the interaction mode (lines 27-29), which means that the sheepdog starts pushing the sub-swarm $\phi_{q+1}^{\prime}$. This process continues until all the sheep reach the goal area $LCM_{Q+1}^{\prime}$ or the pre-defined maximum number of time steps $T$ is reached. Figure 3 shows the corresponding flowchart.
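The switching rule described above can be sketched as a small state update. This is only an illustrative sketch: the function name `step_mode` and the two boolean flags are hypothetical, and detecting a merge or the arrival at the driving point is left to the caller.

```python
NO_INTERACTION, INTERACTION = 0, 1

def step_mode(mode, q, merged, reached_driving_point):
    """One time step of the mode-switching rule (illustrative sketch).

    merged: True if sub-swarm q has just encountered sub-swarm q+1.
    reached_driving_point: True if the sheepdog is at the driving point
    of the current target sub-swarm (evaluated by the caller after any
    merge has been taken into account).
    """
    if merged:
        # Sub-task finished: target the merged sub-swarm and reposition
        # without disturbing the sheep.
        q, mode = q + 1, NO_INTERACTION
    if mode == NO_INTERACTION and reached_driving_point:
        # Positioned behind the target sub-swarm: start pushing.
        mode = INTERACTION
    return mode, q

if __name__ == "__main__":
    mode, q = NO_INTERACTION, 1
    mode, q = step_mode(mode, q, merged=False, reached_driving_point=True)
    print(mode, q)
```

The order of the two checks mirrors Algorithm 2, where the merge test is made at the start of a time step and the driving-point test at its end.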

Figure 3: The flowchart of the planning-assisted swarm shepherding model

5 Hierarchical mission planning for shepherding

To assist the proposed swarm shepherding framework, a hierarchical mission planning system is proposed in this section by combining the approach for grouping, MMAS for TSP, and A*-PP for online path planning.

5.1 Grouping and MMAS: offline task planner

Given the sheep swarm $\Pi$, a cohesion range-based grouping method, presented in Algorithm 3, is used to obtain the set of sheep sub-swarms and calculate the LCMs. Then MMAS [21], a well-known ACO algorithm, is used to solve the shepherding-transformed TSP and obtain the optimal push sequence of sub-swarms, owing to its outstanding performance on TSP. ACO is inspired by the foraging behaviour of real ant colonies. During the foraging process, ants deposit pheromone trails on their return routes if they find food sources. These trails enable other ants to find optimal paths to the food sources.

Algorithm 3 Grouping of dispersed sheep
1: Input: a sheep swarm $\Pi=\{\pi_{1},...,\pi_{i},...,\pi_{N}\}$
2: Initialise: $Q=0$, $\phi_{q}=Null$
3: Calculate the distance $D_{\pi_{1}\pi}$ between $\pi_{1}$ and all other sheep, and sort $D_{\pi_{1}\pi}$ in ascending order to get the sorted index $Idx$ of sheep
4: for $i=1:N$ do
5:    $k=Idx(i)$
6:    Find the neighbourhood sheep $\Lambda_{\pi_{k}}$ of $\pi_{k}$ within $R_{\Lambda}$
7:    if $\Lambda_{\pi_{k}}\neq\emptyset$ & any of $\Lambda_{\pi_{k}}$ or $\pi_{k}$ belongs to an existing sub-swarm $\phi_{q}$ then
8:       $\phi_{q}=\phi_{q}\cup\pi_{k}\cup\Lambda_{\pi_{k}}^{t}$
9:    else
10:       $Q=Q+1$; $\phi_{Q}=\pi_{k}\cup\Lambda_{\pi_{k}}$
11:    end if
12: end for
13: Calculate the LCM of each $\phi_{q}$ as in Equation (9)
14: Output: the set of sheep sub-swarms $\Phi=\{\phi_{1},...,\phi_{q},...,\phi_{Q}\}$ and the corresponding LCMs $\{LCM_{1}^{t},...,LCM_{q}^{t},...,LCM_{Q}^{t}\}$
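A minimal Python sketch of the grouping idea in Algorithm 3 is given below. It rests on several assumptions that are not taken from the paper: the name `group_sheep` is hypothetical, the LCM is computed as the centroid of each sub-swarm (Equation (9) is not reproduced in this section), and a sheep that is already assigned to a sub-swarm is never reassigned, so two existing sub-swarms are not merged.

```python
import math

def group_sheep(positions, r_lambda):
    """Group 2-D sheep positions by cohesion range (illustrative sketch)."""
    n = len(positions)
    # Visit sheep in ascending order of distance to the first sheep.
    order = sorted(range(n), key=lambda i: math.dist(positions[0], positions[i]))
    label = [None] * n          # sub-swarm index of each sheep
    groups = []                 # list of lists of sheep indices
    for k in order:
        neigh = [j for j in range(n)
                 if j != k and math.dist(positions[k], positions[j]) <= r_lambda]
        members = [k] + neigh
        existing = next((label[m] for m in members if label[m] is not None), None)
        if neigh and existing is not None:
            g = existing                    # join an existing sub-swarm
        else:
            g = len(groups)                 # open a new sub-swarm
            groups.append([])
        for m in members:
            if label[m] is None:
                label[m] = g
                groups[g].append(m)
    # Assumption: the LCM is the centroid of the sub-swarm.
    lcms = [(sum(positions[m][0] for m in grp) / len(grp),
             sum(positions[m][1] for m in grp) / len(grp)) for grp in groups]
    return [sorted(grp) for grp in groups], lcms

if __name__ == "__main__":
    sheep = [(0, 0), (1, 0), (2, 0), (50, 50), (51, 50), (90, 10)]
    print(group_sheep(sheep, r_lambda=4))
```

In this example the six sheep form three sub-swarms: a cluster of three, a cluster of two, and one isolated sheep.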

The core components of MMAS are solution construction and pheromone update. To find the optimal visiting sequence of sheep sub-swarms, a solution in MMAS is constructed by selecting sub-goals (each regarded as a solution component $c_{q}^{q^{\prime}}$) one by one, based on the pheromone values $\tau_{qq^{\prime}}$ and the heuristic values $\eta(c_{q}^{q^{\prime}})$ of the edges, until a complete solution, which indicates the visiting sequence, is formed. Pheromone values $\tau_{qq^{\prime}}$ are updated every generation based on the quality of the constructed solutions and the evaporation of existing pheromones. The values of $\eta(c_{q}^{q^{\prime}})$ are defined as $\eta(c_{q}^{q^{\prime}})=1/C_{qq^{\prime}}$, where $C_{qq^{\prime}}$ is the travelling cost from $LCM_{q}^{t}$ to $LCM_{q^{\prime}}^{t}$. In MMAS, only the best ant is used to update the pheromone, and the pheromone values are bounded within the predefined range $[\tau_{min},\tau_{max}]$. The implementation of MMAS for finding the optimal push sequence of sub-swarms is described as follows:

Step 1: Initialise the parameters of MMAS, including the ant colony size $N_{s}$, two parameters $\alpha$ and $\gamma$, evaporation rate $\rho$, pheromone values of each edge $\tau_{qq^{\prime}}=\tau_{max}$, the values of each edge $\eta(c_{q}^{q^{\prime}})=1/C_{qq^{\prime}}$, the best solution $s^{*}=$ NULL;

Step 2: Construct new ant solutions:

Step 2.1: Start with a partial solution $s_{p}=LCM_{0}^{t}$;

Step 2.2: For each $LCM_{q^{\prime}}^{t},q^{\prime}\in\{1,2,...,Q\}$, calculate the probability $p(c_{q}^{q^{\prime}}|s_{p})$ of moving from the current location $LCM_{q}^{t}$ to $LCM_{q^{\prime}}^{t}$ based on Equation (21):

$$p(c_{q}^{q^{\prime}}|s_{p})=\frac{\tau^{\alpha}_{qq^{\prime}}\cdot[\eta(c_{q}^{q^{\prime}})]^{\gamma}}{\sum_{c_{q}^{l}\in N(s_{p})}\tau^{\alpha}_{ql}\cdot[\eta(c_{q}^{l})]^{\gamma}},\ \forall c_{q}^{q^{\prime}}\in N(s_{p})$$ (21)

where $N(s_{p})$ is the set of available solution components for the current partial solution $s_{p}$;

Step 2.3: Select the next solution component based on the probability $p(c_{q}^{q^{\prime}}|s_{p})$ and add the selected $LCM_{q^{\prime}}^{t}$ to the current partial solution $s_{p}$;

Step 2.4: If all $LCM_{q^{\prime}}^{t},q^{\prime}\in\{1,2,...,Q\}$ are visited, add $LCM_{Q+1}^{t}$ to $s_{p}$ and go to Step 2.5; otherwise, go to Step 2.2;

Step 2.5: If $N_{s}$ new solutions are constructed, go to Step 3; otherwise, go to Step 2.1;

Step 3: Calculate the cost $F(s)$ and record the best solution $s^{*}$ found with the lowest cost;

Step 4: Update the pheromone values according to:

$$\tau_{qq^{\prime}}=(1-\rho)\tau_{qq^{\prime}}+\Delta\tau_{qq^{\prime}}^{best}$$ (22)
$$\tau_{qq^{\prime}}=\min(\max(\tau_{qq^{\prime}},\tau_{min}),\tau_{max})$$ (23)

where $\Delta\tau_{qq^{\prime}}^{best}=1/F(s^{*})$;

Step 5: If the termination condition is met, output the best solution which represents the optimal travelling sequence; otherwise, go to Step 2.
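Steps 1-5 can be sketched in Python as follows. This is a simplified illustration rather than the implementation used in the experiments: the function name `mmas_sequence`, the cost-matrix input, the default parameter values and the use of the best-so-far ant for the deposit are all assumptions. Node 0 plays the role of the start $LCM_{0}$ and the last node plays the role of the goal $LCM_{Q+1}$; off-diagonal costs are assumed to be strictly positive.

```python
import random

def mmas_sequence(cost, n_ants=None, alpha=1.0, gamma=2.0, rho=0.1,
                  tau_min=0.01, tau_max=1.0, iters=100, seed=0):
    """Return (tour, tour_cost) visiting all inner nodes between node 0 and node n-1."""
    rng = random.Random(seed)
    n = len(cost)
    inner = list(range(1, n - 1))
    n_ants = n_ants or max(1, len(inner))
    tau = [[tau_max] * n for _ in range(n)]                    # Step 1
    eta = [[0.0 if i == j else 1.0 / cost[i][j] for j in range(n)]
           for i in range(n)]
    best, best_cost = None, float("inf")
    for _ in range(iters):
        for _ in range(n_ants):                                # Step 2
            tour, remaining = [0], set(inner)
            while remaining:
                cur, cand = tour[-1], sorted(remaining)
                # Unnormalised selection weights of Equation (21).
                w = [tau[cur][j] ** alpha * eta[cur][j] ** gamma for j in cand]
                nxt = rng.choices(cand, weights=w)[0]
                tour.append(nxt)
                remaining.remove(nxt)
            tour.append(n - 1)
            c = sum(cost[a][b] for a, b in zip(tour, tour[1:]))
            if c < best_cost:                                  # Step 3
                best, best_cost = tour, c
        best_edges = set(zip(best, best[1:]))
        for i in range(n):                                     # Step 4
            for j in range(n):
                deposit = 1.0 / best_cost if (i, j) in best_edges else 0.0
                tau[i][j] = (1 - rho) * tau[i][j] + deposit    # Equation (22)
                tau[i][j] = min(max(tau[i][j], tau_min), tau_max)  # Equation (23)
    return best, best_cost                                     # Step 5

if __name__ == "__main__":
    pts = [0, 1, 2, 3, 4, 5]
    cost = [[abs(a - b) for b in pts] for a in pts]
    print(mmas_sequence(cost, iters=50))
```

A fixed iteration count stands in for the termination condition, and `random.Random(seed)` keeps runs repeatable.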

5.2 A*-PP: online path planner

As discussed in Sections 4.3 and 4.4, path planning is crucial for reducing detours of both sheepdogs and sheep swarm, and is invoked during the online shepherding process. This section presents a two-layer path planning algorithm A*-PP, where the first layer, A*, finds the path with optimal cost, and the second layer, post-processing, eliminates the redundant waypoints in the path.

A* [44] is a well-known node-based path search algorithm that searches in a landscape represented by graphs. A* starts from the specified start node of the search graph and expands the nodes on candidate paths by adding one node at a time until it reaches the goal node. To decide which node on the candidate paths to expand next, A* employs an evaluation function $f(n)$, calculated as in Equation (24), to estimate the cost of the path going through node $n$.

$$f(n)=g(n)+h(n)$$ (24)

where $g(n)$ is the cost of the optimal path from the start node to the current node $n$ and $h(n)$ is the heuristic function for estimating the cost from the current node $n$ to the goal node. In this paper, $g(n)$ is calculated as follows:

$$g(n)=C_{L}(n)+C_{Th}(n),$$ (25)

where $C_{L}(n)$ and $C_{Th}(n)$ are the length cost and the threat cost, respectively. $h(n)$ is calculated as the straight-line distance from the current node $n$ to the goal node, which is admissible and thus guarantees that A* returns the optimal path.

The pseudo-code of A* is given in Algorithm 4. $Open$ is the set of nodes that can be considered for expansion. $Closed$ is the set of nodes that have been expanded, which ensures that each node is expanded at most once. $c(n,m)$ is the cost from node $n$ to node $m$. $Parent(m)$ records the predecessor of $m$ on the lowest-cost path found so far. A* starts the search from the initial point $W_{init}$. At each iteration of the main loop, A* selects the node $n$ with the lowest $f(n)$ from $Open$ and moves it from $Open$ to $Closed$. Then A* checks the neighbours of $n$ and inserts each feasible neighbour node $m$ into $Open$ if $m$ is not in $Open$, or updates $f(m)$ if $m$ is already in $Open$ and $g(m)+h(m)$ is better than the old $f(m)$. The loop continues until the node with the lowest $f(n)$ is the goal point $W_{goal}$, or $Open$ is empty, meaning that no feasible path exists.

Algorithm 4 The pseudo code of A*
1: $g(W_{init})=0$; $Open=\emptyset$; $Closed=\emptyset$; $Parent(W_{init})=W_{init}$
2: Insert $W_{init}$ into $Open$ with $g(W_{init})+h(W_{init})$;
3: while $Open\neq\emptyset$ do
4:    Select the node $n$ in $Open$ with the lowest value of $f(n)$;
5:    if $n=W_{goal}$ then
6:       Extract $Path$ from $Parent$
7:       Return $Path$;
8:    end if
9:    Remove the node $n$ from $Open$;
10:    Add the node $n$ to $Closed$;
11:    for each neighbour $m$ of $n$ do
12:       if $m\notin Closed$ then
13:          if $m\notin Open$ then
14:             $g(m)=\infty$;
15:             $Parent(m)=NULL$;
16:          end if
17:          if $g(n)+c(n,m)<g(m)$ then
18:             $g(m)=g(n)+c(n,m)$;
19:             $Parent(m)=n$;
20:             Insert $m$ into $Open$ or update $f(m)$ in $Open$ with $g(m)+h(m)$;
21:          end if
22:       end if
23:    end for
24: end while
25: Output: $Path$
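The following Python sketch shows one way Algorithm 4 could be realised on an 8-connected occupancy grid. It is an illustration under stated assumptions, not the A*-PP implementation: the function name `astar`, the grid encoding (1 marks an obstacle) and the optional `threat` callback standing in for the threat cost are hypothetical. Instead of updating $f(m)$ inside $Open$, the sketch pushes a new heap entry and skips stale ones, a common equivalent technique. Diagonal moves past obstacle corners are not forbidden here.

```python
import heapq
import math

def astar(grid, start, goal, threat=None):
    """Lowest-cost path on an 8-connected grid, or None if no path exists."""
    rows, cols = len(grid), len(grid[0])
    g = {start: 0.0}
    parent = {start: start}
    open_heap = [(math.dist(start, goal), start)]   # entries are (f, node)
    closed = set()
    while open_heap:
        _, n = heapq.heappop(open_heap)
        if n in closed:
            continue                                # stale heap entry
        if n == goal:
            path = [n]
            while parent[n] != n:                   # walk back to the start
                n = parent[n]
                path.append(n)
            return path[::-1]
        closed.add(n)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                m = (n[0] + dr, n[1] + dc)
                if m == n or not (0 <= m[0] < rows and 0 <= m[1] < cols):
                    continue
                if grid[m[0]][m[1]] or m in closed:
                    continue
                # Step cost: length cost plus optional non-negative threat cost.
                step = math.hypot(dr, dc) + (threat(m) if threat else 0.0)
                if g[n] + step < g.get(m, math.inf):
                    g[m] = g[n] + step
                    parent[m] = n
                    heapq.heappush(open_heap, (g[m] + math.dist(m, goal), m))
    return None

if __name__ == "__main__":
    wall = [[0, 1, 0],
            [0, 1, 0],
            [0, 0, 0]]
    print(astar(wall, (0, 0), (0, 2)))
```

Passing a `threat` function that returns a positive penalty near sheep would correspond to the no-interaction mode, while omitting it corresponds to a pure path-length cost.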

However, the original path obtained by A* usually contains many waypoints and the sub-path between two waypoints might be taking an unnecessary detour in some cases where a straight line can connect these two waypoints with no obstacle collision. Therefore, a path post-processing method, line of sight path pruning [45], is introduced to remove some redundant waypoints on the path to further reduce the path cost. The pseudo-code of the path post-processing is presented in Algorithm 5. The core of this process is to replace the original sub-path between two waypoints with a straight line if the straight line does not collide with any obstacles, meaning that the waypoints on the original sub-path, except the start point and end point, will be removed.

Algorithm 5 The pseudo code of path post-processing
1: Input: $Path$, $N_{nodes}=size(Path,1)$
2: while $i<=N_{nodes}$ do
3:    for $j=2:N_{nodes}-1$ do
4:       Check the collision between the $i^{th}$ node and the $(i+j)^{th}$ node
5:       if No collision exists then
6:          if $j>=N_{nodes}-1$ then
7:             Add the last node on $Path$ to $Processed\_path$ and Break
8:          else
9:             Continue
10:          end if
11:       end if
12:       Add the $(i+j-1)^{th}$ node to $Processed\_path$ and Break
13:    end for
14:    $i=i+j-1$
15: end while
16: Add the last node on $Path$ to $Processed\_path$ if it is not already included
17: Output: $Processed\_path$
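The pruning idea can be sketched in Python as below. This is a simplified variant written for illustration, not a line-by-line transcription of Algorithm 5: the names `prune_path` and `make_grid_los` are hypothetical, and the line-of-sight test samples points along the segment on an occupancy grid (1 marks an obstacle), which is only an approximation of an exact collision check.

```python
import math

def make_grid_los(grid, samples_per_cell=4):
    """Build an approximate line-of-sight test for an occupancy grid."""
    def line_is_free(a, b):
        steps = max(1, int(math.dist(a, b) * samples_per_cell))
        for s in range(steps + 1):
            t = s / steps
            r = round(a[0] + t * (b[0] - a[0]))
            c = round(a[1] + t * (b[1] - a[1]))
            if grid[r][c]:
                return False
        return True
    return line_is_free

def prune_path(path, line_is_free):
    """Drop waypoints that can be bypassed by a collision-free straight line."""
    if len(path) <= 2:
        return list(path)
    pruned = [path[0]]
    i = 0
    while i < len(path) - 1:
        j = i + 1
        # Extend the straight segment from path[i] as far as it stays free.
        while j + 1 < len(path) and line_is_free(path[i], path[j + 1]):
            j += 1
        pruned.append(path[j])
        i = j
    return pruned

if __name__ == "__main__":
    grid = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]
    raw = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]
    print(prune_path(raw, make_grid_los(grid)))
```

As in Algorithm 5, the scan from a waypoint stops at the first blocked segment, so the result is not guaranteed to be the shortest possible pruned path.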

6 Numerical Experiments

6.1 Experimental setting

To evaluate the planning-assisted shepherding model and the hierarchical mission planning algorithm, experiments are conducted on a set of synthetic shepherding problems with different levels of complexity. Table 1 presents the details of the 20 benchmark problems, showing the environment size (mostly $100\times 100$), the number of sheep $N$ (20, 50 or 100) and whether each case contains obstacles. The benchmark set consists of three groups: the obstacle-free group, the obstacle-contained group with small swarms and the obstacle-contained group with large swarms. The cases in each group have an increasing level of complexity. Fig. 4 shows the visualised initialisation of each case, with red dots representing the sheep, red asterisks representing sheepdogs, black areas denoting obstacles and a blue circle representing the goal area. Cases 1-6 are obstacle-free environments whose complexity increases by varying the environment size, $N$, the goal location and the swarm initialisation. Cases 7-20 are obstacle-contained environments where the density of obstacles further impacts the problem's complexity. The initialisation in Cases 11 and 18 is based on randomly distributed sheep individuals, while the initialisation in the other cases is based on randomly distributed sub-swarms.

Figure 4: The visualised initialisation of Case 1-Case 20 (panels (a)-(t))

For each problem instance, the experiments are conducted 20 times to capture the statistical behaviour. The maximum number of time steps for each run is set to $T=300+20N$. Three metrics are recorded to evaluate shepherding performance: 1) SR: the success rate, i.e., the proportion of the 20 runs in which the shepherding mission was completed; 2) No. of steps: the number of time steps consumed to complete the shepherding mission; and 3) path length: the total moving distance of the sheepdog. For the no. of steps and the path length, the mean and standard deviation are computed over the successful runs only.
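The time limit and the three metrics can be computed as in the short sketch below. The function names and the run-record layout are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator is used.

```python
from statistics import mean, pstdev

def max_steps(n_sheep):
    """Time limit T = 300 + 20 N for a swarm of N sheep."""
    return 300 + 20 * n_sheep

def summarise(runs):
    """Summarise runs given as (success, steps, path_length) tuples.

    Returns SR plus (mean, std) of steps and path length over the
    successful runs only; the statistics are None if no run succeeded.
    """
    ok = [r for r in runs if r[0]]
    sr = len(ok) / len(runs)
    if not ok:
        return sr, None, None
    steps = [r[1] for r in ok]
    lengths = [r[2] for r in ok]
    # Assumption: population standard deviation (0 for a single run).
    return sr, (mean(steps), pstdev(steps)), (mean(lengths), pstdev(lengths))

if __name__ == "__main__":
    runs = [(True, 100, 200.0), (True, 120, 220.0),
            (False, 700, 0.0), (False, 700, 0.0)]
    print(max_steps(100), summarise(runs))
```

For example, a swarm of 100 sheep gives a limit of 2300 time steps.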

Table 1: Basic features of the benchmark problems
Group 1 (obstacle-free cases): Cases 1-6
Group 2 (obstacle-contained cases with small sheep swarms): Cases 7-13
Group 3 (obstacle-contained cases with large sheep swarms): Cases 14-20
Case 1 2 3 4 5 6 7 8 9 10
Environment size 50*50 100*100 100*100 100*100 100*100 100*100 50*50 100*100 100*100 100*100
Number of sheep $N$ 20 20 50 50 100 100 20 20 20 50
Obstacles N N N N N N Y Y Y Y
Case 11 12 13 14 15 16 17 18 19 20
Environment size 100*100 100*100 100*100 100*100 100*100 100*100 100*100 100*100 100*100 100*100
Number of sheep $N$ 50 50 50 100 100 100 100 100 100 100
Obstacles Y Y Y Y Y Y Y Y Y Y
Table 2: Parameters setting of the shepherding model
Parameter $W_{\pi_{v}}$ $W_{\pi\Lambda}$ $W_{\pi\beta}$ $W_{\pi\pi}$ $W_{\pi o}$ $W_{e\pi_{i}}$ $W_{e\beta_{j}}$ $R_{\Lambda}$ $R_{\pi\beta}$ $R_{\pi\pi}$ $R_{\pi o}$ $R_{s}$
Value 0.5 1.05 1 2 3 0.3 0.3 4 8 0.4 2 4

6.2 Parameter values and the effects of adaptive switching

The parameters in MMAS are set according to [21]. Specifically, the maximum number of iterations is set to 600; the population size is set to the problem dimension; $\alpha=1$, $\gamma=2$, $\rho=0.98$, $\tau_{min}=1/D$, $\tau_{max}=1$. The number of neighbours in the A* search algorithm is set to 8, and the scaling factor is 1. Table 2 presents the setting of most parameters involved in the shepherding model, which follows [7]. The newly introduced parameter $R_{th}$ and the cost weights $\alpha_{1},\ \alpha_{2}$ are analysed in the following.

$R_{th}$ determines the size of the threat area that the sheepdog should try to avoid in the no-interaction mode. It should be no less than $R_{s}=4$, to keep a safe operating distance from the sheep, and no more than the sheepdog's influence range $R_{\pi\beta}=8$. Therefore, we tested the effects of $R_{th}$ on the shepherding performance of 6 representative cases by setting $R_{th}$ to 4, 5, 6, 7 and 8. Table 3 presents the experimental results. Unless otherwise stated, the best results are shown in bold in all the following tables, and Wilcoxon rank-sum tests with a significance level of 0.05 are conducted between the best results and the other results for each case to test whether their performances are statistically different. ‘*’ indicates a significant difference.

Table 3: The effects of $R_{th}$ on shepherding performance
Case SR No. of steps Path length
$R_{th}=4$
Case1 1.00 131.00±12.14 214.70±20.69
Case3 1.00 396.05±18.85 510.17±38.77
Case7 1.00 232.50±47.04 409.36±94.24
Case11 0.95 996.21±89.63 1401.94±166.63
Case16 1.00 1191.75±137.19 955.65±143.42
Case18 0.60 2071.17±156.11 1858.70±235.51
$R_{th}=5$
Case1 1.00 133.85±13.57 220.77±24.12
Case3 1.00 432.25±33.11* 567.76±53.05*
Case7 0.95 249.05±70.08 439.90±130.25
Case11 0.80 991.00±113.56 1391.99±238.58
Case16 1.00 1304.50±136.34* 1152.18±257.64*
Case18 0.25 2088.00±108.46 1892.80±293.40
$R_{th}=6$
Case1 1.00 146.05±15.13* 239.70±29.18*
Case3 1.00 433.80±25.89* 577.58±46.55*
Case7 1.00 258.85±110.51 451.99±212.87
Case11 0.90 1091.89±78.26 1560.16±194.04
Case16 1.00 1345.30±155.60* 1156.08±250.54*
Case18 0.35 2038.29±172.45 1875.59±215.09
$R_{th}=7$
Case1 1.00 156.70±14.75* 262.80±30.15*
Case3 1.00 449.55±25.79* 595.32±49.50*
Case7 1.00 254.65±74.52 442.00±140.60
Case11 0.85 1023.06±132.27 1442.46±245.77
Case16 0.95 1337.58±167.01* 1127.46±242.16*
Case18 0.25 2052.00±129.39 1806.39±71.94
$R_{th}=8$
Case1 1.00 156.50±16.94* 257.28±28.99*
Case3 1.00 453.10±45.54* 605.75±110.63*
Case7 1.00 274.60±123.49 485.03±232.67
Case11 0.95 1109.37±104.44 1605.64±173.35
Case16 1.00 1326.00±131.83* 1146.23±247.51*
Case18 0.40 1999.12±225.17 1717.57±234.66
* represents the statistical significance

We can observe from Table 3 that $R_{th}=4$ performs better than the other values, achieving the highest SR in all 6 cases and the minimum number of time steps and path length in 4 cases. It should also be noted that the differences between the results obtained with different $R_{th}$ values are not always significant. This is due to the high randomness of sheep behaviours, which are affected by many factors such as the obstacles and the neighbourhood. However, $R_{th}=4$ is the only setting that is never significantly worse than the other values in these cases. This is probably because, although $R_{th}=4$ cannot minimise the influence of sheepdogs on some sheep on the edge of the swarm, it avoids most of the harmful disturbance behaviours, e.g., crossing the swarm. In contrast, a large $R_{th}$ might cause unnecessary detours for sheepdogs. Therefore, $R_{th}$ is set to 4 in the following experiments.

The parameters $\alpha_{1}$ and $\alpha_{2}$ determine the weights of the path length cost and the threat cost. We examined the effects of the ratio of $\alpha_{2}$ to $\alpha_{1}$ on the shepherding performance of some representative cases by fixing $\alpha_{1}$ to 1 and setting $\alpha_{2}$ to 0, 20, 40, 60, 80 and 100. In particular, $\alpha_{2}=0$ means that the no-interaction mode turns into the interaction mode, so that the adaptive switching between these two modes is disabled. Table 4 shows that, similar to the effects of $R_{th}$, the influence of different $\alpha_{2}$ values is not significant in some cases. In particular, the shepherding performance is not very sensitive to the change of $\alpha_{2}$ as long as it is non-zero. This is because changing a non-zero $\alpha_{2}$ only slightly affects the path planning results in the no-interaction mode, which does not cause a significant difference in the shepherding. However, when $\alpha_{2}=0$, which disables the adaptive switching, the shepherding performance is affected in more cases. Therefore, $\alpha_{1}=1,\ \alpha_{2}=100$ are used in the following experiments.

Table 4: The effects of $\alpha_{2}$ on shepherding performance ($\alpha_{1}=1$)
$\alpha_{2}=0$ $\alpha_{2}=20$ $\alpha_{2}=40$
Case SR No. of steps Path length SR No. of steps Path length SR No. of steps Path length
C1 1.00 140.95±14.53* 232.33±23.90* 1.00 131.45±14.80 218.62±26.35 1.00 128.85±13.18 213.01±23.70
C3 1.00 425.35±36.43* 565.53±72.38* 1.00 410.20±25.86 530.70±41.55 1.00 405.35±25.48 521.85±38.94
C7 1.00 234.95±73.25 413.66±135.80 1.00 208.25±53.61 364.71±104.67 1.00 225.50±50.62 399.21±95.84
C11 1.00 976.25±130.14 1396.81±247.64 1.00 978.00±120.45 1380.84±228.90 0.95 978.00±130.36 1376.36±254.97
C16 1.00 1249.50±124.46 1002.57±170.85 1.00 1202.60±184.64 1009.26±233.03 1.00 1282.00±136.54 1091.85±245.40
C18 0.20 2171.25±138.81* 1768.19±398.51 0.20 2095.25±189.02* 1594.14±271.94 0.50 2198.40±33.07* 1606.22±145.12
$\alpha_{2}=60$ $\alpha_{2}=80$ $\alpha_{2}=100$
Case SR No. of steps Path length SR No. of steps Path length SR No. of steps Path length
C1 1.00 134.25±11.57 214.39±18.91 1.00 133.90±11.24 221.59±20.24 1.00 131.00±12.14 214.70±20.69
C3 1.00 407.45±15.67* 523.76±31.46 1.00 406.85±28.58 521.03±44.73 1.00 396.05±18.85 510.17±38.77
C7 1.00 239.90±74.86 421.64±144.35 1.00 238.35±59.09 424.55±111.83 1.00 232.50±47.04 409.36±94.24
C11 1.00 1004.20±140.56 1420.79±232.61 0.90 950.06±139.47 1336.35±265.27 0.95 996.21±89.63 1401.94±166.63
C16 1.00 1245.05±147.89 1012.75±206.21 1.00 1225.45±135.50 969.26±132.54 1.00 1191.75±137.19 955.65±143.42
C18 0.20 2104.25±127.41* 1572.75±263.46 0.30 2201.33±106.89* 1594.05±213.48 0.60 2071.17±156.11 1858.70±235.51*
* represents the statistical significance

6.3 Performance of the planning-assisted shepherding

To validate the effectiveness of the proposed shepherding model, the proposed method is compared with the reactive shepherding of Strömbom et al. [7], referred to as Method 1 for convenience. As the proposed model consists of offline task planning (grouping and TSP-based sequencing) and online path planning, we further add a shepherding method assisted by task planning only, referred to as Method 2, as a comparative method to evaluate the impact of task planning and path planning separately. The proposed planning-assisted shepherding method is referred to as Method 3 in the comparisons.

6.3.1 Planning-assisted shepherding with single-sheepdog

The comparative results of the shepherding methods using a single sheepdog are presented in Table 5. The planning results and the trajectories generated by each method during shepherding for three representative cases (one per group) are visualised in Fig. 5. The blue lines in Fig. 5(a), 5(e) and 5(i) represent the planning results. In the other panels, the grey lines represent the sheep trajectories, and the blue lines represent the sheepdog trajectories.

As we can see from Table 5, the proposed planning-assisted swarm shepherding method performed the best overall among the three methods in almost all cases. In terms of the SR, as the shepherding complexity increases, it becomes increasingly difficult for reactive shepherding to complete the mission within the limited number of time steps, and the SR drops from 1 to 0. While reactive shepherding obtained 100% SR on only 3 cases of Group 1 (Cases 1, 5 and 6) and failed in 10 cases, task planning-assisted shepherding achieved 100% SR on 6 cases (Cases 1-5 of Group 1 and Case 7 of Group 2), which include most obstacle-free cases and one obstacle-contained case. However, task planning-assisted shepherding failed in 7 cases (Cases 10, 12-13, 17-20) and had a low SR (no more than 50%) on 4 cases (Cases 8, 11, 15-16). This indicates that task planning, which divides the sheep swarm and determines the optimal pushing sequence of sub-swarms, can significantly increase the SR of shepherding in environments without obstacles. It can also address some relatively simple shepherding missions in environments with obstacles, but is unable to deal with complex situations with cluttered obstacles and a large sheep swarm. In contrast, the proposed planning-assisted shepherding, with both task planning and path planning integrated, succeeded in all cases of Group 1 and in more than half of the cases of Groups 2 and 3, and achieved a higher SR than Method 2 in Cases 8-18. This demonstrates the effectiveness of path planning in improving the shepherding SR in obstacle-cluttered environments. Fig. 5(f)-5(h) also show that the sheepdog easily reached a deadlock without path planning (Methods 1 and 2), while Method 3 was able to avoid this situation.

Table 5: Comparative results of shepherding methods with single-sheepdog
Case Method 1: Reactive shepherding Method 2: Task planning-assisted shepherding Method 3: Planning-assisted shepherding
SR No. of steps Path length SR No. of steps Path length SR No. of steps Path length
C1 1.00 219.45±46.38* 427.43±87.90* 1.00 143.35±12.09* 257.31±24.19* 1.00 131.00±12.14 214.70±20.69
C2 0.80 540.94±95.78* 1058.34±184.48* 1.00 281.55±35.88 494.55±67.65* 1.00 272.80±24.27 457.75±43.86
C3 0.95 898.53±186.88* 1801.14±363.64* 1.00 410.70±32.80 604.55±41.13* 1.00 396.05±18.85 510.17±38.77
C4 0.75 968.07±145.03* 1898.39±279.91* 1.00 464.10±36.62 633.36±37.50* 1.00 456.75±24.57 563.06±35.75
C5 1.00 1082.45±192.37 2132.50±362.61* 1.00 1098.10±73.27 645.07±26.22* 1.00 1118.30±95.97 560.12±30.96
C6 1.00 1584.65±182.53* 3083.65±373.67* 0.95 1410.84±210.77 677.38±56.06* 1.00 1434.05±255.51 616.90±56.48
C7 0.85 438.53±80.00* 807.11±154.04* 1.00 251.85±52.26 455.44±100.63 1.00 232.50±47.04 409.36±94.24
C8 0.05 685.00±0.00* 1082.91±0.00* 0.05 695.00±0.00* 614.87±0.00* 1.00 293.70±22.20 490.47±37.76
C9 0.00 - - 0.80 544.75±37.62* 2292.09±129.15* 1.00 484.30±13.09 283.20±18.14
C10 0.00 - - 0.00 - - 1.00 567.45±56.21 725.97±88.96
C11 0.00 - - 0.20 1212.00±68.00* 1653.71±259.48* 0.95 996.21±89.63 1401.94±166.63
C12 0.00 - - 0.00 - - 0.75 1230.60±31.84 749.71±57.93
C13 0.00 - - 0.00 - - 0.75 1216.40±43.09 773.66±45.49
C14 0.10 1808.00±48.08* 2652.05±63.06* 0.80 1528.75±145.03* 804.04±92.00* 1.00 1441.85±109.28 714.76±58.81
C15 0.10 1763.00±206.48* 2701.55±285.92* 0.50 1223.30±203.73* 794.82±94.96* 1.00 1129.40±96.65 722.48±64.41
C16 0.00 - - 0.30 1837.83±311.60* 1652.02±285.06* 1.00 1191.75±137.19 955.65±143.42
C17 0.00 - - 0.00 - - 0.15 1984.00±52.12 967.57±138.94
C18 0.00 - - 0.00 - - 0.45 2071.17±156.11 1858.70±235.51*
C19 0.00 - - 0.00 - - 0.00 - -
C20 0.00 - - 0.00 - - 0.00 - -
Figure 5: The visualised planning results and generated trajectories of Case 3, 13 and 18 with single-sheepdog (panels (a)-(l))

The integration of task planning and path planning also significantly reduces the number of steps and the path length required to herd the sheep swarm to the goal in almost all cases. This indicates that the planning-assisted shepherding method can save time and reduce the energy consumed by the robots to complete the shepherding mission, which is a significant benefit in real-world shepherding applications. In detail, we can observe from Table 5 that, in most of the obstacle-free environments (Cases 2-6) and the relatively simple obstacle-cluttered environment (Case 7), the reduction in the number of steps is mainly due to task planning, as the difference between the number of steps obtained by task planning-assisted shepherding and by planning-assisted shepherding is not significant. However, when the shepherding complexity increases in Cases 8-18, path planning plays an important role in further reducing the number of time steps, so that the planning-assisted shepherding achieves the minimum number of steps. In terms of the path length, the task planning-assisted shepherding performed better than the reactive shepherding on all cases for which data are available, while the planning-assisted shepherding obtained the best performance, as presented in Table 5 and Fig. 5. This demonstrates that both task planning and path planning are very effective in reducing the detours of the sheepdog. However, the single-sheepdog planning-assisted shepherding still has difficulty in addressing the most complex Cases 19 and 20 within the limited time and cannot guarantee 100% SR for a few cases.

6.3.2 Planning-assisted shepherding with bi-sheepdog shepherding

The failure of single-sheepdog planning-assisted shepherding in Cases 19 and 20 motivates the use of multiple sheepdogs. We evaluated the performance of the three methods with two sheepdogs on the benchmark set, and Table 6 presents the results. When employing multiple agents, the mission completion time and the minimum cruising ability requirement are determined by the agent that consumes the most time and the agent that travels the longest distance, respectively. Therefore, the No. of steps and Path length presented in Table 6 are each the larger of the two sheepdogs' values. The visualised planning results for the 3 representative cases are presented in Figs. 6(a), 6(e) and 6(i), where the lines in different colours represent the optimal routes for the sheepdogs. The trajectories of the sheep (grey lines) and the sheepdogs (blue and red lines) generated during the bi-sheepdog shepherding process by the different methods for these cases are shown in the remaining panels of Fig. 6.
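The aggregation rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `DogRun` container and the function names are hypothetical, and the numbers are made up. The summed variant corresponds to the total values compared later in Table 7.

```python
from dataclasses import dataclass


@dataclass
class DogRun:
    """Per-sheepdog outcome of one run (hypothetical container)."""
    steps: int
    path_length: float


def mission_metrics(dogs):
    """Mission-level values: the slowest sheepdog and the farthest-travelling one."""
    return max(d.steps for d in dogs), max(d.path_length for d in dogs)


def total_metrics(dogs):
    """Summed values over all sheepdogs (total time and travel effort)."""
    return sum(d.steps for d in dogs), sum(d.path_length for d in dogs)


# Illustrative numbers only, not taken from the experiments.
run = [DogRun(steps=70, path_length=120.5), DogRun(steps=45, path_length=131.0)]
print(mission_metrics(run))  # (70, 131.0)
print(total_metrics(run))    # (115, 251.5)
```

Note that the two maxima may come from different sheepdogs, as in the example, which is why the rule is applied to each quantity separately.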

Table 6: Comparative results of shepherding methods with bi-sheepdog
Method 1: Reactive shepherding; Method 2: Task planning-assisted shepherding; Method 3: Planning-assisted shepherding
Case | Method 1 SR | Method 1 No. of steps | Method 1 Path length | Method 2 SR | Method 2 No. of steps | Method 2 Path length | Method 3 SR | Method 3 No. of steps | Method 3 Path length
C1 | 1.00 | 192.05±60.30* | 378.07±113.59* | 1.00 | 77.50±8.29 | 153.00±13.99* | 1.00 | 76.85±7.94 | 132.49±15.92
C2 | 0.95 | 412.84±136.45* | 811.45±264.00* | 1.00 | 132.45±7.34 | 250.01±13.10* | 1.00 | 133.10±7.37 | 232.43±13.68
C3 | 1.00 | 688.05±124.54* | 1392.42±242.39* | 1.00 | 289.40±20.38 | 494.00±37.71* | 1.00 | 284.45±19.16 | 414.43±31.54
C4 | 0.95 | 662.53±122.04* | 1322.85±243.08* | 1.00 | 195.65±10.52 | 374.66±20.84* | 1.00 | 198.30±8.89 | 317.29±20.73
C5 | 1.00 | 768.90±158.46* | 1543.09±317.80* | 1.00 | 433.35±28.25 | 424.59±22.60* | 1.00 | 431.90±49.90 | 365.22±28.34
C6 | 1.00 | 1077.00±208.75* | 2128.58±420.37* | 1.00 | 263.65±15.46 | 366.74±19.44* | 1.00 | 261.35±20.29 | 298.20±25.56
C7 | 0.90 | 349.72±118.35* | 647.45±209.27* | 0.95 | 204.89±113.89 | 357.89±146.12* | 1.00 | 157.40±36.19 | 282.79±67.60
C8 | 0.85 | 463.94±113.64* | 854.12±194.77* | 1.00 | 174.30±15.00 | 323.17±29.17* | 1.00 | 168.25±9.63 | 293.42±19.03
C9 | 0.00 | - | - | 0.00 | - | - | 1.00 | 407.25±11.39 | 281.33±23.32
C10 | 0.40 | 906.50±177.10* | 1556.23±267.23* | 0.00 | - | - | 1.00 | 249.65±11.50 | 383.25±24.44
C11 | 0.00 | - | - | 0.70 | 755.50±267.97* | 1149.11±288.59* | 1.00 | 468.95±190.20 | 765.19±244.20
C12 | 0.00 | - | - | 0.00 | - | - | 1.00 | 568.90±12.33 | 350.87±17.13
C13 | 0.00 | - | - | 0.00 | - | - | 1.00 | 668.95±14.39 | 419.06±23.33
C14 | 0.85 | 1138.35±336.01* | 1907.98±477.13* | 0.00 | - | - | 1.00 | 352.50±14.99 | 359.98±26.04
C15 | 0.60 | 1656.17±394.33* | 2690.52±491.41* | 1.00 | 423.20±312.16* | 555.73±208.74* | 1.00 | 296.95±18.88 | 395.81±47.19
C16 | 0.10 | 1807.50±245.37* | 2469.13±34.49* | 0.75 | 1234.27±459.81* | 1111.22±453.89* | 1.00 | 605.45±209.93 | 564.11±116.75
C17 | 0.00 | - | - | 0.00 | - | - | 1.00 | 720.95±168.93 | 558.64±87.80
C18 | 0.05 | 2077.00±0.00* | 2770.34±0.00* | 0.30 | 1421.00±600.48* | 1673.98±613.43* | 0.85 | 689.29±134.12 | 1028.32±195.86
C19 | 0.10 | 1765.50±77.07* | 2892.37±71.84* | 0.80 | 1030.62±351.46* | 1436.14±471.77* | 1.00 | 527.25±48.09 | 618.39±156.38
C20 | 0.00 | - | - | 0.00 | - | - | 1.00 | 1162.45±27.88 | 559.21±36.11
Figure 6: Visualised planning results and generated trajectories for Cases 3, 13 and 18 with two sheepdogs (panels (a) to (l)).

Table 6 and Fig. 6 show that planning-assisted bi-sheepdog shepherding performed the best among the 3 methods and that Method 2 generally performed better than Method 1, which is consistent with the findings from single-sheepdog shepherding above: task planning and path planning can significantly improve the shepherding performance, especially for complex shepherding missions. Specifically, bi-sheepdog planning-assisted shepherding achieved 100% SR on all cases except Case 18, while Method 1 and Method 2 with two sheepdogs still failed completely in some cases. Furthermore, compared to single-sheepdog shepherding, the deployment of 2 sheepdogs, regardless of the method used, significantly improved the SR on complex shepherding tasks and reduced the number of time steps and the path length required to complete the mission, as shown in Table 6. For example, bi-sheepdog planning-assisted shepherding increased the SR from 0 to 100% on Cases 19 and 20 and required fewer time steps and a shorter path length than single-sheepdog planning-assisted shepherding in all cases. We can conclude that deploying multiple sheepdogs is an efficient way to reduce the completion time of the shepherding mission and the cruising ability required of each sheepdog.
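For readers who want to reproduce the kind of summary reported in the tables, the sketch below shows one way an SR value and a mean±std cell could be computed from repeated runs. It rests on assumptions not stated in this section: that the step and path-length statistics are taken over successful runs only (suggested by the empty cells where SR is 0), that the sample standard deviation is used, and that a single successful run is reported with a spread of 0.00. The function name and the data are hypothetical.

```python
from statistics import mean, stdev


def summarise(runs):
    """Summarise repeated runs given as (success, steps, path_length) tuples.

    Returns (SR, (mean_steps, std_steps), (mean_length, std_length)).
    The two statistics are None when no run succeeded.
    """
    ok = [r for r in runs if r[0]]
    sr = len(ok) / len(runs)
    if not ok:
        return sr, None, None

    def mean_std(values):
        # Assumption: a single successful run is reported with zero spread.
        return mean(values), (stdev(values) if len(values) > 1 else 0.0)

    return sr, mean_std([r[1] for r in ok]), mean_std([r[2] for r in ok])


# Illustrative data only: four runs, one of which failed.
runs = [(True, 100, 200.0), (True, 120, 220.0), (False, 0, 0.0), (True, 110, 210.0)]
sr, steps, length = summarise(runs)
print(f"SR={sr:.2f}")
```

Failed runs contribute to SR but not to the averages, so a low SR paired with a small standard deviation should be read with care.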

To further validate the effectiveness of bi-sheepdog shepherding, we also compare the total number of steps and the total path length of the 2 sheepdogs obtained by planning-assisted shepherding with the best results of single-sheepdog shepherding. The results are presented in Table 7, where boldface denotes that bi-sheepdog shepherding achieves better results than single-sheepdog shepherding and ‘*’ indicates a statistically significant difference. Planning-assisted bi-sheepdog shepherding still significantly outperformed single-sheepdog shepherding in most of the cases in terms of the total values. This further demonstrates the efficiency of bi-sheepdog planning-assisted shepherding in reducing the total time and energy consumption required to complete the mission.

Table 7: The comparison of bi-sheepdog shepherding to single sheepdog shepherding
Case | SR | Total No. of steps | Total path length
C1 | 1.00 | 112.75±7.99* | 195.62±15.68*
C2 | 1.00 | 257.50±9.74 | 444.74±19.52
C3 | 1.00 | 338.70±18.77 | 503.27±28.82
C4 | 1.00 | 332.65±10.80* | 531.97±22.48*
C5 | 1.00 | 547.70±49.69 | 541.10±38.67
C6 | 1.00 | 467.90±26.30* | 548.44±35.06*
C7 | 1.00 | 196.45±36.49* | 347.06±67.72*
C8 | 1.00 | 307.25±46.74* | 537.17±85.49*
C9 | 1.00 | 562.05±12.17* | 378.06±23.17*
C10 | 1.00 | 518.15±153.67* | 840.70±283.54*
C11 | 1.00 | 704.45±220.95* | 1183.11±267.44*
C12 | 1.00 | 866.65±15.21* | 601.40±29.48*
C13 | 1.00 | 897.10±16.47* | 606.20±25.21*
C14 | 1.00 | 572.30±69.50 | 704.61±178.78
C15 | 1.00 | 503.55±38.45* | 612.12±62.65*
C16 | 1.00 | 687.70±85.22 | 900.81±161.92
C17 | 1.00 | 827.45±242.41 | 947.27±209.32
C18 | 0.85 | 1137.29±221.63 | 1728.28±234.88
C19 | 1.00 | 836.45±122.29* | 1039.61±218.38*
C20 | 1.00 | 1628.65±39.02* | 908.14±58.52*
* denotes a statistically significant difference.

7 Conclusion

This paper presents a planning-assisted context-sensitive swarm shepherding model and a hierarchical mission planning system for effectively herding a large flock of highly dispersed sheep to the destination in an environment with obstacles. In the proposed shepherding model, the sheep swarm is first grouped into several sub-swarms. The shepherding problem is then transformed into a TSP that determines the optimal pushing sequence of the sub-swarms by regarding each sub-swarm as a ‘city’ to visit. Online path planning is then integrated with a context-sensitive response model to find the optimal paths along which the sheep sub-swarms are pushed to the next ‘city’ and the optimal paths for the sheepdogs that push them. The hierarchical mission planning system is designed to solve the planning problems in the proposed shepherding model by combining a cohesion range-based method for grouping, ACO for the TSP, and A*-PP for path planning.
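The first two layers of this hierarchy can be illustrated with a short, self-contained Python sketch. It is a simplified stand-in rather than the method used in the paper: grouping is approximated as connected components under a distance threshold (one possible reading of cohesion range-based grouping), and a greedy nearest-neighbour tour replaces the ACO solver. Path planning with A*-PP and the goal location are omitted. All function names, the threshold and the coordinates are hypothetical.

```python
import math


def group_by_range(points, cohesion_range):
    """Group points into sub-swarms: connected components under a distance threshold."""
    unvisited = set(range(len(points)))
    groups = []
    while unvisited:
        seed = unvisited.pop()
        group, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= cohesion_range]
            for j in near:
                unvisited.remove(j)
            group.extend(near)
            frontier.extend(near)
        groups.append(sorted(group))
    return sorted(groups)


def centroid(points, group):
    """Centre of a sub-swarm, used here as the position of its 'city'."""
    xs = [points[i][0] for i in group]
    ys = [points[i][1] for i in group]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def visiting_order(start, cities):
    """Greedy nearest-neighbour tour over the cities (stand-in for the ACO solver)."""
    remaining = list(range(len(cities)))
    order, here = [], start
    while remaining:
        nxt = min(remaining, key=lambda k: math.dist(here, cities[k]))
        remaining.remove(nxt)
        order.append(nxt)
        here = cities[nxt]
    return order


# Illustrative sheep positions and sheepdog start, not taken from the benchmark.
sheep = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (20, 0)]
groups = group_by_range(sheep, cohesion_range=1.5)
cities = [centroid(sheep, g) for g in groups]
order = visiting_order(start=(25, 0), cities=cities)
print(groups)  # [[0, 1, 2], [3, 4], [5]]
print(order)   # [2, 1, 0]
```

A nearest-neighbour tour is only a baseline; a metaheuristic such as ACO searches over whole sequences and can escape the short-sighted choices this greedy rule makes.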

Experiments conducted on 20 shepherding cases, divided into three groups with different levels of complexity, demonstrated the effectiveness of the planning-assisted swarm shepherding model in increasing the success rate and reducing the time and energy consumption required to complete the mission. The planning-assisted swarm shepherding model can also be extended to employ multiple sheepdogs, and the experiments validated the performance improvements for bi-sheepdog shepherding. However, there remain further opportunities for extending this research. The employment of more than 2 sheepdogs for shepherding has not been studied in this work. In addition, when the swarm shepherding problem is transformed into a TSP, the dynamics of shepherding are not considered, and the travelling cost between each pair of cities does not account for the influence of the swarm size. Our future research will focus on how to model the multi-sheepdog swarm shepherding problem as a multiple dynamic TSP with a more accurate cost evaluation.

8 Funding

This work is supported by a U.S. Office of Naval Research-Global (ONR-G) Grant and a Defence Science and Technology Group grant.

Declaration of Competing Interest

None.

References

  • [1] N. K. Long, K. Sammut, D. Sgarioto, M. Garratt, H. A. Abbass, A comprehensive review of shepherding as a bio-inspired swarm-robotics guidance approach, IEEE Transactions on Emerging Topics in Computational Intelligence 4 (4) (2020) 523–537.
  • [2] J.-M. Lien, E. Pratt, Interactive planning for shepherd motion., in: AAAI Spring Symposium: Agents that Learn from Human Teachers, 2009, pp. 95–102.
  • [3] M. Evered, P. Burling, M. Trotter, et al., An investigation of predator response in robotic herding of sheep, International Proceedings of Chemical, Biological and Environmental Engineering 63 (2014) 49–54.
  • [4] D. Strömbom, A. J. King, Robot collection and transport of objects: A biomimetic process, Frontiers in Robotics and AI (2018) 48.
  • [5] B. Bat-Erdene, O.-E. Mandakh, Shepherding algorithm of multi-mobile robot system, in: 2017 First IEEE International Conference on Robotic Computing (IRC), IEEE, 2017, pp. 358–361.
  • [6] A. A. Paranjape, S.-J. Chung, K. Kim, D. H. Shim, Robotic herding of a flock of birds using an unmanned aerial vehicle, IEEE Transactions on Robotics 34 (4) (2018) 901–915.
  • [7] D. Strömbom, R. P. Mann, A. M. Wilson, S. Hailes, A. J. Morton, D. J. Sumpter, A. J. King, Solving the shepherding problem: heuristics for herding autonomous, interacting agents, Journal of the royal society interface 11 (100) (2014) 20140719.
  • [8] H. Singh, B. Campbell, S. Elsayed, A. Perry, R. Hunjet, H. Abbass, Modulation of force vectors for effective shepherding of a swarm: A bi-objective approach, in: 2019 IEEE Congress on Evolutionary Computation (CEC), IEEE, 2019, pp. 2941–2948.
  • [9] T. Nguyen, J. Liu, H. Nguyen, K. Kasmarik, S. Anavatti, M. Garratt, H. Abbass, Perceptron-learning for scalable and transparent dynamic formation in swarm-on-swarm shepherding, in: 2020 International Joint Conference on Neural Networks (IJCNN), IEEE, 2020, pp. 1–8.
  • [10] J. Zhi, J.-M. Lien, Learning to herd agents amongst obstacles: Training robust shepherding behaviors using deep reinforcement learning, IEEE Robotics and Automation Letters 6 (2) (2021) 4163–4168.
  • [11] V. S. Chipade, D. Panagou, Multiagent planning and control for swarm herding in 2-d obstacle environments under bounded inputs, IEEE Transactions on Robotics 37 (6) (2021) 1956–1972.
  • [12] H. Song, A. Varava, O. Kravchenko, D. Kragic, M. Y. Wang, F. T. Pokorny, K. Hang, Herding by caging: a formation-based motion planning framework for guiding mobile agents, Autonomous Robots 45 (5) (2021) 613–631.
  • [13] H. El-Fiqi, B. Campbell, S. Elsayed, A. Perry, H. K. Singh, R. Hunjet, H. A. Abbass, The limits of reactive shepherding approaches for swarm guidance, IEEE Access 8 (2020) 214658–214671.
  • [14] A. Hussein, E. Petraki, S. Elsawah, H. A. Abbass, Autonomous swarm shepherding using curriculum-based reinforcement learning., in: AAMAS, 2022, pp. 633–641.
  • [15] S. M. LaValle, Planning algorithms, Cambridge university press, 2006.
  • [16] Z. Zhao, M. Jin, E. Lu, S. X. Yang, Path planning of arbitrary shaped mobile robots with safety consideration, IEEE Transactions on Intelligent Transportation Systems (2021).
  • [17] J. Müller, J. Strohbeck, M. Herrmann, M. Buchholz, Motion planning for connected automated vehicles at occluded intersections with infrastructure sensors, IEEE Transactions on Intelligent Transportation Systems (2022).
  • [18] J. Liu, S. Anavatti, M. Garratt, H. A. Abbass, Mission planning for shepherding a swarm of uninhabited aerial vehicles, Shepherding UxVs for Human-Swarm Teaming: An Artificial Intelligence Approach to Unmanned X Vehicles (2021) 87–114.
  • [19] J. K. Lenstra, A. R. Kan, Some simple applications of the travelling salesman problem, Journal of the Operational Research Society 26 (4) (1975) 717–733.
  • [20] S. Elsayed, H. Singh, E. Debie, A. Perry, B. Campbell, R. Hunjet, H. Abbass, Path planning for shepherding a swarm in a cluttered environment using differential evolution, in: 2020 IEEE Symposium Series on Computational Intelligence (SSCI), IEEE, 2020, pp. 2194–2201.
  • [21] T. Stützle, H. H. Hoos, Max–min ant system, Future generation computer systems 16 (8) (2000) 889–914.
  • [22] C. W. Reynolds, Flocks, herds and schools: A distributed behavioral model, in: Proceedings of the 14th annual conference on Computer graphics and interactive techniques, 1987, pp. 25–34.
  • [23] T. Miki, T. Nakamura, An effective rule based shepherding algorithm by using reactive forces between individuals, International Journal of Innovative Computing, Information and Control 3 (4) (2007) 813–823.
  • [24] K. Fujioka, Effective herding in shepherding problem in V-formation control, Transactions of the Institute of Systems, Control and Information Engineers 31 (1) (2018) 21–27.
  • [25] J. F. Harrison, C. Vo, J.-M. Lien, Scalable and robust shepherding via deformable shapes, in: International Conference on Motion in Games, Springer, 2010, pp. 218–229.
  • [26] J. Hu, A. E. Turgut, T. Krajník, B. Lennox, F. Arvin, Occlusion-based coordination protocol design for autonomous robotic shepherding tasks, IEEE Transactions on Cognitive and Developmental Systems (2020).
  • [27] C. K. Go, B. Lao, J. Yoshimoto, K. Ikeda, A reinforcement learning approach to the shepherding task using sarsa, in: 2016 International Joint Conference on Neural Networks (IJCNN), IEEE, 2016, pp. 3833–3836.
  • [28] H. T. Nguyen, T. D. Nguyen, V. P. Tran, M. Garratt, K. Kasmarik, S. Anavatti, M. Barlow, H. A. Abbass, Continuous deep hierarchical reinforcement learning for ground-air swarm shepherding, arXiv preprint arXiv:2004.11543 (2020).
  • [29] K. Bérczi, M. Mnich, R. Vincze, Efficient approximations for many-visits multiple traveling salesman problems, arXiv preprint arXiv:2201.02054 (2022).
  • [30] A. Ayari, S. Bouamama, Acd3gpso: automatic clustering-based algorithm for multi-robot task allocation using dynamic distributed double-guided particle swarm optimization, Assembly Automation 40 (2) (2019) 235–247.
  • [31] J. Xie, J. Chen, Multiregional coverage path planning for multiple energy constrained uavs, IEEE Transactions on Intelligent Transportation Systems (2022).
  • [32] P. Baniasadi, M. Foumani, K. Smith-Miles, V. Ejov, A transformation technique for the clustered generalized traveling salesman problem with applications to logistics, European Journal of Operational Research 285 (2) (2020) 444–457.
  • [33] I. Khoufi, A. Laouiti, C. Adjih, A survey of recent extended variants of the traveling salesman and vehicle routing problems for unmanned aerial vehicles, Drones 3 (3) (2019) 66.
  • [34] X. Xu, J. Li, M. Zhou, X. Yu, Precedence-constrained colored traveling salesman problem: An augmented variable neighborhood search approach, IEEE Transactions on Cybernetics (2021).
  • [35] M. Mavrovouniotis, F. M. Müller, S. Yang, Ant colony optimization with local search for dynamic traveling salesman problems, IEEE transactions on cybernetics 47 (7) (2016) 1743–1756.
  • [36] I. M. Ali, D. Essam, K. Kasmarik, A novel design of differential evolution for solving discrete traveling salesman problems, Swarm and Evolutionary Computation 52 (2020) 100607.
  • [37] M. Dorigo, Optimization, learning and natural algorithms, PhD Thesis, Politecnico di Milano (1992).
  • [38] M. Dorigo, L. M. Gambardella, Ant colony system: a cooperative learning approach to the traveling salesman problem, IEEE Transactions on evolutionary computation 1 (1) (1997) 53–66.
  • [39] X. Xiang, Y. Tian, X. Zhang, J. Xiao, Y. Jin, A pairwise proximity learning-based ant colony algorithm for dynamic vehicle routing problems, IEEE Transactions on Intelligent Transportation Systems 23 (6) (2021) 5275–5286.
  • [40] O. Cheikhrouhou, I. Khoufi, A comprehensive survey on the multiple traveling salesman problem: Applications, approaches and taxonomy, Computer Science Review 40 (2021) 100369.
  • [41] P. Oberlin, S. Rathinam, S. Darbha, Today’s traveling salesman problem, IEEE robotics & automation magazine 17 (4) (2010) 70–77.
  • [42] V. Roberge, M. Tarbouchi, G. Labonté, Comparison of parallel genetic algorithm and particle swarm optimization for real-time UAV path planning, IEEE Transactions on Industrial Informatics 9 (1) (2012) 132–141.
  • [43] X. Yu, W.-N. Chen, T. Gu, H. Yuan, H. Zhang, J. Zhang, ACO-A*: Ant colony optimization plus A* for 3-D traveling in environments with dense obstacles, IEEE Transactions on Evolutionary Computation 23 (4) (2018) 617–631.
  • [44] P. E. Hart, N. J. Nilsson, B. Raphael, A formal basis for the heuristic determination of minimum cost paths, IEEE transactions on Systems Science and Cybernetics 4 (2) (1968) 100–107.
  • [45] K. Yang, Anytime synchronized-biased-greedy rapidly-exploring random tree path planning in two dimensional complex environments, International Journal of Control, Automation and Systems 9 (4) (2011) 750–758.