
Integrated Design of Cooperative Area Coverage and Target Tracking with Multi-UAV System

Mengge Zhang (zhangmg@nudt.edu.cn), Jie Li (lijie09@nudt.edu.cn), Xiangke Wang* (xkwang@nudt.edu.cn)

College of Intelligence Science and Technology, National University of Defense Technology, Changsha, P. R. China

2021

Abstract

This paper systematically studies the cooperative area coverage and target tracking problem for multiple unmanned aerial vehicles (multi-UAV). The problem is solved by decomposing it into three sub-problems: information fusion, task assignment, and multi-UAV behavior decision-making. Specifically, in the information fusion process, we use the maximum consistency protocol to update the joint estimation states of multi-targets (JESMT) and the area detection information. The area detection information is represented by the equivalent visiting time map (EVTM), which is built from the detection probability and the actual visiting time of the area. Then, we model the task assignment problem of multi-UAV searching and tracking of multiple targets as a network flow model with upper and lower flow bounds, and propose an algorithm named task assignment minimum-cost maximum-flow (TAMM). Cooperative behavior decision-making uses Fisher information as the mission reward to obtain the optimal tracking action of each UAV. Furthermore, a coverage behavior decision-making algorithm based on the anti-flocking method is designed for the UAVs assigned the coverage task. Finally, a distributed multi-UAV cooperative area coverage and target tracking algorithm is designed, which integrates information fusion, task assignment, and behavioral decision-making. Numerical and hardware-in-the-loop simulation results show that the proposed method can achieve persistent area coverage and cooperative target tracking.

keywords:
Multi-UAV, Task assignment, Area coverage, Target tracking

1 Introduction

The multi-UAV cooperative area coverage and target tracking problem is a typical problem in area surveillance, which is an essential means of information acquisition. Typical application scenarios include forest fire monitoring, search and rescue, and pursuit-evasion. These applications require not only continuous search and coverage of the mission area, but also that the multi-UAV system promptly detect intruding non-cooperative targets and cooperatively track the allocated targets. However, various factors, such as the dynamic environment, the target appearance probability, and the sensor performance, make the problem very complex.

Many existing works consider the area coverage or target tracking problem in isolation bib1 ; bib2 . The integrated design of area coverage and target tracking has not received enough attention bib1 . Previous studies of the area coverage and target tracking problem fall into two categories: coupled or decoupled. The coupled approach considers the coverage and tracking tasks simultaneously and is usually called simultaneous coverage and tracking (SCAT). SCAT focuses on the multi-objective optimization of coverage and tracking tasks, and its typical method is the Voronoi-based coverage method bib2 ; bib3 . The target tracking part in bib2 is treated as a parameter-based procedure, so that the area coverage and target tracking problem is translated into the problem of covering environments with time-varying density functions. Although SCAT has been verified in actual scenarios, it sacrifices, to a certain extent, the individual performance of the area coverage or target tracking task. In the decoupled design of coverage and tracking, the UAVs must switch between different modes according to the specific environment and mission requirements bib5 , and different UAVs must be reasonably allocated to track multiple targets. A control logic design based on a finite state automaton model, integrating four modes of operation, is presented in bib8 . The semi-flocking algorithm in bib5 ; bib6 ; bib7 assigns a small flock of sensors to each target while leaving some sensors free to explore the environment. It enables mobile nodes to self-organize and switch between searching and tracking modes. Although the above decoupled algorithms have been applied successfully, they lack coordinated management of the UAVs, making it difficult to achieve optimal cooperation among them.

Considering that the coupled algorithms neglect the performance of each individual task, this paper solves the area coverage and target tracking problem under the decoupled framework. A hierarchical modular architecture is designed to enhance the collaboration of UAVs in a distributed manner. The hierarchical modular method integrates three sub-modules: information fusion, task allocation, and behavior decision-making.

The information fusion module includes data preprocessing and information fusion methods. Most current work uses Bayesian methods bib24 ; bib25 for data preprocessing, with the sensors modeled as sources of uncertainty. For data fusion in a distributed system, nodes need to be guided by an interaction protocol to generate a consistent estimate. The information consensus algorithm bib28 has been widely used in distributed estimation bib29 , task assignment bib32 ; bib33 , and other scenarios. However, the consensus process is highly dependent on the communication links, and the convergence time of the algorithm increases rapidly with the task complexity. bib34 gives the termination condition of the maximum consensus algorithm in the distributed information fusion process. In this paper, we improve the maximum consensus information fusion algorithm in bib34 so that it can be applied to the area coverage and target tracking task.

The task allocation module is the key to enhancing the collaboration of UAVs. It can be solved by centralized or distributed methods. Centralized assignment methods bib36 can globally coordinate the complex relationships between tasks, but they rely too much on the control center and lack robustness. Commonly used distributed allocation methods are based on the market mechanism bib44 ; bib45 . Distributed task assignment methods are computationally flexible and suitable for solving large-scale assignment problems, but obtaining globally optimal assignments remains challenging. In this paper, based on consensus algorithms and network flow theory bib49 , we design an algorithm that can obtain the globally optimal allocation in a distributed manner. Applications of network flow theory to task assignment are uncommon. We find that the multi-UAV task assignment problem in the area coverage and target tracking task can essentially be modeled as a minimum-cost maximum-flow (MCMF) problem bib50 , which can then be solved using various MCMF algorithms. The MCMF algorithms are centralized methods. Combining them with consensus algorithms enables the allocation algorithm to fit the overall distributed architecture.

The collaborative behavioral decision-making module utilizes Fisher information as the task reward, and UAVs make decisions according to the assigned coverage or tracking tasks. For the coverage decision, researchers have focused on geometric, probabilistic, or biological intelligence approaches bib10 . Geometry-based methods bib13 ; bib14 can achieve complete coverage of the area. However, their centralized and offline nature makes them unsuitable for dynamic environments. The probabilistic approach bib10 builds probabilistic models to characterize the environmental uncertainty and uses search algorithms for distributed online decisions that reduce this uncertainty. Compared with other decision-making algorithms, the methods based on biological intelligence can generate coverage behavior through simple rules with low computational cost bib19 . Therefore, we choose a rule-based anti-flocking algorithm bib22 ; bib23 for the coverage decision. For the tracking decision, most target tracking methods are based on the principle of placing the target in the center of the field of view bib54 . In this paper, we use the rolling horizon method for tracking decision-making and find the optimal action sequence that maximizes the cumulative Fisher information.

According to the hierarchical modular design and the ideas of the three sub-modules, our main contributions are as follows:

  • First, a distributed hierarchical modular architecture is provided for the area coverage and target tracking problem. We modularize information fusion, task assignment, and behavioral decision-making and integrate the three modules through a distributed architecture design. Compared with the coupling SCAT architecture, the distributed hierarchical modular design can fully exploit the capabilities of each sub-module, thereby improving the area coverage and target tracking performance of the system.

  • Second, a distributed information fusion strategy that combines the compression, extraction, and fusion of the coverage time maps in the coverage task with the fusion of the target states is designed. The compression and extraction of the coverage information map allow our information fusion strategy to be adapted in arbitrarily large mission areas.

  • Third, we model the allocation problem of the multi-UAV coverage and target tracking task as a network flow model. Then, a minimum-cost maximum-flow algorithm for task assignment is designed. The allocation algorithm adapts to changes in the relative numbers of UAVs and targets and can achieve fast allocation for large-scale tasks.

The rest of the paper is organized as follows. Section 2 analyzes the area coverage and target tracking problem and decomposes it into three subproblems: distributed information fusion, multi-UAV task assignment, and behavioral decision-making. Each subproblem is then studied in Sections 3, 4, and 5, respectively. Section 6 systematically designs a distributed multi-UAV cooperative area coverage and target tracking algorithm. Finally, Section 7 presents the numerical and hardware-in-the-loop simulations with the corresponding discussions, which demonstrate that the proposed hierarchical modular architecture can effectively solve the area coverage and target tracking problem.

2 Problem formulation

In this paper, several ground targets are moving in the mission area, and the number and states of the targets are unknown. The multi-UAV system, consisting of $N_u$ homogeneous UAVs, needs to cover the area continuously and automatically assign UAVs to track the detected targets, while the remaining UAVs perform the coverage task. According to the mission requirements, each UAV has two task modes, i.e., the area coverage mode and the target tracking mode. Fig. 1 shows a typical scenario of the area coverage and target tracking task.

Figure 1: A typical scenario of the area coverage and target tracking task with multi-UAV system.

Assume that $N_\tau$ targets have been found at the current moment. Let $I_u=\{1,\cdots,N_u\}$ denote the index set of all UAVs and $I_\tau=\{1,\cdots,N_\tau\}$ the index set of all targets. Besides, $\mathcal{U}_i$ and $\mathcal{T}_j$ denote the $i$th UAV and the $j$th target, respectively. The detection sensor is modeled by a disk with limited sensing capability and a fixed field of view (FOV). The $i$th UAV can observe any target within its measurement range $r_i^o$, which is determined by its altitude $h_i$ and FOV. UAVs communicate through a wireless communication module with communication range $r_c$. Let $p_i=[x_i,y_i]$ denote the position of $\mathcal{U}_i$. We use $\mathscr{G}_i^n=\{\mathscr{V}_i^n,\mathscr{E}_i^n\}$ to represent the undirected communication subgraph containing $\mathcal{U}_i$ and its neighbors, where $\mathscr{V}_i^n=\{q\in I_u\mid\|p_q-p_i\|<r_c\}$ and $\mathscr{E}_i^n=\{(i,q)\mid q\in\mathscr{V}_i^n,q\neq i\}$. Here $\mathscr{G}_i^n$ depends on the communication range $r_c$ and the relative locations of the UAVs.
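The neighbor set and edge set defined above can be illustrated with a minimal Python sketch. The function names `neighbor_set` and `edge_set`, the dictionary layout of `positions`, and the numeric values are assumptions made for illustration only; they are not part of the paper.

```python
import math

def neighbor_set(i, positions, r_c):
    """V_i^n: all UAV indices q with ||p_q - p_i|| < r_c (this includes i itself)."""
    xi, yi = positions[i]
    return {q for q, (xq, yq) in positions.items()
            if math.hypot(xq - xi, yq - yi) < r_c}

def edge_set(i, positions, r_c):
    """E_i^n: edges (i, q) from UAV i to every neighbor q != i."""
    return {(i, q) for q in neighbor_set(i, positions, r_c) if q != i}

# Hypothetical positions in metres, keyed by UAV index.
positions = {1: (0.0, 0.0), 2: (30.0, 40.0), 3: (100.0, 0.0)}
V1 = neighbor_set(1, positions, r_c=60.0)
E1 = edge_set(1, positions, r_c=60.0)
```

With these values, UAV 2 lies 50 m from UAV 1 and is a neighbor, while UAV 3 lies 100 m away and is not.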

Figure 2: The geometric relationship of UAV and target in the two-dimensional plane.

Let $s_j^\tau=(x_j^\tau,\dot{x}_j^\tau,y_j^\tau,\dot{y}_j^\tau)$ represent the state of target $\mathcal{T}_j$. We use a linear kinematic model for the ground moving targets and assume that all UAVs fly at a constant altitude. The state of $\mathcal{U}_i$ is defined by $s_i=(x_i,y_i,v_i,\eta_i)$, where $v_i$ is the flight speed and $\eta_i$ is the heading angle. The geometric relationship between the UAV and the target in the two-dimensional plane is shown in Fig. 2. The discrete kinematic model of the fixed-wing UAV is as follows:

$$
\left\{\begin{aligned}
x_i(k+1)&=x_i(k)+v_i(k)\Delta T\cos\eta_i(k)\\
y_i(k+1)&=y_i(k)+v_i(k)\Delta T\sin\eta_i(k)\\
v_i(k+1)&=\left[v_i(k)+\Delta v_i(k)\Delta T\right]_{v_{min}}^{v_{max}}\\
\eta_i(k+1)&=\eta_i(k)+\omega_i(k)\Delta T
\end{aligned}\right. \tag{1}
$$

where $\Delta T$ denotes the length of each time step, $[\cdot]_{v_{min}}^{v_{max}}$ denotes saturation to the interval $[v_{min},v_{max}]$, and the action command $\pi_i$ of the $i$th UAV is $(\Delta v_i,\omega_i)$, i.e., the ground acceleration and the heading angular velocity.
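As a minimal sketch of Eq. (1), the following Python function advances one UAV state by a single time step. The function name `step` and all numeric values are illustrative assumptions, and the saturation of the speed to the range between `v_min` and `v_max` is one reading of the bracket notation in Eq. (1).

```python
import math

def step(state, action, dt, v_min, v_max):
    """One step of the discrete fixed-wing model in Eq. (1)."""
    x, y, v, eta = state      # position, speed, heading angle
    dv, omega = action        # ground acceleration, heading angular velocity
    x_next = x + v * dt * math.cos(eta)
    y_next = y + v * dt * math.sin(eta)
    v_next = min(max(v + dv * dt, v_min), v_max)  # clamp the speed
    eta_next = eta + omega * dt
    return (x_next, y_next, v_next, eta_next)

# Hypothetical UAV flying east at 20 m/s, accelerating and turning slightly.
s0 = (0.0, 0.0, 20.0, 0.0)
s1 = step(s0, (5.0, 0.1), dt=1.0, v_min=15.0, v_max=22.0)
```

In this example the commanded speed of 25 m/s exceeds `v_max`, so the speed is clamped to 22 m/s.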

In our work, we only consider the target tracking task in the task allocation stage, since moving targets are more valuable. We calculate the tracking action and the corresponding reward of each UAV for each target through the tracking decision algorithm and then assign tasks to UAVs according to the reward matrix $R\in\mathbb{R}^{N_u\times N_\tau}$, where $R_{ij}$ is the task reward when $\mathcal{U}_i$ tracks target $\mathcal{T}_j$. $\digamma\in\mathbb{R}^{N_u\times N_\tau}$ denotes the task decision matrix: $\digamma_{ij}=1$ indicates that task $\mathcal{T}_j$ is assigned to $\mathcal{U}_i$, and $\digamma_{ij}=0$ otherwise. $\pi\in\mathbb{R}^{N_u}$ denotes the action commands of all UAVs, and $\pi_i$ is the action command of $\mathcal{U}_i$. Our goal is to find the optimal mapping $\digamma$ from the tasks to the UAVs and to plan the action of each UAV so as to maximize the overall mission reward while meeting the task constraints. The optimization problem of the multi-UAV system is then:

$$
\begin{aligned}
\max_{\digamma,\pi}\quad&\sum_{i=1}^{N_u}\sum_{j=1}^{N_\tau}R_{ij}(\pi_i)\digamma_{ij}\\
\text{s.t.}\quad&\sum_{i=1}^{N_u}\digamma_{ij}\leq n_j,&&\forall j\in I_\tau\\
&\sum_{j=1}^{N_\tau}\digamma_{ij}\leq 1,&&\forall i\in I_u\\
&\digamma_{ij}\in\{0,1\},&&\forall i\in I_u,\forall j\in I_\tau
\end{aligned} \tag{2}
$$

The first constraint indicates that the number of UAVs tracking one target cannot exceed $n_j$, and the second constraint indicates that each UAV can select at most one target to track. Note that a UAV that is assigned no target performs the area coverage task.
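To make the constraints of problem (2) concrete, the sketch below enumerates every feasible assignment for a tiny instance and keeps the best one. It is a brute-force check for illustration, not the assignment algorithm proposed in this paper; it assumes the rewards $R_{ij}(\pi_i)$ have already been evaluated into a fixed matrix, and the name `best_assignment` and the numbers are hypothetical.

```python
from itertools import product

def best_assignment(R, n_max):
    """Exhaustively solve problem (2) for a small, fixed reward matrix R.

    R[i][j] is the reward when UAV i tracks target j; n_max[j] is the cap n_j.
    A choice of None means the UAV tracks nothing and performs coverage.
    """
    n_u, n_t = len(R), len(R[0])
    best_value, best_choice = float("-inf"), None
    for choice in product([None] + list(range(n_t)), repeat=n_u):
        counts = [sum(1 for c in choice if c == j) for j in range(n_t)]
        if any(counts[j] > n_max[j] for j in range(n_t)):
            continue  # violates the per-target cap n_j
        value = sum(R[i][c] for i, c in enumerate(choice) if c is not None)
        if value > best_value:
            best_value, best_choice = value, choice
    return best_value, best_choice

# Hypothetical rewards for 3 UAVs and 2 targets, at most one tracker per target.
R = [[4.0, 1.0],
     [3.0, 2.0],
     [1.0, 0.5]]
value, choice = best_assignment(R, n_max=[1, 1])
```

Here the first UAV tracks the first target, the second UAV tracks the second target, and the third UAV is left to perform coverage. The enumeration grows exponentially with the number of UAVs, so it is only useful for checking small cases.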

Figure 3: The overall framework of the multi-UAV area coverage and target tracking system.

In the decoupling architecture, we divide the area coverage and target tracking task into three sub-problems: information fusion, task assignment, and multi-UAV behavior decision-making. Fig. 3 is the framework of our multi-UAV area coverage and target tracking system. Then we analyze each sub-problem in detail in the following sections.

3 Multi-UAV information fusion

The information interaction is essential for realizing the cooperation among UAVs. It can improve the area coverage efficiency and the tracking accuracy of targets. In this section, a distributed approach is proposed where the joint estimation state of multi-targets and the coverage information of each UAV are propagated through the whole network so that the consistency of the detection information is guaranteed. The information fusion algorithm in this section improves the information fusion strategy in bib34 which is designed only for the target tracking task.

3.1 Definition of detection information

In the information fusion process, UAVs need to exchange their detection information (the local area coverage information and the local estimation state of multi-targets) with neighbors. Therefore, we design the area information map and a target information table to save the detection information of each UAV.

3.1.1 Equivalent visiting time map

To record the coverage information of each UAV and to cooperate with other UAVs, we discretize the task area into a grid map $g$ with $M\times N$ grids. If the detection range of the airborne sensor is modeled simply as a disk, a target counts as covered whenever it lies within the sensor's detection range, which ignores the probability of successful detection. Therefore, we build an equivalent visiting time map (EVTM) that takes the sensor's performance into account.

Let $T_i\in\mathbb{R}^{M\times N}$ denote the EVTM of the $i$th UAV. The EVTM (see Fig. 5) records the equivalent time at which each grid was last visited. $t_i^{m,n}(k-1)$ represents the equivalent visiting time of grid $g(m,n)$ at the end of time step $k-1$, and $t_k$ is the actual time at time step $k$.

The EVTM is initialized with $t_i^{m,n}(0)=t_0$ and updated as follows:

$$
\hat{t}_i^{m,n}(k)=\left\{\begin{aligned}
&t_i^{m,n}(k-1),&&g(m,n)\ \text{not visited at}\ t_k\\
&t_k-\Delta t_i^{m,n}(k),&&g(m,n)\ \text{visited at}\ t_k
\end{aligned}\right. \tag{3}
$$

Here $\Delta t_i^{m,n}(k)$ depends on the probability of successful detection and is given in Eq. (7). $\hat{t}_i^{m,n}(k)$ is updated only with the detection information of $\mathcal{U}_i$. After the information fusion phase, we obtain $t_i^{m,n}(k)$, which incorporates the detection information of all UAVs in the same connected network.

Let $\gamma_i^d(k,m,n)$ denote the probability that $\mathcal{U}_i$ successfully detects grid $g(m,n)$ at time step $k$. It depends on the sensor performance and the distance $dis_{i\rightarrow(m,n)}$ between $\mathcal{U}_i$ and the grid $g(m,n)$:

$$
\gamma_i^d(m,n)=e^{\left[-w_{p1}\left(dis_{i\rightarrow(m,n)}/w_{p2}\right)^{w_{p3}}\right]} \tag{4}
$$

where $w_{p1}$, $w_{p2}$, and $w_{p3}$ are adjustable parameters.
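A short sketch of Eq. (4) shows how the detection probability decays with distance. The function name `detection_probability` and the parameter values are assumptions chosen for illustration; the paper leaves $w_{p1}$, $w_{p2}$, and $w_{p3}$ as tunable.

```python
import math

def detection_probability(dist, w_p1, w_p2, w_p3):
    """Eq. (4): detection probability of a grid at distance dist from the UAV."""
    return math.exp(-w_p1 * (dist / w_p2) ** w_p3)

# Hypothetical parameters: w_p2 acts as a length scale in metres.
p_center = detection_probability(0.0, w_p1=1.0, w_p2=100.0, w_p3=2.0)
p_edge = detection_probability(100.0, w_p1=1.0, w_p2=100.0, w_p3=2.0)
```

With these values the probability is 1 directly beneath the UAV and falls to $e^{-1}$ at a distance equal to `w_p2`.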

Considering the successful detection probability $\gamma_i^d(k,m,n)$, the visiting requirement of grid $g(m,n)$ is:

$$
\lambda_i^{m,n}(k)=\lambda_i^{m,n}(k-1)\left(1-\gamma_i^d(k-1,m,n)\right) \tag{5}
$$

We can also use an S-curve function to express the visiting requirement $\lambda_i^{m,n}(k)$ of grid $g(m,n)$ in terms of its equivalent visiting time:

$$
\lambda_i^{m,n}(k)=1-e^{-\alpha\left[\left(t_k-t_i^{m,n}(k)\right)/T_c\right]^{\beta}} \tag{6}
$$

where $\alpha$ and $\beta$ are curve parameters and $T_c$ is the revisit time threshold.

According to Eqs. (3)-(6), we have:

$$
\Delta t_i^{m,n}(k)=T_c\left[-\frac{1}{\alpha}\ln\left[1-\left(1-e^{-\alpha\left[\left(t_{k-1}-t_i^{m,n}(k-1)\right)/T_c\right]^{\beta}}\right)\left(1-\gamma_i^d(k-1,m,n)\right)\right]\right]^{1/\beta} \tag{7}
$$
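The chain from Eqs. (5) and (6) to Eq. (7) can be checked numerically with the following sketch. The helper names `visiting_requirement` and `equivalent_elapsed` and the parameter values are illustrative assumptions; `prev_elapsed` stands for $t_{k-1}-t_i^{m,n}(k-1)$ and the returned value for $\Delta t_i^{m,n}(k)$.

```python
import math

def visiting_requirement(elapsed, alpha, beta, T_c):
    """Eq. (6): visiting requirement after `elapsed` equivalent time units."""
    return 1.0 - math.exp(-alpha * (elapsed / T_c) ** beta)

def equivalent_elapsed(prev_elapsed, gamma, alpha, beta, T_c):
    """Eq. (7): equivalent elapsed time after a visit with detection probability gamma."""
    lam_prev = visiting_requirement(prev_elapsed, alpha, beta, T_c)
    lam_new = lam_prev * (1.0 - gamma)                 # Eq. (5)
    return T_c * (-math.log(1.0 - lam_new) / alpha) ** (1.0 / beta)

alpha, beta, T_c = 1.0, 2.0, 100.0                     # hypothetical curve parameters
dt_missed = equivalent_elapsed(80.0, 0.0, alpha, beta, T_c)
dt_half = equivalent_elapsed(80.0, 0.5, alpha, beta, T_c)
```

A visit with zero detection probability leaves the equivalent elapsed time unchanged, while a visit with probability 0.5 halves the visiting requirement and therefore moves the equivalent visiting time closer to the current time.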

3.1.2 Target information table

In our previous work bib34 , each UAV maintains an information table to store its current estimated states of the targets. We define the information for a single target at time step $k$ as the following tuple:

$$
\{\hat{s}_{i,j}(k),\hat{P}_{i,j}(k),\hat{\rho}_{i,j}(k),\overline{s}_{i,j}(k),\overline{P}_{i,j}(k)\} \tag{8}
$$

$\hat{s}_{i,j}$ and $\hat{P}_{i,j}$ are the local estimated state and the error covariance matrix of target $\mathcal{T}_j$, obtained through the local Kalman filter of $\mathcal{U}_i$. The perceptual confidence $\hat{\rho}_{i,j}$ quantifies the accuracy of the UAV's target state estimate and is set as the reciprocal of the trace of the covariance matrix:

$$
\hat{\rho}_{i,j}(k):=\{Trace(\hat{P}_{i,j}(k))\}^{-1} \tag{9}
$$

Besides, $\overline{s}_{i,j}$ and $\overline{P}_{i,j}$ are the corresponding estimated state and error covariance matrix after information fusion. Therefore, we can use $\{\{\hat{s}_{i,j},\hat{P}_{i,j},\hat{\rho}_{i,j}\}\mid j\in I_\tau\}$ to denote the local detection information of $\mathcal{U}_i$ about all targets.

3.2 Distributed information fusion

In the distributed information fusion process, each UAV exchanges information with its neighbors to update the local information. The coverage time map, the perceptual confidence value, and the corresponding estimated state and error covariance matrix are propagated over the network in a finite time. Through limited updates with max-consensus protocol, the perceptual information of all UAVs in the connected network achieves consistency.

3.2.1 Communication topology

When performing the area coverage and target tracking task, UAVs are constantly moving. Thus the communication topology of the multi-UAV system is time-varying. Depending on the spatial distribution and communication radius of UAVs, several communication topologies may emerge as shown in Fig. 4.

  (a) The connected communication topology;

  (b) The communication topology that is divided into several isolated connected sub-topologies;

  (c) The communication topology in which no nodes can communicate with each other.

Figure 4: The communication topologies of the multi-UAV system.

To simplify the analysis, the communication topology is assumed to remain constant during each time step. Only the UAVs within the same connected communication topology need to exchange and fuse information to make the detection information achieve consistency.

3.2.2 The compression and extraction of EVTM

To reduce the interactive information, we compress and extract the visiting time map to get a compressed time map and a local time map, representing the global coverage information and local coverage information, respectively.

Figure 5: The schematic of the compression and extraction of EVTM.

Fig. 5 shows the schematic of the compression and extraction of the EVTM. Assuming that $\mathcal{U}_i$ is located at grid $g(m_i,n_i)$, the compressed time map $\hat{T}_i^g\in\mathbb{R}^{3\times 3}$ is obtained by averaging the equivalent visiting times of the grids around $g(m_i,n_i)$ in the EVTM $\hat{T}_i$:

$$
\begin{aligned}
\hat{T}_i^g&=compress(\hat{T}_i,(m_i,n_i))\\
&=\begin{pmatrix}
\frac{\sum_{m'=m_i+1}^{M}\sum_{n'=1}^{n_i-1}\hat{t}_i^{m',n'}}{(M-m_i)(n_i-1)}&\frac{\sum_{m'=m_i+1}^{M}\sum_{n'=1}^{N}\hat{t}_i^{m',n'}}{(M-m_i)N}&\frac{\sum_{m'=m_i+1}^{M}\sum_{n'=n_i+1}^{N}\hat{t}_i^{m',n'}}{(M-m_i)(N-n_i)}\\
\frac{\sum_{m'=1}^{M}\sum_{n'=1}^{n_i-1}\hat{t}_i^{m',n'}}{M(n_i-1)}&\hat{t}_i^{m_i,n_i}&\frac{\sum_{m'=1}^{M}\sum_{n'=n_i+1}^{N}\hat{t}_i^{m',n'}}{M(N-n_i)}\\
\frac{\sum_{m'=1}^{m_i-1}\sum_{n'=1}^{n_i-1}\hat{t}_i^{m',n'}}{(m_i-1)(n_i-1)}&\frac{\sum_{m'=1}^{m_i-1}\sum_{n'=1}^{N}\hat{t}_i^{m',n'}}{(m_i-1)N}&\frac{\sum_{m'=1}^{m_i-1}\sum_{n'=n_i+1}^{N}\hat{t}_i^{m',n'}}{(m_i-1)(N-n_i)}
\end{pmatrix}
\end{aligned} \tag{10}
$$

where $compress(\cdot)$ is the compression function of the equivalent visiting time map, and $m'$ and $n'$ are the row and column indices of the summed grids.

The local equivalent time map $\hat{T}_i^l\in\mathbb{R}^{L\times L}$ is obtained by extracting the $L\times L$ grids centered on the UAV's location $g(m_i,n_i)$ from the EVTM $\hat{T}_i$. The extraction function $extract(\cdot)$ is as follows:

$$
\begin{aligned}
\hat{T}_i^l&=extract(\hat{T}_i,(m_i,n_i))\\
&=\{\hat{t}_i^{m,n}\mid |m-m_i|\leq(L-1)/2,\ |n-n_i|\leq(L-1)/2\}
\end{aligned} \tag{11}
$$

where $L$ is a positive odd number.
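The compression of Eq. (10) and the extraction of Eq. (11) can be sketched in plain Python as follows. The code uses 0-based indices, drops window cells that fall outside the map, and returns `None` for an empty averaging block (for example when the UAV is in the first row); these boundary conventions, like the function names and the test map, are assumptions that the paper does not specify.

```python
def extract(T, m_i, n_i, L):
    """Eq. (11): the L x L window of T centred on (m_i, n_i), as {(m, n): time}."""
    h = (L - 1) // 2
    M, N = len(T), len(T[0])
    return {(m, n): T[m][n]
            for m in range(max(0, m_i - h), min(M, m_i + h + 1))
            for n in range(max(0, n_i - h), min(N, n_i + h + 1))}

def compress(T, m_i, n_i):
    """Eq. (10): 3 x 3 map of block averages around the UAV's grid (m_i, n_i)."""
    M, N = len(T), len(T[0])

    def mean(rows, cols):
        vals = [T[m][n] for m in rows for n in cols]
        return sum(vals) / len(vals) if vals else None

    above, below = range(m_i + 1, M), range(0, m_i)   # rows beyond / before m_i
    left, right = range(0, n_i), range(n_i + 1, N)    # columns before / beyond n_i
    every_row, every_col = range(M), range(N)
    return [
        [mean(above, left), mean(above, every_col), mean(above, right)],
        [mean(every_row, left), T[m_i][n_i], mean(every_row, right)],
        [mean(below, left), mean(below, every_col), mean(below, right)],
    ]

# Hypothetical 5 x 5 EVTM whose entry at (m, n) is 10*m + n, UAV at the centre.
T = [[10 * m + n for n in range(5)] for m in range(5)]
local = extract(T, 2, 2, L=3)
G = compress(T, 2, 2)
```

The local map always holds at most $L\times L$ entries, regardless of how large `T` is.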

Table 1 shows the message passed from a neighboring UAV $\mathcal{U}_q$ to $\mathcal{U}_i$, in which $\hat{T}_q^l$ is the local equivalent time map of $\mathcal{U}_q$. Such information interaction allows our coverage algorithm to extend to surveillance tasks with arbitrarily large mission areas.

Table 1: Information interaction
Information type | Information content
Status information | $s_q(k)=(x_q(k),y_q(k),v_q(k),\eta_q(k))$
Area detection information | $\hat{T}_q^l(k)$
Target detection information | $\{\{\hat{s}_{q,j}(k),\hat{P}_{q,j}(k),\hat{\rho}_{q,j}(k)\}\mid j\in I_\tau\}$

Remark 1.

Regardless of the size of the area, the information exchanged between UAVs is always a map with $L\times L$ grids. Hence, for an arbitrarily large detection area whose area information map has $M\times N$ grids, the exchanged map is only $(L\times L)/(M\times N)$ of the size of the original global map.

3.2.3 The distributed information fusion based on maximum consistency

In order to get consistent information before the subsequent decision-making process, we define a new sampling time denoted by index dd with a higher frequency in the information fusion phase. Moreover, to express the fusion process more clearly, a group of temporary variables initialized with the information at time step kk (See Table 2) is defined.

Table 2: Temporary variables
Temporary variable | Initialization
$\hat{T^{\prime}}_{q}(d)$ | $\hat{T^{\prime}}_{q}(0)=\hat{T}_{q}(k)$
$\hat{T^{\prime}}_{q}^{l}(d)$ | $\hat{T^{\prime}}_{q}^{l}(0)=extract(\hat{T^{\prime}}_{q}(0),(m_q,n_q))$
$\{\hat{s^{\prime}}_{q,j}(d),\hat{P^{\prime}}_{q,j}(d),\hat{\rho^{\prime}}_{q,j}(d)\}_{j\in I_\tau}$ | $\{\hat{s^{\prime}}_{q,j}(0),\hat{P^{\prime}}_{q,j}(0),\hat{\rho^{\prime}}_{q,j}(0)\}=\{\hat{s}_{q,j}(k),\hat{P}_{q,j}(k),\hat{\rho}_{q,j}(k)\},\forall j\in I_\tau$

The detection information from neighbors can then be fused based on the max-consensus protocol. Specifically, the equivalent visiting time map $\hat{T^{\prime}}_{i}(d)$ is updated with the received EVTM as follows:

$$
\hat{t^{\prime}}_{i}^{m,n}(d)=\max_{u\in\mathscr{V}_i^n}\{\hat{t^{\prime}}_{u}^{m,n}(d-1)\} \tag{12}
$$

The update rule of the perceptual confidence is:

$$
\hat{\rho}^{\prime}_{i,j}(d)=\max_{u\in\mathscr{V}_i^n}\{\hat{\rho}^{\prime}_{u,j}(d-1)\},\quad j\in I_\tau \tag{13}
$$
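One synchronous round of the updates (12) and (13) might look like the sketch below. It assumes that each neighbor list contains the UAV itself, that maps are stored as dictionaries from grid cell to equivalent visiting time, and that each table entry is a tuple of confidence, state, and covariance; these data layouts and the name `fuse_round` are illustrative, not taken from the paper.

```python
def fuse_round(maps, tables, neighbors):
    """Apply one max-consensus round to every UAV at once.

    maps[u]      : {cell: equivalent visiting time} known to UAV u
    tables[u]    : {target j: (rho, state, covariance)} held by UAV u
    neighbors[i] : indices in V_i^n, which includes i itself
    """
    new_maps, new_tables = {}, {}
    for i, nbrs in neighbors.items():
        # Eq. (12): keep the latest equivalent visiting time per cell.
        cells = set().union(*(maps[u].keys() for u in nbrs))
        new_maps[i] = {c: max(maps[u][c] for u in nbrs if c in maps[u])
                       for c in cells}
        # Eq. (13): adopt the estimate of the most confident neighbor.
        targets = set().union(*(tables[u].keys() for u in nbrs))
        new_tables[i] = {}
        for j in targets:
            holders = [u for u in nbrs if j in tables[u]]
            best = max(holders, key=lambda u: tables[u][j][0])
            new_tables[i][j] = tables[best][j]
    return new_maps, new_tables

# Two hypothetical UAVs within range of each other, one shared target 0.
maps = {1: {(0, 0): 5.0, (0, 1): 2.0}, 2: {(0, 0): 3.0, (0, 1): 7.0}}
tables = {1: {0: (0.5, "s1", "P1")}, 2: {0: (2.0, "s2", "P2")}}
neighbors = {1: [1, 2], 2: [1, 2]}
fused_maps, fused_tables = fuse_round(maps, tables, neighbors)
```

After this round both UAVs hold the same map and the same target estimate, namely the one reported with the higher confidence.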

According to the general definition of maximum consistency in bib52 , we give the concept of maximum consistency of the detection information in our area coverage and target tracking system.

Definition 1 (Maximum consistency of the detection information).

Consider the undirected communication subgraph $\mathscr{G}_i=\{\mathscr{V}_i,\mathscr{E}_i\}$ in which $\mathcal{U}_i$ is located, the initial value $\hat{t^{\prime}}_{u}^{m,n}(0)$ of the equivalent visiting time map, and the initial value $\hat{\rho}^{\prime}_{u,j}(0)$ of the perceptual confidence of each UAV in the subgraph, with the EVTM and the perceptual confidence updated by (12) and (13), respectively. The system achieves maximum consistency if there exists $\delta\in\mathbb{N}$ such that:

$$
\hat{\rho}^{\prime}_{i,j}(d)=\hat{\rho}^{\prime}_{q,j}(d)=\max_{u\in\mathscr{V}_i}\{\hat{\rho}^{\prime}_{u,j}(0)\},\quad\forall j\in I_\tau,\ \forall d\geq\delta,\ \forall i,q\in\mathscr{V}_i \tag{14}
$$

$$
\hat{t^{\prime}}_{i}^{m,n}(d)=\hat{t^{\prime}}_{q}^{m,n}(d)=\max_{u\in\mathscr{V}_i}\{\hat{t^{\prime}}_{u}^{m,n}(0)\},\quad\forall d\geq\delta,\ \forall i,q\in\mathscr{V}_i,\ \forall m\leq M,\ \forall n\leq N \tag{15}
$$

where $\delta$ is the lower bound on the number of iterations needed to achieve maximum consistency in the whole connected graph $\mathscr{G}_i$.

According to the termination condition of the maximum consistency algorithm given in Theorem 1 of bib34 , we directly set the number of iterations of the $i$th UAV to ${\mathcal{D}}_i$, which is the diameter of the shortest path tree (SPT) of $\mathscr{G}_i$ rooted at node $i$. Algorithm 1 gives the information fusion process of the $i$th UAV. After the information fusion phase, we obtain the joint estimation states of multi-targets (JESMT) and the joint EVTM of the mission area.
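The iteration count can be illustrated on a small graph. The sketch below builds a breadth-first tree rooted at one node, takes the longest path in that tree as one possible reading of the SPT diameter, and runs scalar max-consensus for that many rounds. The unweighted hop metric, the function names, and the example graph are assumptions; Theorem 1 of bib34 is not reproduced here.

```python
from collections import deque

def bfs_tree(adj, root):
    """Parent map of a shortest-path (BFS) tree of an unweighted graph."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in sorted(adj[u]):
            if w not in parent:
                parent[w] = u
                queue.append(w)
    return parent

def tree_diameter(parent):
    """Longest path length (in hops) of the tree described by `parent`."""
    tree = {u: set() for u in parent}
    for u, p in parent.items():
        if p is not None:
            tree[u].add(p)
            tree[p].add(u)

    def farthest(src):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in tree[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        end = max(dist, key=dist.get)
        return end, dist[end]

    end, _ = farthest(next(iter(tree)))   # first sweep finds one end of a longest path
    return farthest(end)[1]               # second sweep measures it

def max_consensus(adj, values, rounds):
    """Run `rounds` synchronous max-consensus updates on scalar values."""
    for _ in range(rounds):
        values = {u: max([values[u]] + [values[w] for w in adj[u]]) for u in adj}
    return values

# Hypothetical connected component: a chain of four UAVs.
adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
D_2 = tree_diameter(bfs_tree(adj, 2))
final = max_consensus(adj, {1: 0.2, 2: 0.1, 3: 0.5, 4: 0.9}, D_2)
```

On this chain the tree rooted at node 2 has diameter 3, and three rounds are exactly what the largest value needs to travel from node 4 to node 1.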

Algorithm 1 Information fusion of the $i$th UAV at time step $k$
1: Input: $T_i(k-1)$, $\{\{\overline{s}_{i,j}(k-1),\overline{P}_{i,j}(k-1)\}\mid j\in I_\tau\}$
2: Output: $T_i(k)$, $T_i^l(k)$, $T_i^g(k)$, $\{\{\overline{s}_{i,j}(k),\overline{P}_{i,j}(k)\}\mid j\in I_\tau\}$
3: for $j=1$ to $N_\tau$ do
4:     Calculate $\{\hat{s}_{i,j}(k),\hat{P}_{i,j}(k)\}$ by the Kalman filter according to $\{\overline{s}_{i,j}(k-1),\overline{P}_{i,j}(k-1)\}$;
5:     $\hat{\rho}_{i,j}(k):=\{Trace(\hat{P}_{i,j}(k))\}^{-1}$;
6: end for
7: Get $\hat{T}_i(k)$ from $T_i(k-1)$ by (3);
8: Calculate $\hat{T}_i^l(k)$ according to (11);
9: Let $\{\hat{s^{\prime}}_{i,j}(0),\hat{P^{\prime}}_{i,j}(0),\hat{\rho^{\prime}}_{i,j}(0)\}=\{\hat{s}_{i,j}(k),\hat{P}_{i,j}(k),\hat{\rho}_{i,j}(k)\},\forall j\in I_\tau$, $\hat{T^{\prime}}_{i}(0)=\hat{T}_i(k)$, and $\hat{T^{\prime}}_{i}^{l}(0)=\hat{T}_i^l(k)$;
10: for $d=1$ to ${\mathcal{D}}_i$ do
11:     Send $\{\{\hat{s^{\prime}}_{i,j}(d-1),\hat{P^{\prime}}_{i,j}(d-1),\hat{\rho^{\prime}}_{i,j}(d-1)\}\mid\forall j\in I_\tau\}$, $\hat{T^{\prime}}_{i}^{l}(d-1)$;
12:     Receive $\{\{\hat{s^{\prime}}_{u,j}(d-1),\hat{P^{\prime}}_{u,j}(d-1),\hat{\rho^{\prime}}_{u,j}(d-1)\}\mid\forall j\in I_\tau\}$, $\hat{T^{\prime}}_{u}^{l}(d-1)$, for $\forall u\in\mathscr{V}_i^n$;
13:     Update $\hat{T^{\prime}}_{i}$ according to (12):
14:     $q\leftarrow\arg\max_{u\in\mathscr{V}_i^n}\hat{t^{\prime}}_{u}^{m,n}(d-1)$, $\hat{t^{\prime}}_{i}^{m,n}(d)=\hat{t^{\prime}}_{q}^{m,n}(d-1)$;
15:     for $j=1$ to $N_\tau$ do
16:         $l\leftarrow\arg\max_{u\in\mathscr{V}_i^n}\hat{\rho^{\prime}}_{u,j}(d-1)$;
17:         $\{\hat{s^{\prime}}_{i,j}(d),\hat{P^{\prime}}_{i,j}(d),\hat{\rho^{\prime}}_{i,j}(d)\}=\{\hat{s^{\prime}}_{l,j}(d-1),\hat{P^{\prime}}_{l,j}(d-1),\hat{\rho^{\prime}}_{l,j}(d-1)\}$;
18:     end for
19:     $\hat{T^{\prime}}_{i}^{l}(d)=extract(\hat{T^{\prime}}_{i}(d),(m_i,n_i))$;
20: end for
21: $T_i(k)=\hat{T^{\prime}}_{i}({\mathcal{D}}_i)$, $T_i^l(k)=\hat{T^{\prime}}_{i}^{l}({\mathcal{D}}_i)$, $T_i^g(k)=compress(\hat{T}_i(k),(m_i,n_i))$, $\{\overline{s}_{i,j}(k),\overline{P}_{i,j}(k)\}=\{\hat{s^{\prime}}_{i,j}({\mathcal{D}}_i),\hat{P^{\prime}}_{i,j}({\mathcal{D}}_i)\}$, $\forall j\in I_\tau$.
Remark 2.

It is worth pointing out that the information fusion strategy in bib34 can only be used for the target tracking task. By contrast, Algorithm 1 in this paper adds the compression, extraction, and fusion of the coverage time map, so that it serves both the area coverage task and the target tracking task.
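
The max-consensus loop of Algorithm 1 (steps 10 to 20) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `max_consensus`, the list-based data layout, and the use of one common round count for all nodes are assumptions made for illustration, and each node is assumed to compare its own value with those of its neighbors. In every round a node adopts, for each target, the estimate with the largest confidence $\hat{\rho}$ seen so far.

```python
def max_consensus(neighbors, rho, state, rounds):
    """Synchronous max-consensus over a fixed communication graph.

    neighbors[i]: ids of the neighbors of node i (node i itself is added below)
    rho[i][j]:    confidence of node i about target j (larger is better)
    state[i][j]:  estimate of node i for target j (any payload)
    rounds:       number of exchange rounds (hypothetical common value)
    """
    n = len(neighbors)
    rho = [list(r) for r in rho]
    state = [list(s) for s in state]
    for _ in range(rounds):
        new_rho = [list(r) for r in rho]
        new_state = [list(s) for s in state]
        for i in range(n):
            for j in range(len(rho[i])):
                # Pick the most confident estimate among node i and its
                # neighbors, using the values from the previous round.
                best = max([i] + list(neighbors[i]), key=lambda u: rho[u][j])
                new_rho[i][j] = rho[best][j]
                new_state[i][j] = state[best][j]
        rho, state = new_rho, new_state
    return rho, state


# Line graph 0 - 1 - 2 with a single target; node 2 holds the best estimate.
neighbors = [[1], [0, 2], [1]]
rho0 = [[0.1], [0.5], [0.9]]
state0 = [["a"], ["b"], ["c"]]
rho_final, state_final = max_consensus(neighbors, rho0, state0, rounds=2)
```

On this three-node line graph, two rounds (the graph diameter) are enough for every node to hold the estimate of node 2, whereas after a single round node 0 has only reached the estimate of node 1.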

4 Task assignment based on minimum-cost maximum-flow method

For the task allocation problem of multi-UAV search and tracking of multiple targets formulated in (2), we assume that each target $\mathcal{T}_{j}$ should be tracked by at least one UAV and at most $n_{j}$ UAVs, while each UAV can track at most one target. Using network flow theory, this task allocation problem can be modeled as a network flow model with upper and lower flow bounds.

4.1 Related concepts of network flow

We first give some definitions related to network flow problem bib49 ; bib50 .

  1) Capacity Network: Given a directed graph $\mathcal{G}=(\mathcal{V},\mathcal{A})$ and the capacity $c(\nu_{i},\nu_{j})$ of each arc $a_{ij}=(\nu_{i},\nu_{j})$, $(\mathcal{V},\mathcal{A},c(a_{ij}))$ is called the capacity network.

  2) Source, Sink, and Intermediate Vertices: In a capacity network, the source node $\nu_{s}$ has in-degree zero, the sink node $\nu_{t}$ has out-degree zero, and the other nodes are called intermediate vertices.

  3) Network Flow: A function $f:\mathcal{A}\to\mathcal{R}$ defined from the arc set $\mathcal{A}$ to the nonnegative number set is called the network flow on $\mathcal{G}$, while $f_{ij}=f(\nu_{i},\nu_{j})$ is the flow on arc $a_{ij}$.

  4) Feasible Flow: A feasible flow is a network flow $f$ from $\nu_{s}$ to $\nu_{t}$ that simultaneously satisfies the capacity constraints (16) and the conservation constraints (17):

    $$0\leq f(\nu_{i},\nu_{j})\leq c(\nu_{i},\nu_{j}),\quad\forall(\nu_{i},\nu_{j})\in\mathcal{A} \tag{16}$$

    $$\sum\nolimits_{(\nu_{i},\nu_{j})\in\mathcal{A}}f_{ij}=\sum\nolimits_{(\nu_{j},\nu_{i})\in\mathcal{A}}f_{ji},\quad\forall\nu_{i}\in\mathcal{V}\backslash\{\nu_{s},\nu_{t}\} \tag{17}$$

    The value of a feasible flow is defined as:

    $$|f|=\sum\nolimits_{\nu_{j}\in\mathcal{V}}f_{sj}=\sum\nolimits_{\nu_{j}\in\mathcal{V}}f_{jt} \tag{18}$$

  5) Upper and Lower Bounds of Flow: For each arc $(\nu_{i},\nu_{j})$ of the directed graph $\mathcal{G}=(\mathcal{V},\mathcal{A})$, the capacity constraints (16) can be further extended with the upper flow bound $upper_{ij}$ and the lower flow bound $lower_{ij}$.

  6) Minimum-cost maximum-flow: Given a capacity network $(\mathcal{V},\mathcal{A},c(a_{ij}))$, the cost of a unit of flow on arc $a_{ij}$ is denoted as $l(\nu_{i},\nu_{j})$. The minimum-cost maximum-flow problem is to find a feasible flow $f:\mathcal{A}\to\mathcal{R}$ such that

    $$\min\sum_{(\nu_{i},\nu_{j})\in\mathcal{A}}l_{ij}f_{ij}\quad\&\quad\max|f| \tag{19}$$
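
As a small illustration of definitions 4) and 6), the following sketch checks the capacity constraints (16) and the conservation constraints (17) for a given flow and evaluates its value (18) and its total cost. The dictionary layout keyed by arc `(tail, head)` and the function names are assumptions introduced here for illustration only.

```python
def is_feasible(flow, cap, source, sink):
    """Check constraints (16) and (17) for a flow given as {(u, v): value}."""
    # Capacity constraints (16): 0 <= f <= c on every arc.
    if any(flow[a] < 0 or flow[a] > cap[a] for a in cap):
        return False
    # Conservation constraints (17) at every intermediate vertex.
    nodes = {v for arc in cap for v in arc}
    for v in nodes - {source, sink}:
        inflow = sum(f for (_, head), f in flow.items() if head == v)
        outflow = sum(f for (tail, _), f in flow.items() if tail == v)
        if inflow != outflow:
            return False
    return True


def flow_value(flow, source):
    """Value |f| of the flow, measured at the source as in (18)."""
    return sum(f for (tail, _), f in flow.items() if tail == source)


def flow_cost(flow, unit_cost):
    """Total cost: sum of l_ij * f_ij over all arcs, the first objective in (19)."""
    return sum(unit_cost[a] * f for a, f in flow.items())


cap = {("s", "a"): 2, ("a", "t"): 1, ("a", "b"): 1, ("b", "t"): 1}
unit_cost = {("s", "a"): 0, ("a", "t"): -3, ("a", "b"): 0, ("b", "t"): -1}
flow = {("s", "a"): 2, ("a", "t"): 1, ("a", "b"): 1, ("b", "t"): 1}
```

In this toy network the flow saturates every arc, so it is feasible, has value 2, and has total cost -4.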

4.2 Network flow model for multi-UAV task assignment

According to the concepts of network flow introduced in the previous section, the task assignment problem of multi-UAV searching and tracking multi-target can be modeled as a network flow model with upper and lower flow bounds.

Figure 6: Graphs consisting of UAV vertices and target vertices.

We first build a graph $\mathcal{G}(\mathcal{V},\mathcal{A})$ (see Fig. 6) with the UAV vertices $\{\nu_{\mathcal{U}_{i}}\}_{i\in I_{u}}$, the target vertices $\{\nu_{\mathcal{T}_{j}}\}_{j\in I_{\tau}}$, and the edges $\{(\nu_{\mathcal{U}_{i}},\nu_{\mathcal{T}_{j}})\mid i\in I_{u},j\in I_{\tau}\}$ connecting each UAV vertex to each target vertex. Then, graph $\mathcal{G}(\mathcal{V}_{st},\mathcal{A}_{st})$ (see Fig. 6) is obtained by adding a source vertex $\nu_{s}$, a sink vertex $\nu_{t}$, and new edges $\{(\nu_{s},\nu_{\mathcal{U}_{i}})\}_{i\in I_{u}}$ and $\{(\nu_{\mathcal{T}_{j}},\nu_{t})\}_{j\in I_{\tau}}$.

Figure 7: The network flow models with upper and lower flow bounds for multi-UAV task assignment according to the relationship between the number of UAVs and targets. (a) $\overline{\mathcal{G}}_{l}(\mathcal{V}_{st},\mathcal{A}_{st})$ for $N_{u}\geq\sum\nolimits_{j=1}^{N_{\tau}}n_{j}$, (b) $\overline{\mathcal{G}}_{m}(\mathcal{V}_{st},\mathcal{A}_{st})$ for $N_{\tau}<N_{u}<\sum\nolimits_{j=1}^{N_{\tau}}n_{j}$, and (c) $\overline{\mathcal{G}}_{s}(\mathcal{V}_{st},\mathcal{A}_{st})$ for $N_{\tau}\geq N_{u}$.

Considering the relationship between the number of UAVs $N_{u}$, the number of targets $N_{\tau}$, and the maximum number of UAVs required for all targets $\sum\nolimits_{j=1}^{N_{\tau}}n_{j}$, graph $\mathcal{G}(\mathcal{V}_{st},\mathcal{A}_{st})$ can be further transformed into a capacity network $\overline{\mathcal{G}}(\mathcal{V}_{st},\mathcal{A}_{st})$ by adding flow bounds (see Fig. 7). According to the optimization objective function (2), our goal is to maximize the task rewards of the multi-UAV system. We therefore set the cost of arc $(\nu_{\mathcal{U}_{i}},\nu_{\mathcal{T}_{j}})$ to $l(\nu_{\mathcal{U}_{i}},\nu_{\mathcal{T}_{j}})=-R_{ij}$ and the cost of all other arcs to 0. The flow bounds and costs for each arc in the transformed network are shown in Table 3.

Table 3: The flow bounds and cost for each arc in $\overline{\mathcal{G}}(\mathcal{V}_{st},\mathcal{A}_{st})$
\begin{tabular}{l|ccc|cc}
Numerical relationship & \multicolumn{3}{c|}{Flow bounds $(upper,lower)$} & \multicolumn{2}{c}{Cost} \\
 & $(\nu_{\mathcal{U}_{i}},\nu_{\mathcal{T}_{j}})$ & $(\nu_{s},\nu_{\mathcal{U}_{i}})$ & $(\nu_{\mathcal{T}_{j}},\nu_{t})$ & $(\nu_{\mathcal{U}_{i}},\nu_{\mathcal{T}_{j}})$ & $\mathcal{A}\backslash(\nu_{\mathcal{U}_{i}},\nu_{\mathcal{T}_{j}})$ \\
\hline
$N_{u}\geq\sum\nolimits_{j=1}^{N_{\tau}}n_{j}$ & $(1,0)$ & $(1,0)$ & $(n_{j},n_{j})$ & $-R_{ij}$ & $0$ \\
$N_{\tau}<N_{u}<\sum\nolimits_{j=1}^{N_{\tau}}n_{j}$ & $(1,0)$ & $(1,1)$ & $(n_{j},1)$ & $-R_{ij}$ & $0$ \\
$N_{\tau}\geq N_{u}$ & $(1,0)$ & $(1,1)$ & $(1,0)$ & $-R_{ij}$ & $0$ \\
\end{tabular}
Definition 2 (Network flow model for multi-UAV task assignment).

Given a graph $\mathcal{G}(\mathcal{V}_{st},\mathcal{A}_{st})$ constructed from all UAV vertices and target vertices, set the upper flow bound of arc $(\nu_{\mathcal{U}_{i}},\nu_{\mathcal{T}_{j}})$ to 1 and the lower flow bound to 0. According to the numerical relationship between $N_{u}$, $N_{\tau}$, and $\sum\nolimits_{j=1}^{N_{\tau}}n_{j}$, add the flow bounds and the cost to arc $(\nu_{s},\nu_{\mathcal{U}_{i}})$ and arc $(\nu_{\mathcal{T}_{j}},\nu_{t})$ following Table 3. The resulting capacity network $\overline{\mathcal{G}}(\mathcal{V}_{st},\mathcal{A}_{st})$ is a network flow model with upper and lower flow bounds for multi-UAV task assignment.

For the transformed network, any feasible flow that satisfies the flow constraints must be a maximum flow. Moreover, when the network cost is minimized, the reward of the corresponding multi-UAV system is maximized. The multi-UAV task assignment problem has thus been transformed into a minimum-cost maximum-flow (MCMF) problem.
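
To make the reduction concrete, the sketch below builds the network with arc costs $-R_{ij}$ and solves it with a generic successive-shortest-path MCMF routine. This is an illustrative stand-in, not the TAMM algorithm: lower flow bounds are not enforced explicitly, so it corresponds to the case $N_{u}\geq\sum_{j}n_{j}$, where a maximum flow already saturates every arc $(\nu_{\mathcal{T}_{j}},\nu_{t})$. The class and function names are hypothetical.

```python
from collections import deque


class MinCostMaxFlow:
    """Successive shortest paths with a queue-based Bellman-Ford search."""

    def __init__(self, n):
        self.n = n
        self.adj = [[] for _ in range(n)]
        self.to, self.cap, self.cost = [], [], []

    def add_edge(self, u, v, cap, cost):
        # Forward and residual arcs are stored at paired indices (e, e ^ 1).
        self.adj[u].append(len(self.to))
        self.to.append(v); self.cap.append(cap); self.cost.append(cost)
        self.adj[v].append(len(self.to))
        self.to.append(u); self.cap.append(0); self.cost.append(-cost)

    def solve(self, s, t):
        flow, total = 0, 0.0
        while True:
            dist = [float("inf")] * self.n
            prev = [-1] * self.n
            queued = [False] * self.n
            dist[s] = 0.0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                queued[u] = False
                for e in self.adj[u]:
                    v = self.to[e]
                    if self.cap[e] > 0 and dist[u] + self.cost[e] < dist[v] - 1e-12:
                        dist[v] = dist[u] + self.cost[e]
                        prev[v] = e
                        if not queued[v]:
                            queued[v] = True
                            queue.append(v)
            if prev[t] == -1:          # no augmenting path left
                return flow, total
            push, v = float("inf"), t  # bottleneck capacity on the path
            while v != s:
                push = min(push, self.cap[prev[v]])
                v = self.to[prev[v] ^ 1]
            v = t
            while v != s:              # augment along the path
                self.cap[prev[v]] -= push
                self.cap[prev[v] ^ 1] += push
                v = self.to[prev[v] ^ 1]
            flow += push
            total += push * dist[t]


def assign(R, n_max):
    """R[i][j]: reward of UAV i for target j; n_max[j]: max UAVs on target j.

    Returns task[i] = j + 1 for tracking target j, or 0 for area coverage.
    """
    n_u, n_t = len(R), len(R[0])
    s, t = n_u + n_t, n_u + n_t + 1
    net = MinCostMaxFlow(n_u + n_t + 2)
    for i in range(n_u):
        net.add_edge(s, i, 1, 0.0)
    arc = {}
    for i in range(n_u):
        for j in range(n_t):
            arc[(i, j)] = len(net.to)
            net.add_edge(i, n_u + j, 1, -R[i][j])   # cost is the negated reward
    for j in range(n_t):
        net.add_edge(n_u + j, t, n_max[j], 0.0)
    net.solve(s, t)
    task = [0] * n_u
    for (i, j), e in arc.items():
        if net.cap[e] == 0:            # unit arc carries flow: UAV i tracks j
            task[i] = j + 1
    return task


tasks = assign([[5, 1], [4, 3], [1, 1]], [1, 1])
```

With three UAVs and two targets that each accept one tracker, the first UAV tracks the first target, the second UAV tracks the second target, and the third UAV receives no target and is left for the coverage task.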

4.3 Task assignment minimum-cost maximum-flow algorithm

The traditional MCMF algorithm addresses the case where there is no lower flow bound. Therefore, we further convert the network obtained in Section 4.2 and design an algorithm named task assignment minimum-cost maximum-flow (TAMM) for our multi-UAV task assignment problem. The detailed TAMM algorithm is given in Algorithm 2. To describe the algorithm more clearly, we first define $f_{in}^{i}$ as the sum of the lower flow bounds of all arcs flowing into vertex $\nu_{i}$ and $f_{out}^{i}$ as the sum of the lower flow bounds of all arcs flowing out of $\nu_{i}$.

Algorithm 2 Task assignment minimum-cost maximum-flow algorithm
1: Input: Task reward matrix $R$ ($R$ is calculated in Section 5), network $\overline{\mathcal{G}}(\mathcal{V}_{st},\mathcal{A}_{st})$ ($\overline{\mathcal{G}}$ is obtained by Definition 2)
2: Output: The allocated task $\mathcal{T}_{j}$ for each UAV
3: Add an arc $(\nu_{t},\nu_{s})$ with $upper_{ts}=+\infty$ and $lower_{ts}=0$, a source vertex $\nu_{s^{\prime}}$, and a sink vertex $\nu_{t^{\prime}}$;
4: The current edge set is $\mathcal{A}_{s^{\prime}t^{\prime}}=\mathcal{A}_{st}\cup\{(\nu_{t},\nu_{s})\}$ and the current vertex set is $\mathcal{V}_{s^{\prime}t^{\prime}}=\mathcal{V}_{st}\cup\{\nu_{t^{\prime}},\nu_{s^{\prime}}\}$;
5: for $\forall\nu_{i}\in\mathcal{V}_{st}$ do
6:     if $f_{in}^{i}\geq f_{out}^{i}$ then
7:         For $\forall\nu_{q}\in\mathcal{V}_{s^{\prime}t^{\prime}}$ with $lower_{qi}>0$, let $upper_{qi}=upper_{qi}-lower_{qi}$ and $lower_{qi}=0$;
8:         Add an arc $(\nu_{s^{\prime}},\nu_{i})$ with $upper_{s^{\prime}i}=f_{in}^{i}-f_{out}^{i}$ and $lower_{s^{\prime}i}=0$. Then $\mathcal{A}_{s^{\prime}t^{\prime}}=\mathcal{A}_{s^{\prime}t^{\prime}}\cup\{(\nu_{s^{\prime}},\nu_{i})\}$;
9:     else
10:         For $\forall\nu_{p}\in\mathcal{V}_{s^{\prime}t^{\prime}}$ with $lower_{ip}>0$, let $upper_{ip}=upper_{ip}-lower_{ip}$ and $lower_{ip}=0$;
11:         Add an arc $(\nu_{i},\nu_{t^{\prime}})$ with $upper_{it^{\prime}}=f_{out}^{i}-f_{in}^{i}$ and $lower_{it^{\prime}}=0$. Then $\mathcal{A}_{s^{\prime}t^{\prime}}=\mathcal{A}_{s^{\prime}t^{\prime}}\cup\{(\nu_{i},\nu_{t^{\prime}})\}$;
12:     end if
13: end for
14: For $\forall(\nu_{i},\nu_{j})\in\mathcal{A}_{s^{\prime}t^{\prime}}$, delete $(\nu_{i},\nu_{j})$ if $upper_{ij}=0$ and $lower_{ij}=0$;
15: for $\forall\nu_{i}\in\mathcal{V}_{s^{\prime}t^{\prime}}$ do
16:     if the in-degree and out-degree of $\nu_{i}$ are both 1 then
17:         Find $(\nu_{j},\nu_{i}),(\nu_{i},\nu_{k})\in\mathcal{A}_{s^{\prime}t^{\prime}}$, then delete $\nu_{i}$, $(\nu_{j},\nu_{i})$, and $(\nu_{i},\nu_{k})$;
18:         Add the arc $(\nu_{j},\nu_{k})$ with $c(\nu_{j},\nu_{k})=\max\{c(\nu_{j},\nu_{i}),c(\nu_{i},\nu_{k})\}$;
19:     end if
20: end for
21: Denote the current network as $\hat{\mathcal{G}}(\mathcal{V}_{s^{\prime}t^{\prime}},\mathcal{A}_{s^{\prime}t^{\prime}})$;
22: Run a minimum-cost maximum-flow method to find the MCMF $f_{mcmf}$ of $\hat{\mathcal{G}}(\mathcal{V}_{s^{\prime}t^{\prime}},\mathcal{A}_{s^{\prime}t^{\prime}})$;
23: for $\forall i\in I_{u}$ do
24:     if $\exists j\in I_{\tau}$ such that $f_{\mathcal{U}_{i}\mathcal{T}_{j}}=1$ in $f_{mcmf}$ then
25:         $\mathcal{U}_{i}$ is assigned to track target $\mathcal{T}_{j}$;
26:     end if
27:     if $\sum_{j\in I_{\tau}}f_{\mathcal{U}_{i}\mathcal{T}_{j}}=0$ in $f_{mcmf}$ then
28:         $\mathcal{U}_{i}$ is assigned to cover the area. Let $\mathcal{T}_{0}$ denote the area coverage task.
29:     end if
30: end for

The basic idea of TAMM is to convert the network $\overline{\mathcal{G}}(\mathcal{V}_{st},\mathcal{A}_{st})$ with lower flow bounds into the network $\hat{\mathcal{G}}(\mathcal{V}_{s^{\prime}t^{\prime}},\mathcal{A}_{s^{\prime}t^{\prime}})$ without lower flow bounds by adding a new source vertex, a new sink vertex, and the related edges. The detailed conversion process is shown in Fig. 8. The minimum-cost maximum-flow method can then be used to find the MCMF of the converted network. The task assignment results are obtained by mapping the resulting flow to the matching relationship between UAVs and targets.
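
The idea of removing lower flow bounds can be sketched with the textbook construction for circulations with lower bounds. This is an assumption-laden illustration rather than a transcription of Algorithm 2: it subtracts the lower bound on every arc, omits the vertex-merging step, and uses the hypothetical labels `"s'"` and `"t'"` for the added source and sink. A feasible flow of the bounded network exists exactly when the maximum flow from the added source to the added sink equals the returned `required` value.

```python
def remove_lower_bounds(arcs, source, sink):
    """Convert arcs (u, v, lower, upper) into arcs (u, v, capacity).

    Returns the converted arc list and the total capacity leaving the
    added source "s'", which a saturating flow has to reach.
    """
    inf = float("inf")
    # Closing arc from the sink back to the source turns the flow into a circulation.
    arcs = list(arcs) + [(sink, source, 0, inf)]
    excess = {}      # f_in - f_out, computed from the lower bounds only
    converted = []
    for u, v, lower, upper in arcs:
        excess[v] = excess.get(v, 0) + lower
        excess[u] = excess.get(u, 0) - lower
        if upper - lower > 0:            # arcs left with zero capacity are dropped
            converted.append((u, v, upper - lower))
    required = 0
    for node, value in excess.items():
        if value > 0:                    # more forced inflow than outflow
            converted.append(("s'", node, value))
            required += value
        elif value < 0:                  # more forced outflow than inflow
            converted.append((node, "t'", -value))
    return converted, required


# One UAV that must be assigned (bounds 1..1) and one target needing 1..2 UAVs.
arcs = [("s", "U1", 1, 1), ("U1", "T1", 0, 1), ("T1", "t", 1, 2)]
converted, required = remove_lower_bounds(arcs, "s", "t")
```

In this small example the converted network contains the arcs `("s'", "U1", 1)` and `("T1", "t'", 1)`, and a flow of value 2 from `"s'"` to `"t'"` certifies that the UAV can be assigned to the target.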

Figure 8: The detailed conversion process to get network $\hat{\mathcal{G}}(\mathcal{V}_{s^{\prime}t^{\prime}},\mathcal{A}_{s^{\prime}t^{\prime}})$ without lower flow bounds.
Remark 3.

Our task assignment algorithm entails a complexity of $\mathcal{O}(|\mathcal{V}_{s^{\prime}t^{\prime}}||\mathcal{A}_{s^{\prime}t^{\prime}}|)$. Since $|\mathcal{V}_{s^{\prime}t^{\prime}}|=N_{u}+N_{\tau}+3$ and $|\mathcal{A}_{s^{\prime}t^{\prime}}|=N_{u}+N_{u}N_{\tau}+2(N_{\tau}+1)$ in network $\hat{\mathcal{G}}(\mathcal{V}_{s^{\prime}t^{\prime}},\mathcal{A}_{s^{\prime}t^{\prime}})$, the complexity of task assignment is $\mathcal{O}(N_{u}^{2}N_{\tau}+N_{u}N_{\tau}^{2})$. Therefore, the TAMM algorithm has polynomial time complexity and can solve the multi-UAV task assignment problem efficiently.

The MCMF of $\overline{\mathcal{G}}(\mathcal{V}_{st},\mathcal{A}_{st})$ corresponds to the allocation plan that maximizes the task reward. To show that the MCMF of the converted network $\hat{\mathcal{G}}(\mathcal{V}_{s^{\prime}t^{\prime}},\mathcal{A}_{s^{\prime}t^{\prime}})$ also corresponds to the allocation plan with the maximum task reward, we introduce the following theorem.

Theorem 1.

Given the network $\overline{\mathcal{G}}(\mathcal{V}_{st},\mathcal{A}_{st})$ with upper and lower flow bounds, construct the converted network following the steps in Algorithm 2. Then the minimum-cost maximum-flow of the original network $\overline{\mathcal{G}}(\mathcal{V}_{st},\mathcal{A}_{st})$ is also that of the converted network $\hat{\mathcal{G}}(\mathcal{V}_{s^{\prime}t^{\prime}},\mathcal{A}_{s^{\prime}t^{\prime}})$.

The proof of Theorem 1 is based on the basic concepts of network flow mentioned in Section 4.1.

Proof: In the transformation process, the original network is first converted into the no-source and no-sink network $\overline{\mathcal{G}}(\mathcal{V}_{st},\overline{\mathcal{A}}_{st})$ by introducing an arc $(\nu_{t},\nu_{s})$ with the capacity constraint $+\infty$. The minimum-cost maximum-flow of the original network is also that of $\overline{\mathcal{G}}(\mathcal{V}_{st},\overline{\mathcal{A}}_{st})$. After that, each step of the transformation from network $\overline{\mathcal{G}}(\mathcal{V}_{st},\overline{\mathcal{A}}_{st})$ to network $\hat{\mathcal{G}}(\mathcal{V}_{s^{\prime}t^{\prime}},\mathcal{A}_{s^{\prime}t^{\prime}})$ preserves the conservation constraints (17).

Then, if a feasible flow $f$ makes all arcs $(\nu_{s^{\prime}},\nu_{i})$ and $(\nu_{i},\nu_{t^{\prime}})$ in $\hat{\mathcal{G}}(\mathcal{V}_{s^{\prime}t^{\prime}},\mathcal{A}_{s^{\prime}t^{\prime}})$ reach their maximum capacity, this flow must be a feasible flow of network $\overline{\mathcal{G}}(\mathcal{V}_{st},\overline{\mathcal{A}}_{st})$. Conversely, any feasible flow in network $\overline{\mathcal{G}}(\mathcal{V}_{st},\overline{\mathcal{A}}_{st})$ corresponds to a flow for which the arcs $(\nu_{s^{\prime}},\nu_{i})$ and $(\nu_{i},\nu_{t^{\prime}})$ reach their maximum capacity in network $\hat{\mathcal{G}}(\mathcal{V}_{s^{\prime}t^{\prime}},\mathcal{A}_{s^{\prime}t^{\prime}})$.

A flow that reaches maximum capacity from the source $\nu_{s^{\prime}}$ to the sink $\nu_{t^{\prime}}$ of network $\hat{\mathcal{G}}(\mathcal{V}_{s^{\prime}t^{\prime}},\mathcal{A}_{s^{\prime}t^{\prime}})$ must be the maximum flow of that network. Therefore, finding a feasible flow of network $\overline{\mathcal{G}}(\mathcal{V}_{st},\overline{\mathcal{A}}_{st})$ is equivalent to finding the maximum flow from $\nu_{s^{\prime}}$ to $\nu_{t^{\prime}}$ of network $\hat{\mathcal{G}}(\mathcal{V}_{s^{\prime}t^{\prime}},\mathcal{A}_{s^{\prime}t^{\prime}})$.

According to the flow constraints of $(\nu_{s},\nu_{i})$ and $(\nu_{i},\nu_{t})$ in network $\overline{\mathcal{G}}(\mathcal{V}_{st},\overline{\mathcal{A}}_{st})$, any feasible flow of $\overline{\mathcal{G}}(\mathcal{V}_{st},\overline{\mathcal{A}}_{st})$ must be its maximum flow. Hence the maximum flow of network $\hat{\mathcal{G}}(\mathcal{V}_{s^{\prime}t^{\prime}},\mathcal{A}_{s^{\prime}t^{\prime}})$ is also the maximum flow of $\overline{\mathcal{G}}(\mathcal{V}_{st},\overline{\mathcal{A}}_{st})$. Since the cost of every newly added arc is 0, the minimum-cost maximum-flow $f_{m}$ of the converted network $\hat{\mathcal{G}}(\mathcal{V}_{s^{\prime}t^{\prime}},\mathcal{A}_{s^{\prime}t^{\prime}})$ is that of $\overline{\mathcal{G}}(\mathcal{V}_{st},\overline{\mathcal{A}}_{st})$. Furthermore, $f_{m}$ is also the minimum-cost maximum-flow of the original network $\overline{\mathcal{G}}(\mathcal{V}_{st},\mathcal{A}_{st})$.

5 Multi-UAV behavior decision-making

In our modular framework, the cooperative behavior decision-making process includes a tracking behavior decision-making module and a coverage behavior decision-making module. The main idea is to first obtain the tracking action and the corresponding reward for each UAV through tracking decision-making, and then assign tasks to UAVs according to the reward matrix. If a UAV is assigned a target to track, its action is the optimal tracking action calculated by the tracking decision-making module. If a UAV is assigned no target, it plans its action through coverage decision-making.

5.1 Tracking behavior decision-making

The objective of tracking decision-making is to plan the actions of UAVs according to the predicted target states to obtain more accurate measurements of targets.

We use the determinant of the Fisher information matrix (FIM) as the reward value. As explained in bib34, the FIM is a suitable reward function because it processes the original measurement data from the sensor and is derived directly from the measurement model in the natural polar coordinate system of the sensor, so it reflects the volume of information in the measurement data.

The FIM of UAV $\mathcal{U}_{i}$ about target $\mathcal{T}_{j}$ at time $k$ is defined as $G_{ij}(k)$, and $\det(G_{ij}(k))$ denotes the determinant of the FIM. When considering the information accumulated over $H$ time steps, the reward at time step $k$ can be approximated by the sum of the determinants of the FIM as follows:

$$R_{ij}=\sum_{l=1}^{H}\left[\det(G_{ij}(k+l))\right] \tag{20}$$

The tracking decision-making algorithm based on the rolling horizon method finds the optimal action sequence that maximizes the cumulative Fisher information volume:

$$\max\limits_{(\hat{\pi}_{i}(k),\cdots,\hat{\pi}_{i}(k+H-1))}\sum_{l=1}^{H}\left[\det(G_{ij}(k+l))\right] \tag{21}$$

At time step $k$, each UAV predicts the target positions over the next $H$ steps according to the state $\overline{s}_{i,j}(k)$ updated in the distributed fusion phase. Then the optimal planning method is used to find the optimal action sequence $(\pi_{i}^{opt}(k),\cdots,\pi_{i}^{opt}(k+H-1))$ that maximizes the objective function. The planned action of $\mathcal{U}_{i}$ is the first item of the optimal action sequence, which is:

$$\pi_{i}(k)=\pi_{i}^{opt}(k) \tag{22}$$

We also call the planned action the optimal tracking action. The optimal tracking action of UAVs at each step can be obtained by repeating the above process.
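
The rolling horizon procedure can be sketched as an exhaustive search over short action sequences. Everything specific in the block below is an assumption made for illustration: the unicycle motion model in `step`, the discrete turn-rate action set, and above all `toy_reward`, an inverse-square-distance stand-in for $\det(G_{ij})$, whose actual form follows from the sensor model and is not reproduced here.

```python
import itertools
import math


def step(state, omega, speed=1.0, dt=1.0):
    """Hypothetical unicycle model: state = (x, y, heading), action = turn rate."""
    x, y, heading = state
    heading += omega * dt
    return (x + speed * math.cos(heading) * dt,
            y + speed * math.sin(heading) * dt,
            heading)


def toy_reward(state, target):
    """Placeholder for det(G_ij): grows as the UAV gets closer to the target."""
    d2 = (state[0] - target[0]) ** 2 + (state[1] - target[1]) ** 2
    return 1.0 / (1.0 + d2)


def plan_first_action(state, predicted_targets, actions, reward=toy_reward):
    """Search all sequences of length H and return the first action of the best one.

    predicted_targets holds the H predicted target positions for steps k+1..k+H.
    The cost of this exhaustive search grows as len(actions) ** H.
    """
    best_seq, best_val = None, -math.inf
    for seq in itertools.product(actions, repeat=len(predicted_targets)):
        s, total = state, 0.0
        for omega, target in zip(seq, predicted_targets):
            s = step(s, omega)
            total += reward(s, target)      # cumulative reward as in (20)
        if total > best_val:
            best_seq, best_val = seq, total
    return best_seq[0], best_val            # only the first action is applied, as in (22)


action, value = plan_first_action((0.0, 0.0, 0.0), [(0.0, 5.0)] * 3, (-0.5, 0.0, 0.5))
```

With a static target located to the left of the initial heading, the selected first action is the positive turn rate. Repeating the call at every time step with refreshed target predictions gives the receding-horizon behavior described above.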

5.2 Coverage behavior decision-making

The purpose of the area coverage task is to continuously search for targets and reduce the uncertainty of the dynamic environment. Inspired by solitary organisms, the distributed anti-flocking algorithm bib20 is well suited for area coverage tasks and is summarized as follows: 1) Collision avoidance: avoid surrounding obstacles and neighbors. 2) Decentralization: move away from neighboring centers. 3) Selfishness: maximize own gains.

Our previous work bib18 designed a distributed multi-UAV persistent coverage algorithm based on the anti-flocking method, in which several area coverage information maps are constructed. The algorithm uses the collision avoidance and decentering rules of the anti-flocking method to design a heading matching map (HMM), which makes the UAVs disperse from each other to increase the instantaneous coverage area and avoid collisions. The EVTM, the compressed EVTM, and the HMM are integrated to calculate the detection rewards of the UAVs and guide each UAV to the area where the reward is maximized. We briefly describe the algorithm below.

First, calculate the desired separation heading $\eta_{i}^{d}$ of each UAV through the rules of collision avoidance and decentering:

$$\begin{split}\eta_{i}^{d}=&k_{o}\left[\sum\limits_{o=1}^{M_{i}}\mathcal{S}(\left\|p_{i}-p_{i}^{o}\right\|,d_{i}^{o})\frac{p_{i}-p_{i}^{o}}{\left\|p_{i}-p_{i}^{o}\right\|}\right]+\\ &k_{c}\left[\mathcal{S}(\left\|p_{i}-p_{i}^{c}\right\|,d_{c})\frac{p_{i}-p_{i}^{c}}{\left\|p_{i}-p_{i}^{c}\right\|}\right]\end{split} \tag{23}$$

Here $\mathcal{S}(\cdot)$ is a non-negative repulsive potential function, $d_{i}^{o}$ represents the safety distance of $\mathcal{U}_{i}$, $d_{c}$ is the distance threshold within which the decentering term takes effect, and $M_{i}$ is the total number of neighboring obstacles and UAVs of $\mathcal{U}_{i}$. In addition, $p_{i}^{c}$ denotes the position of the centroid of the neighboring UAVs.

Let $\eta_{i}^{m,n}$ denote the heading of $\mathcal{U}_{i}$ toward the grid $g(m,n)$, that is:

$$\eta_{i}^{m,n}=\frac{p_{m,n}-p_{i}}{\left\|p_{m,n}-p_{i}\right\|} \tag{24}$$

where $p_{m,n}$ represents the position of grid $g(m,n)$.

The heading matching value between $\eta_{i}^{m,n}$ and the desired heading $\eta_{i}^{d}$ is:

$$A_{i}^{m,n}=\left\{\begin{aligned}&e^{-k_{a}\left(\eta_{i}^{m,n}-\eta_{i}^{d}\right)^{2}}&|\Delta\eta_{i}^{m,n}|<\omega_{max}\Delta T\\&0&\text{otherwise}\end{aligned}\right. \tag{25}$$

where $\Delta\eta_{i}^{m,n}$ is the deviation between $\eta_{i}^{m,n}$ and $\eta_{i}$.

Then the coverage reward is calculated from the EVTM $T_{i}$. We define the coverage reward that $\mathcal{U}_{i}$ obtains from grid $g(m,n)$ by the visiting requirement $\lambda_{i}^{m,n}$ as follows:

$$f_{i}^{m,n}=\lambda_{i}^{m,n} \tag{26}$$

Suppose $\mathcal{U}_{i}$ is located at $g(m,n)$ at the next step and $\phi_{s_{i}(m,n)}$ denotes the detection area of $\mathcal{U}_{i}$. Then the predicted coverage reward is:

$$F_{i}^{m,n}=\sum_{p_{m^{\prime},n^{\prime}}\subset\phi_{s_{i}(m,n)}}\lambda_{i}^{m^{\prime},n^{\prime}} \tag{27}$$

The global searching reward map of $\mathcal{U}_{i}$ is $Q_{i}\in\mathbb{R}^{3\times 3}$, which is calculated from the compressed EVTM $T_{i}^{g}$ as follows:

$$Q_{i}=(t_{k}\mathbf{1}_{3}-T_{i}^{g})/T_{c} \tag{28}$$

Furthermore, we extract the heading matching value and the coverage reward of the $3\times 3$ grids around the location of the UAV to get the local heading matching map $A_{i}^{l}$ and the local coverage reward map $F_{i}^{l}$. The overall coverage reward map (OCRM) $J_{i}\in\mathbb{R}^{3\times 3}$ is the weighted sum of $F_{i}^{l}$, $Q_{i}$, and $A_{i}^{l}$:

$$J_{i}(\kappa,\iota)=w_{f}F_{i}^{l}(\kappa,\iota)+w_{q}Q_{i}(\kappa,\iota)+w_{a}A_{i}^{l}(\kappa,\iota) \tag{29}$$

where $w_{f}$, $w_{q}$, and $w_{a}$ are the weights of the corresponding terms.

Based on the selfishness rule of the anti-flocking method, we directly select the grid with the maximum reward in the OCRM as the target grid $g(m_{*},n_{*})$. Let $(\kappa^{\prime},\iota^{\prime})=\arg\max_{(\kappa,\iota)}J_{i}(\kappa,\iota)$ denote the grid with the maximum reward in map $J_{i}$. When $\mathcal{U}_{i}$ is located at $g(m_{i},n_{i})$, then $m_{*}=\kappa^{\prime}+m_{i}-2$ and $n_{*}=\iota^{\prime}+n_{i}-2$.
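
A minimal sketch of the target grid selection follows, assuming the three $3\times 3$ maps are given as nested lists and that array indices start at 0, so the offset $-2$ used with the 1-based indices $(\kappa^{\prime},\iota^{\prime})$ becomes $-1$. The function name and the default weights are hypothetical.

```python
def select_target_grid(F_l, Q, A_l, m_i, n_i, w_f=1.0, w_q=1.0, w_a=1.0):
    """Build the overall coverage reward map (29) and pick its best cell.

    F_l, Q, A_l: 3x3 nested lists (local coverage reward, global searching
    reward, local heading matching map); (m_i, n_i) is the current grid.
    """
    best, best_val = None, float("-inf")
    for kappa in range(3):
        for iota in range(3):
            value = (w_f * F_l[kappa][iota]
                     + w_q * Q[kappa][iota]
                     + w_a * A_l[kappa][iota])
            if value > best_val:
                best_val, best = value, (kappa, iota)
    kappa, iota = best
    # Cell (1, 1) is the UAV's own grid when indices start at 0.
    return m_i + kappa - 1, n_i + iota - 1


zeros = [[0.0] * 3 for _ in range(3)]
F_l = [[0.0, 0.0, 4.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
target = select_target_grid(F_l, zeros, zeros, m_i=10, n_i=10)
```

Here the largest reward sits in the first row and third column of the local map, so a UAV at grid $(10,10)$ selects grid $(9,11)$, which matches $m_{*}=1+10-2$ and $n_{*}=3+10-2$ in the 1-based notation of the text.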

Finally, let $g(m_{*},n_{*})$ be the desired grid of $\mathcal{U}_{i}$. The control input $\pi_{i}=(\Delta v_{i},\omega_{i})$ of $\mathcal{U}_{i}$ is:

$$\left\{\begin{aligned}&\Delta v_{i}=\left[\|p_{m_{*},n_{*}}-p_{i}\|/\Delta T-v_{i}\right]_{-dv_{max}}^{dv_{max}}\\&\omega_{i}=\left[(\eta_{i}^{*}-\eta_{i})/\Delta T\right]_{-\omega_{max}}^{\omega_{max}}\end{aligned}\right. \tag{30}$$

where $p_{m_{*},n_{*}}$ is the position of grid $g(m_{*},n_{*})$ and $\eta_{i}^{*}$ is the desired heading when $\mathcal{U}_{i}$ flies to $g(m_{*},n_{*})$ from its current position. With $(x_{i}^{*},y_{i}^{*})$ and $(x_{i},y_{i})$ denoting the coordinates of $p_{m_{*},n_{*}}$ and $p_{i}$, the desired heading is:

$$\eta_{i}^{*}=\mathrm{atan2}(y_{i}^{*}-y_{i},x_{i}^{*}-x_{i}) \tag{31}$$
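
The saturated control law (30) and the desired heading (31) can be sketched as follows. The function names are hypothetical, and wrapping the heading error to $(-\pi,\pi]$ is an assumption added here to avoid a spurious full turn; (30) itself uses the raw difference $\eta_{i}^{*}-\eta_{i}$.

```python
import math


def clamp(value, low, high):
    """Saturation operator [.] with lower bound low and upper bound high."""
    return max(low, min(high, value))


def control_input(p_i, v_i, eta_i, p_star, dt, dv_max, omega_max):
    """Return (delta_v, omega) that steers the UAV toward the target grid.

    p_i, p_star: current position and target grid position as (x, y)
    v_i, eta_i:  current speed and heading (radians)
    """
    dx, dy = p_star[0] - p_i[0], p_star[1] - p_i[1]
    # Speed change needed to reach the grid in one step, saturated as in (30).
    delta_v = clamp(math.hypot(dx, dy) / dt - v_i, -dv_max, dv_max)
    eta_star = math.atan2(dy, dx)                     # desired heading (31)
    # Assumption: wrap the heading error into (-pi, pi] before saturating.
    error = math.atan2(math.sin(eta_star - eta_i), math.cos(eta_star - eta_i))
    omega = clamp(error / dt, -omega_max, omega_max)
    return delta_v, omega


dv, omega = control_input((0.0, 0.0), 10.0, 0.0, (0.0, 20.0),
                          dt=1.0, dv_max=2.0, omega_max=0.5)
```

In this example both commands saturate: the required speed increase of 10 is limited to `dv_max`, and the required quarter turn is limited to `omega_max`.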

6 System design of distributed multi-UAV cooperative area coverage and target tracking

Integrating the information fusion strategy, the task assignment method, and the behavior decision algorithm designed in Sections 3-5, this section designs the distributed multi-UAV cooperative area coverage and target tracking algorithm. The framework is shown in Fig. 9.

Figure 9: The distributed multi-UAV cooperative area coverage and target tracking framework.

In the information fusion process, we use the maximum consensus protocol to update the joint estimation state of multi-targets and the equivalent visiting time map, which reach consensus in the connected communication network after a finite number of iterations. Each UAV then uses the Fisher information reward to obtain its optimal tracking action sequence and exchanges the Fisher information reward with its neighbors to build the reward matrix. The task assignment method based on minimum-cost maximum-flow assigns to each UAV the task that maximizes the overall task reward. If $\mathcal{U}_{i}$ is assigned a target to track, the optimal tracking action of $\mathcal{U}_{i}$ is the first item of the optimal tracking action sequence. For UAVs assigned the coverage task, we use the coverage behavior decision-making algorithm based on the anti-flocking method to plan their actions.

In particular, the task assignment algorithm in Section 4 needs the global reward matrix, whereas each UAV initially knows only its own rewards for the targets. Therefore, each UAV exchanges the Fisher information reward with its neighbors. Through a finite number of updates with the max-consensus protocol, the reward matrices of all UAVs in the connected network reach consistency. As with the information fusion algorithm in Section 3, we set the iteration number for the $i$th UAV to $\mathcal{D}_{i}$ according to the termination condition of the maximum consistency algorithm given in bib34. The distributed information interaction thus allows the proposed task allocation algorithm to fit into the overall distributed framework.
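
The reward exchange can be illustrated by a simple flooding sketch in which every UAV repeatedly merges the reward rows known to its neighbors. The function name `share_rewards`, the dictionary keyed by UAV id, and the common round count are assumptions made for illustration; in the paper each UAV uses its own iteration number $\mathcal{D}_{i}$.

```python
def share_rewards(neighbors, own_rows, rounds):
    """Flood per-UAV reward rows through the communication graph.

    neighbors[i]: ids of the neighbors of UAV i
    own_rows[i]:  reward row of UAV i, one entry per target
    Returns, for every UAV, the rows it knows after the given number of
    rounds, ordered by UAV id.
    """
    known = [{i: tuple(row)} for i, row in enumerate(own_rows)]
    for _ in range(rounds):
        previous = [dict(k) for k in known]      # messages sent this round
        for i, nbrs in enumerate(neighbors):
            for q in nbrs:
                known[i].update(previous[q])     # union with the neighbor's set
    return [[list(k[u]) for u in sorted(k)] for k in known]


# Line graph 0 - 1 - 2 with two targets.
neighbors = [[1], [0, 2], [1]]
rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
matrices = share_rewards(neighbors, rows, rounds=2)
```

After two rounds on this line graph every UAV holds the same three-row reward matrix and can run the task assignment locally with identical input. After only one round, UAV 0 would still miss the row of UAV 2.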

Algorithm 3 below summarizes the distributed approach for local detection, fusion, assignment, and decision-making of $\mathcal{U}_{i}$.

Algorithm 3 Local detection, fusion, assignment, and decision-making for the $i$th UAV at time $k$
1: Input: $T_{i}(k-1)$, $\{\{\overline{s}_{i,j}(k-1),\overline{P}_{i,j}(k-1)\}\mid j\in I_{\tau}\}$
2: Output: Action $\pi_{i}(k)$
3: Run Algorithm 1 to get $T_{i}(k)$, $T_{i}^{l}(k)$, $T_{i}^{g}(k)$, $\{\{\overline{s}_{i,j}(k),\overline{P}_{i,j}(k)\}\mid j\in I_{\tau}\}$ after information fusion;
4: for $j=1$ to $N_{\tau}$ do
5:     $R_{ij}^{opt}(k)\leftarrow\max\limits_{(\hat{\pi}_{i}(k),\cdots,\hat{\pi}_{i}(k+H-1))}\sum_{l=1}^{H}\left[\det(G_{ij}(k+l))\right]$;
6:     $\pi_{ij}^{opt}(k)\leftarrow\arg\max\limits_{(\hat{\pi}_{i}(k),\cdots,\hat{\pi}_{i}(k+H-1))}\sum_{l=1}^{H}\left[\det(G_{ij}(k+l))\right]$;
7: end for
8: $R_{i}^{opt}(k):=[R_{i1}^{opt}(k),R_{i2}^{opt}(k),\cdots,R_{iN_{\tau}}^{opt}(k)]$;
9: $R_{i}^{all}(0):=\{R_{i}^{opt}(k)\}$;
10: for $\iota=1$ to $\mathcal{D}_{i}$ do
11:     Send $R_{i}^{all}(\iota-1)$;
12:     Receive $R_{q}^{all}(\iota-1)$, $q\in\mathscr{V}_{i}^{n}$, $q\neq i$;
13:     $R_{i}^{all}(\iota)\leftarrow\bigcup\limits_{q\in\mathscr{V}_{i}^{n}}R_{q}^{all}(\iota-1)$;
14: end for
15: Get reward matrix $R(k)$ from the elements in $R_{i}^{all}(\mathcal{D}_{i})$;
16: The assigned task obtained by Algorithm 2 is $\mathcal{T}_{j_{*}}\leftarrow TAMM(R(k))$;
17: if $j_{*}=0$ then
18:     Calculate the overall coverage reward map $J_{i}$ by (29);
19:     $(\kappa^{\prime},\iota^{\prime})\leftarrow\arg\max\limits_{(\kappa,\iota)}J_{i}(\kappa,\iota)$;
20:     The target grid $g(m_{*},n_{*})$: $m_{*}=\kappa^{\prime}+m_{i}-2$, $n_{*}=\iota^{\prime}+n_{i}-2$;
21:     Get $\pi_{i}(k)$ by (30) and (31);
22: else
23:     $\pi_{i}(k)\leftarrow\pi_{ij_{*}}^{opt}(k)$.
24: end if

In the tracking behavior decision phase, each UAV calculates its rewards for all tasks. The complexity of a single computation step is $\mathcal{O}(N_{\tau}|\Pi|)$, where $|\Pi|$ is the cardinality of the action set. Compared with the computational complexity $\mathcal{O}(N_{u}N_{\tau}|\Pi|)$ of centralized decision-making, our approach has lower computational complexity. Apart from the decision-making phase, the task assignment phase entails a complexity of $\mathcal{O}(N_{u}^{2}N_{\tau}+N_{u}N_{\tau}^{2})$. The information fusion phase also has low computational complexity. Therefore, our distributed hierarchical modular algorithm is well suited to area coverage and target tracking tasks over a large area with multiple UAVs.

7 Simulation results

The proposed Algorithm 3 for area coverage and target tracking has been analyzed in the simulated environment. The area surveillance mission aims to maintain continuous coverage of the area and keep track of the detected targets. We evaluate the performance of our method through performance indicators such as area uncovered time and target observation coverage rate in a series of simulations. The setup for the different simulations and their corresponding results are detailed in the following subsections.

7.1 Simulation setup

Suppose each UAV flies at a fixed altitude in the range of $90$-$110m$, and the communication radius $R_{c}$ between UAVs is set separately for each experiment. The task area is $2.5km*2.5km$ and is rasterized into square grids with a side length of $20m$. Four ground moving targets are initially randomly distributed in this area, and their speed and heading can change with time. The sensor parameters are the same as those in bib34 .
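
With these dimensions the area contains $125*125=15625$ grids. A minimal sketch of the rasterization follows. It assumes that the origin sits at one corner of the area and that grid indices are 1-based, and the function names are hypothetical.

```python
AREA_SIDE_M = 2500.0  # side length of the task area (2.5 km)
CELL_SIDE_M = 20.0    # side length of one grid


def cells_per_side(area_side=AREA_SIDE_M, cell_side=CELL_SIDE_M):
    """Number of grids along one side of the square task area."""
    return int(round(area_side / cell_side))


def position_to_grid(x, y, area_side=AREA_SIDE_M, cell_side=CELL_SIDE_M):
    """Map a position in metres to a 1-based grid index (assumed convention).

    Positions on the far boundary are clamped into the last grid.
    """
    n = cells_per_side(area_side, cell_side)
    m_idx = min(max(int(x // cell_side) + 1, 1), n)
    n_idx = min(max(int(y // cell_side) + 1, 1), n)
    return m_idx, n_idx


print(cells_per_side())                # 125
print(cells_per_side() ** 2)           # 15625
print(position_to_grid(1010.0, 35.0))  # (51, 2)
```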

Figure 10: The simulation scenario with seven UAVs and four targets.

The numerical simulation is implemented in MATLAB R2020b, and Fig. 10 shows one example of the simulation scenarios. We also perform hardware-in-the-loop simulations, which are described in Section 7.3. Each target can be tracked by a maximum of 2 UAVs simultaneously. We set the parameters related to grid visiting requirements as $w_{p1}=1,w_{p2}=1.2,w_{p3}=0.8$ and the parameters of sensor detection performance as $\alpha=1.1,\beta=0.8$. To account for the randomness of the experiment, including the initial positions of the targets and the UAVs and the random movement of the targets, we repeat each set of experiments 50 times with 1500 simulation steps each and report the average results.

7.2 Simulation analysis

7.2.1 The effect of different parameters

In this subsection, we run simulations with different communication ranges and UAV numbers to study the effect of these parameters on the performance of our distributed area surveillance method.

First, we set the number of UAVs as $N_{u}\in[3,13]$ with a fixed communication radius of $400m$. Since each target is tracked by at most two UAVs, ideally, when $N_{u}<4$, some targets are not assigned to any UAV. When $4\leq N_{u}\leq 8$, each target can be assigned to at least one UAV to track. When $N_{u}>8$, each target can be assigned to two UAVs to track. Fig. 11 is an example of the allocation result when the number of UAVs is 11. Due to the random movement of the targets, detected targets can be lost, which requires the multi-UAV system to search for them again.

Figure 11: The allocation results when the number of UAVs is 11.

We use the instantaneous average uncovered time $t_{IMT}(k)$ to present the coverage performance of our algorithm, as shown in Fig. 12. $t_{IMT}(k)$ indicates the average uncovered time of all grids at time step $k$. We find that the higher the number of UAVs, the shorter the average uncovered time, indicating a shorter visit interval.
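
As a minimal sketch of this indicator, suppose the uncovered time of a grid at step $k$ is the number of steps since its last visit. This is a simplifying assumption: the paper's own bookkeeping is built on the equivalent visiting time map, which also accounts for detection probability. The function name and the sample map are hypothetical.

```python
def instantaneous_mean_uncovered_time(last_visit, k):
    """Average over all grids of (k - last visit step).

    last_visit: list of rows holding the step at which each grid was last visited.
    k         : current time step.
    """
    total = 0
    count = 0
    for row in last_visit:
        for t in row:
            total += k - t
            count += 1
    return total / count


# Hypothetical 2x2 map at step k = 5
last_visit = [[0, 3],
              [5, 5]]
print(instantaneous_mean_uncovered_time(last_visit, k=5))  # 1.75
```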

Figure 12: The average uncovered time of all grids at each time step under different UAV numbers.
Figure 13: The average observation coverage rate of all targets under different UAV numbers.

For the target tracking effect, we use the observation coverage rate, defined as the ratio of the time during which the target is observed to the total simulation time. The average observation coverage rate of all targets is shown in Fig. 13. Since the initial positions of the UAVs and the targets are random and the targets move randomly, the tracking performance can be measured directly using the overall average observation rate. Fig. 13 shows that the average observation rate of the targets gradually increases with the number of UAVs. The distribution of the observation coverage rate for our repeated simulations is shown in Fig. 14. It indicates that the observation coverage rate of most of the targets is maintained at a high level when the number of UAVs is larger than that of targets. Moreover, Fig. 15 shows the RMSE of targets in one simulation.
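
The observation coverage rate can be sketched as follows, assuming a per-step boolean record of whether a target was observed. Whether the reported spread is a population or a sample standard deviation is not stated in the text, so the population form used here is an assumption. The function names and sample data are hypothetical.

```python
from statistics import mean, pstdev


def observation_coverage_rate(observed):
    """Fraction of simulation steps in which the target was observed."""
    return sum(1 for flag in observed if flag) / len(observed)


def summarize(rates):
    """Mean and population standard deviation of a list of rates."""
    return mean(rates), pstdev(rates)


# Hypothetical observation records for two targets over four steps
target_a = [True, True, False, True]
target_b = [False, False, True, False]
rates = [observation_coverage_rate(target_a), observation_coverage_rate(target_b)]
print(rates)             # [0.75, 0.25]
print(summarize(rates))  # (0.5, 0.25)
```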

Figure 14: The distribution of the observation coverage rate for different UAV numbers.
Figure 15: The RMSE of targets in one simulation (0 indicates that the target is not detected or lost yet).
Figure 16: The average uncovered time of different communication ranges with seven UAVs performing only coverage task.

Then, we evaluate the performance under different communication ranges, which vary from $100m$ to $600m$ (see Fig. 16 and 17). To explore the influence of the communication range on the coverage performance, we first let the UAVs perform only the coverage task. Fig. 16 indicates that the uncovered time decreases gradually as the communication range increases. When the search and tracking tasks are considered simultaneously, the target tracking algorithm ensures that the targets are tracked most of the time, so the communication range has little effect on the observation coverage rate, as shown in Fig. 17. In summary, improving the communication capability enhances the area coverage performance, while its effect on the tracking performance is small.

Figure 17: The observation coverage rate of different communication ranges with nine UAVs and four targets. (a) is the observation coverage rate from lowest to highest and the average observation coverage rate of targets, (b) is the distribution of the observation coverage rate.

7.2.2 Comparison with other algorithms

Here, we compare our method with the simultaneous coverage and tracking (SCAT) algorithm in bib2 and the multi-target tracking (MT) algorithm in bib34 . The SCAT method in bib2 translates the area coverage and target tracking problem into the problem of covering environments with time-varying density functions. However, the target tracking task is not explicitly considered in SCAT. The MT method in bib34 uses the distributed partially observable Markov decision algorithm to make tracking decisions. It assumes that the targets are initially located in the field of view of the UAVs. To make the comparison reasonable, we combine our coverage algorithm with the MT algorithm and call the result CMT. In addition, the task assignment of the MT algorithm adopts multiple rounds of 1-to-1 assignment to solve the many-to-many assignment problem. It places no restriction on the number of UAVs tracking the same target, so UAVs tend to select the tracking task. We therefore also limit the number of 1-to-1 assignment rounds in the task allocation part of CMT and call this variant L-CMT.

We compare these algorithms in terms of the target observation coverage rate and the area uncovered time. We set the number of UAVs to 7 and the number of targets to 4 in the subsequent comparisons.

Figure 18: The average observation coverage rate of 7 UAVs and 4 targets for different methods.
Figure 19: The average uncovered time of 7 UAVs and 4 targets for different methods.

Fig. 18 shows the average observation coverage rate of all targets. The average observation coverage rate is $0.7080\pm 0.2311$ for our method, $0.6600\pm 0.3761$ for the CMT method, $0.6190\pm 0.2397$ for the L-CMT method, and $0.6177\pm 0.2203$ for the SCAT method. Fig. 19 shows the average uncovered time of the mission area. The average uncovered time of the SCAT method is the smallest, indicating that the coverage performance of SCAT is the best. Since the SCAT algorithm ignores the explicit tracking of the targets, its coverage performance is good, while its tracking performance is poor. The CMT algorithm does not limit the number of UAVs tracking the same target, so UAVs tend to choose the tracking task and sometimes concentrate on the same targets. Thus, the CMT algorithm has the worst area coverage performance. Besides, the target observation coverage variance of CMT is the largest. The L-CMT method improves the coverage performance of CMT by limiting the number of allocations. Due to the design of the task assignment minimum-cost maximum-flow algorithm, our method has the best tracking effect, and its area coverage effect is slightly lower than that of the L-CMT and SCAT methods. These simulation results indicate that our method is more suitable for the combined area coverage and target tracking task.

7.2.3 Other performance

Here, we also list the time taken by the allocation algorithm in Table 4, which shows that our allocation algorithm can be applied to the fast real-time allocation of large-scale tasks.

Table 4: Running time of our allocation method
       $N_{u}\leq N_{\tau}$                                      20V30      50V60      80V100
       Time use (s)                                              0.00218    0.00948    0.02526
       $N_{\tau}<N_{u}<\sum\nolimits_{j=1}^{N_{\tau}}{n_{j}}$     60V50      80V50      100V50
       Time use (s)                                              0.00925    0.01168    0.01469
       $N_{u}>\sum\nolimits_{j=1}^{N_{\tau}}{n_{j}}$              80V20      100V25     120V30
       Time use (s)                                              0.00491    0.00768    0.01083

In addition, to reflect the scalability of the distributed algorithm, we set the task area as 10km*10km with 50 UAVs and 15 targets, as shown in Fig. 20. Fig. 21 shows the single-step decision-making time of the simulation, which can meet the needs of real-time decision-making.

Figure 20: The area coverage and target tracking task with 50 UAVs and 15 targets in the area of 10km*10km.
Figure 21: The single-step decision-making time in one simulation with 50 UAVs and 15 targets.

7.3 Hardware-in-the-loop simulation

The hardware-in-the-loop (HIL) verification of our distributed area coverage and target tracking algorithm is carried out with the HIL simulator bib55 constructed by our team. We set the task area as 2.5km*2.5km with 7 UAVs and 4 moving targets. As shown in Fig. 22(a), all UAVs take the area coverage task when there are no targets. Fig. 22(b) shows that the UAVs automatically decide the task mode when detecting the targets: each target is assigned to one UAV for tracking, while the remaining UAVs carry out the area coverage task.

Figure 22: Hardware-in-the-loop simulation of the area coverage and tracking task with 7 UAVs. (a) No targets, (b) 4 targets.

We collect the target observation coverage rate and the area uncovered time over multiple HIL simulations (see Fig. 23 and 24). Here, the difference in coverage and tracking performance between the HIL simulations and the numerical simulations mainly comes from the uncertainty of the underlying modeling in the hardware-in-the-loop simulations. The average observation coverage rate is $0.6240\pm 0.1153$ in the HIL simulations. Moreover, the assignment results in one of the HIL simulations are given in Fig. 25. The hardware-in-the-loop simulations demonstrate the feasibility of real-time onboard implementation of our area coverage and target tracking algorithm.

Figure 23: The average observation coverage rate of 20 HIL simulations.
Figure 24: The average uncovered time of 20 HIL simulations.
Figure 25: The allocation results in one HIL simulation.

8 Conclusion

This paper proposes a distributed algorithm for area coverage and multi-target tracking. We use a distributed information fusion strategy based on the maximum consensus protocol to estimate the joint state of multi-targets and obtain the joint detection information of the mission area. We also design a task allocation algorithm based on minimum cost and maximum flow. Optimal tracking and area coverage actions are obtained based on optimal planning and the distributed anti-flocking algorithm. In the distributed information fusion phase, we introduce the area compression map to extend the coverage algorithm to any large task area, and the scalability is verified in the simulation. In addition, combined with the consensus strategy, the designed network flow allocation algorithm is implemented in a distributed way. We also use the Fisher information as the task reward of target tracking.

We apply the integrated algorithm to area coverage and target tracking tasks and study the effects of the number of UAVs and communication range on the coverage and tracking performance. The results show that the coverage performance of UAVs can be significantly enhanced by increasing their scale and improving their communication capability. However, the communication range has little effect on the tracking performance. In addition, we apply our algorithm in the hardware-in-the-loop simulation of area coverage and target tracking tasks.

In future work, we will focus on reducing the communication load and dealing with the communication delay problem to promote the practical application of the algorithm.

References

  • (1) Meng, W., He, Z., Teo, R., Su, R., Xie, L.: Integrated multi-agent system framework: Decentralised search, tasking and tracking. IET Control Theory and Applications 9, 493–502 (2015)
  • (2) Pimenta, L.C.A., Schwager, M., Lindsey, Q., Kumar, V., Rus, D., Mesquita, R.C., Pereira, G.A.S.: Simultaneous Coverage and Tracking (SCAT) of Moving Targets with Robot Networks, vol. 57, pp. 85–99. Springer, Berlin, Heidelberg (2010)
  • (3) Moon, S., Frew, E.W.: Distributed cooperative control for joint optimization of sensor coverage and target tracking. In: 2017 International Conference on Unmanned Aircraft Systems (ICUAS), pp. 759–766 (2017)
  • (4) Semnani, S.H., Basir, O.A.: Semi-flocking algorithm for motion control of mobile sensors in large-scale surveillance systems. IEEE Transactions on Cybernetics 45(1), 129–137 (2015)
  • (5) Meng, W., He, Z., Su, R., Yadav, P.K., Teo, R., Xie, L.: Decentralized multi-UAV flight autonomy for moving convoys search and track. IEEE Transactions on Control Systems Technology 25(4), 1480–1487 (2017)
  • (6) Yuan, W., Ganganath, N., Cheng, C.-T., Qing, G., Lau, F.C.M.: Semi-flocking-controlled mobile sensor networks for dynamic area coverage and multiple target tracking. IEEE Sensors Journal 18(21), 8883–8892 (2018)
  • (7) Yuan, W., Ganganath, N., Cheng, C.-T., Valaee, S., Qing, G., Lau, F.C.M., Iu, H.H.C.: Semi-flocking-controlled mobile sensor networks for tracking targets with different priorities. In: 2019 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1–5 (2019)
  • (8) Achanta, S.: Passive target tracking using unscented kalman filter based on monte carlo simulation. Indian Journal of Science and Technology 8, 1–5 (2015)
  • (9) Lee, D.-y., Shim, S.-W., Hwang, M.-c., Tahk, M.-J.: Target tracking using adaptive coarse-to-fine particle filter. In: AIAA Guidance, Navigation, and Control Conference (2017)
  • (10) Ren, W., Beard, R.W., Atkins, E.M.: Information consensus in multivehicle cooperative control. IEEE Control Systems Magazine 27(2), 71–82 (2007)
  • (11) Yu, D., Xia, Y., Li, L., Zhu, C.: Distributed consensus-based estimation with unknown inputs and random link failures. Automatica 122, 109259 (2020)
  • (12) Fanti, M.P., Mangini, A.M., Ukovich, W.: A quantized consensus algorithm for distributed task assignment. In: 2012 IEEE Conference on Decision and Control (CDC), pp. 2040–2045 (2012)
  • (13) Luo, L., Chakraborty, N., Sycara, K.: Distributed algorithms for multirobot task assignment with task deadline constraints. IEEE Transactions on Automation Science and Engineering 12(3), 876–888 (2015)
  • (14) Zhao, Y., Wang, X., Wang, C., Cong, Y., Shen, L.: Systemic design of distributed multi-uav cooperative decision-making for multi-target tracking. Autonomous Agents and Multi-Agent Systems 33, 132–158 (2019)
  • (15) Bänziger, T., Kunz, A., Wegener, K.: Optimizing human-robot task distribution using a simulation tool based on standardized work descriptions. Journal of Intelligent Manufacturing 31, 1635–1648 (2020)
  • (16) Wu, W., Nai-gang, C.: Distributed task allocation for multiple heterogeneous uavs based on consensus algorithm and online cooperative strategy. Aircraft Engineering and Aerospace Technology 90(9), 1464–1473 (2018)
  • (17) Chen, X., Zhang, P., Du, G., Li, F.: A distributed method for dynamic multi-robot task allocation problems with critical time constraints. Robotics and Autonomous Systems 118, 31–46 (2019)
  • (18) Dantzig, G., Thapa, M.: Network Flow Theory, pp. 253–313. Springer, New York, NY (1997)
  • (19) Li, B., Springer, J., Bebis, G., Gunes, M.: A survey of network flow applications. Journal of Network and Computer Applications 36(2), 567–581 (2013)
  • (20) Zhu, K., Han, B., Zhang, T.: Multi-UAV distributed collaborative coverage for target search using heuristic strategy. Guidance, Navigation and Control 01(1), 2150002 (2021)
  • (21) Ni, J., Tang, G., Mo, Z., Cao, W., Yang, S.X.: An improved potential game theory based method for multi-uav cooperative search. IEEE Access 8, 47787–47796 (2020)
  • (22) Chen, J., Du, C., Zhang, Y., Han, P., Wei, W.: A clustering-based coverage path planning method for autonomous heterogeneous uavs. IEEE Transactions on Intelligent Transportation Systems, 1–11 (2021)
  • (23) Liu, Z., Gao, X., Fu, X.: A cooperative search and coverage algorithm with controllable revisit and connectivity maintenance for multiple unmanned aerial vehicles. Sensors 18(5), 1472 (2018)
  • (24) Ganganath, N., Cheng, C.-T., Tse, C.K.: Distributed antiflocking algorithms for dynamic coverage of mobile sensor networks. IEEE Transactions on Industrial Informatics 12(5), 1795–1805 (2016)
  • (25) Ganganath, N., Yuan, W., Fernando, T., Iu, H.H.C., Cheng, C.-T.: Energy-efficient anti-flocking control for mobile sensor networks on uneven terrains. IEEE Transactions on Circuits and Systems II: Express Briefs 65(12), 2022–2026 (2018)
  • (26) Zhang, M., Liu, H.: Cooperative tracking a moving target using multiple fixed-wing uavs. Journal of Intelligent and Robotic Systems 81, 505–529 (2016)
  • (27) Monajemi Nejad, B., Attia, S., Raisch, J.: Max-consensus in a max-plus algebraic setting: The case of fixed communication topologies, pp. 1–7 (2009)
  • (28) Miao, Y.-Q., Khamis, A., Kamel, M.S.: Applying anti-flocking model in mobile surveillance systems. In: 2010 International Conference on Autonomous and Intelligent Systems, AIS 2010, pp. 1–6 (2010)
  • (29) Zhang, M., Li, H., Li, J., Wang, X.: A distributed persistent coverage algorithm of multiple unmanned aerial vehicles in complex mission areas. In: 2021 IEEE International Conference on Robotics and Biomimetics (ROBIO), pp. 1835–1840 (2021)
  • (30) Chen, H., Cong, Y., Wang, X., Xu, X., Shen, L.: Coordinated path-following control of fixed-wing unmanned aerial vehicles. IEEE Transactions on Systems, Man, and Cybernetics: Systems 52(4), 2540–2554 (2022)