
Improved Quick Hypervolume Algorithm

Andrzej Jaszkiewicz Poznan University of Technology, Faculty of Computing, Institute of Computing Science, ul. Piotrowo 2, 60-965 Poznan, andrzej.jaszkiewicz@put.poznan.pl
Abstract

In this paper, we present a significant improvement of the Quick Hypervolume algorithm, one of the state-of-the-art algorithms for calculating the exact hypervolume of the space dominated by a set of d-dimensional points. This value is often used as a quality indicator in multiobjective evolutionary algorithms and other multiobjective metaheuristics, and the efficiency of calculating this indicator is of crucial importance, especially in the case of large sets or many-dimensional objective spaces. We use the same divide-and-conquer scheme as the original Quick Hypervolume algorithm, but in our algorithm we split the problem into smaller sub-problems in a different way. Through both theoretical analysis and a computational study we show that our approach improves the computational complexity of the algorithm and its practical running times.

keywords:
Multiobjective optimization, Hypervolume indicator, Computational complexity analysis
journal: COMPUTERS & OPERATIONS RESEARCH

1 Introduction

In this paper, we consider the problem of calculating the exact hypervolume of the space dominated by a set of d-dimensional points. This hypervolume is often used as a quality indicator in multiobjective evolutionary algorithms (MOEAs) and other multiobjective metaheuristics (MOMHs), where the set of points corresponds to the images in the objective space of the solutions generated by a MOMH. Multiple quality indicators have been proposed in the literature; however, the hypervolume indicator has the advantage of being compatible with the comparison of approximation sets based on the dominance relation (see [1] for details) and is one of the most often used indicators. The hypervolume indicator may be used a posteriori to evaluate the final set of solutions generated by a MOMH, e.g. for the purpose of a computational experiment comparing different algorithms or to tune the parameters of a MOMH. Some authors have also proposed indicator-based MOMHs that use the hypervolume to guide the work of the algorithms [2, 3].

The exact calculation of the hypervolume may, however, become computationally demanding, especially in the case of large sets in many-dimensional objective spaces. Thus the exact calculation of the hypervolume has received significant interest from the research community [4, 5, 6, 7, 8, 9]. According to the recent study of Lacour et al. [8], the state-of-the-art algorithms for the exact calculation of the hypervolume are Quick Hypervolume (QHV) [4, 5], the Hypervolume Box Decomposition Algorithm (HBDA) [8] and the Walking Fish Group algorithm (WFG) [10]. From the theoretical point of view, the currently most efficient algorithm in the case $d \geq 4$ in terms of the worst-case complexity ($\mathcal{O}(n^{d/3}\,\mathrm{polylog}(n))$) is by Chan [9]. To our knowledge, however, there is currently no available implementation of this approach and no evidence of its practical efficiency [8].

In this paper, we improve the QHV algorithm by modifying the way the problem is split into smaller sub-problems. Although this modification may seem relatively simple, it significantly improves the computational complexity of the algorithm and its practical running times. Since our work is based on relatively recently published results, we do not give an extended overview of the applications of the hypervolume indicator and of the algorithms for the hypervolume calculation in this paper. Instead, we refer the interested reader to [4, 5, 6, 7, 8, 10] for recent overviews of this area.

The paper is organized in the following way. In the next section, we define the problem of the hypervolume calculation. In section 3, the improved Quick Hypervolume (QHV-II) algorithm is proposed. The computational complexity of the QHV-II algorithm is analyzed and compared to QHV in section 4. In section 5, a computational study is presented. The paper finishes with conclusions and directions for further research.

2 Problem formulation

Consider a $d$-dimensional space $\mathbb{R}^d$ that will be interpreted as the space of $d$ maximized objectives.

We say that a point $s^1 \in \mathbb{R}^d$ dominates a point $s^2 \in \mathbb{R}^d$ if, and only if, $s^1_j \geq s^2_j\ \forall\, j \in \{1,\dots,d\}\ \wedge\ \exists\, j \in \{1,\dots,d\}: s^1_j > s^2_j$. We denote this relation by $s^1 \succ s^2$.

We will consider hypercuboids in $\mathbb{R}^d$ parallel to the axes, defined by two extreme points $r_* \in \mathbb{R}^d$ and $r^* \in \mathbb{R}^d$: $H(r^*, r_*) = \{s \in \mathbb{R}^d \mid \forall\, j \in \{1,\dots,d\}\ {r_*}_j \leq s_j \leq r^*_j\}$.

Consider a finite set of points $S \subset H(r^*, r_*)$. The hypervolume of the space dominated by $S$ within the hypercuboid $H(r^*, r_*)$, denoted by $\mathcal{H}(S, H(r^*, r_*))$, is the Lebesgue measure of the set $\bigcup_{s \in S} H(s, r_*)$. The introduction of $r^*$ may seem superfluous, since it does not influence the hypervolume; however, this definition will facilitate the explanation of the algorithms, which are based on the idea of splitting the original problem into sub-problems corresponding to smaller hypercuboids.
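Since all of the algorithms discussed below are built on the dominance test, a minimal C++ sketch may help fix the notation in code. The `Point` alias and the function name are our own choices (not taken from any of the discussed implementations) and are reused by the later sketches in this paper.

```cpp
#include <cstddef>
#include <vector>

using Point = std::vector<double>;  // one coordinate per (maximized) objective

// True iff s1 dominates s2: s1 is not worse on any objective
// and strictly better on at least one.
bool dominates(const Point& s1, const Point& s2) {
    bool strictly_better = false;
    for (std::size_t j = 0; j < s1.size(); ++j) {
        if (s1[j] < s2[j]) return false;        // worse on objective j
        if (s1[j] > s2[j]) strictly_better = true;
    }
    return strictly_better;
}
```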

3 Quick Hypervolume II algorithm

In this section we propose a modification of the QHV algorithm of Russo and Francisco [4, 5]. We call this modified algorithm QHV-II. Both QHV and QHV-II are based on the following observations:

  1. $\forall\, s' \in S:\ \mathcal{H}(S, H(r^*, r_*)) = \mathcal{H}(s', r_*) + \mathcal{H}\big(\big(\bigcup_{s \in S \setminus \{s'\}} H(s, r_*)\big) \setminus H(s', r_*)\big)$, i.e. the hypervolume of the space dominated by $S$ is equal to the hypervolume of the hypercuboid defined by a single point $s' \in S$ and $r_*$, i.e. $\mathcal{H}(s', r_*)$, plus the hypervolume of the area dominated by the remaining points $S \setminus \{s'\}$, excluding the area of the hypercuboid $H(s', r_*)$.

  2. The region $H(r^*, r_*) \setminus H(s', r_*)$ may be defined as a union of non-overlapping hypercuboids $\{H_1, \dots, H_L\}$.

  3. Consider a point $s^1 \notin H(r^*, r_*)$ such that $s^1 \succ r_*$. The hypervolume of the space dominated by $s^1$ within $H(r^*, r_*)$ is equal to the hypervolume of the space dominated by the projection of $s^1$ onto $H(r^*, r_*)$, where the projection replaces each coordinate of the point larger than the corresponding coordinate of $r^*$ with the corresponding coordinate of $r^*$.

The above observations immediately suggest the possibility of calculating the hypervolume in a recursive manner with Algorithm 1. The algorithm selects a pivot point, calculates the hypervolume of the area dominated by the pivot point, and then splits the problem of calculating the remaining hypervolume into a number of sub-problems. If the number of points is sufficiently small, it uses simple geometric properties to calculate the hypervolume.

Algorithm 1 General QHV
 Parameters: $H(r^*, r_*)$, $S \subset H(r^*, r_*)$
if $S$ contains one or two points then
  Calculate $\mathcal{H}(S, H(r^*, r_*))$ using simple geometric properties
  $HyperVolume \leftarrow \mathcal{H}(S, H(r^*, r_*))$
else
  Select a pivot point $s' \in S$
  $HyperVolume \leftarrow \mathcal{H}(s', r_*)$
  Split $H(r^*, r_*) \setminus H(s', r_*)$ into a set of non-overlapping hypercuboids $\{H_1, \dots, H_L\}$
  for all $H_l \in \{H_1, \dots, H_L\}$ do
   Construct the set $S_l$ containing the points dominating $r^l_*$, projected onto $H_l$ if necessary
   $HyperVolume \leftarrow HyperVolume + \mathrm{QHV}(H_l, S_l)$
return $HyperVolume$
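For one or two points, the "simple geometric properties" reduce to inclusion-exclusion over at most two boxes: $\mathcal{H}(\{s^1, s^2\}) = \mathcal{H}(s^1, r_*) + \mathcal{H}(s^2, r_*) - \mathcal{H}(\min(s^1, s^2), r_*)$, with the minimum taken coordinate-wise. A C++ sketch of this base case (our own helper names, reusing the `Point` alias from section 2):

```cpp
#include <algorithm>
#include <vector>

// Volume of the box spanned by point s and the lower corner r_*.
double box_volume(const Point& s, const Point& rlow) {
    double v = 1.0;
    for (std::size_t j = 0; j < s.size(); ++j) v *= s[j] - rlow[j];
    return v;
}

// Hypervolume of one or two points via inclusion-exclusion.
double base_case(const std::vector<Point>& S, const Point& rlow) {
    if (S.size() == 1) return box_volume(S[0], rlow);
    Point overlap(S[0].size());                   // coordinate-wise minimum
    for (std::size_t j = 0; j < overlap.size(); ++j)
        overlap[j] = std::min(S[0][j], S[1][j]);
    return box_volume(S[0], rlow) + box_volume(S[1], rlow)
         - box_volume(overlap, rlow);
}
```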

Russo and Francisco [4, 5] propose to split the region $H(r^*, r_*) \setminus H(s', r_*)$ into $2^d - 2$ hypercuboids corresponding to each possible combination of the comparisons on the particular objectives, where a coordinate may be $<$ or $\geq$ the corresponding coordinate of the pivot point $s'$, with the exception of the two combinations corresponding to the areas dominated by and dominating $s'$. Each such hypercuboid may be defined by a binary vector in which $0$ at the $j$-th position means that $s_j < s'_j$ and $1$ at the $j$-th position means that $s_j \geq s'_j$. We will call such hypercuboids basic hypercuboids.
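In code, the basic hypercuboid containing a point relative to the pivot is conveniently identified by a bitmask; the following sketch (our own, assuming $d \leq 32$ and the `Point` alias from section 2) illustrates the encoding:

```cpp
#include <cstdint>

// Bit j is set iff s_j >= pivot_j. The masks 00...0 (dominated by the
// pivot) and 11...1 (dominating the pivot) are excluded from the split,
// which leaves the 2^d - 2 basic hypercuboids.
std::uint32_t basic_hypercuboid(const Point& s, const Point& pivot) {
    std::uint32_t mask = 0;
    for (std::size_t j = 0; j < s.size(); ++j)
        if (s[j] >= pivot[j]) mask |= (1u << j);
    return mask;
}
```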

We propose a different splitting scheme. We split the region $H(r^*, r_*) \setminus H(s', r_*)$ into $d$ hypercuboids defined in the following way:

  • $H_1$ is defined by the condition $s_1 \geq s'_1$

  • …

  • $H_j$ is defined by the conditions $s_l < s'_l\ \forall\, l = 1, \dots, j-1$ and $s_j \geq s'_j$

  • …

  • $H_d$ is defined by the conditions $s_l < s'_l\ \forall\, l = 1, \dots, d-1$ and $s_d \geq s'_d$

In other words, the hypercuboids are defined not by binary vectors but by the following schemata of binary vectors:

  • $v_1 = 1{*}\dots{*}$

  • …

  • $v_j = 0\dots01{*}\dots{*}$, with $1$ at the $j$-th position

  • …

  • $v_d = 0\dots01$

where $*$ means any symbol, either $0$ or $1$. Each of these hypercuboids is obviously a union of a number of the basic hypercuboids.

The difference between the splitting schemes in QHV and QHV-II is illustrated graphically in Figure 1 for the 3-objective case. In this case, there are 6 basic hypercuboids. The colors indicate the hypercuboids corresponding to the different sub-problems. Note that in the case of QHV-II the hypercuboid defined by the condition $s_1 \geq s'_1$ also contains the region dominating $s'$, but this region does not contain any points. In addition, the arrows indicate the directions of the projections of the points onto the hypercuboids.

[Figure 1: The comparison of the splitting schemes in QHV and QHV-II; left panel: QHV, right panel: QHV-II.]

As proposed in [4, 5], we select as the pivot point the point $s' \in S$ with the maximum $\mathcal{H}(s', r_*)$.
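Combining the splitting scheme with this pivot rule, one QHV-II recursion step could be sketched in C++ as follows. This is our reading of the algorithm under the definitions above (reusing `Point`, `box_volume` and `base_case` from the earlier sketches), not the authors' code; in particular, it does not explicitly remove dominated projected points, in line with the QHV-II variant discussed below.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypervolume dominated by S inside the hypercuboid H(r^*, r_*),
// given by corners rup (r^*) and rlow (r_*); a sketch, not reference code.
double qhv2(std::vector<Point> S, Point rlow, Point rup) {
    if (S.empty()) return 0.0;
    if (S.size() <= 2) return base_case(S, rlow);

    // Pivot: the point spanning the largest box with r_*.
    std::size_t pi = 0;
    for (std::size_t i = 1; i < S.size(); ++i)
        if (box_volume(S[i], rlow) > box_volume(S[pi], rlow)) pi = i;
    const Point pivot = S[pi];
    double hv = box_volume(pivot, rlow);

    const std::size_t d = pivot.size();
    for (std::size_t k = 0; k < d; ++k) {  // sub-problem H_{k+1}
        // Lower corner raised to pivot[k] on objective k;
        // upper corner clipped to pivot[l] on objectives l < k.
        Point lo = rlow, up = rup;
        lo[k] = pivot[k];
        for (std::size_t l = 0; l < k; ++l) up[l] = pivot[l];

        std::vector<Point> Sk;
        for (std::size_t i = 0; i < S.size(); ++i) {
            if (i == pi || S[i][k] < pivot[k]) continue;  // outside H_{k+1}
            Point q = S[i];                               // project onto H_{k+1}
            for (std::size_t l = 0; l < k; ++l) q[l] = std::min(q[l], pivot[l]);
            Sk.push_back(std::move(q));
        }
        hv += qhv2(std::move(Sk), std::move(lo), std::move(up));
    }
    return hv;
}
```

A top-level call would be `qhv2(S, r_low, r_up)` with the corners of the original hypercuboid; points dominated by the pivot are skipped automatically, since they fail the test `S[i][k] >= pivot[k]` for every $k$.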

Note that the points projected onto a hypercuboid $H_l$ may become dominated, since some coordinates are replaced with lower values. Russo and Francisco [4, 5] propose to explicitly remove the dominated points, e.g. with the algorithm proposed in [11]. We do not use this step in QHV-II, since we did not find it practically beneficial in preliminary computational experiments. Note, however, that like Russo and Francisco [4, 5] we implemented only a naive method for the removal of the dominated points, comparing all pairs of points; this could perhaps be improved by using more advanced methods. Note also that the pivot point $s'$ selected in the way described above is guaranteed to be non-dominated within $S$. Furthermore, while assigning points to the sets $S_l$, each point is compared to $s'$ and the points dominated by $s'$ may be removed. This does not guarantee an immediate removal of all dominated points, but eventually all dominated points will be removed by the algorithm, because each of them will either be dominated by one of the selected pivots or eliminated while using the simple geometric properties when the number of points is sufficiently small.
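The naive all-pairs filtering mentioned above could be sketched as follows (a hypothetical helper built on `dominates` from section 2):

```cpp
#include <vector>

// O(n^2 d) removal of dominated points, as in the naive method above.
std::vector<Point> remove_dominated(const std::vector<Point>& S) {
    std::vector<Point> out;
    for (std::size_t i = 0; i < S.size(); ++i) {
        bool dom = false;
        for (std::size_t j = 0; j < S.size() && !dom; ++j)
            dom = (j != i) && dominates(S[j], S[i]);
        if (!dom) out.push_back(S[i]);
    }
    return out;
}
```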

Note also that, as suggested in [4, 5], the projected points do not need to be constructed explicitly; their coordinates may be calculated on demand to reduce the memory requirements.

4 Theoretical Analysis

The analysis of the computational complexity of complex algorithms such as QHV or QHV-II is very difficult. Russo and Francisco [4] present a quite complex analysis of the computational complexity of the QHV algorithm. However, the authors needed to make some strong assumptions, such as the uniform distribution of points on a hyperplane or on the surface of a hypersphere, to prove the main results. Furthermore, their analysis in general shows that for large numbers of points most of the projected points become dominated and may be removed; thus the time needed for the removal of the dominated points has the main influence on the asymptotic behavior of the algorithm. Realistic data sets are, however, much smaller, and the projected non-dominated points significantly influence running times. Indeed, in our theoretical analysis we show that in the best case, when all projected points are dominated and removed, QHV and QHV-II have the same computational complexity. However, if all or a fraction of the projected points are preserved, QHV-II starts to outperform QHV.

In this paper we use more standard tools, namely recurrence solving and the Akra-Bazzi method [12], to analyze the computational complexity of QHV-II and QHV in some specific cases. We present results for the worst and the best cases. Since it is very difficult to analyze the average-case behavior of such complex algorithms, we also analyze a number of specific intermediate cases for which we can obtain the complexity of QHV and QHV-II.

4.1 Worst case

To analyze the worst case we follow the approach proposed in [4]. The worst case occurs when all points except the pivot point are assigned to each sub-problem. In this case, we can disregard the time of removal of the dominated points and use the relation:

$T(n) = d\,T(n-1)$ (1)

where $T(n)$ is the number of comparisons needed to process a set of $n$ points. Solving the above recurrence:

$T(n) = c\,d^{n-1}$ (2)

where $c$ is a constant. Thus QHV-II has the worst-case time complexity $\mathcal{O}(d^{n-1})$.

For the original QHV algorithm we get:

$T(n) = (2^d - 2)\,T(n-1)$ (3)

Solving the above recurrence:

$T(n) = c\,(2^d - 2)^{n-1}$ (4)

where $c$ is a constant. Thus QHV has the worst-case time complexity $\mathcal{O}(2^{d(n-1)})$, which is worse than the worst-case time complexity of QHV-II; e.g. for $d = 10$, each recursion level branches into 10 sub-problems in QHV-II versus $2^{10} - 2 = 1022$ in QHV.

4.2 Best case

In the best case, the points are distributed uniformly among all basic hypercuboids and all projected points become dominated and are removed. So, each of the $2^d - 2$ basic hypercuboids contains $n/(2^d - 2)$ points.

To analyze the complexity of the algorithm we use the Akra-Bazzi method [12] for the analysis of divide-and-conquer algorithms of the form:

$T(n) = \begin{cases} T_0, & \text{if } n \leq n_0 \\ \sum_{k=1}^{K} a_k T(n/b_k) + g(n), & \text{otherwise} \end{cases}$ (5)

where $K$ is a constant, $a_k > 0$ and $b_k > 1$ are constants for all $k$, $g(n) = \Omega(n^{c_1})$, and $g(n) = \mathcal{O}(n^{c_2})$ for some constants $0 < c_1 \leq c_2$. In this case:

$T(n) = \Theta\big(n^{p^o}\big(1 + \int_1^n \frac{g(u)}{u^{p^o+1}}\,du\big)\big)$ (6)

where $p^o$ is the unique real solution of the equation:

$\sum_{k=1}^{K} \frac{a_k}{b_k^{\,p}} = 1$ (7)
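For the cases analyzed below, equation 7 rarely has a closed-form solution, but its left-hand side is strictly decreasing in $p$ (all $b_k > 1$), so $p^o$ can be found by simple bisection. A small C++ sketch of such a solver (our own helper, used here only to cross-check the values reported in Tables 1 and 2):

```cpp
#include <cmath>
#include <vector>

// Solve sum_k a_k / b_k^p = 1 (equation 7) for p by bisection;
// requires all b_k > 1 and sum_k a_k > 1, so that a root exists.
double solve_po(const std::vector<double>& a, const std::vector<double>& b) {
    auto f = [&](double p) {
        double s = 0.0;
        for (std::size_t k = 0; k < a.size(); ++k)
            s += a[k] / std::pow(b[k], p);
        return s - 1.0;                  // strictly decreasing in p
    };
    double lo = 0.0, hi = 64.0;          // f(lo) > 0, f(hi) < 0 in our cases
    for (int it = 0; it < 200; ++it) {
        const double mid = 0.5 * (lo + hi);
        (f(mid) > 0.0 ? lo : hi) = mid;
    }
    return 0.5 * (lo + hi);
}
```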

4.2.1 QHV-II

In QHV-II the problem is split into $d$ sub-problems. The $k$-th sub-problem corresponds to the schema with $d-k$ positions that can be either $0$ or $1$. Thus there are $2^{d-k}$ binary vectors, and thus $2^{d-k}$ basic hypercuboids directly (i.e. without projection) assigned to the $k$-th sub-problem, for $k = 2, \dots, d$. For $k = 1$ we have $2^{d-1} - 1$ directly assigned basic hypercuboids, because we discard the basic hypercuboid corresponding to the binary vector $11\dots1$, containing points dominating the pivot point.

In this case, $K = d$, $a_k = 1$, $b_k = (2^d - 2)/2^{d-k}$ for $k = 2, \dots, d$, and $b_1 = (2^d - 2)/(2^{d-1} - 1)$. Equation 7 has the solution $p^o = 1$. If we disregard the removal of the dominated points (i.e. assume that all of them were removed through the comparison to the pivot point), then $g(n) = n$ and:

$T(n) = \Theta\big(n\big(1 + \int_1^n \frac{u}{u^2}\,du\big)\big) = \Theta(n \log n)$ (8)

The dominated points can be removed in $\Theta(n \log^{d-2} n)$ time [4, 11]. So, if we take into account the time needed for the removal of the dominated points and the time needed to compare each point to the pivot point, then $g(n) = n + \Theta(n \log^{d-2} n) = \Theta(n \log^{d-2} n)$ and we get:

$T(n) = \Theta\big(n\big(1 + \int_1^n \frac{\Theta(u \log^{d-2} u)}{u^2}\,du\big)\big) = \Theta(n \log^{d-1} n)$ (9)

4.2.2 QHV

In the same way we may analyze the best-case behavior of the original QHV algorithm. In QHV the problem is split into $d-1$ classes of equivalent sub-problems. The $k$-th class contains $\binom{d}{k}$ sub-problems with binary vectors having $k$ symbols $1$. Each sub-problem contains just one basic hypercuboid.

In this case, $K = d-1$, $a_k = \binom{d}{k}$, and $b_k = 2^d - 2$. We again get $p^o = 1$ and the same times $T(n) = \Theta(n \log n)$ or $T(n) = \Theta(n \log^{d-1} n)$, without or with the explicit removal of the dominated points, respectively. So, in the best case the time complexities of QHV and QHV-II are the same.

4.3 Intermediate case

Consider the case when the points are equally distributed among all basic hypercuboids but no projected points are removed. In other words, each sub-problem will contain all points belonging to its hypercuboid and all projected points.

4.3.1 QHV-II

In this case, $a_k = 1$ and $b_k = 2$, since each sub-problem will contain the half of the points that are better than the pivot point on one objective. Equation 7 has the solution $p^o = \log_2 d$. Since we can disregard the removal of the dominated points, $g(n) = n$ and:

$T(n) = \Theta\big(n^{\log_2 d}\big(1 + \int_1^n \frac{u}{u^{\log_2 d + 1}}\,du\big)\big) = \Theta\big(n^{\log_2 d}(1 + \Theta(n^{1 - \log_2 d}))\big) = \Theta(n^{\log_2 d})$ (10)

Furthermore, since the points dominated by the pivot point may be removed without any additional comparisons, QHV-II has the time complexity $\mathcal{O}(n^{\log_2 d})$ with the considered distribution of points.

4.3.2 QHV

In the case of the original QHV algorithm, $a_k = \binom{d}{k}$ and $b_k = (2^d - 2)/(2^{d-k} - 1)$, since there are $2^{d-k} - 2$ other basic hypercuboids projected onto a basic hypercuboid with $k$ symbols $1$.

Unfortunately, we were not able to solve equation 7 analytically in this case. In Table 1 we present the values of $p^o$ obtained numerically for different numbers of objectives $d$. However, we can consider an approximate model assuming that the $n$ points are uniformly distributed among all basic hypercuboids, including the two hypercuboids dominated by and dominating the pivot point $s'$. For larger values of $d$ the differences between the exact and the approximate models become very small, as illustrated in Table 1, and since the approximate model differs in only two among the $2^d$ basic hypercuboids, it asymptotically converges to the exact model with growing $d$. Note that although Table 1 suggests that $p^o$ is always higher for the approximate model, we do not have a formal proof of this fact. For the approximate model, $a_k = \binom{d}{k}$, $k = 1, \dots, d$, $b_k = 2^d/2^{d-k} = 2^k$, and equation 7 is expressed as:

$\sum_{k=1}^{d} \frac{\binom{d}{k}}{(2^k)^{p}} = (2^{-p} + 1)^d - 1 = 1$ (11)

and has solution:

$p = -\log_2(2^{1/d} - 1)$ (12)

thus:

$T(n) = \Theta\big(n^{-\log_2(2^{1/d}-1)}\big(1 + \int_1^n \frac{u}{u^{-\log_2(2^{1/d}-1)+1}}\,du\big)\big) = \Theta\big(n^{-\log_2(2^{1/d}-1)}(1 + \Theta(n^{1 + \log_2(2^{1/d}-1)}))\big) = \Theta(n^{-\log_2(2^{1/d}-1)})$ (13)
Theorem 1.

$\Theta(n^{\log_2 d}) < \Theta(n^{-\log_2(2^{1/d}-1)})$, i.e. the time complexity of QHV-II in the considered case is lower than the time complexity of QHV assuming the approximate model.

This theorem is a direct consequence of Lemma 1.

Lemma 1.

$\log_2 d < -\log_2(2^{1/d} - 1)$

Proof 1.

$\log_2 d < -\log_2(2^{1/d} - 1) \iff \log_2 d + \log_2(2^{1/d} - 1) < 0 \iff \log_2(d\,2^{1/d} - d) < 0 \iff d\,2^{1/d} - d < 1$ (14)

Using the Laurent series expansion at $d = \infty$:

$d\,2^{1/d} - d = \ln 2 + \frac{\ln^2 2}{2d} + \mathcal{O}\big((1/d)^2\big) < 1$ (15)

which holds for every $d \geq 2$, since $\ln 2 \approx 0.693$ and the remaining terms quickly vanish with growing $d$ (e.g. for $d = 2$ we have $d\,2^{1/d} - d \approx 0.828$).

\square

Table 1: Comparison of $p^o$ values for the exact and the approximate models

d      $p^o$, exact model      $p^o$, approximate model
2      1                       1.2715
4      2.2942                  2.4019
6      2.9920                  3.0295
8      3.4543                  3.4658
10     3.7971                  3.8004
12     4.0709                  4.0718
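As a sanity check, the bisection sketch from section 4.2 reproduces the exact-model column of Table 1 when fed with the coefficients derived above; a hypothetical usage:

```cpp
#include <cmath>
#include <vector>

// p^o for QHV under the exact intermediate-case model (section 4.3.2):
// a_k = C(d,k), b_k = (2^d - 2)/(2^(d-k) - 1), k = 1..d-1.
double po_qhv_exact(int d) {
    std::vector<double> a, b;
    double binom = 1.0;                          // C(d,0)
    for (int k = 1; k <= d - 1; ++k) {
        binom = binom * (d - k + 1) / k;         // C(d,k)
        a.push_back(binom);
        b.push_back((std::pow(2.0, d) - 2.0) / (std::pow(2.0, d - k) - 1.0));
    }
    return solve_po(a, b);  // e.g. d = 4 yields ~2.2942, matching Table 1
}
```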

4.4 Constant fraction of the removed points

In practice, some of the projected points are removed. The fraction of the preserved (not removed) projected points depends in general on the number of points and the number of objectives. We may consider, however, a simplified model where the fraction of the preserved points is a constant $0 \leq C \leq 1$. Note that for $C = 0$ we get the best case considered above, and for $C = 1$ the intermediate case considered above. In this case:

$b_k = \frac{2^d - 2}{2^{d-k} + C(2^{d-1} - 2^{d-k} - 1)}$, $k = 2, \dots, d$; $\quad b_1 = \frac{2^d - 2}{2^{d-1} - 1}$ (16)

for QHV-II, and:

$b_k = \frac{2^d - 2}{1 + C(2^{d-k} - 2)}$, $k = 1, \dots, d$ (17)

for QHV. Unfortunately, we did not manage to solve equation 7 analytically in these cases. Numerical analysis with the use of non-linear solvers suggests that the following hypothesis is true:

Hypothesis 1.

$p^o$ is lower for QHV-II than for QHV, i.e. the time complexity of QHV-II is lower than the time complexity of QHV in the considered case, for any $C > 0$.

Namely, we used a non-linear solver to solve the following optimization problem:

$\min(p^o_{QHV} - p^o_{QHV\text{-}II})$ s.t. equation 7 for QHV and equation 7 for QHV-II (18)

with variables $0 \leq C \leq 1$, $d \geq 2$, $p^o_{QHV} \geq 1$, $p^o_{QHV\text{-}II} \geq 1$. The optimum solution was always $C = 0$, $p^o_{QHV} = 1$, $p^o_{QHV\text{-}II} = 1$ for every starting solution tested. Furthermore, in Table 2 we present some exemplary values of $p^o$ for various numbers of objectives and values of $C$, obtained by solving equation 7 numerically. Of course, this is still not a formal proof of Hypothesis 1.

Table 2: Comparison of $p^o$ values for QHV-II and QHV

d      C      $p^o_{QHV\text{-}II}$      $p^o_{QHV}$
4      0.9    1.8635                     2.0928
4      0.5    1.4279                     1.5071
4      0.1    1.0837                     1.0957
12     0.9    3.1897                     3.6731
12     0.5    2.0706                     2.5321
12     0.1    1.2548                     1.5445
20     0.9    3.8092                     4.3628
20     0.5    2.3820                     3.0746
20     0.1    1.3563                     1.9701
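The same bisection solver covers the constant-fraction model; a hypothetical usage with the QHV-II coefficients of equation 16 (the QHV case of equation 17 is analogous):

```cpp
#include <cmath>
#include <vector>

// p^o for QHV-II with a preserved fraction C of projected points
// (equation 16): a_k = 1 for all k.
double po_qhv2_fraction(int d, double C) {
    const double twoD = std::pow(2.0, d);
    std::vector<double> a(d, 1.0), b(d);
    b[0] = (twoD - 2.0) / (std::pow(2.0, d - 1) - 1.0);
    for (int k = 2; k <= d; ++k) {
        const double direct = std::pow(2.0, d - k);   // 2^(d-k)
        b[k - 1] = (twoD - 2.0)
                 / (direct + C * (std::pow(2.0, d - 1) - direct - 1.0));
    }
    return solve_po(a, b);  // e.g. d = 4, C = 0.5 yields ~1.4279 (Table 2)
}
```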

4.5 Average case with no removal of the dominated points

In QHV-II each of the $d$ sub-problems corresponds to a split based on the value of one objective of the pivot point, which allows us to follow the analysis of well-known algorithms such as binary search or Quicksort. Assuming that the pivot point is selected at random with uniform probability, and that the ranks of a given point according to the particular objectives are independent variables:

$T(n) = n + d\,\frac{1}{n}\sum_{k=1}^{n-1} T(k)$ (19)

Multiplying both sides by $n$:

$n\,T(n) = n^2 + d\sum_{k=1}^{n-1} T(k)$ (20)

Assuming that $n \geq 2$:

$(n-1)\,T(n-1) = (n-1)^2 + d\sum_{k=1}^{n-2} T(k)$ (21)

Subtracting equations 20 and 21:

$n\,T(n) - (n-1)\,T(n-1) = n^2 - (n-1)^2 + d\sum_{k=1}^{n-1} T(k) - d\sum_{k=1}^{n-2} T(k)$ (22)

Simplifying:

$T(n) = \frac{2n-1}{n} + \frac{n+d-1}{n}\,T(n-1)$ (23)

The above recurrence has the family of solutions:

$T(n) = c_1\,\frac{\Gamma(d+n)}{\Gamma(n+1)} - c_2\,n - c_3$ (24)

where:

$c_1 = c\,\frac{(d-2)(d-1) + d^2}{(d-2)(d-1)\,\Gamma(d+1)}, \quad c_2 = \frac{2}{d-2}, \quad c_3 = \frac{d}{(d-1)(d-2)}$ (25)

(with $c$ being an arbitrary constant), which can be shown by substituting $T(n)$ and $T(n-1)$ according to equation 24 into equation 23.

Since

$\lim_{n\to\infty} \frac{\Gamma(n+\alpha)}{\Gamma(n)\,n^{\alpha}} = 1$ (26)

we have

$\frac{\Gamma(n+\alpha)}{\Gamma(n)} \sim n^{\alpha} \quad \text{as } n \to \infty$ (27)

and thus

$T(n) = \Theta(n^{d-1})$ (28)

This analysis was based on the assumption that the pivot point is selected at random, while in QHV and QHV-II we select the point with the maximum $\mathcal{H}(s, r_*)$. This increases the chance of selecting points with a good balance of the values of all objectives, i.e. with ranks closer to $n/2$ on each objective, which may improve the practical behavior of the algorithms.
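The $\Theta(n^{d-1})$ behavior is easy to check numerically by iterating recurrence 23 and watching $T(n)/n^{d-1}$ stabilize; a short, self-contained C++ sketch (the limit constant depends on the assumed $T(1)$):

```cpp
#include <cmath>
#include <cstdio>

int main() {
    const int d = 3;                 // number of objectives
    double T = 1.0;                  // T(1); any positive constant
    for (int n = 2; n <= 1000000; ++n) {
        T = (2.0 * n - 1.0) / n + (n + d - 1.0) / n * T;   // equation 23
        if (n % 200000 == 0)         // the ratio approaches a constant
            std::printf("n = %7d   T(n)/n^%d = %.6f\n", n, d - 1,
                        T / std::pow((double)n, d - 1));
    }
    return 0;
}
```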

A similar analysis of QHV is much more difficult; we can prove, however, that under the same assumptions the following relation holds:

Theorem 2.

$T'(n) > T(n)$, where $T'(n)$ and $T(n)$ are the numbers of comparisons in the QHV and QHV-II algorithms, respectively.

Proof 2.

In QHV each node is split into $2^d - 2$ sub-problems. Among them are the $d$ sub-problems corresponding to the $d$ basic hypercuboids that are better than or equal to the pivot point on only one objective. Consider one of such basic hypercuboids, better than or equal to the pivot point only on objective $j$. All points with equal or better values on objective $j$ either belong to or are projected onto this hypercuboid. Thus, each of these $d$ sub-problems is equivalent to one of the $d$ sub-problems used in QHV-II. As a result:

$T'(n) = n + d\,\frac{1}{n}\sum_{k=1}^{n-1} T(k) + T''(n) = T(n) + T''(n) > T(n)$ (29)

where $T''(n)$ is the number of comparisons needed to process the remaining sub-problems, other than the $d$ sub-problems discussed above. $\square$

4.6 Discussion

We have shown that the worst-case time complexity of QHV-II is better than that of QHV. On the other hand, the best-case time complexities of the two algorithms are equal. We have also analyzed a number of intermediate cases, each time showing that the time complexity of QHV-II is better than that of QHV, which supports the hypothesis that in practical cases QHV-II performs better than QHV.

5 Computational experiment

In order to test the proposed method we used the test instances proposed in [8]. The set of instances includes linear, concave, convex, and hard instances with up to 10 objectives and up to 1000 points. In addition, we also used instances with points uniformly distributed over a multi-dimensional spherical surface, generated by us in the way proposed by Russo and Francisco [4]. Among the hard instances there is just one instance of each size; in the other cases, there are 10 instances of each size. The presented results are the average values over the 10 instances of each type and size in all cases, with the exception of the hard instances. All test instances used in this experiment, as well as the source code and the detailed numerical results, are available at https://sites.google.com/view/qhv-ii/qhv-ii.

The main goal of the computational experiment is to show that QHV-II performs fewer operations than QHV. Note that our goal is not to present a highly efficient code for the calculation of the hypervolume. We have developed only a simple implementation of QHV-II in C++, and in the same programming language we have also implemented the QHV algorithm; both implementations share the majority of the code. We also compare them to the implementation of QHV available at http://web.tecnico.ulisboa.pt/luis.russo/QHV/ developed by the authors of the algorithm. We call this implementation oQHV (the original implementation of QHV). The oQHV implementation is much faster than our implementation of QHV (and thus of QHV-II) for several reasons:

  1. oQHV is implemented in C and the code is highly optimized for efficiency. Our implementation is made in C++ and is relatively simple. In particular, we do not use any of the low-level code optimizations used in oQHV, and we use object-oriented techniques, e.g. classes from the STL library, which are very convenient but can make the code less efficient.

  2. oQHV uses additional methods, namely HSO [13] and IEX [14], for the sub-problems with small numbers of points, because these methods are faster in such cases. We do not use these additional methods in our implementation.

To make sure that our understanding of QHV is correct, we modified the code of oQHV. We call this modified implementation oQHV-s (oQHV simplified). In oQHV-s we do not use the additional methods (HSO and IEX) for the sub-problems with small numbers of points; in other words, oQHV-s implements the same algorithm as our implementation of QHV. We have compared the numbers of visited nodes in the recursive tree in both our implementation of QHV and oQHV-s. The numbers of visited nodes were the same for the smaller instances and differed slightly (in only one case by more than 0.1%) for the larger instances. These small differences were probably caused by small numerical perturbations, since in both implementations the objective values are stored as floating-point numbers. On the other hand, the running times of our implementation of QHV were on average 3 times higher, due to the differences in the programming languages and code optimization. Of course, we have also checked that our implementations of QHV-II and QHV return the same values of the hypervolume as the other tested algorithms.

The numbers of visited nodes in QHV-II and oQHV are not directly comparable, since oQHV cuts the recursive tree at some levels and applies the additional methods (HSO or IEX) to the small sub-problems. To make a fair comparison, in Figures 2 to 6 we compare the original implementation of QHV and our implementation of QHV-II with the recursive trees cut in both cases at the level at which the original implementation of QHV would use the additional methods. In other words, the sub-problems which the original implementation of QHV would solve with the additional methods were completely skipped. We call these methods oQHV-cut and QHV-II-cut. We report the numbers of visited (internal) nodes and the numbers of leaves, i.e. the numbers of calls to the additional methods (HSO or IEX). The number of visited nodes was on average 2.21 times (33.5 times at maximum) higher in oQHV-cut, and the number of leaves was on average 5.39 times (46 times at maximum) higher. These numbers were almost always lower for QHV-II-cut, except for 4 hard instances with 6 objectives, where the numbers of visited nodes were slightly higher for QHV-II-cut. The relative differences were generally higher for the instances with higher numbers of objectives. These results confirm that QHV-II constructs smaller recursive trees, and they allow us to predict that a more advanced implementation of QHV-II, i.e. with a more optimized code and with the use of the additional methods for small sub-problems, could be several times faster than the original implementation of QHV for the tested instances.

Note also that QHV-II behaves much more predictably than QHV on the hard instances. The numbers of visited nodes and leaves in QHV-II grow gradually with the growing number of points, while in QHV they are sometimes higher for smaller sets.

Although our goal is not to develop a highly efficient implementation of QHV-II, our simple code is already competitive with the state-of-the-art implementations in some cases. In Figures 7 to 11 we present the running times of our implementation of QHV-II, the original implementation of QHV (oQHV), and the non-incremental version of HBDA (HBDA-NI) [8] and the WFG [10] algorithms. oQHV and QHV-II were run by ourselves on an Intel Core i7-5500U CPU at 2.4 GHz. For HBDA-NI and WFG, in the case of the linear, concave, convex, and hard instances, we used the results obtained in [8]. Since a slower CPU was used in that experiment, we re-scaled those running times by a factor of 2.5, obtained by comparing the running times in a number of test runs. For the uniform spherical instances, we ran the WFG algorithm ourselves using the code available at http://www.wfg.csse.uwa.edu.au/hypervolume/. Unfortunately, because of some technical problems, we were not able to run the code of HBDA-NI, so this method was not used for these instances. These results confirm that even our simple implementation of QHV-II is already competitive with the state-of-the-art codes in some cases, especially for the hard instances with 10 objectives. Recall that the results presented above suggest that the implementation of QHV-II could be further significantly improved by adding low-level code optimizations and by using the additional methods for the small sub-problems.

[Figure 2: Numbers of nodes and leaves in oQHV-cut and QHV-II-cut for the linear instances; panels for d=6, d=8, and d=10; number of operations vs. number of points.]

[Figure 3: Numbers of nodes and leaves in oQHV-cut and QHV-II-cut for the convex instances; panels for d=6, d=8, and d=10; number of operations vs. number of points.]

[Figure 4: Numbers of nodes and leaves in oQHV-cut and QHV-II-cut for the concave instances; panels for d=6, d=8, and d=10; number of operations vs. number of points.]

[Figure 5: Numbers of nodes and leaves in oQHV-cut and QHV-II-cut for the hard instances; panels for d=6, d=8, and d=10; number of operations vs. number of points.]

[Figure 6: Numbers of nodes and leaves in oQHV-cut and QHV-II-cut for the uniform spherical instances; panels for d=6, d=8, and d=10; number of operations vs. number of points.]

[Figure 7: Running times for the linear instances; panels for d=6, d=8, and d=10; CPU time [s] vs. number of points.]

[Figure 8: Running times for the convex instances; panels for d=6, d=8, and d=10; CPU time [s] vs. number of points.]

[Figure 9: Running times for the concave instances; panels for d=6, d=8, and d=10; CPU time [s] vs. number of points.]

[Figure 10: Running times for the hard instances; panels for d=6, d=8, and d=10; CPU time [s] vs. number of points.]

[Figure 11: Running times for the uniform spherical instances; panels for d=6, d=8, and d=10; CPU time [s] vs. number of points.]

6 Conclusions and Further Work

We have presented a modified version of the QHV divide-and-conquer algorithm for calculating the exact hypervolume. Namely, we have modified the scheme of splitting the original problem into smaller sub-problems. Through both theoretical analysis and computational experiments we have shown that the modified version constructs smaller recursive trees and reduces the CPU time.

We have not developed a highly efficient implementation of QHV-II, but we believe that we have provided sufficient evidence that an efficient implementation of QHV-II (i.e. an implementation comparable to the efficient implementation of QHV, oQHV) would run several times faster than oQHV for the considered instances.

Interesting directions for further research are:

  1. The use of the concepts of QHV-II for the incremental calculation of the hypervolume, i.e. the calculation of the change of the hypervolume after adding new point(s). A similar adaptation of the original QHV algorithm has been proposed in [5]. We can reasonably expect that the splitting scheme resulting in a faster static algorithm would also reduce the time of the incremental calculation of the hypervolume.

  2. Parallelization of QHV-II, which may further improve practical running times. This issue was also considered in the case of the original QHV in [5].

  3. Adaptation of QHV-II for the approximate calculation of the hypervolume with a guaranteed maximum approximation error.

Acknowledgment

The research of Andrzej Jaszkiewicz was funded by the Polish National Science Center, grant no. UMO-2013/11/B/ST6/01075.

We would like to thank Renaud Lacour, Kathrin Klamroth, and Carlos M. Fonseca for providing us with the detailed results of the computational experiment reported in [8].

References


  • Zitzler et al. [2003] E. Zitzler, L. Thiele, M. Laumanns, C. Fonseca, V. da Fonseca, Performance assessment of multiobjective optimizers: An analysis and review, IEEE Transactions on Evolutionary Computation 7 (2003) 117–132.
  • Zitzler and Künzli [2004] E. Zitzler, S. Künzli, Indicator-Based Selection in Multiobjective Search, Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 832–842.
  • Jiang et al. [2015] S. Jiang, J. Zhang, Y. S. Ong, A. N. Zhang, P. S. Tan, A simple and fast hypervolume indicator-based multiobjective evolutionary algorithm, IEEE Transactions on Cybernetics 45 (2015) 2202–2213.
  • Russo and Francisco [2014] L. M. S. Russo, A. P. Francisco, Quick Hypervolume, IEEE Transactions on Evolutionary Computation 18 (2014) 481–502.
  • Russo and Francisco [2016] L. M. S. Russo, A. P. Francisco, Extending quick hypervolume, Journal of Heuristics 22 (2016) 245–271.
  • Beume et al. [2009] N. Beume, C. M. Fonseca, M. Lopez-Ibanez, L. Paquete, J. Vahrenhold, On the Complexity of Computing the Hypervolume Indicator, IEEE Transactions on Evolutionary Computation 13 (2009) 1075–1082.
  • While and Bradstreet [2012] L. While, L. Bradstreet, Applying the WFG algorithm to calculate incremental hypervolumes, in: 2012 IEEE Congress on Evolutionary Computation, pp. 1–8.
  • Lacour et al. [2017] R. Lacour, K. Klamroth, C. M. Fonseca, A box decomposition algorithm to compute the hypervolume indicator, Computers & Operations Research 79 (2017) 347 – 360.
  • Chan [2013] T. M. Chan, Klee’s measure problem made easy, in: 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, pp. 410–419.
  • While et al. [2012] L. While, L. Bradstreet, L. Barone, A fast way of calculating exact hypervolumes, IEEE Transactions on Evolutionary Computation 16 (2012) 86–95.
  • Bentley [1980] J. L. Bentley, Multidimensional divide-and-conquer, Commun. ACM 23 (1980) 214–229.
  • Akra and Bazzi [1998] M. Akra, L. Bazzi, On the solution of linear recurrence equations, Computational Optimization and Applications 10 (1998) 195–210.
  • While et al. [2006] L. While, P. Hingston, L. Barone, S. Huband, A faster algorithm for calculating hypervolume, IEEE Transactions on Evolutionary Computation 10 (2006) 29–38.
  • Wu and Azarm [2001] J. Wu, S. Azarm, Metrics for quality assessment of a multiobjective design optimization solution set, Journal of Mechanical Design 123 (2001) 18–25.