Another look at halfspace depth:
Flag halfspaces with applications

Dušan Pokorný, Petra Laketa and Stanislav Nagy (nagy@karlin.mff.cuni.cz), Charles University, Faculty of Mathematics and Physics, Prague, Czech Republic

Abstract.

The halfspace depth is a well-studied tool of nonparametric statistics in multivariate spaces, naturally inducing a multivariate generalisation of quantiles. The halfspace depth of a point with respect to a measure is defined as the infimum mass of closed halfspaces that contain the given point. In general, a closed halfspace that attains that infimum does not have to exist. We introduce a flag halfspace, an intermediary between a closed halfspace and its interior. We demonstrate that the halfspace depth can be equivalently formulated also in terms of flag halfspaces, and that there always exists a flag halfspace whose boundary passes through any given point $x$, and has mass exactly equal to the halfspace depth of $x$. Flag halfspaces allow us to derive theoretical results regarding the halfspace depth without the need to differentiate absolutely continuous measures from measures containing atoms, as was frequently done previously. The notion of flag halfspaces is used to state results on the dimensionality of the halfspace median set for random samples. We prove that under mild conditions, the dimension of the sample halfspace median set of $d$-variate data cannot be $d-1$, and that for $d=2$ the sample halfspace median set must be either a two-dimensional convex polygon, or a data point. The latter result guarantees that the computational algorithm for the sample halfspace median from the R package TukeyRegion is exact also in the case when the median set is less-than-full-dimensional in dimension $d=2$.

Key words and phrases:

flag halfspace; halfspace depth; halfspace median; Tukey depth

1991 Mathematics Subject Classification:

62H05, 62G35

1. Introduction: Halfspace depth and its median

Denote by $\mathcal{M}\left(\mathbb{R}^{d}\right)$ the set of all finite Borel measures on the Euclidean space $\mathbb{R}^{d}$. We consider the halfspace depth with respect to (w.r.t.) finite measures $\mu$, that is when $\mu(\mathbb{R}^{d})<\infty$. Compared to the usual setup of probability measures, this extension is minor, and made only for notational convenience. All our results could be considered also for probability measures only, with obvious modifications. The halfspace (or Tukey) depth of $x\in\mathbb{R}^{d}$ w.r.t. $\mu\in\mathcal{M}\left(\mathbb{R}^{d}\right)$ is defined as

(1)

D(x;\mu)=\inf\left\{\mu(H)\colon H\in\mathcal{H}(x)\right\},

where $\mathcal{H}(x)$ is the collection of closed halfspaces in $\mathbb{R}^{d}$ that contain $x$ on their boundary. The halfspace depth quantifies the centrality of $x$ w.r.t. the mass of $\mu$ . That is quite useful in nonparametric statistics, as it allows us to rank sample points according to their depth, from the central to the peripheral ones. As such, the depth enables the introduction of rankings, orderings, and quantile-like inference to multivariate datasets [3, 20, 21]. The upper level sets of the halfspace depth of $\mu$ , given for $\alpha\geq 0$ by

(2)

D_{\alpha}(\mu)=\left\{x\in\mathbb{R}^{d}\colon D(x;\mu)\geq\alpha\right\},

play in nonparametric statistics the role of the inner quantile regions of $\mu$. They are often called the (halfspace) central regions of $\mu$. The sets (2) are nested, closed and convex; they are compact for $\alpha>0$, and non-empty for $\alpha\leq\alpha^{*}(\mu)$, where $\alpha^{*}(\mu)=\sup_{x\in\mathbb{R}^{d}}D(x;\mu)$ is the maximum halfspace depth of $\mu$. Of special importance is the set $D^{*}\left(\mu\right)=D_{\alpha^{*}(\mu)}(\mu)$, which contains the points that are most centrally positioned w.r.t. $\mu$. It is called the set of the halfspace medians of $\mu$ and, as its name suggests, it generalises the median to $\mathbb{R}^{d}$. The halfspace depth has many applications in multivariate statistics, and has been a subject of active research for 30 years [11, 12, 13, 15, 16]. Although many other statistical depth functions have been developed [2, 14, 21], in this paper we focus on the halfspace depth, and sometimes write simply depth instead of halfspace depth. We call a measure $\mu\in\mathcal{M}\left(\mathbb{R}^{d}\right)$ smooth if the $\mu$-mass of every hyperplane in $\mathbb{R}^{d}$ is zero. A measure with a density is smooth; examples of non-smooth measures are those with an atom. The infimum in (1) is attained for smooth measures. That is why theoretical results on the halfspace depth are often formulated only for smooth measures, and why the analysis of the sample halfspace depth (that is, the halfspace depth evaluated w.r.t. empirical measures of random samples) is performed using different techniques [10, 12, 13]. In this paper we introduce flag halfspaces, symmetrised variants of closed halfspaces that may be considered in (1) instead of $\mathcal{H}(x)$ without altering the depth, with the property that a flag halfspace attaining the depth always exists.
We will see that our restatement of formula (1) simplifies many theoretical derivations about the halfspace depth, since one no longer needs to distinguish whether the infimum in (1) is attained.
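For an empirical measure in the plane, definition (1) can be evaluated by brute force. The Python sketch below is an illustration only and is not the algorithm of any package cited here; the function name `halfspace_depth_2d` and the angular tolerance `1e-12` are assumptions introduced for this example. It evaluates the mass of closed halfplanes only for inner normals lying strictly between consecutive critical directions, where no data point other than copies of $x$ sits on the boundary line.

```python
import math

def halfspace_depth_2d(x, points):
    """Sample halfspace depth of x w.r.t. equally weighted `points` in the plane.

    Hypothetical helper: the closed-halfplane count is piecewise constant in the
    angle of the inner normal and can only change at normals orthogonal to p - x,
    so it suffices to probe one normal inside each gap between such angles.
    """
    n = len(points)
    critical = []
    for p in points:
        dx, dy = p[0] - x[0], p[1] - x[1]
        if dx == 0 and dy == 0:
            continue  # copies of x lie in every closed halfplane through x
        phi = math.atan2(dy, dx)
        critical.append((phi + math.pi / 2) % (2 * math.pi))
        critical.append((phi - math.pi / 2) % (2 * math.pi))
    if not critical:
        return 1.0  # every atom coincides with x
    critical.sort()
    best = n
    m = len(critical)
    for i, theta in enumerate(critical):
        nxt = critical[(i + 1) % m]
        if i == m - 1:
            nxt += 2 * math.pi  # wrap around the circle
        if nxt - theta < 1e-12:
            continue  # (near-)duplicate critical angles, nothing to probe
        mid = (theta + nxt) / 2
        v = (math.cos(mid), math.sin(mid))
        count = sum(
            1 for p in points
            if (p[0] - x[0]) * v[0] + (p[1] - x[1]) * v[1] >= 0
        )
        best = min(best, count)
    return best / n

square = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
print(halfspace_depth_2d((0, 0), square))
```

Probing only the gaps is adequate for finitely many atoms because a closed halfplane whose boundary passes through additional atoms never has smaller mass than a slightly rotated one.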

Flag halfspaces are introduced in Section 2. Two applications to the computation of the depth are given in Section 3. In Section 3.1, we investigate the dimensionality and the structure of the median set $D^{*}\left(\mu\right)$ for $\mu$ an empirical measure. We show that for datasets sampled from absolutely continuous probability measures in $\mathbb{R}^{d}$ , the halfspace median set cannot be of dimension $d-1$ , almost surely. In a series of examples in $\mathbb{R}^{3}$ we demonstrate that already for random samples of size $n=8$ from the standard Gaussian distribution, halfspace median sets of dimensions $0$ , $1$ , and $3$ occur with positive probability. In Section 3.2 we deal with the special situation of data of dimension $d=2$ . We show that if the dataset satisfies a mild condition of general position, then the halfspace median set must be either a full-dimensional polygon, or a data point. Both these advances find applications in the computation of the halfspace median and the central regions (2), where the dimensionality of $D^{*}\left(\mu\right)$ plays a crucial role [6, 11]. The paper is complemented by online Supplementary Material containing R and Mathematica scripts with visualisations and computations completing examples from Section 3.

Notations.

Some of our proofs are based on convexity theory. As a basic reference we take [17]; we now gather notations and elementary definitions that will be used throughout the paper. The unit sphere in $\mathbb{R}^{d}$ is $\mathbb{S}^{d-1}$. We write $S\subset K$ for $S$ being a proper subset of $K$. The restriction of $\mu\in\mathcal{M}\left(\mathbb{R}^{d}\right)$ to a Borel set $S\subseteq\mathbb{R}^{d}$ is denoted by $\mu|_{S}\in\mathcal{M}\left(\mathbb{R}^{d}\right)$ and is defined by $\mu|_{S}\left(B\right)=\mu\left(B\cap S\right)$ for $B\subseteq\mathbb{R}^{d}$ Borel. The affine hull $\operatorname{aff}\left(S\right)$ of $S\subseteq\mathbb{R}^{d}$ is the smallest affine subspace of $\mathbb{R}^{d}$ containing $S$. The dimension $\dim(S)$ of $S$ is defined as the dimension of $\operatorname{aff}\left(S\right)$. For example, the affine hull of two different points in $\mathbb{R}^{d}$ is the infinite line joining them, and its dimension is 1. We write $\operatorname{int}\left(S\right)$, $\operatorname{cl}\left(S\right)$, and $\operatorname{bd}\left(S\right)$ for the interior, closure, and boundary of $S\subseteq\mathbb{R}^{d}$. The interior, closure, and boundary of $S$ when considered as a subset of its affine hull $\operatorname{aff}\left(S\right)$ are denoted by $\operatorname{relint}\left(S\right)$, $\operatorname{relcl}\left(S\right)$ and $\operatorname{relbd}\left(S\right)$, and are called the relative interior, relative closure, and relative boundary of $S$, respectively. Of course, if $\dim(S)=d$, the relative interior, closure, and boundary coincide with the usual ones.

The class of all closed halfspaces in $\mathbb{R}^{d}$ is $\mathcal{H}$ . A generic halfspace from $\mathcal{H}$ may be denoted simply by $H$ ; $H_{x,v}$ means a halfspace $\left\{y\in\mathbb{R}^{d}\colon\left\langle y,v\right\rangle\geq\left\langle x,v\right\rangle\right\}$ whose boundary passes through $x\in\mathbb{R}^{d}$ with inner normal $v\in\mathbb{R}^{d}\setminus\{0\}$ . For an affine space $A\subseteq\mathbb{R}^{d}$ and $x\in A$ we denote by $\mathcal{H}(x,A)$ the set of all relatively closed halfspaces $H$ in $A$ whose relative boundary contains $x$ ; surely $\mathcal{H}(x,\mathbb{R}^{d})\equiv\mathcal{H}(x)$ . We say that a sequence of halfspaces $\{H_{x_{n},v_{n}}\}_{n=1}^{\infty}\subset\mathcal{H}$ converges to $H_{x,v}\in\mathcal{H}$ if $x_{n}\rightarrow x$ and $v_{n}\rightarrow v$ . Finally, for any of the symbols $\mathcal{H}$ , $\mathcal{H}(x)$ , or $\mathcal{H}(x,A)$ , a superscript $\circ$ designates the corresponding relatively open halfspaces, e.g. $\mathcal{H}^{\circ}(x,A)=\left\{\operatorname{relint}\left(H\right)\colon H\in\mathcal{H}(x,A)\right\}$ .

2. Flag halfspaces

For $\mu\in\mathcal{M}\left(\mathbb{R}^{d}\right)$ and $x\in\mathbb{R}^{d}$ we call $H\in\mathcal{H}(x)$ a minimising halfspace of $\mu$ at $x$ if $\mu(H)=D(x;\mu)$. For $d=1$ minimising halfspaces always trivially exist. They also exist if $\mu$ is smooth, or if $\mu$ is supported in a finite number of points. In general, however, the infimum in (1) does not have to be attained. We give a simple example.

Example 1.

Take $\mu\in\mathcal{M}\left(\mathbb{R}^{2}\right)$ the sum of the Dirac measure at $a=(1,1)\in\mathbb{R}^{2}$ and the uniform distribution on the disk $\left\{x\in\mathbb{R}^{2}\colon\left\|x\right\|\leq 2\right\}$ . For $x=(1,0)\in\mathbb{R}^{2}$ no minimising halfspace exists. As we see in Figure 1, the depth $D(x;\mu)$ is approached by $\mu(H_{x,v_{n}})$ for a sequence of halfspaces $H_{n}\equiv H_{x,v_{n}}$ , $n=1,2,\dots$ , with inner normals $v_{n}=\left(\cos(-1/n),\sin(-1/n)\right)$ that converge to $v=(1,0)\in\mathbb{S}^{1}$ , yet $D(x;\mu)=\lim_{n\to\infty}\mu(H_{x,v_{n}})<\mu(H_{x,v})$ .

Figure 1. The support of $\mu\in\mathcal{M}\left(\mathbb{R}^{2}\right)$ from Example 1 (coloured disk) and its atom $a$ (diamond). No minimising halfspace of $\mu$ at $x=(1,0)\in\mathbb{R}^{2}$ (coloured point) exists. In the left hand panel we see a halfspace $H_{n}\in\mathcal{H}(x)$ whose $\mu$-mass is almost $D(x;\mu)$. It does not contain $a$. In the right hand panel the minimising flag halfspace $F\in\mathcal{F}(x)$ of $\mu$ at $x$ is displayed.

The problem with measures not attaining the infimum in (1) is elegantly resolved by considering flag halfspaces instead of the usual closed halfspaces.

Definition.

Define $\mathcal{F}(x)$ as the system of all sets $F$ of the form

(3)

F=\{x\}\cup\left(\bigcup_{k=1}^{d}G_{k}\right)

where $G_{d}\in\mathcal{H}^{\circ}(x)$ , and $G_{k}\in\mathcal{H}^{\circ}(x,\operatorname{relbd}\left(G_{k+1}\right))$ for every $k=1,\dots,d-1$ . Any element of $\mathcal{F}(x)$ is called a flag halfspace at $x$ .

The name flag comes from geometry [17], where an analogous recursive construction is considered, involving nested faces of convex polytopes. The formal definition of flag halfspaces is somewhat convoluted, but these sets appear naturally. In $\mathbb{R}^{2}$ , a flag halfspace at $x$ is the union of an open halfplane $G_{2}$ whose boundary passes through $x$ , a relatively open halfline $G_{1}$ originating at $x$ contained in the one-dimensional affine space (line) $\operatorname{bd}\left(G_{2}\right)$ , and the $0$ -dimensional point $x$ itself. For an example see Figure 1. A flag halfspace is neither an open nor a closed set. In contrast to a usual closed halfspace, a complement of a flag halfspace $F\in\mathcal{F}(x)$ is, except for its central point $x$ , again a flag halfspace from $\mathcal{F}(x)$ , i.e. $(\mathbb{R}^{d}\setminus F)\cup\{x\}\in\mathcal{F}(x)$ . Several more interesting properties and characterisations of flag halfspaces can be found in [9].
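The planar description above translates directly into a membership test. The following Python sketch is illustrative; the function name `in_flag_halfspace_2d` and the parametrisation of $F$ by an inner normal `v` of $G_{2}$ and a direction `u` of the halfline $G_{1}$ are conventions assumed here, not notation from the paper. Integer coordinates keep all comparisons exact.

```python
def in_flag_halfspace_2d(y, x, v, u):
    """Return True if y lies in F = {x} U G_1 U G_2, a flag halfspace at x in R^2.

    G_2 = {y : <y - x, v> > 0} is the open halfplane with inner normal v.
    G_1 = {x + t*u : t > 0} is the open halfline inside bd(G_2),
    so u must be non-zero and orthogonal to v (assumed parametrisation).
    """
    if (v[0] == 0 and v[1] == 0) or (u[0] == 0 and u[1] == 0):
        raise ValueError("v and u must be non-zero")
    if v[0] * u[0] + v[1] * u[1] != 0:
        raise ValueError("u must be orthogonal to v")
    d = (y[0] - x[0], y[1] - x[1])
    s = d[0] * v[0] + d[1] * v[1]
    if s != 0:
        return s > 0  # off the boundary line: membership decided by G_2
    t = d[0] * u[0] + d[1] * u[1]
    return t >= 0  # on the boundary line: x itself (t == 0) or G_1 (t > 0)

# The flag halfspace at x = (1, 0) with G_2 = {x_1 > 1} and G_1 pointing downwards.
x, v, u = (1, 0), (1, 0), (0, -1)
print(in_flag_halfspace_2d((1, -1), x, v, u), in_flag_halfspace_2d((1, 1), x, v, u))
```

Replacing `(v, u)` by `(-v, -u)` gives the complementary flag halfspace, so every point other than `x` belongs to exactly one of the two sets.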

We define a minimising flag halfspace of $\mu$ at $x$ to be any $F\in\mathcal{F}(x)$ that satisfies $\mu(F)=D(x;\mu)$ . In the following Theorem 1 we show that the halfspace depth (1) of any measure can be expressed in terms of the $\mu$ -mass of flag halfspaces, and a minimising flag halfspace always exists. The intuition behind this result is as follows: Even if the minimising closed halfspace of $x\in\mathbb{R}^{d}$ does not exist, there is a sequence of closed halfspaces $\{H_{n}\}_{n=1}^{\infty}\subset\mathcal{H}(x)$ that satisfies

(4)

\lim_{n\to\infty}\mu\left(H_{n}\right)=D(x;\mu).

Because the unit normals $\{v_{n}\}_{n=1}^{\infty}$ of these halfspaces come from the compact set $\mathbb{S}^{d-1}$, we can also assume that the sequence of halfspaces is convergent and $\lim_{n\to\infty}v_{n}=v\in\mathbb{S}^{d-1}$ (otherwise, we extract a convergent subsequence). For $n$ large enough, $\mu\left(H_{n}\right)$ is arbitrarily close to $D(x;\mu)$, but this fact alone, of course, does not imply that the $\mu$-mass of the limit $H\equiv H_{x,v}$ defined as $H=\lim_{n\to\infty}H_{n}$ is equal to $D(x;\mu)$. It turns out that for general measures, it is not possible to find any useful upper bound on the mass $\mu\left(H\right)$, but it is possible to bound the mass of its interior by $\mu\left(\operatorname{int}\left(H\right)\right)\leq D(x;\mu)$. The interior of $H$ is the first open halfspace $G_{d}$ in the construction of the minimising flag halfspace (3). The remaining relatively open halfspaces $G_{k}$ are found by iterating the same process inside the relative boundary of the previous $G_{k+1}$, $k=1,\dots,d-1$. In the right hand panel of Figure 2 we see a visualisation of our setup, with $d=2$. In the situation displayed, as $n\to\infty$, the halfspaces $H_{n}$ do not intersect the halfline $G_{1}^{-}\subset\operatorname{bd}\left(H\right)$ originating at $x$, so the $\mu$-mass of $G_{1}^{-}$ does not contribute to the depth of $x$, and $G_{1}^{-}$ should not be contained in $F$. The formal statement of our theorem follows.

Theorem 1.

For any $\mu\in\mathcal{M}\left(\mathbb{R}^{d}\right)$ and $x\in\mathbb{R}^{d}$ we have

(5)

D\left(x;\mu\right)=\min\left\{\mu\left(F\right)\colon F\in\mathcal{F}(x)\right\}.

In particular, there always exists a minimising flag halfspace.

Proof.

Let $\left\{H_{n}\right\}_{n=1}^{\infty}\subset\mathcal{H}(x)$ be a sequence of halfspaces satisfying (4) with limit $H\equiv H_{x,v}=\lim_{n\to\infty}H_{n}$ . For all $n=1,2,\dots$ we have

(6)

\mu(H_{n})\geq\mu(H_{n}\cap H)=\mu\left(H_{n}\cap\operatorname{int}\left(H\right)\right)+\mu\left(H_{n}\cap\operatorname{bd}\left(H\right)\right).

We first bound both summands on the right hand side from below. For each $n=1,2,\dots$ we define $A_{n}=\left(\bigcap_{m\geq n}H_{m}\right)\cap\operatorname{int}\left(H\right)\subseteq H_{n}\cap\operatorname{int}\left(H\right)$ . From the convergence of the halfspaces $\left\{H_{n}\right\}_{n=1}^{\infty}$ we know that $A_{n}\uparrow\operatorname{int}\left(H\right)$ as $n\rightarrow\infty$ , and using the continuity of measure from below [4, Theorem 3.1.11] we obtain the equality in

(7)

\mu(\operatorname{int}\left(H\right))=\lim_{n\to\infty}\mu\left(A_{n}\right)\leq\liminf_{n\to\infty}\mu\left(H_{n}\cap\operatorname{int}\left(H\right)\right).

On the other hand, $x\in H_{n}\cap\operatorname{bd}\left(H\right)$ for all $n=1,2,\dots$, so $H_{n}\cap\operatorname{bd}\left(H\right)$ is either a closed halfspace when considered in the $(d-1)$-dimensional space $\operatorname{bd}\left(H\right)$, or is equal to $\operatorname{bd}\left(H\right)$. In any case, we have that $\mu\left(H_{n}\cap\operatorname{bd}\left(H\right)\right)\geq D\left(x;\mu|_{\operatorname{bd}\left(H\right)}\right)$ for $\mu|_{\operatorname{bd}\left(H\right)}$ the restriction of $\mu$ to the hyperplane $\operatorname{bd}\left(H\right)$. Consequently

(8)

\liminf_{n\to\infty}\mu\left(H_{n}\cap\operatorname{bd}\left(H\right)\right)\geq D\left(x;\mu|_{\operatorname{bd}\left(H\right)}\right).

Combining (6), (7) and (8) one gets

(9)

\begin{aligned}
D(x;\mu)&=\lim_{n\to\infty}\mu(H_{n})\geq\liminf_{n\to\infty}\mu\left(H_{n}\cap\operatorname{int}\left(H\right)\right)+\liminf_{n\to\infty}\mu\left(H_{n}\cap\operatorname{bd}\left(H\right)\right)\\
&\geq\mu(\operatorname{int}\left(H\right))+D\left(x;\mu|_{\operatorname{bd}\left(H\right)}\right).
\end{aligned}

Assume now for a contradiction that the inequality in (9) is strict, i.e. that $D(x;\mu)-\mu(\operatorname{int}\left(H\right))-D\left(x;\mu|_{\operatorname{bd}\left(H\right)}\right)=c>0$. The definition of the halfspace depth implies that there exists a halfspace $\widetilde{H}\in\mathcal{H}(x,\operatorname{bd}\left(H\right))$ in the hyperplane $\operatorname{bd}\left(H\right)$ that satisfies

(10)

\mu|_{\operatorname{bd}\left(H\right)}(\widetilde{H})<D\left(x;\mu|_{\operatorname{bd}\left(H\right)}\right)+c/2.

Denote by $\widetilde{v}\in\mathbb{S}^{d-1}$ the unit inner normal of $\widetilde{H}$ and set $w_{n}=v+\widetilde{v}/n$ and $C_{n}=H_{x,w_{n}}\setminus H$ for $n=1,2,\dots$ . Then $w_{n}\to v$ and $C_{n}\downarrow\emptyset$ as $n\to\infty$ , meaning that $\lim_{n\to\infty}H_{x,w_{n}}=H$ and $\lim_{n\to\infty}\mu(C_{n})=0$ due to the continuity of measure from above [4, Theorem 3.1.1]. For $n$ large enough we have $\mu(H_{x,w_{n}}\setminus H)<c/2$ . Note also that $H_{x,w_{n}}\cap\operatorname{bd}\left(H\right)=\widetilde{H}$ for all $n=1,2,\dots$ , due to the choice of $w_{n}$ . Altogether, we have

(11)

\begin{aligned}
\mu(H_{x,w_{n}})&=\mu(H_{x,w_{n}}\cap\operatorname{int}\left(H\right))+\mu(H_{x,w_{n}}\cap\operatorname{bd}\left(H\right))+\mu(H_{x,w_{n}}\setminus H)\\
&<\mu(\operatorname{int}\left(H\right))+\mu(\widetilde{H})+c/2=\mu(\operatorname{int}\left(H\right))+\mu|_{\operatorname{bd}\left(H\right)}(\widetilde{H})+c/2\\
&<\mu(\operatorname{int}\left(H\right))+D\left(x;\mu|_{\operatorname{bd}\left(H\right)}\right)+c=D(x;\mu),
\end{aligned}

where the last inequality in (11) follows from (10). Note that because $H_{x,w_{n}}\in\mathcal{H}(x)$ , inequality (11) contradicts the definition of the halfspace depth (1), and we get

(12)

D(x;\mu)=\mu(G_{d})+D\left(x;\mu|_{\operatorname{relbd}\left(G_{d}\right)}\right),

where we denoted $G_{d}=\operatorname{int}\left(H\right)\in\mathcal{H}^{\circ}(x)$. We have just constructed the first open halfspace $G_{d}$ in the system (3). We proceed by induction. We consider $\mu|_{\operatorname{bd}\left(H\right)}=\mu|_{\operatorname{relbd}\left(G_{d}\right)}$ instead of $\mu$ and using the same argument obtain $G_{d-1}\in\mathcal{H}^{\circ}(x,\operatorname{relbd}\left(G_{d}\right))$ that satisfies an equation analogous to (12), i.e. $D\left(x;\mu|_{\operatorname{relbd}\left(G_{d}\right)}\right)=\mu(G_{d-1})+D\left(x;\mu|_{\operatorname{relbd}\left(G_{d-1}\right)}\right)$. Continuing the same procedure we eventually obtain a flag halfspace $F=\{x\}\cup\left(\bigcup_{k=1}^{d}G_{k}\right)$ such that

\begin{aligned}
D(x;\mu)&=\mu(\operatorname{int}\left(H\right))+D\left(x;\mu|_{\operatorname{bd}\left(H\right)}\right)=\mu(G_{d})+\mu(G_{d-1})+D\left(x;\mu|_{\operatorname{relbd}\left(G_{d-1}\right)}\right)=\dots\\
&=\sum_{k=2}^{d}\mu(G_{k})+D\left(x;\mu|_{\operatorname{relbd}\left(G_{2}\right)}\right)=\sum_{k=1}^{d}\mu(G_{k})+\mu(\{x\})=\mu(F).
\end{aligned}

The last but one equality above follows from the fact that $\operatorname{relbd}\left(G_{2}\right)$ is a line, meaning that $G_{1}$ is the one of the two relatively open halflines determined by $x$ in $\operatorname{relbd}\left(G_{2}\right)$ that has the smaller $\mu$-mass. Thus, $D\left(x;\mu|_{\operatorname{relbd}\left(G_{2}\right)}\right)=\mu\left(\{x\}\right)+\mu(G_{1})$. ∎

In Example 1, the single minimising flag halfspace of $\mu$ at $x$ is

F=\{x\}\cup\left\{\left(1,x_{2}\right)\in\mathbb{R}^{2}\colon x_{2}<0\right\}\cup\left\{\left(x_{1},x_{2}\right)\in\mathbb{R}^{2}\colon x_{1}>1\right\}\in\mathcal{F}(x).
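Example 1 can also be checked numerically. The Python sketch below uses the elementary area formula for a circular segment to evaluate $\mu(H_{x,v_{n}})$; the helper names `disk_mass` and `mass_closed_halfplane` are introduced here for illustration, and both the uniform part and the atom are assumed to carry unit mass, as in the example.

```python
import math

R = 2.0  # radius of the disk carrying the uniform part of mu

def disk_mass(t):
    """Uniform probability mass of {y in disk : <y, v> >= t}, v a unit vector, |t| <= R."""
    segment_area = R * R * math.acos(t / R) - t * math.sqrt(R * R - t * t)
    return segment_area / (math.pi * R * R)

def mass_closed_halfplane(v):
    """mu(H_{x,v}) for x = (1, 0): uniform part plus the unit atom at a = (1, 1)."""
    x, a = (1.0, 0.0), (1.0, 1.0)
    t = x[0] * v[0] + x[1] * v[1]  # signed distance of the boundary from the origin
    atom_inside = (a[0] - x[0]) * v[0] + (a[1] - x[1]) * v[1] >= 0
    return disk_mass(t) + (1.0 if atom_inside else 0.0)

# Halfplanes H_n with inner normals v_n = (cos(-1/n), sin(-1/n)) avoid the atom.
for n in (1, 10, 100, 1000):
    v_n = (math.cos(-1.0 / n), math.sin(-1.0 / n))
    print(n, mass_closed_halfplane(v_n))

# Mass of the minimising flag halfspace F: only the open halfplane {x_1 > 1}
# contributes, since the halfline G_1 and the point x carry no mass.
flag_mass = disk_mass(1.0)
print("flag:", flag_mass, "closed limit halfplane:", mass_closed_halfplane((1.0, 0.0)))
```

The printed masses decrease towards `flag_mass`, which equals $1/3-\sqrt{3}/(4\pi)$, whereas the closed limit halfplane picks up the atom and has mass larger by one.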

In formula (12) in the proof of Theorem 1 we unveiled the recursive nature of the halfspace depth. The following result formalises that observation. In the special situation of an empirical measure $\mu\in\mathcal{M}\left(\mathbb{R}^{d}\right)$ , a related result has been observed in [5, Theorems 1 and 2] and successfully applied in the task of exact computation of the halfspace depth.

Corollary 2.

For $x\in\mathbb{R}^{d}$ and $\mu\in\mathcal{M}\left(\mathbb{R}^{d}\right)$ it holds true that

D(x;\mu)=\inf_{H\in\mathcal{H}(x)}\left(\mu(\operatorname{int}\left(H\right))+D\left(x;\mu|_{\operatorname{bd}\left(H\right)}\right)\right).

Proof.

There are more flag halfspaces in $\mathcal{F}(x)$ than closed halfspaces in $\mathcal{H}(x)$ , in the sense that the mapping $\mathcal{F}(x)\to\mathcal{H}(x)\colon F\mapsto\operatorname{cl}\left(F\right)$ is not bijective. We define an equivalence relation $\sim$ between the elements of $\mathcal{F}(x)$ by

F_{1}\sim F_{2}\mbox{ if and only if }\operatorname{cl}\left(F_{1}\right)=\operatorname{cl}\left(F_{2}\right).

By $\mathcal{K}$ we denote the quotient set of $\sim$ . This allows us to rewrite (5) from Theorem 1 as

D(x;\mu)=\min_{F\in\mathcal{F}(x)}\mu(F)=\inf_{K\in\mathcal{K}}\inf_{F\in K}\mu(F).

Note that for flag halfspaces, $\operatorname{int}\left(F_{1}\right)=\operatorname{int}\left(F_{2}\right)$ is equivalent with $\operatorname{cl}\left(F_{1}\right)=\operatorname{cl}\left(F_{2}\right)$ . Take $K\in\mathcal{K}$ and denote $G_{K}=\operatorname{int}\left(F\right)\in\mathcal{H}^{\circ}(x)$ for $F\in K$ . Then each $F\in K$ can be represented as $F=G_{K}\cup F^{\prime}$ , for $F^{\prime}$ a flag halfspace centred at $x$ when considered inside the affine space $\operatorname{bd}\left(G_{K}\right)$ (denoted by $F^{\prime}\in\mathcal{F}\left(x,\operatorname{bd}\left(G_{K}\right)\right)$ ). We get, using Theorem 1 again,

\inf_{F\in K}\mu(F)=\mu(G_{K})+\inf_{F^{\prime}\in\mathcal{F}\left(x,\operatorname{bd}\left(G_{K}\right)\right)}\mu(F^{\prime})=\mu(G_{K})+D\left(x;\mu|_{\operatorname{bd}\left(G_{K}\right)}\right).

The mapping $\mathcal{H}^{\circ}(x)\to\mathcal{H}(x)\colon G\mapsto\operatorname{cl}\left(G\right)$ is a bijection, so any $K\in\mathcal{K}$ corresponds to exactly one element $H=\operatorname{cl}\left(G_{K}\right)\in\mathcal{H}(x)$, and we obtain the desired result. ∎
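For an empirical measure in the plane, the recursion of Corollary 2 can be turned into an exact computation. The Python sketch below is an illustration under stated assumptions and is not the algorithm of [5]: the names `depth_on_line` and `depth_recursive_2d` are hypothetical, points are assumed to have integer coordinates so that all sign tests are exact, and only boundary lines through $x$ and one further atom are examined, which is adequate when $\mu$ has finitely many atoms.

```python
from fractions import Fraction

def depth_on_line(x, u, pts):
    """Unnormalised depth of x w.r.t. the atoms `pts`, all lying on the line x + t*u."""
    at_x = sum(1 for p in pts if p[0] == x[0] and p[1] == x[1])
    ahead = sum(1 for p in pts if (p[0] - x[0]) * u[0] + (p[1] - x[1]) * u[1] > 0)
    behind = len(pts) - at_x - ahead
    return at_x + min(ahead, behind)

def depth_recursive_2d(x, points):
    """Depth of x as min over H of mu(int(H)) + D(x; mu restricted to bd(H))."""
    n = len(points)
    directions = {(p[0] - x[0], p[1] - x[1]) for p in points}
    directions.discard((0, 0))
    if not directions:
        return Fraction(1)  # every atom coincides with x
    best = n
    for u in directions:
        # Two closed halfplanes share the boundary line through x with direction u.
        for v in ((-u[1], u[0]), (u[1], -u[0])):
            side = [(p[0] - x[0]) * v[0] + (p[1] - x[1]) * v[1] for p in points]
            open_count = sum(1 for s in side if s > 0)  # mass of int(H)
            on_line = [p for p, s in zip(points, side) if s == 0]  # atoms in bd(H)
            best = min(best, open_count + depth_on_line(x, u, on_line))
    return Fraction(best, n)

square = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
print(depth_recursive_2d((0, 0), square))
```

The inner call plays the role of the lower-dimensional depth $D(x;\mu|_{\operatorname{bd}(H)})$; in dimension one it reduces to counting the atoms at $x$ plus the lighter of the two halflines.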

3. Applications: Properties of the sample halfspace median

We now use flag halfspaces to derive several properties of the sample halfspace median that are of interest in the practice of the depth; additional applications of flag halfspaces to the theory of the halfspace depth can be found in [8, 9]. Write $\mathcal{A}\left(\mathbb{R}^{d}\right)$ for the set of all empirical measures $\mu\in\mathcal{M}\left(\mathbb{R}^{d}\right)$, that is all purely atomic probability measures with a finite number $n$ of atoms, each atom having $\mu$-mass $1/n$, for some $n=1,2,\dots$. These measures are typically obtained by observing a random sample $X_{1},\dots,X_{n}\in\mathbb{R}^{d}$ from a probability distribution $\nu\in\mathcal{M}\left(\mathbb{R}^{d}\right)$, each sample point corresponding to an atom. To approximate the halfspace depth of $\nu$, the depth of $\mu$ is computed. The latter depth function is routinely used for inference about the unknown distribution $\nu$. It is therefore crucial to understand the behaviour of the halfspace depth w.r.t. empirical measures. We provide results on the dimensionality of the median set, assuming that the atoms of $\mu\in\mathcal{A}\left(\mathbb{R}^{d}\right)$ lie in a sufficiently general position. The last assumption is not restrictive; it is satisfied almost surely if, for instance, the measure $\nu$ from which we sample is smooth. The proof of the following lemma is standard and omitted.

Lemma 3.

Let $X_{1},X_{2},\dots,X_{n}$ be independent random variables sampled from smooth (and possibly different) probability measures from $\mathcal{M}\left(\mathbb{R}^{d}\right)$ . Then the following holds true almost surely.

(i)

The points $X_{1},X_{2},\dots,X_{n}$ are in general position. (A set $S$ of points in $\mathbb{R}^{d}$ is said to lie in general position if no subset of $k$ of these points lies in a $(k-2)$-dimensional affine space, for all $k=2,\dots,d+1$. If there are $n>d$ points in $S$, this is equivalent to saying that no hyperplane in $\mathbb{R}^{d}$ contains more than $d$ points from $S$.)
(ii)

Writing $l(x,y)$ for the infinite line determined by $x\neq y\in\mathbb{R}^{d}$ , if $d\geq 2$ and $k_{1},\dots,k_{6}\in\{1,2,\dots,n\}$ are pairwise different indices, then

$l(X_{k_{1}},X_{k_{2}})\cap l(X_{k_{3}},X_{k_{4}})\cap l(X_{k_{5}},X_{k_{6}})=\emptyset.$
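Condition (i) is easy to verify for a concrete planar dataset. The Python sketch below is a minimal illustration for $d=2$ only; the function name `in_general_position_2d` is an assumption made here, points are taken to be tuples with integer coordinates so that the collinearity test is exact, and condition (ii) is not checked.

```python
from itertools import combinations

def in_general_position_2d(points):
    """Check general position in R^2: all points distinct and no three collinear."""
    if len(set(points)) < len(points):
        return False  # k = 2: two points share a 0-dimensional affine space
    for p, q, r in combinations(points, 3):
        cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
        if cross == 0:
            return False  # k = 3: three points on a common line
    return True

print(in_general_position_2d([(0, 0), (1, 0), (0, 1), (2, 3)]))
```

The loop over triples costs on the order of $n^{3}$ operations, which is acceptable for the small samples used in illustrations.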

3.1. Dimensionality of the sample halfspace median

As our first application we show that for an empirical measure with atoms in general position, the median set $D^{*}\left(\mu\right)$ in dimension $d\geq 2$ cannot be $(d-1)$-dimensional, unless we are in the trivial case when the number of atoms is equal to $d$. Our findings should be seen as complementary to the earlier advances from [19, Lemma 6], where it was demonstrated that for $\mu\in\mathcal{A}\left(\mathbb{R}^{d}\right)$ with atoms in general position, all the depth regions $D_{\alpha}(\mu)$ are full-dimensional, except for possibly the depth median $D^{*}\left(\mu\right)$.
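The exceptional case $n=d$ can be checked by hand for $d=2$. The following worked example is an illustration, assuming that $\mu$ has exactly two atoms $a\neq b$ of mass $1/2$ each and writing $[a,b]$ for the closed line segment between them.

```latex
% Worked example (illustrative): n = d = 2, atoms a \neq b, each of mass 1/2.
\begin{aligned}
x\in[a,b] &\implies \text{every } H\in\mathcal{H}(x) \text{ contains } a \text{ or } b
  \implies D(x;\mu)=\tfrac{1}{2},\\
x\notin[a,b] &\implies \text{some } H\in\mathcal{H}(x) \text{ contains neither } a \text{ nor } b
  \implies D(x;\mu)=0,\\
\text{hence}\quad D^{*}(\mu)&=[a,b],\qquad \dim\left(D^{*}(\mu)\right)=1=d-1.
\end{aligned}
```

The first implication uses convexity: if both $\langle a-x,v\rangle<0$ and $\langle b-x,v\rangle<0$, then no convex combination of $a$ and $b$ can equal $x$.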

Theorem 4.

Let $\mu\in\mathcal{A}\left(\mathbb{R}^{d}\right)$ be a measure with $n$ atoms of mass $1/n$ in general position. If $n\neq d\geq 2$ , then $\dim(D^{*}\left(\mu\right))\neq d-1$ .

Proof.

We use two auxiliary lemmas. Our first lemma is a special case of a more general result that can be found in [7, Lemma 4]. In [7], that lemma is formulated with a final inequality $\mu(\operatorname{int}\left(H\right))\leq\alpha$ ; for $\mu\in\mathcal{A}\left(\mathbb{R}^{d}\right)$ also a strict inequality can be written, because the depth of $\mu$ attains only finitely many values.

Lemma 5.

Suppose that $\mu\in\mathcal{M}\left(\mathbb{R}^{d}\right)$, $\alpha>0$, a point $x\notin D_{\alpha}(\mu)$ and a face $F$ of $D_{\alpha}(\mu)$ are given so that the relatively open line segment $L(x,y)$ formed by $x$ and $y$ does not intersect $D_{\alpha}(\mu)$ for any $y\in F$. Then there exists a touching halfspace $H\in\mathcal{H}$ of $D_{\alpha}(\mu)$ such that $\mu(\operatorname{int}\left(H\right))\leq\alpha$, $x\in H$ and $F\subset\operatorname{bd}\left(H\right)$. If, in addition, $\mu\in\mathcal{A}\left(\mathbb{R}^{d}\right)$, then we can write even $\mu(\operatorname{int}\left(H\right))<\alpha$. (A halfspace $H\in\mathcal{H}$ is a touching halfspace of a non-empty convex set $A\subset\mathbb{R}^{d}$ if $H\cap\operatorname{cl}\left(A\right)\neq\emptyset$ and $\operatorname{int}\left(H\right)\cap A=\emptyset$.)

Our second lemma is a simple observation about the structure of a simplex $S$, that is, the convex hull of $k+1$ points in general position in the linear space $\mathbb{R}^{k}$. These $k+1$ points are called the vertices of $S$.

Lemma 6.

For a simplex $S\subset\mathbb{R}^{k}$ and any convex set $K\subseteq S$ with non-empty interior there exist $x,y\in K$ and $v\in\mathbb{S}^{k-1}$ such that each of the disjoint halfspaces $H_{x,v}$ and $H_{y,-v}$ contains only one vertex of $S$ .

Proof.

In this proof, all the vectors are column vectors, and by $A^{\mathsf{T}}$ we denote the transpose of a matrix $A$. Denote by $s_{1},\dots,s_{k+1}\in S$ the vertices of $S$, and by $a$ any point in the interior of $K$. We first transform both $S$ and $K$ by an affine transform $T\colon\mathbb{R}^{k}\to\mathbb{R}^{k}\colon z\mapsto A\,z+b$ for $A\in\mathbb{R}^{k\times k}$ non-singular and $b\in\mathbb{R}^{k}$ such that $T(s_{i})=e_{i}$ for each $i=1,\dots,k$ for $e_{i}$ the $i$-th standard basis vector in $\mathbb{R}^{k}$, and $T(a)=0$ is the origin in $\mathbb{R}^{k}$. Such an affine transform certainly exists, because each full-dimensional simplex in $\mathbb{R}^{k}$ can be uniquely mapped to any other one using an invertible affine mapping (here, the simplex with vertices $s_{1},\dots,s_{k},a$ is mapped to the one with vertices $e_{1},\dots,e_{k},0$). Because $a\in\operatorname{int}\left(K\right)\subseteq\operatorname{int}\left(S\right)$, the origin $T(a)$ must be contained in the interior of the $T$-image of $S$ defined by $T(S)=\left\{T(z)\colon z\in S\right\}$, meaning that necessarily $T(s_{k+1})\in(-\infty,0)^{k}$. Since $K$ is a convex set with $a$ in its interior, also $T(K)$ is convex with $0=T(a)\in\operatorname{int}\left(T(K)\right)$. Thus, there is a closed ball $B$ centred at the origin with radius $\delta>0$ small enough so that $B\subseteq T(K)\subseteq T(S)$. For $\widetilde{v}=e_{1}\in\mathbb{S}^{k-1}$ we have $\left\langle\widetilde{v},T(s_{1})\right\rangle=\left\langle\widetilde{v},e_{1}\right\rangle=1$, $\left\langle\widetilde{v},T(s_{i})\right\rangle=\left\langle\widetilde{v},e_{i}\right\rangle=0$ for $i=2,\dots,k$, and $\left\langle\widetilde{v},T(s_{k+1})\right\rangle<0$. Take $\widetilde{x}=\delta\,e_{1}\in B$ and $\widetilde{y}=-\widetilde{x}\in B$.
Then $H_{\widetilde{x},\widetilde{v}}=\left\{z\in\mathbb{R}^{k}\colon\left\langle\widetilde{v},z\right\rangle\geq\delta\right\}$ contains $e_{1}=T(s_{1})$ as the only vertex of $T(S)$, and $H_{\widetilde{y},-\widetilde{v}}=\left\{z\in\mathbb{R}^{k}\colon\left\langle\widetilde{v},z\right\rangle\leq-\delta\right\}$ contains $T(s_{k+1})$ as the only vertex of $T(S)$. Certainly, also $H_{\widetilde{x},\widetilde{v}}\cap H_{\widetilde{y},-\widetilde{v}}=\emptyset$. Now it remains to apply the inverse affine transform $T^{-1}\colon\mathbb{R}^{k}\to\mathbb{R}^{k}\colon z\mapsto A^{-1}\left(z-b\right)$ for $A^{-1}\in\mathbb{R}^{k\times k}$ the inverse of $A$, and define $x=T^{-1}(\widetilde{x})$, $y=T^{-1}(\widetilde{y})$, and $v=\left(A^{\mathsf{T}}e_{1}\right)/\left\|\left(A^{\mathsf{T}}e_{1}\right)\right\|\in\mathbb{S}^{k-1}$. Because $v$ is taken to be the inner normal vector of $T^{-1}(H_{\widetilde{x},\widetilde{v}})=H_{x,v}$, we have indeed found the desired pair of halfspaces $H_{x,v}$ and $H_{y,-v}$. ∎

We are ready to prove Theorem 4. Recall that $\alpha^{*}(\mu)=\sup_{x\in\mathbb{R}^{d}}D(x;\mu)$. Assume for a contradiction that $\dim(D^{*}\left(\mu\right))=d-1$. Then $D^{*}\left(\mu\right)$ is contained in a hyperplane that determines two different closed halfspaces, which we denote by $H^{+}$ and $H^{-}$, respectively. Take any $w\in\operatorname{int}\left(H^{+}\right)$ and $q\in\operatorname{int}\left(H^{-}\right)$. We can consider the set $D^{*}\left(\mu\right)$ itself as a $(d-1)$-dimensional face of $D^{*}\left(\mu\right)$ that satisfies the conditions of Lemma 5 for either of the choices $x=w$, or $x=q$. We apply Lemma 5 twice, first to $x=w$ and then also to $x=q$. We obtain that $\mu(\operatorname{int}\left(H^{+}\right))<\alpha^{*}(\mu)$ and $\mu(\operatorname{int}\left(H^{-}\right))<\alpha^{*}(\mu)$. Denoting $G^{+}=\operatorname{int}\left(H^{+}\right)$, $G^{-}=\operatorname{int}\left(H^{-}\right)$ and $A=\operatorname{bd}\left(H^{+}\right)=\operatorname{bd}\left(H^{-}\right)$ we can write

(13)

\max\left\{\mu(G^{+}),\mu(G^{-})\right\}<\alpha^{*}(\mu).

Applying Corollary 2 to $x\in D^{*}\left(\mu\right)$ and halfspaces $H^{+}$ and $H^{-}$ and using (13), we get that

(14)

D\left(x;\mu|_{A}\right)\geq\alpha^{*}(\mu)-\mu(G)>0\mbox{ for all }x\in D^{*}\left(\mu\right)\mbox{ and }G\in\{G^{+},G^{-}\}.

Because $\dim(D^{*}\left(\mu\right))=d-1$ and $D\left(x;\mu|_{A}\right)>0$ for all $x\in D^{*}\left(\mu\right)$, there must exist at least $d$ atoms of $\mu$ in the hyperplane $A$. At the same time, due to our assumption of the atoms of $\mu$ being in general position, there are at most $d$ atoms of $\mu$ in any hyperplane, meaning that $A$ contains exactly $d$ atoms of $\mu$, and these atoms are in general position inside $A$. Consequently,

(15)

D\left(x;\mu|_{A}\right)=1/n\quad\mbox{for all }x\in D^{*}\left(\mu\right).

From (14) and (15) it follows that $\alpha^{*}(\mu)>\mu(G)\geq\alpha^{*}(\mu)-1/n$ for $G\in\{G^{+},G^{-}\}$ . Since $\mu$ is an empirical measure with $n$ atoms, the $\mu$ -mass of any set can be only a multiple of $1/n$ , so it must be that $\mu(G^{+})=\mu(G^{-})=\alpha^{*}(\mu)-1/n$ . We obtain

(16)

2\left(\alpha^{*}(\mu)-\frac{1}{n}\right)=\mu(G^{+})+\mu(G^{-})=1-\mu(A).

Since we have shown that there are exactly $d$ atoms of $\mu$ in $A$, we must have $n\geq d$. From an assumption of our theorem we thus have $n>d$. Then there exists $z\in\mathbb{R}^{d}\setminus A$ such that $\mu(\{z\})=1/n$. We apply Lemma 6 in the subspace $A$ to conclude that there exist $x,y\in D^{*}\left(\mu\right)$ and closed halfspaces $H_{x,v},H_{y,-v}$ in space $A$ such that $\mu|_{A}(H_{x,v})=\mu|_{A}(H_{y,-v})=1/n$ and $H_{x,v}\cap H_{y,-v}=\emptyset$. Choose a full-dimensional halfspace $H_{x,u}\in\mathcal{H}$ that meets the conditions $H_{x,u}\cap A=H_{x,v}$ and $z\notin H_{x,u}\cup H_{y,-u}$. Denote $S_{x}=H_{x,u}\setminus A$ and $S_{y}=H_{y,-u}\setminus A$. Then $H_{x,u}=H_{x,v}\cup S_{x}$ and $H_{y,-u}=H_{y,-v}\cup S_{y}$, so we have

(17)

\mu(H_{x,u})=1/n+\mu(S_{x}),\quad\mu(H_{y,-u})=1/n+\mu(S_{y}).

Note that the sets $S_{x}$ and $S_{y}$ are disjoint and $\left(S_{x}\cup S_{y}\right)\cap\left(A\cup\{z\}\right)=\emptyset$ , so

(18)

\mu(S_{x})+\mu(S_{y})=\mu(S_{x}\cup S_{y})\leq 1-\mu\left(A\cup\{z\}\right)=1-\mu(A)-\frac{1}{n}.

Combining (16), (17) and (18), we obtain

\mu(H_{x,u})+\mu(H_{y,-u})=\frac{2}{n}+\mu(S_{x})+\mu(S_{y})\leq\frac{1}{n}+1-\mu(A)=\frac{1}{n}+2\left(\alpha^{*}(\mu)-\frac{1}{n}\right)=2\alpha^{*}(\mu)-\frac{1}{n}.

It follows that $\min\{\mu(H_{x,u}),\mu(H_{y,-u})\}<\alpha^{*}(\mu)$ , a contradiction with our choice $\{x,y\}\subset D^{*}\left(\mu\right)$ . ∎

Theorem 4 is valid for empirical measures; an analogous theorem for absolutely continuous measures can be found in [18, Proposition 3.4]. There, it was shown that for $\mu\in\mathcal{M}\left(\mathbb{R}^{d}\right)$ satisfying certain smoothness conditions including the existence of the density, the dimension of the median $D^{*}\left(\mu\right)$ cannot exceed $d-2$ provided that $d\geq 2$. A version of the latter theorem with weaker conditions, but still requiring smoothness and contiguous support of $\mu\in\mathcal{M}\left(\mathbb{R}^{d}\right)$, is given in [7, Corollary 7]. Unlike the proofs for smooth measures, the proof of Theorem 4 requires the use of flag halfspaces, which makes the derivation more technical and delicate. Without the assumption of general position, the claim of Theorem 4 is not valid. An example of a measure $\mu\in\mathcal{A}\left(\mathbb{R}^{2}\right)$ whose atoms are not in general position but for which $\dim(D^{*}\left(\mu\right))=1$ is given in [7, Section 2].

Excluding the case of $\dim(D^{*}\left(\mu\right))=d-1$ for random samples from smooth probability measures, one can ask whether there are other dimensions that the sample median set cannot attain. The answer is negative already in the case of $n=8$ points sampled randomly from a Gaussian distribution in $\mathbb{R}^{3}$ , as we show in the next example.

Example 2.

For $\nu\in\mathcal{M}\left(\mathbb{R}^{3}\right)$ the standard Gaussian probability measure and $X_{1},\dots,X_{8}$ a random sample from $\nu$ with empirical measure $\mu\in\mathcal{A}\left(\mathbb{R}^{3}\right)$, the median set $D^{*}\left(\mu\right)$ has dimension $3$, has dimension $1$, or is a single-point set, each with positive probability. The claim follows by considering three setups of eight points $x_{1},\dots,x_{8}$ in the space $\mathbb{R}^{3}$. Write $\mu\in\mathcal{A}\left(\mathbb{R}^{3}\right)$ for the empirical measure of $x_{1},\dots,x_{8}$ and denote $k=\dim\left(D^{*}\left(\mu\right)\right)$. The direct computations described below are based on the analysis performed using the R package TukeyRegion [1] for evaluation of full-dimensional central regions, and the Mathematica visualisations provided in the script in the online Supplementary Material. Plots of the three setups below are displayed in Figure 3.

•

Case $k=3$. This situation is standard and common. For example, direct computation shows that randomly perturbed vertices of a unit cube in $\mathbb{R}^{3}$, i.e. points in a configuration where the convex hull of $x_{1},\dots,x_{8}$ contains all the eight points on its boundary, already possess a full-dimensional polyhedral median set with maximum depth $2/8$.
•

Case $k=1$. Arrange the points so that $x_{1},x_{2},x_{3}$ form vertices of a triangle $T_{1}$ in a plane, and $x_{4},x_{5},x_{6}$ form vertices of a triangle $T_{2}$ in a plane parallel to that determined by $T_{1}$, so that the convex hull of $x_{1},\dots,x_{6}$ is a triangular prism in $\mathbb{R}^{3}$. To obtain points in general position, we perturb the six points slightly. Direct computation shows that for these six points, the halfspace median set is a three-dimensional polyhedron $M$ inside the prism that does not intersect $T_{1}$ or $T_{2}$, and consists of points with depth $2/6$. Place the last two points $x_{7}$ and $x_{8}$ in the interior of $M$, so that the straight line $l(x_{7},x_{8})$ through these points intersects the relative interiors of both $T_{1}$ and $T_{2}$. Note that certainly $D\left(x_{7};\mu\right)=D\left(x_{8};\mu\right)=3/8$, since the two points were placed inside $M$. No point can have depth $4/8$, as in that situation the setup would exhibit halfspace symmetry, which is clearly impossible [10, Proposition 1]. Finally, projecting all points of $\mu$ into the plane orthogonal to $l(x_{7},x_{8})$ shows that any point $y\notin l(x_{7},x_{8})$ can be separated from $l(x_{7},x_{8})$ by a plane that is parallel to $l(x_{7},x_{8})$ and contains only two sample points, meaning that $D\left(y;\mu\right)\leq 2/8$. The median set of $\mu$ is therefore the line segment between $x_{7}$ and $x_{8}$, with depth $3/8$.

•

Case $k=0$ . Consider four points $x_{1},\dots,x_{4}$ forming the vertices of a tetrahedron $T$ (blue points in the bottom panels of Figure 3). Three points $x_{5},x_{6},x_{7}\notin T$ are attached to three different facets of $T$ so that each of these points together with its facet forms another (non-regular) tetrahedron not intersecting $\operatorname{int}\left(T\right)$ (red points in the bottom panels of Figure 3). Finally, a single point $x_{8}$ is placed strategically inside $T$ into the full-dimensional halfspace median of $x_{1},\dots,x_{7}$ . An example is the configuration

	$\displaystyle x_{1}$	$\displaystyle=\left(1,0,-\frac{1}{\sqrt{2}}\right),$	$\displaystyle x_{2}$	$\displaystyle=\left(-1,0,-\frac{1}{\sqrt{2}}\right),$	$\displaystyle x_{3}$	$\displaystyle=\left(0,-1,\frac{1}{\sqrt{2}}\right),$	$\displaystyle x_{4}$	$\displaystyle=\left(0,1,\frac{1}{\sqrt{2}}\right),$
	$\displaystyle x_{5}$	$\displaystyle=\left(0,1,-\frac{1}{4}\right),$	$\displaystyle x_{6}$	$\displaystyle=\left(\frac{1}{10},-1,-\frac{1}{4}\right),$	$\displaystyle x_{7}$	$\displaystyle=\left(\frac{3}{4},0,\frac{1}{4}\right),$	$\displaystyle x_{8}$	$\displaystyle=\left(\frac{1}{10},\frac{1}{10},0\right).$

These points are in general position in $\mathbb{R}^{3}$ . For the setup of halfspaces $H_{x_{8},u_{i}}\in\mathcal{H}(x_{8})$ given by the normal vectors

u_{1}=\left(-\frac{7}{10},-\frac{3}{10},-\frac{3}{5}\right),\ u_{2}=\left(-\frac{2}{5},-\frac{1}{10},\frac{9}{10}\right),\ u_{3}=\left(-\frac{1}{5},\frac{4}{5},\frac{3}{5}\right),\ u_{4}=\left(1,\frac{1}{10},0\right)

we obtain $\mu(H_{x_{8},u_{i}})=D\left(x_{8};\mu\right)=3/8$ for each $i=1,2,3,4$. At the same time, the union of the open halfspaces $\operatorname{int}\left(H_{x_{8},u_{i}}\right)$ is $\mathbb{R}^{3}\setminus\{x_{8}\}$, meaning that for any $y\neq x_{8}$ we can find $i$ with $y\in\operatorname{int}\left(H_{x_{8},u_{i}}\right)$, and the shifted closed halfspace $H_{y,u_{i}}=H_{x_{8},u_{i}}+(y-x_{8})\in\mathcal{H}(y)$ necessarily contains at most two atoms of $\mu$. Thus, $D\left(y;\mu\right)\leq 2/8$ and the point $x_{8}$ is the single halfspace median of $\mu$.

The medians in all three cases above are stable in the sense that for a small perturbation of all the sample points, the dimension of the median set remains unchanged. Thus, in each setup and for each $x_{i}$ we can find a small open ball around $x_{i}$ such that if $x_{i}$ is replaced by any element of this ball, the dimension of the new median remains the same. In conclusion, all three cases $k=0,1,3$ occur with positive probability if $x_{1},\dots,x_{8}$ are sampled from any distribution in $\mathbb{R}^{3}$ with positive density everywhere.⁴

⁴ Note that our example for case $k=1$ happens to disagree with [10, Theorem 3], as for $n=8$ and $d=3$ we obtain the maximum depth $\lfloor(n-d+2)/2\rfloor/n=3/8$, but the median set is not a single point set. The problem appears to stem from formula (8) in [10] that is not valid in general.
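The halfspace masses claimed in case $k=0$ of Example 2 can be checked numerically. The following is a minimal sketch in Python, not the computation used by the authors (who rely on TukeyRegion and Mathematica). It assumes the convention $H_{x,u}=\{z\colon\langle u,z-x\rangle\geq 0\}$ with $u$ the inner normal; the helper names `halfspace_counts` and `det3` are hypothetical. The second part checks that a positive combination of $u_{1},\dots,u_{4}$ vanishes while $u_{2},u_{3},u_{4}$ are linearly independent, which is one sufficient way to see that the four open halfspaces cover every $y\neq x_{8}$.

```python
import math

# Configuration from case k = 0 of Example 2.
s = 1 / math.sqrt(2)
points = [
    (1, 0, -s), (-1, 0, -s), (0, -1, s), (0, 1, s),
    (0, 1, -0.25), (0.1, -1, -0.25), (0.75, 0, 0.25), (0.1, 0.1, 0),
]
x8 = points[7]
normals = [
    (-0.7, -0.3, -0.6), (-0.4, -0.1, 0.9), (-0.2, 0.8, 0.6), (1, 0.1, 0),
]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def halfspace_counts(u, origin, pts, tol=1e-12):
    """Atom counts (closed, open) of the halfspace {z : <u, z - origin> >= 0}."""
    values = [dot(u, tuple(a - b for a, b in zip(p, origin))) for p in pts]
    closed = sum(v >= -tol for v in values)
    opened = sum(v > tol for v in values)
    return closed, opened


def det3(c0, c1, c2):
    """Determinant of the 3x3 matrix with columns c0, c1, c2."""
    return (c0[0] * (c1[1] * c2[2] - c1[2] * c2[1])
            - c1[0] * (c0[1] * c2[2] - c0[2] * c2[1])
            + c2[0] * (c0[1] * c1[2] - c0[2] * c1[1]))


# Number of atoms in each closed halfspace H_{x8,u_i} and in its interior.
counts = [halfspace_counts(u, x8, points) for u in normals]

# Solve u1 + l2*u2 + l3*u3 + l4*u4 = 0 by Cramer's rule (assumption: l1 = 1).
u1, u2, u3, u4 = normals
rhs = tuple(-c for c in u1)
D = det3(u2, u3, u4)  # nonzero, so u2, u3, u4 span R^3
lambdas = [1.0, det3(rhs, u3, u4) / D, det3(u2, rhs, u4) / D, det3(u2, u3, rhs) / D]

print(counts)
print(lambdas)
```

Each pair in `counts` comes out as `(3, 2)`: three atoms in the closed halfspace (mass $3/8$) and two in its interior, in agreement with the bound $2/8$ for the shifted halfspaces. All entries of `lambdas` are positive, so no nonzero direction can have a nonpositive inner product with all four normals.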

3.2. Computation of the halfspace median in $\mathbb{R}^{2}$

In dimension $d=2$, Theorem 4 leaves only trivial cases: the halfspace median must be either full-dimensional or a singleton, and both situations may occur. But, as we show in our last result below, if $\mu\in\mathcal{A}\left(\mathbb{R}^{2}\right)$ has a unique median and $n\neq 4$, then the median must be one of the data points. The case of $n=4$ data points is trivial and not interesting.⁵

⁵ In the situation $n=4$ and under the assumptions of Lemma 3, the unique median is almost surely a singleton and is (i) either the atom contained in the interior of the convex hull of the remaining three sample points; or (ii) not an atom, but the single point of intersection of the two diagonals of the quadrilateral formed by the convex hull of the atoms.

Theorem 7.

Let $\mu\in\mathcal{A}\left(\mathbb{R}^{2}\right)$ be an empirical measure with precisely $n$ atoms of mass $1/n$ , with $n\neq 4$ , that satisfy conditions (i) and (ii) from Lemma 3. If the halfspace median $D^{*}\left(\mu\right)$ is a single point set, then it must be an atom of $\mu$ . In particular, the median set is either full-dimensional, or an atom of $\mu$ .

Proof.

Suppose without loss of generality that $x=0\in\mathbb{R}^{2}$ is the unique median of $\mu$ . Assume, for a contradiction, that $\mu(\{0\})=0$ . We start with the following observation: For every $v\in\mathbb{S}^{1}$ there is $w(v)\in\mathbb{S}^{1}$ that meets the following conditions

(19)

\langle v,w(v)\rangle\geq 0,\ \mu\left(\operatorname{int}\left(H_{0,w(v)}\right)\right)=\alpha^{*}(\mu)-\frac{1}{n},\mbox{ and }\mu\left(\operatorname{bd}\left(H_{0,w(v)}\right)\right)=2/n.

To prove the existence of $w=w(v)$ satisfying (19) pick a real sequence $a_{i}\downarrow 0$ and note that for every $i=1,2,\dots$ we have $a_{i}v\notin D^{*}\left(\mu\right)$ , so there is $w_{i}\in\mathbb{S}^{1}$ such that

(20)

\mu(H_{a_{i}v,w_{i}})=D\left(a_{i}v;\mu\right)\leq\alpha^{*}(\mu)-1/n<D\left(0;\mu\right).

The existence of a minimising halfspace $H_{a_{i}v,w_{i}}\in\mathcal{H}(a_{i}v)$ follows from the fact that minimising halfspaces always exist for $\mu\in\mathcal{A}\left(\mathbb{R}^{d}\right)$ , as we observed in Section 2. Then necessarily $0\not\in H_{a_{i}v,w_{i}}$ , meaning that $\langle v,w_{i}\rangle>0$ . The sequence $\{w_{i}\}_{i=1}^{\infty}\subset\mathbb{S}^{1}$ is bounded and therefore contains a convergent subsequence $\{w_{i_{j}}\}_{j=1}^{\infty}$ with a limit point $w\in\mathbb{S}^{1}$ that satisfies $\langle v,w\rangle\geq 0$ . By the Fatou lemma [4, Lemma 4.3.3] applied to the sets $\operatorname{int}\left(H_{0,w}\right)\subseteq{\lim\inf}_{j\to\infty}\operatorname{int}\left(H_{a_{i_{j}}v,w_{i_{j}}}\right)$ and (20) we have that

(21)

\mu(\operatorname{int}\left(H_{0,w}\right))\leq\liminf_{j\to\infty}\mu\left(\operatorname{int}\left(H_{a_{i_{j}}v,w_{i_{j}}}\right)\right)\leq\alpha^{*}(\mu)-1/n,

which together with Corollary 2 gives us

\alpha^{*}(\mu)=D\left(0;\mu\right)\leq\mu(\operatorname{int}\left(H_{0,w}\right))+D\left(0;\mu|_{\operatorname{bd}\left(H_{0,w}\right)}\right)\leq\alpha^{*}(\mu)-\frac{1}{n}+D\left(0;\mu|_{\operatorname{bd}\left(H_{0,w}\right)}\right).

Therefore, $D\left(0;\mu|_{\operatorname{bd}\left(H_{0,w}\right)}\right)\geq 1/n$. Because the halfspace median $0$ is not an atom of $\mu$, the condition of general position of the atoms of $\mu$ from part (i) of Lemma 3 implies that the straight line $\operatorname{bd}\left(H_{0,w}\right)$ contains exactly two atoms of $\mu$ at some points $y,z\in\operatorname{bd}\left(H_{0,w}\right)$ such that $0$ is contained in the relatively open line segment formed by $y$ and $z$. Denote by $l_{y}\subset\operatorname{bd}\left(H_{0,w}\right)$ the open halfline centred at $0$ that contains $y$. The flag halfspace $F=\{y\}\cup l_{y}\cup\operatorname{int}\left(H_{0,w}\right)\in\mathcal{F}(0)$ then satisfies $\mu(F)=\mu\left(\operatorname{int}\left(H_{0,w}\right)\right)+1/n$. Inequality (21) implies $\mu(F)\leq\alpha^{*}(\mu)$, so we must have $\mu(F)=\alpha^{*}(\mu)$ because of Theorem 1. Consequently, $\mu\left(\operatorname{int}\left(H_{0,w}\right)\right)=\alpha^{*}(\mu)-1/n$ and we may take $w(v)=w$. We have proved (19).

Pick any $v\in\mathbb{S}^{1}$ . There exists $u=w(v)\in\mathbb{S}^{1}$ that satisfies (19). Using the same observation again, we are able to find $u^{\prime}=w(-u)\in\mathbb{S}^{1}$ that satisfies (19) for $v$ replaced by $-u$ . We consider two different cases.

First case: $u^{\prime}=-u$. By summing up equalities $\mu\left(\operatorname{int}\left(H_{0,u}\right)\right)=\alpha^{*}(\mu)-1/n$, $\mu\left(\operatorname{int}\left(H_{0,-u}\right)\right)=\alpha^{*}(\mu)-1/n$ and $\mu\left(\operatorname{bd}\left(H_{0,u}\right)\right)=2/n$ that all follow from (19), we obtain $\alpha^{*}(\mu)=1/2$. Consider any infinite line $l$ that passes through the origin and the two open halfplanes $G^{+}$ and $G^{-}$ determined by $l$. If $\mu(l)=1/n$, then one of the open halfplanes $G^{+}$ and $G^{-}$ is of $\mu$-mass at most $1/2-1/(2n)$. Assume that $\mu(G^{+})\leq 1/2-1/(2n)$. Because $l$ contains only one atom of $\mu$ that is not at the origin, there is a flag halfspace $F\in\mathcal{F}(0)$ composed of $G^{+}$ and the relatively closed halfline in $l$ starting at $0$ that does not contain atoms. Then $\mu(F)=\mu(G^{+})\leq 1/2-1/(2n)<\alpha^{*}(\mu)$, which contradicts Theorem 1; hence the case $\mu(l)=1/n$ is impossible. Due to the assumption of general position of atoms from part (i) of Lemma 3, we know that $\mu(l)\leq 2/n$, so $\mu(l)$ can take only one of the two possible values: either $0$ or $2/n$. Because of our assumption from part (ii) of Lemma 3, there however cannot be three different lines determined by pairs of sample points that all intersect at the origin. This means that for at most two lines $l$ in $\mathbb{R}^{2}$ passing through the origin, the $\mu$-mass of $l$ can be $2/n$; all the other lines that we now consider have null $\mu$-mass (given that we have already excluded the case $\mu(l)=1/n$). This leaves only two possibilities: either $n=2$, or $n=4$. If $n=2$, then the median set $D^{*}\left(\mu\right)$ is the line segment determined by the only two atoms of $\mu$, and therefore it is one-dimensional. Only the case $n=4$, not covered by the statement of this theorem, remains.

Second case: $u^{\prime}\neq-u$ . There exists a closed halfspace $H_{0,v^{\prime}}$ whose boundary passes through the origin that does not contain any of the points $u$ and $u^{\prime}$ . Let $\tilde{u}=w(v^{\prime})$ be the unit vector that satisfies (19) with $v=v^{\prime}$ . Directly by (19), each of the three different lines $\operatorname{bd}\left(H_{0,u}\right)$ , $\operatorname{bd}\left(H_{0,u^{\prime}}\right)$ and $\operatorname{bd}\left(H_{0,\tilde{u}}\right)$ contains two atoms of $\mu$ , a contradiction with our assumption from part (ii) of Lemma 3.

The last part of the statement of Theorem 7 follows directly from Theorem 4. ∎

Theorems 4 and 7 fully justify the algorithmic procedure from [11] and [6] for finding the halfspace medians of samples from smooth probability distributions in $\mathbb{R}^{2}$ . If the median set is full-dimensional, the algorithm from [11] implemented in the R package TukeyRegion [1] finds the median set exactly, as proved in [6]. If the median is not full-dimensional, we conclude that it has to be a single sample point, and evaluation of the maximum halfspace depth of all sample points gives the unique halfspace median.
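The last step described above, evaluating the halfspace depth at every sample point and keeping the deepest one, can be sketched in a few lines of Python. This is an illustrative brute-force routine under stated assumptions, not the algorithm of [11] or the implementation in TukeyRegion: the function names `depth_count` and `deepest_sample_points` are hypothetical, the coordinates are assumed to be integers or `fractions.Fraction` values so that collinearity tests are exact, and the cost is cubic in $n$ when all sample points are evaluated. For a point $p$, the routine considers every line through $p$ and another sample point, and counts the atoms in the closed halfplanes obtained by rotating that line slightly around $p$.

```python
def cross(a, b):
    return a[0] * b[1] - a[1] * b[0]


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def depth_count(p, pts):
    """Smallest number of points of pts in a closed halfplane with p on its boundary.

    The halfspace depth of p is this count divided by len(pts).
    """
    diffs = [(x - p[0], y - p[1]) for x, y in pts]
    at_p = sum(d == (0, 0) for d in diffs)      # atoms located exactly at p
    others = [d for d in diffs if d != (0, 0)]
    if not others:
        return at_p
    best = len(pts)
    for d in others:
        # Atoms strictly on either side of the line through p with direction d.
        left = sum(cross(d, e) > 0 for e in others)
        right = sum(cross(d, e) < 0 for e in others)
        # Atoms on the line itself, split into the two open halflines from p.
        ray_pos = sum(cross(d, e) == 0 and dot(d, e) > 0 for e in others)
        ray_neg = sum(cross(d, e) == 0 and dot(d, e) < 0 for e in others)
        # A slight rotation keeps one open side and exactly one of the halflines.
        best = min(best, at_p + min(left, right) + min(ray_pos, ray_neg))
    return best


def deepest_sample_points(pts):
    """Return (maximum depth count over sample points, indices attaining it)."""
    counts = [depth_count(p, pts) for p in pts]
    top = max(counts)
    return top, [i for i, c in enumerate(counts) if c == top]


# Three vertices of a triangle and one point in its interior.
sample = [(0, 0), (4, 0), (0, 4), (1, 2)]
print([depth_count(p, sample) for p in sample])
print(deepest_sample_points(sample))
```

For this small sample the interior point is the deepest sample point, with depth $2/4$, which matches case (i) of the footnote on $n=4$. The deepest sample point is the halfspace median only when the median set is not full-dimensional; deciding that still requires a full-dimensional region computation such as the one in [11].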

In dimension $d>2$ , the situation with possible less-than-full-dimensional halfspace medians appears to be much more convoluted, as demonstrated already in Example 2. Our proof technique from Theorem 7 does not extend directly to $d>2$ . One might, however, conjecture that in accordance with Theorem 7, a less-than-full-dimensional median of a dataset in general position must contain at least one atom of $\mu\in\mathcal{A}\left(\mathbb{R}^{d}\right)$ . Our final example shows that this is not true: a configuration of points in general position without an atom in the halfspace median set is indeed possible.

Example 3.

Consider a dataset of $n=8$ points in $\mathbb{R}^{3}$ given by

	$\displaystyle x_{1}$	$\displaystyle=\left(-1,\frac{1}{3},-\frac{2}{3}\right),$	$\displaystyle x_{2}$	$\displaystyle=\bigg{(}1,0,-1\bigg{)},$	$\displaystyle x_{3}$	$\displaystyle=\left(0,\frac{3}{2},-1\right),$	$\displaystyle x_{4}$	$\displaystyle=\left(-\frac{1}{2},0,1\right),$
	$\displaystyle x_{5}$	$\displaystyle=\left(1,0,\frac{4}{3}\right),$	$\displaystyle x_{6}$	$\displaystyle=\bigg{(}0,2,1\bigg{)},$	$\displaystyle x_{7}$	$\displaystyle=\left(-\frac{1}{3},\frac{1}{2},-2\right),$	$\displaystyle x_{8}$	$\displaystyle=\left(\frac{1}{3},\frac{1}{2},2\right).$

As in case $k=1$ of Example 2, points $x_{1},\dots,x_{6}$ are perturbed vertices of a triangular prism. Points $x_{7}$ and $x_{8}$ determine a line segment that passes through both triangular bases of that prism. The dataset is in general position. A direct computation performed in Mathematica, provided in the script in the online Supplementary Material, confirms that the sample halfspace median set of this dataset attains depth $3/8$, and the median set consists of the line segment between points $\left(0,1/2,0\right)$ and $\left(3/44,1/2,9/22\right)$. This median line segment lies strictly in the relative interior of the line segment between points $x_{7}$ and $x_{8}$, and does not contain any atoms of the corresponding measure $\mu\in\mathcal{A}\left(\mathbb{R}^{3}\right)$. Thus, it is possible that in dimension $d>2$, a less-than-full-dimensional median set contains no data points. For a visualisation of our dataset and its median set see Figure 4.
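The geometric part of this claim, namely that both endpoints of the reported median segment lie strictly between $x_{7}$ and $x_{8}$, is easy to confirm in exact rational arithmetic. The sketch below does only that; it does not recompute the depth, which is taken from the Mathematica script. The helper `segment_parameter` is a hypothetical name introduced for illustration.

```python
from fractions import Fraction as F

x7 = (F(-1, 3), F(1, 2), F(-2))
x8 = (F(1, 3), F(1, 2), F(2))
endpoints = [(F(0), F(1, 2), F(0)), (F(3, 44), F(1, 2), F(9, 22))]


def segment_parameter(p, a, b):
    """Return t with p = a + t * (b - a), or None if p is not on the line through a and b."""
    direction = [bi - ai for ai, bi in zip(a, b)]
    offset = [pi - ai for ai, pi in zip(a, p)]
    # Solve for t in the first coordinate where the direction is nonzero,
    # then verify that the same t works in every coordinate.
    k = next(i for i, d in enumerate(direction) if d != 0)
    t = offset[k] / direction[k]
    if all(o == t * d for o, d in zip(offset, direction)):
        return t
    return None


params = [segment_parameter(p, x7, x8) for p in endpoints]
print(params)
```

Both parameters lie strictly between $0$ and $1$ (they equal $1/2$ and $53/88$), so neither endpoint of the median segment coincides with $x_{7}$ or $x_{8}$.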

In Example 3, we constructed a dataset in general position in dimension $d=3$ , with a one-dimensional median set. We do not know an example of a dataset sampled from a distribution with a density in $\mathbb{R}^{d}$ , $d>2$ , with a unique (zero-dimensional) halfspace median that is not a data point. The higher dimensional situation therefore deserves further investigation.

Acknowledgement

P. Laketa was supported by the OP RDE project “International mobility of research, technical and administrative staff at the Charles University”, grant CZ.02.2.69/0.0/0.0/18_053/0016976. The work of S. Nagy was supported by Czech Science Foundation (EXPRO project n. 19-28231X).

References

[1] Barber, C. and Mozharovskyi, P. (2022). TukeyRegion: Tukey region and median. R package version 0.1.5.5.
[2] Chernozhukov, V., Galichon, A., Hallin, M., and Henry, M. (2017). Monge-Kantorovich depth, quantiles, ranks and signs. Ann. Statist., 45(1):223–256.
[3] Donoho, D. L. and Gasko, M. (1992). Breakdown properties of location estimates based on halfspace depth and projected outlyingness. Ann. Statist., 20(4):1803–1827.
[4] Dudley, R. M. (2002). Real analysis and probability, volume 74 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge.
[5] Dyckerhoff, R. and Mozharovskyi, P. (2016). Exact computation of the halfspace depth. Comput. Statist. Data Anal., 98:19–30.
[6] Fojtík, V., Laketa, P., Mozharovskyi, P., and Nagy, S. (2022). On exact computation of Tukey depth central regions. arXiv preprint arXiv:2208.04587.
[7] Laketa, P. and Nagy, S. (2022a). Halfspace depth for general measures: the ray basis theorem and its consequences. Statist. Papers, 63(3):849–883.
[8] Laketa, P. and Nagy, S. (2022b). Partial reconstruction of measures from halfspace depth. In Proceedings of CLADAG2021, Stud. Classification Data Anal. Knowledge Organ. Springer, Cham. To appear.
[9] Laketa, P., Pokorný, D., and Nagy, S. (2022). Simple halfspace depth. Under review.
[10] Liu, X., Luo, S., and Zuo, Y. (2020). Some results on the computing of Tukey’s halfspace median. Statist. Papers, 61(1):303–316.
[11] Liu, X., Mosler, K., and Mozharovskyi, P. (2019). Fast computation of Tukey trimmed regions and median in dimension $p>2$. J. Comput. Graph. Statist., 28(3):682–697.
[12] Massé, J.-C. (2004). Asymptotics for the Tukey depth process, with an application to a multivariate trimmed mean. Bernoulli, 10(3):397–419.
[13] Mizera, I. and Volauf, M. (2002). Continuity of halfspace depth contours and maximum depth estimators: diagnostics of depth-related methods. J. Multivariate Anal., 83(2):365–388.
[14] Mosler, K. and Mozharovskyi, P. (2022). Choosing among notions of multivariate depth statistics. Statist. Sci., 37(3):348–368.
[15] Nagy, S., Schütt, C., and Werner, E. M. (2019). Halfspace depth and floating body. Stat. Surv., 13:52–118.
[16] Rousseeuw, P. J. and Ruts, I. (1999). The depth function of a population distribution. Metrika, 49(3):213–244.
[17] Schneider, R. (2014). Convex bodies: the Brunn-Minkowski theory, volume 151 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, expanded edition.
[18] Small, C. G. (1987). Measures of centrality for multivariate and directional distributions. Canad. J. Statist., 15(1):31–39.
[19] Struyf, A. and Rousseeuw, P. J. (1999). Halfspace depth and regression depth characterize the empirical distribution. J. Multivariate Anal., 69(1):135–153.
[20] Tukey, J. W. (1975). Mathematics and the picturing of data. In Proceedings of the International Congress of Mathematicians (Vancouver, B. C., 1974), Vol. 2, pages 523–531. Canad. Math. Congress, Montreal, Que.
[21] Zuo, Y. and Serfling, R. (2000). General notions of statistical depth function. Ann. Statist., 28(2):461–482.
