A flexible family of distributions on the cylinder

Shonosuke Sugasawa, Kunio Shimizu and Shogo Kato

Graduate School of Economics, The University of Tokyo, E-mail: shonosuke622@gmail.com

The Institute of Statistical Mathematics

Abstract

We propose a flexible family of distributions on the cylinder, the generalized $t$-distributions, obtained as a conditional distribution of a trivariate $t$ distribution. Depending on the parameter values, the new distribution is unimodal or bimodal and symmetric or asymmetric, so it fits cylindrical data flexibly. The circular marginal of this distribution is a generalized $t$-distribution on the circle. Some other properties are also investigated. The proposed distribution is applied to real cylindrical data.

Key words and phrases: circular-linear correlation, circular-linear regression, generalized von Mises distribution, Johnson–Wehrly model, Mardia–Sutton model

1 Introduction

Directional or circular data appear in a variety of scientific fields, and various stochastic models have been proposed for analyzing such data. For univariate circular data, many distributions have been investigated in terms of both tractability and applicability; see Jones and Pewsey (2005), Kato and Jones (2010) and Kato and Jones (2015). However, we sometimes encounter situations which involve both circular and linear variables, namely cylindrical data such as pairs of wind direction and temperature (Mardia and Sutton, 1978) or directions and distances of animal movements (Fisher, 1993). Such data call for distributions on the cylinder, but far fewer of these are available than univariate circular distributions. We give a brief review of several cylindrical distributions known in the literature. Johnson and Wehrly (1978) gave a distribution based on the principle of maximum entropy subject to constraints on certain moments, and Mardia and Sutton (1978) provided another distribution as a conditional distribution of a trivariate normal distribution or a maximum entropy distribution. An extension of the distribution by Mardia and Sutton (1978) was studied by Kato and Shimizu (2008), which can also be derived as a maximum entropy distribution or a conditional of a trivariate normal.

In this paper, we propose the generalized $t$ -distribution on the cylinder, which is a natural extension of the member of the exponential family given by Kato and Shimizu (2008). The proposed distribution is also regarded as a cylindrical extension of the generalized $t$ -distribution on the circle proposed by Siew et al. (2008). In fact, the circular marginal distribution is the generalized $t$ -distribution on the circle. The proposed distribution can be obtained as a conditional distribution of a trivariate $t$ distribution and characterized as the maximum $\beta$ -entropy distribution. This is a quite flexible distribution which allows for both asymmetry and variations in tail weight in terms of parameter values. We investigate some properties such as marginal and conditional distributions, modality, moments, circular-linear correlation and skewness. We briefly discuss a circular-linear regression model derived from the conditional distribution. For practical use, we provide an iterative algorithm for parameter estimation.

Subsequent sections are organized as follows. Section 2 provides a derivation of the distribution. In Section 3 some properties of the new distribution are studied. We apply the proposed distribution to the data set of the wind direction and ozone level given in Johnson and Wehrly (1977) in Section 4.

2 Derivation

Suppose that a trivariate random vector $W$ is distributed as a trivariate $t$ distribution with degrees of freedom $\alpha$ , mean vector $\eta=(\eta_{1},\eta_{2},\eta_{3})^{\prime}\in\mathbb{R}^{3}$ and variance-covariance matrix $\Sigma$ , where

\Sigma=\left(\begin{array}[]{ccc}\sigma_{1}^{2}&\rho_{12}\sigma_{1}\sigma_{2}&\rho_{13}\sigma_{1}\sigma_{3}\\ \rho_{12}\sigma_{1}\sigma_{2}&\sigma_{2}^{2}&\rho_{23}\sigma_{2}\sigma_{3}\\ \rho_{13}\sigma_{1}\sigma_{3}&\rho_{23}\sigma_{2}\sigma_{3}&\sigma_{3}^{2}\end{array}\right)

for $\sigma_{j}>0$ ( $j=1,2,3$ ), $-1<\rho_{12}<1$ and $1+2\rho_{12}\rho_{13}\rho_{23}-\rho_{12}^{2}-\rho_{13}^{2}-\rho_{23}^{2}>0.$ From Kotz and Nadarajah (2004), the distribution of $W$ has density

f(w)=\frac{\Gamma((\alpha+3)/2)}{(\alpha\pi)^{3/2}|\Sigma|^{1/2}\Gamma(\alpha/2)}\left\{1+\frac{(w-\eta)^{\prime}\Sigma^{-1}(w-\eta)}{\alpha}\right\}^{-(\alpha+3)/2},\quad w\in\mathbb{R}^{3}.

We use a cylindrical coordinate $W=(X,X_{1},X_{2})^{\prime}$ , where $X_{1}=R\cos\Theta$ and $X_{2}=R\sin\Theta$ with $R>0$ and $0\leq\Theta<2\pi$ , and consider the conditional distribution of $(X,\Theta)^{\prime}$ given $R=r$ , which provides a distribution on the cylinder. Define new parameters as

$\begin{array}[]{l}\mu(\theta)=\mu+\lambda\cos(\theta-\nu),\quad\tau^{2}=\displaystyle{\frac{\sigma_{1}^{2}\rho^{2}}{1-\rho_{23}^{2}}},\\ \kappa_{1}^{\ast}\cos\mu_{1}=r(b_{1}\eta_{2}-b_{2}\eta_{3}),\quad\kappa_{1}^{\ast}\sin\mu_{1}=r(b_{3}\eta_{3}-b_{2}\eta_{2}),\\ \displaystyle{\kappa_{2}^{\ast}\cos 2\mu_{2}=\frac{r^{2}(b_{3}-b_{1})}{4},\quad\kappa_{2}^{\ast}\sin 2\mu_{2}=\frac{r^{2}b_{2}}{2}}\end{array}$

with $-\infty<\mu<\infty$ , $\lambda,\kappa_{1}^{\ast},\kappa_{2}^{\ast}\geq 0$ , $0\leq\nu,\mu_{1}<2\pi$ and $0\leq\mu_{2}<\pi$ , where

$\begin{array}[]{l}\mu=\eta_{1}+a_{1}\eta_{2}+a_{2}\eta_{3},\quad\rho^{2}=1+2\rho_{12}\rho_{13}\rho_{23}-\rho_{12}^{2}-\rho_{13}^{2}-\rho_{23}^{2},\\ \displaystyle{a_{1}=\frac{\sigma_{1}}{\sigma_{2}}\frac{\rho_{13}\rho_{23}-\rho_{12}}{1-\rho_{23}^{2}},\quad a_{2}=\frac{\sigma_{1}}{\sigma_{3}}\frac{\rho_{12}\rho_{23}-\rho_{13}}{1-\rho_{23}^{2}},}\\ \displaystyle{b_{1}=\frac{1}{\sigma_{2}^{2}(1-\rho_{23}^{2})},\quad b_{2}=\frac{\rho_{23}}{\sigma_{2}\sigma_{3}(1-\rho_{23}^{2})},\quad b_{3}=\frac{1}{\sigma_{3}^{2}(1-\rho_{23}^{2})},}\\ \lambda\cos\nu=-a_{1}r,\quad\lambda\sin\nu=-a_{2}r.\end{array}$

Then we have

\frac{1}{2}(w-\eta)^{\prime}\Sigma^{-1}(w-\eta)=\frac{1}{2\tau^{2}}\{x-\mu(\theta)\}^{2}-\kappa_{1}^{\ast}\cos(\theta-\mu_{1})-\kappa_{2}^{\ast}\cos 2(\theta-\mu_{2})+d,

where $d=(b_{1}\eta_{2}^{2}+b_{3}\eta_{3}^{2}-2b_{2}\eta_{2}\eta_{3})/2+r^{2}(b_{1}+b_{3})/4\ (\geq 0)$ collects the terms that depend on neither $x$ nor $\theta$. The conditional probability density function $f(x,\theta|r)$ of $(X,\Theta)^{\prime}|(R=r)$ is represented as

f(x,\theta|r)=C^{-1}\left[1+\frac{1}{2\sigma^{2}}\left\{x-\mu(\theta)\right\}^{2}-\kappa_{1}\cos(\theta-\mu_{1})-\kappa_{2}\cos 2(\theta-\mu_{2})\right]^{-(\alpha+3)/2},

(1)

where $\kappa_{1}=\kappa_{1}^{\ast}/\gamma,\ \kappa_{2}=\kappa_{2}^{\ast}/\gamma,\ \sigma^{2}=\gamma\tau^{2},\ \gamma=\alpha/2+d\ (>0)$ and the normalizing constant is

\begin{aligned}C&=2\sqrt{2}\pi B(1/2,\alpha/2+1)\sigma\left\{F_{4}\left(\frac{\alpha}{4}+\frac{1}{2},\frac{\alpha}{4}+1,1,1;\kappa_{1}^{2},\kappa_{2}^{2}\right)\right.\\ &\quad+2\sum_{j=1}^{\infty}\frac{(\alpha/2+1)_{3j}}{(2j)!j!}\left(\frac{\kappa_{1}}{2}\right)^{2j}\left(\frac{\kappa_{2}}{2}\right)^{j}\cos 2j(\mu_{2}-\mu_{1})\\ &\quad\times\left.F_{4}\left(\frac{\alpha+6j}{4}+\frac{1}{2},\frac{\alpha+6j}{4}+1,2j+1,j+1;\kappa_{1}^{2},\kappa_{2}^{2}\right)\right\}.\end{aligned}

(2)

Here $F_{4}$ denotes Appell’s double hypergeometric function (Gradshteyn and Ryzhik, 2007, 9.180.4) defined by

F_{4}(\alpha_{1},\alpha_{2},\beta_{1},\beta_{2};z_{1},z_{2})=\sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\frac{(\alpha_{1})_{i+j}(\alpha_{2})_{i+j}}{(\beta_{1})_{i}(\beta_{2})_{j}}\frac{z_{1}^{i}z_{2}^{j}}{i!j!},\quad\sqrt{z_{1}}+\sqrt{z_{2}}<1

with Pochhammer’s symbol

(c)_{j}=\left\{\begin{array}[]{ll}c(c+1)\cdots(c+j-1),&j\geq 1,\\ 1,&j=0,\end{array}\right.

and $B$ the beta function. The distribution with density function (2) has nine parameters $\alpha,\sigma>0$, $-\infty<\mu<\infty$, $\kappa_{1},\kappa_{2},\lambda\geq 0$, $0\leq\nu,\mu_{1}<2\pi$ and $0\leq\mu_{2}<\pi$ with restriction $\kappa_{1}+\kappa_{2}<1$. We call the resulting distribution the generalized $t$-distribution on the cylinder. Relationships between the new and original parameters are:

(a) $\eta_{2}=\eta_{3}=0\Leftrightarrow\kappa_{1}=0$.

(b) $\kappa_{2}=0\Leftrightarrow\rho_{23}=0,\sigma_{2}=\sigma_{3}$.

(c) $\lambda=0\Leftrightarrow\rho_{12}=\rho_{13}=0$.

The parameter $\alpha$ was assumed positive when (2) was introduced. However, the density remains valid when the parameter space of $\alpha$ is extended to $\alpha\geq-1$. Note that $\alpha$ corresponds to the degrees of freedom in the generalized $t$-distribution on the cylinder (2) and determines the degree of concentration around the mode. The proposed distribution with density (2) has nine parameters, whose roles are as follows: $(\mu_{1},\mu_{2})$ and $\mu$ are location parameters of $\Theta$ and $X$, respectively, and $\sigma$ is the scale parameter of $X$. The parameters $\lambda$ and $\nu$ in $\mu(\theta)$ determine the degree of correlation between $X$ and $\Theta$, and $\lambda$ also controls the skewness of $X$. The parameters $\kappa_{1}$ and $\kappa_{2}$ determine the modality of the distribution, as discussed in Section 3.3, and $\alpha$ controls the tail weight of the distribution. To see the roles of $\alpha$ and $\lambda$, we provide contour plots of the proposed density (2) in Figure 1 for some combinations of $\alpha$ and $\lambda$. The other parameters are specified as $\mu=\nu=\mu_{1}=\mu_{2}=0,\ \sigma^{2}=1,\ \kappa_{1}=0.1,\ \kappa_{2}=0.4$. Column (I) in Figure 1 illustrates the interpretation of $\alpha$: as $\alpha$ increases, the tail weight of the density decreases. The role of $\lambda$ in controlling the skewness of the distribution can be seen from column (II) in Figure 1.
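To make the roles of the parameters concrete, the following is a minimal sketch of how the joint density of the generalized $t$-distribution on the cylinder could be evaluated. It is an illustration under stated assumptions rather than the authors' implementation: instead of the Appell series, the normalizing constant is computed as $C=\sqrt{2}\sigma B(1/2,\alpha/2+1)\int_{0}^{2\pi}\{1-\kappa_{1}\cos(\theta-\mu_{1})-\kappa_{2}\cos 2(\theta-\mu_{2})\}^{-\alpha/2-1}{\rm d}\theta$, with the angular integral approximated by a trapezoid rule. The function names (`make_density`, `circular_constant`, and so on) are hypothetical.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def m_theta(theta, k1, mu1, k2, mu2):
    """m(theta) = kappa1*cos(theta - mu1) + kappa2*cos 2(theta - mu2)."""
    return k1 * math.cos(theta - mu1) + k2 * math.cos(2.0 * (theta - mu2))


def circular_constant(alpha, k1, mu1, k2, mu2, n=2000):
    """Integral of {1 - m(theta)}^(-alpha/2 - 1) over [0, 2*pi),
    approximated by the periodic trapezoid rule on n equally spaced angles."""
    h = 2.0 * math.pi / n
    expo = -alpha / 2.0 - 1.0
    return h * sum((1.0 - m_theta(i * h, k1, mu1, k2, mu2)) ** expo
                   for i in range(n))


def make_density(alpha, sigma, mu, lam, nu, k1, mu1, k2, mu2):
    """Return a function (x, theta) -> joint density value."""
    if sigma <= 0 or k1 < 0 or k2 < 0 or k1 + k2 >= 1 or alpha < -1:
        raise ValueError("parameters outside the stated parameter space")
    c_theta = circular_constant(alpha, k1, mu1, k2, mu2)
    # Assumption of this sketch: C = sqrt(2) * sigma * B(1/2, alpha/2 + 1) * C_Theta,
    # with C_Theta obtained numerically rather than from the series expression.
    c = math.sqrt(2.0) * sigma * math.exp(log_beta(0.5, alpha / 2.0 + 1.0)) * c_theta
    expo = -(alpha + 3.0) / 2.0

    def density(x, theta):
        loc = mu + lam * math.cos(theta - nu)  # mu(theta)
        base = (1.0 + (x - loc) ** 2 / (2.0 * sigma ** 2)
                - m_theta(theta, k1, mu1, k2, mu2))
        return base ** expo / c

    return density


# Parameter values taken from the Figure 1 setting with alpha = 6, lambda = 0.5.
f = make_density(alpha=6.0, sigma=1.0, mu=0.0, lam=0.5, nu=0.0,
                 k1=0.1, mu1=0.0, k2=0.4, mu2=0.0)
print(f(0.5, 0.0))
```

Evaluating `f` on a grid of $(x,\theta)$ values gives the kind of surface shown as contours in Figure 1; the quadrature size `n` is an arbitrary choice and may need to be increased when $\kappa_{1}+\kappa_{2}$ is close to 1.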

Under the assumption $\gamma=(\alpha+3)/2$ , if we use reparametrization $1/\psi=-(\alpha+3)/2,\ -\kappa_{1}=\tanh(\kappa_{1}^{\ast\ast}\psi),\ -\kappa_{2}=\tanh(\kappa_{2}^{\ast\ast}\psi)$ , then the density (2) has a similar expression to the family of distributions introduced by Jones and Pewsey (2005):

f(x,\theta|r)\propto\left[1-\frac{\psi}{2\tau^{2}}\{x-\mu(\theta)\}^{2}+\tanh(\kappa_{1}^{\ast\ast}\psi)\cos(\theta-\mu_{1})+\tanh(\kappa_{2}^{\ast\ast}\psi)\cos 2(\theta-\mu_{2})\right]^{1/\psi}.

The restriction $\alpha\geq-1$ in (2) is changed into $-1\leq\psi<0$ .

Figure 1: Contour plots of density (2) for: (I) $\lambda=0$ and (a) $\alpha=-1$, (b) $\alpha=10$, and (c) $\alpha=20$; (II) $\alpha=6$ and (a) $\lambda=0$, (b) $\lambda=0.5$, and (c) $\lambda=1$. The other parameters are set as $\mu=\nu=\mu_{1}=\mu_{2}=0,\ \sigma^{2}=1,\ \kappa_{1}=0.1,\ \kappa_{2}=0.4$.

3 Properties

3.1 Special cases

1. Under the assumption $\gamma=\alpha/2$ in (2), letting $\gamma\ (=\alpha/2)\to\infty$ , we have an extension of the distribution by Mardia and Sutton (1978). Its joint probability density function (Kato and Shimizu, 2008) is

f(x,\theta)=C_{1}^{-1}\exp\left[-\frac{\{x-\mu(\theta)\}^{2}}{2\tau^{2}}+\kappa_{1}^{\ast}\cos(\theta-\mu_{1})+\kappa_{2}^{\ast}\cos 2(\theta-\mu_{2})\right]

(3)

with the normalizing constant

C_{1}=(2\pi)^{3/2}\tau\left[I_{0}(\kappa_{1}^{\ast})I_{0}(\kappa_{2}^{\ast})+2\sum_{j=1}^{\infty}I_{j}(\kappa_{2}^{\ast})I_{2j}(\kappa_{1}^{\ast})\cos 2j(\mu_{1}-\mu_{2})\right].

(4)

Here $I_{j}$ denotes the modified Bessel function of the first kind and order $j$ given by

I_{j}(z)=\frac{1}{2\pi}\int_{0}^{2\pi}\cos(j\theta){\rm e}^{z\cos\theta}{\rm d}\theta=\sum_{r=0}^{\infty}\frac{1}{\Gamma(r+j+1)r!}\left(\frac{z}{2}\right)^{2r+j},\quad z\in\mathbb{C}.

2. When $\kappa_{2}=0$ , (2) reduces to

f(x,\theta)=C_{2}^{-1}\left[1+\frac{1}{2\sigma^{2}}\left\{x-\mu(\theta)\right\}^{2}-\kappa_{1}\cos(\theta-\mu_{1})\right]^{-(\alpha+3)/2}.

(5)

The normalizing constant is represented as

C_{2}=2\sqrt{2}\pi\sigma B(1/2,\alpha/2+1){}_{2}F_{1}\left(\frac{\alpha}{4}+\frac{1}{2},\frac{\alpha}{4}+1,1;\kappa_{1}^{2}\right)

using the Gauss hypergeometric function ${}_{2}F_{1}$ . If we replace $\kappa_{1}=\kappa_{1}^{\ast}/\gamma,\sigma^{2}=\gamma\tau^{2}$ and let $\gamma=\alpha/2\to\infty$ , we have the distribution proposed by Mardia and Sutton (1978) and the constant $C_{2}$ tends to $(2\pi)^{3/2}\tau I_{0}(\kappa_{1}^{\ast})$ . This agrees with the fact that $C_{1}=(2\pi)^{3/2}\tau I_{0}(\kappa_{1}^{\ast})$ when $\kappa_{2}^{\ast}=0$ in (4).
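The limiting behaviour of $C_{2}$ can be checked numerically. The sketch below evaluates $C_{2}$ with $\kappa_{1}=\kappa_{1}^{\ast}/\gamma$, $\sigma^{2}=\gamma\tau^{2}$ and $\gamma=\alpha/2$, and compares it with $(2\pi)^{3/2}\tau I_{0}(\kappa_{1}^{\ast})$ for increasing $\alpha$. The series implementations of ${}_{2}F_{1}$ and $I_{0}$ and all function names are assumptions made for this illustration; the truncation tolerances are arbitrary.

```python
import math


def hyp2f1_c1(a, b, z, tol=1e-16, max_terms=100000):
    """Gauss series 2F1(a, b; 1; z) = sum_n (a)_n (b)_n z^n / (n!)^2, 0 <= z < 1."""
    term = 1.0
    total = 1.0
    for n in range(max_terms):
        term *= (a + n) * (b + n) * z / ((n + 1.0) ** 2)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def bessel_i0(z, tol=1e-17):
    """Modified Bessel function I_0(z) from its power series."""
    term = 1.0
    total = 1.0
    r = 0
    while abs(term) > tol * abs(total):
        r += 1
        term *= (z / 2.0) ** 2 / (r * r)
        total += term
    return total


def c2(alpha, sigma, k1):
    """Normalizing constant C_2 of the kappa_2 = 0 submodel."""
    log_b = (math.lgamma(0.5) + math.lgamma(alpha / 2.0 + 1.0)
             - math.lgamma(alpha / 2.0 + 1.5))
    return (2.0 * math.sqrt(2.0) * math.pi * sigma * math.exp(log_b)
            * hyp2f1_c1(alpha / 4.0 + 0.5, alpha / 4.0 + 1.0, k1 ** 2))


def c2_reparam(alpha, tau, k1_star):
    """C_2 after substituting kappa_1 = kappa_1*/gamma, sigma^2 = gamma*tau^2, gamma = alpha/2."""
    gamma = alpha / 2.0
    return c2(alpha, math.sqrt(gamma) * tau, k1_star / gamma)


def c2_limit(tau, k1_star):
    """Limiting constant (2*pi)^(3/2) * tau * I_0(kappa_1*)."""
    return (2.0 * math.pi) ** 1.5 * tau * bessel_i0(k1_star)


for alpha in (20.0, 200.0, 2000.0):
    print(alpha, c2_reparam(alpha, 1.0, 1.0) / c2_limit(1.0, 1.0))
```

The printed ratios should approach 1 as $\alpha$ grows, in line with the stated limit.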

3.2 Marginal and conditional distributions

The marginal distribution of $\Theta$ is

f_{\Theta}(\theta)=C_{\Theta}^{-1}\{1-\kappa_{1}\cos(\theta-\mu_{1})-\kappa_{2}\cos 2(\theta-\mu_{2})\}^{-\alpha/2-1},

(6)

where

$\displaystyle C_{\Theta}$	$\displaystyle=2\pi\left\{F_{4}\left(\frac{\alpha}{4}+\frac{1}{2},\frac{\alpha}{4}+1,1,1;\kappa_{1}^{2},\kappa_{2}^{2}\right)\right.$
	$\displaystyle+2\sum_{j=1}^{\infty}\frac{(\alpha/2+1)_{3j}}{(2j)!j!}\left(\frac{\kappa_{1}}{2}\right)^{2j}\left(\frac{\kappa_{2}}{2}\right)^{j}\cos 2j(\mu_{2}-\mu_{1})$
	$\displaystyle\times\left.F_{4}\left(\frac{\alpha+6j}{4}+\frac{1}{2},\frac{\alpha+6j}{4}+1,2j+1,j+1;\kappa_{1}^{2},\kappa_{2}^{2}\right)\right\}.$	(7)

The distribution with density (6) is a member of the generalized $t$-distributions on the circle proposed by Siew et al. (2008), and is possibly bimodal and asymmetric. Cosine and sine moments of the generalized $t$-distributions are given in their paper. The generalized $t$-distributions include the generalized von Mises distribution (cf. Yfantis and Borgman, 1982) as a special case. As another special case, when $\kappa_{2}=0$ in (2), the marginal distribution of $\Theta$ belongs to the family of symmetric distributions by Jones and Pewsey (2005). Note that the marginal density (6) does not depend on $\lambda,\mu,\sigma$ and $\nu$, which are parameters of the proposed density (2). On the other hand, the marginal distribution of $X$ does not have a closed form in general. When $\lambda=0$, we can obtain the marginal distribution of $X$ in a closed form given by

f_{X}(x)=C^{-1}D(x)\left\{1+\frac{1}{2\sigma^{2}}(x-\mu)^{2}\right\}^{-(\alpha+3)/2},

where $C$ is defined by (2) and $D(x)$ is obtained by replacing $\alpha$ and $\kappa_{i}\ (i=1,2)$ with $\alpha+1$ and $\kappa_{i}/\{1+(x-\mu)^{2}/(2\sigma^{2})\}$, respectively, in $C_{\Theta}$ defined in (7). Note that this density is symmetric about $\mu$.

In (2), the conditional distribution of $X$ given $\Theta=\theta$ has the generalized $t$ -density function provided by

f_{X|\Theta}(x|\theta)=C_{X|\Theta}^{-1}\left(1+\frac{\{x-\mu(\theta)\}^{2}}{2\sigma^{2}\left[1-\left\{\kappa_{1}\cos(\theta-\mu_{1})+\kappa_{2}\cos 2(\theta-\mu_{2})\right\}\right]}\right)^{-(\alpha+3)/2}

(8)

with the normalizing constant

C_{X|\Theta}=\sqrt{2}\sigma B\left(\frac{1}{2},\frac{\alpha}{2}+1\right)\left[1-\{\kappa_{1}\cos(\theta-\mu_{1})+\kappa_{2}\cos 2(\theta-\mu_{2})\}\right]^{1/2}.

The conditional distribution of $\Theta$ given $X=x$ is a member of the generalized $t$ -distributions on the circle given by

f(\theta|X=x)\propto\{1-\kappa_{1}(x)\cos(\theta-\mu_{1})-\kappa_{2}(x)\cos 2(\theta-\mu_{2})\}^{-(\alpha+3)/2},

where $\kappa_{i}(x)=\kappa_{i}/\{1+(x-\mu)^{2}/(2\sigma^{2})\},\ i=1,2$ . As shown in Siew et al. (2008), the mean direction of $\Theta$ with density (6) depends on $\kappa_{i},\ i=1,2$ . Thus the conditional mean direction $E(\Theta|X=x)$ depends on $x$ through $\kappa_{i}(x)$ . The result that the conditional distributions of $X|\Theta=\theta$ and $\Theta|X=x$ are the generalized $t$ -distribution comes from the fact that the conditional distribution of a multivariate $t$ distribution is again a multivariate $t$ distribution (Joe, 2015).

3.3 Modality

We consider the modality of the distribution with density (2). The mode $x^{\ast}$ of (2), whenever the value of $\theta$ is specified, is $x^{\ast}=\mu(\theta)$ . Similar to Siew et al. (2008), we discuss maximization of the function $m(\theta)=\kappa_{1}\cos(\theta-\mu_{1})+\kappa_{2}\cos 2(\theta-\mu_{2})$ with respect to $\theta$ . The solution $\theta^{\ast}$ of an equation

\kappa_{1}\sin(\theta-\mu_{1})+2\kappa_{2}\sin 2(\theta-\mu_{2})=0

(9)

is a value which maximizes $m(\theta)$ if the sign of $h(\theta^{\ast})$ is positive, where

h(\theta)=\kappa_{1}\cos(\theta-\mu_{1})+4\kappa_{2}\cos 2(\theta-\mu_{2}).

Equation (9) can be solved numerically for any combinations of $\mu_{1}$ and $\mu_{2}$ . Without loss of generality, we let $\mu_{1}=0$ . Then (9) has closed form solutions when $\mu_{2}=0,\pi/2,\pi/4$ and $3\pi/4$ . The results are given in Table 1, where $\theta_{0}=\arccos\{\kappa_{1}/(4\kappa_{2})\}$ , $\theta_{1}=\arcsin\{(-\kappa_{1}+\sqrt{\kappa_{1}^{2}+32\kappa_{2}^{2}})/(8\kappa_{2})\}$ and $\theta_{2}=\arcsin\{(\kappa_{1}+\sqrt{\kappa_{1}^{2}+32\kappa_{2}^{2}})/(8\kappa_{2})\}$ . See Yfantis and Borgman (1982) for more discussion as to the solutions of (9).

Table 1: Summary of the modality of the proposed distribution for some values of $\mu_{2}$ when $\mu_{1}=0$

$\mu_{2}$	Condition	Modes $(x,\theta)$
$0$	$4\kappa_{2}>\kappa_{1}$	$(\mu+\lambda\cos\nu,0),(\mu-\lambda\cos\nu,\pi)$
	$4\kappa_{2}<\kappa_{1}$	$(\mu+\lambda\cos\nu,0)$
$\pi/2$	$4\kappa_{2}>\kappa_{1}$	$(\mu+\lambda\cos(\theta_{0}-\nu),\theta_{0}),(\mu+\lambda\cos(\theta_{0}+\nu),2\pi-\theta_{0})$
	$4\kappa_{2}<\kappa_{1}$	$(\mu+\lambda\cos\nu,0)$
$\pi/4$	$2\kappa_{2}>\kappa_{1}$	$(\mu+\lambda\cos(\theta_{1}-\nu),\theta_{1}),(\mu-\lambda\cos(\theta_{2}-\nu),\pi+\theta_{2})$
	$2\kappa_{2}<\kappa_{1}$	$(\mu+\lambda\cos(\theta_{1}-\nu),\theta_{1})$
$3\pi/4$	$2\kappa_{2}>\kappa_{1}$	$(\mu-\lambda\cos(\theta_{2}+\nu),\pi-\theta_{2}),(\mu+\lambda\cos(\theta_{1}+\nu),2\pi-\theta_{1})$
	$2\kappa_{2}<\kappa_{1}$	$(\mu+\lambda\cos(\theta_{1}+\nu),2\pi-\theta_{1})$
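For values of $\mu_{1}$ and $\mu_{2}$ not covered by Table 1, the modes can be located numerically. The following sketch brackets the roots of equation (9) on a grid, refines them by bisection, and keeps those with $h(\theta^{\ast})>0$; the corresponding linear coordinate is $x^{\ast}=\mu(\theta^{\ast})$. The function name `find_modes`, the grid size and the number of bisection steps are assumptions of this illustration.

```python
import math


def find_modes(k1, mu1, k2, mu2, mu=0.0, lam=0.0, nu=0.0, n_grid=720):
    """Return the modes (x*, theta*) of the joint density as a list of tuples."""
    two_pi = 2.0 * math.pi

    def g(t):  # left-hand side of equation (9)
        return k1 * math.sin(t - mu1) + 2.0 * k2 * math.sin(2.0 * (t - mu2))

    def h(t):  # positive at a maximizer of m(theta)
        return k1 * math.cos(t - mu1) + 4.0 * k2 * math.cos(2.0 * (t - mu2))

    step = two_pi / n_grid
    modes = []
    for i in range(n_grid):
        # Half-step offset so that roots such as 0 and pi fall inside a cell.
        a = (i + 0.5) * step
        b = a + step
        ga, gb = g(a), g(b)
        if ga == 0.0:
            root = a
        elif ga * gb < 0.0:
            lo, hi = a, b
            for _ in range(60):  # bisection on the bracketing interval
                mid = 0.5 * (lo + hi)
                if g(lo) * g(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            root = 0.5 * (lo + hi)
        else:
            continue
        if h(root) > 0.0:
            theta = root % two_pi
            modes.append((mu + lam * math.cos(theta - nu), theta))
    return modes


# Setting (II) of Section 3.3: kappa1 = 0.2, kappa2 = 0.3, mu1 = mu2 = 0.
for x_star, theta_star in find_modes(0.2, 0.0, 0.3, 0.0, mu=0.0, lam=1.0, nu=math.pi / 3):
    print(round(x_star, 4), round(theta_star, 4))
```

A root at $\theta=0$ may be reported as a value just below $2\pi$ because of rounding, so angles are best compared modulo $2\pi$.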

Contour plots and marginal density plots of the proposed distribution are given in Figure 2. Figure 2 shows the case when (I) $\kappa_{1}=0.5$, $\kappa_{2}=0.1$, $\lambda=0$, and (II) $\kappa_{1}=0.2$, $\kappa_{2}=0.3$, $\lambda=1$, while the other parameters are set as $\sigma=1$, $\mu=0$, $\nu=\pi/3$, $\mu_{1}=\mu_{2}=0$, $\alpha=6$. In the case of (I) in Figure 2, the joint density is unimodal and the marginal densities of $X$ and $\Theta$ are symmetric. On the other hand, in the case of (II) in Figure 3, the joint density is bimodal and the marginal density of $X$ is asymmetric. In fact, the marginal density of $X$ is possibly skewed depending on the values of the parameters, as will be seen in Section 3.5. The marginal density of $\Theta$ in Figure 3 is bimodal as given in Table 1.

In the case where a unimodal distribution is desired, we put, for example, $\mu_{2}=\mu_{1}+\pi/4$ with restriction $2|\kappa_{2}|<\kappa_{1}$ to get a unimodal density function from (2). This submodel can be useful for modeling multimodality when a finite mixture model is easier to interpret than a bimodal distribution.

3.4 Moments and circular-linear correlation

We consider moments of the proposed distribution. Let $\alpha_{m,k}=E(\cos m\Theta)$ and $\beta_{m,k}=E(\sin m\Theta)$ be the trigonometric moments of (6) with $\alpha/2$ replaced by $\alpha/2-k$, which are obtainable using the results by Siew et al. (2008). Moments of a random vector $(X,\Theta)^{\prime}$ having (2) are given by

E[\{X-\mu(\Theta)\}^{2k}\cos m\Theta]=C_{\Theta,k}C^{-1}2^{k+1/2}\sigma^{2k+1}B\left(k+\frac{1}{2},\frac{\alpha}{2}-k+1\right)\alpha_{m,k}

(10)

and

E[\{X-\mu(\Theta)\}^{2k}\sin m\Theta]=C_{\Theta,k}C^{-1}2^{k+1/2}\sigma^{2k+1}B\left(k+\frac{1}{2},\frac{\alpha}{2}-k+1\right)\beta_{m,k},

(11)

where $C_{\Theta,k}$ is obtained by replacing $\alpha$ with $\alpha-2k$ in $C_{\Theta}$ in (7).

Moreover, we derive another type of moments of a random vector $(X,\Theta)^{\prime}$ having (5) given by putting $\kappa_{2}=0$ in (2). After some calculations, the moments for nonnegative integers $k\ (<\alpha/2+1)$ and $m$ turn out to be

\begin{aligned}&E[\left\{X-\mu(\Theta)\right\}^{2k}\cos m(\Theta-\mu_{1})]\\ &\quad=\frac{2^{k}\sigma^{2k}B\left(k+\frac{1}{2},\frac{\alpha}{2}-k+1\right)(-1)^{m+\frac{\alpha}{2}-k+1}P_{\alpha/2-k}^{m}\left(-(1-\kappa_{1}^{2})^{-1/2}\right)\left(1-\kappa_{1}^{2}\right)^{-(\alpha-2k+2)/4}}{B\left(1/2,\alpha/2+1\right){{}_{2}F_{1}}\left(\alpha/4+\frac{1}{2},\alpha/4+1,1;\kappa_{1}^{2}\right)(\alpha/2-k-m+1)_{m}},\end{aligned}

(12)

where $P$ denotes the associated Legendre function (Gradshteyn and Ryzhik, 2007, 8.711.2) defined by

P_{\nu}^{m}(z)=\frac{(\nu+1)(\nu+2)\cdots(\nu+m)}{\pi}\int_{0}^{\pi}\left(z+\sqrt{z^{2}-1}\cos\varphi\right)^{\nu}\cos m\varphi\ {\rm d}\varphi.

Next, we study the circular-linear correlation $R_{x\theta}$ between $X$ and $\Theta$ (cf. Mardia and Jupp, 1999, p. 245) which is defined as

R_{x\theta}^{2}=\frac{r_{xs}^{2}+r_{xc}^{2}-2r_{cs}r_{xs}r_{xc}}{1-r_{cs}^{2}},

where $r_{xc}={\rm Corr}(X,\cos\Theta),r_{xs}={\rm Corr}(X,\sin\Theta)$ and $r_{cs}={\rm Corr}(\cos\Theta,\sin\Theta)$ are Pearson’s correlation coefficients. We only consider the case when $(X,\Theta)^{\prime}$ has density (5) for simplicity because the circular-linear correlation of $(X,\Theta)$ with density (2) is not feasible to compute analytically. Note that $R_{x\theta}$ in the case of (2) can be obtained by numerical calculation. A straightforward calculation shows that

R_{x\theta}^{2}=\frac{\lambda^{2}U}{q+\lambda^{2}U},

(13)

where

U=\frac{1}{2}\left(1-p_{2}\right)\sin^{2}(\mu_{1}-\nu)+\left\{\frac{1}{2}\left(1+p_{2}\right)-p_{1}^{2}\right\}\cos^{2}(\mu_{1}-\nu),

and $p_{1},p_{2}$ and $q$ denote $p_{m}=E\{\cos m(\Theta-\mu_{1})\},m=1,2,$ and $q=E[\{X-\mu(\Theta)\}^{2}]$, calculable from (12). Note that $U>0$ since $1-p_{2}>0$ and $(1+p_{2})/2-p_{1}^{2}={\rm Var}\{\cos(\Theta-\mu_{1})\}>0$. We can observe that $R_{x\theta}^{2}$ is an increasing function of $\lambda$, and $R_{x\theta}^{2}=0$ if and only if $\lambda=0$. Furthermore, letting $\gamma=\alpha/2\to\infty$, it is seen that (13) reduces to the circular-linear correlation of the distribution proposed by Mardia and Sutton (1978) because (5) goes to the Mardia–Sutton model. This is confirmed from the fact that $q\to\tau^{2}$ and $p_{m}\to I_{m}(\kappa_{1}^{\ast})/I_{0}(\kappa_{1}^{\ast}),m=1,2$ as $\gamma=\alpha/2\to\infty$.
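As an illustration of (13), the sketch below computes $R_{x\theta}^{2}$ for the submodel (5) by numerical quadrature instead of the Legendre-function expression (12). Two assumptions are made here and are not taken from the text: $p_{1}$ and $p_{2}$ are evaluated from the circular marginal, which is proportional to $\{1-\kappa_{1}\cos(\theta-\mu_{1})\}^{-\alpha/2-1}$, and $q$ is evaluated as the mean of the conditional variance given in Section 3.6, $q=(2\sigma^{2}/\alpha)(1-\kappa_{1}p_{1})$, which requires $\alpha>0$. All function names are hypothetical.

```python
import math


def circ_expect(func, alpha, k1, mu1, n=4000):
    """E[func(Theta)] under the density proportional to
    {1 - k1*cos(theta - mu1)}^(-alpha/2 - 1), by the periodic trapezoid rule."""
    h = 2.0 * math.pi / n
    num = 0.0
    den = 0.0
    for i in range(n):
        t = i * h
        w = (1.0 - k1 * math.cos(t - mu1)) ** (-alpha / 2.0 - 1.0)
        num += w * func(t)
        den += w
    return num / den


def u_term(alpha, k1, mu1, nu):
    """Return (U, p1) with U as defined below equation (13)."""
    p1 = circ_expect(lambda t: math.cos(t - mu1), alpha, k1, mu1)
    p2 = circ_expect(lambda t: math.cos(2.0 * (t - mu1)), alpha, k1, mu1)
    d = mu1 - nu
    u = (0.5 * (1.0 - p2) * math.sin(d) ** 2
         + (0.5 * (1.0 + p2) - p1 ** 2) * math.cos(d) ** 2)
    return u, p1


def r_squared(alpha, sigma, lam, nu, k1, mu1):
    """Squared circular-linear correlation (13) for the kappa_2 = 0 submodel."""
    if alpha <= 0:
        raise ValueError("the conditional variance used for q needs alpha > 0")
    u, p1 = u_term(alpha, k1, mu1, nu)
    # Assumption: q = E[Var(X | Theta)] = (2*sigma^2/alpha) * (1 - k1*p1).
    q = 2.0 * sigma ** 2 / alpha * (1.0 - k1 * p1)
    return lam ** 2 * u / (q + lam ** 2 * u)


for lam in (0.0, 0.5, 1.0, 2.0):
    print(lam, r_squared(alpha=6.0, sigma=1.0, lam=lam, nu=1.1, k1=0.5, mu1=0.4))
```

The printed values start at 0 for $\lambda=0$ and increase with $\lambda$, in agreement with the remarks on (13).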

3.5 Skewness of marginal distribution on the real line

For the proposed density (2), we derive the skewness of the marginal density of $X$ defined as

\gamma_{1}=\frac{E\left[\left\{X-E(X)\right\}^{3}\right]}{\left(E\left[\left\{X-E(X)\right\}^{2}\right]\right)^{3/2}}.

Straightforward calculation shows that

\gamma_{1}=\frac{3v_{2}\lambda+v_{3}\lambda^{3}}{(v_{1}\lambda^{2}+q)^{3/2}},

(14)

where $v_{1}={\rm Var}\{\cos(\Theta-\nu)\},\ v_{2}={\rm Cov}[\{X-\mu(\Theta)\}^{2},\cos(\Theta-\nu)]$ and $v_{3}=E([\cos(\Theta-\nu)-E\{\cos(\Theta-\nu)\}]^{3})$, which are calculable from (10) and (11). We can easily obtain from (14) that $\gamma_{1}\to v_{3}/v_{1}^{3/2}$ as $\lambda\to\infty$ and $\gamma_{1}=0$ when $\lambda=0$. Figure 4 shows a graph of the skewness as a function of $\lambda$. We see from Figure 4 that the marginal distribution of $X$ under the proposed model can be left or right skewed according to the values of the parameters.

3.6 Regression

We can derive a circular-linear regression model from the conditional distribution with density (8). In fact, the conditional mean of $X$ given $\Theta=\theta$ is

E(X|\Theta=\theta)=\mu(\theta)=\mu+\lambda\cos(\theta-\nu)

and the conditional variance of $X$ given $\Theta=\theta$ is

{\rm Var}(X|\Theta=\theta)=\frac{2\sigma^{2}}{\alpha}\left\{1-\kappa_{1}\cos(\theta-\mu_{1})-\kappa_{2}\cos 2(\theta-\mu_{2})\right\}.

Note that the conditional variance depends on $\theta$, i.e. the regression model is possibly heteroscedastic. As a reduced model, if we let $\gamma=\alpha/2\to\infty$, we have ${\rm Var}(X|\Theta=\theta)=\tau^{2}$, which is independent of $\theta$. Moreover if $\kappa_{1}=0$ and $\kappa_{2}=0$ in (8), we have ${\rm Var}(X|\Theta=\theta)=2\sigma^{2}/\alpha$ and, in this case, we obtain a regression model

x_{i}=\mu+\lambda\cos(\theta_{i}-\nu)+\varepsilon_{i},\ \ \ \ \ i=1,\ldots,n,

with random errors $\varepsilon_{i}$ which are independent and identically distributed according to the generalized $t$ -distribution.
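The conditional mean and variance above can be checked against the conditional density (8) directly. The following sketch codes the three quantities as plain functions; the names are hypothetical, and the variance formula is used only for $\alpha>0$, where the second moment of (8) is finite.

```python
import math


def a_theta(theta, k1, mu1, k2, mu2):
    """1 - kappa1*cos(theta - mu1) - kappa2*cos 2(theta - mu2)."""
    return 1.0 - k1 * math.cos(theta - mu1) - k2 * math.cos(2.0 * (theta - mu2))


def cond_mean(theta, mu, lam, nu):
    """E(X | Theta = theta) = mu + lambda*cos(theta - nu)."""
    return mu + lam * math.cos(theta - nu)


def cond_var(theta, alpha, sigma, k1, mu1, k2, mu2):
    """Var(X | Theta = theta) = (2*sigma^2/alpha) * a_theta(theta)."""
    if alpha <= 0:
        raise ValueError("the conditional variance is finite only for alpha > 0")
    return 2.0 * sigma ** 2 / alpha * a_theta(theta, k1, mu1, k2, mu2)


def cond_density(x, theta, alpha, sigma, mu, lam, nu, k1, mu1, k2, mu2):
    """Conditional density (8) of X given Theta = theta."""
    a = a_theta(theta, k1, mu1, k2, mu2)
    log_b = (math.lgamma(0.5) + math.lgamma(alpha / 2.0 + 1.0)
             - math.lgamma(alpha / 2.0 + 1.5))
    const = math.sqrt(2.0) * sigma * math.exp(log_b) * math.sqrt(a)
    z = (x - cond_mean(theta, mu, lam, nu)) ** 2 / (2.0 * sigma ** 2 * a)
    return (1.0 + z) ** (-(alpha + 3.0) / 2.0) / const


params = dict(alpha=6.0, sigma=1.5, mu=2.0, lam=1.0, nu=0.5,
              k1=0.2, mu1=0.3, k2=0.3, mu2=1.0)
for theta in (0.0, 1.0, 2.0, 3.0):
    print(theta, cond_mean(theta, 2.0, 1.0, 0.5),
          cond_var(theta, 6.0, 1.5, 0.2, 0.3, 0.3, 1.0))
```

The printed variances differ across $\theta$, which is the dependence of the conditional variance on the angle noted above.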

3.7 Maximizing $\beta$ -entropy

The related distributions proposed by Mardia and Sutton (1978) and Kato and Shimizu (2008) can be characterized as maximum entropy distributions under certain moment conditions. A maximum entropy distribution under certain moment conditions is also related to the proposed distribution with density (2). We consider the $\beta$-entropy (see Eguchi, 2009, Section 13.2.4) defined as

E(f)=\left\{\int f^{\beta+1}(x,\theta){\rm d}x{\rm d}\theta-\beta-1\right\}\bigg{/}\beta(\beta+1).

Then the maximum entropy distribution subject to constraints on the moments

E(X^{2}),\ E(X\cos\Theta),\ E(X\sin\Theta),\ E(\cos p\Theta),\ E(\sin p\Theta),\ p=1,2,

is the distribution with density

f(x,\theta)\propto\left(1+\beta\left[-\frac{\left\{x-\mu(\theta)\right\}^{2}}{2\tau^{2}}+\kappa_{1}^{\ast}\cos(\theta-\mu_{1})+\kappa_{2}^{\ast}\cos 2(\theta-\mu_{2})\right]\right)^{1/\beta}.

(15)

If we take $\beta=-2/(\alpha+3)$ , (15) gives a density related to (2).

3.8 Parameter estimation

We provide a method for calculating the maximum likelihood estimates of the parameters in the generalized $t$ -distribution with density (2). When we observe $(x_{i},\theta_{i}),\ i=1,\ldots,n$ , the log-likelihood function is given by

\begin{aligned}L({\text{\boldmath$\psi$}})&=-n\log B\left(\frac{1}{2},\frac{\alpha}{2}+1\right)-n\log C_{\theta}-\frac{n}{2}\log 2-n\log\sigma\\ &\quad-\frac{1}{2}\left(\alpha+3\right)\sum_{i=1}^{n}\log\left[1+\frac{1}{2\sigma^{2}}\left\{x_{i}-\mu-\lambda\cos(\theta_{i}-\nu)\right\}^{2}-\kappa_{1}\cos(\theta_{i}-\mu_{1})-\kappa_{2}\cos 2(\theta_{i}-\mu_{2})\right],\end{aligned}

where ${\text{\boldmath$\psi$}}=(\sigma,\mu,\lambda,\nu,\kappa_{1},\mu_{1},\kappa_{2},\mu_{2},\alpha)^{\prime}$ . For obtaining the maximizer of $L({\text{\boldmath$\psi$}})$ , we propose the conditional maximization algorithm. We first divide the parameter $\psi$ into ${\text{\boldmath$\psi$}}=(\sigma,{\text{\boldmath$\psi$}}_{1}^{\prime},{\text{\boldmath$\psi$}}_{2}^{\prime})^{\prime}$ , where ${\text{\boldmath$\psi$}}_{1}=(\mu,\lambda,\nu)^{\prime}$ and ${\text{\boldmath$\psi$}}_{2}=(\kappa_{1},\mu_{1},\kappa_{2},\mu_{2},\alpha)^{\prime}$ . Given the value of $\sigma$ and ${\text{\boldmath$\psi$}}_{2}$ , maximizing $L({\text{\boldmath$\psi$}})$ is equivalent to maximizing

-\sum_{i=1}^{n}\log\left[C_{i}+\frac{1}{2\sigma^{2}}\left\{x_{i}-\mu-\lambda\cos(\theta_{i}-\nu)\right\}^{2}\right],

with respect to ${\text{\boldmath$\psi$}}_{1}$ , where $C_{i}=1-\kappa_{1}\cos(\theta_{i}-\mu_{1})-\kappa_{2}\cos 2(\theta_{i}-\mu_{2})$ . Let ${\text{\boldmath$x$}}=(x_{1},\ldots,x_{n})^{\prime}$ , ${\text{\boldmath$T$}}=({\text{\boldmath$t$}}_{1}^{\prime},\ldots,{\text{\boldmath$t$}}_{n}^{\prime})^{\prime}$ for ${\text{\boldmath$t$}}_{i}=(1,\cos\theta_{i},\sin\theta_{i})^{\prime}$ , ${\text{\boldmath$\beta$}}=(\mu,\lambda\cos\nu,\lambda\sin\nu)^{\prime}$ and ${\text{\boldmath$W$}}={\rm diag}(w_{1},\ldots,w_{n})$ for

w_{i}=\frac{\sigma^{2}}{C_{i}\sigma^{2}+\left\{x_{i}-\mu-\lambda\cos(\theta_{i}-\nu)\right\}^{2}/2}.

Using the theory of weighted regression (see Andrews, 1974), the maximizer of ${\text{\boldmath$\psi$}}_{1}$ given $\sigma$ and ${\text{\boldmath$\psi$}}_{2}$ can be obtained as

\widehat{{\text{\boldmath$\beta$}}}=({\text{\boldmath$T$}}^{\prime}{\text{\boldmath$W$}}{\text{\boldmath$T$}})^{-1}{\text{\boldmath$T$}}^{\prime}{\text{\boldmath$W$}}{\text{\boldmath$x$}},

(16)

from which the maximizer $\widehat{{\text{\boldmath$\psi$}}}_{1}$ follows. Since $W$ depends on ${\text{\boldmath$\psi$}}_{1}$, we recompute $W$ from the current values in each iteration.
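The weighted regression update can be sketched as an iteratively reweighted least squares loop. This is an illustration under stated assumptions, not the authors' code: the weights are written as $w_{i}=\sigma^{2}/[C_{i}\sigma^{2}+\{x_{i}-\mu-\lambda\cos(\theta_{i}-\nu)\}^{2}/2]$, which is the form obtained by differentiating the objective in ${\text{\boldmath$\psi$}}_{1}$ with respect to ${\text{\boldmath$\beta$}}$, and the helper names (`solve_linear`, `irls_step`, `fit_location`), the starting value and the stopping rule are hypothetical choices.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = s / aug[r][r]
    return sol


def design_row(theta):
    """Row t_i = (1, cos(theta_i), sin(theta_i)) of the design matrix T."""
    return (1.0, math.cos(theta), math.sin(theta))


def objective(beta, xs, thetas, sigma, cs):
    """-sum_i log[C_i + r_i^2 / (2 sigma^2)] with r_i = x_i - t_i' beta."""
    total = 0.0
    for x, t, c in zip(xs, thetas, cs):
        r = x - sum(bj * tj for bj, tj in zip(beta, design_row(t)))
        total -= math.log(c + r * r / (2.0 * sigma ** 2))
    return total


def irls_step(beta, xs, thetas, sigma, cs):
    """One update beta <- (T'WT)^{-1} T'W x with W built from the current beta."""
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for x, t, c in zip(xs, thetas, cs):
        row = design_row(t)
        r = x - sum(bj * tj for bj, tj in zip(beta, row))
        w = sigma ** 2 / (c * sigma ** 2 + r * r / 2.0)  # assumed weight form
        for j in range(3):
            b[j] += w * row[j] * x
            for k in range(3):
                a[j][k] += w * row[j] * row[k]
    return solve_linear(a, b)


def fit_location(xs, thetas, sigma, cs, n_iter=100, tol=1e-12):
    """Iterate the update and convert beta back to (mu, lambda, nu)."""
    beta = [sum(xs) / len(xs), 0.0, 0.0]
    for _ in range(n_iter):
        new = irls_step(beta, xs, thetas, sigma, cs)
        done = max(abs(u - v) for u, v in zip(new, beta)) < tol
        beta = new
        if done:
            break
    mu = beta[0]
    lam = math.hypot(beta[1], beta[2])
    nu = math.atan2(beta[2], beta[1]) % (2.0 * math.pi)
    return mu, lam, nu, beta
```

Working with ${\text{\boldmath$\beta$}}=(\mu,\lambda\cos\nu,\lambda\sin\nu)^{\prime}$ keeps each update linear; $\lambda$ and $\nu$ are recovered only at the end through the polar transformation.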

Given ${\text{\boldmath$\psi$}}_{1}$ and ${\text{\boldmath$\psi$}}_{2}$ , maximizing $L({\text{\boldmath$\psi$}})$ with respect to $\sigma^{2}$ is equivalent to solving the following equation:

n\left(\alpha+3\right)^{-1}=\sum_{i=1}^{n}\frac{\left\{x_{i}-\mu-\lambda\cos(\theta_{i}-\nu)\right\}^{2}}{\left\{x_{i}-\mu-\lambda\cos(\theta_{i}-\nu)\right\}^{2}+2C_{i}\sigma^{2}},

(17)

from which the maximizer $\hat{\sigma}$ is obtained.
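As an independent cross-check of the scale update, the part of the log-likelihood that depends on $\sigma$, namely $-n\log\sigma-\frac{1}{2}(\alpha+3)\sum_{i}\log[C_{i}+r_{i}^{2}/(2\sigma^{2})]$ with $r_{i}=x_{i}-\mu-\lambda\cos(\theta_{i}-\nu)$, can be maximized directly by a one-dimensional search. The sketch below uses a golden-section search on $\log\sigma$; the bracket $[10^{-3},10^{3}]$, the iteration count and the function names are assumptions of this illustration and would need adjusting to the scale of the data.

```python
import math


def sigma_objective(sigma, resid, cs, alpha):
    """Sigma-dependent part of the log-likelihood for fixed residuals r_i and C_i."""
    n = len(resid)
    s = sum(math.log(c + r * r / (2.0 * sigma * sigma))
            for r, c in zip(resid, cs))
    return -n * math.log(sigma) - 0.5 * (alpha + 3.0) * s


def maximize_sigma(resid, cs, alpha, lo=1e-3, hi=1e3, n_iter=120):
    """Golden-section search for the maximizer over log(sigma) in [log lo, log hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log(lo), math.log(hi)
    x1 = b - inv_phi * (b - a)
    x2 = a + inv_phi * (b - a)
    f1 = sigma_objective(math.exp(x1), resid, cs, alpha)
    f2 = sigma_objective(math.exp(x2), resid, cs, alpha)
    for _ in range(n_iter):
        if f1 < f2:  # the maximum lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = a + inv_phi * (b - a)
            f2 = sigma_objective(math.exp(x2), resid, cs, alpha)
        else:  # the maximum lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = b - inv_phi * (b - a)
            f1 = sigma_objective(math.exp(x1), resid, cs, alpha)
    return math.exp(0.5 * (a + b))


# Hypothetical residuals and C_i values, for illustration only.
thetas = [0.33 * i for i in range(1, 20)]
cs = [1.0 - 0.3 * math.cos(t) - 0.2 * math.cos(2.0 * t) for t in thetas]
resid = [0.8 * math.sin(1.3 * i) + 0.1 for i in range(1, 20)]
print(maximize_sigma(resid, cs, alpha=6.0))
```

A derivative-free search is slower than solving the score equation, but it does not rely on the algebra of the update and so is useful for validating an implementation of Step 3.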

Finally for maximizing ${\text{\boldmath$\psi$}}_{2}$ under given $\sigma$ and ${\text{\boldmath$\psi$}}_{1}$ , we maximize

-n\log B\left(\frac{1}{2},\frac{\alpha}{2}+1\right)-n\log C_{\theta}-\frac{1}{2}\left(\alpha+3\right)\sum_{i=1}^{n}\log\left\{D_{i}-\kappa_{1}\cos(\theta_{i}-\mu_{1})-\kappa_{2}\cos 2(\theta_{i}-\mu_{2})\right\}

(18)

with respect to ${\text{\boldmath$\psi$}}_{2}$, where $D_{i}=1+(2\sigma^{2})^{-1}\left\{x_{i}-\mu-\lambda\cos(\theta_{i}-\nu)\right\}^{2}$. This maximization problem is quite similar to obtaining the maximum likelihood estimates of the generalized $t$-distribution on the circle (Siew et al., 2008), so that we can obtain the maximizer ${\text{\boldmath$\psi$}}_{2}$ given $\sigma$ and ${\text{\boldmath$\psi$}}_{1}$. Note that we carried out the maximization of (18) using numerical integration to evaluate $C_{\theta}$, that is, the constant $C_{\Theta}$ in (7).

The proposed estimation method is summarized as follows.

Estimation Algorithm

1. Determine the initial values ${\text{\boldmath$\psi$}}^{(0)}$. Set $k=0$.

2. Compute $W$ based on $\sigma^{(k)}$ and ${\text{\boldmath$\psi$}}_{1}^{(k)}$, then obtain ${\text{\boldmath$\psi$}}_{1}^{(k+1)}$ based on (16).

3. Compute $\sigma^{(k+1)}$ by solving (17) with ${\text{\boldmath$\psi$}}_{1}={\text{\boldmath$\psi$}}_{1}^{(k+1)}$ and ${\text{\boldmath$\psi$}}_{2}={\text{\boldmath$\psi$}}_{2}^{(k)}$.

4. Compute ${\text{\boldmath$\psi$}}_{2}^{(k+1)}$ by maximizing (18) with ${\text{\boldmath$\psi$}}_{1}={\text{\boldmath$\psi$}}_{1}^{(k+1)}$ and $\sigma=\sigma^{(k+1)}$.

5. Set $k=k+1$ and go to Step 2 (until numerical convergence).

4 Empirical application

For an illustrative example, we consider a cylindrical dataset given in Johnson and Wehrly (1977) on the wind direction and ozone level taken at 6:00 pm at four-day intervals between April 18th and June 29th, 1975 at a weather station in Milwaukee with $19$ samples. We fitted the proposed generalized $t$-distribution and its submodels. For submodels of (2), we consider two cases, namely, $\kappa_{2}=0$ (GT-sub1) given in (5) and $\mu_{2}=\mu_{1}+\pi/4$ with $2|\kappa_{2}|<\kappa_{1}$ (GT-sub2) discussed at the end of Section 3.3. When we carry out the estimation algorithm given in Section 3.8, we repeat the algorithm until the differences between the updated and current values are smaller than $0.001$. For comparison, we also fitted the member of the exponential family given by Kato and Shimizu (2008) with density

f(x,\theta)\propto\exp\left\{-\frac{(x-\mu(\theta))^{2}}{2\sigma^{2}}+\kappa_{1}\cos(\theta-\mu_{1})+\kappa_{2}\cos 2(\theta-\mu_{2})\right\},

which is a reparametrized form of (3) with $\kappa_{1}^{\ast}=\kappa_{1}$, $\kappa_{2}^{\ast}=\kappa_{2}$ and $\tau^{2}=\sigma^{2}$. Table 2 provides the maximum likelihood estimates of the parameters, AIC values, and the multivariate Kolmogorov-Smirnov goodness-of-fit statistics given by Justel et al. (1997). Table 1 of Justel et al. (1997) reports the distribution of the bivariate Kolmogorov-Smirnov statistic. From that table, the upper $5\%$, $10\%$ and $25\%$ percentiles are $0.362$, $0.335$ and $0.292$, respectively, for a sample of size $20$. The percentiles for a sample of size $19$ are larger than these values, so that none of the four fitted models is rejected at the $5\%$ significance level. Judging from AIC, we see that the two submodels of the proposed distribution, GT-sub1 and GT-sub2, give better fits than the Kato and Shimizu distribution. Figure 4 shows scatter plots of the data and contour plots of the fitted densities of the two submodels.

Table 2: Maximum likelihood estimates of the parameters, AIC values, and multivariate Kolmogorov-Smirnov goodness-of-fit statistics (g.o.f) of model (2) (GT), its submodels, and the Kato and Shimizu model (3) (KS) fitted to the data from Johnson and Wehrly (1977).

Model	$\mu$	$\lambda$	$\nu$	$\sigma$	$\kappa_{1}$	$\mu_{1}$	$\kappa_{2}$	$\mu_{2}$	$\alpha$	AIC	g.o.f
GT	41.38	31.14	1.25	68.90	0.11	6.28	0.03	1.35	23.84	417.79	0.180
GT-sub1	41.37	31.17	1.24	74.81	0.08	0.30	—	—	28.28	240.20	0.314
GT-sub2	41.01	32.70	1.41	6.08	0.49	0.19	0.15	—	-1.00	203.34	0.359
KS	41.24	31.38	1.20	19.77	1.41	0.19	0.35	1.45	—	241.42	0.282
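The comparison drawn from Table 2 can be reproduced with a few lines of Python. The sketch below ranks the models by AIC and screens the goodness-of-fit statistics against the percentiles quoted from Justel et al. (1997). Those percentiles are tabulated for samples of size 20, whereas the data have 19 observations, so the screening is indicative only; the variable and function names are illustrative.

```python
# AIC and goodness-of-fit values copied from Table 2.
results = {
    "GT": {"aic": 417.79, "gof": 0.180},
    "GT-sub1": {"aic": 240.20, "gof": 0.314},
    "GT-sub2": {"aic": 203.34, "gof": 0.359},
    "KS": {"aic": 241.42, "gof": 0.282},
}

# Upper percentiles of the bivariate Kolmogorov-Smirnov statistic for n = 20,
# as quoted in the text (used here as a stand-in for n = 19).
critical_n20 = {0.05: 0.362, 0.10: 0.335, 0.25: 0.292}


def exceeds(model: str, level: float) -> bool:
    """True if the model's statistic exceeds the n = 20 percentile at `level`."""
    return results[model]["gof"] > critical_n20[level]


# Models ordered from smallest (best) to largest AIC.
ranked_by_aic = sorted(results, key=lambda m: results[m]["aic"])

# Models whose statistic stays below the 5% percentile.
not_rejected_5pct = [m for m in results if not exceeds(m, 0.05)]

for m in ranked_by_aic:
    print(m, results[m]["aic"], results[m]["gof"], exceeds(m, 0.05))
```

The AIC ranking and the goodness-of-fit screening answer different questions, so it is useful to read them side by side rather than rely on either alone.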

5 Conclusions

In this paper, we derived a distribution on the cylinder based on a trivariate $t$ -distribution. The derived distribution can be regarded as a cylindrical extension of the generalized $t$ -distribution on the circle proposed by Siew et al. (2008) and includes the exponential family given by Kato and Shimizu (2008). We investigated some properties of the proposed distribution, including the marginal and conditional distributions, the circular-linear correlation and an algorithm for parameter estimation. We applied the proposed distribution to the data set of wind direction and ozone level given in Johnson and Wehrly (1977), and we confirmed that two submodels of the proposed distribution gave better fits, in terms of AIC, than the distribution given by Kato and Shimizu (2008).

Acknowledgement

We would like to thank the two reviewers for many valuable comments and helpful suggestions which led to an improved version of this paper. The first author was supported in part by Grant-in-Aid for Scientific Research (10076) from Japan Society for the Promotion of Science (JSPS). The work of the third author was supported by JSPS KAKENHI Grant Number 25400218.

References

[2] Andrews, D. F. (1974). A robust method for multiple linear regression. Technometrics, 16:523–532.
[4] Eguchi, S. (2009). Information Divergence Geometry and the Application to Statistical Machine Learning. In Information Theory and Statistical Learning, Chapter 13, 309–332, Emmert-Streib, F. and Dehmer, M. Eds., New York, Springer.
[6] Fisher, N. I. (1993). Statistical Analysis of Circular Data. Cambridge University Press, Cambridge.
[8] Gradshteyn, I. S. and Ryzhik, I. M. (2007). Table of Integrals, Series, and Products. Seventh Edition, Elsevier, Amsterdam.
[10] Joe, H. (2015). Dependence Modeling with Copulas. Chapman & Hall/CRC, Boca Raton.
[12] Jones, M. C. and Pewsey, A. (2005). A family of symmetric distributions on the circle. Journal of the American Statistical Association, 100:1422–1428.
[14] Johnson, R. A. and Wehrly, T. (1977). Measures and models for angular correlation and angular-linear correlation. Journal of the Royal Statistical Society B, 39:222–229.
[16] Johnson, R. A. and Wehrly, T. E. (1978). Some angular-linear distributions and related regression models. Journal of the American Statistical Association, 73:602–606.
[18] Justel, A., Peña, D. and Zamar, R. (1997). A multivariate Kolmogorov-Smirnov test of goodness of fit. Statistics and Probability Letters, 35:251–259.
[20] Kato, S. and Jones, M. C. (2010). A family of distributions on the circle with links to, and applications arising from, Möbius transformation. Journal of the American Statistical Association, 105:249–262.
[22] Kato, S. and Jones, M. C. (2015). A tractable and interpretable four-parameter family of unimodal distributions on the circle. Biometrika, 102:181–190.
[24] Kato, S. and Shimizu, K. (2008). Dependent models for observations which include angular ones. Journal of Statistical Planning and Inference, 138:3538–3549.
[26] Kotz, S. and Nadarajah, S. (2004). Multivariate $t$ Distributions and Their Applications. Cambridge University Press, Cambridge.
[28] Mardia, K. V. and Jupp, P. E. (1999). Directional Statistics. Wiley, Chichester.
[30] Mardia, K. V. and Sutton, T. W. (1978). A model for cylindrical variables with applications. Journal of the Royal Statistical Society B, 40:229–233.
[32] Siew, H-Y., Kato, S. and Shimizu, K. (2008). The generalized $t$ -distribution on the circle. Japanese Journal of Applied Statistics, 37:1–17.
[34] Yfantis, E. A. and Borgman, L. E. (1982). An extension of the von Mises distribution. Communications in Statistics–Theory and Methods, 11:1695–1706.