This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

A flexible family of distributions on the cylinder

Shonosuke Sugasawa ,   Kunio Shimizu  and Shogo Kato Graduate School of Economics, The University of Tokyo, E-mail: shonosuke622@gmail.comThe Institute of Statistical Mathematics
Abstract

We propose a flexible family of distributions, generalized tt-distributions, on the cylinder which is obtained as a conditional distribution of a trivariate tt distribution. The new distribution has unimodality or bimodality, symmetry or asymmetry, depending on the values of parameters and flexibly fits the cylindrical data. The circular marginal of this distribution is distributed as a generalized tt-distribution on the circle. Some other properties are also investigated. The proposed distribution is applied to the real cylindrical data.


Key words and phrases: circular-linear correlation, circular-linear regression, generalized von Mises distribution, Johnson–Wehrly model, Mardia–Sutton model

1 Introduction

Directional or circular data often appear in a variety of scientific fields and various stochastic models have been proposed for analyzing such data. For univariate circular data, there have been many distributions investigated in terms of both tractability and applicability, see Jones and Pewsey (2005), Kato and Jones (2010) and Kato and Jones (2015). However, we sometimes encounter situations which involve both circular and linear variables, namely cylindrical data such as the pair of wind direction and temperature (Mardia and Sutton, 1978) or directions and distances of animal movements (Fisher, 1993). For such data, the distribution on the cylinder is needed, but there are not so many distributions compared to univariate circular distributions. We give a brief review below for several cylindrical distributions known in the literature. Johnson and Wehrly (1978) gave a distribution based on the principle of maximum entropy subject to constraints on certain moments, and Mardia and Sutton (1978) provided another distribution as a conditional distribution of a trivariate normal distribution or a maximum entropy distribution. An extension of the distribution by Mardia and Sutton (1978) was studied by Kato and Shimizu (2008), which can also be derived as a maximum entropy distribution or a conditional of a trivariate normal.

In this paper, we propose the generalized tt-distribution on the cylinder, which is a natural extension of the member of the exponential family given by Kato and Shimizu (2008). The proposed distribution is also regarded as a cylindrical extension of the generalized tt-distribution on the circle proposed by Siew et al. (2008). In fact, the circular marginal distribution is the generalized tt-distribution on the circle. The proposed distribution can be obtained as a conditional distribution of a trivariate tt distribution and characterized as the maximum β\beta-entropy distribution. This is a quite flexible distribution which allows for both asymmetry and variations in tail weight in terms of parameter values. We investigate some properties such as marginal and conditional distributions, modality, moments, circular-linear correlation and skewness. We briefly discuss a circular-linear regression model derived from the conditional distribution. For practical use, we provide an iterative algorithm for parameter estimation.

Subsequent sections are organized as follows. Section 2 provides a derivation of the distribution. In Section 3 some properties of the new distribution are studied. We apply the proposed distribution to the data set of the wind direction and ozone level given in Johnson and Wehrly (1977) in Section 4.

2 Derivation

Suppose that a trivariate random vector WW is distributed as a trivariate tt distribution with degrees of freedom α\alpha, mean vector η=(η1,η2,η3)3\eta=(\eta_{1},\eta_{2},\eta_{3})^{\prime}\in\mathbb{R}^{3} and variance-covariance matrix Σ\Sigma, where

Σ=(σ12ρ12σ1σ2ρ13σ1σ3ρ12σ1σ2σ22ρ23σ2σ3ρ13σ1σ3ρ23σ2σ3σ32)\Sigma=\left(\begin{array}[]{ccc}\sigma_{1}^{2}&\rho_{12}\sigma_{1}\sigma_{2}&\rho_{13}\sigma_{1}\sigma_{3}\\ \rho_{12}\sigma_{1}\sigma_{2}&\sigma_{2}^{2}&\rho_{23}\sigma_{2}\sigma_{3}\\ \rho_{13}\sigma_{1}\sigma_{3}&\rho_{23}\sigma_{2}\sigma_{3}&\sigma_{3}^{2}\end{array}\right)

for σj>0\sigma_{j}>0 (j=1,2,3j=1,2,3), 1<ρ12<1-1<\rho_{12}<1 and 1+2ρ12ρ13ρ23ρ122ρ132ρ232>0.1+2\rho_{12}\rho_{13}\rho_{23}-\rho_{12}^{2}-\rho_{13}^{2}-\rho_{23}^{2}>0. From Kotz and Nadarajah (2004), the distribution of WW has density

f(w)=Γ((α+3)/2)(απ)3/2|Σ|1/2Γ(α/2){1+(wη)Σ1(wη)α}(α+3/2),w3.f(w)=\frac{\Gamma((\alpha+3)/2)}{(\alpha\pi)^{3/2}|\Sigma|^{1/2}\Gamma(\alpha/2)}\left\{1+\frac{(w-\eta)^{\prime}\Sigma^{-1}(w-\eta)}{\alpha}\right\}^{-(\alpha+3/2)},\quad w\in\mathbb{R}^{3}.

We use a cylindrical coordinate W=(X,X1,X2)W=(X,X_{1},X_{2})^{\prime}, where X1=RcosΘX_{1}=R\cos\Theta and X2=RsinΘX_{2}=R\sin\Theta with R>0R>0 and 0Θ<2π0\leq\Theta<2\pi, and consider the conditional distribution of (X,Θ)(X,\Theta)^{\prime} given R=rR=r, which provides a distribution on the cylinder. Define new parameters as

μ(θ)=μ+λcos(θν),τ2=σ12ρ21ρ232,κ1cosμ1=r(b1η2b2η3),κ1sinμ1=r(b3η3b2η2),κ2cos2μ2=r2(b3b1)4,κ2sin2μ2=r2b22\begin{array}[]{l}\mu(\theta)=\mu+\lambda\cos(\theta-\nu),\quad\tau^{2}=\displaystyle{\frac{\sigma_{1}^{2}\rho^{2}}{1-\rho_{23}^{2}}},\\ \kappa_{1}^{\ast}\cos\mu_{1}=r(b_{1}\eta_{2}-b_{2}\eta_{3}),\quad\kappa_{1}^{\ast}\sin\mu_{1}=r(b_{3}\eta_{3}-b_{2}\eta_{2}),\\ \displaystyle{\kappa_{2}^{\ast}\cos 2\mu_{2}=\frac{r^{2}(b_{3}-b_{1})}{4},\quad\kappa_{2}^{\ast}\sin 2\mu_{2}=\frac{r^{2}b_{2}}{2}}\end{array}

with <μ<-\infty<\mu<\infty, λ,κ1,κ20\lambda,\kappa_{1}^{\ast},\kappa_{2}^{\ast}\geq 0, 0ν,μ1<2π0\leq\nu,\mu_{1}<2\pi and 0μ2<π0\leq\mu_{2}<\pi, where

μ=η1+a1η2+a2η3,ρ2=1+2ρ12ρ13ρ23ρ122ρ132ρ232,a1=σ1σ2ρ13ρ23ρ121ρ232,a2=σ1σ3ρ12ρ23ρ131ρ232,b1=1σ22(1ρ232),b2=ρ23σ2σ3(1ρ232),b3=1σ32(1ρ232),λcosν=a1r,λsinν=a2r.\begin{array}[]{l}\mu=\eta_{1}+a_{1}\eta_{2}+a_{2}\eta_{3},\quad\rho^{2}=1+2\rho_{12}\rho_{13}\rho_{23}-\rho_{12}^{2}-\rho_{13}^{2}-\rho_{23}^{2},\\ \displaystyle{a_{1}=\frac{\sigma_{1}}{\sigma_{2}}\frac{\rho_{13}\rho_{23}-\rho_{12}}{1-\rho_{23}^{2}},\quad a_{2}=\frac{\sigma_{1}}{\sigma_{3}}\frac{\rho_{12}\rho_{23}-\rho_{13}}{1-\rho_{23}^{2}},}\\ \displaystyle{b_{1}=\frac{1}{\sigma_{2}^{2}(1-\rho_{23}^{2})},\quad b_{2}=\frac{\rho_{23}}{\sigma_{2}\sigma_{3}(1-\rho_{23}^{2})},\quad b_{3}=\frac{1}{\sigma_{3}^{2}(1-\rho_{23}^{2})},}\\ \lambda\cos\nu=-a_{1}r,\quad\lambda\sin\nu=-a_{2}r.\end{array}

Then we have

12(wη)Σ1(wη)=12τ2{xμ(θ)}2κ1cos(θμ1)κ2cos2(θμ2)+d,\frac{1}{2}(w-\eta)^{\prime}\Sigma^{-1}(w-\eta)=\frac{1}{2\tau^{2}}\{x-\mu(\theta)\}^{2}-\kappa_{1}^{\ast}\cos(\theta-\mu_{1})-\kappa_{2}^{\ast}\cos 2(\theta-\mu_{2})+d,

where d=(b1η22+b3η322b2η2η3)/2(0)d=(b_{1}\eta_{2}^{2}+b_{3}\eta_{3}^{2}-2b_{2}\eta_{2}\eta_{3})/2\ (\geq 0). The conditional probability density function f(x,θ|r)f(x,\theta|r) of (X,Θ)|(R=r)(X,\Theta)^{\prime}|(R=r) is represented as

f(x,\displaystyle f(x, θ|r)\displaystyle\theta|r)
=C1[1+12σ2{xμ(θ)}2κ1cos(θμ1)κ2cos2(θμ2)](α+3)/2,\displaystyle=C^{-1}\left[1+\frac{1}{2\sigma^{2}}\left\{x-\mu(\theta)\right\}^{2}-\kappa_{1}\cos(\theta-\mu_{1})-\kappa_{2}\cos 2(\theta-\mu_{2})\right]^{-(\alpha+3)/2}, (1)

where κ1=κ1/γ,κ2=κ2/γ,σ2=γτ2,γ=α+d(>0)\kappa_{1}=\kappa_{1}^{\ast}/\gamma,\ \kappa_{2}=\kappa_{2}^{\ast}/\gamma,\ \sigma^{2}=\gamma\tau^{2},\ \gamma=\alpha+d\ (>0) and the normalizing constant is

C\displaystyle C =\displaystyle= 22πB(1/2,α/2+1)σ{F4(α4+12,α4+1,1,1;κ12,κ22)\displaystyle 2\sqrt{2}\pi B(1/2,\alpha/2+1)\sigma\left\{F_{4}\left(\frac{\alpha}{4}+\frac{1}{2},\frac{\alpha}{4}+1,1,1;\kappa_{1}^{2},\kappa_{2}^{2}\right)\right. (2)
+2j=1(α/2+1)3j(2j)!j!(κ12)2j(κ22)jcos2j(μ2μ1)\displaystyle+2\sum_{j=1}^{\infty}\frac{(\alpha/2+1)_{3j}}{(2j)!j!}\left(\frac{\kappa_{1}}{2}\right)^{2j}\left(\frac{\kappa_{2}}{2}\right)^{j}\cos 2j(\mu_{2}-\mu_{1})
×F4(α+6j4+12,α+6j4+1,2j+1,j+1;κ12,κ22)}.\displaystyle\times\left.F_{4}\left(\frac{\alpha+6j}{4}+\frac{1}{2},\frac{\alpha+6j}{4}+1,2j+1,j+1;\kappa_{1}^{2},\kappa_{2}^{2}\right)\right\}.

Here F4F_{4} denotes Appell’s double hypergeometric function (Gradshteyn and Ryzhik, 2007, 9.180.4) defined by

F4(α1,α2,β1,β2;z1,z2)=i=0j=0(α1)i+j(α2)i+j(β1)i(β2)jz1iz2ji!j!,z1+z2<1F_{4}(\alpha_{1},\alpha_{2},\beta_{1},\beta_{2};z_{1},z_{2})=\sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\frac{(\alpha_{1})_{i+j}(\alpha_{2})_{i+j}}{(\beta_{1})_{i}(\beta_{2})_{j}}\frac{z_{1}^{i}z_{2}^{j}}{i!j!},\quad\sqrt{z_{1}}+\sqrt{z_{2}}<1

with Pochhammer’s symbol

(c)j={c(c+1)(c+j1),j1,1,j=0,(c)_{j}=\left\{\begin{array}[]{ll}c(c+1)\cdots(c+j-1),&j\geq 1,\\ 1,&j=0,\end{array}\right.

and BB the beta function. The distribution with density function (2) has nine parameters α,σ>0\alpha,\sigma>0, <μ<-\infty<\mu<\infty, κ1,κ2,λ0\kappa_{1},\kappa_{2},\lambda\geq 0, 0ν,μ1<2π0\leq\nu,\mu_{1}<2\pi and 0μ2<π0\leq\mu_{2}<\pi with restriction κ1+κ2<1\kappa_{1}+\kappa_{2}<1. The resulting distribution should be called the generalized tt-distribution on the cylinder. Relationships between the new and original parameters are:

  • (a)

    η2=η3=0κ1=0\eta_{2}=\eta_{3}=0\Leftrightarrow\kappa_{1}=0.

  • (b)

    κ2=0ρ23=0,σ2=σ3\kappa_{2}=0\Leftrightarrow\rho_{23}=0,\sigma_{2}=\sigma_{3}.

  • (c)

    λ=0ρ12=ρ13=0\lambda=0\Leftrightarrow\rho_{12}=\rho_{13}=0.

As being introduced in (2), the parameter α\alpha was assumed positive. However, the density is still valid when the parameter space of α\alpha is extended to α1\alpha\geq-1. Note that α\alpha corresponds to the degrees of freedom in the generalized tt-distribution on the cylinder (2) and the parameter α\alpha determines the degree of concentration around the mode. The proposed distribution with density (2) has 9 parameters. The roles of the parameters are as follows: (μ1,μ2)(\mu_{1},\mu_{2}) and μ\mu are location parameters of Θ\Theta and XX, respectively, and σ\sigma is the scale parameter of XX. λ\lambda and ν\nu in μ(θ)\mu(\theta) determines the degree of correlation between XX and Θ\Theta, and λ\lambda controls the skewness of XX. κ1\kappa_{1} and κ2\kappa_{2} determine the morality of the distribution, discussed in Section 3.3. α\alpha controls the tail weight of the distribution. To see the role of α\alpha and λ\lambda, we provide contour plots of the proposed density (2) in Figure 1 for some combinations of α\alpha and λ\lambda. Other parameters are specified as μ=ν=μ1=μ2=0,σ2=1,κ1=0.1,κ2=0.4\mu=\nu=\mu_{1}=\mu_{2}=0,\ \sigma^{2}=1,\ \kappa_{1}=0.1,\ \kappa_{2}=0.4. The column (I) in Figure 1 illustrates the interpretation of α\alpha: as α\alpha increases, the tail weight of the density increases. The role of λ\lambda as controlling the skewness of the distribution can be seen from the column (II) in Figure 1. The interpretation of κ1,κ2\kappa_{1},\kappa_{2} are discussed in Section 3.3.

Under the assumption γ=(α+3)/2\gamma=(\alpha+3)/2, if we use reparametrization 1/ψ=(α+3)/2,κ1=tanh(κ1ψ),κ2=tanh(κ2ψ)1/\psi=-(\alpha+3)/2,\ -\kappa_{1}=\tanh(\kappa_{1}^{\ast\ast}\psi),\ -\kappa_{2}=\tanh(\kappa_{2}^{\ast\ast}\psi), then the density (2) has a similar expression to the family of distributions introduced by Jones and Pewsey (2005):

f(x,θ|r)[1ψ2τ2{xμ(θ)}2+tanh(κ1ψ)cos(θμ1)+tanh(κ2ψ)cos2(θμ2)]1/ψ.f(x,\theta|r)\propto\left[1-\frac{\psi}{2\tau^{2}}\{x-\mu(\theta)\}^{2}+\tanh(\kappa_{1}^{\ast\ast}\psi)\cos(\theta-\mu_{1})+\tanh(\kappa_{2}^{\ast\ast}\psi)\cos 2(\theta-\mu_{2})\right]^{1/\psi}.

The restriction α1\alpha\geq-1 in (2) is changed into 1ψ<0-1\leq\psi<0.

(I)
Refer to caption Refer to caption Refer to caption
(II)
Refer to caption Refer to caption Refer to caption

Figure 1: Contour plots of density (2) for: (I) λ=0\lambda=0 and (a) α=1\alpha=-1, (b) α=10\alpha=10, and (c) α=20\alpha=20; (II) α=6\alpha=6 and (a) λ=0\lambda=0, (b) λ=0.5\lambda=0.5, and (c) λ=1\lambda=1. The other parameters are set as μ=ν=μ1=μ2=0,σ2=1,κ1=0.1,κ2=0.4\mu=\nu=\mu_{1}=\mu_{2}=0,\ \sigma^{2}=1,\ \kappa_{1}=0.1,\ \kappa_{2}=0.4.

3 Properties

3.1 Special cases

1. Under the assumption γ=α/2\gamma=\alpha/2 in (2), letting γ(=α/2)\gamma\ (=\alpha/2)\to\infty, we have an extension of the distribution by Mardia and Sutton (1978). Its joint probability density function (Kato and Shimizu, 2008) is

f(x,θ)=C11exp[{xμ(θ)}22τ2+κ1cos(θμ1)+κ2cos2(θμ2)]f(x,\theta)=C_{1}^{-1}\exp\left[-\frac{\{x-\mu(\theta)\}^{2}}{2\tau^{2}}+\kappa_{1}^{\ast}\cos(\theta-\mu_{1})+\kappa_{2}^{\ast}\cos 2(\theta-\mu_{2})\right] (3)

with the normalizing constant

C1=(2π)3/2τ[I0(κ1)I0(κ2)+2j=1Ij(κ2)I2j(κ1)cos2j(μ1μ2)].C_{1}=(2\pi)^{3/2}\tau\left[I_{0}(\kappa_{1}^{\ast})I_{0}(\kappa_{2}^{\ast})+2\sum_{j=1}^{\infty}I_{j}(\kappa_{2}^{\ast})I_{2j}(\kappa_{1}^{\ast})\cos 2j(\mu_{1}-\mu_{2})\right]. (4)

Here IjI_{j} denotes the modified Bessel function of the first kind and order jj given by

Ij(z)=12π02πcos(jθ)ezcosθdθ=r=01Γ(r+j+1)r!(z2)2r+j,z.I_{j}(z)=\frac{1}{2\pi}\int_{0}^{2\pi}\cos(j\theta){\rm e}^{z\cos\theta}{\rm d}\theta=\sum_{r=0}^{\infty}\frac{1}{\Gamma(r+j+1)r!}\left(\frac{z}{2}\right)^{2r+j},\quad z\in\mathbb{C}.

2. When κ2=0\kappa_{2}=0, (2) reduces to

f(x,θ)=C21[1+12σ2{xμ(θ)}2κ1cos(θμ1)](α+3)/2.f(x,\theta)=C_{2}^{-1}\left[1+\frac{1}{2\sigma^{2}}\left\{x-\mu(\theta)\right\}^{2}-\kappa_{1}\cos(\theta-\mu_{1})\right]^{-(\alpha+3)/2}. (5)

The normalizing constant is represented as

C2=22πσB(1/2,α/2+1)F12(α4+12,α4+1,1;κ12)C_{2}=2\sqrt{2}\pi\sigma B(1/2,\alpha/2+1){}_{2}F_{1}\left(\frac{\alpha}{4}+\frac{1}{2},\frac{\alpha}{4}+1,1;\kappa_{1}^{2}\right)

using the Gauss hypergeometric function F12{}_{2}F_{1}. If we replace κ1=κ1/γ,σ2=γτ2\kappa_{1}=\kappa_{1}^{\ast}/\gamma,\sigma^{2}=\gamma\tau^{2} and let γ=α/2\gamma=\alpha/2\to\infty, we have the distribution proposed by Mardia and Sutton (1978) and the constant C2C_{2} tends to (2π)3/2τI0(κ1)(2\pi)^{3/2}\tau I_{0}(\kappa_{1}^{\ast}). This agrees with the fact that C1=(2π)3/2τI0(κ1)C_{1}=(2\pi)^{3/2}\tau I_{0}(\kappa_{1}^{\ast}) when κ2=0\kappa_{2}^{\ast}=0 in (4).

3.2 Marginal and conditional distributions

The marginal distribution of Θ\Theta is

fΘ(θ)=CΘ1{1κ1cos(θμ1)κ2cos2(θμ2)}α/21,f_{\Theta}(\theta)=C_{\Theta}^{-1}\{1-\kappa_{1}\cos(\theta-\mu_{1})-\kappa_{2}\cos 2(\theta-\mu_{2})\}^{-\alpha/2-1}, (6)

where

CΘ\displaystyle C_{\Theta} =2π{F4(α4+12,α4+1,1,1;κ12,κ22)\displaystyle=2\pi\left\{F_{4}\left(\frac{\alpha}{4}+\frac{1}{2},\frac{\alpha}{4}+1,1,1;\kappa_{1}^{2},\kappa_{2}^{2}\right)\right.
+2j=1(α/2+1)3j(2j)!j!(κ12)2j(κ22)jcos2j(μ2μ1)\displaystyle+2\sum_{j=1}^{\infty}\frac{(\alpha/2+1)_{3j}}{(2j)!j!}\left(\frac{\kappa_{1}}{2}\right)^{2j}\left(\frac{\kappa_{2}}{2}\right)^{j}\cos 2j(\mu_{2}-\mu_{1})
×F4(α+6j4+12,α+6j4+1,2j+1,j+1;κ12,κ22)}.\displaystyle\times\left.F_{4}\left(\frac{\alpha+6j}{4}+\frac{1}{2},\frac{\alpha+6j}{4}+1,2j+1,j+1;\kappa_{1}^{2},\kappa_{2}^{2}\right)\right\}. (7)

The distribution with density (6) is a member of the generalized tt-distributions on the circle proposed by Siew et al. (2008), and is possibly bimodal and asymmetric. Cosine and sine moments of the generalized tt-distributions are given in their paper. The generalized t-distributions include the generalized von Mises distribution (cf. Yfantis and Borgman, 1982) as a special case. As another special case when κ2=0\kappa_{2}=0 in (2), the marginal distribution of Θ\Theta belongs to the family of symmetric distributions by Jones and Pewsey (2005). Note that the marginal density (6) is independent of λ,μ,σ\lambda,\mu,\sigma and ν\nu which are the parameters of the proposed density (2). On the other hand, the marginal distribution of XX does not have a closed form in general. When λ=0\lambda=0, we can obtain the marginal distribution of XX in a closed form given by

fX(x)=C1D(x){1+12σ2(xμ)2}(α+3/2),f_{X}(x)=C^{-1}D(x)\left\{1+\frac{1}{2\sigma^{2}}(x-\mu)^{2}\right\}^{-(\alpha+3/2)},

where CC is defined by (2) and D(x)D(x) is obtained by replacing α\alpha and κi(i=1,2)\kappa_{i}\ (i=1,2) with α+1/2\alpha+1/2 and κi/{1+(xμ)2/(2σ2)}\kappa_{i}/\{1+(x-\mu)^{2}/(2\sigma^{2})\}, respectively, in CΘC_{\Theta} defined in (3.2). Note that this density is symmetric about μ\mu.

In (2), the conditional distribution of XX given Θ=θ\Theta=\theta has the generalized tt-density function provided by

fX|Θ(x|θ)=CX|Θ1(1+{xμ(θ)}22σ2[1{κ1cos(θμ1)+κ2cos2(θμ2)}])(α+3)/2f_{X|\Theta}(x|\theta)=C_{X|\Theta}^{-1}\left(1+\frac{\{x-\mu(\theta)\}^{2}}{2\sigma^{2}\left[1-\left\{\kappa_{1}\cos(\theta-\mu_{1})+\kappa_{2}\cos 2(\theta-\mu_{2})\right\}\right]}\right)^{-(\alpha+3)/2} (8)

with the normalizing constant

CX|Θ=2σB(12,α2+1)[1{κ1cos(θμ1)+κ2cos2(θμ2)}]1/2.C_{X|\Theta}=\sqrt{2}\sigma B\left(\frac{1}{2},\frac{\alpha}{2}+1\right)\left[1-\{\kappa_{1}\cos(\theta-\mu_{1})+\kappa_{2}\cos 2(\theta-\mu_{2})\}\right]^{1/2}.

The conditional distribution of Θ\Theta given X=xX=x is a member of the generalized tt-distributions on the circle given by

f(θ|X=x){1κ1(x)cos(θμ1)κ2(x)cos2(θμ2)}(α+3)/2,f(\theta|X=x)\propto\{1-\kappa_{1}(x)\cos(\theta-\mu_{1})-\kappa_{2}(x)\cos 2(\theta-\mu_{2})\}^{-(\alpha+3)/2},

where κi(x)=κi/{1+(xμ)2/(2σ2)},i=1,2\kappa_{i}(x)=\kappa_{i}/\{1+(x-\mu)^{2}/(2\sigma^{2})\},\ i=1,2. As shown in Siew et al. (2008), the mean direction of Θ\Theta with density (6) depends on κi,i=1,2\kappa_{i},\ i=1,2. Thus the conditional mean direction E(Θ|X=x)E(\Theta|X=x) depends on xx through κi(x)\kappa_{i}(x). The result that the conditional distributions of X|Θ=θX|\Theta=\theta and Θ|X=x\Theta|X=x are the generalized tt-distribution comes from the fact that the conditional distribution of a multivariate tt distribution is again a multivariate tt distribution (Joe, 2015).

3.3 Modality

We consider the modality of the distribution with density (2). The mode xx^{\ast} of (2), whenever the value of θ\theta is specified, is x=μ(θ)x^{\ast}=\mu(\theta). Similar to Siew et al. (2008), we discuss maximization of the function m(θ)=κ1cos(θμ1)+κ2cos2(θμ2)m(\theta)=\kappa_{1}\cos(\theta-\mu_{1})+\kappa_{2}\cos 2(\theta-\mu_{2}) with respect to θ\theta. The solution θ\theta^{\ast} of an equation

κ1sin(θμ1)+2κ2sin2(θμ2)=0\kappa_{1}\sin(\theta-\mu_{1})+2\kappa_{2}\sin 2(\theta-\mu_{2})=0 (9)

is a value which maximizes m(θ)m(\theta) if the sign of h(θ)h(\theta^{\ast}) is positive, where

h(θ)=κ1cos(θμ1)+4κ2cos2(θμ2).h(\theta)=\kappa_{1}\cos(\theta-\mu_{1})+4\kappa_{2}\cos 2(\theta-\mu_{2}).

Equation (9) can be solved numerically for any combinations of μ1\mu_{1} and μ2\mu_{2}. Without loss of generality, we let μ1=0\mu_{1}=0. Then (9) has closed form solutions when μ2=0,π/2,π/4\mu_{2}=0,\pi/2,\pi/4 and 3π/43\pi/4. The results are given in Table 1, where θ0=arccos{κ1/(4κ2)}\theta_{0}=\arccos\{\kappa_{1}/(4\kappa_{2})\}, θ1=arcsin{(κ1+κ12+32κ22)/(8κ2)}\theta_{1}=\arcsin\{(-\kappa_{1}+\sqrt{\kappa_{1}^{2}+32\kappa_{2}^{2}})/(8\kappa_{2})\} and θ2=arcsin{(κ1+κ12+32κ22)/(8κ2)}\theta_{2}=\arcsin\{(\kappa_{1}+\sqrt{\kappa_{1}^{2}+32\kappa_{2}^{2}})/(8\kappa_{2})\}. See Yfantis and Borgman (1982) for more discussion as to the solutions of (9).

Table 1: Summary of the modality of the proposed distribution for some values of μ2\mu_{2} when μ1=0\mu_{1}=0.
μ2\mu_{2}\ \ Condition Modes  (x,θ)(x,\theta)
0 4κ2>κ14\kappa_{2}>\kappa_{1} (μ+λcosν,0),(μλcosν,π)(\mu+\lambda\cos\nu,0),(\mu-\lambda\cos\nu,\pi)
4κ2<κ14\kappa_{2}<\kappa_{1} (μ+λcosν,0)(\mu+\lambda\cos\nu,0)
π/2\pi/2 4κ2>κ14\kappa_{2}>\kappa_{1} (μ+λcos(θ0ν),θ0),(μ+λcos(θ0+ν),2πθ0)(\mu+\lambda\cos(\theta_{0}-\nu),\theta_{0}),(\mu+\lambda\cos(\theta_{0}+\nu),2\pi-\theta_{0})
4κ2<κ14\kappa_{2}<\kappa_{1} (μ+λcosν,0)(\mu+\lambda\cos\nu,0)
π/4\pi/4 2κ2>κ12\kappa_{2}>\kappa_{1} (μ+λcos(θ1ν),θ1),(μλcos(θ2ν),π+θ2)(\mu+\lambda\cos(\theta_{1}-\nu),\theta_{1}),(\mu-\lambda\cos(\theta_{2}-\nu),\pi+\theta_{2})
2κ2<κ12\kappa_{2}<\kappa_{1} (μ+λcos(θ1ν),θ1)(\mu+\lambda\cos(\theta_{1}-\nu),\theta_{1})
3π/43\pi/4 2κ2>κ12\kappa_{2}>\kappa_{1} (μλcos(θ2+ν),πθ2),(μ+λcos(θ1+ν),2πθ1)(\mu-\lambda\cos(\theta_{2}+\nu),\pi-\theta_{2}),(\mu+\lambda\cos(\theta_{1}+\nu),2\pi-\theta_{1})
2κ2<κ12\kappa_{2}<\kappa_{1} (μ+λcos(θ1+ν),2πθ1)(\mu+\lambda\cos(\theta_{1}+\nu),2\pi-\theta_{1})

Contour plots and marginal density plots of the proposed distribution are given in Figure 2. Figure 2 shows the case when κ1=0.5\kappa_{1}=0.5, κ2=0.1\kappa_{2}=0.1, λ=0\lambda=0, and (II) κ1=0.2\kappa_{1}=0.2, κ2=0.3\kappa_{2}=0.3, λ=1\lambda=1, while the other parameters are set as σ=1\sigma=1, μ=0\mu=0, ν=π/3\nu=\pi/3, μ1=μ2=0\mu_{1}=\mu_{2}=0, α=6\alpha=6. In the case of (I) in Figure 2, the joint density is unimodal and the marginal densities of XX and Θ\Theta are symmetric. On the other hand, in the case of (II) in Figure 3, the joint density is bimodal and the marginal density of XX is asymmetric. In fact, the marginal density of XX is possibly skew depending on the values of the parameters, as will be seen in Section 3.5. The marginal density of Θ\Theta in Figure 3 is bimodal as given in Table 1.

In the case where a unimodal distribution is desired, we put, for example, μ2=μ1+π/4\mu_{2}=\mu_{1}+\pi/4 with restriction 2|κ2|<κ12|\kappa_{2}|<\kappa_{1} to get a unimodal density function from (2). This submodel can be useful for modeling multimodality when a finite mixture model is easier to interpret than a bimodal distribution.

(I)
Refer to caption Refer to caption Refer to caption
(II)
Refer to caption Refer to caption Refer to caption

Figure 2: (a) Contour plot of log-density, and marginal density plots of (b) the linear random variable XX and (c) the circular random variable Θ\Theta for the proposed distribution for: (I) κ1=0.5\kappa_{1}=0.5, κ2=0.1\kappa_{2}=0.1, λ=0\lambda=0, and (II) κ1=0.2\kappa_{1}=0.2, κ2=0.3\kappa_{2}=0.3, λ=1\lambda=1. The other parameters are set as σ=1\sigma=1, μ=0\mu=0, ν=π/3\nu=\pi/3, μ1=μ2=0\mu_{1}=\mu_{2}=0, α=6\alpha=6.

3.4 Moments and circular-linear correlation

We consider moments of the proposed distribution. Let αm,k=E(cosmθ)\alpha_{m,k}=E(\cos m\theta) and βm,k=E(sinkθ)\beta_{m,k}=E(\sin k\theta) be the trigonometric moments of (6) under replacement α/2\alpha/2 with α/2k\alpha/2-k, which are obtainable using the results by Siew et al. (2008). Moments of a random vector (X,Θ)(X,\Theta)^{\prime} having (2) are given by

E[{Xμ(Θ)}2kcosmΘ]=CΘ,kC12k+1/2σ2k+1B(k+12,α2k+1)αm,kE[\{X-\mu(\Theta)\}^{2k}\cos m\Theta]=C_{\Theta,k}C^{-1}2^{k+1/2}\sigma^{2k+1}B\left(k+\frac{1}{2},\frac{\alpha}{2}-k+1\right)\alpha_{m,k} (10)

and

E[{Xμ(Θ)}2ksinmΘ]=CΘ,kC12k+1/2σ2k+1B(k+12,α2k+1)βm,k,E[\{X-\mu(\Theta)\}^{2k}\sin m\Theta]=C_{\Theta,k}C^{-1}2^{k+1/2}\sigma^{2k+1}B\left(k+\frac{1}{2},\frac{\alpha}{2}-k+1\right)\beta_{m,k}, (11)

where CΘ,kC_{\Theta,k} is obtained by replacing α\alpha with α2k\alpha-2k in CΘC_{\Theta} in (3.2).

Moreover, we derive another type of moments of a random vector (X,Θ)(X,\Theta)^{\prime} having (5) given by putting κ2=0\kappa_{2}=0 in (2). After some calculations, the moments for nonnegative integers k(<α/2+1)k\ (<\alpha/2+1) and mm turn out to be

E[{Xμ(Θ)}2kcosm(Θμ1)]\displaystyle E[\left\{X-\mu(\Theta)\right\}^{2k}\cos m(\Theta-\mu_{1})]
=2kσ2kB(k+12,α2k+1)(1)m+α2k+1Pα/2km((1κ12)1/2)(1κ12)(α2k+2)/4B(1/2,α/2+1)F12(α/4+12,α/4+1,1;κ12)(α/2km+1)m,\displaystyle=\frac{2^{k}\sigma^{2k}B\left(k+\frac{1}{2},\frac{\alpha}{2}-k+1\right)(-1)^{m+\frac{\alpha}{2}-k+1}P_{\alpha/2-k}^{m}\left(-(1-\kappa_{1}^{2})^{-1/2}\right)\left(1-\kappa_{1}^{2}\right)^{-(\alpha-2k+2)/4}}{B\left(1/2,\alpha/2+1\right){{}_{2}F_{1}}\left(\alpha/4+\frac{1}{2},\alpha/4+1,1;\kappa_{1}^{2}\right)(\alpha/2-k-m+1)_{m}}, (12)

where PP denotes the associated Legendre function (Gradshteyn and Ryzhik, 2007, 8.711.2) defined by

Pνm(z)=(ν+1)(ν+2)(ν+m)π0π(z+z21cosφ)νcosmφdφ.P_{\nu}^{m}(z)=\frac{(\nu+1)(\nu+2)\cdots(\nu+m)}{\pi}\int_{0}^{\pi}\left(z+\sqrt{z^{2}-1}\cos\varphi\right)^{\nu}\cos m\varphi\ {\rm d}\varphi.

Next, we study the circular-linear correlation RxθR_{x\theta} between XX and Θ\Theta (cf. Mardia and Jupp, 1999, p. 245) which is defined as

Rxθ2=rxs2+rxc22rcsrxsrxc1rcs2,R_{x\theta}^{2}=\frac{r_{xs}^{2}+r_{xc}^{2}-2r_{cs}r_{xs}r_{xc}}{1-r_{cs}^{2}},

where rxs=Corr(X,cosΘ),rxc=Corr(X,sinΘ)r_{xs}={\rm Corr}(X,\cos\Theta),r_{xc}={\rm Corr}(X,\sin\Theta) and rcs=Corr(cosΘ,sinΘ)r_{cs}={\rm Corr}(\cos\Theta,\sin\Theta) are Pearson’s correlation coefficients. We only consider the case when (X,Θ)(X,\Theta)^{\prime} has density (5) for simplicity because the circular-linear correlation of (X,Θ)(X,\Theta) with density (2) is not feasible to compute analytically. Note that the RxθR_{x\theta} in case of (2) can be obtained by numerical calculation. A straightforward calculation shows that

Rxθ2=λ2Uq+λ2U,R_{x\theta}^{2}=\frac{\lambda^{2}U}{q+\lambda^{2}U}, (13)

where

U=12(1p2)sin2(μ1ν)+{12(1+p2)p12}cos2(μ1ν),U=\frac{1}{2}\left(1-p_{2}\right)\sin^{2}(\mu_{1}-\nu)+\left\{\frac{1}{2}\left(1+p_{2}\right)-p_{1}^{2}\right\}\cos^{2}(\mu_{1}-\nu),

and p1,p2p_{1},p_{2} and qq denote pm=E{cosm(Θμ1)},m=1,2,p_{m}=E\{\cos m(\Theta-\mu_{1})\},m=1,2, and q=E[{Xμ(Θ)}2]q=E[\{X-\mu(\Theta)\}^{2}], calculable from (12). Note that U>0U>0 since 1p2>01-p_{2}>0 and (1+p2)/2p12=Var{cos(Θμ1)}>0(1+p_{2})/2-p_{1}^{2}={\rm Var}\{\cos(\Theta-\mu_{1})\}>0. We can observe that Rxθ2R_{x\theta}^{2} is an increasing function of λ\lambda, and Rxθ2=0R_{x\theta}^{2}=0 if and only if λ=0\lambda=0. Furthermore, letting γ=α/2\gamma=\alpha/2\to\infty, it is seen that (13) reduces to the circular-linear correlation of the distribution proposed by Mardia and Sutton (1978) because (5) goes to the Mardia–Sutton model. This is confirmed from the fact that qτ2q\to\tau^{2} and pmIm(κ)/I0(κ),m=1,2p_{m}\to I_{m}(\kappa^{\ast})/I_{0}(\kappa^{\ast}),m=1,2 as γ=α/2\gamma=\alpha/2\to\infty.

3.5 Skewness of marginal distribution on the real line

For the proposed density (2), we derive the skewness of the marginal density of XX defined as

γ1=E[{XE(X)}3](E[{XE(X)}2])3/2.\gamma_{1}=\frac{E\left[\left\{X-E(X)\right\}^{3}\right]}{\left(E\left[\left\{X-E(X)\right\}^{2}\right]\right)^{3/2}}.

Straightforward calculation shows that

γ1=3v2λ+v3λ3(v1λ2+q)3/2,\gamma_{1}=\frac{3v_{2}\lambda+v_{3}\lambda^{3}}{(v_{1}\lambda^{2}+q)^{3/2}}, (14)

where v1=Var{cos(Θν)},v2=Cov[{Xμ(Θ)}2,cos(Θν)]v_{1}={\rm Var}\{\cos(\Theta-\nu)\},\ v_{2}={\rm Cov}[\{X-\mu(\Theta)\}^{2},\cos(\Theta-\nu)] and v3=E([cos(Θν)E{cos(Θν)}]3)v_{3}=E([\cos(\Theta-\nu)-E\{\cos(\Theta-\nu)\}]^{3}), which are calculable from (10) and (11). We can easily obtain from (14) that γ1v3/v13/2\gamma_{1}\to v_{3}/v_{1}^{3/2} as λ\lambda\to\infty and γ1=0\gamma_{1}=0 when λ=0\lambda=0. Figure 4 shows that a graph of skewness as a function of λ\lambda. We see from Figure 4 that the marginal distribution of X of the proposed model can be left and right skewed according to the values of the parameters.

Refer to caption
Refer to caption
Figure 3: Skewness of the marginal distribution of X as a function of λ\lambda for (a) σ=3,μ=0,κ1=0.5,κ2=0,μ1=μ2=0,α=9\sigma=3,\ \mu=0,\ \kappa_{1}=0.5,\ \kappa_{2}=0,\ \mu_{1}=\mu_{2}=0,\ \alpha=9 with ν=π\nu=\pi (solid), ν=π/2\nu=\pi/2 (dashed) and ν=0\nu=0 (dotted), and (b) σ=3,μ=0,ν=0,κ1=0.3,μ1=μ2=0,α=9\sigma=3,\ \mu=0,\ \nu=0,\ \kappa_{1}=0.3,\ \mu_{1}=\mu_{2}=0,\ \alpha=9 with κ2=0\kappa_{2}=0 (solid), κ2=0.1\kappa_{2}=0.1 (dashed) and κ2=0.2\kappa_{2}=0.2 (dotted).

3.6 Regression

We can derive a circular-linear regression model from the conditional distribution with density (8). In fact, the conditional mean of XX given Θ=θ\Theta=\theta is

E(X|Θ=θ)=μ(θ)=μ+λcos(θν)E(X|\Theta=\theta)=\mu(\theta)=\mu+\lambda\cos(\theta-\nu)

and the conditional variance of XX given Θ=θ\Theta=\theta is

Var(X|Θ=θ)=2σ2α{1κ1cos(θμ1)κ2cos2(θμ2)}.{\rm Var}(X|\Theta=\theta)=\frac{2\sigma^{2}}{\alpha}\left\{1-\kappa_{1}\cos(\theta-\mu_{1})-\kappa_{2}\cos 2(\theta-\mu_{2})\right\}.

Note that the conditional variance is dependent on θ\theta, i.e. the regression model is possibly heterogeneous. As a reduced model, if we let γ=α\gamma=\alpha\to\infty, we have Var(X|Θ=θ)=τ2{\rm Var}(X|\Theta=\theta)=\tau^{2}, which is independent of θ\theta. Moreover if κ1=0\kappa_{1}=0 and κ2=0\kappa_{2}=0 in (8), we have Var(X|Θ=θ)=2σ2/α{\rm Var}(X|\Theta=\theta)=2\sigma^{2}/\alpha and, in this case, we obtain a regression model

xi=μ+λcos(θiν)+εi,i=1,,n,x_{i}=\mu+\lambda\cos(\theta_{i}-\nu)+\varepsilon_{i},\ \ \ \ \ i=1,\ldots,n,

with random errors εi\varepsilon_{i} which are independent and identically distributed according to the generalized tt-distribution.

3.7 Maximizing β\beta-entropy

The related distribution proposed by Mardia and Sutton (1978) and Kato and Shimizu (2008) can be characterized as the maximum entropy distribution under certain moment conditions. Also a maximum entropy distribution under certain moment conditions relates to the proposed distribution with density (2). We consider the β\beta-entropy (see Eguchi, 2009, Section 13.2.4) defined as

E(f)={fβ+1(x,θ)dxdθβ1}/β(β+1).E(f)=\left\{\int f^{\beta+1}(x,\theta){\rm d}x{\rm d}\theta-\beta-1\right\}\bigg{/}\beta(\beta+1).

Then the maximum entropy distribution subject to constraints on the moments

E(X2),E(XcosΘ),E(XsinΘ),E(cospΘ),E(sinpΘ),p=1,2,E(X^{2}),\ E(X\cos\Theta),\ E(X\sin\Theta),\ E(\cos p\Theta),\ E(\sin p\Theta),\ p=1,2,

is the distribution with density

f(x,θ)(1+β[{xμ(θ)}22τ2+κ1cos(θμ1)+κ2cos2(θμ2)])1/β.f(x,\theta)\propto\left(1+\beta\left[-\frac{\left\{x-\mu(\theta)\right\}^{2}}{2\tau^{2}}+\kappa_{1}^{\ast}\cos(\theta-\mu_{1})+\kappa_{2}^{\ast}\cos 2(\theta-\mu_{2})\right]\right)^{1/\beta}. (15)

If we take β=2/(α+3)\beta=-2/(\alpha+3), (15) gives a density related to (2).

3.8 Parameter estimation

We provide a method for calculating the maximum likelihood estimates of the parameters in the generalized tt-distribution with density (2). When we observe (xi,θi),i=1,,n(x_{i},\theta_{i}),\ i=1,\ldots,n, the log-likelihood function is given by

L(𝝍)\displaystyle L({\text{\boldmath$\psi$}}) =nlogB(12,α2+1)nlogCθn2log2nlogσ\displaystyle=-n\log B\left(\frac{1}{2},\frac{\alpha}{2}+1\right)-n\log C_{\theta}-\frac{n}{2}\log 2-n\log\sigma
12(α+3)i=1nlog[1+12σ2{xiμλcos(θiν)}2κ1cos(θiμ1)κ2cos2(θiμ2)],\displaystyle-\frac{1}{2}\left(\alpha+3\right)\sum_{i=1}^{n}\log\left[1+\frac{1}{2\sigma^{2}}\left\{x_{i}-\mu-\lambda\cos(\theta_{i}-\nu)\right\}^{2}-\kappa_{1}\cos(\theta_{i}-\mu_{1})-\kappa_{2}\cos 2(\theta_{i}-\mu_{2})\right],

where 𝝍=(σ,μ,λ,ν,κ1,μ1,κ2,μ2,α){\text{\boldmath$\psi$}}=(\sigma,\mu,\lambda,\nu,\kappa_{1},\mu_{1},\kappa_{2},\mu_{2},\alpha)^{\prime}. For obtaining the maximizer of L(𝝍)L({\text{\boldmath$\psi$}}), we propose the conditional maximization algorithm. We first divide the parameter 𝝍\psi into 𝝍=(σ,𝝍1,𝝍2){\text{\boldmath$\psi$}}=(\sigma,{\text{\boldmath$\psi$}}_{1}^{\prime},{\text{\boldmath$\psi$}}_{2}^{\prime})^{\prime}, where 𝝍1=(μ,λ,ν){\text{\boldmath$\psi$}}_{1}=(\mu,\lambda,\nu)^{\prime} and 𝝍2=(κ1,μ1,κ2,μ2,α){\text{\boldmath$\psi$}}_{2}=(\kappa_{1},\mu_{1},\kappa_{2},\mu_{2},\alpha)^{\prime}. Given the value of σ\sigma and 𝝍2{\text{\boldmath$\psi$}}_{2}, maximizing L(𝝍)L({\text{\boldmath$\psi$}}) is equivalent to maximizing

i=1nlog[Ci+12σ2{xiμλcos(θiν)}2],-\sum_{i=1}^{n}\log\left[C_{i}+\frac{1}{2\sigma^{2}}\left\{x_{i}-\mu-\lambda\cos(\theta_{i}-\nu)\right\}^{2}\right],

with respect to 𝝍1{\text{\boldmath$\psi$}}_{1}, where Ci=1κ1cos(θiμ1)κ2cos2(θiμ2)C_{i}=1-\kappa_{1}\cos(\theta_{i}-\mu_{1})-\kappa_{2}\cos 2(\theta_{i}-\mu_{2}). Let 𝒙=(x1,,xn){\text{\boldmath$x$}}=(x_{1},\ldots,x_{n})^{\prime}, 𝑻=(𝒕1,,𝒕n){\text{\boldmath$T$}}=({\text{\boldmath$t$}}_{1}^{\prime},\ldots,{\text{\boldmath$t$}}_{n}^{\prime})^{\prime} for 𝒕i=(1,cosθi,sinθi){\text{\boldmath$t$}}_{i}=(1,\cos\theta_{i},\sin\theta_{i})^{\prime}, 𝜷=(μ,λcosν,λsinν){\text{\boldmath$\beta$}}=(\mu,\lambda\cos\nu,\lambda\sin\nu)^{\prime} and 𝑾=diag(w1,,wn){\text{\boldmath$W$}}={\rm diag}(w_{1},\ldots,w_{n}) for

wi=σ2σ2+{xiμλcos(θiν)}2/2.w_{i}=\frac{\sigma^{2}}{\sigma^{2}+\left\{x_{i}-\mu-\lambda\cos(\theta_{i}-\nu)\right\}^{2}/2}.

Using the theory of weighted regression (see Andrews, 1974), the maximizer of 𝝍1{\text{\boldmath$\psi$}}_{1} given σ\sigma and 𝝍2{\text{\boldmath$\psi$}}_{2} can be obtained as

𝜷^=(𝑻𝑾𝑻)1𝑻𝑾𝒙,\widehat{{\text{\boldmath$\beta$}}}=({\text{\boldmath$T$}}^{\prime}{\text{\boldmath$W$}}{\text{\boldmath$T$}})^{-1}{\text{\boldmath$T$}}^{\prime}{\text{\boldmath$W$}}{\text{\boldmath$x$}}, (16)

which deduces the maximizer 𝝍^1\widehat{{\text{\boldmath$\psi$}}}_{1}. Since 𝑾W depends on 𝝍1{\text{\boldmath$\psi$}}_{1}, we calculate 𝑾W based on the current values in each iteration.

Given 𝝍1{\text{\boldmath$\psi$}}_{1} and 𝝍2{\text{\boldmath$\psi$}}_{2}, maximizing L(𝝍)L({\text{\boldmath$\psi$}}) with respect to σ2\sigma^{2} is equivalent to solving the following equation:

n(α+3)1=i=1n{xiμλcos(θiν)}2{xiμλcos(θiν)}2+Ciσ2,n\left(\alpha+3\right)^{-1}=\sum_{i=1}^{n}\frac{\left\{x_{i}-\mu-\lambda\cos(\theta_{i}-\nu)\right\}^{2}}{\left\{x_{i}-\mu-\lambda\cos(\theta_{i}-\nu)\right\}^{2}+C_{i}\sigma^{2}}, (17)

which deduces the maximizer σ^\hat{\sigma}.

Finally for maximizing 𝝍2{\text{\boldmath$\psi$}}_{2} under given σ\sigma and 𝝍1{\text{\boldmath$\psi$}}_{1}, we maximize

nlogB(12,α2+1)nlogCθ12(α+3)i=1nlog{Diκ1cos(θiμ1)κ2cos2(θiμ2)}-n\log B\left(\frac{1}{2},\frac{\alpha}{2}+1\right)-n\log C_{\theta}-\frac{1}{2}\left(\alpha+3\right)\sum_{i=1}^{n}\log\left\{D_{i}-\kappa_{1}\cos(\theta_{i}-\mu_{1})-\kappa_{2}\cos 2(\theta_{i}-\mu_{2})\right\} (18)

with respect to 𝝍2{\text{\boldmath$\psi$}}_{2}, where Di=1+(2σ2)1{xiμλcos(θiν)}2D_{i}=1+(2\sigma^{2})^{-1}\left\{x_{i}-\mu-\lambda\cos(\theta_{i}-\nu)\right\}^{2}. This maximization problem is quite similar to obtaining the maximum likelihood estimates of the generalized tt-distribution on the circle (Siew et al., 2008), so that we can obtain the maximizer 𝝍2{\text{\boldmath$\psi$}}_{2} given σ\sigma and 𝝍1{\text{\boldmath$\psi$}}_{1}. Note that we carried out the maximization of (18) with use of numerical integration for getting the value of CθC_{\theta}.

Therefore, the proposed estimation method is described in the following.

Estimation Algorithm

  • 1.

    Determine the initial values 𝝍(0){\text{\boldmath$\psi$}}^{(0)}. Set k=0k=0.

  • 2.

    Compute 𝑾W based on σ(k)\sigma^{(k)} and 𝝍1(k){\text{\boldmath$\psi$}}_{1}^{(k)}, then obtain 𝝍1(k+1){\text{\boldmath$\psi$}}_{1}^{(k+1)} based on (16).

  • 3.

    Compute σ(k+1)\sigma^{(k+1)} by solving (17) with 𝝍1=𝝍1(k+1){\text{\boldmath$\psi$}}_{1}={\text{\boldmath$\psi$}}_{1}^{(k+1)} and 𝝍2=𝝍2(k){\text{\boldmath$\psi$}}_{2}={\text{\boldmath$\psi$}}_{2}^{(k)}.

  • 4.

    Compute 𝝍2(k+1){\text{\boldmath$\psi$}}_{2}^{(k+1)} by maximizing (18) with 𝝍1=𝝍1(k+1){\text{\boldmath$\psi$}}_{1}={\text{\boldmath$\psi$}}_{1}^{(k+1)} and σ=σ(k+1)\sigma=\sigma^{(k+1)}.

  • 5.

    Set k=k+1k=k+1 and go to Step 2 (until numerical convergence).

4 Empirical application

For an illustrative example, we consider a cylindrical dataset given in Johnson and Wehrly (1977) on the wind direction and ozone level taken at 6:00 pm at four-day intervals between April 18th and June 29th, 1975 at a weather station in Milwaukee with 1919 samples. We fitted the proposed generalized tt-distribution and its submodels. For submodels of (2), we consider two cases, namely, κ2=0\kappa_{2}=0 (GT-sub1) given in (5) and μ2=μ1+π/4\mu_{2}=\mu_{1}+\pi/4 with 2|κ2|<κ12|\kappa_{2}|<\kappa_{1} (GT-sub2) discussed in the end of Section 3.3. When we carry on the estimation algorithm given in Section 3.8, we repeat the algorithm until the difference between update and current values are smaller than 0.0010.001. For comparison, we also fitted the member of the exponential family given by Kato and Shimizu (2008) with density

f(x,θ)exp{(xμ(θ))22σ2+κ1cos(θμ1)+κ2cos2(θμ2)},f(x,\theta)\propto\exp\left\{-\frac{(x-\mu(\theta))^{2}}{2\sigma^{2}}+\kappa_{1}\cos(\theta-\mu_{1})+\kappa_{2}\cos 2(\theta-\mu_{2})\right\},

which is reparametrized form of (3) with κ1=κ1\kappa_{1}^{\ast}=\kappa_{1}, κ2=κ2\kappa_{2}^{\ast}=\kappa_{2} and τ2=σ2\tau^{2}=\sigma^{2}. Table 2 provides the maximum likelihood estimates of the parameters, AIC values, and the multivariate Kolmogorov-Smirnov statistics of goodness of fit testing given by Justel et al. (1997). In Table 1 of Justel et al. (1997), the bivariate Kolmogorov-Smirnov statistics distributions are reported. From the table, the upper 5%5\%, 10%10\% and 25%25\% percentiles are 0.3620.362, 0.3350.335 and 0.2920.292, respectively, for 2020 sample cases. Thus the percentiles in 19 sample cases are smaller than these values, so that the fitted four models are not rejected with 5%5\% significance level. Judging from AIC, we see that the two submodels of the proposed distribution, GT-sub1 and GT-sub2, gives better fits than the Kato and Shimizu distribution. Figure 4 shows scatter plots of the data and contour plots of the fitted densities of two submodels.

Table 2: Maximum likelihood estimates of the parameters, the maximum log-likelihood (MLL), and AIC values of model (2) (GT) and its submodels and the Kato and Shimizu model (3) (KS) fitted to the data from Johnson and Wehrly (1977).
Model μ\mu λ\lambda ν\nu σ\sigma κ1\kappa_{1} μ1\mu_{1} κ2\kappa_{2} μ2\mu_{2} α\alpha AIC g.o.f
GT 41.38 31.14 1.25 68.90 0.11 6.28 0.03 1.35 23.84 417.79 0.180
GT-sub1 41.37 31.17 1.24 74.81 0.08 0.30 28.28 240.20 0.314
GT-sub2 41.01 32.70 1.41 6.08 0.49 0.19 0.15 -1.00 203.34 0.359
KS 41.24 31.38 1.20 19.77 1.41 0.19 0.35 1.45 241.42 0.282
Refer to caption
Refer to caption
Figure 4: Sample plot and contour plot of the fitted density function.

5 Conclusions

In this paper, we derived the distributions on the cylinder based on a trivariate tt-distribution. The derived distribution is considered as a cylindrical extension of the generalized tt-distribution on the circle proposed by Siew et al. (2008) and includes the exponential family given by Kato and Shimizu (2008). We investigated some properties of the proposed distribution including the marginal and conditional distributions, circular-linear correlation and the algorithm for parameter estimation. We applied our proposed distribution to the data set of the wind direction and ozone level given in Johnson and Wehrly (1977), and we confirmed that the proposed distribution gave a better fit than the distribution given by Kato and Shimizu (2008).


Acknowledgement

We would like to thank the two reviewers for many valuable comments and helpful suggestions which led to an improved version of this paper. The first author was supported in part by Grant-in-Aid for Scientific Research (10076) from Japan Society for the Promotion of Science (JSPS). The work of the third author was supported by JSPS KAKENHI Grant Number 25400218.

References

  • [2] Andrews, D. F. (1974). A robust method for multiple linear regression, Technometrics, 16:523–532.
  • [4] Eguchi, S. (2009). Information Divergence Geometry and the Application to Statistical Machine Learning. In Information Theory and Statistical Learning, Chapter 13, 309–332, Emmert-Streib, F. and Dehmer, M. Eds., New York, Springer.
  • [6] Fisher, N. I. (1993). Statistical Analysis of Circular Data. Cambridge University Press, Cambridge.
  • [8] Gradshteyn, I. S. and Ryzhik, I. M. (2007). Table of Integrals, Series, and Products. Seventh Edition, Elsevier, Amsterdam.
  • [10] Joe, H. (2015). Dependence Modeling with Copulas. Chapman & Hal/CRC, Boca Raton.
  • [12] Jones, M. C. and Pewsey, A. (2005). A family of symmetric distributions on the circle. Journal of the American Statistical Association, 100:1422–1428.
  • [14] Johnson, R. A. and Wehrly, T. (1977). Measures and models for angular correlation and angular-linear correlation. Journal of the Royal Statistical Society B, 39:222–229.
  • [16] Johnson, R. A. and Wehrly, T. E. (1978). Some angular-linear distributions and related regression models. Journal of the American Statistical Association, 73:602–606.
  • [18] Justel, A., Pen~\tilde{\rm n}a, D. and Zamar, R. (1997). A multivariate Kolmogorov-Smirnov test of goodness of fit. Statistics and Probability Letters, 35, 251–259.
  • [20] Kato, S. and Jones, M. C. (2010). A family of distributions on the circle with links to, and applications arising from, Möbius transformation. Journal of the American Statistical Association, 105:249–262.
  • [22] Kato, S. and Jones, M. C. (2015). A tractable and interpretable four-parameter family of unimodal distributions on the circle. Biometrika, 102, 181-190.
  • [24] Kato, S. and Shimizu, K. (2008). Dependent models for observations which include angular ones. Journal of Statistical Planning and Inference, 138:3538-3549.
  • [26] Kotz, S. and Nadarajah, S. (2004). Multivariate tt Distributions and Their Applications, Cambridge University Press, Cambridge.
  • [28] Mardia, K. V. and Jupp, P. E. (1999). Directional Statistics. Wiley, Chichester.
  • [30] Mardia, K. V. and Sutton, T. W. (1978). A model for cylindrical variables with applications. Journal of the Royal Statistical Society B, 40:229-233.
  • [32] Siew, H-Y., Kato, S. and Shimizu, K. (2008). The generalized tt-distribution on the circle. Japanese Journal of Applied Statistics, 37:1-17.
  • [34] Yfantis, E. A. and Borgman, L. E. (1982). An extension of the von Mises distribution. Communications in Statistics–Theory and Methods, 11:1695-1706.
  • [35]