Empirical Likelihood Inference of Variance Components in Linear Mixed-Effects Models

J. Zhang, Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, U.S.A.
W. Guo, Genetic Epidemiology Research Branch, National Institute of Mental Health, National Institutes of Health
J.S. Carpenter, The University of Sydney’s Brain and Mind Centre
Andrew Leroux, Genetic Epidemiology Research Branch, National Institute of Mental Health, National Institutes of Health; Department of Biostatistics & Informatics, University of Colorado
K.R. Merikangas, Genetic Epidemiology Research Branch, National Institute of Mental Health, National Institutes of Health
N.G. Martin, QIMR Berghofer Medical Research Institute
I.B. Hickie, The University of Sydney’s Brain and Mind Centre
H. Shou, Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, U.S.A.
H. Li (hongzhe@pennmedicine.upenn.edu), Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, U.S.A.

Abstract

Linear mixed-effects models are widely used in analyzing repeated measures data, including clustered and longitudinal data, where inference on both the fixed effects and the variance components is of importance. Unlike inference on the fixed effects, which has been well studied, inference on the variance components is more challenging because the null value lies on the boundary of the parameter space and the fixed effects enter as nuisance parameters. Existing methods often require strong distributional assumptions on the random effects and random errors. In this paper, we develop empirical likelihood-based methods for the inference of the variance components in the presence of fixed effects. A nonparametric version of Wilks’ theorem for the proposed empirical likelihood ratio statistics for variance components is derived. We also develop an empirical likelihood test for multiple variance components related to a sequence of correlated outcomes. Simulation studies demonstrate that the proposed methods exhibit better type 1 error control than the commonly used likelihood ratio tests when the Gaussian distributional assumptions of the random effects are violated. We apply the methods to investigate the heritability of physical activity as measured by a wearable device in the Australian Twin study and observe that such activity is heritable only in the quantile range from 0.375 to 0.514.

Keywords Boundary value; Global test; Heritability; Nonparametric test; Wearable device data.

1 Introduction

Longitudinal and clustered data commonly arise from observational studies or clinical trials, where subjects are measured repeatedly over time or within a cluster. The repeated measures within a subject or a cluster are often correlated. To analyze such data, linear mixed-effects models that incorporate both fixed and random effects are widely used. Many statistical methods have been developed for such linear mixed-effects models, especially methods for inference of the fixed effects. However, inference on the variance components is less studied and often requires strong distributional assumptions on the random effects and the error terms. When the underlying distributions are known, classical inference methods, including the likelihood ratio tests and the score tests, can be applied. However, these parametric methods are often restrictive and not robust if the model assumptions are violated.

The empirical likelihood (el) method, as an alternative to parametric likelihood-based methods, was first proposed by Owen (1988) and has been applied to many statistical inference problems. Without a prespecified distributional assumption on the data, el methods incorporate side information through constraints or prior distributions and have favorable statistical properties, including but not limited to Bartlett correctability, transformation invariance, better coverage accuracy of the corresponding confidence intervals and greater power. A comprehensive introduction to el methods can be found in Owen (2001). el methods have been applied to inference for mixture models (Zou et al., 2002) and censored survival data (Chang and McKeague, 2016), and have also been considered for longitudinal data modeling. For example, You et al. (2006) proposed a block el method for inference of the regression parameters assuming a working independence covariance, and Xue and Zhu (2007) considered a semiparametric regression model, where the repeated within-subject measures are summarized as a function over time in order to address the dependence issue. Wang et al. (2010) proposed a generalized el method that takes into account the within-subject correlations. Li and Pan (2013) defined an empirical likelihood ratio (elr) test by utilizing the extended score from quadratic inference functions for longitudinal data, which does not involve direct estimation of the correlation parameters.

The el methods mentioned above focus only on the inference of fixed effects in linear mixed-effects models. In this paper, we consider a general setting of linear mixed-effects models and develop el methods for the inference of the variance components. Specifically, suppose there are $n$ subjects and denote by $n_{i}$ the number of repeated measures for the $i$ th subject. For the $i$ th subject, we observe a response vector $y_{i}\in R^{n_{i}}$, an $n_{i}\times p$ design matrix $X_{i}$ for the fixed effects $\beta^{*}\in R^{p}$, and $d$ positive semi-definite design matrices $\Phi_{iq}~(q=1,\cdots,d)$ of dimension $n_{i}\times n_{i}$ for the variance components $\theta^{*}\in(R_{+}\cup\{0\})^{d}$. The general linear mixed-effects model can be written as

y_{i}=X_{i}\beta^{*}+r_{i},\quad i=1,\cdots,n,

(1)

where $r_{i}\in R^{n_{i}}$ is a zero-mean random variable with variance-covariance $H_{i}(\theta^{*})$ . We assume that $H_{i}(\theta^{*})$ has a linear structure,

H_{i}(\theta^{*})=\sum_{q=1}^{d}\theta^{*}_{q}\Phi_{iq},\quad\theta^{*}=(\theta^{*}_{1},\cdots,\theta^{*}_{d})^{T}=(\theta^{*}_{1},\theta^{*T}_{(1)})^{T},

where $\theta^{*}=(\theta^{*}_{1},\cdots,\theta^{*}_{d})^{T}$ is the vector of the variance components. We emphasize that this general setting does not require any assumptions on the distributions of the data or the distributions of the random effects.

In many real applications, we are interested in making statistical inference on the variance components $\theta^{*}$ in model (1). For example, in the study of heritability based on twin data, each monozygotic or dizygotic twin pair is treated as one cluster, and the linear variance structure can be constructed based on the twin type (see details in Section 6). In the heritability analysis, a key question is whether there exists a genetic effect, which motivates us to study the inference of one of the variance components, say, $\theta^{*}_{1}$. We propose an el-based inference method for $\theta^{*}_{1}$ without any assumptions on the random components. The method can effectively account for the nuisance parameters, including the unknown fixed effects $\beta^{*}$ and the remaining variance components $\theta^{*}_{(1)}$. The key difficulty, when compared to the el inference of the fixed effects, is to deal with the boundary value problem when $\theta^{*}_{1}=0$ in the local testing problem $H_{0}:\theta^{*}_{1}=\theta_{1}^{0}$. To address this issue, we propose a new empirical likelihood ratio test by utilizing an unbiased estimator of $\beta^{*}$ under very mild conditions, and prove that the asymptotic distribution of the test statistic is a mixture of $\chi^{2}$ distributions.

Motivated by heritability analysis of the daily activity distribution as measured by wearable devices such as actigraphs, we also consider the setting in which linear mixed-effects models are fitted to a sequence of dependent outcomes. Wearable device data have been increasingly collected for continuous activity monitoring in large observational or experimental studies (Burton et al., 2013; Krane-Gartiser et al., 2014). In typical wearable activity tracking data, the activity is measured at one-minute resolution over several days for a given subject. Such wearable device data with repeated measures enable us to account for day-to-day variability of the activity. Instead of focusing on the activity counts at any minute of the day, the daily activity distribution or the amount of time with the activity count above a given threshold provides a biologically meaningful measure of the activity traits. When the activity counts are summarized as distributions, we can consider the activity quantile profile as a phenotype measure. In the analysis of such wearable device data, we fit a linear mixed-effects model for each activity level or quantile $y_{i}(t)$ at $t$. Denote by $\theta^{*}(t)$ the variance components for the activity profile at level $t$. We are then interested in testing the global null $H_{0}:\theta^{*}_{1}(t)\equiv\theta_{1}^{0},~t\in[t_{1},t_{2}]$. We develop a max-type statistic for this global testing problem. Since the numerator of the proposed empirical likelihood ratio (elr) tests can be rewritten as the sum of approximately independent random variables over different subjects, a random perturbation method is developed to approximate the $p$ -values of the proposed global test.

We first introduce some notation. Denote by $(A)_{-1}$ the submatrix of $A$ without the first column of $A$. For two vectors or matrices $A$ and $B$ of compatible dimension, define the inner product $\langle A,B\rangle=\text{tr}(A^{T}B)$. For a matrix $D_{m\times n}=(D_{1},\cdots,D_{n})$, where $D_{i}$ is the $i$ th column of $D$, the vectorized $D$ is defined by $(D_{1}^{T},\cdots,D_{n}^{T})^{T}$. Let $E(x)$ and $\text{var}(x)$ be the expectation and variance of a random vector $x$, and let $\text{cov}(x,y)$ be the covariance of random vectors $x$ and $y$. When $x$ is a random matrix, $E(x)$ and $\text{var}(x)$ represent the expectation and variance of the vectorized $x$. When $x$ and $y$ are random matrices, $\text{cov}(x,y)$ denotes the covariance of the vectorized $x$ and vectorized $y$. We use $a=O(b)$ to denote that $a$ and $b$ are of the same order, and $a=o(b)$ to denote that $a$ is of a smaller order than $b$. We use $x=O_{p}(y)$ to denote that $x$ and $y$ are of the same order in probability, and $x=o_{p}(y)$ to denote that $x$ is of a smaller order than $y$ in probability.

2 elr test for the fixed effects $\beta^{*}$

Statistical tests for the fixed effects in the linear mixed-effects model (1), $H_{0}:\beta^{*}=\beta_{0}$, have been well studied. We first briefly review the subject-wise el method proposed in Wang et al. (2010), where the covariance structure for each subject is considered.

Let $\hat{H}_{in}$ be an estimator of $H_{i}$, and assume that $\hat{H}_{in}$ converges to some $H_{i}^{*}$ in probability uniformly over all $i=1,\cdots,n$. One such estimator $\hat{H}_{in}$ can be obtained with a simple two-step procedure: first compute the residuals $\hat{r}_{i}=y_{i}-X_{i}\hat{\beta}$, where $\hat{\beta}$ is the least-squares estimator using working independence correlation matrices, and then solve the constrained optimization problem $\min_{\theta\geq 0}\sum_{i=1}^{n}\|H_{i}(\theta)-\hat{r}_{i}\hat{r}_{i}^{T}\|_{F}^{2}$. Let

\phi_{i}(\beta)=X_{i}^{T}\hat{H}_{in}^{-1}(y_{i}-X_{i}\beta),

which satisfies $E\{\phi_{i}(\beta)\}=0$ when $\beta$ is the true value. Denote by $p_{i}$ the point mass at the $i$ th subject. The nonparametric empirical likelihood is defined as

L_{0}(\beta)=\sup_{p_{i}}\Big{\{}\prod_{i=1}^{n}p_{i}:p_{i}\geq 0,\sum_{i=1}^{n}p_{i}=1,\sum_{i=1}^{n}p_{i}\phi_{i}(\beta)=0\Big{\}}.

Since it can be proved that $\max_{\beta}L_{0}(\beta)=1/n^{n}$ (Owen, 2001), Wang et al. (2010) proposed the elr statistic

\textsc{elr}_{0}(\beta_{0})=\frac{L_{0}(\beta_{0})}{\max_{\beta}L_{0}(\beta)}=n^{n}L_{0}(\beta_{0}).

To obtain the asymptotic distribution of the elr statistic, the following regularity conditions are needed.

Condition 1.

As $n\rightarrow\infty$ , $P(0\in ch\{\phi_{1}(\beta_{0}),\cdots,\phi_{n}(\beta_{0})\})\rightarrow 1$ , where $ch\{\}$ is the convex hull.

Condition 2.

The limit $\lim_{n\rightarrow\infty}n^{-1}\sum_{i=1}^{n}X_{i}^{T}H_{i}^{*-1}H_{i}H_{i}^{*-1}X_{i}$ exists and is positive definite.

Condition 3.

The expectations $E\|\phi_{i}(\beta_{0})\|_{2}^{2+\gamma_{1}}$ are bounded uniformly in $i$ for some $\gamma_{1}>0$.

Condition 4.

Let $\hat{G}_{in}=\hat{H}_{in}^{-1}$ with element $\hat{g}_{ijk}$ , $x_{ij}^{T}$ be the $j$ th row of $X_{i}$ , and $r_{ik}$ be the $k$ th element of $r_{i}$ . For each pair $i$ and $i^{\prime}$ with $i,i^{\prime}=1,\cdots,n$ and $i\neq i^{\prime}$ , $\hat{g}_{ijk}-\hat{g}_{-(i,i^{\prime})jk}=O_{p}(n^{-1})$ and sufficient moment conditions are satisfied so that $E(\hat{B}_{ii^{\prime}})=O(n^{-1})$ and $E(\hat{B}_{ii^{\prime}}\hat{B}_{ii^{\prime}}^{T})=O(n^{-2})$ , where $\hat{g}_{-(i,i^{\prime})jk}$ is $\hat{g}_{ijk}$ but computed with all the data except for subjects $i$ and $i^{\prime}$ and $\hat{B}_{ii^{\prime}}=\sum_{j=1}^{n_{i}}\sum_{k=1}^{n_{i}}(\hat{g}_{ijk}-\hat{g}_{-(i,i^{\prime})jk})x_{ij}r_{ik}$ .

Conditions 1–3 are common conditions for the empirical likelihood methods (Owen, 1991). Condition 4 assumes mild constraints on $\hat{H}_{in}^{-1}$ to ensure that the difference between the statistic $\textsc{elr}_{0}(\beta_{0})$ defined with $\hat{H}_{in}$ and the one using $H_{i}^{*}$ vanishes as $n\rightarrow\infty$. Under these regularity conditions, the following theorem provides the asymptotic distribution of the elr test $\textsc{elr}_{0}(\beta_{0})$ (Wang et al., 2010) under the null.

Theorem 1.

Under Conditions 1–4, as $n\rightarrow\infty$, $-2\log\textsc{elr}_{0}(\beta_{0})\rightarrow\chi_{p}^{2}$ in distribution under the null hypothesis $H_{0}:\beta^{*}=\beta_{0}$.

The asymptotic result only requires that $\hat{H}_{in}$ converge uniformly to some $H_{i}^{*}$, which may not be the true $H_{i}$ (Wang et al., 2010). When the correlation structure is correctly specified, the estimator $\hat{H}_{in}$ is a consistent estimator of $H_{i}^{*}=H_{i}$. The statistic defined with the true $H_{i}$ is asymptotically locally most powerful among all the choices of the weight matrices.

3 elr test for the variance component $\theta_{1}^{*}$

We consider the local test $H_{0}:\theta^{*}_{1}=\theta_{1}^{0}$ in the framework of the empirical likelihood, including the null $H_{0}:\theta_{1}^{*}=0$ , which is of the most interest. We define $r_{i}=y_{i}-X_{i}\beta^{*}$ and $R_{i}=r_{i}r_{i}^{T}$ . Since $E(r_{i})=0$ and $\text{var}(r_{i})=H_{i}(\theta^{*})$ , we have

R_{i}=H_{i}(\theta^{*})+\delta_{i}=\sum_{q=1}^{d}\theta^{*}_{q}\Phi_{iq}+\delta_{i},

where $E({\delta_{i}})=0$ and $\text{var}({\delta_{i}})$ exists. Since $\beta^{*}$ is unknown, we first need an estimator of $\beta^{*}$, denoted by $\hat{\beta}$. One simple choice is the least-squares estimator using all the data. Specifically, we stack the data from all subjects by denoting $X=(X_{1}^{T},\cdots,X_{n}^{T})^{T}$, $y=(y_{1}^{T},\cdots,y_{n}^{T})^{T}$, and $r=(r_{1}^{T},\cdots,r_{n}^{T})^{T}$. Model (1) can be rewritten as

y=X\beta^{*}+r.

Then the least-squares estimator is $\hat{\beta}=(X^{T}X)^{-1}X^{T}y$ . For $i=1,\cdots,n$ , let $\hat{r}_{i}=y_{i}-X_{i}\hat{\beta}=r_{i}+X_{i}(\beta^{*}-\hat{\beta}).$ We have

\begin{aligned}\hat{R}_{i}&=\hat{r}_{i}\hat{r}_{i}^{T}=r_{i}r_{i}^{T}+r_{i}(\beta^{*}-\hat{\beta})^{T}X_{i}^{T}+X_{i}(\beta^{*}-\hat{\beta})r_{i}^{T}+X_{i}(\beta^{*}-\hat{\beta})(\beta^{*}-\hat{\beta})^{T}X_{i}^{T}\\ &=R_{i}+\hat{\epsilon}_{i}=H_{i}(\theta^{*})+\delta_{i}+\hat{\epsilon}_{i},\end{aligned}

where $\hat{\epsilon}_{i}=r_{i}(\beta^{*}-\hat{\beta})^{T}X_{i}^{T}+X_{i}(\beta^{*}-\hat{\beta})r_{i}^{T}+X_{i}(\beta^{*}-\hat{\beta})(\beta^{*}-\hat{\beta})^{T}X_{i}^{T}$ .
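The pooled least-squares step and the residual products $\hat{R}_{i}$ can be sketched in a few lines. This is only an illustration under our own assumptions: the function name `ols_residual_products` and the list-of-arrays data layout are hypothetical and are not part of the original method description.

```python
import numpy as np

def ols_residual_products(X_list, y_list):
    """Pooled least squares over all subjects, then R_hat_i = r_hat_i r_hat_i^T.

    X_list[i] is the (n_i, p) design matrix X_i and y_list[i] the response y_i.
    (Hypothetical helper; the list-based layout is an assumption.)
    """
    X = np.vstack(X_list)                 # stacked design matrix X
    y = np.concatenate(y_list)            # stacked response y
    # beta_hat = (X^T X)^{-1} X^T y, computed via a least-squares solver
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    R_hat = []
    for X_i, y_i in zip(X_list, y_list):
        r_i = y_i - X_i @ beta_hat        # residual vector r_hat_i
        R_hat.append(np.outer(r_i, r_i))  # n_i x n_i residual product
    return beta_hat, R_hat
```

Each $\hat{R}_{i}$ returned here is the quantity that enters the trace inner products with the design matrices $\Phi_{iq}$.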

To control the rates of $E({\hat{\epsilon}_{i}})$ , $\text{cov}({r_{i}r_{i}^{T}},{\hat{\epsilon}_{j}})$ , and $\text{cov}({\hat{\epsilon}_{i}},{\hat{\epsilon}_{j}})$ , we need the following condition, which is also commonly used for empirical likelihood methods.

Condition 5.

The expectations $E\|r_{i}\|_{2}^{4+\gamma_{1}}$ are bounded uniformly in $i$ for some $\gamma_{1}>0$.

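Under Condition 5, the least-squares estimator $\hat{\beta}$ is accurate enough for the error terms $\hat{\epsilon}_{i}$ to be asymptotically negligible, as the following proposition shows.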

Proposition 1.

Assume that $n^{-1}X^{T}X\rightarrow\Sigma$ and $n^{-1/2}X^{T}r\xrightarrow{d}\eta$ as $n\rightarrow\infty$ , where $0<\|\Sigma\|_{2},\|\Sigma^{-1}\|_{2}<\infty$ , $E\eta=0$ and $E\|\eta\|_{2}^{4}=O(1)$ . When Condition 5 holds and $\hat{\beta}=(X^{T}X)^{-1}X^{T}y$ , we have $E({\hat{\epsilon}_{i}})=O(n^{-1}),$ $i=1,\cdots,n$ , and $\text{cov}({r_{i}r_{i}^{T}},{\hat{\epsilon}_{j}}),~{}\text{cov}({\hat{\epsilon}_{i}},{\hat{\epsilon}_{j}})=O(n^{-2})$ , $i,j=1,\cdots,n,i\neq j$ .

Let $\Xi=(\Xi_{kl})_{d\times d}$ with $\Xi_{kl}=\sum_{i=1}^{n}\text{tr}(\Phi_{ik}\Phi_{il})$ , and $\hat{\Upsilon}=(\hat{\Upsilon}_{1},\cdots,\hat{\Upsilon}_{d})^{T}$ with $\hat{\Upsilon}_{k}=\sum_{i=1}^{n}\text{tr}(\Phi_{ik}\hat{R}_{i})$ . We define

\hat{Z}_{i}(\theta_{1})=\text{tr}\Bigg{\{}\Phi_{i1}\Bigg{(}\hat{R}_{i}-\Phi_{i1}\theta_{1}-\sum_{q=2}^{d}\hat{\theta}_{q}\Phi_{iq}\Bigg{)}\Bigg{\}},~{}i=1,\cdots,n,

where

\hat{\theta}_{(1)}=(\hat{\theta}_{2},\cdots,\hat{\theta}_{d})^{T}=(\Xi^{-1})_{-1}^{T}\hat{\Upsilon}.

(2)

Since Proposition 1 implies $E\hat{Z}_{i}(\theta_{1})=O(n^{-1})$ if $\theta_{1}$ is the true value (see (A18) in the appendix), we define the nonparametric likelihood as

L(\theta_{1})=\max_{p_{i}}\Bigg{\{}\prod_{i=1}^{n}p_{i}|p_{i}\geq 0,\sum_{i=1}^{n}p_{i}=1,\sum_{i=1}^{n}p_{i}\hat{Z}_{i}(\theta_{1})=0\Bigg{\}}

and the corresponding elr statistic as

\textsc{elr}(\theta_{1}^{0})=\frac{L(\theta_{1}^{0})}{\max_{\theta_{1}\geq 0}L(\theta_{1})}.

(3)
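For a scalar constraint, the constrained maximization in $L(\theta_{1})$ reduces to a one-dimensional root-finding problem for a Lagrange multiplier $\lambda$, with optimal weights $p_{i}=1/[n\{1+\lambda\hat{Z}_{i}(\theta_{1})\}]$. The sketch below uses this standard el computation to evaluate $-2\log\{n^{n}L(\theta_{1})\}$. It only covers the numerator of (3), normalized by $n^{-n}$; it does not handle the boundary-constrained denominator. The function name and the bisection solver are our own choices for illustration.

```python
import math

def neg2_log_el_ratio(z, iters=200):
    """-2 log{n^n L} for the scalar constraint sum_i p_i z_i = 0.

    z holds the values Z_hat_i(theta_1) at a candidate theta_1.
    (Hypothetical helper; bisection is one of several possible solvers.)
    """
    zmin, zmax = min(z), max(z)
    if not (zmin < 0.0 < zmax):
        raise ValueError("0 is not inside the convex hull of z")
    # lambda must keep every 1 + lambda * z_i positive
    lo, hi = -1.0 / zmax, -1.0 / zmin
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        # g(lambda) = sum_i z_i / (1 + lambda z_i) is strictly decreasing
        g = sum(zi / (1.0 + lam * zi) for zi in z)
        if g > 0.0:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * zi) for zi in z)
```

The convex hull check in the first lines mirrors the role of the convex hull conditions: if all $\hat{Z}_{i}$ have the same sign, no weights satisfy the constraint.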

If the true value $\theta_{1}^{*}=0$ (i.e., the null hypothesis under the case $\theta_{1}^{0}=0$ ), the denominator in (3) would not be $1/n^{n}$ as usual owing to the boundary value issue, and thus the existing results are inapplicable. To derive the asymptotic distribution of the proposed test $\textsc{elr}(\theta_{1}^{0})$ , we assume the following condition similar to Condition 1.

Condition 6.

As $n\rightarrow\infty$ , $P(0\in ch\{Z_{1}(\theta_{1}^{0}),\cdots,Z_{n}(\theta_{1}^{0})\})\rightarrow 1$ , where $Z_{i}(\theta_{1}^{0})$ is defined as $\hat{Z}_{i}(\theta_{1}^{0})$ with $\hat{R}_{i}$ replaced by $R_{i}$ .

Under Conditions 5 and 6, we have the following theorem on the asymptotic distribution of the elr test under the null.

Theorem 2.

Let $\hat{c}_{n}(\theta_{1}^{0})=\hat{\nu}_{2n}^{2}(\theta_{1}^{0})/\hat{\nu}_{1n}^{2}(\theta_{1}^{0})$ , where $\hat{\nu}_{1n}^{2}(\theta_{1}^{0})$ is a consistent estimator of the asymptotic variance of $n^{-1/2}\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0})$ and $\hat{\nu}_{2n}^{2}(\theta_{1}^{0})=n^{-1}\sum_{i=1}^{n}\hat{Z}_{i}^{2}(\theta_{1}^{0})$ . If $\theta^{*}_{(1)}\in R_{+}^{d-1}$ , and Conditions 5 and 6 hold, as $n\rightarrow\infty$ , $\hat{c}_{n}(\theta_{1}^{0})$ $\left\{-2\log\textsc{elr}(\theta_{1}^{0})\right\}\rightarrow\chi_{1}^{2}$ in distribution when $\theta_{1}^{0}>0$ , and $\hat{c}_{n}(0)\left\{-2\log\textsc{elr}(0)\right\}\rightarrow U_{+}^{2}$ in distribution, where $U\sim N(0,1)$ and $U_{+}=\max(U,0)$ .
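The two limiting distributions in Theorem 2 translate into simple $p$ -value formulas. Since $U_{+}^{2}$ equals zero with probability $1/2$ and is otherwise $\chi_{1}^{2}$ distributed, its upper tail probability at $s>0$ is half that of $\chi_{1}^{2}$. A minimal sketch follows; the function names are hypothetical, and the $\chi_{1}^{2}$ tail is computed through the identity $P(\chi_{1}^{2}>s)=\text{erfc}\{(s/2)^{1/2}\}$.

```python
import math

def chi2_1_sf(s):
    """Upper tail probability P(chi^2_1 > s)."""
    return math.erfc(math.sqrt(s / 2.0)) if s > 0 else 1.0

def elr_p_value(stat, theta10):
    """Asymptotic p-value for the scaled statistic c_hat * (-2 log elr).

    theta10 > 0: chi^2_1 limit; theta10 == 0: limit U_+^2 (boundary case).
    (Hypothetical helper names.)
    """
    if theta10 > 0:
        return chi2_1_sf(stat)
    # U_+^2 puts mass 1/2 at zero, so a zero statistic gives p-value 1
    return 0.5 * chi2_1_sf(stat) if stat > 0 else 1.0
```

At the usual $\chi_{1}^{2}$ critical value 3.84, for example, the boundary test has a $p$ -value of about 0.025 rather than 0.05.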

Although the elr statistic $\hat{c}_{n}(\theta_{1}^{0})\left(-2\log\textsc{elr}(\theta_{1}^{0})\right)$ in Theorem 2 involves optimizations in the numerator and denominator, the following lemma shows that the statistic has an asymptotically equivalent expression that can be used to calculate the statistic efficiently.

Lemma 3.

If $\theta^{*}_{(1)}\in R_{+}^{d-1}$ , then under Conditions 5 and 6,

		$\displaystyle\hat{c}_{n}(\theta_{1}^{0})\left\{-2\log\textsc{elr}(\theta_{1}^{0})\right\}$
	$\displaystyle=$	$\displaystyle\begin{cases}\frac{\left\{n^{-1/2}\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0})\right\}^{2}}{\hat{\nu}_{1n}^{2}(\theta_{1}^{0})}+o_{p}(1),&\text{ if }\quad\theta_{1}^{0}>0,\\ \frac{\left\{n^{-1/2}\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0})\right\}^{2}}{\hat{\nu}_{1n}^{2}(\theta_{1}^{0})}I(\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0})\geq 0)+o_{p}(1),&\text{ if }\quad\theta_{1}^{0}=0.\end{cases}$

We next provide an estimator of the asymptotic variance of $n^{-1/2}\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0})$ . We rewrite $\Xi$ as

\Xi=\left(\begin{smallmatrix}E_{11}&E_{12}\\ E_{21}&E_{22}\end{smallmatrix}\right)

with $E_{11}$ being a scalar. Let $F=E_{22}^{-1}E_{21}=(F_{1},\cdots,F_{d-1})^{T}$ and $\alpha=1-E_{12}F/E_{11}\in(0,1]$ . It can be verified that

\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0})=\sum_{i=1}^{n}\hat{D}_{i}(\theta_{1}^{0})=\sum_{i=1}^{n}\hat{M}_{i}(\theta_{1}^{0}),

(4)

where

	$\displaystyle\hat{D}_{i}(\theta_{1}^{0})$	$\displaystyle=\alpha^{-1}\langle\Phi_{i1}-\sum_{q=1}^{d-1}F_{q}\Phi_{iq+1},\hat{R}_{i}-\theta_{1}^{0}\Phi_{i1}\rangle,$
	$\displaystyle\hat{M}_{i}(\theta_{1}^{0})$	$\displaystyle=\alpha^{-1}\langle\Phi_{i1}-\sum_{q=1}^{d-1}F_{q}\Phi_{iq+1},\hat{R}_{i}-H_{i}((\theta_{1}^{0},\hat{\theta}_{(1)}^{T})^{T})\rangle.$
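Identity (4) can be checked numerically. The sketch below computes $\Xi$, $\hat{\Upsilon}$, $\hat{\theta}_{(1)}$ from (2), $F$, $\alpha$, and the terms $\hat{Z}_{i}(\theta_{1}^{0})$ and $\hat{M}_{i}(\theta_{1}^{0})$ for arbitrary inputs. The function name, the nested-list layout of the $\Phi_{iq}$, and the assumption $d\geq 2$ are ours.

```python
import numpy as np

def inner(A, B):
    """Trace inner product <A, B> = tr(A^T B)."""
    return np.trace(A.T @ B)

def variance_component_terms(Phi, R_hat, theta1):
    """Return (Z_hat, M_hat, theta_hat_rest) for d >= 2 components.

    Phi[i] = [Phi_i1, ..., Phi_id]; R_hat[i] is the residual product of subject i.
    (Hypothetical helper; the data layout is an assumption.)
    """
    d = len(Phi[0])
    Xi = sum(np.array([[inner(P[k], P[l]) for l in range(d)]
                       for k in range(d)]) for P in Phi)
    Ups = sum(np.array([inner(P[k], R) for k in range(d)])
              for P, R in zip(Phi, R_hat))
    # Xi is symmetric, so rows 2..d of Xi^{-1} Ups equal (Xi^{-1})_{-1}^T Ups
    theta_rest = np.linalg.solve(Xi, Ups)[1:]
    E11, E12, E21, E22 = Xi[0, 0], Xi[0, 1:], Xi[1:, 0], Xi[1:, 1:]
    F = np.linalg.solve(E22, E21)
    alpha = 1.0 - E12 @ F / E11
    Z, M = [], []
    for P, R in zip(Phi, R_hat):
        H_rest = sum(t * Pq for t, Pq in zip(theta_rest, P[1:]))
        resid = R - theta1 * P[0] - H_rest          # R_hat_i - H_i(theta_1, theta_hat_(1))
        W = P[0] - sum(f * Pq for f, Pq in zip(F, P[1:]))
        Z.append(inner(P[0], resid))                # Z_hat_i(theta_1)
        M.append(inner(W, resid) / alpha)           # M_hat_i(theta_1)
    return np.array(Z), np.array(M), theta_rest
```

With any inputs for which $\Xi$ is invertible, `Z.sum()` and `M.sum()` should agree up to rounding error, although the individual terms differ.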

In addition, for $i\neq j$ ,

\text{cov}({\hat{R}_{i}},{\hat{R}_{j}})=\text{cov}({\delta_{i}}+{\hat{\epsilon}_{i}},{\delta_{j}}+{\hat{\epsilon}_{j}})=\text{cov}({\delta_{i}},{\hat{\epsilon}_{j}})+\text{cov}({\hat{\epsilon}_{i}},{\delta_{j}})+\text{cov}({\hat{\epsilon}_{i}},{\hat{\epsilon}_{j}})=O(n^{-2})

by Proposition 1. Therefore, ${\hat{D}_{i}(\theta_{1}^{0})}~(i=1,\cdots,n)$ are asymptotically independent with expectation $E(\hat{D}_{i}(\theta_{1}^{0}))=\alpha^{-1}\langle\Phi_{i1}-\sum_{q=1}^{d-1}F_{q}\Phi_{iq+1},\sum_{q=2}^{d}\theta_{q}^{*}\Phi_{iq}\rangle+O(n^{-1})$, while the expectations of $\hat{Z}_{i}(\theta_{1}^{0})$ and $\hat{M}_{i}(\theta_{1}^{0})$ are $O(n^{-1})$ (see (A18) in the appendix). We have

	$\displaystyle\hat{D}_{i}(\theta_{1}^{0})-E(\hat{D}_{i}(\theta_{1}^{0}))$	$\displaystyle=\alpha^{-1}\langle\Phi_{i1}-\sum_{q=1}^{d-1}F_{q}\Phi_{iq+1},\hat{R}_{i}-H_{i}((\theta_{1}^{0},(\theta_{(1)}^{*})^{T})^{T})\rangle+O(n^{-1})$
		$\displaystyle=\hat{M}_{i}(\theta_{1}^{0})+o_{p}(1).$

Therefore,

	$\displaystyle\text{var}\big{\{}n^{-1/2}\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0})\big{\}}=$	$\displaystyle\frac{1}{n}\sum_{i=1}^{n}\{\hat{D}_{i}(\theta_{1}^{0})-E(\hat{D}_{i}(\theta_{1}^{0}))\}^{2}+o_{p}(1)$
	$\displaystyle=$	$\displaystyle\frac{1}{n}\sum_{i=1}^{n}\hat{M}_{i}(\theta_{1}^{0})^{2}+o_{p}(1),$		(5)

which leads to a consistent estimator of the variance of $n^{-1/2}\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0})$ as

\hat{\nu}_{1n}^{2}(\theta_{1}^{0})=n^{-1}\sum_{i=1}^{n}\hat{M}_{i}(\theta_{1}^{0})^{2}.
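Combining this variance estimator with the asymptotically equivalent expression in Lemma 3 gives a statistic that requires no optimization. A minimal sketch, assuming the values $\hat{Z}_{i}(\theta_{1}^{0})$ and $\hat{M}_{i}(\theta_{1}^{0})$ have already been computed (the function name is hypothetical):

```python
def elr_statistic(Z, M, theta10):
    """Asymptotically equivalent form of c_hat * (-2 log elr) from Lemma 3.

    Z, M: sequences of Z_hat_i(theta10) and M_hat_i(theta10).
    (Hypothetical helper name.)
    """
    n = len(Z)
    total = sum(Z)
    nu1_sq = sum(m * m for m in M) / n      # variance estimator nu_hat_{1n}^2
    stat = (total * total / n) / nu1_sq     # {n^{-1/2} sum Z_hat_i}^2 / nu_hat_{1n}^2
    if theta10 == 0 and total < 0:
        return 0.0                          # indicator I(sum Z_hat_i >= 0) is zero
    return stat
```

In the boundary case the statistic is set to zero whenever the summed estimating function is negative, which is what produces the point mass at zero in the limit $U_{+}^{2}$.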

4 Variance Component Analysis Over a Sequence of Responses

In some applications, we are interested in testing whether the variance components are all zero over a set of possibly correlated outcomes. One example of such applications is to test the variance components for the activity distribution based on wearable device data where we are interested in testing the variance component at each of the quantiles $t$ of the activity distribution. Extending model (1), we assume the following outcome model at level $t$ ,

y_{i}(t)=X_{i}\beta^{*}(t)+r_{i}(t),\quad i=1,\cdots,n,

(6)

where $r_{i}(t)\in R^{n_{i}}$ is a zero-mean random variable with variance $H_{i}(\theta^{*}(t))$ . We assume that $H_{i}(\theta^{*}(t))$ has the same linear structure for each $t$ ,

H_{i}(\theta^{*}(t))=\sum_{q=1}^{d}\theta^{*}_{q}(t)\Phi_{iq},\quad\theta^{*}(t)=\{\theta^{*}_{1}(t),\cdots,\theta^{*}_{d}(t)\}^{T}=\{\theta^{*}_{1}(t),\theta^{*T}_{(1)}(t)\}^{T}.

We are interested in testing the null $H_{0}:\theta^{*}_{1}(t)\equiv\theta_{1}^{0},~{}t\in[t_{1},t_{2}]$ , where $[t_{1},t_{2}]$ is a pre-defined interval. We propose the following maximally selected empirical likelihood ratio statistic (gelr),

\Gamma=\sup_{t\in[t_{1},t_{2}]}\hat{c}_{n}(\theta_{1}^{0},t)\left\{-2\log\textsc{elr}(\theta_{1}^{0},t)\right\},

(7)

where $\hat{c}_{n}(\theta_{1}^{0},t)\left\{-2\log\textsc{elr}(\theta_{1}^{0},t)\right\}$ is the elr statistic for the outcome at $t$. It can be shown that $\Gamma=\sup_{t\in[t_{1},t_{2}]}S(t)+o_{p}(1),$ with

S(t)=\begin{cases}\frac{\{n^{-1/2}\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0},t)\}^{2}}{\hat{\nu}_{1n}^{2}(\theta_{1}^{0},t)},&\text{ if }\theta_{1}^{0}>0,\\ \frac{\{n^{-1/2}\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0},t)\}^{2}}{\hat{\nu}_{1n}^{2}(\theta_{1}^{0},t)}I\{\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0},t)\geq 0\},&\text{ if }\theta_{1}^{0}=0,\end{cases}

where

	$\displaystyle\hat{Z}_{i}(\theta_{1}^{0},t)$	$\displaystyle=\text{tr}\Bigg{\{}\Phi_{i1}\Bigg{(}\hat{R}_{i}(t)-\Phi_{i1}\theta_{1}^{0}-\sum_{q=2}^{d}\hat{\theta}_{q}(t)\Phi_{iq}\Bigg{)}\Bigg{\}},$
	$\displaystyle\hat{\nu}_{1n}^{2}(\theta_{1}^{0},t)$	$\displaystyle=n^{-1}\alpha^{-2}\sum_{i=1}^{n}\Bigg{\langle}\hat{R}_{i}(t)-H_{i}((\theta_{1}^{0},\hat{\theta}_{(1)}(t)^{T})^{T}),\Phi_{i1}-\sum_{q=1}^{d-1}F_{q}\Phi_{iq+1}\Bigg{\rangle}^{2}.$

Assessing the statistical significance of the statistic $\Gamma$ defined in (7) is challenging because of the dependence among the $\hat{Z}_{i}(\theta_{1}^{0},t)$ across $t$. We propose a simple way of evaluating its significance by perturbing the el statistic. Specifically, we apply (4) to rewrite $\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0},t)$ as $\sum_{i=1}^{n}\hat{M}_{i}(\theta_{1}^{0},t)$, where $\hat{M}_{i}(\theta_{1}^{0},t)=\alpha^{-1}\left\langle\Phi_{i1}-\sum_{q=1}^{d-1}F_{q}\Phi_{iq+1},\hat{R}_{i}(t)-H_{i}((\theta_{1}^{0},\hat{\theta}_{(1)}^{T}(t))^{T})\right\rangle$. We approximate the null distribution of $\Gamma$ by generating perturbed versions $\Gamma^{(g)}$ of the test statistic. Specifically, for each $g~(g=1,\cdots,G)$, we generate $\xi_{i}^{(g)}~(i=1,\cdots,n)$ independently from the standard normal distribution and define

S^{(g)}(t)=\begin{cases}\frac{\{n^{-1/2}\sum_{i=1}^{n}\hat{M}_{i}(\theta_{1}^{0},t)\xi_{i}^{(g)}\}^{2}}{\hat{\nu}_{1n}^{2}(\theta_{1}^{0},t)},&\text{ if }\theta_{1}^{0}>0,\\ \frac{\{n^{-1/2}\sum_{i=1}^{n}\hat{M}_{i}(\theta_{1}^{0},t)\xi_{i}^{(g)}\}^{2}}{\hat{\nu}_{1n}^{2}(\theta_{1}^{0},t)}I\{\sum_{i=1}^{n}\hat{M}_{i}(\theta_{1}^{0},t)\xi_{i}^{(g)}\geq 0\},&\text{ if }\theta_{1}^{0}=0.\end{cases}

Define the corresponding perturbed test statistic as $\Gamma^{(g)}=\sup_{t\in[t_{1},t_{2}]}S^{(g)}(t)$. Proposition 2 below shows that the perturbed sums asymptotically match the original sums in mean, variance, and covariance across $t$, so that the perturbed test statistics approximate the distribution of the original test statistic under the null. Therefore, the $p$ -value of $\Gamma$ can be approximated by $\sum_{g=1}^{G}I(\Gamma^{(g)}>\Gamma)/G$.

Proposition 2.

$\hat{M}_{i}(\theta_{1}^{0},t)$ satisfies

(i) $E\{n^{-1/2}\sum_{i=1}^{n}\hat{M}_{i}(\theta_{1}^{0},t)\xi_{i}^{(g)}\}-E\{n^{-1/2}\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0},t)\}=o(1)$;

(ii) $\text{var}\{n^{-1/2}\sum_{i=1}^{n}\hat{M}_{i}(\theta_{1}^{0},t)\xi_{i}^{(g)}\}-\text{var}\{n^{-1/2}\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0},t)\}=o(1)$;

(iii) $\text{cov}\{n^{-1/2}\sum_{i=1}^{n}\hat{M}_{i}(\theta_{1}^{0},s)\xi_{i}^{(g)},n^{-1/2}\sum_{j=1}^{n}\hat{M}_{j}(\theta_{1}^{0},t)\xi_{j}^{(g)}\}-\text{cov}\{n^{-1/2}\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0},s),n^{-1/2}\sum_{j=1}^{n}\hat{Z}_{j}(\theta_{1}^{0},t)\}=o(1)$.
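The perturbation procedure of this section can be sketched as follows. We assume that the values $\hat{M}_{i}(\theta_{1}^{0},t)$ have been computed on a finite grid of $t$ values and stored in an $n\times T$ array, and that the supremum over $[t_{1},t_{2}]$ is approximated by the maximum over this grid; the function name, the grid approximation, and the seed argument are our own assumptions. The observed statistic uses $S(t)$ together with identity (4).

```python
import numpy as np

def perturbation_p_value(M, theta10, G=1000, seed=0):
    """Approximate p-value of the max-type statistic by random perturbation.

    M[i, j] = M_hat_i(theta10, t_j) for subject i on a grid t_1, ..., t_T.
    (Hypothetical helper; the sup over t is replaced by a max over the grid.)
    """
    M = np.asarray(M, dtype=float)
    n = M.shape[0]
    nu1_sq = (M ** 2).mean(axis=0)              # nu_hat_{1n}^2(theta10, t_j)

    def sup_stat(sums):
        S = sums ** 2 / nu1_sq
        if theta10 == 0:                        # boundary case: one-sided indicator
            S = np.where(sums >= 0, S, 0.0)
        return S.max(axis=-1)

    # observed statistic, using sum_i Z_hat_i = sum_i M_hat_i from (4)
    gamma_obs = sup_stat(M.sum(axis=0) / np.sqrt(n))
    # the same multipliers xi_i^{(g)} are shared across all t within a replicate
    xi = np.random.default_rng(seed).standard_normal((G, n))
    gamma_pert = sup_stat(xi @ M / np.sqrt(n))
    return gamma_obs, float(np.mean(gamma_pert > gamma_obs))
```

Sharing the multipliers across $t$ within each replicate is what lets the perturbed process reproduce the covariance structure in part (iii) of Proposition 2.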

5 Simulation studies

5.1 Data generation

We examine the performance of the proposed empirical likelihood ratio tests for variance components and compare the results with the standard likelihood ratio (lr) test assuming Gaussian random effects and Gaussian errors. To mimic the twin design in the heritability analysis of wearable device data that we analyze next, we simulate data on monozygotic and dizygotic twin pairs. For the $i$ th twin pair, let $n_{i}=n_{i1}+n_{i2}$, where $n_{i1}$ and $n_{i2}$ are the numbers of repeated measures for the two twins. In wearable device data, $y_{i}(t)$ represents the $t$ th quantile of the activity distributions over $n_{i}$ days. The data are generated from the model:

y_{i}(t)=X_{i}\beta(t)+T_{i}a_{i}(t)+\tau_{i}(t),\quad i=1,\cdots,n,\quad t=s_{1},s_{2},\cdots,s_{m},

(8)

where $T_{i}=\text{blkdiag}\{\mathbf{1}_{n_{i1}},\mathbf{1}_{n_{i2}}\}$, $a_{i}(t)$ is a random intercept, and $\tau_{i}(t)$ denotes zero-mean noise with variance $\sigma_{M}^{2}(t)\mathbf{I}_{n_{i}}$. Here, $a_{i}(t)$ is assumed to be the sum of the additive genetic effect $g_{i}(t)$, the common environment effect $c_{i}(t)$, and the unique subject-specific environment effect $e_{i}(t)$, i.e.,

a_{i}(t)=g_{i}(t)+c_{i}(t)+e_{i}(t),

where $g_{i}(t),c_{i}(t),e_{i}(t)$ are independent zero-mean random variables with variance-covariance $\sigma_{A}^{2}(t)K_{i}$ , $\sigma_{C}^{2}(t)\Lambda_{i}$ , and $\sigma_{E}^{2}(t)\mathbf{I}_{2}$ , respectively. The variance components $\sigma_{A}^{2}(t),\sigma_{C}^{2}(t)$ , and $\sigma_{E}^{2}(t)$ represent the additive genetic variance, common environmental variance, and unique environmental variance, respectively. For the $i$ th twin, $K_{i}$ is a genetic similarity matrix with

K_{i}=\begin{pmatrix}1&1\\ 1&1\end{pmatrix}\mbox{ for monozygotic twin},\mbox{ }K_{i}=\begin{pmatrix}1&0.5\\ 0.5&1\end{pmatrix}\mbox{ for dizygotic twin},

and $\Lambda_{i}$ quantifies shared environment between the twin pair with

\Lambda_{i}=\begin{pmatrix}1&1\\ 1&1\end{pmatrix}.

Under this model, we have

	$\displaystyle H_{i}(\theta^{*}(t))$	$\displaystyle=\sigma_{A}^{2}(t)T_{i}K_{i}T_{i}^{T}+\sigma_{C}^{2}(t)T_{i}\Lambda_{i}T_{i}^{T}+\sigma_{E}^{2}(t)T_{i}T_{i}^{T}+\sigma_{M}^{2}(t)\mathbf{I}_{n_{i}},$
	$\displaystyle~{}\theta^{*}(t)$	$\displaystyle=(\sigma_{A}^{2}(t),\sigma_{C}^{2}(t),\sigma_{E}^{2}(t),\sigma_{M}^{2}(t))^{T}.$
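In the notation of model (1), this corresponds to $d=4$ design matrices $\Phi_{i1}=T_{i}K_{i}T_{i}^{T}$, $\Phi_{i2}=T_{i}\Lambda_{i}T_{i}^{T}$, $\Phi_{i3}=T_{i}T_{i}^{T}$, and $\Phi_{i4}=\mathbf{I}_{n_{i}}$. A minimal sketch of their construction for one twin pair is given below; the function names are hypothetical.

```python
import numpy as np

def twin_design_matrices(n1, n2, monozygotic):
    """Design matrices [Phi_A, Phi_C, Phi_E, Phi_M] for one twin pair.

    n1, n2: numbers of repeated measures for the two twins.
    (Hypothetical helper name.)
    """
    T = np.zeros((n1 + n2, 2))      # T_i = blkdiag(1_{n1}, 1_{n2})
    T[:n1, 0] = 1.0
    T[n1:, 1] = 1.0
    if monozygotic:
        K = np.array([[1.0, 1.0], [1.0, 1.0]])
    else:
        K = np.array([[1.0, 0.5], [0.5, 1.0]])
    Lam = np.ones((2, 2))           # shared environment matrix Lambda_i
    return [T @ K @ T.T, T @ Lam @ T.T, T @ T.T, np.eye(n1 + n2)]

def covariance(theta, Phi):
    """H_i(theta) = sum_q theta_q Phi_iq."""
    return sum(t * P for t, P in zip(theta, Phi))
```

For a monozygotic pair the first two matrices coincide, so within this structure the dizygotic pairs are what separate $\sigma_{A}^{2}(t)$ from $\sigma_{C}^{2}(t)$.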

We sample $n_{ik}~{}(i=1,\cdots,n;k=1,2)$ from $\{3,4,5,6,7\}$ with equal probability $1/5$ . We set $n=100$ , among which there are 50 monozygotic twin families and 50 dizygotic twin families, and $t=0.01,0.03,0.05,\cdots,0.99$ . Denote by $x_{ij}^{T}$ the $j$ th row of $X_{i}$ . Let $x_{ij}=(1,x_{ij1},x_{ij2})^{T}$ with $x_{ij1}\sim N(0,1)$ and $x_{ij2}\sim N(2,1)$ . Moreover, we set $\beta(t)=(\beta_{1}(t),\beta_{2}(t),\beta_{3}(t))^{T}$ where $\beta_{1}(t)$ and $\beta_{3}(t)$ are the quantile functions of $N(1,6)$ and $N(1,9)$ , respectively, and $\beta_{2}(t)=0$ . Let

\sigma_{A}^{2}(t)=\sum_{l=1}^{N_{a}}\lambda_{l}^{a}(t)(\psi_{l}^{a}(t))^{2},\quad\sigma_{C}^{2}(t)=0,\quad\sigma_{E}^{2}(t)=\sum_{l=1}^{N_{e}}\lambda_{l}^{e}(t)(\psi_{l}^{e}(t))^{2},

where $N_{a}=N_{e}=2$ , $(\lambda_{1}^{a}(t),\lambda_{2}^{a}(t))=C_{a}(t)(0.5,1)$ , $(\lambda_{1}^{e}(t),\lambda_{2}^{e}(t))=C_{e}(t)(0.6,0.9)$ , $\psi_{1}^{a}(t)=\psi_{2}^{e}(t)=\sqrt{2}\sin(2\pi t)$ , $\psi_{2}^{a}(t)=\psi_{1}^{e}(t)=\sqrt{2}\cos(2\pi t)$ .
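With these choices the variance functions have simple closed forms, for example $\sigma_{A}^{2}(t)=C_{a}(t)\{1+\cos^{2}(2\pi t)\}$. The sketch below evaluates both functions on the simulation grid. For illustration we treat $C_{a}$ and $C_{e}$ as constants in $t$, which is an assumption on our part; the function names are hypothetical.

```python
import math

def sigma_A2(t, C_a):
    """Additive genetic variance sigma_A^2(t) with (lambda_1, lambda_2) = C_a * (0.5, 1)."""
    psi1 = math.sqrt(2.0) * math.sin(2.0 * math.pi * t)
    psi2 = math.sqrt(2.0) * math.cos(2.0 * math.pi * t)
    return C_a * (0.5 * psi1 ** 2 + 1.0 * psi2 ** 2)

def sigma_E2(t, C_e):
    """Unique environmental variance sigma_E^2(t) with (lambda_1, lambda_2) = C_e * (0.6, 0.9)."""
    psi1 = math.sqrt(2.0) * math.cos(2.0 * math.pi * t)
    psi2 = math.sqrt(2.0) * math.sin(2.0 * math.pi * t)
    return C_e * (0.6 * psi1 ** 2 + 0.9 * psi2 ** 2)

# simulation grid t = 0.01, 0.03, ..., 0.99
grid = [0.01 + 0.02 * k for k in range(50)]
genetic_variance = [sigma_A2(t, C_a=0.1) for t in grid]
```

Setting `C_a=0` makes $\sigma_{A}^{2}(t)$ vanish for every $t$, which gives the null model for the genetic variance component.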

5.2 elr test for single variance component

For each $t\in\{0.01,0.03,0.05,\cdots,0.99\}$ , we test the null hypothesis $H_{0}:\sigma_{A}^{2}(t)=0$ . We consider the following two types of distributions for $g_{i}(t),c_{i}(t),e_{i}(t)$ and $\tau_{i}(t)$ :

(i) multivariate normal distribution, $g_{i}(t)\stackrel{{\scriptstyle iid}}{{\sim}}\mathcal{N}(\mathbf{0},\sigma_{A}^{2}(t)K_{i})$, $c_{i}(t)\stackrel{{\scriptstyle iid}}{{\sim}}\mathcal{N}(\mathbf{0},\sigma_{C}^{2}(t)\Lambda_{i})$, $e_{i}(t)\stackrel{{\scriptstyle iid}}{{\sim}}\mathcal{N}(\mathbf{0},\sigma_{E}^{2}(t)\mathbf{I}_{2})$, $\tau_{i}(t)\stackrel{{\scriptstyle iid}}{{\sim}}\mathcal{N}(\mathbf{0},0.3\mathbf{I}_{n_{i}})$;

(ii) multivariate $t$ distribution, $g_{i}(t)\stackrel{{\scriptstyle iid}}{{\sim}}t_{3}(\mathbf{0},\sigma_{A}^{2}(t)K_{i}/3)$, $c_{i}(t)\stackrel{{\scriptstyle iid}}{{\sim}}t_{3}(\mathbf{0},\sigma_{C}^{2}(t)\Lambda_{i}/3)$, $e_{i}(t)\stackrel{{\scriptstyle iid}}{{\sim}}t_{3}(\mathbf{0},\sigma_{E}^{2}(t)\mathbf{I}_{2}/3)$, $\tau_{i}(t)\stackrel{{\scriptstyle iid}}{{\sim}}t_{3}(\mathbf{0},0.1\mathbf{I}_{n_{i}})$.
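The factor $1/3$ in setting (ii) appears to be chosen so that both settings share the same covariances: if the second argument of $t_{3}$ is read as a scale matrix $\Sigma$ (our reading of the notation), the covariance is $3\Sigma$, so a scale of $\sigma_{A}^{2}(t)K_{i}/3$ yields covariance $\sigma_{A}^{2}(t)K_{i}$, and a scale of $0.1\mathbf{I}_{n_{i}}$ yields $0.3\mathbf{I}_{n_{i}}$. A sketch of one standard way to draw such vectors, as a normal vector divided by the square root of an independent scaled $\chi^{2}$ variable, is given below; the function names are hypothetical.

```python
import numpy as np

def multivariate_t_cov(scale, df=3):
    """Covariance of a multivariate t with the given scale matrix (requires df > 2)."""
    return df / (df - 2.0) * np.asarray(scale, dtype=float)

def sample_multivariate_t(scale, df, size, rng):
    """Draw `size` vectors from a zero-mean multivariate t with scale matrix `scale`.

    (Hypothetical helper; uses the normal / sqrt(chi^2 / df) representation.)
    """
    scale = np.asarray(scale, dtype=float)
    z = rng.multivariate_normal(np.zeros(scale.shape[0]), scale, size=size)
    w = rng.chisquare(df, size=size) / df      # one mixing variable per draw
    return z / np.sqrt(w)[:, None]
```

Because a $t_{3}$ distribution has no finite fourth moment, empirical covariances of such draws converge slowly, which makes this setting a demanding check of robustness.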

Figure 1: Type 1 error for each given value of $t$. Top: normal random effects and errors; bottom: $t$ -distributed random effects and errors. elr: el ratio test with the least-squares estimate of $\beta^{*}$; lr: likelihood ratio test under the normal random-effects assumptions.

Denote by elr the proposed empirical likelihood ratio test with unknown $\beta^{*}(t)$. We first consider the type 1 error for each given value of $t$. We consider the setting $C_{a}(t)=0$ and $C_{e}(t)=0.1$, with random errors generated from either a normal or a $t$ distribution. We repeat the simulations 500 times. Figure 1 gives the type 1 errors for different values of $t$. Both the elr and lr tests perform well under normal errors. However, lr shows inflated type 1 errors when the errors follow a heavy-tailed $t$ distribution.

To evaluate the power of the proposed tests, we consider the model with $C_{a}(t)=0.1$ and $C_{e}(t)=0.1$. We calculate the empirical power of the proposed test at the 0.05 level for different values of $t$ and present the results in Figure 2. Under normal errors, the proposed method exhibits power similar to that of the lr test. When the errors follow a $t$ distribution, we do not report the result of the lr test because of its inflated type 1 error, as shown in Figure 1 (b); the elr test does not lose much power in this setting.

5.3 elr test for variance components over an interval

To evaluate the proposed tests for variance components in the case of correlated outcomes over an interval of $t$ , we generate data as follows. Let $g_{i}(t)=\sigma_{A}(t)\zeta_{ai},c_{i}(t)=\sigma_{C}(t)\zeta_{ci}$ , and $e_{i}(t)=\sigma_{E}(t)\zeta_{ei}$ . Let $\tau_{i}(t)=\sigma_{M}(t)\zeta_{\tau i}$ , where $\sigma_{M}^{2}(t)=\sum_{l=1}^{N_{m}}\lambda_{l}^{m}(t)(\psi_{l}^{m}(t))^{2}$ with $N_{m}=2$ , $(\lambda_{1}^{m}(t),\lambda_{2}^{m}(t))=C_{m}(t)(0.5,1)$ , $\psi_{1}^{m}(t)=\sqrt{2}\cos(2\pi t)$ , and $\psi_{2}^{m}(t)=\sqrt{2}\sin(2\pi t)$ . We consider two types of distributions for $\zeta_{ai},\zeta_{ci},\zeta_{ei},\zeta_{\tau i}$ :

(i) multivariate normal distribution, $\zeta_{ai}\stackrel{{\scriptstyle iid}}{{\sim}}\mathcal{N}(\mathbf{0},K_{i})$, $\zeta_{ci}\stackrel{{\scriptstyle iid}}{{\sim}}\mathcal{N}(\mathbf{0},\Lambda_{i})$, $\zeta_{ei}\stackrel{{\scriptstyle iid}}{{\sim}}\mathcal{N}(\mathbf{0},\mathbf{I}_{2})$, $\zeta_{\tau i}\stackrel{{\scriptstyle iid}}{{\sim}}\mathcal{N}(\mathbf{0},\mathbf{I}_{2})$;

(ii) multivariate $t$ distribution, $\zeta_{ai}\stackrel{{\scriptstyle iid}}{{\sim}}t_{3}(\mathbf{0},K_{i}/3)$, $\zeta_{ci}\stackrel{{\scriptstyle iid}}{{\sim}}t_{3}(\mathbf{0},\Lambda_{i}/3)$, $\zeta_{ei}\stackrel{{\scriptstyle iid}}{{\sim}}t_{3}(\mathbf{0},\mathbf{I}_{2}/3)$, $\zeta_{\tau i}\stackrel{{\scriptstyle iid}}{{\sim}}t_{3}(\mathbf{0},\mathbf{I}_{2}/3)$.

Denote by gELR the proposed global empirical likelihood ratio test with unknown $\beta^{*}(t)$ . We first consider the global test $H_{0}:\sigma_{A}^{2}(t)\equiv 0,~{}t\in[0,1]$ . Let $C_{a}(t)=c_{0}I(t=0.49)$ , $C_{e}(t)=0.1$ , and $C_{m}(t)=0.08$ , where $I(\cdot)$ is an indicator function. We consider different choices of the signal size $c_{0}$ by setting $c_{0}=0,0.02,0.04,0.06,0.08,0.1$ , and generate 500 datasets for each setting. Figure 3 presents the empirical power of gELR at 0.05 significance level under different distributions of random errors and different $c_{0}$ . As expected, the empirical power of rejecting the null hypothesis increases with the signal size. Compared to the results under the multivariate $t$ distribution, the proposed test gELR has higher power when data are normally distributed.

To further evaluate the type 1 error and the power, we consider models with $C_{a}(t)=0.08I(t\in\{0.47,0.49,0.51,0.53\})$, $C_{e}(t)=0.1$, and $C_{m}(t)=0.08$. We test each of the candidate intervals of lengths $\{3,4,5,6\}$ and denote the corresponding scans by scan3, scan4, scan5, and scan6, respectively. Let $\mathcal{J}_{k}$ be the set of candidate intervals under the scanning length $k$ ($k=3,4,5,6$) and let $\mathcal{J}=\cup_{k=3}^{6}\mathcal{J}_{k}$ be the set of all candidate intervals. For each candidate interval $L\in\mathcal{J}$, we test the null hypothesis $H_{0}:\sigma_{A}^{2}(t)\equiv 0,~{}t\in L$. The signal in the interval $L$ is significant if

h(\Gamma_{L})=\frac{\Gamma_{L}-\bar{\Gamma}_{L}}{\sqrt{\sum_{g=1}^{G}(\Gamma_{L}^{(g)}-\bar{\Gamma}_{L})^{2}/(G-1)}}>\sqrt{2\log|\mathcal{J}|},

where $\bar{\Gamma}_{L}=(\sum_{g=1}^{G}\Gamma_{L}^{(g)})/G$ with $G=1000$ . The threshold $\sqrt{2\log|\mathcal{J}|}$ is selected based on the extreme value distribution of $|\mathcal{J}|$ normal random variables.
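To make the decision rule concrete, the sketch below computes the number of candidate intervals, the threshold $\sqrt{2\log|\mathcal{J}|}$, and the standardized statistic $h(\Gamma_{L})$. It assumes that an interval of length $k$ consists of $k$ consecutive points of the 50-point grid $t=0.01,0.03,\cdots,0.99$, which is our reading of the setup rather than a statement from the text; the function names are hypothetical.

```python
import math
import statistics


def count_candidate_intervals(n_grid, lengths):
    """Number of windows of k consecutive grid points, summed over k."""
    return sum(n_grid - k + 1 for k in lengths if k <= n_grid)


def scan_threshold(n_intervals):
    """Threshold sqrt(2 * log|J|) for |J| candidate intervals."""
    return math.sqrt(2.0 * math.log(n_intervals))


def standardized_statistic(gamma_obs, gamma_perturbed):
    """h(Gamma_L): center and scale the observed statistic by the mean and
    the sample standard deviation (divisor G - 1) of the G perturbed copies."""
    center = statistics.fmean(gamma_perturbed)
    spread = statistics.stdev(gamma_perturbed)
    return (gamma_obs - center) / spread


n_intervals = count_candidate_intervals(50, [3, 4, 5, 6])
threshold = scan_threshold(n_intervals)
```

Under this reading there are $48+47+46+45=186$ candidate intervals, so the threshold is about 3.23.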

Under each type of error distribution, 500 datasets are generated. For the global test on the candidate interval $L=\{t_{1},t_{1}+0.02,\cdots,t_{2}\}$, we mark its empirical power at $(t_{1}+t_{2})/2$. The results are shown in Figure 4. The proposed global test gELR exhibits high power if the interval contains at least one point $t$ with a nonzero signal and shows almost no power otherwise.

6 Application to genetic heritability analysis of physical activity distribution

6.1 Description of the data

We apply the methods to data from the Australian Twin study, which includes 366 healthy twins, of whom 151 are monozygotic and 215 are dizygotic. The participants wore actigraphy devices that tracked their physical activity for no more than 14 days. The minute-by-minute activity counts derived from actigraphy were collected in a 1440-dimensional vector per day. Since we are interested in the heritability of the activity distributions, we obtain the empirical quantiles of activity counts at the levels $t=1/144,~{}2/144,\cdots,144/144$. Specifically, for the $j$th measurement (day) from the $k$th person in the $i$th twin family, the raw data $\xi_{ikj}=(\xi_{ikj1},\cdots,\xi_{ikj1440})^{T}$ from the wearable device are transformed by using

\tilde{\xi}_{ikj}=\log(9250\xi_{ikj}+1),\quad i=1,\cdots,n;~{}k=1,2;~{}j=1,\cdots,n_{ik}.

For the $k$ th person in the $i$ th twin family, the $j$ th repeated measure of $t$ -quantile of activity counts is obtained as

y_{ikj}(t)=\tilde{\xi}_{ikj}^{[1440\cdot t]},\quad t=1/144,~{}2/144,\cdots,144/144,

where $\tilde{\xi}_{ikj}^{[s]}$ denotes the $s$ th order statistic of $\tilde{\xi}_{ikj}$ .
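The transformation and quantile extraction for one person-day can be sketched as follows. This is a minimal illustration with a hypothetical function name; it assumes that the order-statistic index $1440\cdot t$ is 1-based, so that $t=k/144$ selects the $10k$th smallest transformed value.

```python
import math


def activity_quantiles(xi, n_quantiles=144):
    """Return y(t) for t = 1/144, ..., 144/144 from one day of raw counts.

    xi: the 1440 minute-level activity counts of a single day.
    """
    if len(xi) % n_quantiles != 0:
        raise ValueError("length of xi must be a multiple of n_quantiles")
    # Log transformation applied to every minute, then sorted to obtain
    # the order statistics.
    transformed = sorted(math.log(9250.0 * v + 1.0) for v in xi)
    step = len(xi) // n_quantiles  # 10 when len(xi) == 1440
    # The (step * k)th order statistic, converted to a 0-based index.
    return [transformed[step * k - 1] for k in range(1, n_quantiles + 1)]


# Synthetic input for illustration only (not study data).
y = activity_quantiles([float(s) for s in range(1440)])
```

The last entry of the returned list is the daily maximum of the transformed counts, since $t=144/144$ selects the 1440th order statistic.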

The covariate vector $x_{ikj}$ includes gender, age, BMI, and a weekend indicator, i.e., $x_{ikj}=(1,\text{Gender},\text{Age},\text{BMI},\text{Weekend})^{T}$. Let $y_{i}(t)=(y_{i11}(t),\cdots,y_{i1n_{i1}}(t),y_{i21}(t),\cdots,y_{i2n_{i2}}(t))^{T}$ and $X_{i}=(x_{i11}^{T},\cdots,x_{i1n_{i1}}^{T},x_{i21}^{T},\cdots,x_{i2n_{i2}}^{T})^{T}$. Let $y(t)=(y_{1}^{T}(t),\cdots,y_{n}^{T}(t))^{T}$. For each $t$, we remove the outliers of $y(t)$, defined as values that are more than 1.5 times the interquartile range above the upper quartile or below the lower quartile. After removing all outliers and missing data, we have $n=149$ twin families, including 63 monozygotic twin families and 86 dizygotic twin families, and the total number of observations is 3,489.
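The outlier rule can be sketched as below, assuming it refers to the usual fences at 1.5 times the interquartile range beyond the lower and upper quartiles. The function name and the quartile interpolation convention (`method="inclusive"`) are assumptions; the text does not specify how the quartiles were computed.

```python
import statistics


def remove_iqr_outliers(values, k=1.5):
    """Keep values within [Q1 - k * IQR, Q3 + k * IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


# Toy data for illustration only: the value 100 lies above the upper fence.
kept = remove_iqr_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```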

6.2 Effects of Gender, Age, BMI, Weekend on Activity Profiles

We first examine the associations between the covariates including gender, age, BMI, and weekend vs weekday and the overall activity distribution. For each of the four covariates and each of the $t$ values, we obtain the el estimator by solving the estimating equations $\sum_{i=1}^{n}\phi_{i}(\beta)=0$ , and apply Theorem 1 to construct the confidence interval $\{\beta_{0}:-2\log\textsc{elr}_{0}(\beta_{0})\leq\chi_{1}^{2}(1-\alpha)\}$ , where $\chi_{1}^{2}(1-\alpha)$ is the $(1-\alpha)$ quantile of the $\chi_{1}^{2}$ distribution. The first column of Figure 5 shows the estimated regression coefficient for each of the $t$ values and its point-wise 95% confidence intervals using the el method.

We then test whether activity profiles differ between individuals of different gender, age, and BMI, and between weekdays and weekends. Specifically, we test for such differences at each quantile $t$. To test $H_{0}:\beta_{l}(t)=0,~{}l\in\{\text{Gender, Age, BMI, Weekend}\}$, we apply the empirical likelihood ratio test in Section 2 and the standard likelihood ratio (lr) test assuming normal random effects, and we obtain the $p$-value for each of the $t$ values. The second column of Figure 5 shows the $p$-values for each $t$ and for each of these four covariates. At the nominal level of 0.05, the elr test shows a gender effect when the activity counts are small (i.e., small $t$). In contrast, the standard lr test shows such significance only in a smaller interval, from 0.23 to 0.42. For age, the elr test shows a significant effect in the region of large activity counts (i.e., large $t$). Neither the elr nor the lr test rejects the null hypothesis of no BMI effect, while the weekend effect is statistically significant over almost the whole range of $t$.

6.3 Analysis of heritability of the activity distribution

We then address the question of whether the activity distribution, summarized by its quantiles, is heritable. This is equivalent to testing the null hypothesis $H_{0}:\sigma_{A}^{2}(t)=0,~{}t\in[0,1]$. For each quantile $t$, we first estimate the fixed effects by least squares and then apply the proposed elr test to the null hypothesis $H_{0}:\sigma_{A}^{2}(t)=0$, comparing the results with those of the lr method. Figure 6 (a) gives the $p$-values at different quantiles $t$. At the nominal 0.05 significance level, $H_{0}:\sigma_{A}^{2}(t)=0$ is rejected for $t\in[0.375,0.958]$ by the elr test and for $t\in[0.472,0.931]$ by the lr test. However, if we use the Bonferroni correction for multiple testing, only the proposed elr test identifies significant heritability, for the quantiles between 0.375 and 0.514. The $p$-value of the global test $H_{0}:\sigma_{A}^{2}(t)\equiv 0,~{}t\in[0,1]$ is 0 when applying the proposed elr with 1000 permutations. Overall, our analysis shows that the activity distribution is heritable, especially in the quantile range from 0.375 to 0.514.
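The Bonferroni step can be sketched as follows, assuming that the correction is taken over the 144 quantile levels, so that the per-quantile cutoff is $0.05/144\approx 3.5\times 10^{-4}$. The function name is hypothetical, and the $p$-values in the example are made up for illustration; they are not the study's results.

```python
def bonferroni_significant(t_grid, p_values, alpha=0.05):
    """Return the grid points whose p-value is below alpha / m."""
    cutoff = alpha / len(p_values)
    return [t for t, p in zip(t_grid, p_values) if p < cutoff]


t_grid = [k / 144 for k in range(1, 145)]

# Hypothetical p-values: small on t = 54/144, ..., 74/144 and 1 elsewhere.
p_values = [1e-5 if 54 <= k <= 74 else 1.0 for k in range(1, 145)]
hits = bonferroni_significant(t_grid, p_values)
significant_range = (round(hits[0], 3), round(hits[-1], 3))
```

On the grid $t=k/144$, the endpoints 0.375 and 0.514 correspond to $k=54$ and $k=74$.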

6.4 Sensitivity analysis of heritability of the activity distribution

To examine whether our previous preprocessing steps affect the analysis of heritability, we consider an alternative approach to remove the outliers. For each $t$ , we remove the outliers of $y(t)$ defined as the values that are greater than 3 standard deviations from its median. After removing all the outliers and missing data, we have $n=152$ twin families including 64 monozygotic twin families and 88 dizygotic twin families, and the total number of observations is 4,190.
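For comparison with the rule in Section 6.1, this alternative rule can be sketched as below. The function name is hypothetical, and the sketch assumes that the standard deviation is the sample standard deviation computed once from all values before any removal, which the text does not state.

```python
import statistics


def remove_sd_outliers(values, k=3.0):
    """Keep values within k standard deviations of the median."""
    center = statistics.median(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return [v for v in values if abs(v - center) <= k * sd]


# Toy data for illustration only: one extreme value among twenty zeros.
kept = remove_sd_outliers([0.0] * 20 + [100.0])
```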

Under this preprocessing method, the $-\log_{10}(p\text{-value})$ for testing heritability is provided in Figure 6 (b), which shows results similar to those in Figure 6 (a). At the nominal 0.05 significance level, $H_{0}:\sigma_{A}^{2}(t)=0$ is rejected for $t\in[0.285,0.979]$ by the elr test and for $t\in[0.514,1]$ by the lr test. If we adopt the Bonferroni correction for multiple testing, only the proposed elr test identifies significant heritability, for the quantiles between 0.396 and 0.576. The $p$-value of the global test $H_{0}:\sigma_{A}^{2}(t)\equiv 0,~{}t\in[0,1]$ is 0 when applying the proposed elr with 1000 permutations.

7 Discussion

In this paper, we have developed an empirical likelihood method for making inference of the variance components in general linear mixed-effects models. The proposed empirical likelihood ratio test statistic can be applied to a large set of related outcomes such as different quantiles of the activity distribution when we analyze the wearable device data sets. Simulation studies show that the proposed methods control type 1 error much better than the likelihood ratio method when the normality assumptions do not hold.

To address the unknown nuisance variance components, we assume that their true value $\theta^{*}_{(1)}$ is positive, so that, as $n\rightarrow\infty$, (2) provides unbiased positive estimates with probability 1. When applying the proposed methods to the real data, we note that (2) may provide negative estimates at some quantiles $t$. To solve this problem, we first test whether the nuisance variance component (for example, $\sigma_{C}^{2}(t)$ or $\sigma_{E}^{2}(t)$) is zero at these quantile points. If the null hypothesis is not rejected, we omit the nuisance variance component from the model and then apply the proposed elr test for the components of interest.

ACKNOWLEDGMENT

This research was supported by the Intramural Research Program of the National Institute of Mental Health through grant ZIA MH002954-04 [Motor Activity Research Consortium for Health (mMARCH)]. We thank Dr. Hickie and Dr. Martin for sharing the Australian twin study data as part of the mMARCH network and the Genetic Epidemiology Research Branch at National Institute of Mental Health for processing the accelerometry data.

APPENDIX

Appendix A Proofs and complements

A.1 Proof of Theorem 2

To prove Theorem 2, we first consider the setting with known $\beta^{*}$ . We define $Z_{i}(\theta_{1})$ , $L_{1}(\theta_{1})$ , $\textsc{elr}_{1}(\theta_{1}^{0})$ , and $\tilde{\nu}_{1n}^{2}(\theta_{1}^{0})$ in the same way as $\hat{Z}_{i}(\theta_{1})$ , $L(\theta_{1})$ , $\textsc{elr}(\theta_{1}^{0})$ , and $\hat{\nu}_{1n}^{2}(\theta_{1}^{0})$ , respectively, with $\hat{R}_{i}$ replaced by $R_{i}$ . Under Conditions 5 and 6, we derive the asymptotic distribution of $\textsc{elr}_{1}(\theta_{1}^{0})$ in the following theorem.

Theorem A1.

Let $\tilde{c}_{n}(\theta_{1}^{0})=\tilde{\nu}_{2n}^{2}(\theta_{1}^{0})/\tilde{\nu}_{1n}^{2}(\theta_{1}^{0})$, where $\tilde{\nu}_{1n}^{2}(\theta_{1}^{0})$ is a consistent estimator of the asymptotic variance of $n^{-1/2}\sum_{i=1}^{n}Z_{i}(\theta_{1}^{0})$ and $\tilde{\nu}_{2n}^{2}(\theta_{1}^{0})=n^{-1}\sum_{i=1}^{n}Z_{i}^{2}(\theta_{1}^{0})$. If $\theta^{*}_{(1)}\in R_{+}^{d-1}$, then under Conditions 5 and 6, as $n\rightarrow\infty$, $\tilde{c}_{n}(\theta_{1}^{0})\big{(}-2\log\textsc{elr}_{1}(\theta_{1}^{0})\big{)}\rightarrow\chi_{1}^{2}$ in distribution when $\theta_{1}^{0}>0$, and $\tilde{c}_{n}(0)(-2\log\textsc{elr}_{1}(0))\rightarrow U_{+}^{2}$ in distribution, where $U\sim N(0,1)$ and $U_{+}=\max(U,0)$.
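The two limiting distributions in Theorem A1 lead to different $p$-values for the same value of the scaled statistic. Since $U_{+}^{2}$ equals 0 with probability $1/2$ and otherwise behaves like a $\chi_{1}^{2}$ variable, its upper tail probability at $x>0$ is half that of $\chi_{1}^{2}$. The sketch below computes both tail probabilities with the standard library; the function names are hypothetical.

```python
import math


def interior_p_value(stat):
    """P(chi^2_1 >= stat), the reference distribution when theta_1^0 > 0."""
    return math.erfc(math.sqrt(max(stat, 0.0) / 2.0))


def boundary_p_value(stat):
    """P(U_+^2 >= stat) with U ~ N(0, 1), the reference when theta_1^0 = 0."""
    if stat <= 0.0:
        return 1.0  # U_+^2 is always nonnegative
    return 0.5 * math.erfc(math.sqrt(stat / 2.0))
```

For a level 0.05 test at the boundary, the critical value is therefore the 0.90 quantile of $\chi_{1}^{2}$, about 2.71, rather than the 0.95 quantile, about 3.84.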

Proof.

For simplicity, we sometimes use $Z_{i}$ to denote $Z_{i}(\theta_{1})$ when there is no confusion. Using the method of Lagrange multipliers, let

\mathcal{L}=-\sum_{i=1}^{n}\log p_{i}+\kappa(\sum_{i=1}^{n}p_{i}-1)+\lambda_{0}\sum_{i=1}^{n}p_{i}Z_{i}.

Since

\frac{\partial\mathcal{L}}{\partial p_{i}}=-\frac{1}{p_{i}}+\kappa+\lambda_{0}Z_{i}=0,

we have

p_{i}=\frac{1}{\kappa+\lambda_{0}Z_{i}}\quad\text{ and }\quad\kappa=n.

(A9)

Plugging $\lambda_{0}=n\lambda$ into (A9), we obtain

p_{i}=\frac{1}{n(1+\lambda Z_{i})}.

(A10)

Since

0=\sum_{i=1}^{n}p_{i}Z_{i}=\sum_{i=1}^{n}\frac{Z_{i}}{n(1+\lambda Z_{i})},

(A11)

under Condition 5, one can show that

\lambda=\big{(}\sum_{i=1}^{n}Z_{i}^{2}\big{)}^{-1}{\sum_{i=1}^{n}Z_{i}}+o_{p}(n^{-1/2})\text{ by Taylor expansion}.
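As a sketch of this step (the standard empirical likelihood argument, stated here without verifying the remainder bound), note the exact identity $Z_{i}/(1+\lambda Z_{i})=Z_{i}-\lambda Z_{i}^{2}+\lambda^{2}Z_{i}^{3}/(1+\lambda Z_{i})$. Substituting it into (A11) gives

0=\frac{1}{n}\sum_{i=1}^{n}Z_{i}-\lambda\,\frac{1}{n}\sum_{i=1}^{n}Z_{i}^{2}+\frac{1}{n}\sum_{i=1}^{n}\frac{\lambda^{2}Z_{i}^{3}}{1+\lambda Z_{i}},

so that solving for $\lambda$ and treating the last term as a negligible remainder yields the leading term $\big{(}\sum_{i=1}^{n}Z_{i}^{2}\big{)}^{-1}\sum_{i=1}^{n}Z_{i}$.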

Let $W_{1}(\theta_{1})=n^{n}L_{1}(\theta_{1})$ . We have

$\displaystyle-2\log(W_{1}(\theta_{1}))$	$\displaystyle=-2\sum_{i=1}^{n}(\log p_{i}+\log n)=2\sum_{i=1}^{n}\log(1+\lambda Z_{i})$
	$\displaystyle=2\sum_{i=1}^{n}\big{(}\lambda Z_{i}-\frac{1}{2}(\lambda Z_{i})^{2}\big{)}+o_{p}(1)\text{ (by Taylor expansion)}$
	$\displaystyle=\big{(}\sum_{i=1}^{n}Z_{i}\big{)}^{2}\big{(}\sum_{i=1}^{n}Z_{i}^{2}\big{)}^{-1}+o_{p}(1).$	(A12)
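The quantities in (A10) to (A12) can be computed numerically for given values $Z_{1},\cdots,Z_{n}$. The sketch below solves (A11) for $\lambda$ by bisection, evaluates $-2\log W_{1}=2\sum_{i}\log(1+\lambda Z_{i})$, and compares it with the quadratic approximation in (A12). The function names and the choice of bisection are assumptions made for illustration, not a description of the authors' implementation.

```python
import math


def el_lambda(z, iters=200):
    """Solve sum_i z_i / (1 + lam * z_i) = 0 for lam, as in (A11)."""
    if min(z) >= 0.0 or max(z) <= 0.0:
        raise ValueError("0 must lie strictly inside the range of the z_i")
    shrink = 1.0 - 1e-12
    lo = -shrink / max(z)  # keeps 1 + lam * z_i > 0 for the largest z_i
    hi = -shrink / min(z)  # keeps 1 + lam * z_i > 0 for the smallest z_i
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        score = sum(zi / (1.0 + mid * zi) for zi in z)
        if score > 0.0:  # the score is strictly decreasing in lam
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def neg2_log_w(z):
    """-2 log W_1 = 2 * sum_i log(1 + lam * z_i)."""
    lam = el_lambda(z)
    return 2.0 * sum(math.log1p(lam * zi) for zi in z)


def quadratic_approx(z):
    """Leading term (sum_i z_i)^2 / (sum_i z_i^2) from (A12)."""
    return sum(z) ** 2 / sum(zi * zi for zi in z)
```

For the toy input $Z=(-1,2)$, the exact solution is $\lambda=1/4$, giving $-2\log W_{1}=2\log(1.125)\approx 0.236$, while the quadratic approximation equals $0.2$; the two agree only asymptotically.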

(1) If $\theta_{1}^{0}>0$, then Lemma A2 implies

		$\displaystyle\tilde{c}_{n}(\theta_{1}^{0})\big{(}-2\log\frac{L_{1}(\theta_{1}^{0})}{\max_{\theta_{1}\geq 0}L_{1}(\theta_{1})}\big{)}=\tilde{c}_{n}(\theta_{1}^{0})\big{(}-2\log W_{1}(\theta_{1}^{0})\big{)}$
	$\displaystyle=$	$\displaystyle\big{(}n^{-1/2}\sum_{i=1}^{n}Z_{i}(\theta_{1}^{0})\big{)}^{2}/\tilde{\nu}_{1n}^{2}(\theta_{1}^{0})+o_{p}(1)$
	$\displaystyle=$	$\displaystyle\frac{\big{(}n^{-1/2}\alpha^{-1}\sum_{i=1}^{n}\big{\langle}\Phi_{i1}-\sum_{q=1}^{d-1}F_{q}\Phi_{iq+1},R_{i}-\theta_{1}^{0}\Phi_{i1}\big{\rangle}\big{)}^{2}}{n^{-1}\alpha^{-2}\sum_{i=1}^{n}\big{\langle}R_{i}-H_{i}((\theta_{1}^{0},\tilde{\theta}_{(1)}^{T})^{T}),\Phi_{i1}-\sum_{q=1}^{d-1}F_{q}\Phi_{iq+1}\big{\rangle}^{2}}+o_{p}(1).$

Since

		$\displaystyle n^{-1}\alpha^{-2}\sum_{i=1}^{n}\big{\langle}R_{i}-H_{i}((\theta_{1}^{0},\tilde{\theta}_{(1)}^{T})^{T}),\Phi_{i1}-\sum_{q=1}^{d-1}F_{q}\Phi_{iq+1}\big{\rangle}^{2}$
	$\displaystyle=$	$\displaystyle\text{var}\big{(}n^{-1/2}\alpha^{-1}\sum_{i=1}^{n}\big{\langle}\Phi_{i1}-\sum_{q=1}^{d-1}F_{q}\Phi_{iq+1},R_{i}-\theta_{1}^{0}\Phi_{i1}\big{\rangle}\big{)}+o_{p}(1)$

under Condition 5, it implies $\tilde{c}_{n}(\theta_{1}^{0})\big{(}-2\log\textsc{elr}_{1}(\theta_{1}^{0})\big{)}\rightarrow\chi^{2}_{1}$ in distribution when $\theta_{1}^{0}>0$ .

(2) When $\theta_{1}^{0}=0$, Lemma A2 implies

\tilde{c}_{n}(0)\big{(}-2\log\frac{L_{1}(0)}{\max_{\theta_{1}\geq 0}L_{1}(\theta_{1})}\big{)}=\tilde{c}_{n}(0)\big{(}-2\log W_{1}(0)\big{)}I\big{(}\sum_{i=1}^{n}Z_{i}(0)\geq 0\big{)}

for $n$ large enough. Therefore, $\tilde{c}_{n}(0)\big{(}-2\log\textsc{elr}_{1}(0)\big{)}\rightarrow U_{+}^{2}$ in distribution, where $U\sim N(0,1)$ and $U_{+}=\max(U,0)$.

∎

Lemma A2.

Let $W_{1}(\theta_{1})=n^{n}L_{1}(\theta_{1})$ and $\tilde{\theta}_{1}=\arg\max_{\theta_{1}\geq 0}W_{1}(\theta_{1})$. If the true value $\theta^{*}_{1}>0$, then $W_{1}(\tilde{\theta}_{1})=1$ for $n$ large enough. If the true value $\theta^{*}_{1}=0$, then $W_{1}(\tilde{\theta}_{1})=I(\sum_{i=1}^{n}Z_{i}(0)\geq 0)+W_{1}(0)I(\sum_{i=1}^{n}Z_{i}(0)<0)$ for $n$ large enough.

Proof.

Let $\check{\theta}_{1}=\arg\max_{\theta_{1}}W_{1}(\theta_{1})$. We use a method similar to that of Qin and Lawless (1994). Since

\check{\theta}_{1}=\arg\min_{\theta_{1}}-2\log W_{1}(\theta_{1}),

\begin{aligned}0=\frac{\partial(-2\log W_{1}(\theta_{1}))}{\partial\theta_{1}}\Big|_{\theta_{1}=\check{\theta}_{1}}&=2\sum_{i=1}^{n}\frac{\frac{\partial\lambda}{\partial\theta_{1}}Z_{i}+\lambda\frac{\partial Z_{i}}{\partial\theta_{1}}}{1+\lambda Z_{i}}\Big|_{\theta_{1}=\check{\theta}_{1}}\\&=2\lambda\sum_{i=1}^{n}\frac{1}{1+\lambda Z_{i}}\frac{\partial Z_{i}}{\partial\theta_{1}}\Big|_{\theta_{1}=\check{\theta}_{1}}\text{ (by (A11))}.\end{aligned}

(A13)

Let $\check{\lambda}=\lambda(\check{\theta}_{1})$ . We note $\check{\theta}_{1}$ and $\check{\lambda}$ satisfy

Q_{1n}(\check{\theta}_{1},\check{\lambda})=0,\quad Q_{2n}(\check{\theta}_{1},\check{\lambda})=0,

where

\begin{aligned}Q_{1n}(\theta_{1},\lambda)&=\frac{1}{n}\sum_{i=1}^{n}\frac{Z_{i}(\theta_{1})}{1+\lambda Z_{i}(\theta_{1})}\text{ (by (A11))},\\Q_{2n}(\theta_{1},\lambda)&=\frac{\lambda}{n}\sum_{i=1}^{n}\frac{1}{1+\lambda Z_{i}(\theta_{1})}\frac{\partial Z_{i}(\theta_{1})}{\partial\theta_{1}}\text{ (by (A13))}.\end{aligned}

Taking derivatives about $\theta_{1}$ and $\lambda$ , we have

	$\displaystyle\frac{\partial Q_{1n}(\theta_{1},0)}{\partial\theta_{1}}=\frac{1}{n}\sum_{i=1}^{n}\frac{\partial Z_{i}(\theta_{1})}{\partial\theta_{1}},$	$\displaystyle\frac{\partial Q_{1n}(\theta_{1},0)}{\partial\lambda}=-\frac{1}{n}\sum_{i=1}^{n}Z_{i}(\theta_{1})^{2},$
	$\displaystyle\frac{\partial Q_{2n}(\theta_{1},0)}{\partial\theta_{1}}=0,$	$\displaystyle\frac{\partial Q_{2n}(\theta_{1},0)}{\partial\lambda}=\frac{1}{n}\sum_{i=1}^{n}\frac{\partial Z_{i}(\theta_{1})}{\partial\theta_{1}}.$

Expanding $Q_{1n}$ and $Q_{2n}$ at $(\theta_{1}=\theta^{*}_{1},\lambda=0)$ , we have

\begin{aligned}0&=Q_{1n}(\check{\theta}_{1},\check{\lambda})\\&=Q_{1n}(\theta^{*}_{1},0)+\frac{\partial Q_{1n}(\theta^{*}_{1},0)}{\partial\theta_{1}}(\check{\theta}_{1}-\theta^{*}_{1})+\frac{\partial Q_{1n}(\theta^{*}_{1},0)}{\partial\lambda}\check{\lambda}+o_{p}(n^{-1/2}),\end{aligned}

(A14)

\begin{aligned}0&=Q_{2n}(\check{\theta}_{1},\check{\lambda})\\&=Q_{2n}(\theta^{*}_{1},0)+\frac{\partial Q_{2n}(\theta^{*}_{1},0)}{\partial\theta_{1}}(\check{\theta}_{1}-\theta^{*}_{1})+\frac{\partial Q_{2n}(\theta^{*}_{1},0)}{\partial\lambda}\check{\lambda}+o_{p}(n^{-1/2}).\end{aligned}

(A15)

(A14) and (A15) give

\begin{pmatrix}\check{\theta}_{1}-\theta^{*}_{1}\\ \check{\lambda}\end{pmatrix}=\begin{pmatrix}\frac{1}{n}\sum_{i=1}^{n}\frac{\partial Z_{i}(\theta^{*}_{1})}{\partial\theta_{1}}&-\frac{1}{n}\sum_{i=1}^{n}Z_{i}(\theta^{*}_{1})^{2}\\ 0&\frac{1}{n}\sum_{i=1}^{n}\frac{\partial Z_{i}(\theta^{*}_{1})}{\partial\theta_{1}}\end{pmatrix}^{-1}\begin{pmatrix}-\frac{1}{n}\sum_{i=1}^{n}Z_{i}(\theta^{*}_{1})+o_{p}(n^{-1/2})\\ o_{p}(n^{-1/2})\end{pmatrix}

Hence,

\check{\theta}_{1}-\theta^{*}_{1}=-\frac{n^{-1}\sum_{i=1}^{n}Z_{i}(\theta^{*}_{1})}{n^{-1}\sum_{i=1}^{n}\frac{\partial Z_{i}(\theta^{*}_{1})}{\partial\theta_{1}}}+o_{p}(n^{-1/2}),

(A16)

where $(\sum_{i=1}^{n}Z_{i}(\theta^{*}_{1}))/(\sum_{i=1}^{n}\partial Z_{i}(\theta^{*}_{1})/\partial\theta_{1})=O_{p}(n^{-1/2})$ .

When $\theta^{*}_{1}>0$, (A16) implies $\check{\theta}_{1}>0$ for $n$ large enough. Thus, $\tilde{\theta}_{1}=\check{\theta}_{1}$ for $n$ large enough. Then $\tilde{\theta}_{1}$ satisfies (A13), i.e.,

2\lambda\sum_{i=1}^{n}\frac{1}{1+\lambda Z_{i}}(-\|\Phi_{i1}\|_{F}^{2})=0.

(A17)

Plugging (A10) into (A17), we have $\lambda=0$ . Then $p_{i}=n^{-1}$ and $W_{1}(\tilde{\theta}_{1})=1$ .

When $\theta^{*}_{1}=0$, $\tilde{\theta}_{1}=\check{\theta}_{1}I(\check{\theta}_{1}\geq 0)$ for $n$ large enough. Since $\sum_{i=1}^{n}\partial Z_{i}(0)/\partial\theta_{1}=-\sum_{i=1}^{n}\|\Phi_{i1}\|_{F}^{2}<0$, the leading term of $\check{\theta}_{1}$ in (A16) has the same sign as $\sum_{i=1}^{n}Z_{i}(0)$, and we have $\tilde{\theta}_{1}=\check{\theta}_{1}I(\sum_{i=1}^{n}Z_{i}(0)\geq 0)$ for $n$ large enough. So $W_{1}(\tilde{\theta}_{1})=I(\sum_{i=1}^{n}Z_{i}(0)\geq 0)+W_{1}(0)I(\sum_{i=1}^{n}Z_{i}(0)<0)$. ∎

Proof of Theorem 2.

Let $\Delta$ be a $d$ -dimensional vector with the $k$ th element $\Delta_{k}=\sum_{i=1}^{n}\text{tr}(\Phi_{ik}E(\hat{\epsilon}_{i}))$ . Let $\varsigma_{i}=(\text{tr}(\Phi_{i1}\Phi_{i2}),\cdots,\text{tr}(\Phi_{i1}\Phi_{id}))^{T}$ . For $i=1,\cdots,n$ ,

	$\displaystyle E(\hat{R}_{i})=H_{i}(\theta^{*})+E(\hat{\epsilon}_{i}),$
	$\displaystyle E(\hat{\theta}_{(1)})=\theta^{*}_{(1)}+\big{(}\Xi^{-1}\big{)}_{-1}^{T}\Delta,$

so we have

		$\displaystyle E(\hat{Z}_{i}(\theta_{1}^{0}))=\text{tr}(\Phi_{i1}E(\hat{\epsilon}_{i}))-\varsigma_{i}^{T}(\Xi^{-1})_{-1}^{T}\Delta=O(n^{-1}),$		(A18)
		$\displaystyle E(n^{-1/2}\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0}))=o(1)$

by Proposition 1. Theorem 2 then follows by arguments similar to those in the proof of Theorem A1. ∎

A.2 Proof of Proposition 1

Proof of Proposition 1.

Since

n^{1/2}(\hat{\beta}-\beta^{*})=n^{1/2}(X^{T}X)^{-1}X^{T}r=(n^{-1}X^{T}X)^{-1}n^{-1/2}X^{T}r,

we have $n^{1/2}(\hat{\beta}-\beta^{*})\xrightarrow{d}\Sigma^{-1}\eta$ .

Under Condition 5, $E\|r_{i}\|_{2}^{2}$ and $E\|r_{i}\|_{2}^{4}$ are bounded uniformly. Since

	$\displaystyle nE(r_{i}(\beta^{*}-\hat{\beta})^{T}X_{i}^{T})=$	$\displaystyle-E(r_{i}\sum_{k=1}^{n}r_{k}^{T}X_{k}(n^{-1}X^{T}X)^{-1}X_{i}^{T})$
	$\displaystyle=$	$\displaystyle-E(r_{i}r_{i}^{T}X_{i}(n^{-1}X^{T}X)^{-1}X_{i}^{T})\rightarrow O(1),$

nE(X_{i}(\beta^{*}-\hat{\beta})(\beta^{*}-\hat{\beta})^{T}X_{i}^{T})\rightarrow O(1),

we have $E({\hat{\epsilon}_{i}})=O(n^{-1})$ .

Note that for $i\neq j$ ,

\begin{aligned}&\text{cov}({r_{i}r_{i}^{T}},{\hat{\epsilon}_{j}})\\={}&\text{cov}({r_{i}r_{i}^{T}},{r_{j}(\beta^{*}-\hat{\beta})^{T}X_{j}^{T}})+\text{cov}({r_{i}r_{i}^{T}},{X_{j}(\beta^{*}-\hat{\beta})r_{j}^{T}})+\text{cov}({r_{i}r_{i}^{T}},{X_{j}(\beta^{*}-\hat{\beta})(\beta^{*}-\hat{\beta})^{T}X_{j}^{T}})\\={}&\text{cov}({r_{i}r_{i}^{T}},{X_{j}(\beta^{*}-\hat{\beta})(\beta^{*}-\hat{\beta})^{T}X_{j}^{T}}).\end{aligned}

Since

\begin{aligned}&n^{2}\text{cov}({r_{i}r_{i}^{T}},{X_{j}(\beta^{*}-\hat{\beta})(\beta^{*}-\hat{\beta})^{T}X_{j}^{T}})\\={}&\text{cov}({r_{i}r_{i}^{T}},{X_{j}(n^{-1}X^{T}X)^{-1}\sum_{l=1}^{n}X_{l}^{T}r_{l}\sum_{k=1}^{n}r_{k}^{T}X_{k}(n^{-1}X^{T}X)^{-1}X_{j}^{T}})\\\rightarrow{}&\text{cov}({r_{i}r_{i}^{T}},{X_{j}\Sigma^{-1}X_{i}^{T}r_{i}r_{i}^{T}X_{i}\Sigma^{-1}X_{j}^{T}})=O(1),\end{aligned}

$\text{cov}({r_{i}r_{i}^{T}},{\hat{\epsilon}_{j}})=O(n^{-2})$ .

For $\text{cov}({\hat{\epsilon}_{i}},{\hat{\epsilon}_{j}}),~{}i\neq j$, we only analyze $\text{cov}({r_{i}(\beta^{*}-\hat{\beta})^{T}X_{i}^{T}},{r_{j}(\beta^{*}-\hat{\beta})^{T}X_{j}^{T}})$, $\text{cov}({r_{i}(\beta^{*}-\hat{\beta})^{T}X_{i}^{T}},{X_{j}(\beta^{*}-\hat{\beta})(\beta^{*}-\hat{\beta})^{T}X_{j}^{T}})$, and $\text{cov}({X_{i}(\beta^{*}-\hat{\beta})(\beta^{*}-\hat{\beta})^{T}X_{i}^{T}},{X_{j}(\beta^{*}-\hat{\beta})(\beta^{*}-\hat{\beta})^{T}X_{j}^{T}})$. Since

\begin{aligned}&n^{2}\text{cov}({r_{i}(\beta^{*}-\hat{\beta})^{T}X_{i}^{T}},{r_{j}(\beta^{*}-\hat{\beta})^{T}X_{j}^{T}})\\={}&\text{cov}({r_{i}\sum_{l=1}^{n}r_{l}^{T}X_{l}(n^{-1}X^{T}X)^{-1}X_{i}^{T}},{r_{j}\sum_{k=1}^{n}r_{k}^{T}X_{k}(n^{-1}X^{T}X)^{-1}X_{j}^{T}})\\\rightarrow{}&\text{cov}({r_{i}r_{j}^{T}X_{j}\Sigma^{-1}X_{i}^{T}},{r_{j}r_{i}^{T}X_{i}\Sigma^{-1}X_{j}^{T}})=O(1),\end{aligned}

n^{2}\text{cov}({X_{i}(\beta^{*}-\hat{\beta})(\beta^{*}-\hat{\beta})^{T}X_{i}^{T}},{X_{j}(\beta^{*}-\hat{\beta})(\beta^{*}-\hat{\beta})^{T}X_{j}^{T}})\rightarrow O(1),

\begin{aligned}&\text{cov}({r_{i}(\beta^{*}-\hat{\beta})^{T}X_{i}^{T}},{X_{j}(\beta^{*}-\hat{\beta})(\beta^{*}-\hat{\beta})^{T}X_{j}^{T}})\\={}&-n^{-3}\text{cov}({r_{i}\sum_{k=1}^{n}r_{k}^{T}X_{k}(n^{-1}X^{T}X)^{-1}X_{i}^{T}},{X_{j}(n^{-1}X^{T}X)^{-1}\sum_{s=1}^{n}\sum_{t=1}^{n}X_{s}^{T}r_{s}r_{t}^{T}X_{t}(n^{-1}X^{T}X)^{-1}X_{j}^{T}})\\={}&-n^{-3}\text{cov}({r_{i}r_{i}^{T}X_{i}(n^{-1}X^{T}X)^{-1}X_{i}^{T}},{X_{j}(n^{-1}X^{T}X)^{-1}X_{i}^{T}r_{i}r_{i}^{T}X_{i}(n^{-1}X^{T}X)^{-1}X_{j}^{T}})\\&-n^{-3}\sum_{k\neq i}\text{cov}({r_{i}r_{k}^{T}X_{k}(n^{-1}X^{T}X)^{-1}X_{i}^{T}},{X_{j}(n^{-1}X^{T}X)^{-1}(X_{i}^{T}r_{i}r_{k}^{T}X_{k}+X_{k}^{T}r_{k}r_{i}^{T}X_{i})(n^{-1}X^{T}X)^{-1}X_{j}^{T}}),\end{aligned}

we have $\text{cov}({r_{i}(\beta^{*}-\hat{\beta})^{T}X_{i}^{T}},{r_{j}(\beta^{*}-\hat{\beta})^{T}X_{j}^{T}})$ , $\text{cov}({r_{i}(\beta^{*}-\hat{\beta})^{T}X_{i}^{T}},{X_{j}(\beta^{*}-\hat{\beta})(\beta^{*}-\hat{\beta})^{T}X_{j}^{T}})$ , and $\text{cov}({X_{i}(\beta^{*}-\hat{\beta})(\beta^{*}-\hat{\beta})^{T}X_{i}^{T}},{X_{j}(\beta^{*}-\hat{\beta})(\beta^{*}-\hat{\beta})^{T}X_{j}^{T}})=O(n^{-2})$ . Hence, $\text{cov}({\hat{\epsilon}_{i}},{\hat{\epsilon}_{j}})=O(n^{-2})$ . ∎

A.3 Proof of equation (4)

Proof of equation (4).

Rewrite $\Xi$ as $\Xi=\left(\begin{smallmatrix}E_{11}&E_{12}\\ E_{21}&E_{22}\end{smallmatrix}\right)$ with $E_{11}$ being a scalar. Rewrite $\hat{\Upsilon}$ as $\hat{\Upsilon}=(\hat{\Upsilon}_{1},\hat{\Upsilon}_{(1)})^{T}$ . So

\displaystyle\hat{\theta}_{(1)}=(\Xi^{-1})_{-1}^{T}\hat{\Upsilon}=-E_{22}^{-1}E_{21}q^{-1}\hat{\Upsilon}_{1}+E_{22}^{-1}\hat{\Upsilon}_{(1)}+E_{22}^{-1}E_{21}q^{-1}E_{12}E_{22}^{-1}\hat{\Upsilon}_{(1)},

where $q=E_{11}-E_{12}E_{22}^{-1}E_{21}$ . Let $F=E_{22}^{-1}E_{21}$ . We obtain

	$\displaystyle\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0})=$	$\displaystyle\hat{\Upsilon}_{1}-F^{T}\big{(}-q^{-1}E_{21}\hat{\Upsilon}_{1}+\hat{\Upsilon}_{(1)}+E_{21}q^{-1}E_{12}E_{22}^{-1}\hat{\Upsilon}_{(1)}\big{)}-E_{11}\theta_{1}^{0}$
	$\displaystyle=$	$\displaystyle(1+q^{-1}F^{T}E_{21})\hat{\Upsilon}_{1}-(1+F^{T}E_{21}q^{-1})F^{T}\hat{\Upsilon}_{(1)}-E_{11}\theta_{1}^{0}$
	$\displaystyle=$	$\displaystyle(1+q^{-1}F^{T}E_{21})\sum_{i=1}^{n}\big{\langle}\Phi_{i1}-\sum_{q=1}^{d-1}F_{q}\Phi_{iq+1},\hat{R}_{i}\big{\rangle}-E_{11}\theta_{1}^{0}$
	$\displaystyle=$	$\displaystyle\sum_{i=1}^{n}\hat{D}_{i}(\theta_{1}^{0}),$

where $\hat{D}_{i}(\theta_{1}^{0})=\alpha^{-1}\big{\langle}\Phi_{i1}-\sum_{q=1}^{d-1}F_{q}\Phi_{iq+1},\hat{R}_{i}-\theta_{1}^{0}\Phi_{i1}\big{\rangle}$ . Note that for any $b=(b_{1},\cdots,b_{d-1})^{T}$ ,

\displaystyle\sum_{i=1}^{n}\big{\langle}\Phi_{i1}-\sum_{q=1}^{d-1}F_{q}\Phi_{iq+1},\sum_{j=1}^{d-1}b_{j}\Phi_{ij+1}\big{\rangle}=\sum_{j=1}^{d-1}b_{j}\Xi_{1j+1}-\sum_{j=1}^{d-1}b_{j}\sum_{q=1}^{d-1}F_{q}\Xi_{q+1j+1}=0,

so we have

\sum_{i=1}^{n}\hat{D}_{i}(\theta_{1}^{0})=\sum_{i=1}^{n}\hat{M}_{i}(\theta_{1}^{0}),

where $\hat{M}_{i}(\theta_{1}^{0})=\alpha^{-1}\langle\Phi_{i1}-\sum_{q=1}^{d-1}F_{q}\Phi_{iq+1},\hat{R}_{i}-H_{i}((\theta_{1}^{0},\hat{\theta}_{(1)}^{T})^{T})\rangle$ . ∎

A.4 Proof of Proposition 2

Proof of Proposition 2.

Property (i) holds by (A18) and the fact that $\xi_{i}^{(g)}\sim N(0,1)$.

To prove property (ii), we have

\displaystyle\text{var}\big{(}n^{-1/2}\sum_{i=1}^{n}\hat{M}_{i}(\theta_{1}^{0},t)\xi_{i}^{(g)}\big{)}=n^{-1}\sum_{i=1}^{n}\text{var}(\hat{M}_{i}(\theta_{1}^{0},t)\xi_{i}^{(g)})=n^{-1}\sum_{i=1}^{n}E\hat{M}_{i}(\theta_{1}^{0},t)^{2},

and

	$\displaystyle\text{var}\big{(}n^{-1/2}\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0},t)\big{)}=$	$\displaystyle\text{var}\big{(}n^{-1/2}\sum_{i=1}^{n}\hat{D}_{i}(\theta_{1}^{0},t)\big{)}=n^{-1}\sum_{i=1}^{n}\text{var}(\hat{D}_{i}(\theta_{1}^{0},t))+o(1)$
	$\displaystyle=$	$\displaystyle n^{-1}\sum_{i=1}^{n}E\hat{M}_{i}(\theta_{1}^{0},t)^{2}+o(1).$

Thus, we have property (ii).
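The first variance identity above can be checked by simulation. The sketch below holds a set of values $\hat{M}_{i}$ fixed (hypothetical numbers, not estimates from any data set), draws independent $N(0,1)$ multipliers $\xi_{i}^{(g)}$, and compares the empirical variance of $n^{-1/2}\sum_{i}\hat{M}_{i}\xi_{i}^{(g)}$ with $n^{-1}\sum_{i}\hat{M}_{i}^{2}$. Because the $\hat{M}_{i}$ are held fixed, this illustrates the conditional version of the identity; the function name is an assumption.

```python
import random
import statistics


def perturbed_sums(m_values, n_draws, rng):
    """Draw n_draws copies of n^{-1/2} * sum_i M_i * xi_i with xi_i ~ N(0, 1)."""
    scale = len(m_values) ** -0.5
    return [
        scale * sum(m * rng.gauss(0.0, 1.0) for m in m_values)
        for _ in range(n_draws)
    ]


rng = random.Random(2024)
m_values = [rng.uniform(-2.0, 2.0) for _ in range(20)]  # hypothetical M_i
draws = perturbed_sums(m_values, 5000, rng)

# Target n^{-1} * sum_i M_i^2 and its Monte Carlo counterpart (mean zero known).
target = sum(m * m for m in m_values) / len(m_values)
empirical = statistics.pvariance(draws, mu=0.0)
```

The same construction, applied jointly across values of $t$ with a common set of multipliers, is what preserves the covariance structure in property (iii).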

Since

		$\displaystyle\text{cov}\big{(}n^{-1/2}\sum_{i=1}^{n}\hat{M}_{i}(\theta_{1}^{0},s)\xi_{i}^{(g)},n^{-1/2}\sum_{j=1}^{n}\hat{M}_{j}(\theta_{1}^{0},t)\xi_{j}^{(g)}\big{)}=n^{-1}\sum_{i=1}^{n}\text{cov}(\hat{M}_{i}(\theta_{1}^{0},s)\xi_{i}^{(g)},\hat{M}_{i}(\theta_{1}^{0},t)\xi_{i}^{(g)})$
	$\displaystyle=$	$\displaystyle n^{-1}\sum_{i=1}^{n}E(\hat{M}_{i}(\theta_{1}^{0},s)\hat{M}_{i}(\theta_{1}^{0},t)),$

and

		$\displaystyle\text{cov}\big{(}n^{-1/2}\sum_{i=1}^{n}\hat{Z}_{i}(\theta_{1}^{0},s),n^{-1/2}\sum_{j=1}^{n}\hat{Z}_{j}(\theta_{1}^{0},t)\big{)}=\text{cov}\big{(}n^{-1/2}\sum_{i=1}^{n}\hat{D}_{i}(\theta_{1}^{0},s),n^{-1/2}\sum_{j=1}^{n}\hat{D}_{j}(\theta_{1}^{0},t)\big{)}$
	$\displaystyle=$	$\displaystyle n^{-1}\sum_{i=1}^{n}\text{cov}(\hat{D}_{i}(\theta_{1}^{0},s),\hat{D}_{i}(\theta_{1}^{0},t))+o(1)=n^{-1}\sum_{i=1}^{n}E(\hat{M}_{i}(\theta_{1}^{0},s)\hat{M}_{i}(\theta_{1}^{0},t))+o(1),$

we see property (iii) holds. ∎

References

Burton, C., McKinstry, B., Tătar, A. S., Serrano-Blanco, A., Pagliari, C., and Wolters, M. (2013). Activity monitoring in patients with depression: a systematic review. Journal of Affective Disorders, 145(1):21–28.
Chang, H.-w. and McKeague, I. W. (2016). Empirical likelihood based tests for stochastic ordering under right censorship. Electronic Journal of Statistics, 10(2):2511.
Krane-Gartiser, K., Henriksen, T. E. G., Morken, G., Vaaler, A., and Fasmer, O. B. (2014). Actigraphic assessment of motor activity in acutely admitted inpatients with bipolar disorder. PLoS One, 9(2):e89574.
Li, D. and Pan, J. (2013). Empirical likelihood for generalized linear models with longitudinal data. Journal of Multivariate Analysis, 114:63–73.
Owen, A. (1991). Empirical likelihood for linear models. The Annals of Statistics, pages 1725–1747.
Owen, A. B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75(2):237–249.
Owen, A. B. (2001). Empirical likelihood. CRC Press.
Qin, J. and Lawless, J. (1994). Empirical likelihood and general estimating equations. The Annals of Statistics, 22(1):300–325.
Wang, S., Qian, L., and Carroll, R. J. (2010). Generalized empirical likelihood methods for analyzing longitudinal data. Biometrika, 97(1):79–93.
Xue, L. and Zhu, L. (2007). Empirical likelihood semiparametric regression analysis for longitudinal data. Biometrika, 94(4):921–937.
You, J., Chen, G., and Zhou, Y. (2006). Block empirical likelihood for longitudinal partially linear regression models. Canadian Journal of Statistics, 34(1):79–96.
Zou, F., Fine, J., and Yandell, B. (2002). On empirical likelihood for a semiparametric mixture model. Biometrika, 89(1):61–75.
