
A correlation structure for the analysis of Gaussian and non-Gaussian responses in crossover experimental designs with repeated measures

N.A. Cruz^a, O.O. Melo^b and C.A. Martinez^c

Corresponding author: N.A. Cruz. Email: neacruzgu@unal.edu.co

^a PhD Student, Department of Statistics, Faculty of Sciences, Universidad Nacional de Colombia. Phone: (+571) 3165000 Ext. 13206, Fax: (+571) 3165000 Ext. 13210; ^b oomelom@unal.edu.co, Associate Professor, Department of Statistics, Faculty of Sciences, Universidad Nacional de Colombia; ^c cmartinez@agrosavia.co, PhD, Associate Researcher, Corporación Colombiana de Investigación Agropecuaria – AGROSAVIA, Sede Central
Abstract

In this study, we propose a family of correlation structures for crossover designs with repeated measures for both Gaussian and non-Gaussian responses using generalized estimating equations (GEE). The structure considers two matrices: one that models the between-period correlation and another one that models the within-period correlation. The overall correlation matrix, which is used to build the GEE, corresponds to the Kronecker product of these two matrices. A procedure to estimate the parameters of the correlation matrix is proposed, its statistical properties are studied, and a comparison with standard models using a single correlation matrix is carried out. A simulation study showed a superior performance of the proposed structure in terms of the quasi-likelihood criterion, efficiency, and the capacity to explain complex correlation patterns in longitudinal data from crossover designs.

keywords:
Carry-over effect; Generalized Estimating Equations; Kronecker correlation; Overdispersed Count Data

1 Introduction

Experimental designs are a very useful tool to analyze the effects of treatments applied to a set of experimental units (Hinkelmann and Kempthorne, 2005). In this kind of study, experimental units are frequently observed at a single time point, which is known as a cross-sectional experiment; however, sometimes experimental units are observed several times during the study while keeping the same treatment, giving rise to longitudinal studies (Davis, 2002). There are also situations in which an experimental unit receives all treatments, each one in a different period, inducing a setup where a sequence of treatments is applied to each unit. This kind of design is known as a crossover design (Jones and Kenward, 2015). In the scope of crossover designs, published results focus on the case of a normally distributed (Jones and Kenward, 2015) or binary (two possible outcomes, namely success or failure) response variable. The latter case has been treated by means of generalized linear models for binary data (Ratkowsky et al. (1992), Curtin (2017) and Li et al. (2018)).
Jones and Kenward (2015, p. 204) described a crossover experiment with three treatments to control arterial pressure: treatment A is a placebo, and treatments B and C are 20 mg and 40 mg doses of a test drug. Thus, there were six three-period sequences: ABC, ACB, BCA, BAC, CAB, and CBA, and each of them was applied to two individuals. In each period, 10 consecutive measurements of diastolic arterial pressure were taken: 30 and 15 minutes before, and 15, 30, 45, 60, 75, 90, 120 and 240 minutes after the administration of the treatment, as shown in Table 1.

Sequence    Individuals       Period 1           Period 2           Period 3
(1) ABC     Ind 1, Ind 2      10 measurements    10 measurements    10 measurements
  ⋮           ⋮                  ⋮                  ⋮                  ⋮
(6) CBA     Ind 11, Ind 12    10 measurements    10 measurements    10 measurements

Table 1: Structure of the blood pressure crossover design

A second experiment, carried out by researchers of the Department of Animal Sciences of Universidad Nacional de Colombia, focused on inferring the effects of two diets, A (grass) and B (a mixture of grass and paper recycling waste), on dairy cattle performance. Eight cows, split into two groups of four, received the diets in such a way that the first group was fed diet A from day 1 to day 42 and diet B from day 43 to day 84, while the second group was fed diet B during the first period and diet A during the second one. Measurements of milk yield and quality, live weight and body condition score were taken at days 1, 14, 28, 42, 56, 70 and 84 (Jaime, 2019); the design structure is shown in Table 2.

Sequence    Individuals        Period 1          Period 2
(1) AB      Ind 1, …, Ind 4    3 measurements    3 measurements
(2) BA      Ind 5, …, Ind 8    3 measurements    3 measurements

Table 2: Structure of the crossover design in cows

To analyze this sort of design, Basu and Santra (2010), Josephy et al. (2015), Hao et al. (2015), Lui (2015), Rosenkranz (2015), Grayling et al. (2018), Madeyski and Kitchenham (2018) and Kitchenham et al. (2018) used mixed models for crossover designs with Gaussian responses and a single observation per period, including additive carryover effects. Biabani et al. (2018) presented a review of crossover designs in neuroscience; all papers considered Gaussian responses and did not account for carryover effects due to the presence of a washout period, even when it lasted only a few days. On the other hand, Oh et al. (2003), Curtin (2017) and Li et al. (2018) used generalized linear models for crossover designs with two periods and two sequences of two treatments, with a continuous (normal or gamma) or binary response variable and each experimental unit observed once per period. They used generalized estimating equations (GEE) to estimate the model parameters. A Bayesian formulation of generalized linear models where the response was the survival time, with a single observation per period, was presented in Shkedy et al. (2005) and Liu and Li (2016).

Moreover, Dubois et al. (2011), Diaz et al. (2013) and Forbes et al. (2015) used Gaussian mixed models to analyze records from crossover designs with repeated measures, using the area under the curve as a strategy to obtain a single observation per period and experimental unit, and they did not account for carryover effects.

In all aforementioned approaches, it is assumed that the crossover design considers a washout period between treatments and that the response variable of each individual is observed once per period. These assumptions are not fulfilled in the experiments described above for two reasons: i) in patients with arterial hypertension the treatment cannot be stopped, while in the case of dairy cattle, cows must be fed every day; moreover, in both studies, the placebo is part of the treatment design, so considering it as a washout period would modify the experiment; ii) in each treatment, experimental units were observed several times per period. Therefore, there is a need for a consistent methodology to analyze this sort of experiment. In this study, we develop a methodology to analyze data from crossover designs with repeated measures using GEE and considering two correlation structures, between and within periods, which are combined via a Kronecker product to yield the overall correlation matrix. This approach was found to improve the estimation of the parameters of interest with respect to methods based on GEE that consider a single correlation structure for all the observations of a subject; in addition, this method accounts for carryover effects and hence does not require a washout period.

In the second section, we present some background and define the methodology; in the third section, we propose a method to estimate the correlation matrix and provide theoretical results that support our estimation and modelling approaches. In the fourth section, we present a simulation study showing some advantages of our model as compared to the model with a single correlation structure for each period. Lastly, in the fifth section, the methodology is applied to real data from the aforementioned arterial pressure and dairy cattle experiments.

2 Crossover designs with repeated measures

A crossover design has the following components (Patterson, 1951): i) sequences, which are each one of the distinct treatment combinations to be sequentially and randomly applied to the experimental units; ii) treatments, which are randomly applied to each experimental unit within each sequence; iii) periods, the time intervals in which the treatments that make up a sequence are applied; usually, each period is the same for all sequences and, consequently, the number of periods is equal to the number of elements of each sequence; iv) experimental units, the subjects or elements that receive the treatments; in each sequence there are $n_{l}$ experimental units, so the total number of experimental units in the study is $n=\sum_{l=1}^{S}n_{l}$.

Another important feature of a crossover design is the existence of carryover effects, defined in Vegas et al. (2016) as the persistence of the effect of a treatment over those applied later, i.e., the treatment is applied in a given period, but its effect still influences the response in later periods when other treatments are applied; this residual effect is known as the carryover effect. When the effect persists after one period, it is known as a first order carryover effect; when it persists after two periods, it is called a second order carryover effect, and so on (Patterson, 1951).

In a crossover design with $S$ sequences of length $P$ (the number of elements comprising each sequence), $Y_{ijk}$ is defined as the $k$th response of the $i$th experimental unit at the $j$th period. Let $n_{ij}$ be the number of observations of the $i$th experimental unit during the $j$th period. Then, the vector $\boldsymbol{Y}_{ij}$ is defined as:

$$\boldsymbol{Y}_{ij}=\left(Y_{ij1},\ldots,Y_{ijn_{ij}}\right)^{T} \qquad (1)$$

Also, the vector $\boldsymbol{Y}_{i}$ is defined as:

$$\boldsymbol{Y}_{i}=\left(\boldsymbol{Y}_{i1},\ldots,\boldsymbol{Y}_{iP}\right)^{T} \qquad (2)$$

which has dimension $\sum_{j=1}^{P}n_{ij}$, where $P$ is the number of periods.
According to these definitions, the ideas presented when discussing the arterial pressure and dairy cattle experiments, and assuming that $Y_{ijk}$ has a distribution in the exponential family, we propose the following model based on GEE:

$$E(Y_{ijk})=\mu_{ijk},\qquad i=1,\ldots,n,\;j=1,\ldots,P,\;k=1,\ldots,n_{ij},\;L=\max_{ij}\{n_{ij}\}$$
$$g(\mu_{ijk})=\boldsymbol{x}^{T}_{ijk}\boldsymbol{\beta}=\mu+\gamma_{j}+\tau_{d[i,j]}+\theta_{d[i,j-u]}+\cdots+\theta_{d[i,j-q]} \qquad (3)$$

where $g(\cdot)$ is the link function related to the exponential family, $\boldsymbol{x}_{ijk}$ is the row of the design matrix corresponding to the $k$th response of the $i$th experimental unit at the $j$th period, $\boldsymbol{\beta}$ is the vector of fixed effects, $\mu$ is the overall mean, $\alpha_{i}$ is the effect of the $i$th sequence, $\gamma_{j}$ is the effect of the $j$th period, $\tau_{d[i,j]}$ is the effect of treatment $d$ applied in period $j$ to the $i$th experimental unit ($d=1,\ldots,q$), $\theta_{(1)[i,j-1]}$ is the first order carryover effect of treatment $d$, and $\theta_{d[i,j-u]}$ is the carryover effect of order $u$ of treatment $d$. The carryover effects are considered in the two experiments discussed above because there was no washout period. Since observations of the same experimental unit are correlated, parameter estimation is carried out using GEE (Liang and Zeger, 1986). To this end, the following system of $q=\dim(\boldsymbol{\beta})$ equations has to be solved:

$$\boldsymbol{U}(\boldsymbol{\beta})=\left[\left\{\sum_{i=1}^{n}\boldsymbol{x}_{mi}^{T}\boldsymbol{D}\left(\frac{\partial\boldsymbol{\mu}_{i}}{\partial\eta}\right)[\boldsymbol{V}(\boldsymbol{\mu}_{i})]^{-1}\left(\frac{\boldsymbol{y}_{i}-\boldsymbol{\mu}_{i}}{a(\phi)}\right)\right\}_{m=1,\ldots,q}\right]_{q\times 1} \qquad (4)$$

where $\boldsymbol{\mu}_{i}=(\mu_{i11},\ldots,\mu_{iPn_{iP}})^{T}$, $\boldsymbol{y}_{i}=(y_{i11},\ldots,y_{iPn_{iP}})^{T}$, $\boldsymbol{x}_{mi}$ is the $m$th column of the design matrix of the $i$th experimental unit, and the covariance component $\boldsymbol{V}(\boldsymbol{\mu}_{i})$ is defined as:

$$\boldsymbol{V}(\boldsymbol{\mu}_{i})=\left[\boldsymbol{D}(V(\mu_{ijk})^{\frac{1}{2}})\boldsymbol{R}(\boldsymbol{\alpha})\boldsymbol{D}(V(\mu_{ijk})^{\frac{1}{2}})\right]_{LP\times LP} \qquad (5)$$

where $\boldsymbol{D}(\cdot)$ is a diagonal matrix, $V(\mu_{ijk})$ is the variance function corresponding to the exponential family, and $\boldsymbol{R}(\boldsymbol{\alpha})$ is the working correlation matrix related to the covariance matrix $\boldsymbol{\Sigma}_{i}=Var(\boldsymbol{Y}_{i})$ as follows:

$$\left(\boldsymbol{\Sigma}_{i}\right)_{LP\times LP}=\left(\boldsymbol{D}(Var(Y_{ijk})^{\frac{1}{2}})\boldsymbol{R}(\boldsymbol{\alpha})\boldsymbol{D}(Var(Y_{ijk})^{\frac{1}{2}})\right)_{LP\times LP}.$$
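As an illustration of this setup, the following is a minimal R sketch of a GEE fit with period, treatment and first order carryover effects using the gee package employed later in this paper; the toy data frame, the sequence labels and the reference coding (no carryover and carryover of A collapsed into the reference level) are illustrative assumptions, not the authors' code:

library(gee)

## Toy crossover data: 12 units, 6 sequences, 3 periods, 10 occasions per period
set.seed(1)
seqs <- rep(c("ABC", "ACB", "BCA", "BAC", "CAB", "CBA"), each = 2)  # one label per unit
dat  <- expand.grid(occ = 1:10, period = 1:3, id = 1:12)            # rows grouped by id
dat$treat <- substr(seqs[dat$id], dat$period, dat$period)           # treatment in each period
prev <- ifelse(dat$period > 1,
               substr(seqs[dat$id], dat$period - 1, dat$period - 1), "A")
dat$carryB <- as.numeric(prev == "B")   # first order carryover indicators,
dat$carryC <- as.numeric(prev == "C")   # with treatment A as the reference
dat$y <- rnorm(nrow(dat))               # placeholder response

fit <- gee(y ~ factor(period) + factor(treat) + carryB + carryC,
           id = id, data = dat, family = gaussian,
           corstr = "exchangeable")     # a single working correlation structure
summary(fit)$coefficients               # estimates with naive and robust SEs

Under this single-matrix specification, the same working structure is imposed on all pairs of observations of a unit; the next section relaxes this restriction.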

3 Kronecker correlation matrix

Due to the repeated measures structure in the crossover design, we propose a correlation structure of the form:

$$\boldsymbol{R}(\boldsymbol{\alpha})=\boldsymbol{\Psi}\otimes\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1}) \qquad (6)$$

where $\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})$ is the within-period correlation matrix, $\boldsymbol{\Psi}$ is the between-period correlation matrix, and $\otimes$ represents the Kronecker product (Harville, 1997). To estimate this structure, we propose the following modified GEE for $\boldsymbol{\beta}$:

$$\boldsymbol{U}_{1}(\boldsymbol{\beta})=\left[\left\{\sum_{i=1}^{n}\boldsymbol{x}_{mi}^{T}\boldsymbol{D}\left(\frac{\partial\boldsymbol{\mu}_{i}}{\partial\eta}\right)[\boldsymbol{V}(\boldsymbol{\mu}_{i})]^{-1}\left(\frac{\boldsymbol{y}_{i}-\boldsymbol{\mu}_{i}}{a(\phi)}\right)\right\}_{m=1,\ldots,p}\right]_{p\times 1}$$

where $\boldsymbol{V}(\boldsymbol{\mu}_{i})$ is defined as in Equation (5), $\boldsymbol{\mu}_{i}=\{\mu_{i11},\ldots,\mu_{iPL}\}$, and $\boldsymbol{x}_{mi}$ is the $m$th column of the design matrix of the $i$th experimental unit.
The estimating equation for $\boldsymbol{\alpha}_{1}$ is:

$$\boldsymbol{U}_{2}(\boldsymbol{\alpha}_{1})=\sum_{i=1}^{n}\left(\frac{\partial\boldsymbol{\varepsilon}_{i}}{\partial\boldsymbol{\alpha}_{1}}\right)^{T}\boldsymbol{H}_{i}^{-1}\left(\boldsymbol{W}_{i}-\boldsymbol{\varepsilon}_{i}\right) \qquad (7)$$

where $\boldsymbol{H}_{i}=\boldsymbol{D}(V(r_{ijk}))_{q\times q}$ is a diagonal matrix, $\boldsymbol{\varepsilon}_{i}=E(\boldsymbol{W}_{i})_{q\times 1}$, $\boldsymbol{W}_{i}=(r_{i11}r_{i12},r_{i11}r_{i13},\ldots,r_{iP(L-1)}r_{iPL})^{T}_{q\times 1}$, and $r_{ijk}$ is the $ijk$th observed Pearson residual defined as:

$$r_{ijk}=\frac{Y_{ijk}-\hat{\mu}_{ijk}}{\hat{\phi}\sqrt{V(\hat{\mu}_{ijk})}} \qquad (8)$$

and $q={P\choose 2}$. On the other hand, we propose the following estimator for $\boldsymbol{\Psi}=\left(\psi_{jj^{\prime}}\right)_{P\times P}$:

$$\hat{\psi}_{jj^{\prime}}=\frac{1}{n}\sum_{i=1}^{n}tr\left(\boldsymbol{R}_{1}(\boldsymbol{\hat{\alpha}}_{1})(\boldsymbol{r}_{(j)i}-\bar{\boldsymbol{r}}_{(j)})(\boldsymbol{r}_{(j^{\prime})i}-\bar{\boldsymbol{r}}_{(j^{\prime})})^{T}\right) \qquad (9)$$

where $\boldsymbol{r}_{(j)i}$ is the vector of Pearson residuals for the $i$th experimental unit at the $j$th period, and $\bar{\boldsymbol{r}}_{(j)}$ is its average over $i$, with $j\neq j^{\prime}$ and $j=1,\ldots,P$.
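As an illustration, a direct (though not optimized) R implementation of the estimator in (9) could be written as follows; it assumes that the Pearson residuals are stored in an $n\times P\times L$ array res (unit × period × occasion) and that an estimate R1hat of $\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})$ is available, and it adds a final rescaling to a unit diagonal since $\boldsymbol{\Psi}$ is a correlation matrix:

psi_hat <- function(res, R1hat) {
  n <- dim(res)[1]; P <- dim(res)[2]
  rbar <- apply(res, c(2, 3), mean)            # P x L matrix of average residuals
  Psi  <- matrix(0, P, P)
  for (j in 1:P) {
    for (jp in 1:P) {
      s <- 0
      for (i in 1:n) {
        cj  <- res[i, j, ]  - rbar[j, ]        # centered residuals, period j
        cjp <- res[i, jp, ] - rbar[jp, ]       # centered residuals, period j'
        ## tr( R1hat (r_(j)i - rbar_(j)) (r_(j')i - rbar_(j'))^T ) as in (9)
        s <- s + sum(diag(R1hat %*% (cj %o% cjp)))
      }
      Psi[j, jp] <- s / n
    }
  }
  D <- diag(1 / sqrt(diag(Psi)), P)            # rescale to unit diagonal
  D %*% Psi %*% D
}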

Theorem 3.1.

The estimator of $\boldsymbol{R}(\boldsymbol{\alpha})$ given by $\hat{\boldsymbol{\Psi}}\otimes\boldsymbol{R}_{1}(\hat{\boldsymbol{\alpha}}_{1})$ is asymptotically unbiased and consistent.

Proof.

See Appendix A.

The form of $\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})$ in (6) is given by standard correlation structures such as independence, autoregressive, exchangeable, etc. (Davis, 2002).
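For instance, exchangeable and AR(1) choices of $\boldsymbol{R}_{1}$ and the resulting overall matrix in (6) can be built in R with the kronecker() function; the parameter values below are illustrative only:

exch_cor <- function(L, a) (1 - a) * diag(L) + a * matrix(1, L, L)
ar1_cor  <- function(L, a) a^abs(outer(1:L, 1:L, "-"))

Psi <- matrix(c(1.00, 0.20, 0.45,
                0.20, 1.00, 0.35,
                0.45, 0.35, 1.00), 3, 3)        # between-period matrix (illustrative)
R <- kronecker(Psi, exch_cor(10, 0.5))          # overall 30 x 30 matrix R(alpha)
all(eigen(R, symmetric = TRUE, only.values = TRUE)$values > 0)  # positive definite?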

The correlation structures are compared via the quasi-likelihood information criterion ($QIC$, Pan (2001)), which is defined as:

$$QIC=-2QL(\hat{\mu};\boldsymbol{I})+2\,trace(\boldsymbol{\hat{\Omega}}^{-1}_{I}\boldsymbol{\hat{V}}_{R}) \qquad (10)$$

where $\hat{\mu}_{ijk}=g^{-1}(\hat{\eta}_{ijk})=g^{-1}(\boldsymbol{x}^{T}_{ijk}\hat{\boldsymbol{\beta}})$ is the estimated expected value of observation $Y_{ijk}$ under the model assuming the correlation matrix $\boldsymbol{R}$, $\hat{\boldsymbol{\Omega}}_{I}$ is the estimated covariance matrix of $\hat{\boldsymbol{\beta}}$ under independence, and $\hat{\boldsymbol{V}}_{R}$ is the covariance matrix of $\hat{\boldsymbol{\beta}}$ under the model with correlation matrix $\boldsymbol{R}$ defined as in (6).
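As a sketch for the Gaussian case with identity link, where $QL(\hat{\mu};\boldsymbol{I})=-\sum_{ijk}(y_{ijk}-\hat{\mu}_{ijk})^{2}/2$, the $QIC$ in (10) can be computed from two gee() fits sharing the same linear predictor; this assumes the naive.variance and robust.variance components returned by gee():

## fitI: fit under working independence; fitR: fit under the candidate structure R
qic_gauss <- function(fitI, fitR, y) {
  QL  <- -sum((y - fitted(fitR))^2) / 2        # Gaussian quasi-likelihood (phi = 1)
  OmI <- fitI$naive.variance                   # estimated Omega_I in (10)
  VR  <- fitR$robust.variance                  # sandwich covariance V_R in (10)
  -2 * QL + 2 * sum(diag(solve(OmI) %*% VR))
}

The structure with the smallest $QIC$ is preferred.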

4 Simulation Study

Figure 1: Mean and corresponding 95% confidence interval (computed from the 100 simulations) of the proportion of times that the $QIC$ selected each model, according to the number of replicates per sequence.

A simulation study was carried out to evaluate the proposed model. The setup was a crossover design with three periods ($P=3$), five repetitions within each period ($L=5$), and two sequences ($S=2$). The number of replicates per sequence ($n$) varied from 2 to 100, and for each value of $n$, the simulation was run 100 times. The following parameters were used in the simulation:

$$\begin{aligned}
& P_{k}\sim N(0,1),\;k=1,2,3,\\
& T_{j}\sim N(0,1),\;j=1,2,3,4,5,\\
& S_{jk}\sim N(0,1),\\
& \Psi=\{\psi_{ab}\}_{3\times 3},\;\psi_{ab}\sim U(-1,1),\;\Psi\geq 0,\\
& \boldsymbol{R}_{1}(\alpha_{1}),\;r_{ab}=\alpha_{1}^{|a-b|},\;\alpha_{1}\sim U(-1,1),\;\boldsymbol{R}_{1}(\alpha_{1})\geq 0, \qquad (11)\\
& \mu_{ijk}=\alpha+P_{k}+T_{j}+S_{(ij)},\;i=1,\ldots,3n,\\
& Y_{ijk}\sim N(\mu_{ijk},\sigma^{2}),\\
& \boldsymbol{Y}_{i}=\left(\boldsymbol{Y}_{i1},\ldots,\boldsymbol{Y}_{iP}\right)^{T},\\
& Corr(\boldsymbol{Y}_{i})=\boldsymbol{\Psi}\otimes\boldsymbol{R}_{1}(\alpha_{1})
\end{aligned}$$

Four GEE models were fitted per replicate; all had the same linear predictor and differed in the correlation structure: 1) independence, 2) first order autoregressive, 3) exchangeable, and 4) the Kronecker matrix proposed in this paper. Both the simulations and the model fitting were carried out using the gee (Carey, 2019) and geeM (McDaniel et al., 2013) packages of R (R Core Team, 2022).
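For reference, one replicate of scheme (11) can be generated along the following lines; this sketch uses MASS::mvrnorm for the multivariate normal draws and, for brevity, omits the location effects and the rejection of non-positive-definite draws of $\Psi$ and $\alpha_{1}$:

library(MASS)

P <- 3; L <- 5                                 # periods and repetitions per period
n_units <- 6                                   # total experimental units (illustrative)
ar1_cor <- function(L, a) a^abs(outer(1:L, 1:L, "-"))
Psi <- matrix(c(1.0, 0.3, 0.5,
                0.3, 1.0, 0.4,
                0.5, 0.4, 1.0), 3, 3)          # an admissible between-period matrix
R <- kronecker(Psi, ar1_cor(L, 0.6))           # true Corr(Y_i) = Psi x AR(1)

mu <- rep(0, P * L)                            # linear predictor set to zero here
Y  <- mvrnorm(n_units, mu = mu, Sigma = R)     # each row is one Vec(Y_i), sigma^2 = 1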

Figure 2: Box plots of the coverage of 95% confidence intervals for each location parameter of the GEE models with different correlation structures. The red line represents the expected coverage.

Figure 1 shows the proportion of times that each model had the smallest $QIC$ among the 100 simulations for each number of replicates per sequence (the mean and a 95% confidence interval are shown). Moreover, an analysis of parameter estimation under each one of these models using the simulated data was performed; Figure 2 presents the coverage of 95% confidence intervals for the location parameters under each model. In the simulation study, it is observed that most of the time the lowest $QIC$ is obtained for the model with the true correlation structure. This suggests that the proposed estimation method, together with Theorem 3.1, allows detecting Kronecker correlation structures in the data of a crossover design. In addition, this behavior is maintained even when the number of replicates within the design is low.
Regarding the coverage of confidence intervals for each parameter, in Figure 2 it can be noticed that the model with the independence structure has a very low coverage for period and sequence effects; this does not allow a correct inference about the parameters of interest in the design. The other three models had more appropriate coverages, but the model with the proposed Kronecker matrix (that is, the true structure) exhibited values closer to 95%. As to the time effects, the corresponding coverages were close to 95% in all four models; however, undercoverage was observed for the model with a within-period exchangeable structure and overcoverage for the independence model. The true model showed coverages close to the nominal 95%.

5 Application

Figure 3: Systolic blood pressure (mmHg); observation times are given in minutes from the administration of the treatment.

The systolic blood pressure data are presented in Figure 3. The model used to analyze this experiment had a linear predictor considering fixed effects of treatment, period, baseline (the two measurements taken before applying the treatment), and first and second order carryover effects with linear and quadratic time trends. On the other hand, the model used to analyze data from the second experiment had a linear predictor considering fixed effects of baseline, treatment, period, and first order linear carryover effects as a function of time (quadratic effects were not considered because there were only three observations within each period). Several working correlation structures (independence, first order autoregressive, exchangeable, and the Kronecker combinations proposed in this paper) were fitted individually and compared via the $QIC$ to determine the one exhibiting the best fit to each dataset. Table 3 presents the $QIC$ for each correlation structure in both datasets.

Matrix $R(\alpha)$                       $QIC$, arterial pressure    $QIC$, dairy cattle
$\boldsymbol{I}_{PL}$                    46086                       200.65
$\boldsymbol{I}_{P}\otimes AR(1)_{L}$    44166                       139.70
$\boldsymbol{I}_{P}\otimes Exch_{L}$     45059                       138.85
$AR(1)_{PL}$                             44923                       143.85
$Exch_{PL}$                              44158                       138.27
$\boldsymbol{\Psi}\otimes AR(1)_{L}$     43400                       180.34
$\boldsymbol{\Psi}\otimes Exch_{L}$      43393                       138.22

Table 3: Correlation matrices and the corresponding $QIC$ values
Figure 4: Estimated correlation matrix for the arterial pressure data with the structure $\boldsymbol{\Psi}\otimes Exch_{10}$

Figure 5: Estimated correlation matrix for the dairy cattle data with the structure $\boldsymbol{\Psi}\otimes Exch_{3}$
Parameter                  Estimate    Robust SE    Robust $z$    $p$-value
Intercept                  106.2917    3.4327       30.9641       <0.001
Base                       8.9587      1.9515       4.5908        <0.001
Time                       1.0754      0.4209       2.5548        0.0106
Period 2                   5.9752      3.0859       1.9363        0.0528
Period 3                   10.4117     5.1565       2.0191        0.0435
Treatment B                -1.3032     2.0353       -0.6403       0.5220
Treatment C                -10.8840    3.4096       -3.1922       0.0014
Carry-over B               -7.2221     4.0008       -1.8052       0.0710
Carry-over C               -12.1513    5.6162       -2.1636       0.0305
Time^2                     -0.0526     0.0202       -2.6091       0.0091
Time × Carry-over B        0.9191      0.7102       1.2940        0.1957
Time × Carry-over C        -0.1007     0.5789       -0.1739       0.8619
Time^2 × Carry-over B      -0.0459     0.0405       -1.1325       0.2574
Time^2 × Carry-over C      0.0209      0.0410       0.5097        0.6103

Table 4: Estimates obtained by GEE for the arterial pressure data
Parameter                  Estimate    Robust SE    Robust $z$    $p$-value
Intercept                  19.5505     1.0248       10.2950       <0.001
Base                       0.5850      0.0629       9.2961        <0.001
Time                       -0.6293     0.0949       -6.6321       <0.001
Period 2                   2.3908      1.0395       2.3000        0.0214
Treatment A                1.1217      0.8341       1.3448        0.1787
Carry-over A               -4.2480     1.4128       -3.0069       0.0026
Time^2                     0.0117      0.0018       6.4550        <0.001
Time × Carry-over A        -0.0702     0.0260       -2.6949       0.0070

Table 5: Estimates obtained by GEE for the dairy cattle data

According to Table 3, the $\boldsymbol{\Psi}\otimes Exch_{10}$ and $\boldsymbol{\Psi}\otimes Exch_{3}$ correlation matrices had the smallest $QIC$ values for the arterial pressure and dairy cattle experiments, respectively. Hence, correlation matrices having the Kronecker structure had the best fit in the two datasets. The correlation matrices estimated using equations (7) and (9) for the arterial pressure data are, respectively:

$$\hat{\boldsymbol{\Psi}}=\begin{pmatrix}1.0000&0.0537&0.4486\\ 0.0537&1.0000&0.3756\\ 0.4486&0.3756&1.0000\end{pmatrix} \qquad (12)$$
$$\hat{\boldsymbol{R}}_{1}(\boldsymbol{\alpha})=\begin{pmatrix}1.0000&0.4958&0.4958&\cdots&0.4958\\ 0.4958&1.0000&0.4958&\cdots&0.4958\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ 0.4958&0.4958&0.4958&\cdots&1.0000\end{pmatrix} \qquad (13)$$

and for the dairy cattle dataset:

$$\hat{\boldsymbol{\Psi}}=\begin{pmatrix}1.0000&0.1073\\ 0.1073&1.0000\end{pmatrix} \qquad (14)$$
$$\hat{\boldsymbol{R}}_{1}(\boldsymbol{\alpha})=\begin{pmatrix}1.0000&0.5610&0.5610\\ 0.5610&1.0000&0.5610\\ 0.5610&0.5610&1.0000\end{pmatrix} \qquad (15)$$
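The overall matrices shown in Figures 4 and 5 are the Kronecker products of these estimates; for the dairy cattle data, for instance, the $6\times 6$ matrix can be reproduced in R from (14) and (15):

Psi_hat <- matrix(c(1.0000, 0.1073,
                    0.1073, 1.0000), 2, 2)           # equation (14)
R1_hat  <- matrix(0.5610, 3, 3); diag(R1_hat) <- 1   # equation (15)
round(kronecker(Psi_hat, R1_hat), 4)                 # estimated Corr(Y_i)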

Figure 4 shows the matrix computed as the Kronecker product of the matrices in equations (12) and (13). Notice the positive correlation between periods 1 and 3 and the small positive correlation between periods 1 and 2. On the other hand, the matrix obtained as the Kronecker product of the matrices in equations (14) and (15) is shown in Figure 5, where a positive but small correlation between periods 1 and 2 can be seen.

The matrices with the Kronecker structure were used to estimate the location parameters of the linear model (since these had the smallest $QIC$), yielding Table 4 for the arterial pressure data and Table 5 for the dairy cattle data. Each table shows the estimates, their standard errors (computed using the "sandwich" variance estimator (Hardin and Hilbe, 2003)), the $z$ statistic, and the $p$-value corresponding to the null hypothesis $\beta=0$.

Table 4 shows significant effects of baseline, treatment C (with respect to A), time (linear and quadratic), period, and the carryover effect of treatment C. The interaction between carryover effects and time was not significant, which means that the effect of the previous treatment remains the same during the rest of the experiment. Moreover, Table 5 shows significant effects of baseline (milk yield before the beginning of the experiment) and of the linear (negative) and quadratic (positive) regression coefficients of time, which means that milk yield shows a convex behavior during the experiment. There was no significant treatment effect, but there was a significant carryover effect of the paper-based diet over the grass-based diet, i.e., the cows fed paper took some time to recover after changing to the grass-based diet.

6 Conclusions

Defining the structure of the correlation matrix is a highly relevant decision when using GEE; a proper specification of correlation structures in longitudinal data analysis improves estimation efficiency, leading to more reliable statistical inferences (Hin and Wang, 2009). Hence, in this paper we developed a family of correlation structures that combine some of the classical structures used in longitudinal data analysis with an additional matrix that allows more flexibility in the overall correlation structure by adding a few parameters. Moreover, we provide an explicit estimation method for these parameters which features sound statistical properties.

The theoretical results supporting some asymptotic properties of the proposed estimators are illustrated through the simulation study, where a gain in goodness of fit was observed across the different simulation scenarios. The $QIC$ was able to select the correct model, which had the proposed correlation structure. In addition, the confidence intervals built from the GEE showed a coverage that matched the nominal value: interval estimation for each of the parameters presented coverage close to 95%, which indicates a correct theoretical specification of each univariate interval.

As to the real data analysis, the results for the arterial pressure data showed the importance of accounting for carryover effects, as they are useful for correctly estimating and interpreting the treatment effects across time. If not included in the model, these residual effects may induce confounding problems. On the other hand, in the dairy cattle data, significant carryover effects were detected as well. Regarding the estimates of the correlation matrices in both datasets, the $QIC$ selected the proposed correlation matrix. Thus, the theoretical and empirical results from real and simulated data analyses suggest that the proposed methodology is promising and may be applied to perform better inferences from data obtained under crossover designs without washout periods and with a repeated measures structure.

References

  • Basu and Santra (2010) S. Basu and S. Santra. A joint model for incomplete data in crossover trials. Journal of Statistical Planning and Inference, 140(10):2839–2845, 2010.
  • Biabani et al. (2018) M. Biabani, M. Farrell, M. Zoghi, G. Egan, and S. Jaberzadeh. Crossover design in transcranial direct current stimulation studies on motor learning: potential pitfalls and difficulties in interpretation of findings. Reviews in the Neurosciences, 29(4):463–473, 2018.
  • Carey (2019) V. J. Carey. gee: Generalized Estimation Equation Solver, 2019. URL https://CRAN.R-project.org/package=gee. R package version 4.13-20.
  • Cordeiro (2004) G. M. Cordeiro. On Pearson's residuals in generalized linear models. Statistics & Probability Letters, 66(3):213–219, 2004.
  • Cordeiro and McCullagh (1991) G. M. Cordeiro and P. McCullagh. Bias correction in generalized linear models. Journal of the Royal Statistical Society: Series B (Methodological), 53(3):629–643, 1991.
  • Cox and Snell (1968) D. R. Cox and E. J. Snell. A general definition of residuals. Journal of the Royal Statistical Society: Series B (Methodological), 30(2):248–265, 1968.
  • Curtin (2017) F. Curtin. Meta-analysis combining parallel and crossover trials using generalised estimating equation method. Research Synthesis Methods, 8(3):312–320, 2017.
  • Davis (2002) C. S. Davis. Statistical Methods for the Analysis of Repeated Measurements. Springer, San Diego, 2002.
  • Diaz et al. (2013) F. J. Diaz, M. J. Berg, R. Krebill, T. Welty, B. E. Gidal, R. Alloway, and M. Privitera. Random-effects linear modeling and sample size tables for two special crossover designs of average bioequivalence studies: the four-period, two-sequence, two-formulation and six-period, three-sequence, three-formulation designs. Clinical pharmacokinetics, 52(12):1033–1043, 2013.
  • Dubois et al. (2011) A. Dubois, M. Lavielle, S. Gsteiger, E. Pigeolet, and F. Mentré. Model-based analyses of bioequivalence crossover trials using the stochastic approximation expectation maximisation algorithm. Statistics in medicine, 30(21):2582–2600, 2011.
  • Forbes et al. (2015) A. B. Forbes, M. Akram, D. Pilcher, J. Cooper, and R. Bellomo. Cluster randomised crossover trials with binary data and unbalanced cluster sizes: Application to studies of near-universal interventions in intensive care. Clinical Trials, 12(1):34–44, 2015.
  • Grayling et al. (2018) M. J. Grayling, A. P. Mander, and J. M. Wason. Blinded and unblinded sample size reestimation in crossover trials balanced for period. Biometrical Journal, 60(5):917–933, 2018.
  • Hao et al. (2015) C. Hao, D. von Rosen, and T. von Rosen. Explicit influence analysis in two-treatment balanced crossover models. Mathematical Methods of Statistics, 24(1):16–36, 2015.
  • Hardin and Hilbe (2003) J. W. Hardin and J. Hilbe. Generalized Estimating Equations. Chapman & Hall, Boca Raton, 2003.
  • Harville (1997) D. A. Harville. Matrix algebra from a statistician’s perspective, volume 1. Springer, 1997.
  • Hin and Wang (2009) L.-Y. Hin and Y.-G. Wang. Working-correlation-structure identification in generalized estimating equations. Statistics in medicine, 28(4):642–658, 2009.
  • Hinkelmann and Kempthorne (2005) K. Hinkelmann and O. Kempthorne. Design and Analysis of Experiments. Wiley series in probability and mathematical statistics. Applied probability and statistics. Wiley, New York, Vol 2, 2005.
  • Jaime (2019) G. Jaime. Uso de un residuo de papel como suplemento para vacas lecheras [Use of a paper residue as a supplement for dairy cows]. Master's thesis, Universidad Nacional de Colombia, Bogotá, 2019.
  • Jones and Kenward (2015) B. Jones and M. G. Kenward. Design and Analysis of Cross-Over Trials, Third Edition. Chapman & Hall/CRC, Boca Raton, 2015.
  • Josephy et al. (2015) H. Josephy, S. Vansteelandt, M.-A. Vanderhasselt, and T. Loeys. Within-subject mediation analysis in ab/ba crossover designs. The international journal of biostatistics, 11(1):1–22, 2015.
  • Kitchenham et al. (2018) B. Kitchenham, L. Madeyski, F. Curtin, et al. Corrections to effect size variances for continuous outcomes of cross-over clinical trials. Statistics in medicine, 37(2):320–323, 2018.
  • Lancaster (1965) H. Lancaster. The helmert matrices. The American Mathematical Monthly, 72(1):4–12, 1965.
  • Li et al. (2018) F. Li, A. B. Forbes, E. L. Turner, and J. S. Preisser. Power and sample size requirements for GEE analyses of cluster randomized crossover trials. Statistics in Medicine, 2018.
  • Liang and Zeger (1986) K.-Y. Liang and S. L. Zeger. Longitudinal data analysis using generalized linear models. Biometrika, 73(1):13–22, 1986.
  • Liu and Li (2016) F. Liu and Q. Li. A Bayesian model for joint analysis of multivariate repeated measures and time to event data in crossover trials. Statistical Methods in Medical Research, 25(5):2180–2192, 2016.
  • Lui (2015) K.-J. Lui. Test equality between three treatments under an incomplete block crossover design. Journal of biopharmaceutical statistics, 25(4):795–811, 2015.
  • Madeyski and Kitchenham (2018) L. Madeyski and B. Kitchenham. Effect sizes and their variance for ab/ba crossover design studies. Empirical Software Engineering, 23(4):1982–2017, 2018.
  • McDaniel et al. (2013) L. S. McDaniel, N. C. Henderson, and P. J. Rathouz. Fast pure R implementation of GEE: application of the Matrix package. The R Journal, 5:181–187, 2013. URL https://journal.r-project.org/archive/2013-1/mcdaniel-henderson-rathouz.pdf.
  • Oh et al. (2003) H. S. Oh, S.-g. Ko, and M.-S. Oh. A Bayesian approach to assessing population bioequivalence in a 2×2 crossover design. Journal of Applied Statistics, 30(8):881–891, 2003.
  • Pan (2001) W. Pan. Akaike’s information criterion in generalized estimating equations. Biometrics, 57:120–125, 2001.
  • Patterson (1951) H. D. Patterson. Change-over trials. Journal of the Royal Statistical Society. Series B (Methodological), 13:256–271, 1951.
  • R Core Team (2022) R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2022. URL https://www.R-project.org/.
  • Ratkowsky et al. (1992) D. Ratkowsky, R. Alldredge, and M. Evans. Cross-over Experiments. Statistics A Series of Textbooks and Monographs, Washington, 1992.
  • Rosenkranz (2015) G. K. Rosenkranz. Analysis of cross-over studies with missing data. Statistical methods in medical research, 24(4):420–433, 2015.
  • Shkedy et al. (2005) Z. Shkedy, G. Molenberghs, H. V. Craenendonck, T. Steckler, and L. Bijnens. A hierarchical binomial-poisson model for the analysis of a crossover design for correlated binary data when the number of trials is dose-dependent. Journal of biopharmaceutical statistics, 15(2):225–239, 2005.
  • Srivastava et al. (2008) M. S. Srivastava, T. von Rosen, and D. Von Rosen. Models with a kronecker product covariance structure: estimation and testing. Mathematical Methods of Statistics, 17(4):357–370, 2008.
  • Vegas et al. (2016) S. Vegas, C. Apa, and N. Juristo. Crossover designs in software engineering experiments: Benefits and perils. IEEE Transactions on Software Engineering, 42(2):120–135, 2016.

Appendix A

Theorem A.1.

The estimator of $\boldsymbol{R}(\boldsymbol{\alpha})$ given by $\hat{\boldsymbol{\Psi}}\otimes\boldsymbol{R}_{1}(\hat{\boldsymbol{\alpha}}_{1})$ is asymptotically unbiased and consistent.

Proof.

First, asymptotic properties of the Pearson residuals concerning their expectation, variance, and distribution are explored. Define the theoretical Pearson residual of the $k$th response of the $i$th experimental unit at the $j$th period as:

$$R_{ijk}=\frac{Y_{ijk}-\mu_{ijk}}{\phi\sqrt{V(\mu_{ijk})}} \qquad (16)$$

and adapting the results of Cox and Snell (1968) it is true that:

$$E(R_{ijk})=\sum_{l=1}^{p}B(\hat{\beta}_{l})E(H_{l}^{(ijk)})-\sum_{l,s=1}^{p}\mathcal{K}^{ls}E\left(H_{l}^{(ijk)}U_{s}^{(ijk)}+\frac{1}{2}H_{ls}^{(ijk)}\right)+O(n^{-1}) \qquad (17)$$
$$Var(R_{ijk})=1+2\sum_{l=1}^{p}B(\hat{\beta}_{l})E(R_{ijk}H_{l}^{(ijk)})-\sum_{l,s=1}^{p}\mathcal{K}^{ls}E\left(2R_{ijk}H_{l}^{(ijk)}U_{s}^{(ijk)}+H_{l}^{(ijk)}H_{s}^{(ijk)}+R_{ijk}H_{ls}^{(ijk)}\right)+O(n^{-1}) \qquad (18)$$

where

$$H_{l}^{(ijk)}=\frac{\partial R_{ijk}}{\partial\beta_{l}},\qquad H_{ls}^{(ijk)}=\frac{\partial^{2}R_{ijk}}{\partial\beta_{l}\partial\beta_{s}}$$

and $\mathcal{K}^{ls}$ is the element at position $(l,s)$ of the inverse of the Fisher information matrix. According to Cordeiro and McCullagh (1991), the bias $B(\hat{\boldsymbol{\beta}})$ of $\hat{\boldsymbol{\beta}}$ is given by:

$$B(\hat{\boldsymbol{\beta}})=-\frac{1}{2\phi}\boldsymbol{(X^{T}WX)}^{-1}\boldsymbol{X}^{T}\boldsymbol{D}(z_{ijk})\boldsymbol{F}\boldsymbol{1} \qquad (19)$$
$$\boldsymbol{W}=\boldsymbol{D}(V(Y_{ijk})^{\frac{1}{2}})\boldsymbol{R}(\boldsymbol{\alpha})\boldsymbol{D}(V(Y_{ijk})^{\frac{1}{2}})\boldsymbol{D}\left(\frac{\partial\boldsymbol{\mu}_{i}}{\partial\eta}\right) \qquad (20)$$
$$\boldsymbol{F}=\boldsymbol{D}\left(V(\mu_{ijk})^{-1}\left(\frac{\partial\mu_{ijk}}{\partial\eta_{ijk}}\right)\left(\frac{\partial^{2}\mu_{ijk}}{\partial\eta^{2}_{ijk}}\right)\right)$$

where $\boldsymbol{1}$ is a vector of appropriate size whose entries are all equal to 1, and $\boldsymbol{D}(z_{ijk})$ is a diagonal matrix whose diagonal elements are the variances of the estimated linear predictors, i.e., the diagonal of the matrix

$$\boldsymbol{z}=Var(\hat{\eta}_{111},\ldots,\hat{\eta}_{nPL}) \qquad (21)$$

that is, $z_{ijk}=Var(\hat{\eta}_{ijk})$, and $\boldsymbol{X}$ is the design matrix of the parametric effects described in Equation (3). Now, computing the expected values by taking into account the properties of the exponential family, we obtain:

$$\begin{aligned}
E\left(H_{l}^{(ijk)}\right)&=-\sqrt{\phi V(\mu_{ijk})}\left(\frac{\partial\mu_{ijk}}{\partial\eta_{ijk}}\right)x_{l(ijk)} &(22)\\
E\left(H_{ls}^{(ijk)}\right)&=\left[2V(\mu_{ijk})^{-\frac{3}{2}}\left(\frac{\partial V(\mu_{ijk})}{\partial\mu_{ijk}}\right)\left(\frac{\partial\mu_{ijk}}{\partial\eta_{ijk}}\right)^{2}-2V(\mu_{ijk})^{-\frac{1}{2}}\left(\frac{\partial^{2}\mu_{ijk}}{\partial\eta_{ijk}^{2}}\right)\right]\times\frac{1}{2}\sqrt{\phi}\,x_{l(ijk)}x_{s(ijk)} &(23)\\
E\left(H_{l}^{(ijk)}U_{s}^{(ijk)}\right)&=-\frac{1}{2}\phi^{\frac{1}{2}}V(\mu_{ijk})^{-\frac{3}{2}}\left(\frac{\partial V(\mu_{ijk})}{\partial\mu_{ijk}}\right)\left(\frac{\partial\mu_{ijk}}{\partial\eta_{ijk}}\right)^{2}x_{l(ijk)}x_{s(ijk)} &(24)\\
E\left(2R_{ijk}H_{l}^{(ijk)}U_{s}^{(ijk)}\right)&=-V(\mu_{ijk})^{-2}\left(\frac{\partial V(\mu_{ijk})}{\partial\mu_{ijk}}\right)^{2}\left(\frac{\partial\mu_{ijk}}{\partial\eta_{ijk}}\right)^{2}x_{l(ijk)}x_{s(ijk)}-2\phi\left(\frac{\partial V(\mu_{ijk})}{\partial\mu_{ijk}}\right)^{-1}\left(\frac{\partial\mu_{ijk}}{\partial\eta_{ijk}}\right)^{2}x_{l(ijk)}x_{s(ijk)} &(25)\\
E\left(H_{l}^{(ijk)}H_{s}^{(ijk)}\right)&=\left[\phi+\frac{\left(\frac{\partial V(\mu_{ijk})}{\partial\mu_{ijk}}\right)}{4V(\mu_{ijk})}\right]w_{ijk}x_{l(ijk)}x_{s(ijk)} &(26)\\
E\left(R_{ijk}H_{ls}^{(ijk)}\right)&=\frac{1}{2}\phi V(\mu_{ijk})^{-\frac{1}{2}}\left(\frac{\partial\mu_{ijk}}{\partial\eta_{ijk}}\right) &(27)
\end{aligned}$$

where $w_{ijk}$ is the $ijk$th element of the diagonal of the matrix $\boldsymbol{W}$ defined in Equation (20). From Equation (22) and the bias of $\hat{\boldsymbol{\beta}}$ given in Equation (19), it follows that:

$$\sum_{l=1}^{p}B(\hat{\beta}_{l})E\left(H_{l}^{(ijk)}\right)=-\phi^{\frac{1}{2}}V(\mu_{ijk})^{-\frac{1}{2}}\left(\frac{\partial\mu_{ijk}}{\partial\eta_{ijk}}\right)\boldsymbol{e}_{ijk}\boldsymbol{X}B(\hat{\boldsymbol{\beta}}) \qquad (28)$$

where $\boldsymbol{e}_{ijk}$ is a vector of zeros with a 1 at the $ijk$th position. From equations (23) and (24), we get:

$$\begin{aligned}
E\left(H_{l}^{(ijk)}U_{s}^{(ijk)}+\frac{1}{2}H_{ls}^{(ijk)}\right)&=-\frac{1}{2}\phi^{\frac{1}{2}}V(\mu_{ijk})^{-\frac{1}{2}}\left(\frac{\partial^{2}\mu_{ijk}}{\partial\eta_{ijk}^{2}}\right)^{2}x_{l(ijk)}x_{s(ijk)}\\
\sum_{l,s=1}^{p}\mathcal{K}^{ls}E\left(H_{l}^{(ijk)}U_{s}^{(ijk)}+\frac{1}{2}H_{ls}^{(ijk)}\right)&=-\frac{1}{2}\phi^{\frac{1}{2}}V(\mu_{ijk})^{-\frac{1}{2}}\left(\frac{\partial^{2}\mu_{ijk}}{\partial\eta_{ijk}^{2}}\right)^{2}x_{l(ijk)}x_{s(ijk)} \qquad (29)
\end{aligned}$$

and therefore, from equations (28), (29) and (17):

$$E(R_{111},R_{112},\ldots,R_{nPL})=\frac{-1}{2\sqrt{\phi}}(\boldsymbol{I}-\boldsymbol{H})\boldsymbol{J}\boldsymbol{z} \qquad (30)$$

where

$$\begin{aligned}
\boldsymbol{H}&=\boldsymbol{W}^{\frac{1}{2}}\boldsymbol{X}\boldsymbol{(X^{T}WX)}^{-1}\boldsymbol{X}^{T}\boldsymbol{W}^{\frac{1}{2}}\\
\boldsymbol{J}&=\boldsymbol{D}\left(V(Y_{ijk})\right)\boldsymbol{D}\left(\frac{\partial^{2}\boldsymbol{\mu}_{i}}{\partial\eta^{2}}\right)
\end{aligned}$$

and from equations (25), (26) and (27):

$$\begin{aligned}
-\sum_{l,s=1}^{p}\mathcal{K}^{ls}&E\left(2R_{ijk}H_{l}^{(ijk)}U_{s}^{(ijk)}+H_{l}^{(ijk)}H_{s}^{(ijk)}+R_{ijk}H_{ls}^{(ijk)}\right)\\
&=\left[-\phi w_{ijk}-\frac{\left(\frac{\partial V(\mu_{ijk})}{\partial\mu_{ijk}}\right)\left(\frac{\partial^{2}\mu_{ijk}}{\partial\eta_{ijk}^{2}}\right)}{2V(\mu_{ijk})}-\frac{1}{2}w_{ijk}\left(\frac{\partial^{2}V(\mu_{ijk})}{\partial\mu^{2}_{ijk}}\right)\right]\frac{z_{ijk}}{\phi} \qquad (31)
\end{aligned}$$

and from equations (19) and (27) it follows that:

$$\begin{aligned}
2\sum_{l=1}^{p}&B(\hat{\beta}_{l})E(R_{ijk}H_{ls}^{(ijk)})\\
&=\frac{1}{2\phi}\frac{\left(\frac{\partial V(\mu_{ijk})}{\partial\mu_{ijk}}\right)\left(\frac{\partial^{2}\mu_{ijk}}{\partial\eta_{ijk}^{2}}\right)}{V(\mu_{ijk})}\boldsymbol{e}_{ijk}\boldsymbol{Z}\boldsymbol{D}(z_{ii})\boldsymbol{D}\left(V(\mu_{ijk})^{-1}\left(\frac{\partial^{2}\mu_{ijk}}{\partial\eta_{ijk}^{2}}\right)\left(\frac{\partial\mu_{ijk}}{\partial\eta_{ijk}}\right)\right)\boldsymbol{1} \qquad (32)
\end{aligned}$$

therefore, from the results (31) and (32) we have that:

$$\left[Var(R_{111}),Var(R_{112}),\ldots,Var(R_{nPL})\right]=\boldsymbol{1}+\frac{1}{2\phi}\left(\boldsymbol{Q}\boldsymbol{H}\boldsymbol{J}-\boldsymbol{M}\right)\boldsymbol{z} \qquad (33)$$

where $\boldsymbol{1}$ is a vector of ones and

$$\begin{aligned}
\boldsymbol{Q}&=\boldsymbol{D}\left(V(Y_{ijk})\right)^{\frac{1}{2}}\boldsymbol{D}\left(\frac{\partial V(Y_{ijk})}{\partial\mu}\right)\\
\boldsymbol{M}&=\boldsymbol{D}\left(V(Y_{ijk})\right)^{\frac{1}{2}}\boldsymbol{D}\left(\frac{\partial^{2}V(Y_{ijk})}{\partial\mu^{2}}+2\phi\boldsymbol{D}(\boldsymbol{W})+\frac{\frac{\partial V(Y_{ijk})}{\partial\mu}}{V(Y_{ijk})}\right)
\end{aligned}$$

By Theorem 2 of Liang and Zeger (1986), it is known that the GEE estimators of the linear predictors $\eta_{ijk}$ are consistent and unbiased, i.e.,

$$\boldsymbol{z}\xrightarrow[n\to\infty]{p}\boldsymbol{0} \qquad (34)$$

thus from (30) and (33), we find that:

$$E(R_{ijk})=O(n^{-1}) \qquad (35)$$
$$Var(R_{ijk})=1+O(n^{-1}) \qquad (36)$$

and furthermore, by Section 3 of Cordeiro (2004) and equations (35) and (36) it follows that:

$$R_{ijk}\xrightarrow[n\to\infty]{d}N(0,1) \qquad (37)$$

Let

$$\boldsymbol{\Gamma}=\{\gamma_{ij}\}_{n\times n} \qquad (38)$$

be a matrix whose first column is $\frac{\boldsymbol{1}}{\sqrt{n}}$ and whose remaining columns are:

$$g_{i-1}=\left(\frac{1}{\sqrt{(i-1)i}},\ldots,\frac{1}{\sqrt{(i-1)i}},-\frac{i-1}{\sqrt{(i-1)i}},0,\ldots,0\right),\qquad i=2,\ldots,n \qquad (39)$$
$$\boldsymbol{\Gamma}^{T}=\begin{pmatrix}\frac{\boldsymbol{1}}{\sqrt{n}}&\boldsymbol{G}\end{pmatrix}^{T}=\begin{pmatrix}\frac{1}{\sqrt{n}}&\frac{1}{\sqrt{n}}&\frac{1}{\sqrt{n}}&\frac{1}{\sqrt{n}}&\cdots&\frac{1}{\sqrt{n}}\\ \frac{1}{\sqrt{2}}&-\frac{1}{\sqrt{2}}&0&0&\cdots&0\\ \frac{1}{\sqrt{6}}&\frac{1}{\sqrt{6}}&-\frac{2}{\sqrt{6}}&0&\cdots&0\\ \frac{1}{\sqrt{12}}&\frac{1}{\sqrt{12}}&\frac{1}{\sqrt{12}}&-\frac{3}{\sqrt{12}}&\cdots&0\\ \vdots&\vdots&\vdots&\vdots&\ddots&\vdots\\ \frac{1}{\sqrt{n(n-1)}}&\frac{1}{\sqrt{n(n-1)}}&\frac{1}{\sqrt{n(n-1)}}&\frac{1}{\sqrt{n(n-1)}}&\cdots&-\frac{n-1}{\sqrt{n(n-1)}}\end{pmatrix}$$

where

$$\boldsymbol{G}^{T}=\begin{pmatrix}\frac{1}{\sqrt{2}}&-\frac{1}{\sqrt{2}}&0&0&\cdots&0\\ \frac{1}{\sqrt{6}}&\frac{1}{\sqrt{6}}&-\frac{2}{\sqrt{6}}&0&\cdots&0\\ \frac{1}{\sqrt{12}}&\frac{1}{\sqrt{12}}&\frac{1}{\sqrt{12}}&-\frac{3}{\sqrt{12}}&\cdots&0\\ \vdots&\vdots&\vdots&\vdots&\ddots&\vdots\\ \frac{1}{\sqrt{n(n-1)}}&\frac{1}{\sqrt{n(n-1)}}&\frac{1}{\sqrt{n(n-1)}}&\frac{1}{\sqrt{n(n-1)}}&\cdots&-\frac{n-1}{\sqrt{n(n-1)}}\end{pmatrix}$$

then the matrix $\boldsymbol{\Gamma}$ is a Helmert matrix (Lancaster, 1965) and therefore:

$$\boldsymbol{\Gamma}\boldsymbol{\Gamma}^{T}=\boldsymbol{I}_{n},\qquad\boldsymbol{1}_{n}^{T}\boldsymbol{G}=\boldsymbol{0},\qquad\boldsymbol{G}^{T}\boldsymbol{G}=\boldsymbol{I}_{n-1},\qquad\boldsymbol{G}\boldsymbol{G}^{T}=\boldsymbol{I}_{n}-\frac{1}{n}\boldsymbol{1}_{n}\boldsymbol{1}_{n}^{T} \qquad (40)$$
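A quick numerical check of the construction in (39)-(40) can be carried out in R; the function helmert() below builds $\boldsymbol{\Gamma}$ for a given $n$:

helmert <- function(n) {
  G <- matrix(0, n, n - 1)
  for (i in 2:n) {                             # column i-1 corresponds to g_{i-1} in (39)
    G[1:(i - 1), i - 1] <- 1 / sqrt((i - 1) * i)
    G[i, i - 1]         <- -(i - 1) / sqrt((i - 1) * i)
  }
  cbind(rep(1 / sqrt(n), n), G)                # Gamma = (1/sqrt(n) : G)
}
Gam <- helmert(5)
max(abs(Gam %*% t(Gam) - diag(5)))             # numerically zero: Gamma Gamma^T = I_n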

If $r_{ijk}$ is defined as the estimated Pearson residual of the $i$th experimental unit in the $j$th period and the $k$th observation, i.e.,

$$r_{ijk}=\hat{R}_{ijk}=\frac{Y_{ijk}-\hat{\mu}_{ijk}}{\hat{\phi}\sqrt{V(\hat{\mu}_{ijk})}}$$

and $\boldsymbol{r}_{i}$ is the matrix of residuals of the $i$th individual, whose first row contains the $L$ Pearson residuals defined in Equation (8) corresponding to the first period, whose second row contains the $L$ residuals corresponding to the second period, and so on until completing a matrix with $P$ rows and $L$ columns, i.e.:

$$\boldsymbol{r}_{i}=\begin{pmatrix}r_{i11}&r_{i12}&\cdots&r_{i1L}\\ r_{i21}&r_{i22}&\cdots&r_{i2L}\\ \vdots&\vdots&\ddots&\vdots\\ r_{iP1}&r_{iP2}&\cdots&r_{iPL}\end{pmatrix} \qquad (41)$$

By Equation (35) and the correlation assumption given in Equation (6) it is true that:

$$\begin{aligned}
&E(\boldsymbol{r}_{i})=\boldsymbol{0}_{P\times L}\\
&Corr\left(Vec(\boldsymbol{r}_{i})\right)=\boldsymbol{\Psi}\otimes\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1}) \qquad (42)\\
&Corr\left(Vec(\boldsymbol{r}_{i}),Vec(\boldsymbol{r}_{i^{\prime}})\right)=\boldsymbol{0}_{PL\times PL},\qquad i\neq i^{\prime}
\end{aligned}$$

Now, defining $\boldsymbol{R}$ as:

$$\boldsymbol{R}=(\boldsymbol{r}_{1},\ldots,\boldsymbol{r}_{n})_{P\times nL}$$

and noting that, since $\boldsymbol{\Gamma}$ is orthogonal, $\boldsymbol{\Gamma}\otimes\boldsymbol{I}_{L}$ is also orthogonal, we obtain (Srivastava et al., 2008):

$$\boldsymbol{R}(\boldsymbol{\Gamma}\otimes\boldsymbol{I}_{L})=\begin{pmatrix}\sqrt{n}\bar{\boldsymbol{r}}&\boldsymbol{R}(\boldsymbol{G}\otimes\boldsymbol{I}_{L})\end{pmatrix} \qquad (43)$$

and according to Equation (42), we get:

$$\begin{aligned}
\boldsymbol{R}\left(\boldsymbol{I}_{n}\otimes\boldsymbol{\Psi}^{-1}\right)\boldsymbol{R}^{T}&=(\boldsymbol{r}_{1},\ldots,\boldsymbol{r}_{n})\left(\boldsymbol{I}_{n}\otimes\boldsymbol{\Psi}^{-1}\right)(\boldsymbol{r}_{1},\ldots,\boldsymbol{r}_{n})^{T}\\
&=n\bar{\boldsymbol{r}}\otimes\boldsymbol{\Psi}^{-1}\boldsymbol{R}+\boldsymbol{R}(\boldsymbol{G}\otimes\boldsymbol{I}_{L})\left(\boldsymbol{I}_{n}\otimes\boldsymbol{\Psi}^{-1}\right)(\boldsymbol{G}^{T}\otimes\boldsymbol{I}_{L})\boldsymbol{R}^{T}\\
&=n\bar{\boldsymbol{r}}\otimes\boldsymbol{\Psi}^{-1}\boldsymbol{R}+\boldsymbol{Z}(\boldsymbol{G}^{T}\otimes\boldsymbol{I}_{L})\boldsymbol{Z}^{T}
\end{aligned}$$

where $\boldsymbol{Z}$ is:

$$\boldsymbol{Z}_{P\times(n-1)L}=(\boldsymbol{Z}_{1},\ldots,\boldsymbol{Z}_{(n-1)})=\boldsymbol{R}(\boldsymbol{G}\otimes\boldsymbol{I}_{L})=(\boldsymbol{r}_{1},\ldots,\boldsymbol{r}_{n})(\boldsymbol{G}\otimes\boldsymbol{I}_{L}) \qquad (44)$$

where $\bar{\boldsymbol{r}}$ is the matrix of the average residuals defined in Equation (41) for each period, that is,

$$\bar{\boldsymbol{r}}=\frac{1}{n}\sum_{i=1}^{n}\boldsymbol{r}_{i}=\frac{1}{n}\sum_{i=1}^{n}\begin{pmatrix}r_{i11}&r_{i12}&\cdots&r_{i1L}\\ r_{i21}&r_{i22}&\cdots&r_{i2L}\\ \vdots&\vdots&\ddots&\vdots\\ r_{iP1}&r_{iP2}&\cdots&r_{iPL}\end{pmatrix}$$

and

$$\boldsymbol{Z}_{1}=\begin{pmatrix}\frac{1}{\sqrt{2}}r_{111}-\frac{1}{\sqrt{2}}r_{211}&\cdots&\frac{1}{\sqrt{2}}r_{11L}-\frac{1}{\sqrt{2}}r_{21L}\\ \frac{1}{\sqrt{2}}r_{121}-\frac{1}{\sqrt{2}}r_{221}&\cdots&\frac{1}{\sqrt{2}}r_{12L}-\frac{1}{\sqrt{2}}r_{22L}\\ \vdots&\ddots&\vdots\\ \frac{1}{\sqrt{2}}r_{1P1}-\frac{1}{\sqrt{2}}r_{2P1}&\cdots&\frac{1}{\sqrt{2}}r_{1PL}-\frac{1}{\sqrt{2}}r_{2PL}\end{pmatrix}_{P\times L}$$
$$\boldsymbol{Z}_{2}=\begin{pmatrix}\frac{1}{\sqrt{6}}\sum_{i=1}^{2}r_{i11}-\frac{2}{\sqrt{6}}r_{311}&\cdots&\frac{1}{\sqrt{6}}\sum_{i=1}^{2}r_{i1L}-\frac{2}{\sqrt{6}}r_{31L}\\ \frac{1}{\sqrt{6}}\sum_{i=1}^{2}r_{i21}-\frac{2}{\sqrt{6}}r_{321}&\cdots&\frac{1}{\sqrt{6}}\sum_{i=1}^{2}r_{i2L}-\frac{2}{\sqrt{6}}r_{32L}\\ \vdots&\ddots&\vdots\\ \frac{1}{\sqrt{6}}\sum_{i=1}^{2}r_{iP1}-\frac{2}{\sqrt{6}}r_{3P1}&\cdots&\frac{1}{\sqrt{6}}\sum_{i=1}^{2}r_{iPL}-\frac{2}{\sqrt{6}}r_{3PL}\end{pmatrix}_{P\times L}$$
$$\vdots$$
$$\boldsymbol{Z}_{(n-1)}=\begin{pmatrix}\frac{1}{\sqrt{n(n-1)}}\sum_{i=1}^{n-1}r_{i11}-\frac{n-1}{\sqrt{n(n-1)}}r_{n11}&\cdots&\frac{1}{\sqrt{n(n-1)}}\sum_{i=1}^{n-1}r_{i1L}-\frac{n-1}{\sqrt{n(n-1)}}r_{n1L}\\ \vdots&\ddots&\vdots\\ \frac{1}{\sqrt{n(n-1)}}\sum_{i=1}^{n-1}r_{iP1}-\frac{n-1}{\sqrt{n(n-1)}}r_{nP1}&\cdots&\frac{1}{\sqrt{n(n-1)}}\sum_{i=1}^{n-1}r_{iPL}-\frac{n-1}{\sqrt{n(n-1)}}r_{nPL}\end{pmatrix}_{P\times L}$$

Now by the properties of the Pearson residuals we have that:

$$E(\boldsymbol{Z}_{1})=E(\boldsymbol{Z}_{2})=\cdots=E(\boldsymbol{Z}_{n-1})=\boldsymbol{0}_{P\times L}$$

and, by the properties given in Equation (40), because the experimental units are assumed independent, that is, $Corr(r_{ijk},r_{i^{\prime}j^{\prime}k^{\prime}})=0$ for all $i\neq i^{\prime}$, and because Equation (3) holds, then:

$$\begin{aligned}
&Corr(r_{ijk},r_{i^{\prime}j^{\prime}k^{\prime}})=0\qquad\forall i\neq i^{\prime}\\
&Corr(r_{ijk},r_{ij^{\prime}k^{\prime}})=Corr(r_{i^{\prime}jk},r_{i^{\prime}j^{\prime}k^{\prime}}),\qquad\forall i\neq i^{\prime}\\
&Corr\left(\frac{1}{\sqrt{2}}r_{111}-\frac{1}{\sqrt{2}}r_{211},\frac{1}{\sqrt{2}}r_{121}-\frac{1}{\sqrt{2}}r_{221}\right)=\frac{1}{2}Corr(r_{111},r_{121})+\frac{1}{2}Corr(r_{211},r_{221})=Corr(r_{111},r_{121})\\
&Corr\left(\frac{1}{\sqrt{n(n-1)}}\sum_{i=1}^{n-1}r_{i11}-\frac{n-1}{\sqrt{n(n-1)}}r_{n11},\frac{1}{\sqrt{n(n-1)}}\sum_{i=1}^{n-1}r_{i21}-\frac{n-1}{\sqrt{n(n-1)}}r_{n21}\right)=Corr(r_{111},r_{121})
\end{aligned}$$

furthermore,

$$Var(Vec(\boldsymbol{Z}_{1}))=Var\left\{\begin{pmatrix}\frac{1}{\sqrt{2}}r_{111}-\frac{1}{\sqrt{2}}r_{211}\\ \frac{1}{\sqrt{2}}r_{121}-\frac{1}{\sqrt{2}}r_{221}\\ \vdots\\ \frac{1}{\sqrt{2}}r_{1P1}-\frac{1}{\sqrt{2}}r_{2P1}\\ \frac{1}{\sqrt{2}}r_{112}-\frac{1}{\sqrt{2}}r_{212}\\ \vdots\\ \frac{1}{\sqrt{2}}r_{1PL}-\frac{1}{\sqrt{2}}r_{2PL}\end{pmatrix}_{PL\times 1}\right\}=\begin{pmatrix}1&Corr(r_{111},r_{121})&\cdots&Corr(r_{111},r_{1P1})\\ Corr(r_{111},r_{121})&1&\cdots&Corr(r_{121},r_{1P1})\\ \vdots&\vdots&\ddots&\vdots\\ Corr(r_{111},r_{1P1})&Corr(r_{121},r_{1P1})&\cdots&1\end{pmatrix}_{PL\times PL}=\boldsymbol{\Psi}\otimes\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})$$
$$Var(Vec(\boldsymbol{Z}_{1}))=Var(Vec(\boldsymbol{Z}_{2}))=\cdots=Var(Vec(\boldsymbol{Z}_{(n-1)}))=\boldsymbol{\Psi}\otimes\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1}) \qquad (45)$$

By the central limit theorem, we get that:

$$\sqrt{n}\,Vec(\bar{\boldsymbol{r}})\xrightarrow[n\to\infty]{d}N_{LP}(\boldsymbol{0},\boldsymbol{\Psi}\otimes\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})) \qquad (46)$$

By Equations (37) and (45) it follows that:

$$Vec(\boldsymbol{Z}_{j})\xrightarrow[n\to\infty]{d}N_{LP}(\boldsymbol{0},\boldsymbol{\Psi}\otimes\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})) \qquad (47)$$

and by Equations (40), (46) and (47) that:

$$Cov(Vec(\bar{\boldsymbol{r}}),Vec(\boldsymbol{Z}_{j}))=\boldsymbol{0},\qquad Cov(Vec(\boldsymbol{Z}_{i}),Vec(\boldsymbol{Z}_{j}))=\boldsymbol{0},\quad i\neq j$$

and partitioning $\boldsymbol{Z}_{1}$ as follows:

$$\begin{aligned}
\boldsymbol{Z}_{1}&=\begin{pmatrix}\frac{1}{\sqrt{2}}r_{111}-\frac{1}{\sqrt{2}}r_{211}&\cdots&\frac{1}{\sqrt{2}}r_{11L}-\frac{1}{\sqrt{2}}r_{21L}\\ \frac{1}{\sqrt{2}}r_{121}-\frac{1}{\sqrt{2}}r_{221}&\cdots&\frac{1}{\sqrt{2}}r_{12L}-\frac{1}{\sqrt{2}}r_{22L}\\ \vdots&\ddots&\vdots\\ \frac{1}{\sqrt{2}}r_{1P1}-\frac{1}{\sqrt{2}}r_{2P1}&\cdots&\frac{1}{\sqrt{2}}r_{1PL}-\frac{1}{\sqrt{2}}r_{2PL}\end{pmatrix}_{P\times L}\\
&=\begin{pmatrix}\begin{pmatrix}\frac{1}{\sqrt{2}}r_{111}-\frac{1}{\sqrt{2}}r_{211}\\ \frac{1}{\sqrt{2}}r_{121}-\frac{1}{\sqrt{2}}r_{221}\\ \vdots\\ \frac{1}{\sqrt{2}}r_{1P1}-\frac{1}{\sqrt{2}}r_{2P1}\end{pmatrix}_{P\times 1}&\cdots&\begin{pmatrix}\frac{1}{\sqrt{2}}r_{11L}-\frac{1}{\sqrt{2}}r_{21L}\\ \frac{1}{\sqrt{2}}r_{12L}-\frac{1}{\sqrt{2}}r_{22L}\\ \vdots\\ \frac{1}{\sqrt{2}}r_{1PL}-\frac{1}{\sqrt{2}}r_{2PL}\end{pmatrix}_{P\times 1}\end{pmatrix}\\
&=(\boldsymbol{z}_{(1)1},\ldots,\boldsymbol{z}_{(L)1})
\end{aligned}$$

Then, it is obtained that:

$$E(\boldsymbol{z}_{(j)1}\boldsymbol{z}^{T}_{(j)1})=\Psi_{jj}\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1}),\qquad E(\boldsymbol{Z}_{1}\boldsymbol{Z}^{T}_{1})=trace(\boldsymbol{\Psi})\,\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})$$

Similarly, it is found that:

$$\begin{aligned}
&E(\boldsymbol{z}_{(j)1}\boldsymbol{z}^{T}_{(j)1})=\Psi_{jj}\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})\\
&E(\boldsymbol{z}_{(j)2}\boldsymbol{z}^{T}_{(j)2})=\Psi_{jj}\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})\\
&\qquad\vdots\\
&E(\boldsymbol{z}_{(j)(n-1)}\boldsymbol{z}^{T}_{(j)(n-1)})=\Psi_{jj}\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1}) \qquad (48)\\
&E(\boldsymbol{Z}_{1}\boldsymbol{Z}^{T}_{1})=trace(\boldsymbol{\Psi})\,\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})\\
&\qquad\vdots\\
&E(\boldsymbol{Z}_{(n-1)}\boldsymbol{Z}^{T}_{(n-1)})=trace(\boldsymbol{\Psi})\,\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1}) \qquad (49)
\end{aligned}$$

Therefore, since $\boldsymbol{\Psi}_{jj}=1$ for all $j=1,\ldots,P$ (so that $trace(\boldsymbol{\Psi})=P$), from Equation (48) we get:

$$E\left(\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})^{\frac{1}{2}}\boldsymbol{z}_{(j^{\prime})k}\boldsymbol{z}^{T}_{(j)k}\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})^{\frac{1}{2}}\right)=\boldsymbol{\Psi}_{jj^{\prime}}\boldsymbol{I}_{P},\qquad\forall k=1,\ldots,n-1 \qquad (50)$$
$$Cov\left[\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})^{\frac{1}{2}}\boldsymbol{z}_{(k)j}\boldsymbol{z}^{T}_{(i)j}\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})^{\frac{1}{2}},\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})^{\frac{1}{2}}\boldsymbol{z}_{(k)j^{\prime}}\boldsymbol{z}^{T}_{(i)j^{\prime}}\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})^{\frac{1}{2}}\right]=\boldsymbol{0} \qquad (51)$$

By Theorem 2 in Liang and Zeger (1986), it is known that $\boldsymbol{R}_{1}(\hat{\boldsymbol{\alpha}}_{1})$ is consistent and unbiased for $\boldsymbol{R}_{1}(\boldsymbol{\alpha}_{1})$; hence, by Equations (50), (46) and (47), we have that

$$\hat{\psi}_{jj^{\prime}}=\frac{1}{n}\sum_{i=1}^{n}tr\left(\boldsymbol{R}_{1}(\boldsymbol{\hat{\alpha}}_{1})(\boldsymbol{r}_{(j)i}-\bar{\boldsymbol{r}}_{(j)})(\boldsymbol{r}_{(j^{\prime})i}-\bar{\boldsymbol{r}}_{(j^{\prime})})^{T}\right) \qquad (52)$$

is a consistent and asymptotically unbiased estimator of $\Psi_{jj^{\prime}}$, which proves the theorem. ∎