
Gaussian variational approximation with composite likelihood for crossed random effect models

Libai Xu 
School of Mathematical Sciences, Soochow University, Suzhou, Jiangsu, China,
1 Shizi Street, Suzhou, Jiangsu, 215006, China.
lbxu@suda.edu.cn
Nancy Reid and Dehan Kong
Department of Statistical Sciences, University of Toronto, Toronto, Canada,
700 University Ave, 9th Floor, Toronto, Ontario M5G 1Z5, Canada.
nancym.reid@utoronto.ca and dehan.kong@utoronto.ca
Abstract

Composite likelihood usually ignores dependencies among response components, while variational approximation to the likelihood ignores dependencies among parameter components. We derive a Gaussian variational approximation to the composite log-likelihood function for Poisson and Gamma regression models with crossed random effects. We show consistency and asymptotic normality of the estimates derived from this approximation, and support the theory with simulation studies. The approach is computationally much faster than a Gaussian variational approximation to the full log-likelihood function.


Keywords: Gamma regression; generalized linear mixed models; likelihood inference; Poisson regression.

1 Introduction

Generalized linear mixed models with crossed random effects are useful for the analysis and inference of cross-classified data, such as arise in educational studies (Chung and Cai, 2021; Menictas et al., 2022), psychometric research studies (Baayen et al., 2008; Jeon et al., 2017), and medical studies (Coull et al., 2001), among others.

Suppose $Y_{ij}$, $i=1,\dots,m$; $j=1,\dots,n$, are conditionally independent given random effects $U_{i}$ and $V_{j}$. A generalized linear model for $Y_{ij}$ has density function

f(Y_{ij}\mid X_{ij},U_{i},V_{j})=\exp\left[\{Y_{ij}\theta_{ij}-b(\theta_{ij})\}/a(\phi)+c(Y_{ij},\phi)\right], (1)

and we relate this to the covariates in the usual manner:

g(\mu_{ij})=X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta+U_{i}+V_{j}, (2)

where $\mu_{ij}=E(Y_{ij})$. In (2), $X_{ij}=(1,X_{ij})^{{\mathrm{\scriptscriptstyle T}}}$, $\beta$ is a $(p+1)$-vector of fixed-effects parameters, and $U_{i}$ and $V_{j}$ are independent random effects assumed to follow $N(0,\sigma^{2}_{u})$ and $N(0,\sigma^{2}_{v})$ distributions, respectively.

In this paper, we develop a Gaussian variational approximation to a form of composite likelihood for model (1), in order to make the computations more tractable than in a full likelihood approach. We focus on two examples: Poisson regression, where $Y_{ij}\sim\text{Poisson}(\mu_{ij})$, with $\theta_{ij}=\log(\mu_{ij})$, $b(\theta_{ij})=\exp(\theta_{ij})$, $g(\mu_{ij})=\theta_{ij}$, and $\phi=1$; and Gamma regression, where $Y_{ij}$ follows a Gamma distribution with shape parameter $\alpha=\phi^{-1}$ and expectation parameter $\mu_{ij}$, so that $\theta_{ij}=-\mu_{ij}^{-1}$, $b(\theta_{ij})=\log(\mu_{ij})$, and $g(\mu_{ij})=\log(\mu_{ij})$.
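
As a quick concreteness check, the two densities can be evaluated in R (the language used for the simulations in §5). The value of the linear predictor below is an illustrative assumption; note that a Gamma distribution with shape $\alpha$ and mean $\mu_{ij}$ has rate $\alpha/\mu_{ij}$ in R's parametrization.

    # Evaluating density (1) for the two examples; eta is an illustrative
    # value of the linear predictor x'beta + u + v.
    eta <- 0.2
    dpois(3, lambda = exp(eta))                         # Poisson: mean exp(eta)
    alpha <- 0.8                                        # Gamma shape, alpha = 1/phi
    dgamma(2.5, shape = alpha, rate = alpha / exp(eta)) # Gamma: mean exp(eta)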

Other approaches to simplifying the likelihood for crossed random effects have been proposed. Penalized quasi-likelihood, based on Laplace approximation, is discussed in Schall (1991) and Breslow and Clayton (1993), although for binary data the resulting estimates are not guaranteed to be consistent (Lin and Breslow, 1996). Coull et al. (2001) developed a Monte Carlo Newton-Raphson algorithm based on expectation-maximization (McCulloch, 1997), and Ghosh et al. (2022a) proposed a backfitting algorithm for a Gaussian linear model, where the integral can be evaluated explicitly. This approach was extended to logistic regression in Ghosh et al. (2022b).

Composite likelihood methods have also been proposed to simplify computation in models with random effects. Renard et al. (2004) and Bartolucci and Lupparelli (2016) proposed pairwise likelihood for nested random effects models, and Bellio and Varin (2005) developed a pairwise likelihood approach to crossed random effects.

Our approach combines composite likelihood with variational inference. The theory is developed by extending results of Ormerod and Wand (2012), Hall et al. (2011a), and Hall et al. (2011b), which treat Poisson models with nested random effects. Related variational approaches are developed in Menictas et al. (2022) for linear models, and in Rijmen and Jeon (2013), Jeon et al. (2017), Hui et al. (2017), and Hui et al. (2019) for generalized linear mixed models.

We introduce a row-column composite likelihood that requires the computation of only one-dimensional integrals and apply a Gaussian variational approximation to this composite likelihood. We prove consistency and asymptotic normality of the variational estimates for the Poisson and Gamma regression models. Simulations demonstrate that our method is faster than the Gaussian variational approximation to the full likelihood function.

2 Gaussian variational approximation

The log-likelihood function for the generalized linear model with crossed random effects is constructed from the marginal distribution of the responses $Y_{ij}$ given the covariates $X_{ij}$, and depends on the parameters $\Psi=(\beta,\sigma^{2}_{u},\sigma^{2}_{v})$:

\ell(\Psi) = \log\int_{R^{m+n}}\Big\{\prod^{m}_{i=1}\prod^{n}_{j=1}f(Y_{ij}\mid X_{ij},U_{i},V_{j})\Big\}\prod^{m}_{i=1}p(U_{i})\prod^{n}_{j=1}p(V_{j})\,dU_{1}\cdots dU_{m}\,dV_{1}\cdots dV_{n}, (3)

with maximum likelihood estimator $\hat{\Psi}=\arg\max_{\Psi}\ell(\Psi)$. To simplify the computation of the $(m+n)$-dimensional integral we apply a Gaussian variational approximation to (3), introducing pairs of variational parameters $(\mu_{u_{i}},\lambda_{u_{i}})$, $i=1,\dots,m$, and $(\mu_{v_{j}},\lambda_{v_{j}})$, $j=1,\dots,n$. By Jensen's inequality and the concavity of the logarithm,

\ell(\Psi) = \log\int_{R^{m+n}}\Big\{\prod^{m}_{i=1}\prod^{n}_{j=1}f(Y_{ij}\mid X_{ij},U_{i},V_{j})\Big\}\prod^{m}_{i=1}p(U_{i})\frac{\phi(U_{i})}{\phi(U_{i})}\prod^{n}_{j=1}p(V_{j})\frac{\phi(V_{j})}{\phi(V_{j})}\,dU_{1}\cdots dU_{m}\,dV_{1}\cdots dV_{n}
\geq \sum^{m}_{i=1}\sum^{n}_{j=1}\int_{R^{2}}\phi(U_{i})\phi(V_{j})\log f(Y_{ij}\mid X_{ij},U_{i},V_{j})\,dU_{i}dV_{j}
+\sum^{m}_{i=1}\int_{R}\phi(U_{i})\log\{p(U_{i})/\phi(U_{i})\}\,dU_{i}+\sum^{n}_{j=1}\int_{R}\phi(V_{j})\log\{p(V_{j})/\phi(V_{j})\}\,dV_{j} = \underline{\ell}(\Psi,\xi),

where $\phi(U_{i})$ and $\phi(V_{j})$ are Gaussian densities with means $\mu_{u_{i}}$, $\mu_{v_{j}}$ and variances $\lambda_{u_{i}}$, $\lambda_{v_{j}}$, respectively. The function $\underline{\ell}(\Psi,\xi)$ is the variational lower bound on $\ell(\Psi)$; the variational parameters are $\xi=(\mu_{u_{1}},\dots,\mu_{u_{m}},\mu_{v_{1}},\dots,\mu_{v_{n}},\lambda_{u_{1}},\dots,\lambda_{u_{m}},\lambda_{v_{1}},\dots,\lambda_{v_{n}})^{{\mathrm{\scriptscriptstyle T}}}$. Ignoring constants, this lower bound simplifies to

\underline{\ell}(\Psi,\xi) = \sum^{m}_{i=1}\sum^{n}_{j=1}\left[Y_{ij}E_{\phi(u_{i})\phi(v_{j})}(\theta_{ij})-E_{\phi(u_{i})\phi(v_{j})}\{b(\theta_{ij})\}\right] (4)
+(1/2)\sum^{m}_{i=1}\left\{\log(\lambda_{u_{i}}/\sigma^{2}_{u})-(\mu^{2}_{u_{i}}+\lambda_{u_{i}})/\sigma^{2}_{u}\right\}
+(1/2)\sum^{n}_{j=1}\left\{\log(\lambda_{v_{j}}/\sigma^{2}_{v})-(\mu^{2}_{v_{j}}+\lambda_{v_{j}})/\sigma^{2}_{v}\right\}.

The Gaussian variational approximation estimators maximize this evidence lower bound jointly over the model and variational parameters: $(\widehat{\underline{\Psi}},\widehat{\underline{\xi}})=\arg\max_{\Psi,\xi}\underline{\ell}(\Psi,\xi)$. The advantage of the variational lower bound $\underline{\ell}(\Psi,\xi)$ over $\ell(\Psi)$ is that the former contains only the terms $E_{\phi(u_{i})\phi(v_{j})}(\theta_{ij})$ and $E_{\phi(u_{i})\phi(v_{j})}\{b(\theta_{ij})\}$, each involving a two-dimensional integral, and in our models these can be evaluated explicitly. For the Poisson regression model,

E_{\phi(u_{i})\phi(v_{j})}(\theta_{ij}) = X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta+\mu_{u_{i}}+\mu_{v_{j}},
E_{\phi(u_{i})\phi(v_{j})}\{b(\theta_{ij})\} = \exp(X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta+\mu_{u_{i}}+\lambda_{u_{i}}/2+\mu_{v_{j}}+\lambda_{v_{j}}/2).

For the Gamma regression model,

E_{\phi(u_{i})\phi(v_{j})}(\theta_{ij}) = -\exp(-X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta-\mu_{u_{i}}+\lambda_{u_{i}}/2-\mu_{v_{j}}+\lambda_{v_{j}}/2),
E_{\phi(u_{i})\phi(v_{j})}\{b(\theta_{ij})\} = X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta+\mu_{u_{i}}+\mu_{v_{j}}.
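
For concreteness, the following minimal R sketch (not the authors' implementation; function and argument names are illustrative) evaluates the lower bound (4) for the Poisson case, using the closed-form expectations above.

    # Evidence lower bound (4) for the Poisson model with crossed random effects.
    # beta = (beta0, beta1); mu_u, lam_u have length m; mu_v, lam_v have length n;
    # Y is the m x n response matrix and X the m x n covariate matrix.
    elbo_poisson <- function(beta, sig2u, sig2v, mu_u, lam_u, mu_v, lam_v, Y, X) {
      Etheta <- beta[1] + beta[2] * X + outer(mu_u, mu_v, "+")  # E(theta_ij)
      Eb     <- exp(Etheta + outer(lam_u, lam_v, "+") / 2)      # E{b(theta_ij)}
      sum(Y * Etheta - Eb) +
        0.5 * sum(log(lam_u / sig2u) - (mu_u^2 + lam_u) / sig2u) +
        0.5 * sum(log(lam_v / sig2v) - (mu_v^2 + lam_v) / sig2v)
    }

Such a function could be passed to a generic optimizer such as optim(), maximizing jointly over $(\Psi,\xi)$ as in the definition of the estimators.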

3 Composite likelihood Gaussian variational approximation

In general models, computation of the variational approximation requires $mn$ two-dimensional integrals; for the Poisson and Gamma regression models this simplifies to $mn$ function evaluations. To reduce computation further we develop a version of composite likelihood by considering the two random effects $U_{i}$ and $V_{j}$ separately.

First, we ignore the dependence among rows, $Y^{(1)}_{i}=(Y_{i1},\dots,Y_{in})$, $i=1,\dots,m$, by dropping the column random effects $V_{j}$; i.e., in (2) we replace $X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta+U_{i}+V_{j}$ with $X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta^{r}+U_{i}$, and define the row-composite likelihood function by

CL_{1}(\beta^{r},\sigma^{2}_{u}) = \prod^{m}_{i=1}\int_{R}\prod^{n}_{j=1}f(Y_{ij}\mid X_{ij},U_{i})p(U_{i})\,dU_{i}.

The row-composite log-likelihood function is

c\ell_{1}(\beta^{r},\sigma^{2}_{u})=\sum^{m}_{i=1}\log\int_{R}\prod^{n}_{j=1}f(Y_{ij}\mid X_{ij},U_{i})p(U_{i})\,dU_{i}. (5)

By a similar operation ignoring row random effects, we can define the column-composite log-likelihood function by

c\ell_{2}(\beta^{c},\sigma^{2}_{v}) = \sum^{n}_{j=1}\log\int_{R}\prod^{m}_{i=1}f(Y_{ij}\mid X_{ij},V_{j})p(V_{j})\,dV_{j}. (6)

The notation $\beta^{r}=(\beta^{r}_{0},\beta_{1},\dots,\beta_{p})$ and $\beta^{c}=(\beta^{c}_{0},\beta_{1},\dots,\beta_{p})$ is needed because the misspecification caused by ignoring the column (or row) random effects changes the intercept from $\beta_{0}$, say, to $\beta^{r}_{0}=\beta_{0}+\sigma^{2}_{v}/2$ for the row-composite likelihood and to $\beta^{c}_{0}=\beta_{0}+\sigma^{2}_{u}/2$ for the column-composite likelihood, for both the Poisson and the Gamma regression models. The other components, $\beta_{1},\dots,\beta_{p}$, are unchanged.

Based on (5) and (6), we define the misspecified row-column composite log-likelihood function as

c\ell(\Psi^{rc}) = c\ell_{1}(\beta^{r},\sigma^{2}_{u})+c\ell_{2}(\beta^{c},\sigma^{2}_{v}), (7)

where $\Psi^{rc}=(\beta^{r}_{0},\beta^{c}_{0},\beta_{1},\dots,\beta_{p},\sigma^{2}_{u},\sigma^{2}_{v})$. This definition of misspecified composite likelihood reduces the computation of an $(m+n)$-dimensional integral in (3) to $m+n$ one-dimensional integrals. Bartolucci et al. (2017) proposed a different composite likelihood function in which the column (or row) random effects are instead marginalized over.
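
To illustrate the reduction to one-dimensional integrals, here is a hedged R sketch of the row term (5) for the Poisson model, with one call to integrate() per row; names are illustrative, and in practice the integrand should be computed on the log scale to avoid underflow when $n$ is large.

    # Row-composite log-likelihood (5) for the Poisson model:
    # m one-dimensional integrals over the row random effect U_i.
    row_cl_poisson <- function(beta_r, sig2u, Y, X) {
      total <- 0
      for (i in seq_len(nrow(Y))) {
        integrand <- Vectorize(function(u) {
          eta <- beta_r[1] + beta_r[2] * X[i, ] + u
          prod(dpois(Y[i, ], lambda = exp(eta))) * dnorm(u, 0, sqrt(sig2u))
        })
        total <- total + log(integrate(integrand, -Inf, Inf)$value)
      }
      total
    }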

We construct a Gaussian variational approximation to the misspecified row-column composite likelihood (7) using the same variational distributions as in the previous section, leading to

c\ell(\Psi^{rc}) (8)
= \sum^{m}_{i=1}\log\int_{R}\prod^{n}_{j=1}f(Y_{ij}\mid X_{ij},U_{i})p(U_{i})\frac{\phi(U_{i})}{\phi(U_{i})}\,dU_{i}+\sum^{n}_{j=1}\log\int_{R}\prod^{m}_{i=1}f(Y_{ij}\mid X_{ij},V_{j})p(V_{j})\frac{\phi(V_{j})}{\phi(V_{j})}\,dV_{j}
\geq \sum^{m}_{i=1}\sum^{n}_{j=1}\left[Y_{ij}E_{\phi(U_{i})}(\theta^{r}_{ij})-E_{\phi(U_{i})}\{b(\theta^{r}_{ij})\}\right]+(1/2)\sum^{m}_{i=1}\left\{\log(\lambda_{u_{i}}/\sigma^{2}_{u})-(\mu^{2}_{u_{i}}+\lambda_{u_{i}})/\sigma^{2}_{u}\right\}
+\sum^{m}_{i=1}\sum^{n}_{j=1}\left[Y_{ij}E_{\phi(V_{j})}(\theta^{c}_{ij})-E_{\phi(V_{j})}\{b(\theta^{c}_{ij})\}\right]+(1/2)\sum^{n}_{j=1}\left\{\log(\lambda_{v_{j}}/\sigma^{2}_{v})-(\mu^{2}_{v_{j}}+\lambda_{v_{j}})/\sigma^{2}_{v}\right\}
= \underline{c\ell}(\Psi^{rc},\xi).

For the Poisson regression model,

\theta^{r}_{ij}=X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta^{r}+U_{i},\quad E_{\phi(u_{i})}(\theta^{r}_{ij})=X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta^{r}+\mu_{u_{i}},\quad E_{\phi(u_{i})}\{b(\theta^{r}_{ij})\}=\exp(X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta^{r}+\mu_{u_{i}}+\lambda_{u_{i}}/2);
\theta^{c}_{ij}=X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta^{c}+V_{j},\quad E_{\phi(v_{j})}(\theta^{c}_{ij})=X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta^{c}+\mu_{v_{j}},\quad E_{\phi(v_{j})}\{b(\theta^{c}_{ij})\}=\exp(X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta^{c}+\mu_{v_{j}}+\lambda_{v_{j}}/2).

For the Gamma regression model,

\theta^{r}_{ij}=-\exp(-X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta^{r}-U_{i}),\quad E_{\phi(u_{i})}(\theta^{r}_{ij})=-\exp(-X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta^{r}-\mu_{u_{i}}+\lambda_{u_{i}}/2),
E_{\phi(u_{i})}\{b(\theta^{r}_{ij})\}=X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta^{r}+\mu_{u_{i}};
\theta^{c}_{ij}=-\exp(-X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta^{c}-V_{j}),\quad E_{\phi(v_{j})}(\theta^{c}_{ij})=-\exp(-X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta^{c}-\mu_{v_{j}}+\lambda_{v_{j}}/2),
E_{\phi(v_{j})}\{b(\theta^{c}_{ij})\}=X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta^{c}+\mu_{v_{j}}.

We define the estimators based on this Gaussian variational approximation by

(\widehat{\underline{\Psi}}^{rc},\widehat{\underline{\xi}})=\underset{\Psi^{rc},\xi}{\arg\max}\,\underline{c\ell}(\Psi^{rc},\xi).

To obtain the estimator $\widehat{\underline{\Psi}}$ from $\widehat{\underline{\Psi}}^{rc}$, we only need to convert $\widehat{\underline{\beta}^{r}_{0}}$ and $\widehat{\underline{\beta}^{c}_{0}}$ to $\widehat{\underline{\beta}}_{0}$; this is discussed in Remark 1 of §4. The advantage of $\underline{c\ell}(\Psi^{rc},\xi)$ over $\underline{\ell}(\Psi,\xi)$ is that the former involves only $m+n$ one-dimensional integrals; moreover, for the Poisson and Gamma regression models the explicit expressions above remove even these one-dimensional integrals, so that only $m+n$ closed-form calculations are needed. The misspecified row-column composite likelihood-based variational approximation therefore substantially reduces the computation relative to the variational approximation to the full log-likelihood function.
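
For the Poisson and Gamma models, the intercept relations stated in §3 suggest a direct plug-in conversion; a sketch of the algebra (the formal justification is part of Remark 1 in §4 and its proof):

\beta^{r}_{0}=\beta_{0}+\sigma^{2}_{v}/2,\quad \beta^{c}_{0}=\beta_{0}+\sigma^{2}_{u}/2 \quad\Longrightarrow\quad \widehat{\underline{\beta}}_{0}=\tfrac{1}{2}\{\widehat{\underline{\beta}^{r}_{0}}+\widehat{\underline{\beta}^{c}_{0}}-(\widehat{\underline{\sigma}}^{2}_{u}+\widehat{\underline{\sigma}}^{2}_{v})/2\}.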

4 Theoretical Properties

In this section we present our main results on the consistency and convergence rates of the parameter estimates based on the Gaussian variational approximation introduced above. We denote the true value of the parameter with a superscript 0.

  • Poisson regression model:

    Y_{ij}\mid X_{ij},U_{i},V_{j}~\text{independent Poisson with mean}~\exp(\beta^{0}_{0}+\beta^{0}_{1}X_{ij}+U_{i}+V_{j}),
    U_{i}\sim N(0,(\sigma^{2}_{u})^{0}),\quad V_{j}\sim N(0,(\sigma^{2}_{v})^{0}),\quad U_{i}~\text{and}~V_{j}~\text{independent}. (9)

  • Gamma regression model:

    Y_{ij}\mid X_{ij},U_{i},V_{j}~\text{independent Gamma with shape}~\alpha~\text{and mean}~\exp(\beta^{0}_{0}+\beta^{0}_{1}X_{ij}+U_{i}+V_{j}),
    U_{i}\sim N(0,(\sigma^{2}_{u})^{0}),\quad V_{j}\sim N(0,(\sigma^{2}_{v})^{0}),\quad U_{i}~\text{and}~V_{j}~\text{independent}, (10)

    where the shape parameter $\alpha$ is assumed known.

Under assumptions (A1)-(A9) in §S1 of the Supplementary Material, we have the following two theorems.

Theorem 1

As $m$ and $n$ diverge such that $m/n=\Omega(1)$, for both the Poisson and Gamma regression models,

\widehat{\underline{\beta}}_{0}-\beta^{0}_{0} = O_{p}(m^{-1/2}+n^{-1/2}),\qquad \widehat{\underline{\beta}}_{1}-\beta^{0}_{1}=O_{p}(m^{-1/2}+n^{-1/2}),
\widehat{\underline{\sigma}}^{2}_{u}-(\sigma^{2}_{u})^{0} = O_{p}(m^{-1/2}+n^{-1/2}),\qquad \widehat{\underline{\sigma}}^{2}_{v}-(\sigma^{2}_{v})^{0}=O_{p}(m^{-1/2}+n^{-1/2}).
Remark 1

As noted in §3, the intercept terms for the row-only and column-only composite likelihood functions are not equal to $\beta_{0}$. In the proof we derive the probability limits of the variational estimates of $\beta^{r}_{0}$ and $\beta^{c}_{0}$, and show how to estimate $\beta_{0}$ from these and the estimates of $\sigma^{2}_{u}$ and $\sigma^{2}_{v}$.

In addition, we have the following asymptotic normality results.

Theorem 2

For both the Poisson and Gamma regression models, as $m$ and $n$ diverge such that $m/n=\Omega(1)$, we have

\widehat{\underline{\beta}}_{0}-\beta^{0}_{0} = m^{-1/2}1^{{\mathrm{\scriptscriptstyle T}}}_{3}Z_{1}1_{3}+n^{-1/2}1^{{\mathrm{\scriptscriptstyle T}}}_{3}Z_{2}1_{3}+o_{p}(m^{-1/2}+n^{-1/2}),

where $1_{3}=(1,1,1)^{{\mathrm{\scriptscriptstyle T}}}$, and $Z_{1}$ and $Z_{2}$ are independent mean-zero normal random vectors with covariance matrices

\Gamma_{1} = \frac{1}{8}\left(\begin{array}{ccc}2[\exp\{(\sigma^{2}_{u})^{0}\}-1]&2(\sigma^{2}_{u})^{0}&-\{(\sigma^{2}_{u})^{0}\}^{2}\\ 2(\sigma^{2}_{u})^{0}&2(\sigma^{2}_{u})^{0}&0\\ -\{(\sigma^{2}_{u})^{0}\}^{2}&0&\{(\sigma^{2}_{u})^{0}\}^{2}\end{array}\right), (14)

and

\Gamma_{2} = \frac{1}{8}\left(\begin{array}{ccc}2[\exp\{(\sigma^{2}_{v})^{0}\}-1]&2(\sigma^{2}_{v})^{0}&-\{(\sigma^{2}_{v})^{0}\}^{2}\\ 2(\sigma^{2}_{v})^{0}&2(\sigma^{2}_{v})^{0}&0\\ -\{(\sigma^{2}_{v})^{0}\}^{2}&0&\{(\sigma^{2}_{v})^{0}\}^{2}\end{array}\right) (18)

respectively;

\widehat{\underline{\sigma}}^{2}_{u}-(\sigma^{2}_{u})^{0} = m^{-1/2}Z_{3}+o_{p}(m^{-1/2}+n^{-1/2}),

where the random variable $Z_{3}$ follows $N(0,2\{(\sigma^{2}_{u})^{0}\}^{2})$;

\widehat{\underline{\sigma}}^{2}_{v}-(\sigma^{2}_{v})^{0} = n^{-1/2}Z_{4}+o_{p}(m^{-1/2}+n^{-1/2}),

where the random variable $Z_{4}$ follows $N(0,2\{(\sigma^{2}_{v})^{0}\}^{2})$.

For Poisson regression, the slope term satisfies

\widehat{\underline{\beta}}_{1}-\beta^{0}_{1} = (mn)^{-1/2}Z_{5}+o_{p}\{m^{-1}+(mn)^{-1/2}+n^{-1}\},

where the random variable $Z_{5}$ follows a normal distribution with mean zero and variance

\exp\{-\beta^{0}_{0}-(\sigma^{2}_{u})^{0}/2-(\sigma^{2}_{v})^{0}/2\}\tau_{1}+\tau^{2}_{1}\tau_{2}1^{{\mathrm{\scriptscriptstyle T}}}_{2}\Gamma_{3}1_{2},

where $\tau_{1}=\phi(\beta^{0}_{1})/\{\phi_{2}(\beta^{0}_{1})\phi(\beta^{0}_{1})-\phi^{2}_{1}(\beta^{0}_{1})\}$, $\tau_{2}=\phi_{2}(2\beta^{0}_{1})-2\phi_{1}(\beta^{0}_{1})\phi_{1}(2\beta^{0}_{1})/\phi(\beta^{0}_{1})+\phi_{1}(\beta^{0}_{1})^{2}\phi(2\beta^{0}_{1})/\phi(\beta^{0}_{1})^{2}$, and

\Gamma_{3} = \frac{1}{4}\left(\begin{array}{cc}\exp\{(\sigma^{2}_{u})^{0}\}[\exp\{(\sigma^{2}_{v})^{0}\}-1]&[\exp\{(\sigma^{2}_{u})^{0}\}-1][\exp\{(\sigma^{2}_{v})^{0}\}-1]\\ [\exp\{(\sigma^{2}_{u})^{0}\}-1][\exp\{(\sigma^{2}_{v})^{0}\}-1]&\exp\{(\sigma^{2}_{v})^{0}\}[\exp\{(\sigma^{2}_{u})^{0}\}-1]\end{array}\right); (21)

For Gamma regression, the slope term satisfies

\widehat{\underline{\beta}}_{1}-\beta^{0}_{1}=(mn)^{-1/2}1^{{\mathrm{\scriptscriptstyle T}}}_{2}Z_{6}1_{2}+o_{p}\{m^{-1}+(mn)^{-1/2}+n^{-1}\},

where $Z_{6}$ follows a bivariate normal distribution with mean zero and covariance matrix

\widetilde{\Sigma}=\frac{1}{4\alpha}\left(\begin{array}{cc}\alpha[\exp\{(\sigma^{2}_{v})^{0}\}-1]+\exp\{(\sigma^{2}_{v})^{0}\}&1\\ 1&\alpha[\exp\{(\sigma^{2}_{u})^{0}\}-1]+\exp\{(\sigma^{2}_{u})^{0}\}\end{array}\right). (24)
Remark 2

We assume above that $m$ and $n$ diverge at the same rate, whereas Hall et al. (2011b) assume $n/m\rightarrow 0$, reflecting the usual nested-effects setting with many groups (random effects) and relatively few observations per group. While the asymptotic regime of Hall et al. (2011b) is sufficient to ensure the validity of the asymptotic theory, it is not necessary. In the generalized linear mixed model with crossed random effects, if $n/m$ tends to 0, the convergence rate is dominated by $n^{-1/2}$, indicating that the row random effects could be disregarded; in the setting we consider, however, the row random effects should not be ignored.
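
The limiting variances in Theorem 2 yield simple plug-in standard errors for the variance components, of the kind underlying the MESE values reported in §5. A minimal R sketch follows, where the estimates and sample sizes are illustrative assumptions:

    m <- 100; n <- 100                     # numbers of row and column levels
    sig2u_hat <- 0.25; sig2v_hat <- 0.25   # illustrative GVACL estimates
    se_sig2u <- sqrt(2 / m) * sig2u_hat    # from Z3 ~ N(0, 2 {(sigma_u^2)^0}^2)
    se_sig2v <- sqrt(2 / n) * sig2v_hat    # from Z4 ~ N(0, 2 {(sigma_v^2)^0}^2)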

5 Simulation Studies

In this section, we perform simulations to evaluate the effectiveness of the proposed approach. We assess the performance of both the Poisson and Gamma regression models with crossed random effects, as specified in (9) and (10). For both models, we independently generate $X_{ij}$ from $N(1,1)$ for $i=1,\ldots,m$ and $j=1,\ldots,n$. We set $\beta^{0}_{0}=-2$, $\beta^{0}_{1}=-2$, $(\sigma_{u})^{0}=0.5$, and $(\sigma_{v})^{0}=0.5$. For the Gamma regression model, the additional shape parameter is set to $\alpha=0.8$. We consider two scenarios: $(m,n)=(50,50)$ and $(m,n)=(100,100)$.
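
A minimal R sketch (not the authors' code) of the data-generating step for the Poisson scenario follows; only the seed is an arbitrary choice.

    set.seed(1)
    m <- 50; n <- 50
    beta0 <- -2; beta1 <- -2
    sigma_u <- 0.5; sigma_v <- 0.5
    X <- matrix(rnorm(m * n, mean = 1, sd = 1), m, n)   # X_ij ~ N(1, 1)
    U <- rnorm(m, 0, sigma_u)                           # row random effects
    V <- rnorm(n, 0, sigma_v)                           # column random effects
    eta <- beta0 + beta1 * X + outer(U, V, "+")
    Y <- matrix(rpois(m * n, lambda = exp(eta)), m, n)  # Poisson responses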

We compare the proposed approach with the Gaussian variational approximation based on the full log-likelihood function. We report the mean and standard deviation of the parameter estimates over 1000 Monte Carlo replications. We also record the average computational time of both methods, using R version 4.1.1 on a laptop equipped with a 2.8 gigahertz Intel Core i7-1165G7 processor and 16 gigabytes of random access memory. The results are presented in Table 1.

Table 1: Simulations for Poisson and Gamma regression models with crossed random effects, with parameters $\beta^{0}_{0}=-2$, $\beta^{0}_{1}=-2$, $(\sigma_{u})^{0}=0.5$, and $(\sigma_{v})^{0}=0.5$, under two scenarios: $(m,n)=(50,50)$ and $(m,n)=(100,100)$. For the Gamma regression model, the additional shape parameter was set to $\alpha=0.8$. "GVA" denotes the Gaussian variational approximation to the log-likelihood function and "GVACL" denotes our proposed method. "Mean(SD)" reports the mean and standard deviation of the parameter estimates over the 1000 datasets; "MESE" denotes the average of the standard errors of the estimated parameters, calculated using the asymptotic variance formulas; "Mean Time(s)" is the average time of each simulation in seconds. For the GVA method the MESE is not reported (marked "NA"), since no asymptotic variance has been derived in the literature.
Poisson                      (m,n)=(50,50)               (m,n)=(100,100)
                             GVA           GVACL         GVA           GVACL
$\beta_{0}$    Mean(SD)      -1.99(0.07)   -2.04(0.14)   -1.99(0.08)   -2.03(0.09)
               MESE          NA            0.11          NA            0.07
$\beta_{1}$    Mean(SD)      -1.99(0.07)   -1.99(0.08)   -1.99(0.03)   -2.00(0.04)
               MESE          NA            0.10          NA            0.05
$\sigma_{u}$   Mean(SD)      0.47(0.10)    0.53(0.09)    0.49(0.05)    0.52(0.05)
               MESE          NA            0.05          NA            0.04
$\sigma_{v}$   Mean(SD)      0.46(0.10)    0.52(0.09)    0.49(0.05)    0.52(0.05)
               MESE          NA            0.05          NA            0.04
Mean Time(s)                 3.55          0.21          14.04         0.63

Gamma                        (m,n)=(50,50)               (m,n)=(100,100)
                             GVA           GVACL         GVA           GVACL
$\beta_{0}$    Mean(SD)      -2.00(0.10)   -2.01(0.11)   -2.00(0.07)   -2.00(0.07)
               MESE          NA            0.10          NA            0.07
$\beta_{1}$    Mean(SD)      -2.00(0.02)   -2.00(0.03)   -2.00(0.01)   -2.00(0.01)
               MESE          NA            0.03          NA            0.01
$\sigma_{u}$   Mean(SD)      0.49(0.05)    0.50(0.05)    0.50(0.04)    0.50(0.04)
               MESE          NA            0.05          NA            0.04
$\sigma_{v}$   Mean(SD)      0.49(0.06)    0.50(0.06)    0.50(0.04)    0.50(0.04)
               MESE          NA            0.05          NA            0.04
Mean Time(s)                 4.77          0.28          19.00         0.88

Our results demonstrate that both methods yield similar parameter estimates. However, our proposed approach exhibits a notable advantage in terms of computational efficiency compared to the Gaussian variational approximation to the log-likelihood function. We further investigate additional settings to assess the performance of both methods. The results are included in the supplementary §S6 and Table S1, and the findings are similar.

6 Application

We illustrate our methods with data on motor vehicle insurance in Indonesia in 2014, obtained from the Financial Services Authority (Adam et al., 2021). The dataset consists of 175,381 claim events from 35 distinct areas over a period of 12 months. Each claim event includes information on the claim amount and deductible. However, 17 of the 35 areas had no claim events during the 12-month period, so our analysis focuses on the remaining 18 areas. Table S3 gives the counts of claim events for each area and month.

We randomly selected one claim event from each area and month. The response variable $Y_{ij}$ is the claim amount and the explanatory variable $X_{ij}$ is the deductible, where $i=1,\dots,18$ and $j=1,\ldots,12$. A portion of the dataset can be found in Table S4 of the supplementary material. Both $X_{ij}$ and $Y_{ij}$ have large magnitudes, so we divide them by $10^{7}$ to avoid singular Hessian matrices in the computation (Adam et al., 2021). We then fit a Gamma regression model with random effects for both area and month of occurrence.

To evaluate the proposed method and the Gaussian variational approximation to the log-likelihood function, we regenerated 1000 datasets, using a different random seed for each replication of the random event selection. We then applied both methods and report the mean and standard deviation of both estimators across these 1000 datasets, along with the average time taken to fit the model. Table 2 presents the results of this analysis.

Table 2: Real data results: "GVA" is the Gaussian variational approximation to the log-likelihood function, and "GVACL" is our proposed method. "Mean Time(s)" is the average time of each fit in seconds; "Mean(SD)" reports the mean and sample standard deviation of the model parameter estimates over the 1000 datasets.
Method   Mean Time(s)   $\beta_{0}$ Mean(SD)   $\beta_{1}$ Mean(SD)   $\sigma_{u}$ Mean(SD)   $\sigma_{v}$ Mean(SD)
GVA      1.19           -0.71 (0.12)           3.17 (1.37)            0.29 (0.11)             0.10 (0.10)
GVACL    0.15           -0.74 (0.12)           3.60 (1.26)            0.29 (0.11)             0.14 (0.12)

From Table 2, it is evident that the two methods yield comparable model parameter estimates. However, our proposed approach “GVACL” is much faster in terms of computation time.

7 Discussion

In this article we have covered two examples, the Poisson and Gamma regression models. An interesting future direction is to extend the results to other generalized linear mixed models. For example, the logistic regression model with crossed random effects has

Y_{ij}\mid X_{ij},U_{i},V_{j}\sim\text{Bernoulli}[1/\{1+\exp(-\beta^{0}_{0}-\beta^{0}_{1}X_{ij}-U_{i}-V_{j})\}],
U_{i}\sim N(0,(\sigma^{2}_{u})^{0}),\quad V_{j}\sim N(0,(\sigma^{2}_{v})^{0}),\quad U_{i}~\text{and}~V_{j}~\text{independent}.

Unlike the Poisson and Gamma regression models, there is no analytic solution to the two-dimensional integral

E_{\phi(u_{i})\phi(v_{j})}\{b(\theta_{ij})\}=\int_{R^{2}}\log\{1+\exp(X^{{\mathrm{\scriptscriptstyle T}}}_{ij}\beta+U_{i}+V_{j})\}\phi(U_{i})\phi(V_{j})\,dU_{i}dV_{j} (25)

appearing in (4). One solution is to evaluate the integrals using adaptive Gauss-Hermite quadrature (Liu and Pierce, 1994) with a product rule of $N_{1}\times N_{2}$ quadrature points for the two-dimensional integral in (25). The one-dimensional integrals arising in the row-column composite likelihood of (8) can also be computed by adaptive Gauss-Hermite quadrature. Compared to the Poisson and Gamma cases, the computational time of the proposed method is longer, especially when $m$ and $n$ are large.
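
As an illustration of the quadrature step, here is a hedged R sketch based on the statmod package; for simplicity it applies a standard (non-adaptive) Gauss-Hermite product rule to (25), with the variational densities taken as $N(\mu_{u_{i}},\lambda_{u_{i}})$ and $N(\mu_{v_{j}},\lambda_{v_{j}})$, and all names are illustrative.

    library(statmod)
    gh <- gauss.quad(15, kind = "hermite")     # nodes/weights for weight exp(-x^2)
    Eb_logistic <- function(eta, mu_u, lam_u, mu_v, lam_v) {
      u <- mu_u + sqrt(2 * lam_u) * gh$nodes   # map nodes to N(mu_u, lam_u)
      v <- mu_v + sqrt(2 * lam_v) * gh$nodes
      w <- outer(gh$weights, gh$weights) / pi  # product rule, with 1/pi constant
      sum(w * log1p(exp(eta + outer(u, v, "+"))))
    }
    Eb_logistic(eta = 0.5, mu_u = 0, lam_u = 1, mu_v = 0, lam_v = 1)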

We carried out some limited simulations for the logistic model and report the results in §S6 and Table S2 of the supplementary material. We found that the computational time of the proposed method is much shorter than for the Gaussian variational approximation to the log-likelihood function. Based on the simulation results, we conjecture that the intercept and slope estimates and the random effect variance estimates satisfy the approximate relationships

\widehat{\underline{\beta}}_{0}\approx(\widehat{\underline{\beta}^{r}_{0}}+\widehat{\underline{\beta}^{c}_{0}})/[2\{1-0.1(\widehat{\underline{\sigma}}^{2}_{u}+\widehat{\underline{\sigma}}^{2}_{v})\}]\quad\text{and}\quad\widehat{\underline{\beta}}_{1}\approx\widehat{\underline{\beta}^{rc}_{1}}/\{1-0.1(\widehat{\underline{\sigma}}^{2}_{u}+\widehat{\underline{\sigma}}^{2}_{v})\}.

A rigorous derivation of these relationships is beyond the scope of this paper and is left for future research.

Variational approximations to the composite log-likelihood function of crossed random effect models establish a connection between two distinct topics in statistics. We introduce the row-composite likelihood function (5) to eliminate the row dependence of the responses by disregarding the column random effects, and then use the Gaussian variational approximation to eliminate the column dependence of the responses through the variational distributions of the row random effects. Similarly, we construct the column-composite likelihood function (6) to eliminate the column dependence of the responses by disregarding the row random effects, and then apply the Gaussian variational approximation to eliminate the row dependence of the responses through the variational distributions of the column random effects. The row-column composite likelihood-based Gaussian variational approximation and the likelihood-based Gaussian variational approximation thus share the common goal of breaking the dependence of the responses by manipulating the random effects in generalized linear mixed models with crossed random effects. Other links between composite likelihood and variational approximations are left for future research.

Acknowledgment

The authors are grateful to Dr. Fia Fridayanti Adam for sharing the motor vehicle insurance data. This research was partially supported by the Natural Sciences and Engineering Research Council of Canada.


References

  • Chung and Cai (2021) Seungwon Chung and Li Cai. Cross-classified random effects modeling for moderated item calibration. Journal of Educational and Behavioral Statistics, 46(6):651–681, 2021.
  • Menictas et al. (2022) Marianne Menictas, Gioia Di Credico, and Matt P Wand. Streamlined variational inference for linear mixed models with crossed random effects. Journal of Computational and Graphical Statistics, pages 1–17, 2022.
  • Baayen et al. (2008) R Harald Baayen, Douglas J Davidson, and Douglas M Bates. Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4):390–412, 2008.
  • Jeon et al. (2017) Minjeong Jeon, Frank Rijmen, and Sophia Rabe-Hesketh. A variational maximization–maximization algorithm for generalized linear mixed models with crossed random effects. Psychometrika, 82(3):693–716, 2017.
  • Coull et al. (2001) Brent A Coull, James P Hobert, Louise M Ryan, and Lewis B Holmes. Crossed random effect models for multiple outcomes in a study of teratogenesis. Journal of the American Statistical Association, 96(456):1194–1204, 2001.
  • Breslow and Clayton (1993) Norman E Breslow and David G Clayton. Approximate inference in generalized linear mixed models. Journal of the American Statistical Association, 88(421):9–25, 1993.
  • Schall (1991) Robert Schall. Estimation in generalized linear models with random effects. Biometrika, 78(4):719–727, 1991.
  • Lin and Breslow (1996) Xihong Lin and Norman E Breslow. Bias correction in generalized linear mixed models with multiple components of dispersion. Journal of the American Statistical Association, 91(435):1007–1016, 1996.
  • McCulloch (1997) Charles E McCulloch. Maximum likelihood algorithms for generalized linear mixed models. Journal of the American Statistical Association, 92(437):162–170, 1997.
  • Ghosh et al. (2022a) Swarnadip Ghosh, Trevor Hastie, and Art B Owen. Backfitting for large scale crossed random effects regressions. The Annals of Statistics, 50(1):560–583, 2022a.
  • Ghosh et al. (2022b) Swarnadip Ghosh, Trevor Hastie, and Art B Owen. Scalable logistic regression with crossed random effects. Electronic Journal of Statistics, 16(2):4604–4635, 2022b.
  • Renard et al. (2004) Didier Renard, Geert Molenberghs, and Helena Geys. A pairwise likelihood approach to estimation in multilevel probit models. Computational Statistics & Data Analysis, 44(4):649–667, 2004.
  • Bartolucci and Lupparelli (2016) Francesco Bartolucci and Monia Lupparelli. Pairwise likelihood inference for nested hidden Markov chain models for multilevel longitudinal data. Journal of the American Statistical Association, 111(513):216–228, 2016.
  • Bellio and Varin (2005) Ruggero Bellio and Cristiano Varin. A pairwise likelihood approach to generalized linear models with crossed random effects. Statistical Modelling, 5(3):217–227, 2005.
  • Ormerod and Wand (2012) John T Ormerod and Matt P Wand. Gaussian variational approximate inference for generalized linear mixed models. Journal of Computational and Graphical Statistics, 21(1):2–17, 2012.
  • Hall et al. (2011a) Peter Hall, John T Ormerod, and Matt P Wand. Theory of Gaussian variational approximation for a Poisson mixed model. Statistica Sinica, 21:369–389, 2011a.
  • Rijmen and Jeon (2013) Frank Rijmen and Minjeong Jeon. Fitting an item response theory model with random item effects across groups by a variational approximation method. Annals of Operations Research, 206(1):647–662, 2013.
  • Hui et al. (2017) Francis KC Hui, David I Warton, John T Ormerod, Viivi Haapaniemi, and Sara Taskinen. Variational approximations for generalized linear latent variable models. Journal of Computational and Graphical Statistics, 26(1):35–43, 2017.
  • Hui et al. (2019) Francis KC Hui, Chong You, Han Lin Shang, and Samuel Müller. Semiparametric regression using variational approximations. Journal of the American Statistical Association, 114:1765–1777, 2019.
  • Bartolucci et al. (2017) Francesco Bartolucci, Francesca Chiaromonte, Prabhani Kuruppumullage Don, and Bruce G Lindsay. Composite likelihood inference in a discrete latent variable model for two-way “clustering-by-segmentation” problems. Journal of Computational and Graphical Statistics, 26(2):388–402, 2017.
  • Hall et al. (2011b) Peter Hall, Tung Pham, Matt P Wand, and Shen SJ Wang. Asymptotic normality and valid inference for Gaussian variational approximation. The Annals of Statistics, 39(5):2502–2532, 2011b.
  • Adam et al. (2021) FA Adam, A Kurnia, IGP Purnaba, IW Mangku, and AM Soleh. Modeling the amount of insurance claim using Gamma linear mixed model with AR (1) random effect. In Journal of Physics: Conference Series, volume 1863, page 012027. IOP Publishing, 2021.
  • Liu and Pierce (1994) Qing Liu and Donald A Pierce. A note on Gauss–Hermite quadrature. Biometrika, 81(3):624–629, 1994.