
Simulation study of estimating between-study variance and overall effect in meta-analysis of standardized mean difference

Ilyas Bakbergenuly, David C. Hoaglin and Elena Kulinskaya
(July 28, 2025)
Abstract

Methods for random-effects meta-analysis require an estimate of the between-study variance, $\tau^{2}$. The performance of estimators of $\tau^{2}$ (measured by bias and coverage) affects their usefulness in assessing heterogeneity of study-level effects, and also the performance of related estimators of the overall effect. For the effect measure standardized mean difference (SMD), we provide the results from extensive simulations on five point estimators of $\tau^{2}$ (the popular methods of DerSimonian-Laird, restricted maximum likelihood, and Mandel and Paule (MP); the less-familiar method of Jackson; and the new method (KDB) based on the improved approximation to the distribution of the $Q$ statistic by Kulinskaya, Dollinger and Bjørkestøl (2011)), five interval estimators for $\tau^{2}$ (profile likelihood, Q-profile, Biggerstaff and Jackson, Jackson, and the new KDB method), six point estimators of the overall effect (the five related to the point estimators of $\tau^{2}$ and an estimator whose weights use only study-level sample sizes), and eight interval estimators for the overall effect (five based on the point estimators for $\tau^{2}$; the Hartung-Knapp-Sidik-Jonkman (HKSJ) interval; a modification of HKSJ; and an interval based on the sample-size-weighted estimator).

Keywords between-study variance, heterogeneity, random-effects model, meta-analysis, mean difference, standardized mean difference

1 Introduction

Meta-analysis is a statistical methodology for combining estimated effects from several studies in order to assess their heterogeneity and obtain an overall estimate. In this paper we focus on meta-analysis of standardized mean difference. The data and, often, existing tradition determine the choice of outcome measure. In a comparative study with continuous subject-level data for a treatment arm (T) and a control arm (C), the customary outcome measures are the mean difference (MD) and the standardized mean difference (SMD). The Cochrane Handbook (Higgins and Green, 2011, Part 2, Chapter 9) points out that the choice of MD over SMD depends on whether “outcome measurements in all studies are made on the same scale.” However, fields of application have established preferences: MD in medicine and SMD in social sciences.

If the studies can be assumed to have the same true effect, a meta-analysis uses a fixed-effect (FE) model (common-effect model) to combine the estimates. Otherwise, the studies’ true effects can depart from homogeneity in a variety of ways. Most commonly, a random-effects (RE) model regards those effects as a sample from a distribution and summarizes their heterogeneity via its variance, usually denoted by $\tau^{2}$. The between-studies variance, $\tau^{2}$, has a key role in estimates of the mean of the distribution of random effects; but it is also important as a quantitative indication of heterogeneity (Higgins et al., 2009). In studying estimation for meta-analysis of SMD, we focus first on $\tau^{2}$ and then proceed to the overall effect.

Veroniki et al. (2016) provide a comprehensive overview and recommendations on methods of estimating $\tau^{2}$ and its uncertainty. Their review, however, has two important limitations. First, the authors study only “methods that can be applied for any type of outcome data.” However, the performance of the methods that we study varies widely among effect measures; Veroniki et al. (2016, Section 6.1) mention this only in passing as a hypothetical possibility. Second, any review on the topic, such as Veroniki et al. (2016), currently can draw on only limited empirical information on the comparative performance of the methods. The (short) list of previous simulation studies for SMD is given in Table 1. From this table, it is clear that only three studies (Viechtbauer (2005), Petropoulou and Mavridis (2017), and Langan et al. (2018)) considered and compared several point estimators of $\tau^{2}$.

However, to assess bias of the estimators of heterogeneity variance, Petropoulou and Mavridis (2017) use mean absolute error (the only performance measure reported for SMD in their Table S8), which is not a measure of bias; it is the linear counterpart of mean squared error.

Langan et al. (2018) studied bias and mean squared error of estimators of $\tau^{2}$, as well as coverage of confidence intervals for the overall effect, but they used only one value of SMD (0.5) because they believed that the value of SMD does not matter. We show in our simulations that it does matter. Additionally, both Viechtbauer (2005) and Langan et al. (2018) consider a very restricted range of $\tau^{2}$ values. There appear to be no studies at all on coverage of $\tau^{2}$. Most of the simulation studies consider only inverse-variance-based estimation of the overall effect, and only two (Langan et al. (2018) and Hamman et al. (2018)) consider $t$-based confidence intervals for it.

Additionally, all moment-based point estimators of $\tau^{2}$ (i.e., the vast majority of the estimators listed in Table 1) use the inferior $\chi^{2}_{K-1}$ approximation to the distribution of Cochran’s $Q$ statistic, which does not perform well for small to medium sample sizes (Kulinskaya et al., 2011).

To address this gap in information on methods of estimating the heterogeneity variance for SMD, we use simulation to study four methods recommended by Veroniki et al. (2016). These are the well-established methods of DerSimonian and Laird (1986), restricted maximum likelihood, and Mandel and Paule (1970), and the less-familiar method of Jackson (2013). We also include a new method based on improved approximations to the distribution of the $Q$ statistic for SMD (Kulinskaya et al., 2011). We also study coverage of confidence intervals for $\tau^{2}$ achieved by five methods, including the Q-profile method of Viechtbauer (2007), a Q-profile method based on an improved approximation to the distribution of Cochran’s $Q$, and profile-likelihood-based intervals.

For each estimator of $\tau^{2}$, we also study bias of the corresponding inverse-variance-weighted estimator of the overall effect. As our work progressed, it became clear that those inverse-variance-weighted estimators generally had unacceptable bias for SMD. Therefore, we added an estimator (SSW) whose weights depend only on the sample sizes of the Treatment and Control arms. We studied the coverage of the confidence intervals associated with the inverse-variance-weighted estimators, and also the HKSJ interval, the HKSJ interval using the improved estimator of $\tau^{2}$, and the interval centered at SSW and using the improved $\hat{\tau}^{2}$ in estimating its variance.

Table 1: Simulation studies on meta-analysis of SMD.

| Study | SMD measure | $\delta$ | $\tau^{2}$ | $n$ and/or $\bar{n}$ | $K$ | $\hat{\tau}^{2}$ | Coverage of $\tau^{2}$ | $\hat{\delta}$ | Coverage of $\delta$ |
|---|---|---|---|---|---|---|---|---|---|
| Viechtbauer 2005 | Hedges's $d$ | 0, 0.2, 0.5, 0.8 | 0, 0.01, 0.025, 0.05, 0.1 | $\bar{n}=20,40,80,160,320$; $n_{i}\sim N(\bar{n},\bar{n}/3)$; $n_{Ti}=n_{Ci}=n_{i}/2$ | 5, 10, 20, 40, 80 | DL, ML, REML, HE, HS | | IV | |
| Friedrich et al. 2008 | Hedges's $d$ | 0.2, 0.5, 0.8 | 0, 0.5 | $n_{T}=n_{C}=10,100$ | 5, 10, 30 | DL | | IV | IV |
| Petropoulou and Mavridis 2017 | | 0 & 0.5 | 0, 0.01, 0.05, 0.5 | $n_{T}=n_{C}\sim U(20,200)$ | 5, 10, 20, 30 | 20 estimators of $\tau^{2}$ | | IV | IV |
| Langan et al. 2018 | Hedges's $d$ | 0.5 | depends on $I^{2}$: 0%, 15%, 30%, 45%, 60%, 75%, 90%, 95% | $n=40$; $n\sim U(40,400)$; $n=400$; $n\sim U(2000,4000)$; $n=40$ + $N\sim U(2000,4000)$; $n_{T}=n_{C}=n/2$ | 2, 3, 5, 10, 20, 30, 50, 100 | REML, CA, PM or MP, $PM_{CA}$, $PM_{DL}$, HM, SJ, $SJ_{CA}$ | | IV | IV, IV+t, IV+HKSJ |
| Lin 2018 | Cohen's $d$, Hedges's $d$ | 0, 0.2, 0.5, 0.8, 1 | 0, 0.2, 0.5 | $U(5,10)$, $U(10,20)$, $U(20,30)$, $U(30,50)$, $U(50,100)$, $U(100,500)$, $U(500,1000)$; $n_{T}=n_{C}$ | 5, 10, 20, 50 | DL | | IV | IV |
| Hamman et al. 2018 | Hedges's $d$ | 0, 0.1, 0.15, 0.25, 0.35, 0.5, 0.6, 0.75, 1, 1.25, 1.5, 2.5 | 0, 0.1, 0.5, 1, 2.5, 5, 10 | $\bar{n}=4,6,8,10,12,14,16,20,25$ | 5, 10, 15, 25, 35, 45, 55, 75, 100, 125 | REML | | IV, SSW(H), EW | IV+HKSJ |
| Marín-Martínez and Sánchez-Meca 2010 | | 0.2, 0.5, 0.8 | 0, 0.04, 0.08, 0.16, 0.32 | $\bar{n}=30,50,80,100$; $n_{T}=n_{C}$ | 5, 10, 20, 40, 100 | DL, ML | | FE, IV, HS | |

Estimators of $\tau^{2}$: DL - DerSimonian and Laird estimator, ML - maximum likelihood, REML - restricted maximum likelihood estimator, HE - Hedges estimator, HS - Hunter-Schmidt estimator, CA - Cochran ANOVA, PM or MP - Mandel-Paule estimator, $PM_{CA}$ - two-step Cochran ANOVA, $PM_{DL}$ - two-step DerSimonian-Laird, HM - Hartung-Makambi, SJ - Sidik-Jonkman, $SJ_{CA}$ - alternative Sidik-Jonkman, BM - Bayes Modal estimator;
Estimators of $\delta$: SSW(H) - sample-size-weighted (Hedges 1982), EW - equal weights, IV - inverse-variance-weighted, HS - Hunter and Schmidt (1990) total-sample-size-weighted;
Coverage of $\delta$: IV - confidence interval centered at inverse-variance-weighted estimator of $\delta$ with $z$ quantiles, IV+t - confidence interval centered at inverse-variance-weighted estimator of $\delta$ with $t$ quantiles, IV+HKSJ - Hartung-Knapp-Sidik-Jonkman confidence interval centered at inverse-variance-weighted estimator of $\delta$.

2 Study-level estimation of standardized mean difference

We assume that each of the $K$ studies in the meta-analysis consists of two arms, Treatment and Control, with sample sizes $n_{iT}$ and $n_{iC}$. The total sample size in Study $i$ is $n_{i}=n_{iT}+n_{iC}$. We denote the ratio of the control sample size to the total by $q_{i}=n_{iC}/n_{i}$. The subject-level data in each arm are assumed to be normally distributed with means $\mu_{iT}$ and $\mu_{iC}$ and variances $\sigma_{iT}^{2}$ and $\sigma_{iC}^{2}$. The sample means are $\bar{x}_{ij}$, and the sample variances are $s^{2}_{ij}$, for $i=1,\ldots,K$ and $j=C$ or $T$.

The standardized mean difference effect measure is

$$\delta_{i}=\frac{\mu_{iT}-\mu_{iC}}{\sigma_{i}}.$$

The plug-in estimator $d_{i}=(\bar{x}_{iT}-\bar{x}_{iC})/s_{i}$, known as Cohen’s $d$, is biased in small samples, and we do not consider it further. Instead, we study the unbiased estimator

$$g_{i}=J(m_{i})\frac{\bar{x}_{iT}-\bar{x}_{iC}}{s_{i}},$$

where $m_{i}=n_{iT}+n_{iC}-2$, and the factor

$$J(m)=\frac{\Gamma\left(\frac{m}{2}\right)}{\sqrt{\frac{m}{2}}\,\Gamma\left(\frac{m-1}{2}\right)},$$

often approximated by $1-3/(4m-1)$, corrects for bias (Hedges, 1983). This estimator of $\delta$ is sometimes called Hedges’s $g$. The variances in the Treatment and Control arms are usually assumed to be equal. Therefore, $\sigma_{i}$ is estimated by the square root of the pooled sample variance

$$s_{i}^{2}=\frac{(n_{iT}-1)s_{iT}^{2}+(n_{iC}-1)s_{iC}^{2}}{n_{iT}+n_{iC}-2}.$$

For the variance of $g_{i}$ we use the unbiased estimator (Hedges, 1983)

$$v_{i}^{2}=\frac{n_{iT}+n_{iC}}{n_{iT}n_{iC}}+\left(1-\frac{m_{i}-2}{m_{i}J(m_{i})^{2}}\right)g^{2}_{i}. \tag{2.1}$$

The sample SMD $g_{i}$ has a scaled non-central $t$-distribution with non-centrality parameter $[n_{i}q_{i}(1-q_{i})]^{1/2}\delta_{i}$:

$$\frac{\sqrt{n_{i}q_{i}(1-q_{i})}}{J(m_{i})}\,g_{i}\sim t_{m_{i}}\left([n_{i}q_{i}(1-q_{i})]^{1/2}\delta_{i}\right). \tag{2.2}$$

Cohen (1988) categorized values of $\delta=0.2,\;0.5,\;0.8$ as small, medium, and large effect sizes, respectively. Four studies (Viechtbauer (2005), Friedrich et al. (2008), Marín-Martínez and Sánchez-Meca (2010); Sánchez-Meca and Marín-Martínez (2010) and Lin (2018)) use these values of SMD in their simulations. However, these definitions of “small,” “medium,” and “large” may not be appropriate outside the behavioral sciences. Ferguson (2009) proposed the values $0.41,\;1.15,\;2.70$ as benchmarks in the social sciences. In an empirical study of 21 ecological meta-analyses by Møller and Jennions (2002), 136 observed values of SMD varied in magnitude from $0.005$ to $3.416$, with mean $0.721$ and $95\%$ confidence interval $(0.622,\;0.820)$.
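The study-level quantities defined in this section can be computed from summary statistics along the following lines. This is a minimal sketch, not the authors' code: the function names `hedges_j` and `hedges_g` and the example inputs are illustrative assumptions, and equal arm variances are assumed as in the text.

```python
import math


def hedges_j(m):
    """Exact correction factor J(m) = Gamma(m/2) / (sqrt(m/2) * Gamma((m-1)/2))."""
    # Work on the log scale so large m does not overflow the gamma function.
    return math.exp(math.lgamma(m / 2) - math.lgamma((m - 1) / 2)) / math.sqrt(m / 2)


def hedges_g(mean_t, mean_c, var_t, var_c, n_t, n_c):
    """Return (g, v2): Hedges's g and its variance estimate from Equation (2.1)."""
    m = n_t + n_c - 2
    # Pooled sample variance of the two arms.
    pooled_var = ((n_t - 1) * var_t + (n_c - 1) * var_c) / m
    j = hedges_j(m)
    g = j * (mean_t - mean_c) / math.sqrt(pooled_var)
    v2 = (n_t + n_c) / (n_t * n_c) + (1 - (m - 2) / (m * j ** 2)) * g ** 2
    return g, v2


# Hypothetical summary data for one study with 10 subjects per arm.
g, v2 = hedges_g(mean_t=10.5, mean_c=10.0, var_t=1.0, var_c=1.0, n_t=10, n_c=10)
print(round(g, 4), round(v2, 4))
```

Comparing `hedges_j(m)` with $1-3/(4m-1)$ for a few values of $m$ is a quick way to see how close the usual approximation is.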

3 Standard random-effects model

In meta-analysis, the standard random-effects model assumes that within- and between-study variabilities are accounted for by approximately normal distributions of within- and between-study effects. For a generic measure of effect,

$$\hat{\theta}_{i}\sim N(\theta_{i},\sigma_{i}^{2})\quad\text{and}\quad\theta_{i}\sim N(\theta,\tau^{2}), \tag{3.1}$$

resulting in the marginal distribution $\hat{\theta}_{i}\sim N(\theta,\sigma_{i}^{2}+\tau^{2})$. Here $\hat{\theta}_{i}$ is the estimate of the effect in Study $i$, and its within-study variance is $\sigma_{i}^{2}$, estimated by $\hat{\sigma}_{i}^{2}$, $i=1,\ldots,K$. The between-study variance $\tau^{2}$ is estimated by $\hat{\tau}^{2}$. The overall effect $\theta$ can be estimated by the weighted mean

$$\hat{\theta}_{\mathit{RE}}=\frac{\sum\limits_{i=1}^{K}\hat{w}_{i}(\hat{\tau}^{2})\hat{\theta}_{i}}{\sum\limits_{i=1}^{K}\hat{w}_{i}(\hat{\tau}^{2})}, \tag{3.2}$$

where the $\hat{w}_{i}(\hat{\tau}^{2})=(\hat{\sigma}_{i}^{2}+\hat{\tau}^{2})^{-1}$ are inverse-variance weights. The FE estimate $\hat{\theta}$ uses weights $\hat{w}_{i}=\hat{w}_{i}(0)$.

If $w_{i}=1/\hbox{Var}(\hat{\theta}_{i})$, the variance of the weighted mean of the $\hat{\theta}_{i}$ is $1/\sum w_{i}$. Thus, many authors estimate the variance of $\hat{\theta}_{\mathit{RE}}$ by $\left[\sum_{i=1}^{K}\hat{w}_{i}(\hat{\tau}^{2})\right]^{-1}$. In practice, however, this estimate may not be satisfactory (Sidik and Jonkman, 2006; Li et al., 1994; Rukhin, 2009).
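As a small illustration of the inverse-variance-weighted mean in Equation (3.2) and the conventional variance estimate $1/\sum \hat{w}_{i}(\hat{\tau}^{2})$, the sketch below takes the study estimates, their estimated within-study variances, and a value of $\hat{\tau}^{2}$ obtained by any method. The function name and the example numbers are assumptions made for illustration.

```python
def random_effects_mean(effects, variances, tau2):
    """Inverse-variance-weighted mean (Equation (3.2)) and its conventional variance.

    effects   : study-level estimates
    variances : estimated within-study variances
    tau2      : estimate of the between-study variance (tau2 = 0 gives the FE estimate)
    """
    weights = [1.0 / (s2 + tau2) for s2 in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    # Conventional variance estimate: reciprocal of the sum of the weights.
    return estimate, 1.0 / total


# Hypothetical data: three studies with equal within-study variances.
estimate, variance = random_effects_mean([0.2, 0.4, 0.6], [0.1, 0.1, 0.1], tau2=0.1)
print(estimate, variance)
```

Because the weights depend on the estimated variances, which for SMD depend on the estimated effects through Equation (2.1), this conventional variance can be too optimistic, as the text notes.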

4 Point and interval estimation of $\tau^{2}$ by the Kulinskaya-Dollinger-Bjørkestøl method (KDB)

Because the $\hat{w}_{i}(\tau^{2})$ in Equation (3.2) involve the $\hat{\sigma}_{i}^{2}$, $K-1$ is an adequate approximation for the expected value of Cochran’s $Q$ statistic only for very large sample sizes. As an alternative one can use one of the improved approximations to the expected value of Cochran’s $Q$. Corrected Mandel-Paule methods for estimating $\tau^{2}$ equate Cochran’s $Q$ statistic with the weights $\hat{w}_{i}(\tau^{2})$ to the first moment of an improved approximate null distribution.

More-realistic approximations to the distribution of $Q$ are available for several effect measures. In these approximations the estimates $\hat{\sigma}_{i}^{2}$ are not treated as equal to the $\sigma_{i}^{2}$. For SMD, Kulinskaya et al. (2011) derived $O(1/n)$ corrections to moments of $Q$ and suggested using the chi-squared distribution with degrees of freedom equal to the estimate of the corrected first moment to approximate the distribution of $Q$. Kulinskaya et al. (2011) give expressions from which the corrected first moment can be calculated, along with a computer program in R.

We propose a new method of estimating $\tau^{2}$ based on this improved approximation. Let $E_{\mathit{KDB}}(Q)$ denote the corrected expected value of $Q$. Then one obtains the KDB estimate of $\tau^{2}$ by iteratively solving

$$Q(\tau^{2})=\sum\limits_{i=1}^{K}\frac{(\hat{\theta}_{i}-\hat{\theta}_{\mathit{RE}})^{2}}{\hat{\sigma}_{i}^{2}+\tau^{2}}=E_{\mathit{KDB}}(Q), \tag{4.1}$$

in which $\hat{\theta}_{\mathit{RE}}$ is computed with the weights $\hat{w}_{i}(\tau^{2})$.

We denote the resulting estimator of $\tau^{2}$ by $\hat{\tau}_{\mathit{KDB}}^{2}$.

We also propose a new KDB confidence interval for the between-study variance. This interval for $\tau^{2}$ combines the Q-profile approach and the improved approximation by Kulinskaya et al. (2011) (i.e., the chi-squared distribution with fractional degrees of freedom based on the corrected first moment of $Q$).

This corrected Q-profile confidence interval can be obtained from the lower and upper quantiles of $F_{Q}$, the cumulative distribution function for the improved approximation to the distribution of $Q$:

$$Q(\tau_{L}^{2})=F_{Q;0.975},\qquad Q(\tau_{U}^{2})=F_{Q;0.025}. \tag{4.2}$$

The upper and lower confidence limits for $\tau^{2}$ can be calculated iteratively.
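The iterative solutions of Equations (4.1) and (4.2) can be sketched with a simple bisection, because $Q(\tau^{2})$ decreases as $\tau^{2}$ increases. The sketch below is an illustration under stated assumptions, not the authors' implementation: the target moment and the quantiles are passed in as plain numbers, since the corrected KDB moment from Kulinskaya et al. (2011) is not reproduced here. With target $K-1$ the point estimate is the Mandel-Paule estimate, and the example uses $\chi^{2}_{4}$ quantiles only as stand-ins.

```python
def q_statistic(effects, variances, tau2):
    """Generalized Q with inverse-variance weights evaluated at tau2."""
    weights = [1.0 / (s2 + tau2) for s2 in variances]
    mean = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - mean) ** 2 for w, y in zip(weights, effects))


def solve_for_q(effects, variances, target, iterations=200):
    """Return tau2 >= 0 with Q(tau2) = target, or 0.0 when Q(0) <= target."""
    if target <= 0:
        raise ValueError("target must be positive")
    if q_statistic(effects, variances, 0.0) <= target:
        return 0.0  # truncate at zero, as is usual for moment-type estimators
    lo, hi = 0.0, 1.0
    # Expand the bracket until Q(hi) falls to or below the target.
    while q_statistic(effects, variances, hi) > target:
        hi *= 2.0
    # Bisection: Q is above the target at lo and at or below it at hi.
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if q_statistic(effects, variances, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def q_profile_interval(effects, variances, q_lower, q_upper):
    """Limits as in Equation (4.2): q_lower and q_upper are the 0.025 and 0.975 quantiles."""
    tau2_lower = solve_for_q(effects, variances, q_upper)
    tau2_upper = solve_for_q(effects, variances, q_lower)
    return tau2_lower, tau2_upper


# Hypothetical data: K = 5 studies with equal within-study variances.
effects = [0.0, 0.5, 1.0, 1.5, 2.0]
variances = [0.1] * 5
# Target K - 1 = 4 (Mandel-Paule); KDB would use the corrected moment instead.
tau2_hat = solve_for_q(effects, variances, target=4.0)
# Chi-squared(4) quantiles as placeholders for the quantiles of the corrected distribution.
limits = q_profile_interval(effects, variances, q_lower=0.4844, q_upper=11.1433)
print(tau2_hat, limits)
```

A fixed number of bisection steps is used instead of a width tolerance so that the loop always terminates, whatever the scale of $\tau^{2}$.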

5 Sample-size-weighted (SSW) point and interval estimators of $\theta$

In an attempt to avoid the bias in the inverse-variance-weighted estimators, we included a point estimator whose weights depend only on the studies’ sample sizes. For this estimator (SSW), $w_{i}=\tilde{n}_{i}=n_{iT}n_{iC}/(n_{iT}+n_{iC})$; $\tilde{n}_{i}$ is the effective sample size in Study $i$. These weights would coincide with the inverse-variance weights when $\delta=0$. These effective-sample-size-based weights were suggested by Hedges and Olkin (1985, p. 110).

The interval estimator corresponding to SSW (SSW KDB) uses the SSW point estimator as its center, and its half-width equals the estimated standard deviation of SSW under the random-effects model times the critical value from the $t$ distribution on $K-1$ degrees of freedom. The estimator of the variance of SSW is

$$\widehat{\hbox{Var}}(\hat{\theta}_{\mathit{SSW}})=\frac{\sum\tilde{n}_{i}^{2}(v_{i}^{2}+\hat{\tau}^{2})}{(\sum\tilde{n}_{i})^{2}}, \tag{5.1}$$

in which $v_{i}^{2}$ comes from Equation (2.1) and $\hat{\tau}^{2}=\hat{\tau}_{\mathit{KDB}}^{2}$.
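A minimal sketch of the SSW point estimator and the interval built from Equation (5.1) follows. The function names and example data are assumptions for illustration; the value of $\hat{\tau}^{2}$ and the $t$ critical value are supplied by the caller (the text uses $\hat{\tau}_{\mathit{KDB}}^{2}$ and $K-1$ degrees of freedom).

```python
import math


def ssw_estimate(g, v2, n_t, n_c, tau2):
    """SSW point estimate and its variance from Equation (5.1).

    g, v2    : study-level Hedges's g and variance estimates (Equation (2.1))
    n_t, n_c : Treatment and Control arm sizes
    tau2     : estimate of the between-study variance
    """
    # Effective sample sizes are the weights.
    n_eff = [t * c / (t + c) for t, c in zip(n_t, n_c)]
    total = sum(n_eff)
    estimate = sum(w * gi for w, gi in zip(n_eff, g)) / total
    variance = sum(w ** 2 * (vi + tau2) for w, vi in zip(n_eff, v2)) / total ** 2
    return estimate, variance


def t_interval(estimate, variance, t_crit):
    """Interval centered at the estimate with half-width t_crit * sqrt(variance)."""
    half_width = t_crit * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


# Hypothetical data: K = 3 studies, 10 subjects per arm.
est, var = ssw_estimate(
    g=[0.2, 0.4, 0.6], v2=[0.2, 0.2, 0.2],
    n_t=[10, 10, 10], n_c=[10, 10, 10], tau2=0.1,
)
# 4.303 is the 0.975 quantile of the t distribution on K - 1 = 2 degrees of freedom.
print(est, var, t_interval(est, var, t_crit=4.303))
```

The weights do not involve the $g_{i}$, which is the property that motivates SSW in the text.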

6 Simulation study

As mentioned in Section 1, other studies have used simulation to examine estimators of $\tau^{2}$ or of the overall effect for SMD, but gaps in evidence remain.

Our simulation study for SMD uses $0\leq\delta\leq 2$ and $0\leq\tau^{2}\leq 2.5$ as realistic for a range of applications.

Our simulation study assesses the performance of five methods for point estimation of between-study variance $\tau^{2}$ (DL, REML, J, MP, and KDB) and five methods of interval estimation of $\tau^{2}$ (Q-profile-based methods corresponding to DerSimonian-Laird and KDB, the generalized Q-profile intervals of Biggerstaff and Jackson (2008) and Jackson (2013), and the profile-likelihood confidence interval based on REML).

We also assess the performance of the point and interval estimators of $\delta$ in the random-effects model for SMD.

We vary five parameters: the overall true SMD ($\delta$), the between-studies variance ($\tau^{2}$), the number of studies ($K$), the studies’ total sample size ($n$ and $\bar{n}$), and the proportion of observations in the Control arm ($q$). The combinations of parameters are listed in Table 2.

All simulations use the same numbers of studies $K=5,\;10,\;30$ and, for each combination of parameters, the same vector of total sample sizes $n=(n_{1},\ldots,n_{K})$ and the same proportions of observations in the Control arm $q_{i}=n_{iC}/n_{i}=.5,\;.75$ for all $i$. The values of $q$ reflect two situations for the two arms of each study: approximately equal (1:1) and quite unbalanced (1:3). The sample sizes in the Treatment and Control arms are $n_{iT}=\lceil(1-q_{i})n_{i}\rceil$ and $n_{iC}=n_{i}-n_{iT}$, $i=1,\ldots,K$.

We study equal and unequal study sizes. For equal study sizes $n_{i}$ is as small as 20, and for unequal study sizes $n_{i}$ is as small as 12, in order to examine how the methods perform for the extremely small sample sizes that arise in some areas of application. In choosing unequal study sizes, we follow a suggestion of Sánchez-Meca and Marín-Martínez (2000), who selected study sizes having skewness of 1.464, which they considered typical in behavioral and health sciences. Table 2 gives the details.

The patterns of sample sizes are illustrative; they do not attempt to represent all patterns seen in practice. By using the same patterns of sample sizes for each combination of the other parameters, we avoid the additional variability in the results that would arise from choosing sample sizes at random (e.g., uniformly between 20 and 200).

We use a total of $10,000$ repetitions for each combination of parameters. Thus, the simulation standard error for estimated coverage of $\tau^{2}$ or $\delta$ at the $95\%$ confidence level is roughly $\sqrt{0.95\times 0.05/10,000}=0.00218$.
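The standard-error arithmetic above is the usual binomial formula, and a short sketch makes it easy to see how it would change for other coverage levels or numbers of repetitions. The function name is an illustrative assumption.

```python
import math


def coverage_standard_error(coverage, repetitions):
    """Binomial standard error of an estimated coverage probability."""
    return math.sqrt(coverage * (1.0 - coverage) / repetitions)


se_95 = coverage_standard_error(0.95, 10_000)
# Lower true coverage gives a larger standard error for the same number of repetitions.
se_90 = coverage_standard_error(0.90, 10_000)
print(round(se_95, 5), round(se_90, 5))
```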

We generate the true effect sizes $\delta_{i}$ from a normal distribution: $\delta_{i}\sim N(\delta,\tau^{2})$. We generate the values of Hedges’s estimator $g_{i}$ directly from the appropriately scaled non-central $t$-distribution, given by Equation (2.2), and obtain their estimated within-study variances from Equation (2.1).
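The data-generation step can be sketched as follows. The original simulations were programmed in R; this Python version is only an illustration under stated assumptions: the function names are hypothetical, and the non-central $t$ variate is built from its definition (a shifted standard normal divided by the square root of an independent chi-squared variate over its degrees of freedom) rather than taken from a library routine.

```python
import math
import random


def hedges_j(m):
    """Correction factor J(m), computed on the log scale for stability."""
    return math.exp(math.lgamma(m / 2) - math.lgamma((m - 1) / 2)) / math.sqrt(m / 2)


def arm_sizes(n, q):
    """Treatment and Control sizes: n_T = ceil((1 - q) n), n_C = n - n_T."""
    n_t = math.ceil((1 - q) * n)
    return n_t, n - n_t


def simulate_study(delta, tau2, n, q, rng):
    """Draw one (g, v2) pair following Equations (2.2) and (2.1)."""
    n_t, n_c = arm_sizes(n, q)
    m = n - 2
    n_eff = n_t * n_c / n  # equals n q (1 - q) with q = n_C / n
    # True study effect from the random-effects distribution.
    delta_i = rng.gauss(delta, math.sqrt(tau2))
    ncp = math.sqrt(n_eff) * delta_i
    # Non-central t: (Z + ncp) / sqrt(chi2_m / m), with chi2_m = Gamma(m/2, scale 2).
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(m / 2, 2.0)
    t = (z + ncp) / math.sqrt(chi2 / m)
    j = hedges_j(m)
    g = j * t / math.sqrt(n_eff)
    v2 = n / (n_t * n_c) + (1 - (m - 2) / (m * j ** 2)) * g ** 2
    return g, v2


rng = random.Random(2025)  # fixed seed so the sketch is reproducible
sample = [simulate_study(delta=0.5, tau2=0.0, n=40, q=0.5, rng=rng) for _ in range(20000)]
mean_g = sum(g for g, _ in sample) / len(sample)
print(round(mean_g, 3))
```

With $\tau^{2}=0$ the average of the simulated $g_{i}$ should be close to $\delta$, which gives a quick check that the scaling in Equation (2.2) has been applied correctly.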

The simulations were programmed in R version 3.3.2 using the University of East Anglia 140-computer-node High Performance Computing (HPC) Cluster, providing a total of 2560 CPU cores, including parallel processing and large memory resources. For each configuration, we divided the 10,000 replications into 10 parallel sets of 1000 replications.

The structure of the simulations invites an analysis of the results along the lines of a designed experiment, in which the variables are $\tau^{2}$, $n$, $K$, $q$, and $\delta$. Most of the variables are crossed, but two have additional structure. Within the two levels of $n$, equal and unequal, the values are nested: $n=20,\;40,\;100,\;250$ and $\bar{n}=30,\;60,\;100,\;160$. We approach the analysis of the data from the simulations qualitatively, to identify the variables that substantially affect (or do not affect) the performance of the estimators as a whole and the variables that reveal important differences in performance. We might hope to describe the estimators’ performance one variable at a time, but such “main effects” often do not provide an adequate summary: important differences are related to certain combinations of two or more variables.

We use this approach to examine bias and coverage in estimation of $\tau^{2}$ and bias and coverage in estimation of $\delta$. Our summaries of results are based on examination of the figures in the corresponding Appendices. Section 8 gives brief summaries, and Appendix A contains more detail.

Table 2: Combinations of parameters in the simulations for SMD

| SMD | Equal study sizes | Unequal study sizes | Results in Appendix |
|---|---|---|---|
| $K$ (number of studies) | 5, 10, 30 | 5, 10, 30 | |
| $n$ or $\bar{n}$ (average (individual) study size, total of the two arms) | 20, 40, 100, 250; 30, 50, 60, 70 | 30 (12, 16, 18, 20, 84); 60 (24, 32, 36, 40, 168); 100 (64, 72, 76, 80, 208); 160 (124, 132, 136, 140, 268) | |
| $q$ (proportion of each study in the control arm) | 1/2, 3/4 | 1/2, 3/4 | |
| $\tau^{2}$ (variance of random effect) | 0(0.5)2.5 | 0(0.5)2.5 | A1, A2 |
| $\delta$ (true value of the SMD) | 0, 0.2, 0.5, 1, 2 | 0, 0.2, 0.5, 1, 2 | B1, B2 |

For $K=10$ and $K=30$, the same set of unequal study sizes is used 2 or 6 times, respectively.

7 Methods of estimation of $\tau^{2}$ and $\delta$ used in simulations

Point estimators of $\tau^{2}$

  • DL - method of DerSimonian and Laird (1986)

  • J - method of Jackson (2013)

  • KDB - method based on corrected null moment of $Q$ per Kulinskaya et al. (2011)

  • MP - method of Mandel and Paule (1970)

  • REML - restricted maximum-likelihood method

Interval estimators of $\tau^{2}$

  • BJ - method of Biggerstaff and Jackson (2008)

  • J - method of Jackson (2013)

  • KDB - Q-profile method based on corrected null distribution of $Q$ per Kulinskaya et al. (2011)

  • PL - profile-likelihood confidence interval based on $\hat{\tau}_{\mathit{REML}}^{2}$

  • QP - Q-profile confidence interval of Viechtbauer (2007)

Point estimators of $\delta$

Inverse-variance-weighted methods with $\tau^{2}$ estimated by:

  • DL

  • J

  • REML

  • KDB

  • MP

and

  • SSW - weighted mean with weights that depend only on the studies’ sample sizes

Interval estimators of $\delta$

Inverse-variance-weighted methods using normal quantiles, with $\tau^{2}$ estimated by:

  • DL

  • J

  • KDB

  • MP

  • REML

Inverse-variance-weighted methods with modified variance of $\hat{\delta}$ and $t$-quantiles as in Hartung and Knapp (2001) and Sidik and Jonkman (2002):

  • HKSJ (DL) - $\tau^{2}$ estimated by DL

  • HKSJ KDB - $\tau^{2}$ estimated by KDB

and

  • SSW KDB - SSW point estimator of $\delta$ with estimated variance given by (5.1) and $t$-quantiles

8 Results

Our full simulation results, comprising $130$ figures, each presenting $12$ combinations of the 4 values of $n$ or $\bar{n}$ and the 3 values of $K$, are provided in Appendices A and B. A summary is given below.

8.1 Bias in estimation of $\tau^{2}$ (Appendix A1)

The five estimators (DL, REML, J, MP, and KDB) have biases whose traces fan out from the same small positive bias at $\tau^{2}=0$. As $\tau^{2}$ increases, KDB remains positive and increases slowly; MP stays close to 0 (and slightly below); REML stays negative, with a negative slope; J stays negative, with a more-negative slope; and DL becomes increasingly negative, showing noticeable curvature. The value of $\delta$ has little effect on this pattern (except that the bias of DL and J has smaller magnitude when $\delta=2$).

As $n$ or $\bar{n}$ increases, the traces for KDB, MP, and REML flatten, and their bias is essentially 0 when $n$ or $\bar{n}=100$. The traces for J and DL become less steep, but substantial bias remains at $n=250$ (or $\bar{n}=160$).

As $K$ increases, the trace for KDB flattens somewhat, but the traces for the other estimators become steeper.

The traces for J and DL are slightly less steep when $q=.75$ than when $q=.5$.

In summary, the patterns of bias indicate a choice among the five estimators of $\tau^{2}$ (DL, REML, J, MP, and KDB). When $n\leq 40$, MP is closer to unbiased than KDB when $K=5$, the magnitudes of their biases are roughly equal when $K=10$, and KDB is closer to unbiased when $K=30$. When $n\geq 100$, MP, KDB, and REML are nearly unbiased. DL and J seriously underestimate $\tau^{2}$. The average of MP and KDB should be close to unbiased.

8.2 Coverage in estimation of $\tau^{2}$ (Appendix A2)

The five interval estimators (PL, QP, BJ, J, and KDB) share the feature that their coverage decreases as $\tau^{2}$ increases from 0 to 0.5. At $\tau^{2}=0$ all five have coverage $\geq .95$. KDB is highest (e.g., .99 when $q=.5$ and $n=20$), but it drops below .95 at $\tau^{2}=0.5$ or $\tau^{2}=1.0$ and remains slightly below .95. BJ is next highest (e.g., .98 when $q=.5$ and $n=20$), and it remains above .95 (say, .96 to .97) when $K=5$ and $K=10$. QP is close to .95 for $\tau^{2}\geq 0.5$. PL is between BJ and QP, and it remains above QP when $K=30$. The trace for BJ behaves quite differently when $K=30$ than when $K\leq 10$, decreasing steeply and linearly to around .77 at $\tau^{2}=2.5$ (when $q=.5$ and $n=20$). When $K\leq 10$, J is between BJ and QP; and when $K=30$, it also decreases linearly, but less steeply (e.g., to around .92 at $\tau^{2}=2.5$ when $q=.5$ and $n=20$).

Coverage does not change noticeably as $n$ or $\bar{n}$ increases, when $K\leq 10$. When $K=30$, the slopes of BJ and J become slightly less steep, and the traces of the other estimators move closer together and are closer to .95.

Setting aside the behavior of BJ and J when $K=30$, and of the other estimators when $n=20$ and $K=30$, the traces move closer together as $K$ increases.

The slopes of BJ and J when $K=30$ are less steep when $q=.75$ than when $q=.5$.

As $\delta$ increases, the coverage of QP at $K=30$ increases slightly; it is substantially closer to .95 when $\delta=2$.

In summary, all five interval estimators of $\tau^{2}$ have coverage substantially above .95 when $\tau^{2}=0$. When $\tau^{2}\geq 0.5$, QP is generally closest to .95. The unusual behavior of BJ (and, to a lesser extent, J) when $K=30$ adds to the evidence against it.

8.3 Bias and mean squared error in estimation of $\delta$ (Appendix B1)

When $\delta=0$, the bias of the six estimators (DL, REML, MP, KDB, J, and SSW) follows a single trace, close to 0, for all values of $\tau^{2}$. When $\delta>0$, SSW stays close to 0, and the others shift down, to increasingly negative bias, as $\delta$ increases, and their traces separate. For example, when $\delta=1$, $q=.5$, $n=20$, $K=10$, and $\tau^{2}=2.5$, the bias ranges from $-0.05$ (KDB) to $-0.07$ (DL), and MP, REML, and J (in that order) have intermediate values.

When $\delta\leq 0.5$, bias has little relation to $\tau^{2}$. When $\delta\geq 1$, however, the bias of the estimators other than SSW (especially DL) becomes increasingly negative as $\tau^{2}$ increases. (The plot for $n=20$ and $K=30$ in B1.17 shows an extreme example.)

Where bias is nonzero, increasing $n$ or $\bar{n}$ moves the traces toward (or to) 0, decreasing separation between them. Some plots (e.g., B1.37 and B1.39) show slight evidence of greater separation among traces when sample sizes are unequal and $\delta=2$.

For the most part, $K$ has little or no effect on bias. Some plots suggest that, where bias is nonzero, separation among traces increases as $K$ increases, especially from $K=10$ to $K=30$.

Bias does not differ noticeably between $q=.5$ and $q=.75$.

SSW essentially avoids the bias that we found in the inverse-variance-weighted estimators of $\delta$. To provide an additional measure of its performance (besides coverage, discussed below), we estimated the mean squared error of SSW and the best two inverse-variance-weighted estimators, KDB and MP. Appendix E1 includes figures that plot (versus $\tau^{2}$) the ratios MSE(SSW)/MSE(KDB) and MSE(SSW)/MSE(MP) for the five values of $\delta$, the two values of $q$, and $n_{i}=20,\;40,\;100,\;250$. For most situations the two ratios are essentially equal and differ little among values of $K$ and $q$. In most situations the traces are essentially flat as $\tau^{2}$ increases; otherwise, they curve downward as $\tau^{2}$ approaches 0. As $n$ increases, the ratios approach 1. For example, when $\delta=0$ and $K=5$, they decrease from around 1.1 when $n=20$ to nearly 1.0 when $n=250$. As $\delta$ increases ($\geq 0.5$), the ratios at small $\tau^{2}$ decrease. This pattern is first noticeable when $\delta=0.5$, $n=20$, and $K=30$; and as $\delta$ increases, it becomes more pronounced at that combination of $n$ and $K$ and extends to larger $n$ (with $K=30$) and to $n\leq 40$ and $K=10$. When $\delta=2$, $q=.5$, $n=20$, and $K=30$, the traces for the two ratios are separate and curve up from around 0.55 at $\tau^{2}=0$ to slightly $<1$ when $\tau^{2}=2.5$. The patterns are similar for $q=.75$.

In summary, the bias of SSW is close to 0, and the other five estimators (DL, REML, J, MP, and KDB), which use inverse-variance weights, have greater (and negative) bias, amounting to 5% to 10% when sample sizes are small and $\delta\geq 1$. This bias increases as $\tau^{2}$ increases. SSW usually has slightly greater mean squared error than KDB and MP when $n$ is small, but its MSE can be substantially smaller, especially for small $\tau^{2}$.

8.4 Coverage in estimation of δ\delta (Appendix B2)

Coverage of the estimators that rely on inverse-variance weights and normal critical values (DL, REML, MP, KDB, and J) is influenced most by τ2\tau^{2} (=0=0 versus >0>0) and KK. At τ2=0\tau^{2}=0 their coverage is around .97, but at τ2=0.5\tau^{2}=0.5 (when q=.5q=.5 and n=20n=20) it is mostly below .95: .90 to .91 when K=5K=5, .92 to.94 when K=10K=10, and .93 to .95+ when K=30K=30. As τ2\tau^{2} increases, their coverage either is flat (REML, MP, and KDB) or decreases (DL, J). DL almost always has the lowest coverage, and the gap between it and J widens as KK increases.

Except for the effect of KK on SSW KDB at τ2=0\tau^{2}=0 (above .99 when K=5K=5, decreasing to .97 when K=30K=30) and below-nominal coverage of HKSJ and HKSJ KDB in a region whose definition involves mainly δ\delta, nn (or n¯\bar{n}), KK, and τ2\tau^{2}, the coverage of SSW and the HKSJ-type estimators is close to .95 for all KK and τ2\tau^{2}. The challenging situations in that region generally involve δ=1\delta=1 or δ=2\delta=2, the smaller nn or n¯\bar{n}, K=10K=10 or K=30K=30, and the smaller τ2\tau^{2}. For example, when δ=2\delta=2 and n=20n=20 or n¯=30\bar{n}=30 and K=30K=30, coverage can be as low as .84 when τ2=0\tau^{2}=0.

When $\tau^{2}\geq 0.5$, coverage of DL, REML, MP, KDB, and J increases as $K$ increases, usually staying below .95.

The effect of $\delta$ on coverage is slight except for some situations involving $\delta=2$. When $n=20$ and $K=30$, all of the estimators except SSW KDB have low coverage at $\tau^{2}=0$, ranging from $<.82$ to .86. Their traces rise as $\tau^{2}$ increases; when $\tau^{2}=2.5$, KDB and HKSJ KDB are almost .94, and DL is .86 (up from .84). The pattern is similar when $n=40$ and $K=30$, but much reduced.

When $\tau^{2}\geq 0.5$, coverage decreases slightly as $n$ or $\bar{n}$ increases (except for SSW and the two HKSJ-type estimators); coverage is somewhat lower when sample sizes are unequal.

The plots show at most slight differences between $q=.5$ and $q=.75$.

In summary, except when $\delta=2$ and $K=30$, HKSJ and HKSJ KDB have coverage closest to .95; they differ little, and departures from .95 (toward lower coverage) are seldom serious. SSW KDB is rather conservative when $K=5$ and for other $K$ when $\tau^{2}=0$. Otherwise it provides reliable, albeit slightly conservative, coverage. When $\delta=2$ and $K=30$, SSW KDB is the best alternative. All of the estimators that use inverse-variance weights and critical values from the normal distribution (DL, REML, J, MP, and KDB) often have coverage substantially below .95.
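For readers who want to reproduce a coverage figure of this kind, empirical coverage is simply the proportion of replicate intervals that contain the true parameter. The sketch below is illustrative only: the function name and the interval endpoints are hypothetical, and closed intervals are assumed.

```python
def empirical_coverage(intervals, true_value):
    """Fraction of (lower, upper) intervals that contain the true value."""
    hits = sum(1 for lower, upper in intervals if lower <= true_value <= upper)
    return hits / len(intervals)


# Hypothetical replicate confidence intervals for delta (illustrative only).
replicate_intervals = [(-0.1, 0.4), (0.1, 0.6), (0.3, 0.9), (-0.5, 0.1)]
coverage = empirical_coverage(replicate_intervals, true_value=0.2)
print(coverage)
```

With many replications per situation, this proportion can be compared directly with the nominal level of .95.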

9 Discussion: Practical implications for meta-analysis

The results of our simulations for SMD give a rather disappointing picture. In brief:
Because the study-level effects and their variances are related (cf. Equation (2.1) for SMD), the performance of all statistical methods depends on the effect measures, estimates of overall effects are biased, and coverage of confidence intervals is too low, especially for small sample sizes.

The conventional wisdom is that these deficiencies do not matter, as meta-analysis usually deals with studies that are “large,” so all these little problems are automatically resolved. Unfortunately, this is not true, even in medical meta-analyses; in Issue 4 of the Cochrane Database 2004, the maximum study size was 63 or less in 25% of meta-analyses with $K\geq 3$ that used SMD as an effect measure, and less than 110 in 50% of them (our own analysis). We have not surveyed typical study sizes in psychology, but Sánchez-Meca and Marín-Martínez (2010), promoting meta-analysis in psychological research, use an example with 24 studies in which the smallest study size is 12 and the largest is 121. In ecology, typical sample sizes are between 4 and 25 (Hamman et al., 2018). An effect-measure-specific estimator of $\tau^{2}$, such as KDB for SMD, can reduce inherent biases.

Arguably, the main purpose of a meta-analysis is to provide point and interval estimates of an overall effect.

Usually, after estimating the between-study variance $\tau^{2}$, inverse-variance weights are used in estimating the overall effect (and, often, its variance). This approach relies on the theoretical result that, for known variances, and given unbiased estimates $\hat{\theta}_{i}$, it yields a Uniformly Minimum-Variance Unbiased Estimate (UMVUE) of $\theta$. In practice, however, the true within-study variances are unknown, and use of the estimated variances makes the inverse-variance-weighted estimate of the overall effect biased.
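The inverse-variance-weighted estimate described here can be sketched as follows. This is a minimal illustration, assuming the conventional random-effects weights $w_{i}=1/(v_{i}+\hat{\tau}^{2})$ and the conventional variance estimate $1/\sum w_{i}$; the function name and the input values are hypothetical and are not taken from the paper.

```python
def iv_weighted_estimate(effects, variances, tau2):
    """Inverse-variance-weighted overall effect and its conventional variance.

    Assumes weights w_i = 1 / (v_i + tau2), where v_i is the (estimated)
    within-study variance and tau2 is an estimate of between-study variance.
    """
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, 1.0 / total


# Hypothetical study-level estimates and within-study variances.
theta_hat, var_hat = iv_weighted_estimate(
    effects=[0.2, 0.5, 0.8], variances=[0.1, 0.1, 0.1], tau2=0.1
)
print(theta_hat, var_hat)
```

For SMD the estimated within-study variances depend on the estimated effects themselves, so the weights are correlated with the quantities being averaged; that dependence is the source of the bias discussed above.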

Consumers routinely expect point estimates to have no (or small) bias and CIs to have (close to) nominal coverage. Thus, the IV-weighted approach is unsatisfactory because, in general, it cannot produce an unbiased estimate of an overall effect.

A pragmatic approach to unbiased estimation of $\theta$ uses weights that do not involve estimated variances of study-level estimates, for example, weights proportional to the study sizes $n_{i}$. Hunter and Schmidt (1990) and Shuster (2010), among others, have proposed such weights, and Marín-Martínez and Sánchez-Meca (2010) and Hamman et al. (2018) have studied the method’s performance by simulation for SMD. We prefer to use weights proportional to an effective sample size, $\tilde{n}_{i}=n_{iT}n_{iC}/n_{i}$; these are the optimal inverse-variance weights for SMD when $\delta=0$ and $\tau^{2}=0$. Thus, the overall effect is estimated by $\hat{\theta}_{\mathit{SSW}}=\sum\tilde{n}_{i}\hat{\theta}_{i}/\sum\tilde{n}_{i}$, and its variance is estimated by Equation (5.1). Hamman et al. (2018) use weights proposed by Hedges (1982), which differ slightly for very small sample sizes.
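The SSW point estimate can be computed directly from the arm sizes and the study-level estimates. The sketch below is a minimal illustration with hypothetical function names and inputs; it assumes $n_{i}=n_{iT}+n_{iC}$ and covers only the point estimate, not the variance in Equation (5.1).

```python
def effective_sample_size(n_treat, n_control):
    """Effective sample size n_T * n_C / (n_T + n_C) for one study."""
    return n_treat * n_control / (n_treat + n_control)


def ssw_estimate(effects, n_treat, n_control):
    """Sample-size-weighted overall effect using effective sample sizes."""
    weights = [effective_sample_size(t, c) for t, c in zip(n_treat, n_control)]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


# Hypothetical studies: estimated SMDs with treatment and control arm sizes.
theta_ssw = ssw_estimate(
    effects=[0.2, 0.6], n_treat=[10, 30], n_control=[10, 30]
)
print(theta_ssw)
```

Because the weights depend only on the arm sizes, they are fixed constants rather than random quantities correlated with the study-level estimates.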

A good estimator of $\tau^{2}$, such as MP or KDB, can be used as $\hat{\tau}^{2}$. Further, confidence intervals for $\delta$ centered at $\hat{\delta}_{\mathit{SSW}}$ with $\hat{\tau}_{\mathit{KDB}}^{2}$ in Equation (5.1) can be used.

This approach based on SSW requires further study. For example, in the confidence intervals we have used critical values from the $t$-distribution on $K-1$ degrees of freedom, but we have not yet examined the actual sampling distribution of SSW. The raw material for such an examination is readily available: for each situation in our simulations, each of the 10,000 replications yields an observation on the sampling distribution of SSW.

Funding

The work by E. Kulinskaya was supported by the Economic and Social Research Council [grant number ES/L011859/1].

Appendices

  • Appendix A. SMD: Plots for bias and coverage of $\tau^{2}$.

  • Appendix B. SMD: Plots for bias and coverage of standardized mean difference $\delta$.

References

  • Biggerstaff and Jackson [2008] Brad J Biggerstaff and Dan Jackson. The exact distribution of Cochran’s heterogeneity statistic in one-way random effects meta-analysis. Statistics in Medicine, 27(29):6093–6110, 2008.
  • Cohen [1988] J Cohen. Statistical Power Analysis for the Behavioral Sciences. New York: Academic Press, 1988.
  • DerSimonian and Laird [1986] Rebecca DerSimonian and Nan Laird. Meta-analysis in clinical trials. Controlled Clinical Trials, 7(3):177–188, 1986.
  • Ferguson [2009] Christopher J Ferguson. An effect size primer: A guide for clinicians and researchers. Professional Psychology: Research & Practice, 40(5):532–538, 2009.
  • Friedrich et al. [2008] Jan O Friedrich, Neill KJ Adhikari, and Joseph Beyene. The ratio of means method as an alternative to mean differences for analyzing continuous outcome variables in meta-analysis: a simulation study. BMC Medical Research Methodology, 8:32, 2008.
  • Hamman et al. [2018] Elizabeth A. Hamman, Paula Pappalardo, James R. Bence, Scott D. Peacor, and Craig W. Osenberg. Bias in meta-analyses using Hedges’ d. Ecosphere, 9(9):e02419, 2018. doi: 10.1002/ecs2.2419. URL https://esajournals.onlinelibrary.wiley.com/doi/abs/10.1002/ecs2.2419.
  • Hartung and Knapp [2001] Joachim Hartung and Guido Knapp. A refined method for the meta-analysis of controlled clinical trials with binary outcome. Statistics in Medicine, 20(24):3875–3889, 2001.
  • Hedges [1982] Larry V Hedges. Estimation of effect size from a series of independent experiments. Psychological Bulletin, 92(2):490–499, 1982.
  • Hedges [1983] Larry V Hedges. A random effects model for effect sizes. Psychological Bulletin, 93(2):388–395, 1983.
  • Hedges and Olkin [1985] Larry V Hedges and Ingram Olkin. Statistical Methods for Meta-Analysis. San Diego, California: Academic Press, 1985.
  • Higgins et al. [2009] Julian P T Higgins, Simon G Thompson, and David J Spiegelhalter. A re-evaluation of random-effects meta-analysis. Journal of the Royal Statistical Society, Series A, 172(1):137–159, 2009.
  • Higgins and Green [2011] Julian P.T. Higgins and Sally Green, editors. Cochrane Handbook for Systematic Reviews of Interventions, Version 5.1.0 [updated March 2011]. The Cochrane Collaboration, 2011. URL http://handbook.cochrane.org.
  • Hunter and Schmidt [1990] John E Hunter and Frank L Schmidt. Methods of Meta-analysis: Correcting Error and Bias in Research Findings. Sage Publications, Inc, 1990.
  • Jackson [2013] Dan Jackson. Confidence intervals for the between-study variance in random effects meta-analysis using generalised Cochran heterogeneity statistics. Research Synthesis Methods, 4(3):220–229, 2013.
  • Kulinskaya et al. [2011] Elena Kulinskaya, Michael B Dollinger, and Kirsten Bjørkestøl. Testing for homogeneity in meta-analysis I. The one-parameter case: standardized mean difference. Biometrics, 67(1):203–212, 2011.
  • Kulinskaya et al. [2014] Elena Kulinskaya, Stephan Morgenthaler, and Robert G Staudte. Combining statistical evidence. International Statistical Review, 82(2):214–242, 2014.
  • Langan et al. [2018] Dean Langan, Julian P. T. Higgins, Dan Jackson, Jack Bowden, Areti Angeliki Veroniki, Evangelos Kontopantelis, and Wolfgang Viechtbauer. A comparison of heterogeneity variance estimators in simulated random-effects meta-analyses. Research Synthesis Methods, to appear, 2018.
  • Li et al. [1994] Yuanzhang Li, Li Shi, and H Daniel Roth. The bias of the commonly-used estimate of variance in meta-analysis. Communications in Statistics–Theory and Methods, 23(4):1063–1085, 1994.
  • Lin [2018] Lifeng Lin. Bias caused by sampling error in meta-analysis with small sample sizes. PLoS ONE, 13(9):e0204056, 2018.
  • Mandel and Paule [1970] John Mandel and Robert C Paule. Interlaboratory evaluation of a material with unequal numbers of replicates. Analytical Chemistry, 42(11):1194–1197, 1970.
  • Marín-Martínez and Sánchez-Meca [2010] Fulgencio Marín-Martínez and Julio Sánchez-Meca. Weighting by inverse variance or by sample size in random-effects meta-analysis. Educational and Psychological Measurement, 70(1):56–73, 2010.
  • Møller and Jennions [2002] Anders Møller and Michael D Jennions. How much variance can be explained by ecologists and evolutionary biologists? Oecologia, 132(4):492–500, 2002.
  • Petropoulou and Mavridis [2017] Maria Petropoulou and Dimitris Mavridis. A comparison of 20 heterogeneity variance estimators in statistical synthesis of results from studies: a simulation study. Statistics in Medicine, 36(27):4266–4280, 2017.
  • Rukhin [2009] Andrew L Rukhin. Weighted means statistics in interlaboratory studies. Metrologia, 46(3):323–331, 2009.
  • Sánchez-Meca and Marín-Martínez [2000] Julio Sánchez-Meca and Fulgencio Marín-Martínez. Testing the significance of a common risk difference in meta-analysis. Computational Statistics & Data Analysis, 33(3):299–313, 2000.
  • Sánchez-Meca and Marín-Martínez [2010] Julio Sánchez-Meca and Fulgencio Marín-Martínez. Meta-analysis in psychological research. International Journal of Psychological Research, 3(1):150–162, 2010.
  • Shuster [2010] Jonathan J Shuster. Empirical vs natural weighting in random effects meta-analysis. Statistics in Medicine, 29(12):1259–1265, 2010.
  • Sidik and Jonkman [2002] K. Sidik and J. N. Jonkman. A simple confidence interval for meta-analysis. Statistics in Medicine, 21(21):3153–3159, 2002.
  • Sidik and Jonkman [2006] Kurex Sidik and Jeffrey N Jonkman. Robust variance estimation for random effects meta-analysis. Computational Statistics & Data Analysis, 50(12):3681–3701, 2006.
  • Veroniki et al. [2016] Areti Angeliki Veroniki, Dan Jackson, Wolfgang Viechtbauer, Ralf Bender, Jack Bowden, Guido Knapp, Oliver Kuss, Julian P T Higgins, Dean Langan, and Georgia Salanti. Methods to estimate the between-study variance and its uncertainty in meta-analysis. Research Synthesis Methods, 7(1):55–79, 2016.
  • Viechtbauer [2005] Wolfgang Viechtbauer. Bias and efficiency of meta-analytic variance estimators in the random-effects model. Journal of Educational and Behavioral Statistics, 30(3):261–293, 2005.
  • Viechtbauer [2007] Wolfgang Viechtbauer. Confidence intervals for the amount of heterogeneity in meta-analysis. Statistics in Medicine, 26(1):37–52, 2007.

10 Appendices

A1. Bias of point estimators of $\tau^{2}$ for $\tau^{2}=0.0(0.5)2.5$.

For bias of $\hat{\tau}^{2}$, each figure corresponds to a value of $\delta$ (= 0, 0.5, 1, 1.5, 2, 2.5), a value of $q$ (= .5, .75), and a set of values of $n$ (= 20, 40, 100, 250 or 30, 50, 60, 70) or $\bar{n}$ (= 30, 60, 100, 160).
Each figure contains a panel (with $\tau^{2}$ on the horizontal axis) for each combination of $n$ (or $\bar{n}$) and $K$ (= 5, 10, 30).
The point estimators of $\tau^{2}$ are

  • DL (DerSimonian-Laird)

  • REML (restricted maximum likelihood)

  • MP (Mandel-Paule)

  • KDB (improved moment estimator based on Kulinskaya, Dollinger and Bjørkestøl (2011))

  • J (Jackson)

Figure A1.1: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0$, $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure A1.2: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0$, $q=0.5$, $n=30,\;50,\;60,\;70$.
Figure A1.3: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0$, $q=0.5$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A1.4: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0.2$, $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure A1.5: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0.2$, $q=0.5$, $n=30,\;50,\;60,\;70$.
Figure A1.6: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0.2$, $q=0.5$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A1.7: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0.5$, $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure A1.8: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0.5$, $q=0.5$, $n=30,\;50,\;60,\;70$.
Figure A1.9: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0.5$, $q=0.5$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A1.10: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=1$, $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure A1.11: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=1$, $q=0.5$, $n=30,\;50,\;60,\;70$.
Figure A1.12: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=1$, $q=0.5$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A1.13: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=2$, $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure A1.14: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=2$, $q=0.5$, $n=30,\;50,\;60,\;70$.
Figure A1.15: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=2$, $q=0.5$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A1.16: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0$, $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure A1.17: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0$, $q=0.75$, $n=30,\;50,\;60,\;70$.
Figure A1.18: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0$, $q=0.75$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A1.19: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0.2$, $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure A1.20: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0.2$, $q=0.75$, $n=30,\;50,\;60,\;70$.
Figure A1.21: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0.2$, $q=0.75$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A1.22: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0.5$, $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure A1.23: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0.5$, $q=0.75$, $n=30,\;50,\;60,\;70$.
Figure A1.24: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=0.5$, $q=0.75$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A1.25: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=1$, $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure A1.26: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=1$, $q=0.75$, $n=30,\;50,\;60,\;70$.
Figure A1.27: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=1$, $q=0.75$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A1.28: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=2$, $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure A1.29: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=2$, $q=0.75$, $n=30,\;50,\;60,\;70$.
Figure A1.30: Bias of the estimation of between-studies variance $\tau^{2}$ for $\delta=2$, $q=0.75$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.

A2. Coverage of interval estimators of $\tau^{2}$ for $\tau^{2}=0.0(0.5)2.5$.

For coverage of $\hat{\tau}^{2}$, each figure corresponds to a value of $\delta$ (= 0, 0.5, 1, 1.5, 2, 2.5), a value of $q$ (= .5, .75), and a set of values of $n$ (= 20, 40, 100, 250 or 30, 50, 60, 70) or $\bar{n}$ (= 30, 60, 100, 160).
Each figure contains a panel (with $\tau^{2}$ on the horizontal axis) for each combination of $n$ (or $\bar{n}$) and $K$ (= 5, 10, 30).
The interval estimators of $\tau^{2}$ are

  • QP (Q-profile confidence interval)

  • BJ (Biggerstaff and Jackson interval)

  • PL (Profile likelihood interval)

  • KDB (improved Q-profile method based on Kulinskaya, Dollinger and Bjørkestøl (2011))

  • J (Jackson’s interval)

Figure A2.1: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0$, $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure A2.2: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0$, $q=0.5$, $n=30,\;50,\;60,\;70$.
Figure A2.3: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0$, $q=0.5$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A2.4: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0.2$, $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure A2.5: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0.2$, $q=0.5$, $n=30,\;50,\;60,\;70$.
Figure A2.6: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0.2$, $q=0.5$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A2.7: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0.5$, $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure A2.8: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0.5$, $q=0.5$, $n=30,\;50,\;60,\;70$.
Figure A2.9: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0.5$, $q=0.5$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A2.10: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=1$, $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure A2.11: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=1$, $q=0.5$, $n=30,\;50,\;60,\;70$.
Figure A2.12: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=1$, $q=0.5$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A2.13: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=2$, $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure A2.14: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=2$, $q=0.5$, $n=30,\;50,\;60,\;70$.
Figure A2.15: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=2$, $q=0.5$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A2.16: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0$, $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure A2.17: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0$, $q=0.75$, $n=30,\;50,\;60,\;70$.
Figure A2.18: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0$, $q=0.75$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A2.19: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0.2$, $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure A2.20: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0.2$, $q=0.75$, $n=30,\;50,\;60,\;70$.
Figure A2.21: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0.2$, $q=0.75$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A2.22: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0.5$, $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure A2.23: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0.5$, $q=0.75$, $n=30,\;50,\;60,\;70$.
Figure A2.24: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=0.5$, $q=0.75$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A2.25: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=1$, $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure A2.26: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=1$, $q=0.75$, $n=30,\;50,\;60,\;70$.
Figure A2.27: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=1$, $q=0.75$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure A2.28: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=2$, $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure A2.29: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=2$, $q=0.75$, $n=30,\;50,\;60,\;70$.
Figure A2.30: Coverage at the nominal confidence level of 0.95 of the between-studies variance $\tau^{2}$ for $\delta=2$, $q=0.75$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.

B1. Bias and mean squared error of point estimators of $\delta$ for $\tau^{2}=0.0(0.5)2.5$.

For bias of $\hat{\delta}$, each figure corresponds to a value of $\delta$ (= 0, 0.5, 1, 1.5, 2, 2.5), a value of $q$ (= .5, .75), and a set of values of $n$ (= 20, 40, 100, 250 or 30, 50, 60, 70) or $\bar{n}$ (= 30, 60, 100, 160).
Figures for mean squared error (expressed as the ratio of the MSE of SSW to the MSEs of the inverse-variance-weighted estimators that use the MP or KDB estimator of $\tau^{2}$) use the above values of $\delta$ and $q$ but only $n$ = 20, 40, 100, 250.
Each figure contains a panel (with $\tau^{2}$ on the horizontal axis) for each combination of $n$ (or $\bar{n}$) and $K$ (= 5, 10, 30).
The point estimators of $\delta$ are

  • DL (DerSimonian-Laird)

  • REML (restricted maximum likelihood)

  • MP (Mandel-Paule)

  • KDB (improved moment estimator based on Kulinskaya, Dollinger and Bjørkestøl (2011))

  • J (Jackson)

  • SSW (sample-size weighted)

Figure B1.1: Bias of inverse-variance estimator of $\delta=0$, for $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure B1.2: Bias of inverse-variance estimator of $\delta=0$, for $q=0.5$, $n=30,\;50,\;60,\;70$.
Figure B1.3: Bias of inverse-variance estimator of $\delta=0$, for $q=0.5$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure B1.4: Ratio of mean squared errors of the fixed-weights to mean squared errors of inverse-variance estimator for $\delta=0$, $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure B1.5: Bias of inverse-variance estimator of $\delta=0.2$, for $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure B1.6: Bias of inverse-variance estimator of $\delta=0.2$, for $q=0.5$, $n=30,\;50,\;60,\;70$.
Figure B1.7: Bias of inverse-variance estimator of $\delta=0.2$, for $q=0.5$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure B1.8: Ratio of mean squared errors of the fixed-weights to mean squared errors of inverse-variance estimator for $\delta=0.2$, $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure B1.9: Bias of inverse-variance estimator of $\delta=0.5$, for $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure B1.10: Bias of inverse-variance estimator of $\delta=0.5$, for $q=0.5$, $n=30,\;50,\;60,\;70$.
Figure B1.11: Bias of inverse-variance estimator of $\delta=0.5$, for $q=0.5$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure B1.12: Ratio of mean squared errors of the fixed-weights to mean squared errors of inverse-variance estimator for $\delta=0.5$, $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure B1.13: Bias of inverse-variance estimator of $\delta=1$, for $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure B1.14: Bias of inverse-variance estimator of $\delta=1$, for $q=0.5$, $n=30,\;50,\;60,\;70$.
Figure B1.15: Bias of inverse-variance estimator of $\delta=1$, for $q=0.5$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure B1.16: Ratio of mean squared errors of the fixed-weights to mean squared errors of inverse-variance estimator for $\delta=1$, for $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure B1.17: Bias of inverse-variance estimator of $\delta=2$, for $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure B1.18: Bias of inverse-variance estimator of $\delta=2$, for $q=0.5$, $n=30,\;50,\;60,\;70$.
Figure B1.19: Bias of inverse-variance estimator of $\delta=2$, for $q=0.5$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure B1.20: Ratio of mean squared errors of the fixed-weights to mean squared errors of inverse-variance estimator for $\delta=2$, for $q=0.5$, $n=20,\;40,\;100,\;250$.
Figure B1.21: Bias of inverse-variance estimator of $\delta=0$, for $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure B1.22: Bias of inverse-variance estimator of $\delta=0$, for $q=0.75$, $n=30,\;50,\;60,\;70$.
Figure B1.23: Bias of inverse-variance estimator of $\delta=0$, for $q=0.75$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure B1.24: Ratio of mean squared errors of the fixed-weights to mean squared errors of inverse-variance estimator for $\delta=0$, for $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure B1.25: Bias of inverse-variance estimator of $\delta=0.2$, for $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure B1.26: Bias of inverse-variance estimator of $\delta=0.2$, for $q=0.75$, $n=30,\;50,\;60,\;70$.
Figure B1.27: Bias of inverse-variance estimator of $\delta=0.2$, for $q=0.75$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure B1.28: Ratio of mean squared errors of the fixed-weights to mean squared errors of inverse-variance estimator for $\delta=0.2$, for $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure B1.29: Bias of inverse-variance estimator of $\delta=0.5$, for $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure B1.30: Bias of inverse-variance estimator of $\delta=0.5$, for $q=0.75$, $n=30,\;50,\;60,\;70$.
Figure B1.31: Bias of inverse-variance estimator of $\delta=0.5$, for $q=0.75$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure B1.32: Ratio of mean squared errors of the fixed-weights to mean squared errors of inverse-variance estimator for $\delta=0.5$, for $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure B1.33: Bias of inverse-variance estimator of $\delta=1$, for $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure B1.34: Bias of inverse-variance estimator of $\delta=1$, for $q=0.75$, $n=30,\;50,\;60,\;70$.
Figure B1.35: Bias of inverse-variance estimator of $\delta=1$, for $q=0.75$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure B1.36: Ratio of mean squared errors of the fixed-weights to mean squared errors of inverse-variance estimator for $\delta=1$, for $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure B1.37: Bias of inverse-variance estimator of $\delta=2$, for $q=0.75$, $n=20,\;40,\;100,\;250$.
Figure B1.38: Bias of inverse-variance estimator of $\delta=2$, for $q=0.75$, $n=30,\;50,\;60,\;70$.
Figure B1.39: Bias of inverse-variance estimator of $\delta=2$, for $q=0.75$, unequal sample sizes with $\bar{n}=30,\;60,\;100,\;160$.
Figure B1.40: Ratio of mean squared errors of the fixed-weights to mean squared errors of inverse-variance estimator for $\delta=2$, for $q=0.75$, $n=20,\;40,\;100,\;250$.

B2. Coverage of interval estimators of $\delta$ for $\tau^{2} = 0.0(0.5)2.5$.

For coverage of $\delta$, each figure corresponds to a value of $\delta$ (= 0, 0.2, 0.5, 1, 2), a value of $q$ (= 0.5, 0.75), and a set of values of $n$ (= 20, 40, 100, 250 or 30, 50, 60, 70) or $\bar{n}$ (= 30, 60, 100, 160).
Each figure contains a panel (with $\tau^{2}$ on the horizontal axis) for each combination of $n$ (or $\bar{n}$) and $K$ (= 5, 10, 30).
The interval estimators of $\delta$ are the companions to the inverse-variance-weighted point estimators:

  • DL (DerSimonian-Laird)

  • REML (restricted maximum likelihood)

  • MP (Mandel-Paule)

  • KDB (improved moment estimator based on Kulinskaya, Dollinger and Bjørkestøl (2011))

  • J (Jackson)

and

  • HKSJ (Hartung-Knapp-Sidik-Jonkman)

  • HKSJ KDB (HKSJ with KDB estimator of τ2\tau^{2})

  • SSW (SSW as center and half-width equal to the critical value from $t_{K-1}$ times the estimated standard deviation of SSW with $\hat{\tau}^{2} = \hat{\tau}^{2}_{KDB}$).
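
The SSW interval described in the last item, and the way coverage is tallied over simulated replications, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `ssw_interval` and `empirical_coverage` are hypothetical, the weights are taken to be whatever sample-size-based weights the caller supplies, the variance of the weighted mean is assumed to be $\sum w_i^{2}(v_i + \hat{\tau}^{2}) / (\sum w_i)^{2}$, and both $\hat{\tau}^{2}$ (for example, a KDB estimate) and the $t_{K-1}$ critical value are passed in rather than computed.

```python
import math


def ssw_interval(effects, variances, weights, tau2, t_crit):
    """Interval centred at a sample-size-weighted (SSW) mean.

    Assumptions (illustrative, not taken from the paper's code):
      * `weights` depend only on study sample sizes and are supplied by the caller;
      * the variance of the weighted mean is
        sum(w_i^2 * (v_i + tau2)) / (sum(w_i))^2;
      * `tau2` is an externally computed estimate of tau^2;
      * `t_crit` is the critical value from t with K - 1 degrees of freedom.
    """
    total = sum(weights)
    center = sum(w * y for w, y in zip(weights, effects)) / total
    var = sum(w ** 2 * (v + tau2) for w, v in zip(weights, variances)) / total ** 2
    half_width = t_crit * math.sqrt(var)
    return center - half_width, center + half_width


def empirical_coverage(intervals, true_value):
    """Fraction of (lower, upper) intervals that contain the true value."""
    hits = sum(lower <= true_value <= upper for lower, upper in intervals)
    return hits / len(intervals)


# Example with K = 3 studies; 4.303 is the 0.975 quantile of t with 2 df.
lower, upper = ssw_interval(
    effects=[0.1, 0.3, 0.5],
    variances=[0.1, 0.1, 0.1],
    weights=[10, 10, 10],
    tau2=0.2,
    t_crit=4.303,
)
print(lower, upper)
```

In a simulation, `ssw_interval` would be called once per generated meta-analysis, and `empirical_coverage` would then be applied to the collected intervals with the true $\delta$ used to generate the data. Each point in a coverage panel corresponds to one such fraction at a given $\tau^{2}$.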

Figure B2.1: Coverage of $\delta = 0$ at the nominal confidence level of $0.95$, for $q = 0.5$, $n = 20,\;40,\;100,\;250$.
Figure B2.2: Coverage of $\delta = 0$ at the nominal confidence level of $0.95$, for $q = 0.5$, $n = 30,\;50,\;60,\;70$.
Figure B2.3: Coverage of $\delta = 0$ at the nominal confidence level of $0.95$, for $q = 0.5$, unequal sample sizes with $\bar{n} = 30,\;60,\;100,\;160$.
Figure B2.4: Coverage of $\delta = 0.2$ at the nominal confidence level of $0.95$, for $q = 0.5$, $n = 20,\;40,\;100,\;250$.
Figure B2.5: Coverage of $\delta = 0.2$ at the nominal confidence level of $0.95$, for $q = 0.5$, $n = 30,\;50,\;60,\;70$.
Figure B2.6: Coverage of $\delta = 0.2$ at the nominal confidence level of $0.95$, for $q = 0.5$, unequal sample sizes with $\bar{n} = 30,\;60,\;100,\;160$.
Figure B2.7: Coverage of $\delta = 0.5$ at the nominal confidence level of $0.95$, for $q = 0.5$, $n = 20,\;40,\;100,\;250$.
Figure B2.8: Coverage of $\delta = 0.5$ at the nominal confidence level of $0.95$, for $q = 0.5$, $n = 30,\;50,\;60,\;70$.
Figure B2.9: Coverage of $\delta = 0.5$ at the nominal confidence level of $0.95$, for $q = 0.5$, unequal sample sizes with $\bar{n} = 30,\;60,\;100,\;160$.
Figure B2.10: Coverage of $\delta = 1$ at the nominal confidence level of $0.95$, for $q = 0.5$, $n = 20,\;40,\;100,\;250$.
Figure B2.11: Coverage of $\delta = 1$ at the nominal confidence level of $0.95$, for $q = 0.5$, $n = 30,\;50,\;60,\;70$.
Figure B2.12: Coverage of $\delta = 1$ at the nominal confidence level of $0.95$, for $q = 0.5$, unequal sample sizes with $\bar{n} = 30,\;60,\;100,\;160$.
Figure B2.13: Coverage of $\delta = 2$ at the nominal confidence level of $0.95$, for $q = 0.5$, $n = 20,\;40,\;100,\;250$.
Figure B2.14: Coverage of $\delta = 2$ at the nominal confidence level of $0.95$, for $q = 0.5$, $n = 30,\;50,\;60,\;70$.
Figure B2.15: Coverage of $\delta = 2$ at the nominal confidence level of $0.95$, for $q = 0.5$, unequal sample sizes with $\bar{n} = 30,\;60,\;100,\;160$.
Figure B2.16: Coverage of $\delta = 0$ at the nominal confidence level of $0.95$, for $q = 0.75$, $n = 20,\;40,\;100,\;250$.
Figure B2.17: Coverage of $\delta = 0$ at the nominal confidence level of $0.95$, for $q = 0.75$, $n = 30,\;50,\;60,\;70$.
Figure B2.18: Coverage of $\delta = 0$ at the nominal confidence level of $0.95$, for $q = 0.75$, unequal sample sizes with $\bar{n} = 30,\;60,\;100,\;160$.
Figure B2.19: Coverage of $\delta = 0.2$ at the nominal confidence level of $0.95$, for $q = 0.75$, $n = 20,\;40,\;100,\;250$.
Figure B2.20: Coverage of $\delta = 0.2$ at the nominal confidence level of $0.95$, for $q = 0.75$, $n = 30,\;50,\;60,\;70$.
Figure B2.21: Coverage of $\delta = 0.2$ at the nominal confidence level of $0.95$, for $q = 0.75$, unequal sample sizes with $\bar{n} = 30,\;60,\;100,\;160$.
Figure B2.22: Coverage of $\delta = 0.5$ at the nominal confidence level of $0.95$, for $q = 0.75$, $n = 20,\;40,\;100,\;250$.
Figure B2.23: Coverage of $\delta = 0.5$ at the nominal confidence level of $0.95$, for $q = 0.75$, $n = 30,\;50,\;60,\;70$.
Figure B2.24: Coverage of $\delta = 0.5$ at the nominal confidence level of $0.95$, for $q = 0.75$, unequal sample sizes with $\bar{n} = 30,\;60,\;100,\;160$.
Figure B2.25: Coverage of $\delta = 1$ at the nominal confidence level of $0.95$, for $q = 0.75$, $n = 20,\;40,\;100,\;250$.
Figure B2.26: Coverage of $\delta = 1$ at the nominal confidence level of $0.95$, for $q = 0.75$, $n = 30,\;50,\;60,\;70$.
Figure B2.27: Coverage of $\delta = 1$ at the nominal confidence level of $0.95$, for $q = 0.75$, unequal sample sizes with $\bar{n} = 30,\;60,\;100,\;160$.
Figure B2.28: Coverage of $\delta = 2$ at the nominal confidence level of $0.95$, for $q = 0.75$, $n = 20,\;40,\;100,\;250$.
Figure B2.29: Coverage of $\delta = 2$ at the nominal confidence level of $0.95$, for $q = 0.75$, $n = 30,\;50,\;60,\;70$.
Figure B2.30: Coverage of $\delta = 2$ at the nominal confidence level of $0.95$, for $q = 0.75$, unequal sample sizes with $\bar{n} = 30,\;60,\;100,\;160$.