BERRY–ESSEEN BOUNDS FOR GENERALIZED U-STATISTICS
Abstract
In this paper, we establish optimal Berry–Esseen bounds for generalized U-statistics.
The proof is based on a new Berry–Esseen theorem for the exchangeable pair approach in Stein’s method under a general linearity condition.
As applications, we obtain optimal convergence rates of the normal approximation for subgraph counts in Erdős–Rényi graphs and graphon-generated random graphs.
MSC: Primary 60F05; secondary 60K35.
Keywords: Generalized U-statistics, Stein’s method, exchangeable pair approach, Berry–Esseen bound, graphon-generated random graph, Erdős–Rényi model.
1 Introduction
Let and be two families of i.i.d. random variables; moreover, and are also mutually independent and we set for . For , let be a function and we say is symmetric if the value of the function remains unchanged for any permutation of indices . In this paper, we consider the generalized U-statistic defined by
(1.1) |
where for every and ,
(1.2) |
We note that every is an -fold ordered index.
As a generalization of the classical U-statistic, generalized U-statistics have been widely applied in random graph theory as counting statistics. Janson and Nowicki (1991) studied the limiting behavior of via a projection method. Specifically, the function can be represented as an orthogonal sum of terms indexed by subgraphs of the complete graph with vertices. Janson and Nowicki (1991) showed that the limiting behavior of depends on the topology of the principal support graphs (see more details in Section 2) of . In particular, the random variable is asymptotically normally distributed if the principal support graphs are all connected. However, the convergence rate is still unknown.
The main purpose of this paper is to establish a Berry–Esseen bound for by using Stein’s method. Since its introduction by Stein (1972), Stein’s method has proved to be a powerful tool for estimating convergence rates in distributional approximation, in particular for evaluating distributional distances for dependent random variables. One of the most important techniques in Stein’s method is the exchangeable pair approach, which is commonly used to compute Berry–Esseen bounds for both normal and nonnormal approximations. We refer to Stein (1986), Rinott and Rotar (1997), Chatterjee and Shao (2011) and Shao and Zhang (2016) for more details on Berry–Esseen bounds for bounded exchangeable pairs. It is worth mentioning that Shao and Zhang (2019) obtained a Berry–Esseen bound for unbounded exchangeable pairs.
Let be the random variable of interest, and we say is an exchangeable pair if . For normal approximation, it is common to assume that the following condition holds:
(1.3) |
where and is a random variable with a small . The condition 1.3 can be understood as a linear regression condition. Although an exchangeable pair can be easily constructed, it may not be easy to verify the linearity condition 1.3 in some applications.
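To make the linear regression interpretation concrete, the following sketch works through the classical textbook example of a normalized sum of independent mean-zero signs. It is an illustration only, not the construction used later in this paper. It assumes the standard form of the linearity condition from the literature, E(W − W′ | W) = λ(W + R), and the helper names `make_pair` and `conditional_drift` are hypothetical. Resampling one uniformly chosen coordinate gives an exchangeable pair for which the condition holds with λ = 1/n and R = 0.

```python
import random

def make_pair(x, rng):
    """Return (W, W') for W = sum(x)/sqrt(n); W' resamples one coordinate."""
    n = len(x)
    i = rng.randrange(n)                   # uniform random index I
    x_new = list(x)
    x_new[i] = rng.choice([-1.0, 1.0])     # independent copy of coordinate I
    scale = n ** 0.5
    return sum(x) / scale, sum(x_new) / scale

def conditional_drift(x):
    """Exact E[W - W' | X = x], averaging over I and the fresh sign."""
    n = len(x)
    scale = n ** 0.5
    total = 0.0
    for i in range(n):
        for s in (-1.0, 1.0):              # the fresh sign is +1 or -1 w.p. 1/2
            total += (x[i] - s) / scale
    return total / (2 * n)

rng = random.Random(0)
x = [rng.choice([-1.0, 1.0]) for _ in range(50)]
w = sum(x) / len(x) ** 0.5
# Assumed linearity condition with lambda = 1/n and R = 0:
print(conditional_drift(x), w / len(x))    # the two numbers agree up to rounding
```

Conditioning on the whole vector and then on W gives the same identity for E(W − W′ | W). For statistics that are not sums of independent terms, such an exact identity is usually unavailable, which is the difficulty the text refers to.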
In this paper, we aim to establish an optimal Berry–Esseen bound for generalized U-statistics by developing a new Berry–Esseen theorem for the exchangeable pair approach under a condition more general than 1.3. More specifically, we replace in 1.3 by a random variable that is an antisymmetric function of . The new result is given in Section 4. Our result has two advantages. First, the new condition is more general than 1.3 and may be easier to verify. For instance, it can be verified by constructing an antisymmetric random variable via the Gibbs sampling method, the embedding method, the generalized perturbative approach, and so on. Second, the resulting Berry–Esseen bound provides an optimal convergence rate in many practical applications.
The rest of this paper is organized as follows. In Section 2, we give the Berry–Esseen bounds for . Applications to subgraph counts in -random graphs are given in Section 3. The new Berry–Esseen theorem for the exchangeable pair approach under a new setting is established in Section 4. We give the proofs of our main results in Section 5. The proofs of other results are postponed to Section 6.
2 Main results
Let , and be defined in Section 1. For any , and . Let and let , and let and . In particular, we simply write as . Let be the graph with vertex set and edge set , and let be the number of nodes in .
By the Hoeffding decomposition, we have
where is defined as
(2.1) |
where and for and . We remark that if and , then . For , let
(2.2) |
where is the number of nodes in . Let , and we call the principal degree of . We say is the principal part of . Moreover, we say the subgraphs such that and are the principal support graphs of .
The central limit theorem for was proved by Janson and Nowicki (1991). Let , and let be the set of principal support graphs. We remark that if has the principal degree , then is of order , see Lemmas 2 and 3 in Janson and Nowicki (1991). Janson and Nowicki (1991) proved that if all graphs in are connected, then
Note that if not all principal support graphs are connected, then the limiting distribution of the scaled version of is nonnormal (see Theorems 2 and 3 in Janson and Nowicki (1991)), and we will consider this case in another paper.
Now, assume that is a symmetric function having principal degree (). In this subsection, we give a Berry–Esseen bound for . For , let
If , then it follows that . Here and in the sequel, we denote by for and we denote by the distribution function of . The following theorem provides the Berry–Esseen bound for in the case where .
Theorem 2.1.
If , then
(2.3) |
Remark 2.2.
If , then , that is, the principal degree of is at least . We have the following theorem.
Theorem 2.3.
Let and let . Assume that is a symmetric function having principal degree for some , and assume further that all graphs in are connected. Then, we have
where is a constant depending only on , and .
If we further assume that the function does not depend on , i.e., for some symmetric , we obtain a sharper convergence rate. To give the theorem, we first introduce some more notation. Let be the graph generated from by deleting the node and all the edges connecting to the node . We say is strongly connected if is connected or empty for all . We note that all strongly connected graphs are also connected. The following theorem provides a sharper Berry–Esseen bound than that in Theorem 2.3.
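The notion of strong connectivity defined above can be checked mechanically. The sketch below is a minimal illustration under two stated assumptions: graphs are given as a vertex collection plus an edge list with no isolated nodes, and "empty" means that no node remains after the deletion. The function names `is_connected` and `is_strongly_connected` are hypothetical and do not come from the paper.

```python
def is_connected(vertices, edges):
    """Depth-first search connectivity check on an undirected graph."""
    vertices = set(vertices)
    if len(vertices) <= 1:
        return True
    adj = {v: set() for v in vertices}
    for a, b in edges:
        if a in vertices and b in vertices:
            adj[a].add(b)
            adj[b].add(a)
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return seen == vertices

def is_strongly_connected(vertices, edges):
    """Paper's sense: deleting any single node leaves a connected or empty graph.

    Assumption: the input graph has no isolated nodes.
    """
    vertices = set(vertices)
    for v in vertices:
        rest = vertices - {v}
        rest_edges = [(a, b) for a, b in edges if v not in (a, b)]
        if rest and not is_connected(rest, rest_edges):
            return False
    return True

triangle = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
two_star = ([0, 1, 2], [(0, 1), (0, 2)])   # node 0 is the center
print(is_strongly_connected(*triangle), is_strongly_connected(*two_star))
```

Under these assumptions the triangle passes and the 2-star fails (deleting its center separates the two leaves), which matches how the two graphs are treated in the proof of the induced subgraph count result in Section 6.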
Theorem 2.4.
Assume that almost surely for some symmetric . Let and be defined in Theorem 2.3. Assume that the conditions in Theorem 2.3 are satisfied and assume further that all graphs in are strongly connected. Then,
where is a constant depending on , and .
3 Applications
3.1 Subgraphs counts in random graphs generated from graphons
A symmetric Lebesgue measurable function is called a graphon, which was first introduced by Lovász and Szegedy (2006) to represent the graph limit. Given a graphon and , the -random graph can be generated as follows: Let and let be a vector of independent uniformly distributed random variables on . Given , we generate the graph by connecting the node pair independently with probability . This construction was first introduced by Diaconis and Freedman (1981), which can be used to study large dense and sparse random graphs and random trees generated from graphons. We refer to Lovász and Szegedy (2006); Bollobás et al. (2007); Lovász (2012) for more details.
Subgraph counts are important statistics in estimating graphons. As a special case, when for some , the -random graph model becomes the classical Erdős–Rényi model . The study of asymptotic properties of subgraph counts in dates back to Nowicki (1989), Barbour et al. (1989) and Janson and Nowicki (1991). Recently, Krokowski et al. (2017), Röllin (2017) and Privault and Serafin (2018) applied Stein’s method to obtain an optimal Berry–Esseen bound for triangle counts in . For subgraph counts in -random graphs, Kaur and Röllin (2020) proved an upper bound on the Kolmogorov distance for multivariate normal approximations of centered subgraph counts with order for some . However, Berry–Esseen bounds for subgraph counts in -random graphs are still unknown. In this subsection, we apply Theorems 2.3 and 2.4 to prove sharp Berry–Esseen bounds for subgraph count statistics.
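The two-step sampling scheme described above can be sketched as follows. This is a minimal illustration that assumes the connection probability of a node pair is the graphon evaluated at the two latent uniform variables; any additional scaling factor that may appear in the displayed formulas is not modeled. The function name `sample_graphon_graph` and the example graphon are hypothetical.

```python
import random

def sample_graphon_graph(n, graphon, rng):
    """Sample latent uniforms, then connect each pair independently.

    `graphon` is any symmetric function from [0, 1]^2 to [0, 1].
    Returns the latent variables and a symmetric 0/1 adjacency matrix.
    """
    u = [rng.random() for _ in range(n)]          # latent uniform variables
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Connect i and j with probability graphon(u[i], u[j]).
            if rng.random() < graphon(u[i], u[j]):
                adj[i][j] = adj[j][i] = 1
    return u, adj

rng = random.Random(0)
u, adj = sample_graphon_graph(8, lambda x, y: x * y, rng)
# A constant graphon reduces to the Erdos-Renyi model with that edge probability.
_, er = sample_graphon_graph(8, lambda x, y: 0.3, rng)
```

With a constant graphon the latent variables play no role, which is the sense in which the Erdős–Rényi model is a special case.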
Let be the adjacency matrix of , where for each , the binary random variable indicates the connection of the graph. Formally, let be a vector of independent uniformly distributed random variables that is also independent of , and then we can write . For any nonrandom simple with , the (injective) subgraph counts and induced subgraph counts in are defined by
respectively, where for ,
Here, the summation ranges over the subgraphs with nodes that are isomorphic to and thus contains terms, where is the number of automorphisms of . Moreover, we note that both and are symmetric. For example, if is the -star, then , and
If is a triangle, then and
Let
Then, we have
As , let
Now, as random variables are conditionally independent given , we have
Let
and similarly, let
We have the following theorem, which follows from Theorem 2.1 directly.
Theorem 3.1.
Let and . Assume that , then
Moreover, assume that , then
If for a fixed number , then the random variables are i.i.d. and the functions and do not depend on . We have the following theorem:
Theorem 3.2.
Let for . Then
Remark 3.3.
For the bound, Barbour et al. (1989) proved the same order of in the case that is a constant. For the Berry–Esseen bound, Privault and Serafin (2018) proved a general Berry–Esseen bound for subgraph counts in Erdős–Rényi random graphs using a different method. In particular, if is a constant, then Theorem 3.2 provides the same result as in Privault and Serafin (2018).
For induced subgraph counts, we need to consider some separate cases. Let and denote the number of 2-stars and triangles in , respectively. If any of the following conditions holds, then it has been proven by Janson and Nowicki (1991) that converges to a standard normal distribution:
- (G1) if ;
- (G2) if , ;
- (G3) if , and .
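Conditions (G1)–(G3) are stated in terms of the numbers of 2-stars and triangles in a fixed graph. As a small aid, the sketch below counts both from an adjacency matrix. It assumes that a 2-star means an unordered pair of edges sharing a node (so copies are not required to be induced), and the function name `count_two_stars_and_triangles` is hypothetical.

```python
from itertools import combinations

def count_two_stars_and_triangles(adj):
    """Count 2-stars and triangles in a simple graph.

    `adj` is a symmetric 0/1 adjacency matrix with zero diagonal.
    A 2-star is counted once per center node and pair of its neighbors.
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]
    two_stars = sum(d * (d - 1) // 2 for d in degrees)
    triangles = sum(
        1
        for i, j, k in combinations(range(n), 3)
        if adj[i][j] and adj[j][k] and adj[i][k]
    )
    return two_stars, triangles

triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(count_two_stars_and_triangles(triangle))  # (3, 1)
print(count_two_stars_and_triangles(path))      # (1, 0)
```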
The following theorem gives the Berry–Esseen bounds for induced subgraph counts.
Theorem 3.4.
Let for . If (G1) or (G3) holds, then
(3.1) |
If (G2) holds, then
(3.2) |
4 A new Berry–Esseen bound for exchangeable pair approach
4.1 Berry–Esseen bound
In this section, we establish a new Berry–Esseen theorem for the exchangeable pair approach under a new setting. Let be a random variable taking values in a measurable space and let be the random variable of interest where . Assume that and . We propose the following condition:
(A) Let be an exchangeable pair and let be an antisymmetric function. Assume that satisfies the following condition:
(4.1) |
where is a constant and is a random variable.
We remark that the operator of antisymmetric functions was first mentioned by Holmes and Reinert (2004), and the condition (A) was considered by Chatterjee (2007), who applied the exchangeable pair approach to prove concentration inequalities.
The following theorem provides a uniform Berry–Esseen bound for the exchangeable pair approach under the condition (A).
Theorem 4.1.
Let and satisfy the condition (A). Let and . Then,
(4.2) |
provided that , where is a symmetric function.
Remark 4.2.
Assume that 1.3 is satisfied. Then, we can choose , and the right-hand side of 4.2 reduces to
where is a symmetric function for and such that . Thus, Theorem 4.1 recovers Theorem 2.1 in Shao and Zhang (2019).
The following corollary is useful for random variables that can be decomposed as a sum of and a remainder term. Specifically, let be a random variable such that , where is as defined at the beginning of this section, and is a remainder term. The following corollary gives a Berry–Esseen bound for .
Corollary 4.3.
Let be an exchangeable pair and let where is antisymmetric. Assume that
(4.3) |
for some and some random variable . Let and . Then, we have
provided that is any symmetric function of and such that .
Remark 4.4.
Assume that is a family of independent random variables. Let be a linear statistic, where and is a nonrandom function, such that and , and let be a random variable. Let , and . Chen and Shao (2007) (see also Shao and Zhou (2016)) proved the following result:
(4.4) |
where is any random variable independent of .
The Berry–Esseen bound in Corollary 4.3 improves the result of Chen and Shao (2007) in the sense that the random variable in our result is not necessarily a partial sum of independent random variables, so that Corollary 4.3 can be applied to a general class of random variables.
4.2 Proof of Theorem 4.1
In this subsection, we prove Theorem 4.1 by Stein’s method. The proof is similar to that of Theorem 2.1 in Shao and Zhang (2019). To begin with, we need to prove the following lemma, which is useful in the proof of Theorem 4.1.
Lemma 4.5.
Proof of Lemma 4.5.
Since is nondecreasing, it follows that
and
which yields
Recalling that , is antisymmetric and is symmetric, as is exchangeable, we have
and
Moreover, as and , it follows that
Therefore,
Proof of Theorem 4.1.
We apply some ideas of Theorem 2.1 in Shao and Zhang (2019) to prove the desired result. Let be a fixed real number, and the solution to the Stein equation:
(4.5) |
where is the distribution function of the standard normal distribution. It is well known that (see, e.g., Chen et al. (2011))
(4.6) |
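As a numerical sanity check on the Stein equation and its solution, the sketch below assumes the standard form of the equation from Chen et al. (2011), namely f′(w) − w f(w) = 1{w ≤ z} − Φ(z), together with its classical closed-form solution. The displayed formulas 4.5 and 4.6 are not reproduced here, so this form is an assumption, and the function names are hypothetical. The code checks the equation by a central finite difference away from the kink at w = z.

```python
import math

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def stein_solution(w, z):
    """Classical solution f_z(w) of f'(w) - w f(w) = 1{w <= z} - Phi(z)."""
    c = math.sqrt(2.0 * math.pi) * math.exp(w * w / 2.0)
    if w <= z:
        return c * Phi(w) * (1.0 - Phi(z))
    return c * Phi(z) * (1.0 - Phi(w))

def stein_residual(w, z, h=1e-5):
    """Left side minus right side of the Stein equation at w (w not near z)."""
    deriv = (stein_solution(w + h, z) - stein_solution(w - h, z)) / (2.0 * h)
    lhs = deriv - w * stein_solution(w, z)
    rhs = (1.0 if w <= z else 0.0) - Phi(z)
    return lhs - rhs

z = 0.5
for w in (-2.0, -0.3, 0.2, 1.0, 2.5):
    print(w, stein_solution(w, z), stein_residual(w, z))
```

The residuals should be numerically negligible, and the printed values of the solution stay between 0 and √(2π)/4, consistent with the well-known bounds for this solution.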
Since , and is antisymmetric, it follows that, for any absolutely continuous function ,
Rearranging the foregoing equality, we have
(4.7) |
By 4.7,
and thus,
(4.8) | ||||
where
We now bound , and , separately. By Chen et al. (2011, Lemma 2.3), we have
(4.9) |
Therefore,
(4.10) | ||||
For observe that , and both and are increasing functions (see, e.g. Chen et al. (2011, Lemma 2.3)), by Lemma 4.5,
(4.11) | ||||
where
Then, by 4.9, . This proves Theorem 4.1 together with 4.10. ∎
4.3 Proof of Corollary 4.3
In this subsection, we apply Theorem 4.1 to prove Corollary 4.3. By 4.3, we have
Let ; then is exchangeable, and by Theorem 4.1 we have
This completes the proof by recalling that .
5 Proofs of Theorems 2.1, 2.3 and 2.4
In this section, we give the proofs of Theorems 2.1, 2.3 and 2.4.
5.1 Proof of Theorem 2.1
Without loss of generality, we assume that , otherwise the inequality is trivial. We use Corollary 4.3 to prove this theorem. For each , let
(5.1) | ||||
Let , and
where
By orthogonality we have , and thus
(5.2) |
Let be an independent copy of . For each , define where
and let
The following lemma provides the upper bounds of and .
Lemma 5.1.
For and ,
(5.3) | ||||
(5.4) |
The proof of Lemma 5.1 is deferred to the appendix.
Now, we apply Corollary 4.3 to prove the Berry–Esseen bound for . To this end, let for each . Let be a random index uniformly distributed over , which is independent of all others. Let
then it follows that
Thus, 4.3 is satisfied with and . Moreover, we have
Also,
Therefore, by the Cauchy inequality and Lemma 5.1, we have for ,
where we used 5.2 in the last line. Using the same argument, we have for ,
Now we give the bounds for and . We have two cases. In the case where , it follows that and . As for , noting that for , by Lemma 5.1 and the Cauchy inequality, we have
and
By Corollary 4.3 and noting that , we have
This proves 2.3.
5.2 Proof of Theorem 2.3
We first prove a proposition for the Hoeffding decomposition.
Proposition 5.2.
For such that , and for any such that and but , we have
(5.5) |
Proof.
Let
Then, . For and and , write
Moreover, for any and , let
and similarly, can be represented as
Let be an independent copy of . For any , let with
Then, it follows that for each , is an exchangeable pair. For any , let . For any , , and , define
Let be defined as in 2.2, and it follows that
Moreover, by assumption, as has principal degree , and it follows that for . Let and . The next lemma estimates the upper and lower bounds of and . The proof is similar to that of Lemma 4 of Janson and Nowicki (1991), and we omit the details.
Lemma 5.3.
We have for each and ,
(5.6) | |||
(5.7) | |||
(5.8) |
and
(5.9) |
where is the number of automorphisms of , and are absolute constants.
For any and , let
and for any (), let
Recall that is the graph generated by . For any for , we simply write as the number of nodes of the graph . Recall that and we similarly define . We have the following lemmas.
Lemma 5.4.
For all , such that and are connected, we have
Lemma 5.5.
Assume that . For all , , we have
We are now ready to give the proof of Theorem 2.3.
Proof of Theorem 2.3.
We assume that without loss of generality, otherwise the result is trivial. Recall that is defined in 2.2. Write , and
(5.10) |
Here, if , then set . With a slight abuse of notation, we write if . We have
because by assumption, for all .
For each , let
Let be a random 2-fold index uniformly chosen in , which is independent of all others. Then, is an exchangeable pair. Let
Also, define
Then, is antisymmetric with respect to and .
Thus, 4.3 is satisfied with and . Moreover, by exchangeability,
(5.13) |
Then, we have
Now, by the Cauchy inequality, 5.13 and Lemmas 5.3 and 5.4, we have
Taking , by Lemma 5.5,
5.3 Proof of Theorem 2.4
The proof of Theorem 2.4 is similar to that of Theorem 2.3. Without loss of generality, we assume that , otherwise the proof is even simpler.
For any and , recall that
By Proposition 5.2, there exists a Hoeffding decomposition of as follows:
where and . Also, for any and (), let
For any , let be the node set of the graph with edge set . For any , let . Recall that and .
We need to apply the following lemma in the proof of Theorem 2.4.
Lemma 5.6.
Assume that . For all and let for , we have
Proof of Theorem 2.4.
Again, write , and let
(5.14) |
Here, if , then set . Then, . Now we apply Corollary 4.3 again to prove the desired result. To this end, we need to construct an exchangeable pair. For each , let
By assumption, we have
Let be a random 2-fold index uniformly chosen in , which is independent of all others. Then, is an exchangeable pair. Let
Also, define
Then, is antisymmetric with respect to and .
6 Proofs of other results
6.1 Proof of Theorem 3.2
Note that does not depend on if for some . Fix . Define
and by Proposition 5.2, has the following decomposition:
(6.1) |
By (Janson and Nowicki, 1991, p. 361), we have
Therefore, by Theorem 2.4 with , we complete the proof.
6.2 Proof of Theorem 3.4
Recall that is the number of 2-stars in and is the number of triangles in . Let
Let
By Janson and Nowicki (1991), letting , and , we have
We now consider the following three cases.
Case 1. Suppose that . In this case, we have . Then, by Theorem 2.4, 3.1 holds.
Case 2. Suppose that and . In this case, we have
However, the graph generated by is a 2-star, which is not strongly connected. Then, by Theorem 2.3, 3.2 holds.
Case 3. Suppose that , and . In this case, we have
Because the graph generated by is a triangle, which is strongly connected, Theorem 2.4 implies that 3.1 holds.
Appendix A Proofs of some lemmas
A.1 Proof of Lemma 5.1
Proof of Lemma 5.1.
We write for any . Also, write . Now, observe that
(A.1) |
Note that if , then and are independent, and it follows that
(A.2) |
if . If there exists such that , then
(A.3) |
By independence, the first term of A.3 is 0. For the second term, note that for any , we have , and thus the second term of A.3 is also 0. Therefore,
(A.4) |
For any and such that , by the Cauchy inequality, we have
Recall that and are orthogonal for every . By 5.1, we have
Thus, it follows that
(A.5) |
Combining 5.2, A.1, A.2, A.4 and A.5, we have
(A.6) | ||||
This proves 5.3.
A.2 Proofs of Lemmas 5.4 and 5.5
Recall that for . To prove Lemma 5.4, we need the following lemma.
Lemma A.1.
Let , , and . Let
If , then
(A.7) |
Proof of Lemma A.1.
Let
(A.8) | ||||||||||
Then, we have , , . Without loss of generality, assume that .
If , which is equivalent to , then and . If , then and are independent, which implies that A.7 holds.
If and , then there exists such that . Now, assume that without loss of generality. Let
(A.9) |
Therefore, we have . Then, by 5.5,
(A.10) |
Hence,
which further implies that
and
(A.11) | ||||
If and , then one of the following two conditions holds: (a) there exists or (b) . If (a) holds, then, by an argument similar to that leading to A.11, A.7 holds.
If (b) holds, then, letting , we have that, conditional on , is independent of , and thus,
Without loss of generality, we assume that , otherwise the argument is even simpler. Moreover, we may assume that . Let , and we have and are conditionally independent given . Moreover, by 5.5, and thus . Therefore, we have under the condition (b),
(A.12) |
Combining A.11 and A.12 we prove that A.7 holds for . This completes the proof. ∎
Proof of Lemma 5.4.
Proof of Lemma 5.5.
If , then it follows that for all and . Therefore, we assume without loss of generality.
Observe that
(A.16) | ||||
Letting
and noting that
is antisymmetric with respect to , we have
Now, we consider the following two cases. First, if , we have
and by antisymmetry again,
Therefore,
(A.17) |
for .
It suffices to consider the case where . Observe that
(A.18) | ||||
Let and . Let , and then we have . Now, as
where and . If there exists , letting , then we have
and by orthogonality, we have
Therefore, we have
Hence, by Cauchy’s inequality, we have
Following an argument similar to that in the proof of Lemma 5.4, and recalling that and , we have
Therefore, we have
Substituting the foregoing inequality into A.18, we have
(A.19) |
∎
A.3 Proof of Lemma 5.6
Lemma 5.6 follows from an argument similar to that in the proof of Lemma 5.4, together with the following lemma. Let . Now, as the function does not depend on , we set in the following lemma. With a slight abuse of notation, for and for , let be the graph generated by and let be the number of nodes of , and we write if .
Lemma A.2.
Let for . Let , and let , for . Let and . For , let indicate that . Then
(A.20) |
for .
Proof.
The proof is similar to that of Lemma A.1.
Let be defined as in A.8. Note that if has isolated nodes, then for all , where is the number of nodes of the graph generated by the index set . If , then it follows that and . If , then and are independent, which further implies that A.20 holds.
Now we consider the case where . If , then, by the same argument as that leading to A.11, A.20 holds.
If is connected and , then one of the following two conditions holds: (a) there exists or (b) . If (a) holds, then, by a similar argument as before, A.20 holds. Now we consider the case where (b) holds. Let and
By orthogonality, we have .
Now, we further assume that . If or is a graph containing one single edge, then the proof is even simpler. Without loss of generality, we now assume that is connected for every for . We then prove that A.20 holds when . Under this condition, in addition to (a) and (b), there is another event that may happen: (c) there exists such that . As the cases (a) and (b) have been discussed, we only need to prove that A.20 holds under (c).
As , we have , and is not empty. Let
Then, conditional on , we have and are conditionally independent. Hence,
Letting
Now, if is connected for every , there is at least one edge in connecting and , and thus
where the last equality follows from orthogonality. Noting that , then and thus A.20 holds.
∎
Acknowledgements
The research is supported by Singapore Ministry of Education Academic Research Fund MOE 2018-T2-076.
References
- Barbour et al. (1989) A. D. Barbour, M. Karoński and A. Ruciński (1989). A central limit theorem for decomposable random variables with applications to random graphs. J. Comb. Theory Ser. B 47, 125–145.
- Bollobás et al. (2007) B. Bollobás, S. Janson and O. Riordan (2007). The phase transition in inhomogeneous random graphs. Random Structures & Algorithms 31, 3–122.
- Chatterjee and Shao (2011) S. Chatterjee and Q.-M. Shao (2011). Nonnormal approximation by Stein’s method of exchangeable pairs with application to the Curie–Weiss model. Ann. Appl. Probab. 21, 464–483.
- Chatterjee (2007) S. Chatterjee (2007). Stein’s method for concentration inequalities. Probab. Theory Relat. Fields 138, 305–321.
- Chen et al. (2011) L. H. Y. Chen, L. Goldstein and Q.-M. Shao (2011). Normal Approximation by Stein’s Method. Probability and Its Applications. Springer, Heidelberg, New York.
- Chen and Shao (2007) L. H. Y. Chen and Q.-M. Shao (2007). Normal approximation for nonlinear statistics using a concentration inequality approach. Bernoulli 13, 581–599.
- Diaconis and Freedman (1981) P. Diaconis and D. Freedman (1981). On the statistics of vision: The Julesz conjecture. J. of Math. Psycho. 24, 112–138.
- Holmes and Reinert (2004) S. Holmes and G. Reinert (2004). Stein’s method for the bootstrap. In Stein’s Method: Expository Lectures and Applications 46 93–132. Institute of Mathematical Statistics, Hayward, CA.
- Janson and Nowicki (1991) S. Janson and K. Nowicki (1991). The asymptotic distributions of generalized U-statistics with applications to random graphs. Probab. Theory Relat. Fields 90, 341–375.
- Kaur and Röllin (2020) G. Kaur and A. Röllin (2020). Higher-order fluctuations in dense random graph models. Available at arXiv:2006.15805.
- Krokowski et al. (2017) K. Krokowski, A. Reichenbachs and C. Thäle (2017). Discrete Malliavin–Stein method: Berry–Esseen bounds for random graphs and percolation. Ann. Probab. 45, 1071–1109.
- Lovász (2012) L. Lovász (2012). Large networks and graph limits, volume 60. American Mathematical Soc.
- Lovász and Szegedy (2006) L. Lovász and B. Szegedy (2006). Limits of dense graph sequences. J. Comb. Theory Ser. B 96, 933–957.
- Nowicki (1989) K. Nowicki (1989). Asymptotic normality of graph statistics. J. Stat. Plann. Inference 21, 209–222.
- Privault and Serafin (2018) N. Privault and G. Serafin (2018). Normal approximation for sums of discrete U-statistics - application to Kolmogorov bounds in random subgraph counting. Available at arXiv:1806.05339.
- Rinott and Rotar (1997) Y. Rinott and V. Rotar (1997). On coupling constructions and rates in the CLT for dependent summands with applications to the antivoter model and weighted U-statistics. Ann. Appl. Probab. 7, 1080–1105.
- Röllin (2017) A. Röllin (2017). Kolmogorov bounds for the normal approximation of the number of triangles in the Erdős–Rényi random graph. Available at arXiv:1704.00410.
- Shao and Zhang (2016) Q.-M. Shao and Z.-S. Zhang (2016). Identifying the limiting distribution by a general approach of Stein’s method. Sci. China. Math. 59, 2379–2392.
- Shao and Zhang (2019) Q.-M. Shao and Z.-S. Zhang (2019). Berry–Esseen bounds of normal and nonnormal approximation for unbounded exchangeable pairs. Ann. Probab. 47, 61–108.
- Shao and Zhou (2016) Q.-M. Shao and W.-X. Zhou (2016). Cramér type moderate deviation theorems for self-normalized processes. Bernoulli 22, 2029–2079.
- Stein (1972) C. Stein (1972). A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 2: Probability Theory, pages 583–602, Berkeley, Calif. University of California Press.
- Stein (1986) C. Stein (1986). Approximate computation of expectations. Institute of Mathematical Statistics Lecture Notes—Monograph Series, 7. Institute of Mathematical Statistics, Hayward, CA.