Moderate deviation principles for kernel estimator of invariant density in bifurcating Markov chains models.
Abstract.
Bitseki and Delmas (2021) have recently studied the central limit theorem for the kernel estimator of the invariant density in bifurcating Markov chain models. We complete their work by proving a moderate deviation principle for this estimator. In contrast with the work of Bitseki and Gorgui (2021), it is interesting to see that the distinction between the two regimes disappears and that we are able to obtain a moderate deviation principle for large values of the ergodic rate. It is also interesting, and surprising, to see that for the moderate deviation principle, the ergodic rate begins to have an impact on the choice of the bandwidth for values smaller than in the context of the central limit theorem studied by Bitseki and Delmas (2021).
Keywords: Bifurcating Markov chains, bifurcating auto-regressive process, binary trees, density estimation.
Mathematics Subject Classification (2020): 62G05, 62F12, 60F10, 60J80.
1. Introduction
The study of bifurcating Markov chain (BMC, for short) models has taken a special place in the literature in recent years, due to their links with the study of cell dynamics (see, e.g., [6, 10, 13, 16, 17]). The first BMC model, named the “symmetric” bifurcating autoregressive process (BAR, for short), was introduced by Cowan and Staudte [9] in order to understand the cell division mechanism of Escherichia coli (E. coli, for short). E. coli is a rod-shaped bacterium which reproduces by dividing into two, thus producing two new cells: one of type 1, which inherits the old end of the mother, and one of type 0, which inherits the new end of the mother. The age of a cell is thus given by the age of its old pole, in the sense of the number of divisions since this pole was created. This cell division mechanism raises several questions, among others that of the symmetry of the division. In order to give a rigorous answer to this question, Guyon [16] developed and studied the theory of BMCs. We note that, to the best of our knowledge, the term BMC appears for the first time in the work [1]. In particular, Guyon studied an extension of the model introduced by Cowan and Staudte, named the “asymmetric” BAR, and concluded from his study that aging has an impact on cell reproduction. We note that an extension of the model proposed by Guyon, named the nonlinear BAR (NBAR, for short), was studied by Bitseki and Olivier in [6]. Another question of interest related to cell division is the estimation of the rate at which cells divide. This question has been tackled recently in the works of Doumic et al. [13] and Hoffmann and Marguet [17]. In all the previous works, the behaviour and the definition of the parameters of interest are associated with the density of the invariant probability of an auxiliary Markov chain (see below for a precise definition). The estimation of this invariant density has recently been the subject of several studies.
One can cite [5, 8], where adaptive methods have been proposed for the estimation of this invariant density. More recently, Bitseki and Delmas [2] have studied a central limit theorem for kernel estimators of this invariant density. Our main objective in this paper is to complete their study by establishing a moderate deviation principle for these kernel estimators. Before going any further, let us recall the definitions of the main concepts that we will use and study.
2. The model of bifurcating Markov chain and definition of the estimators
2.1. The regular binary tree associated to BMC models
We denote by (resp. ) the space of (resp. positive) natural integers. We set , and for , and . The set corresponds to the -th generation, to the tree up to the -th generation, and the complete binary tree. One can see that the genealogy of the cells is entirely described by (each vertex of the tree designates an individual). For , we denote by the generation of ( if and only if ) and for , where is the concatenation of the two sequences , with the convention that . For , we denote by the number of elements of . Note that for all and
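Since the tree notation above lost its symbols in extraction, a small illustrative sketch may help. It uses the common integer-labelling convention (root labelled 1, vertex u having children 2u and 2u+1), which is our own assumption for the example; with it, the n-th generation has 2^n elements and the tree up to the n-th generation has 2^{n+1} - 1 elements.

```python
# Toy illustration of the regular binary tree indexing, under the assumed
# integer-labelling convention: root is 1, vertex u has children 2u and 2u+1.

def generation(u):
    """Generation of vertex u, i.e. its distance to the root."""
    return u.bit_length() - 1

def G(n):
    """The n-th generation of the tree (2**n vertices)."""
    return list(range(2**n, 2**(n + 1)))

def T(n):
    """The tree up to the n-th generation (2**(n+1) - 1 vertices)."""
    return list(range(1, 2**(n + 1)))

assert len(G(3)) == 2**3          # a generation has 2^n elements
assert len(T(3)) == 2**4 - 1      # the tree up to generation n has 2^{n+1} - 1
assert generation(1) == 0 and generation(5) == 2
```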
2.2. The probability kernels associated to BMC models
For our convenience, we set , and is equipped with the Borel sigma-algebra . For any , we denote by (resp. , resp. ) the space of (resp. bounded, resp. bounded continuous ) valued measurable functions defined on . For all , we set . Let be a probability kernel on , that is: is measurable for all , and is a probability measure on for all . For any and , we set for :
(1) |
We define (resp. ), or simply for (resp. for ), as soon as the corresponding integral (1) is well defined, and we have that and belong to . We denote by , and respectively the first and the second marginal of , and the mean of and , that is, for all and
Now, let us give a precise definition of a bifurcating Markov chain.
Definition 2.1 (Bifurcating Markov Chains, see [16, 2]).
We say a stochastic process indexed by , , is a bifurcating Markov chain (BMC) on a measurable space with initial probability distribution on and probability kernel on if:
- (Initial distribution.) The random variable is distributed as .
- (Branching Markov property.) For any sequence of functions belonging to and for all , we have
Following [16], we introduce an auxiliary Markov chain on with and transition probability . The chain corresponds to a random lineage taken in the population. We shall write when (i.e. when the initial distribution is the Dirac mass at ). We will assume that the Markov chain is ergodic and we denote by its invariant probability measure. The asymptotic and non-asymptotic behaviour of BMCs is strongly related to the knowledge of . In particular, Guyon has proved that if is ergodic, then for all ,
But in most cases the invariant probability is unknown, so its estimation from the data is of great interest. For that purpose, we make the following assumption.
Assumption 2.2.
The transition kernel has a density, still denoted by , with respect to the Lebesgue measure.
Remark 2.3.
Assumption 2.2 implies that the transition kernel has a density, still denoted by , with respect to the Lebesgue measure. More precisely, we have . This implies in particular that the invariant probability has a density, still denoted by , with respect to the Lebesgue measure (for more details, we refer, e.g., to [14, Chapter 6]).
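To fix ideas, the auxiliary chain of a toy symmetric BAR model can be simulated directly. In the sketch below, the AR(1) dynamics, the parameter values and the Gaussian innovations are all our own illustrative assumptions, not the paper's general model; for this toy chain the invariant law is Gaussian with mean b/(1 - a) and variance sigma^2/(1 - a^2).

```python
# Minimal sketch (assumed toy model): the auxiliary chain of a symmetric BAR
# process is the AR(1) chain Y_{k+1} = a*Y_k + b + eps_k with Gaussian eps_k,
# whose invariant law is N(b/(1-a), sigma^2/(1-a^2)).
import random

def simulate_lineage(a=0.5, b=1.0, sigma=1.0, n=100_000, y0=0.0, seed=0):
    rng = random.Random(seed)
    y = y0
    path = []
    for _ in range(n):
        y = a * y + b + rng.gauss(0.0, sigma)
        path.append(y)
    return path

path = simulate_lineage()
emp_mean = sum(path) / len(path)
# Stationary mean of this toy chain is b / (1 - a) = 2.0.
assert abs(emp_mean - 2.0) < 0.1
```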
2.3. Kernel estimator of the invariant density
Recall that and , . Assume that we observe . Let be a sequence of positive numbers which converges to as goes to infinity. We will simply write for when there is no ambiguity. Let be a kernel function such that . Then, for all , we propose to estimate by
(2) |
where . These estimators are strongly inspired by [18, 21, 22]. They have been studied in [13, 8] (non-asymptotic studies) and in [2] (central limit theorem).
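As an illustration of a kernel estimator of this type in dimension one, the following hedged sketch averages scaled kernels over a sample. The Gaussian kernel, the i.i.d. stand-in for the observed individuals and the bandwidth choice are all assumptions made for the example, not the paper's setting.

```python
# Hedged sketch of a one-dimensional kernel density estimator:
# hat f(x) = (1 / (N * h)) * sum_u K((x - X_u) / h).
# Illustration only: the "observations" are drawn i.i.d. from N(0, 1).
import math, random

def gaussian_kernel(u):
    return math.exp(-u * u / 2.0) / math.sqrt(2.0 * math.pi)

def kernel_density_estimate(x, sample, h):
    return sum(gaussian_kernel((x - xu) / h) for xu in sample) / (len(sample) * h)

rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
h = len(sample) ** (-1 / 5)   # classical n^{-1/(2s+d)} bandwidth with s = 2, d = 1
true_at_0 = 1.0 / math.sqrt(2.0 * math.pi)
assert abs(kernel_density_estimate(0.0, sample, h) - true_at_0) < 0.03
```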
2.4. Moderate deviation principle and related topics
Our aim is to study moderate deviation principles for the estimators defined in (2). Before we proceed, let us introduce the notion of moderate deviation principle, in a general setting. Let be a sequence of random variables with values in , endowed with its Borel -field, and let be a positive sequence that converges to . We assume that converges in probability to 0 and that converges in distribution to a centered Gaussian law. Let be a lower semicontinuous function, that is, for all , the sub-level set is a closed set. Such a function is called a rate function, and it is called a good rate function if all its sub-level sets are compact sets. Let be a positive sequence such that and as goes to .
Definition 2.4 (Moderate deviation principle, MDP).
We say that satisfies a moderate deviation principle on with speed and rate function if, for any ,
where and denote respectively the interior and the closure of .
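Since the displayed inequalities of Definition 2.4 were lost in extraction, we recall the standard two-sided bound in generic notation (the symbols below are ours, following Dembo and Zeitouni, and are not tied to the paper's stripped formulas):

```latex
% Standard MDP bounds in generic notation: (W_n) random variables,
% (v_n) a speed with v_n -> infinity, I a rate function.
-\inf_{x \in A^{\circ}} I(x)
  \le \liminf_{n\to\infty} \frac{1}{v_n} \log \mathbb{P}(W_n \in A)
  \le \limsup_{n\to\infty} \frac{1}{v_n} \log \mathbb{P}(W_n \in A)
  \le -\inf_{x \in \bar{A}} I(x)
\quad \text{for every Borel set } A .
```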
The following two concepts are closely related to the theory of MDP: super-exponential convergence and exponential equivalence. Let , be sequences of random variables and a random variable with value in a metric space .
Definition 2.5 (Super-exponential convergence).
We say that converges super-exponentially fast in probability to , and we write , if for all ,
Definition 2.6 (Exponential equivalence, see [11, Chapter 4]).
We say that and are -exponentially equivalent, and we write , if for any ,
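In generic notation (our symbols, not the paper's stripped ones), Definitions 2.5 and 2.6 both assert that a deviation probability decays faster than any exponential at the given speed:

```latex
% Generic forms of Definitions 2.5 and 2.6: speed v_n -> infinity,
% metric d on the state space.
% Super-exponential convergence of (Z_n) to Z:
\limsup_{n\to\infty} \frac{1}{v_n}
  \log \mathbb{P}\bigl(d(Z_n, Z) > \delta\bigr) = -\infty
  \quad \text{for all } \delta > 0 .
% Exponential equivalence of (Z_n) and (Z'_n):
\limsup_{n\to\infty} \frac{1}{v_n}
  \log \mathbb{P}\bigl(d(Z_n, Z'_n) > \delta\bigr) = -\infty
  \quad \text{for all } \delta > 0 .
```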
Remark 2.7.
Note that a deterministic sequence that converges to some limit also converges super-exponentially fast to for any rate . We also note that if and are -exponentially equivalent and if satisfies a MDP, then satisfies the same MDP (for more details, see, e.g., [11, Chapter 4]).
The following result gives a sufficient condition for the super-exponential convergence of a sequence of random variables.
Remark 2.8.
We assume that is a metric space. Let be a sequence of random variables with values in and a random variable with values in . If is upper-bounded by a deterministic sequence which converges to , then, for every sequence converging to , .
The moderate deviation principle has been proved in the i.i.d. setting for the kernel density estimator; see, e.g., Gao [15] and Mokkadem et al. [20]. We also refer to [19], where Mokkadem and Pelletier constructed confidence bands for probability densities based on moderate deviation principles. In this paper, we will establish a moderate deviation principle for , following the martingale approach developed in [2]. We will need the following assumption.
Assumption 2.9.
There exists a positive real number and such that for all :
(3) |
Remark 2.10.
The other assumptions we will need are based on the following bias-variance type decomposition of the estimator :
(4) |
where for and finite:
and for and , we set:
To study the variance term , we will introduce a more general sequence of functions (see Section 3.2).
The following assumptions on the kernel, the bandwidth and the regularity of the unknown density function are usual. Recall with .
Assumption 2.11 (Regularity of the kernel function and the bandwidth).
- (i) The kernel function satisfies:
- (ii) There exists such that the bandwidths are defined by .
Assumption 2.12 (Further regularity on the density , the kernel function and the bandwidths).
Suppose that there exists an invariant probability measure of and that Assumptions 2.2 and 2.11 hold. We assume there exists such that the following hold:
- (i) The density belongs to the (isotropic) Hölder class of order : the density admits partial derivatives with respect to , for all , up to the order , and there exists a finite constant such that for all , and :
where denotes the vector where we have replaced the coordinate by , with the convention .
- (ii) The kernel is of order : we have and for all and .
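The moment conditions behind a kernel of a given order can be checked numerically. The sketch below does so for the Epanechnikov kernel (our own choice of example): its moments of order 0, 1, 2 are 1, 0 and 1/5 respectively, so it is a kernel of order 2.

```python
# Numerical check (sketch) of kernel moment conditions for the Epanechnikov
# kernel K(x) = (3/4)(1 - x^2) on [-1, 1]: int K = 1, int x K = 0,
# int x^2 K = 1/5 != 0, hence a kernel of order 2.

def epanechnikov(x):
    return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0

def moment(j, n=100_000):
    # Midpoint rule for int x^j K(x) dx over [-1, 1].
    h = 2.0 / n
    return sum(((-1.0 + (k + 0.5) * h) ** j) * epanechnikov(-1.0 + (k + 0.5) * h)
               for k in range(n)) * h

assert abs(moment(0) - 1.0) < 1e-6   # integrates to one
assert abs(moment(1)) < 1e-6         # first moment vanishes
assert abs(moment(2) - 0.2) < 1e-6   # second moment is 1/5, nonzero
```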
For , we shall also assume the following.
Assumption 2.13.
Remark 2.14.
As a consequence of Assumption 2.13 and (ii) of Assumption 2.11, for the moderate deviation principle, the ergodicity rate begins to have an impact on the choice of the bandwidth for . This contrasts with the central limit theorem, where the ergodicity rate begins to have an impact on the choice of the bandwidth for (see [2] for more details).
In the sequel, we will consider the positive sequence such that:
(6) |
where is the regularity parameter given in Assumption 2.12.
The paper is organised as follows. In Section 3.1, we state the main result on the moderate deviation principles of the estimators , for in the set of continuity of and . In Section 3.2, directly linked to the study of the variance term defined in (4), we study the moderate deviation principle for general additive functionals of BMCs. Sections 4 and 5 are devoted to the proofs of the results. In Section 6, we recall some useful results.
3. Main result
3.1. Moderate deviation principle for
First, we state a strong consistency result for the estimators for in the set of continuity of . Its proof is given in Section 4.1.
Lemma 3.1.
The main result of this section is the following theorem, which states the moderate deviation principle for , for in the set of continuity of the function .
Theorem 3.2.
Under the hypothesis of Lemma 3.1, for all in the set of continuity of and , satisfies a moderate deviation principle on with speed and rate function defined by: for all , that is, for any ,
where and denote respectively the interior and the closure of .
In order to obtain confidence intervals for , it would be interesting to replace in the expression of the rate function by an estimator. In that direction, we have the following. Let . Obviously, and can be the same. We consider the estimator of defined with instead of . Let be a sequence of real numbers such that as . Then, we have the following result, whose proof is given in Section 4.3.
Theorem 3.3.
Under the hypothesis of Lemma 3.1, for all in the set of continuity of and , satisfies a moderate deviation principle on with speed and rate function defined by: for all .
In particular, using the contraction principle (see, e.g., Dembo and Zeitouni [11, Chapter 4]), we have the following corollary of Theorem 3.3.
Corollary 3.4.
Under the hypothesis of Theorem 3.3, we have the following convergence for in the set of continuity of and
Remark 3.5.
Corollary 3.4 yields a simple confidence interval for , of decreasing size and with level asymptotically close to
Using the structure of the asymptotic variance in (7), we can prove the following multidimensional result, whose proof is given in Section 4.4.
Corollary 3.6.
Under the hypothesis of Theorem 3.2, we have, for in the set of continuity of and for all , satisfies a moderate deviation principle on with speed and good rate function defined by
with , where denotes the diagonal matrix and stands for the transpose of vector .
Remark 3.7.
We deduce from Corollary 3.6 that the estimators are asymptotically independent in the sense of moderate deviation for and for any
3.2. Moderate deviation principle for additive functionals of BMCs
In order to study the variance term , we give here a moderate deviation principle for general additive functionals of BMCs. For that purpose, we introduce the following assumption.
Assumption 3.8.
For , let be a sequence of functions defined on such that if and there exists such that:
- (i)
- (ii)
- (iii) The following limit exists and is finite:
(7)
We will use the following notations. For a finite set and a function , we set:
In this paper, we are interested in the cases and , that is the th generation and the first generation of the tree. Recall the invariant probability of , transition probability of the auxiliary Markov chain . For , we set:
Recall the sequence defined in Assumption 3.8. For , we set:
(8) |
The notation means that we consider the average from the root to the th generation.
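A toy computation of generation averages on a simulated tree may clarify this notation. The BAR dynamics, the parameter values and the integer labelling (root 1, children 2u and 2u+1) below are illustrative assumptions of ours, not the paper's general framework.

```python
# Sketch: average over the n-th generation of f(X_u) for a simulated toy
# BAR tree (assumed Gaussian innovations; vertex u has children 2u, 2u+1).
import random

def simulate_bar_tree(n, a=0.5, b=1.0, sigma=1.0, seed=0):
    rng = random.Random(seed)
    x = {1: 0.0}                      # value at the root
    for u in range(1, 2**n):
        for child in (2 * u, 2 * u + 1):
            x[child] = a * x[u] + b + rng.gauss(0.0, sigma)
    return x

def average_over_generation(x, n, f):
    """Empirical average of f over the n-th generation (2**n individuals)."""
    return sum(f(x[u]) for u in range(2**n, 2**(n + 1))) / 2**n

x = simulate_bar_tree(12)
# The generation average should be close to the stationary mean b/(1-a) = 2.
assert abs(average_over_generation(x, 12, lambda v: v) - 2.0) < 0.15
```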
Remark 3.9.
The definition of in (8) is mainly motivated by the decomposition (4). It will allow us to treat the variance term of the estimator defined in (2). Instead, for , we set . Then, we consider the sequences of functions and defined by:
(9) |
It is not difficult to check that, under Assumption 2.11, the sequences and defined in (9) satisfy Assumption 3.8. In particular, let be in the set of continuity of . Thanks to Lemma 6.3, we have:
(10) |
For our convenience, we assume that the quantity which appears in Assumptions 2.11 and 3.8 is the same. The main result of this section is the following.
Theorem 3.10.
Let be a BMC with kernel and initial distribution such that Assumptions 2.9, 2.11 and 3.8 hold. Furthermore, if , then assume that Assumption 2.13 holds. Let be a positive sequence which satisfies (6). Then satisfies a moderate deviation principle on with speed and rate function defined by: for all , with the finite variance defined in (7).
Remark 3.11.
4. Proof of Lemma 3.1, Theorems 3.2 and 3.3 and Corollary 3.6
We will denote by any unimportant finite constant which may vary from line to line (in particular does not depend on ).
4.1. Proof of Lemma 3.1
We begin the proof with . Recall the decomposition (4), with instead of . Using Lemma 6.3, we have . From Remark 2.7, this implies that . Next, we set , in such a way that we have
Following line by line the proof of (32) (where we take for all ), we get
Taking the , dividing by and letting go to infinity in the latter inequality, we get
It then follows from the decomposition (4) that We similarly get the result for and this ends the proof of the lemma.
4.2. Proof of Theorem 3.2
We begin the proof with . We have the following decomposition:
where with the functions defined in (9) for and otherwise; is defined in (8) and the bias term is defined in (4). Thanks to Theorem 3.10 applied to the sequence and using that we get that satisfies a moderate deviation principle in with speed and rate function defined by: for all To complete the proof of Theorem 3.2, it suffices to prove that
(11) |
Next, using that
the Taylor expansion and Assumption 2.12, we get that, for some finite constant ,
Now, (11) follows using the latter inequality and (6). This ends the proof of Theorem 3.2 for . The proof is similar for using
4.3. Proof of Theorem 3.3
We begin the proof with . We have the following decomposition:
(12) |
where
First, we prove that
(13) |
Let For all , we have
This implies that (see, e.g., [11, Lemma 1.2.15])
(14) |
Using Theorem 3.2 and the contraction principle, we have
(15) |
Following Step 1 of the proof of Theorem 6 in [7] and using Lemma 3.1, we can prove that
Using Lemma B.2 in [3], the latter convergence implies that
(16) |
Using (14), (15) and (16), we get
Since can be taken arbitrarily close to , we get (13) and using (12), this implies that
(17) |
Using Theorem 3.2 and the contraction principle, we get that satisfies a moderate deviation principle on with speed and rate function defined by: for all . Using (17) and Remark 2.7, we get the result of Theorem 3.3.
4.4. Proof of Corollary 3.6
Let . Let . We consider the sequence defined by for all and otherwise. We easily check that satisfies Assumption 3.8. In particular, the asymptotic variance defined in (7) is given by . Observe that the linear combination , with coefficients , of the estimators has the following decomposition:
(18) |
where is defined in (8) and the , , are defined in (4). Applying Theorem 3.10, we get that satisfies a moderate deviation principle on with speed and rate function defined by
(19) |
Using (11), we have that
Using Remark 2.7, this implies that
(20) |
Using (18) and (20), we get that and satisfy the same moderate deviation principle. We then conclude that satisfies a moderate deviation principle on with speed and rate function defined in (19). Since this is true for every vector , that is, for all linear combinations of the estimators , , we get the result of Corollary 3.6.
5. Proof of Theorem 3.10
We begin with some notations. We will denote by any unimportant finite constant which may vary from line to line (in particular does not depend on nor on the considered sequence of functions ). Let be a non-decreasing sequence of elements of such that
When there is no ambiguity, we write for .
Let . We write if . We denote by the most recent common ancestor of and , which is defined as the only such that if and , then . We also define the lexicographic order : if either , or and for . Let be a BMC with kernel and initial measure . For , we define the -field:
By construction, the -fields are nested as for .
We define for , and the martingale increments:
(21) |
where
(22) |
We have:
Using the branching Markov property, we get for :
(23) |
We have the following decomposition:
(24) |
where is defined in (21) and:
From (24), our goals will be achieved if we prove the following:
(25) | |||
(26) | |||
(27) |
Note that (25) and (26) mean that and are negligible in the sense of moderate deviations, in such a way that, using (24) and Remark 2.7, and satisfy the same moderate deviation principle. To prove (27), the main method we will use is moderate deviations for martingales (see [12] for more details).
In the sequel, the sequence which appears in Assumption 3.8 will be denoted , in such a way that we have . We have the following result.
Lemma 5.1.
Under the assumptions of Theorem 3.10, we have
Proof.
Let . Using the Chernoff bound, we have, for all ,
(28) |
For all and for , we set
Then, using recursively the fact that
for all and for some function , we get
For all , we set
Using the branching Markov property, we get the following decomposition:
with
For all , we will upper bound the quantity and then . We claim that:
(29) |
(30) |
For that purpose, we plan to use the bound
(31) |
valid for any , any random variable such that , and . For all and for all we get, using (29)-(31),
For all , the latter inequality implies that
Recall that . By induction, we get
Using and of Assumption 3.8 and (3), we have
This implies that
Distinguishing the cases , and and using (5) for , we get
where , and are some positive constants. The latter inequality and (28) imply that
Taking (in fact, we use the following: for and , we have for the choice )
we are led to
Since we can do the same thing for instead of , we get that
(32) |
Finally, in the latter inequality, taking the , dividing by and letting go to infinity, we get the result of Lemma 5.1. Now, to end the proof, we will prove (29) and (30).
Proof of (29)
Proof of (30)
Using the branching Markov property for the second inequality, Assumption 2.9 for the fourth inequality and and of Assumption 3.8 for the last inequality, we get
∎
Next, we have the following result.
Lemma 5.2.
Under the assumptions of Theorem 3.10, we have
Proof.
(33) |
We follow the same arguments as in the proof of Lemma 5.1. For all and for all , we set
We also consider the following quantities for and :
Note that using the branching Markov property, we have
(34) |
As for (29)-(30), for all and , one can prove that
(35) |
Using (31) and (35), we have, for all and for all ,
The latter inequality and (34) imply that
By induction, this implies that
(36) |
Using and of Assumption 3.8 and Assumption 2.9, we get
(37) |
From (36), (37) and according to the value of , we have, for some positive constants , and (recall the definition of and given in (35)):
Recall that . Using the Chernoff bound and (33), we have for all and for all ,
Taking
and since we can do the same thing for instead of , we get,
From (24), Lemmas 5.1 and 5.2, we have
(38) |
As a consequence, using Remark 2.7, and satisfy the same moderate deviation principle.
We now study the martingale part of the decomposition (24). The bracket of is defined by:
Using (22) and (21), we write:
(39) |
with:
We have the following result.
Lemma 5.3.
Under the Assumptions of Theorem 3.10, we have
Proof.
Using the branching Markov property, we have
Using Assumption 2.9 and and of Assumption 3.8, we get
This implies that
(40) |
Recall that with . Using Assumption 2.13, we conclude from (40) that is bounded by a deterministic sequence which converges to 0. As a consequence, using Remark 2.8, we get the result of Lemma 5.3. ∎
Recall given in (7). We have the following result.
Lemma 5.4.
Under the Assumptions of Theorem 3.10, we have
Proof of (41)
Proof of (42)
We set
in such a way that . Using (3) and and of Assumption 3.8, we get
This implies that and then that , where the sequence is defined by
Using (5) and the fact that converges to 0, we get that the sequence converges to . Thus, we have that is bounded by a deterministic sequence which converges to . Then (42) follows using Remark 2.8. ∎
Lemma 5.5.
Under the Assumptions of Theorem 3.10, we have
Proof.
Using (51), we get:
with
First, we set
in such a way that . Using (3) and and of Assumption 3.8, we get
This implies that and then that , where the sequence is defined by
Since the sequence is deterministic and converges to 0, it follows, using Remark 2.8, that
Next, for the term , we have for all :
where we used (3) for the second inequality and and of Assumption 3.8 for the second and the last inequality. Using the latter inequality in , we get
We thus have that is bounded by a deterministic sequence which converges to . It then follows from Remark 2.8 that
From the foregoing, we get the result of the lemma, since . ∎
Lemma 5.6.
Under the Assumptions of Theorem 3.10, we have
We now study the fourth-order exponential moment condition. We stress that this condition implies in particular the exponential Lindeberg condition (condition (C3) in Proposition 6.1). We have the following result.
Lemma 5.7.
Under the Assumptions of Theorem 3.10, we have
Proof.
For all , we have
(43) |
where we have used the definition of , the inequality and the branching Markov property. Using (43), we get
(44) |
where . We will now prove that the right-hand side of (44) converges super-exponentially to at the speed , that is,
For that purpose, we will treat the case , and finally the case . First, we treat the case . Set . We have
(45) |
Since as , it suffices to prove that the first term of the right hand side in (45) converges superexponentially to at the speed , that is, for all
(46) |
As in the proof of Lemma 5.2, we can prove that
Taking the and dividing by , we get (46).
Upper bound of
Upper bound of
Upper bound of
Upper bound of
Upper bound of
Upper bound of
Upper bound of
In the same way as for , we have
Upper bound of
Upper bound of
In the same way as for , we have
Putting together all the upper bounds for and using (43) and (47), we deduce that is bounded by a deterministic sequence which converges to 0. As a consequence, it follows, using Remark 2.8, that
Finally, using (43), (44), (46), we get
∎
For Chen-Ledoux type condition, we have the following result.
Lemma 5.8.
Under the assumptions of Theorem 3.10, we have
Proof.
For all , using (21) we have
(48) |
with defined in (22). Following the proof of (32), we get
Next, for
we have
where we used (49) and the branching Markov property for the first equality, Chernoff bound for the first inequality and (3) for the last inequality. Doing the same thing for instead of we get
From the foregoing, we get, using (48),
Finally, taking the and dividing by in the latter inequality, we get the result of Lemma 5.8. ∎
6. Appendix
We recall here a simplified version of Theorem 1 in [12]. We consider the real martingale with respect to the filtration and we denote its bracket.
Proposition 6.1.
Let be a sequence satisfying
such that is non-decreasing, and define the reciprocal function by
Under the following conditions:
- (C1) there exists such that, for all ,
- (C2)
- (C3) for all and for all ,
satisfies the MDP on with the speed and rate function
Lemma 6.2.
Let , and . Assuming that all the quantities below are well defined, we have:
(49) | ||||
(50) | ||||
(51) | ||||
We recall the following result due to Bochner (see [21, Theorem 1A] which can be easily extended to any dimension ).
Lemma 6.3.
Let be a sequence of positive numbers converging to as goes to infinity. Let be a measurable function such that . Let be a measurable function such that , and . Define
Then, we have at every point of continuity of ,
We also give some bounds on , see the proof of Theorem 2.1 in [3]. We will use the notation:
Lemma 6.4.
There exists a finite constant such that for all , and a probability measure on , assuming that all the quantities below are well defined, there exist functions for such that:
and, with and (notice that either or is bounded), writing :
References
- [1] I. V. Basawa and J. Zhou. Non-Gaussian bifurcating models and quasi-likelihood estimation. Adv. in Appl. Probab., 41(A):55–64, 2004.
- [2] S. V. Bitseki Penda and J.-F. Delmas. Central limit theorem for kernel estimator of invariant density in bifurcating Markov chains models. arXiv preprint arXiv:2106.08626, 2021.
- [3] S. V. Bitseki Penda, H. Djellout, and A. Guillin. Deviation inequalities, moderate deviations and some limit theorems for bifurcating Markov chains with application. Ann. Appl. Probab., 24(1):235–291, 2014.
- [4] S. V. Bitseki Penda and G. Gackou. Moderate deviation principles for bifurcating Markov chains: case of functions dependent of one variable. arXiv e-prints, 2021.
- [5] S. V. Bitseki Penda, M. Hoffmann, and A. Olivier. Adaptive estimation for bifurcating Markov chains. Bernoulli, 23(4B):3598–3637, 2017.
- [6] S. V. Bitseki Penda and A. Olivier. Autoregressive functions estimation in nonlinear bifurcating autoregressive models. Stat. Inference Stoch. Process., 20(2):179–210, 2017.
- [7] S. V. Bitseki Penda and A. Olivier. Moderate deviation principle in nonlinear bifurcating autoregressive models. Statistics & Probability Letters, 138:20–26, 2018.
- [8] S. V. Bitseki Penda and A. Roche. Local bandwidth selection for kernel density estimation in a bifurcating Markov chain model. Journal of Nonparametric Statistics, 0(0):1–28, 2020.
- [9] R. Cowan and R. Staudte. The bifurcating autoregression model in cell lineage studies. Biometrics, 42(4):769–783, December 1986.
- [10] J.-F. Delmas and L. Marsalle. Detection of cellular aging in a Galton-Watson process. Stochastic Process. Appl., 120(12):2495–2519, 2010.
- [11] A. Dembo and O. Zeitouni. Large Deviations Techniques and Applications. Applications of mathematics. Springer, 1998.
- [12] H. Djellout. Moderate deviations for martingale differences and applications to φ-mixing sequences. Stochastics and Stochastics Reports, 73(1):37–64, 2002.
- [13] M. Doumic, M. Hoffmann, N. Krell, and L. Robert. Statistical estimation of a growth-fragmentation model observed on a genealogical tree. Bernoulli, 21(3):1760–1799, 2015.
- [14] M. Duflo. Random iterative models, volume 34. Springer Science & Business Media, 2013.
- [15] F. Gao. Moderate deviations and large deviations for kernel density estimators. Journal of Theoretical Probability, 16(2):401–418, 2003.
- [16] J. Guyon. Limit theorems for bifurcating Markov chains. Application to the detection of cellular aging. Ann. Appl. Probab., 17(5-6):1538–1569, 2007.
- [17] M. Hoffmann and A. Marguet. Statistical estimation in a randomly structured branching population. Stochastic Process. Appl., 129(12):5236–5277, 2019.
- [18] E. Masry. Recursive probability density estimation for weakly dependent stationary processes. IEEE Transactions on Information Theory, 32(2):254–267, 1986.
- [19] A. Mokkadem and M. Pelletier. Confidence bands for densities, logarithmic point of view. Alea, 2:231–266, 2006.
- [20] A. Mokkadem, M. Pelletier, and J. Worms. Large and moderate deviations principles for kernel estimation of a multivariate density and its partial derivatives. Australian & New Zealand Journal of Statistics, 47(4):489–502, 2005.
- [21] E. Parzen. On estimation of a probability density function and mode. The Annals of Mathematical Statistics, 33(3):1065–1076, 1962.
- [22] G. G. Roussas. Nonparametric estimation in Markov processes. Annals of the Institute of Statistical Mathematics, 21(1):73–87, 1969.