
Stability and statistical inversion of travel time tomography

Ashwin Tarikere Department of Mathematics, University of California Santa Barbara, Santa Barbara, CA 93106-3080, USA ashwintan@ucsb.edu  and  Hanming Zhou Department of Mathematics, University of California Santa Barbara, Santa Barbara, CA 93106-3080, USA hzhou@math.ucsb.edu
Abstract.

In this paper, we consider the travel time tomography problem for conformal metrics on a bounded domain, which seeks to determine the conformal factor of the metric from the lengths of geodesics joining boundary points. We establish forward and inverse stability estimates for simple conformal metrics under some a priori conditions. We then apply the stability estimates to show the consistency of a Bayesian statistical inversion technique for travel time tomography with discrete, noisy measurements.

1. Introduction

Consider a smooth, bounded, and simply connected domain $\Omega\subseteq\mathbb{R}^{m}$, with $m\geq 2$. Given a Riemannian metric $g$ on $\overline{\Omega}$, we define the associated boundary distance function $\Gamma_{g}:\partial\Omega\times\partial\Omega\to[0,\infty)$ by

\[
\Gamma_{g}(\xi,\eta)=\inf\left\{\int_{\gamma}\,d|g|:=\int_{0}^{T}|\dot{\gamma}(t)|_{g}\,dt\ :\ \gamma\in C^{1}([0,T],\overline{\Omega}),\ \gamma(0)=\xi,\ \gamma(T)=\eta\right\},
\]

for all $\xi,\eta\in\partial\Omega$. In other words, $\Gamma_{g}(\xi,\eta)$ is the Riemannian distance (with respect to $g$) between the boundary points $\xi$ and $\eta$. We consider the following inverse problem: Can we recover the metric $g$ in the interior of the domain from the boundary distance function $\Gamma_{g}$?

This inverse problem, called the boundary rigidity problem in the mathematics literature, arose in geophysics in an attempt to determine the inner structure of the earth, such as the sound speed or index of refraction, from measurements of travel times of seismic waves on the earth’s surface. In seismology, it is called the inverse kinematic problem or the travel time tomography problem [16, 45].

The boundary rigidity problem is not solvable in general. Consider, for example, a unit disk with a metric whose magnitude is large (and therefore, geodesic speed is low) near the center of the disk. In such cases, it is possible that all distance minimizing geodesics connecting boundary points avoid the large metric region, and therefore one cannot expect to recover the metric in this region from the boundary distance function. In view of this restriction, one needs to impose additional geometric conditions on the metric to be reconstructed. One such condition is simplicity. A metric $g$ on $\overline{\Omega}$ is said to be simple if the boundary $\partial\Omega$ is strictly convex with respect to $g$ and any two points of $\overline{\Omega}$ can be joined by a unique distance minimizing geodesic. Michel conjectured that simple metrics are boundary distance rigid [21], and this has been proved in dimension two [34]. In dimensions $\geq 3$, this is known for generic simple metrics [36]. When caustics appear, a completely new approach was established in [37, 38] for the boundary rigidity problem in dimensions $\geq 3$, assuming a convex foliation condition. Boundary rigidity problems for more general dynamical systems can be found in [10, 2, 48, 32, 17, 46, 35]. We also refer to [9, 39] for summaries of recent developments on the boundary rigidity problem.

The boundary rigidity problem for general Riemannian metrics has a natural gauge: isometries of $(\overline{\Omega},g)$ that preserve $\partial\Omega$ will also preserve the boundary distance function. In this paper, we restrict our attention to the problem of determining metrics from a fixed conformal class. Let $\bar{g}$ be a fixed “background” metric on $\overline{\Omega}$ which is simple and has $C^{3}$ regularity. For any positive function $n\in C^{3}(\overline{\Omega})$, define

\[
g_{n}:=n^{2}\bar{g},
\]

which is a new Riemannian metric on $\overline{\Omega}$ that is conformal to $\bar{g}$. Our goal is to recover the parameter $n$ from the boundary distance function of $g_{n}$. In this problem, the gauge of isometries does not appear, and one expects to be able to uniquely determine the conformal factor $n$ from $\Gamma_{g_{n}}$.
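As a concrete illustration of the length functional behind $\Gamma_{g_{n}}$, the following Python sketch approximates the $g_{n}$-length of a polygonal curve with a midpoint rule. It assumes, purely for illustration, that the background metric $\bar{g}$ is Euclidean; the names `curve_length` and `straight_segment` are hypothetical and do not come from the paper. Since the boundary distance is an infimum over curves, the length of any single curve only gives an upper bound for it.

```python
import math

def curve_length(n, points):
    """Approximate the g_n-length of a polyline, where g_n = n^2 * (Euclidean metric).

    `n` maps a point (tuple of floats) to a positive number; each straight
    piece contributes n(midpoint) * (Euclidean length of the piece).
    """
    total = 0.0
    for p, q in zip(points, points[1:]):
        mid = tuple((a + b) / 2.0 for a, b in zip(p, q))
        total += n(mid) * math.dist(p, q)
    return total

def straight_segment(p, q, steps=200):
    """Sample the straight segment from p to q at steps + 1 equally spaced points."""
    return [tuple(a + (b - a) * k / steps for a, b in zip(p, q))
            for k in range(steps + 1)]

# Illustrative use: with n = 1 this is the ordinary Euclidean length, and a
# constant conformal factor n = 2 doubles every length.
segment = straight_segment((0.0, 0.0), (3.0, 4.0))
euclidean_length = curve_length(lambda x: 1.0, segment)
doubled_length = curve_length(lambda x: 2.0, segment)
```

Minimizing such a discretized length over polylines joining two boundary points would give a crude numerical approximation of the boundary distance; that optimization step is deliberately left out of this sketch.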

It is known that simple metrics from the same conformal class are boundary rigid for all $m\geq 2$ [26, 25, 28]. To be precise, if $n_{1},n_{2}\in C^{3}(\overline{\Omega})$ are such that $g_{n_{1}},g_{n_{2}}$ are both simple metrics on $\overline{\Omega}$, then $\Gamma_{g_{n_{1}}}=\Gamma_{g_{n_{2}}}$ if and only if $n_{1}=n_{2}$. To simplify notation, we will henceforth denote $\Gamma_{g_{n}}$ by simply $\Gamma_{n}$.

1.1. Stability estimates for the deterministic inverse problem

The uniqueness aspect of the boundary rigidity problem for conformal simple metrics has been quite well understood through the aforementioned studies [26, 25, 28]. The first topic of this paper is the stability of the boundary rigidity problem, i.e., quantitative lower bounds on the change in $\Gamma_{n}$ corresponding to a change in the parameter $n$. Stability is important in practice, as we hope the inversion method for travel time tomography will be stable under perturbations of the data, e.g., by noise.

Conditional stability estimates for simple metrics can be found in [44, 36, 37], where the metrics are assumed a priori to be close to a given one. When considering a fixed conformal class, various stability estimates without the closeness assumption have been established in [25, 27, 3]. In [25] the following stability result has been proved for the 2D boundary rigidity problem with the Euclidean background metric:

\[
\|n_{1}-n_{2}\|_{L^{2}(\Omega)}\leq\frac{1}{\sqrt{2\pi}}\|d_{\xi}(\Gamma_{n_{1}}-\Gamma_{n_{2}})(\xi,\eta)\|_{L^{2}(\partial\Omega\times\partial\Omega)}.
\tag{1}
\]

Here, $d_{\xi}$ is the exterior derivative operator with respect to $\xi$ and the $L^{2}$ norms are taken with respect to the standard Euclidean metric. Notice that since the boundary distance function is symmetric, this estimate essentially says that the $L^{2}$-norm of $n_{1}-n_{2}$ can be controlled by the $H^{1}$-norm of $\Gamma_{n_{1}}-\Gamma_{n_{2}}$. For dimensions $\geq 3$, there are generalizations [3, 27] of (1) with more complicated expressions (see also Theorem 2.1). However, the estimates of [3, 27] are not in standard Sobolev or Hölder norms, which makes them inconvenient for applications.

In this paper, we establish stability estimates similar to (1) for all dimensions $\geq 2$, without any a priori closeness assumptions on $n_{1},n_{2}$. Before giving the statement of our results, we need to define some function spaces for the conformal parameter $n$.

Definition 1.1.

Let $\Omega_{0}$ be a smooth, relatively compact subdomain of $\Omega$, and let $\lambda,\Lambda,\ell,L$ be real numbers such that

\[
0<\lambda<1<\Lambda,\qquad 0<\ell<L.
\]

We define $\mathcal{N}_{\lambda,\Lambda,\ell,L}(\Omega_{0})$ to be the set of all functions $n\in C^{3}(\overline{\Omega})$ that satisfy the following conditions:

  1. (i)

    The metric $g_{n}=n^{2}\bar{g}$ is a simple metric on $\overline{\Omega}$.

  2. (ii)

    $\lambda<n(x)<\Lambda$ for all $x\in\overline{\Omega}$ and $n\equiv 1$ on $\overline{\Omega}\setminus\Omega_{0}$.

  3. (iii)

    Let $\exp_{n}(x,v)$ denote the exponential map with respect to $g_{n}$ based at $x\in\overline{\Omega}$ and acting on $v\in T_{x}\overline{\Omega}$ (that is, the tangent space of $\overline{\Omega}$ at $x$). Then the derivative of $\exp_{n}(x,\cdot)$ satisfies

    \[
    \ell|w|_{\bar{g}}<|D_{v}\exp_{n}(x,v)(w)|_{\bar{g}}<L|w|_{\bar{g}},
    \tag{2}
    \]

    for all $x\in\overline{\Omega}$, $v\in\operatorname{dom}(\exp_{n}(x,\cdot))$, and $w\in T_{v}T_{x}\overline{\Omega}\cong T_{x}\overline{\Omega}$.

We also let

\[
\mathcal{N}_{\lambda,\ell}(\Omega_{0}):=\bigcup_{\Lambda>1,\,L>0}\mathcal{N}_{\lambda,\Lambda,\ell,L}(\Omega_{0}).
\]

The class of metrics associated with these function spaces includes any metric with non-positive sectional curvature that is conformal to $\bar{g}$ and equal to $\bar{g}$ in a neighborhood of $\partial\Omega$. Indeed, suppose $g_{n}=n^{2}\bar{g}$ is such a metric. Then $(\overline{\Omega},g_{n})$ is free of conjugate points by the curvature assumption, and $\partial\Omega$ remains strictly convex with respect to $g_{n}$ since $g_{n}\equiv\bar{g}$ near $\partial\Omega$. Therefore, $g_{n}$ is a simple metric. Moreover, it follows from the Rauch Comparison Theorem that its exponential map $\exp_{n}$ satisfies (2) for sufficiently large $L$ and any $\ell<1$ (see, e.g., [6, Corollary 1.35]).

Remark 1.1 (Notation).

Let $T:W_{1}\to W_{2}$ be a linear map between normed vector spaces. Given real numbers $m,M$, we will use the notation

\[
m\prec T\prec M
\]

as shorthand for

\[
m\|w\|_{W_{1}}<\|Tw\|_{W_{2}}<M\|w\|_{W_{1}},
\]

for all nonzero $w\in W_{1}$. Using this notation, (2) can be rewritten as

\[
\ell\prec D_{v}\exp_{n}(x,v)\prec L.
\tag{3}
\]

We will also use $\|T\|_{op}$ to denote the operator norm of $T$:

\[
\|T\|_{op}:=\sup\left\{\|Tw\|_{W_{2}}\ :\ w\in W_{1},\ \|w\|_{W_{1}}=1\right\}.
\]
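For a linear map between finite dimensional Euclidean spaces, the two-sided bound $m\prec T\prec M$ is equivalent to requiring that every singular value of $T$ lies strictly between $m$ and $M$, and $\|T\|_{op}$ is the largest singular value. The following Python sketch checks this for a real $2\times 2$ matrix using the closed form of its singular values. The Euclidean norms, the restriction to $2\times 2$ matrices, and the helper names are assumptions made only for this illustration.

```python
import math

def singular_values_2x2(a, b, c, d):
    """Return (smallest, largest) singular value of the matrix [[a, b], [c, d]].

    The squared singular values are the roots of t^2 - s*t + det^2 = 0,
    where s is the squared Frobenius norm and det is the determinant.
    """
    s = a * a + b * b + c * c + d * d
    det = a * d - b * c
    disc = math.sqrt(max(s * s - 4.0 * det * det, 0.0))
    smallest = math.sqrt(max((s - disc) / 2.0, 0.0))
    largest = math.sqrt((s + disc) / 2.0)
    return smallest, largest

def is_between(m, M, a, b, c, d):
    """Check m < |Tw| < M for all unit vectors w, with T = [[a, b], [c, d]]."""
    smallest, largest = singular_values_2x2(a, b, c, d)
    return m < smallest and largest < M

def operator_norm_2x2(a, b, c, d):
    """Operator norm with respect to Euclidean norms: the largest singular value."""
    return singular_values_2x2(a, b, c, d)[1]
```

For example, the diagonal matrix with entries $2$ and $1/2$ satisfies $0.4\prec T\prec 2.5$, but not $0.5\prec T\prec 2.5$, because the inequalities in the definition are strict.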
Remark 1.2.

Let $\delta>0$ be the distance (with respect to $\bar{g}$) between $\partial\Omega$ and $\overline{\Omega}_{0}$, and let $\xi,\eta\in\partial\Omega$ be any pair of boundary points such that $\operatorname{dist}_{\bar{g}}(\xi,\eta)<\delta$. For any $n\in\mathcal{N}_{\lambda,\ell}(\Omega_{0})$, $g_{n}$ coincides with $\bar{g}$ on $\overline{\Omega}\setminus\Omega_{0}$, and consequently, we have $\Gamma_{n}(\xi,\eta)=\operatorname{dist}_{\bar{g}}(\xi,\eta)$. In particular, $\Gamma_{n_{1}}(\xi,\eta)=\Gamma_{n_{2}}(\xi,\eta)$ for all $n_{1},n_{2}\in\mathcal{N}_{\lambda,\ell}(\Omega_{0})$.

We are now ready to state our results on stability estimates for the boundary rigidity problem. The following “inverse stability” estimate follows from a result of Beylkin [3], combined with some estimates for metrics with conformal factors $n\in\mathcal{N}_{\lambda,\ell}(\Omega_{0})$. The details are presented in Section 2.

Theorem 1.2.

Let $\Omega,\Omega_{0},\bar{g}$ be as before, and let $\lambda,\ell$ be real numbers such that

\[
0<\lambda<1,\qquad 0<\ell.
\]

Then there exists a constant $C_{1}(\Omega,\Omega_{0},\bar{g},\ell)>0$ such that for all $n_{1},n_{2}\in\mathcal{N}_{\lambda,\ell}(\Omega_{0})$,

\[
\|n_{1}-n_{2}\|_{L^{2}(\Omega)}\leq C_{1}\lambda^{2-m}\|d_{\xi}(\Gamma_{n_{1}}-\Gamma_{n_{2}})(\xi,\eta)\|_{L^{2}(\partial\Omega\times\partial\Omega)}.
\]

Here, the $L^{2}$ norms are taken with respect to the background metric $\bar{g}$, and $d_{\xi}$ represents the exterior derivative operator with respect to $\xi$. Note that the stability constant $C_{1}$ can blow up as $\ell\to 0$. In a sense, as $\ell$ approaches $0$, we allow the metrics in our class to get closer and closer to potentially having conjugate points, and thus becoming non-simple.

We will apply the above stability estimate to study a statistical inversion technique for travel time tomography. For this purpose, we also need the following continuity (or “forward stability”) estimate of $\Gamma_{n}$. To the best of our knowledge, no such continuity estimate has been published before. The key idea in the proof is to apply the change of variables formula and use the upper bounds on $\det\left(D_{v}\exp_{n_{j}}\right)$ to control $\|\Gamma_{n_{1}}-\Gamma_{n_{2}}\|_{L^{2}}$ in terms of $\|n_{1}-n_{2}\|_{L^{2}}$.

Theorem 1.3.

Let $\Omega,\Omega_{0},\bar{g}$ be as before, and let $\lambda,\Lambda,\ell,L$ be real numbers such that

\[
0<\lambda<1<\Lambda,\qquad 0<\ell<L.
\]

Then there exists a constant $C_{2}(\Omega,\Omega_{0},\bar{g},\ell,L)>0$ such that for all $n_{1},n_{2}\in\mathcal{N}_{\lambda,\Lambda,\ell,L}(\Omega_{0})$,

\[
\|\Gamma_{n_{1}}-\Gamma_{n_{2}}\|_{L^{2}(\partial\Omega\times\partial\Omega)}\leq C_{2}\frac{\Lambda^{m/2}}{\lambda}\|n_{1}-n_{2}\|_{L^{2}(\Omega)}.
\]

As with Theorem 1.2, the constant $C_{2}$ can blow up as $\ell\to 0$. The same happens as $L\to\infty$, since this allows $\det\left(D_{v}\exp_{n_{j}}\right)$ to blow up. The details are again postponed to Section 2.

1.2. The statistical inverse problem

The boundary rigidity problem is nonlinear, and geodesics are curved in general, so it is hard to derive explicit inversion formulas. Some reconstruction algorithms and numerical implementations based on theoretical analyses can be found in [7, 8, 47]. Typically, inversion methods in travel time tomography take an optimization approach with appropriate regularization. This is a deterministic approach which seeks to minimize some mismatch functional that quantifies the difference between the observations and the forecasts (synthetic data). However, this approach generally does not work well for non-convex problems. Moreover, various approximations in numerical methods can introduce systematic (random) error to the reconstruction procedure.

In this paper, we apply the above stability estimates (Theorems 1.2 and 1.3) to study a Bayesian inversion technique for the travel time tomography problem. The Bayesian inversion technique provides a reasonable solution for ill-posed inverse problems when the number of available observations is limited, which is a common scenario in practice. Applications of Bayesian inversion to seismology can be found in [20, 41], which are based on the general paradigm of infinite dimensional Bayesian inverse problems developed by Stuart [40]. However, most studies in the literature are concerned with waveform inversion, which is more PDE-based. On the other hand, there are very few results on statistical guarantees for the Bayesian approach to seismic inverse problems. These motivate us to apply Stuart’s Bayesian inversion framework to produce a rigorous statistical analysis of the problem of recovering the wave speed from the (noisy) travel time measurements.

For statistical inversion, it is convenient to rewrite the conformal factor $n$ using an exponential parameter: For any $\beta\geq 3$, let $C_{0}^{\beta}(\Omega_{0})$ denote the closure in the Hölder space $C^{\lfloor\beta\rfloor,\beta-\lfloor\beta\rfloor}(\overline{\Omega}_{0})$ of the subspace of all smooth functions compactly supported in $\Omega_{0}$. Given any function $c\in C_{0}^{3}(\Omega_{0})$, we define the corresponding conformal factor $n_{c}$ by

\[
n_{c}(x)=\begin{cases}e^{c(x)}&\textrm{if }x\in\Omega_{0},\\ 1&\textrm{if }x\in\overline{\Omega}\setminus\Omega_{0}.\end{cases}
\tag{4}
\]

It is easy to see that $n_{c}$ is a positive $C^{3}$ function on $\overline{\Omega}$. To simplify notation, we will denote the corresponding boundary distance function $\Gamma_{n_{c}}$ by simply $\Gamma_{c}$.
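To make the parametrization (4) concrete, here is a small Python sketch in which $\Omega$ is taken to be the unit disk, $\Omega_{0}$ the concentric disk of radius $1/2$, and $c$ a smooth bump supported in $\Omega_{0}$. These choices, and the names `bump` and `n_c`, are assumptions made for illustration only. Because $c$ vanishes outside its support, the single expression $e^{c(x)}$ already reproduces both cases of (4).

```python
import math

def bump(x, y, radius=0.5, height=0.2):
    """A smooth function supported in the open disk of the given radius.

    It equals `height` at the origin and vanishes, with all derivatives,
    at the boundary of the disk.
    """
    r2 = (x * x + y * y) / (radius * radius)
    if r2 >= 1.0:
        return 0.0
    return height * math.exp(1.0 - 1.0 / (1.0 - r2))

def n_c(x, y, c=bump):
    """Conformal factor n_c = exp(c); equals 1 wherever c vanishes."""
    return math.exp(c(x, y))
```

With these choices, `n_c` takes values between $1$ and $e^{0.2}$, so it satisfies bounds of the form $\lambda<n_{c}<\Lambda$ for any $\lambda<1$ and any $\Lambda>e^{0.2}$. Whether the resulting metric is simple is a separate question that this sketch does not address.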

Our goal is to reconstruct the exponential parameter $c$ from error-prone measurements of $\Gamma_{c}$ on finitely many pairs of boundary points $(X_{i},Y_{i})$, $i=1,\ldots,N$. Following the general paradigm of Bayesian inverse problems, we assume that $c$ arises from a prior probability distribution $\Pi$ on $C^{3}_{0}(\Omega_{0})$. We will construct $\Pi$ so that it is supported in a subset of $C_{0}^{3}(\Omega_{0})$ of the following form:

Definition 1.4.

Let $\ell,M>0$ and $\beta\geq 3$. We define $\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})$ as the set of all functions $c\in C_{0}^{\beta}(\Omega_{0})$ that satisfy the following conditions:

  1. (i)

    The metric $g_{n_{c}}=n_{c}^{2}\bar{g}$ is a simple metric on $\overline{\Omega}$.

  2. (ii)

    The derivative of $\exp_{n_{c}}(x,\cdot)$ satisfies

    \[
    D_{w}\exp_{n_{c}}(x,w)\succ\ell,
    \]

    for all $x\in\overline{\Omega}$ and $w\in\operatorname{dom}(\exp_{n_{c}}(x,\cdot))$.

  3. (iii)

    $\|c\|_{C^{\lfloor\beta\rfloor,\beta-\lfloor\beta\rfloor}(\overline{\Omega}_{0})}<M$.

We will show in Section 2 that if $c\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})$, the corresponding conformal parameter $n_{c}\in\mathcal{N}_{\lambda,\Lambda,\ell,L}(\Omega_{0})$ for appropriate choices of $\lambda,\Lambda$ and $L$. The precise construction of $\Pi$ is described in Section 3.

Remark 1.3 (Notation).

Henceforth, we will denote $C^{\lfloor\beta\rfloor,\beta-\lfloor\beta\rfloor}$ by simply $C^{\beta}$.

Remark 1.4.

It is known that small perturbations of simple metrics are again simple. Therefore, $\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})$ is an open subset of $C^{\beta}_{0}(\Omega_{0})$.

The pairs of boundary points $(X_{i},Y_{i})$ between which the distance measurements are to be made are chosen according to the rule

\[
(X_{i},Y_{i})\stackrel{\textrm{i.i.d.}}{\sim}\mu,
\]

where $\mu$ is the uniform probability measure on $\partial\Omega\times\partial\Omega$ induced by the background metric $\bar{g}$. The actual distance measurements between these points are assumed to be of the form

\[
\Gamma_{i}=e^{\epsilon_{i}}\Gamma_{c}(X_{i},Y_{i}),
\]

where $\epsilon_{i}$ are i.i.d. $N(0,\sigma^{2})$ normal random variables ($\sigma>0$ is fixed) that are also independent of $(X_{j},Y_{j})_{j=1}^{N}$. For simplicity, we will henceforth assume that $\sigma=1$ without loss of generality. Define

\[
Z_{c}=\log\Gamma_{c},
\]

and for $i=1,\ldots,N$,

\begin{align*}
Z_{i}&=\log\Gamma_{i}\\
&=Z_{c}(X_{i},Y_{i})+\epsilon_{i}.
\end{align*}

All of our measurements can be summarized using the data vector

\[
\mathcal{D}_{N}=(X_{i},Y_{i},Z_{i})_{i=1}^{N}\in(\partial\Omega\times\partial\Omega\times\mathbb{R})^{N}.
\tag{5}
\]

For convenience, let us define $\mathcal{X}=\partial\Omega\times\partial\Omega\times\mathbb{R}$.
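The measurement model above can be simulated directly. The Python sketch below does so in the simplest setting we can write down explicitly: $\Omega$ is the unit disk, $\bar{g}$ is Euclidean, and $c\equiv 0$, so that $\Gamma_{c}(\xi,\eta)$ is the chord length between two boundary points parametrized by their angles. This special case, and the names `chord_length`, `Z0`, and `simulate_data`, are assumptions for illustration; for a general $c$ one would have to replace `Z0` by a numerical geodesic distance solver.

```python
import math
import random

def chord_length(theta, phi):
    """Boundary distance on the Euclidean unit disk between the angles theta and phi."""
    return 2.0 * abs(math.sin((theta - phi) / 2.0))

def Z0(theta, phi):
    """Log boundary distance Z_c for c = 0 (undefined when the two points coincide)."""
    return math.log(chord_length(theta, phi))

def simulate_data(N, log_distance, sigma=1.0, seed=0):
    """Draw D_N = (X_i, Y_i, Z_i), i = 1..N, with Z_i = Z_c(X_i, Y_i) + eps_i.

    X_i and Y_i are independent uniform angles on the circle, which is the
    uniform measure on the product of the boundary with itself, and eps_i are
    independent N(0, sigma^2) noise terms.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(N):
        x = rng.uniform(0.0, 2.0 * math.pi)
        y = rng.uniform(0.0, 2.0 * math.pi)
        z = log_distance(x, y) + rng.gauss(0.0, sigma)
        data.append((x, y, z))
    return data
```

Working with $Z_{i}=\log\Gamma_{i}$ turns the multiplicative noise $e^{\epsilon_{i}}$ into additive Gaussian noise, which is why the simulation adds the noise after taking logarithms.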

Next, let $P^{N}_{c}$ denote the probability law of $\mathcal{D}_{N}|c$. It is easy to see that $P^{N}_{c}=\times_{i=1}^{N}P^{(i)}_{c}$, where for each $i\in\{1,\ldots,N\}$, $P^{(i)}_{c}$ is equal to the probability law of $(X_{i},Y_{i},Z_{i})$. More explicitly, for each $i\in\{1,\ldots,N\}$,

\[
dP^{(i)}_{c}(x,y,z)=p_{c}(x,y,z)\,d\mu(x,y)\,dz,
\]

where

\[
p_{c}(x,y,z)=\frac{1}{\sqrt{2\pi}}\exp\left\{-\frac{1}{2}\left(z-Z_{c}(x,y)\right)^{2}\right\}.
\]

We denote the posterior distribution of $c|\mathcal{D}_{N}$ by $\Pi(\cdot|\mathcal{D}_{N})$. By Corollary 2.7, the map $(c,(x,y,z))\mapsto p_{c}(x,y,z)$ is jointly Borel-measurable from $C^{3}_{0}(\Omega_{0})\times\mathcal{X}$ to $\mathbb{R}$. So it follows from standard arguments (see [14, p. 7]) that the posterior distribution is well-defined and takes the form

\[
\Pi(A|\mathcal{D}_{N})=\frac{\int_{A}\prod_{i=1}^{N}p_{c}(X_{i},Y_{i},Z_{i})\,d\Pi(c)}{\int\prod_{i=1}^{N}p_{c}(X_{i},Y_{i},Z_{i})\,d\Pi(c)}
\]

for any Borel set $A\subseteq C^{3}_{0}(\Omega_{0})$. Our posterior estimator for $c$ will be the posterior mean

\[
\overline{c}_{N}=\mathbb{E}^{\Pi}[c|\mathcal{D}_{N}].
\tag{6}
\]
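The posterior formula and the posterior mean (6) can be illustrated on a toy problem in which the prior is supported on finitely many candidates, so that the integrals reduce to finite sums. The Python sketch below is not the prior constructed in Section 3: the one-parameter family `toy_Z(s)`, which shifts the Euclidean log chord length on the unit disk by a constant $s$, is a hypothetical stand-in for $c\mapsto Z_{c}$ and is not claimed to arise from a parameter in $\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})$. All function names are likewise assumptions.

```python
import math

def log_likelihood(data, Z):
    """Sum over the data of log p_c(x, y, z), for unit-variance Gaussian noise."""
    return sum(-0.5 * math.log(2.0 * math.pi) - 0.5 * (z - Z(x, y)) ** 2
               for x, y, z in data)

def posterior_weights(data, candidates, prior):
    """Posterior probabilities of finitely many candidate forward maps.

    `candidates` is a list of functions (x, y) -> Z(x, y) and `prior` the list
    of their prior probabilities. The maximum log weight is subtracted before
    exponentiating, for numerical stability.
    """
    logs = [math.log(p) + log_likelihood(data, Z)
            for Z, p in zip(candidates, prior)]
    top = max(logs)
    unnormalized = [math.exp(v - top) for v in logs]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def posterior_mean(weights, values):
    """Posterior mean of a scalar parameter taking the given values."""
    return sum(w * v for w, v in zip(weights, values))

def toy_Z(s):
    """Hypothetical forward map: log chord length on the unit disk, shifted by s."""
    return lambda x, y: s + math.log(2.0 * abs(math.sin((x - y) / 2.0)))
```

In the infinite dimensional setting of the paper the sums become integrals against $\Pi$, and in practice the posterior mean would typically be approximated by sampling rather than by enumeration.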
Theorem 1.5.

Suppose that the true parameter $c_{0}$ is smooth and compactly supported in $\Omega_{0}$, and is such that $g_{n_{c_{0}}}$ is a simple metric on $\overline{\Omega}$. Then there is a well defined prior distribution $\Pi$ on $C^{3}_{0}(\Omega_{0})$ such that the posterior mean $\overline{c}_{N}$ satisfies

\[
\|\overline{c}_{N}-c_{0}\|_{L^{2}(\Omega)}\to 0
\]

in $P^{N}_{c_{0}}$-probability, as $N\to\infty$.

A more precise version of this result is stated in Theorem 3.1 in Section 3, which in fact requires significantly weaker regularity assumptions on $c_{0}$. It also specifies an explicit $N^{-\omega}$ rate of convergence, where $\omega$ is a positive constant that can be made arbitrarily close to $1/4$.

To prove Theorem 1.5, we apply the analytic techniques developed in recent consistency studies of statistical inversion of the geodesic X-ray transform [22] and of related non-linear problems arising in polarimetric neutron tomography [23, 24]. The forward and inverse stability estimates for the measurement operators (like the ones in Theorems 1.2 and 1.3) play a key role in the arguments of these references.

The analysis of theoretical guarantees for statistical inverse problems is currently a very active topic. Recent progress for various linear and non-linear inverse problems includes [11, 12, 1, 22, 29, 23, 24, 31, 5, 4]. See also the recent lecture notes [30].

The paper is structured as follows. In Section 2, we establish the forward and inverse stability estimates for the boundary distance function. Section 3 is devoted to proving the statistical consistency of Bayesian inversion for the boundary rigidity problem.

2. Forward and Inverse continuity estimates

In order to prove the statistical consistency of the proposed Bayesian estimator, we need to establish quantitative upper and lower bounds on the magnitude of change in the boundary distance function $\Gamma_{n}$ corresponding to a change in the conformal parameter $n$ of the metric. This is the content of Theorems 1.2 and 1.3, which we will prove in this section. We will also use these estimates to establish similar bounds for the map $c\mapsto Z_{c}=\log\Gamma_{c}$, when $c$ belongs to the parameter space $\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})$ defined in Definition 1.4.

2.1. Stability estimates

We begin with the proof of Theorem 1.2. As we noted in the introduction, such an estimate has already been proved for dimension $m=2$ by Mukhometov in [25]. For general $m\geq 2$, we have the following result by Beylkin [3]. Also see [27, Lemma 4].

Theorem 2.1 ([3]).

Let $n_{1},n_{2}\in C^{3}(\overline{\Omega})$ be such that $g_{n_{1}},g_{n_{2}}$ are simple metrics on $\overline{\Omega}$. Then

\[
\begin{split}
\int_{\Omega}(n_{1}-n_{2})&(n_{1}^{m-1}-n_{2}^{m-1})\,d\textrm{Vol}_{\bar{g}}\\
&\leq C_{m}\int_{\partial\Omega_{\xi}\times\partial\Omega_{\eta}}\sum_{a+b=m-2}d_{\xi}(\Gamma_{n_{1}}-\Gamma_{n_{2}})\wedge d_{\eta}(\Gamma_{n_{1}}-\Gamma_{n_{2}})\wedge(d_{\xi}d_{\eta}\Gamma_{n_{1}})^{a}\wedge(d_{\xi}d_{\eta}\Gamma_{n_{2}})^{b}\,,
\end{split}
\tag{7}
\]

where $d\textrm{Vol}_{\bar{g}}$ is the Riemannian volume form induced by $\bar{g}$, and $d_{\xi}$ and $d_{\eta}$ represent the exterior derivative operators on $\partial\Omega$ with respect to $\xi$ and $\eta$ respectively. Given local coordinates $(\xi^{1},\ldots,\xi^{m-1})$ for $\xi$ and $(\eta^{1},\ldots,\eta^{m-1})$ for $\eta$, we have $d_{\xi}=d\xi^{i}\frac{\partial}{\partial\xi^{i}}$, $d_{\eta}=d\eta^{j}\frac{\partial}{\partial\eta^{j}}$, and $d_{\xi}d_{\eta}=d\xi^{i}\wedge d\eta^{j}\frac{\partial^{2}}{\partial\xi^{i}\partial\eta^{j}}$. The constant

\[
C_{m}=\frac{(-1)^{\frac{(m-1)(m-2)}{2}}\Gamma(m/2)}{2\pi^{m/2}(m-1)!},
\]

depends only on the dimension $m$.
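For reference, the dimensional constant $C_{m}$ is easy to evaluate numerically. The short Python sketch below transcribes the displayed formula using the standard library Gamma function; the function name `beylkin_constant` is our own label, not notation from [3].

```python
import math

def beylkin_constant(m):
    """Evaluate C_m = (-1)^{(m-1)(m-2)/2} * Gamma(m/2) / (2 * pi^{m/2} * (m-1)!)."""
    if m < 2:
        raise ValueError("the formula is used here for dimensions m >= 2")
    # (m-1)(m-2) is a product of consecutive integers, hence even, so the
    # integer division below is exact.
    sign = (-1) ** ((m - 1) * (m - 2) // 2)
    return sign * math.gamma(m / 2.0) / (
        2.0 * math.pi ** (m / 2.0) * math.factorial(m - 1))
```

In particular, the formula gives $C_{2}=1/(2\pi)$ and $C_{3}=-1/(8\pi)$, the latter because $\Gamma(3/2)=\sqrt{\pi}/2$. Only $|C_{m}|$ enters the proof of Theorem 1.2 below.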

We will show that when $n_{1},n_{2}\in\mathcal{N}_{\lambda,\ell}(\Omega_{0})$, the inequality (7) leads to the desired stability estimate.

Lemma 2.2.

Let $n\in\mathcal{N}_{\lambda,\ell}(\Omega_{0})$. Then the corresponding boundary distance function $\Gamma_{n}$ satisfies

\[
|d_{\xi}\Gamma_{n}(\xi,\eta)|_{\bar{g}}\leq 1,\quad|d_{\eta}\Gamma_{n}(\xi,\eta)|_{\bar{g}}\leq 1,
\]

and

\[
|\nabla^{\xi}\nabla^{\eta}\Gamma_{n}(\xi,\eta)|_{\bar{g}}\leq\frac{(1+\ell^{-1})}{\lambda}\operatorname{dist}_{\bar{g}}(\xi,\eta)^{-1}
\]

for all $\xi,\eta\in\partial\Omega$ with $\xi\neq\eta$. Here, $\nabla^{\xi},\nabla^{\eta}$ denote the covariant derivative operators with respect to $\xi$ and $\eta$ respectively, and $\operatorname{dist}_{\bar{g}}(\xi,\eta)$ is the distance from $\xi$ to $\eta$ with respect to the metric $\bar{g}$.

Proof.

Given $\xi,\eta\in\partial\Omega$ with $\xi\neq\eta$, let $v(\xi,\eta)$ denote the unit vector (with respect to $g_{n}$) at $\eta$ tangent to the geodesic from $\xi$ to $\eta$. It follows from the First Variation Formula (cf. [18], Theorem 6.3) that the gradient (with respect to $g_{n}$) of $\Gamma_{n}(\xi,\cdot)$ is given by

\[
\operatorname{grad}_{\eta}\Gamma_{n}(\xi,\eta)=\Pi_{\eta}v(\xi,\eta),
\tag{8}
\]

where $\Pi_{\eta}:T_{\eta}\overline{\Omega}\to T_{\eta}\partial\Omega$ is the orthogonal projection map onto the tangent space of the boundary. Since $g_{n}=\bar{g}$ on $\partial\Omega$, it follows immediately that

\[
|d_{\eta}\Gamma_{n}(\xi,\eta)|_{\bar{g}}=|\operatorname{grad}_{\eta}\Gamma_{n}(\xi,\eta)|_{g_{n}}=\left|\Pi_{\eta}v(\xi,\eta)\right|_{g_{n}}\leq|v(\xi,\eta)|_{g_{n}}=1.
\]

Similar arguments show that $|d_{\xi}\Gamma_{n}(\xi,\eta)|_{\bar{g}}\leq 1$ as well.

Next, let $(\xi^{1},\ldots,\xi^{m-1})$ and $(\eta^{1},\ldots,\eta^{m-1})$ be local coordinates for $\partial\Omega$ around $\xi$ and $\eta$ respectively. We can extend these coordinate charts to boundary normal coordinates $(\xi^{1},\ldots,\xi^{m})$ and $(\eta^{1},\ldots,\eta^{m})$ by taking $\xi^{m}$ and $\eta^{m}$ to be the corresponding distance functions from the boundary. With respect to these coordinates, we may rewrite (8) as

\[
\operatorname{grad}_{\eta}\Gamma_{n}(\xi,\eta)=\sum_{j=1}^{m-1}v^{j}(\xi,\eta)\frac{\partial}{\partial\eta^{j}}.
\tag{9}
\]

We can extend both sides of this equality to $(1,0)$-tensor fields on $\partial\Omega_{\xi}\times\partial\Omega_{\eta}$, while maintaining the equality. Taking covariant derivatives of both sides with respect to $\xi$, we get

\[
\nabla^{\xi}\operatorname{grad}_{\eta}\Gamma_{n}(\xi,\eta)=\sum_{i,j=1}^{m-1}\frac{\partial v^{j}}{\partial\xi^{i}}(\xi,\eta)\frac{\partial}{\partial\eta^{j}}\otimes d\xi^{i}.
\tag{10}
\]

Here, we have used the fact that the product connection on $\partial\Omega_{\xi}\times\partial\Omega_{\eta}$ satisfies $\nabla_{\partial_{\xi_{i}}}\partial_{\eta_{j}}=0$ for all $i,j$. Recall that $g_{n}$ is a simple metric, and its exponential map $\exp_{n}(x,\cdot)$ at any $x\in\overline{\Omega}$ is a diffeomorphism onto $\overline{\Omega}$. Let $w(x,\cdot):\overline{\Omega}\to T_{x}\overline{\Omega}$ denote its inverse map. Since $D_{v}\exp_{n}(x,v)\succ\ell$ for all $v$ in the domain of $\exp_{n}(x,\cdot)$, we have

\[
\|D_{y}w(x,y)\|_{op}<\ell^{-1}\qquad\textrm{for all }y\in\overline{\Omega}.
\tag{11}
\]

Now observe that we have the identity

\[
v(\xi,\eta)=-\frac{w(\eta,\xi)}{\Gamma_{n}(\xi,\eta)}.
\]

So by (9) and (10),

\begin{align*}
\nabla^{\xi}\operatorname{grad}_{\eta}\Gamma_{n}(\xi,\eta)&=-\sum_{i,j=1}^{m-1}\left\{\frac{1}{\Gamma_{n}(\xi,\eta)}\frac{\partial w^{j}(\eta,\xi)}{\partial\xi^{i}}-\frac{w^{j}(\eta,\xi)}{\Gamma_{n}(\xi,\eta)^{2}}\frac{\partial\Gamma_{n}(\xi,\eta)}{\partial\xi^{i}}\right\}\frac{\partial}{\partial\eta^{j}}\otimes d\xi^{i}\\
&=-\frac{1}{\Gamma_{n}(\xi,\eta)}\left\{\sum_{i,j=1}^{m-1}\frac{\partial w^{j}(\eta,\xi)}{\partial\xi^{i}}\frac{\partial}{\partial\eta^{j}}\otimes d\xi^{i}\right\}-\frac{1}{\Gamma_{n}(\xi,\eta)}v(\xi,\eta)\otimes d_{\xi}\Gamma_{n}(\xi,\eta),
\tag{12}
\end{align*}

where the last term uses $w(\eta,\xi)/\Gamma_{n}(\xi,\eta)=-v(\xi,\eta)$, with $v(\xi,\eta)$ understood through its tangential components as in (9). Only the norm of this term is used below.

Observe that $\sum_{i,j=1}^{m-1}\frac{\partial w^{j}(\eta,\xi)}{\partial\xi^{i}}\frac{\partial}{\partial\eta^{j}}\otimes d\xi^{i}$ is precisely the tensor form of the linear map

\[
\Pi_{\eta}\circ D_{y}w(\eta,y)\big|_{y=\xi}\circ\Pi_{\xi},
\]

where $\Pi_{\xi}$ and $\Pi_{\eta}$ are, as before, orthogonal projections from $T_{\xi}\overline{\Omega}\to T_{\xi}\partial\Omega$ and $T_{\eta}\overline{\Omega}\to T_{\eta}\partial\Omega$ respectively. Therefore,

\[
\left|\sum_{i,j=1}^{m-1}\frac{\partial w^{j}(\eta,\xi)}{\partial\xi^{i}}\frac{\partial}{\partial\eta^{j}}\otimes d\xi^{i}\right|_{\bar{g}}\leq\left\|D_{y}w(\eta,y)\big|_{y=\xi}\right\|_{op}<\ell^{-1}.
\]

Combining this with (12), we get

\begin{align*}
|\nabla^{\xi}d_{\eta}\Gamma_{n}(\xi,\eta)|_{\bar{g}}&=|\nabla^{\xi}\operatorname{grad}_{\eta}\Gamma_{n}(\xi,\eta)|_{\bar{g}}\\
&\leq\frac{\ell^{-1}}{\Gamma_{n}(\xi,\eta)}+\frac{|v(\xi,\eta)|_{\bar{g}}|d_{\xi}\Gamma_{n}(\xi,\eta)|_{\bar{g}}}{\Gamma_{n}(\xi,\eta)}\\
&\leq\frac{(1+\ell^{-1})}{\Gamma_{n}(\xi,\eta)}.
\end{align*}

Finally, applying the simple estimate

\[
\operatorname{dist}_{\bar{g}}(\xi,\eta)\leq\frac{1}{\lambda}\Gamma_{n}(\xi,\eta),
\]

we get

\[
|\nabla^{\xi}\nabla^{\eta}\Gamma_{n}(\xi,\eta)|_{\bar{g}}=|\nabla^{\xi}d_{\eta}\Gamma_{n}(\xi,\eta)|_{\bar{g}}\leq\frac{(1+\ell^{-1})}{\lambda}\operatorname{dist}_{\bar{g}}(\xi,\eta)^{-1}.
\]

This completes the proof. ∎

With these estimates in hand, we are now ready to prove Theorem 1.2.

Proof of Theorem 1.2.

Consider the inequality (7) from Theorem 2.1. For $n_{1},n_{2}\in\mathcal{N}_{\lambda,\ell}(\Omega_{0})$, the left hand side becomes

\[
\int_{\Omega}(n_{1}-n_{2})^{2}(n_{1}^{m-2}+n_{1}^{m-3}n_{2}+\cdots+n_{2}^{m-2})\,d\textrm{Vol}_{\bar{g}}\geq(m-1)\lambda^{m-2}\|n_{1}-n_{2}\|_{L^{2}(\Omega)}^{2}.
\tag{13}
\]
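The pointwise inequality behind (13) can be spelled out as follows. This is a routine verification added for the reader's convenience; it uses only the lower bound $n_{1},n_{2}>\lambda>0$ from Definition 1.1.

```latex
\[
n_{1}^{m-1}-n_{2}^{m-1}
=(n_{1}-n_{2})\sum_{k=0}^{m-2}n_{1}^{m-2-k}n_{2}^{k},
\qquad
n_{1}^{m-2-k}n_{2}^{k}\geq\lambda^{m-2}
\quad(0\leq k\leq m-2),
\]
\[
\text{so that}\quad
(n_{1}-n_{2})\left(n_{1}^{m-1}-n_{2}^{m-1}\right)
\geq(m-1)\lambda^{m-2}(n_{1}-n_{2})^{2}
\quad\text{pointwise on }\Omega .
\]
```

Integrating the last inequality over $\Omega$ with respect to $d\textrm{Vol}_{\bar{g}}$ gives (13). For $m=2$ the sum has the single term $1$, and the bound holds with equality.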

Now consider the right hand side of (7). By Lemma 2.2,

\[
|d_{\xi}d_{\eta}\Gamma_{n}|_{\bar{g}}=\left|\textrm{Alt}\left(\nabla^{\xi}\nabla^{\eta}\Gamma_{n}\right)\right|_{\bar{g}}\leq\frac{(1+\ell^{-1})}{\lambda}\operatorname{dist}_{\bar{g}}(\xi,\eta)^{-1}.
\]

Therefore, the right hand side of (7) is bounded above by

\begin{align*}
&|C_{m}|\int_{\partial\Omega\times\partial\Omega}|d_{\xi}(\Gamma_{n_{1}}-\Gamma_{n_{2}})|_{\bar{g}}|d_{\eta}(\Gamma_{n_{1}}-\Gamma_{n_{2}})|_{\bar{g}}\sum_{a+b=m-2}|d_{\xi}d_{\eta}\Gamma_{n_{1}}|_{\bar{g}}^{a}|d_{\xi}d_{\eta}\Gamma_{n_{2}}|_{\bar{g}}^{b}\,d\sigma_{\bar{g}}\\
&\quad\leq(m-1)|C_{m}|\frac{(1+\ell^{-1})^{m-2}}{\lambda^{m-2}}\int_{\partial\Omega\times\partial\Omega}|d_{\xi}(\Gamma_{n_{1}}-\Gamma_{n_{2}})|_{\bar{g}}|d_{\eta}(\Gamma_{n_{1}}-\Gamma_{n_{2}})|_{\bar{g}}|\operatorname{dist}_{\bar{g}}(\xi,\eta)|^{2-m}\,d\sigma_{\bar{g}},
\end{align*}

where $d\sigma_{\bar{g}}$ is the surface measure on $\partial\Omega\times\partial\Omega$ induced by $\bar{g}$. Observe that by Remark 1.2, we have $(\Gamma_{n_{1}}-\Gamma_{n_{2}})(\xi,\eta)=0$ for all $\xi,\eta\in\partial\Omega$ with $\operatorname{dist}_{\bar{g}}(\xi,\eta)<\delta$. Since $2-m\leq 0$, the above expression is therefore further bounded above by

\begin{align*}
&(m-1)|C_{m}|\frac{(1+\ell^{-1})^{m-2}}{\lambda^{m-2}}\delta^{2-m}\int_{\partial\Omega\times\partial\Omega}|d_{\xi}(\Gamma_{n_{1}}-\Gamma_{n_{2}})|_{\bar{g}}|d_{\eta}(\Gamma_{n_{1}}-\Gamma_{n_{2}})|_{\bar{g}}\,d\sigma_{\bar{g}}\\
&\quad\lesssim_{m,\delta,\ell}\lambda^{2-m}\left(\|d_{\xi}(\Gamma_{n_{1}}-\Gamma_{n_{2}})\|^{2}_{L^{2}(\partial\Omega\times\partial\Omega)}+\|d_{\eta}(\Gamma_{n_{1}}-\Gamma_{n_{2}})\|^{2}_{L^{2}(\partial\Omega\times\partial\Omega)}\right)\\
&\quad\lesssim_{m,\delta,\ell}\lambda^{2-m}\|d_{\xi}(\Gamma_{n_{1}}-\Gamma_{n_{2}})\|^{2}_{L^{2}(\partial\Omega\times\partial\Omega)}
\end{align*}

since $\|d_{\xi}(\Gamma_{n_{1}}-\Gamma_{n_{2}})\|_{L^{2}}=\|d_{\eta}(\Gamma_{n_{1}}-\Gamma_{n_{2}})\|_{L^{2}}$ by symmetry. Combining this with (13), we get

\[
\|n_{1}-n_{2}\|^{2}_{L^{2}(\Omega)}\lesssim_{m,\delta,\ell}\lambda^{2(2-m)}\|d_{\xi}(\Gamma_{n_{1}}-\Gamma_{n_{2}})\|^{2}_{L^{2}(\partial\Omega\times\partial\Omega)}
\]

and the theorem follows. ∎

Recall that we parametrized the conformal parameter $n$ of the metric $g_{n}$ by a function $c$ belonging to the parameter space $\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})$, as defined in (4). We assumed that our input data consists of finitely many measurements of the function $Z_{c}=\log\Gamma_{c}$. In the following corollary, we translate Theorem 1.2 into stability estimates for the map $c\mapsto Z_{c}$ using simple Lipschitz estimates for the exponential function: for all $x,y\in[M_{1},M_{2}]$,

\[
e^{M_{1}}|x-y|\leq|e^{x}-e^{y}|\leq e^{M_{2}}|x-y|. \tag{14}
\]

This immediately implies that for all $c_{1},c_{2}\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})$,

\[
e^{-M}\|c_{1}-c_{2}\|_{L^{2}(\Omega_{0})}\leq\|n_{c_{1}}-n_{c_{2}}\|_{L^{2}(\Omega)}\leq e^{M}\|c_{1}-c_{2}\|_{L^{2}(\Omega_{0})}. \tag{15}
\]
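The two-sided Lipschitz bound (14) is elementary (it follows from the mean value theorem), but it can also be sanity-checked numerically. The following is a minimal sketch; the function name `check_exp_lipschitz`, the sampling scheme, and the floating-point tolerance are illustrative choices and not part of the paper.

```python
import math
import random

def check_exp_lipschitz(m1, m2, trials=10_000, seed=0):
    """Test e^{M1}|x-y| <= |e^x - e^y| <= e^{M2}|x-y| on random pairs in [M1, M2]."""
    rng = random.Random(seed)
    lower, upper = math.exp(m1), math.exp(m2)
    tol = 1e-12  # slack for floating-point rounding only
    for _ in range(trials):
        x, y = rng.uniform(m1, m2), rng.uniform(m1, m2)
        gap = abs(x - y)
        diff = abs(math.exp(x) - math.exp(y))
        if lower * gap > diff * (1 + tol) + tol:
            return False
        if diff > upper * gap * (1 + tol) + tol:
            return False
    return True

print(check_exp_lipschitz(-1.0, 2.0))
```

In the proofs below, (14) is applied with $[M_{1},M_{2}]$ chosen as a range containing the values of $Z_{c}$, which is why lower and upper bounds on $\Gamma_{c}$ are established first.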
Corollary 2.3.

For any $M>0$, there exists a constant $C_{1}^{\prime}=C_{1}^{\prime}(\Omega,\Omega_{0},\bar{g},\ell,M)>0$ such that

\[
\|c_{1}-c_{2}\|_{L^{2}(\Omega_{0})}\leq C_{1}^{\prime}\|Z_{c_{1}}-Z_{c_{2}}\|_{H^{1}(\partial\Omega\times\partial\Omega)}
\]

for all $c_{1},c_{2}\in\mathcal{C}^{3}_{\ell,M}(\Omega_{0})$.

Proof.

Let $c_{1},c_{2}\in\mathcal{C}^{3}_{\ell,M}(\Omega_{0})$. Then $n_{c_{1}},n_{c_{2}}\in\mathcal{N}_{\lambda,\ell}(\Omega_{0})$ for $\lambda=e^{-M}$. So it follows from Theorem 1.2 that

\[
\|n_{c_{1}}-n_{c_{2}}\|_{L^{2}(\Omega)}\leq C_{1}e^{(m-2)M}\|d_{\xi}(\Gamma_{c_{1}}-\Gamma_{c_{2}})\|_{L^{2}(\partial\Omega\times\partial\Omega)}. \tag{16}
\]

By (15), the left hand side of the above inequality is bounded below by $e^{-M}\|c_{1}-c_{2}\|_{L^{2}(\Omega_{0})}$. Now, rewrite $d_{\xi}(\Gamma_{c_{1}}-\Gamma_{c_{2}})$ as

\[
\begin{split}d_{\xi}(\Gamma_{c_{1}}-\Gamma_{c_{2}})&=d_{\xi}(e^{Z_{c_{1}}}-e^{Z_{c_{2}}})\\ &=e^{Z_{c_{1}}}d_{\xi}Z_{c_{1}}-e^{Z_{c_{2}}}d_{\xi}Z_{c_{2}}\\ &=e^{Z_{c_{1}}}d_{\xi}(Z_{c_{1}}-Z_{c_{2}})+(e^{Z_{c_{1}}}-e^{Z_{c_{2}}})d_{\xi}Z_{c_{2}}.\end{split}
\]

It follows from Remark 1.2 that if $(\xi,\eta)\in\operatorname{supp}(\Gamma_{c_{1}}-\Gamma_{c_{2}})$, we have $\operatorname{dist}_{\bar{g}}(\xi,\eta)\geq\delta$, and consequently,

\[
e^{-M}\delta\leq\Gamma_{c_{j}}(\xi,\eta)\leq e^{M}\operatorname{diam}_{\bar{g}}(\Omega),\qquad j=1,2.
\]

Therefore, by applying (14) along with the fact that $|d_{\xi}\Gamma_{c_{j}}|_{\bar{g}}\leq 1$ by Lemma 2.2, we get

\[
\begin{split}|d_{\xi}(\Gamma_{c_{1}}-\Gamma_{c_{2}})|_{\bar{g}}&\leq|\Gamma_{c_{1}}|\,|d_{\xi}(Z_{c_{1}}-Z_{c_{2}})|_{\bar{g}}+|\Gamma_{c_{1}}-\Gamma_{c_{2}}|\,|d_{\xi}\Gamma_{c_{2}}|_{\bar{g}}/|\Gamma_{c_{2}}|\\ &\leq e^{M}\operatorname{diam}_{\bar{g}}(\Omega)\,|d_{\xi}(Z_{c_{1}}-Z_{c_{2}})|_{\bar{g}}+\frac{|e^{Z_{c_{1}}}-e^{Z_{c_{2}}}|}{e^{-M}\delta}\\ &\leq e^{M}\operatorname{diam}_{\bar{g}}(\Omega)\,|d_{\xi}(Z_{c_{1}}-Z_{c_{2}})|_{\bar{g}}+\frac{e^{M}\operatorname{diam}_{\bar{g}}(\Omega)}{e^{-M}\delta}|Z_{c_{1}}-Z_{c_{2}}|,\end{split}
\]

where $\operatorname{diam}_{\bar{g}}(\Omega)$ denotes the diameter of $\Omega$ with respect to the metric $\bar{g}$. This further implies

\[
\|d_{\xi}(\Gamma_{c_{1}}-\Gamma_{c_{2}})\|_{L^{2}(\partial\Omega\times\partial\Omega)}\lesssim_{\Omega,\bar{g},\delta,\ell,M}\|Z_{c_{1}}-Z_{c_{2}}\|_{H^{1}(\partial\Omega\times\partial\Omega)}.
\]

Combining this with (15) and (16), we get

\[
\|c_{1}-c_{2}\|_{L^{2}(\Omega_{0})}\lesssim_{\Omega,\bar{g},\delta,\ell,M}\|Z_{c_{1}}-Z_{c_{2}}\|_{H^{1}(\partial\Omega\times\partial\Omega)}.
\]

This completes the proof. ∎

2.2. Forward continuity estimates

We now move on to the proof of Theorem 1.3. The key idea is to use upper bounds on $D_{v}\exp_{n_{j}}(x,v)$ to control $\|\Gamma_{n_{1}}-\Gamma_{n_{2}}\|_{L^{2}}$ in terms of $\|n_{1}-n_{2}\|_{L^{2}}$.

We begin by introducing some notation. Let $S\overline{\Omega}$ denote the unit sphere bundle on $\overline{\Omega}$, that is,

\[
S\overline{\Omega}=\{(x,v)\in T\overline{\Omega}\ :\ |v|_{\bar{g}}=1\}.
\]

The boundary of $S\overline{\Omega}$ consists of unit tangent vectors at $\partial\Omega$. Specifically,

\[
\partial S\overline{\Omega}=\{(x,v)\in S\overline{\Omega}\ :\ x\in\partial\Omega\}.
\]

Let $\nu$ denote the inward unit normal vector field along $\partial\Omega$ with respect to the metric $\bar{g}$. We define the bundles of inward pointing and outward pointing unit tangent vectors on $\partial\Omega$ as follows:

\begin{align*}
\partial_{+}S\overline{\Omega}&:=\left\{(\xi,v)\in\partial S\overline{\Omega}\ :\ \langle v,\nu_{\xi}\rangle_{\bar{g}}\geq 0\right\},\quad\textrm{and}\\
\partial_{-}S\overline{\Omega}&:=\left\{(\xi,v)\in\partial S\overline{\Omega}\ :\ \langle v,\nu_{\xi}\rangle_{\bar{g}}\leq 0\right\}.
\end{align*}

We also set

\[
\partial_{0}S\overline{\Omega}:=\partial_{+}S\overline{\Omega}\cap\partial_{-}S\overline{\Omega}.
\]

This coincides with $S\partial\Omega$, the unit sphere bundle on $\partial\Omega$.

Next, let $n\in\mathcal{N}_{\lambda,\ell}(\Omega_{0})$. For $(\xi,v)\in\partial_{+}S\overline{\Omega}$, we let $\gamma_{n}(\xi,v,t)=\exp_{n}(\xi,tv)$ denote the unit speed geodesic (with respect to $g_{n}$) starting at $\xi$ with initial direction $v$ at time $t=0$. We define $\tau_{n}(\xi,v)$ to be the time at which $\gamma_{n}(\xi,v,\cdot)$ exits $\overline{\Omega}$. It is known (see [33]) that for simple manifolds, $\tau_{n}$ is a $C^{1}$ function on $\partial_{+}S\overline{\Omega}$, and $\tau_{n}(\xi,v)=0$ if and only if $v\in S_{\xi}\partial\Omega$. We also define $\eta_{n}(\xi,v)$ and $u_{n}(\xi,v)$ as the point and direction at which $\gamma_{n}(\xi,v,\cdot)$ exits $\overline{\Omega}$. In other words,

\begin{align*}
\eta_{n}(\xi,v)&:=\gamma_{n}(\xi,v,\tau_{n}(\xi,v)),\quad\textrm{and}\\
u_{n}(\xi,v)&:=\dot{\gamma}_{n}(\xi,v,\tau_{n}(\xi,v)).
\end{align*}
Lemma 2.4.

Let $n\in\mathcal{N}_{\lambda,\Lambda,\ell,L}(\Omega_{0})$. Then for all $(\xi,v)\in\partial_{+}S\overline{\Omega}$,

\[
\|D_{v}\tau_{n}(\xi,v)\|_{op}\leq L\frac{\tau_{n}(\xi,v)}{\langle\nu,u\rangle_{\bar{g}}}\leq\frac{L\Lambda\operatorname{diam}_{\bar{g}}(\Omega)}{\langle\nu,u\rangle_{\bar{g}}},
\]

where $\nu=\nu_{\eta_{n}(\xi,v)}$ and $u=u_{n}(\xi,v)$.

Proof.

Let $\rho\in C^{1}(\overline{\Omega})$ be such that $\rho^{-1}(0)=\partial\Omega$ and $\rho(x)=\operatorname{dist}_{\bar{g}}(x,\partial\Omega)$ for $x$ near $\partial\Omega$. Consider the function

\[
f(t,v)=\rho(\exp_{n}(\xi,tv)).
\]

Observe that

\[
\frac{\partial f}{\partial t}\Big|_{t=\tau_{n}(\xi,v)}=\left\langle(\operatorname{grad}\rho)_{\eta_{n}(\xi,v)},u_{n}(\xi,v)\right\rangle_{\bar{g}}=\langle\nu,u\rangle_{\bar{g}}.
\]

On the other hand,

\begin{align*}
D_{v}f(t,v)&=D\rho_{\exp_{n}(\xi,tv)}\circ\left(tD_{w}\exp_{n}(\xi,w)\big|_{w=tv}\right)\\
\Rightarrow\ D_{v}f\big|_{(\tau_{n}(\xi,v),v)}&=\tau_{n}(\xi,v)\,\Pi^{\nu}\circ D_{w}\exp_{n}(\xi,w)\big|_{w=\tau_{n}(\xi,v)v},
\end{align*}

where $\Pi^{\nu}$ is the linear map given by

\[
\Pi^{\nu}(w)=\langle\nu,w\rangle_{\bar{g}}\qquad\textrm{for all }w\in T_{\eta_{n}(\xi,v)}\overline{\Omega}.
\]

Now differentiating the identity $f(\tau_{n}(\xi,v),v)=0$ with respect to $v$, we get

\[
\begin{split}0&=\frac{\partial f}{\partial t}\Big|_{(\tau_{n}(\xi,v),v)}D_{v}\tau_{n}(\xi,v)+D_{v}f\big|_{(\tau_{n}(\xi,v),v)}\\ &=\langle\nu,u\rangle_{\bar{g}}D_{v}\tau_{n}(\xi,v)+\tau_{n}(\xi,v)\,\Pi^{\nu}\circ D_{w}\exp_{n}(\xi,w)\big|_{w=\tau_{n}(\xi,v)v}.\end{split}
\]

Therefore,

\begin{align*}
D_{v}\tau_{n}(\xi,v)&=-\frac{\tau_{n}(\xi,v)}{\langle\nu,u\rangle_{\bar{g}}}\Pi^{\nu}\circ D_{w}\exp_{n}(\xi,w)\big|_{w=\tau_{n}(\xi,v)v}\\
\Rightarrow\ \|D_{v}\tau_{n}(\xi,v)\|_{op}&\leq\frac{\tau_{n}(\xi,v)}{\langle\nu,u\rangle_{\bar{g}}}\left\|D_{w}\exp_{n}(\xi,w)\big|_{w=\tau_{n}(\xi,v)v}\right\|_{op}\\
&\leq L\left[\frac{\tau_{n}(\xi,v)}{\langle\nu,u\rangle_{\bar{g}}}\right],
\end{align*}

which is the first inequality of the lemma. The second inequality follows by observing that

\[
\tau_{n}(\xi,v)\leq\operatorname{diam}_{g_{n}}(\Omega)\leq\Lambda\operatorname{diam}_{\bar{g}}(\Omega),
\]

for all $(\xi,v)\in\partial_{+}S\overline{\Omega}$. ∎
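As an illustration of Lemma 2.4, one can test the first inequality in the simplest possible setting. The sketch below assumes the Euclidean unit disk in $\mathbb{R}^{2}$ with $n\equiv 1$, so that geodesics are straight lines and $D_{w}\exp_{n}$ is the identity (hence $L=1$ is used, with equality allowed). The direction $v$ is parametrized by its angle $\theta$ to the inward normal, and the function names are hypothetical. In this toy example the comparison is made against $|\langle\nu,u\rangle_{\bar{g}}|$, since with the inward normal the raw inner product is negative at the exit point.

```python
import math

def exit_time(theta):
    """Exit time from the unit disk of the line starting at xi = (1, 0) with
    unit direction v at angle theta from the inward normal (-1, 0)."""
    xi = (1.0, 0.0)
    v = (-math.cos(theta), math.sin(theta))
    # |xi + t v|^2 = 1 with |xi| = |v| = 1 gives t = -2 <xi, v>
    return -2.0 * (xi[0] * v[0] + xi[1] * v[1])

def lemma_2_4_sides(theta, L=1.0, h=1e-6):
    """Return (|d tau / d theta|, L * tau / |<nu, u>|) for the Euclidean disk."""
    tau = exit_time(theta)
    v = (-math.cos(theta), math.sin(theta))
    eta = (1.0 + tau * v[0], tau * v[1])      # exit point on the unit circle
    nu = (-eta[0], -eta[1])                   # inward unit normal at eta
    nu_dot_u = nu[0] * v[0] + nu[1] * v[1]    # exit direction u equals v here
    # central finite difference in the angle, which is arc length on S_xi
    dtau = (exit_time(theta + h) - exit_time(theta - h)) / (2.0 * h)
    return abs(dtau), L * tau / abs(nu_dot_u)

for k in range(-8, 9):
    theta = k * math.pi / 18.0                # angles strictly inside (-pi/2, pi/2)
    lhs, rhs = lemma_2_4_sides(theta)
    assert lhs <= rhs + 1e-6
```

Here $\tau=2\cos\theta$ and $|\langle\nu,u\rangle|=\cos\theta$, so the right hand side equals $2$ for every $\theta$, while the left hand side is $2|\sin\theta|$.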

We are now ready to prove Theorem 1.3. Recall that the notation $\int_{\gamma}f\,d|g|$ denotes the integral of a function $f$ along the curve $\gamma$ with respect to the arc-length measure induced by $g$.

Proof of Theorem 1.3.

Fix $\xi\in\partial\Omega$, and define the sets

\begin{align*}
B_{1}(\xi)&:=\{\eta\in\partial\Omega\ :\ \Gamma_{n_{1}}(\xi,\eta)\leq\Gamma_{n_{2}}(\xi,\eta)\},\\
B_{2}(\xi)&:=\{\eta\in\partial\Omega\ :\ \Gamma_{n_{2}}(\xi,\eta)\leq\Gamma_{n_{1}}(\xi,\eta)\}.
\end{align*}

Suppose $\eta\in B_{1}(\xi)$, and let $\gamma_{1}(\xi,\eta)$ denote the unit speed geodesic with respect to $g_{n_{1}}$ from $\xi$ to $\eta$. Clearly, $\Gamma_{n_{1}}(\xi,\eta)=\int_{\gamma_{1}(\xi,\eta)}n_{1}\,d|\bar{g}|$, whereas $\Gamma_{n_{2}}(\xi,\eta)\leq\int_{\gamma_{1}(\xi,\eta)}n_{2}\,d|\bar{g}|$. So we have

\[
(\Gamma_{n_{2}}-\Gamma_{n_{1}})(\xi,\eta)\leq\int_{\gamma_{1}(\xi,\eta)}(n_{2}-n_{1})\,d|\bar{g}|=\int_{\gamma_{1}(\xi,\eta)}\frac{(n_{2}-n_{1})}{n_{1}}\,d|g_{n_{1}}|.
\]

This implies

\begin{align*}
(\Gamma_{n_{2}}-\Gamma_{n_{1}})^{2}(\xi,\eta)&\leq\Gamma_{n_{1}}(\xi,\eta)\int_{\gamma_{1}(\xi,\eta)}\frac{(n_{2}-n_{1})^{2}}{n_{1}^{2}}\,d|g_{n_{1}}|\quad\textrm{(by Cauchy-Schwarz)}\\
&=\Gamma_{n_{1}}(\xi,\eta)\int_{0}^{\Gamma_{n_{1}}(\xi,\eta)}\frac{(n_{2}-n_{1})^{2}}{n_{1}^{2}}(\gamma_{1}(\xi,\eta,t))\,dt\\
&\leq\frac{\Gamma_{n_{1}}(\xi,\eta)}{\lambda^{2}}\int_{0}^{\Gamma_{n_{1}}(\xi,\eta)}(n_{2}-n_{1})^{2}(\exp_{n_{1}}(\xi,tv_{n_{1}}(\xi,\eta)))\,dt,
\end{align*}

where $v_{n_{1}}(\xi,\eta)=\dot{\gamma}_{1}(\xi,\eta,0)$, that is, the unit tangent vector at $\xi$ that points towards $\eta$. This implies

\begin{align*}
\int_{B_{1}(\xi)}(\Gamma_{n_{2}}-\Gamma_{n_{1}})^{2}(\xi,\eta)\,d\eta&\leq\frac{\Lambda\operatorname{diam}_{\bar{g}}(\Omega)}{\lambda^{2}}\int_{\partial\Omega}\int_{0}^{\Gamma_{n_{1}}(\xi,\eta)}(n_{2}-n_{1})^{2}(\exp_{n_{1}}(\xi,tv_{n_{1}}(\xi,\eta)))\,dt\,d\eta\\
&=\frac{\Lambda\operatorname{diam}_{\bar{g}}(\Omega)}{\lambda^{2}}\int_{\partial_{+}S_{\xi}\overline{\Omega}}\int_{0}^{\tau_{n_{1}}(\xi,v)}(n_{2}-n_{1})^{2}(\exp_{n_{1}}(\xi,tv))\,|\det[D_{v}\eta_{n_{1}}(\xi,v)]|\,dt\,dv \tag{17}
\end{align*}

by the change of variables formula. (Here, $d\eta$ is the surface measure on $\partial\Omega$ with respect to $\bar{g}$.) We now find an upper bound for $|\det[D_{v}\eta_{n_{1}}]|$ on the support of the integrand. Recall that by definition,

\[
\eta_{n_{1}}(\xi,v)=\exp_{n_{1}}(\xi,\tau_{n_{1}}(\xi,v)v).
\]

With the canonical identification of $T_{v}S_{\xi}\overline{\Omega}$ with a subspace of $T_{\xi}\overline{\Omega}$, we get

\begin{align*}
D_{v}\eta_{n_{1}}(\xi,v)&=D_{w}\exp_{n_{1}}(\xi,w)\big|_{w=\tau_{n_{1}}(\xi,v)v}\circ D_{v}(\tau_{n_{1}}(\xi,v)v)\\
&=D_{w}\exp_{n_{1}}(\xi,w)\big|_{w=\tau_{n_{1}}(\xi,v)v}\circ\big(\tau_{n_{1}}(\xi,v)\,\text{Id}+v\otimes D_{v}\tau_{n_{1}}(\xi,v)\big).
\end{align*}

Here, $v\otimes D_{v}\tau_{n_{1}}(\xi,v)$ should be interpreted as the map

\[
w\in T_{v}S_{\xi}\overline{\Omega}\subseteq T_{\xi}\overline{\Omega}\qquad\mapsto\qquad[D_{v}\tau_{n_{1}}|_{(\xi,v)}(w)]\,v\in T_{\xi}\overline{\Omega}.
\]

So we have

\begin{align*}
\|D_{v}\eta_{n_{1}}(\xi,v)\|_{op}&\leq\left\|D_{w}\exp_{n_{1}}(\xi,w)\big|_{w=\tau_{n_{1}}(\xi,v)v}\right\|_{op}\big(\tau_{n_{1}}(\xi,v)+\|D_{v}\tau_{n_{1}}(\xi,v)\|_{op}\big)\\
&\leq L\left(\Lambda\operatorname{diam}_{\bar{g}}(\Omega)+\frac{L\Lambda\operatorname{diam}_{\bar{g}}(\Omega)}{\left\langle\nu(\eta_{n_{1}}(\xi,v)),u_{n_{1}}(\xi,v)\right\rangle_{\bar{g}}}\right)
\end{align*}

by Lemma 2.4. Now since $\Omega_{0}$ is a relatively compact subset of $\Omega$, there exists an $\varepsilon\in(0,1)$ such that if $\langle\nu(\eta_{n_{1}}(\xi,v)),u_{n_{1}}(\xi,v)\rangle_{\bar{g}}<\varepsilon$, the geodesic $\gamma_{n_{1}}(\xi,v,\cdot)$ lies entirely within $\overline{\Omega}\setminus\Omega_{0}$, and therefore,

\[
(n_{2}-n_{1})^{2}(\exp_{n_{1}}(\xi,tv))=0\qquad\textrm{for all }t\in[0,\tau_{n_{1}}(\xi,v)].
\]

Therefore, on the support of the integrand in the right hand side of (17), we have the bounds

\[
\|D_{v}\eta_{n_{1}}(\xi,v)\|_{op}\leq L\left(\Lambda\operatorname{diam}_{\bar{g}}(\Omega)+\frac{L\Lambda\operatorname{diam}_{\bar{g}}(\Omega)}{\varepsilon}\right)\lesssim_{\Omega,\Omega_{0},\bar{g},L}\Lambda,
\]

and consequently

\[
|\det[D_{v}(\eta_{n_{1}}(\xi,v))]|\lesssim_{\Omega,\Omega_{0},\bar{g},L}\Lambda^{m-1}.
\]

Applying this bound to the right hand side of (17), we get

\begin{align*}
\int_{B_{1}(\xi)}(\Gamma_{n_{1}}-\Gamma_{n_{2}})^{2}(\xi,\eta)\,d\eta&\lesssim\frac{\Lambda^{m}}{\lambda^{2}}\int_{\partial_{+}S_{\xi}\overline{\Omega}}\int_{0}^{\tau_{n_{1}}(\xi,v)}(n_{2}-n_{1})^{2}(\exp_{n_{1}}(\xi,tv))\,dt\,dv\\
&\sim\frac{\Lambda^{m}}{\lambda^{2}}\int_{\operatorname{dom}(\exp_{n_{1}}(\xi,\cdot))}\frac{(n_{2}-n_{1})^{2}(\exp_{n_{1}}(\xi,w))}{|w|_{\bar{g}}^{m-1}}\,dw.
\end{align*}
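The passage to the last line above is a polar-coordinate substitution $w=tv$ in the tangent space $T_{\xi}\overline{\Omega}$. As a brief sketch, identifying $T_{\xi}\overline{\Omega}$ with $\mathbb{R}^{m}$ equipped with the norm $|\cdot|_{\bar{g}}$ and writing $t=|w|_{\bar{g}}$, $v=w/|w|_{\bar{g}}$ (this identification is an assumption made here for illustration), the volume elements are related by

\[
dw=t^{m-1}\,dt\,dv,\qquad\textrm{equivalently}\qquad dt\,dv=|w|_{\bar{g}}^{1-m}\,dw.
\]

The weight $|w|_{\bar{g}}^{1-m}$ in the integrand is exactly this Jacobian factor.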

Again by Remark 1.2, we have $(n_{2}-n_{1})^{2}(\exp_{n_{1}}(\xi,w))=0$ for all $w\in\operatorname{dom}(\exp_{n_{1}}(\xi,\cdot))$ with $|w|_{\bar{g}}\leq\delta$. Therefore, we get

\[
\int_{B_{1}(\xi)}(\Gamma_{n_{1}}-\Gamma_{n_{2}})^{2}(\xi,\eta)\,d\eta\lesssim\frac{\Lambda^{m}}{\lambda^{2}\delta^{m-1}}\int_{\operatorname{dom}(\exp_{n_{1}}(\xi,\cdot))}(n_{2}-n_{1})^{2}(\exp_{n_{1}}(\xi,w))\,dw.
\]

We now make the change of variable $x=\exp_{n_{1}}(\xi,w)$. The assumption that $D_{w}\exp_{n_{1}}(\xi,w)\succ\ell$ implies that the inverse $w_{n_{1}}(\xi,\cdot)$ of $\exp_{n_{1}}(\xi,\cdot)$ satisfies $\|D_{x}w_{n_{1}}(\xi,x)\|_{op}<\ell^{-1}$, and consequently,

\[
|\det(D_{x}w_{n_{1}}(\xi,x))|<\ell^{-m}.
\]

Therefore,

\begin{align*}
\int_{B_{1}(\xi)}(\Gamma_{n_{1}}-\Gamma_{n_{2}})^{2}(\xi,\eta)\,d\eta&\lesssim\frac{\Lambda^{m}}{\lambda^{2}}\int_{\Omega}(n_{2}-n_{1})^{2}(x)\,|\det(D_{x}w_{n_{1}}(\xi,x))|\,d\operatorname{Vol}_{\bar{g}}(x)\\
&\lesssim\frac{\Lambda^{m}}{\lambda^{2}\ell^{m}}\int_{\Omega}(n_{2}-n_{1})^{2}(x)\,d\operatorname{Vol}_{\bar{g}}(x).
\end{align*}

By analogous arguments, we also have

\[
\int_{B_{2}(\xi)}(\Gamma_{n_{1}}-\Gamma_{n_{2}})^{2}(\xi,\eta)\,d\eta\lesssim\frac{\Lambda^{m}}{\lambda^{2}\ell^{m}}\int_{\Omega}(n_{2}-n_{1})^{2}(x)\,d\operatorname{Vol}_{\bar{g}}(x).
\]

Adding the last two inequalities, we get

\begin{align*}
\int_{\partial\Omega}(\Gamma_{n_{1}}-\Gamma_{n_{2}})^{2}(\xi,\eta)\,d\eta&\lesssim\frac{\Lambda^{m}}{\lambda^{2}\ell^{m}}\|n_{1}-n_{2}\|^{2}_{L^{2}(\Omega)}\\
\Rightarrow\ \int_{\partial\Omega}\int_{\partial\Omega}(\Gamma_{n_{1}}-\Gamma_{n_{2}})^{2}(\xi,\eta)\,d\eta\,d\xi&\lesssim\frac{\Lambda^{m}}{\lambda^{2}\ell^{m}}\|n_{1}-n_{2}\|^{2}_{L^{2}(\Omega)}\\
\Rightarrow\ \|\Gamma_{n_{1}}-\Gamma_{n_{2}}\|_{L^{2}(\partial\Omega\times\partial\Omega)}&\lesssim_{\Omega,\Omega_{0},\bar{g},\ell,L}\frac{\Lambda^{m/2}}{\lambda}\|n_{1}-n_{2}\|_{L^{2}(\Omega)}.
\end{align*}

This completes the proof. ∎

Next, we derive the analogous continuity estimate for the map $c\mapsto Z_{c}$. The key step is to show that for any $M>0$, the operator norm of the derivative of $\exp_{n_{c}}(x,v)$ is uniformly bounded for all $c\in\mathcal{C}^{3}_{\ell,M}(\Omega_{0})$ and $(x,v)\in\operatorname{dom}(\exp_{n_{c}})$. We begin with a simple lemma.

Lemma 2.5.

Let $(\mathcal{M},g)$ be a Riemannian manifold whose curvature tensor $R$ satisfies

\[
\|R\|=\sup\left\{|R(u,v)w|_{g}:u,v,w\in S\mathcal{M}\right\}<\infty.
\]

Then any Jacobi field $J$ along a unit speed geodesic $\gamma:[0,T]\to\mathcal{M}$ satisfies the norm bounds

\[
|J(t)|_{g}^{2}+|\dot{J}(t)|_{g}^{2}\leq e^{(1+\|R\|)t}\left(|J(0)|_{g}^{2}+|\dot{J}(0)|_{g}^{2}\right)\qquad\textrm{for all }t\in[0,T].
\]
Proof.

Set $f(t)=|J(t)|_{g}^{2}+|\dot{J}(t)|_{g}^{2}$. Since $J$ is a Jacobi field, it satisfies the equation

\[
\ddot{J}(t)+R(J(t),\dot{\gamma}(t))\dot{\gamma}(t)=0.
\]

Therefore,

\begin{align*}
f^{\prime}(t)&=2\langle J(t),\dot{J}(t)\rangle_{g}+2\langle\dot{J}(t),\ddot{J}(t)\rangle_{g}\\
&=2\langle J,\dot{J}\rangle_{g}+2\langle\dot{J},-R(J,\dot{\gamma})\dot{\gamma}\rangle_{g}\\
&\leq 2|J|_{g}|\dot{J}|_{g}+2|\dot{J}|_{g}\|R\|\,|J|_{g}|\dot{\gamma}|_{g}^{2}\\
&\leq(1+\|R\|)f(t).
\end{align*}

So it follows from Grönwall's inequality that

\[
f(t)\leq e^{(1+\|R\|)t}f(0)\qquad\textrm{for all }t\in[0,T].
\]

∎
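The bound of Lemma 2.5 can be illustrated on a toy model. The sketch below assumes a surface of constant Gaussian curvature $K$, where a normal Jacobi field along a unit speed geodesic reduces to the scalar equation $J''+KJ=0$, and it takes $\|R\|=|K|$; both are assumptions of this illustration, and the function names are hypothetical. The closed-form solutions are used instead of a numerical integrator.

```python
import math

def jacobi_energy(K, j0, dj0, t):
    """|J(t)|^2 + |J'(t)|^2 for the scalar Jacobi equation J'' + K J = 0."""
    if K > 0:
        w = math.sqrt(K)
        j = j0 * math.cos(w * t) + dj0 / w * math.sin(w * t)
        dj = -j0 * w * math.sin(w * t) + dj0 * math.cos(w * t)
    elif K < 0:
        w = math.sqrt(-K)
        j = j0 * math.cosh(w * t) + dj0 / w * math.sinh(w * t)
        dj = j0 * w * math.sinh(w * t) + dj0 * math.cosh(w * t)
    else:
        j, dj = j0 + dj0 * t, dj0
    return j * j + dj * dj

def satisfies_lemma_2_5(K, j0, dj0, T=3.0, steps=300):
    """Check f(t) <= exp((1 + |K|) t) f(0) on a uniform grid of [0, T]."""
    f0 = jacobi_energy(K, j0, dj0, 0.0)
    for i in range(steps + 1):
        t = T * i / steps
        bound = math.exp((1.0 + abs(K)) * t) * f0
        if jacobi_energy(K, j0, dj0, t) > bound * (1.0 + 1e-12):
            return False
    return True

print(satisfies_lemma_2_5(1.0, 1.0, 0.0), satisfies_lemma_2_5(-1.0, 0.0, 1.0))
```

For $K=-1$ with $J(0)=0$, $\dot{J}(0)=1$, the energy is $\cosh(2t)$, which stays below $e^{2t}$ as the lemma predicts.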

Next, let us recall the definition of the canonical metric on the tangent bundle of a Riemannian manifold, also called the Sasaki metric. Let $(\mathcal{M},g)$ be a Riemannian manifold, $(x,w)\in T\mathcal{M}$, and $V_{1},V_{2}\in T_{(x,w)}T\mathcal{M}$. Then we may choose curves $\alpha_{j}(s)=(\sigma_{j}(s),v_{j}(s))$ in $T\mathcal{M}$, defined on $(-\varepsilon,\varepsilon)$, such that

\[
\alpha_{j}(0)=(x,w),\qquad\dot{\alpha}_{j}(0)=V_{j},\qquad\textrm{for }j=1,2.
\]

The inner product of $V_{1},V_{2}$ with respect to the Sasaki metric is defined to be

\[
\langle V_{1},V_{2}\rangle_{g}:=\langle\dot{\sigma}_{1}(0),\dot{\sigma}_{2}(0)\rangle_{g}+\langle\dot{v}_{1}(0),\dot{v}_{2}(0)\rangle_{g},
\]

where $\dot{v}_{j}(s)$ represents the covariant derivative of $v_{j}(s)$ along the curve $\sigma_{j}(s)$. Note that we are using the same notation for the Sasaki metric as for the original metric $g$. Now, for any $C^{1}$ map $F:T\mathcal{M}\to\mathcal{M}$, the operator norm of the total derivative of $F$ at $(x,w)\in T\mathcal{M}$ is given by

\[
\|DF(x,w)\|_{op}:=\sup\{|DF(x,w)(V)|_{g}\ :\ V\in T_{(x,w)}T\mathcal{M},\,|V|_{g}=1\}.
\]

We will show that if $c\in\mathcal{C}^{3}_{\ell,M}(\Omega_{0})$, the total derivative of $\exp_{n_{c}}$ is bounded above in the operator norm.

Proposition 2.6.

For any $M>0$, there exists $L=L(M)>0$ such that for all $c\in\mathcal{C}^{3}_{\ell,M}(\Omega_{0})$, the total derivative of the exponential map of $g_{n_{c}}$ satisfies

\[
\|D\exp_{n_{c}}(x,w)\|_{op}<L
\]

for all $x\in\overline{\Omega}$ and $w\in\operatorname{dom}(\exp_{n_{c}}(x,\cdot))$. In particular, $n_{c}\in\mathcal{N}_{\lambda,\Lambda,\ell,L}(\Omega_{0})$.

Proof.

Suppose $c\in\mathcal{C}^{3}_{\ell,M}(\Omega_{0})$. Fix $(x,w)\in\operatorname{dom}(\exp_{n_{c}})$, and let $V\in T_{(x,w)}T\overline{\Omega}$. It suffices to show that

\[
|D\exp_{n_{c}}(x,w)(V)|_{\bar{g}}<L|V|_{\bar{g}}.
\]

Choose a curve $\alpha(s)=(\sigma(s),v(s))$ in $T\overline{\Omega}$, defined on $(-\varepsilon,\varepsilon)$, such that $\alpha(0)=(x,w)$ and $\dot{\alpha}(0)=V$. Consider the family of geodesics $\Phi:(-\varepsilon,\varepsilon)\times[0,1]\to\overline{\Omega}$ defined by

\[
\Phi(s,t)=\exp_{n_{c}}(\sigma(s),tv(s)).
\]

The variation field of this family of geodesics is

\[
J(t):=\partial_{s}\exp_{n_{c}}(\sigma(s),tv(s))\big|_{s=0},
\]

which is a Jacobi field along $\gamma(t):=\Phi(0,t)$. Observe that

\[
J(1)=\partial_{s}\exp_{n_{c}}(\sigma(s),v(s))\big|_{s=0}=D\exp_{n_{c}}(x,w)(V),
\]

which is precisely the quantity whose norm we want to estimate.

Let $R$ be the Riemann curvature tensor of $(\overline{\Omega},g_{n_{c}})$, and let $R^{i}_{jkl}$ denote its tensor coefficients with respect to a fixed global coordinate chart on $\overline{\Omega}$. Then we have

\[
R^{i}_{jkl}=\partial_{k}\Gamma^{i}_{lj}-\partial_{l}\Gamma^{i}_{kj}+\Gamma^{i}_{km}\Gamma^{m}_{lj}-\Gamma^{i}_{lm}\Gamma^{m}_{kj},
\]

where

\[
\Gamma^{l}_{jk}=\frac{1}{2}n_{c}^{-2}\bar{g}^{lm}\left(\partial_{j}(n_{c}^{2}\bar{g}_{km})+\partial_{k}(n_{c}^{2}\bar{g}_{jm})-\partial_{m}(n_{c}^{2}\bar{g}_{jk})\right).
\]

This implies that for any $x\in\overline{\Omega}$,

\[
\max_{ijkl}|R^{i}_{jkl}(x)|\lesssim_{\bar{g}}1+n_{c}(x)^{-2}\|n_{c}\|^{2}_{C^{2}}\lesssim e^{4M}(1+M)^{4}.
\]
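To see why only derivatives of $n_{c}$ up to second order enter this bound, it may help to expand the Christoffel symbols displayed above with the product rule. Writing $\bar{\Gamma}^{l}_{jk}$ for the Christoffel symbols of $\bar{g}$ (a notation introduced here for illustration), one obtains the standard formula for a conformal change of metric:

\[
\Gamma^{l}_{jk}=\bar{\Gamma}^{l}_{jk}+\delta^{l}_{j}\,\partial_{k}\log n_{c}+\delta^{l}_{k}\,\partial_{j}\log n_{c}-\bar{g}_{jk}\,\bar{g}^{lm}\,\partial_{m}\log n_{c}.
\]

Substituting this into the expression for $R^{i}_{jkl}$ shows that the curvature coefficients involve the second derivatives of $\log n_{c}$ linearly and its first derivatives at most quadratically, together with terms depending only on $\bar{g}$.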

Therefore, for any $x\in\overline{\Omega}$ and unit tangent vectors $u,v,w\in S_{x}\Omega$,

\begin{align*}
|R(u,v)w|_{g_{n_{c}}}&\lesssim n_{c}(x)\left(\max_{ijkl}|R^{i}_{jkl}(x)u^{j}v^{k}w^{l}|\right)\lesssim e^{5M}(1+M)^{4}\\
\Rightarrow\ \|R\|&\leq Ce^{5M}(1+M)^{4}
\end{align*}

for some $C>0$. Taking $L^{2}>\exp(1+Ce^{5M}(1+M)^{4})$ and applying Lemma 2.5, we get

\begin{align*}
|D\exp_{n_{c}}(x,w)(V)|_{g_{n_{c}}}^{2}=|J(1)|_{g_{n_{c}}}^{2}&<L^{2}\left(|J(0)|_{g_{n_{c}}}^{2}+|\dot{J}(0)|_{g_{n_{c}}}^{2}\right)\\
&=L^{2}\left(|\dot{\sigma}(0)|^{2}+\left|\dot{v}(0)\right|^{2}\right)=L^{2}|V|_{\bar{g}}^{2}.
\end{align*}

This completes the proof. ∎

Corollary 2.7.

There exists a constant $C_{2}^{\prime}=C_{2}^{\prime}(\Omega,\Omega_{0},\bar{g},\ell,M)>0$ such that for all $c_{1},c_{2}\in\mathcal{C}^{3}_{\ell,M}(\Omega_{0})$,

\[
\|Z_{c_{1}}-Z_{c_{2}}\|_{L^{2}(\partial\Omega\times\partial\Omega)}\leq C_{2}^{\prime}\|c_{1}-c_{2}\|_{L^{2}(\Omega_{0})}.
\]
Proof.

We know from Theorem 1.3, Proposition 2.6, and equation (15) that

\[
\|\Gamma_{c_{1}}-\Gamma_{c_{2}}\|_{L^{2}(\partial\Omega\times\partial\Omega)}\lesssim_{\Omega,\Omega_{0},\bar{g},\ell,M}\|c_{1}-c_{2}\|_{L^{2}(\Omega_{0})}.
\]

Now consider

\[
\|\Gamma_{c_{1}}-\Gamma_{c_{2}}\|_{L^{2}(\partial\Omega\times\partial\Omega)}^{2}=\int_{\partial\Omega\times\partial\Omega}\left|e^{Z_{c_{1}}}-e^{Z_{c_{2}}}\right|^{2}\,d\xi\,d\eta.
\]

Recall that there exists $\delta>0$ such that $Z_{c_{1}}(\xi,\eta)=Z_{c_{2}}(\xi,\eta)$ whenever $\operatorname{dist}_{\bar{g}}(\xi,\eta)<\delta$. On the set $\{\operatorname{dist}_{\bar{g}}(\xi,\eta)\geq\delta\}$,

\begin{align*}
e^{-M}\delta&\leq\Gamma_{c_{j}}(\xi,\eta)\leq e^{M}\operatorname{diam}_{\bar{g}}(\Omega)\\
\Rightarrow\ -M+\log\delta&\leq Z_{c_{j}}(\xi,\eta)\leq M+\log|\operatorname{diam}_{\bar{g}}(\Omega)|. \tag{18}
\end{align*}

So by (14), applied with $M_{1}=-M+\log\delta$,

\[
|e^{Z_{c_{1}}(\xi,\eta)}-e^{Z_{c_{2}}(\xi,\eta)}|\geq e^{-M}\delta\,|Z_{c_{1}}(\xi,\eta)-Z_{c_{2}}(\xi,\eta)|
\]

for all $(\xi,\eta)\in\partial\Omega\times\partial\Omega$. Consequently,

\[
\|\Gamma_{c_{1}}-\Gamma_{c_{2}}\|^{2}_{L^{2}(\partial\Omega\times\partial\Omega)}=\int\left|e^{Z_{c_{1}}}-e^{Z_{c_{2}}}\right|^{2}\,d\xi\,d\eta\geq e^{-2M}\delta^{2}\int|Z_{c_{1}}-Z_{c_{2}}|^{2}\,d\xi\,d\eta.
\]

So we conclude that

\[
\|Z_{c_{1}}-Z_{c_{2}}\|_{L^{2}}\lesssim\|\Gamma_{c_{1}}-\Gamma_{c_{2}}\|_{L^{2}}\lesssim\|c_{1}-c_{2}\|_{L^{2}}.
\]

∎

We conclude this section with a technical result that will be necessary for the proof of Theorem 3.7 in Section 3.

Theorem 2.8.

Given $M>0$, there exists a constant $C_{3}^{\prime}=C_{3}^{\prime}(\Omega,\Omega_{0},\bar{g},\ell,M)>0$ such that for all $c_{1},c_{2}\in\mathcal{C}^{3}_{\ell,M}(\Omega_{0})$,

\[
\|Z_{c_{1}}-Z_{c_{2}}\|_{H^{2}(\partial\Omega\times\partial\Omega)}\leq C_{3}^{\prime}.
\]
Proof.

We know from Corollary 2.7 that

\[
\|Z_{c_{1}}-Z_{c_{2}}\|_{L^{2}}\lesssim\|c_{1}-c_{2}\|_{L^{2}}\lesssim 2M.
\]

Next, let $\xi,\eta\in\partial\Omega$. It follows from Remark 1.2 that if $\operatorname{dist}_{\bar{g}}(\xi,\eta)<\delta$, then $Z_{c_{1}}-Z_{c_{2}}$ and all its derivatives are identically $0$ in a neighborhood of $(\xi,\eta)$. On the other hand, if $\operatorname{dist}_{\bar{g}}(\xi,\eta)\geq\delta$, Lemma 2.2 implies

\[
|d_{\xi}(Z_{c_{1}}-Z_{c_{2}})(\xi,\eta)|_{\bar{g}}\leq\frac{|d_{\xi}\Gamma_{c_{1}}(\xi,\eta)|_{\bar{g}}}{\Gamma_{c_{1}}(\xi,\eta)}+\frac{|d_{\xi}\Gamma_{c_{2}}(\xi,\eta)|_{\bar{g}}}{\Gamma_{c_{2}}(\xi,\eta)}\lesssim\frac{e^{M}}{\delta}.
\]

This shows that $\|d_{\xi}(Z_{c_{1}}-Z_{c_{2}})\|_{L^{2}}$ is uniformly bounded for $c_{1},c_{2}\in\mathcal{C}^{3}_{\ell,M}(\Omega_{0})$. By symmetry, $\|d_{\eta}(Z_{c_{1}}-Z_{c_{2}})\|_{L^{2}}$ is also uniformly bounded.

So it only remains to consider the Hessian tensor of $Z_{c_{1}}-Z_{c_{2}}$. Let $\nabla$ denote the Levi-Civita connection on $\partial\Omega_{\xi}\times\partial\Omega_{\eta}$, and let $\pi^{\xi}:\partial\Omega_{\xi}\times\partial\Omega_{\eta}\to\partial\Omega_{\xi}$ and $\pi^{\eta}:\partial\Omega_{\xi}\times\partial\Omega_{\eta}\to\partial\Omega_{\eta}$ denote the canonical projection maps. We may decompose $\nabla$ as $\nabla^{\xi}+\nabla^{\eta}$, where $\nabla^{\xi}$ and $\nabla^{\eta}$ are the covariant derivative operations with respect to $\xi$ and $\eta$ respectively. More precisely, given any tensor field $F$ on $\partial\Omega_{\xi}\times\partial\Omega_{\eta}$, and any tangent vector $v\in T(\partial\Omega_{\xi}\times\partial\Omega_{\eta})$, we have

\[
\nabla^{\xi}_{v}F=\nabla_{(\pi^{\xi})_{*}v_{\xi}}F,\qquad\nabla^{\eta}_{v}F=\nabla_{(\pi^{\eta})_{*}v_{\eta}}F,
\]

where $(v_{\xi},v_{\eta})$ is the image of $v$ under the canonical isomorphism from $T(\partial\Omega_{\xi}\times\partial\Omega_{\eta})$ to $(T\partial\Omega_{\xi})\times(T\partial\Omega_{\eta})$. Correspondingly, the Hessian operator on $\partial\Omega_{\xi}\times\partial\Omega_{\eta}$ can be decomposed as

\begin{align*}
\operatorname{Hess}=\nabla^{2}&=(\nabla^{\xi}+\nabla^{\eta})(\nabla^{\xi}+\nabla^{\eta})\\
&=\nabla^{\xi}\nabla^{\xi}+\nabla^{\xi}\nabla^{\eta}+\nabla^{\eta}\nabla^{\xi}+\nabla^{\eta}\nabla^{\eta}\\
&=\operatorname{Hess}_{\xi}+\nabla^{\xi}\nabla^{\eta}+\nabla^{\eta}\nabla^{\xi}+\operatorname{Hess}_{\eta},
\end{align*}

where $\operatorname{Hess}_{\xi}$ and $\operatorname{Hess}_{\eta}$ are the Hessian operators with respect to $\xi$ and $\eta$ respectively. Now let $\xi,\eta\in\partial\Omega$ be such that $\operatorname{dist}_{\bar{g}}(\xi,\eta)\geq\delta$. Then for $j=1,2$,

\begin{align*}
\nabla^{\xi}\nabla^{\eta}Z_{c_{j}}(\xi,\eta)&=\nabla^{\xi}\nabla^{\eta}\log\Gamma_{c_{j}}(\xi,\eta)\\
&=\left(\frac{\nabla^{\xi}\nabla^{\eta}\Gamma_{c_{j}}}{\Gamma_{c_{j}}}-\frac{d_{\xi}\Gamma_{c_{j}}\otimes d_{\eta}\Gamma_{c_{j}}}{\Gamma_{c_{j}}^{2}}\right)(\xi,\eta).
\end{align*}

By Lemma 2.2, this implies

\begin{align*}
|\nabla^{\xi}\nabla^{\eta}Z_{c_{j}}(\xi,\eta)|_{\bar{g}}&\leq\frac{|\nabla^{\xi}\nabla^{\eta}\Gamma_{c_{j}}(\xi,\eta)|_{\bar{g}}}{\Gamma_{c_{j}}(\xi,\eta)}+\frac{|d_{\xi}\Gamma_{c_{j}}(\xi,\eta)|_{\bar{g}}\,|d_{\eta}\Gamma_{c_{j}}(\xi,\eta)|_{\bar{g}}}{\Gamma_{c_{j}}^{2}(\xi,\eta)}\\
&\lesssim\frac{1+\ell^{-1}}{\lambda\delta^{2}}+\frac{1}{\delta^{2}}.
\end{align*}

This implies that ξη(Zc1Zc2)L2\|\nabla^{\xi}\nabla^{\eta}(Z_{c_{1}}-Z_{c_{2}})\|_{L^{2}} is uniformly bounded as well. Finally, consider the fact [43] that

HessξΓcj(ξ,η)=(Dwexpcj(ξ,w(ξ,η)))1(Dξexpcj(ξ,w(ξ,η))),\operatorname{Hess}_{\xi}\Gamma_{c_{j}}(\xi,\eta)=(D_{w}\exp_{c_{j}}(\xi,w(\xi,\eta)))^{-1}(D_{\xi}\exp_{c_{j}}(\xi,w(\xi,\eta))),

where w(ξ,)w(\xi,\cdot) is the inverse of expcj(ξ,)\exp_{c_{j}}(\xi,\cdot) as in Lemma 2.2. Therefore, by Proposition 2.6,

|HessξΓcj(ξ,η)|g¯1L(M).|\operatorname{Hess}_{\xi}\Gamma_{c_{j}}(\xi,\eta)|_{\bar{g}}\lesssim\ell^{-1}L(M).

Writing Zcj=logΓcjZ_{c_{j}}=\log\Gamma_{c_{j}}, we get

HessξZcj(ξ,η)\displaystyle\operatorname{Hess}_{\xi}Z_{c_{j}}(\xi,\eta) =HessξlogΓcj(ξ,η)\displaystyle=\operatorname{Hess}_{\xi}\log\Gamma_{c_{j}}(\xi,\eta)
=(HessξΓcjΓcjdξΓcjdξΓcjΓcj2)(ξ,η),\displaystyle=\left(\frac{\operatorname{Hess}_{\xi}\Gamma_{c_{j}}}{\Gamma_{c_{j}}}-\frac{d_{\xi}\Gamma_{c_{j}}\otimes d_{\xi}\Gamma_{c_{j}}}{\Gamma_{c_{j}}^{2}}\right)(\xi,\eta),

which implies

\begin{align*}
|\operatorname{Hess}_{\xi}Z_{c_{j}}(\xi,\eta)|_{\bar{g}} &\leq\frac{|\operatorname{Hess}_{\xi}\Gamma_{c_{j}}(\xi,\eta)|_{\bar{g}}}{\Gamma_{c_{j}}(\xi,\eta)}+\frac{|d_{\xi}\Gamma_{c_{j}}(\xi,\eta)|_{\bar{g}}^{2}}{\Gamma_{c_{j}}^{2}(\xi,\eta)}\\
&\lesssim\frac{\ell^{-1}L}{\lambda\delta^{2}}+\frac{1}{\delta^{2}}.
\end{align*}

So we conclude that Hessξ(Zc1Zc2)L2\|\operatorname{Hess}_{\xi}(Z_{c_{1}}-Z_{c_{2}})\|_{L^{2}}, and by similar arguments, Hessη(Zc1Zc2)L2\|\operatorname{Hess}_{\eta}(Z_{c_{1}}-Z_{c_{2}})\|_{L^{2}}, are both uniformly bounded on 𝒞,Mβ(Ω0)\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}) as well. This proves the result. ∎

3. Statistical Inversion through the Bayesian framework

As discussed in the Introduction, we will be using the posterior mean of cc given finitely many measurements 𝒟N=(Xi,Yi,Zi)i=1N\mathcal{D}_{N}=(X_{i},Y_{i},Z_{i})_{i=1}^{N}, as an estimator for the true metric parameter c0c_{0}. Let us begin by describing the prior distribution Π\Pi for cC03(Ω0)c\in C^{3}_{0}(\Omega_{0}). We will assume that Π\Pi arises from a centered Gaussian probability distribution Π~\widetilde{\Pi} on the Banach space C(Ω¯0)C(\overline{\Omega}_{0}) that satisfies the following conditions.

Condition 3.1.

Let β3\beta\geq 3 and α>β+m2\alpha>\beta+\frac{m}{2}. We assume that Π~\widetilde{\Pi} is a centered Gaussian Borel probability measure on C(Ω¯0)C(\overline{\Omega}_{0}) that is supported in a separable subspace of C0β(Ω0)C^{\beta}_{0}(\Omega_{0}). Moreover, its Reproducing Kernel Hilbert space (RKHS) (,)(\mathcal{H},\|\cdot\|_{\mathcal{H}}) must be continuously embedded in the Sobolev space Hα(Ω0)H^{\alpha}(\Omega_{0}).

We refer the reader to [14, Chapter 11] or [15, Sections 2.1 and 2.6] for basic facts about Gaussian probability measures and their Reproducing Kernel Hilbert Spaces.

We now define the prior Π\Pi to be the restriction of Π~\widetilde{\Pi} to 𝒞,Mβ(Ω0)\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}) in the sense that

(19) Π(A)=Π~(A𝒞,Mβ(Ω0))Π~(𝒞,Mβ(Ω0))\Pi(A)=\frac{\widetilde{\Pi}\left(A\cap\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})\right)}{\widetilde{\Pi}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}))}

for all Borel sets $A\subseteq C^{3}_{0}(\Omega_{0})$. We will see in Lemma 3.5 that $C^{\beta}$-balls have positive $\widetilde{\Pi}$-measure. This, together with the fact that $\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})$ is an open subset of $C_{0}^{\beta}(\Omega_{0})$ (cf. Remark 1.4), implies that $\widetilde{\Pi}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}))>0$. Therefore, (19) yields a well-defined probability distribution on $C^{3}_{0}(\Omega_{0})$.
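
To make the renormalization in (19) concrete, the following Python sketch uses a deliberately simple stand-in: a standard Gaussian on the real line plays the role of $\widetilde{\Pi}$, and the interval $(-M,M)$ plays the role of $\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})$. Both choices, as well as the function names, are illustrative assumptions and not part of the construction in this paper. Draws outside the constraint set are discarded, which mirrors the conditioning in (19).

```python
import math
import random

def truncated_prior_prob(a, b, M, n_samples=200_000, seed=0):
    """Monte Carlo estimate of Pi((a, b)) for a standard Gaussian restricted to (-M, M)."""
    rng = random.Random(seed)
    in_set = 0   # draws that land in the constraint set (-M, M)
    in_both = 0  # draws that land in (a, b) and in the constraint set
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)
        if -M < x < M:
            in_set += 1
            if a < x < b:
                in_both += 1
    return in_both / in_set

def exact_truncated_prob(a, b, M):
    """Same quantity from the Gaussian CDF: Pi~(A ∩ C) / Pi~(C) in the toy setting."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    lo, hi = max(a, -M), min(b, M)
    return max(cdf(hi) - cdf(lo), 0.0) / (cdf(M) - cdf(-M))

print(truncated_prior_prob(0.0, 1.0, 2.0), exact_truncated_prob(0.0, 1.0, 2.0))
```

The denominator is positive here because the interval has positive Gaussian mass; in the paper the analogous fact is the positivity of $\widetilde{\Pi}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}))$.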

Theorem 3.1.

Let Π\Pi be a prior distribution on C03(Ω0)C^{3}_{0}(\Omega_{0}) defined by (19). Assume that the true parameter c0𝒞,Mβ(Ω0)c_{0}\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})\cap\mathcal{H}, and let c¯N\overline{c}_{N} be the mean (6) of the posterior distribution Π(|𝒟N)\Pi(\cdot|\mathcal{D}_{N}) arising from observations (5). Then there exists ω(0,1/4)\omega\in(0,1/4) such that

Pc0N(c¯Nc0L2(Ω0)>Nω)0as N.P^{N}_{c_{0}}\left(\|\overline{c}_{N}-c_{0}\|_{L^{2}(\Omega_{0})}>N^{-\omega}\right)\to 0\qquad\textrm{as }N\to\infty.

Moreover, ω\omega can be made arbitrarily close to 1/41/4 for α\alpha, β\beta large enough.
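
As a rough illustration of the estimator in Theorem 3.1, the sketch below simulates the observation model $Z_i = Z_{c_0}(X_i,Y_i)+\epsilon_i$ with $\epsilon_i\sim N(0,1)$ and computes a posterior mean. Everything specific in it is an assumption made for illustration only: the unknown is a single scalar $c$, the domain is the unit disk, the forward map `toy_forward` (log chord length shifted by $c$, clamped below by a hypothetical cutoff `DELTA`) is a stand-in for $Z_c$, and the prior is flat on $[-M,M]$ instead of a restricted Gaussian process prior. It does not reproduce the infinite-dimensional setting of the theorem.

```python
import math
import random

DELTA = 0.1  # hypothetical lower cutoff on boundary distances, keeps the logarithm finite

def toy_forward(c, x, y):
    """Hypothetical forward map: log chord length between angles x, y on the unit circle, shifted by c."""
    chord = 2.0 * abs(math.sin(0.5 * (x - y)))
    return c + math.log(max(chord, DELTA))

def simulate_data(c0, n, rng):
    """Draw n observations (x, y, z): uniform angles x, y and z = toy_forward(c0, x, y) + N(0, 1) noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 2.0 * math.pi)
        y = rng.uniform(0.0, 2.0 * math.pi)
        data.append((x, y, toy_forward(c0, x, y) + rng.gauss(0.0, 1.0)))
    return data

def posterior_mean(data, M=1.0, grid_size=401):
    """Posterior mean of c under a flat prior on [-M, M], approximated on a uniform grid."""
    grid = [-M + 2.0 * M * k / (grid_size - 1) for k in range(grid_size)]
    # Gaussian log-likelihood up to an additive constant that cancels after normalization.
    log_lik = [-0.5 * sum((z - toy_forward(c, x, y)) ** 2 for x, y, z in data) for c in grid]
    top = max(log_lik)  # subtract the maximum before exponentiating to avoid underflow
    weights = [math.exp(v - top) for v in log_lik]
    return sum(c * w for c, w in zip(grid, weights)) / sum(weights)

rng = random.Random(1)
data = simulate_data(0.3, 1000, rng)
estimate = posterior_mean(data)
print(estimate)
```

In this one-dimensional toy problem the posterior concentrates at the parametric rate, so it says nothing about the exponent $\omega$; it only shows how the likelihood, the restricted prior, and the posterior mean fit together.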

Remark 3.1.

The assumption that c0𝒞,Mβ(Ω0)c_{0}\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})\cap\mathcal{H} is weaker than in Theorem 1.5, where we assumed that c0c_{0} is smooth, compactly supported in Ω0\Omega_{0}, and that gnc0g_{n_{c_{0}}} is simple. Indeed, if gnc0g_{n_{c_{0}}} is a smooth simple metric, c0c_{0} necessarily belongs to 𝒞,Mβ(Ω0)\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}) for appropriate values of ,M\ell,M, and any β\beta. Moreover, given any c0H0α(Ω0)c_{0}\in H^{\alpha}_{0}(\Omega_{0}), it is possible to choose Π~\widetilde{\Pi} so that its RKHS \mathcal{H} contains c0c_{0}. Indeed, let (f(x):xΩ0)(f(x):x\in\Omega_{0}) be the so-called Matérn-Whittle process of regularity α\alpha (see [14, Example 11.8]), whose corresponding RKHS is Hα(Ω0)H^{\alpha}(\Omega_{0}). It follows from Lemma I.4 in [14] that the sample paths of this process belong almost surely to Cβ(Ω¯0)C^{\beta}(\overline{\Omega}_{0}). Now choose a cut-off function φC(Ω¯0)\varphi\in C^{\infty}(\overline{\Omega}_{0}) such that φ>0\varphi>0 on Ω0\Omega_{0}, φ\varphi and all its partial derivatives vanish on Ω0\partial\Omega_{0}, and φ1c0Hα(Ω0)\varphi^{-1}c_{0}\in H^{\alpha}(\Omega_{0}). Define Π~\widetilde{\Pi} to be the probability law of (φ(x)f(x):xΩ0)(\varphi(x)f(x):x\in\Omega_{0}). Then ={φf:fHα(Ω0)}\mathcal{H}=\left\{\varphi f:f\in H^{\alpha}(\Omega_{0})\right\}, which contains c0c_{0}. Therefore, Theorem 3.1 is a more general and precise version of Theorem 1.5.

3.1. A General Contraction Theorem

Our proof of Theorem 3.1 will follow the same general strategy as in [23], with some modifications necessitated by the fact that our prior $\Pi$ is not in itself a Gaussian probability measure, but rather the restriction of such a measure to $\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})$. We begin with a general posterior contraction result (Theorem 3.2). This is a simplified version of [23, Theorem 5.13], which suffices for us since our prior $\Pi$ is independent of $N$. Before stating the result, we need to introduce some notation. Recall that for $c\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})$, we defined $p_{c}$ as the probability density function

pc(x,y,z)=12πexp{12(zZc(x,y))2}for all (x,y,z)𝒳,p_{c}(x,y,z)=\frac{1}{\sqrt{2\pi}}\exp\left\{-\frac{1}{2}(z-Z_{c}(x,y))^{2}\right\}\qquad\textrm{for all }(x,y,z)\in\mathcal{X},

where 𝒳=Ω×Ω×\mathcal{X}=\partial\Omega\times\partial\Omega\times\mathbb{R}. Given c1,c2𝒞,Mβ(Ω0)c_{1},c_{2}\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}), let

h(c1,c2):=(𝒳(pc1pc2)2𝑑μ(x,y)𝑑z)1/2h(c_{1},c_{2}):=\left(\int_{\mathcal{X}}(\sqrt{p_{c_{1}}}-\sqrt{p_{c_{2}}})^{2}d\mu(x,y)\,dz\right)^{1/2}

denote the Hellinger distance between pc1p_{c_{1}} and pc2p_{c_{2}},

K(c1,c2):=𝔼c1[log(pc1pc2)]=𝒳log(pc1pc2)pc1𝑑μ(x,y)𝑑zK(c_{1},c_{2}):=\mathbb{E}_{c_{1}}\left[\log\left(\frac{p_{c_{1}}}{p_{c_{2}}}\right)\right]=\int_{\mathcal{X}}\log\left(\frac{p_{c_{1}}}{p_{c_{2}}}\right)p_{c_{1}}d\mu(x,y)\,dz

the Kullback-Leibler divergence, and

\[
V(c_{1},c_{2}):=\mathbb{E}_{c_{1}}\left[\left(\log\frac{p_{c_{1}}}{p_{c_{2}}}\right)^{2}\right].
\]

Also, for any F𝒞,Mβ(Ω0)F\subseteq\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}) and δ>0\delta>0, we let 𝒩(F,h,δ)\mathcal{N}(F,h,\delta) denote the minimum number of hh-balls of radius δ\delta needed to cover FF.

Theorem 3.2.

Let Π^\widehat{\Pi} be a Borel probability measure on C03(Ω0)C_{0}^{3}(\Omega_{0}) supported on 𝒞,Mβ(Ω0)\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}). Let c0𝒞,Mβ(Ω0)c_{0}\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}) be fixed, and let δN\delta_{N} be a sequence of positive numbers such that δN0\delta_{N}\to 0 and NδN\sqrt{N}\delta_{N}\to\infty as NN\to\infty. Assume that the following two conditions hold:

  1. (1)

    There exists C>0C>0 such that for all NN\in\mathbb{N},

    (20) Π^({c𝒞,Mβ(Ω0):K(c,c0)δN2,V(c,c0)δN2})eCNδN2.\widehat{\Pi}\left(\left\{c\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}):K(c,c_{0})\leq\delta_{N}^{2},V(c,c_{0})\leq\delta_{N}^{2}\right\}\right)\geq e^{-CN\delta_{N}^{2}}.
  2. (2)

    There exists C~>0\widetilde{C}>0 such that

    (21) log𝒩(𝒞,Mβ(Ω0),h,δN)C~NδN2.\log\mathcal{N}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}),h,\delta_{N})\leq\widetilde{C}N\delta_{N}^{2}.

Now suppose that we make i.i.d. observations 𝒟N=(Xi,Yi,Zi)i=1NPc0N\mathcal{D}_{N}=(X_{i},Y_{i},Z_{i})_{i=1}^{N}\sim P^{N}_{c_{0}}. Then for some k>0k>0 large enough, we have

(22) Pc0N(Π^({c𝒞,Mβ(Ω0):h(c,c0)kδN}|𝒟N)1e(C+3)NδN2)0P^{N}_{c_{0}}\left(\widehat{\Pi}\left(\left\{c\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}):h(c,c_{0})\leq k\delta_{N}\right\}|\mathcal{D}_{N}\right)\leq 1-e^{-(C+3)N\delta_{N}^{2}}\right)\to 0

as NN\to\infty.

Proof.

Define

(23) BN={c𝒞,Mβ(Ω0):K(c,c0)δN2,V(c,c0)δN2},N.B_{N}=\left\{c\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}):K(c,c_{0})\leq\delta_{N}^{2},V(c,c_{0})\leq\delta_{N}^{2}\right\},\qquad N\in\mathbb{N}.

By condition (1) and [15, Lemma 7.3.2], we have that for any ζ>0\zeta>0 and any probability measure m~\widetilde{m} on BNB_{N},

Pc0N(BNi=1Npcpc0(Xi,Yi,Zi)dm~(c)e(1+ζ)NδN2)1ζ2NδN2.P^{N}_{c_{0}}\left(\int_{B_{N}}\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}(X_{i},Y_{i},Z_{i})d\widetilde{m}(c)\leq e^{-(1+\zeta)N\delta_{N}^{2}}\right)\leq\frac{1}{\zeta^{2}N\delta_{N}^{2}}.

In particular, choosing ζ=1\zeta=1 and taking m~\widetilde{m} to be the restriction of Π^\widehat{\Pi} to BNB_{N} followed by normalization, we get that

Pc0N(BNi=1Npcpc0(Xi,Yi,Zi)dΠ^(c)Π^(BN)e2NδN2)1NδN2N0.P^{N}_{c_{0}}\left(\int_{B_{N}}\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}(X_{i},Y_{i},Z_{i})d\widehat{\Pi}(c)\leq\widehat{\Pi}(B_{N})e^{-2N\delta_{N}^{2}}\right)\leq\frac{1}{N\delta_{N}^{2}}\xrightarrow{N\to\infty}0.

Set

AN={BNi=1Npcpc0(Xi,Yi,Zi)dΠ^(c)e(2+C)NδN2},A_{N}=\left\{\int_{B_{N}}\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}(X_{i},Y_{i},Z_{i})d\widehat{\Pi}(c)\geq e^{-(2+C)N\delta_{N}^{2}}\right\},

where CC is as in condition (1). It is clear that AN{BNi=1Npcpc0dΠ^(c)Π^(BN)e2NδN2}A_{N}\supseteq\left\{\int_{B_{N}}\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}d\widehat{\Pi}(c)\geq\widehat{\Pi}(B_{N})e^{-2N\delta_{N}^{2}}\right\}, and therefore, Pc0N(AN)1P^{N}_{c_{0}}(A_{N})\to 1 as NN\to\infty.

Next, we consider condition (2). Let k>k>0k>k^{\prime}>0 be numbers to be determined later. Fix NN and define the function N(ε)=eC~NδN2N(\varepsilon)=e^{\widetilde{C}N\delta_{N}^{2}} for all ε>ε0=kδN\varepsilon>\varepsilon_{0}=k^{\prime}\delta_{N}. It follows from condition (2) that for any ε>ε0\varepsilon>\varepsilon_{0},

𝒩(𝒞,Mβ(Ω0),h,ε/4)𝒩(𝒞,Mβ(Ω0),h,kδN/4)eC~NδN2=N(ε).\mathcal{N}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}),h,\varepsilon/4)\leq\mathcal{N}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}),h,k^{\prime}\delta_{N}/4)\leq e^{\widetilde{C}N\delta_{N}^{2}}=N(\varepsilon).

Therefore, by [15, Theorem 7.1.4], there exist test functions ΨN=ΨN(𝒟N)\Psi_{N}=\Psi_{N}(\mathcal{D}_{N}) such that for some K>0K>0,

Pc0N[ΨN=1]N(ε)KeKNε2;supc:h(c,c0)>ε𝔼cN[1ΨN]eKNε2.P^{N}_{c_{0}}[\Psi_{N}=1]\leq\frac{N(\varepsilon)}{K}e^{-KN\varepsilon^{2}}\quad;\quad\sup_{c:h(c,c_{0})>\varepsilon}\mathbb{E}^{N}_{c}[1-\Psi_{N}]\leq e^{-KN\varepsilon^{2}}.

Now let l>C~l>\widetilde{C} be arbitrary. Setting k=l/Kk=\sqrt{l/K} and ε=kδN\varepsilon=k\delta_{N}, we can see that this implies

(24) Pc0N[ΨN=1]0 as N;supc:h(c,c0)>kδN𝔼cN[1ΨN]elNδN2.P^{N}_{c_{0}}[\Psi_{N}=1]\to 0\,\textrm{ as }N\to\infty\quad;\quad\sup_{c:h(c,c_{0})>k\delta_{N}}\mathbb{E}^{N}_{c}[1-\Psi_{N}]\leq e^{-lN\delta_{N}^{2}}.

Now define

FN={c𝒞,Mβ(Ω0):h(c,c0)kδN}F_{N}=\{c\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}):h(c,c_{0})\leq k\delta_{N}\}

which is the event whose probability we want to bound. Then by (24),

Pc0N(Π^(FNc|𝒟N)e(C+3)NδN2)\displaystyle\ P^{N}_{c_{0}}\left(\widehat{\Pi}(F_{N}^{c}|\mathcal{D}_{N})\geq e^{-(C+3)N\delta_{N}^{2}}\right)
=\displaystyle= Pc0N(FNci=1Npcpc0(Xi,Yi,Zi)dΠ^(c)i=1Npcpc0(Xi,Yi,Zi)dΠ^(c)e(C+3)NδN2,ΨN=0,AN)+o(1)\displaystyle\ P^{N}_{c_{0}}\left(\frac{\int_{F_{N}^{c}}\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}(X_{i},Y_{i},Z_{i})d\widehat{\Pi}(c)}{\int\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}(X_{i},Y_{i},Z_{i})d\widehat{\Pi}(c)}\geq e^{-(C+3)N\delta_{N}^{2}},\ \Psi_{N}=0,\ A_{N}\right)+o(1)
\displaystyle\leq Pc0N((1ΨN)FNci=1Npcpc0(Xi,Yi,Zi)dΠ^(c)e(2C+5)NδN2)+o(1).\displaystyle\ P^{N}_{c_{0}}\left((1-\Psi_{N})\int_{F_{N}^{c}}\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}(X_{i},Y_{i},Z_{i})d\widehat{\Pi}(c)\geq e^{-(2C+5)N\delta_{N}^{2}}\right)+o(1).

Now by Markov’s inequality, this is further bounded above by

𝔼c0N[(1ΨN)FNci=1Npcpc0(Xi,Yi,Zi)dΠ^(c)]e(2C+5)NδN2+o(1)\displaystyle\ \mathbb{E}^{N}_{c_{0}}\left[(1-\Psi_{N})\int_{F_{N}^{c}}\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}(X_{i},Y_{i},Z_{i})d\widehat{\Pi}(c)\right]e^{(2C+5)N\delta_{N}^{2}}+o(1)
=\displaystyle= [FNc𝔼c0N[(1ΨN)i=1Npcpc0(Xi,Yi,Zi)]𝑑Π^(c)]e(2C+5)NδN2+o(1)(by Fubini’s Theorem)\displaystyle\ \left[\int_{F_{N}^{c}}\mathbb{E}^{N}_{c_{0}}\left[(1-\Psi_{N})\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}(X_{i},Y_{i},Z_{i})\right]d\widehat{\Pi}(c)\right]e^{(2C+5)N\delta_{N}^{2}}+o(1)\quad\textrm{(by Fubini's Theorem)}
=\displaystyle= [c:h(c,c0)>kδN𝔼cN[(1ΨN)]𝑑Π^(c)]e(2C+5)NδN2+o(1)\displaystyle\ \left[\int_{c:h(c,c_{0})>k\delta_{N}}\mathbb{E}^{N}_{c}[(1-\Psi_{N})]d\widehat{\Pi}(c)\right]e^{(2C+5)N\delta_{N}^{2}}+o(1)
\displaystyle\leq e(2C+5l)NδN2+o(1).\displaystyle\ e^{(2C+5-l)N\delta_{N}^{2}}+o(1).

Now choosing l>2C+5l>2C+5, the Theorem follows. ∎

3.2. Properties of the Prior

In this section, we will verify the assumptions of Theorem 3.2 when Π^=Π\widehat{\Pi}=\Pi. The key ingredient in the arguments is the forward continuity estimate from Corollary 2.7. We begin by observing that the Hellinger distance between c1,c2𝒞,Mβ(Ω0)c_{1},c_{2}\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}) is equivalent to the L2(Ω×Ω)L^{2}(\partial\Omega\times\partial\Omega) distance between Zc1Z_{c_{1}} and Zc2Z_{c_{2}}.

Lemma 3.3.

There exists κ=κ(Ω,g¯,M)>0\kappa=\kappa(\Omega,\bar{g},M)>0 such that for all c1,c2𝒞,Mβ(Ω0)c_{1},c_{2}\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}),

κZc1Zc2L22h2(c1,c2)14Volg¯(Ω)2Zc1Zc2L22.\kappa\|Z_{c_{1}}-Z_{c_{2}}\|_{L^{2}}^{2}\leq h^{2}(c_{1},c_{2})\leq\frac{1}{4\operatorname{Vol}_{\bar{g}}(\partial\Omega)^{2}}\|Z_{c_{1}}-Z_{c_{2}}\|_{L^{2}}^{2}.
Proof.

Consider the “Hellinger affinity” function

ρ(c1,c2)=𝒳pc1pc2𝑑μ=112h2(c1,c2).\rho(c_{1},c_{2})=\int_{\mathcal{X}}\sqrt{p_{c_{1}}p_{c_{2}}}d\mu=1-\frac{1}{2}h^{2}(c_{1},c_{2}).

We have

ρ(c1,c2)=\displaystyle\rho(c_{1},c_{2})= 12π𝒳exp{14((zZc1(x,y))2+(zZc2(x,y))2)}𝑑μ(x,y)𝑑z\displaystyle\ \frac{1}{\sqrt{2\pi}}\int_{\mathcal{X}}\exp\left\{-\frac{1}{4}((z-Z_{c_{1}}(x,y))^{2}+(z-Z_{c_{2}}(x,y))^{2})\right\}d\mu(x,y)\,dz
=\displaystyle= 1Volg¯(Ω×Ω)Ω×Ωexp{14(Zc1(x,y)2+Zc2(x,y)2)}\displaystyle\ \frac{1}{\operatorname{Vol}_{\bar{g}}(\partial\Omega\times\partial\Omega)}\int_{\partial\Omega\times\partial\Omega}\exp\left\{-\frac{1}{4}(Z_{c_{1}}(x,y)^{2}+Z_{c_{2}}(x,y)^{2})\right\}
×[12πexp{12(zZc1+Zc22)2}𝑑z]exp{18(Zc1+Zc2)2}dxdy\displaystyle\times\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\exp\left\{-\frac{1}{2}\left(z-\frac{Z_{c_{1}}+Z_{c_{2}}}{2}\right)^{2}\right\}dz\right]\exp\left\{\frac{1}{8}(Z_{c_{1}}+Z_{c_{2}})^{2}\right\}dx\,dy
(25) =\displaystyle= 1Volg¯(Ω)2Ω×Ωexp{18(Zc1(x,y)Zc2(x,y))2}𝑑x𝑑y.\displaystyle\ \frac{1}{\operatorname{Vol}_{\bar{g}}(\partial\Omega)^{2}}\int_{\partial\Omega\times\partial\Omega}\exp\left\{-\frac{1}{8}(Z_{c_{1}}(x,y)-Z_{c_{2}}(x,y))^{2}\right\}dx\,dy.

Now applying the simple estimate et1te^{-t}\geq 1-t for all t0t\geq 0, we get

ρ(c1,c2)\displaystyle\rho(c_{1},c_{2}) 1Volg¯(Ω)2Ω×Ω[118(Zc1Zc2)2]𝑑x𝑑y\displaystyle\geq\frac{1}{\operatorname{Vol}_{\bar{g}}(\partial\Omega)^{2}}\int_{\partial\Omega\times\partial\Omega}\left[1-\frac{1}{8}(Z_{c_{1}}-Z_{c_{2}})^{2}\right]dx\,dy
=118Volg¯(Ω)2Zc1Zc2L22.\displaystyle=1-\frac{1}{8\operatorname{Vol}_{\bar{g}}(\partial\Omega)^{2}}\|Z_{c_{1}}-Z_{c_{2}}\|_{L^{2}}^{2}.

Consequently,

h2(c1,c2)=2(1ρ(c1,c2))14Volg¯(Ω)2Zc1Zc2L22.h^{2}(c_{1},c_{2})=2(1-\rho(c_{1},c_{2}))\leq\frac{1}{4\operatorname{Vol}_{\bar{g}}(\partial\Omega)^{2}}\|Z_{c_{1}}-Z_{c_{2}}\|_{L^{2}}^{2}.

Next, we use the fact that $Z_{c_{1}},Z_{c_{2}}$ satisfy the uniform bounds (18) on the support of $Z_{c_{1}}-Z_{c_{2}}$. Consequently, for all $x,y\in\partial\Omega$, we have

(26) |Zc1(x,y)Zc2(x,y)|Δ,|Z_{c_{1}}(x,y)-Z_{c_{2}}(x,y)|\leq\Delta,

where Δ=2M+logdiamg¯(Ω)logδ\Delta=2M+\log\operatorname{diam}_{\bar{g}}(\Omega)-\log\delta. Set T=Δ2/8T=\Delta^{2}/8 and observe that for all t[0,T]t\in[0,T],

et1(1eTT)te^{-t}\leq 1-\left(\frac{1-e^{-T}}{T}\right)t

by the convexity of tett\mapsto e^{-t}. Therefore, for κ=1eT4T\kappa=\frac{1-e^{-T}}{4T}, we have

exp{18(Zc1(x,y)Zc2(x,y))2}1κ2|Zc1(x,y)Zc2(x,y)|2\exp\left\{-\frac{1}{8}(Z_{c_{1}}(x,y)-Z_{c_{2}}(x,y))^{2}\right\}\leq 1-\frac{\kappa}{2}|Z_{c_{1}}(x,y)-Z_{c_{2}}(x,y)|^{2}

for all (x,y)Ω×Ω(x,y)\in\partial\Omega\times\partial\Omega. Integrating both sides of this inequality with respect to dμ(x,y)d\mu(x,y) and applying (25), we get

ρ(c1,c2)\displaystyle\rho(c_{1},c_{2}) 1κ2Zc1Zc2L22\displaystyle\leq 1-\frac{\kappa}{2}\|Z_{c_{1}}-Z_{c_{2}}\|_{L^{2}}^{2}
h2(c1,c2)\displaystyle\Rightarrow\ h^{2}(c_{1},c_{2}) κZc1Zc2L22.\displaystyle\geq\kappa\|Z_{c_{1}}-Z_{c_{2}}\|_{L^{2}}^{2}.

This completes the proof. ∎
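
The two ingredients of this proof can be checked numerically. The Python sketch below, whose function names and quadrature parameters are illustrative choices, verifies the Gaussian integral behind (25) for a fixed pair $(x,y)$, namely $\int\sqrt{p_{a}p_{b}}\,dz=\exp\{-(a-b)^{2}/8\}$ for unit-variance normal densities with means $a=Z_{c_1}(x,y)$ and $b=Z_{c_2}(x,y)$, and the sandwich $1-t\leq e^{-t}\leq 1-\frac{1-e^{-T}}{T}t$ on $[0,T]$.

```python
import math

def affinity_numeric(a, b, half_width=12.0, n=20000):
    """Midpoint-rule approximation of the integral over z of sqrt(p_a(z) p_b(z)),
    where p_m is the N(m, 1) density."""
    mid = 0.5 * (a + b)
    lo = mid - half_width
    step = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        z = lo + (k + 0.5) * step
        total += math.exp(-0.25 * ((z - a) ** 2 + (z - b) ** 2))
    return total * step / math.sqrt(2.0 * math.pi)

def affinity_closed_form(a, b):
    """Closed form obtained by completing the square, as in (25)."""
    return math.exp(-((a - b) ** 2) / 8.0)

def sandwich(t, T):
    """Return (1 - t, exp(-t), chord bound) for 0 <= t <= T."""
    lower = 1.0 - t
    upper = 1.0 - (1.0 - math.exp(-T)) / T * t
    return lower, math.exp(-t), upper

print(affinity_numeric(0.3, 1.7), affinity_closed_form(0.3, 1.7))
print(sandwich(0.5, 2.0))
```

The lower bound on $e^{-t}$ yields the upper bound on $h^{2}$, and the chord bound, which needs the a priori bound $\Delta$ from (26), yields the lower bound.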

Now let us verify Condition (1) of Theorem 3.2 for Π\Pi.

Lemma 3.4.

For c0𝒞,Mβ(Ω0)c_{0}\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}) and t>0t>0, define

N(t)={c𝒞,Mβ(Ω0):cc0CβδN/t},\mathcal{B}_{N}(t)=\{c\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}):\|c-c_{0}\|_{C^{\beta}}\leq\delta_{N}/t\},

and let BN,Π,B_{N},\Pi, and δN\delta_{N} be as in Theorem 3.2. Then for some t>0t>0 large enough, N(t)BN\mathcal{B}_{N}(t)\subset B_{N} for all NN\in\mathbb{N}. In particular,

Π(BN)Π(N(t)).\Pi(B_{N})\geq\Pi(\mathcal{B}_{N}(t)).
Proof.

We need to verify that if tt is large enough, then for any cN(t)c\in\mathcal{B}_{N}(t), we have K(c,c0)δN2K(c,c_{0})\leq\delta_{N}^{2} and V(c,c0)δN2V(c,c_{0})\leq\delta_{N}^{2}. Consider a random observation (X,Y,Z)(X,Y,Z), where (X,Y)(X,Y) is a pair of boundary points chosen with respect to the uniform probability measure μ\mu, and Z=Zc0(X,Y)+ϵZ=Z_{c_{0}}(X,Y)+\epsilon, with ϵN(0,1)\epsilon\sim N(0,1) independent of (X,Y)(X,Y). Observe that for any cN(t)c\in\mathcal{B}_{N}(t),

logpc0pc(X,Y,Z)\displaystyle\log\frac{p_{c_{0}}}{p_{c}}(X,Y,Z) =12[(ZZc0(X,Y))2(ZZc(X,Y))2]\displaystyle=-\frac{1}{2}[(Z-Z_{c_{0}}(X,Y))^{2}-(Z-Z_{c}(X,Y))^{2}]
(27) =12(Zc(X,Y)Zc0(X,Y))2ϵ(Zc(X,Y)Zc0(X,Y)).\displaystyle=\frac{1}{2}(Z_{c}(X,Y)-Z_{c_{0}}(X,Y))^{2}-\epsilon(Z_{c}(X,Y)-Z_{c_{0}}(X,Y)).

Since 𝔼[ϵ|X,Y]=0\mathbb{E}[\epsilon|X,Y]=0, we have

\begin{align*}
K(c,c_{0}) &=\mathbb{E}_{c_{0}}\left[\log\frac{p_{c_{0}}}{p_{c}}(X,Y,Z)\right]\\
&=\mathbb{E}^{\mu}\left[\frac{1}{2}(Z_{c}(X,Y)-Z_{c_{0}}(X,Y))^{2}\right] \tag{28}\\
&=\frac{1}{2\operatorname{Vol}_{\bar{g}}(\partial\Omega\times\partial\Omega)}\int_{\partial\Omega\times\partial\Omega}(Z_{c}(x,y)-Z_{c_{0}}(x,y))^{2}\,dx\,dy\\
&=\frac{1}{2\operatorname{Vol}_{\bar{g}}(\partial\Omega)^{2}}\|Z_{c}-Z_{c_{0}}\|_{L^{2}}^{2}\\
&\lesssim\|c-c_{0}\|^{2}_{L^{2}}\qquad\textrm{(by Corollary 2.7)}\\
&\lesssim\frac{\delta_{N}^{2}}{t^{2}}. \tag{29}
\end{align*}

So it follows that if tt is large enough, K(c,c0)δN2K(c,c_{0})\leq\delta_{N}^{2} for all cN(t)c\in\mathcal{B}_{N}(t). Next, consider

\begin{align*}
V(c,c_{0}) &=\mathbb{E}_{c_{0}}\left[\left(\log\frac{p_{c_{0}}}{p_{c}}(X,Y,Z)\right)^{2}\right]\\
&\leq 2\mathbb{E}^{\mu}\left[\left(\frac{1}{2}(Z_{c}-Z_{c_{0}})^{2}\right)^{2}\right]+2\mathbb{E}^{\mu}\left[(Z_{c}-Z_{c_{0}})^{2}\,\mathbb{E}_{\epsilon}[\epsilon^{2}]\right]\qquad\textrm{(by (27))}\\
&=\frac{1}{2}\int_{\partial\Omega\times\partial\Omega}|Z_{c}-Z_{c_{0}}|^{4}\,d\mu(x,y)+2\mathbb{E}^{\mu}\left[(Z_{c}-Z_{c_{0}})^{2}\right]\qquad\textrm{(since $\mathbb{E}[\epsilon^{2}]=1$)}\\
&\leq\frac{\|Z_{c}-Z_{c_{0}}\|^{2}_{L^{\infty}}}{2\operatorname{Vol}_{\bar{g}}(\partial\Omega)^{2}}\|Z_{c}-Z_{c_{0}}\|^{2}_{L^{2}}+4K(c,c_{0})
\end{align*}

by (28). It follows from (26) that $\|Z_{c}-Z_{c_{0}}\|_{L^{\infty}}\leq\Delta$, where $\Delta>0$ depends only on $\Omega,\bar{g},M,\delta$. Consequently,

\begin{align*}
V(c,c_{0}) &\lesssim\|Z_{c}-Z_{c_{0}}\|^{2}_{L^{2}}+K(c,c_{0})\\
&\lesssim C_{2}^{\prime 2}\|c-c_{0}\|^{2}_{L^{2}}+K(c,c_{0})\qquad\textrm{(by Corollary 2.7)}\\
&\lesssim\|c-c_{0}\|^{2}_{C^{\beta}}+\frac{\delta_{N}^{2}}{t^{2}}\qquad\textrm{(by (29))}\\
&\lesssim\frac{\delta_{N}^{2}}{t^{2}}.
\end{align*}

This shows that for t>0t>0 large enough, we also get V(c,c0)δN2V(c,c_{0})\leq\delta_{N}^{2} for all cN(t)c\in\mathcal{B}_{N}(t). ∎
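
The moment computations in this proof reduce, for a fixed pair $(X,Y)$, to properties of the scalar $d=Z_{c}(X,Y)-Z_{c_{0}}(X,Y)$: by (27) the log-likelihood ratio equals $\tfrac{1}{2}d^{2}-\epsilon d$, so its conditional mean is $\tfrac{1}{2}d^{2}$ and its conditional second moment is $\tfrac{1}{4}d^{4}+d^{2}$. The following Monte Carlo sketch in Python checks these two identities; the function name, sample size, and seed are arbitrary illustrative choices.

```python
import random

def llr_moments(d, n_samples=200_000, seed=0):
    """Empirical mean and second moment of d^2/2 - eps*d with eps ~ N(0, 1)."""
    rng = random.Random(seed)
    first = 0.0
    second = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        llr = 0.5 * d * d - eps * d  # log(p_{c0}/p_c) at one observation, see (27)
        first += llr
        second += llr * llr
    return first / n_samples, second / n_samples

d = 0.8
mean_hat, second_hat = llr_moments(d)
print(mean_hat, 0.5 * d ** 2)                # compare with the integrand in (28)
print(second_hat, 0.25 * d ** 4 + d ** 2)    # exact conditional second moment
```

The bound used in the proof is slightly cruder: it applies $(a+b)^{2}\leq 2a^{2}+2b^{2}$, which gives $\tfrac{1}{2}d^{4}+2d^{2}$ and avoids tracking the cross term.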

Next, we will establish a lower bound for $\Pi(\mathcal{B}_{N}(t))$, which will follow from estimates of the $\widetilde{\Pi}$-measures of sets of the form $\{c:\|c\|_{C^{\beta}}\leq\varepsilon\}$ when $\varepsilon>0$ is small. To this end, it is convenient to work with Hölder-Zygmund spaces $C^{s}_{*}(\Omega_{0})$, with $s>0$ (see [42] for a detailed treatment). If $s$ is not an integer, $C^{s}_{*}(\Omega_{0})$ is simply the Hölder space $C^{s}(\overline{\Omega}_{0})$. On the other hand, if $s$ is a positive integer, $C^{s}_{*}(\Omega_{0})$ is a larger space than $C^{s}(\overline{\Omega}_{0})$, and is defined by the norm

\[
\|f\|_{C^{s}_{*}(\Omega_{0})}=\sum_{|a|\leq s-1}\sup_{x\in\Omega_{0}}|\partial^{a}f(x)|+\sum_{|a|=s-1}\sup_{x\in\Omega_{0},\,h\neq 0}\frac{|\partial^{a}f(x+h)+\partial^{a}f(x-h)-2\partial^{a}f(x)|}{|h|}.
\]

In either case, it is easy to see that fCsfCs\|f\|_{C^{s}_{*}}\leq\|f\|_{C^{s}} for all fCs(Ω¯0)f\in C^{s}(\overline{\Omega}_{0}). It turns out that Cs(Ω0)C^{s}_{*}(\Omega_{0}) coincides with the Besov space B,s(Ω0)B^{s}_{\infty,\infty}(\Omega_{0}), which allows us to use various embedding and approximation results from Besov space theory.

Before proceeding, let us fix ν>0\nu>0 such that

(30) ν>max{2m2(αβ)m,mβ},and defineδN=N1/(2+ν).\nu>\max\left\{\frac{2m}{2(\alpha-\beta)-m},\frac{m}{\beta}\right\},\qquad\textrm{and define}\qquad\delta_{N}=N^{-1/(2+\nu)}.

It is easy to verify that δN0\delta_{N}\to 0 and NδN=Nν2(2+ν)\sqrt{N}\delta_{N}=N^{\tfrac{\nu}{2(2+\nu)}}\to\infty as NN\to\infty.
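
The bookkeeping around (30) can be checked with a few lines of Python. The sketch below, with hypothetical helper names `nu_lower_bound` and `delta`, computes the lower bound that (30) imposes on $\nu$ for given $\alpha,\beta,m$ and confirms the identities $\sqrt{N}\delta_{N}=N^{\nu/(2(2+\nu))}$ and $\delta_{N}^{-\nu}=N\delta_{N}^{2}$, the second of which is used repeatedly below.

```python
import math

def nu_lower_bound(alpha, beta, m):
    """Right-hand side of (30): nu must be strictly larger than this value."""
    if beta < 3 or alpha <= beta + m / 2:
        raise ValueError("Condition 3.1 requires beta >= 3 and alpha > beta + m/2")
    return max(2 * m / (2 * (alpha - beta) - m), m / beta)

def delta(N, nu):
    """delta_N = N^{-1/(2+nu)} as defined in (30)."""
    return N ** (-1.0 / (2.0 + nu))

nu, N = 1.0, 8
d = delta(N, nu)
print(nu_lower_bound(10, 3, 2))                          # bound on nu for alpha=10, beta=3, m=2
print(math.sqrt(N) * d, N ** (nu / (2 * (2 + nu))))      # sqrt(N) * delta_N, two ways
print(d ** (-nu), N * d ** 2)                            # delta_N^{-nu} equals N * delta_N^2
```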

Lemma 3.5.

Let c0𝒞,Mβ(Ω0)c_{0}\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})\cap\mathcal{H}, and define δN\delta_{N} as in (30). Then for t>0t>0 large enough, there exists C=C(Ω,Ω0,g¯,α,β,,M,c0,t)>0C^{\prime}=C^{\prime}(\Omega,\Omega_{0},\bar{g},\alpha,\beta,\ell,M,c_{0},t)>0 such that for all NN\in\mathbb{N},

Π(N(t))exp{CNδN2}.\Pi(\mathcal{B}_{N}(t))\geq\exp\{-C^{\prime}N\delta_{N}^{2}\}.

In particular, there exists C=C(Ω,Ω0,g¯,α,β,,M,c0)>0C=C(\Omega,\Omega_{0},\bar{g},\alpha,\beta,\ell,M,c_{0})>0 such that for all NN\in\mathbb{N},

Π(BN)exp{CNδN2}.\Pi(B_{N})\geq\exp\{-CN\delta_{N}^{2}\}.
Proof.

The sets {bC03(Ω0):bCβδ}\{b\in C^{3}_{0}(\Omega_{0}):\|b\|_{C^{\beta}}\leq\delta\} for δ>0\delta>0 are convex and symmetric. Hence by [15, Corollary 2.6.18],

Π~(cc0CβδN/t)ec02/2Π~(cCβδN/t).\widetilde{\Pi}(\|c-c_{0}\|_{C^{\beta}}\leq\delta_{N}/t)\geq e^{-\|c_{0}\|_{\mathcal{H}}^{2}/2}\widetilde{\Pi}(\|c\|_{C^{\beta}}\leq\delta_{N}/t).

Moreover, since c0𝒞,Mβ(Ω0)c_{0}\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}), which is open with respect to the CβC^{\beta} metric, we have for all sufficiently large t>0t>0,

Π(N(t))=Π(cc0CβδN/t)=Π~(cc0CβδN/t)Π~(𝒞,Mβ(Ω0)),\Pi(\mathcal{B}_{N}(t))=\Pi(\|c-c_{0}\|_{C^{\beta}}\leq\delta_{N}/t)=\frac{\widetilde{\Pi}(\|c-c_{0}\|_{C^{\beta}}\leq\delta_{N}/t)}{\widetilde{\Pi}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}))},

and therefore,

(31) Π(N(t))ec02/2Π~(cCβδN/t)Π~(𝒞,Mβ(Ω0)).\Pi(\mathcal{B}_{N}(t))\geq e^{-\|c_{0}\|_{\mathcal{H}}^{2}/2}\frac{\widetilde{\Pi}(\|c\|_{C^{\beta}}\leq\delta_{N}/t)}{\widetilde{\Pi}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}))}.

Next, choose a real number γ\gamma such that

(32) β<γ<αm2,ν>2m2(αγ)m.\beta<\gamma<\alpha-\frac{m}{2},\qquad\nu>\frac{2m}{2(\alpha-\gamma)-m}.

Alternatively, if β\beta is not an integer, we can simply set γ=β\gamma=\beta. In either case, we have fCβfCγ\|f\|_{C^{\beta}}\leq\|f\|_{C^{\gamma}_{*}} for all fCγ(Ω0)f\in C^{\gamma}_{*}(\Omega_{0}).

Now recall our assumption that the RKHS \mathcal{H} of Π~\widetilde{\Pi} is continuously embedded into Hα(Ω0)H^{\alpha}(\Omega_{0}). We know from [13, Theorem 3.1.2] that the unit ball UU of this space satisfies

log𝒩(U,Cγ,ε)(Aε)m(αγ)\log\mathcal{N}(U,\|\cdot\|_{C^{\gamma}_{*}},\varepsilon)\leq\left(\frac{A}{\varepsilon}\right)^{\tfrac{m}{(\alpha-\gamma)}}

for some fixed A>0A>0 and all ε>0\varepsilon>0 small enough. Therefore, by [19, Theorem 1.2], there exists D>0D>0 such that for all ε>0\varepsilon>0 small enough,

Π~(cCβε)Π~(cCγε)exp{Dε2m2(αγ)m}.\widetilde{\Pi}(\|c\|_{C^{\beta}}\leq\varepsilon)\geq\widetilde{\Pi}(\|c\|_{C^{\gamma}_{*}}\leq\varepsilon)\geq\exp\left\{-D\varepsilon^{-\tfrac{2m}{2(\alpha-\gamma)-m}}\right\}.

Consequently, (31) implies that for t>0t>0 large enough,

\begin{align*}
\Pi(\mathcal{B}_{N}(t)) &\geq\frac{1}{\widetilde{\Pi}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}))}\exp\left\{-\frac{\|c_{0}\|^{2}_{\mathcal{H}}}{2}-Dt^{\tfrac{2m}{2(\alpha-\gamma)-m}}\delta_{N}^{-\tfrac{2m}{2(\alpha-\gamma)-m}}\right\}\\
&>\frac{1}{\widetilde{\Pi}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}))}\exp\left\{-\frac{\|c_{0}\|^{2}_{\mathcal{H}}}{2}-Dt^{\tfrac{2m}{2(\alpha-\gamma)-m}}\delta_{N}^{-\nu}\right\}\qquad\textrm{(by (30) and (32))}\\
&=\frac{1}{\widetilde{\Pi}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}))}\exp\left\{-\frac{\|c_{0}\|^{2}_{\mathcal{H}}}{2}-Dt^{\tfrac{2m}{2(\alpha-\gamma)-m}}N\delta_{N}^{2}\right\}\\
&\geq\exp\{-C^{\prime}N\delta_{N}^{2}\}
\end{align*}

for $C^{\prime}=\frac{\|c_{0}\|^{2}_{\mathcal{H}}}{2}+Dt^{\tfrac{2m}{2(\alpha-\gamma)-m}}$, where the last step uses $\widetilde{\Pi}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}))\leq 1$ and $N\delta_{N}^{2}\geq 1$. It now follows from Lemma 3.4 that for $t>0$ sufficiently large, there exists $C>0$ such that $\Pi(B_{N})\geq\exp\{-CN\delta_{N}^{2}\}$. This completes the proof. ∎

Thus, we have verified Condition (1) of Theorem 3.2. The next Lemma verifies Condition (2).

Lemma 3.6.

There exists $\widetilde{C}=\widetilde{C}(\Omega,\Omega_{0},\bar{g},\beta,\ell,M)>0$ such that

log𝒩(𝒞,Mβ(Ω0),h,δN)C~NδN2.\log\mathcal{N}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}),h,\delta_{N})\leq\widetilde{C}N\delta_{N}^{2}.
Proof.

In order to construct a covering of $\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})$, it suffices to construct such a covering of the $C^{\beta}_{*}(\Omega_{0})$-ball of radius $M$ centered at $0$. Therefore, if $U_{\beta}$ denotes the unit ball of $C^{\beta}_{*}(\Omega_{0})$,

log𝒩(𝒞,Mβ(Ω0),L2,δN)log𝒩(MUβ,L2,δN).\log\mathcal{N}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}),\|\cdot\|_{L^{2}},\delta_{N})\leq\log\mathcal{N}(MU_{\beta},\|\cdot\|_{L^{2}},\delta_{N}).

Now applying [13, Theorem 3.1.2] to the inclusion Cβ(Ω0)L2(Ω0)C^{\beta}_{*}(\Omega_{0})\hookrightarrow L^{2}(\Omega_{0}), we have

log𝒩(𝒞,Mβ(Ω0),L2,δN)(AδN)mβ\log\mathcal{N}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}),\|\cdot\|_{L^{2}},\delta_{N})\leq\left(\frac{A^{\prime}}{\delta_{N}}\right)^{\frac{m}{\beta}}

for some A>0A^{\prime}>0. Since ν>m/β\nu>m/\beta, we get

log𝒩(𝒞,Mβ(Ω0),L2,δN)bδNν=bNδN2,\log\mathcal{N}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}),\|\cdot\|_{L^{2}},\delta_{N})\leq b\delta_{N}^{-\nu}=bN\delta_{N}^{2},

where b>0b>0. Now, Lemma 3.3 and Corollary 2.7 imply that an L2L^{2} ball of radius δN\delta_{N} centered at any c𝒞,Mβ(Ω0)c\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}) is contained in the Hellinger ball of radius C22Volg¯(Ω)δN\frac{C_{2}^{\prime}}{2\operatorname{Vol}_{\bar{g}}(\partial\Omega)}\delta_{N} centered at the same point. Therefore, by suitably rescaling the constant bb to C~(Ω,Ω0,g¯,β,,M)>0\widetilde{C}(\Omega,\Omega_{0},\bar{g},\beta,\ell,M)>0, we get the desired complexity bound

log𝒩(𝒞,Mβ(Ω0),h,δN)C~NδN2.\log\mathcal{N}(\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}),h,\delta_{N})\leq\widetilde{C}N\delta_{N}^{2}.

This completes the proof. ∎

3.3. Posterior Convergence

In this section, we will combine the results of Sections 3.1 and 3.2 to prove Theorem 3.1.

Theorem 3.7.

Let Π,α,β,M,c0\Pi,\alpha,\beta,M,c_{0} be as in Theorem 3.1, ν,δN\nu,\delta_{N} as in (30), and C>0C>0 as in Lemma 3.5. Then for k>0k^{\prime}>0 large enough, we have

(33) Pc0N(Π({c𝒞,Mβ(Ω0):ZcZc0L2kδN}|𝒟N)1e(C+3)NδN2)1P^{N}_{c_{0}}\left(\Pi(\{c\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}):\|Z_{c}-Z_{c_{0}}\|_{L^{2}}\leq k^{\prime}\delta_{N}\}|\mathcal{D}_{N})\geq 1-e^{-(C+3)N\delta_{N}^{2}}\right)\to 1

as NN\to\infty. Moreover, for all k′′>0k^{\prime\prime}>0 large enough,

(34) Pc0N(Π({c𝒞,Mβ(Ω0):cc0L2k′′δN1/2}|𝒟N)e(C+3)NδN2)0P^{N}_{c_{0}}\left(\Pi(\{c\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}):\|c-c_{0}\|_{L^{2}}\geq k^{\prime\prime}\delta_{N}^{1/2}\}|\mathcal{D}_{N})\geq e^{-(C+3)N\delta_{N}^{2}}\right)\to 0

as NN\to\infty.

Proof.

Combining Lemmas 3.5 and 3.6 with Theorem 3.2, we get (33) for all sufficiently large k>0k^{\prime}>0. To get (34), consider the event

EN={c𝒞,Mβ(Ω0):ZcZc0L2kδN}.E_{N}=\{c\in\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0}):\|Z_{c}-Z_{c_{0}}\|_{L^{2}}\leq k^{\prime}\delta_{N}\}.

By Corollary 2.3, for any cENc\in E_{N},

cc0L2\displaystyle\|c-c_{0}\|_{L^{2}} \displaystyle\leq C1ZcZc0H1\displaystyle C_{1}^{\prime}\|Z_{c}-Z_{c_{0}}\|_{H^{1}}
\displaystyle\leq C1ZcZc0L21/2ZcZc0H21/2\displaystyle C_{1}^{\prime}\|Z_{c}-Z_{c_{0}}\|_{L^{2}}^{1/2}\|Z_{c}-Z_{c_{0}}\|_{H^{2}}^{1/2}

by the standard interpolation result for Sobolev spaces. Therefore, by Theorem 2.8,

\[
\|c-c_{0}\|_{L^{2}}\leq C_{1}^{\prime}(C_{3}^{\prime})^{1/2}(k^{\prime}\delta_{N})^{1/2}.
\]

Taking k′′>C1(kC3)1/2k^{\prime\prime}>C_{1}^{\prime}(k^{\prime}C_{3}^{\prime})^{1/2}, we conclude that

cc0L2k′′δN1/2.\|c-c_{0}\|_{L^{2}}\leq k^{\prime\prime}\delta_{N}^{1/2}.

Combining this with (33) gives us (34). ∎

The final step in the proof of Theorem 3.1 is to prove that the posterior contraction rate in the above Theorem carries over to the posterior mean c¯N=𝔼Π[c|𝒟N]\overline{c}_{N}=\mathbb{E}^{\Pi}[c|\mathcal{D}_{N}] as well. Let

0<ω<12(2+ν).0<\omega<\frac{1}{2(2+\nu)}.

We note that ω\omega can be made arbitrarily close to 1/41/4 by choosing α,β\alpha,\beta appropriately. Indeed, if α\alpha and β\beta are sufficiently large, (30) allows ν\nu to be arbitrarily close to 0. Correspondingly, ω\omega can be made arbitrarily close to 1/41/4. Next, define

ωN:=k′′δN1/2=k′′N12(2+ν)=o(Nω)\omega_{N}:=k^{\prime\prime}\delta_{N}^{1/2}=k^{\prime\prime}N^{-\frac{1}{2(2+\nu)}}=o(N^{-\omega})

where k′′>0k^{\prime\prime}>0 is as in Theorem 3.7.
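
The exponent arithmetic in this paragraph can be sketched in Python as follows. The helper names are hypothetical; `omega_supremum` evaluates $\frac{1}{2(2+\nu)}$ at the lower bound on $\nu$ from (30), which is the supremum of admissible $\omega$ and is not itself attained since both inequalities are strict, and `ratio_to_target` evaluates $\omega_{N}/N^{-\omega}$ to show that it decays.

```python
import math

def omega_supremum(alpha, beta, m):
    """Supremum of admissible omega: 1 / (2 (2 + nu)) with nu at its lower bound from (30)."""
    nu_bound = max(2 * m / (2 * (alpha - beta) - m), m / beta)
    return 1.0 / (2.0 * (2.0 + nu_bound))

def ratio_to_target(N, nu, omega, k2=1.0):
    """omega_N / N^{-omega}, where omega_N = k2 * N^{-1/(2(2+nu))} and k2 stands in for k''."""
    return k2 * N ** (omega - 1.0 / (2.0 * (2.0 + nu)))

print(omega_supremum(10, 3, 2))        # moderate smoothness
print(omega_supremum(2000, 1000, 2))   # large alpha and beta push the value toward 1/4
print(ratio_to_target(1e3, 1.0, 0.1), ratio_to_target(1e9, 1.0, 0.1))
```

The decay of the ratio is slow when $\omega$ is chosen close to $\frac{1}{2(2+\nu)}$, which is consistent with the statement $\omega_{N}=o(N^{-\omega})$ but gives no quantitative margin.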

Proof of Theorem 3.1.

Observe that

c¯Nc0L2\displaystyle\|\overline{c}_{N}-c_{0}\|_{L^{2}} =\displaystyle= 𝔼Π[c|𝒟N]c0L2\displaystyle\left\|\mathbb{E}^{\Pi}[c|\mathcal{D}_{N}]-c_{0}\right\|_{L^{2}}
\displaystyle\leq 𝔼Π[cc0L2|𝒟N](by Jensen’s inequality)\displaystyle\mathbb{E}^{\Pi}\left[\|c-c_{0}\|_{L^{2}}|\mathcal{D}_{N}\right]\quad\textrm{(by Jensen's inequality)}
\displaystyle\leq ωN+𝔼Π[cc0L2𝟙{cc0L2ωN}|𝒟N]\displaystyle\omega_{N}+\mathbb{E}^{\Pi}\left[\|c-c_{0}\|_{L^{2}}\mathds{1}_{\{\|c-c_{0}\|_{L^{2}}\geq\omega_{N}\}}\big{|}\mathcal{D}_{N}\right]
\displaystyle\leq ωN+𝔼Π[cc0L22|𝒟N]1/2[Π(cc0L2ωN|𝒟N)]1/2\displaystyle\omega_{N}+\mathbb{E}^{\Pi}\left[\|c-c_{0}\|_{L^{2}}^{2}|\mathcal{D}_{N}\right]^{1/2}\left[\Pi(\|c-c_{0}\|_{L^{2}}\geq\omega_{N}|\mathcal{D}_{N})\right]^{1/2}

by the Cauchy-Schwarz inequality. Now it suffices to show that the second summand on the right-hand side is stochastically $O(\omega_{N})$ as $N\to\infty$.

Arguing as in the proof of Theorem 3.2 and applying Lemma 3.5, we get that the events

AN={𝒞,Mβ(Ω0)i=1Npcpc0(Xi,Yi,Zi)dΠ(c)e(2+C)NδN2}A_{N}^{\prime}=\left\{\int_{\mathcal{C}^{\beta}_{\ell,M}(\Omega_{0})}\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}(X_{i},Y_{i},Z_{i})d\Pi(c)\geq e^{-(2+C)N\delta_{N}^{2}}\right\}

satisfy Pc0N(AN)1P^{N}_{c_{0}}(A_{N}^{\prime})\to 1 as NN\to\infty. Here, CC is as in Lemma 3.5. Now, Theorem 3.7 implies

\begin{align*}
&P^{N}_{c_{0}}\left(\mathbb{E}^{\Pi}\left[\|c-c_{0}\|_{L^{2}}^{2}|\mathcal{D}_{N}\right]\times\Pi(\|c-c_{0}\|_{L^{2}}\geq\omega_{N}|\mathcal{D}_{N})>\omega_{N}^{2}\right) \\
&\quad\leq P^{N}_{c_{0}}\left(\mathbb{E}^{\Pi}\left[\|c-c_{0}\|_{L^{2}}^{2}|\mathcal{D}_{N}\right]e^{-(C+3)N\delta_{N}^{2}}>\omega_{N}^{2}\right)+o(1),
\end{align*}

which is bounded above by

\begin{align*}
&P^{N}_{c_{0}}\left(e^{-(C+3)N\delta_{N}^{2}}\mathbb{E}^{\Pi}\left[\|c-c_{0}\|_{L^{2}}^{2}|\mathcal{D}_{N}\right]>\omega_{N}^{2},\ A_{N}^{\prime}\right)+o(1) \\
&\quad=P^{N}_{c_{0}}\left(e^{-(C+3)N\delta_{N}^{2}}\frac{\int\|c-c_{0}\|_{L^{2}}^{2}\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}(X_{i},Y_{i},Z_{i})\,d\Pi(c)}{\int\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}(X_{i},Y_{i},Z_{i})\,d\Pi(c)}>\omega_{N}^{2},\ A_{N}^{\prime}\right)+o(1) \\
&\quad\leq P^{N}_{c_{0}}\left(\int\|c-c_{0}\|_{L^{2}}^{2}\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}(X_{i},Y_{i},Z_{i})\,d\Pi(c)>\omega_{N}^{2}e^{N\delta_{N}^{2}}\right)+o(1) \tag{35}
\end{align*}

using the fact that $\int\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}(X_{i},Y_{i},Z_{i})\,d\Pi(c)\geq e^{-(C+2)N\delta_{N}^{2}}$ on $A_{N}^{\prime}$. Next, using Markov's inequality, (35) can be further bounded above by

\begin{align*}
&\leq e^{-N\delta_{N}^{2}}\omega_{N}^{-2}\,\mathbb{E}^{N}_{c_{0}}\left[\int\|c-c_{0}\|_{L^{2}}^{2}\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}(X_{i},Y_{i},Z_{i})\,d\Pi(c)\right]+o(1) \\
&=e^{-N\delta_{N}^{2}}\omega_{N}^{-2}\int\|c-c_{0}\|_{L^{2}}^{2}\,\mathbb{E}_{c_{0}}^{N}\left[\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}(X_{i},Y_{i},Z_{i})\right]d\Pi(c)+o(1)\quad\textrm{(by Fubini's Theorem)} \\
&\leq e^{-N\delta_{N}^{2}}\omega_{N}^{-2}\int\|c-c_{0}\|_{L^{2}}^{2}\,d\Pi(c)+o(1)\quad\left(\textrm{since }\mathbb{E}^{N}_{c_{0}}\left[\prod_{i=1}^{N}\frac{p_{c}}{p_{c_{0}}}\right]=1\right) \\
&\lesssim e^{-N\delta_{N}^{2}}\omega_{N}^{-2}+o(1)\lesssim e^{-N\delta_{N}^{2}}N^{\frac{1}{2+\nu}}+o(1)\to 0\textrm{ as }N\to\infty,
\end{align*}

since $\omega_{N}^{-2}=(k^{\prime\prime})^{-2}N^{\frac{1}{2+\nu}}$ grows only polynomially in $N$, whereas $N\delta_{N}^{2}=N^{\frac{\nu}{2+\nu}}\to\infty$.

This completes the proof. ∎
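The decay of the final bound $e^{-N\delta_N^2}\omega_N^{-2}$ can be illustrated numerically. The sketch below assumes $\delta_N=N^{-1/(2+\nu)}$, as read off from the definition $\omega_N=k''\delta_N^{1/2}=k''N^{-\frac{1}{2(2+\nu)}}$; the function name, the default constant `k = 1.0` (standing in for $k''$) and the sample value `nu = 1.0` are hypothetical. The computation is done on a logarithmic scale to avoid floating-point underflow.

```python
import math


def log_final_bound(n: int, nu: float, k: float = 1.0) -> float:
    """Return log(exp(-N * delta_N^2) * omega_N^{-2}).

    Assumes delta_N = N^{-1/(2+nu)} and omega_N = k * delta_N^{1/2}, so that
    N * delta_N^2 = N^{nu/(2+nu)} and omega_N^{-2} = k^{-2} * N^{1/(2+nu)}.
    """
    if nu <= 0 or k <= 0:
        raise ValueError("nu and k are assumed to be strictly positive")
    n_delta_sq = n ** (nu / (2.0 + nu))
    log_omega_inv_sq = math.log(n) / (2.0 + nu) - 2.0 * math.log(k)
    return -n_delta_sq + log_omega_inv_sq


if __name__ == "__main__":
    nu = 1.0  # hypothetical value chosen for illustration
    for n in (10, 10**3, 10**6, 10**9):
        print(n, log_final_bound(n, nu))
```

The exponential factor eventually dominates the polynomial growth of $\omega_N^{-2}$ for every $\nu>0$, although for small $\nu$ the decay may set in only at very large $N$.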


Acknowledgement: HZ is partly supported by the NSF grant DMS-2109116.

References

  • [1] K. Abraham and R. Nickl. On statistical Calderón problems. Math. Stat. Learn., 2(2):165–216, 2019.
  • [2] Y. Assylbekov and H. Zhou. Boundary and scattering rigidity problems in the presence of a magnetic field and a potential. Inverse Prob. Imaging, 9:935–950, 2015.
  • [3] G. Beylkin. Stability and uniqueness of the solution of the inverse kinematic problem of seismology in higher dimensions. Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI), 84:3–6, 1979.
  • [4] J. Bohr. A Bernstein–von Mises theorem for the Calderón problem with piecewise constant conductivities. arXiv:2206.08177, 2022.
  • [5] J. Bohr and R. Nickl. On log-concave approximations of high-dimensional posterior measures and stability properties in non-linear inverse problems. arXiv:2105.07835, 2021.
  • [6] J. Cheeger and D. G. Ebin. Comparison theorems in Riemannian geometry. AMS Chelsea Publishing, Providence, RI, 2008. Revised reprint of the 1975 original.
  • [7] E. Chung, J. Qian, G. Uhlmann, and H. Zhao. A new phase space method for recovering index of refraction from travel times. Inverse Problems, 23:309, 2007.
  • [8] E. Chung, J. Qian, G. Uhlmann, and H. Zhao. A phase-space formulation for elastic-wave traveltime tomography. Journal of Physics: Conference Series, 124:012018, 2008.
  • [9] C. Croke. Rigidity theorems in Riemannian geometry. Geometric Methods in Inverse Problems and PDE Control, IMA Vol. Math. Appl., Springer, New York, 137:47–72, 2004.
  • [10] N. S. Dairbekov, G. P. Paternain, P. Stefanov, and G. Uhlmann. The boundary rigidity problem in the presence of a magnetic field. Adv. Math., 216:535–609, 2007.
  • [11] M. Dashti and A. M. Stuart. The Bayesian approach to inverse problems. Handbook of Uncertainty Quantification, Editors R. Ghanem, D. Higdon and H. Owhadi, Springer, 2016.
  • [12] M. M. Dunlop and A. M. Stuart. The Bayesian formulation of EIT: analysis and algorithms. Inverse Probl. Imaging, 10(4):1007–1036, 2016.
  • [13] D. E. Edmunds and H. Triebel. Entropy Numbers and Approximation Numbers in Function Spaces, II. Proceedings of the London Mathematical Society, s3-64(1):153–169, 1992.
  • [14] S. Ghosal and A. van der Vaart. Fundamentals of nonparametric Bayesian inference, volume 44 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 2017.
  • [15] E. Giné and R. Nickl. Mathematical Foundations of Infinite-Dimensional Statistical Models. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2015.
  • [16] G. Herglotz. Über die Elastizität der Erde bei Berücksichtigung ihrer variablen Dichte. Zeitschr. für Math. Phys., 52:275–299, 1905.
  • [17] M. Lassas, L. Oksanen, and Y. Yang. Determination of the spacetime from local time measurements. Math. Ann., 365:271–307, 2016.
  • [18] J. M. Lee. Introduction to Riemannian Manifolds. Graduate Texts in Mathematics. Springer Cham, 2021.
  • [19] W. V. Li and W. Linde. Approximation, Metric Entropy and Small Ball Estimates for Gaussian Measures. The Annals of Probability, 27(3):1556–1578, 1999.
  • [20] J. Martin, L. C. Wilcox, C. Burstedde, and O. Ghattas. A stochastic Newton MCMC method for large-scale statistical inverse problems with application to seismic inversion. SIAM J. Sci. Comput., 34:1460–1487, 2012.
  • [21] R. Michel. Sur la rigidité imposée par la longueur des géodésiques. Invent. Math., 65(1):71–83, 1981/82.
  • [22] F. Monard, R. Nickl, and G. P. Paternain. Efficient nonparametric Bayesian inference for X-ray transforms. Ann. Statist., 47(2):1113–1147, 2019.
  • [23] F. Monard, R. Nickl, and G. P. Paternain. Consistent inversion of noisy non-Abelian X-ray transforms. Communications on Pure and Applied Mathematics, 74:1045–1099, 2021.
  • [24] F. Monard, R. Nickl, and G. P. Paternain. Statistical guarantees for Bayesian uncertainty quantification in non-linear inverse problems with Gaussian process priors. Annals of Statistics, 49(6):3255–3298, 2021.
  • [25] R. G. Mukhometov. The inverse kinematic problem of seismology on the plane. Math. Problems of Geophysics. Akad. Nauk. SSSR, Sibirsk. Otdel., Vychisl. Tsentr, Novosibirsk, 6(2):243–252, 1975. (In Russian).
  • [26] R. G. Mukhometov. On the problem of integral geometry. Math. Problems of Geophysics. Akad. Nauk. SSSR, Sibirsk. Otdel., Vychisl. Tsentr, Novosibirsk, 6(2):212–242, 1975. (In Russian).
  • [27] R. G. Mukhometov. On a problem of reconstructing Riemannian metrics. Sibirsk. Mat. Zh., 22(3):119–135, 237, 1981.
  • [28] R. G. Mukhometov and V. G. Romanov. On the problem of finding an isotropic Riemannian metric in an $n$-dimensional space. Dokl. Akad. Nauk SSSR, 243(1):41–44, 1978.
  • [29] R. Nickl. Bernstein–von Mises theorems for statistical inverse problems I: Schrödinger equation. J. Eur. Math. Soc. (JEMS), 22(8):2697–2750, 2020.
  • [30] R. Nickl. Bayesian non-linear statistical inverse problems. 2022.
  • [31] R. Nickl and G. Paternain. On some information-theoretic aspects of non-linear statistical inverse problems. arXiv:2107.09488, 2021.
  • [32] G. Paternain, G. Uhlmann, and H. Zhou. Lens rigidity for a particle in a Yang-Mills field. Comm. Math. Phys., 366:681–707, 2019.
  • [33] G. P. Paternain, M. Salo, and G. Uhlmann. Geometric inverse problems—with emphasis on two dimensions, volume 204 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2023. With a foreword by András Vasy.
  • [34] L. Pestov and G. Uhlmann. Two dimensional compact simple Riemannian manifolds are boundary distance rigid. Ann. of Math. (2), 161(2):1093–1110, 2005.
  • [35] P. Stefanov. The Lorentzian scattering rigidity problem and rigidity of stationary metrics. arXiv:2212.13213, 2022.
  • [36] P. Stefanov and G. Uhlmann. Boundary rigidity and stability for generic simple metrics. J. Amer. Math. Soc., 18(4):975–1003, 2005.
  • [37] P. Stefanov, G. Uhlmann, and A. Vasy. Boundary rigidity with partial data. J. Amer. Math. Soc., 29(2):299–332, 2016.
  • [38] P. Stefanov, G. Uhlmann, and A. Vasy. Local and global boundary rigidity and the geodesic X-ray transform in the normal gauge. Ann. of Math. (2), 194(1):1–95, 2021.
  • [39] P. Stefanov, G. Uhlmann, A. Vasy, and H. Zhou. Travel time tomography. Acta Math. Sin. (Engl. Ser.), 35(6):1085–1114, 2019.
  • [40] A. M. Stuart. Inverse problems: A Bayesian perspective. Acta Numerica, 19:451–559, 2010.
  • [41] T. Bui-Thanh, O. Ghattas, J. Martin, and G. Stadler. A computational framework for infinite-dimensional Bayesian inverse problems Part I: The linearized case, with applications to global seismic inversion. SIAM J. Sci. Comput., 35:1–11, 2013.
  • [42] H. Triebel. Theory of Function Spaces. Modern Birkhäuser Classics. Springer Basel, 2010.
  • [43] C. Villani. Regularity of optimal transport and cut locus: From nonsmooth analysis to geometry to smooth analysis. Discrete & Continuous Dynamical Systems, 30(2):559–571, 2011.
  • [44] J.-N. Wang. Stability for the reconstruction of a Riemannian metric by boundary measurements. Inverse Problems, 15(5):1177–1192, 1999.
  • [45] E. Wiechert and K. Zöppritz. Über Erdbebenwellen. Nachr. Koenigl. Gesellschaft Wiss. Göttingen, 4:415–549, 1907.
  • [46] Y. Yang, G. Uhlmann, and H. Zhou. Travel time tomography in stationary spacetimes. J. Geom. Anal., 31:9573–9596, 2021.
  • [47] T. A. Yeung, E. Chung, and G. Uhlmann. Numerical inversion of three-dimensional geodesic X-ray transform arising from travel time tomography. SIAM J. Imaging Sciences, 12:1296–1323, 2019.
  • [48] H. Zhou. Lens rigidity with partial data in the presence of a magnetic field. Inverse Prob. Imaging, 12:1365–1387, 2018.