
Branch points of split degenerate superelliptic curves II: on a conjecture of Gerritzen and van der Put

Jeffrey Yelton Department of Mathematics and Computer Science, Wesleyan University 265 Church Street, Middletown, CT 06459-0128 jyelton@wesleyan.edu
Abstract.

Let $K$ be a field with a discrete valuation, and let $p$ be a prime. It is known that if $\Gamma\lhd\Gamma_{0}<\mathrm{PGL}_{2}(K)$ is a Schottky group normally contained in a larger group which is generated by order-$p$ elements each fixing $2$ points $a_{i},b_{i}\in\mathbb{P}_{K}^{1}$, then the quotient of a certain subset of the projective line $\mathbb{P}_{K}^{1}$ by the action of $\Gamma$ can be algebraized as a superelliptic curve $C:y^{p}=f(x)/K$. The subset $S\subset K\cup\{\infty\}$ consisting of these pairs $a_{i},b_{i}$ of fixed points is mapped bijectively modulo $\Gamma$ to the set $\mathcal{B}$ of branch points of the superelliptic map $x:C\to\mathbb{P}_{K}^{1}$. A conjecture of Gerritzen and van der Put, in the case that $C$ is hyperelliptic and $K$ has residue characteristic $\neq 2$, compares the cluster data of $S$ with that of $\mathcal{B}$. We show that this conjecture requires a slight modification in order to hold and then prove a much stronger version of the modified conjecture that holds for any $p$ and any residue characteristic.

1. Introduction

This paper represents a continuation of a study of split degenerate superelliptic curves $C$ which are $p$-cyclic covers of the projective line over a non-archimedean ground field $K$ of characteristic different from $p$, and of the uniformization of such a curve as a quotient of a certain subset of $K\cup\{\infty\}$ by the action of a free group $\Gamma<\mathrm{PGL}_{2}(K)$. We denote the set of branch points of the covering map $C\to\mathbb{P}_{K}^{1}$ by $\mathcal{B}$ and write $d=\#\mathcal{B}$. One can easily compute from the Riemann-Hurwitz formula that the genus of $C$ is given by $\frac{1}{2}(p-1)(d-2)$. It is also well known that a superelliptic curve which is a degree-$p$ cover of the projective line can be described by an affine model of the form

(1) $y^{p}=\prod_{i=1}^{d'}(x-z_{i})^{r_{i}}\in K[x],$

with $d'\in\{d,d-1\}$, where $1\leq r_{i}\leq p-1$ for $1\leq i\leq d'$ and $\mathcal{B}=\{z_{1},\dots,z_{d'}\}$ (resp. $\mathcal{B}=\{z_{1},\dots,z_{d'},\infty\}$) if $d'=d$ (resp. $d'=d-1$). In the special case that $p=2$, we call $C$ a hyperelliptic curve; in this case, our formula for the genus $g$ implies that $d=2g+2$ is even and that the equation in (1) can be written as $y^{2}=f(x)\in K[x]$ for a squarefree polynomial $f$ of degree $2g+1$ (resp. $2g+2$) if the degree-$2$ cover $C\to\mathbb{P}_{K}^{1}$ is (resp. is not) branched above $\infty$.
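As a quick sanity check of the genus formula and of the count $d'\in\{d,d-1\}$ of finite branch points, here is a minimal Python sketch. The function names `superelliptic_genus` and `affine_root_count` are hypothetical and chosen only for this illustration.

```python
def superelliptic_genus(p: int, d: int) -> int:
    """Genus (p - 1)(d - 2)/2 of a p-cyclic cover of the line with d branch points."""
    if p < 2 or d < 2:
        raise ValueError("expected a prime p >= 2 and at least 2 branch points")
    twice_genus = (p - 1) * (d - 2)
    if twice_genus % 2:
        raise ValueError("the product (p - 1)(d - 2) must be even")
    return twice_genus // 2


def affine_root_count(d: int, branched_above_infinity: bool) -> int:
    """Number d' of finite branch points: d' = d - 1 if infinity is a branch
    point, and d' = d otherwise."""
    return d - 1 if branched_above_infinity else d


print(superelliptic_genus(2, 6))    # hyperelliptic case with 6 branch points
print(superelliptic_genus(3, 4))    # p = 3 with 4 branch points
print(affine_root_count(6, True))   # finite branch points when infinity is one of the 6
```

For $p=2$ the first function requires $d$ to be even, which mirrors the parity remark made above for hyperelliptic curves.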

1.1. Background on non-archimedean uniformization of superelliptic curves

Throughout this paper, we assume that $K$ is a field equipped with a discrete valuation $v:K^{\times}\to\mathbb{Z}$, and we denote by $\mathbb{C}_{K}$ the completion of an algebraic closure of $K$. We fix a prime $p$ and a primitive $p$th root of unity $\zeta_{p}\in\mathbb{C}_{K}$ and assume throughout that we have $\zeta_{p}\in K$. (This will ensure, among other things, that each automorphism of $C/K$ as a cyclic $p$-cover of $\mathbb{P}_{K}^{1}$ is defined over $K$.) We adopt the convention of using the notation $\mathbb{P}_{K}^{1}$ both for the projective line with its structure as a variety over $K$ and for the set of $K$-points of $\mathbb{P}_{K}^{1}$; that is, in a context that will appear frequently in this paper, we write $\mathbb{P}_{K}^{1}$ for the set $K\cup\{\infty\}$.

Mumford showed in his groundbreaking paper [8] that any curve $C/K$ (not necessarily superelliptic) of genus $g\geq 1$ can be realized as a quotient of a certain subset $\Omega\subset\mathbb{P}_{K}^{1}$ by the action of a free subgroup $\Gamma<\mathrm{PGL}_{2}(K)$ of $g$ generators via fractional linear transformations if and only if the curve $C$ satisfies a property called split degenerate reduction (see [9, Definition 6.7] or [6, §IV.3]). The free subgroup $\Gamma<\mathrm{PGL}_{2}(K)$ must act discontinuously on $\mathbb{P}_{K}^{1}$ (i.e. the set of limit points under its action must not coincide with all of $\mathbb{P}_{K}^{1}$), and the subset $\Omega\subset\mathbb{P}_{K}^{1}$ such that $C$ can be uniformized as the quotient $\Omega/\Gamma$ coincides with the set of non-limit points. This main result on non-archimedean uniformization of curves is given as [8, Theorem 4.20] and [6, Theorems III.2.2, III.2.12.2, and IV.3.10]; many more details are contained in those sources. We comment that in the special case of $g=1$, after applying an appropriate automorphism of $\mathbb{P}_{K}^{1}$ we get $\Omega=\mathbb{P}_{K}^{1}\smallsetminus\{0,\infty\}=K^{\times}$ and that $\Gamma$ is generated by the fractional linear transformation $z\mapsto qz$ for some element $q\in K^{\times}$ of positive valuation, and thus we recover the Tate uniformization $C\cong K^{\times}/\langle q\rangle$ established in [10].

It is shown in [6, §9.2] and [12, §1] (for the $p=2$ case) and in [11, §2] (for general $p$) that given a prime $p$ and a split degenerate curve $C/K$ of genus $(p-1)g$ realized as such a quotient $\Omega/\Gamma$, the curve $C$ is superelliptic and a degree-$p$ cover of $\mathbb{P}_{K}^{1}$ if and only if $\Gamma$ is normally contained in a larger subgroup $\Gamma_{0}<\mathrm{PGL}_{2}(K)$ generated by $g+1$ elements $s_{0},\dots,s_{g}$ whose only relations are $s_{0}^{p}=\dots=s_{g}^{p}=1$. (In fact, the genus of every split degenerate $p$-cyclic cover of $\mathbb{P}_{K}^{1}$ is divisible by $p-1$, so all such curves arise in this way.) In this situation, we have $[\Gamma_{0}:\Gamma]=p$ and $\Omega/\Gamma_{0}\cong\mathbb{P}_{K}^{1}$ so that the natural surjection $\Omega/\Gamma\twoheadrightarrow\Omega/\Gamma_{0}$ is just the degree-$p$ covering map $C\to\mathbb{P}_{K}^{1}$. Each order-$p$ element $s_{i}\in\mathrm{PGL}_{2}(K)$ fixes exactly $2$ points of $\mathbb{P}_{K}^{1}$, which we denote as $a_{i}$ and $b_{i}$.
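To make the notion of an order-$p$ element with two prescribed fixed points concrete, the following sketch treats only the simplest case $p=2$ over $\mathbb{Q}$, where the matrix $\begin{pmatrix}a+b & -2ab\\ 2 & -(a+b)\end{pmatrix}$ squares to the scalar $(a-b)^{2}$ and therefore has order $2$ in $\mathrm{PGL}_{2}$. The names `involution_fixing`, `act` and `INFINITY` are hypothetical helpers for this illustration, not notation from the paper; for general $p$ one would need $\zeta_{p}$ in the field.

```python
from fractions import Fraction

INFINITY = None  # hypothetical marker for the point at infinity


def involution_fixing(a, b):
    """Matrix of the order-2 element of PGL_2(Q) whose fixed points are a and b."""
    a, b = Fraction(a), Fraction(b)
    if a == b:
        raise ValueError("the two fixed points must be distinct")
    return ((a + b, -2 * a * b), (Fraction(2), -(a + b)))


def act(matrix, z):
    """Apply the fractional linear transformation z -> (m11 z + m12)/(m21 z + m22)."""
    (m11, m12), (m21, m22) = matrix
    if z is INFINITY:
        return INFINITY if m21 == 0 else m11 / m21
    z = Fraction(z)
    denominator = m21 * z + m22
    return INFINITY if denominator == 0 else (m11 * z + m12) / denominator


s0 = involution_fixing(1, 3)
print(act(s0, 1), act(s0, 3))   # both points are fixed
print(act(s0, act(s0, 5)))      # applying s0 twice returns the input
```

Exact rational arithmetic is used so that fixed points and the relation $s_{0}^{2}=1$ can be checked without rounding.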

As in [14], we call the subgroup $\Gamma_{0}<\mathrm{PGL}_{2}(K)$ discussed above a $p$-Whittaker group (see [14, Remark 2.12] for an explanation of this terminology). One can show that we have $a_{i},b_{i}\in\Omega$ for $0\leq i\leq g$: see 4.3 below. Writing $S=\{a_{0},b_{0},\dots,a_{g},b_{g}\}\subset\Omega$, it is easy to verify that the set-theoretic image of $S$ modulo the action of the $p$-Whittaker group $\Gamma_{0}$ coincides with the set of branch points $\mathcal{B}\subset\mathbb{P}_{K}^{1}\cong\Omega/\Gamma_{0}$. For $0\leq i\leq g$, let us write $\alpha_{i},\beta_{i}\in\mathcal{B}$ for the respective images of $a_{i},b_{i}\in\Omega$, so that $\mathcal{B}=\{\alpha_{0},\beta_{0},\dots,\alpha_{g},\beta_{g}\}$. By [11, Proposition 3.1(a)], the superelliptic curve $C$ has an equation of the form

(2) $y^{p}=\prod_{i=0}^{g}(x-\alpha_{i})^{m_{i}}(x-\beta_{i})^{p-m_{i}},$

where the term $(x-\alpha_{i})^{m_{i}}$ (resp. $(x-\beta_{i})^{p-m_{i}}$) in the product is replaced by $1$ if we have $\alpha_{i}=\infty$ (resp. $\beta_{i}=\infty$).

1.2. Our previous results

In the previous work [14], the author considered a construction proceeding in the other direction: after fixing a prime $p$, we begin with a subset $S\subset\mathbb{P}_{K}^{1}$ of cardinality $2g+2$ for some integer $g\geq 1$ and try to construct a superelliptic curve of genus $(p-1)g$ over $K$ which is uniformized using a Schottky group $\Gamma\lhd\Gamma_{0}=\langle s_{0},\dots,s_{g}\rangle$, where the fixed points $a_{i},b_{i}\in\mathbb{P}_{K}^{1}$ of each generator $s_{i}$ of the associated $p$-Whittaker group are the elements of $S$. It is by no means the case that an arbitrary $(2g+2)$-element set $S$ can be partitioned into pairs $\{a_{i},b_{i}\}$ which each constitute the set of fixed points of an order-$p$ automorphism $s_{i}\in\mathrm{PGL}_{2}(K)$ such that the group $\Gamma_{0}:=\langle s_{0},\dots,s_{g}\rangle$ is a $p$-Whittaker group that can be used to uniformize a superelliptic curve over $K$. Indeed, it may be the case that for every possible partition $S=\bigsqcup_{i=0}^{g}\{a_{i},b_{i}\}$ and every choice of order-$p$ automorphisms $s_{i}$ fixing $a_{i},b_{i}$, the elements $s_{i}$ satisfy some group relations other than $s_{0}^{p}=\dots=s_{g}^{p}=1$, so that the subgroup of $\mathrm{PGL}_{2}(K)$ that they generate cannot be $p$-Whittaker.

Given a $(2g+2)$-element subset $S$, a partition $S=\bigsqcup_{i=0}^{g}\{a_{i},b_{i}\}$, and a choice of $s_{i}\in\mathrm{PGL}_{2}(K)$ of order $p$ which fixes $a_{i},b_{i}$ for $0\leq i\leq g$, we defined (in [14, Definition 1.1]) the associated subgroups $\Gamma\lhd\Gamma_{0}<\mathrm{PGL}_{2}(K)$ as $\Gamma_{0}=\langle s_{0},\dots,s_{g}\rangle$ and $\Gamma=\langle s_{0}^{j-1}s_{i}s_{0}^{j}\rangle_{0\leq i\leq g,1\leq j\leq p-1}$. If $\Gamma$ is a Schottky group (thus leading to the construction of a superelliptic curve) and $\Gamma_{0}$ cannot be generated by fewer than $g+1$ elements (which then ensures that $\Gamma$ is freely generated by the $(p-1)g$ elements of the form $s_{0}^{j-1}s_{i}s_{0}^{j}$), then we say (as in [14, Definition 1.2]) that $S$ is $p$-superelliptic. It turns out that if a set $S$ is $p$-superelliptic, then there is a unique partition $S=\bigsqcup_{i=0}^{g}\{a_{i},b_{i}\}$ such that the associated group $\Gamma$ is Schottky (regardless of the choice of order-$p$ automorphism $s_{i}\in\mathrm{PGL}_{2}(K)$ fixing $a_{i},b_{i}$, which is unique up to power by [14, Proposition 2.8(a)]). (While replacing a generator $s_{i}$ by a prime-to-$p$ power does not change $\Gamma_{0}$, it does change $\Gamma$, and it affects the curve $C\cong\Omega/\Gamma$ by changing the integer $m_{i}$ appearing in its defining equation (2): see [11, Proposition 3.2].) Moreover, it is shown that the condition on $S$ of being clustered in $\frac{v(p)}{p-1}$-separated pairs (see 3.3 below) is necessary (though not sufficient in general: see [14, Example 2.18]) for $S$ to be $p$-superelliptic. The above two assertions are given as parts (b) and (a) respectively of [14, Remark 2.15].

The main goal of [14] is to find a method of determining whether a given $(2g+2)$-element subset $S\subset\mathbb{P}_{K}^{1}$ is $p$-superelliptic. An algorithm is provided (as [14, Algorithm 4.2]) in which a non-empty even-cardinality input subset $S\subset\mathbb{P}_{K}^{1}$ is transformed through a sequence of modifications called foldings (which are bijections $\phi:S\to S'\subset\mathbb{P}_{K}^{1}$ satisfying certain properties) which do not affect the associated group $\Gamma$, checking at each step that the resulting set $S'$ is clustered in $\frac{v(p)}{p-1}$-separated pairs. Eventually, in the case that the input set $S$ is $p$-superelliptic, it is transformed into a set $S^{\mathrm{min}}$ satisfying a property which is called optimality (see [14, Definition 3.12]), for which there are no more “good foldings” to be performed. It is shown in particular (as [14, Lemma 3.18]) that an optimal set is $p$-superelliptic.

1.3. Our main result

Our main purpose in this paper is to compare a $p$-superelliptic set $S$ with its image $\mathcal{B}$ modulo the action of $\Gamma_{0}$ (the branch points of the resulting superelliptic curve), specifically in terms of the combinatorial data of the distances (under the metric induced by the discrete valuation $v$) between elements of $S$ and the corresponding elements in $\mathcal{B}$. In other words, we are interested in comparing the cluster data of $S$ with that of its modulo-$\Gamma_{0}$ image $\mathcal{B}$ (see 2.3 below). The main inspiration for this general question is a conjecture posed by Gerritzen and van der Put on page 282 of their book [6], which may be paraphrased in the language of clusters (as 2.7 below) as essentially stating that a subset $\mathfrak{s}\subset S$ is a cluster if and only if its image modulo the action of $\Gamma_{0}$ is a cluster of $\mathcal{B}$. This conjecture was formulated only in a context where $p=2$ and the residue characteristic of $K$ is not $2$.

We are able to resolve this conjecture by showing that it is false in general as stated (see 2.8 below) but that it is true when $S$ satisfies the condition of optimality discussed in the previous subsection. (We have shown as [14, Corollary 3.23, Remark 3.27] that in the special case studied by Kadziela in his dissertation [7, Chapters 5, 6], the set $S$ is optimal, thus recovering Kadziela’s result that the conjecture holds in this case.) We assert this in the more general context of removing all conditions on $p$ and on the residue characteristic of $K$. We are moreover able to directly compare the cluster data of the optimal set $S$ with that of its image $\mathcal{B}$. When $p$ is not the residue characteristic of $K$, this comparison is simple to state; otherwise, a general formula relating the relative depth of a cluster of $S$ with that of its image cluster of $\mathcal{B}$ is too unpleasant to write down here, but such a formula becomes simple in certain cases such as when no cluster of $S$ is itself the union of $\geq 2$ even-cardinality sub-clusters. Our main result (in the cases that are not too cumbersome to write down) is summed up in the following theorem.

Theorem 1.1 (cluster version).

Let $S\subset\mathbb{P}_{K}^{1}$ be an optimal subset with associated $p$-Whittaker group $\Gamma_{0}$ and superelliptic curve $C\cong\Omega/\Gamma$ with branch points $\mathcal{B}\subset\mathbb{P}_{K}^{1}$, and write $\pi:S\to\mathcal{B}$ for the bijection corresponding to reduction modulo $\Gamma_{0}$. Assume that $\pi(b_{g})=b_{g}=\infty$.

  (a) A subset $\mathfrak{s}\subset S$ with $2\leq\#\mathfrak{s}\leq 2g$ is a cluster of $S$ if and only if its image $\pi(\mathfrak{s})\subset\mathcal{B}$ is a cluster of $\mathcal{B}$.

  (b) Given a cluster $\mathfrak{s}$ of $S$ with $2\leq\#\mathfrak{s}\leq 2g$, in certain cases the relative depth of $\pi(\mathfrak{s})$ can be compared to that of $\mathfrak{s}$ as follows; below we write $\mathfrak{s}'$ for the smallest cluster of $S$ of cardinality $\geq 2$ and $\leq 2g$ which properly contains a cluster $\mathfrak{s}$.

    (i) If $\mathfrak{s}$ has odd cardinality, then we have

      (3) $\delta(\pi(\mathfrak{s}))=p\,\delta(\mathfrak{s}).$

    (ii) If $\mathfrak{s}$ has even cardinality and if $p$ is not the residue characteristic of $K$, then we have

      (4) $\delta(\pi(\mathfrak{s}))=\delta(\mathfrak{s}).$

    (iii) If $\mathfrak{s}$ has even cardinality and if neither $\mathfrak{s}$ nor $\mathfrak{s}'$ is the union of $\geq 2$ even-cardinality sub-clusters, then we have

      (5) $\delta(\pi(\mathfrak{s}))=\delta(\mathfrak{s})+2v(p).$

Remark 1.2.

The hypothesis in the above theorem that $\beta_{g}=b_{g}=\infty$ is no encumbrance to finding the cluster data of the set $\mathcal{B}$ of branch points of a superelliptic curve $C$ which is uniformized using the Schottky group associated to a set $S$ of fixed points. If, in our situation, we have $b_{g}\neq\infty$, then it is easy to see that we may choose a fractional linear transformation $\sigma\in\mathrm{PGL}_{2}(K)$ such that $\sigma(b_{g})=\infty$ and replace $S$ and its associated groups $\Gamma\lhd\Gamma_{0}$ with $\sigma(S)$ and $\sigma\Gamma\sigma^{-1}\lhd\sigma\Gamma_{0}\sigma^{-1}$ respectively, and that the conjugate $\sigma\Gamma_{0}\sigma^{-1}$ can be used to uniformize the same curve $C$. Meanwhile, if we have $\beta_{g}\neq\infty$, then we may similarly choose a fractional linear transformation $\tau\in\mathrm{PGL}_{2}(K)$ such that $\tau(\beta_{g})=\infty$ and replace $\mathcal{B}$ with $\tau(\mathcal{B})$; applying the automorphism $\tau$ to the projective line $\mathbb{P}_{K}^{1}$ (in order to modify the set of branch points in this way) induces an isomorphism between models of $C$ given by the equation in (2) in terms of sets $\mathcal{B}$ of branch points. Finally, there is an easy formula relating the cluster data of a finite subset $A\subset\mathbb{P}_{K}^{1}$ with that of the subset $\sigma(A)\subset\mathbb{P}_{K}^{1}$ for any automorphism $\sigma\in\mathrm{PGL}_{2}(K)$ (see, for instance, [5, Remark 5.7]).

We now present our main result not in terms of clusters but in the language of convex hulls in the Berkovich projective line, which removes any need to restrict to certain hypotheses on a given even-cardinality cluster $\mathfrak{s}$ or to assume that $\infty$ lies in either $S$ or $\mathcal{B}$.

Theorem 1.3 (Berkovich version).

Let $S\subset\mathbb{P}_{K}^{1}$ be an optimal subset with associated $p$-Whittaker group $\Gamma_{0}$ and superelliptic curve $C\cong\Omega/\Gamma$ with branch points $\mathcal{B}\subset\mathbb{P}_{K}^{1}$, and write $\pi:S\to\mathcal{B}$ for the bijection corresponding to reduction modulo $\Gamma_{0}$. Viewing $S\subset\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}$ as a subset consisting of points of Type I in the Berkovich projective line $\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}$, write $\Sigma_{S}$ for the convex hull of $S$ (i.e. the smallest connected subspace of $\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}$ which contains $S$), and define the convex hull $\Sigma_{\mathcal{B}}$ analogously.

The map $\pi$ extends to a homeomorphism $\pi_{*}:\Sigma_{S}\to\Sigma_{\mathcal{B}}$ which affects distances between points (with respect to the hyperbolic metric; see 3.1 below) according to the following formula. For $0\leq i\leq g$, let $\Lambda_{(i)}\subset\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}$ denote the (unique) non-backtracking path connecting the points $a_{i}$ and $b_{i}$. For any points $v,w\in\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}$ of Type II or III, let $[v,w]\subset\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}$ denote the (unique) non-backtracking path connecting them, and let $\llbracket v,w\rrbracket\subseteq[v,w]$ be the subspace consisting of points $\eta$ of distance $\leq\frac{v(p)}{p-1}$ from one of the paths $\Lambda_{(i)}$. The subspace $\llbracket v,w\rrbracket$ is a disjoint union of shorter paths in $\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}$; define $\mu(v,w)$ to be the sum of the lengths (i.e. distances between endpoints) of these paths. For any points $v,w\in\Sigma_{S}\smallsetminus\{\eta_{a_{i}},\eta_{b_{i}}\}_{0\leq i\leq g}$, we have

(6) $\delta(\pi_{*}(v),\pi_{*}(w))=\delta(v,w)+(p-1)\mu(v,w).$
Remark 1.4.

It follows from [14, Remark 2.15] or from 3.4 below that the set $S$ in the statement of 1.3, by being $p$-superelliptic, satisfies that the axes $\Lambda_{(i)}$ are pairwise disjoint and at a distance of $>\frac{2v(p)}{p-1}$ from one another. The formula in 1.3 relating distances between points of $\Sigma_{S}$ to distances between their images in $\Sigma_{\mathcal{B}}$ can be described more visually as follows. The map $\pi_{*}$ “transforms” the space $\Sigma_{S}$ into the space $\Sigma_{\mathcal{B}}$ simply by dilating each axis $\Lambda_{(i)}$ by a factor of $p$, if $p$ is not the residue characteristic, and leaving the rest of the space unchanged with respect to the metric. This generalizes to the case of residue characteristic $p$ by, instead of dilating only each axis $\Lambda_{(i)}$, dilating the tubular neighborhood of radius $\frac{v(p)}{p-1}$ of each axis $\Lambda_{(i)}$ by a factor of $p$.
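The dilation described in this remark can be tried out numerically. The sketch below simply evaluates the right-hand side of formula (6) for given values of $\delta(v,w)$ and $\mu(v,w)$; the function name `image_distance` and the sample values are hypothetical, and the three choices of $\mu$ are our own illustrative reading of how a path may sit relative to the axes, not computations taken from the paper.

```python
from fractions import Fraction


def image_distance(delta, mu, p):
    """Right-hand side of formula (6): delta(v, w) + (p - 1) * mu(v, w)."""
    delta, mu = Fraction(delta), Fraction(mu)
    if not 0 <= mu <= delta:
        raise ValueError("mu measures part of the path, so 0 <= mu <= delta")
    return delta + (p - 1) * mu


# A path lying entirely on an axis: mu = delta, so the length is multiplied by p.
print(image_distance(4, 4, 3))

# A path meeting no tubular neighborhood: mu = 0, so the length is unchanged.
print(image_distance(4, 0, 5))

# Assumed values p = 3, v(p) = 2: a path of length 5 whose two ends each spend
# length v(p)/(p - 1) inside a neighborhood gains (p - 1) * mu = 2 * v(p).
p, v_p = 3, 2
mu = 2 * Fraction(v_p, p - 1)
print(image_distance(5, mu, p))
```

Under these assumptions the three outputs have the same shape as the right-hand sides of (3), (4) and (5) respectively.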

The fact that 1.3 implies 1.1 will be given as 3.12 below, while the results in §3.2 will show that 1.3 allows us to compute the relative depths of clusters of $\mathcal{B}$ in the cases not covered by the statement of 1.1, so that 1.3 is stronger than 1.1. We also mention that, as discussed in §1.2, one may use [14, Algorithm 4.2] to turn any $p$-superelliptic set into an optimal set; therefore, 1.3 enables us, given any $p$-superelliptic set $S$, to determine the metric graph isomorphism type of $\Sigma_{\mathcal{B}}$ (or, equivalently, the cluster data of $\mathcal{B}$ by the results of §§3.2 and 3.3 below), where $\mathcal{B}\subset\mathbb{P}_{K}^{1}$ is the set of branch points of the superelliptic curve associated to $S$.

1.4. Outline of the paper

We postpone the proof of our main result to §4 and use §§2 and 3 to establish three different ways to discuss the non-archimedean combinatorial properties of a finite subset $A\subset\mathbb{P}_{K}^{1}$ and how to translate between them, in order to understand the relationship between the conjecture as originally stated by Gerritzen and van der Put and our two variants of it in the form of Theorems 1.1 and 1.3. These three aspects of the set $A$ are its position, its cluster data, and the metric graph properties of its convex hull in the Berkovich projective line. Each of these frameworks has its advantages and its disadvantages. Looking at the position of $A$ is the framework used by Gerritzen and van der Put in many places in their book [6] and is directly tied to constructing useful models of the projective line, but position conveys less information than cluster data and the convex hull do (specifically, the depths of clusters and the lengths of segments of the convex hull are not reflected by the position). The language of cluster data has recently become popular as it relates to many applications involving the arithmetic of hyperelliptic and superelliptic curves ([3, 4] being the originating examples), and cluster data is easy to compute, but it has the downside of being somewhat affected by automorphisms of the projective line (albeit in a way that can be described by easy formulas), while position and metric graph properties of the convex hull are independent of the chosen coordinate of the projective line; it is for this reason that our results stated in terms of clusters often require an extra hypothesis to be articulated succinctly. Moreover, some hypotheses and results are cumbersome to describe in terms of clusters, which is why only certain cases are fully described in the statement of 1.1.

Viewing a set $A$ through its convex hull and studying its metric graph properties, while a less accessible framework than that of position or cluster data, is the most powerful both in terms of ease of stating hypotheses and results and ease of proving them directly. It is for this reason that the result that we will directly prove is 1.3 and our methods of proving it mainly involve convex hulls and other subspaces of the Berkovich projective line.

In §2, we examine the aforementioned conjecture of Gerritzen and van der Put, putting it as precisely as possible by rigorously defining what they mean by “position” of a finite subset of $\mathbb{P}_{K}^{1}$ (in §2.1) and then translating their conjecture into the language of clusters (in §2.2). We then show in §2.3 that without adding the hypothesis that the set $S$ is optimal as in 1.1, the conjecture does not hold.

The object of §3 is then to reframe everything in terms of the convex hull $\Sigma_{S}$ of the subset $S\subset\mathbb{P}_{K}^{1}$ as a subset of the Berkovich projective line $\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}$ consisting of points of Type I, culminating in a proof that the “Berkovich version” of our main result (1.3) implies the “cluster version” of our main result (1.1). This is done by introducing (a slightly simplified version of) the Berkovich projective line $\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}$ in §3.1, studying convex hulls in $\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}$ in §3.2, and directly relating the metric graph structure of the convex hull of $A$ to its cluster data in §3.3.

The actual proof of 1.3 takes up §4, which is the heart of this paper and is broken up into five subsections. The first two provide background information about the action of $\mathrm{PGL}_{2}(K)$ on convex hulls and about the map $\pi:\Omega\twoheadrightarrow\Omega/\Gamma\cong C$ as an explicit theta function, and the last three break the proof of 1.3 into three parts. Along the way, we will prove a result (4.10) which approximates the outputs of $\pi$ on a certain subset of its domain $\Omega$, which is interesting in its own right.

We finish the paper with a corollary to our main result which describes a property that generally holds for the branch points of a split degenerate $p$-cyclic cover of the projective line.

1.5. Acknowledgements

The author is grateful to Christopher Rasmussen for helpful discussions that took place during the process of developing these results. The author would also like to thank Robert Benedetto for conversations which enabled him to frame the main result in terms of a map on subsets of the Berkovich projective line which is naturally induced by the uniformizing map of the superelliptic curve in the proof of 1.3 and to make the proof more rigorous.

2. A conjecture of Gerritzen and van der Put

On [6, p. 282], Gerritzen and van der Put make a conjecture regarding the relationship between the set $S$ of fixed points of generators of a $p$-Whittaker group $\Gamma_{0}$ and its image $\mathcal{B}$ under reduction modulo $\Gamma_{0}$, which is the set of branch points of the resulting superelliptic curve. The conjecture is worded so as to say that the “position” of $S$ and the “position” of $\mathcal{B}$ are “identical”. This conjecture is restated by Kadziela in his dissertation as [7, Conjecture 3.1]. In §2.1, we make the statement of this conjecture rigorous by carefully defining position, and in §2.2, we introduce the language of clusters and translate the conjecture so that it can be expressed in this language. Then in §2.3, we show by counterexample that the conjecture is not actually true. As it will turn out that the conjecture becomes true with the addition of a hypothesis on $S$, our work in this section is still important in allowing us to properly define what it means to “have identical position” and to express it in the language of clusters.

2.1. Statement of the conjecture in terms of position

Although the term “position” is not defined in either [6] or [7] in very precise language, one can understand the meaning to be as follows. Let $R\subset K$ be the ring of integers, and let $k$ be the residue field of the local ring $R$. Given an ordered $3$-element subset $\underline{z}=\{z_{0},z_{1},z_{\infty}\}\subset\mathbb{P}_{K}^{1}$, there is a unique automorphism $\gamma_{\underline{z}}\in\mathrm{PGL}_{2}(K)$ which sends $z_{i}$ to $i\in\mathbb{P}_{K}^{1}$ for $i=0,1,\infty$. Composing this with the reduction map $R\to k$, we get a map $\bar{\gamma}_{\underline{z}}:\mathbb{P}_{K}^{1}\to\mathbb{P}_{k}^{1}$. In [6, §I.2], Gerritzen and van der Put define, for a finite subset $A\subset\mathbb{P}_{K}^{1}$, a tree $T(A)$ whose vertices correspond to equivalence classes of ordered $3$-element subsets $\underline{z}$ of $A$, where subsets $\underline{z},\underline{w}\subset\mathbb{P}_{K}^{1}$ are equivalent if the automorphism $\bar{\gamma}_{\underline{w}}\bar{\gamma}_{\underline{z}}^{-1}\in\mathrm{PGL}_{2}(k)$ is invertible. Writing $|T(A)|$ for the set of vertices of the graph $T(A)$, let us define the map

$R_{A}:\mathbb{P}_{K}^{1}\to(\mathbb{P}_{k}^{1})^{\#|T(A)|}$

to be the product of the maps $\bar{\gamma}_{\underline{z}}$, where $\underline{z}$ ranges over a set of representatives of the equivalence classes of ordered $3$-element subsets of $A$. The image $R_{A}(\mathbb{P}_{K}^{1})$ is a curve over $k$ whose components are all isomorphic to $\mathbb{P}_{k}^{1}$ and intersect only at ordinary double points, none of which is the image of a point in $A$; the components correspond to the vertices of the graph $T(A)$. See [6, §I.4.2] for more details and proofs of these assertions. (We moreover note that the image of $R_{A}$ is in fact the special fiber of a model of $\mathbb{P}_{R}^{1}$ which is minimal with respect to the condition that the images of the $R$-points of this model extending the points in $A$ do not intersect in the special fiber.) We may now define “position” as follows.

Definition 2.1.

The position of a finite subset $A\subset\mathbb{P}_{K}^{1}$ of cardinality $\geq 3$ is the combinatorial data of the tree $T(A)$ along with the map $r_{A}:A\to|T(A)|$ given by sending a point $z\in A$ to the vertex of $T(A)$ corresponding to the (unique) component of $R_{A}(\mathbb{P}_{K}^{1})$ which contains $R_{A}(z)$.

Given two finite subsets $A,A'\subset\mathbb{P}_{K}^{1}$ of (equal) cardinality $\geq 3$ and a bijection $\varphi:A\to A'$, the sets $A$ and $A'$ are said to have the same position if there is a graph isomorphism $T(A)\stackrel{\sim}{\to}T(A')$ making the below diagram commute.

(7) $\begin{CD} A @>{r_{A}}>> |T(A)| \\ @V{\varphi}VV @VV{\wr}V \\ A' @>{r_{A'}}>> |T(A')| \end{CD}$
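The reduction maps $\bar{\gamma}_{\underline{z}}$ used in §2.1 to build $T(A)$ can be experimented with over $K=\mathbb{Q}$ equipped with a $p$-adic valuation. The following is a minimal sketch under that assumption; the helper names `valuation`, `gamma`, `reduce_mod_p` and the marker `INF` are hypothetical, and the sketch only handles triples of finite points.

```python
from fractions import Fraction

INF = "inf"  # hypothetical marker for the point at infinity


def valuation(x, p):
    """p-adic valuation of a nonzero rational number."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("v(0) is infinite")
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        v += 1
    while d % p == 0:
        d //= p
        v -= 1
    return v


def gamma(triple, z):
    """The element of PGL_2(Q) sending (z0, z1, z_inf) to (0, 1, infinity)."""
    z0, z1, z_inf = (Fraction(t) for t in triple)
    z = Fraction(z)
    numerator = (z - z0) * (z1 - z_inf)
    denominator = (z - z_inf) * (z1 - z0)
    return INF if denominator == 0 else numerator / denominator


def reduce_mod_p(x, p):
    """Reduction map from P^1(Q) to P^1(F_p)."""
    if x == INF:
        return INF
    if x == 0:
        return 0
    v = valuation(x, p)
    if v > 0:
        return 0
    if v < 0:
        return INF
    return x.numerator * pow(x.denominator, -1, p) % p


A = [0, 3, 6, 1]
triple = (0, 3, 1)  # z0 = 0, z1 = 3, z_inf = 1
images = [reduce_mod_p(gamma(triple, z), 3) for z in A]
print(images)
```

For this choice of triple and $p=3$, the four points of $A$ have pairwise distinct images in $\mathbb{P}^{1}(\mathbb{F}_{3})$. The modular inverse via `pow(d, -1, p)` requires Python 3.8 or later.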

It is clear directly from definitions that the position of a subset $A\subset\mathbb{P}_{K}^{1}$ does not change after applying an automorphism in $\mathrm{PGL}_{2}(K)$ to the whole subset. Now the conjecture of Gerritzen and van der Put may be presented as follows.

Conjecture 2.2 (Gerritzen and van der Put, 1980).

With the above set-up, the $(2g+2)$-element subsets $S\subset\mathbb{P}_{K}^{1}$ and $\mathcal{B}\subset\mathbb{P}_{K}^{1}$ have the same position.

2.2. Cluster data and position

We introduce the language of clusters and cluster data, following its use in [4], below.

Definition 2.3.

Let $A\subset\mathbb{P}_{K}^{1}$ be a finite subset. A subset $\mathfrak{s}\subseteq A$ is called a cluster (of $A$) if there is some subset $D\subset K$ which is a disc under the metric induced by the valuation $v:K^{\times}\to\mathbb{Z}$ such that $\mathfrak{s}=A\cap D$. The depth of a cluster $\mathfrak{s}$ is the integer

(8) $d(\mathfrak{s}):=\min_{z,z'\in\mathfrak{s}}v(z-z').$

Given a cluster $\mathfrak{s}$ of $A$ which is properly contained in another cluster of $A$, and letting $\mathfrak{s}'\subseteq A$ be the smallest cluster properly containing $\mathfrak{s}$, we define the relative depth $\delta(\mathfrak{s})$ of $\mathfrak{s}$ to be the difference $d(\mathfrak{s})-d(\mathfrak{s}')$.

The data of all clusters of a finite subset $A\subset\mathbb{P}_{K}^{1}$ along with each of their depths (or relative depths) is called the cluster data of $A$. The data that only consists of all clusters of $A$ (that is, which subsets of $A$ are clusters, without considering depth) is called the combinatorial cluster data of $A$.
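The definitions of cluster, depth and relative depth can be tried out on a small example. The sketch below assumes $K=\mathbb{Q}$ with the $p$-adic valuation and a finite set $A$ of distinct rationals not containing $\infty$; the function names are hypothetical. It uses the fact that any point of $A\cap D$ can serve as the center of the disc $D$, so every nonempty cluster has the form $\{w\in A: v(w-z)\geq n\}$ for some $z\in A$.

```python
from fractions import Fraction
from itertools import combinations
import math


def valuation(x, p):
    """p-adic valuation of a rational number, with v(0) = +infinity."""
    x = Fraction(x)
    if x == 0:
        return math.inf
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        v += 1
    while d % p == 0:
        d //= p
        v -= 1
    return v


def clusters(A, p):
    """All nonempty clusters of A, i.e. intersections of A with discs."""
    A = [Fraction(a) for a in A]
    found = {frozenset([z]) for z in A}
    for z in A:
        for n in {valuation(w - z, p) for w in A if w != z}:
            found.add(frozenset(w for w in A if valuation(w - z, p) >= n))
    return found


def depth(s, p):
    """Minimum of v(z - z') over distinct z, z' in s; infinite for a singleton."""
    return min((valuation(z - w, p) for z, w in combinations(s, 2)), default=math.inf)


def relative_depth(s, all_clusters, p):
    """d(s) - d(s') for the smallest cluster s' properly containing s, else None."""
    parents = [t for t in all_clusters if s < t]
    if not parents:
        return None
    return depth(s, p) - depth(min(parents, key=len), p)


A = [0, 1, 3, 12]
data = clusters(A, 3)
for s in sorted(data, key=len, reverse=True):
    print(sorted(s), depth(s, 3), relative_depth(s, data, 3))
```

Because clusters are either nested or disjoint, the clusters properly containing a given one form a chain, which is why picking the parent of smallest cardinality is unambiguous.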

Remark 2.4.

Applying a fractional linear transformation to a finite subset $A\subset\mathbb{P}_{K}^{1}$ changes the combinatorial cluster data in a predictable way. Every fractional linear transformation is a composition of translations, homotheties, and the reciprocal map $\iota:z\mapsto z^{-1}$. Translations and homotheties clearly do not affect the combinatorial cluster data, while the reciprocal map affects it if and only if not all elements of $A$ have the same valuation, in the following way. Let $\mathfrak{s}\subset A$ be the subset of elements with maximal valuation; it is easy to see that $\mathfrak{s}$ is a cluster. Then for each cluster $\mathfrak{c}$ of $A$ such that $\mathfrak{c}\cap\mathfrak{s}=\varnothing$ or $\mathfrak{c}\subsetneq\mathfrak{s}$, the image $\iota(\mathfrak{c})$ is a cluster of $\iota(A)$, while for each cluster $\mathfrak{c}$ of $A$ which contains $\mathfrak{s}$, the image $\iota(A\smallsetminus\mathfrak{c})$ is a cluster of $\iota(A)$, and this describes all clusters of $\iota(A)$. It is also easy to describe what happens to the relative depths of the clusters; see [5, Remark 5.7] for more details.
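The first part of this remark can be checked on a small example. The sketch below, again assuming $K=\mathbb{Q}$ with the $3$-adic valuation and using hypothetical helper names, records clusters as sets of list indices so that the combinatorial cluster data of $A$ and of its image under a bijection can be compared directly. It does not attempt to implement the full description of the clusters of $\iota(A)$.

```python
from fractions import Fraction
import math


def valuation(x, p):
    """p-adic valuation of a rational number, with v(0) = +infinity."""
    x = Fraction(x)
    if x == 0:
        return math.inf
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        v += 1
    while d % p == 0:
        d //= p
        v -= 1
    return v


def combinatorial_cluster_data(points, p):
    """Nonempty clusters of the list `points`, recorded as sets of list indices."""
    pts = [Fraction(z) for z in points]
    idx = range(len(pts))
    found = {frozenset([i]) for i in idx}
    for i in idx:
        for n in {valuation(pts[j] - pts[i], p) for j in idx if j != i}:
            found.add(frozenset(j for j in idx if valuation(pts[j] - pts[i], p) >= n))
    return found


A = [Fraction(1), Fraction(3), Fraction(9)]   # valuations 0, 1, 2 at p = 3
affine = [3 * z + 2 for z in A]               # a homothety followed by a translation
recip = [1 / z for z in A]                    # the reciprocal map (0 is not in A)

same_after_affine = combinatorial_cluster_data(A, 3) == combinatorial_cluster_data(affine, 3)
same_after_recip = combinatorial_cluster_data(A, 3) == combinatorial_cluster_data(recip, 3)
print(same_after_affine, same_after_recip)
```

Since the elements of this $A$ do not all have the same valuation, the remark predicts that the reciprocal map changes the combinatorial cluster data, which is what the second comparison reports.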

Lemma 2.5.

In the definition of $T(A)$ from §2.1, each equivalence class of ordered triples has a representative $(z_0, z_1, z_\infty)$ satisfying $v(z_\infty - z_0) < v(z_1 - z_0)$, where we adopt the convention that $v(\infty - z) = -\infty$ for $z \in K$. Conversely, any ordered triple $(z_0', z_1', z_\infty')$ such that $v(z_\infty' - z_0') < v(z_1' - z_0') = v(z_1 - z_0) = v(z_i' - z_j)$ for all $i, j \in \{0, 1\}$ is equivalent to $(z_0, z_1, z_\infty)$.

Proof.

Choose any ordered triple $\underline{w} = (z_0, z_1, w)$. It is easy to verify, first of all, that permuting the coordinates $z_0, z_1, w$ of the ordered triple defining $\underline{w}$ does not affect the equivalence class, so after applying a suitable permutation, we assume that $v(z_1 - z_0) \geq \max\{v(w - z_0), v(w - z_1)\}$. Choose an element $z_\infty \in A$ satisfying $v(z_\infty - z_0) = v(z_\infty - z_1) < v(z_1 - z_0)$ (such a $z_\infty$ certainly exists since we may take $z_\infty = \infty$), and let $\underline{z}$ be the ordered triple $(z_0, z_1, z_\infty)$. It is straightforward to verify that the map $\bar{\gamma}_{\underline{z}} : \mathbb{P}_K^1 \to \mathbb{P}_k^1$ may be described as sending $z \in \mathbb{P}_K^1$ to the reduction of $(z_1 - z_0)^{-1}(z - z_0)$. Now it is visible from this formula that we have $\bar{\gamma}_{\underline{z}}(w) \notin \{1, 0\}$, so $\bar{\gamma}_{\underline{z}}$ sends $z_0, z_1, w$ to $3$ distinct points in $\mathbb{P}_k^1$. It follows that the composition $\bar{\gamma}_{\underline{z}} \bar{\gamma}_{\underline{w}}^{-1} \in \mathrm{M}_2(k)$ sends $0, 1, \infty \in \mathbb{P}_k^1$ to $3$ distinct points. It is now an elementary exercise to show that $\bar{\gamma}_{\underline{z}} \bar{\gamma}_{\underline{w}}^{-1}$ is invertible, so that $\underline{z}$ and $\underline{w}$ are in the same equivalence class.

Now let $z_\infty', z_1' \in A$ be elements satisfying $v(z_\infty' - z_0) < v(z_1' - z_0) = v(z_1 - z_0) = v(z_1' - z_1)$. Then we apply what we have just shown to conclude that the ordered triple $(z_0, z_1, z_1')$ is equivalent both to $(z_0, z_1, z_\infty)$ and to $(z_0, z_1', z_\infty')$, and therefore the last two ordered triples are equivalent to each other. Then by a similar argument, for any $z_0' \in A$ satisfying the hypotheses of the lemma, we see that the ordered triple $(z_0', z_1', z_\infty')$ is equivalent to all of these. ∎

Proposition 2.6.

Let φ:AA\varphi:A\to A^{\prime} be a bijection of finite subsets of K1\mathbb{P}_{K}^{1}, and assume that we have AA\infty\in A\cap A^{\prime} and that φ()=\varphi(\infty)=\infty. Then the sets AA and AA^{\prime} have the same position if and only if φ\varphi acts as a bijection between the clusters of AA and those of AA^{\prime}.

Proof.

There is a correspondence between the vertices of |T(A)||T(A)| and the clusters of AA of cardinality 2\geq 2, described as follows. By Lemma 2.5, a vertex v|T(A)|v\in|T(A)| is represented by an ordered triple (z0,z1,z)(z_{0},z_{1},z_{\infty}) satisfying v(zz0)<v(z1z0)v(z_{\infty}-z_{0})<v(z_{1}-z_{0}); the corresponding cluster is the smallest cluster 𝔰\mathfrak{s} which contains z0z_{0} and z1z_{1} (note that z𝔰z_{\infty}\notin\mathfrak{s}). Conversely, given a cluster 𝔰\mathfrak{s} of AA, choosing elements z0,z1𝔰z_{0},z_{1}\in\mathfrak{s} such that z0z_{0} and z1z_{1} do not both lie in a proper sub-cluster of 𝔰\mathfrak{s}, and choosing zA𝔰z_{\infty}\in A\smallsetminus\mathfrak{s}, the corresponding vertex is the one represented by the ordered triple (z0,z1,z)(z_{0},z_{1},z_{\infty}). Lemma 2.5 directly implies that this vertex depends neither on the choice of elements z0,z1𝔰z_{0},z_{1}\in\mathfrak{s} which do not both lie in a proper sub-cluster of 𝔰\mathfrak{s} nor on the choice of zA𝔰z_{\infty}\in A\smallsetminus\mathfrak{s}.

Let us denote the cluster corresponding to a vertex v|T(A)|v\in|T(A)| by 𝔰v\mathfrak{s}_{v}. Given any ordered triple z¯=(z0,z1,z)\underline{z}=(z_{0},z_{1},z_{\infty}) satisfying v(zz0)<v(z1z0)v(z_{\infty}-z_{0})<v(z_{1}-z_{0}) which represents a vertex v|T(A)|v\in|T(A)| and given any pair of distinct elements z,zAz,z^{\prime}\in A, the map γ¯z¯\bar{\gamma}_{\underline{z}} (whose formula is explicitly given in the proof of Lemma 2.5) sends zz and zz^{\prime} to the same point in k1\mathbb{P}_{k}^{1} if and only if either we have z,z𝔰vz,z^{\prime}\notin\mathfrak{s}_{v} or we have z,z𝔠z,z^{\prime}\in\mathfrak{c} for some proper sub-cluster 𝔠𝔰v\mathfrak{c}\subsetneq\mathfrak{s}_{v}. It follows that for each vertex v|T(A)|v\in|T(A)|, the inverse image rA1(v)r_{A}^{-1}(v) coincides with the set of elements z𝔰vz\in\mathfrak{s}_{v} such that zz is not in any proper sub-cluster of 𝔰v\mathfrak{s}_{v}, together with the element A\infty\in A if 𝔰v\mathfrak{s}_{v} is not properly contained in any cluster. Therefore, the map rAr_{A} determines and is determined by the combinatorial cluster data of AA, and the claim of the proposition follows. ∎

In light of Proposition 2.6 above, and keeping in mind that a given element in a finite subset $A \subset \mathbb{P}_K^1$ may be moved to $\infty$ by applying an appropriate fractional linear transformation to $A$, 2.2 may be restated in the language of cluster data as follows.

Assertion 2.7.

With the above set-up, after possibly applying a suitable fractional linear transformation to the (2g+2)(2g+2)-element subset SK1S\subset\mathbb{P}_{K}^{1} and a suitable fractional linear transformation to its image π(S)=K1\pi(S)=\mathcal{B}\subset\mathbb{P}_{K}^{1}, the subsets S,K1S,\mathcal{B}\subset\mathbb{P}_{K}^{1} have the same combinatorial cluster data.

2.3. Counterexample to the conjecture as stated

Gerritzen and van der Put show in [6, §IX.2.5] that their conjecture holds as stated when $g = 1$, and they claim, without showing explicit calculations, that they have confirmed by checking each possible position of a $6$-element subset $S \subset \mathbb{P}_K^1$ that their conjecture holds also when $g = 2$. However, in the following remark we show that there are counterexamples to Assertion 2.7 even when $g = 2$.

Remark 2.8.

We observe an issue with Gerritzen and van der Put’s conjecture (as interpreted literally) in that when $g \geq 2$ it is possible for two good cardinality-$(2g+2)$ subsets $S, S' \subset \mathbb{P}_K^1$ to induce the same Schottky group $\Gamma \lhd \Gamma_0$ but to not have the same position. As an example, suppose that $K = \mathbb{Q}_3(\zeta_3)$; let

$S = \{a_0 := -9,\ b_0 := 9,\ a_1 := 3,\ b_1 := 12,\ a_2 := 1,\ b_2 := \infty\};$

and for i=0,1,2i=0,1,2, let siPGL2(K)s_{i}\in\mathrm{PGL}_{2}(K) be the unique fractional linear transformation of order 22 which fixes the points ai,biK1a_{i},b_{i}\in\mathbb{P}_{K}^{1}, noting that s0s_{0} is simply the function z81zz\mapsto\frac{81}{z}. This set SS satisfies the hypotheses of [7, Theorem 5.7], and so that theorem tells us that Γ:=s0s1,s0s2\Gamma:=\langle s_{0}s_{1},s_{0}s_{2}\rangle is a Schottky group (this can also be deduced using [14, Corollary 3.23]). Now let

$S' = \{a_0' := a_0 = -9,\ b_0' := b_0 = 9,\ a_1' := s_0(a_1) = 27,\ b_1' := s_0(b_1) = 27/4,\ a_2' := a_2 = 1,\ b_2' := b_2 = \infty\}.$

Let $\phi : S \to S'$ be the bijection given by $(a_i, b_i) \mapsto (a_i', b_i')$ for $0 \leq i \leq 2$, noting that we have $\pi \circ \phi = \pi$ on $S$. Letting $s_i' = s_i$ for $i = 0, 2$ and $s_1' = s_0 s_1 s_0$, we see that each $s_i'$ is the unique fractional linear transformation of order $2$ which fixes the points $a_i', b_i' \in \mathbb{P}_K^1$ and that, in constructing a Schottky group from $S'$ in the usual way, we get

(9) $\Gamma' := \langle s_0' s_1', s_0' s_2' \rangle = \langle (s_0 s_1)^{-1}, s_0 s_2 \rangle = \Gamma.$

However, 2.2 cannot hold for $S$ because there is no fractional linear transformation $\sigma \in \mathrm{PGL}_2(K)$ such that $\sigma(S')$ has the same combinatorial cluster data as $S$ (or more precisely, such that the bijection $\sigma \circ \phi : S \to \sigma(S')$ acts as a bijection between clusters of $S$ and clusters of $\sigma(S')$). To see this, we note that the even-cardinality clusters of $S$ are $\{-9, 9, 3, 12\}$, $\{-9, 9\}$, and $\{3, 12\}$, while the even-cardinality clusters of $S'$ are only $\{-9, 9, 27, 27/4\}$ and $\{27, 27/4\}$; Remark 2.4 then shows that no such $\sigma$ exists. Alternately, in terms of position, one sees that the positions of $S$ and $S'$ fall under the cases (a) and (b) respectively in [6, §IX.2.5.3].
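The lists of even-cardinality clusters in this remark can be checked numerically. The sketch below is only an illustration under stated assumptions: it works with the $3$-adic valuation on $\mathbb{Q}$ rather than the normalized valuation on $K = \mathbb{Q}_3(\zeta_3)$ (the two differ by a constant factor on $\mathbb{Q}$, so they determine the same clusters), it represents $\infty$ by a placeholder string, and the names `vp`, `clusters`, `even_clusters`, and `s0` are hypothetical helpers rather than notation from the paper.

```python
from fractions import Fraction

INF = "inf"  # placeholder for the point at infinity (an assumption of this sketch)


def vp(x, p=3):
    """p-adic valuation of a nonzero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation of 0 is infinite")
    num, den, val = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        val += 1
    while den % p == 0:
        den //= p
        val -= 1
    return val


def clusters(A, p=3):
    """Clusters of A with at least 2 elements; infinity lies in no disc of K."""
    pts = [Fraction(a) for a in A if a != INF]
    found = set()
    for c in pts:
        for z in pts:
            if z != c:
                r = vp(z - c, p)
                found.add(frozenset(w for w in pts if w == c or vp(w - c, p) >= r))
    return found


def even_clusters(A, p=3):
    return {c for c in clusters(A, p) if len(c) % 2 == 0}


def s0(z):
    """The involution z -> 81/z, which fixes -9 and 9."""
    if z == INF:
        return Fraction(0)
    if z == 0:
        return INF
    return Fraction(81) / Fraction(z)


S = [-9, 9, 3, 12, 1, INF]
S_prime = [-9, 9, s0(3), s0(12), 1, INF]

print(sorted(len(c) for c in even_clusters(S)))
print(sorted(len(c) for c in even_clusters(S_prime)))
```

Under these assumptions, the computation returns three even-cardinality clusters for $S$ and only two for $S'$, in agreement with the lists given in the remark.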

3. Convex hulls in the Berkovich projective line

Let $\mathbb{C}_K$ denote the completion of an algebraic closure of $K$, and write $v : \mathbb{C}_K \to \mathbb{R}$ for an extension of the valuation $v : K^\times \to \mathbb{Z}$. Below when we speak of a disc $D \subset \mathbb{C}_K$, we mean that $D$ is a closed disc with respect to the metric induced by $v : \mathbb{C}_K \to \mathbb{R}$; in other words, $D = \{z \in \mathbb{C}_K \ |\ v(z - c) \geq r\}$ for some center $c \in \mathbb{C}_K$ and real number $r \in \mathbb{R}$, which is the (logarithmic) radius of $D$. Given a disc $D \subset \mathbb{C}_K$, we denote its logarithmic radius by $d(D)$.

3.1. The Berkovich projective line and related notation

The Berkovich projective line K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} over an algebraic closure of a discrete valuation field KK is a type of rigid analytification of the projective line K1\mathbb{P}_{\mathbb{C}_{K}}^{1} and is typically defined in terms of multiplicative seminorms on K[x]\mathbb{C}_{K}[x] as in [1, §1] and [2, §6.1]. Points of K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} are identified with multiplicative seminorms which are each classified as Type I, II, III, or IV. For the purposes of this paper, as in [14], we may safely ignore points of Type IV and need only adopt a fairly rudimentary construction which does not directly involve seminorms.

Definition 3.1.

Define the Berkovich projective line, denoted K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}, to be the topological space with points and topology given as follows. The points of K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} are identified with

  1. (i)

    K\mathbb{C}_{K}-points zK1z\in\mathbb{P}_{\mathbb{C}_{K}}^{1}, which we will call points of Type I; and

  2. (ii)

    discs DKD\subset\mathbb{C}_{K}; if d(D)d(D)\in\mathbb{Q} (resp. d(D)d(D)\notin\mathbb{Q}), we call this a point of Type II (resp. a point of Type III).

A point of K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} which is identified with a point zK1z\in\mathbb{P}_{\mathbb{C}_{K}}^{1} (resp. a disc DKD\subset\mathbb{C}_{K}) is denoted ηzK1,an\eta_{z}\in\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} (resp. ηDK1,an\eta_{D}\in\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}).

We define an infinite metric on K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} given by the distance function

$\delta : \mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}} \times \mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}} \to \mathbb{R} \cup \{\infty\}$

defined as follows. We set δ(ηz,η)=\delta(\eta_{z},\eta^{\prime})=\infty for any point ηz\eta_{z} of Type I and any point ηηzK1,an\eta^{\prime}\neq\eta_{z}\in\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}. Given a containment DDKD\subseteq D^{\prime}\subset\mathbb{C}_{K} of discs, we set δ(ηD,ηD)=d(D)d(D)\delta(\eta_{D},\eta_{D^{\prime}})=d(D)-d(D^{\prime})\in\mathbb{R}. More generally, if D,DKD,D^{\prime}\subset\mathbb{C}_{K} are discs and D′′KD^{\prime\prime}\subset\mathbb{C}_{K} is the smallest disc containing both DD and DD^{\prime}, we set

(10) $\delta(\eta_D, \eta_{D'}) = \delta(\eta_D, \eta_{D''}) + \delta(\eta_{D'}, \eta_{D''}) = d(D) + d(D') - 2d(D'').$

We endow the subspace of $\mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}}$ consisting of points of Type II and III with the topology induced by the metric given by $\delta$, and we extend this to a topology on all of $\mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}}$ in such a way that, given any element $z \in \mathbb{C}_K$ and disc $D \subset \mathbb{C}_K$ containing $z$, the map $\lambda : [0, e^{-d(D)}] \to \mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}}$ given by sending $0$ to $\eta_z$ and sending $s \in (0, e^{-d(D)}]$ to the point corresponding to the disc of logarithmic radius $-\ln(s)$ containing $z$ provides a path from $\eta_z$ to $\eta_D$ and that there is a similarly defined path from $\eta_D$ to $\eta_\infty$; see [14, Definition 2.1, Remark 2.2] for details.
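To make formula (10) concrete, here is a minimal Python sketch of the distance between two points of Type II given by discs with rational centers, for the $p$-adic valuation on $\mathbb{Q}$. It rests on the observation that the smallest disc containing discs with centers $c, c'$ and logarithmic radii $r, r'$ has logarithmic radius $\min\{r, r', v(c - c')\}$. The representation of a disc as a pair `(center, log_radius)` and the function names are assumptions made for this illustration.

```python
import math
from fractions import Fraction


def vp(x, p):
    """p-adic valuation of a rational number x, with v(0) = +infinity."""
    x = Fraction(x)
    if x == 0:
        return math.inf
    num, den, val = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        val += 1
    while den % p == 0:
        den //= p
        val -= 1
    return val


def join_radius(D1, D2, p):
    """Logarithmic radius of the smallest disc containing both D1 and D2."""
    (c1, r1), (c2, r2) = D1, D2
    return min(r1, r2, vp(c1 - c2, p))


def delta(D1, D2, p):
    """Distance between eta_{D1} and eta_{D2}, following formula (10)."""
    return D1[1] + D2[1] - 2 * join_radius(D1, D2, p)


# Nested discs: the distance is the difference of logarithmic radii.
print(delta((0, 2), (0, 0), 3))
# Disjoint discs: the path passes through the smallest disc containing both.
print(delta((0, 2), (1, 1), 3))
```

In the second call the smallest common disc is centered at $0$ with logarithmic radius $v(1 - 0) = 0$, so the distance is $2 + 1 - 0 = 3$.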

As is discussed in [14, Remark 2.2], the space $\mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}}$ is path-connected, and there is a unique non-backtracking path between any pair of points in $\mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}}$. This allows us to set the following notation. Below we denote the image in $\mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}}$ of the non-backtracking path between two points $\eta, \eta' \in \mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}}$ by $[\eta, \eta'] \subset \mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}}$, and we will often refer to this image itself as “the path” from $\eta$ to $\eta'$; note that with this notation we have $[\eta, \eta'] = [\eta', \eta]$. The above observations imply that, given a point $\eta \in \mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}}$ and a subspace $\Lambda \subset \mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}}$, there is a (unique) point $\xi \in \Lambda$ such that every path from $\eta$ to a point in $\Lambda$ contains $\xi$; we will often speak of “the closest point in $\Lambda$ to $\eta$” in referring to this point $\xi$ (note that we will use this language even in the case that the distance from $\eta$ to any point in $\Lambda$ is infinite, as in when $\eta \notin \Lambda$ is of Type I). In a similar way, if $\Lambda, \Lambda' \subset \mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}}$ are subspaces, we will speak of “the closest point in $\Lambda$ to $\Lambda'$” (and vice versa). Given a point $\eta \in \mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}}$ and subspaces $\Lambda, \Lambda' \subset \mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}}$, we write $\delta(\eta, \Lambda)$ (resp. $\delta(\Lambda, \Lambda')$) for the distance between $\eta$ and the closest point in $\Lambda$ to $\eta$ (resp. between the closest point in $\Lambda$ to $\Lambda'$ and the closest point in $\Lambda'$ to $\Lambda$).

Given points η,ηK1,an\eta,\eta^{\prime}\in\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}, let ηη[η,η]K1,an\eta\vee\eta^{\prime}\in[\eta,\eta^{\prime}]\subset\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} be the point of Type II or III corresponding to the largest disc among discs corresponding to points in the path [η,η][\eta,\eta^{\prime}].

We refer to the path between distinct points ηa,ηbK1,an\eta_{a},\eta_{b}\in\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} of Type I as an axis and denote the axis connecting them by Λa,b:=[ηa,ηb]K1,an\Lambda_{a,b}:=[\eta_{a},\eta_{b}]\subset\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}. We note that the space Λa,b{ηa,ηb}\Lambda_{a,b}\smallsetminus\{\eta_{a},\eta_{b}\} consists precisely of those points whose corresponding disc DD either satisfies #(D{a,b})=1\#(D\cap\{a,b\})=1 or is the smallest disc containing {a,b}\{a,b\}, a fact that we will freely use in arguments below. In our frequent context of dealing with a (2g+2)(2g+2)-set consisting of elements labeled a0,b0,,ag,bga_{0},b_{0},\dots,a_{g},b_{g}, we write Λ(i)=Λai,bi\Lambda_{(i)}=\Lambda_{a_{i},b_{i}} for 0ig0\leq i\leq g.

As in [14], given a subspace ΛK1,an\Lambda\subset\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} and a real number r>0r>0, we define the (closed) tubular neighborhood of Λ\Lambda of radius rr to be

(11) $B(\Lambda, r) = \{\eta \in \mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}} \ |\ \delta(\eta, \Lambda) \leq r\},$

and we write Λ^(i)=B(Λ(i),v(p)p1)\hat{\Lambda}_{(i)}=B(\Lambda_{(i)},\frac{v(p)}{p-1}) for 0ig0\leq i\leq g in the aforementioned context where the axes Λ(i)\Lambda_{(i)} are defined.

Finally, the following elementary facts relating valuations to distances from the axis Λ0,\Lambda_{0,\infty}, which are in [14, Lemma 4.5], will be used many times in arguments below, and so for convenience we present them here in full.

Proposition 3.2.

In addition to the above notation, for any rr\in\mathbb{R}, write D(r)KD(r)\subset\mathbb{C}_{K} for the disc {zK|v(z)r}\{z\in\mathbb{C}_{K}\ |\ v(z)\geq r\}.

  1. (a)

    Given any point aK1{0,}a\in\mathbb{P}_{\mathbb{C}_{K}}^{1}\smallsetminus\{0,\infty\}, the closest point in the axis Λ0,\Lambda_{0,\infty} to ηa\eta_{a} is ηD(v(a))\eta_{D(v(a))}.

  2. (b)

    Given any distinct points η,ηK1,an\eta,\eta^{\prime}\in\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}, the point ηη[η,η]\eta\vee\eta^{\prime}\in[\eta,\eta^{\prime}] has minimal distance to the axis Λ0,\Lambda_{0,\infty} among points in the path [η,η][\eta,\eta^{\prime}].

  3. (c)

    Given any distinct points a,bK1{0,}a,b\in\mathbb{P}_{\mathbb{C}_{K}}^{1}\smallsetminus\{0,\infty\}, writing r=min{v(a),v(b)}r=\min\{v(a),v(b)\}, we have

    (12) $v(a - b) = r + \delta(\eta_a \vee \eta_b, \eta_{D(r)}) = r + \delta(\Lambda_{a,b}, \Lambda_{0,\infty}).$

3.2. Convex hulls of finite sets

By studying a finite subset AK1A\subset\mathbb{P}_{K}^{1} through its convex hull in K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}, we adopt a more topological point of view which enables us to assert stronger results and to better demonstrate them. For our purposes, we will only care about the convex hull of a set AA which is clustered in pairs, so we begin with that definition.

Definition 3.3.

Let AK1A\subset\mathbb{P}_{K}^{1} be a non-empty even-cardinality subset. We say that AA is clustered in rr-separated pairs for some r0r\in\mathbb{R}_{\geq 0} if there is a labeling a0,b0,,ag,bga_{0},b_{0},\dots,a_{g},b_{g} of the elements of AA such that we have B(Λ(i),r)B(Λ(j),r)=B(\Lambda_{(i)},r)\cap B(\Lambda_{(j)},r)=\varnothing for iji\neq j.

If AA is clustered in 0-separated pairs, we say more simply that AA is clustered in pairs.

In the context of using this terminology, the pairs that AA is clustered in are the 22-element sets {ai,bi}\{a_{i},b_{i}\}.

The following crucial fact comes from [14, Remark 2.15].

Proposition 3.4.

Every pp-superelliptic set SS is clustered in v(p)p1\frac{v(p)}{p-1}-separated pairs {a0,b0},,{ag,bg}\{a_{0},b_{0}\},\dots,\{a_{g},b_{g}\}; in other words, we have Λ^(i)Λ^(j)=\hat{\Lambda}_{(i)}\cap\hat{\Lambda}_{(j)}=\varnothing for iji\neq j. Moreover, the partition S=i=0g{ai,bi}S=\bigsqcup_{i=0}^{g}\{a_{i},b_{i}\} into pairs is the only one with respect to which SS is clustered in pairs.

In light of the above proposition, in the context of this paper, all subsets SK1S\subset\mathbb{P}_{K}^{1} consisting of fixed points of generators of pp-Whittaker groups satisfy this property of being clustered in v(p)p1\frac{v(p)}{p-1}-separated pairs (which, in the case that pp is not the residue characteristic of KK, simply means being clustered in pairs).

Definition 3.5.

Let AK1A\subset\mathbb{P}_{K}^{1} be a subset which is clustered in the pairs {a0,b0},,{ag,bg}\{a_{0},b_{0}\},\dots,\{a_{g},b_{g}\} for some g0g\geq 0. We denote by ΣAK1,an\Sigma_{A}\subset\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} the convex hull of AA, i.e. the smallest connected subspace of K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} containing the set of points of Type I corresponding to AA.

A distinguished vertex of ΣA\Sigma_{A} is a point vΛai,biΣAv\in\Lambda_{a_{i},b_{i}}\subset\Sigma_{A} for some index i{0,,g}i\in\{0,\dots,g\} satisfying that no neighborhood of vv in the metric space ΣA\Sigma_{A} is contained in the axis Λ(i)\Lambda_{(i)}.

A natural vertex of ΣA\Sigma_{A} is a point vΣAv\in\Sigma_{A} whose open neighborhoods contain a star shape centered at vv (with 3\geq 3 edges coming out of vv).

A vertex of ΣA\Sigma_{A} is a distinguished or a natural vertex.

Remark 3.6.

The idea behind the above terminology is that the closure of the subspace $\Sigma_A \smallsetminus \{\Lambda_{(i)}\}_{0 \leq i \leq g} \subset \Sigma_A$ is a finite metric graph whose vertices are precisely the vertices of $\Sigma_A$ as described in Definition 3.5. See [14, §3.1] for a proof and more details.

Proposition 3.7.

Let AK1A\subset\mathbb{P}_{K}^{1} be a subset which is clustered in pairs. The set AA is clustered in rr-separated pairs for some r>0r\in\mathbb{R}_{>0} if and only if for each pair of distinguished vertices v,vΣAv,v^{\prime}\in\Sigma_{A} which do not lie in the same axis Λ(i)\Lambda_{(i)} for any index ii, we have δ(v,v)>2r\delta(v,v^{\prime})>2r.

Proof.

We begin by observing that being clustered in rr-separated pairs is clearly equivalent to the condition that δ(Λ(i),Λ(j))>2r\delta(\Lambda_{(i)},\Lambda_{(j)})>2r for any indices iji\neq j; we freely use this fact below.

Let v,vΣAv,v^{\prime}\in\Sigma_{A} be distinguished vertices not lying in the same axis Λ(i)\Lambda_{(i)}; as distinguished vertices by definition each lie in an axis of this type, there are indices iji\neq j such that vΛ(i)v\in\Lambda_{(i)} and vΛ(j)v^{\prime}\in\Lambda_{(j)}. The forward direction of the assertion now directly follows. The other direction follows from the fact that, immediately from the definition of distinguished vertex, given any indices iji\neq j, the closest point in Λ(i)\Lambda_{(i)} (resp. Λ(j)\Lambda_{(j)}) to Λ(j)\Lambda_{(j)} (resp. Λ(i)\Lambda_{(i)}) is a distinguished vertex vv (resp. vv^{\prime}), so δ(v,v)>2r\delta(v,v^{\prime})>2r implies δ(Λ(i),Λ(j))>2r\delta(\Lambda_{(i)},\Lambda_{(j)})>2r. ∎

Proposition 3.8.

Let AK1A\subset\mathbb{P}_{K}^{1} be a subset which is clustered in the pairs {a0,b0},,{ag,bg}\{a_{0},b_{0}\},\dots,\{a_{g},b_{g}\} for some g0g\geq 0.

  1. (a)

    Define an equivalence relation $\sim$ on $A$ as follows: given two points $z, w \in A$, we write $z \sim w$ if $z$ and $w$ lie in the exact same even-cardinality clusters of $A$, i.e. if for every even-cardinality cluster $\mathfrak{s} \subseteq A$ we have either $z, w \in \mathfrak{s}$ or $z, w \notin \mathfrak{s}$. Then the equivalence classes under the relation $\sim$ are precisely the pairs $\{a_i, b_i\}$.

  2. (b)

    There is a one-to-one correspondence between the clusters 𝔰\mathfrak{s} of AA satisfying 2#𝔰2g+12\leq\#\mathfrak{s}\leq 2g+1 and the vertices of ΣA\Sigma_{A} given by sending a cluster 𝔰\mathfrak{s} to the point ηD𝔰\eta_{D_{\mathfrak{s}}}, where D𝔰KD_{\mathfrak{s}}\subset\mathbb{C}_{K} is the smallest disc containing 𝔰\mathfrak{s}. Under this correspondence, a cluster which is not (resp. is) itself the union of 2\geq 2 even-cardinality sub-clusters gets sent to a distinguished vertex (resp. a non-distinguished vertex).

Proof.

Throughout this proof, for any index $i$, we write $\mathfrak{s}_i \subset A$ for the minimal cluster containing the elements $a_i, b_i$, and we write $D_i \subset \mathbb{C}_K$ for the corresponding disc; note that we have $\eta_{D_i} = \eta_{a_i} \vee \eta_{b_i}$. Fix an index $i$, and suppose that there is an even-cardinality cluster $\mathfrak{s}$ such that $\#(\mathfrak{s} \cap \{a_i, b_i\}) = 1$. By considering the cardinality, it is clear that there is an index $l \neq i$ such that $\#(\mathfrak{s} \cap \{a_l, b_l\}) = 1$. We then clearly have $\eta_{D_{\mathfrak{s}}} \in \Lambda_{(i)} \cap \Lambda_{(l)}$, which contradicts the fact that $A$ is clustered in pairs. It follows that we have $a_i \sim b_i$.

Now let $i \neq j$ be distinct indices. We must have $\mathfrak{s}_i \neq \mathfrak{s}_j$, because otherwise we would get $\eta_{a_i} \vee \eta_{b_i} = \eta_{D_i} = \eta_{D_j} = \eta_{a_j} \vee \eta_{b_j} \in \Lambda_{(i)} \cap \Lambda_{(j)}$, which would contradict being clustered in pairs. Now assume without loss of generality that $\mathfrak{s}_i$ does not contain $\mathfrak{s}_j$. This means by definition of $\mathfrak{s}_j$ that we have $a_j, b_j \notin \mathfrak{s}_i$, which is also the case if $\mathfrak{s}_i$ and $\mathfrak{s}_j$ are disjoint. In order to prove that $a_j, b_j \not\sim a_i, b_i$, it then suffices to show that both $\mathfrak{s}_i$ and $\mathfrak{s}_j$ have even cardinality. If the cluster $\mathfrak{s}_i$ had odd cardinality, that would imply that there is an index $l$ such that $\#(\mathfrak{s}_i \cap \{a_l, b_l\}) = 1$ and therefore that $\eta_{D_i} = \eta_{a_i} \vee \eta_{b_i}$ lies in $\Lambda_{(l)}$; since $\eta_{a_i} \vee \eta_{b_i}$ also lies in $\Lambda_{(i)}$, this again contradicts the fact that $A$ is clustered in pairs. Therefore, $\mathfrak{s}_i$ has even cardinality and, by the exact same argument, so does $\mathfrak{s}_j$. This completes the proof of part (a).

Let $\mathfrak{s}$ be a cluster of $A$ which is not the union of $\geq 2$ even-cardinality sub-clusters. Choose an element in $\mathfrak{s}$ which does not lie in a proper even-cardinality sub-cluster of $\mathfrak{s}$; we may call this element $a_i$ for some index $i \in \{0, \dots, g\}$. Then since we have $a_i \sim b_i$ by part (a), we either have $b_i \notin \mathfrak{s}$ or that $b_i$ is also an element in $\mathfrak{s}$ which does not lie in a proper even-cardinality sub-cluster of $\mathfrak{s}$. In the former case, we immediately get $\eta_{D_{\mathfrak{s}}} \in \Lambda_{(i)}$, whereas in the latter case, we claim that $\eta_{D_{\mathfrak{s}}} = \eta_{a_i} \vee \eta_{b_i} \in \Lambda_{(i)}$. To see this, let $\mathfrak{c} \subseteq \mathfrak{s}$ be the smallest sub-cluster containing $a_i, b_i$. If there were $\geq 3$ elements of $\mathfrak{c}$ not lying in a proper even-cardinality sub-cluster of $\mathfrak{c}$, they would all lie in the same equivalence class and contradict part (a), so it must be the case that $a_i, b_i \in \mathfrak{c}$ are the only elements not lying in a proper even-cardinality sub-cluster, and so $\mathfrak{c}$ has even cardinality. By construction of $a_i, b_i$, we then get $\mathfrak{c} = \mathfrak{s}$, which implies our claim. We therefore have $\eta_{D_{\mathfrak{s}}} \in \Lambda_{(i)}$, and we now only need to show that any neighborhood of $\eta_{D_{\mathfrak{s}}}$ contains a point $\eta \in \Sigma_A \smallsetminus \Lambda_{(i)}$, as making such a neighborhood small enough ensures that $\eta$ does not lie in any other axis $\Lambda_{(j)}$. Note that for any index $j \neq i$, we have $[\eta_{D_{\mathfrak{s}}}, \eta_{a_j}] \subset \Sigma_A$ and that every neighborhood of $\eta_{D_{\mathfrak{s}}}$ contains a sub-path $[\eta_{D_{\mathfrak{s}}}, \eta_{D_\epsilon}] \subset [\eta_{D_{\mathfrak{s}}}, \eta_{a_j}]$. If the cluster $\mathfrak{s}$ has even cardinality, then there is an index $j \neq i$ such that $a_j \notin \mathfrak{s}$, and then one may take $D_\epsilon$ to be a slightly larger disc containing $D_{\mathfrak{s}} = D_i$, and it is clear that $\eta_{D_\epsilon} \notin \Lambda_{(i)}$. If, on the other hand, the cluster $\mathfrak{s}$ has odd cardinality, then we have $a_i \in \mathfrak{s}$ and $b_i \notin \mathfrak{s}$; as the elements of $\mathfrak{s} \smallsetminus \{a_i\}$ are not equivalent to $a_i$, there is an even-cardinality sub-cluster $\mathfrak{c} \subseteq \mathfrak{s} \smallsetminus \{a_i\}$. Choosing $a_j \in \mathfrak{c}$, we get that $a_i, b_i \notin D_\epsilon \subsetneq D_{\mathfrak{s}}$, so that $\eta_{D_\epsilon} \notin \Lambda_{(i)}$. So in either case, we are done showing that the point $\eta_{D_{\mathfrak{s}}}$ corresponding to the cluster $\mathfrak{s}$ is a distinguished vertex and have proved one case of part (b).

Now let $\mathfrak{s}$ be a cluster of $A$ which is the union of even-cardinality sub-clusters; let us write $\mathfrak{s} = \mathfrak{c}_1 \sqcup \dots \sqcup \mathfrak{c}_s$ for some $s \geq 2$, where each $\mathfrak{c}_l$ is a maximal proper even-cardinality sub-cluster of $\mathfrak{s}$. For each index $i$, it follows from part (a) that we either have $a_i, b_i \in \mathfrak{c}_l$ for some $l$ or have $a_i, b_i \notin \mathfrak{s}$. In the former case, we have $D_{\mathfrak{s}} \supsetneq D_i$, and in the latter case, we have $a_i, b_i \notin D_{\mathfrak{s}}$, so in either case we get $\eta_{D_{\mathfrak{s}}} \notin \Lambda_{(i)}$, implying that $\eta_{D_{\mathfrak{s}}}$ is not a distinguished vertex. For $1 \leq l \leq s$, choose an element $a_{(l)} \in \mathfrak{c}_l$. We have $[\eta_{D_{\mathfrak{s}}}, \eta_{a_{(l)}}] \subset \Sigma_A$, and every neighborhood of $\eta_{D_{\mathfrak{s}}}$ contains a sub-path $[\eta_{D_{\mathfrak{s}}}, \eta_{D_{\epsilon,l}}] \subset [\eta_{D_{\mathfrak{s}}}, \eta_{a_{(l)}}]$, with $\mathfrak{c}_l \subset D_{\epsilon,l} \subsetneq D_{\mathfrak{s}}$. At the same time, as there is some index $i$ such that $a_i \notin \mathfrak{s}$, we have $[\eta_{D_{\mathfrak{s}}}, \eta_{a_i}] \subset \Sigma_A$, and every neighborhood of $\eta_{D_{\mathfrak{s}}}$ contains a sub-path $[\eta_{D_{\mathfrak{s}}}, \eta_{D_{\epsilon,0}}] \subset [\eta_{D_{\mathfrak{s}}}, \eta_{a_i}]$, with $D_{\epsilon,0} \supsetneq D_{\mathfrak{s}}$. It is clear that the shortest path between any pair of these points $\eta_{D_{\epsilon,0}}, \eta_{D_{\epsilon,1}}, \dots, \eta_{D_{\epsilon,s}}$ passes through $\eta_{D_{\mathfrak{s}}}$ and therefore, the neighborhood contains a star shape centered at $\eta_{D_{\mathfrak{s}}}$ (with at least $s + 1 \geq 3$ edges coming out). The point $\eta_{D_{\mathfrak{s}}}$ is thus a natural vertex. This completes the proof of part (b). ∎
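Both parts of Proposition 3.8 can be illustrated on the set $S$ from Remark 2.8. The following Python sketch is a hedged example, not part of the paper's argument: it uses the $3$-adic valuation on $\mathbb{Q}$, a placeholder string for $\infty$, and hypothetical helper names. It recovers the pairs as equivalence classes under $\sim$, and it tests whether a cluster is the union of its proper even-cardinality sub-clusters, which by part (b) separates non-distinguished vertices from distinguished ones.

```python
from fractions import Fraction

INF = "inf"  # placeholder for the point at infinity (an assumption of this sketch)


def vp(x, p):
    """p-adic valuation of a nonzero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation of 0 is infinite")
    num, den, val = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        val += 1
    while den % p == 0:
        den //= p
        val -= 1
    return val


def clusters(A, p):
    """Clusters of A with at least 2 elements; infinity lies in no disc of K."""
    pts = [Fraction(a) for a in A if a != INF]
    found = set()
    for c in pts:
        for z in pts:
            if z != c:
                r = vp(z - c, p)
                found.add(frozenset(w for w in pts if w == c or vp(w - c, p) >= r))
    return found


def equivalence_classes(A, p):
    """Classes of the relation ~ from part (a): same even-cardinality clusters."""
    even = [c for c in clusters(A, p) if len(c) % 2 == 0]
    classes = {}
    for z in A:
        signature = frozenset(c for c in even if z in c)
        classes.setdefault(signature, set()).add(z)
    return {frozenset(cls) for cls in classes.values()}


def is_union_of_even_subclusters(s, all_clusters):
    """True if s is covered by its proper even-cardinality sub-clusters."""
    covered = set()
    for c in all_clusters:
        if c < s and len(c) % 2 == 0:
            covered |= c
    return covered == s


S = [-9, 9, 3, 12, 1, INF]
all_clusters = clusters(S, 3)
print(equivalence_classes(S, 3))
for s in sorted(all_clusters, key=len):
    kind = "non-distinguished" if is_union_of_even_subclusters(s, all_clusters) else "distinguished"
    print(sorted(s), kind)
```

For this set, the equivalence classes are $\{-9, 9\}$, $\{3, 12\}$, and $\{1, \infty\}$, and the only cluster that is a union of even-cardinality sub-clusters is $\{-9, 9, 3, 12\}$.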

3.3. Cluster data and metric properties of convex hulls

We may now establish a dictionary between the cluster data of a finite subset $A \subset \mathbb{P}_K^1$ and the metric properties of its convex hull $\Sigma_A$. In order to express the following results, we define an order relation (denoted by $>$) on $\mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}}$ by decreeing that, for any points $\xi, \xi' \in \mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}}$, the relation $\xi > \xi'$ means that $\xi \in [\xi', \eta_\infty] \smallsetminus \{\xi'\}$. Note that by this definition, we have $\eta_D > \eta_{D'}$ for discs $D, D' \subset \mathbb{C}_K$ if and only if we have $D \supsetneq D'$.

The following proposition justifies the use of the letter $\delta$ for both the relative depth of a cluster and the distance function on $\mathbb{P}_{\mathbb{C}_K}^{1,\mathrm{an}}$.

Proposition 3.9.

Suppose that $\mathfrak{s}$ is a non-maximal cluster of a set $S$ which is clustered in pairs, and let $v = \eta_{D_{\mathfrak{s}}} \in \Sigma_S$ be the vertex corresponding to $\mathfrak{s}$ as in Proposition 3.8(b). Then the relative depth $\delta(\mathfrak{s})$ is equal to $\delta(v, v')$, where $v' \in \Sigma_S$ is the closest vertex to $v$ satisfying $v' > v$.

Proof.

It is clear from definitions and from the correspondence between clusters and vertices established by 3.8(b) that the closest vertex vv^{\prime} to vv satisfying v>vv^{\prime}>v corresponds to the smallest cluster 𝔰\mathfrak{s}^{\prime} which properly contains 𝔰\mathfrak{s}. We have δ(𝔰)=d(𝔰)d(𝔰)\delta(\mathfrak{s})=d(\mathfrak{s})-d(\mathfrak{s}^{\prime}), which equals the difference in logarithmic radii of the minimal discs D𝔰,D𝔰D_{\mathfrak{s}},D_{\mathfrak{s}^{\prime}} respectively containing the clusters 𝔰,𝔰\mathfrak{s},\mathfrak{s}^{\prime}, and which in turn by definition equals the distance between the respective points v=ηD𝔰,v=ηD𝔰v=\eta_{D_{\mathfrak{s}}},v^{\prime}=\eta_{D_{\mathfrak{s}^{\prime}}} in K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}. ∎
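As a small illustration of the identity in Proposition 3.9, the following worked computation may help. It is only a sketch under assumptions that are not taken from the text: we take $K=\mathbb{Q}_{p}$ with $v(p)=1$, we take the depth $d(\mathfrak{c})$ of a cluster to be the minimum of $v(z-z')$ over distinct $z,z'\in\mathfrak{c}$, and the two nested clusters below are hypothetical and assumed to be consecutive clusters of some set clustered in pairs.

```latex
% Hypothetical example; K = \mathbb{Q}_p, v(p) = 1, and
% d(\mathfrak{c}) = \min_{z \neq z' \in \mathfrak{c}} v(z - z') are assumptions.
\begin{align*}
\mathfrak{s}  &= \{0,\, p^{3}\},
  & D_{\mathfrak{s}}  &= \{z : v(z) \geq 3\},
  & d(\mathfrak{s})  &= v(p^{3} - 0) = 3, \\
\mathfrak{s}' &= \{0,\, p^{3},\, 1,\, 1 + p^{3}\},
  & D_{\mathfrak{s}'} &= \{z : v(z) \geq 0\},
  & d(\mathfrak{s}') &= v(1 - 0) = 0,
\end{align*}
\[
  \delta(\mathfrak{s}) \;=\; d(\mathfrak{s}) - d(\mathfrak{s}') \;=\; 3 - 0 \;=\; 3
  \;=\; \delta\bigl(\eta_{D_{\mathfrak{s}}},\, \eta_{D_{\mathfrak{s}'}}\bigr).
\]
```

Here the last equality uses only that the distance between the points attached to two nested discs with a common center is the difference of their logarithmic radii.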

Our above results now allow us to characterize, in the language of clusters, the property of being clustered in $r$-separated pairs.

Proposition 3.10.

Given any cluster 𝔰A\mathfrak{s}\subset A, write 𝔰\mathfrak{s}^{\prime} (resp. 𝔰\mathfrak{s}^{\sim}) for the smallest cluster properly containing 𝔰\mathfrak{s} (resp. properly containing 𝔰\mathfrak{s} and which is not itself the disjoint union of 2\geq 2 even-cardinality sub-clusters), if such a cluster properly containing 𝔰\mathfrak{s} exists.

A set AA is clustered in rr-separated pairs for some r>0r\in\mathbb{R}_{>0} if and only if the following conditions hold, for any even-cardinality clusters 𝔠,𝔠1,𝔠2\mathfrak{c},\mathfrak{c}_{1},\mathfrak{c}_{2} which are themselves not the disjoint union of 2\geq 2 even-cardinality clusters:

(i) the set $A$ is clustered in pairs;

(ii) if $\mathfrak{c}^{\sim}$ is defined, we have $d(\mathfrak{c})-d(\mathfrak{c}^{\sim})>2r$; and

(iii) if $\mathfrak{c}_{1}',\mathfrak{c}_{2}'$ are defined and $\mathfrak{s}:=\mathfrak{c}_{1}'=\mathfrak{c}_{2}'$, we have $\delta(\mathfrak{c}_{1})+\delta(\mathfrak{c}_{2})>2r$.

Proof.

We know from 3.7 that AA is clustered in rr-separated pairs if and only if, for each pair of distinguished vertices v,vΣAv,v^{\prime}\in\Sigma_{A} which do not lie on the same axis Λ(i)\Lambda_{(i)}, we have δ(v,v)>2r\delta(v,v^{\prime})>2r; we will show that this latter condition is equivalent to properties (i)-(iii).

Property (i) is implied by the property of being clustered in rr-separated pairs by definition. Note that for any point w=ηDw=\eta_{D} sufficiently close to the vertex v=ηD𝔰v=\eta_{D_{\mathfrak{s}}} corresponding (as in the statement of 3.8) to an even-cardinality cluster 𝔰\mathfrak{s} with w>vw>v, we have DA=𝔰D\cap A=\mathfrak{s} and DD𝔰D\supsetneq D_{\mathfrak{s}} so that w=ηDw=\eta_{D} does not lie in any axis Λ(i)\Lambda_{(i)}. Thus, if vv^{\prime} is another vertex such that the path [v,v]ΣA[v,v^{\prime}]\subset\Sigma_{A} contains a point w>vw>v, it is not possible for vv and vv^{\prime} to lie in the same axis Λ(i)\Lambda_{(i)}. Now using 3.8(b) and 3.9, properties (ii) and (iii) can each be interpreted as saying that the distance between a pair of distinguished vertices satisfying a certain property is >2r>2r (consulting 3.1 and keeping in mind δ(𝔠1)+δ(𝔠2)=d(𝔠1)+d(𝔠2)2d(𝔰)\delta(\mathfrak{c}_{1})+\delta(\mathfrak{c}_{2})=d(\mathfrak{c}_{1})+d(\mathfrak{c}_{2})-2d(\mathfrak{s}) and that d(𝔠)=d(D𝔠)d(\mathfrak{c})=d(D_{\mathfrak{c}}) for any cluster 𝔠\mathfrak{c}). In the case of property (ii), these distinguished vertices v,vv,v^{\prime} satisfy that vv^{\prime} is the closest distinguished vertex to vv such that v>vv^{\prime}>v, while in the case of property (iii), these distinguished vertices w1,w2w_{1},w_{2} satisfy that v:=w1w2v:=w_{1}\vee w_{2} is the closest vertex to w1w_{1} (resp. w2w_{2}) such that v>w1v>w_{1} (resp. v>w2v>w_{2}). In both cases, our above observations imply that the two distinguished vertices in question cannot lie in the same axis Λ(i)\Lambda_{(i)}. The property of being clustered in rr-separated pairs therefore implies properties (ii) and (iii).

Conversely, suppose that a set AA which is clustered in pairs satisfies properties (ii) and (iii), and choose distinguished vertices v1v2v_{1}\neq v_{2} of ΣA\Sigma_{A}; we need to prove that δ(v1,v2)>2r\delta(v_{1},v_{2})>2r. It clearly suffices to replace v2v_{2} with the closest distinguished vertex to v1v_{1} in the half-open segment [v1,v2]{v1}[v_{1},v_{2}]\smallsetminus\{v_{1}\}. Having made such a replacement, if we have v2>v1v_{2}>v_{1} or v1>v2v_{1}>v_{2}, then the desired inequality is provided by property (ii). If we do not have v2>v1v_{2}>v_{1} or v1>v2v_{1}>v_{2}, then it is clear that v1v2>v1,v2v_{1}\vee v_{2}>v_{1},v_{2} satisfies that there is no distinguished vertex lying in the interior of either of the paths [v1,v1v2],[v2,v1v2][v_{1},v_{1}\vee v_{2}],[v_{2},v_{1}\vee v_{2}], and so the desired inequality is provided by property (iii). ∎

Remark 3.11.

Given a subset $A\subset\mathbb{P}_{K}^{1}$ which is clustered in pairs, let $\Sigma_{A}^{*}\subset\Sigma_{A}$ be the convex hull of the vertices of $\Sigma_{A}$ (i.e. the smallest connected subspace of $\Sigma_{A}$ containing its vertices). Propositions 3.8(b) and 3.9 show that $\Sigma_{A}^{*}$ is a finite metric graph and that the cluster data (resp. the combinatorial cluster data) of $A$ determines and is determined (up to alterations induced by replacing $A$ with its image under a fractional linear transformation) by the metric graph isomorphism type (resp. the graph isomorphism type) of $\Sigma_{A}^{*}$. More precisely, if we let $\varphi:A\to A'$ be a bijection of finite subsets of $\mathbb{P}_{K}^{1}$ and assume that we have $\infty\in A\cap A'$ and that $\varphi(\infty)=\infty$, then the map $\varphi$ acts as a bijection between clusters of $A$ and clusters of $A'$ if and only if this bijection of clusters, viewed as a bijection between the vertex sets of $\Sigma_{A}^{*}$ and $\Sigma_{A'}^{*}$ via 3.8(b), is a graph isomorphism; moreover, this bijection preserves the relative depths of clusters if and only if the corresponding graph isomorphism is an isometry.

Proposition 3.12.

1.3 implies 1.1.

Proof.

1.3 says that, under the optimality hypothesis which is common to both theorems’ statements, the bijection $\pi:S\to\mathcal{B}$ extends to a map $\pi_{*}:\Sigma_{S}\to\Sigma_{\mathcal{B}}$ between the convex hulls which is a homeomorphism. This last property implies that the image under $\pi_{*}$ of the path between two points of Type I coincides with the path between the images of those two points, so that in particular, we have $\pi_{*}(\Lambda_{a_{i},b_{i}})=\Lambda_{\alpha_{i},\beta_{i}}$ for $0\leq i\leq g$. Now from the definitions and the fact that $\pi_{*}$ is a homeomorphism, it follows that $\pi_{*}$ acts as a bijection between distinguished (resp. natural) vertices of $\Sigma_{S}$ and those of $\Sigma_{\mathcal{B}}$. By 3.8(b), the bijection $\pi$ then preserves combinatorial cluster data, thus implying 1.1(a).

Suppose that 𝔰\mathfrak{s} is an odd-cardinality cluster; let vΣSv\in\Sigma_{S} be the corresponding vertex as in 3.8(b); and let vΣSv^{\prime}\in\Sigma_{S} be the closest vertex to vv satisfying v>vv^{\prime}>v. Then the disc DD corresponding to any point in the interior of the path [v,v]ΣS[v,v^{\prime}]\subset\Sigma_{S} must satisfy DA=𝔰D\cap A=\mathfrak{s} and therefore its intersection with {ai,bi}A\{a_{i},b_{i}\}\subset A must be a singleton for some index ii; it is easy to deduce from this that we must then have [v,v]Λ(i)[v,v^{\prime}]\subset\Lambda_{(i)}. Then, in the notation of 1.3, we have v,v=[v,v]\llbracket v,v^{\prime}\rrbracket=[v,v^{\prime}], implying μ(v,v)=δ(v,v)\mu(v,v^{\prime})=\delta(v,v^{\prime}), and so the formula in (6) gives us δ(π(v),π(v))=pδ(v,v)\delta(\pi_{*}(v),\pi_{*}(v^{\prime}))=p\delta(v,v^{\prime}). Now we may apply 3.9 to get 1.1(b)(i).

Now suppose that 𝔰\mathfrak{s} is an even-cardinality cluster, and define v,vΣSv,v^{\prime}\in\Sigma_{S} as before. Again, the disc DD corresponding to any point in the interior of the path [v,v]ΣS[v,v^{\prime}]\subset\Sigma_{S} satisfies DA=𝔰D\cap A=\mathfrak{s}, and it is not the smallest such disc; by the hypothesis of being clustered in pairs, for each index ii, we have D{ai,bi}=D\cap\{a_{i},b_{i}\}=\varnothing or D{ai,bi}D\supset\{a_{i},b_{i}\}, and so no point in the interior of [v,v][v,v^{\prime}] lies in any axis Λ(i)\Lambda_{(i)}. Assume for the moment that pp is not the residue characteristic of KK. Then, in the notation of 1.3, we have μ(v,v)=0\mu(v,v^{\prime})=0, and therefore, the formula in (6) gives us δ(π(v),π(v))=δ(v,v)\delta(\pi_{*}(v),\pi_{*}(v^{\prime}))=\delta(v,v^{\prime}), and applying 3.9 implies 1.1(b)(ii). Now let us drop any assumption on pp and rather assume for the moment that neither 𝔰\mathfrak{s} nor 𝔰\mathfrak{s}^{\prime} is the union of 2\geq 2 even-cardinality clusters. Then by 3.8(b), the corresponding vertices v,vΣSv,v^{\prime}\in\Sigma_{S} are distinguished. Since there is no distinguished vertex in the interior of the path [v,v][v,v^{\prime}] (as this interior does not intersect any axis Λ(i)\Lambda_{(i)}), we get

(13) v,v=[v,v~][v~,v],\llbracket v,v^{\prime}\rrbracket=[v,\tilde{v}]\sqcup[\tilde{v}^{\prime},v^{\prime}],

where $\tilde{v}$ (resp. $\tilde{v}'$) is the (unique) point in $[v,v']$ of distance $\frac{v(p)}{p-1}$ from $v$ (resp. $v'$). We thus have $\mu(v,v')=\frac{2v(p)}{p-1}$. Therefore, the formula in (6) gives us $\delta(\pi_{*}(v),\pi_{*}(v'))=\delta(v,v')+2v(p)$, and applying 3.9 implies 1.1(b)(iii). ∎
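The three cases treated in the proof of Proposition 3.12 can be compared side by side. The sketch below assumes that the formula in (6) reads $\delta(\pi_{*}(v),\pi_{*}(v'))=\delta(v,v')+(p-1)\mu(v,v')$; this form is consistent with all three computations in the proof but is not restated in this section, so it should be checked against the statement of 1.3. The numerical instance at the end ($p=2$, $v(2)=1$, $\delta(v,v')=5$) is purely illustrative.

```latex
% Assumed form of formula (6):
%   \delta(\pi_*(v), \pi_*(v')) = \delta(v, v') + (p - 1)\,\mu(v, v').
\begin{align*}
\text{odd cardinality:}
  && \mu(v,v') &= \delta(v,v')
  && \Longrightarrow & \delta(\pi_{*}(v),\pi_{*}(v')) &= p\,\delta(v,v'), \\
\text{even, } p \neq \text{residue char.:}
  && \mu(v,v') &= 0
  && \Longrightarrow & \delta(\pi_{*}(v),\pi_{*}(v')) &= \delta(v,v'), \\
\text{even, } v, v' \text{ distinguished:}
  && \mu(v,v') &= \tfrac{2v(p)}{p-1}
  && \Longrightarrow & \delta(\pi_{*}(v),\pi_{*}(v')) &= \delta(v,v') + 2v(p).
\end{align*}
% Illustrative instance: p = 2, v(2) = 1, \delta(v, v') = 5 (> 2 v(p)/(p-1) = 2):
\[
  \mu(v,v') = \tfrac{2 \cdot 1}{2 - 1} = 2,
  \qquad
  \delta(\pi_{*}(v),\pi_{*}(v')) = 5 + (2 - 1)\cdot 2 = 7 .
\]
```

Note that when $p$ is not the residue characteristic we have $v(p)=0$, so the third line specializes to the second.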

4. Proof of the main theorem

This section is devoted to proving 1.3. Our first task is to gather some background results on the action of PGL2(K)\mathrm{PGL}_{2}(K) on the convex hull ΣS\Sigma_{S} of an optimal subset SK1S\subset\mathbb{P}_{K}^{1} which are variants of (and are proved using) results in the author’s previous paper [14]; this is done in §4.1. Our method of proving our main result requires making explicit the modulo-action-of-Γ0\Gamma_{0} function that takes Ω\Omega to K1Ω/Γ0\mathbb{P}_{K}^{1}\cong\Omega/\Gamma_{0} as a theta function, which is the topic of §4.2. The remaining three subsections are then dedicated to the actual proof, which is broken into three parts: first (in §4.3) the presentation and proof of a result (4.10) providing an approximation of values of one of these theta functions Θ\Theta at certain inputs under a simplifying hypothesis on SS; then (in §4.4) the construction of an extension Θ\Theta_{*} of the theta function on SS to a map from the convex hull ΣS\Sigma_{S} to K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} and an explicit formula for Θ\Theta_{*} when restricted to a single segment of ΣS\Sigma_{S} (with the simplifying hypothesis retained); and finally (in §4.5) a “gluing” argument that completes the proof of 1.3.

We return to the setting where we have a $p$-superelliptic subset $S\subset\mathbb{P}_{K}^{1}$, meaning that (in particular) the set $S$ is clustered in $\frac{v(p)}{p-1}$-separated pairs $\{a_{0},b_{0}\},\dots,\{a_{g},b_{g}\}$, and that, letting $s_{i}\in\mathrm{PGL}_{2}(K)$ be an order-$p$ automorphism fixing $a_{i},b_{i}\in S$ for $0\leq i\leq g$, the group $\Gamma_{0}:=\langle s_{0},\dots,s_{g}\rangle$ is a $p$-Whittaker group (and so in particular is isomorphic to the free product of its cyclic subgroups $\langle s_{i}\rangle$). The Schottky group $\Gamma\lhd\Gamma_{0}$ is the index-$p$ normal subgroup consisting of words on the generators $s_{i}$ whose total exponent is divisible by $p$. Throughout this section, let $\Omega=\Omega_{\Gamma}=\Omega_{\Gamma_{0}}$ denote the set of non-limit points of $\Gamma$ (and of $\Gamma_{0}$) in $\mathbb{P}_{\mathbb{C}_{K}}^{1}$, noting that in previous parts of the paper we wrote $\Omega$ to refer to the set of non-limit $K$-points rather than the set of non-limit $\mathbb{C}_{K}$-points.

4.1. Some useful results on pp-Whittaker groups

Before we can start to prove 1.3 in earnest, we need to establish some useful properties of the action of the pp-Whittaker group Γ0\Gamma_{0} on the convex hull ΣS\Sigma_{S} in the case that SS is optimal. We begin by presenting a variation of our previous result [14, Lemma 3.16].

Lemma 4.1.

Let SK1S\subset\mathbb{P}_{K}^{1} be an optimal subset with associated Schottky group Γ\Gamma, and choose a point vΣSv\in\Sigma_{S} and a nontrivial element γΓ\gamma\in\Gamma, which we write as a word

(14) γ=sitntsit1nt1si1n1\gamma=s_{i_{t}}^{n_{t}}s_{i_{t-1}}^{n_{t-1}}\cdots s_{i_{1}}^{n_{1}}

for some $t\geq 1$, some $n_{1},\dots,n_{t}\in\mathbb{Z}\smallsetminus p\mathbb{Z}$, and some indices $i_{l}$ satisfying $i_{l}\neq i_{l-1}$ for $2\leq l\leq t$. Then the closest point in $\Sigma_{S}$ to $\gamma(v)$ lies in $\hat{\Lambda}_{i_{t}}$, and in fact we have

(15) $\delta(\gamma(v),\Sigma_{S})\geq\delta(v,\hat{\Lambda}_{i_{1}})+\sum_{l=2}^{t}\delta(\hat{\Lambda}_{i_{l-1}},\hat{\Lambda}_{i_{l}})>0.$

Proof.

If we have $v\notin\hat{\Lambda}_{i_{1}}$, then the hypothesis of [14, Lemma 3.16] applies, and that result gives us the desired conclusion. If instead we have $v\in\hat{\Lambda}_{i_{1}}$, then the element $\gamma s_{i_{1}}^{-n_{1}}\in\Gamma_{0}$ is nontrivial, and the hypothesis of [14, Lemma 3.16] applies when replacing $\gamma$ with $\gamma s_{i_{1}}^{-n_{1}}$. Keeping in mind that $\delta(v,\hat{\Lambda}_{i_{1}})=0$ in this case, this again gives us the desired conclusion. ∎

Lemma 4.2.

Let SK1S\subset\mathbb{P}_{K}^{1} be an optimal subset; let aK1a\in\mathbb{P}_{K}^{1} be any point such that the closest point in ΣS\Sigma_{S} to ηa\eta_{a} is not a distinguished vertex; and let bK1b\in\mathbb{P}_{K}^{1} be any point. Then there exists MM\in\mathbb{Q} such that we have v(γ(b)a)>Mv(\gamma(b)-a)>M for at most one element γΓ\gamma\in\Gamma.

Proof.

We first set out to show that there is at most one element γΓ\gamma\in\Gamma such that the closest point in ΣS\Sigma_{S} to ηγ(b)\eta_{\gamma(b)} is not a distinguished vertex. Suppose that there exists an element γ0Γ\gamma_{0}\in\Gamma such that the closest point ξ\xi in ΣS\Sigma_{S} to ηγ0(b)\eta_{\gamma_{0}(b)} is not a distinguished vertex. Choose a nontrivial element γΓ\gamma\in\Gamma and assume that the path [ηγγ0(b)=γ(ηγ0(b)),γ(ξ)]K1,an[\eta_{\gamma\gamma_{0}(b)}=\gamma(\eta_{\gamma_{0}(b)}),\gamma(\xi)]\subset\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} intersects ΣS\Sigma_{S} at a point ξΣS\xi^{\prime}\in\Sigma_{S}. Lemma 4.1 implies that ξγ(ξ)\xi^{\prime}\neq\gamma(\xi), so ξ\xi^{\prime} lies in the interior of the path [γ(ηγ0(b)),γ(ξ)][\gamma(\eta_{\gamma_{0}(b)}),\gamma(\xi)]. Applying the action of γ1\gamma^{-1} shows us that γ1(ξ)\gamma^{-1}(\xi^{\prime}) lies in the interior of the path [ηγ0(b),ξ][\eta_{\gamma_{0}(b)},\xi]. By Lemma 4.1, the closest point in ΣS\Sigma_{S} to γ1(ξ)\gamma^{-1}(\xi^{\prime}) is a distinguished vertex, which contradicts the fact that the closest point in ΣS\Sigma_{S} to ηγ0(b)\eta_{\gamma_{0}(b)} (and thus also to γ1(ξ)\gamma^{-1}(\xi^{\prime})) is not a distinguished vertex. From this contradiction we get [ηγγ0(b),γ(ξ)]ΣS=[\eta_{\gamma\gamma_{0}(b)},\gamma(\xi)]\cap\Sigma_{S}=\varnothing. By Lemma 4.1, the closest point in ΣS\Sigma_{S} to γ(ξ)\gamma(\xi) is a distinguished vertex; it follows that the closest point in ΣS\Sigma_{S} to ηγγ0(b)\eta_{\gamma\gamma_{0}(b)} is a distinguished vertex. Since γ\gamma was chosen arbitrarily, we have proved our claim.

We now consider the possible values of $v(\gamma(b)-a)$ over all elements $\gamma\in\Gamma$. After discarding at most one choice of $\gamma$, we may assume that the closest point in $\Sigma_{S}$ to $\eta_{\gamma(b)}$ is a distinguished vertex. Since the closest point in $\Sigma_{S}$ to $\eta_{a}$ is not a distinguished vertex, we have $\Lambda_{\gamma(b),a}\cap\Sigma_{S}\neq\varnothing$, and this intersection is the path $[\xi,\xi'_{\gamma}]$, where $\xi$ (resp. $\xi'_{\gamma}$) is the closest point in $\Sigma_{S}$ to $\eta_{a}$ (resp. $\eta_{\gamma(b)}$). In terms of the partial order established at the top of §3.3, we get either $\eta_{a}\vee\eta_{\gamma(b)}\in\{\xi,\xi'_{\gamma}\}$ or $\eta_{a}\vee\eta_{\gamma(b)}>\xi,\xi'_{\gamma}$. It follows that $\eta_{\gamma(b)}\vee\eta_{a}$ is the point $\xi\in\Sigma_{S}$, is a distinguished vertex of $\Sigma_{S,0}$, or is greater than a distinguished vertex $v\in\Sigma_{S,0}$ with respect to our partial order; the last condition implies that $\eta_{a}\vee\eta_{\gamma(b)}\in[v,\eta_{\infty}]$ and so we get $\delta(\eta_{a}\vee\eta_{\gamma(b)},\Lambda_{0,\infty})\leq\delta(v,\Lambda_{0,\infty})$. Thus, the distance $\delta(\eta_{a}\vee\eta_{\gamma(b)},\Lambda_{0,\infty})$ either equals $\delta(\xi,\Lambda_{0,\infty})$ or is at most the maximum distance between a distinguished vertex and the axis $\Lambda_{0,\infty}$. Now applying 3.2(c) yields the inequality

(16) v(γ(b)a)M:=v(a)+max({δ(ξ,Λ0,)}{δ(v,Λ0,)}v),v(\gamma(b)-a)\leq M:=v(a)+\max(\{\delta(\xi,\Lambda_{0,\infty})\}\cup\{\delta(v,\Lambda_{0,\infty})\}_{v}),

where the maximum is taken over all distinguished vertices vv (of which there are only finitely many). Since our choice of γΓ\gamma\in\Gamma was arbitrary apart from possibly discarding one element, the assertion of the lemma follows. ∎

Corollary 4.3.

Let SK1S\subset\mathbb{P}_{K}^{1} be an optimal subset, and let aK1a\in\mathbb{P}_{K}^{1} be an element satisfying that the closest point in ΣS\Sigma_{S} to ηa\eta_{a} is not a distinguished vertex. Then we have aΩa\in\Omega. In particular, we have SΩS\subset\Omega.

Proof.

Suppose that a point aa satisfying this hypothesis is a limit point. Then for some bK1b\in\mathbb{P}_{K}^{1} and some subset {γn}n1Γ\{\gamma_{n}\}_{n\geq 1}\subset\Gamma, we have limnγn(b)=a\lim_{n\to\infty}\gamma_{n}(b)=a, or equivalently, limnv(γn(b)a)=\lim_{n\to\infty}v(\gamma_{n}(b)-a)=\infty. But this is contradicted by Lemma 4.2. ∎

Remark 4.4.

As is explained in §1.2, any pp-superelliptic subset SK1S\subset\mathbb{P}_{K}^{1} can be “folded into” an optimal set SminS^{\mathrm{min}} without affecting the associated pp-Whittaker group Γ0\Gamma_{0}; modifying SS via a folding amounts to acting on each element of SS by some automorphism in Γ0\Gamma_{0}. Since the set of non-limit points Ω\Omega of Γ0\Gamma_{0} is invariant under the action of Γ0\Gamma_{0}, the second statement of 4.3, which says that SminΩS^{\mathrm{min}}\subset\Omega, implies that we have SΩS\subset\Omega as well. This crucial fact about fixed points of generators of pp-Whittaker groups is proved (using good fundamental domains) in the p=2p=2 case as [12, Lemma 2.3] and is mentioned at the top of [11, §3] for general pp; in the latter reference, the author van Steen refers to his thesis for a proof. Our above argument appears to be more or less independent of van Steen’s, and in our context it comes as part of a more general statement which is useful to us in its own right.

Lemma 4.5.

Let SK1S\subset\mathbb{P}_{K}^{1} be an optimal subset. For each γΓ\gamma\in\Gamma and 0ig0\leq i\leq g, write ciγ=[γ(ai)]1γ(bi)1c_{i}^{\gamma}=[\gamma(a_{i})]^{-1}\gamma(b_{i})-1. Assume that for some index jj, we have aj=0a_{j}=0 and bj=b_{j}=\infty.

(a) For any index $i$ and any $M>0$ there are only finitely many elements $\gamma\in\Gamma$ such that $v(c_{i}^{\gamma})\leq M$.

(b) For any index $i\neq j$, we have

(17) $v(c_{i}^{\gamma})\geq v(a_{i}^{-1}b_{i}-1)$ for all $\gamma\in\Gamma$.

Moreover, under the additional assumption that we have $[\eta_{a_{i}}\vee\eta_{b_{i}},\mathfrak{v}_{j}]\cap\hat{\Lambda}_{(l)}=\varnothing$ for indices $l\neq i,j$, equality occurs in (17) if and only if we have $\gamma\in\langle s_{j}\rangle$ (that is, only when $\gamma$ acts as $z\mapsto\zeta_{p}^{n}z$ for some $n\in\mathbb{Z}$).

Proof.

Let GjΓG_{j}\subset\Gamma be the subset of elements which, when written as a word as in (14), satisfy itji_{t}\neq j (interpreting this definition so that 1Gj1\in G_{j}). It is elementary to check that Γ\Gamma can be written as the disjoint union n=0p1sjnGjsin\bigsqcup_{n=0}^{p-1}s_{j}^{n}G_{j}s_{i}^{-n}. Using the fact that sis_{i} fixes the points ai,biKa_{i},b_{i}\in K, we compute

(18) cisjnγsin=(ζpnγ(ai))1(ζpnγ(bi))1=[γ(ai)]1γ(bi)1=ciγ.c_{i}^{s_{j}^{n}\gamma s_{i}^{-n}}=(\zeta_{p}^{n}\gamma(a_{i}))^{-1}(\zeta_{p}^{n}\gamma(b_{i}))-1=[\gamma(a_{i})]^{-1}\gamma(b_{i})-1=c_{i}^{\gamma}.

It therefore suffices to prove that the statements of parts (a) and (b) hold for γGj\gamma\in G_{j}. We will show for such γ\gamma that, letting t(γ)t(\gamma) denote the length of γ\gamma as a word on the generators sls_{l} of Γ0\Gamma_{0} (that is, t(γ)t(\gamma) is the natural number tt appearing in the expression for γ\gamma in (14)), we have

(19) v(ciγ)t(γ)min{δ(Λ^(l),Λ^(m))}lm.v(c_{i}^{\gamma})\geq t(\gamma)\min\{\delta(\hat{\Lambda}_{(l)},\hat{\Lambda}_{(m)})\}_{l\neq m}.

Since the set {δ(Λ^(l),Λ^(m))}lm\{\delta(\hat{\Lambda}_{(l)},\hat{\Lambda}_{(m)})\}_{l\neq m} is finite and consists of positive numbers, the inequality (19) immediately implies part (a).
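The deduction of part (a) from (19) can be spelled out as the following counting argument. This is a sketch: the abbreviation $m$ and the crude bound on the number of words are introduced here for illustration and do not appear in the text.

```latex
% m is shorthand (introduced here) for the minimum appearing in (19).
\[
  m := \min\{\delta(\hat{\Lambda}_{(l)},\hat{\Lambda}_{(m')})\}_{l\neq m'} > 0,
  \qquad
  v(c_{i}^{\gamma}) \leq M
  \;\overset{(19)}{\Longrightarrow}\;
  t(\gamma) \leq \frac{M}{m}.
\]
% A word of length t as in (14) is determined by t letters s_{i_l}^{n_l},
% each with i_l \in \{0,\dots,g\} and n_l \bmod p \in \{1,\dots,p-1\}, so
\[
  \#\{\gamma\in G_{j} : v(c_{i}^{\gamma}) \leq M\}
  \;\leq\; \sum_{t=0}^{\lfloor M/m\rfloor} \bigl((g+1)(p-1)\bigr)^{t}
  \;<\; \infty .
\]
% By (18), c_i^\gamma is unchanged on passing from \gamma \in G_j to
% s_j^n \gamma s_i^{-n}, so the count over all of \Gamma is at most p times this.
```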

We have $c_{i}^{\gamma}=a_{i}^{-1}b_{i}-1$ when $\gamma=1$, which immediately verifies both the inequality in (19) and the statement of part (b) in this case, so we choose $\gamma\in G_{j}\smallsetminus\{1\}$ and proceed to show that both inequalities (19) and $v(c_{i}^{\gamma})\geq v(a_{i}^{-1}b_{i}-1)$ hold, and that the latter inequality is strict if we have $[\eta_{a_{i}}\vee\eta_{b_{i}},\mathfrak{v}_{j}]\cap\hat{\Lambda}_{(l)}=\varnothing$ for indices $l\neq i,j$. Below we will write $t$ for $t(\gamma)$.

To this end, we first note using 3.2(c) that we have

(20) v(ciγ)=v(γ(bi)γ(ai))v(γ(ai))=δ(ηγ(ai)ηγ(bi),Λ(j)).v(c_{i}^{\gamma})=v(\gamma(b_{i})-\gamma(a_{i}))-v(\gamma(a_{i}))=\delta(\eta_{\gamma(a_{i})}\vee\eta_{\gamma(b_{i})},\Lambda_{(j)}).

Now as the fractional linear transformation γ\gamma acts on K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} as a self-homeomorphism and sends the endpoints ηai,ηbi\eta_{a_{i}},\eta_{b_{i}} of the axis Λ(i)\Lambda_{(i)} to the points ηγ(ai),ηγ(bi)\eta_{\gamma(a_{i})},\eta_{\gamma(b_{i})} respectively, we have γ(Λ(i))=Λγ(ai),γ(bi)ηγ(ai)ηγ(bi)\gamma(\Lambda_{(i)})=\Lambda_{\gamma(a_{i}),\gamma(b_{i})}\ni\eta_{\gamma(a_{i})}\vee\eta_{\gamma(b_{i})}, from which it follows that there is a point vΛ(i)ΣSv\in\Lambda_{(i)}\subset\Sigma_{S} such that γ(v)=ηγ(ai)ηγ(bi)\gamma(v)=\eta_{\gamma(a_{i})}\vee\eta_{\gamma(b_{i})}. Now we apply Lemma 4.1 to get

(21) δ(γ(v),Λ^(it))=δ(v,Λ^(i1))+l=2tδ(Λ^(il1),Λ^(il))δ(v,Λ^(it)).\delta(\gamma(v),\hat{\Lambda}_{(i_{t})})=\delta(v,\hat{\Lambda}_{(i_{1})})+\sum_{l=2}^{t}\delta(\hat{\Lambda}_{(i_{l-1})},\hat{\Lambda}_{(i_{l})})\geq\delta(v,\hat{\Lambda}_{(i_{t})}).

In fact, we may now estimate v(ciγ)v(c_{i}^{\gamma}) by computing

(22)
$$\begin{aligned}
v(c_{i}^{\gamma}) &= \delta(\gamma(v),\Lambda_{(j)}) && \text{by (20)}\\
&\geq \delta(\gamma(v),\hat{\Lambda}_{(i_{t})})+\delta(\hat{\Lambda}_{(i_{t})},\Lambda_{(j)}) && \text{using Lemma 4.1}\\
&\geq \delta(v,\hat{\Lambda}_{(i_{t})})+\delta(\hat{\Lambda}_{(i_{t})},\Lambda_{(j)}) && \text{by (21)}\\
&\geq \delta(v,\Lambda_{(j)}) && \text{(strict if } [v,\mathfrak{v}_{j}]\cap\hat{\Lambda}_{(i_{t})}=\varnothing\text{)}\\
&\geq \delta(\Lambda_{(i)},\Lambda_{(j)}) && \text{because } v\in\Lambda_{(i)}\\
&= v(b_{i}-a_{i})-v(a_{i})=v(a_{i}^{-1}b_{i}-1) && \text{by 3.2(b)(c),}
\end{aligned}$$

where the particular consequence of Lemma 4.1 used in the first inequality of (22) is its assertion that the closest point in $\Sigma_{S}$ to $\gamma(v)$ lies in $\hat{\Lambda}_{(i_{t})}$. This directly provides the desired inequality $v(c_{i}^{\gamma})\geq v(a_{i}^{-1}b_{i}-1)$; we see from (22) that it is strict if we have $[v,\mathfrak{v}_{j}]\cap\hat{\Lambda}_{(i_{t})}=\varnothing$. As we have $[v,\mathfrak{v}_{j}]=[v,\eta_{a_{i}}\vee\eta_{b_{i}}]\cup[\eta_{a_{i}}\vee\eta_{b_{i}},\mathfrak{v}_{j}]$ and $[v,\eta_{a_{i}}\vee\eta_{b_{i}}]\subset\Lambda_{(i)}$ is disjoint from $\Lambda_{(j)}$ by the property of being clustered in pairs, the condition that implies strictness is equivalent to the condition that $[\eta_{a_{i}}\vee\eta_{b_{i}},\mathfrak{v}_{j}]\cap\Lambda_{(i_{t})}=\varnothing$.

Meanwhile, the sequence of inequalities in (22) also includes $v(c_{i}^{\gamma})\geq\delta(\gamma(v),\hat{\Lambda}_{(i_{t})})+\delta(\hat{\Lambda}_{(i_{t})},\Lambda_{(j)})$ which, using (21), allows us to get the desired inequality (19) by computing

(23)
$$\begin{aligned}
v(c_{i}^{\gamma}) &\geq \delta(\gamma(v),\hat{\Lambda}_{(i_{t})})+\delta(\hat{\Lambda}_{(i_{t})},\Lambda_{(j)})\\
&\geq \sum_{l=2}^{t}\delta(\hat{\Lambda}_{(i_{l-1})},\hat{\Lambda}_{(i_{l})})+\delta(\hat{\Lambda}_{(i_{t})},\hat{\Lambda}_{(j)})\\
&\geq t(\gamma)\min\{\delta(\hat{\Lambda}_{(l)},\hat{\Lambda}_{(m)})\}_{l\neq m}.
\end{aligned}$$
∎

4.2. Theta functions

The KK-analytic isomorphism between the quotient of the set of non-limit points Ω\Omega modulo the action of a pp-Whittaker group Γ0\Gamma_{0} and the projective line K1\mathbb{P}_{K}^{1} is made explicit by a special type of infinite product function. We define (as in [6, §II.2], with slightly different notation) the theta function Θa,bG\Theta_{a,b}^{G} with respect to any subgroup G<PGL2(K)G<\mathrm{PGL}_{2}(K) and any choice of elements a,ba,b of its subset ΩGK1\Omega_{G}\subset\mathbb{P}_{K}^{1} of non-limit points as

Θa,bG(z)=γGzγ(a)zγ(b).\Theta_{a,b}^{G}(z)=\prod_{\gamma\in G}\frac{z-\gamma(a)}{z-\gamma(b)}.

Here and below, we adopt the convention that if exactly one of the terms in the numerator (resp. denominator) is $\infty$, then the numerator (resp. denominator) is replaced by $1$ and that if the denominator comes out to $0$, then the infinite product equals $\infty\in\mathbb{P}_{K}^{1}$. We will always assume for our purposes that we have $b\notin G(a)$ and $\infty\notin G(a)\cup G(b)$ (i.e. that $a$ and $b$ are in separate orbits under the action of $G$ and that neither is in the orbit of $\infty$). It is immediate to see that the set of zeros (resp. poles) of $\Theta^{G}_{a,b}$ coincides with $\{\gamma(a)\}_{\gamma\in G}$ (resp. $\{\gamma(b)\}_{\gamma\in G}$); in particular, these theta functions are not constant.

In our situation, we are concerned with the theta functions $\Theta_{a,b}^{\Gamma}$ and $\Theta_{a,b}^{\Gamma_{0}}$ corresponding to our Schottky and $p$-Whittaker groups $\Gamma\lhd\Gamma_{0}<\mathrm{PGL}_{2}(K)$, where $a,b\in\Omega:=\Omega_{\Gamma}=\Omega_{\Gamma_{0}}$ satisfy $b\notin\Gamma_{0}(a)$ and $\infty\notin\Gamma_{0}(a)\cup\Gamma_{0}(b)$. It is shown in [6, §II.2, IX.2] that for such $a,b$, the functions $\Theta_{a,b}^{\Gamma}$ and $\Theta_{a,b}^{\Gamma_{0}}$ are meromorphic on $\Omega$; moreover, a primarily group-theoretic argument in [6, §VIII.1] demonstrates that $\Theta_{a,b}^{\Gamma_{0}}$ is invariant under the action of $\Gamma_{0}$, i.e. we have $\Theta_{a,b}^{\Gamma_{0}}(\gamma(z))=\Theta_{a,b}^{\Gamma_{0}}(z)$ for $\gamma\in\Gamma_{0}$. These results imply that $\Theta_{a,b}^{\Gamma_{0}}:\Omega\to\mathbb{P}_{K}^{1}$ induces a map $\vartheta_{a,b}:\Omega/\Gamma_{0}\to\mathbb{P}_{K}^{1}$; as the only poles of $\Theta_{a,b}^{\Gamma_{0}}$ are simple poles occurring at the elements of $\Gamma_{0}(b)$, the induced function $\vartheta_{a,b}$ has exactly $1$ simple pole and so it is an isomorphism.

For our purposes, the elements $a,b\in\Omega$ will be chosen to be $a_{i},b_{i}\in S$ for some index $i\in\{0,\dots,g\}$, where $S$ is an optimal set whose associated Schottky and $p$-Whittaker groups are $\Gamma\lhd\Gamma_{0}$. This is a valid choice of $a,b$ for the following reasons. We know that $S\subset\Omega$ from 4.3. We know moreover that $s_{i}(a_{i})=a_{i}\neq b_{i}$ and that $\eta_{\gamma(a_{i})}=\gamma(\eta_{a_{i}})\notin\Sigma_{S}\ni\eta_{b_{i}}$ for any $\gamma\in\Gamma\smallsetminus\{1\}$ thanks to Lemma 4.1. Since the group $\Gamma_{0}$ is generated by its subgroup $\Gamma$ and the element $s_{i}$, we get $b_{i}\notin\Gamma_{0}(a_{i})$.

From the decomposition Γ0=0np1Γsin\Gamma_{0}=\bigsqcup_{0\leq n\leq p-1}\Gamma s_{i}^{n} and the fact that the element siΓ0s_{i}\in\Gamma_{0} fixes ai,bia_{i},b_{i}, one sees immediately from formulas that we have

(24) Θai,biΓ0=(Θai,biΓ)p.\Theta_{a_{i},b_{i}}^{\Gamma_{0}}=(\Theta_{a_{i},b_{i}}^{\Gamma})^{p}.
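For the reader's convenience, the computation behind (24) can be written out as follows. This is a sketch which takes for granted that the infinite product may be regrouped along the coset decomposition; this regrouping is assumed here and not justified.

```latex
\begin{align*}
\Theta_{a_{i},b_{i}}^{\Gamma_{0}}(z)
  &= \prod_{n=0}^{p-1}\,\prod_{\gamma\in\Gamma}
     \frac{z-\gamma s_{i}^{n}(a_{i})}{z-\gamma s_{i}^{n}(b_{i})}
  && \text{since } \Gamma_{0}=\textstyle\bigsqcup_{0\leq n\leq p-1}\Gamma s_{i}^{n} \\
  &= \prod_{n=0}^{p-1}\,\prod_{\gamma\in\Gamma}
     \frac{z-\gamma(a_{i})}{z-\gamma(b_{i})}
  && \text{since } s_{i}^{n}(a_{i})=a_{i},\ s_{i}^{n}(b_{i})=b_{i} \\
  &= \bigl(\Theta_{a_{i},b_{i}}^{\Gamma}(z)\bigr)^{p}.
\end{align*}
```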

To state and demonstrate the results in the rest of this section, we introduce the following notation. Given a point aKa\in\mathbb{C}_{K} and a real number rr, we denote the disc {zK|v(za)r}\{z\in\mathbb{C}_{K}\ |\ v(z-a)\geq r\} by Da(r)D_{a}(r) and the corresponding point of K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} by ηa(r)\eta_{a}(r). Given another real number s>rs>r, we introduce the notation Aa(r,s)KA_{a}(r,s)\subset\mathbb{C}_{K} for the open annulus given by {zK|r<v(za)<s}\{z\in\mathbb{C}_{K}\ |\ r<v(z-a)<s\}; we will refer to aa as a center of the annulus Aa(r,s)A_{a}(r,s) even though aAa(r,s)a\notin A_{a}(r,s).

There is a natural way to extend any rational function Θ\Theta on K1\mathbb{P}_{\mathbb{C}_{K}}^{1} (viewed as the subspace of K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} consisting of the points of Type I) to a function Θ\Theta_{*} on K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} using the original seminorm definition of the points of K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}: it is done by composing Θ\Theta with each seminorm; see [2, Definition 7.2]. It is clear from the construction and the proofs of the results given in [2, §7.1,7.2] that this can be generalized to the case where Θ\Theta is not necessarily a rational function but is meromorphic on a subspace ZK1Z\subset\mathbb{P}_{\mathbb{C}_{K}}^{1} which satisfies that, for any center aZa\in Z and any real number rr, there is some ε>0\varepsilon>0 such that the function Θ\Theta is meromorphic on each of the annuli Aa(r,r+ε)A_{a}(r,r+\varepsilon) and Aa(rε,r)A_{a}(r-\varepsilon,r); in this generalization, one expects the extended function Θ\Theta_{*} to be defined on the convex hull of ZZ in K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}. We claim that this is the case when Θ\Theta is one of the theta functions Θa,bΓ\Theta_{a,b}^{\Gamma} described in §4.2 above, given a pp-superelliptic set SS with associated pp-Whittaker group Γ0\Gamma_{0} and elements a,bΩ=ΩΓ=ΩΓ0a,b\in\Omega=\Omega_{\Gamma}=\Omega_{\Gamma_{0}}, and that the values of the induced function Θ=(Θa,bΓ)\Theta_{*}=(\Theta_{a,b}^{\Gamma})_{*} can be computed at any point ηD\eta_{D} in the convex hull of Ω\Omega in K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} via [2, Theorem 7.12, Remark 7.14]. This result requires in particular that Θ\Theta be meromorphic when restricted to a small enough annulus with inner or outer radius equal to the radius of DD and which is centered at a center of DD, and it states that the image Θ(A)\Theta(A) is also an annulus.

As we do not need to show that the function Θ\Theta_{*} defined on a certain subspace of K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} via the construction given in [2, Theorem 7.12] is actually the map on seminorms induced by Θ\Theta in the sense of [2, Definition 7.2] or that it is defined on the convex hull of Ω\Omega, we leave out the details of such arguments. However, in order to define Θ\Theta_{*} using [2, Theorem 7.12, Remark 7.14] on the subspace ΣSK1,an\Sigma_{S}\subset\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}, it is necessary and sufficient to establish the below facts. (This will also show that the theta function Θ~:=Θa,bΓ0\tilde{\Theta}:=\Theta_{a,b}^{\Gamma_{0}}, being the composition of a polynomial function with Θ\Theta as in (24), induces a map Θ~\tilde{\Theta}_{*} on ΣS\Sigma_{S}, as this construction is functorial with respect to composition.)

Proposition 4.6.

Assume the above set-up and notation, that the set SS is optimal, and that S\infty\in S; choose a,bΩa,b\in\Omega. For each point ηDΣS{ηai,ηbi}0ig\eta_{D}\in\Sigma_{S}\smallsetminus\{\eta_{a_{i}},\eta_{b_{i}}\}_{0\leq i\leq g}, the function Θa,bΓ\Theta_{a,b}^{\Gamma} is meromorphic on an annulus of the form Ac(d(D),d(D)+ε)A_{c}(d(D),d(D)+\varepsilon) or Ac(d(D)ε,d(D))A_{c}(d(D)-\varepsilon,d(D)) for some cDc\in D and ε>0\varepsilon>0.

As discussed in §4.2, a theta function $\Theta_{a,b}^{\Gamma}$ is meromorphic on $\Omega\subset\mathbb{P}_{K}^{1}$. Therefore, Proposition 4.6 follows immediately from the following proposition and corollary.

Proposition 4.7.

Let SK1S\subset\mathbb{P}_{K}^{1} be an optimal subset with S\infty\in S. Choose an element aiSa_{i}\in S and a real number rr such that the point ηai(r)K1,an\eta_{a_{i}}(r)\in\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} lies in ΣS\Sigma_{S} but is not a vertex. Let aKa\in\mathbb{C}_{K} be an element satisfying v(aai)=rv(a-a_{i})=r. Then the closest point in ΣS\Sigma_{S} to ηa\eta_{a} is ηai(r)\eta_{a_{i}}(r), and we have aΩa\in\Omega.

Proof.

Let ξ\xi be the closest point in ΣS\Sigma_{S} to ηa\eta_{a}. Then we have ξ[ηa,ηa(r)]\xi\in[\eta_{a},\eta_{a}(r)], so that the disc DKD\subset\mathbb{C}_{K} corresponding to ξ\xi satisfies aDDai(r)a\in D\subseteq D_{a_{i}}(r). Suppose that ξηai(r)\xi\neq\eta_{a_{i}}(r), so that we have DDai(r)D\subsetneq D_{a_{i}}(r). Then the logarithmic radius of DD is <r<r so that we have aiDa_{i}\notin D and that the smallest disc containing DD and aia_{i} is Dai(r)D_{a_{i}}(r). The half-open segment [ξ,ηai(r)]{ηai(r)}ΣS[\xi,\eta_{a_{i}}(r)]\smallsetminus\{\eta_{a_{i}}(r)\}\subset\Sigma_{S} clearly intersects a neighborhood of the point ηai(r)\eta_{a_{i}}(r). But since ηai(r)\eta_{a_{i}}(r) is not a natural vertex, every sufficiently small neighborhood of ηai(r)\eta_{a_{i}}(r) in the space ΣS\Sigma_{S} is contained in a path [ηai(r+ε),ηai(rε)][ηai,η]ΣS[\eta_{a_{i}}(r+\varepsilon),\eta_{a_{i}}(r-\varepsilon)]\subsetneq[\eta_{a_{i}},\eta_{\infty}]\subset\Sigma_{S} for some small ε>0\varepsilon>0. This is a contradiction, so the closest point in ΣS\Sigma_{S} to ηa\eta_{a} must be ηai(r)\eta_{a_{i}}(r). Then 4.3 says that we have aΩa\in\Omega. ∎

Corollary 4.8.

Let $S\subset\mathbb{P}_{K}^{1}$ be an optimal subset with $\infty\in S$. Choose an element $a_{i}\in S$ and a real number $r$ such that the point $\eta_{a_{i}}(r)\in\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}$ lies in $\Sigma_{S}$. Then there exists $\varepsilon>0$ such that we have

(25) $$A_{a_{i}}(r,r+\varepsilon),\ A_{a_{i}}(r-\varepsilon,r)\subset\Omega.$$
Proof.

Since there are only finitely many vertices of $\Sigma_{S}$, for small enough $\varepsilon>0$, there is no vertex of $\Sigma_{S}$ in the punctured neighborhood $B(\{\eta_{a_{i}}(r)\},\varepsilon)\smallsetminus\{\eta_{a_{i}}(r)\}$. Choosing any $r^{\prime}\in\mathbb{R}$ with $r<r^{\prime}\leq r+\varepsilon$, we have that the point $\eta_{a_{i}}(r^{\prime})\in[\eta_{a_{i}},\eta_{\infty}]\subset\Sigma_{S}$ is not a vertex. Then Proposition 4.7 says that for any $a\in\mathbb{P}_{K}^{1}$ with $v(a-a_{i})=r^{\prime}$, we have $a\in\Omega$. The claim that $A_{a_{i}}(r,r+\varepsilon)\subset\Omega$ follows, and the claim that $A_{a_{i}}(r-\varepsilon,r)\subset\Omega$ follows from a similar argument. ∎

4.3. An approximation for some values of the theta function

Having set up all of the background results that we will need, we now set out to prove 1.3. The goal of this section is to prove Theorem 4.10 below, which will be crucial to our computations. In equations below, the expression [h.v.t.] (“higher-valuation terms”) will often appear in sums of elements of $\mathbb{C}_{K}$; this means that we are adding an unspecified term whose valuation is higher than that of each of the other terms in the sum. The following lemma will be used in the proof of Theorem 4.10 and also serves to set up some notation used in the theorem.

Lemma 4.9.

With all of the above notation, suppose that SK1S\subset\mathbb{P}_{K}^{1} is an optimal subset satisfying that aj=0a_{j}=0 and bj=b_{j}=\infty for some index jj. Choose an element aK1a\in\mathbb{P}_{\mathbb{C}_{K}}^{1}, and for 0np10\leq n\leq p-1, let η(n)a\eta_{(n)}^{a} denote the closest point in the convex hull ΣS\Sigma_{S} to ηζpna\eta_{\zeta_{p}^{n}a}. Then all points η(n)a\eta_{(n)}^{a} share the same closest point in Λ^(j)\hat{\Lambda}_{(j)}, and there is at most one nn such that η(n)aΛ^(j)\eta_{(n)}^{a}\notin\hat{\Lambda}_{(j)}.

Proof.

As we have $\hat{\Lambda}_{(j)}\subset\Sigma_{S}$, for any $n\in\{0,\dots,p-1\}$, the closest point in $\hat{\Lambda}_{(j)}$ to $\eta_{(n)}^{a}$ is clearly the closest point in $\hat{\Lambda}_{(j)}$ to $\eta_{\zeta_{p}^{n}a}$. Meanwhile, we may take $s_{0}\in\mathrm{PGL}_{2}(K)$ to be the automorphism $z\mapsto\zeta_{p}z$, so for any two indices $n,n^{\prime}$, we have $\eta_{\zeta_{p}^{n^{\prime}}a}=s_{0}^{n^{\prime}-n}(\eta_{\zeta_{p}^{n}a})$. Now by applying [14, Proposition 2.6(d)], we get that $\eta_{\zeta_{p}^{n}a}$ and $s_{0}^{n^{\prime}-n}(\eta_{\zeta_{p}^{n}a})=\eta_{\zeta_{p}^{n^{\prime}}a}$ share the same closest point in $\hat{\Lambda}_{(j)}$ (which we now denote by $\xi^{a}$), and the first statement of the lemma follows.

Let mm be an index such that δ(η(m)a,Λ(j))δ(η(n)a,Λ(j))\delta(\eta_{(m)}^{a},\Lambda_{(j)})\geq\delta(\eta_{(n)}^{a},\Lambda_{(j)}) for 0np10\leq n\leq p-1. If η(m)aΛ^(j)\eta_{(m)}^{a}\in\hat{\Lambda}_{(j)}, then we have η(n)aΛ^(j)\eta_{(n)}^{a}\in\hat{\Lambda}_{(j)} for all other indices nn as well and we are done, so assume instead that we have η(m)aΛ^(j)\eta_{(m)}^{a}\notin\hat{\Lambda}_{(j)}. Choose any other index nmn\neq m. Now, keeping in mind that ηζpma=s0mn(ηζpna)\eta_{\zeta_{p}^{m}a}=s_{0}^{m-n}(\eta_{\zeta_{p}^{n}a}), by [14, Proposition 2.6(d)] we have

(26) [η(n)a,ξa][ξa,s0mn(η(n)a)]=[η(n)a,s0mn(η(n)a)][ηζpna,ηζpma]=[ηζpna,ξa][ξa,ηζpma],[\eta_{(n)}^{a},\xi^{a}]\cup[\xi^{a},s_{0}^{m-n}(\eta_{(n)}^{a})]=[\eta_{(n)}^{a},s_{0}^{m-n}(\eta_{(n)}^{a})]\subset[\eta_{\zeta_{p}^{n}a},\eta_{\zeta_{p}^{m}a}]=[\eta_{\zeta_{p}^{n}a},\xi^{a}]\cup[\xi^{a},\eta_{\zeta_{p}^{m}a}],

and therefore s0mn(η(n)a)[ξa,ηζpma]s_{0}^{m-n}(\eta_{(n)}^{a})\in[\xi^{a},\eta_{\zeta_{p}^{m}a}]. Since we have

(27) δ(s0mn(η(n)a),ξa)=δ(s0mn(η(n)a),s0mn(ξa))=δ(η(n)a,ξa)δ(η(m)a,ξa),\delta(s_{0}^{m-n}(\eta_{(n)}^{a}),\xi^{a})=\delta(s_{0}^{m-n}(\eta_{(n)}^{a}),s_{0}^{m-n}(\xi^{a}))=\delta(\eta_{(n)}^{a},\xi^{a})\leq\delta(\eta_{(m)}^{a},\xi^{a}),

this gives us $s_{0}^{m-n}(\eta_{(n)}^{a})\in[\eta_{(m)}^{a},\xi^{a}]\subset\Sigma_{S}$. Therefore, we have the inclusion $[\eta_{(n)}^{a},s_{0}^{m-n}(\eta_{(n)}^{a})]=[\eta_{(n)}^{a},\xi^{a}]\cup[\xi^{a},s_{0}^{m-n}(\eta_{(n)}^{a})]\subset\Sigma_{S}$, and now if $\eta_{(n)}^{a}\notin\hat{\Lambda}_{(j)}$ (so that $s_{0}^{m-n}(\eta_{(n)}^{a})\notin\hat{\Lambda}_{(j)}$ as well by [14, Proposition 2.6(c)]), one sees using [14, Proposition 3.11, Definition 3.12] that this contradicts the fact that $S$ is optimal. ∎

Theorem 4.10.

Suppose that SK1S\subset\mathbb{P}_{K}^{1} is an optimal subset satisfying the following conditions:

  1. (i)

    we have 0=:aj,=:bj,1=:biS0=:a_{j},\infty=:b_{j},1=:b_{i}\in S for some indices iji\neq j;

  2. (ii)

    the point 𝔳jΛ(j)\mathfrak{v}_{j}\in\Lambda_{(j)} corresponding to the disc {zK|v(z)0}\{z\in\mathbb{C}_{K}\ |\ v(z)\geq 0\} consisting of the integral elements is a distinguished vertex of ΣS,0\Sigma_{S,0}; and

  3. (iii)

    the point 𝔳i:=η1ηaiΛ(i)\mathfrak{v}_{i}:=\eta_{1}\vee\eta_{a_{i}}\in\Lambda_{(i)} (which corresponds to the disc {zK|v(z1)v(ai1)}\{z\in\mathbb{C}_{K}\ |\ v(z-1)\geq v(a_{i}-1)\}) is a distinguished vertex of ΣS,0\Sigma_{S,0} satisfying [𝔳i,𝔳j]ΣS,0Λ^(l)[\mathfrak{v}_{i},\mathfrak{v}_{j}]\subseteq\Sigma_{S,0}\smallsetminus\hat{\Lambda}_{(l)} for each li,jl\neq i,j.

Write λ=aibi=ai1\lambda=a_{i}-b_{i}=a_{i}-1. Given aΩa\in\Omega, we define the points η(n)a\eta_{(n)}^{a} as in Lemma 4.9. We may approximate Θ(a)\Theta(a) for certain inputs aa as follows.

  1. (a)

    For small enough ν>0\nu>0, and for any aΩa\in\Omega such that we have η(n)aΛ(i)[𝔳i,𝔳j]B({𝔳j},ν)\eta_{(n)}^{a}\in\Lambda_{(i)}\cup[\mathfrak{v}_{i},\mathfrak{v}_{j}]\cup B(\{\mathfrak{v}_{j}\},\nu) for some n{0,,p1}n\in\{0,\dots,p-1\}, we may make the approximation

    (28) Θ(a)=1+pλ(1ap)1+[h.v.t.].\Theta(a)=1+p\lambda(1-a^{p})^{-1}+[\mathrm{h.v.t.}].

    Moreover, the higher-valuation terms appearing in (28) above have valuation >v(p)+v(λ)v(1ap)+ν>v(p)+v(\lambda)-v(1-a^{p})+\nu.

  2. (b)

    Given aΩa\in\Omega and n{0,,p1}n\in\{0,\dots,p-1\} as in part (a), assume that we have η(n)a(Λ(i)[𝔳i,𝔳j])Λ^(j)\eta_{(n)}^{a}\in(\Lambda_{(i)}\cup[\mathfrak{v}_{i},\mathfrak{v}_{j}])\smallsetminus\hat{\Lambda}_{(j)}. Then we may make the approximation

    (29) Θ(a)=1+λ(1a)1+[h.v.t.].\Theta(a)=1+\lambda(1-a)^{-1}+[\mathrm{h.v.t.}].
Proof.

Define the subset GjΓG_{j}\subset\Gamma as in the proof of Lemma 4.5, where it was observed that Γ\Gamma can be written as the disjoint union n=0p1sjnGjsin\bigsqcup_{n=0}^{p-1}s_{j}^{n}G_{j}s_{i}^{-n}. Now our formula for the theta function Θai,biΓ\Theta_{a_{i},b_{i}}^{\Gamma} can be written as

(30) Θai,biΓ(a)=n=0p1(γsjnGjsinaγ(1+λ)aγ(1)).\Theta_{a_{i},b_{i}}^{\Gamma}(a)=\prod_{n=0}^{p-1}\Big{(}\prod_{\gamma\in s_{j}^{n}G_{j}s_{i}^{-n}}\frac{a-\gamma(1+\lambda)}{a-\gamma(1)}\Big{)}.

Now note that we may choose sjΓ0s_{j}\in\Gamma_{0} to be the automorphism given by zζpzz\mapsto\zeta_{p}z as it is of order pp and fixes 0,K10,\infty\in\mathbb{P}_{K}^{1}. Since the automorphism sis_{i} meanwhile fixes the points ai=1+λ,bi=1a_{i}=1+\lambda,b_{i}=1, we may write the expression for Θ(a)\Theta(a) in (30) as

(31) Θai,biΓ(a)=n=0p1(γGjaζpnγ(1+λ)aζpnγ(1))=γGj(n=0p1aζpnγ(1+λ)aζpnγ(1))=γGjap[γ(1+λ)]pap[γ(1)]p.\Theta_{a_{i},b_{i}}^{\Gamma}(a)=\prod_{n=0}^{p-1}\Big{(}\prod_{\gamma\in G_{j}}\frac{a-\zeta_{p}^{n}\gamma(1+\lambda)}{a-\zeta_{p}^{n}\gamma(1)}\Big{)}=\prod_{\gamma\in G_{j}}\Big{(}\prod_{n=0}^{p-1}\frac{a-\zeta_{p}^{n}\gamma(1+\lambda)}{a-\zeta_{p}^{n}\gamma(1)}\Big{)}=\prod_{\gamma\in G_{j}}\frac{a^{p}-[\gamma(1+\lambda)]^{p}}{a^{p}-[\gamma(1)]^{p}}.
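For reference, the last equality in (31) is an instance of the factorization of $a^{p}-x^{p}$ over the $p$th roots of unity. The following sketch spells this out; the symbol $x$ is a placeholder introduced here (standing for $\gamma(1+\lambda)$ or $\gamma(1)$) and is not notation from the text.

```latex
% Factorization over the p-th roots of unity (x is a placeholder symbol):
\prod_{n=0}^{p-1}\bigl(a-\zeta_{p}^{n}x\bigr)=a^{p}-x^{p}.
% Applying it to numerator and denominator for a fixed \gamma\in G_{j}:
\prod_{n=0}^{p-1}\frac{a-\zeta_{p}^{n}\gamma(1+\lambda)}{a-\zeta_{p}^{n}\gamma(1)}
  =\frac{a^{p}-[\gamma(1+\lambda)]^{p}}{a^{p}-[\gamma(1)]^{p}}.
```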

Fix any element γGj\gamma\in G_{j}, and define ciγc_{i}^{\gamma} as in Lemma 4.5. Lemma 4.5(b) says that we have v(ciγ)v(λ)v(c_{i}^{\gamma})\geq v(\lambda); in turn, by applying 3.2(c) and using the property of being clustered in v(p)p1\frac{v(p)}{p-1}-separated pairs, we obtain v(λ)=δ(𝔳i,Λ(j))δ(Λ(i),Λ(j))>2v(p)p1v(\lambda)=\delta(\mathfrak{v}_{i},\Lambda_{(j)})\geq\delta(\Lambda_{(i)},\Lambda_{(j)})>\frac{2v(p)}{p-1}.

So, taking into account the inequality v(ciγ)>v(p)p1v(c_{i}^{\gamma})>\frac{v(p)}{p-1}, we may compute the approximation

(32) $$\begin{aligned}\frac{a^{p}-[\gamma(1+\lambda)]^{p}}{a^{p}-[\gamma(1)]^{p}}&=\frac{a^{p}-[\gamma(1)]^{p}(1+c_{i}^{\gamma})^{p}}{a^{p}-[\gamma(1)]^{p}}=\frac{a^{p}-[\gamma(1)]^{p}-[\gamma(1)]^{p}(pc_{i}^{\gamma}+\dots+(c_{i}^{\gamma})^{p})}{a^{p}-[\gamma(1)]^{p}}\\
&=1+[\gamma(1)]^{p}([\gamma(1)]^{p}-a^{p})^{-1}(pc_{i}^{\gamma}+\dots+(c_{i}^{\gamma})^{p})\\
&=1+pc_{i}^{\gamma}[\gamma(1)]^{p}([\gamma(1)]^{p}-a^{p})^{-1}+[\mathrm{h.v.t.}].\end{aligned}$$

We now set out to show that, under the hypothesis of part (a), the approximation in (32) gives an element of K\mathbb{C}_{K} which is farthest from 11 only for γ=1\gamma=1, implying that the term in the product formula for Θai,biΓ(a)\Theta_{a_{i},b_{i}}^{\Gamma}(a) in (31) corresponding to γ=1\gamma=1 is the one which dominates; the approximation in (28) then follows directly from applying (32) to γ=1\gamma=1. For the second statement of part (a), it is necessary and sufficient to show something more: that the difference between 11 and the term in the aforementioned product formula for γGj{1}\gamma\in G_{j}\smallsetminus\{1\} has valuation exceeding that of the analogous difference for γ=1\gamma=1 by more than ν\nu if ν\nu is chosen small enough. Equivalently, we will show under the hypotheses of (a) and (b) that the inequality v(pciγ[γ(1)]p([γ(1)]pap)1)>v(pλ(1ap)1)+νv(pc_{i}^{\gamma}[\gamma(1)]^{p}([\gamma(1)]^{p}-a^{p})^{-1})>v(p\lambda(1-a^{p})^{-1})+\nu holds for all γGj{1}\gamma\in G_{j}\smallsetminus\{1\} for small enough ν\nu. With a few straightforward algebraic computations and rearrangements, for each γGj{1}\gamma\in G_{j}\smallsetminus\{1\}, this inequality can be rewritten as

(33) v(ciγ)v(λ)>n=0p1[v(γ(1)ζpna)v(γ(1))v(1ζpna)]+ν.v(c_{i}^{\gamma})-v(\lambda)>\sum_{n=0}^{p-1}[v(\gamma(1)-\zeta_{p}^{n}a)-v(\gamma(1))-v(1-\zeta_{p}^{n}a)]+\nu.

It follows from Lemma 4.5(a)(b) that {v(ciγ)v(λ)}γΓ{1}>0\{v(c_{i}^{\gamma})-v(\lambda)\}_{\gamma\in\Gamma\smallsetminus\{1\}}\subset\mathbb{Q}_{>0} has a minimum element which is positive. Choose a positive number

ν<1p+1minγΓ{1}{v(ciγ)v(λ)}.\nu<\tfrac{1}{p+1}\min_{\gamma\in\Gamma\smallsetminus\{1\}}\{v(c_{i}^{\gamma})-v(\lambda)\}.

After possibly replacing $\nu$ with a smaller positive number, we have $B(\Lambda_{(j)},\nu)\cap\hat{\Lambda}_{(l)}=\varnothing$ for any $l\neq j$. After possibly further shrinking $\nu$, we have that the closest point in $\Lambda_{(j)}$ to each other space $\hat{\Lambda}_{(l)}$ either has distance $>\nu$ from $\mathfrak{v}_{j}\in\Lambda_{(j)}$ or is itself the point $\mathfrak{v}_{j}$. We will freely assume for the rest of the proof that $\nu$ is small enough for both of these properties to hold. Let us fix arbitrary choices of $\gamma\in G_{j}\smallsetminus\{1\}$ and $n\in\{0,\dots,p-1\}$ and prove the inequality

(34) v(γ(1)ζpna)v(γ(1))v(1ζpna)ν;v(\gamma(1)-\zeta_{p}^{n}a)-v(\gamma(1))-v(1-\zeta_{p}^{n}a)\leq\nu;

this clearly implies (33) by construction of ν\nu and the fact that v(ciγ)>v(λ)v(c_{i}^{\gamma})>v(\lambda) by Lemma 4.5(b).
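The implication from (34) to (33) can be spelled out as the following chain of inequalities. This is only a sketch of the bookkeeping; the dummy index $\gamma^{\prime}$ is introduced here for the minimum and is not notation from the text.

```latex
% Sum (34) over n = 0, ..., p-1, then add \nu and use the choice of \nu:
\sum_{n=0}^{p-1}\bigl[v(\gamma(1)-\zeta_{p}^{n}a)-v(\gamma(1))-v(1-\zeta_{p}^{n}a)\bigr]+\nu
  \;\leq\; p\nu+\nu \;=\; (p+1)\nu
  \;<\; \min_{\gamma^{\prime}\in\Gamma\smallsetminus\{1\}}\{v(c_{i}^{\gamma^{\prime}})-v(\lambda)\}
  \;\leq\; v(c_{i}^{\gamma})-v(\lambda).
```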

As the subspace Λ(i)[𝔳i,𝔳j]B({𝔳j},ν)K1,an\Lambda_{(i)}\cup[\mathfrak{v}_{i},\mathfrak{v}_{j}]\cup B(\{\mathfrak{v}_{j}\},\nu)\subset\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} is simply connected and intersects Λ^(j)\hat{\Lambda}_{(j)}, it follows from Lemma 4.9 that we have η(n)aΛ(i)[𝔳i,𝔳j]B({𝔳j},ν)\eta_{(n)}^{a}\in\Lambda_{(i)}\cup[\mathfrak{v}_{i},\mathfrak{v}_{j}]\cup B(\{\mathfrak{v}_{j}\},\nu) for some index nn (as is our hypothesis for part (a)) if and only if it holds for all indices nn. Assume first that we have Λγ(1),ζpnaΛ(j)\Lambda_{\gamma(1),\zeta_{p}^{n}a}\cap\Lambda_{(j)}\neq\varnothing. Then using 3.2(c), we get

(35) v(γ(1)ζpna)v(γ(1))v(γ(1)ζpna)min{v(γ(1)),v(ζpna)}=δ(Λγ(1),ζpna,Λ(j))=0.v(\gamma(1)-\zeta_{p}^{n}a)-v(\gamma(1))\leq v(\gamma(1)-\zeta_{p}^{n}a)-\min\{v(\gamma(1)),v(\zeta_{p}^{n}a)\}=\delta(\Lambda_{\gamma(1),\zeta_{p}^{n}a},\Lambda_{(j)})=0.

Meanwhile, using 3.2(a), one sees that our hypotheses imply that νv(a)ν-\nu\leq v(a)\leq\nu, so we have

(36) n=0p1v(1ζpna)pν>v(λ)v(ciγ)+ν.\sum_{n=0}^{p-1}v(1-\zeta_{p}^{n}a)\geq-p\nu>v(\lambda)-v(c_{i}^{\gamma})+\nu.

Putting (35) and (36) together gives us the desired inequality (34).

Let us now assume instead that the axes Λγ(1),ζpna\Lambda_{\gamma(1),\zeta_{p}^{n}a} and Λ(j)\Lambda_{(j)} are disjoint. As in the proof of Lemma 4.5, we have γ(Λ(i))=Λγ(1),γ(1+λ)\gamma(\Lambda_{(i)})=\Lambda_{\gamma(1),\gamma(1+\lambda)}; in particular, we have ηγ(1)ηγ(1+λ)=γ(v)\eta_{\gamma(1)}\vee\eta_{\gamma(1+\lambda)}=\gamma(v) for some vΛ(i)ΣSv\in\Lambda_{(i)}\subset\Sigma_{S}. Then using Lemma 4.1, we get that the closest point ξ\xi in ΣS\Sigma_{S} to ηγ(1)ηγ(1+λ)\eta_{\gamma(1)}\vee\eta_{\gamma(1+\lambda)}(and thus to Λγ(1),γ(1+λ)\Lambda_{\gamma(1),\gamma(1+\lambda)}) lies in Λ^(it)\hat{\Lambda}_{(i_{t})} (with iti_{t} defined as in that lemma) and that we have ξηγ(1)ηγ(1+λ)\xi\neq\eta_{\gamma(1)}\vee\eta_{\gamma(1+\lambda)} (and thus ξΛγ(1),γ(1+λ)\xi\notin\Lambda_{\gamma(1),\gamma(1+\lambda)}). In particular, the point ξ\xi is also the closest point in ΣS\Sigma_{S} to γ(1)\gamma(1). From the fact that itji_{t}\neq j and from hypothesis (iii), we have ξΛ(i)[𝔳i,𝔳j]B({𝔳j},ν)\xi\notin\Lambda_{(i)}\cup[\mathfrak{v}_{i},\mathfrak{v}_{j}]\cup B(\{\mathfrak{v}_{j}\},\nu). Thus, our hypothesis on the point η(n)a\eta_{(n)}^{a} shows that η(n)aξ\eta_{(n)}^{a}\neq\xi and that in fact, the point ξ\xi must be farther from Λ(j)\Lambda_{(j)} than η(n)a\eta_{(n)}^{a} is. It follows from all of this that the closest point in Λγ(1),ζpna\Lambda_{\gamma(1),\zeta_{p}^{n}a} to Λ(j)\Lambda_{(j)} is η(n)a\eta_{(n)}^{a}. We may now apply 3.2(a)(c) to get

(37) v(γ(1)ζpna)v(γ(1))=δ(Λγ(1),ζpna,Λ(j))=δ(η(n)a,Λ(j)).v(\gamma(1)-\zeta_{p}^{n}a)-v(\gamma(1))=\delta(\Lambda_{\gamma(1),\zeta_{p}^{n}a},\Lambda_{(j)})=\delta(\eta_{(n)}^{a},\Lambda_{(j)}).

From what we have already observed, the points η(n)a\eta_{(n)}^{a} and γ(𝔳i)\gamma(\mathfrak{v}_{i}) share the same closest point ξ\xi^{\prime} in Λ(j)\Lambda_{(j)}; we deduce using Lemma 4.1 that ξ\xi^{\prime} is also the closest point in Λ(j)\Lambda_{(j)} to Λ^(it)\hat{\Lambda}_{(i_{t})}. Let us assume for the moment that η(n)aB(Λ(j),ν)\eta_{(n)}^{a}\in B(\Lambda_{(j)},\nu) for 0np10\leq n\leq p-1. We then have δ(ξ,𝔳j)δ(η(n)a,𝔳j)ν\delta(\xi^{\prime},\mathfrak{v}_{j})\leq\delta(\eta_{(n)}^{a},\mathfrak{v}_{j})\leq\nu. Now if ξ𝔳j\xi^{\prime}\neq\mathfrak{v}_{j}, we have (by assumption on ν\nu) the contradicting inequality δ(ξ,𝔳j)>ν\delta(\xi^{\prime},\mathfrak{v}_{j})>\nu. We therefore have ξ=𝔳j\xi^{\prime}=\mathfrak{v}_{j}. Then by 3.2(a), we have v(a)=v(ζpna)=0v(a)=v(\zeta_{p}^{n}a)=0, and so v(1ζpna)0v(1-\zeta_{p}^{n}a)\geq 0. Meanwhile, we get v(γ(1)ζpna)v(γ(1))νv(\gamma(1)-\zeta_{p}^{n}a)-v(\gamma(1))\leq\nu from (37), and by construction, we have ν<1p+1(v(ciγ)v(λ))\nu<\frac{1}{p+1}(v(c_{i}^{\gamma})-v(\lambda)). Putting these inequalities together, we see that the desired inequality in (34) holds; we have thus proved part (a) in the case that η(n)aB(Λ(j),ν)\eta_{(n)}^{a}\in B(\Lambda_{(j)},\nu) for 0np10\leq n\leq p-1.

Now, as it does not affect the conclusions of the theorem to replace $a$ with $\zeta_{p}^{n}a$ for any exponent $n$, let us assume that $\delta(\eta_{(0)}^{a},\Lambda_{(j)})\geq\delta(\eta_{(n)}^{a},\Lambda_{(j)})$ for all $n$. By Lemma 4.9, we either have $\eta_{(0)}^{a}=\dots=\eta_{(p-1)}^{a}\in\hat{\Lambda}_{(j)}$ or we have $\eta_{(0)}^{a}\notin\hat{\Lambda}_{(j)}$ and $\eta_{(1)}^{a}=\dots=\eta_{(p-1)}^{a}\in[\eta_{(0)}^{a},\mathfrak{v}_{j}]\cap\hat{\Lambda}_{(j)}$. Now suppose that we have $\eta_{(n)}^{a}\in[\mathfrak{v}_{i},\mathfrak{v}_{j}]$ for some (and thus for all) $n\in\{0,\dots,p-1\}$. Then for each $n$, the closest point in $\Lambda_{1,\zeta_{p}^{n}a}$ to $\Lambda_{(j)}$ is $\eta_{(n)}^{a}$. We may now apply 3.2(a)(c) to get

(38) v(1ζpna)=v(1ζpna)v(1)=δ(Λ1,ζpna,Λ(j))=δ(η(n)a,Λ(j)).v(1-\zeta_{p}^{n}a)=v(1-\zeta_{p}^{n}a)-v(1)=\delta(\Lambda_{1,\zeta_{p}^{n}a},\Lambda_{(j)})=\delta(\eta_{(n)}^{a},\Lambda_{(j)}).

Combining this with (37) gives us the equality v(1ζpna)=v(γ(1)ζpna)v(γ(1))v(1-\zeta_{p}^{n}a)=v(\gamma(1)-\zeta_{p}^{n}a)-v(\gamma(1)) for 0np10\leq n\leq p-1. This again directly implies the desired inequality in (34) under the assumption that η(n)a[𝔳i,𝔳j]\eta_{(n)}^{a}\in[\mathfrak{v}_{i},\mathfrak{v}_{j}] for some (all) nn. Part (a) is therefore proved.

Now assume the hypothesis of part (b). Retaining the assumption made above that we have δ(η(0)a,Λ(j))δ(η(n)a,Λ(j))\delta(\eta_{(0)}^{a},\Lambda_{(j)})\geq\delta(\eta_{(n)}^{a},\Lambda_{(j)}) for all nn, as before, Lemma 4.9 tells us that η(1)a,,η(p1)aΛ^(j)\eta_{(1)}^{a},\dots,\eta_{(p-1)}^{a}\in\hat{\Lambda}_{(j)} and so we must have η(0)a(Λ(i)[𝔳i,𝔳j])Λ^(j)\eta_{(0)}^{a}\in(\Lambda_{(i)}\cup[\mathfrak{v}_{i},\mathfrak{v}_{j}])\smallsetminus\hat{\Lambda}_{(j)}. If η(0)a[𝔳i,𝔳j]Λ^(j)\eta_{(0)}^{a}\in[\mathfrak{v}_{i},\mathfrak{v}_{j}]\smallsetminus\hat{\Lambda}_{(j)}, then we clearly have η(0)a=η1ηa\eta_{(0)}^{a}=\eta_{1}\vee\eta_{a}, whereas if η(0)aΛ(i)\eta_{(0)}^{a}\in\Lambda_{(i)}, then it is easy to see that η(0)aΛ(i)Λa,1\eta_{(0)}^{a}\in\Lambda_{(i)}\cap\Lambda_{a,1}\neq\varnothing and η1ηaΛ(i)\eta_{1}\vee\eta_{a}\in\Lambda_{(i)}. In either case, we get η1ηaΛ^(j)\eta_{1}\vee\eta_{a}\notin\hat{\Lambda}_{(j)}. Applying 3.2(c), we obtain

(39) v(a1)=v(a1)v(1)=δ(η(0)a,Λ(j))>v(p)p1.v(a-1)=v(a-1)-v(1)=\delta(\eta_{(0)}^{a},\Lambda_{(j)})>\frac{v(p)}{p-1}.

Now, keeping in mind the above inequality, we compute

(40) p1(1ap)=p1(1(1+(a1))p)=p1(p(a1)+[h.v.t.])=(1a)+[h.v.t.].p^{-1}(1-a^{p})=p^{-1}(1-(1+(a-1))^{p})=p^{-1}(-p(a-1)+[\mathrm{h.v.t.}])=(1-a)+[\mathrm{h.v.t.}].

Such an approximation is preserved under taking reciprocals, and so the approximation given in part (a) implies the one claimed by part (b). ∎
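To make the higher-valuation terms in (40) explicit, one can compare the valuations of the terms of the binomial expansion. The sketch below uses the placeholder $u:=a-1$ (not notation from the text) and assumes only the inequality $v(u)>\frac{v(p)}{p-1}\geq 0$ from (39), together with the fact that $p$ divides $\binom{p}{k}$ for $1\leq k\leq p-1$.

```latex
% Expansion with u := a - 1 (placeholder), v(u) > v(p)/(p-1) >= 0:
1-(1+u)^{p} \;=\; -pu \;-\; \sum_{k=2}^{p-1}\binom{p}{k}u^{k} \;-\; u^{p}.
% Middle terms (2 <= k <= p-1): p divides the binomial coefficient and v(u) > 0, so
v\Bigl(\tbinom{p}{k}u^{k}\Bigr)\;\geq\; v(p)+k\,v(u)\;>\;v(p)+v(u)=v(pu).
% Last term: it is dominated exactly when v(u) exceeds the threshold,
v(u^{p})=p\,v(u)>v(p)+v(u)
  \iff (p-1)\,v(u)>v(p)
  \iff v(u)>\tfrac{v(p)}{p-1}.
% Hence 1-a^{p} = -p(a-1) + [h.v.t.], which is (40) after dividing by p.
```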

Remark 4.11.

In the case where the residue characteristic of $K$ is different from $p$, one can show, using a variant of the arguments in the above proof, that the approximation in (28) holds under the alternate hypothesis that for $1\leq n\leq p-1$, we have $[\mathfrak{v}_{i},\eta_{(n)}^{a}]\cap\Lambda_{(l)}=\varnothing$ for all indices $l\neq i,j$. Indeed, as in the above proof, we may assume that $\eta_{(1)}^{a}=\dots=\eta_{(p-1)}^{a}\in\hat{\Lambda}_{(j)}=\Lambda_{(j)}$, which via the equalities in (37) and (38) implies that $v(\gamma(1)-\zeta_{p}^{n}a)-v(\gamma(1))=v(1-\zeta_{p}^{n}a)=0$ for $1\leq n\leq p-1$. Now the inequality in (33) that is needed for the conclusion of part (a) simplifies to

(41) v(ciγ)v(λ)>v(γ(1)a)v(γ(1))v(1a)v(c_{i}^{\gamma})-v(\lambda)>v(\gamma(1)-a)-v(\gamma(1))-v(1-a)

for γGj{1}\gamma\in G_{j}\smallsetminus\{1\}, which can be shown to hold when η(0)a[𝔳i,𝔳j]\eta_{(0)}^{a}\notin[\mathfrak{v}_{i},\mathfrak{v}_{j}] (the case when η(0)a[𝔳i,𝔳j]\eta_{(0)}^{a}\in[\mathfrak{v}_{i},\mathfrak{v}_{j}] is already covered by 4.10(a)) through arguments of a similar flavor, outlined as follows.

As we have v(1a)=0v(1-a)=0, after applying 3.2(c), the crucial inequality to verify is

(42) δ(Λγ(1),a,Λ(j))<v(ciγ)v(λ).\delta(\Lambda_{\gamma(1),a},\Lambda_{(j)})<v(c_{i}^{\gamma})-v(\lambda).

If the path [ηγ(1),η(0)a][\eta_{\gamma(1)},\eta_{(0)}^{a}] intersects Λ(j)\Lambda_{(j)}, then we have δ(Λγ(1),a,Λ(j))=0\delta(\Lambda_{\gamma(1),a},\Lambda_{(j)})=0, and (42) holds as v(ciγ)v(λ)v(c_{i}^{\gamma})-v(\lambda) is positive thanks to Lemma 4.5(b). If, on the other hand, we have [ηγ(1),η(0)a]Λ(j)=[\eta_{\gamma(1)},\eta_{(0)}^{a}]\cap\Lambda_{(j)}=\varnothing, one verifies using Lemma 4.1 that the path [𝔳i,Λ(it)][\mathfrak{v}_{i},\Lambda_{(i_{t})}] intersects Λ(j)\Lambda_{(j)}, so that δ(𝔳i,Λ(it))>δ(𝔳i,Λ(j))=v(λ)\delta(\mathfrak{v}_{i},\Lambda_{(i_{t})})>\delta(\mathfrak{v}_{i},\Lambda_{(j)})=v(\lambda). Then, using (22), we get δ(Λ(it),Λ(j))<v(ciγ)v(λ)\delta(\Lambda_{(i_{t})},\Lambda_{(j)})<v(c_{i}^{\gamma})-v(\lambda). Meanwhile, the fact that η(0)a[ηγ(1),Λ(j)]\eta_{(0)}^{a}\in[\eta_{\gamma(1)},\Lambda_{(j)}] implies δ(Λγ(1),a,Λ(j))δ(Λ(it),Λ(j))\delta(\Lambda_{\gamma(1),a},\Lambda_{(j)})\leq\delta(\Lambda_{(i_{t})},\Lambda_{(j)}), so we get the desired inequality (42).

4.4. The image of one segment of the convex hull under the theta function

Our next step is to use 4.10 to compute the values on a particular subspace of ΣS\Sigma_{S} of the function (Θai,biΓ)(\Theta_{a_{i},b_{i}}^{\Gamma})_{*} (the existence of which was established in §4.2) induced by the theta function Θai,biΓ\Theta_{a_{i},b_{i}}^{\Gamma}.

We retain all of the above notation and define, for any subspace ΛK1,an\Lambda\subset\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} and real number ν>0\nu>0, the subspace B({𝔳j},ν)K1,anB(\{\mathfrak{v}_{j}\},\nu)^{-}\subset\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} to be the subspace of the neighborhood B({𝔳j},ν)B(\{\mathfrak{v}_{j}\},\nu) consisting of the points of Type II or III corresponding to discs in K\mathbb{C}_{K} whose elements all have valuation 0\leq 0. We remark that, due to 3.2(a), an alternate (equivalent) definition of the space B({𝔳j},ν)B(\{\mathfrak{v}_{j}\},\nu)^{-} is that it consists of those points in B({𝔳j},ν)B(\{\mathfrak{v}_{j}\},\nu) whose closest point in Λ0,\Lambda_{0,\infty} lies in the path [𝔳j,η][\mathfrak{v}_{j},\eta_{\infty}].

The main result of this subsection is as follows.

Proposition 4.12.

Suppose that SK1S\subset\mathbb{P}_{K}^{1} is an optimal subset satisfying hypotheses (i)-(iii) of 4.10. As in the statement of that theorem, we write 𝔳j=η0(0)\mathfrak{v}_{j}=\eta_{0}(0), and we write Θ\Theta for the theta function Θai,biΓ\Theta_{a_{i},b_{i}}^{\Gamma}.

  1. (a)

    We have v(Θ(0)1)=v(p)+v(λ)v(\Theta(0)-1)=v(p)+v(\lambda).

  2. (b)

    For sufficiently small $\nu>0$, the induced map $\Theta_{*}$ established in §4.2 on $\Sigma_{S}$, when restricted to the subspace $[\eta_{1},\eta_{1}(-\nu)]\subset\Sigma_{S}$, is one-to-one and bicontinuous. Its values on $[\eta_{1},\eta_{1}(-\nu)]$ are given by

    (43) $$\Theta_{*}(\eta_{1}(d))=\begin{cases}\eta_{1}(v(p)+v(\lambda)-pd)&\mathrm{for}\ -\nu\leq d\leq\frac{v(p)}{p-1}\\ \eta_{1}(v(\lambda)-d)&\mathrm{for}\ d\geq\frac{v(p)}{p-1}\end{cases}.$$
  3. (c)

    For sufficiently small ν>0\nu>0, for each ηB({𝔳j},ν)ΣS([η1,𝔳j]Λ(j))\eta\in B(\{\mathfrak{v}_{j}\},\nu)^{-}\cap\Sigma_{S}\smallsetminus([\eta_{1},\mathfrak{v}_{j}]\cup\Lambda_{(j)}), the point Θ(η)\Theta_{*}(\eta) corresponds to a disc which does not contain the element 11 but is centered at an element cc satisfying v(c1)=v(p)+v(λ)v(c-1)=v(p)+v(\lambda).

Consequently, letting

U=[η1,𝔳j](B({𝔳j},ν)ΣS),U^{-}=[\eta_{1},\mathfrak{v}_{j}]\cup(B(\{\mathfrak{v}_{j}\},\nu)^{-}\cap\Sigma_{S}),

we have

(44) Θ([η1,𝔳j])Θ(U[η1,𝔳j])=.\Theta_{*}([\eta_{1},\mathfrak{v}_{j}])\cap\Theta_{*}(U^{-}\smallsetminus[\eta_{1},\mathfrak{v}_{j}])=\varnothing.

In order to prove the above proposition, we first need an elementary result regarding annuli.

Lemma 4.13.

Let A=Aa(r,s)A=A_{a}(r,s) be an (open) annulus, and suppose that there is a point aKa^{\prime}\in\mathbb{C}_{K} and rational numbers r<sr^{\prime}<s^{\prime} such that

  1. (i)

    for every zAz\in A, we have r<v(za)<sr^{\prime}<v(z-a^{\prime})<s^{\prime} and

  2. (ii)

    conversely, for every ρ\rho such that r<ρ<sr^{\prime}<\rho<s^{\prime}, there exists zAz\in A such that v(za)=ρv(z-a^{\prime})=\rho.

Then we have r=rr=r^{\prime} and s=ss=s^{\prime}, and we have v(aa)sv(a^{\prime}-a)\geq s^{\prime}, so that A=Aa(r,s)A=A_{a^{\prime}}(r^{\prime},s^{\prime}).

Proof.

Given any points z,wAz,w\in A such that v(za)v(wa)v(z-a)\neq v(w-a), by the non-archimedean property, we have v(zw)=min{v(za),v(wa)}v(z-w)=\min\{v(z-a),v(w-a)\}. This easily implies

(45) infz,wAv(zw)=r.\inf_{z,w\in A}v(z-w)=r.

By a similar argument, the hypotheses (i) and (ii) of the statement imply that the same infimum in (45) equals $r^{\prime}$, so we get $r=r^{\prime}$. Now the fact that $a^{\prime}\notin A$ by hypothesis (i) implies that we have either $v(a^{\prime}-a)\geq s$ or $v(a^{\prime}-a)\leq r$. If $v(a^{\prime}-a)\leq r$, then applying the non-archimedean property and using the fact that $v(z-a^{\prime})>r^{\prime}=r$ for each $z\in A$ gives us $v(z-a)=\min\{v(z-a^{\prime}),v(a^{\prime}-a)\}=v(a^{\prime}-a)\leq r$, which contradicts the construction of $A$. We therefore have $v(a^{\prime}-a)\geq s$. It follows that $A=A_{a^{\prime}}(r,s)$; in other words, $a^{\prime}$ is also a center of the open annulus $A$. Now it is immediate from hypotheses (i) and (ii) that $s^{\prime}=s$. ∎

Proof (of 4.12).

Part (a) immediately follows from the estimation

(46) Θ(0)=1+pλ+[h.v.t.]\Theta(0)=1+p\lambda+[\mathrm{h.v.t.}]

obtained by putting $a=0$ into the approximation given by Theorem 4.10(a).

Choose ν>0\nu>0 small enough that the conclusion of 4.10(a) is satisfied. Fix a real number dd with νd<v(p)p1-\nu\leq d<\frac{v(p)}{p-1}. For sufficiently small ε>0\varepsilon>0, there is no vertex of ΣS\Sigma_{S} in the interior of the path [η1(d),η1(d+ε)]ΣS[\eta_{1}(d),\eta_{1}(d+\varepsilon)]\subset\Sigma_{S}. Then for any ρ\rho with d<ρ<d+εd<\rho<d+\varepsilon and any aKa\in\mathbb{C}_{K} with v(a1)=ρv(a-1)=\rho, by 4.7, we have aΩa\in\Omega. We observe that

(47) 1ap=1((a1)+1)p=(1a)p+[h.v.t.].1-a^{p}=1-((a-1)+1)^{p}=(1-a)^{p}+[\mathrm{h.v.t.}].

By 4.7, we may apply 4.10(a) to get

(48) v(Θ(a)1)=v(p)+v(λ)pρ.v(\Theta(a)-1)=v(p)+v(\lambda)-p\rho.

Now [2, Theorem 7.12], with the help of 4.6, says that, after possibly shrinking ε>0\varepsilon>0, the image Θ(A1(d,d+ε))\Theta(A_{1}(d,d+\varepsilon)) coincides with an open annulus Aa(r,s)A_{a}(r,s) (for some point aKa\in\mathbb{C}_{K} and rational numbers r<sr<s). Meanwhile, one deduces immediately from (48) that for every zΘ(A1(d,d+ε))z\in\Theta(A_{1}(d,d+\varepsilon)) we have

(49) v(p)+v(λ)p(d+ε)<v(z1)<v(p)+v(λ)pd.v(p)+v(\lambda)-p(d+\varepsilon)<v(z-1)<v(p)+v(\lambda)-pd.

Now applying Lemma 4.13, we get

(50) Θ(A1(d,d+ε))=A1(v(p)+v(λ)p(d+ε),v(p)+v(λ)pd).\Theta(A_{1}(d,d+\varepsilon))=A_{1}(v(p)+v(\lambda)-p(d+\varepsilon),v(p)+v(\lambda)-pd).

Now the formula claimed in (43) in the case that $-\nu\leq d<\frac{v(p)}{p-1}$ follows from applying the last statement of [2, Theorem 7.12].

Now fix a real number d>v(p)p1d>\frac{v(p)}{p-1}. For sufficiently small ε>0\varepsilon>0, there is again no vertex of ΣS\Sigma_{S} in the interior of the path [η1(dε),η1(d)]ΣS[\eta_{1}(d-\varepsilon),\eta_{1}(d)]\subset\Sigma_{S}. Then for any ρ\rho with dε<ρ<dd-\varepsilon<\rho<d and any aKa\in\mathbb{C}_{K} with v(a1)=ρv(a-1)=\rho, again we have aΩa\in\Omega by 4.7. It is then straightforward to deduce from the approximation provided by 4.10(b) that we have

(51) v(Θ(a)1)=v(λ)ρ.v(\Theta(a)-1)=v(\lambda)-\rho.

This time, [2, Theorem 7.12, Remark 7.14], with the help of 4.6, says that for small enough ε>0\varepsilon>0, the image Θ(A1(dε,d))\Theta(A_{1}(d-\varepsilon,d)) coincides with an open annulus Aa(r,s)A_{a}(r,s) (for some point aKa\in\mathbb{C}_{K} and rational numbers r<sr<s). Now the same argument as above involving Lemma 4.13 shows that we get

(52) Θ(A1(dε,d))=A1(v(λ)d,v(λ)d+ε).\Theta(A_{1}(d-\varepsilon,d))=A_{1}(v(\lambda)-d,v(\lambda)-d+\varepsilon).

Now the formula claimed in (43) in the case that dv(p)p1d\geq\frac{v(p)}{p-1} follows from applying the last statement of [2, Theorem 7.12] (again with the help of [2, Remark 7.14]).

We have thus proved the formulas in (43), and it is immediate from these formulas that the claims of being one-to-one and bicontinuous hold, so part (b) is proved.
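As a consistency check on (43), one can verify that the two branches agree at the threshold $d=\frac{v(p)}{p-1}$, and that this threshold is exactly where the dominant term of $1-a^{p}$ changes between (47) and (40). The computation below is a sketch using only the formulas already stated, with $\rho=v(a-1)$ as in the proof.

```latex
% Agreement of the two branches of (43) at d = v(p)/(p-1):
v(p)+v(\lambda)-p\cdot\frac{v(p)}{p-1}
  \;=\; v(\lambda)+\frac{(p-1)-p}{p-1}\,v(p)
  \;=\; v(\lambda)-\frac{v(p)}{p-1}.
% Dominant term of 1 - a^p when v(a-1) = \rho:
v\bigl((1-a)^{p}\bigr)=p\rho \;<\; v(p)+\rho=v\bigl(p(1-a)\bigr)
  \iff \rho<\frac{v(p)}{p-1},
% so (1-a)^p dominates below the threshold, as in (47),
% and p(1-a) dominates above it, as in (40).
```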

To prove part (c), again choose ν\nu to be small enough that the conclusion of 4.10(a) is satisfied; we may assume that the only distinguished vertex in the neighborhood B({𝔳j},ν)B(\{\mathfrak{v}_{j}\},\nu) is 𝔳j\mathfrak{v}_{j}. Choose a point ηB({𝔳j},ν)ΣS([η1,𝔳j]Λ(j))\eta\in B(\{\mathfrak{v}_{j}\},\nu)^{-}\cap\Sigma_{S}\smallsetminus([\eta_{1},\mathfrak{v}_{j}]\cup\Lambda_{(j)}), and let ξ\xi be the closest point in Λ(j)\Lambda_{(j)} to η\eta. By definition, the point ξ\xi is a distinguished vertex. As the path [η,ξ][ξ,𝔳j]=[η,𝔳j][\eta,\xi]\cup[\xi,\mathfrak{v}_{j}]=[\eta,\mathfrak{v}_{j}] is contained in the neighborhood B({𝔳j},ν)B(\{\mathfrak{v}_{j}\},\nu), we must have ξ=𝔳j\xi=\mathfrak{v}_{j} and δ(η,𝔳j)ν\delta(\eta,\mathfrak{v}_{j})\leq\nu.

From the structure of $\Sigma_{S}$ and our hypotheses on $\eta$, there is some element $a_{l}\in S$ for $l\neq i,j$ such that $\eta\in[\mathfrak{v}_{j},\eta_{a_{l}}]$, so that $\eta=\eta_{a_{l}}(d)$ for some (logarithmic) radius $d\in\mathbb{R}$. By Proposition 4.7, there is an element $z_{0}\in\Omega$ such that $v(z_{0}-a_{l})=d$ and the closest point in $\Sigma_{S}$ to $\eta_{z_{0}}$ is $\eta$. We apply 3.2(a) to get $v(z_{0})=0$ and 3.2(c) to get $v(z_{0}-1)=v(z_{0}-1)-v(1)=0$ since $\Lambda_{1,z_{0}}\cap\Lambda_{(j)}=\{\mathfrak{v}_{j}\}\neq\varnothing$. Meanwhile, as $\eta=\eta_{z_{0}}(d)\neq\eta_{0}(d)\in\Lambda_{(j)}$, we have $\nu\geq d>0$.

Let us now show that we have v(z0p1)=0v(z_{0}^{p}-1)=0 as well. If pp is the residue characteristic of KK, then this certainly follows immediately from the fact that v(z01)=0v(z_{0}-1)=0, since the reduction of z01z_{0}-1 in the residue field is then a unit, while the reductions of (z01)p(z_{0}-1)^{p} and of z0p1z_{0}^{p}-1 are equal. We therefore suppose for the moment that pp is not the residue characteristic of KK. We already have v(z01)=0v(z_{0}-1)=0 and v(z0ζpn)0v(z_{0}-\zeta_{p}^{n})\geq 0 for 1np11\leq n\leq p-1 and only need to show that equality holds to get v(z0p1)=n=0p1v(z0ζpn)=0v(z_{0}^{p}-1)=\sum_{n=0}^{p-1}v(z_{0}-\zeta_{p}^{n})=0. Lemma 4.9 says we have (using the notation of that lemma) η(n)1Λ^(j)=Λ(j)\eta_{(n)}^{1}\notin\hat{\Lambda}_{(j)}=\Lambda_{(j)} for at most one nn. As the path [η1,𝔳j][\eta_{1},\mathfrak{v}_{j}] passes through η1η1+λΣS\eta_{1}\vee\eta_{1+\lambda}\in\Sigma_{S}, it is clear that η(0)1Λ(j)\eta_{(0)}^{1}\notin\Lambda_{(j)}, so we have η(1)1==η(p1)1Λ(j)\eta_{(1)}^{1}=\dots=\eta_{(p-1)}^{1}\in\Lambda_{(j)} and, by 3.2(a), even η(1)1==η(p1)1=𝔳j\eta_{(1)}^{1}=\dots=\eta_{(p-1)}^{1}=\mathfrak{v}_{j}. If we have v(z0ζpn)=:d>0v(z_{0}-\zeta_{p}^{n})=:d^{\prime}>0 for some nn, then, applying 3.2(c), we have ηz0(d)=ηz0ηζpnΛ(j)\eta_{z_{0}}(d^{\prime})=\eta_{z_{0}}\vee\eta_{\zeta_{p}^{n}}\notin\Lambda_{(j)}. From ηz0(d)ΣSΛ(j)\eta_{z_{0}}(d)\in\Sigma_{S}\smallsetminus\Lambda_{(j)}, we get ηz0(min{d,d})ΣSΛ(j)\eta_{z_{0}}(\min\{d,d^{\prime}\})\in\Sigma_{S}\smallsetminus\Lambda_{(j)}. But as min{d,d}>0\min\{d,d^{\prime}\}>0, this point ηz0(min{d,d})\eta_{z_{0}}(\min\{d,d^{\prime}\}) lies in the interior of the path [ηζpn,𝔳j=η(n)1][\eta_{\zeta_{p}^{n}},\mathfrak{v}_{j}=\eta_{(n)}^{1}], a contradiction. Therefore, we have v(z0ζpn)=0v(z_{0}-\zeta_{p}^{n})=0, as desired.

Given any ρ\rho such that d>ρ>0d>\rho>0, the point ηz0(ρ)\eta_{z_{0}}(\rho) lies in the interior of the path [ηz0(d),𝔳j]B({𝔳j},ν)[\eta_{z_{0}}(d),\mathfrak{v}_{j}]\subset B(\{\mathfrak{v}_{j}\},\nu) and so is not a distinguished vertex. Then by 4.7, there is an element aΩa\in\Omega such that v(az0)=ρv(a-z_{0})=\rho and the closest point in ΣS\Sigma_{S} to ηa\eta_{a} is ηz0(ρ)\eta_{z_{0}}(\rho). Note that from v(z0)=0v(z_{0})=0 we get v(a)=0v(a)=0 and so v(aζpnz0)0v(a-\zeta_{p}^{n}z_{0})\geq 0 for 0np10\leq n\leq p-1. We thus see that v(apz0p)=v(az0)+n=1p1v(aζpnz0)ρv(a^{p}-z_{0}^{p})=v(a-z_{0})+\sum_{n=1}^{p-1}v(a-\zeta_{p}^{n}z_{0})\geq\rho. By the exact same argument as was used to show that v(z0p1)=0v(z_{0}^{p}-1)=0, we have v(ap1)=0v(a^{p}-1)=0.

We may now apply 4.10 to both inputs z0z_{0} and aa to get

(53) Θ(z0)=1+pλ(1z0p)1+[h.v.t.],Θ(a)=1+pλ(1ap)1+[h.v.t.],\Theta(z_{0})=1+p\lambda(1-z_{0}^{p})^{-1}+[\mathrm{h.v.t.}],\ \ \ \Theta(a)=1+p\lambda(1-a^{p})^{-1}+[\mathrm{h.v.t.}],

where in both cases the higher-valuation terms have valuation greater than v(p)+v(λ)0+ν>v(p)+v(λ)+ρv(p)+v(\lambda)-0+\nu>v(p)+v(\lambda)+\rho. Now we get the estimation

(54) Θ(a)Θ(z0)=pλ[(1ap)1(1z0p)1]+[h.v.t.],\Theta(a)-\Theta(z_{0})=p\lambda[(1-a^{p})^{-1}-(1-z_{0}^{p})^{-1}]+[\mathrm{h.v.t.}],

where the higher-valuation terms again have valuation greater than v(p)+v(λ)+ρv(p)+v(\lambda)+\rho. Using the fact that v(apz0p)>v(1z0p)=0v(a^{p}-z_{0}^{p})>v(1-z_{0}^{p})=0, we compute that the difference (1ap)1(1z0p)1=([1z0p][apz0p])1(1z0p)1(1-a^{p})^{-1}-(1-z_{0}^{p})^{-1}=([1-z_{0}^{p}]-[a^{p}-z_{0}^{p}])^{-1}-(1-z_{0}^{p})^{-1} has valuation v(apz0p)ρv(a^{p}-z_{0}^{p})\geq\rho, so we get

(55) v(Θ(a)Θ(z0))v(p)+v(λ)+ρ.v(\Theta(a)-\Theta(z_{0}))\geq v(p)+v(\lambda)+\rho.
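The valuation computation leading from (54) to (55) can be unpacked as follows. This is only a restatement of the step above, using the facts $v(1-a^{p})=v(1-z_{0}^{p})=0$ and $v(a^{p}-z_{0}^{p})\geq\rho$ already established; no new notation is introduced.

```latex
\begin{align*}
(1-a^{p})^{-1}-(1-z_{0}^{p})^{-1}
  &= \frac{(1-z_{0}^{p})-(1-a^{p})}{(1-a^{p})(1-z_{0}^{p})}
   = \frac{a^{p}-z_{0}^{p}}{(1-a^{p})(1-z_{0}^{p})}, \\
v\bigl((1-a^{p})^{-1}-(1-z_{0}^{p})^{-1}\bigr)
  &= v(a^{p}-z_{0}^{p}) - v(1-a^{p}) - v(1-z_{0}^{p})
   = v(a^{p}-z_{0}^{p}) \geq \rho, \\
v\bigl(p\lambda[(1-a^{p})^{-1}-(1-z_{0}^{p})^{-1}]\bigr)
  &= v(p)+v(\lambda)+v(a^{p}-z_{0}^{p}) \geq v(p)+v(\lambda)+\rho .
\end{align*}
```

Since the higher-valuation terms in (54) have valuation greater than $v(p)+v(\lambda)+\rho$, the bound in (55) follows.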

As before, we may apply [2, Theorem 7.12, Remark 7.14] and Lemma 4.13; this time we find that

(56) Θ(Az0(dε,d))=AΘ(z0)(r,s)\Theta(A_{z_{0}}(d-\varepsilon,d))=A_{\Theta(z_{0})}(r,s)

for d>ε>0d>\varepsilon>0 and with s>r>v(p)+v(λ)+dε>v(p)+v(λ)s>r>v(p)+v(\lambda)+d-\varepsilon>v(p)+v(\lambda). It follows (again using [2, Theorem 7.12, Remark 7.14]) that the output Θ(η)\Theta_{*}(\eta) is ηΘ(z0)(d)\eta_{\Theta(z_{0})}(d^{\prime}) for some d>v(p)+v(λ)d^{\prime}>v(p)+v(\lambda). At the same time, the approximation of c:=Θ(z0)c:=\Theta(z_{0}) given in (53), combined with the fact that v(1z0p)=0v(1-z_{0}^{p})=0, tells us that v(Θ(z0)1)=v(p)+v(λ)v(\Theta(z_{0})-1)=v(p)+v(\lambda). Therefore the disc corresponding to Θ(η)\Theta_{*}(\eta) cannot contain 11. Thus, we have proved part (c).

Now, by inspecting the formulas in (43) given by part (b) for $d\geq 0$, which give output points corresponding to discs centered at $1\in\mathbb{C}_{K}$ whose logarithmic radii are $\leq v(p)+v(\lambda)$, we conclude from the $-\nu\leq d\leq\frac{v(p)}{p-1}$ case of part (b) and from part (c) that we have

(57) Θ(η)Θ([η1,𝔳j]) for all ηU[η1,𝔳j].\Theta_{*}(\eta)\notin\Theta_{*}([\eta_{1},\mathfrak{v}_{j}])\text{ for all }\eta\in U^{-}\smallsetminus[\eta_{1},\mathfrak{v}_{j}].

We therefore get the final statement of the theorem asserting disjointness of images. ∎

Corollary 4.14.

Suppose that SK1S\subset\mathbb{P}_{K}^{1} is an optimal subset satisfying hypotheses (i)-(iii) of 4.10. For brevity of notation, write Θ~\tilde{\Theta} for the theta function Θai,biΓ0\Theta_{a_{i},b_{i}}^{\Gamma_{0}}.

  (a) We have $v(\tilde{\Theta}(0)-1)=2v(p)+v(\lambda)$.

  (b) The induced map $\tilde{\Theta}_{*}$, when restricted to the subspace $[\eta_{1},\mathfrak{v}_{j}]\subset\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}$, is one-to-one and bicontinuous. Its values on $[\eta_{1},\mathfrak{v}_{j}]$ are given by

    (58) $\tilde{\Theta}_{*}(\eta(1,d))=\begin{cases}\eta(1,2v(p)+v(\lambda)-pd)&\mathrm{for}\ 0\leq d\leq\frac{v(p)}{p-1}\\ \eta(1,v(p)+v(\lambda)-d)&\mathrm{for}\ \frac{v(p)}{p-1}\leq d\leq v(\lambda)-\frac{v(p)}{p-1}\\ \eta(1,pv(\lambda)-pd)&\mathrm{for}\ d\geq v(\lambda)-\frac{v(p)}{p-1}\end{cases}.$

  (c) For sufficiently small $\nu>0$, defining $U^{-}$ as in the statement of 4.12, we have

    (59) $\tilde{\Theta}_{*}([\eta_{1},\mathfrak{v}_{j}])\cap\tilde{\Theta}_{*}(U^{-}\smallsetminus[\eta_{1},\mathfrak{v}_{j}])=\varnothing.$
Proof.

Write Θ\Theta for Θai,biΓ\Theta^{\Gamma}_{a_{i},b_{i}}. As observed in (24) above, we have Θ~Θp\tilde{\Theta}\equiv\Theta^{p}; we can therefore express Θ~\tilde{\Theta} as the composition PΘP\circ\Theta, where P:K1K1P:\mathbb{P}_{\mathbb{C}_{K}}^{1}\to\mathbb{P}_{\mathbb{C}_{K}}^{1} is the pp-power map zzpz\mapsto z^{p}. It follows that the induced map Θ~\tilde{\Theta}_{*} is the composition of Θ\Theta_{*} with the map P:K1,anK1,anP_{*}:\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}\to\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} induced by PP. By [2, Proposition 7.6] we have P(ηD)=ηP(D)P_{*}(\eta_{D})=\eta_{P(D)} for any disc DKD\subset\mathbb{C}_{K}. There is a well-known formula for the image of any disc under PP (see, for instance, [2, Exercise 7.15]) given by

(60) P(Da(r))={Dap(pr)forrv(a)+v(p)p1Dap(v(p)+(p1)v(a)+r)forrv(a)+v(p)p1.P(D_{a}(r))=\begin{cases}D_{a^{p}}(pr)\ &\mathrm{for}\ r\leq v(a)+\frac{v(p)}{p-1}\\ D_{a^{p}}(v(p)+(p-1)v(a)+r)\ &\mathrm{for}\ r\geq v(a)+\frac{v(p)}{p-1}\end{cases}.

Now parts (a) and (b) can be confirmed using the fact that $\tilde{\Theta}_{*}=P_{*}\circ\Theta_{*}$ and applying the above formula (60) to the outputs of $\Theta_{*}$ provided by 4.12(a)(b).
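As a sanity check on the formulas, the following minimal sketch encodes the radius rule of (60) and the piecewise radius function of (58) using exact rational arithmetic. The function names and the sample values $p=3$, $v(p)=1$, $v(\lambda)=5$ are hypothetical choices made only for illustration; they do not come from the text.

```python
from fractions import Fraction


def p_power_radius(r, va, vp, p):
    """Logarithmic radius of P(D_a(r)) for P(z) = z**p, following formula (60)."""
    threshold = va + vp / (p - 1)
    if r <= threshold:
        return p * r
    return vp + (p - 1) * va + r


def theta_tilde_radius(d, vp, vlam, p):
    """Radius d' with Theta~_*(eta(1, d)) = eta(1, d'), following (58), for d >= 0."""
    t = vp / (p - 1)
    if d <= t:
        return 2 * vp + vlam - p * d
    if d <= vlam - t:
        return vp + vlam - d
    return p * vlam - p * d


# Hypothetical sample data (assumed, not taken from the paper).
p = 3
vp = Fraction(1)      # v(p)
vlam = Fraction(5)    # v(lambda), chosen larger than 2 v(p)/(p-1)
t = vp / (p - 1)      # the breakpoint v(p)/(p-1)

# Applying (60) to a disc centered at 1 (so v(a) = 0) of radius v(p) + v(lambda)
# gives radius 2 v(p) + v(lambda), matching part (a) and the d = 0 value of (58).
radius_at_zero = p_power_radius(vp + vlam, Fraction(0), vp, p)
print(radius_at_zero, theta_tilde_radius(Fraction(0), vp, vlam, p))
```

With these sample values, the two branches of (60) agree at the threshold $r=v(a)+\frac{v(p)}{p-1}$, and adjacent cases of (58) agree at both breakpoints, as one expects of a continuous map.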

To prove part (c), we begin by observing directly from 4.12(a)(b) that we may describe the image of [η1,𝔳j][\eta_{1},\mathfrak{v}_{j}] under Θ\Theta_{*} as

(61) Θ([η1,𝔳j])=[η(1,v(p)+v(λ)),η]={η(1,d)|dv(p)+v(λ)}.\Theta_{*}([\eta_{1},\mathfrak{v}_{j}])=[\eta(1,v(p)+v(\lambda)),\eta_{\infty}]=\{\eta(1,d^{\prime})\ |\ d^{\prime}\leq v(p)+v(\lambda)\}.

Choose a point ηU[η1,𝔳j]\eta\in U^{-}\smallsetminus[\eta_{1},\mathfrak{v}_{j}]. If η\eta can be written as η(0,d)\eta(0,d) for some (necessarily negative) dd, then, using 4.12(b) and the formula in (60), we compute

(62) $\tilde{\Theta}_{*}(\eta(0,d))=(P_{*}\circ\Theta_{*})(\eta(0,d))=P_{*}(\eta(1,v(p)+v(\lambda)-pd))=\eta(1,2v(p)+v(\lambda)-pd)\notin\{\eta(1,d^{\prime})\ |\ d^{\prime}\leq v(p)+v(\lambda)\}=\Theta_{*}([\eta_{1},\mathfrak{v}_{j}]).$

Now suppose on the other hand that ηΛ(j)\eta\notin\Lambda_{(j)}. Then 4.12(c) implies that the point Θ(η)\Theta_{*}(\eta) can be written as η(c,r)\eta(c,r) where r>v(c1)=v(p)+v(λ)>v(p)p1r>v(c-1)=v(p)+v(\lambda)>\frac{v(p)}{p-1}. By observing that

(63) $v(c-\zeta_{p}^{n})=\min\{v(c-1),v(\zeta_{p}^{n}-1)\}=\min\{v(p)+v(\lambda),\tfrac{v(p)}{p-1}\}=\tfrac{v(p)}{p-1}<r$

for 1np11\leq n\leq p-1, we deduce that the point Θ(η)\Theta_{*}(\eta) corresponds to a disc DD not containing ζpn\zeta_{p}^{n} for any nn. Then we have 1P(D)1\notin P(D), and since P(D)P(D) is the disc corresponding to Θ~(η)=P(Θ(η))\tilde{\Theta}_{*}(\eta)=P_{*}(\Theta_{*}(\eta)), we again get Θ~(η)Θ~([η1,𝔳j])\tilde{\Theta}_{*}(\eta)\notin\tilde{\Theta}_{*}([\eta_{1},\mathfrak{v}_{j}]). This completes the proof of part (c). ∎
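The value $v(\zeta_{p}^{n}-1)=\frac{v(p)}{p-1}$ used in (63) is standard; for the reader's convenience, a short derivation is sketched below. It relies only on the factorization of $x^{p}-1$ and is valid in any residue characteristic (when $p$ is not the residue characteristic, both sides are $0$).

```latex
\begin{align*}
\prod_{n=1}^{p-1}(x-\zeta_{p}^{n}) &= \frac{x^{p}-1}{x-1} = 1+x+\dots+x^{p-1}
  \quad\Longrightarrow\quad \prod_{n=1}^{p-1}(1-\zeta_{p}^{n}) = p, \\
\frac{1-\zeta_{p}^{n}}{1-\zeta_{p}} &= 1+\zeta_{p}+\dots+\zeta_{p}^{n-1}
  \quad\text{is integral, and so is its inverse (by symmetry), so}\quad
  v(1-\zeta_{p}^{n}) = v(1-\zeta_{p}), \\
(p-1)\,v(1-\zeta_{p}^{n}) &= v(p)
  \quad\Longrightarrow\quad v(\zeta_{p}^{n}-1) = \frac{v(p)}{p-1}
  \qquad (1\leq n\leq p-1).
\end{align*}
```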

4.5. The image of the whole convex hull under the theta function

This subsection consists of the rest of the proof of 1.3. We will actually prove a slightly more sophisticated statement by more precisely defining the claimed map $\pi_{*}$ in the following manner. In the statement of 1.3, we have implicitly provided a $K$-analytic isomorphism between the quotient of the set of $K$-points $\Omega_{\Gamma_{0}}(K)$ by the action of $\Gamma_{0}$ and the superelliptic curve $C/K$; let us extend this to an isomorphism over $\mathbb{C}_{K}$ and denote it by $\vartheta:\Omega_{\Gamma_{0}}/\Gamma_{0}\stackrel{\sim}{\to}C/\mathbb{C}_{K}$. Also, in that statement, the map $\pi$ is simply a bijection from the $(2g+2)$-element set $S$ to the $(2g+2)$-element set $\mathcal{B}$ of branch points of $C$. Let us extend $\pi$ to the composition of $\vartheta$ with the quotient map $\Omega_{\Gamma_{0}}\twoheadrightarrow\Omega_{\Gamma_{0}}/\Gamma_{0}$. If we have $\infty\notin\Omega_{\Gamma_{0}}$, then we may choose an automorphism $\sigma\in\mathrm{PGL}_{2}(\mathbb{C}_{K})$ such that $\infty\in\sigma(\Omega_{\Gamma_{0}})$ and replace $S$ with $\sigma(S)$ (so that the associated objects $\Gamma_{0}$ and $\Omega_{\Gamma_{0}}$ are replaced with $\sigma\Gamma_{0}\sigma^{-1}$ and $\sigma(\Omega_{\Gamma_{0}})$ respectively) without affecting the curve $C$ (see 1.2) or the structure of the convex hull $\Sigma_{S}$ (as $\sigma$ acts as a metric-preserving homeomorphism on $\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}$). Having done this, we now assume that we have $\infty\in\Omega_{\Gamma_{0}}$.

Lemma 4.15.

With the above set-up, there exists τPGL2(K)\tau\in\mathrm{PGL}_{2}(\mathbb{C}_{K}) such that the map τπ:ΩΓ0K1\tau\circ\pi:\Omega_{\Gamma_{0}}\to\mathbb{P}_{K}^{1} is the theta function Θa,bΓ0\Theta^{\Gamma_{0}}_{a,b} for some a,bΩΓ0a,b\in\Omega_{\Gamma_{0}} with bΓ0(a)b\notin\Gamma_{0}(a) and Γ0(a)Γ0(b)\infty\notin\Gamma_{0}(a)\cup\Gamma_{0}(b).

Proof.

Let $\tau\in\mathrm{PGL}_{2}(\mathbb{C}_{K})$ be an automorphism sending $\vartheta\circ\bar{\sigma}(\infty)$ to $1$, and choose elements $a,b\in\Omega_{\Gamma_{0}}$ whose respective images modulo the action of $\Gamma_{0}$ are $(\tau\circ\vartheta)^{-1}(0)$ and $(\tau\circ\vartheta)^{-1}(\infty)$. Then it is clear from formulas for theta functions (more specifically, noting that the images of $\infty,a,b$ under $\Theta^{\Gamma_{0}}_{a,b}$ are $1,0,\infty$ respectively) that the composition of analytic isomorphisms $\tau\circ\vartheta\circ(\vartheta^{\Gamma_{0}}_{a,b})^{-1}$ fixes each of the points $1,0,\infty\in\mathbb{P}_{\mathbb{C}_{K}}^{1}$. The only analytic automorphism of the projective line fixing $3$ distinct points is the identity, so we get $\tau\circ\vartheta=\vartheta^{\Gamma_{0}}_{a,b}$. ∎

The above lemma says that the function $\pi$ is the composition $\tau^{-1}\circ\Theta^{\Gamma_{0}}_{a,b}$ for some $\tau\in\mathrm{PGL}_{2}(\mathbb{C}_{K})$ and $a,b\in\Omega_{\Gamma_{0}}$. We have seen in §4.2 that the map $\Theta^{\Gamma_{0}}_{a,b}$ induces a map $(\Theta^{\Gamma_{0}}_{a,b})_{*}:\Sigma_{S}\to\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}$; now by functoriality of this construction with respect to compositions of functions, the induced function $\pi_{*}=\tau^{-1}\circ(\Theta^{\Gamma_{0}}_{a,b})_{*}$ is defined on $\Sigma_{S}$ (here $\tau=\tau_{*}$ is the usual extension of $\tau$ to a metric-preserving self-homeomorphism of $\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}$). As $\tau^{-1}$ respects distances, it is now clear that it suffices to prove the assertions of 1.3 under the assumption that $\tau=1$, or in other words, that $\pi_{*}=(\Theta^{\Gamma_{0}}_{a,b})_{*}$.

From now on, we abbreviate Θa,bΓ0\Theta^{\Gamma_{0}}_{a,b} as Θ~\tilde{\Theta} and set out to prove that the assertions of 1.3 hold for the map Θ~:ΣSK1,an\tilde{\Theta}_{*}:\Sigma_{S}\to\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}. We also write 𝔳K1,an\mathfrak{v}\in\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} for the point η1(0)=η0(0)\eta_{1}(0)=\eta_{0}(0).

Our strategy is now to deduce 1.3 from 4.14 (which asserts that the desired conclusions hold when restricted to a certain subspace $U^{-}$ of the convex hull $\Sigma_{S}$ where the set $S$ satisfies certain additional hypotheses) using a “gluing method” which exploits the fact that (loosely speaking) images of translations of this subspace $U^{-}$ cover the whole convex hull $\Sigma_{S}$. The following lemma is crucial to the gluing process.

Lemma 4.16.

Let SK1S\subset\mathbb{P}_{K}^{1} be a pp-superelliptic set with associated pp-Whittaker group Γ0<PGL2(K)\Gamma_{0}<\mathrm{PGL}_{2}(K). Let σPGL2(K)\sigma\in\mathrm{PGL}_{2}(K) be a fractional linear transformation, and choose elements a,b,a,bΩΓ0a,b,a^{\prime},b^{\prime}\in\Omega_{\Gamma_{0}} with bΓ0(a)b\notin\Gamma_{0}(a) and bΓ0(a)b^{\prime}\notin\Gamma_{0}(a^{\prime}) and Γ0(a)Γ0(b)Γ0(a)Γ0(b)\infty\notin\Gamma_{0}(a)\cup\Gamma_{0}(b)\cup\Gamma_{0}(a^{\prime})\cup\Gamma_{0}(b^{\prime}). The automorphism σ\sigma maps the set of non-limit points of Γ0\Gamma_{0} to those of its conjugate Γ0σ:=σΓ0σ1\Gamma_{0}^{\sigma}:=\sigma\Gamma_{0}\sigma^{-1}, and there is a fractional linear transformation τPGL2(K)\tau\in\mathrm{PGL}_{2}(\mathbb{C}_{K}) and an analytic isomorphism σ¯\bar{\sigma} such that the below diagram commutes, where πΓ0\pi_{\Gamma_{0}} and πΓ0σ\pi_{\Gamma_{0}^{\sigma}} are the obvious quotient maps.

(64) $\begin{array}{ccc}
\Omega_{\Gamma_{0}} & \xrightarrow[\ \sim\ ]{\ \sigma\ } & \Omega_{\Gamma_{0}^{\sigma}} \\
{\scriptstyle\pi_{\Gamma_{0}}}\big\downarrow & & \big\downarrow{\scriptstyle\pi_{\Gamma_{0}^{\sigma}}} \\
\Omega_{\Gamma_{0}}/\Gamma_{0} & \xrightarrow[\ \sim\ ]{\ \bar{\sigma}\ } & \Omega_{\Gamma_{0}^{\sigma}}/\Gamma_{0}^{\sigma} \\
{\scriptstyle\vartheta^{\Gamma_{0}}_{a,b}}\big\downarrow\wr & & \wr\big\downarrow{\scriptstyle\vartheta^{\Gamma_{0}^{\sigma}}_{\sigma(a^{\prime}),\sigma(b^{\prime})}} \\
\mathbb{P}_{\mathbb{C}_{K}}^{1} & \xrightarrow[\ \sim\ ]{\ \tau\ } & \mathbb{P}_{\mathbb{C}_{K}}^{1}
\end{array}$
Here the composite down the left-hand column is $\tilde{\Theta}^{\Gamma_{0}}_{a,b}$ and the composite down the right-hand column is $\tilde{\Theta}^{\Gamma_{0}^{\sigma}}_{\sigma(a^{\prime}),\sigma(b^{\prime})}$.
Proof.

It is an easy exercise to directly check that we have σ(ΩΓ0)=ΩΓ0σ\sigma(\Omega_{\Gamma_{0}})=\Omega_{\Gamma_{0}^{\sigma}} (and that we have σ(b)Γ0σ(σ(a))\sigma(b^{\prime})\notin\Gamma_{0}^{\sigma}(\sigma(a^{\prime}))); the automorphism σ:K1K1\sigma:\mathbb{P}_{\mathbb{C}_{K}}^{1}\to\mathbb{P}_{\mathbb{C}_{K}}^{1} thus restricts to an analytic isomorphism σ:ΩΓ0ΩΓ0σ\sigma:\Omega_{\Gamma_{0}}\to\Omega_{\Gamma_{0}^{\sigma}}. This clearly induces a well-defined map σ¯:ΩΓ0/Γ0ΩΓ0σ/Γ0σ\bar{\sigma}:\Omega_{\Gamma_{0}}/\Gamma_{0}\to\Omega_{\Gamma_{0}^{\sigma}}/\Gamma_{0}^{\sigma} which is also an analytic isomorphism. Now the composition τ:=ϑσ(a),σ(b)Γ0σσ¯(ϑa,bΓ0)1:K1K1\tau:=\vartheta^{\Gamma_{0}^{\sigma}}_{\sigma(a^{\prime}),\sigma(b^{\prime})}\circ\bar{\sigma}\circ(\vartheta^{\Gamma_{0}}_{a,b})^{-1}:\mathbb{P}_{\mathbb{C}_{K}}^{1}\to\mathbb{P}_{\mathbb{C}_{K}}^{1} of analytic isomorphisms is an analytic automorphism of K1\mathbb{P}_{\mathbb{C}_{K}}^{1} and is therefore a fractional linear transformation. ∎

For the rest of this subsection, we call an ordered pair (v,w)(v,w) of distinguished vertices a neighboring pair (of distinguished vertices of ΣS,0\Sigma_{S,0}) if vv and ww do not lie in the same axis Λ(i)\Lambda_{(i)} and if the path [v,w]ΣS[v,w]\subset\Sigma_{S} contains no other distinguished vertex. Given any neighboring pair (v,w)(v,w) and any real number ν>0\nu>0, define the subspace B({w},ν)+B({w},ν)B(\{w\},\nu)^{+}\subset B(\{w\},\nu) (resp. B({w},ν)B({w},ν)B(\{w\},\nu)^{-}\subset B(\{w\},\nu)) to consist of the points whose closest point in Λ(j)\Lambda_{(j)} lies in the half-axis [w,ηaj][w,\eta_{a_{j}}] (resp. [w,ηbj][w,\eta_{b_{j}}]); note that we have B({w},ν)+B({w},ν)=B({w},ν)B(\{w\},\nu)^{+}\cup B(\{w\},\nu)^{-}=B(\{w\},\nu). Now for any neighboring pair (v,w)(v,w), we make the definitions

$U_{v,w}^{+}=[\eta_{a_{i}},w]\cup(B(\{w\},\nu)^{+}\cap\Sigma_{S}),\ \ \ U_{v,w}^{-}=[\eta_{b_{i}},w]\cup(B(\{w\},\nu)^{-}\cap\Sigma_{S}),$
$U_{v,w}=U_{v,w}^{+}\cup U_{v,w}^{-}=\Lambda_{(i)}\cup[v,w]\cup(B(\{w\},\nu)\cap\Sigma_{S}).$

(Although the sets $U_{v,w}^{\pm},U_{v,w}$ described above depend on the choice of $\nu$, we suppress it from the notation.) It is easy to see that given any $\nu>0$, the convex hull $\Sigma_{S}$ coincides with the union of the subspaces $U_{v,w}$ over all neighboring pairs $(v,w)$: indeed, this union contains the subspaces $[\eta_{a_{i}},w]\cup[\eta_{b_{i}},w]=\Lambda_{(i)}\cup[v,w]\subset U_{v,w}$; this includes all axes $\Lambda_{(i)}$ (as there is at least one distinguished vertex of $\Sigma_{S,0}$ lying in each $\Lambda_{(i)}$) as well as all paths between any pair of distinguished vertices lying in distinct axes and thus paths between any pair of axes and between any pair of points corresponding to elements of $S$. Our strategy for proving the theorem is, after choosing an appropriate $\nu>0$, to use 4.14 to describe the behavior of $\tilde{\Theta}_{*}$ restricted to each $U_{v,w}^{-}$ and $U_{v,w}^{+}$ and from that, describe it restricted to each $U_{v,w}$, and then “glue these restrictions together” to get a description of $\tilde{\Theta}_{*}$ on all of $\Sigma_{S}$.

Choose a neighboring pair (v,w)(v,w) with corresponding indices iji\neq j as in the construction of Uv,w±U_{v,w}^{\pm}. Letting σv,wPGL2(K)\sigma_{v,w}^{-}\in\mathrm{PGL}_{2}(K) be the (unique) automorphism mapping bi,aj,bjΩΓ0b_{i},a_{j},b_{j}\in\Omega_{\Gamma_{0}} respectively to 1,0,σ(Ω)1,0,\infty\in\sigma(\Omega), Lemma 4.16 (putting a=aia^{\prime}=a_{i} and b=bib^{\prime}=b_{i}) tells us that there is a fractional linear transformation τv,wPGL2(K)\tau_{v,w}^{-}\in\mathrm{PGL}_{2}(K) making the diagram (64) commute. On identifying the respective elements σv,w(al),σv,w(bl)σ(S)\sigma_{v,w}^{-}(a_{l}),\sigma_{v,w}^{-}(b_{l})\in\sigma(S) with the elements named al,bla_{l},b_{l} in the statement of 4.14 for all indices ll, we see that the hypotheses (i)-(iii) of 4.10 (and thus the hypotheses of 4.14) are satisfied as they describe (ησv,w(ai)η1,𝔳)(\eta_{\sigma_{v,w}^{-}(a_{i})}\vee\eta_{1},\mathfrak{v}) as a neighboring pair of vertices of Σσv,w(S)\Sigma_{\sigma_{v,w}^{-}(S)}. It is clear that for any ν>0\nu>0, the image σ(Uv,w)\sigma(U_{v,w}^{-}) coincides with the subspace UU^{-} defined in the statement of 4.14; in fact, we have σ([ηbi,v])=[η1,η1η1+λ]\sigma([\eta_{b_{i}},v])=[\eta_{1},\eta_{1}\vee\eta_{1+\lambda}] and σ([v,w])=[η1η1+λ,𝔳]\sigma([v,w])=[\eta_{1}\vee\eta_{1+\lambda},\mathfrak{v}]. Now, using the fact that the action of fractional linear transformations (in particular, τv,w\tau_{v,w}^{-}) on K1,an\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}} is a metric-preserving homeomorphism, we use 4.14(b) to describe the behavior of the function Θ~\tilde{\Theta}_{*} restricted to [ηbi,w][\eta_{b_{i}},w] as follows:

  • it maps $[\eta_{b_{i}},v]=(\sigma_{v,w}^{-})^{-1}([\eta_{1},\eta_{1}\vee\eta_{1+\lambda}])$ to the path $[(\tau_{v,w}^{-})^{-1}(\eta_{\infty}),(\tau_{v,w}^{-})^{-1}(\mathfrak{v})]$ in a manner that scales the metric by $p$; and

  • it maps $[v,w]=(\sigma_{v,w}^{-})^{-1}([\eta_{1}\vee\eta_{1+\lambda},\mathfrak{v}])$ to the path $[(\tau_{v,w}^{-})^{-1}(\mathfrak{v}),(\tau_{v,w}^{-})^{-1}(\eta_{c}\vee\eta_{1})]$ (where $c=\Theta^{\Gamma_{0}^{\sigma}}_{1,1+\lambda}(0)$) in a manner that scales the metric by $p$ on the sub-segments $[v,\tilde{v}]$ and $[\tilde{w},w]$ and preserves the metric on the sub-segment $[\tilde{v},\tilde{w}]$, where $\tilde{v}$ (resp. $\tilde{w}$) is the unique point in the path $[v,w]$ of distance $\frac{v(p)}{p-1}$ from $v$ (resp. $w$).
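The metric behavior on $[v,w]$ described in the second item can be turned into a small computation of the length of the image path. The sketch below is only an illustration: the function name and the sample inputs are hypothetical, and it assumes $\delta(v,w)\geq\frac{2v(p)}{p-1}$ so that the two end sub-segments do not overlap.

```python
from fractions import Fraction


def image_length(delta, vp, p):
    """Length of the image of [v, w] when the two end sub-segments of length
    v(p)/(p-1) are stretched by a factor of p and the middle is kept isometric."""
    end = Fraction(vp) / (p - 1)
    if delta < 2 * end:
        raise ValueError("expected delta >= 2 v(p)/(p-1)")
    middle = Fraction(delta) - 2 * end
    return p * end + middle + p * end


# Hypothetical example: p = 3, v(p) = 1, delta(v, w) = 4.
print(image_length(Fraction(4), Fraction(1), 3))
```

In every case the result simplifies to $\delta(v,w)+2v(p)$; in particular, when $p$ is not the residue characteristic we have $v(p)=0$ and the path $[v,w]$ is mapped isometrically.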

We also use 4.14(c) to conclude that, if ν>0\nu>0 is chosen small enough, we have

(65) $\tilde{\Theta}_{*}([\eta_{b_{i}},w])\cap\tilde{\Theta}_{*}(U_{v,w}^{-}\smallsetminus[\eta_{b_{i}},w])=\varnothing.$

Now letting $\sigma_{v,w}^{+}\in\mathrm{PGL}_{2}(K)$ be the (unique) automorphism mapping $a_{i},b_{j},a_{j}\in\Omega_{\Gamma_{0}}$ respectively to $1,0,\infty\in\sigma(\Omega)$, Lemma 4.16 (again putting $a^{\prime}=a_{i}$ and $b^{\prime}=b_{i}$) tells us that there is a fractional linear transformation $\tau_{v,w}^{+}\in\mathrm{PGL}_{2}(K)$ making the diagram (64) commute. We now apply 4.14, identifying the respective elements $\sigma_{v,w}^{+}(a_{i}),\sigma_{v,w}^{+}(b_{i})\in\sigma(S)$ with the elements named $b_{i},a_{i}$ in the statement of that corollary for all indices $i$ (as before, except that we have now switched $a_{i}$ with $b_{i}$!). A completely analogous argument to the one above then describes the behavior of the function $\tilde{\Theta}_{*}$ restricted to $[\eta_{a_{i}},w]$ with respect to the metric, and the description is similar to the one obtained above for its behavior on $[\eta_{b_{i}},w]$. This matches with the function’s previously known behavior on $[v,w]=[\eta_{a_{i}},w]\cap[\eta_{b_{i}},w]$. It also gives us the new information that $(\tilde{\Theta}_{a_{i},b_{i}})_{*}$ maps $[\eta_{a_{i}},v]=(\sigma_{v,w}^{+})^{-1}([\eta_{1},\eta_{1}\vee\eta_{1+\lambda^{\prime}}])$ (where $\lambda^{\prime}=\sigma_{v,w}^{+}(b_{i})-1$) to the path $[\iota\circ(\tau_{v,w}^{+})^{-1}(\eta_{\infty}),\iota\circ(\tau_{v,w}^{+})^{-1}(\mathfrak{v})]$ (where $\iota$ is the reciprocal map) in a manner that scales the metric by $p$. Finally, it is clear that for any $\nu>0$, the image $\sigma(U_{v,w}^{+})$ again coincides with the subspace $U^{-}$ defined in the statement of 4.14, which means that if $\nu>0$ is chosen small enough, we have

(66) Θ~([ηai,w])Θ~(Uv,w+[ηai,w])=.\tilde{\Theta}_{*}([\eta_{a_{i}},w])\cap\tilde{\Theta}_{*}(U_{v,w}^{+}\smallsetminus[\eta_{a_{i}},w])=\varnothing.

In particular, since the images Θ~([ηbi,w])\tilde{\Theta}_{*}([\eta_{b_{i}},w]) and Θ~([ηai,w])\tilde{\Theta}_{*}([\eta_{a_{i}},w]) are each non-backtracking paths whose endpoints of Type I must respectively equal ηΘ~(ai),ηΘ~(bi)K1\eta_{\tilde{\Theta}_{*}(a_{i})},\eta_{\tilde{\Theta}_{*}(b_{i})}\in\mathbb{P}_{\mathbb{C}_{K}}^{1}, we get that Θai,biΓ0\Theta^{\Gamma_{0}}_{a_{i},b_{i}} maps the axis Λ(i)\Lambda_{(i)} homeomorphically onto the axis ΛΘ~(ai),Θ~(bi)\Lambda_{\tilde{\Theta}_{*}(a_{i}),\tilde{\Theta}_{*}(b_{i})}.

It now follows directly that the behavior of the function Θ~\tilde{\Theta}_{*} restricted to Λ(i)[v,w]\Lambda_{(i)}\cup[v,w] is as claimed in 1.3 with respect to the metric as well as being a homeomorphism onto its image ΛΘ~(ai),Θ~(bi)[Θ~(v),Θ~(w)]\Lambda_{\tilde{\Theta}(a_{i}),\tilde{\Theta}(b_{i})}\cup[\tilde{\Theta}_{*}(v),\tilde{\Theta}_{*}(w)]. Combining (65) and (66), we moreover conclude that, for sufficiently small ν>0\nu>0, we have

(67) Θ~([v,w])Θ~(B({w},ν)[v,w])=.\tilde{\Theta}_{*}([v,w])\cap\tilde{\Theta}_{*}(B(\{w\},\nu)\smallsetminus[v,w])=\varnothing.

As the choice of neighboring pair (v,w)(v,w) was arbitrary, we have demonstrated these properties for all subspaces Uv,wU_{v,w}.

Now we will show that $\tilde{\Theta}_{*}$ is one-to-one on a neighborhood of each distinguished vertex of $\Sigma_{S}$; as it has already been shown to be one-to-one on each axis $\Lambda_{(i)}$ as well as each path $[v,w]$ between neighboring pairs of distinguished vertices, it will then follow from a straightforward exercise in topology concerning continuous maps between real trees that the function $\tilde{\Theta}_{*}$ is one-to-one on all of $\Sigma_{S}$. We choose a distinguished vertex $w$ lying in an axis $\Lambda_{(j)}$ and a real number $\nu>0$ small enough that $w$ is the only distinguished vertex in $B(\{w\},\nu)$; it is clear from 3.5 that any neighborhood of $w$ in $\Sigma_{S}$ contains a star shape centered at $w$ with $2$ edges coming out of $w$ being ends of the paths $[\eta_{a_{j}},w]$ and $[\eta_{b_{j}},w]$ and with each other edge coming out of $w$ being the end of the path $[v,w]$ where $(v,w)$ is a neighboring pair. We know from (67) that, after possibly shrinking $\nu$, for any distinguished vertex $v$ such that $(v,w)$ is a neighboring pair, the image of $B(\{w\},\nu)\cap[v,w]$ under $(\tilde{\Theta}^{\Gamma_{0}}_{a_{i},b_{i}})_{*}$ is disjoint from the image of its complement in $B(\{w\},\nu)$. We also know (by applying results obtained above to the neighboring pair $(w,v)$ rather than $(v,w)$) that $\tilde{\Theta}_{*}$ is one-to-one when restricted to the axis $\Lambda_{(j)}$ and when restricted to each of the paths $[\eta_{a_{i}},w],[\eta_{b_{i}},w]\subset\Sigma_{S}$. All of this implies that the images under $\tilde{\Theta}_{*}$ of the edges coming out of $w$ in the star shape intersect only at the point $\tilde{\Theta}_{*}(w)$. It follows that the function $\tilde{\Theta}_{*}$ is one-to-one on $B(\{w\},\nu)$, as desired.

We finally have to show that $\tilde{\Theta}_{*}$ maps $\Sigma_{S}$ homeomorphically onto the convex hull $\Sigma_{\mathcal{B}}$. From the fact that $\tilde{\Theta}_{*}$ is a homeomorphism onto its image when restricted to each $U_{v,w}$ and that it is one-to-one on $\Sigma_{S}$, we see that it is a homeomorphism onto its image when restricted to $\Sigma_{S}=\bigcup U_{v,w}$ as well. We have already noted that $\tilde{\Theta}_{*}$ maps axes to axes, or more precisely, that we have $\tilde{\Theta}_{*}(\Lambda_{(i)})=\Lambda_{\tilde{\Theta}(a_{i}),\tilde{\Theta}(b_{i})}$ for $0\leq i\leq g$. Meanwhile, we have seen that for any neighboring pair $(v,w)$ of distinguished vertices of $\Sigma_{S}$ with $v\in\Lambda_{(i)}$ and $w\in\Lambda_{(j)}$, the image $\tilde{\Theta}_{*}([v,w])$ is a (non-backtracking) path. Then clearly the endpoints of the image lie in $\Lambda_{\tilde{\Theta}(a_{i}),\tilde{\Theta}(b_{i})}$ and $\Lambda_{\tilde{\Theta}(a_{j}),\tilde{\Theta}(b_{j})}$ while its interior does not intersect any axis $\Lambda_{\tilde{\Theta}(a_{l}),\tilde{\Theta}(b_{l})}$. Such a path must be contained in the convex hull $\Sigma_{\mathcal{B}}$ (as the convex hull is path connected and must contain the shortest path between each pair of axes), so we have $\tilde{\Theta}_{*}(\Sigma_{S})\subseteq\Sigma_{\mathcal{B}}$. Conversely, since the convex hull $\Sigma_{S}$ is connected, the image $\tilde{\Theta}_{*}(\Sigma_{S})$ is also connected and we get the reverse inclusion $\tilde{\Theta}_{*}(\Sigma_{S})\supseteq\Sigma_{\mathcal{B}}$. This completes the proof of 1.3.

5. Cluster data of branch points of split degenerate superelliptic curves

An almost immediate corollary of 1.3 gives us a result concerning the branch locus of a split degenerate pp-cover of the projective line.

Corollary 5.1.

Let C/KC/K be a pp-cyclic cover of K1\mathbb{P}_{K}^{1} which has split degenerate reduction, and denote its set of branch points by K1\mathcal{B}\subset\mathbb{P}_{\mathbb{C}_{K}}^{1}. The set \mathcal{B} is clustered in pv(p)p1\frac{pv(p)}{p-1}-separated pairs.

Proof.

Let v,wv,w be distinct distinguished vertices of ΣS\Sigma_{S} which do not lie in the same axis Λ(l)\Lambda_{(l)} for any index ll; 3.7 says that we have δ(v,w)>2v(p)p1\delta(v,w)>\frac{2v(p)}{p-1}. Let iji\neq j be the indices such that vΛ(i)v\in\Lambda_{(i)} and wΛ(j)w\in\Lambda_{(j)}, and let v~\tilde{v} (resp. w~\tilde{w}) be the (unique) point in the path [v,w][v,w] whose distance from vv (resp. ww) is equal to v(p)p1\frac{v(p)}{p-1}. Then, using the terminology of 1.3, we have [v,v~][w~,w]v,w[v,\tilde{v}]\cup[\tilde{w},w]\subset\llbracket v,w\rrbracket. As each of the segments [v,v~][v,\tilde{v}] and [w~,w][\tilde{w},w] has length v(p)p1\frac{v(p)}{p-1}, we get μ(v,w)2v(p)p1\mu(v,w)\geq\frac{2v(p)}{p-1}. Now 1.3 says that the images π(v),π(w)\pi_{*}(v),\pi_{*}(w) of the distinguished vertices v,wv,w are themselves distinguished vertices of Σ\Sigma_{\mathcal{B}} and that we have

(68) $\delta(\pi_{*}(v),\pi_{*}(w))=\delta(v,w)+(p-1)\mu(v,w)\geq\delta(v,w)+2v(p)>\frac{2v(p)}{p-1}+2v(p)=\frac{2pv(p)}{p-1}.$

As each distinguished vertex of $\Sigma_{\mathcal{B}}$ is the image under $\pi_{*}$ of a distinguished vertex of $\Sigma_{S}$, we have shown that the distance between any pair of distinguished vertices of $\Sigma_{\mathcal{B}}$ lying in distinct axes is $>\frac{2pv(p)}{p-1}$. 3.7 now tells us that the set $\mathcal{B}$ is clustered in $\frac{pv(p)}{p-1}$-separated pairs. ∎
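To illustrate the estimate (68) numerically, here is a worked instance under hypothetical normalizations (the values $p=3$ and $v(3)=1$ are assumed purely for the sake of example and do not come from the text).

```latex
\begin{align*}
p=3,\ v(p)=1:\qquad
  &\tfrac{v(p)}{p-1}=\tfrac{1}{2},\qquad \delta(v,w)>\tfrac{2v(p)}{p-1}=1,\qquad
   \mu(v,w)\geq\tfrac{2v(p)}{p-1}=1, \\
  &\delta(\pi_{*}(v),\pi_{*}(w))=\delta(v,w)+2\,\mu(v,w)\geq\delta(v,w)+2>3
   =\tfrac{2pv(p)}{p-1}. \\
v(p)=0\ \text{(tame case)}:\qquad
  &\text{both thresholds } \tfrac{2v(p)}{p-1} \text{ and } \tfrac{2pv(p)}{p-1} \text{ equal } 0,
   \text{ and (68) only asserts } \delta(\pi_{*}(v),\pi_{*}(w))>0 .
\end{align*}
```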

Remark 5.2.

It should be possible to provide a purely geometric proof of 5.1, based on the fact that a curve with split degenerate reduction over KK, by definition, has a model over the ring of integers 𝒪K\mathcal{O}_{K} of KK the components of whose special fiber are each a copy of the projective line k1\mathbb{P}_{k}^{1} over the residue field.

Moreover, the converse of 5.1 – that the branch points of a pp-cyclic cover of the projective line being clustered in pv(p)p1\frac{pv(p)}{p-1}-separated pairs implies split degenerate reduction – is true, at least in the tame case. This again can in principle be demonstrated by purely geometric arguments, and the idea of the proof is as follows. Assume for simplicity that we have \infty\in\mathcal{B} and that the (maximal) cluster 𝔰0:={}\mathfrak{s}_{0}:=\mathcal{B}\smallsetminus\{\infty\} of \mathcal{B} has depth d(𝔰0)=0d(\mathfrak{s}_{0})=0 (these conditions can always be imposed after applying a suitable fractional linear transformation to xx and possibly replacing KK by a degree-pp extension). Given such a curve CC, we want to construct a model 𝒞\mathcal{C} of CC over the ring of integers 𝒪K\mathcal{O}_{K}. In order to do so, we will construct a model 𝒳/𝒪K\mathcal{X}/\mathcal{O}_{K} of K1\mathbb{P}_{K}^{1} and let 𝒞\mathcal{C} be the normalization of 𝒳\mathcal{X} in the function field K(C)K(C); the pp-cyclic covering map CK1C\to\mathbb{P}_{K}^{1} extends to a pp-cyclic covering map 𝒞𝒳\mathcal{C}\to\mathcal{X}. It is well known that any model of K1\mathbb{P}_{K}^{1} is defined by a set of equations {x=cixi+αi}0it\{x=c_{i}x_{i}+\alpha_{i}\}_{0\leq i\leq t} for some elements αiK\alpha_{i}\in K and ciK×c_{i}\in K^{\times}; each coordinate xix_{i} corresponds to a component X¯i\bar{X}_{i} of the special fiber.

Suppose first that $p$ is not the residue characteristic of $K$. Let $\mathcal{B}\smallsetminus\{\infty\}=:\mathfrak{s}_{0},\mathfrak{s}_{1},\dots,\mathfrak{s}_{t}$ be the non-singleton clusters of $\mathcal{B}$, and for $1\leq i\leq t$, choose an element $\alpha_{i}\in\mathfrak{s}_{i}$ and a scalar $c_{i}\in K^{\times}$ satisfying $v(c_{i})=d(\mathfrak{s}_{i})$, setting $c_{0}=1$ so that $x_{0}$ is just the standard coordinate $x$. Then the desired model $\mathcal{X}/\mathcal{O}_{K}$ is defined by the set of equations $\{x_{0}=c_{i}x_{i}+\alpha_{i}\}_{1\leq i\leq t}$: one can show that the normalization $\mathcal{C}$ of $\mathcal{X}$ in the function field $K(C)$ is semistable. (In fact, with a little work, one can see that the special fiber of $\mathcal{X}$ is isomorphic over the residue field $k$ to the image of the function $R_{S}$ as defined in §2.1.) The ($K$-)points $\alpha\in\mathcal{B}\subset\mathbb{P}_{K}^{1}$ extend to $\mathcal{O}_{K}$-points $\underline{\alpha}$ of $\mathcal{X}$ which, by slight abuse of terminology, we will also refer to as elements of $\mathcal{B}$. For each index $i$, the points $\alpha\in\mathcal{B}$ intersecting the component $\bar{X}_{i}$ are exactly the elements of $\mathfrak{s}_{i}$ which do not lie in any proper non-singleton sub-cluster of $\mathfrak{s}_{i}$. It then follows from the property of being clustered in pairs that each component $\bar{X}_{i}$ of the special fiber of $\mathcal{X}$ intersects exactly $0$, $1$, or $2$ elements of $\mathcal{B}$; these cases happen respectively when $\mathfrak{s}_{i}$ is the union of $\geq 2$ even-cardinality clusters, when $\mathfrak{s}_{i}$ has odd cardinality, and when $\mathfrak{s}_{i}$ does not satisfy either of the above two properties.
In the case that a component $\bar{X}_{i}$ does not intersect any branch points, the cluster $\mathfrak{s}_{i}$ is the union of $\geq 2$ even-cardinality clusters, and there are exactly $p$ components of the special fiber of $\mathcal{C}$ (each isomorphic to $\mathbb{P}_{k}^{1}$) mapping to the component $\bar{X}_{i}$ (with no ramification) under the $p$-cyclic covering map $\mathcal{C}\to\mathcal{X}$. In the other two cases, there is exactly $1$ component $C_{i}$ of the special fiber of $\mathcal{C}$ mapping to $\bar{X}_{i}$, ramified at $2$ points, which by Riemann-Hurwitz implies that $C_{i}$ is again isomorphic to $\mathbb{P}_{k}^{1}$. In this way we see that the components of the special fiber of $\mathcal{C}$ are each isomorphic to $\mathbb{P}_{k}^{1}$ and so $C$ has split degenerate reduction over $K$.
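The tame-case recipe above can be tried out on a toy example. The following is only a minimal sketch under stated assumptions: it works over $\mathbb{Q}$ with an $\ell$-adic valuation standing in for $v$, all function names (`valuation`, `nonsingleton_clusters`, `charts`, `points_on_component`, `genus_tame_cyclic_cover`) are hypothetical, the choices $\alpha_{i}=\min\mathfrak{s}_{i}$ and $c_{i}=\ell^{d(\mathfrak{s}_{i})}$ are just one admissible choice, and the sample set is not checked against the formal definition of being clustered in pairs. The last function evaluates the Riemann-Hurwitz formula for a connected degree-$p$ cover that is totally and tamely ramified over $n$ points.

```python
from fractions import Fraction
from itertools import combinations


def valuation(x, ell):
    """ell-adic valuation of a nonzero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation of 0 is infinite")
    num, den, v = x.numerator, x.denominator, 0
    while num % ell == 0:
        num //= ell
        v += 1
    while den % ell == 0:
        den //= ell
        v -= 1
    return v


def nonsingleton_clusters(points, ell):
    """Map each non-singleton cluster (as a frozenset) to its depth."""
    clusters = {}
    for a, b in combinations(points, 2):
        depth = valuation(a - b, ell)
        # Cluster cut out by the smallest disc containing both a and b.
        s = frozenset(c for c in points
                      if c == a or valuation(c - a, ell) >= depth)
        clusters[s] = depth
    return clusters


def charts(clusters, ell):
    """For each cluster s, choose (alpha, c) with alpha in s and v(c) = d(s)."""
    return {s: (min(s), Fraction(ell) ** d) for s, d in clusters.items()}


def points_on_component(s, clusters):
    """Elements of s lying in no proper non-singleton sub-cluster of s."""
    covered = set().union(*(t for t in clusters if t < s))
    return set(s) - covered


def genus_tame_cyclic_cover(p, n):
    """Riemann-Hurwitz: 2g - 2 = -2p + n(p - 1) for a connected degree-p
    cover of the projective line totally, tamely ramified over n points."""
    twice_g = (p - 1) * (n - 2)
    if twice_g < 0 or twice_g % 2:
        raise ValueError("no such connected cover")
    return twice_g // 2


# Toy data: B consists of these five points together with infinity; ell = 5.
finite_branch_points = [0, 25, 1, 6, 2]
clusters = nonsingleton_clusters(finite_branch_points, 5)
model = charts(clusters, 5)
```

In this toy example the non-singleton clusters are $\{0,25\}$ (depth $2$), $\{1,6\}$ (depth $1$), and the full set (depth $0$), so the charts are $x_{0}=25x_{1}$ and $x_{0}=5x_{2}+1$. Calling `genus_tame_cyclic_cover(p, 2)` returns `0` for every prime `p`, matching the genus-$0$ conclusion for a component ramified at $2$ points.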

It is expected that via a more complicated construction of a semistable model $\mathcal{C}$ of $C$ exhibiting split degenerate reduction, one can prove the converse also in the wild case. For instance, it can be done for hyperelliptic curves (i.e. when $p=2$) using methods found in the author's preprint [5] (joint work with Leonardo Fiore). In fact, under the hypothesis that $\mathcal{B}$ is clustered in $2v(2)$-separated pairs and using methods in [5, §6] and in particular results from §6.4 of that paper, one may compute that the valid discs (see [5, Definition 5.11]) are precisely the discs corresponding to the points $w$ of the convex hull $\Sigma_{\mathcal{B}}\subset\mathbb{P}_{\mathbb{C}_{K}}^{1,\mathrm{an}}$ which satisfy $\delta(w,\Lambda_{(i)})=2v(2)$ for some index $i$ as well as the discs corresponding to non-distinguished vertices not lying in the tubular neighborhood $B(\Lambda_{(i)},2v(2))$ for any index $i$. (We suspect that when $p\geq 3$, an identical statement holds, with $2v(2)$ replaced by $\frac{pv(p)}{p-1}$.) In this context, a valid disc is by definition given by $\{z\in\mathbb{C}_{K}\ |\ v(z-\alpha)\geq v(c)\}$ for some $\alpha\in\mathbb{C}_{K}$ and $c\in\mathbb{C}_{K}^{\times}$ such that the corresponding coordinate $x^{\prime}$ (with $x=cx^{\prime}+\alpha$) defines one of the components of the special fiber of the desired model $\mathcal{X}$ of $\mathbb{P}_{K}^{1}$. If $\bar{X}$ is the component of the special fiber of $\mathcal{X}$ corresponding as above to a (non-distinguished) vertex of $\Sigma_{\mathcal{B}}$ (in which case the cluster $\mathfrak{s}$ corresponding to it via 3.8(b) is the union of $\geq 2$ even-cardinality sub-clusters, or is übereven as in [5, Definition 8.6]), then there are exactly $2$ components of the special fiber of $\mathcal{C}$ (each isomorphic to $\mathbb{P}_{k}^{1}$) mapping to the component $\bar{X}$ (with no ramification) under the degree-$2$ covering map $\mathcal{C}\to\mathcal{X}$.
Meanwhile, for each of the other components of the special fiber of $\mathcal{X}$, there is exactly $1$ component of the special fiber of $\mathcal{C}$ mapping to it, which is ramified at $1$ point, and [5, Proposition 4.28, Proposition 6.17(c)(d)] can be used to show that this component is also isomorphic to $\mathbb{P}_{k}^{1}$. It follows again that $C$ has split degenerate reduction over $K$.

Remark 5.3.

In the case that $g=1$, every subset $S\subset\mathbb{P}_{K}^{1}$ which is clustered in pairs is not only $p$-superelliptic but optimal by [14, Proposition 3.25], and one can readily use 1.3 to describe the cluster data of the set $\mathcal{B}$ of branch points of the resulting superelliptic curve: the convex hull $\Sigma_{S}$ contains the axes $\Lambda_{(0)},\Lambda_{(1)}$ at some distance $\delta>\frac{2v(p)}{p-1}$ apart (equivalently, there is a cluster of $S$ of cardinality $2$ and relative depth $\delta>\frac{2v(p)}{p-1}$), while the convex hull $\Sigma_{\mathcal{B}}$ contains the images of these axes at a distance of $\delta+2v(p)>\frac{2pv(p)}{p-1}$ apart (equivalently, if $\infty\in S$, the image of the cluster of $S$ of cardinality $2$ is a cluster of $\mathcal{B}$ with relative depth $\delta+2v(p)>\frac{2pv(p)}{p-1}$). 5.1 says that the set of branch points of any split degenerate cyclic $p$-cover of $\mathbb{P}_{K}^{1}$ of genus $(p-1)g$ satisfies this property. This is already known in the case of an elliptic curve (i.e. when $g=1$ and $p=2$), in which case the split degenerate reduction condition is known as split multiplicative reduction: see [5, Remark 9.5(a)] or the results of [13], for instance.
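For the reader's convenience, the passage from the bound on $\delta$ to the bound on $\delta+2v(p)$ in the remark above is the following one-line computation; it uses nothing beyond the stated inequality $\delta>\frac{2v(p)}{p-1}$.

```latex
\[
  \delta + 2v(p)
  \;>\; \frac{2v(p)}{p-1} + 2v(p)
  \;=\; \frac{2v(p) + 2(p-1)v(p)}{p-1}
  \;=\; \frac{2pv(p)}{p-1}.
\]
```

In particular, when $p$ is not the residue characteristic we have $v(p)=0$, so both bounds reduce to $\delta>0$ and the two distances coincide.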

References

  • [1] Matthew Baker. An introduction to Berkovich analytic spaces and non-archimedean potential theory on curves. p-adic Geometry (Lectures from the 2007 Arizona Winter School), AMS University Lecture Series, 45, 2008.
  • [2] Robert L. Benedetto. Dynamics in one non-archimedean variable, volume 198. American Mathematical Society, 2019.
  • [3] Tim Dokchitser, Vladimir Dokchitser, Céline Maistret, and Adam Morgan. Semistable types of hyperelliptic curves. Algebraic curves and their applications, 724:73–135, 2019.
  • [4] Tim Dokchitser, Vladimir Dokchitser, Céline Maistret, and Adam Morgan. Arithmetic of hyperelliptic curves over local fields. Mathematische Annalen, pages 1–110, 2022.
  • [5] Leonardo Fiore and Jeffrey Yelton. Clusters and semistable models of hyperelliptic curves in the wild case. arXiv preprint arXiv:2207.12490v4, 2023.
  • [6] Lothar Gerritzen and Marius van der Put. Schottky groups and Mumford curves. Springer, 2006.
  • [7] Samuel Kadziela. Rigid analytic uniformization of hyperelliptic curves. PhD thesis, University of Illinois at Urbana-Champaign, 2007.
  • [8] David Mumford. An analytic construction of degenerating curves over complete local rings. Compositio Mathematica, 24(2):129–174, 1972.
  • [9] Mihran Papikian. Non-archimedean uniformization and monodromy pairing. Tropical and non-Archimedean geometry, 605:123–160, 2013.
  • [10] John Tate. A review of non-archimedean elliptic functions. Elliptic Curves, Modular Forms and Fermat’s Last Theorem, pages 310–314, 1995.
  • [11] Guido van Steen. Galois coverings of the non-archimedean projective line. Mathematische Zeitschrift, 180:217–224, 1982.
  • [12] Guido van Steen. Non-archimedean Schottky groups and hyperelliptic curves. In Indagationes Mathematicae (Proceedings), volume 86, pages 97–109. North-Holland, 1983.
  • [13] Jeffrey Yelton. Semistable models of elliptic curves over residue characteristic 2. Canadian Mathematical Bulletin, 64(1):154–162, 2021.
  • [14] Jeffrey Yelton. Branch points of split degenerate superelliptic curves I: construction of Schottky groups. arXiv preprint arXiv:2306.17823v3, 2024.