
Bounding Lifts of Markoff Triples mod $p$

Elisa Bellah (Bellah) Department of Mathematics, Carnegie Mellon University Siran Chen (Chen) Department of Mathematics, Carnegie Mellon University Elena Fuchs (Fuchs) Department of Mathematics, University of California, Davis  and  Lynnelle Ye
Abstract.

In 2016, Bourgain, Gamburd, and Sarnak proved that Strong Approximation holds for the Markoff surface in most cases. That is, the modulo $p$ solutions to the equation $X_{1}^{2}+X_{2}^{2}+X_{3}^{2}=3X_{1}X_{2}X_{3}$ are covered by the integer solutions for most primes $p$. In this paper, we provide upper bounds on lifts of mod $p$ points of the Markoff surface by analyzing the growth along paths in the Markoff mod $p$ graphs. Our first upper bound follows the algorithm given in the paper of Bourgain, Gamburd, and Sarnak, which constructs a path of possibly long length but where points grow relatively slowly. Our second bound considers paths in these graphs of short length but possibly large growth. We then provide numerical evidence and heuristic arguments for how these bounds might be improved.

1. Introduction

The Markoff surface $\mathbb{X}$ is the affine surface in $\mathbb{A}^{3}$ given by

\[ \mathbb{X}: X_{1}^{2}+X_{2}^{2}+X_{3}^{2}=3X_{1}X_{2}X_{3}, \]

and the integer points $\mathbb{X}(\mathbb{Z})$ are called Markoff triples. Markoff first studied these triples in the context of Diophantine approximation and the theory of quadratic forms (see [12] or Chapter 2 of [3], for example). Markoff triples have since found application in other areas. For example, Cohn used these triples to study free groups on two generators (see Theorem 2 of [6]), and in [10] Hirzebruch and Zagier used Markoff triples to study the signature of certain 4-dimensional manifolds. More recently, work has been done to study the Markoff surface modulo $p$.

Define the Vieta group $\Gamma$ to be the group of affine integral morphisms on $\mathbb{A}^{3}$ generated by the permutations $\sigma_{ij}$ and the Vieta involutions $R_{i}$; that is,

\[ \sigma_{12}(x_{1},x_{2},x_{3})=(x_{2},x_{1},x_{3}), \]
\[ R_{1}(x_{1},x_{2},x_{3})=(3x_{2}x_{3}-x_{1},x_{2},x_{3}), \]

and the remaining $\sigma_{ij}$ and $R_{2},R_{3}$ are defined similarly. It is well known that the orbit of $(1,1,1)$ under $\Gamma$ gives all the nonzero Markoff triples (see [1], for example). It is expected that the same holds modulo $p$. More precisely, we have the following conjecture.

Conjecture 1.1 (Strong Approximation).

For any prime $p$,

\[ \Gamma\cdot(1,1,1)=\mathbb{X}(\mathbb{Z}/p\mathbb{Z})-\{(0,0,0)\}. \]

In Theorem 1 of [2], Bourgain, Gamburd, and Sarnak show that Strong Approximation holds for primes $p$ where $p^{2}-1$ does not have too many divisors. The authors prove this theorem by showing algorithmically that the Markoff mod $p$ graphs, defined in Section 2, are connected.
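For very small primes, Conjecture 1.1 can be checked by brute force. The following is a minimal sketch, not the method of [2]: it enumerates all nonzero solutions modulo $p$ and compares them with the orbit of $(1,1,1)$ under the coordinate swaps and Vieta involutions. The function names are hypothetical and chosen only for this illustration.

```python
from itertools import product

def markoff_points_mod_p(p):
    """All nonzero solutions of x1^2 + x2^2 + x3^2 = 3*x1*x2*x3 over Z/pZ."""
    return {
        (a, b, c)
        for a, b, c in product(range(p), repeat=3)
        if (a, b, c) != (0, 0, 0) and (a*a + b*b + c*c - 3*a*b*c) % p == 0
    }

def vieta_orbit_mod_p(p, start=(1, 1, 1)):
    """Orbit of `start` under the swaps sigma_ij and the involutions R_i, mod p."""
    seen = {start}
    stack = [start]
    while stack:
        x1, x2, x3 = stack.pop()
        neighbours = [
            (x2, x1, x3), (x3, x2, x1), (x1, x3, x2),  # sigma_12, sigma_13, sigma_23
            ((3*x2*x3 - x1) % p, x2, x3),              # R_1
            (x1, (3*x1*x3 - x2) % p, x3),              # R_2
            (x1, x2, (3*x1*x2 - x3) % p),              # R_3
        ]
        for y in neighbours:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen

def strong_approximation_holds(p):
    """True when the orbit of (1,1,1) covers every nonzero point mod p."""
    return vieta_orbit_mod_p(p) == markoff_points_mod_p(p)
```

The enumeration costs on the order of $p^{3}$ operations, so this is only practical for small $p$.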

It is conjectured that the Markoff mod $p$ graphs in fact form an expander family, and so in light of [4] these graphs have been proposed as a means to produce cryptographic hash functions. In [9], it is noted that one avenue of attack depends on the difficulty of finding lifts of Markoff triples modulo $p$.

Throughout this paper, we assume that $p>3$ is prime. For convenience, we set the following notation and terminology. We denote the set of nonzero mod $p$ points $\mathbb{X}(\mathbb{Z}/p\mathbb{Z})-\{(0,0,0)\}$ by $\mathbb{X}^{*}(p)$ and call elements $\mathbf{x}\in\mathbb{X}^{*}(p)$ Markoff mod $p$ points. We refer to the algorithm given by Bourgain, Gamburd, and Sarnak in [2] as the BGS algorithm.

Define the size of a Markoff triple $\mathbf{x}=(x_{1},x_{2},x_{3})$ by

\[ \operatorname{size}(\mathbf{x}):=\max\{x_{1},x_{2},x_{3}\}. \]

Furthermore, for a Markoff mod $p$ point $\mathbf{x}\in\mathbb{X}^{*}(p)$ we call a Markoff triple $\tilde{\mathbf{x}}$ a lift of $\mathbf{x}$ if we have $\tilde{\mathbf{x}}\equiv\mathbf{x}\pmod{p}$. The main results of this paper give bounds on the size of minimal lifts by considering “optimal” paths in the Markoff mod $p$ graphs, which are defined in Section 2 along with relevant definitions. We show the following.
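To make the notion of a minimal lift concrete, here is a small brute-force sketch. It assumes that lifts are sought among positive Markoff triples only, and it simply enumerates all such triples with coordinates up to a chosen `bound` by applying the Vieta involutions to $(1,1,1)$. The names `markoff_triples_up_to` and `minimal_lift` are hypothetical helpers for this illustration.

```python
from collections import deque

def markoff_triples_up_to(bound):
    """Ordered positive Markoff triples with every coordinate <= bound."""
    start = (1, 1, 1)
    seen = {start}
    queue = deque([start])
    while queue:
        x1, x2, x3 = queue.popleft()
        candidates = [
            (3*x2*x3 - x1, x2, x3),  # R_1
            (x1, 3*x1*x3 - x2, x3),  # R_2
            (x1, x2, 3*x1*x2 - x3),  # R_3
        ]
        for y in candidates:
            if max(y) <= bound and y not in seen:
                seen.add(y)
                queue.append(y)
    return seen

def minimal_lift(x, p, bound):
    """A lift of x mod p of smallest size among triples of size <= bound, or None."""
    target = tuple(c % p for c in x)
    lifts = [t for t in markoff_triples_up_to(bound)
             if tuple(c % p for c in t) == target]
    return min(lifts, key=max) if lifts else None
```

For example, `minimal_lift((6, 5, 1), 7, 200)` returns `(13, 5, 1)`, since no Markoff triple of size below 13 has a coordinate congruent to 6 modulo 7.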

Theorem 1.2.

Let $p$ be a prime so that $\operatorname{ord}_{p}(\operatorname{rot}_{1}^{n}(1,1,1))\geq p-1$ for $0\leq n\leq 5$ and suppose that $\mathbf{x}\in\mathbb{X}^{*}(p)$ with $\operatorname{ord}_{p}(\mathbf{x})>p^{1/2}$. Let $\tilde{\mathbf{x}}$ be a lift of $\mathbf{x}$ of minimal size. Then,

\[ \operatorname{size}(\tilde{\mathbf{x}})<(3\varepsilon)^{96(2p+1)^{4}}, \]

where $\varepsilon=(3+\sqrt{5})/2$.
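To get a feel for the magnitude of the bound in Theorem 1.2, the following sketch evaluates the exponent $96(2p+1)^{4}$ and the base-10 logarithm of $(3\varepsilon)^{96(2p+1)^{4}}$. The function names are illustrative assumptions, not notation from the paper.

```python
import math

EPSILON = (3 + math.sqrt(5)) / 2

def theorem_1_2_exponent(p):
    """The exponent 96 * (2p + 1)^4 appearing in Theorem 1.2."""
    return 96 * (2 * p + 1) ** 4

def theorem_1_2_log10_bound(p):
    """log10 of the bound (3 * epsilon)^(96 * (2p + 1)^4)."""
    return theorem_1_2_exponent(p) * math.log10(3 * EPSILON)
```

Already for $p=5$ the exponent is $1405536$, so the bound has more than a million decimal digits.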

In Section 4, we give numerical evidence to show that the conditions in Theorem 1.2 appear to be satisfied for many primes $p$ and approximately 80% of Markoff mod $p$ points. Furthermore, we provide direction for how these results might be obtained theoretically, and for obtaining upper bounds on the sizes of smallest lifts for the remaining points $\mathbf{x}$ in $\mathbb{X}^{*}(p)$ with $\operatorname{ord}_{p}(\mathbf{x})\leq p^{1/2}$.

Our second upper bound is conditional on Strong Approximation, and depends on the expansion constant $h(p)$ of the Markoff mod $p$ graph $\mathcal{G}_{p}$, but holds for all points $\mathbf{x}$ in $\mathbb{X}^{*}(p)$. It is expected that the family of Markoff mod $p$ graphs $(\mathcal{G}_{p})_{p}$ forms an expander family, and so this bound is expected to be uniform (see the discussion on Super Strong Approximation in Section 5). We show the following.

Theorem 1.3.

Let $p$ be any prime where Strong Approximation holds, and let $h(p)$ be the expansion constant of the Markoff graph $\mathcal{G}_{p}$. For $\mathbf{x}\in\mathbb{X}^{*}(p)$, let $\tilde{\mathbf{x}}$ be a lift of $\mathbf{x}$ of minimal size. Then,

\[ \operatorname{size}(\tilde{\mathbf{x}})<(3\varepsilon)^{\alpha}, \]

where $\varepsilon=(3+\sqrt{5})/2$ and

\[ \alpha=\left(\frac{p^{3}+3}{2}\right)^{20/\log\left(1+\frac{h(p)}{3}\right)}. \]

This paper is organized as follows. In Section 2 we introduce special elements of the Vieta group, called rotations, and give an explicit upper bound for the growth of Markoff triples under the action of the group of rotations. In Sections 2 and 3 we give the necessary background and an explicit description of the algorithm given in the paper [2] of Bourgain, Gamburd, and Sarnak on which Theorem 1.2 rests, and provide several alternate proofs to those given in [2] using the language of linear recurrence sequences. In Section 4 we prove Theorem 1.2 and give numerical evidence and a heuristic argument for how one might relax the conditions of this theorem. In Section 5 we prove Theorem 1.3 and give evidence for how this bound might be improved on average. Finally, in Section 6 we discuss an alternate approach to finding paths in the Markoff mod $p$ graphs which could be used for further improvements to the bounds given in Theorems 1.2 and 1.3.

2. Rotations and the Markoff mod $p$ Graphs

To analyze lifts, we consider the following special elements in the Vieta group $\Gamma$ as introduced in [2].

Definition 2.1.

The rotations of $\Gamma$ are the elements $\operatorname{rot}_{i}$ given by

\[ \operatorname{rot}_{1}=\sigma_{23}R_{2},\quad \operatorname{rot}_{2}=\sigma_{13}R_{1},\quad\text{and}\quad \operatorname{rot}_{3}=\sigma_{12}R_{1}. \]

Explicitly, we have

\[ \operatorname{rot}_{1}(x_{1},x_{2},x_{3})=(x_{1},x_{3},3x_{1}x_{3}-x_{2}), \]
\[ \operatorname{rot}_{2}(x_{1},x_{2},x_{3})=(x_{3},x_{2},3x_{2}x_{3}-x_{1}), \]
\[ \operatorname{rot}_{3}(x_{1},x_{2},x_{3})=(x_{2},3x_{2}x_{3}-x_{1},x_{3}). \]

For a prime $p$, the Markoff mod $p$ graph $\mathcal{G}_{p}$ is defined to be the graph with vertex set $\mathbb{X}^{*}(p)$ and edges $(\mathbf{x},\operatorname{rot}_{i}\mathbf{x})$ for $i\in\{1,2,3\}$.
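The explicit formulas for the rotations translate directly into code. The sketch below is an illustration only; `rot`, `is_markoff`, and `neighbours` are hypothetical names. It implements the three rotations over $\mathbb{Z}$ or modulo $p$, and lists the neighbours of a vertex in $\mathcal{G}_{p}$.

```python
def is_markoff(x, p=None):
    """Check the Markoff equation over Z (p is None) or modulo p."""
    x1, x2, x3 = x
    value = x1*x1 + x2*x2 + x3*x3 - 3*x1*x2*x3
    return value == 0 if p is None else value % p == 0

def rot(i, x, p=None):
    """Apply rot_i to x, over Z when p is None and modulo p otherwise."""
    x1, x2, x3 = x
    if i == 1:
        y = (x1, x3, 3*x1*x3 - x2)
    elif i == 2:
        y = (x3, x2, 3*x2*x3 - x1)
    elif i == 3:
        y = (x2, 3*x2*x3 - x1, x3)
    else:
        raise ValueError("i must be 1, 2 or 3")
    return y if p is None else tuple(c % p for c in y)

def neighbours(x, p):
    """Endpoints of the edges (x, rot_i x) of the Markoff mod p graph."""
    return [rot(i, x, p) for i in (1, 2, 3)]
```

For instance, `rot(1, (1, 1, 1))` gives `(1, 1, 2)`, and applying `rot(2, ...)` to that gives `(2, 1, 5)`.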

In Figure 1 we give an example of the Markoff mod $p$ graph $\mathcal{G}_{p}$ when $p=31$ to demonstrate the complexity of the path structure for these graphs. In this figure, nodes are colored depending on their rotation order, with blue nodes corresponding to points in the cage (defined in Section 3.1).

Figure 1. The Markoff mod $p$ graph $\mathcal{G}_{p}$ when $p=31$.

2.1. Sizes under action of rotations

It is known that in most cases, the Markoff mod $p$ graph $\mathcal{G}_{p}$ is connected (see [2] and [5]). Observe that when $\mathcal{G}_{p}$ is connected, we can construct lifts of points in $\mathbb{X}^{*}(p)$ by finding paths between $(1,1,1)$ and points $\mathbf{x}\in\mathcal{G}_{p}$. Proposition 2.3 will tell us that finding small lifts modulo $p$ can then be done by finding “optimal” paths in $\mathcal{G}_{p}$. Observe that we have the following.
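The path-to-lift idea can be sketched as follows: search for a shortest path of rotations from $(1,1,1)$ to the target modulo $p$, then replay the same word of rotations over $\mathbb{Z}$. This is an illustrative breadth-first search with a hypothetical function name, not the BGS algorithm, and the lift it returns is not claimed to be minimal.

```python
from collections import deque

def rot(i, x, p=None):
    """rot_i over Z (p is None) or modulo p."""
    x1, x2, x3 = x
    y = {1: (x1, x3, 3*x1*x3 - x2),
         2: (x3, x2, 3*x2*x3 - x1),
         3: (x2, 3*x2*x3 - x1, x3)}[i]
    return y if p is None else tuple(c % p for c in y)

def lift_via_shortest_path(target, p):
    """Return (word, lift), where word lists rotation indices in order of application.

    Returns None when the target is not reachable from (1,1,1) by rotations mod p.
    """
    start = (1, 1, 1)
    target = tuple(c % p for c in target)
    parent = {start: None}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        if x == target:
            break
        for i in (1, 2, 3):
            y = rot(i, x, p)
            if y not in parent:
                parent[y] = (x, i)
                queue.append(y)
    if target not in parent:
        return None
    word = []
    x = target
    while parent[x] is not None:
        x, i = parent[x]
        word.append(i)
    word.reverse()
    lift = start
    for i in word:
        lift = rot(i, lift)  # replay the same rotations over the integers
    return word, lift
```

Because each rotation commutes with reduction modulo $p$, the integer triple obtained this way always reduces to the target.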

Lemma 2.2.

Given $(x_{1},x_{2},x_{3})\in\mathbb{X}^{*}(p)$ we have

\[ \operatorname{rot}_{1}^{n}(x_{1},x_{2},x_{3})=(x_{1},a_{n},a_{n+1}), \]

where $a_{n}$ is the linear recurrence sequence with initial conditions $a_{0}=x_{2}$, $a_{1}=x_{3}$ and recurrence

\[ a_{n+2}=3x_{1}a_{n+1}-a_{n}. \]

Similarly, for distinct $i,j,k$ we have $\operatorname{rot}_{i}^{n}(x_{1},x_{2},x_{3})=\sigma(x_{i},a_{n},a_{n+1})$, where the sequence $a_{n}$ has initial conditions given by the other two coordinates $x_{j},x_{k}$ and recurrence $a_{n+2}=3x_{i}a_{n+1}-a_{n}$, and $\sigma$ is a suitable permutation.
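Lemma 2.2 is easy to confirm numerically for $\operatorname{rot}_{1}$. The sketch below compares repeated application of $\operatorname{rot}_{1}$ with the recurrence $a_{n+2}=3x_{1}a_{n+1}-a_{n}$ over the integers; the helper names are assumptions made for this example.

```python
def rot1(x):
    """rot_1 over the integers."""
    x1, x2, x3 = x
    return (x1, x3, 3*x1*x3 - x2)

def rot1_power(x, n):
    """Apply rot_1 to x exactly n times."""
    for _ in range(n):
        x = rot1(x)
    return x

def recurrence_terms(x, n):
    """Return [a_0, ..., a_{n+1}] with a_0 = x2, a_1 = x3, a_{k+2} = 3*x1*a_{k+1} - a_k."""
    x1, x2, x3 = x
    a = [x2, x3]
    while len(a) < n + 2:
        a.append(3*x1*a[-1] - a[-2])
    return a
```

With these helpers, `rot1_power(x, n)` should equal `(x[0], a[n], a[n+1])` for `a = recurrence_terms(x, n)`.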

This observation allows us to analyze the growth of points obtained from $(1,1,1)$ by the action of $\langle\operatorname{rot}_{1},\operatorname{rot}_{2},\operatorname{rot}_{3}\rangle$. We have the following.

Proposition 2.3.

Let $n_{j}\in\mathbb{Z}_{\geq 1}$ and $i_{j}\in\{1,2,3\}$ for $1\leq j\leq s$. Then we have

\[ \operatorname{size}(\operatorname{rot}_{i_{s}}^{n_{s}}\cdots\operatorname{rot}_{i_{1}}^{n_{1}}(1,1,1))\leq(3\varepsilon)^{2^{s-1}(n_{1}+1)\cdots(n_{s}+1)}, \]

where $\varepsilon=\frac{3+\sqrt{5}}{2}$.
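The bound in Proposition 2.3 can be compared with actual sizes for short words. The following sketch applies a word of rotation powers to $(1,1,1)$ over $\mathbb{Z}$ and checks the inequality after taking logarithms. Representing a word as a list of pairs `(i_j, n_j)`, applied from first to last, is an assumption of this illustration.

```python
import math

EPSILON = (3 + math.sqrt(5)) / 2

def rot(i, x):
    """rot_i over the integers."""
    x1, x2, x3 = x
    return {1: (x1, x3, 3*x1*x3 - x2),
            2: (x3, x2, 3*x2*x3 - x1),
            3: (x2, 3*x2*x3 - x1, x3)}[i]

def apply_word(word, x=(1, 1, 1)):
    """Apply rot_{i_1}^{n_1} first, then rot_{i_2}^{n_2}, and so on."""
    for i, n in word:
        for _ in range(n):
            x = rot(i, x)
    return x

def log_bound(word):
    """Natural log of (3*eps)^(2^(s-1) * (n_1+1) * ... * (n_s+1))."""
    s = len(word)
    exponent = 2 ** (s - 1) * math.prod(n + 1 for _, n in word)
    return exponent * math.log(3 * EPSILON)

def satisfies_bound(word):
    """Check size(word applied to (1,1,1)) <= bound, on a log scale."""
    return math.log(max(apply_word(word))) <= log_bound(word)
```

For the word `[(1, 2), (2, 3), (3, 1)]` the resulting triple is `(2, 5741, 985)`, while the exponent in the bound is $2^{2}\cdot 3\cdot 4\cdot 2=96$, which illustrates how far the bound overshoots.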

Note that an exponential lower bound can be derived using a similar method to that outlined below, and so we obtain a result similar to that of Zagier in [13], which bounds the growth of sizes in the Markoff tree. Furthermore, we see that switching between rotations contributes doubly exponentially in growth, while traveling along a single rotation only contributes exponentially. Theorem 1.2 follows the path constructed by Bourgain, Gamburd and Sarnak in [2] which has a relatively small number of switches between different rotations, but possibly long path lengths along each orbit. Theorem 1.3 instead considers paths of shortest possible length, but possibly many switches between distinct rotations.

Proof of Proposition 2.3.

Let $\mathbf{x}=(x_{1},x_{2},x_{3})$ be a Markoff triple, and without loss of generality set $x=x_{1}$ and suppose that $x>0$. Let $a_{n}$ denote the linear recurrence sequence defined in Lemma 2.2, and let $\varepsilon_{x},\bar{\varepsilon}_{x}$ denote the characteristic roots of $a_{n}$. That is, $\varepsilon_{x},\bar{\varepsilon}_{x}$ are the roots of the minimal polynomial of the sequence $a_{n}$, which is given by

\[ f_{x}(T)=T^{2}-3xT+1. \]

Note that $f_{x}$ has distinct positive real roots, so we can set $\varepsilon_{x}>\bar{\varepsilon}_{x}$. Since $a_{n}$ is a linear recurrence sequence we can write

\[ a_{n}=A\varepsilon_{x}^{n}+B\bar{\varepsilon}_{x}^{n} \]

for some $A,B\in\mathbb{Q}(\varepsilon_{x})$. Using our initial conditions we have

\[ x_{2}=A+B\text{ and }x_{3}=A\varepsilon_{x}+B\bar{\varepsilon}_{x}, \]

and so solving for $A$ and $B$ gives

\begin{align*}
a_{n+1} &= \frac{(x_{3}-x_{2}\bar{\varepsilon}_{x})\varepsilon_{x}^{n+1}+(\varepsilon_{x}x_{2}-x_{3})\bar{\varepsilon}_{x}^{n+1}}{\varepsilon_{x}-\bar{\varepsilon}_{x}} \\
&= x_{3}\cdot\frac{\varepsilon_{x}^{n+1}-\bar{\varepsilon}_{x}^{n+1}}{\varepsilon_{x}-\bar{\varepsilon}_{x}}-x_{2}\cdot\frac{\varepsilon_{x}^{n}-\bar{\varepsilon}_{x}^{n}}{\varepsilon_{x}-\bar{\varepsilon}_{x}}, \tag{2.1}
\end{align*}

where the second equality uses that $\varepsilon_{x}\bar{\varepsilon}_{x}=1$.

We induct on $s\in\mathbb{Z}_{\geq 1}$. When $s=1$, Equation (2.1) gives

\begin{align*}
\operatorname{size}(\operatorname{rot}_{i_{1}}^{n_{1}}(1,1,1)) &< \frac{\varepsilon^{n_{1}+1}-\bar{\varepsilon}^{n_{1}+1}}{\varepsilon-\bar{\varepsilon}} \\
&< \frac{\varepsilon^{n_{1}+1}}{\varepsilon-\bar{\varepsilon}} \\
&< \varepsilon^{n_{1}+1},
\end{align*}

where $\varepsilon=(3+\sqrt{5})/2$, as desired.

Next, suppose that

\[ \mathbf{x}=\operatorname{rot}_{i_{s-1}}^{n_{s-1}}\cdots\operatorname{rot}_{i_{1}}^{n_{1}}(1,1,1) \]

and let $x$ denote the $i_{s}$th coordinate of $\mathbf{x}$. We have

\begin{align*}
\operatorname{size}(\operatorname{rot}_{i_{s}}^{n_{s}}(\mathbf{x})) &< \operatorname{size}(\mathbf{x})\,\varepsilon_{x}^{n_{s}+1},\text{ by Equation (2.1)} \\
&= \operatorname{size}(\mathbf{x})\cdot\left(\frac{3x+\sqrt{(3x)^{2}-4}}{2}\right)^{n_{s}+1} \\
&< \operatorname{size}(\mathbf{x})\cdot\left(\frac{3x+\sqrt{(3x)^{2}}}{2}\right)^{n_{s}+1} \\
&\leq \operatorname{size}(\mathbf{x})\cdot(3\operatorname{size}(\mathbf{x}))^{n_{s}+1},
\end{align*}

and so the result follows by induction. ∎

Remark 2.4.

In Figure 2, the sizes of points

\[ \operatorname{rot}_{i_{s}}\cdots\operatorname{rot}_{i_{1}}(1,1,1) \]

are graphed with the number of rotations $s$ on the horizontal axis. Observe that our upper bound in Proposition 2.3 overshoots the growth of these sizes. However, through our experimentation it appears that a tight upper bound will still depend doubly exponentially on $s$ and exponentially on the $n_{i}$. Improvements to this upper bound or derivation of such an asymptotic would improve the results of this paper, but we expect the true upper bound to still depend on a balance of finding paths in $\mathcal{G}_{p}$ which are short (i.e. minimize the $n_{i}$, which we expect contribute exponentially to the asymptotic growth as in Proposition 2.3) but switch between distinct rotations minimally (i.e. minimize $s$, which we expect contributes doubly exponentially to the asymptotic growth as in Proposition 2.3).

Figure 2. Size of $\operatorname{rot}_{i_{s}}\cdots\operatorname{rot}_{i_{1}}(1,1,1)$.

2.2. Rotation Orders and Conic Sections

The BGS algorithm constructs paths in the Markoff mod $p$ graph by analyzing the orbits under the rotations, defined in Definition 2.1. We present the analysis of these orbits as in [2], giving alternate proofs of several results.

Definition 2.5.

For $\mathbf{x}=(x_{1},x_{2},x_{3})\in\mathbb{X}^{*}(p)$, we define the following.

  (1) The $i$th rotation order of $\mathbf{x}$ is given by

      \[ \operatorname{ord}_{p,i}(\mathbf{x}):=\min\{n\in\mathbb{Z}_{>0}\mid\operatorname{rot}_{i}^{n}(\mathbf{x})\equiv\mathbf{x}\pmod{p}\}. \]

  (2) The rotation order of $\mathbf{x}$ is given by

      \[ \operatorname{ord}_{p}(\mathbf{x}):=\max\{\operatorname{ord}_{p,i}(\mathbf{x})\mid i=1,2,3\}. \]

If $\operatorname{ord}_{p}(\mathbf{x})=\operatorname{ord}_{p,i}(\mathbf{x})$, then we call $i$ the maximal index of $\mathbf{x}$.

Observe that for distinct $i,j,k$, $\operatorname{rot}_{i}$ acts on $(x_{j},x_{k})$, and we can write

\[ \operatorname{rot}_{i}(x_{i})\begin{pmatrix}x_{j}\\ x_{k}\end{pmatrix}=\begin{pmatrix}0&1\\ -1&3x_{i}\end{pmatrix}\begin{pmatrix}x_{k}\\ x_{j}\end{pmatrix}. \]

So, the $i$th rotation order of $\mathbf{x}$ is equal to the order of

\[ \begin{pmatrix}0&1\\ -1&3x_{i}\end{pmatrix} \]

in $\operatorname{SL}_{2}(\mathbb{F}_{p})$. Note in particular that $\operatorname{ord}_{p,i}(\mathbf{x})$ only depends on the $i$th coordinate of $\mathbf{x}$.
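The identification of the rotation order with a matrix order can be checked directly for small primes. The sketch below computes $\operatorname{ord}_{p,i}(\mathbf{x})$ by iterating $\operatorname{rot}_{i}$ and, separately, the order of the $2\times 2$ matrix above by repeated multiplication modulo $p$. All function names are hypothetical.

```python
from itertools import product

def markoff_points_mod_p(p):
    """All nonzero Markoff points modulo p."""
    return [(a, b, c) for a, b, c in product(range(p), repeat=3)
            if (a, b, c) != (0, 0, 0) and (a*a + b*b + c*c - 3*a*b*c) % p == 0]

def rot(i, x, p):
    """rot_i modulo p."""
    x1, x2, x3 = x
    y = {1: (x1, x3, 3*x1*x3 - x2),
         2: (x3, x2, 3*x2*x3 - x1),
         3: (x2, 3*x2*x3 - x1, x3)}[i]
    return tuple(c % p for c in y)

def rotation_order(i, x, p):
    """Smallest n > 0 with rot_i^n(x) = x mod p (x must be reduced mod p)."""
    y, n = rot(i, x, p), 1
    while y != x:
        y, n = rot(i, y, p), n + 1
    return n

def mat_mul(A, B, p):
    """Product of two 2x2 matrices modulo p."""
    return tuple(
        tuple(sum(A[r][k] * B[k][c] for k in range(2)) % p for c in range(2))
        for r in range(2)
    )

def matrix_order(a, p):
    """Order of [[0, 1], [-1, 3a]] in SL_2(F_p)."""
    A = ((0, 1), (p - 1, (3 * a) % p))
    identity = ((1, 0), (0, 1))
    M, n = A, 1
    while M != identity:
        M, n = mat_mul(M, A, p), n + 1
    return n
```

For example, with $p=7$ the point $(1,1,1)$ has `rotation_order(1, (1, 1, 1), 7) == 8`, which agrees with `matrix_order(1, 7)`.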

Definition 2.6.

We set the following notation. For $\mathbf{x}=(x_{1},x_{2},x_{3})\in\mathbb{X}^{*}(p)$, let $M_{\mathbf{x},i}:=\{\operatorname{rot}_{i}^{n}(\mathbf{x})\mid n\in\mathbb{Z}\}$ be the orbit of $\mathbf{x}$ under $\operatorname{rot}_{i}$. Furthermore, let

\[ A_{\mathbf{x},i}:=\begin{pmatrix}0&1\\ -1&3x_{i}\end{pmatrix} \]

be the matrix discussed above, and denote the characteristic polynomial of $A_{\mathbf{x},i}$ by $f_{\mathbf{x},i}$ and the discriminant of $f_{\mathbf{x},i}$ by $\Delta_{\mathbf{x},i}$. In our analysis below, we will sometimes only be concerned with a single coordinate $x$ of a Markoff mod $p$ point. In this case, we will instead use the notation $M_{x},A_{x},f_{x}$ and $\Delta_{x}$, respectively.

Note that the action of $\operatorname{rot}_{i}$ on a Markoff triple $\mathbf{x}$ leaves the $i$th coordinate of $\mathbf{x}$ fixed, and so the orbits $M_{\mathbf{x},i}$ correspond to points inside of some conic section with discriminant $\Delta_{\mathbf{x},i}$. Given $a\in\mathbb{Z}/p\mathbb{Z}$ we use the notation

\[ C_{i}(a):=\{(x_{1},x_{2},x_{3})\in\mathbb{X}^{*}(p)\mid x_{i}=a\} \]

to denote the conic section consisting of Markoff mod $p$ points with $i$th coordinate equal to $a$. Note that we may also use the notation $C_{i}(\mathbf{x})$ to indicate the conic section cut out by fixing the $i$th coordinate of $\mathbf{x}$. Accordingly, we have the following definition.

Definition 2.7.

Let $\mathbf{x}=(x_{1},x_{2},x_{3})\in\mathbb{X}^{*}(p)$ be a Markoff mod $p$ point with $i$th coordinate equal to $x$. We say that $x$ is

  (1) parabolic if $\Delta_{x}\equiv 0\pmod{p}$,

  (2) hyperbolic if $\Delta_{x}$ is a nonzero square modulo $p$, and

  (3) elliptic if $\Delta_{x}$ is not a square modulo $p$.

We have the following observation.

Lemma 2.8.

Let $\mathbf{x}=(x_{1},x_{2},x_{3})$ be a Markoff mod $p$ point, and set $x=x_{i}$. Let $\varepsilon_{x}$ be a root of $f_{x}$. If $x$ is not parabolic, then $\operatorname{ord}_{p,i}(\mathbf{x})$ is equal to the order of $\varepsilon_{x}$ in $\mathbb{F}^{\times}$, where

\[ \mathbb{F}=\begin{cases}\mathbb{F}_{p}&\text{ when $x$ is hyperbolic}\\ \mathbb{F}_{p^{2}}&\text{ when $x$ is elliptic}\end{cases} \]

under the identification $\mathbb{F}_{p^{2}}\cong\mathbb{F}_{p}[T]/(T^{2}-\Delta_{x})$.

Proof.

When $x$ is hyperbolic or elliptic, $A_{x}$ is diagonalizable over $\mathbb{F}$ with eigenvalues given by $\varepsilon_{x}$ and its conjugate $\bar{\varepsilon}_{x}$. Since $\varepsilon_{x},\bar{\varepsilon}_{x}$ are roots of

\[ f_{x}(T)=T^{2}-3xT+1, \]

we have $\bar{\varepsilon}_{x}=\varepsilon_{x}^{-1}$. This gives

\[ A_{x}^{n}=P\begin{pmatrix}\varepsilon_{x}^{n}&0\\ 0&\varepsilon_{x}^{-n}\end{pmatrix}P^{-1} \]

where $P\in\operatorname{GL}_{2}(\mathbb{F})$. So, the order of $A_{x}$ is equal to the order of $\varepsilon_{x}$ in $\mathbb{F}^{\times}$. ∎

The following Proposition can be found in Lemma 3 of [2]. We present an alternate proof of this result here.

Proposition 2.9.

Let $p>3$ be a prime and let $\mathbf{x}=(x_{1},x_{2},x_{3})\in\mathbb{X}^{*}(p)$. We have

\[ \operatorname{ord}_{p,i}(\mathbf{x})\text{ divides }\begin{cases}p-1&\text{ if $x_{i}$ is hyperbolic}\\ p+1&\text{ if $x_{i}$ is elliptic}.\end{cases} \]

If $x_{i}$ is parabolic, then $x_{i}=\pm 2/3$ and we have

\[ \operatorname{ord}_{p,i}(\mathbf{x})=\begin{cases}2p&\text{ if }x_{i}=-2/3\\ p&\text{ if }x_{i}=2/3.\end{cases} \]
Proof.

For convenience, set $x=x_{i}$ and $\varepsilon=\varepsilon_{x}$. If $x$ is hyperbolic, then the result follows directly from Lemma 2.8. So suppose that $x_{i}$ is elliptic. Since

\[ \varepsilon+\bar{\varepsilon}=3x_{i}\in\mathbb{F}_{p}, \]

we get

\[ (\varepsilon+\bar{\varepsilon})^{p}=\varepsilon+\bar{\varepsilon}, \]

and since $\bar{\varepsilon}=\varepsilon^{-1}$ we have

\[ \varepsilon^{p}+\varepsilon^{-p}=\varepsilon+\varepsilon^{-1}. \]

So, by Lemma 2.10 below, we have $\varepsilon=\varepsilon^{p}$ or $\varepsilon=\varepsilon^{-p}$. But if $\varepsilon=\varepsilon^{p}$ then $\varepsilon^{p-1}=1$, which implies $\varepsilon\in\mathbb{F}_{p}^{\times}$, contradicting $x_{i}$ being elliptic. So we must have $\varepsilon=\varepsilon^{-p}$, which gives $\varepsilon^{p+1}=1$. Hence, the order of $\varepsilon$ divides $p+1$ and the result follows from Lemma 2.8.

Finally, suppose that $x$ is parabolic. Then $\Delta_{x}\equiv 0\pmod{p}$, which gives

\[ (3x)^{2}-4\equiv 0\pmod{p}\Rightarrow x\equiv\pm 2/3\pmod{p}. \]

Note that in the parabolic case $A_{x}$ has Jordan normal form

\[ A_{x}=P\begin{pmatrix}\varepsilon&1\\ 0&\varepsilon\end{pmatrix}P^{-1}, \]

for some $P\in\operatorname{GL}_{2}(\mathbb{Z}/p\mathbb{Z})$ and so

\[ A_{x}^{n}=P\begin{pmatrix}\varepsilon^{n}&n\varepsilon^{n-1}\\ 0&\varepsilon^{n}\end{pmatrix}P^{-1}. \tag{2.2} \]

Recalling that $\varepsilon_{x}$ denotes a root of the characteristic polynomial of $A_{x}$, we can compute

\[ \varepsilon_{x}=\begin{cases}1&\text{ if }x=2/3\\ -1&\text{ if }x=-2/3.\end{cases} \]

So, the result for the order of parabolic points follows from Equation (2.2) and Lemma 2.8. ∎
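Proposition 2.9 can be verified exhaustively for small primes. The sketch below classifies each $x\in\mathbb{F}_{p}$ using Euler's criterion on $\Delta_{x}=9x^{2}-4$ and compares the order of $A_{x}$ with the predicted divisibility. It is a numerical check under these assumptions rather than part of the proof, and the function names are hypothetical.

```python
def classify(x, p):
    """Return 'parabolic', 'hyperbolic' or 'elliptic' from Delta_x = 9x^2 - 4 mod p."""
    delta = (9 * x * x - 4) % p
    if delta == 0:
        return "parabolic"
    # Euler's criterion: delta is a nonzero square iff delta^((p-1)/2) = 1 mod p.
    return "hyperbolic" if pow(delta, (p - 1) // 2, p) == 1 else "elliptic"

def matrix_order(x, p):
    """Order of A_x = [[0, 1], [-1, 3x]] in SL_2(F_p), by repeated multiplication."""
    a, b, c, d = 0, 1, p - 1, (3 * x) % p
    m = (a, b, c, d)
    n = 1
    while m != (1, 0, 0, 1):
        m = ((m[0]*a + m[1]*c) % p, (m[0]*b + m[1]*d) % p,
             (m[2]*a + m[3]*c) % p, (m[2]*b + m[3]*d) % p)
        n += 1
    return n

def check_proposition_2_9(p):
    """Check the order statement for every x in F_p (p > 3 prime)."""
    two_thirds = (2 * pow(3, -1, p)) % p
    for x in range(p):
        kind, n = classify(x, p), matrix_order(x, p)
        if kind == "hyperbolic":
            ok = (p - 1) % n == 0
        elif kind == "elliptic":
            ok = (p + 1) % n == 0
        else:
            ok = n == (p if x == two_thirds else 2 * p)
        if not ok:
            return False
    return True
```

The modular inverse `pow(3, -1, p)` requires Python 3.8 or later.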

Lemma 2.10.

If $a+a^{-1}=b+b^{-1}$ for nonzero elements $a,b$ in a field $\mathbb{F}$ with $\operatorname{char}(\mathbb{F})\neq 2$ then we have $a=b$ or $a=b^{-1}$.

Proof.

If $a+a^{-1}=b+b^{-1}$ then multiplying both sides by $ab$ gives

\[ a^{2}b+b=ab^{2}+a\Rightarrow a^{2}b-a(b^{2}+1)+b=0. \]

Since $\operatorname{char}(\mathbb{F})\neq 2$ the quadratic formula gives the desired result. ∎

Definition 2.11.

Let $\mathbf{x}\in\mathbb{X}^{*}(p)$.

  (1) If $\operatorname{ord}_{p,i}(\mathbf{x})=p-1$ then we say $x_{i}$ is maximal hyperbolic.

  (2) If $\operatorname{ord}_{p,i}(\mathbf{x})=p+1$ then we say $x_{i}$ is maximal elliptic.

  (3) If $\operatorname{ord}_{p,i}(\mathbf{x})=2p$ then we say $x_{i}$ is maximal parabolic.

A triple $\mathbf{x}\in\mathbb{X}^{*}(p)$ will be called maximal (hyperbolic, elliptic, or parabolic) if one of its coordinates is maximal hyperbolic, elliptic, or parabolic, respectively.

2.3. Connection to Lucas Sequences

In Equation (2.1), we saw that the orbit of points $\mathbf{x}$ in $\mathbb{X}^{*}(p)$ under a single rotation can be described in terms of a corresponding Lucas sequence. We prove the following identity giving this relation explicitly for any Markoff mod $p$ point, which may be useful for future study, and is referenced in Section 3.5 as a possible direction to relax the conditions of Theorem 1.2.

Lemma 2.12.

Let $\mathbf{x}\in\mathbb{X}^{*}(p)$ have $i$th coordinate $x$. Then,

\[ A_{x}^{n}=\begin{pmatrix}-u_{n-1}&u_{n}\\ -u_{n}&u_{n+1}\end{pmatrix}, \]

where $A_{x}$ is the matrix defined in Definition 2.6 and $u_{n}$ is the Lucas sequence with integer parameters $(3x,1)$. That is, $u_{n}$ is the linear recurrence sequence with initial conditions $u_{0}=0$, $u_{1}=1$ and recurrence $u_{n+2}=3xu_{n+1}-u_{n}$.

Proof.

For convenience, set $x=x_{i}$. Observe that

\[ A_{x}=\begin{pmatrix}0&1\\ -1&3x\end{pmatrix}=\begin{pmatrix}3x&-1\\ 1&0\end{pmatrix}^{-1}. \]

The matrix on the right-hand side gives a familiar Lucas sequence identity. We have

\begin{align*}
\begin{pmatrix}0&1\\ -1&3x\end{pmatrix}^{n} &= \begin{pmatrix}u_{n+1}&-u_{n}\\ u_{n}&-u_{n-1}\end{pmatrix}^{-1} \\
&= \frac{1}{u_{n}^{2}-u_{n+1}u_{n-1}}\begin{pmatrix}-u_{n-1}&u_{n}\\ -u_{n}&u_{n+1}\end{pmatrix}.
\end{align*}

Now, let $\varepsilon,\bar{\varepsilon}$ be the roots of

\[ f(T)=T^{2}-3xT+1. \]

Then, we can write

\[ u_{n}=\frac{\varepsilon^{n}-\bar{\varepsilon}^{n}}{\varepsilon-\bar{\varepsilon}} \]

and we note that $\varepsilon\bar{\varepsilon}=1$. So we have

\begin{align*}
u_{n}^{2}-u_{n+1}u_{n-1} &= \left(\frac{\varepsilon^{n}-\bar{\varepsilon}^{n}}{\varepsilon-\bar{\varepsilon}}\right)^{2}-\left(\frac{\varepsilon^{n+1}-\bar{\varepsilon}^{n+1}}{\varepsilon-\bar{\varepsilon}}\right)\left(\frac{\varepsilon^{n-1}-\bar{\varepsilon}^{n-1}}{\varepsilon-\bar{\varepsilon}}\right) \\
&= \frac{1}{(\varepsilon-\bar{\varepsilon})^{2}}\left((\varepsilon^{2n}+\bar{\varepsilon}^{2n}-2)-(\varepsilon^{2n}+\bar{\varepsilon}^{2n}-(\varepsilon^{2}+\bar{\varepsilon}^{2}))\right) \\
&= 1,
\end{align*}

as required. ∎
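The identity of Lemma 2.12 can also be confirmed over the integers by direct computation. The following sketch, with hypothetical helper names, builds the Lucas sequence with parameters $(3x,1)$ and compares the predicted matrix with $A_{x}^{n}$ computed by repeated multiplication, for $n\geq 1$.

```python
def lucas_u(x, n_max):
    """Return [u_0, ..., u_{n_max}] with u_0 = 0, u_1 = 1, u_{k+2} = 3x*u_{k+1} - u_k."""
    u = [0, 1]
    while len(u) <= n_max:
        u.append(3 * x * u[-1] - u[-2])
    return u

def mat_mul(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )

def power_of_A(x, n):
    """A_x^n for A_x = [[0, 1], [-1, 3x]], by repeated multiplication."""
    A = ((0, 1), (-1, 3 * x))
    M = ((1, 0), (0, 1))
    for _ in range(n):
        M = mat_mul(M, A)
    return M

def lucas_matrix(x, n):
    """The matrix [[-u_{n-1}, u_n], [-u_n, u_{n+1}]] from Lemma 2.12 (n >= 1)."""
    u = lucas_u(x, n + 1)
    return ((-u[n - 1], u[n]), (-u[n], u[n + 1]))
```

Reducing both sides modulo $p$ gives the statement for Markoff mod $p$ points.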

3. The BGS Algorithm

The algorithm outlined in [2] constructs a path between any two points in the Markoff mod $p$ graph by connecting them through the cage, which we discuss below. The BGS algorithm then guarantees a path between any two points in the cage which contains at most two distinct rotations. In light of Proposition 2.3, the paths constructed in the BGS algorithm are a good candidate for small lifts, particularly when one or both of our points are in the cage. In this section, we give an outline of the key components of the BGS algorithm from [2], extracting explicit information when possible. An explicit description of the path constructed by the BGS algorithm is then outlined in Algorithm 3.6.

3.1. The Cage

Define the cage to be the subgraph $\mathcal{C}(p)$ of the Markoff mod $p$ graph $\mathcal{G}_{p}$ containing all vertices that are maximal points in $\mathbb{X}^{*}(p)$, as defined in Definition 2.11. The convenience of the cage is that the orbits with respect to the maximal index are precisely equal to the conic sections at that index. More precisely, we have the following.

Proposition 3.1.

If $\mathbf{x}=(x_{1},x_{2},x_{3})$ is in the cage $\mathcal{C}(p)$ with maximal index $i$ then $M_{\mathbf{x},i}=C_{i}(x_{i})$. That is, if $\mathbf{x}$ is a maximal triple with maximal index $i$, then the orbit of $\mathbf{x}$ under $\operatorname{rot}_{i}$ contains all Markoff mod $p$ points with $i$th coordinate equal to $x_{i}$.

This Proposition follows from our definition of maximal points along with the following lemma, which can be found in [2].

Lemma 3.2.

If $x$ is parabolic, then

\[ |C_{i}(x)|=\begin{cases}0&\text{ if }p\equiv-1\pmod{4}\\ 2p&\text{ if }p\equiv 1\pmod{4}.\end{cases} \]

Otherwise, we have

\[ |C_{i}(x)|=\begin{cases}p+1&\text{ if $x$ is elliptic}\\ p-1&\text{ if $x$ is hyperbolic}.\end{cases} \]

The connectedness of the cage will then follow from the following result. For completeness of our analysis and implementation of the BGS algorithm, we outline the proof given in [2].
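The counts in Lemma 3.2 can be confirmed by brute force for small primes. The sketch below counts the points of $C_{1}(a)$ directly and compares with the predicted value. As a labeled assumption, it only tests nonzero $a$: for $a=0$ the conic degenerates to $y^{2}+z^{2}=0$, and the counts above do not apply in that case. Function names are hypothetical.

```python
def conic_size(a, p):
    """Number of Markoff mod p points with first coordinate equal to a."""
    return sum(
        1
        for y in range(p)
        for z in range(p)
        if (a, y, z) != (0, 0, 0) and (a*a + y*y + z*z - 3*a*y*z) % p == 0
    )

def predicted_conic_size(a, p):
    """Predicted |C_i(a)| from Lemma 3.2, intended for a != 0 only."""
    delta = (9 * a * a - 4) % p
    if delta == 0:                       # parabolic
        return 2 * p if p % 4 == 1 else 0
    if pow(delta, (p - 1) // 2, p) == 1: # hyperbolic
        return p - 1
    return p + 1                         # elliptic
```

By symmetry of the Markoff equation in its three variables, the same counts hold for $C_{2}(a)$ and $C_{3}(a)$.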

Lemma 3.3 (Section 3.2 of [2]).

Let $\mathbf{x},\mathbf{y}\in\mathcal{C}(p)$ with maximal indices $i,j\in\{1,2,3\}$, respectively. Then there is a point $\mathbf{z}\in\mathcal{C}(p)$ and $k\in\{1,2,3\}$ so that

\[ C_{i}(\mathbf{x})\cap C_{k}(\mathbf{z})\neq\varnothing\text{ and }C_{j}(\mathbf{y})\cap C_{k}(\mathbf{z})\neq\varnothing. \]

Proof.

Let $\mathbf{x}=(x_{1},x_{2},x_{3})$ and $\mathbf{y}=(y_{1},y_{2},y_{3})$. Suppose first that the maximal indices of $\mathbf{x}$ and $\mathbf{y}$ are distinct. Without loss of generality, say $\mathbf{x}$ has maximal index $i=1$ and $\mathbf{y}$ has maximal index $j=2$. We claim that there are elements $\zeta,\gamma,z$ in $\mathbb{F}_{p}$ so that

\[ (x_{1},\zeta,z),(\gamma,y_{2},z)\in\mathbb{X}^{*}(p). \]

Considering the Markoff equation as a quadratic in $\zeta$, for $(x_{1},\zeta,z)$ to be in $\mathbb{X}^{*}(p)$ we must have

\[ (3x_{1}z)^{2}-4(x_{1}^{2}+z^{2})=\alpha^{2} \]

for some $\alpha\in\mathbb{F}_{p}$. Similarly, considering the Markoff equation as a quadratic in $\gamma$, for $(\gamma,y_{2},z)$ to be in $\mathbb{X}^{*}(p)$ we must have

\[ (3y_{2}z)^{2}-4(y_{2}^{2}+z^{2})=\beta^{2} \]

for some $\beta\in\mathbb{F}_{p}$. Rearranging, this gives the system of equations

\[ (9x_{1}^{2}-4)z^{2}-\alpha^{2}=4x_{1}^{2}, \]
\[ (9y_{2}^{2}-4)z^{2}-\beta^{2}=4y_{2}^{2}, \]

with unknowns $z,\alpha,\beta$.

When $x_{1}^{2}\neq y_{2}^{2}$, this system of equations defines an irreducible curve in $\mathbb{A}^{3}$, which has solutions modulo any prime. If $x_{1}^{2}=y_{2}^{2}$ then this reduces to finding solutions $z,\alpha$ to

\[ (9x_{1}^{2}-4)z^{2}-\alpha^{2}=4x_{1}^{2}. \tag{3.1} \]

If $x_{1}$ is parabolic, then $9x_{1}^{2}-4\equiv 0\pmod{p}$. Since we have assumed that $C_{1}(\mathbf{x})$ is nonempty, by Lemma 3.2 we must have $p\equiv 1\pmod{4}$, and so $-1$ is a square modulo $p$, which gives a solution to Equation (3.1). If $x_{1}$ is not parabolic, then Equation (3.1) is a conic section, which has a solution modulo any prime. Using an inclusion/exclusion argument, the authors of [2] show that such a solution can be found so that $z$ has maximal order. Hence, if we let $\mathbf{z}=(x_{1},\zeta,z)$ then $\mathbf{z}\in\mathcal{C}(p)$ and $C_{3}(\mathbf{z})$ intersects $C_{1}(\mathbf{x})$ and $C_{2}(\mathbf{y})$ nontrivially. Note that if $\mathbf{x}$ and $\mathbf{y}$ have the same maximal index, then we can find points $(x_{1},\zeta,z),(y_{1},\gamma,z)\in\mathbb{X}^{*}(p)$ by solving Equation (3.1) as above. This gives a triple $\mathbf{z}=(x_{1},\zeta,z)\in\mathbb{X}^{*}(p)$ with $z$ maximal and $(y_{1},\gamma,z)\in C_{3}(\mathbf{z})$. ∎

3.2. Connecting points to the cage

The following two Propositions from [2] describe how to connect points of large enough order to the cage. We give a brief outline of the proofs given in [2], both to gather explicit information and to highlight the nonexplicit steps in this construction which require us to assume long path lengths in the proof of Theorem 1.2. Note that these Propositions do not cover the case when $\mathbf{x}=(x_{1},x_{2},x_{3})$ has maximal index $i$ and $x_{i}$ is parabolic of order $p$. We will instead deal with this separately in Proposition 3.3.

Proposition 3.4 (The Endgame from Section 4 of [2]).

Let $\mathbf{x}$ be in $\mathbb{X}^{*}(p)$ with $\operatorname{ord}_{p}(\mathbf{x})>p^{1/2}$, and suppose that $\mathbf{x}=(x_{1},x_{2},x_{3})$ has maximal index $i$ with $x_{i}$ not parabolic. Then, there exist a positive integer $n$ and $i\in\{1,2,3\}$ so that $\operatorname{rot}_{i}^{n}(\mathbf{x})$ is a maximal triple.

Proof.

Let 𝔽\mathbb{F} denote 𝔽p\mathbb{F}_{p} when xix_{i} is hyperbolic and 𝔽p2\mathbb{F}_{p^{2}} if xix_{i} is elliptic, under the identification 𝔽p2𝔽p[T]/(T2Δxi)\mathbb{F}_{p^{2}}\cong\mathbb{F}_{p}[T]/(T^{2}-\Delta_{x_{i}}), as in the proof of Lemma 2.8. Without loss of generality, let i=1i=1. By Lemma 2.2 we can write

rot1n(𝐱)=(x1,C1εn+C2ε¯n,C1εn+1+C2ε¯n+1),\operatorname{rot}_{1}^{n}(\mathbf{x})=(x_{1},C_{1}\varepsilon^{n}+C_{2}\bar{\varepsilon}^{n},C_{1}\varepsilon^{n+1}+C_{2}\bar{\varepsilon}^{n+1}),

where ε,ε¯\varepsilon,\bar{\varepsilon} are roots of f(T)=T23xiT+1f(T)=T^{2}-3x_{i}T+1 and C1,C2C_{1},C_{2} are constants in 𝔽\mathbb{F}. The proof given in [2] uses the Weil bound along with an inclusion/exclusion argument to show that there exists a positive integer solution nn to

C1εn+C2ε¯n=ρ+ρ1C_{1}\varepsilon^{n}+C_{2}\bar{\varepsilon}^{n}=\rho+\rho^{-1}

where ρ\rho is defined as follows: in the hyperbolic case, ρ\rho is any generator of 𝔽p×\mathbb{F}_{p}^{\times}, and in the elliptic case we take ρ=up1\rho=u^{p-1} where uu generates 𝔽p2×\mathbb{F}_{p^{2}}^{\times}. Then rotin(𝐱)\operatorname{rot}_{i}^{n}(\mathbf{x}) has second coordinate equal to ρ+ρ1\rho+\rho^{-1}, and since ρ,ρ1\rho,\rho^{-1} are the roots of

f(T)=T2(ρ+ρ1)T+1f(T)=T^{2}-(\rho+\rho^{-1})T+1

then by Lemma 2.8 $\operatorname{ord}_{p,2}(\mathbf{y})$, where $\mathbf{y}=\operatorname{rot}_{i}^{n}(\mathbf{x})$, is equal to the order of $\rho$, which is maximal by construction. ∎

Proposition 3.5 (The Middlegame from Section 4 of [2]).

Let 𝐱\mathbf{x} be in 𝕏(p)\mathbb{X}^{*}(p) with c<ordp(𝐱)p1/2,c<\operatorname{ord}_{p}(\mathbf{x})\leq p^{1/2}, where cc is a fixed constant independent of pp, which is described in [2]. Then, there exist i1,,it{1,2,3}i_{1},\dots,i_{t}\in\{1,2,3\} and positive integers m1,,mtm_{1},\dots,m_{t} so that the rotation order of

rotitmtroti1m1(𝐱)\operatorname{rot}_{i_{t}}^{m_{t}}\cdots\operatorname{rot}_{i_{1}}^{m_{1}}(\mathbf{x})

is larger than p1/2p^{1/2}, where tτ(p21)t\leq\tau(p^{2}-1).

Proof.

Since the rotation order of a parabolic triple is bounded below by $p$, we know that $\mathbf{x}$ is hyperbolic or elliptic. So, by Proposition 2.9, we have $\operatorname{ord}_{p}(\mathbf{x})\mid p^{2}-1$. Suppose that $\mathbf{x}$ has maximal index $i$ and let $M_{\mathbf{x},i}$ be the orbit of $\mathbf{x}$ under $\operatorname{rot}_{i}$, as defined in 2.6. Let $\mathbf{y}=(y_{1},y_{2},y_{3})$ be in $M_{\mathbf{x},i}$ with maximal index $j$. If $y_{j}$ is parabolic, then $\operatorname{ord}_{p}(\mathbf{y})\geq p$ and we are done. Otherwise, $y_{j}$ is hyperbolic or elliptic, and so $\operatorname{ord}_{p}(\mathbf{y})$ divides $p^{2}-1$ as well. If $\operatorname{ord}_{p}(\mathbf{y})>\operatorname{ord}_{p}(\mathbf{x})$, then $i\neq j$ and we repeat the process above by considering the orbit $M_{\mathbf{y},j}$. If $\operatorname{ord}_{p}(\mathbf{y})\leq\operatorname{ord}_{p}(\mathbf{x})$, then we consider the sum

N𝐱:=|M𝐱,i|#{𝐲M𝐱,i:|M𝐲,j|=}.N_{\mathbf{x}}:=\sum_{\ell\leq|M_{\mathbf{x},i}|}\#\{\mathbf{y}\in M_{\mathbf{x},i}:|M_{\mathbf{y},j}|=\ell^{\prime}\}.

The authors of [2] use estimates on the gcd of elements of the form $u-1,v-1$ from Corvaja and Zannier in [7] to show that if $p^{2}-1$ does not have too many prime divisors, then $N_{\mathbf{x}}<|M_{\mathbf{x},i}|$. So, by definition of $N_{\mathbf{x}}$ there must be a point in the orbit of $\mathbf{x}$ with larger rotation order, and we proceed as above. That is, given our point $\mathbf{x}$, we can find $\mathbf{z}_{1}\in M_{\mathbf{x},i}$ with $\operatorname{ord}_{p}(\mathbf{z}_{1})>\operatorname{ord}_{p}(\mathbf{x})$. If the maximal coordinate of $\mathbf{z}_{1}$ is parabolic, then we are done. Otherwise, $\operatorname{ord}_{p}(\mathbf{z}_{1})\mid p^{2}-1$. Iterating in this way, we obtain $\mathbf{z}_{1},\mathbf{z}_{2},\dots\in\mathbb{X}^{*}(p)$ with

ordp(𝐱)<ordp(𝐳1)<ordp(𝐳2)<\operatorname{ord}_{p}(\mathbf{x})<\operatorname{ord}_{p}(\mathbf{z}_{1})<\operatorname{ord}_{p}(\mathbf{z}_{2})<\cdots

and since ordp(𝐳i)(p21)\operatorname{ord}_{p}(\mathbf{z}_{i})\mid(p^{2}-1) for every ii, this process will stop after at most τ(p21)\tau(p^{2}-1) steps, where τ(n)\tau(n) denotes the number of positive divisors of an integer nn. ∎
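To get a feel for how long the Middlegame chain can be, the following minimal sketch computes $\tau(p^{2}-1)$ for a few small primes by trial division. The function names are illustrative and are not taken from [2].

```python
def num_divisors(n):
    """Return tau(n), the number of positive divisors of n, by trial division."""
    count = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            # d and n // d are both divisors; count once if they coincide
            count += 1 if d * d == n else 2
        d += 1
    return count


def middlegame_step_bound(p):
    """Upper bound tau(p^2 - 1) on the number of strict order increases."""
    return num_divisors(p * p - 1)


for p in (5, 7, 13, 101):
    print(p, middlegame_step_bound(p))
```

Since each order in the chain is a distinct divisor of $p^{2}-1$, the value returned by `middlegame_step_bound` caps the number of rotations switched in the Middlegame.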

3.3. Connecting Parabolic Points to the Cage

Suppose 𝐱=(x1,x2,x3)\mathbf{x}=(x_{1},x_{2},x_{3}) is in 𝕏(p)\mathbb{X}^{*}(p) with xix_{i} parabolic. For convenience, suppose that i=1i=1 and set x=x1x=x_{1}. From Proposition 2.9 and Lemma 3.2, we know that p1(mod 4)p\equiv 1(\text{mod}\,4) and if x=2/3x=2/3 then C1(x)C_{1}(x) consists of two disjoint orbits. Here, we describe the conic section in terms of these orbits, and use this to show how to connect parabolic points of order pp to any point in the cage.

The Jordan normal decomposition of $A_{2/3}$ is given by

$$A_{2/3}=\begin{pmatrix}1&-1\\ 1&0\end{pmatrix}\begin{pmatrix}1&1\\ 0&1\end{pmatrix}\begin{pmatrix}0&1\\ -1&1\end{pmatrix}$$

and so we get

$$A_{2/3}^{n}=\begin{pmatrix}1&-1\\ 1&0\end{pmatrix}\begin{pmatrix}1&n\\ 0&1\end{pmatrix}\begin{pmatrix}0&1\\ -1&1\end{pmatrix}.$$
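As a quick sanity check on the closed form above, the following sketch multiplies out the three factors with exact integer arithmetic and compares the result against repeated multiplication. The explicit entries of $A_{2/3}$ used here are simply those obtained by expanding the product of the three factors; the helper names are illustrative.

```python
def mat_mul(a, b):
    """Multiply two 2x2 integer matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Change-of-basis matrix and its inverse, as in the decomposition above.
P = [[1, -1], [1, 0]]
P_INV = [[0, 1], [-1, 1]]


def jordan_power(n):
    """Compute A_{2/3}^n as P * [[1, n], [0, 1]] * P^{-1}."""
    return mat_mul(mat_mul(P, [[1, n], [0, 1]]), P_INV)


def naive_power(n):
    """Compute A_{2/3}^n by multiplying A_{2/3} with itself n times."""
    a = jordan_power(1)
    result = [[1, 0], [0, 1]]
    for _ in range(n):
        result = mat_mul(result, a)
    return result


for n in range(5):
    print(n, jordan_power(n), jordan_power(n) == naive_power(n))
```

Expanding the product by hand gives the entries $\begin{pmatrix}1-n&n\\ -n&n+1\end{pmatrix}$, which is the form used to read off the orbits.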

Let

$$\mathbf{x}^{(+)}=\left(\frac{2}{3},1,1+\frac{2}{3}i\right)\text{ and }\mathbf{x}^{(-)}=\left(\frac{2}{3},1,1-\frac{2}{3}i\right)$$

where $i$ is a square root of $-1$ modulo $p$, noting again that the assumption of a parabolic point implies that $p\equiv 1\ (\text{mod}\,4)$. From above we get

$$M_{\mathbf{x}^{(+)}}=\left\{\left(\frac{2}{3},1+\frac{2}{3}in,1+\frac{2}{3}i(n+1)\right)\mid n\in\mathbb{Z}\right\},\tag{3.2}$$

$$M_{\mathbf{x}^{(-)}}=\left\{\left(\frac{2}{3},1-\frac{2}{3}in,1-\frac{2}{3}i(n+1)\right)\mid n\in\mathbb{Z}\right\}.\tag{3.3}$$

Observe that these orbits are disjoint and each contain pp elements, so we have

Ci(𝐱)=M𝐱(+)M𝐱().C_{i}(\mathbf{x})=M_{\mathbf{x}^{(+)}}\cup M_{\mathbf{x}^{(-)}}.

Now, let $\mathbf{y}=(y_{1},y_{2},y_{3})$ be any point in the cage with maximal index $j\in\{2,3\}$ and $y_{j}\neq 0$. Since $M_{\mathbf{x}^{(+)}}$ contains $p-1$ points with distinct $j$th coordinates and $y_{j}\in\mathbb{F}_{p}^{\times}$, we must have $C_{j}(y)\cap M_{\mathbf{x}^{(+)}}\neq\varnothing$, and similarly $C_{j}(y)\cap M_{\mathbf{x}^{(-)}}\neq\varnothing$. Hence, we can connect $\mathbf{x}$ to the cage by $\operatorname{rot}_{1}$.
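The orbit description in (3.2) and (3.3) can be checked numerically for a small prime. The sketch below assumes that $\operatorname{rot}_{1}$ acts by $(x_{1},x_{2},x_{3})\mapsto(x_{1},x_{3},3x_{1}x_{3}-x_{2})$, which is consistent with the matrix $A_{2/3}$ above; the function names and the brute-force square root are illustrative choices.

```python
def is_markoff(t, p):
    """Check x1^2 + x2^2 + x3^2 = 3*x1*x2*x3 modulo p."""
    x1, x2, x3 = t
    return (x1 * x1 + x2 * x2 + x3 * x3 - 3 * x1 * x2 * x3) % p == 0


def rot1(t, p):
    """Assumed form of rot_1: fix x1 and send (x2, x3) to (x3, 3*x1*x3 - x2)."""
    x1, x2, x3 = t
    return (x1, x3, (3 * x1 * x3 - x2) % p)


def sqrt_minus_one(p):
    """Brute-force a square root of -1 modulo p (exists when p = 1 mod 4)."""
    for a in range(2, p):
        if (a * a + 1) % p == 0:
            return a
    raise ValueError("-1 is not a square modulo p")


def parabolic_orbit(p, sign=1):
    """List the p points of (3.2) for sign=+1, or of (3.3) for sign=-1."""
    two_thirds = 2 * pow(3, -1, p) % p  # modular inverse needs Python 3.8+
    step = sign * two_thirds * sqrt_minus_one(p) % p
    return [(two_thirds, (1 + step * n) % p, (1 + step * (n + 1)) % p)
            for n in range(p)]


p = 13
plus, minus = parabolic_orbit(p, 1), parabolic_orbit(p, -1)
print(all(is_markoff(t, p) for t in plus + minus))
print(set(plus).isdisjoint(minus))
```

Both orbits are indexed by $n$ modulo $p$, so listing $n=0,\dots,p-1$ exhausts each of them.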

3.4. Points of small order

The authors of [2] show that points 𝐱\mathbf{x} with order bounded from above by a constant not depending on pp are connected to the cage. This proof is nonconstructive, and so we do not include such points in Theorem 1.2. We refer the reader to Section 5 of [2] on “The Opening” for this analysis.

3.5. Connecting (1,1,1)(1,1,1) to the cage

In the tables below, we demonstrate that for every prime $p\leq 199$ there is some $0\leq n\leq 5$ for which $\operatorname{rot}_{1}^{n}(1,1,1)$ is in the cage. If we can show that $(1,1,1)$ is always “close” to the cage, or find a large family of primes where this holds, we could then relax the conditions of Theorem 1.2. One way to approach this might be to note the following connection to the Fibonacci sequence. Observe that the Lucas sequence $u_{n}$ defined in Lemma 2.12 when $\mathbf{x}=(1,1,1)$ is equal to $f_{2n}$. This gives

$$A_{1}^{n}=\begin{pmatrix}-f_{2n-2}&f_{2n}\\ -f_{2n}&f_{2n+2}\end{pmatrix}$$

and so

$$\begin{aligned}\operatorname{rot}_{1}^{n}(1,1,1)&=(1,f_{2n}-f_{2n-2},f_{2n+2}-f_{2n})\\ &=(1,f_{2n-1},f_{2n+1}).\end{aligned}$$

So, one approach would be to investigate the orders of $A_{x}$ where $x$ is an odd-indexed term in the Fibonacci sequence.
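The Fibonacci identity above is easy to confirm over the integers. The sketch below assumes the convention $f_{0}=0$, $f_{1}=1$ (so $f_{-1}=1$) and the action $\operatorname{rot}_{1}(x_{1},x_{2},x_{3})=(x_{1},x_{3},3x_{1}x_{3}-x_{2})$, which matches the first rows of the tables; the helper names are illustrative.

```python
def rot1(t):
    """Assumed form of rot_1 over the integers."""
    x1, x2, x3 = t
    return (x1, x3, 3 * x1 * x3 - x2)


def rot1_power(n):
    """Apply rot_1 to (1, 1, 1) exactly n times."""
    t = (1, 1, 1)
    for _ in range(n):
        t = rot1(t)
    return t


def fib(k):
    """Fibonacci number f_k for k >= -1, with f_{-1} = 1, f_0 = 0, f_1 = 1."""
    a, b = 1, 0  # (f_{-1}, f_0)
    for _ in range(k + 1):
        a, b = b, a + b
    return a


for n in range(6):
    print(n, rot1_power(n), (1, fib(2 * n - 1), fib(2 * n + 1)))
```

Reducing `rot1_power(n)` modulo $p$ reproduces the entries in the "Points" column of the tables.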

Prime  Path                          Points
2      $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,0)$
3      $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
5      $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
7      $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
11     $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
13     $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
17     $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
19     $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
23     $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
29     $\operatorname{rot}_{1}^{2}$  $(1,1,1)$, $(1,1,2)$, $(1,2,5)$
31     $\operatorname{rot}_{1}^{2}$  $(1,1,1)$, $(1,1,2)$, $(1,2,5)$
37     $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
41     $\operatorname{rot}_{1}^{2}$  $(1,1,1)$, $(1,1,2)$, $(1,2,5)$
43     $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
47     $\operatorname{rot}_{1}^{3}$  $(1,1,1)$, $(1,1,2)$, $(1,2,5)$, $(1,5,13)$
53     $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
59     $\operatorname{rot}_{1}^{5}$  $(1,1,1)$, $(1,1,2)$, $(1,2,5)$, $(1,5,13)$, $(1,13,34)$, $(1,34,30)$
61     $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
67     $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
71     $\operatorname{rot}_{1}^{2}$  $(1,1,1)$, $(1,1,2)$, $(1,2,5)$
Table 1. Paths from (1,1,1)(1,1,1) to 𝒞(p)\mathcal{C}(p)
Prime  Path                          Points
73     $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
79     $\operatorname{rot}_{1}^{2}$  $(1,1,1)$, $(1,1,2)$, $(1,2,5)$
83     $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
89     $\operatorname{rot}_{1}^{3}$  $(1,1,1)$, $(1,1,2)$, $(1,2,5)$, $(1,5,13)$
97     $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
101    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
103    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
107    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
109    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
113    $\operatorname{rot}_{1}^{5}$  $(1,1,1)$, $(1,1,2)$, $(1,2,5)$, $(1,5,13)$, $(1,13,34)$, $(1,34,89)$
127    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
131    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
137    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
139    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
149    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
151    $\operatorname{rot}_{1}^{3}$  $(1,1,1)$, $(1,1,2)$, $(1,2,5)$, $(1,5,13)$
157    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
163    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
167    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
173    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
179    $\operatorname{rot}_{1}^{3}$  $(1,1,1)$, $(1,1,2)$, $(1,2,5)$, $(1,5,13)$
181    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
191    $\operatorname{rot}_{1}^{3}$  $(1,1,1)$, $(1,1,2)$, $(1,2,5)$, $(1,5,13)$
193    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
197    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
199    $\operatorname{rot}_{1}^{1}$  $(1,1,1)$, $(1,1,2)$
Table 2. Paths from (1,1,1)(1,1,1) to 𝒞(p)\mathcal{C}(p) continued

3.6. Path Construction

We are now prepared to give an explicit description of the path from (1,1,1)(1,1,1) to points 𝐱𝒢p\mathbf{x}\in\mathcal{G}_{p} constructed in the BGS algorithm.

Algorithm 3.6.

Let $p$ be prime and let $\mathbf{x}\in\mathbb{X}^{*}(p)$ with $\operatorname{ord}_{p}(\mathbf{x})>c$, where $c$ is defined in the Middlegame of [2]. Then, we can connect $(1,1,1)$ to $\mathbf{x}$ in $\mathcal{G}_{p}$ as follows.

  I.

    We assume that $\operatorname{rot}_{1}^{n}(1,1,1)$ is in the cage for some $0\leq n\leq 5$, which appears numerically to hold for many primes $p$, as discussed in Section 3.5.

  II.

    If $\mathbf{x}\in\mathcal{C}(p)$, then we can connect the cage point $\mathbf{y}=\operatorname{rot}_{1}^{n}(1,1,1)$ from Step I to $\mathbf{x}$ as follows. Suppose that $\mathbf{x}$ has maximal index $i_{3}$ and $\mathbf{y}$ has maximal index $i_{1}$. By Lemma 3.3 there exists a point $\mathbf{z}\in\mathcal{C}(p)$ with maximal index $i_{2}\in\{1,2,3\}$ so that

    Ci3(𝐱)Ci2(𝐳) and Ci1(𝐲)Ci2(𝐳).C_{i_{3}}(\mathbf{x})\cap C_{i_{2}}(\mathbf{z})\not=\varnothing\text{ and }C_{i_{1}}(\mathbf{y})\cap C_{i_{2}}(\mathbf{z})\not=\varnothing.

    Furthermore, by Proposition 3.1 we have that

    Ci3(𝐱)=Mi3(𝐱),Ci2(𝐳)=Mi2(𝐳) and Ci1(𝐲)=Mi1(𝐲).C_{i_{3}}(\mathbf{x})=M_{i_{3}}(\mathbf{x}),C_{i_{2}}(\mathbf{z})=M_{i_{2}}(\mathbf{z})\text{ and }C_{i_{1}}(\mathbf{y})=M_{i_{1}}(\mathbf{y}).

    So, the orbit of $\mathbf{z}$ under $\operatorname{rot}_{i_{2}}$ intersects the orbit of $\mathbf{x}$ under $\operatorname{rot}_{i_{3}}$ and the orbit of $\mathbf{y}$ under $\operatorname{rot}_{i_{1}}$. That is, there are points $\mathbf{z}_{x},\mathbf{z}_{y}\in\mathcal{C}(p)$ in the orbit of $\mathbf{z}$ under $\operatorname{rot}_{i_{2}}$ with

    𝐱=roti3n3(𝐳x) and 𝐳y=roti1n1(𝐲)\mathbf{x}=\operatorname{rot}_{i_{3}}^{n_{3}}(\mathbf{z}_{x})\text{ and }\mathbf{z}_{y}=\operatorname{rot}_{i_{1}}^{n_{1}}(\mathbf{y})

    for integers $n_{1},n_{3}$. As in Figure 3, this gives the following path from $(1,1,1)$ to our point $\mathbf{x}\in\mathcal{C}(p)$:

    $$\mathbf{x}=\operatorname{rot}_{i_{3}}^{n_{3}}\operatorname{rot}_{i_{2}}^{n_{2}}\operatorname{rot}_{i_{1}}^{n_{1}}\operatorname{rot}_{j}^{n}(1,1,1).$$

    Figure 3. Existence of $\mathbf{z}$ intersecting the orbits of $\mathbf{x}$ and $\mathbf{y}$
  III.

    If $\operatorname{ord}_{p}(\mathbf{x})>p^{1/2}$, then by Proposition 3.4 there exists a point $\mathbf{x}_{C}$ in the cage that lies in the orbit of $\mathbf{x}$ under the rotation $\operatorname{rot}_{i_{4}}$ for some $i_{4}\in\{1,2,3\}$. Since this orbit is finite, $\mathbf{x}$ is also in the orbit of $\mathbf{x}_{C}$ under $\operatorname{rot}_{i_{4}}$, and so we can write

    𝐱=roti4n4(𝐱C).\mathbf{x}=\operatorname{rot}_{i_{4}}^{n_{4}}(\mathbf{x}_{C}).

    Then, we can connect (1,1,1)(1,1,1) to 𝐱C\mathbf{x}_{C} as in Step II to get

    𝐱=roti4n4roti1n1rotjn(1,1,1).\mathbf{x}=\operatorname{rot}_{i_{4}}^{n_{4}}\cdots\operatorname{rot}_{i_{1}}^{n_{1}}\operatorname{rot}_{j}^{n}(1,1,1).
  IV.

    If $c<\operatorname{ord}_{p}(\mathbf{x})\leq p^{1/2}$, then by Proposition 3.5 there exists a point $\mathbf{x}_{M}$ with $\operatorname{ord}_{p}(\mathbf{x}_{M})>p^{1/2}$ where

    $$\mathbf{x}_{M}=\operatorname{rot}_{i_{5}}^{m_{5}}\cdots\operatorname{rot}_{i_{t+4}}^{m_{t+4}}\mathbf{x},$$

    and $t\leq\tau(p^{2}-1)$. Note that if $\mathbf{x}_{i}=\operatorname{rot}_{i}^{m}\mathbf{x}_{j}$, then $\mathbf{x}_{i}$ is in the orbit of $\mathbf{x}_{j}$ under $\operatorname{rot}_{i}$. So, as discussed in Step III, $\mathbf{x}_{j}$ is also in the orbit of $\mathbf{x}_{i}$ and we can write $\mathbf{x}_{j}=\operatorname{rot}_{i}^{n}\mathbf{x}_{i}$ for some integer $n$. Repeating this process gives us the following path from $\mathbf{x}_{M}$ to $\mathbf{x}$:

    $$\mathbf{x}=\operatorname{rot}_{i_{t+4}}^{n_{t+4}}\cdots\operatorname{rot}_{i_{5}}^{n_{5}}\mathbf{x}_{M}.$$

    Since $\operatorname{ord}_{p}(\mathbf{x}_{M})>p^{1/2}$, we can connect $(1,1,1)$ to $\mathbf{x}_{M}$ as in Step III to get

    $$\mathbf{x}=\operatorname{rot}_{i_{t+4}}^{n_{t+4}}\cdots\operatorname{rot}_{i_{1}}^{n_{1}}\operatorname{rot}_{j}^{n}(1,1,1).$$
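Steps III and IV both rely on reversing a rotation: if a point lies in the orbit of $\mathbf{x}$, then $\mathbf{x}$ lies in the orbit of that point. The sketch below illustrates this for $\operatorname{rot}_{1}$ only, assuming it acts by $(x_{1},x_{2},x_{3})\mapsto(x_{1},x_{3},3x_{1}x_{3}-x_{2})$ modulo $p$; the helper names are hypothetical and the orbit is found by brute force rather than by any method from [2].

```python
def rot1(t, p):
    """Assumed form of rot_1 modulo p."""
    x1, x2, x3 = t
    return (x1, x3, (3 * x1 * x3 - x2) % p)


def rot1_power(t, n, p):
    """Apply rot_1 to t exactly n times."""
    for _ in range(n):
        t = rot1(t, p)
    return t


def orbit(t, p):
    """List the orbit of t under rot_1; rot_1 is invertible, so it cycles back."""
    points = [t]
    current = rot1(t, p)
    while current != t:
        points.append(current)
        current = rot1(current, p)
    return points


def reverse_exponent(x, m, p):
    """Given y = rot_1^m(x), return n >= 0 with rot_1^n(y) = x."""
    return (-m) % len(orbit(x, p))


p, x, m = 13, (1, 1, 1), 4
y = rot1_power(x, m, p)
n = reverse_exponent(x, m, p)
print(y, n, rot1_power(y, n, p) == x)
```

The exponent returned is at most the orbit length, which is why the path lengths $n_{i}$ are bounded by rotation orders in the proof of Theorem 1.2.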

4. Proof of Theorem 1.2

We are now prepared to prove our first main result.

Proof of Theorem 1.2.

Take $\mathbf{x}\in\mathbb{X}^{*}(p)$. If $\operatorname{ord}_{p}(\mathbf{x})>p^{1/2}$, then under our assumptions Step III of Algorithm 3.6 lets us write

$$\mathbf{x}=\operatorname{rot}_{i_{4}}^{n_{4}}\cdots\operatorname{rot}_{i_{1}}^{n_{1}}\operatorname{rot}_{1}^{n}(1,1,1),$$

where $n_{i}\geq 0$, $0\leq n\leq 5$ and the $i_{j}$ are not necessarily distinct. So, by Proposition 2.3 we have

size(𝐱)<(3ε)24(n+1)(n1+1)(n2+1)(n3+1)(n4+1),\operatorname{size}(\mathbf{x})<(3\varepsilon)^{2^{4}(n+1)(n_{1}+1)(n_{2}+1)(n_{3}+1)(n_{4}+1)},

where $\varepsilon=(3+\sqrt{5})/2$. Note that the $n_{i}$ correspond to path lengths from steps of the BGS algorithm. Since the proofs given in [2] are nonconstructive, we bound each $n_{i}$ from above by the corresponding rotation order. By assumption we have $n\leq 5$, and by Proposition 2.9 we have $n_{i}\leq 2p$ for each $i\in\{1,2,3,4\}$, and so the result follows. ∎
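For the reader's convenience, the final substitution can be written out explicitly. Using $0\leq n\leq 5$ and $n_{i}\leq 2p$ in the exponent from Proposition 2.3 gives the following; this is only a worked version of the step above, and the resulting constant agrees with the one appearing in Proposition 4.1.

```latex
\[
2^{4}(n+1)\prod_{i=1}^{4}(n_{i}+1)
\;\leq\; 2^{4}\cdot 6\cdot(2p+1)^{4}
\;=\; 96\,(2p+1)^{4}.
\]
```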

Observe that, using Step IV of Algorithm 3.6 and Proposition 2.3, we can obtain the following upper bound for minimal lifts of points 𝐱𝕏(p)\mathbf{x}\in\mathbb{X}^{*}(p) in the middlegame.

Proposition 4.1.

Let pp be a prime so that ordp(rot1n(1,1,1))>p\operatorname{ord}_{p}(\operatorname{rot}_{1}^{n}(1,1,1))>p for 0n50\leq n\leq 5 and suppose that 𝐱𝕏(p)\mathbf{x}\in\mathbb{X}^{*}(p) with ordp(𝐱)>c\operatorname{ord}_{p}(\mathbf{x})>c, where cc is the absolute constant introduced in the Middlegame of [2]. Let 𝐱~\tilde{\mathbf{x}} be a lift of 𝐱\mathbf{x} of minimal size. Then,

size(𝐱~)<(3ε)96(2p+1)4+t/2,\operatorname{size}(\tilde{\mathbf{x}})<(3\varepsilon)^{96(2p+1)^{4+t/2}},

where t=τ(p21)t=\tau(p^{2}-1) denotes the number of positive divisors of p21p^{2}-1.

The proof of Proposition 4.1 is identical to the proof of Theorem 1.2 above, where we additionally note that in the Middlegame our points have order bounded above by $p^{1/2}$. A natural next step would be to explicitly compute the absolute constant $c$ referenced in [2] in order to extend Theorem 1.2 to a larger family of points in $\mathbb{X}^{*}(p)$.

4.1. Percentage of points in the cage

In Figure 4, the percentage of total points in 𝕏(p)\mathbb{X}^{*}(p) which satisfy the conditions of Theorem 1.2 is plotted for primes p<300p<300. Observe that this percentage appears to hover around 80%. Note here that this percentage also includes parabolic points of order pp, since they are a similar distance from (1,1,1)(1,1,1) as points in the cage.

Figure 4. Percentage of points from 𝕏(p)\mathbb{X}^{*}(p) in 𝒞(p)\mathcal{C}(p)

As shown in the proof of Proposition 2.9, note that a triple 𝐱\mathbf{x} is in the cage 𝒞(p)\mathcal{C}(p) if and only if one of its coordinates xx satisfies any of the following

  (1) $x=-2/3$ if $x$ is parabolic,

  (2) $\varepsilon_{x}$ generates $\mathbb{F}_{p}^{\times}$ if $x$ is hyperbolic, or

  (3) $\varepsilon_{x}=\rho_{x}^{p-1}$ where $\rho_{x}$ generates $\mathbb{F}_{p^{2}}^{\times}$ if $x$ is elliptic.

Suppose that xx is a randomly chosen element in 𝔽p\mathbb{F}_{p}. If we assume the εx\varepsilon_{x} are equally likely to take the form in (2) or (3) as a randomly chosen element of 𝔽×\mathbb{F}^{\times} and that generators are equally distributed in 𝔽×\mathbb{F}^{\times}, then the probability that xx satisfies conditions (1), (2) or (3) is given by

1p+φ(p1)p1+φ(p21)(p1)(p21).\frac{1}{p}+\frac{\varphi(p-1)}{p-1}+\frac{\varphi(p^{2}-1)}{(p-1)(p^{2}-1)}.

Since the probability that a Markoff modp\text{mod}\,p point 𝐱=(x1,x2,x3)\mathbf{x}=(x_{1},x_{2},x_{3}) is in the cage is lower bounded by the probability that x=x1x=x_{1} satisfies one of (1)(1), (2)(2) or (3)(3), an interesting future direction would be to use the argument above to derive a heuristic lower bound on the percentage of points in the cage for primes pp where φ(p1)/p\varphi(p-1)/p has a known asymptotic formula.
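The heuristic expression above is straightforward to evaluate for specific primes. The following sketch computes it exactly with rational arithmetic; it implements the displayed formula as written and makes no claim about how well the underlying equidistribution assumptions hold. The function names are illustrative.

```python
from fractions import Fraction


def euler_phi(n):
    """Euler's totient function via trial-division factorisation."""
    result, m, q = n, n, 2
    while q * q <= m:
        if m % q == 0:
            while m % q == 0:
                m //= q
            result -= result // q
        q += 1
    if m > 1:
        result -= result // m
    return result


def heuristic_probability(p):
    """Evaluate 1/p + phi(p-1)/(p-1) + phi(p^2-1)/((p-1)(p^2-1)) exactly."""
    return (Fraction(1, p)
            + Fraction(euler_phi(p - 1), p - 1)
            + Fraction(euler_phi(p * p - 1), (p - 1) * (p * p - 1)))


for p in (5, 13, 101):
    print(p, float(heuristic_probability(p)))
```

Comparing these values against the empirical percentages in Figure 4 would be one way to test the heuristic.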

5. Shortest Paths and Proof of Theorem 1.3

In Proposition 2.3 we saw that the size of our lift grows much faster when the corresponding path contains many switches between different rotations. By following the BGS algorithm we were able to make a small number of these switches, but had to compromise by assuming long paths as we traveled along the orbit of a single rotation, due to the nonexplicit methods used in [2]. In this section, we study lifts obtained from paths of shortest possible length with possibly many switches between rotations. The bound in Theorem 1.3 is uniform if the following conjecture holds, which is expected to be true (see [8], for example).

Conjecture 5.1 (Super Strong Approximation).

The collection of Markoff mod $p$ graphs $(\mathcal{G}_{p})_{p}$, as $p$ ranges over the primes, forms an expander family.

An expander family is a collection of graphs $(G_{i})_{i}$ that is “highly connected but relatively sparse”. We refer to [11] for the formal definition. To obtain a uniform bound, we will only need to know that the expansion constants of the graphs in an expander family are bounded below by a positive constant. As a Corollary to the following Proposition, we obtain a bound on the diameter of any Markoff $\text{mod}\,p$ graph.

Proposition 5.2 (Proposition 3.1.5 of [11]).

Let GG be any finite non-empty connected graph. We have

diam(G)2log|G|2log(1+h(G)v)+3,\operatorname{diam}(G)\leq 2\frac{\log\frac{|G|}{2}}{\log\left(1+\frac{h(G)}{v}\right)}+3,

where vv is the maximum number of edges at each vertex and h(G)h(G) is the expansion constant of GG.

Note that $|\mathbb{X}^{*}(p)|=p^{2}\pm 3p$, depending on whether $p\equiv 1\ (\text{mod}\,4)$. Since $v=3$ in $\mathcal{G}_{p}$ for any prime $p$, we have the following Corollary.

Corollary 5.3.

If Conjecture 5.1 holds, then

diam(𝒢p)Clog(p2+32).\operatorname{diam}(\mathcal{G}_{p})\leq C\log\left(\frac{p^{2}+3}{2}\right).

where 𝒢p\mathcal{G}_{p} is the Markoff modp\text{mod}\,p graph and CC is a constant given by

C=5log(1+h3),C=\frac{5}{\log\left(1+\frac{h}{3}\right)},

where $h$ is a positive lower bound for the expansion constants of $(\mathcal{G}_{p})_{p}$.
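To illustrate how the diameter bound behaves, the sketch below evaluates the right-hand side of Proposition 5.2 and the simplified form with the additive $3$ absorbed into the constant $5$. The value of `h` is a placeholder chosen purely for illustration, since no explicit expansion constant for the Markoff graphs is given here; the vertex count $p^{2}+3p$ is likewise an assumption for a prime $p\equiv 1\ (\text{mod}\,4)$.

```python
import math


def diameter_bound(num_vertices, h, max_degree=3):
    """Right-hand side of Proposition 5.2: 2*log(|G|/2)/log(1 + h/v) + 3."""
    return 2 * math.log(num_vertices / 2) / math.log(1 + h / max_degree) + 3


def absorbed_bound(num_vertices, h, max_degree=3):
    """Simplified bound 5*log(|G|/2)/log(1 + h/v), valid when the ratio is >= 1."""
    return 5 * math.log(num_vertices / 2) / math.log(1 + h / max_degree)


p = 101
size = p * p + 3 * p   # assumed vertex count for p = 1 mod 4
h = 0.1                # hypothetical expansion constant, not a known value
print(diameter_bound(size, h), absorbed_bound(size, h))
```

Note that both expressions decrease as `h` grows, so a smaller expansion constant yields a weaker diameter bound.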

We are now prepared to prove our second main result.

Proof of Theorem 1.3.

Let 𝐱𝕏(p)\mathbf{x}\in\mathbb{X}^{*}(p) and suppose that the shortest path from (1,1,1)(1,1,1) to 𝐱\mathbf{x} in 𝒢p\mathcal{G}_{p} has length \ell. Then 𝐱\mathbf{x} is of the form

𝐱=rotisnsroti1n1(1,1,1),\mathbf{x}=\operatorname{rot}_{i_{s}}^{n_{s}}\cdots\operatorname{rot}_{i_{1}}^{n_{1}}(1,1,1),

where $1\leq s\leq\ell$ and $n_{1}+\cdots+n_{s}=\ell$. So by Proposition 2.3, if $\tilde{\mathbf{x}}$ is a lift of minimal size of $\mathbf{x}$ we have $\operatorname{size}(\tilde{\mathbf{x}})<(3\varepsilon)^{\alpha}$ where

$$\begin{aligned}\alpha&=2^{s-1}(n_{1}+1)\cdots(n_{s}+1)\\ &\leq 2^{s-1}5^{\ell}\\ &<2^{4\ell},\end{aligned}$$

where the second to last inequality follows from Lemma 5.4 below. By Corollary 5.3 we have

α<24Clog((p3+3)/2)<(p3+32)4C,\alpha<2^{4C\log((p^{3}+3)/2)}<\left(\frac{p^{3}+3}{2}\right)^{4C},

where

C=5log(1+h3)C=\frac{5}{\log(1+\frac{h}{3})}

as in Corollary 5.3. ∎

Lemma 5.4.

For any positive integer $\ell$ we have

$$\max\{(n_{1}+1)\cdots(n_{s}+1)\mid(n_{1},\dots,n_{s})\text{ partitions }\ell\text{ and }1\leq s\leq\ell\}\leq 5^{\ell}.$$
Proof.

Suppose that $(n_{1},\dots,n_{s})$ partitions $\ell$, and that one of the parts is larger than 4, say $n_{1}>4$. Then $n_{1}<3(n_{1}-3)$ and we get

$$(n_{1}+1)(n_{2}+1)\cdots(n_{s}+1)<(3(n_{1}-3)+1)(n_{2}+1)\cdots(n_{s}+1)\leq(3+1)(n_{1}-3+1)(n_{2}+1)\cdots(n_{s}+1).$$

Since $(3,n_{1}-3,n_{2},\dots,n_{s})$ is also a partition of $\ell$, the product $(n_{1}+1)\cdots(n_{s}+1)$ is not maximal. So, any partition $(n_{1},\dots,n_{s})$ maximizing $(n_{1}+1)\cdots(n_{s}+1)$ must have $n_{i}\leq 4$ for every $i=1,\dots,s$. Each factor $n_{i}+1$ is then at most 5, and since $s\leq\ell$ the product is at most $5^{s}\leq 5^{\ell}$. ∎
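Lemma 5.4 can be checked by brute force for small $\ell$. The sketch below enumerates ordered tuples of positive integers summing to $\ell$ (ordered, since the $n_{i}$ record consecutive rotation runs along a path) and records the largest product $(n_{1}+1)\cdots(n_{s}+1)$. The function names are illustrative.

```python
def compositions(total):
    """Yield every ordered tuple of positive integers summing to total."""
    if total == 0:
        yield ()
        return
    for first in range(1, total + 1):
        for rest in compositions(total - first):
            yield (first,) + rest


def max_product(total):
    """Largest value of (n_1 + 1)...(n_s + 1) over compositions of total."""
    best = 0
    for parts in compositions(total):
        product = 1
        for n in parts:
            product *= n + 1
        best = max(best, product)
    return best


for ell in range(1, 11):
    print(ell, max_product(ell), max_product(ell) <= 5 ** ell)
```

For these small values the maximum is attained when every part equals 1 and equals $2^{\ell}$, comfortably below $5^{\ell}$.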

5.1. Data to support improvements on average

Note that the bound in Theorem 1.3 assumed our shortest path from 𝐱\mathbf{x} in 𝒢p\mathcal{G}_{p} to (1,1,1)(1,1,1) switches between distinct rotations maximally. By Proposition 2.3, this contributes doubly exponentially to the growth of the corresponding lift. From numerical experimentation, we expect that this bound can be improved if we consider lifts on average.

The histogram in Figure 5 plots the frequency of the logarithmic size of Markoff triples of fixed length =14\ell=14 from (1,1,1)(1,1,1) in the Markoff tree defined by the rotations roti\operatorname{rot}_{i}; that is, the tree with root node (1,1,1)(1,1,1) and edges defined by {rot1,rot2,rot3}\{\operatorname{rot}_{1},\operatorname{rot}_{2},\operatorname{rot}_{3}\}.

Figure 5. Distribution of logarithmic sizes at level 14

This data suggests that sizes are distributed in a way that favors smaller lifts, and so we expect that the largest sizes used in our upper bound occur infrequently enough to allow for improvement to this bound on average. An interesting future direction would be an explicit calculation of this distribution in order to obtain an improved upper bound on the size of lifts on average using the methods in Theorem 1.3.

6. Short Paths Along Parabolic Orbits

Suppose 𝐱=(x1,x2,x3)\mathbf{x}=(x_{1},x_{2},x_{3}) is in 𝕏(p)\mathbb{X}^{*}(p) with xix_{i} parabolic. For convenience, suppose that i=1i=1 and set x=x1x=x_{1}. In Section 3.3 we showed that when p1(mod 4)p\equiv 1(\text{mod}\,4) and x=2/3x=2/3 the conic section consists of the disjoint orbits

Ci(𝐱)=M𝐱(+)M𝐱()C_{i}(\mathbf{x})=M_{\mathbf{x}^{(+)}}\cup M_{\mathbf{x}^{(-)}}

where M𝐱(+)M_{\mathbf{x}^{(+)}} and M𝐱()M_{\mathbf{x}^{(-)}} are defined in Equations (3.2) and (3.3). By Proposition 2.9 and Lemma 3.2, when x=2/3x=-2/3 we have Ci(𝐱)=M𝐱,iC_{i}(\mathbf{x})=M_{\mathbf{x},i} and using a similar analysis to that in Section 3.3 we get

M𝐱,i={(23,1±23in,123i(n+1))}.M_{\mathbf{x},i}=\left\{\left(\frac{-2}{3},-1\pm\frac{2}{3}in,1\mp\frac{2}{3}i(n+1)\right)\right\}.

Now, let pp be a prime so that there exists a point 𝐲=(y1,y2,y3)\mathbf{y}=(y_{1},y_{2},y_{3}) in the cage with

rot1n(1,1,1)=𝐲\operatorname{rot}_{1}^{n}(1,1,1)=\mathbf{y}

for 0n50\leq n\leq 5. If it’s the case that the maximal index of 𝐲\mathbf{y} is not equal to one, then we can connect any parabolic point directly to 𝐲\mathbf{y} as discussed in Section 3.3 by a path of the form

𝐱=rotinirotjnjrot1n(1,1,1).\mathbf{x}=\operatorname{rot}_{i}^{n_{i}}\operatorname{rot}_{j}^{n_{j}}\operatorname{rot}_{1}^{n}(1,1,1).

As in the proof of Theorem 1.2, we can take $n_{i},n_{j}$ smaller than the largest rotation order, and Proposition 2.3 then gives the following.

Proposition 6.1.

Let 𝐱𝕏(p)\mathbf{x}\in\mathbb{X}^{*}(p) be of the form 𝐱=(±2/3,x2,x3)\mathbf{x}=(\pm 2/3,x_{2},x_{3}). Suppose that the point 𝐲\mathbf{y} above has maximal index i{2,3}i\in\{2,3\} and let 𝐱~\tilde{\mathbf{x}} be a lift of 𝐱\mathbf{x} of minimal size. Then

size(x~)(3ε)20(2p+1)2.\operatorname{size}(\tilde{x})\leq(3\varepsilon)^{20(2p+1)^{2}}.

Since parabolic points have large orbits, it’s expected that many points are “close” to parabolic orbits. Because the bound in Proposition 6.1 is better than our previously obtained bounds, a natural next direction would be to investigate points that have short paths to nearby parabolic orbits.

References

  • [1] Martin Aigner. Markov’s theorem and 100 years of the uniqueness conjecture. Springer, Cham, 2013. A mathematical journey from irrational numbers to perfect matchings.
  • [2] Jean Bourgain, Alexander Gamburd, and Peter Sarnak. Markoff triples and strong approximation. Comptes Rendus Mathematique, 354(2):131–135, 2016.
  • [3] John William Scott Cassels. An introduction to the geometry of numbers. Springer Science & Business Media, 2012.
  • [4] Denis X. Charles, Kristin E. Lauter, and Eyal Z. Goren. Cryptographic hash functions from expander graphs. J. Cryptology, 22(1):93–113, 2009.
  • [5] William Chen. Strong approximation for the Markoff equation. arXiv: Number Theory, 2020.
  • [6] Harvey Cohn. Markoff forms and primitive words. Math. Ann., 196:8–22, 1972.
  • [7] Pietro Corvaja and Umberto Zannier. Greatest common divisors of $u-1,\ v-1$ in positive characteristic and rational points on curves over finite fields. J. Eur. Math. Soc. (JEMS), 15(5):1927–1942, 2013.
  • [8] Matthew De Courcy-Ireland and Seungjae Lee. Experiments with the Markoff surface. arXiv preprint arXiv:1812.07275v1, 2018.
  • [9] Elena Fuchs, Kristin Lauter, and Austin Tran. A cryptographic hash function from Markoff triples. arXiv preprint arXiv:2107.10906, 2021.
  • [10] Friedrich Hirzebruch and Don Zagier. The Atiyah-Singer theorem and elementary number theory. Mathematics Lecture Series, No. 3. Publish or Perish, Inc., Boston, Mass., 1974.
  • [11] Emmanuel Kowalski. An introduction to expander graphs, volume 26 of Cours Spécialisés [Specialized Courses]. Société Mathématique de France, Paris, 2019.
  • [12] Andrey Markoff. Sur les formes quadratiques binaires indéfinies. Math. Ann., 17(3):379–399, 1880.
  • [13] Don Zagier. On the number of Markoff numbers below a given bound. Math. Comp., 39(160):709–723, 1982.