

Modular forms, projective structures, and the four squares theorem

Michael Eastwood
School of Mathematical Sciences
University of Adelaide
SA 5005
Australia
meastwoo@gmail.com

Ben Moore
Mathematics Institute
Zeeman Building
University of Warwick
Coventry CV4 7AL
England
benmoore196884@gmail.com
Abstract.

It is well-known that Lagrange’s four-square theorem, stating that every natural number may be written as the sum of four squares, may be proved using methods from the classical theory of modular forms and theta functions. We revisit this proof. In doing so, we concentrate on geometry and thereby avoid some of the tricky analysis that is often encountered. Guided by projective differential geometry we find a new route to Lagrange’s theorem.

1991 Mathematics Subject Classification:
11F03, 11F27, 53A20
[Figure: An artist’s impression of the action of $\Gamma_{0}(4)$ on the unit disc, in the Polish wycinanka łowicka style, by Katarzyna Nurowska.]

1. Introduction

In 1770, Lagrange proved that every natural number can be written as the sum of four squares. In 1834, Jacobi gave a formula for the number of different ways that this can be done. More precisely, if we consider the formal power series

\[ \theta(q)\equiv\sum_{n\in{\mathbb{Z}}}q^{n^{2}}=1+2(q+q^{4}+q^{9}+q^{16}+q^{25}+\cdots), \tag{1} \]

then Lagrange’s Theorem says that all coefficients of

\[ (\theta(q))^{4}=1+8(q+3q^{2}+4q^{3}+3q^{4}+6q^{5}+12q^{6}+8q^{7}+3q^{8}+13q^{9}+\cdots) \]

are positive whilst Jacobi’s Theorem gives a manifestly positive formula for these coefficients. In fact, it is evident from the identity

\[ 2(a^{2}+b^{2}+c^{2}+d^{2})=(a+b)^{2}+(a-b)^{2}+(c+d)^{2}+(c-d)^{2}, \]

that, for Lagrange’s theorem, it suffices to show that all odd natural numbers may be written as the sum of four squares whence it suffices to establish Jacobi’s formula in this case, namely that

\[ \begin{array}{rcl}(\theta(q))^{4}-(\theta(-q))^{4}&=&16(q+4q^{3}+6q^{5}+8q^{7}+13q^{9}+\cdots)\\[4pt] &=&16\Big(\sum_{m=0}^{\infty}\sigma(2m+1)q^{2m+1}\Big),\end{array} \tag{2} \]

where $\sigma(n)\equiv\sum_{d|n}d$ is the sum-of-divisors function. The aim of this article is to prove (2). It is well-known that this can be accomplished using modular forms and this is what we shall do. However, some of the tricky analysis can be avoided in favour of geometry. This is one motivation for this article. Another is that a key feature of the usual proof, namely that a certain vector space of modular forms is two-dimensional, is replaced by the two-dimensionality of the solution space to a projectively invariant linear differential equation. This reasoning is potentially applicable for automorphic forms beyond complex analysis.
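Before embarking on the proof, the identity (2) can be checked by machine for small exponents. The following is only a sanity check under our own naming (the helpers `theta_coeffs`, `multiply`, and `sigma` are illustrative and not part of the argument): it expands the fourth power of the theta series as a truncated power series and compares the odd coefficients with $16\sigma(n)$.

```python
def theta_coeffs(n_max):
    """Coefficients of theta(q) = sum over integers n of q^(n^2), up to q^n_max."""
    coeffs = [0] * (n_max + 1)
    n = 0
    while n * n <= n_max:
        coeffs[n * n] += 1 if n == 0 else 2
        n += 1
    return coeffs


def multiply(a, b, n_max):
    """Product of two truncated power series, truncated at q^n_max."""
    out = [0] * (n_max + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > n_max:
                break
            out[i + j] += ai * bj
    return out


def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


N_MAX = 60
theta = theta_coeffs(N_MAX)
theta2 = multiply(theta, theta, N_MAX)
theta4 = multiply(theta2, theta2, N_MAX)

# theta(q)^4 - theta(-q)^4 keeps only the odd powers and doubles their coefficients.
odd_part = {n: 2 * theta4[n] for n in range(1, N_MAX + 1, 2)}
identity_holds = all(odd_part[n] == 16 * sigma(n) for n in odd_part)
print(identity_holds)
```

Of course, such a finite check proves nothing about all exponents; it merely guards against misprints in the displayed coefficients.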

2. The twice-punctured sphere

It is not commonly realised that the first contributor to the theory of modular forms was the cartographer Mercator, who in 1569 found an accurate conformal map of the twice-punctured round sphere. With the punctures at the South and North Poles, this Mercator projection is the default representation of the Earth to be found in ordinary atlases (but we find it convenient to put the southern hemisphere at the top). From a modern perspective, it may be constructed in two steps:

  • Use stereographic projection

    [Figure: stereographic projection of the sphere, with the poles $N$ and $S$ marked.]

    to identify $S^{2}\setminus\{N\}$ with the complex plane ${\mathbb{C}}$.

  • Use the complex logarithm to ‘unwrap’ the punctured complex plane ${\mathbb{C}}\setminus\{0\}$ to its universal cover ${\mathbb{C}}$.

These two steps are conformal, the first by geometry or calculus, and the second by the Cauchy-Riemann equations. Explicit formulæ are

\[ \begin{array}{ccccc}{\mathbb{C}}&\longrightarrow&{\mathbb{C}}\setminus\{0\}&\longrightarrow&S^{2}\setminus\{S,N\}\\[5pt] \tau&\longmapsto&q=e^{2\pi i\tau}\\ &&q=u+iv&\longmapsto&\displaystyle\frac{1}{u^{2}+v^{2}+4}\left[\begin{array}{c}4u\\ 4v\\ u^{2}+v^{2}-4\end{array}\right]\end{array} \]

and we end up with two crucial (and conformal) facts:

  • $S^{2}\setminus\{S,N\}\cong\displaystyle\frac{\mathbb{C}}{\{\tau\sim\tau+1\}}$,

  • $q=e^{2\pi i\tau}$ is a local coördinate on $S^{2}$ near the South Pole.

Note that this essential appearance of the logarithm in the Mercator projection predates Napier and others (in the seventeenth century).
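The explicit formulæ for the two steps are easily tested numerically. The sketch below (the function name `mercator_to_sphere` is our own, purely for illustration) composes $\tau\mapsto q=e^{2\pi i\tau}$ with the inverse stereographic projection and checks three things at sample points: the image lies on the unit sphere, the points $\tau$ and $\tau+1$ have the same image, and the image tends to the South Pole $(0,0,-1)$ as the imaginary part of $\tau$ grows.

```python
import cmath
import math


def mercator_to_sphere(tau):
    """Send tau in C to a point of the unit sphere via q = exp(2*pi*i*tau)."""
    q = cmath.exp(2j * math.pi * tau)
    u, v = q.real, q.imag
    denom = u * u + v * v + 4
    return (4 * u / denom, 4 * v / denom, (u * u + v * v - 4) / denom)


def norm_squared(p):
    """Squared Euclidean length of a point in R^3."""
    return sum(x * x for x in p)


tau = 0.3 + 0.2j
p = mercator_to_sphere(tau)
p_shifted = mercator_to_sphere(tau + 1)
near_south = mercator_to_sphere(0.3 + 5j)

print(norm_squared(p))
print(max(abs(a - b) for a, b in zip(p, p_shifted)))
print(near_south)
```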

The Mercator realisation of the twice-punctured sphere

\[ S^{2}\setminus\{S,N\}=S^{2}\setminus\{q=0,q=\infty\} \]

may already be used to prove some useful identities as follows.

Theorem 1.

If $q=e^{2\pi i\tau}$, then

\[ \sum_{d=-\infty}^{\infty}\frac{1}{(\tau+d)^{2}}=-4\pi^{2}\sum_{m=1}^{\infty}mq^{m},\quad\mbox{for}\enskip|q|<1. \tag{3} \]
Proof.

It is easy to check that the left hand side is uniformly convergent on compact subsets of ${\mathbb{C}}\setminus{\mathbb{Z}}$. It is invariant under $\tau\mapsto\tau+1$ and therefore descends to a holomorphic function on the thrice-punctured sphere:

\[ S^{2}\setminus\{q=0,q=\infty,q=1\}. \]

Let us call this function $F(q)$ and note that

  • $F(q)\to 0$ as $q\to 0$ (that is, as the imaginary part of $\tau$ tends to $+\infty$),

  • $F(1/q)=F(q)$ (since the sum is unchanged by $\tau\mapsto-\tau$).

It follows that $F(q)$ extends holomorphically through $q=0$ and $q=\infty$ and has zeroes at these two points whilst at $q=1$ it clearly extends meromorphically with a double pole there. Hence,

\[ F(q)=C\frac{q}{(q-1)^{2}} \]

for some constant $C$. To compute $C$, we may substitute $\tau=1/2$ (that is, $q=-1$) to find that

\[ C=-16\sum_{d=-\infty}^{\infty}\frac{1}{(2d+1)^{2}}=-16\frac{\pi^{2}}{4}=-4\pi^{2}. \]

Finally, if $|q|<1$, then

\[ \frac{q}{(q-1)^{2}}=q\frac{\partial}{\partial q}\frac{1}{1-q}=q\frac{\partial}{\partial q}\sum_{m=0}^{\infty}q^{m}=\sum_{m=1}^{\infty}mq^{m}, \]

as required. ∎

Corollary 1.

For $q=e^{2\pi i\tau}$ and $|q|<1$,

\[ \sum_{d=-\infty}^{\infty}\frac{1}{(\tau+d)^{4}}=\frac{8\pi^{4}}{3}\sum_{m=1}^{\infty}m^{3}q^{m}. \tag{4} \]
Proof.

By the chain rule

\[ \frac{\partial}{\partial\tau}=2\pi iq\frac{\partial}{\partial q}, \]

and applying this operator twice to (3) gives the required identity. ∎
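Both (3) and (4) lend themselves to a quick numerical test. The following sketch makes some arbitrary choices of our own (the sample point $\tau=0.3+0.7i$, the cut-offs, and the helper names `lattice_sum` and `q_series`); it compares symmetric partial sums of the left hand sides with truncations of the $q$-series on the right. The partial sums for (3) converge only like the reciprocal of the cut-off, so agreement to about four decimal places is all that should be expected there.

```python
import cmath


def lattice_sum(tau, power, cutoff):
    """Symmetric partial sum of 1/(tau+d)^power over |d| <= cutoff."""
    total = 1 / tau**power
    for d in range(1, cutoff + 1):
        total += 1 / (tau + d)**power + 1 / (tau - d)**power
    return total


def q_series(tau, power, terms=100):
    """Sum of m^power * q^m for q = exp(2*pi*i*tau), truncated after `terms` terms."""
    q = cmath.exp(2j * cmath.pi * tau)
    return sum(m**power * q**m for m in range(1, terms + 1))


tau = 0.3 + 0.7j

lhs3 = lattice_sum(tau, 2, 100000)
rhs3 = -4 * cmath.pi**2 * q_series(tau, 1)

lhs4 = lattice_sum(tau, 4, 2000)
rhs4 = (8 * cmath.pi**4 / 3) * q_series(tau, 3)

print(abs(lhs3 - rhs3), abs(lhs4 - rhs4))
```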

We remark that identities such as (3) and (4) are often established using ‘unfamiliar expressions’ for trigonometric functions and regarded as a ‘standard rite of passage into modular forms’ [2, p. 5]. Already, we see the utility of the Mercator projection in identifying the universal cover of the twice-punctured sphere and it is natural to ask about a similar identification for the thrice-punctured sphere.

3. The thrice-punctured sphere

Our exposition in this section follows advice from Tony Scholl to the first author in 1984.

Let $\Sigma$ be the thrice-punctured Riemann sphere. More specifically, let us use the standard coördinate $z\in{\mathbb{C}}\hookrightarrow{\mathbb{C}}\sqcup\{\infty\}=S^{2}$, and set

\[ \Sigma\equiv S^{2}\setminus\{0,1,\infty\}=\{z\in{\mathbb{C}}\mid z\not=0,1\}. \]

By the Riemann mapping theorem there is a conformal isomorphism between the lower half plane

\[ \{z=x+iy\in{\mathbb{C}}\mid y<0\} \]

and the following subset

(5) [Figure: a region in the $(s,t)$-plane with marked boundary points $\tau=0$ and $\tau=1/2$.]

of the upper half plane ${\mathcal{H}}\equiv\{\tau=s+it\mid t>0\}$. In fact, as with all Riemann mappings, there is a three-parameter family thereof and we need to specify just one of them. To do this let us extend the lower half plane as the complement of two rays

[Figure: the $z$-plane with the points $z=0$ and $z=1$ marked.]

extend the target domain as

[Figure: a region of the upper half plane with marked boundary points $\tau=0$, $\tau=1/2$, and $\tau=1$.]

and consider the Riemann mapping between these extensions that sends $z=1/2$ to $\tau=(1+i)/2$ and, at these points, sends the direction $\partial/\partial x$ to $-\partial/\partial t$, as shown.

This particular Riemann mapping is chosen so that it intertwines the involution $z\mapsto 1-z$ (having fixed point $z=1/2$) with the involution $\tau\mapsto(\tau-1)/(2\tau-1)$ of ${\mathcal{H}}$ (fixing $\tau=(1+i)/2$ and preserving the extended target).

We conclude that the lower half plane is sent to the ‘tile’

[Figure: the tile, with vertices labelled $z=\infty$, $z=0$, and $z=1$.]

and that this mapping holomorphically extends across the line segment $[0,1]$ to the upper half plane, which itself is sent to a neighbouring and translated tile attached to the right of the original. It is illuminating to view this construction on the sphere

[Figure: the sphere with the points $\infty$, $0$, and $1$ marked.]

with the lower hemisphere as domain and, replacing the upper half plane by the unit disc with its hyperbolic metric, the target is now the ideal triangle:

(6) [Figure: the ideal triangle in the unit disc, with vertices labelled $z=1$, $z=0$, and $z=\infty$.]

From this point of view, the mapping extends holomorphically through a ‘portal’ in the equator between $0$ and $1$ to the upper hemisphere, with the result mapping to

[Figure: two adjacent ideal triangles in the unit disc, labelled Southern Hemisphere and Northern Hemisphere, with vertices labelled $z=1$, $z=\infty$, $z=0$, and $z=\infty$.]

However, there are three such portals to the upper hemisphere, all on an equal footing with respect to the evident three-fold rotational symmetry. Using all three unwraps the thrice-punctured sphere to

(7) [Figure: the thrice-punctured sphere unwrapped through all three portals.]

and, of course, we can keep going to and fro between north and south through our three portals to obtain a tessellation (familiar from the works of M.C. Escher) of the hyperbolic disc $\Delta$ and a conformal covering $\Delta\to S^{2}\setminus\{0,1,\infty\}$. This is an explicit realisation of the universal covering. We remark that the Little Picard Theorem follows immediately from this realisation.

4. Symmetries of the upper half plane

The reader may be wondering why we viewed the extended target as a domain in the upper half plane

[Figure: the domain ${\mathcal{D}}$ in the upper half plane, with marked boundary points $\tau=0$, $\tau=1/2$, and $\tau=1$.]

rather than the corresponding domain in the unit disc:

[Figure: the corresponding domain in the unit disc.]

The point is that the upper half plane is more congenial with regard to an explicit realisation of the symmetry group for which this extended tile is a fundamental domain.

Lemma 1.

The two transformations

\[ T\tau\equiv\tau+1\qquad\mbox{and}\qquad U\tau\equiv\frac{\tau}{4\tau+1} \]

generate a group of biholomorphisms of the upper half plane ${\mathcal{H}}$, having ${\mathcal{D}}$ as fundamental domain.

Proof.

Regarding the ideal triangle (6), the corresponding tessellation (7) is evidently generated by the three hyperbolic reflections in its sides. Viewed in the upper half plane (5), these three reflections are

\[ \Pi_{1}\tau=-\overline{\tau},\qquad\Pi_{2}\tau=1-\overline{\tau},\qquad\Pi_{3}\tau=\frac{\overline{\tau}}{4\overline{\tau}-1}. \]

Therefore, the group we seek may be generated by $\Pi_{2}\circ\Pi_{1}$, $\Pi_{3}\circ\Pi_{1}$, and $\Pi_{3}\circ\Pi_{2}$, namely

\[ \tau\mapsto\tau+1,\qquad\tau\mapsto\frac{\tau}{4\tau+1},\qquad\tau\mapsto\frac{\tau-1}{4\tau-3}. \]

But these three transformations are $T$, $U$, and $UT^{-1}$. ∎
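The three compositions in this proof are a matter of routine algebra, but they may also be confirmed numerically. The following sketch (with function names of our own choosing) evaluates both sides at a few arbitrary points of the upper half plane.

```python
def pi1(tau):
    """Reflection in the line Re(tau) = 0."""
    return -tau.conjugate()


def pi2(tau):
    """Reflection in the line Re(tau) = 1/2."""
    return 1 - tau.conjugate()


def pi3(tau):
    """Reflection in the semicircle joining tau = 0 and tau = 1/2."""
    return tau.conjugate() / (4 * tau.conjugate() - 1)


def T(tau):
    return tau + 1


def U(tau):
    return tau / (4 * tau + 1)


def T_inv(tau):
    return tau - 1


samples = [0.2 + 0.5j, -1.3 + 2.0j, 0.45 + 0.1j]
checks = []
for tau in samples:
    checks.append(abs(pi2(pi1(tau)) - T(tau)) < 1e-12)
    checks.append(abs(pi3(pi1(tau)) - U(tau)) < 1e-12)
    checks.append(abs(pi3(pi2(tau)) - U(T_inv(tau))) < 1e-12)
print(all(checks))
```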

It is useful to have an algebraic description of the group generated by $T$ and $U$. To this end, and also because we shall need some of this algebra for other purposes later on, we record some standard properties of the following well-known group.

4.1. The modular group

This is an alternative name for the group ${\mathrm{SL}}(2,{\mathbb{Z}})$ of $2\times 2$ unit determinant matrices with integer entries. It is generated by

\[ S\equiv\begin{bmatrix}0&-1\\ 1&0\end{bmatrix}\quad\mbox{and}\quad T\equiv\begin{bmatrix}1&1\\ 0&1\end{bmatrix}. \]

Notice that

\[ S^{2}=-{\mathrm{Id}}=(ST)^{3}. \]

There is a normal subgroup $\{\pm{\mathrm{Id}}\}\lhd{\mathrm{SL}}(2,{\mathbb{Z}})$. The quotient group is denoted ${\mathrm{PSL}}(2,{\mathbb{Z}})$. It is generated by $S$ and $T$ subject to the relations $S^{2}={\mathrm{Id}}=(ST)^{3}$. The group ${\mathrm{SL}}(2,{\mathbb{R}})$ acts on the upper half plane ${\mathcal{H}}$ according to

\[ \begin{bmatrix}a&b\\ c&d\end{bmatrix}\tau=\frac{a\tau+b}{c\tau+d}\,, \]

this action descending to a faithful action of ${\mathrm{PSL}}(2,{\mathbb{R}})$. Indeed, this action identifies ${\mathrm{PSL}}(2,{\mathbb{R}})$ as the biholomorphisms of ${\mathcal{H}}$. Having done this, the subgroup ${\mathrm{PSL}}(2,{\mathbb{Z}})$ acts properly discontinuously on ${\mathcal{H}}$. It is easy to verify and well-known that

(8) [Figure: a region of the upper half plane with marked points $\tau=0$, $\tau=1/2$, $\tau=i$, and $\tau=1/2+\sqrt{3}i/2$.]

is a fundamental domain for this action.

4.2. Some congruence subgroups

Let us consider the following two subgroups of ${\mathrm{SL}}(2,{\mathbb{Z}})$.

  • $\Gamma(4)\equiv\left\{\begin{bmatrix}a&b\\ c&d\end{bmatrix}\in{\mathrm{SL}}(2,{\mathbb{Z}})\mid\begin{bmatrix}a&b\\ c&d\end{bmatrix}\equiv\begin{bmatrix}1&0\\ 0&1\end{bmatrix}\bmod 4\right\}$.

  • $\Gamma_{1}(4)\equiv\left\{\begin{bmatrix}a&b\\ c&d\end{bmatrix}\in{\mathrm{SL}}(2,{\mathbb{Z}})\mid\begin{bmatrix}a&b\\ c&d\end{bmatrix}\equiv\begin{bmatrix}1&*\\ 0&1\end{bmatrix}\bmod 4\right\}$.

It is clear that

\[ \Gamma(4)\lhd{\mathrm{SL}}(2,{\mathbb{Z}})\twoheadrightarrow{\mathrm{SL}}(2,{\mathbb{Z}}_{4}) \]

and easily verified that ${\mathrm{SL}}(2,{\mathbb{Z}}_{4})$ has 48 elements. In particular, the subgroup $\Gamma(4)$ has index 48 in ${\mathrm{SL}}(2,{\mathbb{Z}})$. Also the homomorphism

\[ \Gamma_{1}(4)\ni\begin{bmatrix}a&b\\ c&d\end{bmatrix}\longmapsto b\bmod 4\in{\mathbb{Z}}_{4} \]

shows that $\Gamma(4)\lhd\Gamma_{1}(4)$ of index $4$. Therefore, whilst $\Gamma_{1}(4)$ is not a normal subgroup of ${\mathrm{SL}}(2,{\mathbb{Z}})$, it has index $48/4=12$.
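The counts in this paragraph are small enough to confirm by brute force. The sketch below simply enumerates all $2\times 2$ matrices with entries in ${\mathbb{Z}}_{4}$; the variable names are ours, and the passage from these counts to indices in ${\mathrm{SL}}(2,{\mathbb{Z}})$ relies on the surjectivity of reduction modulo $4$ asserted above.

```python
from itertools import product

MOD = 4

# All 2x2 matrices (a, b, c, d) over Z/4 with determinant 1.
sl2_mod4 = [
    (a, b, c, d)
    for a, b, c, d in product(range(MOD), repeat=4)
    if (a * d - b * c) % MOD == 1
]

# Images of Gamma_1(4) and Gamma_0(4) under reduction modulo 4.
gamma1_image = [m for m in sl2_mod4 if m[0] == 1 and m[2] == 0 and m[3] == 1]
gamma0_image = [m for m in sl2_mod4 if m[2] == 0]

index_gamma1 = len(sl2_mod4) // len(gamma1_image)
index_gamma0 = len(sl2_mod4) // len(gamma0_image)
print(len(sl2_mod4), index_gamma1, index_gamma0)
```

The last number printed anticipates the index of $\{\pm{\mathrm{Id}}\}\times\Gamma_{1}(4)$, which is the subgroup of matrices that are upper triangular modulo $4$.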

We may now achieve our goal of an algebraic description of the group generated by $T$ and $U$.

Lemma 2.

The subgroup of ${\mathrm{SL}}(2,{\mathbb{Z}})$ generated by

\[ \begin{bmatrix}1&1\\ 0&1\end{bmatrix}\quad\mbox{and}\quad\begin{bmatrix}1&0\\ 4&1\end{bmatrix} \]

is $\Gamma_{1}(4)$.

Proof.

We give a geometric proof by comparing fundamental domains. To this end we note that

(9) [Figure: a region of the upper half plane with marked points $\tau=0$, $\tau=1/2$, $\tau=1$, $\tau=i$, and $\tau=1/2+\sqrt{3}i/2$.]

is a perfectly good alternative to the usual (8) as a fundamental domain for the action of ${\mathrm{PSL}}(2,{\mathbb{Z}})$. Moreover, six hyperbolic copies of this alternative may be used to tile the fundamental domain ${\mathcal{D}}$ concerning the action of Lemma 1:

(10) [Figure: the domain ${\mathcal{D}}$, with marked boundary points $\tau=0$, $\tau=1/2$, and $\tau=1$, tiled by six copies of (9).]

We have observed that $\Gamma_{1}(4)\subset{\mathrm{SL}}(2,{\mathbb{Z}})$ has index $12$. It follows that

\[ \{\pm{\mathrm{Id}}\}\times\Gamma_{1}(4)\subset{\mathrm{SL}}(2,{\mathbb{Z}}) \]

has index $6$ and, therefore, that $\Gamma_{1}(4)$ may be regarded as a subgroup of ${\mathrm{PSL}}(2,{\mathbb{Z}})$ of index $6$. Certainly,

\[ \left\langle\begin{bmatrix}1&1\\ 0&1\end{bmatrix},\begin{bmatrix}1&0\\ 4&1\end{bmatrix}\right\rangle\subseteq\Gamma_{1}(4). \]

Equality follows because, as subgroups of ${\mathrm{PSL}}(2,{\mathbb{Z}})$, they have the same index of $6$, as (10) shows. ∎

It is usual to introduce another congruence subgroup of the modular group

\[ \Gamma_{0}(4)\equiv\left\{\begin{bmatrix}a&b\\ c&d\end{bmatrix}\in{\mathrm{SL}}(2,{\mathbb{Z}})\mid\begin{bmatrix}a&b\\ c&d\end{bmatrix}\equiv\begin{bmatrix}*&*\\ 0&*\end{bmatrix}\bmod 4\right\} \]

but it has already occurred in our proof above as $\{\pm{\mathrm{Id}}\}\times\Gamma_{1}(4)$.

In summary, the group ${\mathrm{SL}}(2,{\mathbb{R}})$ acts on the upper half plane ${\mathcal{H}}$ by

\[ \begin{bmatrix}a&b\\ c&d\end{bmatrix}\tau\equiv\frac{a\tau+b}{c\tau+d}. \]

The resulting homomorphism ${\mathrm{SL}}(2,{\mathbb{R}})\to{\mathrm{Biholo}}({\mathcal{H}})$ is a double cover, having $\{\pm{\mathrm{Id}}\}$ as kernel. The subgroup $\Gamma_{0}(4)\subset{\mathrm{SL}}(2,{\mathbb{R}})$ descends to

\[ \Gamma_{1}(4)\subset{\mathrm{PSL}}(2,{\mathbb{R}})={\mathrm{Biholo}}({\mathcal{H}}), \]

which acts properly discontinuously and without fixed points. The resulting mapping

\[ {\mathcal{H}}\longrightarrow\Gamma_{1}(4)\backslash{\mathcal{H}}=\frac{\mathcal{H}}{\Big\{\tau\sim\tau+1,\ \displaystyle\tau\sim\frac{\tau}{4\tau+1}\Big\}}\cong S^{2}\setminus\{0,1,\infty\}\equiv\Sigma \]

is an explicit (and conformal) realisation of the universal cover of the thrice-punctured sphere $\Sigma$.

Note that there is still a certain amount of mystery built into this realisation, which can be traced back to our use of the non-constructive Riemann mapping theorem at the start of Section 3. This mystery now shows up in our having two natural local coördinates near the South Pole. On the one hand, we may write $q=e^{2\pi i\tau}$, as we did for the twice-punctured sphere, to obtain a local holomorphic coördinate $q$ replacing $\tau\sim\tau+1$ for $\tau=s+it$ as $t\uparrow\infty$. On the other hand, we have, by construction, the global meromorphic coördinate $z$ on the sphere with the South Pole at $z=0$. It follows that $z$ is a holomorphic function of $q$ near $\{q=0\}$ and vice versa. For the moment, the relationship between $z$ and $q$ is mysterious save that various key points coincide:

\[ \begin{array}{c||c|c|c}z&0&1&\infty\\ \hline q&0&-1&1\end{array}. \]

It is clear, however, that $\Sigma$ acquires a projective structure: a preferred set of local coördinates related by Möbius transformations. In fact, it is better: we have $\tau$ defined up to ${\mathrm{PSL}}(2,{\mathbb{R}})$ freedom (real Möbius transformations).

5. Puncture repair

The main upshot of the reasoning in Sections 3 and 4 is a realisation of the thrice-punctured Riemann sphere $\Sigma\equiv S^{2}\setminus\{0,1,\infty\}$ as the upper half plane ${\mathcal{H}}$ modulo the action of $\Gamma_{1}(4)$, an explicit subgroup of ${\mathrm{Aut}}({\mathcal{H}})$ acting properly discontinuously and without fixed points. Furthermore, it is evident from this construction that $\Sigma$ may be compactified as the Riemann sphere (using, for example, the coördinate change $q=e^{2\pi i\tau}$). In fact, an argument due to Ahlfors and Beurling [1] shows that there are no other conformal compactifications.

Theorem 2.

Suppose $M$ is a compact Riemann surface with $\Sigma\hookrightarrow M$ a conformal isomorphism onto an open subset of $M$. Then $M$ must be conformal to the Riemann sphere with $\Sigma\hookrightarrow S^{2}$ the standard embedding.

Proof.

In fact, this is a local result as in the following picture,

[Figure: a punctured open disc, conformally isomorphic ($\cong$) to an open set $U$ inside $V$.]

taken from [3]. The punctured open disc is assumed to be conformally isomorphic to the open set $U$ (but nothing is supposed concerning the boundary $\partial U$ of $U$ in $V$). We conclude that $V$ is conformally the disc and $U\hookrightarrow V$ the punctured disc, tautologically included. To see this, we calculate in polar coördinates $(r,\theta)$ on the unit disc. We know that there is a smooth positive function $\Omega(r,\theta)$ defined for $r>0$ so that the metric $\Omega(r,\theta)^{2}(dr^{2}+r^{2}d\theta^{2})$ smoothly extends from $U$ to $V$. We will encounter a contradiction if $\partial U$ contains two or more points since, in this case, the concentric curves $\{r=\epsilon\}$, as $\epsilon\downarrow 0$, have length bounded away from zero in the metric $\Omega(r,\theta)^{2}(dr^{2}+r^{2}d\theta^{2})$. More explicitly,

\[ \int_{0}^{2\pi}\Omega(r,\theta)r\,d\theta \]

is bounded away from zero as $r\downarrow 0$. On the other hand, the area of the region $\{0<r<\epsilon\}$ in $V$ is estimated by Cauchy-Schwarz as

\[ \int_{0}^{\epsilon}\!\!\int_{0}^{2\pi}\Omega^{2}d\theta\,r\,dr\geq\frac{1}{2\pi}\int_{0}^{\epsilon}\!\left[\int_{0}^{2\pi}\Omega\,d\theta\right]^{2}\!r\,dr=\frac{1}{2\pi}\int_{0}^{\epsilon}\!\left[\int_{0}^{2\pi}\Omega r\,d\theta\right]^{2}\frac{dr}{r} \]

and is therefore forced to be infinite. ∎

Otherwise said, there is no difference between the Riemann sphere, either marked at $\{0,1,\infty\}$ or punctured there. Thus, it makes intrinsic sense on $\Sigma\equiv S^{2}\setminus\{0,1,\infty\}$ to consider holomorphic $1$-forms that are restricted from meromorphic $1$-forms on $S^{2}$ with poles only at the marks. Of special interest is the space (in traditional arcane notation)

\[ {\mathcal{M}}_{2}(\Gamma_{0}(4))\equiv\left\{\begin{array}{l}\mbox{holomorphic $1$-forms $\omega$ on $\Sigma$ extending}\\ \mbox{meromorphically to $S^{2}$ with, at worst,}\\ \mbox{only simple poles at $0,1,\infty$}\end{array}\right\}. \]
Theorem 3.

There is a canonical isomorphism

\[ {\mathcal{M}}_{2}(\Gamma_{0}(4))\cong\{(a,b,c)\in{\mathbb{C}}^{3}\mid a+b+c=0\}. \]
Proof.

The isomorphism is given by

\[ \omega\longmapsto({\mathrm{Res}}_{z=0\,}\omega,{\mathrm{Res}}_{z=1\,}\omega,{\mathrm{Res}}_{z=\infty\,}\omega), \]

with $a+b+c=0$ being a consequence of the Residue Theorem. ∎

In particular, there is the special meromorphic $1$-form

\[ \frac{dz}{z},\enskip\mbox{holomorphic save for}\enskip\Big\{\begin{array}{l}\mbox{simple poles only at $0$ and $\infty$},\\ \mbox{residue}=1\mbox{ at }0.\end{array} \]

6. Automorphisms of the thrice-punctured sphere

By the Ahlfors-Beurling Theorem, automorphisms of $\Sigma$ correspond to permutations of $\{0,1,\infty\}$ and there are two particular ones that we shall find useful. Firstly, since

\[ \begin{bmatrix}0&-1/2\\ 2&0\end{bmatrix}\begin{bmatrix}a&b\\ c&d\end{bmatrix}=\begin{bmatrix}d&-c/4\\ -4b&a\end{bmatrix}\begin{bmatrix}0&-1/2\\ 2&0\end{bmatrix}, \]

it follows that

\[ \tau\mapsto-1/{4\tau} \tag{11} \]

induces an automorphism of $\Sigma$. In the $z$-coördinate, it is the one that swops $0$ and $\infty$ but fixes $1$, namely $z\mapsto 1/z$.

Secondly, since

\[ \begin{bmatrix}1&1/2\\ 0&1\end{bmatrix}\begin{bmatrix}a&b\\ c&d\end{bmatrix}=\begin{bmatrix}a+c/2&b+(d-a)/2-c/4\\ c&d-c/2\end{bmatrix}\begin{bmatrix}1&1/2\\ 0&1\end{bmatrix}, \]

it follows that

\[ \tau\mapsto\tau+1/2 \tag{12} \]

is the automorphism of $\Sigma$ that swops $z=1$ and $z=\infty$ whilst fixing $0$. Close to $q=0$, we recognise it as $q\mapsto-q$. In the $z$-coördinate, it is

\[ z\mapsto z/(z-1). \]
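The two matrix identities of this section, and the fact that the conjugated matrices again lie in $\Gamma_{0}(4)$, can be verified in exact rational arithmetic. The sketch below does so for a handful of sample elements of $\Gamma_{0}(4)$ chosen by us; the helper names (`mul`, `in_gamma0_4`, `conjugate_by_phi`, `conjugate_by_psi`) are illustrative only.

```python
from fractions import Fraction as F


def mul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def in_gamma0_4(M):
    """True if M has integer entries, determinant 1, and lower-left entry divisible by 4."""
    entries = [M[0][0], M[0][1], M[1][0], M[1][1]]
    integral = all(F(x).denominator == 1 for x in entries)
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return integral and det == 1 and M[1][0] % 4 == 0


PHI = ((F(0), F(-1, 2)), (F(2), F(0)))  # tau -> -1/(4 tau), as in (11)
PSI = ((F(1), F(1, 2)), (F(0), F(1)))   # tau -> tau + 1/2, as in (12)


def conjugate_by_phi(a, b, c, d):
    """Right hand matrix in the first displayed identity."""
    return ((F(d), F(-c, 4)), (F(-4 * b), F(a)))


def conjugate_by_psi(a, b, c, d):
    """Right hand matrix in the second displayed identity."""
    return ((a + F(c, 2), b + F(d - a, 2) - F(c, 4)), (F(c), d - F(c, 2)))


# Sample elements (a, b, c, d) of Gamma_0(4): determinant 1 and c divisible by 4.
samples = [(1, 1, 0, 1), (1, 0, 4, 1), (5, 1, 4, 1), (-3, 1, -4, 1)]
results = []
for a, b, c, d in samples:
    g = ((F(a), F(b)), (F(c), F(d)))
    g_phi = conjugate_by_phi(a, b, c, d)
    g_psi = conjugate_by_psi(a, b, c, d)
    results.append(
        mul(PHI, g) == mul(g_phi, PHI) and in_gamma0_4(g_phi)
        and mul(PSI, g) == mul(g_psi, PSI) and in_gamma0_4(g_psi)
    )
print(results)
```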

7. The normal distribution

At this point, rather bizarrely, it is useful to discuss the normal distribution

\[ f(x)\equiv e^{-\pi x^{2}} \]

and its well-known invariance under the Fourier transform

\[ \widehat{f}(\xi)\equiv\int_{-\infty}^{\infty}f(x)e^{-2\pi i\xi x}dx=e^{-\pi\xi^{2}}. \]

More generally, integration by substitution shows that

\[ f(x)=e^{-2\pi tx^{2}}\Longrightarrow\widehat{f}(\xi)=\frac{1}{\sqrt{2t}}e^{-(\pi/2t)\xi^{2}} \tag{13} \]

for any $t>0$. The Poisson summation formula says that

\[ \sum_{n\in{\mathbb{Z}}}f(n)=\sum_{n\in{\mathbb{Z}}}\widehat{f}(n) \]

for $f:{\mathbb{R}}\to{\mathbb{R}}$ a suitably well-behaved function (for example, one that lies in Schwartz space). For $f(x)=e^{-2\pi tx^{2}}$, as in (13), we find that

\[ \sum_{n\in{\mathbb{Z}}}e^{-2\pi tn^{2}}=\frac{1}{\sqrt{2t}}\sum_{n\in{\mathbb{Z}}}e^{-(\pi/2t)n^{2}}. \tag{14} \]
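The identity (14) converges so rapidly on both sides that a direct numerical comparison is convincing. In the following sketch the truncation `n_max` and the sample values of $t$ are arbitrary choices of ours.

```python
import math


def lhs(t, n_max=30):
    """Truncation of the sum over n of exp(-2*pi*t*n^2)."""
    return sum(math.exp(-2 * math.pi * t * n * n) for n in range(-n_max, n_max + 1))


def rhs(t, n_max=30):
    """Truncation of (1/sqrt(2t)) times the sum over n of exp(-(pi/(2t))*n^2)."""
    total = sum(math.exp(-(math.pi / (2 * t)) * n * n) for n in range(-n_max, n_max + 1))
    return total / math.sqrt(2 * t)


differences = {t: abs(lhs(t) - rhs(t)) for t in (0.1, 0.5, 2.0)}
print(differences)
```

Note that $t=1/2$ is the self-dual case, in which the two sides of (14) agree term by term.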

8. A miracle

An outrageous suggestion is to view the formal power series (1) as defining a holomorphic function of the complex variable $q$ (now called Jacobi’s theta function). Clearly, it is convergent for $\{|q|<1\}$. Hence, setting $q=e^{2\pi i\tau}$, we obtain a holomorphic function of $\tau$ for $\tau\in{\mathcal{H}}$. Then a miracle occurs:

Theorem 4.

For $\tau\in{\mathcal{H}}$, we have

\[ (\theta(-1/4\tau))^{4}=-4\tau^{2}(\theta(\tau))^{4}. \]

Equivalently, if we define $\phi:{\mathcal{H}}\to{\mathcal{H}}$ by

\[ \phi(\tau)\equiv-1/4\tau \]

and consider the holomorphic $1$-form $\Theta\equiv(\theta(\tau))^{4}d\tau$, then

\[ \phi^{*}\Theta=-\Theta. \tag{15} \]
Proof.

When $\tau$ lies on the imaginary axis, i.e. $\tau=it$ for $t>0$,

\[ \theta(\tau)=\sum_{n\in{\mathbb{Z}}}q^{n^{2}}=\sum_{n\in{\mathbb{Z}}}e^{-2\pi tn^{2}} \]

whilst

\[ \theta(-1/4\tau)=\sum_{n\in{\mathbb{Z}}}e^{-2\pi(1/4t)n^{2}}=\sum_{n\in{\mathbb{Z}}}e^{-(\pi/2t)n^{2}} \]

so (14) says that

\[ \theta(-1/4\tau)=\sqrt{-2i\tau}\,\theta(\tau),\quad\mbox{whence}\quad(\theta(-1/4\tau))^{4}=-4\tau^{2}(\theta(\tau))^{4}, \]

along the imaginary axis. The transformation (15) now holds for all $\tau\in{\mathcal{H}}$ by analytic continuation. ∎
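Since the proof only verifies the identity on the imaginary axis and then appeals to analytic continuation, it is reassuring to test Theorem 4 at points off that axis. The sketch below does this with a truncated theta series; the sample points and the truncation `n_max` are our own choices, made so that both $\tau$ and $-1/4\tau$ stay well inside the upper half plane.

```python
import cmath


def theta(tau, n_max=12):
    """Truncated theta(tau) = sum over integers n of exp(2*pi*i*n^2*tau)."""
    return 1 + 2 * sum(
        cmath.exp(2j * cmath.pi * n * n * tau) for n in range(1, n_max + 1)
    )


def defect(tau):
    """Absolute difference between the two sides of Theorem 4 at tau."""
    return abs(theta(-1 / (4 * tau))**4 + 4 * tau**2 * theta(tau)**4)


samples = [0.3 + 0.4j, 0.1 + 0.6j, -0.2 + 0.5j]
defects = [defect(tau) for tau in samples]
print(defects)
```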

Notice that the transformation $\phi$ has already made its appearance (11) as inducing an automorphism of $\Sigma$, the thrice-punctured sphere. If we also introduce $T:{\mathcal{H}}\to{\mathcal{H}}$ by

\[ T(\tau)\equiv\tau+1, \]

then it is clear that $T^{*}\theta=\theta$ and $T^{*}d\tau=d\tau$. Hence, we see that

\[ T^{*}\Theta=\Theta. \tag{16} \]

Finally, to obtain a geometric interpretation of (15) we note that

\[ R\equiv\phi\circ T^{-1}\circ\phi \]

is given by

\[ R(\tau)=\frac{\tau}{4\tau+1} \]

and recall from Lemma 2 that $R$ and $T$ together generate $\Gamma_{0}(4)$ modulo $\{\pm{\mathrm{Id}}\}$. Note that $R^{*}\Theta=\Theta$ in accordance with (15) and (16). Putting all this together, we have proved the following.
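The formula for $R$ amounts to a product of three $2\times 2$ matrices, which the following sketch carries out exactly. The matrix `PHI` representing $\phi$ is the one used in Section 6; the remaining names are ours. The product comes out as minus the matrix with rows $(1,0)$ and $(4,1)$, which induces the same Möbius transformation.

```python
from fractions import Fraction as F


def mul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def moebius(M, tau):
    """Action of the matrix M on a point tau of the upper half plane."""
    a, b = float(M[0][0]), float(M[0][1])
    c, d = float(M[1][0]), float(M[1][1])
    return (a * tau + b) / (c * tau + d)


PHI = ((F(0), F(-1, 2)), (F(2), F(0)))   # tau -> -1/(4 tau)
T_INV = ((F(1), F(-1)), (F(0), F(1)))    # tau -> tau - 1

R = mul(mul(PHI, T_INV), PHI)
PHI_SQUARED = mul(PHI, PHI)

tau = 0.3 + 0.8j
difference = abs(moebius(R, tau) - tau / (4 * tau + 1))
print(R, PHI_SQUARED, difference)
```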

Theorem 5.

The holomorphic $1$-form $\Theta\equiv(\theta(\tau))^{4}d\tau$ descends to the thrice-punctured sphere $\Sigma$ and, under the automorphism $\phi:\Sigma\to\Sigma$, satisfies $\phi^{*}\Theta=-\Theta$.

Corollary 2.

In the usual $z$-coördinate on the thrice-punctured sphere,

\[ \Theta=\frac{dz}{2\pi iz}. \]
Proof.

From $q=e^{2\pi i\tau}$ we see that $dq=2\pi iq\,d\tau$ and so

\[ \Theta=\frac{1}{2\pi iq}\left(1+8q+24q^{2}+32q^{3}+\cdots\right)dq \]

near $q=0$ and, in particular, meromorphically extends through $q=0$, having a simple pole there with residue $1/2\pi i$. This is a coördinate-free statement and so also applies in the $z$-coördinate:

\[ \Theta=\frac{1}{2\pi iz}\left(1+\cdots\right)dz. \]

Recall that in the $z$-coördinate, the automorphism $\phi$ interchanges $z=0$ with $z=\infty$ whilst fixing $z=1$. The relation $\phi^{*}\Theta=-\Theta$ implies that $\Theta$ also has a pole at $z=\infty$ with residue $-1/2\pi i$. Finally, the behaviour of $\Theta$ at $z=1$ may be investigated by means of the automorphism (12), let us call it $\psi$, which swops $z=1$ and $z=\infty$ whilst fixing $z=0$. In particular, we may easily compare $\Theta/i$ along the imaginary $\tau$-axis $\{\tau=it\}$ with its behaviour along the translated axis $\{\tau=1/2+it\}$:

\[ \begin{array}{rcl}\Theta/i&=&\left(1+8q+24q^{2}+32q^{3}+\cdots\right)dt\\[5pt] \psi^{*}\Theta/i&=&\left(1-8q+24q^{2}-32q^{3}+\cdots\right)dt\end{array}\quad\mbox{where }q=e^{-2\pi t}. \]

It is clear that $\Theta(it)$ has only a simple pole at $t=0$. But $\Theta(\tau)$ is real-valued when $\mathrm{Re}(\tau)=0$ or $\mathrm{Re}(\tau)=1/2$, and the $q$-expansion coefficients are all non-negative, so $\Theta(1/2+it)$ is dominated by $\Theta(it)$ as $t\to 0^{+}$. The possibility of an essential singularity is excluded by the observation that the intersection of any semicircle centred at $\tau=1/2$ with an appropriately chosen fundamental domain containing $\{1/2+it\mid t\geq 0\}$ is a finite curve, so the maximal value of $\Theta(\tau)$, as $\tau$ runs along the semicircle, is bounded by $\Theta(is)$ for some real $s$. So the behaviour of $\Theta$ at $z=1$ is certainly no worse than the behaviour at $z=0$.

In summary, the holomorphic $1$-form $\Theta$ on $\Sigma\equiv S^{2}\setminus\{0,1,\infty\}$ enjoys a meromorphic extension to $S^{2}$ with

  • a simple pole at $z=0$ with residue $1/2\pi i$,

  • a simple pole at $z=\infty$ with residue $-1/2\pi i$,

  • at worst a simple pole at $z=1$.

By the residue theorem, the sum of the residues of any meromorphic $1$-form on any compact Riemann surface is zero. It follows that the residue of $\Theta$ at $z=1$ vanishes, so that $\Theta$ has poles only at $z=0$ and $z=\infty$. Having identified precisely two poles, it cannot have any zeros. At this point $\Theta$ is determined as stated. ∎

9. An Eisenstein series

Introduce

\[ G_{4}(\tau)\equiv\sum_{(c,d)\in{\mathbb{Z}}^{2}\setminus\{(0,0)\}}\frac{1}{(c\tau+d)^{4}} \]

and, by absolute convergence, observe that

\[ G_{4}\Big(\frac{a\tau+b}{c\tau+d}\Big)=(c\tau+d)^{4}G_{4}(\tau),\quad\mbox{for}\enskip\begin{bmatrix}a&b\\ c&d\end{bmatrix}\in{\mathrm{SL}}(2,{\mathbb{Z}}). \tag{17} \]
Theorem 6.
\[ G_{4}(q)=\frac{\pi^{4}}{45}\Big(1+240\sum_{n=1}^{\infty}\sigma_{3}(n)q^{n}\Big), \tag{18} \]

where $\sigma_{3}(n)\equiv\sum_{d|n}d^{3}$ (and recall that $q=e^{2\pi i\tau}$).

Proof.

This is a straightforward application of (4):

\[ \begin{aligned}G_{4}(\tau)&=\sum_{\substack{d=-\infty\\ d\neq 0}}^{\infty}\frac{1}{d^{4}}+\sum_{\substack{c=-\infty\\ c\neq 0}}^{\infty}\sum_{d=-\infty}^{\infty}\frac{1}{(c\tau+d)^{4}}=2\zeta(4)+2\sum_{c=1}^{\infty}\sum_{d=-\infty}^{\infty}\frac{1}{(c\tau+d)^{4}}\\ &=\frac{\pi^{4}}{45}+2\sum_{c=1}^{\infty}\left(\frac{8\pi^{4}}{3}\sum_{m=1}^{\infty}m^{3}e^{2\pi icm\tau}\right)\quad\mbox{(from (4))}\\ &=\frac{\pi^{4}}{45}\left(1+240\sum_{m=1}^{\infty}\sigma_{3}(m)e^{2\pi im\tau}\right).\end{aligned} \]

∎

Following Ramanujan, let

\[ M(q)\equiv 1+240\sum_{n=1}^{\infty}\sigma_{3}(n)q^{n} \tag{19} \]

and, as a consequence of (17) and (18), observe that

\[ M(\tau+1)=M(\tau)\quad\mbox{and}\quad M(-1/\tau)=\tau^{4}M(\tau). \tag{20} \]
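The transformation laws (20) are easily tested on the $q$-series itself. In the sketch below, the sample point and the number of terms are our own choices; we also recompute, in exact arithmetic, the constant $240$ as the ratio of $2\cdot 8\pi^{4}/3$ to $\pi^{4}/45$ (with the common factor $\pi^{4}$ cancelled).

```python
import cmath
from fractions import Fraction


def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)


def M(tau, terms=60):
    """Truncation of M = 1 + 240 * sum of sigma3(n) q^n at q = exp(2*pi*i*tau)."""
    q = cmath.exp(2j * cmath.pi * tau)
    return 1 + 240 * sum(sigma3(n) * q**n for n in range(1, terms + 1))


# The constant in (18): 2 * (8/3) divided by 1/45, the factors of pi^4 cancelling.
constant = 2 * Fraction(8, 3) / Fraction(1, 45)

tau = 0.2 + 0.9j
defect_translation = abs(M(tau + 1) - M(tau))
defect_inversion = abs(M(-1 / tau) - tau**4 * M(tau))
print(constant, defect_translation, defect_inversion)
```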

10. The Ramanujan ODE

Following Ramanujan, let

\[ L(q)\equiv 1-24\sum_{n=1}^{\infty}\sigma(n)q^n \tag{21} \]

defined for $\{|q|<1\}$. The following identity was proved by Ramanujan [4, identities (17), (27), (28), and (30)], as a corollary of his straightforward but inspired proof of a certain identity between Lambert series. These Lambert series identities were elucidated by van der Pol [6], who showed that they ultimately derive from the product formula and transformation formula for Jacobi’s theta function. A direct combinatorial proof is due to Skoruppa [5].

Theorem 7.

As (formal) power series,

\[ 12q\frac{dL}{dq}-L^2+M=0. \tag{22} \]
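Since (22) is an identity between formal power series with integer coefficients, it can be tested coefficient by coefficient in exact arithmetic. The sketch below does this up to an arbitrary cut-off `N`; the helper names are illustrative, and a finite check of this kind is evidence for the identity rather than a proof.

```python
def sigma_k(n, k):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d**k for d in range(1, n + 1) if n % d == 0)

N = 40  # arbitrary truncation order

# Coefficient lists of L(q) and M(q) from (21) and (19).
L = [1] + [-24 * sigma_k(n, 1) for n in range(1, N + 1)]
M = [1] + [240 * sigma_k(n, 3) for n in range(1, N + 1)]

# Cauchy product giving the coefficients of L(q)^2.
L_squared = [sum(L[k] * L[n - k] for k in range(n + 1)) for n in range(N + 1)]

# q dL/dq multiplies the n-th coefficient by n, so the coefficient of q^n
# in 12 q dL/dq - L^2 + M is:
residual = [12 * n * L[n] - L_squared[n] + M[n] for n in range(N + 1)]
print(residual[:5])
```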

As usual, by setting $q=e^{2\pi i\tau}$, we may view $L$ as a holomorphic function $L(\tau)$ for $\tau\in\mathcal{H}$. Since $q\,\frac{d}{dq}=\frac{1}{2\pi i}\frac{d}{d\tau}$, a change of variables gives

\[ \frac{6}{\pi i}\frac{dL}{d\tau}-L^2+M=0, \tag{23} \]

an equivalent statement to (22). Locally, we may write

\[ L(\tau)=-\frac{6}{\pi i}\frac{g'(\tau)}{g(\tau)} \tag{24} \]

and (23) becomes $g''+\frac{\pi^2}{36}Mg=0$. Thus, we are led to consider

\[ y''+\frac{\pi^2}{36}My=0 \tag{25} \]

for $y:\mathcal{H}\to\mathbb{C}$ a holomorphic function and (22) says that $y(\tau)=g(\tau)$ is a solution of (25). We may investigate the solutions of the linear equation (25) quite explicitly. Firstly, we may figure out much more about $g(\tau)$ as follows.

Lemma 3.

We may take

\[ \begin{array}{rcl}g(\tau)&=&\displaystyle e^{-\pi i\tau/6}\exp\Big(2\sum_{n=1}^{\infty}\frac{\sigma(n)}{n}q^n\Big)\\[14pt] &=&e^{-\pi i\tau/6}\big(1+2q+5q^2+10q^3+20q^4+36q^5+65q^6+\cdots\big),\end{array} \]

a globally defined holomorphic function $\mathcal{H}\to\mathbb{C}\setminus\{0\}$.

Proof.

Of course, the function $g(\tau)$ is locally defined by (24) up to a constant. As a global Ansatz, let us try

\[ g(\tau)=e^{-\pi i\tau/6}\psi(q),\quad\mbox{for}\enskip q=e^{2\pi i\tau} \]

and $\psi:\{|q|<1\}\to\mathbb{C}\setminus\{0\}$ holomorphic. Substituting this form of $g$ into (24) gives

\[ \psi-12q\frac{d\psi}{dq}=L\psi=\psi-24\psi\sum_{n=1}^{\infty}\sigma(n)q^n \tag{26} \]

so

\[ \frac{d}{dq}\log\psi=\frac{1}{\psi}\frac{d\psi}{dq}=2\sum_{n=1}^{\infty}\sigma(n)q^{n-1}=2\frac{d}{dq}\sum_{n=1}^{\infty}\frac{\sigma(n)}{n}q^n \]

and, normalising $\psi(q)$ by $\psi(0)=1$, we conclude that

\[ \log\psi=2\sum_{n=1}^{\infty}\frac{\sigma(n)}{n}q^n. \]

Evidently, this power series converges for $|q|<1$ and we are done. ∎

As an aside, we note that the resulting power series expansion

\[ \psi(q)=\sum_{n=0}^{\infty}b_nq^n=1+2q+5q^2+10q^3+20q^4+36q^5+65q^6+\cdots, \]

where, as one obtains easily from (26),

\[ b_0=1,\quad b_n=\frac{2}{n}\sum_{k=1}^{n}\sigma(k)b_{n-k},\enskip\mbox{for}\enskip n\geq 1, \tag{27} \]

has integer coefficients. Indeed, the generating function of $\sigma$ is the $q$-expansion of a Lambert series

\[ \sum_{n=1}^{\infty}\sigma(n)q^n=\sum_{n=1}^{\infty}\frac{nq^n}{1-q^n}, \]

which, upon rewriting, assumes the form

\[ \sum_{n=1}^{\infty}\frac{nq^n}{1-q^n}=q\frac{d}{dq}\sum_{n=1}^{\infty}\log\Big(\frac{1}{1-q^n}\Big)=q\frac{d}{dq}\log\prod_{k=1}^{\infty}\frac{1}{1-q^k}. \]

But the $q$-expansion of this infinite product is well-known. It is the generating function of the manifestly integral partition numbers $p(k)$:

\[ \prod_{k=1}^{\infty}\frac{1}{1-q^k}=\sum_{k=0}^{\infty}p(k)q^k\equiv P(q). \]

Returning to (26), we find that $\psi$ satisfies

\[ \frac{d}{dq}\big(\log\psi(q)-2\log P(q)\big)=0, \]

and, recalling that $P(0)=1$, we find that $\psi=P^2$.
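The identity $\psi=P^2$ can be checked to any finite order by running the recursion (27) in exact rational arithmetic and comparing with the square of the partition generating function. The following sketch does so up to an arbitrary order `N`; the variable names are illustrative only.

```python
from fractions import Fraction

def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

N = 12  # arbitrary truncation order

# Coefficients b_n of psi(q) from the recursion (27), kept as exact fractions.
b = [Fraction(1)]
for n in range(1, N + 1):
    b.append(Fraction(2, n) * sum(sigma(k) * b[n - k] for k in range(1, n + 1)))

# Partition numbers p(0..N): multiply in the Euler factors 1/(1 - q^k) one at a time.
p = [1] + [0] * N
for k in range(1, N + 1):
    for n in range(k, N + 1):
        p[n] += p[n - k]

# Coefficients of P(q)^2 as a Cauchy product.
P_squared = [sum(p[k] * p[n - k] for k in range(n + 1)) for n in range(N + 1)]
print([int(x) for x in b[:7]])
```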

Let $\mathbb{S}$ denote the solution space of (25). As $\mathcal{H}$ is simply-connected, we conclude that $\mathbb{S}$ is two-dimensional and in Lemma 3 we have already found one non-zero element in $\mathbb{S}$. To complete our understanding of $\mathbb{S}$ it suffices to find another linearly independent element:

Lemma 4.

There is a convergent power series

\[ \textstyle\phi(q)=1+\frac{10}{7}q+\frac{365}{91}q^2+\frac{13610}{1729}q^3+\frac{135701}{8645}q^4+\frac{7419742}{267995}q^5+\cdots\quad\mbox{for}\enskip|q|<1 \]

so that $h(\tau)\equiv e^{\pi i\tau/6}\phi(q)$ is in $\mathbb{S}$.

Proof.

We try $y(\tau)=e^{\pi i\tau/6}\phi(q)$ as an Ansatz in (25). A calculation shows that (25) reduces to

\[ 6q^2\frac{d^2\phi}{dq^2}+7q\frac{d\phi}{dq}=10\Big(\sum_{n=1}^{\infty}\sigma_3(n)q^n\Big)\phi, \]

whereas substituting $g(\tau)=e^{-\pi i\tau/6}\psi(q)$ instead gives

\[ 6q^2\frac{d^2\psi}{dq^2}+5q\frac{d\psi}{dq}=10\Big(\sum_{n=1}^{\infty}\sigma_3(n)q^n\Big)\psi. \]

Each of these gives a recursion relation for the coefficients of a formal power series for the function in question, namely

\[ \phi(q)=\sum_{n=0}^{\infty}a_nq^n\qquad\psi(q)=\sum_{n=0}^{\infty}b_nq^n \]

where $a_0=b_0=1$ and, for $n\geq 1$,

\[ a_n=\frac{10}{n(6n+1)}\sum_{k=1}^{n}\sigma_3(k)a_{n-k}\qquad b_n=\frac{10}{n(6n-1)}\sum_{k=1}^{n}\sigma_3(k)b_{n-k}. \]

By Lemma 3, we know that the power series $\sum_{n=0}^{\infty}b_nq^n$ converges for $|q|<1$ (and, from this formal point of view, the content of (22) is that the recursion relation (27) yields the same coefficients $b_n$). From these recurrence relations it is clear, by induction, that $0<a_n\leq b_n$. It follows that $\sum_{n=0}^{\infty}a_nq^n$ also converges for $|q|<1$ and we are done. ∎
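The two recursions in the proof of Lemma 4 are easy to run in exact arithmetic. The sketch below, with an arbitrary cut-off `N` and illustrative names, reproduces the first coefficients of $\phi$ quoted in the lemma, recovers the integer coefficients of $\psi$, and checks the comparison $0<a_n\leq b_n$ on the computed range.

```python
from fractions import Fraction

def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)

N = 10  # arbitrary truncation order

a = [Fraction(1)]  # coefficients of phi(q)
b = [Fraction(1)]  # coefficients of psi(q)
for n in range(1, N + 1):
    a.append(Fraction(10, n * (6 * n + 1))
             * sum(sigma3(k) * a[n - k] for k in range(1, n + 1)))
    b.append(Fraction(10, n * (6 * n - 1))
             * sum(sigma3(k) * b[n - k] for k in range(1, n + 1)))

print(a[1], a[2], a[3], a[4])
print([int(x) for x in b[:7]])
```

The comparison between $a_n$ and $b_n$ holds term by term because $6n+1>6n-1$ and, inductively, $a_{n-k}\leq b_{n-k}$.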

In summary, Lemmata 3 and 4 give us a basis for 𝕊{\mathbb{S}} of the form

\[ \begin{array}{rcr}g(\tau)&=&e^{-\pi i\tau/6}\psi(q)\phantom{,}\\[4pt] h(\tau)&=&e^{\pi i\tau/6}\phi(q),\end{array}\quad\mbox{where}\enskip q=e^{2\pi i\tau} \]

and $\phi(q),\psi(q)$ are holomorphic functions on the unit disc $\{|q|<1\}$. Also notice that both $\psi(e^{2\pi i\tau})$ and $\phi(e^{2\pi i\tau})$ are strictly positive along the imaginary axis $\{\tau=it\mid t>0\}$ in $\mathcal{H}$, since there $q=e^{-2\pi t}$ lies in $(0,1)$ and all the coefficients $a_n$ and $b_n$ are positive. In particular, we conclude that $h(i)\not=0$.

Theorem 8.

The equation (25) is projectively invariant.

Proof.

Firstly, we must explain what the phrase ‘projectively invariant’ means. There is no local structure in the conformal geometry of $\mathcal{H}$ (an $n$-dimensional complex manifold is locally biholomorphic to $\mathbb{C}^n$; end of story). Globally, however, the group $\mathrm{SL}(2,\mathbb{R})$ acts conformally on $\mathcal{H}$ and this may be recorded as local information on $\mathcal{H}$, specifically as a collection of preferred local coördinates, namely $\tau$ and its translates

\[ \frac{a\tau+b}{c\tau+d}\quad\mbox{for}\enskip\left[\begin{array}{cc}a&b\\ c&d\end{array}\right]\in\mathrm{SL}(2,\mathbb{R}). \]

Roughly speaking, this is a ‘projective structure.’ In any case, to say that (25) is ‘projectively invariant’ is to say that it respects the action of $\mathrm{SL}(2,\mathbb{Z})$. For this to be true we decree that

\[ (A^{-1}g)(\tau)\equiv(c\tau+d)g(A\tau),\quad\mbox{for}\enskip A=\left[\begin{array}{cc}a&b\\ c&d\end{array}\right]\in\mathrm{SL}(2,\mathbb{R}). \tag{28} \]

(In the language of projective differential geometry $g$ is a ‘projective density of weight $1$.’) From (17), (18), and (19), we already know that

\[ M\Big(\frac{a\tau+b}{c\tau+d}\Big)=(c\tau+d)^4M(\tau),\quad\mbox{for}\enskip\left[\begin{array}{cc}a&b\\ c&d\end{array}\right]\in\mathrm{SL}(2,\mathbb{Z}) \]

and so it suffices to show that

\[ \frac{d^2}{d\tau^2}\left[(c\tau+d)g\Big(\frac{a\tau+b}{c\tau+d}\Big)\right]=\frac{1}{(c\tau+d)^3}\frac{d^2g}{d\tau^2}\Big(\frac{a\tau+b}{c\tau+d}\Big), \]

which is an elementary consequence of the chain rule. ∎
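The chain-rule identity at the end of this proof holds for any holomorphic $g$, so it can be spot-checked numerically. The sketch below uses the test function $g(u)=e^{iu}$, the matrix with rows $(2,1)$ and $(1,1)$, a sample point $\tau$, and a central second difference with step `h`; all of these are arbitrary illustrative choices, and the comparison is only accurate up to finite-difference error.

```python
import cmath

def g(u):
    """Sample holomorphic test function."""
    return cmath.exp(1j * u)

def g_second(u):
    """Exact second derivative of the test function."""
    return -cmath.exp(1j * u)

a, b, c, d = 2, 1, 1, 1  # a matrix of determinant 1

def mobius(tau):
    return (a * tau + b) / (c * tau + d)

def weighted(tau):
    """The weight-1 transform (c*tau + d) * g(A tau) appearing in (28)."""
    return (c * tau + d) * g(mobius(tau))

tau = 0.3 + 1.2j  # sample point in the upper half plane
h = 1e-4          # finite-difference step

# Left side: central second difference of the transformed function.
lhs = (weighted(tau + h) - 2 * weighted(tau) + weighted(tau - h)) / h**2
# Right side: g'' evaluated at A tau, divided by (c*tau + d)^3.
rhs = g_second(mobius(tau)) / (c * tau + d) ** 3
print(abs(lhs - rhs))
```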

Recall that $\mathbb{S}$, the solution space of (25), is two-dimensional. In accordance with Theorem 8, the group $\mathrm{SL}(2,\mathbb{Z})$, generated by

\[ T\equiv\left[\begin{array}{cc}1&1\\ 0&1\end{array}\right]\quad\mbox{and}\quad S\equiv\left[\begin{array}{cc}0&-1\\ 1&0\end{array}\right], \tag{29} \]

is represented on $\mathbb{S}$. More specifically, if $g(\tau)$ solves (25) then, according to (28), so do

\[ (Tg)(\tau)\equiv g(\tau-1)\quad\mbox{and}\quad(Sg)(\tau)\equiv-\tau g(-1/\tau). \]
Theorem 9.

The holomorphic function $L:\mathcal{H}\to\mathbb{C}$ satisfies

\[ L\Big(\frac{a\tau+b}{c\tau+d}\Big)=(c\tau+d)^2L(\tau)+\frac{6}{\pi i}c(c\tau+d) \tag{30} \]

for $\left[\begin{array}{cc}a&b\\ c&d\end{array}\right]\in\mathrm{SL}(2,\mathbb{Z})$.

Proof.

It suffices to prove (30) for the generators $T$ and $S$ of $\mathrm{SL}(2,\mathbb{Z})$, specifically that

\[ L(\tau+1)=L(\tau)\quad\mbox{and}\quad L(-1/\tau)=\tau^2L(\tau)+6\tau/\pi i. \]

The first of these holds by Lemma 3, which implies that $Tg=e^{\pi i/6}g$. To establish the second identity, it suffices to show that $Sg=\beta g$ for some constant $\beta$. Indeed, applying $S$ twice gives $(S^2g)(\tau)=-g(\tau)$ directly from the definition of $Sg$, so such a $\beta$ satisfies $\beta^2=-1$. Hence, if $-\tau g(-1/\tau)=\beta g(\tau)$, then

\[ \beta g(-1/\tau)=g(\tau)/\tau\enskip\Rightarrow\enskip\beta g'(-1/\tau)=\tau g'(\tau)-g(\tau) \]

so

\[ \frac{\beta g'(-1/\tau)}{g(\tau)}=\frac{\tau g'(\tau)}{g(\tau)}-1. \]

Therefore

\[ \frac{g'(-1/\tau)}{\tau g(-1/\tau)}=\frac{\tau g'(\tau)}{g(\tau)}-1 \]

and so

\[ -\frac{6}{\pi i}\frac{g'(-1/\tau)}{\tau g(-1/\tau)}=-\frac{6}{\pi i}\frac{\tau g'(\tau)}{g(\tau)}+\frac{6}{\pi i}; \]

in other words, from (24),

\[ \frac{L(-1/\tau)}{\tau}=\tau L(\tau)+\frac{6}{\pi i}, \]

as required. To finish the proof, let us consider the action of $\mathrm{SL}(2,\mathbb{Z})$ on $\mathbb{S}$. If $Sg\not=\beta g$, then we may set $f\equiv Sg$ to obtain $\{f,g\}$ as a basis of $\mathbb{S}$. By construction

\[ S\left[\begin{array}{c}f\\ g\end{array}\right]=\left[\begin{array}{cc}0&-1\\ 1&0\end{array}\right]\left[\begin{array}{c}f\\ g\end{array}\right]. \]

By Lemma 3, we already know that $Tg=e^{\pi i/6}g$ and, from Lemma 4, we know that the action of $T$ on $\mathbb{S}$ is diagonalisable with the other eigenvalue being $e^{-\pi i/6}$. In other words

\[ T\left[\begin{array}{c}f\\ g\end{array}\right]=\left[\begin{array}{cc}e^{-\pi i/6}&\alpha\\ 0&e^{\pi i/6}\end{array}\right]\left[\begin{array}{c}f\\ g\end{array}\right] \]

for some constant $\alpha$. In $\mathrm{SL}(2,\mathbb{Z})$, the matrices (29) satisfy the relations

\[ S^2=-\mathrm{Id}\quad\mbox{and}\quad(ST)^3=-\mathrm{Id}. \]
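The two relations between the matrices (29) can be confirmed by direct multiplication. A minimal sketch with integer $2\times 2$ matrices stored as nested lists (the helper `matmul` is an illustrative name):

```python
def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

S = [[0, -1], [1, 0]]
T = [[1, 1], [0, 1]]
minus_identity = [[-1, 0], [0, -1]]

S_squared = matmul(S, S)
ST = matmul(S, T)
ST_cubed = matmul(matmul(ST, ST), ST)
print(S_squared, ST_cubed)
```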

These same relations must hold for their action on $\mathbb{S}$. For $S$ this is evident and for $T$ we conclude that $\alpha=1$. Therefore, since

\[ 1+ie^{\pi i/6}=ie^{-\pi i/6} \]

we find that

\[ T(f+ig)=Tf+iTg=e^{-\pi i/6}f+g+ie^{\pi i/6}g=e^{-\pi i/6}(f+ig). \]

However, in Lemma 4, we already found in $h$ an eigenvector for the action of $T$ on $\mathbb{S}$ with eigenvalue $e^{-\pi i/6}$. It follows that

\[ f(\tau)+ig(\tau)=Ch(\tau) \tag{31} \]

for some constant $C$. We have already observed that $h(i)\not=0$ whereas, substituting $\tau=i$ into $f=Sg$, we find that

\[ \big[f(\tau)=-\tau g(-1/\tau)\big]\big|_{\tau=i}\enskip\Rightarrow\enskip f(i)=-ig(i)\enskip\Rightarrow\enskip\big[f+ig\big]\big|_{\tau=i}=0. \]

Therefore, the only option in (31) is that $C=0$ and so $f+ig\equiv 0$. Hence, assuming that $Sg\not=\beta g$ we have found that $Sg=-ig$. This contradiction finishes the proof. ∎
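The transformation law (30) lends itself to a numerical spot check using the defining series (21). The sketch below evaluates the difference of the two sides of (30) for the generator $S$ and for the matrix with rows $(1,0)$ and $(4,1)$ at a sample point. The sample point, the number of series terms, and the function names are illustrative assumptions, and the series is truncated, so only agreement up to rounding and truncation error is expected.

```python
import cmath
import math

TERMS = 400  # truncation of the q-series; ample for the points used here

def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

SIGMA = [0] + [sigma(n) for n in range(1, TERMS + 1)]

def L(tau):
    """Truncated series (21) evaluated at q = exp(2 pi i tau)."""
    q = cmath.exp(2j * math.pi * tau)
    total, power = 0j, 1 + 0j
    for n in range(1, TERMS + 1):
        power *= q
        total += SIGMA[n] * power
    return 1 - 24 * total

def defect(a, b, c, d, tau):
    """Left side of (30) minus right side of (30)."""
    w = c * tau + d
    return L((a * tau + b) / w) - (w**2 * L(tau) + 6 / (math.pi * 1j) * c * w)

tau = 0.2 + 1.1j
defect_S = defect(0, -1, 1, 0, tau)
defect_gamma = defect(1, 0, 4, 1, tau)
print(abs(defect_S), abs(defect_gamma))
```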

Corollary 3.

The holomorphic $1$-form

\[ \big(L(\tau)-L(\tau+1/2)\big)d\tau \]

is $\Gamma_0(4)$-invariant.

Proof.

We need only check invariance under the generators of $\Gamma_0(4)$:

\[ \tau\mapsto\tau+1\quad\mbox{and}\quad\tau\mapsto\frac{\tau}{4\tau+1}. \]

The first of these is clear since $L(\tau+1)=L(\tau)$. For the second, we may use Theorem 9 immediately to conclude that

\[ L\Big(\frac{\tau}{4\tau+1}\Big)=(4\tau+1)^2L(\tau)+\frac{24}{\pi i}(4\tau+1) \]

but also that

\[ \begin{array}{rcl}\displaystyle L\Big(\frac{\tau}{4\tau+1}+\frac{1}{2}\Big)&=&\displaystyle L\Big(\frac{3(\tau+1/2)-1}{4(\tau+1/2)-1}\Big)\\[10pt] &=&\displaystyle\big(4(\tau+1/2)-1\big)^2L(\tau+1/2)+\frac{24}{\pi i}\big(4(\tau+1/2)-1\big)\\[8pt] &=&\displaystyle(4\tau+1)^2L(\tau+1/2)+\frac{24}{\pi i}(4\tau+1).\end{array} \]

Subtracting these identities gives

\[ L\Big(\frac{\tau}{4\tau+1}\Big)-L\Big(\frac{\tau}{4\tau+1}+\frac{1}{2}\Big)=(4\tau+1)^2\Big(L\big(\tau\big)-L\big(\tau+\frac{1}{2}\big)\Big). \]

But

\[ d\Big(\frac{\tau}{4\tau+1}\Big)=\frac{(4\tau+1)d\tau-4\tau d\tau}{(4\tau+1)^2}=\frac{1}{(4\tau+1)^2}d\tau, \]

the factor of $(4\tau+1)^2$ cancels, and we are done. ∎
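The invariance just proved can also be observed numerically: pulling back the coefficient function $L(\tau)-L(\tau+1/2)$ along $\tau\mapsto\tau/(4\tau+1)$ and multiplying by the derivative $1/(4\tau+1)^2$ should return the original function. The sketch below checks this, together with the Möbius identity used in the middle of the proof, at one sample point. The point, the truncation, and the names are illustrative assumptions.

```python
import cmath
import math

TERMS = 400  # truncation of the q-series

def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

SIGMA = [0] + [sigma(n) for n in range(1, TERMS + 1)]

def L(tau):
    """Truncated series (21) evaluated at q = exp(2 pi i tau)."""
    q = cmath.exp(2j * math.pi * tau)
    total, power = 0j, 1 + 0j
    for n in range(1, TERMS + 1):
        power *= q
        total += SIGMA[n] * power
    return 1 - 24 * total

def xi(tau):
    """Coefficient function of the 1-form in Corollary 3."""
    return L(tau) - L(tau + 0.5)

tau = 0.1 + 0.5j
image = tau / (4 * tau + 1)

# Pull-back of xi(tau) d tau: xi at the image point times d(image)/d(tau).
pullback = xi(image) / (4 * tau + 1) ** 2

# Moebius identity used in the proof.
shifted = (3 * (tau + 0.5) - 1) / (4 * (tau + 0.5) - 1)
print(abs(pullback - xi(tau)), abs(image + 0.5 - shifted))
```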

Lemma 5.

Suppose $\xi(\tau)$ is a holomorphic function $\mathcal{H}\to\mathbb{C}$ and let $q=e^{2\pi i\tau}$. In order that $\xi(\tau)d\tau$ extend to a meromorphic differential form on the unit disc $\{|q|<1\}$ with at worst a simple pole at $q=0$, it is necessary and sufficient that

  • $\xi(\tau+1)=\xi(\tau),\;\forall\tau\in\mathcal{H}$,

  • $\xi(\tau)$ is bounded on the rectangle $\{\tau=x+iy\,|\,0\leq x\leq 1,y\geq 1\}$.

Proof.

The first condition ensures that $\xi(\tau)$ is, in fact, a holomorphic function of $q$ on the punctured disc $\{0<|q|<1\}$ and then, since $q=e^{2\pi i\tau}=e^{-2\pi y}e^{2\pi ix}$, the second condition says that $\xi(q)$ is bounded on the punctured disc $\{0<|q|\leq e^{-2\pi}\}$, at which point Riemann’s removable singularities theorem implies that $\xi(q)$ extends holomorphically across the origin: $\xi(q)=a+bq+\cdots$. Therefore,

\[ q=e^{2\pi i\tau}\enskip\Rightarrow\enskip dq=2\pi iq\,d\tau\enskip\Rightarrow\enskip\xi(\tau)d\tau=\frac{1}{2\pi i}\Big(\frac{a}{q}+b+\cdots\Big)dq, \]

as required. ∎

Now consider the holomorphic $1$-form

\[ \Xi\equiv\big(L(\tau)-L(\tau+1/2)\big)d\tau\enskip\mbox{on}\enskip\mathcal{H}. \]

With $q=e^{2\pi i\tau}$, as usual, it follows from the definition (21) of $L$ that

\[ L(\tau)-L(\tau+1/2)=-48\big(q+4q^3+6q^5+\cdots\big) \]

and so $\Xi=-\frac{24}{\pi i}(1+4q^2+6q^4+\cdots)dq$ and, in particular, extends holomorphically across $q=0$. Now we ask what happens at the cusps, a sensible question in view of Corollary 3.

The change of coördinates $\tau=-1/4\widetilde{\tau}$ sends our usual fundamental domain for $\Gamma_0(4)$ into itself whilst sending

\[ 0\mapsto\infty,\quad 1/2\mapsto-1/2,\quad\infty\mapsto 0,\quad-1/2\mapsto 1/2 \]

(it’s a half turn about $i/2$ in the hyperbolic metric on $\mathcal{H}$). In order to figure out the behaviour of $\Xi$ let us firstly consider the holomorphic $1$-form $\xi\equiv L(\tau)d\tau$. We may view it in the coördinate $\widetilde{\tau}$:

\[ \xi=L(-1/4\widetilde{\tau})\,d(-1/4\widetilde{\tau})=\frac{L(-1/4\widetilde{\tau})}{4\widetilde{\tau}^2}d\widetilde{\tau} \]

and employ Theorem 9 to conclude that

\[ \xi=\frac{16\widetilde{\tau}^2L(4\widetilde{\tau})+24\widetilde{\tau}/\pi i}{4\widetilde{\tau}^2}d\widetilde{\tau}=\Big(4L(4\widetilde{\tau})+\frac{6}{\pi i\widetilde{\tau}}\Big)d\widetilde{\tau}. \]

Of course, whilst $4L(4\widetilde{\tau})$ is periodic under $\widetilde{\tau}\mapsto\widetilde{\tau}+1$, $6/\pi i\widetilde{\tau}$ is not. Thus, the first stipulation of Lemma 5 in this case (namely, the periodicity of $4L(4\widetilde{\tau})+6/\pi i\widetilde{\tau}$) is not satisfied. But on the rectangle in the statement of Lemma 5, this function is at least bounded. Now, if we apply the same reasoning to the holomorphic $1$-form $L(\tau+1/2)d\tau$, then the boundedness hypothesis of Lemma 5 is again satisfied, and again periodicity fails. When we subtract $L(\tau+1/2)d\tau$ from $L(\tau)d\tau$, periodicity is restored in view of Corollary 3 and boundedness persists! Lemma 5 now applies and we conclude that $\Xi$ has no worse than a simple pole at $z=\infty$. Similar reasoning applies concerning the cusp at $z=1$. With more care we could even compute the residues at these points (but this is an optional extra).

To conclude, we have verified that

\[ \big(L(\tau)-L(\tau+1/2)\big)d\tau \]

and

\[ \big((\theta(\tau))^4-(\theta(\tau+1/2))^4\big)d\tau \]

are meromorphic one-forms on the thrice-punctured sphere with poles and zeros in the same locations. It follows that one is a constant multiple of the other, and the proof of (2)♯ is complete upon comparing their power series expansions in $q$.

♯ The full force of the Jacobi four-square theorem, namely that the number of ways of representing an integer $n$ as a sum of four squares of integers is equal to $8\sum_{4\nmid d\mid n}d$, follows from (2) in an elementary fashion.
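The final comparison of power series can be illustrated in exact integer arithmetic. The sketch below expands $\theta(q)^4$ up to an arbitrary order `N`, compares its coefficients with the divisor-sum formula quoted in the footnote, and checks that the odd parts of $\theta(q)^4$ and $L(q)$ are proportional (shifting $\tau$ by $1/2$ replaces $q$ by $-q$). The proportionality constant $-3$ used below is read off from the leading coefficients and is not stated in the text; all names are illustrative.

```python
N = 40  # arbitrary truncation order

def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

def truncated_product(a, b):
    """Product of two power series, discarding terms beyond q^N."""
    out = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j in range(N + 1 - i):
            out[i + j] += ai * b[j]
    return out

# theta(q) = 1 + 2(q + q^4 + q^9 + ...), as in (1).
theta = [0] * (N + 1)
theta[0] = 1
k = 1
while k * k <= N:
    theta[k * k] = 2
    k += 1

theta_sq = truncated_product(theta, theta)
r4 = truncated_product(theta_sq, theta_sq)  # coefficients of theta(q)^4

# Divisor-sum formula from the footnote (constant term set to 1).
jacobi = [1] + [8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)
                for n in range(1, N + 1)]

# Coefficients of theta(q)^4 - theta(-q)^4 and of L(q) - L(-q): odd terms double.
theta_diff = [2 * r4[n] if n % 2 else 0 for n in range(N + 1)]
L_diff = [-48 * sigma(n) if n % 2 else 0 for n in range(N + 1)]
print(r4[:10])
```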

References

  • [1] L. Ahlfors and A. Beurling, Conformal invariants and function-theoretic null-sets, Acta Math. 83 (1950) 101–129.
  • [2] F.I. Diamond and J. Shurman, A First Course in Modular Forms, Springer 2005.
  • [3] M.G. Eastwood and A.R. Gover, Volume growth and puncture repair in conformal geometry, Jour. Geom. Phys. 127 (2018) 128–132.
  • [4] S. Ramanujan, On certain arithmetical functions, Trans. Cambridge Philos. Soc. 22 (1916) 159–184.
  • [5] N.-P. Skoruppa, A quick combinatorial proof of Eisenstein series identities, Jour. Number Theory 43 (1993) 68–73.
  • [6] B. van der Pol, On a non-linear partial differential equation satisfied by the logarithm of the Jacobian theta-functions, with arithmetical applications, Parts I and II, Indagationes Math. 13 (1951) 261–284.