Modular forms, projective structures, and the four squares theorem

1. Introduction

In 1770, Lagrange proved that every natural number can be written as the sum of four squares. In 1834, Jacobi gave a formula for the number of different ways that this can be done. More precisely, if we consider the formal power series

(1)

\theta(q)\equiv\sum_{n\in{\mathbb{Z}}}q^{n^{2}}=1+2(q+q^{4}+q^{9}+q^{16}+q^{25}+\cdots),

then Lagrange’s Theorem says that all coefficients of

(\theta(q))^{4}=1+8(q+3q^{2}+4q^{3}+3q^{4}+6q^{5}+12q^{6}+8q^{7}+3q^{8}+13q^{9}+\cdots)

are positive whilst Jacobi’s Theorem gives a manifestly positive formula for these coefficients. In fact, it is evident from the identity

2(a^{2}+b^{2}+c^{2}+d^{2})=(a+b)^{2}+(a-b)^{2}+(c+d)^{2}+(c-d)^{2},

that, for Lagrange’s theorem, it suffices to show that all odd natural numbers may be written as the sum of four squares whence it suffices to establish Jacobi’s formula in this case, namely that

(2)

\begin{array}[]{rcl}(\theta(q))^{4}-(\theta(-q))^{4}&\!\!\!=\!\!\!&16(q+4q^{3}+6q^{5}+8q^{7}+13q^{9}+\cdots)\\[4.0pt] &\!\!\!=\!\!\!&16(\sum_{m=0}^{\infty}\sigma(2m+1)q^{2m+1}),\end{array}

where $\sigma(n)\equiv\sum_{d|n}d$ is the sum-of-divisors function. The aim of this article is to prove (2). It is well-known that this can be accomplished using modular forms and this is what we shall do. However, some of the tricky analysis can be avoided in favour of geometry. This is one motivation for this article. Another is that a key feature of the usual proof, namely that a certain vector space of modular forms is two-dimensional, is replaced by the two-dimensionality of the solution space to a projectively invariant linear differential equation. This reasoning is potentially applicable for automorphic forms beyond complex analysis.
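As a sanity check on the expansions above, the coefficients of $(\theta(q))^{4}$ can be computed by brute force and compared with $8\sigma(n)$ for odd $n$. The following is only a minimal sketch; the function names `theta_power_coeffs` and `sigma` are our own and do not come from any library.

```python
def theta_power_coeffs(power, n_max):
    """Coefficients of theta(q)**power up to q**n_max,
    where theta(q) is the sum of q**(n*n) over all integers n."""
    theta = [0] * (n_max + 1)
    n = 0
    while n * n <= n_max:
        theta[n * n] += 1 if n == 0 else 2  # n and -n give the same square
        n += 1
    result = [1] + [0] * n_max
    for _ in range(power):
        new = [0] * (n_max + 1)
        for i, a in enumerate(result):
            if a == 0:
                continue
            for j in range(n_max + 1 - i):
                if theta[j]:
                    new[i + j] += a * theta[j]
        result = new
    return result


def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


r4 = theta_power_coeffs(4, 30)
odd_cases_agree = all(r4[n] == 8 * sigma(n) for n in range(1, 31, 2))
print(r4[:10])
print(odd_cases_agree)
```

Of course, such a finite check proves nothing, but it does confirm that the displayed coefficients and the normalisation in (2) are consistent.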

2. The twice-punctured sphere

It is not commonly realised that the first contributor to the theory of modular forms was the cartographer Mercator, who in 1569 found an accurate conformal map of the twice-punctured round sphere. With the punctures at the South and North Poles, this Mercator projection is the default representation of the Earth to be found in ordinary atlases (but we find it convenient to put the southern hemisphere at the top). From a modern perspective, it may be constructed in two steps:

•

Use stereographic projection to identify $S^{2}\setminus\{N\}$ with the complex plane ${\mathbb{C}}$ .
•

Use the complex logarithm to ‘unwrap’ the punctured complex plane ${\mathbb{C}}\setminus\{0\}$ to its universal cover ${\mathbb{C}}$ .

These two steps are conformal, the first by geometry or calculus, and the second by the Cauchy-Riemann equations. Explicit formulæ are

\begin{array}[]{ccccc}{\mathbb{C}}&\longrightarrow&{\mathbb{C}}\setminus\{0\}&\longrightarrow&S^{2}\setminus\{S,N\}\\[5.0pt] \tau&\longmapsto&q=e^{2\pi i\tau}\\[-7.0pt] &&q=u+iv&\longmapsto&\mbox{\footnotesize$\displaystyle\frac{1}{u^{2}+v^{2}+4}\left[\begin{array}[]{c}4u\\ 4v\\ u^{2}+v^{2}-4\end{array}\right]$}\end{array}

and we end up with two crucial (and conformal) facts:

•

$S^{2}\setminus\{S,N\}\cong\displaystyle\frac{\mathbb{C}}{\{\tau\sim\tau+1\}}$ ,
•

$q=e^{2\pi i\tau}$ is a local coördinate on $S^{2}$ near the South Pole.

Note that this essential appearance of the logarithm in the Mercator projection predates Napier and others (in the seventeenth century).

The Mercator realisation of the twice-punctured sphere

S^{2}\setminus\{S,N\}=S^{2}\setminus\{q=0,q=\infty\}

may already be used to prove some useful identities as follows.

Theorem 1.

If $q=e^{2\pi i\tau}$ , then

(3)

\sum_{d=-\infty}^{\infty}\frac{1}{(\tau+d)^{2}}=-4\pi^{2}\sum_{m=1}^{\infty}mq^{m},\quad\mbox{for}\enskip|q|<1.

Proof.

It is easy to check that the left hand side is uniformly convergent on compact subsets of ${\mathbb{C}}\setminus{\mathbb{Z}}$ . It is invariant under $\tau\mapsto\tau+1$ and therefore descends to a holomorphic function on the thrice-punctured sphere:

S^{2}\setminus\{q=0,q=\infty,q=1\}.

Let us call this function $F(q)$ and note that

•

$F(q)\to 0$ as $q\to 0$ ,
•

$F(1/q)=F(q)$ .

It follows that $F(q)$ extends holomorphically through $q=0$ and $q=\infty$ and has zeroes at these two points whilst at $q=1$ it clearly extends meromorphically with a double pole there. Hence,

F(q)=C\frac{q}{(q-1)^{2}}

for some constant $C$ . To compute $C$ , we may substitute $\tau=1/2$ to find that

C=-16\sum_{d=-\infty}^{\infty}\frac{1}{(2d+1)^{2}}=-16\frac{\pi^{2}}{4}=-4\pi^{2}.

Finally, if $|q|<1$ , then

\frac{q}{(q-1)^{2}}=q\frac{\partial}{\partial q}\frac{1}{1-q}=q\frac{\partial}{\partial q}\sum_{m=0}^{\infty}q^{m}=\sum_{m=1}^{\infty}mq^{m},

as required. ∎
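The identity (3) is easily tested numerically. The sketch below compares a symmetric partial sum of the left hand side with a truncation of the right hand side at one sample point; the cutoffs and the sample value of $\tau$ are arbitrary choices of ours, and the slowly decaying tail of the left hand side (of size roughly $2/\mathrm{cutoff}$) limits the agreement one can expect.

```python
import cmath
import math


def lhs_sum(tau, cutoff=100_000):
    """Symmetric partial sum of 1/(tau + d)^2 over |d| <= cutoff."""
    total = 1 / tau**2
    for d in range(1, cutoff + 1):
        total += 1 / (tau + d) ** 2 + 1 / (tau - d) ** 2
    return total


def rhs_series(tau, terms=200):
    """Truncation of -4 pi^2 * sum of m q^m, where q = exp(2 pi i tau)."""
    q = cmath.exp(2j * math.pi * tau)
    total, q_power = 0, 1
    for m in range(1, terms + 1):
        q_power *= q
        total += m * q_power
    return -4 * math.pi**2 * total


tau = 0.3 + 0.7j  # any point of the upper half plane will do
gap = abs(lhs_sum(tau) - rhs_series(tau))
print(gap)
```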

Corollary 1.

For $q=e^{2\pi i\tau}$ and $|q|<1$ ,

(4)

\sum_{d=-\infty}^{\infty}\frac{1}{(\tau+d)^{4}}=\frac{8\pi^{4}}{3}\sum_{m=1}^{\infty}m^{3}q^{m}.

Proof.

By the chain rule

\frac{\partial}{\partial\tau}=2\pi iq\frac{\partial}{\partial q},

and applying this operator twice to (3) gives the required identity. ∎
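Since the left hand side of (4) converges much faster than that of (3), a numerical comparison is more convincing here. Again, this is only an illustrative sketch with hypothetical function names and an arbitrary sample point.

```python
import cmath
import math


def quartic_sum(tau, cutoff=2000):
    """Symmetric partial sum of 1/(tau + d)^4 over |d| <= cutoff."""
    total = 1 / tau**4
    for d in range(1, cutoff + 1):
        total += 1 / (tau + d) ** 4 + 1 / (tau - d) ** 4
    return total


def quartic_series(tau, terms=200):
    """Truncation of (8 pi^4 / 3) * sum of m^3 q^m, where q = exp(2 pi i tau)."""
    q = cmath.exp(2j * math.pi * tau)
    total, q_power = 0, 1
    for m in range(1, terms + 1):
        q_power *= q
        total += m**3 * q_power
    return 8 * math.pi**4 / 3 * total


tau = -0.4 + 0.5j
gap = abs(quartic_sum(tau) - quartic_series(tau))
print(gap)
```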

We remark that identities such as (3) and (4) are often established using ‘unfamiliar expressions’ for trigonometric functions and regarded as a ‘standard rite of passage into modular forms’ [2, p. 5]. Already, we see the utility of the Mercator projection in identifying the universal cover of the twice-punctured sphere and it is natural to ask about a similar identification for the thrice-punctured sphere.

3. The thrice-punctured sphere

Our exposition in this section follows advice from Tony Scholl to the first author in 1984.

Let $\Sigma$ be the thrice-punctured Riemann sphere. More specifically, let us use the standard coördinate $z\in{\mathbb{C}}\hookrightarrow{\mathbb{C}}\sqcup\{\infty\}=S^{2}$ , and set

\Sigma\equiv S^{2}\setminus\{0,1,\infty\}=\{z\in{\mathbb{C}}\mid z\not=0,1\}.

By the Riemann mapping theorem there is a conformal isomorphism between the lower half plane

\{z=x+iy\in{\mathbb{C}}\mid y<0\}

and the following subset

(5)

of the upper half plane ${\mathcal{H}}\equiv\{\tau=s+it\mid t>0\}$ . In fact, as with all Riemann mappings, there is a three-parameter family thereof and we need to specify just one of them. To do this let us extend the lower half plane as the complement of two rays

extend the target domain as

and consider the Riemann mapping between these extensions that sends $z\!=\!1/2$ to $\tau\!=\!(1+i)/2$ and, at these points, sends the direction $\partial/\partial x$ to $-\partial/\partial t$ , as shown.

This particular Riemann mapping is chosen so that it intertwines the involution $z\mapsto 1-z$ (having fixed point $z\!=\!1/2$ ) with the involution $\tau\mapsto(\tau-1)/(2\tau-1)$ of ${\mathcal{H}}$ (fixing $\tau\!=\!(1+i)/2$ and preserving the extended target).

We conclude that the lower half plane is sent to the ‘tile’

and that this mapping holomorphically extends across the line segment $[0,1]$ to the upper half plane, which itself is sent to a neighbouring and translated tile attached to the right of the original. It is illuminating to view this construction on the sphere

with the lower hemisphere as domain and, replacing the upper half plane by the unit disc with its hyperbolic metric, the target is now the ideal triangle:

(6)

From this point of view, the mapping extends holomorphically through a ‘portal’ in the equator between $0$ and $1$ to the upper hemisphere, with the result mapping to

However, there are three such portals to the upper hemisphere, all on an equal footing with respect to the evident three-fold rotational symmetry. Using all three unwraps the thrice-punctured sphere to

(7)

and, of course, we can keep going to and fro between north and south through our three portals to obtain a tessellation (familiar from the works of M.C. Escher) of the hyperbolic disc $\Delta$ and a conformal covering $\Delta\to S^{2}\setminus\{0,1,\infty\}$ . This is an explicit realisation of the universal covering. We remark that the Little Picard Theorem follows immediately from this realisation.

4. Symmetries of the upper half plane

The reader may be wondering why we viewed the extended target as a domain in the upper half plane

rather than the corresponding domain in the unit disc:

The point is that the upper half plane is more congenial with regard to an explicit realisation of the symmetry group for which this extended tile is a fundamental domain.

Lemma 1.

The two transformations

T\tau\equiv\tau+1\qquad\mbox{and}\qquad U\tau\equiv\frac{\tau}{4\tau+1}

generate a group of biholomorphisms of the upper half plane ${\mathcal{H}}$ , having ${\mathcal{D}}$ as fundamental domain.

Proof.

Regarding the ideal triangle (6), the corresponding tessellation (7) is evidently generated by the three hyperbolic reflections in its sides. Viewed in the upper half plane (5), these three reflections are

\Pi_{1}\tau=-\overline{\tau},\qquad\Pi_{2}\tau=1-\overline{\tau},\qquad\Pi_{3}\tau=\frac{\overline{\tau}}{4\overline{\tau}-1}.

Therefore, the group we seek may be generated by $\Pi_{2}\circ\Pi_{1}$ , $\Pi_{3}\circ\Pi_{1}$ , and $\Pi_{3}\circ\Pi_{2}$ , namely

\tau\mapsto\tau+1,\qquad\tau\mapsto\frac{\tau}{4\tau+1},\qquad\tau\mapsto\frac{\tau-1}{4\tau-3}.

But these three transformations are $T$ , $U$ , and $UT^{-1}$ . ∎
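The compositions in this proof can be confirmed at a few sample points. The sketch below encodes the three reflections and the maps $T$ and $U$ as Python functions on complex numbers (the lower-case names are ours) and checks that $\Pi_{2}\circ\Pi_{1}=T$ , $\Pi_{3}\circ\Pi_{1}=U$ , and $\Pi_{3}\circ\Pi_{2}=UT^{-1}$ .

```python
def pi1(t):
    return -t.conjugate()


def pi2(t):
    return 1 - t.conjugate()


def pi3(t):
    c = t.conjugate()
    return c / (4 * c - 1)


def T(t):
    return t + 1


def U(t):
    return t / (4 * t + 1)


def T_inverse(t):
    return t - 1


samples = [0.2 + 0.9j, -1.3 + 0.4j, 2.5 + 3.0j]
errors = []
for t in samples:
    errors.append(abs(pi2(pi1(t)) - T(t)))
    errors.append(abs(pi3(pi1(t)) - U(t)))
    errors.append(abs(pi3(pi2(t)) - U(T_inverse(t))))
print(max(errors))
```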

It is useful to have an algebraic description of the group generated by $T$ and $U$ . To this end, and also because we shall need some of this algebra for other purposes later on, we record some well-known properties of the following well-known group.

4.1. The modular group

This is an alternative name for the group ${\mathrm{SL}}(2,{\mathbb{Z}})$ , of $2\times 2$ unit determinant matrices with integer entries. It is generated by

S\equiv\mbox{\small$\left[\!\begin{array}[]{cc}0&-1\\[-1.0pt] 1&0\end{array}\!\right]$}\quad\mbox{and}\quad T\equiv\mbox{\small$\left[\!\begin{array}[]{cc}1&1\\[-1.0pt] 0&1\end{array}\!\right]$}.

Notice that

S^{2}=-{\mathrm{Id}}=(ST)^{3}.

There is a normal subgroup $\{\pm{\mathrm{Id}}\}\lhd{\mathrm{SL}}(2,{\mathbb{Z}})$ . The quotient group is denoted ${\mathrm{PSL}}(2,{\mathbb{Z}})$ . It is generated by $S$ and $T$ subject to the relations $S^{2}={\mathrm{Id}}=(ST)^{3}$ . The group ${\mathrm{SL}}(2,{\mathbb{R}})$ acts on the upper half plane ${\mathcal{H}}$ according to

\mbox{\small$\left[\!\begin{array}[]{cc}a&b\\[-1.0pt] c&d\end{array}\!\right]$}\tau=\frac{a\tau+b}{c\tau+d}\,,

this action descending to a faithful action of ${\mathrm{PSL}}(2,{\mathbb{R}})$ . Indeed, this action identifies ${\mathrm{PSL}}(2,{\mathbb{R}})$ as the biholomorphisms of ${\mathcal{H}}$ . Having done this, the subgroup ${\mathrm{PSL}}(2,{\mathbb{Z}})$ acts properly discontinuously on ${\mathcal{H}}$ . It is easy to verify and well-known that

(8)

is a fundamental domain for this action.

4.2. Some congruence subgroups

Let us consider the following two subgroups of ${\mathrm{SL}}(2,{\mathbb{Z}})$ .

•

$\Gamma(4)\equiv\left\{\mbox{\small$\left[\!\begin{array}[]{cc}a&b\\[-1.0pt] c&d\end{array}\!\right]$}\in{\mathrm{SL}}(2,{\mathbb{Z}})\mid\mbox{\small$\left[\!\begin{array}[]{cc}a&b\\[-1.0pt] c&d\end{array}\!\right]$}\equiv\mbox{\small$\left[\!\begin{array}[]{cc}1&0\\[-1.0pt] 0&1\end{array}\!\right]$}\bmod 4\right\}$ .
•

$\Gamma_{1}(4)\equiv\rule{0.0pt}{21.0pt}\left\{\mbox{\small$\left[\!\begin{array}[]{cc}a&b\\[-1.0pt] c&d\end{array}\!\right]$}\in{\mathrm{SL}}(2,{\mathbb{Z}})\mid\mbox{\small$\left[\!\begin{array}[]{cc}a&b\\[-1.0pt] c&d\end{array}\!\right]$}\equiv\mbox{\small$\left[\!\begin{array}[]{cc}1&*\\[-1.0pt] 0&1\end{array}\!\right]$}\bmod 4\right\}$ .

It is clear that

\Gamma(4)\lhd{\mathrm{SL}}(2,{\mathbb{Z}})\twoheadrightarrow{\mathrm{SL}}(2,{\mathbb{Z}}_{4})

and easily verified that ${\mathrm{SL}}(2,{\mathbb{Z}}_{4})$ has 48 elements. In particular, the subgroup $\Gamma(4)$ has index 48 in ${\mathrm{SL}}(2,{\mathbb{Z}})$ . Also the homomorphism

\Gamma_{1}(4)\ni\mbox{\small$\left[\!\begin{array}[]{cc}a&b\\[-1.0pt] c&d\end{array}\!\right]$}\longmapsto b\bmod 4\in{\mathbb{Z}}_{4}

shows that $\Gamma(4)\lhd\Gamma_{1}(4)$ of index $4$ . Therefore, whilst $\Gamma_{1}(4)$ is not a normal subgroup of ${\mathrm{SL}}(2,{\mathbb{Z}})$ , it has index $48/4=12$ .
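The counts used here are small enough to confirm by enumeration. The following sketch lists ${\mathrm{SL}}(2,{\mathbb{Z}}_{4})$ by brute force and picks out the images of $\Gamma_{1}(4)$ and of the matrices that are upper triangular modulo $4$ ; the helper name `sl2_mod` is hypothetical.

```python
from itertools import product


def sl2_mod(n):
    """All 2x2 matrices (a, b, c, d) over Z/n with determinant 1 mod n."""
    return [
        (a, b, c, d)
        for a, b, c, d in product(range(n), repeat=4)
        if (a * d - b * c) % n == 1
    ]


group = sl2_mod(4)
# Image of Gamma_1(4): a = d = 1 and c = 0 modulo 4, with b arbitrary.
gamma1_image = [m for m in group if m[0] == 1 and m[2] == 0 and m[3] == 1]
# Matrices that are upper triangular modulo 4.
upper_image = [m for m in group if m[2] == 0]

print(len(group), len(gamma1_image), len(upper_image))
print(len(group) // len(gamma1_image))
```

Since $\Gamma(4)$ is contained in $\Gamma_{1}(4)$ and the reduction map is surjective, the index of $\Gamma_{1}(4)$ in ${\mathrm{SL}}(2,{\mathbb{Z}})$ equals the ratio of the first two counts.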

We may now achieve our goal of an algebraic description of the group generated by $T$ and $U$ .

Lemma 2.

The subgroup of ${\mathrm{SL}}(2,{\mathbb{Z}})$ generated by

\left[\!\begin{array}[]{cc}1&1\\[-1.0pt] 0&1\end{array}\!\right]

and

\left[\!\begin{array}[]{cc}1&0\\[-1.0pt] 4&1\end{array}\!\right]

is $\Gamma_{1}(4)$ .

Proof.

We give a geometric proof by comparing fundamental domains. To this end we note that

(9)

is a perfectly good alternative to the usual (8) as a fundamental domain for the action of ${\mathrm{PSL}}(2,{\mathbb{Z}})$ . Moreover, six hyperbolic copies of this alternative may be used to tile the fundamental domain ${\mathcal{D}}$ concerning the action of Lemma 1:

(10)

We have observed that $\Gamma_{1}(4)\subset{\mathrm{SL}}(2,{\mathbb{Z}})$ has index $12$ . It follows that

\{\pm{\mathrm{Id}}\}\times\Gamma_{1}(4)\subset{\mathrm{SL}}(2,{\mathbb{Z}})

has index $6$ and, therefore, that $\Gamma_{1}(4)$ may be regarded as a subgroup of ${\mathrm{PSL}}(2,{\mathbb{Z}})$ of index $6$ . Certainly,

\left\langle\mbox{\small$\left[\!\begin{array}[]{cc}1&1\\[-1.0pt] 0&1\end{array}\!\right]$},\mbox{\small$\left[\!\begin{array}[]{cc}1&0\\[-1.0pt] 4&1\end{array}\!\right]$}\right\rangle\subseteq\Gamma_{1}(4).

Equality follows because, as subgroups of ${\mathrm{PSL}}(2,{\mathbb{Z}})$ , they have the same index of $6$ , as (10) shows. ∎

It is usual to introduce another congruence subgroup of the modular group

\Gamma_{0}(4)\equiv\left\{\mbox{\small$\left[\!\begin{array}[]{cc}a&b\\[-1.0pt] c&d\end{array}\!\right]$}\in{\mathrm{SL}}(2,{\mathbb{Z}})\mid\mbox{\small$\left[\!\begin{array}[]{cc}a&b\\[-1.0pt] c&d\end{array}\!\right]$}\equiv\mbox{\small$\left[\!\begin{array}[]{cc}*&*\\[-1.0pt] 0&*\end{array}\!\right]$}\bmod 4\right\}

but it has already occurred in our proof above as $\{\pm{\mathrm{Id}}\}\times\Gamma_{1}(4)$ .

In summary, the group ${\mathrm{SL}}(2,{\mathbb{R}})$ acts on the upper half plane ${\mathcal{H}}$ by

\mbox{\small$\left[\!\begin{array}[]{cc}a&b\\[-1.0pt] c&d\end{array}\!\right]$}\tau\equiv\frac{a\tau+b}{c\tau+d}.

The resulting homomorphism ${\mathrm{SL}}(2,{\mathbb{R}})\to{\mathrm{Biholo}}({\mathcal{H}})$ is a double cover, having $\{\pm{\mathrm{Id}}\}$ as kernel. The subgroup $\Gamma_{0}(4)\subset{\mathrm{SL}}(2,{\mathbb{R}})$ descends to

\Gamma_{1}(4)\subset{\mathrm{PSL}}(2,{\mathbb{R}})={\mathrm{Biholo}}({\mathcal{H}}),

which acts discontinuously and without fixed points. The resulting mapping

{\mathcal{H}}\longrightarrow\raisebox{-3.0pt}{$\Gamma_{1}(4)$}\raisebox{-2.0pt}{\large$\backslash$}{\mathcal{H}}=\frac{\mathcal{H}}{\Big{\{}\tau\sim\tau+1,\displaystyle\tau\sim\frac{\tau}{4\tau+1}\Big{\}}}\cong S^{2}\setminus\{0,1,\infty\}\equiv\Sigma

is an explicit (and conformal) realisation of the universal cover of the thrice-punctured sphere $\Sigma$ .

Note that there is still a certain amount of mystery built into this realisation, which can be traced back to our use of the non-constructive Riemann mapping theorem at the start of Section 3. This mystery now shows up in our having two natural local coördinates near the South Pole. On the one hand, we may write $q=e^{2\pi i\tau}$ , as we did for the twice-punctured sphere, to obtain a local holomorphic coördinate $q$ replacing $\tau\sim\tau+1$ for $\tau=s+it$ as $t\uparrow\infty$ . On the other hand, we have, by construction, the global meromorphic coördinate $z$ on the sphere with the South Pole at $z=0$ . It follows that $z$ is a holomorphic function of $q$ near $\{q=0\}$ and vice versa. For the moment, the relationship between $z$ and $q$ is mysterious save that various key points coincide:

\begin{array}[]{c||c|c|c}z&0&1&\infty\\ \hline\cr q&0&-1&1\end{array}.

It is clear, however, that $\Sigma$ acquires a projective structure: a preferred set of local coördinates related by Möbius transformations. In fact, it is better: we have $\tau$ defined up to ${\mathrm{PSL}}(2,{\mathbb{R}})$ freedom (real Möbius transformations).

5. Puncture repair

The main upshot of the reasoning in Sections 3–4 is a realisation of the thrice-punctured Riemann sphere $\Sigma\equiv S^{2}\setminus\{0,1,\infty\}$ as the upper half plane ${\mathcal{H}}$ modulo the action of $\Gamma_{1}(4)$ , an explicit subgroup of ${\mathrm{Aut}}({\mathcal{H}})$ acting properly discontinuously and without fixed points. Furthermore, it is evident from this construction, that $\Sigma$ may be compactified as the Riemann sphere (using, for example, the coördinate change $q=e^{2\pi i\tau}$ ). In fact, an argument due to Ahlfors and Beurling [1] shows that there are no other conformal compactifications.

Theorem 2.

Suppose $M$ is a compact Riemann surface with $\Sigma\hookrightarrow M$ a conformal isomorphism onto an open subset of $M$ . Then $M$ must be conformal to the Riemann sphere with $\Sigma\hookrightarrow S^{2}$ the standard embedding.

Proof.

In fact, this is a local result as in the following picture,

\cong

taken from [3]. The punctured open disc is assumed to be conformally isomorphic to the open set $U$ (but nothing is supposed concerning the boundary $\partial{U}$ of $U$ in $V$ ). We conclude that $V$ is conformally the disc and $U\hookrightarrow V$ the punctured disc, tautologically included. To see this, we calculate in polar coördinates $(r,\theta)$ on the unit disc. We know that there is a smooth positive function $\Omega(r,\theta)$ defined for $r>0$ so that the metric $\Omega(r,\theta)^{2}(dr^{2}+r^{2}d\theta^{2})$ smoothly extends from $U$ to $V$ . We will encounter a contradiction if $\partial U$ contains two or more points since, in this case, the concentric curves $\{r=\epsilon\}$ , as $\epsilon\downarrow 0$ , have length bounded away from zero in the metric $\Omega(r,\theta)^{2}(dr^{2}+r^{2}d\theta^{2})$ . More explicitly,

\int_{0}^{2\pi}\Omega(r,\theta)r\,d\theta

is bounded away from zero as $r\downarrow 0$ . On the other hand, the area of the region $\{0<r<\epsilon\}$ in $V$ is estimated by Cauchy-Schwarz as

\int_{0}^{\epsilon}\!\!\int_{0}^{2\pi}\Omega^{2}d\theta\,r\,dr\geq\frac{1}{2\pi}\int_{0}^{\epsilon}\!\left[\int_{0}^{2\pi}\Omega\,d\theta\right]^{2}\!r\,dr=\frac{1}{2\pi}\int_{0}^{\epsilon}\!\left[\int_{0}^{2\pi}\Omega r\,d\theta\right]^{2}\frac{dr}{r}

and is therefore forced to be infinite.∎

Otherwise said, there is no difference between the Riemann sphere, either marked at $\{0,1,\infty\}$ or punctured there. Thus, it makes intrinsic sense on $\Sigma\equiv S^{2}\setminus\{0,1,\infty\}$ to consider holomorphic $1$ -forms that are restricted from meromorphic $1$ -forms on $S^{2}$ with poles only at the marks. Of special interest is the space (in traditional arcane notation)

{\mathcal{M}}_{2}(\Gamma_{0}(4))\equiv\left\{\!\!\begin{tabular}[]{l}holomorphic $1$-forms $\omega$ on $\Sigma$ extending\\ meromorphically to $S^{2}$ with, at worst,\\ only simple poles at $0,1,\infty$.\end{tabular}\!\!\right\}.

Theorem 3.

There is a canonical isomorphism

{\mathcal{M}}_{2}(\Gamma_{0}(4))\cong\{(a,b,c)\in{\mathbb{C}}^{3}\mid a+b+c=0\}.

Proof.

The isomorphism is given by

\omega\longmapsto({\mathrm{Res}}_{z=0\,}\omega,{\mathrm{Res}}_{z=1\,}\omega,{\mathrm{Res}}_{z=\infty\,}\omega),

with $a+b+c=0$ being a consequence of the Residue Theorem. ∎

In particular, there is the special meromorphic $1$ -form

\frac{dz}{z},\enskip\mbox{holomorphic save for}\enskip\Big{\{}\!\begin{array}[]{l}\mbox{simple poles only at $0$ and $\infty$},\\ \mbox{residue}=1\mbox{ at }0.\end{array}

6. Automorphisms of the thrice-punctured sphere

By the Ahlfors-Beurling Theorem, automorphisms of $\Sigma$ correspond to permutations of $\{0,1,\infty\}$ and there are two particular ones that we shall find useful. Firstly, since

\mbox{\small$\left[\!\begin{array}[]{cc}0&-1/2\\[-1.0pt] 2&0\end{array}\!\right]$}\mbox{\small$\left[\!\begin{array}[]{cc}a&b\\[-1.0pt] c&d\end{array}\!\right]$}=\mbox{\small$\left[\!\begin{array}[]{cc}d&-c/4\\[-1.0pt] -4b&a\end{array}\!\right]$}\mbox{\small$\left[\!\begin{array}[]{cc}0&-1/2\\[-1.0pt] 2&0\end{array}\!\right]$},

it follows that

(11)

\tau\mapsto-1/{4\tau}

induces an automorphism of $\Sigma$ . In the $z$ -coördinate, it is the one that swops $0$ and $\infty$ but fixes $1$ , namely $z\mapsto 1/z$ .

Secondly, since

\mbox{\small$\left[\!\begin{array}[]{cc}1&1/2\\[-1.0pt] 0&1\end{array}\!\right]$}\mbox{\small$\left[\!\begin{array}[]{cc}a&b\\[-1.0pt] c&d\end{array}\!\right]$}=\mbox{\small$\left[\!\begin{array}[]{cc}a+c/2&b+(d-a)/2-c/4\\[-1.0pt] c&d-c/2\end{array}\!\right]$}\mbox{\small$\left[\!\begin{array}[]{cc}1&1/2\\[-1.0pt] 0&1\end{array}\!\right]$},

it follows that

(12)

\tau\mapsto\tau+1/2

is the automorphism of $\Sigma$ that swops $z=1$ and $z=\infty$ whilst fixing $0$ . Close to $q=0$ , we recognise it as $q\mapsto-q$ . In the $z$ -coördinate, it is

z\mapsto z/(z-1).
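The two matrix identities of this section, and the fact that the conjugated matrices again lie in $\Gamma_{0}(4)=\{\pm{\mathrm{Id}}\}\times\Gamma_{1}(4)$ , can be checked in exact rational arithmetic. This is a sketch under our own naming conventions (`W` for the matrix of $\tau\mapsto-1/4\tau$ and `H` for that of $\tau\mapsto\tau+1/2$ ), applied to a handful of sample elements of $\Gamma_{1}(4)$ .

```python
from fractions import Fraction as Fr


def mat(a, b, c, d):
    return ((Fr(a), Fr(b)), (Fr(c), Fr(d)))


def mul(A, B):
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


W = mat(0, Fr(-1, 2), 2, 0)  # tau -> -1/(4 tau)
H = mat(1, Fr(1, 2), 0, 1)   # tau -> tau + 1/2


def conj_by_W(M):
    (a, b), (c, d) = M
    return ((d, -c / 4), (-4 * b, a))


def conj_by_H(M):
    (a, b), (c, d) = M
    return ((a + c / 2, b + (d - a) / 2 - c / 4), (c, d - c / 2))


def in_gamma0_4(M):
    (a, b), (c, d) = M
    integral = all(x.denominator == 1 for x in (a, b, c, d))
    return integral and a * d - b * c == 1 and c % 4 == 0


# A few elements of Gamma_1(4), including the two generators.
samples = [mat(1, 1, 0, 1), mat(1, 0, 4, 1), mat(5, 1, 4, 1), mat(-3, -1, 4, 1)]

identities_hold = all(
    mul(W, M) == mul(conj_by_W(M), W) and mul(H, M) == mul(conj_by_H(M), H)
    for M in samples
)
images_in_gamma0 = all(
    in_gamma0_4(conj_by_W(M)) and in_gamma0_4(conj_by_H(M)) for M in samples
)
print(identities_hold, images_in_gamma0)
```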

7. The normal distribution

At this point, rather bizarrely, it is useful to discuss the normal distribution

f(x)\equiv e^{-\pi x^{2}}

and its well-known invariance under the Fourier transform

\widehat{f}(\xi)\equiv\int_{-\infty}^{\infty}f(x)e^{-2\pi i\xi x}dx=e^{-\pi\xi^{2}}.

More generally, integration by substitution shows that

(13)

f(x)=e^{-2\pi tx^{2}}\Longrightarrow\widehat{f}(\xi)=\frac{1}{\sqrt{2t}}e^{-(\pi/2t)\xi^{2}}

for any $t>0$ . The Poisson summation formula says that

\sum_{n\in{\mathbb{Z}}}f(n)=\sum_{n\in{\mathbb{Z}}}\widehat{f}(n)

for $f:{\mathbb{R}}\to{\mathbb{R}}$ a suitably well-behaved function (for example, one that lies in Schwartz space). For $f(x)=e^{-2\pi tx^{2}}$ , as in (13), we find that

(14)

\sum_{n\in{\mathbb{Z}}}e^{-2\pi tn^{2}}=\frac{1}{\sqrt{2t}}\sum_{n\in{\mathbb{Z}}}e^{-(\pi/2t)n^{2}}.
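Both sides of (14) converge extremely rapidly, so the identity is easy to test in floating point. The sketch below does so for a few values of $t$ ; the function names and the choice of sample values are ours.

```python
import math


def gauss_sum(a, terms=50):
    """Sum of exp(-a n^2) over all integers n, truncated at |n| <= terms."""
    return 1 + 2 * sum(math.exp(-a * n * n) for n in range(1, terms + 1))


def lhs(t):
    return gauss_sum(2 * math.pi * t)


def rhs(t):
    return gauss_sum(math.pi / (2 * t)) / math.sqrt(2 * t)


for t in (0.05, 0.5, 1.0, 3.0):
    print(t, lhs(t), rhs(t))
```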

8. A miracle

An outrageous suggestion is to view the formal power series (1) as defining a holomorphic function of the complex variable $q$ (now called Jacobi’s theta function). Clearly, it is convergent for $\{|q|<1\}$ . Hence, setting $q=e^{2\pi i\tau}$ , we obtain a holomorphic function of $\tau$ for $\tau\in{\mathcal{H}}$ . Then a miracle occurs:

Theorem 4.

For $\tau\in{\mathcal{H}}$ , we have

(\theta(-1/4\tau))^{4}=-4\tau^{2}(\theta(\tau))^{4}.

Equivalently, if we define $\phi:{\mathcal{H}}\to{\mathcal{H}}$ by

\phi(\tau)\equiv-1/4\tau

and consider the holomorphic $1$ -form $\Theta\equiv(\theta(\tau))^{4}d\tau,$ then

(15)

\phi^{*}\Theta=-\Theta.

Proof.

When $\tau$ lies on the imaginary axis, i.e. $\tau=it$ for $t>0$ ,

\theta(\tau)=\sum_{n\in{\mathbb{Z}}}q^{n^{2}}=\sum_{n\in{\mathbb{Z}}}e^{-2\pi tn^{2}}

whilst

\theta(-1/4\tau)=\sum_{n\in{\mathbb{Z}}}e^{-2\pi(1/4t)n^{2}}=\sum_{n\in{\mathbb{Z}}}e^{-(\pi/2t)n^{2}}

so (14) says that

\theta(-1/4\tau)=\sqrt{-2i\tau}\theta(\tau),\quad\mbox{whence}\quad(\theta(-1/4\tau))^{4}=-4\tau^{2}(\theta(\tau))^{4},

along the imaginary axis. The transformation (15) now holds for all $\tau\in{\mathcal{H}}$ by analytic continuation. ∎
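The analytic continuation step can be illustrated numerically by testing the transformation law at a point off the imaginary axis. In the sketch below, `theta` is our own truncation of the series, and the square root is the principal branch, which is the appropriate one because $-2i\tau$ has positive real part for $\tau\in{\mathcal{H}}$ .

```python
import cmath
import math


def theta(tau, terms=30):
    """Truncated Jacobi theta series: sum of exp(2 pi i tau n^2) over |n| <= terms."""
    return 1 + 2 * sum(
        cmath.exp(2j * math.pi * tau * n * n) for n in range(1, terms + 1)
    )


def phi(tau):
    return -1 / (4 * tau)


tau = 0.1 + 0.5j
first_power_gap = abs(theta(phi(tau)) - cmath.sqrt(-2j * tau) * theta(tau))
fourth_power_gap = abs(theta(phi(tau)) ** 4 + 4 * tau**2 * theta(tau) ** 4)
print(first_power_gap, fourth_power_gap)
```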

Notice that the transformation $\phi$ has already made its appearance (11) as inducing an automorphism of $\Sigma$ , the thrice-punctured sphere. If we also introduce $T:{\mathcal{H}}\to{\mathcal{H}}$ by

T(\tau)\equiv\tau+1,

then it is clear that $T^{*}\theta=\theta$ and $T^{*}d\tau=d\tau$ . Hence, we see that

(16)

T^{*}\Theta=\Theta.

Finally, to obtain a geometric interpretation of (15) we note that

R\equiv\phi\circ T^{-1}\circ\phi

is given by

R(\tau)=\frac{\tau}{4\tau+1}

and recall that $R$ and $T$ together generate $\Gamma_{1}(4)$ . Note that $R^{*}\Theta=\Theta$ in accordance with (15) and (16). Putting all this together, we have proved the following.

Theorem 5.

The holomorphic $1$ -form $\Theta\equiv(\theta(\tau))^{4}d\tau$ descends to the thrice-punctured sphere $\Sigma$ and, under the automorphism $\phi:\Sigma\to\Sigma$ , satisfies $\phi^{*}\Theta=-\Theta$ .

Corollary 2.

In the usual $z$ -coördinate on the thrice-punctured sphere,

\Theta=\frac{dz}{2\pi iz}.

Proof.

From $q=e^{2\pi i\tau}$ we see that $dq=2\pi iqd\tau$ and so

\Theta=\frac{1}{2\pi iq}\left(1+8q+24q^{2}+32q^{3}+\cdots\right)dq

near $q=0$ and, in particular, meromorphically extends through $q=0$ , having a simple pole there with residue $1/2\pi i$ . This is a coördinate-free statement and so also applies in the $z$ -coördinate:

\Theta=\frac{1}{2\pi iz}\left(1+\cdots\right)dz.

Recall that in the $z$ -coördinate, the automorphism $\phi$ interchanges $z=0$ with $z=\infty$ whilst fixing $z=1$ . The relation $\phi^{*}\Theta=-\Theta$ , implies that $\Theta$ also has a pole at $z=\infty$ with residue $-1/2\pi i$ . Finally, the behaviour of $\Theta$ at $z=1$ may be investigated by means of the automorphism (12), let us call it $\psi$ , which swops $z=1$ and $z=\infty$ whilst fixing $z=0$ . In particular, we may easily compare $\Theta/i$ along the imaginary $\tau$ -axis $\{\tau=it\}$ with its behaviour along the translated axis $\{\tau=1/2+it\}$ :

\begin{array}[]{rcl}\Theta/i&=&\left(1+8q+24q^{2}+32q^{3}+\cdots\right)dt\\[5.0pt] \psi^{*}\Theta/i&=&\left(1-8q+24q^{2}-32q^{3}+\cdots\right)dt\end{array}\quad\mbox{where }q=e^{-2\pi t}.

It is clear that $\Theta(it)$ has only a simple pole at $t=0$ . But $\Theta(\tau)$ is real-valued when $\mathrm{Re}(\tau)=0$ or $\mathrm{Re}(\tau)=1/2$ , and the $q$ -expansion coefficients are all non-negative, so $\Theta(1/2+it)$ is dominated by $\Theta(it)$ as $t\to 0^{+}$ . The possibility of an essential singularity is excluded by the observation that the intersection of any semicircle centred at $\tau=1/2$ with an appropriately chosen fundamental domain containing $\{1/2+it\mid t\geq 0\}$ is a finite curve, so the maximal value of $\Theta(\tau)$ , as $\tau$ runs along the semicircle, is bounded by $\Theta(is)$ for some real $s$ . So the behaviour of $\Theta$ at $z=1$ is certainly no worse than the behaviour at $z=0$ .

In summary, the holomorphic $1$ -form $\Theta$ on $\Sigma\equiv S^{2}\setminus\{0,1,\infty\}$ enjoys a meromorphic extension to $S^{2}$ with

•

a simple pole at $z=0$ with residue $1/2\pi i$ ,
•

a simple pole at $z=\infty$ with residue $-1/2\pi i$ ,
•

at worst a simple pole at $z=1$ .

By the residue theorem, the sum of the residues of any meromorphic $1$ -form on any Riemann surface is zero. It follows that $\Theta$ has poles only at $z=0$ and $z=\infty$ . Having identified precisely two poles, it cannot have any zeros. At this point $\Theta$ is determined as stated. ∎

9. An Eisenstein series

Introduce

G_{4}(\tau)\equiv\sum_{(c,d)\in{\mathbb{Z}}^{2}\setminus\{(0,0)\}}\frac{1}{(c\tau+d)^{4}}

and, by absolute convergence, observe that

(17)

G_{4}\Big{(}\frac{a\tau+b}{c\tau+d}\Big{)}=(c\tau+d)^{4}G_{4}(\tau),\quad\mbox{for}\enskip\mbox{\small$\left[\!\begin{array}[]{cc}a&b\\[-1.0pt] c&d\end{array}\!\right]$}\in{\mathrm{SL}}(2,{\mathbb{Z}}).

Theorem 6.

(18)

G_{4}(q)=\frac{\pi^{4}}{45}\Big{(}1+240\sum_{n=1}^{\infty}\sigma_{3}(n)q^{n}\Big{)},

where $\sigma_{3}(n)\equiv\sum_{d|n}d^{3}$ (and recall that $q=e^{2\pi i\tau}$ ).

Proof.

This is a straightforward application of (4):

\begin{array}[]{rcl}G_{4}(\tau)&=&\displaystyle\sum_{\begin{subarray}{c}d=-\infty\\ d\neq 0\end{subarray}}^{\infty}\frac{1}{d^{4}}+\sum_{\begin{subarray}{c}c=-\infty\\ c\neq 0\end{subarray}}^{\infty}\sum_{d=-\infty}^{\infty}\frac{1}{(c\tau+d)^{4}}=2\zeta(4)+2\sum_{c=1}^{\infty}\sum_{d=-\infty}^{\infty}\frac{1}{(c\tau+d)^{4}}\\[8.0pt] &=&\displaystyle\frac{\pi^{4}}{45}+2\sum_{c=1}^{\infty}\left(\frac{8\pi^{4}}{3}\sum_{m=1}^{\infty}m^{3}e^{2\pi icm\tau}\right)\quad\mbox{(from (4))}\\[8.0pt] &=&\displaystyle\frac{\pi^{4}}{45}\left(1+240\sum_{m=1}^{\infty}\sigma_{3}(m)e^{2\pi im\tau}\right).\end{array}

∎
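Formula (18) may be compared with a direct truncation of the lattice sum defining $G_{4}$ . The sketch below sums over the square $|c|,|d|\leq\mathrm{cutoff}$ , so agreement is expected only up to a tail of order $1/\mathrm{cutoff}^{2}$ ; the function names, the cutoff, and the sample point are assumptions of ours.

```python
import cmath
import math


def g4_lattice(tau, cutoff=150):
    """Partial sum of 1/(c tau + d)^4 over (c, d) != (0, 0) with |c|, |d| <= cutoff."""
    total = 0
    for c in range(-cutoff, cutoff + 1):
        for d in range(-cutoff, cutoff + 1):
            if c != 0 or d != 0:
                total += 1 / (c * tau + d) ** 4
    return total


def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)


def g4_qexp(tau, terms=40):
    """Truncation of (pi^4 / 45) (1 + 240 sum sigma3(n) q^n), q = exp(2 pi i tau)."""
    q = cmath.exp(2j * math.pi * tau)
    series = sum(sigma3(n) * q**n for n in range(1, terms + 1))
    return math.pi**4 / 45 * (1 + 240 * series)


tau = 0.25 + 1.0j
gap = abs(g4_lattice(tau) - g4_qexp(tau))
print(gap)
```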

Following Ramanujan, let

(19)

M(q)\equiv 1+240\sum_{n=1}^{\infty}\sigma_{3}(n)q^{n}

and, as a consequence of (17) and (18), observe that

(20)

M(\tau+1)=M(\tau)\quad\mbox{and}\quad M(-1/\tau)=\tau^{4}M(\tau).

10. The Ramanujan ODE

Following Ramanujan, let

(21)

L(q)\equiv 1-24\sum_{n=1}^{\infty}\sigma(n)q^{n}

defined for $\{|q|<1\}$ . The following identity was proved by Ramanujan [4, identities (17), (27), (28), and (30)], as a corollary of his straightforward but inspired proof of a certain identity between Lambert series. These Lambert series identities were elucidated by van der Pol [6], who showed that they ultimately derive from the product formula and transformation formula for Jacobi’s theta function. A direct combinatorial proof is due to Skoruppa [5].

Theorem 7.

As (formal) power series,

(22)

12q\frac{dL}{dq}-L^{2}+M=0.
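Since (22) is an identity between power series with integer coefficients, it can be checked exactly to any finite order. The following sketch does so up to $q^{40}$ ; it is merely a consistency check of the normalisations in (19), (21), and (22), with variable names of our own choosing.

```python
def sigma_k(n, k):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d**k for d in range(1, n + 1) if n % d == 0)


N = 40
L = [1] + [-24 * sigma_k(n, 1) for n in range(1, N + 1)]
M = [1] + [240 * sigma_k(n, 3) for n in range(1, N + 1)]

# Coefficients of L^2 by Cauchy product, truncated at q^N.
L_squared = [sum(L[i] * L[n - i] for i in range(n + 1)) for n in range(N + 1)]

# q dL/dq multiplies the coefficient of q^n by n.
residual = [12 * n * L[n] - L_squared[n] + M[n] for n in range(N + 1)]
print(all(r == 0 for r in residual))
```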

As usual, by setting $q=e^{2\pi i\tau}$ , we may view $L$ as a holomorphic function $L(\tau)$ for $\tau\in{\mathcal{H}}$ . A change of variables gives

(23)

\frac{6}{\pi i}\frac{dL}{d\tau}-L^{2}+M=0,

an equivalent statement to (22). Locally, we may write

(24)

L(\tau)=-\frac{6}{\pi i}\frac{g^{\prime}(\tau)}{g(\tau)}

and (23) becomes $g^{\prime\prime}+\frac{\pi^{2}}{36}Mg=0$ . Thus, we are led to consider

(25)

y^{\prime\prime}+\frac{\pi^{2}}{36}My=0

for $y:{\mathcal{H}}\to{\mathbb{C}}$ a holomorphic function and (22) says that $y(\tau)=g(\tau)$ is a solution of (25). We may investigate the solutions of the linear equation (25) quite explicitly. Firstly, we may figure out much more about $g(\tau)$ as follows.

Lemma 3.

We may take

\begin{array}[]{rcl}g(\tau)&=&\displaystyle e^{-\pi i\tau/6}\exp\Big{(}2\sum_{n=1}^{\infty}\frac{\sigma(n)}{n}q^{n}\Big{)}\\[14.0pt] &=&e^{-\pi i\tau/6}\big{(}1+2q+5q^{2}+10q^{3}+20q^{4}+36q^{5}+65q^{6}+\cdots\big{)},\end{array}

a globally defined holomorphic function ${\mathcal{H}}\to{\mathbb{C}}\setminus\{0\}$ .

Proof.

Of course, the function $g(\tau)$ is locally defined by (24) up to a constant. As a global Ansatz, let us try

g(\tau)=e^{-\pi i\tau/6}\psi(q),\quad\mbox{for}\enskip q=e^{2\pi i\tau}

and $\psi:\{|q|<1\}\to{\mathbb{C}}\setminus\{0\}$ holomorphic. Substituting this form of $g$ into (24) gives

(26)

\psi-12q\frac{d\psi}{dq}=L\psi=\psi-24\psi\sum_{n=1}^{\infty}\sigma(n)q^{n}

so

\frac{d}{dq}\log\psi=\frac{1}{\psi}\frac{d\psi}{dq}=2\sum_{n=1}^{\infty}\sigma(n)q^{n-1}=2\frac{d}{dq}\sum_{n=1}^{\infty}\frac{\sigma(n)}{n}q^{n}

and, normalising $\psi(q)$ by $\psi(0)=1$ , conclude that

\log\psi=2\sum_{n=1}^{\infty}\frac{\sigma(n)}{n}q^{n}.

Evidently, this power series converges for $|q|<1$ and we are done. ∎

As an aside, we note that the resulting power series expansion

\psi(q)=\sum_{n=0}^{\infty}b_{n}q^{n}=1+2q+5q^{2}+10q^{3}+20q^{4}+36q^{5}+65q^{6}+\cdots,

where, as one obtains easily from (26),

(27)

b_{0}=1,\quad b_{n}=\frac{2}{n}\sum_{k=1}^{n}\sigma(k)b_{n-k},\enskip\mbox{for}\enskip n\geq 1,

has integer coefficients. Indeed, the generating function of $\sigma$ is the $q$ -expansion of a Lambert series

\sum_{n=1}^{\infty}\sigma(n)q^{n}=\sum_{n=1}^{\infty}\frac{nq^{n}}{1-q^{n}},

which, upon rewriting, assumes the form

\sum_{n=1}^{\infty}\frac{nq^{n}}{1-q^{n}}=q\frac{d}{dq}\sum_{n=1}^{\infty}\log\left(\frac{1}{1-q^{n}}\right)=q\frac{d}{dq}\log\prod_{k=1}^{\infty}\frac{1}{1-q^{k}}.

But the $q$ -expansion of this infinite product is well-known. It is the generating function of the manifestly integral partition numbers $p(k)$ :

\prod_{k=1}^{\infty}\frac{1}{1-q^{k}}=\sum_{k=0}^{\infty}p(k)q^{k}\equiv P(q).

Returning to (26), we find that $\psi$ satisfies

\frac{d}{dq}\left(\log\psi(q)-2\log P(q)\right)=0,

and, recalling that $P(0)=1$ , we find that $\psi=P^{2}$ .
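The conclusion $\psi=P^{2}$ can be confirmed to low order by running the recursion (27) and comparing with the square of the partition generating function. This is a small sketch in exact integer arithmetic, with names of our own.

```python
N = 12


def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


# Coefficients b_n from the recursion (27).
b = [1]
for n in range(1, N + 1):
    total = 2 * sum(sigma(k) * b[n - k] for k in range(1, n + 1))
    quotient, remainder = divmod(total, n)
    assert remainder == 0  # integrality of b_n
    b.append(quotient)

# Partition numbers p(0), ..., p(N) from the Euler product.
p = [1] + [0] * N
for part in range(1, N + 1):
    for n in range(part, N + 1):
        p[n] += p[n - part]

# Coefficients of P(q)^2 by Cauchy product.
p_squared = [sum(p[i] * p[n - i] for i in range(n + 1)) for n in range(N + 1)]

print(b[:7])
print(b == p_squared)
```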

Let ${\mathbb{S}}$ denote the solution space of (25). As ${\mathcal{H}}$ is simply-connected, we conclude that ${\mathbb{S}}$ is two-dimensional and in Lemma 3 we have already found one non-zero element in ${\mathbb{S}}$ . To complete our understanding of ${\mathbb{S}}$ it suffices to find another linearly independent element:

Lemma 4.

There is a convergent power series

\textstyle\phi(q)=1+\frac{10}{7}q+\frac{365}{91}q^{2}+\frac{13610}{1729}q^{3}+\frac{135701}{8645}q^{4}+\frac{7419742}{267995}q^{5}+\cdots\quad\mbox{for}\enskip|q|<1

so that $h(\tau)\equiv e^{\pi i\tau/6}\phi(q)$ is in ${\mathbb{S}}$ .

Proof.

We try $y(\tau)=e^{\pi i\tau/6}\phi(q)$ as an Ansatz in (25). A calculation shows that (25) reduces to

6q^{2}\frac{d^{2}\phi}{dq^{2}}+7q\frac{d\phi}{dq}=10\phi\sum_{n=1}^{\infty}\sigma_{3}(n)q^{n},

whereas substituting $y(\tau)=e^{-\pi i\tau/6}\psi(q)$ instead gives

6q^{2}\frac{d^{2}\psi}{dq^{2}}+5q\frac{d\psi}{dq}=10\psi\sum_{n=1}^{\infty}\sigma_{3}(n)q^{n}.

Each of these gives a recursion relation for the coefficients of a formal power series for the function in question, namely

\phi(q)=\sum_{n=0}^{\infty}a_{n}q^{n}\qquad\psi(q)=\sum_{n=0}^{\infty}b_{n}q^{n}

where $a_{0}=b_{0}=1$ and, for $n\geq 1$ ,

a_{n}=\frac{10}{n(6n+1)}\sum_{k=1}^{n}\sigma_{3}(k)a_{n-k}\qquad b_{n}=\frac{10}{n(6n-1)}\sum_{k=1}^{n}\sigma_{3}(k)b_{n-k}.

By Lemma 3, we know that the power series $\sum_{n=0}^{\infty}b_{n}q^{n}$ converges for $|q|<1$ (and, from this formal point of view, the content of (22) is that the recursion relation (27) yields the same coefficients $b_{n}$ ). Since $n(6n+1)>n(6n-1)>0$ and $\sigma_{3}(k)>0$ , it is clear from these recurrence relations, by induction, that $0<a_{n}\leq b_{n}$ . It follows, by comparison, that $\sum_{n=0}^{\infty}a_{n}q^{n}$ also converges for $|q|<1$ and we are done. ∎
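The two recursions are easy to run in exact arithmetic. The sketch below, in which the function names and the `shift` parameter are our own illustrative choices, reproduces the first coefficients of $\phi$ stated in Lemma 4, recovers the integers $b_{n}$ of Lemma 3, and checks the inequality $0<a_{n}\leq b_{n}$ for small $n$.

```python
from fractions import Fraction


def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)


def series_coeffs(N, shift):
    """c_0..c_N with c_n = 10/(n(6n+shift)) * sum_{k=1}^n sigma3(k) c_{n-k}.

    shift=+1 gives the a_n (coefficients of phi);
    shift=-1 gives the b_n (coefficients of psi).
    """
    c = [Fraction(1)]
    for n in range(1, N + 1):
        s = sum(sigma3(k) * c[n - k] for k in range(1, n + 1))
        c.append(Fraction(10, n * (6 * n + shift)) * s)
    return c


a = series_coeffs(5, +1)
b = series_coeffs(5, -1)
print([str(x) for x in a])
print([str(x) for x in b])
```

Using `Fraction` avoids any rounding, so the comparison with the fractions displayed in Lemma 4 is exact.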

In summary, Lemmata 3 and 4 give us a basis for ${\mathbb{S}}$ of the form

\begin{array}[]{rcr}g(\tau)&=&e^{-\pi i\tau/6}\psi(q)\phantom{,}\\[4.0pt] h(\tau)&=&e^{\pi i\tau/6}\phi(q),\end{array}\quad\mbox{where}\enskip q=e^{2\pi i\tau}

and $\phi(q),\psi(q)$ are holomorphic functions on the unit disc $\{|q|<1\}$ . Also notice that both $\psi(e^{2\pi i\tau})$ and $\phi(e^{2\pi i\tau})$ are strictly positive along the imaginary axis $\{\tau=it\,|\,t>0\}$ in ${\mathcal{H}}$ , since there $q=e^{-2\pi t}\in(0,1)$ and all the coefficients $a_{n},b_{n}$ are positive. In particular, we conclude that $h(i)\not=0$ .

Theorem 8.

The equation (25) is projectively invariant.

Proof.

Firstly, we must explain what the phrase ‘projectively invariant’ means. There is no local structure in the conformal geometry of ${\mathcal{H}}$ (an $n$ -dimensional complex manifold is locally biholomorphic to ${\mathbb{C}}^{n}$ ; end of story). Globally, however, the group ${\mathrm{SL}}(2,{\mathbb{R}})$ acts conformally on ${\mathcal{H}}$ and this may be recorded as local information on ${\mathcal{H}}$ , specifically as a collection of preferred local coördinates, namely $\tau$ and its translates

\frac{a\tau+b}{c\tau+d}\quad\mbox{for}\enskip\mbox{\small$\left[\!\begin{array}[]{cc}a&b\\[-1.0pt] c&d\end{array}\!\right]$}\in{\mathrm{SL}}(2,{\mathbb{R}}).

Roughly speaking, this is a ‘projective structure.’ In any case, to say that (25) is ‘projectively invariant’ is to say that it respects the action of ${\mathrm{SL}}(2,{\mathbb{Z}})$ . For this to be true we decree that

(28)

(A^{-1}g)(\tau)\equiv(c\tau+d)g(A\tau),\quad\mbox{for}\enskip A=\mbox{\small$\left[\!\begin{array}[]{cc}a&b\\[-1.0pt] c&d\end{array}\!\right]$}\in{\mathrm{SL}}(2,{\mathbb{R}}).

(In the language of projective differential geometry $g$ is a ‘projective density of weight $1$ .’) From (17), (18), and (19), we already know that

M\Big{(}\frac{a\tau+b}{c\tau+d}\Big{)}=(c\tau+d)^{4}M(\tau),\quad\mbox{for}\enskip\mbox{\small$\left[\!\begin{array}[]{cc}a&b\\[-1.0pt] c&d\end{array}\!\right]$}\in{\mathrm{SL}}(2,{\mathbb{Z}})

and so it suffices to show that

\frac{d^{2}}{d\tau^{2}}\left[(c\tau+d)g\Big{(}\frac{a\tau+b}{c\tau+d}\Big{)}\right]=\frac{1}{(c\tau+d)^{3}}\frac{d^{2}g}{d\tau^{2}}\Big{(}\frac{a\tau+b}{c\tau+d}\Big{)},

which is an elementary consequence of the chain rule. ∎
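For completeness, the chain-rule computation can be spelled out as follows. Here $A\tau$ is our shorthand for $(a\tau+b)/(c\tau+d)$ and we use $ad-bc=1$, so that $d(A\tau)/d\tau=(c\tau+d)^{-2}$:

```latex
\begin{aligned}
\frac{d}{d\tau}\big[(c\tau+d)\,g(A\tau)\big]
  &= c\,g(A\tau)+\frac{1}{c\tau+d}\,g'(A\tau),\\[4pt]
\frac{d^{2}}{d\tau^{2}}\big[(c\tau+d)\,g(A\tau)\big]
  &= \frac{c}{(c\tau+d)^{2}}\,g'(A\tau)
     -\frac{c}{(c\tau+d)^{2}}\,g'(A\tau)
     +\frac{1}{(c\tau+d)^{3}}\,g''(A\tau)\\[4pt]
  &= \frac{1}{(c\tau+d)^{3}}\,g''(A\tau).
\end{aligned}
```

The two first-order terms cancel precisely because of the factor $(c\tau+d)$ in (28).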

Recall that ${\mathbb{S}}$ , the solution space of (25), is two-dimensional. In accordance with Theorem 8, the group ${\mathrm{SL}}(2,{\mathbb{Z}})$ , generated by

(29)

T\equiv\mbox{\small$\left[\!\begin{array}[]{cc}1&1\\[-1.0pt] 0&1\end{array}\!\right]$}\quad\mbox{and}\quad S\equiv\mbox{\small$\left[\!\begin{array}[]{cc}0&-1\\[-1.0pt] 1&0\end{array}\!\right]$},

is represented on ${\mathbb{S}}$ . More specifically, if $g(\tau)$ solves (25) then, according to (28), so do

(Tg)(\tau)\equiv g(\tau-1)\quad\mbox{and}\quad(Sg)(\tau)\equiv-\tau g(-1/\tau).

Theorem 9.

The holomorphic function $L:{\mathcal{H}}\to{\mathbb{C}}$ satisfies

(30)

L\Big{(}\frac{a\tau+b}{c\tau+d}\Big{)}=(c\tau+d)^{2}L(\tau)+\frac{6}{\pi i}c(c\tau+d)

for $\mbox{\small$\left[\!\begin{array}[]{cc}a&b\\[-1.0pt] c&d\end{array}\!\right]$}\in{\mathrm{SL}}(2,{\mathbb{Z}})$ .

Proof.

It suffices to prove (30) for the generators $T$ and $S$ of ${\mathrm{SL}}(2,{\mathbb{Z}})$ , specifically that

L(\tau+1)=L(\tau)\quad\mbox{and}\quad L(-1/\tau)=\tau^{2}L(\tau)+6\tau/\pi i.

The first of these holds by Lemma 3, which implies that $Tg=e^{\pi i/6}g$ . To establish the second identity, it suffices to show that $Sg=\beta g$ for some constant $\beta$ : if $-\tau g(-1/\tau)=\beta g(\tau)$ , then, replacing $\tau$ by $-1/\tau$ and differentiating,

\beta g(-1/\tau)=g(\tau)/\tau\enskip\Rightarrow\enskip\beta g^{\prime}(-1/\tau)=\tau g^{\prime}(\tau)-g(\tau)

so

\frac{\beta g^{\prime}(-1/\tau)}{g(\tau)}=\frac{\tau g^{\prime}(\tau)}{g(\tau)}-1.

Therefore

\frac{g^{\prime}(-1/\tau)}{\tau g(-1/\tau)}=\frac{\tau g^{\prime}(\tau)}{g(\tau)}-1

and so

-\frac{6}{\pi i}\frac{g^{\prime}(-1/\tau)}{\tau g(-1/\tau)}=-\frac{6}{\pi i}\frac{\tau g^{\prime}(\tau)}{g(\tau)}+\frac{6}{\pi i};

in other words, from (24),

\frac{L(-1/\tau)}{\tau}=\tau L(\tau)+\frac{6}{\pi i},

as required. To finish the proof, let us consider the action of ${\mathrm{SL}}(2,{\mathbb{Z}})$ on ${\mathbb{S}}$ . If $Sg\not=\beta g$ , then we may set $f\equiv Sg$ to obtain $\{f,g\}$ as a basis of ${\mathbb{S}}$ . By construction

S\mbox{\small$\left[\!\begin{array}[]{cc}f\\[-1.0pt] g\end{array}\!\right]$}=\mbox{\small$\left[\!\begin{array}[]{cc}0&-1\\[-1.0pt] 1&0\end{array}\!\right]$}\mbox{\small$\left[\!\begin{array}[]{cc}f\\[-1.0pt] g\end{array}\!\right]$}.

By Lemma 3, we already know that $Tg=e^{\pi i/6}g$ and, from Lemma 4, we know that the action of $T$ on ${\mathbb{S}}$ is diagonalisable with the other eigenvalue being $e^{-\pi i/6}$ . In other words

T\mbox{\small$\left[\!\begin{array}[]{cc}f\\[-1.0pt] g\end{array}\!\right]$}=\mbox{\small$\left[\!\begin{array}[]{cc}e^{-\pi i/6}&\alpha\\[-1.0pt] 0&e^{\pi i/6}\end{array}\!\right]$}\mbox{\small$\left[\!\begin{array}[]{cc}f\\[-1.0pt] g\end{array}\!\right]$}

for some constant $\alpha$ . In ${\mathrm{SL}}(2,{\mathbb{Z}})$ , the matrices (29) satisfy the relations

S^{2}=-{\mathrm{Id}}\quad\mbox{and}\quad(ST)^{3}=-{\mathrm{Id}}.

These same relations must hold for their action on ${\mathbb{S}}$ . For $S$ this is evident and, from the relation $(ST)^{3}=-{\mathrm{Id}}$ , we conclude that $\alpha=1$ . Therefore, since

1+ie^{\pi i/6}=ie^{-\pi i/6}

we find that

T(f+ig)=Tf+iTg=e^{-\pi i/6}f+g+ie^{\pi i/6}g=e^{-\pi i/6}(f+ig).

However, in Lemma 4, we already found in $h$ an eigenvector for the action of $T$ on ${\mathbb{S}}$ with eigenvalue $e^{-\pi i/6}$ . It follows that

(31)

f(\tau)+ig(\tau)=Ch(\tau)

for some constant $C$ . We have already observed that $h(i)\not=0$ whereas, substituting $\tau=i$ into $f=Sg$ , we find that

\big{[}f(\tau)=-\tau g(-1/\tau)\big{]}|_{\tau=i}\enskip\Rightarrow\enskip f(i)=-ig(i)\enskip\Rightarrow\enskip\big{[}f+ig\big{]}|_{\tau=i}=0.

Therefore, the only option in (31) is that $C=0$ and so $f+ig\equiv 0$ . Hence, assuming that $Sg\not=\beta g$ we have found that $Sg=-ig$ . This contradiction finishes the proof. ∎
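The linear algebra in this proof can be checked numerically. The sketch below uses the $2\times 2$ matrices displayed above for the action of $S$ and $T$ on the hypothetical basis $\{f,g\}$, with $\alpha$ left as a parameter; the helper names are ours. It confirms that $S^{2}=-{\mathrm{Id}}$, that $(ST)^{3}=-{\mathrm{Id}}$ holds for $\alpha=1$ but fails for a couple of other sample values, and that $1+ie^{\pi i/6}=ie^{-\pi i/6}$.

```python
import cmath


def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(X, Y, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices up to tol."""
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))


w = cmath.exp(1j * cmath.pi / 6)   # e^{pi i/6}
MINUS_ID = [[-1, 0], [0, -1]]
S = [[0, -1], [1, 0]]


def T(alpha):
    """Matrix of T on the basis {f, g}, with unknown off-diagonal entry alpha."""
    return [[w.conjugate(), alpha], [0, w]]


def cube_of_ST(alpha):
    ST = matmul(S, T(alpha))
    return matmul(ST, matmul(ST, ST))


print(close(matmul(S, S), MINUS_ID))        # S^2 = -Id
print(close(cube_of_ST(1), MINUS_ID))       # alpha = 1 satisfies (ST)^3 = -Id
print(close(cube_of_ST(0), MINUS_ID))       # alpha = 0 does not
print(abs(1 + 1j * w - 1j * w.conjugate())) # identity used for T(f + ig)
```

The order in which the two matrices are multiplied does not affect the conclusion, since the two possible products are conjugate to one another.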

Corollary 3.

The holomorphic $1$ -form

\big{(}L(\tau)-L(\tau+1/2)\big{)}d\tau

is $\Gamma_{0}(4)$ -invariant.

Proof.

We need only check invariance under the generators of $\Gamma_{0}(4)$ :

\tau\mapsto\tau+1\quad\mbox{and}\quad\tau\mapsto\frac{\tau}{4\tau+1}.

The first of these is clear since $L(\tau+1)=L(\tau)$ . For the second, we may use Theorem 9 immediately to conclude that

L\Big{(}\frac{\tau}{4\tau+1}\Big{)}=(4\tau+1)^{2}L(\tau)+\frac{24}{\pi i}(4\tau+1)

but also that

\begin{array}[]{rcl}\displaystyle L\Big{(}\frac{\tau}{4\tau+1}+\frac{1}{2}\Big{)}&\!\!\!\!=\!\!\!\!&\displaystyle L\Big{(}\frac{3(\tau+1/2)-1}{4(\tau+1/2)-1}\Big{)}\\[10.0pt] &\!\!\!\!=\!\!\!\!&\displaystyle\big{(}4(\tau+1/2)-1\big{)}^{2}L(\tau+1/2)+\frac{24}{\pi i}(4(\tau+1/2)-1)\\[8.0pt] &\!\!\!\!=\!\!\!\!&\displaystyle(4\tau+1)^{2}L(\tau+1/2)+\frac{24}{\pi i}(4\tau+1).\end{array}

Subtracting these identities gives

L\Big{(}\frac{\tau}{4\tau+1}\Big{)}-L\Big{(}\frac{\tau}{4\tau+1}+\frac{1}{2}\Big{)}=(4\tau+1)^{2}\Big{(}L\big{(}\tau\big{)}-L\big{(}\tau+\frac{1}{2}\big{)}\Big{)}.

But

d\Big{(}\frac{\tau}{4\tau+1}\Big{)}=\frac{(4\tau+1)d\tau-4\tau d\tau}{(4\tau+1)^{2}}=\frac{1}{(4\tau+1)^{2}}d\tau,

the factor of $(4\tau+1)^{2}$ cancels, and we are done. ∎
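Both Theorem 9 and the transformation law just derived lend themselves to a numerical spot check. The sketch below assumes the $q$-expansion $L=1-24\sum_{n\geq 1}\sigma(n)q^{n}$ used in the proof of Lemma 3, summed in its Lambert form; the truncation at 400 terms and the sample point $\tau=-0.2+0.3i$ are arbitrary choices of ours.

```python
import cmath


def L(tau, terms=400):
    """L(tau) = 1 - 24 * sum_{n>=1} n q^n / (1 - q^n), with q = exp(2 pi i tau)."""
    q = cmath.exp(2j * cmath.pi * tau)
    total, qn = 0, 1
    for n in range(1, terms + 1):
        qn *= q                       # qn = q^n
        total += n * qn / (1 - qn)
    return 1 - 24 * total


def D(tau):
    """Coefficient of d tau in the 1-form of Corollary 3."""
    return L(tau) - L(tau + 0.5)


tau = complex(-0.2, 0.3)

# Theorem 9 for the generator S: L(-1/tau) = tau^2 L(tau) + 6 tau / (pi i).
err_S = abs(L(-1 / tau) - (tau**2 * L(tau) + 6 * tau / (cmath.pi * 1j)))

# Corollary 3: D(tau / (4 tau + 1)) = (4 tau + 1)^2 D(tau).
image = tau / (4 * tau + 1)
err_G = abs(D(image) - (4 * tau + 1) ** 2 * D(tau))

print(err_S, err_G)  # both should be at the level of rounding error
```

A check at one point proves nothing, of course, but it is a cheap guard against sign and normalisation slips in (30).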

Lemma 5.

Suppose $\xi(\tau)$ is a holomorphic function ${\mathcal{H}}\to{\mathbb{C}}$ and let $q=e^{2\pi i\tau}$ . In order that $\xi(\tau)d\tau$ extend to a meromorphic differential form on the unit disc $\{|q|<1\}$ with at worst a simple pole at $q=0$ , it is necessary and sufficient that

•

$\xi(\tau+1)=\xi(\tau),\;\forall\tau\in{\mathcal{H}}$ ,
•

$\xi(\tau)$ is bounded on the rectangle $\{\tau=x+iy\,|\,0\leq x\leq 1,y\geq 1\}$ .

Proof.

The first condition ensures that $\xi(\tau)$ is, in fact, a holomorphic function of $q$ on the punctured disc $\{0<|q|<1\}$ and then, since $q=e^{2\pi i\tau}=e^{-2\pi y}e^{2\pi ix}$ , the second condition says that $\xi(q)$ is bounded on the punctured disc $\{0<|q|\leq e^{-2\pi}\}$ , at which point Riemann’s removable singularities theorem implies that $\xi(q)$ extends holomorphically across the origin: $\xi(q)=a+bq+\cdots$ . Therefore,

q=e^{2\pi i\tau}\enskip\Rightarrow\enskip dq=2\pi iqd\tau\enskip\Rightarrow\enskip\xi(\tau)d\tau=\frac{1}{2\pi i}\Big{(}\frac{a}{q}+b+\cdots\Big{)}dq,

as required. ∎

Now consider the holomorphic $1$ -form

\Xi\equiv\big{(}L(\tau)-L(\tau+1/2)\big{)}d\tau\enskip\mbox{on}\enskip{\mathcal{H}}.

With $q=e^{2\pi i\tau}$ , as usual, it follows from the definition (21) of $L$ that

L(\tau)-L(\tau+1/2)=-48\big{(}q+4q^{3}+6q^{5}+\cdots\big{)}

and so $\Xi=-\frac{24}{\pi i}(1+4q^{2}+6q^{4}+\cdots)dq$ and, in particular, extends holomorphically across $q=0$ . Now we ask what happens at the cusps, a sensible question in view of Corollary 3.

The change of coördinates $\tau=-1/4\widetilde{\tau}$ sends our usual fundamental domain for $\Gamma_{0}(4)$ into itself whilst sending

0\mapsto\infty,\quad 1/2\mapsto-1/2,\quad\infty\mapsto 0,\quad-1/2\mapsto 1/2

(it’s a half turn about $i/2$ in the hyperbolic metric on ${\mathcal{H}}$ ). In order to figure out the behaviour of $\Xi$ let us firstly consider the holomorphic $1$ -form $\xi\equiv L(\tau)d\tau$ . We may view it in the coördinate $\widetilde{\tau}$ :

\xi=L(-1/4\widetilde{\tau})d(-1/4\widetilde{\tau})=\frac{L(-1/4\widetilde{\tau})}{4\widetilde{\tau}^{2}}d\widetilde{\tau}

and employ Theorem 9 to conclude that

\xi=\frac{16\widetilde{\tau}^{2}L(4\widetilde{\tau})+24\widetilde{\tau}/\pi i}{4\widetilde{\tau}^{2}}d\widetilde{\tau}=\Big{(}4L(4\widetilde{\tau})+\frac{6}{\pi i\widetilde{\tau}}\Big{)}d\widetilde{\tau}.

Of course, whilst $4L(4\widetilde{\tau})$ is periodic under $\widetilde{\tau}\mapsto\widetilde{\tau}+1$ , $6/\pi i\widetilde{\tau}$ is not. Thus, the first stipulation of Lemma 5 in this case (namely, the periodicity of $4L(4\widetilde{\tau})+6/\pi i\widetilde{\tau}$ ) is not satisfied. But on the rectangle in the statement of Lemma 5, this function is at least bounded. Now, if we apply the same reasoning to the holomorphic $1$ -form $L(\tau+1/2)d\tau$ , then the boundedness hypothesis of Lemma 5 is again satisfied, and again periodicity fails. When we subtract $L(\tau+1/2)d\tau$ from $L(\tau)d\tau$ , periodicity is restored in view of Corollary 3 and boundedness persists! Lemma 5 now applies and we conclude that $\Xi$ has no worse than a simple pole at $z=\infty$ . Similar reasoning applies concerning the cusp at $z=1$ . With more care we could even compute the residues at these points (but this is an optional extra).
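The cancellation of the non-periodic terms can be made explicit. Applying Theorem 9 with the matrix having rows $(1,-1)$ and $(2,-1)$ to the point $\widetilde{\tau}+1/2$, a short calculation (ours, offered as a sketch and not part of the original argument) suggests that $L(\tau+1/2)d\tau=\big(L(\widetilde{\tau}+1/2)+6/\pi i\widetilde{\tau}\big)d\widetilde{\tau}$, so that $\Xi=\big(4L(4\widetilde{\tau})-L(\widetilde{\tau}+1/2)\big)d\widetilde{\tau}$. The Python sketch below tests this formula numerically, again assuming the $q$-expansion $L=1-24\sum\sigma(n)q^{n}$; the function names and sample points are our own.

```python
import cmath


def L(tau, terms=400):
    """L(tau) = 1 - 24 * sum_{n>=1} n q^n / (1 - q^n), with q = exp(2 pi i tau)."""
    q = cmath.exp(2j * cmath.pi * tau)
    total, qn = 0, 1
    for n in range(1, terms + 1):
        qn *= q
        total += n * qn / (1 - qn)
    return 1 - 24 * total


def xi_in_tau(tau):
    """Coefficient of d tau in Xi."""
    return L(tau) - L(tau + 0.5)


def xi_in_tilde(t):
    """Proposed coefficient of d tilde-tau in Xi (an assumption being tested)."""
    return 4 * L(4 * t) - L(t + 0.5)


t = complex(0.2, 0.7)                        # sample value of tilde-tau
tau = -1 / (4 * t)
pulled_back = xi_in_tau(tau) / (4 * t**2)    # d tau = d tilde-tau / (4 tilde-tau^2)

print(abs(pulled_back - xi_in_tilde(t)))          # agreement of the two expressions
print(abs(xi_in_tilde(t + 1) - xi_in_tilde(t)))   # periodicity in tilde-tau
print(xi_in_tilde(complex(-3, 5)))                # close to 3 high up in the cusp
```

If the formula is right, the coefficient tends to $3$ as $\mathrm{Im}\,\widetilde{\tau}\to\infty$, which by Lemma 5 is consistent with a genuine simple pole at this cusp.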

To conclude, we have verified that

\left(L(\tau)-L(\tau+1/2)\right)d\tau

and

\big{(}(\theta(\tau))^{4}-(\theta(\tau+1/2))^{4}\big{)}d\tau

are meromorphic one-forms on the thrice-punctured sphere with poles and zeros in the same locations. It follows that one is a constant multiple of the other, and the proof of (2) is complete upon comparing their power series expansions in $q$ . (The full force of the Jacobi four-square theorem, namely that the number of ways of representing an integer $n$ as a sum of four squares of integers is equal to $8\sum_{4\nmid d\mid n}d$ , follows from (2) in an elementary fashion.)
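The statement just proved is easy to test by brute force for small $n$. The sketch below, with function names of our own choosing, counts representations $n=a^{2}+b^{2}+c^{2}+d^{2}$ with $(a,b,c,d)\in{\mathbb{Z}}^{4}$ and compares the count with $8\sigma(n)$ for odd $n$, which is what (2) asserts, and with $8\sum_{4\nmid d\mid n}d$ for all $n$.

```python
from itertools import product
from math import isqrt


def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def jacobi(n):
    """8 times the sum of the divisors of n that are not multiples of 4."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)


def r4(n):
    """Number of integer solutions of a^2 + b^2 + c^2 + d^2 = n (brute force)."""
    m = isqrt(n)
    rng = range(-m, m + 1)
    return sum(1 for a, b, c, d in product(rng, repeat=4)
               if a * a + b * b + c * c + d * d == n)


for n in range(1, 17):
    print(n, r4(n), jacobi(n))
```

For odd $n$ the coefficient of $q^{n}$ in $(\theta(q))^{4}-(\theta(-q))^{4}$ is $2r_{4}(n)$, so the agreement $r_{4}(n)=8\sigma(n)$ matches the factor $16$ in (2).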
