
A conceptual breakthrough in sphere packing

Henry Cohn
Henry Cohn is principal researcher at Microsoft Research New England and adjunct professor of mathematics at the Massachusetts Institute of Technology. His email address is cohn@microsoft.com.

On March 14, 2016, the world of mathematics received an extraordinary Pi Day surprise when Maryna Viazovska posted to the arXiv a solution of the sphere packing problem in eight dimensions [15]. Her proof shows that the $E_8$ root lattice is the densest sphere packing in eight dimensions, via a beautiful and conceptually simple argument. Sphere packing is notorious for complicated proofs of intuitively obvious facts, as well as hopelessly difficult unsolved problems, so it’s wonderful to see a relatively simple proof of a deep theorem in sphere packing. No proof of optimality had been known for any dimension above three, and Viazovska’s paper does not even address four through seven dimensions. Instead, it relies on remarkable properties of the $E_8$ lattice. Her proof is thus a notable contribution to the story of $E_8$, and more generally the story of exceptional structures in mathematics.

One measure of the complexity of a proof is how long it takes the community to digest it. By this standard, Viazovska’s proof is remarkably simple. It was understood by a number of people within a few days of her arXiv posting, and within a week it led to further progress: Abhinav Kumar, Stephen D. Miller, Danylo Radchenko, and I worked with Viazovska to adapt her methods to prove that the Leech lattice is an optimal sphere packing in twenty-four dimensions [4]. This is the only other case above three dimensions in which the sphere packing problem has been solved.

The new ingredient in Viazovska’s proof is a certain special function, which enforces the optimality of $E_8$ via the Poisson summation formula. The existence of such a function had been conjectured by Cohn and Elkies in 2003, but what sort of function it might be remained mysterious despite considerable effort. Viazovska constructs this function explicitly in terms of modular forms by using an unexpected integral transform, which establishes a new connection between modular forms and discrete geometry.

A landmark achievement like Viazovska’s deserves to be appreciated by a broad audience of mathematicians, and indeed it can be. In this article we’ll take a look at how her proof works, as well as the background and context. We won’t cover all the details completely, but we’ll see the main ideas and how they fit together. Readers who wish to read a complete proof will then be well prepared to study Viazovska’s paper [15] and the follow-up work on the Leech lattice [4]. See also de Laat and Vallentin’s survey article and interview [13] for a somewhat different perspective, as well as [1] and [7] for further background and references.

Figure 1. Maryna Viazovska solved the sphere packing problem in eight dimensions.
Figure 2. Henry Cohn, Abhinav Kumar, Stephen D. Miller, and Danylo Radchenko collaborated with Maryna Viazovska to extend her methods to twenty-four dimensions.

1. Sphere packing

The sphere packing problem asks for the densest packing of $\mathbb{R}^n$ with congruent balls. In other words, what is the largest fraction of $\mathbb{R}^n$ that can be covered by congruent balls with disjoint interiors?

Pathological packings may not have well-defined densities, but we can handle the technicalities as follows. A sphere packing $\mathcal{P}$ is a nonempty subset of $\mathbb{R}^n$ consisting of congruent balls with disjoint interiors. The upper density of $\mathcal{P}$ is

\[
\limsup_{r\to\infty} \frac{\operatorname{vol}\big(B_r^n(0) \cap \mathcal{P}\big)}{\operatorname{vol}\big(B_r^n(0)\big)},
\]

where $B_r^n(x)$ denotes the closed ball of radius $r$ about $x$, and the sphere packing density $\Delta_{\mathbb{R}^n}$ in $\mathbb{R}^n$ is the supremum of all the upper densities of sphere packings. In other words, we avoid technicalities by using a generous definition of the packing density. This generosity does not cause any harm, as shown by the theorem of Groemer that there exists a sphere packing $\mathcal{P}$ for which

\[
\lim_{r\to\infty} \frac{\operatorname{vol}\big(B_r^n(x) \cap \mathcal{P}\big)}{\operatorname{vol}\big(B_r^n(x)\big)} = \Delta_{\mathbb{R}^n}
\]

uniformly for all $x \in \mathbb{R}^n$. Thus, the supremum of the upper densities is in fact achieved as the density of some packing, in the nicest possible way. Of course the densest packing is not unique, since there are any number of ways to perturb a packing without changing its overall density.

Why should we care about the sphere packing problem? Two obvious reasons are that it’s a natural geometric problem in its own right and a toy model for granular materials. A more surprising application is that sphere packings are error-correcting codes for a continuous communication channel. Real-world communication channels can be modeled using high-dimensional vector spaces, and thus high-dimensional sphere packings have practical importance.

Instead of justifying sphere packing by aspects of the problem or its applications, we’ll justify it by its solutions: a question is good if it has good answers. Sphere packing turns out to be a far richer and more beautiful topic than the bare problem statement suggests. From this perspective, the point of the subject is the remarkable structures that arise as dense sphere packings.

To begin, let’s examine the familiar cases of one, two, and three dimensions. The one-dimensional sphere packing problem is the interval packing problem on the line, which is of course trivial: the optimal density is $1$. The two- and three-dimensional problems are far from trivial, but the optimal packings, shown in Figure 3, are exactly what one would expect. In particular, the sphere packing density is $\pi/\sqrt{12} = 0.9068\dots$ in $\mathbb{R}^2$ and $\pi/\sqrt{18} = 0.7404\dots$ in $\mathbb{R}^3$. The two-dimensional problem was solved by Thue. Giving a rigorous proof requires a genuine idea, but there exist short, elementary proofs [8]. The three-dimensional problem was solved by Hales [9] via a lengthy and complex computer-assisted proof, which was extraordinarily difficult to check but has since been completely verified using formal logic [10].

In both two and three dimensions, one can obtain an optimal packing by stacking layers that are packed optimally in the previous dimension, with the layers nestled together as closely as possible. Guessing this answer is not difficult, nor is computing the density of such a packing. Instead, the difficulty lies in proving that no other construction could achieve a greater density.

Figure 3. Fragments of optimal sphere packings in two and three dimensions, with density $\pi/\sqrt{12} = 0.9068\dots$ in $\mathbb{R}^2$ and $\pi/\sqrt{18} = 0.7404\dots$ in $\mathbb{R}^3$.

Unfortunately, our low-dimensional experience is poor preparation for understanding high-dimensional sphere packing. Based on the first three dimensions, it appears that guessing the optimal packing is easy, but this expectation turns out to be completely false in high dimensions. In particular, stacking optimal layers from the previous dimension does not always yield an optimal packing. (One can recursively determine the best packings in successive dimensions under such a hypothesis [6], and this procedure yields a suboptimal packing by the time it reaches $\mathbb{R}^{10}$.)

The sphere packing problem seems to have no simple, systematic solution that works across all dimensions. Instead, each dimension has its own idiosyncrasies and charm. Understanding the densest sphere packing in $\mathbb{R}^8$ tells us only a little about $\mathbb{R}^7$ or $\mathbb{R}^9$, and hardly anything about $\mathbb{R}^{10}$.

Aside from $\mathbb{R}^8$ and $\mathbb{R}^{24}$, our ignorance grows as the dimension increases. In high dimensions, we have absolutely no idea how the densest sphere packings behave. We do not know even the most basic facts, such as whether the densest packings should be crystalline or disordered. Here “do not know” does not merely mean “cannot prove,” but rather the much stronger “cannot predict.”

A simple greedy argument shows that the optimal density in $\mathbb{R}^n$ is at least $2^{-n}$. To see why, consider any sphere packing in which there is no room to add even one more sphere. If we double the radius of each sphere, then the enlarged spheres must cover space completely, because any uncovered point could serve as the center of a new sphere that would fit in the original packing. Doubling the radius multiplies volume by $2^n$, and so the original packing must cover at least a $2^{-n}$ fraction of $\mathbb{R}^n$.

That may sound appallingly low, but it is very nearly the best lower bound known. Even the most recent bounds, obtained by Venkatesh [14] in 2013, have been unable to improve on $2^{-n}$ by more than a linear factor in general and an $n \log\log n$ factor in special cases. As for upper bounds, in 1978 Kabatyanskii and Levenshtein [11] proved an upper bound of $2^{(-0.599\ldots+o(1))n}$, which remains essentially the best upper bound known in high dimensions. Thus, we know that the sphere packing density decreases exponentially as a function of dimension, but the best upper and lower bounds known are exponentially far apart.

[Plot: $\log(\text{density})$ of the best packing known, as a function of the dimension, for dimensions up to $36$.]
Figure 4. The sphere packing density is jagged and irregular, with no obvious way to interpolate data points from their neighbors.
Table 1. The record sphere packing densities in $\mathbb{R}^n$ with $1 \le n \le 36$, from Table I.1 of [7, pp. xix–xx]. All numbers are rounded down.
 n  density        n  density        n  density
 1  1.000000000   13  0.0320142921  25  0.00067721200977
 2  0.906899682   14  0.0216240960  26  0.00026922005043
 3  0.740480489   15  0.0168575706  27  0.00015759439072
 4  0.616850275   16  0.0147081643  28  0.00010463810492
 5  0.465257613   17  0.0088113191  29  0.00003414464690
 6  0.372947545   18  0.0061678981  30  0.00002191535344
 7  0.295297873   19  0.0041208062  31  0.00001183776518
 8  0.253669507   20  0.0033945814  32  0.00001104074930
 9  0.145774875   21  0.0024658847  33  0.00000414068828
10  0.099615782   22  0.0024510340  34  0.00000176697388
11  0.066238027   23  0.0019053281  35  0.00000094619041
12  0.049454176   24  0.0019295743  36  0.00000061614660

Table 1 lists the best packing densities currently known in up to $36$ dimensions, and Figure 4 shows a logarithmic plot. The plot has several noteworthy features:

  (1) The curve is jagged and irregular, with no obvious way to interpolate data points from their neighbors.

  (2) The density is clearly decreasing exponentially, but the irregularity makes it unclear how to extrapolate to estimate the decay rate as the dimension tends to infinity.

  (3) There seem to be parity effects. Even dimensions look slightly better than odd dimensions, multiples of four are better yet, and multiples of eight are the best of all.

  (4) Certain dimensions, most notably $24$, have packings so good that they seem to pull the entire curve in their direction. The fact that this occurs is not so surprising, since one expects cross sections and stackings of great packings to be at least good, but the effect is surprisingly large.

2. Lattices and periodic packings

How can we describe sphere packings? Random or pathological packings can be infinitely complicated, but the most important packings can generally be given a finite description via periodicity.

Recall that a lattice in $\mathbb{R}^n$ is a discrete subgroup of rank $n$. In other words, it consists of the integral span of a basis of $\mathbb{R}^n$. Equivalently, a lattice is the image of $\mathbb{Z}^n$ under an invertible linear operator.

A sphere packing $\mathcal{P}$ is periodic if there exists a lattice $\Lambda$ such that $\mathcal{P}$ is invariant under translation by every element of $\Lambda$. In that case, the translational symmetry group of $\mathcal{P}$ must be a lattice, since it is clearly a discrete group, and $\mathcal{P}$ consists of finitely many orbits of this group. A lattice packing is a periodic packing in which the spheres form a single orbit under the translational symmetry group (i.e., their centers form a lattice, up to translation). See Figure 5 for an illustration.

Figure 5. The spheres in a lattice packing form a single orbit under translation (left), while those in a periodic packing can form several orbits (right). The small parallelograms are fundamental cells.

It is not known whether periodic packings attain the optimal sphere packing density in each dimension, aside from the five cases in which the sphere packing problem has been solved. They certainly come arbitrarily close to the optimal density: given an optimal packing, one can approximate it by taking the spheres contained in a large box and repeating them periodically throughout space, and the density loss is negligible if the box is large enough. However, there seems to be no reason why periodic packings should reach the exact optimum, and perhaps they don’t in high dimensions.

By contrast, lattices probably do not even come arbitrarily close to the optimal packing density in high dimensions. For example, the best periodic packing known in $\mathbb{R}^{10}$ is more than 8% denser than the best lattice packing known. Seen in this light, the optimality of lattices in $\mathbb{R}^8$ and $\mathbb{R}^{24}$ is not a foregone conclusion, but rather an indication that sphere packing in these dimensions is particularly simple.

To compute the density of a lattice packing, it’s convenient to view the lattice as a tiling of space with parallelotopes (the $n$-dimensional analogue of parallelograms). Given a basis $v_1, \dots, v_n$ for a lattice $\Lambda$, the parallelotope

\[
\{x_1 v_1 + \cdots + x_n v_n : \text{$0 \le x_i < 1$ for $i = 1, 2, \dots, n$}\}
\]

is called the fundamental cell of $\Lambda$ with respect to this basis. Translating the fundamental cell by elements of $\Lambda$ tiles $\mathbb{R}^n$, as in Figure 5. From this perspective, a lattice sphere packing amounts to placing spheres at the vertices of such a tiling. On a global scale, there is one sphere for each copy of the fundamental cell. Thus, if the packing uses spheres of radius $r$ and has fundamental cell $C$, then its density is the ratio

\[
\frac{\operatorname{vol}\big(B_r^n\big)}{\operatorname{vol}(C)}.
\]

Both factors in this ratio are easily computed if we are given $r$ and $C$. The volume of a fundamental cell is just the absolute value of the determinant of the corresponding lattice basis; we will write it as $\operatorname{vol}(\mathbb{R}^n/\Lambda)$, the volume of the quotient torus, to avoid having to specify a basis. Computing the volume of a ball of radius $r$ in $\mathbb{R}^n$ is a multivariate calculus exercise, whose answer is

\[
\operatorname{vol}\big(B_r^n\big) = \frac{\pi^{n/2}}{(n/2)!} r^n,
\]

where of course $(n/2)!$ means $\Gamma(n/2+1)$ when $n$ is odd. We can therefore compute the density of any lattice packing explicitly. The density of a periodic packing is equally easy to compute: if the packing consists of $N$ translates of a lattice $\Lambda$ in $\mathbb{R}^n$ and uses spheres of radius $r$, then its density is

\[
\frac{N \operatorname{vol}\big(B_r^n\big)}{\operatorname{vol}(\mathbb{R}^n/\Lambda)}.
\]

Of course the density of a packing depends on the radius of the spheres. Given a lattice with no radius specified, it is standard to use the largest radius that does not lead to overlap. The minimal vector length of a lattice $\Lambda$ is the length of the shortest nonzero vector in $\Lambda$, or equivalently the shortest distance between two distinct points in $\Lambda$. If the minimal vector length is $r$, then $r/2$ is the largest radius that yields a packing, since that is the radius at which neighboring spheres become tangent.
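The recipe above (ball volume divided by the volume of a fundamental cell, with radius equal to half the minimal vector length) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `ball_volume`, `determinant`, `minimal_vector_length`, and `lattice_density` are our own, and the brute-force search for a shortest vector only tries small integer coefficients, which is adequate for the two familiar bases used here but not in general.

```python
import math
from itertools import product

def ball_volume(n, r):
    """Volume of a ball of radius r in R^n: pi^(n/2) / Gamma(n/2 + 1) * r^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1) * r ** n

def determinant(rows):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [[float(x) for x in row] for row in rows]
    n, det = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(a[i][c]))
        if abs(a[p][c]) < 1e-12:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            det = -det
        det *= a[c][c]
        for i in range(c + 1, n):
            factor = a[i][c] / a[c][c]
            for j in range(c, n):
                a[i][j] -= factor * a[c][j]
    return det

def minimal_vector_length(basis, bound=2):
    """Shortest nonzero integer combination with coefficients in [-bound, bound].

    Assumption: the true shortest vector has small coefficients in this basis.
    """
    dim = len(basis[0])
    best = math.inf
    for coeffs in product(range(-bound, bound + 1), repeat=len(basis)):
        if any(coeffs):
            v = [sum(c * b[k] for c, b in zip(coeffs, basis)) for k in range(dim)]
            best = min(best, math.sqrt(sum(x * x for x in v)))
    return best

def lattice_density(basis):
    """Density of the lattice packing with the largest non-overlapping radius."""
    n = len(basis)
    radius = minimal_vector_length(basis) / 2
    return ball_volume(n, radius) / abs(determinant(basis))

hexagonal = [[1, 0], [0.5, math.sqrt(3) / 2]]   # hexagonal lattice in the plane
fcc = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]         # face-centered cubic lattice
print(lattice_density(hexagonal), math.pi / math.sqrt(12))
print(lattice_density(fcc), math.pi / math.sqrt(18))
```

Each printed pair should agree up to rounding error, matching the two- and three-dimensional densities quoted earlier.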

3. The $E_8$ and Leech lattices

Many dimensions feature noteworthy sphere packings, but the $E_8$ root lattice in $\mathbb{R}^8$ and the Leech lattice in $\mathbb{R}^{24}$ are perhaps the most remarkable of all, with connections to exceptional structures across mathematics. In this section, we’ll construct $E_8$ and prove some of its basic properties. It was discovered by Korkine and Zolotareff in 1873, in the guise of a quadratic form they called $W_8$. We’ll give a construction much like Korkine and Zolotareff’s but more modern. The Leech lattice $\Lambda_{24}$, discovered by Leech in 1967, is similar in spirit, but more complicated. In lieu of constructing it, we will briefly summarize its properties.

To specify $E_8$, we just need to describe a lattice basis $v_1, \dots, v_8$ in $\mathbb{R}^8$. Furthermore, only the relative positions of the basis vectors matter, so all we need to specify is their inner products with each other. All this information will be encoded by the Dynkin diagram

[Figure: the Dynkin diagram of $E_8$.]

of $E_8$. In this diagram, the eight nodes correspond to the basis vectors, each of squared length $2$. The inner product between distinct vectors is $-1$ if the nodes are joined by an edge, and $0$ otherwise. Thus, if we number the nodes

[Figure: the Dynkin diagram with numbered nodes, in which nodes 1, 2, 3, 5, 6, 7, 8 form a chain and node 4 is joined only to node 3.]

then the Gram matrix of inner products for this basis is given by

\[
\big(\langle v_i, v_j \rangle\big)_{1 \le i,j \le 8} =
\begin{bmatrix}
 2 & -1 &  0 &  0 &  0 &  0 &  0 &  0 \\
-1 &  2 & -1 &  0 &  0 &  0 &  0 &  0 \\
 0 & -1 &  2 & -1 & -1 &  0 &  0 &  0 \\
 0 &  0 & -1 &  2 &  0 &  0 &  0 &  0 \\
 0 &  0 & -1 &  0 &  2 & -1 &  0 &  0 \\
 0 &  0 &  0 &  0 & -1 &  2 & -1 &  0 \\
 0 &  0 &  0 &  0 &  0 & -1 &  2 & -1 \\
 0 &  0 &  0 &  0 &  0 &  0 & -1 &  2
\end{bmatrix}. \tag{3.1}
\]

Before we go further, we must address a fundamental question: how do we know there really are vectors $v_1, \dots, v_8$ with these inner products? All we need is for the matrix in (3.1) to be symmetric and positive definite, and indeed it is, although it’s not obviously positive definite. That can be checked in several ways. We’ll take the pedestrian approach of observing that the characteristic polynomial of this matrix is

\[
t^8 - 16t^7 + 105t^6 - 364t^5 + 714t^4 - 784t^3 + 440t^2 - 96t + 1,
\]

which clearly has no roots when $t \le 0$: every term is positive when $t < 0$, and the value at $t = 0$ is $1$. Since the matrix is symmetric, its eigenvalues are real, and so they are all positive.
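For readers who would like to check the characteristic polynomial by machine, here is a minimal sketch in exact arithmetic. The Faddeev-LeVerrier recursion is our choice of method, not something the argument depends on, and the names `E8_EDGES`, `e8_gram`, `characteristic_polynomial`, and `signs_alternate` are hypothetical helpers. The edge list simply transcribes the off-diagonal entries of (3.1).

```python
from fractions import Fraction

# Edges of the Dynkin diagram, read off from the off-diagonal -1 entries of (3.1).
E8_EDGES = [(1, 2), (2, 3), (3, 4), (3, 5), (5, 6), (6, 7), (7, 8)]

def e8_gram():
    """Gram matrix (3.1): 2 on the diagonal, -1 for each edge, 0 elsewhere."""
    gram = [[2 if i == j else 0 for j in range(8)] for i in range(8)]
    for a, b in E8_EDGES:
        gram[a - 1][b - 1] = gram[b - 1][a - 1] = -1
    return gram

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def characteristic_polynomial(matrix):
    """Coefficients of det(tI - A), highest degree first (Faddeev-LeVerrier)."""
    n = len(matrix)
    a = [[Fraction(x) for x in row] for row in matrix]
    coeffs = [Fraction(1)]
    m = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        m = matmul(a, m)
        for i in range(n):
            m[i][i] += coeffs[-1]
        am = matmul(a, m)
        coeffs.append(-sum(am[i][i] for i in range(n)) / k)
    return [int(c) for c in coeffs]

def signs_alternate(coeffs):
    """True if the coefficients are +, -, +, ... so every term is positive for t < 0."""
    return all((-1) ** k * c > 0 for k, c in enumerate(coeffs))

coeffs = characteristic_polynomial(e8_gram())
print(coeffs)
print(signs_alternate(coeffs))
```

The printed coefficients can be compared directly with the polynomial displayed above; alternating signs for an even-degree polynomial are exactly what makes every term positive for negative $t$.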

We can now define the $E_8$ root lattice to be the integral span of $v_1, \dots, v_8$. We will use this definition to derive several fundamental properties of $E_8$. These properties will let us determine its packing density, and they will also be essential for Viazovska’s proof.

The $E_8$ lattice is an integral lattice, which means all the inner products between vectors in $E_8$ are integers. This follows immediately from the integrality of the inner products of the basis vectors $v_1, \dots, v_8$. Even more importantly, $E_8$ is an even lattice, which means the squared length of every vector is an even integer. Specifically, for $m_1, \dots, m_8 \in \mathbb{Z}$ the vector $m_1 v_1 + \dots + m_8 v_8$ has squared length

\[
|m_1 v_1 + \dots + m_8 v_8|^2 = 2m_1^2 + \dots + 2m_8^2 + \sum_{1 \le i < j \le 8} 2 m_i m_j \langle v_i, v_j \rangle,
\]

which is visibly even. Thus, the distances between distinct points in $E_8$ are all of the form $\sqrt{2k}$ with $k = 1, 2, \dots$, and in fact each of those distances does occur.
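As a sanity check on evenness, the following sketch evaluates the squared length formula for every nonzero coefficient vector with entries in $\{-1, 0, 1\}$. It is a spot check on a finite sample, not a proof, and the helper names (`E8_EDGES`, `e8_gram`, `squared_length`) are our own.

```python
from itertools import product

# Edges of the Dynkin diagram, read off from the Gram matrix (3.1).
E8_EDGES = [(1, 2), (2, 3), (3, 4), (3, 5), (5, 6), (6, 7), (7, 8)]

def e8_gram():
    """Gram matrix (3.1): 2 on the diagonal, -1 for each edge, 0 elsewhere."""
    gram = [[2 if i == j else 0 for j in range(8)] for i in range(8)]
    for a, b in E8_EDGES:
        gram[a - 1][b - 1] = gram[b - 1][a - 1] = -1
    return gram

def squared_length(m, gram):
    """Squared length of m_1 v_1 + ... + m_8 v_8, computed from the Gram matrix."""
    n = len(m)
    return sum(m[i] * gram[i][j] * m[j] for i in range(n) for j in range(n))

gram = e8_gram()
norms = {squared_length(m, gram) for m in product((-1, 0, 1), repeat=8) if any(m)}
print(sorted(norms))
print(all(value % 2 == 0 for value in norms), min(norms))
```

Every squared length in the sample should be an even integer, with the smallest equal to $2$ (attained by the basis vectors themselves).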

In particular, the distance between neighboring points in $E_8$ is $\sqrt{2}$, so we can form a packing with spheres of radius $\sqrt{2}/2$ and density

\[
\frac{\operatorname{vol}\big(B^8_{\sqrt{2}/2}\big)}{\operatorname{vol}(\mathbb{R}^8/E_8)} = \frac{\pi^4}{384 \operatorname{vol}(\mathbb{R}^8/E_8)}.
\]

To compute the density of the $E_8$ packing, all we need to compute is $\operatorname{vol}(\mathbb{R}^8/E_8)$.

To compute this volume, recall that it’s the absolute value of the determinant of the basis matrix:

\[
\operatorname{vol}(\mathbb{R}^8/E_8) = \left| \det \begin{bmatrix} \longleftarrow v_1 \longrightarrow \\ \longleftarrow v_2 \longrightarrow \\ \vdots \\ \longleftarrow v_8 \longrightarrow \end{bmatrix} \right|.
\]

However, we can write the Gram matrix $\big(\langle v_i, v_j \rangle\big)_{1 \le i,j \le 8}$ as the product

\[
\begin{bmatrix} \longleftarrow v_1 \longrightarrow \\ \longleftarrow v_2 \longrightarrow \\ \vdots \\ \longleftarrow v_8 \longrightarrow \end{bmatrix}
\begin{bmatrix} \Big\uparrow & \Big\uparrow & & \Big\uparrow \\ v_1 & v_2 & \cdots & v_8 \\ \Big\downarrow & \Big\downarrow & & \Big\downarrow \end{bmatrix}
\]

of the basis matrix with its transpose, and thus

\[
\det \big(\langle v_i, v_j \rangle\big)_{1 \le i,j \le 8} = \operatorname{vol}(\mathbb{R}^8/E_8)^2.
\]

Computing the determinant of the matrix in (3.1) then shows that $\operatorname{vol}(\mathbb{R}^8/E_8) = 1$. In other words, $E_8$ is a unimodular lattice.

Putting together our calculations, we have proved the following proposition:

Proposition 3.1.

The $E_8$ lattice packing in $\mathbb{R}^8$ has density $\pi^4/384 = 0.2536\dots$.
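The two computations behind Proposition 3.1 (the determinant of the Gram matrix and the resulting density) can be reproduced with a short script. This is a sketch under our own naming (`e8_gram`, `exact_determinant`, `ball_volume`); the determinant uses exact rational arithmetic so that the value $1$ is not obscured by rounding.

```python
import math
from fractions import Fraction

# Edges of the Dynkin diagram, read off from the Gram matrix (3.1).
E8_EDGES = [(1, 2), (2, 3), (3, 4), (3, 5), (5, 6), (6, 7), (7, 8)]

def e8_gram():
    """Gram matrix (3.1): 2 on the diagonal, -1 for each edge, 0 elsewhere."""
    gram = [[2 if i == j else 0 for j in range(8)] for i in range(8)]
    for a, b in E8_EDGES:
        gram[a - 1][b - 1] = gram[b - 1][a - 1] = -1
    return gram

def exact_determinant(rows):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in rows]
    n, det = len(a), Fraction(1)
    for c in range(n):
        p = next((i for i in range(c, n) if a[i][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            a[c], a[p] = a[p], a[c]
            det = -det
        det *= a[c][c]
        for i in range(c + 1, n):
            factor = a[i][c] / a[c][c]
            a[i] = [x - factor * y for x, y in zip(a[i], a[c])]
    return det

def ball_volume(n, r):
    """Volume of a ball of radius r in R^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1) * r ** n

gram_det = exact_determinant(e8_gram())      # equals vol(R^8 / E_8)^2
covolume = math.sqrt(gram_det)
density = ball_volume(8, math.sqrt(2) / 2) / covolume
print(gram_det, density, math.pi ** 4 / 384)
```

The last two printed numbers should agree up to rounding, and both match the entry for $n = 8$ in Table 1.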

Our calculations so far have led us to what turns out to be the densest sphere packing in $\mathbb{R}^8$, but it’s not obvious from this construction that $E_8$ is an especially interesting lattice. The $E_8$ lattice is in fact magnificently symmetrical, far more so than one might naively guess based on its lopsided Dynkin diagram. Its symmetry group is the $E_8$ Weyl group, which is generated by reflections in the hyperplanes orthogonal to each of $v_1, \dots, v_8$. We will not make use of this group, but it’s important to keep in mind that the lattice itself is far more symmetrical than its definition. This is a common pattern when defining highly symmetrical objects.

Our density calculation for $E_8$ was based on its being an even unimodular lattice. In fact, $E_8$ is the unique even unimodular lattice in $\mathbb{R}^8$, up to orthogonal transformations. Even unimodular lattices exist only when the dimension is a multiple of eight, and they play a surprisingly large role in the theory of sphere packing.

Figure 6. Stephen D. Miller explains dual lattices and transference theorems to his graduate class on the geometry of numbers.

The last property of $E_8$ we will need for Viazovska’s proof is that it is its own dual lattice, a concept we will define shortly. Given a lattice $\Lambda$ with basis $v_1, \dots, v_n$, let $v_1^*, \dots, v_n^*$ be the dual basis with respect to the usual inner product. In other words,

\[
\langle v_i, v_j^* \rangle = \begin{cases} 1 & \text{if $i = j$, and} \\ 0 & \text{otherwise.} \end{cases}
\]

Then the dual lattice $\Lambda^*$ of $\Lambda$ is the lattice with basis $v_1^*, \dots, v_n^*$. It is not difficult to check that $\Lambda^*$ is independent of the choice of basis for $\Lambda$; one basis-free way to characterize it is that

\[
\Lambda^* = \{ y \in \mathbb{R}^n : \text{$\langle x, y \rangle \in \mathbb{Z}$ for all $x \in \Lambda$} \}. \tag{3.2}
\]

The self-duality $E_8^* = E_8$ is a consequence of the following lemma:

Lemma 3.2.

Every integral unimodular lattice $\Lambda$ satisfies $\Lambda^* = \Lambda$.

Proof.

Let $v_1, \dots, v_n$ be a basis of $\Lambda$, and $v_1^*, \dots, v_n^*$ the dual basis of $\Lambda^*$. By construction, the basis matrix formed by $v_1^*, \dots, v_n^*$ is the inverse of the transpose of that formed by $v_1, \dots, v_n$, and hence $\operatorname{vol}(\mathbb{R}^n/\Lambda^*) = 1/\operatorname{vol}(\mathbb{R}^n/\Lambda)$. If $\Lambda$ is an integral lattice, then $\Lambda \subseteq \Lambda^*$, and the index of $\Lambda$ in $\Lambda^*$ is given by

\[
[\Lambda^* : \Lambda] = \operatorname{vol}(\mathbb{R}^n/\Lambda) / \operatorname{vol}(\mathbb{R}^n/\Lambda^*) = \operatorname{vol}(\mathbb{R}^n/\Lambda)^2.
\]

If $\Lambda$ is unimodular as well, then $[\Lambda^* : \Lambda] = 1$ and hence $\Lambda^* = \Lambda$. ∎
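Self-duality can also be seen concretely from the Gram matrix. If $G$ is the Gram matrix of a basis $v_1, \dots, v_n$, then $v_i^* = \sum_j (G^{-1})_{ij} v_j$, so an integral lattice is self-dual exactly when $G^{-1}$ has integer entries. The sketch below inverts the $E_8$ Gram matrix over the rationals and, for contrast, does the same for the hexagonal lattice scaled to have Gram matrix with determinant $3$. The helper names `e8_gram`, `exact_inverse`, and `is_integral` are our own.

```python
from fractions import Fraction

# Edges of the Dynkin diagram, read off from the Gram matrix (3.1).
E8_EDGES = [(1, 2), (2, 3), (3, 4), (3, 5), (5, 6), (6, 7), (7, 8)]

def e8_gram():
    """Gram matrix (3.1): 2 on the diagonal, -1 for each edge, 0 elsewhere."""
    gram = [[2 if i == j else 0 for j in range(8)] for i in range(8)]
    for a, b in E8_EDGES:
        gram[a - 1][b - 1] = gram[b - 1][a - 1] = -1
    return gram

def exact_inverse(rows):
    """Inverse of an invertible matrix by Gauss-Jordan elimination over the rationals."""
    n = len(rows)
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(rows)]
    for c in range(n):
        p = next(i for i in range(c, n) if a[i][c] != 0)
        a[c], a[p] = a[p], a[c]
        pivot = a[c][c]
        a[c] = [x / pivot for x in a[c]]
        for i in range(n):
            if i != c and a[i][c] != 0:
                factor = a[i][c]
                a[i] = [x - factor * y for x, y in zip(a[i], a[c])]
    return [row[n:] for row in a]

def is_integral(matrix):
    """True if every entry is an integer."""
    return all(x.denominator == 1 for row in matrix for x in row)

e8_dual_gram = exact_inverse(e8_gram())           # Gram matrix of the dual basis
hexagonal_dual_gram = exact_inverse([[2, -1], [-1, 2]])
print(is_integral(e8_dual_gram))                  # dual basis lies in E_8
print(is_integral(hexagonal_dual_gram))           # dual is strictly larger here
```

For the second matrix the inverse has denominators of $3$, reflecting an index $[\Lambda^* : \Lambda] = 3$, in line with the index formula in the proof.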

As mentioned above, the Leech lattice $\Lambda_{24}$ is similar to $E_8$ but more elaborate. It’s an even unimodular lattice in $\mathbb{R}^{24}$, but this time with no vectors of length $\sqrt{2}$, and it’s the unique lattice with these properties, up to orthogonal transformations. The nonzero vectors in $\Lambda_{24}$ have lengths $\sqrt{2k}$ for $k = 2, 3, \dots$, and of course $\Lambda_{24}^* = \Lambda_{24}$ because $\Lambda_{24}$ is integral and unimodular. One noteworthy property of $\Lambda_{24}$ is that it’s chiral: all of its symmetries are orientation-preserving, and the Leech lattice therefore occurs in left-handed and right-handed variants, which are mirror images of each other. (By contrast, the symmetry group of $E_8$ is generated by reflections, so $E_8$ is certainly not chiral.)

The sphere packing density of the Leech lattice is

\[
\frac{\operatorname{vol}\big(B^{24}_1\big)}{\operatorname{vol}(\mathbb{R}^{24}/\Lambda_{24})} = \frac{\pi^{12}}{12!} = 0.001929\dots,
\]

which looks awfully low, but keep in mind that the optimal density decreases exponentially as a function of dimension. In fact, the density of the Leech lattice is remarkably high, as one can see from Figure 4 and Table 1. For comparison, the best density known in $\mathbb{R}^{23}$ is $0.001905\dots$, which is lower than the density of the Leech lattice, and this is the only case in which the density increases from one dimension to the next in Table 1.
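The arithmetic here is quick to confirm. Since the minimal vector length is $2$, the spheres have radius $1$, and unimodularity gives a fundamental cell of volume $1$. The snippet below is a small sketch with hypothetical names (`ball_volume`, `leech_density`); the comparison value for $\mathbb{R}^{23}$ is copied from Table 1.

```python
import math

def ball_volume(n, r):
    """Volume of a ball of radius r in R^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1) * r ** n

covolume = 1.0                        # the Leech lattice is unimodular
leech_density = ball_volume(24, 1.0) / covolume
closed_form = math.pi ** 12 / math.factorial(12)
best_known_23 = 0.0019053281          # Table 1 entry for n = 23, rounded down

print(leech_density, closed_form)
print(leech_density > best_known_23)
```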

4. Linear programming bounds

The underlying technique used in Viazovska’s proof is linear programming bounds for the sphere packing density in $\mathbb{R}^n$. These upper bounds were developed by Cohn and Elkies [2], based on several decades of previous work initiated by Delsarte and extended by numerous mathematicians. In this approach to sphere packing, one uses auxiliary functions with certain properties to obtain density bounds. Viazovska’s breakthrough consists of a new technique for constructing these auxiliary functions, but before we turn to her proof let’s examine the general theory and review how the bounds work. We will see that the general bounds do not refer to special dimensions such as eight and twenty-four, which makes it all the more remarkable that they can be used to solve the sphere packing problem in these dimensions.

Figure 7. Noam Elkies developed the linear programming bounds for sphere packing with Henry Cohn.

Linear programming bounds are based on harmonic analysis. That may sound surprising, since sphere packing is a problem in discrete geometry, which at first glance seems to have little to do with the continuous problems studied in harmonic analysis. However, there is a deep connection between these fields, because the Fourier transform is essential for understanding the action of the additive group $\mathbb{R}^n$ on itself by translation, so much so that one can’t truly understand lattices without harmonic analysis.

Define the Fourier transform $\widehat{f}$ of an integrable function $f \colon \mathbb{R}^n \to \mathbb{R}$ by

\[
\widehat{f}(y) = \int_{\mathbb{R}^n} f(x) e^{-2\pi i \langle x, y \rangle} \, dx.
\]

Fourier inversion tells us that if $\widehat{f}$ is integrable as well, then one can similarly recover $f$ from $\widehat{f}$:

\[
f(x) = \int_{\mathbb{R}^n} \widehat{f}(y) e^{2\pi i \langle x, y \rangle} \, dy \tag{4.1}
\]

almost everywhere. In other words, the Fourier transform gives the unique coefficients needed to express $f$ in terms of complex exponentials.

To avoid analytic technicalities, we will focus on Schwartz functions. Recall that $f \colon \mathbb{R}^n \to \mathbb{R}$ is a Schwartz function if $f$ is infinitely differentiable, $f(x) = O\big((1+|x|)^{-k}\big)$ for all $k = 1, 2, \dots$, and the same holds for all the partial derivatives of $f$ (of every order). Schwartz functions behave particularly well, well enough to justify everything we’d like to do with them, and they are closed under the Fourier transform. We could get by with weaker hypotheses, but in fact Viazovska’s construction produces Schwartz functions, so we might as well focus on that case.

The significance of the Fourier transform in sphere packing is that it diagonalizes the operation of translation by any vector. Specifically, (4.1) implies that

\[
f(x+t) = \int_{\mathbb{R}^n} \widehat{f}(y) e^{2\pi i \langle t, y \rangle} e^{2\pi i \langle x, y \rangle} \, dy,
\]

which means that translating the input to the function $f$ by $t$ amounts to multiplying its Fourier transform $\widehat{f}(y)$ by $e^{2\pi i \langle t, y \rangle}$. Simultaneously diagonalizing all these translation operators makes the Fourier transform an ideal tool for studying periodic structures.

The key technical tool behind linear programming bounds is the Poisson summation formula, which expresses a duality between summing a function over a lattice and summing the Fourier transform over the dual lattice, as defined in (3.2). Poisson summation says that if $f$ is a Schwartz function, then

\[
\sum_{x \in \Lambda} f(x) = \frac{1}{\operatorname{vol}(\mathbb{R}^n/\Lambda)} \sum_{y \in \Lambda^*} \widehat{f}(y). \tag{4.2}
\]

In other words, summing $\widehat{f}$ over $\Lambda^*$ is almost the same as summing $f$ over $\Lambda$, with the only difference being a factor of $\operatorname{vol}(\mathbb{R}^n/\Lambda)$. When expressed in this form, Poisson summation looks mysterious, but it becomes far more transparent when written in the translated form

\[
\sum_{x \in \Lambda} f(x+t) = \frac{1}{\operatorname{vol}(\mathbb{R}^n/\Lambda)} \sum_{y \in \Lambda^*} \widehat{f}(y) e^{2\pi i \langle y, t \rangle}. \tag{4.3}
\]

This equation reduces to (4.2) when $t = 0$, and it has a simple proof. As a function of $t$, the left side of (4.3) is periodic modulo $\Lambda$, while the right side is its Fourier series. In particular, the right side uses exactly the complex exponentials $t \mapsto e^{2\pi i \langle y, t \rangle}$ that are periodic modulo $\Lambda$, namely those with $y \in \Lambda^*$ (as follows easily from (3.2)). Orthogonality lets us compute the coefficient of such an exponential, and some manipulation yields $\widehat{f}(y)/\operatorname{vol}(\mathbb{R}^n/\Lambda)$.
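A one-dimensional numerical experiment makes (4.3) tangible. The sketch below assumes the Gaussian $f(x) = e^{-\pi a x^2}$, whose Fourier transform under the convention above is $\widehat{f}(y) = a^{-1/2} e^{-\pi y^2/a}$, and the lattice $\Lambda = c\mathbb{Z}$, whose dual is $(1/c)\mathbb{Z}$ and whose fundamental cell has volume $c$. The parameters `a`, `c`, `t` and the function names are our own choices for illustration, and both sums are truncated.

```python
import cmath
import math

def f(x, a):
    """Gaussian f(x) = exp(-pi * a * x^2)."""
    return math.exp(-math.pi * a * x * x)

def f_hat(y, a):
    """Its Fourier transform: a^(-1/2) * exp(-pi * y^2 / a)."""
    return math.exp(-math.pi * y * y / a) / math.sqrt(a)

def lattice_side(t, c, a, terms=50):
    """Left side of (4.3) for the lattice c*Z, truncated to |k| <= terms."""
    return sum(f(c * k + t, a) for k in range(-terms, terms + 1))

def dual_side(t, c, a, terms=50):
    """Right side of (4.3): sum over the dual lattice (1/c)*Z, divided by vol = c."""
    total = sum(f_hat(m / c, a) * cmath.exp(2j * math.pi * (m / c) * t)
                for m in range(-terms, terms + 1))
    return total / c

a, c, t = 0.7, 1.5, 0.3
left = lattice_side(t, c, a)
right = dual_side(t, c, a)
print(left, right)   # the imaginary part of `right` should be negligible
```

Because Gaussians decay so rapidly, the truncation error is far below floating-point precision, and the two sides should agree to many digits.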

Now we can state and prove the linear programming bounds, which show how to convert a certain sort of auxiliary function into a sphere packing bound. Specifically, we will use functions $f\colon\mathbb{R}^{n}\to\mathbb{R}$ such that $f$ is eventually nonpositive (i.e., there exists a radius $r$ such that $f(x)\leq 0$ for $|x|\geq r$) while $\widehat{f}$ is nonnegative everywhere.

Theorem 4.1 (Cohn and Elkies [2]).

Let $f\colon\mathbb{R}^{n}\to\mathbb{R}$ be a Schwartz function and $r$ a positive real number such that $f(0)=\widehat{f}(0)>0$, $\widehat{f}(y)\geq 0$ for all $y\in\mathbb{R}^{n}$, and $f(x)\leq 0$ for $|x|\geq r$. Then the sphere packing density in $\mathbb{R}^{n}$ is at most $\operatorname{vol}\big(B_{r/2}^{n}\big)$.

The name “linear programming” refers to optimizing a linear function subject to linear constraints. The optimization problem of choosing $f$ so as to minimize $r$ can be rephrased as an infinite-dimensional linear program after a change of variables, but we will not adopt that perspective here.

Proof.

The proof consists of applying the contrasting inequalities $f(x)\leq 0$ and $\widehat{f}(y)\geq 0$ to the two sides of Poisson summation. We will begin by proving the theorem for lattice packings, which is the simplest case.

Suppose $\Lambda$ is a lattice in $\mathbb{R}^{n}$, and suppose without loss of generality that the minimal vector length of $\Lambda$ is $r$ (since the sphere packing density is invariant under rescaling). In other words, the packing uses balls of radius $r/2$, and its density is

\[ \frac{\operatorname{vol}\big(B_{r/2}^{n}\big)}{\operatorname{vol}(\mathbb{R}^{n}/\Lambda)}. \]

Proving the desired density bound $\operatorname{vol}\big(B_{r/2}^{n}\big)$ for $\Lambda$ amounts to showing that $\operatorname{vol}(\mathbb{R}^{n}/\Lambda)\geq 1$. By Poisson summation,

(4.4) \[ \sum_{x\in\Lambda}f(x)=\frac{1}{\operatorname{vol}(\mathbb{R}^{n}/\Lambda)}\sum_{y\in\Lambda^{*}}\widehat{f}(y). \]

Now the inequality $f(x)\leq 0$ for $|x|\geq r$ tells us that the left side of (4.4) is bounded above by $f(0)$ (every nonzero $x\in\Lambda$ satisfies $|x|\geq r$, so all the other terms are nonpositive), and the inequality $\widehat{f}(y)\geq 0$ tells us that the right side is bounded below by $\widehat{f}(0)/\operatorname{vol}(\mathbb{R}^{n}/\Lambda)$. It follows that

\[ f(0)\geq\frac{\widehat{f}(0)}{\operatorname{vol}(\mathbb{R}^{n}/\Lambda)}, \]

which yields $\operatorname{vol}(\mathbb{R}^{n}/\Lambda)\geq 1$ because $f(0)=\widehat{f}(0)>0$.

The general case is almost as simple, but the algebraic manipulations are a little trickier. Because periodic packings come arbitrarily close to the optimal sphere packing density, without loss of generality we can consider a periodic packing using balls of radius $r/2$, centered at the translates of a lattice $\Lambda\subseteq\mathbb{R}^{n}$ by vectors $t_{1},\dots,t_{N}$. The packing density is

\[ \frac{N\operatorname{vol}\big(B_{r/2}^{n}\big)}{\operatorname{vol}(\mathbb{R}^{n}/\Lambda)}, \]

and so we wish to prove that $\operatorname{vol}(\mathbb{R}^{n}/\Lambda)\geq N$.

We will use the translated Poisson summation formula (4.3), which after a little manipulation implies that

\[ \sum_{j,k=1}^{N}\sum_{x\in\Lambda}f(t_{j}-t_{k}+x)=\frac{1}{\operatorname{vol}(\mathbb{R}^{n}/\Lambda)}\sum_{y\in\Lambda^{*}}\widehat{f}(y)\left|\sum_{j=1}^{N}e^{2\pi i\langle y,t_{j}\rangle}\right|^{2}. \]

Again we apply the contrasting inequalities on $f$ and $\widehat{f}$ to the left and right sides, respectively. On the left, we obtain an upper bound by throwing away every term except when $j=k$ and $x=0$; on the right, we obtain a lower bound by throwing away every term except when $y=0$. Thus,

\[ Nf(0)\geq\frac{N^{2}}{\operatorname{vol}(\mathbb{R}^{n}/\Lambda)}\widehat{f}(0), \]

which implies that $\operatorname{vol}(\mathbb{R}^{n}/\Lambda)\geq N$ and hence that the density is at most $\operatorname{vol}\big(B_{r/2}^{n}\big)$, as desired. ∎

This proof technique may look absurdly inefficient. We start with Poisson summation, which expresses a deep duality, and then we recklessly throw away all the nontrivial terms, leaving only the contributions from the origin. One practical justification is that we have little choice in the matter, since we don’t know what the other terms are (they all depend on the lattice). A deeper justification is that the omitted terms are generally small, and sometimes zero, so omitting them is not as bad as it sounds.

To apply Theorem 4.1, we must choose an auxiliary function $f$. The theorem then shows how to obtain a density bound from $f$, but it says nothing about how to choose $f$ so as to minimize $r$ and hence minimize the density bound. Sadly, optimizing the auxiliary function remains an unsolved problem, and the best possible choice of $f$ is known only when $n=1$, $8$, or $24$.

As a first step towards solving this problem, note that we can radially symmetrize $f$, so that $f(x)$ depends only on $|x|$, because all the constraints on $f$ are linear and rotationally invariant. Then $f$ is really a function of one radial variable, as is $\widehat{f}$. Functions of one variable feel like they should be tractable, but this optimization problem turns out to be impressively subtle.

If we can’t fully optimize the choice of $f$, then what can we do? Several explicit constructions are known, but in general we must resort to numerical computation. For this purpose, it’s convenient to use auxiliary functions of the form $f(x)=p(|x|^{2})e^{-\pi|x|^{2}}$, where $p$ is a polynomial. These functions are flexible enough to approximate arbitrary radial Schwartz functions, but simple enough to be tractable. Numerical optimization then yields a high-precision approximation to the linear programming bound, which is shown in Figure 8 and Table 2.
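To illustrate why this family is tractable, here is a small worked computation for a polynomial of degree one. It is only a sketch, and it assumes the normalization $\widehat{f}(y)=\int_{\mathbb{R}^{n}}f(x)e^{-2\pi i\langle x,y\rangle}\,dx$ (consistent with the inversion formula used earlier), under which the Gaussian $e^{-\pi|x|^{2}}$ is its own Fourier transform and multiplication by $|x|^{2}$ corresponds to applying $-\Delta/(4\pi^{2})$ on the Fourier side. The coefficients $a$ and $b$ are hypothetical placeholders.

```latex
\begin{align*}
\big(|x|^{2}e^{-\pi|x|^{2}}\big)^{\wedge}(y)
  &= -\frac{1}{4\pi^{2}}\,\Delta_{y}\,e^{-\pi|y|^{2}}
   = -\frac{1}{4\pi^{2}}\big(4\pi^{2}|y|^{2}-2\pi n\big)e^{-\pi|y|^{2}}
   = \Big(\frac{n}{2\pi}-|y|^{2}\Big)e^{-\pi|y|^{2}},\\[4pt]
f(x) = \big(a+b|x|^{2}\big)e^{-\pi|x|^{2}}
  &\quad\Longrightarrow\quad
\widehat{f}(y) = \Big(a+\frac{bn}{2\pi}-b|y|^{2}\Big)e^{-\pi|y|^{2}}.
\end{align*}
```

The Fourier transform stays within the same family, with a polynomial of the same degree, so sign conditions on $f$ and $\widehat{f}$ become conditions on polynomials in one variable.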

5. The hunt for the magic functions

The most striking property of Figure 8 is that the upper and lower bounds in $\mathbb{R}^{n}$ seem to touch when $n=8$ or $24$. In other words, there should be magic auxiliary functions that solve the sphere packing problem in these dimensions, by achieving $r=\sqrt{2}$ in Theorem 4.1 when $n=8$ and $r=2$ when $n=24$. (These values of $r$ are the minimal vector lengths in $E_{8}$ and $\Lambda_{24}$, respectively.) This is exactly what has now been proved, and the proof simply amounts to constructing an appropriate auxiliary function. Linear programming bounds do not seem to be sharp for any other $n>2$, which makes these two cases truly remarkable.
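The bound in Theorem 4.1 for these two values of $r$ can be evaluated directly. The sketch below uses the standard formula $\pi^{n/2}\rho^{n}/\Gamma(n/2+1)$ for the volume of a ball of radius $\rho$ in $\mathbb{R}^{n}$ (not restated in this section) and compares the results with the $n=8$ and $n=24$ entries of Table 2; the function name `ball_volume` is an illustrative choice.

```python
import math

def ball_volume(n, radius):
    """Volume of the n-dimensional ball of the given radius."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1) * radius ** n

# Theorem 4.1 bounds the density by vol(B_{r/2}^n).
bound_8 = ball_volume(8, math.sqrt(2) / 2)   # r = sqrt(2), the minimal vector length of E8
bound_24 = ball_volume(24, 2 / 2)            # r = 2, the minimal vector length of the Leech lattice

print(bound_8)    # compare with the n = 8 entry of Table 2
print(bound_24)   # compare with the n = 24 entry of Table 2
```

These values equal $\pi^{4}/384$ and $\pi^{12}/12!$, and they agree with the table entries once those are understood as rounded up.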

[Figure 8: plot of $\log(\text{density})$ against dimension, for dimensions 4 through 36, with two curves labeled “linear programming bound” and “best packing known”.]
Figure 8. The logarithm of sphere packing density as a function of dimension. The upper curve is the numerically optimized linear programming bound, while the lower curve is the best packing currently known. The truth lies somewhere in between.
Table 2. The linear programming bound for the sphere packing density in $\mathbb{R}^{n}$ with $1\leq n\leq 36$. All numbers are rounded up.

 n   upper bound      n   upper bound       n   upper bound
 1   1.000000000     13   0.0624817002     25   0.001384190723
 2   0.906899683     14   0.0463644893     26   0.000991023890
 3   0.779746762     15   0.0342482621     27   0.000708229796
 4   0.647704966     16   0.0251941308     28   0.000505254217
 5   0.524980022     17   0.0184640904     29   0.000359858186
 6   0.417673416     18   0.0134853405     30   0.000255902875
 7   0.327455611     19   0.0098179552     31   0.000181708382
 8   0.253669508     20   0.0071270537     32   0.000128843289
 9   0.194555339     21   0.0051596604     33   0.000091235604
10   0.147953479     22   0.0037259420     34   0.000064522197
11   0.111690766     23   0.0026842799     35   0.000045574385
12   0.083775831     24   0.0019295744     36   0.000032153056

The existence of these magic functions was conjectured by Cohn and Elkies [2] on the basis of numerical evidence and analogies with other problems in coding theory. Further evidence was obtained by Cohn and Kumar [3] in the course of proving that the Leech lattice is the densest lattice in $\mathbb{R}^{24}$, while Cohn and Miller [5] carried out an even more detailed study of the magic functions. These calculations left no doubt that the magic functions existed: one could compute them to fifty decimal places, plot them, approximate their roots and power series coefficients, etc. They were perfectly concrete and accessible functions, amenable to exploration and experimentation, which indeed uncovered various intriguing patterns. All that was missing was an existence proof.

However, proving existence was no easy matter. There was no sign of an explicit formula, or any other characterization that could lead to a proof. Instead, the magic functions seemed to come out of nowhere.

The fundamental difficulty is explaining where the magic comes from. One can optimize the auxiliary function in any dimension, but that will generally not produce a sharp bound for the packing density. Why should eight and twenty-four dimensions be any different? The numerical results show that the bound is nearly sharp in those dimensions, but why couldn’t it be exact for a hundred decimal places, followed by random noise? That’s not a plausible scenario for anyone with faith in the beauty of mathematics, but faith does not amount to a proof, and any proof must take advantage of special properties of these dimensions.

For comparison, the answer is far less nice in sixteen dimensions. By analogy with $r=\sqrt{2}$ when $n=8$ and $r=2$ when $n=24$, one might guess that $r=\sqrt{3}$ when $n=16$, but that bound cannot be achieved. Instead, numerical optimization seems to converge to $r^{2}=3.0252593116828820\dots$, which is close to $3$ but not equal to it. This number has not yet been identified exactly.

Despite the lack of an existence proof, the proof of Theorem 4.1 implicitly describes what the magic functions must look like:

Lemma 5.1.

Suppose $f$ satisfies the hypotheses of the linear programming bounds for sphere packing in $\mathbb{R}^{n}$, with $f(x)\leq 0$ for $|x|\geq r$, and suppose $\Lambda$ is a lattice in $\mathbb{R}^{n}$ with minimal vector length $r$. Then the density of $\Lambda$ equals the bound $\operatorname{vol}\big(B_{r/2}^{n}\big)$ from Theorem 4.1 if and only if $f$ vanishes on $\Lambda\setminus\{0\}$ and $\widehat{f}$ vanishes on $\Lambda^{*}\setminus\{0\}$.

Proof.

Recall that the proof of Theorem 4.1 for a lattice $\Lambda$ amounted to dropping all the nontrivial terms in the Poisson summation formula, to obtain the inequality

\[ f(0)\geq\sum_{x\in\Lambda}f(x)=\frac{1}{\operatorname{vol}(\mathbb{R}^{n}/\Lambda)}\sum_{y\in\Lambda^{*}}\widehat{f}(y)\geq\frac{\widehat{f}(0)}{\operatorname{vol}(\mathbb{R}^{n}/\Lambda)}. \]

The only way this argument could yield a sharp bound is if all the omitted terms were already zero. In other words, $f$ proves that $\Lambda$ is an optimal sphere packing if and only if $f$ vanishes on $\Lambda\setminus\{0\}$ and $\widehat{f}$ vanishes on $\Lambda^{*}\setminus\{0\}$. ∎

As discussed in the previous section, without loss of generality we can assume that $f$ is a radial function, as is $\widehat{f}$. We know exactly where the roots of $f$ and $\widehat{f}$ should be, since $E_{8}=E_{8}^{*}$ with vector lengths $\sqrt{2k}$ for $k=1,2,\dots$, while $\Lambda_{24}=\Lambda_{24}^{*}$ with vector lengths $\sqrt{2k}$ for $k=2,3,\dots$. These roots should have order two, to avoid sign changes, except that the first root of $f$ should be a single root. See Figure 9 for a diagram.

[Figure 9: two schematic graphs, one of $f$ and one of $\widehat{f}$, each with the horizontal axis marked at $\sqrt{2}$, $\sqrt{4}$, $\sqrt{6}$, and $\sqrt{8}$.]
Figure 9. A schematic diagram showing the roots of the magic function $f$ and its Fourier transform $\widehat{f}$ in eight dimensions. The figure is not to scale, because the actual functions decrease too rapidly for an accurate plot to be illuminating.

Thus, our problem is simple to state: how can we construct a radial Schwartz function $f$ such that $f$ and $\widehat{f}$ have the desired roots and no others? Note that Poisson summation over $E_{8}$ or $\Lambda_{24}$ then implies that $f(0)=\widehat{f}(0)$, and flipping the sign of $f$ if necessary ensures that all the necessary inequalities hold.

Unfortunately it’s difficult to take advantage of this characterization. The problem is that it’s hard to control a function and its Fourier transform simultaneously: it’s easy to produce the desired roots in either one separately, but not at the same time. Our inability to control $f$ without losing control of $\widehat{f}$ is at the root of the Heisenberg uncertainty principle, and it’s a truly fundamental obstacle.

One natural way to approach this problem is to carry out numerical experiments. Cohn and Miller used functions of the form $f(x)=p(|x|^{2})e^{-\pi|x|^{2}}$ to approximate the magic functions, where $p$ is a polynomial chosen to force $f$ and $\widehat{f}$ to have many of the desired roots. Such an approximation can never be exact, since it has only finitely many roots, but it can come arbitrarily close to the truth. This investigation uncovered several noteworthy properties of the magic functions, which showed that they had unexpected structure. For example, if we normalize the magic functions $f_{8}$ and $f_{24}$ in $8$ and $24$ dimensions so that $f_{8}(0)=f_{24}(0)=1$, then Cohn and Miller conjectured that their second Taylor coefficients are rational:

\begin{align*}
f_{8}(x) &=1-\frac{27}{10}|x|^{2}+O\big(|x|^{4}\big), &\quad \widehat{f}_{8}(x) &=1-\frac{3}{2}|x|^{2}+O\big(|x|^{4}\big),\\
f_{24}(x) &=1-\frac{14347}{5460}|x|^{2}+O\big(|x|^{4}\big), &\quad \widehat{f}_{24}(x) &=1-\frac{205}{156}|x|^{2}+O\big(|x|^{4}\big).
\end{align*}

If all the higher-order coefficients had been rational as well, then it would have opened the door to determining these functions exactly, but frustratingly it seems that the other coefficients are far more subtle and presumably irrational. The magic functions retained their mystery, and this Taylor series behavior went unexplained until the exact formulas for the magic functions were discovered.

Given the difficulty of controlling $f$ and $\widehat{f}$ simultaneously, one natural approach is to split them into eigenfunctions of the Fourier transform. By Fourier inversion, every radial function $f$ satisfies $\widehat{\widehat{f\,}}=f$. Thus, if we set $f_{+}=\big(f+\widehat{f}\,\big)/2$ and $f_{-}=\big(f-\widehat{f}\,\big)/2$, then $f=f_{+}+f_{-}$ with $\widehat{f}_{+}=f_{+}$ and $\widehat{f}_{-}=-f_{-}$. Because $f$ and $\widehat{f}$ vanish at the same points, they share these roots with $f_{+}$ and $f_{-}$. Our goal is therefore to construct radial eigenfunctions of the Fourier transform with prescribed roots. The advantage of this approach is that it conveniently separates into two distinct problems, namely constructing the $+1$ and $-1$ eigenfunctions, but these problems remain difficult.

6. Modular forms

Ever since the Cohn-Elkies paper in 2003, number theorists had hoped to construct the magic functions using modular forms. The reasoning is simple: modular forms are deep and mysterious functions connected with lattices, as are the magic functions, so wouldn’t it make sense for them to be related? Unfortunately, they are entirely different sorts of functions, with no clear connection between them. That’s where matters stood until Viazovska discovered a remarkable integral transform, which enabled her to construct the magic functions using modular forms. We’ll get there shortly, but first let’s briefly review how modular forms work.

We’ll start with some examples. Every lattice $\Lambda$ has a theta series $\Theta_{\Lambda}$, defined by

(6.1) \[ \Theta_{\Lambda}(z)=\sum_{x\in\Lambda}e^{\pi i|x|^{2}z}. \]

This series converges when $\operatorname{Im}z>0$, and it defines an analytic function on the upper half-plane $\mathfrak{h}=\{z\in\mathbb{C}:\operatorname{Im}z>0\}$. To motivate the definition, think of the theta series as a generating function, where the coefficient of $e^{\pi itz}$ counts the number of $x\in\Lambda$ with $|x|^{2}=t$. However, there’s one aspect not explained by the generating function interpretation: why write this function in terms of $e^{\pi iz}$? Doing so may at first look like a gratuitous nod to Fourier series, but it leads to an elegant transformation law based on applying Poisson summation to a Gaussian:

Proposition 6.1.

If $\Lambda$ is a lattice in $\mathbb{R}^{n}$, then

\[ \Theta_{\Lambda}(z)=\frac{1}{\operatorname{vol}(\mathbb{R}^{n}/\Lambda)}\left(\frac{i}{z}\right)^{n/2}\Theta_{\Lambda^{*}}(-1/z) \]

for all $z\in\mathfrak{h}$.

Proof.

One of the most important properties of Gaussians is that the set of Gaussians is closed under the Fourier transform: the Fourier transform of a wide Gaussian is a narrow Gaussian, and vice versa. More precisely, for $t>0$ the Fourier transform of the Gaussian $x\mapsto e^{-t\pi|x|^{2}}$ on $\mathbb{R}^{n}$ is $x\mapsto t^{-n/2}e^{-\pi|x|^{2}/t}$. In fact, the same holds whenever $t$ is a complex number with $\operatorname{Re}t>0$, by analytic continuation. Then Poisson summation tells us that

\[ \sum_{x\in\Lambda}e^{-t\pi|x|^{2}}=\frac{1}{\operatorname{vol}(\mathbb{R}^{n}/\Lambda)}\sum_{y\in\Lambda^{*}}t^{-n/2}e^{-\pi|y|^{2}/t}. \]

Setting $z=it$, we find that

\[ \Theta_{\Lambda}(z)=\frac{1}{\operatorname{vol}(\mathbb{R}^{n}/\Lambda)}\left(\frac{i}{z}\right)^{n/2}\Theta_{\Lambda^{*}}(-1/z) \]

whenever $\operatorname{Im}z>0$, as desired. ∎
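Proposition 6.1 is easy to test numerically in the simplest case. The sketch below assumes the lattice $\Lambda=\mathbb{Z}\subset\mathbb{R}$, which is its own dual and has covolume $1$, so the proposition reads $\Theta_{\mathbb{Z}}(z)=(i/z)^{1/2}\Theta_{\mathbb{Z}}(-1/z)$. The sample point `z` is arbitrary, and the square root is the principal branch, which is the right one because $i/z$ has positive real part for $z$ in the upper half-plane.

```python
import cmath
import math

def theta_Z(z, terms=60):
    """Theta series (6.1) of the lattice Z: the sum over integers m of exp(pi*i*m^2*z)."""
    return 1 + 2 * sum(cmath.exp(1j * math.pi * m * m * z) for m in range(1, terms + 1))

def transformed(z):
    """Right side of Proposition 6.1 for Z, with n = 1 and covolume 1 (principal square root)."""
    return (1j / z) ** 0.5 * theta_Z(-1 / z)

z = 0.3 + 0.8j  # an arbitrary point in the upper half-plane
print(theta_Z(z), transformed(z))  # the two values should agree up to rounding error
```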

If we set $\Lambda=E_{8}$, then $\Lambda^{*}=E_{8}$ as well, and we find that

\[ \Theta_{E_{8}}(-1/z)=z^{4}\Theta_{E_{8}}(z). \]

Furthermore, $E_{8}$ is an even lattice, and hence the Fourier series (6.1) implies that

\[ \Theta_{E_{8}}(z+1)=\Theta_{E_{8}}(z). \]

These two symmetries are the most important properties of $\Theta_{E_{8}}$. For exactly the same reasons, the theta series of the Leech lattice $\Lambda_{24}$ satisfies

\[ \Theta_{\Lambda_{24}}(-1/z)=z^{12}\Theta_{\Lambda_{24}}(z)\qquad\text{and}\qquad\Theta_{\Lambda_{24}}(z+1)=\Theta_{\Lambda_{24}}(z). \]

The mappings $z\mapsto z+1$ and $z\mapsto-1/z$ generate a discrete group of transformations of the upper half-plane, called the modular group. It turns out to be the same as the action of the group $\operatorname{SL}_{2}(\mathbb{Z})$ on the upper half-plane by linear fractional transformations, but we will not need this fact except for naming purposes.

A modular form of weight $k$ for $\operatorname{SL}_{2}(\mathbb{Z})$ is a holomorphic function $\varphi\colon\mathfrak{h}\to\mathbb{C}$ such that $\varphi(z+1)=\varphi(z)$ and $\varphi(-1/z)=z^{k}\varphi(z)$ for all $z\in\mathfrak{h}$, while $\varphi(z)$ remains bounded as $\operatorname{Im}z\to\infty$. (The latter condition is called being holomorphic at infinity, because it means the singularity there is removable.) It’s not hard to show that the weight of a nonzero modular form must be nonnegative and even, and the only modular forms of weight zero are the constant functions.

We have seen that $\Theta_{E_{8}}$ and $\Theta_{\Lambda_{24}}$ satisfy the transformation laws for modular forms of weight $4$ and $12$, respectively, and it is easy to check that they are holomorphic at infinity. Thus, these theta series are modular forms.

There are a number of other well-known modular forms. For example, the Eisenstein series $E_{k}$ defined by

\[ E_{k}(z)=\frac{1}{2\zeta(k)}\sum_{\substack{(m,n)\in\mathbb{Z}^{2}\\ (m,n)\neq(0,0)}}\frac{1}{(mz+n)^{k}} \]

is a modular form of weight $k$ for $\operatorname{SL}_{2}(\mathbb{Z})$ whenever $k$ is an even integer greater than $2$ (while it vanishes when $k$ is odd). The proofs of the required identities $E_{k}(z+1)=E_{k}(z)$ and $E_{k}(-1/z)=z^{k}E_{k}(z)$ simply amount to rearranging the sum. Here $\zeta$ denotes the Riemann zeta function, and $2\zeta(k)$ is a normalizing factor. The advantage of this normalization is that it leads to the Fourier expansion

(6.2) \[ E_{k}(z)=1+\frac{2}{\zeta(1-k)}\sum_{m=1}^{\infty}\sigma_{k-1}(m)e^{2\pi imz}, \]

where $\sigma_{k-1}(m)$ is the sum of the $(k-1)$-st powers of the divisors of $m$ and $\zeta(1-k)$ turns out to be a rational number.

The notational conflict between the Eisenstein series $E_{k}$ and the $E_{8}$ lattice is unfortunate, but both notations are well established. Fortunately, we will never need to set $k=8$, and the context should easily distinguish between Eisenstein series and lattices.

Modular forms are highly constrained objects, which makes coincidences commonplace. For example, $\Theta_{E_{8}}$ is the same as $E_{4}$, because there is a unique modular form of weight $4$ for $\operatorname{SL}_{2}(\mathbb{Z})$ with constant term $1$. Equivalently, for $m=1,2,\dots$ there are exactly $240\sigma_{3}(m)$ vectors $x\in E_{8}$ with $|x|^{2}=2m$. The theta series $\Theta_{\Lambda_{24}}$ is not an Eisenstein series, but it can be written in terms of them as

\[ \Theta_{\Lambda_{24}}=\frac{7}{12}E_{4}^{3}+\frac{5}{12}E_{6}^{2}. \]
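Both coincidences can be checked by machine for small coefficients. The following sketch works with truncated series in $q=e^{2\pi iz}$, using the values $2/\zeta(-3)=240$ and $2/\zeta(-5)=-504$ in (6.2). For the brute-force count it assumes the standard description of $E_{8}$ as the vectors in $\mathbb{Z}^{8}\cup(\mathbb{Z}+\tfrac12)^{8}$ whose coordinates sum to an even integer, which is not restated in this section; all function names are illustrative.

```python
from fractions import Fraction
from itertools import product

N = 3  # number of q-series coefficients to keep, where q = exp(2*pi*i*z)

def sigma(power, m):
    """Sum of the given power of the positive divisors of m."""
    return sum(d ** power for d in range(1, m + 1) if m % d == 0)

def eisenstein(k, leading):
    """Coefficients of E_k = 1 + leading * sum_m sigma_{k-1}(m) q^m, as in (6.2)."""
    return [Fraction(1)] + [Fraction(leading * sigma(k - 1, m)) for m in range(1, N)]

def multiply(a, b):
    """Product of two q-series, truncated after N coefficients."""
    out = [Fraction(0)] * N
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < N:
                out[i + j] += x * y
    return out

E4 = eisenstein(4, 240)    # 2/zeta(-3) = 240
E6 = eisenstein(6, -504)   # 2/zeta(-5) = -504
E4_cubed = multiply(multiply(E4, E4), E4)
E6_squared = multiply(E6, E6)
theta_leech = [Fraction(7, 12) * a + Fraction(5, 12) * b
               for a, b in zip(E4_cubed, E6_squared)]

def count_e8(m):
    """Brute-force count of x in E8 with |x|^2 = 2m, using doubled coordinates y = 2x."""
    target = 8 * m                  # |y|^2 = 4|x|^2
    bound = int(target ** 0.5)
    count = 0
    for parity in (0, 1):           # y all even (x integral) or all odd (x half-integral)
        values = [v for v in range(-bound, bound + 1) if v % 2 == parity]
        for y in product(values, repeat=8):
            # sum(x) even is the same as sum(y) divisible by 4
            if sum(v * v for v in y) == target and sum(y) % 4 == 0:
                count += 1
    return count

counts = {m: count_e8(m) for m in (1, 2)}
print(counts, [int(c) for c in E4[1:]])   # vector counts versus 240*sigma_3(m)
print([int(c) for c in theta_leech])      # q^0, q^1, q^2 coefficients of the Leech theta series
```

The vanishing coefficient of $q$ in the last line reflects the fact that the Leech lattice has no vectors with $|x|^{2}=2$, consistent with its minimal vector length of $2$.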

More generally, let $\mathcal{M}_{k}$ denote the space of modular forms of weight $k$ for $\operatorname{SL}_{2}(\mathbb{Z})$. Then $\bigoplus_{k\geq 0}\mathcal{M}_{k}$ is a graded ring, because the product of modular forms of weights $k$ and $\ell$ is a modular form of weight $k+\ell$. This ring is isomorphic to a polynomial ring on two generators, namely $E_{4}$ and $E_{6}$. In other words, the set

\[ \left\{E_{4}^{i}E_{6}^{j}:\text{$i,j\geq 0$ and $4i+6j=k$}\right\} \]

is a basis for the modular forms of weight $k$. In particular, there is no nonzero modular form of weight $2$ for $\operatorname{SL}_{2}(\mathbb{Z})$, because the weights of $E_{4}$ and $E_{6}$ are too high to generate such a form.

One cannot obtain a modular form of weight $2$ by setting $k=2$ in the double sum definition of $E_{k}$. The problem is that rearranging the terms is crucial for proving modularity, but when $k=2$ the series converges only conditionally, not absolutely. Instead, we can define $E_{2}$ using (6.2). That defines a merely quasimodular form, rather than an actual modular form, because one can show that $E_{2}(-1/z)=z^{2}E_{2}(z)-6iz/\pi$ rather than $z^{2}E_{2}(z)$. This imperfect Eisenstein series will play a role in constructing the magic functions.
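The corrected transformation law for $E_{2}$ can be observed numerically from the Fourier expansion (6.2). The sketch below uses $2/\zeta(-1)=-24$ and $2/\zeta(-3)=240$ and truncates the series; the sample point `z` and the helper names are arbitrary choices made for illustration. For contrast it also checks the genuine modular law for $E_{4}$.

```python
import cmath
import math

def sigma(power, m):
    """Sum of the given power of the positive divisors of m."""
    return sum(d ** power for d in range(1, m + 1) if m % d == 0)

def eisenstein(k, leading, z, terms=80):
    """Evaluate 1 + leading * sum_m sigma_{k-1}(m) exp(2*pi*i*m*z), a truncation of (6.2)."""
    q = cmath.exp(2j * math.pi * z)
    return 1 + leading * sum(sigma(k - 1, m) * q ** m for m in range(1, terms + 1))

def E2(z):
    return eisenstein(2, -24, z)   # 2/zeta(-1) = -24

def E4(z):
    return eisenstein(4, 240, z)   # 2/zeta(-3) = 240

z = 0.2 + 1.1j  # an arbitrary point in the upper half-plane
lhs = E2(-1 / z)
rhs = z ** 2 * E2(z) - 6j * z / math.pi   # quasimodular law, including the correction term
naive = z ** 2 * E2(z)                    # what a true weight-2 modular form would satisfy

print(abs(lhs - rhs))                     # should be at the level of rounding error
print(abs(lhs - naive))                   # should equal |6z/pi|, which is not small
print(abs(E4(-1 / z) - z ** 4 * E4(z)))   # E4 is genuinely modular
```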

By default all modular forms are required to be holomorphic, but we can of course consider quotients that are no longer holomorphic. A meromorphic modular form is the quotient of two modular forms, and it is weakly holomorphic if it is holomorphic on $\mathfrak{h}$ (but not necessarily at infinity). Unlike the holomorphic case, there is an infinite-dimensional space of weakly holomorphic modular forms of each even weight, positive or negative. Allowing a pole at infinity offers tremendous flexibility.

On the face of it, modular forms seem to have little to do with the magic functions. In particular, it’s not clear what modular forms have to do with the radial Fourier transform in $n$ dimensions. One hint that they may be relevant comes from the Laplace transform. As we saw when we looked at theta series, Gaussians are a particularly useful family of functions for which we can easily compute the Fourier transform. It’s natural to define a function $f$ as a continuous linear combination of Gaussians via

\[ f(x)=\int_{0}^{\infty}e^{-t\pi|x|^{2}}g(t)\,dt, \]

where the weighting function $g(t)$ gives the coefficient of the Gaussian $e^{-t\pi|x|^{2}}$. This formula is simply the Laplace transform of $g$, evaluated at $\pi|x|^{2}$.

Assuming $g$ is sufficiently well behaved, we can compute $\widehat{f}$ by interchanging the Fourier transform with the integral over $t$ and then substituting $1/t$ for $t$, which yields

\begin{align*}
\widehat{f}(y) &=\int_{0}^{\infty}t^{-n/2}e^{-\pi|y|^{2}/t}g(t)\,dt\\
&=\int_{0}^{\infty}e^{-t\pi|y|^{2}}t^{n/2-2}g(1/t)\,dt.
\end{align*}

In other words, taking the Fourier transform of $f$ amounts to replacing $g$ with $t\mapsto t^{n/2-2}g(1/t)$.

As a consequence, if $g(1/t)=\varepsilon t^{2-n/2}g(t)$ with $\varepsilon=\pm 1$, then $\widehat{f}=\varepsilon f$. Thus, we can construct eigenfunctions of the Fourier transform by taking the Laplace transform of functions satisfying a certain functional equation. What’s noteworthy about this functional equation is how much it looks like the transformation law for a modular form on the imaginary axis. If we set $g(t)=\varphi(it)$, then the modular form equation $\varphi(-1/z)=z^{k}\varphi(z)$ with $z=it$ corresponds to $g(1/t)=i^{k}t^{k}g(t)$. If $\varphi$ is a meromorphic modular form of weight $k=2-n/2$ that vanishes at $i\infty$ and has no poles on the imaginary axis, then $f$ is a radial eigenfunction of the Fourier transform in $\mathbb{R}^{n}$ with eigenvalue $i^{k}$.
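The first claim, that the functional equation for $g$ forces $\widehat{f}=\varepsilon f$, can be tested numerically in a toy case. The sketch below takes $n=1$ and $\varepsilon=+1$, so the condition is $g(1/t)=t^{3/2}g(t)$, and uses the hypothetical weight $g(t)=t^{-3/4}e^{-\pi(t+1/t)}$, chosen only because it satisfies that equation and decays rapidly at both ends; it does not come from a modular form. Both integrals are approximated by the trapezoid rule, with step sizes and cutoffs that are illustrative assumptions.

```python
import math

def g(t):
    """Hypothetical weight satisfying g(1/t) = t**1.5 * g(t), the case n = 1, epsilon = +1."""
    return t ** -0.75 * math.exp(-math.pi * (t + 1 / t))

def f(x, step=0.05, limit=8.0):
    """f(x) = integral over t > 0 of exp(-t*pi*x^2) g(t) dt, by the trapezoid rule in u = log t."""
    count = int(round(2 * limit / step))
    total = 0.0
    for k in range(count + 1):
        t = math.exp(-limit + k * step)
        total += math.exp(-t * math.pi * x * x) * g(t) * t   # dt = t du
    return total * step

def fourier_transform_of_f(y, step=0.05, cutoff=6.0):
    """Integral of f(x) cos(2*pi*x*y) over the real line, using the fact that f is even."""
    count = int(round(cutoff / step))
    total = f(0.0)
    for k in range(1, count + 1):
        x = k * step
        total += 2 * f(x) * math.cos(2 * math.pi * x * y)
    return total * step

for y in (0.0, 0.5, 1.0):
    print(y, f(y), fourier_transform_of_f(y))   # the last two columns should agree
```

Replacing `g` by a function that violates the functional equation, for instance dropping the factor `t ** -0.75`, should make the two columns disagree.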

Of course this is far from the only way to construct Fourier eigenfunctions, but it’s a natural way to construct them from modular forms. As stated here, it’s clearly not flexible enough to construct the magic functions, because it produces only one eigenvalue. If we take $n=8$ and weight $k=2-n/2=-2$, then $i^{k}=-1$, so we can construct a $-1$ eigenfunction but not a $+1$ eigenfunction for the same dimension. This turns out not to be a serious obstacle: there are many variants of modular forms (for other groups or with characters), and it’s not hard to produce eigenfunctions with both eigenvalues. However, there’s a much worse problem. If we build an eigenfunction this way, then there’s no obvious way to control the roots of the eigenfunction using the Laplace transform. Given that our goal is to prescribe the roots, this approach seems to be useless. What’s holding us back is that we have not taken full advantage of the modular form: we are using only the identity $\varphi(-1/z)=z^{k}\varphi(z)$, and not $\varphi(z+1)=\varphi(z)$.

7. Viazovska’s proof

The fundamental problem with the Laplace transform approach in the previous section is that it seems to be impossible to achieve the desired roots. Viazovska gets around this difficulty by a bold construction: she simply inserts the desired roots by brute force, by including an explicit factor of $\sin^{2}\big(\pi|x|^{2}/2\big)$, which vanishes to second order at $|x|=\sqrt{2k}$ for $k=1,2,\dots$ and fourth order at $x=0$. In her construction for eight dimensions, both eigenfunctions have the form

(7.1) \[ \sin^{2}\big(\pi|x|^{2}/2\big)\int_{0}^{\infty}g(t)e^{-\pi|x|^{2}t}\,dt \]

for some function $g$.

One obvious issue with this approach is that $\sin^{2}\big(\pi|x|^{2}/2\big)$ vanishes more often than we would like. Specifically, it vanishes to fourth order when $x=0$ and second order when $|x|=\sqrt{2}$, whereas we wish to have no root when $x=0$ and only a first-order root when $|x|=\sqrt{2}$. To avoid this difficulty, the integral in (7.1) must have poles at $0$ and $\sqrt{2}$ as a function of $|x|$, which cancel the unwanted roots. The integral will converge only for $|x|>\sqrt{2}$, but the function defined by (7.1) extends to $|x|\leq\sqrt{2}$ by analytic continuation.

Which choices of $g$ will produce eigenfunctions of the Fourier transform in $\mathbb{R}^{8}$? This is not clear, because the factor of $\sin^{2}\big(\pi|x|^{2}/2\big)$ disrupts the straightforward Laplace transform calculations from the end of Section 6. Instead, Viazovska writes the sine function in terms of complex exponentials and carries out elegant contour integral arguments to show that (7.1) gives an eigenfunction whenever $g$ satisfies certain transformation laws. Identifying the right conditions on $g$ is not at all obvious, and it’s the heart of her paper.

To get a $+1$ eigenfunction, Viazovska shows that it suffices to take $g(t)=t^{2}\varphi(i/t)$, where $\varphi$ is a weakly holomorphic quasimodular form of weight $0$ and depth $2$ for $\operatorname{SL}_{2}(\mathbb{Z})$. Here, a quasimodular form of depth $2$ is a quadratic polynomial in $E_{2}$ with modular forms as coefficients, where $E_{2}$ is the Eisenstein series of weight $2$. Recall that $E_{2}$ fails to be a modular form because of the strange transformation law $E_{2}(-1/z)=z^{2}E_{2}(z)-6iz/\pi$, but that functional equation works perfectly here.

To get a $-1$ eigenfunction, Viazovska shows that it suffices to take $g(t)=\psi(it)$, where $\psi$ is a weakly holomorphic modular form of weight $-2$ for a subgroup of $\operatorname{SL}_{2}(\mathbb{Z})$ called $\Gamma(2)$ and $\psi$ satisfies the additional functional equation

\[ \psi(z)=\psi(z+1)+z^{2}\psi(-1/z). \]

We have not discussed modular forms for other groups such as $\Gamma(2)$, but they are similar in spirit to those for $\operatorname{SL}_{2}(\mathbb{Z})$. In particular, the ring of modular forms for $\Gamma(2)$ is generated by two forms of weight $2$, namely $\Theta_{\mathbb{Z}}^{4}$ (the fourth power of the theta series of the one-dimensional integer lattice) and its translate $z\mapsto\Theta_{\mathbb{Z}}^{4}(z+1)$.

These conditions for φ\varphi and ψ\psi are every bit as arcane as they look. It’s far from obvious that they lead to eigenfunctions, but Viazovska’s contour integral proof shows that they do. Even once we know that this method gives eigenfunctions, it’s unclear how to choose φ\varphi and ψ\psi to yield the magic eigenfunctions, or whether this is possible at all.

Fortunately, one can write down some necessary conditions, and then the simplest functions satisfying those conditions work perfectly. In particular, we can take

φ=4π(E2E4E6)25(E62E43)\varphi=\frac{4\pi(E_{2}E_{4}-E_{6})^{2}}{5(E_{6}^{2}-E_{4}^{3})}

and

ψ=32Θ4|T(5Θ85Θ4|TΘ4+2Θ8|T)15πΘ8(Θ4Θ4|T)2,\psi=-\frac{32\Theta_{\mathbb{Z}}^{4}|_{T}\big{(}5\Theta_{\mathbb{Z}}^{8}-5\Theta_{\mathbb{Z}}^{4}|_{T}\Theta_{\mathbb{Z}}^{4}+2\Theta_{\mathbb{Z}}^{8}|_{T}\big{)}}{15\pi\Theta_{\mathbb{Z}}^{8}\big{(}\Theta_{\mathbb{Z}}^{4}-\Theta_{\mathbb{Z}}^{4}|_{T}\big{)}^{2}},

where $f|_{T}$ denotes the translate $z\mapsto f(z+1)$ of a function $f$.
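As a sanity check on the formula for $\psi$, one can test the functional equation $\psi(z)=\psi(z+1)+z^{2}\psi(-1/z)$ numerically. This is only a sketch: it assumes the theta series $\Theta_{\mathbb{Z}}(z)=\sum_{n\in\mathbb{Z}}e^{\pi in^{2}z}$, and the helper names, the truncation, and the sample point are hypothetical choices made for illustration.

```python
import cmath
import math

def theta(z, terms=12):
    """Truncated theta series of the integer lattice: sum over n of exp(pi i n^2 z)."""
    return 1 + 2 * sum(cmath.exp(1j * math.pi * n * n * z) for n in range(1, terms + 1))

def psi(z):
    """The function psi from the displayed formula, in terms of theta series."""
    a = theta(z) ** 4       # Theta_Z^4
    b = theta(z + 1) ** 4   # its translate Theta_Z^4 |_T
    numerator = 32 * b * (5 * a**2 - 5 * b * a + 2 * b**2)
    denominator = 15 * math.pi * a**2 * (a - b) ** 2
    return -numerator / denominator

z = 0.3 + 0.9j  # arbitrary sample point in the upper half-plane
psi_residual = abs(psi(z) - (psi(z + 1) + z**2 * psi(-1 / z)))
print("residual of the functional equation:", psi_residual)
```

A residual at the level of floating-point rounding is consistent with the identity but of course does not prove it.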

Thus, to obtain the magic function for $E_{8}$ we set

\[ f(x)=\sin^{2}\big(\pi|x|^{2}/2\big)\int_{0}^{\infty}\big(t^{2}\varphi(i/t)+\psi(it)\big)e^{-\pi|x|^{2}t}\,dt \tag{7.2} \]

for the specific $\varphi$ and $\psi$ identified by Viazovska. Because the $\varphi$ and $\psi$ terms yield eigenfunctions of the Fourier transform, we find that

\[ \widehat{f}(y)=\sin^{2}\big(\pi|y|^{2}/2\big)\int_{0}^{\infty}\big(t^{2}\varphi(i/t)-\psi(it)\big)e^{-\pi|y|^{2}t}\,dt. \]

The integral in the formula for $f(x)$ converges only when $|x|>\sqrt{2}$, but the one in the formula for $\widehat{f}(y)$ turns out to converge whenever $|y|>0$, because the problematic growth of the integrand cancels in the difference $t^{2}\varphi(i/t)-\psi(it)$.
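The growth in question can be made explicit. The following expansions as $t\to\infty$ are a sketch obtained by substituting the standard $q$-expansions of $E_{2}$, $E_{4}$, $E_{6}$, and $\Theta_{\mathbb{Z}}$ into the formulas for $\varphi$ and $\psi$ and using the transformation laws under $z\mapsto -1/z$; they address only the behavior at large $t$, not near $t=0$.

```latex
\begin{align*}
t^{2}\varphi(i/t) &= -\frac{e^{2\pi t}}{60\pi} + 4t - \frac{42}{5\pi} + O\big(t^{2}e^{-2\pi t}\big),\\
\psi(it) &= -\frac{e^{2\pi t}}{60\pi} - \frac{12}{5\pi} + O\big(e^{-\pi t}\big),\\
t^{2}\varphi(i/t)+\psi(it) &= -\frac{e^{2\pi t}}{30\pi} + O(t),\\
t^{2}\varphi(i/t)-\psi(it) &= 4t - \frac{6}{\pi} + O\big(e^{-\pi t}\big).
\end{align*}
```

Against the factor $e^{-\pi|x|^{2}t}$, the term $e^{2\pi t}$ in the sum is integrable at infinity exactly when $|x|^{2}>2$, while the linear growth of the difference is harmless for every $|y|>0$.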

These formulas define Schwartz functions that have the desired roots, and one can check that $f(0)=\widehat{f}(0)=1$, but it’s not obvious that they satisfy the inequalities $f(x)\leq 0$ for $|x|\geq\sqrt{2}$ and $\widehat{f}(y)\geq 0$ for all $y$, because there might be additional sign changes. In fact, these inequalities hold for a fundamental reason:

\[ t^{2}\varphi(i/t)+\psi(it)<0\qquad\text{and}\qquad t^{2}\varphi(i/t)-\psi(it)>0 \tag{7.3} \]

for all $t\in(0,\infty)$. In other words, the inequalities already hold at the level of the quasimodular forms, with no need to worry about the Laplace transform except to observe that it preserves positivity. Note that the restriction of the inequality $f(x)\leq 0$ to $|x|\geq\sqrt{2}$ fits perfectly into this framework, because the integral in (7.2) diverges for $|x|<\sqrt{2}$ and thus we do not obtain $f(x)\leq 0$ there. All that remains is to prove the inequalities (7.3). Unfortunately, no simple proof of these inequalities is known at present, but one can verify them by reducing the problem to a finite calculation.
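A floating-point spot check of (7.3) at a few values of $t$ is straightforward. To be clear, this is not the finite calculation used in the proof, only an illustrative sketch: it assumes the standard $q$-expansions of $E_{2}$, $E_{4}$, $E_{6}$, and the function names, truncation lengths, and sample points are hypothetical choices suited to moderate $t$.

```python
import math

def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d**k for d in range(1, n + 1) if n % d == 0)

def eisenstein(weight, coeff, q, terms=30):
    """Truncated series 1 + coeff * sum_{n>=1} sigma_{weight-1}(n) q^n for real 0 < q < 1."""
    return 1 + coeff * sum(sigma(weight - 1, n) * q**n for n in range(1, terms + 1))

def phi_on_imaginary_axis(s):
    """phi(i s) for real s > 0, where q = exp(-2 pi s) is real."""
    q = math.exp(-2 * math.pi * s)
    e2 = eisenstein(2, -24, q)
    e4 = eisenstein(4, 240, q)
    e6 = eisenstein(6, -504, q)
    return 4 * math.pi * (e2 * e4 - e6) ** 2 / (5 * (e6**2 - e4**3))

def psi_on_imaginary_axis(t, terms=12):
    """psi(i t) for real t > 0; both theta values are real on the imaginary axis."""
    theta = 1 + 2 * sum(math.exp(-math.pi * n * n * t) for n in range(1, terms + 1))
    theta_shift = 1 + 2 * sum((-1) ** n * math.exp(-math.pi * n * n * t)
                              for n in range(1, terms + 1))
    a, b = theta**4, theta_shift**4  # Theta_Z^4 and its translate by 1
    return -32 * b * (5 * a**2 - 5 * b * a + 2 * b**2) / (15 * math.pi * a**2 * (a - b) ** 2)

def kernels(t):
    """Return the two integrands of (7.3): the sum and the difference."""
    plus_part = t**2 * phi_on_imaginary_axis(1 / t)
    minus_part = psi_on_imaginary_axis(t)
    return plus_part + minus_part, plus_part - minus_part

samples = [0.5, 0.75, 1.0, 1.5, 2.0]  # moderate values where the truncations are accurate
results = {t: kernels(t) for t in samples}
for t, (total, difference) in results.items():
    print(f"t={t}: sum={total:.6f}, difference={difference:.6f}")
```

For much larger $t$ the difference is a small remainder of two exponentially large terms, and for much smaller $t$ the series converge slowly, so a serious verification needs higher precision and rigorous error bounds rather than this sketch.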

Thus, Viazovska’s formula (7.2) defines the long-sought magic function for $E_{8}$ and solves the sphere packing problem in eight dimensions. What about twenty-four dimensions? The same basic approach works, but choosing the quasimodular forms requires more effort. Fortunately, the conjectures by Cohn and Miller can be used to help pin down the right choices. Once the magic function has been identified, there are additional technicalities involved in verifying the inequality for $\widehat{f}$, but these challenges can be overcome, which leads to a solution of the sphere packing problem in twenty-four dimensions.

8. Future prospects

Nobody expects Viazovska’s proof to generalize to any other dimensions above two. Why just eight and twenty-four? At one level, we really don’t know why. Nobody has been able to find a proof, or even a compelling heuristic argument, that rules out similar phenomena in higher dimensions. We can’t even rule out the possibility that linear programming bounds might solve the sphere packing problem in every sufficiently high dimension, although that’s clearly ridiculous.

Despite our lack of understanding, the special role of eight and twenty-four dimensions aligns with our experience elsewhere in mathematics. Mathematics is full of exceptional or sporadic phenomena that occur in only finitely many cases, and the $E_{8}$ and Leech lattices are prototypical examples. These objects do not occur in isolation, but rather in constellations of remarkable structures. For example, both $E_{8}$ and the Leech lattice are connected with binary error-correcting codes, combinatorial designs, spherical designs, finite simple groups, etc. Each of these connections constrains the possibilities, especially given the classification of finite simple groups, and there just doesn’t seem to be room for a similar constellation in higher dimensions.

Instead, solving the sphere packing problem in further dimensions will presumably require new techniques. One particularly attractive case is the $D_{4}$ root lattice, which is surely the best sphere packing in $\mathbb{R}^{4}$. This lattice shares some of the wonderful properties of $E_{8}$ and the Leech lattice, but not enough for the four-dimensional linear programming bound to be sharp. It would be a plausible target for any generalization of this bound, and in fact such a generalization may be emerging.

Building on work of Schrijver, Bachoc and Vallentin, and other researchers, de Laat and Vallentin have generalized linear programming bounds to a hierarchy of semidefinite programming bounds [12]. Linear programming bounds are the first level of this hierarchy, which means that $E_{8}$ and the Leech lattice have the simplest possible proofs from this perspective. What about $D_{4}$? Perhaps this case can be solved at one of the next few levels of the hierarchy. Much work remains to be done here, and it’s unclear what the prospects are for any particular dimension, but it is not beyond hope that four dimensions could someday join eight and twenty-four among the solved cases of the sphere packing problem.

Acknowledgments

I am grateful to James Bernhard, Donald Cohn, Matthew de Courcy-Ireland, Stephen D. Miller, Frank Morgan, David Rohrlich, Achill Schürmann, Frank Vallentin, and Maryna Viazovska for their feedback and suggestions.

Photo Credits

Figure 1 is courtesy of Daniil Yevtushynsky.

The photos in Figure 2 are courtesy of Mary Caisley, Mark Ostow, C. J. Mozzochi, and Julia Semikina, from left to right.

Figures 3 and 7 are courtesy of Henry Cohn.

Figure 6 is courtesy of Matthew Kownacki.

References

  • [1] H. Cohn, Packing, coding, and ground states, PCMI 2014 lecture notes, 2016. arXiv:1603.05202
  • [2] H. Cohn and N. Elkies, New upper bounds on sphere packings I, Ann. of Math. (2) 157 (2003), no. 2, 689–714. arXiv:math/0110009 MR1973059 doi:10.4007/annals.2003.157.689
  • [3] H. Cohn and A. Kumar, Optimality and uniqueness of the Leech lattice among lattices, Ann. of Math. (2) 170 (2009), no. 3, 1003–1050. arXiv:math/0403263 MR2600869 doi:10.4007/annals.2009.170.1003
  • [4] H. Cohn, A. Kumar, S. D. Miller, D. Radchenko, and M. Viazovska, The sphere packing problem in dimension 24, preprint, 2016. arXiv:1603.06518
  • [5] H. Cohn and S. D. Miller, Some properties of optimal functions for sphere packing in dimensions 8 and 24, preprint, 2016. arXiv:1603.04759
  • [6] J. H. Conway and N. J. A. Sloane, What are all the best sphere packings in low dimensions?, Discrete Comput. Geom. 13 (1995), no. 3–4, 383–403. MR1318784 doi:10.1007/BF02574051
  • [7] J. H. Conway and N. J. A. Sloane, Sphere packings, lattices and groups, third edition, Grundlehren der Mathematischen Wissenschaften 290, Springer, New York, 1999. MR1662447 doi:10.1007/978-1-4757-6568-7
  • [8] T. C. Hales, Cannonballs and honeycombs, Notices Amer. Math. Soc. 47 (2000), no. 4, 440–449. MR1745624
  • [9] T. C. Hales, A proof of the Kepler conjecture, Ann. of Math. (2) 162 (2005), no. 3, 1065–1185. MR2179728 doi:10.4007/annals.2005.162.1065
  • [10] T. Hales, M. Adams, G. Bauer, D. T. Dang, J. Harrison, T. L. Hoang, C. Kaliszyk, V. Magron, S. McLaughlin, T. T. Nguyen, T. Q. Nguyen, T. Nipkow, S. Obua, J. Pleso, J. Rute, A. Solovyev, A. H. T. Ta, T. N. Tran, D. T. Trieu, J. Urban, K. K. Vu, and R. Zumkeller, A formal proof of the Kepler conjecture, preprint, 2015. arXiv:1501.02155
  • [11] G. A. Kabatyanskii and V. I. Levenshtein, Bounds for packings on a sphere and in space, Problems Inform. Transmission 14 (1978), no. 1, 1–17. MR0514023
  • [12] D. de Laat and F. Vallentin, A semidefinite programming hierarchy for packing problems in discrete geometry, Math. Program. 151 (2015), no. 2, Ser. B, 529–553. arXiv:1311.3789 MR3348162 doi:10.1007/s10107-014-0843-4
  • [13] D. de Laat and F. Vallentin, A breakthrough in sphere packing: the search for magic functions, Nieuw Arch. Wiskd. (5) 17 (2016), no. 3, 184–192. arXiv:1607.02111
  • [14] A. Venkatesh, A note on sphere packings in high dimension, Int. Math. Res. Not. 2013 (2013), no. 7, 1628–1642. MR3044452 doi:10.1093/imrn/rns096
  • [15] M. S. Viazovska, The sphere packing problem in dimension 8, preprint, 2016. arXiv:1603.04246