Packing pennies on an infinite table top

An illustrated proof of Kepler's conjecture in 2D

by Bill Casselman

This and the other image nearby are from Kepler's pamphlet on snowflakes. Contrary to what one might think at first. they are not of two dimensional objects, but rather an attempt to render on the page three dimensional packings of spheres.
In his book De nive sexangula (`On the six-sided snowflake') of 1611, Kepler asserted that the packing in three dimensions made familiar to us by fruit stands (called the face-centred cubic packing by crystallographers) was the tightest possible: Coaptatio fiet arctissima: ut nullo praetera ordine plures globuli in idem vas compingi queant.

He didn't elaborate much, and his statement lacks precision. It is almost certain that he had no idea that this assertion required rigorous proof. At any rate, this claim came to be known as Kepler's conjecture, and it turned out to be extremely difficult to verify.

Kepler quite likely would have thought that the analogous assertion about the hexagonal packing in 2D was even more obvious. However, it took about 300 years before it was proven, by the Norwegian mathematician Axel Thue. It is arguable that it took that long just to understand that such an `obvious' assertion required proof. It took another century before a proof of the much more difficult claim about 3D was found, by Tom Hales.

Both images are from photographs taken of the copy of the original edition of Kepler's pamphlet now located at the Thomas L. Fisher Library at the University of Toronto.



Kepler's assertions were possibly prompted by correspondence beginning in the year 1606 between him and the remarkable English mathematician Thomas Harriot. And Harriot's interest was perhaps prompted by a question his employer, Sir Walter Raleigh, had asked him much earlier about how to count the cannon balls in stacks on a ship. (Such were applied mathematics and the military-industrial complex in the XVI and XVII century.) Now Thue's and Hales's theorems have little to do with real world packings in a finite region. Optimal packings of finite regions are ridiculously difficult to ascertain rigourously, even in the simplest cases. Thue's and Hales' theorems are concerned instead with ideal packings throughout all of the plane and space. Hales' proof is one of the most complicated yet required by any theorem, in any branch of mathematics, and I will say little about it here. But Hales observed that, by combining in this small dimension ideas of Fejes-Toth and C. A. Rogers, one could arrive at an extremely elementary proof of Thue's theorem.

This note will follow Hales' suggested argument for the 2D conjecture, filling in a few minor gaps here and there, and relying almost exlusively on illustrations and a few animations to explain the reasoning.



Statement of Thue's theorem

The hexagonal packing of discs in the plane is obtained by laying out a row of discs in a line, then successively adding rows on either side packed in as closely as possible. This coincides with what you get by fitting discs tightly inside a honeycomb pattern of hexagons.

Thue's theorem. No packing of non-overlapping discs of equal size in the plane has density higher than that of the hexagonal packing.

It is really impossible to imagine how it could be otherwise. We can also build the hexagonal packing in this way: we start with a single disc in the plane, and then place around it six others. In contrast to the similar construction in 3D, where spheres are placed around a sphere, it is clear that no more than six can be so placed. Furthermore this continues on for each of the new discs etc. to give a global packing, which has to be optimal - doesn't it? But no straightforward proof of the Theorem has yet been found.

The hexagonal packing is obtained by laying out rows of discs as close to one another as possible.

Density

What is meant by the density of a layout of discs in the plane? Density is measured by the fraction of area covered by the discs.

For example, the density of the layout on the left below is the ratio of the area of a circle to the square which just encloses it, and the density of the layout on the right is the ratio of the area of a circle to that of its circumscribing regular hexagon. It is visibly apparent, and easy to calculate explicitly, that the density on the right is greater. It is also easy to prove that any lattice packing (i.e. a packing in which the discs are located on an arithmetic lattice) has density at most that of the hexagonal packing, as the figures illustrate. The density of a lattice packing is the ratio of the area of a disc to that of a fundamental parallelogram, and among all lattice packings with a given size disc the hexagonal lattice clearly minimizes the area of a suitable fundamental parallelogram. This was observed first by Gauss.
The fundamental parallelogram of this lattice is a square, and the density of the distribution of discs is the ratio of the area of a circle to its circumscribing square.
The base of the fundamental parallelogram is necessarily the same, but its height is the smallest possible among lattice distributions of the disc. The density is therefore maximal among such distributions.

The statement of Thue's Theorem is a bit subtle, because the notion of density for an infinite layout, other than one associated to a lattice, is a bit subtle. Instead of trying to define exactly what we mean by the density of an arbitrary infinite layout, we shall just apply one intuitive principle which must follow from any valid definition.
Suppose we are given a distribution of non-overlapping discs on the plane, and suppose that the plane is partitioned into regions (not necessarily of finite area) surrounding each disc. For each one, calculate the ratio of the area of the disc to the area of the region. This is the the density of the distribution in that region. Then, no matter what the partition is, the density of the overall distribution cannot be greater than than the maximum of its densities in the various regions.

The density of the distribution cannot be greater than the maximum density among the smaller regions.

Now we shall associate to any distribution of equal-sized discs in the plane a partition of the plane into regions, with one disc in each region. We shall prove that the proportion of disc in each of those regions can be no more than the density of the hexagonal packing. This will show that no distribution will be denser than the hexagonal packing, which is Thue's theorem.

The Voronoi partition

Consider any collection of non-overlapping discs of equal size (possibly just isolated points). We associate to this collection a partition of the plane into regions called Voronoi cells. The Voronoi cell of a disc in the distribution is the set of points in the plane which are as close or closer to the centre of that disc than to the centre of any other disc in the distribution.

The partition we associate to any distribution of non-overlapping discs in the plane is its Voronoi partition. We shall show that for any Voronoi partition the ratio of the area of a disc to the area of its cell is at most the ratio of a disc to its circumscribing regular hexagon. This will prove Thue's Theorem.

The Voronoi cells determined by a distribution of discs.
This figure is live: the discs are movable.

There are a few characteristic properties of Voronoi cells that we shall require.
(1) Since all the discs are of equal size, a disc is contained within its Voronoi cell.
(2) If P and Q are two points in the plane, then the set of points closer to P than to Q is the half-plane containing P and bounded by the line half-way between them. In any configuration, therefore, the Voronoi cell of a disc is the intersection of all the half-planes determined in this way by that disc and other discs in the collection. It follows that every Voronoi cell is convex.
(3) For a layout of three discs, the Voronoi cells will be simplicial cones with a common vertex at one point - the triple point - which is the common intersection of all three bisectors between the points - except in the singular case when the bisectors are mutually parallel. This triple point will be equi-distant from each of the three discs. Draw from the triple point the pair of tangent line segments to each of the discs, making a kind of dunce's cap. The vertex angle of that cap will be the same for each disc, since this angle depends only on distance of the triple point from the disc. Because Voronoi cells are convex, the whole of each cap will be contained in a the Voronoi cell of its disc.

If two discs are close enough, then there will be a `dead' region
(shown in pink in the animation) on the line half-way between them
where the triple point for a third disc cannot be found.
Moving the third disc (which is `live') shows how the three vertex angles
remain equal, and the caps contained in Voronoi cells.

Suppose we fix two of the three discs and ask how this triple point varies as the third disc moves around. If the two discs are far enough apart, the triple point can be anywhere on the line bisecting the given pair. But as soon as they are close enough together, there will be a dead space on this line right in between the two discs where the triple point can never be found. This dead space will exist as soon as the centres of discs are closer than times their common diameter. As we decrease the distance between the two discs, the size of the dead region will expand. This is shown in the animation at left, where the dead space is highlighted in pink.

The figure also illustrates that the caps drawn from the triple point don't overlap (except possibly along their edges), and that their common vertex angle can be at most one-third of 360o or 120o.

The hexagonally circumscribed circle

The vertex angle of the cap decreases with distance from the disc.
Circumscribe a disc with a regular hexagon, and circumscribe the hexagon with a circle. This gives what I call the hexagonally circumscribed circle of the original disc. It is a concentric circle whose radius is times the original one. If P is any point outside the original disc, the two tangents from P to that disc bound what I call the cap corresponding to P and the disc. The key property we shall need later on is that the vertex angle at P is a decreasing function of the distance of P from the disc. This vertex angle will be exactly 120o precisely when P lies on the hexagonally circumscribed circle, because then it is a vertex of a circumscribed hexagon. So when this angle is less than 120o the point P must lie outside the hexagonally circumscribed circle. This vertex angle will be exactly 120o precisely when P lies on the hexagonally circumscribed circle, and when this angle is less than 120o the point lies outside the hexagonally circumscribed circle.
We shall need also another property of these circumscribed circles. Suppose two of them intersect, but that the discs themselves do not intersect. Then that intersection can only intersect the Voronoi cell of a third disc if the three discs are mutually touching in the configuration of discs in a hexagonal packing.

For suppose P to be a point in the intersection of two circumcribed circles, and also in the Voronoi cell of a third disc. The two discs are close enough that the point exactly half-way between them will be in the dead region where the triple point cannot be located. The triple point must therefore lie between P and this half-way point, and also in both hexagonally circumscribed circles. But each of the vertex angles at the capped discs from this triple point must then be at least 120o. This can only happen if this angle is exactly 120o and the three discs are mutually touching.

In the diagram to the right, this means that the points in the yellow rhombus are never in the Voronoi cell of the third disc.

The triple point never lies inside
the hexagonally circumscribed circles.
As a consequence, points in the yellow rhombus
never lie in the Voronoi cell of a third disc.

How dense can a disc be in its Voronoi cell?

We want to show that a disc take up no larger proportion of the area in its Voronoi cell than it does in its circumscribing regular hexagon. To show this, we partition the cell into sub-regions of different types.

Suppose we now look at a cell in a layout. Draw the circumscribing circle as well, or at least its intersection with the cell. Where it extends beyond the cell, the line cut off by the cell boundary will be the bisector of the rhombus associated to two discs whose hexagonally circumscribed circles intersect.

We can now divide up the cell into distinct regions of three types:
The points in the cell which do not belong to the enlarged disc
The points in the cell lying in a rhombus associated to a neighbouring cell.
The parts of the circumscribing disc not belonging to one of these rhomboi.
The density of the distribution in the first type of region is of course 0, since by definition such a region contains no part of a disc. The density of the distribution in the third type is just the ratio of the radius of the disc to that of the larger one. Both densities are strictly less than the density of the disc in its regular circumscribed hexagon. The crux of the argument is now to examine regions of the second type.

Partitioning several cells. This is `live'.

Why is the regular hexagon special?

So now we look at one of the regions of type II. We will show that the ratio of the area inside the disc to that of the whole type II region is at most equal to the ratio in the special case when the central angle is 60o. This can be done by an explicit calculation, and the claim then reduces to the observation that the function sin(x) / x is monotonic decreasing in the range between 0 and /2.

But it can also be demonstrated by a purely geometric argument. The figure on the left above shows the special case. In the figure on the right the animation generates the other cases. As the central angle decreases we also generate the image of the left disc under the unique linear transformation which acts by scaling vertically and horizontally, transforming the original (60o) triangle to the narrower isosceles one. This is indicated in the figure above. Linear transformations preserve ratios of areas. The animation then shows how the largest ratio of circular sector to triangle occurs in a type II sector when the central angle is 60o.

In summary, the density will achieve the maximum possible only when there are no regions of the first or third type and when the central angle of each of the regions of type II is exactly 60o. These criteria uniquely determine the cells in a hexagonal packing. The same proof explains the argument about the 6 discs surrounding a single one, since it shows that the hexagon circumscribing a disc is the Voronoi cell of smallest area containing it.

The analogous argument fails in 3D, since the smallest cell surrounding a sphere is known to be a regular dodecahedron, a shape which cannot partition all of space. In particular, as Kepler himself well knew and well described, this is not the same as the cell in the densest layout of spheres throughout all of space. It is this disparity between local and global behaviour that offers a serious obstacle to a simple proof of Kepler's conjecture in 3D.

References

The End