The Mathematical Intelligencer encourages comments about the material in this issue. Letters to the editor should be sent to the editor-in-chief, Chandler Davis.
A Ramanujan Puzzle
George E. Andrews's review of Ramanujan's Notebooks, Part III (Mathematical Intelligencer, Summer 1994), refers to an anecdote cited by Kanigel in his book The Man Who Knew Infinity, where Ramanujan casually (while he is cooking vegetables) solves a puzzle. One is asked to find an address, i.e., a house number n on a street with numbers 1 through N, such that the house numbers on one side of the address n add up to exactly the same sum as all the numbers on the other side of the address. Ramanujan answered by dictating to his friend Mahalanobis a continued fraction, and gave this explanation: "Immediately I heard the problem it was clear that the solution should obviously be a continued fraction; I then thought, Which continued fraction? And the answer came to my mind." Andrews suggests, "A number theorist should be able to fill in the missing parts of Ramanujan's explanation; however, Ramanujan's explanation itself is both startling and unilluminating."

We think Ramanujan dictated to his friend the continued fraction

$$6 - \cfrac{1}{6 - \cfrac{1}{6 - \cdots}}$$

and he also saw that the solutions satisfy

$$n_i^2 = \frac{N_i^2 + N_i}{2}$$

(where $n_0 = N_0 = 0$ and $n_1 = N_1 = 1$ for obvious reasons). No doubt, he was familiar with the properties of square triangular numbers, and knew therefore that

$$n_i = 6n_{i-1} - n_{i-2}, \qquad i \ge 2$$

(derivable from Pell's equation), and consequently the convergents of his continued fraction satisfy

$$c_i = \frac{n_i}{n_{i-1}} = 6 - \frac{1}{c_{i-1}};$$

the same holds for the $N_i$, because $2N_i + 1 = 6(2N_{i-1} + 1) - (2N_{i-2} + 1)$.

Numerical values:

i          0    1    2    3    4      5
n_i        0    1    6    35   204    1,189
2N_i + 1   1    3    17   99   577    3,363
N_i        0    1    8    49   288    1,681

We do agree that Ramanujan's methods are frequently obscure, but this puzzle is rather simple. Its solution, we believe, does not contribute to our understanding of his genius; it only shows that he was familiar with triangular numbers. This is not surprising. Such numbers have fascinated other mathematical geniuses, e.g., 17-year-old Gauss recorded in his diary his discovery: "Eureka! Triangle + Triangle + Triangle = Number" (every integer is a sum of three triangular numbers).

We, of course, spent considerable time arriving at our conclusions, not having the Goddess Namagiri to help us. We began with an inspired guess for a subset of solutions, postulating that

$$N_i = 8(n_k)^2,$$

where $i = 2k$. We communicated our guess to the Editor (Chandler Davis), who challenged us to prove this assertion. Our labors to prove the validity of this wild guess led us to Pell's equation, and its continued fraction form. To quote George Andrews, "Even if we never learn how Ramanujan actually thought, we can gain greatly by trying to find out..."

Paul Weidlinger
Joseph Wright
Weidlinger Associates, Inc.
333 Seventh Avenue
New York, NY 10001 USA
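The recurrences in this letter are easy to check numerically. The following Python sketch is only an illustration under stated assumptions: the function names `house_solutions` and `balances` are hypothetical, and the code simply iterates the two recurrences quoted in the letter, then confirms the house-number property, the convergent relation, and the guess that N_i = 8(n_k)^2 when i = 2k.

```python
from fractions import Fraction


def house_solutions(count):
    """Return the first `count` pairs (n_i, N_i), i = 0, 1, 2, ..."""
    n = [0, 1]   # n_0, n_1
    m = [1, 3]   # m_i = 2*N_i + 1, which obeys the same recurrence
    while len(n) < count:
        n.append(6 * n[-1] - n[-2])
        m.append(6 * m[-1] - m[-2])
    return [(n[i], (m[i] - 1) // 2) for i in range(count)]


def balances(n, N):
    """True when houses 1..n-1 and houses n+1..N have equal sums."""
    return sum(range(1, n)) == sum(range(n + 1, N + 1))


solutions = house_solutions(8)

# Every generated pair solves the puzzle and satisfies n^2 = (N^2 + N)/2.
checks = [balances(n, N) and 2 * n * n == N * N + N for n, N in solutions]

# Convergents c_i = n_i / n_{i-1}, defined from i = 2 onward because n_0 = 0.
convergents = [Fraction(solutions[i][0], solutions[i - 1][0]) for i in range(2, 8)]
convergent_rule = all(
    convergents[j] == 6 - 1 / convergents[j - 1] for j in range(1, len(convergents))
)

# The authors' guess: N_i = 8 * n_k^2 whenever i = 2k.
guess_holds = all(solutions[2 * k][1] == 8 * solutions[k][0] ** 2 for k in range(4))
```

Exact rational arithmetic (`Fraction`) is used for the convergents so that the relation c_i = 6 - 1/c_{i-1} is tested as an identity rather than to floating-point tolerance.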
Biographical Note

The excellent Gazette des Mathématiciens, published by the Société Mathématique de France, has not a few readers in common with The Mathematical Intelligencer. It was noticed by one of these (myself) that the report by Vladimir Arnold on the International Congress of Mathematicians appears both in The Intelligencer, vol. 17, no. 3, and simultaneously in the Gazette, no. 65. Somehow the editors of both magazines had remained ignorant that this duplication was occurring. Curiouser and curiouser: of the many barbs flung by Professor Arnold in his article, one of the sharpest, directed at French mathematicians, was added by him to The Intelligencer version but not to the French one.

The Editor
THE MATHEMATICAL INTELLIGENCER VOL. 17, NO. 4 © 1995 Springer-Verlag New York
The Impossible Problem Revisited Again
In the Mathematical Intelligencer, Winter 1995, pp. 27-33, Lee Sallows discussed the somewhat hyperbolically named "impossible" problem that was presented by Martin Gardner in his Mathematical Games column in the December 1979 issue of Scientific American magazine. Gardner returned to this problem in his columns of March and May 1980 in order to correct an error in his original solution, but as is evidenced by Sallows's article 15 years later, the impossible problem is still capable of creating confusion. The purpose of this letter is to try (finally) to clarify the situation.

The problem, as given by Gardner and reproduced by Sallows, is exactly as follows:

Two numbers (not necessarily different) are chosen from the range of positive integers greater than 1 and not greater than 20. Only the sum of the two numbers is given to mathematician S. Only the product of the two is given to mathematician P. On the telephone S says to P, "I see no way you can determine my sum." An hour later P calls him back to say, "I know your sum." Later S calls P again to report, "Now I know your product." What are the two numbers?

Sallows claims that the two numbers must be 2 and 6. We shall see that there is an ambiguity in the statement of the problem which admits at least two alternative interpretations, but I begin by showing that the numbers 2 and 6 cannot be a solution in either case, unless an unwarranted additional assumption is made.

If the numbers are 2 and 6, then player S knows only that the sum s is 8. With only that information available to him, we shall see that S cannot correctly claim that there is "no way" that P can determine the sum.

At the time that S makes this assertion in his first phone call, the only information available to P is the product p. There are some possible values of p, however, that would enable player P to determine the two numbers and thus to compute their sum. If p were equal to 15, for example, then P could infer that the two numbers were 3 and 5 (since these are the only two integers exceeding 1 that multiply to 15). If P is really unable to determine s, therefore, then p cannot be 15, and so S would have to be sure that p is not 15 in order for him to claim that P cannot determine s. That information (although true) is surely not available to S at the time of his call because all that he knows then is that s = 8, and this is consistent with the possibility that the numbers are 3 and 5, in which case p = 15. In summary, when S says that there is "no way" that P can determine s, he is wrong; as far as he knows, there is a way, namely, that the two numbers are 3 and 5.

The reasoning that led Lee Sallows to the conclusion that the numbers were 2 and 6 was based on some assumptions about how quickly mathematicians S and P would make certain deductions and how eager they would be to communicate these deductions to each other. Compare David Gale's version of a knowledge problem in the Fall 1994 issue, pp. 38-44, where a beeper is prescribed. Sallows believes, for example, that at the time of his call, S actually did know that p is different from 15 because otherwise, P would immediately have deduced that the two numbers were 3 and 5 and would already have placed a call to S telling him so. Sallows apparently believes that no mathematician could wait even a moment after proving a theorem before announcing it. The statement of the problem says nothing to indicate that the phone calls were made as soon as they possibly could be, and it does not seem reasonable to assume that they were.

Then what is the solution to the problem? Originally, Martin Gardner claimed that the two numbers must be 4 and 13, but in his follow-up, he said that this is incorrect and that the problem is truly impossible. Gardner explained that when he first heard the problem, the range of allowable integers was given as 2 through 100 and that the unique solution was the pair 4 and 13. In an attempt to simplify the problem for his column, he reduced the upper bound from 100 to 20, still safely exceeding the numbers 4 and 13 of the supposed solution. Gardner said in his follow-up, however, that this pair of numbers is not a solution to the problem as stated and that, in fact, the modified problem has no solution at all.

According to the problem, players S and P are told only the sum and product of the two numbers, respectively. Nevertheless, as is apparent from Gardner's discussion, he assumed that in addition, the players know that the two numbers are integers exceeding 1 and also that the upper bound is 20. He further assumed that each player is aware that the other knows these facts. The problem is still interesting (and the answer is different) if, instead, we assume that the players are not told the upper bound. (In all cases, I shall continue to assume that the players are publicly told that the numbers are integers exceeding 1.)

To see why there is no solution when the upper bound is 20 (and this information is publicly known), we need to determine which possible sums s ≥ 4 would enable player S to assert that there is "no way" that P can determine the sum. (Let us call such sums no-way numbers.) To determine if a given number s is a no-way number, we shall see that it is necessary to know whether or not an upper bound was divulged, and if so, we need to know the upper bound.

With an upper bound of 20, we know, of course, that 4 ≤ s ≤ 40. I argue that if this upper bound is public, then the only no-way number s is 11. Certainly, if s = u + v, where u and v are prime numbers less than 20, then s cannot be a no-way number because S knows that it is possible that p = uv, and in that case P could determine s. This eliminates all of the even numbers below 40 and it also eliminates 5, 7, 9, 13, 15, and 19. Next, I show that no number in the range 21 through 39 can
be a no-way number. To see why, suppose that s lies in this range and write s = 19 + v, where 2 ≤ v ≤ 20. Since the only factorization of p = 19v for which both factors lie in the range 2 through 20 is 19 × v, player S would know that there is the possibility that the numbers are 19 and v, in which case, P could determine the sum. (This is because S knows that P knows that the upper bound is 20.) The remaining numbers that we need to check are 11, 17, and 40. If s = 17 or s = 40, we can write s = 4 + 13 or s = 20 + 20 and we observe that there is only one factorization of 4 × 13 and of 20 × 20 for which both factors lie in the range 2 through 20. We see now that s = 11 is the only possible no-way number in this case. (It is true, but not relevant, that 11 actually is a no-way number.)

Refer now to the script of the telephone calls in the statement of the puzzle. When S says that there is no way that P can determine the sum, P carries out the analysis of the previous paragraph and deduces that s = 11 because 11 is the only possible no-way number. (Observe that in order to carry out this analysis, P has to know that S knows that P knows the upper bound.) Furthermore, S has presumably also carried out the above analysis, and so he knows that when he makes his call, P will be able to determine that s = 11. (This, of course, requires that S knows that P knows that S knows that P knows the upper bound. This knowledge is included in our assumption that the upper bound of 20 is "public.") It follows that S gains no new information when P calls him to boast, "I know your sum." As far as S knows, therefore, the two numbers can be any two positive integers that sum to 11, and all he can say is that p is one of 18, 24, 28, or 30. It is not possible, therefore, for S to make the last call attributed to him, and the given scenario is self-contradictory. The problem, as Martin Gardner said, is "truly impossible."

If the upper bound is 100 (and public), it turns out that there are 10 no-way sums: 11, 17, 23, 27, 29, 35, 37, 41, 47, and 53. In this situation, therefore, when S places his first phone call, he effectively tells P that the sum s is one of these numbers. To illustrate the analysis now carried out by P, let us suppose that the two numbers are 10 and 13, yielding the no-way sum 23. Since p = 130, player P knows that the set of two numbers must be one of {2, 65}, {5, 26}, or {10, 13}, with corresponding sums 67, 31, and 23. Of these three possible sums, only 23 is a no-way sum, and thus P can deduce that s = 23, and this explains P's return phone call.

Player S is aware, of course, that another possibility consistent with his known s = 23 is that the two numbers are 11 and 12, yielding p = 132. If this were the correct value of p, then S would reason that P would know that the possible sets of numbers are {2, 66}, {3, 44}, {4, 33}, {6, 22}, and {11, 12}, corresponding to sums 68, 47, 37, 28, and 23. In this situation, P would deduce that s is one of 47, 37, or 23, but it would not be possible for P actually to determine s. Therefore, when S receives P's call he can conclude that p is not 132.
We have just seen that P's call does communicate some information to S (assuming that the numbers are 10 and 13), but I claim that it yields insufficient information for S to prove that p = 130. When S considers the possibility that the two numbers are 7 and 16, for example, he assumes that p = 112 and he knows in that case that P would see that the correct pair of numbers must be one of {2, 56}, {4, 28}, {7, 16}, or {8, 14}. The corresponding sums are 58, 32, 23, and 22, and since 23 is the only no-way number among these, S knows that P would deduce the sum and call to report that fact. Therefore, when S receives P's call, he has not eliminated the possibility that p = 112. In this situation, S has not determined p, and so he would not make the last phone call attributed to him. It follows that the two numbers are not 10 and 13.

In general, when S receives P's call, what he knows (in addition to knowing s) is that there must be a unique way to factor p as a product of two allowable numbers so that the sum of the factors is a no-way number. (Let us say that p is a unique-no-way product in this case.) Since S is able to determine p (and thereby determine the two numbers) after P's call, it must be the case that there is exactly one way to decompose s = u + v into two allowable numbers such that the product uv is a unique-no-way product.

In order for us to solve Gardner's impossible problem with an arbitrary upper bound (either public or secret), we must first determine all no-way numbers s (one of which must be the correct sum) and for each of these, we need to count the decompositions s = u + v such that the product uv has the unique-no-way property. If there is just one such decomposition, then the two numbers u and v form a solution to the problem. Of course, it is possible that more than one no-way sum will have a unique decomposition into pieces whose product has the unique-no-way property, and in that situation, the impossible problem will have multiple solutions.

A computer search has turned up many sets {u, v} of numbers that solve Gardner's problem with no stated upper bound. (Let us call these unrestricted solutions.) There are 16 unrestricted solutions, for example, with sums less than 500. Listed in order of increasing sums, they are as follows:
{4, 13}     {4, 61}     {16, 73}    {16, 111}
{64, 73}    {32, 131}   {16, 163}   {4, 181}
{64, 127}   {4, 229}    {8, 239}    {13, 256}
{64, 241}   {32, 311}   {8, 419}    {8, 449}
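The procedure described in this letter for a public upper bound can be sketched in a few lines of Python. This is not the author's program, which is not reproduced in the letter; the function names `factor_pairs`, `splits`, `no_way_sums`, and `solutions` are hypothetical, and the sketch covers only the case of a public upper bound, not the unrestricted search.

```python
from math import isqrt


def factor_pairs(p, bound):
    """All pairs (u, v) with 2 <= u <= v <= bound and u * v == p."""
    return [(u, p // u) for u in range(2, isqrt(p) + 1)
            if p % u == 0 and p // u <= bound]


def splits(s, bound):
    """All pairs (u, v) with 2 <= u <= v <= bound and u + v == s."""
    return [(u, s - u) for u in range(max(2, s - bound), s // 2 + 1)]


def no_way_sums(bound):
    """Sums s for which every split has a product with more than one
    allowable factorization, so S can say P has 'no way' to know s."""
    return [s for s in range(4, 2 * bound + 1)
            if all(len(factor_pairs(u * v, bound)) > 1
                   for u, v in splits(s, bound))]


def solutions(bound):
    """Pairs consistent with all three phone calls for a public bound."""
    no_way = set(no_way_sums(bound))

    def unique_no_way(p):
        # P can name the sum: exactly one factorization has a no-way sum.
        return sum(1 for u, v in factor_pairs(p, bound) if u + v in no_way) == 1

    found = []
    for s in sorted(no_way):
        good = [(u, v) for u, v in splits(s, bound) if unique_no_way(u * v)]
        if len(good) == 1:   # S can name the product
            found.append(good[0])
    return found


print(no_way_sums(20), solutions(20))
print(no_way_sums(100), solutions(100))
```

For the two bounds discussed in the letter, this reproduces the stated results: 11 is the only no-way sum and there is no solution when the bound is 20, while the bound 100 gives the ten no-way sums listed earlier and the single solution {4, 13}.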
It is easy to see that each unrestricted solution will also be a solution for some sufficiently large public upper bound, but it is not clear to this author whether or not the converse of this is true. The unrestricted solution with smallest sum is {4, 13} and this is the unique solution with public upper bound 100. We have seen that it is not a solution with public upper bound 20 (that was Gardner's mistake). Experiment shows that the smallest public upper bound for which {4, 13} is a solution is 62 and the smallest upper bound that admits the next unrestricted solution, {4, 61}, is 866. With a public upper bound of 2000, it turns out that there are exactly four solutions: {4, 13}, {4, 61}, {16, 73}, and {32, 131}. Each of these appears to be an unrestricted solution, but it is interesting that these are not the four "smallest" unrestricted solutions.

We close with the observation that for each of the 16 unrestricted solutions listed above, one of the two numbers is a power of 2. Although it is tempting to conjecture that this is a general phenomenon, it is not. The 26th unrestricted solution (in order of increasing sum) is {201, 556}.

I. M. Isaacs
Mathematics Department
University of Wisconsin
Madison, WI 53706, USA

Lee Sallows Responds

But is my additional assumption unwarranted? P and S are clearly involved in a friendly contest to discover the other's number. If not, why don't they just tell each other their numbers? "P calls S to boast, 'I know your sum,'" says Isaacs, and that word boast is his own. So he, like me, has recognised that P is out to win: to tell S that he knows her number before she can name his product. Under these circumstances, an assumption that given 15 at the outset, P would phone immediately to boast that he knew her sum, is not merely warranted, it is entirely reasonable and obvious.

Isaacs further comments that the Impossible Problem is ambiguous. This is an understatement, as anyone who seriously attempts it will become aware. Indeed, in the original draft of my paper, I added some clarifications which were later cut from the published version in the interest of brevity. Isaacs forgets that my entire purpose was to treat the problem as if it were "a deliberately and carefully constructed puzzle."

That same earlier draft included a longish section, entitled "Exploring a Blind Alley," the focus of which was what Isaacs calls "the only no-way number," 11, with an analysis that closely parallels Isaacs's own. My decision to exclude it from the published version was partly to save space, but partly because the entire argument is a blind alley. The hypothesis that S's sum is 11 leads to a contradiction. Isaacs's conclusion is identical. Having established that this hypothesis is false, I was then free to continue searching for the real solution.

Johannaweg 12
6523 MA Nijmegen
The Netherlands
Will Fermat Last?

To Saunders Mac Lane's quatrain

While at Chicago
André Weil
Ruled the roost
For quite a while

(Mathematical Intelligencer 16 (1994), no. 3, 65), I am moved to offer a footnote:

If "André Weil"
Is to be anglicized,
Then to "Andrew Wiles"
Must he be pluralized.

Jacob E. Goodman
Department of Mathematics
The City College of CUNY
New York, NY 10031 USA
If Hamilton Had Prevailed: Quaternions in Physics J. Lambek
This is a nostalgic account of how certain key results in modern theoretical physics (prior to World War II) can be expressed concisely in the language of quaternions, thus suggesting how they might have been discovered if Hamilton's views had prevailed. In the first instance, biquaternions are used to discuss special relativity and Maxwell's equations. To express Dirac's equation of the electron, we are led to replace the complex number $i$ by the right regular representation of the quaternion unit $i_1$. Looked at in this way, it is actually equivalent to the relativistic version of Schrödinger's equation. The complex number $i$ reappears as soon as we consider the electron in an electromagnetic field. When expressed in terms of complex matrices, Dirac's equation turns out to be invariant not only under a projective representation of the Lorentz group and under Weyl's gauge transformation but also under a projective representation of SU(3).
The Invention of Quaternions

Bourbaki introduced the following symbols for various species of numbers:

$$\mathbb{N} \subset \mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R} \subset \mathbb{C} \subset \mathbb{H},$$

referring to the naturals, integers, rationals, reals, complex numbers, and quaternions, respectively. This sequence expresses a logical development of the number system, but its historical (and pedagogical) development proceeds somewhat differently:

$$\mathbb{N}^+ \subset \mathbb{Q}^+ \subset \mathbb{R}^+ \subset \mathbb{R} \subset \mathbb{C} \subset \mathbb{H},$$

where $\mathbb{N}^+$, $\mathbb{Q}^+$, and $\mathbb{R}^+$ refer to the positive naturals, positive rationals, and positive reals, respectively.

Not many mathematicians can claim to have introduced (invented? discovered?) a new kind of number. Although the positive reals had been effectively used by Thales, as ratios of geometric quantities, it was only after the Pythagoreans discovered, to their great discomfort, that the equation $x^2 = 2$ cannot be solved for rational $x$, that a rigorous definition of the positive reals was given by Eudoxus, essentially by what we now call Dedekind cuts. As far as we know, the Indian mathematician Brahmagupta was the first to allow zero and negative numbers to be subject to arithmetical operations, thus permitting the transition from $\mathbb{R}^+$ to $\mathbb{R}$. Cardano, perhaps better known as a physician than as a mathematician, introduced complex numbers, not just to solve equations such as $x^2 + 1 = 0$, but because they were needed to find real solutions of cubic equations with real coefficients. After Gauss had proved the fundamental theorem of algebra, there was no longer any need to introduce new numbers to solve equations.

It was with a different motivation in mind that quaternions were invented by William Rowan Hamilton and, according to Altmann [1986], independently by Olinde Rodrigues. (He also points out that they were already known to Gauss.) Hamilton had already made important contributions to mathematical physics, the most celebrated one being his reformulation of the Euler-Lagrange equations in a form in which position and momentum appear on the same footing. He was now looking for numbers of the form $x + iy + jz$, with $i^2 = j^2 = -1$, which would do for space what complex numbers had done for the plane. According to Conway [1951], he may have been influenced by the complex number identity

$$(x + iy)(x - iy) = x^2 + y^2$$

when he looked at

$$(x + iy + jz)(x - iy - jz) = x^2 + y^2 + z^2 - (ij + ji)yz.$$

In 1843 he had the sudden insight to abandon the commutative law of multiplication. (It should be noted that matrix multiplication may have come later.) Writing $ij = k$, he found that

$$i^2 = j^2 = k^2 = ijk = -1.$$

Numbers of the form $a + bi + cj + dk$, with $a, b, c, d \in \mathbb{R}$, were called quaternions. They were added, subtracted, and multiplied according to the usual laws of arithmetic, except for the commutative law of multiplication. It will be convenient to replace $i$, $j$, and $k$ by $i_1$, $i_2$, and $i_3$, respectively, so that any quaternion can be written as

$$a = a_0 + i_1a_1 + i_2a_2 + i_3a_3 = \sum_{\alpha=0}^{3} i_\alpha a_\alpha,$$

where $i_0 = 1$. The conjugate of $a$ is given by

$$a^\dagger = a_0 - i_1a_1 - i_2a_2 - i_3a_3,$$

and one notes that $aa^\dagger$ and $a^\dagger a$ are both equal to the real number

$$N(a) = a_0^2 + a_1^2 + a_2^2 + a_3^2,$$

called the norm of $a$. When $a \neq 0$, one can define $a^{-1} = a^\dagger / N(a)$, so that

$$aa^{-1} = 1 = a^{-1}a.$$

The quaternions form a skew field or division ring, which is denoted by $\mathbb{H}$ in Hamilton's honour, $\mathbb{Q}$ having been preempted for the field of rationals. Note also that

$$(ab)^\dagger = b^\dagger a^\dagger.$$

As far as we know, Hamilton was the first to look at a noncommutative system of numbers. Had matrices been known to him, Hamilton might have defined

$$i_1 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad i_2 = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix},$$

where $i$ is the ordinary complex square root of $-1$, thus forcing

$$i_3 = i_1i_2 = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}.$$

If these three matrices are multiplied by $-i$, one obtains the so-called Pauli spin matrices, which were to play a role in quantum mechanics later. Thus, the quaternion $a$ could have been identified with the complex 2-by-2 matrix

$$\begin{pmatrix} a_0 + ia_3 & a_1 + ia_2 \\ -a_1 + ia_2 & a_0 - ia_3 \end{pmatrix} = \begin{pmatrix} u & v \\ -v^* & u^* \end{pmatrix},$$

where $u$ and $v$ are complex numbers with complex conjugates $u^*$ and $v^*$. Note that the quaternion conjugate of such a matrix is

$$\begin{pmatrix} u^* & -v \\ v^* & u \end{pmatrix},$$

which is the same as the complex conjugate of the transposed matrix. Replacing 0, 1, and $i$ in these complex matrices by

$$\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},$$

respectively, one can obtain a representation of quaternions as 4-by-4 real matrices.

A more natural way of representing quaternions by real matrices is with the help of linear algebra. Clearly, the mapping $x \mapsto ax$, induced by left multiplication with $a$, is a linear transformation of the vector space of quaternions $x$, hence can be represented by a 4-by-4 real matrix $L(a)$. Writing $[x]$ for the column vector associated with the quaternion $x = \sum_{\alpha=0}^{3} i_\alpha x_\alpha$, namely, the transpose of the row vector $(x_0, x_1, x_2, x_3)$, we thus have

$$[ax] = L(a)[x].$$

We may also represent the linear mapping $x \mapsto xb$, induced by right multiplication with $b$, by a matrix $R(b)$ such that

$$[xb] = R(b)[x].$$

In view of the associative law

$$(ax)b = a(xb)$$

for quaternions, we have

$$R(b)L(a) = L(a)R(b), \qquad L(ax) = L(a)L(x), \qquad R(xb) = R(b)R(x).$$

Thus, $L : \mathbb{H} \to M_4(\mathbb{R})$ and $R : \mathbb{H}^{\mathrm{op}} \to M_4(\mathbb{R})$ are ring homomorphisms.
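The identities of this section can be confirmed by direct computation. The Python sketch below is offered only as an illustration: the helper names (`matmul`, `scale`, `qmul`, `left_matrix`, `right_matrix`) are hypothetical, matrices are stored as nested tuples, and the sample quaternions `a`, `b`, `x` are arbitrary integer choices so that every comparison is exact.

```python
def matmul(A, B):
    """Product of two matrices stored as tuples of row tuples."""
    return tuple(
        tuple(sum(A[r][k] * B[k][c] for k in range(len(B))) for c in range(len(B[0])))
        for r in range(len(A))
    )


def scale(z, A):
    """Multiply every entry of the matrix A by the scalar z."""
    return tuple(tuple(z * e for e in row) for row in A)


# The 2-by-2 complex matrices for i1 and i2; i3 is forced by i3 = i1 i2.
ONE = ((1, 0), (0, 1))
i1 = ((0, 1), (-1, 0))
i2 = ((0, 1j), (1j, 0))
i3 = matmul(i1, i2)
MINUS_ONE = scale(-1, ONE)

# Multiplying by -i gives the Pauli spin matrices.
pauli = [scale(-1j, m) for m in (i1, i2, i3)]


def qmul(a, b):
    """Hamilton product of quaternions given as 4-tuples (a0, a1, a2, a3)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)


BASIS = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]


def left_matrix(a):
    """L(a): the 4-by-4 real matrix with [ax] = L(a)[x]."""
    cols = [qmul(a, e) for e in BASIS]
    return tuple(tuple(cols[c][r] for c in range(4)) for r in range(4))


def right_matrix(b):
    """R(b): the 4-by-4 real matrix with [xb] = R(b)[x]."""
    cols = [qmul(e, b) for e in BASIS]
    return tuple(tuple(cols[c][r] for c in range(4)) for r in range(4))


a, b, x = (1, 2, 3, 4), (0, 3, -2, 1), (2, -1, 0, 5)
commute = matmul(right_matrix(b), left_matrix(a)) == matmul(left_matrix(a), right_matrix(b))
left_hom = left_matrix(qmul(a, x)) == matmul(left_matrix(a), left_matrix(x))
right_antihom = right_matrix(qmul(x, b)) == matmul(right_matrix(b), right_matrix(x))
```

Building `L(a)` and `R(b)` column by column from the images of the basis quaternions mirrors the linear-algebra definition in the text, so the three identities are tested without assuming any closed form for the matrices.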
Applications to Classical Physics

It seems reasonable to represent a three-dimensional vector $(x_1, x_2, x_3)$ by the headless quaternion

$$\xi = i_1x_1 + i_2x_2 + i_3x_3;$$

but what is the meaning of the fourth coordinate $x_0$? Influenced by the pre-Socratic philosopher Parmenides, who believed that the flow of time is an illusion, Hamilton might have suspected that $x_0$ stands for time. But what then is the significance of the norm $N(x)$ when $x_0 \neq 0$? (Of course, when $x_0 = 0$, it stands for the square of the distance from the origin.)

If we assume that $\xi$ and

$$\eta = i_1y_1 + i_2y_2 + i_3y_3$$

are pure 3-vectors, then

$$\xi\eta = -(\xi \circ \eta) + \xi \times \eta,$$

where

$$\xi \circ \eta = x_1y_1 + x_2y_2 + x_3y_3$$

and

$$\xi \times \eta = i_1(x_2y_3 - x_3y_2) + i_2(x_3y_1 - x_1y_3) + i_3(x_1y_2 - x_2y_1)$$

came to be called the scalar product and vector product, respectively. It was these two products which were applied to physics in the vector analysis of Gibbs and Heaviside, rather than the quaternion product. In particular, writing

$$\nabla = i_1\frac{\partial}{\partial x_1} + i_2\frac{\partial}{\partial x_2} + i_3\frac{\partial}{\partial x_3},$$

one usually summarizes Maxwell's equations as follows:

$$\nabla \circ B = 0, \qquad \nabla \times E + \frac{\partial B}{\partial t} = 0, \qquad \nabla \circ E = \rho, \qquad \nabla \times B - \frac{\partial E}{\partial t} = j,$$

where

$$B = B_1i_1 + B_2i_2 + B_3i_3, \qquad E = E_1i_1 + E_2i_2 + E_3i_3$$

represent the magnetic and electric fields, respectively, $\rho$ is the charge density, and $j$ is the current density. (Units have been chosen to make $c$, the velocity of light, equal to one light-second per second.) These laws are usually ascribed to Coulomb, Faraday, Gauss, and Ampère, respectively, although it was Maxwell who added the term $\partial E/\partial t$ to the last equation. This addition is in fact crucial to what follows.

Using the language of quaternions, albeit quaternions with complex components, known as biquaternions, Maxwell's four equations may be combined into a single equation:

$$\left(\frac{\partial}{\partial t} - i\nabla\right)(B + iE) + (\rho + ij) = 0.$$

As far as I know, this was first pointed out independently by Conway [1911] and Silberstein [1912], although it might have been realized by Clerk Maxwell himself, when he said in 1869: "The invention of the calculus of quaternions is a step towards the knowledge of quantities related to space which can only be compared, for its importance, with the invention of triple coordinates by Descartes."

It follows from Maxwell's equations that

$$\frac{\partial \rho}{\partial t} + \nabla \circ j = 0.$$

This is known as the equation of continuity, which asserts that the scalar part of

$$\left(\frac{\partial}{\partial t} + i\nabla\right)(\rho + ij)$$

is zero. Another consequence of Maxwell's equations is the existence of a 4-potential, here denoted by the biquaternion $\varphi + iA$, such that

$$E = -\nabla\varphi - \frac{\partial A}{\partial t}, \qquad B = \nabla \times A.$$

In quaternion notation, this asserts that the vector part of

$$-\left(\frac{\partial}{\partial t} + i\nabla\right)(\varphi + iA)$$

is $B + iE$. (Sometimes, the scalar part is put equal to zero.)

If Maxwell had faith in quaternions, other physicists despised them. Thus, Oliver Heaviside, as quoted by Conway [1948], asserted that quaternions are "a positive evil of no inconsiderable magnitude," and William
Thomson, better known as Lord Kelvin, as quoted by Altmann [1986], said that they "have been an unmixed evil to those who touched them in any way, including Clerk Maxwell." Even Minkowski, who should have known better, rejected quaternions as "too narrow and clumsy." It may be difficult to understand how a mathematical concept can evoke such strong antagonism; but, even in our day, similar opinions have been expressed about categories, although for different reasons.

Not surprisingly, quaternions never entered the mainstream of physics, yet they had a small but dedicated group of devotees, of whom I shall only mention the few whose articles are cited in the bibliography: Conway, Dirac, Silberstein, Weiss, Gürsey, and Synge. My own interest as a graduate student was raised by the inspiring book by Silberstein [1924] and led to Part I of my thesis, from which, however, all physics was expunged when I realized that my main ideas had been anticipated by Conway [1948].

If we take the quaternion form of Maxwell's equations seriously, we are led to the study of quaternions with complex components, also called biquaternions. Extending the representation of quaternions as 2-by-2 complex matrices to biquaternions, we see that the latter must be represented by matrices of the form

$$\begin{pmatrix} u & v \\ -v^* & u^* \end{pmatrix} + i\begin{pmatrix} u' & v' \\ -v'^* & u'^* \end{pmatrix}.$$

But it is easily seen that any 2-by-2 complex matrix is of this form, as we can solve the four equations

$$u + iu' = p, \qquad v + iv' = q, \qquad -v^* - iv'^* = r, \qquad u^* + iu'^* = s$$

for $u$, $v$, $u'$, and $v'$. For example, writing $u = u_0 + iu_1$, $v = v_0 + iv_1$, and so on, and adding the first and last equations, we see that

$$2u_0 + 2iu'_0 = p + s,$$

so that $u_0 = \tfrac{1}{2}(p_0 + s_0)$.

Thus, the algebra of biquaternions is isomorphic to $M_2(\mathbb{C})$. The complex conjugate

$$(a + ib)^* = a - ib$$

of a biquaternion is then represented by the conjugate matrix, whereas the quaternion conjugate

$$(a + ib)^\dagger = (a_0 + ib_0) - i_1(a_1 + ib_1) - i_2(a_2 + ib_2) - i_3(a_3 + ib_3)$$

is represented by the transposed matrix.

We may also extend the representation $L$ of quaternions as 4-by-4 real matrices to one of biquaternions as 4-by-4 complex matrices. Using the same letter $L$ for the latter, we see that again

$$L(c^*) = L(c)^*, \qquad L(c^\dagger) = L(c)^t$$

are the conjugate and transposed matrices, respectively. For the present, we shall favour this representation and identify the biquaternion $c$ with the matrix $L(c)$ in $M_4(\mathbb{C})$, although later we shall look at a representation of biquaternions in $M_4(\mathbb{R})$.

Applications to Special Relativity

The special theory of relativity requires the invariance of the expression $t^2 - x_1^2 - x_2^2 - x_3^2$ under a coordinate transformation passing from a stationary platform to a moving train. This suggests that position in space and time be joined together in a biquaternion of the form

$$x = x_0 + ii_1x_1 + ii_2x_2 + ii_3x_3 = t + ir,$$

where the $x_\alpha$ are real. These biquaternions are characterized by the property $x^* = x^\dagger$ and have been called hermitian biquaternions. In fact, the matrices $L(x)$ and $R(x)$ are then hermitian matrices. The differential operator

$$\frac{d}{dx} = \frac{\partial}{\partial t} - i\nabla$$

is also a hermitian biquaternion, but the so-called six-vector $F = B + iE$ satisfies $F^\dagger = -F$; it is represented by a skew-symmetric matrix.

We may ask which continuous transformations leave the norm of a hermitian biquaternion invariant. It is easily seen that this is so for

$$x \mapsto x^*, \qquad x \mapsto -x, \qquad x \mapsto qxq^{*\dagger} \quad \text{when } N(q) = 1,$$

and for no others. (See, e.g., Lambek [1950].) If we ask, more generally, which continuous transformations leave the norm of the difference of two hermitian biquaternions unchanged, we should add also the translation

$$x \mapsto x + a.$$
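The invariance of the norm under $x \mapsto qxq^{*\dagger}$ can be illustrated numerically. The Python sketch below is an assumption-laden illustration rather than anything taken from the article: biquaternions are stored as 4-tuples of complex numbers, the function names are hypothetical, and the sample $q$ is built (by my choice) as a product of a hyperbolic factor and a circular factor, each of norm 1.

```python
import math


def bq_mul(a, b):
    """Hamilton product of biquaternions stored as 4-tuples of complex numbers."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)


def dagger(a):
    """Quaternion conjugate: negate the three vector components."""
    return (a[0], -a[1], -a[2], -a[3])


def star(a):
    """Complex conjugate: conjugate each component."""
    return tuple(complex(c).conjugate() for c in a)


def norm(a):
    """N(a) = a a^dagger = a0^2 + a1^2 + a2^2 + a3^2 (a complex scalar)."""
    return sum(c * c for c in a)


def hermitian(t, x1, x2, x3):
    """The hermitian biquaternion t + i(i1 x1 + i2 x2 + i3 x3)."""
    return (t, 1j * x1, 1j * x2, 1j * x3)


def transform(q, x):
    """Apply x -> q x q^{*dagger}."""
    return bq_mul(bq_mul(q, x), dagger(star(q)))


# Sample q with N(q) = 1: hyperbolic factor along i1 times circular factor about i3.
phi, theta = 0.8, 1.3
s = (math.cosh(phi / 2), 1j * math.sinh(phi / 2), 0, 0)
r = (math.cos(theta / 2), 0, 0, math.sin(theta / 2))
q = bq_mul(s, r)

x = hermitian(2.0, 0.3, -1.1, 0.7)
x_new = transform(q, x)
print(norm(x), norm(x_new))
```

For a hermitian biquaternion the norm is $t^2 - x_1^2 - x_2^2 - x_3^2$, so the two printed values should agree up to rounding, and `x_new` should again have a real first component and purely imaginary vector components.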
The group generated by all these transformations is called the Poincaré group. We shall rule out the transformation $x \mapsto x^*$ but admit $x \mapsto -x$, thus following Lewis Carroll, who suggested that time is reversed in a mirror. Transformations of the form $x \mapsto q x q^{*\dagger}$ are called (proper) Lorentz transformations; they were originally postulated by Lorentz to account for the Michelson–Morley experiment. This ad hoc explanation was later justified by Albert Einstein, who realized that they also described, in addition to rotations, the changes in coordinates when passing from a stationary platform to a uniformly moving train; these are called boosts (see Sudbury [1986]) and form the cornerstones of the special theory of relativity.

A Lorentz transformation is given by a biquaternion $q = u + iv$, with $u, v \in \mathbb{H}$, such that $qq^\dagger = 1$, that is,
\[ uu^\dagger - vv^\dagger = 1, \qquad uv^\dagger + vu^\dagger = 0. \]
$q$ describes a rotation in three-space if $q \in \mathbb{H}$, that is, if $v = 0$; it is a boost if it is hermitian, that is, $q^\dagger = u - iv$, which means that $u$ is a scalar and $v$ a 3-vector. It is easily seen that every Lorentz transformation is a rotation followed by a boost. Indeed, let
\[ \mu^2 = uu^\dagger = 1 + vv^\dagger \geq 1, \qquad r = u\mu^{-1}, \qquad s = \mu - iuv^\dagger\mu^{-1}; \]
then $r$ is a rotation, $s$ is a boost, and $q = sr$.

The space of hermitian biquaternions is called Minkowski space; it is generated by $1$, $ii_1$, $ii_2$, and $ii_3$. If we put $\lambda_\alpha = i$ when $\alpha > 0$ but $\lambda_0 = 1$, we may write these generators as $\lambda_\alpha i_\alpha$. Any hermitian biquaternion then has the form
\[ x = \sum_\alpha x'_\alpha i_\alpha = \sum_\alpha x_\alpha \lambda_\alpha i_\alpha, \]
where the $x_\alpha = x'_\alpha \lambda^*_\alpha$ are real. Applying a Lorentz transformation, we obtain another hermitian biquaternion
\[ q x q^{*\dagger} = \sum_{\alpha,\beta} x_\alpha A_{\alpha\beta} \lambda_\beta i_\beta, \]
where the $A_{\alpha\beta}$ are real numbers. Putting
\[ \Lambda_{\alpha\beta} = \lambda^*_\alpha A_{\alpha\beta} \lambda_\beta, \]
we see that
\[ L(q) R(q^{*\dagger}) [x] = \Lambda^\dagger [x]. \]
This being so for all $[x]$, we infer that
\[ L(q) R(q^{*\dagger}) = \Lambda^\dagger, \]
hence, taking the transposed of each side, that
\[ R(q^*) L(q^\dagger) = \Lambda. \]
Multiplying this by $[x^*]$, for any hermitian biquaternion $x$, we obtain
\[ q^\dagger x^* q^* = \sum_{\alpha,\beta} i_\alpha \Lambda_{\alpha\beta} x'^*_\beta = \sum_{\alpha,\beta} i_\alpha \lambda^*_\alpha A_{\alpha\beta} x_\beta. \]
For later reference, we summarize this observation as follows:

LEMMA 1. If a Lorentz transformation transforms $\lambda_\alpha i_\alpha$ into $q \lambda_\alpha i_\alpha q^{*\dagger} = \sum_\beta A_{\alpha\beta} \lambda_\beta i_\beta$, then $q^\dagger \lambda^*_\beta i_\beta q^* = \sum_\alpha i_\alpha \lambda^*_\alpha A_{\alpha\beta}$.

Einstein had realized that the mass-momentum $p = m + im\mathbf{v}$ should also transform like $x$ (without using the language of quaternions); hence, he wrote $p = m_0\, dx/ds$, where $ds^2 = N(dx)$, and $m_0 = m(dt/ds)^{-1}$ is the rest mass of a moving particle, assumed to be invariant. The conservation of momentum is then coupled with the conservation of mass. Now
\[ m = m_0 \frac{dt}{ds} = m_0 (1 - v^2)^{-1/2}, \]
where $v^2 = \mathbf{v} \circ \mathbf{v}$. If the velocity $\mathbf{v}$ of the moving particle is small compared with the velocity $c = 1$ of light,
\[ m \approx m_0 \left( 1 + \frac{1}{2} \frac{v^2}{c^2} \right), \]
where we have temporarily restored the symbol $c$ to obtain the famous approximation
\[ mc^2 \approx m_0 c^2 + \tfrac{1}{2} m_0 v^2. \]
Einstein considered this to be the total energy of the particle, the kinetic energy $\tfrac{1}{2} m_0 v^2$ being augmented by the atomic energy $m_0 c^2$.

The charge-current density $J = \rho + i\mathbf{J}$ may be thought of as $J = \rho_0\, dx/ds$, where $\rho_0 = \rho (dt/ds)^{-1}$ is assumed to be an invariant scalar; hence, $J$ also transforms like $x$, namely $J \mapsto q J q^{*\dagger}$. In our notation, Maxwell's equations are combined into the single equation $dF/dx + J = 0$. It seems that Henri Poincaré was the first to realize that they are invariant under Lorentz transformations. To see this, we only need
\[ F \mapsto q^* F q^{*\dagger} \qquad \text{and} \qquad \frac{d}{dx} \mapsto q \frac{d}{dx} q^{*\dagger}. \]
The former transformation is natural for a 6-vector, as
\[ (q^* F q^{*\dagger})^\dagger = q^* F^\dagger q^{*\dagger} = -q^* F q^{*\dagger}. \]
The latter transformation may be justified on general principle: if the column vector of the $x'_\alpha = \lambda_\alpha x_\alpha$ is transformed by the matrix $\Lambda$, then the column vector of the $\partial/\partial x'_\alpha = \lambda^*_\alpha\, \partial/\partial x_\alpha$ is transformed by the inverse of $\Lambda^\dagger$. Now, if $\Lambda$ leaves $x_0^2 - x_1^2 - x_2^2 - x_3^2$ invariant, the inverse of $\Lambda^\dagger$ is $\Lambda$; hence,
\[ \frac{d}{dx} = \sum_\alpha i_\alpha \frac{\partial}{\partial x'_\alpha} \]
transforms like $x$. In summary, the following hermitian quaternions are transformed by $q(\ )q^{*\dagger}$:

$x = t + i\mathbf{r}$ (position in space-time),
$d/dx = \partial/\partial t - i\nabla$ (partial derivation),
$p = m + im\mathbf{v}$ (energy-momentum),
$J = \rho + i\mathbf{J}$ (charge-current density),
$\Phi = \varphi + i\mathbf{A}$ (4-potential).
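The two claims made above about Lorentz transformations, invariance of the norm under $x \mapsto q x q^{*\dagger}$ and the factorization $q = sr$ into a rotation followed by a boost, can be tested on a numerical example. This is only a sketch under stated assumptions: biquaternions are length-4 complex arrays on the basis $1, i_1, i_2, i_3$, the sample $q$ is built from an arbitrary unit quaternion and an arbitrary boost parameter `theta`, and all function names are illustrative.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of two (bi)quaternions given as length-4 arrays."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return np.array([
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    ])

def dagger(c):
    """Quaternion conjugate."""
    return c * np.array([1, -1, -1, -1])

def norm(c):
    """N(c) = c c-dagger, returned as its scalar component."""
    return qmul(c, dagger(c))[0]

def lorentz(q, x):
    """The map x -> q x q*-dagger."""
    return qmul(qmul(q, x), dagger(np.conj(q)))

# Sample q = s0 r0: r0 is a unit real quaternion, s0 a hermitian boost.
rng = np.random.default_rng(1)
r0 = rng.normal(size=4)
r0 /= np.linalg.norm(r0)
n = rng.normal(size=3)
n /= np.linalg.norm(n)
theta = 0.7  # arbitrary boost parameter (assumption of this sketch)
s0 = np.concatenate(([np.cosh(theta)], 1j * np.sinh(theta) * n))
q = qmul(s0, r0)

x = np.array([2.0, 0.3j, -1.1j, 0.5j])  # hermitian biquaternion t + i r
y = lorentz(q, x)

unit_ok = np.allclose(qmul(q, dagger(q)), [1, 0, 0, 0])   # q q-dagger = 1
norm_ok = np.isclose(norm(y), norm(x))                    # t^2 - |r|^2 preserved
herm_ok = np.allclose(np.conj(y), dagger(y))              # image is hermitian

# Recover rotation r and boost s from q = u + i v as in the text.
u, v = q.real, q.imag
mu = np.sqrt(qmul(u, dagger(u))[0])
r = u / mu
s = mu * np.array([1, 0, 0, 0]) - 1j * qmul(u, dagger(v)) / mu
decomp_ok = (np.allclose(qmul(s, r), q)
             and np.allclose(np.conj(s), dagger(s)))      # s is hermitian
print(unit_ok, norm_ok, herm_ok, decomp_ok)
```

A single random sample is of course not a proof; it only guards against sign or ordering slips in the formulas for $\mu$, $r$ and $s$.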
On the other hand, the 6-vector

$F = \mathbf{B} + i\mathbf{E}$ (electromagnetic field)

is transformed by $q^*(\ )q^{*\dagger}$. The quaternion form of Maxwell's equations is
\[ \frac{d}{dx} F + J = 0. \]
Operating on this equation by $(d/dx)^\dagger$ and noting that $(d/dx)^\dagger (dF/dx)$ is a 6-vector, we infer that the scalar part of $(d/dx)^\dagger J$ is zero. This is the equation of continuity. We should also add the observation that $(d/dx)^\dagger \Phi + F$ is a scalar, which is sometimes put equal to zero.

Maxwell's equations describe the electromagnetic field created by a continuous distribution of charge in motion, as expressed by the charge-current biquaternion $J$. Conversely, a given electromagnetic field also acts on a moving charge, this time viewed as a discrete charge $q_0$ moving with velocity $\mathbf{v}$, by exerting a force $q_0(\mathbf{E} + \mathbf{v} \times \mathbf{B})$. According to Newton, this should be the rate of change of momentum, but special relativity requires that the rate be measured by $d/ds$, not by $d/dt$. Moreover, according to the dictates of special relativity, this should be augmented by a term $q_0(\mathbf{v} \times \mathbf{E} + \mathbf{B})$, expressing the rate of change of energy. Thus, we obtain the relativistic form of the equation of motion of an electron, with $q_0 = -e$,
\[ \frac{dp}{ds} = -e \frac{dx}{ds} F + ig, \]
where $g$ is some hermitian biquaternion. There remains a contradiction: we described the charge as continuously distributed in Maxwell's equation, but as discrete in the equation of motion. This contradiction will only be resolved when we pass to the quantum-mechanical treatment of the electron.

Application to Quantum Mechanics

Quantum mechanics prescribes that the momentum $p = m_0\, dx/ds$ be replaced by the differential operator $-(h/2\pi i)(d/dx)$, where $h$ is Planck's constant. Choosing units so that $h/2\pi = 1$, we thus expect the equation $pp^\dagger = m_0^2$ to be replaced by the relativistic wave equation
\[ \left( \frac{d}{dx} \right)^{\dagger} \frac{d\psi}{dx} + m_0^2 \psi = 0, \]
where $\psi = \psi_0 + i\psi_1$ is a function of position in space-time. This is usually referred to as the Klein–Gordon equation; but, as a recent biography of Schrödinger points out (Moore [1989]), Schrödinger himself had considered this form of the wave equation before introducing the time-evolving form which is named after him.

Dirac obtained his celebrated first-order equation by extracting the square root of this second-order differential operator, thus rediscovering the main idea behind Clifford algebras. It does not seem to be widely realized that such ad hoc methods are not needed and that Dirac's first-order equation is in fact equivalent to the Klein–Gordon equation, provided we do not insist that $\psi$ remain a scalar. In fact, we shall assume that $\psi$ is a biquaternion and write
\[ \frac{d\psi}{dx} = m_0 \chi, \]
where $\chi = \chi_0 + i\chi_1$ is another biquaternion. Then
\[ m_0 \left( \frac{d}{dx} \right)^{\dagger} \chi = \left( \frac{d}{dx} \right)^{\dagger} \frac{d\psi}{dx} = -m_0^2 \psi, \]
so the Klein–Gordon equation is equivalent to the following pair of biquaternion equations:
\[ \frac{d\psi}{dx} = m_0 \chi, \qquad \left( \frac{d}{dx} \right)^{\dagger} \chi = -m_0 \psi. \]
How can these be combined into a single first-order equation? Assume, for the moment, that there is an entity $j$ such that $j^2 = -1$, $ji = -ij$, and $j i_\alpha = i_\alpha j$ for $\alpha = 0, 1, 2$, or $3$. Then we have
\[ \frac{d}{dx}(\psi + j\chi) = m_0 \chi + j \left( \frac{d}{dx} \right)^{\dagger} \chi = m_0 (\chi - j\psi) = -j m_0 (\psi + j\chi). \]
There is certainly no 4-by-4 matrix which anticommutes with the complex number $i$. But let us pass to real 4-by-4 matrices and identify $i_\alpha$ with the matrix $L(i_\alpha)$ representing it. We shall take $j_\alpha = R(i_\alpha)$, the contravariant representation; then
\[ j_1^2 = j_2^2 = j_3^2 = j_3 j_2 j_1 = -1, \qquad j_\alpha i_\beta = i_\beta j_\alpha, \]
for $\alpha, \beta = 1, 2$, or $3$. Now replace the complex number $i$ by the real matrix $j_1$ and identify $j$ with $j_2$. Putting
\[ \Psi = \psi + j_2 \chi = \psi_0 + j_1 \psi_1 + j_2 \chi_0 + j_3 \chi_1, \]
we may write the above first-order equation as
\[ \frac{d}{dx} \Psi + j_2 m_0 \Psi = 0, \]
where now
\[ \frac{d}{dx} = \frac{\partial}{\partial t} - j_1 \nabla, \]
and we have essentially recaptured Dirac's equation for the free electron.
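The relations imposed on the matrices $j_\alpha = R(i_\alpha)$ can be verified directly. The sketch below assumes the usual 4-by-4 real matrices of left and right quaternion multiplication on the coefficient vector $(x_0, x_1, x_2, x_3)$; the names `i_mat` and `j_mat` are illustrative. As a side check it also confirms that these matrices, together with the products $i_\alpha j_\beta$, are linearly independent and therefore span all real 4-by-4 matrices.

```python
import numpy as np
from itertools import product

def qmul(a, b):
    """Hamilton product of two quaternions given as length-4 arrays."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return np.array([
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    ])

BASIS = np.eye(4)  # coefficient vectors of 1, i1, i2, i3

def L(c):
    """Left-multiplication matrix: L(c) @ [x] equals [c x]."""
    return np.column_stack([qmul(c, e) for e in BASIS])

def R(c):
    """Right-multiplication matrix: R(c) @ [x] equals [x c]."""
    return np.column_stack([qmul(e, c) for e in BASIS])

I4 = np.eye(4)
idx = (1, 2, 3)
i_mat = {a: L(BASIS[a]) for a in idx}   # i_alpha identified with L(i_alpha)
j_mat = {a: R(BASIS[a]) for a in idx}   # j_alpha = R(i_alpha)

# j1^2 = j2^2 = j3^2 = j3 j2 j1 = -1
squares_ok = all(np.allclose(j_mat[a] @ j_mat[a], -I4) for a in idx)
triple_ok = np.allclose(j_mat[3] @ j_mat[2] @ j_mat[1], -I4)
# Every j_alpha commutes with every i_beta.
commute_ok = all(np.allclose(j_mat[a] @ i_mat[b], i_mat[b] @ j_mat[a])
                 for a, b in product(idx, repeat=2))
# j2 (playing the role of j) anticommutes with j1 (playing the role of i).
anticommute_ok = np.allclose(j_mat[2] @ j_mat[1], -j_mat[1] @ j_mat[2])

# The 16 matrices 1, i_a, j_b, i_a j_b as rows of a 16 x 16 array.
family = ([I4] + [i_mat[a] for a in idx] + [j_mat[b] for b in idx]
          + [i_mat[a] @ j_mat[b] for a, b in product(idx, repeat=2)])
rank = np.linalg.matrix_rank(np.array([m.ravel() for m in family]))
print(squares_ok, triple_ok, commute_ok, anticommute_ok, rank)
```

Full rank of the 16-by-16 array is equivalent to the 16 matrices forming a basis of the real 4-by-4 matrices.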
A word of warning: having replaced $i$ by $j_1$, we must now write $x = t + j_1 \mathbf{r}$, and so on, and even the biquaternion $q = u + iv$ in the Lorentz transformation has now become the matrix $u + j_1 v$. To make sure that our first-order equation is preserved under the Lorentz transformation which sends $x$ to $q x q^{*\dagger}$, hence $d/dx$ to $q (d/dx) q^{*\dagger}$, it suffices to let $\Psi$ be transformed into $q^* \Psi$. (There are other possibilities, namely, sending $\Psi$ to $q^* \Psi q^\dagger$ or $q^* \Psi q^{*\dagger}$, but we shall not consider these.) We note that the Lorentz transformation $q(\ )q^{*\dagger}$ is unchanged when $q$ is replaced by $-q$, but that it corresponds to two distinct ways of transforming $\Psi$: into $q^* \Psi$ and $-q^* \Psi$, respectively. The transformations of $\Psi$ do not constitute a representation of the Lorentz group, but what has been called a projective representation. This is the mathematical reason for saying that the electron has spin $\tfrac{1}{2}$.

It can be shown that the 16 matrices $1$, $i_\alpha$, $j_\beta$, and $i_\alpha j_\beta$ ($\alpha, \beta = 1, 2$, or $3$) span the space of all 4-by-4 real matrices. The reason is that $\mathbb{H}$ is a central simple algebra of degree 4 over $\mathbb{R}$, hence $\mathbb{H} \otimes \mathbb{H}^{op}$ is isomorphic to $M_4(\mathbb{R})$ (see Jacobson [1980], Theorem 4.6). Thus, $\Psi$ is just an arbitrary 4-by-4 real matrix. However, the transformation rule $\Psi \mapsto q^* \Psi$ permits us to multiply $\Psi$ in the Dirac equation by the column vector $(1, 0, 0, 0)^\dagger$, so we may assume, without loss in generality, that $\Psi$ is the real column vector $(\psi_0, \psi_1, \psi_2, \psi_3)^\dagger$; we may write $\Psi = [\psi]$, where $\psi = \psi_0 + i_1 \psi_1 + i_2 \psi_2 + i_3 \psi_3$ is a real quaternion. There is nothing to prevent us from introducing a complex column vector here; on the other hand, there seems to be no necessity for doing so either, at least as long as we confine attention to the free electron, not influenced by an electromagnetic field.

Dirac's equation for the free electron may be written more explicitly in matrix form as follows:
\[ \left( \frac{\partial}{\partial t} - \sum_{\alpha=1}^{3} L(i_\alpha) R(i_1) \frac{\partial}{\partial x_\alpha} + m_0 R(i_2) \right) [\psi] = 0. \]
Now $L(i_\alpha)[\psi] = [i_\alpha \psi]$ and $R(i_\alpha)[\psi] = [\psi i_\alpha]$, so this is really an equation in real quaternions:
\[ \frac{\partial \psi}{\partial t} - \nabla \psi\, i_1 + m_0 \psi\, i_2 = 0. \]
When written in this way, the equation is less apparently Lorentz-invariant.

When taking $j_1 = R(i_1)$ and $j_2 = R(i_2)$, we made an arbitrary choice. We might just as well have taken $j_1 = R(i_2)$ and $j_2 = R(i_3)$ or, more generally, $j_\alpha = R(r i_\alpha r^\dagger)$ ($\alpha = 1, 2$, or $3$), where $r$ is a real quaternion such that $rr^\dagger = 1$, that is, where $r(\ )r^\dagger$ is a rotation in 3-space. What would happen to Dirac's equation had we chosen a different coordinate frame in 3-space? In its quaternionic form, it would then become
\[ \frac{\partial \psi}{\partial t} - \nabla \psi\, r i_1 r^\dagger + m_0 \psi\, r i_2 r^\dagger = 0. \]
Multiplying by $r$ on the right, we would then obtain
\[ \frac{\partial \psi r}{\partial t} - \nabla \psi r\, i_1 + m_0 \psi r\, i_2 = 0, \]
which is the same as the original equation with $\psi$ replaced by $\psi r$.

I recall telling Dirac in 1949 that I could derive his equation with the help of quaternions. After thinking quietly for several minutes, as was his habit before speaking, he said, "Unless you can do it with real quaternions, I am not interested." As I had biquaternions in mind, it was perhaps this remark which finally persuaded me to abandon theoretical physics for pure mathematics. Looking at the problem again almost half a century later, in connection with a project to write an undergraduate textbook on the history and philosophy of mathematics (together with Bill Anglin), I realize that I should have replied: "Yes; but can you do it using a real wave vector?" It is only quite recently that I became aware of Dirac's 1945 article, which shows how to express Lorentz transformations with the help of real quaternions in a roundabout way. I don't know whether this idea was ever followed up.

The Electron in an Electromagnetic Field

What happens to an electron when an electromagnetic field is present? Classical physics requires that the kinetic energy $m$ should be augmented by the potential energy $-e\varphi$, where $-e$ is the charge of the electron and $\varphi$ is the potential. According to special relativity, the energy-momentum 4-vector $p$ should then be augmented by $-e\Phi$, where $\Phi = \varphi + j_1 \mathbf{A}$ is the 4-potential. Finally, quantum mechanics requires that $p$ be replaced by $i(d/dx)$; hence, $p - e\Phi$ by $i(d/dx) - e\Phi$. Multiplying this by $-i$, we see that Dirac's equation becomes
\[ \left( \frac{d}{dx} + ie\Phi \right) \Psi + j_2 m_0 \Psi = 0, \]
where $\Psi = [\psi]$, $\psi$ being a quaternion. One is tempted to replace the complex number $i$ by the real matrix $j_1$ as before, in which case Dirac's equation would become
\[ \frac{\partial \psi}{\partial t} - e\mathbf{A}\psi - (\nabla \psi - e\varphi\psi)\, i_1 + m_0 \psi\, i_2 = 0. \]
As long as $\psi$ is nonzero, we could multiply this equation by $\psi^{-1}$ and solve for $\mathbf{A}$. I do not know whether this makes any sense, so I shall allow $i$ to remain a complex number, thus forcing $\Psi$ to be a complex vector, hence $\psi$ a biquaternion.

Now $\Phi$ was only determined inasmuch as $(d/dx)^\dagger \Phi + F$ is a scalar. As pointed out by Hermann Weyl [1950], this property is not affected by a so-called gauge transformation: replacing $\Phi$ by $\Phi + d\alpha/dx$, where $\alpha$ is a real scalar.
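Why a gauge transformation leaves this scalar property intact can be spelled out in one line. The following worked equation is a sketch that assumes the usual quaternion product of two 3-vectors, $\mathbf{a}\mathbf{b} = -\mathbf{a} \circ \mathbf{b} + \mathbf{a} \times \mathbf{b}$, and uses $(d/dx)^\dagger = \partial/\partial t + i\nabla$; it shows that the extra term contributed by $d\alpha/dx$ is itself a real scalar.

```latex
\begin{aligned}
\left(\frac{d}{dx}\right)^{\dagger}\frac{d\alpha}{dx}
  &= \left(\frac{\partial}{\partial t} + i\nabla\right)
     \left(\frac{\partial \alpha}{\partial t} - i\nabla\alpha\right) \\
  &= \frac{\partial^2 \alpha}{\partial t^2}
     - i\,\frac{\partial}{\partial t}\nabla\alpha
     + i\,\nabla\frac{\partial \alpha}{\partial t}
     + \nabla(\nabla\alpha) \\
  &= \frac{\partial^2 \alpha}{\partial t^2}
     - \nabla\circ\nabla\alpha + \nabla\times\nabla\alpha
   = \frac{\partial^2 \alpha}{\partial t^2} - \nabla^2\alpha .
\end{aligned}
```

The mixed terms cancel because partial derivatives commute, and the curl of a gradient vanishes, so adding $d\alpha/dx$ to $\Phi$ changes $(d/dx)^\dagger \Phi + F$ only by a scalar.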
The same result could have been achieved replacing $\Psi$ in Dirac's equation by
\[ \Psi \exp(ie\alpha) = \Psi(\cos(e\alpha) + i \sin(e\alpha)), \]
for
\[ \frac{d}{dx}\left( \Psi \exp(ie\alpha) \right) = \left( \frac{d\Psi}{dx} + ie \frac{d\alpha}{dx} \Psi \right) \exp(ie\alpha). \]
[The argument depends on the fact that $\Psi$ commutes with $\exp(ie\alpha)$; it would not have worked had we replaced $i$ by $j_1$.]

Dirac's equation expresses the action of the electromagnetic field, as determined by $\Phi$, on the electron. It replaces the equation of motion discussed earlier. On the other hand, the contribution of the electron to the electromagnetic field was expressed by Maxwell's equations. In terms of $\Phi$ these can be written
\[ \frac{d}{dx} \left( \frac{d}{dx} \right)^{\dagger} \Phi = J, \]
where $J$ is the charge-current density. Only now we can calculate $J$ with the help of the wavefunction $\Psi$. The following considerations are adapted from Sudbury [1986], after translation into our language.

Recall the matrix form of Dirac's equation for an electron in an electromagnetic field:
\[ \left( \frac{d}{dx} + ie\Phi \right) \Psi + j_2 m_0 \Psi = 0, \qquad (1) \]
where $\Psi = [\psi]$ is a complex column vector, $\psi$ now being a biquaternion. Since the old symbols $*$ and $\dagger$ have changed their meanings in the course of our discussion, we shall now write $\Psi^T$ for the transpose of $\Psi$ and $\Psi^C = [\psi^*]$ for the complex conjugate of $\Psi$. It will be convenient to invoke the hermitian conjugate $\Psi^H = \Psi^{CT} = [\psi^*]^T$. Now
\[ \frac{d}{dx} = \frac{\partial}{\partial t} - j_1 \nabla, \qquad \Phi = \varphi + j_1 \mathbf{A} \]
are real symmetric matrices, but $j_2$ is antisymmetric. Multiplying (1) by $\Psi^H$, we obtain
\[ \Psi^H \left( \frac{d}{dx} + ie\Phi \right) \Psi = -\Psi^H j_2 m_0 \Psi, \qquad (2) \]
and the hermitian conjugate of this is
\[ \Psi^H \left( \frac{\overleftarrow{d}}{dx} - ie\Phi \right) \Psi = +\Psi^H j_2 m_0 \Psi, \qquad (3) \]
where the arrow indicates that differentiation operates leftwards. Adding (2) and (3), we obtain
\[ \Psi^H \frac{\overleftrightarrow{d}}{dx} \Psi = 0, \]
that is,
\[ \frac{\partial}{\partial t} (\Psi^H \Psi) - \sum_{\alpha=1}^{3} \frac{\partial}{\partial x_\alpha} (\Psi^H j_1 i_\alpha \Psi) = 0. \]
This equation resembles the equation of continuity
\[ \sum_{\alpha=0}^{3} \frac{\partial J_\alpha}{\partial x_\alpha} = 0, \]
which suggests that we define
\[ J_\alpha = e \Psi^H i_\alpha \lambda^*_\alpha \Psi \]
and consider $J = \sum_{\alpha=0}^{3} J_\alpha \lambda_\alpha i_\alpha$ as a candidate for the charge-current density. Here $\lambda_\alpha$ is defined as before, except that $i$ has now been replaced by $j_1$, thus $\lambda_\alpha = j_1$ when $\alpha > 0$ but $\lambda_0 = 1$. It remains to check that $J$ is transformed by a Lorentz transformation into $q J q^{*\dagger}$. Indeed, $J$ transforms into
\[ \sum_\beta e \Psi^H q^\dagger i_\beta \lambda^*_\beta q^* \Psi \lambda_\beta i_\beta, \]
and, by Lemma 1, this is
\[ \sum_{\alpha,\beta} e \Psi^H i_\alpha \lambda^*_\alpha \Psi A_{\alpha\beta} \lambda_\beta i_\beta = \sum_{\alpha,\beta} J_\alpha A_{\alpha\beta} \lambda_\beta i_\beta = q J q^{*\dagger}, \]
as was to be shown.

The wave vector $\Psi$ appearing in the Dirac equation (1) depends on the following three data:

(a) A point in Minkowski space represented by $x = t + j_1 \mathbf{r}$. The Lorentz transformation $x \mapsto q x q^{*\dagger}$ corresponds to a transformation $\Psi \mapsto q^* \Psi$.

(b) A choice of the 4-potential $\Phi$, compatible with the electromagnetic field, which itself depends on $x$. The gauge transformation $\Phi \mapsto \Phi + d\alpha/dx$ is equivalent to the transformation $\Psi \mapsto \Psi \exp(ei\alpha)$.

(c) The choice of $j_1$ and $j_2$, which determines $j_3 = j_2 j_1$. We had assumed that $j_\alpha = R(i_\alpha)$ but allowed $i_\alpha$ to be replaced by $r i_\alpha r^T$, $r$ being a quaternion with $rr^T = 1$. However, since $\Psi$ is now allowed to be a complex vector, we may as well allow $i_\alpha$ to be replaced by $r i_\alpha r^H$, where $r$ is a biquaternion with $rr^H = 1$. This induces a transformation $\psi \mapsto \psi r$, or $\Psi \mapsto R(r)\Psi$.

The group acting on $\Psi$ according to (a) is the group SU(2) of unitary transformations of determinant 1. The group acting on $\Psi$ according to (b) is the group U(1) of the complex numbers with absolute value 1. The group acting on $\Psi$ according to (c) is a projective representation of SU(3).

The physical theories underlying the foregoing discussion have been known for a long time; they are all discussed in the classic text by Hermann Weyl [1950], a translation of the German edition of 1930. Even some further steps are indicated there, in particular the so-called second quantization (accompanied by some hocus pocus to remove unwanted infinities). Following this procedure, one is told to replace the functions $\Psi$ and $\Phi$ by operators; but, according to Sudbury [1986], the form of the Dirac equation remains the same. These ideas culminate in quantum electrodynamics (QED), which has succeeded in making highly accurate predictions. More recent developments exploit the gauge theories initiated by Weyl. Thus, the strong force acting on a fermion is explained with the help of the group SU(3) and the electroweak force with the help of the group SU(2) x U(1). There does not seem to be any significance to the observation that these same groups arise in the above discussion.

What then is the final verdict on the usefulness of quaternions for physics? I am told that they are catching on as a tool for computation, but in the more general framework of Clifford algebras. Indeed, Dirac's original derivation of his equation implicitly used a Clifford algebra argument. One may cling to a feeling that there is something special about quaternions: their 4-dimensionality and the fact that they form a division algebra. These special properties of quaternions might be expected to put restraints on the nature of our universe. Unfortunately, the 4-dimensionality of Hamiltonian quaternions does not account for the difference between space and time in Minkowski space, and division plays no role in our story. These aspects of quaternions are called upon, however, when one expresses the algebra of 4-by-4 real matrices as a tensor product of $\mathbb{H}$ and $\mathbb{H}^{op}$, which fact we have exploited here. Be that as it may, I firmly believe that quaternions can supply a shortcut for pure mathematicians who wish to familiarize themselves with certain aspects of theoretical physics.
Acknowledgments

I would like to thank Niki Kamran and Prakash Panangaden for their advice and encouragement and an anonymous referee for his criticism. This work was supported by the Social Sciences and Humanities Research Council of Canada.
References

S. L. Altmann, Rotations, Quaternions and Double Groups, Oxford: Clarendon Press (1986).
A. W. Conway, On the application of quaternions to some recent developments of electrical theory, Proc. Roy. Irish Acad. A29 (1911), 80.
A. W. Conway, The quaternionic form of relativity, Phil. Mag. 24 (1912), 208.
A. W. Conway, Quaternions and quantum mechanics, Pontificia Acad. Sci. Acta 12 (1948), 204-277.
A. W. Conway, Hamilton, his life work and influence, Proc. Second Canadian Math. Congress, Toronto: University of Toronto Press (1951).
P. A. M. Dirac, Applications of quaternions to Lorentz transformations, Proc. Roy. Irish Acad. A50 (1945), 261-270.
F. Gürsey, Contributions to the quaternionic formalism in special relativity, Rev. Faculté Sci. Univ. Istanbul A20 (1955), 149-171.
F. Gürsey, Correspondence between quaternions and four-spinors, Rev. Faculté Sci. Univ. Istanbul A21 (1958), 33-54.
N. Jacobson, Basic Algebra II, San Francisco: Freeman (1980).
J. Lambek, Biquaternion vectorfields over Minkowski space, Thesis, Part I, McGill University (1950).
J. C. Maxwell, Remarks on the mathematical classification of physical quantities, Proc. London Math. Soc. 3 (1869), 224-232.
W. Moore, Schrödinger, Life and Thought, Cambridge: Cambridge University Press (1989).
P. J. Nahin, Oliver Heaviside, Sci. Am. 1990, 122-129.
L. Silberstein, Theory of Relativity, London: Macmillan (1924).
A. Sudbury, Quantum Mechanics and the Particles of Nature, Cambridge: Cambridge University Press (1986).
J. L. Synge, Quaternions, Lorentz transformations, and the Conway-Dirac-Eddington matrices, Commun. Dublin Inst. Adv. Studies A21 (1972), 1-67.
P. Weiss, On some applications of quaternions to restricted relativity and classical radiation theory, Proc. Roy. Irish Acad. A46 (1941), 129-168.
H. Weyl, The Theory of Groups and Quantum Mechanics, New York: Dover Publications (1950).

Mathematics Department
McGill University
Montreal, Québec, H3A 2K6
Canada
A Philosopher's Mathematician: Hans Hahn and the Vienna Circle

Karl Sigmund

"No one enters here who is no geometer," said a notice on Plato's door. In spite of these brave words, most contacts between mathematicians and philosophers, through the centuries, have been marked by mutual neglect and even animosity, almost as if the realm of abstract thought were too narrow to accommodate two species. Descartes and Pascal achieved greatness in both fields; but apart from such one-man enterprises, we had to wait for this century to find true cooperation between philosophers and mathematicians. It started with the collaboration of Russell and Whitehead on the Principia Mathematica, and led to the Vienna Circle, which flourished around 1930. The founder and arguably the center of the Vienna Circle was the mathematician Hans Hahn of Hahn-Banach fame. He was the mentor and thesis advisor of Kurt Gödel; he taught calculus to young Karl Popper, and encouraged him in his early work on the Logic of Scientific Discovery; he helped shape Carnap's philosophical development; and he hosted the celebrated evening that resulted in Wittgenstein's return to philosophy. In Hahn's collected works (three heavy volumes dealing mostly with analysis, topology and the calculus of variations) the philosophical writings take up little space; but he was a front-seat witness and a catalyst of the great foundational debate on mathematics and logic. By the time of Hahn's birth in 1879, the University of his home town, Vienna, could not boast of any outstanding mathematician, with the exception (briefly) of Ludwig Boltzmann. But Hahn found among his fellow students enough talent for a whole academy.
There was Gustav Herglotz, whose contributions to astronomy and number theory earned him in due course professorships in Leipzig and Göttingen; Herglotz's classmate Paul Ehrenfest, later one of the most influential theoretical physicists of his age (and the successor of Lorentz in Leiden); and Heinrich Tietze, an amazingly prolific mathematician best known for his work in topology, who became professor in Munich. This "inseparable foursome" actually separated quite soon, each sailing out from Vienna on a different course. But like Dumas's musketeers, they kept up a life-long friendship and frequently met again. Ten years later, for instance, Hahn and Tietze gave public lectures on elementary mathematics which, ten years later again, were published as a book. It took some time, however, before the foursome congregated. Hahn started by studying law, following his father's wish. After one year, he decided to assert his independence, switched to mathematics, and spent several terms at the universities of Strassburg and Munich. Only then did he return to Vienna. Hahn wrote his doctoral thesis as a student of Gustav von Escherich, under whose influence the Mathematics Institute had taken a decided turn for the better. Hahn's second examiner was Boltzmann, who, after job-hopping from Graz via Munich and Vienna to Leipzig, had returned to the chair of theoretical physics in his home town. Von Escherich, incidentally, had the distinction of discovering both Hans Hahn and (some years later) Johann Radon, who laid the mathematical foundations of tomography in 1917 and who had the Radon measure and the Radon transform named after him. Under von Escherich's guidance, both Hahn and Radon made their débuts in the calculus of variations, which at that time was in a particularly active phase (three out of the 23 Hilbert problems of 1900 dealt with the calculus of variations). In this field, Radon's contributions turned out to be more lasting than Hahn's; but Hahn became very quickly a widely recognized expert. The foremost mathematical centre at the turn of the century was doubtless Göttingen, and each of the foursome spent some time as a post-doc in the heady atmosphere of Hilbert's seminar. Hahn was not yet 25 when he was elected to write a chapter on the calculus of variations for Klein's prestigious Enzyklopädie der Mathematischen Wissenschaften. His co-author was Hilbert's assistant Ernst Zermelo. At about that time, Zermelo was formulating the axiom of choice which, some 25 years later, found one of its first applications in the proof of the Hahn-Banach theorem. Zermelo had been engaged in a vehement scientific controversy with Hahn's mentor Boltzmann, and later became one of the most persistent critics of Hahn's disciple Gödel.

THE MATHEMATICAL INTELLIGENCER VOL. 17, NO. 4 © 1995 Springer-Verlag New York
Looking for a Circle

Hot upon the encyclopedia article came Hahn's Habilitation. Again, Boltzmann was a member of the committee. By now he had as good as succeeded Ernst Mach as professor of philosophy. Mach, who had studied shock-waves as a physicist, had had a huge impact as a philosopher, influencing thinkers as diverse as Einstein and Lenin. The views of Boltzmann and Mach happened to be almost opposite on almost everything, but both were based on physics, rather than metaphysics. Hahn could not fail to be impressed by the union of scientific and philosophic thought in their work. But Mach had been incapacitated by a stroke, and Boltzmann committed suicide in 1906, so that Hahn was left, philosophically, more or less on his own. There was Bertrand Russell to lean on, who had just published his Principles of Mathematics. Hahn went into print with the statement that "Bertrand Russell will be seen as the most important philosopher of our time", a sentence which today would raise few eyebrows, but which rang strangely in a philosophical atmosphere dominated by the successors of Kant, Hegel, and even Saint Thomas.

The foursome had dispersed, but Hahn found (or rather, founded) during his post-doctoral years in Vienna a new circle of highly gifted, like-minded friends, who regularly met in some of Vienna's glorious coffee-houses for long discussions on philosophy and everything else. One of his closest friends was Richard von Mises, who a few years later designed the first giant airplane, and whose textbook on aerodynamics was to set the standard, in ever new editions, for five decades and more; von Mises, who became professor of mechanics in Berlin, Istanbul, and Harvard, also had a profound influence on the foundations of probability, and wrote an important textbook on positivism. Another member of the circle was Philipp Frank, who became the successor of Albert Einstein in Prague, on Einstein's fervent recommendation. (Actually Einstein first recommended Paul Ehrenfest, one of the inseparable foursome, who, jointly with his wife Tatjana, had written a superb, still much-quoted chapter of the Enzyklopädie der Mathematischen Wissenschaften, which greatly clarified Boltzmann's statistical mechanics. But old Franz Joseph, the emperor, would not appoint a professor without religious affiliation. Even Einstein had had no trouble resigning himself to what was generally regarded as an empty formality, but in spite of his urgings, and Hahn's, Ehrenfest adamantly refused.) Another close friend of Hahn's was Otto Neurath, a red-haired giant with a booming voice and encyclopedic interests. Neurath soon married Hahn's sister Olga, a remarkable woman who, although struck by blindness at an early age, wrote several influential papers on symbolic logic.

Hahn's work up to the First World War deals with an impressive variety of subjects, in addition to his steady flow of papers on the calculus of variations.
He wrote a surprisingly prescient paper on the hydrodynamics of the Boussinesq flow, jointly with Herglotz (of the inseparable foursome) and Karl Schwarzschild (who was soon to create modern astrophysics). In spite of the eminence of the three authors, however, this work remained completely neglected. Hahn also ventured into algebra, characterising ordered abelian groups by an imbedding theorem. This was recognised, but only much later, as a fundamental result in the theory of ordered vector spaces, and was named after him, but Hahn never returned to the field. Hahn solved a problem of Lebesgue, showing that the fundamental theorem is no longer valid if infinite values of the derivative are allowed; and he proved an analogue of the Weierstrass product theorem for holomorphic functions of two complex variables. Hahn also caught on early with Fréchet's topological spaces, and characterised neatly those admitting non-constant continuous functions; and he published a proof of the Jordan curve theorem for polygons, using only Hilbert's axioms for plane Euclidean geometry, and bypassing any continuity considerations. To one of the still rather rare female students at the institute, he set the task of extending this result to
higher dimensions. By the time she published her result, Lilly Minor had become Hahn's wife. She withdrew from active scientific work, but kept a lifelong enthusiasm for the subject, and her house later became a focus of the social life of Viennese mathematicians.
Ticket to Czernowitz

Hahn's academic career made due progress. In an Austria dominated by the initials k.u.k. (kaiserlich und königlich, turned into Kakanien in Robert Musil's Man without Properties, whose hero Ulrich, incidentally, was a Viennese mathematician of about Hahn's age), such a career usually began with an appointment as extraordinary professor in a far-off provincial town. In Hahn's case, as in Escherich's, this was Czernowitz, at the very border between the Habsburg empire and Czarist Russia, a wearying 26-hour train ride from Vienna. The idea was that a young professor should get his act together and prove his mettle before being promoted to some more established university like Prague or Graz and ultimately to Vienna. Young actors at the time had to embark on some similar campaign if they wanted to end up at a major theater of the capital. Needless to say, many got stuck along the way. But Hahn seems never to have doubted that his career would lead him back to Vienna. When he took leave of his friends (in a coffee-house, of course), he already announced his plans for the coming return. Discussions would be resumed, but this time "with the help of Universitätsphilosophen" (in contrast to the Kaffeehausphilosophen of which there were plenty). It took 12 years to achieve this goal. Hahn's self-confidence was soon bolstered by a spectacular success. He solved the problem of topologically characterising the continuous images of compact intervals. Some 20 years before, Peano and Hilbert had astonished the mathematical world by exhibiting space-filling curves (see H. Sagan, 1993. Math. Intelligencer 15(4), 37). The time-honoured definition of a curve as a continuous image of a compact interval had obviously to be handled with care, if it could yield such uncurvelike objects as the full square, for example. Of course this could not happen if the continuous mapping was one-to-one. On the other hand, Hilbert's example was nowhere four-to-one.
Hahn showed that such a square-filling map had to be at least two-to-one at a continuum of points, and at least three-to-one on a dense set. More importantly, Hahn showed that the continuous images of segments were exactly those compact and connected sets that were locally connected. The same result was obtained independently, and almost at the same time, by Stefan Mazurkiewicz, one of those young mathematicians who, with amazing purpose, put Poland on the map of the mathematical world some years before it reacquired its geographical boundaries. Throughout his lifetime, Hahn remained proud of this theorem, which
inspired a lot of research on dimension theory and counts as a mainstay of general topology. During his Czernowitz years, Hahn was drawn towards what was to become functional analysis. Calculus of variations is one of the best motivations for studying functionals, i.e., functions defined, not on sets of points, but on sets of functions. Hahn wrote a survey on integral operators, and greatly improved on Hellinger's spectral theory for bounded orthogonal forms, couching it in the framework of Lebesgue integration.
Years of Upheaval
When the World War broke out, Hahn was enrolled in the k.u.k. army. He was severely wounded in the lung in 1915 on the Italian front. Even after his recovery, the future looked rather grim. Czernowitz had been quickly overrun by the Russian army, so that Hahn had lost both his home and his job. For a while, he had to make do with teaching in a cadet school. He nevertheless found time for writing important memoirs on harmonic analysis and integration theory. Then things picked up again: in 1917, he became a full professor in Bonn. A short while later, he occasioned a scandal by handing out pacifist leaflets. We are told by his close friend Schumpeter, the famous economist, who had also moved from Czernowitz to Bonn, that this made him persona non grata at German universities. (At about the same time, Bertrand Russell was sentenced for his pacifist views; he used his time in jail to write an Introduction to Mathematical Philosophy.) Some of Hahn's finest work was done during the chaotic years following the collapse of the German and Austro-Hungarian empires. He proved, for instance, the well-known "sandwich theorem" (if an upper semi-continuous function is dominated by a lower semi-continuous function, we can "sandwich" a continuous function in between). He characterised the set where a series of continuous functions diverges as a countable union of countable intersections of open sets. And he showed that if a function is continuous in its variables x_1, ..., x_n separately, then its points of continuity are dense in every hyperplane x_i = const. Hahn also discovered that every signed measure could be written as the difference of two positive measures, a result which corresponds to Jordan's representation of a function of bounded variation as the difference of two monotonically increasing functions (this is known today as the Hahn-Jordan decomposition theorem).
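In modern notation (the symbols below are ours, not Hahn's), the sandwich theorem and the decomposition theorem just mentioned can be sketched as follows. The sketch assumes real-valued functions on a metric space X for the first statement, and a signed measure μ on a measurable space (X, Σ) for the second.

```latex
% Sandwich theorem (X a metric space, C(X) the real continuous functions on X)
\[
g \le h \text{ on } X,\quad g \text{ upper semi-continuous},\quad h \text{ lower semi-continuous}
\;\Longrightarrow\; \exists\, f \in C(X) \text{ with } g \le f \le h .
\]
% Hahn decomposition of X and the resulting Jordan decomposition of \mu
\[
X = P \cup N,\quad P \cap N = \emptyset,\quad
\mu(A \cap P) \ge 0,\quad \mu(A \cap N) \le 0 \quad (A \in \Sigma),
\]
\[
\mu = \mu^{+} - \mu^{-},\qquad \mu^{+}(A) = \mu(A \cap P),\qquad \mu^{-}(A) = -\mu(A \cap N).
\]
```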
In yet another paper, Hahn used results of Riesz to discuss the representation and convergence of interpolation operators on the space of continuous functions. Functional analysis was in the air, and Hahn was about to become one of its founding fathers. He took up a thread spun by the young Viennese mathematician Eduard Helly, who had written a seminal paper on linear operators in infinite-dimensional spaces before being snatched away by the war. In 1920, Helly surfaced again: he had been a prisoner of war for almost six years, surviving an excruciating odyssey through POW camps from Siberia to Egypt. His mathematical powers had not deserted him, however, and he resumed his ground-breaking work on systems of linear equations with infinitely many variables. Hahn extended it in a truly astonishing memoir which ranks as one of the cornerstones of functional analysis. Hahn showed that no less than 23 apparently quite unrelated problems could be solved within the same framework. All one needed was a linear structure with a norm and a completeness property. Indeed, as Harro Heuser formulates it: Basically, [Hahn] does nothing else than add to the axiomatically defined norm of Helly an axiomatically defined linear space. This is what Fréchet later called a Banach space (and Banach, modestly, designated as espace du type (B)). Indeed, by the time Hahn's paper came out, Banach's famous thesis was already one year old (both appeared in print in 1922). That thesis started out by defining a complete, normed vector space, and included the theorem which Hahn also used to good effect and which is nowadays called the uniform boundedness principle.
THE MATHEMATICAL INTELLIGENCER VOL. 17, NO. 4, 1995
The Circle Starts Rolling
In 1921, Escherich retired, and Hahn returned to Vienna to take over his former supervisor's chair (runners-up had been Radon and Tietze). Hahn had not lost his philosophical bent in Bonn: in fact, he had recently edited Bolzano's Paradoxes of Infinity. (Bolzano, a Catholic priest living in Prague, had anticipated not a few of the insights of Weierstrass, Cantor, and even Poincaré, but his work had long been suppressed by the Church.) A few months after Hahn took up his position in Vienna, the chair of philosophy that once had been Mach's became vacant again. This was Hahn's opportunity to attract the "Universitätsphilosophen" he had been longing for. He managed to persuade faculty and ministry to appoint the German Moritz Schlick, who had written his thesis under Max Planck's supervision and therefore shared, with Mach and Boltzmann, a physicist's background. Schlick's philosophical outlook was deeply influenced by Einstein, with whom he was on close terms. Not yet in their mid-forties, Schlick, Hahn, and Neurath (who had returned to Vienna after wild years culminating in a prison term for "high treason" in overthrowing Bavarian authority) were the senior members of the Vienna Circle. As Sir Karl Popper described it: According to what I later heard from several members, Hahn was the spiritual founder of the Vienna Circle and his brother-in-law Neurath its organizer... Schlick was at first, I believe, a kind of honorary president. But he became very active. The three men launched a vigorous program of public lectures, seminars, and (of course) coffee-house meetings, which was to bear remarkable fruit. Promising youngsters gathered round them. On Hahn's recommendation, the 30-year-old German, Kurt Reidemeister,
was appointed as associate professor in geometry. Reidemeister, who was soon to leave his mark on geometry and knot theory, was immediately captivated by the Circle. Another young German philosopher named Carnap joined the group. Carnap was to become the Circle's most enduring voice in the later years of exile and dispersal. Among the undergraduates inspired by the new enthusiasm reigning in the Institute of Mathematics, three prodigies stand out. Karl Popper, who had been disappointed by his first brush with calculus, found delight in the clarity of Hahn's lectures. Popper's star, however, was to rise only much later. Kurt Gödel, a quiet, intense young man from Brno, just over the newly-acquired frontier with Czechoslovakia, impressed everyone who attended Hahn's seminar on the Principia Mathematica. But the undisputed star among the students was Karl Menger, the son of the famous economist Carl Menger (who, incidentally, had been the tutor of Crown Prince Rudolf of dreary Mayerling fame). He recalled: In March of 1921 (I had just completed one semester at the university) H. Hahn joined the mathematics faculty. His first release was the announcement of a seminar on the curve concept... I hesitated, but finally mustered up my courage to audit the first session. There, without introduction, Hahn formulated the problem of making precise the idea of curve that everyone has but no one had been able to articulate... I left in a daze... After a week of complete engrossment in the question I presented Hahn a solution in terms of the simplest set-theoretical concepts... Hahn agreed that mine was a promising attack on the problem.
A severe lung disease forced Menger, a few weeks later, to withdraw to a sanatorium for more than one year. When he came back to the institute, he brought with him several seminal papers on the concept of dimension which quickly earned him a Ph.D. with Hahn. (Unknown to them, the Russian Urysohn obtained many of the same results at the same time, but died in a drowning accident before they were published.) Hahn encouraged Menger to take up a post-doc job in Amsterdam, where Brouwer's magisterial work and personality could not fail to fascinate anyone interested in topology and the foundations of mathematics. Another student of Hahn's, the Pole Witold Hurewicz, soon followed Menger to Amsterdam. Other budding topologists remained for a while in Vienna and received their Habilitation from committees headed by Hahn: one was Leopold Vietoris (of the Vietoris-Mayer sequences), who, incidentally, celebrated his hundred-fourth birthday a short while ago. Another was Georg Nöbeling, who became a close co-worker of Menger's for a while. Hahn had been working on the theory of Fourier transforms, somewhat along the lines of Wiener's contemporary General harmonic analysis, but emphasising pointwise convergence rather than convergence in the quadratic mean. But now, his interest turned increasingly to functional analysis. He applied it to elucidate the role of Lagrange multipliers in the calculus of variations, for instance, and to discuss summation methods (introducing, among others, a space of null sequences which is sometimes called the "Hahn sequence space"). Hahn's close contacts with Helly were a constant source of inspiration. Helly, who had found no regular employment at the University and was obliged to work, first at a bank and later at a life insurance firm, continued with his ground-breaking work in functional analysis. Hahn may also have been influenced by his friend Tietze's extension theorem for bounded continuous functions. In 1927, Hahn published in the Crelle Journal a paper on "Systems of linear equations in linear spaces" in which he proved, almost as a side result, an extension theorem of his own: any bounded linear functional can be extended from a subspace to the whole Banach space, while keeping the same norm. This turned out to have tremendous consequences. To quote Dieudonné's History of Functional Analysis: It may be rightly said that with this paper, duality theory at last had come into its own. And in Heuser's Functional Analysis, we read: It is not saying too much if one terms [this theorem] the crown-jewel of functional analysis.
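Stated in present-day terms, Hahn's extension theorem reads as follows. This is a sketch in our own notation, assuming a real normed space X with a linear subspace M; completeness of X is not actually needed for the statement.

```latex
\[
f : M \to \mathbb{R} \text{ linear and bounded}
\;\Longrightarrow\;
\exists\, F : X \to \mathbb{R} \text{ linear with } F|_{M} = f \text{ and } \|F\| = \|f\|,
\]
\[
\text{where } \|f\| = \sup\{\, |f(x)| : x \in M,\ \|x\| \le 1 \,\}.
\]
```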
Two years later, Banach proved the same theorem, using the same technique of transfinite induction. The Polish school, and especially the young Mazur, were quick in recognizing the extraordinary power of this result. But this time, Hahn had beaten Banach to the draw. The Polish leader of functional analysis rose to the occasion, and published in the Studia a note acknowledging Hahn's priority. In 1928, Bologna hosted the World Congress of Mathematicians, where mathematicians from the former Central Powers were, for the first time since the war, welcome to attend. Hahn chose to speak not on his work in functional analysis, but on continuous images of compact intervals. Using some recent work of Hausdorff, he had devised a most elegant, concise proof of the characterisation theorem due to Mazurkiewicz and himself. The leading star of the Congress of 1928 was, of course, David Hilbert, who had made a widely applauded entrance at the head of the German delegation, and who set a challenge to the mathematical community which was intended to rival his famous 23 problems from the Congress of 1900. Hilbert wanted to see mathematics firmly set on a secure foundation. He had, for years, championed a program to prove, by finitary means, the consistency of formal axiom systems. In Bologna, Hilbert set out four open problems concerning
the consistency and completeness of various systems on which elementary logic, number theory, and analysis would be based. He was remarkably confident, even claiming (wrongly, as it turned out) that the consistency of number theory had been proved already, but said that a full solution of the task "needed the devoted cooperation of the younger generation of mathematicians." Hahn lost no time in communicating Hilbert's views to the Vienna Circle, and in encouraging young Gödel, the student who had shone so brilliantly in the Principia Mathematica seminar, to set to work on it.
Attracted by a Tractate
By now, Bertrand Russell had been superseded in the pantheon of the Vienna Circle by another prophet who was both much closer and much more remote: Ludwig Wittgenstein. This eccentric Viennese, who had been a student of Bertrand Russell in Cambridge before the war, and who had written, in Ukrainian trenches and Italian prison camps, what he viewed in all modesty as "the final answer to the problems of philosophy," had had a terribly hard time in finding a publisher for his crisp Tractatus logico-philosophicus. Wittgenstein had inherited one of the largest fortunes of Europe (which, because his father had shifted it to the United States in an uncanny anticipation of the cataclysm, had actually grown during the war), but had divested himself of it all. He neither could nor would pay the page charges which publishers required for his "final answers." Of course, his sister could have helped him easily. Margaret Stonborough-Wittgenstein, whose portrait is one of the best known works by Gustav Klimt, had set up a fund, in the dark years after the war, to defray publication costs for deserving authors. (Some of Hahn's articles were sponsored by her.) But an indignant Wittgenstein would have none of that sisterly help. It took a full five years, and a foreword by Bertrand Russell, before the Tractatus saw the light of day. Wittgenstein had, in the meantime, become a teacher at a Lower Austrian village school, where he tyrannised farmers' kids with the same imperiousness which he had used on Bertrand Russell and John Maynard Keynes in the old Cambridge days. The thin booklet caught the attention of Reidemeister. An author who was Russell's friend did clearly merit study. It turned out that the book was not an easy read. The Circle spent a whole year discussing it sentence by sentence.
The iconoclastic Neurath called it metaphysics (the strongest word of censure in the Circle), but Hahn was, after some initial difficulties, completely won over to its views on logic and on mathematics; and the gentle, urbane Professor Schlick took the train, with some like-minded friends, to Wittgenstein's village and humbly knocked at the prophet's door. The door, to their dismay, remained shut. It transpired that the prophet had moved back to Vienna. In his impatience
with the schoolchildren, Wittgenstein had gone a few steps too far, and had had to resign his teacher's job in order to avoid being asked to do so. Eventually, Wittgenstein's sister managed to arrange a dinner where the university philosopher met her fiery brother. Schlick returned home ecstatic. Considering that there was nothing left to say about philosophy, Wittgenstein had been quite gracious, in fact. Persistent wooing by Schlick and his assistant Waismann softened him, after a few months, to the point of meeting regularly with a few select members of the Circle, under the proviso that the discussions would not have to be philosophical. In 1927, Reidemeister had been called up to a chair in Königsberg. To Helly's disappointment, it was Karl Menger who, not yet 26, was appointed to succeed him. (Austria could no longer send its young professors to provincial universities: there weren't enough provinces left.) Menger departed from Amsterdam with a sense of relief: owing to some priority dispute, a chill had fallen on his contacts with Brouwer. Nevertheless, Menger and Hahn set out to invite Brouwer to Vienna. Some members of the Vienna Circle talked Wittgenstein into attending one of his lectures. When Wittgenstein, at length, deigned to appear, Hahn welcomed him with open arms and urged him to take a place in the front. Wittgenstein, rather gruffly but true to his Tolstoyan style, insisted on a modest place in the fifth row. But Brouwer's lecture clearly fascinated him; to everyone's delight, he came along to the usual after-session in the inevitable coffee-house, and there, incredibly, started to talk philosophy again. It was as if flood-gates had been opened. Obviously, there was something left for him to say. The reborn Wittgenstein soon returned to Cambridge, and never again stopped with philosophy, mostly taking up positions that were at odds with the Tractatus. He left the task of publishing his notes (posthumously) to a devoted group of followers.
The Road to Königsberg
Young Gödel had also been at Brouwer's lecture. Besides quietly attending the Thursday evenings with the Circle, and tutoring Carnap on mathematical logic, he worked steadily on his thesis, and had made an in-depth study of the recently published introductory book of Hilbert and Ackermann (which, incidentally, contained two references to Hahn's sister Olga). By mid-1929, Gödel had solved the fourth problem of Hilbert's Bologna address: the proof that first-order logic was complete, that is, that every true statement could be derived from its axioms. (First-order logic allows the statement that f(x) holds for all x, but not that it holds for all f.) Hahn was delighted, and rushed the paper to publication in the Monatshefte, the mathematics journal that had been founded by von Escherich and was now edited by Wirtinger and Hahn.
Genius Loci: In the Café Reichsrat, Gödel first announced his incompleteness theorem to Carnap. The members of the Vienna Circle usually met in various coffee-houses; the bi-weekly Thursday sessions took place in a seminar room in the Institute of Mathematics. After one session in which Schlick, Hahn, Neurath, and Waismann had talked about language, Gödel confessed to Menger, "The more I think about language, the more it amazes me that people ever understand each other."
The next Congress of German Mathematicians was to take place in Königsberg. With Reidemeister (who had just finished his book Die Grundlagen der Geometrie, dedicated to Hahn) on the spot, it was easy to add to this event the first international conference on the philosophy of mathematics. The Vienna Circle planned to attend in strength. Königsberg was the birthplace of both Kant and Hilbert. A whole-hearted opposition to Kant's views was probably the strongest link between the members of the Vienna Circle. But it was the venerated David Hilbert who was in for a shock. To quote Carnap's diary: "August 26, 1930: 6 to half past 8, in the Café Reichsrat coffee-house with Feigl, Gödel, later Waismann. Plan of travelling to Königsberg by boat. Gödel's discovery: incompleteness of the system of Principia Mathematica." Gödel had managed to construct, by an ingenious diagonalisation procedure, a sentence that asserted its own unprovability. In a consistent system, it could not be proved, and hence was true. Consequently, any finite axiom system that was rich enough to allow for arithmetic had to contain statements that were true, but not provable in the system. The Hilbert program was shaken to the core.
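In present-day notation (ours, not the article's), the self-referential sentence can be sketched as follows. The sketch assumes a consistent, effectively axiomatised theory T containing enough arithmetic, writes Prov_T for its provability predicate, and uses corner quotes for a coding of sentences by numbers.

```latex
\[
T \vdash\; G \leftrightarrow \neg\,\mathrm{Prov}_T(\ulcorner G \urcorner)
\qquad\Longrightarrow\qquad
T \nvdash G .
\]
```

If T proved G, it would also prove that G is provable, and hence, by the equivalence, the negation of G, contradicting consistency. So G is unprovable in T, which is exactly what G asserts.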
The Königsberg conference on the philosophy of mathematics had been intended as a tournament between the views of Russell, Brouwer, and Hilbert. None of the great men attended the event, but they had formidable champions: Carnap spoke for logicism, Heyting for intuitionism, and John von Neumann for formalism. Hilbert was in town, but busy preparing his talk for the upcoming ceremony: he had been elected honorary citizen of Königsberg. In blissful ignorance of Gödel's results, Hilbert broadcast his conviction that there was no unsolvable problem in mathematics, ending with the words: "We must know. We shall know." In a special issue of the journal Erkenntnis, one can find the text of the lectures of Carnap, Heyting, and von Neumann, as well as a summary of the subsequent historic discussion which conveys its drama. The debate was at first monopolised by Hahn, who chaired the session, with a long disquisition on the place of mathematics in an empiricist philosophy, which came out strongly in favour of Russell's (and Wittgenstein's) logicist point of view. This was followed by a vague attempt of Carnap's at combining formalism and logicism, contingent on a proof of consistency being found! Next came a short exchange between John von Neumann and
Hans Hahn on the proper role of the axiom of reducibility, and an argument by Heyting on the compatibility of intuitionism and formalism, again under the assumption of a proof of consistency. It was only then that Gödel joined the discussion. His first remarks dealt with the role of a consistency criterion for formal theories; they convey the impression that he had decided not yet to mention his incompleteness result. But then, after an interjection by von Neumann, Gödel stated off-handedly that there exist true sentences which cannot be proved, so one could add the negation of such a sentence to the axioms and obtain a system which was just as consistent as before, but now contained a patently false statement. With his few sentences, Gödel had knocked the breath out of the discussion. That issue of Erkenntnis ends with a brief postscript, written by Gödel at the editor's request, describing his incompleteness result. John von Neumann took Gödel aside, after the debate, and had him explain his results in detail. He immediately grasped their importance, and understood very soon afterwards an amazing consequence: if a consistent system of axioms was rich enough to allow for arithmetic, one could not prove its consistency within that system. But Gödel had, in the meantime, drawn the same conclusion. He thus had solved the remaining three of Hilbert's four Bologna problems, but the answers were the opposite of what Hilbert had expected! John von Neumann was enormously impressed, and lost no time in recommending Gödel to the newly founded Princeton Institute. So did Menger (who had been in the States during the Königsberg events). In Vienna, Gödel's work was speeded through the Monatshefte. It earned him the Habilitation with a glowing report by Hans Hahn: ... a scientific achievement of the first order ... it can be safely predicted to earn a place in the history of mathematics ... Herr Gödel is already acknowledged as the foremost authority on symbolic logic and on the foundations of mathematics. For a few years, Vienna was the Mecca of mathematical logic, regularly visited by Alfred Tarski, Willard van Orman Quine, Alfred Ayer, and John von Neumann. In quick succession, Gödel published several important notes in Menger's Ergebnisse eines mathematischen Kolloquiums, three of them as answers to problems posed by Hahn; their general trend was to relativise the intuitionist position and to underline the inadequacy of finite models. This agreed perfectly well with the point of view of Hahn, who derided "intuition" as force of habit rooted in psychological inertia, and who wrote: There is no absolute proof of freedom from contradiction for the theory of sets and thus no absolute proof of the mathematical existence of infinite sets and infinite numbers. But neither is there any such proof for the arithmetic of finite numbers, nor for the simplest part of logic ... Hence we
can ascribe mathematical existence to infinite sets and Cantor's transfinite numbers with approximately the same certainty as we ascribe existence to finite numbers.
The Spell of Wittgenstein
Interestingly, Hahn never mentioned Gödel's name in his philosophical writings; neither did Wittgenstein, by the way. Both viewed Gödel's work as mathematical rather than philosophical. For them, the basic questions concerning the philosophy of mathematics had more to do with applying mathematics to the real world than with completeness or consistency, which they viewed as internal affairs, so to speak. Wittgenstein said to Waismann:
Would the calculations mathematicians have made through the ages suddenly come to an end because a contradiction has been found in mathematics? Certainly not. Hahn shared the same pragmatic attitude, which is that of most mathematicians. He wrote in a comparable vein: From the fact that no contradiction is known [in the new logic], it does not follow that none exists, any more than the fact that in 1900 no okapi was known proved that none existed... On the basis of present knowledge it may be said that an absolute proof of freedom from contradiction is probably unattainable... But is not this concession fatal to the logistic position, according to which mathematical existence depends entirely on freedom from contradiction? I think not. For here, as in every sphere of thought, the demand for absolute certainty of knowledge is an exaggerated demand: in no field is such certainty attainable. This is quite different from Hilbert's faith. As a philosopher, Hahn first and foremost was a dedicated empiricist in the sense of Hume: all knowledge of the real world was to be founded on experience. I take this empiricist position, not because I have selected it from among several possible positions, but because it appears to me to be the only possible one, because any real knowledge gained by pure thought... appears to me to be completely mystical. Mathematics, however, seems to offer knowledge which is not founded on observation. I can imagine that a stone will, tomorrow, not fall downwards; but not that tomorrow, two times two will be different from four. Since mathematical theorems cannot be falsified by experience, they cannot be based on experience. This gives rise to what Hahn terms the fundamental question: How is the empiricist position compatible with the applicability of logic and mathematics to reality?
Like Russell, Hahn was persuaded that mathematics was founded in logic. But while it had been commonly assumed that the laws of logic deal with the most general properties of objects, Wittgenstein claimed that they do not deal with objects at all, but only with the way we speak about objects; or, in Hahn's words, Logic does not say anything about the world but has to do only with the way in which I talk about the world, and it should be evident that on this view the existence of logic is directly compatible with the empiricist position. Both logical and mathematical propositions are just tautologies, that is, rules for turning statements into equivalent statements. Hahn wrote: Language correlates combinations of symbols with states of affairs in the world, and the way it correlates them is not one-to-one (which would be quite pointless) but many-to-one; and logic gives the rules about the way in which one combination of symbols can be transformed into another one which designates the same state of affairs; this is what is called the tautological character of logic. Being tautological does not mean being trivial, of course: It seems hardly credible at first sight that the whole of mathematics with its hard-earned theorems and its frequently surprising results could be dissolved into tautologies. But this argument overlooks just a minor detail, namely the circumstance that we are not omniscient. And on another occasion, An omniscient subject needs no logic, and contrary to Plato we can say: God never does mathematics. The mathematician Hahn would have found Plato's door wide open; but the philosopher Hahn would have lost little time in starting to argue with his host.
Hahn was a dedicated anti-Platonist; he characterised all rationalist and idealist thinking as world-denying philosophy befuddled by meaningless abstractions like the "Ding an sich" and opted vigorously for the world-affirming empiricist philosophy based upon our senses. Our conviction is that the world-denying philosophy greatly overestimates thought. For all thought is nothing but transformation; thought can never lead to anything new. Hahn emphatically repudiated German thinkers from Kant to Heidegger, and hailed the "deliverance" coming from England through Bertrand Russell (of course), David Hume, and John Locke, all the way back to the scholastic doctor William of Occam, whose celebrated razor did away with superfluous notions. In fact, Hahn's first philosophical publication, a pamphlet written in 1930, was entitled Occam's razor and showed how some of the pitfalls associated with transcendent concepts like "impossible entities," "universals," or "empty space"
were solely due to the uncritical use of an inappropriate language. If the first error of the world-denying philosophy is to overestimate thought, its second basic error is to overestimate language... Language is an extraordinarily imperfect instrument, which constantly reveals the primitive grimaces of our primeval ancestors, as when a highly enlightened free-thinker cannot help feeling uneasy if he is the thirteenth person at table. We find this view succinctly encapsulated in a sentence that Wittgenstein wrote years later: Philosophy is combat against the bewitchment of our intelligence by means of language. Other quotes by Hahn have distinctly the flavour of Wittgenstein's (posthumously published) Philosophical Investigations. For example, If someone does not want to accept logical inference, it is not that he has a different opinion from mine about the behaviour of objects, but that he is refusing to talk about objects according to the same rules as I; it is not that I cannot convince him, but that I must refuse to go on talking with him, just as I shall refuse to go on playing tarot with a partner who insists on taking my fool with the moon. But whereas Hahn quoted Wittgenstein in each of his philosophic writings, and never tired in expressing his admiration and indebtedness, Wittgenstein did not return in kind; we find Hahn's name just once, in a footnote of the Philosophical Remarks, where an odd argument about the role of finitary procedures is attributed to him. Take, for instance, the claim: I will now obtain the sexagesimal expansion of π by repeatedly throwing a die. This claim will probably turn out to be wrong after the first few throws. But we can't hope to see it turn out right, because we cannot perform infinitely many throws. Intuitionists used examples of this kind to cast doubt on the law of the excluded middle for statements involving infinite sets.
More generally, both intuitionists and formalists place a special emphasis on finiteness and finitary procedures. Hilbert and Brouwer, in spite of their fierce differences, agreed on viewing finite mathematics as more basic than the rest. Hahn countered this with a thought-experiment. Suppose that (with practice), we halve the time we need for a throw at each step. If we need half a minute for the first throw, a quarter of a minute for the next, etc., the job would be over in just one minute, since 1/2 + 1/4 + 1/8 + ... = 1. (Recently, Ian Stewart took this idea further, by sketching how a rapidly accelerating computer could be devised, within the realm of classical physics, that doubles its speed at every step. Of course, it so happens that classical physics is not valid in our world: but we cannot allow mere physical happenstances to impinge on questions of logic.) In a sense, our emphasis on finite decision procedures is coincidental. It tells something about us, not about logic. This is an instance of the principle of tolerance in action, which was first formulated by Menger and later adopted by Carnap and Popper, among others. It says that there is no privileged part of mathematics, more secure or more real or more intuitive than the rest; that one can speak neither of the logic nor of the language, and that any demarcation is arbitrary. In a similar spirit, Hahn wrote (a full year before Gödel announced to von Neumann his proof that the axiom of choice was compatible with the usual other axioms of set theory): We can do mathematics using the axiom of choice (a 'Zermelian mathematics'), and we can do mathematics using an axiom stating the contrary (a 'non-Zermelian mathematics'). The whole question has nothing to do with reality... or with pure intuition... It depends on which sense we decide to give to the word 'set'. In the early 1930s, Wittgenstein's thinking centred mostly on the philosophy of mathematics, but he restricted his contacts to the philosophers' half of the Vienna Circle, and shunned Gödel, Menger, and Hahn. He may have felt that he could not expect from his mathematical followers the same adulation as from Waismann or Schlick. Menger noted a characteristic episode. In the Tractatus, Wittgenstein wrote of the "apparently unimportant fact" that logical notation needs parentheses. But later, heedless of this edict, the Polish logician Łukasiewicz devised a notation needing no parentheses. When Menger apprised the Circle of this, Waismann immediately launched into a blundering defense of Wittgenstein's view. Hahn stopped him, rather testily, with the words: "But Mr. Waismann, why not admit that on this point Wittgenstein was evidently mistaken?"
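To see why the parenthesis-free notation mentioned above is unambiguous, here is a minimal sketch of an evaluator for propositional formulas written in prefix form. It assumes Łukasiewicz's usual connective letters (N for "not", K for "and", A for "or", C for "implies", E for "if and only if") and lowercase letters for variables; the function names `evaluate` and `_eval_at` are our own and purely illustrative.

```python
from typing import Dict, Tuple

# Assumed connective letters: N = not, K = and, A = or,
# C = implies, E = if and only if. Variables are looked up in env.
BINARY = {
    "K": lambda a, b: a and b,
    "A": lambda a, b: a or b,
    "C": lambda a, b: (not a) or b,
    "E": lambda a, b: a == b,
}


def _eval_at(formula: str, pos: int, env: Dict[str, bool]) -> Tuple[bool, int]:
    """Evaluate the subformula starting at pos; return (value, next index)."""
    if pos >= len(formula):
        raise ValueError("formula ended while an operand was still expected")
    symbol = formula[pos]
    if symbol == "N":
        # A unary connective is followed by exactly one subformula.
        value, nxt = _eval_at(formula, pos + 1, env)
        return (not value), nxt
    if symbol in BINARY:
        # A binary connective is followed by exactly two subformulas,
        # so no parentheses are needed to tell where each one ends.
        left, nxt = _eval_at(formula, pos + 1, env)
        right, nxt = _eval_at(formula, nxt, env)
        return BINARY[symbol](left, right), nxt
    if symbol in env:
        return env[symbol], pos + 1
    raise ValueError(f"unknown symbol {symbol!r} at position {pos}")


def evaluate(formula: str, env: Dict[str, bool]) -> bool:
    """Evaluate a whole prefix-notation formula under the assignment env."""
    value, nxt = _eval_at(formula, 0, env)
    if nxt != len(formula):
        raise ValueError("trailing symbols after a complete formula")
    return value


# "CKpqp" reads: (p and q) implies p.
print(evaluate("CKpqp", {"p": True, "q": False}))
```

Each connective announces in advance how many subformulas follow it, so the string can be read left to right in exactly one way; this is the property that makes parentheses dispensable.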
Science Lectures Rival the Opera
Moreover, Wittgenstein was estranged by the more mundane activities of what he termed, in a letter to Waismann, "Hahn and Neurath and that clique." This politicking was one of the reasons why Gödel, Menger, and Popper, the brightest of the rising generation, all made a point of keeping at some distance from the Vienna Circle, although they owed so much to it. The Circle was purportedly a completely private grouping, but some of its members, Neurath and Hahn in particular, felt that they also had a public duty to perform. Accordingly, they founded an "Ernst Mach Society" devoted to spreading the "scientific world-view." This was the time of Red Vienna, when an intellectually supercharged "Austro-marxist" party, strongly leftist, but strictly anti-communist, dominated the town hall and embarked on a vast programme of public welfare and general education. Staff and students of the university
were predominantly fascist or clerical right-wingers, so that "gown," at that time, was considerably more reactionary than "town." Hahn, however, was chairman of the socialist professors, and president of the freethinkers' association. (He combined this with an intense, although sceptical interest in spiritualistic phenomena.) Hahn was one of the co-signers of the Ernst Mach Society's founding manifesto, and its vice-president. The association launched a series of five to six public lectures a year. Tickets for the lectures were sold at the same price as opera tickets. It worked, in spite of the harsh economic crisis. With titles like "Is there an infinity?" or "The crisis of intuition" (to quote two of Hahn's contributions), the lectures filled a large auditorium to capacity. Other lectures were given by luminaries like the chemist Hermann Mark or the physicist Werner Heisenberg. The royalties were used in part to erect a statue on Ludwig Boltzmann's grave; another part covered the assistant's salary of young Olga Taussky (who went on to a distinguished career in the United States as Professor Taussky-Todd). Hahn's lectures were always meticulously prepared, and followed a highly personal style. He reformulated statements again and again, apparently not progressing at all, until, at the end of the hour, the listener was amazed at the amount of ground that had been covered. Of course, this method agreed well with Hahn's conviction that mathematics consists of saying the same in different ways. Apparently, it worked exceedingly well. Erwin Schrödinger, for instance, who had attended Hahn's lectures on the calculus of variations in 1907, kept the notes for the rest of his life, in spite of a dozen changes in residence. And Sir Karl Popper wrote in his autobiography: I learned most from Hans Hahn. His lectures attained a degree of perfection which I have never encountered again. 
Each lecture was a work of art: dramatic in logical structure; not a word too much; of perfect clarity; and delivered in beautiful and civilised language. The subject, and sometimes the problems discussed, were introduced by an exciting historical sketch. Everything was alive, though due to its very perfection a bit aloof. At another place, Popper writes: Hahn's lectures were, for me at least, a revelation... As a teacher and a speaker, he was beyond compare. The beginner's course started with a detailed history of the priority dispute between Newton and Leibniz, with Hahn firmly on the side of Newton. (Popper found it necessary to stress that this was not due to pro-British bias.) There followed a tour d'horizon on the problems in the development of analysis and the paradoxes of early set theory; the whole culminated in a discussion of the logical foundations of mathematics. Popper describes how Hahn displayed the first volume of the
Principia Mathematica to the students, and how ardently he espoused its viewpoint. No doubt Gödel had the same experience. The young Popper had no wish to become a professional mathematician. In 1930, he was appointed as a teacher at a secondary school. (Schooling reforms were another favourite of Red Vienna, and Hahn took a leading part in that movement too.) Popper was never invited to attend the meetings of the Vienna Circle, but was made welcome at the "Mathematisches Kolloquium" organised by Menger together with Gödel and Nöbeling. There, Hahn was impressed by some ideas of Popper's on the foundations of probability theory which greatly clarified the approach of his old friend von Mises and which were later taken up by Abraham Wald. In 1934, Hahn read the page proofs of Popper's Logik der Forschung (whose translation, The Logic of Scientific Discovery, appeared only 25 years later), invited Popper to his home, and overwhelmed the young man with his warm praise for a book that challenged the Vienna Circle with a sharp critique of logical positivism.
The Circle is Broken
To the new generation, by that time, the Circle looked a bit staid, and the Kolloquium was where things happened. Hahn frequented both, of course: the Circle on Thursday and the Kolloquium on Tuesday evenings. (Wednesdays, he frequently joined Helly's friends in the Café Central.) This meant endless rounds of learned discussions, punctuated by attempts to keep his cigar lit. Hahn was fast approaching the stereotype of the world-renowned Professor, which includes, as movie-goers know, a pretty daughter (Nora was making a name for herself as an actress) and an entertaining eccentricity (in his case an interest in extrasensory perception and spiritualistic séances). Hahn often attended concerts, assiduously following the score, and (like Sigmund Freud) played an occasional round of tarot. Hahn's wife enjoyed her part as hostess to a regular "salon" (the term was still in use), and as patroness of promising scientists. But shadows were gathering. Worried by the Nazi ascendancy in neighbouring Germany and the increasingly reactionary turn in Austrian politics, the first members of the Circle left for the States. Hahn's friend Paul Ehrenfest, who had done so much to explain Boltzmann's work, committed suicide in 1933. After a short but sharp outburst of civil war in February 1934, the clerico-fascist regime suppressed the Socialist Party in Austria. The Ernst Mach Society was outlawed, and authorities were frowning upon the Vienna Circle. Hans Hahn, who had until then had no reason to complain of his health, now became more and more subject to violent stomach-cramps. In the summer term, the pain forced him sometimes to interrupt his lectures for minutes on end. A tumor was diagnosed, and in July, Hahn died under the scalpel. He had planned to finish, in autumn, his treatise on real analysis, a book he had been writing and rewriting for 25 years. 
Originally, it had been conceived as a joint work with Arthur Schönflies, a German mathematician whose Bericht über Punktmengen had, in 1900, played an important role in spreading the new ideas on real analysis, but this plan petered out. (One of Hahn's first publications consisted in pointing out and correcting some errors in Schönflies' treatise.) In 1921, Hahn
published the first of two volumes of Reelle Analysis, some 865 pages so crammed with discussions of analytic sets, Baire classes, and notions like "being of first category at a given point," that integrals and derivatives had to be left for the second volume. In an autobiographical note at that time, Hahn wrote that the second volume's publication was imminent. Instead, he started rewriting Volume I, and published in 1932 a completely altered version. As Gödel wrote in a review in the
Monatshefte: The proofs are given with a rigour and an attention to detail that probably have hardly an equal in the textbooks of mathematics, and are not far removed from complete 'formalisation' (in the sense of, say, Principia Mathematica). In spite of working furiously on part two of his treatise, Hahn was never to see it in print. Menger salvaged Hahn's notes, and in 1948, Arthur Rosenthal published their translation in an American edition. It was a labour of love; but Hahn's magnum opus (which experts rank on a par with Hausdorff's Set Theory and Saks's Theory of Integration) suffered severely from the handicaps of its birth: with part one in German and part two in English, and a delay of 16 years between the two, the book never had the impact it deserved. The Austrian ministry decided not to appoint a successor to Hahn's chair, a clear sign that the Vienna Circle was not much in favour with authority. Both Karl Menger and Eduard Helly, the obvious candidates, left. Schlick, who had decided to remain in Vienna, was shot in 1936 on the steps of the University by a frustrated philosopher who claimed that Schlick's outlook had threatened the metaphysical foundation of his ethics. As Alfred Ayer wrote, The right-wing press duly deplored the act, but there was a faint suggestion that this was the sort of fate that radically anti-clerical professors might expect to suffer. When Nazi Germany annexed Austria, the Heimat of its leader Adolf Hitler, Schlick's murderer was quickly freed. By this time, what was left of the Vienna Circle had dispersed, mostly to Prague and Amsterdam. As it turned out, this was not far enough. All found a haven eventually in England or the States, the irrepressible Otto Neurath crossing the Channel in a small boat with Reidemeister's daughter, and the Gödels steaming across Siberia and the Pacific to reach Princeton. 
The members of the Vienna Circle survived the holocaust. They were luckier than the mathematicians of the Polish school, for instance, of whom many succumbed to the Nazis. But those Viennese who once would pay the price of an opera ticket for a lecture on logic or science were gone. After the war, the reconstruction of the bombed-out State Opera was accorded highest priority by democratic new Austria. Men like Popper and Menger, however, were politely told that the University of Vienna had no place for them.
On the brighter side, we still do have a few good coffee-houses.
Selected Bibliography
The Collected Works of Hans Hahn will be published by Springer-Verlag, Vienna (edited by L. Schmetterer and K. Sigmund). The first volume (with an introductory essay by Sir Karl Popper) will appear in autumn 1995. For biographical material on Hahn, we refer to
Karl Mayerhofer: Nachruf auf Hans Hahn, Monatshefte für Mathematik und Physik 41 (1934), 221-238.
Karl Menger: Introduction to Hans Hahn, Empiricism, Logic and Mathematics, edited by Brian McGuinness, Vienna Circle Collection, Kluwer, Dordrecht (1980).
Karl Menger: Reminiscences of the Vienna Circle and the Mathematical Kolloquium, Vienna Circle Collection, Kluwer, Dordrecht (1995).
Rudolf Einhorn: Vertreter der Mathematik und Geometrie an den Wiener Hochschulen 1890-1940, Ph.D. thesis, Technical Univ. Vienna (1985).
Institut für Mathematik, Universität Wien, Strudlhofgasse 4, A-1090 Vienna, Austria
Old and New Moving-Knife Schemes
Steven J. Brams, Alan D. Taylor, and William S. Zwicker
Introduction
Over the past 50 years, the mathematical theory of fair division has often been formulated in terms of cutting a cake. More specifically, one seeks ways to divide a cake among n people so that each person is satisfied, in some sense, with the piece he or she receives, even though different people may value certain parts of the cake differently. (See [BT1], [BT2], [BT3], [G], [K], [O] and [S3].) We focus in this article on "moving-knife" schemes for fair division, but the earliest fair-division methods were quite different. When there are only two people (n = 2), the parental solution for appeasing two quarreling children of "one cuts, the other chooses" is well known, going back at least to Hesiod's Theogony some 2800 years ago [L], pp. 126-131. A half-century ago, Hugo Steinhaus [S1, S2] first asked about generalizing cut-and-choose to more than two people. Since that time, three kinds of results have been obtained:
1. Existence Results. One of the earliest results of this kind was Neyman's theorem [N, SW], which established the existence, for n countably additive probability measures defined on the same space C, of a partition of C into n sets that is even (meaning that each of the sets is of measure 1/n with respect to all the measures). The proof, however, in the words of Rebman [R], p. 33, gives "no clue as to how to accomplish such a wonderful partition." Note that any even division is envy-free (meaning that each person thinks he or she received a piece at least as large as those received by the other people), and that any envy-free division is proportional (meaning that each person thinks he or she received at least 1/n of the cake).
THE MATHEMATICAL INTELLIGENCER VOL. 17, NO. 4 © 1995 Springer-Verlag New York
The reader is invited to construct, for n = 3, examples showing that neither implication is reversible. For n = 2, of course, proportionality and envy-freeness are equivalent.
2. Discrete Algorithms. The so-called last-diminisher procedure of Stefan Banach and Bronislaw Knaster (see [S1]) provides an early example of this kind of result, which is called a "protocol" in [BT1], [G] and a "game-theoretic algorithm" in [B]. Under the Banach-Knaster procedure, the first person cuts a piece from the cake [that he or she considers to be of size 1/n].¹ This piece is then passed, in turn, to each of the other people. Upon receiving such a piece, a person has the option either to pass it along unaltered to the next person [which is done if the person holding it considers it to be of size at most 1/n] or to trim it [to size 1/n in his or her measure] and then pass it along. The last one to trim it (or the first person, if no one trimmed it) gets that piece as his or her share. The trimmings are returned to the cake, and the procedure is then repeated for the remaining cake. Note that a person may receive a piece consisting of many chunks that were widely separated in the original cake, which is a feature common to many discrete algorithms. We leave it to the reader to check that if a player follows his or her strategy (and it is precisely these strategic aspects that we have placed in square brackets), then that player will receive a piece he or she thinks is of size at least 1/n, regardless of what strategy the other players employ. We shall say more later about the difference between strategies and rules.
3. Moving-Knife Procedures. This kind of continuous procedure seems first to have been proposed in 1961, when Lester Dubins and Edwin Spanier [DS] presented the following elegant version of the Banach-Knaster protocol: A knife is slowly moved across the cake, say from left to right. (Figure 1, borrowed from [A], illustrates this.) 
At any time, any player can call "cut" and then receive the piece to the left of the knife, with ties broken by some kind of random device. It is easy to see that if a player employs the obvious strategy of calling "cut" any time the piece so determined is of size exactly 1/n in his or her measure, then this will certainly yield him or her a piece of size at least 1/n. In contrast to discrete algorithms, moving-knife schemes tend, by exploiting continuity and intermediate values, to minimize the number of chunks each person receives.
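The three notions defined under Existence Results (even, envy-free, proportional) can be checked mechanically once a division is summarized by a table of valuations. The following is a minimal sketch, not taken from the literature cited: the matrix encoding, the function names, and the two sample divisions are illustrative assumptions. It exhibits, for n = 3, one division that is envy-free but not even and one that is proportional but not envy-free.

```python
from fractions import Fraction as F

# Assumed encoding: V[i][j] is the value player i assigns to the piece
# received by player j. Each row sums to 1, since every player values
# the whole cake at 1.

def is_even(V):
    """Every player thinks every piece is of size exactly 1/n."""
    n = len(V)
    return all(V[i][j] == F(1, n) for i in range(n) for j in range(n))

def is_envy_free(V):
    """Every player thinks his or her own piece is at least tied for largest."""
    n = len(V)
    return all(V[i][i] >= V[i][j] for i in range(n) for j in range(n))

def is_proportional(V):
    """Every player thinks his or her own piece is of size at least 1/n."""
    n = len(V)
    return all(V[i][i] >= F(1, n) for i in range(n))

# Hypothetical division: each player values his or her own piece at 1/2.
ENVY_FREE_NOT_EVEN = [
    [F(1, 2), F(1, 4), F(1, 4)],
    [F(1, 4), F(1, 2), F(1, 4)],
    [F(1, 4), F(1, 4), F(1, 2)],
]

# Hypothetical division: player 0 holds 1/3 but values player 1's piece at 2/3.
PROPORTIONAL_NOT_ENVY_FREE = [
    [F(1, 3), F(2, 3), F(0)],
    [F(1, 3), F(1, 3), F(1, 3)],
    [F(1, 3), F(1, 3), F(1, 3)],
]
```

Players are numbered from 0 in this sketch. Exact fractions are used so that the comparisons with 1/n are not disturbed by rounding.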
[Figure 1. Dubins-Spanier procedure: A knife slowly moves, from left to right, over a rectangular cake.]
¹ Our use of square brackets will be explained later.
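The Dubins-Spanier procedure can be simulated when each player's measure is given explicitly. The sketch below rests on assumptions that are not in the article: the cake is the interval [0, 1], each measure has a density that is constant on equal-width cells (described by a list of integer weights), all players follow the bracketed strategy, and ties are broken by lowest player index rather than by a random device. The names `measure`, `mark`, and `dubins_spanier` are hypothetical.

```python
from fractions import Fraction as F

def measure(w, a, b):
    """Value of [a, b] for a density constant on len(w) equal cells of [0, 1]."""
    m, total = len(w), sum(w)
    value = F(0)
    for k, wk in enumerate(w):
        overlap = min(b, F(k + 1, m)) - max(a, F(k, m))
        if overlap > 0:
            value += F(wk * m, total) * overlap
    return value

def mark(w, a, target):
    """Leftmost x >= a with measure(w, a, x) == target (None if unreachable)."""
    m, total = len(w), sum(w)
    need = F(target)
    for k, wk in enumerate(w):
        lo, hi = max(a, F(k, m)), F(k + 1, m)
        density = F(wk * m, total)
        if hi <= lo or density == 0:
            continue
        if density * (hi - lo) >= need:
            return lo + need / density
        need -= density * (hi - lo)
    return None

def dubins_spanier(weights):
    """Return {player: (left, right)} when everyone calls "cut" at value 1/n."""
    n = len(weights)
    left, remaining, pieces = F(0), list(range(n)), {}
    while len(remaining) > 1:
        # Knife position at which each remaining player would call "cut".
        calls = {i: mark(weights[i], left, F(1, n)) for i in remaining}
        caller = min(remaining, key=lambda i: (calls[i], i))
        pieces[caller] = (left, calls[caller])
        left = calls[caller]
        remaining.remove(caller)
    pieces[remaining[0]] = (left, F(1))  # the last player takes the rest
    return pieces

# Hypothetical measures for three players (numbered 0, 1, 2).
example_weights = [[1, 1, 1, 1], [4, 2, 1, 1], [1, 1, 2, 4]]
example_pieces = dubins_spanier(example_weights)
```

In this instance player 1 calls "cut" first, at 1/6, player 0 calls next, at 1/2, and player 2 receives the remainder. Each player holds at least 1/3 by his or her own measure, yet player 0 values player 2's piece at 1/2, so the outcome is proportional without being envy-free.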
Any discrete algorithm or moving-knife procedure consists of both rules and strategies. Rules incorporate those parts of the procedure that can be enforced by a referee (because the referee can tell whether or not a rule has been followed without knowing the measures of the various players), whereas strategies are good advice to the players (which they can follow by using their knowledge of only their own measure). To keep the distinctions clear, strategies are put in square brackets, whereas rules are not. For example, a rule and strategy we shall use later is, "Player 1 divides the cake into three pieces [each of which is 1/3 of the cake in his or her opinion]." For the sake of brevity, in what follows we ignore the possibility that all of the players involved in some part of a procedure abandon their strategies by choosing never to call "cut." The problem is easily addressed by adding a rule that allocates the cake in some manner (e.g., according to some random device) should this situation arise. We are not concerned here with how an individual might exploit knowledge of another's measure to do even better than he or she would by following the suggested strategy. (Even cut-and-choose is sensitive to this kind of information [BT2].) We assume, in effect, that none of the players has knowledge of others' measures, and we seek procedures that guarantee a certain payoff under this assumption. In fact, in the schemes we present, each player's strategy will guarantee him or her the appropriate payoff (either at least 1/n of the cake or freedom from envy), even in the face of a conspiracy by the other players. Our goal in the present article is to present eight moving-knife schemes, in addition to that of Dubins and Spanier [DS] described earlier, several of which are new. 
Our main focus is on obtaining an envy-free division among three people, that is, one in which each person considers the piece he or she gets to be at least tied for largest among the three pieces allocated. Neither the Banach-Knaster [S1] scheme nor the Dubins-Spanier [DS] moving-knife version of it guarantees an envy-free allocation. Five of the schemes to be presented yield such an allocation for three people. Two of the envy-free schemes that we will present employ Austin's two-person scheme, to be described next.
Austin's Two-Person Equalizing Scheme
Recall that Neyman's [N] result guarantees the existence of a partition of the cake into n pieces such that every person thinks every piece is of size 1/n. For n = 2, this yields a single piece of cake that both people agree is of size exactly 1/2. In 1982, A. K. Austin [A] produced the following elegant scheme which achieves this using a pair of moving knives, each player's strategy guaranteeing that he or she receives a piece of size exactly 1/2: Assume there is a single knife that moves slowly across the cake from the left edge toward the right edge, as in the Dubins-Spanier [DS] procedure, until one of the players (assume it is player 1) calls "stop" [which he or she does at the point when the piece so determined
is of size exactly 1/2]. At this time, a second knife is placed at the left edge of the cake. Player 1 then moves both knives across the cake in parallel fashion [in such a way that the piece between the two knives remains of size exactly 1/2 in player 1's measure], subject to the requirement (superfluous, if the strategies are followed) that when the knife on the right arrives at the right-hand edge of the cake, the left-hand knife lines up with the position that the first knife was in at the moment when player 1 first called "stop" (see Fig. 2). While the two knives are moving, player 2 can call "stop" at any time [which he or she does precisely when the measure of the piece between the two knives is of size exactly 1/2 in his or her measure].
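Austin's two-knife procedure can be sketched numerically under assumptions that go beyond the article: the cake is [0, 1], each measure has a strictly positive density that is constant on equal-width cells, and the moment at which the second player calls "stop" is located by bisection. Bisection finds a position at which both players value the piece between the knives at 1/2, which need not be the first such position the moving knives would reach. All function names are hypothetical.

```python
def measure(w, a, b):
    """Value of [a, b] for a density constant on len(w) equal cells of [0, 1]."""
    m, total = len(w), sum(w)
    value = 0.0
    for k, wk in enumerate(w):
        overlap = min(b, (k + 1) / m) - max(a, k / m)
        if overlap > 0:
            value += wk * m / total * overlap
    return value

def mark(w, a, target):
    """Leftmost x >= a with measure(w, a, x) == target (right edge if not reached)."""
    m, total = len(w), sum(w)
    need = target
    for k, wk in enumerate(w):
        lo, hi = max(a, k / m), (k + 1) / m
        density = wk * m / total
        if hi <= lo or density == 0:
            continue
        if density * (hi - lo) >= need:
            return lo + need / density
        need -= density * (hi - lo)
    return 1.0

def austin_half(w1, w2, tol=1e-12):
    """Return knife positions (a, b) with [a, b] worth 1/2 to both players."""
    p1, p2 = mark(w1, 0.0, 0.5), mark(w2, 0.0, 0.5)
    # Whoever reaches 1/2 first calls "stop" and then moves both knives.
    mover, watcher = (w1, w2) if p1 <= p2 else (w2, w1)
    p = min(p1, p2)

    def gap(a):
        # The mover keeps the piece between the knives at exactly 1/2.
        b = mark(mover, a, 0.5)
        return measure(watcher, a, b) - 0.5

    lo, hi = 0.0, p  # gap(lo) <= 0 <= gap(hi) by the complement argument
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) <= 0:
            lo = mid
        else:
            hi = mid
    return lo, mark(mover, lo, 0.5)

# Hypothetical measures: player 1 is uniform, player 2 favours the third quarter.
a, b = austin_half([1, 1, 1, 1], [1, 2, 3, 2])
```

With these sample weights the knives stop at about a = 0.125 and b = 0.625. The strictly positive weights stand in for the continuity assumption mentioned in the text; with zero-density regions the right-hand knife could jump and the bisection argument would need more care.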
[Figure 2. Austin's procedure: Player 1 moves two knives from left to right so that the piece between the knives remains of size 1/2 in his or her opinion.]
Now, what guarantees that there will be a point where player 2 thinks the piece between the knives is of size exactly 1/2? Notice that at the instant when the two knives start moving, player 2 thinks the piece between the knives is of size strictly less than 1/2 (assuming he or she has followed the strategy given). If the first knife were to reach the right edge of the cake, the piece between the knives would be the complement of what it was when the knives started moving. Hence, player 2 would think the piece between the knives is now of measure strictly greater than 1/2. Thus, with an appropriate continuity assumption, there must have been a point where the measure of the piece between the knives is exactly 1/2. No generalization of Austin's scheme to n > 2 people is known. (Later, we give an additional reason why a generalization would be of interest.) However, Austin himself noted (and we will need this observation later) that a simple extension of his scheme produces a single piece of cake that each of two players thinks is of size exactly 1/k for any k. This extension proceeds as follows: Player 1 first makes a sequence of k - 1 parallel marks on the cake [in such a way that he or she thinks the k pieces so determined are all of size 1/k]. Now, player 2 cannot possibly think all k pieces are of size less than 1/k, and player 2 cannot possibly think all k pieces are of size greater than 1/k. Thus, either player 2 thinks one of the pieces is of size exactly 1/k (in which case we are done) or we can assume, without loss of generality, that he or she thinks the first piece is of size less than 1/k and the second piece is of size greater than 1/k. But now we can have player 1 place knives on the left and right edges of the first piece and move them as before [so as to keep the measure of the piece between the two knives at exactly 1/k], subject to the same sort of requirement as before. This argument shows that, at some point, player 2 will think the piece between the two knives has measure exactly 1/k. An iteration of Austin's two-person scheme allows two players to partition the cake into j pieces, each of which is of size 1/j according to both players. For example, if j = 3, we begin by using Austin's two-person scheme to obtain a single piece of cake that both players think is of size 1/3. Now we apply the k = 2 version of Austin's two-person scheme to the rest of the cake. In what follows, we refer to both the original and iterated version as Austin's two-person scheme.
Austin's Version of Fink's Algorithm
A. M. Fink [F] devised a clever discrete algorithm which, like the Banach-Knaster [S1] procedure, yields an allocation for n players wherein each player receives a piece of cake that he or she thinks is of size at least 1/n. Austin [A] introduced his two-person scheme into Fink's algorithm, obtaining a moving-knife scheme which does the same. The scheme proceeds as follows: Players 1 and 2 use Austin's two-person scheme to divide the cake into two pieces, A and B [so that both think A and B are of size 1/2]. Players 1 and 3 then cut a piece A' from A [which they both think is exactly 1/3 of A]. Players 2 and 3 now do exactly the same thing to B to obtain B' (see Fig. 3). Player 1 now receives A - A',
player 2 receives B - B', and player 3 receives A' ∪ B'. It is easy to see that each player receives a piece of cake that he or she thinks is of size exactly 1/3. If a fourth person now comes along, each of the three earlier players simply gets together with this fourth person and cuts a small piece that he or she and the fourth person agree is 1/4 of the piece held by that player. The fourth person then gets the union of the three small pieces, and so on.
[Figure 3. Austin's version of Fink's algorithm: Each of three players receives a piece he or she thinks is of size exactly 1/3.]
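The claim that each of the three players ends up with exactly 1/3 can be spelled out as a short calculation. The notation is an assumption added here for illustration: μ_i denotes player i's measure, normalized so that the whole cake has measure 1, and P_1, P_2, P_3 denote the pieces held before a fourth player arrives, with Q_i the small piece cut from P_i.

```latex
% \mu_i: player i's measure (assumed notation), with \mu_i(\text{cake}) = 1.
\begin{align*}
\mu_1(A \setminus A') &= \mu_1(A) - \mu_1(A')
   = \tfrac12 - \tfrac13\cdot\tfrac12 = \tfrac13, \\
\mu_2(B \setminus B') &= \mu_2(B) - \mu_2(B')
   = \tfrac12 - \tfrac13\cdot\tfrac12 = \tfrac13, \\
\mu_3(A' \cup B') &= \tfrac13\,\mu_3(A) + \tfrac13\,\mu_3(B)
   = \tfrac13\bigl(\mu_3(A) + \mu_3(B)\bigr) = \tfrac13 .
\end{align*}
% Fourth player: Q_i is 1/4 of P_i according to both player i and player 4.
\begin{align*}
\mu_i(P_i \setminus Q_i) &= \tfrac34\cdot\tfrac13 = \tfrac14 \quad (i = 1, 2, 3), \\
\mu_4(Q_1 \cup Q_2 \cup Q_3) &= \tfrac14\bigl(\mu_4(P_1) + \mu_4(P_2) + \mu_4(P_3)\bigr) = \tfrac14 .
\end{align*}
```

Note that player 3 need not agree that A and B are each of size 1/2; only the sum μ_3(A) + μ_3(B) = 1 enters the calculation. The same remark applies to player 4 and the pieces P_i.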
[Figure 5. Levmore-Cook procedure, showing the pieces P, Q, and R.]
Stromquist's Envy-Free Scheme
Probably the best known but most complicated moving-knife scheme is the envy-free procedure for three players due to Walter Stromquist [St]. This procedure begins with a referee holding a knife at the left edge of the cake. Each of the three players holds a knife parallel to the referee's [at a point that that player thinks exactly halves the remainder of the cake to the right of the referee's knife]. The referee moves his or her knife slowly across the cake, as was the case with all the previous procedures. The three players move their knives in the same way as the referee, with each player keeping his or her knife to the right of the referee's [so that it exactly halves the piece to the right of the referee's knife, with respect to that player's measure]. At any time, a player can call "cut" and receive the piece to the left of the referee's knife (X in Fig. 4). A cut is then also made by whichever of the three players' knives is in the middle (yielding Y and Z in Fig. 4). Of the other two players, the one whose knife was closer to the referee's knife gets Y, and the other gets Z. The strategy is for a player to call "cut" only if he or she thinks the left-hand piece, X, is at least as large as both the middle piece Y and the right-hand piece Z. (Misinterpretations of this strategy have caused some confusion in the literature; see [O] and [St].) Hence, the player calling "cut" will never envy the other two. Since neither of the other two players called "cut," they must each think the largest piece is either Y or Z. To see that neither of them experiences any envy, notice that each thinks he or she is getting the larger of Y and Z, or there is a tie. This is easy to check and left to the reader.
The Levmore-Cook Envy-Free Scheme
There is another procedure for producing an envy-free division among three people. It is due to Saul X. Levmore and Elizabeth Early Cook [LC] and seems to have been largely overlooked. It is essentially a moving-knife algorithm, although they describe it as a process with "infinitely small shavings." It can be described as follows: Player 1 divides the cake into three pieces P, Q, R [which he or she considers equal]. Each of the other two players selects a piece [which he or she considers largest]. If they choose different pieces, we are done. Otherwise, we can assume they both choose P. Now player 1 starts a vertical moving knife, as in the Dubins-Spanier [DS] scheme, but at the same time he or she places a second knife perpendicular to the first and over the portion of the cake over which the vertical knife has already swept (see Fig. 5). Notice that if cuts were to be made from such a positioning of the knives, the piece of cake labeled P would be cut into three pieces, exactly two of which would involve both knives. Let S denote one of these two pieces and let T denote the other. The second knife is moved up and down [in such a manner that player 1 thinks Q ∪ S is the same size as R ∪ T]. When the process begins, both S and T are empty, so player 2 and player 3 both think P - (S ∪ T) is larger than both Q ∪ S (giving two inequalities) and R ∪ T (giving two inequalities). Now let either player 2 or player 3 call "stop" [when he or she knows that any of these four inequalities first reverses], and take Q ∪ S or R ∪ T [whichever he or she thinks is bigger]. Player 1 gets the other composite piece, and the player who did not call "stop" gets P - (S ∪ T).
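The text does not spell out why the Levmore-Cook allocation is envy-free, so the following is a sketch of one way to check it, under assumptions added here: μ_i denotes player i's measure, the measures vary continuously as the knives move (so that a reversing inequality passes through equality), and the shorthand P', Q', R' is introduced only for this verification.

```latex
% Assumed shorthand: P' = P \setminus (S \cup T), \; Q' = Q \cup S, \; R' = R \cup T.
% Player j called "stop"; player k is the other of players 2 and 3.
\begin{align*}
\text{Player 1:}\quad & \mu_1(P') \le \mu_1(P) = \tfrac13, \qquad
   \mu_1(Q') = \mu_1(R') = \tfrac12\bigl(1 - \mu_1(P')\bigr) \ge \tfrac13 \ge \mu_1(P'). \\
\text{Player } j\text{:}\quad & \mu_j(P') = \max\bigl(\mu_j(Q'), \mu_j(R')\bigr)
   \ \text{at the first reversal, and } j \text{ takes the larger of } Q', R'. \\
\text{Player } k\text{:}\quad & \mu_k(P') \ge \max\bigl(\mu_k(Q'), \mu_k(R')\bigr),
   \ \text{since neither of } k\text{'s inequalities has reversed.}
\end{align*}
```

Player 1 receives one of Q', R', which he or she considers tied with the other and at least as large as P'. Player j holds a piece tied with P' and at least as large as the remaining composite piece. Player k holds P', which he or she still considers at least tied for largest.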
[Figure 4. Stromquist's procedure, showing the referee's knife, the three players' knives, and the pieces X, Y, and Z of the cake.]
Webb's Envy-Free Scheme
It turns out that by combining the basic idea in the Dubins-Spanier [DS] scheme with Austin's [A] two-person scheme, one can obtain a fairly simple moving-knife procedure that guarantees an envy-free allocation among three people. This scheme was first discovered by William Webb [W], although he was unaware of Austin's work and thus recreated the part of it he needed. The version we give next uses Austin's original scheme. A knife is slowly moved across the cake, as in the Dubins-Spanier procedure, until some person (assume it is player 1) calls "cut" [because he or she thinks the piece so determined is of size 1/3]. Call the piece resulting from this cut A1, and notice that both players 2 and 3 think A1 is of size at most 1/3. We now have player 1 and either one of the other two players (assume for definiteness it is player 2) apply Austin's two-person scheme to the rest of the cake, resulting in a partition of it into two sets, A2 and A3 [which players 1 and 2 think is a 50-50 division of the rest of the cake]. Notice that if the bracketed strategies are followed, then
1. player 1 thinks that all three pieces are of size exactly 1/3;
2. player 2 thinks A2 and A3 are tied for largest (since each is exactly 1/2 of a piece that is at least 2/3 of the whole cake).
An envy-free division is now easily obtained by having the players choose among the three pieces in the following order: player 3, player 2, player 1. Player 3 envies no one, because he or she is choosing first; player 2 envies no one, because he or she had two pieces tied for largest; and player 1 envies no one, because he or she thinks all three pieces are the same size.
A Pie Scheme for Envy-Free Divisions with n = 3
Another conceptually simple envy-free moving-knife scheme for 3 players can be achieved by picturing a round cake (or pie, as in [G]) instead of a rectangular one. The idea of using a pie and radial knives seems to be a part of the cake-division folklore, but the following scheme is, as far as we know, new. Start by having player 1 hold three knives over the round cake as if they were hands of a clock [in such a way that he or she considers the three wedge-shaped pieces to be all of size exactly 1/3]. Now have player 1 start moving all three knives in a clockwise fashion [so that each piece remains of size exactly 1/3 in his or her measure], subject to the requirement (superfluous if the strategies are followed) that the moment any knife reaches the initial position of some other knife, all knives line up with such initial positions. The claim is that at some point player 2 must think at least two of the wedges are tied for largest; that is, if player 2 thinks a single wedge (call it A) is largest at the instant when the knives start moving, then A is eventually transformed, as in Austin's scheme, to the wedge immediately clockwise. Thus, at some point prior to this, the piece determined by the two knives that originally determined A loses its position as largest to another piece. At the instant when this happens, we have the desired two-way tie for largest in the eyes of player 2. The envy-free allocation is now obtained by having the players choose in the following order: player 3, player 2, player 1. This scheme can be recast as one in which three knives move in parallel across a rectangular cake (with the understanding that as a knife slides off the right edge, it immediately jumps back onto the left edge).
An Easy Three-Person Envy-Free Scheme
Our final envy-free moving-knife scheme for three people is an immediate consequence of Austin's [A] two-person scheme for dividing a cake into three pieces such that each of two players thinks that the division is even. One simply has players 2 and 3 use that scheme to obtain a partition of the cake into three pieces [which they both think are all of size 1/3]. The players then choose the piece they want in the following order: player 1, player 2, player 3. Player 1 experiences no envy, because he or she is choosing first; and neither player 2 nor player 3 will experience envy, because each thinks all three pieces are the same size. Perhaps the most important aspect of this three-person scheme is that it can be extended, at the cost of some complexity, to a moving-knife scheme that produces an envy-free allocation among four players. This scheme is described in [BTZ].
An Almost Envy-Free Scheme for n > 3
As we pointed out earlier, the Dubins-Spanier [DS] moving-knife scheme does not guarantee an envy-free allocation. The reason is that as soon as a player calls "cut," he or she is relegated to spectator status for the remainder of the procedure. Thus, if a larger piece should arise later, he or she has no recourse but to sit quietly by and watch someone else get it. Might we not alter the procedure by allowing a player to reenter the process in some way? Suppose we allow a player to call "cut" again, even though he or she already has done so at least once and thus received a piece of cake. That player would then be required to take the new piece determined by this most recent cut, returning his or her previous piece to the cake. What this yields is the following: Given n people and some ε > 0, there is a moving-knife scheme that will guarantee each player a piece of cake that he or she thinks is at most ε smaller than the largest. The scheme is simply the one we just described, and the concomitant strategy: each player calls "cut" initially whenever he or she thinks the piece this will yield is of size 1/n, and thereafter calls "cut" whenever he or she thinks the new piece is ε larger than the one he or she presently holds. Ties are broken at random. Unfortunately, the rules of this scheme (as opposed to the strategies) would allow a player to call "cut" infinitely many times. However, the strategies described are not affected by an additional rule that each player can call "cut" at most 1/ε times. Thus, if ε = 1/100, a player need never call "cut" more than 100 times to ensure that his or her piece is "almost" the largest, that is, smaller than the largest piece by at most 1/100 of the entire cake.
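A short count shows where the bound of 1/ε calls comes from. It is sketched here under the assumption that a player following the stated strategy values each newly claimed piece at least ε more than the piece given up, with ε denoting the tolerance in the statement above. The symbols $x_j$ and $m$ are introduced only for this illustration.

```latex
% x_0: value, in the player's own measure, of the first piece claimed;
% x_j: value of the piece held after the j-th further call of "cut".
\begin{align*}
x_0 &= \tfrac{1}{n}, \qquad
x_j \ge x_{j-1} + \varepsilon \quad (j = 1, \dots, m), \qquad
x_m \le 1, \\
m\,\varepsilon &\le x_m - x_0 \le 1 - \tfrac{1}{n} < 1
\quad\Longrightarrow\quad m < \tfrac{1}{\varepsilon}.
\end{align*}
```

With ε = 1/100 this allows at most 99 further calls after the first, hence at most 100 calls in all, in line with the cap quoted above.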
Toward an Envy-Free Scheme for Arbitrary n
If Austin's [A] two-person scheme could be extended to n players, then one could immediately obtain an envy-free moving-knife scheme for n + 1 players by simply mimicking what we did earlier. More generally:

If there exists a moving-knife scheme A that will divide a cake into n pieces so that each of n players thinks all the pieces are of size 1/n, then there exists a moving-knife procedure for producing an envy-free division of the cake among n + 1 players.

To see how the envy-free procedure would work for n + 1 players, we begin, as in the Dubins-Spanier [DS] procedure, by obtaining a piece of cake that, say, player 1 thinks is of size exactly 1/(n + 1) and everyone else thinks is of size at most 1/(n + 1). We now have player 1, together with any n - 1 of the other players, divide up the rest of the cake into n pieces, using A, so that each thinks all n pieces are the same size. The player not involved in the application of A then gets to choose first, whereas player 1 is forced to choose last. The order in which the others choose is immaterial. Envy-freeness follows as before. The above result shows that if we had a moving-knife scheme to divide a cake into four pieces so that each of four players would think it is an even division, then we could produce a moving-knife scheme that would yield an envy-free allocation among any five people. In fact, by a considerably more complicated argument, we show in [BTZ] that such an envy-free moving-knife scheme for five people would follow from a minimal extension of Austin's scheme: a partition of the cake into two pieces so that each of three players (instead of two) thinks it is a 50-50 division.

Conclusions
In general, moving-knife schemes seem to be easier to come by than pure existence results (like Neyman's [N] theorem) but harder to come by than discrete algorithms (like the Dubins-Spanier [DS] last-diminisher method). For envy-free allocations for four or more people, however, the order of difficulty might actually be reversed. Neyman's existence proof (for any n) goes back to 1946, the discovery of a discrete algorithm for all n ≥ 4 is quite recent [BT1, BT2, BT3], and a moving-knife solution for n = 4 was found only as this article was being prepared (see [BTZ]). We are left with this unanswered question: Is there a moving-knife scheme that yields an envy-free division for five (or more) players?

Acknowledgments
The authors thank the editor and referee for their helpful comments. S.J. Brams gratefully acknowledges the support of New York University's C.V. Starr Center for Applied Economics, and A.D. Taylor and W.S. Zwicker the support of the National Science Foundation under grant DMS 9101830.

References
[A] A. K. Austin, Sharing a cake, Mathematical Gazette 66 (437) (1982), 212-215.
[B] J. B. Barbanel, Game-theoretic algorithms for fair and strongly fair cake division with entitlements, Colloquium Math. (forthcoming).
[BT1] S. J. Brams and A. D. Taylor, An envy-free cake-division protocol, Am. Math. Monthly 102(1) (1995), 9-18.
[BT2] S. J. Brams and A. D. Taylor, Fair Division: From Cake-Cutting to Dispute Resolution, Cambridge: Cambridge University Press (1996).
[BT3] S. J. Brams and A. D. Taylor, A note on envy-free cake division, J. Combin. Theory (A) 70(1) (1995), 170-173.
[BTZ] S. J. Brams, A. D. Taylor, and W. S. Zwicker, A moving-knife solution to the four-person envy-free cake-division problem, Proc. of the Am. Math. Soc. (forthcoming).
[DS] L. E. Dubins and E. H. Spanier, How to cut a cake fairly, Am. Math. Monthly 68(1) (1961), 1-17.
[F] A. M. Fink, A note on the fair division problem, Math. Mag. 37(5) (1964), 341-342.
[G] D. Gale, Mathematical entertainments, Mathematical Intelligencer 15(1) (1993), 48-52.
[GS] G. Gamow and M. Stern, Puzzle-Math, New York: Viking (1958).
[Ga] M. Gardner, Aha! Aha! Insight, New York: W. H. Freeman and Company (1978), 123-124.
[K] H. Kuhn, On games of fair division, Essays in Mathematical Economics (Martin Shubik, ed.), Princeton, NJ: Princeton University Press (1967), 29-37.
[L] S. T. Lowry, The Archeology of Economic Ideas: The Classical Greek Tradition, Durham, NC: Duke University Press (1987).
[LC] S. X. Levmore and E. E. Cook, Super Strategies for Puzzles and Games, Garden City, NY: Doubleday and Company (1981), 47-53.
[N] J. Neyman, Un théorème d'existence, C. R. Acad. Sci. Paris 222 (1946), 843-845.
[O] D. Olivastro, Preferred shares, The Sciences (March/April, 1992), 52-54.
[R] K. Rebman, How to get (at least) a fair share of the cake, Mathematical Plums (Ross Honsberger, ed.), Washington, DC: Mathematical Association of America (1979), 22-37.
[S1] H. Steinhaus, The problem of fair division, Econometrica 16(1) (1948), 101-104.
[S2] H. Steinhaus, Sur la division pragmatique, Econometrica (Supplement) 17 (1949), 315-319.
[S3] H. Steinhaus, Mathematical Snapshots, 3rd ed., New York: Oxford University Press (1969).
[St] W. Stromquist, How to cut a cake fairly, Am. Math. Monthly 87(8) (1980), 640-644; addendum, 88(8) (1981), 613-614.
[SW] W. Stromquist and D. R. Woodall, Sets on which several measures agree, J. Math. Anal. Appl. 108(1) (1985), 241-248.
[W] W. Webb, But he got a bigger piece than I did, preprint (n.d.).

Brams: Department of Politics, New York University, New York, NY 10003 USA
Taylor and Zwicker: Department of Mathematics, Union College, Schenectady, NY 12308 USA
David Gale*
This column is interested in publishing mathematical material which satisfies the following criteria, among others:
1. It should not require technical expertise in any specialized area of mathematics.
2. The topics treated should when possible be comprehensible not only to professional mathematicians but also to reasonably knowledgeable and interested nonmathematicians.
We welcome, encourage and frequently publish contributions from readers. Contributors who wish an acknowledgement of submission should enclose a self-addressed postcard.
Has this ever happened to you? You've just finished patiently trying to explain some beautiful result of pure mathematics to a group of nonmathematicians, hoping that you've conveyed something of the flavor of this pearl of truth and beauty, and then after a pause someone says, "Yes, but what has any of this got to do with everyday life?". After much thought I've decided the correct response is to say "Nothing. That's what's so nice about it. After all, Every Day Life is often a drag, so we do mathematics for the same reason we listen to music or ski down a mountain, to get away from and above and beyond Every Day Life." But now I must admit that every once in a while it works out the other way and EDL turns out to be a source of unexpectedly interesting mathematics. A nice example of this is the following item, written by guest columnist John H. Halton.
The Shoelace Problem
*Column editor's address: Department of Mathematics, University of California, Berkeley, CA 94720 USA.
John H. Halton
In a number of discussions of how shoes should be laced, it became apparent that no one seemed to have the definitive answer. Shoes were laced and relaced, passions flared, and shoes were even thrown.... The author decided that an appeal to mathematics was indicated. This problem is a restriction of the Traveling Salesman Problem. We are given a set of 2(n + 1) points (the laceholes or eyelets) arranged in a bipartite lattice, as shown in Figure 1. The problem is to find the shortest path from A0 to B0, passing through every eyelet just once, in such a way that points of the subsets
$$A = \{A_0, A_1, A_2, \ldots, A_n\} \quad\text{and}\quad B = \{B_0, B_1, B_2, \ldots, B_n\} \tag{1}$$

alternate in the path.

Figure 1. The shoe (a schematic).

Three standard lacing strategies are shown in Figures 2-4. For the American (AM) style, as in Figure 2, if n is odd, the lacing is

$$A_0 \to B_1 \to A_2 \to B_3 \to A_4 \to \cdots \to A_{n-1} \to B_n \to A_n \to B_{n-1} \to A_{n-2} \to \cdots \to A_3 \to B_2 \to A_1 \to B_0; \tag{2}$$

if n is even, the lacing is, similarly,

$$A_0 \to B_1 \to A_2 \to B_3 \to A_4 \to \cdots \to A_{n-2} \to B_{n-1} \to A_n \to B_n \to A_{n-1} \to B_{n-2} \to \cdots \to A_3 \to B_2 \to A_1 \to B_0; \tag{3}$$

and it is easily verified that, in either case, the total length of lace used is

$$L_{AM} = L_{AM}(n, v, w) = w + 2n\sqrt{v^2 + w^2}. \tag{4}$$

For the European (EU) style, as in Figure 3, when n is odd, the lacing is

$$A_0 \to B_1 \to A_1 \to B_3 \to A_3 \to \cdots \to A_{n-2} \to B_n \to A_n \to B_{n-1} \to A_{n-1} \to B_{n-3} \to \cdots \to B_2 \to A_2 \to B_0; \tag{5}$$

when n is even, the lacing is, similarly,

$$A_0 \to B_1 \to A_1 \to B_3 \to A_3 \to \cdots \to A_{n-1} \to B_n \to A_n \to B_{n-2} \to A_{n-2} \to B_{n-4} \to \cdots \to B_2 \to A_2 \to B_0; \tag{6}$$

and, with a little more thought, we see that, in both cases, the total length of lace is

$$L_{EU} = L_{EU}(n, v, w) = nw + 2\sqrt{v^2 + w^2} + (n - 1)\sqrt{4v^2 + w^2}. \tag{7}$$

For the shoe shop (SS) style, as in Figure 4, the lacing is

$$A_0 \to B_n \to A_n \to B_{n-1} \to A_{n-1} \to \cdots \to B_3 \to A_3 \to B_2 \to A_2 \to B_1 \to A_1 \to B_0 \tag{8}$$

and we find that the total length is

$$L_{SS} = L_{SS}(n, v, w) = nw + n\sqrt{v^2 + w^2} + \sqrt{n^2 v^2 + w^2}. \tag{9}$$
We can generalize the situation as follows. Let α and β denote permutations of {1, 2, 3, ..., n}:

$$\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_n\}, \qquad \beta = \{\beta_1, \beta_2, \ldots, \beta_n\}. \tag{10}$$

To them will correspond the lacing

$$A_0 \to B_{\beta_1} \to A_{\alpha_1} \to B_{\beta_2} \to A_{\alpha_2} \to B_{\beta_3} \to \cdots \to A_{\alpha_{n-1}} \to B_{\beta_n} \to A_{\alpha_n} \to B_0, \tag{11}$$

and this will have total length

$$L = \sqrt{\beta_1^2 v^2 + w^2} + \sqrt{(\alpha_1 - \beta_1)^2 v^2 + w^2} + \sqrt{(\beta_2 - \alpha_1)^2 v^2 + w^2} + \sqrt{(\alpha_2 - \beta_2)^2 v^2 + w^2} + \cdots + \sqrt{(\beta_n - \alpha_{n-1})^2 v^2 + w^2} + \sqrt{(\alpha_n - \beta_n)^2 v^2 + w^2} + \sqrt{\alpha_n^2 v^2 + w^2}. \tag{12}$$

Figure 2. American zig zag lacing.

Figure 3. European straight lacing.

Figure 4. Shoe-shop quick lacing.
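The length formula (12) translates directly into a few lines of code. The sketch below is an illustration rather than anything taken from the article: the function names are hypothetical, and it assumes (as the formulas suggest) that v is the spacing between consecutive eyelets in a row and w is the distance between the two rows. As a check it uses the n = 7 lacing that the article labels (20), whose length is collected in (21).

```python
from collections import Counter
from math import hypot, isclose

def lacing_steps(alpha, beta):
    """Absolute index differences along A0 -> B[b1] -> A[a1] -> ... -> A[an] -> B0."""
    n = len(alpha)
    assert sorted(alpha) == sorted(beta) == list(range(1, n + 1))
    indices = [0]                      # start at A0
    for a, b in zip(alpha, beta):
        indices += [b, a]              # visit B_b, then A_a
    indices.append(0)                  # finish at B0
    return [abs(q - p) for p, q in zip(indices, indices[1:])]

def lacing_length(alpha, beta, v, w):
    """Total length as in (12): each step spans k eyelet intervals and one row gap."""
    return sum(hypot(k * v, w) for k in lacing_steps(alpha, beta))

# The lacing A0->B2->A7->B4->A6->B1->A1->B3->A3->B6->A5->B5->A4->B7->A2->B0.
alpha = [7, 6, 1, 3, 5, 4, 2]          # A-indices in order of visit
beta = [2, 4, 1, 3, 6, 5, 7]           # B-indices in order of visit

v, w = 1.0, 2.0                        # arbitrary sample dimensions
counts = Counter(lacing_steps(alpha, beta))
expected = (3 * w + 2 * hypot(v, w) + 4 * hypot(2 * v, w)
            + 3 * hypot(3 * v, w) + 3 * hypot(5 * v, w))
print(dict(counts), isclose(lacing_length(alpha, beta, v, w), expected))
```

Counting the steps by their index difference reproduces the coefficients 3, 2, 4, 3, 3 that appear in the collected form of the length.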
For the three special lacings shown above, the particular permutations are

$$\alpha_{AM} = \{\text{all even numbers increasing; then all odd numbers decreasing}\}, \tag{13}$$

$$\beta_{AM} = \{\text{all odd numbers increasing; then all even numbers decreasing}\}; \qquad \alpha_{EU} = \beta_{EU} = \beta_{AM}, \tag{14}$$

$$\alpha_{SS} = \beta_{SS} = \{\text{all numbers decreasing}\}. \tag{15}$$
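The permutations (13)-(15) can be generated for any n and compared against the closed-form lengths (4), (7), and (9). This is a hedged sketch with hypothetical function names, again assuming v is the eyelet spacing along a row and w the gap between rows; it checks agreement numerically for n = 1, ..., 8 rather than proving it.

```python
from math import hypot, isclose

def standard_permutations(n):
    """(alpha, beta) for the AM, EU, and SS lacings, following (13)-(15)."""
    odds_up = list(range(1, n + 1, 2))
    evens_up = list(range(2, n + 1, 2))
    alpha_am = evens_up + odds_up[::-1]    # evens increasing, then odds decreasing
    beta_am = odds_up + evens_up[::-1]     # odds increasing, then evens decreasing
    down = list(range(n, 0, -1))           # all numbers decreasing
    return {"AM": (alpha_am, beta_am), "EU": (beta_am, beta_am), "SS": (down, down)}

def lacing_length(alpha, beta, v, w):
    """Length of A0 -> B[b1] -> A[a1] -> ... -> B[bn] -> A[an] -> B0."""
    indices = [0]
    for a, b in zip(alpha, beta):
        indices += [b, a]
    indices.append(0)
    return sum(hypot((q - p) * v, w) for p, q in zip(indices, indices[1:]))

def closed_forms(n, v, w):
    """The closed-form lengths (4), (7), and (9)."""
    d1 = hypot(v, w)
    return {
        "AM": w + 2 * n * d1,
        "EU": n * w + 2 * d1 + (n - 1) * hypot(2 * v, w),
        "SS": n * w + n * d1 + hypot(n * v, w),
    }

v, w = 1.0, 2.0                            # arbitrary sample dimensions
for n in range(1, 9):
    formulas = closed_forms(n, v, w)
    for style, (alpha, beta) in standard_permutations(n).items():
        assert isclose(lacing_length(alpha, beta, v, w), formulas[style])
print("closed forms agree with the explicit paths for n = 1..8")
```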
The simplicity of these permutations is indeed remarkable.

THEOREM 1. If v = 0 or w = 0, for all positive n,

$$L_{AM} = L_{EU} = L_{SS}. \tag{16}$$

If v ≥ 0 and w ≥ 0,

$$L_{AM}(1, v, w) = L_{EU}(1, v, w) = L_{SS}(1, v, w), \tag{17}$$

and, if v > 0 and w > 0,

$$L_{AM}(2, v, w) < L_{EU}(2, v, w) = L_{SS}(2, v, w). \tag{18}$$

Finally, if v > 0, w > 0, and n > 2,

$$L_{AM} < L_{EU} < L_{SS}. \tag{19}$$
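Part of Theorem 1 can be read off the closed forms (4), (7), and (9) with very little algebra. The following worked step is one possible route, not necessarily the one taken in the technical report [1]; it covers (17), (18), and the first inequality in (19), and leaves the comparison of EU with SS for n > 2 aside.

```latex
\begin{align*}
L_{EU} - L_{AM}
  &= \bigl(nw + 2\sqrt{v^2+w^2} + (n-1)\sqrt{4v^2+w^2}\bigr)
     - \bigl(w + 2n\sqrt{v^2+w^2}\bigr) \\
  &= (n-1)\Bigl(w + \sqrt{4v^2+w^2} - 2\sqrt{v^2+w^2}\Bigr). \\[4pt]
% Triangle inequality for the plane vectors (0, w) and (2v, w),
% whose sum is (2v, 2w):
2\sqrt{v^2+w^2} = \bigl\|(2v, 2w)\bigr\|
  &\le \bigl\|(0, w)\bigr\| + \bigl\|(2v, w)\bigr\|
   = w + \sqrt{4v^2+w^2}, \\
&\text{with strict inequality when } v > 0 \text{ and } w > 0. \\[4pt]
L_{EU}(2, v, w) &= 2w + 2\sqrt{v^2+w^2} + \sqrt{4v^2+w^2} = L_{SS}(2, v, w).
\end{align*}
```

The factor (n - 1) gives equality of the AM and EU lengths at n = 1 and a strict inequality for every n ≥ 2 when v and w are positive; the last line is the equality in (18).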
Figure 5. Lattice representation of the three standard lacings.
This theorem can be proved, using (4), (7), and (9), by the careful analysis of cases and elimination of radicals. The proof is left as an exercise for the reader. (It is given by the author in a technical report [1].)

The Lattice Representation
Let us make a lattice of alternating parallel, equidistant sets A and B, as shown in Figure 5. Given any lacing $\mathcal{L}$, we can represent it, as is shown for our three standard examples, by a polygonal (piecewise straight) line $\mathbf{L}$ moving always downward across the new lattice, visiting the eyelet points only once each. The first line segment in the order of lacing, $A_0 \to B_{\beta_1}$, is unchanged; the next, $B_{\beta_1} \to A_{\alpha_1}$, is replaced by its mirror image in the original B line; the next, $A_{\alpha_1} \to B_{\beta_2}$, is moved downward by two lattice intervals, parallel to itself (i.e., it is a twice-repeated mirror image), and so on; the last segment, $A_{\alpha_n} \to B_0$, returns to the image of $B_0$ in the B line displaced downward by 2n intervals. Clearly, the total length of the representation $\mathbf{L}$ will equal the original total length L of the lacing $\mathcal{L}$ itself.

That the "American" (AM) lacing is better than the "European" (EU) lacing is now immediately apparent, by a straightforward application of the triangle inequality (see Figure 6). The two representations, $\mathbf{L}_{AM}$ and $\mathbf{L}_{EU}$, coincide in several places. Where they differ, replicas of a triangle PQR occur, and it is clear that PR < PQ + QR, so that the first inequality in (19) follows, without further algebra!

Figure 6. Comparison of AM and EU lacing.

That the EU lacing is better than the SS lacing is a little harder to prove (see Figure 7). First, we observe that the representations $\mathbf{L}_{EU}$ and $\mathbf{L}_{SS}$ have in common just two diagonal segments, moving by one lattice interval in both directions (slopes ±w/v), and n (vertical) segments, moving by one vertical lattice interval w only. If we omit all of these common intervals, shifting the separated lower segment upward (and in the first two cases, sideways also), parallel to themselves, to rejoin the upper segment, and thus subtracting equal lengths from each representation, we obtain reduced representations, $\mathbf{L}^{*}_{EU}$ and $\mathbf{L}^{*}_{SS}$. The result is shown in Figure 8. Each representation now consists of a singly-broken line (just two successive line segments: a zig and a zag). Now perform the "reflection trick" again, this time in the horizontal coordinate direction, so that the leftward segment of each representation is reflected about the vertical. The resulting representation lines are denoted by $\mathbf{L}^{**}_{EU}$ and $\mathbf{L}^{**}_{SS}$ (see Figure 9). We can now simply observe that $\mathbf{L}^{**}_{EU}$ is just a single straight segment UV, whereas $\mathbf{L}^{**}_{SS}$ consists of two straight segments, UW and WV, so that, again by the triangle inequality, (19) clearly holds.

Figure 7. Comparison of EU and SS lacing.

Figure 8. Comparison of EU and SS lacing: reduced representations.

Figure 9. Comparison of EU and SS lacing: reflected representations.

Optimization
We adopt the lattice representation described above (see Figures 5-7) and apply the "reflection trick" to the part of the path from $B_n$ to $B_0$. The form of the path corresponding to a typical general lacing is illustrated in Figure 10. The path $\mathbf{L}_{AM}$ corresponding to the AM lacing is also shown. In this particular example, as before, n = 7 and the lacing is

$$A_0 \to B_2 \to A_7 \to B_4 \to A_6 \to B_1 \to A_1 \to B_3 \to A_3 \to B_6 \to A_5 \to B_5 \to A_4 \to B_7 \to A_2 \to B_0. \tag{20}$$

Its length is [compare (12) and collect similar radicals]

$$L = 3w + 2\sqrt{v^2 + w^2} + 4\sqrt{4v^2 + w^2} + 3\sqrt{9v^2 + w^2} + 3\sqrt{25v^2 + w^2}. \tag{21}$$

Figure 10. General lacing: reflected representations.

In general, let the lacing have total length

$$L = \sum_{k=-n}^{n} N_k \sqrt{k^2 v^2 + w^2}, \tag{22}$$

where, clearly,

$$\sum_{k=-n}^{n} N_k = 2n + 1 \tag{23}$$

is the net total number of downward displacements (i.e., the number of steps, since each step has a downward displacement by one lattice interval w), and

$$\sum_{k=-n}^{n} k N_k = 2n \tag{24}$$

is the net total number of rightward displacements by one lattice interval v. For the AM lacing, it is clear that

$$N_0 = 1, \qquad N_1 = 2n, \qquad \text{all other } N_k = 0. \tag{25}$$

THEOREM 2. The AM lacing has the shortest possible total length L, and it is the unique optimum lacing.

Proof. Let $\mathbf{L}$ be the reflected representation of an arbitrary lacing $\mathcal{L}$, and let L be its total length.

(i) If $N_0 \ge 1$, let us remove any one corresponding (vertical) step from $\mathbf{L}$; and let us remove the sole vertical step from $\mathbf{L}_{AM}$, rejoining the separated pieces of the representations by parallel displacement, as before; then the two new representations, $\mathbf{L}^{\dagger}$ and $\mathbf{L}^{\dagger}_{AM}$, still share their end points, and both lengths are just w less than they were. Now $\mathbf{L}^{\dagger}_{AM}$ is clearly minimal, being the straight line connecting these end points. Therefore, for all $\mathcal{L}$,

$$L_{AM} \le L. \tag{26}$$

(ii) Suppose now that $N_0 = 0$. This is illustrated in Figure 11.

Figure 11. Case of $N_0 = 0$: no vertical segment.

It cannot be that $N_k > 0$ only for positive values of k; for then, by (23) and (24), we would have that

$$\sum_{k=1}^{n} k N_k - \sum_{k=1}^{n} N_k = N_2 + 2N_3 + \cdots + (n - 1)N_n = -1, \tag{27}$$

which is impossible, since all $N_k \ge 0$. Therefore, there is at least one step with a negative (leftward) horizontal displacement, and thus there is a first leftward step, ST, in the downward order. It obviously cannot be either the first or the last step of the representation. Hence, it is preceded by a rightward step, RS, forming an angle pointing to the right. Now (see the enlarged detail of Figure 12), let F and G be the respective lattice points in which the vertical lines through R and T meet the horizontal line through S. Then

$$|FR| = |GT| = w \tag{28}$$

and

$$|FS| \ge v \quad\text{and}\quad |GS| \ge v. \tag{29}$$

Figure 12. Magnified detail of Figure 11.

Through G, draw a line parallel to TS, and let it meet RS (as it must) at X. Now draw a vertical line (parallel to RF) through X to meet TS at Y. Clearly, XYTG is a parallelogram, and therefore |XY| = |GT| = w, by (28).¹ Thus we can replace the polygonal segment RST of the representation $\mathbf{L}$ by the polygonal segment RXYT, and by the triangle inequality,

$$|XY| < |XS| + |SY|; \tag{30}$$

so that the modified representation, $\mathbf{L}^{\ddagger}$ say, is shorter than $\mathbf{L}$. But now $\mathbf{L}^{\ddagger}$ has a vertical segment of length w; so, by the same argument as in case (i), the inequality (26) prevails. NOTE: The representative polygonal line $\mathbf{L}^{\ddagger}$ is, generally, not a representation of any lacing, since it does not, in general, join lattice points; but this does not matter, since, at this stage of the argument, we are only concerned with the length of the line. We have now proved that, if $\mathcal{L}_{MIN}$ is any lacing of minimal length, then it and its (horizontally reflected) representation $\mathbf{L}_{MIN}$ will have a total length equal to that of the AM lacing, that is, by (4),

$$L_{MIN} = L_{AM} = w + 2n\sqrt{v^2 + w^2}. \tag{31}$$

¹ Note, too, that XYFR is also a parallelogram, since the opposite sides, XY and RF, are equal and parallel.

(iii) Finally, we prove the uniqueness of the optimal lacing $\mathcal{L}_{MIN}$. The arguments presented in cases (i) and (ii) show that any minimal lacing $\mathcal{L}_{MIN}$ will satisfy (25); that is, its (horizontally reflected) representation $\mathbf{L}_{MIN}$ will have 2n straight segments, moving diagonally down-and-to-the-right by one lattice interval, and one vertical segment. However, the position of this vertical segment in the chain does not matter to the total length $L_{MIN}$, as is indicated in (31). Nevertheless, since $\mathbf{L}_{MIN}$ is not just any lattice polygon, but the representation of a lacing, it must pass through the vertical lattice line corresponding to index n just twice (corresponding to the eyelets $A_n$ and $B_n$), and this is the only lattice line which is not duplicated by the reflection transformation, since it is the reflection line. Therefore, since the representation moves monotonely right (i.e., never to the left), the solitary vertical segment is constrained to be precisely in the index n position, as in $\mathbf{L}_{AM}$. This completes the proof of Theorem 2. □
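For small n, Theorem 2 can also be confirmed by exhaustive search over all pairs of permutations (α, β). The sketch below is a numerical sanity check under stated assumptions, not a substitute for the proof: the function names are hypothetical, the dimensions v = 1 and w = 2 are arbitrary sample values, and lengths are compared in floating point with a small tolerance.

```python
from itertools import permutations
from math import hypot, isclose

def lacing_length(alpha, beta, v, w):
    """Length of A0 -> B[b1] -> A[a1] -> ... -> B[bn] -> A[an] -> B0."""
    indices = [0]
    for a, b in zip(alpha, beta):
        indices += [b, a]
    indices.append(0)
    return sum(hypot((q - p) * v, w) for p, q in zip(indices, indices[1:]))

def american(n):
    """The AM permutations: alpha = evens up, odds down; beta = odds up, evens down."""
    odds = list(range(1, n + 1, 2))
    evens = list(range(2, n + 1, 2))
    return tuple(evens + odds[::-1]), tuple(odds + evens[::-1])

def shortest_lacings(n, v, w, tol=1e-9):
    """Return the minimal length and every (alpha, beta) attaining it within tol."""
    lengths = {
        (alpha, beta): lacing_length(alpha, beta, v, w)
        for alpha in permutations(range(1, n + 1))
        for beta in permutations(range(1, n + 1))
    }
    best = min(lengths.values())
    return best, [pair for pair, length in lengths.items() if length - best <= tol]

v, w = 1.0, 2.0
for n in range(1, 5):                      # (n!)^2 lacings: at most 576 for n = 4
    best, winners = shortest_lacings(n, v, w)
    assert winners == [american(n)]        # unique minimiser is the AM lacing
    assert isclose(best, w + 2 * n * hypot(v, w))
print("AM lacing is the unique shortest lacing for n = 1..4")
```

The search grows as (n!)², so it is only practical for a handful of eyelets; the reflection argument is what settles the general case.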
Reference
1. The Shoelace Problem, Department of Computer Science Technical Report No. 92-032 (1992), University of North Carolina at Chapel Hill.

Computer Science Department, University of North Carolina, Chapel Hill, NC 27599-3175 USA
100th Anniversary of the Jagiellonian University Students' Maths Society
Krzysztof Ciesielski
The Polish Mathematical Society is 75 years old. However, it is not the oldest mathematical organization in Poland. Recently we celebrated the 100th anniversary of the Jagiellonian University Students' Math Society (Koło Matematyków Studentów UJ) in Kraków. In the late nineteenth century the rules of mathematical studies at the Jagiellonian University (which is the oldest university in Poland, founded in 1364 by a Polish king, Kazimierz the Great) were rather relaxed. There was only one compulsory exam (to obtain a Ph.D. or bachelor's degree). About 15 lecture courses and some seminars were organized each year, but students were allowed to attend the courses or not, as they chose. The professors knew only the most talented students, meeting them during the seminars. Many students were not in touch with others. Moreover, the number of mathematical monographs available (especially in Polish) was very small. Studies became more systematic only in 1926. The Founding Session of the Society (then called Maths and Physics Jagiellonian University Students' Society, Kółko Matematyczno-Fizyczne Uczniów UJ) was held on 3 December 1893. A group of 20 people elected Zdzisław Krygowski the first President of the Society. The real work started in April 1894, after the General Assembly. The main activity of the Society was the regular meetings, where talks on various subjects were given by the students. Books were gathered to make a Society library. However, after 3 years the Society "fell asleep." It revived in 1900, when Antoni Hoborski, later a well-known geometer, was elected President. Apart from talks on various subjects (sometimes on recent results, obtained by famous mathematicians), problem sessions were organized. In the minutes of the Society the time is recorded in an interesting way: "the meeting was opened at 6 1/4 p.m. ..." The meetings were regularly announced in local newspapers. Prompted by Hoborski, the students started publishing lecture notes.
The lecture courses were very carefully noted down, checked by the professors, calligraphically rewritten, and published (lithographed or printed). The Society began with An Introduction to Mathematical Analysis based on lectures by Stanisław Zaremba; between 1902 and 1939 the Society published a number of very valuable books, an extraordinary accomplishment.

The emblem of the Society from the beginning of the century.

A copy of the first page of the minutes of the Founding Session of the Society, 3 December 1893.

Cover of the lecture notes "The theory of determinants of linear equations" by S. Zaremba, published in 1906.

Until 1918 the Jagiellonian University Students' Maths Society was the only such society in Poland, but after the war four others were created: in Warsaw, Lwów, Poznań, and Wilno. Recall that the Kraków Mathematical Society (which later changed its name to the Polish Mathematical Society) was founded in 1919, when the Students' Society was 25 years old. Among the first members of the Kraków Mathematical Society we find many mathematicians who had been in the Students' Math Society. The five students' societies organized annual meetings, with talks and discussions. Thanks to their collaboration, the books published in Kraków were sold throughout Poland.

Many students working in the Jagiellonian University Society later became very well-known mathematicians. In the 1920s, the presidents were, for instance, Stanisław Krystyn Zaremba (a son of Stanisław Zaremba), Andrzej Turowicz (later on a mathematician and a priest and a friar, known to the readers of The Intelligencer from [1]), and Stanisław Gołąb. Gołąb was a favourite pupil of Hoborski, who led the Society 20 years earlier. Hoborski relied very much on Gołąb's opinion; students joked that Hoborski, leaving the University building together with Gołąb, would open his umbrella and ask, "Stanisław, does it knock or doesn't it knock?" and shut the umbrella if Gołąb answered "No." Also active in the Society were Zofia Krygowska, later famed for her achievements in mathematical education, and many others.

After the Second World War, publishing books was much more complicated than before. According to the new political rules in Poland, every publication had to be accepted by the censor; there were also problems with paper and type. Moreover, the Polish authorities wanted to introduce a "new order" in student organisations and make one great socialist students' society. In 1950 the students of the Mathematics Department found the room of the Society sealed (without previous notice!). After a few months it turned out that about 2000 precious books from the Society library had disappeared. Fortunately most of them were recovered after some years. The Society functioned unofficially for a time. In 1959 the students were able to reactivate it, with the help of Professors T. Ważewski and S. Gołąb, and it has continued fully active since then.

A calendar for 1984, on the skeleton of the dodecahedron.

What have the students done? Many scientific meetings are organized. There are single meetings with students' talks, as well as continuing seminars on particular subjects. Moreover, students combine mathematics with tourism. Every year, there are Winter and Summer Schools (which are one- or two-week camps with daily mathematical talks); the students also go into the countryside for short seminars (called "quasinars"). The questions stated during problem sessions are taken down in a special book; the readers of The Intelligencer may find one of these problems in Mathematical Entertainments in vol. 10 (1988), no. 1 (by the way, it was regarded as the best contribution to that Intelligencer column in 1988). The annual joint meetings of the Polish students' mathematical societies stopped after 1976, mainly for lack of financial support; up to 1976 many of these meetings took place in Kraków. There was a Computer Science section which in 1976 split off as a separate society (on the occasion of the 100th anniversary, mathematicians joked that it was the oldest computer science society in the world, as it had its origin in 1893).

The students do not limit themselves to scientific activities. They gather and invent mathematical jokes, poems, and cartoons; they note funny quotations by professors from their lectures; they publish the Society calendars.... Also, each year we watch the Students-Staff football match and may attend The Mathematicians' Ball (in 1994, nineteenth-century costumes were almost compulsory).

Almost every member of the Mathematical Institute of the Jagiellonian University worked in the Society during his student years. There is something incredible in this Society, where people work two to four years, during their studies, and yet younger students pick up the projects of older students, as if by mathematical induction. People change, generations change, the style of the Society work changes, but the atmosphere in the Society is always superb. I wonder if there are many so long-lasting and continuously active mathematical students' organizations in the world.

In the spring of 1994 the Society officially celebrated its centennial. There was an exhibition of "treasures" from the archives. Many people, past members of the Society, came to the special meeting held in March. Many of them came from far, many of them have not been in contact with mathematics for a long time. Clearly the connection with the Jagiellonian University Students' Math Society was for many of its members not only a passing adventure of student days.

A copy of the special pennant made by the students on the occasion of the Staff-Students match in 1985 (notice the soccer ball with negative curvature).
References
1. K. Ciesielski and Z. Pogoda, "Conversation with Andrzej Turowicz," The Mathematical Intelligencer, vol. 10 (1988), no. 4, 13-20.
2. Minutes of the Jagiellonian University Students' Maths Society (in Polish), manuscript, four volumes (1883-1904, 1900-1913, 1923-1930, 1948-1950).
3. J. Piórek, From the minutes of the Mathematical Society in Cracow, Mathematics-Society-Teaching, special issue, 1990, 9-12.
4. K. Szałajko, Memoirs about Jan Kazimierz University in Lwów Students' Maths Society (in Polish), Wiadomości Matematyczne 26 (1984), 85-96.

Jagiellonian University, Mathematics Institute, Reymonta 4, Kraków, Poland
Friedrich II and the Love of Geometry*
Heinz Götze
The Castel del Monte was built in the northern part of Apulia by the Holy Roman Emperor Friedrich II of Hohenstaufen in the last decade of his life. Even today it remains an object of wonder (Fig. 1). Standing on a conical hill in the flat tablelike countryside called the Murge that slowly falls toward the sea, the castle is visible from afar, golden in the bright sunshine. Its unique form, eight-sided with octagonal towers at each corner, is sharply defined by shadows. The stark, sharply delineated walls emphasize its stereometric character. Nothing built before or after shares its remarkable layout. There are of course other octagonal buildings, especially in the Byzantine and Moslem world, at times even with round corner towers, but nowhere else has the construction of towers been done with such attention to geometry. It is remarkable that these obvious mathematical-geometric relationships have so far hardly been discussed, let alone the role they may have played in the creative processes of the architects. There may be many reasons for this neglect of the purely geometric aspect of architectural design. Not only did many of the early attempts to discover the presumed "secrets" of the medieval cathedral-builders by looking for hidden geometric relationships acquire a bad reputation among historians of architecture, they also suffered from a poor understanding of the mathematical foundations themselves. A major role in such attempts was played by squares and triangles (ad quadratum et ad triangulum). However, until recently such attempts were pure inventions [1]. The alleged "secret" of the medieval architects was demolished by P. Frankl in 1945, and in 1979 K. Hecht thoroughly debunked the attempts to find arbitrary geometrical patterns in Gothic architecture. But in denying the very existence of a geometry of proportions in Gothic architecture, he threw out the baby with the bathwater.
After all, any two measurable quantities automatically stand in some proportion to each other, whether on purpose or not. Moreover, proportion has always been an aesthetic element in architecture.
For Castel del Monte, there is no need to invent or to guess at the geometric relationships: They simply exist in the building, as is immediately obvious from a single glance at an aerial photo (Fig. 2) [2]. All that is needed is a critical analysis of the mathematical system that follows from it. One must, of course, ignore small deviations of the castle's actual measurements from the obvious design objectives. Builders in the Middle Ages did not have experience with dealing with such a clear geometric concept and were not used to exactly following architectural designs, as were the builders of the classical Greek era. In addition, the architect here built in certain small deviations; for example, to emphasize the east wing of the building. This did not lead to a devaluation of the basic concept, quite the contrary. The deviations clearly are based on a clear overall plan. A building such as Castel del Monte cries out for a mathematical analysis to help evaluate it as an architectural object in the same way that historical, chronological, art-historical, and architectural-historical analyses do.
*This is a somewhat expanded version of an article in Architektur Aktuell 169/170 (1994), 88-95, reprinted by permission. English translation by L.L. Schumaker.
THE MATHEMATICAL INTELLIGENCER VOL. 17, NO. 4 © 1995 Springer-Verlag New York
A Geometric Analysis and Definition
The two-dimensional layout of Castel del Monte can be described by a symmetry group with 16 elements: 8 reflections and 8 rotations. It is an automorphic group. These symmetry relations repeat themselves in all four towers (Fig. 5). The multiplicity of symmetries is expanded by homotheties, that is, "similarities" among the large octagon of the main building, the octagon of the inner courtyard, and the eight octagonal towers placed in the same system of axes. The ratio of the sizes of the three mutually "similar" octagons is $4h\ (\mathrm{I}) : 2h\ (\mathrm{II}) : 2(\sqrt{2}-1)h\ (\mathrm{III})$, where h is the mesh size of the basic grid (see Fig. 7). The ratio of the sides of the three octagons is $2a : a : a(\sqrt{2}-1)$, where a is the side length of the courtyard octagon. The symmetry group involved here is a planar group of type D16, in the notation of J.M. Montesinos-Amilibia [3].
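The count of 16 symmetries can be checked numerically. The following is a minimal sketch, assuming the layout is idealized as a regular octagon centred at the origin; the function names (`octagon`, `rotation`, `reflection`, `preserves`) are illustrative and not taken from the article.

```python
import math

def octagon(radius=1.0):
    """Vertices of a regular octagon centred at the origin (idealized layout)."""
    return [(radius * math.cos(k * math.pi / 4), radius * math.sin(k * math.pi / 4))
            for k in range(8)]

def rotation(k):
    """Rotation about the origin by k * 45 degrees, as a 2x2 matrix."""
    t = k * math.pi / 4
    return ((math.cos(t), -math.sin(t)), (math.sin(t), math.cos(t)))

def reflection(k):
    """Reflection in the line through the origin at angle k * 22.5 degrees."""
    t = k * math.pi / 4  # the matrix entries use twice the axis angle
    return ((math.cos(t), math.sin(t)), (math.sin(t), -math.cos(t)))

def apply(m, p):
    """Apply the 2x2 matrix m to the point p."""
    return (m[0][0] * p[0] + m[0][1] * p[1], m[1][0] * p[0] + m[1][1] * p[1])

def preserves(m, pts, tol=1e-9):
    """True if m maps every point of pts onto some point of pts."""
    return all(any(math.dist(apply(m, p), q) < tol for q in pts) for p in pts)

symmetries = [rotation(k) for k in range(8)] + [reflection(k) for k in range(8)]

# Round the matrix entries so that numerically equal matrices compare equal.
distinct = {tuple(round(x, 9) for row in m for x in row) for m in symmetries}

print(len(distinct), all(preserves(m, octagon()) for m in symmetries))
```

The sketch builds 8 rotations and 8 reflections, confirms that they are pairwise distinct, and checks that each maps the octagon's vertex set onto itself.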
Figure 1
Figure 2
Figure 4
Figure 3
Figure 5
Simple reflection symmetry, which is often encountered in nature, has long played an essential aesthetic function in architecture. There are other mathematical relationships that have equally strong aesthetic effects, and which also appear in nature--for example, the "golden ratio" which is connected with the pentagon. The architect of the Castel del Monte was clearly aware of the aesthetic importance of symmetries. He used them to achieve the impressive appearance of the castle, which still affects us today.
The planimetric aerial photo (Fig. 2) shows that the tangents of the octagon forming the inner courtyard intersect at the centers of the octagonal corner towers: they form an eight-pointed star whose tips lie at the centers of the towers. This provides a geometric relationship between the inner courtyard and the corner towers, established by the similarity relationships discussed above. This relationship was first discussed in my book on Castel del Monte, where I also gave a geometric construction for the layout (reproduced here in Figures 3, 4, and 6). I am indebted to three very famous mathematicians who have dealt with this geometric configuration and my construction: F.L. Bauer of Munich, Marcel Erné of Hannover, and Max Koecher of Münster.
It was Max Koecher who first thought about the strong aesthetic effect of the multiple symmetries in the geometric plan, and defined it as a geometric configuration with its own intrinsic aesthetic [4]. He described it as a configuration of octagons, generalized from regular polygons, built up according to the following principles:
1. The configuration comprises the center and the vertices of a central octagon O, along with the center points and vertices of eight smaller translated copies Ov of the central octagon, all equal in size.
2. The centers Mv of Ov lie on the rays passing through the center M of O and the vertices of O.
3. All Mv are at the same distance from M.
4. As many as possible of the vertices and centers of O and of the Ov are collinear; that is, they lie on common straight lines.
This last condition is essential for the aesthetic configuration. Koecher then discussed the many different possible collinear arrangements.
Figure 6
In Figure 11 two such "collineations" (i.e., points of the basic octagonal construction which lie on a single line) are marked: the line connecting the centers of the octagons Ma, Ma (or equivalently the line through c and b), and the tangent lines to the two exterior octagons.
An alternative to the construction of the layout shown in Figures 3, 4, and 6 (which requires both compass and ruler) is due to Marcel Erné of the Technical University of Hannover; it does not require a compass (Figs. 7 and 8). After drawing a right angle, as in the first construction, we divide a large square into 16 subsquares with side length h. The quantity h can be considered as a modulus, whose size can be computed from the measurements of the castle. In the next step a second, equally large square is created and rotated by 45° so that the vertices of the new square lie on the extensions of the main axes of the original square. We have here just the classical approach to constructing an octagon by rotating a square. One of the advantages of this second construction is that one can readily compute the lengths of all line segments, as shown in Figure 9.
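The crossed-squares construction can be sketched numerically. This is a minimal illustration, assuming both squares have side 4h (16 subsquares of side h) and share a centre at the origin; the function name `crossed_squares_octagon` is hypothetical.

```python
import math

def crossed_squares_octagon(h):
    """Vertices of the octagon cut out by two squares of side 4h,
    one axis-parallel and one rotated by 45 degrees about the common centre."""
    apothem = 2 * h  # distance from the centre to each side of either square
    # The eight bounding lines have unit normals at 0, 45, ..., 315 degrees,
    # alternating between the two squares; each line is n . p = apothem.
    normals = [(math.cos(k * math.pi / 4), math.sin(k * math.pi / 4)) for k in range(8)]
    vertices = []
    for k in range(8):
        (a1, b1), (a2, b2) = normals[k], normals[(k + 1) % 8]
        det = a1 * b2 - a2 * b1          # nonzero: adjacent normals differ by 45 degrees
        x = apothem * (b2 - b1) / det    # Cramer's rule for the 2x2 linear system
        y = apothem * (a1 - a2) / det
        vertices.append((x, y))
    return vertices

h = 1.0
verts = crossed_squares_octagon(h)
sides = [math.dist(verts[k], verts[(k + 1) % 8]) for k in range(8)]
print([round(s, 6) for s in sides])
```

With h = 1 every side comes out equal to 4(√2 − 1), so the figure cut out by the two squares is a regular octagon whose distance between opposite sides is 4h.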
Relationships
The sides of the octagon forming the interior walls of the outside of the castle are determined by the intersecting grids and are twice as long as the sides of the octagon forming the interior courtyard. The width of the eight outside towers is equal to the length of a courtyard side a.
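These relationships can be written out explicitly. The following is a short derivation under two assumptions: the octagon of the main building is the one cut out by two crossed squares of side 4h, so its inradius is 2h, and the "width" of a tower is the distance between its opposite sides. The symbols $s_{\mathrm{I}}$, $s_{\mathrm{II}}$, $s_{\mathrm{III}}$ for the three side lengths are introduced here for illustration only. A regular octagon with inradius r has side $2r\tan(\pi/8)$, and $\tan(\pi/8) = \sqrt{2}-1$, so

```latex
\begin{align*}
s_{\mathrm{I}}   &= 2\,(2h)\tan\frac{\pi}{8} = 4(\sqrt{2}-1)\,h = 2a, \\
s_{\mathrm{II}}  &= a = 2(\sqrt{2}-1)\,h, \\
s_{\mathrm{III}} &= 2\cdot\frac{a}{2}\,\tan\frac{\pi}{8} = (\sqrt{2}-1)\,a .
\end{align*}
```

This reproduces the ratio of sides $2a : a : a(\sqrt{2}-1)$ given earlier.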
Figure 7
Figure 8
Instead of the quantity a, we can also use the mesh size h as the basic unit. The quantity a can be found from the actual building by measuring the width of a tower or the length of a side of the interior courtyard. It is not so clear how to measure the quantity h, although it is of great importance as the basic unit used in constructing the rectangular grids. The grid plays an essential role in the development of the layout, and presumably also played a key role in the translation of the drawing to the actual physical construction of the building. A geometric configuration this complicated cannot be built without marking it full-scale on the construction site. It is also clear that this could not have been done "by hand," but must have been accomplished with mechanical measurement tools which were readily available. Using h as the basic unit (Figs. 7 and 9), we have the following:
1. The distance between the outside of the courtyard walls and the inside of the exterior walls is h.
2. The diameter of the interior courtyard from the center of one wall to the center of the opposite wall is 2h.
3. The distance between the centers of the towers is also 2h.
4. The length of the sides of the towers is $b = a^2/(2h)$, and the thickness of the outer walls is $c = a^2/(2\sqrt{2}\,h)$.
5. The overall width of the castle is $4\sqrt{2}\,h$.
This mathematical prescription corresponds to the geometrical configuration. The appearance of $\sqrt{2}$ is noteworthy, albeit not surprising since we are dealing with the construction of squares and octagons. The Russian architectural historian M.S. Bulatow
Figure 9
pointed to a high multiplicity of symmetries as well as the repeated use of the quantity $\sqrt{2}$ as characteristics of Islamic architecture. Bulatow based his interpretation on his own careful studies and surveys of central Asian architecture as well as on the writings of Arabic scholars. As already suggested, the practical marking of the floor plan on the construction site probably began with the two gridded squares that determine the outer octagon. This was technically possible using the grid size which provided the basis for all other measurements. The essential role of the initial square, and with it the outside octagon, is also suggested by the exact equality of the distances between opposing walls of the outside octagon--it is 36 meters, which is approximately 120 Roman feet. Castel del Monte was erected by Cistercian masons. Friedrich II felt a close affinity to this order, which played a major role in the architecture of the high Middle Ages. For comparison, note that the nave of the abbey church of Eberbach in the Rheingau region of Germany, also erected by Cistercians, has a length of approximately 71 meters, that is, around 240 Roman feet--exactly twice the distance between the insides of opposite walls of Castel del Monte. Furthermore, the central room of the Basilica of Fanum of Vitruvius Pollio is 120 Roman feet long and 60 Roman feet wide. In all three buildings, the lengths are multiples of 60 Roman feet, and thus correspond to numbers in the sexagesimal system, known since the time of Babylonian astronomers, and which remained in use in Europe until the tables of Regiomontanus appeared (1436-1476). In view of this, it seems quite reasonable to consider the sexagesimal system of the outside octagon as the practical starting point for the construction of the layout. This leads us to wonder where this unique idea for designing a European building at this time in history might have originated.
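The conversions quoted above are easy to check. This is a rough sketch, assuming a Roman foot of about 0.296 m; that conversion factor is a commonly cited approximation and is not stated in the article, and the helper names are illustrative.

```python
ROMAN_FOOT_M = 0.296  # assumed length of one Roman foot in metres (not from the article)

def to_roman_feet(metres):
    """Convert a length in metres to Roman feet under the assumed conversion."""
    return metres / ROMAN_FOOT_M

def nearest_multiple_of_60(feet):
    """Round a length in Roman feet to the nearest multiple of 60."""
    return 60 * round(feet / 60)

lengths_m = {
    "Castel del Monte, between opposite walls": 36.0,
    "Eberbach abbey church, nave": 71.0,
}

for name, metres in lengths_m.items():
    feet = to_roman_feet(metres)
    print(f"{name}: {metres} m is about {feet:.1f} Roman feet, "
          f"nearest multiple of 60: {nearest_multiple_of_60(feet)}")
```

Under this assumption the two metric lengths land within a couple of feet of 120 and 240 Roman feet, in line with the approximate figures in the text.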
The layouts of other castles built around the same time (for example, the Wartburg in central Germany, the Marksburg on the Rhine, and the Chateau Chillon near Montreux, to mention only three arbitrary examples from different regions) are all very far from the kind of geometrical configuration which Castel del Monte exhibits. The keeps of castles in southern England and donjons in northern France are two other examples lacking such a geometrical structure. The practical aspects of building, the technique of ribbed arches, and the design of capitals were all part of the work of the Cistercian masons, who were dedicated to the Gothic style of Middle Europe. This says nothing about the design of Castel del Monte itself. Even the book of the contemporaneous architect Villard d'Honnecourt contains nothing remotely similar. There are no written records concerning the history of the design of the castle, and we do not know who the architects were. But we need not depend on conjecture alone
Figure 10
Figure 11
to learn something about the creation of this remarkable building, given its well-defined geometric configuration and its inner aesthetic. Searching for related structures, we find in the Carta Pisana, a navigational chart drawn at the end of the thirteenth century, an interesting depiction of an octagonal compass that exactly matches the shape of the layout of
Figure 12
Figure 13
Figure 14
the Castel del Monte (Fig. 10).
The navigational charts of the Mediterranean are primarily of Arabic origin (including the Arabic Caliphate of Córdoba). Two other examples include the Maghrebian navigational chart of the western part of the Mediterranean (Fig. 12) produced in the first half of the fourteenth century, and a drawing of two symmetric wind roses that does not include any underlying geographical information (Fig. 13). On the other hand, both of these wind roses clearly show an underlying square grid and the extended grid lines of the crossed squares whose intersections determine the vertices of the eight-pointed star.
Fuat Sezgin [5] has carefully studied navigational charts. He cites the historian Ibn Fadlallah al-'Umari (who died in 1349) as saying that navigational charts always include wind roses. In addition, as part of work done for the seafarer Abu Mahammad 'Abdalla B. Abi Nu'Aim al-Ansari al-Qurtubi of Córdoba, Ibn Fadlallah al-'Umari also reported that only the 4 principal directions and the 4 directions halfway between have Arabic names, even when the wind rose shows more than 8 directions (up to 32). Clearly, wind roses in the form of eight-pointed stars were developed by Arabic-Spanish sailors. The use of this style of wind rose for the design of the layout of Castel del Monte provides a clear indication of its origin, and possibly also of the meaning of the building itself.
Eight-sided stars are older than the navigational charts. They can be found already as projections of the ribs of the cupola in front of the Mihrab in the Umaiyaden Mosque in Córdoba (A.D. 961-966). This mosque also contains the first stage of the pattern in the form of crossed squares, which can also be seen in the cupola of the Umaiyadi Alferia Palace (second half of the eleventh century) in Zaragoza. The appearance of eight-sided stars is not confined to the Arabic region of the Mediterranean--they also appear as far away as Persia, India, and central Asia.
As the examples show, the eight-pointed star constructed from crossed squares is a widely used motif in the Muslim world, appearing in many contexts. Its use in the cupolas of religious buildings as well as in the form of wind roses suggests a connection with concepts of the heavens. For the discussion here, it suffices to recognize the underlying geometric configuration, exalted by the places in which it was used, and characterized by its multitude of symmetries.
The navigational charts, along with a mosaic of similar form in the Alhambra (Fig. 15), exhibit an additional step in the development of the eight-pointed star figure:
the intersection points of the tangents, that is, the vertices of the star, are distinguished by bundles of rays (Figs. 12 and 13). Here a new idea of the architect of Castel del Monte comes into play: he emphasizes the special position of the tips of the stars by repeating the eight-sided pattern in smaller, similar form. These corner towers are not only connected by a line from their centers Mv to the center M of the central octagon, but they also form a complete geometrical system together with the octagons of the outside wall and the inside courtyard (Fig. 5). With this novel idea, the architect increased the symmetries by an order of magnitude, and thus the aesthetic effect of the entire building. Since simple reflection symmetry is already regarded as a harmonizing element in architecture, the immense effect of this additional symmetry is no surprise. Suppose we carry the idea of the architects of Castel del Monte one step further and construct new eight-pointed stars formed from the tangents to the sides of the towers (Figs. 11 and 16). We immediately recognize geometric relationships between these outer stars as manifested in additional collineations. This is not to say that the architect of Castel del Monte took this additional step. It is not needed to establish the geometric relationships found in the castle, as these are based on the geometric system that includes the eight exterior octagons as well as the two basic octagons. The figure of the eight exterior stars (Fig. 16) corresponds to patterns found in Indo-Arabic constructive geometry, as can be seen in the mosaic in the center of the mausoleum of Humayun in Delhi (1565), in which the stars are aligned along lines and touch each other only at one tip (Fig. 14).
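The position of the star tips, and with them the tower centers, can be computed from the courtyard octagon alone. The following is a minimal sketch under two assumptions: the "tangents" are read as the extended side lines of a courtyard octagon with inradius h, and each side line is intersected with the one three steps further around. That pairing is chosen here because it reproduces the spacing 2h between tower centers quoted in the list of measurements; the function names are illustrative.

```python
import math

def side_line_normals():
    """Unit normals of the eight side lines of a regular octagon centred at the origin."""
    return [(math.cos(k * math.pi / 4), math.sin(k * math.pi / 4)) for k in range(8)]

def intersect(n1, n2, c):
    """Point p satisfying n1 . p = c and n2 . p = c (two lines at distance c from the origin)."""
    (a1, b1), (a2, b2) = n1, n2
    det = a1 * b2 - a2 * b1
    return (c * (b2 - b1) / det, c * (a1 - a2) / det)

def star_tips(h, step=3):
    """Tips of the eight-pointed star formed by extending the courtyard's side lines."""
    normals = side_line_normals()
    return [intersect(normals[k], normals[(k + step) % 8], h) for k in range(8)]

h = 1.0
tips = star_tips(h)
spacing = [math.dist(tips[k], tips[(k + 1) % 8]) for k in range(8)]
radius = [math.hypot(x, y) for x, y in tips]
print(round(spacing[0], 6), round(radius[0], 6))
```

With h = 1 the neighbouring tips are 2h apart and all lie at distance h / sin(22.5°), roughly 2.613h, from the centre, on the rays through the vertices of the octagon.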
The close geometric connections between the eight "satellite" stars, as shown by the touching of the points of neighboring stars, provide additional evidence that the size of the towers was not chosen arbitrarily, but follows the geometrical system. These connections further support the completeness of the geometric design of Castel del Monte as an example of a configuration with an inner aesthetic. The repetition of the basic eight-pointed star can be continued and, as Max Koecher observed, results in a fractal with infinite iteration possibilities (Figs. 17 and 18). An indirect proof of the geometric rules underlying the design of the castle is the fact that a computer-graphics model of the castle requires nothing more than these rules to effect a complete reconstruction, as the Heidelberg Center for Scientific Computing has shown. The determination and analysis of the origins of the basic geometric form underlying the design of Castel del Monte establishes a connection between it and Indo-Arabic geometry. Such a complex geometric analysis, however, is unusual even for this area. It is thus natural to ask if the emperor himself provided the inspiration for the design of this, the crown of his castles. His interest in mathematics and architecture is well known. The collection of permanent scholars at his court included Arabic mathematicians such as Theodorus of Antioch, who carried out mathematical correspondence with Leonardo of Pisa. This was the period in which Leonardo of Pisa collected Indo-Arabic mathematical results and disseminated them throughout Europe. The accomplishments of Greek mathematics had been preserved and extended by Arabic scholars. Leonardo was in close contact with Friedrich II and his court. It may be assumed that
Figure 15
Figure 16
Figure 17
Figure 18
Friedrich II, given his interest in mathematics, took an active part in the design of this castle, which could symbolize the principles of his empire. Thus, Castel del Monte, with its extraordinary aesthetic radiance, is not only an art and architectural monument without equal, but also a scientific and cultural one. It stands at the crossroads of the Arabic-geometric [5] and middle-European Gothic worlds, and represents the ruling spirit of one of the most important emperors of the Middle Ages.
Sources
The above discussion is based on work presented in Heinz Götze, Castel del Monte, Gestalt und Symbol der Architektur Friedrichs II, first edition, Munich, 1984; third edition, 1991, p. 84 ff. In the following, we abbreviate these as HG 1 and HG 3, respectively. See also the preface to the third edition, pp. 9-12, as well as H. Götze, "Die Baugeometrie von Castel del Monte," in Sitzungsberichte der Heidelberger Akademie der Wissenschaften, Phil.-hist. Klasse, Jahrg. 1991, Bericht 4.
Additional sources are given in the following notes. I am indebted to Susanne Krömker of the Interdisciplinary Center for Scientific Computation of the University of Heidelberg (Director, Prof. Dr. W. Jäger) for preparing the computer-generated figures.
1. Antonio Thiery, "Federico II e la conoscenza scientifica," in Intellectual Life at the Court of Frederick II Hohenstaufen, Washington, DC: National Gallery of Art (1994), pp. 273-290, and in particular Figures 4, 7, and 8.
2. Anyone who denies this (e.g., D. Leistikow), or equates it with fantastical claims (e.g., those of A. Thiery) lacks mathematical-geometric understanding (see D. Leistikow, "Castel del Monte, Baudenkmal zwischen Spekulation und Forschung," in Staufisches Apulien, Schriften zur staufischen Geschichte und Kunst 13 (1993), 15-56). D.L. includes an "independent construction" (see p. 29, Fig. 8), but fails to realize that he is drawing a part of a true geometric eight-pointed star, and that his construction coincides with mine. On p. 25, D.L. asserts that "the design is much too complicated, and is not really practical for an architect in the middle ages." This is an unacceptable generalization: there is no typical architect in the Middle Ages. The layout for Castel del Monte has nothing to do with design as it is described in the book of Villard d'Honnecourt--it has much more to do with the Arabic mentality. The Cistercian architects had a hard enough time with it; see, for example, HG 3, 1991, p. 90 and Figures 131 and 132, in which neighboring ribs meet blindly in a connecting trapezium.
3. J.M. Montesinos-Amilibia, Classical Tessellations and Three-Manifolds, Heidelberg: Springer-Verlag (1987).
4. M. Koecher, "Castel del Monte und das Oktogon," in Miscellanea mathematica, Heidelberg: Springer-Verlag (1991), pp. 221-233.
5. Fuat Sezgin, a manuscript [in press] on Mathematical Geography and Cartography for Volumes 10 and 11 of Geschichte des arabischen Schrifttums, pp. 325-330, especially p. 328.
6. The influence of Arabic art in lower Italy and in Sicily is not restricted to Castel del Monte, even if it is the most impressive example. In 1927, Ernst Kühnel already observed the influence of Islam in Romanesque art; see Ernst Kühnel, "Islamische Einflüsse in der romanischen Kunst," in Sitzungsberichte der kunstgeschichtlichen Gesellschaft Berlin, Nov. 11, 1927, and "Das Rautenmotiv an romanischen Fassaden in Italien," in G. Rhode, et al. (eds.), Edwin Redslob zum 70. Geburtstag, Berlin, 1955. See also HG 3, 1991, pp. 126-127. This influence can be traced into the Renaissance; see HG 3, 1991, p. 136 ff.
Springer-Verlag
Tiergartenstrasse 17
D-69121 Heidelberg
Germany
Ian Stewart*
The catapult that Archimedes built, the gambling-houses that Descartes frequented in his dissolute youth, the field where Galois fought his duel, the bridge where Hamilton carved quaternions--not all of these monuments to mathematical history survive today, but the mathematician on vacation can still find many reminders of our subject's glorious and inglorious past: statues, plaques, graves, the café where the famous conjecture was made, the desk where the famous initials are scratched, birthplaces, houses, memorials. Does your hometown have a mathematical tourist attraction? Have you encountered a mathematical sight on your travels? If so, we invite you to submit to this column a picture, a description of its mathematical significance, and either a map or directions so that others may follow in your tracks. Please send all submissions to the Mathematical Tourist Editor, Ian Stewart.
Snellius's Memorial Stone
Dirk Huylebrouck
A visit to the Peter Church in the Dutch city of Leyden turns out to be interesting for two reasons: the presence of a memorial stone in honor of Willebrord Snellius (1580-1626), and the absence of Ludolph van Ceulen's (1540-1610) tombstone, generally known to be engraved with 35 digits of π. They worked at about the same period at the University of Leyden, but even while they were alive they did not do things the same way. As Petr Beckmann stated it in his A History of π (The Golem Press, 1977),
There is all the difference in the world between Ludolph's digit hunting and Snellius's numerical test. Snellius had found a new method and checked its quality by calculating the decimal digits of π; Ludolph's evaluation to 35 decimal digits by a method known for 1900 years was no more than a stunt.
Snellius's memorial stone inside the church.
In Leyden, a visitor cannot miss the Peter Church; but what is called the "Sint-Pieterskerkhof" or "Peter's cemetery" is not a graveyard, but the name of a little road next to the church. In fact, the dead were buried inside the church, the size of their tombstone being proportional to the money their family was prepared to invest. As space inside the church was not unlimited, it happened that several years after someone died, other investors disturbed his rest. Sometimes the unfortunate's stone and/or mortal remains were moved elsewhere. This is what probably happened to van Ceulen (further evidence of this story does exist). Not only Snellius's computational method but also his funeral method was of a better quality: instead of having just a tombstone on the floor, a memorial plaque was fixed to the church wall. It is still there, and quite easy to find: his name and two Latin words referring to mathematics are clearly visible.
Aarsthertogstraat 42
8400 Oostende
Belgium
*Column Editor's address: Mathematics Institute, University of Warwick, Coventry, CV4 7AL England.
The church as seen from "Peter's cemetery."
Van Ceulen's Tombstone
Dirk Huylebrouck
Ludolph van Ceulen, professor of mathematics at the University of Leyden, died in 1610 and was buried in the Peter Church at Leyden (the Netherlands). His tombstone is generally known to be engraved with 35 digits of π. However, in A History of π (The Golem Press, 1977), Petr Beckmann quotes the historian J. Tropfke to the effect that "the last three digits [of the 35 Van Ceulen computed] were engraved in his tombstone in the Peter Church at Leyden. This seems to invalidate vague references by other historians, according to which all 35 digits were engraved in his tombstone, but that the stone has been lost." A transcript of the tombstone text from Humanités Scientifiques, Vols. 331, 332, 333, by Serge Minois is copied in Le Petit Archimède, numéro Spécial π. This suggests the memorial would have survived anyway. There is some confusion about van Ceulen's biography. His year of birth does not seem to be known exactly: some references mention 1539, others 1540. His name is written Ludolph(ff) van C(K)eulen, and he is even called differently, Ludolph a Collen, in an original Latin text by
his contemporary Adrien Métius. Van Collen is read on the front page of a book by the mathematician himself! The exact number of digits he computed seems to be a problem too (34, 35, or even 3 more?). A visit to the Documentary Center of the City (ask for the Gemeentearchief; it is a few minutes' walk from the church) can clear up some doubts. The Center contains certificates of where, when, and how someone was buried. Catalogs and transcripts of tombstone texts are available. Item 320 (see outline below) is found under the disappointing heading Inscriptions Of Tombstones And Memorials That Are No Longer Present. It is not the original text, says a footnote: the real epitaph was in the local Dutch language (which is different from the official Dutch language). The above Latin translation was found in Les Délices de Leide, p. 67, this reference indicates. Apparently, it must have been a French book, and that may be why it was translated (but why then into Latin and not into French?). A tour at the site, the Peter Church, does not reveal much, except under the guidance of Mr. R. M. Th. E. Oomes, a retired mathematics teacher. He did archeological research to find out what happened to van Ceulen's tombstone. During the restoration of the church, the whole floor was literally rooted up as can be seen on the photograph, and Mr. Oomes was called in for help. No decimals were excavated: the stone must have been moved or destroyed during the second half of the 19th
century. A French army man wrote that he had seen the stone at that time, but a few years later the well-known mathematician Bierens de Haan (responsible for many formulas in M. Abramowitz and I. A. Stegun's Handbook of Mathematical Functions) no longer mentions it. Why did he not at least note the tombstone's disappearance? It is nevertheless possible to retrieve the location where the tombstone should have been. Our guide thinks he can prove that stone 106 would still cover the mortal remains of Ludolph van Ceulen. The author of the present paper was proud to find it finally (picture!), but it is just an ordinary stone with the number 106 notched in it.
The disappointed visitor can compensate for his disillusion by leaving the church area through a little road called the Klokstraat. At the end of the street, to the left, coming from the church, one walks along the area (the houses have been rebuilt) where van Ceulen would have computed, during the best part of his life and until death prevented him, 34 decimals of π. Let us hope many mathematicians visit the site, so that van Ceulen's zeal may be recognized one day. Snellius and a professor in theology, both contemporaries of van Ceulen, do have nice memorial stones inside the church, so why not our digit hunter? If you plan to visit the site and need a qualified guide, contact Mr. R. M. Th. E. Oomes, van den Brandelerkade 23, NL-2313 GW Leyden.
Aarsthertogstraat 42
8400 Oostende
Belgium
Jeremy J. Gray*
Arthur Cayley (1821-1895)
Numerous events have been held to mark the centenary of Arthur Cayley's death on January 26, 1895. The British Society for the History of Mathematics, supported by the London Mathematical Society, held a well-attended one-day meeting at Oxford on February 11, 1995. The first speaker was Tony Crilly, the author of a forthcoming biography of Cayley. Cayley was born in England, but spent the first 8 years of his life in St Petersburg, where his father was a successful merchant. The common language of the international community there was French, which undoubtedly contributed to Cayley's later fluency in foreign languages. On the family's return to London they lived in some style near Regent's Park, then on the northern edge of the city (now an attractive park near the railway stations for the North). Cayley entered King's College Senior School when only 14, although the usual age was 16, and won the school mathematics prize until he had learned all they could teach, whereupon he took the school chemistry prize after only a year's study. He then went to Trinity College Cambridge at the age of 17, and graduated Senior Wrangler in 1842. Mathematics was then the central course of study at the university, and if the subject was therefore kept elementary, competition was fierce for the highest honours, and candidates were listed in rank order. By coming out at the top, Cayley guaranteed himself a job for life somewhere. In fact, he went into law, becoming a barrister specialising in conveyancing and land law. Tony Crilly quoted a footnote in a legal journal describing Cayley's career to date, praising his legal acumen, and observing with some satisfaction that he had now turned from mathematics to the more congenial practice of the law. Even the presence of Sylvester as a near neighbour could not keep him in London, and after 14 years, having published some 300 papers, he returned to Cambridge and full-time mathematics.
His appointment at Cambridge was the Sadleirian professorship, a matter that the current holder, John Coates, illuminated. The will of Lady Sadleir, who died in 1701, provided for 17 lectureships in mathematics, one for each College. By 1863 these had fallen into disrepair, and the University, wanting to reform the teaching of mathematics, thought it could better honour the spirit of the will by reconstituting the bequest as a single Professorship. The Sadleir family gave their blessing and Cayley was appointed in 1863, holding the post until his death 32 years later. Ironically, as several speakers indicated, teaching was perhaps the aspect of a mathematician's job that Cayley did least well. It was said of him that he regarded the reading of a paper as a disagreeable formality that had to be gone through before it could be printed. June Barrow-Green, surveying the few people who might be regarded as his students (Glaisher, Forsyth, Charlotte Scott, and Baker), showed that although Cayley lectured diligently on a wide variety of topics, his influence was indirect. It may well have been more useful that several of the lecture courses were written up in extensive memoirs. The British audience was given thereby its first modern treatment in English of such Continental developments as dynamics, the hypergeometric equation, and Clebsch and Gordan's theory of Abelian functions. Cayley's reputation in his day, and the lasting value of his contributions, proved difficult to assess. The claim of his obituarists that in a hundred years' time his work would be as well known as Newton's and Euler's was then (and still is) taken to have been excessive, although that may indirectly say something about the impact he made on his contemporaries. His name is certainly still remembered, but the work that made him famous has begun to fade, and his name is often either attached to some ephemeral items or else pinned up somewhat generously. The speakers on elementary algebra and combinatorics had therefore to portray rather less than might be imagined, while those on geometry and invariant theory had to deal with substantial contributions to obscurer branches of mathematics.
*Column editor's address: Faculty of Mathematics, The Open University, Milton Keynes, MK7 6AA, England.
Arthur Cayley
Cayley's name can be found on a variety of pieces of mathematics. The Cayley-Hamilton theorem for matrices is one place, and Professor Ledermann discussed Cayley's contributions here. It seems that the word matrix, in this sense, was coined by Sylvester, and that Cayley's use of it in 1856 is one of the first times it occurs in print. That paper is at times utterly lucid, and at times obscure. The eponymous theorem is proved in the 2 × 2 case only; the 3 × 3 case is said to have been verified by the author, after which its truth in general is claimed. Cayley gave no hint of how the general argument might go, and it is not clear if he had one. He continued throughout his life to describe only those two low-dimensional cases. That Cayley tables and Cayley graphs occur in group theory may also overstate his contributions, as Peter Neumann described. The 10 papers (counted with multiplicity--a habit referees would not tolerate these days) began with a muddled attempt to define a group abstractly and ended with a survey of the distinct subgroups, up to conjugacy, of the permutation group Sn, for n ≤ 8. Although Cayley was impressively well acquainted with the early work of Galois and Cauchy, the Continental tradition in group theory was to leave him far behind. Cayley's work on chemical trees was described by Robin Wilson, standing in for Keith Lloyd, who had been rushed to hospital three days before. The problem is similar to the matrix case: the insights into the number of labelled, rooted, and other types of tree with n vertices are fine, but the proofs are often little more than generalisation from the cases of small n.
Three speakers addressed Cayley's more important work in the related topics of geometry and invariant theory. Maria Fernanda Estrada came from Portugal to describe Cayley's discovery in 1849 of the 27 lines on a cubic surface; Cayley found that there would be a finite number of lines and his friend Salmon counted them. This discovery opened up the theory of surfaces, and attracted much international attention. Sylvester conjectured a canonical form for the equation of the surface, a result first proved by Clebsch. Steiner became interested, and communicated his interest to Schläfli, who was the first to study the detailed configuration of the lines (the double-six configurations). The first to make a successful model of the real surface with the 27 lines real and visible was (Ludwig Christian) Wiener; other models were made by Clebsch, and a long story began here. Jeremy Gray described Cayley's work on scrolls (ruled surfaces in projective space), less well known but which illustrates Cayley's botanical style; he was an energetic collector and classifier of examples. His important classification of scrolls of degree 4 was flawed by his preference for generic reasoning, and Schwarz and Cremona found that what he had dismissed as special cases amounted to four more species. But the work paid off when Clebsch tried naively to generalise Riemann's concept of genus from curves to surfaces: the example of scrolls showed Cayley at once that Clebsch's approach could not work. In 1859 Cayley published his well-known "Sixth Memoir on Quantics," in which he sketched how by fixing a conic (called the absolute conic) projective invariants could be made to yield metrical invariants, and deduced the formulae of Euclidean geometry by a special choice of conic. In 1871 Klein showed that the same procedure applied to a non-degenerate conic yields the formulae of non-Euclidean geometry, which by then had become more widely known.
I learned at the conference that the absolute conic plays an important role in modern theories of computer vision. David Rowe's talk, on the intimate connection of invariant theory and geometry, showed clearly that although the British school never took to the heavyweight symbolical methods of Aronhold, Clebsch, Lindemann, and others, it was well regarded in its day for its blend of algebra and geometry. A nice example was the attention paid to Salmon's theorem that the cross-ratio of the four tangents to a cubic curve that can be drawn from a point on the curve is independent of the point and depends only on the curve. Cayley was the first English mathematician to achieve international recognition since Newton, and the first to open up Continental mathematics to an Anglophone audience. His achievements were memorably captured by the physicist James Clerk Maxwell in a poem read out at the end of the day, which contains the immortal, and accurate, line "His soul too large for vulgar space, in n dimensions flourished unrestricted."
Farey Series and Pick's Area Theorem Maxim Bruckheimer and Abraham Arcavi
In 1816 a minerologist [sic] named Farey published a mathematical paper in which he discussed the properties of ... [what] have since been called the Farey sequences of order n, although he was not the first to consider them ....
Introduction
In this article, we compare the historical notes and references in well-known texts with what would seem to be the reality as exhibited by the original sources, in the case of Farey series, Pick's area theorem, and the connection between them. Although one might suppose that we have chosen a particularly "unfortunate" example, in which the historical notes and texts are almost totally misleading, it is our experience in preparing historical activities for the classroom that, more often than not, the information readily available to nonprofessional historians is unreliable. There are signs that history is playing a greater role in the mathematics classroom, and there is a need for readily available reliable historical information relevant to the school curriculum. Errors in printed histories are relatively costly to correct, and the significance of the error relative to the whole justifies neither the expense nor the effort, and, thus, the errors achieve more or less permanent status. Perhaps in these days of flexible electronic data handling and storage, some historians will devise an electronic historical retrieval system, to which corrections and additions can be made as they are discovered--a sort of electronic Tropfke [1].
For further information on this last point the authors refer to the book by Dickson [3]. Why should a mineralogist be interested in fractions? Sufficiently interested to publish "a mathematical paper"? Before we referred to Dickson, we turned to another of the books immediately at hand--by Hardy and Wright [4], pp. 36-37.
The "Textbook" Farey
The sequence of all non-negative reduced proper fractions with denominator not exceeding n, arranged in increasing order, is called the Farey sequence of order n, $\mathcal{F}_n$. To understand the discussion one needs to know two fundamental properties of $\mathcal{F}_n$:
I. If $a/b$ and $c/d$ are two adjacent terms of $\mathcal{F}_n$, then $bc - ad = 1$.
II. If $a/b$, $e/f$, and $c/d$ are three adjacent terms of $\mathcal{F}_n$, then $e/f = (a + c)/(b + d)$.
Our interest in Farey series^1 began as a result of some work for students on Egyptian unit fractions. In Beck, Bleicher, and Crowe [2], pp. 416 ff., we found that Farey series could be used to express any fraction between 0 and 1 as the sum of distinct unit fractions. So we began reading.
^1 Farey series are not really series but sequences, but everyone (including Beck, et al.) calls them Farey series.
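Properties I and II are easy to check numerically for small orders. The following Python sketch is an illustration only and is not part of the original sources; the names `farey` and `check_properties` are our own, and we assume the common convention that the sequence of order n includes both 0/1 and 1/1.

```python
from fractions import Fraction


def farey(n):
    """Return the Farey sequence of order n as a sorted list of Fractions.

    Assumption: 0/1 and 1/1 are both included (a common convention).
    """
    # Fraction reduces automatically, so the set removes duplicates like 2/4.
    terms = {Fraction(a, b) for b in range(1, n + 1) for a in range(0, b + 1)}
    return sorted(terms)


def check_properties(n):
    """Return (property_I_holds, property_II_holds) for the sequence of order n."""
    seq = farey(n)
    # Property I: for adjacent a/b < c/d, bc - ad = 1.
    prop1 = all(
        p.denominator * q.numerator - p.numerator * q.denominator == 1
        for p, q in zip(seq, seq[1:])
    )
    # Property II: for adjacent a/b, e/f, c/d, e/f = (a + c)/(b + d).
    prop2 = all(
        m == Fraction(p.numerator + q.numerator, p.denominator + q.denominator)
        for p, m, q in zip(seq, seq[1:], seq[2:])
    )
    return prop1, prop2
```

Calling `check_properties(n)` for a modest n (say, up to 99) returns a pair of booleans, one for each property. This is a finite check for a given order, not a proof.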
The history of 'Farey series' is very curious. Theorems 28 and 29 [properties I and II above] seem to have been stated and proved first by Haros in 1802: see Dickson, History, i, 156. Farey did not publish anything on the subject until 1816, when he stated Theorem 29 in a note in the Philosophical Magazine. He gave no proof, and it is unlikely that he had found one, since he seems to have been at the best an indifferent mathematician.

Assuming the statement about Haros to be true, we note again an oft-recurring phenomenon, which can be described in the spirit of May [5] as follows: If Theorem X bears the name of Y, then it was probably first stated and/or proved by Z. We were also left wondering whether Farey had claimed to have a proof. To return to Hardy and Wright:

Cauchy, however, saw Farey's statement, and supplied the proof (Exercices de mathématiques, i, 114-16).^2 Mathematicians generally have followed Cauchy's example in attributing the results to Farey, and the series will no doubt continue to bear his name. Farey has a notice of twenty lines in the Dictionary of national biography, where he is described as a geologist.^3 As a geologist he is forgotten, and his biographer does not mention the one thing in his life which survives.

But if Farey was an "indifferent mathematician," then why should he get a mention in the Dictionary of national biography (DNB), just because Cauchy attached his name to a result which he did not prove and which he was not the first to notice?^4 Our two "sources" so far agree on one thing--that one should refer to Dickson. He says,

C. Haros proved the results rediscovered by Farey and Cauchy.

Then follows a description of what Farey stated, which is essentially property II stated above.
Dickson continues,

Henry Goodwyn mentioned this property on page 5 of the introduction to his "tabular series of decimal quotients" of 1818, published in 1816 for private circulation..., and is apparently to be credited with the theorem.
^2 The date of this reference is 1826 and it is a reprint of the original published in 1816 (immediately after the appearance of a French translation of Farey's letter) in the Bulletin des Sciences par la Société Philomatique de Paris 3 (1816), 133-135.
^3 Apparently Farey as a mineralogist and geologist is not completely forgotten, as we are informed by Dr. Hugh Torrens of Keele University, England.
^4 Even worse, in our view, is Hardy's remark in A Mathematician's Apology (p. 81): "... Farey is immortal because he failed to understand a theorem which Haros had proved perfectly fourteen years before; ..." How does Hardy know that Farey "failed to understand"--or that Farey even knew of Haros's paper? Glaisher in 1879 does not mention it! And then "a theorem which Haros had proved perfectly"--poetic licence, perhaps, to which Hardy seems to have succumbed more than once in this book.
Why should Goodwyn be credited with the theorem if Haros proved the result 14 years earlier? Later (p. 157), Dickson states,

J. W. L. Glaisher gave some of the above facts on the history of Farey series. Glaisher treated the history more fully ....

But even Glaisher is at best a secondary source. With so many doubts and unanswered questions, only primary sources can resolve them.
The "Real" Farey
The whole is so short that we can let the original Farey [6] speak for himself.
On a curious Property of vulgar Fractions. By Mr. J. Farey, Sen. To Mr. Tilloch

SIR.--On examining lately, some very curious and elaborate Tables of "Complete decimal Quotients," calculated by Henry Goodwyn, Esq. of Blackheath, of which he has printed a copious specimen, for private circulation among curious and practical calculators, preparatory to the printing of the whole of these useful Tables, if sufficient encouragement, either public or individual, should appear to warrant such a step: I was fortunate while so doing, to deduce from them the following general property; viz.

If all the possible vulgar fractions of different values, whose greatest denominator (when in their lowest terms) does not exceed any given number, be arranged in the order of their values, or quotients; then if both the numerator and the denominator of any fraction therein, be added to the numerator and the denominator, respectively, of the fraction next but one to it (on either side), the sums will give the fraction next to it; although, perhaps, not in its lowest terms.

For example, if 5 be the greatest denominator given; then are all the possible fractions, when arranged, $\frac{1}{5}, \frac{1}{4}, \frac{1}{3}, \frac{2}{5}, \frac{1}{2}, \frac{3}{5}, \frac{2}{3}, \frac{3}{4}$, and $\frac{4}{5}$; taking $\frac{1}{3}$ as the given fraction, we have $\frac{1+1}{5+3} = \frac{2}{8} = \frac{1}{4}$ the next smaller fraction than $\frac{1}{3}$; or $\frac{1+1}{3+2} = \frac{2}{5}$, the next larger fraction to $\frac{1}{3}$. Again, if 99 be the largest denominator, then, in a part of the arranged Table, we should have $\frac{15}{52}, \frac{28}{97}, \frac{13}{45}, \frac{24}{83}, \frac{11}{38}$, &c.; and if the third of these fractions be given, we have $\frac{15+13}{52+45} = \frac{28}{97}$ the second: or $\frac{13+11}{45+38} = \frac{24}{83}$ the fourth of them: and so in all the other cases.

I am not acquainted, whether this curious property of vulgar fractions has been before pointed out?; or whether it may admit of any easy or general demonstration?; which are points on which I should be glad to learn the sentiments of some of your mathematical readers; and am Sir, Your obedient humble servant, J. Farey. Howland-street.

We now know what Farey did and did not do. He did not write a "mathematical paper," and not only is it "unlikely that he had found one" but it would seem certain that he did not have a proof. What remains is Haros's "claim" to priority. Glaisher [7], p. 335 was apparently unaware of Haros, but seems to have been suitably cautious:
It seems curious that so elementary and remarkable a property of fractions should not have been discovered until 1816. It may of course be found that it had been published previously; but supposing the discovery to be due to Mr. Goodwyn and Mr. Farey, an explanation might be afforded by the fact that the 'Tabular Series' is probably the earliest Table of the kind, and that the property would not be likely to present itself to anyone who had not arranged a complete series of proper fractions having denominators less than a given number in order of magnitude.

The fact is that Haros [8] anticipated both Goodwyn and Farey in a certain sense, as just the title of his paper indicates^5:

Tables pour évaluer une fraction ordinaire avec autant de décimales qu'on voudra; et pour trouver la fraction ordinaire la plus simple, et qui approche sensiblement d'une fraction décimale. [Tables for evaluating a common fraction with as many decimals as desired; and the simplest good approximation by a common fraction of a decimal fraction.]

In the first part, he discusses the conversion of a fraction into decimal form. After stating some of the properties, he announces that he has calculated a new table yielding the decimal expansion of any irreducible fraction with denominator not exceeding 99. Unfortunately, he does not give the table, and this makes his description somewhat difficult to follow. However, it is the second part of the paper which interests us here. His aim is to enable one to evaluate best approximations to decimal numbers by fractions with a low denominator. For this, he wants to arrange all fractions with denominator ≤ 99 in order of size, for then he will be able to rely on their already calculated decimal values. In other words, Haros proposes to write down the sequence $\mathcal{F}_{99}$. He begins with the sequence

$$\frac{1}{99}, \frac{1}{98}, \frac{1}{97}, \ldots, \frac{1}{4}, \frac{1}{3}, \frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \ldots, \frac{96}{97}, \frac{97}{98}, \frac{98}{99},$$
in which, as he shows, each fraction differs from its neighbour by the reciprocal of the product of their denominators. Now comes the crux of his argument:

It remains to intercalate between the foregoing all other irreducible fractions with denominator less than 100. In this process, intermediate fractions must follow in order of size, and the difference of a fraction from its neighbour must always be one over the product of their denominators; for then any fraction in the sequence will be irreducible and will give as simply as possible the approximate value of one or the other of the two fractions between which it lies.

This falls far short of proving the first of the fundamental properties of Farey series. (The two properties are equivalent, see [4].)
^5 We are grateful to Dr. Baruch Schwarz of the Hebrew University at Jerusalem for help with understanding the French and for checking what we have written about Haros's paper.
In the following, Haros shows that if $a/b$ and $c/d$ satisfy the condition $bc - ad = 1$, and $x/y$ is a fraction between them satisfying the same condition with regard to its neighbours, then $x/y = (a + c)/(b + d)$. What Haros seems to have done is to give a method for finding fractions belonging to $\mathcal{F}_{99}$, between those already listed--but how does one know that one will get them all? And it does not prove the more general result noted by Farey that if $a/b$, $e/f$, and $c/d$ are any three consecutive fractions in a Farey series, then $e/f = (a + c)/(b + d)$. Clearly, Dickson overstated the case when he wrote that Haros proved the results rediscovered by Farey and Cauchy--and understated the case when he devoted relatively many lines to Goodwyn's tables, without a mention of those of Haros described in the 1802 paper. As Glaisher surmised, it was tables of fractions that made people notice the remarkable property of three consecutive fractions in what have come to be called Farey series. Farey did no more, but Haros deduced this property in special circumstances from the fundamental property of the difference of two neighbouring fractions. However, not until Cauchy saw Farey's letter were both results stated and proved satisfactorily.
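The intercalation procedure described above can be tried out mechanically. The Python sketch below is our own reading of it, not a transcription of Haros's paper: it starts from his initial sequence 1/n, ..., 1/2, 2/3, ..., (n-1)/n and keeps inserting the fraction (a + c)/(b + d) between neighbours a/b and c/d whenever b + d does not exceed n. The function names are hypothetical. For a given n the output can be compared with a brute-force list of all reduced fractions; that is a numerical check for that n, and it does not settle the question of proof raised above.

```python
from fractions import Fraction


def haros_start(n):
    """Initial sequence 1/n, 1/(n-1), ..., 1/2, 2/3, ..., (n-1)/n."""
    left = [Fraction(1, k) for k in range(n, 1, -1)]
    right = [Fraction(k - 1, k) for k in range(3, n + 1)]
    return left + right


def intercalate(seq, n):
    """Repeatedly insert (a + c)/(b + d) between neighbours a/b and c/d
    while the new denominator b + d stays within n."""
    while True:
        out = [seq[0]]
        inserted = False
        for p, q in zip(seq, seq[1:]):
            if p.denominator + q.denominator <= n:
                out.append(Fraction(p.numerator + q.numerator,
                                    p.denominator + q.denominator))
                inserted = True
            out.append(q)
        if not inserted:
            return out
        seq = out


def proper_fractions(n):
    """Brute force: all reduced fractions strictly between 0 and 1
    with denominator at most n, in increasing order."""
    return sorted({Fraction(a, b) for b in range(2, n + 1) for a in range(1, b)})
```

For example, `intercalate(haros_start(5), 5)` can be compared with `proper_fractions(5)`, and the same comparison can be run for n = 99, the case Haros considered.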
Pick's Area Theorem
That would have been the end of the story. But a couple of years later we decided to develop an activity for students around Pick's area theorem and, as usual, we wanted to include some historical background. So we began our search in textbooks again, starting with one by Coxeter [9]--and at once we were back in Farey land. There, opposite Pick's area theorem (p. 209), heading the section was the misleading quote from Hardy and Wright about Farey's entry in the DNB. The connection between Pick and Farey obviously had to be explored, both historically and for the activity we wanted to develop. The relevant bits from Coxeter are as follows.

According to Steinhaus ... it was G. Pick in 1899, who discovered the following theorem: The area of any simple polygon whose vertices are lattice points is given by the formula $\tfrac{1}{2}b + c - 1$, where b is the number of lattice points on the boundary while c is the number of lattice points inside.

"According to Steinhaus" would suggest that Coxeter is being careful--or why not quote Pick, as cited in Ref. 10 (p. 260), directly? Perhaps because he had not seen Pick's paper. From Pick's area theorem, Coxeter deduces that if a triangle, whose vertices are the lattice points (0, 0), (b, a), (d, c), contains no other lattice points within or on its sides, then $bc - ad = 1$. Now if we represent any fraction $a/b$ in $\mathcal{F}_n$ by the lattice point (b, a) then because the fractions are reduced, any two adjacent fractions $a/b$ and $c/d$ in $\mathcal{F}_n$ together with the origin form an otherwise lattice-point-free triangle as above.^6 Hence, we have $bc - ad = 1$, one of the two fundamental properties of Farey series proved in a most elegant fashion. Coxeter attributes this proof of the Farey property to Pólya [11].

To fill in the historical background a little, we obtained a few biographical details of Pick from Poggendorff [12], p. 569 and ordered copies of the papers by Pick and Pólya. It appears that Georg Alexander Pick was born in 1859 in Vienna and died in 1943 (?) in the Theresienstadt concentration camp. He spent most of his working life at the German University in Prague, and Kline [13], p. 1131, in connection with Einstein's work on the theory of general relativity, notes,

However, to make progress ... he [Einstein] discussed it in Prague with a colleague, the mathematician Georg Pick ....

To analysts, Pick is well remembered for interpolation of analytic functions; see [14].^7 To return to Pick's area theorem. The impression we were left with from Coxeter was that Pick discovered his theorem and Pólya applied Pick to Farey.^8 However, the facts as they appear from the original articles are somewhat different. Thus, Pick [16] begins his article by citing the widespread use of plane lattices "for visualisation and as heuristic aids in number theory" going back to Gauss.^9 His own aim, he says, is rather to put the elements of number theory on a geometric basis, by use of an area formula for lattice polygons which "in spite of its simplicity seems to have gone unnoticed till now."
The surprise, however, comes in the third section of his article, where he derives the above fundamental property of Farey series (and some more) in exactly the same way as in [9], where Coxeter as we saw attributes this to Pólya.^10 And so to Pólya and the introduction to his paper.

A beautiful geometric treatment of the well-known principal property of Farey series goes back to Sylvester. As this treatment seems to have been generally forgotten, and as, moreover, Sylvester's inferences are not irreproachable, it is probably in order to go through the matter briefly here.

Thus, Pólya himself attributes the lattice approach to the Farey property to Sylvester, in a paper originally published in 1883. Pólya does not present it as his own method.^11 Sylvester's paper does not contain Pick's area theorem; all he needs, as we have noted above, is the area of a lattice-point-free triangle. However, Pólya does give Pick's area theorem (in a slightly variant form) but does not attribute it to Pick--or to anyone else.

It would seem that the application of the geometry of lattices to Farey series probably dates back to Sylvester in 1883. Pick published his theorem in 1899 and, apparently unaware of Sylvester, applied a special case to Farey series again. This was repeated in 1925 by Pólya, apparently unaware of Pick, but based on Sylvester. By his own account, Pólya does not "deserve" the historical credit given him by Coxeter.^12

^6 For the details, see Ref. 9, p. 211.
^7 In a recent paper on Pick's theorem, Grünbaum and Shephard [15] write: "He [Pick] made significant contributions to analysis and differential geometry." Perhaps we may say that among geometers Pick is remembered almost exclusively for a relatively minor, if extremely beautiful, result.
^8 A development similar to that of Coxeter, but without Pick's area theorem, can already be found in [4], where in their note on the appropriate sections (3.5-3.7) Hardy and Wright write, "Here we follow the lines of a lecture by Professor Pólya," thus strengthening the impression that the application of lattices to Farey series is due to Pólya.
^9 See also, for example, [17], p. 35.
^10 The mathematics in Pick's paper is discussed by the present authors in "A visual approach to some elementary number theory", Mathematical Gazette (to appear).
^11 And presumably did not do so either in the "lecture" referred to by Hardy and Wright.
^12 And Hardy and Wright.
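The lattice argument quoted from Coxeter earlier can be illustrated numerically. The Python sketch below is ours and is restricted to non-degenerate lattice triangles for simplicity; all function names are hypothetical. It compares the area obtained from the cross product of two edge vectors with the value of Pick's formula, half the number of boundary points plus the number of interior points minus one. For the triangle with vertices (0, 0), (b, a), (d, c) built from two adjacent Farey fractions a/b and c/d, there are three boundary points and no interior points, so Pick's formula gives area 1/2, while the cross product gives |bc - ad|/2.

```python
from fractions import Fraction
from math import gcd


def cross(o, p, q):
    """Cross product of the vectors o->p and o->q (twice the signed area)."""
    return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])


def twice_area(tri):
    """Twice the area of a lattice triangle given as three (x, y) tuples."""
    return abs(cross(tri[0], tri[1], tri[2]))


def boundary_points(tri):
    """Lattice points on the boundary: an edge with integer displacement
    (dx, dy) contributes gcd(|dx|, |dy|) points, counting one endpoint."""
    edges = zip(tri, tri[1:] + tri[:1])
    return sum(gcd(abs(p[0] - q[0]), abs(p[1] - q[1])) for p, q in edges)


def interior_points(tri):
    """Brute-force count of lattice points strictly inside the triangle."""
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    count = 0
    for x in range(min(xs), max(xs) + 1):
        for y in range(min(ys), max(ys) + 1):
            signs = [cross(tri[i], tri[(i + 1) % 3], (x, y)) for i in range(3)]
            # Strictly inside: the point lies on the same side of all three edges.
            if all(s > 0 for s in signs) or all(s < 0 for s in signs):
                count += 1
    return count


def pick_area(tri):
    """Area predicted by Pick's formula: boundary/2 + interior - 1."""
    return Fraction(boundary_points(tri), 2) + interior_points(tri) - 1
```

For instance, the adjacent fractions 1/3 and 2/5 give the triangle `[(0, 0), (3, 1), (5, 2)]`, for which both `Fraction(twice_area(tri), 2)` and `pick_area(tri)` can be evaluated and compared.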
References
1. J. Tropfke, Geschichte der Elementarmathematik, 4th ed., Berlin: de Gruyter (1980).
2. A. Beck, M. N. Bleicher, and D. W. Crowe, Excursions into Mathematics, New York: Worth (1969).
3. L. E. Dickson, History of the Theory of Numbers, Vol. I, New York: Chelsea (1952).
4. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, 2nd ed., London: Clarendon Press (1945).
5. K. O. May, Historiographic vices I. Logical attribution, Historia Math. 2 (1975), 185-187.
6. J. Farey, On a curious property of vulgar fractions, Philos. Mag. J. 47 (1816), 385-386.
7. J. W. L. Glaisher, On a property of vulgar fractions, London, Edinburgh, Dublin Philos. Mag. 7 (1879), 321-336.
8. C. Haros, Tables pour évaluer une fraction ordinaire avec autant de décimales qu'on voudra; et pour trouver la fraction ordinaire la plus simple, et qui approche sensiblement d'une fraction décimale, J. Ecole Polytechn. 4 (1802), 364-368.
9. H. S. M. Coxeter, Introduction to Geometry, 2nd ed., New York: Wiley (1969).
10. H. Steinhaus, Mathematical Snapshots, 2nd ed., Oxford: Oxford University Press (1950).
11. G. Pólya, Über eine geometrische Darstellung der Fareyschen Reihe, Acta Litterarum ac Scientiarum Regiae Universitatis Hungaricae Francisco-Josephinae, Sectio Scientiarum Mathematicarum 2 (1925), 129-133.
12. J. C. Poggendorff, Biographisch-Literarisches Handwörterbuch der Exakten Naturwissenschaften, Band VIIa, Teil 3, Akademie-Verlag (1959).
13. M. Kline, Mathematical Thought from Ancient to Modern Times, Oxford: Oxford University Press (1972).
14. John B. Garnett, Bounded Analytic Functions, New York: Academic Press (1981).
15. B. Grünbaum and G. C. Shephard, Pick's theorem, Amer. Math. Monthly 100 (1993), 150-160.
16. G. Pick, Geometrisches zur Zahlenlehre, Zeit. Vereines 'Lotos' 19 (1899), 311-319.
17. F. Klein, Vorlesungen über die Entwicklung der Mathematik im 19. Jahrhundert, Vol. I, Heidelberg: Springer-Verlag (1926).
Department of Science Teaching
Weizmann Institute of Science
Rehovot, 76100 Israel
Jet Wimp*
A Radical Approach to Real Analysis by David Bressoud Washington, DC: Mathematical Association of America, 1994. xii + 324 pp.
Reviewed by Ivor Grattan-Guinness

This second volume in the series "Classroom resource materials" is distinguished by its heavy reliance on history to present and teach real-variable mathematical analysis. The author explains his strategy at the head of a fine preface: "This course of analysis is radical; it returns to the roots of the subject." The normal foundations-up style "is the right way to build analysis, but not the right way to teach it"; so here "the first part of the book follows the historical progression and moves backwards," in a foundations-down style (not explicitly mentioned). Thus the coverage treats methods of summing series and approximating roots of equations (Chap. 2) before worrying about convergence of infinite series (Chap. 4) and their manipulation (Chap. 5). Again, continuity and mean-value theorems are handled (Chap. 3) before the formal definition of the Cauchy integral arrives in the final chapter (Chap. 6). One consequence is that the book deals nearly as much with numerical methods as with foundational issues. This feature is strengthened by the discussion in the appendices of special results such as Wallis's product for π and the Bernoulli numbers, and by the use of Mathematica programs (with an attendant glossary of terms at the end) to present successive partial sums of series and approximations to functions. In addition, a generous number of exercises is supplied, although unfortunately none is drawn from historical sources.

* Column Editor's address: Department of Mathematics, Drexel University, Philadelphia, PA 19104 USA.

Fourier series constitute the nexus of the account, for they provide the themes of the first and last chapters and turn up several times in between. This structure follows the historical record; for the series provided many of the problems, both general and specific, which the classical analysts had to solve in the development of their subject. I quibble a little, however, at the author's use of the word "crisis" (title of Chap. 1) to describe Fourier's first presentation of the series in 1807. Although Fourier saw a wide range of properties and at least envisioned many of the later problems (Grattan-Guinness and Ravetz, 1972), no obvious crisis was perceived at the time; Fourier had resurrected a known but (wrongly) discredited method, and so he had a fight on his hands. One of his innovations might have been given greater stress in this book--namely, he understood the representability property, that a series (full, sine, or cosine) represents the function only within the interval of definition of the coefficients, and parts company from it outside. He also understood how the series differs from other kinds of series. This point is of considerable educational difficulty, as teachers of the series well know; Fourier's presentation is the best that I know.

The author rightly points to frailties in several aspects of Cauchy's version of mathematical analysis, which was first presented in the 1820s; he rightly points to the quality and importance of the contributions made soon after by Abel and Dirichlet. Some positive features of Cauchy's work might have been brought out more strongly. One is the enhanced role given to the logic of necessary and sufficient conditions for the truth of theorems: in a newly systematic and comprehensive way he stated "if ... then", and clarified the need to weaken sufficient and strengthen necessary conditions in developing the subject. A central example is the fundamental theorem of the calculus itself; with him it was a pukka theorem, based on independent definitions of derivative and integral, rather than a more or less automatic switch back and forth between the differential and integral calculi (compare p. 237).

Another context in which Cauchy inspired a concern with logic in this sense is convergence tests. The treatment here is very nice, especially Gauss's in connection with the hypergeometric series (pp. 143-158); it is worth adding that several of the tests were originally found in the 1820s (because known ones were indecisive for certain series), and that the logical relationships between tests were examined (Grattan-Guinness, 1970, appendix). Only when the topic became established did the presentation degenerate into a useful but boring list.

The author does not follow history when he defines the continuity of a function in the now usual (ε, δ) manner (p. 93). In fact, this version was proposed by Augustus De Morgan in his Elements of Algebra (1835): he saw limits as part of algebra rather than analysis. Cauchy had used limits occasionally, and the Greek letters are his; but his basic definition of continuity took the sequential form (of which a version is given on pp. 96-97) and some of his seemingly false proofs can be justified in those terms (see Laugwitz, 1989). However, this point does not refute the general thesis, supported by the author, that the era after Cauchy's, launched in mid-century by the insights of Riemann and developed by the teaching and research of Weierstrass, marked a substantial refounding of analysis.
One of the principal innovations effected by Weierstrass and his followers was the extension of largely univariate into multivariate analysis, with pointwise convergence of series of constants desimplified into modes of uniform and nonuniform convergence of functions (GrattanGuinness, 1970, Chap. 6). It was accompanied by, among other things, clarification of the distinction between the least upper bound and the upper limit of a collection of values; these topics appear here on pp. 110 and 140, respectively, but the treatment could have benefited from more discussion and placing them in closer proximity to each other. Among the numerical methods, the Newton-Raphson algorithm for approximating roots of equations contains educational nuggets in its surprising history. The author attributes the standard version, using the calculus, to Newton himself (p. 60); but in fact it is due to Thomas Simpson in the mid-18th century (see Kollerstrom, 1992). The relative weaknesses of the alternative versions make good educational talk: the historical record includes the curious fact that Newton used a geometrical analogue of the standard version in the Principia (Book 1, Proposition 31) to approximate to the position
of a planet on its orbit via Kepler's equation, but seems not to have noticed the connection with the method. Even the best mathematicians can miss a trick. In spirit and historical chronology this book follows Toeplitz's "genetic approach" to the calculus, developed from the late 1920s and available in English (Toeplitz, 1963). Surprisingly, that book is not cited here. However, further successors for this series come to mind. The most obvious one would start out from the short epilogue on the Lebesgue integral. Just as the era of Riemann and especially Weierstrass provided foundations for the work launched by Cauchy, so at the beginning of this century, measure theory, and also the ramifications of the axioms of choice, gave foundations for the edifice which Riemann and Weierstrass had envisioned (Grattan-Guinness, 1975). Mathematical logic and model theory can also be naturally drawn out of this story. Another possibility is a comparable treatment of complex-variable analysis. Cauchy's method imitated his approach to real-variable analysis, but the later conceptions of Riemann (with his surfaces) and Weierstrass (with power-series and continuation techniques) provided two quite different approaches to analysis (Bottazzini, 1986), and the competition is of great educational as well as historical interest. The applications of analysis within and without mathematics are also a rich area; for example, historical presentation of elliptic and Abelian functions would provide a beautiful counterweight to the dreary way in which the topics normally slouch across the classroom floor. Many years ago I coined the name "history-satire" for the way I have used history in mathematical education (Grattan-Guinness, 1973). This is the best large-scale example of it that I have seen, and one could not imagine a better application of it to this important area of mathematics and mathematical education.
References

Bottazzini, U., The Higher Calculus, New York: Springer-Verlag (1986).
Grattan-Guinness, I., The Development of the Foundations of Mathematical Analysis from Euler to Riemann, Cambridge, MA: M.I.T. Press (1970).
Grattan-Guinness, I., Not from nowhere. History and philosophy behind mathematical education, International Journal of Mathematical Education in Science and Technology 4 (1973), 421-453.
Grattan-Guinness, I., Preliminary notes on the historical significance of quantification and of the axioms of choice in the development of mathematical analysis, Historia Mathematica 2 (1975), 475-488.
Grattan-Guinness, I., in collaboration with J.R. Ravetz, Joseph Fourier 1768-1830. A survey of his life and work, based on a critical edition of his monograph on the propagation of heat, presented to the Institut de France in 1807, Cambridge, MA: M.I.T. Press (1972).
THE MATHEMATICAL INTELLIGENCER VOL. 17, NO. 4, 1995
Kollerstrom, N., Thomas Simpson, "Newton's method of approximation": an enduring myth, British Journal for the History of Science 25 (1992), 347-354.
Laugwitz, D., Definite values of infinite sums I, Archive for History of Exact Sciences 39 (1989), 195-245.
Toeplitz, O., The Calculus. A genetic approach, Chicago: University of Chicago Press (1963). (German original 1949.)

Middlesex University
Enfield, Middlesex EN3 4SF
England
e-mail:
[email protected]
Wavelets: Algorithms and Applications
by Y. Meyer
Philadelphia: SIAM, 1993. 130 pp. Paperback: US $19.50, ISBN 0-89871-309-9.
Reviewed by Mary Beth Ruskai

Wavelet theory can be roughly described as a mathematical tool for multiscale analysis; the new toys in the toolbox are sets of basis functions in which the individual functions are related by dilation and translation, that is, by rescaling and shifting. The idea of multiscale analysis is hardly new: fractals are almost a household term; the renormalization group has been widely and successfully used in theoretical physics since the late 1960s; and pyramid algorithms were important in signal analysis long before wavelets. Moreover, some special cases of such basis sets were known before wavelets, going at least back to Haar in 1909, and some of the related mathematical analysis had been developed earlier by Littlewood and Paley and by Calderón and Zygmund. Reaction to wavelets has ranged from those who acclaim them as one of the great advances of the 20th century to those who dismiss them as mere rediscoveries. Although both extremes contain a germ of truth, neither tells the whole story. Advances in wavelet theory have been closely related to progress in signal analysis, each field benefiting from insights from the other. For example, Daubechies's construction of compactly supported orthonormal wavelets used the concepts of pyramid algorithms and multiresolution analysis from signal processing. However, wavelets have also led to new signal processing algorithms. Recently, wavelet theory seems to have come full circle; that is, what began as a replacement of time-frequency analysis by time-scale analysis has led to new methods of time-frequency analysis using functions called "Malvar wavelets" and wavelet packets.
Because of these connections, Yves Meyer, a mathematician with a well-established reputation in harmonic analysis, has approached the task of writing an accessible set of introductory lectures by tying wavelet theory to signal processing. Indeed, about half the book could be described as an introductory survey of signal processing for mathematicians. This approach is successful for precisely the same reason that wavelet theory itself has benefited from signal processing--the language and perspectives of signal analysis provide important insights. That Meyer considers this essential is evident in the conclusion to his second chapter:

The status of "wavelet analysis" within mathematics is rather unique. Indeed mathematicians have been working on wavelets ... for a fairly long time .... But during all of this period, which stretches from 1909 to 1980, from Haar to Strömberg, there was very little interchange among mathematicians (of the "Chicago School"), physicists, and experts in signal processing. Not knowing about the mathematical developments ..., the last two groups were led to rediscover wavelets. In the numerous fields of science and technology where wavelets appeared at the end of the 1970s, they were handcrafted by the scientists and engineers themselves. Their use has never resulted from proselytism by mathematicians. Today the boundaries between mathematics and signal and image processing have faded, and mathematics has benefited from the rediscovery of wavelets by experts from other disciplines. The detour through signal and image processing was the most direct path leading from the Haar basis to Daubechies's wavelets. [emphasis added]

That the insights obtained during the detour through signal processing were vital to Daubechies's construction of orthonormal, compactly supported wavelets is historically accurate. But whether it was really the "most direct path" possible is debatable. As a mathematical physicist, I find it interesting that the renormalization group did not lead more directly to wavelets.
Although two of the founders of modern wavelet theory, Grossmann and Daubechies, were trained as physicists, it was only after the discovery of orthogonal wavelets by other means that Battle showed that renormalization group techniques could also be used to construct exponentially decaying orthonormal wavelets. Why are so many different groups interested in wavelets? Mathematicians, scientists, and engineers all use sets of basis functions to analyze or decompose functions and signals. In some cases these bases are simply convenient tools for analysis; in others, they have special properties which allow scientists to give physical interpretations to certain quantities. For example, when an electrostatic potential is expanded in spherical harmonics, the first few expansion coefficients are the charge, dipole moment, and quadrupole moment. The most familiar example of this phenomenon is Fourier analysis with the dual variables usually interpreted as time and frequency. This is, however, not the only possibility; in quantum mechanics these same dual variables of Fourier analysis are interpreted as position and momentum. As we shall see, different interpretations can lead to different insights.
Despite the initial attractiveness of Fourier analysis for time-frequency analysis, it has a serious defect, namely, bad time-frequency localization. Consider a musical note. Before the musician plays the note, none is heard. Yet the Fourier representation of a note of precise frequency will have infinite time duration. An accurate Fourier representation of the silence preceding the note requires the introduction of many spurious frequencies to produce the necessary cancellation. Although the Heisenberg uncertainty principle limits time-frequency localization, other bases do better. Perhaps the most familiar alternative consists of modulated Gaussians and their generalizations which have the form $g_{mn}(x) = e^{i 2\pi m x} g(x - n)$ and are called "windowed Fourier transforms" or "Gabor functions" by engineers and "coherent states" by physicists. Although there is a sense in which these functions have optimal time-frequency localization, they also have drawbacks. These include regular distribution of time-frequency peaks (instead of increased density in time for peaks of high frequency), the impossibility of orthogonal bases of this type (first observed by Balian-Low and later shown to be a direct consequence of the Heisenberg uncertainty principle by Battle), and more subtle defects discussed by Meyer. Dyadic wavelets of the form $\psi_{mn}(x) = 2^{-m/2} \psi(2^{-m}x - n)$ which overcome these defects can be constructed. The replacement of modulation by scaling results in a higher density of fine-scale wavelets. This is what gives wavelets the ability both to analyze efficiently at broad scales and to "zoom in" to observe details at fine scale as needed. The mathematical surprise was that this can be done without sacrificing either orthogonality or good decay and regularity properties, and that $\psi$ can even have compact support. (A brief accessible account of Daubechies's construction was given by Strichartz [16].)
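To make the dyadic family concrete, here is a minimal numerical sketch. It assumes, purely for illustration, that the mother wavelet is the Haar function of 1909 mentioned earlier in the review; the helper names `haar`, `psi_mn`, and `inner`, the grid spacing, and the sample index pairs are all hypothetical choices and are not taken from Meyer's book. The sketch builds several members of the family on a fine grid and computes their matrix of inner products, which should be close to the identity if the members are orthonormal.

```python
import numpy as np

def haar(x):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    x = np.asarray(x, dtype=float)
    first_half = np.where((x >= 0.0) & (x < 0.5), 1.0, 0.0)
    second_half = np.where((x >= 0.5) & (x < 1.0), 1.0, 0.0)
    return first_half - second_half

def psi_mn(x, m, n):
    """Dyadic wavelet 2^(-m/2) * psi(2^(-m) x - n), with psi assumed to be Haar."""
    return 2.0 ** (-m / 2) * haar(2.0 ** (-m) * x - n)

def inner(f, g, dx):
    """Riemann-sum approximation of the L2 inner product on the grid."""
    return float(np.sum(f * g) * dx)

# Grid on [0, 8); every breakpoint of the wavelets below is a grid point,
# so the Riemann sums are exact up to rounding.
dx = 2.0 ** -10
x = np.arange(0.0, 8.0, dx)

# (m, n) pairs: large m gives broad wavelets, negative m gives narrow ones.
pairs = [(0, 0), (0, 1), (1, 0), (-1, 0), (-1, 1), (-2, 3), (3, 0)]
gram = np.array([[inner(psi_mn(x, m, n), psi_mn(x, k, l), dx)
                  for (k, l) in pairs]
                 for (m, n) in pairs])

print(np.round(gram, 6))
```

In this sketch the wavelet with index m is supported on an interval of length 2^m, so the members with negative m are the narrow ones that "zoom in" on fine detail, while the factor 2^(-m/2) keeps every member at unit norm.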
By synthesizing the signal processing techniques of subband coding, quadrature mirror filters, and pyramid algorithms, Mallat introduced the concept of a multiresolution analysis; that is, a doubly infinite nested decreasing sequence of spaces $V_k$ satisfying
(a) $\bigcap_{k \in \mathbb{Z}} V_k = \{0\}$ and $\bigcup_{k \in \mathbb{Z}} V_k$ is dense in $L^2(\mathbb{R})$,
(b) $f(x) \in V_k \iff f(2x) \in V_{k-1}$,
(c) $V_0$ contains a $\phi$ such that the set $\{\phi(x - n)\}_{n \in \mathbb{Z}}$ is a Riesz basis for $V_0$.
If one then defines $W_k$ as the orthogonal complement of $V_k$ in $V_{k-1}$ at each level, one can decompose an arbitrary function in $V_{j-1} = V_j \oplus W_j$ into its "trend" in $V_j$ and its "fluctuation" in $W_j$. Condition (c) then guarantees the existence of a function $\psi$ which generates a basis for $W_0$ by translation. The translates of $\psi$ can always be made orthogonal (although this is not obvious), so that the orthogonality of the $W_k$ ensures that the set $\psi_{mn}(x) = 2^{-m/2} \psi(2^{-m}x - n)$ is an orthonormal basis for $L^2(\mathbb{R})$.
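The splitting of a function into "trend" and "fluctuation" can be sketched in the simplest discrete setting. The code below assumes the Haar multiresolution analysis, in which the spaces consist of piecewise-constant functions, and represents a function at the finer level by its coefficients in an orthonormal basis of translates. The helper names `haar_split` and `haar_merge` and the sample data are hypothetical, chosen only for illustration; this is one instance of the idea, not the construction of any particular paper.

```python
import numpy as np

def haar_split(c):
    """One level of the Haar decomposition: finer space = trend space (+) fluctuation space.

    c holds coefficients of a function at the finer level; its length must be even.
    Returns (trend, fluct): coefficients at the coarser level and in its
    orthogonal complement.
    """
    c = np.asarray(c, dtype=float)
    if c.size % 2:
        raise ValueError("length must be even")
    even, odd = c[0::2], c[1::2]
    trend = (even + odd) / np.sqrt(2)   # scaled local averages
    fluct = (even - odd) / np.sqrt(2)   # scaled local differences
    return trend, fluct

def haar_merge(trend, fluct):
    """Invert haar_split: rebuild the finer-level coefficients."""
    trend = np.asarray(trend, dtype=float)
    fluct = np.asarray(fluct, dtype=float)
    c = np.empty(2 * trend.size)
    c[0::2] = (trend + fluct) / np.sqrt(2)
    c[1::2] = (trend - fluct) / np.sqrt(2)
    return c

signal = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
trend, fluct = haar_split(signal)
rebuilt = haar_merge(trend, fluct)

# Orthogonality of the two pieces shows up as conservation of the sum of squares.
energy_in = float(np.sum(signal ** 2))
energy_out = float(np.sum(trend ** 2) + np.sum(fluct ** 2))

print(trend, fluct)
print(np.allclose(rebuilt, signal), energy_in, energy_out)
```

Applying `haar_split` again to `trend` descends one further level; repeating this is the pattern of the pyramid algorithms referred to in the text. Where the signal is locally constant (the last pair of samples here), the fluctuation coefficient vanishes.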
Daubechies's [3] breakthrough was the discovery that one can choose $\psi$ to have compact support, arbitrary regularity, and an arbitrary number of vanishing moments. Despite the reliance on multiresolution analysis, the Daubechies wavelets were not implicit in the earlier algorithms of signal processors. Her construction needs the convergence of an iterative process. In this sense, it is similar to Battle's renormalization group approach to constructing orthonormal wavelets, which is an outgrowth of the quantum field theorists' need to prove the existence of a fixed point. However, the renormalization group, which is based on the concept of rescaling lattice spaces, has not yet been cast into a form which makes wavelets accessible to nonspecialists. Approximation-theorists had also developed an approach to rescaling, called stationary subdivision algorithms, very similar to multiresolution analysis. However, because they were only interested in the "trend," they ignored the "fluctuation" spaces $W_k$ which are essential to orthogonal wavelets. Despite similar formalisms, the differing views of signal processing engineers, physicists, approximation-theorists, and harmonic analysts sometimes led to very different mathematical developments. Meyer devotes three chapters to explaining how multiscale analysis turned out also to be the key to better time-frequency analysis. In the quantum physicist's interpretation of the dual spaces of Fourier analysis as position x and momentum p (rather than time and frequency), localization in the kinetic energy $p^2$ may be more important than localization in momentum itself. Therefore, Ken Wilson suggested that one consider functions with two peaks in momentum space, at +p and -p. Daubechies, Jaffard, and Journé then constructed functions with exactly the properties Wilson wanted, including exponential decay in both position and momentum.
In position space, the resulting functions look remarkably similar to the orthogonal Gabor functions forbidden by the Balian-Low theorem, except that the modulating exponential is replaced by alternating sines and cosines, and translation is by half-integers n/2 rather than n. Although the difference is subtle, it is enough. But it is only back in momentum space, where the bimodal peaks spread, that one sees why the uncertainty principle is not violated. Starting from a completely different perspective, which can be regarded as bimodal in position space, Malvar found another variant of compactly supported Gabor functions in which the modulation factors are also replaced by alternating sines and cosines. Coifman and Meyer then used multiscale analysis to generalize this construction. Chapter 6 of Meyer's book gives a remarkably clear explanation of the mystery connecting multiscale analysis to time-frequency analysis in this way. Malvar's approach can be regarded as first slicing in time, followed by time-frequency analysis. Coifman and Meyer showed that multiscale analysis could be used to optimize this time slicing in an approach they
call adaptive segmentation. If this process is reversed, that is, a slicing called adaptive filtering is done first in frequency space, the corresponding basis sets are called wavelet packets. They can be constructed by generalizing multiresolution analysis to produce several bases. Thus, the introduction of multiresolution analysis led not only to the wavelets of time-scale analysis but also far beyond to a large variety of new basis sets for time-frequency analysis.

A book of this type cannot include everything. Meyer has wisely chosen to concentrate on the related topics of orthogonal wavelets, time-scale analysis, signal processing algorithms, and time-frequency analysis. The result is a fascinating story as well as an introduction to new mathematical ideas. However, such important topics as nonorthogonal wavelets, whether continuous or discrete, are alluded to only briefly. Furthermore, Meyer's observation that some Malvar wavelets can yield an orthonormal basis on the half-line may mislead the reader into thinking that one can do multiresolution analysis on a half-line. The proof that this is not possible in the case of Hardy spaces was recently completed by Auscher [1], leaving an important role for nonorthogonal wavelets. These comments are intended not as a criticism of Meyer's exposition, but as an indication of the subtlety and breadth of the subject.

The book concludes with four brief chapters on applications. The chapter on vision analysis continues the signal analysis theme with time replaced by position on a two-dimensional grid. In this framework, edge detection plays a major role; Meyer gives a nice overview of Marr's and Mallat's approaches to this challenging problem. Meyer devotes the chapter on fractals to explaining how wavelet analysis can be used to identify points at which a function is not differentiable by reexamining classic examples of nowhere differentiable functions.
The last two chapters on turbulence and astronomy are only intended to whet the reader's appetite. Therefore, it is unfortunate that even the English translation cites only Farge's relatively inaccessible 1990 essay on turbulence for the Société Mathématique de France [6], and not the updated English version [7] available in [15].

Meyer has written a superb introduction to wavelet theory for nonspecialists. For some of those who have been wondering what "all the fuss" is about, reading this book may suffice. Many, however, will want to learn more. Fortunately, they can proceed to Daubechies's Ten Lectures on Wavelets [4] which is a model of expository writing. (See, e.g., the reviews by Grunbaum [8] and Meyer [14].) It is both accessible and comprehensive. Like its author, the book is truly interdisciplinary, combining approaches and insights from physics, mathematics, and engineering. Those who prefer to concentrate on the mathematical analysis can also consult Meyer's earlier book Ondelettes [12] or its English translation Wavelets and Operators [13]. For those who prefer to pursue the signal processing side there are two excellent new books. Adapted Wavelet Analysis by Wickerhauser [19] has a very applied and practical flavor, with considerable attention to the actual implementation of algorithms. The comprehensive graduate text Wavelets and Subband Coding by Vetterli and Kovačević [17] will be attractive to applied mathematicians as well as engineers. There are also books by Chui [2], Kaiser [11], and Walter [18], each of which, although intended as an introductory text, has a slant and more advanced material related to the author's own special interest. For those interested in further developments and special topics at the research level, there seems to be a steady stream of conference proceedings and other collections. Among the more noteworthy are the two special issues of IEEE journals [9, 10] and the proceedings of the 1993 AMS short course on wavelets [5].

Meyer's book also serves another purpose in the debate taking place among government officials, policy analysts, economists, and scientists about the value of "pure" versus applied, or "investigator-driven" versus "strategic" research. Superficially, especially given the provocative conclusion to Chapter 2 quoted above, it might appear to support those who would downgrade the value of basic research under the assumption that scientists and engineers are capable of developing new mathematics when needed. However, a more careful reading also shows the limitations of the engineers and physicists; they did not construct orthonormal wavelets, Malvar wavelets, or wavelet packets. These important developments were not the result of pure mathematics alone, either; they required interdisciplinary insight. What this book does make a strong case for is using all the insight available. Wavelet theory progressed rapidly, even explosively, because the underlying mathematics was sufficiently well developed to take advantage of the discoveries of engineers.
It is noteworthy that eminent harmonic analysts like Coifman and Meyer, instead of pouting at the rediscovery of Calderón's formula by Grossmann and Morlet, learned from the signal processors. The new insight, combined with their mathematical expertise, enabled them to advance both fields, and even beat the signal processing engineers at their own game by constructing new basis sets for the adaptive time-frequency analysis. There are those who wish to study pure mathematics as "art for art's sake," those who are motivated by practical problems, and those who (like myself) are endlessly fascinated by the "unreasonable effectiveness" of mathematics in all fields. The lines between pure and applied mathematics are becoming increasingly blurred. So many fields of mathematics now have applications not anticipated by their founders that one can hardly find areas so "pure" that no applications exist. That some mathematicians prefer to concentrate on the underlying theory and leave the applications to others is understandable and defensible. However, all mathematicians should be aware that, just as analysis sometimes provides insights into geometry and vice versa, physics and engineering can also be the source of new mathematical ideas as well as the "market" for their results. Progress in mathematics requires people with the vision and open-mindedness to explore new ideas regardless of the source. Mathematicians can still make a strong case that scientific progress requires a community in which good mathematical research is valued and supported even when immediate applications are not apparent. Those who would convince public policy-makers of this owe it to themselves and their colleagues to read Meyer's book and reflect upon the issues he raises.
References

[1] P. Auscher, Il n'existe pas de bases d'ondelettes régulières dans l'espace de Hardy $H^2(\mathbb{R})$, C.R. Acad. Sci. Paris 315 (1992), 769-772.
[2] C. Chui, An Introduction to Wavelets, New York: Academic Press (1992).
[3] I. Daubechies, Orthonormal bases of compactly supported wavelets, Commun. Pure Appl. Math. 41 (1988), 909-996.
[4] I. Daubechies, Ten Lectures on Wavelets, Philadelphia: SIAM (1992).
[5] I. Daubechies (ed.), Different Perspectives on Wavelets, Proceedings of Symposia in Applied Mathematics No. 47, Providence, RI: American Mathematical Society (1993).
[6] M. Farge, Transformée en ondelettes continue et application à la turbulence, in Les ondelettes, Paris: Société Mathématique de France (1990).
[7] M. Farge, The continuous wavelet transform of two-dimensional continuous flows, in [15].
[8] A. Grunbaum, Science 257 (1992), 821-822.
[9] Special issue on wavelet transforms and multiresolution signal analysis, IEEE Trans. Inform. Theory IT-38 (1992).
[10] Special issue on wavelets and signal processing, IEEE Trans. Signal Proc. SP-41 (1993).
[11] G. Kaiser, A Friendly Guide to Wavelets, Cambridge, MA: Birkhäuser (1994).
[12] Y. Meyer, Ondelettes, Paris: Hermann (1990).
[13] Y. Meyer, Wavelets and Operators, Cambridge: Cambridge University Press (1992).
[14] Y. Meyer, Bull. AMS 28 (1993), 350-360.
[15] M.B. Ruskai et al. (eds.), Wavelets and Their Applications, Boston: Jones and Bartlett (1992).
[16] R.S. Strichartz, How to make wavelets, Am. Math. Monthly 100 (1993), 539-556.
[17] M. Vetterli and J. Kovačević, Wavelets and Subband Coding, Englewood Cliffs, NJ: Prentice-Hall (1995).
[18] G.G. Walter, Wavelets and Other Orthogonal Systems with Applications, Boca Raton, FL: CRC Press (1994).
[19] M.V. Wickerhauser, Adapted Wavelet Analysis from Theory to Software, Wellesley: A.K. Peters (1994).
Department of Mathematics
University of Massachusetts
Lowell, MA 01854, USA
[email protected]
Norbert Wiener, 1894-1964
by Pesi R. Masani
Basel, Boston: Birkhäuser, 1990. 416 pp. US $99.50, ISBN 0-8176-2246-2.
Invention: The Care and Feeding of Ideas by Norbert Wiener (introduction by S.J. Heims) Cambridge, MA: MIT Press, 1993. 159 pp. Hardcover: $25.00, ISBN 0-262-23167-0 Softcover: $9.95, ISBN 0-262-73111-8
Reviewed by Philip J. Davis*

Fame is the spur that the clear spirit doth raise
(That last infirmity of noble mind)
To scorn delights and live laborious days.
--Milton, Lycidas

Toward the end of his life, Norbert Wiener seemed to be obsessed with the value of his contributions to science. I have this from a number of independent sources. My own view of the man is that he was concerned not with the short term--say one hundred years--but with the long term--say five hundred years. For the short term he was certainly secure. Even as Wiener is now fading in living memories as an individual and as a personality, his name has passed--a process that began even in his lifetime--into the lifeblood of mathematics through such concepts as Wiener measure, Wiener process, Wiener-Hopf equations, Paley-Wiener theorems, the Wiener extrapolation of linear time series, generalized harmonic analysis. Wiener was the man who put the word "cybernetics" into our current vocabulary, and every child who watches TV or plays with a computer seems to know what the "cyber universe" is all about, even though Wiener himself might have been revulsed by what he saw.
The long term? Ah, there lies the sweet-sour ambiguity of life. In a scientific time scale that now extends from the Big Bang to the Big Collapse (or to the Exponential Death), in an age when in-depth histories of the first three minutes of the cosmos have been written, what is the definition of the short term, or of the long? I was nine or ten years old when I first heard of Wiener. My older brother, who was then working on an ScD in chemical engineering at MIT, was taking one of Wiener's courses. Wiener was in his late thirties at the time. I suspect now that my brother hardly needed the mathematics for his experimental work on the kinetics of the combustion of carbon and that he took the course
*Reprinted with permission from SIAM News, February 1995.
merely to be able to say that he'd been a student of an acknowledged master. Naturally, my brother couldn't resist telling the family about the genius at whose feet he sat. I asked him what the course was about, and I remember my brother's response as though it had been yesterday: "Wiener talks about what the answers to certain problems would be like, assuming that the problems had answers." Naturally, I didn't understand a word of this: It sounded like hilarious double-talk, but it was my first introduction to the language of existential mathematics. Years later, when I myself came to take Wiener's course, his reputation for genius and mild eccentricity had grown. He was one of those rare child prodigies who make good; he was a public figure who was soon to become a guru, an oracle, a much-quoted writer, a self-advertiser, and whose pronouncements on such issues as the future of labor under the impact of automatization or the strategic aspect of the cold war would be well publicized. In short, in my undergraduate and graduate days Wiener bestrode the narrow world of mathematics like a colossus, and he was one of the three or four colossi who did so. Pesi Masani, professor emeritus of mathematics at the University of Pittsburgh, knew Wiener well in his last years and did a number of joint papers with him. He edited Wiener's Collected Works and has written a splendid biography. Masani's book embraces Wiener the man, the logician, the mathematician both pure and applied, the social, economic, and scientific philosopher, the novelist, the eccentric, the source of legend, and yet more. What is missing, perhaps, is Wiener on the Freudian couch, and what is present to the extent that I find it slightly redundant is Wiener as an object of adulation. If, therefore, you want to learn about Wiener, you can do no better than to start with Masani. An earlier book by Steve J.
Heims, equally meritorious, John von Neumann and Norbert Wiener: From Mathematics to the Technologies of Life and Death (MIT Press, 1980), offers an extended comparison of the two men as regards their scientific visions and social philosophies.

What I should like to turn to now is the recent posthumous publication of an early manuscript of Norbert Wiener, entitled Invention. The book was published in 1993, the hundredth year since Wiener's birth,[1] in circumstances that are themselves interesting.

[1] One of the three major conferences marking the centenary in 1994 was organized by Pesi Masani.

Sometime in the early 1950s, Jason Epstein, an editor at Doubleday (who subsequently rose to great prominence in Random House and the publishing world), suggested that Wiener write a short and popular book on the philosophy of invention. Wiener agreed and accepted an advance. By June 1954, he had completed a manuscript. He then lost interest in the project, in favor of other writings (e.g., his 1959 Random House novel, The Tempter), and returned the advance. The publisher did not press the matter, and the manuscript was left on the shelf to gather dust. After Wiener's death, Gordon Raisbeck, his literary executor and son-in-law, was naturally more concerned with the published books, and the existence of the manuscript became a dim memory. The Wiener Papers were transferred to the Institute Archives of the MIT Libraries in the early 1980s. Some years later, archivist Helen Samuels, working through the papers, found the manuscript and brought it to the attention of Larry Cohen of MIT Press, who decided to publish. Steve Heims, the author of the von Neumann/Wiener book, provided an excellent introduction.

Invention is a short, flawed, and fascinating book. It is flawed in the sense that it suffers from multiple focusing. Does Wiener want to discuss historical instances of invention from a technical point of view? Does he want to discuss them from the point of view of the larger metaphysical assumptions of science, e.g., changes in the intellectual climate that thrust probability theory to the front of the stage? Does he want to display his own triumphs? Does he want to reopen an old scandal, taking up the cudgels in favor of Oliver Heaviside (the good guy), who Wiener claims was edged out of fame and fortune by Michael Pupin and AT&T (the bad guys)? Does he want to talk about the long-term effects of inventions on society? Does he want to warn against "megabuck science," large laboratories, and the rise of a generation of scientific and medical practitioners more interested in the dollar than in scientific truths? The answer to all these questions is: Yes. In this short book, Wiener wants to do and does all these things. Any one of them might have been given a full-length treatment.
Sometimes he does them well, for he could be an excellent writer and could write a clear and pungent paragraph, but sometimes he constructs his discussions out of boilerplate. What fascinates me most about Invention is Wiener's taking up, time and time again, of the role of the individual in invention. Individual genius must not be denied, and the products of such genius, he says, are "acts of Grace." In an essay on what was then called the Great-Man Theory of History, William James wrote, "The community stagnates without the impulse of the individual; the impulse dies away without the sympathy of the community." Focusing largely on the individual, Wiener develops no general theory of invention and society. He is certainly aware of the community, but in this book his concern is largely with the individual creator. He talks about the true rewards, both achieved and denied, and the false rewards; he talks about the suffering experienced by the individual for the simple reason that genius is by definition an extremely rare event. It is fascinating to observe how, when Wiener's collar gets hot as he writes about things that upset him, his choice of words, generally elevated and elegant, becomes colloquial and strident. Ultimately, he set this manuscript aside and turned his attention to other scientific matters. Even as he did so, he wrote a full-length fictional treatment of the Heaviside-Pupin story. It had been in his mind for easily 20 years; as early as 1941, he had tried to interest the radio and Hollywood actor Orson Welles in it. "False modesty is not one of the major virtues," Wiener writes near the end of Invention. The desire for fame, established over the long term, burned strong in this noble and extraordinary mind. Reviewing the full spread of Wienerian mathematics as presented by Masani, we can agree that he had much not to be modest about. His unsuccessful attempt to talk Hollywood into doing a realistic version of how genius operates suggests the impossibility faced by those who wish to explain, by manuscript or by formula, what is fundamentally an act of Grace.
Division of Applied Mathematics Brown University Providence, RI 02912
And now, we offer you a sharply differing point of view. --The Editors
Norbert Wiener, 1894-1964
by Pesi R. Masani
Boston: Birkhäuser, 1990. 416 pp. Hardcover, US $99.50, ISBN 0-8176-2246-2.
Reviewed by Adrian Riskin

P.R. Masani's book "Norbert Wiener," number five in Birkhäuser's Vita Mathematica series, I will argue, suffers from inaccuracy, bias, hero-worship, ax-grinding, turgid prose, and bigotry, both political and religious. If you read this book, you will know nothing about Wiener and everything about Masani, whose ill-considered opinions and prejudices form the book's whole subject matter. Wiener appears only as a lifeless stone idol behind whom Masani pontificates, vainly attempting to legitimize his own views by placing them in Wiener's mouth. We will see to what contortions he resorts to substitute his ideas for Wiener's, and to canonize them by deifying Wiener. Since many statements of Wiener's that are a matter of public record contradict the views that Masani wants to attribute to him, Masani needs a mechanism by which such statements can be explained away. Masani states
clearly the tactic he uses for dealing with Wiener's inconvenient statements:

The basic proposition of cybernetics that signal = message + noise, and that the message, and not the noise, is the sensible term in communication, is applicable in all sorts of contexts with suitable reinterpretation of the three terms . . . . Wiener is the signal, and for us the Wiener-message, and not the Wiener-noise, must be of significance. [p.19]

The conflation of a human being with a radio signal ignores the fact that no human action or statement is unintentional. Everything a person does or says reveals something true about him, and for Masani to dismiss a whole class of Wiener's utterances and actions as "noise" makes him unable to reveal the truth of Wiener's life. For example, Wiener's father Leo was obsessed with transforming the young Wiener into a child prodigy. To this end, he kept him out of school and forced him to study extremely advanced material. Wiener sensitively tells the story of his youth in the first volume of his autobiography [Wiener, 1953]. It is clear from this account, written when Wiener was almost 60 years old, that he recalled his childhood and his father with a great deal of pain:

I cannot deny that in my own attitude to my father there were hostile elements. There were elements of self-defense and even fear. But I always recognized his exceeding ability in intellectual matters and his fundamental honesty and respect for the truth, and these made tolerable the many frequently occurring painful situations . . . . [Wiener, 1953, p.71]

If a child or a grandchild of mine should be as disturbed as I was, I should take him to a psychoanalyst, not with confidence that the treatment would be successful in some definitive way, but at least with the hope that there might be a certain measure of relief. But in 1909 there were no psychoanalysts in America . . . . Moreover ...
it would have seemed to my parents a blasphemy and a confession of defeat to admit even to themselves that a member of the family might need such treatment. [Wiener, 1953, p.121]

The vividness of his memory so long after shows the powerful impact that his unhappy childhood had on his entire life. Masani, however, dismisses Wiener's view of his own youth. Of Wiener's account, Masani states that "some passages are deluged with narcissistic terms ..." [p.17], and refers to Wiener's "immature egoism." Another example of Masani's overruling Wiener's interpretation of his life involves Wiener's choice of academic subject. When he was 14, Wiener decided to enroll in a Ph.D. program at Harvard to study biology. His first semester was unsuccessful, and according to Masani,

his father decided to transfer him to the Sage School of Philosophy at Cornell University for the purpose of working for a doctorate in philosophy: Wiener resented this fresh demonstration of his father's authority. [p.40]
THE MATHEMATICAL INTELLIGENCER VOL. 17, NO. 4, 1995 7 5
Wiener more than resented it; he regretted it even more than 40 years later. His account of the episode ends with the pitiable statement that ". . . despite all my grievous faults, I might still have [had] a contribution to make to biology" [Wiener, 1953, p.129]. All Masani can find to say about this is that "most would agree that Wiener's transfer from biology to philosophy was a good thing . . ." [p.40]. How can Masani presume to know that it was a good thing? Perhaps Wiener would have had a huge impact on biology had he remained in the field, and even have lived a more fulfilling, happy life, a consideration that Masani ought not to be above noticing.

Masani appears to distort the facts and meaning of Wiener's life to allow himself to expatiate on his own irrelevant theories. For example, Masani describes in vague terms an "unfortunate incident" in which Wiener was involved: "a problem of timing [Wiener's] publications so as not to compromise the theses of two . . . students" [p.92]. Masani claims that "Wiener's competitiveness, as revealed by this incident, had something to do with his slow promotion in the Mathematics department at MIT." [p.92] This assessment is at variance with Wiener's own version. Wiener discusses his competitiveness in the second volume of his autobiography:

I knew very well that I was competitive beyond the run of younger mathematicians, and I know equally well that this was not a very pretty attitude. However, it was not an attitude which I was free to assume or reject. [Wiener, 1956, p.87-88]

He discusses this openly and with self-awareness for another half a page, and never mentions his "slow promotion." (Also, in [Wiener, 1953, pp.276-282] he discusses his early years at MIT and never once complains about his "slow promotion.") He even states that "From the beginning of my relationship with MIT, I have received loyal backing from it and an understanding of my needs, limitations, and possibilities" [Wiener, 1953, p.282].
This is hardly the attitude of an assistant professor steeped in bitterness over a lack of tenure. Also, it is clear from reading either of Wiener's autobiographies that he was not shy about revealing his less commendable emotions, so there is no reason to suspect him of duplicity here. It would seem that Masani twists the facts in this way because he has an ax to grind concerning the funding of science in the United States:

... in reality tension over promotion looms large because the scientists have to share a tiny fragment of the national income, consequent to its large scale diversion in support of inefficient bureaucracies, substandard and questionable manufacture, and pseudointellectual culture. Unfortunately Wiener did not analyze the problem objectively, and see it as the malallocation of national income. [Masani pp.92-3]
A passage of this sort would arguably have a place in the book if it had anything to do with Norbert Wiener, but it does not. Masani goes so far in his insertion of his opinions into the story of Wiener that chapter 8, in which this passage occurs, is entitled "The Allotment of National Income, Marriage, and Wiener's Career."

Masani's reactionary agenda becomes an even greater part of the book in its later pages. For instance, section 19B, on "The educational challenge," Masani begins with a gratuitous attack on John Dewey, whom he holds personally responsible for what he sees as the collapse of high-quality public education in this country. This is a factually indefensible position, as Masani must realize, since he modifies it later to the weaker assertion that "the outcome of [Dewey's] pragmatic ideas in education was harmful, as indeed Dewey himself realized." [p.279] Then the section degenerates into a tirade against uncultured high school students who "stud their sentences with senseless verbiage such as 'you know' . . . ," "university instructors in mathematics" who call "tensors and matrices 'gadgets' instead of abstractions," teachers' colleges, and the PTA. For all these evils he suggests that "perhaps extensive reading from the King James edition of the Bible . . . might be the right antidote." [pp.280-281] Masani offers but one short, innocuous quotation from Wiener by way of introducing this rant, but can offer no concrete connection between Wiener and his social criticism, although he does make many unsubstantiated linkages of Wiener's name with various critics of the public schools. For instance:

Wiener's sentiments were unwittingly expressed in the contents, and especially in the title, "The Emperor's New Clothes, or Prius Dementat" of an article written by Professor H.J. Fuller in the early 1950s. Following it came an interesting book, "Quackery in the Public Schools," by A. Lynd.
[p.280]

If Wiener expressed the sentiments in these works, why not quote him instead of involving Fuller and Lynd, with whom there is no evidence of Wiener's involvement? Masani also passes off as Wiener's his views on such topics as credit cards, hedonism, capitalism, sex in advertising, Karl Marx, the work-ethic, consumer gullibility, and Benedictine monks.

These examples of Masani playing ventriloquist to Wiener may sound absurd, but the ultimate absurdity comes when Masani the Christian converts Wiener the atheist to Wiener the believer--post mortem. He first quotes Wiener as saying in 1952 that "I have never made up my childish quarrel with Jehovah, and a skeptic I have remained to the present day . . ." [Wiener, quoted on p.333]. Masani then criticizes two of Wiener's previous biographers, S. Heims and N.J. Faramelli, for referring to Wiener as an atheist, though he admits that
Faramelli "got this information from Wiener's family, friends, and colleagues" [p.333]. Of course it is unacceptable for the demigod to be an unbeliever! All is not lost, however: Masani reassures us that "It is easy to get to a personal God by completing Wiener's own thought" [p.333]. The falseness and stupidity of this book are exemplified in that one phrase. Masani sees nothing outrageous or dangerous in his "completing" the thoughts of a dead man who cannot sue, protest, or just send an insulting letter in retaliation. He proceeds to argue that "from the Wienerian perspective . . . it is atheism that appears to be 'unscientific' and even 'reactionary'" [p.334]. Notice that Wiener the man has disappeared altogether from the discussion, and has been replaced by a "Wienerian perspective." In other words, Masani has replaced him by the religion of Wienerism, a faith that bears the same relation to Norbert Wiener as Jerry Falwell's Christianity bears to Jesus Christ.

Finally, since Wiener was a mathematician, and since this is a mathematics journal, I want to mention that Masani's account of Wiener's mathematical work is just as shabby. The only descriptions of Wiener's mathematics in this book that I could follow were the direct quotes from Wiener. Masani seems more concerned with displaying his mathematical virtuosity than with making himself intelligible. Also, he seems to have no fixed conception of the level of knowledge of his intended readership, solemnly noting that "the operator S is linear if S(af + bg) = aS(f) + bS(g)" [p.99], and a few pages later claiming that
the time frequency uncertainty principle ... asserts that $\int_{-\infty}^{\infty} |t f(t)|^2\,dt \cdot \int_{-\infty}^{\infty} |\lambda \hat{f}(\lambda)|^2\,d\lambda \ge \tfrac{1}{4}$. Here $f$ is in $L_2(\mathbb{R})$, with norm 1, i.e. $\int_{-\infty}^{\infty} |f(t)|^2\,dt = 1$, and the integrals $\int_{-\infty}^{\infty} |t f(t)|^2\,dt$ and $\int_{-\infty}^{\infty} |\lambda \hat{f}(\lambda)|^2\,d\lambda$ are finite, $\hat{f}(\lambda)$ being the (indirect) Fourier-Plancherel transform of $f$. [p.115]

Perhaps I'm wrong, but it strikes me that a reader who doesn't know what a linear operator is may not be in a position to appreciate the full impact of this statement.

In short, Masani's book is little more than a polemic in the guise of a biography. A definitive biography of Wiener is long overdue. Until one is written, Wiener's own autobiographies are still the best source for information on his life, his thought, and his mathematics.

References
N. Wiener. Ex-Prodigy: My Childhood and Youth. Simon and Schuster, New York, 1953.
N. Wiener. I Am a Mathematician. Doubleday, Garden City, New York, 1956.

Department of Mathematics Northern Arizona University Flagstaff, AZ 86011 USA
Robin Wilson*

Mathematical Physics II
Keith Hannabuss and Robin Wilson

When Albert Einstein (1879-1955) won the Nobel Prize for Physics in 1921, it was awarded not for the theory of relativity but for his 1905 paper on the photoelectric effect. Maxwell's electromagnetic theory of light and new spectroscopic data, as well as experiments on the newly invented electric light bulb, had led scientists to consider how atoms emit light. At first it seemed that all the light should be of very high frequency. Late in 1900, Max Planck (1858-1947) had succeeded in reconciling theory with experiment by postulating that atoms could emit light only in discrete packets whose energy E was proportional to their frequency ν, namely E = hν, where h is Planck's constant. Lacking sufficient energy to make many high-frequency packets, atoms would be forced to radiate at lower frequencies instead. Einstein's paper showed that the relationship E = hν is a fundamental feature of light itself, rather than of the atoms, although the atom still determines the frequency of radiation. Despite the many experiments which showed that light consisted of waves, it also behaved as though it consisted of particles, each carrying one quantum of energy, hν.

Shortly after Einstein's paper, Ernest Rutherford's experiments led to the idea of atoms as "miniature solar systems" in which electrons orbit around a nucleus. To explain why atoms radiate light at the particular frequencies observed, Niels Bohr (1885-1962) proposed in 1913 that the angular momentum of each orbiting electron must be an integral multiple of the basic unit h/2π. Ten years later, Louis de Broglie (1892-1987) suggested that if light waves could also behave as particles, then perhaps other particles (such as electrons) could also behave as waves. This was rapidly verified experimentally; indeed, interference and diffraction effects had already been observed, but not recognized as such. In de Broglie's theory, the electron orbits permitted by Bohr were precisely those whose circumference contains an integral number of wavelengths.
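The agreement between de Broglie's standing-wave picture and Bohr's rule can be sketched in one line. The symbols below are introduced here only for illustration and are assumptions, not the column's own notation: a circular orbit of radius r, an electron of momentum p, a wavelength λ given by the de Broglie relation λ = h/p, and a positive integer n.

```latex
% Illustrative sketch; r, p, \lambda and n are assumed notation, not taken from the column.
% Step 1: the circumference holds a whole number of wavelengths.
% Step 2: substitute the de Broglie relation \lambda = h/p.
\[
  2\pi r = n\lambda, \qquad \lambda = \frac{h}{p}
  \quad\Longrightarrow\quad
  r\,p = n\,\frac{h}{2\pi}, \qquad n = 1, 2, 3, \ldots
\]
```

For a circular orbit the product rp is the electron's angular momentum, so under these assumptions the whole-number-of-wavelengths condition gives back Bohr's integral multiples of h/2π.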
Einstein's Photoelectric Effect
de Broglie's Law
Niels Bohr
Max Planck
*Column editor's address: Faculty of Mathematics, The Open University, Milton Keynes, MK7 6AA, England.
THE MATHEMATICAL INTELLIGENCER VOL. 17, NO. 4 © 1995 Springer-Verlag New York
de Broglie